Cycles generates misaligned depth when projecting to 3D point cloud

Hi, I’m not sure if this is the right place to ask. I’ve found that the depth exported from Cycles is misaligned when projected back into a 3D point cloud.

Long story short:

  1. I exported two views (RGBD) from a .blend file.
  2. I projected the RGBD pixels back into a 3D point cloud in world coordinates.
  3. I visualized the point clouds in MeshLab and found that the point clouds of the two views are misaligned.
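For context, step 2 can be sketched like this (a minimal numpy sketch, not the script from the thread; it assumes a pinhole camera with a 3x3 intrinsics matrix `K` and a 4x4 camera-to-world matrix `c2w`, similar to the names mentioned below, and that the depth map stores planar z-depth):

```python
import numpy as np

def backproject(depth, K, c2w):
    """Lift a depth map (H, W) to world-space points (H*W, 3).

    Assumes `depth` stores planar z-depth along the camera's viewing
    axis, `K` is the 3x3 pinhole intrinsics matrix, and `c2w` is a 4x4
    camera-to-world transform (OpenCV convention: camera looks down +z).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Unproject: x_cam = z * K^-1 @ (u, v, 1)
    cam = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)
    # Camera space -> world space (homogeneous coordinates).
    cam_h = np.concatenate([cam, np.ones((cam.shape[0], 1))], axis=1)
    return (c2w @ cam_h.T).T[:, :3]
```

With identity intrinsics and an identity camera pose, a constant depth map should lift to a flat grid of points at that depth.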

Details can be found here: cycles render engine - Misalignment of depth map when projecting two views into 3D point cloud - Blender Stack Exchange

Any help is much appreciated. Thank you!

It’s probably easiest to render a Position pass, which will directly give you a world coordinate.

To recover a world position from depth, you need to invert the perspective camera transform. From a quick glance I didn’t see that kind of matrix in your script, but I didn’t look very closely.

Thank you! Could you please guide me on how to use the Position pass? I was trying to obtain a depth map, but realized that the depth map I got was misaligned.

I constructed the projection matrix by multiplying the extrinsics matrix (converted from cam.matrix_world, the c2w matrix in the script) by the intrinsics matrix (manually constructed, the K matrix in the script). Is that the same as “inverting the perspective camera transform”?
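For what it’s worth, a manually constructed K like the one mentioned here usually comes from the camera’s physical parameters. A sketch of that construction (assuming a horizontal sensor fit, square pixels, and no lens shift; the function name is illustrative):

```python
import numpy as np

def intrinsics_from_camera(focal_mm, sensor_width_mm, width_px, height_px):
    """Pinhole intrinsics K from physical camera parameters.

    Assumes a horizontal sensor fit, square pixels, and zero lens shift
    (cam.data.shift_x / shift_y == 0); cx/cy need adjusting otherwise.
    """
    fx = focal_mm * width_px / sensor_width_mm
    fy = fx  # square pixels
    cx = width_px / 2.0
    cy = height_px / 2.0
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])
```

For a 50 mm lens on a 36 mm sensor at 1920x1080, this gives fx = 50 * 1920 / 36 with the principal point at the image center.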

Passes are documented in the manual:

cam.matrix_world is not a perspective transform matrix; it converts between camera and world space. What I mean is a projection matrix that converts from camera space to screen or NDC space, like this.

But it’s easier to bypass that entirely with a Position pass, and as a bonus it will work for every type of camera.
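In case it helps, the Position pass can be enabled per view layer from Python as well as from the UI (a sketch against Blender’s Python API, assuming Blender 3.0+ with Cycles; rendering to a multilayer EXR keeps the per-pixel positions as full-precision floats):

```python
import bpy

# Enable the Position pass on the active view layer (Cycles, Blender 3.0+).
bpy.context.view_layer.use_pass_position = True

# Save as multilayer EXR so the world positions are not clipped or quantized.
scene = bpy.context.scene
scene.render.image_settings.file_format = 'OPEN_EXR_MULTILAYER'
scene.render.image_settings.color_depth = '32'
```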

Thank you, I’ll try using the Position passes.
I’m still a bit confused about the difference between depth and position. If the Position pass gives a correct, unique 3D position for each point, it should be consistent with the Depth pass, right? I’m not sure what causes the Depth pass to generate misaligned depth values.

Regarding the projection matrix: as far as I understand, I’m not using NDC space, so all I need is to convert from camera space to screen space, right? I multiplied the pixel coordinates by the depth value ((x, y, 1) <-> (xz, yz, z)) to achieve this.
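The (x, y, 1) <-> (xz, yz, z) relation described above can be checked numerically (a small numpy sketch with made-up intrinsics and a made-up camera-space point): projecting with K, dividing by z to get a pixel, then multiplying by z again after applying K^-1 should return the original point exactly.

```python
import numpy as np

# Made-up pinhole intrinsics (fx = fy = 500, principal point at 320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# A camera-space point and its projection.
p_cam = np.array([0.2, -0.1, 2.0])   # z = 2.0
proj = K @ p_cam                     # (x*fx + z*cx, y*fy + z*cy, z)
pixel = proj / proj[2]               # homogeneous divide -> (u, v, 1)

# Reconstruction: scale the unprojected pixel ray by the depth z.
p_back = proj[2] * (np.linalg.inv(K) @ pixel)
```

If the point clouds still diverge with this exact round trip, the mismatch is likely in the depth values themselves or in the camera-to-world step, not in this scaling.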