It’s probably easiest to render a Position pass, which gives you the world coordinate of each pixel directly.
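If I remember correctly it’s just a checkbox under View Layer Properties > Passes > Data (Blender 3.0+, Cycles). From a script it should be something along these lines, though the property name is from memory, so please double-check it in your version:

```python
import bpy

# Assumes Blender 3.0+ with Cycles; verify the property name in the
# Python console if this errors.
bpy.context.scene.render.engine = 'CYCLES'
bpy.context.view_layer.use_pass_position = True  # adds a "Position" pass to the render result
```

The pass should then show up as an extra output on the Render Layers node in the compositor, so you can write it out to a multilayer EXR.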
To recover a world position from depth, you need to invert the perspective camera transform. From a quick glance I didn’t see that kind of matrix in your script, but I didn’t look very closely.
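Concretely, going from a pixel plus depth back to a world point looks roughly like this (a minimal numpy sketch, assuming K is a standard pinhole intrinsics matrix, c2w is the 4x4 camera-to-world matrix from cam.matrix_world, and the depth value is the distance along the viewing axis rather than the ray length; I’m not sure which one your depth map stores):

```python
import numpy as np

def depth_to_world(u, v, depth, K, c2w):
    """Unproject pixel (u, v) with a given depth to a world-space point."""
    # Undo the perspective divide: (u, v, 1) * z, then apply K^-1
    cam_pt = np.linalg.inv(K) @ np.array([u * depth, v * depth, depth])
    # K assumes the computer-vision convention (+Z forward, +Y down);
    # Blender cameras look down -Z with +Y up, hence the flip.
    cam_pt = np.array([cam_pt[0], -cam_pt[1], -cam_pt[2]])
    # Camera space -> world space
    return (c2w @ np.append(cam_pt, 1.0))[:3]
```

The Position pass simply gives you the end result of that chain directly, per pixel.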
Thank you! Could you please guide me on how to use the Position pass? I was trying to obtain a depth map, but realized that the one I got was misaligned.
I constructed the projection matrix by multiplying the extrinsics matrix (converted from cam.matrix_world, the c2w matrix in the script) with the intrinsics matrix (manually constructed, the K matrix in the script). Is that the same as “invert the perspective camera transform”?
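In pseudo-code, what I’m doing is essentially this (a simplified sketch, not the exact script; K is built from the focal length, sensor size and resolution):

```python
import numpy as np

c2w = np.array(cam.matrix_world)   # 4x4 camera-to-world from Blender
w2c = np.linalg.inv(c2w)           # extrinsics (world-to-camera)
# I'm assuming K follows the usual computer-vision convention (+Z forward,
# +Y down); Blender's camera frame is -Z forward, +Y up, so a
# diag(1, -1, -1) flip might be needed in between -- not sure if that matters.
P = K @ w2c[:3, :]                 # 3x4 projection: world point -> (xz, yz, z)
```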
Thank you, I’ll try using the Position pass.
I’m still a bit confused about the difference between depth and position. If the Position pass gives the correct 3D position of each point, it should be consistent with the depth pass, right? I’m not sure what causes the depth pass to produce misaligned depth values.
Regarding the projection matrix: as far as I understand, I’m not using NDC space, so all I need is to convert from camera space to screen space, right? I multiply the pixel coordinate by the depth value, i.e. (x, y, 1) <-> (xz, yz, z), to achieve this.
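I guess one way for me to check the consistency is to push the Position pass through the inverse of c2w and compare it against the depth pass directly, something like this (a rough sketch; position is assumed to be the H x W x 3 world-space Position pass and depth the H x W depth pass, both read from the rendered EXR):

```python
import numpy as np

w2c = np.linalg.inv(c2w)                              # world -> camera
pos_h = np.concatenate([position, np.ones(position.shape[:2] + (1,))], axis=-1)
cam = pos_h @ w2c.T                                   # per-pixel camera-space points
z_from_position = -cam[..., 2]                        # Blender camera looks down -Z
ray_from_position = np.linalg.norm(cam[..., :3], axis=-1)
# Background pixels hold no geometry, so mask them out before comparing
# (the depth pass usually stores a huge sentinel value there).
mask = np.isfinite(depth) & (depth < 1e9)
print("|z - depth|:   ", np.abs(z_from_position - depth)[mask].mean())
print("|ray - depth|: ", np.abs(ray_from_position - depth)[mask].mean())
```

Whichever of the two comes out near zero would tell me which definition the depth pass actually uses.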