# How can I compute the camera pose using relative rotation and translation matrix in RGBD images?

I have a kinect camera that can move around a certain object. I have computed 3d corresponding points in two consecutive images and got 3*3 rotation matrix and 3*1 translation matrix to convert the first 3d point clouds to the second ones but I need to obtain the camera pose (the location and orientation (yaw,pitch,roll) of the camera) along times and then track it. I don't know how I can use obtained matrix for computing the camera pose.

## All Answers (11)

Damien Lefloch · Universität Siegen

You said you are able to register two point clouds (given by two consecutive Kinect frames, is that correct?). That means you have found the relative camera position of your destination point cloud with respect to your source point cloud. Is that not what you want?

You should maybe take a look at a paper like KinectFusion, which tracks the camera motion over a full Kinect sequence via point-to-plane ICP. All frames are registered in the same coordinate frame (the one given by the first input frame). There is an open-source GPU implementation of the tracker in the PCL library (the module is called kinfu).

Eugen Funk · German Aerospace Center (DLR)

As Damien said, I would also assume that by finding the rotation and translation between two point clouds, you also have the camera movement between these two frames (it is the same transform, or its inverse).

Mahdi Manafzade · Ferdowsi University of Mashhad

The coordinates of the 3D points in the first point cloud are with respect to the center of the first camera position (0,0,0), and the coordinates of the 3D points in the second point cloud are with respect to the center of the second camera position (0,0,0). We have two origins. I think the relative camera position is different from the relative rotation matrix between the two point clouds.

Damien Lefloch · Universität Siegen

Sorry, I still don't really understand the problem, but I will try to give more details. The relative rotation and translation matrices between two point clouds are directly linked to the camera motion. Knowing the camera motion and the camera position at the previous acquisition, you can easily compute the camera position at the current acquisition.

Let's assume that you have two depth images D0 and D1 taken at two different camera positions (close enough to each other that ICP can still operate).

Both resulting point clouds (PC0 and PC1) are expressed with respect to their respective camera centers.

The result of ICP is the transformation matrix that expresses one point cloud in the coordinate frame of the other. Let's assume that you actually run ICP in such a way that you can express the point cloud PC1 in the coordinate frame of PC0; let T1->0 be this 4x4 transformation matrix. Note that the inverse of T1->0 is the matrix T0->1, which transforms the point cloud PC0 into the coordinate frame of PC1 (it is really easy to compute the inverse of such a matrix).
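That cheap inverse follows from the block structure of a rigid transform [R t; 0 1]: its inverse is [Rᵀ, −Rᵀt; 0 1], so no general 4x4 inversion is needed. A minimal sketch in NumPy (the function name is illustrative):

```python
import numpy as np

def invert_rigid(T):
    """Invert a 4x4 rigid transform [R t; 0 1].

    For a rotation R, the inverse is simply [R^T, -R^T t; 0 1]."""
    R = T[:3, :3]
    t = T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T          # rotation inverse is its transpose
    T_inv[:3, 3] = -R.T @ t      # translation part of the inverse
    return T_inv
```

Multiplying `T` by `invert_rigid(T)` in either order should give the identity, which is an easy sanity check on any ICP output you feed in.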

Now let's assume that the camera position C0 (which produced D0) is centered at (0,0,0) and that its viewing ray points along the negative Z axis.

Then the camera position C1, expressed in the coordinate frame of C0, will be centered at

T1->0 * (0;0;0;1)

And its viewing ray will be

T1->0 * (0;0;-1;0)

Note that if you want to process a sequence of n frames, you will need to accumulate the transformation matrices.
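The accumulation step can be sketched as follows (a sketch, not a definitive implementation; the function name is illustrative, and it assumes each ICP result T_{k->k-1} maps points of frame k into the coordinates of frame k-1, as in the convention above):

```python
import numpy as np

def camera_pose_in_first_frame(relative_transforms):
    """Chain per-frame ICP results T_{k->k-1} (each maps points of frame k
    into the coordinates of frame k-1) into the pose of the last camera,
    expressed in the coordinate frame of the first camera.

    Returns (center, ray): the camera center and the direction of the
    optical axis, assuming the camera looks down -Z in its own frame."""
    T = np.eye(4)  # pose of camera 0 in its own frame
    for T_rel in relative_transforms:
        T = T @ T_rel  # T_{k->0} = T_{k-1->0} @ T_{k->k-1}
    center = (T @ np.array([0.0, 0.0, 0.0, 1.0]))[:3]  # homogeneous point
    ray = (T @ np.array([0.0, 0.0, -1.0, 0.0]))[:3]    # homogeneous direction
    return center, ray
```

Yaw, pitch and roll can then be extracted from the accumulated 3x3 rotation block `T[:3, :3]` with any standard rotation-matrix-to-Euler-angles conversion.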

Hope that it helps.

Sio-Hoi Ieng · Pierre and Marie Curie University - Paris 6

1) You said your system gives you the relative pose between two successive camera positions. Let's say X1 is the coordinate of a 3D point in the camera frame at time t1, and X2 its coordinate at time t2, so you have (R, T) such that:

X1 = R.X2 + T.

2) If (R1, T1), (R2, T2), etc. are the poses of the camera with respect to an arbitrary world coordinate frame (e.g., one in which the 3D points are not moving), then the (Ri, Ti) are exactly what you are looking for.

3) So if X is the coordinate of the same 3D point in this world frame, we have:

X1 = R1.X + T1 and

X2 = R2.X + T2

hence X1 = R.(R2.X + T2) + T, and thus

R^T.(X1 - T) = [R2 T2] . (X; 1)

The last equation can be rearranged into Ax = b, where x contains the unknown entries of R2 and T2, A is built from the known world coordinates X, and b = R^T.(X1 - T) is built from R, X1 and T. The system can be solved with a least-squares technique, provided you supply enough 3D points X.

R and T are, as you stated, given by your system. X is also easy to obtain once the world coordinate frame is chosen. The only not-so-easy part is ensuring that X is tracked across time (which I assumed is the case, since you are able to compute the relative pose from the point clouds, i.e., from the X1, X2, ...).
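The least-squares step above can be sketched in NumPy. This is one possible reading of the rearrangement (the function name is illustrative, and the 12 entries of [R2 | T2] are treated as unconstrained unknowns, so R2 is not forced to be a rotation; with noisy data you would re-project it onto SO(3) afterwards):

```python
import numpy as np

def solve_pose_lstsq(X_world, R1, T1, R, T):
    """Solve R^T (X1 - T) = [R2 | T2] (X; 1) for the second camera pose.

    X_world: (N, 3) world points X (need at least 4 non-coplanar points);
    (R1, T1): known pose of the first camera; (R, T): relative pose with
    X1 = R X2 + T, as in the thread. Returns (R2, T2)."""
    X1 = X_world @ R1.T + T1                 # points in camera-1 coordinates
    Y = (X1 - T) @ R                         # rows are R^T (X1_i - T)
    N = X_world.shape[0]
    A = np.hstack([X_world, np.ones((N, 1))])  # rows are (X_i; 1)^T
    # A @ M^T = Y with M = [R2 | T2] (3x4); solve by linear least squares
    M_T, *_ = np.linalg.lstsq(A, Y, rcond=None)
    M = M_T.T
    return M[:, :3], M[:, 3]
```

Note that with exact data the answer also has a closed form, R2 = R^T.R1 and T2 = R^T.(T1 - T); the least-squares formulation is mainly useful when the point correspondences are noisy.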
