Answer : If with "world coordinates" you mean "object coordinates", you have to get the inverse transformation of the result given by the pnp algorithm. There is a trick to invert transformation matrices that allows you to save the inversion operation, which is usually expensive, and that explains the code in Python. Given a transformation [R|t] , we have that inv([R|t]) = [R'|-R'*t] , where R' is the transpose of R . So, you can code (not tested): cv::Mat rvec, tvec; solvePnP(..., rvec, tvec, ...); // rvec is 3x1, tvec is 3x1 cv::Mat R; cv::Rodrigues(rvec, R); // R is 3x3 R = R.t(); // rotation of inverse tvec = -R * tvec; // translation of inverse cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4 T( cv::Range(0,3), cv::Range(0,3) ) = R * 1; // copies R into T T( cv::Range(0,3), cv::Range(3,4) ) = tvec * 1; // copies tvec into T // T is a 4x4 matrix with the pose of the camera in the object frame Update: Later, to use T with OpenGL ...