A polynomial formula for the perspective four points problem

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We present a fast and accurate solution to the perspective $n$-points problem, by way of a new approach to the $n=4$ case. Our solution hinges on a novel separation of variables: given four 3D points and four corresponding 2D points on the camera canvas, we start by finding another set of 3D points, sitting on the rays connecting the camera to the 2D canvas points, so that the six pairwise distances between these 3D points are as close as possible to the six distances between the original 3D points. This step reduces the perspective problem to an absolute orientation problem, which has a solution via an explicit formula. To solve the first problem we choose coordinates that are as orientation-free as possible: on the 3D points side our coordinates are the squared distances between the points. On the 2D canvas-points side our coordinates are the dot products of the points after rotating one of them to sit on the optical axis. We then derive the solution with the help of a computer algebra system. Our solution is an order of magnitude faster than state-of-the-art algorithms, while offering similar accuracy under realistic noise. Moreover, our reduction to the absolute orientation problem runs two orders of magnitude faster than other perspective problem solvers, allowing extremely efficient seed rejection when implementing RANSAC.
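The absolute orientation step can be illustrated with a minimal sketch. It uses the standard SVD-based (Kabsch) solution, which is not necessarily the explicit formula the paper relies on. The function name `absolute_orientation` and the array layout (one point per row) are assumptions made for illustration.

```python
import numpy as np

def absolute_orientation(src, dst):
    """Return (R, t) minimizing sum_i |R @ src[i] + t - dst[i]|^2.

    src, dst: arrays of shape (n, 3) with corresponding rows.
    Standard Kabsch/SVD method; hypothetical helper, not the paper's formula.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

In a P4P pipeline of the kind described above, `src` would hold the original 3D points and `dst` the recovered points on the camera rays, so that `(R, t)` is the camera pose.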

💡 Research Summary

The paper introduces a novel, closed‑form solution for the perspective‑four‑points (P4P) problem, which asks for the six‑degree‑of‑freedom pose of a calibrated camera given four 3‑D world points and their four corresponding 2‑D image projections. The authors’ key insight is to replace the raw coordinates of the points (which amount to 20 numbers) with a set of orientation‑free invariants: the six squared inter‑point distances among the 3‑D points and six scalar products derived from the 2‑D points after rotating the image so that one of the points lies on the optical axis.
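As a rough illustration of these invariants, the sketch below computes six squared distances and six scalar products. It rests on several assumptions that the paper may not share:

- canvas points are given in normalized coordinates (x, y) on the plane z = 1;
- "rotating the image" means applying the minimal rotation that takes the first ray to the optical axis and re-projecting the other rays onto z = 1;
- the six scalar products are the dot products among the three remaining re-projected points.

All function names are hypothetical.

```python
import numpy as np
from itertools import combinations, combinations_with_replacement

def squared_distances(points_3d):
    """Six squared pairwise distances among four 3D points (rows)."""
    return np.array([np.sum((points_3d[i] - points_3d[j]) ** 2)
                     for i, j in combinations(range(4), 2)])

def rotation_to_optical_axis(ray):
    """Minimal rotation taking `ray` (with positive z) onto the axis (0, 0, 1)."""
    a = ray / np.linalg.norm(ray)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(a, z)
    c = float(a @ z)
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    # Rodrigues formula written without explicit angles.
    return np.eye(3) + K + K @ K / (1.0 + c)

def canvas_invariants(points_2d):
    """Six dot products among points 2..4 after point 1 is moved to the axis."""
    rays = np.column_stack([points_2d, np.ones(4)])
    R = rotation_to_optical_axis(rays[0])
    rotated = rays @ R.T
    # Re-project onto the plane z = 1 (assumes all rotated rays keep z > 0).
    canvas = rotated[:, :2] / rotated[:, 2:3]
    q = canvas[1:]  # the first point now sits at the canvas origin
    return np.array([q[i] @ q[j]
                     for i, j in combinations_with_replacement(range(3), 2)])
```

Under these assumptions the twelve numbers are unchanged by a rigid motion of the 3D points or by a rotation of the canvas about the optical axis, which is the sense in which they are orientation-free.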

By expressing the geometric constraints in terms of these invariants, they derive six polynomial equations linking the unknown depths $z_i$ of the image rays to the known invariants.
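One natural way to write such distance constraints is sketched below. This is not necessarily the form used in the paper, and it does not reproduce the paper's closed-form solution. It assumes each canvas point (x_i, y_i) defines a ray u_i = (x_i, y_i, 1), so the candidate 3D point is z_i u_i and each pair (i, j) contributes the residual z_i² |u_i|² − 2 z_i z_j (u_i · u_j) + z_j² |u_j|² − d_ij². The function name `distance_residuals` is hypothetical.

```python
import numpy as np
from itertools import combinations

def distance_residuals(depths, points_2d, sq_dists):
    """Residuals of the six pairwise squared-distance constraints.

    depths:    candidate depths z_i, shape (4,)
    points_2d: normalized canvas points (x_i, y_i), shape (4, 2)
    sq_dists:  squared distances d_ij^2 between the original 3D points,
               ordered as itertools.combinations(range(4), 2)
    """
    rays = np.column_stack([points_2d, np.ones(len(points_2d))])
    gram = rays @ rays.T  # gram[i, j] = u_i . u_j
    residuals = []
    for k, (i, j) in enumerate(combinations(range(4), 2)):
        # Squared distance between z_i * u_i and z_j * u_j.
        model = (depths[i] ** 2 * gram[i, i]
                 - 2.0 * depths[i] * depths[j] * gram[i, j]
                 + depths[j] ** 2 * gram[j, j])
        residuals.append(model - sq_dists[k])
    return np.array(residuals)
```

With noise-free data the residuals vanish at the true depths. With noisy data, the depths would be chosen to make the residuals as small as possible.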
