A Novel Method for the Extrinsic Calibration of a 2D Laser Rangefinder and a Camera
Analytical Solution
In this section, we first present how to obtain the solution for the extrinsic calibration of the LRF-camera system from just a single observation of the calibration target. Then, we show the solution from multiple observations, which is needed to reduce the effect of noise. Note that an analytical solution is obtained for our constraint system, which is more general than the closed-form solution in . Moreover, we present a strategy based on the cheirality check to exclude invalid solutions.
From a Single Observation
We outline seven steps to solve the polynomial system (Eqs. ([constraint4])-([constraint3])). For convenience, the geometric constraints (Eqs. ([constraint1])-([constraint3])) are reformulated as follows
\begin{equation}
\label{constraintRe}
{^C}\bar{\mathbf{n}}_i^{\top} \left({_L^C}\mathbf{R} \cdot {^L}\bar{\mathbf{p}}_i + {^C}\mathbf{t}{_L} \right) = \bar{d}_i, \quad i = 1,2,...,6
\end{equation}
where the parameters $`{^C}\bar{\mathbf{n}}_i`$, $`{^L}\bar{\mathbf{p}}_i`$ and $`\bar{d}_i`$ are defined as
\begin{equation}
\label{reDefinition}
\scalebox{0.7}
{$
\begin{cases}
{^C}\bar{\mathbf{n}}_i = {^C}\mathbf{n}_i, &i = 1,2 \\
{^C}\bar{\mathbf{n}}_j = {^C}\mathbf{n}_3, &j = 3,4 \\
{^C}\bar{\mathbf{n}}_k = {^C}\mathbf{n}_4, &k = 5,6
\end{cases}
$} , \
\scalebox{0.7}
{$
\begin{cases}
{^L}\bar{\mathbf{p}}_i = {^L}\mathbf{p}_1, &i = 1,3 \\
{^L}\bar{\mathbf{p}}_j = {^L}\mathbf{p}_2, &j = 2,5 \\
{^L}\bar{\mathbf{p}}_k = {^L}\mathbf{p}_3, &k = 4,6
\end{cases}
$} , \
\scalebox{0.7}
{$
\begin{cases}
\bar{d}_i = 0, &i = 1,2 \\
\bar{d}_j = d_1, &j = 3,4 \\
\bar{d}_k = d_2, &k = 5,6
\end{cases}
$} .
\end{equation}
STEP 1: The problem is reformulated as a nonlinear optimization as shown below
\begin{equation}
\label{nonlinearRe}
\begin{gathered}
\argmin_{{_L^C}\mathbf{R}, {^C}\mathbf{t}{_L}} J = \sum_{i=1}^N \left( {^C}\bar{\mathbf{n}}_i^{\top} \left({_L^C}\mathbf{R} \cdot {^L}\bar{\mathbf{p}}_i + {^C}\mathbf{t}{_L} \right) - \bar{d}_i \right)^2 \\
\text{s. t.} \quad {_L^C}\mathbf{R}^{\top} \cdot {_L^C}\mathbf{R} = \mathbf{I}, \quad \det \left({_L^C}\mathbf{R}\right) = 1
\end{gathered} ,
\end{equation}
where $`N = 6`$. From the reformulated problem ([nonlinearRe]), the optimal solution for $`{^C}\mathbf{t}{_L}`$ is obtained as shown below
\begin{equation}
\label{translation}
\begin{gathered}
\dfrac{\partial J}{\partial{^C}\mathbf{t}{_L}} = \sum_{i=1}^N 2\left[ {^C}\bar{\mathbf{n}}_i^{\top} \left( {_L^C}\mathbf{R} \cdot {^L}\bar{\mathbf{p}}_i + {^C}\mathbf{t}{_L} \right) - \bar{d}_i \right] {^C}\bar{\mathbf{n}}_i = 0 \\
\Rightarrow {^C}\mathbf{t}{_L} = \mathcal{N}_o^{-1} \left( \mathcal{D}_n - \sum_{i=1}^N {^C}\bar{\mathbf{n}}_i {^C}\bar{\mathbf{n}}_i^{\top} {_L^C}\mathbf{R} {^L}\bar{\mathbf{p}}_i \right) ,
\end{gathered}
\end{equation}
where $`\mathcal{N}_o = \sum_{i=1}^N {^C}\bar{\mathbf{n}}_i {^C}\bar{\mathbf{n}}_i^{\top}`$ and $`\mathcal{D}_n = \sum_{i=1}^N \bar{d}_i {^C}\bar{\mathbf{n}}_i`$.
Note that $`\mathcal{N}_o`$ is a non-singular matrix and thus invertible (Lemma [lemma2], proved in Appendix 17).
Since a laser point is defined as $`{^L}\bar{\mathbf{p}}_i = [\bar{x}_i, 0, \bar{z}_i]^{\top}`$, we rearrange the expression of $`{^C}\mathbf{t}{_L}`$ in ([translation]) into the form
\begin{equation}
\label{translationS}
{^C}\mathbf{t}{_L} = \mathcal{N}_o^{-1} \left(\mathcal{D}_n - \mathcal{N}_{\alpha}\mathbf{r}_1 - \mathcal{N}_{\gamma}\mathbf{r}_3 \right) ,
\end{equation}
in which $`\mathcal{N}_{\alpha} = \sum_{i=1}^N {^C}\bar{\mathbf{n}}_i {^C}\bar{\mathbf{n}}_i^{\top} \bar{x}_i`$ and $`\mathcal{N}_{\gamma} = \sum_{i=1}^N {^C}\bar{\mathbf{n}}_i {^C}\bar{\mathbf{n}}_i^{\top} \bar{z}_i`$.
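The closed-form translation of STEP 1 is straightforward to implement. Below is a minimal NumPy sketch (function and variable names are ours, for illustration only) that recovers $`{^C}\mathbf{t}{_L}`$ from a given rotation and consistent plane-point data via the normal equations above:

```python
import numpy as np

def solve_translation(normals, points, dists, R):
    """Closed-form translation: t = N_o^{-1} (D_n - sum_i n_i n_i^T R p_i),
    with N_o = sum_i n_i n_i^T and D_n = sum_i d_i n_i."""
    N_o = sum(np.outer(n, n) for n in normals)        # sum n_i n_i^T
    D_n = sum(d * n for n, d in zip(normals, dists))  # sum d_i n_i
    acc = sum(np.outer(n, n) @ R @ p for n, p in zip(normals, points))
    return np.linalg.solve(N_o, D_n - acc)            # N_o assumed invertible
```

With noise-free constraints the recovered translation is exact, since the six equations are consistent by construction.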
STEP 2: With $`\mathcal{N}_x`$, $`\mathcal{N}_z`$, $`\mathcal{N}`$ and $`\mathcal{D}`$ defined as
\begin{equation}
\mathcal{N}_x = \scalebox{0.7}
{$
\begin{bmatrix}
\bar{x}_1{^C}\bar{\mathbf{n}}_1^{\top}\\
\vdots \\
\bar{x}_N{^C}\bar{\mathbf{n}}_N^{\top}
\end{bmatrix}
$} , \
\mathcal{N}_z = \scalebox{0.7}
{$
\begin{bmatrix}
\bar{z}_1{^C}\bar{\mathbf{n}}_1^{\top} \\
\vdots \\
\bar{z}_N{^C}\bar{\mathbf{n}}_N^{\top}
\end{bmatrix}
$} , \
\mathcal{N} = \scalebox{0.7}
{$
\begin{bmatrix}
{^C}\bar{\mathbf{n}}_1^{\top} \\
\vdots \\
{^C}\bar{\mathbf{n}}_N^{\top}
\end{bmatrix}
$} , \
\mathcal{D} = \scalebox{0.7}
{$
\begin{bmatrix}
\bar{d}_1\\
\vdots \\
\bar{d}_N
\end{bmatrix}
$} ,
\end{equation}
we substitute ([translationS]) in constraints ([constraintRe]) and obtain
\begin{equation}
\label{expressionR1P}
\mathcal{G}_x \mathbf{r}_1 + \mathcal{G}_z \mathbf{r}_3 = \mathcal{G}_d ,
\end{equation}
where $`\mathcal{G}_x = \mathcal{N}_x-\mathcal{N}\mathcal{N}_o^{-1}\mathcal{N}_{\alpha}`$, $`\mathcal{G}_z = \mathcal{N}_z-\mathcal{N}\mathcal{N}_o^{-1}\mathcal{N}_{\gamma}`$ and $`\mathcal{G}_d = \mathcal{D}-\mathcal{N}\mathcal{N}_o^{-1}\mathcal{D}_n`$. Then, $`\mathbf{r}_1`$ is further expressed in terms of $`\mathbf{r}_3`$
\begin{equation}
\label{expressionR1}
\mathbf{r}_1 = \mathcal{H}\mathbf{r}_3 + \mathcal{K} ,
\end{equation}
where $`\mathcal{H} = - \left(\mathcal{G}_x^{\top}\mathcal{G}_x\right)^{-1}\mathcal{G}_x^{\top}\mathcal{G}_z`$ and $`\mathcal{K} = \left(\mathcal{G}_x^{\top}\mathcal{G}_x\right)^{-1}\mathcal{G}_x^{\top}\mathcal{G}_d`$.
Note that $`\mathcal{G}_x^{\top}\mathcal{G}_x`$ is invertible. The proof is by contradiction. We first assume the $`3 \times 3`$ matrix $`\mathcal{G}_x^{\top}\mathcal{G}_x`$ is non-invertible and thus rank-deficient. From ([expressionR1P]), we have $`\left(\mathcal{G}_x^{\top}\mathcal{G}_x\right) \mathbf{r}_1 = \mathcal{G}_x^{\top} \mathcal{G}_d - \mathcal{G}_x^{\top}\mathcal{G}_z \mathbf{r}_3`$, which is an $`\mathbf{A}\mathbf{x} = \mathbf{b}`$ system for solving $`\mathbf{r}_1`$. Then for any given $`\mathbf{r}_3`$ (and thus a given $`\mathbf{b}`$), the rank-deficient $`\mathbf{A}`$ admits infinitely many solutions for $`\mathbf{x}`$ (which is $`\mathbf{r}_1`$). This means our system is ambiguous, which contradicts the uniqueness proof in Section 19. Hence, $`\mathcal{G}_x^{\top}\mathcal{G}_x`$ is invertible.
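STEP 2 amounts to two linear eliminations: first $`{^C}\mathbf{t}{_L}`$, then $`\mathbf{r}_1`$. A NumPy sketch (names are ours) that builds $`\mathcal{H}`$ and $`\mathcal{K}`$ directly from the definitions above:

```python
import numpy as np

def r1_from_r3_coeffs(normals, points, dists):
    """Build H, K such that r1 = H r3 + K, following STEP 1 and STEP 2."""
    n = np.asarray(normals)                  # N x 3, rows n_i^T
    x = np.array([p[0] for p in points])     # laser x-coordinates
    z = np.array([p[2] for p in points])     # laser z-coordinates
    d = np.asarray(dists)
    N_o = n.T @ n                            # sum n_i n_i^T
    N_a = n.T @ (x[:, None] * n)             # sum x_i n_i n_i^T
    N_g = n.T @ (z[:, None] * n)             # sum z_i n_i n_i^T
    D_n = n.T @ d                            # sum d_i n_i
    N_x = x[:, None] * n                     # rows x_i n_i^T
    N_z = z[:, None] * n                     # rows z_i n_i^T
    G_x = N_x - n @ np.linalg.solve(N_o, N_a)
    G_z = N_z - n @ np.linalg.solve(N_o, N_g)
    G_d = d - n @ np.linalg.solve(N_o, D_n)
    A = G_x.T @ G_x                          # invertible (see proof above)
    H = -np.linalg.solve(A, G_x.T @ G_z)
    K = np.linalg.solve(A, G_x.T @ G_d)
    return H, K
```

For noise-free inputs the relation $`\mathbf{r}_1 = \mathcal{H}\mathbf{r}_3 + \mathcal{K}`$ holds exactly, because consistent data satisfy ([expressionR1P]) with zero residual.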
STEP 3: Now we can eliminate $`\mathbf{r}_1`$ by substituting ([expressionR1]) in the three remaining second-order constraints ([constraint4]). After full expansion, we have the following
\begin{align}
\scalebox{0.55}
{$
e_{11}r_{13}^2 + e_{12}r_{13}r_{23} + e_{13}r_{23}^2 + e_{14}r_{13}r_{33} + e_{15}r_{23}r_{33} + e_{16}r_{33}^2 + e_{17}r_{13} + e_{18}r_{23} + e_{19}r_{33} + m = 0
$} \label{constraint1R} \\
\scalebox{0.55}
{$
e_{21}r_{13}^2 + e_{22}r_{13}r_{23} + e_{23}r_{23}^2 + e_{24}r_{13}r_{33} + e_{25}r_{23}r_{33} + e_{26}r_{33}^2 + e_{27}r_{13} + e_{28}r_{23} + e_{29}r_{33} = 0
$} \label{constraint2R} \\
r_{13}^2+r_{23}^2+r_{33}^2 - 1 = 0 , \label{objectR}
\end{align}
where the coefficients $`e_{ij}`$ and the constant $`m`$ are computed in closed form in terms of the components of $`\mathcal{H}`$ and $`\mathcal{K}`$. To solve the polynomial system (Eqs. ([constraint1R])-([objectR])), we aim to obtain a univariate polynomial in $`r_{33}`$ using the Macaulay resultant . This multivariate resultant is the ratio of two determinants, the denominator ([resDenominator]) and the numerator ([resNumerator])
\begin{equation}
\label{resDenominator}
\begin{bmatrix}
e_{11} & 0 & 0 \\
0 & e_{11} & 1 \\
E_{16} & e_{13} & 1
\end{bmatrix} , \quad E_{16} = e_{16}r_{33}^2 + e_{19}r_{33} + m
\end{equation}
\begin{equation}
\label{resNumerator}
\scalebox{0.55}
{$
\begin{bmatrix}
e_{11} &0 &0 &0 &0 &0 &0 &0 &0 &0 &0 &0 &0 &0 &0 \\
e_{12} &e_{11} &0 &0 &0 &0 &1 &0 &e_{21} &0 &0 &0 &0 &0 &0 \\
E_{14} &0 &e_{11} &0 &0 &0 &0 &1 &0 &e_{21} &0 &0 &0 &0 &0 \\
e_{13} &e_{12} &0 &e_{11} &0 &0 &0 &0 &e_{22} &0 &1 &0 &0 &0 &0 \\
E_{15} &E_{14} &e_{12} &0 &e_{11} &0 &0 &0 &E_{24} &e_{22} &0 &1 &0 &e_{21} &0 \\
E_{16} &0 &E_{14} &0 &0 &e_{11} &0 &0 &0 &E_{24} &0 &0 &1 &0 &e_{21} \\
0 &e_{13} &0 &e_{12} &0 &0 &1 &0 &e_{23} &0 &0 &0 &0 &0 &0 \\
0 &E_{15} &e_{13} &E_{14} &e_{12} &0 &0 &1 &E_{25} &e_{23} &0 &0 &0 &e_{22} &0 \\
0 &E_{16} &E_{15} &0 &E_{14} &e_{12} &E_{31} &0 &E_{26} &E_{25} &0 &0 &0 &E_{24} &e_{22} \\
0 &0 &E_{16} &0 &0 &E_{14} &0 &E_{31} &0 &E_{26} &0 &0 &0 &0 &E_{24} \\
0 &0 &0 &e_{13} &0 &0 &0 &0 &0 &0 &1 &0 &0 &0 &0 \\
0 &0 &0 &E_{15} &e_{13} &0 &0 &0 &0 &0 &0 &1 &0 &e_{23} &0 \\
0 &0 &0 &E_{16} &E_{15} &e_{13} &0 &0 &0 &0 &E_{31} &0 &1 &E_{25} &e_{23} \\
0 &0 &0 &0 &E_{16} &E_{15} &0 &0 &0 &0 &0 &E_{31} &0 &E_{26} &E_{25} \\
0 &0 &0 &0 &0 &E_{16} &0 &0 &0 &0 &0 &0 &E_{31} &0 &E_{26}
\end{bmatrix}
$} ,
\end{equation}
with elements $`E_{14}`$, $`E_{15}`$, $`E_{31}`$, $`E_{24}`$, $`E_{25}`$ and $`E_{26}`$ defined as
\begin{equation}
\scalebox{0.85}
{$
\begin{aligned}
E_{14} &= e_{14}r_{33} + e_{17}, \ E_{15} = e_{15}r_{33} + e_{18}, \ E_{31} = r_{33}^2 - 1 \\
E_{24} &= e_{24}r_{33} + e_{27}, \ E_{25} = e_{25}r_{33} + e_{28}, \ E_{26} = e_{26}r_{33}^2 + e_{29}r_{33}
\end{aligned}
$} .
\end{equation}
We set this resultant to $`0`$, and obtain the univariate polynomial equation
\begin{equation}
\scalebox{0.75}
{$
g_1r_{33}^8 + g_2r_{33}^7 + g_3r_{33}^6 + g_4r_{33}^5 + g_5r_{33}^4 + g_6r_{33}^3 + g_7r_{33}^2 + g_8r_{33} + g_9 = 0
$} ,
\end{equation}
where the coefficients $`g_i`$ are computed in closed form from the coefficients of Eqs. ([constraint1R])-([objectR]).
Although the eighth-degree (higher than fourth-degree) univariate polynomial $`\mathbf{P}`$ has no closed-form root expression, we can obtain all its roots by computing the eigenvalues of its companion matrix $`\mathcal{C}(\mathbf{P})`$ . For numerical stability, we approximate the roots through an iterative method that uses a generalized companion matrix $`\mathcal{C}(\mathbf{P}, \mathbf{S})`$ constructed from $`\mathbf{P}`$ and initialized by $`\mathcal{C}(\mathbf{P})`$. Here, $`\mathcal{C}(\mathbf{P})`$ and $`\mathcal{C}(\mathbf{P}, \mathbf{S})`$ are expressed as
\begin{equation}
\scalebox{0.55}
{$
\mathcal{C}(\mathbf{P}) =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-\frac{g_9}{g_1} & -\frac{g_8}{g_1} & -\frac{g_7}{g_1} & \cdots & -\frac{g_2}{g_1}
\end{bmatrix}
$} ,
\scalebox{0.55}
{$
\mathcal{C}(\mathbf{P}, \mathbf{S}) =
\begin{bmatrix}
s_1 & 0 & \cdots & 0 \\
0 & s_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & s_8
\end{bmatrix} -
\begin{bmatrix}
l_1 & l_2 & \cdots & l_8 \\
l_1 & l_2 & \cdots & l_8 \\
\vdots & \vdots & \vdots & \vdots \\
l_1 & l_2 & \cdots & l_8
\end{bmatrix}
$} ,
\end{equation}
where $`\mathbf{S} = (s_1, ... , s_8)`$, $`l_i = \lvert \frac{\mathbf{P}(s_i)}{\mathbf{Q}^{\prime}(s_i)} \rvert`$ and $`\mathbf{Q}^{\prime}(s_i) = \prod_{j \neq i}(s_i - s_j)`$. $`\mathbf{S}`$ is first initialized as the eigenvalues of $`\mathcal{C}(\mathbf{P})`$. Then at each iteration, $`\mathbf{S}`$ is updated as the eigenvalues of $`\mathcal{C}(\mathbf{P}, \mathbf{S})`$ until $`\mathbf{S}`$ converges. Eight possible solutions for $`r_{33}`$ are obtained.
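The plain companion-matrix construction used to initialize $`\mathbf{S}`$ can be sketched as follows (NumPy; illustrative only, without the generalized-companion refinement iteration):

```python
import numpy as np

def roots_via_companion(g):
    """All roots of g[0]*x^n + g[1]*x^(n-1) + ... + g[n] as eigenvalues of
    the companion matrix C(P): ones on the superdiagonal, negated monic
    coefficients (constant term first) in the last row."""
    g = np.asarray(g, dtype=float)
    a = g / g[0]                      # make the polynomial monic
    n = len(a) - 1
    C = np.zeros((n, n))
    C[:-1, 1:] = np.eye(n - 1)        # superdiagonal of ones
    C[-1, :] = -a[::-1][:-1]          # last row: -g_n/g_0, ..., -g_1/g_0
    return np.linalg.eigvals(C)
```

This is exactly what `numpy.roots` does internally; the iterative generalized-companion scheme in the text then polishes these eigenvalues.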
STEP 4: Each solution for $`r_{33}`$ (numeric value $`\hat{r_{33}}`$) corresponds to a single solution for the rest of the unknowns. For numerical stability, we compute the Sylvester resultant of Eq. ([constraint1R]) and Eq. ([objectR]) w.r.t. $`r_{23}`$. Setting the determinant of the Sylvester matrix to zero, we obtain a quartic polynomial $`\mathbf{P}_1`$ in $`r_{13}`$
\begin{equation}
\scalebox{0.6}
{$
\mathbf{P}_1 = \det \left( \begin{bmatrix}
f_{12} & e_{12}r_{13} + e_{15}\hat{r_{33}} + e_{18} & e_{13} & 0 \\
0 & f_{12} & e_{12}r_{13} + e_{15}\hat{r_{33}} + e_{18} & e_{13} \\
r_{13}^2 + \hat{r_{33}}^2 - 1 & 0 & 1 & 0 \\
0 & r_{13}^2 + \hat{r_{33}}^2 - 1 & 0 & 1
\end{bmatrix} \right) = 0
$} ,
\end{equation}
where $`f_{12} = e_{11}r_{13}^2 + e_{14}r_{13}\hat{r_{33}} + e_{16}\hat{r_{33}}^2 + e_{17}r_{13} + e_{19}\hat{r_{33}} + m`$. Similarly, we compute the Sylvester resultant of Eq. ([constraint2R]) and Eq. ([objectR]) w.r.t. $`r_{23}`$, and set its determinant to zero to obtain another quartic polynomial $`\mathbf{P}_2`$ in $`r_{13}`$. To solve this overdetermined system, we minimize the sum of the squares of $`\mathbf{P}_1`$ and $`\mathbf{P}_2`$, setting the derivative of $`\mathbf{P}_1^2 + \mathbf{P}_2^2`$ w.r.t. $`r_{13}`$ to zero to obtain a seventh-degree polynomial. The seven candidate solutions for $`r_{13}`$ obtained by the iterative method mentioned above are then tested to check whether both $`\mathbf{P}_1(\hat{r_{13}}) = 0`$ and $`\mathbf{P}_2(\hat{r_{13}}) = 0`$ hold.
After substituting the numeric solutions $`\hat{r_{13}}`$ and $`\hat{r_{33}}`$ into Eqs. ([constraint1R])-([objectR]), we apply the same optimization method to solve the overdetermined system for $`r_{23}`$. Note that we have a closed-form root expression for the third-degree polynomial obtained from the derivative of the cost function w.r.t. $`r_{23}`$ (the sum of the squares of the three polynomials in ([constraint1R])-([objectR])). We only keep the solution $`\hat{r_{23}}`$ if all of Eqs. ([constraint1R])-([objectR]) hold.
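The Sylvester-matrix construction underlying STEP 4 can be sketched generically (NumPy; this is the textbook construction, not the paper's specialized $`4 \times 4`$ matrix). Its determinant vanishes exactly when the two polynomials share a common root, which is what the elimination of $`r_{23}`$ exploits:

```python
import numpy as np

def sylvester(p, q):
    """Sylvester matrix of two polynomials (coefficients highest-degree
    first). det(sylvester(p, q)) is the resultant: zero iff p and q
    share a common root."""
    m, n = len(p) - 1, len(q) - 1      # degrees of p and q
    S = np.zeros((m + n, m + n))
    for i in range(n):                  # n shifted copies of p
        S[i, i:i + m + 1] = p
    for i in range(m):                  # m shifted copies of q
        S[n + i, i:i + n + 1] = q
    return S
```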
STEP 5: After obtaining $`\mathbf{r}_{3}`$, $`\mathbf{r}_{1}`$ can be calculated from Eq. ([expressionR1]) and $`\mathbf{r}_{2}`$ can be retrieved from Eq. ([crossProdR2]). Finally, $`{^C}\mathbf{t}{_L}`$ can be obtained using Eq. ([translation]).
The eight possible solutions yield up to four real solutions. Complex solutions are eliminated as follows: we square all the elements of $`\mathbf{r}_{1}`$, $`\mathbf{r}_{2}`$, $`\mathbf{r}_{3}`$ and $`{^C}\mathbf{t}{_L}`$, and check whether they all have non-negative real parts.
STEP 6: In practice, when noise prevents the solution for $`\mathbf{r}_3`$ from being real, we use its projection onto the real domain as the initial value. Eqs. ([constraint1R])-([objectR]) are then treated as a whole, $`\mathcal{F}(\mathbf{x}) = \mathbf{0}`$ with $`\mathbf{x} = \mathbf{r}_3`$, which can be solved using the Trust-Region Dogleg method . At each iteration $`k`$, the trust-region subproblem is
\begin{equation}
\label{trustRegionDogleg}
\begin{gathered}
\argmin_{\mathbf{d}_k} \scalebox{0.7}
{$
\dfrac{1}{2}\mathcal{F}(\mathbf{x}_k)^{\top}\mathcal{F}(\mathbf{x}_k) + \mathbf{d}_k^{\top}\mathcal{J}(\mathbf{x}_k)^{\top}\mathcal{F}(\mathbf{x}_k) + \dfrac{1}{2}\mathbf{d}_k^{\top}\mathcal{J}(\mathbf{x}_k)^{\top}\mathcal{J}(\mathbf{x}_k)\mathbf{d}_k
$} \\
\text{s. t. } \quad \Vert \mathbf{d}_k \Vert \le \Delta_k
\end{gathered} ,
\end{equation}
where $`\Delta_k`$ is the trust-region radius updated at each iteration, and the Jacobian is defined as $`\mathcal{J}(\mathbf{x}_k) = \left[ \nabla\mathcal{F}_1(\mathbf{x}_k), \nabla\mathcal{F}_2(\mathbf{x}_k), \nabla\mathcal{F}_3(\mathbf{x}_k) \right]^{\top}`$. The step $`\mathbf{d}_k`$ to obtain $`\mathbf{x}_{k+1}`$ is then constructed as
\begin{equation}
\mathbf{d}_k =
\begin{cases}
\lambda \mathbf{d}_C, &0 \le \lambda \le 1 \\
\mathbf{d}_C + (\lambda - 1)(\mathbf{d}_{GN} - \mathbf{d}_C), &1 \le \lambda \le 2
\end{cases} ,
\end{equation}
where $`\lambda`$ is the largest value such that $`\Vert \mathbf{d}_k \Vert \le \Delta_k`$. With $`\mathbf{g}_k = \mathcal{J}(\mathbf{x}_k)^{\top}\mathcal{F}(\mathbf{x}_k)`$ and $`\mathbf{B}_k = \mathcal{J}(\mathbf{x}_k)^{\top}\mathcal{J}(\mathbf{x}_k)`$, the Cauchy step and the Gauss-Newton step are respectively defined as $`\mathbf{d}_C = - \frac{\mathbf{g}_k^{\top} \mathbf{g}_k}{\mathbf{g}_k^{\top} \mathbf{B}_k \mathbf{g}_k} \mathbf{g}_k`$ and $`\mathbf{d}_{GN} = - \mathbf{B}_k^{-1} \mathbf{g}_k`$.
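The dogleg step above can be sketched as follows (NumPy; an illustrative implementation of the standard dogleg rule, with function names of ours):

```python
import numpy as np

def dogleg_step(J, F, delta):
    """Trust-region dogleg step for F(x) = 0: blend the Cauchy step d_C
    and the Gauss-Newton step d_GN within the radius delta."""
    g = J.T @ F                                # gradient g_k
    B = J.T @ J                                # Gauss-Newton Hessian B_k
    d_C = -(g @ g) / (g @ B @ g) * g           # Cauchy step
    d_GN = -np.linalg.solve(B, g)              # Gauss-Newton step
    if np.linalg.norm(d_GN) <= delta:
        return d_GN                            # full GN step fits (lambda = 2)
    if np.linalg.norm(d_C) >= delta:
        return delta * d_C / np.linalg.norm(d_C)  # clipped Cauchy (lambda <= 1)
    # otherwise: largest s with ||d_C + s (d_GN - d_C)|| = delta, s in [0, 1]
    p = d_GN - d_C
    a, b, c = p @ p, 2 * d_C @ p, d_C @ d_C - delta**2
    s = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return d_C + s * p
```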
STEP 7: We must select the unique valid solution among the candidates through the cheirality check. Fig. 1 shows four possible real solutions, which are $`\{L_1, C_1\}`$, $`\{L_2, C_1\}`$, $`\{L_3, C_2\}`$ and $`\{L_4, C_2\}`$ in terms of combinations of the LRF frame and the camera frame.
The unique solution $`\{L_1, C_1\}`$ is determined as follows. Both the LRF and the camera must face in the same direction, and the three laser points $`{^L}\mathbf{p}_1`$, $`{^L}\mathbf{p}_2`$ and $`{^L}\mathbf{p}_3`$, after transformation to the camera frame, must be in front of the camera. These two conditions are
\begin{equation}
\begin{gathered}
\mathbf{n}_z^{\top} \cdot {_{L_1}^{C_1}}\mathbf{R} \cdot \mathbf{n}_z > 0, \quad \mathbf{n}_z = [0,0,1]^{\top} \\
\mathbf{n}_z^{\top} \cdot \left({_{L_1}^{C_1}}\mathbf{R} \cdot {^{L_1}}\mathbf{p}_i + {^{C_1}}\mathbf{t}{_{L_1}}\right) > 0, \quad i = \{ 1, 2, 3\}
\end{gathered} .
\end{equation}
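The two cheirality conditions translate directly into code. A minimal sketch (NumPy; the function name is ours):

```python
import numpy as np

def cheirality_ok(R, t, laser_pts):
    """Keep a candidate (R, t) only if the LRF faces the same way as the
    camera and the three corner points map to the front of the camera."""
    n_z = np.array([0.0, 0.0, 1.0])
    if n_z @ R @ n_z <= 0:            # sensors must face the same direction
        return False
    # transformed laser points must have positive depth in the camera frame
    return all(n_z @ (R @ p + t) > 0 for p in laser_pts)
```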
From Multiple Observations
Multiple observations from different views can be used to suppress noise. The problem in this case is formulated as follows
\begin{equation}
\label{multMonlinearObj}
\begin{gathered}
\argmin_{{_L^C}\mathbf{R}, {^C}\mathbf{t}{_L}} \sum_{i=1}^M \sum_{j=1}^N \left( {^C}\bar{\mathbf{n}}_{ij}^{\top} \left({_L^C}\mathbf{R} \cdot {^L}\bar{\mathbf{p}}_{ij} + {^C}\mathbf{t}{_L} \right) - \bar{d}_{ij} \right)^2 \\
\text{s. t. } \quad {_L^C}\mathbf{R}^{\top} \cdot {_L^C}\mathbf{R} = \mathbf{I}, \quad \det \left({_L^C}\mathbf{R}\right) = 1
\end{gathered} ,
\end{equation}
where $`N = 6`$ and $`M`$ is the total number of observations. The solution is obtained through the seven steps described in Subsection 9.1. With the rotation represented by the Rodrigues' formula, the calibration result is further refined using the Levenberg-Marquardt (LM) method . Our solution to ([multMonlinearObj]) serves as an accurate initial value.
Details of the Calibration Approach
We explain below how to accurately extract the features required for our method. These are four normal vectors $`{^C}\mathbf{n}_i`$ ($`i = 1,2,3,4`$) and two distances $`d_j`$ ($`j = 1,2`$) from the camera, and three laser edge points $`{^L}\mathbf{p}_k`$ ($`k = 1,2,3`$) from the LRF.
Data from the LRF
Our calibration target can be built in a convex shape as shown in Fig. 2. We can put the calibration target on a planar object (such as a table, a wall or the ground) such that $`{^L}\mathbf{p}_1`$, $`{^L}\mathbf{p}_2`$ and $`{^L}\mathbf{p}_3`$ are accurately determined by the intersections of line segments. Specifically, the laser points intersecting the objects are first segmented into four parts $`{^L}\mathbf{l}_{1}`$, $`{^L}\mathbf{l}_{13}`$, $`{^L}\mathbf{l}_{23}`$, and $`{^L}\mathbf{l}_{2}`$ (e.g., using the Ramer-Douglas-Peucker (RDP) algorithm ). We then fit a line to each segment using the total least squares method . $`{^L}\mathbf{p}_1`$ is thus obtained as the intersection of lines $`{^L}\mathbf{l}_1`$ and $`{^L}\mathbf{l}_{13}`$. Similarly, $`{^L}\mathbf{p}_2`$ and $`{^L}\mathbf{p}_3`$ are respectively determined by the other two pairs $`\{ {^L}\mathbf{l}_{2}, {^L}\mathbf{l}_{23} \}`$ and $`\{ {^L}\mathbf{l}_{13}, {^L}\mathbf{l}_{23} \}`$.
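The total-least-squares line fit and the pairwise intersections can be sketched as follows (NumPy, in the 2D scan plane; the RDP segmentation step is assumed to have been done upstream, and the names are ours):

```python
import numpy as np

def fit_line_tls(pts):
    """Total-least-squares 2D line fit via SVD of the centered points;
    returns (point on line, unit direction)."""
    pts = np.asarray(pts, dtype=float)
    c = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - c)
    return c, Vt[0]                    # principal direction of the points

def intersect(l1, l2):
    """Intersection of two (point, direction) lines in the scan plane."""
    (c1, d1), (c2, d2) = l1, l2
    A = np.column_stack([d1, -d2])     # c1 + s*d1 = c2 + u*d2
    s = np.linalg.solve(A, c2 - c1)
    return c1 + s[0] * d1
```

Unlike ordinary least squares, the TLS fit minimizes perpendicular distances, so it is well-behaved for near-vertical segments.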
Data from the Camera
Using our calibration target with two checkerboards, the camera calibration is first performed with the MATLAB Toolbox . Two normal vectors $`{^C}\mathbf{n}_3`$ and $`{^C}\mathbf{n}_4`$ of planes $`T_3`$ and $`T_4`$ can be obtained directly from the calibration toolbox. Each checkerboard lies on the plane $`z_T = 0`$ in its own world frame. For example, the calibration result provides the rotation $`{_{T_3}^C}\mathbf{R}`$ and translation $`^C\mathbf{t}_{T_3}`$, which represent the orientation and position of the plane $`T_3`$ w.r.t. the camera. $`{^C}\mathbf{n}_3`$ is simply the negative of the third column of $`{_{T_3}^C}\mathbf{R}`$, and the distance $`d_1`$ from the camera to $`T_3`$ is $`d_1 = {^C}\mathbf{n}_3^{\top} \cdot ^C\mathbf{t}_{T_3}`$. Similarly, $`{^C}\mathbf{n}_4`$ and $`d_2`$ can also be calculated.
From the image (see Fig. 2), the unit line directions of $`\overline{PQ}`$, $`\overline{PR}`$ and $`\overline{PO}`$ are projected as $`\mathbf{v}_1`$, $`\mathbf{v}_2`$ and $`\mathbf{v}_3`$, and the projection of the vertex $`P`$ is $`\mathbf{p}_P`$. As stated in , we perform a weighted optimization initialized by the LSD line detector to accurately estimate the set $`\{ \mathbf{p}_P, \mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \}`$. Specifically, only the pixels within a rectangular region $`S_i`$ are considered for fitting $`\mathbf{v}_i`$. Let $`\boldsymbol{\eta}_i`$ be the normal of $`\mathbf{v}_i`$. The optimization problem is then expressed as
\begin{equation}
\argmin_{\mathbf{p}_P, \mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3} \sum_{i=1}^3 \sum_{j=1}^{N_i} G_i^j \cdot ( ( \mathbf{p}_i^j - \mathbf{p}_P ) \cdot \boldsymbol{\eta}_i )^2 ,
\end{equation}
where each region $`S_i`$ contains $`N_i`$ valid pixels $`\mathbf{p}_i^j`$ whose gradient magnitudes $`G_i^j`$, used as weights, exceed a threshold. Given the intrinsic matrix $`\mathbf{K}`$, the normal vector $`{^C}\mathbf{n}_i`$ ($`i = 1, 2`$) is obtained by $`{^C}\mathbf{n}_i = \frac{\mathbf{K}^{-1}\mathbf{v}_i}{\Vert \mathbf{K}^{-1}\mathbf{v}_i \Vert}`$.
Snapshot Selection
The solution from a single snapshot constrains the three laser points $`{^L}\mathbf{p}_k`$ ($`k = 1,2,3`$). In the presence of noise, it should also be verified that the laser points $`{^L}\mathbf{p}_{13}^i`$ on line $`{^L}\mathbf{l}_{13}`$ and $`{^L}\mathbf{p}_{23}^j`$ on $`{^L}\mathbf{l}_{23}`$ lie on $`T_3`$ and $`T_4`$, respectively. We deem a snapshot ill-conditioned if the average squared distance of the laser points to their corresponding planes is larger than a threshold $`\epsilon^2`$. With multiple observations, this criterion guides the selection of well-conditioned snapshots for further accuracy. Given the solution $`{_L^C}\hat{\mathbf{R}}`$ and $`{^C}\hat{\mathbf{t}}{_L}`$, we keep a snapshot if it satisfies
\begin{equation}
\label{selectSnapshot}
\scalebox{0.65}
{$
\sum\limits_{i=1}^{N_{13}} \dfrac{( {^C}\mathbf{n}_3^{\top} ( {_L^C}\hat{\mathbf{R}} \cdot {^L}\mathbf{p}_{13}^i + {^C}\hat{\mathbf{t}}{_L} ) - d_1 )^2}{2N_{13}} + \sum\limits_{j=1}^{N_{23}} \dfrac{( {^C}\mathbf{n}_4^{\top} ( {_L^C}\hat{\mathbf{R}} \cdot {^L}\mathbf{p}_{23}^j + {^C}\hat{\mathbf{t}}{_L} ) - d_2 )^2}{2N_{23}} \le \epsilon^2
$} ,
\end{equation}
where $`{^L}\mathbf{l}_{13}`$ and $`{^L}\mathbf{l}_{23}`$ are treated equally, with $`N_{13}`$ and $`N_{23}`$ points, respectively. The average squared distance of each line is first calculated and then weighted by one half, giving the two summation terms on the left-hand side of Eq. ([selectSnapshot]).
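The selection rule of Eq. ([selectSnapshot]) is only a few lines of code. A sketch (NumPy; names are ours):

```python
import numpy as np

def snapshot_ok(R, t, pts13, pts23, n3, d1, n4, d2, eps):
    """Accept a snapshot if the half-weighted average squared point-to-plane
    distance of both laser segments is within eps^2."""
    r13 = [(n3 @ (R @ p + t) - d1) ** 2 for p in pts13]  # residuals to T_3
    r23 = [(n4 @ (R @ p + t) - d2) ** 2 for p in pts23]  # residuals to T_4
    return sum(r13) / (2 * len(r13)) + sum(r23) / (2 * len(r23)) <= eps ** 2
```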
Real Experiments
To further validate our method, we perform real experiments in comparison with two existing methods and .
A Hokuyo URG-04LX LRF is rigidly mounted on a stereo rig consisting of a pair of Point Grey Chameleon CMLN-13S2C cameras (see Fig. [fig:cameralaserRig]). The sensors are synchronized based on time stamps. The LRF is set to a $`180^\circ`$ horizontal field of view, with an angular resolution of $`0.36^\circ`$ and a line scanning frequency of 10 Hz. Its scanning accuracy is $`\pm 1`$ cm within a range of 2 cm to 100 cm, and $`1\%`$ of the measurement for a range of 100 cm to 400 cm. The cameras have a resolution of $`640\times 480`$ pixels, and are pre-calibrated based on . Prior to data processing, the images are warped to remove radial and tangential distortions.
Comparison with Existing Methods
We compare our method to two state-of-the-art algorithms and using the ground truth of the stereo-rig baseline. Specifically, let $`\mathbf{T} = \left[\begin{smallmatrix} \mathbf{R} &\mathbf{t} \\ 0 0 0 &1 \end{smallmatrix} \right]`$ represent the transformation (rotation and translation) between two frames. For each method, the LRF is first calibrated w.r.t. both left and right cameras to obtain $`_{L}^{C_l}\mathbf{T}`$ and $`_{L}^{C_r}\mathbf{T}`$. We then compute the relative pose (baseline) between the stereo cameras and compare it with the ground truth $`_{C_l}^{C_r}\mathbf{T}`$ calibrated from the toolbox . Hence, the error matrix is $`\mathbf{T}_e = _{C_l}^{C_r}\mathbf{T} \cdot _{L}^{C_l}\mathbf{T} \cdot ( _{L}^{C_r}\mathbf{T} )^{-1}`$, where $`\mathbf{R}_e`$ and $`\Vert \mathbf{t}_e \Vert_2`$ are respectively compared with the identity and zero.
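The error-matrix evaluation can be sketched as follows (NumPy; extracting the rotation angle from the trace of $`\mathbf{R}_e`$ is our choice of metric, and the helper names are ours):

```python
import numpy as np

def hom(R, t):
    """Assemble a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def baseline_error(T_stereo, T_left, T_right):
    """T_e = T_stereo * T_left * T_right^{-1}; returns the rotation-angle
    error (degrees) and the translation error norm of T_e."""
    T_e = T_stereo @ T_left @ np.linalg.inv(T_right)
    R_e, t_e = T_e[:3, :3], T_e[:3, 3]
    cos_a = np.clip((np.trace(R_e) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_a)), np.linalg.norm(t_e)
```

For a perfectly consistent pair of calibrations, $`_{L}^{C_r}\mathbf{T} = {_{C_l}^{C_r}}\mathbf{T} \cdot {_{L}^{C_l}}\mathbf{T}`$ and both errors vanish.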
The stereo cameras are calibrated 10 times, each time with 20 image pairs randomly chosen from 40. With the rotation represented by the Rodrigues' formula, the means of the rotation angle and the baseline distance are respectively $`0.0137^\circ`$ and 96.0511 mm (the translation is [$`-96.0505`$, $`-0.3035`$, $`-0.1326`$]$`^{\top}`$ in mm). Because of their low STDs of $`0.0018^\circ`$ and 0.2742 mm (within $`0.01^\circ`$ and 1 mm), we treat their means as the ground truth. The distances between the LRF and the stereo cameras are approximately 100 mm and 150 mm. The 30 best observations for each method are obtained using a RANSAC framework (5 snapshots are randomly chosen from a total of 50). We randomly select subsets of 1, 5, 10, 15 and 20 observations to perform the calibration between the LRF and the stereo rig. The calibration is performed 10 times for each subset size.
Fig. 7 shows that our method has the smallest errors in both rotation and translation as the number of observations increases. Specifically, the mean errors from 20 observations are respectively $`0.3^\circ`$ and $`3.4`$ mm, almost three times lower than those of the method of Zhang and Pless ($`1.3^\circ`$ and $`12.0`$ mm) and the method of Kwak et al. ($`1.0^\circ`$ and $`12.6`$ mm). Moreover, our method can obtain a reasonable result even using only one snapshot, which is not possible for the other two methods. In other words, our method achieves the same level of accuracy using the smallest number of observations.
Real Scene Validation
This real-scene experiment tests the calibration results between the LRF and both the left and right cameras obtained by the two existing methods and our method (see Fig. 4). We generate a dataset of 30 input observations for each method using its own calibration target. The calibration results are obtained from 20 observations randomly chosen from the 30 of each method. We then test them using stereo images of our calibration target from new observations that were not involved in the calibration process. From the comparison, we observe that the laser scanning lines for our method match the calibration target (board boundaries) more closely from both left and right cameras. This validates the correctness of our calibration result for each LRF-camera pair.
Synthetic Experiments
For the simulation setting, the obtuse angle between the two triangular boards of the calibration target is set to $`150^{\circ}`$. Then, we uniformly and randomly generate roll, pitch and yaw in the range of $`\pm 45^{\circ}`$ for the orientation of the LRF w.r.t. the camera, and the three components of the position from 5 to 30 cm. For each instance of the ground truth, we randomly generate the orientation and position of the calibration target within the range of $`\pm 45^{\circ}`$ and 50 to 150 cm. In the extreme case, the angle between the LRF-to-target direction and the camera-to-target direction is more than $`33^{\circ}`$. This proportionally includes scenarios in which the camera-to-LRF distance is large, as long as the target is visible to both sensors.
Numerical Stability
We performed $`10^4`$ Monte-Carlo trials to validate the numerical stability of our solution in the noise-free case. For each trial, only one snapshot of the calibration target is needed. The error metric $`e = \Vert [\mathbf{R}_{gt}|\mathbf{t}_{gt}] - [\hat{\mathbf{R}}|\hat{\mathbf{t}}] \Vert_{\mathcal{F}}`$ is the Frobenius norm ($`\mathcal{F}`$) of the difference between the ground truth ($`gt`$) and our solution. The histogram of errors is shown in Fig. 3. The computational error varies but the accuracy remains high (around $`10^{-8}`$). This demonstrates that our method correctly solves the LRF-camera calibration problem, which further validates the sufficiency of the constraints from a single snapshot.
Sensitivity to Data Noise
This simulation tests the sensitivity of our solution to the noise in feature data. Two sources of error are taken into account: the laser depth uncertainty along the beam direction and the pixel uncertainty in line detection. Based on the practical setting, we respectively set the standard deviations (STDs) of laser noise and pixel error as $`\sigma_1 = 10`$ mm and $`\sigma_2 = 3`$ px in the $`640\times 480`$ image. A factor $`k_{\sigma}`$ varies from 0 to 1 to combine the noise information as $`k_{\sigma}\sigma_i`$ ($`i = 1, 2`$) from both sensors in one plot. 1000 Monte-Carlo trials are performed, each of which needs only one snapshot. The metrics for angular error (the chordal distance of rotation) and distance error (translation) are respectively $`e_{\theta} = 2\arcsin ( \frac{1}{2\sqrt{2}} \Vert \hat{\mathbf{R}}-\mathbf{R}_{gt} \Vert_{\mathcal{F}} )`$ and $`e_d = \Vert \hat{\mathbf{t}}-\mathbf{t}_{gt} \Vert_2`$. Fig. [fig:AngleDistErr] demonstrates that as the noise level increases, our solution is robust to the image noise but has a greater sensitivity to the laser depth noise.
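The two error metrics can be implemented directly. A sketch (NumPy; the function name is ours):

```python
import numpy as np

def calib_errors(R_hat, t_hat, R_gt, t_gt):
    """Angular error via the chordal distance,
    e_theta = 2 asin(||R_hat - R_gt||_F / (2 sqrt(2))), in degrees,
    and the translation error norm e_d."""
    e_theta = 2.0 * np.arcsin(np.linalg.norm(R_hat - R_gt, 'fro') / (2.0 * np.sqrt(2.0)))
    e_d = np.linalg.norm(t_hat - t_gt)
    return np.degrees(e_theta), e_d
```

For two rotations differing by an angle $`a`$, $`\Vert \hat{\mathbf{R}}-\mathbf{R}_{gt} \Vert_{\mathcal{F}} = 2\sqrt{2}\,\lvert\sin(a/2)\rvert`$, so the formula recovers $`a`$ exactly.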
Sensitivity to Angle between Boards
The sensitivity of our solution to the angle between the two triangular boards of the calibration target is tested. We set $`\sigma_1 = 10`$ mm and $`\sigma_2 = 3`$ px. The angle varies from $`30^{\circ}`$ to $`170^{\circ}`$. Fig. 5 shows the means and STDs of both angular and distance errors. Our solution is not very sensitive to the angle between the boards, with the smallest errors occurring around $`135^{\circ}`$.
Noise Reduction
We report a simulation designed to test the noise reduction of our solution when using multiple observations. Here, $`\sigma_1`$ varies from 1 mm to 10 mm with $`\sigma_2`$ set to 3 px. At each noise level, the means and STDs of both $`e_{\theta}`$ and $`e_d`$ are calculated from 1000 Monte-Carlo trials. From Fig. [fig:NoiseReduction], we observe that as the number of observations increases, the means and STDs of both errors asymptotically decrease. Moreover, with a small number of snapshots, our solution achieves a highly accurate initial value for further optimization. Specifically, $`e_{\theta}`$ and $`e_d`$ are respectively around $`0.5^{\circ}`$ and 5 mm for only 5 snapshots.
Snapshot Selection
With snapshot selection, additional accuracy is achieved (see Fig. 6) in the presence of noise ($`\sigma_1 = 10`$ mm and $`\sigma_2 = 3`$ px). The distance threshold $`\epsilon`$ is set to 5 mm. The errors $`e_{\theta}`$ and $`e_d`$ are further decreased by nearly $`50\%`$ compared with the solution without any single-snapshot check. Specifically, $`e_{\theta}`$ and $`e_d`$ are respectively around $`0.25^{\circ}`$ and 2 mm for only 5 snapshots. Thus, our method guarantees a highly accurate solution when using multiple snapshots.