[ceres-solver] From QuaternionParameterization to LocalParameterization

This article starts with ceres::QuaternionParameterization and builds an understanding of ceres::LocalParameterization on that basis.

The explanation of LocalParameterization in ceres-solver is as follows.

Sometimes the parameters \(\mathbf{x}\) can overparameterize a problem. In that case it is desirable to choose a parameterization to remove the null directions of the cost. More generally, if \(\mathbf{x}\) lies on a manifold of a smaller dimension than the ambient space that it is embedded in, then it is numerically and computationally more effective to optimize it using a parameterization that lives in the tangent space of that manifold at each point.

In short, LocalParameterization must be considered when optimizing variables that live on a manifold. Such variables are over-parameterized: the dimension of their representation is larger than their number of degrees of freedom, which introduces constraints among the components. Differentiating and optimizing these components directly would turn the problem into a constrained optimization, which is hard to handle. The mathematical remedy is to work in the tangent space of the manifold at the current value of the variable: the optimization step is computed in the tangent space, and the result is then projected back onto the manifold.

For SLAM problems, the manifold encountered most widely is rotation. A rotation can be expressed with only 3 quantities (so(3)), but in practice a 3-parameter representation runs into the gimbal-lock problem, so rotation is expressed with a higher-dimensional representation. The quaternion expresses the 3-degree-of-freedom rotation of three-dimensional space with 4 dimensions.

QuaternionParameterization's method int GlobalSize() returns 4, meaning the manifold (quaternion) is 4-dimensional; its method int LocalSize() returns 3, meaning the tangent space is 3-dimensional.
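For concreteness, here is a minimal sketch of how a quaternion block is registered with ceres so that it stores 4 numbers but is updated with 3-dimensional increments. It assumes a ceres version that still provides LocalParameterization (it was removed in favor of Manifold in ceres 2.2); the residual blocks are omitted.

#include <ceres/ceres.h>

int main() {
  // A unit quaternion in ceres order (w, x, y, z).
  double q[4] = {1.0, 0.0, 0.0, 0.0};

  ceres::Problem problem;
  // GlobalSize() == 4: the block stores four doubles.
  // LocalSize()  == 3: the solver's increment for this block is three-dimensional.
  problem.AddParameterBlock(q, 4, new ceres::QuaternionParameterization());
  // ... add residual blocks that use q, then call ceres::Solve(...).
  return 0;
}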

QuaternionParameterization's method bool ComputeJacobian(const double* x, double* jacobian) computes a 4×3 matrix. In the ceres code this matrix is called "global_to_local": it is the derivative of the variable on the manifold with respect to the variable in the tangent space. The ceres::CostFunction supplies the derivative of the residuals with respect to the manifold variable (\({\partial \mathbf{e} \over \partial \mathbf{X}^{(G)}}\)); multiplying it by this matrix (\({\partial \mathbf{X}^{(G)} \over \partial \mathbf{X}^{(L)}}\)) yields the derivative with respect to the tangent-space variable (\({\partial \mathbf{e} \over \partial \mathbf{X}^{(L)}}\)).
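Written out with dimensions, the chain rule that ceres applies here is

\[\begin{align}
\underbrace{{\partial \mathbf{e} \over \partial \mathbf{X}^{(L)}}}_{N(e)\times 3} =
\underbrace{{\partial \mathbf{e} \over \partial \mathbf{X}^{(G)}}}_{N(e)\times 4}\,
\underbrace{{\partial \mathbf{X}^{(G)} \over \partial \mathbf{X}^{(L)}}}_{4\times 3}
\end{align}\]

where \(N(e)\) denotes the number of residuals.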

QuaternionParameterization's method bool Plus(const double* x, const double* delta, double* x_plus_delta) adds the computed increment (delta, in the tangent space) to the optimization variable (x, on the manifold) to produce the updated variable (x_plus_delta, on the manifold), completing one optimization iteration.

Attention. 1. Where the ceres source code gives no explicit statement, matrices in raw memory are assumed to be stored Row Major, which is the opposite of Eigen's default Col Major. 2. ceres stores a quaternion in raw memory as w, x, y, z, while an Eigen quaternion is stored as x, y, z, w; this is why the ceres code provides ceres::EigenQuaternionParameterization in addition to ceres::QuaternionParameterization. 3. Quaternions in ceres are Hamilton quaternions and follow Hamilton's multiplication rule.
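A small sketch illustrating point 2; the numeric values are only an example (a 90° rotation about the x axis):

#include <Eigen/Geometry>

int main() {
  // ceres order in raw memory: w, x, y, z.
  double q_ceres[4] = {0.7071068, 0.7071068, 0.0, 0.0};

  // Eigen's constructor takes (w, x, y, z) as arguments...
  Eigen::Quaterniond q(q_ceres[0], q_ceres[1], q_ceres[2], q_ceres[3]);
  // ...but stores the coefficients internally as x, y, z, w:
  // q.coeffs() == (0.7071068, 0, 0, 0.7071068).

  // Therefore Eigen::Map<Eigen::Quaterniond>(q_ceres) would be wrong here:
  // it would interpret the buffer in x, y, z, w order.
  return 0;
}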

1. How ceres::QuaternionParameterization cooperates with AutoDiff

First, the source code of QuaternionParameterization::ComputeJacobian is excerpted below.

bool QuaternionParameterization::ComputeJacobian(const double* x,
                                                 double* jacobian) const {
  jacobian[0] = -x[1];  jacobian[1]  = -x[2];  jacobian[2]  = -x[3];  // NOLINT
  jacobian[3] =  x[0];  jacobian[4]  =  x[3];  jacobian[5]  = -x[2];  // NOLINT
  jacobian[6] = -x[3];  jacobian[7]  =  x[0];  jacobian[8]  =  x[1];  // NOLINT
  jacobian[9] =  x[2];  jacobian[10] = -x[1];  jacobian[11] =  x[0];  // NOLINT
  return true;
}

With the Row Major storage convention, and with the manifold variable (x) stored as w, x, y, z, the code above translates into the following formula.

\[\begin{align}
{\partial \mathbf{X}^{(G)} \over \partial \mathbf{X}^{(L)}} =
\begin{bmatrix}
-x & -y & -z \\
 w &  z & -y \\
-z &  w &  x \\
 y & -x &  w
\end{bmatrix}
\end{align}\]
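For reference (this display is added here for convenience and follows the convention of [1], where \(\mathbf{p} \otimes \mathbf{q} = [\mathbf{q}]_R\,\mathbf{p}\)), the right-multiplication matrix of a quaternion \(\mathbf{q} = (w, x, y, z)\) is

\[\begin{align}
[\mathbf{q}]_R =
\begin{bmatrix}
w & -x & -y & -z \\
x &  w &  z & -y \\
y & -z &  w &  x \\
z &  y & -x &  w
\end{bmatrix}
\end{align}\]

and its right three columns are exactly the matrix above.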

It is worth understanding where this matrix comes from. By formula (18) of reference [1], it corresponds to the right three columns of \([\mathbf{q}]_R\) shown above. If the manifold variable \(\mathbf{X}^{(G)}\) is written as the quaternion \(\mathbf{q}\) and the tangent-space variable as \(\mathbf{\theta}\), then the matrix above is \({\partial \mathbf{X}^{(G)} \over \partial \mathbf{X}^{(L)}} = {\partial\, \mathbf{\theta} \otimes \mathbf{q} \over \partial \mathbf{\theta}}\). This is a left-perturbation form. Now check the method QuaternionParameterization::Plus, whose source code is excerpted below.

bool QuaternionParameterization::Plus(const double* x,
                                      const double* delta,
                                      double* x_plus_delta) const {
  const double norm_delta =
      sqrt(delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2]);
  if (norm_delta > 0.0) {
    const double sin_delta_by_delta = (sin(norm_delta) / norm_delta);
    double q_delta[4];
    q_delta[0] = cos(norm_delta);
    q_delta[1] = sin_delta_by_delta * delta[0];
    q_delta[2] = sin_delta_by_delta * delta[1];
    q_delta[3] = sin_delta_by_delta * delta[2];
    QuaternionProduct(q_delta, x, x_plus_delta);
  } else {
    for (int i = 0; i < 4; ++i) {
      x_plus_delta[i] = x[i];
    }
  }
  return true;
}

As you can see from QuaternionProduct(q_delta, x, x_plus_delta);, the increment q_delta is left-multiplied onto the original variable x to obtain the new variable x_plus_delta, which matches the left-perturbation Jacobian in QuaternionParameterization::ComputeJacobian.

As an experiment, change the Jacobian computed by QuaternionParameterization::ComputeJacobian to the right three columns of \([\mathbf{q}]_L\) from formula (18) of reference [1], and change the way the increment is applied in QuaternionParameterization::Plus to right multiplication, i.e. QuaternionProduct(x, q_delta, x_plus_delta);. Combining this right-perturbation form of QuaternionParameterization with AutoDiff on an actual problem gives results that agree with the left-perturbation form within the allowed error.
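A sketch of this right-perturbation variant, assuming the same Row Major 4×3 Jacobian layout as the original; the class name RightQuaternionParameterization is made up for this experiment.

#include <cmath>
#include <ceres/local_parameterization.h>
#include <ceres/rotation.h>

class RightQuaternionParameterization : public ceres::LocalParameterization {
 public:
  // Right perturbation: x_plus_delta = x ⊗ q_delta.
  bool Plus(const double* x, const double* delta,
            double* x_plus_delta) const override {
    const double norm_delta =
        std::sqrt(delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2]);
    if (norm_delta > 0.0) {
      const double sin_delta_by_delta = std::sin(norm_delta) / norm_delta;
      double q_delta[4] = {std::cos(norm_delta),
                           sin_delta_by_delta * delta[0],
                           sin_delta_by_delta * delta[1],
                           sin_delta_by_delta * delta[2]};
      ceres::QuaternionProduct(x, q_delta, x_plus_delta);  // right multiplication
    } else {
      for (int i = 0; i < 4; ++i) x_plus_delta[i] = x[i];
    }
    return true;
  }

  // Right three columns of [q]_L, stored Row Major (4x3).
  bool ComputeJacobian(const double* x, double* jacobian) const override {
    jacobian[0] = -x[1]; jacobian[1]  = -x[2]; jacobian[2]  = -x[3];
    jacobian[3] =  x[0]; jacobian[4]  = -x[3]; jacobian[5]  =  x[2];
    jacobian[6] =  x[3]; jacobian[7]  =  x[0]; jacobian[8]  = -x[1];
    jacobian[9] = -x[2]; jacobian[10] =  x[1]; jacobian[11] =  x[0];
    return true;
  }

  int GlobalSize() const override { return 4; }
  int LocalSize() const override { return 3; }
};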

Conclusion: QuaternionParameterization's Plus and ComputeJacobian together determine whether the left-perturbation or the right-perturbation form is used.

Understanding QuaternionParameterization as above is mainly useful when AutoDiff or NumericDiff is not used and the user code supplies the Jacobian itself. In the available literature, derivatives are usually given with respect to so(3); derivatives with respect to the quaternion are hard to find. If you use ceres::QuaternionParameterization, ceres expects the user code to supply the derivative with respect to the quaternion, which is inconvenient for the user.

If users provide their own QuaternionParameterization, they gain control over the parameterization. The user can provide the so(3) derivative in the CostFunction and define QuaternionParameterization::ComputeJacobian to return the matrix \(\begin{bmatrix}\mathbf{I}_{3\times3} \\ (\mathbf{0}_{3\times1})^T\end{bmatrix}\). This matrix cuts the \(N(e)\times4\) Jacobian returned by the user down to an \(N(e)\times3\) matrix, where \(N(e)\) is the number of residuals. The way the increment is applied in QuaternionParameterization::Plus must be consistent with whether the so(3) derivative in the CostFunction uses the left or the right perturbation. With this scheme, the Jacobian returned by the user in the CostFunction should have its last column set to 0 and its first three columns filled with the so(3) derivative. (Note that ceres::QuaternionParameterization::Plus operates on half of the angle-axis vector, not the full angle-axis; applying it unchanged would make the computed Jacobian off by a factor of 2.) A sketch of this scheme is given below.
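A minimal sketch of such a user-defined parameterization. The class name So3QuaternionParameterization and the use of Eigen for the quaternion product are choices made here for illustration; it assumes the quaternion is stored in ceres order (w, x, y, z) and that the CostFunction returns the left-perturbation so(3) derivative in its first three Jacobian columns, with the fourth column zeroed.

#include <Eigen/Geometry>
#include <ceres/local_parameterization.h>

class So3QuaternionParameterization : public ceres::LocalParameterization {
 public:
  // delta is a full angle-axis increment on so(3), applied as a left
  // perturbation: q_plus = Exp(delta) ⊗ q, with Exp built from the half angle.
  bool Plus(const double* x, const double* delta,
            double* x_plus_delta) const override {
    Eigen::Quaterniond q(x[0], x[1], x[2], x[3]);  // (w, x, y, z)
    Eigen::Vector3d d(delta[0], delta[1], delta[2]);
    Eigen::Quaterniond dq = Eigen::Quaterniond::Identity();
    const double theta = d.norm();
    if (theta > 0.0) {
      dq = Eigen::Quaterniond(Eigen::AngleAxisd(theta, d / theta));
    }
    const Eigen::Quaterniond q_plus = (dq * q).normalized();
    x_plus_delta[0] = q_plus.w();
    x_plus_delta[1] = q_plus.x();
    x_plus_delta[2] = q_plus.y();
    x_plus_delta[3] = q_plus.z();
    return true;
  }

  // global_to_local = [I_{3x3}; 0^T]: it simply drops the CostFunction
  // Jacobian's (zeroed) fourth column, so the so(3) derivative supplied by
  // the user passes through unchanged.
  bool ComputeJacobian(const double* x, double* jacobian) const override {
    Eigen::Map<Eigen::Matrix<double, 4, 3, Eigen::RowMajor>> j(jacobian);
    j.topRows<3>().setIdentity();
    j.bottomRows<1>().setZero();
    return true;
  }

  int GlobalSize() const override { return 4; }
  int LocalSize() const override { return 3; }
};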

Of course, the approach described above (so(3) derivative in the CostFunction, identity-like ComputeJacobian) relies on a precondition: "every user-supplied derivative with respect to a manifold variable is converted, once inside ceres, into a derivative with respect to the tangent-space variable through LocalParameterization::ComputeJacobian, and only then used elsewhere." In current usage of ceres, this assumption has not been broken.

Now let us show that the derivative of the error with respect to the quaternion computed by AutoDiff, multiplied by the matrix returned by ceres::QuaternionParameterization::ComputeJacobian, is the left-perturbation derivative of the error with respect to so(3).

The most common use of a quaternion in error computation is to transform a three-dimensional point from one coordinate system to another, for example from the world frame to the camera frame: \({}^{C}\mathbf{X}=\mathbf{R}\{{}^C\mathbf{q}_W\}{}^W\mathbf{X}+{}^C\mathbf{t}_W\). By the chain rule, it suffices to show that \({\partial {}^{C}\mathbf{X} \over \partial {}^C\mathbf{q}_W}{\partial \mathbf{\theta} \otimes {}^C\mathbf{q}_W \over \partial \mathbf{\theta}} = {\partial {}^{C}\mathbf{X} \over \partial \mathbf{\theta}}\).

Expanding \({}^{C}\mathbf{X}\) with formula (115) of reference [1] gives:

\[\begin{align}
{}^{C}\mathbf{X} &= \begin{bmatrix}
[1-2(q_y^2+q_z^2)]X + 2(q_xq_y-q_wq_z)Y + 2(q_xq_z+q_wq_y)Z + t_x \\
2(q_xq_y+q_wq_z)X + [1-2(q_x^2+q_z^2)]Y + 2(q_yq_z-q_wq_x)Z + t_y \\
2(q_xq_z-q_wq_y)X + 2(q_yq_z+q_wq_x)Y + [1-2(q_x^2+q_y^2)]Z + t_z
\end{bmatrix} \\
{\partial {}^{C}\mathbf{X} \over \partial {}^C\mathbf{q}_W} &= \begin{bmatrix}
{\partial {}^{C}X \over \partial q_w} & {\partial {}^{C}X \over \partial q_x} & {\partial {}^{C}X \over \partial q_y} & {\partial {}^{C}X \over \partial q_z} \\
{\partial {}^{C}Y \over \partial q_w} & {\partial {}^{C}Y \over \partial q_x} & {\partial {}^{C}Y \over \partial q_y} & {\partial {}^{C}Y \over \partial q_z} \\
{\partial {}^{C}Z \over \partial q_w} & {\partial {}^{C}Z \over \partial q_x} & {\partial {}^{C}Z \over \partial q_y} & {\partial {}^{C}Z \over \partial q_z}
\end{bmatrix} \notag \\
&= 2 \begin{bmatrix}
q_yZ-q_zY & q_yY+q_zZ & q_wZ+q_xY-2q_yX & -q_wY+q_xZ-2q_zX \\
-q_xZ+q_zX & -q_wZ-2q_xY+q_yX & q_xX+q_zZ & q_wX+q_yZ-2q_zY \\
q_xY-q_yX & q_wY-2q_xZ+q_zX & -q_wX-2q_yZ+q_zY & q_xX+q_yY
\end{bmatrix} \\
{\partial {}^{C}\mathbf{X} \over \partial {}^C\mathbf{q}_W}{\partial \mathbf{\theta} \otimes {}^C\mathbf{q}_W \over \partial \mathbf{\theta}}
&= 2 \begin{bmatrix}
q_yZ-q_zY & q_yY+q_zZ & q_wZ+q_xY-2q_yX & -q_wY+q_xZ-2q_zX \\
-q_xZ+q_zX & -q_wZ-2q_xY+q_yX & q_xX+q_zZ & q_wX+q_yZ-2q_zY \\
q_xY-q_yX & q_wY-2q_xZ+q_zX & -q_wX-2q_yZ+q_zY & q_xX+q_yY
\end{bmatrix}
{1 \over 2}\begin{bmatrix}
-q_x & -q_y & -q_z \\
 q_w &  q_z & -q_y \\
-q_z &  q_w &  q_x \\
 q_y & -q_x &  q_w
\end{bmatrix} \notag \\
&= \begin{bmatrix}
0 & 2(-q_wq_y+q_xq_z)X+2(q_wq_x+q_yq_z)Y+[1-2(q_x^2+q_y^2)]Z & -2(q_wq_z+q_xq_y)X+[1-2(q_w^2+q_y^2)]Y+2(q_wq_x-q_yq_z)Z \\
2(q_wq_y-q_xq_z)X-2(q_wq_x+q_yq_z)Y-[1-2(q_x^2+q_y^2)]Z & 0 & [1-2(q_y^2+q_z^2)]X+2(-q_wq_z+q_xq_y)Y+2(q_wq_y+q_xq_z)Z \\
2(q_wq_z+q_xq_y)X-[1-2(q_w^2+q_y^2)]Y+2(-q_wq_x+q_yq_z)Z & -[1-2(q_y^2+q_z^2)]X+2(q_wq_z-q_xq_y)Y-2(q_wq_y+q_xq_z)Z & 0
\end{bmatrix}
\end{align}\]

Refer to "Visual SLAM 14 Lectures" 4.3.5, know that the left disturbance derivative to so(3) is \(-({}^C\mathbf{X})^{\wedge}\), which basically corresponds to the result obtained by the above derivation, which is a shift\({}^C\mathbf{t}_W\). This difference can be explained. The disturbance model using se(3) is \(\Delta\mathbf{R}(\mathbf{R}\mathbf{p}+\mathbf{t}) +\Delta \mathbf{t}\), so the derivative with respect to so(3) contains a displacement. The disturbance model in the above formula is \((\Delta\mathbf{R})\mathbf{R}\mathbf{p}+\mathbf{t}+\Delta\mathbf{ t}\), you should apply "Visual SLAM Fourteen Lectures" 4.3.4 to perform disturbances on so(3). In summary, if you are doing a Parameterization of the entire se(3) (refer to Zheng Fan’s blog "On-Manifold Optimization Demo using Ceres Solver"), then you need to use the derivative model of "Visual SLAM Fourteen Lectures" 4.3.5, if Only QuaternionParameterization, then only need to use the "Visual SLAM Fourteen Lectures" 4.3.4 derivative model. When determining the derivative model of Parameterization, you should also pay attention to how the residual is calculated in CostFunction.

2. Other uses of ceres::LocalParameterization

The content above has explained the two most important functions of ceres::QuaternionParameterization, ceres::QuaternionParameterization::ComputeJacobian and ceres::QuaternionParameterization::Plus. ceres::QuaternionParameterization inherits from ceres::LocalParameterization. Once you understand how ceres::QuaternionParameterization works, you can inherit from ceres::LocalParameterization and implement your own parameterization according to your needs.

A simple and very handy example is fixed-norm optimization. As explained earlier, ceres::LocalParameterization is used to optimize variables on a manifold, whose components constrain one another, so it can be regarded as a constrained optimization. Fixed-norm optimization constrains the norm of the variable being optimized. The most common case is optimizing the direction of the gravitational acceleration; see Algorithm 1 of VINS [2], where the gravity direction is refined during initialization.

In the implementation, you can compute the derivative with respect to the three-dimensional gravity vector in the CostFunction; in ceres::LocalParameterization::ComputeJacobian, determine the principal direction, compute the two tangent directions \(\mathbf{b}_1, \mathbf{b}_2\), and project the derivative of the three-dimensional gravity vector onto \(\mathbf{b}_1, \mathbf{b}_2\). The optimized increments along \(\mathbf{b}_1, \mathbf{b}_2\) are then merged back into the three-dimensional gravity vector in ceres::LocalParameterization::Plus, taking care to renormalize during the merge so that the norm stays fixed. In my own experiments this approach works. I have not looked closely at the VINS code; its function void RefineGravity(map &all_image_frame, Vector3d &g, VectorXd &x) does not use ceres::LocalParameterization. A sketch of such a parameterization is given below.
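A minimal sketch of such a fixed-norm parameterization. The class name FixedNormParameterization and the way the tangent basis is built are choices made here for illustration, not taken from VINS.

#include <cmath>
#include <Eigen/Dense>
#include <ceres/local_parameterization.h>

class FixedNormParameterization : public ceres::LocalParameterization {
 public:
  explicit FixedNormParameterization(double norm) : norm_(norm) {}

  // Build two unit vectors b1, b2 spanning the plane orthogonal to g.
  static void TangentBasis(const Eigen::Vector3d& g,
                           Eigen::Vector3d* b1, Eigen::Vector3d* b2) {
    const Eigen::Vector3d a = g.normalized();
    const Eigen::Vector3d tmp = (std::abs(a.z()) < 0.9)
                                    ? Eigen::Vector3d::UnitZ()
                                    : Eigen::Vector3d::UnitX();
    *b1 = (tmp - a * a.dot(tmp)).normalized();
    *b2 = a.cross(*b1);
  }

  // Apply the 2-d increment in the tangent plane, then rescale to the fixed norm.
  bool Plus(const double* x, const double* delta,
            double* x_plus_delta) const override {
    Eigen::Map<const Eigen::Vector3d> g(x);
    Eigen::Vector3d b1, b2;
    TangentBasis(g, &b1, &b2);
    Eigen::Vector3d g_plus = g + delta[0] * b1 + delta[1] * b2;
    g_plus = g_plus.normalized() * norm_;
    Eigen::Map<Eigen::Vector3d>(x_plus_delta) = g_plus;
    return true;
  }

  // global_to_local is the 3x2 matrix [b1 b2]: it projects the derivative with
  // respect to the 3-d vector onto the two tangent directions.
  bool ComputeJacobian(const double* x, double* jacobian) const override {
    Eigen::Map<const Eigen::Vector3d> g(x);
    Eigen::Vector3d b1, b2;
    TangentBasis(g, &b1, &b2);
    Eigen::Map<Eigen::Matrix<double, 3, 2, Eigen::RowMajor>> j(jacobian);
    j.col(0) = b1;
    j.col(1) = b2;
    return true;
  }

  int GlobalSize() const override { return 3; }
  int LocalSize() const override { return 2; }

 private:
  const double norm_;
};

Constructing it with the known gravity magnitude, e.g. new FixedNormParameterization(9.81), and attaching it to the 3-d gravity block keeps the norm fixed while the solver iterates in only two dimensions.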

References

[1] Sola, Joan. "Quaternion kinematics for the error-state Kalman filter." arXiv preprint arXiv:1711.02508 (2017).

[2] VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator, Tong Qin, Peiliang Li, Zhenfei Yang, Shaojie Shen, IEEE Transactions on Robotics.
