Autonomous driving jargon: Commonly used point cloud registration methods and future development directions

A point cloud is essentially a dataset, and the data contained in the point cloud output by different types of sensors varies slightly. For LiDAR, the output point cloud generally includes three-dimensional spatial coordinates (x, y, z), reflection intensity information, timestamps, and other data.

However, due to the limitations of sensor field of view, a single sensor can often only acquire point clouds within a limited field of view. How can a complete 3D point cloud of a scene be generated to support subsequent object recognition, classification, and path planning? Autonomous vehicles pre-store high-precision point cloud maps of a certain area; how can the vehicle's current position be determined using point clouds acquired in real-time by its sensors? Given initial attitude information, how can relative attitude change information be estimated using point clouds from two consecutive frames collected by the vehicle's sensors?

Point cloud registration, a crucial part of the entire point cloud processing pipeline, is a perfect solution to the aforementioned problems. In the 24th episode of "Autonomous Driving Insider," we'll explain what point cloud registration is, its role, common methods, and future development directions.

definition

Point cloud registration, also known as point cloud stitching or point cloud registration, is a process that transforms two point clouds with overlapping information into the same coordinate system by solving the transformation matrix (rotation matrix R and translation matrix T).

Point cloud registration has gone through three stages: manual, semi-automatic, and automatic. Manual registration refers to engineers adjusting two point cloud images by hand and eye until they are aligned in a way that is visually acceptable. Semi-automatic registration relies on instruments. Automatic registration uses algorithms to automatically register two point cloud frames, which is the focus of this article.

Point cloud registration consists of two steps: coarse registration and fine registration. Coarse registration refers to a rough registration process performed when the point cloud positions of two frames differ significantly and their relative poses are completely unknown. Its purpose is to provide better initial transformation values for subsequent fine registration. Fine registration, given an initial transformation matrix, further optimizes the process to obtain a more accurate transformation.

effect

Map building. Creating a point cloud map is the first step in high-precision map production. By registering consecutive frame point clouds collected from different locations, multiple frame point clouds from different locations can be unified into the same coordinate system, constructing a complete point cloud map of the scene.

High-precision positioning. For autonomous driving, accurate positioning is a prerequisite for achieving all the desired goals. Aside from Tesla, most other autonomous driving companies rely on high-precision maps as an essential tool for accurate positioning. Point cloud registration technology matches the point clouds scanned in real-time by vehicle sensors with pre-stored high-precision point cloud maps, and then fuses the resulting positioning with GNSS output, IMU output, and odometer output to achieve the final high-precision positioning result for the vehicle.

Attitude estimation. By registering the point clouds of two consecutive frames obtained in real time from the vehicle's sensors, the relative attitude information of the vehicle can be estimated.

method

I. ICP

Iterative Closest Point (ICP) is arguably one of the most classic registration methods. Proposed by PJBesl in 1992, it has been a full thirty years since then, yet its brilliance remains undiminished. From the earliest point-to-point ICP, methods such as point-to-line ICP, point-to-area ICP, and area-to-area ICP have been developed, collectively establishing the epoch-making significance of ICP in the field of point cloud registration.

The core idea of the earliest point-to-point ICP method is very simple: the same point in space is closest to each other in two frames of point clouds. After translation and rotation transformation, the coordinates of the two frames of point clouds at the same position are brought as close as possible or even equal, so as to achieve the registration effect.

The point-to-point ICP method comprises two stages: correspondence search and transform estimation. Correspondence search involves finding matching points for each point in two point clouds, while transform estimation uses this correspondence to estimate the transform matrix. These two stages iterate continuously, making the correspondence increasingly accurate, thereby finding the optimal transform matrix. The implementation process of the point-to-point ICP method can be mathematically described as follows:

(1) Given the point cloud PS to be registered, which is the point cloud currently scanned by the sensor, it consists of n three-dimensional points pi. At the same time, we have the target point cloud Pt, which can be a pre-built high-precision point cloud map or a point cloud collected by other sensors but with overlapping areas, which also contains a series of three-dimensional points pj.

(2) What we need to do is to find a 3x3 rotation matrix R and a 3x1 translation matrix t such that the following expression is minimized.

(3) Use the quaternion method or SVD method to solve for the values of R and t.

Finding the closest point between two point clouds in one step requires calculating the distance between each point in the point cloud to be registered and each point in the target point cloud sequentially. This is not only computationally complex but also time-consuming. Therefore, a distance threshold is usually set to speed up this process. When the distance between a point in the point cloud to be registered and a point in the target point cloud is less than a certain threshold, the corresponding point is considered to have been found, and it is no longer necessary to traverse every point in both the point cloud to be registered and the target point cloud.

After one iteration, by adjusting the weights of some corresponding point pairs and removing some unreasonable corresponding point pairs, we can obtain a temporary transformed point cloud. By calculating the cost function, we can obtain the optimal transformation between the temporary transformed point cloud and the target point cloud. Then, we compare this temporary point cloud and the target point cloud again to obtain another transformed point cloud and the optimal transformation. We iterate this process until we find the nearest point in the target point cloud and the globally optimal transformation for each point in the point cloud to be registered.

The registration process is illustrated in the figure below. In two point cloud frames with good initial values, through continuous iteration, the point cloud to be registered moves closer to the target point cloud along the direction of the arrow, and finally the registration of the target point cloud and the point cloud to be registered is achieved.

Point-to-point ICP methods are known for their simplicity, requiring no point cloud segmentation or feature extraction, no training data, and can be well extended to unknown scenarios. Furthermore, with good initial values, rigorous mathematical theory can guarantee good accuracy and convergence.

However, the point-to-point ICP method traverses the entire point cloud, resulting in an astonishingly high computational cost when the number of point clouds is large. Furthermore, the point-to-point ICP method only considers the distance between points, lacking utilization of point cloud feature information. Under constantly changing environmental conditions and the influence of measurement errors, it can produce significant registration errors. The most critical problem is that the point-to-point ICP method is sensitive to initial values and easily gets trapped in local optima, leading to registration failure.

When the initial values in the above image are poor, as shown in the image below, the point cloud to be registered will rotate clockwise as indicated by the arrow, following the principle of minimizing the distance between corresponding points. Although this will cause the distance between corresponding points on both sides to increase, the distance between most of the remaining points after removing the two sides will decrease. This is the local optimum problem caused by poor initial values in the point-to-point ICP method.

To address the high computational cost of traversing all points in a point cloud, the point-to-point ICP method introduces the concept of feature points. Feature points are representative points in the point cloud that best represent the effective spatial information of the point cloud, such as corner points or points at convex/concave boundaries. By extracting feature points from the point cloud and then applying the point-to-point ICP method, the computational cost can be significantly reduced while maintaining high accuracy.

In order to optimize the local optimum problem, some improved ICP algorithms have emerged, including point-to-surface ICP and surface-to-surface ICP.

Point-to-surface ICP considers the distance from the vertex of the point cloud to be registered to the surface containing the vertex of the target point cloud. Compared with directly calculating the point-to-point distance, it takes into account the local structure of the target point cloud, resulting in higher accuracy and less likelihood of getting trapped in local optima. However, point-to-surface ICP is a nonlinear problem, which is relatively slow and inefficient.

Surface-to-surface ICP considers the distance between the faces containing the vertices of the point cloud to be registered and the faces containing the vertices of the target point cloud, and also takes into account the local structures of both point clouds, resulting in higher accuracy and less susceptibility to local optima. However, as it is still a nonlinear optimization problem, it is slower and less efficient.

Therefore, the point-to-point ICP method is often used in the fine registration stage after coarse registration has been completed and good initial values have been obtained.

II. NDT

NDT (Normal Distribution Transform) transforms the target point cloud (which can be a high-precision point cloud map) into a normal distribution of multidimensional variables, and uses optimization techniques to find the optimal transformation so that the sum of the probability densities of the point cloud to be registered in the target point cloud is maximized after the transformation.

The implementation process of the NDT method can be described in mathematical terms as follows:

(1) Divide the target point cloud into grids of various sizes (for 2D point clouds, it is a 2D plane; for 3D point clouds, it is a 3D cube), while ensuring that each grid contains at least 6 points.

(2) Assume that the points in each grid follow a normal distribution, and calculate the mean vector q and covariance matrix C of the points in each grid.

(3) Based on the above information, we can model the normal distribution N(q, C) for each point X in the grid and obtain the Gaussian probability density function P(X) for each point in this grid cell.

In equation P(X), c is the normalization constant. We also see that this equation involves inverting the covariance matrix C. If the number of points in a cell is too small, the condition number of the covariance matrix will be too large, leading to excessive errors. Therefore, the above requires that each cell contain at least six points. Furthermore, each P(X) can be seen as an approximate representation of the cell surface, describing the position, orientation, and smoothness of the cell surface.

We will then provide an initial transformation matrix to transform the point cloud to be registered and bring it closer to the target point cloud. In this way, the points on the point cloud to be registered will fall into the grid cells of the target point cloud. Taking a single grid cell as an example, by substituting the point Y in the point cloud to be registered that falls into this grid cell into P(X) above, we can obtain the probability value of this point in this grid cell. This value is a likelihood value.

To reduce the impact of grid discretization and discontinuities, we shift this grid cell downwards, to the left, downwards again, and then to the left, resulting in three additional grid cells. Each of these three cells will have a probability value, and these four values are summed to obtain the total score for this point in the point cloud to be registered.

Each point falling within the target point cloud's grid cells has a score value. Multiplying all the score values gives us our objective function. Our ultimate goal is to find the optimal parameters that maximize the objective function score (maximum likelihood), at which point we consider the registration successful. This can be understood as the point cloud to be registered should occupy as many locations as possible in the target point cloud where points have a high probability of appearing.

The NDT method, through grid partitioning, can support registration of larger and denser point cloud maps. Furthermore, it does not utilize feature calculations and matching of corresponding points during the registration process, making it more efficient and computationally less resource-intensive than ICP. Moreover, when the overlap between the two point cloud frames to be registered is low and the structure is not obvious (few planes), NDT is more accurate than point ICP.

III. Deep Learning

Deep learning has proven incredibly effective in the field of autonomous driving, and point cloud registration is no exception. Deep learning-based point cloud registration algorithms are constantly being proposed, including PointNetLK, DCP, IDAM, RPM-Net, and 3DRegNet, among others. These deep learning models have already demonstrated in the lab that their performance and speed far surpass those of ICP and NDP methods. However, their effectiveness in complex real-world scenarios warrants continued attention.

Initial value acquisition

For ICP, the quality of the initial value largely determines the registration result of two frame point clouds. For NDT, although it is not sensitive to the initial value (the point cloud positions of the two frames are within 3m, and the angle is within +/- 45°), a good initial value is beneficial to improving the efficiency and quality of the registration. Different vehicle configurations dictate different methods for obtaining the initial value. For vehicles equipped only with LiDAR, the transformation result of the previous frame can be used as the initial value.

For vehicles equipped with both an IMU and an odometer, the IMU's acceleration measurement is more accurate, but the integrated velocity and displacement measurements will have accumulated errors. The odometer's velocity measurement is more accurate, but the acceleration calculated from the motion model will have some error. Therefore, for this configuration, fusing the IMU's acceleration measurement results and the odometer's velocity measurement results into an initial value will yield a highly reliable result.

For vehicles equipped with GNSS, if the GNSS can also output the heading angle, then the GNSS output can be used directly as the initial value. For GNSS that cannot output the heading angle, the heading angle of a planar motion can also be calculated using continuous position coordinates.

Outlook

There are mature algorithms for point cloud registration of a single sensor, but it can be seen that each method has its limitations and cannot meet the point cloud registration requirements of all autonomous driving scenarios. How to combine the strengths and make up for the weaknesses has become an important way for the development of point cloud registration of a single sensor.

With multi-sensor fusion becoming the mainstream, the registration of point clouds acquired by different types of sensors has become crucial and essential. However, research in this area is still in its early stages, with only some optimization strategies and deep neural network models available at the laboratory level. How to train a usable neural network model is something that everyone is eagerly anticipating.

Autonomous driving jargon: Commonly used point cloud registration methods and future development directions

Read next

CATDOLL Sabrina Hybrid Silicone Head

CATDOLL Charlotte TPE Head

CATDOLL 115CM Rosie TPE

CATDOLL 148CM Qing Silicone Doll