
Automation in Construction 47 (2014) 78–91

Contents lists available at ScienceDirect

Automation in Construction

journal homepage: www.elsevier.com/locate/autcon

Towards terrestrial 3D data registration improved by parallel programming and evaluated with geodetic precision

Janusz Będkowski a,⁎, Karol Majek a, Pawel Musialik a, Artur Adamek b, Dariusz Andrzejewski b, Damian Czekaj b

a Institute of Mathematical Machines, ul. Krzywickiego 34, 02-078 Warsaw, Poland
b Faculty of Geodesy and Cartography, Warsaw University of Technology, Pl. Politechniki 1, 00-661 Warsaw, Poland

⁎ Corresponding author. E-mail address: [email protected] (J. Będkowski).

http://dx.doi.org/10.1016/j.autcon.2014.07.013
0926-5805/© 2014 Elsevier B.V. All rights reserved.

Article info

Article history:
Received 24 October 2013
Received in revised form 10 June 2014
Accepted 25 July 2014
Available online 29 August 2014

Keywords:
Iterative closest point
Data registration
Mobile mapping
CUDA parallel programming
Spatial design support

Abstract

In this paper a quantitative and qualitative evaluation of the proposed ICP-based data registration algorithm, improved by parallel programming in CUDA (compute unified device architecture), is shown. The algorithm was tested on data collected with a 3D terrestrial laser scanner Z+F Imager 5010 mounted on the mobile platform PIONEER 3AT. The parallel implementation enables on-line data registration, even on a laptop with a standard hardware configuration (NVIDIA GeForce 6XX/7XX series graphics card). Robustness is assured by the use of CUDA-enhanced fast NNS (nearest neighbor search) applied to ICP (iterative closest point) with an SVD (singular value decomposition) solver. The evaluation is based on reference ground truth data registered with geodetic precision. The geodetic approach extends our previous work and gives an accurate benchmark for the algorithm. The data were collected in an urban area under a demolition scenario in a real environment. We compared four registration strategies concerning data preprocessing, such as subsampling and vegetation removal. The result is an analysis of the measured performance and the accuracy of the geometric maps. The system provides accurate metric maps on-line and can be used in several applications, such as mobile robotics for construction area modelling or spatial design support. It is a core component for our future work on mobile mapping systems.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

The 6D-SLAM (simultaneous localization and mapping) algorithm, apart from solving the simultaneous localization and mapping problem, allows for the quick and reliable creation of digital models of large environments without the need for direct intervention. The 6D comes from the six dimensions of the robot motion model, which integrates 3D position coordinates (x, y, z) with orientation information (yaw, pitch, and roll). Such a model is a natural choice for an outdoor environment. There is no limitation in using 6D-SLAM in indoor environments, but using pitch and roll angles on flat surfaces is not always necessary. The output of the algorithm, in most cases, is a map in one of two forms: dense or sparse. Dense maps are related to 3D point clouds [1], obtained typically with 3D laser scanners; sparse maps are related to features extracted mostly from images. The 3D data registration problem was introduced by Besl and McKay in Ref. [2]; since then, many researchers have worked on improving the accuracy and the performance of aligning two point clouds. Based on the state of the art, we can state that the solutions to key issues of 3D GPGPU (general purpose computing on graphics processing units) data registration proposed in the important contributions are very close to optimum, but may still be improved upon. An approach that is widely used for 3D data registration is the iterative closest point (ICP) algorithm. The goal of the ICP is to find the transformation matrix that minimizes the sum of distances between the corresponding points in two different data sets. The method's effectiveness depends mostly on solutions to two important problems:

• the nearest neighbor search (NNS) and
• choosing the proper optimization technique for the minimization of the mentioned function (estimation of the 3D rigid transformation).

The NNS procedure dominates the run time of the ICP algorithm; therefore, many researchers are trying to reduce its execution time. The state of the art (SoA) provides several CUDA-based approaches to the NNS problem in the ICP algorithm. The approach from Ref. [3] uses regular grid decomposition [4], whereas in Ref. [5] a kd-tree is used. The second problem, choosing the proper optimization technique, has been a research topic in recent decades. A comparison of four algorithms for estimating the 3D rigid transformation is shown in Ref. [6]. The first algorithm, proposed in Ref. [7], uses the singular value decomposition (SVD) of a derived matrix. The second approach, based on orthonormal matrices and the computation of an eigensystem of a derived matrix, is proposed in Ref. [8]. The third algorithm is shown in Ref. [9]. It finds the transformation for the ICP algorithm by using unit quaternions. The fourth algorithm, shown in Ref. [10], uses the so-called dual quaternions. Apart from these four closed-form solution methods, a novel linear solution to the scan registration problem is shown in Ref. [11]. The advantage of these new linear solutions is that they can be extended straightforwardly to n-scan registrations. It was stated that, under the assumption that the transformation (R, t) that has to be calculated by the ICP algorithm is small, it can be approximated by applying instantaneous kinematics. This solution was initially given in Refs. [12,13]. Reported experiments have shown that the helix transform performs qualitatively as well as the uncertainty-based algorithm using Euler angles.

The paper is composed of 11 chapters. The current one provides an introduction and a short state of the art summary. The second explains the motivation behind the research. The third, on the real task scenario, explains the details of the experiment. Chapter 4 describes the data acquisition and processing. In chapter 5, the methodology for evaluation is described, followed by the algorithm modifications: vegetation removal and subsampling. Chapter 8 provides a detailed description of the experiment, with the analysis of the results in chapter 9. Chapter 10 introduces the end-user case study, and chapter 11 closes with a summary and conclusions.

2. Problem formulation

The goal of this work is to benchmark the 3D data registration method, improved by CUDA parallel programming, shown in the previous work [14] within the scope of quantitative spatial design support. The benchmark is analyzed using reference data of geodetic precision. The approach extends the state of the art by providing qualitative information concerning the accuracy of the proposed method. The secondary goal is to test the system in the real task scenario with an assumption of on-line performance. The resulting maps can be used for numerous applications: urban area modeling, spatial design support, basic space design, etc.

3. Real task scenario

To ensure real-life conditions for the experiment, a suitable environment had to be chosen. Fig. 1b shows the object of interest: a building in the village of Klomino (Poland), abandoned since 1993. The choice is motivated by the hard terrain conditions of the area. The goal of the experiment is to create a metric model of this building. Data for the model are gathered with a geodetic laser range finder mounted onto a robotic platform (Fig. 1c). This scenario simulates a potential real robotic application: deployment of a mobile platform in a hazardous environment for gathering data and providing a metric map in an on-line fashion. Similar equipment (RIEGL LMS-Z210) was involved in disaster assessment at Fukushima 1 in 2011. The key factor is the accuracy of scan matching, which has to be as high as possible to increase the fidelity of the produced metric map.

Fig. 1. Real task scenario. (a) Location of scenario. (b) Object of interest. (c) Mobile robot and geodetic equipment.

4. Data registration

The ICP algorithm, with its variations, point to point and point to plane, has become a well-known method since it appeared in Ref. [2]. The fastest implementation that can be found in the literature needs 60 ms to align two point clouds, each of 320 × 240 data points [15], but the authors unfortunately did not discuss the scalability of the proposed method. The key concept of the standard ICP algorithm can be summarized in two steps [16]:

1. Compute correspondences between the two scans (nearest neighbor search).
2. Compute a transformation which minimizes the distance between corresponding points.

Iteratively repeating these two steps should result in convergence to the desired transformation. Range images (scans) are defined as a model set M, where

$$|M| = N_m \tag{1}$$

and a data set D, where

$$|D| = N_d. \tag{2}$$

The alignment of these two data sets is solved by minimizing the following cost function:

$$E(R, t) = \sum_{i=1}^{N_m} \sum_{j=1}^{N_d} w_{ij} \left\| m_i - (R d_j + t) \right\|^2 \tag{3}$$

$w_{ij}$ is assigned 1 if the ith point of M corresponds to the jth point in D; otherwise $w_{ij} = 0$. R is the rotation matrix, t is the translation vector, $m_i$ corresponds to points from the model set M, and $d_j$ corresponds to points from the data set D. It was already proven that the ICP algorithm needs a good initial prediction to achieve an accurate matching. Therefore, in this paper we decided to show that by decreasing the radius of NNS during ICP we can improve the accuracy. The main contribution of this paper is related to the following improvements of the 3D data registration shown in our previous work [3]:

• processing up to 64 × 1024 × 1024 points in a single step,
• using regular grid decomposition for a robust nearest neighborhood search,
• using a parallel reduction for the correlation matrix computation,
• implementation of the SVD solver in CUDA,
• data post processing implemented in CUDA.
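As a minimal sketch of the cost function in Eq. (3), the error for a candidate (R, t) can be evaluated as shown here, assuming that the binary weights $w_{ij}$ are stored as a list of matched index pairs (i, j) instead of a full matrix. The function name `icp_cost`, the data layout and the sample points are illustrative assumptions and not part of the implementation described in this paper.

```python
def icp_cost(model, data, pairs, R, t):
    """Evaluate E(R, t) = sum over matched pairs of ||m_i - (R d_j + t)||^2.

    model, data: lists of (x, y, z) tuples; pairs: list of (i, j) with w_ij = 1;
    R: 3x3 rotation as nested lists; t: translation as a 3-tuple.
    """
    total = 0.0
    for i, j in pairs:
        m, d = model[i], data[j]
        # Apply the rigid transformation R d_j + t to the data point.
        moved = [sum(R[r][c] * d[c] for c in range(3)) + t[r] for r in range(3)]
        total += sum((m[r] - moved[r]) ** 2 for r in range(3))
    return total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
data = [(0.0, 0.0, 0.0), (-1.0, 1.0, 0.0)]   # model shifted by -1 along x
pairs = [(0, 0), (1, 1)]
cost_before = icp_cost(model, data, pairs, identity, (0.0, 0.0, 0.0))
cost_after = icp_cost(model, data, pairs, identity, (1.0, 0.0, 0.0))
```

In this toy case the data set is the model shifted by one unit along x, so the cost drops to zero once the translation (1, 0, 0) is applied.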

Fig. 2. Parallel implementation in CUDA. (a) Scalable programming model in CUDA. (b) Neighboring buckets with their indexing in a regular grid of buckets implemented on the GPU (k = 32, 64, 128, 256 or 512). (c) System overview for data registration.

4.1. CUDA implementation of classic ICP

NVIDIA GPUs are fully programmable multi-core chips built around an array of processors working in parallel. The GPU is composed of an array of SM (FERMI)/SMx (KEPLER) multiprocessors, each of which can launch up to 1024 co-resident concurrent threads. Thread management (creation, scheduling, synchronization) is performed at the hardware level (SM/SMx); therefore, the overhead cost is extremely low. The SM/SMx multiprocessors work in a SIMT scheme (single instruction, multiple threads), where threads are executed in groups of 32, called warps. The CUDA programming model defines the host and the device. The host executes sequential CPU procedures, whereas the device executes parallel programs called kernels. A kernel works according to an SPMD scheme (single program, multiple data).

Fig. 3. Vegetation detection and removal. (a) Input data. (b) Normal vectors. (c) Removed vegetation.

Fig. 4. Geodetic network used for measuring ground truth data. (a) Geodetic network. (b) Numbered poses of registered scans.

Fig. 5. Comparison of three strategies for registration of observations 1 and 2.

CUDA massively parallel computation has several applications in the given problem. The main idea is to use the GPU for NNS by decomposing the 3D space $x, y, z \in \langle -1, 1 \rangle$ into a regular grid of $2^k \times 2^k \times 2^k$ ($k \in \{4, 5, 6, 7, 8, 9\}$) buckets. Another idea is to perform calculations for each query point in parallel using the SIMT scheme in CUDA. Assuming the scalable programming model shown in Fig. 2a, which allows the GPU architecture to scale the number of multiprocessors and memory partitions, we can expect higher performance on GPUs with a higher number of multiprocessors. To demonstrate this, Fig. 8 shows the performance difference between the high-end GeForce GTX TITAN (14 multiprocessors) and the common GeForce GT 650M (2 multiprocessors). Each multiprocessor in our implementation is able to perform calculations for up to 1024 query points in parallel. Therefore, increased performance can be observed for data sets of over 2048 data points.

In 6D-SLAM, one of the problems is that single point clouds only partially overlap each other. Because the assumption of full overlap is violated, we are forced to add a maximum matching threshold parameter $d_{max}$. This threshold addresses the fact that some points will not have any correspondence in the second scan, preventing them from being matched. The value of the threshold is connected with the size of a single bucket: $d_{max} < 2/2^k$. In most implementations of ICP, the choice of $d_{max}$ represents a tradeoff between convergence and accuracy. A low value may lead to not finding any neighbors and, as such, to an essentially random solution. On the other hand, large values may result in bad convergence (far from the optimum).

In our approach the state-of-the-art algorithm described in Ref. [17] is improved by replacing the complex k-d tree data structure with a CUDA regular grid. As creating the grid representation is much faster than building a full tree, the overall performance of the closest point search is significantly increased. The parallel implementation further decreases the computation time. All derivations of the investigated registration method can be found in Ref. [18]. For comparison purposes, the classic ICP is listed as Algorithm 1, and the ICP algorithm using CUDA parallel programming is listed as Algorithm 2. The main idea is to decompose the 3D space into a regular grid of buckets (Fig. 2b) and to perform the NNS computation for each query point in parallel.
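The regular grid search described above can be sketched sequentially in Python. This is an illustration under stated assumptions, not the CUDA implementation of this paper: coordinates are assumed to be normalized to [-1, 1], the grid has 2^k buckets per axis, each query inspects its own bucket plus the 26 neighbors, and the names `bucket_index`, `build_grid` and `nearest_neighbor` are hypothetical. On the GPU the body of `nearest_neighbor` would run once per query point in parallel.

```python
from collections import defaultdict
from math import dist


def bucket_index(point, k):
    """Map a point with coordinates in [-1, 1] to integer bucket indices
    of a 2^k x 2^k x 2^k grid (bucket edge length 2 / 2^k)."""
    n = 2 ** k
    return tuple(min(int((c + 1.0) * n / 2.0), n - 1) for c in point)


def build_grid(points, k):
    """Group point indices by the bucket that contains them."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[bucket_index(p, k)].append(idx)
    return grid


def nearest_neighbor(query, points, grid, k, d_max):
    """Search the query's bucket and its 26 neighbors; return the index of
    the closest point within d_max, or None when there is no match."""
    assert d_max < 2.0 / 2 ** k, "d_max must be smaller than the bucket edge"
    bx, by, bz = bucket_index(query, k)
    best, best_d = None, d_max
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for idx in grid.get((bx + dx, by + dy, bz + dz), ()):
                    d = dist(query, points[idx])
                    if d <= best_d:
                        best, best_d = idx, d
    return best


model = [(0.10, 0.10, 0.10), (0.50, 0.50, 0.50), (-0.90, -0.90, -0.90)]
grid = build_grid(model, k=4)
match = nearest_neighbor((0.12, 0.10, 0.10), model, grid, k=4, d_max=0.1)
no_match = nearest_neighbor((0.90, -0.90, 0.90), model, grid, k=4, d_max=0.1)
```

Because d_max is smaller than the bucket edge, any point within d_max of the query must lie in one of the 27 inspected buckets, which is why the threshold is tied to the bucket size.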

5. Method of evaluation

The main idea behind evaluating the proposed method is to create, using geodetic methods, an accurate reference model and to compare the model produced by 6D-SLAM against it. Before conducting the experiment with the robot, a local, high-accuracy geodetic control network has to be established. The number of control points of the network is chosen based on the shape of the object of interest. For the described experiment, four points were chosen. The position of the control points has to be determined with high accuracy, as it is later used for transforming 3D scans into the network's coordinate system. Precise measurement of the network was performed using a highly accurate Leica TCRP1201+ total station. Results were adjusted using the least squares method.

Fig. 6. Data for registration experiment shown in Fig. 7. (a) Raw data. (b) Subsampled data. (c) Subsampled data without trees.

The final accuracy of the control network was approximately 0.3 mm. Apart from the control points, a set of artificial markers and natural tie points (building corners etc.) may be used. In the experiment a set of four paper markers was used. The reference model is built in two steps. First, a uniform local coordinate system is established for the scans. After that, the scans are transformed into the control network's geodetic coordinate system. Transformation parameters can be computed using a 3D Helmert transformation, which was used in our case. The final accuracy of the model was 2.9 mm. The whole process was carried out manually. A detailed description of the reference model building process is given in Section 8. Fig. 4a shows the final network. The process of evaluating the models created by the 6D-SLAM algorithm requires them to be transformed into the same geodetic coordinate system as the reference model. Often the control points, used as tie points in geodetic model building, cannot be used, as the ambiguity of the SLAM models is too high. Alternatively, the centroids of each single scan of the model can be chosen. Then a 3D Helmert transformation with six parameters (without changing scale) can be used. This enables the exclusion of errors coming from the exterior pose of the full models, permitting a focus only on those coming from the matching process. The centroids' poses are compared with their reference model counterparts. The results are local fit errors that can be averaged to get the global error. This information supports the decision about the accuracy of the model.
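The centroid-based comparison from this section can be sketched as follows, assuming that both models are already expressed in the same geodetic coordinate system, that the local fit error of a scan is the Euclidean distance between its centroid and the reference centroid, and that the global error is the arithmetic mean of the local errors. The function names and the sample coordinates are hypothetical.

```python
from math import dist
from statistics import mean


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[a] for p in points) / n for a in range(3))


def centroid_errors(slam_scans, reference_scans):
    """Local fit error per scan: distance between the centroid of the
    registered scan and the centroid of its reference counterpart."""
    return [dist(centroid(s), centroid(r))
            for s, r in zip(slam_scans, reference_scans)]


reference = [[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
             [(0.0, 4.0, 0.0), (0.0, 6.0, 0.0)]]
# Hypothetical SLAM result: first scan shifted by 3 mm in x, second by 4 mm in z.
slam = [[(0.003, 0.0, 0.0), (2.003, 0.0, 0.0)],
        [(0.0, 4.0, 0.004), (0.0, 6.0, 0.004)]]
local_errors = centroid_errors(slam, reference)
global_error = mean(local_errors)
```

This sketch compares centroid positions only; comparing the orientation part of the pose would need the per-scan rotation as an additional input.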

6. Vegetation detection

Considering the nature of the ICP algorithm, it is evident that dynamic and unstable objects can decrease the accuracy of the matching. In many cases such interference may be ignored because it is local in nature (for example, a single person, even moving, is only a small part of the scan). The problem is more significant when we consider large unstable areas, such as vegetation. The amount of vegetation in the scan may influence the global accuracy of the matching. Therefore, we decided to implement a robust method for vegetation identification and removal. The process is based on normal vector analysis (Fig. 3). The surface normal is estimated by the principal component analysis (PCA) of a covariance matrix C created from the nearest neighbors of the query point [19]. We developed a PCA solver based on the SVD method that performs the normal vector computation in parallel for each query point. In the last step of the algorithm, the orientation of the normal vector is decided. The basic principle behind vegetation detection is checking, for each point, whether the direction and orientation of the neighbors' normal vectors are similar to those of the considered point. Points for which the percentage of similar neighbors is lower than a threshold (10% in our case) are considered vegetation and are removed from the scan. This simple approach misclassifies some building points, especially at corners, but their number is much lower than the number of filtered vegetation points and thus is not considered a problem.
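The classification rule above can be sketched for a single point, assuming that unit normal vectors have already been estimated and that two normals count as similar when the angle between them is below a tolerance. The 15 degree tolerance, the handling of points without neighbors and the name `is_vegetation` are assumptions made for illustration; only the 10% threshold comes from the text.

```python
from math import cos, radians


def is_vegetation(normal, neighbor_normals, angle_deg=15.0, min_similar=0.10):
    """Flag a point as vegetation when fewer than min_similar of its
    neighbors have a unit normal within angle_deg of the point's normal."""
    if not neighbor_normals:
        return True  # assumption: isolated points are treated as unstable
    limit = cos(radians(angle_deg))
    similar = sum(
        1 for n in neighbor_normals
        if sum(a * b for a, b in zip(normal, n)) >= limit
    )
    return similar / len(neighbor_normals) < min_similar


# Hypothetical neighborhoods: a flat wall patch and a scattered tree crown.
wall_neighbors = [(0.0, 0.0, 1.0)] * 9 + [(1.0, 0.0, 0.0)]
tree_neighbors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
                  (-1.0, 0.0, 0.0), (0.6, 0.8, 0.0)]
wall_flag = is_vegetation((0.0, 0.0, 1.0), wall_neighbors)
tree_flag = is_vegetation((0.0, 0.0, 1.0), tree_neighbors)
```

The dot product is compared without taking its absolute value, so a flipped normal counts as dissimilar, which matches the statement that both direction and orientation are checked.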


Fig. 7. Registration errors for data from Fig. 6.


7. Data subsampling

In our research, we observed that an equal density of points in the initial 3D scans improves the matching accuracy. Therefore, we subsample each scan. The method counts the points within each bucket and keeps at most a given maximum number of them (in our experiment we keep 1000 points per bucket, assuming a decomposition into 256 × 256 × 256 buckets).
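A minimal sequential sketch of this per-bucket subsampling is given here, assuming coordinates normalized to [-1, 1] and assuming that the first points encountered in a bucket are the ones kept (the text does not state which points are retained). The name `subsample` and its parameters are illustrative.

```python
from collections import defaultdict


def subsample(points, cells_per_axis=256, max_per_bucket=1000):
    """Keep at most max_per_bucket points in every bucket of a regular grid
    over [-1, 1]^3 (the first points encountered are kept)."""
    counts = defaultdict(int)
    kept = []
    for p in points:
        key = tuple(
            min(int((c + 1.0) * cells_per_axis / 2.0), cells_per_axis - 1)
            for c in p
        )
        if counts[key] < max_per_bucket:
            counts[key] += 1
            kept.append(p)
    return kept


dense = [(0.001 * i, 0.0, 0.0) for i in range(5)]   # five points in one bucket
sparse = [(0.5, 0.5, 0.5)]                          # one point in another bucket
reduced = subsample(dense + sparse, max_per_bucket=2)
```

With a limit of two points per bucket, the dense cluster is cut from five points to two while the isolated point is kept, which evens out the point density.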

Fig. 8. Comparison between GeForce GT 650M and GeForce GTX TITAN performance for the total registration of 18 scans. Y axis: time in seconds; X axis: number of the experiment (1: inner = 50 and outer = 50, data subsampled without trees; 2: inner = 300 and outer = 300, data subsampled without trees; 3: inner = 1000 and outer = 1000, data subsampled without trees; 4: inner = 300 and outer = 300, data subsampled).

8. Experiments

Fig. 1 shows the environment where data were collected. To ensure the full coverage of the building, 18 scanning points were chosen. Fig. 4b shows the 18 initial poses for the data registration evaluation. Data were collected with the 3D laser measurement system Z+F IMAGER 5010 mounted onto a robotic platform. Initial poses were obtained manually using dedicated software. The goal was to register all 18 scans and to compare them with the reference model obtained with classical geodetic methods.

8.1. Ground truth data with geodetic precision

The control network was established using measurements from the geodetic survey total station, adjusted with the least squares method. In the first step, the distances and directions were averaged for control points in 4 series and for markers in 1. Slope distances were reduced at the instrument level, and network side lengths were averaged from 2 measurements each. It was decided to adjust the positions of both the control points and the markers on the building in one calculation process. Thus the observation system consisted, besides observations made on control points, of observations connecting the network with the markers.

8.2. Horizontal adjustment

For the purposes of this paper, a local coordinate system was defined with the X axis parallel to the section between points 1002 and 1001. The horizontal coordinates of point 1002 were set to X = 100.000 m and Y = 100.000 m. The control network was not tied to any external coordinate system, so it was decided to perform a free network adjustment. Using the already known angle and distance values, approximate coordinates of the four control points and the four targets marked on the building were calculated. The system of equations consisted of ten angle observations and ten distance observations.


Fig. 9. Reference geometric model obtained with geodetic precision.


They were weighted by the root mean square (RMS) error of each observation. The RMS errors were based on the accuracy of the Leica TCRP 1201+ total station used for the measurements (angle error 0.0003 gon, distance error in reflectorless mode 0.002 m + 2 ppm) and on the number of observation series. As a result of the adjustment, the coordinates and accuracy of eight points were obtained. Except for two targets, whose accuracy was near 2 mm (because they were measured from only one position), the targets and control points were determined with an accuracy better than 1 mm. The parameters of the mean error ellipses were also calculated and are plotted in Fig. 4a. The last step of the horizontal adjustment was to run statistical tests. Both the global tests, which check for gross errors and for the correct choice of the alignment model, and the local tests were passed successfully.

8.3. Height adjustment

Fig. 10. Errors for each of our four evaluated registration strategies using the reference model obtained with geodetic precision, with additional comparison to the SoA algorithm from 3DTK.

The first step was calculating the height differences between the control points and the targets on the building. For that purpose, averaged vertical angles and distances, reduced on incremental level, were used. Point 1001 (Fig. 4) was assumed fixed, and its height was set to 10 m. A system of equations was created from ten equations of height differences. Observations were weighted by the weight parameter p, calculated as the inverse of the network's side length multiplied by the square root of the number of measurements m: m = 1 if the measurement was taken from one point, and m = 2 if the measurement was taken from both points.
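One reading of the weight definition above is p = sqrt(m) / d, where d is the side length; the sentence could also be parsed differently, so the following lines are a sketch under that assumption only. The function name `height_weight` and the 50 m side length are hypothetical.

```python
from math import sqrt


def height_weight(side_length_m, measured_from_both_ends):
    """Weight p = sqrt(m) / side length, with m = 2 for a side measured
    from both points and m = 1 for a side measured from one point only."""
    m = 2 if measured_from_both_ends else 1
    return sqrt(m) / side_length_m


p_single = height_weight(50.0, measured_from_both_ends=False)
p_double = height_weight(50.0, measured_from_both_ends=True)
```

Under this reading, shorter sides and sides observed from both ends receive a larger weight in the height adjustment.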

8.4. Building of a reference model

As a first step, a rough manual orientation of the scans was performed. Afterwards, a precise orientation was performed based on the control points and markers visible in the scans. The orientation was done separately for the X and Y horizontal coordinates and the Z vertical coordinate. Measurements made on control points were used to georeference the model (exterior orientation) to the network's coordinate system. Interior orientation was based on paper markers on walls and on characteristic, easy to identify points of the building. For the best accuracy, points were chosen regularly over the whole building. As a result, a compact and precise reference model of the object was compiled. Orientation parameters of all eighteen point clouds were calculated. An accuracy evaluation of the model was carried out: based on the residuals vx, vy, and vz obtained for every measured point, the RMS was calculated. Its value reached 2.9 mm.

Fig. 11. Qualitative evaluation of the registration: cross-sections of the building at heights of 1.5 m and 12 m. (a) Height 1.5 m. (b) Height 12 m.
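The RMS computation from the residuals vx, vy and vz can be sketched as shown here, assuming that the squared 3D residual lengths are averaged over the number of points before the square root is taken. The exact normalization used by the authors is not stated, and the residual values are hypothetical.

```python
from math import sqrt


def rms_3d(residuals):
    """Root mean square of 3D residuals (vx, vy, vz), one tuple per point."""
    squared = [vx * vx + vy * vy + vz * vz for vx, vy, vz in residuals]
    return sqrt(sum(squared) / len(squared))


# Hypothetical residuals in metres for three check points.
residuals = [(0.002, 0.001, 0.002), (0.000, 0.003, 0.000), (0.001, 0.002, 0.002)]
rms = rms_3d(residuals)
```

Each of the three sample residuals has a length of 3 mm, so the RMS of this toy set is 3 mm as well.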

8.5. Evaluation of registration method

The experiment was meant to answer two main questions:

1. How to improve the accuracy of the final model?
2. How to efficiently reduce the data set and minimize the impact on the accuracy of the final model?

To improve the accuracy we decided to apply a registration method that decreases the NNS radius parameter after performing a number of ICP iterations. This parameter determines the size of the NNS search area. Fig. 5 demonstrates the impact of the NNS radius parameter on the improvement of the registration's accuracy (for the demonstration, we registered observations 1 and 2). Three strategies for the registration of observations 1 and 2 were compared:

1. 300 iterations with radius = 6.25 m,
2. 200 iterations with radius = 6.25 m + 100 iterations with radius = 1.56 m,
3. 200 iterations with radius = 6.25 m + 50 iterations with radius = 1.56 m + 50 iterations with radius = 0.20 m.
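The three strategies listed above can be written down as schedules of (iterations, radius) stages. The following sketch expands each schedule into one NNS radius per ICP iteration; the helper name `radius_schedule` and the dictionary layout are illustrative assumptions, and an ICP loop would simply read the radius for iteration i from the expanded list.

```python
def radius_schedule(stages):
    """Expand (iterations, radius) stages into one NNS radius per ICP iteration."""
    radii = []
    for iterations, radius in stages:
        radii.extend([radius] * iterations)
    return radii


# Radii in metres, as listed for the three compared strategies.
STRATEGIES = {
    1: [(300, 6.25)],
    2: [(200, 6.25), (100, 1.56)],
    3: [(200, 6.25), (50, 1.56), (50, 0.20)],
}
schedules = {name: radius_schedule(stages) for name, stages in STRATEGIES.items()}
```

All three schedules run for 300 iterations in total, so they differ only in how early and how far the search radius is reduced.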

In all cases (errors for angles and errors of displacement), the third registration strategy converges to a satisfactory result. We cannot start the registration with a small radius because the registration then tends to find a different local minimum. Therefore, a large radius should be used for the initial registration. Another problem observed during the experiments was related to the raw scans. We concluded that, to achieve better results, proper subsampling that ensures equal density is needed. We show the results in Figs. 6 and 7, where we registered observations 1 and 2 with different strategies (raw, subsampled, subsampled without trees). We applied the following radius tuning strategy for the registration: 10 iterations with r = 6.25 m, 10 with r = 3.125 m, 10 with r = 1.5625 m, 10 with r = 1.5625 m, 10 with r = 1 m, 10 with r = 0.80 m, 10 with r = 0.60 m, 10 with r = 0.40 m, and 30 with r = 0.20 m. The result shows a much higher error for raw data than for the other approaches. The registration of subsampled data and of subsampled data without trees converges to a similar error, which leads to the conclusion that the proposed data reduction is beneficial. An important observation is that data reduction based on vegetation removal slightly reduces the accuracy of the data registration. However, as shown further in the paper, it provides an approximately twofold speed-up. Finally, we decided to compare this registration strategy within four experiments:

1. inner = 50 and outer = 50, data subsampled without trees (total number of data 3,457,939),

2. inner = 300 and outer = 300, data subsampled without trees (total number of data 3,457,939),

3. inner = 1000 and outer = 1000, data subsampled without trees (total number of data 3,457,939),

4. inner = 300 and outer = 300, data subsampled (total number of data 5,786,040).
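The text does not specify how the equal-density subsampling is performed. One common way to obtain roughly equal density is a voxel-grid filter, sketched below purely as an illustration: the function name `voxel_subsample`, the cell size, and the choice of replacing each occupied cell by the centroid of its points are all assumptions, not the authors' method.

```python
import numpy as np


def voxel_subsample(points, cell_size):
    """Return one representative point (the centroid) per occupied cubic cell.

    Assumption: this voxel-grid filter only illustrates equal-density
    subsampling; it is not the procedure used in the paper.
    """
    cells = np.floor(points / cell_size).astype(np.int64)
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()  # keep a 1-D index regardless of NumPy version
    sums = np.zeros((len(counts), 3))
    np.add.at(sums, inverse, points)  # accumulate the points of each cell
    return sums / counts[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dense = rng.uniform(0.0, 1.0, size=(5000, 3))   # densely scanned region
    sparse = rng.uniform(5.0, 6.0, size=(50, 3))    # sparsely scanned region
    reduced = voxel_subsample(np.vstack([dense, sparse]), cell_size=0.5)
    print(len(reduced))
```

After filtering, a densely scanned region and a sparsely scanned one contribute at most one point per cell, so regions close to the scanner no longer dominate the correspondences.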

Parameters inner and outer define the number of points used during the nearest neighbor search. Inner defines the number of random points that are chosen from the bucket in which the processed point lies. Outer defines the number of points taken from each of the 26 neighboring buckets. For further explanation of the inner and outer parameters, we encourage studying our previous work [3]. These parameters can drastically influence the performance of the registration, in both time and accuracy. Accuracy is greater with higher numbers,

whereas computation time is lower with lower ones. The best results were obtained for experiment 4 (inner = 300 and outer = 300, data subsampled). Data reduction slightly decreases the accuracy of the registration. A decreased number of searched points within the NNS procedure


Fig. 12. Maps of differences, negative and positive residuals, contour of reference model marked by the red line.

J. Będkowski et al. / Automation in Construction 47 (2014) 78–91

affects the accuracy, but the computation time is much lower. The total computation time of the registration of 18 scans is as follows (GeForce GT 650M 1 GB GDDR5, GeForce GTX TITAN 6 GB GDDR5):

1. inner = 50 and outer = 50, data subsampled without trees: total registration time (415 s, 70 s),

2. inner = 300 and outer = 300, data subsampled without trees: total registration time (1754 s, 228 s),

3. inner = 1000 and outer = 1000, data subsampled without trees: total registration time (4770 s, 607 s),

4. inner = 300 and outer = 300, data subsampled: total registration time (3527 s, 449 s).
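The role of the inner and outer parameters in the experiments above can be illustrated with a small CPU sketch of a bucket-based nearest neighbor search. It is a simplified illustration and not the CUDA implementation from Ref. [3]: the names `build_buckets` and `bucket_nearest`, the dictionary-based grid, and the random sampling of the neighboring buckets are assumptions made for this example.

```python
import numpy as np
from collections import defaultdict


def build_buckets(points, bucket_size):
    """Map each cubic bucket (integer grid key) to the indices of its points."""
    buckets = defaultdict(list)
    keys = np.floor(points / bucket_size).astype(int).tolist()
    for i, key in enumerate(keys):
        buckets[tuple(key)].append(i)
    return buckets


def bucket_nearest(query, points, buckets, bucket_size, inner, outer, radius, rng):
    """Index of the nearest candidate within `radius`, or None.

    At most `inner` candidates are drawn at random from the query's own
    bucket and at most `outer` from each of the 26 neighboring buckets.
    """
    home = tuple(np.floor(query / bucket_size).astype(int).tolist())
    candidates = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                key = (home[0] + dx, home[1] + dy, home[2] + dz)
                members = buckets.get(key, [])
                limit = inner if key == home else outer
                if len(members) > limit:
                    # Assumption: uniform random sampling without replacement.
                    members = rng.choice(members, size=limit, replace=False)
                candidates.extend(members)
    if not candidates:
        return None
    cand = np.asarray(candidates)
    dist = np.linalg.norm(points[cand] - query, axis=1)
    best = dist.argmin()
    return int(cand[best]) if dist[best] <= radius else None


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.uniform(0.0, 10.0, size=(500, 3))
    grid = build_buckets(cloud, bucket_size=2.0)
    hit = bucket_nearest(np.array([5.0, 5.0, 5.0]), cloud, grid, 2.0,
                         inner=50, outer=50, radius=2.0, rng=rng)
    print(hit)
```

With small inner and outer values the search inspects fewer candidates per query and may miss the true nearest neighbor, which is consistent with the trade-off between computation time and accuracy reported above.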

Fig. 8 visualizes this comparison. GeForce GTX TITAN (14 multiprocessors) is, on average, about seven times faster than GeForce GT 650M (2 multiprocessors); this supports the statement that scaling the


Fig. 13. Maps of differences, negative and positive residuals, contour of reference model marked by the red line.


number of multiprocessors can increase the performance of our parallel implementation.
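The "seven times" figure can be checked directly from the reported total registration times. The short calculation below uses only the timings listed above; the dictionary keys are labels chosen for this example.

```python
# Total registration times for 18 scans in seconds, as reported above:
# (GeForce GT 650M, GeForce GTX TITAN).
timings = {
    "1: inner=50, outer=50, without trees": (415, 70),
    "2: inner=300, outer=300, without trees": (1754, 228),
    "3: inner=1000, outer=1000, without trees": (4770, 607),
    "4: inner=300, outer=300, subsampled": (3527, 449),
}

# Per-experiment speedup of the GTX TITAN over the GT 650M.
speedups = {name: gt650m / titan for name, (gt650m, titan) in timings.items()}
mean_speedup = sum(speedups.values()) / len(speedups)

# Ratio of the multiprocessor counts of the two cards (14 vs. 2).
multiprocessor_ratio = 14 / 2

# Gain from vegetation removal at inner = outer = 300 (experiment 4 vs. 2).
gain_gt650m = timings["4: inner=300, outer=300, subsampled"][0] / \
    timings["2: inner=300, outer=300, without trees"][0]
gain_titan = timings["4: inner=300, outer=300, subsampled"][1] / \
    timings["2: inner=300, outer=300, without trees"][1]

for name, value in speedups.items():
    print(f"{name}: {value:.2f}x")
print(f"mean speedup: {mean_speedup:.2f}x, multiprocessor ratio: {multiprocessor_ratio:.0f}x")
print(f"vegetation removal gain: {gain_gt650m:.2f}x (GT 650M), {gain_titan:.2f}x (GTX TITAN)")
```

The individual speedups range from roughly 5.9 to 7.9, with a mean of about 7.3, close to the ratio of multiprocessor counts. The gain from vegetation removal is close to 2 on both cards, in line with the statement made earlier.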

9. Evaluation of the accuracy of the registration

The reference geometric model (2.9 mm accuracy), used in the quantitative evaluation, is shown in Fig. 9. Fig. 10 shows the resulting errors for each of the four evaluated registration strategies. For comparison, the

results for the SoA implementation from Ref. [17] are shown (3DTK implementation). The smallest errors were observed for the fourth strategy: inner = 300 and outer = 300, data subsampled. Removing vegetation slightly decreases the accuracy, but the performance of the registration is much better (from 3527 s down to 1754 s of total registration time using GeForce GT 650M). To demonstrate the qualitative result of our approach, we show walls and corners for all models in reference to the geodetic model. Fig. 4b shows numbering for the


Table 1
Average error by method.

Model  σx [m]  σy [m]  σz [m]  σω [rad]  σϕ [rad]  σκ [rad]  σxyz [m]
1      0.039   0.034   0.194   0.0085    0.0136    0.0019    0.119
2      0.035   0.034   0.036   0.0023    0.0035    0.0017    0.036
3      0.038   0.042   0.034   0.0022    0.0027    0.0014    0.039
4      0.017   0.034   0.018   0.0013    0.0017    0.0012    0.025


analysis: black circles 1, 2, 3, 4 mark walls; white circles 1, 2, 3, 4 mark corners. For each model, no matter what approach was used for its creation, consistency with the assumed accuracy is required. It should provide a true representation of the geometry and structure of the scanned object. The models were tested by comparing selected elements of their visualization with those of the reference model. Two types of tests were conducted. The first was to generate a map of differences showing the deviations of individual walls between the studied models and the reference one. For the second, cross sections were made at the corners of the building. The tested models, for the most part, are located on one side of the reference model. It is evident that Model 1 has deviations much larger than the other models, which demonstrates a significant geometric distortion. The most interesting are the results for wall 2 (Fig. 12) and wall 4 (Fig. 13): the deviations for the worst model (number 1) are 0.25–0.30 m. The building has been mapped most accurately by Model 4. On wall 1, displacements reached 15 cm. It can be noted that deviations are lower in Model 2 than in Models 3 and 4.
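A map of differences for a single wall can be thought of as signed distances from the points of a tested model to the corresponding wall of the reference model. The sketch below is a simplified illustration under stated assumptions: the reference wall is approximated by a best-fit plane, and the names `fit_plane`, `signed_residuals` and `outward_hint` are hypothetical. The paper does not describe the software used to produce Figs. 12 and 13.

```python
import numpy as np


def fit_plane(points, outward_hint):
    """Best-fit plane through `points` via SVD.

    Returns (centroid, unit normal). The normal is flipped, if needed, to
    point in the same half-space as `outward_hint`, because SVD alone
    leaves its sign undetermined.
    """
    centroid = points.mean(axis=0)
    _, _, Vt = np.linalg.svd(points - centroid)
    normal = Vt[-1]  # direction of least variance
    if normal @ outward_hint < 0:
        normal = -normal
    return centroid, normal


def signed_residuals(model_points, centroid, normal):
    """Signed point-to-plane distances: positive on the side of the normal."""
    return (model_points - centroid) @ normal


if __name__ == "__main__":
    # Synthetic reference wall in the plane x = 0, 10 m wide and 12 m high.
    y, z = np.meshgrid(np.linspace(0.0, 10.0, 11), np.linspace(0.0, 12.0, 13))
    reference = np.column_stack([np.zeros(y.size), y.ravel(), z.ravel()])
    # Synthetic tested model displaced by 0.15 m along the wall normal.
    model = reference + np.array([0.15, 0.0, 0.0])
    c, n = fit_plane(reference, outward_hint=np.array([1.0, 0.0, 0.0]))
    print(signed_residuals(model, c, n).mean())
```

Residuals of one sign across a whole wall correspond to a model lying on one side of the reference, as observed above for most of the tested models.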

Wall number 4 has clearly been mapped worse than the three others. This may be a result of the iterative scan matching, as the scans of wall 4 were taken last; they are thus biased by the error accumulated from fitting the previous point clouds. The results for all 4 walls show that approach 4 was the best and approach 1 the worst. Analysis of the control cross sections of the modeled object allowed us to obtain additional information about the quality of the models made using our algorithm. Sections were positioned in eight places: at the corners of the block at 1.5 m and 12 m (Fig. 11) above the ground. The analysis confirms the findings from the previous examination. Considering all models, it is apparent that the coherence of Model 1 is the worst and that of Model 4 is the best. There is no significant visible non-compliance in Model 4. Differences in deviations from the reference model between sections at the bottom and at the top of the building suggest that the accuracy changes with height. Such discrepancies may be caused by the tilt of the models in comparison to the reference model, a

shift along the Z axis, or errors in the angular and linear parameters of the orientation of scans. In summary, both the maps of differences and the cross-section analysis confirm the conclusions of the examination of the orientation of the models. Table 1 shows the average error for each of the pose parameters, computed for each of the presented models.

(a) Robot Husky equipped with 3D laser Z+F Imager 5010. (b) Mobile ruggedized NVIDIA GRID system.

Fig. 14. Mobile mapping system for field operations.

10. End user case study

Fig. 14 shows the mobile mapping system developed at the Institute of Mathematical Machines for field operations as a result of the research shown in this paper. Currently, this system is used in two research projects (“Research of Mobile Spatial Assistance System”, no. LIDER/036/659/L-4/12/NCBR/2013, and FP7 ICARUS, “Integrated Components for Assisted Rescue and Unmanned Search operations”). The first project concerns the application of spatial design support, where a mobile mapping system is used for accurate mapping of urban environments. These maps are used for spatial design support by providing software tools for interaction with spatial intent. The second project concerns 3D mapping in SAR (search and rescue). Research on building the mobile mapping system is inspired by the Fukushima nuclear disaster in 2011, where 3D mapping in such scenarios could help in mission execution and 3D data collection. This practical and very important application scenario necessitates the on-line nature of the approach. Decreased time of measurement, data registration and visualization can decrease the risk of potential contamination of the workers involved in this particular SAR mission. We are convinced that laser scanning technologies and efficient exploitation of scanning results can contribute to future catastrophe prevention. The mobile mapping system is composed of the mobile robot Husky equipped with a 3D laser Z+F Imager 5010 and a ruggedized NVIDIA GRID system (hardware configured by the Boston Limited company from the UK). The NVIDIA GRID system is composed of a Supermicro RZ-1240i-NVK2 server with Citrix software capable of GPU virtualization. The system needs two operators, and the initial phase of the operational procedure takes about 15 minutes. After this procedure, the system is capable of working for 2 hours, acquiring 20–30 local scans in different locations. The operator controls the robot and the laser from a laptop connected via a WiFi router. Software in the GRID system for data registration and visualization is designed in the SaaS (software as a service) model and is available from any device (laptop, smartphone, tablet). A single operator collects data from the robot and performs data registration. The visualization of an accurate 3D map is then redistributed over the local network via Citrix XenApp. Therefore, it is accessible from any mobile


(a) Institute of Mathematical Machines (design support scenario). (b) Klomino, main object of interest in this paper (SAR scenario). (c) Underground garage (SAR scenario). (d) Indoor (SAR scenario).

Fig. 15. Use cases for mobile mapping system. These locations were scanned with Z+F Imager 5010 and registered with software described in this paper.


device. Using GRID technology improves 3D rendering over Ethernet and solves the problem of high-performance computing (CUDA) in the cloud. The system is capable of registering 3D data in the field and immediately redistributing the resulting map over the local network to many end users. This functionality can help increase awareness of the SAR operation. To summarize, the proposed mobile mapping system can be used for:

• spatial design support (Fig. 15a),
• search and rescue applications (Fig. 15b–d).

11. Conclusions

In this paper we have shown the quantitative and qualitative evaluation of the data registration algorithm improved by CUDA parallel programming. Data were collected in an urban area demolition scenario using the 3D terrestrial laser scanner Z+F Imager 5010 mounted on the mobile platform PIONNER 3AT to simulate a real mobile robotic task. We have shown a system for on-line data registration together with an analysis of its accuracy. The proposed implementation is robust because of the fast nearest neighbor search applied for the iterative closest point with singular value decomposition solver. The performed qualitative and quantitative evaluation is based on the reference ground truth data calculated with geodetic precision. The geodetic approach extends the previous work [3,14] and provides an accurate benchmark for the algorithm. We compared four registration strategies for data preprocessing, such as subsampling and vegetation removal. We observed that proper subsampling increases the accuracy and performance. Vegetation removal increases the

performance, but, at the same time, it slightly decreases the accuracy. It is thus recommended for applications that are time critical. Our system has already been tested in realistic experiments and provides accurate, consistent metric maps on-line. The accuracy level of the maps is appropriate for potential applications, such as information gathering for urban area modeling, spatial design support and initial space planning. It is a core component for our future work on mobile mapping systems.

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 285417, project ICARUS: Integrated Components for Assisted Rescue and Unmanned Search operations. This work was also supported by the NCBiR (Polish Centre for Research and Development) project: Research of Mobile Spatial Assistance System, no. LIDER/036/659/L-4/12/NCBR/2013, and by the NCN (Polish National Center of Science) project: Methodology of semantic models building based on mobile robots observations, no. DEC-2011/03/D/ST6/03175. We would like to thank our reviewers for their hard work and valuable comments.

References

[1] A. Nüchter, H. Surmann, K. Lingemann, J. Hertzberg, S. Thrun, 6D SLAM with an application in autonomous mine mapping, Proceedings of the IEEE International Conference on Robotics and Automation, 2004, pp. 1998–2003.


[2] P.J. Besl, N.D. McKay, A method for registration of 3-D shapes, IEEE Trans. Pattern Anal. Mach. Intell. 14 (2) (1992) 239–256, http://dx.doi.org/10.1109/34.121791.

[3] J. Bedkowski, A. Maslowski, G. de Cubber, Real time 3D localization and mapping for USAR robotic application, Ind. Robot. 39 (5) (2012) 464–474.

[4] T. Rozen, K. Boryczko, W. Alda, GPU bucket sort algorithm with applications to nearest-neighbour search, WSCG 16 (1–3) (2008) 161–167.

[5] D. Qiu, S. May, A. Nüchter, GPU-accelerated nearest neighbor search for 3D registration, Proceedings of the 7th International Conference on Computer Vision Systems, ICVS09, Springer-Verlag, Berlin, Heidelberg, 2009, pp. 194–203.

[6] A. Lorusso, D. Eggert, R. Fisher, A comparison of four algorithms for estimating 3-D rigid transformations, Proceedings of the 1995 British Conference on Machine Vision (BMVC95), Birmingham, vol. 1, BMVA Press, Guilford, 1995, pp. 237–246.

[7] K.S. Arun, T.S. Huang, S.D. Blostein, Least-squares fitting of two 3-D point sets, IEEE Trans. Pattern Anal. Mach. Intell. 9 (5) (1987) 698–700, http://dx.doi.org/10.1109/TPAMI.1987.4767965.

[8] B.K.P. Horn, H. Hilden, S. Negahdaripour, Closed-form solution of absolute orientation using orthonormal matrices, J. Opt. Soc. Am. A 5 (7) (1988) 1127–1135.

[9] B.K.P. Horn, Closed-form solution of absolute orientation using unit quaternions, J. Opt. Soc. Am. 4 (4) (1987) 629–642.

[10] M.W. Walker, L. Shao, R.A. Volz, Estimating 3-D location parameters using dual number quaternions, CVGIP: Image Underst. 54 (3) (1991) 358–367, http://dx.doi.org/10.1016/1049-9660(91)90036-O.

[11] A. Nüchter, J. Elseberg, P. Schneider, D. Paulus, Study of parameterizations for the rigid body transformations of the scan registration problem, Comput. Vis. Image Underst. 114 (8) (2010) 963–980, http://dx.doi.org/10.1016/j.cviu.2010.03.007.

[12] H. Pottmann, S. Leopoldseder, M. Hofer, Simultaneous registration of multiple views of a 3D object, ISPRS Arch. 34 (3A) (2002) 265–270.

[13] M. Hofer, H. Pottmann, Orientierung von laserscanner-punktwolken, Vermessung Geoinf. 91 (2003) 297–306.

[14] J. Bedkowski, Intelligent mobile assistant for spatial design support, Autom. Constr. 32 (2013) 177–186, http://dx.doi.org/10.1016/j.autcon.2012.09.009 (URL http://www.sciencedirect.com/science/article/pii/S0926580512001586).

[15] S.-Y. Park, S.-I. Choi, J. Kim, J. Chae, Real-time 3D registration using GPU, Mach. Vis. Appl. (2010) 1–1410, http://dx.doi.org/10.1007/s00138-010-0282-z.

[16] A. Segal, D. Haehnel, S. Thrun, Generalized-ICP, Proceedings of Robotics: Science and Systems, Seattle, USA, 2009, pp. 1–8.

[17] A. Nüchter, K. Lingemann, J. Hertzberg, Cached k–d tree search for ICP algorithms, Proceedings of the Sixth International Conference on 3-D Digital Imaging and Modeling, IEEE Computer Society, Washington, DC, USA, 2007, pp. 419–426, http://dx.doi.org/10.1109/3DIM.2007.15.

[18] A. Nüchter, J. Hertzberg, Towards semantic maps for mobile robots, Robot. Auton. Syst. 56 (11) (2008) 915–926, http://dx.doi.org/10.1016/j.robot.2008.08.001.

[19] R.B. Rusu, Z.C. Marton, N. Blodow, M. Beetz, Learning informative point classes for the acquisition of object model maps, Proceedings of the 10th International Conference on Control, Automation, Robotics and Vision (ICARCV), Hanoi, Vietnam, 2008, pp. 1–8, (URL http://files.rbrusu.com/publications/Rusu08ICARCV.pdf).