A_Parallel_Dynamic_Convex_Hull_Algorithm_based_on_M2M_model

A Parallel Dynamic Convex Hull Algorithm based on the M2M model originated from thinking pattern of human beings

Author(s) Name(s)Author Affiliation(s)

Abstract.

In this paper, we analyze how humans solve problems quickly, using the convex hull problem as an example. We then present the M2M (Macro to Micro) data structure, which maintains a finite set of n points in the plane under insertion and deletion of points in amortized O(1) time per operation and O(n) space. Because the insertion of each point is independent of the others, the algorithm is highly parallel. Because insertions do not unbalance the tree structure, the M2M data structure is dynamic. Moreover, it can be shared by all algorithms based on the M2M model, which greatly improves efficiency when several algorithms work simultaneously, as in image processing and pattern recognition. In all, the M2M model points out a general pattern for designing highly parallel algorithms, an efficient strategy for solving multi-operational problems, and a new approach for computers to simulate the thinking pattern of human beings.

Key Words: Convex Hull, M2M

1. Introduction

Through long-term evolution and development, nature exhibits many fine structures and forms of intelligence, which has motivated humans to solve many practical problems through bionics. As the most outstanding species in nature, humans have a special thinking pattern and problem-solving system which is worthy of study.

Artificial intelligence (AI) is a subject that simulates the human thinking pattern. AI does well in computer games, knowledge discovery, logic reasoning, etc. In some fields, however, such as pattern recognition and machine learning, the performance of computers is still inferior to that of human beings.

For example, using the Quickhull algorithm, a PC takes a couple of minutes to identify the convex hull of a picture such as the following, which contains 10 million points randomly generated from a Gaussian distribution (Fig. 1). However, a person can solve this problem in less than one minute. Perhaps we can draw inspiration from the thinking pattern of human beings to improve the computer's performance.

Figure 1. A typical convex hull problem

The procedure by which a human solves the convex hull problem can be divided into two main stages: an unconscious stage and a conscious stage.

At the first stage, the key process is to transform light signals into nerve signals, which is accomplished in the retina. The retina has 120 million rods and 7 million cones. They work in parallel, transforming light signals into nerve signals and transmitting them to the cerebral cortex through a high-bandwidth nerve tract. Since this process is highly parallel, it can be accomplished quickly. Through this process, the brain constructs the graph of the point set. It constructs not just one graph but a series of graphs representing the same point set, from fuzzy to clear (Fig. 2). This series of graphs of different granularities is constructed by adjusting the lens of the eye.

Figure 2. The forming process of an image

At the second stage, human beings start to solve the convex hull problem quickly, based on the hierarchical structure formed before. In this process, we analyze the macro graph (coarsest-grained) first instead of the micro graph (finest-grained). In this way, we can exclude those points which are obviously not on the margin of the convex hull, reducing the scope of the problem. Then, we try to find the convex hull in a clearer and finer-grained graph. Gradually, the scope of the problem becomes smaller and smaller, until the granularity reaches a proper level where we can carefully find all the hull points of the convex hull.

In brief, human beings first construct graphs of different granularities in a highly parallel manner, and then reduce the scope of the problem from the macro view to the micro view to achieve fast processing. The Macro-to-Micro (M2M) model is an algorithmic model that simulates this mode of thinking and cognition.

Based on the M2M model, we can develop a series of algorithms. They share a highly parallel preprocessing step, which corresponds to the transformation of light signals into nerve signals in a large number of photosensitive cells. The preprocessing can therefore be accelerated by a parallel computing device such as a GPU. The hierarchical data structure constructed by the preprocessing corresponds to the innate ability of human beings to handle problems from a macro view. Because the preprocessing is shared, the model can handle multiple tasks as well.

The M2M convex hull algorithm (M2M-CH) is one of the M2M algorithm series. Its dynamic data structure is based on the M2M model, which was proposed in [19]. The preprocessing of M2M-CH takes O(n) time to construct the data structure and supports parallel computation. Based on this data structure, the query procedure shrinks the search space from the coarse-grained level to the fine-grained level and finally obtains the convex hull in a considerably small search space. Additionally, the M2M model has many desirable properties, such as a dynamic structure, preprocessing sharing, and a trade-off between time cost and accuracy. With these properties, M2M-CH can be applied to various applications and solve problems efficiently.

2. Related Works

Computing the convex hull of a given point set is one of the most fundamental problems in computational geometry, and it is applied in many fields, such as pattern recognition, image processing, statistics and GIS. It is also a key component of many other computational geometry algorithms.

Ron Graham presented the first algorithm to compute the convex hull of points in the plane with O(nlogn) complexity in 1972 [1]. If the points are already sorted by one of the coordinates or by an angle to a fixed vector, then the algorithm takes O(n) time. Another solution with O(nlogn) complexity is the divide and conquer algorithm for the convex hull, published in 1977 by Preparata and Hong [2]. This algorithm is also applicable to three dimensions. Later, Avis [4] and Yao [5] proved lower bounds of Ω(nlogn) to find a convex hull.

R. A. Jarvis constructed an "output sensitive" algorithm whose running time depends on the output size [3]. Jarvis's algorithm runs in O(nh) time, where h is the number of vertices of the convex hull. In 1986, Kirkpatrick and Seidel [6] computed the convex hull of a set of n points in the plane in O(nlogh) time. (Later, the same result was obtained by Chan using a much simpler algorithm [7].) They also showed that, on algebraic decision trees of any fixed order, Ω(nlogh) is a lower bound for computing convex hulls of sets of n points, where h is the number of vertices of the convex hull.

Starting with seminal work by Clarkson, randomized algorithms have played an increasingly important role in computational geometry, and many randomized convex hull algorithms have been proposed [8, 9, 10].

Dynamic convex hull algorithms are widely studied [11-17], because in practice we often need to compute the convex hull of a point set that changes on a small scale.

In order to reduce the time complexity, some researchers focused on the techniques of designing fast parallel algorithms for convex hulls [18].

3. The Outline of the M2M Model

When used for solving problems, the M2M model first works from a macro perspective to remove unnecessary factors, and then focuses on a smaller scale of the problem. Generally speaking, the goal of the macro-to-micro process is to shrink the search space until it is sufficiently small. This idea is inherent in many algorithms related to “Decrease-and-Conquer”.

Generally, an M2M algorithm includes the following two procedures:

Preprocessing

The preprocessing procedure constructs the hierarchical data structure of the M2M algorithm from the fine-grained level to the coarse-grained level. In each level, the data set is divided into a number of similar partitions.

Query

The query procedure shrinks the search space at each level, from the coarse-grained level to the fine-grained level. Finally, the solution can be obtained quickly at the finest-grained level.

4. Terminology

4.1 Terminology of the M2M Model

Level

The hierarchical data structure consists of levels of different granularity. A coarser-grained level presents abstracted data classification in macro view, while a finer-grained level presents detailed data classification in micro view.

Part

A part is defined as a subset of the data points. At each level, the original point set is divided into squares of the same size. All the data points in such a square belong to the same part.

Figure 3. Terminology Explanation
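As an illustration of the terms Level and Part, the following Python sketch maps a point to the square part that contains it and groups a point set into parts at several levels. The function names, the grid origin at (0, 0), and the choice that each coarser level doubles the side length of a part are assumptions made for this example; the definitions above only require that the parts of one level are squares of the same size.

```python
from collections import defaultdict


def part_index(point, side):
    """Return the (column, row) index of the square part of the given
    side length that contains the point (grid origin assumed at (0, 0))."""
    x, y = point
    return (int(x // side), int(y // side))


def part_center(index, side):
    """Return the center point of the part with the given index."""
    col, row = index
    return ((col + 0.5) * side, (row + 0.5) * side)


def build_levels(points, finest_side, depth):
    """Group the points into parts at `depth` levels.

    The first entry is the finest-grained level (level 1); every coarser
    level doubles the side length of a part (an assumption of this sketch).
    Only parts that contain at least one point are stored.
    """
    levels = []
    side = finest_side
    for _ in range(depth):
        parts = defaultdict(list)
        for p in points:
            parts[part_index(p, side)].append(p)
        levels.append((side, dict(parts)))
        side *= 2
    return levels


points = [(0.5, 0.5), (1.5, 0.2), (3.1, 3.9), (2.2, 2.8)]
levels = build_levels(points, finest_side=1.0, depth=3)
for level_number, (side, parts) in enumerate(levels, start=1):
    print(level_number, side, sorted(parts))
```

In this sketch every point appears in exactly one part per level, and the four points collapse into a single part at the coarsest of the three levels.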

4.2 Terminology of M2M-CH

Center Hull

The convex hull of all the center points of parts that contain at least one point.

Representative Point

An arbitrarily designated point in a part whose center point is a vertex of the center hull.

Representative Hull

The convex hull of all the representative points.

Figure 4. Terminology Explanation

5. The Convex Hull Algorithm Based on the M2M Model

There are two procedures in M2M-CH: preprocessing and querying. Preprocessing constructs the hierarchical data structure. Assume the original point set is at level 1. At level k, the graph is divided into square parts of the same size: Part_1(k), Part_2(k), …, Part_n(k). Each part corresponds to one subset of the original point set. All the points in Part_i(k) are abstracted to the center point of Part_j(k+1) when Part_i(k) is a child part of Part_j(k+1), which is in the next coarser-grained level (Fig. 3). If none of the child parts of Part_i(k) contains a point, Part_i(k) contains no point either. A preset parameter, the Density of Part_i(1), denotes the maximum number of points in a part at level 1. It determines the total number of levels of the hierarchy, because the total number of points in Part_i(1) cannot exceed the Density of Part_i(1). For example, if the number of original points is 12 and the Density of Part_i(1) is 3, the number of parts at level 1 is 4, since 12/4 = 3. But if the number of original points is 13 and the Density of Part_i(1) is still 3, the number of parts at level 1 will be 9, because 13/4 > 3.
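The arithmetic of this example can be reproduced with a short Python sketch. It assumes, as the 12-point and 13-point cases suggest, that level 1 is an m × m grid and that m is the smallest integer for which the number of points divided by the number of parts does not exceed the Density of Part_i(1). The function name and this reading of the rule are assumptions, not part of the original description.

```python
def parts_at_level_one(num_points, density):
    """Smallest square number of parts m * m such that
    num_points / (m * m) <= density (assumed reading of the rule)."""
    m = 1
    while num_points > density * m * m:
        m += 1
    return m * m


print(parts_at_level_one(12, 3))  # 12 / 4 = 3 does not exceed the density
print(parts_at_level_one(13, 3))  # 13 / 4 > 3, so the grid grows to 3 x 3
```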

Table 1: M2M-CH procedure

M2M-CH
Input:  V    // the original point set
        M    // M2M data structure built on the original point set
Output: the convex hull of V

M2M-CH(V, M)
    S ← V                    // S is the point set under consideration
    i ← 1                    // current exploring level
    While (i ≠ M.Depth)
        do C ← Extract the center points from S at level i
           P ← QuickHull(C)  // compute the center hull
           R ← Extract the representative hull from P
           S ← the point set belonging to the parts which intersect with R
           i ← i + 1
    Return QuickHull(S)

After preprocessing the original point set (Fig. 5(a)), M2M-CH works as shown in Table 1. At each level, M2M-CH preserves the parts which intersect with the representative hull as the search space for the next finer-grained level. This step is repeated until the finest-grained level is reached. Then M2M-CH computes the resulting convex hull in a considerably small search space. The correctness of M2M-CH will be proved in the next section. An example of the query procedure is shown in Fig. 5.

Since M2M-CH shrinks the search space based on large parts at the coarser-grained levels (Figure xx), it can quickly exclude a large region that is obviously inside the convex hull. It is still necessary to go through the finer-grained levels, because the parts at the coarser-grained levels are too large for precise operation.

At the finer-grained levels, M2M-CH shrinks the search space based on small parts, so that it can precisely exclude points which are not vertices of the convex hull.

In summary, M2M-CH emphasizes fast processing at the coarse-grained levels and an accurate solution at the fine-grained levels. It gradually shrinks the search space level by level until it can finally obtain the convex hull within a considerably small search space by a traditional algorithm.
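The query procedure can be sketched in Python as follows. This is a minimal illustration under several assumptions rather than the implementation evaluated in this paper: Andrew's monotone chain is used in place of QuickHull, the M2M data structure is replaced by a list of part side lengths ordered from coarsest to finest (the name sides is hypothetical), and the first point stored in a part serves as its representative point. The pruning rule is a conservative reading of "parts which intersect with R": a part is discarded only when all four of its corners lie strictly inside the representative hull, so every part that touches the boundary of the hull is kept.

```python
import random
from collections import defaultdict


def cross(o, a, b):
    """Cross product of vectors OA and OB; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def strictly_inside(polygon, q):
    """True if q lies strictly inside a counter-clockwise convex polygon."""
    n = len(polygon)
    return all(cross(polygon[i], polygon[(i + 1) % n], q) > 0 for i in range(n))


def m2m_ch(points, sides):
    """Shrink the search space level by level, then compute the hull."""
    search = list(points)
    for side in sides:  # coarsest level first
        parts = defaultdict(list)
        for p in search:
            parts[(int(p[0] // side), int(p[1] // side))].append(p)
        # Center hull: hull of the centers of all non-empty parts.
        centers = {((i + 0.5) * side, (j + 0.5) * side): (i, j) for (i, j) in parts}
        center_hull = convex_hull(list(centers))
        # Representative hull: hull of one point per part on the center hull.
        rep_hull = convex_hull([parts[centers[c]][0] for c in center_hull])
        if len(rep_hull) < 3:
            continue  # degenerate hull, nothing can be excluded at this level
        search = []
        for (i, j), members in parts.items():
            corners = [(i * side, j * side), ((i + 1) * side, j * side),
                       ((i + 1) * side, (j + 1) * side), (i * side, (j + 1) * side)]
            if not all(strictly_inside(rep_hull, c) for c in corners):
                search.extend(members)  # part is not strictly inside: keep it
    return convex_hull(search)


random.seed(1)
pts = [(round(random.gauss(0, 1000)), round(random.gauss(0, 1000))) for _ in range(2000)]
hull = m2m_ch(pts, sides=[1024, 256, 64])
print(len(hull), "hull vertices")
```

Integer coordinates are used in the usage example so that all cross products are exact. Because a discarded part lies strictly inside a convex polygon whose vertices are input points, none of its points can be a hull vertex, which is why the result matches a direct hull computation on the whole point set.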

Figure 5. The query procedure of M2M-CH: (a) the original point set; (b) the center points in the top level; (c) the center hull of the top level; (d) the representative hull of the top level; (e) the parts that intersect with the representative hull in the top level; (f) the search space preserved for the second level; (g) the center points in the second level; (h) the center hull of the second level; (i) the representative hull of the second level; (j) the parts that intersect with the representative hull in the second level; (k) the search space preserved for the bottom level; (l) the final convex hull.

6. Proof of the Correctness of M2M-CH

To prove the correctness of M2M-CH, we introduce the following lemmas first:

Lemma 1

At any level, a part whose center point is outside of the center hull contains no point.

Proof

Consider a part at level k, namely Part_i(k), whose center point is p_i. If Part_i(k) contains a point and p_i is outside of the center hull, this contradicts the definition of the convex hull, which requires all the points, including p_i, to be inside the convex hull. Hence, Part_i(k) contains no point. Lemma 1 is proved.

Lemma 2

At any level, a part which is completely inside of the representative hull contains no hull point.

Proof

Consider a part at level k, namely Part_i(k), which has a point p_i. If p_i is a hull point and Part_i(k) is completely inside of the representative hull, then there must be at least one point q_i outside of the resulting convex hull, where q_i is a vertex of the representative hull. This contradicts the definition of the convex hull. Hence, Part_i(k) contains no hull point. Lemma 2 is proved.

Lemma 3

At any level, a part whose center point is inside of the center hull and outside of the representative hull has an intersection with the representative hull.

Proof

Figure 6. Proof of Lemma 3

Consider a part at level k, namely Part_i(k), whose center point is p_i, where p_i is inside of the center hull and outside of the representative hull (Figure xx). We define the margin representative hull as the representative hull with the smallest size, and let the side length of Part_i(k) be 1. Let d denote the distance between the corresponding lines of the center hull and the representative hull. Figure xx clearly shows that the maximum value of d is $\sqrt{2}/2$, smaller than the side length of Part_i(k). Hence, Part_i(k) must have an intersection with the representative hull. Lemma 3 is proved.

Lemma 4

At any level, a part which contains hull points has an intersection with the representative hull.

Proof

Consider a part at level k, namely Part_i(k), whose center point is p_i. Suppose Part_i(k) contains hull points and has no intersection with the representative hull. Then it must be either completely inside or completely outside of the representative hull. If it is completely inside of the representative hull, then according to Lemma 2 it contains no hull point. If it is completely outside of the representative hull, then p_i is outside of the center hull, and according to Lemma 1 it contains no hull point. We conclude that Part_i(k) has no hull point if it has no intersection with the representative hull, which contradicts the hypothesis. Hence, Lemma 4 is proved.

Now, we prove the correctness of M2M-CH using the following loop invariant.

Proof

Initialization

The query procedure begins from the coarsest-grained level. The search space in this level includes all the points of the original point set. Hence, at initialization, all hull points are included.

Maintenance

M2M-CH preserves all the parts which have an intersection with the representative hull, while excluding the other parts from the search space. According to Lemma 4, the search space preserved for the next finer-grained level contains all the hull points. This guarantees that no hull point is excluded while the search space shrinks.

Termination

Since all hull points are in the search space of the finest-grained level according to the loop invariant established in the maintenance step, the inner algorithm generates the correct convex hull. This completes the proof.

7. Proof of the Time Complexity of the Preprocessing

Definitions and Notation

Suppose the number of levels in the M2M model is K; then the height of the tree structure is also K.

Suppose the number of divisions in each level is d; then the maximum number of branches in each level of the tree structure is also d.

Main Theorem

The preprocessing consists of an index initialization and a series of insertions. First, the indices of parts in each level are maintained by a 2-dimensional array, which is initialized to null. Second, the cost of each insertion may differ. Each insertion adds a point in the map to the corresponding parts based on its location, and builds the required parts if they have not been built before. Note that each part-building operation is associated with a point-adding operation, so that each part has at least one point in it. Each inserted point falls within exactly one leaf node (bottom level) of the tree structure.

Proof

Generally, the preprocessing cost consists of those from part-building phase, point-adding phase and index initialization phase, therefore, the time complexity of preprocessing is,

(1),

where is the total time cost of the part-building phase. The time complexity in the worst case is following, which indicates that each parent node has d children in the tree,

(2),

where is the total time cost of adding phase, which consists of two types of operations. One is the point-adding operations occurring in the non-leaf nodes (non-bottom levels) which are associated with the part-building operations; the other one is the point-adding operations occurring in the tree leaves (bottom level), which may be or not associated with the part-building operations,

(3),

where is the total time cost of the index initialization phase, which depends on the number of all possible parts, therefore, the cost is equal to that of the worst case in the part-building phase.

(4).

In general, the time complexity T is

. In our experiments, we define the relation between K (the number of levels) and d (the number of divisions) with following equation,

(5),

so that,

(6).
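One way to write out the cost breakdown described in this proof, offered as a sketch under stated assumptions rather than as the numbered equations of this section, is the following. The symbols $T_b$, $T_a$ and $T_i$ (part-building, point-adding and index initialization costs) are introduced here for illustration. The sketch assumes that the coarsest level has at most $d$ parts, that every part has at most $d$ child parts with $d \ge 2$, and that adding a point to a part and building a part each take constant time.

```latex
\begin{align*}
T   &= T_b + T_a + T_i, \\
T_b &= O\Bigl(\sum_{k=1}^{K} d^{k}\Bigr) = O(d^{K})
      && \text{(every parent has $d$ children)}, \\
T_a &= O\Bigl(\sum_{k=1}^{K-1} d^{k}\Bigr) + O(n) = O(d^{K} + n)
      && \text{(non-leaf additions plus $n$ leaf additions)}, \\
T_i &= O(d^{K})
      && \text{(one index entry per possible part)}, \\
T   &= O(d^{K} + n).
\end{align*}
```

Under these assumptions, any choice of K and d for which $d^{K} = O(n)$ gives a total preprocessing cost of $O(n)$.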

That is, with equation (6), the time complexity of the preprocessing T is O(n).

8. Experiment

In the following experiments, we use some classical convex hull algorithms (Graham scan [1], quick hull [2] and Jarvis march [3]) as benchmark algorithms to analyze the performance of M2M-CH. We compare the efficiency of these algorithms on the same point sets.

In the experiments, planar points are generated randomly from different types of distribution, namely the uniform distribution, Gaussian distribution, Laplace distribution, cluster Gaussian distribution and cluster Laplace distribution. For each distribution, M2M-CH runs with two different values of the Density of Part_i(1), namely 300 and 30000. The scale of the point set ranges from 0.1 million to 2 million points.

Figure 7 shows that when computing the convex hull of a point set whose scale is small, Graham's scan has the best performance. As the scale of the point set increases, M2M-CH outperforms the other classical algorithms. Figure 8 shows the relationship between the time cost of the query procedure of M2M-CH and the scale of the point set.

The time complexity of the preprocessing of M2M-CH is O(n), because the preprocessing consists of n insertion operations and the amortized time cost of each insertion operation is O(1). The time complexity of the query procedure is hard to estimate, because it depends on the distribution of the point set. The worst situation is that all the points are vertices of the convex hull. In this situation, the time complexity of M2M-CH is O(nlogn), the same as that of the traditional algorithms. However, this pathological case seldom occurs in practice. In most cases, M2M-CH excludes a large number of points inside the convex hull level by level, which greatly reduces the problem scale. Thus, the time cost of the query procedure is much smaller than that of the preprocessing.

Table 2: Abbreviations

Abbreviation    Denotation
M2M-CH          The M2M convex hull algorithm, including preprocessing and querying.
M2M-PP          The preprocessing process of M2M-CH.
GSCH            Graham scan algorithm.
QHCH            Quick hull algorithm.
JMCH            Jarvis's algorithm.

Figure 7. Comparison among convex hull algorithms

Figure 8. The time cost of the query procedure

Figure 9. The comparison among convex hull algorithms with different distributions (columns: Density 300 and 30000; rows: uniform, Gaussian, Laplace, cluster Gaussian and cluster Laplace distributions)

9. The Advantages of Convex Hull Algorithm of the M2M Model

Compared to conventional convex hull algorithms, M2M-CH has the following advantages:

High parallelism

Although the preprocessing of M2M-CH occupies a great proportion of the total time, it can be run on a parallel computing device, because operations on different points are independent. Hence, M2M-CH has great potential to reduce the overall time cost.

Dynamic structure

An operation on the M2M data structure, such as an insertion or a deletion, can be finished in O(1) time in most cases and in O(logn) time in the worst case. Hence, there is no need to reconstruct the hierarchical structure when the original point set changes slightly; it suffices to update the information of the new points and the parts they belong to.

Preprocessing sharing

Preprocessing sharing is very helpful in the image processing field, where many operations need to be executed on the same image. For instance, in face recognition, we may need to compute the convex hull in different regions of the same graph. In this case, M2M-CH constructs the hierarchical structure only once, and can then compute the convex hull of different regions quickly by sharing the preprocessing.

Trade-off between time cost and accuracy

M2M-CH can make a trade-off between the efficiency and the accuracy of the solution by outputting the representative hull of level k (k>1) instead of the convex hull of the finest-grained level. In this case, the time cost is lower, while the result is only approximately correct. The smaller the value of k, the more precise the resulting convex hull is. In this way, M2M-CH can be applied to various applications with different demands on time cost and accuracy.

Extension to Three Dimensions

Since the M2M model is based on regular parts, M2M-CH can be extended to three dimensions by defining a part as a cube.
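The dynamic-structure property can be illustrated with a minimal Python sketch. It keeps, for every level, a map from part index to the number of points in that part, so that an insertion or deletion touches one part per level and never rebalances anything. The class name M2MIndex, the doubling of the part side length between levels, and the use of hash maps instead of a 2-dimensional index array are assumptions made for this example.

```python
from collections import defaultdict


class M2MIndex:
    """Per-level point counts for square parts; level 1 is the finest."""

    def __init__(self, finest_side, depth):
        # Side length doubles at each coarser level (assumption of this sketch).
        self.sides = [finest_side * (2 ** k) for k in range(depth)]
        self.counts = [defaultdict(int) for _ in range(depth)]
        self.leaves = defaultdict(set)  # points stored in the finest parts

    @staticmethod
    def _index(point, side):
        return (int(point[0] // side), int(point[1] // side))

    def insert(self, point):
        """Add a point; parts are created on demand. Returns False if present."""
        leaf = self._index(point, self.sides[0])
        if point in self.leaves[leaf]:
            return False
        self.leaves[leaf].add(point)
        for side, level in zip(self.sides, self.counts):
            level[self._index(point, side)] += 1
        return True

    def delete(self, point):
        """Remove a point; parts that become empty are dropped."""
        leaf = self._index(point, self.sides[0])
        if point not in self.leaves.get(leaf, ()):
            return False
        self.leaves[leaf].discard(point)
        if not self.leaves[leaf]:
            del self.leaves[leaf]
        for side, level in zip(self.sides, self.counts):
            idx = self._index(point, side)
            level[idx] -= 1
            if level[idx] == 0:
                del level[idx]
        return True

    def non_empty_parts(self, level):
        """Indices of the parts at the given level that contain a point."""
        return set(self.counts[level - 1])


index = M2MIndex(finest_side=1.0, depth=3)
for p in [(0.5, 0.5), (1.5, 0.5), (5.0, 5.0)]:
    index.insert(p)
index.delete((5.0, 5.0))
print(index.non_empty_parts(3))
```

In this sketch each update costs time proportional to the number of levels, which is constant once the depth is fixed; no other part of the structure is touched.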

10. Conclusion

From the experimental results, we can conclude that the efficiency of M2M-CH is better than that of classical algorithms when the scale of the point set is comparatively large. More importantly, the M2M data structure is dynamic and can be constructed in parallel. The amortized time complexity of its update operations is optimal.

Based on the M2M model, M2M-CH has many desirable characteristics, such as high parallelism, a dynamic structure, preprocessing sharing and a trade-off between time cost and accuracy. These characteristics play a significant role in various applications. The M2M model introduces a general model for designing highly parallel algorithms, an efficient strategy for handling multi-operational problems, and a new approach to simulating the thinking pattern of human beings. With the help of the M2M model, computers can solve problems with a wider, holistic view, as humans do.

References

[1] R. L. Graham, An efficient algorithm for determining the convex hull of a finite planar set, Information Processing Letters 1 (1972) 132–133.
[2] F. P. Preparata, S. J. Hong, Convex hulls of finite point sets in two and three dimensions, Communications of the ACM 20 (2) (1977) 87–93.
[3] R. A. Jarvis, On the identification of the convex hull of a finite set of points in the plane, Information Processing Letters 2 (1973) 18–21.
[4] D. Avis, Comments on a lower bound for convex hull determination, Information Processing Letters 11 (1980) 126.
[5] A. C. Yao, A lower bound to finding convex hulls, Journal of the ACM 28 (1981) 780–787.
[6] D. G. Kirkpatrick, R. Seidel, The ultimate planar convex hull algorithm, SIAM Journal on Computing 15 (1) (1986) 287–299.
[7] T. M. Chan, Optimal output-sensitive convex hull algorithms in two and three dimensions, Discrete Comput. Geom., 1996. Eleventh Annual Symposium on Computational Geometry.
[8] K. L. Clarkson, New applications of random sampling in computational geometry, Discrete Comput. Geom. 2 (1987) 195–222.
[9] K. L. Clarkson, A randomized algorithm for closest-point queries, SIAM J. Comput. 17 (1988) 830–847.
[10] R. Wenger, Randomized quick hull, Algorithmica 17 (1997) 322–329.
[11] M. H. Overmars, J. van Leeuwen, Maintenance of configurations in the plane, J. Comput. System Sci. 23 (2) (1981) 166–204.
[12] J. Hershberger, S. Suri, Applications of a semi-dynamic convex hull algorithm, BIT 32 (2) (1992) 249–267.
[13] J. Hershberger, S. Suri, Off-line maintenance of planar configurations, J. Algorithms 21 (3) (1996) 453–475.
[14] T. M. Chan, Dynamic planar convex hull operations in near-logarithmic amortized time, Journal of the ACM 48 (1) (2001) 1–12.
[15] Gerth Stolting Brodal, Riko Jacob, Dynamic planar convex hull, 43rd Annual IEEE, 2002.
[16] Maarten Löffler, Marc van Kreveld, Largest and smallest convex hulls for imprecise points, OpenAccess, 2008.
[17] Rong Liu, Hao Zhang, James Busby, Convex hull covering of polygonal scenes for accurate collision detection in games, Graphics Interface, 2008.
[18] Neelima Gupta, Sandeep Sen, Faster output-sensitive parallel algorithms for 3D convex hulls and vector maxima, Journal of Parallel and Distributed Computation, 2003.
[19] YingPeng Zhang, ZhiZhuo Zhang, Qiong Chen, A new nearest neighbour searching algorithm based on M2M model, The International MultiConference of Engineers and Computer Scientists 2007, 2007.
