
Algorithmic and Visual Analysis of Spatiotemporal Stops in Movement Data

Peter Bak*, Eli Packer, Harold J. Ship, Dolev Dotan
IBM Research
Haifa, Israel

{peter.bak,elip,harold,dotan}@il.ibm.com

ABSTRACT
The extensive use of geographic positioning devices leads to the generation of large amounts of movement data, collected and stored in digital repositories. Such collections of movement data enable domain experts and scientists to analyze and discover interesting movement patterns. Analyzing the occurrence of stops (stops refer to full halts or very slow speed) in transportation systems is an important challenge in movement data analysis. This analysis can be used to better understand traffic congestion problems and find corresponding solutions. We propose an efficient system to analyze stop occurrences. It consists of two major parts that we describe in this paper. The first one deals with our algorithmic solutions. We propose an efficient clustering algorithm to partition the stops into groups. The main goal is to detect strongly connected components of stops, where two stops belong to the same component if they are close enough. The idea is that each component will be an isolated analysis element, providing an easy and useful means for investigation. The second part of our system deals with visualization. We visualize the clusters obtained with our algorithm with the polar area diagram famously introduced by Florence Nightingale, giving the users an easy and efficient way to detect important characteristics such as daily stop times and volumes. We instantiate our system and prove its usefulness using a real-world dataset involving urban public transportation.

Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous; D.2.8 [Spatiotemporal Analysis]: Metrics - Movement Analysis, Interactive Visualization

General Terms
Technique

Keywords
Spatiotemporal, Movement Analysis, Interactive Visualization

* Please treat as confidential until final publication; if declined for publication, please return to the authors as confidential information.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00.

1. INTRODUCTION
With the widespread adoption of location-aware technologies, such as global positioning systems (GPS), radio-frequency identification (RFID), smart phones, and wireless sensor networks, huge collections of data about moving objects are captured and stored in digital repositories. These objects may represent anything, including people, pieces of equipment, shipping containers, buses, and airplanes. The movement of objects is captured by recording their locations at consecutive moments in time. The collection of such spatiotemporal data enables domain experts and researchers to analyze the data and discover valuable behavioral patterns, but also poses a significant challenge to researchers seeking to reveal the spatiotemporal nature of the domain.

One fundamental class of patterns in movement analysis deals with stops that interrupt motion. In many application domains, locating and analyzing the places where objects stop could lead to interesting findings. In a traffic analysis application, for example, the locations, duration, and times of traffic jams could all be analyzed. Comparing the planned versus the actual stops at stations, or even detecting the locations of exceptionally long and repeated stops, could help optimize the service provided by public and other means of transportation.

In applications for traffic analysis, as well as in other movement analysis applications, the definition of a stop often goes beyond that of a full stop (a period of zero velocity) and includes extreme slowdowns, such as traffic jams. Thus, we define a stop to be any part of a trajectory where the velocity is lower than some user-defined threshold, which may be specific to the application, to the object being explored, or even to the user's preference.

In this paper we introduce techniques that play significant roles in a tool we are developing for spatiotemporal visual analysis (see Section 4.1). Our tool has two main goals. The first is to provide efficient, robust and scalable algorithms to analyze spatiotemporal data. Ultimately, the generated results will constitute valuable information that can be investigated by the domain expert. The second goal is to provide interactive and advanced visualization methods that will further help the domain expert to make decisions and detect interesting and important movement patterns. The work described in this paper is concerned with both goals. We believe that combining efficient spatiotemporal algorithms with useful visualization techniques provides a vital core for spatiotemporal tools aimed at domain experts. In this work we explore the behavior of public vehicles that stop or drive slowly enough in urban areas. We provide an efficient, robust, scalable and easy to implement algorithm for constructing useful information from the data and also propose a practical visualization technique to present the results in a clear and simple way.

The goals of our algorithm are to identify and cluster stops. The underlying algorithm has two input parameters: a speed threshold for detecting stops and a spatial distance for clustering stops (see Section 3 for the relationship between the distance and the clusters). Both can be interactively altered by users. Once input is provided, the algorithm is fully automatic. It clusters stops into stop groups and within each group identifies vehicle sub-trajectories that represent consecutive sequences of very slow movement.

Using the results of our algorithm, we applied a visualization technique that makes the properties of stops accessible to domain experts and allows the interpretation of stop patterns for optimization purposes. The novelty here is the use of this visualization scheme in spatiotemporal data visualization systems. The resulting visualization, inspired by the flower-like diagrams of Florence Nightingale, is based on a novel mapping of data attributes to visual properties. The group information then serves as a basis for further visualizing and exploring the data. We use this visualization technique and show how it helps infer important stop patterns.

The rest of the paper is organized as follows: we first review existing literature in the domain and the related techniques. In Section 3, we describe our proposed algorithm and the resulting visualization for the stops. In Section 4 we instantiate our technique using a real-world dataset. Section 5 is devoted to the experimentation with our algorithms and visualization techniques. Finally, we summarize our work and propose new research directions in Section 6.

2. RELATED WORK
In recent years, much work has been conducted in the area of movement analysis. This work focused on detecting patterns, analyzing the data, and better understanding complicated dynamic vehicle systems for optimization purposes. It is evident that many related research directions involve local relations of movement events, which led to the development of proximity problem solutions. From a completely different research perspective, much work has been conducted in an attempt to understand the anatomy of movement. This research focuses primarily on developing taxonomies, defining possible movement patterns, and developing visualization and analytic techniques for exploring the movement data.

2.1 Related Algorithmic Work
The proximity problem family is closely related to our work as we are interested in finding spatially close stops before we cluster them. Proximity query problems, which ask to report the neighboring elements of a given one, are fundamental in the geometric algorithm domain. Popular solutions for these problems use KD-trees, Voronoi diagrams and Delaunay triangulations. A famous variant is the (k-)nearest neighbor problem, which asks for the (k-)nearest neighbors of a given element. We refer the reader to [29, 28, 8], which are interesting algorithmic and experimental publications for this problem, as well as good summaries of related work. Over the years, variants of proximity problems have touched numerous algorithmic domains. Recently, we have witnessed a lot of effort in using proximity problem variants for the sake of geographical system analysis (see e.g., [31, 27, 32, 20]).

A specific variant of interest to us is the Fixed-Radius Near Neighbors problem (FRNN for short), which is defined as follows. Given a set of points $P$ in $\mathbb{R}^2$, a point $p \in P$, and a fixed distance $r \ge 0$, find all points in $P$ that are $r$-close to $p$. This problem goes back many years: Bentley [7] describes an algorithm based on bucketing the input elements and cites a paper by Levinthal from 1966 that uses this technique (we refer to this algorithm as BUCKET). In what follows, to be aligned with our goals, we will refer to FRNN as a similar problem that queries all input points for near-neighbors, using a fixed $r > 0$ (so basically we ask for all pairs of points which are $r$-close to each other). Under our modified definition, BUCKET would run in expected time $O(n + k)$ (where $n$ is the input size and $k$ is the output size) if one uses hash tables. When using binary trees, BUCKET has an $O(n \log n + k)$ worst-case running time. The main idea of the BUCKET algorithm is to lay a grid of buckets with edge length $r$ and then insert each input point into the bucket in which it is contained. Then, to find close pairs, only points that are located in the same bucket or in neighboring buckets are tested (other pairs clearly cannot be neighbors).
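The bucketing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Bentley's original formulation: the function name `bucket_close_pairs` is hypothetical, "r-close" is taken to mean a strict distance below $r$, the hash-table variant is used, and for simplicity all eight neighboring buckets are scanned rather than only the preceding ones.

```python
from collections import defaultdict
from math import floor, hypot


def bucket_close_pairs(points, r):
    """Return sorted index pairs (i, j), i < j, whose distance is below r.

    Hypothetical helper: points is a list of (x, y) tuples, r > 0.
    """
    # Hash every point into the square bucket of edge length r containing it.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(floor(x / r), floor(y / r))].append(idx)

    pairs = []
    for (cx, cy), members in grid.items():
        # Only the same bucket and its neighbors can hold r-close points.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get avoids creating empty buckets while iterating.
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j and hypot(points[i][0] - points[j][0],
                                           points[i][1] - points[j][1]) < r:
                            pairs.append((i, j))
    return sorted(pairs)
```

The `i < j` guard ensures each pair is reported once even though both of its buckets are visited.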

Clustering techniques have been the focus of different leading mathematical areas for many years (e.g., pattern recognition and statistical analysis). We refer the interested reader to the book by Duda and Hart [15] for a comprehensive study of this area. In our work, we are interested in clustering points in $\mathbb{R}^2$. Here we just note two popular clustering algorithms for point sets: DBSCAN [17] and its extended variant OPTICS [5].

2.2 Movement Analysis
A comprehensive taxonomy of movement data was provided by Dodge et al. [14] and by Andrienko et al. [1], who developed a conceptual framework for tasks and methods involved in the analysis of movement. The former describes movement behavior based on single, multiple, and complex relations among moving objects. The latter focuses on the data and the task, which can be distinguished by the type of information they target and by the level of analysis.

On a more conceptual level, Andrienko et al. [1] describe possible aspects of the relation between moving objects and their environment. These researchers suggest using movement trajectories as possible analytic targets, in that they describe the spatial, temporal, and thematic characteristics over space and time. This work also focuses on the distance relationships among trajectories as a function of these characteristics. Another approach was introduced by Laube [24], who developed the concept of lifelines. Lifelines are identified by individual motion behavior, events of distinct group motion behavior, and by the relation between individuals and groups or the proportion of individuals to groups.

Arguably, stops are among the most interesting movement patterns. Exploring, visually or otherwise, the characteristics of stop areas is useful. For example, these characteristics can include when the stops typically take place in each area, the stop duration at different times of day, and how many vehicles participate in them, as we did in our work. As mentioned above, a stop is often defined as a part of the trajectory in which the velocity is lower than a certain threshold. Palma et al. [26] derive this threshold from the characteristics of the trajectory data, using a quantile function that takes as its input the mean and standard deviation between any two consecutive points in the data, as well as the minimum time for a stop. Yan and Spaccapietra [34] use a dynamic threshold that takes into account attributes regarding the road and the moving objects. Furthermore, they add a density-based method for identifying non-standard stop events, such as a car that circles around a block at normal speed looking for parking. The stops then undergo semantic enrichment, and statistical algorithms are used to extract patterns that may help to understand movement behaviors.

2.3 Visualization of Movement Patterns
The visualization community has contributed in multiple domains relating to the analysis of movement. Currently, animated maps [3, 4] and interactive cubes [23, 22] are widely used to visualize movement data. Map and cube displays are complemented by graphs and diagrams demonstrating various facets of a movement and its context [16, 23, 25, 6].

Several works have suggested visual methods for identifying stops from movement data and analyzing their characteristics. For example, Scheepens et al. [30] introduced a visualization scheme for ship trajectories, enabling the detection of anchoring areas. Andrienko et al. [2] developed a method wherein stop areas are identified by clustering individual stops using density-based algorithms. The clusters are then analyzed visually, using time-series visualizations. This is done by color-mapping their constituent events according to the clusters or according to some characteristic of a cluster, such as direction or average speed.

In summary, the topic of extracting and visualizing patterns in movement analysis has been widely investigated. Less attention has been given to the pattern of stops that interrupt movement. Overall, there is a clear need for a technique that extracts and visualizes stop patterns while preserving the geographic context and highlighting the spatiotemporal nature of stops together with their additional characteristics. Even though some of the existing techniques could probably be adapted to perform certain aspects of this task, we see our contribution as a comprehensive solution to this challenge: combining algorithm and interactive visualization for the exploration of stop patterns in geographic movement data.

3. METHOD
We propose an efficient algorithm for identifying stops from the raw input data of trajectories and grouping them into comparable stop areas. We emphasize that each stop area may contain stops of more than a single trajectory.

Figure 1 shows the steps of the algorithm and the interactive visualization. The user provides two input parameters: the speed threshold ($\alpha$) for the identification of stops and the spatial distance threshold ($\beta$), which identifies pairs of stops that are close enough for clustering purposes. An interactive loop is provided, in which the user can alter these input parameters and re-cluster. Using this framework, users conduct their analyses through the visual layout of the stop areas. Users are therefore able to draw conclusions and gain insights by viewing the results of our clustering algorithm.

[Figure 1 diagram labels: Stop Analysis, Interactive Visualization, Insights]

Figure 1: Pipeline of the overall flow, showing the automatic computation of stop-groups and the interactive visualization for their exploration.

The interim results of the algorithmic flow are represented in Figure 2. Trajectories are presented as a sequence of location signals, from which we extract those that are below the speed threshold and declare them as stops, represented by thick segments. We then cluster them into stop-groups if they are below a distance threshold (marked by rectangles in the figure). Within each group we segment consecutive stops along each trajectory that intersects with the group into stop-segments. To complete this process, the individual stop-groups and stop-segments are visually mapped to a representative glyph and then mapped back to their original locations, as elaborated in Sections 4 and 5.

[Figure 2 panel labels: Trajectory with location events; Stop when speed lower than threshold between events; Stop Groups when distance lower than threshold]

Figure 2: Schematic description of the analytic pipeline's interim results. From left to right: trajectory sequence, stop segments along the trajectory (shown in bold), and the stop-groups generated by the algorithm (rectangles).

3.1 Algorithm
Let $r$ be a trajectory with an ordered sequence of reported events $r_1, r_2, \ldots, r_v$. Each $r_i$ is a tuple $[id_i, x_i, y_i, t_i]$ in which $id_i$ is the trajectory identifier, $x_i$ and $y_i$ are the location in $\mathbb{R}^2$, and $t_i$ is the reported time. Let $n$ be the size of the input and $m$ be the number of identified stops ($m \le n$). Our algorithm proceeds in three consecutive steps as described next.

3.1.1 Identifying the Stops
Given a set of trajectories, $R$, we detect which of the events along each trajectory are stops. A stop is defined as a location where the speed of the vehicle is sufficiently slow. Let $v_i$ be the average speed from $r_i$ to $r_{i+1}$. If $v_i < \alpha$ (recall that $\alpha$ is the speed threshold input parameter), we classify both $r_i$ and $r_{i+1}$ as stops. We also connect $r_i$ and $r_{i+1}$ for further processing with the following pointers: $r_i.next = r_{i+1}$ and $r_{i+1}.prev = r_i$.

To identify the stops, we check each pair of consecutive reported events on any trajectory. We do so by traversing the events sorted by time while saving the last reported event on each trajectory. It is easy to show that sorting and traversing the events can be done in $O(n \log n)$ time.
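A minimal Python sketch of this step is given below. The function name `identify_stops`, the representation of events as `(id, x, y, t)` tuples, and the use of dictionaries in place of the `next`/`prev` pointers are illustrative assumptions. The sketch also skips pairs of events with a non-positive time difference, a case the description above does not address.

```python
from math import hypot


def identify_stops(events, alpha):
    """Classify events as stops and link consecutive stops.

    events: list of (traj_id, x, y, t) tuples (assumed layout).
    Returns (stops, next_of, prev_of), all expressed as indices into events.
    """
    # Traverse the events sorted by time: O(n log n).
    order = sorted(range(len(events)), key=lambda i: events[i][3])
    last = {}                    # last seen event index per trajectory
    stops = set()
    next_of, prev_of = {}, {}    # stand-ins for the next / prev pointers
    for i in order:
        tid, x, y, t = events[i]
        j = last.get(tid)
        if j is not None:
            _, px, py, pt = events[j]
            dt = t - pt
            # Average speed between consecutive events of one trajectory.
            if dt > 0 and hypot(x - px, y - py) / dt < alpha:
                stops.update((j, i))
                next_of[j] = i
                prev_of[i] = j
        last[tid] = i
    return stops, next_of, prev_of
```

Keeping only the last event per trajectory means a single pass over the sorted events suffices, regardless of how the trajectories are interleaved in time.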

3.1.2 Constructing the Stop-Groups
We say that two stops (potentially from different trajectories) are $\beta$-close if their Euclidean distance is smaller than the input parameter $\beta$. Let $G = (V, E)$ be a graph, where $V$ is the collection of the stops we identified in the previous step, and each pair of stops $s_i$ and $s_j$ is connected with an edge $e(s_i, s_j) \in E$ if $D(s_i, s_j) < \beta$, where $D$ is the Euclidean distance function. After computing $G$, we cluster the stops by computing the strongly connected components of $G$; each strongly connected component corresponds to a stop-group.

Let the $\le$ relation of points in $\mathbb{R}^2$ be defined as follows: $p_1 \le p_2$ if and only if $p_1.x < p_2.x$, or $p_1.x = p_2.x$ and $p_1.y \le p_2.y$.

We next describe our Plane-Sweep Near Neighbor (PSNN) algorithm to construct $G$ by finding all $\beta$-close pairs of stops. We first sort the collection of stops using the $\le$ relation (time $O(m \log m)$). We then sweep the plane from left to right according to the plane sweep paradigm [12]. The events of the sweep are the stops, sorted with the $\le$ relation. We maintain a rectangular window $\gamma$ of width $\beta$ whose right line is the plane-sweep status line. $\gamma$ maintains all the stops inside it sorted by their $y$-coordinates. Each time we process a stop $p$, we maintain $\gamma$ by inserting $p$ into $\gamma$ and removing the stops $\{q \in \gamma : q.x < p.x - \beta\}$ from it (see Figure 3 for an illustration).

LEMMA 3.1. Maintaining $\gamma$ takes $O(m \log m)$ time and $O(m)$ space.

PROOF. We implement $\gamma$ as a sorted list. Since it contains at most $m$ stops, each insertion and deletion takes $O(\log m)$ time. Note that each stop will be inserted once (when the status line reaches it), thus the total time for insertion is $O(m \log m)$. We are left to show that the deletion part takes $O(m \log m)$ time too. Since the stops are already sorted by their $x$-values, the deletion action can use this order and traverse the stops from left to right during the sweep. While processing $p$, it will remove stops from $\gamma$ as long as their $x$-difference from $p$ is larger than $\beta$. Let $c_p$ be the number of stops we test for deletion when processing $p$. Note that $c_p - 1$ of those stops will be removed from $\gamma$ (it is possible that we will test $p$ for deletion too). We will charge the non-removed stop to $p$. Since each stop can be removed from $\gamma$ at most once during the sweep (time $O(\log m)$ for a single removal), the time bound follows. Since $\gamma$ contains at most $m$ stops, the space complexity follows.

Figure 3: The window $\gamma$ (shaded) while processing $p$. At this point of time, $\gamma$ contains six stops (including $p$).

Once $\gamma$ is updated, we detect stops inside $\gamma$ that are $\beta$-close to $p$. We do that by traversing $\gamma$ up and down from $p$ until the $y$-coordinate difference of the traversed stop and $p$ is larger than $\beta$. For each stop being traversed, we test if its distance to $p$ is smaller than $\beta$ (if it is, they are reported as a $\beta$-close pair). Using the properties of the sweep-line paradigm and considering Lemma 3.1, it follows that the complete sweep can be done in time $O(m \log m + Q)$, where $Q$ is the number of proximity tests we perform. It is evident that some of the tests may include pairs of stops that are not $\beta$-close; we refer to these tests as false proximity tests. Nevertheless, Bentley [7] proves that in BUCKET the total number of tests is of the same order of magnitude as the number of true proximity tests (the tests that result in $\beta$-close stops and are thus reported). It is easy to show that this proof holds for PSNN too (we omit the details) if we ignore the proximity tests of the stops that halt the traversal of $\gamma$ (see above). However, since each stop $p$ can be associated with at most two such stops (recall that we traverse $\gamma$ up and down), we can charge them to $p$. Thus, we can ignore the false proximity tests when computing our time performance and take the number of true proximity tests instead of the total number of tests. Hence, we can formulate the time complexity as $O(m \log m + K)$, where $K$ is the output size (note that $K = O(n^2)$).
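The sweep can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the name `psnn_close_pairs` is hypothetical, the window is a plain Python list searched with `bisect` (so positions are found in logarithmic time, but list insertions and deletions are linear, and a balanced tree would be needed for the stated bounds), and the neighbors are queried just before $p$ is inserted so that $p$ is never tested against itself.

```python
from bisect import bisect_left
from math import hypot


def psnn_close_pairs(stops, beta):
    """Return sorted index pairs (i, j), i < j, with distance below beta.

    Hypothetical helper: stops is a list of (x, y) tuples, beta > 0.
    """
    # Sweep order: by x, ties broken by y (tuple order matches the <= relation).
    order = sorted(range(len(stops)), key=lambda i: stops[i])
    window = []   # entries (y, x, index), kept sorted by y
    tail = 0      # position in `order` of the leftmost stop still in the window
    pairs = []
    for i in order:
        px, py = stops[i]
        # Evict stops that left the window: q.x < p.x - beta.
        while stops[order[tail]][0] < px - beta:
            q = order[tail]
            del window[bisect_left(window, (stops[q][1], stops[q][0], q))]
            tail += 1
        pos = bisect_left(window, (py, px, i))
        # Walk down and up from p's slot while the y-gap is at most beta.
        lo = pos
        while lo > 0 and py - window[lo - 1][0] <= beta:
            lo -= 1
        hi = pos
        while hi < len(window) and window[hi][0] - py <= beta:
            hi += 1
        for qy, qx, q in window[lo:hi]:
            if hypot(px - qx, py - qy) < beta:   # proximity test
                pairs.append((min(i, q), max(i, q)))
        window.insert(pos, (py, px, i))
    return sorted(pairs)
```

Because evicted stops are always the oldest in sweep order, a single moving index (`tail`) is enough to find deletion candidates, mirroring the charging argument in the proof of Lemma 3.1.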

THEOREM 3.2. PSNN correctly reports all pairs of stops that are $\beta$-close to each other.

PROOF. We prove that PSNN reports a pair of stops if and only if the pair's stops are $\beta$-close. The 'only if' direction is immediate from the distance test we perform. Thus, we are left with the 'if' direction. Let $p_1$ and $p_2$ be two stops and suppose that $D(p_1, p_2) < \beta$. Without loss of generality, assume that $p_1 \le p_2$. Consider the time when we process $p_2$. Since $D(p_1, p_2) < \beta$, it follows that $p_1 \in \gamma$. Let $\xi(p_2) = (p_2.x - \beta, p_2.y - \beta)$ and $\kappa(p_2) = (p_2.x, p_2.y + \beta)$ be two points and $\theta(p_2)$ be the rectangle whose lower-left corner is $\xi(p_2)$ and top-right corner is $\kappa(p_2)$. It follows by our algorithm that any $p \in \theta(p_2)$ is tested for proximity with $p_2$. Since $D(p_1, p_2) < \beta$, we get that $p_1 \in \theta(p_2)$, and thus $p_1$ will be tested with $p_2$ for proximity and this pair will be reported. The theorem follows.

We note that with the majority of real-world instances, PSNN would perform far fewer proximity tests than BUCKET (see Section 2.1). To detect the $\beta$-close stops of a specific stop $p$, PSNN tests only stops contained inside $\theta(p)$ ($\theta$ is defined in the proof of Theorem 3.2). Note that since $|\theta(p)| = 2\beta^2$, the proximity tests for each stop being processed are limited to an area of size $2\beta^2$ (except for possibly two tests with stops located outside $\theta(p)$ that halt the traversal of $\gamma$ as explained above). In comparison, BUCKET tests five buckets of size $\beta^2$ for each stop $p$ being processed; those are the bucket that contains $p$ and its neighboring buckets that come before it when traversing the buckets in a left-right and top-bottom order (in Figure 4 we visualize this comparison). It follows that the area PSNN examines for each stop is 60% smaller than the area examined by BUCKET. Moreover, a simple geometric observation shows that under a uniform distribution assumption, PSNN's false proximity test rate will be 21.5%, while BUCKET's false proximity test rate will range between 37.2% and 84.3%, depending on the position of the stop inside its containing bucket.
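One way to arrive at the 60% and 21.5% figures is sketched below. The derivation assumes uniformly distributed stops and uses the fact that the stops $\beta$-close to $p$ inside $\theta(p)$ lie in the left half-disc of radius $\beta$ centered at $p$; the symbols $A_{\mathrm{PSNN}}$, $A_{\mathrm{BUCKET}}$ and $A_{\mathrm{true}}$ are introduced here only for illustration.

```latex
\begin{align*}
A_{\mathrm{PSNN}} &= \beta \cdot 2\beta = 2\beta^2, \qquad
A_{\mathrm{BUCKET}} = 5\beta^2, \\
\frac{A_{\mathrm{BUCKET}} - A_{\mathrm{PSNN}}}{A_{\mathrm{BUCKET}}}
  &= \frac{5\beta^2 - 2\beta^2}{5\beta^2} = 0.6, \\
A_{\mathrm{true}} &= \tfrac{1}{2}\pi\beta^2
  \quad\text{(left half-disc of radius } \beta \text{ around } p\text{)}, \\
\text{false test rate of PSNN}
  &= 1 - \frac{A_{\mathrm{true}}}{A_{\mathrm{PSNN}}}
   = 1 - \frac{\pi}{4} \approx 0.215 .
\end{align*}
```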

As we described above, after constructing the neighboring graph $G$ using our PSNN algorithm, we cluster the stops by computing the strongly connected components embedded in $G$. We use the union-find algorithm [11] to find these components. We do this during the plane sweep, merging the sets of every two $\beta$-close stops upon detection. Consequently, the resulting sets are the stop groups. Constructing the clusters takes $O(m\,\alpha(m))$ time, where $\alpha$ here denotes the extremely slow-growing inverse Ackermann function (not the speed threshold).
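The merging step can be illustrated with a small union-find sketch in Python. The function name `cluster_stops` and the list-based representation are assumptions for illustration; for simplicity the sketch consumes a precomputed list of $\beta$-close pairs, whereas the text merges the sets during the sweep as each pair is detected.

```python
def cluster_stops(num_stops, close_pairs):
    """Group stop indices 0..num_stops-1 into stop groups.

    close_pairs: iterable of (i, j) index pairs that are beta-close.
    Returns a sorted list of groups, each a sorted list of stop indices.
    """
    parent = list(range(num_stops))
    size = [1] * num_stops

    def find(a):
        # Path halving keeps the trees shallow.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        # Union by size: attach the smaller tree under the larger one.
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for a, b in close_pairs:
        union(a, b)

    groups = {}
    for s in range(num_stops):
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())
```

A stop with no $\beta$-close neighbor ends up in a group of its own, which matches the view of each component as an isolated analysis element.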

Figure 4: Comparing the areas processed by the algorithms when processing stop $p$. PSNN will process the shaded rectangle whose area is $2\beta^2$, while BUCKET will process the five dashed buckets of total area $5\beta^2$.

3.1.3 Segmenting the Stop Groups to Stop Segments
We say that a sequence of consecutive stops forms a stop-segment if they belong to the same stop-group and they are connected by the next pointer (see Section 3.1.1). We partition each stop-group into a collection of stop-segments. These segments, each contained in a specific stop-group, constitute the smallest units that users would need to assess during exploration.

To segment the stops, we traverse every stop group, partitioning it into stop segments using the prev and next pointers. This can be easily done in linear time (we omit the details).
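Since the details are omitted, the following Python sketch shows one possible linear-time traversal; it is an assumption about how the step could be done, not the authors' procedure. The name `segment_group` is hypothetical, and the `prev`/`next` pointers are modeled as dictionaries from stop index to stop index.

```python
def segment_group(group, prev_of, next_of):
    """Partition one stop group into stop segments.

    group: iterable of stop indices in the group.
    prev_of / next_of: dicts standing in for the prev / next pointers.
    Returns a list of segments, each a list of consecutive stop indices.
    """
    group = list(group)
    members = set(group)
    segments = []
    for s in group:
        # A segment starts at a stop whose predecessor is not in this group.
        if prev_of.get(s) in members:
            continue
        segment = [s]
        nxt = next_of.get(s)
        # Follow next pointers while we stay inside the group.
        while nxt in members:
            segment.append(nxt)
            nxt = next_of.get(nxt)
        segments.append(segment)
    return segments
```

Each stop is visited once as a potential segment head and at most once while following pointers, so the work is linear in the size of the group.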

We conclude with a theorem that summarizes PSNN.

THEOREM 3.3. Computing the stop groups and stop segments using PSNN takes $O(n \log n + K)$ time and $O(n)$ space, where $n$ is the number of input events and $K$ is the output size.

PROOF. Since $m \le n$, we can replace $m$ by $n$ for asymptotic complexity computation purposes. Then the time and space complexities are simply the accumulation of the times and spaces we specify above.


Discussion: our algorithm vs. existing point clustering algorithms.

Although there exist good clustering algorithms for points in $\mathbb{R}^2$ (see Section 2.1), most do not match our goals. Our model assumes that the input contains noise-free GPS signals (noise is assumed to be removed in advance), thus we can ignore outliers. Moreover, we are interested in both a fixed spatial distance clustering threshold, since we consider all areas in the map equal, and a simple clustering using the idea of strongly connected components, which fits our stop analysis design well. Thus, more sophisticated clustering algorithms that handle noise and outliers, as well as ones that perform dynamic clustering, are not of interest to us in this work. Moreover, being more sophisticated, those algorithms generally require more time to compute than solutions to our problem. Hence, our choice is the combination of detecting pairs of close stops with an FRNN algorithm and subsequently clustering them using the strongly connected component structure.

3.2 Visual Mapping and Interaction

Once the stop groups and stop segments are generated, the user requires a visualization of their patterns for comparison and further investigation. Our main idea is to locally visualize each stop group. Since a stop group consists of a set of segments, its visualization consists of local visualizations of the stop segments themselves. To conduct a systematic assessment of the segments, we first defined the following data properties to be made visually accessible to the user, together with the corresponding user task:

• The temporal distribution of stops, such as over hours of the day or days of the week, should be shown, and the comparison of time moments should be supported.

• The number of stopping vehicles should be made comparable, such as many versus few vehicles.

• The duration of the stops should be made accessible, such as long versus short stops.

• The overall volume of stops should be visually salient, combining the number of vehicles stopping at a location and their stop duration, such as long intensive versus occasional short stop locations, and any combination of these.

Florence Nightingale’s polar area diagram, which she created to graphically represent mortality causes during the Crimean War, is a landmark in information visualization [13, 9, 10, 33]. Also known as the Nightingale rose diagram, the circular plot represents a multi-scale temporal dimension together with a value distribution, forming a flower-shaped chart. The unusual appearance of Nightingale’s graph is eye-catching and undoubtedly conveys its message. Inspired by this visualization, we have developed a more complex and sophisticated layout for our stop patterns. The advantage of such a layout is twofold: first, it has a compact form that can be shown on a map, and second, it has multiple visual attributes available for mapping data attributes. A systematic empirical comparison of multiple visual metaphors is beyond the scope of the current work, but we do address this and justify our design decisions in the Design Decisions section of the Evaluation. For the method of visual mapping, we define the following visual properties of the circular layout that we used for data attributes:

• The segments of the circle represent different time moments to show the time of the stop-segment.

• The sequence of segments represents time moments at a given resolution and consequently shows the temporal distribution.

• Each stop-segment can be filled with a transparency level to reveal differences in the number of occurring stop-segments.

• The size of the filled area within a segment, starting from the center, can be dedicated to the duration of the stops to show their distribution and the differences between time segments.

• As a result of combining opacity and the overall size of a circle, the stop volume is revealed to the user.

To demonstrate this visual mapping, we have created a schematic description, shown in Figure 5. Figure 5(a) shows the visual variables of a flower to be mapped to the data attributes. Figure 5(b) shows one possible schematic representation of a resulting flower with long intensive stops and short infrequent stops. For illustration, at midnight (center upper segment) we show a short frequent stop, from 6am to 9am an increase in the duration of stops, and at 2pm a distribution of short to long durations with a decreasing number of vehicles involved (as shown by the transparency).

Figure 5: Schematic description of the visual mapping, revealing the timing, temporal distribution, and duration of stops. (a) Visual attributes: circle segments for temporal resolution, fill area of the segment from the center, fill transparency for single events, and position by time. (b) Exemplary result for one stop pattern, with clock labels at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00.
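The mapping from data attributes to visual attributes can be sketched in a few lines. This is an illustration under assumptions rather than the authors' rendering code: the function `petal_parameters`, the input format (hour of day, duration in minutes), and the output keys are hypothetical, and the sketch aggregates stops per time bin instead of overlaying one semi-transparent wedge per stop.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def petal_parameters(stops: List[Tuple[int, float]], max_radius: float = 1.0,
                     bins: int = 24) -> Dict[int, dict]:
    """Map (hour, duration_minutes) stops to per-segment drawing parameters."""
    if not stops:
        return {}
    per_bin = defaultdict(list)
    for hour, duration in stops:
        per_bin[hour % bins].append(duration)
    longest = max(d for _, d in stops) or 1.0   # avoid division by zero
    busiest = max(len(v) for v in per_bin.values())
    result = {}
    for b, durations in per_bin.items():
        result[b] = {
            # Position by time: clockwise angles, with bin 0 starting at the top.
            "start_angle_deg": 360.0 * b / bins,
            "end_angle_deg": 360.0 * (b + 1) / bins,
            # Fill length from the center encodes the longest stop duration.
            "radius": max_radius * max(durations) / longest,
            # Opacity encodes how many stops fall into this time bin.
            "opacity": len(durations) / busiest,
        }
    return result
```

A bin with many short stops thus appears as a short, dark wedge, while a single long stop appears as a long, light wedge, which corresponds to the distinction between frequent and occasional stops described above.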

Our method also supports user interaction with the visualizations. One of the key requirements of the visualization was to allow mapping at the actual location of the stops, using the geographic map background. As such, users can interact with the map and the visualizations simultaneously, as supported by the map provider. Zooming and panning on the map enable access to interesting areas. In addition, we provide two advanced interaction techniques: overall size adjustment and smart highlighting. Overall size adjustment allows users to set the scale of the circles, which can be used to avoid clutter or reveal more details of the underlying map. Smart highlighting enables users to correlate the trajectories of paths with the corresponding stops and vice versa, by simply hovering over either the stops or the trajectories themselves. One can highlight just the stops of a single vehicle, as well as all the vehicles corresponding to one stop segment. From this interactive visualization, a user can gain an overview of the spatiotemporal distribution of stops, compare locations and time segments, and grasp the volume of stops as a combination of the relative number of vehicles and their duration.

4. INSTANTIATION OF THE METHOD

To demonstrate the utility of the proposed method, we carried out an instantiation on a real-world dataset, selected from the domain of public urban transportation. The advantage of this domain is that data is publicly accessible and available in large quantities. Public transportation is not only the heartbeat of urban life but is also very interesting and challenging to examine from both a research and business perspective.

The task at hand was to explore and define stop patterns in the city of Helsinki’s public transportation system. The intent was to reveal the deviation between actual and planned behavior, point to outliers, and ultimately achieve a broader understanding of the domain for purposes of planning and optimization. Stops in public transportation are of critical value. Planned stops are required at certain junctions and stations, and unplanned stops are often unavoidable. Although precisely predicting the duration and location of both types of stops can be difficult, they must be carefully calculated since they represent the key performance indicator for the public transportation service. The ultimate aim of the authorities was to achieve full predictability, minimizing route time and guaranteeing high-quality service. Our undertaking aimed to provide a deeper understanding and broader assessment of the stops.

4.1 Data Collection and Environment

To gather traffic data, we used the Helsinki Regional Transport’s HSL Live web service [21]. We connected to the HSL Push interface, which sends one record per second per active bus and tram, for a duration of 24 hours. The result was a list of the locations of all active buses and trams within a predefined geographic area. We parsed and saved these locations as latitude and longitude, keyed by the identity of the vehicles. The data covered more than a hundred vehicles on 16 tram routes and 8 bus routes. We restricted our attention to the 6 bus routes with the most activity in the downtown Helsinki area. The final dataset contained more than 1,000 trajectories, with approximately 200,000 geographic positions.

We used our own implementation of a workbench for the visual exploration of movement data and for the computation and visualization of the stop patterns. The workbench is a web-based visualization and analytic tool that supports cartographic rendering and the visualization of spatiotemporal data, such as trajectories, sensor networks, and events. As a map background, we used a cartographic map provided by OpenStreetMap [19].

Within this environment, we triggered the computation of the stop patterns by first setting the speed threshold for detecting stops to 0 m/h, in order to capture only full halts of vehicles. We then set the spatial distance between stops to 100 m for clustering them into stop-groups. These parameters were defined by the domain expert from the public transportation organization. We believe that these values might eventually be precomputed using an optimization function, as further discussed in the Evaluation section. The realization of such a function is beyond the scope of the current work.
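To make the role of the speed threshold concrete, the following sketch marks stop intervals in a single vehicle track. It is an assumption-laden illustration, not the authors' stop-detection procedure: the `Fix` tuple layout, the function names, and the use of the haversine great-circle distance between consecutive position records are choices made for this example only.

```python
import math
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (time in seconds, latitude, longitude); hypothetical layout


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters on a sphere of radius 6,371 km."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def detect_stops(track: List[Fix], max_speed_m_per_h: float = 0.0) -> List[Tuple[float, float]]:
    """Return (start_time, end_time) intervals whose speed is <= the threshold."""
    stops = []
    start = None
    end = None
    for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order records
        speed = haversine_m(la0, lo0, la1, lo1) / dt * 3600.0  # meters per hour
        if speed <= max_speed_m_per_h:
            if start is None:
                start = t0
            end = t1
        elif start is not None:
            stops.append((start, end))
            start = None
    if start is not None:
        stops.append((start, end))
    return stops
```

With the default threshold of 0 m/h, only intervals in which consecutive records report identical positions count as stops, which matches the intent of capturing full halts.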

The generation and exploration of stop patterns according to the analytic pipeline included two iterative loops. In the first loop, the domain expert defined and set the algorithmic parameters. The second loop involved exploring the results; this was a long and intensive undertaking. The domain expert had two complementary approaches: (a) defining points of interest and extracting the corresponding stop patterns for investigation, and (b) exploring the shapes of stop patterns and investigating them in their geographic context.

4.2 Results of the Investigation

Our algorithm extracted a set of patterns described in the following section. These represent a small subset of the patterns found and observed. The first part of the investigation dealt with the overall distribution and the location of the stops. Figure 6 shows an overview of the results for the whole city. A few larger stop areas with significantly long stops and with a distinct distribution over time are readily apparent. We observed that most of these locations are close to or located at final stations. In the central-northern part of the city, two close stop areas appear with very long stops. We also noticed that at almost every junction, single but significantly long stops occur, pointing to different hours of the day, as indicated by the direction of the segments. An in-depth analysis of the individual stop group representations revealed some interesting patterns and phenomena, which are referenced by letter labels in Figure 6:

A. The final station in Saunalahdentie in Figure 6A shows that trams arrive at this station at all working hours of the day. The dark parts of the flower also show small variations of about 3 minutes. Some longer stops (6 minutes at most) occur at around noon (between 11am and 4pm). Interestingly, a slight deviation occurs in the early morning, between 6am and 10am, as indicated by the consistently overlapping segments with no transparency. For the expert, this pattern indicated a good alignment of planned versus actual stops.

B. The Kulosaari Island in Figure 6B is a famous tourist place, showing one large stop area and several outlier stops. Major stops occur between 6am and 3pm, but there are some further significant ones in the evening at 10pm. Overall, the stops are fairly short, ranging from 1 to 4 minutes. Single occurrences in this context were not investigated further by our expert, since they do not seem to recur. Since the duration shows no significant deviations, we concluded that the planned stops aligned well with the actual stops.

C. The area of the central railway station in Figure 6C is critical for the investigator. The longest stop was 15 minutes, instead of the planned 3 to 4 minutes. The largest deviations and longest stops were observed at rush hour, from 7am to 10am and after 8pm. Most interestingly, the morning stops recur with very high duration, which suggests that either the plan for the entire schedule has to change accordingly, or an alternative method for stop reduction has to be implemented. Our domain expert, aware of this problematic location, did not seem to be surprised; rather, the finding confirmed his knowledge and hypothesis.

D. Another stop area at the central railway station in Figure 6D also shows a very high deviation between the planned and actual duration of stops. These results align with the previous example in that the duration of stops reaches 15 minutes instead of the planned 3 to 4 minutes. Surprisingly, the longest stops and highest deviations were observed in the early afternoon, between 12pm and 4pm. This very valuable finding extends the previous description, indicating that these two locations seemingly divide the load of the day, being only 1 km away from each other. The current dataset did not provide enough evidence to support a far-reaching explanation of the phenomenon, but it motivated us to pursue further data collection and comparison on a daily basis. If this hypothesis, namely that these two stations share and divide the load experienced during a day, is confirmed, we can assume to have found a bottleneck with far-reaching consequences for planning and optimizing public transportation.

E. The end of the investigation focused on finding outliers of interest. Along the Aleksanterintaku Road, we found several one-time occurrences of long waiting times, which continued to occur beyond the bridge on Kanavakatu as well (from west to east/south-east), as shown in Figure 6E. The duration of these stops is approximately 4.5 minutes. Interestingly enough, these stops seem to recur every 500 meters, and they do not always happen at junctions, stops, or traffic lights. In particular, the two stops after the bridge, on the lower center-right of the image, happen in the early afternoon for no obvious reason. Further investigation by the domain expert could not provide a plausible explanation. The longest durations occur in the early afternoon, followed by some morning and late-night occurrences. The expert's immediate thought for overcoming such events was to implement alarm systems for the dispatcher in an attempt to shorten these times or avoid them entirely.

Figure 6: Overview of the spatiotemporal distribution of stop patterns in Helsinki public transportation. Letter labels refer to patterns discussed in the instantiation of the method section.

In summary, the instantiation demonstrated the usefulness of the proposed method on a real-world dataset of urban public transportation. The investigation was carried out in close collaboration with a domain expert, and we described the interpretations of the findings. We also received valuable feedback on the tool and on the visualizations.

5. EVALUATION

In this section, we evaluate the performance of the implementation of PSNN. We assess the effect of user input on the results and the performance of our method, and compare the performance with the implementation of BUCKET. We also describe our design decisions, which we made in an effort to achieve optimization and enhanced visualizations.

5.1 Assessing the Effect of User Parameters on Performance

To recap the method, user input is required first for the speed threshold to identify a stop, and second to set the spatial distance for clustering the stops into groups. Performance is defined as the time to create clustered stop-groups and the number of stop-groups that were created at the end of the process.

When we set the speed threshold to 0 m/h, 27,783 stops and 4,057 individual segments were created. Then we changed the spatial distance for the clustering of stops from 0 meters to 3000 meters, in steps of 100 meters. The number of stop groups decreased exponentially with an increase of the spatial distance. We identified 1551 stop groups for the 0-meter spatial distance threshold, and 200 stop groups for the 500-meter threshold. The running times of the clustering and segmentation steps were also affected by the changes in the spatial distance threshold. Segmentation time increased slightly, from 870 ms to 2040 ms for the 0-meter and 3000-meter thresholds, respectively. The clustering time increased more drastically, from close to 4000 ms to 16,000 ms, for the same threshold range. Subsequently, we varied both the speed and the distance thresholds systematically. The speed threshold ranged from 0 to 300 m/h and the distance threshold from 0 to 500 meters. As in the previous analysis, we computed the time of clustering and of segmentation, and also the number of stop-groups identified. Figure 7(a) shows that for the smaller distance thresholds, the number of stop-groups increases in a logarithmic way with speed. However, for the higher distance thresholds, the number of stop-groups decreases with speed. This indicates a more complex effect of the speed and distance thresholds on the number of stop-groups. Computation shows that this behavior can be approximated as a poly-logarithmic function, where the exponent is a function of the spatial distance threshold.

Figure 7(b) shows the same analysis from a different perspective, and confirms our findings. When distance thresholds are lower, higher speed thresholds result in a higher number of stop-groups. For higher distances, the relation is reversed: higher speed thresholds result in a lower number of stop-groups. We can approximate this behavior with a poly-logarithmic function as well; this time the exponent is negative for all graphs. Moreover, the higher the speed, the larger the absolute value of the exponent.
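One way to write down the approximations described above is given below. This is only a reading of the prose, not a formula taken from the experiments: the symbols $N$ (number of stop-groups), $\alpha$ (speed threshold), and $\beta$ (distance threshold) follow the notation used in Section 5.2, while the coefficients $a_\beta$, $b_\alpha$ and the exponent functions $k$, $m$ are hypothetical placeholders whose values would have to be fitted to the data. Both forms are meaningful only for strictly positive thresholds.

```latex
% Number of stop-groups as a function of the speed threshold,
% with an exponent that depends on the distance threshold:
N(\alpha \mid \beta) \approx a_\beta \, (\log \alpha)^{k(\beta)}

% Number of stop-groups as a function of the distance threshold,
% with a negative exponent whose magnitude grows with the speed threshold:
N(\beta \mid \alpha) \approx b_\alpha \, (\log \beta)^{-m(\alpha)},
\qquad m(\alpha) > 0, \quad m \text{ increasing in } \alpha
```

Under this reading, the change from an increasing to a decreasing trend with speed corresponds to $k(\beta)$ changing sign as $\beta$ grows.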

Understanding the relation between the input parameters and the stop-groups identified and displayed on the map is critical for an automatic initialization of the input parameters. A comprehensive approach will need to take into account not only the behavior of the input parameters, but also the spatial distribution and resolution of the data itself. This is beyond the current scope but certainly a consideration for future work.

Figure 7: Changes in the number of stop-groups (a) as a function of distance threshold for different constant speed thresholds, and (b) as a function of speed threshold for different constant distance thresholds. We highlight the approximated poly-logarithmic behavior between the number of stop-groups and the speed and distance thresholds.

5.2 Comparison with BUCKET

In this section we evaluate PSNN’s performance and compare it to that of BUCKET. Since both algorithms generate the same results, we are interested in comparing their performance. We use the Helsinki public transportation data.

We first extracted stop sets for different values of α (the speed threshold). Since these values are insignificant to the results of these experiments, we omit them for clarity. We then tested the stop sets with different values of β (the distance threshold), on both PSNN and BUCKET. We were interested in two quantities. The first is the time taken to completion. The second is the number of total proximity tests and false proximity tests performed by both algorithms. We report those quantities in order to show a major difference between the algorithms and to explain the time differences.

Figure 8 shows the results for both algorithms. The Y-axis in each graph represents the average quantities over several tests with different values of β (five tests for each stop set, using β = 100, 200, ..., 500). Overall, the results clearly show that PSNN outperformed BUCKET. BUCKET performed far more proximity tests than PSNN, as shown in Figure 8(a), and a smaller share of its tests were true proximity tests, as shown in Figure 8(b). Consequently, BUCKET took 38% to 100% more time than PSNN to complete (see Figure 8(c)). Note that these differences increased significantly as the number of stops increased. This observation indicates that PSNN is potentially more scalable than BUCKET.
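The two reported quantities, total and true proximity tests, can be illustrated with a generic grid-based fixed-radius search. This sketch is not the BUCKET implementation used in the experiments: the function `bucket_close_pairs` is hypothetical, the cell size is assumed to equal the distance threshold, and points are assumed to be planar coordinates in meters. A test counts as "true" when the tested pair is actually within the threshold.

```python
import math
from collections import defaultdict
from typing import List, Tuple

Point = Tuple[float, float]


def bucket_close_pairs(points: List[Point], radius: float):
    """Grid-based fixed-radius search; returns (pairs, total_tests, true_tests)."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    # Assign every point to a square cell whose side equals the radius.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / radius), math.floor(y / radius))].append(idx)
    pairs = []
    total_tests = 0
    for (cx, cy), members in grid.items():
        # Neighbors can only lie in the same cell or one of the 8 adjacent cells.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j:  # test each unordered pair exactly once
                            total_tests += 1
                            (xi, yi), (xj, yj) = points[i], points[j]
                            if math.hypot(xi - xj, yi - yj) <= radius:
                                pairs.append((i, j))
    return sorted(pairs), total_tests, len(pairs)
```

Every pair of points in adjacent cells is tested, even when the points are up to almost three cell widths apart diagonally, which is one plausible source of false tests in a grid-based scheme.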

Finally, we note that we experimented with more datasets, including random data, to be independent of the geographic data. In all experiments we obtained performance differences similar to those reported above. In the full paper we intend to give more details on our full experiments.

5.3 Reflecting on Design Decisions

As an important part of the evaluation, we reflect on the strengths and weaknesses of our design decisions, address points for improvement, and suggest alternatives. These reflections are based on empirical evaluations conducted in the past and on our own experience in working with visualization. Diehl et al. [13] presented a comprehensive study for empirically evaluating the strengths and weaknesses of radial displays over Cartesian ones. We made our design decisions carefully, in light of the guidelines provided by the authors in that work [13] and of the discussion provided by Gelman & Unwin [18]. The main attributes of our design, along with their rationale, are outlined and reflected on as follows:

1. The relative position of the circle segments to one another reflects the temporal dimension. The segments can also be used for defining the temporal resolution, such as hours of a day or months of a year. The literature [13] reports that the disadvantages of the radial display are more pronounced when absolute positions have to be remembered, as opposed to relative positions. This raises some issues, such as whether users could locate one particular hour of the day as a segment without additional help such as labeling or color coding.

2. The sequence of time is mapped in a circular clockwise manner to the flower chart. Arguably, a simple line chart or Cartesian representation could have the same expressive power. This argument, however, does not take into account that multiple objects (vehicles, in our case) contribute to the creation of one circle segment. In the flower representation, we use transparency to represent the number of objects involved. Darker, less transparent segments indicate higher volumes, and lighter, more transparent segments indicate lower volumes. As a result, users can compare the significance of two occurrences and distinguish one-time flukes from general patterns. Therefore, we believe that the flower layout holds the advantage for capturing the scaling of the number of objects involved.

Figure 8: Comparing PSNN and BUCKET on an increasing number of stops (horizontal axis: number of stops, from 2000 to 8500): (a) the total number of proximity tests (in millions), (b) the percentage of true proximity tests among all tests, and (c) the time taken by both algorithms to complete (in ms).

3. The duration of a stop is mapped to the length of a segment, starting from the center of the circle. We considered multiple alternatives for this mapping. In our first layout, we used color intensity to encode duration. We rejected this choice, however, because it does not work together with transparency: overplotting causes mixed colors, which are hard to interpret. There is a further consideration for circle segments: because of the angle involved, the shape not only lengthens but also widens as the radius increases. This introduces an unwanted bias that has to be corrected by normalizing the length by the angle using trigonometric functions.

4. The shape of the flowers aims to reveal the overall distribution of the stops at a given location. The main user task we considered was generating an overview of the stops’ distribution. Throughout our exploration and work, we have learned to use the shape of the flowers as the first indicator of interestingness. At this stage, we question whether the possible differences between shapes are strong enough to be perceived and remembered as patterns. Nevertheless, the flowers of Florence Nightingale are renowned for strongly conveying the messages they represent [13, 9, 10, 33]. In addition, the general learning curve for the circular layout is similar to that of any Cartesian layout [13].
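The widening bias mentioned in item 3 can be made concrete with a small calculation. A circle segment with opening angle θ and radius r has area θ·r²/2, so doubling the radius quadruples the filled area. The sketch below shows square-root scaling, a common correction that makes the filled area proportional to the duration; it is offered as an assumption for illustration and is not necessarily the normalization the authors applied. The function names are hypothetical.

```python
import math


def sector_area(radius: float, angle_rad: float) -> float:
    """Area of a circle segment (sector) with the given radius and opening angle."""
    return 0.5 * angle_rad * radius ** 2


def radius_for_duration(duration: float, max_duration: float,
                        max_radius: float = 1.0) -> float:
    """Radius chosen so that the sector area, not its length, is proportional to duration."""
    if max_duration <= 0:
        return 0.0
    return max_radius * math.sqrt(max(duration, 0.0) / max_duration)
```

With this scaling, a 1-minute stop drawn next to a 4-minute stop gets half the radius and exactly one quarter of the filled area, so the visual weight follows the duration ratio.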

Overall, our impression is positive regarding the expressiveness of the visual metaphor we selected for stop patterns. Users can assess the stop volume and compare locations and time points. In addition, the flowers scale well with the number of stops visualized, the temporal resolution, and the distances between stops. Nevertheless, we realize the need to improve the assessment of the absolute position of the segment, as opposed to the relative position, and to overcome the clutter and overplotting problem of the flowers in highly dense areas.

6. CONCLUSIONS

We presented a system to analyze the stop occurrences of moving vehicles. The goal is to help domain experts analyze slowly moving vehicles in order to derive important stop patterns. Our system consists of two main parts: an algorithm to cluster the stops based on their location and a visualization technique to visualize the data in a useful way. This system is now embedded in a spatiotemporal visualization tool whose main goals are to provide efficient analysis and useful visualization capabilities.

We devised a clustering algorithm that has three parts: identifying the stops, clustering them, and segmenting stop-subtrajectories within stop clusters. We focused on the clustering phase, which is the most challenging among the three. The main idea is to first build a neighboring geometric graph of the stops and then to find its strongly connected components. To find the neighbors, we formulated the problem as a fixed-radius near neighbor problem and devised a corresponding algorithm based on the plane-sweep paradigm. We showed that this algorithm is efficient both asymptotically and experimentally. We implemented it and compared it experimentally with a well-known algorithm based on bucketing ℝ² points. We concluded that our plane-sweep algorithm clearly outperforms the bucketing algorithm and explained the reason for that geometrically.

Our work contributes a novel way to use a well-known visualization paradigm in the context of detecting and analyzing stops as one type of geographic movement pattern.

We discovered some weaknesses in our technique that are partially technical in nature and can be overcome with further implementation and development. Some conceptual challenges, such as setting the optimal overall size of the flowers, need further research and experimentation. In general, we believe that we have satisfied the task of showing stop volume and patterns embedded in a geographic map. Further research will need to consider alternative techniques and might consider an empirical comparison among them. Problems of overlapping and compensating for unequal distributions in space and in attribute values will also need to be addressed.

In the future we would also like to investigate more deeply the effect of the input parameters and other influencing factors on the visualization quality, and to define an optimization method to automatically set them to initial values. We believe that good initial values are very useful for user experience and for the generation of meaningful results. Nevertheless, users must be able to alter these values according to their domain knowledge, the tasks to be conducted, and individual preferences.

7. REFERENCES

[1] G. Andrienko, N. Andrienko, P. Bak, D. Keim, S. Kisilevich, and S. Wrobel. A conceptual framework and taxonomy of techniques for analyzing movement. Journal of Visual Languages and Computing, 22(3):213–232, 2011.

[2] G. Andrienko, N. Andrienko, C. Hurter, S. Rinzivillo, and S. Wrobel. From movement tracks through events to places: Extracting and characterizing significant places from mobility data. In Visual Analytics Science and Technology (VAST), 2011 IEEE Conference on, pages 161–170. IEEE, 2011.

[3] N. Andrienko, G. Andrienko, and P. Gatalsky. Supporting visual exploration of object movement. In Proceedings of the working conference on Advanced visual interfaces, pages 217–220. ACM, 2000.

[4] N. Andrienko, G. Andrienko, P. Gatalsky, et al. Impact of data and task characteristics on design of spatio-temporal data visualization tools. Exploring Geovisualization, pages 201–222, 2005.

[5] M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander. OPTICS: Ordering points to identify the clustering structure. In ACM SIGMOD international conference on Management of data, pages 49–60. ACM Press, 1999.

[6] P. Bak, M. Marder, S. Harary, A. Yaeli, and H. Ship. Scalable detection of spatiotemporal encounters in historical movement data. Computer Graphics Forum, 31(3):915–924, 2012.

[7] J. L. Bentley. A survey of techniques for fixed-radius near neighbor searching. In Technical Report SLAC-186 and STAN-CS-75-513, Stanford Linear Accelerator Center, 1975.

[8] M. Birn, M. Holtgrewe, P. Sanders, and J. Singler. Simple and Fast Nearest Neighbor Search. In Workshop on Algorithm Engineering & Experiments (ALENEX), pages 43–54, 2010.

[9] M. Bostock and J. Heer. Protovis: A graphical toolkit for visualization. Visualization and Computer Graphics, IEEE Transactions on, 15(6):1121–1128, 2009.

[10] L. Brasseur. Florence Nightingale’s visual rhetoric in the rose diagrams. Technical Communication Quarterly, 14(2):161–182, 2005.

[11] T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education, 2nd edition, 2001.

[12] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational geometry: algorithms and applications. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1997.

[13] S. Diehl, F. Beck, and M. Burch. Uncovering strengths and weaknesses of radial visualizations: an empirical approach. Visualization and Computer Graphics, IEEE Transactions on, 16(6):935–942, 2010.

[14] S. Dodge, R. Weibel, and A. Lautenschütz. Towards a taxonomy of movement patterns. Information Visualization, 7(3-4):240, 2008.

[15] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley, 2nd edition, 2001.

[16] J. Dykes and D. Mountain. Seeking structure in records of spatio-temporal behaviour: visualization issues, efforts and applications. Computational Statistics & Data Analysis, 43(4):581–603, 2003.

[17] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pages 226–231. AAAI Press, 1996.

[18] A. Gelman and A. Unwin. Infovis and statistical graphics: Different goals, different looks. Website / Blog, 2011. http://www.stat.columbia.edu/~gelman/bayescomputation/GelmanUnwin2011.pdf.

[19] M. Haklay and P. Weber. OpenStreetMap: User-generated street maps. Pervasive Computing, IEEE, 7(4):12–18, 2008.

[20] H. Hayashi, D. Ito, M. Tanizaki, K. Kimura, and H. Kajiyama. Dual-heap kNN: k-nearest neighbor search for spatial data retrieval in embedded DBMS. In Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems, GIS ’08, pages 40:1–40:10. ACM, 2008.

[21] HSL. Helsinki region transport - live vehicle API documentation. Website, 2011. http://developer.reittiopas.fi/pages/en/other-apis.php.

[22] T. Kapler and W. Wright. GeoTime information visualization. Information Visualization, 4(2):136–146, 2005.

[23] M. Kraak. The space-time cube revisited from a geovisualization perspective. In Proc. 21st International Cartographic Conference, pages 1988–1996, 2003.

[24] P. Laube, S. Imfeld, and R. Weibel. Discovering relative motion patterns in groups of moving point objects. International Journal of Geographical Information Science, 19(6):639–668, 2005.

[25] D. Mountain. Visualizing, querying and summarizing individual spatio-temporal behaviour. Exploring Geovisualization (Eds: Dykes, JA, Kraak, MJ, and MacEachren, AM), Elsevier, London, pages 181–200, 2005.

[26] A. Palma, V. Bogorny, B. Kuijpers, and L. Alvares. A clustering-based approach for discovering interesting places in trajectories. In Proceedings of the 2008 ACM symposium on Applied computing, pages 863–868. ACM, 2008.

[27] J. Pan and D. Manocha. Fast GPU-based locality sensitive hashing for k-nearest neighbor computation. In Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’11, pages 211–220. ACM, 2011.

[28] H. Samet. K-nearest neighbor finding using MaxNearestDist. IEEE Trans. Pattern Anal. Mach. Intell., 30(2):243–252, 2008.

[29] J. Sankaranarayanan, H. Samet, and A. Varshney. A fast all nearest neighbor algorithm for applications involving large point-clouds. Computers & Graphics, 31(2):157–174, 2007.

[30] R. Scheepens, N. Willems, H. van de Wetering, and J. van Wijk. Interactive visualization of multivariate trajectory data with density maps. In Pacific Visualization Symposium (PacificVis), 2011 IEEE, pages 147–154. IEEE, 2011.

[31] S. Shang, B. Yuan, K. Deng, K. Xie, and X. Zhou. Finding the most accessible locations: reverse path nearest neighbor query in road networks. In Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’11, pages 181–190. ACM, 2011.

[32] J. Shao, L. Kulik, and E. Tanin. Easiest-to-reach neighbor search. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’10, pages 360–369. ACM, 2010.

[33] E. Tufte. The visual display of quantitative information, volume 7. Graphics Press, Cheshire, CT, 1983.

[34] Z. Yan and S. Spaccapietra. Towards semantic trajectory data analysis: a conceptual and computational approach. In Proc. VLDB 2009 PhD Workshop, 2009.