Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa) www-kdd.isti.cnr.it Anna Monreale Fabio Pinelli Roberto Trasarti Fosca Giannotti A. Monreale, F. Pinelli, R. Trasarti, F. Giannotti. WhereNext: a Location Predictor on Trajectory Pattern Mining. KDD 2009
Transcript
  • 1. Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti. A. Monreale, F. Pinelli, R. Trasarti, F. Giannotti. WhereNext: a Location Predictor on Trajectory Pattern Mining. KDD 2009. Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa), www-kdd.isti.cnr.it

2.

  • Wireless network infrastructures are the nerves of our territory
  • Besides offering their services, they gather highly informative traces about human mobile activities
  • Miniaturization, wearability, pervasiveness will produce traces of increasing
    • positioning accuracy
    • semantic richness

3.

  • From the analysis of the traces of our mobile phones it is possible to reconstruct our mobile behaviour, the way we collectively move
  • This knowledge may help us improve decision-making in many mobility-related issues:
    • Planning traffic and public mobility systems in metropolitan areas;
    • Planning physical communication networks
    • Forecasting traffic-related phenomena
    • Organizing logistics systems
    • Prediction

4. 5.

  • Predicting the next location of a trajectory can improve a large set of services such as:
    • Navigational services
    • Traffic management
    • Location-based advertising
    • Services pre-fetching
    • Simulation

[Figure: three candidate next locations marked "?", labelled .4, .8, .35]

6.

  • How to realize this idea:
    • Extract patterns from all the available movements in a certain area, instead of from the individual history of an object
    • Use these local movement patterns as predictive rules
    • Build a prediction tree as a global model

[Figure: Trajectory dataset, Local patterns, Prediction Tree]

7.

[Diagram: workflow steps: select the set of interesting trajectories; extract T-Patterns (a set of local models); merge T-Patterns (global model); use the condensed model as predictor. Additional labels: Validation, Evaluation]

8.

  • The local pattern we use is the T-Pattern. It describes the common behavior of a group of users in space and time.

F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory pattern mining. KDD 2007: 330-339.

9.

  • Generating all rules from each T-pattern and using them to build a classifier is too expensive.

[Figure: T-Patterns 1, 2, 3 and rules R1, R2, R3, R4]

10.

  • To avoid generating the rules, the T-Pattern set is organized as a prefix tree.
  • For each node v:
    • Id identifies the node v
    • Region is a spatial component of the T-Pattern
    • Support is the support of the T-Pattern
  • For each edge j:
    • [a,b] corresponds to the time interval of the T-Pattern
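The node and edge fields listed above can be sketched as a small prefix tree. This is only an illustration under stated assumptions, not the authors' implementation: the names `Node`, `TPatternTree` and `insert` are hypothetical, two T-Patterns are assumed to share a prefix only when regions and intervals are exactly equal, and keeping the largest support seen on a shared node is a simplification chosen here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # [a, b] transition-time interval carried by an edge


@dataclass
class Node:
    node_id: int                    # Id: identifies the node
    region: Optional[str]           # Region: spatial component (None for the root)
    support: float = 0.0            # Support of the T-Pattern ending/passing here
    # Children are keyed by (region, interval), so the edge label [a, b]
    # lives in the key. Exact-equality merging is an assumption of this sketch.
    children: Dict[Tuple[str, Optional[Interval]], "Node"] = field(default_factory=dict)


class TPatternTree:
    """Hypothetical prefix tree over T-Patterns (illustrative only)."""

    def __init__(self) -> None:
        self._next_id = 0
        self.root = self._new_node(None)

    def _new_node(self, region: Optional[str]) -> Node:
        node = Node(self._next_id, region)
        self._next_id += 1
        return node

    def insert(self, regions: List[str], intervals: List[Interval], support: float) -> None:
        # A T-Pattern with k regions has k-1 transition intervals.
        if len(intervals) != len(regions) - 1:
            raise ValueError("need exactly len(regions) - 1 intervals")
        node = self.root
        for i, region in enumerate(regions):
            interval = None if i == 0 else intervals[i - 1]
            key = (region, interval)
            if key not in node.children:
                node.children[key] = self._new_node(region)
            node = node.children[key]
            # Assumed policy: a shared node keeps the largest support seen.
            node.support = max(node.support, support)
```

Inserting the patterns A→B→C and A→B→D with the same first interval yields a single A→B prefix with two children, which is the sharing that makes explicit rule generation unnecessary.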

11.

  • Three steps:
    • Search for best match
    • Candidate generation
    • Make predictions

How to compute the Best Match?

[Figure: Best Match, Prediction]

12.

  • The spatio-temporal distance is computed between the segment of trajectory (bounded in time using the previous transition time) and the current node of the path.
    • Case a: the trajectory segment intersects the region of the node
    • Case b: the enlarged trajectory segment intersects the region
    • Case c: the enlarged trajectory segment doesn't intersect the region
  • Here th_t is the time tolerance window defined by the user.

13.

  • The path score is the aggregation of all punctual scores along a path.
  • The Best Match is the path having:
    • the maximum path score;
    • at least one admissible prediction.
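The two conditions for the Best Match can be written down directly. The sketch below is an illustration only: `CandidatePath`, its fields and `best_match` are hypothetical names, and how ties between equal path scores are broken is not stated in the slides (here the first candidate wins).

```python
from typing import List, NamedTuple, Optional


class CandidatePath(NamedTuple):
    path_id: str                  # hypothetical label for a path in the tree
    path_score: float             # aggregated punctual scores along the path
    admissible_predictions: int   # how many admissible predictions the path offers


def best_match(candidates: List[CandidatePath]) -> Optional[CandidatePath]:
    """Return the path with maximum score among those with >= 1 admissible prediction."""
    eligible = [c for c in candidates if c.admissible_predictions >= 1]
    if not eligible:
        return None  # no path can produce a prediction
    # max() returns the first maximal element, so ties go to the earliest candidate.
    return max(eligible, key=lambda c: c.path_score)
```

Note that a path with the highest score is skipped when it has no admissible prediction, which is why the filter comes before the maximum.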

[Figure: example path with punctual scores 1, .58 and .8 and path score .79; time labels 10 min, 15 min, 8 min, 10 min, 11 min, 16 min]

14.

  • Average generalizes distances between the trajectory and each node
  • Sum is based on the concept of depth
  • Max is the optimistic one: the best punctual score is selected as path score
  • Context-dependent aggregations can take into consideration other aspects of the problem.
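A minimal sketch of the three built-in aggregations follows; the function and dictionary names are hypothetical. With the punctual scores from the earlier example (1, .58 and .8), the average is about .79, matching the path score shown there, which suggests that example uses Average.

```python
from typing import Callable, Dict, Sequence


def _average(scores: Sequence[float]) -> float:
    return sum(scores) / len(scores)


# Hypothetical registry mapping an aggregation name to its function.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": _average,   # generalizes the distances over all nodes of the path
    "sum": sum,        # grows with the depth of the path
    "max": max,        # optimistic: best punctual score wins
}


def path_score(punctual_scores: Sequence[float], agg: str = "avg") -> float:
    """Aggregate the punctual scores along a path into one path score."""
    if not punctual_scores:
        raise ValueError("a path needs at least one punctual score")
    if agg not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation: {agg!r}")
    return AGGREGATIONS[agg](punctual_scores)
```

Because Sum rewards longer paths and Max ignores weak nodes, the same tree can rank paths differently depending on the aggregation chosen.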

15.

  • The WhereNext algorithm can be tuned using its parameters:
    • th_t: time window tolerance
    • th_s: space window tolerance
    • th_score: minimum prediction score threshold
    • th_agg: the aggregation function used to compute the path score (Avg, Sum or Max)
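The four parameters can be grouped in one configuration object. This is a sketch with hypothetical names (`WhereNextParams`, `accept_prediction`); the units of th_t and th_s are not given in the slides, and treating th_score as an inclusive lower bound is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhereNextParams:
    th_t: float          # time window tolerance (unit not stated in the slides)
    th_s: float          # space window tolerance (unit not stated in the slides)
    th_score: float      # minimum prediction score threshold
    th_agg: str = "avg"  # aggregation for the path score: "avg", "sum" or "max"

    def __post_init__(self) -> None:
        if self.th_t < 0 or self.th_s < 0:
            raise ValueError("tolerances must be non-negative")
        if self.th_agg not in ("avg", "sum", "max"):
            raise ValueError(f"unknown aggregation: {self.th_agg!r}")


def accept_prediction(score: float, params: WhereNextParams) -> bool:
    """Keep a prediction only if its score reaches th_score (inclusive, by assumption)."""
    return score >= params.th_score
```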

16.

  • It is very hard to know which set of T-patterns is best for building our model:
    • a big set of T-patterns: very slow prediction
    • a small set of T-patterns: coverage leaks
  • For this reason we have defined a way to measure the prediction power of a T-Pattern set.

17.

  • An evaluating function is defined to estimate the predicting power of a T-Pattern set.
    • SpatialCoverage: the space coverage of the regions contained in the T-Pattern set
    • DatasetCoverage: measures how much the T-Pattern set represents the trajectories
    • RegionSeparation: the precision of the regions in the T-Pattern set

[Figure: Model 1, Model 2. Testing the a priori evaluation]

18.

You are here

19.

  • The results are evaluated using the following measures:
  • Accuracy: the number of correctly predicted locations (space and time) divided by the total number of trajectories to be predicted.
  • Average Error: the average distance between the real trajectories in the predicted interval and the region predicted.
  • Prediction rate: the number of trajectories which have a prediction divided by the total number of trajectories to be predicted.
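The three measures above reduce to simple counts and a mean. The sketch below uses hypothetical names (`Outcome`, `evaluate`) and assumes the distance between the real trajectory and the predicted region has already been computed per trajectory; averaging the error only over trajectories that received a prediction is also an assumption.

```python
from typing import Dict, List, NamedTuple, Optional


class Outcome(NamedTuple):
    predicted: bool          # did the predictor return a location at all?
    correct: bool            # was the predicted location right in space and time?
    error: Optional[float]   # precomputed distance to the predicted region, if any


def evaluate(outcomes: List[Outcome]) -> Dict[str, Optional[float]]:
    """Compute accuracy, prediction rate and average error over a test set."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no trajectories to evaluate")
    with_prediction = [o for o in outcomes if o.predicted]
    # Both ratios use the total number of trajectories to be predicted.
    accuracy = sum(1 for o in with_prediction if o.correct) / total
    prediction_rate = len(with_prediction) / total
    errors = [o.error for o in with_prediction if o.error is not None]
    average_error = sum(errors) / len(errors) if errors else None
    return {
        "accuracy": accuracy,
        "prediction_rate": prediction_rate,
        "average_error": average_error,
    }
```

Since both ratios share the same denominator, accuracy can never exceed the prediction rate.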

[Figure: two panels labelled Predicted Location, Cut, Original; the second panel also marks the Error]

20.

  • We used a real-life GPS dataset obtained from 17,000 vehicles in the urban area of Milan.
    • Training set: 4000 trajectories between 7 am and 10 am on Wednesday
    • Test set: 500 trajectories between 7 am and 10 am on Thursday

21.

  • Predicted vs th_score

Average Error vs th_space

22.

  • Accuracy vs Average Error

Single Users Accuracy and Prediction rate

23.

  • A visual example of the application on Milan mobility data. The context is traffic management and we want to predict how the traffic will move in the city center.
  • We have built a predictor on a good set of T-patterns which include the city gates of Milan.

Part of the GeoPKDD integrated platform. F. Giannotti, D. Pedreschi, et al. GeoPKDD: Geographic privacy-aware knowledge discovery and delivery (European project), 2008.

24.

  • A new technique to predict the next locations of a trajectory, based on the previous movements of all the objects, without considering any information about the users.
  • The time information is used not only to order the events but is intrinsically embedded in the T-Patterns used to build the Prediction tree.
  • The user can tune the method to obtain a good accuracy and prediction rate.
  • We are experimenting with the method in real-world applications.

25. 26.

[Figure: Trajectories Dataset, Regions of Interest, T-PATTERNS]

27. 28.

  • The exact same spatial location (x,y) almost never occurs twice
  • The exact same transition times almost never occur twice
  • Solution: allow approximation
    • a notion of spatial neighborhood
    • a notion of temporal tolerance

29.

  • Two points match if one falls within a spatial neighborhood N() of the other
  • Two transition times match if their temporal difference is within the temporal tolerance
  • Example:
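As a hedged illustration of the two matching rules (this is not the figure from the original slide), the sketch below assumes the neighborhood N() is a circle of radius `eps` and calls the temporal tolerance `tau`; both names and the circular shape are assumptions, since the slides leave N() generic.

```python
import math
from typing import Tuple

Point2D = Tuple[float, float]


def points_match(p: Point2D, q: Point2D, eps: float) -> bool:
    """True if q falls inside the (assumed circular) neighborhood of p with radius eps."""
    return math.dist(p, q) <= eps


def transitions_match(t1: float, t2: float, tau: float) -> bool:
    """True if the two transition times differ by at most the tolerance tau."""
    return abs(t1 - t2) <= tau
```

For instance, points (0, 0) and (3, 4) are 5 units apart, so they match with `eps = 5` but not with `eps = 4.9`.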

32.

  • T-pattern mining can be mapped to a density estimation problem over R^(3n-1)
    • 2 dimensions for each (x,y) in the pattern (2n)
    • 1 dimension for each transition (n-1)
  • Density computed by
    • mapping each sub-sequence of n points of each input trajectory to R^(3n-1)
    • drawing an influence area for each point (composition of N() and the temporal tolerance)
  • Too computationally expensive; heuristics are needed
  • Our solution: a combination of sequential pattern mining and density-based clustering
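The dimension count 2n + (n-1) = 3n-1 can be made concrete with a small sketch. The names `embed` and `embed_subsequences` are hypothetical, points are assumed to be (x, y, t) triples, and the ordering of coordinates inside the vector is an arbitrary choice. Enumerating every order-preserving sub-sequence, as done here, grows combinatorially, which illustrates why the slides call the direct approach too expensive.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, t): assumed trajectory point layout


def embed(points: Sequence[Point]) -> List[float]:
    """Map n points to a vector of 2n coordinates followed by n-1 transition times."""
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    coords = [c for (x, y, _t) in points for c in (x, y)]          # 2n values
    transitions = [points[i + 1][2] - points[i][2] for i in range(n - 1)]  # n-1 values
    return coords + transitions


def embed_subsequences(trajectory: Sequence[Point], n: int) -> List[List[float]]:
    """Embed every order-preserving sub-sequence of n points (not only contiguous ones)."""
    return [embed(sub) for sub in combinations(trajectory, n)]
```

A trajectory of 4 points already yields 4 vectors in R^8 for n = 3, one per way of choosing 3 of the 4 points in order.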