Page 1
Georgia Tech
Learning Significant Locations and Predicting
User Movement with GPS
Daniel Ashbrook and Thad Starner
Contextual Computing Group
http://www.cc.gatech.edu/ccg
College of Computing, GVU Center
Georgia Institute of Technology
Atlanta, GA USA
Page 2
Daniel Ashbrook and Thad Starner, Georgia Tech
Motivation
• Location is a very common form of context
  – easy to collect
  – infer other pieces of context
• Most applications rely only on user’s current location
Page 3
Motivation
• How can we improve location context?
• Look for patterns of movement and learn user’s daily schedule
  – predict where user is going based on where user has been
• Goal: computer can act as agent
  – offer suggestions at appropriate times
  – enable collaboration between colleagues
Page 4
Applications
• Potential applications for location prediction
• Single-user applications
  – system only knows about one user’s movements
• Multi-user applications
  – system combines predictions for several people
Page 5
Applications
• Single user: Pre-emptive Reminders
  – remind user at an appropriate time
  – example: library book
    • try to determine if user will pass library today
    • only then remind user to take book before leaving home
Page 6
Applications
• Single user: Wireless caching
  – wireless networks often unavailable
    • lack of infrastructure
    • radio shadows (buildings, subway)
  – hide lack of connectivity by caching
  – predict when caching will be insufficient
    • warn user
    • suggest alternative routes
Page 7
Applications
• Single user: Wireless caching
  – cache even when network is available
    • transmission power can increase with 4th power of distance in complex environments (e.g., a city)
    • cost can vary with network used, time of day
  – prediction can allow savings
    • of battery power
    • of money
Page 8
Applications
• Multi-user: Enabling collaboration
  – “Will I see Bob today?”
    • compare the user’s and Bob’s schedules
    • give yes or no answer
  – Scheduling many-person meetings
    • find when most people are free and suggest a time
    • also discover most convenient place to meet
Page 9
Applications
• Multi-user: Favor exchange
  – remotely coordinate favor trading
  – example: FedEx/UPS package trading
Page 10
Related Work
• Bhattacharya — cell phone prediction
• Davis — prediction with ad–hoc networks
• Kortuem — Walid
• Marmasse — comMotion
• Liu — predictively caching network architecture
• Orwant — Doppelgänger
• Sparacino — Museum Wearable
• Wolf — travel diaries
Page 11
Hardware
• Garmin GPS model 35-LVS
• GeoStats data logger
  – 1 MPH recording limit
Page 12
Hardware
• Preliminary data collected in Atlanta Sep-Dec 2001
• Data currently being collected from multiple users in Zürich, Switzerland
Figure: preliminary data, Atlanta, GA
Page 13
Software
• Preliminary implementation
  – finds points of possible significance
  – creates probabilistic model of user’s movements
    • Markov model
  – using model, simple queries are possible:
    • “The user is at home. Where will she go next?”
    • “How likely is the user to visit the grocery store today?”
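The two queries above can be sketched against a first-order transition table. This is only an illustration: the `model` dictionary, its probabilities, and the reading of "today" as "within the next few trips" are assumptions, not values or definitions from the talk.

```python
def most_likely_next(model, current):
    """Return the destination with the highest transition probability."""
    return max(model[current], key=model[current].get)


def visit_probability(model, start, target, steps):
    """Probability of reaching `target` within `steps` trips from `start`."""
    pending = {start: 1.0}  # probability mass that has not reached target yet
    reached = 0.0
    for _ in range(steps):
        following = {}
        for location, p in pending.items():
            for destination, q in model.get(location, {}).items():
                if destination == target:
                    reached += p * q
                else:
                    following[destination] = following.get(destination, 0.0) + p * q
        pending = following
    return reached


# Hypothetical transition probabilities, for illustration only.
model = {
    "home": {"work": 0.8, "grocery": 0.2},
    "work": {"home": 0.5, "grocery": 0.5},
    "grocery": {"home": 1.0},
}

next_stop = most_likely_next(model, "home")
grocery_chance = visit_probability(model, "home", "grocery", steps=2)
```

Treating "today" as a fixed number of trips is a simplification; a model with time of day would be needed for a proper answer.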
Page 17
Software
• Markov model
  – collection of nodes
  – transitions between nodes
  – each transition has a probability of occurring
  – can also have self-transitions
Page 18
Software
• Our Markov model
  – nodes are significant locations
  – transitions are trips between those locations
Page 19
Software
• Significance
  – how do we determine if a particular GPS coordinate might have some meaning to the user?
Page 20
Software
• Places
  – logged GPS coordinates with more than time t of “resting time”
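A minimal sketch of place extraction, assuming that "resting time" shows up as a gap between consecutive logged fixes (plausible given the logger's 1 MPH recording limit, but an interpretation rather than something the slide states). The function name `find_places`, the tuple layout, and the sample log are hypothetical.

```python
from datetime import datetime, timedelta


def find_places(fixes, t=timedelta(minutes=10)):
    """Return the fixes that are followed by a logging gap longer than t.

    fixes: list of (timestamp, lat, lon) tuples sorted by timestamp.
    Assumption: nothing is logged while the user rests, so a long gap
    after a fix marks a possible place.
    """
    places = []
    for current, following in zip(fixes, fixes[1:]):
        if following[0] - current[0] > t:
            places.append(current)
    return places


# Hypothetical log: a short drive, a 45-minute stop, then another drive.
log = [
    (datetime(2001, 9, 1, 8, 0), 33.7756, -84.3963),
    (datetime(2001, 9, 1, 8, 1), 33.7760, -84.3950),
    (datetime(2001, 9, 1, 8, 46), 33.7761, -84.3949),
    (datetime(2001, 9, 1, 8, 47), 33.7790, -84.3900),
]
places = find_places(log)
```

Only the fix logged just before the 45-minute gap is kept as a place.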
Page 24
Software
• How to pick t?
  – try lots of values
  – graph number of places found for each value
  – but relationship is nearly linear!
  – so we pick an arbitrary value: t = 10 minutes
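The sweep over t could look roughly like this. It is a sketch under the assumption that a place is a fix followed by a logging gap longer than t; the timestamps and the set of thresholds are made up for illustration.

```python
def count_places(timestamps, t_seconds):
    """Count fixes followed by a logging gap longer than t_seconds."""
    return sum(
        1 for a, b in zip(timestamps, timestamps[1:]) if b - a > t_seconds
    )


# Hypothetical fix times, in seconds since the start of logging.
timestamps = [0, 60, 120, 900, 960, 4560, 4620, 5400, 5460, 20000]

# Number of places found for each candidate threshold (in minutes).
sweep = {
    minutes: count_places(timestamps, minutes * 60)
    for minutes in (1, 5, 10, 30, 60)
}
```

Plotting `sweep` gives the graph described above; the count can only stay level or fall as t grows.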
Page 25
Software
Figure: all data
Page 26
Software
Figures: all data; only places, with t = 10 minutes
Page 28
Software
• Locations
  – problem: too many places
    • GPS inaccuracy
    • different exit points from buildings
  – solution: cluster places to form locations
    • all places within a radius r of a particular place form a single location
Page 30
Software
Figures: all data; only places, with t = 10 minutes; only locations
Page 33
Software
• How to pick radius r?
  – too large value
    • too few clusters
    • unrelated places together
  – too small value
    • too many clusters
• Solution:
  – try various values for r
  – find knee in graph
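The slide does not say how the knee is located, so the sketch below substitutes one common heuristic: pick the point farthest from the straight line joining the first and last points of the curve. The function name `find_knee` and the sample cluster counts are assumptions for illustration.

```python
def find_knee(xs, ys):
    """Return the x whose point lies farthest from the chord between the
    first and last points of the curve (a simple knee heuristic)."""

    def normalize(values):
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0
        return [(v - lo) / span for v in values]

    # Scale both axes to [0, 1] so neither unit dominates the distance.
    nx, ny = normalize(xs), normalize(ys)
    x0, y0, x1, y1 = nx[0], ny[0], nx[-1], ny[-1]
    # Cross-product magnitude, proportional to distance from the chord.
    distance = [
        abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0))
        for x, y in zip(nx, ny)
    ]
    return xs[distance.index(max(distance))]


# Hypothetical sweep: number of clusters found for each radius r.
radii = [1, 2, 3, 4, 5, 6, 7, 8]
clusters = [60, 30, 14, 10, 9, 8, 8, 7]
knee_radius = find_knee(radii, clusters)
```

With these made-up counts the curve flattens after r = 3, which is the radius the heuristic returns.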
Page 38
Software
• Clustering places into locations
  – pick one place (•)
  – find all places within radius r (•)
  – find the mean of those places (x)
  – repeat with x as the new center
  – continue until the mean stops changing
  – start again with another place
  – repeat until no more places
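The steps above can be sketched as follows. This is a minimal version that assumes places are already projected to planar (x, y) coordinates in consistent units; raw latitude/longitude would need a projection or a great-circle distance. The names `cluster_places`, `tol`, and `max_iter` are illustrative, and the iteration cap is an added safety guard, not part of the slide's description.

```python
import math


def cluster_places(places, r, tol=1e-9, max_iter=100):
    """Group (x, y) places into locations by iterating the mean within r."""
    remaining = list(places)
    locations = []  # list of (center, member_places)
    while remaining:
        center = remaining[0]  # pick one place
        for _ in range(max_iter):
            # find all places within radius r of the current center
            members = [p for p in remaining if math.dist(p, center) <= r]
            # find the mean of those places
            mean = (
                sum(p[0] for p in members) / len(members),
                sum(p[1] for p in members) / len(members),
            )
            if math.dist(mean, center) <= tol:  # the mean stopped changing
                break
            center = mean  # repeat with the mean as the new center
        locations.append((mean, members))
        # start again with a place that has not been assigned yet
        remaining = [p for p in remaining if p not in members]
    return locations


# Hypothetical places: three near the origin and two near (10, 10).
sample = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
locations = cluster_places(sample, r=2)
```

Each returned pair holds a location's center and the places merged into it.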
Page 40
Software
• Sublocations
  – problem: subsuming smaller-scale paths
  – solution: create sublocations within larger clusters
Page 42
Software
• How to determine if sublocations exist?
  – use same knee & graph algorithm on each location
  – if no knee exists, not enough points to form sublocations
Page 46
Software
• Sublocations can have multiple scales
  – Country level
  – State level
  – City level
  – Campus level
Page 47
Software
• Prediction
  – each location gets a unique ID
    • user may provide a unique name for each location such as “home” or “work”
  – replace each place in original list with ID
    • result: list of locations that were visited, in the order that they were visited
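One way to sketch the relabeling step is to map each place to the ID of the nearest location center. Nearest-center lookup is an assumption made here for brevity (the talk assigns places to locations through clustering), and the `centers` dictionary and coordinates are hypothetical.

```python
import math


def label_visits(places, centers):
    """Map each place, in visit order, to the ID of the nearest location."""
    return [
        min(centers, key=lambda loc_id: math.dist(place, centers[loc_id]))
        for place in places
    ]


# Hypothetical named locations with planar (x, y) centers.
centers = {"home": (0.0, 0.0), "work": (10.0, 10.0)}
visits = label_visits([(0.2, -0.1), (9.8, 10.1), (0.0, 0.3)], centers)
```

The result is the ordered list of visited location IDs that the prediction step consumes.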
Page 51
Software
• For each location
  – count number of visits to each other location
  – count total number of visits to other locations
  – divide to get probability of transition
  – result: Markov model for each location
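The counting and dividing can be sketched in a few lines. The function name and the short visit sequence are illustrative assumptions; the sequence is not the Atlanta data.

```python
from collections import Counter, defaultdict


def transition_probabilities(visits):
    """Estimate P(next location | current location) from an ordered visit list."""
    counts = defaultdict(Counter)
    for source, destination in zip(visits, visits[1:]):
        counts[source][destination] += 1  # visits from source to destination
    return {
        source: {
            destination: n / sum(outgoing.values())  # divide by total visits
            for destination, n in outgoing.items()
        }
        for source, outgoing in counts.items()
    }


# Hypothetical visit sequence.
visits = ["home", "CRB", "home", "CRB", "grocery", "home", "CRB", "home"]
model = transition_probabilities(visits)
```

Here `model["CRB"]` gives 2/3 for "home" and 1/3 for "grocery", since two of the three trips leaving CRB went home.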
Page 52
Software
• People don’t move randomly!
  – 23 locations total, so random chance of A → ? = 1/22 = 4.5%
  – measured ratio CRB → Home = 16/77 = 21%
Page 53
Software
• Orders of Markov model
  – 1st order: A → ?
    • a given state’s transition probabilities only depend on that state
  – 2nd order: B → A → ?
    • a given state’s transition probabilities depend on that state and the previous state
  – and so on…
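A hedged sketch of how the order changes the model: the history used as the lookup key grows from one location to the last n locations. The function name `markov_model` and the visit sequence are assumptions for illustration.

```python
from collections import Counter, defaultdict


def markov_model(visits, order=1):
    """Estimate P(next | last `order` locations) from an ordered visit list."""
    counts = defaultdict(Counter)
    for i in range(len(visits) - order):
        history = tuple(visits[i:i + order])  # the last `order` locations
        counts[history][visits[i + order]] += 1
    return {
        history: {
            nxt: n / sum(following.values()) for nxt, n in following.items()
        }
        for history, following in counts.items()
    }


# Hypothetical visit sequence.
visits = ["home", "CRB", "home", "CRB", "grocery", "home", "CRB", "home"]
first = markov_model(visits, order=1)
second = markov_model(visits, order=2)
```

`first[("CRB",)]` conditions on the current location only, while `second[("home", "CRB")]` also conditions on where the user was before CRB.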
Page 54
Software
• First order predictions

Movement                   Probability   % Chance
CRB → Home                 16/77         21%
CRB → Hardware store       10/77         13%
CRB → Jake’s Ice Cream     10/77         13%
CRB → Grocery store        8/77          10%
CRB → GA400                5/77          6%
CRB → 10th/14th St.        5/77          6%
CRB → Taco Bell            3/77          4%

Random chance: 1/22 = 4.5%
Page 55
Software
• Second order predictions

Movement                          Probability   % Chance
Home → CRB                        14/20         70%
Home → CRB → Home                 3/14          21%
Home → CRB → Grocery store        3/14          21%
Home → CRB → Jake’s Ice Cream     2/14          14%
Home → CRB → GA400                1/14          7%
Home → CRB → 10th/14th St.        1/14          7%
Home → CRB → Hardware store       0/14          0%

Random chance: 1/22 = 4.5%
Page 56
Software
• How many orders to use?
  – sequence of 141 locations visited
  – 23 total unique locations

Order   Permutations              Approx. expected unique paths   Observed unique paths
1       23 * 22^1 = 506           140                             56
2       23 * 22^2 = 11,132        139                             73
3       23 * 22^3 = 244,904       138                             82
4       23 * 22^4 = 5,387,888     137                             86
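The columns in this comparison can be reproduced with a short sketch, assuming a "path" of order n is a run of n + 1 consecutive locations and that consecutive locations always differ. The function names and the sample sequence are illustrative.

```python
def possible_paths(num_locations, order):
    """Permutations column: first location free, each later one must differ."""
    return num_locations * (num_locations - 1) ** order


def unique_paths(visits, order):
    """Return (paths in the sequence, distinct paths) for the given order."""
    length = order + 1
    windows = [
        tuple(visits[i:i + length]) for i in range(len(visits) - length + 1)
    ]
    return len(windows), len(set(windows))


# Hypothetical visit sequence; the talk's sequence has 141 entries.
visits = ["home", "CRB", "home", "CRB", "grocery", "home", "CRB", "home"]
order_one = unique_paths(visits, 1)
order_two = unique_paths(visits, 2)
```

A sequence of 141 visits contains 141 - n paths of order n, which matches the "expected" column; the gap between that and the observed count shrinks as the order rises, so higher orders leave fewer repeated paths to learn from.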
Page 57
Future Work
• Collect more data
  – Georgia Tech students in Zürich & Atlanta
• Investigate other sensors for smaller scales
  – RF/IR beacons
• Consider privacy policies
• Add time of day to Markov model
  – predict when a user will leave as well as where they’re going
Page 58
Future Work
• Schedule “sharpness”
  – always on time = important?
  – example: work at 8AM vs. grocery store
• Speed of model update vs. accuracy
  – new schedule for college students every term
  – weight new events more heavily?
    • how to avoid unduly weighting one-time trips?
    • use confidence intervals to determine schedule changes
Page 59
Future Work
• Real-time update of models
  – currently, data is post-processed
  – need full wearable computers for real-time
• User interface
  – visualize location model
  – allow user to influence model
• Favor trading implementation
Page 60
Thank You
Questions?