Predicting the Location and Time of Mobile Phone Users by Using Sequential Pattern Mining Techniques Mert Özer, Ilkcan Keles, Ismail Hakki Toroslu, Pinar Karagoz, Hasan Davulcu The Computer Journal 2015

Jan 17, 2018

Transcript
Page 1:

Predicting the Location and Time of Mobile Phone Users by Using Sequential Pattern Mining Techniques

Mert Özer, Ilkcan Keles, Ismail Hakki Toroslu, Pinar Karagoz, Hasan Davulcu

The Computer Journal 2015

Page 2:

CONTENTS
• Introduction
• Data & Problem Definition
• Proposed Methods
• Evaluation & Experimental Results
• Conclusion & Discussion

Page 3:

INTRODUCTION
• Location Prediction
• Sequential Pattern Mining
• Motivation

Page 4:

Motivation
Mobile phone operators are eager to know the location flow of their users in order to
• build better-informed advertisement strategies;
• build better-informed base station installation plans.

City administrators can also use this information to determine mass movement patterns around the city.

Page 5:

PROBLEM DEFINITIONS
Three Sub-Problem Definitions of the Broader Location Prediction Problem
• Next Location and Time Prediction Using Spatio-Temporal Data
• Next Location Change Prediction Using Spatial Data
• Next Location Change and Time Prediction Using Spatio-Temporal Data

Page 6:

Problem Definition
Next Location and Time Prediction Using Spatio-Temporal Data
• predict the location and the time of the user's next action in the next time interval
• divide a day into time intervals
• cluster base stations into regions according to their locations

Page 7:

Training Data Definition (Call Detail Data)
Each record has 11 attributes: base station id#1, phone number#1, city plate#1, base station id#2, phone number#2, city plate#2, call time, cdr type, url, duration, call date.

The real data was obtained from one of the largest mobile phone operators in Turkey.

Page 8:

Training Data
The data covers an area of roughly 25,000 km² with a population of around 5 million.
Almost 70% of the population is concentrated in a large urban area that makes up approximately 1/3 of the region.

The data contains the log records of roughly 1 million users over a period of 1 month.

The whole area contains 13281 base stations.

Page 9:

Method 1 - Next Location and Time Prediction Using Spatio-Temporal Data
• Preprocessing
• Extracting Regions
• Extracting Frequent Patterns
• Prediction

Page 10:

Method 1 - Preprocessing
Unnecessary attributes are filtered out.
The daily call data records of each user are merged into one row in temporal order.
Daily sequences of <base station id, time of the day> pairs are created.
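The preprocessing steps above could be sketched as follows. This is a minimal illustration, not the authors' code: the four-field record layout, the function name `build_daily_sequences`, and the sample phone numbers and dates are all made up for the example.

```python
from collections import defaultdict

def build_daily_sequences(records):
    """Group call records per (user, date) and order them by time.

    `records` is an iterable of (phone_number, call_date, call_time,
    base_station_id) tuples. This 4-field layout is an assumption that
    stands in for the filtered 11-attribute call detail rows.
    """
    grouped = defaultdict(list)
    for phone, date, time, station in records:
        grouped[(phone, date)].append((time, station))

    sequences = {}
    for key, events in grouped.items():
        events.sort()  # "HH:MM" strings sort in temporal order
        sequences[key] = [(station, time) for time, station in events]
    return sequences

# Made-up sample records (not from the operator data set).
records = [
    ("555-0001", "2012-01-03", "12:30", 95),
    ("555-0001", "2012-01-03", "10:15", 91),
    ("555-0001", "2012-01-04", "09:00", 91),
    ("555-0002", "2012-01-03", "16:30", 45),
]
daily = build_daily_sequences(records)
```

Each key of `daily` identifies one user on one day, and each value is that day's sequence of <base station id, time of the day> pairs.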

Page 11:

Method 1 - Preprocessing

Page 12:

Method 1 – Extracting Regions
With such a high number of base stations, it is not practical to treat each one as a center of movement and predict accordingly.

The paper clusters the 13281 base stations into 100 regions using the K-Means algorithm.

Base station ids in the preprocessed daily sequences are then replaced with the corresponding region ids.
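The region extraction step could look roughly like the sketch below. It uses a plain Lloyd-style k-means written from scratch, and it assumes each base station comes with 2-D coordinates. The coordinates, the `kmeans` helper, and the tiny four-station example are hypothetical, and the slides do not say which K-Means implementation or distance measure was used.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Hypothetical base stations: id -> (x, y) coordinates.
stations = {91: (0.0, 0.0), 95: (0.1, 0.2), 45: (5.0, 5.0), 52: (5.2, 4.9)}
ids = list(stations)
_, labels = kmeans([stations[i] for i in ids], k=2)
region_of = dict(zip(ids, labels))

# Replace base station ids with region ids in a daily sequence.
sequence = [(91, "10:15"), (95, "12:30"), (45, "16:30")]
region_sequence = [(region_of[s], t) for s, t in sequence]
```

In the setting described on the slide, the same idea would be applied with k = 100 to the 13281 base stations.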

Page 13:

Method 1 – Extracting Regions

Extracted Regions

Page 14:

Method 1 – Extracting Frequent Patterns
The extraction step works with four parameters:
• preprocessed training data
• pattern length (the length of the desired frequent patterns)
• minimum support (the minimum ratio of occurrences a pattern needs in order to be identified as frequent)
• time interval length (used to discretize the time of the day; defines the length of each interval)
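To illustrate the time interval length parameter, the sketch below discretizes a time of day into fixed-length intervals. It assumes times are encoded as HHMM integers (for example 1015 for 10:15); the helper names `to_interval` and `interval_start` are hypothetical.

```python
def to_interval(hhmm, interval_minutes=15):
    """Map a time of day given as an HHMM integer to its interval index."""
    minutes = (hhmm // 100) * 60 + hhmm % 100
    return minutes // interval_minutes

def interval_start(index, interval_minutes=15):
    """Return the HHMM start time of an interval index."""
    minutes = index * interval_minutes
    return (minutes // 60) * 100 + minutes % 60

# 10:22 falls into the 15-minute interval that starts at 10:15.
index = to_interval(1022)
start = interval_start(index)
```

With a 15-minute interval length, a day is split into 96 intervals, indexed 0 to 95.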

Page 15:

Method 1 – Extracting Frequent Patterns
The method is very similar to the AprioriAll algorithm.
Frequent pattern generation:
• The data is traversed to extract all candidate patterns of the desired length.
• Candidates that fall below the minimum support threshold are eliminated.
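The two generation steps above can be sketched with a direct counting pass. This is a simplified stand-in rather than AprioriAll itself, which builds candidates level by level. Two points are assumptions, since the slides do not spell them out: patterns are treated as contiguous subsequences, and support is taken as the fraction of daily sequences that contain the pattern. The sample sequences are made up.

```python
from collections import Counter

def frequent_patterns(sequences, pattern_length, min_support):
    """Return {pattern: support} for contiguous patterns of a fixed length.

    Support is taken here as the fraction of sequences that contain the
    pattern at least once (an assumption; the slides only say "ratio").
    """
    counts = Counter()
    for seq in sequences:
        windows = {
            tuple(seq[i:i + pattern_length])
            for i in range(len(seq) - pattern_length + 1)
        }
        counts.update(windows)  # each sequence counts a pattern at most once
    total = len(sequences)
    return {p: c / total for p, c in counts.items() if c / total >= min_support}

# Made-up daily sequences of (region id, time interval index) pairs.
sequences = [
    [(3, 41), (7, 50), (2, 66)],
    [(3, 41), (7, 50), (9, 70)],
    [(3, 41), (7, 50), (2, 66), (5, 68)],
    [(1, 10), (4, 20)],
]
patterns = frequent_patterns(sequences, pattern_length=3, min_support=0.5)
```

Only the pattern that occurs in two of the four sequences survives the 0.5 threshold; sequences shorter than the pattern length contribute no candidates.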

Page 16:

Method 1 – Sample Frequent Patterns
Three sample frequent patterns of length 4 are presented below.

Page 17:

Method 1 - Prediction
The test sequence has length (k-1), and the goal is to predict its kth element.

This (k-1)-length sequence is searched for in the frequent pattern set.

If patterns starting with the test sequence are found, the last element of the matching pattern with the maximum support is returned as the prediction.
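The prediction rule for a (k-1)-length test sequence could be sketched as below, assuming the frequent patterns are stored as a mapping from pattern to support. The function name `predict_next` and the sample patterns and supports are hypothetical.

```python
def predict_next(test_sequence, patterns):
    """Return the last element of the best-supported pattern whose first
    k-1 elements equal `test_sequence`, or None if nothing matches."""
    prefix = tuple(test_sequence)
    matches = [
        (support, pattern)
        for pattern, support in patterns.items()
        if len(pattern) == len(prefix) + 1 and pattern[:-1] == prefix
    ]
    if not matches:
        return None
    _, best = max(matches, key=lambda m: m[0])
    return best[-1]

# Made-up frequent patterns of length 3 with their supports.
patterns = {
    ((3, 41), (7, 50), (2, 66)): 0.50,
    ((3, 41), (7, 50), (9, 70)): 0.25,
    ((1, 10), (4, 20), (6, 30)): 0.40,
}
hit = predict_next([(3, 41), (7, 50)], patterns)
miss = predict_next([(8, 1), (8, 2)], patterns)
```

Returning `None` when no pattern matches keeps "no prediction" distinct from a wrong prediction.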

Page 18:

Method 1 - Prediction

Page 19:

Method 1 – Prediction – Time Tolerance
It is difficult to find exact matches between the current user navigation sequence and the existing frequent sequences.

Base station id and time interval pairs can therefore be shifted forward or backward in time by up to a tolerance value.

Test instance: <(91,1015),(95,1230),(45,1630)>
Frequent pattern set: {...,<(91,1000),(95,1245),(45,1630),(52,1700)>,...}
Time tolerance value: 15 minutes
Prediction: (52,1700)
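The example above can be reproduced with a small matching routine. This sketch assumes the second element of each pair is an HHMM time and that a difference exactly equal to the tolerance still counts as a match. The function names and the support value 0.01 are hypothetical.

```python
def to_minutes(hhmm):
    """Convert an HHMM integer such as 1015 to minutes since midnight."""
    return (hhmm // 100) * 60 + hhmm % 100

def matches_with_tolerance(test_sequence, pattern, tolerance_minutes):
    """True if the pattern's first k-1 pairs match the test sequence:
    same location, and time within the tolerance (inclusive)."""
    if len(pattern) != len(test_sequence) + 1:
        return False
    return all(
        loc_t == loc_p and abs(to_minutes(t) - to_minutes(p)) <= tolerance_minutes
        for (loc_t, t), (loc_p, p) in zip(test_sequence, pattern)
    )

def predict_with_tolerance(test_sequence, patterns, tolerance_minutes):
    """Last element of the best-supported tolerant match, or None."""
    candidates = [
        (support, pattern)
        for pattern, support in patterns.items()
        if matches_with_tolerance(test_sequence, pattern, tolerance_minutes)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])[1][-1]

test_instance = [(91, 1015), (95, 1230), (45, 1630)]
# The support value 0.01 is made up for illustration.
patterns = {((91, 1000), (95, 1245), (45, 1630), (52, 1700)): 0.01}
prediction = predict_with_tolerance(test_instance, patterns, 15)
```

With a 15-minute tolerance the result is (52, 1700), as in the example; with a 10-minute tolerance the same instance would find no match.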

Page 20:

EVALUATION & RESULTS
The paper validates the results with real data obtained from one of the largest mobile phone operators in Turkey.

The results are encouraging: the methods achieve high accuracy in predicting the next location change and time of users.

Page 21:

Evaluation Metrics
The paper introduces 2 metrics to evaluate the methods:
• g-accuracy: g-accuracy = [formula not preserved in the transcript]
• p-accuracy: p-accuracy = [formula not preserved in the transcript]

Two different accuracy calculations are used because there may be no matching frequent pattern for the queried instance.
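The two formulas did not survive in the transcript, so the sketch below rests on an assumed reading that fits the explanation above: g-accuracy divides the number of correct predictions by the number of all test instances, while p-accuracy divides it by the number of instances for which a prediction was actually made. Both definitions, the function name `accuracies`, and the sample data are assumptions, not taken from the slides.

```python
def accuracies(predictions, actuals):
    """Return (g_accuracy, p_accuracy) under the assumed definitions.

    predictions[i] is None when no frequent pattern matched instance i.
    """
    correct = sum(
        1 for p, a in zip(predictions, actuals) if p is not None and p == a
    )
    made = sum(1 for p in predictions if p is not None)
    g_accuracy = correct / len(actuals) if actuals else 0.0  # over all instances
    p_accuracy = correct / made if made else 0.0  # over predictions made
    return g_accuracy, p_accuracy

# Made-up predictions: two instances found no matching pattern.
predictions = [(52, 1700), None, (7, 900), None]
actuals = [(52, 1700), (3, 800), (8, 900), (1, 1200)]
g, p = accuracies(predictions, actuals)
```

Under this reading, instances without a matching pattern lower g-accuracy but leave p-accuracy unaffected.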

Page 22:

Results of Method 1
The paper analyzes the effect of the frequent pattern length and the support threshold using the following parameter values:
• Pattern Length: 6
• Minimum Support: 1.00E-6
• Cluster Count: 100
• Time Interval Length: 15 min
• Time Tolerance: 75 min

Page 23:

Results – Pattern Length

Page 24:

When the pattern length increases, the prediction g-accuracy decreases.

This is because there are far fewer long frequent patterns than short ones, so fewer test sequences find a matching pattern.

Page 25:

Results – Minimum Support

Page 26:

Results – Minimum Support

When the minimum support threshold increases, the prediction g-accuracy drops.

The reason is that, as the minimum support threshold increases, the number of generated frequent patterns decreases.

Page 27:

CONCLUSION & DISCUSSION
This work shows that determining the potential change of location of mobile phone users through sequential pattern mining techniques is possible with quite high accuracy.

The paper examines the effect of several factors, such as pattern length tolerance and multi-prediction limit, and further improves the prediction performance.

Page 28:

Thank you!