A Topic Model for Traffic Speed Data Analysis

Post on 21-May-2015

DESCRIPTION

http://link.springer.com/chapter/10.1007%2F978-3-319-07467-2_8

Transcript

A Topic Model for Traffic Speed Data Analysis

Tomonari MASADA, Nagasaki University

masada@nagasaki-u.ac.jp

Real-Time Traffic Speed Data | NYC Open Data

https://data.cityofnewyork.us/Transportation/Real-Time-Traffic-Speed-Data/xsat-x5sa

Speed measurements at hundreds of sensors

(Regrettably, the data seems no longer maintained.)

Problem

• Traffic speed data show a periodicity with a one-day period.

• However, the data vary widely not only between periods but also within each period.

• How can we analyze it?

Solution

• We take intuition from topic models in text mining.

Topic models for documents

• We can assume that each document contains multiple topics.

• That is, each document is modeled

– not as a single word probability distribution,

– but as a mixture of word probability distributions.

Latent Dirichlet Allocation (LDA)

• LDA [Blei et al. 03]

– topic <-> word probability distribution

– document <-> mixing proportions of topics

• LDA models each document as follows:

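[Figure: document j is a sequence of words (v3, v1, v3, v2, v2). Each topic t1, t2, t3 is a probability distribution (φk1, φk2, φk3, φk4) over the vocabulary v1, v2, v3, v4, and the document draws its words from the topics with mixing proportions θj1, θj2, θj3.]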
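The generative view of LDA described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the slides: the values in `PHI`, the Dirichlet hyperparameters, and the function names are all assumptions chosen to match the 3-topic, 4-word figure.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw mixing proportions from a Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(phi, alpha, n_words, rng):
    """Generate one document: draw theta, then a topic and a word per token."""
    theta = sample_dirichlet(alpha, rng)
    topics = list(range(len(phi)))
    vocab = list(range(len(phi[0])))
    words = []
    for _ in range(n_words):
        k = rng.choices(topics, weights=theta)[0]   # pick a topic from theta
        w = rng.choices(vocab, weights=phi[k])[0]   # pick a word from that topic
        words.append(w)
    return theta, words

# Hypothetical per-topic word distributions: 3 topics (rows) over 4 words.
PHI = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]

rng = random.Random(0)
theta, words = generate_document(PHI, alpha=[0.5, 0.5, 0.5], n_words=5, rng=rng)
print("mixing proportions:", theta)
print("word indices:", words)
```

Each row of `PHI` plays the role of one topic's φ values, and `theta` plays the role of θj1, θj2, θj3 for a single document.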

An important difference

• Words are discrete entities.

– Therefore, LDA uses multinomial distributions for modeling per-topic word distributions.

• Speeds (in mph) are continuous entities.

– We can't use multinomial distributions, so Patchy uses Gamma distributions instead.

Gamma distribution
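For reference, the Gamma density can be written as follows. The transcript does not preserve which parameterization the slide uses, so the shape-rate form with symbols a and b is an assumption here.

```latex
\mathrm{Gamma}(x \mid a, b) = \frac{b^{a}}{\Gamma(a)}\, x^{a-1} e^{-b x},
\qquad x > 0,\; a > 0,\; b > 0,
\qquad \mathbb{E}[x] = \frac{a}{b},\quad \mathrm{Var}[x] = \frac{a}{b^{2}}
```

Its support is the positive reals, which suits speeds measured in mph.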

Comparing LDA with Patchy

• LDA <-> Patchy

– Word <-> Speed observation (in mph)

– Topic (multinomial) <-> Patch (Gamma)

– Document <-> Roll (from 0 AM to 12 PM)
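Under this correspondence, generating a roll looks like generating a document, with Gamma draws in place of word draws. The sketch below assumes Dirichlet mixing proportions and fixed, hypothetical (shape, scale) values per patch; the priors actually used in Patchy are not recoverable from the transcript.

```python
import random

def sample_roll(patches, alpha, n_obs, rng):
    """Generate one roll of speed observations from a mixture of Gamma patches.

    patches: list of (shape, scale) pairs, one per patch (hypothetical values).
    alpha:   Dirichlet hyperparameters for the roll's mixing proportions.
    """
    # Mixing proportions of patches for this roll (Dirichlet via Gamma draws).
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    theta = [d / total for d in draws]

    speeds = []
    for _ in range(n_obs):
        k = rng.choices(range(len(patches)), weights=theta)[0]  # pick a patch
        shape, scale = patches[k]
        speeds.append(rng.gammavariate(shape, scale))           # speed in mph
    return theta, speeds

# Two assumed patches: a slow one (mean 20 mph) and a fast one (mean 50 mph).
PATCHES = [(10.0, 2.0), (25.0, 2.0)]

rng = random.Random(0)
theta, speeds = sample_roll(PATCHES, alpha=[1.0, 1.0], n_obs=6, rng=rng)
print("mixing proportions:", theta)
print("speeds (mph):", [round(s, 1) for s in speeds])
```

Note that `random.gammavariate` takes a shape and a scale, so each patch's mean is shape × scale.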

Full joint distribution of Patchy

• We estimate the parameters by variational Bayesian inference.

Variational Bayesian inference

• The posterior parameters are estimated by maximizing the ELBO.

– ELBO = evidence lower bound, a lower bound on the log evidence
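In generic notation, the bound reads as follows. The symbols x (observations), z (all latent variables), and q (the variational posterior) are assumptions for illustration, not the notation of the slides.

```latex
\log p(x) = \mathcal{L}(q) + \mathrm{KL}\bigl(q(z)\,\|\,p(z \mid x)\bigr) \;\ge\; \mathcal{L}(q),
\qquad
\mathcal{L}(q) = \mathbb{E}_{q}\bigl[\log p(x, z)\bigr] - \mathbb{E}_{q}\bigl[\log q(z)\bigr]
```

Because the KL term is non-negative, raising the ELBO with respect to q brings q closer to the true posterior.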

Context dependency

Observations of the same mph are assigned to different patches.

Context dependency

• Context = mixing proportions of patches

– Which patch is dominant?

• Context-dependency

– Observations of the same speed can be assigned to different patches depending on their contexts.
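A small numerical sketch shows how this can happen: the probability of assigning an observation to a patch is proportional to the roll's mixing proportion for that patch times the patch's Gamma density at the observed speed. The patch parameters and mixing proportions below are hypothetical, and the sketch uses point estimates rather than the variational expectations a full inference would use.

```python
import math

def gamma_logpdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def responsibilities(x, theta, patches):
    """Posterior probability of each patch for one speed x, given context theta."""
    log_w = [math.log(t) + gamma_logpdf(x, s, c)
             for t, (s, c) in zip(theta, patches)]
    m = max(log_w)                       # subtract the max for numerical stability
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

# Two assumed patches with overlapping densities (means 40 and 50 mph).
PATCHES = [(20.0, 2.0), (25.0, 2.0)]

speed = 45.0
context_a = [0.8, 0.2]   # a roll where patch 0 is dominant
context_b = [0.2, 0.8]   # a roll where patch 1 is dominant

print("context A:", responsibilities(speed, context_a, PATCHES))
print("context B:", responsibilities(speed, context_b, PATCHES))
```

The same 45 mph observation goes mostly to patch 0 in the first roll and mostly to patch 1 in the second, because only the context differs.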

Context dependency

On May 27, this purple patch is dominant.

On May 28, this yellow patch is dominant.

Evaluation

• Binary classification

– Weekdays / Weekends (Sat, Sun)

• Data

– Training: May 27 ~ June 16 (three weeks)

– Test: July 23 ~ August 5 (two weeks)

Comparison

• Nearest neighbor

– Measure similarity by Euclidean distance

– Require timestamps

• Patchy

– Measure similarity by predictive probability

– Require no timestamps
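The contrast between the two similarity measures can be sketched as follows. Euclidean distance compares speeds position by position, so the rolls must be aligned in time. A predictive probability under a mixture of patches treats a roll as an unordered set of speeds. How Patchy computes its predictive probability is not given in the transcript, so the scoring function, the per-roll mixing proportions, and the patch parameters below are assumptions.

```python
import math

def euclidean(a, b):
    """Distance between two time-aligned rolls of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor_label(query, train_rolls, train_labels):
    """Label of the training roll closest to the query in Euclidean distance."""
    best = min(range(len(train_rolls)),
               key=lambda i: euclidean(query, train_rolls[i]))
    return train_labels[best]

def gamma_logpdf(x, shape, scale):
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def predictive_log_prob(speeds, theta, patches):
    """Log probability of a roll under one set of mixing proportions.

    The sum runs over observations in any order, so no timestamps are needed.
    """
    return sum(
        math.log(sum(t * math.exp(gamma_logpdf(x, s, c))
                     for t, (s, c) in zip(theta, patches)))
        for x in speeds
    )

def most_probable_label(query, train_thetas, train_labels, patches):
    """Label of the training roll whose mixing proportions best predict the query."""
    best = max(range(len(train_thetas)),
               key=lambda i: predictive_log_prob(query, train_thetas[i], patches))
    return train_labels[best]

# Hypothetical patches (slow, fast) and two training rolls.
PATCHES = [(10.0, 2.0), (25.0, 2.0)]
TRAIN_ROLLS = [[50.0, 20.0, 50.0], [50.0, 50.0, 50.0]]
TRAIN_THETAS = [[0.4, 0.6], [0.05, 0.95]]
TRAIN_LABELS = ["weekday", "weekend"]

query = [48.0, 25.0, 49.0]
print(nearest_neighbor_label(query, TRAIN_ROLLS, TRAIN_LABELS))
print(most_probable_label(query, TRAIN_THETAS, TRAIN_LABELS, PATCHES))
```

Shuffling `query` leaves `predictive_log_prob` unchanged up to floating-point rounding, whereas the Euclidean distance generally changes.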

Classification results

Nearest neighbor

Summary

• We proposed a topic model for traffic data analysis.

• Patchy can assign observations of the same traffic speed to different groups in a context-dependent manner.

• Without using timestamps, Patchy achieved a classification accuracy comparable to that of nearest neighbor (NN).

Future work

• Model timestamps
