Change Point Detection with Bayesian Inference, by Frank Kelly, PyData, 6th January 2015
Jul 17, 2015
Overview
• Nigeria, oil wells & drilling
• Noisy data
• Some maths
• Python implementation
• Examples in different domains
FPSO (oil platform picture)
Mud pulse telemetry
• Information encoded digitally, transmitted via pressure pulses through mud fluid.
• Alert drillers that they have reached oil, detect rock types and general monitoring.
The problem
• Poor bit rate and resolution
• Time consuming analysis
Approaches to statistics
• Frequentist
– Data gathered is a repeatable random sample (“frequency”)
– Underlying parameters are constant
– Fisher’s 0.05 (p-value threshold)
• Bayesian
– Data are fixed and observed from the realised sample
– Parameters unknown and described probabilistically
– Introduce “subjectivity”
Frequentist vs. Bayesian
The Theory: Bayesian inference
• Methodology of mathematical inference:
– Choosing between several possible models
– Extracting parameters for these models
• Bayes’ Theorem:
Rev. Thomas Bayes, 1702 - 1761
$$p(w \mid D) = \frac{p(D \mid w)\, p(w)}{p(D)}$$
where p(w|D) is the posterior probability, p(D|w) the likelihood, p(w) the prior probability and p(D) the evidence.
- Remove nuisance parameters by marginalisation
- Interesting ones remain
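As a minimal sketch of Bayes' theorem applied to a choice between several models: the coin-bias hypotheses and the counts below are invented purely for illustration and do not come from the talk.

```python
import numpy as np

# Hypothetical example: three candidate models w for a coin's P(heads).
w = np.array([0.25, 0.5, 0.75])
prior = np.full(w.size, 1.0 / w.size)      # p(w): uniform prior over the models

heads, tails = 7, 3                        # invented data D
likelihood = w**heads * (1 - w)**tails     # p(D | w), up to a constant factor

evidence = np.sum(likelihood * prior)      # p(D): sum over all candidate models
posterior = likelihood * prior / evidence  # p(w | D), Bayes' theorem

best_w = w[np.argmax(posterior)]           # most probable model given the data
```

The evidence term only normalises the posterior, which is why the constant binomial factor can be dropped from the likelihood here.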
Modelling the problem
[Figure: noisy piecewise-constant signal of N samples (x-axis 0 to 200, y-axis 0.5 to 2.5), with mean µ1 before the changepoint m and mean µ2 after it]
data = model + noise
• A sequence of N samples of data from a piecewise constant source with added Gaussian noise.
• Noise independent of mean, identically distributed and S.D. = σ
• Heterogeneous: divide into two homogeneous segments
$$d_i = \begin{cases} \mu_1 + e_i, & i \le m \\ \mu_2 + e_i, & m < i \le N \end{cases}$$
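To make the model concrete, here is a small sketch that generates synthetic data of this form. The function name `make_step_data` and all default values (N, m, the two means, σ, the seed) are assumptions chosen for illustration, not values from the talk.

```python
import numpy as np

def make_step_data(N=200, m=120, mu1=1.0, mu2=2.0, sigma=0.3, seed=0):
    """Return N samples with mean mu1 for i <= m and mu2 for m < i <= N, plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, N + 1)                    # 1-based sample index
    model = np.where(i <= m, mu1, mu2)         # piecewise-constant mean
    noise = rng.normal(0.0, sigma, size=N)     # e_i: i.i.d. Gaussian with S.D. sigma
    return model + noise                       # data = model + noise

d = make_step_data()
```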
Single changepoint detector: How does it work?
• Substitute likelihood into Bayes’ Law
– Simple model: consider Ockham’s Razor
• Interested in changepoint location m, so integrate w.r.t. the nuisance parameters (µ1, µ2 and σ), then rearrange
• This gives a BIG expression for p({m}|d, I); code it in Python
• On running, obtain the most likely changepoint location
Ockham’s razor: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
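The steps above can be sketched in a few lines of NumPy. This is not the code from the talk's repository; it is a minimal version that assumes flat priors on µ1 and µ2, a 1/σ prior on σ and a uniform prior on m, which gives the textbook form of the marginal posterior. The function name `log_posterior_changepoint` and the test data are hypothetical.

```python
import numpy as np

def log_posterior_changepoint(d):
    """Unnormalised log p(m | d) for m = 1 .. N-1 under the two-segment model.

    Assumes flat priors on the two means and a 1/sigma prior on the noise S.D.
    """
    d = np.asarray(d, dtype=float)
    N = d.size
    m = np.arange(1, N)                      # candidate changepoint locations
    csum = np.cumsum(d)
    sum1 = csum[:-1]                         # sum of d_1 .. d_m
    sum2 = csum[-1] - sum1                   # sum of d_{m+1} .. d_N
    # Residual sum of squares about the two segment means
    S = np.sum(d**2) - sum1**2 / m - sum2**2 / (N - m)
    return -0.5 * np.log(m * (N - m)) - 0.5 * (N - 2) * np.log(S)

# Invented test signal: the mean steps from 1.0 to 2.0 after sample 120
rng = np.random.default_rng(1)
d = np.concatenate([rng.normal(1.0, 0.3, 120), rng.normal(2.0, 0.3, 80)])

logp = log_posterior_changepoint(d)
m_hat = int(np.argmax(logp)) + 1             # most likely changepoint location
```

Working in logs avoids numerical underflow, since the posterior itself involves a large negative power of the residual sum of squares.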
The maths
More maths
• Integrate w.r.t. (and thereby remove) nuisance parameters
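For reference, a sketch of what this marginalisation yields for the single-changepoint model. The prior choices (flat on µ1 and µ2, 1/σ on σ, uniform on m) are assumptions made here and may differ from those used on the original slides.

```latex
\begin{align}
p(d \mid m, \mu_1, \mu_2, \sigma, I)
  &= (2\pi\sigma^2)^{-N/2}
     \exp\!\left[-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{m}(d_i-\mu_1)^2
     + \sum_{i=m+1}^{N}(d_i-\mu_2)^2\right)\right] \\
p(m \mid d, I)
  &\propto \int_0^\infty \frac{\mathrm{d}\sigma}{\sigma}
     \int_{-\infty}^{\infty}\mathrm{d}\mu_1
     \int_{-\infty}^{\infty}\mathrm{d}\mu_2 \;
     p(d \mid m, \mu_1, \mu_2, \sigma, I) \\
  &\propto \frac{1}{\sqrt{m(N-m)}}
     \left[\sum_{i=1}^{N} d_i^2
     - \frac{1}{m}\left(\sum_{i=1}^{m} d_i\right)^2
     - \frac{1}{N-m}\left(\sum_{i=m+1}^{N} d_i\right)^2\right]^{-(N-2)/2}
\end{align}
```

Each Gaussian integral over a mean contributes a factor proportional to σ/√m or σ/√(N−m), and the remaining integral over σ turns the exponential into a power of the residual sum of squares.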
Other applications…
http://moz.com/google-algorithm-change
“Google’s algorithm is the ‘secret sauce recipe’ that has enabled it to dominate search.” - FT.com, 16th Sept 2014
http://www.ft.com/cms/s/0/9615661c-3ce1-11e4-9733-00144feabdc0.html?siteedition=uk#axzz3DSwXYAW8
Any business with an online presence today often struggles to accurately evaluate:
● The quality of its website and associated linking pages, as perceived by Google
● The robustness of its website to a sudden change in Google’s search algorithm
Web traffic
[Figure: raw daily Google search-sourced pageviews; y-axis 30000 to 60000]
Web traffic (2)
[Figure: smoothed data using moving average; y-axis 30000 to 60000]
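A moving average of this kind can be sketched as follows. The 7-day window and the pageview numbers are assumptions for illustration; the talk does not state which window was used.

```python
import numpy as np

def moving_average(x, window=7):
    """Simple moving average; returns only the positions where a full window fits."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window            # equal weights summing to 1
    return np.convolve(x, kernel, mode="valid")

# Invented daily pageview counts with a weekly pattern, for illustration only
days = np.arange(56)
pageviews = 45000 + 5000 * np.sin(2 * np.pi * days / 7)
smoothed = moving_average(pageviews, window=7)
```

With a window equal to the length of the weekly cycle, the cycle averages out and only the underlying level remains.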
Web traffic (3)
[Figure: smoothed data with cyclicality removed; y-axis 30000 to 60000]
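One simple way to remove a weekly cycle is to subtract each weekday's average deviation from the overall mean. This is an assumed approach shown as a sketch, not necessarily the method used for the slide; the function name `remove_weekly_cycle` and the data are hypothetical.

```python
import numpy as np

def remove_weekly_cycle(x, period=7):
    """Subtract each cycle position's average deviation from the overall mean."""
    x = np.asarray(x, dtype=float)
    phase = np.arange(x.size) % period           # position within the cycle
    overall = x.mean()
    adjusted = x.copy()
    for p in range(period):
        mask = phase == p
        adjusted[mask] -= x[mask].mean() - overall
    return adjusted

# Invented daily pageview counts with a weekly pattern, for illustration only
days = np.arange(56)
pageviews = 45000 + 5000 * np.sin(2 * np.pi * days / 7)
flat = remove_weekly_cycle(pageviews)
```

Because the same offset is subtracted from every occurrence of a given weekday, a genuine shift in level part-way through the series survives the adjustment.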
Web traffic (4)
[Figure: likelihood of change in data plotted over time. Legend: "day removed", "likelihood CP". Axes: 30000 to 60000 and -838 to -833]
Number of tropical storms per year in the North Atlantic
Data obtained from ibtracs database: https://www.ncdc.noaa.gov/ibtracs/
"Amo timeseries 1856-present" by Rosentod, Marsupilami - http://www.cdc.noaa.gov/Correlation/amon.us.long.data. Licensed under Public Domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Amo_timeseries_1856-present.svg#mediaviewer/File:Amo_timeseries_1856-present.svg
Other applications / possibilities
• Financial markets and political events
• Combine with frequentist statistical methods:
– Use of GLR (generalised likelihood ratio) in an online (moving window) detection application
• Your own data / ideas!
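The GLR idea mentioned above can be sketched for the simplest case: a single shift in mean inside one window, with Gaussian noise of known S.D. The function name `glr_mean_shift`, the known-σ assumption and the test window are illustrative choices, not details from the talk.

```python
import numpy as np

def glr_mean_shift(window, sigma):
    """Largest GLR statistic (2 x log-likelihood ratio) for one mean shift in a window.

    Compares "two means, split after sample k" against "one mean", for known sigma.
    Returns the statistic and the best split position k.
    """
    x = np.asarray(window, dtype=float)
    n = x.size
    k = np.arange(1, n)                          # split after sample k
    csum = np.cumsum(x)
    mean1 = csum[:-1] / k                        # mean of x_1 .. x_k
    mean2 = (csum[-1] - csum[:-1]) / (n - k)     # mean of x_{k+1} .. x_n
    stat = k * (n - k) / n * (mean1 - mean2) ** 2 / sigma**2
    best = int(np.argmax(stat))
    return float(stat[best]), int(k[best])

# Invented noise-free window: the mean jumps from 0 to 3 after sample 20
window = np.concatenate([np.zeros(20), 3.0 * np.ones(20)])
stat, k_hat = glr_mean_shift(window, sigma=1.0)
```

In an online setting this would be re-evaluated as the window slides forward, raising an alarm when the statistic exceeds a chosen threshold; how to set that threshold is left open here.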
Thank you
• Link to Python code on github: https://github.com/swhustla/pydata-bayes-changepoint
– Single changepoint detector (as seen tonight)
– Dual changepoint detector
– Ramp detector
• Further reading:
– Numerical Bayesian Methods Applied to Signal Processing (Statistics and Computing) by Fitzgerald, O’Ruanaidh, 1996: http://www.amazon.co.uk/Numerical-Bayesian-Processing-Statistics-Computing/dp/0387946292
– Bayesian Inference on Change Point Problems (2007): http://www.cs.ubc.ca/~murphyk/Students/Xuan_MSc07.pdf
Twitter: @norhustla Email: [email protected]
Thank you
• Additional links:
– Google Algo updates: http://moz.com/google-algorithm-change
– Mathsight -> insights into algorithm changes: http://mathsight.org
– Atlantic multi-decadal oscillation spatial pattern: http://commons.wikimedia.org/wiki/File:AMO_Pattern.png
– National Climatic Data Center: https://www.ncdc.noaa.gov/ibtracs/
– Ockham’s Razor and Bayesian Inference: http://www.jstor.org/discover/10.2307/29774559?sid=21105568247973&uid=3738032&uid=4&uid=2
– Converting from Matlab to Python: http://mathesaurus.sourceforge.net/matlab-numpy.html