A Wavelet-based Anomaly Detector for Disease Outbreaks Thomas Lotze Galit Shmueli University of Maryland College Park Sean Murphy Howard Burkom Johns Hopkins University Applied Physics Lab

Dec 29, 2015

Page 1

A Wavelet-based Anomaly Detector for Disease Outbreaks

Thomas Lotze, Galit Shmueli

University of Maryland College Park

Sean Murphy, Howard Burkom

Johns Hopkins University Applied Physics Lab

Page 2

Outline

● Motivation

● Wavelet method

● Difficulties

● Preconditioning

● Results

Page 3

Related Work

● Bakshi
  – Wavelets in chemical SPC

● Zhang
  – Baseline wavelets
  – Normalize syndromic baseline

● Goldenberg et al.
  – Wavelets in syndromic surveillance

Page 4

Motivation

● Detecting disease outbreaks
  – Bioterrorist attacks
  – Virulent diseases
  – Early detection saves lives!

● Syndromic data will show outbreaks

● Anomaly detection to find outbreaks faster

Page 5

Wavelets

● Models a series as a sum of “wavelets”

● Wavelets are at different scales

● Wavelets are local (change over time)

Page 6

Goldenberg et al., 2002

[Diagram: X_t → WT → AR(a_L X), AR(d_L X), …, AR(d_m X), …, AR(d_1 X) → \hat{x}_{t+1} → SPC(x_{t+1} - \hat{x}_{t+1})]

● Decompose the series with the desired wavelet

● Use an AR model at each of the detail levels and the coarsest approximation level to forecast the next point

● Reconstruct the series and obtain the next-day forecast

● Compare the forecast with the actual value. Use a control chart to monitor the discrepancy
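
The steps on this slide (decompose, forecast each level, reconstruct, monitor the discrepancy) can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the slides: the decomposition is a backward-looking additive Haar ("a trous") scheme, each level gets an AR(1) model fit by least squares, and the control chart is a one-sided Shewhart-style rule with threshold `k`. All function names and parameter values are hypothetical.

```python
import math
import random


def haar_a_trous(x, levels):
    """Backward-looking additive Haar decomposition ("a trous" scheme).

    Returns (approx, details) such that, at every index t,
    x[t] == approx[t] + sum(d[t] for d in details) up to rounding.
    Lookups before the start of the series are clamped to index 0.
    """
    approx = [float(v) for v in x]
    details = []
    for j in range(levels):
        lag = 2 ** j
        smooth = [(approx[t] + approx[max(t - lag, 0)]) / 2.0
                  for t in range(len(approx))]
        details.append([a - s for a, s in zip(approx, smooth)])
        approx = smooth
    return approx, details


def ar1_forecast(series):
    """One-step-ahead forecast from an AR(1) model with intercept (least squares)."""
    prev, curr = series[:-1], series[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    var = sum((p - mean_prev) ** 2 for p in prev)
    if var == 0.0:
        return mean_curr  # constant predictor: fall back to the mean
    slope = sum((p - mean_prev) * (c - mean_curr)
                for p, c in zip(prev, curr)) / var
    return mean_curr + slope * (series[-1] - mean_prev)


def wavelet_forecast(history, levels=3):
    """Forecast each component separately, then add the forecasts back together."""
    approx, details = haar_a_trous(history, levels)
    return ar1_forecast(approx) + sum(ar1_forecast(d) for d in details)


def monitor(series, warmup=56, levels=3, k=3.0, min_resid=14):
    """Flag days whose forecast residual exceeds mean + k * sd of past residuals."""
    residuals, alarms = [], []
    for t in range(warmup, len(series)):
        resid = series[t] - wavelet_forecast(series[:t], levels)
        if len(residuals) >= min_resid:
            mean = sum(residuals) / len(residuals)
            sd = math.sqrt(sum((r - mean) ** 2 for r in residuals)
                           / (len(residuals) - 1))
            if sd > 0 and resid > mean + k * sd:
                alarms.append(t)
        residuals.append(resid)
    return alarms


# Synthetic example: flat baseline with noise and one artificial spike.
random.seed(0)
counts = [50 + random.gauss(0, 3) for _ in range(120)]
counts[100] += 40
print(monitor(counts))
```

The backward-looking decomposition is used here because it yields a value at the most recent time step, so a next-day forecast is always available; a decimated transform would need extra handling at the edge.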

Page 7

Difficulties

● Holidays

● Non-stationary
  – Day of week
  – Seasonal

● Noisy

● Outbreaks are not labeled

● Outbreak pattern not known in advance

Page 8

Preconditioning

● Differs from Goldenberg et al.

● Replace holidays
  – With the value from one week previous

● Day-of-week
  – Ratio to moving average
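
The holiday step can be sketched as below. This is a minimal illustration, assuming the series is a plain list of daily counts and that holidays are supplied as list indices; the function name and the handling of holidays in the first week are assumptions, not details from the slides.

```python
def replace_holidays(counts, holiday_indices):
    """Return a copy of counts with each holiday replaced by the value 7 days earlier.

    Holidays are processed in date order, so if the day one week earlier was
    itself a holiday, its already replaced value is used. Holidays in the
    first week have no earlier value and are left unchanged.
    """
    cleaned = list(counts)
    for i in sorted(holiday_indices):
        if i >= 7:
            cleaned[i] = cleaned[i - 7]
    return cleaned


# Two weeks of counts; index 8 is a holiday with an unusually low count.
daily = [10, 12, 11, 13, 12, 6, 5, 10, 2, 11, 13, 12, 6, 5]
print(replace_holidays(daily, [8]))
```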

Page 9

Evaluation: Simulated Outbreaks

● Real data from 5 cities, respiratory (Resp) and gastrointestinal (GI)

● Simulated outbreak patterns inserted

● Specific pattern of additional syndromes over several days

● Size is normalized by standard deviation of recent days

● Inserted at different starting points within the sample data

● Average detection rates vs. false alarm rates can be determined to create ROC curves
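
One way to picture the insertion step is the sketch below. It assumes that "size" means the total number of added cases, expressed in units of the standard deviation of a window of days just before the outbreak starts; the slides do not say whether total or peak size is meant, and the window length, function name, and parameters are hypothetical.

```python
import statistics


def inject_outbreak(counts, start, shape, magnitude=2.0, window=28):
    """Return a copy of counts with a scaled outbreak pattern added from index start.

    shape     -- relative extra cases on each outbreak day
    magnitude -- total added cases, in units of the standard deviation of
                 the `window` days immediately before `start`
    """
    recent = counts[max(0, start - window):start]
    if len(recent) < 2:
        raise ValueError("need at least two days before the outbreak start")
    scale = magnitude * statistics.stdev(recent) / sum(shape)
    injected = list(counts)
    for offset, weight in enumerate(shape):
        if start + offset < len(injected):
            injected[start + offset] += scale * weight
    return injected


baseline = [10, 14, 10, 14, 10, 14, 10, 14, 10, 14]
print(inject_outbreak(baseline, start=6, shape=[1, 2, 1], magnitude=3.0, window=4))
```

Repeating the injection over many starting points and recording, for each alarm threshold, how often the detector alarms during the outbreak and how often it alarms on the unmodified data gives the detection-rate and false-alarm-rate pairs for an ROC curve.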

Page 10

Results

● Comparable to Holt-Winters

● Not amazing

Page 11

Results

● Preconditioning is important

● Detection is much better when preconditioned

Page 12

Results

● Easier to detect on some days than others

● Days with low counts

● Daily preconditioning not sufficient

Page 13

Summary

● Wavelets are a fairly good detection method

● Preconditioning is very important

● Day-of-week not fully accounted for

Page 14

Questions?

● More details on the wavelet method?

● Difficulties?

● Other outbreak signals?

● Future work?

● Will Microsoft survive Bill Gates' stepping down?

Page 15

Bonus: More on Wavelets

● Level 1:
  – Run the data through a low-pass filter. This gives the approximation coefficients
  – Run the data through a high-pass filter. This gives the detail coefficients
  – Down-sample
  – Reconstruct approximation and detail by up-sampling and running “reconstruction” filters.

● Level 2 and on:
  – Repeat the steps by applying them to the previous level approximation coefficients.
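
A small sketch of these steps for the Haar wavelet is shown below. It assumes an even-length input and writes the filtering and down-sampling as pairwise sums and differences, which is what the two-tap Haar filters reduce to; the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_analysis(x):
    """One level: low-pass and high-pass filter, keeping every second output."""
    if len(x) % 2:
        raise ValueError("length must be even")
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_components(approx, detail):
    """Up-sample each coefficient set and apply the reconstruction filters.

    Returns the approximation and detail series at the original length;
    their element-wise sum is the original signal.
    """
    a, d = [], []
    for ca, cd in zip(approx, detail):
        a += [ca / SQRT2, ca / SQRT2]
        d += [cd / SQRT2, -cd / SQRT2]
    return a, d


def haar_decompose(x, levels):
    """Level 2 and on: repeat the analysis on the previous approximation coefficients."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, detail = haar_analysis(approx)
        details.append(detail)
    return approx, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeff_a, coeff_d = haar_analysis(signal)
series_a, series_d = haar_components(coeff_a, coeff_d)
print([round(a + d, 6) for a, d in zip(series_a, series_d)])
```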

Page 16

Bonus: Wavelets on Cough Medication Sales

[Figure: five-level wavelet decomposition of the cough medication sales signal s, with the time axis marked from 100 to 500. Panel “Signal and Approximation(s)”: s with approximations a1 to a5. Panel “Coefs, Signal and Detail(s)”: s, the coefficients (cfs) at levels 5 to 1, and details d5 to d1.]

Haar Wavelet:

h = [1/sqrt(2), 1/sqrt(2)]

g = [1/sqrt(2), -1/sqrt(2)]

Downsample

Upsample

h* = [1/sqrt(2), 1/sqrt(2)]

g* = [-1/sqrt(2), 1/sqrt(2)]

In general:

s = a5 + d1 + d2 +… + d5
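
The identity follows from the level-by-level structure: each level splits the previous approximation into a coarser approximation plus a detail. Writing a_j and d_j for the reconstructed (full-length) approximation and detail at level j, as plotted, the relation a_{j-1} = a_j + d_j with a_0 = s telescopes as sketched here:

```latex
\begin{aligned}
s &= a_1 + d_1 \\
  &= a_2 + d_2 + d_1 \\
  &\;\;\vdots \\
  &= a_5 + \sum_{j=1}^{5} d_j
\end{aligned}
```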

Page 17

Bonus: Wavelet Prediction

● Additional details:

● 5-level decomposition
  – Can be performed with more or fewer levels

● SWT (stationary wavelet transform): Fill in “holes”
  – Perform a decomposition for every possible position
  – Series are no longer independent

● Edge issue
  – Prediction is not possible at all time steps
  – Solution: construct wavelets “backwards” from most recent observations

Page 18

Bonus: Ratio-to-Moving-Average

● Way of normalizing day-of-week effects

● 1: Determine moving averages
  a(i) = (x(i-3) + x(i-2) + ... + x(i+3)) / 7

● 2: Determine ratio (“raw seasonal”) for each day
  r(i) = x(i) / a(i)

● 3: Determine avg. ratio for each day of the week
  r(Mon) = sum(r(i): i is Mon) / count(i is Mon)

● 4: Normalize ratios to sum to 1
  r'(Mon) = r(Mon) / (r(Mon) + ... + r(Sun))

● 5: Divide each day by the normalized ratio for its day of the week
  x'(i) = x(i) / r'(Mon) when i is Mon
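
The five steps above can be sketched as follows. The sketch assumes a list of consecutive daily counts and a known weekday for the first entry; days too close to either end for a centered 7-day average are skipped when estimating the ratios. The function name and the `first_weekday` parameter are hypothetical.

```python
def ratio_to_moving_average(counts, first_weekday=0):
    """Normalize day-of-week effects with the ratio-to-moving-average steps.

    counts        -- daily counts, one value per consecutive day
    first_weekday -- weekday of counts[0], 0 = Monday ... 6 = Sunday
    Returns (adjusted_counts, normalized_ratios).
    """
    sums = [0.0] * 7
    hits = [0] * 7
    # Steps 1 and 2: centered 7-day moving average and raw ratio per day.
    for i in range(3, len(counts) - 3):
        avg = sum(counts[i - 3:i + 4]) / 7.0
        if avg > 0:
            day = (first_weekday + i) % 7
            sums[day] += counts[i] / avg
            hits[day] += 1
    if min(hits) == 0:
        raise ValueError("need a usable moving average for every weekday")
    # Step 3: average ratio for each day of the week.
    ratios = [s / h for s, h in zip(sums, hits)]
    if min(ratios) <= 0:
        raise ValueError("a weekday with zero average ratio cannot be normalized")
    # Step 4: normalize the seven ratios to sum to 1.
    total = sum(ratios)
    normalized = [r / total for r in ratios]
    # Step 5: divide each day by the normalized ratio of its weekday.
    adjusted = [x / normalized[(first_weekday + i) % 7]
                for i, x in enumerate(counts)]
    return adjusted, normalized


# Four weeks with high weekday counts and low weekend counts.
weeks = [100, 100, 100, 100, 100, 50, 50] * 4
adjusted, ratios = ratio_to_moving_average(weeks)
print([round(r, 4) for r in ratios])
```

Because the ratios are normalized to sum to 1, as on the slide, the adjusted series comes out roughly seven times larger than the raw counts. Rescaling the ratios to average 1 instead would keep the original scale; either way the day-of-week pattern is removed.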

Page 19

Bonus: Possible Extensions

● Multivariate wavelets

● Each day-of-week as a separate series

● Different wavelet shapes

● Different wavelet scale basis

● Different preconditioning

● Different sizes, lengths of outbreaks

● Don't normalize outbreak by standard deviation of recent days
  – Show when outbreaks are harder to detect

● Estimate confidence based on experience

● Boosting

Page 20

Bonus: Wavelet Prediction

● Decompose into timescales

● Use AR or EWMA to predict for each timescale

● Reconstruct prediction from predicted timescales

● Monitor deviations from prediction

Page 21

Bonus: Alternative Preconditioning

● Regression using day-of-week predictors

● 7-day differencing

● Holt-Winters as preconditioner

● Seasonal preconditioning
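
Two of these alternatives are simple enough to sketch directly. The sketch assumes consecutive daily counts; the regression variant uses only day-of-week indicator predictors, in which case the least-squares fit is the mean of each weekday and the residuals are the counts minus that mean. Function names are hypothetical.

```python
def seven_day_difference(counts):
    """Return counts[i] - counts[i - 7] for every day with a value one week earlier."""
    return [counts[i] - counts[i - 7] for i in range(7, len(counts))]


def day_of_week_residuals(counts, first_weekday=0):
    """Residuals from a regression on day-of-week indicators only.

    With indicator predictors alone, the fitted value for a day is the
    mean count of its weekday.
    """
    groups = [[] for _ in range(7)]
    for i, x in enumerate(counts):
        groups[(first_weekday + i) % 7].append(x)
    means = [sum(g) / len(g) if g else 0.0 for g in groups]
    return [x - means[(first_weekday + i) % 7] for i, x in enumerate(counts)]


pattern = [20, 22, 21, 23, 22, 9, 8] * 3
print(seven_day_difference(pattern))
print(day_of_week_residuals(pattern))
```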

Page 22

Bonus: Other Outbreak Signals

● Normalized by total size

● Lognormal, exponential, step

● Spike is much easier than the others
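
For illustration, the signal shapes named here might be generated as below. This sketch reads "normalized by total size" as scaling each shape so its daily weights sum to 1, and it treats the exponential signal as exponentially growing; both readings, along with the parameter values and function names, are assumptions.

```python
import math


def normalize(weights):
    """Scale weights so they sum to 1 (total outbreak size of one unit)."""
    total = sum(weights)
    return [w / total for w in weights]


def spike():
    """All extra cases on a single day."""
    return [1.0]


def step(days):
    """The same number of extra cases on each day."""
    return normalize([1.0] * days)


def exponential(days, rate=0.5):
    """Extra cases growing exponentially from day to day."""
    return normalize([math.exp(rate * d) for d in range(days)])


def lognormal(days, mu=1.0, sigma=0.5):
    """Lognormal density sampled at days 1..days."""
    def pdf(t):
        z = (math.log(t) - mu) / sigma
        return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))
    return normalize([pdf(d) for d in range(1, days + 1)])


for name, shape in [("spike", spike()), ("step", step(4)),
                    ("exponential", exponential(5)), ("lognormal", lognormal(10))]:
    print(name, [round(w, 3) for w in shape])
```

With the same total size, the spike concentrates every added case on one day while the other shapes spread them out, which is consistent with the spike being the easiest to detect.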