Heavy Hitters Tail Bounds Anna Karlin continued

Heavy Hitters continued Tail Bounds

Jan 10, 2022

Transcript
Page 1: Heavy Hitters continued Tail Bounds

Heavy Hitters continued

Tail Bounds

Anna Karlin

Page 2: Heavy Hitters continued Tail Bounds

Problem

● Input: sequence of $n$ elements $x_1, x_2, \ldots, x_n$ from a known universe $U$ (e.g., 8-byte integers).

● Goal: perform a computation on the input, in a single left-to-right pass, where

○ Elements are processed in real time

○ We can’t store the full data => minimal storage requirement to maintain a working “summary”

Page 3: Heavy Hitters continued Tail Bounds

Heavy Hitters: Keys that occur many times

Applications:

● Determining popular products

● Computing frequent search queries

● Identifying heavy TCP

$X = x_1, x_2, \ldots, x_n$

$\forall t \le n$: $f_x^t$ = # times element $x$ has appeared in $x_1, x_2, \ldots, x_t$

Goal: Find/output all elements with $f_x \ge n/k$. These elements are heavy hitters.

Page 4: Heavy Hitters continued Tail Bounds

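Output has size $O(k)$.

Provably impossible to solve this problem exactly with sublinear space.

Modified goal: solve the $\varepsilon$-HH problem

If $f_x \ge n/k$, $x$ is added to the HH list.

If some element, say $y$, is added to the list, then w.p. $\ge 1 - \delta$,

$f_y \ge n/k - \varepsilon n$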

Page 5: Heavy Hitters continued Tail Bounds

Count-min sketch

● Maintain a short summary of the information that still enables answering queries.

● Cousin of the Bloom filter

○ Bloom Filter solves the “membership problem”.

○ We want to extend it to solve a counting problem.

Page 6: Heavy Hitters continued Tail Bounds

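Count-Min Sketch

Modified goal: solve the $\varepsilon$-HH problem

If $f_x \ge n/k$, $x$ is added to the HH list.

If some element, say $y$, is added to the list, then w.p. $\ge 1 - \delta$,

$f_y \ge n/k - \varepsilon n$

Designer specifies $k, \varepsilon, \delta \Rightarrow b, \ell$

Keep a 2D array: $\ell$ hash tables, each of size $b$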

Page 7: Heavy Hitters continued Tail Bounds

Initialize tables with all 0s.

When element $x$ shows up: Update($x$): $\forall\, 1 \le j \le \ell$, increment $t_j[h_j(x)]$

Count($x$): return $\min_{1 \le j \le \ell} t_j[h_j(x)]$

If Count($x$) $\ge n/k$, add $x$ to the HH list.

[Figure: 2D array of counters with one row per hash function $h_1, h_2, \ldots, h_\ell$ and columns $1, 2, \ldots, b$]
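The update and count rules on this slide can be sketched in Python as follows. This is a minimal illustration, not the lecture's implementation: the class and function names are hypothetical, the salted BLAKE2 hash is only a stand-in for the idealized hash functions $h_j$, and the stream length $n$ is assumed to be known in advance so that the threshold $n/k$ can be checked as elements arrive.

```python
import hashlib


class CountMinSketch:
    """num_rows hash tables (l), each with num_buckets counters (b)."""

    def __init__(self, num_rows: int, num_buckets: int):
        self.num_rows = num_rows
        self.num_buckets = num_buckets
        # Initialize tables with all 0s.
        self.tables = [[0] * num_buckets for _ in range(num_rows)]

    def _hash(self, j: int, x) -> int:
        # Stand-in for h_j (an assumption): BLAKE2 salted with the row index.
        digest = hashlib.blake2b(
            repr(x).encode(), digest_size=8, salt=j.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.num_buckets

    def update(self, x) -> None:
        # For every row j, increment t_j[h_j(x)].
        for j in range(self.num_rows):
            self.tables[j][self._hash(j, x)] += 1

    def count(self, x) -> int:
        # Return the minimum over rows of t_j[h_j(x)]; never an underestimate.
        return min(
            self.tables[j][self._hash(j, x)] for j in range(self.num_rows)
        )


def heavy_hitters(stream, k, num_rows=5, num_buckets=64):
    """Single pass: add x to the HH list when Count(x) >= n/k."""
    n = len(stream)  # assumption: stream length known in advance
    sketch = CountMinSketch(num_rows, num_buckets)
    found = set()
    for x in stream:
        sketch.update(x)
        if sketch.count(x) >= n / k:
            found.add(x)
    return found


stream = ["a"] * 50 + ["b"] * 30 + list(range(20))
print(heavy_hitters(stream, k=4))
```

Because collisions can only add to a counter, every true heavy hitter is reported; false positives are possible but become unlikely as the number of rows and buckets grows.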

Page 8: Heavy Hitters continued Tail Bounds

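Assumptions

Hash functions behave like random maps:

$h_1, \ldots, h_\ell : U \to \{0, 1, \ldots, b-1\}$

$\forall x \ne y$: $\Pr[h_j(x) = h_j(y)] \le 1/b$

Hash functions $h_1, \ldots, h_\ell$ are independent of each other.

Initialize tables with all 0s.

When element $x$ shows up: Update($x$): $\forall\, 1 \le j \le \ell$, increment $t_j[h_j(x)]$

Count($x$): return $\min_{1 \le j \le \ell} t_j[h_j(x)]$

If Count($x$) $\ge n/k$, add $x$ to the HH list.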

Page 9: Heavy Hitters continued Tail Bounds

Fix time $t$: $x_1, \ldots, x_t$ have just arrived. Let $Z_j = t_j[h_j(x)]$.
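The standard error analysis for a single row $j$, under the assumptions stated earlier (random-like, independent hash functions), can be sketched as follows. The threshold $\varepsilon n$ matches the $\varepsilon$-HH goal; the closing parameter choice is one common option and is an assumption here, not necessarily the one used in lecture.

```latex
\begin{align*}
Z_j &= t_j[h_j(x)] = f_x + \sum_{y \ne x} f_y \cdot \mathbf{1}[h_j(y) = h_j(x)] \\
\mathbb{E}[Z_j] &\le f_x + \frac{1}{b} \sum_{y \ne x} f_y \le f_x + \frac{n}{b} \\
\Pr[Z_j - f_x \ge \varepsilon n] &\le \frac{n/b}{\varepsilon n} = \frac{1}{\varepsilon b}
  \quad \text{(Markov's inequality, since } Z_j - f_x \ge 0\text{)} \\
\Pr[\mathrm{Count}(x) - f_x \ge \varepsilon n]
  &= \prod_{j=1}^{\ell} \Pr[Z_j - f_x \ge \varepsilon n]
  \le \left(\frac{1}{\varepsilon b}\right)^{\ell}
  \quad \text{(independence of the } h_j\text{)}
\end{align*}
```

For instance, taking $b = 2/\varepsilon$ and $\ell = \log_2(1/\delta)$ makes the last bound equal to $\delta$, so an element reported with Count($x$) $\ge n/k$ has $f_x > n/k - \varepsilon n$ with probability at least $1 - \delta$.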

Page 10: Heavy Hitters continued Tail Bounds
Page 11: Heavy Hitters continued Tail Bounds

Count-Min Sketch

● Elegant small-space data structure.

● Space used is independent of n.

● Is implemented in several real systems.

○ AT&T used in network switches to analyze network traffic.

○ Google uses a version on top of MapReduce parallel processing infrastructure and in log analysis.

● Huge literature on sketching and streaming algorithms (algorithms like Distinct Elements, Heavy Hitters, and many other very cool algorithms).

Page 12: Heavy Hitters continued Tail Bounds

Hash functions

Page 13: Heavy Hitters continued Tail Bounds

6.1 Tail bounds

Most Slides by Joshua Fan and Alex Tsun

Page 14: Heavy Hitters continued Tail Bounds

Agenda

● Markov’s Inequality

● Chebyshev’s Inequality

● The law of large numbers

Page 15: Heavy Hitters continued Tail Bounds

Markov’s Inequality (intuition)

a 100

b 50

c 25

d no bound

Page 16: Heavy Hitters continued Tail Bounds

Markov’s Inequality (intuition)

Page 17: Heavy Hitters continued Tail Bounds

Markov’s Inequality (intuition)

a 100

b 50

c 25

d no bound

Page 18: Heavy Hitters continued Tail Bounds

Markov’s Inequality (intuition)

a 100

b 50

c 25

d no bound

Page 19: Heavy Hitters continued Tail Bounds

Markov’s Inequality

Page 20: Heavy Hitters continued Tail Bounds

Markov’s Inequality

Page 21: Heavy Hitters continued Tail Bounds
Page 22: Heavy Hitters continued Tail Bounds

Markov’s Inequality (Proof)
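The slide content did not survive extraction. For reference, the standard statement of Markov's inequality and its usual proof for a discrete random variable are sketched below; the symbols $X$ and $a$ are chosen here and may differ from the notation used on the slides.

```latex
\textbf{Markov's inequality.} If $X \ge 0$ is a random variable and $a > 0$, then
\[
  \Pr[X \ge a] \le \frac{\mathbb{E}[X]}{a}.
\]
\textbf{Proof (discrete case).}
\begin{align*}
\mathbb{E}[X] &= \sum_{x} x \Pr[X = x]
  \ge \sum_{x \ge a} x \Pr[X = x] && \text{(dropped terms are nonnegative)} \\
  &\ge \sum_{x \ge a} a \Pr[X = x]
  = a \Pr[X \ge a].
\end{align*}
Dividing both sides by $a$ gives the claim.
```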

Page 23: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality

Page 24: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality

Page 25: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality (picture for Gaussian)

Page 26: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality (picture for Gaussian)

Page 27: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality (Proof)

Page 28: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality (Proof)

Page 29: Heavy Hitters continued Tail Bounds

Chebyshev’s Inequality (Proof)
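The slide content did not survive extraction. The standard statement of Chebyshev's inequality, with the usual one-step proof via Markov's inequality, is sketched below; the symbols $\mu$, $\sigma^2$, and $a$ are chosen here and may differ from the slides.

```latex
\textbf{Chebyshev's inequality.} If $X$ has mean $\mu$ and variance $\sigma^2$, then for any $a > 0$,
\[
  \Pr[|X - \mu| \ge a] \le \frac{\sigma^2}{a^2}.
\]
\textbf{Proof.} The random variable $(X - \mu)^2$ is nonnegative, so Markov's inequality applies:
\[
  \Pr[|X - \mu| \ge a]
  = \Pr[(X - \mu)^2 \ge a^2]
  \le \frac{\mathbb{E}[(X - \mu)^2]}{a^2}
  = \frac{\sigma^2}{a^2}.
\]
```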

Page 30: Heavy Hitters continued Tail Bounds

The Law of Large Numbers

Page 31: Heavy Hitters continued Tail Bounds

Proof of the WLLN

Page 32: Heavy Hitters continued Tail Bounds

Proof of the WLLN
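The slide content did not survive extraction. The usual proof of the weak law applies Chebyshev's inequality to the sample mean of $n$ i.i.d. variables with variance $\sigma^2$, giving $\Pr[|\bar{X}_n - \mu| \ge \varepsilon] \le \sigma^2 / (n \varepsilon^2) \to 0$. The simulation below is a small sanity check of that bound, assuming fair six-sided die rolls as the distribution; the function names, trial count, and seed are illustrative choices.

```python
import random


def deviation_probability(n, eps, trials=500, seed=0):
    """Estimate P(|mean of n fair-die rolls - 3.5| >= eps) by simulation."""
    rng = random.Random(seed)
    mu = 3.5  # expected value of one fair six-sided die roll
    bad = 0
    for _ in range(trials):
        mean = sum(rng.randint(1, 6) for _ in range(n)) / n
        if abs(mean - mu) >= eps:
            bad += 1
    return bad / trials


def chebyshev_bound(n, eps):
    """Chebyshev bound sigma^2 / (n * eps^2) on the same probability."""
    sigma2 = 35 / 12  # variance of one fair six-sided die roll
    return min(1.0, sigma2 / (n * eps * eps))


for n in (10, 100, 1000):
    # Columns: n, simulated deviation probability, Chebyshev upper bound.
    print(n, deviation_probability(n, 0.25), chebyshev_bound(n, 0.25))
```

Both columns shrink as $n$ grows; the Chebyshev bound is typically far from tight, which is enough for the weak law since only convergence to 0 is needed.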

Page 33: Heavy Hitters continued Tail Bounds

Alex Tsun, Joshua Fan