Alternative Data Structures in Ruby Tyler McMullen Friday, February 19, 2010

Nov 18, 2014

Transcript
Page 1: Alternative Data Structures in Ruby

Alternative Data Structures in Ruby

Tyler McMullen

Friday, February 19, 2010

Page 3: Alternative Data Structures in Ruby

Why?

• Speed

• Memory

• Clarity

Page 4: Alternative Data Structures in Ruby

What’s wrong with my favorite data structure, X?

Page 5: Alternative Data Structures in Ruby

Nothing. (Maybe.)

Page 6: Alternative Data Structures in Ruby

• Bloom Filter

• BK-tree

• Splay Tree

• Trie

Page 7: Alternative Data Structures in Ruby

Bloom Filters

• Tests for existence in a set

• Probabilistic

• Minimal memory use

Page 10: Alternative Data Structures in Ruby

100 million strings in a Set

Traditional Set: Minimum 10 GB

Bloom Filter (false-positive rate 0.00001): 280 MB

Bloom Filter (false-positive rate 0.001): 170 MB
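The Bloom filter sizes above can be roughly reproduced with the standard sizing formulas. This is a sketch, not the calculation used in the talk: the method name `bloom_size` is hypothetical, and it assumes the numbers in parentheses are target false-positive rates and that "mb" means mebibytes.

```ruby
# Standard Bloom filter sizing for n items at false-positive rate p:
#   bits   m = -n * ln(p) / (ln 2)^2
#   hashes k = (m / n) * ln 2
# `bloom_size` is a hypothetical helper name, not from the talk.
def bloom_size(n, p)
  bits = (-n * Math.log(p) / Math.log(2)**2).ceil
  hashes = (bits.to_f / n * Math.log(2)).round
  { bits: bits, hashes: hashes, mebibytes: bits / 8.0 / 1024**2 }
end

[0.00001, 0.001].each do |rate|
  size = bloom_size(100_000_000, rate)
  puts format("p=%g: %.0f MiB, %d hash functions", rate, size[:mebibytes], size[:hashes])
end
```

Under those assumptions the results come out at about 286 and 171 MiB, close to the 280 and 170 quoted on the slide.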

Page 12: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

Page 13: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

“to be or not to be”

Page 14: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

add: “to be or not to be”

Page 15: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

add: “that is the question”

Page 16: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

query: “whether ’tis nobler”

NO MATCH

Page 17: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

query: “to be or not to be”

MATCH

Page 18: Alternative Data Structures in Ruby

[Figure: bit array with slots 0 to 7]

query: “in the mind to suffer”

FALSE MATCH
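The add and query steps walked through above can be sketched in a few lines of Ruby. This is a minimal illustration, not the implementation behind the talk: the class name `BloomFilter`, the defaults of 1024 bits and 3 hash positions, and the MD5-based double hashing are all assumptions.

```ruby
require "digest/md5"

# Minimal Bloom filter sketch (hypothetical class, illustrative defaults).
class BloomFilter
  def initialize(bits: 1024, hashes: 3)
    @bits = bits
    @hashes = hashes
    @field = 0 # a single Integer used as the bit array
  end

  # Sets every slot the item hashes to.
  def add(item)
    positions(item).each { |i| @field |= (1 << i) }
    self
  end

  # false means "definitely not added"; true means "probably added",
  # because unrelated items can happen to cover the same slots.
  def include?(item)
    positions(item).all? { |i| @field[i] == 1 }
  end

  private

  # Derives the slot indexes from the two 64-bit halves of an MD5 digest
  # (h1 + i * h2, a common double-hashing scheme; an assumption here).
  def positions(item)
    h1, h2 = Digest::MD5.digest(item.to_s).unpack("Q2")
    (0...@hashes).map { |i| (h1 + i * h2) % @bits }
  end
end

filter = BloomFilter.new
filter.add("to be or not to be").add("that is the question")
filter.include?("to be or not to be") # true: added items always match
```

A false match like the one on the last slide happens when every slot for a query was already set by other items, which is why a smaller bit array gives a higher false-positive rate.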

Page 19: Alternative Data Structures in Ruby

File Server

Page 20: Alternative Data Structures in Ruby

File Server

[Diagram: Request → exists? → Y: 200 / N: 404]

Page 21: Alternative Data Structures in Ruby

File Server

[Diagram: Request → exists? → Y: 200 / N: 404, with a Bloom Filter added to the flow]
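One plausible reading of the diagram is that the Bloom filter sits in front of the existence check, so most requests for missing files never touch the disk. The sketch below assumes that reading; `GuardedFileServer` and `status_for` are hypothetical names, and `filter` is any object that responds to `include?` (a Set stands in for a Bloom filter in the usage lines).

```ruby
require "set"

# Hypothetical file-server guard: consult the in-memory filter first and
# fall back to the file system only when the filter reports a possible hit.
class GuardedFileServer
  def initialize(filter, root)
    @filter = filter # responds to #include?; a Bloom filter in the talk's diagram
    @root = root
  end

  def status_for(path)
    # A negative answer from the filter is definitive: skip the disk check.
    return 404 unless @filter.include?(path)

    # A positive answer may be a false match, so confirm on disk.
    File.file?(File.join(@root, path)) ? 200 : 404
  end
end

server = GuardedFileServer.new(Set["index.html"], Dir.pwd)
server.status_for("missing.png") # 404 without touching the disk
```

The disk check after a positive answer is what keeps false matches harmless: they cost one extra lookup instead of a wrong response.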

Page 22: Alternative Data Structures in Ruby

Bloom Filter

• Test for existence in set

• Tiny Memory Footprint

• Excellent Speed

Page 23: Alternative Data Structures in Ruby

BK-tree

Page 24: Alternative Data Structures in Ruby

BK-tree

• find items within a distance of a target

• reduces search space

• works inside a metric space

Page 25: Alternative Data Structures in Ruby

Triangle Inequality

| d(x, y) - d(x, z) | ≤ d(y, z)

Page 26: Alternative Data Structures in Ruby

Triangle Inequality

| d(x, y) - d(x, z) | ≤ d(y, z)

[Figure: three points x, y, z]

Page 27: Alternative Data Structures in Ruby

Triangle Inequality

| d(x, y) - d(x, z) | ≤ d(y, z)

[Figure: points x, y, z with d(x, y) = 4 and d(x, z) = 1]

Page 28: Alternative Data Structures in Ruby

Triangle Inequality

| d(x, y) - d(x, z) | ≤ d(y, z)

[Figure: points x, y, z with d(x, y) = 4, d(x, z) = 1, and d(y, z) = ?]

Page 29: Alternative Data Structures in Ruby

Triangle Inequality

| 4 - 1 | ≤ d(y, z)

[Figure: points x, y, z with d(x, y) = 4, d(x, z) = 1, and d(y, z) = ?]

Page 30: Alternative Data Structures in Ruby

Triangle Inequality

3 ≤ d(y, z)

[Figure: points x, y, z with d(x, y) = 4, d(x, z) = 1, and d(y, z) ≥ 3]

Page 31: Alternative Data Structures in Ruby

BK-tree

paste

pasta

taser

pastor

shave

light

Page 33: Alternative Data Structures in Ruby

BK-tree

[Figure: BK-tree with root “paste” and one child per distance: 1 pasta, 2 pastor, 3 taser, 4 shave, 5 light]

Page 34: Alternative Data Structures in Ruby

BK-tree

[Figure: BK-tree with root “paste” and one child per distance: 1 pasta, 2 pastor, 3 taser, 4 shave, 5 light; the query word “pastu” is shown at the root]

Page 35: Alternative Data Structures in Ruby

BK-tree

[Figure: BK-tree with root “paste” and one child per distance: 1 pasta, 2 pastor, 3 taser, 4 shave, 5 light; the query word “pastu” is at distance 1 from the root]

Page 37: Alternative Data Structures in Ruby

BK-tree

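[Figure: BK-tree with root “paste”; the query word “pastu” is at distance 1 from the root, and only the children pasta (edge 1) and pastor (edge 2) remain to be searched]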

Page 44: Alternative Data Structures in Ruby

BK-tree

• Most often used for spelling correctors

• Work in any metric space

• Reduce the search space
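The insert and search steps from the slides can be sketched as follows. This is an illustration under assumptions rather than the talk's code: the `BKTree` class, its `add` and `query` methods, and the `levenshtein` helper are hypothetical names, and Levenshtein edit distance is assumed as the metric because it matches the edge labels in the example tree.

```ruby
# Levenshtein edit distance, computed row by row (assumed metric).
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical BK-tree: each node keeps at most one child per distance.
class BKTree
  Node = Struct.new(:word, :children)

  def initialize(&distance)
    @distance = distance
  end

  def add(word)
    if @root.nil?
      @root = Node.new(word, {})
      return self
    end
    node = @root
    loop do
      d = @distance.call(word, node.word)
      return self if d.zero? # word is already stored
      child = node.children[d]
      if child.nil?
        node.children[d] = Node.new(word, {})
        return self
      end
      node = child # same distance already taken: descend and retry
    end
  end

  # Returns every stored word within max_distance of target.
  def query(target, max_distance)
    results = []
    stack = @root ? [@root] : []
    until stack.empty?
      node = stack.pop
      d = @distance.call(target, node.word)
      results << node.word if d <= max_distance
      # Triangle inequality: a subtree on edge e can only hold matches
      # when |e - d| <= max_distance, so all other edges are skipped.
      node.children.each do |edge, child|
        stack << child if (edge - d).abs <= max_distance
      end
    end
    results
  end
end

tree = BKTree.new { |a, b| levenshtein(a, b) }
%w[paste pasta taser pastor shave light].each { |w| tree.add(w) }
tree.query("pastu", 1) # visits only paste, pasta, and pastor
```

With a maximum distance of 1 and d(“pastu”, “paste”) = 1, only edges 0 through 2 are followed, which is the pruning shown in the slides.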

Page 45: Alternative Data Structures in Ruby

Splay Tree

Page 46: Alternative Data Structures in Ruby

Tangent: Access Patterns

Page 47: Alternative Data Structures in Ruby

Access Patterns

Usually assumed to be random or even.

Page 48: Alternative Data Structures in Ruby

Access Patterns

Rarely the case.

Page 49: Alternative Data Structures in Ruby

Splay Tree

• Self-balancing binary tree

• Brings most accessed items toward root

• The more uneven the access pattern, the better

Page 50: Alternative Data Structures in Ruby

Splay Tree

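[Figure: binary search tree rooted at 7, with a left subtree rooted at 4 (children 2 and 6) and a right subtree rooted at 11 (children 9 and 13; leaves 8, 10, 12, 14)]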

Page 52: Alternative Data Structures in Ruby

Splay Tree

[Figure: the same tree rooted at 7, with the nodes of the right subtree (8 to 14) rearranged]

Page 54: Alternative Data Structures in Ruby

Splay Tree

• Made for very uneven access patterns

• Caches, Garbage collectors, etc...
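The "bring accessed items toward the root" behavior can be sketched with a small recursive splay operation. This is a textbook-style illustration, not code from the talk: the `SplayTree` class and its `insert` and `include?` methods are hypothetical names, and the recursion is only suitable for modestly sized trees.

```ruby
# Hypothetical splay tree: every lookup or insert rotates the touched key
# (or the last node on its search path) up to the root.
class SplayTree
  Node = Struct.new(:key, :left, :right)

  attr_reader :root

  def insert(key)
    if @root.nil?
      @root = Node.new(key)
      return self
    end
    @root = splay(@root, key)
    return self if @root.key == key # already present
    node = Node.new(key)
    if key < @root.key
      node.right = @root
      node.left = @root.left
      @root.left = nil
    else
      node.left = @root
      node.right = @root.right
      @root.right = nil
    end
    @root = node
    self
  end

  def include?(key)
    return false if @root.nil?
    @root = splay(@root, key)
    @root.key == key
  end

  private

  def rotate_right(x)
    y = x.left
    x.left = y.right
    y.right = x
    y
  end

  def rotate_left(x)
    y = x.right
    x.right = y.left
    y.left = x
    y
  end

  # Moves key, or the last node reached while searching for it, to the
  # root of the subtree that starts at node.
  def splay(node, key)
    return node if node.nil? || node.key == key
    if key < node.key
      return node if node.left.nil?
      if key < node.left.key # zig-zig
        node.left.left = splay(node.left.left, key)
        node = rotate_right(node)
      elsif key > node.left.key # zig-zag
        node.left.right = splay(node.left.right, key)
        node.left = rotate_left(node.left) unless node.left.right.nil?
      end
      node.left.nil? ? node : rotate_right(node)
    else
      return node if node.right.nil?
      if key > node.right.key # zig-zig
        node.right.right = splay(node.right.right, key)
        node = rotate_left(node)
      elsif key < node.right.key # zig-zag
        node.right.left = splay(node.right.left, key)
        node.right = rotate_right(node.right) unless node.right.left.nil?
      end
      node.right.nil? ? node : rotate_left(node)
    end
  end
end

tree = SplayTree.new
[7, 4, 11, 2, 6, 9, 13].each { |k| tree.insert(k) }
tree.include?(9) # true, and 9 is now the root
```

Because each access restructures the tree, a key that is looked up repeatedly stays at or near the root, which is why skewed access patterns suit this structure.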

Page 55: Alternative Data Structures in Ruby

Trie

Page 56: Alternative Data Structures in Ruby

Trie

• O(1) on lookup, add, removal with respect to the number of keys (each is O(k) in the key length k)

• Ordered traversals

• Prefix matching

• Excellent memory usage (depending on implementation)

Page 58: Alternative Data Structures in Ruby

Trie

[Figure: trie with the path T-H-I-N]

add: “thin”

Page 59: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N and T-R-A-P, sharing the T node]

add: “trap”

Page 60: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, and B-A-R]

add: “bar”

Page 61: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, B-A-R, and B-U-R-P]

add: “burp”

Page 62: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, B-A-R, and B-U-R-P]

query: “trap”

Page 66: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, B-A-R, and B-U-R-P]

query: “trap”

Success!

Page 67: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, B-A-R, and B-U-R-P]

query: “bumpkin”

Page 68: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, B-A-R, and B-U-R-P]

query: “bupkis”

Page 70: Alternative Data Structures in Ruby

Trie

[Figure: trie with paths T-H-I-N, T-R-A-P, B-A-R, and B-U-R-P]

query: “bupkis”

Fail!
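The add and query walk-through above can be sketched with a small hash-based trie. This is an illustration only: the method names `add` and `children` mirror the calls on the Autocompleter slides, while `include?`, the `Node` struct, and the nested-hash layout are assumptions rather than the library used in the talk.

```ruby
# Minimal trie sketch: one node per character, with a flag marking the
# nodes where a stored word ends.
class Trie
  Node = Struct.new(:children, :terminal)

  def initialize
    @root = Node.new({}, false)
  end

  def add(word)
    node = @root
    word.each_char { |ch| node = (node.children[ch] ||= Node.new({}, false)) }
    node.terminal = true
    self
  end

  # Walks one node per character; fails as soon as a character is missing.
  def include?(word)
    node = find(word)
    !node.nil? && node.terminal
  end

  # All stored words that start with prefix, in sorted order.
  def children(prefix)
    node = find(prefix)
    return [] if node.nil?
    collect(node, prefix, [])
  end

  private

  def find(prefix)
    node = @root
    prefix.each_char do |ch|
      node = node.children[ch]
      return nil if node.nil?
    end
    node
  end

  def collect(node, prefix, out)
    out << prefix if node.terminal
    node.children.keys.sort.each { |ch| collect(node.children[ch], prefix + ch, out) }
    out
  end
end

trie = Trie.new
%w[thin trap bar burp].each { |w| trie.add(w) }
trie.include?("trap")   # true
trie.include?("bupkis") # false: the walk stops after "bu"
trie.children("b")      # ["bar", "burp"]
```

The cost of each operation depends on the length of the word, not on how many words are stored, and the sorted walk in `collect` is what provides ordered traversal and prefix matching.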

Page 71: Alternative Data Structures in Ruby

Trie

Example: Autocompleter

Page 72: Alternative Data Structures in Ruby

Trie

class Autocompleter
  def initialize(words)
    @trie = Trie.new
    words.each { |word| @trie.add(word) }
  end

  def query(word)
    return @trie.children(word)
  end
end

Page 73: Alternative Data Structures in Ruby

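Trie

require 'json'
require 'rack'

class Autocompleter
  def initialize(words)
    @trie = Trie.new
    words.each { |word| @trie.add(word) }
  end

  # Rack entry point. Reading the prefix from a "word" query parameter
  # is an assumption; the slide leaves `word` undefined.
  def call(env)
    request = Rack::Request.new(env)
    word = request.params['word'].to_s
    # Rack expects the body to respond to #each, hence the array.
    return [200,
            { 'content-type' => 'application/json' },
            [@trie.children(word).to_json]]
  end
end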

Page 74: Alternative Data Structures in Ruby

Conclusion: Data structures are cool.

Page 75: Alternative Data Structures in Ruby

Questions?
