

CSE 326: Data Structures

Splay Trees
James Fogarty

Spring 2009

AVL Trees Revisited

• Balance condition: Left and right subtrees of every node have heights differing by at most 1

– Strong enough: Worst case depth is O(log n)
– Easy to maintain: one single or double rotation

• Running time for:
– Find? O(log n)
– Insert? O(log n)
– Delete? O(log n)
– buildTree? O(n log n)

Single and Double Rotations

[Diagram: AVL single and double rotations on nodes a, b, c, with subtrees W, X, Y, Z of heights h and h−1]

AVL Trees Revisited

• What extra info did we maintain in each node?

– The height of each node

• Where were rotations performed?

– At the bottom-most node where an imbalance is detected

• How did we locate this node?

– Check balance on our way up out of the recursion (a code sketch follows after this slide)

• Seems like a lot of work, doesn’t it?

– Any wacky ideas?
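To make the review concrete, here is a minimal AVL insert sketch (my own names, not the course's code): the stored height is updated and balance is checked on the way back up the recursion, with a single or double rotation applied at the bottom-most node where an imbalance appears.

```java
// AVL insert sketch: each node stores its height; balance is checked
// on the way up out of the recursion, rotating at the bottom-most
// imbalanced node. Names are illustrative.
class AvlNode {
    int key, height;                    // height of a leaf is 0
    AvlNode left, right;
    AvlNode(int key) { this.key = key; }
}

class AvlSketch {
    static int h(AvlNode n) { return n == null ? -1 : n.height; }

    static AvlNode fix(AvlNode n) {     // recompute the stored height
        n.height = 1 + Math.max(h(n.left), h(n.right));
        return n;
    }

    static AvlNode rotateRight(AvlNode n) {   // single rotation
        AvlNode l = n.left;
        n.left = l.right;
        l.right = n;
        fix(n);
        return fix(l);
    }

    static AvlNode rotateLeft(AvlNode n) {    // mirror image
        AvlNode r = n.right;
        n.right = r.left;
        r.left = n;
        fix(n);
        return fix(r);
    }

    static AvlNode insert(AvlNode n, int key) {
        if (n == null) return new AvlNode(key);
        if (key < n.key)      n.left  = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        fix(n);
        // Check balance on the way up; rotate where the imbalance appears.
        if (h(n.left) - h(n.right) > 1) {
            if (h(n.left.left) < h(n.left.right))
                n.left = rotateLeft(n.left);          // double rotation case
            n = rotateRight(n);
        } else if (h(n.right) - h(n.left) > 1) {
            if (h(n.right.right) < h(n.right.left))
                n.right = rotateRight(n.right);       // double rotation case
            n = rotateLeft(n);
        }
        return n;
    }
}
```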

Splay Trees

• Blind adjusting version of AVL trees
– Why worry about balances? Just rotate like crazy!

– Don’t track anything, store anything, just do it!

• Amortized time per operation is O(log n)

• Worst case time per operation is O(n)
– But guaranteed to happen rarely

• Splay Trees : AVL Trees :: Skew Heaps : Leftist Heaps

Recall: Amortized Complexity

If a sequence of M operations takes O(M f(n)) time, we say the amortized runtime is O(f(n)).

• Worst case time per operation can still be large, say O(n)

• Worst case time for any sequence of M operations is O(M f(n))

• Average time per operation for any sequence is O(f(n))

Amortized complexity is a worst-case guarantee over sequences of operations.
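For example (an illustration, not from the slides): in a sequence of M = n operations where n − 1 operations cost 1 step each and one operation costs n steps, the total is (n − 1) + n = 2n − 1 = O(M), so the amortized cost per operation is O(1) even though the worst single operation costs O(n).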

Recall: Amortized Complexity

• Is amortized guarantee any weaker than worst case?
Yes, it is only for sequences

• Is amortized guarantee any stronger than average case?
Yes, guarantees no bad sequences

• Is average case guarantee good enough in practice?
No, adversarial input, bad day, …

• Is amortized guarantee good enough in practice?
Yes, again, no bad sequences

• When is amortized maybe not good enough?
If that very rare O(n) operation will kill somebody

The Splay Tree Idea

If you're forced to make a really deep access: since you're down there anyway, fix up a lot of deep nodes! Splay all the way to the root.

[Diagram: a deep access path in a tree containing 2, 3, 5, 9, 10, 17]

Do it all with AVL single rotations?

Consider the ordered “list tree” below. Now do find(1) and splay it to the root with only AVL single rotations:

[Diagram: the list tree 6–5–4–3–2–1; repeated AVL single rotations bring 1 to the root one level at a time, leaving the rest of the tree almost as deep as before]

Do it all with AVL single rotations?

[Diagram: after find(1), the sequence find(2), find(3), find(4), find(5), find(6), each splayed to the root with single rotations, still walks a long path each time]

Cost of sequence: find(1), find(2), …, find(n)?

Single rotations can help, but they are not enough…

Splaying node k to the root: Need to be careful!

One option (that we won't use) is to repeatedly use AVL single rotations until k becomes the root (see Section 4.5.1 for details):

[Diagram: k rotated up one level at a time past p, q, r, s, with subtrees A–F reattached at each step]

Splaying node k to the root: Need to be careful!

What's bad about this process? r is pushed almost as low as k was.
Bad sequence: find(k), find(r), find(…), …

[Diagram: after rotating k to the root, r has been pushed near the bottom of the tree, so a following find(r) is expensive]

Find/Insert in Splay Trees

1. Find or insert a node k
2. Splay k to the root using three operations:

– zig-zag rotation
– zig-zig rotation
– plain old zig rotation

Depending on the path from the current location to the root (see the node and rotation sketch below)

Why could this be good?
1. Helps the new root, k

o Great if k is accessed again
2. And helps many others!

o Great if many others on the path are accessed
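Before the individual cases, here is a minimal sketch (my own field and method names, not the course's code) of a splay-tree node and the single-level rotation that the zig, zig-zig, and zig-zag cases below are built from.

```java
// Minimal splay-tree node and one-level rotation sketch.
// Names (SplayNode, SplayTree, rotateUp) are illustrative choices.
class SplayNode {
    int key;
    SplayNode left, right, parent;
    SplayNode(int key) { this.key = key; }
}

class SplayTree {
    SplayNode root;

    // Rotate node k up one level, replacing its parent p.
    // Handles both the left-child and right-child cases and
    // reattaches the displaced subtree.
    void rotateUp(SplayNode k) {
        SplayNode p = k.parent;
        SplayNode g = p.parent;
        if (p.left == k) {                 // k was a left child: right rotation
            p.left = k.right;
            if (k.right != null) k.right.parent = p;
            k.right = p;
        } else {                           // k was a right child: left rotation
            p.right = k.left;
            if (k.left != null) k.left.parent = p;
            k.left = p;
        }
        p.parent = k;
        k.parent = g;
        if (g == null) root = k;           // k has reached the root
        else if (g.left == p) g.left = k;
        else g.right = k;
    }
}
```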

Splay: Zig-Zag*

[Diagram: zig-zag — p and g lie in opposite directions above k; after the splay, k is on top with p and g as its children and subtrees W, X, Y, Z reattached]

Helps those in blue, hurts those in red.

Just like an… AVL double rotation

Which nodes improve depth? k and its original children
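Assuming the SplayNode / rotateUp sketch above, zig-zag can be written as two rotations of k itself, which performs the same restructuring as an AVL double rotation.

```java
// Zig-zag case (sketch): k is a "bent" grandchild of g, i.e. k and its
// parent p are children in opposite directions. Rotating k up twice
// makes k the parent of both p and g.
void zigZag(SplayNode k) {
    rotateUp(k);   // k replaces its parent p
    rotateUp(k);   // k replaces its former grandparent g
}
```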

Splay: Zig-Zig*

[Diagram: zig-zig — k and p are children in the same direction below g; after the splay, k is on top, p is k's child, and g is p's child, with subtrees W, X, Y, Z reattached]

Is this just two AVL single rotations in a row?
Not quite – we rotate g and p, then p and k.

Why does this help? Same number of nodes helped as hurt, but k and its children benefit.
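Under the same assumptions, zig-zig rotates the parent above the grandparent first and only then rotates k above its parent; that ordering is exactly what distinguishes it from two AVL single rotations applied to k.

```java
// Zig-zig case (sketch): k and its parent p are children in the same
// direction below g. Rotate p above g first, then k above p.
void zigZig(SplayNode k) {
    rotateUp(k.parent);   // p replaces g
    rotateUp(k);          // k replaces p
}
```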

Special Case for Root: Zig

[Diagram: zig — p is the root with child k; one rotation makes k the root, with subtrees X, Y, Z reattached]

Relative depth of p, Y, Z? Down 1 level.
Relative depth of everyone else? Nodes under X have been repeatedly raised.

Why not drop zig-zig and zig all the way? Zig only helps one child!
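Putting the three cases together, a bottom-up splay loop (same assumptions as the sketches above) repeats zig-zig or zig-zag while k still has a grandparent and finishes with a single zig when k's parent is the root.

```java
// Bottom-up splay loop (sketch): move k all the way to the root.
void splay(SplayNode k) {
    while (k.parent != null) {
        SplayNode p = k.parent;
        SplayNode g = p.parent;
        if (g == null) {
            rotateUp(k);                          // zig: p is the root
        } else if ((g.left == p) == (p.left == k)) {
            rotateUp(p);                          // zig-zig: same direction,
            rotateUp(k);                          // rotate p first, then k
        } else {
            rotateUp(k);                          // zig-zag: opposite directions,
            rotateUp(k);                          // rotate k twice
        }
    }
}
```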

Splaying Example: Find(6)

[Diagram: find(6) on the list tree 1–2–3–4–5–6; the first zig-zig rotates 6 above 5 and 4]

Still Splaying 6

[Diagram: a second zig-zig rotates 6 above 3 and 2]

Finally…

[Diagram: a final zig rotates 6 above 1, making 6 the root]

Another Splay: Find(4)

[Diagram: find(4) in the resulting tree; a zig-zag moves 4 above 5 and 3]

Example Splayed Out

[Diagram: a second zig-zag moves 4 above 1 and 6, making 4 the root; the tree is now much shallower than the original list]

But Wait…

What happened here?

Didn’t two find operations take linear time instead of our promised logarithmic?

What about the amortized O(log n) guarantee?

That still holds, though we must account for the previous steps used to create this tree.

What is the worst case?

Find keys in sorted (or reverse sorted) order

Why Splaying Helps

• If a node n on the access path is at depth d before the splay, it's at about depth d/2 after the splay

• Overall, nodes which are low on the access path tend to move closer to the root

• Splaying gets amortized O(log n) performance.

Practical Benefit of Splaying

• No heights to maintain, no imbalance to check for
– Less storage per node, easier to code

• Data accessed once is often soon accessed again
– Splaying does implicit caching by bringing it to the root

Splay Operations: Find

• Find the node in normal BST manner
• Splay the node to the root
– if the node is not found, splay what would have been its parent

What if we didn’t splay?

Amortized guarantee fails!
Bad sequence: find(leaf k), find(k), find(k), …
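A find sketch under the same assumptions as above: do an ordinary BST search, then splay the node that was found, or, if the key is absent, the last node on the search path (what would have been its parent).

```java
// Find (sketch): BST search followed by a splay.
SplayNode find(int key) {
    SplayNode cur = root, last = null;
    while (cur != null) {
        last = cur;
        if (key < cur.key)      cur = cur.left;
        else if (key > cur.key) cur = cur.right;
        else { splay(cur); return cur; }   // found: splay it to the root
    }
    if (last != null) splay(last);         // not found: splay the would-be parent
    return null;
}
```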

Splay Operations: Insert

• Insert the node in normal BST manner
• Splay the node to the root

What if we didn’t splay?

Amortized guarantee fails!
Bad sequence: insert(k), find(k), find(k), …
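An insert sketch under the same assumptions: ordinary BST insert, then splay the new node (or the existing node, if the key is already present) to the root.

```java
// Insert (sketch): BST insert followed by a splay of the inserted node.
void insert(int key) {
    if (root == null) { root = new SplayNode(key); return; }
    SplayNode cur = root;
    while (true) {
        if (key < cur.key) {
            if (cur.left == null) {
                cur.left = new SplayNode(key);
                cur.left.parent = cur;
                cur = cur.left;
                break;
            }
            cur = cur.left;
        } else if (key > cur.key) {
            if (cur.right == null) {
                cur.right = new SplayNode(key);
                cur.right.parent = cur;
                cur = cur.right;
                break;
            }
            cur = cur.right;
        } else {
            break;                // key already present; splay it anyway
        }
    }
    splay(cur);
}
```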

Splay Operations: Remove

Everything else splayed, so we'd better do that for remove.

[Diagram: find(k) splays k to the root, with left subtree L (keys < k) and right subtree R (keys > k); delete k, leaving L and R]

Now what?

Join(L, R):

given two trees such that (stuff in L) < (stuff in R), merge them:

[Diagram: splay the maximum of L to its root; it then has no right child, so R is attached there]

Splay on the maximum element in L, then attach R.
Similar to BST delete – find max = find element with no right child.

Does this work to join any two trees? No, need L < R
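A remove sketch under the same assumptions: splay k to the root via find, detach it, then Join(L, R) by splaying the maximum of L to the top and hanging R off its empty right child.

```java
// Remove (sketch): splay k to the root, delete it, then join L and R.
void remove(int key) {
    if (find(key) == null) return;            // find() splays k to the root if present
    SplayNode L = root.left, R = root.right;
    if (L != null) L.parent = null;           // detach both subtrees
    if (R != null) R.parent = null;
    if (L == null) { root = R; return; }      // no left subtree: R is the whole tree
    SplayNode max = L;
    while (max.right != null) max = max.right;  // max of L = node with no right child
    root = L;
    splay(max);                               // max becomes the root of L
    root.right = R;                           // attach R; L < k < R preserves order
    if (R != null) R.parent = root;
}
```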

Delete Example: Delete(4)

[Diagram: Delete(4) on a tree with keys 1, 2, 4, 6, 7, 9 — find(4) splays 4 to the root; removing 4 leaves L = {1, 2} and R = {6, 7, 9}; splaying the max of L (2) gives it no right child, so R is attached there]

Splay Tree Summary

• All operations are in amortized O(log n) time

• Splaying can be done top-down; this may be better because:
– only one pass (like what? Skew heaps! – don't need to wait)
– no recursion or parent pointers necessary
– we didn't cover top-down in class

• Splay trees are very effective search trees
– Relatively simple
– No extra fields required

– Excellent locality properties: frequently accessed keys are cheap to find

What happens to nodes that never get accessed? (They tend to drop to the bottom.)

Splay E

[Diagram: splaying E to the root of a tree containing A through I; successive zig-zig / zig-zag steps roughly halve the depth of the other nodes on the access path]

Other Possibilities?

• Could use different balance conditions, different ways to maintain balance, different guarantees on running time, …

• Many other balanced BST data structures
– Red-Black trees
– AA trees
– Splay Trees
– 2-3 Trees
– B-Trees
– …

Red-Black Trees (not on midterm)

Structure property:
– Every node is "colored" either red or black.
– The root is black.
– If a node is red, its children are black. (A leaf can be red.)
– For each node, all paths down to a null pointer must contain the same number of black nodes.
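As an illustration only (all names are my own, not the textbook's or Java's), a node with a color bit and a recursive check of the equal-black-count and no-red-red properties might look like:

```java
// Sketch of a red-black node and a structure-property check.
// blackHeight returns the number of black nodes on every path to null,
// or -1 if the paths disagree or a red node has a red child.
class RBNode {
    int key;
    boolean red;          // red = true, black = false
    RBNode left, right;
}

class RBCheck {
    static int blackHeight(RBNode n) {
        if (n == null) return 0;                      // null pointers count as black
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || r < 0 || l != r) return -1;      // unequal black counts below
        boolean redChild = (n.left != null && n.left.red)
                        || (n.right != null && n.right.red);
        if (n.red && redChild) return -1;             // red node with a red child
        return l + (n.red ? 0 : 1);
    }
}
```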


Red-Black Trees (not on midterm)

Notes:
• Uses the standard rotations, plus some coloring operations, to maintain structure.
• Worst case find, insert, delete: O(log n)
• Has nice top-down, non-recursive implementation.
• Java uses top-down red-black trees (TreeMap)


Treaps (not on midterm)

Order property:
• Each node has a randomly assigned priority value, in addition to its key value.
• Tree has both BST and heap order!

[Diagram: a treap; orange = low priority value, yellow = high priority value]
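A minimal treap node sketch (names are my own): each node stores its BST key plus a randomly assigned priority, and the tree must keep BST order on keys and heap order on priorities at the same time.

```java
import java.util.Random;

// Treap node sketch: key ordered as in a BST, priority ordered as in a heap.
class TreapNode {
    static final Random RNG = new Random();
    final int key;
    final double priority = RNG.nextDouble();   // assigned once, at creation
    TreapNode left, right;
    TreapNode(int key) { this.key = key; }
}
```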

Midterm Comments and Plans
