1 B trees Nodes have more than 2 children Each internal node has between k and 2k children and between k-1 and 2k-1 keys A leaf has between k-1 and 2k-1 keys The root has at least 2 children All leaves are at the same distance from the root

Dec 19, 2015


1

B trees

• Nodes have more than 2 children

• Each internal node has between k and 2k children and between k-1 and 2k-1 keys

• A leaf has between k-1 and 2k-1 keys

• The root has at least 2 children

• All leaves are at the same distance from the root
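The rules above can be turned into a small checker. This is only a sketch: the `Node` class and the `check` function are hypothetical names introduced here, a leaf is modelled as a node without children, and the example tree is hand-built for k = 2.

```python
class Node:
    """A B-tree node; a leaf is a node with no children (hypothetical layout)."""
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)


def check(node, k, is_root=True):
    """Return the depth of the leaves below `node`, or raise AssertionError."""
    if not node.children:                      # leaf
        assert k - 1 <= len(node.keys) <= 2 * k - 1, "leaf key count"
        return 0
    low = 2 if is_root else k                  # the root only needs 2 children
    assert low <= len(node.children) <= 2 * k, "child count"
    assert len(node.keys) == len(node.children) - 1, "keys vs. children"
    depths = {check(c, k, is_root=False) for c in node.children}
    assert len(depths) == 1, "leaves at different depths"
    return depths.pop() + 1


# A hand-built example with k = 2 (every internal node has 2 to 4 children).
tree = Node([14], [
    Node([5, 9], [Node([1, 3]), Node([5]), Node([9])]),
    Node([15, 30], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
print(check(tree, k=2))   # prints 2: all leaves are two levels below the root
```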


2

2-4 tree and General k

• k = 2

• Each node has 2, 3, or 4 children

• Which is better: k = 2 or k >> 2?

• Depth? Large k is better

• But what about degree (the work done inside each node)? Small k is better

• Overall: a search costs about $k \log_k n$
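To make the trade-off concrete, the sketch below tabulates the depth and the rough search cost for a few values of k. It assumes the simplest model (every node has exactly k children and is scanned linearly); the function names are illustrative only.

```python
def levels(n, k):
    """Smallest h with k**h >= n: levels needed when every node has k children."""
    h, reach = 0, 1
    while reach < n:
        reach *= k
        h += 1
    return h


def linear_search_cost(n, k):
    """Roughly k key comparisons per node, one node per level."""
    return k * levels(n, k)


n = 2 ** 30
for k in (2, 4, 32, 1024):
    # columns: k, depth, approximate number of comparisons
    print(k, levels(n, k), linear_search_cost(n, k))
```

The depth shrinks as k grows, while the comparison count grows, which is the tension the slide points at.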


3

A 4-node

10 30 35

key < 10

10 ≤ key < 30

30 ≤ key < 35

35 ≤ key
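The four ranges above amount to a rule for choosing which child to follow. A minimal sketch using Python's standard `bisect` module (the function name `child_index` is an assumption made for this example):

```python
from bisect import bisect_right

keys = [10, 30, 35]          # the keys of the 4-node on the slide


def child_index(keys, key):
    """Index of the child to follow: child i covers keys[i-1] <= key < keys[i]."""
    return bisect_right(keys, key)


for probe in (5, 10, 29, 30, 35, 100):
    print(probe, "-> child", child_index(keys, probe))
```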


4

B vs. B+

• In a B tree items are in every node

• In a B+ tree items are at the leaves; internal nodes hold keys only to direct the search

• The leaves are (possibly) also maintained in a linked list to allow fast sequential access
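The linked list of leaves can be sketched as follows. The `Leaf` class, `link`, and `scan` are hypothetical names; in a real tree the scan would start at the leaf found by a search rather than at the first leaf.

```python
class Leaf:
    """A B+ tree leaf with a link to its right neighbour (hypothetical layout)."""
    def __init__(self, keys):
        self.keys = list(keys)
        self.next = None


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def scan(leaf, lo, hi):
    """Yield every key with lo <= key <= hi, walking the leaf list from `leaf`."""
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return
            if key >= lo:
                yield key
        leaf = leaf.next


first = link([Leaf([1, 3]), Leaf([5]), Leaf([9]), Leaf([14]), Leaf([16, 17])])
print(list(scan(first, 3, 14)))   # prints [3, 5, 9, 14]
```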


5

A 2-4+ tree

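[Figure: an example 2-4+ tree. Node labels that survive in the transcript: 10, 15 30, 4 7 9 (also shown as 5 7 9), and the leaves 1 3, 16 17, and 30 40 50. The layout of the tree is not recoverable.]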


6

The height

• The root has at least 2 children

• At level 2 we have at least $2k$ nodes

• At level 3 we have at least $2k^2$ nodes

• At level h we have at least $2k^{h-1}$ nodes

$n \ge 2k^{h-1} \;\Rightarrow\; h \le 1 + \log_k \frac{n}{2}$
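As a quick numeric check of this bound, the sketch below finds the largest h that satisfies $2k^{h-1} \le n$ using integer arithmetic only. The function name is illustrative, and n is assumed to be at least 2.

```python
def max_height(n, k):
    """Largest h with 2 * k**(h - 1) <= n, i.e. floor(1 + log_k(n / 2))."""
    h = 1
    while 2 * k ** h <= n:     # would level h + 1 still fit?
        h += 1
    return h


n = 2 ** 30
for k in (2, 16, 1024):
    print(k, max_height(n, k))
```

For n = 2^30 this gives 30 levels at k = 2 but only 3 levels at k = 1024.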


7

Red-Black Trees

• n = $2^{30}$ ≈ $10^9$

• 30 <= height <= 60

• When the red-black tree resides on a disk, up to 60 disk accesses are made for a search

• A disk access takes about 5 milliseconds ($5 \times 10^{-3}$ sec)

• A memory access takes about 100 nanoseconds ($10^{-7}$ sec)
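The arithmetic behind these figures, worked through with the 5 ms and 100 ns access times quoted above (integer nanoseconds are used to avoid rounding noise; the variable names are illustrative):

```python
DISK_NS = 5_000_000       # about 5 milliseconds per disk access
MEMORY_NS = 100           # about 100 nanoseconds per memory access

worst_case_accesses = 60  # red-black tree height bound for n = 2**30

disk_search_ns = worst_case_accesses * DISK_NS
memory_search_ns = worst_case_accesses * MEMORY_NS

print(disk_search_ns / 1e9, "seconds per search on disk")
print(memory_search_ns / 1e3, "microseconds per search in memory")
print(DISK_NS // MEMORY_NS, "memory accesses fit in one disk access")
```

A worst-case search costs about 0.3 seconds on disk against about 6 microseconds in memory, which is why the number of disk accesses dominates.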


8

B-trees

• B-trees are used when the tree resides in secondary storage.

• k is picked according to the size of a disk block

• Since the height is smaller we do fewer I/O operations, and each single access brings in a whole node of keys
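One way to pick k from the block size is to require that a full node (2k-1 keys and 2k child pointers) fits in one block. The sketch below assumes a 4096-byte block with 8-byte keys and 8-byte pointers; these sizes and the function name are assumptions for illustration, not values from the slides.

```python
def largest_k(block_bytes, key_bytes, pointer_bytes):
    """Largest k such that 2k-1 keys and 2k pointers fit in one block."""
    # (2k - 1) * key + 2k * pointer <= block
    #   =>  k <= (block + key) / (2 * (key + pointer))
    return (block_bytes + key_bytes) // (2 * (key_bytes + pointer_bytes))


k = largest_k(block_bytes=4096, key_bytes=8, pointer_bytes=8)
print(k, (2 * k - 1) * 8 + 2 * k * 8)   # chosen k and the bytes a full node uses
```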


9

B-Trees

• Large degree B-trees are used to represent very large dictionaries that reside on disk.

• Smaller degree B-trees are used for internal-memory dictionaries to reduce cache-miss penalties.


10

Node’s structure

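• $a_i$ is a pointer to a subtree

• $p_i$ is a key

A node stores its key count j followed by alternating pointers and keys: j, $a_0, p_1, a_1, p_2, a_2, \dots, p_j, a_j$

We can search each node linearly; the total time is ≈ $k \cdot h \approx k \log_k n$

We can maintain a little red-black tree or a sorted array in each node, so that a search takes ≈ $\log_2 k \cdot h \approx \log_2 k \cdot \log_k n = \log_2 n$

With j keys the node has j+1 children, so $k-1 \le j \le 2k-1$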
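A search that does a binary search inside every node could look roughly like this. It is a sketch for a B+ tree (items only in the leaves); the `Node` class is a hypothetical layout in which `children[i]` plays the role of the pointer a_i and `keys[i-1]` the role of the key p_i.

```python
from bisect import bisect_left, bisect_right


class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)             # p_1 .. p_j, kept sorted
        self.children = list(children)     # a_0 .. a_j, empty for a leaf


def search(node, key):
    """Return True if `key` is stored in a leaf of the B+ tree rooted at `node`."""
    while node.children:                   # internal node: pick a subtree
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)        # leaf: binary search for the key
    return i < len(node.keys) and node.keys[i] == key


tree = Node([14], [
    Node([5, 9], [Node([1, 3]), Node([5]), Node([9])]),
    Node([15, 30], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
print(search(tree, 16), search(tree, 15))   # prints True False
```

Note that 15 appears as a routing key in an internal node but is not stored in any leaf, so the search reports it as absent.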


11

Insert

Worked example on a 2-4+ tree (k = 2). The slides animate the tree; the states below are reconstructed from the node labels that survive in the transcript. Leaves are written in brackets, left to right.

Start: root 14; internal nodes 5 9 and 15 30; leaves [1 3] [5] [9] under 5 9, and [14] [16 17] [30 40 50] under 15 30.

• Insert(2,T): leaf [1 3] becomes [1 2 3].

• Insert(4,T): the leaf becomes [1 2 3 4] and is split into [1 2] and [3 4]; key 3 goes to the parent, which becomes 3 5 9.

• Insert(6,T): leaf [5] becomes [5 6].

• Insert(7,T): the leaf becomes [5 6 7].

• Insert(8,T): the leaf becomes [5 6 7 8] and is split into [5 6] and [7 8]; key 7 goes to the parent, which becomes 3 5 7 9. The parent now has 4 keys and is split into 3 and 7 9; key 5 goes to the root, which becomes 5 14.

Final: root 5 14; internal nodes 3, 7 9, and 15 30; leaves [1 2] [3 4] under 3, [5 6] [7 8] [9] under 7 9, and [14] [16 17] [30 40 50] under 15 30.


28

Insert -- definition

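Add the new key in its position, say in a node v.

(*) If v has at most 3 keys, stop. If v has 4 keys, split v into a node u with 2 keys, a node w with 1 key, and a separator key (or into two leaves with 2 keys each and a separator key if v is a leaf).

If v was the root then create a new root r, holding the separator key, as the parent of u and w, and stop.

Otherwise replace v by u and w as children of p(v) and add the separator key to p(v).

Repeat (*) for v := p(v).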
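The procedure can be sketched recursively for a B+ tree with general k. This is an illustrative implementation under stated assumptions, not the author's code: `Node`, `_insert`, and `insert` are hypothetical names, a leaf is a node without children, and on a leaf split the separator is copied up as the first key of the right half. The starting tree is hand-built from the node labels that appear in the insert slides.

```python
from bisect import bisect_right, insort


class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)   # empty for a leaf


def _insert(node, key, k):
    """Insert below `node`; return (separator, new_right_node) if `node` split."""
    if not node.children:
        insort(node.keys, key)
    else:
        i = bisect_right(node.keys, key)
        split = _insert(node.children[i], key, k)
        if split:
            sep, right = split
            node.keys.insert(i, sep)
            node.children.insert(i + 1, right)
    if len(node.keys) < 2 * k:           # at most 2k-1 keys: no overflow
        return None
    if not node.children:                # leaf: k keys each, separator copied up
        right = Node(node.keys[k:])
        node.keys = node.keys[:k]
        return right.keys[0], right
    sep = node.keys[k - 1]               # internal: the separator key moves up
    right = Node(node.keys[k:], node.children[k:])
    node.keys, node.children = node.keys[:k - 1], node.children[:k]
    return sep, right


def insert(root, key, k=2):
    """Insert `key` and return the (possibly new) root."""
    split = _insert(root, key, k)
    if split:
        sep, right = split
        return Node([sep], [root, right])
    return root


root = Node([14], [
    Node([5, 9], [Node([1, 3]), Node([5]), Node([9])]),
    Node([15, 30], [Node([14]), Node([16, 17]), Node([30, 40, 50])]),
])
for key in (2, 4, 6, 7, 8):
    root = insert(root, key)
print(root.keys, [child.keys for child in root.children])
```

With k = 2 a node overflows at 4 keys, which matches the rule stated above; duplicates are not rejected in this sketch.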


29

Split

Before: an overflowing node with 2k keys

(2k) $a_0\ p_1\ a_1\ p_2\ a_2 \dots p_{2k}\ a_{2k}$

After: two nodes with k-1 and k keys

(k-1) $a_0\ p_1\ a_1\ p_2\ a_2 \dots p_{k-1}\ a_{k-1}$

(k) $a_k\ p_{k+1}\ a_{k+1} \dots p_{2k}\ a_{2k}$

• $p_k$ is inserted in the parent.

• Takes O(k) time
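The index arithmetic of the split can be checked with plain lists. In this sketch `pointers` stands for a_0..a_2k and `keys` for p_1..p_2k; the function name and the tuple layout of the result are assumptions made for the example.

```python
def split(pointers, keys, k):
    """Split an overflowing node with 2k keys and 2k+1 pointers."""
    assert len(keys) == 2 * k and len(pointers) == 2 * k + 1
    left = (pointers[:k], keys[:k - 1])    # a_0..a_{k-1} and p_1..p_{k-1}
    middle = keys[k - 1]                   # p_k, which goes to the parent
    right = (pointers[k:], keys[k:])       # a_k..a_2k and p_{k+1}..p_2k
    return left, middle, right


left, middle, right = split(["a0", "a1", "a2", "a3", "a4"], [3, 5, 7, 9], k=2)
print(left, middle, right)
```

The slices copy O(k) entries, in line with the O(k) cost stated above.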


32

Insert (summary)

• O(log n) time to find the position, followed by at most O($\log_k n$) splits; each split takes O(k) time

• Can show that the amortized # of splits is O(1) per insert


33

Delete

Worked example, continuing from the tree at the end of the insert example. The states below are reconstructed from the node labels that survive in the transcript. Leaves are written in brackets, left to right.

Start: root 5 14; internal nodes 3, 7 9, and 15 30; leaves [1 2] [3 4] under 3, [5 6] [7 8] [9] under 7 9, and [14] [16 17] [30 40 50] under 15 30.

• delete(14,T): leaf [14] loses its only key and is removed; its parent 15 30 loses a child and becomes 30, with leaves [16 17] [30 40 50].

• delete(17,T): leaf [16 17] becomes [16].

• delete(16,T): leaf [16] is removed, so node 30 is left with a single child [30 40 50].

• Borrow: the sibling 7 9 has 3 children, so node 30 borrows the leaf [9] from it. The sibling becomes 7, and the root becomes 5 9.

• delete(9,T): leaf [9] is removed, so node 30 again has a single child.

• Fusion: the sibling 7 has only 2 children, so the two nodes are fused into 7 30 with leaves [5 6] [7 8] [30 40 50]; the root becomes 5.

Final: root 5; internal nodes 3 and 7 30; leaves [1 2] [3 4] under 3, and [5 6] [7 8] [30 40 50] under 7 30.


45

Delete -- definition

Remove the key from its leaf.

If it was the only key in the leaf, remove the leaf and let v be the parent that loses a child; otherwise return.

(*) If v still has at least 2 children, return. If v has one child and v is the root, discard v (its child becomes the new root).

Otherwise (v is not the root), if v has a sibling w of degree 3 or 4, borrow a child from w to v and terminate.

Otherwise, fuse v with its sibling into a degree-3 node and repeat (*) with the parent of v.
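The borrow and fuse steps can be sketched as a single repair function on the parent. Everything here is an assumption made for illustration: the `Node` layout, the names `min_key` and `rebalance`, the convention that a node left with one child has an empty key list, and the choice to rebuild routing keys as the smallest key of the subtree to their right (chosen because it reproduces the labels in the delete slides).

```python
class Node:
    def __init__(self, keys, children=()):
        self.keys = list(keys)
        self.children = list(children)   # empty for a leaf


def min_key(node):
    """Smallest key stored below `node`."""
    while node.children:
        node = node.children[0]
    return node.keys[0]


def rebalance(parent, i):
    """Repair parent.children[i], which has one child and no keys left.

    Returns 'borrow' or 'fuse'; after 'fuse' the caller re-checks `parent`.
    """
    v = parent.children[i]
    if i > 0:                                    # use the left sibling
        w, s = parent.children[i - 1], i - 1
        if len(w.children) >= 3:
            moved = w.children.pop()
            w.keys.pop()
            v.keys.insert(0, min_key(v.children[0]))
            v.children.insert(0, moved)
            parent.keys[s] = min_key(moved)
            return "borrow"
        w.keys.append(min_key(v.children[0]))    # fuse v into w
        w.children += v.children
        parent.children.pop(i)
    else:                                        # use the right sibling
        w, s = parent.children[1], 0
        if len(w.children) >= 3:
            moved = w.children.pop(0)
            w.keys.pop(0)
            v.keys.append(min_key(moved))
            v.children.append(moved)
            parent.keys[s] = min_key(w.children[0])
            return "borrow"
        v.keys.append(min_key(w.children[0]))    # fuse w into v
        v.keys += w.keys
        v.children += w.children
        parent.children.pop(1)
    parent.keys.pop(s)
    return "fuse"


# State after delete(16,T) in the slides: the last node has a single child.
parent = Node([5, 14], [
    Node([3], [Node([1, 2]), Node([3, 4])]),
    Node([7, 9], [Node([5, 6]), Node([7, 8]), Node([9])]),
    Node([], [Node([30, 40, 50])]),
])
first = rebalance(parent, 2)
after_borrow = (list(parent.keys), [c.keys[:] for c in parent.children])

# delete(9,T): remove the leaf [9] and its routing key, then repair again.
v = parent.children[2]
v.children.pop(0)
v.keys.pop(0)
second = rebalance(parent, 2)
print(first, after_borrow)
print(second, parent.keys, [c.keys for c in parent.children])
```

Removing the leaf itself and discarding a root with one child are left to the caller in this sketch.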