1 Lecture 19: B-trees and Hash Tables Wednesday, November 12, 2003

Dec 21, 2015


Lecture 19: B-trees and Hash Tables

Wednesday, November 12, 2003


Outline

• B-trees (13.3)

• Hash-tables (13.4)


B+ Trees

• Search trees

• Idea in B-trees:
  – Make 1 node = 1 block

• Idea in B+ trees:
  – Make the leaves into a linked list (range queries are easier)

B+ Trees Basics

• Parameter d = the degree

• Each interior node has >= d and <= 2d keys (except the root)

• Each leaf has >= d and <= 2d keys

[Figure: an interior node with keys 30, 120, 240. Its four pointers lead to subtrees holding keys k < 30, keys 30 <= k < 120, keys 120 <= k < 240, and keys 240 <= k.]

[Figure: a leaf with keys 40, 50, 60. Each key points to its record (40, 50, 60), and a last pointer leads to the next leaf.]


B+ Tree Example

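d = 2

[Figure: B+ tree. Root: 80. Interior nodes: (20, 60) and (100, 120, 140). Leaves shown, left to right: (10, 15, 18), (20, 30, 40, 50), (60, 65), (80, 85, 90). Each leaf key points to its record.]

Find the key 40:

• At the root: 40 <= 80

• At the node (20, 60): 20 < 40 <= 60

• At the leaf: 30 < 40 <= 40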


B+ Tree Design

• How large should d be?

• Example:
  – Key size = 4 bytes
  – Pointer size = 8 bytes
  – Block size = 4096 bytes

• 2d x 4 + (2d+1) x 8 <= 4096

• d = 170
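
The calculation above can be checked with a few lines of Python. This is only a sketch: the function name `max_degree` and its parameter names are made up for illustration, and it assumes a node stores exactly 2d keys and 2d+1 pointers with no other per-block overhead.

```python
def max_degree(key_size: int, pointer_size: int, block_size: int) -> int:
    """Largest d such that 2d keys and 2d+1 pointers fit in one block."""
    # 2d * key_size + (2d + 1) * pointer_size <= block_size
    # => d <= (block_size - pointer_size) / (2 * (key_size + pointer_size))
    return (block_size - pointer_size) // (2 * (key_size + pointer_size))


d = max_degree(key_size=4, pointer_size=8, block_size=4096)
print(d)  # 170
```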


Searching a B+ Tree

• Exact key values:
  – Start at the root
  – Proceed down to the leaf

    Select name
    From people
    Where age = 25

• Range queries:
  – As above
  – Then sequential traversal

    Select name
    From people
    Where 20 <= age and age <= 30
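
A minimal in-memory sketch of both kinds of search follows. The class and function names (`Leaf`, `Interior`, `find_leaf`, `lookup`, `range_query`) are hypothetical, record pointers are omitted, and the tree is a simplified two-level one built by hand from the leaves of the earlier example. It assumes the convention that child j of an interior node holds the keys with keys[j-1] <= k < keys[j].

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys):
        self.keys = keys      # sorted keys; record pointers omitted
        self.next = None      # link to the next leaf


class Interior:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children  # len(children) == len(keys) + 1


def find_leaf(node, k):
    """Start at the root and proceed down to the leaf that may hold k."""
    while isinstance(node, Interior):
        node = node.children[bisect_right(node.keys, k)]
    return node


def lookup(root, k):
    """Exact-match search."""
    return k in find_leaf(root, k).keys


def range_query(root, lo, hi):
    """Find the leaf for lo, then traverse the leaf list sequentially."""
    leaf, out = find_leaf(root, lo), []
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out


# Hand-built example tree (simplified to two levels).
leaves = [Leaf([10, 15, 18]), Leaf([20, 30, 40, 50]), Leaf([60, 65]), Leaf([80, 85, 90])]
for a, b in zip(leaves, leaves[1:]):
    a.next = b
root = Interior([20, 60, 80], leaves)

print(lookup(root, 40))           # True
print(range_query(root, 20, 65))  # [20, 30, 40, 50, 60, 65]
```

In a disk-based tree each step of `find_leaf` would be one block read, and each hop along `next` one more.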


B+ Trees in Practice

• Typical order: 100. Typical fill-factor: 67%.
  – Average fanout = 133

• Typical capacities:
  – Height 4: 133^4 = 312,900,721 records
  – Height 3: 133^3 = 2,352,637 records

• Can often hold top levels in buffer pool:
  – Level 1 = 1 page = 8 KBytes
  – Level 2 = 133 pages = about 1 MByte
  – Level 3 = 17,689 pages = about 138 MBytes
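
The arithmetic behind these figures can be reproduced as follows. The helper names are illustrative only; the fanout of 133 and the 8 KByte page size are taken from the slide, and sizes are computed exactly, so they can differ slightly from rounded figures.

```python
FANOUT = 133   # average fanout from the slide
PAGE_KB = 8    # page size from the slide


def records(height: int) -> int:
    """Number of records reachable through a tree of the given height."""
    return FANOUT ** height


def pages_at_level(level: int) -> int:
    """Number of pages on a level, with the root as level 1."""
    return FANOUT ** (level - 1)


for height in (3, 4):
    print(f"height {height}: {records(height):,} records")

for level in (1, 2, 3):
    pages = pages_at_level(level)
    print(f"level {level}: {pages:,} pages, {pages * PAGE_KB / 1024:.1f} MBytes")
```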


Insertion in a B+ Tree

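Insert (K, P)

• Find the leaf where K belongs, insert

• If no overflow (2d keys or fewer), halt

• If overflow (2d+1 keys), split the node and insert the middle key in the parent:
  – Before the split: keys K1 K2 K3 K4 K5, pointers P0 P1 P2 P3 P4 P5
  – After the split: the left node has keys K1 K2 and pointers P0 P1 P2; the right node has keys K4 K5 and pointers P3 P4 P5; K3 goes to the parent

• If the node is a leaf, keep K3 in the right node too

• When the root splits, the new root has only 1 key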
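
The split rule can be sketched on plain Python lists. The function names `split_interior` and `split_leaf` are hypothetical, nodes are reduced to lists of keys and pointers, and inserting the returned key into the parent (which may overflow in turn) is left out.

```python
def split_interior(keys, pointers):
    """Split an interior node with 2d+1 keys; the middle key moves up."""
    mid = len(keys) // 2
    left = (keys[:mid], pointers[:mid + 1])
    right = (keys[mid + 1:], pointers[mid + 1:])
    return left, keys[mid], right


def split_leaf(keys):
    """Split a leaf with 2d+1 keys; the middle key is copied up and also stays in the right leaf."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right[0], right


left, up, right = split_interior(
    ["K1", "K2", "K3", "K4", "K5"],
    ["P0", "P1", "P2", "P3", "P4", "P5"],
)
print(left, up, right)

print(split_leaf([20, 25, 30, 40, 50]))  # ([20, 25], 30, [30, 40, 50])
```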

Insertion in a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 60) and (100, 120, 140). Leaves shown: (10, 15, 18), (20, 30, 40, 50), (60, 65), (80, 85, 90).]

Insert K=19

Insertion in a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 30, 40, 50), (60, 65), (80, 85, 90).]

After insertion

Insertion in a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 30, 40, 50), (60, 65), (80, 85, 90).]

Now insert 25

Insertion in a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 25, 30, 40, 50), (60, 65), (80, 85, 90).]

After insertion

Insertion in a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 25, 30, 40, 50), (60, 65), (80, 85, 90).]

But now have to split!

Insertion in a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 25), (30, 40, 50), (60, 65), (80, 85, 90).]

After the split

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 25), (30, 40, 50), (60, 65), (80, 85, 90).]

Delete 30

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 25), (40, 50), (60, 65), (80, 85, 90).]

After deleting 30

The key 30 in the interior node may change to 40, or not

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20, 25), (40, 50), (60, 65), (80, 85, 90).]

Now delete 25

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (20, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18, 19), (20), (40, 50), (60, 65), (80, 85, 90).]

After deleting 25

Need to rebalance

Rotate

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (19, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18), (19, 20), (40, 50), (60, 65), (80, 85, 90).]

Now delete 40

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (19, 30, 60) and (100, 120, 140). Leaves shown: (10, 15, 18), (19, 20), (50), (60, 65), (80, 85, 90).]

After deleting 40

Rotation not possible

Need to merge nodes

Deletion from a B+ Tree

[Figure: B+ tree. Root: 80. Interior nodes: (19, 60) and (100, 120, 140). Leaves shown: (10, 15, 18), (19, 20, 50), (60, 65), (80, 85, 90).]

Final tree
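
The deletion sequence in this example (delete 30, then 25, then 40) can be replayed with a small sketch of the leaf-level logic. It is a simplification under stated assumptions: `delete_from_leaf` is a made-up name, one interior node is represented by its list of separator keys `seps` and its list of leaves, the next-leaf links are not modelled, and underflow of the interior node itself after a merge is not handled.

```python
def delete_from_leaf(seps, leaves, j, key, d):
    """Delete key from leaves[j]; seps[t] separates leaves[t] and leaves[t + 1]."""
    leaves[j].remove(key)
    if len(leaves[j]) >= d:
        return  # no underflow, nothing else to do
    if j > 0 and len(leaves[j - 1]) > d:
        # Rotate: borrow the largest key of the left sibling.
        leaves[j].insert(0, leaves[j - 1].pop())
        seps[j - 1] = leaves[j][0]
    elif j + 1 < len(leaves) and len(leaves[j + 1]) > d:
        # Rotate: borrow the smallest key of the right sibling.
        leaves[j].append(leaves[j + 1].pop(0))
        seps[j] = leaves[j + 1][0]
    elif j > 0:
        # Merge into the left sibling and drop the separator between them.
        leaves[j - 1].extend(leaves.pop(j))
        seps.pop(j - 1)
    else:
        # Merge the right sibling into this leaf.
        leaves[j].extend(leaves.pop(j + 1))
        seps.pop(j)


# Left subtree of the example tree, d = 2.
seps = [20, 30, 60]
leaves = [[10, 15, 18, 19], [20, 25], [30, 40, 50], [60, 65]]

delete_from_leaf(seps, leaves, 2, 30, d=2)  # no underflow; separator 30 is left as is
delete_from_leaf(seps, leaves, 1, 25, d=2)  # underflow, rotate 19 from the left sibling
delete_from_leaf(seps, leaves, 1, 40, d=2) if 40 in leaves[1] else delete_from_leaf(seps, leaves, 2, 40, d=2)

print(seps)    # [19, 60]
print(leaves)  # [[10, 15, 18], [19, 20, 50], [60, 65]]
```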


In Class

• Suppose the B+ tree has depth 4 and degree d=200

• How many records does the relation have (maximum) ?

• How many index blocks do we need to read and/or write during:
  – A key lookup
  – An insertion
  – A deletion


Hash Tables

• Secondary storage hash tables are much like main memory ones

• Recall basics:
  – There are n buckets
  – A hash function f(k) maps a key k to {0, 1, …, n-1}
  – Store in bucket f(k) a pointer to the record with key k

• Secondary storage: bucket = block, use overflow blocks when needed

Hash Table Example

• Assume 1 bucket (block) stores 2 keys + pointers

• h(e)=0

• h(b)=h(f)=1

• h(g)=2

• h(a)=h(c)=3

[Figure: buckets 0 to 3. Bucket 0: e. Bucket 1: b, f. Bucket 2: g. Bucket 3: a, c.]

Searching in a Hash Table

• Search for a:

• Compute h(a)=3

• Read bucket 3

• 1 disk access

[Figure: buckets 0 to 3. Bucket 0: e. Bucket 1: b, f. Bucket 2: g. Bucket 3: a, c.]

Insertion in Hash Table

• Place in the right bucket, if space

• E.g. h(d)=2

[Figure: buckets 0 to 3. Bucket 0: e. Bucket 1: b, f. Bucket 2: g, d. Bucket 3: a, c.]

Insertion in Hash Table

• Create an overflow block, if no space

• E.g. h(k)=1

• More overflow blocks may be needed

[Figure: buckets 0 to 3. Bucket 0: e. Bucket 1: b, f, with an overflow block holding k. Bucket 2: g, d. Bucket 3: a, c.]
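
The example can be replayed with a small in-memory sketch. The names `Block` and `HashTable` are hypothetical, a block is modelled as a short list of keys, and the hash values are simply looked up in a table copied from the slides. `search` also counts how many blocks it reads, which stands in for the number of disk accesses.

```python
class Block:
    def __init__(self):
        self.keys = []
        self.overflow = None  # next overflow block in the chain, if any


class HashTable:
    def __init__(self, n, capacity, h):
        self.h, self.capacity = h, capacity
        self.buckets = [Block() for _ in range(n)]

    def insert(self, key):
        block = self.buckets[self.h(key)]
        # Walk the chain until a block with free space is found,
        # creating a new overflow block at the end if needed.
        while len(block.keys) == self.capacity:
            if block.overflow is None:
                block.overflow = Block()
            block = block.overflow
        block.keys.append(key)

    def search(self, key):
        """Return (found, number of blocks read)."""
        block, reads = self.buckets[self.h(key)], 0
        while block is not None:
            reads += 1
            if key in block.keys:
                return True, reads
            block = block.overflow
        return False, reads


# Hash values as given in the example.
H = {"e": 0, "b": 1, "f": 1, "g": 2, "a": 3, "c": 3, "d": 2, "k": 1}
table = HashTable(n=4, capacity=2, h=H.__getitem__)
for key in "ebfgacdk":
    table.insert(key)

print(table.search("a"))  # (True, 1)
print(table.search("k"))  # (True, 2): bucket 1 plus its overflow block
```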


Hash Table Performance

• Excellent, if no overflow blocks

• Degrades considerably when the number of keys exceeds the number of buckets (i.e., when there are many overflow blocks).


Extensible Hash Table

• Allows the hash table to grow, to avoid performance degradation

• Assume a hash function h that returns numbers in {0, …, 2^k - 1}

• Start with n = 2^i << 2^k; only look at the first i most significant bits

Extensible Hash Table

• E.g. i=1, n=2^i=2, k=4

• Note: we only look at the first bit (0 or 1)

[Figure: directory with i=1 and entries 0 and 1. Entry 0 points to a bucket holding 0(010); entry 1 points to a bucket holding 1(011). Each bucket is labeled 1, the number of bits it uses.]

Insertion in Extensible Hash Table

• Insert 1110

[Figure: directory with i=1 and entries 0 and 1. Entry 0 points to a bucket holding 0(010), labeled 1. Entry 1 points to a bucket holding 1(011) and 1(110), labeled 1.]

Insertion in Extensible Hash Table

• Now insert 1010

• Need to extend table, split blocks

• i becomes 2

[Figure: directory with i=1 and entries 0 and 1. Entry 0 points to a bucket holding 0(010), labeled 1. Entry 1 points to a bucket labeled 1 that holds 1(011) and 1(110); the new key 1(010) does not fit.]

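Insertion in Extensible Hash Table

[Figure: directory with i=2 and entries 00, 01, 10, 11. Entries 00 and 01 both point to the bucket holding 0(010), labeled 1. Entry 10 points to a bucket holding 10(11) and 10(10), labeled 2. Entry 11 points to a bucket holding 11(10), labeled 2.]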

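Insertion in Extensible Hash Table

• Now insert 0000, then 0101

• Need to split block

[Figure: directory with i=2 and entries 00, 01, 10, 11. Entries 00 and 01 both point to a bucket labeled 1 that holds 0(010) and 0(000); the new key 0(101) does not fit. Entry 10 points to a bucket holding 10(11) and 10(10), labeled 2. Entry 11 points to a bucket holding 11(10), labeled 2.]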

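Insertion in Extensible Hash Table

• After splitting the block

[Figure: directory with i=2 and entries 00, 01, 10, 11. Entry 00 points to a bucket holding 00(10) and 00(00), labeled 2. Entry 01 points to a bucket holding 01(01), labeled 2. Entry 10 points to a bucket holding 10(11) and 10(10), labeled 2. Entry 11 points to a bucket holding 11(10), labeled 2.]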
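
The whole insertion sequence can be replayed with a compact sketch. Everything here is an illustration under assumptions: `Bucket` and `ExtensibleHash` are made-up names, keys are the k-bit hash values themselves, a bucket holds 2 keys, and the directory is indexed by the first i bits as in the example.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth  # number of leading bits this bucket uses
        self.keys = []


class ExtensibleHash:
    def __init__(self, k=4, capacity=2):
        self.k, self.capacity, self.i = k, capacity, 1
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, h):
        return h >> (self.k - self.i)  # first i bits of the k-bit value

    def insert(self, h):
        bucket = self.directory[self._index(h)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(h)
            return
        if bucket.depth == self.k:
            raise OverflowError("too many keys with the same hash value")
        if bucket.depth == self.i:
            # Extend the table: every directory entry is duplicated.
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.i += 1
        # Split the full bucket on its next bit.
        bucket.depth += 1
        new = Bucket(bucket.depth)
        bit = self.k - bucket.depth
        old_keys, bucket.keys = bucket.keys, []
        for x in old_keys:
            (new if (x >> bit) & 1 else bucket).keys.append(x)
        # Redirect the directory entries whose next bit is 1.
        for j, b in enumerate(self.directory):
            if b is bucket and (j >> (self.i - bucket.depth)) & 1:
                self.directory[j] = new
        self.insert(h)  # retry; may trigger another split


t = ExtensibleHash(k=4, capacity=2)
for h in (0b0010, 0b1011, 0b1110, 0b1010, 0b0000, 0b0101):
    t.insert(h)

for j, b in enumerate(t.directory):
    print(format(j, "02b"), [format(x, "04b") for x in b.keys], "depth", b.depth)
```

Only the bucket that overflows is split; the other buckets are untouched, although a directory extension rewrites every directory entry.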


Extensible Hash Table

• How many buckets (blocks) do we need to touch after an insertion ?

• How many entries in the hash table do we need to touch after an insertion ?


Performance Extensible Hash Table

• No overflow blocks: access always one read

• BUT:
  – Extensions can be costly and disruptive
  – After an extension, the table may no longer fit in memory


Linear Hash Table

• Idea: extend only one entry at a time

• Problem: n is no longer a power of 2

• Let i be such that 2^(i-1) < n <= 2^i (buckets are numbered 0 to n-1)

• After computing h(k), use the last i bits:
  – If the last i bits represent a number >= n, change the msb from 1 to 0 (get a number < n)
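
The bucket computation can be sketched as below. The function name `bucket_index` is made up, and the sketch assumes buckets are numbered 0 to n-1 and that i is the number of bits needed to address them, which matches the n=3, i=2 example that follows.

```python
def bucket_index(h: int, n: int) -> int:
    """Bucket for hash value h in a linear hash table with n buckets (0..n-1)."""
    i = max(1, (n - 1).bit_length())  # bits needed to address n buckets
    m = h & ((1 << i) - 1)            # last i bits of h
    if m >= n:                        # that bucket does not exist yet
        m -= 1 << (i - 1)             # change the msb from 1 to 0
    return m


print(bucket_index(0b0100, 3))  # 0
print(bucket_index(0b1010, 3))  # 2
print(bucket_index(0b0111, 3))  # 1 (bit flip: 11 becomes 01)
```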

Linear Hash Table Example

• n=3

[Figure: i=2, buckets 00, 01, 10. Bucket 00 holds (01)00 and (11)00. Bucket 01 holds (01)11 (bit flip: its last two bits 11 become 01). Bucket 10 holds (10)10.]

Linear Hash Table Example

• Insert 1000: overflow blocks…

[Figure: i=2, buckets 00, 01, 10. Bucket 00 holds (01)00 and (11)00, with an overflow block holding (10)00. Bucket 01 holds (01)11. Bucket 10 holds (10)10.]


Linear Hash Tables

• Extension: independent of overflow blocks

• Extend n:=n+1 when the average number of records per block exceeds (say) 80% of the block capacity
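
A rough sketch of this growth policy follows. The class name `LinearHash` and its parameters are hypothetical, a Python list stands in for a block together with its overflow chain, and keys are the hash values themselves. Buckets are assumed to be numbered 0 to n-1, with i the number of bits needed to address them.

```python
class LinearHash:
    def __init__(self, n=2, capacity=2, threshold=0.8):
        self.capacity, self.threshold = capacity, threshold
        self.buckets = [[] for _ in range(n)]  # list = block plus overflow chain
        self.count = 0

    @staticmethod
    def _index(h, n):
        i = max(1, (n - 1).bit_length())       # bits needed for n buckets
        m = h & ((1 << i) - 1)                 # last i bits
        return m - (1 << (i - 1)) if m >= n else m

    def insert(self, h):
        n = len(self.buckets)
        self.buckets[self._index(h, n)].append(h)
        self.count += 1
        # Extend when the average occupancy exceeds the threshold,
        # regardless of whether this insertion caused an overflow.
        if self.count > self.threshold * self.capacity * n:
            self._extend()

    def _extend(self):
        n = len(self.buckets)                  # number of the new bucket
        self.buckets.append([])
        i = max(1, n.bit_length())             # bits needed for n + 1 buckets
        src = n - (1 << (i - 1))               # the only bucket whose keys can move
        old, self.buckets[src] = self.buckets[src], []
        for h in old:
            self.buckets[self._index(h, n + 1)].append(h)


t = LinearHash(n=3, capacity=2, threshold=0.8)
for h in (0b0100, 0b1100, 0b1010, 0b0111, 0b1000):
    t.insert(h)

for j, keys in enumerate(t.buckets):
    print(format(j, "02b"), [format(x, "04b") for x in keys])
```

In this run the fifth insertion pushes the occupancy above 80%, so the table grows from n=3 to n=4, and only the keys of bucket 01 are rehashed.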

Linear Hash Table Extension

• From n=3 to n=4

• Only need to touch one block (which one ?)

[Figure, before (n=3, i=2): bucket 00 holds (01)00 and (11)00; bucket 01 holds (01)11; bucket 10 holds (10)10.]

[Figure, after (n=4, i=2): bucket 00 holds (01)00 and (11)00; bucket 01 is empty; bucket 10 holds (10)10; the new bucket 11 holds (01)11.]

Linear Hash Table Extension

• From n=3 to n=4 finished

• Extension from n=4 to n=5 (new bit)

• Need to touch every single block (why ?)

[Figure (n=4, i=2): bucket 00 holds (01)00 and (11)00; bucket 01 is empty; bucket 10 holds (10)10; bucket 11 holds (01)11.]