Digital search trees - TUM

Digital search trees

Analysis of different digital trees with Rice’s integrals.

JASS

Nicolai v. Hoyningen-Huene

28.3.2004

28.3.2004 JASS 04 - Digital search trees 1

content

➮ Tree

➮ Digital search tree:

• Definition

• Average case analysis

➮ Tries:

• Definition

• Average case analysis

➮ General framework

Tree

Definition 0.1 A tree can be defined in several ways:

1. A connected, undirected, acyclic graph. It is rooted and ordered unless otherwise specified.

2. A data structure accessed beginning at the root node. Each node is either a leaf or an internal node. An internal node has one or more child nodes and is called the parent of its child nodes. All children of the same node are siblings.

3. A tree is either empty (no nodes), or a root and zero or more subtrees. The subtrees are ordered.

Search tree

Digital search tree

A digital search tree is a dictionary implemented as a digital tree which stores strings in internal nodes, so there is no need for extra leaf nodes to store the strings.
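A minimal sketch of such a tree for binary strings may make the definition concrete. It assumes distinct keys of equal length over the alphabet {0, 1}; the names `Node`, `insert`, `contains` and `internal_path_length` are illustrative and do not come from the slides. Each node stores a whole key, and the bit at position `depth` decides whether the search continues left or right.

```python
from typing import Optional


class Node:
    """One node of a binary digital search tree; it stores a full key."""

    def __init__(self, key: str) -> None:
        self.key = key
        self.left: Optional["Node"] = None   # followed when the current bit is '0'
        self.right: Optional["Node"] = None  # followed when the current bit is '1'


def insert(root: Optional[Node], key: str) -> Node:
    """Insert a bit string; the bit at index `depth` selects the branch."""
    if root is None:
        return Node(key)
    node, depth = root, 0
    while node.key != key:
        side = "left" if key[depth] == "0" else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))  # the key lives in an internal node
            break
        node, depth = child, depth + 1
    return root


def contains(root: Optional[Node], key: str) -> bool:
    """Search by comparing the stored key, then branching on the next bit."""
    node, depth = root, 0
    while node is not None:
        if node.key == key:
            return True
        node = node.left if key[depth] == "0" else node.right
        depth += 1
    return False


def internal_path_length(node: Optional[Node], depth: int = 0) -> int:
    """Sum of the depths of all nodes (the root has depth 0)."""
    if node is None:
        return 0
    return (depth
            + internal_path_length(node.left, depth + 1)
            + internal_path_length(node.right, depth + 1))


root = None
for k in ["0110", "1010", "0001", "1100", "0111"]:
    root = insert(root, k)
```

In this example the root holds the first key, two keys sit at depth 1 and two at depth 2, so no separate leaf nodes are needed.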

Internal path length

The internal path length of a tree is the sum of the depths of all nodes of the tree.

Fundamental recurrence relation

$$A_N = N - 1 + \sum_{k=0}^{\infty} \frac{1}{2^{N-1}} \binom{N-1}{k} \left(A_k + A_{N-1-k}\right), \quad N \ge 1$$

with $A_0 := 0$.
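The recurrence can be evaluated directly for small N. The following is a sketch using exact rational arithmetic; the function name `internal_path_lengths` is an illustrative choice. Terms with k > N-1 are skipped because the binomial coefficient vanishes there.

```python
from fractions import Fraction
from math import comb


def internal_path_lengths(n_max: int) -> list:
    """Return [A_0, ..., A_n_max] from the fundamental recurrence."""
    a = [Fraction(0)]  # A_0 := 0
    for n in range(1, n_max + 1):
        # binom(n-1, k) = 0 for k > n-1, so the sum stops at k = n-1.
        total = sum(comb(n - 1, k) * (a[k] + a[n - 1 - k]) for k in range(n))
        a.append(n - 1 + total / 2 ** (n - 1))
    return a


values = internal_path_lengths(4)
```

For instance, a tree with two keys always has one node at depth 1, and the recurrence reproduces A_2 = 1.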

Transformation

$$\sum_{N=1}^{\infty} \frac{A_N z^{N-1}}{(N-1)!} = z e^{z} + 2 \sum_{k=0}^{\infty} \frac{A_k}{k!} \left(\frac{z}{2}\right)^{k} e^{z/2}$$

$$A'(z) = z e^{z} + 2 A\left(\frac{z}{2}\right) e^{z/2}$$

Substitution by $B(z)$

$$A(z) = e^{z} B(z) = \left(\sum_{N=0}^{\infty} \frac{z^{N}}{N!}\right) \left(\sum_{N=0}^{\infty} \frac{B_N z^{N}}{N!}\right)$$

$$A_N = \sum_{k=0}^{N} \binom{N}{k} B_k$$

$$A'(z) = z e^{z} + 2 A\left(\frac{z}{2}\right) e^{z/2}$$

$$B'(z) + B(z) = z + 2 B\left(\frac{z}{2}\right)$$

$$B_N + B_{N-1} = \frac{1}{2^{N-2}} B_{N-1}, \quad N \ge 3$$

Substitution by $B(z)$

$$B_N = (-1)^{N} \prod_{j=1}^{N-2} \left(1 - \frac{1}{2^{j}}\right)$$

Introduction of $Q_N$

$$Q_N = \prod_{j=1}^{N} \left(1 - \frac{1}{2^{j}}\right)$$

Introduction of $Q(x)$

$$Q(x) = \prod_{j=1}^{\infty} \left(1 - \frac{x}{2^{j}}\right)$$

$$Q(1) = Q_\infty$$

$Q(x)$ is used to extend $Q_N$ to a meromorphic function of $N$:

$$Q_N = \frac{Q(1)}{Q\left(2^{-N}\right)}$$
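The relation between the finite product and the infinite product can be checked numerically. This sketch truncates the infinite product after 200 factors, which is an assumption made for illustration; the names `q_finite` and `q_func` are not from the slides.

```python
def q_finite(n: int) -> float:
    """Q_n = prod_{j=1}^{n} (1 - 1/2^j)."""
    prod = 1.0
    for j in range(1, n + 1):
        prod *= 1.0 - 0.5 ** j
    return prod


def q_func(x: float, terms: int = 200) -> float:
    """Truncated product for Q(x) = prod_{j>=1} (1 - x/2^j)."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod *= 1.0 - x / 2.0 ** j
    return prod


# Largest deviation between Q_n and Q(1)/Q(2^-n) for n = 0..9.
max_gap = max(abs(q_finite(n) - q_func(1.0) / q_func(2.0 ** -n))
              for n in range(10))
```

The quotient form matters because, unlike the finite product, it still makes sense when N is replaced by a complex number.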

The function $Q(x)$

[Figure: plot of $Q(x)$ for $x$ from -10 to 10, vertical axis $y$ from -4 to 4]

Explicit formula for $A_N$

$$A_N = \sum_{k=2}^{N} \binom{N}{k} (-1)^{k} \frac{Q(1)}{Q\left(2^{-k+2}\right)}$$
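Since $Q(1)/Q(2^{-k+2})$ equals the finite product $Q_{k-2}$, the explicit formula can be compared with the recurrence in exact arithmetic. The sketch below does this for small N; the helper names `q_n`, `a_explicit` and `a_recurrence` are illustrative.

```python
from fractions import Fraction
from math import comb


def q_n(n: int) -> Fraction:
    """Q_n = prod_{j=1}^{n} (1 - 1/2^j), computed exactly."""
    prod = Fraction(1)
    for j in range(1, n + 1):
        prod *= 1 - Fraction(1, 2 ** j)
    return prod


def a_explicit(n: int) -> Fraction:
    """A_n = sum_{k=2}^{n} binom(n, k) (-1)^k Q_{k-2}."""
    terms = (comb(n, k) * (-1) ** k * q_n(k - 2) for k in range(2, n + 1))
    return sum(terms, Fraction(0))


def a_recurrence(n_max: int) -> list:
    """[A_0, ..., A_n_max] from the fundamental recurrence, with A_0 = 0."""
    a = [Fraction(0)]
    for n in range(1, n_max + 1):
        total = sum(comb(n - 1, k) * (a[k] + a[n - 1 - k]) for k in range(n))
        a.append(n - 1 + total / 2 ** (n - 1))
    return a


from_recurrence = a_recurrence(12)
agree = all(from_recurrence[n] == a_explicit(n) for n in range(13))
```

The alternating signs cause heavy cancellation for large N, which is why the asymptotic analysis works with a contour integral instead of the sum itself.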

Rice's method

$$\sum_{k=0}^{N} \binom{N}{k} (-1)^{k} f(k) = -\frac{1}{2\pi i} \int_{C} B(N+1, -z)\, f(z)\, dz$$

where $C$ is a closed contour that encircles the points $0, 1, \ldots, N$ and $f(z)$ is analytic inside $C$.
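Rice's formula can be checked numerically for a small case. The sketch below assumes a circular contour around 0..N and uses the identity B(N+1, -z) = N! / ((0-z)(1-z)...(N-z)), valid for integer N, so that no complex Gamma function is needed. The test function f(z) = 1/(z+1) and all function names are illustrative choices, not taken from the slides.

```python
import cmath
from math import comb, factorial, pi


def beta_kernel(n: int, z: complex) -> complex:
    """B(n+1, -z) = n! / prod_{j=0}^{n} (j - z) for integer n >= 0."""
    denom = 1 + 0j
    for j in range(n + 1):
        denom *= j - z
    return factorial(n) / denom


def rice_integral(n: int, f, points: int = 4096) -> complex:
    """-(1/(2 pi i)) times the integral of B(n+1,-z) f(z) over a circle.

    The circle has centre n/2 and radius n/2 + 0.5, so it encloses 0..n
    but stays to the right of z = -0.5. Trapezoidal rule in the angle.
    """
    centre, radius = n / 2, n / 2 + 0.5
    total = 0j
    for m in range(points):
        w = cmath.exp(2j * pi * m / points)
        z = centre + radius * w
        dz = 1j * radius * w * (2 * pi / points)  # dz = i r e^{i t} dt
        total += beta_kernel(n, z) * f(z) * dz
    return -total / (2j * pi)


def alternating_sum(n: int, f) -> float:
    """Left-hand side: sum_{k=0}^{n} binom(n, k) (-1)^k f(k)."""
    return sum(comb(n, k) * (-1) ** k * f(k) for k in range(n + 1))


def f(z):
    return 1 / (z + 1)  # analytic inside the contour (pole at -1 is outside)


lhs = alternating_sum(5, f)
rhs = rice_integral(5, f)
```

Both sides evaluate to 1/(N+1) for this choice of f, which follows from integrating (1-x)^N over [0, 1].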

Application of Rice's method

$$A_N = -\frac{1}{2\pi i} \int_{C} B(N+1, -z)\, \frac{Q(1)}{Q\left(2^{-z+2}\right)}\, dz$$

Beta-Function

$$B(p, q) = \frac{\Gamma(p)\, \Gamma(q)}{\Gamma(p+q)}$$

[Figure: plot of the Beta function for $z$ from -10 to 40, vertical axis from -1e+11 to 1e+11]

Gamma-Function

[Figure: plot of the Gamma function for $z$ from -4 to 4, vertical axis $y$ from -10 to 10]

Cauchy's theorem

If $f(z)$ is analytic in $C$ except for a finite number of poles $a_1, a_2, \ldots, a_n$ inside $C$, then

$$\frac{1}{2\pi i} \int_{C} f(z)\, dz = \sum_{k=1}^{n} \operatorname{Res}_{z=a_k} f(z)$$

Approximation with the rectangle $R_{XY}$ with corners $\left(\tfrac{1}{2} \pm iY,\; X \pm iY\right)$

$$A_N = -\frac{1}{2\pi i} \int_{R_{XY}} B(N+1, -z)\, \frac{Q(1)}{Q\left(2^{-z+2}\right)}\, dz - \operatorname{Res}_{R_{XY} \setminus C}$$

Approximation of the integral

$$\frac{1}{2\pi i} \int_{R_{XY}} B(N+1, -z)\, \frac{Q(1)}{Q\left(2^{-z+2}\right)}\, dz = O\left(\int_{-Y}^{Y} \frac{\Gamma(N+1)}{\Gamma\left(N + \frac{1}{2} - iy\right)}\, dy\right) = O\left(\int_{-Y}^{Y} N^{\frac{1}{2} - iy}\, dy\right) = O\left(N^{1/2}\right)$$

Residue at $z = 1$

$$-B(N+1, -z) = -\frac{N}{z-1} - N\left(H_{N-1} - 1\right) + O(z-1)$$

$$H_{N-1} = \gamma + \ln N - O\left(\frac{1}{N}\right)$$

$$-B(N+1, -z) = -\frac{N}{z-1} - N\left(\gamma + \ln N - 1\right) + O(z-1)$$

$$\frac{Q(1)}{Q\left(2^{-z+1}\right)} = 1 - \alpha \ln 2\, (z-1) + O\left((z-1)^{2}\right), \quad \alpha = \sum_{j=1}^{\infty} \frac{1}{2^{j} - 1}$$

$$\frac{1}{1 - 2^{-z+1}} = \frac{1}{1 - e^{\ln 2\,(-z+1)}} = -\frac{1}{(-z+1) \ln 2} + \frac{1}{2} - \frac{(-z+1) \ln 2}{12} + O\left((-z+1)^{3}\right) = \frac{1}{(z-1) \ln 2} + \frac{1}{2} + O(z-1)$$

$$\Delta_{z=1} = -N \lg N - N\left(\frac{\gamma - 1}{\ln 2} - \alpha + \frac{1}{2}\right) + O(1)$$

Residues at $z = j \pm \frac{2\pi i k}{\ln 2}$ for $Q\left(2^{-z+j}\right)$

$$\Delta_{z = 1 \pm \frac{2\pi i k}{\ln 2}} = -N \delta(N) + O(1)$$

where

$$\delta(N) = \frac{1}{\ln 2} \sum_{k \ne 0} \Gamma\left(-1 - \frac{2\pi i k}{\ln 2}\right) e^{2\pi i k \lg N}$$

Average case

$$A_N = N \lg N + N\left(\frac{\gamma - 1}{\ln 2} - \alpha + \frac{1}{2} + \delta(N)\right) + O\left(N^{1/2}\right)$$
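The leading terms of the average case can be compared with values from the recurrence. This is a rough numerical sketch under two assumptions that are not stated on the slides: α is taken to be the sum of 1/(2^j - 1) over j ≥ 1, and the oscillating term δ(N) is left out as negligible. The names `EULER_GAMMA`, `ALPHA`, `LINEAR_COEFF` and `internal_path_lengths` are illustrative.

```python
from math import comb, log, log2

EULER_GAMMA = 0.5772156649015329
# Assumed definition: alpha = sum_{j>=1} 1 / (2^j - 1), truncated.
ALPHA = sum(1.0 / (2 ** j - 1) for j in range(1, 200))
# Coefficient of N in the average case, without delta(N).
LINEAR_COEFF = (EULER_GAMMA - 1.0) / log(2.0) - ALPHA + 0.5


def internal_path_lengths(n_max: int) -> list:
    """[A_0, ..., A_n_max] from the fundamental recurrence, in floats."""
    a = [0.0]
    for n in range(1, n_max + 1):
        total = sum(comb(n - 1, k) * (a[k] + a[n - 1 - k]) for k in range(n))
        a.append(n - 1 + total / 2 ** (n - 1))
    return a


n = 256
exact = internal_path_lengths(n)[n]
# Difference per key between the exact value and N lg N + N * LINEAR_COEFF.
gap_per_key = exact / n - (log2(n) + LINEAR_COEFF)
```

The remaining gap per key shrinks as N grows, in line with an error term of lower order than N.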

External internal nodes

Definition

External internal nodes are nodes with both links null.

Fundamental recurrence relation

$$C_N = \sum_{k=0}^{\infty} \frac{1}{2^{N-1}} \binom{N-1}{k} \left(C_k + C_{N-1-k}\right), \quad N \ge 2$$

with $C_1 = 1$ and $C_0 = 0$.

Transformation

$$C(z) = \sum_{N=0}^{\infty} \frac{C_N z^{N}}{N!}$$

$$C'(z) = 1 + 2 C\left(\frac{z}{2}\right) e^{z/2}$$

Substitution by $D(z) = e^{-z} C(z)$

$$D(z) = \sum_{N=0}^{\infty} \frac{D_N z^{N}}{N!}$$

$$D'(z) + D(z) = e^{-z} + 2 D\left(\frac{z}{2}\right)$$

Recurrence for $D_N$

$$D_N + D_{N-1} = (-1)^{N-1} + \frac{1}{2^{N-2}} D_{N-1}$$

$$D_N = (-1)^{N-1} - \left(1 - \frac{1}{2^{N-2}}\right) D_{N-1}, \quad N \ge 2$$

with $D_1 = 1$ and $D_0 = 0$.

Introduction of $R_N$

$$R_N = Q_N \left(1 + \sum_{k=1}^{N} \frac{1}{Q_k}\right)$$

Explicit formula for $C_N$

$$C_N = N - \sum_{k=2}^{\infty} \binom{N}{k} (-1)^{k} R_{k-2}$$
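The explicit formula for $C_N$ can be checked against the recurrence in exact arithmetic. The sketch below assumes $Q_N$ is the finite product introduced earlier for the internal path length; the helper names `q_n`, `r_n`, `c_recurrence` and `c_explicit` are illustrative.

```python
from fractions import Fraction
from math import comb


def q_n(n: int) -> Fraction:
    """Q_n = prod_{j=1}^{n} (1 - 1/2^j)."""
    prod = Fraction(1)
    for j in range(1, n + 1):
        prod *= 1 - Fraction(1, 2 ** j)
    return prod


def r_n(n: int) -> Fraction:
    """R_n = Q_n * (1 + sum_{k=1}^{n} 1/Q_k)."""
    return q_n(n) * (1 + sum(1 / q_n(k) for k in range(1, n + 1)))


def c_recurrence(n_max: int) -> list:
    """[C_0, ..., C_n_max] with C_0 = 0, C_1 = 1 and the recurrence for N >= 2."""
    c = [Fraction(0), Fraction(1)]
    for n in range(2, n_max + 1):
        total = sum(comb(n - 1, k) * (c[k] + c[n - 1 - k]) for k in range(n))
        c.append(total / 2 ** (n - 1))
    return c[: n_max + 1]


def c_explicit(n: int) -> Fraction:
    """C_n = n - sum_{k=2}^{n} binom(n, k) (-1)^k R_{k-2}; later terms vanish."""
    return n - sum(comb(n, k) * (-1) ** k * r_n(k - 2) for k in range(2, n + 1))


from_recurrence = c_recurrence(10)
agree = all(from_recurrence[n] == c_explicit(n) for n in range(11))
```

For three keys the value is 3/2: the second and third key either share a subtree (one node with two null links) or are split (two such nodes), each with probability 1/2.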

Simpler coefficients $R^{*}_N$

$$R^{*}_N = \frac{(N + 1 - \alpha)\, q^{N+1}}{1 - q^{N+1}} + \frac{1}{1 - q^{N+1}}\, R^{*}_{N+1}$$

The meromorphic function $R^{*}(z)$

$$R^{*}(z) = \sum_{i=0}^{\infty} \frac{(z + 1 + i - \alpha)\, q^{z+1+i}}{\prod_{j=0}^{i} \left(1 - q^{z+1+j}\right)}$$

The meromorphic function $R^{*}(z)$

[Figure: plot of $R^{*}(z)$ for $z$ from -10 to 10, vertical axis $y$ from -4 to 4]

Explicit formula for $C_N$

$$C_N = (N-1)(\alpha+1) - \sum_{k=2}^{\infty} \binom{N}{k} (-1)^{k} R^{*}_{k-2}$$

Applying Rice's method

$$C_N - (N-1)(\alpha+1) = \frac{1}{2\pi i} \int_{C} B(N+1, -z)\, R^{*}(z-2)\, dz$$

Approximation with the rectangle $R_{XY}$

$$C_N - (N-1)(\alpha+1) = \frac{1}{2\pi i} \int_{R_{XY}} B(N+1, -z)\, R^{*}(z-2)\, dz - \Delta_{R}$$

Approximation of the integral

$$O\left(\int_{-Y}^{Y} \frac{\Gamma(N+1)}{\Gamma\left(N + \frac{1}{2} - iy\right)}\, dy\right) = O\left(\int_{-Y}^{Y} N^{\frac{1}{2} - iy}\, dy\right) = O\left(N^{1/2}\right)$$

Residue at $z = 1$

$$\Delta_{z=1} = N \left(\beta + 1 - \frac{1}{Q_\infty} \left(\alpha^{2} - \alpha - \frac{1}{\ln q}\right)\right)$$

Residues at $z = -1 \pm \frac{2\pi i k}{\ln q}$

$$\delta^{*}(N) = \sum_{k \ne 0} \frac{2\pi i k}{Q_\infty \ln q} \cdot \frac{1}{\ln q}\, \Gamma\left(-1 - \frac{2\pi i k}{\ln q}\right) e^{2\pi i k \lg N}$$

Average case

$$C_N = N \left(\beta + 1 - \frac{1}{Q_\infty} \left(\frac{1}{\ln 2} + \alpha^{2} - \alpha\right) + \delta^{*}(N)\right) + O\left(N^{1/2}\right)$$

Multiway branching

Fundamental recurrence relation for external nodes

$$C^{[M]}_N = \sum_{k_1 + k_2 + \ldots + k_M = N-1} \frac{1}{M^{N-1}} \binom{N-1}{k_1, k_2, \ldots, k_M} \left(\sum_{i=1}^{M} C^{[M]}_{k_i}\right)$$

with $C^{[M]}_1 = 1$ and $C^{[M]}_0 = 0$.

By symmetry in $k_1, \ldots, k_M$ this simplifies to

$$C^{[M]}_N = M \sum_{k_1 + k_2 + \ldots + k_M = N-1} \frac{1}{M^{N-1}} \binom{N-1}{k_1, k_2, \ldots, k_M}\, C^{[M]}_{k_1}$$
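The symmetric form of the recurrence can be evaluated without enumerating all compositions. The sketch below sums the multinomial over k_2, ..., k_M first, which leaves binom(N-1, k) (M-1)^(N-1-k) ways for exactly k of the N-1 remaining keys to fall into the first subtree. The function name `external_internal_nodes` is an illustrative choice.

```python
from fractions import Fraction
from math import comb


def external_internal_nodes(m: int, n_max: int) -> list:
    """[C_0, ..., C_n_max] for an M-ary digital search tree (M = m >= 2)."""
    c = [Fraction(0), Fraction(1)]  # C_0 = 0, C_1 = 1
    for n in range(2, n_max + 1):
        # Marginal over k_2..k_M: binom(n-1, k) * (m-1)^(n-1-k) arrangements
        # place exactly k keys in the first subtree.
        total = sum(comb(n - 1, k) * (m - 1) ** (n - 1 - k) * c[k]
                    for k in range(n))
        c.append(m * total / Fraction(m) ** (n - 1))
    return c[: n_max + 1]


binary = external_internal_nodes(2, 4)
ternary = external_internal_nodes(3, 3)
```

For M = 2 this reproduces the binary values, and for M = 3 and three keys it gives 5/3: the two non-root keys share a subtree with probability 1/3 (one node with all links null) and are separated otherwise (two such nodes).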

Transformation

$$C^{[M]}(z) = \sum_{N=0}^{\infty} \frac{C^{[M]}_N z^{N}}{N!}$$

$$C^{[M]\prime}(z) = 1 + M\, C^{[M]}\left(\frac{z}{M}\right) e^{\left(1 - \frac{1}{M}\right) z}$$

Average case for external nodes

$$C^{[M]}_N = N \left(\beta^{[M]} + 1 - \frac{1}{Q^{[M]}_\infty} \left(\frac{1}{\ln M} + \left(\alpha^{[M]}\right)^{2} - \alpha^{[M]}\right)\right) + N \delta^{[M]}(N) + O\left(N^{1/2}\right)$$

Digital search trie

A digital search trie is a digital tree for storing a set of strings in which there is one node for every prefix of every string in the set.

Patricia trie

A Patricia trie is a compact representation of a digital search trie in which all nodes with one child are merged with their parent.
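Both definitions can be illustrated by identifying each trie node with the prefix it represents. This is a small sketch assuming a prefix-free set of binary strings (no key is a prefix of another); the function names are illustrative, and the root is always kept because it has no parent to merge into.

```python
def trie_prefixes(keys):
    """One trie node per distinct prefix; the empty prefix is the root."""
    return {key[:i] for key in keys for i in range(len(key) + 1)}


def child_count(prefixes, node):
    """Number of children of `node` in the binary trie."""
    return sum((node + bit) in prefixes for bit in "01")


def patricia_nodes(keys):
    """Nodes that remain after merging every one-child node into its parent."""
    prefixes = trie_prefixes(keys)
    return {p for p in prefixes if p == "" or child_count(prefixes, p) != 1}


keys = ["000", "001", "11"]
trie = trie_prefixes(keys)
patricia = patricia_nodes(keys)
```

Here the trie has seven nodes, and the one-child nodes "0" and "1" disappear in the Patricia trie, leaving five.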

External path length for digital search trie

Fundamental recurrence relation

$$A^{[T]}_N = N + \sum_{k=0}^{\infty} \frac{1}{2^{N}} \binom{N}{k} \left(A^{[T]}_k + A^{[T]}_{N-k}\right), \quad N \ge 2$$

with $A^{[T]}_0 = A^{[T]}_1 = 0$.

Transformation

$$A^{[T]}(z) = z \left(e^{z} - 1\right) + 2 A^{[T]}\left(\frac{z}{2}\right) e^{z/2}$$

Substitution by $B(z)$

$$A^{[T]}(z) = e^{z} B^{[T]}(z)$$

$$B^{[T]}(z) = z \left(1 - e^{-z}\right) + 2 B^{[T]}\left(\frac{z}{2}\right)$$

Explicit formula

$$B^{[T]}_N = \frac{N (-1)^{N}}{1 - \left(\frac{1}{2}\right)^{N-1}}$$

$$A^{[T]}_N = \sum_{k=2}^{\infty} \binom{N}{k} \frac{k (-1)^{k}}{1 - \left(\frac{1}{2}\right)^{k-1}}$$

Average case

$$A^{[T]}_N = N \lg N + N \left(\frac{\gamma}{\ln 2} + \frac{1}{2} + \delta(N)\right) + O(1)$$

External path length for Patricia trie

Fundamental recurrence relation

$$A^{[P]}_N = N \left(1 - \frac{1}{2^{N-1}}\right) + \sum_{k=0}^{\infty} \frac{1}{2^{N}} \binom{N}{k} \left(A^{[P]}_k + A^{[P]}_{N-k}\right), \quad N \ge 1$$

Transformation

$$A^{[P]}(z) = z \left(e^{z} - e^{z/2}\right) + 2 A^{[P]}\left(\frac{z}{2}\right) e^{z/2}$$

Substitution

$$B^{[P]}(z) = z \left(1 - e^{-z/2}\right) + 2 B^{[P]}\left(\frac{z}{2}\right)$$

$$B^{[P]}_N = \frac{N (-1)^{N}}{2^{N-1} - 1}$$

Explicit formula

$$A^{[P]}_N = \sum_{k=2}^{\infty} \binom{N}{k} \frac{k (-1)^{k}}{2^{k-1} - 1} = A^{[T]}_N - N$$
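The two trie recurrences and the relation between them can be checked for small N. Because the terms k = 0 and k = N contain $A_N$ itself, the sketch below first solves each recurrence for $A_N$; it assumes $A_0 = A_1 = 0$ in both cases, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def trie_path_lengths(n_max: int, patricia: bool = False) -> list:
    """[A_0, ..., A_n_max] for the digital search trie or the Patricia trie."""
    a = [Fraction(0), Fraction(0)]
    for n in range(2, n_max + 1):
        toll = n * (1 - Fraction(1, 2 ** (n - 1))) if patricia else Fraction(n)
        # k = 0 and k = n contribute 2 * A_n / 2^n (A_0 = 0); move them left.
        inner = sum(comb(n, k) * (a[k] + a[n - k]) for k in range(1, n))
        a.append((toll + inner / 2 ** n) / (1 - Fraction(2, 2 ** n)))
    return a[: n_max + 1]


def trie_explicit(n: int) -> Fraction:
    """A^[T]_n = sum_{k=2}^{n} binom(n, k) k (-1)^k / (1 - (1/2)^(k-1))."""
    terms = (comb(n, k) * k * (-1) ** k / (1 - Fraction(1, 2 ** (k - 1)))
             for k in range(2, n + 1))
    return sum(terms, Fraction(0))


trie = trie_path_lengths(10)
pat = trie_path_lengths(10, patricia=True)
formula_ok = all(trie[n] == trie_explicit(n) for n in range(2, 11))
shift_ok = all(pat[n] == trie[n] - n for n in range(2, 11))
```

For two keys the trie value is 4 (the keys separate at expected depth 2), while the Patricia trie always places both keys at depth 1.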

External internal nodes for Patricia trie

Fundamental recurrence relation

$$C^{[P]}_N = \sum_{k} \frac{1}{2^{N}} \binom{N}{k} \left(C^{[P]}_k + C^{[P]}_{N-k}\right), \quad N \ge 3$$

with $C^{[P]}_0 = C^{[P]}_1 = 0$ and $C^{[P]}_2 = 1$.

Transformation

$$C^{[P]}(z) = \left(\frac{z}{2}\right)^{2} + 2 C^{[P]}\left(\frac{z}{2}\right) e^{z/2}$$

Substitution

$$D^{[P]}(z) = \left(\frac{z}{2}\right)^{2} e^{-z} + 2 D^{[P]}\left(\frac{z}{2}\right)$$

Explicit formula

$$C^{[P]}_N = \frac{1}{4} \sum_{k=2}^{N} \binom{N}{k} \frac{k (k-1) (-1)^{k}}{1 - \left(\frac{1}{2}\right)^{k-1}}$$

Average case

$$C^{[P]}_N = N \left(\frac{1}{4 \ln 2} + \delta^{[P]}(N)\right)$$

General Framework for digital search trees

Fundamental recurrence relation

$$X(T) = \sum_{\text{subtrees } T_j \text{ of the root of } T} X(T_j) + x(T)$$

Transformation

$$X(z) = \sum_{N=0}^{\infty} \frac{X_N z^{N}}{N!}$$

$$X'(z) = M X\left(\frac{z}{M}\right) e^{\left(1 - \frac{1}{M}\right) z} + x'(z)$$

Substitution

$$Y(z) = e^{-z} X(z)$$

$$y(z) = e^{-z} x(z)$$

$$Y'(z) + Y(z) = M Y\left(\frac{z}{M}\right) + y'(z) + y(z)$$
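The substitution step can be spelled out term by term. The following is a sketch that assumes the toll enters the differential equation through the derivative $x'(z)$ of its exponential generating function $x(z) = \sum_N x_N z^N / N!$, which is what produces the $y'(z) + y(z)$ term on the right-hand side.

```latex
\begin{align*}
X(z) = e^{z} Y(z)
  &\;\Longrightarrow\; X'(z) = e^{z} \bigl( Y'(z) + Y(z) \bigr), \\
M X\!\left(\tfrac{z}{M}\right) e^{\left(1 - \frac{1}{M}\right) z}
  &= M e^{z/M}\, Y\!\left(\tfrac{z}{M}\right) e^{\left(1 - \frac{1}{M}\right) z}
   = M e^{z}\, Y\!\left(\tfrac{z}{M}\right), \\
x(z) = e^{z} y(z)
  &\;\Longrightarrow\; x'(z) = e^{z} \bigl( y'(z) + y(z) \bigr).
\end{align*}
% Dividing X'(z) = M X(z/M) e^{(1 - 1/M) z} + x'(z) by e^{z} then gives
\[
Y'(z) + Y(z) = M\, Y\!\left(\tfrac{z}{M}\right) + y'(z) + y(z).
\]
```

With $M = 2$ and $x_N = N - 1$ one gets $x'(z) = z e^{z}$ and $y'(z) + y(z) = z$, which matches the equation $B'(z) + B(z) = z + 2B(z/2)$ obtained for the internal path length.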

Explicit formula

$$X_N = \sum_{k=0}^{N} \binom{N}{k} Y_k$$

Asymptotic analysis of $(-1)^{k} Y_k$

Find a function $Y^{*}_k$ which

(i) is simply related to $Y_k$ so that $\sum_{k=0}^{N} \binom{N}{k} \left(Y_k - (-1)^{k} Y^{*}_k\right)$ is easily evaluated,

(ii) satisfies a recurrence of the form $Y^{*}_{N+1} = \left(1 - g(M, N)\right) Y^{*}_N + f(M, N)$,

(iii) goes to zero quickly as $N \to \infty$.

Turn the recurrence around to extend $Y^{*}_N$ to the complex plane.

Evaluate $\sum_{k=0}^{N} \binom{N}{k} \left(Y_k - (-1)^{k} Y^{*}_k\right)$ as detailed in the previous sections.

General Framework for tries

$$X(T) = \sum_{\text{subtrees } T_j \text{ of the root of } T} X(T_j) + x(T)$$

$$X(z) = M X\left(\frac{z}{M}\right) e^{\left(1 - \frac{1}{M}\right) z} + x(z)$$

This can be solved by Rice's method or by Mellin transform techniques.

Thank you for your attention!
