Holistic Twig Joins: Optimal XML Pattern Matching

Nicolas Bruno, Nick Koudas, Divesh Srivastava

ACM SIGMOD 2002

Presented by: Li Wei, Dragomir Yankov



Outline
• Problem Statement
• PathStack Algorithm
• TwigStack Algorithm
• Experimental Results

Problem Statement
• Given a query twig pattern Q and an XML database D, compute ALL the answers to Q in D.
• Example:

Query (twig pattern):
author
    fn
        jane
    ln
        doe

XML document (nodes are annotated with (DocId, LeftPos:RightPos, LevelNum)):
book (1, 1:150, 1)
    title (1, 2:4, 2)
        XML (1, 3, 3)
    allauthors (1, 5:60, 2)
        author (1, 6:20, 3)
            fn (1, 7:9, 4)
                jane (1, 8, 5)
            ln
                poe (1, 11, 5)
        author
            fn
                john
            ln
                doe (1, 26, 5)
        author
            fn
                jane (1, 43, 5)
            ln
                doe (1, 46, 5)
    year (1, 61:63, 2)
        2000 (1, 62, 3)
    chapter (1, 64:93, 2)
        title (1, 65:67, 3)
            XML (1, 66, 4)
        section (1, 68:78, 3)
            head (1, 69:71, 4)
                Origins (1, 70, 5)
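The position triples in the tree are what make structural relationships cheap to test. A minimal sketch, assuming the triples are (DocId, LeftPos:RightPos, LevelNum) and that a text node with a single position can be modeled as LeftPos == RightPos; the names `Region`, `is_ancestor` and `is_parent` are illustrative and do not come from the slides:

```python
from typing import NamedTuple


class Region(NamedTuple):
    doc: int
    left: int
    right: int
    level: int


def is_ancestor(a: Region, d: Region) -> bool:
    """True if a strictly contains d within the same document."""
    return a.doc == d.doc and a.left < d.left and d.right < a.right


def is_parent(a: Region, d: Region) -> bool:
    """Parent-child is containment plus a level difference of exactly one."""
    return is_ancestor(a, d) and a.level + 1 == d.level


# Values copied from the example tree; jane has a single position (8),
# which this sketch stores as left == right.
author = Region(1, 6, 20, 3)
fn = Region(1, 7, 9, 4)
jane = Region(1, 8, 8, 5)
year = Region(1, 61, 63, 2)

print(is_ancestor(author, jane))  # True
print(is_parent(author, jane))    # False
print(is_parent(fn, jane))        # True
print(is_ancestor(year, jane))    # False
```

Both tests need only the two nodes' own positions, so no tree traversal is required.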

Binary Structural Joins
• The approach
    – Decompose the twig pattern into binary structural relationships
    – Use structural join algorithms to match the binary relationships against the XML database
    – Stitch together the basic matches
• The problem
    – The intermediate result sizes can get large, even when the input and output sizes are more manageable.


Example: binary structural joins on the query and XML document above

Decomposition        Number of Intermediate Results
author – fn          3
author – ln          3
fn – jane            2
ln – doe             2

Output: 1
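The counts in this example can be reproduced with a few lines of code. This is a sketch under stated assumptions: each binary join is a naive nested loop over containment (not an actual structural join algorithm), and the positions of ln1, author2, fn2, ln2, author3, fn3 and ln3 are made up to fit around the positions shown in the tree.

```python
# Elements are (label, LeftPos, RightPos); text nodes use left == right.
# Positions not shown on the slide are assumptions (see lead-in).
nodes = {
    "author": [("a1", 6, 20), ("a2", 21, 28), ("a3", 41, 48)],
    "fn": [("fn1", 7, 9), ("fn2", 22, 24), ("fn3", 42, 44)],
    "ln": [("ln1", 10, 12), ("ln2", 25, 27), ("ln3", 45, 47)],
    "jane": [("j1", 8, 8), ("j2", 43, 43)],
    "doe": [("d1", 26, 26), ("d2", 46, 46)],
}


def structural_join(anc_tag, desc_tag):
    """Naive binary join: all (ancestor, descendant) label pairs."""
    return [(a[0], d[0])
            for a in nodes[anc_tag] for d in nodes[desc_tag]
            if a[1] < d[1] and d[2] < a[2]]


edges = [("author", "fn"), ("author", "ln"), ("fn", "jane"), ("ln", "doe")]
pairs = {edge: structural_join(*edge) for edge in edges}
counts = {edge: len(found) for edge, found in pairs.items()}

# Stitch the binary matches together on their shared nodes.
answers = [
    (a, f, l, j, d)
    for a, f in pairs[("author", "fn")]
    for a2, l in pairs[("author", "ln")] if a2 == a
    for f2, j in pairs[("fn", "jane")] if f2 == f
    for l2, d in pairs[("ln", "doe")] if l2 == l
]

print(counts)   # 3, 3, 2 and 2 intermediate pairs
print(answers)  # [('a3', 'fn3', 'ln3', 'j2', 'd2')]
```

Ten intermediate pairs are produced to find a single answer, which is the blow-up the holistic approach avoids.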

Holistic Twig Joins
• The approach
    – Uses linked stacks to compactly represent partial results to query paths
    – Merges results to query paths to obtain matches for the twig pattern
• The advantage
    – It ensures that no intermediate solution is larger than the final answer to the query.

Example

Query: author with children fn and ln; fn has child jane, ln has child doe.

XML document: same as before, with subscripts added to distinguish nodes that share a label. The allauthors subtree becomes:
allauthors (1, 5:60, 2)
    author1 (1, 6:20, 3)
        fn1 (1, 7:9, 4)
            jane1 (1, 8, 5)
        ln1
            poe (1, 11, 5)
    author2
        fn2
            john
        ln2
            doe1 (1, 26, 5)
    author3
        fn3
            jane2 (1, 43, 5)
        ln3
            doe2 (1, 46, 5)

Example: holistic twig join on the same query and XML document

Decomposition          Number of Intermediate Results    Intermediate Results
author – fn – jane     1                                 author3 – fn3 – jane2
author – ln – doe      1                                 author3 – ln3 – doe2

Output: 1

Notation

Query (author with children fn and ln; fn has child jane, ln has child doe)
    isLeaf(author) = false
    isRoot(author) = true
    parent(fn) = author
    children(author) = {fn, ln}
    subtreeNodes(author) = {fn, ln, jane, doe}

Streams (one per query node, over the XML document)
    Ta: a1, a2, a3
    Tfn: fn1, fn2, fn3
    Tln: ln1, ln2, ln3
    Tj: j1, j2
    Td: d1, d2

    eof(Ta) = false
    next(Ta) = a1
    nextL(Ta) = 6
    nextR(Ta) = 20
    advance(Ta) moves the cursor of Ta from a1 to a2

Stacks (Sa, Sfn, Sln, Sj, Sd; here Sa holds a3 and Sfn holds fn3)
    empty(Sa) = false
    pop(Sfn)
    push(Sln, ln3, pointer to a3)
    topL(Sa) = LeftPos of a3
    topR(Sa) = RightPos of a3
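The stream and stack operations above map onto two small classes. A minimal sketch: the class names `Stream` and `LinkedStack`, the use of a list index as the "pointer", and the positions of a2, a3 and ln3 are assumptions for illustration; only a1's positions (6:20) are from the slides.

```python
import math


class Stream:
    """Cursor over elements (label, LeftPos, RightPos) sorted by LeftPos."""

    def __init__(self, elements):
        self.elements, self.pos = list(elements), 0

    def eof(self):
        return self.pos >= len(self.elements)

    def next(self):
        return None if self.eof() else self.elements[self.pos]

    def next_l(self):
        return math.inf if self.eof() else self.elements[self.pos][1]

    def next_r(self):
        return math.inf if self.eof() else self.elements[self.pos][2]

    def advance(self):
        self.pos += 1


class LinkedStack:
    """Stack whose entries carry a pointer (an index) into the parent stack."""

    def __init__(self):
        self.items = []

    def empty(self):
        return not self.items

    def push(self, element, parent_index):
        self.items.append((element, parent_index))

    def pop(self):
        return self.items.pop()

    def top_l(self):
        return self.items[-1][0][1]

    def top_r(self):
        return self.items[-1][0][2]


t_a = Stream([("a1", 6, 20), ("a2", 21, 28), ("a3", 41, 48)])
head = (t_a.next(), t_a.next_l(), t_a.next_r())  # a1 with positions 6 and 20
t_a.advance()                                    # cursor now at a2

s_a, s_ln = LinkedStack(), LinkedStack()
s_a.push(("a3", 41, 48), -1)                      # root stack: no parent (-1)
s_ln.push(("ln3", 45, 47), len(s_a.items) - 1)    # pointer to a3 in Sa
```

Returning infinity from `next_l` and `next_r` at end of stream is a convenience of this sketch that keeps later comparisons free of special cases.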

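Algorithm: PathStack

Intuition:
While the streams of the leaves are not empty (i.e. a solution could be found) do:
- select the node with minimal LeftPos value and push it onto its stack
- if it is a leaf, print the solution

Example
    Query (path pattern): A, with descendant B, with descendant C
    XML document (nested): A1 contains B1, which contains A2, which contains B2, which contains C1
    Stacks once every element has been pushed: SA: A1, A2    SB: B1, B2    SC: C1
    Solutions: A1B1C1, A1B2C1, A2B2C1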

PathStack: Example trace

Streams: TA: A1, A2    TB: B1, B2    TC: C1
The numbers 01) to 09) refer to lines of the algorithm.

Step 1: qmin = A
    06) moveStreamToStack(TA, SA, null)
    Stacks: SA: A1
Step 2: qmin = B
    06) moveStreamToStack(TB, SB, A1)
    Stacks: SA: A1    SB: B1
Step 3: qmin = A
    06) moveStreamToStack(TA, SA, null)
    Stacks: SA: A1, A2    SB: B1
Step 4: qmin = B
    06) moveStreamToStack(TB, SB, A2)
    Stacks: SA: A1, A2    SB: B1, B2
Step 5: qmin = C
    06) moveStreamToStack(TC, SC, B2)
    Stacks: SA: A1, A2    SB: B1, B2    SC: C1
Step 6:
    07) isLeaf(C) = true
    08) showSolutions(SC, 1)
    09) pop(SC)
    Stacks: SA: A1, A2    SB: B1, B2
Step 7:
    01) end(q) = true
    Algorithm ends.

Procedure: showSolutions

Intuition:
- the stacks hold a compact encoding of the answers
- output is in leaf-to-root order

Example
    Stacks: SA: A1, A2    SB: B1, B2    SC: C1
    Output: C1B1A1, C1B2A1, C1B2A2
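The trace and the showSolutions procedure can be put together in one short program. This is a sketch for ancestor-descendant path patterns only, not the authors' code: the function name `path_stack`, the element positions, and the use of list indices as stack pointers are assumptions, and solutions are printed root-to-leaf for readability rather than leaf-to-root.

```python
import math


def path_stack(labels, streams):
    """labels: query nodes from root to leaf.
    streams[q]: elements (id, LeftPos, RightPos) sorted by LeftPos."""
    pos = {q: 0 for q in labels}
    stacks = {q: [] for q in labels}
    solutions = []

    def next_l(q):
        if pos[q] < len(streams[q]):
            return streams[q][pos[q]][1]
        return math.inf

    def expand(level, index):
        """Enumerate the root-to-node paths encoded by one stack entry."""
        element, pointer = stacks[labels[level]][index]
        if level == 0:
            yield (element[0],)
            return
        for i in range(pointer + 1):
            for prefix in expand(level - 1, i):
                yield prefix + (element[0],)

    leaf = labels[-1]
    while pos[leaf] < len(streams[leaf]):
        qmin = min(labels, key=next_l)          # stream with minimal LeftPos
        left = next_l(qmin)
        for q in labels:                        # pop entries that ended already
            while stacks[q] and stacks[q][-1][0][2] < left:
                stacks[q].pop()
        level = labels.index(qmin)
        pointer = len(stacks[labels[level - 1]]) - 1 if level > 0 else -1
        stacks[qmin].append((streams[qmin][pos[qmin]], pointer))
        pos[qmin] += 1
        if qmin == leaf:
            solutions.extend(expand(level, len(stacks[leaf]) - 1))
            stacks[leaf].pop()
    return solutions


# A1 contains B1 contains A2 contains B2 contains C1 (positions assumed).
streams = {
    "A": [("A1", 1, 10), ("A2", 3, 8)],
    "B": [("B1", 2, 9), ("B2", 4, 7)],
    "C": [("C1", 5, 6)],
}
solutions = path_stack(["A", "B", "C"], streams)
print(solutions)
# [('A1', 'B1', 'C1'), ('A1', 'B2', 'C1'), ('A2', 'B2', 'C1')]
```

The pointer stored with each entry records how much of the parent stack was present at push time, which is what lets three answers be read off five stack entries.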

Analysis: PathStack
• Correctness
    – (Theorem 3.1) Given a query path pattern Q and an XML database D, Algorithm PathStack correctly returns all answers for Q on D.
• Optimality
    – (Theorem 3.2) Algorithm PathStack has worst-case I/O and CPU time complexities linear in the sum of sizes of the input lists and the output list.

PathMPMJ
• A naïve extension of MPMGJN could be to backtrack over all possible solutions: PathMPMJNaive
• A much faster approach is to keep "k" pointers on the streams and prune part of the solutions: PathMPMJ

Query (path pattern): A, B, C
Streams:
    TA = A1, A2, A3, …
    TB = B1, B2, …, BK, …
    TC = C1, C2, C3, …

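PathStack Limitations
• Merging the path queries for twig joins is not optimal: a path query can return solutions that join with no solution of the other path.

Example (allauthors subtree of the XML document):
allauthors (1, 5:60, 2)
    author1 (1, 6:20, 3)
        fn1 (1, 7:9, 4)
            jane1 (1, 8, 5)
        ln1
            poe (1, 11, 5)
    author2
        fn2
            john
        ln2
            doe1 (1, 26, 5)
    author3
        fn3
            jane2 (1, 43, 5)
        ln3
            doe2 (1, 46, 5)
    ...

Query, split into two path queries, with their solutions:
    author – fn – jane: (a1, fn1, j1), (a3, fn3, j2)
    author – ln – doe: (a2, ln2, d1), (a3, ln3, d2)

Query result:
    (a3, fn3, ln3, j2, d2)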
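The wasted work in this example is easy to see when the two path-solution lists are merged. A minimal sketch, assuming the subscripts of the labeled XML tree (jane2 under author3, doe1 under author2, doe2 under author3); `merge_on_root` is a hypothetical helper that uses a dictionary lookup instead of a sort-merge join:

```python
# Path solutions of the two root-to-leaf queries, as (author, fn/ln, leaf).
fn_paths = [("a1", "fn1", "j1"), ("a3", "fn3", "j2")]
ln_paths = [("a2", "ln2", "d1"), ("a3", "ln3", "d2")]


def merge_on_root(left_paths, right_paths):
    """Join two path-solution lists on their first component (the shared root)."""
    by_root = {}
    for path in right_paths:
        by_root.setdefault(path[0], []).append(path)
    return [lp + rp[1:]
            for lp in left_paths
            for rp in by_root.get(lp[0], [])]


twig = merge_on_root(fn_paths, ln_paths)
print(len(fn_paths) + len(ln_paths))  # 4 intermediate path solutions
print(twig)                           # [('a3', 'fn3', 'j2', 'ln3', 'd2')]
```

Two of the four path solutions (those rooted at a1 and a2) are computed and then discarded, which is the overhead TwigStack is designed to avoid.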

TwigStack

Intuition:
While the streams of the leaves are not empty (i.e. a solution could be found) do:
- select a node that could be expanded to a solution
- if it is a leaf, print the solution

TwigStack: Example

Query: author (a) with children fn and ln; fn has child jane (j), ln has child doe (d).
XML document: the allauthors subtree with author1, author2 and author3.

Streams
    Ta: a1, a2, a3
    Tfn: fn1, fn2, fn3
    Tln: ln1, ln2, ln3
    Tj: j1, j2
    Td: d1, d2

Phase 1
01: while (notEmpty(Tj) || notEmpty(Td)) do:

Iteration 1: qact = getNext(a) = fn
    getNext(fn) = fn; getNext(j) = j; nmin = nmax = 8 (j1)
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 26 (d1); advance(ln)
    at a: nmin = 7 (fn1), nmax = ln2
    advance(Ta), advance(Tfn)
    Stacks: empty

Iteration 2: qact = getNext(a) = j
    getNext(fn) = j; getNext(j) = j; nmin = nmax = 8 (j1)
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 26 (d1)
    at a: nmin = 8 (j1), nmax = ln2
    advance(Tj)
    Stacks: empty

Iteration 3: qact = getNext(a) = ln
    getNext(fn) = fn; getNext(j) = j; nmin = nmax = 43 (j2); advance(fn)
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 26 (d1)
    at a: nmin = ln2, nmax = fn3
    advance(Ta), advance(Tln)
    Stacks: empty

Iteration 4: qact = getNext(a) = d
    getNext(fn) = fn; getNext(j) = j; nmin = nmax = 43 (j2)
    getNext(ln) = d; getNext(d) = d; nmin = nmax = 26 (d1)
    at a: nmin = 26 (d1), nmax = fn3
    advance(Td)
    Stacks: empty

Iteration 5: qact = getNext(a) = a
    getNext(fn) = fn; getNext(j) = j; nmin = nmax = 43 (j2)
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 46 (d2)
    at a: nmin = fn3, nmax = ln3
    moveStreamToStack(Ta), advance(Ta)
    Stacks: Sa: a3

Iteration 6: qact = getNext(a) = fn
    getNext(fn) = fn; getNext(j) = j; nmin = nmax = 43 (j2)
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 46 (d2)
    at a: nmin = fn3, nmax = ln3
    moveStreamToStack(Tfn), advance(Tfn)
    Stacks: Sa: a3    Sfn: fn3

Iteration 7: qact = getNext(a) = j
    getNext(fn) = j; getNext(j) = j; nmin = nmax = 43 (j2)
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 46 (d2)
    at a: nmin = 43 (j2), nmax = ln3
    moveStreamToStack(Tj), advance(Tj), showSolutionsWithBlocking(j), pop(Sj)
    Stacks before the pop: Sa: a3    Sfn: fn3    Sj: j2
    "Merge-joinable" root-to-leaf path: (j2, fn3, a3)

Iteration 8: qact = getNext(a) = ln (ln3)
    getNext(fn) = nil; getNext(j) = nil; nmin = nmax = nil
    getNext(ln) = ln; getNext(d) = d; nmin = nmax = 46 (d2)
    at a: nmin = ln3, nmax = ln3
    moveStreamToStack(Tln), advance(Tln)
    Stacks: Sa: a3    Sfn: fn3    Sln: ln3

Iteration 9: qact = getNext(a) = d
    getNext(fn) = nil; getNext(j) = nil; nmin = nmax = nil
    getNext(ln) = d; getNext(d) = d; nmin = nmax = 46 (d2)
    at a: nmin = d, nmax = d
    moveStreamToStack(Td), advance(Td), showSolutionsWithBlocking(d), pop(Sd)
    Stacks before the pop: Sa: a3    Sfn: fn3    Sln: ln3    Sd: d2
    "Merge-joinable" root-to-leaf paths: (j2, fn3, a3), (d2, ln3, a3)

Phase 2
12: MergeAllPathSolutions()
    Stacks: Sa: a3    Sfn: fn3    Sln: ln3
    TwigStack solution: (j2, fn3, d2, ln3, a3)
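The nine iterations above can be replayed with a compact program. It is a sketch for ancestor-descendant edges, not the authors' implementation, and it rests on several assumptions: positions not shown on the slides (ln1, author2, fn2, ln2, author3, fn3, ln3) are invented; the handling of exhausted streams in `get_next` is one possible way to realise the "nil" results of iterations 8 and 9; blocking is omitted from solution output; and Phase 2 is a simple consistency join rather than a merge join.

```python
import math


class QNode:
    """Query node with its stream T_q (sorted by LeftPos) and stack S_q."""

    def __init__(self, name, stream, children=()):
        self.name, self.stream, self.children = name, stream, list(children)
        self.parent, self.pos, self.stack = None, 0, []
        for child in self.children:
            child.parent = self

    def eof(self):
        return self.pos >= len(self.stream)

    def next_l(self):
        return math.inf if self.eof() else self.stream[self.pos][1]

    def next_r(self):
        return math.inf if self.eof() else self.stream[self.pos][2]

    def leaves(self):
        if not self.children:
            return [self]
        return [leaf for c in self.children for leaf in c.leaves()]


def get_next(q):
    """Return the query node whose stream head should be processed next."""
    if not q.children:
        return q
    live, exhausted = [], False
    for child in q.children:
        n = get_next(child)
        if n.eof():
            exhausted = True       # nothing is left below this child
        elif n is not child:
            return n               # a deeper node has to be handled first
        else:
            live.append(child)
    if exhausted:                  # assumption of this sketch, see lead-in
        q.pos = len(q.stream)      # no remaining q element can be extended
        return min(live, key=QNode.next_l) if live else q
    nmin = min(live, key=QNode.next_l)
    nmax = max(live, key=QNode.next_l)
    while q.next_r() < nmax.next_l():
        q.pos += 1                 # skip q elements that end before nmax starts
    return q if q.next_l() < nmin.next_l() else nmin


def clean_stack(q, left):
    while q.stack and q.stack[-1][0][2] < left:
        q.stack.pop()


def expand(q, index):
    """Enumerate the root-to-q paths encoded by one stack entry."""
    element, pointer = q.stack[index]
    if q.parent is None:
        yield {q.name: element[0]}
        return
    for i in range(pointer + 1):
        for path in expand(q.parent, i):
            yield {**path, q.name: element[0]}


def twig_stack(root):
    leaves = root.leaves()
    paths = {leaf.name: [] for leaf in leaves}
    while not all(leaf.eof() for leaf in leaves):          # Phase 1
        q = get_next(root)
        if q.parent is not None:
            clean_stack(q.parent, q.next_l())
        if q.parent is None or q.parent.stack:
            clean_stack(q, q.next_l())
            pointer = len(q.parent.stack) - 1 if q.parent is not None else -1
            q.stack.append((q.stream[q.pos], pointer))
            q.pos += 1
            if not q.children:
                paths[q.name].extend(expand(q, len(q.stack) - 1))
                q.stack.pop()
        else:
            q.pos += 1                                     # head cannot contribute
    results = [{}]                                         # Phase 2
    for leaf in leaves:
        results = [{**r, **p} for r in results for p in paths[leaf.name]
                   if all(r.get(k, v) == v for k, v in p.items())]
    return paths, results


j = QNode("j", [("j1", 8, 8), ("j2", 43, 43)])
d = QNode("d", [("d1", 26, 26), ("d2", 46, 46)])
fn = QNode("fn", [("fn1", 7, 9), ("fn2", 22, 24), ("fn3", 42, 44)], [j])
ln = QNode("ln", [("ln1", 10, 12), ("ln2", 25, 27), ("ln3", 45, 47)], [d])
a = QNode("a", [("a1", 6, 20), ("a2", 21, 28), ("a3", 41, 48)], [fn, ln])

paths, results = twig_stack(a)
print(paths)
print(results)
```

With these inputs Phase 1 emits exactly one path solution per leaf, so nothing is thrown away in Phase 2, in contrast to the four path solutions produced when the two paths are evaluated independently.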

Analysis of TwigStack
• Let getNext(q) = qN
    – qN has a minimal descendant extension
    – for all qi in subtreeNodes(qN), next(Tqi) = hqi
    – either q = qN or parent(qN) has no minimal right extension
• Any ancestor of qN whose extension uses hqN is returned by getNext before qN => correctness (TwigStack finds all solutions to q)
• TwigStack is time and space optimal for ancestor-descendant edges

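Suboptimality for parent-child edges

Example
    XML document nodes: A1, A2, B1, B2, C1, C2
    Query: A with children B and C (parent-child edges)

TS Phase 1 solutions:
    (A1, B2, C2)
    (A2, B1, C1)
    (A1, B1, C1)
    (A1, B1, C2)

[Figure: the slide marks which of these Phase 1 solutions are final solutions]

Would be optimal for: A with B and C below it connected by ancestor-descendant edges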

TwigStack and XB-Trees

Example elements (LeftPos:RightPos): a1 (2:95), a2 (3:50), a3 (6:48), a4 (10:45), a5 (20:30), a6 (55:58), a7 (60:94), a8 (62:75), a9 (70:72), a10 (80:88), a11 (80:88)

• XB-Trees: B+ trees with some additional features [1]
    - Internal nodes have the form [L:R], sorted on L
    - Parent node interval includes child node intervals
    - Each page P has pointer P.parent
• TwigStackXB: same as TwigStack with the following modifications
    - Tq for a query node with an index is now the XB-tree rather than a stream
    - The advance operation is modified according to the pointer act = (actPage, actIndex)
    - The drilldown operation is introduced

[Figure: XB-tree pages built over the example elements]

[1] "An Evaluation of XML Indexes for Structural Join" demonstrates that, while B+, XR and XB trees all build the same tree structure, XB-trees outperform the other two on "highly recursive" XML.

Experimental Results

[Figures: PS vs TS for binary twig query; PS vs TS for parent-child query]

Questions?
