Reverse Spatial and Textual k Nearest Neighbor Search Jiaheng Lu Renmin University of China Sep 6 2011 Presentation in HP Labs China.

Mar 27, 2015

Page 1

Reverse Spatial and Textual k Nearest Neighbor Search

Jiaheng Lu

Renmin University of China

Sep 6 2011

Presentation in HP Labs China

Page 2

Research experience

Associate Professor: Renmin University of China
  XML data management, Spatial data management, Cloud data management

Post-doc: University of California, Irvine
  Data integration, Approximate string match

PhD: National University of Singapore
  XML data management

Page 3

Outline

XML data management
  XML twig query processing
  XML keyword search

Approximate string matching

Reverse Spatial and Textual k Nearest Neighbor Search (SIGMOD 2011)

Page 4

XML twig query processing

XPath: Section[Title]/Paragraph//Figure

Twig pattern: Section has the children Title and Paragraph; Paragraph has the descendant Figure.

Page 5

XML twig query processing (Cont.)

Problem Statement

Given a query twig pattern Q, and an XML database D, we need to compute ALL the answers to Q in D.

E.g. Consider Query and Document:

[Figure: document tree with the nodes s1, s2, t1, t2, p1, f1; query twig with the root Section and the nodes title and figure]

Query solutions: (s1, t1, f1) (s2, t2, f1) (s1, t2, f1)
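The problem statement can be illustrated with a naive matcher. This is only a sketch under stated assumptions: the tree shape in `DOC` is hypothetical (the slide's figure is not recoverable), and the query is assumed to be a Section with a descendant title and a descendant figure. It enumerates every answer by brute force, which is exactly the work that holistic twig join algorithms are designed to avoid.

```python
from itertools import product

# Hypothetical document tree: node id -> (tag, children).
# The shape is an assumption chosen for illustration only.
DOC = {
    "s1": ("section", ["t1", "s2"]),
    "t1": ("title", []),
    "s2": ("section", ["t2", "p1"]),
    "t2": ("title", []),
    "p1": ("paragraph", ["f1"]),
    "f1": ("figure", []),
}


def descendants(node):
    """Yield every proper descendant of node in document order."""
    for child in DOC[node][1]:
        yield child
        yield from descendants(child)


def twig_matches():
    """Naively enumerate all matches of section[.//title]//figure."""
    results = []
    for node, (tag, _) in DOC.items():
        if tag != "section":
            continue
        below = list(descendants(node))
        titles = [d for d in below if DOC[d][0] == "title"]
        figures = [d for d in below if DOC[d][0] == "figure"]
        # Every (title, figure) pair below the same section is one answer.
        results.extend((node, t, f) for t, f in product(titles, figures))
    return results


print(twig_matches())
```

With this assumed tree the matcher returns the same three solutions that the slide lists.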

Page 6

An example for TJFast algorithm

[Figure: document tree with the nodes a1, a2, a3, b1, b2, c1, c2, d1, d2, d3, labeled with extended Dewey labels (Root 0, 0.0, 0.0.1, 0.3, 0.3.1, 0.3.2, 0.3.2.1, 0.5, 0.5.0, 0.5.0.0); query twig over the nodes A, B, C, D; two label lists, 0.3.2.1, 0.5.0.0 and 0.0.1, 0.3.1, 0.5.0, for the streams TD and TC; the set for the branching node A: { }]

Page 7

XML twig query processing (Cont.)

Several efficient pattern matching algorithms
  TJFast (VLDB 05)
  iTwigJoin (SIGMOD 05)
  TwigStackList (CIKM 04)
  TreeMatch (TKDE 10)

Current work: distributed XML twig pattern processing

Page 8

XML twig query processing

Jiaheng Lu, Ting Chen, Tok Wang Ling: Efficient processing of XML twig patterns with parent child edges: a look-ahead approach. CIKM 2004:533-542

Jiaheng Lu, Tok Wang Ling, Chee Yong Chan, Ting Chen: From Region Encoding To Extended Dewey: On Efficient Processing of XML Twig Pattern Matching. VLDB 2005:193-204

Jiaheng Lu, Tok Wang Ling: Labeling and Querying Dynamic XML Trees. APWeb 2004:180-189

Jiaheng Lu, Ting Chen, Tok Wang Ling: TJFast: effective processing of XML twig pattern matching. WWW (Special interest tracks and posters) 2005:1118-1119

Jiaheng Lu, Tok Wang Ling, Tian Yu, Changqing Li, Wei Ni: Efficient Processing of Ordered XML Twig Pattern. DEXA 2005:300-309

Jiaheng Lu: Benchmarking Holistic Approaches to XML Tree Pattern Query Processing - (Extended Abstract of Invited Talk). DASFAA Workshops 2010:170-178

Tian Yu, Tok Wang Ling, Jiaheng Lu: TwigStackList-: A Holistic Twig Join Algorithm for Twig Query with Not-Predicates on XML Data. DASFAA 2006:249-263

Zhifeng Bao, Tok Wang Ling, Jiaheng Lu, Bo Chen: SemanticTwig: A Semantic Approach to Optimize XML Query Processing. DASFAA 2008:282-298

Ting Chen, Jiaheng Lu, Tok Wang Ling: On Boosting Holism in XML Twig Pattern Matching using Structural Indexing Techniques. SIGMOD 2005:455-466

……

Page 9

XML keyword search

Background: XQuery vs. keyword search

Task: query papers by "Mike"

XQuery (complicated): for $a in doc("bib.xml")//author, $n in $a/name where $n = "Mike" return $a//inproceedings

Keyword search: Mike, inproceedings

Page 10

The proposed keyword search returns the set of smallest trees containing all keywords.

[Figure: XML tree rooted at bib with two author subtrees. Each author has name, publications, and hobby children; the publications contain inproceedings, article, and book entries with title and year. Text values shown include "Mike", "Paperfolding", "Information Retrieval", "John", "Data Mining", "Keyword Search in XML", and the years 2002, 2007, and 2009. Example keywords: Mike, hobby, article, 2009, Paper]

XML keyword search

Page 11

Effectiveness

Capture user's search intention
  Identify the target that users intend to search for
  Infer the predicate constraint that user intends to search via

Result ranking
  Rank the query results according to their objective relevance to user search intention

Page 12

Zhifeng Bao, Jiaheng Lu, Tok Wang Ling: XReal: an interactive XML keyword searching. CIKM 2010:1933-1934

Zhifeng Bao, Jiaheng Lu, Tok Wang Ling, Liang Xu, Huayu Wu: An Effective Object-Level XML Keyword Search. DASFAA 2010:93-109

Zhifeng Bao, Jiaheng Lu, Tok Wang Ling, Bo Chen: Towards an Effective XML Keyword Search. TKDE, 22(8):1077-1092 (2010)

Zhifeng Bao, Bo Chen, Tok Wang Ling, Jiaheng Lu: Demonstrating Effective Ranked XML Keyword Search with Meaningful Result Display. DASFAA 2009:750-754

Zhifeng Bao, Tok Wang Ling, Bo Chen, Jiaheng Lu: Effective XML Keyword Search with Relevance Oriented Ranking. ICDE 2009:517-528

Bo Chen, Jiaheng Lu, Tok Wang Ling: Exploiting ID References for Effective Keyword Search in XML Documents. DASFAA 2008:529-537

Jianjun Xu, Jiaheng Lu, Wei Wang, Baile Shi: Effective Keyword Search in XML Documents Based on MIU. DASFAA 2006:702-716

……

XML keyword search

Page 13

Outline

XML data management
  XML twig query processing
  XML keyword search

Approximate string matching

Reverse Spatial and Textual k Nearest Neighbor Search

Page 14

Motivation: Data Cleaning

Source: http://en.wikipedia.org/wiki/Heisenberg's_microscope, Jan 2008

Real-world data is dirty
  Typos
  Inconsistent representations (PO Box vs. P.O. Box)

Approximately check against clean dictionary

[Screenshot annotation: should clearly be “Niels Bohr”]

Page 15

Motivation: Record Linkage

Name | Hobbies | Address
Brad Pitt | … | …
Forest Whittacker | … | …
George Bush | … | …
Angelina Jolie | … | …
Arnold Schwarzenegger | … | …

Phone | Age | Name
… | … | Brad Pitt
… | … | Arnold Schwarzeneger
… | … | George Bush
… | … | Angelina Jolie
… | … | Forrest Whittaker

We want to link records belonging to the same entity

No exact match!

The same entity may have similar representations

Arnold Schwarzeneger versus Arnold Schwarzenegger

Forrest Whittaker versus Forest Whittacker

Page 16

Motivation: Query Relaxation

http://www.google.com/jobs/britney.html

Actual queries gathered by Google

Errors in queries

Errors in data

Bring query and meaningful results closer together

Page 17

What is Approximate String Search?

String Collection: (People)
  Brad Pitt
  Forest Whittacker
  George Bush
  Angelina Jolie
  Arnold Schwarzeneger
  ………

Queries against collection:
  Find all entries similar to “Forrest Whitaker”
  Find all entries similar to “Arnold Schwarzenegger”
  Find all entries similar to “Brittany Spears”

What do we mean by similar to?
  - Edit Distance
  - Jaccard Similarity
  - Cosine Similarity
  - Dice
  - Etc.

The similar to predicate can help our described applications!

How can we support these types of queries efficiently?
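Two of the similarity measures named above can be sketched briefly. This is a minimal illustration, not code from the talk: `edit_distance` is the textbook Levenshtein dynamic program, and `jaccard` assumes lowercase whitespace tokenization, which is a choice made here for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row of the DP table at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the word sets (tokenization is an assumption)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


print(edit_distance("Arnold Schwarzeneger", "Arnold Schwarzenegger"))
print(jaccard("PO Box 12", "P.O. Box 12"))
```

Edit distance suits typos inside a token, while set-based measures such as Jaccard suit reordered or missing tokens.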

Page 18

Approximate Query Answering

Main Idea: Use q-grams as signatures for a string

Sliding window over “irvine”: 2-grams {ir, rv, vi, in, ne}

Intuition: Similar strings share a certain number of grams

Inverted index on grams supports finding all data strings sharing enough grams with a query
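The sliding-window idea and the gram-based inverted index can be sketched as follows. The function names and the small string collection are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict


def qgrams(s: str, q: int = 2) -> list[str]:
    """Slide a window of width q over s and collect each substring."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def build_inverted_index(strings: list[str], q: int = 2) -> dict[str, list[int]]:
    """Map each gram to the sorted list of ids of the strings containing it."""
    index = defaultdict(list)
    for sid, s in enumerate(strings):
        for gram in set(qgrams(s, q)):  # one posting per string and gram
            index[gram].append(sid)
    return dict(index)


print(qgrams("irvine"))  # ['ir', 'rv', 'vi', 'in', 'ne']
index = build_inverted_index(["irvine", "irving", "alpine"])
print(index["in"])
```

Because string ids are visited in increasing order, every posting list comes out sorted, which is what merge-based list processing expects.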

Page 19

Approximate Query Example

Query: “irvine”, Edit Distance 1
2-grams {ir, rv, vi, in, ne}

Lookup Grams

[Figure: inverted lists of string IDs for the 2-grams tf, vi, ir, ef, rv, ne, un, in, ...; the list contents as extracted are 134579, 59, 15, 1239, 39, 79, 569, 12456]

Each edit operation can “destroy” at most q grams, so answers must share at least T = 5 - 1 * 2 = 3 grams with the query.

T-Occurrence problem: find the elements that occur at least T = 3 times among the inverted lists. This is called list merging; T is called the merging threshold.

Candidates = {1, 5, 9}
  May have false positives
  Need to compute real similarity
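The count filter and the verification step can be sketched end to end. This is a minimal illustration on a hypothetical four-string collection, not the slide's inverted lists, and it counts occurrences with a `Counter` instead of one of the specialized list-merging algorithms.

```python
from collections import Counter, defaultdict


def qgrams(s, q=2):
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def edit_distance(a, b):
    """Levenshtein distance (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def search(strings, query, k, q=2):
    """Return (candidates, answers) for edit distance <= k."""
    index = defaultdict(set)
    for sid, s in strings.items():
        for gram in qgrams(s, q):
            index[gram].add(sid)
    grams = qgrams(query, q)
    threshold = len(grams) - k * q  # merging threshold T
    if threshold <= 0:
        # The filter cannot prune anything, so every string is a candidate.
        candidates = sorted(strings)
    else:
        counts = Counter(sid for g in grams for sid in index.get(g, ()))
        candidates = sorted(sid for sid, c in counts.items() if c >= threshold)
    # Candidates may contain false positives, so verify the real distance.
    answers = [sid for sid in candidates
               if edit_distance(strings[sid], query) <= k]
    return candidates, answers


STRINGS = {1: "irvine", 2: "irving", 3: "marvin", 4: "alpine"}  # hypothetical
print(search(STRINGS, "irvine", k=1))
```

In this toy collection "marvin" shares three grams with the query and passes the filter, yet its edit distance exceeds 1, so verification removes it.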

Page 20

Approximate string matching

Jiaheng Lu, Jialong Han, Xiaofeng Meng: Efficient algorithms for approximate member extraction using signature-based inverted lists. CIKM 2009:315-324

Alexander Behm, Shengyue Ji, Chen Li, Jiaheng Lu: Space-Constrained Gram-Based Indexing for Efficient Approximate String Search. ICDE 2009:604-615

Chen Li, Jiaheng Lu, Yiming Lu: Efficient Merging and Filtering Algorithms for Approximate String Searches. ICDE 2008:257-266

Yuanzhe Cai, Gao Cong, Xu Jia, Hongyan Liu, Jun He, Jiaheng Lu, Xiaoyong Du: Efficient Algorithm for Computing Link-Based Similarity in Real World Networks. ICDM 2009:734-739

……

Page 21

Outline

XML data management
  XML twig query processing
  XML keyword search

Approximate string matching

Reverse Spatial and Textual k Nearest Neighbor Search (SIGMOD 2011)

Page 22

Motivation

If a new shop is added at Q, which shops will be influenced?

Influence factors
  Spatial distance. Results: D, F
  Textual similarity (services/products...). Results: F, C

[Figure: map of shops labeled by category (food, clothes, sports) around the new location Q]

Page 23

Problems of finding Influential Sets

Traditional query
  Reverse k nearest neighbor query (RkNN)

Our new query
  Reverse spatial and textual k nearest neighbor query (RSTkNN)

Page 24

Problem Statement

Spatial-Textual Similarity
• describe the similarity between such objects based on both spatial proximity and textual similarity.

Spatial-Textual Similarity Function
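The formula itself did not survive in this transcript, so the following is only a sketch of one plausible form, assuming a weighted sum in which the parameter alpha (mentioned later in the talk) balances spatial proximity against textual similarity. The Euclidean distance, the normalization by a hypothetical `max_dist`, and the extended Jaccard measure are all assumptions made for illustration.

```python
import math


def extended_jaccard(u, v):
    """Extended Jaccard (Tanimoto) similarity of two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    denom = sum(a * a for a in u) + sum(b * b for b in v) - dot
    return dot / denom if denom else 1.0


def sim_st(o1, o2, alpha, max_dist):
    """Weighted sum of normalized spatial proximity and textual similarity.

    Assumptions: objects are (x, y, vector) tuples, distance is Euclidean
    and divided by max_dist, and alpha in [0, 1] weights the spatial part.
    """
    dist = math.dist(o1[:2], o2[:2])
    spatial = 1.0 - dist / max_dist
    textual = extended_jaccard(o1[2], o2[2])
    return alpha * spatial + (1.0 - alpha) * textual


p4 = (3.0, 2.5, (8, 8))
q = (0.5, 2.5, (8, 8))
print(sim_st(p4, q, alpha=0.6, max_dist=5.0))
```

A larger alpha makes nearby objects count for more, while a smaller alpha favors objects with similar text.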


Page 25

Problem Statement (cont.)

An RSTkNN query finds the objects that have the query object as one of their k most spatial-textually similar objects.
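The definition can be made concrete with a brute-force evaluation. This is a sketch for illustration, not the indexed algorithm presented later in the talk; the similarity function is passed in as a parameter, and the tie handling (ties count in favor of the query) is an assumption.

```python
def rstknn_brute_force(objects, query, k, sim):
    """Return the ids of objects that have `query` among their k most similar.

    objects: dict mapping id -> object
    sim:     similarity function, higher means more similar
    An object qualifies when fewer than k other objects are strictly more
    similar to it than the query is.
    """
    results = []
    for oid, obj in objects.items():
        to_query = sim(obj, query)
        better = sum(1 for other_id, other in objects.items()
                     if other_id != oid and sim(obj, other) > to_query)
        if better < k:
            results.append(oid)
    return results


# Toy usage: 1-D points with similarity = negative distance (hypothetical).
points = {"a": 0.0, "b": 1.0, "c": 10.0}
print(rstknn_brute_force(points, 2.0, k=1, sim=lambda x, y: -abs(x - y)))
```

In the toy data, c is far from the query but still counts as a reverse neighbor, because the query is closer to c than any other object is. The scan costs one similarity computation per pair of objects, which is why an index with pruning bounds is needed at scale.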


Page 26

Related Work

• Pre-computing the kNN for each object (Korn et al., SIGMOD 2000; Yang et al., ICDE 2001)
• (Hyper) Voronoi cell/planes pruning strategy (Tao et al., VLDB 2004; Wu et al., PVLDB 2008; Kriegel et al., ICDE 2009)
• 60-degree-pruning method (Stanoi et al., SIGMOD 2000)
• Branch and Bound (based on Lp-norm metric space) (Achtert et al., SIGMOD 2006; Achtert et al., EDBT 2009)

Challenging Features:

• Lose Euclidean geometric properties.
• High dimension in text space.
• k and α are different from query to query.

Page 27

Intersection and Union R-tree (IUR-tree)

[Figure: objects p1 to p5 and the query q(0.5, 2.5) in the x-y plane, grouped into the leaf nodes N1 = {p1}, N2 = {p2, p3}, N3 = {p4, p5} under the root N4. Each leaf entry stores an object's MBR and a pointer to its text vector ObjVct; each non-leaf entry stores a node's MBR, its object count, and a pointer IntUniVct to its intersection and union vectors.]

Object | x | y | Text vector
p1 | 4 | 4 | ObjVct1 = (1, 1)
p2 | 0 | 1 | ObjVct2 = (1, 1)
p3 | 1 | 0 | ObjVct3 = (5, 5)
p4 | 3 | 2.5 | ObjVct4 = (8, 8)
p5 | 3.5 | 1.5 | ObjVct5 = (1, 1)
q | 0.5 | 2.5 | ObjVctQ = (8, 8)

Node | MBR | Objects | Intersection vector | Union vector
N1 | [4,4][4,4] | 1 | IntVct1 = (1, 1) | UniVct1 = (1, 1)
N2 | [0,0][1,1] | 2 | IntVct2 = (1, 1) | UniVct2 = (5, 5)
N3 | [3,1.5][3.5,2.5] | 2 | IntVct3 = (1, 1) | UniVct3 = (8, 8)
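How a parent entry might summarize its children can be sketched as follows. The sketch assumes that the intersection vector takes the per-word minimum weight and the union vector the per-word maximum, which is consistent with the numbers shown for N2 on this slide; the dictionary layout and function names are hypothetical.

```python
def leaf(x, y, vector):
    """Entry for a single object: a point MBR, and int == uni == its vector."""
    return {"mbr": (x, y, x, y), "int": tuple(vector),
            "uni": tuple(vector), "count": 1}


def summarize(children):
    """Build a parent entry from its child entries.

    mbr is (xmin, ymin, xmax, ymax); 'int' and 'uni' are text vectors.
    """
    return {
        "mbr": (min(c["mbr"][0] for c in children),
                min(c["mbr"][1] for c in children),
                max(c["mbr"][2] for c in children),
                max(c["mbr"][3] for c in children)),
        # Per-word minimum over the children's intersection vectors.
        "int": tuple(min(ws) for ws in zip(*(c["int"] for c in children))),
        # Per-word maximum over the children's union vectors.
        "uni": tuple(max(ws) for ws in zip(*(c["uni"] for c in children))),
        "count": sum(c["count"] for c in children),
    }


p2 = leaf(0, 1, (1, 1))
p3 = leaf(1, 0, (5, 5))
n2 = summarize([p2, p3])
print(n2)
```

Storing both vectors lets a search bound the textual similarity between the query and any object below the entry without reading the objects themselves.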

Page 28

Overview of Search Algorithm

RSTkNN Algorithm:
  Travel from the IUR-tree root
  Progressively update lower and upper bounds
  Apply search strategy: prune unrelated entries to Pruned; report entries to be results Ans; add candidate objects to Cnd.

FinalVerification
  For objects in Cnd, check whether they are results or not by updating the bounds for candidates using expanding entries in Pruned.
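The three-way decision in the search strategy can be sketched as a comparison of bounds. This is an assumption about how the bounds are used, not a transcription of the algorithm: for an entry E, one pair of bounds is taken to limit the similarity between the query and any object in E, and the other pair to limit the similarity between an object in E and its k-th most similar object. All names are hypothetical.

```python
def classify(min_sim_to_q, max_sim_to_q, knn_lower, knn_upper):
    """Decide what to do with an entry given its current bounds.

    min_sim_to_q, max_sim_to_q: bounds on the similarity between the query
        and any object in the entry.
    knn_lower, knn_upper: bounds on the similarity between an object in the
        entry and its k-th most similar object.
    """
    if max_sim_to_q < knn_lower:
        return "prune"      # the query can never enter the top k
    if min_sim_to_q >= knn_upper:
        return "report"     # the query is guaranteed to be in the top k
    return "candidate"      # undecided: expand the entry or verify later


print(classify(0.10, 0.30, knn_lower=0.37, knn_upper=0.432))
print(classify(0.65, 0.70, knn_lower=0.21, knn_upper=0.619))
print(classify(0.30, 0.50, knn_lower=0.374, knn_upper=0.45))
```

Tighter bounds move more entries out of the undecided case, which is what the later optimizations aim for.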

Page 29

Example: Execution of the RSTkNN Algorithm on IUR-tree, given k=2, alpha=0.6

[Figure: the IUR-tree (root N4 with the children N1 = {p1}, N2 = {p2, p3}, N3 = {p4, p5}), the objects p1 to p5 and the query q(0.5, 2.5) in the x-y plane, and the table of object coordinates and text vectors]

EnQueue(U, N4);
Initialize N4.CLs;

U: N4(0, 0)

Page 30

Example: Execution of the RSTkNN Algorithm on IUR-tree, given k=2, alpha=0.6

[Figure: the same IUR-tree, object layout, and vector table as on the previous slide]

U: N4(0, 0)

DeQueue(U, N4)
Mutual-effect: N1 and N2; N1 and N3; N2 and N3
EnQueue(U, N2)
EnQueue(U, N3)
Pruned.add(N1)

U: N3(0.323, 0.619), N2(0.21, 0.619)
Pruned: N1(0.37, 0.432)

Page 31

Example: Execution of the RSTkNN Algorithm on IUR-tree, given k=2, alpha=0.6

[Figure: the same IUR-tree, object layout, and vector table as on the previous slides]

U: N3(0.323, 0.619), N2(0.21, 0.619)
Pruned: N1(0.37, 0.432)

DeQueue(U, N3)
Mutual-effect: p4 with N2; p5 with p4, N2
Answer.add(p4)
Candidate.add(p5)

Answer: p4(0.21, 0.619)
Candidate: p5(0.374, 0.374)

Page 32

Example: Execution of the RSTkNN Algorithm on IUR-tree, given k=2, alpha=0.6

[Figure: the same IUR-tree, object layout, and vector table as on the previous slides]

U: N2(0.21, 0.619)
Pruned: N1(0.37, 0.432)

DeQueue(U, N2)
Mutual-effect: p2 with p4, p5; p3 with p2, p4, p5
Answer.add(p2, p3)
Pruned.add(p5)

Answer: p4, p2, p3
Candidate: p5(0.374, 0.374)

So far, since U = Cand = empty, the algorithm ends.

Results: p2, p3, p4.

Page 33

Cluster IUR-tree: CIUR-tree

IUR-tree: Texts in an index node could be very different.

CIUR-tree: An enhanced IUR-tree by incorporating textual clusters.

[Figure: the IUR-tree example extended with cluster information. Each object belongs to a textual cluster (p1: C1, p2: C2, p3: C2, p4: C3, p5: C1), the text vectors are shown over word1 and word2, and each non-leaf entry additionally records its number of objects per cluster (N1: C1:1; N2: C2:2; N3: C1:1, C3:1).]

Page 34

Optimizations

Motivation
  To give a tighter bound during CIUR-tree traversal
  To purify the textual description in the index node

Outlier Detection and Extraction (ODE-CIUR)
  Extract subtrees with outlier clusters
  Take the outliers into special account and calculate their bounds separately.

Text-entropy based optimization (TE-CIUR)
  Define TextEntropy to depict the distribution of text clusters in an entry of CIUR-tree
  Travel first for the entries with higher TextEntropy, i.e. more diverse in texts.
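The slide does not give the TextEntropy formula, so the following sketch assumes plain Shannon entropy over an entry's per-cluster object counts. The entry names and counts are taken from the CIUR-tree example earlier in the talk, and the function name is hypothetical.

```python
import math


def text_entropy(cluster_counts):
    """Shannon entropy (in bits) of an entry's cluster distribution.

    Assumption: cluster_counts maps cluster id -> number of objects.
    """
    total = sum(cluster_counts.values())
    if total == 0:
        return 0.0
    probs = [c / total for c in cluster_counts.values() if c > 0]
    return sum(p * math.log2(1 / p) for p in probs)


entries = {
    "N1": {"C1": 1},
    "N2": {"C2": 2},
    "N3": {"C1": 1, "C3": 1},
}
# Visit the textually most diverse entries first.
order = sorted(entries, key=lambda name: text_entropy(entries[name]), reverse=True)
print(order)
```

An entry whose objects all fall in one cluster has entropy 0, so under this assumption N3, which mixes two clusters, is expanded before N1 and N2.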

Page 35

Experimental Study

Experimental Setup
  OS: Windows XP; CPU: 2.0GHz; Memory: 4GB
  Page size: 4KB; Language: C/C++.

Compared Methods
  baseline, IUR-tree, ODE-CIUR, TE-CIUR, and ODE-TE.

Datasets
  ShopBranches (Shop), extended from a small real data
  GeographicNames (GN), real data
  CaliforniaDBpedia (CD), generated combining location in California and documents from DBpedia.

Metric
  Total query time
  Page access number

Statistics | Shop | CD | GN
Total # of objects | 304,008 | 1,555,209 | 1,868,821
Total unique words in dataset | 3,933 | 21,578 | 222,409
Average # words per object | 45 | 47 | 4

Page 36

Scalability

[Figure: query time (sec) versus dataset size (50K, 300K, 550K, 800K, 1050K) for baseline, IUR-Tree, ODE-CIUR, TE-CIUR, and ODE-TE; panel (1) log-scale version, panel (2) linear-scale version; additional tick labels 0.2K, 3K, 40K, 550K, 4M]

Page 37

Effect of k

[Figure: query time (sec) versus k (1, 3, 5, 7, 9) for IUR-Tree, ODE-CIUR, TE-CIUR, and ODE-TE]

Page 38

Conclusion

Propose a new query problem RSTkNN.
Present a hybrid index IUR-Tree.
Show the enhanced variant CIUR-Tree and two optimizations ODE-CIUR and TE-CIUR to further improve search processing.

Page 39

Current and future works

Distributed XML query processing

Cloud-based SQL Processing

Spatial and Temporal Keyword search

Page 40

Thank you

Q&A