Vivek Kantariya (09bce020) Guided by : Prof. Vibha Patel

Indexing Data Structure

Aug 29, 2014

Page 1: Indexing Data Structure

Vivek Kantariya (09bce020)

Guided by: Prof. Vibha Patel

Page 2: Indexing Data Structure

• Manage large data
• Provide faster access
• Easy search
• Reduce unwanted memory access
• Proper memory allocation
• Increase efficiency

Page 3: Indexing Data Structure

An index entry contains a search key and a pointer.

Search key - an attribute or set of attributes that is used to look up the records in a file.

Pointer - contains the address of where the data is stored in memory.
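
As a minimal sketch of the search key / pointer pairing, the Python below builds an index over a small in-memory file. The record layout, the use of a byte offset as the "pointer", and the names `records`, `index` and `lookup` are assumptions made for illustration only.

```python
import io

# Hypothetical data file: one comma-separated record per line.
records = [b"101,Asha\n", b"205,Ravi\n", b"317,Meera\n"]

data_file = io.BytesIO()
index = {}  # search key -> pointer (here: byte offset of the record)

for rec in records:
    key = int(rec.split(b",")[0])      # the search key is the first field
    index[key] = data_file.tell()      # remember where this record starts
    data_file.write(rec)

def lookup(key):
    """Follow the pointer stored for `key` and read only that record."""
    offset = index.get(key)
    if offset is None:
        return None
    data_file.seek(offset)
    return data_file.readline().decode().rstrip("\n")
```

Calling `lookup(205)` jumps straight to the stored offset instead of scanning the file from the start.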

Page 4: Indexing Data Structure

Five factors involved when choosing the indexing technique:

1) Access type
2) Access time
3) Insertion time
4) Deletion time
5) Space overhead

Page 5: Indexing Data Structure

1) Access type - the type of access supported, such as finding records with a specified key value or with key values in a specified range.

2) Access time - time required to locate the data.

3) Insertion time - time required to insert the new data.

4) Deletion time - time required to delete the data.

5) Space overhead - the additional space occupied by the index structure.

Page 6: Indexing Data Structure

• Spatial indexing is for multi-dimensional data
• Used to describe 2D or 3D objects
• Real-world usage
• Examples are: R-tree, R+-tree, k-d tree, A-tree, Hilbert R-tree, etc.

Page 7: Indexing Data Structure

• Computer Aided Design (CAD)
• Geographic applications (like maps)
• Multimedia applications (like X-rays)
• Biological databases

Page 8: Indexing Data Structure

• Any type of geometry
◦ Point: city
◦ Line: trail
◦ Polygon: border
◦ A collection of geometries: ski resort trails
• Any coordinate system
◦ Meters
◦ Pixels
◦ WGS84 (GPS)

Page 9: Indexing Data Structure
Page 10: Indexing Data Structure

• Proposed by Antonin Guttman, UC Berkeley
• All spatial data is enveloped in a Minimum Bounding Rectangle (MBR)
• Stored and indexed according to MBR
• Structure resembles a B+-tree
• Height balanced
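
A minimal sketch of how an MBR can be computed for a 2D geometry given as a list of points. The tuple layout `(min_x, min_y, max_x, max_y)` and the sample `trail` are assumptions chosen for illustration.

```python
def mbr(points):
    """Return the minimum bounding rectangle of 2D points
    as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical line geometry (a trail) given by three vertices.
trail = [(2, 3), (5, 1), (4, 7)]
trail_mbr = mbr(trail)   # the rectangle that gets stored in the index
```

The index only ever stores and compares these rectangles; the exact geometry is consulted after a candidate has been found.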

Page 11: Indexing Data Structure

• A leaf index record has the form <I, tuple-identifier>
• I = (I0, I1, …, In-1)
• n = number of dimensions in the geometry
• Each Ii is an interval [a, b] describing the range of the rectangle along dimension i
• a or b can be equal to infinity
• Tuple-identifier points to a record
• Non-leaf nodes hold entries of the form <I, child-pointer>
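
One possible way to model these two record forms in Python is sketched below. The class names `Rect`, `LeafEntry` and `NonLeafEntry`, and the choice to treat touching rectangles as overlapping, are assumptions and not part of the original slides.

```python
from dataclasses import dataclass
from typing import Any, Tuple

Interval = Tuple[float, float]          # one closed interval [a, b]

@dataclass(frozen=True)
class Rect:
    intervals: Tuple[Interval, ...]     # I = (I0, ..., In-1), one per dimension

    def overlaps(self, other: "Rect") -> bool:
        # Rectangles overlap when their intervals overlap in every dimension
        # (touching edges count as overlap in this sketch).
        return all(a1 <= b2 and a2 <= b1
                   for (a1, b1), (a2, b2) in zip(self.intervals, other.intervals))

@dataclass
class LeafEntry:
    I: Rect
    tuple_identifier: Any               # points to a record

@dataclass
class NonLeafEntry:
    I: Rect
    child_pointer: Any                  # points to a lower node

INF = float("inf")
# A 2D object that extends indefinitely along the first dimension.
road = LeafEntry(Rect(((0.0, INF), (2.0, 3.0))), tuple_identifier=17)
window = Rect(((100.0, 120.0), (2.5, 4.0)))
```

Here `road.I.overlaps(window)` holds because the unbounded interval covers any finite range in the first dimension.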

Page 12: Indexing Data Structure

• M is the maximum number of entries in one node
• m specifies the minimum number of entries in a node, where m ≤ M/2
• Properties:

1. Every leaf node contains between m and M index records unless it is the root.

2. For each index record <I, tuple-identifier> in a leaf node, I is the smallest rectangle that spatially contains the n-dimensional data object.

Page 13: Indexing Data Structure

3. Every non-leaf node has between m and M children unless it is the root.

4. For each entry <I, child-pointer> in a non-leaf node, I is the smallest rectangle that spatially contains the rectangles in the child node.

5. The root node has at least two children unless it is a leaf.

6. All leaves appear on the same level.
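
The properties can be read as invariants that a checker can verify. The sketch below assumes 2D rectangles stored as `(min_x, min_y, max_x, max_y)` and a hypothetical `Node` class. Property 2 is not checked because it needs the original data objects.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    leaf: bool
    entries: list = field(default_factory=list)  # (rect, child Node or tuple-id)

def union(rects):
    """Smallest rectangle containing every rectangle in `rects`."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def check(node, m, M, is_root=True):
    """Return the height of the subtree; raise AssertionError on a violation."""
    n = len(node.entries)
    if is_root:
        assert node.leaf or n >= 2, "property 5: root needs two children"
        assert n <= M, "node holds more than M entries"
    else:
        assert m <= n <= M, "properties 1 and 3: between m and M entries"
    if node.leaf:
        return 1
    heights = set()
    for rect, child in node.entries:
        assert rect == union([r for r, _ in child.entries]), "property 4"
        heights.add(check(child, m, M, is_root=False))
    assert len(heights) == 1, "property 6: all leaves on the same level"
    return 1 + heights.pop()

# Hypothetical two-level tree with m = 2 and M = 4.
a = Node(True, [((0, 0, 1, 1), "a1"), ((2, 2, 3, 3), "a2")])
b = Node(True, [((5, 5, 6, 6), "b1"), ((7, 7, 8, 8), "b2")])
root = Node(False, [((0, 0, 3, 3), a), ((5, 5, 8, 8), b)])
height = check(root, m=2, M=4)
```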

Page 14: Indexing Data Structure
Page 15: Indexing Data Structure

1. Search
2. Insert
3. Delete
4. Nearest Neighbor

Page 16: Indexing Data Structure

1. Given an R-tree with root T, find all index records whose rectangles overlap a search rectangle S.

2. If T is not a leaf, check each entry E to determine whether its rectangle E.I overlaps S.

3. For every overlapping entry, invoke search on the subtree whose root is the node pointed to by E.p.

4. If T is a leaf, check each entry E. If E.I overlaps S, output E.
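
The search steps can be sketched as a short recursive function. The names T and S follow the slide. The dictionary-based node layout, the 2D rectangle tuples `(min_x, min_y, max_x, max_y)` and the sample tree are assumptions for illustration.

```python
def overlaps(r, s):
    """True if rectangles r and s intersect (touching edges count)."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]

def search(T, S):
    """Return the tuple-identifiers of all leaf entries under T that overlap S."""
    found = []
    for I, p in T["entries"]:
        if not overlaps(I, S):
            continue                      # prune this entry and its whole subtree
        if T["leaf"]:
            found.append(p)               # p is a tuple-identifier
        else:
            found.extend(search(p, S))    # p is a child node
    return found

# Hypothetical two-level tree.
left = {"leaf": True, "entries": [((0, 0, 1, 1), "a"), ((2, 2, 3, 3), "b")]}
right = {"leaf": True, "entries": [((5, 5, 6, 6), "c"), ((7, 7, 8, 8), "d")]}
root = {"leaf": False, "entries": [((0, 0, 3, 3), left), ((5, 5, 8, 8), right)]}

hits = search(root, (2.5, 2.5, 5.5, 5.5))
```

Because MBRs of sibling nodes may overlap, more than one subtree may have to be visited for a single query, as happens here.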

Page 17: Indexing Data Structure

1) Start at the root node.
2) Select the child that needs the least enlargement in order to fit the new geometry.
3) Repeat until at a leaf node.
4) If the leaf node has available space, then insert.

Page 18: Indexing Data Structure

5) Else split the leaf node into two nodes.
• Update parent nodes
• Update the entry that pointed to the node with a new MBR [Minimum Bounding Rectangle]
• Add a new entry for the second new node
6) If there is no space in the parent node, split and repeat.
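
A compact sketch of the insert steps follows. It assumes 2D rectangles `(min_x, min_y, max_x, max_y)`, a hypothetical `Node` class, M = 4, and breaks enlargement ties by smaller area. The `split` function is a deliberately naive placeholder (sort by lower x and cut in half), not a split strategy taken from the slides.

```python
M = 4  # assumed maximum number of entries per node

class Node:
    def __init__(self, leaf, entries=None):
        self.leaf = leaf
        self.entries = entries or []   # each entry: [rect, child Node or tuple-id]

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def mbr(node):
    rect = node.entries[0][0]
    for r, _ in node.entries[1:]:
        rect = union(rect, r)
    return rect

def split(node):
    """Placeholder split: sort by lower x, keep one half, return the other."""
    node.entries.sort(key=lambda e: e[0][0])
    half = len(node.entries) // 2
    sibling = Node(node.leaf, node.entries[half:])
    node.entries = node.entries[:half]
    return sibling

def insert(root, rect, tuple_id):
    """Insert one record and return the (possibly new) root."""
    # Steps 1-3: descend, taking the child that needs the least enlargement.
    path, node = [], root
    while not node.leaf:
        entry = min(node.entries,
                    key=lambda e: (area(union(e[0], rect)) - area(e[0]), area(e[0])))
        path.append((node, entry))
        node = entry[1]
    # Step 4: place the record in the leaf.
    node.entries.append([rect, tuple_id])
    # Steps 5-6: walk back up, fixing MBRs and splitting overfull nodes.
    sibling = split(node) if len(node.entries) > M else None
    for parent, entry in reversed(path):
        entry[0] = mbr(entry[1])                      # updated MBR for the old node
        if sibling is not None:
            parent.entries.append([mbr(sibling), sibling])  # entry for the new node
        sibling = split(parent) if len(parent.entries) > M else None
    if sibling is not None:                           # the root itself was split
        root = Node(False, [[mbr(root), root], [mbr(sibling), sibling]])
    return root

root = Node(True)
for i in range(10):
    root = insert(root, (i, i, i + 1, i + 1), i)
```

The tree grows upward only when the root splits, which is what keeps all leaves on the same level.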

Page 19: Indexing Data Structure

Make sure nodes are split so they cover the smallest possible area.

Split should minimize average search time.

[Figure: a good split compared with a bad split]
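
To make "smallest possible area" concrete, the sketch below tries every way of dividing a small set of rectangles into two groups and keeps the division with the least total MBR area. This exhaustive search is only practical for tiny nodes and is shown as an illustration, not as the split method the slides intend. The function name `best_split` and the sample rectangles are assumptions.

```python
from itertools import combinations

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def mbr(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def best_split(rects, m):
    """Try every partition with at least m rectangles per group and return
    (total_area, group_a, group_b) for the cheapest one."""
    best = None
    idx = range(len(rects))
    for k in range(m, len(rects) - m + 1):
        for group in combinations(idx, k):
            a = [rects[i] for i in group]
            b = [rects[i] for i in idx if i not in group]
            cost = area(mbr(a)) + area(mbr(b))
            if best is None or cost < best[0]:
                best = (cost, a, b)
    return best

# Two tight clusters far apart.
rects = [(0, 0, 1, 1), (1, 0, 2, 1), (8, 8, 9, 9), (9, 8, 10, 9)]
cost, group_a, group_b = best_split(rects, m=2)

# A bad split mixes the clusters, so both MBRs span almost everything.
bad_cost = area(mbr([rects[0], rects[2]])) + area(mbr([rects[1], rects[3]]))
```

With this data the good split costs 4 units of area and the mixed split costs 162, so a search rectangle is far more likely to hit both nodes after the bad split.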

Page 20: Indexing Data Structure

To remove index record E from the R-tree:

1) Find the leaf node containing E.
2) Remove E from that node.
3) If the node now contains fewer than m entries, remove the node and add its entries to a queue.
4) Move up the tree and do the same at each level, reducing the covering rectangles of the nodes that remain.
5) Reinsert all records from the queue.
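
A simplified sketch of the delete procedure follows. It assumes 2D rectangles `(min_x, min_y, max_x, max_y)` and a hypothetical `Node` class. Two simplifications are made: the function returns the queued records instead of reinserting them (the caller would pass them to an insert routine), and an underfull non-leaf node is flattened to its leaf records. It also does not shorten a root that is left with a single child.

```python
class Node:
    def __init__(self, leaf, entries):
        self.leaf = leaf
        self.entries = entries          # each entry: [rect, child Node or tuple-id]

def contains(outer, inner):
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def mbr(node):
    rects = [r for r, _ in node.entries]
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def find_leaf(node, rect, tuple_id, path):
    """Return the leaf holding the record; `path` collects (parent, entry) pairs."""
    if node.leaf:
        return node if [rect, tuple_id] in node.entries else None
    for entry in node.entries:
        if contains(entry[0], rect):
            path.append((node, entry))
            leaf = find_leaf(entry[1], rect, tuple_id, path)
            if leaf is not None:
                return leaf
            path.pop()                  # wrong subtree, try the next one
    return None

def leaf_records(node):
    if node.leaf:
        return list(node.entries)
    return [rec for _, child in node.entries for rec in leaf_records(child)]

def delete(root, rect, tuple_id, m):
    """Remove one record; return the queue of records that need reinsertion."""
    path = []
    leaf = find_leaf(root, rect, tuple_id, path)
    if leaf is None:
        return []
    leaf.entries.remove([rect, tuple_id])
    queue = []
    for parent, entry in reversed(path):
        child = entry[1]
        if len(child.entries) < m:
            parent.entries.remove(entry)        # drop the underfull node
            queue.extend(leaf_records(child))   # keep its records for reinsertion
        else:
            entry[0] = mbr(child)               # reduce the covering rectangle
    return queue

# Hypothetical tree with m = 2.
a = Node(True, [[(0, 0, 1, 1), "a1"], [(2, 2, 3, 3), "a2"]])
b = Node(True, [[(5, 5, 6, 6), "b1"], [(7, 7, 8, 8), "b2"], [(9, 9, 10, 10), "b3"]])
root = Node(False, [[(0, 0, 3, 3), a], [(5, 5, 10, 10), b]])
```

Reinserting queued records, rather than merging the underfull node with a sibling, reuses the insert routine and lets the records settle wherever they now fit best.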

Page 21: Indexing Data Structure

• Split entries in the tree so that there is no overlap
• No more multiple paths to reach a solution
• Child pointers duplicated within the tree

[Figure: R-Tree MBRs compared with R+-Tree MBRs]

Page 22: Indexing Data Structure

• Do not split nodes on insert
• Take entries from the overfull node and reinsert them into the tree
• Changes MBRs
• Saves time and possibly rebalances the tree
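
The slide does not say which entries to take from the overfull node. The sketch below assumes one common choice, used by the R*-tree: remove the p entries whose centers lie farthest from the center of the node's MBR. The function name `pick_for_reinsert`, the value of p and the sample entries are assumptions for illustration.

```python
import math

def center(r):
    return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)

def mbr(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def pick_for_reinsert(entries, p):
    """Return (kept, removed): the p entries farthest from the center of the
    node's MBR are removed so they can be inserted into the tree again."""
    node_center = center(mbr([r for r, _ in entries]))
    ranked = sorted(entries, key=lambda e: math.dist(center(e[0]), node_center))
    cut = len(entries) - p
    return ranked[:cut], ranked[cut:]

# Overfull node: three entries near the middle and two stragglers.
entries = [((0, 0, 2, 2), "a"), ((4, 4, 6, 6), "b"), ((5, 3, 7, 5), "c"),
           ((3, 5, 5, 7), "d"), ((9, 9, 10, 10), "e")]
kept, removed = pick_for_reinsert(entries, p=2)
```

After the stragglers are removed the node's MBR shrinks, and the removed entries may land in other nodes when they are inserted again.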

Page 23: Indexing Data Structure

1. www.ieeexplore.ieee.org
◦ A NEW APPROACH TO CREATING SPATIAL INDEX WITH R-TREE by Ze-Bao Zhang, Jian-Pei Zhang, Jing Yang, Yue Yang
◦ A NEW VARIATION OF R-TREE FOR INDEXING SPACIAL DATA IN GIS by Chen Yongkang, Zhou Xintie, Shi Tailai, Feng Xiaoming

2. http://wikipedia.org/wiki/R_tree

Page 24: Indexing Data Structure