Index Sen Zhang
Index Sen Zhang. INDEX When a table contains a lot of records, it can take a long time for the search engine of oracle (or other RDBMS) to look through.

Dec 20, 2015

Index

Sen Zhang

INDEX

• When a table contains a lot of records, it can take a long time for the search engine of Oracle (or another RDBMS) to scan the whole table to locate specific records.

Index Structures

• An index is a secondary access structure used to speed up the retrieval of records in response to certain search conditions.

• Indexes are useful because they help you locate a specific target within a large amount of data without having to look through every record.

A data table

First name  Last name  State of residence  Year of birth  Year of entry
John        Edward     VM                  1989           2001
Kathy       Alex       NY                  1987           2002
Joseph      Bush       NJ                  1983           2000
George      Clinton    CA                  1983           1999
Alex        Jordan     VA                  1991           2000
Bill        Herbert    AZ                  1984           1999
James       Perl       SC                  1986           1998
Narian      Geller     NC                  1992           1999
Frank       Thomason   FL                  1994           2005

Index table vs. data table

Data table (row numbers shown for reference)

Row  First name  Last name  State of residence  Year of birth  Year of entry
1    John        Edward     VM                  1989           2001
2    Kathy       Alex       NY                  1987           2002
3    Joseph      Bush       NJ                  1983           2000
4    George      Clinton    CA                  1983           1999
5    Alex        Jordan     VA                  1991           2000
6    Bill        Herbert    AZ                  1984           1999
7    James       Perl       SC                  1986           1998
8    Narian      Geller     NC                  1992           1999
9    Frank       Thomason   FL                  1994           2005

Index table on first name

First name  Index pointer (row)
Alex        5
Bill        6
Frank       9
George      4
James       7
John        1
Joseph      3
Kathy       2
Narian      8
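The relationship between the two tables can be sketched in Python as a sorted list of (key, row number) pairs that is searched with binary search. This is only an illustration: the names `rows`, `first_name_index`, and `lookup` are hypothetical, and list positions stand in for the physical addresses a real RDBMS would store.

```python
from bisect import bisect_left

# Data table from the slide; row numbers start at 1.
rows = [
    ("John", "Edward", "VM", 1989, 2001),
    ("Kathy", "Alex", "NY", 1987, 2002),
    ("Joseph", "Bush", "NJ", 1983, 2000),
    ("George", "Clinton", "CA", 1983, 1999),
    ("Alex", "Jordan", "VA", 1991, 2000),
    ("Bill", "Herbert", "AZ", 1984, 1999),
    ("James", "Perl", "SC", 1986, 1998),
    ("Narian", "Geller", "NC", 1992, 1999),
    ("Frank", "Thomason", "FL", 1994, 2005),
]

# Index table: (lower-cased first name, row number), kept sorted by name.
first_name_index = sorted(
    (row[0].lower(), number) for number, row in enumerate(rows, start=1)
)


def lookup(first_name):
    """Return the data row with this first name, or None if absent."""
    key = first_name.lower()
    # Binary search for the first index entry whose key is >= the target.
    pos = bisect_left(first_name_index, (key, 0))
    if pos < len(first_name_index) and first_name_index[pos][0] == key:
        row_number = first_name_index[pos][1]
        return rows[row_number - 1]  # follow the pointer into the data table
    return None


print(lookup("George"))
```

The data table stays in its original order; only the small index is sorted, so several indexes on different columns can coexist over the same rows.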

• Conceptually, each index entry simply holds a row number.

• Physically, however, index entries are pointers to the precise position of a record on external storage (disk), or in memory once the data has been loaded.

• They are pointers to the offsets of records in the physical file.

• The file system in turn maps each of these offsets to a specific offset in a specific sector on a specific track of a specific disk.

Which attributes should be indexed?

• A table could be associated with multiple indexes.

• Which attribute(s) should be indexed? The choice depends on the user requirements (what you want from the indexes), on the application, and on how well the database designer or programmer understands that application.

How an index works in a dynamic environment

• Once you have created an index on a table, Oracle automatically keeps the index synchronized with that table.

• That means that when you insert a new record into the data table, Oracle also inserts a pointer into the index table at the right position. So every insertion takes a little more time.

• Similarly, delete and update operations must also modify the index tables, which adds a little time to each of them. The net effect is less clear-cut than for insertion, though.

• As a tradeoff, the index speeds up other data manipulation operations such as delete, update, and select, because the records they act on can be located without scanning the whole table.
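The extra work on insertion can be sketched as follows. This is a toy model under stated assumptions, not how Oracle implements it: `IndexedTable` is a hypothetical class, the data table is an append-only Python list, and the index is a sorted list kept in order with `bisect.insort`.

```python
from bisect import bisect_left, insort


class IndexedTable:
    """Toy table with one index on the key column (illustrative only)."""

    def __init__(self):
        self.rows = []   # data table, in insertion order
        self.index = []  # sorted (key, row position) pairs

    def insert(self, key, record):
        # Step 1: append the record to the data table (cheap).
        self.rows.append((key, record))
        # Step 2: the extra cost of having an index. The new pointer must
        # be placed at the right position so the index stays sorted.
        insort(self.index, (key, len(self.rows) - 1))

    def find(self, key):
        # The payoff: a binary search over the index, not a full table scan.
        pos = bisect_left(self.index, (key, -1))
        if pos < len(self.index) and self.index[pos][0] == key:
            return self.rows[self.index[pos][1]]
        return None


table = IndexedTable()
for name, state in [("John", "VM"), ("Kathy", "NY"), ("Alex", "VA")]:
    table.insert(name, state)

print(table.index)
print(table.find("Kathy"))
```

In this sketch `insort` has to shift list elements to make room, so each insertion costs time proportional to the index size; tree-structured indexes reduce that cost but do not eliminate it.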

• Insertions into tables with indexes take longer!

• So it is not always a good idea to add indexes to tables, especially in OLTP systems.

OLTP

• OLTP (On-Line Transaction Processing)
  – supports a business's day-to-day activities
  – high insertion rate, simple queries
  – MB or GB of data

OLAP

• OLAP (On-Line Analytical Processing) system or Data Warehouse
  – analyzes operational data
  – low update rate, complex queries
  – GB or TB of data
  – a strategic business decision: high rewards, but low chance of success

DDL statement to create an index

• CREATE INDEX index_name ON table_name (column_name_list);
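The statement can be tried out without an Oracle installation. The sketch below uses SQLite through Python's standard `sqlite3` module, which accepts the same basic `CREATE INDEX` syntax; the table name `students`, its columns, and the index name `idx_students_first_name` are made up for illustration, and SQLite's query planner is not Oracle's.

```python
import sqlite3

# In-memory SQLite database standing in for a real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (first_name TEXT, last_name TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("John", "Edward", "VM"),
        ("Kathy", "Alex", "NY"),
        ("George", "Clinton", "CA"),
    ],
)

# The DDL statement from the slide, with concrete (hypothetical) names.
conn.execute("CREATE INDEX idx_students_first_name ON students (first_name)")

# Ask SQLite how it would run a lookup on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM students WHERE first_name = ?",
    ("George",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the detail
print(plan_text)

result = conn.execute(
    "SELECT * FROM students WHERE first_name = ?", ("George",)
).fetchall()
print(result)
```

The printed plan names the index when the database decides to use it for the lookup instead of scanning the table.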

• An index is an auxiliary way to organize your data file, based on the values it contains.

Why an index works

• Proper data structure

• With a proper data structure, the algorithms for data manipulation operations such as inserting, deleting, updating, and searching can achieve better time complexities.

• For example, searching an unsorted list takes O(n) time, but searching a sorted list can take O(log n) time, for example with binary search.
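The difference between O(n) and O(log n) can be made concrete by counting how many elements each search inspects. The following is a minimal sketch; the function names and the one-million-element test list are chosen for illustration.

```python
def linear_search(items, target):
    """Scan left to right; return (position, elements inspected)."""
    inspected = 0
    for position, item in enumerate(items):
        inspected += 1
        if item == target:
            return position, inspected
    return -1, inspected


def binary_search(sorted_items, target):
    """Halve the search range each step; return (position, elements inspected)."""
    inspected = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        inspected += 1
        if sorted_items[middle] == target:
            return middle, inspected
        if sorted_items[middle] < target:
            low = middle + 1   # target can only be in the upper half
        else:
            high = middle - 1  # target can only be in the lower half
    return -1, inspected


data = list(range(1_000_000))  # already sorted
print(linear_search(data, 999_999))
print(binary_search(data, 999_999))
```

For one million sorted elements the linear scan inspects every element to reach the last one, while binary search never needs more than 20 probes, because 2^20 exceeds 1,000,000.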

Binary search tree

• A binary search tree is a binary tree where every node has a value, every node's left subtree has values less than the node's value, and every right subtree has values greater. A new node is added as a leaf.

Are these BSTs?

Tree 1: a BST?

          50
        /    \
      25      75
     /  \    /  \
   12    45 66   90

Tree 2: a BST?

          50
        /    \
      25      75
     /  \    /  \
   12    55 73   90
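The question can be checked mechanically. The sketch below assumes each tree on the slide is read level by level, with 25 as the left child of the root and 75 as the right child; that shape is an inference from the slide text, not something the source states. Trees are written as (left, value, right) tuples, and `is_bst` and `leaf` are hypothetical helper names.

```python
def is_bst(tree, low=None, high=None):
    """Check that every value lies strictly between the bounds set by its ancestors."""
    if tree is None:
        return True
    left, value, right = tree
    if low is not None and value <= low:
        return False
    if high is not None and value >= high:
        return False
    # Left subtree must stay below this value, right subtree above it.
    return is_bst(left, low, value) and is_bst(right, value, high)


def leaf(value):
    return (None, value, None)


# Assumed shapes of the two trees on the slide.
tree_1 = ((leaf(12), 25, leaf(45)), 50, (leaf(66), 75, leaf(90)))
tree_2 = ((leaf(12), 25, leaf(55)), 50, (leaf(73), 75, leaf(90)))

print(is_bst(tree_1))
print(is_bst(tree_2))
```

Under these assumptions the first tree satisfies the definition and the second does not: 55 sits in the left subtree of the root even though it is greater than 50. Comparing each node only with its direct children would miss this, which is why the check carries bounds down from the ancestors.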

• Note that the worst case of this build_binary_tree routine is O(n^2): if you feed it a sorted list of values, it chains them into a linked list with no left subtrees.

• For example, build_binary_tree([1, 2, 3, 4, 5]) yields the tree (None, 1, (None, 2, (None, 3, (None, 4, (None, 5, None))))).

• There are a variety of balancing schemes for overcoming this flaw of simple binary trees.
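The build_binary_tree routine itself does not appear in these slides. The following is a plausible reconstruction, assuming nodes are (left, value, right) tuples as in the example output above and that each new value is inserted as a leaf; `binary_tree_insert` and `height` are hypothetical helper names.

```python
def binary_tree_insert(tree, value):
    """Return a copy of the tree with value added as a new leaf."""
    if tree is None:
        return (None, value, None)
    left, node_value, right = tree
    if value < node_value:
        return (binary_tree_insert(left, value), node_value, right)
    # Values equal to or greater than the node go to the right (an assumption).
    return (left, node_value, binary_tree_insert(right, value))


def build_binary_tree(values):
    """Insert the values one at a time, in the order given."""
    tree = None
    for value in values:
        tree = binary_tree_insert(tree, value)
    return tree


def height(tree):
    """Number of nodes on the longest root-to-leaf path."""
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))


degenerate = build_binary_tree([1, 2, 3, 4, 5])
shuffled = build_binary_tree([3, 1, 4, 2, 5])
print(degenerate)
print(height(degenerate), height(shuffled))
```

With sorted input every insertion walks the whole chain, so n insertions cost on the order of n^2 steps in total. The same five values in a different order give a tree of height 3 instead of 5.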

• Because databases typically cannot be held entirely in main memory (512 MB of main memory is considered good), B-trees or B*-trees are often used to index the data and to provide fast access.

• Theoretically speaking, searching an unindexed and unsorted database containing n key values has a worst-case running time of O(n); if the same data is indexed with a B-tree, the same search operation runs in O(log n).

• For example, a search for a single key in a set of one million keys (1,000,000) requires up to 1,000,000 comparisons with a linear search in the worst case. If the same data is indexed with a B-tree of minimum degree 10, at most 114 comparisons are required in the worst case.
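One way to arrive at the figure of 114 is sketched below. It assumes the common textbook convention in which a B-tree of minimum degree t has at most 2t - 1 keys per node and at least t - 1 keys in every node except the root, and it assumes that the keys inside a node are scanned linearly. The function name is hypothetical and the slides do not spell out this derivation.

```python
def btree_worst_case_comparisons(n_keys, min_degree):
    """Upper bound on key comparisons for one search in a B-tree."""
    # A B-tree of height h (root at depth 0) holds at least 2 * t**h - 1 keys,
    # so the height is the largest h for which that minimum still fits in n_keys.
    height = 0
    while 2 * min_degree ** (height + 1) - 1 <= n_keys:
        height += 1
    levels_visited = height + 1              # one node per level, root included
    max_keys_per_node = 2 * min_degree - 1   # a full node, scanned key by key
    return levels_visited * max_keys_per_node


print(btree_worst_case_comparisons(1_000_000, 10))  # 6 levels * 19 keys = 114
```

The bound is deliberately pessimistic: it combines the tallest possible tree (sparsely filled nodes) with the most expensive possible node (completely full). The more important saving in practice is that only 6 nodes, and therefore about 6 disk reads, are touched per search.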

B*-tree

• A B*-tree is a B-tree in which nodes are kept at least 2/3 full: when a node fills up, keys are first redistributed to fill two sibling nodes, and only then are the two full nodes split into three.

B-tree

Definition: A balanced search tree in which every node has between m/2 and m children, where m>1 is a fixed integer. m is the order. The root may have as few as 2 children. This is a good structure if much of the tree is in slow memory (disk), since the height, and hence the number of accesses, can be kept small, say one or two, by picking a large m.

• Also known as balanced multiway tree.

• Generalization: balanced tree, search tree.

• Clearly, indexing large amounts of data can significantly improve search performance. Although other balanced tree structures can be used, a B-tree also optimizes costly disk accesses that are of concern when dealing with large data sets.
