Ch.12 Indexing and Hashing - Tarleton State University
Ch.12 Indexing and Hashing
Common DB operations we want to support: random lookup + sequential scan
READ p.482 → Five factors for evaluating indexing/hashing algorithms
Insertion
Deletion
Concepts:
Classifications:
Clustered (a.k.a. primary) vs. non-clustered (a.k.a. secondary)
Dense vs. sparse
Examples:
Dense:
Sparse:
Clustered or non-clustered?
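The dense vs. sparse distinction above can be sketched in a few lines of Python (a toy model: names like BLOCK_SIZE, records, and sparse_lookup are made up here; real indices live in disk blocks, not Python lists):

```python
import bisect

BLOCK_SIZE = 3
# a file sorted on the search key, grouped into fixed-size blocks
records = sorted((k, f"rec{k}") for k in (5, 10, 15, 20, 25, 30, 35))
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Dense index: one entry per search-key value -> block number
dense = {key: b for b, blk in enumerate(blocks) for key, _ in blk}

# Sparse index: one entry per block -> first search key in that block
sparse = [(blk[0][0], b) for b, blk in enumerate(blocks)]

def sparse_lookup(key):
    """Find the last sparse entry with key <= search key, then scan that block."""
    keys = [k for k, _ in sparse]
    i = bisect.bisect_right(keys, key) - 1
    if i < 0:
        return None
    for k, data in blocks[sparse[i][1]]:
        if k == key:
            return data
    return None

print(dense[20], sparse_lookup(20))  # -> 1 rec20 (both locate key 20)
```

The sparse index is smaller (one entry per block instead of per key) but every lookup pays an extra scan inside the target block.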
Other minor practical issues: Overflow blocks
Long records that extend over multiple blocks
Duplicates that extend over multiple blocks
Major practical issue: For a large table, the index itself will be large!
Solutions: Store index in RAM
Store index on disk → how many blocks (b)?
o Since the index is sorted → binary (logarithmic) search → ~log2(b) disk accesses
o Logarithmic search vs. linear search, worst case: log2(b) vs. b block accesses
Multi-level index → example on next page
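The log2(b) figure above can be illustrated by binary-searching a sorted on-disk index, counting one "disk access" per block probed (a sketch with made-up names; an in-memory list stands in for the disk):

```python
import math

def binary_search_blocks(index_blocks, key):
    """Return (index of block containing key or None, number of block reads)."""
    lo, hi, reads = 0, len(index_blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        blk = index_blocks[mid]        # simulated disk access
        reads += 1
        if blk[0] <= key <= blk[-1]:
            return mid, reads
        elif key < blk[0]:
            hi = mid - 1
        else:
            lo = mid + 1
    return None, reads

# 1024 index blocks of 100 keys each -> worst case ~log2(1024) = 10 reads
blocks = [list(range(i * 100, i * 100 + 100)) for i in range(1024)]
blk, reads = binary_search_blocks(blocks, 51_234)
print(blk, reads, math.ceil(math.log2(len(blocks))))  # -> 512 10 10
```

A linear scan of the same index would read up to all 1024 blocks; a multi-level index goes further, replacing the binary search with one access per level.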
Index updates:
Single-level
o Insertion
dense
sparse
o Deletion
Dense
Sparse
Multi-level …
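The single-level insertion cases above can be sketched as follows (an illustrative toy, not Section 12.2.3's algorithm: block capacity, the split rule, and all names are my own). Note the asymmetry: the dense index changes on every insert, the sparse one only when a block's first key changes or a block splits:

```python
import bisect

blocks = [[5, 10], [20, 25]]          # sorted file
dense = sorted(k for blk in blocks for k in blk)
sparse = [blk[0] for blk in blocks]   # first key of each block

def insert(key, capacity=3):
    # target block: last block whose first key is <= the new key
    b = max(bisect.bisect_right(sparse, key) - 1, 0)
    bisect.insort(blocks[b], key)
    bisect.insort(dense, key)         # dense index: ALWAYS gets a new entry
    if blocks[b][0] == key:
        sparse[b] = key               # sparse index: only if first key changed
    if len(blocks[b]) > capacity:     # block overflow -> split it
        half = len(blocks[b]) // 2
        blocks.insert(b + 1, blocks[b][half:])
        blocks[b] = blocks[b][:half]
        sparse.insert(b + 1, blocks[b + 1][0])  # sparse gains one entry

insert(15)   # fits in block [5, 10]: dense grows, sparse unchanged
insert(7)    # overflows block 0 -> split; sparse gains the entry 10
print(blocks, sparse)  # -> [[5, 7], [10, 15], [20, 25]] [5, 10, 20]
```

Deletion is symmetric: the dense index always drops an entry, the sparse one only when a block empties or its first key is the one deleted.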
READ and take notes: Section 12.2.3 → Detailed algorithms for the above
What if the file is not ordered on the desired search key?
Secondary index
All secondary indices must be dense!
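A quick sketch of why (hypothetical data; branch names borrowed from the textbook's bank example): the file is ordered on id, so records with the same branch sit at non-adjacent positions. A sparse index would skip some of them entirely, so the secondary index must carry an entry for every record:

```python
from collections import defaultdict

# file sequentially ordered on id (the primary search key)
data_file = [(1, "Downtown"), (2, "Mianus"), (3, "Downtown"),
             (4, "Perryridge"), (5, "Mianus")]

# secondary index on branch: branch -> bucket of record pointers (positions)
secondary = defaultdict(list)
for pos, (rid, branch) in enumerate(data_file):
    secondary[branch].append(pos)     # an entry for EVERY record: dense

print(secondary["Downtown"])  # -> [0, 2]: non-adjacent positions in the file
```

With a sparse secondary index there would be no way to infer the missing pointers from file order, since the file is not sorted on branch.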
Problem with all index-sequential files:
Both random lookups and sequential scans get slower after many insertions and deletions, due to overflow blocks
o Solution: reorganize the file periodically → O(K) linear time
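The reorganization step can be sketched like this (a toy model with names of my own; it assumes each overflow chain holds only keys belonging to its main block's range): merge every block with its overflow chain in one pass and rebuild the sparse index, which is linear in the data size:

```python
BLOCK_SIZE = 3

def reorganize(blocks, overflow):
    """blocks: list of sorted key lists; overflow: block index -> chained keys."""
    keys = []
    for b, blk in enumerate(blocks):            # one linear pass over the file
        keys.extend(sorted(blk + overflow.get(b, [])))
    # repack into full sequential blocks, no overflow chains left
    new_blocks = [keys[i:i + BLOCK_SIZE] for i in range(0, len(keys), BLOCK_SIZE)]
    sparse = [blk[0] for blk in new_blocks]     # rebuilt sparse index
    return new_blocks, sparse

blocks = [[5, 10, 15], [20, 25, 30]]
overflow = {0: [12], 1: [27, 28]}               # chains built up by inserts
print(reorganize(blocks, overflow))
# -> ([[5, 10, 12], [15, 20, 25], [27, 28, 30]], [5, 15, 27])
```

After reorganization, lookups and scans are fast again until new overflow chains accumulate.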