
Chapter 17

Indexing Structures for Files and Physical Database Design

We assume that a file already exists with some primary organization: unordered, ordered, or hashed. An index provides alternate ways to access the records without affecting the existing placement of records on the disk.

Each indexing approach has a particular data structure to speed up the search. A variety of indexing techniques are studied here.

Types of Single-Level Ordered Indexes

- Concept of indexes is similar to an index of terms in a book

- An index access structure is usually defined on a single field of a file, called the indexing field

- The index stores each value of the indexing field along with pointers to all disk blocks that contain records with that field value

- The values in the index are ordered so that a binary search can be

done

- Both the index and data files are ordered, but index file is smaller

Several types of ordered indexes:

- Primary index specified on a key field

- Clustering index, ordering field is not a key field; the data file is

called clustered file

- A file can have at most one physical ordering field; it can have one

primary index, or one clustering index but not both

- Secondary index can be specified on any non-ordering field of a file;

a data file can have several secondary indexes in addition to the

primary access method

Primary Indexes

- Ordered file with 2 fields, PK field and data ptr. PK is the primary key

of the data file. Ptr is the pointer to a disk block; PK is the value for

the first record in the block

- Each block in the data file has one entry in the index file

- The two fields <K(i), P(i)>; P(i) is the pointer for the block in data file

- In general the two fields are: <K(i), X>:

o X may be the physical address of a block (or page)

o X may be the record address made up of a block address and a record id (or offset) within the block

o X may be a logical address of the block or of the record within

the file and is a relative number that would be mapped to

physical address

Fig. 17-1

- The first record in each block of the data file is called the block anchor or anchor record

- Indexes can be dense or sparse

- A dense index has an entry for every search key value; a sparse index has entries for only some of the search values. A primary index is sparse, since it has one entry per block rather than one per record

- To retrieve a record given the value K of its PK field, we do a binary search on the index file to find the entry i with K(i) ≤ K < K(i+1), and then retrieve the data file block whose address is P(i).
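As a rough illustration of the lookup just described, here is a minimal Python sketch of a sparse primary index. The in-memory `data_blocks` list, the sample keys and names, and the `lookup` function are assumptions made for illustration only; a real DBMS would read the index and data blocks from disk.

```python
from bisect import bisect_right

# Hypothetical data file: each block holds records sorted by primary key.
data_blocks = [
    [(2, "Aaron"), (5, "Abbott"), (8, "Acosta")],
    [(12, "Adams"), (15, "Akers"), (19, "Alexander")],
    [(23, "Allen"), (27, "Anders"), (31, "Arnold")],
]

# Sparse primary index: one <K(i), P(i)> entry per block, where K(i) is the
# key of the block anchor and P(i) is the block number.
index_keys = [block[0][0] for block in data_blocks]
index_ptrs = list(range(len(data_blocks)))


def lookup(key):
    """Return the record with the given primary key, or None if absent."""
    # Binary search for the last index entry whose anchor key is <= key.
    i = bisect_right(index_keys, key) - 1
    if i < 0:
        return None  # key is smaller than every block anchor
    block = data_blocks[index_ptrs[i]]  # one data block access
    for record in block:
        if record[0] == key:
            return record
    return None
```

For example, `lookup(15)` searches the three index entries, follows the pointer to the second block, and scans only that block.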

Example 1.

Ordered file with records r = 300000

Disk block size = B = 4096 bytes

File records are fixed size and unspanned

Record length = R = 100 bytes

Blocking factor bfr = ⌊B/R⌋ = ⌊4096/100⌋ = 40 records per block

The number of blocks needed for the file = b = ⌈r/bfr⌉ = ⌈300000/40⌉ = 7500 blocks

A binary search on the data file needs ⌈log2(b)⌉ = ⌈log2(7500)⌉ = 13 block accesses

Let the ordering key field be 9 bytes and the block pointer 6 bytes, for a total of 15 bytes in each entry of the index file

bfr for the index file = ⌊4096/15⌋ = 273 entries per block

Total number of index entries = total number of data blocks = 7500

The number of index blocks = ⌈7500/273⌉ = 28

A binary search on the index file needs ⌈log2(28)⌉ = 5 block accesses

To search for a record, we need one additional access to read the data block, so we need 5 + 1 = 6 block accesses using the index

Without the index file, we need 13 block accesses
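The arithmetic of Example 1 can be checked with a short Python sketch. The function name `primary_index_cost` and its parameter names are hypothetical; the formulas are the ones used in the example.

```python
from math import ceil, log2


def primary_index_cost(r, B, R, key_size, ptr_size):
    """Return (data blocks, accesses without index, index blocks, accesses with index)."""
    bfr = B // R                           # records per data block (floor)
    b = ceil(r / bfr)                      # data blocks
    without_index = ceil(log2(b))          # binary search on the data file
    bfr_i = B // (key_size + ptr_size)     # index entries per block (floor)
    b_i = ceil(b / bfr_i)                  # one index entry per data block
    with_index = ceil(log2(b_i)) + 1       # binary search on index + 1 data block
    return b, without_index, b_i, with_index


result = primary_index_cost(r=300_000, B=4096, R=100, key_size=9, ptr_size=6)
```

With the values of Example 1 this gives 7500 data blocks, 13 accesses without the index, 28 index blocks, and 6 accesses with the index.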

Problems with Primary Index

- Insertion

o Inserting in correct position (make space, change index

entries)

o Move records to make space for new records

o Move will change anchor records of some blocks

o Use linked list or overflow records

- Deletion

o Use delete markers

Clustering Indexes

- The file is ordered on a non-key field (one without a distinct value for each record), called the clustering field

- A data file ordered on such a non-key field is called a clustered file

- Speeds up retrieval of all records that have the same value for the clustering field

- Includes one index entry for each distinct value of the field, the

index entry points to the first data block that contains records with

the field value

- Another example of a non-dense (sparse) index

- Insertion and deletion problems (reserve one or more blocks for

each value of the clustering field);
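A minimal Python sketch of a clustering index follows. The sample zip codes, the blocking factor of 3, and the names `clustering_index` and `fetch_all` are assumptions for illustration; a Python dict stands in for the ordered index file.

```python
# Hypothetical clustered file: records ordered by zip code (a non-key field),
# packed three records per block.
records = [("21201", "A"), ("21201", "B"), ("21204", "C"), ("21204", "D"),
           ("21204", "E"), ("21204", "F"), ("21252", "G"), ("21286", "H")]
BFR = 3
blocks = [records[i:i + BFR] for i in range(0, len(records), BFR)]

# One index entry per distinct clustering value, pointing to the first block
# that contains a record with that value.
clustering_index = {}
for block_no, block in enumerate(blocks):
    for zip_code, _ in block:
        clustering_index.setdefault(zip_code, block_no)


def fetch_all(zip_code):
    """Return every record with the given clustering field value."""
    result = []
    start = clustering_index.get(zip_code)
    if start is None:
        return result
    # Matching records are contiguous, so scan forward until the value changes.
    for block in blocks[start:]:
        matches = [rec for rec in block if rec[0] == zip_code]
        if not matches:
            break
        result.extend(matches)
    return result
```

Because records with the same value are stored together, `fetch_all("21204")` touches only the two consecutive blocks that hold that zip code.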

Example 2:

r = 300000 records

B = 4096 bytes

It is ordered by zip codes; there are 1000 zip codes in the file

Average 300 records per zip code (assume even distribution)

The index has 1000 entries; each entry has a 5-byte zip code and a 6-byte block pointer, 11 bytes in total

bfr = ⌊4096/11⌋ = 372 index entries per block

The number of index blocks = ⌈1000/372⌉ = 3

A binary search on the index file would require ⌈log2(3)⌉ = 2 block accesses

The index is small enough to be loaded into main memory: 1000 * 11 = 11000 bytes.

Secondary Indexes

- The data file records could be unordered, ordered, or hashed

- A secondary index provides a secondary means of accessing a file for

which some primary access already exists

- The secondary index may be on a field which is a candidate key and

has a unique value in every record, or a non-key with duplicate

values

- The index is an ordered file with two fields:

o The first field is of the same data type as some non-ordering

field of the data file that is an indexing field

o The second field is either a block or record pointer

o There can be many secondary indexes (and hence, indexing

fields) for the same file; each represents an additional means

of accessing that file based on some specific field

- Includes one entry for each record in the data file; hence it is a

dense index (records of the data file are not physically ordered by

secondary key)

- A secondary index needs more storage space and a longer search time than a primary index, because of its larger number of entries

- Search time is still much better than without the index, because there is no need to do a linear search on the records in the data file (records are accessed directly)

Example 3:

r = 300000 records

size R = 100 bytes

block size B = 4096 bytes

no of records per block bfr = ⌊4096/100⌋ = 40

no of blocks for the data file = b = ⌈300000/40⌉ = 7500

Suppose we want to search for a record with a specific value for the secondary key, a non-ordering key field that is 9 bytes long

Without a secondary index, a linear search on the file would require b/2 = 7500/2 = 3750 block accesses on average

Suppose that we construct a secondary index on the non-ordering key field of the file; each entry is 9 + 6 = 15 bytes

The blocking factor for the index = ⌊4096/15⌋ = 273 entries per block

In a dense secondary index, the total number of index entries is equal to the number of records, 300000

The number of blocks needed for the secondary index = ⌈300000/273⌉ = 1099 blocks

A binary search on this secondary index needs ⌈log2(1099)⌉ = 11 block accesses

To search for a record using the index, we need 1 additional block access to the data file

11 + 1 = 12 block accesses (compared to 3750 block accesses)

We can also create a secondary index on a non-key, non-ordering field.

Numerous records in the data file can have the same value for the

indexing field. There are several options:

1. Include duplicate index entries with the same K(i) value, one for

each record. Dense index.

2. Have variable length records for the index entries, with a repeating field for the pointer. Keep a list of pointers <P(i, 1), P(i, 2), …, P(i, k)> in the index entry for K(i).

3. Create an extra level of indirection to handle the multiple

pointers. Fig. 17.5. If one block of indirection is not enough, a

cluster can be used.
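Option 3 can be sketched in a few lines of Python. The sample `data_file`, the use of department numbers as the indexing field, and the Python list standing in for a block of record pointers are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical unordered data file: (record_id, dept_no) pairs.
data_file = [(0, 5), (1, 3), (2, 5), (3, 1), (4, 3), (5, 5)]

# Extra level of indirection: each distinct value K(i) gets one block of
# record pointers; a Python list stands in for that block here.
pointer_blocks = defaultdict(list)
for record_id, dept_no in data_file:
    pointer_blocks[dept_no].append(record_id)

# The index itself stays ordered with fixed-length entries:
# one <K(i), P(i)> entry per distinct value, P(i) leading to the pointer block.
index = sorted(pointer_blocks.items())
```

The index has one entry per distinct value, so it remains non-dense, while the pointer blocks absorb the duplicates.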

Multilevel Indexes

We covered ordered index file techniques. A binary search is needed to locate pointers for disk blocks or records. A binary search requires approximately log2(bi) block accesses for an index with bi blocks, because each step of the binary search reduces the part of the index still to be searched by a factor of 2.

The idea of a multilevel index is to reduce the part of the index that we continue to search by a factor of bfri, the blocking factor for the index, which is larger than 2. The search space is reduced much faster.

The value bfri is called the fan-out (fo).

Searching a multilevel index requires approximately logfo(bi) block accesses.

With a 4096 byte block size, 9 byte key (SSN) and 4 byte pointer (13

bytes total index entry)

bfri = ⌊4096/13⌋ = 315.

In a multilevel index, the index file is the first (base) level of the multilevel index; it is an ordered file with a distinct value for each K(i).

A primary index created on the first level is called the second level of the multilevel index. The second level has one entry for each block of the first level, as it is a primary index (we can use block anchors).

The process repeats for the 3rd level and so on, until all entries of the top level fit in one disk block.

- Problems: insertion and deletion are expensive because every level is a physically ordered file. A dynamic multilevel index addresses this by leaving some space in each of its blocks for new entries

Example 4

r = 300000 fixed length records

B = 4096 bytes

Record size R = 100 bytes

bfr = ⌊4096/100⌋ = 40 records per block

number of data blocks = ⌈300000/40⌉ = 7500 blocks

Number of bytes in each index entry = 9 + 6 = 15

bfri = ⌊4096/15⌋ = 273 = fo of the multilevel index

The first level is the dense secondary index of Example 3, with one entry per record: number of first level blocks = b1 = ⌈300000/273⌉ = 1099 blocks

(Levels are numbered from the 1st (base) level up to the nth (top) level of the multilevel structure)

The number of 2nd level blocks = b2 = ⌈b1/fo⌉ = ⌈1099/273⌉ = 5 blocks

The number of 3rd level blocks = b3 = ⌈b2/fo⌉ = ⌈5/273⌉ = 1 block

Hence, the third level is the top level of the multilevel index, t = 3

To access a record by searching the multilevel index, we must access one block at each level plus 1 block from the data file = 3 + 1 = 4 block accesses (with the single-level index it was 12 block accesses)
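The level-by-level computation of Example 4 can be reproduced with a small Python sketch. The function name `multilevel_levels` is hypothetical; the loop simply repeats the "divide by the fan-out and round up" step until one block remains.

```python
from math import ceil


def multilevel_levels(first_level_entries, fan_out):
    """Return the number of blocks at each index level, base level first."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = ceil(entries / fan_out)
        levels.append(blocks)
        if blocks == 1:
            return levels
        entries = blocks  # the next level has one entry per block below it


fan_out = 4096 // (9 + 6)               # 273 entries per index block
levels = multilevel_levels(300_000, fan_out)
block_accesses = len(levels) + 1        # one block per level plus the data block
```

For the values in the example, `levels` holds the block counts 1099, 5, and 1, and `block_accesses` is 4.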

Dynamic Multilevel Indexes Using B-Tree and B+ Trees

Tree Structure:

- A tree is formed of nodes

- Each node except the root, has one parent node and 0 or more child

nodes

- The root node has no parent

- A node with no child nodes is a leaf node

- A nonleaf node is an internal node

- The level of node is always one higher than its parent

- The level of root node is 0

- A subtree of a node consists of that node and all its descendant

nodes

- If the leaf nodes are at different levels, it is an unbalanced tree

- B-tree nodes are kept 50-100% full

- Pointers to the data blocks are stored at internal and leaf nodes for

B-trees

- Pointers to the data blocks are stored at leaf nodes for B+ trees

Search Trees

A search tree is a special type of tree used to guide the search for a

record given the value of one of the record’s fields.

Fig. 17.8

A search tree of order p is a tree such that each node contains at most p-1 search values and p pointers, in the order

<P1, K1, P2, K2, …, Pq-1, Kq-1, Pq>, where q ≤ p. Each Pi is a pointer to a child node or a null pointer, and each Ki is a search value.

We can use a search tree as a mechanism to search for records on disk.

Two constraints must hold at all times:

1. Within each node, K1 < K2 < … < Kq-1

2. For all values X in the subtree pointed at by Pi, we have Ki-1 < X < Ki for 1 < i < q; X < Ki for i = 1; and Ki-1 < X for i = q
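The two constraints are what make a top-down search possible. Below is a minimal Python sketch; the `Node` class, the `search` function, and the small sample tree are assumptions for illustration, with list positions counted from 0 rather than 1.

```python
from bisect import bisect_left


class Node:
    """A search tree node: sorted keys and one more child pointer than keys."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or [None] * (len(keys) + 1)


def search(node, x):
    """Return True if value x is stored in the tree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, x)      # first position with keys[i] >= x
        if i < len(node.keys) and node.keys[i] == x:
            return True
        # children[i] covers the values between keys[i-1] and keys[i].
        node = node.children[i]
    return False


# Hypothetical tree: root holds 5, with subtrees {3} and {6, 9}.
root = Node([5], [Node([3]), Node([6, 9])])
```

Each step either finds the value in the current node or follows exactly one pointer, so the number of nodes visited is bounded by the height of the tree.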

- Search field is same as the index field

- Algorithms needed to insert and delete

- May result in unbalanced tree

- Make sure nodes are evenly distributed

- Make search speed uniform

- Minimize number of levels; also make sure it does not require

restructuring many times

B-Trees

B-Tree solves the above problems:

- Tree is always balanced

- Space wasted by deletion, if any, will not be excessive

- Insertion and deletion algorithms are complex

A B-Tree of order p, when used as an access structure on a key field to

search for records in a data file can be defined as follows:

1. Each internal node in the B-Tree is of the form:

<P1, <K1, Pr1>, P2, <K2, Pr2>, …, Pq-1, <Kq-1, Prq-1>, Pq>

where q ≤ p. Each Pi is a tree pointer (a pointer to another node in the B-Tree). Each Pri is a data pointer (a pointer to the record whose search key field value is equal to Ki, or to the data file block containing that record)

2. Within each node, K1 < K2 < … < Kq-1

3. For all search key field values X in the subtree pointed at by Pi, we have Ki-1 < X < Ki for 1 < i < q; X < Ki for i = 1; and Ki-1 < X for i = q

4. Each node has at most p tree pointers

5. Each node, except the root and leaf nodes, has at least ⌈p/2⌉ tree pointers; the root node has at least two tree pointers unless it is the only node in the tree

6. A node with q tree pointers, q ≤ p, has q-1 search key field values and hence q-1 data pointers

7. All the leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes, except that all of their tree pointers Pi are null.

- Fig. 17-10(b) illustrates a B-Tree with p = 3. All search key values are unique, as the index is on a key field

- If we use a B-Tree on a nonkey field, we must change the definition of the data pointer Pri to point to a block, or a cluster of blocks, that contains the pointers to the file records.

- The B-Tree starts with a single root node (which is also a leaf node)

at level 0

- Once the root node is full with p-1 search key values and we attempt to insert another entry in the tree, the root node splits into two nodes at level 1. Only the middle value is kept in the root node, and the rest of the values are split evenly between the other two nodes

- When a nonroot node is full, and a new entry is inserted into it, that

node is split into two nodes at the same level, and the middle entry

is moved into the parent node, along with two pointers to the newly

split nodes. ……

- Read about deletion on p. 621

- SKIP Example 5

B-Tree (order p)

- No node has more than p children

- Every node except the root and terminal nodes has at least [p/2]

(upper ceiling) children

- The root has at least two children, unless the tree has only one node

- All terminal nodes appear on the same level, i.e., same distance

from the root

- A non-terminal node with k children contains k-1 records; a terminal

node contains at least ([p/2] – 1) records (upper ceiling) and at most

p-1 records

- The largest number of records allowed in a node is p-1

B-Tree Insertion:

- New records are always inserted into terminal nodes

- Every null pointer represents an insertion point, where a new record might go

- To determine the insertion point, search for the new record as if it were already in the tree

- The problem with inserting is that nodes can overflow, because there is an upper bound of p-1 records per node

- If the node into which we have inserted a record now exceeds the maximum size, then redistribute or split on overflow

- An overfull node with p records is split into three parts: the lower half, the middle record, and the upper half

- The two halves become separate nodes, and the middle record is passed upward and inserted into the parent
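The split-on-overflow step can be sketched as follows. This covers only a single terminal node, not a full B-tree; the function name `insert_into_leaf` and the sample keys are assumptions for illustration, and redistribution with a sibling is not shown.

```python
from bisect import insort


def insert_into_leaf(keys, new_key, p):
    """Insert new_key into a terminal node of a B-tree of order p.

    Returns (keys, None, None) if the node still fits, otherwise
    (left_keys, middle_key, right_keys) after splitting the overfull node.
    """
    keys = list(keys)
    insort(keys, new_key)              # keep the keys in sorted order
    if len(keys) <= p - 1:             # at most p-1 records per node
        return keys, None, None
    mid = len(keys) // 2               # overfull: the node now has p records
    return keys[:mid], keys[mid], keys[mid + 1:]


no_split = insert_into_leaf([10], 13, p=3)
split = insert_into_leaf([10, 16], 13, p=3)
```

In the second call the node overflows with the keys 10, 13, 16, so 13 would be passed up to the parent while 10 and 16 become separate nodes.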

B-Tree Examples: (perform these on the B-Tree, p=3)

(1) Insert 13

(2) Insert 10

(3) Insert 16

B-Tree Deletion

- Deletion always starts at the lowest level of the tree

- If the record to be deleted is not in a terminal node, replace it by a copy of its successor (the record with the next highest key), which will be at the lowest level, and then delete the successor from its terminal node (the predecessor works as well)

- We may have an underflow after deletion (node may be smaller

than the minimum size)

- Use redistribution or concatenation to solve underflow

B-Tree Examples: p=5

Delete 10

Delete 13

Delete 18

B+ Trees

Knuth proposed a variation on B trees.

- Records on a B+ tree are held only on the terminal nodes

- The terminal nodes are linked together to facilitate sequential

processing of the records and are termed the sequential set

- No need for terminal nodes to have tree pointers

- Terminal nodes have different structure than non-terminal nodes

Each internal node is of the form:

1. <P1, K1, P2, K2, …, Pq-1, Kq-1, Pq>, where q ≤ p and each Pi is a tree pointer

2. Within each node, K1 < K2 < … < Kq-1

3. For all search field values X in the subtree pointed at by Pi, we have Ki-1 < X ≤ Ki for 1 < i < q; X ≤ Ki for i = 1; and Ki-1 < X for i = q

4. Each internal node has at most p tree pointers

5. Each internal node, except the root, has at least ⌈p/2⌉ tree pointers (so between ⌈p/2⌉ and p); the root node has at least two tree pointers if it is an internal node

6. An internal node with q pointers, q ≤ p, has q-1 search field values

(Notice that there is no Kq: the last pointer Pq simply leads to another subtree, so it needs no key)

The structure of the leaf nodes of a B+ tree of order p is as follows:

1. <<K1, Pr1>, <K2, Pr2>, …, <Kq-1, Prq-1>, Pnext>, where q ≤ p, each Pri is a data pointer, and Pnext points to the next leaf node

2. Within each leaf node, K1 < K2 < … < Kq-1, q ≤ p

3. Each Pri is a data pointer that points to the record whose search field value is Ki or to a file block containing the record (or to a block of record pointers that point to records whose search field value is Ki, if the search field is not a key)

4. Each leaf node has at least ⌈p/2⌉ values

5. All leaf nodes are at the same level

- By starting at the left most leaf node, it is possible to traverse leaf

nodes as a linked list using Pnext pointers

- Provides ordered access to the data records

- A Pprevious can also be included

- As the structures for internal and leaf nodes are different, their

order can be different; order p for internal nodes, and order pleaf for

leaf nodes.
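The ordered access provided by the Pnext pointers can be sketched in Python. The `Leaf` class, the helper `link`, and the sample entries are assumptions for illustration. For simplicity the scan starts at the leftmost leaf; a real B+ tree would first descend from the root to the leaf containing the lower bound.

```python
class Leaf:
    """A B+ tree leaf: sorted <key, data pointer> entries plus a Pnext link."""

    def __init__(self, entries):
        self.entries = entries   # list of (key, data_pointer) pairs
        self.next = None         # Pnext: the next leaf to the right


def link(leaves):
    """Chain the leaves left to right and return the leftmost one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def range_scan(first_leaf, low, high):
    """Yield data pointers for keys in [low, high], following Pnext links."""
    leaf = first_leaf
    while leaf is not None:
        for key, ptr in leaf.entries:
            if key > high:
                return           # keys are sorted, so nothing further matches
            if key >= low:
                yield ptr
        leaf = leaf.next


first = link([
    Leaf([(1, "r1"), (3, "r3")]),
    Leaf([(5, "r5"), (6, "r6")]),
    Leaf([(7, "r7"), (8, "r8")]),
])
```

Calling `list(range_scan(first, 3, 7))` walks the leaf chain once and never revisits an internal node.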

Example 6:

Search key field is V = 9 bytes

Block size is B = 512 bytes

Record pointer is Pr = 7 bytes

Block pointer/tree pointer is P = 6 bytes

An internal node can have up to p tree pointers and p-1 key fields; these must fit in a single block

Thus,

p * P + (p-1) * V ≤ B

6p + 9(p-1) ≤ 512

15p ≤ 521

p ≤ 34 for internal nodes

The leaf nodes have the same number of key values and data pointers, plus one next pointer.

The order pleaf can be calculated as follows:

pleaf * (Pr + V) + P ≤ B

pleaf * (7 + 9) + 6 ≤ 512

pleaf * 16 ≤ 512 - 6

pleaf * 16 ≤ 506

pleaf ≤ 31
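Both inequalities of Example 6 can be solved directly with integer division. The function names `internal_order` and `leaf_order` below are hypothetical; the formulas are rearrangements of the two inequalities in the example.

```python
def internal_order(B, P, V):
    """Largest p such that p*P + (p-1)*V <= B."""
    # Rearranged: p * (P + V) <= B + V
    return (B + V) // (P + V)


def leaf_order(B, P, Pr, V):
    """Largest pleaf such that pleaf*(Pr + V) + P <= B."""
    return (B - P) // (Pr + V)


p = internal_order(B=512, P=6, V=9)
p_leaf = leaf_order(B=512, P=6, Pr=7, V=9)
```

With the sizes from the example this gives p = 34 and pleaf = 31.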

Example 7:

Construct a B+ tree on example 6

- Assume each node is 69% full

- p = 34, pleaf = 31

- On the average, each internal node has 0.69 * 34 ≈ 23 pointers and 22 key values

- On the average, each leaf node has 0.69 * 31 ≈ 21 data record pointers

Level     Nodes              Key entries            Pointers
Root      1                  22                     22 + 1 = 23
Level 1   23                 23 * 22 = 506          506 + 23 = 529
Level 2   23 * 23 = 529      529 * 22 = 11638       11638 + 529 = 12167
Leaf      529 * 23 = 12167   12167 * 21 = 255507    255507 data record pointers
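The figures of Example 7 can be regenerated with a short Python sketch. The function name `bplus_capacity`, its parameters, and the choice to round 0.69 * p to the nearest integer are assumptions for illustration.

```python
def bplus_capacity(p, p_leaf, fill=0.69, internal_levels=3):
    """Return one (nodes, key entries, pointers) row per level, root first.

    The last row describes the leaf level; its third field is None because
    leaf entries carry data record pointers rather than tree pointers.
    """
    ptrs_per_node = round(fill * p)            # average tree pointers per node
    leaf_ptrs_per_node = round(fill * p_leaf)  # average data pointers per leaf
    rows = []
    nodes = 1
    for _ in range(internal_levels):
        keys = nodes * (ptrs_per_node - 1)
        ptrs = nodes * ptrs_per_node
        rows.append((nodes, keys, ptrs))
        nodes = ptrs                           # each pointer leads to one node below
    rows.append((nodes, nodes * leaf_ptrs_per_node, None))
    return rows


rows = bplus_capacity(p=34, p_leaf=31)
```

The last row shows that a tree with three internal levels and one leaf level can address about 255,507 data records on average.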

B+ Tree Example

B+ Tree Insertion

1. If an index node has to split, the algorithm is the same as for a B-tree.

2. If a terminal node splits when we insert a record into it, we put a copy of the key of the central record in TOOBIG into the index. Thus the central record will also remain in one of the two halves after splitting.

B+ Tree Deletion

1. When a record is deleted from a B+ tree and no redistribution or concatenation is needed, no changes are made to the index

2. Even if the key of the record to be deleted appears in the index, it can be left there as a separator

Use the following B+ tree for deletion.

Indexes on Multiple Keys

So far, we have considered single attributes as search attributes. However, in the real world, multiple attributes are often used to search for records.

EMPLOYEE

ssn, dno, age, street, city, zip, salary, skill-code

Query:

List the employees in department number 4, where age is 59.

Both attributes department number and age are non-key attributes,

that is, a search value for either of these will point to multiple records.

Ordered index on multiple attributes:

- Create an index on a search key field that is a combination of <dno,

age>. The search key is a pair of values, <4, 59> in this example.

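- In general, an index on attributes <A1, A2, …, An> has search key values <v1, v2, …, vn>

- The keys are kept in lexicographic order: <3, n> precedes <4, m> for any n and m, and the keys with dno = 4 appear in ascending order of age: <4, 18>, <4, 19>, and so on. The composite index can then be used to access the data in the same way as a single-attribute ordered index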
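Python tuples compare lexicographically, so they give a convenient way to sketch an ordered index on <dno, age>. The sample entries, the record pointer labels, and the `find` function are assumptions for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical composite index entries: (<dno, age>, record pointer),
# kept in lexicographic order of the <dno, age> pair.
entries = sorted([
    ((4, 59), "r3"), ((3, 45), "r1"), ((4, 18), "r2"),
    ((5, 30), "r6"), ((4, 59), "r4"), ((4, 60), "r5"),
])
keys = [key for key, _ in entries]


def find(dno, age):
    """Return the record pointers of all entries with search key <dno, age>."""
    lo = bisect_left(keys, (dno, age))
    hi = bisect_right(keys, (dno, age))
    return [ptr for _, ptr in entries[lo:hi]]
```

Since both attributes are non-key, `find(4, 59)` can return several record pointers; here it returns the two entries stored under <4, 59>.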

Partitioned Hashing

For a key consisting of n components, the hash function is designed to produce a result with n separate hash addresses, which are concatenated to form the bucket address. For example, consider the <Dno, Age> search key with a 3-bit hash address for Dno and a 5-bit hash address for Age; suppose Dno = 4 has the hash address 100 and Age = 59 has the hash address 10101. Then a search for Dno = 4 and Age = 59 goes to bucket address 10010101.

To search for all employees with Age = 59, we must go through all 8 buckets whose addresses end in 10101, one for each Dno combination 000 – 111: 00010101, 00110101, and so on. This approach is only good for equality searches, not range searches.
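The bucket addressing can be sketched with bit strings in Python. The sketch assumes a 3-bit Dno part followed by the 5-bit Age part 10101, which is what the bucket address 10010101 in the example implies; the function names are hypothetical, and the per-attribute hash functions themselves are not modeled.

```python
def bucket_address(dno_bits, age_bits):
    """Concatenate the per-attribute hash addresses into one bucket address."""
    return dno_bits + age_bits


def buckets_for_age(age_bits, dno_width=3):
    """List every bucket that may hold records with the given Age hash address."""
    # Dno is unknown, so every possible Dno bit pattern must be tried.
    return [format(d, f"0{dno_width}b") + age_bits for d in range(2 ** dno_width)]


both = bucket_address("100", "10101")
age_only = buckets_for_age("10101")
```

A search on both attributes visits one bucket, while a search on Age alone has to visit all 2^3 = 8 buckets.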

Grid Files

Organize records as a grid file. <Dno, Age> gives a 2-dimensional grid with one linear scale per attribute. An n-dimensional grid can be formed with n attributes, but it is hard to construct and maintain. The scales are chosen so as to achieve a uniform distribution of records over the cells. Dno = 4 and Age = 59 falls into grid cell (1, 5). Each cell or cluster of cells points to one bucket in the bucket pool. This is suitable for range queries.
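A minimal Python sketch of the cell lookup follows. The two linear scales below are hypothetical; they are chosen only so that Dno = 4 and Age = 59 land in cell (1, 5) as stated in the notes. The function names are also assumptions.

```python
from bisect import bisect_left

# Hypothetical linear scales: each list holds the upper bound of a cell.
DNO_UPPER = [2, 4, 5, 7, 8, 10]    # cell 0: Dno 1-2, cell 1: Dno 3-4, ...
AGE_UPPER = [20, 25, 30, 40, 50]   # cell 0: up to 20, ..., cell 5: over 50


def grid_cell(dno, age):
    """Map a <Dno, Age> pair to its (row, column) cell in the grid."""
    return bisect_left(DNO_UPPER, dno), bisect_left(AGE_UPPER, age)


def age_cells_for_range(low, high):
    """Return the Age cells that a range query low <= Age <= high must visit."""
    return list(range(bisect_left(AGE_UPPER, low), bisect_left(AGE_UPPER, high) + 1))
```

A range query only needs the cells returned by `age_cells_for_range`, which is why grid files handle range conditions better than partitioned hashing.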

SKIP 17.5

General Issues Concerning Indexing

- When the physical address of a record changes, the index entry pointing to it needs to change; one can use logical addresses to cope with this problem, at the cost of another level of indirect address mapping and more overhead and maintenance

- Index creation: many RDBMS have commands for creating index

CREATE [UNIQUE] INDEX <index-name>

ON <table-name> (<column-name> [<order>] …) [CLUSTER];

CREATE INDEX DnoIndex

ON EMPLOYEE (Dno)

CLUSTER;

- Index creation process: index is not part of the data file, but can be

created and discarded dynamically (called access structure)

- Whenever we expect to access a file frequently based on some

search condition involving a particular attribute, we can request the

DBMS to create an index

- Usually, a secondary index is created to avoid reordering of records

on the disk

- Secondary index can be created with any primary record

organization

- Insertion of a large number of entries into index is called bulk

loading the index

- Indexing of strings causes problems as the strings vary in size; prefix compression is used to reduce the strings to short fields

- Tuning indexes: the initial choice of indexes may have to be revisited for the following reasons:

o Certain queries may take too long to run for lack of an index

o Certain indexes may not be utilized at all

o Certain indexes may undergo too much updating due to frequent changes

- Some indexes may be dropped and some new ones created; a trace facility shows the usage of indexes

- Rebuilding the index: to improve performance and restructure the

tree

- It is common to use an index to enforce a key constraint on an

attribute; while inserting a record, it can be checked to see if

another record exists with the same key attribute (key integrity

constraint)

- If an index is created on a nonkey field, duplicates occur; data records with the same value may be contained in the same block or span many blocks; some systems add a row-id to the record, so that records with duplicate values have their own unique identifiers

- Inverted file: a file that has a secondary index on every one of its

fields is called a fully inverted file. The data file itself is an unordered

file.

- Using indexing hints in queries: some systems allow hints in queries, which suggest alternatives or indicators to the query processor and optimizer for expediting query execution

SELECT /*+ INDEX (EMPLOYEE emp_dno_index)*/ Emp_ssn,

Salary, Dno

FROM EMPLOYEE

WHERE Dno < 10;

- Column-based storage of relations

o Vertically partition the table column by column, so that a two-column table (index value, data value) can be constructed for each attribute and only the needed columns are accessed

o Use materialized views to support queries on multiple columns
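Two of the points above, creating an index on demand and using a unique index to enforce a key constraint, can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite is used as a stand-in DBMS, it does not accept the CLUSTER keyword, and the table layout, index names, and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT, dno INTEGER, salary REAL)")

# Secondary index on a non-key field; duplicate dno values are allowed.
conn.execute("CREATE INDEX dno_index ON employee (dno)")
# Unique index used to enforce a key constraint on ssn.
conn.execute("CREATE UNIQUE INDEX ssn_index ON employee (ssn)")

conn.execute("INSERT INTO employee VALUES ('123456789', 4, 50000)")
conn.execute("INSERT INTO employee VALUES ('987654321', 4, 62000)")


def try_insert_duplicate_ssn():
    """Attempt to insert a second record with an existing ssn."""
    try:
        conn.execute("INSERT INTO employee VALUES ('123456789', 5, 41000)")
    except sqlite3.IntegrityError:
        return False   # rejected: the unique index already holds this key
    return True


accepted = try_insert_duplicate_ssn()
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The duplicate insert is rejected by the unique index, while the two rows sharing dno = 4 are accepted because `dno_index` is not unique.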

Physical database design in Relational Databases

The goal of the physical design is to provide appropriate structuring of

data to provide optimal performance.

Factors that influence physical database design:

(a) Analyze the database queries and transactions: intended use of

the database by defining high level form of queries and

transactions that will run; for each retrieval query the following

information would be needed:

i. The files (relations) accessed by the query

ii. Attributes on which selection conditions are specified

iii. Whether the selection condition is an equality, inequality, or range condition

iv. Attributes on which join conditions across multiple tables are specified

v. The attributes whose values are retrieved by the query

For each update operation or transaction, the following

information will be needed:

i. The files that will be updated

ii. Type of operation (insert, update or delete)

iii. Attributes on selection

iv. Attributes that will be changed

(b) Analyze the expected frequency of invocation of queries and transactions (the 80-20 rule: roughly 80% of the processing is accounted for by only 20% of the queries and transactions)

(c) Analyze the time constraints on the queries and transactions (min of 4 seconds and max of 20 seconds)

(d) Analyze the expected frequency of update operations; every access path on a frequently updated file must be updated too, which slows down the operation

(e) Analyze the uniqueness constraints on attributes; checking uniqueness constraints during inserts will slow the process

Physical database design decisions:

- Most relational databases represent each base relation as a physical

database file

- The access path options include individual or composite attributes

for primary file organization (keys)

- At most one of the indexes on each file may be a primary or

clustering index. Any number of secondary indexes can be created

- The performance largely depends upon which indexes or hashing

schemes exist to expedite the processing of selections and joins

- The physical design decisions for indexing fall into the following

categories:

o Whether to index an attribute (used in a query)

o What attribute or attributes to index on (one or more)

o Whether to set up a clustered index (a primary index if the attribute is a key, a clustering index if it is a non-key; either one requires the table to be ordered on that attribute or attributes)

o Whether to use a hash or tree index (B+ trees support equality and range queries; hash indexes do not support range queries; tree indexes (B+ trees) are the most commonly used)

o Whether to use dynamic hashing (for files that grow and shrink often; not commonly used).
