Query Tree Question
• Should we project pname, pnumber, then select pname = ‘Aquarius’, then project pnumber?
• No, since the operations are done together – the processor would read a row of project, see if pname = ‘Aquarius’, then use pnumber to perform the join.
• Our query tree has only 2 groups, not 3
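A minimal Python sketch of the point above: selection and projection are applied to each project row as it is read, so no intermediate table is built between them. The sample rows and the works_on table (with columns essn and pno) are assumptions for illustration, not part of the slides.

```python
# Hypothetical sample data; the table contents are assumptions.
project = [
    {"pname": "Aquarius", "pnumber": 10, "plocation": "Stafford"},
    {"pname": "ProductX", "pnumber": 1, "plocation": "Bellaire"},
]
works_on = [
    {"essn": "111", "pno": 10},
    {"essn": "222", "pno": 1},
    {"essn": "333", "pno": 10},
]


def aquarius_workers(project_rows, works_on_rows):
    """Select, project, and join in one pass over each project row."""
    result = []
    for p in project_rows:               # read one row of project
        if p["pname"] != "Aquarius":     # selection, applied immediately
            continue
        pnumber = p["pnumber"]           # projection: only pnumber is kept
        for w in works_on_rows:          # use pnumber to perform the join
            if w["pno"] == pnumber:
                result.append(w["essn"])
    return result
```

The nested loop stands in for whatever join method the processor actually picks; only the fusing of select and project is the point here.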
Select Operation Strategies
And Indexing
(Chapter 8)
*Some info on slides from Dr. S. Son, U. Va
Disk access
• DBs traditionally stored on disk
• Cheaper to store on disk than in memory
• Costs for:
– Seek time, latency, data transfer time
• Disk access is page oriented
• 2 - 4 KB page size
Access time
• Time to randomly access a page:
– 12-20 ms, which is 50-83 I/O's per second
• Large disparity between disk access and memory access (10-200 ns)
• System initially determines if page in memory buffer (page tables, etc.)
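The arithmetic behind these figures can be checked with a short Python sketch. The function names are hypothetical; the inputs are the numbers quoted on the slide.

```python
def ios_per_second(access_time_ms):
    """Random page reads per second for a given access time in milliseconds."""
    return 1000.0 / access_time_ms


def disk_to_memory_ratio(disk_ms, memory_ns):
    """How many times slower one disk access is than one memory access."""
    return (disk_ms * 1e-3) / (memory_ns * 1e-9)
```

With a 12 ms access and a 200 ns memory reference, the ratio is on the order of tens of thousands, which is the disparity the slide refers to.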
Table scan
• Linear search - all data rows read in
– I/O parallelism can be used
• multiple I/O read requests satisfied at the same time
• stripe the data across different disks
– Problems with parallelism?
• must balance disk arm load to gain maximum parallelism
• requires the same total number of random I/O's, but using devices for a shorter time
Sequential prefetch I/O
• Retrieve one disk page after another (on same track) - typically 32
• Seek time no longer a problem
• Must know in advance to read 32 successive pages
• Speed up of I/O by a factor of ≈10 (500 I/O's per second vs. 70)
Access time
• Seek time – 10-15ms
• Latency time – 2-5 ms
• Data transfer time – 10-200 ns
Access time for fast I/O
RIO            Seq. Prefetch
.010           .010            Seek - disk arm to cylinder
.002           .002            Latency - platter to sector
.0015          .048            Data transfer - Page
.0135          .060            1 page vs. 32 pages
.43 seconds    .060 seconds    for 32 pages for both
Textbook access time
RIO            Seq. Prefetch
.008           .008            Seek - disk arm to cylinder
.004           .004            Latency - platter to sector
.0005          .016            Data transfer - Page
.0125          .028            1 page vs. 32 pages
.40 seconds    .028 seconds    for 32 pages for both
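The two access-time tables follow from the same arithmetic, sketched here in Python. The function names are hypothetical; all times are in seconds and come from the slides.

```python
def random_io_time(seek, latency, transfer_per_page):
    """Time for one random I/O (RIO): seek + latency + one page transfer."""
    return seek + latency + transfer_per_page


def prefetch_time(seek, latency, transfer_per_page, pages=32):
    """Time for one sequential prefetch: a single seek and latency,
    then `pages` consecutive page transfers."""
    return seek + latency + pages * transfer_per_page


def pages_per_second(total_time, pages):
    """Throughput in pages per second."""
    return pages / total_time
```

Reading 32 pages by random I/O costs 32 times the single-page figure, while prefetch pays seek and latency only once. With the first table's numbers that is roughly 74 pages per second for RIO against roughly 533 for prefetch, in line with the "70 vs. 500" rule of thumb given earlier.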
Disk allocation
• Disk Resource Allocation for Databases (DBA has control)
• Goal – contiguous sectors on disk - want data as close together as possible to minimize seek time
• No standard SQL approach, but general way to deal with allocation
• Some OS allow specification of size of file and disk device
Tablespace
• Allocation medium for tables and indexes for ORACLE, DB2, etc.
• Usually relations (files) cannot span disk devices
• Can put >1 table in a tablespace if accessed together
• Tablespace corresponds to 1 or more OS files and can span disk devices
Query Language
• ORACLE DB's contain several tablespaces, including one called system - data description + indexes + user-defined tables
• default tablespace given to each user
• if multiple tablespaces - better control over load balancing
• can take some disk space off-line
Extent
• extent - contiguous storage on disk
• when data segment or index segment first created, given an initial extent from tablespace, 10KB (5 pages)
• if need more space, given next contiguous extent
• can increase the size by a positive % (cannot decrease)
– initial n - size of initial extent
– next n - size of next extent
– max extents - maximum number of extents
– min extents - number of extents initially allocated
– pct increase n - % by which next extent grows over previous one
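A rough Python sketch of how the extent parameters interact. This is a simplified model, not the DBMS's actual allocator: it assumes the first extent has the initial size, the second has the next size, and each later one grows by pct increase over the previous one. It ignores rounding to whole pages and the max extents limit.

```python
def extent_sizes(initial_kb, next_kb, pct_increase, count):
    """Sizes (in KB) of the first `count` extents under the simplified model.

    Assumption: extent 1 = initial, extent 2 = next, and every further
    extent is pct_increase percent larger than the one before it.
    """
    if count < 1:
        return []
    sizes = [initial_kb]
    size = next_kb
    while len(sizes) < count:
        sizes.append(size)
        size = size * (1 + pct_increase / 100)
    return sizes
```

For example, `extent_sizes(10, 10, 50, 4)` models a 10KB initial extent, a 10KB next extent, and 50% growth afterwards.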
Create table
• Create table statement - can specify tablespace, no. of extents
– When initial extent full, new extent allocated
• pctfree - determines how much space can be used for inserts of new rows
– if pctfree = 10%, inserts stop when page is 90% full
• pctused – determines when new inserts start again
– if used space falls below a certain percentage of total; default pctused = 40%
– pctfree + pctused < 100
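The interplay of pctfree and pctused can be illustrated with a toy page model in Python. The class and its fields are hypothetical, and this is a simplification of what a real DBMS does: it only tracks bytes used and whether the page currently accepts inserts.

```python
class Page:
    """Toy model of one disk page governed by pctfree and pctused."""

    def __init__(self, size=2048, pctfree=10, pctused=40):
        assert pctfree + pctused < 100
        self.size = size
        self.pctfree = pctfree
        self.pctused = pctused
        self.used = 0
        self.accepting = True  # whether new rows may be inserted

    def insert(self, nbytes):
        """Insert a row unless it would eat into the pctfree reserve."""
        if not self.accepting:
            return False
        if self.used + nbytes > self.size * (100 - self.pctfree) / 100:
            self.accepting = False  # page is full for insert purposes
            return False
        self.used += nbytes
        return True

    def delete(self, nbytes):
        """Free space; inserts resume once usage drops below pctused."""
        self.used = max(0, self.used - nbytes)
        if self.used < self.size * self.pctused / 100:
            self.accepting = True
```

The gap between the two thresholds keeps a page from flipping between "full" and "available" on every small delete.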
Rows
• Row layout on each disk page (see figure)
• Row directory – row number and page byte offset
– Row number is row number in page – book calls it slot#
– Page byte offset – with varchar, row size not constant
• To identify a particular row use RID (RowID) – page #, slot # [file#]
– slot# is number in row directory (logical #)
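A minimal Python sketch of a page with a row directory, assuming a RID is simply a (page #, slot #) pair and ignoring the optional file #. The class and function names are hypothetical, and the directory stores an explicit length alongside the byte offset to keep the example short.

```python
class SlottedPage:
    """One page: raw row bytes plus a row directory indexed by slot#."""

    def __init__(self):
        self.data = bytearray()
        self.directory = []  # slot# -> (page byte offset, row length)

    def add_row(self, row: bytes) -> int:
        """Append a variable-length row and return its slot#."""
        offset = len(self.data)
        self.data += row
        self.directory.append((offset, len(row)))
        return len(self.directory) - 1

    def get_row(self, slot: int) -> bytes:
        """Look up the row through the directory, not by scanning."""
        offset, length = self.directory[slot]
        return bytes(self.data[offset:offset + length])


def fetch(pages, rid):
    """Resolve a RID (page#, slot#) to the row bytes."""
    page_no, slot = rid
    return pages[page_no].get_row(slot)
```

Because the slot# is a logical number, a row can be moved within its page by updating only its directory entry; the RID stays valid.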
Differences in DBMSs
• RID can be retrieved in ORACLE but not DB2 (violates relational model rule)
• ORACLE
– rows can be split between pages (row record fragmentation)
– can have rows from multiple tables on same page, more info
• DB2, no splitting, entire row moved to new page, need forwarding pointer
Binary Search
• “Find all students with gpa > 3.0”
– If data is in sorted file, do binary search to find first such student, then scan to find others.
– Cost of binary search can be quite high.
• Simple idea: Create an ‘index’ file.
[Figure: an index file (k1, k2, ..., kN) above a data file (Page 1, Page 2, Page 3, ..., Page N)]
Binary Search
• Binary search on disk
– optimal for comparisons - not optimal for disk-based look-up
– must keep data in order
– may be reading values from same page at different times
• Instead use B+-tree index
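To see why binary search is costly on disk, the Python sketch below counts page reads rather than comparisons. It assumes a hypothetical sorted file modelled as a list of non-empty pages, each a sorted list of keys, and treats every probe as one random I/O.

```python
def binary_search_pages(pages, key):
    """Binary search over sorted pages.

    Returns (page number or None, number of page reads). Each probe of
    a page counts as one read, i.e. one random I/O in this model.
    """
    lo, hi, reads = 0, len(pages) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        page = pages[mid]
        reads += 1
        if key < page[0]:
            hi = mid - 1
        elif key > page[-1]:
            lo = mid + 1
        else:
            return mid, reads  # key falls within this page's key range
    return None, reads
```

On a file of 1024 pages the worst case is about 10 page reads, each paying full seek and latency, whereas a B+-tree of depth 3 needs 3.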
Indexing
• Keyed access retrieval method
• index is a sorted file - sorted by index key
• index entries: index key, pointer (RID)
• pointer is RID
• index resides on disk, partially memory resident when accessed
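As a rough illustration, an index can be modelled in Python as a sorted list of (index key, RID) entries searched with the standard bisect module. The function names and the (page #, slot #) form of the RID are assumptions for this sketch; a real index lives in disk pages rather than one in-memory list.

```python
import bisect


def build_index(rows_by_rid, key_of):
    """Sorted list of (index key, RID) entries."""
    return sorted((key_of(row), rid) for rid, row in rows_by_rid.items())


def lookup(index, key):
    """RIDs of all rows whose index key equals `key`."""
    # (key,) sorts before every (key, rid), so this finds the first match.
    i = bisect.bisect_left(index, (key,))
    rids = []
    while i < len(index) and index[i][0] == key:
        rids.append(index[i][1])
        i += 1
    return rids


def rids_greater_than(index, key):
    """RIDs of all rows whose index key is strictly greater than `key`."""
    keys = [k for k, _ in index]
    start = bisect.bisect_right(keys, key)
    return [rid for _, rid in index[start:]]
```

Because the entries are sorted by key, a query such as gpa > 3.0 becomes one search for the starting position followed by a sequential scan.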
Indexing
• As for any index, 3 alternatives for data entries k*:
– Data record with key value k
– <k, rid of data record with search key value k>
– <k, list of rids of data records with search key k>
• Choice is orthogonal to the indexing technique used to locate data entries k*.
• Tree-structured indexing techniques support both range searches and equality searches.
• B+ tree: dynamic, adjusts gracefully under inserts and deletes.
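The three alternatives for data entries can be shown side by side in Python. The records, the gpa search key, and the (page #, slot #) RIDs are hypothetical sample data.

```python
from itertools import groupby

# Hypothetical records, keyed by RID = (page#, slot#).
records = {
    (1, 0): {"name": "Ann", "gpa": 3.5},
    (1, 1): {"name": "Bob", "gpa": 2.9},
    (2, 0): {"name": "Cy", "gpa": 3.5},
}


def alternative_1(records):
    """Data entry is the data record itself, kept in key order."""
    return sorted(records.values(), key=lambda r: r["gpa"])


def alternative_2(records):
    """Data entry is <k, rid>: one entry per record."""
    return sorted((r["gpa"], rid) for rid, r in records.items())


def alternative_3(records):
    """Data entry is <k, list of rids>: one entry per distinct key."""
    pairs = alternative_2(records)
    return [(k, [rid for _, rid in grp])
            for k, grp in groupby(pairs, key=lambda p: p[0])]
```

Whichever form the data entries take, the structure used to locate them (a B+ tree, for instance) can stay the same, which is what "orthogonal" means above.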
B+-tree
• Most commonly used index structure type in DBs today
• Based on B-tree
• Used to minimize disk I/O
• available in DB2; ORACLE also has hash cluster; Ingres has heap structure, B-tree, isam (chain together new nodes)
Example
Structure of B+ Trees
• leaf level holds pointers to data (RIDs)
• the remaining are directory (index) nodes that point to other index nodes
[Figure: index entries (direct search) above the data entries ("sequence set")]
Characteristics of B+ Tree
• Insert/delete at log_F N cost; keep tree height-balanced. (F = fanout, N = # leaf pages)
• Minimum 50% occupancy (except for root). Each node contains d <= m <= 2d entries. The parameter d is called the order of the tree.
• Supports equality and range-searches efficiently
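A small Python sketch of the two numeric properties above. The function names are hypothetical. The height is computed with integer arithmetic, as the smallest number of index levels whose combined fanout reaches N leaf pages, which is the ceiling of log_F N.

```python
def index_levels(leaf_pages, fanout):
    """Smallest h such that fanout**h >= leaf_pages (ceiling of log_F N)."""
    levels, reach = 0, 1
    while reach < leaf_pages:
        reach *= fanout
        levels += 1
    return levels


def occupancy_ok(entries, d, is_root=False):
    """Check d <= m <= 2d for a node holding m entries.

    Assumption: the root is exempt from the minimum and needs only
    one entry.
    """
    low = 1 if is_root else d
    return low <= entries <= 2 * d
```

With a fanout of 100, a million leaf pages are reachable through only three index levels, which is why a B+ tree stays shallow even for large tables.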
Cost of I/O for B+-tree
• Assume number of entries in each index node fits on one page - one node is one page
• If tree with depth of 3, 3 I/Os to get pointer to data
• B+-tree structured to get most out of every disk page read
• Read in index node, can make multiple probes to same page if remains in memory
– likely since frequent access to upper-level nodes of actively used B+-trees
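The effect of memory-resident upper levels can be sketched in Python by counting only the pages on a root-to-leaf path that are not already buffered. The page identifiers are hypothetical, and the buffer is an unbounded set, which is a simplifying assumption; a real buffer pool evicts pages.

```python
def probe(path, buffer):
    """Count disk I/Os for one index lookup.

    `path` lists the page ids visited from root to leaf; a page already
    in `buffer` costs no I/O. Pages that are read get buffered.
    """
    ios = 0
    for page_id in path:
        if page_id not in buffer:
            ios += 1
            buffer.add(page_id)
    return ios
```

A first lookup in a tree of depth 3 pays 3 I/Os. A second lookup that shares the root and the second-level node pays only for its leaf.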