7/18/2019 03 Disks and Files
http://slidepdf.com/reader/full/03-disks-and-files 1/36
Storing Data: Disks and Files
Lecture 3
(R&G Chapter 9)
"Yea, from the table of my memory I'll wipe away all trivial fond records."
-- Shakespeare, Hamlet

Review
- Aren't Databases Great?
- Relational model
- SQL

Disks, Memory, and Files
The BIG picture... (layers of a DBMS, top to bottom):
  Query Optimization and Execution
  Relational Operators
  Files and Access Methods
  Buffer Management
  Disk Space Management
  DB

Disks and Files
- DBMS stores information on disks.
  - In an electronic world, disks are a mechanical anachronism!
- This has major implications for DBMS design!
  - READ: transfer data from disk to main memory (RAM).
  - WRITE: transfer data from RAM to disk.
  - Both are high-cost operations, relative to in-memory operations, so must be planned carefully!

Why Not Store Everything in Main Memory?
- Costs too much. For ~$1000, PCConnection will sell you either
  - ~[?]GB of RAM
  - ~30GB of flash
  - ~2.[?] TB of disk
- Main memory is volatile. We want data to be saved between runs. (Obviously!)

The Storage Hierarchy
Source: Operating Systems Concepts 5th Edition
- Main memory (RAM) for currently used data.
- Disk for the main database (secondary storage).
- Tapes for archiving older versions of the data (tertiary storage).
Top of the hierarchy: Smaller, Faster. Bottom of the hierarchy: Bigger, Slower.

Jim Gray's Storage Latency Analogy: How Far Away is the Data?
Storage level          Relative latency   Analogy             Travel time
Registers              1                  My Head             1 min
On Chip Cache          2                  This Room
On Board Cache         10                 This Lecture Hall   10 min
Memory                 100                Sacramento          1.5 hr
Disk                   10^6               Pluto               2 Years
Tape / Optical Robot   10^9               Andromeda           2,000 Years

Disks
- Secondary storage device of choice.
- Main advantage over tapes: random access vs. sequential.
- Data is stored and retrieved in units called disk blocks or pages.
- Unlike RAM, time to retrieve a disk block varies depending upon location on disk.
  - Therefore, relative placement of blocks on disk has major impact on DBMS performance!

Components of a Disk
Figure labels: Platters, Spindle, Disk head, Arm movement, Arm assembly, Tracks, Sector.
- The platters spin (say, 120 rps).
- The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!).
- Only one head reads/writes at any one time.
- Block size is a multiple of sector size (which is fixed).

Accessing a Disk Page
- Time to access (read/write) a disk block:
  - seek time (moving arms to position disk head on track)
  - rotational delay (waiting for block to rotate under head)
  - transfer time (actually moving data to/from disk surface)
- Seek time and rotational delay dominate.
  - Seek time varies between about 0.3 and 10 msec
  - Rotational delay varies from 0 to 4 msec
  - Transfer rate around .0[?] msec per [?]K block
- Key to lower I/O cost: reduce seek/rotation delays! Hardware vs. software solutions?
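
To see why seek time and rotational delay dominate, it helps to add the three components up. The following is a minimal sketch, not a model of any real drive: the function name `access_time_ms` and the sample figures (5 msec seek, 2 msec rotational delay, 0.1 msec transfer per block) are assumptions picked for illustration.

```python
def access_time_ms(n_blocks, seek_ms, rotation_ms, transfer_ms, sequential):
    """Estimated time to read n_blocks, in milliseconds.

    sequential=True: blocks are laid out contiguously, so the arm is
    positioned once and the blocks are then transferred back to back.
    sequential=False: every block pays its own seek and rotational delay.
    """
    positioning = seek_ms + rotation_ms
    if sequential:
        return positioning + n_blocks * transfer_ms
    return n_blocks * (positioning + transfer_ms)


# Assumed, illustrative numbers (not taken from a data sheet).
random_ms = access_time_ms(100, seek_ms=5, rotation_ms=2,
                           transfer_ms=0.1, sequential=False)
sequential_ms = access_time_ms(100, seek_ms=5, rotation_ms=2,
                               transfer_ms=0.1, sequential=True)
print(f"random: {random_ms:.1f} msec, sequential: {sequential_ms:.1f} msec")
```

With these assumed numbers, reading 100 scattered blocks costs roughly forty times as much as reading 100 contiguous ones, which is the motivation for arranging pages carefully on disk.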

Arranging Pages on Disk
- 'Next' block concept:
  - blocks on same track, followed by
  - blocks on same cylinder, followed by
  - blocks on adjacent cylinder
- Blocks in a file should be arranged sequentially on disk (by 'next'), to minimize seek and rotational delay.
- For a sequential scan, pre-fetching several pages at a time is a big win!
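
One way to read the 'next' ordering is as a sort key over block addresses. The sketch below assumes a hypothetical `BlockAddr` triple of (cylinder, track, sector); real drives hide their geometry behind logical block numbers, so this only illustrates the idea.

```python
from typing import NamedTuple


class BlockAddr(NamedTuple):
    cylinder: int
    track: int    # which surface (head) within the cylinder
    sector: int   # position of the block along the track


def next_order(blocks):
    """Return blocks in 'next' order: along a track first, then across
    the tracks of the same cylinder, then on to the adjacent cylinder."""
    return sorted(blocks, key=lambda b: (b.cylinder, b.track, b.sector))


scattered = [BlockAddr(1, 0, 0), BlockAddr(0, 1, 0),
             BlockAddr(0, 0, 1), BlockAddr(0, 0, 0)]
for addr in next_order(scattered):
    print(addr)
```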

Disk Space Management
- Lowest layer of DBMS software manages space on disk (using OS file system or not?).
- Higher levels call upon this layer to:
  - allocate/de-allocate a page
  - read/write a page
- Best if a request for a sequence of pages is satisfied by pages stored sequentially on disk!
  - Responsibility of disk space manager.
  - Higher levels don't know how this is done, or how free space is managed.
  - Though they may make performance assumptions!
- Hence disk space manager should do a decent job.
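
The four operations that higher levels call can be sketched as a small interface. This is a toy under stated assumptions: the class name `DiskSpaceManager`, its method names, and the 4096-byte page size are made up for illustration, and a dictionary stands in for the disk, so nothing here addresses sequential placement.

```python
class DiskSpaceManager:
    """Toy disk space manager; a dict stands in for the disk."""

    PAGE_SIZE = 4096  # assumed page size in bytes

    def __init__(self):
        self._pages = {}       # page id -> page contents
        self._free_ids = []    # ids of de-allocated pages, available for reuse
        self._next_id = 0

    def allocate_page(self):
        """Return the id of a fresh, zero-filled page."""
        if self._free_ids:
            page_id = self._free_ids.pop()
        else:
            page_id = self._next_id
            self._next_id += 1
        self._pages[page_id] = bytes(self.PAGE_SIZE)
        return page_id

    def deallocate_page(self, page_id):
        del self._pages[page_id]          # KeyError if the page is not allocated
        self._free_ids.append(page_id)

    def read_page(self, page_id):
        return self._pages[page_id]

    def write_page(self, page_id, data):
        if page_id not in self._pages:
            raise KeyError(page_id)
        if len(data) != self.PAGE_SIZE:
            raise ValueError("data must be exactly one page long")
        self._pages[page_id] = bytes(data)
```

Callers see only page ids; how free space is tracked (here, a simple list of reusable ids) stays hidden behind the interface.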

Context
  Query Optimization and Execution
  Relational Operators
  Files and Access Methods
  Buffer Management
  Disk Space Management
  DB

Buffer Management in a DBMS
- Data must be in RAM for DBMS to operate on it!
- Buffer Mgr hides the fact that not all data is in RAM
Figure: page requests from higher levels arrive at the BUFFER POOL in MAIN MEMORY, whose frames either hold a disk page or are free; pages come from the DB on DISK; the choice of frame is dictated by the replacement policy.

When a Page is Requested ...
- Buffer pool information table contains: <frame#, pageid, pin_count, dirty>
- If requested page is not in pool:
  - Choose a frame for replacement. Only "un-pinned" pages are candidates!
  - If frame is "dirty", write it to disk
  - Read requested page into chosen frame
- Pin the page and return its address.
If requests can be predicted (e.g., sequential scans), pages can be pre-fetched several pages at a time!

More on Buffer Management
- Requestor of page must eventually unpin it, and indicate whether page has been modified:
  - dirty bit is used for this.
- Page in pool may be requested many times,
  - a pin count is used.
  - To pin a page, pin_count++
  - A page is a candidate for replacement iff pin count == 0 ("unpinned")
- CC & recovery may entail additional I/O when a frame is chosen for replacement.
  - Write-Ahead Log protocol; more later!
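
The request and unpin steps can be put together in a short sketch. Everything named here (`Frame`, `BufferPool`, `request_page`, `unpin_page`) is hypothetical, a dictionary stands in for the disk, and the victim choice is a placeholder (the first unpinned frame) rather than a real replacement policy. Logging and concurrency control are left out.

```python
class Frame:
    def __init__(self):
        self.page_id = None
        self.pin_count = 0
        self.dirty = False
        self.data = None


class BufferPool:
    def __init__(self, n_frames, disk):
        self.frames = [Frame() for _ in range(n_frames)]
        self.page_table = {}   # page id -> index of the frame holding it
        self.disk = disk       # dict: page id -> contents (stand-in for the disk)

    def _choose_victim(self):
        # Placeholder policy: take the first frame with pin_count == 0.
        for i, frame in enumerate(self.frames):
            if frame.pin_count == 0:
                return i
        raise RuntimeError("all frames are pinned")

    def request_page(self, page_id):
        if page_id not in self.page_table:
            i = self._choose_victim()
            frame = self.frames[i]
            if frame.page_id is not None:
                if frame.dirty:
                    self.disk[frame.page_id] = frame.data   # write back first
                del self.page_table[frame.page_id]
            # Read the requested page into the chosen frame.
            frame.page_id, frame.data, frame.dirty = page_id, self.disk[page_id], False
            self.page_table[page_id] = i
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count += 1        # pin the page
        return frame

    def unpin_page(self, page_id, dirty):
        frame = self.frames[self.page_table[page_id]]
        if frame.pin_count == 0:
            raise ValueError("page is not pinned")
        frame.pin_count -= 1
        frame.dirty = frame.dirty or dirty   # once dirty, stays dirty until written
```

A caller would pair every `request_page` with an `unpin_page`, passing `dirty=True` if it modified the frame's contents.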

Buffer Replacement Policy
- Frame is chosen for replacement by a replacement policy:
  - Least-recently-used (LRU), MRU, Clock, etc.
- Policy can have big impact on # of I/O's; depends on the access pattern.

LRU Replacement Policy
- Least Recently Used (LRU)
  - for each page in buffer pool, keep track of time when last unpinned
  - replace the frame which has the oldest (earliest) time
  - very common policy: intuitive and simple
- Works well for repeated accesses to popular pages
- Problems?
- Problem: Sequential flooding
  - LRU + repeated sequential scans.
  - # buffer frames < # pages in file means each page request causes an I/O.
  - Idea: MRU better in this scenario? We'll see in HW1!
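
Sequential flooding is easy to reproduce in a few lines. The simulation below is a sketch under simplifying assumptions: each page is unpinned immediately after it is accessed, so "last unpinned" and "last accessed" coincide, and the function name `count_ios` is made up for this example.

```python
def count_ios(n_frames, n_pages, n_scans, policy):
    """Count page reads for repeated sequential scans of a file.

    policy is "LRU" (evict least recently used) or "MRU" (evict most
    recently used). The pool list is ordered from least to most recent.
    """
    pool = []
    ios = 0
    for _ in range(n_scans):
        for page in range(n_pages):
            if page in pool:
                pool.remove(page)             # hit: re-appended below as most recent
            else:
                ios += 1                      # miss: page must be read from disk
                if len(pool) == n_frames:
                    pool.pop(0 if policy == "LRU" else -1)
            pool.append(page)
    return ios


# 3 frames, a 4-page file, 5 scans: 20 page requests in total.
print("LRU:", count_ios(3, 4, 5, "LRU"))
print("MRU:", count_ios(3, 4, 5, "MRU"))
```

With one frame fewer than the file has pages, LRU misses on every one of the 20 requests, while MRU keeps most of the file resident and misses far less often.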

"Clock" Replacement Policy
- An approximation of LRU
- Arrange frames into a cycle, store one reference bit per frame
  - Can think of this as the 2nd chance bit
- When pin count reduces to 0, turn on ref. bit
- When replacement necessary
    do for each page in cycle {
        if (pincount == 0 && ref bit is on)
            turn off ref bit;
        else if (pincount == 0 && ref bit is off)
            choose this page for replacement;
    } until a page is chosen;
Questions: How like LRU? Problems?
Figure: frames A(1), B(p), C(1), D(1) arranged in a cycle.
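
The loop above translates fairly directly into runnable form. In this sketch the class name `ClockReplacer` and its methods are hypothetical. The two-sweep bound and the exception raised when every frame is pinned are additions of the sketch, since the pseudocode does not say what happens in that case.

```python
class ClockReplacer:
    def __init__(self, n_frames):
        self.pin_count = [0] * n_frames
        self.ref_bit = [False] * n_frames
        self.hand = 0                      # current position in the cycle

    def pin(self, frame):
        self.pin_count[frame] += 1

    def unpin(self, frame):
        self.pin_count[frame] -= 1
        if self.pin_count[frame] == 0:
            self.ref_bit[frame] = True     # grant a second chance

    def choose_victim(self):
        n = len(self.pin_count)
        # Two sweeps suffice: the first may only clear reference bits,
        # the second then finds an unpinned frame whose bit is off.
        for _ in range(2 * n):
            frame = self.hand
            self.hand = (self.hand + 1) % n
            if self.pin_count[frame] == 0:
                if self.ref_bit[frame]:
                    self.ref_bit[frame] = False
                else:
                    return frame
        raise RuntimeError("all frames are pinned")


# Mirror the figure: A, C, D unpinned with ref bit on; B still pinned.
clock = ClockReplacer(4)
for frame in range(4):
    clock.pin(frame)
for frame in (0, 2, 3):
    clock.unpin(frame)
first_victim = clock.choose_victim()
second_victim = clock.choose_victim()
print(first_victim, second_victim)
```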

DBMS vs. OS File System
OS does disk space & buffer mgmt: why not let OS manage these tasks?
- Some limitations, e.g., files can't span disks.
- Buffer management in DBMS requires ability to:
  - pin a page in buffer pool, force a page to disk & order writes (important for implementing CC & recovery)
  - adjust replacement policy, and pre-fetch pages based on access patterns in typical DB operations.

Context
  Query Optimization and Execution
  Relational Operators
  Files and Access Methods
  Buffer Management
  Disk Space Management
  DB

Files of Records
- Blocks are the interface for I/O, but...
- Higher levels of DBMS operate on records, and files of records.
- FILE: A collection of pages, each containing a collection of records. Must support:
  - insert/delete/modify record
  - fetch a particular record (specified using record id)
  - scan all records (possibly with some conditions on the records to be retrieved)

Unordered (Heap) Files
- Simplest file structure contains records in no particular order.
- As file grows and shrinks, disk pages are allocated and de-allocated.
- To support record level operations, we must:
  - keep track of the pages in a file
  - keep track of free space on pages
  - keep track of the records on a page
- There are many alternatives for keeping track of this.
  - We'll consider 2

Heap File Implemented as a List
- The header page id and Heap file name must be stored someplace.
  - Database "catalog"
- Each page contains 2 'pointers' plus data.
Figure: a Header Page points to two linked lists of Data Pages, one of Pages with Free Space and one of Full Pages.

Heap File Using a Page Directory
- The entry for a page can include the number of free bytes on the page.
- The directory is a collection of pages; linked list implementation is just one alternative.
  - Much smaller than linked list of all HF pages!
Figure: a Header Page starts the DIRECTORY, whose entries point to Data Page 1, Data Page 2, ..., Data Page N.
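
Keeping a free-byte count in each directory entry lets an insert pick a page without touching the data pages themselves. The sketch below is illustrative only: `PageDirectory` is a hypothetical name, the directory is a plain Python list instead of a collection of pages, and the first-fit search is an assumption, not something the slide prescribes.

```python
class PageDirectory:
    def __init__(self, page_size):
        self.page_size = page_size
        self.free_bytes = []        # entry i = free bytes on data page i

    def add_page(self):
        """Register a new, empty data page and return its number."""
        self.free_bytes.append(self.page_size)
        return len(self.free_bytes) - 1

    def find_page_for(self, record_len):
        """First fit: the first page with enough room, or None."""
        for page_no, free in enumerate(self.free_bytes):
            if free >= record_len:
                return page_no
        return None

    def insert(self, record_len):
        """Reserve space for a record; return the page it goes on."""
        if record_len > self.page_size:
            raise ValueError("record larger than a page")
        page_no = self.find_page_for(record_len)
        if page_no is None:
            page_no = self.add_page()
        self.free_bytes[page_no] -= record_len
        return page_no


directory = PageDirectory(page_size=100)
placements = [directory.insert(n) for n in (60, 60, 30, 30)]
print(placements, directory.free_bytes)
```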

Indexes (a sneak preview)
- A Heap file allows us to retrieve records:
  - by specifying the rid, or
  - by scanning all records sequentially
- Sometimes, we want to retrieve records by specifying the values in one or more fields, e.g.,
  - Find all students in the "CS" department
  - Find all students with a gpa > 3
- Indexes are file structures that enable us to answer such value-based queries efficiently.

Record Formats: Fixed Length
- Information about field types same for all records in a file; stored in system catalogs.
- Finding i'th field done via arithmetic.
Figure: fields F1 F2 F3 F4 with lengths L1 L2 L3 L4, starting at Base address (B); for example, the address of F3 is Address = B+L1+L2.
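
The address arithmetic amounts to summing the lengths of the preceding fields. A minimal sketch, with hypothetical function names and made-up field lengths:

```python
def field_offset(field_lengths, i):
    """Offset of the i'th field (1-based) from the start of the record:
    L1 + L2 + ... + L(i-1)."""
    if not 1 <= i <= len(field_lengths):
        raise IndexError("no such field")
    return sum(field_lengths[:i - 1])


def field_address(base, field_lengths, i):
    """Address of the i'th field of a record stored at address `base`."""
    return base + field_offset(field_lengths, i)


# Assumed lengths L1..L4 in bytes; in a DBMS they would come from the catalog.
lengths = [4, 8, 2, 6]
print(field_address(1000, lengths, 3))   # B + L1 + L2
```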

Record Formats: Variable Length
- Two alternative formats (# fields is fixed):
  - Fields Delimited by Special Symbols: F1 $ F2 $ F3 $ F4 $
  - Array of Field Offsets: an array of offsets, followed by F1 F2 F3 F4
- Second offers direct access to i'th field, efficient storage of nulls (special "don't know" value); small directory overhead.
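
The second format can be sketched with Python's `struct` module. The layout chosen here is an assumption made for the example: n+1 little-endian 16-bit offsets (the last one marking the end of the record) followed by the field bytes, with a null encoded as a field whose start and end offsets are equal. That encoding cannot tell a null from an empty string, which a real system would have to handle.

```python
import struct


def encode_record(fields):
    """fields: list of bytes objects, or None for a null value."""
    n = len(fields)
    pos = 2 * (n + 1)            # field data starts right after the offset array
    offsets, body = [], b""
    for field in fields:
        offsets.append(pos)
        if field is not None:
            body += field
            pos += len(field)
    offsets.append(pos)          # end of the last field
    return struct.pack(f"<{n + 1}H", *offsets) + body


def get_field(record, n_fields, i):
    """Return field i (0-based) directly, without scanning earlier fields."""
    if not 0 <= i < n_fields:
        raise IndexError("no such field")
    start, end = struct.unpack_from("<2H", record, 2 * i)
    return None if start == end else record[start:end]


record = encode_record([b"Joe", None, b"CS"])
print(get_field(record, 3, 0), get_field(record, 3, 1), get_field(record, 3, 2))
```

The delimiter-based format would instead have to scan past every earlier field to reach field i.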
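
Page Formats: Fixed Length Records
Figure: two layouts. PACKED: Slot 1, Slot 2, ..., Slot N, followed by Free Space; the page stores N, the number of records. UNPACKED, BITMAP: Slot 1, Slot 2, ..., Slot N, ..., Slot M, with a bitmap (bits M ... 3 2 1, one per slot) marking which slots are occupied; the page stores M, the number of slots.
- Record id = <page id, slot #>. In first alternative, moving records for free space management changes rid; may not be acceptable.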
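
Page Formats: Variable Length Records
Figure: Page i holds records with Rid = (i,N), Rid = (i,2), and Rid = (i,1). The SLOT DIRECTORY has entries N ... 2 1 (holding the values 20, 16, 24 in the example), the # slots (N), and a pointer to the start of free space.
- Can move records on page without changing rid; so, attractive for fixed-length records too.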
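
The reason records can move without changing their rid is that the rid names a slot, and only the slot's directory entry knows where the bytes currently live. The sketch below simplifies heavily: `SlottedPage` is a hypothetical class, the slot directory is a Python list rather than bytes at the end of the page, and the free-space check ignores the space the directory itself would occupy.

```python
class SlottedPage:
    def __init__(self, page_id, size=4096):
        self.page_id = page_id
        self.data = bytearray(size)
        self.free_start = 0     # pointer to start of free space
        self.slots = []         # slot directory: (offset, length), or None if free

    def insert(self, record):
        """Store record at the start of free space; return its rid."""
        if self.free_start + len(record) > len(self.data):
            raise ValueError("not enough free space on page")
        entry = (self.free_start, len(record))
        self.data[self.free_start:self.free_start + len(record)] = record
        self.free_start += len(record)
        if None in self.slots:
            slot_no = self.slots.index(None)    # reuse a free slot
            self.slots[slot_no] = entry
        else:
            slot_no = len(self.slots)
            self.slots.append(entry)
        return (self.page_id, slot_no)

    def get(self, rid):
        entry = self.slots[rid[1]]
        if entry is None:
            raise KeyError(rid)
        offset, length = entry
        return bytes(self.data[offset:offset + length])

    def delete(self, rid):
        self.slots[rid[1]] = None   # space is reclaimed later by compact()

    def compact(self):
        """Slide live records together. Only directory offsets change,
        so every rid stays valid."""
        packed = bytearray(len(self.data))
        pos = 0
        for slot_no, entry in enumerate(self.slots):
            if entry is None:
                continue
            offset, length = entry
            packed[pos:pos + length] = self.data[offset:offset + length]
            self.slots[slot_no] = (pos, length)
            pos += length
        self.data, self.free_start = packed, pos
```

After a `delete` followed by `compact`, a surviving record may sit at a different offset, yet `get` with its original rid still returns it.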

System Catalogs
- For each relation:
  - name, file location, file structure (e.g., Heap file)
  - attribute name and type, for each attribute
  - index name, for each index
  - integrity constraints
- For each index:
  - structure (e.g., B+ tree) and search key fields
- For each view:
  - view name and definition
- Plus statistics, authorization, buffer pool size, etc.
Catalogs are themselves stored as relations!

Attr_Cat(attr_name, rel_name, type, position)
attr_name   rel_name        type      position
attr_name   Attribute_Cat   string    1
rel_name    Attribute_Cat   string    2
type        Attribute_Cat   string    3
position    Attribute_Cat   integer   4
sid         Students        string    1
name        Students        string    2
login       Students        string    3
age         Students        integer   4
gpa         Students        real      5
fid         Faculty         string    1
fname       Faculty         string    2
sal         Faculty         real      3
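
Because the catalog is itself a relation, the schema of any relation, including the catalog's own, can be recovered by querying it. The sketch below holds the rows of the table above in a Python list; the name `schema_of` is hypothetical, the underscores in the identifiers are restored by assumption, and the position 5 for `gpa` is inferred from the row order.

```python
# Rows of the attribute catalog: (attr_name, rel_name, type, position).
ATTR_CAT = [
    ("attr_name", "Attribute_Cat", "string", 1),
    ("rel_name", "Attribute_Cat", "string", 2),
    ("type", "Attribute_Cat", "string", 3),
    ("position", "Attribute_Cat", "integer", 4),
    ("sid", "Students", "string", 1),
    ("name", "Students", "string", 2),
    ("login", "Students", "string", 3),
    ("age", "Students", "integer", 4),
    ("gpa", "Students", "real", 5),     # position inferred from row order
    ("fid", "Faculty", "string", 1),
    ("fname", "Faculty", "string", 2),
    ("sal", "Faculty", "real", 3),
]


def schema_of(rel_name):
    """Return [(attr_name, type), ...] for a relation, in position order."""
    rows = [row for row in ATTR_CAT if row[1] == rel_name]
    rows.sort(key=lambda row: row[3])
    return [(attr, typ) for attr, _, typ, _ in rows]


print(schema_of("Faculty"))
print(schema_of("Attribute_Cat"))   # the catalog describes itself
```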

pg_attribute

Summary
- Disks provide cheap, non-volatile storage.
  - Random access, but cost depends on location of page on disk; important to arrange data sequentially to minimize seek and rotation delays.
- Buffer manager brings pages into RAM.
  - Page stays in RAM until released by requestor.
  - Written to disk when frame chosen for replacement (which is sometime after requestor releases the page).
  - Choice of frame to replace based on replacement policy.
  - Tries to pre-fetch several pages at a time.

Summary (Contd.)
- DBMS vs. OS File Support
  - DBMS needs features not found in many OS's, e.g., forcing a page to disk, controlling the order of page writes to disk, files spanning disks, ability to control pre-fetching and page replacement policy based on predictable access patterns, etc.
- Variable length record format with field offset directory offers support for direct access to i'th field and null values.
- Slotted page format supports variable length records and allows records to move on page.

Summary (Contd.)
- File layer keeps track of pages in a file, and supports abstraction of a collection of records.
  - Pages with free space identified using linked list or directory structure (similar to how pages in file are kept track of).
- Indexes support efficient retrieval of records based on the values in some fields.
- Catalog relations store information about relations, indexes and views. (Information that is common to all records in a given collection.)