Disk Management
Operating Systems
Mar 30, 2015
Goals of I/O Software
Provide a common, abstract view of all devices to the application programmer (open, read, write).

Provide as much overlap as possible between the operation of I/O devices and the CPU.
I/O – Processor Overlap
Application programmers expect serial execution semantics:

read(device, "%d", x);
y = f(x);

We expect that the read statement will complete before the assignment is executed. To accomplish this, the OS blocks the process until the I/O operation completes.
[Figure: the layered I/O path: User Process, Device Independent Layer, Device Dependent Layer, Interrupt Handler, Device Controller (data, status, command). The user process issues read(device, "%d", x); y = f(x); the READ command travels down to the device controller, and the value read (here, 5) is stored into x.]
Without Blocking!

The read is issued. The read has not completed… but the process continues to execute. The read completes and the value of x is updated.
In a multi-programming environment, another application could use the CPU while the first application waits for the I/O to complete.
[Figure: app1 requests an I/O operation from the I/O controller and blocks; app2 uses the CPU until the I/O completes (app2 done), then app1 resumes.]
Performance

Thread execution time can be broken into:
• Time_compute: the time the thread spends doing computations
• Time_device: the time spent on I/O operations
• Time_overhead: the time spent determining if I/O is complete

So, Time_total = Time_compute + Time_device + Time_overhead
Performance

Time_total = Time_compute + Time_device + Time_overhead

When the device driver polls:

Time_overhead is the period of time between the point where the device completes the operation and the point where the polling loop determines that the operation is complete. This is generally just a few instruction times.

Note that when the device driver polls, no other process can use the CPU. Polling consumes the CPU. "Are you done yet?"
Performance

Time_total = Time_compute + Time_device + Time_overhead

When the device driver uses interrupts:

Time_overhead = Time_handler + Time_ready

Time_handler is the time spent in the interrupt handler. Time_ready is the time the process waits for the CPU after it has completed its I/O, while another process uses the CPU.
For simplicity's sake, assume processes of the following form: each process computes for a long while and then writes its results to a file. We will ignore the time taken to do a context switch.

[Figure: process timeline: Time_compute, then the process requests an I/O operation from the I/O controller (Time_device), then Time_compute again.]
Polling Case

In the polling case, the process starts the I/O operation, and then continually loops, asking the device if it is done.

[Figure: timelines for Proc 1 and Proc 2: each alternates Time_compute, Time_device (during which that process polls and holds the CPU), Time_overhead, Time_compute. First Proc 1 polls, then Proc 2 polls.]
Interrupt Case

In the interrupt case, the process starts the I/O operation, and then blocks. When the I/O is done, the OS will get an interrupt.

[Figure: timelines for Proc 1 and Proc 2: while Proc 1 waits out Time_device, Proc 2 runs its Time_compute; Time_interrupt_handler and Time_overhead follow each I/O completion.]
Which gives better system throughput?
* Polling
* Interrupts

Which gives better application performance?
* Polling
* Interrupts

If you were developing an operating system, would you choose interrupts or polling?
Buffering Issues

[Figure: data is read from the disk directly into user-space memory; no kernel buffer is involved.]

Assume that you are using interrupts… What problems exist in this situation?
The process cannot be completely swapped out of memory. At least the page containing the addresses into which the data is being written must remain in real memory.
Buffering Issues

[Figure: data is read from the disk into a kernel buffer; when the buffer is full, it is transferred to memory in user space.]
Buffering Issues

We can now swap the user process out while the I/O completes.

What problems exist in this situation?

1. The OS has to carefully keep track of the assignment of system buffers to user processes.
2. There is a performance issue when the user process is not in memory and the OS is ready to transfer its data to the user process. Also, the device must wait while data is being transferred.
3. The swapping logic is complicated when the swapping operation uses the same disk drive for paging that the data is being read from.
Buffering Issues

Some of the performance issues can be addressed by double buffering. While one buffer is being transferred to the user process, the device is reading data into a second buffer.
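As a rough illustration (not from the slides), a simple timing model shows what double buffering buys. Assume a fixed fill time (device writes a buffer) and drain time (process consumes it); with one buffer the two strictly alternate, while with two buffers the fill of one overlaps the drain of the other. The function names and parameters here are illustrative.

```python
def single_buffer_time(n_buffers, t_fill, t_drain):
    # One buffer: the device fills it, then the process drains it,
    # strictly alternating for every buffer-full of data.
    return n_buffers * (t_fill + t_drain)

def double_buffer_time(n_buffers, t_fill, t_drain):
    # Two buffers: after the first fill, filling buffer B overlaps with
    # draining buffer A, so each middle step costs only the slower of
    # the two operations; the last drain cannot be overlapped.
    return t_fill + (n_buffers - 1) * max(t_fill, t_drain) + t_drain
```

For an I/O-bound process (t_drain < t_fill) the total is still dominated by the device, matching the slides' point that double buffering does not help there; for a compute-bound process the fills hide behind computation.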
Networking may involve many copies
Disk Scheduling

Because disk I/O is so important, it is worth our time to investigate some of the issues involved in disk I/O. One of the biggest issues is disk performance.
Seek time is the time required for the read head to move to the track containing the data to be read.

Rotational delay, or latency, is the time required for the sector to move under the read head.
Performance Parameters

[Figure: timeline of a disk request: wait for device, wait for channel, then (device busy) seek, rotational delay, data transfer.]

Seek time is the time required to move the disk arm to the specified track:

Ts = (# tracks crossed * disk constant) + startup time

Rotational delay (latency) is the time required for the data on that track to come underneath the read heads. For a hard drive rotating at 3600 rpm, the average rotational delay will be 8.3 ms (half a revolution).

Transfer time:

Tt = bytes / (rotation_speed * bytes_on_track)
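The three parameters above can be turned into a small calculator. This is a sketch using the slide's formulas; the per-track constant and startup time defaults are made-up illustrative values, not properties of any real drive.

```python
def seek_time_ms(tracks_crossed, ms_per_track=0.1, startup_ms=2.0):
    # Ts = (# tracks crossed * disk constant) + startup time
    return tracks_crossed * ms_per_track + startup_ms

def avg_rotational_delay_ms(rpm):
    # On average the target sector is half a revolution away.
    # One revolution takes 60000/rpm milliseconds.
    return 0.5 * (60_000 / rpm)

def transfer_time_ms(nbytes, rpm, bytes_per_track):
    # Tt = bytes / (rotation_speed * bytes_on_track): reading the whole
    # track takes one revolution, so scale by the fraction transferred.
    return (nbytes / bytes_per_track) * (60_000 / rpm)
```

At 3600 rpm this reproduces the slide's 8.3 ms average rotational delay.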
Data Organization vs. Performance

Consider a file where the data is stored as compactly as possible; in this case the file occupies all of the sectors on 8 adjacent tracks (32 sectors x 8 tracks = 256 sectors total).

The time to read the first track will be:

average seek time    20 ms
rotational delay     8.3 ms
read 32 sectors      16.7 ms
                     45 ms

Assuming that there is essentially no seek time on the remaining tracks, each successive track can be read in 8.3 ms + 16.7 ms = 25 ms.

Total read time = 45 ms + 7 * 25 ms = 220 ms = 0.22 seconds
If the data is randomly distributed across the disk, then for each sector we have:

average seek time    20 ms
rotational delay     8.3 ms
read 1 sector        0.5 ms
                     28.8 ms

Total time = 256 sectors * 28.8 ms/sector ≈ 7.37 seconds

Random placement of data can be a problem when multiple processes are accessing the same disk.
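The two layouts can be checked with simple arithmetic, using the values from the slides (20 ms average seek, 8.3 ms rotational delay, 16.7 ms per full track, 0.5 ms per sector):

```python
SEEK_MS, ROT_MS, TRACK_READ_MS = 20.0, 8.3, 16.7
SECTOR_READ_MS = 0.5            # roughly 16.7 ms / 32 sectors
TRACKS, SECTORS = 8, 256

# Compact layout: one seek, then each of the 8 tracks pays a rotational
# delay plus a full-track read (no further seeks needed).
sequential_ms = SEEK_MS + TRACKS * (ROT_MS + TRACK_READ_MS)

# Random layout: every sector pays a full average seek, a rotational
# delay, and a single-sector read.
random_ms = SECTORS * (SEEK_MS + ROT_MS + SECTOR_READ_MS)
```

This gives 220 ms for the compact layout versus roughly 7.37 seconds for the random layout, a factor of about 33.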
In the previous example, the biggest factor on performance is? Seek time!

To improve performance, we need to reduce the average seek time.
[Figure: a queue of pending disk requests.]

If requests are scheduled in random order, then we would expect the disk tracks to be visited in a random order.
First-come, First-served Scheduling

• If there are few processes competing for the drive, we can hope for good performance.
• If there are a large number of processes competing for the drive, then performance approaches the random scheduling case.
While at track 15, assume some random set of read requests: tracks 4, 40, 11, 35, 7 and 14.

Head Path    Tracks Traveled
15 to 4      11 steps
4 to 40      36 steps
40 to 11     29 steps
11 to 35     24 steps
35 to 7      28 steps
7 to 14      7 steps
Total        135 steps
Shortest Seek Time First

Always select the request that requires the shortest seek time from the current position.
Shortest Seek Time First

While at track 15, assume some random set of read requests: tracks 4, 40, 11, 35, 7 and 14.

[Figure: head path 15 → 14 → 11 → 7 → 4 → 35 → 40, a total of 47 tracks traveled.]

Problem? In a heavily loaded system, incoming requests with a shorter seek time will constantly push requests with long seek times to the end of the queue. This results in what is called "starvation".
The Elevator Algorithm (scan-look)

Search for the shortest seek time from the current position in one direction only. Continue in this direction until all requests have been satisfied, then go the opposite direction.

In the scan algorithm, the head moves all the way to the first (or last) track with a request before it changes direction.
Scan-Look

While at track 15, assume some random set of read requests: tracks 4, 40, 11, 35, 7 and 14. The head is moving towards higher-numbered tracks.

[Figure: head path 15 → 35 → 40, then reversing: 40 → 14 → 11 → 7 → 4, a total of 61 tracks traveled.]
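The three policies can be compared with a small simulation. This sketch (the function names are my own) reproduces the worked example: starting at track 15 with pending requests for tracks 4, 40, 11, 35, 7 and 14.

```python
def fcfs(start, requests):
    # Service requests strictly in arrival order, summing head movement.
    total, pos = 0, start
    for track in requests:
        total += abs(pos - track)
        pos = track
    return total

def sstf(start, requests):
    # Always service the pending request closest to the current position.
    pending, total, pos = list(requests), 0, start
    while pending:
        nearest = min(pending, key=lambda t: abs(pos - t))
        total += abs(pos - nearest)
        pending.remove(nearest)
        pos = nearest
    return total

def look(start, requests):
    # Elevator (LOOK): sweep toward higher-numbered tracks first,
    # then reverse and sweep back down.
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    return fcfs(start, up + down)
```

For the example above these give 135 (FCFS), 47 (SSTF) and 61 (LOOK) tracks traveled, matching the worked slides.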
Which algorithm would you choose if you were implementing an operating system? Issues to consider when selecting a disk scheduling algorithm:

Performance is based on the number and types of requests.
What scheme is used to allocate unused disk blocks?
How and where are directories and i-nodes stored?
How does paging impact disk performance?
How does disk caching impact performance?
Disk Cache

The disk cache holds a number of disk sectors in memory. When an I/O request is made for a particular sector, the disk cache is checked. If the sector is in the cache, it is read. Otherwise, the sector is read into the cache.
Replacement Strategies

Least Recently Used: replace the sector that has been in the cache the longest without being referenced.

Least Frequently Used: replace the sector that has been used the least.
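An LRU disk cache can be sketched with an ordered map: a hit moves the sector to the most-recently-used end, and a miss on a full cache evicts the least-recently-used sector. A minimal sketch, not a real driver; the class and method names are my own.

```python
from collections import OrderedDict

class LRUDiskCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.sectors = OrderedDict()    # sector number -> sector data

    def read(self, sector, read_from_disk):
        if sector in self.sectors:
            # Cache hit: mark this sector as most recently used.
            self.sectors.move_to_end(sector)
            return self.sectors[sector]
        # Cache miss: fetch from disk and insert at the MRU end.
        data = read_from_disk(sector)
        self.sectors[sector] = data
        if len(self.sectors) > self.capacity:
            # Evict the least recently used sector (front of the map).
            self.sectors.popitem(last=False)
        return data
```

A least-frequently-used variant would instead keep a use count per sector and evict the minimum.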
RAID: Redundant Array of Independent Disks

• Push performance
• Add reliability
RAID Level 0: Striping

[Figure: the disk management software presents a single logical disk of strips 0, 1, 2, 3, … Even-numbered strips (0, 2, 4, 6, …) are placed on Physical Drive 1 and odd-numbered strips (1, 3, 5, 7, …) on Physical Drive 2. The strips at the same position across the drives form a stripe.]
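The striping layout amounts to a round-robin mapping from a logical strip number to a (drive, strip-on-drive) pair. A sketch for the layout in the figure (the function name is my own; drives are numbered from 1 as on the slide):

```python
def raid0_locate(logical_strip, n_drives=2):
    # Round-robin striping: strip 0 -> drive 1, strip 1 -> drive 2,
    # strip 2 -> drive 1, and so on.
    drive = logical_strip % n_drives + 1
    strip_on_drive = logical_strip // n_drives
    return drive, strip_on_drive
```

For example, logical strip 5 lands on drive 2 at position 2, matching the figure (drive 2 holds strips 1, 3, 5, 7, …).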
RAID Level 1: Mirroring

[Figure: as in striping, even strips (0, 2, 4, 6, …) go to Physical Drive 1 and odd strips (1, 3, 5, 7, …) to Physical Drive 2, but Physical Drives 3 and 4 hold exact copies (mirrors) of drives 1 and 2. High reliability.]
RAID Level 3: Parity

[Figure: strips are distributed round-robin across Physical Drives 1–3 (drive 1: strips 0, 3, 6, 9, …; drive 2: strips 1, 4, 7, 10, …; drive 3: strips 2, 5, 8, 11, …), while Physical Drive 4 holds a parity strip (par a, par b, par c, par d, …) for each stripe. High throughput.]
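Parity here is the bytewise XOR of the data strips in a stripe: if any one drive fails, its strip can be rebuilt by XOR-ing the surviving strips with the parity strip. A minimal sketch (function names are my own):

```python
def parity(strips):
    # XOR all strips together byte by byte to form the parity strip.
    result = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            result[i] ^= byte
    return bytes(result)

def rebuild(surviving_strips, parity_strip):
    # XOR-ing the surviving strips with the parity strip cancels them
    # out of the parity, leaving exactly the lost strip.
    return parity(list(surviving_strips) + [parity_strip])
```

This works because XOR is its own inverse: p = s0 ^ s1 ^ s2 implies s1 = s0 ^ s2 ^ p.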
Thinking About What You Have Learned
Suppose that 3 processes, p1, p2, and p3, are attempting to concurrently use a machine with interrupt-driven I/O. Assuming that no two processes can be using the CPU or the physical device at the same time, what is the minimum amount of time required to execute the three processes, given the following (ignore context switches):

Process   Time_compute   Time_device
1         10             50
2         30             10
3         15             35
Process   Time_compute   Time_device
1         10             50
2         30             10
3         15             35

[Figure: Gantt chart: p1 computes 0–10, then uses the device 10–60; p2 computes 10–40, then uses the device 60–70; p3 computes 40–55, then uses the device 70–105.]

Minimum total time: 105
Consider the case where the device controller is double buffering I/O. That is, while the process is reading a character from one buffer (A), the device is writing to the second (B).

What is the effect on the running time of the process if the process is I/O bound and requests characters faster than the device can provide them?

The process reads from buffer A. It tries to read from buffer B, but the device is still filling it. The process blocks until the data has been stored in buffer B. The process wakes up and reads the data, then tries to read buffer A. Double buffering has not helped performance.
Consider the case where the device controller is double buffering I/O. That is, while the process is reading a character from one buffer (A), the device is writing to the second (B).

What is the effect on the running time of the process if the process is compute bound and requests characters much slower than the device can provide them?

The process reads from buffer A. It then computes for a long time. Meanwhile, buffer B is filled. When the process asks for the data it is already there. The process does not have to wait, and performance improves.
Suppose that the read/write head is at track 97, moving toward the highest numbered track on the disk, track 199. The disk request queue contains read/write requests for blocks on tracks 84, 155, 103, 96, and 197, respectively.

How many tracks must the head step across using a FCFS strategy?
Using FCFS:

Head Path     Tracks Traveled
97 to 84      13 steps
84 to 155     71 steps
155 to 103    52 steps
103 to 96     7 steps
96 to 197     101 steps
Total         244 steps
How many tracks must the head step across using an elevator strategy?
Using the elevator (scan) strategy:

Head Path     Tracks Traveled
97 to 103     6 steps
103 to 155    52 steps
155 to 197    42 steps
197 to 199    2 steps
199 to 96     103 steps
96 to 84      12 steps
Total         217 steps
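The 217-step answer follows the SCAN variant, in which the head continues to the last track (199) before reversing. A sketch under that assumption (the function name is my own):

```python
def scan_up_then_down(start, requests, max_track):
    # Sweep toward higher tracks, continue to the end of the disk,
    # then reverse and service the remaining lower-track requests.
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    path = [start] + up + ([max_track] if down else []) + down
    return sum(abs(a - b) for a, b in zip(path, path[1:]))
```

Starting at track 97 with requests 84, 155, 103, 96, 197 and a 199-track disk, this gives the 217 steps shown above.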
In our class discussion on directories it was suggested that directory entries are stored as a linear list. What is the big disadvantage of storing directory entries this way, and how could you address this problem?

Consider what happens when you look up a file… the directory must be searched in a linear way. This could be addressed with a structure that supports faster lookup, such as a hash table or a B-tree.
Which file allocation scheme discussed in class gives the best performance? What are some of the concerns with this approach?

Contiguous allocation gives the best performance. Two big problems are:
* Finding space for a new file (it must all fit in contiguous blocks)
* Allocating space when we don't know how big the file will be, or handling files that grow over time
What is the difference between internal and external fragmentation?

Internal fragmentation occurs when only a portion of a file block is used by a file.

External fragmentation occurs when the free space on a disk does not contain enough contiguous space to hold a file.
Linked allocation of disk blocks solves many of the problems of contiguous allocation, but it does not work very well for random access files. Why not?

To access a random block on disk, you must walk through the entire list up to the block you need.
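The linear cost is easy to see in a FAT-style sketch, where a table maps each block number to the next block of the file (the names here are illustrative):

```python
def block_at(first_block, next_block, index):
    # Follow the chain of "next block" pointers. Reaching block number
    # `index` of the file costs `index` pointer-chasing steps; in a real
    # system each hop could mean another disk read.
    block = first_block
    for _ in range(index):
        block = next_block[block]
    return block
```

For example, with next_block = {7: 2, 2: 9, 9: 12}, the file starting at block 7 has its third block (index 2) at disk block 9; seeking to byte offsets deep in the file requires walking every link before them.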
Linked allocation of disk blocks has a reliability problem. What is it?

If a link breaks for any reason, the disk blocks after the broken link are inaccessible.