Friday, March 23 CS 470 Operating Systems - Lecture 29 1
Lecture 29
Reminder: Homework 7 is due on Monday at class time for Exam 2 review; no late work accepted.
Reminder: Exam 2 is on Wednesday. Exam 2 review sheet is posted.
Questions?
Outline
Disk systems
- Disk scheduling
- Disk management
- RAID
Disk Drives
A disk is viewed logically as a linear array of blocks. How is it mapped onto a circular disk drive?
A disk drive is one or more platters rotating on a spindle. Each side of a platter has a head that reads the data off that side. Each platter side is laid out in concentric rings called tracks. The vertical extent of the same track position across all platters is a cylinder. Each track/cylinder is divided into sectors.
Generally, block numbers are mapped with Block 0 at cylinder/track 0 (the outermost track), head 0, sector 0. The next block is sector 1, and so on until the track is full; then the next block is head 1, sector 0, etc., until the cylinder is full; then the next block is cylinder/track 1, head 0, sector 0, and so forth.
Conceptually, it is possible for OS's to map logical block numbers to <cyl, head, sector> addresses, but this no longer happens; the mapping is handled by the disk controller.
One reason the mapping is done in the disk controller is that disks have been getting larger. Density has increased in three dimensions:
- # sectors/track (higher rotation speed)
- # tracks/platter (shorter seek separation)
- # bits/area (vertical, i.e., perpendicular, recording)

Components of disk performance are:
- seek time: disk arm movement to the correct cylinder
- rotational delay (latency): wait for the correct sector to rotate under the head
Taken together, data access time is determined by:
- Bandwidth (bytes transferred/unit time): buffer to disk, buffer to host
- Buffer size
Disk drives come in various speeds and sizes optimized for various applications
Disk drive       | Application                                   | Sizes       | RPM      | Cache | Buffer to host | Notes, street price
WD Caviar Blue   | Standard, internal desktop                    | 80GB-1TB    | 7200     | 32MB  | SATA 6Gb/s     | 1TB ~$105
WD Caviar Black  | Maximum speed, internal desktop               | 500GB-2TB   | 7200     | 64MB  | SATA 6Gb/s     | 2TB ~$210
WD Caviar Green  | Maximum capacity, low power, internal desktop | 320GB-3TB   | variable | 64MB  | SATA 3Gb/s     | 3TB ~$200
WD VelociRaptor  | Internal, enterprise server                   | 150-600GB   | 10000    | 32MB  | SATA 6Gb/s     | 600GB ~$270
WD Scorpio Blue  | Standard, internal laptop                     | 80GB-1TB    | 5200     | 8MB   | SATA 3Gb/s     | 1TB ~$135
WD Scorpio Black | Maximum power, internal laptop                | 160GB-750GB | 7200     | 16MB  | SATA 3Gb/s     | 750GB ~$165

Using Western Digital as a prototypical line.
Disk drive               | Application                      | Sizes     | RPM  | Cache | Buffer to host | Notes, street price
WD AV-25                 | 24/7 surveillance                | 160-500GB | 5400 | 32MB  | SATA 3Gb/s     | MTBF 1 million hours, 500GB ~$90
WD My Book Essential     | External desktop                 | 1-3TB     | -    | -     | USB 3.0 5Gb/s  | 3TB ~$170
WD My Passport Essential | External portable                | 500GB-2TB | -    | -     | USB 3.0 5Gb/s  | 1TB ~$130
WD My Book Live Duo      | Networked personal cloud storage | 4-6TB     | -    | -     | Ethernet       | RAID 1/0 (2 drives in box), 6TB ~$480
Toshiba makes a 240GB, 4200 RPM, 8MB cache disk drive. Why would anyone want to buy this small, slow drive?
What is the limit on the capacity of a disk drive using conventional magnetic media?
- Typical drives are ~250Gb/sq.in.; the Toshiba drive is ~344Gb/sq.in.
- The current limit is ~500Gb/sq.in.
- The theoretical limit is ~1Tb/sq.in.; with any smaller grains, heat will change the magnetization of the bits.

Seagate is researching ways of packing more bits, theoretically up to 50Tb/sq.in.
Disk Scheduling
As with all resources, we can extract the best performance if we schedule disk accesses. Scheduling is now mostly done in the disk controller because:
- The original IDE interface could report at most 16383 cylinders x 16 heads x 63 sectors = 8.4GB. All disks still report this geometry; the EIDE interface was added to find the actual geometry using LBA (logical block addressing).
- Most disks map out defective sectors to spare ones.
- The number of sectors/track is not constant; there are about 40% more sectors on outer tracks than on inner tracks.
OS generally just makes requests to the controller. The controller has a queue and a scheduling algorithm to choose which request is serviced next.
The algorithms are straightforward and have similar properties to other scheduling algorithms that we have studied.
OS's are now more concerned with disk management. I.e., how to make a disk usable to users.
Formatting
Low-level, physical formatting is done at the factory, but OS can do this, too.
File system formatting:
- Create a partition table that groups cylinders into a virtual disk. Tools include fdisk, sfdisk, and PartitionMagic.
- Create the file system. In Unix, mkfs allocates inodes (index blocks).
- Create swap space.
Boot Block
How does a computer find the OS to boot? Cannot require that it be in a particular location on a particular disk, since we can choose between more than one.
A bootstrap loader is a program that loads OS's. It could be stored entirely in ROM, but then it would be hard to change. Usually a very small loader is stored in ROM that knows where the full loader program is in the boot block (aka the MBR, master boot record). Example loaders include grub, lilo, and the Windows loader.
Boot loaders know how to initialize the CPU and bring up the file system. They are configured to know where the OS program code resides; e.g., grub knows it is in the file system, usually in /boot.
Boot loader loads the kernel into memory, then jumps to the first instruction of the OS. Then the OS takes over.
Bad Blocks
All disks have bad areas. The factory initially maps out the blocks that would have been allocated to these areas. (Too many of them causes the disk to be rejected.)
Some disk controllers are "smart" (e.g., SCSI) and automatically remap bad blocks when encountered. Spare sectors are reserved on each cylinder for this.
Other controllers rely on the OS to inform them. E.g., Windows marks FAT entries after a chkdsk scan.
Swap Space
Usage of swap space depends on memory management algorithm and OS.
Some store entire program and data in swap space for duration of execution. Others only store the pages being used.
Swap space issues include:
- file vs. disk partition: usually a raw partition with a dedicated manager, for speed
- single vs. multiple spaces
- location: if single, usually in the center of the disk; multiple spaces only if there are multiple disks
- size: running out means aborting processes, but more real memory means less need to swap
RAID
Disks have gotten physically smaller and much cheaper. Want to combine multiple disks into one system to increase read/write performance and to improve reliability.
Initially, RAID was Redundant Arrays of Inexpensive Disks focusing on providing large amounts of storage cheaply. Now focus is on reliability, so now RAID is Redundant Arrays of Independent Disks.
Reliability is characterized by mean time to failure (MTF). E.g., 100,000 hours for a disk.
For an array of 100 disks, the MTF (i.e., the expected time until some disk fails) is 100,000/100 = 1000 hours = 41.66 days(!). If only one copy of each piece of data is stored, each failure is costly.
To solve this problem, introduce redundancy. I.e., store extra information that can be used to rebuild lost information.
Simplest redundancy is to mirror a disk. I.e., create a duplicate. Every write goes to both disks and a read can go to either one. The only way to lose data is if the second disk fails during the time to repair the first disk.
MTF for the system depends on the MTF of the disks and the mean time to repair (MTR).
If disk failures are independent and MTR is 10 hours, the MTF (i.e., mean time to data loss) is 100,000^2 / (2 * 10) hours = 5x10^8 hours = ~57,000 years(!)
Of course, many failures are not independent. E.g., power failures, natural disasters, manufacturing defects, etc.
Performance is increased through parallelism. E.g., for a mirrored disk, the transfer rate is the same as a single disk, but the overall read rate doubles.
The transfer rate can be improved by striping data across multiple disks. E.g., with 8 disks, we can write one bit of each byte on each disk simultaneously. The number of accesses per unit time is the same, but each access reads 8 times as much data. Striping in larger units, such as blocks, is common.
RAID Levels
Striping does not help with reliability, and mirroring is expensive. Various schemes, called RAID levels, provide both with different tradeoffs.
RAID 0 is simple striping. RAID 1 is simple mirroring. Higher levels are more complicated.