Disks and Files Vivek Pai Princeton University

Dec 21, 2015


Disks and Files

Vivek Pai

Princeton University


Gedankyou

Imagine the following: a disk scheduling policy says “handle the request that is closest to where the disk head currently is”

On a system with lots of disk-intensive jobs, what problem can arise?

What tweaks can avoid this problem?
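A minimal sketch of the thought experiment, assuming a toy model in which one request is served per step and new requests keep arriving near the head. The function names, cylinder numbers, and arrival pattern are hypothetical and not from the slides. `sstf_serve` applies the closest-request policy; `scan_serve` is one possible tweak, an elevator-style sweep that keeps moving in one direction until nothing remains ahead.

```python
def sstf_serve(head, pending, arrivals):
    """Closest-request-first. arrivals[i] lists cylinders requested before step i.

    Returns (served order, requests still waiting).
    """
    pending = list(pending)
    served = []
    for new in arrivals:
        pending.extend(new)
        if not pending:
            continue
        nearest = min(pending, key=lambda c: abs(c - head))
        pending.remove(nearest)
        served.append(nearest)
        head = nearest
    return served, pending


def scan_serve(head, pending, arrivals):
    """Elevator-style sweep: keep moving one way, reverse only when nothing is ahead."""
    pending = list(pending)
    served = []
    direction = 1  # +1 means toward higher cylinder numbers
    for new in arrivals:
        pending.extend(new)
        if not pending:
            continue
        ahead = [c for c in pending if (c - head) * direction >= 0]
        if not ahead:
            direction = -direction
            ahead = pending
        nxt = min(ahead, key=lambda c: abs(c - head))
        pending.remove(nxt)
        served.append(nxt)
        head = nxt
    return served, pending


# Hypothetical workload: one far request, then a steady stream near cylinder 50.
ARRIVALS = [[48], [52], [49], [51]] * 5
```

With this workload and the head starting at cylinder 50, the request for cylinder 190 is still waiting after every step under `sstf_serve`, while `scan_serve` reaches it on its first sweep.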


Why Files

Physical reality
  Block oriented
  Physical sector #s
  No protection among users of the system
  Data might be corrupted if machine crashes

Filesystem model
  Byte oriented
  Named files
  Users protected from each other
  Robust to machine failures


File Structures

Byte sequence
  Read or write a number of bytes
  Unstructured or linear

Record sequence
  Fixed or variable length
  Read or write a number of records

Tree
  Records with keys
  Read, insert, delete a record (typically using B-tree)


File Structures Today

Stream of bytes
  Simplest to implement in kernel
  Easy to manipulate in other forms
  Little performance loss

More complicated structures
  Hardware assist fell out of favor
  Special-purpose hardware slower, costly


File Types

ASCII – plain text

A Unix executable file
  Header: magic number, sizes, entry point, flags
  Text (code)
  Data
  Relocation bits
  Symbol table

Devices

Everything else in the system


So What Makes Filesystems Hard?

Files grow and shrink in pieces
Little a priori knowledge
6 orders of magnitude in file sizes
Overcoming disk performance behavior
Desire for efficiency
Coping with failure


File System Components

Disk management
  Arrange collection of disk blocks into files

Naming
  User gives file name, not track or sector number, to locate data

Security
  Keep information secure

Reliability/durability
  When system crashes, lose stuff in memory, but want files to be durable

Figure (layers, top to bottom): User, File naming, File access, Disk management, Disk drivers


Some Definitions

File descriptor (fd) – an integer used to represent a file – easier than using names

Metadata – data about data: bookkeeping data used to eventually access the “real” data

Open file table – system-wide list of descriptors in use
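A minimal sketch of what a file descriptor looks like from user code, using Python's `os` module, which exposes the Unix-style calls. The scratch file name is hypothetical. The name is used once, at open time; every later call uses only the integer.

```python
import os
import tempfile


def demo_fd():
    """Open a scratch file by name once, then use only the integer descriptor."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "example.txt")  # hypothetical file name
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.write(fd, b"hello")
            os.lseek(fd, 0, os.SEEK_SET)  # rewind the offset kept for this open file
            data = os.read(fd, 5)
        finally:
            os.close(fd)
    return fd, data
```

Calling `demo_fd()` returns the small integer the kernel handed out together with the bytes read back through it.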


Kinds of Metadata

inode – index node, or a specific set of information kept about each file
  Two forms – on disk and in memory

Directory – names and location information for files and subdirectories
  Note: stored in files in Unix

Superblock – contains information to describe the file system, disk layout

Information about free blocks/inodes on disk


Contents of an Inode

Disk inode:
  File type, size, blocks on disk
  Owner, group, permissions (r/w/x)
  Reference count
  Times: creation, last access, last mod
  Inode generation number
  Padding & other stuff

128 bytes on classic Unix
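Several of these fields can be inspected from user space. The sketch below uses Python's `os.stat`, which reports what the running system keeps per file; it is not the on-disk layout of any particular filesystem, and the dictionary keys and scratch file name are hypothetical.

```python
import os
import stat
import tempfile


def inode_summary(path):
    """Collect a few inode fields that os.stat exposes for `path`."""
    st = os.stat(path)
    return {
        "inode_number": st.st_ino,
        "is_regular_file": stat.S_ISREG(st.st_mode),
        "permissions": stat.filemode(st.st_mode),  # string form such as '-rw-------'
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "link_count": st.st_nlink,                 # the reference count
        "size_bytes": st.st_size,
        "last_access": st.st_atime,
        "last_modification": st.st_mtime,
    }


def demo_inode():
    """Create a 3-byte scratch file and summarize its inode."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "example.txt")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"abc")
        return inode_summary(path)
```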


Directories in Unix

Stored like regular files
  Contents are file names and inode #s
  Names are nul-terminated strings

Logic
  Separates file from location in tree
  File can appear in multiple places

What are the drawbacks?
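A small sketch of "file can appear in multiple places", assuming a Unix-like system whose scratch directory supports hard links. The file names are hypothetical. Two directory entries end up naming the same inode, and the inode's reference count rises to 2.

```python
import os
import tempfile


def demo_hard_link():
    """Give one file two names and report what the two directory entries share."""
    with tempfile.TemporaryDirectory() as scratch:
        first = os.path.join(scratch, "a.txt")   # hypothetical names
        second = os.path.join(scratch, "b.txt")
        with open(first, "wb") as f:
            f.write(b"shared")
        os.link(first, second)                   # second directory entry, same inode
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        links = os.stat(first).st_nlink
        with open(second, "rb") as f:
            contents = f.read()
    return same_inode, links, contents
```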


Effects of Corruption

inode – file gets “damaged”
  Maybe some “free” block gets viewed

Directory – “lose” files/directories
  Might get to read deleted files

Superblock – can’t figure out anything
  This is why we replicate the superblock


Data Structures for A Typical File System

Figure: Process control block (open file pointer array) → Open file table (system-wide) → Memory inode → Disk inode


Opening A File

File name lookup and authentication
Copy the file metadata into the in-memory data structure, if it is not already there
Create an entry in the open file table (system wide) if there isn’t one
Create an entry in PCB
Link up the data structures
Return a pointer to user

Figure: fd = open( FileName, access ): file name lookup & authenticate against the file system on disk, then allocate & link up data structures (PCB, open file table, metadata)


Reading And Writing

What happens when you…
  read 10 bytes from a file?
  write 10 bytes into an existing file?
  write 4096 bytes into a file?

Disk works on blocks (sectors)
  Can have temporary (ephemeral) buffers
  Longer lasting buffers = disk cache
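One way to reason about these questions is a toy model in which the disk transfers only whole blocks. The sketch below is an assumption-laden illustration: the `ToyDisk` class, the 4096-byte block size, and `write_bytes` are all hypothetical, and there is no cache. A small write has to read the block, patch it, and write it back; a write that covers a whole aligned block can skip the read.

```python
BLOCK_SIZE = 4096  # assumed block size


class ToyDisk:
    """Hypothetical disk that only transfers whole blocks and counts transfers."""

    def __init__(self, nblocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(nblocks)]
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return self.blocks[n]

    def write_block(self, n, data):
        assert len(data) == BLOCK_SIZE
        self.writes += 1
        self.blocks[n] = bytes(data)


def write_bytes(disk, offset, data):
    """Write `data` at byte `offset`, touching the disk only in whole blocks."""
    pos = 0
    while pos < len(data):
        block_no, start = divmod(offset + pos, BLOCK_SIZE)
        chunk = data[pos:pos + BLOCK_SIZE - start]
        if len(chunk) == BLOCK_SIZE:
            new_block = chunk                  # full overwrite: no read needed
        else:
            old = disk.read_block(block_no)    # read-modify-write
            new_block = old[:start] + chunk + old[start + len(chunk):]
        disk.write_block(block_no, new_block)
        pos += len(chunk)
```

In this model a 10-byte write costs one block read and one block write, an aligned 4096-byte write costs a single block write, and an unaligned 4096-byte write touches two blocks.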


Reading A Block

read( fd, userBuf, size )
  PCB → Open file table → Metadata
  Logical → physical

read( device, phyBlock, size )
  Get physical block to sysBuf, copy to userBuf
  Buffer cache → Disk device driver


A Disk Layout for A File System

Superblock defines a file system
  size of the file system
  size of the file descriptor area
  free list pointer, or pointer to bitmap
  location of the file descriptor of the root directory
  other meta-data such as permission and various times

For reliability, replicate the superblock

Figure (disk layout): Boot block | Superblock | File metadata (i-node in Unix) | File data blocks


File Usage Patterns

How do users access files?
  Sequential: bytes read in order
  Random: read/write element out of middle of arrays
  Whole file or partial file

How are files used?
  Most files are small
  Large files use up most of the disk space
  Large files account for most of the bytes transferred

Bad news
  Need everything to be efficient


Data Structures for Disk Management

A “header” for each file (part of the file meta-data)
  Disk sectors associated with each file

A data structure to represent free space on disk
  Bit map
    1 bit per block (sector)
    blocks numbered in cylinder-major order, why?
  Linked list
  Others?

How much space does a bit map need for a 4G disk?
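The answer depends on the block size, which the slide does not give. A short worked calculation, assuming "4G" means 4 GiB and trying two common block sizes (both sizes are assumptions):

```python
GIB = 2 ** 30


def bitmap_bytes(disk_bytes, block_bytes):
    """Size in bytes of a free-space bitmap with one bit per block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8  # round up to whole bytes
```

Under these assumptions, 512-byte sectors give 2^23 bits, so `bitmap_bytes(4 * GIB, 512)` is 1 MiB; 4 KiB blocks give 2^20 bits, so `bitmap_bytes(4 * GIB, 4096)` is 128 KiB.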


Linked Files (Alto)

File header points to 1st block on disk
Each block points to next

Pros
  Can grow files dynamically
  Free list is similar to a file

Cons
  random access: horrible
  unreliable: losing a block means losing the rest

Figure: File header → block → . . . → null


Contiguous Allocation

Request in advance for the size of the file
Search bit map or linked list to locate a space

File header
  first sector in file
  number of sectors

Pros
  Fast sequential access
  Easy random access

Cons
  External fragmentation
  Hard to grow files


Single-Level Indexed Files or Extent-based Filesystems

A user declares max size
A file header holds an array of pointers to point to disk blocks

Pros
  Can grow up to a limit
  Random access is fast

Cons
  Clumsy to grow beyond limit
  Periodic cleanup of new files
  Up-front declaration a real pain

Figure: File header → Disk blocks


File Allocation Table (FAT) Approach

A section of disk for each partition is reserved
One entry for each block
A file is a linked list of blocks
A directory entry points to the 1st block of the file

Pros
  Simple

Cons
  Always go to FAT
  Wasting space

Figure: directory entry “foo” points to block 217; the FAT shows entries 0, 217, 399, and 619, with the chain ending in EOF
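Following a file through the FAT amounts to walking a linked list stored in a table. A minimal sketch, assuming the figure's chain for “foo” runs 217 → 619 → 399 → EOF (the exact order is read off the flattened figure and is an assumption); the `EOF` marker value and the dictionary representation of the table are hypothetical.

```python
EOF = -1  # hypothetical end-of-chain marker


def fat_chain(fat, first_block):
    """Return the list of blocks of a file, starting from its directory entry."""
    chain = []
    seen = set()
    block = first_block
    while block != EOF:
        if block in seen:
            raise ValueError("cycle in FAT chain")  # a corrupted table
        seen.add(block)
        chain.append(block)
        block = fat[block]  # each FAT entry names the next block of the file
    return chain


# Assumed contents of the FAT entries shown in the figure.
FAT = {217: 619, 619: 399, 399: EOF}
```

Here `fat_chain(FAT, 217)` visits one table entry per block, which is why random access into a long file means many FAT lookups.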


Multi-Level Indexed Files (Unix)

13 Pointers in a header
  10 direct pointers
  11: 1-level indirect
  12: 2-level indirect
  13: 3-level indirect

Pros & Cons
  In favor of small files
  Can grow
  Limit is 16G and lots of seeks

What happens to reach block 23, 5, 340?

Figure: header pointers 1, 2, ... point directly to data blocks; pointers 11, 12, and 13 lead through one, two, and three levels of index blocks to data blocks
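The block lookup can be sketched as follows. The slide does not state the block or pointer size; the sketch assumes 1 KiB blocks and 4-byte pointers (256 pointers per index block), which is one combination that yields roughly the 16G limit quoted above, and it assumes zero-based logical block numbers. The function names are hypothetical.

```python
N_DIRECT = 10         # direct pointers in the header (from the slide)
BLOCK_BYTES = 1024    # assumed block size
PTRS_PER_BLOCK = 256  # assumed: 1 KiB block / 4-byte pointers


def locate(block_no):
    """Map a logical block number to (indirection level, index path).

    Level 0 is a direct pointer; levels 1 to 3 go through that many index blocks.
    """
    if block_no < N_DIRECT:
        return 0, [block_no]
    rest = block_no - N_DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level  # data blocks reachable at this level
        if rest < span:
            path = []
            for _ in range(level):
                path.append(rest % PTRS_PER_BLOCK)
                rest //= PTRS_PER_BLOCK
            return level, path[::-1]    # outermost index block first
        rest -= span
    raise ValueError("block number beyond the maximum file size")


def max_file_bytes():
    """Largest file the 13 pointers can address under the assumed sizes."""
    blocks = N_DIRECT + sum(PTRS_PER_BLOCK ** level for level in (1, 2, 3))
    return blocks * BLOCK_BYTES
```

Under these assumptions, block 5 is reached through a direct pointer, block 23 through the single indirect block, and block 340 through the double indirect block. Each extra level is one more index block to read before the data block, which is where the extra seeks come from.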


Reliability In Disk Systems

Make sure certain actions have occurred before function completes
  Known as “synchronous” operation
  Ex: make sure new inode is on disk & that the directory has been modified before declaring a file creation is complete

Drawback: speed
  Some ops easily asynchronous: access time
  Some filesystems don’t care: Linux ext2fs
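From user space, an application can ask for the same guarantee explicitly. The sketch below assumes a POSIX-like system where a directory can be opened read-only and passed to `fsync`; the helper names are hypothetical. It forces the new file's data and then the directory entry to stable storage before reporting success.

```python
import os
import tempfile


def create_durably(directory, name, data):
    """Create a file and return only after file and directory are flushed."""
    path = os.path.join(directory, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, data)
        os.fsync(fd)          # file contents and inode reach the disk
    finally:
        os.close(fd)
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)      # the new directory entry reaches the disk
    finally:
        os.close(dir_fd)
    return path


def demo_durable_create():
    """Create a scratch file durably and read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        path = create_durably(scratch, "new.txt", b"abc")  # hypothetical name
        with open(path, "rb") as f:
            return f.read()
```

Each `fsync` call blocks until the device reports completion, which is the speed cost mentioned above.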


Recovery After Failure

Need to ensure consistency
  Does free bitmap match tree walk?
  Do reference counts in inodes match directory entries?
  Do blocks appear in multiple inodes?

The time this kind of recovery takes grows with disk size
Clean shutdown – mark as such, no recovery
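The three checks can be sketched over toy in-memory structures. This is a simplified illustration, not how any real checker is implemented: the dictionary layout for inodes, the list of directory entries, and the set of free blocks are all hypothetical stand-ins for on-disk metadata.

```python
from collections import Counter


def check_consistency(inodes, directory_entries, free_blocks, total_blocks):
    """Return a list of problems found.

    inodes: {inode_no: {"links": int, "blocks": [block_no, ...]}}
    directory_entries: list of (name, inode_no) gathered by walking the tree
    free_blocks: set of block numbers the free bitmap marks as free
    """
    problems = []
    usage = Counter(b for ino in inodes.values() for b in ino["blocks"])
    for block, count in sorted(usage.items()):
        if count > 1:
            problems.append(f"block {block} is referenced {count} times")
        if block in free_blocks:
            problems.append(f"block {block} is in use but marked free")
    for block in range(total_blocks):
        if block not in usage and block not in free_blocks:
            problems.append(f"block {block} is neither used nor free")
    refs = Counter(ino_no for _, ino_no in directory_entries)
    for ino_no, ino in sorted(inodes.items()):
        if ino["links"] != refs[ino_no]:
            problems.append(
                f"inode {ino_no} says {ino['links']} links, "
                f"found {refs[ino_no]} directory entries"
            )
    return problems
```

Every block and every inode is visited, so the work is proportional to the size of the disk rather than to the amount of damage.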


Reducing Synchronous Times

Write to a faster storage
  Nonvolatile memory – expensive, requires some additional OS/firmware support

Write to a special disk or section – logging
  Only have to examine log when recovering
  Eventually have to put information in place
  Some information dies in the log itself

Write in a special order
  Write metadata in a way that is consistent but possibly recovers less


Challenges

Unix filesystem has great flexibility
Extent-based filesystems have speed
Seeks kill performance – locality
Bitmaps show contiguous free space
Linked lists easy to search
How do you perform backup/restore?


A Quick XOR Overview

XOR = eXclusive OR
  a XOR a = 0
  a XOR 0 = a
  a XOR b = b XOR a
  (a XOR b) XOR c = a XOR (b XOR c)

In other words, count the 1 bits: even = 0, odd = 1


More Fun With XOR

Result = XOR (a1, a2, a3, a4,…)
a2 goes bad
Can we reconstruct a2?
  a2 = XOR (a1, result, a3, a4,…)

What does this imply for disks?
What kinds of failures does it handle?
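The reconstruction can be tried on byte strings standing in for disk blocks. A minimal sketch with hypothetical helper names and made-up block contents; it assumes all blocks have the same length and that we know which block was lost.

```python
def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def reconstruct(survivors, parity):
    """Rebuild the one missing block from the surviving blocks and the parity."""
    return xor_blocks(list(survivors) + [parity])
```

For example, with `data = [b"disk", b"file", b"raid"]` and `parity = xor_blocks(data)`, calling `reconstruct([data[0], data[2]], parity)` gives back `data[1]`. The scheme relies on knowing which single block is missing; it cannot by itself tell which block is wrong if one silently returns bad data.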


Bigger, Faster, Stronger

Making individual disks larger is hard

Throw more disks at the problem
  Capacity increases
  Effective access speed may increase
  Probability of failure also increases

Use some disks to provide redundancy
  Generally assume a fail-stop model
  Fail-stop versus Byzantine failures
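The claim that failure probability rises with disk count can be made concrete under a simplifying assumption that disks fail independently with the same probability over some period. The function name and the 3% figure in the example are hypothetical.

```python
def p_any_failure(p_single, n_disks):
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p_single) ** n_disks
```

With a hypothetical 3% chance per disk, ten disks give `p_any_failure(0.03, 10)`, a little over 26%, which is why adding disks without redundancy makes the array less reliable than any single disk.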


RAID (Redundant Array of Inexpensive Disks)

Main idea
  Store the error correcting codes on other disks
  General error correcting codes are too powerful
  Use XORs or single parity
  Upon any failure, one can recover the entire block from the spare disk (or any disk) using XORs

Pros
  Reliability
  High bandwidth

Cons
  The controller is complex

Figure: RAID controller with XOR


Synopsis of RAID Levels

RAID Level 0: Non redundant (JBOD)
RAID Level 1: Mirroring
RAID Level 2: Byte-interleaved, ECC
RAID Level 3: Byte-interleaved, parity
RAID Level 4: Block-interleaved, parity
RAID Level 5: Block-interleaved, distributed parity
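The difference between the last two levels is where the parity block of each stripe lives. A small sketch, assuming a dedicated last disk for the non-distributed case and one hypothetical rotation order for the distributed case; real implementations use several different rotation layouts, and the function name is made up.

```python
def parity_disk(stripe, n_disks, distributed=True):
    """Index of the disk holding parity for `stripe`.

    distributed=False: parity always on the last disk (dedicated parity disk).
    distributed=True: parity rotates across disks, one possible rotation order.
    """
    if not distributed:
        return n_disks - 1
    return (n_disks - 1 - stripe) % n_disks
```

With four disks, the dedicated layout sends every parity update to disk 3, while the rotating layout spreads parity for stripes 0, 1, 2, 3 over disks 3, 2, 1, 0, so no single disk absorbs all parity writes.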


Did RAID Work?

Performance: yes
Reliability: yes
Cost: no
  Controller design complicated
  Fewer economies of scale
  High-reliability environments don’t care

Now also software implementations


RAID’s Real Benefit

Partly addresses the failure problem
  Backup/restore less of an issue
  Failed disk “rebuilt” at sector level
  Lower performance during rebuild, but system still on-line

Still not perfect
  Geographic problems
  Failure during rebuild
