CS3530 OPERATING SYSTEMS Summer 2014 File Management Chapter 8

Dec 28, 2015


Kerry Sparks

CS3530 OPERATING SYSTEMS

Summer 2014

File Management
Chapter 8

Purpose of File Management

Data should be organized in a convenient and efficient manner. In particular, users should be able to:
• Store data into files
• Find and use files that have previously been created

File System

A set of OS Services that provides Files and Directories for user applications

Files

A file is simply a sequence of bytes that has been stored on some storage device of the computer system.

Files

The bytes contain the data stored in the file, such as:
• A text file containing just the characters we are interested in
• A word processing document file that also contains data about how to format the text
• A database, which contains data organized in multiple files

In general, the File Management system does not have any knowledge about how the data in a file is organized. That is the responsibility of the application programs that create and use the file.

Permanent Storage Devices

• Disk Drives
• Flash Memory (Memory stick)
• CDs and DVDs
• Magnetic tape drives

File Attributes

Name: symbolic (human-readable) label of the file

Type: executable file, text print file, binary file, etc.

Location: the physical address of the file on disk

Size: in MB, GB, or TB

Other File Attributes

Protection: permissions for who can read, write, and execute the file, etc.

Time, date: when the file was created, modified, and accessed
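The attributes listed above can be inspected from a program. The following is a minimal sketch, assuming Python's standard `os.stat` call; the helper name `describe` and the temporary demo file are illustrative only, not part of the course material.

```python
import os
import stat
import tempfile
import time

def describe(path):
    """Return a dict of common attributes for the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "permissions": stat.filemode(info.st_mode),  # e.g. '-rw-------'
        "modified": time.ctime(info.st_mtime),
        "accessed": time.ctime(info.st_atime),
    }

# Demo on a throwaway file so the sketch does not depend on any existing path.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

attributes = describe(demo_path)
print(attributes)
os.remove(demo_path)
```

Note that the physical location on disk is not exposed here; that attribute is normally managed by the file system itself.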

Folders

Name of folder

Typically, a folder may contain files and other folders (commonly called sub-folders or sub-directories).

This results in a tree structure of folders and files.

[Figure: Folder/Directory Tree Structure]

Pathnames

The pathname of a file specifies the sequence of folders one must traverse to travel down the tree to the file.

Strictly speaking, this describes the absolute path of the file: the sequence of folders one must traverse from the root of the tree to the desired file.

A relative path describes the sequence of folders one must traverse starting from some intermediate folder, such as the current working directory, rather than from the root.
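As a small illustration of the difference, the sketch below uses Python's `pathlib`. The path `/home/alice/projects/report.txt` and the starting folder are made-up examples.

```python
from pathlib import PurePosixPath

absolute = PurePosixPath("/home/alice/projects/report.txt")  # hypothetical path
start = PurePosixPath("/home/alice")                          # hypothetical starting folder

# Folders traversed from the root down to the file.
folders_from_root = absolute.parts[1:-1]

# The same file, described relative to the starting folder.
relative = absolute.relative_to(start)

print(folders_from_root)
print(relative)
print(start / relative == absolute)
```

Joining the starting folder with the relative path gives back the absolute path, which is how a relative path is resolved.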

File Links

Allow a directory entry to point to a file (or entry) that is not directly below it in the tree structure
• Unix: Symbolic Link
• Windows: Shortcut
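A symbolic link can be created and followed from a program. This is a rough sketch for a Unix-like system, using Python's `os.symlink`; the file names are hypothetical, and on Windows the call may require extra privileges.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.txt")      # hypothetical file name
link = os.path.join(workdir, "shortcut.txt")    # hypothetical link name

with open(target, "w") as f:
    f.write("original contents")

os.symlink(target, link)          # the link entry stores the path of the target

with open(link) as f:             # opening the link follows it to the target
    via_link = f.read()

is_link = os.path.islink(link)
points_to = os.readlink(link)
print(is_link, points_to, via_link)
```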

[Figure: Link in Directory Tree Structure]

Access Methods

An access method describes the manner and mechanisms by which a process accesses the data in a file.

There are two common access methods:
• Sequential
• Random (or Direct)

File Operations

When a process needs to use a file, there are a number of operations it can perform:

• Open
• Close
• Read
• Write

Create File

• Allocate space for file
• Make entry for file in the Directory

Open File

• Makes file accessible for read/write operations
• Locates file in Directory
• Returns internal ID for the file, commonly called a Handle

handle = open(filename, parameters)

[Figure: File Open]

Write File

System call specifies:
• Handle from Open call
• Location and length of the information to be written
• Possibly, the location in the file where the data is to be written

write(file handle, buffer, length)

Write File

• Uses Handle to locate file on disk
• Uses file’s Write Pointer to determine the position in the file to write to
• Updates file’s Write Pointer

Read File

System call specifies:
• Handle from Open call
• Memory location and length of the information to be read
• Possibly, the location in the file where the data is to be read from

read(file handle, buffer)
read(file handle, buffer, length)

Read File

• Uses Handle to locate file on disk
• Uses file’s Read Pointer to determine the position in the file to read from
• Updates file’s Read Pointer

Close File

• Makes file no longer accessible from application
• Deletes the Handle created by Open
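The open, write, read, and close operations described so far can be tried with Python's low-level `os` functions, which return and accept an integer handle (a file descriptor). This is only a sketch: the file name is made up, and the exact system-call parameters differ between operating systems.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.dat")  # hypothetical file name

# Open (creating the file if needed) and receive a handle.
handle = os.open(path, os.O_RDWR | os.O_CREAT)

# Write: handle plus the bytes to be written; returns how many bytes were written.
written = os.write(handle, b"file management")

# Move the file position back to the start before reading.
os.lseek(handle, 0, os.SEEK_SET)

# Read: handle plus the maximum number of bytes to read.
data = os.read(handle, 4)

# Close: the handle is no longer valid after this call.
os.close(handle)

print(written, data)
```

Here the buffer is supplied (for write) or returned (for read) as a bytes object rather than as a separate memory address and length.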

[Figure: File Close]

Delete File

• Deletes entry for file in Directory
• De-allocates disk space used by the file

Sequential Access

If the process has opened a file for sequential access, the File Management subsystem will keep track of the current file position for reading and writing.

To carry this out, the system will maintain a file pointer that will be the position of the next read or write.

File Pointer

The value of the file pointer will be initialized during Open to one of two possible values:
• Normally, this value will be set to 0 to start the reading or writing at the beginning of the file.
• If the file is being opened to append data to the file, the File Position Pointer will be set to the current size of the file.

After each read or write, the File Position Pointer will be incremented by the amount of data that was read or written.
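The behaviour of the file pointer can be observed with `tell()` in Python. This is a minimal sketch using a made-up temporary file; the `seek` call is included to show how direct (random) access moves the pointer explicitly instead of letting it advance sequentially.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file name
with open(path, "wb") as f:
    f.write(b"0123456789")          # file is now 10 bytes long

with open(path, "rb") as f:
    start_pos = f.tell()            # 0: reading starts at the beginning
    f.read(4)
    after_read = f.tell()           # advanced by the 4 bytes just read
    f.seek(7)                       # direct access: jump to byte 7
    direct = f.read(1)

with open(path, "ab") as f:
    f.write(b"abc")                 # append mode writes at the end of the file
    after_append = f.tell()         # 10 existing bytes + 3 appended

print(start_pos, after_read, direct, after_append)
```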

Stream

A Stream is the flow of data bytes, one byte after another, into the process (for reading) and out of the process (for writing).

This concept applies to Sequential Access and was originally invented for network I/O, but several modern programming languages (e.g. C/C++, Java, C#) have also incorporated it.

Standard I/O

Standard Input: defaults to the keyboard

Standard Output: defaults to the console

I/O Redirection

Standard Input can come from a file:
    app.exe < def.txt

Standard Output can go to a file:
    app.exe > def.txt

Standard Output from one application can be Standard Input for another. This connection is called a Pipe:
    app1.exe | app2.exe

[Figure: A Pipe]

Pipe

A Pipe is a connection that is dynamically established between two processes.

When a process reads data, the data will come from another process rather than a file. Thus, a pipe has a process at one end that is writing to the pipe and another process reading data at the other end of the pipe.

It is often the situation that one process will produce output that another process needs for input.

Pipe and Performance

Using a pipe can improve system performance in two ways:

By not using a file, the applications save time by not using disk I/O.

A pipe has the characteristic that the receiving process can read whatever data has already been written. Thus we do not need to wait until the first process has written all of the data before we start executing the second process.
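A pipe between two processes can be set up programmatically as well as from the shell. The sketch below uses Python's `subprocess` module; the two inline child programs are invented for the example, and no intermediate file is written.

```python
import subprocess
import sys

# Two small child programs, written inline so the sketch is self-contained.
producer_code = "for i in range(5): print(i)"
consumer_code = "import sys; print(sum(int(line) for line in sys.stdin))"

# Equivalent in spirit to: producer | consumer
producer = subprocess.Popen([sys.executable, "-c", producer_code],
                            stdout=subprocess.PIPE)
consumer = subprocess.Popen([sys.executable, "-c", consumer_code],
                            stdin=producer.stdout, stdout=subprocess.PIPE)
producer.stdout.close()   # the consumer now holds the read end of the pipe
output, _ = consumer.communicate()
producer.wait()

total = int(output.decode().strip())
print(total)
```

Both processes are started before either finishes, so the consumer can begin reading as soon as the producer has written something.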

Directory Functions

• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system

Disk Space Allocation

Contiguous
• File is allocated contiguous disk space

Contiguous Allocation

[Figure: Read/Write Disk Address Calculation]
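With contiguous allocation, the disk address for a read or write can be computed from the file's starting block and the byte offset. The sketch below shows the usual arithmetic; the 512-byte block size and the starting block number are assumptions chosen for illustration, not values from the slides.

```python
BLOCK_SIZE = 512  # bytes per disk block (assumed for illustration)

def contiguous_address(start_block, byte_offset, block_size=BLOCK_SIZE):
    """Map a byte offset in a contiguous file to (disk block, offset within block)."""
    return start_block + byte_offset // block_size, byte_offset % block_size

# A hypothetical file whose first block is disk block 100: locate byte 1300.
location = contiguous_address(100, 1300)
print(location)
```

Because no table lookup or chain traversal is needed, this calculation is one reason contiguous allocation is simple to implement.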

Contiguous Allocation

Advantages
• Simple to implement
• Good disk I/O performance

Disadvantages
• Need to know max file size ahead of time
• Probably will waste disk space
• Necessary space may not be available

Disk Space Allocation

Cluster Allocation
• Disk space allocated in blocks
• Space allocated as needed

[Figure: Cluster Allocation]

Cluster Allocation

Advantages
• Tends not to waste disk space

Disadvantages
• Additional overhead to keep track of clusters
• Can cause poor disk I/O performance
• May limit maximum size of File System

Cluster Performance

• Clusters tend to be scattered around the disk
• This is called External Fragmentation
• Can cause poor performance as disk arm needs to move a lot
• Requires De-fragmentation utility

Cluster Performance

• Large clusters can reduce External Fragmentation
• If lots of small files, then space will be wasted inside each cluster
• This is called Internal Fragmentation
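The space lost to internal fragmentation can be estimated with simple arithmetic. The sketch below compares two cluster sizes for a handful of small files; the file sizes and cluster sizes are invented for the example.

```python
def clusters_needed(file_size, cluster_size):
    """Number of whole clusters required to hold file_size bytes."""
    return -(-file_size // cluster_size)  # ceiling division

def wasted_bytes(file_size, cluster_size):
    """Unused bytes in the file's last cluster (internal fragmentation)."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size

file_sizes = [100, 1500, 5000]   # hypothetical small files, in bytes
waste_small = sum(wasted_bytes(s, 1024) for s in file_sizes)
waste_large = sum(wasted_bytes(s, 32768) for s in file_sizes)
print(waste_small, waste_large)
```

With these made-up numbers, the larger cluster size wastes far more space, which is the trade-off against the reduced scattering of clusters.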

Crossing Cluster Boundary

Break logical read into multiple physical reads
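One way to picture this splitting is the sketch below, which divides a logical read (offset and length within the file) into per-cluster pieces. The function name and the 4096-byte cluster size are assumptions for illustration.

```python
def split_read(offset, length, cluster_size):
    """Split a logical read into (cluster index, offset in cluster, byte count) pieces."""
    pieces = []
    while length > 0:
        cluster = offset // cluster_size
        within = offset % cluster_size
        count = min(length, cluster_size - within)  # stop at the cluster boundary
        pieces.append((cluster, within, count))
        offset += count
        length -= count
    return pieces

# A 3000-byte read starting at byte 3000 crosses the boundary at byte 4096.
pieces = split_read(offset=3000, length=3000, cluster_size=4096)
print(pieces)
```

Each piece corresponds to one physical read; the cluster index here is logical (position within the file) and still has to be mapped to a disk cluster.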

Managing Cluster Allocation

Linked
• Each cluster has a pointer to the next cluster

Indexed
• Single table has pointers to each of the clusters
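The difference between the two schemes shows up when looking for the n-th cluster of a file. The following sketch models both with plain Python structures; the cluster numbers are invented.

```python
# Linked: each cluster number maps to the next cluster of the file (None ends the chain).
next_cluster = {7: 2, 2: 9, 9: None}   # hypothetical chain 7 -> 2 -> 9
first_cluster = 7

def nth_cluster_linked(first, n):
    """Follow n links starting from the first cluster."""
    cluster = first
    for _ in range(n):
        cluster = next_cluster[cluster]
    return cluster

# Indexed: one table lists the file's clusters in order.
index_table = [7, 2, 9]

def nth_cluster_indexed(n):
    """Look the cluster up directly in the index."""
    return index_table[n]

print(nth_cluster_linked(first_cluster, 2), nth_cluster_indexed(2))
```

The linked lookup takes n steps, while the indexed lookup takes one, which matters most for random access.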

[Figure: Linked Blocks]

[Figure: Index Block]

Windows File Systems

FAT16 (File Allocation Table)
• MS-DOS, Windows 95
• Max 2 GB space for a file system
• Generally bad disk fragmentation

FAT32
• Windows 98
• Supported by Windows 2000, XP, 2003

NTFS (New Technology File System)

[Figure: Windows FAT Table]

Windows FAT Processing

Disk Space Allocation

Allocate a free cluster

Update FAT

System Failure

Free Space Management

Linked
• Linked list of free clusters

Bit Map
• Special File with a vector of bits
• Each bit corresponds to a cluster
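A bit map for free space can be sketched as a list with one entry per cluster. The class below is a toy model, not how any particular file system stores its bit map; the convention that 1 means free is an assumption made here.

```python
class FreeBitmap:
    """One bit per cluster: 1 means free, 0 means allocated (convention assumed here)."""

    def __init__(self, cluster_count):
        self.bits = [1] * cluster_count

    def allocate(self):
        """Return the number of the first free cluster and mark it allocated."""
        for number, bit in enumerate(self.bits):
            if bit:
                self.bits[number] = 0
                return number
        raise RuntimeError("no free clusters")

    def release(self, number):
        """Mark a cluster as free again."""
        self.bits[number] = 1

bitmap = FreeBitmap(4)
first = bitmap.allocate()
second = bitmap.allocate()
bitmap.release(first)
reused = bitmap.allocate()
print(first, second, reused, bitmap.bits)
```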

[Figure: Linked Free Blocks]

Windows FAT Table

[Figure labels: Directory Entry, FAT, Cluster 1, Cluster 2, Cluster 3, Free List, Free Cluster]

Windows FAT Processing

• Unreliable!
• Need to run Scandisk after reboot to attempt to fix any problems

Windows NTFS File System

• Available on Windows 2000, XP, 2003
• Maintains a transaction log to recover after reboot
• Support for file protection
• Large (64-bit) cluster pointers
  • Allows small clusters
  • Reduces internal fragmentation

[Figure: Windows NTFS File System]

Disk File System Types

• FAT16, FAT32, NTFS, UFS
• Unix Journaling File System
• Windows Encrypting File System
• Network File System (NFS)

Multiple File System Types

• Disk File Systems
• Other File System Organizations
  • CD-ROM
  • DVD
  • Zip Disk

Multiple File System Types

• Each Disk Partition has a single File System
• A given computer can have a number of different File System types
• Modern systems support this capability with a Virtual File System

[Figure: Virtual File System]

Removable Devices

Connecting Media
• Called Mounting the File System
• Can be
  • Physical Media
  • Logical (across the network)

Journaling

A journal is used only when writing to a disk; it acts as a sort of punch clock for all writes.

It addresses the problem of disk corruption when the computer crashes or loses power while data is being written to the hard drive.

Without a journal, the operating system would have no way to know whether the file was completely written to disk.

Journaling

With a journal, the file is first written to the journal (punch-in), and then the journal writes the file to disk when ready.

Once the file has been successfully written to the disk, it is removed from the journal (punch-out), and the operation is complete.

If power is lost while the file is being written to disk, the file system can check the journal for all operations that have not yet been completed and resume where it left off.
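The punch-in and punch-out sequence can be modelled with a toy class. This is only a sketch: `disk` and `journal` are in-memory stand-ins, the `crash_before_disk` flag is an invented way to simulate power loss, and real journaling file systems are considerably more involved.

```python
class JournaledStore:
    """Toy model: `disk` and `journal` are in-memory stand-ins for real storage."""

    def __init__(self):
        self.disk = {}
        self.journal = []          # pending (name, data) records

    def write(self, name, data, crash_before_disk=False):
        self.journal.append((name, data))      # punch in: record the intent first
        if crash_before_disk:
            return                             # simulate losing power here
        self.disk[name] = data                 # write to the main area
        self.journal.remove((name, data))      # punch out: operation complete

    def recover(self):
        """After a restart, replay every operation still recorded in the journal."""
        while self.journal:
            name, data = self.journal.pop(0)
            self.disk[name] = data

store = JournaledStore()
store.write("a.txt", "one")
store.write("b.txt", "two", crash_before_disk=True)
missing_before = "b.txt" not in store.disk
store.recover()
print(missing_before, store.disk, store.journal)
```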

Linux File Systems

Ext stands for Extended file system and was the first file system created specifically for Linux. It has had four revisions, and each one has added fairly significant features.

The first version of Ext was a major upgrade from the Minix file system used at the time, but it lacks major features used in today’s computing.

At this time you probably should not use Ext on any machine due to its limitations and age. It is also no longer supported in many distributions.

Linux File Systems

Ext2 is not a journaling file system, and when introduced was the first to allow for extended file attributes and 2 terabyte drives.

Because Ext2 does not use a journal, it applies significantly fewer writes to the disk. Due to lower write requirements, and hence fewer erases, it is ideal for flash memory, especially USB flash drives.

Modern SSDs have an increased life span and additional features that can negate the need for a non-journaling file system.

Linux File Systems

Ext3 is basically just Ext2 with journaling. The aim of Ext3 was to be backwards compatible with Ext2, and therefore disks can be converted between the two without needing to format the drive.

The problem with keeping compatibility is that many of the limitations of Ext2 still exist in Ext3. The benefit is that most of the testing, bug fixes, and use cases for Ext2 also apply to Ext3, making it stable and fast.

• Use it if you need to upgrade a previous Ext2 file system to have journaling.
• You will probably get the best database performance from Ext3 due to years of optimizations.
• Not the best choice for file servers, because it lacks disk snapshots and recovering deleted files is very difficult.

Linux File Systems

Ext4, just like Ext3 before it, keeps backwards compatibility with its predecessors. As a matter of fact, you can mount Ext2 and Ext3 as an Ext4 file system in Linux, and that alone can increase performance under certain conditions. You can also mount an Ext4 file system as Ext3 without ill effects.

Ext4 reduces file fragmentation, allows for larger volumes and files, and employs delayed allocation, which helps with flash memory life as well as fragmentation. Although it is used in other file systems, delayed allocation has potential for data loss and has come under some scrutiny.

• A better choice for SSDs than Ext3, and it improves on general performance over both previous Ext versions.
• If this is your distro’s default supported file system, you should probably stick with it for any desktop or laptop you set up.
• It also shows promising performance numbers for database servers, but hasn’t been around as long as Ext3.

Unix File System

• Supports protection
• More reliable than Windows FAT system
• Need to run fsck (File System Check) utility on boot-up (similar to Windows Scandisk)

[Figure: Unix File System]

Inodes

In a file system, a file is represented by an inode, a data structure containing information about the actual data that make up the file.

Every partition has its own set of inodes. Each inode describes a data structure on the hard disk, storing the attributes of a file, including the physical location of the file data.

When a hard disk is initialized to accept data storage, usually during the initial system installation process or when adding extra disks to an existing system, a fixed number of inodes per partition is created.

This number is the maximum number of files of all types (including directories, special files, links, etc.) that can exist at the same time on the partition.

The typical count is 1 inode per 2 to 8 kilobytes of storage.

Inodes

When a new file is created, a new inode is created for it. The inode holds the following information:
• Owner and group owner of the file
• File type (regular, directory, ...)
• Permissions on the file
• Date and time of creation, last read, and last change
• Date and time this information was last changed in the inode
• Number of links to this file
• File size
• An address defining the actual location of the file data

The only information not included in an inode is the file name and directory. These are stored in special directory files. By comparing file names and inode numbers, the system can make up a tree structure that the user understands.

Users can display inode numbers using the -i option to ls. The inodes have their own separate space on the disk.
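Inode numbers and link counts can also be read from a program. The sketch below, intended for a Unix-like system, uses Python's `os.stat` and `os.link`; the file names are hypothetical. It creates a second directory entry (a hard link) for one file and checks that both names refer to the same inode.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "notes.txt")   # hypothetical file names
alias = os.path.join(workdir, "alias.txt")

with open(original, "w") as f:
    f.write("inode demo")

os.link(original, alias)            # a second directory entry for the same inode

info_original = os.stat(original)
info_alias = os.stat(alias)

same_inode = info_original.st_ino == info_alias.st_ino
link_count = info_original.st_nlink
print(info_original.st_ino, same_inode, link_count)
```

The file name appears only in the directory entries, while the size, timestamps, and link count come from the shared inode.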