Chapter 12
File Management Systems
Introduction to Computer Science 2
Chapter Goals
Describe the components and functions of a file management system
Compare the logical and physical organization of files and directories
Explain how secondary storage locations are allocated to files and describe the data structures used to record those allocations
Chapter Goals (continued)
Describe file manipulation operations, including open, close, read, delete, and undelete operations
List access controls that can be applied to files and directories
Describe security, backup, recovery, and fault tolerance methods and procedures
Compare and contrast storage area networks and network-attached storage
File Management Systems
Collection of system software that manages all aspects of user and program access to secondary storage
Usually part of the operating system
Translates operations into commands to physical storage devices
Implemented in four layers (command layer, file control, storage I/O control, and secondary storage devices)
Bridges between logical and physical views of secondary storage
Allocates secondary storage locations to individual files and directories
Includes software modules: device drivers for each storage device or device controller, interrupt handlers, and buffer and cache managers
Logical and Physical Storage Views
Logical view: Collection of files organized within directories and storage volumes
Physical view: Collection of physical storage locations organized as a linear address space
The file is subdivided into multiple records and each record is composed of multiple fields.
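A minimal sketch of the file/record/field structure, assuming a hypothetical fixed-length record layout (a 4-byte ID, a 20-byte name, and an 8-byte balance). The layout and the names `RECORD`, `pack_records`, and `read_record` are illustrative only and not part of any particular FMS.

```python
import struct

# Hypothetical record layout: 4-byte unsigned ID, 20-byte name, 8-byte float balance.
RECORD = struct.Struct("<I20sd")

def pack_records(rows):
    """Concatenate fixed-length records into the byte content of one file."""
    return b"".join(
        RECORD.pack(ident, name.encode("ascii"), balance)
        for ident, name, balance in rows
    )

def read_record(data, n):
    """Return the fields of the n-th record (0-based)."""
    ident, name, balance = RECORD.unpack_from(data, n * RECORD.size)
    return ident, name.rstrip(b"\0").decode("ascii"), balance

data = pack_records([(1, "Ada", 10.5), (2, "Grace", 99.0)])
print(read_record(data, 1))
```

Because every record has the same size, the position of record n is simply n times the record size.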
File Content and Type
An FMS supports a limited number of file types: executable programs, operating system commands, and textual or unformatted binary data
Modern FMSs can define new file types and install utility programs to manipulate them (file association)
File Types
Normally declared when a file is created, and either stored within a directory or declared through a filename convention
Determine: physical organization of data items and data structures within secondary storage; operations that may be performed upon the file; filename restrictions
Directory Content and Structure
Contain information about files and other directories, typically name, file type, location, size, ownership, access controls, and time stamps
Hierarchical Directory Structure
Directories can contain other directories, creating a tree structure, but cannot be contained within more than one parent
Access paths can be specified in two ways: complete path (fully qualified reference) or relative path
Each storage device has a root directory
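A minimal sketch of how a relative path might be turned into a complete path, assuming POSIX-style `/` separators and an absolute working directory. The function name `resolve` is hypothetical; real FMSs handle this internally.

```python
from pathlib import PurePosixPath

def resolve(working_dir, path):
    """Return the complete path for `path`, interpreting a relative path
    against the active (working) directory."""
    p = PurePosixPath(path)
    full = p if p.is_absolute() else PurePosixPath(working_dir) / p
    # Collapse "." and ".." components by hand; no file system access is needed.
    parts = []
    for part in full.parts[1:]:      # skip the root "/"
        if part == "..":
            if parts:
                parts.pop()
        elif part != ".":
            parts.append(part)
    return "/" + "/".join(parts)

print(resolve("/home/ada", "docs/ch12.txt"))
print(resolve("/home/ada", "../bob/notes.txt"))
```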
Active (working) directory
Graph Directory Structure
More flexible than hierarchical directory structure
Files and subdirectories can be contained within multiple directories
Directory links can form a cycle
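Because links can form a cycle, software that walks a graph directory structure has to guard against looping forever. A minimal sketch of cycle detection by depth-first search, assuming the structure is given as a hypothetical dictionary `links` that maps each directory to the directories it contains or links to:

```python
def has_cycle(links):
    """Return True if following directory links can lead back to a directory
    that is still being visited."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited, in progress, finished
    color = {}

    def visit(directory):
        color[directory] = GRAY
        for child in links.get(directory, []):
            state = color.get(child, WHITE)
            if state == GRAY:        # reached a directory already on the current path
                return True
            if state == WHITE and visit(child):
                return True
        color[directory] = BLACK
        return False

    return any(color.get(d, WHITE) == WHITE and visit(d) for d in links)

shared = {"/": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
cyclic = {"/": ["a", "b"], "a": ["c"], "b": [], "c": ["a"]}
print(has_cycle(shared), has_cycle(cyclic))
```

A directory shared by two parents (`shared`) is not a cycle; only a link back to an ancestor (`cyclic`) is.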
Storage Allocation
Secondary storage devices have a large number of storage locations and a low frequency of allocation changes
Storage is divided into allocation units
Allocation Units
Smallest number of secondary storage bytes that can be allocated to a file; cannot be smaller than unit of data transfer between storage device and controller (block)
Assigned by the FMS as files and directories are created or expanded; reclaimed as they shrink or are deleted
Size difficult to change once set
Allocation Unit Size
Tradeoffs: efficient use of secondary storage space for files, size of storage allocation data structures, and efficiency of storage allocation procedures
Smaller units: More efficient use of storage space
Larger units: Allow smaller storage allocation data structures
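The tradeoff can be made concrete with a little arithmetic. The sketch below uses made-up file sizes and counts, for several unit sizes, how many allocation units must be tracked and how many allocated bytes go unused; the function name `allocation_cost` and the sample sizes are assumptions for illustration.

```python
def allocation_cost(file_sizes, unit_size):
    """Return (units_allocated, wasted_bytes) for the given allocation unit size."""
    # Each file occupies a whole number of units, so round up per file.
    units = sum(-(-size // unit_size) for size in file_sizes)
    wasted = units * unit_size - sum(file_sizes)
    return units, wasted

sizes = [100, 5000, 70000]           # hypothetical file sizes in bytes
for unit in (512, 4096, 65536):
    units, wasted = allocation_cost(sizes, unit)
    print(f"unit={unit:6d}  units to track={units:4d}  wasted bytes={wasted}")
```

Smaller units waste less space but leave more units to record in the allocation data structures; larger units do the opposite.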
Storage Allocation Tables
Data structures that record which allocation units are free and which belong to files
Format and content vary across FMSs
Can contain linked lists in simpler FMSs or indices or other complex data structures in more complex FMSs
Free allocation units are assigned to a hidden system file called SysFree.
All of a file’s allocation units are “chained” together in sequential order by a series of pointers.
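A minimal sketch of a linked-list storage allocation table in which free units belong to a hidden file named SysFree and each file's units are chained by pointers. The class `AllocationTable`, its methods, and the `END` marker are hypothetical names; real table formats vary across FMSs.

```python
END = -1  # marks the last unit in a chain

class AllocationTable:
    def __init__(self, n_units):
        # next_unit[i] is the unit that follows unit i in its chain.
        self.next_unit = [i + 1 for i in range(n_units)]
        self.next_unit[-1] = END
        # first[name] is the first unit of each file; all units start in SysFree.
        self.first = {"SysFree": 0}

    def chain(self, name):
        """Follow the pointers and list a file's units in order."""
        units, unit = [], self.first.get(name, END)
        while unit != END:
            units.append(unit)
            unit = self.next_unit[unit]
        return units

    def allocate(self, name, count):
        """Move `count` units from the head of SysFree to the end of `name`."""
        free = self.chain("SysFree")
        if not 1 <= count <= len(free):
            raise ValueError("cannot allocate that many units")
        taken, rest = free[:count], free[count:]
        self.first["SysFree"] = rest[0] if rest else END
        self.next_unit[taken[-1]] = END          # end the new piece of chain
        existing = self.chain(name)
        if existing:
            self.next_unit[existing[-1]] = taken[0]   # link onto the file's tail
        else:
            self.first[name] = taken[0]

table = AllocationTable(6)
table.allocate("a.txt", 2)
table.allocate("b.txt", 1)
table.allocate("a.txt", 1)
print(table.chain("a.txt"), table.chain("b.txt"), table.chain("SysFree"))
```

Note that a file's units need not be adjacent: after the second allocation, `a.txt` skips over the unit held by `b.txt`.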
Blocking
Logical record grouping within physical records
Described by a numeric ratio of logical records to physical records (blocking factor)
Blocking factor = 4:3
Blocking factor = 2:3
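A small worked example of the two blocking factors above, assuming a hypothetical physical record size of 512 bytes and logical records packed back to back with no gaps. The function name `physical_span` is illustrative.

```python
from fractions import Fraction

def physical_span(k, blocking_factor, physical_size):
    """Return the indices of the physical records that hold logical record k (0-based).

    blocking_factor is the ratio of logical records to physical records.
    """
    logical_size = physical_size / blocking_factor   # bytes per logical record
    start = k * logical_size
    end = start + logical_size                       # one past the last byte
    first = start // physical_size
    last = (end - 1) // physical_size
    return list(range(first, last + 1))

# Blocking factor 4:3 -> each logical record is 384 bytes.
print([physical_span(k, Fraction(4, 3), 512) for k in range(4)])
# Blocking factor 2:3 -> each logical record is 768 bytes.
print([physical_span(k, Fraction(2, 3), 512) for k in range(2)])
```

When the ratio is not a whole number, some logical records straddle two physical records, so reading one logical record can require two physical reads.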
Buffering
Temporary storage of data as it moves between programs and secondary storage devices
Physical records are stored in the buffer as they are read from secondary storage
FMS extracts logical records from buffers and copies them to data area of the application program
Each buffer is the size of one allocation unit
Improves I/O performance if enough buffers are used
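A minimal sketch of buffered reading, assuming the storage device is modeled as a list of equal-sized allocation units and that the oldest buffer is reused first. The class `BufferedReader` and its replacement policy are assumptions for illustration, not a description of any specific FMS.

```python
class BufferedReader:
    def __init__(self, device_units, n_buffers):
        self.device = device_units   # list of bytes objects, one per allocation unit
        self.n_buffers = n_buffers
        self.buffers = {}            # unit index -> content, in insertion order
        self.device_reads = 0        # how often the "device" was actually read

    def _unit(self, i):
        if i not in self.buffers:
            if len(self.buffers) >= self.n_buffers:
                # Reuse the oldest buffer (first in, first out).
                self.buffers.pop(next(iter(self.buffers)))
            self.buffers[i] = self.device[i]
            self.device_reads += 1
        return self.buffers[i]

    def read_record(self, k, record_size):
        """Extract logical record k from the buffered physical records."""
        unit_size = len(self.device[0])
        start, end = k * record_size, (k + 1) * record_size
        first, last = start // unit_size, (end - 1) // unit_size
        data = b"".join(self._unit(i) for i in range(first, last + 1))
        offset = start % unit_size
        return data[offset:offset + record_size]

device = [bytes(range(0, 8)), bytes(range(8, 16))]
reader = BufferedReader(device, n_buffers=1)
reader.read_record(0, 4)
reader.read_record(1, 4)             # served from the buffer, no new device read
print(reader.device_reads)
```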
File Manipulation
Exact set of service layer functions varies among FMSs, but typically includes create, copy, move, delete, read, and write
Application programs interact directly with FMS through OS service layer
Users interact indirectly with FMS through command layer
File Open and Close Operations
File open: Causes FMS to find the file, verify access privileges, allocate buffers, and update internal table of open files
File close: Causes FMS to flush buffer content to the storage device, release buffers, update file time stamps, and update table of open files
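The open and close steps can be sketched as follows. This is a toy model under stated assumptions: the directory is a plain dictionary, access privileges are a set of user names, and the class `FileManager` and its fields are hypothetical.

```python
import time

class FileManager:
    def __init__(self, directory):
        # directory: name -> {"users": set of permitted users, "data": bytearray}
        self.directory = directory
        self.open_files = {}         # internal table of open files: handle -> state
        self.next_handle = 0

    def open(self, name, user):
        entry = self.directory.get(name)             # find the file
        if entry is None:
            raise FileNotFoundError(name)
        if user not in entry["users"]:               # verify access privileges
            raise PermissionError(name)
        handle = self.next_handle
        self.next_handle += 1
        # Allocate a buffer and record the file in the table of open files.
        self.open_files[handle] = {"name": name, "buffer": bytearray()}
        return handle

    def write(self, handle, data):
        self.open_files[handle]["buffer"] += data    # held in the buffer until close

    def close(self, handle):
        state = self.open_files.pop(handle)          # update table of open files
        entry = self.directory[state["name"]]
        entry["data"] += state["buffer"]             # flush buffer content to "storage"
        entry["modified"] = time.time()              # update the file time stamp

fm = FileManager({"a.txt": {"users": {"ada"}, "data": bytearray(b"hi")}})
h = fm.open("a.txt", "ada")
fm.write(h, b"!")
fm.close(h)
print(bytes(fm.directory["a.txt"]["data"]))
```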
Delete and Undelete Operations
Delete: Does not immediately remove files; some content remains on secondary storage until all allocation units have been reassigned and overwritten
File content can be visible to intruders
Undelete: Can be used to reconstruct directory and storage allocation table contents
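A minimal sketch of why deleted content lingers and when undelete can succeed. It assumes a toy volume in which delete only marks the directory entry and returns the units to the free set; the class `Volume` and its fields are hypothetical.

```python
class Volume:
    def __init__(self, n_units):
        self.units = [b""] * n_units     # content of each allocation unit
        self.directory = {}              # name -> {"units": [...], "deleted": bool}
        self.free = set(range(n_units))

    def create(self, name, chunks):
        taken = sorted(self.free)[:len(chunks)]
        if len(taken) < len(chunks):
            raise OSError("volume full")
        for unit, chunk in zip(taken, chunks):
            self.units[unit] = chunk     # overwrites whatever was there before
        self.free -= set(taken)
        self.directory[name] = {"units": taken, "deleted": False}

    def delete(self, name):
        entry = self.directory[name]
        entry["deleted"] = True
        self.free |= set(entry["units"]) # units become reusable; content is not erased

    def undelete(self, name):
        entry = self.directory[name]
        if not entry["deleted"] or not set(entry["units"]) <= self.free:
            return False                 # not deleted, or units already reassigned
        self.free -= set(entry["units"])
        entry["deleted"] = False
        return True

v = Volume(4)
v.create("secret", [b"pa", b"ss"])
v.delete("secret")
print(v.units[0], v.undelete("secret"))
```

The content of a deleted file stays readable in its units until a later `create` overwrites them, which is the exposure the slide warns about.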
Access Controls
Granted by file owners and system administrators for reading, writing, and executing files
Provide security at the expense of additional FMS overhead
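A minimal sketch of per-file access controls with read, write, and execute permissions. The bit values, the class `FileACL`, and the rule that the owner starts with all permissions are assumptions for illustration.

```python
READ, WRITE, EXECUTE = 4, 2, 1

class FileACL:
    def __init__(self, owner, admins=()):
        self.owner = owner
        self.admins = set(admins)
        # Assumption: the owner starts with every permission.
        self.bits = {owner: READ | WRITE | EXECUTE}

    def grant(self, granter, user, bits):
        """Only the file owner or a system administrator may grant access."""
        if granter != self.owner and granter not in self.admins:
            raise PermissionError(f"{granter} may not grant access")
        self.bits[user] = self.bits.get(user, 0) | bits

    def allows(self, user, requested):
        """True only if every requested permission bit has been granted."""
        return (self.bits.get(user, 0) & requested) == requested

acl = FileACL("ada", admins={"root"})
acl.grant("ada", "bob", READ)
print(acl.allows("bob", READ), acl.allows("bob", READ | WRITE))
```

Every file operation has to run a check like `allows`, which is the overhead mentioned above.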
File Migration, Backup, and Recovery
Provided by most FMSs to protect files against damage or loss
File Migration (Version Control)
Automatic storage and backup of old file versions
Balances storage cost of each file version with anticipated user demand for that version
Original
Copy that has been updated to reflect new data
File Backup
Protects against data loss (file content, directory content, and storage allocation tables)
Store backup copies on a different storage device in a different physical location
Manual or automatic
Full or incremental
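The difference between full and incremental backup comes down to which files are selected. A minimal sketch, assuming each file has a last-modified time stamp and that an incremental backup copies only files changed since the previous backup; the function name `files_to_back_up` and the sample times are hypothetical.

```python
def files_to_back_up(modified_times, last_backup_time=None):
    """Select files for backup.

    modified_times maps file name -> last-modified time.
    With no previous backup time, every file is selected (full backup);
    otherwise only files modified after it are selected (incremental backup).
    """
    if last_backup_time is None:
        return sorted(modified_times)
    return sorted(
        name for name, modified in modified_times.items()
        if modified > last_backup_time
    )

times = {"a.txt": 100, "b.txt": 250, "c.txt": 300}
print(files_to_back_up(times))
print(files_to_back_up(times, last_backup_time=200))
```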
Transaction Logging
Automatically records all changes to file content and attributes in a separate storage area; also writes them to the file’s I/O buffer
Provides high degree of protection against data loss due to program or hardware failure
Imposes a performance penalty; used only when costs of data loss are high
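A minimal sketch of the idea, assuming the log survives a failure while the I/O buffer does not. The class `LoggedFile`, the `crash` method that simulates a failure, and the dictionary model of file content are all hypothetical simplifications.

```python
class LoggedFile:
    def __init__(self):
        self.stored = {}     # content on the storage device: offset -> value
        self.buffer = {}     # changes waiting in the I/O buffer
        self.log = []        # separate storage area, assumed to survive a failure

    def write(self, offset, value):
        self.log.append((offset, value))     # record the change in the log
        self.buffer[offset] = value          # and place it in the I/O buffer

    def flush(self):
        self.stored.update(self.buffer)
        self.buffer.clear()
        self.log.clear()                     # flushed changes need no replay

    def crash(self):
        self.buffer.clear()                  # buffered changes are lost

    def recover(self):
        for offset, value in self.log:       # replay logged changes in order
            self.stored[offset] = value
        self.log.clear()

f = LoggedFile()
f.write(0, "a")
f.flush()
f.write(1, "b")
f.crash()
f.recover()
print(f.stored)
```

The extra log write on every change is the performance penalty noted above.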
File Recovery
Automated and manual components
Can search backup logs for copies of lost or damaged files
Can perform consistency checking and repair procedures for crashed system or physically damaged storage device
Fault Tolerance
Methods of securing file content against hardware failure: file backup, recovery, transaction logging, mirroring, and RAID (Redundant Array of Inexpensive Disks)
Mirroring
All disk write operations are made concurrently to two different storage devices
Provides high degree of protection against data loss with no performance penalty if implemented in hardware
Disadvantages: cost of redundant disk drives and higher cost of disk controllers that implement mirroring
RAID
Disk storage technique that improves performance and fault tolerance
All levels except RAID 1 use data striping, which breaks a unit of data into smaller segments and stores them on multiple disks
Multiple levels can be layered to combine their best features (e.g., RAID 10)
Can be implemented in hardware or software
Data striping: Each segment is written in parallel to a separate disk.
If the parity disk fails, the other disks still retain their original data bits.
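The converse case, a failed data disk, is where parity earns its keep. A minimal sketch assuming byte-wise XOR parity across equal-length segments; the function names `parity` and `rebuild` and the sample bytes are illustrative.

```python
from functools import reduce

def parity(segments):
    """XOR equal-length byte segments together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*segments))

def rebuild(surviving_segments, parity_segment):
    """Recover the one missing data segment from the survivors plus parity."""
    return parity(surviving_segments + [parity_segment])

data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]   # segments on three data disks
p = parity(data)                                  # stored on the parity disk

# Suppose the second data disk fails: XOR of the rest restores its segment.
print(rebuild([data[0], data[2]], p) == data[1])
```

This works because XOR-ing a value twice cancels it out, so combining the survivors with the parity leaves only the missing segment.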
RAID 10: Mirrors individual disks (RAID 1), then stripes data (RAID 0) across multiple mirrored pairs.
Storage Consolidation
Overcomes inefficiencies of direct-attached storage (DAS) in multiple-server environments
Common approaches: storage area network (SAN) and network-attached storage (NAS)
Storage Consolidation (continued)
Storage Area Network (SAN)
High-speed interconnection among general-purpose servers and one or more storage servers
Block-oriented access
Common in multi-server environments with mainframes or supercomputers and substantial overlap among server storage needs
Expensive to purchase and administer, but avoids costs of duplicate storage and storage administration
Network-Attached Storage (NAS)
Dedicated to managing one or more file systems
Accessed by other servers and clients over a local or wide area network
File-oriented access
Common when geographically dispersed servers need access to a common file system
Cheaper to acquire than SAN, but at the price of lower performance
Summary
File management systems
Directory content and structure
Storage allocation
File manipulation
Access controls
File migration, backup, and recovery
Storage consolidation