Lecture 15 File Systems xlanchen@06/03/2005
xlanchen@06/03/2005 Understanding the Inside of Windows2000
Information Processing Laboratory, Department of Computer Science
Contents
Windows 2000 File System Formats
NTFS Design Goals and Features
File System Driver Architecture
NTFS File System Driver
NTFS On-Disk Structure
Windows 2000 File System Formats
CDFS (CD-ROM File System)
1988, read-only formatting standard for CD-ROM media
UDF
FAT12, FAT16, and FAT32
NTFS
the native file system format of Windows 2000
FAT series (12, 16, 32)
FAT format organization
Example: file allocation table
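The slide's allocation-table figure did not survive extraction. As a minimal sketch of the idea, assuming a directory entry records a file's first cluster and each table entry names the file's next cluster, a chain can be followed like this. The table contents and the `EOC` marker are invented for illustration.

```python
# Toy file allocation table: key = cluster number, value = next cluster
# in the same file's chain. All values here are invented for illustration.
EOC = -1  # stand-in end-of-chain marker (real FAT formats use reserved values)

fat = {2: 3, 3: 4, 4: EOC, 5: 6, 6: 8, 8: EOC}


def cluster_chain(table, first_cluster):
    """Follow the table links from first_cluster until the end-of-chain marker."""
    chain = []
    cluster = first_cluster
    while cluster != EOC:
        chain.append(cluster)
        cluster = table[cluster]
    return chain


print(cluster_chain(fat, 2))  # clusters of a contiguous file
print(cluster_chain(fat, 5))  # clusters of a fragmented file
```

The second file skips cluster 7, which shows how the table lets a file occupy non-adjacent clusters.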
NTFS Design Goals
Recoverability
Security
Data Redundancy
Fault Tolerance
NTFS Features
Multiple data streams
Unicode-based names
General indexing facility
Dynamic bad-cluster remapping
Hard links and junctions
Compression and sparse files
Change logging
Per-user volume quotas
Link tracking
Encryption
POSIX support
Defragmentation
NTFS files
NTFS file = set of ($attribute-name, data) pairs
Standard attributes include: FileName, flags, data, MS-DOS name, ACL, and others
Filename up to 255 characters
Values (data) held in MFT entry if possible
Otherwise: data attribute = set of (start-VCN, start-LCN, #clusters)
Allows sequence of VCNs to be discovered
Provides VCN -> LCN -> cluster mapping
"Attribute list" attribute added if MFT entry too small
Points to (first) overflow MFT record for mappings
NTFS Directories
An index of filenames
Index blocks organized as a balanced (B+) tree
Tree pointer gives (VCN,LCN) of next block
Index entries contain:
File Reference Number, plus
Size, timestamp, etc. (for directory browsing; saves reading the MFT record to find file attributes)
NT4 indexes on filename only
NT5 indexes on other file attributes also
File System Driver (FSD) Architecture
FSDs manage file system formats
Kernel mode
Two different types of FSD (Windows 2000)
Local FSDs
Remote FSDs
Local FSDs
A local FSD must register with the I/O manager
Local FSDs include:
Ntfs.sys, Fastfat.sys, Udfs.sys, Cdfs.sys, and the Raw FSD (integrated in Ntoskrnl.exe).
Local FSDs and related OS concepts
Boot sector
I/O manager
volume parameter block (VPB)
Storage device object and file system device object
Cache manager
Remote FSDs
Remote FSDs consist of two components:
a client and a server
Client-side remote FSD (Windows 2000: LANMan Redirector)
allows applications to access remote files and directories
Accepts I/O request & translates into network commands
Server-side FSD (Windows 2000: LANMan Server)
Listens for and fulfills the commands
EXPERIMENT
Viewing the List of Registered File Systems
File system operation
Two ways
Directly, through file I/O functions
Indirectly, through file mapping
An FSD can be invoked through several paths
Explicit file I/O
From the memory manager's modified page writer
Indirectly from the cache manager's lazy writer
Indirectly from the cache manager's read-ahead thread
From the memory manager's page fault handler
Components involved in file system I/O
NTFS FSD
Components of the Windows 2000 I/O system
Layered drivers
NTFS and related components
Log File Service (LFS)
NTFS provides file system recoverability by means of a transaction-processing technique called logging
LFS is a series of kernel-mode routines inside the NTFS driver
NTFS On-Disk Structure
Volumes
Clusters
Directories
The storage of actual file data and attribute information
NTFS data compression
Volumes
A logical partition on a disk
A disk can have one volume or several
An NTFS volume stores all file system data as ordinary files, such as
bitmaps, directories, and the system bootstrap
Clusters
Cluster size established when formatting
Also called cluster factor
2^n sectors
Cluster vs. sector
NTFS is independent of physical sector sizes
LCN, logical cluster numbers
the numbering of all clusters of the volume
VCN, virtual cluster numbers
Number the clusters belonging to a particular file
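A small worked example may help connect cluster size and VCNs. This is only a sketch: the 512-byte sector size and the exponent n = 3 are assumed values, and `cluster_size` and `vcn_of_offset` are hypothetical helper names.

```python
SECTOR_SIZE = 512   # assumed physical sector size in bytes
N = 3               # assumed exponent: a cluster is 2**N sectors


def cluster_size(sector_size=SECTOR_SIZE, n=N):
    """Cluster size in bytes for a cluster made of 2**n sectors."""
    return sector_size * (2 ** n)


def vcn_of_offset(byte_offset, size=None):
    """VCN of the cluster that holds byte_offset within a file."""
    if size is None:
        size = cluster_size()
    return byte_offset // size


print(cluster_size())          # bytes per cluster under the assumptions above
print(vcn_of_offset(10000))    # which cluster of the file holds byte 10000
```

Because the file system works in clusters, the same arithmetic holds whatever the underlying sector size is; only `cluster_size` changes.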
Master File Table (MFT)
All data is contained in files, including metadata
Metadata
the data structures used to locate and retrieve files,
the bootstrap data,
the bitmap that records the allocation state of the entire volume
Easy to locate and maintain
Each can be protected by a security descriptor
Master File Table (MFT)
The MFT is the heart of the NTFS volume structure
Implemented as an array of file records
File records have a fixed size of 1 KB
Logically, contains one record for each file including the MFT itself.
Metadata files (with name prefixed with “$”)
$Mft
Mounting a volume with the MFT
Find the physical disk address of the MFT from the boot sector
Find information inside the MFT's own file record
Open more metadata files
Perform the file system recovery operation
Open the remaining metadata files
Other metadata files
log file ($LogFile)
root directory ("\")
bitmap file ($Bitmap)
security file ($Secure)
boot file ($Boot)
bad-cluster file ($BadClus)
extensions ($Extend), a metadata directory
object identifier file ($ObjId), the quota file ($Quota), the change journal file ($UsnJrnl), and the reparse point file ($Reparse).
File record vs. File
Normally, 1:1
May be n:1 (several file records for one file)
If a file has a large number of attributes
or becomes highly fragmented
The first one is called the base file record
It stores the locations of the others
The others are extended file records
File Reference Numbers
A file on an NTFS volume is identified by a 64-bit value called a file reference
File number: index of the file's file record in the MFT
Sequence number: incremented each time an MFT file record position is reused
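The two fields can be illustrated with simple bit packing. This sketch assumes the usual split of the 64 bits into a 48-bit file number in the low bits and a 16-bit sequence number in the high bits; the slides state only the 64-bit total, and the function names are hypothetical.

```python
FILE_NUMBER_BITS = 48   # assumed width of the file number (low bits)
SEQUENCE_BITS = 16      # assumed width of the sequence number (high bits)
FILE_NUMBER_MASK = (1 << FILE_NUMBER_BITS) - 1


def make_file_reference(file_number, sequence_number):
    """Pack a file number and a sequence number into one 64-bit integer."""
    if not 0 <= file_number <= FILE_NUMBER_MASK:
        raise ValueError("file number does not fit in 48 bits")
    if not 0 <= sequence_number < (1 << SEQUENCE_BITS):
        raise ValueError("sequence number does not fit in 16 bits")
    return (sequence_number << FILE_NUMBER_BITS) | file_number


def split_file_reference(reference):
    """Return (file_number, sequence_number) from a packed file reference."""
    return reference & FILE_NUMBER_MASK, reference >> FILE_NUMBER_BITS


ref = make_file_reference(5, 1)
print(hex(ref))
print(split_file_reference(ref))
```

A stale reference can be detected by comparing its sequence number with the one currently stored for that MFT position: if the record has been reused since, the numbers differ.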
File Records
File, a collection of attribute/value pairs
Filename
time stamp information
unnamed data attribute
additional named data attributes
Attribute | Attribute Name | Description
Volume information | $VOLUME_INFORMATION, $VOLUME_NAME | These attributes are present only in the $Volume metadata file. They store volume version and label information.
Standard information | $STANDARD_INFORMATION | File attributes such as read-only, archive, and so on; time stamps, including when the file was created or last modified; and how many directories point to the file (its hard link count).
Filename | $FILE_NAME | The file's name in Unicode characters. A file can have multiple filename attributes, as it does when a hard link to a file exists or when a file with a long name has an automatically generated "short name" for access by MS-DOS and 16-bit Microsoft Windows applications.
Security descriptor | $SECURITY_DESCRIPTOR | This attribute is present for backward compatibility with previous versions of NTFS. The Windows 2000 version of NTFS stores all security descriptors in the $Secure metadata file, sharing descriptors among files and directories that have the same settings. Previous versions of NTFS stored private security descriptor information with each file and directory.
Data | $DATA | The contents of the file. In NTFS, a file has one default unnamed data attribute and can have additional named data attributes; that is, a file can have multiple data streams. A directory has no default data attribute but can have optional named data attributes.
Index root, index allocation, and index bitmap | $INDEX_ROOT, $INDEX_ALLOCATION, $BITMAP | Three attributes used to implement filename allocation and bitmap indexes for large directories (directories only).
Attribute | Attribute Name | Description
Attribute list | $ATTRIBUTE_LIST | A list of the attributes that make up the file and the file reference of the MFT file record in which each attribute is located. This seldom-used attribute is present when a file requires more than one MFT file record.
Object ID | $OBJECT_ID | A 64-byte identifier for a file or directory, with the lowest 16 bytes (128 bits) unique to the volume. The link-tracking service assigns object IDs to shell shortcut and OLE link source files. NTFS provides APIs so that files and directories can be opened with their object ID rather than their filename.
Reparse information | $REPARSE_POINT | This attribute stores a file's reparse point data. NTFS junctions and mount points include this attribute.
Extended attributes | $EA, $EA_INFORMATION | Extended attributes aren't actively used but are provided for backward compatibility with OS/2 applications.
EFS information | $LOGGED_UTILITY_STREAM | EFS stores data in this attribute that's used to manage a file's encryption, such as the encrypted version of the key needed to decrypt the file and a list of users that are authorized to access the file. The word logged is in the attribute's name because changes to this attribute are recorded in the volume log file (described later in this chapter) for recoverability.
Attribute streams
Each file attribute is stored as a separate stream of bytes within a file
Streams are the unit of file operation
create, delete, read and write
Attribute type code
The file attributes in an MFT record are ordered by type codes
Attribute
Type code: value: optional name
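The ordering rule can be sketched with a sort keyed on type code. The numeric codes below are the commonly documented values and are not given on the slides, so treat them as an assumption; the tuple layout and the sample file are invented for illustration.

```python
# Commonly documented type codes for a few attributes (assumed here; the
# slides state only that attributes are ordered by type code).
ATTRIBUTE_TYPE_CODES = {
    "$STANDARD_INFORMATION": 0x10,
    "$FILE_NAME": 0x30,
    "$DATA": 0x80,
    "$INDEX_ROOT": 0x90,
}


def order_attributes(attributes):
    """Sort (type_name, optional_name, value) tuples by attribute type code.

    sorted() is stable, so attributes with the same type code (for example
    several data streams) keep their relative order.
    """
    return sorted(attributes, key=lambda attr: ATTRIBUTE_TYPE_CODES[attr[0]])


# Hypothetical file with an unnamed data stream and one named stream.
attributes = [
    ("$DATA", None, b"hello"),
    ("$FILE_NAME", None, "notes.txt"),
    ("$DATA", "summary", b"hi"),
    ("$STANDARD_INFORMATION", None, {"read_only": False}),
]

for type_name, stream_name, _value in order_attributes(attributes):
    print(type_name, stream_name)
```

Each tuple mirrors the slide's "type code: value: optional name" shape, with the optional name used only for the named data stream.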
Filenames
NTFS: up to 255 characters
Resident and Nonresident Attributes
resident attribute
the value of an attribute is stored directly in the MFT
Example:
Several attributes are defined as always being resident
The standard information
Index root attributes
Resident attribute header and value
Example: filename attribute
Needs only one disk access
MFT file record for a small directory
If a particular attribute is too large to fit in the MFT file record
Clusters outside the MFT are allocated; each such area is called a run
If the value size grows, more runs are allocated
Such attributes are called nonresident attributes
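As a rough sketch of the resident/nonresident decision, assume the only question is whether the value fits in the space left in a 1 KB file record. Real NTFS also accounts for attribute headers and alignment, which this toy model ignores; the 4096-byte cluster size and both function names are assumptions.

```python
FILE_RECORD_SIZE = 1024   # fixed-size MFT file record (1 KB)


def placement(value_size, bytes_already_used):
    """Return 'resident' if the value fits in the remaining record space."""
    free = FILE_RECORD_SIZE - bytes_already_used
    return "resident" if value_size <= free else "nonresident"


def clusters_needed(value_size, cluster_size=4096):
    """Clusters a nonresident value occupies (ceiling division)."""
    return -(-value_size // cluster_size)


print(placement(300, 400))    # small value, stays inside the file record
print(placement(5000, 400))   # too large, moved out to a run of clusters
print(clusters_needed(5000))  # clusters allocated outside the MFT
```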
Resident or nonresident
Determined by the file system
Location transparent to the process
Example:
MFT file record for a large file with two data runs
A large directory can also have nonresident attributes
Example: MFT file record for a large directory with a nonresident filename index
Keeping track of the runs
VCN-to-LCN mapping pairs
VCN
LCN
Example:
VCNs & LCNs for a nonresident data attribute
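The figure for this slide is not reproduced, so here is a minimal sketch of how mapping pairs resolve a VCN. Each run is written as (start VCN, start LCN, number of clusters); the two runs and their LCN values are illustrative, and `vcn_to_lcn` is a hypothetical helper.

```python
# Mapping pairs for a nonresident data attribute with two runs.
# The LCN values are illustrative, not read from a real volume.
runs = [
    (0, 1355, 4),   # VCNs 0-3 live at LCNs 1355-1358
    (4, 1588, 4),   # VCNs 4-7 live at LCNs 1588-1591
]


def vcn_to_lcn(mapping, vcn):
    """Translate a file-relative VCN into a volume-relative LCN."""
    for start_vcn, start_lcn, count in mapping:
        if start_vcn <= vcn < start_vcn + count:
            return start_lcn + (vcn - start_vcn)
    return None  # the VCN is not covered by any run


print(vcn_to_lcn(runs, 3))
print(vcn_to_lcn(runs, 5))
```

The VCNs are contiguous from the file's point of view even though the two runs sit in different places on the volume.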
Indexing
A file directory is simply an index of filenames
Example:
Filename index for a volume's root directory
Index root attribute
For large directories:
Index buffer + B+ tree
The index root attribute contains the first level of the B+ tree
File index entry:
The file reference in the MFT
Time stamp
File size information
The index allocation attribute
For index buffer runs: VCN-to-LCN mappings
Data Compression and Sparse Files
NTFS supports compression
per-file, per-directory, or per-volume
Related functions
GetVolumeInformation
GetCompressedFileSize
DeviceIoControl
Compressing Sparse Data
Sparse data is often large but contains only a small amount of nonzero data relative to its size
Runs of a noncompressed file and the related MFT record
Compression technique
to remove long strings of zeros from the file
NTFS allocates space only for runs that contain nonzero data
So certain ranges of the file's VCNs have no disk allocations
Example
Runs of a compressed file containing sparse data
Holes: VCNs 16-31 and 64-127
The related MFT record
Read & write operation
For a hole: reads return zeros; writes allocate clusters and then write the data
No mapping for VCNs 16-31 and 64-127
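The read and write behavior for holes can be sketched with a toy model. Everything concrete here is an assumption made for illustration: the 4-byte cluster size, the LCN values, the dictionary standing in for the disk, and the counter-based allocator. Only the hole ranges (VCNs 16-31 and 64-127) come from the slides.

```python
CLUSTER_SIZE = 4  # tiny assumed cluster size so values stay readable

# Each run is (start_vcn, start_lcn, cluster_count). The LCNs are invented;
# VCNs 16-31 and 64-127 have no run, so they are holes.
runs = [(0, 1355, 16), (32, 1588, 32), (128, 324, 16)]

disk = {}                        # stand-in disk: LCN -> bytes of that cluster
allocator = {"next_lcn": 9000}   # hypothetical allocator state


def lookup(vcn):
    """Return the LCN backing vcn, or None if vcn falls in a hole."""
    for start_vcn, start_lcn, count in runs:
        if start_vcn <= vcn < start_vcn + count:
            return start_lcn + (vcn - start_vcn)
    return None


def read_cluster(vcn):
    """Reads from a hole return zeros without touching the disk."""
    lcn = lookup(vcn)
    if lcn is None:
        return bytes(CLUSTER_SIZE)
    return disk.get(lcn, bytes(CLUSTER_SIZE))


def write_cluster(vcn, data):
    """Writes to a hole first allocate a cluster, then store the data."""
    lcn = lookup(vcn)
    if lcn is None:
        lcn = allocator["next_lcn"]
        allocator["next_lcn"] += 1
        runs.append((vcn, lcn, 1))
        runs.sort()
    disk[lcn] = data


print(read_cluster(20))      # VCN 20 is in a hole
write_cluster(20, b"abcd")   # allocates a cluster for VCN 20
print(read_cluster(20))
```

The process reading the file never sees the difference between a hole and a cluster full of zeros; only the run list does.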
Compressing Nonsparse Data
Nonsparse data can also be compressed
Individual files or a whole directory
Compression technique
Compression units are 16 clusters long
A unit is stored compressed only if
compression saves at least 1 cluster of storage
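The "save at least one cluster" rule can be sketched as follows. Note the swapped-in technique: `zlib` is used purely as a stand-in compressor because it ships with Python, and NTFS uses its own algorithm. The 4096-byte cluster size and the helper names are also assumptions.

```python
import os
import zlib

CLUSTER_SIZE = 4096   # assumed cluster size in bytes
UNIT_CLUSTERS = 16    # compression unit length in clusters


def clusters_for(nbytes):
    """Number of whole clusters needed to hold nbytes (ceiling division)."""
    return -(-nbytes // CLUSTER_SIZE)


def store_unit(unit):
    """Return (stored_bytes, is_compressed) for one 16-cluster unit.

    The unit is kept compressed only if that saves at least one cluster;
    otherwise the original bytes are stored unchanged.
    """
    if len(unit) != CLUSTER_SIZE * UNIT_CLUSTERS:
        raise ValueError("expected exactly one compression unit")
    packed = zlib.compress(unit)  # stand-in for the real NTFS algorithm
    if clusters_for(len(packed)) <= UNIT_CLUSTERS - 1:
        return packed, True
    return unit, False


unit_size = CLUSTER_SIZE * UNIT_CLUSTERS
for label, unit in [("repetitive", b"A" * unit_size),
                    ("random", os.urandom(unit_size))]:
    stored, compressed = store_unit(unit)
    print(label, compressed, clusters_for(len(stored)))
```

Highly repetitive data shrinks to far fewer clusters, while random data does not compress and is therefore written out as a full, uncompressed unit.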
Example
Data runs of a compressed file
Actual storage
For a compressed file
Runs must start at a virtual 16-cluster boundary
Read/write unit: the compression unit (why 16 clusters?)
Example: the related MFT record