UNIX Files: How UNIX Sees and Uses Files

Feb 25, 2016

Jeff

Transcript
Page 1: UNIX Files

UNIX Files

How UNIX Sees and Uses Files

Page 2: UNIX Files

I/O: UNIX approach

• The basic model of the UNIX I/O system is a sequence of bytes that can be accessed either randomly or sequentially.

• The UNIX kernel uses a single data model, byte stream, to serve all applications. It imposes no structure on the data but instead views it as a stream of bytes. Put another way, everything to the kernel is a stream of bytes – stream I/O.

• As a result an I/O stream from one program can be fed as input to any other program;

• Pipelines can be formed between processes for exchanging data.

• Applications may impose various levels of structure for their data, but the kernel imposes no structure on I/O.

• Example: ASCII text editors process documents consisting of lines of characters, where each line is terminated by an ASCII line-feed character. The kernel knows nothing about this convention.
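The line-feed convention above can be illustrated with a short C sketch. The kernel's read() hands back raw bytes, and it is the application that decides a `'\n'` ends a line. The function name `count_lines` is hypothetical, chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <unistd.h>

/* Count line-feed characters in everything readable from fd.
   The kernel delivers plain bytes; "lines" exist only in this loop.
   Returns the count, or -1 on a read error. */
long count_lines(int fd)
{
    char buf[4096];
    long lines = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    }
    return n < 0 ? -1 : lines;
}
```

Because the function only needs a descriptor, calling it with descriptor 0 would count lines arriving on standard input, whether that input is a regular file, a terminal, or the output of another program in a pipeline.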

Page 3: UNIX Files

What is a file?

• A file is a named, persistent collection of data: a linear array of bytes with at least one name.

• To the kernel, a file's data is an unstructured sequence of bytes, accessed by specifying an offset from the beginning of the file.

• File attributes (metadata) include owner(s), permissions, time stamps, size etc.

• A file exists until all its names are deleted and no process holds a descriptor for it.

• In the kernel I/O devices are accessed as files. These are called special device files.

• UNIX processes pass information to each other using in-memory special files: pipes, FIFOs, sockets.

• User processes access all files as ordinary files: regular files on disk, device special files, or in-memory “pipe” files.

• Terminals, printers, tapes are all accessed as if they were streams of bytes. They have names in the file system and are referred to through their descriptors.
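The lifetime rule above (a file persists while a name or an open descriptor remains) can be sketched in C. This is a minimal illustration, assuming a writable /tmp; the function name `read_after_unlink` and the temporary file prefix are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a file, delete its only name, then keep using it through the
   descriptor. Returns the number of bytes read back, or -1 on error. */
ssize_t read_after_unlink(char *out, size_t outlen)
{
    char path[] = "/tmp/unixfiles-XXXXXX";   /* hypothetical temp name */
    const char msg[] = "still here";
    ssize_t n = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (unlink(path) < 0) {                  /* the last name is gone */
        close(fd);
        return -1;
    }

    /* The data is still reachable: write it, then read at byte offset 0. */
    if (write(fd, msg, strlen(msg)) == (ssize_t)strlen(msg))
        n = pread(fd, out, outlen, 0);

    close(fd);   /* no names and no descriptors left: the file is released */
    return n;
}
```

The pread() call also shows the offset-based access mentioned above: the caller names a byte position, not a record or a line.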

Page 4: UNIX Files

Special (Device) Files

• The kernel can determine to which hardware device a special file refers and uses a resident module called a device driver to communicate with the device.

• Device special files are created by the mknod() system call (by the super-user only)

• To manipulate device parameters ioctl() system call is used;

• Different devices allow different operations through ioctl()

• Devices are divided into two groups:
– Block devices (structured)
– Character devices (unstructured)
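A program can ask which of the two groups a special file belongs to by calling stat() and testing the mode bits. The sketch below is illustrative only; the function name `device_class` and its return codes are assumptions made for this example, loosely modelled on the type letter that ls prints in a long listing.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <sys/stat.h>

/* Return 'b' for a block device, 'c' for a character device,
   '-' for any other kind of file, or '?' if stat() fails. */
char device_class(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return '?';
    if (S_ISBLK(st.st_mode))
        return 'b';
    if (S_ISCHR(st.st_mode))
        return 'c';
    return '-';
}
```

On most systems `device_class("/dev/null")` would be expected to report a character device, while disk partitions under /dev would report block devices.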

Page 5: UNIX Files

Block devices

• Random-access devices (any position in the stream can be reached);

• File type “b” in the “ls -al” listing of the /dev directory

• Internal implementation is based on the notion of block, a minimal group of bytes that can be transferred in one operation to and from the device.

• A number of blocks can be transferred in one operation (for efficiency), but less than one block of data is never transferred.

• To a user application, the block structure of the device is made transparent by buffering done in the kernel. A user process may read or write a single byte because it works with the I/O stream abstraction.

• Examples: tapes, magnetic disks, drums, cd-roms, zip disks, floppy disks, etc.

Page 6: UNIX Files

Character devices

• Sequential-access devices.

• File type “c” in the “ls -al” listing of the /dev directory

• Internal implementation often supports the notion of block transfer.

• In many cases the blocks supported by character devices are very large due to efficiency considerations (e.g., communication interfaces)

• Called character devices because the first such devices were terminals. Examples: mouse, keyboard, display, network interface, printer, etc.

• Sometimes listed as “rxxx”: the raw (character) device corresponding to block device xxx.

Page 7: UNIX Files

Disk devices

• Disk device files are natively defined as block devices. Some systems also support a character device interface to them, the “raw disk”. This feature is going away on some systems.

• File systems, organized collections of files, are always created on block devices and never on character devices.

• A single physical block device can be partitioned into a number of logical devices (partitions). The physical device is usually named hda, hdb (IDE drives 1, 2) or sda, sdb (SCSI drives 1, 2) …

• Each such logical device can have its own file system and is represented by its own special device file: sda1, sda2, sda3, sda4, sda5 …

Page 8: UNIX Files

“In memory” files

• Interprocess communication occurs in memory.

• These files are always sequential and may be unidirectional or bidirectional.

• They are usually limited in size and may be represented by a temporary file in /tmp for user processes.

Page 9: UNIX Files

Pipe

• Like files, pipes are linear arrays of bytes, but they are unidirectional, sequential communication links between related processes (parent/child).

• Transient objects of limited size

• Unnamed pipes have no names in the file system, so open() cannot be used for them.

• Descriptors are obtained from the pipe() system call.

• Data written to a pipe can be read from it only once, and only in the order it was written (FIFO).
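A pipe between a parent and its child can be sketched as follows. This is a minimal illustration, not a complete error-handling pattern; the function name `pipe_from_child` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes msg into a pipe; the parent reads it into out.
   Returns the number of bytes read, or -1 on error. */
ssize_t pipe_from_child(const char *msg, char *out, size_t outlen)
{
    int fds[2];                      /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();              /* the child inherits both descriptors */
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                  /* child: writer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);                   /* parent: reader, so drop the write end */
    ssize_t total = 0, n = 0;
    while ((size_t)total < outlen &&
           (n = read(fds[0], out + total, outlen - (size_t)total)) > 0)
        total += n;                  /* read() returns 0 once the writer closes */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : total;
}
```

Each side closes the end it does not use. If the parent kept its copy of the write end open, its read loop would never see end-of-file.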

Page 10: UNIX Files

FIFO

• A FIFO is a special kind of pipe, called a named pipe.

• FIFOs are identical to unnamed pipes in function, size and use, except that they have normal names, like any other file, and descriptors for them can be obtained through the open() system call;

• Processes that wish to communicate through FIFO in both directions must open one FIFO for each direction.
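The difference from an unnamed pipe, namely that a FIFO has a path and is opened with open(), can be sketched in C. For brevity this example opens both ends in one process, which is only an assumption of the sketch; normally the reader and the writer would be separate processes. The function name `fifo_roundtrip` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a FIFO at path, send msg through it, and read it back.
   Returns the number of bytes read, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg,
                       char *out, size_t outlen)
{
    ssize_t n = -1;

    if (mkfifo(path, 0600) < 0)      /* the FIFO now has an ordinary name */
        return -1;

    /* O_NONBLOCK lets the read end open without waiting for a writer. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    int wfd = (rfd >= 0) ? open(path, O_WRONLY) : -1;

    if (wfd >= 0 && write(wfd, msg, strlen(msg)) == (ssize_t)strlen(msg))
        n = read(rfd, out, outlen);

    if (wfd >= 0)
        close(wfd);
    if (rfd >= 0)
        close(rfd);
    unlink(path);                    /* remove the name again */
    return n;
}
```

Each open() yields a descriptor for one direction only, which is why two-way communication needs a second FIFO.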

Page 11: UNIX Files

Socket

• A socket is a transient object that is used for inter-process communication, usually over a network protocol.

• It exists only as long as some process holds a descriptor on it.

• A descriptor is created through the socket() system call.

• Sequential access; similar to pipes;

• Different types of sockets exist: Local/Remote, RPC/IPC, reliable/unreliable etc.
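A small C sketch can show that socket descriptors are used with the same read() and write() calls as pipes, but in both directions. Note that this example uses socketpair() rather than socket(): it creates two already-connected local sockets, which keeps the sketch free of addresses and network setup. The function name `socketpair_pingpong` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send "ping" one way across a local socket pair and "pong" back.
   Returns 0 if both directions work, -1 otherwise. */
int socketpair_pingpong(void)
{
    int sv[2];
    char buf[8] = {0};
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Unlike a pipe, each descriptor can be both written and read. */
    if (write(sv[0], "ping", 4) == 4 && read(sv[1], buf, 4) == 4 &&
        memcmp(buf, "ping", 4) == 0 &&
        write(sv[1], "pong", 4) == 4 && read(sv[0], buf, 4) == 4 &&
        memcmp(buf, "pong", 4) == 0)
        ok = 0;

    close(sv[0]);                    /* the sockets vanish with their descriptors */
    close(sv[1]);
    return ok;
}
```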

Page 12: UNIX Files

File Descriptor

• A control structure, the File Control Block (FCB), i.e. a descriptor, is associated with each file in the file system
– Each FCB has a unique identifier (FCB ID)
– UNIX: i-node, identified by i-node number

• When a file is opened, a file number is presented to the process as a file handle

• FCB structure:
– File attributes (INODE)
– A data structure for accessing the file’s data

• Operations:
– OPEN: Associate an FCB with a resource path
– READ: Bring a specified chunk of data from the file into the process virtual address space
– WRITE: Write a specified chunk of data from the process virtual address space to the file
– CLOSE: Close the FCB, release the resource path

• Other operations: CREATE, DELETE, SEEK, TRUNCATE, SET_ATTRIBUTES

Page 13: UNIX Files

File Descriptors

• UNIX processes use descriptors to reference I/O streams.

• Descriptors are small non-negative integers.

• Descriptors are obtained from the system calls open(), socket(), pipe().

• System calls read() and write() are applied to descriptors to transfer data.

• System call lseek() is used to specify the position in the stream referred to by the descriptor.

• System call close() is used to de-allocate descriptors and the objects they refer to.

• File descriptors represent objects to which the kernel supports access: file, pipe, socket
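The calls listed above fit together as in the following C sketch, which writes ten bytes, repositions the stream with lseek(), and reads from the new position. The function name `seek_and_read` and the scratch-file approach are assumptions made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <fcntl.h>
#include <unistd.h>

/* Write "0123456789" to a scratch file at path, seek to byte offset 4,
   and read 3 bytes into out. Returns bytes read, or -1 on error. */
ssize_t seek_and_read(const char *path, char out[3])
{
    ssize_t n = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);  /* get a descriptor */
    if (fd < 0)
        return -1;

    if (write(fd, "0123456789", 10) == 10 &&   /* offset is now 10 */
        lseek(fd, 4, SEEK_SET) == 4)           /* move back to offset 4 */
        n = read(fd, out, 3);                  /* reads bytes 4, 5, 6 */

    close(fd);                                 /* release the descriptor */
    unlink(path);                              /* remove the scratch file */
    return n;
}
```

read() and write() take no position argument: both operate at the descriptor's current offset and advance it, which is why lseek() is a separate call.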

Page 14: UNIX Files

File Descriptor Table

• The kernel maintains a per-process descriptor table that it uses to translate the external representation of an I/O stream into its internal representation.

• A descriptor is simply an index into this table. Consequently, descriptors have only local meaning. Note the standard file handles provided to all processes: 0 (STDIN), 1 (STDOUT), 2 (STDERR).

• Different descriptors in different processes can refer to the same I/O stream;

• Descriptor table is inherited upon fork();

• Descriptor table is preserved upon exec();

• When a process terminates the kernel reclaims all descriptors that were in use by this process

• This is why UNIX processes inherit variables in only one direction (from parent to child).
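Inheritance across fork() can be observed directly: after the fork, the parent's and the child's descriptors refer to the same open stream, so a write in the child advances the offset the parent sees. The sketch below assumes a writable /tmp; the function name `offset_after_child_write` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 3 bytes through an inherited descriptor; the parent
   then reports the current offset of its own copy of that descriptor.
   Returns the offset, or -1 on error. */
off_t offset_after_child_write(void)
{
    char path[] = "/tmp/unixfiles-XXXXXX";   /* hypothetical temp name */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                    /* the descriptor keeps the file alive */

    pid_t pid = fork();              /* the descriptor table is copied */
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {                  /* child: same index, same stream */
        ssize_t w = write(fd, "abc", 3);
        _exit(w == 3 ? 0 : 1);
    }

    waitpid(pid, NULL, 0);           /* the kernel reclaims the child's table */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

The parent never writes, yet its offset moves, because both table entries point at one shared open-file object in the kernel.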

Pages 15-19: UNIX Files

File descriptor (five slides carrying only this title; their figures are not preserved in the transcript)

Page 20: UNIX Files

File commands

• cp, rm, touch, ln … etc.

• base64, uuencode, uudecode

• ls (-i for inode)

• lsof

• stat

• fuser

• od

• find -inum (see ls -i)
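What ln, ls -i and stat report can also be obtained from C with link() and stat(). The sketch below gives a scratch file a second name and checks that both names lead to the same i-node. It assumes a /tmp file system that permits hard links; the function name `link_count_after_ln` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* request POSIX declarations */
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Give a temp file a second name and check both names share one i-node.
   Returns the link count seen (2 is expected), or -1 on error or mismatch. */
long link_count_after_ln(void)
{
    char a[] = "/tmp/unixfiles-XXXXXX";   /* hypothetical temp name */
    char b[sizeof a + 4];
    struct stat sa, sb;
    long result = -1;

    int fd = mkstemp(a);
    if (fd < 0)
        return -1;
    close(fd);

    strcpy(b, a);
    strcat(b, ".ln");                     /* second name for the same file */

    if (link(a, b) == 0) {                /* what "ln a b" does */
        if (stat(a, &sa) == 0 && stat(b, &sb) == 0 &&
            sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev)
            result = (long)sa.st_nlink;   /* the count "ls -l" shows */
        unlink(b);
    }
    unlink(a);
    return result;
}
```

The `st_ino` field is the number that ls -i prints and that find -inum searches for.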