7/27/2019 Manipulating Files And Directories In Unix.pdf
1/18
v1.1
Manipulating Files And Directories In Unix
1. Who Is This For?
2. General Unix File System Structure
3. Standard "C" File Read And Write
1. The FILE Structure
2. Opening And Closing A File
3. Reading From An Open File
4. Writing Into An Open File
5. Moving The Read/Write Location In An Open File
6. A Complete Example
4. Accessing Files With System Calls
1. The Little File Descriptor That Could
2. Opening And Closing File Descriptors
3. Reading From A File Descriptor
4. Writing Into A File Descriptor
5. Seeking In An Open File
6. Checking And Setting A File's permission modes
7. Checking A File's Status
8. Renaming A File
9. Deleting A File
10. Creating A Symbolic Link
11. The Mysterious Mode Mask
12. A Complete Example
5. Reading The Contents Of Directories
1. The DIR And dirent Structures
2. Opening And Closing A Directory
3. Reading The Contents Of A Directory
4. Rewinding A Directory For A Second Scan
5. Checking And Changing The Working Directory
6. A Complete Example
Who Is This For?
The following tutorial describes various common methods for reading and writing files and directories
on a Unix system. Part of the information is common C knowledge, and is repeated here for
completeness. Other information is Unix-specific, although DOS programmers will find some of it
similar to what they saw in various DOS compilers. If you are a proficient C programmer, and know
everything about the standard I/O functions and their buffering operations, and know functions such as
fseek() or fread(), you may skip the standard C library I/O functions section. If in doubt, at least skim
through this section, to catch up on things you might not be familiar with, and at least look at the
standard C library examples.
Page 1 of 18Manipulating Files And Directories In Unix
10/26/2013http://users.actcom.co.il/~choo/lupg/tutorials/handling-files/handling-files.html
General Unix File System Structure
In the Unix system, all files and directories reside under a single top directory, called the root directory,
and denoted as "/". Even if the computer has several hard disks attached, they are all combined in a single
directory tree. It is up to the system administrator to place all disks on this tree. Each disk is
connected to some directory in the file system. This connection operation is called "mount", and is
usually done automatically when the system starts running.
Each directory may contain files, as well as other directories. In addition, each directory also contains
two special entries, the entries "." and ".." (i.e. "dot" and "dot dot", respectively). The "." entry refers to
the same directory it is placed in. The ".." entry refers to the directory containing it. The sole exception
is the root directory, in which the ".." entry still refers to the root directory (after all, the root directory is
not contained in any other directory).
A directory is actually a file that has a special attribute (denoting it as being a directory), that contains a
list of file names, and "pointers" to these files on the disk.
Besides normal files and directories, a Unix file system may contain various types of special files:
Symbolic link. This is a file that points to another file (or directory) in the file system. Opening
such a file generally opens the file it points to instead (unless special system calls are used).
Character (or block) special file. This file represents a physical device (and is usually placed in
the "/dev" directory). Opening this file allows accessing the given device directly. Each device
(disks, printers, serial ports etc) has a file in the "/dev" directory.
Other special files (pipes and sockets) used for inter-process communications.
Standard "C" File Read And Write
The basic method of reading files and writing into files is by using the standard C library's input and
output functions. This works portably across all operating systems, and also gives us some efficiency
enhancements - the standard library buffers read and write operations, making file operations faster than
if done directly by using system calls to read and write files.
The FILE Structure
The FILE structure is the basic data type used when handling files with the standard C library. When we
open a file, we get a pointer to such a structure, that we later use with all other operations on the file,
until we close it. This structure contains information such as the location in the file from which we will
read next (or to which we will write next), the read buffer, the write buffer, and so on. Sometimes this
structure is also referred to as a "file stream", or just "stream".
Opening And Closing A File
In order to work with a file, we must open it first, using the fopen() function. We specify the path to
the file (full path, or relative to the current working directory), as well as the mode for opening the file
(open for reading, for writing, for reading and writing, for appending only, etc.). Here are a few
examples of how to use it:
/* FILE structure pointers, for the return value of fopen() */
FILE* f_read;
FILE* f_write;
FILE* f_readwrite;
FILE* f_append;

/* Open the file /home/choo/data.txt for reading */
f_read = fopen("/home/choo/data.txt", "r");
if (!f_read) { /* open operation failed. */
    perror("Failed opening file '/home/choo/data.txt' for reading:");
    exit(1);
}

/* Open the file logfile in the current directory for writing. */
/* if the file does not exist, it is being created. */
/* if the file already exists, its contents is erased. */
f_write = fopen("logfile", "w");

/* Open the file /usr/local/lib/db/users for both reading and writing */
/* Any data written to the file is written at the beginning of the file, */
/* over-writing the existing data. */
f_readwrite = fopen("/usr/local/lib/db/users", "r+");

/* Open the file /var/adm/messages for appending. */
/* Any data written to the file is appended to its end. */
f_append = fopen("/var/adm/messages", "a");
As you can see, the mode of opening the file is given as an abbreviation. More options are documented
in the manual page for the fopen() function. The fopen() function returns a pointer to a FILE
structure on success, or a NULL pointer in case of failure. The exact reason for the failure may be
anything from "file does not exist" (in read mode), "permission denied" (if we don't have permission to
access the file or its directory), I/O error (in case of a disk failure), etc. In such a case, the global
variable "errno" is set to the proper error code, and the perror() function may be used to print
out a text string related to the exact error code.
Once we are done working with the file, we need to close it. This has two effects:
1. Flushing any un-saved changes to disk (actually, to the operating system's disk cache).
2. Freeing the file descriptor (will be explained in the system calls section below) and any other
resources associated with the open file.
Closing the file is done with the fclose() function, as follows:
/* fclose() returns 0 on success, so a non-zero value means failure. */
if (fclose(f_readwrite) != 0) {
    perror("Failed closing file '/usr/local/lib/db/users':");
    exit(1);
}
fclose() returns 0 on success, or EOF (usually '-1') on failure, in which case it sets "errno" to indicate
the error. One may wonder how closing a file could fail - this may happen if any buffered writes were not
yet saved to disk, and saving them during the close operation fails. Whether the function succeeded or
not, the FILE structure may not be used any more by the program.
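The open, use and close steps described above can be put together as a short sketch. The helper name save_line() is hypothetical, not part of the tutorial's sources; it only illustrates checking the return value of every call, including fclose().

```c
#include <stdio.h>

/* Hypothetical helper: create (or truncate) 'path', write one line of */
/* text into it, and close it. Returns 0 on success, -1 on failure.    */
int save_line(const char *path, const char *text)
{
    FILE *f = fopen(path, "w");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }

    /* remember whether any of the buffered writes reported an error. */
    int failed = (fputs(text, f) == EOF) || (fputc('\n', f) == EOF);

    /* fclose() flushes the buffer, so a write error may only show up here. */
    if (fclose(f) == EOF) {
        perror("fclose");
        return -1;
    }
    return failed ? -1 : 0;
}
```

A caller would simply test the result, e.g. treat save_line("logfile", "started") returning -1 as a failure to record the line.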
Reading From An Open File
Once we have a pointer for an open file's structure, we may read from it using any of several functions.
In the following code, assume f_read and f_readwrite are pointers to FILE structures returned by previous
calls to fopen().
/* variables used by the various read operations. */
int c;
char buf[201];
char block[120];

/* read a single character from the file. */
/* variable c will contain its ASCII code, or the value EOF, */
/* if we encountered the end of the file's data. */
c = fgetc(f_read);

/* read one line from the file. A line is all characters up to a new-line */
/* character, or up to the end of the file. At most 200 characters will be */
/* read in (i.e. one less than the number we supply to the function call). */
/* The string read in will be terminated by a null character, so that is */
/* why the buffer was made 201 characters long, not 200. If a new line */
/* character is read in, it is placed in the buffer, not removed. */
/* note that 'stdin' is a FILE structure pre-allocated by the */
/* C library, and refers to the standard input of the process (normally */
/* input from the keyboard). */
fgets(buf, 201, stdin);

/* place the given character back into the given file stream. The next */
/* operation on this file will return this character. Mostly used by */
/* parsers that analyze a given text, and try to guess what the next */
/* token is. If they miss their guess, it is easier to push the last */
/* character back to the file stream, than to make book-keeping operations. */
ungetc(c, stdin);

/* check if the read/write head has reached past the end of the file. */
if (feof(f_read)) {
    printf("End of file reached\n");
}

/* read one block of 120 characters from the file stream, into 'block'. */
/* (the third parameter to fread() is the number of blocks to read). */
/* a separate buffer is used, since 'buf' was already declared above. */
if (fread(block, 120, 1, f_read) != 1) {
    perror("fread");
}
There are various other file reading functions (getc() for example), but you'll be able to learn them
from the on-line manual.
Note that when we read in some text, the C library actually reads it from disk in full blocks (with a size
of 512 characters, or something else, as optimal for the operating system we work with). For example,
if we read 20 consecutive characters using fgetc() 20 times, only one disk operation is made. The rest
of the read operations are made from the buffer kept in the FILE structure.
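As an illustration of line-oriented reading on top of this buffering, here is a minimal sketch that counts the lines in a file with fgets(). The function name count_lines() is an assumption made for this example; the 201-byte buffer mirrors the one used above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines in 'path'. A final line that is */
/* not terminated by a new-line is counted too. Returns -1 on failure.  */
long count_lines(const char *path)
{
    char buf[201];
    long lines = 0;
    int partial = 0;   /* non-zero if the last chunk had no new-line. */
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;

    /* a line longer than 200 characters arrives in several chunks; */
    /* only the chunk that ends with '\n' completes the line.       */
    while (fgets(buf, sizeof(buf), f) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;
            partial = 0;
        }
        else {
            partial = 1;
        }
    }
    if (partial)
        lines++;

    int failed = ferror(f);   /* distinguish a read error from end of file. */
    fclose(f);
    return failed ? -1 : lines;
}
```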
Writing Into An Open File
Just like the read operations, we have write operations as well. They are performed at the current
location of the read/write pointer kept in the FILE structure, and are also done in a buffered mode - the
C library's write functions actually write the data to disk only once a full block has been filled. Yet, we
can force the library to write data at a given time (e.g. if we print to the screen and want partially written lines to appear
immediately). In the following example, assume that f_readwrite is a pointer to a FILE structure
returned from a previous call to fopen().
/* variables used by the various write operations. */
int c;
char buf[201];

/* write the character 'a' to the given file. */
c = 'a';
fputc(c, f_readwrite);

/* write the string "hello world" to the given file. */
strcpy(buf, "hello world");
fputs(buf, f_readwrite);

/* write the string "hi there, mate" to the standard output (screen) */
/* a new-line is placed in the string, to make the cursor move */
/* to the next line on screen after writing the string. */
fprintf(stdout, "hi there, mate\n");

/* write out any buffered writes to the given file stream. */
fflush(stdout);

/* write the string "hello, great world. we feel fine!\n" to 'f_readwrite', */
/* as one block of strlen(buf) bytes. (the third parameter to fwrite() is */
/* the number of blocks to write; the blocks are taken from consecutive */
/* memory, so asking for 2 blocks would NOT write the string twice). */
strcpy(buf, "hello, great world. we feel fine!\n");
if (fwrite(buf, strlen(buf), 1, f_readwrite) != 1) {
    perror("fwrite");
}
Note that when the output is to the screen, the buffering is done in line mode, i.e. whenever we write a
new-line character, the output is being flushed automatically. This is not the case when our output is to
a file, or when the standard output is being redirected to a file. In such cases the buffering is done for
larger chunks of data, and is said to be in "block-buffered mode".
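To make the effect of block buffering concrete, here is a small sketch of a logging helper that flushes after every line, so that the text reaches the operating system even when the stream is block-buffered. The name log_line() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write one line to an open stream and flush it  */
/* right away, so other readers of the file can see it immediately.    */
/* Returns 0 on success, -1 on failure.                                */
int log_line(FILE *f, const char *msg)
{
    if (fprintf(f, "%s\n", msg) < 0)
        return -1;

    /* without this call, the line may sit in the stream's buffer */
    /* until the buffer fills up or the stream is closed.          */
    if (fflush(f) == EOF)
        return -1;

    return 0;
}
```

Flushing after every line trades some of the speed gained by buffering for timeliness, which is usually the right choice for log files and rarely for bulk data.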
Moving The Read/Write Location In An Open File
Until now we have seen how input and output is done in a serial mode. However, on various occasions
we want to be able to move inside the file, and write to different locations, or read from different
locations, without having to scan the whole file. This is common in database files, when we have
some index telling us the location of each record of data in the file. Traveling in a file stream in such a
manner is also called "random access".
The fseek() function allows us to move the read/write pointer of a file stream to a desired location,
stated as the number of bytes from the beginning of the file (or from the end of file, or from the current
position of the read/write pointer). The ftell() function tells us the current location of the read/write
pointer of the given file stream. Here is how to use these functions:
/* move the read/write pointer of the file stream to the 30th byte */
/* of the file. Note that the first position in the file is '0', */
/* not '1', so the 30th byte is at offset 29. */
fseek(f_read, 29L, SEEK_SET);

/* move the read/write pointer of the file stream 25 characters */
/* forward from its current location. */
fseek(f_read, 25L, SEEK_CUR);

/* remember the current read/write pointer's position, move it */
/* to location '520' in the file, write the string "hello world", */
/* and move the pointer back to the previous location. */
long old_position = ftell(f_readwrite);
if (old_position < 0) {
    perror("ftell");
    exit(1);
}
if (fseek(f_readwrite, 520L, SEEK_SET) < 0) {
    perror("fseek(f_readwrite, 520L, SEEK_SET)");
    exit(1);
}
fputs("hello world", f_readwrite);
if (fseek(f_readwrite, old_position, SEEK_SET) < 0) {
    perror("fseek(f_readwrite, old_position, SEEK_SET)");
    exit(1);
}
Note that if we move inside the file with fseek(), any character put to the stream using ungetc() is
lost and forgotten.
Note: it is ok to seek past the end of a file. If we try to read from there, the read will simply report
end of file, but if we try to write there, the file's size will be automatically enlarged to contain the new
data we wrote. All characters between the previous end of file and the newly written data will contain
nulls ('\0') when read. Note that the size of the file has grown, but the file itself does not occupy so much
space on disk - the system knows to leave "holes" in the file. However, if we copy the file to a new
location with a tool that reads and writes it byte by byte (as the Unix "cp" command may do), the new file
will have all holes filled in, and will occupy much more disk space than the original file.
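The following sketch shows a hole being created with the standard C library. The helper name make_file_with_hole() is an assumption for this example; it writes a single character at a given offset of a new file and reports the resulting file size.

```c
#include <stdio.h>

/* Hypothetical helper: create 'path', seek to 'offset' (past the end  */
/* of the still-empty file), write one character there, and return the */
/* size of the file afterwards. Returns -1 on failure.                 */
long make_file_with_hole(const char *path, long offset)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    /* seeking past the end is allowed; the write is what enlarges the file. */
    if (fseek(f, offset, SEEK_SET) != 0 || fputc('x', f) == EOF) {
        fclose(f);
        return -1;
    }

    /* the position at the end of the file equals the file's size. */
    if (fseek(f, 0L, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    long size = ftell(f);

    if (fclose(f) == EOF)
        return -1;
    return size;
}
```

With an offset of 1000 the file ends up 1001 bytes long, and the first 1000 bytes read back as nulls. Whether they occupy disk space depends on the file system in use.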
A Complete Example
Two examples are given for the usage of the standard C library I/O functions. The first example is a file
copying program, that reads a given file one line at a time, and writes these lines to a second file. The
source code is found in the file stdc-file-copy.c. Note that this program does not check if a file with the
name of the target already exists, and thus viciously erases any existing file. Be careful when running it!
Later, when discussing the system calls interface, we will see how to avoid this danger.
The second example manages a small database file with fixed-length records (i.e. all records have the
same size), using the fseek() function. The source is found in the file stdc-small-db.c. Functions are
supplied for reading a record and for writing a record, based on an index number. See the source code
for more info. This program uses the fread() and fwrite() functions to read data from the file, or
write data to the file. Check the on-line manual page for these functions to see exactly what they do.
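The idea behind such a fixed-length record file can be sketched in a few lines. This is not the code of stdc-small-db.c; the record layout and the names write_record() and read_record() are assumptions made for illustration only.

```c
#include <stdio.h>

/* Hypothetical record layout: every record has the same size, so    */
/* record number 'index' starts at offset index * sizeof(struct record). */
struct record {
    int  id;
    char name[28];
};

/* store 'rec' as record number 'index'. Returns 0 on success, -1 on failure. */
int write_record(FILE *f, long index, const struct record *rec)
{
    if (fseek(f, index * (long)sizeof(*rec), SEEK_SET) != 0)
        return -1;
    return (fwrite(rec, sizeof(*rec), 1, f) == 1) ? 0 : -1;
}

/* load record number 'index' into 'rec'. Returns 0 on success, -1 on failure */
/* (for example, when 'index' lies past the end of the file).                 */
int read_record(FILE *f, long index, struct record *rec)
{
    if (fseek(f, index * (long)sizeof(*rec), SEEK_SET) != 0)
        return -1;
    return (fread(rec, sizeof(*rec), 1, f) == 1) ? 0 : -1;
}
```

Both functions expect a stream opened for update (e.g. with mode "r+" or "w+"). Since each of them starts with fseek(), reads and writes may be freely interleaved on the same stream.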
Accessing Files With System Calls
Usually, reading and writing files is done best using the standard C library functions. However, on
various occasions we need a lower-level interface to the files. For example, we cannot check file permissions
or file size using the standard C library. Also, you will see that Unix treats various devices in a similar
manner to using files, and using the same functions you can read from a file, from a network connection
and so on. Thus, it is useful to learn this generic interface.
The Little File Descriptor That Could
The basic system object used to manipulate files is called a file descriptor. This is an integer number
that is used by the various I/O system calls to access a memory area containing data about the open file.
This memory area has a similar role to the FILE structure in the standard C library I/O functions, and
thus the pointer returned from fopen() has a role similar to a file descriptor.
Each process has its own file descriptors table, with each entry pointing to an entry in a system file
descriptors table. This allows several processes to share file descriptors, by having a table entry pointing
to the same entry in the system file descriptors table. You will encounter this phenomenon, and how it
can be used, when learning about multi-process programming.
The value of the file descriptor is a non-negative integer. Usually, three file descriptors are
automatically opened by the shell that started the process. File descriptor '0' is used for the standard
input of the process. File descriptor '1' is used for the standard output of the process, and file descriptor
'2' is used for the standard error of the process. Normally the standard input gets input from the
keyboard, while standard output and standard error write data to the terminal from which the process
was started.
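As a small illustration, the three standard descriptors are available under the names STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO, and the descriptor behind a FILE structure can be obtained with fileno(). The helper say_error() below is hypothetical; it writes a message to standard error through its descriptor, bypassing the C library's buffering.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send 'msg' to the standard error of the process */
/* using file descriptor 2 directly. Returns the number of bytes        */
/* written, or -1 on failure.                                           */
ssize_t say_error(const char *msg)
{
    return write(STDERR_FILENO, msg, strlen(msg));
}

/* print which descriptor lies behind each of the standard streams. */
void show_standard_descriptors(void)
{
    printf("stdin  uses descriptor %d\n", fileno(stdin));
    printf("stdout uses descriptor %d\n", fileno(stdout));
    printf("stderr uses descriptor %d\n", fileno(stderr));
}
```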
Opening And Closing File Descriptors
Opening files using the system call interface is done using the open() system call. Similar to fopen(),
it accepts two parameters: one contains the path to the file to open, the other contains the mode in
which to open the file. The mode may be any of the following:
O_RDONLY
Open the file in read-only mode.
O_WRONLY
Open the file in write-only mode.
O_RDWR
Open the file for both reading and writing.
In addition, any of the following flags may be OR-ed with the mode flag:
O_CREAT
If the file does not exist already - create it.
O_EXCL
If used together with O_CREAT, the call will fail if the file already exists.
O_TRUNC
If the file already exists, truncate it (i.e. erase its contents).
O_APPEND
Open the file in append mode. Any data written to the file is appended at the end of the file.
O_NONBLOCK (or O_NDELAY)
If any operation on the file would cause the calling process to block, the system call will instead
fail, and errno will be set to EAGAIN. This requires caution on the part of the programmer, to
handle these situations properly.
O_SYNC
Open the file in synchronous mode. Any write operation to the file will block until the data is
written to disk. This is useful in critical files (such as database files) that must always remain in a
consistent state, even if the system crashes in the middle of a file operation.
Unlike the fopen() function, open() accepts one more (optional) parameter, which defines the access
permissions that will be given to the file, in case of file creation. This parameter is a combination of any
of the following flags:
S_IRWXU
Owner of the file has read, write and execute permissions to the file.
S_IRUSR
Owner of the file has read permission to the file.
S_IWUSR
Owner of the file has write permission to the file.
S_IXUSR
Owner of the file has execute permission to the file.
S_IRWXG
Group of the file has read, write and execute permissions to the file.
S_IRGRP
Group of the file has read permission to the file.
S_IWGRP
Group of the file has write permission to the file.
S_IXGRP
Group of the file has execute permission to the file.
S_IRWXO
Other users have read, write and execute permissions to the file.
S_IROTH
Other users have read permission to the file.
S_IWOTH
Other users have write permission to the file.
S_IXOTH
Other users have execute permission to the file.
Here are a few examples of using open():
/* these hold file descriptors returned from open(). */
int fd_read;
int fd_write;
int fd_readwrite;
int fd_append;

/* Open the file /etc/passwd in read-only mode. */
fd_read = open("/etc/passwd", O_RDONLY);
if (fd_read < 0) {
    perror("open");
    exit(1);
}

/* Open the file run.log (in the current directory) in write-only mode. */
/* and truncate it, if it has any contents. */
fd_write = open("run.log", O_WRONLY | O_TRUNC);
if (fd_write < 0) {
    perror("open");
    exit(1);
}

/* Open the file /var/data/food.db in read-write mode. */
fd_readwrite = open("/var/data/food.db", O_RDWR);
if (fd_readwrite < 0) {
    perror("open");
    exit(1);
}

/* Open the file /var/log/messages in append mode. */
fd_append = open("/var/log/messages", O_WRONLY | O_APPEND);
if (fd_append < 0) {
    perror("open");
    exit(1);
}
Once we are done working with a file, we need to close it, using the close() system call, as follows:
if (close(fd) == -1) {
    perror("close");
    exit(1);
}
This will cause the file to be closed. Note that no buffering is normally associated with files opened
with open(), so no buffer flushing is required.
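The examples above only open existing files. Here is a minimal sketch of file creation that combines O_CREAT with O_EXCL and the optional permissions parameter, so that an existing file is never overwritten by accident. The helper name create_new_file() and the chosen permissions are assumptions for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create 'path' and open it for writing, but only */
/* if no file with that name exists yet. Returns the new file           */
/* descriptor, or -1 on failure.                                        */
int create_new_file(const char *path)
{
    /* requested permissions: read/write for the owner, read for the group. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL,
                  S_IRUSR | S_IWUSR | S_IRGRP);

    if (fd == -1) {
        if (errno == EEXIST)
            fprintf(stderr, "'%s' already exists, not overwriting it\n", path);
        else
            perror("open");
    }
    return fd;
}
```

The caller is expected to close() the returned descriptor when done with the file.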
Note: If a file that is currently open by a Unix process is erased (using the Unix "rm" command,
for example), the file is not really removed from the disk. Only when the process (or all processes)
holding the file open closes it, is the file physically removed from the disk. Until then it is just removed
from its directory, not from the disk.
Reading From A File Descriptor
Once we got a file descriptor to an open file (that was opened in read mode), we may read data from the
file using the read() system call. This call takes three parameters: the file descriptor to read from, a
buffer to read data into, and the number of characters to read into the buffer. The buffer must be large
enough to contain the data. Here is how to use this call. We assume 'fd' contains a file descriptor
returned from a previous call to open().
/* return value from the read() call. it must be a signed type */
/* (ssize_t, not size_t), or the test for '-1' below never succeeds. */
ssize_t rc;
/* buffer to read data into. */
char buf[20];

/* read up to 20 bytes from the file. */
rc = read(fd, buf, 20);
if (rc == 0) {
    printf("End of file encountered\n");
}
else if (rc < 0) {
    perror("read");
    exit(1);
}
else {
    printf("read in '%ld' bytes\n", (long)rc);
}
As you can see, read() does not always read the number of bytes we asked it to read. This could be
due to a signal interrupting it in the middle, or because the end of the file was encountered. In such a
case, read() returns the number of bytes it actually read.
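Because of such short reads, code that needs an exact number of bytes usually calls read() in a loop. The following is a sketch under that assumption; the name read_fully() is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until 'count' bytes were  */
/* read, the end of the file was reached, or an error occurred.       */
/* Returns the number of bytes actually read, or -1 on error.         */
ssize_t read_fully(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t rc = read(fd, p + total, count - total);
        if (rc == 0)
            break;              /* end of file - return what we have. */
        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal - try again. */
            return -1;
        }
        total += (size_t)rc;
    }
    return (ssize_t)total;
}
```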
Writing Into A File Descriptor
Just like we used read() to read from the file, we use the write() system call to write data to the file.
The write operation is done at the location of the current read/write pointer of the given file, much like
the various standard C library output functions did. write() gets the same parameters as read() does,
and just like read(), might write only part of the data to the given file, if interrupted in the middle, or
for other reasons. In such a case it will return the number of bytes actually written to the file. Here is a
usage example:
/* return value from the write() call. it must be a signed type */
/* (ssize_t, not size_t), or the test for '-1' below never succeeds. */
ssize_t rc;

/* write the given string to the file. */
rc = write(fd, "hello world\n", strlen("hello world\n"));
if (rc < 0) {
    perror("write");
    exit(1);
}
else {
    printf("wrote '%ld' bytes\n", (long)rc);
}
As you can see, there is never an end-of-file case with a write operation. If we write past the current end
of the file, the file will be enlarged to contain the new data.
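Since write() may write only part of the data, a loop is commonly used when all of it must go out. Here is a minimal sketch; the name write_fully() is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all 'count' bytes  */
/* of 'buf' were written, or an error occurred. Returns 'count' on    */
/* success, or -1 on error.                                           */
ssize_t write_fully(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t rc = write(fd, p + total, count - total);
        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal - try again. */
            return -1;
        }
        total += (size_t)rc;    /* a partial write - continue after it. */
    }
    return (ssize_t)total;
}
```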
Sometimes, writing out the data is not enough. We want to be sure the file on the physical disk gets
updated immediately (note that even though the system calls do not buffer writes, the operating system
still buffers write operations using its disk cache). In such cases, we may use the fsync() system call.
It ensures that any write operations for the given file descriptor that are kept in the system's disk cache
are actually written to disk by the time the fsync() system call returns to the caller. Here is how to use it:
#include <unistd.h> /* declaration of fsync() */
.
.
if (fsync(fd) == -1) {
    perror("fsync");
}
Note that fsync() updates both the file's contents, and its book-keeping data (such as last modification
time). If we only need to assure that the file's contents are written to disk, and don't care about the last
update time, we can use fdatasync() instead. This can be more efficient, as it may issue one fewer disk
write operation. In applications that need to synchronize data often, this small saving is important.
Seeking In An Open File
Just like we used the fseek() function to move the read/write pointer of the file stream, we can use the
lseek() system call to move the read/write pointer for a file descriptor. Assuming you understood the
fseek() examples above, here are a few similar examples using lseek(). We assume that 'fd_read' is
an integer variable containing a file descriptor to a previously opened file, in read only mode.
'fd_readwrite' is a similar file descriptor, but for a file opened in read/write mode.
/* this variable is used for storing locations returned by */
/* lseek(). */
off_t location;

/* move the read/write pointer of the file to the 40th byte */
/* of the file. Note that the first position in the file is '0', */
/* not '1', so the 40th byte is at offset 39. */
location = lseek(fd_read, 39L, SEEK_SET);

/* move the read/write pointer of the file 67 characters */
/* forward from its current location. */
location = lseek(fd_read, 67L, SEEK_CUR);
printf("read/write pointer location: %ld\n", (long)location);

/* remember the current read/write pointer's position, move it */
/* to the 664th byte of the file (offset 663), write the string */
/* "hello world", and move the pointer back to the previous location. */
/* seeking 0 bytes from SEEK_CUR returns the position without moving. */
location = lseek(fd_readwrite, 0L, SEEK_CUR);
if (location == -1) {
    perror("lseek");
    exit(1);
}
if (lseek(fd_readwrite, 663L, SEEK_SET) == -1) {
    perror("lseek(fd_readwrite, 663L, SEEK_SET)");
    exit(1);
}
rc = write(fd_readwrite, "hello world\n", strlen("hello world\n"));
if (lseek(fd_readwrite, location, SEEK_SET) == -1) {
    perror("lseek(fd_readwrite, location, SEEK_SET)");
    exit(1);
}
Note that lseek() might not always work for a file descriptor (e.g. if this file descriptor represents the
standard input, surely we cannot have random-access to it). You will encounter other similar cases
when you deal with network programming and inter-process communications, in the future.
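One common use of lseek() is finding the size of an open file by seeking to its end. The sketch below also shows the failure case on a descriptor that does not support seeking. The names fd_size() and path_size() are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size of the file behind 'fd',     */
/* leaving the read/write pointer where it was. Returns -1 if the    */
/* descriptor does not support seeking (a pipe, for example).        */
off_t fd_size(int fd)
{
    /* remember where we are. this already fails on a pipe. */
    off_t here = lseek(fd, 0, SEEK_CUR);
    if (here == (off_t)-1)
        return -1;

    /* the offset of the end of the file is the file's size. */
    off_t end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1)
        return -1;

    /* go back to the original location. */
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* same, for a file given by its path. */
off_t path_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t size = fd_size(fd);
    close(fd);
    return size;
}
```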
Checking And Setting A File's permission modes
Since Unix supports access permissions for files, we would sometimes need to check these permissions,
and perhaps also manipulate them. Two system calls are used in this context, access() and chmod().
The access() system call is for checking access permissions to a file. This system call accepts a path
to a file (full or relative), and a mode mask (made of one or more permission modes). It returns '0' if the
specified permission modes are granted for the calling process, or '-1' if any of these modes are not
granted, the file does not exist, etc. The access is granted or denied based on the permission flags of the
file, and the ID of the user running the process. Here are a few examples:
/* check if we have read permission to "/home/choo/my_names". */
if (access("/home/choo/my_names", R_OK) == 0)
    printf("Read access to file '/home/choo/my_names' granted.\n");
else
    printf("Read access to file '/home/choo/my_names' denied.\n");

/* check if we have both read and write permission to "data.db". */
if (access("data.db", R_OK | W_OK) == 0)
    printf("Read/Write access to file 'data.db' granted.\n");
else
    printf("Either read or write access to file 'data.db' is denied.\n");

/* check if we may execute the program file "runme". */
if (access("runme", X_OK) == 0)
    printf("Execute permission to program 'runme' granted.\n");
else
    printf("Execute permission to program 'runme' denied.\n");

/* check if we may write new files to directory "/etc/config". */
if (access("/etc/config", W_OK) == 0)
    printf("File creation permission to directory '/etc/config' granted.\n");
else
    printf("File creation permission to directory '/etc/config' denied.\n");

/* check if we may read the contents of directory "/etc/config". */
if (access("/etc/config", R_OK) == 0)
    printf("File listing read permission to directory '/etc/config' granted.\n");
else
    printf("File listing read permission to directory '/etc/config' denied.\n");

/* check if the file "hello world" in the current directory exists. */
if (access("hello world", F_OK) == 0)
    printf("file 'hello world' exists.\n");
else
    printf("file 'hello world' does not exist.\n");
As you can see, we can check for read, write and execute permissions, as well as for the existence of a
file, and the same for a directory. As an example, we will see a program that checks if we have read
permission to a file and, if not, tells us where the problem lies. The full source code for this
program is found in file read-access-check.c.
Note that we cannot use access() to check why we got permissions (i.e. if it was due to the given
mode granted to us as the owner of the file, or due to its group permissions or its world permissions).
For more fine-grained permission tests, see the stat() system call mentioned below.
The chmod() system call is used for changing the access permissions for a file (or a directory). This call
accepts two parameters: a path to a file, and a mode to set. The mode can be a combination of read,
write and execute permissions for the user, group or others. It may also contain a few special flags, such
as the set-user-ID flag or the 'sticky' flag. These permissions will completely override the current
permissions of the file. See the stat() system call below to see how to make modifications instead of
complete replacement. Here are a few examples of using chmod().
/* give the owner read and write permission to the file "blabla", */
/* and deny access to any other user. */
if (chmod("blabla", S_IRUSR | S_IWUSR) == -1) {
    perror("chmod");
}

/* give the owner read and write permission to the file "blabla", */
/* and read-only permission to anyone else. */
if (chmod("blabla", S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) == -1) {
    perror("chmod");
}
For the full list of access permission flags to use with chmod(), please refer to its manual page.
Checking A File's Status
We have seen how to manipulate the file's data (write) and its permission flags (chmod). We saw a
primitive way of checking if we may access it (access), but we often need more than that: what is the
exact set of permission flags of the file? When was it last changed? Which user and group own the file?
How large is the file?
All these questions (and more) are answered by the stat() system call.
stat() takes as arguments the full path to the file, and a pointer to a (how surprising) 'stat' structure.
When stat() returns, it populates this structure with a lot of interesting (and boring) stuff about the
file. Here are few of the fields found in this structure (for the rest, read the manual page):
mode_t st_mode
Access permission flags of the file, as well as information about the type of file (file? directory?
symbolic link? etc).
uid_t st_uid
The ID of the user that owns the file.
gid_t st_gid
The ID of the group that owns the file.
off_t st_size
The size of the file (in bytes).
time_t st_atime
Time when the file was last accessed (read from). Time is given as the number of
seconds since 1 Jan, 1970.
time_t st_mtime
Time when the file was last modified (created or written to).
time_t st_ctime
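Time when the file's status was last changed (its permission modes, its owner, or any other part of its
book-keeping). Note that writing to the file updates this time as well, since the file's size and
modification time are part of that book-keeping.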
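As a minimal sketch of reading the fields listed above, the following helper prints a few of them for a given path. The name print_file_status() is hypothetical, and the casts are there only because the exact integer types behind uid_t, gid_t and off_t differ between systems.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>      /* ctime() */

/* Hypothetical helper: print some 'stat' fields of 'path'. */
/* Returns 0 on success, -1 if stat() failed.               */
int print_file_status(const char* path)
{
    struct stat st;

    if (stat(path, &st) == -1) {
        perror("stat");
        return -1;
    }
    /* the low 12 bits of st_mode hold the permission and special flags. */
    printf("permissions: %04o\n", (unsigned int)(st.st_mode & 07777));
    printf("owner uid:   %ld\n", (long)st.st_uid);
    printf("owner gid:   %ld\n", (long)st.st_gid);
    printf("size:        %lld bytes\n", (long long)st.st_size);
    /* ctime() turns seconds-since-1970 into text, ending with '\n'. */
    printf("modified:    %s", ctime(&st.st_mtime));
    return 0;
}
```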
Here are a few examples of how stat() can be used:
/* structure passed to the stat() system call, to get its results. */
struct stat file_status;

/* check the status information of file "foo.txt", and print its */
/* type on screen. */
if (stat("foo.txt", &file_status) == 0) {
    if (S_ISDIR(file_status.st_mode))
        printf("foo.txt is a directory\n");
    /* note: stat() follows symbolic links, so this test can only */
    /* succeed if lstat() is called instead of stat().            */
    if (S_ISLNK(file_status.st_mode))
        printf("foo.txt is a symbolic link\n");
    if (S_ISCHR(file_status.st_mode))
        printf("foo.txt is a character special file\n");
    if (S_ISBLK(file_status.st_mode))
        printf("foo.txt is a block special file\n");
    if (S_ISFIFO(file_status.st_mode))
        printf("foo.txt is a FIFO (named pipe)\n");
    if (S_ISSOCK(file_status.st_mode))
        printf("foo.txt is a (Unix domain) socket file\n");
    if (S_ISREG(file_status.st_mode))
        printf("foo.txt is a normal file\n");
}
else { /* stat() call failed and returned '-1'. */
    perror("stat");
}
/* add the write permission to the group owner of file "/tmp/parlevouz", */
/* without overriding any of the previous access permission flags. */
if (stat("/tmp/parlevouz", &file_status) == -1) {
    perror("stat");
    exit(1);
}
if (!(file_status.st_mode & S_IWGRP)) { /* the group has no write permission */
    mode_t curr_mode = file_status.st_mode & ~S_IFMT;
    mode_t new_mode = curr_mode | S_IWGRP;

    if (chmod("/tmp/parlevouz", new_mode) == -1) {
        perror("chmod");
        exit(1);
    }
}
The last item should be explained better. For some reason, the 'stat' structure uses the same bit field to
contain file type information and access permission flags. Thus, to get only the access permissions, we
need to mask off the file type bits. The mask for the file type bits is 'S_IFMT', and thus the mask for the
permission modes is its bitwise complement, or '~S_IFMT'. By bitwise "and"-ing this value with the
'st_mode' field of the 'stat' structure, we get the current access permission modes. We can add new
modes using the bitwise or ('|') operator, and remove modes by "and"-ing ('&') with the complement ('~') of
the flags to remove. After we create the new modes, we use chmod() to set the new permission flags for the file.
Note that this operation will also implicitly modify the 'ctime' (change time) of the file, but that won't
be reflected in our 'stat' structure, unless we stat() the file again.
Renaming A File
The rename() system call may be used to change the name (and possibly the directory) of an existing
file. It gets two parameters: the path to the old location of the file (including the file name), and a path
to the new location of the file (including the new file name). If the new name points to an already
existing file, that file is deleted first. We are allowed to rename either a file or a directory. Here are a few
examples:
/* rename the file 'logme' to 'logme.1' */
if (rename("logme", "logme.1") == -1) {
    perror("rename (1)");
    exit(1);
}

/* move the file 'data' from the current directory to directory "/old/info" */
if (rename("data", "/old/info/data") == -1) {
    perror("rename (2)");
    exit(1);
}
Note: If the file we are renaming is a symbolic link, then the symbolic link will be renamed, not the file
it is pointing to. Also, if the new path points to an existing symbolic link, this symbolic link will be
erased, not the file it is pointing to.
Deleting A File
Deleting a file is done using the unlink() system call. This one is very simple:
/* remove the file "/tmp/data" */
if (unlink("/tmp/data") == -1) {
    perror("unlink");
    exit(1);
}
The file will be removed from the directory in which it resides, and all the disk blocks it occupied will
be marked as free for re-use by the system. However, if any process currently has this file open, the file
won't be actually erased until the last process holding it open closes it. This could explain why
erasing a log file from the system often does not increase the amount of free disk space - it might be that the
system logger process (syslogd) holds this file open, and thus the system won't really erase it until
syslogd closes it. Until then, it will be removed from the directory (i.e. 'ls' won't show it), but not from
the disk.
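This behaviour can be observed from within a single process. The sketch below is only an illustration (the function name unlink_while_open() is made up, and a local file system is assumed): it creates a file, unlinks it while still holding it open, and then checks that the name is gone while the open descriptor keeps working.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if 'path' could be created, unlinked, */
/* and still written to through the open descriptor; -1 otherwise.    */
int unlink_while_open(const char* path)
{
    struct stat st;
    int ok = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);

    if (fd == -1)
        return -1;
    if (unlink(path) == 0                       /* the name is removed at once, */
        && stat(path, &st) == -1                /* so a path lookup now fails,  */
        && write(fd, "still here\n", 11) == 11  /* but the descriptor works,    */
        && fstat(fd, &st) == 0
        && st.st_nlink == 0                     /* no directory entry is left,  */
        && st.st_size == 11)                    /* and the data is stored.      */
        ok = 1;
    close(fd);   /* the last holder closes the file: now the blocks are freed. */
    return ok ? 0 : -1;
}
```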
Creating A Symbolic Link
We have encountered symbolic links earlier. Let's see how to create them, with the symlink() system
call:
/* create a symbolic link named "link" in the current directory, */
/* that points to the file "/usr/local/data/datafile". */
if (symlink("/usr/local/data/datafile", "link") == -1) {
    perror("symlink");
    exit(1);
}

/* create a symbolic link whose full path is "/var/adm/log", */
/* that points to the file "/usr/adm/log". */
if (symlink("/usr/adm/log", "/var/adm/log") == -1) {
    perror("symlink");
    exit(1);
}
So the first parameter is the file being pointed to, and the second parameter is the file that will be the
symbolic link. Note that the first file does not need to exist at all - we can create a symbolic link that
points nowhere. If we later create the file this link points to, accessing the file via the symbolic link will
work properly.
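A small sketch of the "link that points nowhere" case follows. The function name dangling_link_demo() is hypothetical, and absolute paths are assumed for both parameters. It relies on lstat(), which examines the link itself, whereas stat() follows the link to its target.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: link to a missing 'target', then create the target. */
/* Returns 0 if everything behaved as described, -1 otherwise.            */
int dangling_link_demo(const char* target, const char* link_path)
{
    struct stat st;
    int fd;
    int ok;

    /* the target does not have to exist when the link is created. */
    if (symlink(target, link_path) == -1)
        return -1;

    /* lstat() sees the link itself; stat() follows it and fails. */
    ok = lstat(link_path, &st) == 0 && S_ISLNK(st.st_mode)
         && stat(link_path, &st) == -1;

    /* create the target; O_EXCL makes sure we never touch an old file. */
    fd = open(target, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        unlink(link_path);
        return -1;
    }
    close(fd);

    /* the very same link now resolves to a regular file. */
    ok = ok && stat(link_path, &st) == 0 && S_ISREG(st.st_mode);

    unlink(link_path);   /* clean up: removes the link, not the target */
    unlink(target);
    return ok ? 0 : -1;
}
```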
The Mysterious Mode Mask
If you created files with open() or fopen(), and you did not supply the mode for the newly created
file, you might wonder how the system assigns access permission flags to the newly created file.
You will also note that these "default" flags are different on different computers or different account
setups. This mysteriousness is due to the usage of the umask() system call, or its equivalent umask shell
command.
The umask() system call sets a mask for the permission flags the system will assign to newly created
files. By default, newly created files will have read and write permissions for everyone (i.e. rw-rw-rw-,
in the format reported by 'ls -l'). Using umask(), we can denote which flags will be turned off for newly
created files. For example, if we set the mask to 077 (a leading 0 denotes an octal value), newly created
files will get access permission flags of 0600 (i.e. rw-------). If we set the mask to 027, newly created
files will get flags of 0640 (i.e. rw-r-----). Try translating these values to binary format in order to see
what is going on here.
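Written out in binary, 0666 is 110 110 110 and the mask 077 is 000 111 111. Every bit that is set in the mask is cleared in the result, which leaves 110 000 000, or 0600. The same computation as a one-line C function (the name apply_umask() is made up for this sketch; the kernel does this internally when a file is created):

```c
#include <sys/types.h>

/* the permissions a new file ends up with: the bits requested at */
/* creation time, minus every bit that is set in the mask.        */
mode_t apply_umask(mode_t requested, mode_t mask)
{
    return requested & ~mask;
}
```

With a mask of 027 (000 010 111), the requested 0666 loses the group's write bit and all of the bits for others, giving 110 100 000, or 0640.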
Here is how to mess with the umask() system call in a program:
/* set the file permissions mask to '077'. save the original mask */
/* in 'old_mask'. */
mode_t old_mask = umask(077);

/* newly created files will now be readable only by the creating user. */
FILE* f_write = fopen("my_file", "w");
if (f_write) {
    fprintf(f_write, "My name is pit stanman.\n");
    fprintf(f_write, "My voice is my pass code. Verify me.\n");
    fclose(f_write);
}

/* restore the original umask. */
umask(old_mask);
Note: the permissions mask also affects calls to open() that specify an exact permission mode. If we
want to create a file whose permissions are less restrictive than the current mask, we need to use umask()
to lighten these restrictions, before calling open() to create the file.
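One way to do that is sketched below; the helper name create_with_exact_mode() is an assumption of this example. It clears the mask just for the duration of the open() call, and restores it right afterwards.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create 'path' with exactly the permissions in */
/* 'mode', whatever the current mask is. Fails if the file already    */
/* exists. Returns a file descriptor, or -1 on failure.               */
int create_with_exact_mode(const char* path, mode_t mode)
{
    mode_t old_mask = umask(0);   /* lift all restrictions...  */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, mode);

    umask(old_mask);              /* ...and put them back.     */
    return fd;
}
```

The mask belongs to the whole process, so in a multi-threaded program another thread could create a file while the mask is cleared. In such programs, calling fchmod() on the new descriptor after open() is a safer way to reach the same result.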
Note 2: on most systems you will find that the mask is different than the default. This is because the
system administrator has set the default mask in the system-wide shell startup files, using the shell's
umask command. You may set a different default mask for your own account by placing a proper umask
command in your shell's startup file ("~/.profile" if you're using "sh" or "bash", "~/.cshrc" if you are
using "csh" or "tcsh").
A Complete Example
As an example of the usage of the system calls interface for manipulating files, we will show a program
that handles simple log file rotation. The program gets one argument - the name of a log file, and
assumes it resides in a given directory ("/tmp/var/log"). If the size of the log file is more than 1024KB,
it renames it to have a ".old" suffix, and creates a new (empty) log file with the same name as the
original file, and the same access permissions. This code demonstrates combining many system calls
together to achieve a task. The source code for this program is found in the file rename-log.c.
Reading The Contents Of Directories
After we have learned how to write the contents of a file, we might wish to know how to read the
contents of a directory. We could open the directory and read its contents directly, but this is not
portable. Instead, we have a standard interface for opening a directory and scanning its contents, entry
by entry.
The DIR And dirent Structures
When we want to read the contents of a directory, we have a function that opens a directory, and returns
a DIR structure. This structure contains information used by other calls to read the contents of the
directory, and thus this structure is for directory reading what the FILE structure is for file reading.
When we use the DIR structure to read the contents of a directory, entry by entry, the data regarding a
given entry is returned in a dirent structure. The only relevant field in this structure is d_name, which
is a null-terminated character array, containing the name of the entry (be it a file or a directory). Note:
the name, NOT the path.
Opening And Closing A Directory
In order to read the contents of a directory, we first open it, using the opendir() function. We supply
the path to the directory, and get a pointer to a DIR structure in return (or NULL on failure). Here is how:
#include <dirent.h>   /* DIR, struct dirent, opendir().. */

/* open the directory "/home/users" for reading. */
DIR* dir = opendir("/home/users");
if (!dir) {
    perror("opendir");
    exit(1);
}
When we are done reading from a directory, we can close it using the closedir() function:
if (closedir(dir) == -1) {
    perror("closedir");
    exit(1);
}
closedir() will return '0' on success, or '-1' if it failed. Unless we have done something really silly, failures shouldn't happen, as we never write to a directory using the DIR structure.
Reading The Contents Of A Directory
After we opened the directory, we can start scanning it, entry by entry, using the readdir() function.
The first call returns the first entry of the directory. Each successive call returns the next entry in the
directory. When all entries have been read, NULL is returned. Here is how it is used:
/* this structure is used for storing the name of each entry in turn. */
struct dirent* entry;

/* read the directory's contents, print out the name of each entry. */
printf("Directory contents:\n");
while ( (entry = readdir(dir)) != NULL) {
    printf("%s\n", entry->d_name);
}
If you try this out, you'll note that the directory always contains the entries "." and "..", as explained in
the beginning of this tutorial. A common mistake is to forget to check for these entries specifically in
recursive traversals of the file system. If these entries are traversed blindly, an endless loop
might occur.
Note: if we alter the contents of the directory during its traversal, the traversal might skip directory
entries. Thus, if you intend to create a file in the directory, you had better not do that while in the
middle of a traversal.
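Here is a minimal sketch of a scan that skips the two special entries. The function name count_entries() is made up for this example.

```c
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: count the entries of 'path', not counting the   */
/* special entries "." and "..". Returns the count, or -1 if the        */
/* directory could not be opened.                                       */
int count_entries(const char* path)
{
    DIR* dir = opendir(path);
    struct dirent* entry;
    int count = 0;

    if (!dir)
        return -1;
    while ( (entry = readdir(dir)) != NULL) {
        /* skip the directory itself and its parent. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}
```

The names are compared in full with strcmp(), so that ordinary hidden files such as ".profile" are still counted.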
Rewinding A Directory For A Second Scan
After we are done reading the contents of a directory, we can rewind it for a second pass, using the
rewinddir() function:
rewinddir(dir);
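A second pass is handy when the first pass has to collect something about all the entries before any of them can be handled. In the sketch below (the function name list_aligned() is hypothetical), the first pass finds the longest entry name, and the second pass prints all names padded to that width.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the entries of 'path', each padded to the */
/* width of the longest name. Returns the number of entries printed, or */
/* -1 if the directory could not be opened.                             */
int list_aligned(const char* path)
{
    DIR* dir = opendir(path);
    struct dirent* entry;
    int width = 0;
    int count = 0;

    if (!dir)
        return -1;

    /* first pass: find the length of the longest entry name. */
    while ( (entry = readdir(dir)) != NULL) {
        int len = (int)strlen(entry->d_name);
        if (len > width)
            width = len;
    }

    /* go back to the first entry, without re-opening the directory. */
    rewinddir(dir);

    /* second pass: print each name, left-justified in a fixed width. */
    while ( (entry = readdir(dir)) != NULL) {
        printf("[%-*s]\n", width, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```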
Checking And Changing The Working Directory
Sometimes we wish to find out the current working directory of a process. The getcwd() function is
used for that. Other times we wish to change the working directory of our process. This will allow using
short paths when accessing several files in the same directory. The chdir() system call is used for this.
Here is an example:
/* this buffer is used to store the full path of the current */
/* working directory. */
#define MAX_DIR_PATH 2048
char cwd[MAX_DIR_PATH+1];

/* store the current working directory. */
if (!getcwd(cwd, MAX_DIR_PATH+1)) {
    perror("getcwd");
    exit(1);
}

/* change the current directory to "/tmp". */
if (chdir("/tmp") == -1) {
    perror("chdir (1)");
    exit(1);
}

/* restore the original working directory. */
if (chdir(cwd) == -1) {
    perror("chdir (2)");
    exit(1);
}
A Complete Example
As an example, we will write a limited version of the Unix 'find' command. This command basically
accepts a file name and a directory, and finds all files under that directory (or any of its sub-directories)
with the given file name. The original program has zillions of command line options, and can also
handle file name patterns. Our version will only be able to handle substrings (that is, finding the files
whose names contain the given string). The program changes its working directory to the given
directory, reads its contents, and recursively scans each sub-directory it encounters. The program does
not traverse across symbolic links, to avoid possible loops. The complete source code for the program is found in the find-file.c file.
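The recursive scan can be sketched as follows. This is not the code of find-file.c, and it differs from the description above in one respect: instead of changing the working directory, it builds the path of each entry in a buffer. The function name find_file() and the 4096-byte buffer are assumptions of this example.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: print the path of every entry under 'dir_path' */
/* whose name contains 'pattern'. Returns the number of matches.       */
int find_file(const char* dir_path, const char* pattern)
{
    DIR* dir = opendir(dir_path);
    struct dirent* entry;
    struct stat st;
    char path[4096];
    int len;
    int matches = 0;

    if (!dir)
        return 0;        /* directories we cannot read are skipped. */

    while ( (entry = readdir(dir)) != NULL) {
        /* never descend into "." or "..", or we will loop forever. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        len = snprintf(path, sizeof(path), "%s/%s", dir_path, entry->d_name);
        if (len < 0 || len >= (int)sizeof(path))
            continue;    /* the path is too long for our buffer. */

        if (strstr(entry->d_name, pattern)) {
            printf("%s\n", path);
            matches++;
        }
        /* lstat() does not follow symbolic links, so a link that points */
        /* to a directory is not reported as a directory here.           */
        if (lstat(path, &st) == 0 && S_ISDIR(st.st_mode))
            matches += find_file(path, pattern);
    }
    closedir(dir);
    return matches;
}
```

Calling find_file("/tmp", "log") would print every entry under "/tmp" whose name contains "log". Each level of recursion keeps one directory open, so a very deep tree could run into the limit on open file descriptors.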
This document is copyright (c) 1998-2002 by guy keren.
The material in this document is provided AS IS, without any expressed or implied warranty, or claim of fitness for a particular purpose. Neither the author nor any contributers shell be liable for any
damages incured directly or indirectly by using the material contained in this document.
permission to copy this document (electronically or on paper, for personal or organization internal use)
or publish it on-line is hereby granted, provided that the document is copied as-is, this copyright notice
is preserved, and a link to the original document is written in the document's body, or in the page
linking to the copy of this document.
Permission to make translations of this document is also granted, under these terms - assuming the
translation preserves the meaning of the text, the copyright notice is preserved as-is, and a link to the
original document is written in the document's body, or in the page linking to the copy of this
document.
For any questions about the document and its license, please contact the author.