Operating Systems

Recitation 3, April 7-8th, 2002.

Overview

• Process descriptor and kernel data structures for managing processes.

• Process creation and termination

• Process synchronization

• Exercise

Simple example

• Suppose you have two terminal windows on your screen. You are probably running the same terminal program twice: you have two terminal processes.
  – Each terminal window is probably running a shell; each running shell is another process.

• When you invoke a command from a shell, the corresponding program is executed in a new process.
  – The shell process resumes when that process completes.

• A running instance of a program is called a process.

Process descriptor

• State: running, waiting, stopped, zombie.

• Unique ID: a non-negative integer, assigned sequentially as new processes are created.

• Pointers to structures:
  – current directory
  – file descriptors
  – memory area descriptors
  – signals received and how to handle them.

• Pointers to related processes:
  – parent, youngest child, younger and older siblings.

• Usage limits: maximum CPU time, file size, memory, etc.

• Pointers linking the descriptor into the kernel's process data structures.

Data structures

• Conceptually, processes form a tree (see the pstree command): every process has exactly one parent process, except for the init process (ID 1), which is the root, and a process can have many children.

In practice, the kernel maintains several data structures:

1. Process list: a doubly linked circular list of pointers to process descriptors, with the swapper process (ID 0) at its head. Creating several such lists allows an efficient search for processes of a given type.

2. Running processes queue.

3. Process ID hash table with chaining: to efficiently access a process descriptor pointer given a process ID.

4. Task array containing pointers to process descriptors, as well as a doubly linked noncircular list of free entries: to efficiently add or remove processes.

5. Wait queues grouping processes waiting for an event or resource.
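
The process ID hash table with chaining (item 3) can be pictured with a small user-space sketch. This is not the kernel's code: struct toy_task, the table size PIDHASH_SZ, the modulo hash and the function names are all hypothetical, chosen only to show how colliding IDs share a bucket through a chain.

```c
#include <stddef.h>
#include <sys/types.h>

#define PIDHASH_SZ 8                 /* hypothetical table size, tiny on purpose */

/* Toy stand-in for a process descriptor (not the kernel's structure). */
struct toy_task {
    pid_t pid;
    struct toy_task *hash_next;      /* next descriptor in the same bucket */
};

static struct toy_task *pidhash[PIDHASH_SZ];

static unsigned pid_hashfn(pid_t pid)
{
    return (unsigned)pid % PIDHASH_SZ;
}

/* Insert a descriptor at the head of its bucket's chain. */
void hash_pid(struct toy_task *t)
{
    struct toy_task **bucket = &pidhash[pid_hashfn(t->pid)];
    t->hash_next = *bucket;
    *bucket = t;
}

/* Remove a descriptor from its chain, if it is there. */
void unhash_pid(struct toy_task *t)
{
    struct toy_task **pp = &pidhash[pid_hashfn(t->pid)];
    while (*pp != NULL && *pp != t)
        pp = &(*pp)->hash_next;
    if (*pp == t)
        *pp = t->hash_next;
}

/* Walk one chain only, instead of the whole process list. */
struct toy_task *find_task_by_pid(pid_t pid)
{
    struct toy_task *t = pidhash[pid_hashfn(pid)];
    while (t != NULL && t->pid != pid)
        t = t->hash_next;
    return t;
}
```

With this table size, IDs 3 and 11 land in the same bucket, and a lookup for either one walks a chain of at most two entries.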

Process ID

    #include <sys/types.h>   /* pid_t typedef */
    #include <stdio.h>
    #include <unistd.h>
    …
    pid_t id1, id2;
    id1 = getpid();    // returns ID of calling process
    id2 = getppid();   // returns ID of parent of calling process
    printf("process ID is %d\n", (int)id1);
    printf("parent process ID is %d\n", (int)id2);

• If you invoke this program several times, a different process ID is reported each time, since each invocation runs in a new process. However, if you invoke it every time from the same shell, the parent ID (that is, the process ID of the shell process) stays the same.

Active processes

• The ps command displays the processes that are running on your system.

    % ps
      PID TTY          TIME CMD
    21693 pts/8    00:00:00 bash
    21694 pts/8    00:00:00 ps

• This invocation shows two processes. The first, bash, is the shell running on this terminal. The second is the running instance of the ps program itself.

• The kill command, given a process ID, asks that process to terminate by sending it a SIGTERM signal (the default signal).
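
The same effect can be obtained from a program with the kill system call. The following is a minimal sketch, assuming the child keeps the default action for SIGTERM; the function name terminate_child_with_sigterm is hypothetical, and WTERMSIG is a standard macro that the slides do not mention.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes kill() under strict -std= modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that only waits for signals, send it SIGTERM, and return the
 * number of the signal that terminated it (or -1 on any failure). */
int terminate_child_with_sigterm(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                  /* child: sleep until a signal arrives */
    }

    if (kill(pid, SIGTERM) != 0)      /* what the kill command does by default */
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

A caller would expect the return value to equal SIGTERM, since the child never installs a handler for it.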

Process creation

• The DOS and Windows APIs contain the spawn family of functions, which take as an argument the name of a program to run and create a new process running that program.

• Linux doesn't contain a single function that does this in one step. Instead, it provides the fork function, which creates a child process that is a copy of its parent process, and a set of exec functions, which cause a process to cease being an instance of one program and instead become an instance of another program.

• To spawn a new process, first use fork to make a copy of the current process, then use exec to transform one of these processes into an instance of the program to spawn.

• fork and exit system calls are used to create a new process and terminate it.

• exec system calls are invoked to load a new program.

fork

• A new process is created by the kernel when an existing process calls the fork function. The process that invokes fork is called the parent process whereas the new process created by fork is called the child process.

    #include <sys/types.h>
    #include <unistd.h>
    …
    pid_t fork(void);

• The function is called once but returns twice.

• Returns 0 in the child process, the process ID of the new child in the parent, and -1 on error.

fork

• When a process calls fork, a duplicate process, called the child process, is created.

• Both child and parent continue executing with the instruction that follows the call to fork.

• How can we distinguish between them?

1. Call getpid: the child has a new process ID, distinct from the parent's.

2. Check the fork return value: it returns different values to the parent and the child (one process goes into the fork call and two come out). The return value in the parent is the process ID of its child. The return value in the child process is 0, and no process except the scheduler (the swapper process) has that ID.

Example of using fork

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>
    …
    pid_t pid;
    printf("main process ID is %d\n", (int)getpid());
    pid = fork();
    if (pid != 0)   // parent (note: pid is -1 here if fork failed)
    {
        printf("this is the parent process, with ID %d\n", (int)getpid());
        printf("the child's process ID is %d\n", (int)pid);
    }
    else            // child
        printf("this is the child process, with ID %d\n", (int)getpid());
    return 0;

Parent-child relationship

• Resources owned by the parent process are duplicated, and a copy is granted to the child process.

• Problem: process creation is inefficient, since it requires copying the entire address space (data space, heap, stack) of the parent process.

• In practice, the child process rarely needs to read or modify all the resources already owned by the parent; in many cases, it immediately issues exec and wipes out the address space.

Copy-on-write

Solution 1:

• Instead of performing a complete copy of the parent, copy-on-write is used: memory regions are shared by the parent and the child.

• Both the parent and the child read the same page frames, whose protection the kernel changes to read-only.

• If either process tries to write to a page frame, the kernel copies its contents into a new page frame that is assigned to the writing process.
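
Copy-on-write itself is invisible to a program; what can be observed is only its result, namely that a write in the child never changes the parent's data. The sketch below shows that observable behavior. The names counter and counter_after_child_write are hypothetical, and the comment about a private page assumes a kernel that implements fork with copy-on-write.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;               /* lives in the data segment */

/* The child changes its copy of counter and reports the new value as its
 * exit status. Returns the parent's value of counter afterwards and stores
 * the child's value in *child_value. Returns -1 on failure. */
int counter_after_child_write(int *child_value)
{
    counter = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter += 41;                /* first write: the child gets a private page */
        _exit(counter);               /* child's view of counter is now 42 */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *child_value = WEXITSTATUS(status);
    return counter;                   /* the parent's copy was never modified */
}
```

The parent still sees 1 while the child reported 42, whether the kernel copied the whole address space eagerly or only the one page that was written.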

vfork

Solution 2:

• The vfork system call creates a process that shares the memory address space of its parent. To prevent the parent from overwriting data needed by the child, the parent's execution is blocked until the child exits or executes a new program.

• Problem: a deadlock can occur if the child depends on a further action of the parent, because the parent is blocked.
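
A minimal sketch of the intended use of vfork, in which the child does nothing except execute a new program. The function name run_with_vfork and the choice of /bin/sh as the program are assumptions made for illustration; the feature-test macro is needed by glibc to declare vfork.

```c
#define _DEFAULT_SOURCE 1             /* glibc: declare vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "/bin/sh -c command" in a child created with vfork.
 * Returns the child's exit status, or -1 on failure. */
int run_with_vfork(const char *command)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: it shares the parent's memory, so it only calls exec or _exit. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                   /* reached only if exec failed */
    }

    /* The parent resumes here only after the child has called exec or _exit. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the parent is suspended until the exec, the child must not wait for anything the parent would do later; that is the deadlock risk noted above.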

Uses of fork

• When a process wants to duplicate itself so that the parent and child can each execute different sections of code at the same time. For example, in a network server the parent waits for a service request from a client. When the request arrives, the parent calls fork and lets the child handle the request, while the parent goes back to waiting for the next service request.

• When a process wants to execute a different program. In this case the child does an exec right after it returns from the fork.

Process termination

1. The exit system call: the kernel releases the resources owned by the process and sends the parent process a SIGCHLD signal.

2. Control flow reaches the last statement in the main procedure.

3. A signal whose action is to terminate the process is delivered.

How does the parent know whether its children have terminated, and whether they did so successfully?

• The wait system call allows a process to wait until one of its children terminates. It returns the process ID of the terminated child and stores the child's termination status through a pointer argument.

• When executing this system call, the kernel checks whether a child has already terminated; if not, the process waits until a child terminates.

wait

    #include <sys/types.h>
    #include <sys/wait.h>
    …
    pid_t wait(int *statloc);

• Returns the process ID, or -1 on error (no children).

• Can block the caller until a child process terminates.

• The status of the terminated process is checked by the macros:
  – WIFEXITED(status): true if the child terminated normally, in which case WEXITSTATUS(status) returns an integer, the exit status of the child.
  – WIFSIGNALED(status): true if the child terminated abnormally.
  – WIFSTOPPED(status): true if the child is currently stopped.
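
The macros above can be exercised with a short sketch that creates a child, lets it end in one of two ways, and decodes the status that wait stores. The function name describe_child_end and the message texts are hypothetical; WTERMSIG is a standard macro not listed on the slide.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that either exits with exit_code (when sig == 0) or kills
 * itself with signal sig, then describe in buf how it ended.
 * Returns 0 on success, -1 on failure. */
int describe_child_end(int exit_code, int sig, char *buf, size_t len)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);               /* abnormal termination */
        _exit(exit_code);             /* normal termination */
    }

    int status;
    if (wait(&status) != pid)         /* blocks until the child terminates */
        return -1;

    if (WIFEXITED(status))
        snprintf(buf, len, "exited normally, status %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "unexpected status");
    return 0;
}
```

Calling it with (3, 0, ...) takes the WIFEXITED branch; calling it with (0, SIGKILL, ...) takes the WIFSIGNALED branch.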

waitpid

    #include <sys/types.h>
    #include <sys/wait.h>
    …
    pid_t waitpid(pid_t pid, int *statloc, int options);

• Returns the process ID, 0, or -1 on error (the process is not its child, or doesn't exist).

• pid controls which child it waits for:
  – -1: waits for any child process, equivalent to wait.
  – positive: waits for the specified process.

• options optionally prevents the caller from blocking:
  – 0: block, as in wait.
  – WNOHANG: don't block; return 0 if no child has terminated yet.
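
A sketch of the non-blocking form: the parent polls with WNOHANG and is free to do other work between polls. The function name poll_for_child and the sleep intervals are arbitrary choices for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes nanosleep() under strict -std= modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a child that sleeps briefly and exits with exit_code. The parent polls
 * with WNOHANG instead of blocking. Returns the child's exit status, or -1 on
 * failure; *polls receives the number of times waitpid returned 0. */
int poll_for_child(int exit_code, int *polls)
{
    struct timespec nap = { 0, 10 * 1000 * 1000 };          /* 10 ms */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct timespec work = { 0, 100 * 1000 * 1000 };    /* 100 ms of "work" */
        nanosleep(&work, NULL);
        _exit(exit_code);
    }

    *polls = 0;
    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)                 /* child has terminated */
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (r < 0)                    /* error */
            return -1;
        (*polls)++;                   /* r == 0: child still running */
        nanosleep(&nap, NULL);        /* the parent could do useful work here */
    }
}
```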

Zombie state

• Design choice: allow a process to query the kernel for the ID of its parent, or for the state of any of its children by calling wait.

• Consequence: the data in the process descriptor cannot be discarded right after termination, only after the parent waits.

• A process that has terminated but whose parent has not yet waited for it is called a zombie (for example, a child that exits while its parent is still running and has not yet called wait).
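
The zombie state can be observed from the parent with a small sketch. It relies on two standard facilities that the slides do not cover, so treat it as an assumption-laden illustration: waitid with the WNOWAIT flag, which waits for termination without discarding the child, and kill with signal 0, which only checks whether a process ID still exists. The function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes kill(), waitid() and WNOWAIT */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a terminated child was still known to the kernel before the
 * parent waited for it and was gone afterwards, 0 otherwise, -1 on failure. */
int zombie_lingers_until_wait(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                                  /* child terminates at once */

    /* Block until the child has terminated, but leave it unreaped. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;

    /* Signal 0 sends nothing; it only tests that the process ID exists. */
    int existed_as_zombie = (kill(pid, 0) == 0);

    if (waitpid(pid, NULL, 0) != pid)              /* now reap the zombie */
        return -1;

    int gone = (kill(pid, 0) == -1 && errno == ESRCH);
    return existed_as_zombie && gone;
}
```

Between the two checks the child runs no code at all; only its descriptor is kept, so that the parent can still collect the exit status.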

init process

• It is good practice for the kernel to keep information on a child process until the parent issues its wait call, but suppose the parent process terminates without issuing that call.

• Problem: this information takes valuable memory that could be used for running processes. For example, many shells allow the user to start a command in the background and then log out. The process that is running the command shell terminates, but its children continue execution.

• Solution: process 1 (init) is created during system initialization by process 0 (swapper). When a process terminates, the kernel changes the appropriate process descriptor pointers of all the existing children of the terminated process to make them children of init (init becomes the parent process of any orphaned child process). This process monitors the execution of all its children and routinely issues the wait system call, whose side effect is to get rid of all zombies.
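
The re-parenting of an orphan can be watched with getppid. In the sketch below a child creates a grandchild and exits at once; the grandchild then reports who its parent has become. The function name and the pipe used for reporting are hypothetical. On a classic system the new parent is process 1; some modern Linux setups designate another process to adopt orphans, so the sketch only assumes that the parent ID changes.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes nanosleep() under strict -std= modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Stores the ID of the orphan's original parent in *old_parent and returns
 * the ID of the process that adopted the orphan, or -1 on failure. */
pid_t adopted_parent_of_orphan(pid_t *old_parent)
{
    int fd[2];
    if (pipe(fd) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        pid_t me = getpid();
        if (fork() == 0) {                         /* grandchild */
            struct timespec nap = { 0, 1000 * 1000 };   /* 1 ms */
            while (getppid() == me)                /* until we are re-parented */
                nanosleep(&nap, NULL);
            pid_t new_parent = getppid();
            if (write(fd[1], &new_parent, sizeof new_parent) < 0)
                _exit(1);
            _exit(0);
        }
        _exit(0);                                  /* child exits without waiting */
    }

    close(fd[1]);
    waitpid(child, NULL, 0);                       /* reap the short-lived child */

    pid_t new_parent = -1;
    if (read(fd[0], &new_parent, sizeof new_parent) != (ssize_t)sizeof new_parent)
        new_parent = -1;
    close(fd[0]);

    *old_parent = child;
    return new_parent;
}
```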

exec

• exec functions replace the process with a new program.

• When a process calls an exec function, it immediately ceases executing that program and begins executing a new program from the beginning.

• exec replaces the current process with a new program from disk, with the same process ID.

exec

#include <unistd.h>

…

int execv(const char *pathname, char *const argv[]);

• Returns -1 on error, no return on success.

Example using fork & exec

    int spawn(char* program, char** arg_list)
    {
        …
        pid = fork();                      // duplicate this process
        if (pid == 0)                      // child process
        {
            execvp(program, arg_list);     // execute program, returns only on error
            …                              // error message and exit
        }
        if (pid != 0)                      // parent process
        {
            waitpid(pid, &status, 0);
            …                              // exit status
        }
    }
    …
    char* arg_list[] = { "ls", "-l", "/", NULL };   // passed as the program's argument list
    spawn("ls", arg_list);                          // spawn a child process running a new program
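
The example above leaves the error handling and the exit status elided. One possible way to complete it is sketched here; the function name spawn_and_wait, the use of exit code 127 for a failed exec, and the -1 return for abnormal termination are assumptions, not part of the original example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run program with arg_list in a child process and wait for it.
 * Returns the child's exit status, or -1 if the child could not be created
 * or did not terminate normally. */
int spawn_and_wait(const char *program, char *const arg_list[])
{
    pid_t pid = fork();                    /* duplicate this process */
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                        /* child process */
        execvp(program, arg_list);         /* returns only on error */
        perror("execvp");
        _exit(127);                        /* assumed convention for exec failure */
    }

    int status;                            /* parent process */
    if (waitpid(pid, &status, 0) != pid) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the argument list { "ls", "-l", "/", NULL }, calling spawn_and_wait("ls", arg_list) lists the root directory and returns the exit status of ls.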

[Figure: timeline of spawning a program from a shell. Labels: shell (parent) process, fork, child process, execvp, program executes, exit, zombie process, wait, shell process.]

Race conditions

• Multiple processes perform an operation on shared data, and the outcome depends on the order in which they run.

• After a call to fork we cannot predict which process runs first.

    …
    if ((pid = fork()) < 0)
        handleError("fork error\n");
    else if (pid == 0)
        oneCharAtAtime("output from child\n");
    else
        oneCharAtAtime("output from parent\n");

    % a.out
    oouuttppuutt ffrroomm cphairlednt
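
The slides do not show oneCharAtAtime. A plausible sketch, assuming it writes to standard output with one write call per character, is given below; the return value is an addition made here so the function can be checked.

```c
#include <stddef.h>
#include <unistd.h>

/* Write str to standard output one byte per write() call, so the kernel may
 * switch to the other process between any two characters.
 * Returns the number of bytes written. */
size_t oneCharAtAtime(const char *str)
{
    size_t n = 0;
    for (const char *p = str; *p != '\0'; p++) {
        if (write(STDOUT_FILENO, p, 1) != 1)
            break;                    /* stop on the first failed write */
        n++;
    }
    return n;
}
```

Issuing a separate system call for each character is what makes the interleaving in the sample output possible; a single write of the whole string would be much less likely to be split.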

• A process can tell the other when it has finished a set of operations, and wait for the other to complete before continuing.

    if ((pid = fork()) < 0)
        handleError("fork error\n");
    else if (pid == 0)
    {
        …
        childTellsParentItsDone(getppid());   // for child to go first
        childWaitsForParent();                // for parent to go first
        ...
        exit(0);
    }
    …
    parentTellsChildItsDone(pid);   // for parent to go first
    parentWaitsForChild();          // for child to go first
    …
    exit(0);

Synchronize parent and child

• Using pipes:

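[Figure: two pipes connect the parent and the child. The parent tells the child it's done by writing "p" to pfd1[1], which the child reads from pfd1[0]. The child tells the parent it's done by writing "c" to pfd2[1], which the parent reads from pfd2[0].]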


    void childTellsParentItsDone(pid_t pid)
    {
        if (write(pfd2[1], "c", 1) != 1) handleError("write error");
    }

    void childWaitsForParent(void)
    {
        char c;
        if (read(pfd1[0], &c, 1) != 1) handleError("read error");   // blocking read
        if (c != 'p') handleErrorAndQuit("incorrect data");
    }

    void parentTellsChildItsDone(pid_t pid)
    {
        if (write(pfd1[1], "p", 1) != 1) handleError("write error");
    }

    void parentWaitsForChild(void)
    {
        char c;
        if (read(pfd2[0], &c, 1) != 1) handleError("read error");   // blocking read
        if (c != 'c') handleErrorAndQuit("incorrect data");
    }
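
The functions above assume that pfd1 and pfd2 were created with pipe before the call to fork, which the slides do not show. A self-contained sketch of the whole handshake follows, for the case where the parent goes first. The helper names tell and wait_for, the function parent_goes_first, and the third pipe out (used here only to record the order of the two outputs) are hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int pfd1[2], pfd2[2];      /* pfd1: parent -> child, pfd2: child -> parent */

static int tell(int fd, char c)
{
    return write(fd, &c, 1) == 1 ? 0 : -1;
}

static int wait_for(int fd, char want)            /* blocking read */
{
    char c;
    return (read(fd, &c, 1) == 1 && c == want) ? 0 : -1;
}

/* The parent writes its line first, then the child. Both lines go into the
 * pipe out and are copied to buf. Returns 0 on success, -1 on failure. */
int parent_goes_first(char *buf, size_t len)
{
    int out[2];
    if (len == 0 || pipe(pfd1) != 0 || pipe(pfd2) != 0 || pipe(out) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                               /* child */
        close(pfd1[1]); close(pfd2[0]); close(out[0]);
        if (wait_for(pfd1[0], 'p') != 0) _exit(1);        /* childWaitsForParent */
        if (write(out[1], "child\n", 6) != 6) _exit(1);
        if (tell(pfd2[1], 'c') != 0) _exit(1);            /* childTellsParentItsDone */
        _exit(0);
    }

    close(pfd1[0]); close(pfd2[1]);               /* parent */
    int ok = write(out[1], "parent\n", 7) == 7
          && tell(pfd1[1], 'p') == 0              /* parentTellsChildItsDone */
          && wait_for(pfd2[0], 'c') == 0;         /* parentWaitsForChild */
    close(pfd1[1]); close(pfd2[0]); close(out[1]);
    waitpid(pid, NULL, 0);

    size_t used = 0;
    ssize_t n;
    while (used < len - 1 && (n = read(out[0], buf + used, len - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(out[0]);
    return ok ? 0 : -1;
}
```

Each side closes the pipe ends it does not use, so that a process blocked in read sees end-of-file instead of hanging if its partner exits early.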

Exercise 3

• Write a program that executes another program.

• The program receives from the user an executable file name and arguments. The argument "." denotes that there are no more arguments ("." is not passed on to the executed program).

• Your program should execute the program in a new process, wait for its termination, and print the returned exit status.

    % ex-fork
    Enter program name: /bin/ls
    Enter argument 1: -a
    Enter argument 2: .
    Running the program </bin/ls -a>
    . .. .cshrc ex-fork.c ex-fork
    Program exited normally, exit status = 0
    % ex-fork
    Enter program name: /bin/ls
    Enter argument 1: xyz
    Enter argument 2: .
    Running the program </bin/ls xyz>
    Cannot access xyz: No such file or directory
    Program exited normally, exit status = 2

Exercise 3

• Chapter 4.8, pages 72-73, in Toledo's book.

• Submission: Monday, April 22nd.

• Software
  – Directory: ~username/os02b/ex-fork
  – Files: ex-fork.c
  – Permissions: chmod ugo+rx (to above)

• Hardcopy
  – name, ID, login, CID
  – ex-fork.c
  – submit in mailbox 281, Nir Neumark


Exercise 3 - notes

• Use fork and execv.

• Use waitpid for the process to wait for its child to terminate.

• If the child terminates normally (WIFEXITED returns true) then print its exit status (WEXITSTATUS returns integer).

References

• Chapter 8 on process control in Stevens book.

• Chapter 3 on processes in Bovet & Cesati, 2001, or a local copy of "The Linux Kernel" under: www.cs.tau.ac.il/~stoledo/os/tlk.pdf

Exercise 3 – answers to FAQ

• You can assume a maximum number of arguments for the program, as well as a maximum length of an argument name, provided you make a note of it in the hardcopy.
