
    February 2010

    Master of Computer Application (MCA) Semester 5

    MC0085 Advanced Operating Systems

    (Distributed Systems) 4 Credits

    (Book ID: B0967)

    Assignment Set 1 (60 Marks)

Answer all questions. Each question carries FIFTEEN marks.

    1. Explain the following:

    A) Distributed Computing System Models B) Advantages of Distributed Systems

    C) Distributed Operating Systems

(A) Distributed Computing System Models

Distributed computing system models can be broadly classified into five categories. They are:

    Minicomputer model

    Workstation model

    Workstation server model

    Processor pool model

Hybrid model

Minicomputer Model

The minicomputer model (Fig. 1.1) is a simple extension of the centralized time-sharing system. A distributed

    computing system based on this model consists of a few minicomputers (they may be large supercomputers as

    well) interconnected by a communication network. Each minicomputer usually has multiple users simultaneously

    logged on to it. For this, several interactive terminals are connected to each minicomputer. Each user is logged on

    to one specific minicomputer, with remote access to other minicomputers. The network allows a user to access

    remote resources that are available on some machine other than the one on to which the user is currently logged.

The minicomputer model may be used when resource sharing (such as sharing of information databases of different types, with each type of database located on a different machine) with remote users is desired. The early

    ARPAnet is an example of a distributed computing system based on the minicomputer model.


    Fig. 1.1: A Distributed Computing System based on Minicomputer Model

    Workstation Model

A distributed computing system based on the workstation model (Fig. 1.2) consists of several workstations interconnected by a communication network. An organization may have several workstations located throughout a

    building or campus, each workstation equipped with its own disk and serving as a single-user computer. It has

    been often found that in such an environment, at any one time a significant proportion of the workstations are idle

    (not being used), resulting in the waste of large amounts of CPU time. Therefore, the idea of the workstation

    model is to interconnect all these workstations by a high-speed LAN so that idle workstations may be used to

    process jobs of users who are logged onto other workstations and do not have sufficient processing power at their

    own workstations to get their jobs processed efficiently.

    Fig. 1.2: A Distributed Computing System based on Workstation Model

    Workstation Server Model

    The workstation model is a network of personal workstations, each with its own disk and a local file system. A

    workstation with its own local disk is usually called a diskful workstation and a workstation without a local disk is


    called a diskless workstation. With the proliferation of high-speed networks, diskless workstations have become

    more popular in network environments than diskful workstations, making the workstation-server model more

    popular than the workstation model for building distributed computing systems.

A distributed computing system based on the workstation-server model (Fig. 1.3) consists of a few minicomputers

and several workstations (most of which are diskless, but a few of which may be diskful) interconnected by a communication network.

    Fig. 1.3: A Distributed Computing System based on Workstation-server Model

Note that when diskless workstations are used on a network, the file system to be used by these workstations must be implemented either by a diskful workstation or by a minicomputer equipped with a disk for file storage. One or

    more of the minicomputers are used for implementing the file system. Other minicomputers may be used for

    providing other types of services, such as database service and print service. Therefore, each minicomputer is used

as a server machine to provide one or more types of services. Hence, in the workstation-server model, in

    addition to the workstations, there are specialized machines (may be specialized workstations) for running server

processes (called servers) for managing and providing access to shared resources. For a number of reasons, such as higher reliability and better scalability, multiple servers are often used for managing the resources of a particular

type in a distributed computing system. For example, there may be multiple file servers, each running on a separate minicomputer and cooperating via the network, for managing the files of all the users in the system. Due

    to this reason, a distinction is often made between the services that are provided to clients and the servers that

    provide them. That is, a service is an abstract entity that is provided by one or more servers. For example, one or

    more file servers may be used in a distributed computing system to provide file service to the users.

    In this model, a user logs onto a workstation called his or her home workstation. Normal computation activities

    required by the user's processes are performed at the user's home workstation, but requests for services provided

    by special servers (such as a file server or a database server) are sent to a server providing that type of service that

performs the user's requested activity and returns the result of request processing to the user's workstation. Therefore, in this model, the user's processes need not migrate to the server machines to get the work done

    by those machines.

Processor Pool Model

The processor-pool model is based on the observation that most of the time a user does not need any computing

    power but once in a while the user may need a very large amount of computing power for a short time (e.g., when

    recompiling a program consisting of a large number of files after changing a basic shared declaration). Therefore,

    unlike the workstation-server model in which a processor is allocated to each user, in the processor-pool model the


    processors are pooled together to be shared by the users as needed. The pool of processors consists of a large

    number of microcomputers and minicomputers attached to the network. Each processor in the pool has its own

    memory to load and run a system program or an application program of the distributed computing system.

    Fig. 1.4: A distributed computing system based on the processor-pool model

    Hybrid Model

Out of the four models described above, the workstation-server model is the most widely used model for building

    distributed computing systems. This is because a large number of computer users only perform simple interactive

    tasks such as editing jobs, sending electronic mails, and executing small programs. The workstation-server model

is ideal for such simple usage. However, in a working environment that has groups of users who often perform jobs needing massive computation, the processor-pool model is more attractive and suitable.

    (B)Advantages of Distributed Systems

    From the models of distributed computing systems presented above, it is obvious that distributed computing

    systems are much more complex and difficult to build than traditional centralized systems (those consisting of a

    single CPU, its memory, peripherals, and one or more terminals). The increased complexity is mainly due to the

    fact that in addition to being capable of effectively using and managing a very large number of distributed

resources, the system software of a distributed computing system should also be capable of handling the communication and security problems that are very different from those of centralized systems. For example, the

    performance and reliability of a distributed computing system depends to a great extent on the performance and

    reliability of the underlying communication network. Special software is usually needed to handle loss of

    messages during transmission across the network or to prevent overloading of the network, which degrades the

    performance and responsiveness to the users. Similarly, special software security measures are needed to protect

    the widely distributed shared resources and services against intentional or accidental violation of access control

    and privacy constraints.


Despite the increased complexity and the difficulty of building distributed computing

    systems, the installation and use of distributed computing systems are rapidly increasing. This is mainly because

    the advantages of distributed computing systems outweigh their disadvantages. The technical needs, the economic

pressures, and the major advantages that have led to the emergence and popularity of distributed computing systems are described next.

    (C) Distributed Operating Systems

    Tanenbaum and Van Renesse define an operating system as a program that controls the resources of a computer

    system and provides its users with an interface or virtual machine that is more convenient to use than the bare

    machine. According to this definition, the two primary tasks of an operating system are as follows:

    1. To present users with a virtual machine that is easier to program than the underlying hardware.

    2. To manage the various resources of the system. This involves performing such tasks as keeping track of who is

    using which resource, granting resource requests, accounting for resource usage, and mediating conflicting

requests from different programs and users.

Therefore, the users' view of a computer system, the manner in which the users access the various resources of the

    computer system, and the ways in which the resource requests are granted depend to a great extent on the

operating system of the computer system. The operating systems commonly used for distributed computing systems can be broadly classified into two types: network operating systems and distributed operating systems.

    The three most important features commonly used to differentiate between these two types of operating systems

are system image, autonomy, and fault tolerance capability. These features are given below:

System image: Under a network OS, the user views the distributed system as a collection of machines connected by a communication subsystem, i.e. the user is aware of the fact that multiple computers are used. A distributed OS hides the existence of multiple computers and provides a single system image to the users.

Autonomy: A network OS is built on a set of existing centralized OSs and handles the interfacing and coordination of remote operations and communications between these OSs. So, in this case, each machine has its own OS. With a distributed OS, there is a single system-wide OS and each computer runs part of this global OS.

    Fault tolerance capability: A network operating system provides little or no fault tolerance capability in the sense

    that if 10% of the machines of the entire distributed computing system are down at any moment, at least 10% of

    the users are unable to continue with their work. On the other hand, with a distributed operating system, most of

the users are normally unaffected by the failed machines and can continue to perform their work normally, with only a 10% loss in performance of the entire distributed computing system.

    Therefore, the fault tolerance capability of a distributed operating system is usually very high as compared to that

of a network operating system.

    2. Explain the following in the Context of Message Passing:

    A) Synchronization B) Buffering C) Process Addressing

    Ans:

    (A) Synchronization

    A major issue in communication is the synchronization imposed on the communicating processes by the

communication primitives. There are two types of communication primitives: Blocking Semantics and Non-

    Blocking Semantics.

    Blocking Semantics: A communication primitive is said to have blocking semantics if its invocation

    blocks the execution of its invoker (for example in the case of send, the sender blocks until it receives an

    acknowledgement from the receiver.)


    Non-blocking Semantics: A communication primitive is said to have non-blocking semantics if its

    invocation does not block the execution of its invoker.

    The synchronization imposed on the communicating processes basically depends on one of the two types of

semantics used for the send and receive primitives.

Blocking and Non-Blocking Primitives

    Blocking Send Primitive: In this case, after execution of the send statement, the sending process is

    blocked until it receives an acknowledgement from the receiver that the message has been received.

    Non-Blocking Send Primitive: In this case, after execution of the send statement, the sending process is

    allowed to proceed with its execution as soon as the message is copied to the buffer.

    Blocking Receive Primitive: In this case, after execution of the receive statement, the receiving process

    is blocked until it receives a message.

    Non-Blocking Receive Primitive: In this case, the receiving process proceeds with its execution after the

    execution of receive statement, which returns the control almost immediately just after telling the kernel

    where the message buffer is.

    Handling non-blocking receives: The following are the two ways of doing this:

    Polling: a test primitive is used by the receiver to check the buffer status

Interrupt: When a message is filled in the buffer, a software interrupt is used to notify the receiver.

    However, user level interrupts make programming difficult.
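As a rough illustration of these semantics (not part of the course text), the following Python sketch contrasts a blocking receive with a polling-based non-blocking receive; the names mailbox, blocking_receive, and poll_receive are hypothetical, and the queue simply stands in for the kernel's message buffer.

```python
import queue
import threading
import time

# A shared buffer standing in for the kernel's message buffer (hypothetical).
mailbox = queue.Queue()

def blocking_receive():
    # Blocking semantics: suspend the caller until a message is available.
    return mailbox.get(block=True)

def poll_receive():
    # Non-blocking semantics: return a message if one is buffered, else None.
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None

def sender():
    time.sleep(0.1)          # simulate network / processing delay
    mailbox.put("hello")     # non-blocking send: copy message to buffer and continue

threading.Thread(target=sender).start()

print(poll_receive())        # likely None: the message has not arrived yet
print(blocking_receive())    # blocks until "hello" is delivered
```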

(B) Buffering

The transmission of messages from one process to another can be done by copying the body of the message from

the sender's address space to the receiver's address space. In some cases, the receiving process may not be ready

    to receive the message but it wants the operating system to save that message for later reception. In such cases, the

operating system would rely on the receiver's buffer space, in which the transmitted messages can be stored prior

to the receiving process executing specific code to receive the message.

    The synchronous and asynchronous modes of communication correspond to the two extremes of buffering: a null

    buffer, or no buffering, and a buffer with unbounded capacity. Two other commonly used buffering strategies are

single-message and finite-bound, or multiple-message, buffers. These four types of buffering strategies are given below:

No buffering: In this case, the message remains in the sender's address space until the receiver executes the

    corresponding receive.

    Single message buffer: A buffer to hold a single message at the receiver side is used. It is used for

    implementing synchronous communication because in this case an application can have only one

    outstanding message at any given time.

Unbounded-capacity buffer: Convenient for supporting asynchronous communication. However, it is

impossible to support a buffer of unbounded capacity in practice.

    Finite-Bound Buffer: Used for supporting asynchronous communication.

    Buffer overflow can be handled in one of the following ways:

Unsuccessful communication: send returns an error message to the sending process, indicating that the message could not be delivered to the receiver because the buffer is full.

    Flow-controlled communication: The sender is blocked until the receiver accepts some messages. This

violates the semantics of asynchronous send. It may also result in communication deadlock.
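The buffering strategies and overflow policies above can be sketched in a few lines. The following Python fragment is illustrative only (FiniteBoundBuffer is a hypothetical class, not an operating-system API); it shows a finite-bound buffer that either reports an error on overflow (unsuccessful communication) or blocks the sender (flow-controlled communication).

```python
import queue

class FiniteBoundBuffer:
    """Finite-bound message buffer (illustrative sketch, not a real kernel API)."""

    def __init__(self, capacity, flow_controlled=False):
        self.buf = queue.Queue(maxsize=capacity)
        self.flow_controlled = flow_controlled

    def send(self, msg):
        if self.flow_controlled:
            # Flow-controlled: block the sender until the receiver frees a slot.
            # Note: this violates asynchronous-send semantics and can deadlock.
            self.buf.put(msg, block=True)
            return True
        try:
            self.buf.put_nowait(msg)
            return True
        except queue.Full:
            # Unsuccessful communication: the sender learns the buffer is full.
            return False

    def receive(self):
        return self.buf.get()
```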


Message data should be meaningful to the receiving process. Ideally, this implies that the structure of the program objects should be preserved while they are being transmitted from the address space of the sending process to the address space of the receiving process. This is not possible in heterogeneous systems, in which the sending and receiving processes run on computers of different architectures. Even in homogeneous systems, it is very difficult to achieve this goal, mainly for two reasons:

1. An absolute pointer value has no meaning (more on this when we talk about RPC); for example, a pointer to a tree or linked list. So, proper encoding mechanisms should be adopted to pass such objects.

2. Different program objects, such as integers, long integers, short integers, and character strings, occupy different amounts of storage space. So, from the encoding of these objects, the receiver should be able to identify the type and size of the objects.

One of the following two representations may be used for the encoding and decoding of message data:

1. Tagged representation: The type of each program object as well as its value is encoded in the message. In this method, it is a simple matter for the receiving process to check the type of each program object in the message because of the self-describing nature of the coded data format.

2. Untagged representation: The message contains only program objects; no information is included in the message about the type of each program object. In this method, the receiving process should have prior knowledge of how to decode the received data because the coded data format is not self-describing.
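A minimal sketch of the two representations, assuming a message that carries either an integer or a string (the tag values and helper names below are made up for illustration):

```python
import struct

# Tagged representation: each object is encoded as (type tag, value), so the
# receiver can decode it without prior knowledge of the message layout.
TAG_INT, TAG_STR = 1, 2

def encode_tagged(obj):
    if isinstance(obj, int):
        return struct.pack("!Bi", TAG_INT, obj)                  # 1-byte tag + 4-byte int
    data = obj.encode("utf-8")
    return struct.pack("!BH", TAG_STR, len(data)) + data         # tag + length + bytes

def decode_tagged(buf):
    tag = buf[0]
    if tag == TAG_INT:
        return struct.unpack("!i", buf[1:5])[0]
    length = struct.unpack("!H", buf[1:3])[0]
    return buf[3:3 + length].decode("utf-8")

# Untagged representation: only the value is sent; the receiver must already
# know that this message carries a single 4-byte integer.
def encode_untagged_int(n):
    return struct.pack("!i", n)

def decode_untagged_int(buf):
    return struct.unpack("!i", buf)[0]
```

With the tagged form the receiver inspects the tag to decode; with the untagged form it must already know the message layout.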

(C) Process Addressing

    A message passing system generally supports two types of addressing:

    Explicit Addressing: The process with which communication is desired is explicitly specified as a

    parameter in the communication primitive. e.g. send (pid, msg), receive (pid, msg).

    Implicit Addressing: A process does not explicitly name a process for communication. For example, a

process can specify a service instead of a process. e.g. send any (service id, msg), receive any (pid, msg).

    Methods for process addressing:

    machine id@local id: UNIX uses this form of addressing (IP address, port number).

    Advantages: No global coordination needed for process addressing. Disadvantages: Does not allow process

    migration.

machine id1@local id@machine id2: machine id1 identifies the node on which the process is created. local id is generated by the node on which the process is created.

    machine id2 identifies the last known location of the process. When a process migrates to another node, the

link information (the machine id to which the process migrates) is left with the current machine. This information is used for forwarding messages to migrated processes.

    Disadvantages:

    Overhead involved in locating a process may be large.

    If the node on which the process was executing is down, it may not be possible to locate the process.
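A small sketch of the link-based addressing scheme described above, with hypothetical node names and a per-node forwarding table standing in for the link information left behind on migration:

```python
# Illustrative sketch of link-based process addressing (names are hypothetical).
# Each node keeps a forwarding table: local process id -> node where it now runs.
forwarding = {
    "nodeA": {"p1": "nodeC"},   # p1 was created on nodeA but has migrated to nodeC
    "nodeC": {"p1": None},      # None means the process runs here
}

def deliver(node, local_id, msg, hops=0):
    """Follow the chain of forwarding links until the current host is found."""
    next_node = forwarding[node].get(local_id)
    if next_node is None:
        print(f"{msg!r} delivered to {local_id} on {node} after {hops} hop(s)")
        return
    # Overhead: every migration adds one more hop; if 'node' is down, the
    # chain is broken and the process cannot be located.
    deliver(next_node, local_id, msg, hops + 1)

deliver("nodeA", "p1", "hello")   # forwarded nodeA -> nodeC
```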

    3. Explain the following algorithms with respect to distributed Shared Memory:

    A) Centralized Server Algorithm B) Fixed Distributed Server Algorithm

    C) Dynamic Distributed Server Algorithm

    Ans:

    (A) Centralized-Server Algorithm


    A central server maintains a block table containing owner-node and copy-set information for each block.

    When a read/write fault for a block occurs at node N, the fault handler at node N sends a read/write request to

    the central server. Upon receiving the request, the central-server does the following:

    If it is a read request:

    adds N to the copy-set field and

    sends the owner node information to node N

    upon receiving this information, N sends a request for the block to the owner node.

    upon receiving this request, the owner returns a copy of the block to N.

    If it is a write request:

    It sends the copy-set and owner information of the block to node N and initializes copy-set to {N}

Node N sends a request for the block to the owner node and an invalidation message to all nodes in the

    copy-set.

    Upon receiving this request, the owner sends the block to node N
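A compact sketch of the central server's fault handling, under the assumption that the block table is a simple dictionary and that the actual block transfer and invalidation messages are handled elsewhere (all names here are hypothetical):

```python
# Illustrative sketch of the centralized-server algorithm; the block table and
# returned "messages" are placeholders, not a real DSM implementation.
block_table = {
    # block_id: {"owner": node, "copy_set": set of nodes holding read copies}
    7: {"owner": "node1", "copy_set": {"node2"}},
}

def handle_fault(block_id, requester, is_write):
    entry = block_table[block_id]
    owner = entry["owner"]
    if not is_write:
        # Read fault: record the new reader and point it at the owner,
        # which will return a copy of the block.
        entry["copy_set"].add(requester)
        return {"owner": owner}
    # Write fault: hand the copy-set to the requester so it can invalidate
    # stale copies, and initialize the copy-set to {requester}.  The requester
    # becomes the new owner once it receives the block (implied above).
    old_copy_set = entry["copy_set"]
    entry["copy_set"] = {requester}
    entry["owner"] = requester
    return {"owner": owner, "invalidate": old_copy_set}
```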

    (B) Fixed Distributed-Server Algorithm

Under this scheme:

Several nodes have block managers; each block manager manages a predetermined set of blocks.

When a read/write fault occurs, a request for the block is sent to the corresponding block manager.

Upon receiving this request, the actions taken by the block manager are similar to those of the central-server approach.

(C) Dynamic Distributed Server Algorithm

Under this approach, there is no block manager. Each node maintains information about the probable owner of

each block, and also the copy-set information for each block for which it is an owner. When a block fault occurs,

    the fault handler sends a request to the probable owner of the block.

    Upon receiving the request

    if the receiving node is not the owner, it forwards the request to the probable owner of the block

    according to its table.

if the receiving node is the owner, then:

o If the request is a read request, it adds the entry N to the copy-set field of the entry

    corresponding to the block and sends a copy of the block to node N.

    o If the request is a write request, it sends the block and copy-set information to the node N and

    deletes the entry corresponding to the block from its block table.

    o Node N, upon receiving the block, sends invalidation request to all nodes in the copy-set, and

updates its block table to reflect the fact that it is the owner of the block.
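The probable-owner chasing can be sketched as follows; the tables and node names are hypothetical, and a real implementation would also update the probable-owner hints at the forwarding nodes:

```python
# Illustrative sketch of the dynamic distributed-server algorithm.  Each node
# keeps a "probable owner" hint per block; requests are forwarded along these
# hints until the true owner is reached.
prob_owner = {
    "node1": {7: "node2"},
    "node2": {7: "node3"},
    "node3": {7: None},          # None: node3 really owns block 7
}
copy_sets = {"node3": {7: {"node1"}}}

def request_block(node, block_id, requester, is_write):
    hint = prob_owner[node][block_id]
    if hint is not None:
        # Not the owner: forward the request along the probable-owner chain.
        return request_block(hint, block_id, requester, is_write)
    if not is_write:
        copy_sets[node][block_id].add(requester)
        return "copy of block sent to " + requester
    # Write request: ship the block and copy-set to the requester, which
    # becomes the new owner and invalidates the old copies.
    cs = copy_sets[node].pop(block_id)
    prob_owner[node][block_id] = requester       # leave a hint behind
    return {"new_owner": requester, "invalidate": cs}

print(request_block("node1", 7, "node1", is_write=False))   # chased node1 -> node2 -> node3
```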

    4. Describe the Clock Synchronization Algorithms and Distributed Algorithms in the context of

    Synchronization.

    Ans:

    Clock Synchronization Algorithms

    Clock synchronization algorithms may be broadly classified as Centralized and Distributed:

    Centralized Algorithms

In centralized clock synchronization algorithms, one node has a real-time receiver. This node is called the time

server node, and its clock time is regarded as correct and used as the reference time. The goal of these


    algorithms is to keep the clocks of all other nodes synchronized with the clock time of the time server node.

    Depending on the role of the time server node, centralized clock synchronization algorithms are again of two

types: Passive Time Server and Active Time Server.

1. Passive Time Server Centralized Algorithm: In this method each node periodically sends a message (time = ?) to the time server. When the time server receives the message, it quickly responds with a message (time = T), where T is the current time in the clock of the time server node. Assume that when the client node sends the time = ? message, its clock time is T0, and when it receives the time = T message, its clock time is T1. Since T0 and T1 are measured using the same clock, in the absence of any other information, the best estimate of the time required for the propagation of the message time = T from the time server node to the client's node is (T1 - T0)/2. Therefore, when the reply is received at the client's node, its clock is readjusted to T + (T1 - T0)/2 (a worked example of this adjustment appears below).

2. Active Time Server Centralized Algorithm: In this approach, the time server periodically broadcasts its clock time (time = T). The other nodes receive the broadcast message and use the clock time in the message for correcting their own clocks. Each node has a priori knowledge of the approximate time (Ta) required for the propagation of the message time = T from the time server node to its own node. Therefore, when a broadcast message is received at a node, the node's clock is readjusted to the time T + Ta. A major drawback of this method is that it is not fault tolerant. If the broadcast message reaches a node too late due to some communication fault, the clock of that node will be readjusted to an incorrect value. Another disadvantage of this approach is that it requires a broadcast facility to be supported by the network.

Another active time server algorithm that overcomes the drawbacks of the above algorithm is the Berkeley algorithm, proposed by Gusella and Zatti for internal synchronization of the clocks of a group of computers running Berkeley UNIX. In this algorithm, the time server periodically sends a message (time = ?) to all the computers in the group. On receiving this message, each computer sends back its clock value to the time server. The time server has a priori knowledge of the approximate time required for the propagation of a message from each node to its own node. Based on this knowledge, it first readjusts the clock values of the reply messages. It then takes a fault-tolerant average of the clock values of all the computers (including its own). To take the fault-tolerant average, the time server chooses a subset of all clock values that do not differ from one another by more than a specified amount, and the average is taken only for the clock values in this subset. This approach eliminates readings from unreliable clocks whose clock values could have a significant adverse effect if an ordinary average were taken. The calculated average is the current time to which all the clocks should be readjusted. The time server readjusts its own clock to this value. Instead of sending the calculated current time back to the other computers, the time server sends the amount by which each individual computer's clock requires adjustment. This can be a positive or negative value and is calculated based on the knowledge the time server has about the approximate time required for the propagation of a message from each node to its own node.
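As a worked example of the passive-time-server adjustment (the numbers are invented purely for illustration):

```python
# Worked example of the passive-time-server clock adjustment (made-up values).
T0 = 10.000   # client clock when the "time = ?" request was sent
T1 = 10.020   # client clock when the reply arrived
T  = 10.515   # server time carried in the reply

propagation = (T1 - T0) / 2          # best estimate of one-way delay: 0.010
adjusted    = T + propagation        # client clock is set to 10.525
print(propagation, adjusted)
```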

    Centralized clock synchronization algorithms suffer from two major drawbacks:

1. They are subject to a single point of failure. If the time server node fails, the clock synchronization operation cannot be performed. This makes the system unreliable. Ideally, a distributed system should be

    more reliable than its individual nodes. If one goes down, the rest should continue to function correctly.

    2. From a scalability point of view it is generally not acceptable to get all the time requests serviced by a

    single time server. In a large system, such a solution puts a heavy burden on that one process.

Distributed Algorithms

We know that externally synchronized clocks are also internally synchronized. That is, if each node's clock is

    independently synchronized with real time, all the clocks of the system remain mutually synchronized.

Therefore, a simple method for clock synchronization may be to equip each node of the system with a real-time receiver so that each node's clock can be independently synchronized with real time. Multiple real-time


    clocks (one for each node) are normally used for this purpose. Theoretically, internal synchronization of

    clocks is not required in this approach. However, in practice, due to inherent inaccuracy of real-time clocks,

different real-time clocks may show different times. Therefore, internal synchronization is normally performed

    for better accuracy. One of the following two approaches is used for internal synchronization in this case.

1. Global Averaging Distributed Algorithms: In this approach, the clock process at each node broadcasts its local clock time in the form of a special resync message when its local time equals T0 + iR for some integer i, where T0 is a fixed time in the past agreed upon by all nodes and R is a system parameter that

    depends on such factors as the total number of nodes in the system, the maximum allowable drift rate,

    and so on. i.e. a resync message is broadcast from each node at the beginning of every fixed length

resynchronization interval. However, since the clocks of different nodes run at slightly different rates, these broadcasts will not happen simultaneously from all nodes. After broadcasting the clock value, the clock

    process of a node waits for time T, where T is a parameter to be determined by the algorithm. During this

    waiting period, the clock process records the time, according to its own clock, when the message was

    received. At the end of the waiting period, the clock process estimates the skew of its clock with respect

to each of the other nodes on the basis of the times at which it received resync messages. It then computes a fault-tolerant average of the estimated skews and uses it to correct the local clock before the start of the next resynchronization interval.

The global averaging algorithms differ mainly in the manner in which the fault-tolerant average of the

estimated skews is calculated. Two commonly used algorithms are:

1. The simplest algorithm is to take the average of the estimated skews and use it as the correction for the local clock. However, to limit the

    impact of faulty clocks on the average value, the estimated skew with respect to each node is compared

    against a threshold, and skews greater than the threshold are set to zero before computing the average of

the estimated skews.

2. In another algorithm, each node limits the impact of faulty clocks by first

    discarding the m highest and m lowest estimated skews and then calculating the average of the remaining

    skews, which is then used as the correction for the local clock. The value of m is usually decided based

    on the total number of clocks (nodes).
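The two averaging variants can be written down directly; the sample skew values below are invented for illustration:

```python
def threshold_average(skews, threshold):
    # Variant 1: skews larger than the threshold are treated as faulty and
    # set to zero before averaging.
    cleaned = [s if abs(s) <= threshold else 0.0 for s in skews]
    return sum(cleaned) / len(cleaned)

def trimmed_average(skews, m):
    # Variant 2: discard the m highest and m lowest estimated skews,
    # then average the rest.
    kept = sorted(skews)[m:len(skews) - m]
    return sum(kept) / len(kept)

skews = [0.004, -0.002, 0.003, 0.250, -0.001]    # 0.250 looks like a faulty clock
print(threshold_average(skews, threshold=0.05))  # correction ignoring the outlier
print(trimmed_average(skews, m=1))
```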


    February 2010

    Master of Computer Application (MCA) Semester 5

    MC0085 Advanced Operating Systems

    (Distributed systems) 4 Credits

    (Book ID: B0967)

    Assignment Set 2 (60 Marks)

Answer all questions. Each question carries FIFTEEN marks.

    1. Describe the following:

    A) Task assignment Approach B) Load Balancing Approach

    C) Load Sharing Approach

    A) Task assignment Approach

Each process is viewed as a collection of tasks. These tasks are scheduled to suitable processors to improve

    performance. This is not a widely used approach because:

    It requires characteristics of all the processes to be known in advance.

    This approach does not take into consideration the dynamically changing state of the system.

    In this approach, a process is considered to be composed of multiple tasks and the goal is to find an optimal

assignment policy for the tasks of an individual process. Typical goals of the task assignment approach are:

    Minimize IPC cost (this problem can be modeled using network flow model)

    Efficient resource utilization

    Quick turnaround time

    A high degree of parallelism

    B) Load Balancing Approach

In this, the processes are distributed among nodes to equalize the load among all nodes. The scheduling algorithms that use this approach are known as Load Balancing or Load Leveling Algorithms. These algorithms are based on

    the intuition that for better resource utilization, it is desirable for the load in a distributed system to be balanced

evenly. Thus, a load balancing algorithm tries to balance the total system load by transparently transferring the

workload from heavily loaded nodes to lightly loaded nodes in an attempt to ensure good overall performance relative to some specific metric of system performance.

    We can have the following categories of load balancing algorithms:


    Static: Ignore the current state of the system. E.g. if a node is heavily loaded, it picks up a task randomly

    and transfers it to a random node. These algorithms are simpler to implement but performance may not be

    good.

Dynamic: Use the current state information for load balancing. There is an overhead involved in collecting state information periodically, but they generally perform better than static algorithms.

    Deterministic: Algorithms in this class use the processor and process characteristics to allocate processes

    to nodes.

    Probabilistic: Algorithms in this class use information regarding static attributes of the system such as

    number of nodes, processing capability, etc.

    Centralized: System state information is collected by a single node. This node makes all scheduling

    decisions.

    Distributed: Most desired approach. Each node is equally responsible for making scheduling decisions

    based on the local state and the state information received from other sites.

    Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, the distributed entities

    cooperate with each other to make scheduling decisions. Therefore they are more complex and involve

larger overhead than non-cooperative ones. But the stability of a cooperative algorithm is better than that of a

    non-cooperative one.

    Non-Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, individual entities

    act as autonomous entities and make scheduling decisions independently of the action of other entities.

    C) Load Sharing Approach

    Several researchers believe that load balancing, with its implication of attempting to equalize workload on all the

    nodes of the system, is not an appropriate objective. This is because the overhead involved in gathering the state

    information to achieve this objective is normally very large, especially in distributed systems having a large

number of nodes. In fact, for the proper utilization of resources of a distributed system, it is not required to balance the load on all the nodes. It is necessary and sufficient to prevent the nodes from being idle while some other

nodes have more than two processes. This rectification is called Dynamic Load Sharing instead of Dynamic

    Load Balancing.

The design of load-sharing algorithms requires that proper decisions be made regarding load estimation policy, process transfer policy, state information exchange policy, priority assignment policy, and migration limiting

    policy. It is simpler to decide about most of these policies in case of load sharing, because load sharing algorithms

    do not attempt to balance the average workload of all the nodes of the system. Rather, they only attempt to ensure

that no node is idle when a node is heavily loaded. The priority assignment policies and the migration limiting

policies for load-sharing algorithms are the same as those of load-balancing algorithms.
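A toy sketch of a load-sharing transfer decision, assuming a simple queue-length load estimation policy and a fixed threshold (the threshold value and node names are hypothetical):

```python
# Illustrative load-sharing transfer policy: a node is a candidate sender when
# its queue length exceeds a threshold, and a candidate receiver when it is idle.
THRESHOLD = 2

loads = {"node1": 5, "node2": 0, "node3": 1}     # number of ready processes per node

def pick_transfer(loads):
    senders   = [n for n, load in loads.items() if load > THRESHOLD]
    receivers = [n for n, load in loads.items() if load == 0]
    if senders and receivers:
        return senders[0], receivers[0]           # transfer one process
    return None

print(pick_transfer(loads))    # ('node1', 'node2'): no node stays idle
```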

    2. Write about:

    A) Processes B) Process Management

    C) Process Migration Mechanisms

    A) Processes

    The term "process" was first used by the designers of the MULTICS in 1960's. Since then, the term process is used

    somewhat interchangeably with 'task' or 'job'. The process has been given many definitions, for instance:A program in Execution.

    An asynchronous activity.


    The 'animated spirit' of a procedure in execution.

    The entity to which processors are assigned.

    The 'dispatchable' unit.

Many more definitions have been given. As we can see from the above, there is no universally agreed-upon definition, but the definition "Program in Execution" seems to be the most frequently used. And this is the concept

    used in the present study of operating systems.

Now that we have agreed upon the definition of a process, the question is what the relation between a process and a program is. Is it the same beast with a different name, so that when this beast is sleeping (not executing) it is called a program and when it is executing it becomes a process? Well, to be very precise, a process is not the same as a program. In the following discussion we point out some of the differences between a process and a program. A process is more than the program code. A process is an 'active' entity as opposed to a program, which is

    considered to be a 'passive' entity. As we all know a program is an algorithm expressed in some suitable notation,

(e.g., a programming language). Being passive, a program is only a part of a process. A process, on the other hand, includes:

Current value of the Program Counter (PC)

Contents of the processor's registers

    Value of the variables

The process stack (SP), which typically contains temporary data such as subroutine parameters, return addresses, and temporary variables.

    A data section that contains global variables.

A process is the unit of work in a system. In the process model, all software on the computer is organized into a

    number of sequential processes. A process includes PC, registers, and variables. Conceptually, each process has its

    own virtual CPU. In reality, the CPU switches back and forth among processes. (The rapid switching back and

    forth is called multiprogramming).

    B) Process Management

    In a conventional (or centralized) operating system, process management deals with mechanisms and policies for

    sharing the processor of the system among all processes. In a Distributed Operating system, the main goal of

    process management is to make the best possible use of the processing resources of the entire system by sharing

    them among all the processes. Three important concepts are used in distributed operating systems to achieve this

goal:

1. Processor Allocation: It deals with the process of deciding which process should be assigned to which

    processor.

    2. Process Migration: It deals with the movement of a process from its current location to the processor to which

    it has been assigned.

    3. Threads: They deal with fine-grained parallelism for better utilization of the processing capability of the

    system.

    C) Process Migration Mechanisms


    Migration of a process is a complex activity that involves proper handling of several sub-activities in order to meet

    the requirements of a good process migration mechanism. The four major subactivities involved in process

    migration are as follows:

1. Freezing the process on its source node and restarting it on its destination node.

2. Transferring the process address space from its source node to its destination node.

3. Forwarding messages meant for the migrant process.

    4. Handling communication between cooperating processes that have been separated as a result of process

    migration.

    3. Describe:

    A) Stateful Versus Stateless Servers B) Replication

    C) Caching

    A) Stateful Versus Stateless Servers

    The file servers that implement a distributed file service can be stateless or Stateful. Stateless file servers do not

    store any session state. This means that every client request is treated independently, and not as a part of a new or

    existing session. Stateful servers, on the other hand, do store session state. They may, therefore, keep track of

    which clients have opened which files, current read and write pointers for files, which files have been locked by

    which clients, etc.

The main advantage of stateless servers is that they can easily recover from failure. Because there is no state that must be restored, a failed server can simply restart after a crash and immediately provide services to clients as

though nothing happened. Furthermore, if clients crash, the server is not stuck with abandoned open or locked

    files. Another benefit is that the server implementation remains simple because it does not have to implement the

    state accounting associated with opening, closing, and locking of files.

    The main advantage of Stateful servers, on the other hand, is that they can provide better performance for clients.

Because clients do not have to provide full file information every time they perform an operation, the size of messages to and from the server can be significantly decreased. Likewise, the server can make use of knowledge of

    access patterns to perform read-ahead and do other optimizations. Stateful servers can also offer clients extra

    services such as file locking, and remember read and write positions.
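The difference can be made concrete with a small sketch (illustrative only; the function names are made up and no real file-service protocol is implied):

```python
# Contrast between stateless and stateful read operations (illustration only).
FILES = {"notes.txt": b"hello distributed world"}

# Stateless server: every request carries the full file name and offset, so a
# restarted server can serve it with no recovered session state.
def stateless_read(name, offset, count):
    return FILES[name][offset:offset + count]

# Stateful server: an open() call creates per-client state (a read pointer)
# that later reads rely on; this state is lost if the server crashes.
open_table = {}

def stateful_open(client, name):
    open_table[(client, name)] = 0                   # current read pointer

def stateful_read(client, name, count):
    pos = open_table[(client, name)]
    data = FILES[name][pos:pos + count]
    open_table[(client, name)] = pos + len(data)     # server remembers the position
    return data
```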

    B) Replication

    The main approach to improving the performance and fault tolerance of a DFS is to replicate its content. A

replicating DFS maintains multiple copies of files on different servers. This can prevent data loss, protect a system against downtime of a single server, and distribute the overall workload. There are three approaches to replication

in a DFS:

1. Explicit replication: The client explicitly writes files to multiple servers. This approach requires explicit

    support from the client and does not provide transparency.

2. Lazy file replication: The server automatically copies files to other servers after the files are written. Remote files are only brought up to date when the files are sent to the server. How often this happens is up to the

    implementation and affects the consistency of the file state.


3. Group file replication: Write requests are simultaneously sent to a group of servers. This keeps all the replicas

    up to date, and allows clients to read consistent file state from any replica.

C) Caching

Besides replication, caching is often used to improve the performance of a DFS. In a DFS, caching involves

    storing either a whole file, or the results of file service operations. Caching can be performed at two locations: at

    the server and at the client. Server-side caching makes use of file caching provided by the host operating system.

This is transparent to the server and helps to improve the server's performance by reducing costly disk accesses.

    Client-side caching comes in two flavours: on-disk caching, and in-memory caching. On-disk caching involves the

creation of (temporary) files on the client's disk. These can either be complete files (as in the upload/download

    model) or they can contain partial file state, attributes, etc. In-memory caching stores the results of requests in the

client machine's memory. This can be process-local (in the client process), in the kernel, or in a separate dedicated

    caching process. The issue of cache consistency in DFS has obvious parallels to the consistency issue in shared

    memory systems, but there are other tradeoffs (for example, disk access delays come into play, the granularity of

    sharing is different, sizes are different, etc.). Furthermore, because write-through caches are too expensive to be

    useful, the consistency of caches will be weakened. This makes implementing Unix semantics impossible.

Approaches used in DFS caches include delayed writes, where writes are not propagated to the server

immediately but in the background later on, and write-on-close, where the server receives updates only after the

    file is closed. Adding a delay to write-on-close has the benefit of avoiding superfluous writes if a file is deleted

    shortly after it has been closed.
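A rough sketch of a client-side cache using write-on-close with a delayed write-back, under the assumption that server_write is whatever call ships the data to the file server (all names and the delay value are hypothetical):

```python
import time

# Illustrative client-side cache using write-on-close with a short delay.
class CachedFile:
    def __init__(self, name, server_write, delay=5.0):
        self.name, self.server_write, self.delay = name, server_write, delay
        self.data, self.dirty, self.closed_at = b"", False, None

    def write(self, data):
        self.data += data              # update only the in-memory copy
        self.dirty = True

    def close(self):
        self.closed_at = time.time()   # schedule the write-back

    def flush_if_due(self, deleted=False):
        # Called periodically; a file deleted shortly after close is never
        # written back, which avoids a superfluous transfer.
        due = (self.dirty and not deleted and self.closed_at is not None
               and time.time() - self.closed_at >= self.delay)
        if due:
            self.server_write(self.name, self.data)
            self.dirty = False
```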

    4. Describe the following with their real time applications:

    A) Digital Signatures B) Design Principles

    A) Digital Signatures

    A digital signature of a message is a number dependent on some secret known only to the signer, and,

    additionally, on the content of the message being signed. Signatures must be verifiable; if a dispute arises as to

whether a party signed a document (caused by either a lying signer trying to repudiate a signature it did create, or a

fraudulent claimant), an unbiased third party should be able to resolve the matter equitably, without requiring

access to the signer's secret information (private key). Digital signatures have many applications in information

    security, including authentication, data integrity, and non-repudiation. One of the most significant applications of

    digital signatures is the certification of public keys in large networks. Certification is a means for a trusted third

    party (TTP) to bind the identity of a user to a public key, so that at some later time, other entities can authenticate a

    public key without assistance from a trusted third party. The concept and utility of a digital signature was

    recognized several years before any practical realization was available. The first method discovered was the RSA

    signature scheme, which remains today one of the most practical and versatile techniques available. Subsequent


    research has resulted in many alternative digital signature techniques. Some offer significant advantages in terms

    of functionality and implementation.

    Basic definitions

    1. A digital signature is a data string which associates a message (in digital form) with some originating entity.

2. A digital signature generation algorithm (or signature generation algorithm) is a method for producing a

    digital signature.

3. A digital signature verification algorithm (or verification algorithm) is a method for verifying that a digital

    signature is authentic (i.e., was indeed created by the specified entity).

    4. A digital signature scheme (or mechanism) consists of a signature generation algorithm and an associated

    verification algorithm.

5. A digital signature signing process (or procedure) consists of a (mathematical) digital signature generation

algorithm, along with a method for formatting data into messages which can be signed.

6. A digital signature verification process (or procedure) consists of a verification algorithm, along with a method

    for recovering data from the message.

    7. (messages) M is the set of elements to which a signer can affix a digital signature.

    8. (signing space) MS is the set of elements to which the signature transformations are applied. The signature

    transformations are not applied directly to the set M.

    9. (signature space) S is the set of elements associated to messages in M. These elements are used to bind the

    signer to the message.

    10. (indexing set) R is used to identify specific signing transformations.
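A toy, textbook-RSA illustration of a signing and verification algorithm pair (the tiny key below is hopelessly insecure and is used only to make the sign/verify relationship concrete; it is not the scheme parameters referred to in the text):

```python
import hashlib

# Toy textbook-RSA signature (tiny, insecure parameters; illustration only).
p, q = 61, 53
n = p * q                  # 3233
e = 17                     # public exponent
d = 2753                   # private exponent: d * e = 1 (mod (p-1)*(q-1))

def h(message):
    # Hash the message and reduce it into the signing space modulo n.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message):
    return pow(h(message), d, n)          # signature depends on the secret d

def verify(message, signature):
    return pow(signature, e, n) == h(message)

sig = sign(b"pay Bob 10")
print(verify(b"pay Bob 10", sig))         # True
print(verify(b"pay Bob 99", sig))         # False: message was altered
```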

    B) Design Principles

Designers of the security components of a distributed operating system should follow these guidelines when designing a secure network:

1. Least Privilege: This principle is also known as the need-to-know principle. It states that any process should be

    given only those access rights that enable it to access, at any time, what it needs to accomplish

its function, and nothing more. i.e. the security system must be flexible enough to allow the access

    rights of a process to grow and shrink with its changing access requirements. This principle serves to limit the

damage when a system's security is broken.

    2. Fail-Safe defaults: Access rights should be acquired by explicit permission only and the default should be no

access. This principle requires that access control decisions should be based on why an object should be accessible to a process rather than on why it should not be accessible.

    3. Open design: This principle requires that the design should not be secret but should be public. It is a mistake on

    the part of a designer to assume that intruders will not know how the security mechanism of the system works.


    4. Built into the system: This principle requires that the security be designed into the systems at their inception

    and be built into the lowest layers of the systems. i.e. security should not be treated as an add-on feature because

    security problems cannot be resolved very effectively by patching the penetration holes detected in an existing

    system.

    5. Check for current authority: This principle requires that every access to every object must be checked using

an access control database for authority. This is necessary so that the revocation of previously given access

rights takes immediate effect.

    6. Easy granting and revocation of access rights: For greater flexibility, a security system must allow access

    rights for an object to be granted or revoked dynamically. It should be possible to restrict some of the rights and to

    grant to a user only those rights that are sufficient to accomplish its functions. On the other hand, a good

    security system should allow immediate revocation with the flexibility of selective and partial revocation.

    7. Never trust other parties: For producing a secured distributed system, the system components must be

designed with the assumption that other parties (humans or programs) are not trustworthy until they are

    demonstrated to be trustworthy.

    8. Always ensure freshness of messages: To avoid security violations through the replay of messages, the

security of a distributed system must be designed to always ensure the freshness of messages exchanged between two communicating entities.

9. Build firewalls: To limit the damage in case of a system's security being compromised, the system must have firewalls built into it. One way to meet these requirements is to allow only short-lived passwords and keys in the

    system.

    10. Efficient: The security mechanisms used must execute efficiently and be simple to implement.

    11. Convenient to use: To be psychologically acceptable, the security mechanisms must be convenient to use.

    Otherwise, they are likely to be bypassed or incorrectly used by the users.

    12. Cost Effective: It is often the case that security needs to be traded off with other goals of the system, such as

    performance or ease of use.