
    February 2010

    Master of Computer Application (MCA) Semester 5

    MC0085 Advanced Operating Systems

    (Distributed Systems) 4 Credits

    (Book ID: B0967)

    Assignment Set 1 (60 Marks)

Answer all questions. Each question carries FIFTEEN marks.

    1. Explain the following:

    A) Distributed Computing System Models B) Advantages of Distributed Systems

    C) Distributed Operating Systems

(A) Distributed Computing System Models

Distributed computing system models can be broadly classified into five categories. They are:

    Minicomputer model

    Workstation model

    Workstation server model

    Processor pool model

Hybrid model

Minicomputer Model

The minicomputer model (Fig. 1.1) is a simple extension of the centralized time-sharing system. A distributed

    computing system based on this model consists of a few minicomputers (they may be large supercomputers as

    well) interconnected by a communication network. Each minicomputer usually has multiple users simultaneously

    logged on to it. For this, several interactive terminals are connected to each minicomputer. Each user is logged on

    to one specific minicomputer, with remote access to other minicomputers. The network allows a user to access

    remote resources that are available on some machine other than the one on to which the user is currently logged.

The minicomputer model may be used when resource sharing (such as sharing of information databases of different types, with each type of database located on a different machine) with remote users is desired. The early

    ARPAnet is an example of a distributed computing system based on the minicomputer model.


    Fig. 1.1: A Distributed Computing System based on Minicomputer Model

    Workstation Model

A distributed computing system based on the workstation model (Fig. 1.2) consists of several workstations interconnected by a communication network. An organization may have several workstations located throughout a

    building or campus, each workstation equipped with its own disk and serving as a single-user computer. It has

    been often found that in such an environment, at any one time a significant proportion of the workstations are idle

    (not being used), resulting in the waste of large amounts of CPU time. Therefore, the idea of the workstation

    model is to interconnect all these workstations by a high-speed LAN so that idle workstations may be used to

    process jobs of users who are logged onto other workstations and do not have sufficient processing power at their

    own workstations to get their jobs processed efficiently.

    Fig. 1.2: A Distributed Computing System based on Workstation Model

    Workstation Server Model

    The workstation model is a network of personal workstations, each with its own disk and a local file system. A

    workstation with its own local disk is usually called a diskful workstation and a workstation without a local disk is


    called a diskless workstation. With the proliferation of high-speed networks, diskless workstations have become

    more popular in network environments than diskful workstations, making the workstation-server model more

    popular than the workstation model for building distributed computing systems.

A distributed computing system based on the workstation-server model (Fig. 1.3) consists of a few minicomputers

and several workstations (most of which are diskless, but a few of which may be diskful) interconnected by a communication network.

    Fig. 1.3: A Distributed Computing System based on Workstation-server Model

Note that when diskless workstations are used on a network, the file system to be used by these workstations must be implemented either by a diskful workstation or by a minicomputer equipped with a disk for file storage. One or

    more of the minicomputers are used for implementing the file system. Other minicomputers may be used for

    providing other types of services, such as database service and print service. Therefore, each minicomputer is used

as a server machine to provide one or more types of services. Hence, in the workstation-server model, in

    addition to the workstations, there are specialized machines (may be specialized workstations) for running server

processes (called servers) for managing and providing access to shared resources. For a number of reasons, such as higher reliability and better scalability, multiple servers are often used for managing the resources of a particular

type in a distributed computing system. For example, there may be multiple file servers, each running on a separate minicomputer and cooperating via the network, for managing the files of all the users in the system. Due

    to this reason, a distinction is often made between the services that are provided to clients and the servers that

    provide them. That is, a service is an abstract entity that is provided by one or more servers. For example, one or

    more file servers may be used in a distributed computing system to provide file service to the users.

    In this model, a user logs onto a workstation called his or her home workstation. Normal computation activities

    required by the user's processes are performed at the user's home workstation, but requests for services provided

    by special servers (such as a file server or a database server) are sent to a server providing that type of service that

performs the user's requested activity and returns the result of request processing to the user's workstation. Therefore, in this model, the user's processes need not migrate to the server machines to get the work done

    by those machines.

Processor Pool Model

The processor-pool model is based on the observation that most of the time a user does not need any computing

    power but once in a while the user may need a very large amount of computing power for a short time (e.g., when

    recompiling a program consisting of a large number of files after changing a basic shared declaration). Therefore,

    unlike the workstation-server model in which a processor is allocated to each user, in the processor-pool model the


    processors are pooled together to be shared by the users as needed. The pool of processors consists of a large

    number of microcomputers and minicomputers attached to the network. Each processor in the pool has its own

    memory to load and run a system program or an application program of the distributed computing system.

    Fig. 1.4: A distributed computing system based on the processor-pool model

    Hybrid Model

Out of the four models described above, the workstation-server model is the most widely used model for building

    distributed computing systems. This is because a large number of computer users only perform simple interactive

    tasks such as editing jobs, sending electronic mails, and executing small programs. The workstation-server model

is ideal for such simple usage. However, in a working environment that has groups of users who often perform jobs needing massive computation, the processor-pool model is more attractive and suitable.

    (B)Advantages of Distributed Systems

    From the models of distributed computing systems presented above, it is obvious that distributed computing

    systems are much more complex and difficult to build than traditional centralized systems (those consisting of a

    single CPU, its memory, peripherals, and one or more terminals). The increased complexity is mainly due to the

    fact that in addition to being capable of effectively using and managing a very large number of distributed

resources, the system software of a distributed computing system should also be capable of handling the communication and security problems that are very different from those of centralized systems. For example, the

    performance and reliability of a distributed computing system depends to a great extent on the performance and

    reliability of the underlying communication network. Special software is usually needed to handle loss of

    messages during transmission across the network or to prevent overloading of the network, which degrades the

    performance and responsiveness to the users. Similarly, special software security measures are needed to protect

    the widely distributed shared resources and services against intentional or accidental violation of access control

    and privacy constraints.


Despite the increased complexity and the difficulty of building distributed computing

    systems, the installation and use of distributed computing systems are rapidly increasing. This is mainly because

    the advantages of distributed computing systems outweigh their disadvantages. The technical needs, the economic

pressures, and the major advantages that have led to the emergence and popularity of distributed computing systems are described next.

    (C) Distributed Operating Systems

    Tanenbaum and Van Renesse define an operating system as a program that controls the resources of a computer

    system and provides its users with an interface or virtual machine that is more convenient to use than the bare

    machine. According to this definition, the two primary tasks of an operating system are as follows:

    1. To present users with a virtual machine that is easier to program than the underlying hardware.

    2. To manage the various resources of the system. This involves performing such tasks as keeping track of who is

    using which resource, granting resource requests, accounting for resource usage, and mediating conflicting

requests from different programs and users.

Therefore, the users' view of a computer system, the manner in which the users access the various resources of the

    computer system, and the ways in which the resource requests are granted depend to a great extent on the

operating system of the computer system. The operating systems commonly used for distributed computing systems can be broadly classified into two types: network operating systems and distributed operating systems.

    The three most important features commonly used to differentiate between these two types of operating systems

are system image, autonomy, and fault tolerance capability. These features are given below:

System image: Under a network OS, the user views the distributed system as a collection of machines connected by a communication subsystem, i.e. the user is aware of the fact that multiple computers are used. A distributed OS hides the existence of multiple computers and provides a single system image to the users.

Autonomy: A network OS is built on a set of existing centralized OSs and handles the interfacing and coordination of remote operations and communications between these OSs. So, in this case, each machine has its own OS. With a distributed OS, there is a single system-wide OS and each computer runs part of this global OS.

    Fault tolerance capability: A network operating system provides little or no fault tolerance capability in the sense

    that if 10% of the machines of the entire distributed computing system are down at any moment, at least 10% of

    the users are unable to continue with their work. On the other hand, with a distributed operating system, most of

the users are normally unaffected by the failed machines and can continue to perform their work normally, with only a 10% loss in performance of the entire distributed computing system.

    Therefore, the fault tolerance capability of a distributed operating system is usually very high as compared to that

of a network operating system.

    2. Explain the following in the Context of Message Passing:

    A) Synchronization B) Buffering C) Process Addressing

    Ans:

    (A) Synchronization

    A major issue in communication is the synchronization imposed on the communicating processes by the

communication primitives. There are two types of communication primitives: Blocking Semantics and Non-

    Blocking Semantics.

    Blocking Semantics: A communication primitive is said to have blocking semantics if its invocation

    blocks the execution of its invoker (for example in the case of send, the sender blocks until it receives an

    acknowledgement from the receiver.)


    Non-blocking Semantics: A communication primitive is said to have non-blocking semantics if its

    invocation does not block the execution of its invoker.

    The synchronization imposed on the communicating processes basically depends on one of the two types of

semantics used for the send and receive primitives.

Blocking and Non-Blocking Primitives

    Blocking Send Primitive: In this case, after execution of the send statement, the sending process is

    blocked until it receives an acknowledgement from the receiver that the message has been received.

    Non-Blocking Send Primitive: In this case, after execution of the send statement, the sending process is

    allowed to proceed with its execution as soon as the message is copied to the buffer.

    Blocking Receive Primitive: In this case, after execution of the receive statement, the receiving process

    is blocked until it receives a message.

    Non-Blocking Receive Primitive: In this case, the receiving process proceeds with its execution after the

    execution of receive statement, which returns the control almost immediately just after telling the kernel

    where the message buffer is.

    Handling non-blocking receives: The following are the two ways of doing this:

    Polling: a test primitive is used by the receiver to check the buffer status

Interrupt: When a message is filled in the buffer, a software interrupt is used to notify the receiver.

    However, user level interrupts make programming difficult.
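As a rough illustration of these semantics (not part of the course text), the following Python sketch contrasts a blocking receive with a polling-based non-blocking receive; the names mailbox, blocking_receive, and poll_receive are hypothetical, and the queue simply stands in for the kernel's message buffer.

```python
import queue
import threading
import time

# A shared buffer standing in for the kernel's message buffer (hypothetical).
mailbox = queue.Queue()

def blocking_receive():
    # Blocking semantics: suspend the caller until a message is available.
    return mailbox.get(block=True)

def poll_receive():
    # Non-blocking semantics: return a message if one is buffered, else None.
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None

def sender():
    time.sleep(0.1)          # simulate network / processing delay
    mailbox.put("hello")     # non-blocking send: copy message to buffer and continue

threading.Thread(target=sender).start()

print(poll_receive())        # likely None: the message has not arrived yet
print(blocking_receive())    # blocks until "hello" is delivered
```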

(B) Buffering

The transmission of messages from one process to another can be done by copying the body of the message from

the sender's address space to the receiver's address space. In some cases, the receiving process may not be ready

    to receive the message but it wants the operating system to save that message for later reception. In such cases, the

operating system would rely on the receiver's buffer space, in which the transmitted messages can be stored prior

to the receiving process executing specific code to receive the message.

    The synchronous and asynchronous modes of communication correspond to the two extremes of buffering: a null

    buffer, or no buffering, and a buffer with unbounded capacity. Two other commonly used buffering strategies are

single-message and finite-bound, or multiple-message, buffers. These four types of buffering strategies are given below:

No buffering: In this case, the message remains in the sender's address space until the receiver executes the

    corresponding receive.

    Single message buffer: A buffer to hold a single message at the receiver side is used. It is used for

    implementing synchronous communication because in this case an application can have only one

    outstanding message at any given time.

Unbounded-capacity buffer: Convenient for supporting asynchronous communication. However, it is

impossible to support a buffer of unbounded capacity in practice.

    Finite-Bound Buffer: Used for supporting asynchronous communication.

    Buffer overflow can be handled in one of the following ways:

Unsuccessful communication: send returns an error message to the sending process, indicating that the message could not be delivered to the receiver because the buffer is full.

    Flow-controlled communication: The sender is blocked until the receiver accepts some messages. This

violates the semantics of asynchronous send. It may also result in communication deadlock.
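The buffering strategies and overflow policies above can be sketched in a few lines. The following Python fragment is illustrative only (FiniteBoundBuffer is a hypothetical class, not an operating-system API); it shows a finite-bound buffer that either reports an error on overflow (unsuccessful communication) or blocks the sender (flow-controlled communication).

```python
import queue

class FiniteBoundBuffer:
    """Finite-bound message buffer (illustrative sketch, not a real kernel API)."""

    def __init__(self, capacity, flow_controlled=False):
        self.buf = queue.Queue(maxsize=capacity)
        self.flow_controlled = flow_controlled

    def send(self, msg):
        if self.flow_controlled:
            # Flow-controlled: block the sender until the receiver frees a slot.
            # Note: this violates asynchronous-send semantics and can deadlock.
            self.buf.put(msg, block=True)
            return True
        try:
            self.buf.put_nowait(msg)
            return True
        except queue.Full:
            # Unsuccessful communication: the sender learns the buffer is full.
            return False

    def receive(self):
        return self.buf.get()
```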


Message data should be meaningful to the receiving process. Ideally, this implies that the structure of the program objects should be preserved while they are being transmitted from the address space of the sending process to the address space of the receiving process. This is not possible in heterogeneous systems, in which the sending and receiving processes run on computers of different architectures. Even in homogeneous systems, it is very difficult to achieve this goal, mainly for two reasons:

1. An absolute pointer value has no meaning (more on this when we talk about RPC); for example, a pointer to a tree or linked list. So, proper encoding mechanisms should be adopted to pass such objects.

2. Different program objects, such as integers, long integers, short integers, and character strings, occupy different amounts of storage space. So, from the encoding of these objects, the receiver should be able to identify the type and size of the objects.

One of the following two representations may be used for the encoding and decoding of message data:

1. Tagged representation: The type of each program object as well as its value is encoded in the message. In this method, it is a simple matter for the receiving process to check the type of each program object in the message because of the self-describing nature of the coded data format.

2. Untagged representation: The message contains only program objects; no information is included in the message about the type of each program object. In this method, the receiving process should have prior knowledge of how to decode the received data because the coded data format is not self-describing.
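A minimal sketch of the two representations, assuming a message that carries either an integer or a string (the tag values and helper names below are made up for illustration):

```python
import struct

# Tagged representation: each object is encoded as (type tag, value), so the
# receiver can decode it without prior knowledge of the message layout.
TAG_INT, TAG_STR = 1, 2

def encode_tagged(obj):
    if isinstance(obj, int):
        return struct.pack("!Bi", TAG_INT, obj)                  # 1-byte tag + 4-byte int
    data = obj.encode("utf-8")
    return struct.pack("!BH", TAG_STR, len(data)) + data         # tag + length + bytes

def decode_tagged(buf):
    tag = buf[0]
    if tag == TAG_INT:
        return struct.unpack("!i", buf[1:5])[0]
    length = struct.unpack("!H", buf[1:3])[0]
    return buf[3:3 + length].decode("utf-8")

# Untagged representation: only the value is sent; the receiver must already
# know that this message carries a single 4-byte integer.
def encode_untagged_int(n):
    return struct.pack("!i", n)

def decode_untagged_int(buf):
    return struct.unpack("!i", buf)[0]
```

With the tagged form the receiver inspects the tag to decode; with the untagged form it must already know the message layout.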

(C) Process Addressing

    A message passing system generally supports two types of addressing:

    Explicit Addressing: The process with which communication is desired is explicitly specified as a

    parameter in the communication primitive. e.g. send (pid, msg), receive (pid, msg).

    Implicit Addressing: A process does not explicitly name a process for communication. For example, a

process can specify a service instead of a process. e.g. send any (service id, msg), receive any (pid, msg).

    Methods for process addressing:

    machine id@local id: UNIX uses this form of addressing (IP address, port number).

    Advantages: No global coordination needed for process addressing. Disadvantages: Does not allow process

    migration.

machine id1@local id@machine id2: machine id1 identifies the node on which the process is created. local id is generated by the node on which the process is created.

    machine id2 identifies the last known location of the process. When a process migrates to another node, the

link information (the machine id to which the process migrates) is left with the current machine. This information is used for forwarding messages to migrated processes.

    Disadvantages:

    Overhead involved in locating a process may be large.

    If the node on which the process was executing is down, it may not be possible to locate the process.
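A small sketch of the link-based addressing scheme described above, with hypothetical node names and a per-node forwarding table standing in for the link information left behind on migration:

```python
# Illustrative sketch of link-based process addressing (names are hypothetical).
# Each node keeps a forwarding table: local process id -> node where it now runs.
forwarding = {
    "nodeA": {"p1": "nodeC"},   # p1 was created on nodeA but has migrated to nodeC
    "nodeC": {"p1": None},      # None means the process runs here
}

def deliver(node, local_id, msg, hops=0):
    """Follow the chain of forwarding links until the current host is found."""
    next_node = forwarding[node].get(local_id)
    if next_node is None:
        print(f"{msg!r} delivered to {local_id} on {node} after {hops} hop(s)")
        return
    # Overhead: every migration adds one more hop; if 'node' is down, the
    # chain is broken and the process cannot be located.
    deliver(next_node, local_id, msg, hops + 1)

deliver("nodeA", "p1", "hello")   # forwarded nodeA -> nodeC
```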

    3. Explain the following algorithms with respect to distributed Shared Memory:

    A) Centralized Server Algorithm B) Fixed Distributed Server Algorithm

    C) Dynamic Distributed Server Algorithm

    Ans:

    (A) Centralized-Server Algorithm


    A central server maintains a block table containing owner-node and copy-set information for each block.

    When a read/write fault for a block occurs at node N, the fault handler at node N sends a read/write request to

    the central server. Upon receiving the request, the central-server does the following:

    If it is a read request:

    adds N to the copy-set field and

    sends the owner node information to node N

    upon receiving this information, N sends a request for the block to the owner node.

    upon receiving this request, the owner returns a copy of the block to N.

    If it is a write request:

    It sends the copy-set and owner information of the block to node N and initializes copy-set to {N}

Node N sends a request for the block to the owner node and an invalidation message to all nodes in the

    copy-set.

    Upon receiving this request, the owner sends the block to node N
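A compact sketch of the central server's fault handling, under the assumption that the block table is a simple dictionary and that the actual block transfer and invalidation messages are handled elsewhere (all names here are hypothetical):

```python
# Illustrative sketch of the centralized-server algorithm; the block table and
# returned "messages" are placeholders, not a real DSM implementation.
block_table = {
    # block_id: {"owner": node, "copy_set": set of nodes holding read copies}
    7: {"owner": "node1", "copy_set": {"node2"}},
}

def handle_fault(block_id, requester, is_write):
    entry = block_table[block_id]
    owner = entry["owner"]
    if not is_write:
        # Read fault: record the new reader and point it at the owner,
        # which will return a copy of the block.
        entry["copy_set"].add(requester)
        return {"owner": owner}
    # Write fault: hand the copy-set to the requester so it can invalidate
    # stale copies, and initialize the copy-set to {requester}.  The requester
    # becomes the new owner once it receives the block (implied above).
    old_copy_set = entry["copy_set"]
    entry["copy_set"] = {requester}
    entry["owner"] = requester
    return {"owner": owner, "invalidate": old_copy_set}
```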

    (B) Fixed Distributed-Server Algorithm

Under this scheme:

Several nodes have block managers; each block manager manages a predetermined set of blocks.

When a read/write fault occurs, a request for the block is sent to the corresponding block manager.

Upon receiving this request, the actions taken by the block manager are similar to those of the central-server approach.

(C) Dynamic Distributed Server Algorithm

Under this approach, there is no block manager. Each node maintains information about the probable owner of

each block, and also the copy-set information for each block for which it is an owner. When a block fault occurs,

    the fault handler sends a request to the probable owner of the block.

    Upon receiving the request

    if the receiving node is not the owner, it forwards the request to the probable owner of the block

    according to its table.

if the receiving node is the owner, then:

o If the request is a read request, it adds the entry N to the copy-set field of the entry

    corresponding to the block and sends a copy of the block to node N.

    o If the request is a write request, it sends the block and copy-set information to the node N and

    deletes the entry corresponding to the block from its block table.

    o Node N, upon receiving the block, sends invalidation request to all nodes in the copy-set, and

updates its block table to reflect the fact that it is the owner of the block.
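The probable-owner chasing can be sketched as follows; the tables and node names are hypothetical, and a real implementation would also update the probable-owner hints at the forwarding nodes:

```python
# Illustrative sketch of the dynamic distributed-server algorithm.  Each node
# keeps a "probable owner" hint per block; requests are forwarded along these
# hints until the true owner is reached.
prob_owner = {
    "node1": {7: "node2"},
    "node2": {7: "node3"},
    "node3": {7: None},          # None: node3 really owns block 7
}
copy_sets = {"node3": {7: {"node1"}}}

def request_block(node, block_id, requester, is_write):
    hint = prob_owner[node][block_id]
    if hint is not None:
        # Not the owner: forward the request along the probable-owner chain.
        return request_block(hint, block_id, requester, is_write)
    if not is_write:
        copy_sets[node][block_id].add(requester)
        return "copy of block sent to " + requester
    # Write request: ship the block and copy-set to the requester, which
    # becomes the new owner and invalidates the old copies.
    cs = copy_sets[node].pop(block_id)
    prob_owner[node][block_id] = requester       # leave a hint behind
    return {"new_owner": requester, "invalidate": cs}

print(request_block("node1", 7, "node1", is_write=False))   # chased node1 -> node2 -> node3
```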

    4. Describe the Clock Synchronization Algorithms and Distributed Algorithms in the context of

    Synchronization.

    Ans:

    Clock Synchronization Algorithms

    Clock synchronization algorithms may be broadly classified as Centralized and Distributed:

    Centralized Algorithms

In centralized clock synchronization algorithms, one node has a real-time receiver. This node is called the time

server node, and its clock time is regarded as correct and used as the reference time. The goal of these


    algorithms is to keep the clocks of all other nodes synchronized with the clock time of the time server node.

    Depending on the role of the time server node, centralized clock synchronization algorithms are again of two

types: Passive Time Server and Active Time Server.

1. Passive Time Server Centralized Algorithm: In this method each node periodically sends a message (time = ?) to the time server. When the time server receives the message, it quickly responds with a message (time = T), where T is the current time in the clock of the time server node. Assume that when the client node sends the time = ? message, its clock time is T0, and when it receives the time = T message, its clock time is T1. Since T0 and T1 are measured using the same clock, in the absence of any other information, the best estimate of the time required for the propagation of the message time = T from the time server node to the client's node is (T1 - T0)/2. Therefore, when the reply is received at the client's node, its clock is readjusted to T + (T1 - T0)/2 (a worked example of this adjustment appears below).

2. Active Time Server Centralized Algorithm: In this approach, the time server periodically broadcasts its clock time (time = T). The other nodes receive the broadcast message and use the clock time in the message for correcting their own clocks. Each node has a priori knowledge of the approximate time (Ta) required for the propagation of the message time = T from the time server node to its own node. Therefore, when a broadcast message is received at a node, the node's clock is readjusted to the time T + Ta. A major drawback of this method is that it is not fault tolerant. If the broadcast message reaches a node too late due to some communication fault, the clock of that node will be readjusted to an incorrect value. Another disadvantage of this approach is that it requires a broadcast facility to be supported by the network.

Another active time server algorithm that overcomes the drawbacks of the above algorithm is the Berkeley algorithm, proposed by Gusella and Zatti for internal synchronization of the clocks of a group of computers running Berkeley UNIX. In this algorithm, the time server periodically sends a message (time = ?) to all the computers in the group. On receiving this message, each computer sends back its clock value to the time server. The time server has a priori knowledge of the approximate time required for the propagation of a message from each node to its own node. Based on this knowledge, it first readjusts the clock values of the reply messages. It then takes a fault-tolerant average of the clock values of all the computers (including its own). To take the fault-tolerant average, the time server chooses a subset of all clock values that do not differ from one another by more than a specified amount, and the average is taken only for the clock values in this subset. This approach eliminates readings from unreliable clocks whose clock values could have a significant adverse effect if an ordinary average were taken. The calculated average is the current time to which all the clocks should be readjusted. The time server readjusts its own clock to this value. Instead of sending the calculated current time back to the other computers, the time server sends the amount by which each individual computer's clock requires adjustment. This can be a positive or negative value and is calculated based on the knowledge the time server has about the approximate time required for the propagation of a message from each node to its own node.
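As a worked example of the passive-time-server adjustment (the numbers are invented purely for illustration):

```python
# Worked example of the passive-time-server clock adjustment (made-up values).
T0 = 10.000   # client clock when the "time = ?" request was sent
T1 = 10.020   # client clock when the reply arrived
T  = 10.515   # server time carried in the reply

propagation = (T1 - T0) / 2          # best estimate of one-way delay: 0.010
adjusted    = T + propagation        # client clock is set to 10.525
print(propagation, adjusted)
```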

    Centralized clock synchronization algorithms suffer from two major drawbacks:

1. They are subject to a single point of failure. If the time server node fails, the clock synchronization operation cannot be performed. This makes the system unreliable. Ideally, a distributed system should be

    more reliable than its individual nodes. If one goes down, the rest should continue to function correctly.

    2. From a scalability point of view it is generally not acceptable to get all the time requests serviced by a

    single time server. In a large system, such a solution puts a heavy burden on that one process.

Distributed Algorithms

We know that externally synchronized clocks are also internally synchronized. That is, if each node's clock is

    independently synchronized with real time, all the clocks of the system remain mutually synchronized.

Therefore, a simple method for clock synchronization may be to equip each node of the system with a real-time receiver so that each node's clock can be independently synchronized with real time. Multiple real-time


    clocks (one for each node) are normally used for this purpose. Theoretically, internal synchronization of

    clocks is not required in this approach. However, in practice, due to inherent inaccuracy of real-time clocks,

different real-time clocks may show different times. Therefore, internal synchronization is normally performed

    for better accuracy. One of the following two approaches is used for internal synchronization in this case.

1. Global Averaging Distributed Algorithms: In this approach, the clock process at each node broadcasts its local clock time in the form of a special resync message when its local time equals T0 + iR for some integer i, where T0 is a fixed time in the past agreed upon by all nodes and R is a system parameter that

    depends on such factors as the total number of nodes in the system, the maximum allowable drift rate,

    and so on. i.e. a resync message is broadcast from each node at the beginning of every fixed length

resynchronization interval. However, since the clocks of different nodes run at slightly different rates, these broadcasts will not happen simultaneously from all nodes. After broadcasting the clock value, the clock

    process of a node waits for time T, where T is a parameter to be determined by the algorithm. During this

    waiting period, the clock process records the time, according to its own clock, when the message was

    received. At the end of the waiting period, the clock process estimates the skew of its clock with respect

to each of the other nodes on the basis of the times at which it received resync messages. It then computes a fault-tolerant average of the estimated skews and uses it to correct the local clock before the start of the next resynchronization interval.

The global averaging algorithms differ mainly in the manner in which the fault-tolerant average of the

estimated skews is calculated. Two commonly used algorithms are:

1. The simplest algorithm is to take the average of the estimated skews and use it as the correction for the local clock. However, to limit the

    impact of faulty clocks on the average value, the estimated skew with respect to each node is compared

    against a threshold, and skews greater than the threshold are set to zero before computing the average of

the estimated skews.

2. In another algorithm, each node limits the impact of faulty clocks by first

    discarding the m highest and m lowest estimated skews and then calculating the average of the remaining

    skews, which is then used as the correction for the local clock. The value of m is usually decided based

    on the total number of clocks (nodes).
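The two averaging variants can be written down directly; the sample skew values below are invented for illustration:

```python
def threshold_average(skews, threshold):
    # Variant 1: skews larger than the threshold are treated as faulty and
    # set to zero before averaging.
    cleaned = [s if abs(s) <= threshold else 0.0 for s in skews]
    return sum(cleaned) / len(cleaned)

def trimmed_average(skews, m):
    # Variant 2: discard the m highest and m lowest estimated skews,
    # then average the rest.
    kept = sorted(skews)[m:len(skews) - m]
    return sum(kept) / len(kept)

skews = [0.004, -0.002, 0.003, 0.250, -0.001]    # 0.250 looks like a faulty clock
print(threshold_average(skews, threshold=0.05))  # correction ignoring the outlier
print(trimmed_average(skews, m=1))
```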


    February 2010

    Master of Computer Application (MCA) Semester 5

    MC0085 Advanced Operating Systems

    (Distributed systems) 4 Credits

    (Book ID: B0967)

    Assignment Set 2 (60 Marks)

Answer all questions. Each question carries FIFTEEN marks.

    1. Describe the following:

    A) Task assignment Approach B) Load Balancing Approach

    C) Load Sharing Approach

    A) Task assignment Approach

Each process is viewed as a collection of tasks. These tasks are scheduled to suitable processors to improve

    performance. This is not a widely used approach because:

    It requires characteristics of all the processes to be known in advance.

    This approach does not take into consideration the dynamically changing state of the system.

    In this approach, a process is considered to be composed of multiple tasks and the goal is to find an optimal

assignment policy for the tasks of an individual process. Typical goals of the task assignment approach are:

    Minimize IPC cost (this problem can be modeled using network flow model)

    Efficient resource utilization

    Quick turnaround time

    A high degree of parallelism

    B) Load Balancing Approach

In this, the processes are distributed among nodes to equalize the load among all nodes. The scheduling algorithms that use this approach are known as Load Balancing or Load Leveling Algorithms. These algorithms are based on

    the intuition that for better resource utilization, it is desirable for the load in a distributed system to be balanced

evenly. Thus, a load balancing algorithm tries to balance the total system load by transparently transferring the

workload from heavily loaded nodes to lightly loaded nodes in an attempt to ensure good overall performance relative to some specific metric of system performance.

    We can have the following categories of load balancing algorithms:


    Static: Ignore the current state of the system. E.g. if a node is heavily loaded, it picks up a task randomly

    and transfers it to a random node. These algorithms are simpler to implement but performance may not be

    good.

Dynamic: Use the current state information for load balancing. There is an overhead involved in collecting state information periodically, but they generally perform better than static algorithms.

    Deterministic: Algorithms in this class use the processor and process characteristics to allocate processes

    to nodes.

    Probabilistic: Algorithms in this class use information regarding static attributes of the system such as

    number of nodes, processing capability, etc.

    Centralized: System state information is collected by a single node. This node makes all scheduling

    decisions.

    Distributed: Most desired approach. Each node is equally responsible for making scheduling decisions

    based on the local state and the state information received from other sites.

    Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, the distributed entities

    cooperate with each other to make scheduling decisions. Therefore they are more complex and involve

larger overhead than non-cooperative ones. But the stability of a cooperative algorithm is better than that of a

    non-cooperative one.

    Non-Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, individual entities

    act as autonomous entities and make scheduling decisions independently of the action of other entities.

    C) Load Sharing Approach

    Several researchers believe that load balancing, with its implication of attempting to equalize workload on all the

    nodes of the system, is not an appropriate objective. This is because the overhead involved in gathering the state

    information to achieve this objective is normally very large, especially in distributed systems having a large

number of nodes. In fact, for the proper utilization of resources of a distributed system, it is not required to balance the load on all the nodes. It is necessary and sufficient to prevent the nodes from being idle while some other

nodes have more than two processes. This rectification is called Dynamic Load Sharing instead of Dynamic

    Load Balancing.

The design of load-sharing algorithms requires that proper decisions be made regarding load estimation policy, process transfer policy, state information exchange policy, priority assignment policy, and migration limiting

    policy. It is simpler to decide about most of these policies in case of load sharing, because load sharing algorithms

    do not attempt to balance the average workload of all the nodes of the system. Rather, they only attempt to ensure

that no node is idle when a node is heavily loaded. The priority assignment policies and the migration limiting

policies for load-sharing algorithms are the same as those of load-balancing algorithms.
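A toy sketch of a load-sharing transfer decision, assuming a simple queue-length load estimation policy and a fixed threshold (the threshold value and node names are hypothetical):

```python
# Illustrative load-sharing transfer policy: a node is a candidate sender when
# its queue length exceeds a threshold, and a candidate receiver when it is idle.
THRESHOLD = 2

loads = {"node1": 5, "node2": 0, "node3": 1}     # number of ready processes per node

def pick_transfer(loads):
    senders   = [n for n, load in loads.items() if load > THRESHOLD]
    receivers = [n for n, load in loads.items() if load == 0]
    if senders and receivers:
        return senders[0], receivers[0]           # transfer one process
    return None

print(pick_transfer(loads))    # ('node1', 'node2'): no node stays idle
```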

    2. Write about:

    A) Processes B) Process Management

    C) Process Migration Mechanisms

    A) Processes

    The term "process" was first used by the designers of the MULTICS in 1960's. Since then, the term process is used

    somewhat interchangeably with 'task' or 'job'. The process has been given many definitions, for instance:A program in Execution.

    An asynchronous activity.


    The 'animated spirit' of a procedure in execution.

    The entity to which processors are assigned.

    The 'dispatchable' unit.

Many more definitions have been given. As we can see from the above, there is no universally agreed-upon definition, but the definition "Program in Execution" seems to be the most frequently used. And this is the concept

    used in the present study of operating systems.

Now that we have agreed upon the definition of a process, the question is what the relation between a process and a program is. Is it the same beast with a different name, so that when this beast is sleeping (not executing) it is called a program and when it is executing it becomes a process? Well, to be very precise, a process is not the same as a program. In the following discussion we point out some of the differences between a process and a program. A process is more than the program code. A process is an 'active' entity as opposed to a program, which is

    considered to be a 'passive' entity. As we all know a program is an algorithm expressed in some suitable notation,

(e.g., a programming language). Being passive, a program is only a part of a process. A process, on the other hand, includes:

Current value of the Program Counter (PC)

Contents of the processor's registers

    Value of the variables

The process stack (SP), which typically contains temporary data such as subroutine parameters, return addresses, and temporary variables.

    A data section that contains global variables.

A process is the unit of work in a system. In the process model, all software on the computer is organized into a

    number of sequential processes. A process includes PC, registers, and variables. Conceptually, each process has its

    own virtual CPU. In reality, the CPU switches back and forth among processes. (The rapid switching back and

    forth is called multiprogramming).

    B) Process Management

    In a conventional (or centralized) operating system, process management deals with mechanisms and policies for

    sharing the processor of the system among all processes. In a Distributed Operating system, the main goal of

    process management is to make the best possible use of the processing resources of the entire system by sharing

    them among all the processes. Three important concepts are used in distributed operating systems to achieve this

goal:

1. Processor Allocation: It deals with the process of deciding which process should be assigned to which

    processor.

    2. Process Migration: It deals with the movement of a process from its current location to the processor to which

    it has been assigned.

    3. Threads: They deal with fine-grained parallelism for better utilization of the processing capability of the

    system.

    C) Process Migration Mechanisms


    Migration of a process is a complex activity that involves proper handling of several sub-activities in order to meet

    the requirements of a good process migration mechanism. The four major subactivities involved in process

    migration are as follows:

1. Freezing the process on its source node and restarting it on its destination node.

2. Transferring the process address space from its source node to its destination node.

3. Forwarding messages meant for the migrant process.

    4. Handling communication between cooperating processes that have been separated as a result of process

    migration.

    3. Describe:

    A) Stateful Versus Stateless Servers B) Replication

    C) Caching

    A) Stateful Versus Stateless Servers

    The file servers that implement a distributed file service can be stateless or Stateful. Stateless file servers do not

    store any session state. This means that every client request is treated independently, and not as a part of a new or

    existing session. Stateful servers, on the other hand, do store session state. They may, therefore, keep track of

    which clients have opened which files, current read and write pointers for files, which files have been locked by

    which clients, etc.

The main advantage of stateless servers is that they can easily recover from failure. Because there is no state that must be restored, a failed server can simply restart after a crash and immediately provide services to clients as

though nothing happened. Furthermore, if clients crash, the server is not stuck with abandoned open or locked

    files. Another benefit is that the server implementation remains simple because it does not have to implement the

    state accounting associated with opening, closing, and locking of files.

    The main advantage of Stateful servers, on the other hand, is that they can provide better performance for clients.

Because clients do not have to provide full file information every time they perform an operation, the size of messages to and from the server can be significantly decreased. Likewise, the server can make use of knowledge of

    access patterns to perform read-ahead and do other optimizations. Stateful servers can also offer clients extra

    services such as file locking, and remember read and write positions.
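The difference can be made concrete with a small sketch (illustrative only; the function names are made up and no real file-service protocol is implied):

```python
# Contrast between stateless and stateful read operations (illustration only).
FILES = {"notes.txt": b"hello distributed world"}

# Stateless server: every request carries the full file name and offset, so a
# restarted server can serve it with no recovered session state.
def stateless_read(name, offset, count):
    return FILES[name][offset:offset + count]

# Stateful server: an open() call creates per-client state (a read pointer)
# that later reads rely on; this state is lost if the server crashes.
open_table = {}

def stateful_open(client, name):
    open_table[(client, name)] = 0                   # current read pointer

def stateful_read(client, name, count):
    pos = open_table[(client, name)]
    data = FILES[name][pos:pos + count]
    open_table[(client, name)] = pos + len(data)     # server remembers the position
    return data
```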

    B) Replication

    The main approach to improving the performance and fault tolerance of a DFS is to replicate its content. A

replicating DFS maintains multiple copies of files on different servers. This can prevent data loss, protect a system against downtime of a single server, and distribute the overall workload. There are three approaches to replication

in a DFS:

1. Explicit replication: The client explicitly writes files to multiple servers. This approach requires explicit

    support from the client and does not provide transparency.

2. Lazy file replication: The server automatically copies files to other servers after the files are written. Remote files are only brought up to date when the files are sent to the server. How often this happens is up to the

    implementation and affects the consistency of the file state.


3. Group file replication: Write requests are simultaneously sent to a group of servers. This keeps all the replicas

    up to date, and allows clients to read consistent file state from any replica.

C) Caching

Besides replication, caching is often used to improve the performance of a DFS. In a DFS, caching involves

    storing either a whole file, or the results of file service operations. Caching can be performed at two locations: at

    the server and at the client. Server-side caching makes use of file caching provided by the host operating system.

This is transparent to the server and helps to improve the server's performance by reducing costly disk accesses.

    Client-side caching comes in two flavours: on-disk caching, and in-memory caching. On-disk caching involves the

creation of (temporary) files on the client's disk. These can either be complete files (as in the upload/download

    model) or they can contain partial file state, attributes, etc. In-memory caching stores the results of requests in the

client machine's memory. This can be process-local (in the client process), in the kernel, or in a separate dedicated

    caching process. The issue of cache consistency in DFS has obvious parallels to the consistency issue in shared

    memory systems, but there are other tradeoffs (for example, disk access delays come into play, the granularity of

    sharing is different, sizes are different, etc.). Furthermore, because write-through caches are too expensive to be

    useful, the consistency of caches will be weakened. This makes implementing Unix semantics impossible.

Approaches used in DFS caches include delayed writes, where writes are not propagated to the server

immediately but in the background later on, and write-on-close, where the server receives updates only after the

    file is closed. Adding a delay to write-on-close has the benefit of avoiding superfluous writes if a file is deleted

    shortly after it has been closed.
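A rough sketch of a client-side cache using write-on-close with a delayed write-back, under the assumption that server_write is whatever call ships the data to the file server (all names and the delay value are hypothetical):

```python
import time

# Illustrative client-side cache using write-on-close with a short delay.
class CachedFile:
    def __init__(self, name, server_write, delay=5.0):
        self.name, self.server_write, self.delay = name, server_write, delay
        self.data, self.dirty, self.closed_at = b"", False, None

    def write(self, data):
        self.data += data              # update only the in-memory copy
        self.dirty = True

    def close(self):
        self.closed_at = time.time()   # schedule the write-back

    def flush_if_due(self, deleted=False):
        # Called periodically; a file deleted shortly after close is never
        # written back, which avoids a superfluous transfer.
        due = (self.dirty and not deleted and self.closed_at is not None
               and time.time() - self.closed_at >= self.delay)
        if due:
            self.server_write(self.name, self.data)
            self.dirty = False
```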

    4. Describe the following with their real time applications:

    A) Digital Signatures B) Design Principles

    A) Digital Signatures

    A digital signature of a message is a number dependent on some secret known only to the signer, and,

    additionally, on the content of the message being signed. Signatures must be verifiable; if a dispute arises as to

whether a party signed a document (caused by either a lying signer trying to repudiate a signature it did create, or a

fraudulent claimant), an unbiased third party should be able to resolve the matter equitably, without requiring

access to the signer's secret information (private key). Digital signatures have many applications in information

    security, including authentication, data integrity, and non-repudiation. One of the most significant applications of

    digital signatures is the certification of public keys in large networks. Certification is a means for a trusted third

    party (TTP) to bind the identity of a user to a public key, so that at some later time, other entities can authenticate a

    public key without assistance from a trusted third party. The concept and utility of a digital signature was

    recognized several years before any practical realization was available. The first method discovered was the RSA

    signature scheme, which remains today one of the most practical and versatile techniques available. Subsequent


    research has resulted in many alternative digital signature techniques. Some offer significant advantages in terms

    of functionality and implementation.

    Basic definitions

    1. A digital signature is a data string which associates a message (in digital form) with some originating entity.

2. A digital signature generation algorithm (or signature generation algorithm) is a method for producing a

    digital signature.

3. A digital signature verification algorithm (or verification algorithm) is a method for verifying that a digital

    signature is authentic (i.e., was indeed created by the specified entity).

    4. A digital signature scheme (or mechanism) consists of a signature generation algorithm and an associated

    verification algorithm.

5. A digital signature signing process (or procedure) consists of a (mathematical) digital signature generation

algorithm, along with a method for formatting data into messages which can be signed.

6. A digital signature verification process (or procedure) consists of a verification algorithm, along with a method

    for recovering data from the message.

    7. (messages) M is the set of elements to which a signer can affix a digital signature.

    8. (signing space) MS is the set of elements to which the signature transformations are applied. The signature

    transformations are not applied directly to the set M.

    9. (signature space) S is the set of elements associated to messages in M. These elements are used to bind the

    signer to the message.

    10. (indexing set) R is used to identify specific signing transformations.
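A toy, textbook-RSA illustration of a signing and verification algorithm pair (the tiny key below is hopelessly insecure and is used only to make the sign/verify relationship concrete; it is not the scheme parameters referred to in the text):

```python
import hashlib

# Toy textbook-RSA signature (tiny, insecure parameters; illustration only).
p, q = 61, 53
n = p * q                  # 3233
e = 17                     # public exponent
d = 2753                   # private exponent: d * e = 1 (mod (p-1)*(q-1))

def h(message):
    # Hash the message and reduce it into the signing space modulo n.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message):
    return pow(h(message), d, n)          # signature depends on the secret d

def verify(message, signature):
    return pow(signature, e, n) == h(message)

sig = sign(b"pay Bob 10")
print(verify(b"pay Bob 10", sig))         # True
print(verify(b"pay Bob 99", sig))         # False: message was altered
```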

    B) Design Principles

Designers of the security components of a distributed operating system should follow these guidelines when designing a secure network:

1. Least Privilege: This principle is also known as the need-to-know principle. It states that any process should be

    given only those access rights that enable it to access, at any time, what it needs to accomplish

its function, and nothing more. i.e. the security system must be flexible enough to allow the access

    rights of a process to grow and shrink with its changing access requirements. This principle serves to limit the

damage when a system's security is broken.

    2. Fail-Safe defaults: Access rights should be acquired by explicit permission only and the default should be no

access. This principle requires that access control decisions should be based on why an object should be accessible to a process rather than on why it should not be accessible.

    3. Open design: This principle requires that the design should not be secret but should be public. It is a mistake on

    the part of a designer to assume that intruders will not know how the security mechanism of the system works.


    4. Built into the system: This principle requires that the security be designed into the systems at their inception

    and be built into the lowest layers of the systems. i.e. security should not be treated as an add-on feature because

    security problems cannot be resolved very effectively by patching the penetration holes detected in an existing

    system.

    5. Check for current authority: This principle requires that every access to every object must be checked using

an access control database for authority. This is necessary so that the revocation of previously given access

rights takes immediate effect.

    6. Easy granting and revocation of access rights: For greater flexibility, a security system must allow access

    rights for an object to be granted or revoked dynamically. It should be possible to restrict some of the rights and to

    grant to a user only those rights that are sufficient to accomplish its functions. On the other hand, a good

    security system should allow immediate revocation with the flexibility of selective and partial revocation.

    7. Never trust other parties: For producing a secured distributed system, the system components must be

designed with the assumption that other parties (humans or programs) are not trustworthy until they are

    demonstrated to be trustworthy.

    8. Always ensure freshness of messages: To avoid security violations through the replay of messages, the

security of a distributed system must be designed to always ensure the freshness of messages exchanged between two communicating entities.

9. Build firewalls: To limit the damage in case of a system's security being compromised, the system must have firewalls built into it. One way to meet these requirements is to allow only short-lived passwords and keys in the

    system.

    10. Efficient: The security mechanisms used must execute efficiently and be simple to implement.

    11. Convenient to use: To be psychologically acceptable, the security mechanisms must be convenient to use.

    Otherwise, they are likely to be bypassed or incorrectly used by the users.

    12. Cost Effective: It is often the case that security needs to be traded off with other goals of the system, such as

    performance or ease of use.