
Clock Synchronization & Mutual Exclusion in Distributed Operating Systems. Brett O’Neill, CSE 8343 – Group A6.

Dec 13, 2015

Transcript
Page 1:

Clock Synchronization & Mutual Exclusion in Distributed Operating Systems

Brett O’Neill

CSE 8343 – Group A6

Page 2:

Overview

What is a Distributed Operating System (DOS)?
The Importance of Synchronization
Clock Synchronization
  Physical Clock Synchronization
    UTC
    Cristian’s Algorithm
    Berkeley Algorithm
    Decentralized Averaging Algorithm
  Logical Clocks
    Lamport’s Clock Synchronization Algorithm
Mutual Exclusion
  Lamport’s Algorithm
  Centralized Algorithm
  Distributed Algorithms
  Token-Based Algorithms
Questions?

Page 3:

What is a Distributed Operating System (DOS)?

A DOS is a collection of heterogeneous computers connected via a network.
The functions of a conventional operating system are distributed throughout the network.
To users of a DOS, it is as if the computer has a single processor.

Page 4:

What is a Distributed Operating System (DOS)? (cont.)

The multiple processors do not share a common memory or clock.
Instead, each processor has its own local memory and communicates with the other processors through communication lines.

Page 5:

What is a Distributed Operating System (DOS)? (cont.)

The goal of a DOS is to provide a common, consistent view of:
  File systems
  Name space
  Time
  Security
  Access to resources
while keeping the details transparent to users.

Page 6:

The Importance of Synchronization

Because various components of a distributed system must cooperate and exchange information, synchronization is a necessity.
Various components of the system must agree on the timing and ordering of events. Imagine a banking system that did not track the timing and ordering of financial transactions. Similar chaos would ensue if distributed systems were not synchronized.
Constraints, both implicit and explicit, are therefore enforced to ensure synchronization of components.

Page 7:

Clock Synchronization

Page 8:

Clock Synchronization

As in non-distributed systems, the knowledge of when events occur is necessary.
However, clock synchronization is often more difficult in distributed systems because there is no ideal time source, and because distributed algorithms must sometimes be used.
Distributed algorithms must overcome:
  Scattering of information
  Local, rather than global, decision-making

Page 9:

Clock Synchronization

Physical Clocks
The time difference between two computers’ clocks is known as skew. The rate at which a clock gains or loses time is known as drift.
Computer clock manufacturers specify a maximum drift rate for their products.
Computer clocks are among the least accurate modern timepieces.
  Inside every computer is a chip surrounding a quartz crystal oscillator to record time. These crystals cost 25 seconds to produce.
  Average loss of accuracy: 0.86 seconds per day
  This drift is unacceptable for distributed systems.
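
As a rough check on the figure above, 0.86 seconds per day is a rate of about 1e-5 (10 parts per million), since a day has 86,400 seconds. The sketch below converts the daily loss to a rate and estimates how often two such clocks would need resynchronizing to stay within a chosen bound. The function names, the 10 ms bound, and the worst-case assumption that the two clocks run off in opposite directions are illustrative choices, not taken from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def loss_rate(loss_per_day_s):
    """Seconds gained or lost per elapsed second of real time."""
    return loss_per_day_s / SECONDS_PER_DAY

def resync_interval(max_difference_s, rate):
    """Longest gap between resyncs so that two clocks, running off in
    opposite directions at `rate`, never differ by more than
    max_difference_s (a worst-case assumption)."""
    return max_difference_s / (2 * rate)

rate = loss_rate(0.86)
print(f"rate: {rate:.2e} seconds per second")
print(f"resync every {resync_interval(0.010, rate):.0f} s to stay within 10 ms")
```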

Several methods are now in use to attempt the synchronization of physical clocks in distributed systems:

Page 10:

Clock Synchronization

Physical Clocks - UTC

Coordinated Universal Time (UTC) is the international time standard. UTC is the current term for what was commonly referred to as Greenwich Mean Time (GMT). Zero hours UTC is midnight in Greenwich, England, which lies on the zero longitudinal meridian. UTC is based on a 24-hour clock.

Page 11:

Clock Synchronization

Physical Clocks – Cristian’s Algorithm
Assuming there is one time server with UTC:
  Each node in the distributed system periodically polls the time server.
  The time at T1 is estimated as Stime + (T1 – T0)/2, where T0 is the node’s clock when it sent the request, T1 is its clock when the reply arrived, and Stime is the server time carried in the reply.
  This process is repeated several times and an average is provided.
  The node then attempts to adjust its time.
Disadvantages:
  Must attempt to take the delay between the node and the time server into account
  Single point of failure if the time server fails
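
The polling estimate above can be sketched as follows. This is a minimal illustration, not a production client: `poll_once` is a hypothetical callable standing in for the network request to the time server, and the estimate assumes the request and the reply spend equal time in transit.

```python
import time

def cristian_estimate(t0, server_time, t1):
    """Estimate the true time at the moment the reply arrives.

    t0: local clock when the request was sent
    t1: local clock when the reply was received
    server_time: Stime, the time the server put in its reply
    Assumes request and reply take equally long in transit.
    """
    return server_time + (t1 - t0) / 2

def average_offset(poll_once, rounds=5, clock=time.monotonic):
    """Average (estimated time - local time) over several polls.

    poll_once is a hypothetical callable that contacts the time
    server and returns its Stime.
    """
    offsets = []
    for _ in range(rounds):
        t0 = clock()
        server_time = poll_once()
        t1 = clock()
        offsets.append(cristian_estimate(t0, server_time, t1) - t1)
    return sum(offsets) / len(offsets)

# Request sent at 100.000, reply received at 100.020, Stime = 105.000:
# half the 20 ms round trip is added to the server's time.
print(cristian_estimate(100.000, 105.000, 100.020))
```

Averaging the offset rather than the raw estimate keeps the result meaningful even though each poll happens at a different moment.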

Page 12:

Clock Synchronization

Physical Clocks – Berkeley Algorithm
One daemon without UTC:
  Periodically, the daemon polls all machines on the distributed system for their times.
  The machines answer.
  The daemon computes an average time and broadcasts it to the machines so they can adjust.
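
One polling round of the averaging step might look like the sketch below. It ignores message delay entirely and assumes every machine answers; the machine names and the choice to return a per-machine adjustment (rather than only the average) are illustrative assumptions.

```python
def berkeley_round(daemon_time, reported_times):
    """One polling round of the averaging step.

    daemon_time: the daemon's own clock reading
    reported_times: dict of machine name -> clock reading it answered with
    Returns (average, adjustments); adjustments maps every participant,
    the daemon included, to the amount it should add to its clock.
    Assumes no machine is itself named "daemon".
    """
    readings = dict(reported_times)
    readings["daemon"] = daemon_time
    average = sum(readings.values()) / len(readings)
    adjustments = {name: average - t for name, t in readings.items()}
    return average, adjustments

# Readings in minutes relative to the daemon's clock, hypothetical machines.
average, adjustments = berkeley_round(0.0, {"m1": 25.0, "m2": -10.0})
print(average)       # 5.0
print(adjustments)   # {'m1': -20.0, 'm2': 15.0, 'daemon': 5.0}
```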

Page 13:

Clock Synchronization

Physical Clocks – Decentralized Averaging Algorithm

Each machine on the distributed system has a daemon without UTC.
Periodically, at an agreed-upon fixed time, each machine broadcasts its local time.
Each machine calculates the correct time by averaging all results.
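
A minimal sketch of the averaging each machine performs locally, assuming every broadcast arrives and transmission delay is negligible (neither assumption holds on a real network). The machine names are hypothetical.

```python
def local_average(own_time, received_times):
    """New clock value for one machine: the mean of its own reading and
    every reading it received in this round's broadcasts."""
    readings = [own_time, *received_times]
    return sum(readings) / len(readings)

# Three hypothetical machines broadcast at the agreed instant.
clocks = {"a": 100.0, "b": 103.0, "c": 97.0}
new_clocks = {
    name: local_average(t, [v for other, v in clocks.items() if other != name])
    for name, t in clocks.items()
}
print(new_clocks)  # every machine arrives at 100.0
```

Because every machine averages the same set of readings, all of them reach the same value without any coordinator.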

Page 14:

Clock Synchronization

Physical Clocks – Network Time Protocol (NTP)

Enables clients across the Internet to be synchronized accurately to UTC.
  Overcomes large and variable message delays
  Employs statistical techniques for filtering, based on past quality of servers and several other measures
Can survive lengthy losses of connectivity:
  Redundant servers
  Redundant paths to servers
Provides protection against malicious interference through authentication techniques

Page 15:

Clock Synchronization

Physical Clocks – Network Time Protocol (NTP) (cont.)

Uses a hierarchy of servers located across the Internet. Primary servers are directly connected to a UTC time source.

Page 16:

Clock Synchronization

Physical Clocks – Network Time Protocol (NTP) (cont.)

NTP has three modes:
Multicast Mode:
  Suitable for user workstations on a LAN
  One or more servers periodically multicast the time to other machines on the network.
Procedure Call Mode:
  Similar to Cristian’s Algorithm
  Provides higher accuracy than Multicast Mode because delays are compensated for
Symmetric Mode:
  Pairs of servers exchange pairs of timing messages that contain time stamps of recent message events.
  The most accurate, but also the most expensive mode

Although NTP is quite advanced, clocks can still differ by 20-35 milliseconds.
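
The slides do not give the arithmetic behind the time-stamp exchange. The sketch below uses the standard NTP offset and round-trip delay calculation from four time stamps; the function name and the sample values are illustrative only.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one exchange.

    t1: request sent (client clock)     t2: request received (server clock)
    t3: reply sent (server clock)       t4: reply received (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # server clock minus client clock
    delay = (t4 - t1) - (t3 - t2)          # time spent on the network
    return offset, delay

offset, delay = ntp_offset_and_delay(t1=10.0, t2=15.5, t3=15.75, t4=11.25)
print(offset, delay)   # 5.0 1.0
```

Subtracting the server's processing time (t3 - t2) from the total elapsed time is what lets the delay be compensated for, which a one-way multicast cannot do.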

Page 17:

Clock Synchronization

Logical Clocks

Often, it is not necessary for a computer to know the exact time, only relative time. This is known as “logical time”.
Logical time is not based on timing but on the ordering of events.
Logical clocks can only advance forward, not in reverse.
Non-interacting processes do not need to share a logical clock.
Computers generally obtain logical time using interrupts to update a software clock. The more interrupts (the more frequently time is updated), the higher the overhead.

Page 18:

Clock Synchronization

Logical Clocks - Lamport’s Logical Clock Synchronization Algorithm

The most common logical clock synchronization algorithm for distributed systems is Lamport’s Algorithm. It is used in situations where ordering is important but global time is not required.
Based on the “happens-before” relation:
  Event A “happens-before” Event B (A→B) when all processes involved in a distributed system agree that Event A occurred first, and B subsequently occurred.
  This DOES NOT mean that Event A actually occurred before Event B in absolute clock time.
A distributed system can use the “happens-before” relation when:
  Events A and B are observed by the same process, or by multiple processes with the same global clock
  Event A is the sending of a message and Event B is the receipt of that message, since a message cannot be received before it is sent
If two events are not connected by messages in this way, they are considered concurrent: their order cannot be determined and it does not matter. Concurrent events can be ignored.

Page 19:

Clock Synchronization

Logical Clocks - Lamport’s Logical Clock Synchronization Algorithm (cont.)

In the previous examples, where A→B, the clock values satisfy C(A) < C(B).
If A and B are concurrent, the algorithm places no constraint on their clock values, so they may be equal: C(A) = C(B).
Equal clock values can only arise for events that are not linked by a message, because every message transfer between two systems takes at least one clock tick.
In Lamport’s Algorithm, logical clock values for events may be changed, but always by moving the clock forward. Time values can never be decreased.
An additional refinement in the algorithm is often used:
  If Event A and Event B have equal clock values, C(A) = C(B), some unique property of the processes associated with these events can be used to choose a winner. This establishes a total ordering of all events.
  Process ID is often used as the tiebreaker.
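
A minimal sketch of these rules, assuming the common formulation in which a receiver sets its clock to one more than the larger of its own value and the sender's. The class and method names are illustrative; stamps are `(time, pid)` tuples so that ordinary tuple comparison applies the process-ID tiebreaker.

```python
class LamportClock:
    def __init__(self, pid):
        self.pid = pid      # unique process ID, used only to break ties
        self.time = 0

    def stamp(self):
        return (self.time, self.pid)

    def tick(self):
        """Local event: advance the clock by one."""
        self.time += 1
        return self.stamp()

    def send(self):
        """Sending is an event; the returned stamp travels with the message."""
        return self.tick()

    def receive(self, stamp):
        """Move forward past the sender's time; never move backward."""
        sender_time, _ = stamp
        self.time = max(self.time, sender_time) + 1
        return self.stamp()

p1, p2 = LamportClock(1), LamportClock(2)
a = p1.send()        # (1, 1): p1 sends a message
c = p2.tick()        # (1, 2): unrelated local event on p2, same clock value
b = p2.receive(a)    # (2, 2): receipt is stamped later than the send

print(a < b)         # True: send happens-before receive
print(a < c)         # True: equal clock values, lower process ID wins
```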

Page 20:

Clock Synchronization

Logical Clocks - Lamport’s Logical Clock Synchronization Algorithm (cont.)

Lamport’s Algorithm can thus be used in distributed systems to ensure synchronization:
  A logical clock is implemented in each node in the system.
  Each node can determine the order in which events have occurred from that node’s own point of view.
  The logical clock of one node does not need to have any relation to real time or to any other node in the system.

Page 21:

Mutual Exclusion

Lamport’s Algorithm

Every node in the system keeps a request queue sorted by logical time stamp.
  Logical clocks are used to impose total global order on all events.
Ordered message delivery between every pair of communicating sites is assumed.
  Messages sent from Site Si arrive at Site Sj in the same order.

Page 22:

Mutual Exclusion

Lamport’s Algorithm (cont.)

1. Site Si sends a time-stamped request to all other sites and places the request in its local request queue.
2. When Site Sj receives the request, it sends a time-stamped reply to Site Si and places the request in its local request queue.
3. Site Si enters the critical section when its request is at the head of its own queue and it has received a message from all other sites with a timestamp larger than that of the request.
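
The three steps can be sketched as a small single-threaded simulation. Direct method calls stand in for messages, so delivery is instant and ordered, which a real network would not guarantee. The `release` step and the head-of-queue check follow the usual statement of Lamport’s algorithm; all names here are illustrative.

```python
import heapq

class Site:
    def __init__(self, pid, network):
        self.pid = pid
        self.network = network   # shared dict pid -> Site (stand-in for the network)
        self.clock = 0
        self.queue = []          # heap of (timestamp, pid) requests
        self.last_seen = {}      # pid -> timestamp of latest message from it
        self.my_request = None

    def _others(self):
        return [s for p, s in self.network.items() if p != self.pid]

    def request(self):
        """Step 1: stamp a request, queue it locally, send it to everyone."""
        self.clock += 1
        self.my_request = (self.clock, self.pid)
        heapq.heappush(self.queue, self.my_request)
        for site in self._others():
            reply_ts = site.on_request(self.my_request)
            self.on_reply(site.pid, reply_ts)

    def on_request(self, req):
        """Step 2: queue the request and answer with a time-stamped reply."""
        self.clock = max(self.clock, req[0]) + 1
        self.last_seen[req[1]] = req[0]
        heapq.heappush(self.queue, req)
        return self.clock

    def on_reply(self, sender, ts):
        self.clock = max(self.clock, ts) + 1
        self.last_seen[sender] = ts

    def can_enter(self):
        """Step 3: own request heads the queue and every other site has
        sent something stamped later than the request."""
        if self.my_request is None or self.queue[0] != self.my_request:
            return False
        return all(self.last_seen.get(s.pid, -1) > self.my_request[0]
                   for s in self._others())

    def release(self):
        """On exit, the request is removed from every site's queue."""
        for site in self.network.values():
            site.queue.remove(self.my_request)
            heapq.heapify(site.queue)
        self.my_request = None

net = {}
for pid in (1, 2, 3):
    net[pid] = Site(pid, net)

net[1].request()
net[2].request()
print(net[1].can_enter(), net[2].can_enter())   # True False
net[1].release()
print(net[2].can_enter())                       # True
```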

Page 23:

Mutual Exclusion

Centralized Algorithm

The simplest and most straightforward way to achieve mutual exclusion in a distributed system is to simulate how it is done in a one-processor system:
  One process is elected as the coordinator.
  When any process wants to enter a critical section, it sends a request message to the coordinator stating which critical section it wants to access.
  If no other process is currently in that critical section, the coordinator sends back a reply granting permission. When the reply arrives, the requesting process enters the critical section. If another process requests access to the same critical section, it is ignored or blocked until the first process exits the critical section and sends a message to the coordinator stating that it has exited.
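
The coordinator’s bookkeeping can be sketched as below. This illustrative version queues blocked requests in arrival order and grants the next one on release, which is one of the two behaviors the slide allows ("ignored or blocked"). The class, method, and process names are hypothetical, and message passing is replaced by direct calls.

```python
from collections import deque

class Coordinator:
    def __init__(self):
        self.holder = {}    # critical section name -> process holding it
        self.waiting = {}   # critical section name -> queue of blocked processes

    def request(self, pid, section):
        """Return True if permission is granted now, False if pid must wait."""
        if section not in self.holder:
            self.holder[section] = pid
            return True
        self.waiting.setdefault(section, deque()).append(pid)
        return False

    def release(self, pid, section):
        """Called when pid exits; returns the next process granted, if any."""
        if self.holder.get(section) != pid:
            raise ValueError(f"{pid} does not hold {section}")
        queue = self.waiting.get(section)
        if queue:
            self.holder[section] = queue.popleft()
            return self.holder[section]
        del self.holder[section]
        return None

coord = Coordinator()
print(coord.request("p1", "db"))   # True: section was free
print(coord.request("p2", "db"))   # False: p2 is queued
print(coord.release("p1", "db"))   # p2: permission passes to the waiter
```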

Page 24:

Mutual Exclusion

Centralized Algorithm (cont.)

The Centralized Algorithm does have disadvantages:
  The coordinator is a single point of failure.
  If processes are normally ignored when requesting a critical section that is in use, they cannot distinguish between a dead coordinator and “permission denied”.
  In a large system, a single coordinator can be a bottleneck.

Page 25:

Mutual Exclusion

Distributed Algorithms

It is often unacceptable to have a single point of failure. Therefore researchers continue to look for distributed mutual exclusion algorithms. The best known is by Ricart and Agrawala:
  There must be a total ordering of all events in the system. Lamport’s Algorithm can be used for this purpose.
  When a process wants to enter a critical section, it builds a message containing the name of the critical section, its process number, and the current time. It then sends the message to all other processes, as well as to itself.

Page 26:

Mutual Exclusion

Distributed Algorithms (cont.)

When a process receives a request message, the action it takes depends on its state with respect to the critical section named in the message. There are three cases:
  If the receiver is not in the critical section and does not want to enter it, it sends an OK message to the sender.
  If the receiver is in the critical section, it does not reply. It instead queues the request.
  If the receiver also wants to enter the same critical section, it compares the time stamp in the incoming message with the time stamp in the message it has sent out. The lowest time stamp wins. If the incoming message has the lower time stamp, the receiver sends an OK message. If its own message has the lower time stamp, it does not reply and queues the request from the sending process.
When a process has received OK messages from all other processes, it enters the critical section. Upon exiting the critical section, it sends OK messages to all processes in its queue and deletes them all from the queue.
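
The three cases can be sketched as a single-threaded simulation. It simplifies the description above in labeled ways: there is only one critical section, a process does not send the request to itself, and direct method calls stand in for messages. Names and the `(timestamp, pid)` request format are illustrative.

```python
class Process:
    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = peers        # shared dict pid -> Process (stand-in network)
        self.clock = 0
        self.state = "RELEASED"   # RELEASED, WANTED, or HELD
        self.request = None       # (timestamp, pid) of our outstanding request
        self.oks = set()          # peers that have answered OK
        self.deferred = []        # requesters we still owe an OK

    def want(self):
        """Broadcast a time-stamped request to every other process."""
        self.clock += 1
        self.state, self.request, self.oks = "WANTED", (self.clock, self.pid), set()
        for pid, peer in self.peers.items():
            if pid != self.pid:
                peer.on_request(self.request)
        self._try_enter()

    def on_request(self, req):
        self.clock = max(self.clock, req[0]) + 1
        ours_first = self.state == "WANTED" and self.request < req
        if self.state == "HELD" or ours_first:
            self.deferred.append(req[1])          # cases 2 and 3: queue, no reply
        else:
            self.peers[req[1]].on_ok(self.pid)    # case 1, or the sender wins

    def on_ok(self, sender):
        self.oks.add(sender)
        self._try_enter()

    def _try_enter(self):
        if self.state == "WANTED" and len(self.oks) == len(self.peers) - 1:
            self.state = "HELD"

    def leave(self):
        """Exit the critical section and answer everyone who was queued."""
        self.state, self.request = "RELEASED", None
        pending, self.deferred = self.deferred, []
        for pid in pending:
            self.peers[pid].on_ok(self.pid)

peers = {}
for pid in (1, 2, 3):
    peers[pid] = Process(pid, peers)

peers[1].want()
peers[2].want()
print(peers[1].state, peers[2].state)   # HELD WANTED
peers[1].leave()
print(peers[2].state)                   # HELD
```

Comparing `(timestamp, pid)` tuples gives the total ordering the algorithm requires, with the process number breaking ties.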

Page 27:

Mutual Exclusion

Token-Based Algorithms

Another approach is to create a logical or physical ring.
Each process knows the identity of the process succeeding it.
When the ring is initialized, Process 0 is given a token. The token circulates around the ring in order, from Process k to Process k + 1.
When a process receives the token from its neighbor, it checks to see if it is attempting to enter a critical section. If so, the process enters the critical section and does its work, keeping the token the whole time.

Page 28:

Mutual Exclusion

Token-Based Algorithms (cont.)

After the process exits the critical section, it passes the token to the next process in the ring. It is not permitted to enter a second critical section using the same token.
If a process is handed a token and is not interested in entering a critical section, it passes the token to the next process.
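
Token circulation can be sketched as a simple loop. The function below is an illustration only: the ring is represented by process numbers 0 to n-1, the successor of Process k is (k + 1) mod n, and token loss or process failure is not modeled.

```python
def circulate(n, wants, start=0, steps=None):
    """Pass the token around a ring of n processes.

    wants: process numbers waiting to enter a critical section
    start: process that holds the token initially (Process 0 by default)
    steps: number of token hand-offs to simulate (one full lap by default)
    Returns the order in which processes entered.
    """
    steps = n if steps is None else steps
    wants = set(wants)
    entered = []
    holder = start
    for _ in range(steps):
        if holder in wants:
            entered.append(holder)   # enter, do the work, exit
            wants.discard(holder)    # only one entry per token visit
        holder = (holder + 1) % n    # pass the token to the successor
    return entered

print(circulate(5, {3, 1, 4}))           # [1, 3, 4]
print(circulate(5, {3, 1, 4}, start=2))  # [3, 4, 1]
```

Entry order depends only on ring position relative to the token, not on when each process started waiting.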

Page 29:

Questions?