Time within Distributed Systems Time is important, however, it is problematic in distributed systems as we cannot synchronize time perfectly

Time is important, however, it is problematic in distributed …glasnost.itcarlow.ie/~barryp/slides/time.pdf · Synchronizing Physical Clocks External synchronization setting the

Nov 05, 2020


dariahiddleston

Time within Distributed Systems

Time is important; however, it is problematic in distributed systems, as we cannot synchronize time perfectly


Introducing Time

● Time is a quantity that we often want to measure accurately

● Algorithms that depend upon clock synchronization have been developed in a lot of areas (not just within the distributed systems arena)

● Physical time is problematic within distributed systems (for lots of reasons)


Clocks, Events and Process States

● We define an event to be the occurrence of a single action that a process carries out as it executes

● An event is a communication action or a state-transforming action

● Clocks - every computer has one, and it can be used to timestamp any event


Clock Skew

● Computer clocks are like all other clocks in that they tend not to be in perfect agreement

● Clock skew (the difference between the readings of two clocks) and clock drift (the rate at which a clock departs from true time) are both factors

● For ordinary clocks based on a quartz crystal, clock drift is about 10^-6 seconds/second - giving a difference of 1 second every 1,000,000 seconds (or 11.6 days)

● The drift rate of a "high precision" quartz clock is about 10^-7 or 10^-8 seconds/second
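The drift figures above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the function names are made up for the example, and the drift rates are the ones quoted on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_until_error(drift_rate, target_error=1.0):
    """Real seconds that pass before a clock drifting at drift_rate (s/s) is off by target_error seconds."""
    return target_error / drift_rate


def days_until_error(drift_rate, target_error=1.0):
    """Same quantity as seconds_until_error, expressed in days."""
    return seconds_until_error(drift_rate, target_error) / SECONDS_PER_DAY


for label, rate in [("ordinary quartz", 1e-6), ("high precision quartz", 1e-8)]:
    print(f"{label}: 1 second of error after about {days_until_error(rate):.1f} days")
```

For the ordinary quartz rate this reproduces the 11.6 days quoted above.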


UTC

● UTC stands for Coordinated Universal Time and is set from atomic clocks (which have a drift rate of about one part in 10^13)

● UTC is an international standard for timekeeping

● UTC timing signals are broadcast from land-based radio stations and from satellites (including GPS)

● Computers with the appropriate (and expensive) receivers attached can synchronize their clocks with UTC


Synchronizing Physical Clocks

● External synchronization - setting the time from some external source of time

● Internal synchronization - setting the time based on "local agreement" (local time)

● In a synchronous distributed system, bounds are known for the drift rate of clocks, for the maximum transmission delay, and for the time to execute each processing step - so synchronizing clocks is "easier"

● Unfortunately, most distributed systems are asynchronous


Cristian's Method for Synchronizing Clocks

● Cristian suggested the use of a time server, connected to a device that receives signals from a source of UTC


More on Cristian's Algorithm

● Basic idea: Getting the current time from a “time server”, using periodic client requests

● Major problem – what happens if the time from the time server is less than the client's time? Setting the client clock to it would make time run backwards on the client, which must not happen (time does not go backwards)

● Minor problem – the network request/response introduces a delay (latency), so the server's reading is already out of date when it arrives
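A minimal sketch of the two problems above, under stated assumptions: the reply is assumed to spend half of the measured round trip in transit, and a client that is ahead of the server is slowed down rather than set back (a common remedy, not something these slides spell out). `request_server_time` is a hypothetical stand-in for the real network call, and the function names are illustrative.

```python
import time


def cristian_estimate(request_server_time, local_clock=time.monotonic):
    """Return (estimated_server_time, round_trip) from one request to the time server.

    request_server_time is a hypothetical callable that performs the network
    request and returns the server's clock reading.
    """
    sent_at = local_clock()
    server_time = request_server_time()
    received_at = local_clock()
    round_trip = received_at - sent_at
    # Assumption: the reply spent half of the round trip on the wire.
    return server_time + round_trip / 2, round_trip


def plan_correction(local_time, estimate):
    """Choose a correction that never moves the local clock backwards."""
    offset = estimate - local_time
    if offset >= 0:
        return ("step_forward", offset)
    # The client is ahead: run its clock slowly until it has shed the excess.
    return ("slow_down", -offset)


if __name__ == "__main__":
    # Stand-in for a real time server: it just reads this machine's clock.
    estimate, round_trip = cristian_estimate(time.time)
    print(plan_correction(time.time(), estimate), round_trip)
```

Measuring the round trip with a monotonic clock keeps the latency estimate valid even if the wall clock is adjusted while the request is in flight.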


Discussing Cristian's Algorithm

● Single point of failure (if only one server is used)

● The time server may fail and thus render synchronization temporarily impossible

● Solution: configure a group of synchronized time servers, to which clients multicast their requests

● Research showed that if F is the number of faulty server clocks out of a total of N servers, then we must have N > 3F if the other, correct, clocks are still to be able to achieve agreement


The Berkeley Algorithm

● A coordinator is chosen to act as the "master" clock

● The master periodically polls the other computers (the "slaves") to determine their local time

● An average time is then calculated by the master and distributed to the slaves to allow them to adjust their clocks to the "correct time"


Berkeley in Action

Clocks running fast slow down (so that the others can catch up); clocks running slow skip forward to the correct time


Discussing Berkeley's Algorithm

● Faulty clocks can be dealt with due to the master's ability to take a "fault-tolerant average" - a subset of clocks is chosen that do not differ from one another by more than a specified amount, and the average is taken of the time readings from only these clocks

● An experiment involving 15 computers showed that Berkeley could synchronize clocks to within 20-25 milliseconds

● If the master suffers a failure, protocols exist to elect a successor (that is, a new master)
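The "fault-tolerant average" can be sketched as follows. The slide only says that a subset of mutually close clocks is chosen; picking the largest such subset is an assumption made for this example, as are the function names and the sample readings.

```python
def fault_tolerant_average(readings, max_spread):
    """Average the largest group of readings that differ by at most max_spread.

    Expects a non-empty list of readings and a non-negative max_spread.
    """
    ordered = sorted(readings)
    best = []
    start = 0
    for end in range(len(ordered)):
        # Shrink the window until its slowest and fastest clocks agree closely enough.
        while ordered[end] - ordered[start] > max_spread:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return sum(best) / len(best)


def berkeley_adjustments(readings, max_spread):
    """Map each machine name to the offset it should apply to reach the agreed time."""
    target = fault_tolerant_average(list(readings.values()), max_spread)
    return {name: target - value for name, value in readings.items()}


# Made-up clock readings in seconds; "faulty" is far away from the rest.
clocks = {"master": 180.0, "a": 185.0, "b": 175.0, "faulty": 900.0}
print(berkeley_adjustments(clocks, max_spread=20.0))
```

Sending each machine an offset rather than an absolute time means the adjustment does not go stale while the message is in transit.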


The Network Time Protocol (NTP)

Defines an architecture for a time service and a protocol to distribute time information over the Internet


NTP Design Goals

● To provide a service enabling clients across the Internet to be synchronized accurately to UTC

● To provide a reliable service that can survive lengthy losses of connectivity

● To enable clients to resynchronize sufficiently frequently to offset the rates of drift found in most computers

● To provide protection against interference with the time service, whether malicious or accidental


How NTP Works

● Provides a network of servers located across the Internet

● Primary servers - attached to a UTC time source

● Secondary servers - connected to a primary for synchronization

● The servers are connected in a logical hierarchy called a "synchronization subnet", whose levels are called "strata"

● The synchronization subnet can reconfigure as servers become unreachable or failures occur

● Messages are delivered using UDP


Example Synchronization Subnet

[Figure: example synchronization subnet; the surviving node labels are 1, 2, 3, 2, 3, 3]

Note: Arrows denote synchronization control, numbers denote strata.


NTP's Modes

● Multicast mode - used on high-speed LANs; one or more servers periodically multicast the time to the other computers on the LAN, which set their clocks assuming a small network delay (achieving relatively low accuracies)

● Procedure-call mode - one computer accepts requests and replies with a timestamp, which is then used to update client clocks (higher accuracies achievable)

● Symmetric mode - intended to be used at the lowest strata (such as stratum 1), where the highest accuracies are to be achieved; pairs of servers exchange messages bearing timing information, and this information is retained over time, allowing the two servers to synchronize their clocks very closely
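The timing information exchanged in procedure-call and symmetric modes is typically four timestamps per message pair. The sketch below uses the standard NTP offset and delay formulas, which come from the NTP specification rather than from these slides; the function name and the sample values are illustrative.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one message exchange.

    t1: request sent     (sender's clock)
    t2: request received (peer's clock)
    t3: reply sent       (peer's clock)
    t4: reply received   (sender's clock)
    """
    # Time spent on the network: total elapsed time minus the peer's processing time.
    delay = (t4 - t1) - (t3 - t2)
    # Estimated amount by which the peer's clock is ahead of the sender's clock.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Made-up exchange: peer clock 5 s ahead, 1 s each way on the wire, 2 s of processing.
print(ntp_offset_and_delay(t1=100.0, t2=106.0, t3=108.0, t4=104.0))
```

The offset estimate is exact only when the two network legs take equal time; with asymmetric paths the error is bounded by half the measured delay.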


Logical Clocks

● Synchronization is based on “relative time”.

● Note that (with this mechanism) there is no requirement for “relative time” to have any relation to the “real time”.

● What’s important is that the processes in the Distributed System agree on the ordering in which certain events occur.

● Such “clocks” are referred to as Logical Clocks.


Lamport’s Logical Clocks

● First point: if two processes do not interact, then their clocks do not need to be synchronized – they can operate concurrently without fear of interfering with each other

● Second (critical) point: it does not matter whether two processes share a common notion of what the “real” current time is. What does matter is that the processes have some agreement on the order in which certain events occur

● Lamport used these two observations to define the “happens-before” relation (also often referred to within the context of Lamport’s Timestamps)


The Happens-Before Relation, 1 of 4

● If A and B are events in the same process, and A occurs before B, then we can state that: A “happens-before” B is true

● Equally, if A is the event of a message being sent by one process, and B is the event of the same message being received by another process, then A “happens-before” B is also true

● Note that a message cannot be received before it is sent, since it takes a finite, nonzero amount of time to arrive … and, of course, time is not allowed to run backwards


The Happens-Before Relation, 2 of 4

● Obviously, if A “happens-before” B and B “happens-before” C, then it follows that A “happens-before” C

● If the “happens-before” relation holds, deductions about the current clock “value” on each DS component can then be made

● It therefore follows that if C(A) is the clock value at event A and A “happens-before” B, then C(A) is less than C(B), and so on


The Happens-Before Relation, 3 of 4

● Now, assume three processes are in a DS: A, B and C

● All have their own physical clocks (which are running at differing rates due to “clock skew”, etc.)

● A sends a message to B and includes a “timestamp”

● If this sending timestamp is less than the time of arrival at B, things are OK, as the “happens-before” relation still holds (i.e. A “happens-before” B is true)

● However, if the timestamp is more than the time of arrival at B, things are NOT OK (as A “happens-before” B is not true, and this cannot be as the receipt of a message has to occur after it was sent)


The Happens-Before Relation, 4 of 4

● The question to ask is: How can some event that “happens-before” some other event possibly have occurred at a later time?

● The answer is: it can’t!

● So, Lamport’s solution is to have the receiving process adjust its clock forward to one more than the sending timestamp value whenever that timestamp is not already less than its own clock. This allows the “happens-before” relation to hold, and also keeps all the clocks running in a synchronized state. The clocks are all kept in sync relative to each other
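Lamport's adjustment rule can be sketched in a few lines. This is an illustrative version, not code from the slides: the class and method names are made up, and the clock is a plain counter that ticks once per local event, send or receive.

```python
class LamportClock:
    """Logical clock: a counter that only ever moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return the new value."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value travels with the message as its timestamp."""
        return self.tick()

    def receive(self, sent_timestamp):
        """Move past both the local clock and the sender's timestamp."""
        self.time = max(self.time, sent_timestamp) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()
a.tick()
stamp = a.send()            # a's clock is now ahead of b's
arrival = b.receive(stamp)  # b jumps forward to one more than the timestamp
print(stamp, arrival)
```

Taking the maximum before adding one covers both cases: a receiver that is behind jumps to one more than the timestamp, while a receiver that is already ahead simply ticks.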


Lamport's Clocks in Action


Problem: Totally-Ordered Multicasting

● Updating a replicated database and leaving it in an inconsistent state: Update 1 adds 100 euro to an account, Update 2 calculates and adds 1% interest to the same account. Due to network delays, the updates may not be applied in the same order at every replica. Whoops!
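To see why the order matters, the two updates can be applied in both orders. The opening balance of 1000 euro is an assumption made for this example (the slide does not give one), and amounts are held in cents to avoid floating-point rounding.

```python
def add_deposit(cents):
    """Update 1: add 100 euro."""
    return cents + 100 * 100


def add_interest(cents):
    """Update 2: add 1% interest."""
    return cents * 101 // 100


opening = 1000 * 100  # assumed opening balance: 1000 euro, in cents

replica_a = add_interest(add_deposit(opening))  # Update 1, then Update 2
replica_b = add_deposit(add_interest(opening))  # Update 2, then Update 1

print(replica_a / 100, replica_b / 100)
```

With this opening balance one replica ends at 1111 euro and the other at 1110 euro, so the two copies of the database disagree.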


Solution: Totally-Ordered Multicasting

● A multicast message is sent to all processes in the group, including the sender, together with the sender’s timestamp

● At each process, the received message is added to a local queue, ordered by timestamp

● Upon receipt of a message, a multicast acknowledgment/timestamp is sent to the group

● Due to the “happens-before” relationship holding, the timestamp of the acknowledgment is always greater than that of the original message


More on Totally Ordered Multicasting

● Only when a message is at the head of the queue and has been acknowledged by all the other processes will it be removed from the queue and delivered to a waiting application

● Lamport’s clocks (with ties broken by process identifier) ensure that each message has a unique timestamp, and consequently, the local queue at each process eventually contains the same contents in the same order

● In this way, all messages are delivered/processed in the same order everywhere, and updates can occur in a consistent manner
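The queueing and acknowledgment rules can be illustrated with a small, hand-scripted simulation. It is a simplified sketch under several assumptions: there is no real network, channels are taken to be reliable and FIFO, every group member (including the sender) acknowledges each message, and ties between equal timestamps are broken by sender name. The `Replica` class and the event tuples are hypothetical.

```python
import heapq


class Replica:
    """One group member; queues multicasts ordered by (timestamp, sender)."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.queue = []      # min-heap of (timestamp, sender, payload)
        self.acks = {}       # (timestamp, sender) -> set of members that acknowledged
        self.delivered = []  # payloads handed to the application, in order

    def handle(self, event):
        kind, timestamp, sender, detail = event
        if kind == "msg":    # detail is the payload
            heapq.heappush(self.queue, (timestamp, sender, detail))
        else:                # kind == "ack"; detail is the acknowledging member
            self.acks.setdefault((timestamp, sender), set()).add(detail)
        self._deliver_ready()

    def _deliver_ready(self):
        # Deliver only from the head of the queue, once every member has acknowledged it.
        while self.queue:
            timestamp, sender, payload = self.queue[0]
            if len(self.acks.get((timestamp, sender), ())) < self.group_size:
                break
            heapq.heappop(self.queue)
            self.delivered.append(payload)


u1 = (1, "P", "Update 1: add 100 euro")
u2 = (1, "Q", "Update 2: add 1% interest")

# Hand-written arrival orders; each respects FIFO delivery from each sender.
arrivals_at_p = [("msg", *u1), ("ack", 1, "P", "P"), ("msg", *u2),
                 ("ack", 1, "Q", "Q"), ("ack", 1, "Q", "P"), ("ack", 1, "P", "Q")]
arrivals_at_q = [("msg", *u2), ("ack", 1, "Q", "Q"), ("msg", *u1),
                 ("ack", 1, "P", "P"), ("ack", 1, "P", "Q"), ("ack", 1, "Q", "P")]

p, q = Replica(group_size=2), Replica(group_size=2)
for event in arrivals_at_p:
    p.handle(event)
for event in arrivals_at_q:
    q.handle(event)

print(p.delivered == q.delivered)
```

Although Q sees its own update first, it cannot deliver that update until P's acknowledgment arrives, and by then the earlier-ordered update from P is already at the head of Q's queue.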


Totally-Ordered Multicasting, Revisited

● Update 1 is time-stamped and multicast. Added to local queues

● Update 2 is time-stamped and multicast. Added to local queues

● Acknowledgments for Update 2 sent/received. Update 2 can now be processed

● Acknowledgments for Update 1 sent/received. Update 1 can now be processed

● Note: all queues are the same, as the timestamps have been used to ensure the “happens-before” relation holds.


In Summary

● Handling time within a DS is tricky!

● So, we rarely try to deal with “real time”

● Relative time (using Lamport's logical clocks) is the preferred method when ensuring the correct ordering of events within a DS