Distributed Systems CS 15-440 Google Chubby and Message Ordering Recitation 4, Sep 29, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud

Dec 20, 2015

Distributed Systems CS 15-440

Google Chubby and Message Ordering

Recitation 4, Sep 29, 2011

Majd F. Sakr, Vinay Kolar, Mohammad Hammoud

Today…

Last recitation session: Google Protocol Buffers and Publish-Subscribe

Today’s session: Google Chubby

A Google library and infrastructure for synchronization

Ordered Communication: ordering events and enforcing ordering while communicating

Announcement: Project 1 due on Oct 3rd

Overview

Recap

Google Chubby

Ordered Communication

Recap: Google Physical Infrastructure

Google has created a large distributed system from commodity PCs

Commodity PC

Rack: approx. 40 to 80 PCs; one Ethernet switch (internal = 100 Mbps, external = 1 Gbps)

Cluster: approx. 30 racks (around 2400 PCs); 2 high-bandwidth switches (each rack is connected to both switches for redundancy); placement and replication are generally done at the cluster level

Data Center

Recap: Google Data center Architecture

(To avoid clutter the Ethernet connections are shown from only one of the clusters to the external links)

Recap: Google System Architecture

Recap: Google Infrastructure

Overview

Recap

Google Chubby

Ordered Communication

Google Chubby

Google Chubby offers coordination and storage services to other services (e.g., to the Google File System)

It provides coarse-grained distributed locks to synchronize distributed activities in a large-scale, asynchronous environment

It can be used to support the election of a primary among a set of replicas

It can be used as a name service within Google

It provides a file system offering reliable storage of small files

Chubby is an all-in-one package consisting of a file system, a locking service, a naming service and an election facilitator!

Chubby Interface

Chubby provides a file-system-like abstraction in which every data object is a file

Files are organized into a hierarchical namespace. Example:

/ls/chubby_cell/directory_name/…/file_name

ls: stands for Lock Service

chubby_cell: an identifier that names the instance of Chubby
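
As a rough illustration of this naming scheme, the Python sketch below splits a Chubby-style name into its prefix, cell, and remaining path components. The function parse_chubby_path, the ChubbyName tuple, and the sample path are hypothetical and exist only for this example; they are not part of any Chubby client library.

```python
from typing import NamedTuple, Tuple


class ChubbyName(NamedTuple):
    prefix: str                   # always "ls" (lock service)
    cell: str                     # name of the Chubby instance (cell)
    components: Tuple[str, ...]   # directories followed by the file name


def parse_chubby_path(path: str) -> ChubbyName:
    """Split '/ls/<cell>/<dir>/.../<file>' into its parts."""
    parts = path.strip("/").split("/")
    # Require the "ls" prefix, a cell name, and no empty components.
    if len(parts) < 2 or parts[0] != "ls" or not all(parts):
        raise ValueError(f"not a Chubby-style name: {path!r}")
    return ChubbyName(parts[0], parts[1], tuple(parts[2:]))


# Made-up example path (cell and directory names are assumptions).
name = parse_chubby_path("/ls/cell_a/services/gfs/master")
print(name.cell)
print(name.components)
```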

Chubby as a file-system and a locking service

The interface provides an easy mechanism to store small files

Chubby provides the following interfaces:

General Interfaces

File-System Interfaces

Locking Service Interfaces

Chubby – General Interfaces

Chubby provides interfaces for opening, closing and deleting a file in its namespace

Open call: Opens a file or directory and returns a handle

Client can specify if the file has to be opened for reading, writing or locking

Close call: Relinquishes the handle

Delete call: Removes the file or directory

Chubby – File-System Interfaces

Chubby provides two services:

Whole-file reading and writing operations

Single atomic operations are provided to read and write the complete data in a file

Chubby can be used to store small files (but not large files)

Access control

A file is associated with an Access Control List (ACL)

The ACL can be read and set through the interface

Chubby – Locking Service Interfaces

In Chubby, a file can be opened as a lock

The owner of the lock has the handle to the file

Chubby provides three interfaces:

Acquire: This call acquires the lock

Release: This call releases the lock

TryAcquire: This is a non-blocking variant of the Acquire call

Chubby provides advisory locks, not mandatory locks (holding a lock does not prevent other clients from accessing the file)

Advantage: Extra flexibility and resilience

Disadvantage: The programmer has to manage conflicts
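
The difference between advisory and mandatory locking can be sketched with a toy in-memory model. ToyLockFile below is a hypothetical stand-in written for this example, not the Chubby API: it borrows the Acquire, TryAcquire, and Release names from the slide and uses Python's threading.Lock underneath. Note that write() never checks the lock, which is what makes the lock advisory.

```python
import threading


class ToyLockFile:
    """Hypothetical in-memory stand-in for a Chubby file opened as a lock."""

    def __init__(self, content: bytes = b"") -> None:
        self._lock = threading.Lock()
        self.content = content

    def acquire(self) -> None:
        # Blocks until the lock is free.
        self._lock.acquire()

    def try_acquire(self) -> bool:
        # Non-blocking variant: returns False instead of waiting.
        return self._lock.acquire(blocking=False)

    def release(self) -> None:
        self._lock.release()

    def write(self, data: bytes) -> None:
        # Advisory locking: the write is NOT checked against the lock.
        # Cooperating clients must call acquire() themselves first.
        self.content = data


f = ToyLockFile(b"v1")
f.acquire()                      # first client holds the lock
second_got_it = f.try_acquire()  # second client fails without blocking
f.write(b"v2")                   # a non-cooperating client can still write
f.release()
```

Because nothing stops the stray write, conflict management is left to the programmer, which is the disadvantage listed above.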

Summary of Chubby Interfaces

Chubby Architecture

A Chubby instance (or Chubby cell) is the first level of the hierarchy below the ls prefix

/ls/chubby_cell/directory_name/…/file_name

A Chubby instance is implemented as a small number of replicated servers (typically 5), one of which is the designated master

Clients access these replicas using the Chubby library, which uses Protocol Buffers to communicate

Replicas are placed at failure-independent sites

Typically, they are placed within the same cluster but not within the same rack

Chubby Namespace Architecture

The hierarchical namespace of directories and files/locks is maintained in a database at each replica

The consistency of the replicated database is ensured through a consensus protocol that uses operation logs

Logs can be used to reconstruct the state of the system

Problem: Logs can become too large over time

Solution: Chubby takes a snapshot of the system periodically, and erases the old logs
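
The log-plus-snapshot idea can be sketched as follows. ToyReplica is a hypothetical, single-node illustration: it ignores the consensus protocol entirely and models an operation as a (path, content) pair, both of which are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str]  # (path, new_content); a made-up operation format


class ToyReplica:
    """Hypothetical replica that keeps a snapshot plus a log of later ops."""

    def __init__(self) -> None:
        self.snapshot: Dict[str, str] = {}
        self.log: List[Op] = []

    def apply(self, op: Op) -> None:
        # Every mutation is appended to the operation log.
        self.log.append(op)

    def state(self) -> Dict[str, str]:
        # Reconstruct the state: start from the snapshot, replay the log in order.
        db = dict(self.snapshot)
        for path, content in self.log:
            db[path] = content
        return db

    def take_snapshot(self) -> None:
        # Fold the log into a new snapshot, then erase the old log entries.
        self.snapshot = self.state()
        self.log.clear()


replica = ToyReplica()
replica.apply(("/ls/cell_a/config", "v1"))
replica.apply(("/ls/cell_a/config", "v2"))
before = replica.state()
replica.take_snapshot()
after = replica.state()   # same state, but the log is now empty
```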

Chubby Session

A Chubby session is the relationship between a client and a Chubby cell

KeepAlive messages maintain the session
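
One way to picture how KeepAlive messages maintain a session is a lease that each KeepAlive extends. The sketch below is an assumption-laden toy: ToySession, the 10-second lease, and the explicit now parameter (used instead of a real clock so the example is deterministic) are all hypothetical and are not taken from the slides.

```python
class ToySession:
    """Hypothetical session whose lease is extended by each KeepAlive."""

    def __init__(self, now: float, lease_seconds: float = 10.0) -> None:
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds

    def is_alive(self, now: float) -> bool:
        return now < self.expires_at

    def keep_alive(self, now: float) -> bool:
        # A KeepAlive that arrives after expiry cannot revive the session.
        if not self.is_alive(now):
            return False
        self.expires_at = now + self.lease_seconds
        return True


session = ToySession(now=0.0, lease_seconds=10.0)
extended = session.keep_alive(now=8.0)    # in time: lease now runs to t=18
still_alive = session.is_alive(now=15.0)
too_late = session.keep_alive(now=20.0)   # lease ran out at t=18
```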

Client Caching and Consistency

Clients cache file data, metadata, and open handles

Cache consistency:

Whenever a mutation is to occur, the associated operation is blocked until all caches are invalidated

Invalidation messages are piggybacked on KeepAlive messages

Disadvantages:

Cached copies are invalidated rather than updated, so clients must re-fetch the data after a change

An operation cannot progress until all cached copies are invalidated

Advantages:

Simple and elegant for small files and locks
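
The invalidate-before-mutate rule can be sketched with a toy master and client caches. ToyMaster and ToyClientCache are hypothetical names for this example; invalidation is a direct method call here, whereas the slides say it is piggybacked on KeepAlive messages and that the mutation blocks until every cache has been invalidated.

```python
from typing import Dict, List


class ToyClientCache:
    """Hypothetical client-side cache of file contents."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def invalidate(self, path: str) -> None:
        # The cached copy is dropped, not updated.
        self.data.pop(path, None)


class ToyMaster:
    """Hypothetical master that invalidates caches before applying a write."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}
        self.caches: List[ToyClientCache] = []

    def read(self, cache: ToyClientCache, path: str) -> str:
        if path not in cache.data:          # cache miss: fetch from master
            cache.data[path] = self.files[path]
        return cache.data[path]

    def write(self, path: str, content: str) -> None:
        # Step 1: invalidate every cached copy (synchronous in this toy).
        for cache in self.caches:
            cache.invalidate(path)
        # Step 2: only then apply the mutation.
        self.files[path] = content


master = ToyMaster()
a, b = ToyClientCache(), ToyClientCache()
master.caches = [a, b]
master.files["/ls/cell_a/f"] = "old"
master.read(a, "/ls/cell_a/f")
master.read(b, "/ls/cell_a/f")
master.write("/ls/cell_a/f", "new")
fresh = master.read(a, "/ls/cell_a/f")   # re-fetched after invalidation
```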

Chubby Architecture Diagram

Overview

Recap

Google Chubby

Ordered Communication

Ordered Communication

In several applications, ordering of events is vital

For example, consider a flight-booking system

[Figure: timeline of messages between a client and a server: Reserve, Cancel, and "Prices 15% Off"]

The server processes the cancellation before the reservation, even though the messages are reliably delivered!

We will study how to ensure ordered delivery of events in group communication

Ordered Multicast – An Example

An example where total ordering is necessary:

In an eCommerce application, the bank database has been replicated across many servers

Let us consider a 2-replica scenario

Both replicas of the database start with Bal=1000

Event 1 = Add $1000; Event 2 = Add interest of 5%

Replica 1 applies Event 1, then Event 2: Bal=1000, then Bal=2000, then Bal=2100

Replica 2 applies Event 2, then Event 1: Bal=1000, then Bal=1050, then Bal=2050

The updates from Event 1 and Event 2 should be performed in the same order on every replicated server. Otherwise the replicas become inconsistent.
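
The arithmetic behind this example is easy to check. In the short sketch below, the function names are made up for illustration and balances are kept as whole dollars.

```python
def add_1000(balance: int) -> int:
    """Event 1: add $1000."""
    return balance + 1000


def add_interest(balance: int) -> int:
    """Event 2: add interest of 5% (integer dollars)."""
    return balance * 105 // 100


start = 1000
replica_1 = add_interest(add_1000(start))   # Event 1, then Event 2
replica_2 = add_1000(add_interest(start))   # Event 2, then Event 1
print(replica_1, replica_2)                 # 2100 2050
```

The two operations do not commute, so the replicas agree only if both apply the events in the same order.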

Three Types of Ordering

FIFO Order

Causal Order

Total Order

FIFO Ordering

FIFO Order: If a process multicasts a message m before m', then no correct process delivers m' unless it has already delivered m

In the example, F1 and F2 are in FIFO Order

Drawback: FIFO Order does not specify any order for messages generated by different processes

e.g., F1 and F3 can be delivered in any order

[Figure: timeline of processes P1, P2 and P3 showing the multicast messages F1, F2, F3, C1, C2, C3, T1 and T2]
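
A common way to enforce FIFO order is for each sender to number its messages and for each receiver to hold back any message that arrives ahead of its turn. The sketch below is a minimal, hypothetical version of that idea (the FifoReceiver class is not from the slides). Since the figure itself is not available here, the usage assumes that F1 and F2 come from P1 and F3 from P3.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FifoReceiver:
    """Hypothetical receiver that delivers each sender's messages in send order."""

    def __init__(self) -> None:
        self.next_seq: Dict[str, int] = defaultdict(int)  # next expected number per sender
        self.held: Dict[Tuple[str, int], str] = {}        # hold-back queue
        self.delivered: List[str] = []

    def receive(self, sender: str, seq: int, payload: str) -> None:
        self.held[(sender, seq)] = payload
        # Deliver as many consecutive messages from this sender as possible.
        while (sender, self.next_seq[sender]) in self.held:
            key = (sender, self.next_seq[sender])
            self.delivered.append(self.held.pop(key))
            self.next_seq[sender] += 1


r = FifoReceiver()
r.receive("P1", 1, "F2")   # arrives early: held back until F1 shows up
r.receive("P3", 0, "F3")   # different sender: delivered immediately
r.receive("P1", 0, "F1")   # releases F1, then the held F2
print(r.delivered)         # ['F3', 'F1', 'F2']
```

F3 is delivered before F1 here, which FIFO order allows because the two messages come from different senders.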

Causal Ordering

Causal Order: If process Pi multicasts a message mi and Pj multicasts mj, and if mi → mj (the operator '→' is Lamport's happened-before relation), then any correct process that delivers mj will deliver mi before mj

Relationship between FIFO and Causal Order:

Causal Order implies FIFO Order, but FIFO Order does not imply Causal Order

In the example, C1 and C3 are in Causal Order

Drawback:

The happened-before relation between mi and mj has to be established before the messages can be ordered, which requires extra bookkeeping during communication

[Figure: timeline of processes P1, P2 and P3 showing the multicast messages F1, F2, F3, C1, C2, C3, T1 and T2]

Total Ordering

Total Order:

If process Pi multicasts a message mi and Pj multicasts mj, and if one correct process delivers mi before mj then every correct process delivers mi before mj

In the example, T1 and T2 are in Total Order

Drawback: Total Order does not imply FIFO or Causal Order

[Figure: timeline of processes P1, P2 and P3 showing the multicast messages F1, F2, F3, C1, C2, C3, T1 and T2]

Totally Ordered Multicast

Totally Ordered Multicast is a multicast communication paradigm that ensures that all messages are delivered in the same order at all the receivers

Approach:

Process Pi sends a timestamped multicast message msgi to all the receivers in the group

At the sender, the message is buffered in a local queue, queuei

Any incoming message at Pj is queued in queuej, according to its timestamp, and acknowledged to every other process.

[Figure: timelines of Process 1, Process 2 and Process 3 with their logical clock values]

Totally Ordered Multicast (cont’d)

A receiver will deliver the message to the application if:

The message is at the head of the queue, and

The message has been acknowledged by every other process

Assumptions in Totally Ordered Multicast:

Communication is reliable

There is no out-of-order delivery of messages that are transmitted from the same sender
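
The approach can be simulated in a few dozen lines. The sketch below is a simplified, hypothetical model rather than a production protocol: Process and Network are names invented for this example, timestamps are Lamport clocks with ties broken by process id, every process (including the sender) receives each message over the simulated network and acknowledges it to everyone, and the network is reliable and FIFO per sender-receiver pair but otherwise interleaves packets at random.

```python
import heapq
import random
from collections import deque


class Process:
    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.clock = 0        # Lamport clock
        self.queue = []       # heap of (timestamp, sender, payload)
        self.acks = {}        # (timestamp, sender) -> set of acknowledging pids
        self.delivered = []

    def multicast(self, payload, net):
        self.clock += 1
        net.broadcast(self.pid, ("msg", self.clock, self.pid, payload))

    def on_receive(self, packet, net):
        kind, ts, sender, body = packet
        self.clock = max(self.clock, ts) + 1      # Lamport receive rule
        if kind == "msg":
            heapq.heappush(self.queue, (ts, sender, body))
            self.clock += 1
            # Acknowledge to every process; body names the acknowledged message.
            net.broadcast(self.pid, ("ack", self.clock, self.pid, (ts, sender)))
        else:
            self.acks.setdefault(body, set()).add(sender)
        self._try_deliver()

    def _try_deliver(self):
        # Deliver while the head of the queue is acknowledged by all n processes.
        while self.queue:
            ts, sender, payload = self.queue[0]
            if len(self.acks.get((ts, sender), ())) < self.n:
                break
            heapq.heappop(self.queue)
            self.delivered.append(payload)


class Network:
    """Reliable channels, FIFO per (sender, receiver) pair, random interleaving."""

    def __init__(self, n, seed=0):
        self.n = n
        self.rng = random.Random(seed)
        self.channels = {(s, r): deque() for s in range(n) for r in range(n)}

    def broadcast(self, sender, packet):
        for r in range(self.n):
            self.channels[(sender, r)].append(packet)

    def run(self, procs):
        while True:
            ready = [key for key, ch in self.channels.items() if ch]
            if not ready:
                return
            s, r = self.rng.choice(ready)
            procs[r].on_receive(self.channels[(s, r)].popleft(), self)


net = Network(3)
procs = [Process(i, 3) for i in range(3)]
procs[0].multicast("T1", net)   # both multicasts carry timestamp 1;
procs[2].multicast("T2", net)   # the tie is broken by process id
net.run(procs)
print([p.delivered for p in procs])
```

Whatever interleaving the network picks, every process ends up with the same delivery order, which is the property the two delivery conditions are meant to guarantee under the stated assumptions.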

Application of Vector Clocks: Causally Ordered Multicast

In Causally Ordered Communication, a message m is delivered to an application only if all messages that causally precede m have been received

Vector Clocks allow implementation of Causally Ordered Multicast

Here, a multicast message is delivered to an application in the causal order

Causally Ordered Multicast is weaker than Totally Ordered Multicast: it constrains only the delivery order of causally related messages

If two messages are not causally related to each other, it does not matter in which order they are delivered to the application

Causally Ordered Multicast – An Example

Causally Ordered Multicast – Approach

Clocks are adjusted only when sending and receiving messages

When Pi sends a message m:

VCi[i] = VCi[i] + 1

ts(m) = VCi

When Pj delivers a message m with timestamp ts(m):

VCj[k] = max(VCj[k], ts(m)[k]) ; (for all k)

When Pj receives a message m (with timestamp ts(m)) from Pi, it will deliver the message to the application only if both conditions hold:

ts(m)[i] = VCj[i] + 1, i.e., m is the next message that Pj was expecting from Pi

ts(m)[k] <= VCj[k] (for all k != i), i.e., Pj has seen all the messages that Pi had seen when it sent the message m

Otherwise, Pj holds m back until both conditions hold
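
The send, deliver, and hold-back rules can be sketched directly in Python. CausalProcess is a hypothetical class written for this example; messages are plain tuples handed over by the caller instead of a real network, and a sender does not receive its own multicast (its send is already counted in its own vector entry).

```python
class CausalProcess:
    """Hypothetical process implementing causally ordered delivery."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.vc = [0] * n      # vector clock VC
        self.pending = []      # received but not yet deliverable messages
        self.delivered = []

    def send(self, payload):
        self.vc[self.pid] += 1                     # VCi[i] = VCi[i] + 1
        return (self.pid, list(self.vc), payload)  # ts(m) = VCi

    def _can_deliver(self, sender, ts) -> bool:
        next_from_sender = ts[sender] == self.vc[sender] + 1
        seen_the_rest = all(
            ts[k] <= self.vc[k] for k in range(len(ts)) if k != sender
        )
        return next_from_sender and seen_the_rest

    def receive(self, message) -> None:
        self.pending.append(message)
        progress = True
        while progress:        # a delivery may unblock held-back messages
            progress = False
            for msg in list(self.pending):
                sender, ts, payload = msg
                if self._can_deliver(sender, ts):
                    self.pending.remove(msg)
                    # VCj[k] = max(VCj[k], ts(m)[k]) for all k
                    self.vc = [max(a, b) for a, b in zip(self.vc, ts)]
                    self.delivered.append(payload)
                    progress = True


p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
m1 = p0.send("m1")     # timestamp [1, 0, 0]
p1.receive(m1)
m2 = p1.send("m2")     # timestamp [1, 1, 0]; causally after m1
p2.receive(m2)         # held back: p2 has not yet seen m1
p2.receive(m1)         # m1 is delivered, which then releases m2
print(p2.delivered)    # ['m1', 'm2']
```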

References

http://perspectives.mvdirona.com/2008/06/11/JeffDeanOnGoogleInfrastructure.aspx

http://mobilelocalsocial.com/2010/google-data-center-fire-returns-worldwide-404-errors/

http://techcrunch.com/2008/04/11/where-are-all-the-google-data-centers/

http://cdk5.net

http://www.dis.uniroma1.it/~baldoni/ordered%2520communication%25202008.ppt

http://www.cs.uiuc.edu/class/fa09/cs425/L5tmp.ppt