A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm Using Cohorts Coteries for Distributed Systems Presented by Jehn-Ruey Jiang National Central University Taiwan, R. O. C.

Dec 21, 2015


A Fault-Tolerant h-out of-k Mutual Exclusion Algorithm

Using Cohorts Coteries for Distributed Systems

Presented by

Jehn-Ruey Jiang
National Central University

Taiwan, R. O. C.


Distributed Systems

A distributed system consists of interconnected, autonomous nodes which communicate with each other by passing messages.

[Figure: Interconnected Network]


Mutual Exclusion

A node in the system may occasionally need to enter the critical section (CS) to access a shared resource, such as a shared file or a shared table.

The mutual exclusion problem is how to control the nodes so that the shared resource is accessed by at most one node at a time.


Mutual Exclusion Example

[Figure (two slides): one node is marked "in CS"]


k-Mutual Exclusion

If there are k, k ≥ 1, identical copies of shared resources, such as a k-user software license, then there can be at most k nodes accessing the resources at a time.

This raises the k-mutual exclusion problem.


k-Mutual Exclusion Example

[Figure (two slides), k=2: two nodes are marked "in CS"]


h-out of-k Mutual Exclusion

On some occasions, a node may need to access h (1 ≤ h ≤ k) copies out of the k shared resources at a time; for example, a node may need h disks from a pool of k disks to proceed.

The h-out of-k mutual exclusion problem, also called the h-out of-k resource allocation problem, is how to control the nodes so that each acquires the number of resources it wants while the total number of resources accessed concurrently does not exceed k.
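The safety constraint in this definition can be written down as a one-line check. The sketch below is only an illustration of the constraint itself, assuming a single place that knows who is in CS; it is not the distributed algorithm of these slides, and the name `can_admit` is hypothetical.

```python
def can_admit(in_cs, h, k):
    """Return True if a request for h resources keeps total usage within k.

    in_cs maps each node currently in CS to the number of resources it holds.
    """
    if not 1 <= h <= k:
        raise ValueError("h must satisfy 1 <= h <= k")
    return sum(in_cs.values()) + h <= k


# k = 3 resources; node "a" is already in CS holding h = 2 of them.
holders = {"a": 2}
print(can_admit(holders, 1, 3))  # a request for 1 more resource fits
print(can_admit(holders, 2, 3))  # a request for 2 more would exceed k
```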


h-out of-k Mutual Exclusion Example

[Figure (two slides), k=3: first, one node in CS with h=2 and another in CS with h=1; then two nodes in CS with h=1 each]


Related Work

Raynal (1991): uses request broadcast

Baldoni et al. (1998): uses k-arbiters

Manabe et al. (2004): uses (h, k)-arbiters

Jiang (2004): uses k-coteries


Jiang’s Algorithm

Among the four algorithms, only Jiang’s algorithm using k-coteries is fault-tolerant.

It can tolerate node and/or network link failures even when the failures lead to network partitioning.

It also has a lower message cost than the other three.


k-Coterie

A collection of sets (called quorums) satisfying the following properties:

1. Intersection Property: There are at most k pairwise disjoint quorums.

2. Non-intersection Property: For any h (< k) pairwise disjoint quorums Q1,...,Qh, there exists a quorum Qh+1 such that Q1,...,Qh+1 are pairwise disjoint.

3. Minimality Property: Any quorum is not a superset of another quorum.
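The three properties can be checked by brute force on a small collection of quorums. The following is a minimal sketch, assuming quorums are given as non-empty Python sets; the function names are hypothetical, and the exhaustive search is only practical for tiny examples.

```python
from itertools import combinations


def pairwise_disjoint(sets):
    """True if no two of the given sets share a member."""
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))


def is_k_coterie(quorums, k):
    """Brute-force check of the three k-coterie properties."""
    quorums = [frozenset(q) for q in quorums]
    # Intersection Property: no k+1 quorums are pairwise disjoint.
    if any(pairwise_disjoint(group) for group in combinations(quorums, k + 1)):
        return False
    # Non-intersection Property: any h < k pairwise disjoint quorums
    # can be extended by one more quorum disjoint from all of them.
    for h in range(1, k):
        for group in combinations(quorums, h):
            if not pairwise_disjoint(group):
                continue
            if not any(all(q.isdisjoint(g) for g in group) for q in quorums):
                return False
    # Minimality Property: no quorum is a proper subset of another.
    return not any(a < b for a in quorums for b in quorums)


# All 2-element subsets of {1, 2, 3, 4}: at most two of them are disjoint.
pairs = [set(c) for c in combinations(range(1, 5), 2)]
print(is_k_coterie(pairs, 2))
print(is_k_coterie(pairs, 1))
```

In this example the collection passes the check for k = 2 but not for k = 1, because {1, 2} and {3, 4} are disjoint.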


Basic Idea of Jiang’s Alg.

A node should select h mutually disjoint sets and collect permissions from all the nodes of the h sets to enter CS for accessing h resources.

To make the algorithm fault-tolerant, a node that fails to gather enough permissions to enter CS within a time-out period must reselect h mutually disjoint sets and gather the additional permissions it still needs, repeating this as necessary.


Drawbacks of Jiang’s Alg.

First, it does not specify explicitly how a node can efficiently select and reselect h mutually disjoint sets.

Second, when there is contention, a low-priority node always yields its gathered permissions to high-priority nodes, which causes higher message overhead and may prohibit nodes from entering CS concurrently.


Overview of the Proposed Alg.

Using a specific k-coterie: the cohorts coterie

Having constant message cost in the best case

A candidate to achieve the highest availability among all the algorithms using k-coteries

Achieving k-concurrency by pre-release action and conditional inquiring


Cohorts Structure Coh(k, m)

A cohorts structure Coh(k, m) = (C1,...,Cm), m ≥ k, is a list of sets, where each set Ci is called a cohort. The cohorts structure Coh(k, m) should observe the following three properties:

P1. |C1| = k.

P2. ∀i: 1 < i ≤ m: |Ci| > 2k − 2, for k > 1 (|Ci| > 1, for k = 1).

P3. ∀i, j: 1 ≤ i, j ≤ m, i ≠ j: Ci ∩ Cj = ∅.
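As a rough illustration, properties P1 to P3 can be checked mechanically. The sketch below assumes the cohorts are given as a Python list of sets with C1 first, and reads the sizes in P1 and P2 as cardinalities. The function name and the example structure are hypothetical and not taken from the slides.

```python
def is_cohorts_structure(cohorts, k):
    """Check P1-P3 for a list of sets (C1, ..., Cm), with C1 at index 0."""
    if not cohorts:
        return False
    # P1: the first cohort has exactly k members.
    if len(cohorts[0]) != k:
        return False
    # P2: every later cohort has more than 2k - 2 members (more than 1 if k = 1).
    lower_bound = 2 * k - 2 if k > 1 else 1
    if any(len(c) <= lower_bound for c in cohorts[1:]):
        return False
    # P3: cohorts are pairwise disjoint, so sizes add up to the union's size.
    return len(set().union(*cohorts)) == sum(len(c) for c in cohorts)


# A hypothetical Coh(2, 3) over nodes 1..8.
example = [{1, 2}, {3, 4, 5}, {6, 7, 8}]
print(is_cohorts_structure(example, 2))
```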


Quorum under Coh(k, m)

A set Q is said to be a quorum under Coh(k, m) if some cohort Ci in Coh(k, m) is Q's primary cohort, and each cohort Cj, j > i, is Q's supporting cohort.

C is Q's primary cohort if |Q ∩ C| = |C| − (k − 1)

C is Q's supporting cohort if |Q ∩ C| = 1
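The definition can be turned into a small membership test. This is a sketch under two assumptions that the slide does not state explicitly: the primary condition is read as |Q ∩ C| = |C| − (k − 1), and a quorum is taken to contain no nodes outside its primary and supporting cohorts. The name `is_quorum` and the example structure are hypothetical.

```python
def is_quorum(q, cohorts, k):
    """True if q is a quorum under the cohorts structure (C1 at index 0)."""
    q = set(q)
    for i, primary in enumerate(cohorts):
        # Primary cohort: q contains all but k - 1 of its members.
        if len(q & primary) != len(primary) - (k - 1):
            continue
        rear = cohorts[i + 1:]
        # Every cohort after the primary one contributes exactly one member.
        if any(len(q & c) != 1 for c in rear):
            continue
        # Assumption: q has no members outside the primary and rear cohorts.
        if q <= primary.union(*rear):
            return True
    return False


# A hypothetical Coh(2, 3) over nodes 1..8.
cohorts = [{1, 2}, {3, 4, 5}, {6, 7, 8}]
print(is_quorum({6, 7}, cohorts, 2))     # primary cohort C3, no rear cohorts
print(is_quorum({3, 4, 8}, cohorts, 2))  # primary cohort C2, supporting C3
print(is_quorum({3, 6}, cohorts, 2))     # no cohort qualifies as primary
```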


Construction of Quorums under Coh(2, 4)

one primary cohort with supporting cohorts at rear

E.g.: {5, 9, 10, 12} and {1,2, 4, 8, 14}

[Figure (three frames): the nodes of Coh(2, 4) drawn as cohorts, with the example quorums highlighted]

There will be no disjoint quorum to be formed.


Six Types of Messages

• REQUEST
• LOCKED
• RELEASE
• PRE-RELEASE
• INQUIRE
• RELINQUISH

Comparison: Maekawa’s algorithm uses the following six messages:

• REQUEST
• LOCKED
• FAILED
• RELEASE
• INQUIRE
• RELINQUISH


Requesting h Resources

When a node u wants to enter CS to access h resources, u should invoke Get_Quorum(h, k, (C1,...,Cm)) and wait for it to return.

h, k: integers
(C1,...,Cm): Cohorts Structure


Function Get_Quorum(h, k: Integer; (C1,...,Cm): Cohorts Structure): Set;
Var R, S: Set; g: Integer;
    g = h;    // g: Storing the number of primary cohorts needed
    R = ∅;    // R: The set of replying nodes that will be returned
    For (i = m,...,2) Do
        S = Probe(Ci, g);
        If |S| = |Ci| − (k − 1) + (g − 1)
            Then { R = R ∪ S; g = g − 1; If g = 0 Then Return R; }
            Else If |S| = g Then R = R ∪ S;
    EndFor
    S = Probe(C1, g);    // C1 is the primary cohort of g quorums
    R = R ∪ S;
    Return R;
End Get_Quorum


Probe(Ci, g)

The function Probe(Ci, g) invoked in Get_Quorum performs the task of requesting all the nodes in set Ci for their exclusive permissions.

A node can grant its permission to only one requesting node at a time.

After a network turn-around time, Probe(Ci, g) returns a set S of replying nodes of Ci in one of three cases:


Case 1 for Probe(Ci, g) to return

If i > 1 and there are at least |Ci| − (k − 1) + (g − 1) replying nodes, the returned set will be a set of |Ci| − (k − 1) + (g − 1) replying nodes.

In this case, Ci can be the primary cohort of one quorum and the supporting cohort of g − 1 quorums concurrently.


Case 2 for Probe(Ci, g) to return

If i > 1 and there are at least g but fewer than |Ci| − (k − 1) + (g − 1) replying nodes, the returned set will be a set of g replying nodes. (Note that |Ci| − (k − 1) + (g − 1) > g because |Ci| > 2k − 2 for k > 1, or |Ci| > 1 for k = 1.)

In this case, Ci can only be the supporting cohort of g quorums.


Probe(Ci, g) waits

Note that Probe(Ci, g) postpones its return if none of the three cases holds, which means no node in Ci can reply to grant its permission immediately.


Case 3 for Probe(Ci, g) to return

If i = 1 and there are at least g replying nodes, the returned set will be a set of g replying nodes.

Because |C1| = k, a single node is enough to make C1 the primary cohort of a quorum (|C1| − (k − 1) = 1). Thus, g replying nodes can make C1 the primary cohort of g quorums in this case.
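To see how Get_Quorum and the three Probe cases fit together, the following sketch simulates them in a single process. Several things here are assumptions made for illustration: the set `available` stands in for the nodes that would reply, the thresholds in the three cases are read as "at least", `probe` returns `None` where the real Probe would postpone its return, and the pre-release action and all message passing are left out.

```python
def probe(cohort, g, k, is_first, available):
    """Simulated Probe(Ci, g); `available` is the set of nodes that reply."""
    replies = sorted(cohort & available)
    primary_size = len(cohort) - (k - 1) + (g - 1)
    if not is_first and len(replies) >= primary_size:
        return set(replies[:primary_size])  # Case 1: primary + g-1 supporting
    if len(replies) >= g:
        return set(replies[:g])             # Case 2 (i > 1) or Case 3 (i = 1)
    return None                             # the real Probe would keep waiting


def get_quorum(h, k, cohorts, available):
    """Simulated Get_Quorum; cohorts[0] is C1 and cohorts[-1] is Cm."""
    g = h                                   # primary cohorts still needed
    result = set()
    for i in range(len(cohorts) - 1, 0, -1):  # Cm down to C2
        s = probe(cohorts[i], g, k, False, available)
        if s is None:
            return None
        if len(s) == len(cohorts[i]) - (k - 1) + (g - 1):
            result |= s
            g -= 1
            if g == 0:
                return result
        elif len(s) == g:
            result |= s
    s = probe(cohorts[0], g, k, True, available)
    if s is None:
        return None
    return result | s


# A hypothetical Coh(2, 3) over nodes 1..8, with every node replying.
cohorts = [{1, 2}, {3, 4, 5}, {6, 7, 8}]
everyone = set(range(1, 9))
print(get_quorum(1, 2, cohorts, everyone))
print(get_quorum(2, 2, cohorts, everyone))
print(get_quorum(1, 2, cohorts, everyone - {7, 8}))
```

With nodes 7 and 8 unavailable, the last cohort can only act as a supporting cohort, so the search moves on to the next cohort for a primary one.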


Pre-release

Probe(Ci, g) will execute the pre-release action before returning S; that is, it will send messages to the nodes in Ci − S to release their permissions in advance.

The pre-release action plays an important role in the proposed algorithm. As we will show later, the action can allow more nodes to be in CS concurrently and is related to the deadlock-free property.


On Receiving REQUEST

On receiving a REQUEST from node u, a node v checks whether it is currently locked for another REQUEST. If not, v marks itself locked, sets u as the locker, records the number h of resources that u requests, and sends a LOCKED message to u.


Two Local Priority Queues

R-QUEUE: On receiving a REQUEST from node u, if v is locked for a REQUEST from another node w (w is the locker), the REQUEST from node u is inserted into R-QUEUE.

P-QUEUE: On receiving a PRE-RELEASE message from node u, node v inserts the message into P-QUEUE.


Timestamp

The timestamp is a pair (u, t), where u is the node ID of the requester and t is the sequence number of the message.

A REQUEST of timestamp (u, t) is assumed to precede (to have higher priority than) another REQUEST of timestamp (u′, t′) if (t < t′) or (t = t′ ∧ u < u′).

A message sequence number is assigned by a node to be always one more than the largest message sequence number ever seen.
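The ordering rule amounts to comparing (sequence number, node ID) pairs lexicographically. A minimal sketch, assuming integer node IDs; the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timestamp:
    node_id: int  # u: ID of the requesting node
    seq: int      # t: message sequence number

    def precedes(self, other):
        """Smaller sequence number wins; ties are broken by smaller node ID."""
        return (self.seq, self.node_id) < (other.seq, other.node_id)


class SequenceClock:
    """Tracks the largest sequence number a node has seen so far."""

    def __init__(self):
        self.largest_seen = 0

    def observe(self, seq):
        # Called for every sequence number carried by an incoming message.
        self.largest_seen = max(self.largest_seen, seq)

    def next(self):
        # A new message gets one more than the largest number ever seen.
        self.largest_seen += 1
        return self.largest_seen


clock = SequenceClock()
clock.observe(7)
print(Timestamp(node_id=3, seq=clock.next()))
```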


Conflict Condition

We say that the conflict condition holds if h1+…+hx + hw > k for REQUEST messages R1,…,Rx in R-QUEUE and P-QUEUE preceding Rw (the REQUEST from the locker w), where h1,…,hx and hw are the numbers of resources requested by R1,…,Rx and Rw, respectively.

Node w and the nodes sending the messages R1,…,Rx are called conflicting nodes.
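The condition is a simple sum over the requests that precede the locker's. The sketch below assumes each request is represented as a tuple `(seq, node_id, h)` and that the entries of R-QUEUE and P-QUEUE are passed in together; both the representation and the function name are hypothetical.

```python
def conflict_condition(queued, locker_request, k):
    """True if requests preceding the locker's, plus the locker's own, exceed k.

    queued: (seq, node_id, h) tuples taken from R-QUEUE and P-QUEUE.
    locker_request: the (seq, node_id, h) tuple of the current locker w.
    """
    seq_w, id_w, h_w = locker_request
    # A request precedes Rw if its (seq, node_id) pair is smaller.
    preceding = [h for seq, node_id, h in queued if (seq, node_id) < (seq_w, id_w)]
    return sum(preceding) + h_w > k


# k = 3; the locker asked for 2 resources with timestamp (node 2, seq 5).
locker = (5, 2, 2)
print(conflict_condition([(3, 1, 2), (6, 4, 3)], locker, 3))
print(conflict_condition([(3, 1, 1)], locker, 3))
```

In the first call one preceding request asks for 2 resources, so 2 + 2 exceeds k = 3. In the second call the total is exactly 3, so the condition does not hold.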


On Receiving PRE-RELEASE

On receiving a PRE-RELEASE message from node u (u must be the locker), node v inserts the message into P-QUEUE.

It then marks itself unlocked if R-QUEUE is empty; otherwise, it removes from R-QUEUE the REQUEST of node w, sets w as the locker, and sends w a LOCKED message, where w is the node whose REQUEST is at the front of R-QUEUE.


Conditional Inquiring

If the conflict condition holds, node v then sends an INQUIRE message to node w.

It is not necessary to send the INQUIRE message if an INQUIRE has already been sent to w and w has not yet sent RELINQUISH or RELEASE (we will explain the two messages later).


On receiving INQUIRE

When node w receives an INQUIRE message from node v, it replies with a RELINQUISH message to cancel its lock if it is not in CS.

Otherwise, it replies with a RELEASE message, but only after it exits CS. If an INQUIRE message arrives after w has sent a RELEASE message, it is simply ignored.


On receiving RELINQUISH

On receiving a RELINQUISH message from w (w must be the locker), node v swaps w with u, the node whose REQUEST is at the front of R-QUEUE: it puts w's REQUEST back into R-QUEUE, sets u as the locker, and sends a LOCKED message to u.
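The REQUEST, PRE-RELEASE and RELINQUISH handlers on the permission-granting node v can be sketched as a small state machine. This is an illustration under assumptions, not the author's implementation: requests are `(seq, node_id, h)` tuples kept in a heap, "sending" a message just appends to an `outbox` list, a RELINQUISH is assumed to arrive only while R-QUEUE is non-empty, and the INQUIRE and RELEASE handling is left out. All names are hypothetical.

```python
import heapq


class Arbiter:
    """Permission-granting side of a node v (illustrative sketch)."""

    def __init__(self):
        self.locker = None   # (seq, node_id, h) of the locker, None if unlocked
        self.r_queue = []    # heap of waiting REQUESTs, ordered by (seq, node_id)
        self.p_queue = []    # requests whose sender has sent PRE-RELEASE
        self.outbox = []     # (destination, message) pairs "sent" by v

    def _grant_next(self):
        # Pass the lock to the front of R-QUEUE, or unlock if nobody waits.
        if self.r_queue:
            self.locker = heapq.heappop(self.r_queue)
            self.outbox.append((self.locker[1], "LOCKED"))
        else:
            self.locker = None

    def on_request(self, seq, node_id, h):
        request = (seq, node_id, h)
        if self.locker is None:
            self.locker = request
            self.outbox.append((node_id, "LOCKED"))
        else:
            heapq.heappush(self.r_queue, request)

    def on_pre_release(self, node_id):
        if self.locker is None or self.locker[1] != node_id:
            raise ValueError("PRE-RELEASE must come from the locker")
        self.p_queue.append(self.locker)
        self._grant_next()

    def on_relinquish(self, node_id):
        if self.locker is None or self.locker[1] != node_id:
            raise ValueError("RELINQUISH must come from the locker")
        if not self.r_queue:
            raise ValueError("sketch assumes a waiting REQUEST to swap with")
        # Swap: take the front request and put the old locker's request back.
        self.locker = heapq.heapreplace(self.r_queue, self.locker)
        self.outbox.append((self.locker[1], "LOCKED"))


v = Arbiter()
v.on_request(1, 10, 2)   # node 10 becomes the locker
v.on_request(2, 20, 1)   # node 20 waits in R-QUEUE
v.on_pre_release(10)     # node 10 gives the permission back early
print(v.locker, v.outbox)
```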


Effective Permission

Node u is said to have an effective permission of node v if u has received LOCKED from v and has not sent the corresponding RELINQUISH or RELEASE to v.


Entering CS

Get_Quorum invoked by node u will eventually return a set R, which is the union of h pairwise disjoint cohorts quorums.

Node u can enter CS and access h resources if node u has effective permissions of all nodes in R.

If u does not have effective permissions of all nodes in R, i.e., u has sent RELINQUISH messages to some nodes v1,...,vi of R, 1 ≤ i ≤ |R|, then u must wait for LOCKED messages from v1,...,vi to enter CS.


Exiting CS

After exiting CS, node u should send a RELEASE message to all the nodes to which u has sent REQUEST.


On receiving RELEASE

On receiving a RELEASE message from node u (u may or may not be the locker), node v removes u’s PRE-RELEASE message from P-QUEUE if u’s PRE-RELEASE message is in P-QUEUE.

Node v marks itself unlocked if R-QUEUE is empty; otherwise, it removes from R-QUEUE the REQUEST message of node w, sets w as locker, and sends w a LOCKED message, where w is the node whose REQUEST is at the front of R-QUEUE.


Analysis

Message Cost:

Best Case: 4ch, where c is the cohort size, c > 2k − 2
(one REQUEST, LOCKED and RELEASE for each node in R, and one PRE-RELEASE for the nodes in (Cm ∪ … ∪ Cm−h+1) − R)

Worst Case: 7n, where n is the number of nodes
(one REQUEST, INQUIRE, RELINQUISH, LOCKED, RELEASE, LOCKED for each node, and one RELEASE for those not in R; it occurs only when there are conflicting nodes)
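A quick worked example of the two bounds, assuming every cohort after C1 has the same size c. The function name and the sample values are hypothetical.

```python
def message_cost_bounds(c, h, k, n):
    """Return (best case, worst case) message counts: (4ch, 7n)."""
    lower_bound = 2 * k - 2 if k > 1 else 1
    if c <= lower_bound:
        raise ValueError("cohort size c must exceed 2k - 2 (or 1 when k = 1)")
    return 4 * c * h, 7 * n


# k = 2 resources, cohorts of size c = 3, a request for h = 2, n = 8 nodes.
print(message_cost_bounds(3, 2, 2, 8))
# The best case depends only on c and h, not on the number of nodes n.
print(message_cost_bounds(3, 2, 2, 800))
```

For these values the best case is 4 · 3 · 2 = 24 messages in both calls, while the worst case grows with n (56 and 5600).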


Comparison

Algorithm | Message complexity | k-Concurrency | Fault-Tolerance
Raynal’s algorithm [10] | between 2(n−1) and 3(n−1) | yes | no
The algorithm using k-arbiters [2] | between 3q and (3h+3)q, where q=(k+1)n^(k/(k+1)) for the (k+1)-cube arbiter and q= 1)1/( knk for the uniform k-arbiter | yes | no
The algorithm using (h,k)-arbiters [9] | between 3q and (3h+3)q, where q=(k+2−h)n^((k+1−h)/(k+1)) for the (k+1)-cube (h,k)-arbiter and q= 1)/( hknk for the uniform (h,k)-arbiter | yes | no
Jiang’s algorithm [5] | between 3hq and 6en, where q is the quorum size of the k-coterie used, and 0 < e ≤ 1 | no | yes
The proposed algorithm | between 4ch and 7n, where c > 2k−2 | yes | yes (maybe of the highest availability)

*n stands for the number of nodes, and h stands for the number of requested resources.


Conclusion

The proposed algorithm becomes a k-mutual exclusion algorithm for k>h=1, and becomes a mutual exclusion algorithm for k=h=1.

It is resilient to node and/or link failures and has constant message cost in the best case.

It is a candidate to achieve the highest availability among all the algorithms using k-coteries since the cohorts coterie is ND.

It has the k-concurrency property, which guarantees that a low-priority node is not postponed by a high-priority node when there are no conflicting nodes.


Thanks!!


Dominated k-Coteries

Let C and D be two distinct k-coteries. C is said to dominate D if and only if every quorum in D is a superset of some quorum in C (i.e., ∀Q: Q ∈ D, ∃Q′: Q′ ∈ C: Q′ ⊆ Q).

Obviously, the dominating one (C) has more chances than the dominated one (D) to have available quorums. A quorum is said to be available if all of its members (nodes) are up.
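Domination and availability can be illustrated on a tiny example. The sketch below assumes quorums are given as Python sets; the function names and the two example 1-coteries are hypothetical.

```python
def dominates(c, d):
    """True if k-coterie c dominates k-coterie d."""
    c = {frozenset(q) for q in c}
    d = {frozenset(q) for q in d}
    # c and d must be distinct, and every quorum of d must contain a quorum of c.
    return c != d and all(any(qc <= qd for qc in c) for qd in d)


def has_available_quorum(coterie, up_nodes):
    """True if some quorum consists only of nodes that are up."""
    return any(set(q) <= set(up_nodes) for q in coterie)


# Two 1-coteries over nodes {1, 2, 3}.
big = [{1, 2}, {2, 3}, {1, 3}]
small = [{1, 2}, {2, 3}]
print(dominates(big, small))
# With node 2 down, only the dominating coterie still has an available quorum.
print(has_available_quorum(big, {1, 3}), has_available_quorum(small, {1, 3}))
```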


Nondominated k-Coteries

Since an available quorum implies an available entry to CS, we should always concentrate on ND (nondominated) k-coteries that no other k-coterie can dominate.

An algorithm using ND k-coteries, such as the proposed algorithm, is a candidate to achieve the highest availability.