Replication: optimistic approaches Marc Shapiro with Yasushi Saito (HP Labs) Cambridge Distributed Systems Group

Dec 28, 2015


Replication: optimistic approaches

Marc Shapiro with Yasushi Saito (HP Labs)

Cambridge Distributed Systems Group


Bases de Données Avancées -- 2002-10-23 Replication: optimistic approaches 2

Motivations for this work

Peer-to-peer, decentralised write sharing

Lessons and commonalities

Understand limitations

Different solutions: spectrum or discrete points?

Simple formal model


Optimistic replication

Replicas of shared objects on sites

Without synchronisation: peer-to-peer read and update!

Consistency: a posteriori, offline
  Merge independent updates

Applications:
  high latency networks
  disconnected operation
  cooperative work

Improves availability & performance


Example: cooperative engineering with CVS

CVS: developing shared code

Local, disconnected replica: no interference

Conflicts:
  Write same file = syntactic
  Overlap in file = violates edit semantics
  Doesn't compile, test = violates application semantics

Both sides of a conflict are excluded

Manual repair


Example: Bayou

General-purpose database

Any replica can update, log actions
  action = { dependency check, operation, merge-procedure }

Optimistic replication:
  epidemic exchange logs
  { roll-back, replay }*; commit
  dep-check: semantic check for conflict
  merge-proc: semantic repair


Basic vocabulary

While isolated: tentative updates

When connected, reconcile:
  Propagate & collect updates
  (Conceptually) Restart from initial state
  Replay updates (if possible)

Overriding goal: consistency


1. Consistency

Study component issues of consistency


What is consistency?

Consistent with user intents
  apply operations according to user scenario

Consistent with data invariants
  dependent actions
  pre- and post-conditions
  conflict resolution

Replicas consistent with each other
  converge towards same values


Consistency: problem taxonomy

1. Objects & updates
  Internal vs. external consistency
  Value / value log / operation log
  Single master / multi-master

2. Detecting dependence vs. concurrency

3. Concurrency control

4. Laziness of concurrency control
  Pessimistic / advanced concurrency / optimistic

5. Convergence


Operation-based reconciliation

Updates: concurrent, unsynchronised

Local log of actions = operation descriptions
  object identifier, method, arguments

Multi-log collects local + remote logs

Reconciliation schedule: merge multi-log & run sequentially

Scheduling issues:
  Include vs. exclude
  Execution order
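
The reconciliation steps on this slide (collect the local and remote logs into a multi-log, pick a schedule, run it sequentially, and exclude what cannot run) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Action` fields, the toy bank-account methods, and the `(seq, site)` ordering policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    site: str     # issuing site (hypothetical field)
    seq: int      # position in the issuing site's local log
    obj: str      # object identifier
    method: str   # "credit" or "debit" in this toy example
    arg: int


def apply(state, action):
    """Return the new state, or None if the action cannot run."""
    balance = state.get(action.obj, 0)
    if action.method == "credit":
        return {**state, action.obj: balance + action.arg}
    if action.method == "debit" and balance >= action.arg:
        return {**state, action.obj: balance - action.arg}
    return None


def reconcile(initial, logs):
    """Merge the logs into one multi-log and replay it from the initial state."""
    multilog = [a for log in logs for a in log]
    # Assumed policy: a deterministic order that keeps each local log's order.
    schedule = sorted(multilog, key=lambda a: (a.seq, a.site))
    state, included, excluded = dict(initial), [], []
    for a in schedule:
        nxt = apply(state, a)
        if nxt is None:
            excluded.append(a)      # "exclude" decision
        else:
            state = nxt
            included.append(a)      # "include" decision
    return state, included, excluded


log_a = [Action("A", 1, "acct", "credit", 50), Action("A", 2, "acct", "debit", 30)]
log_b = [Action("B", 1, "acct", "debit", 100)]
final_state, included, excluded = reconcile({"acct": 40}, [log_a, log_b])
```

Both scheduling issues are visible here: the sort key fixes the execution order, and the replay loop decides which actions are included or excluded. A different order could include a different set of actions.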


Operation-based model

[Figure: operation-based model; only node labels survive in the transcript]


Dependence vs. concurrency

Two actions are either dependent or commutative / concurrent

Dependent actions:
  do not conflict
  must be scheduled in dependence order

Concurrent actions:
  potentially conflict

Dependence / concurrency detection is a fundamental mechanism


Concurrency control

Concurrent & no conflict: the actions commute; execute both, in arbitrary order

Conflict detection options

Conflict resolution options


Convergence

Liveness: sites receive same/all actions

Safety: given same actions, sites compute the same value

Stability: actions eventually not undone


2. Dependency & Concurrency

Mechanisms to detect if actions are dependent or concurrent


Scalar clocks and timestamps

Wall clock, Lamport clock
  Total order
  Total order, consistent with causal dependence

Schedule in timestamp order

Can't detect concurrency
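
A small sketch of a Lamport clock may help to see why scalar timestamps give a total order but cannot detect concurrency. The class below follows the standard Lamport rules; the `site` tie-breaker and the three-site scenario are assumptions made for the example.

```python
class LamportClock:
    """Scalar logical clock for one site; `site` breaks timestamp ties."""

    def __init__(self, site):
        self.site = site
        self.time = 0

    def tick(self):
        """Advance for a local event or a send; return its timestamp."""
        self.time += 1
        return (self.time, self.site)

    def receive(self, stamp):
        """Advance past the timestamp carried by an incoming message."""
        self.time = max(self.time, stamp[0]) + 1
        return (self.time, self.site)


p, q, r = LamportClock("p"), LamportClock("q"), LamportClock("r")
send = p.tick()           # p sends a message
recv = q.receive(send)    # q receives it: causally after the send
other = r.tick()          # r acts independently: concurrent with both

# Scheduling in timestamp order yields one total order for all three events.
schedule = sorted([recv, other, send])
```

The event on `r` is concurrent with the other two, yet it is placed between them in the schedule: the timestamps alone give no way to tell that it was unrelated.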


Happens-before

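$e_1 \rightarrow e_2$ (happens-before) if:
  $e_1$ precedes $e_2$ in the same process, or
  $e_1$ sends a message and $e_2$ receives it

$\neg(e_1 \rightarrow e_2) \wedge \neg(e_2 \rightarrow e_1) \Leftrightarrow e_1 \parallel e_2$

$e_1 \parallel e_2$: $e_1$ does not cause $e_2$

$e_1 \rightarrow e_2$: $e_1$ might cause $e_2$

Partial order, consistent with causal dependence

Schedule consistent with $\rightarrow$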


Syntactic vs. semantic mechanisms

Scalar timestamps
  no concurrency detection
  very conservative approx. of causality

Vector timestamps
  detect concurrency
  conservative approx. of causality

Alternative: explicit semantic constraints
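
To illustrate how vector timestamps detect concurrency where scalar ones cannot, here is a minimal comparison function. Representing a vector as a dict from site name to counter, with missing entries read as 0, is an assumption of this sketch.

```python
def compare(v1, v2):
    """Compare two vector timestamps given as dicts site -> counter."""
    sites = set(v1) | set(v2)
    le = all(v1.get(s, 0) <= v2.get(s, 0) for s in sites)
    ge = all(v1.get(s, 0) >= v2.get(s, 0) for s in sites)
    if le and ge:
        return "equal"
    if le:
        return "before"       # v1 happens-before v2
    if ge:
        return "after"        # v2 happens-before v1
    return "concurrent"       # neither dominates the other


first = compare({"A": 1}, {"A": 1, "B": 1})
second = compare({"A": 2}, {"A": 1, "B": 1})
```

Two vectors are ordered only if one dominates the other in every entry; otherwise the events are reported as concurrent. This remains a conservative approximation: "before" means the first event might have caused the second, not that it did.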


Locks as semantic constraints

Read(x) depends on
  previous Write(x) in same process, or
  previously-received Write(x), whichever is later

Write(x) depends on
  previous Read(*) in same process

More semantic information than Happens-Before

Step in the right direction, but still too coarse


IceCube: Primitive constraints

Declarative ("static"):
  MustHave: $a \triangleright b$
    if $a \in s$ and $a \triangleright b$ then $b \in s$ (not necessarily contiguous nor in order)
  Order: $a \rightarrow b$
    if $a, b \in s$ and $a \rightarrow b$ then $a$ before $b$ in $s$ (not necessarily both nor contiguous)
  Within log, across logs

Imperative (dynamic): preCondition (State)


Log constraints

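[Figure: the three log constraints: predecessor-successor, parcel, alternatives]

Express user intents:
  Predecessor/successor: $a \rightarrow b \wedge b \triangleright a$
    b uses effect of a; "a causes b"
  Parcel: $a \triangleright b \wedge b \triangleright a$
    transaction
  Alternatives: $a \rightarrow b \wedge b \rightarrow a$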


3. Concurrency control & scheduling

Policies for dealing with concurrent actions


Optimistic concurrency control & scheduling

Two actions are either:
  dependent: schedule in dependence order
  concurrent and non-conflicting or commutative: schedule in any order
  concurrent and conflicting: schedule in non-conflicting order, or exclude one, the other, or both


Concurrency control

Concurrent & no conflict: the actions commute; execute both, in arbitrary order

Conflict detection options: 2 concurrent actions conflict
  only if operate on same object
  only if both write
  only if violate semantic invariant

Conflict resolution options:
  exclude both
  exclude 1st, include 2nd (or vice-versa)
  execute both in favorable order
  (rewrite and execute both)


What is a conflict?

1 site executes code + pre/post-conditions
  Pre/post-conditions often unknown
  Dependency between successive actions

Schedule execution must satisfy pre/post-conditions
  Violation: conflict

pre(x0) post(x0, f(x0))

x1:= f(x0)

pre(x0) post(x’1, g(x0))

x’1:= g(x0)

pre(x1) post(x1, g(x1))

x2:= g(x1)


Thomas’ Write Rule

Pre- / post-conditions unknown

Scalar clocks
  no concurrency detection
  implicit concurrency control
  schedule in clock order
  a later action excludes earlier ones

Lost updates

Delete ambiguity: “tombstone” state
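
The rule and its two side effects (lost updates, and the need for tombstones on delete) can be seen in a minimal last-writer-wins store. This is a sketch only: the `LwwStore` name, the `(clock, site)` timestamp tuples, and the in-memory dict are assumptions of the example.

```python
TOMBSTONE = object()   # marker kept in place of a deleted value


class LwwStore:
    """Last-writer-wins store applying Thomas' Write Rule per key."""

    def __init__(self):
        self._data = {}   # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        """Apply a write unless a write with a later timestamp was already seen."""
        current = self._data.get(key)
        if current is not None and current[0] >= timestamp:
            return False              # the earlier write is silently dropped
        self._data[key] = (timestamp, value)
        return True

    def delete(self, key, timestamp):
        """A delete is a write of a tombstone, so it still carries a timestamp."""
        return self.apply(key, timestamp, TOMBSTONE)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] is TOMBSTONE:
            return None
        return entry[1]


store = LwwStore()
store.apply("x", (2, "B"), "from B")
accepted = store.apply("x", (1, "A"), "from A")   # arrives late: lost update
store.delete("x", (3, "A"))
resurrected = store.apply("x", (2, "B"), "from B")  # replayed after the delete
```

Without the tombstone, the store could not tell that the replayed write at timestamp `(2, "B")` is older than the delete, and the deleted value would reappear.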


Value-based Version Vector concurrency control

Pre- / post-conditions unknown

Independent objects
  actions to different objects commute
  VV = per-object vector timestamp
  any concurrent writes to object conflict

Resolution:
  Manual
  Values: "Resolver" per data type


Bayou scheduling

Disjoint databases; 1 primary / database

Transaction: single database

Action = { dependency check, operation, merge-procedure }

Optimistic replication:
  epidemic exchange logs
  { roll-back, replay }*; commit

Conflict: dependency check fails, then the merge-procedure runs


Bayou dependency checks

Write-write conflicts: on replay check that data unchanged

Read-write conflicts: check input data
  can detect concurrent updates
  semantic: only relevant changes

Application-specific checks
  bank account balance > £100
  fine grain
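
A Bayou-style action, with its dependency check, operation, and merge-procedure, might be modelled as below. This is an illustrative sketch rather than Bayou's actual interface: the `Write` class, the `replay` loop, the `withdraw` helper, and the reading of the bank check as "the balance must stay above £100 after the withdrawal" are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Write:
    dep_check: Callable[[dict], bool]    # semantic check for conflict
    operation: Callable[[dict], dict]    # the update itself
    merge_proc: Callable[[dict], dict]   # semantic repair when the check fails


def replay(initial, log):
    """Replay a log from the initial state, repairing conflicts as they appear."""
    db = dict(initial)
    for w in log:
        db = w.operation(db) if w.dep_check(db) else w.merge_proc(db)
    return db


def withdraw(account, amount, floor=100):
    """Hypothetical application write: withdraw only if the balance stays above `floor`."""
    return Write(
        dep_check=lambda db: db[account] - amount > floor,
        operation=lambda db: {**db, account: db[account] - amount},
        # Assumed repair: record the rejected withdrawal instead of applying it.
        merge_proc=lambda db: {
            **db, "rejected": db.get("rejected", []) + [(account, amount)]
        },
    )


result = replay({"acct": 250}, [withdraw("acct", 100), withdraw("acct", 100)])
```

The check is fine-grained: it looks only at the balance the withdrawal depends on, so unrelated concurrent updates to the database do not count as conflicts.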


IceCube: Object constraints

Shared data type advertises static semantics
  mutually exclusive: $a \rightarrow b \wedge b \rightarrow a$
  best order (e.g. bank: credits before debits): $a \rightarrow b$

Only between concurrent actions

Also: dynamic constraints

[Figure: object constraints between actions: commute, best order, mutually exclusive]


IceCube scheduling

Insight: a conflict is a choice of which action to exclude
  maximise value


IceCube execution model

[Figure: IceCube execution model; edge labels: log constraints, object constraints, dynamic constraints]


Search vs. syntactic order

[Figure: solution size (0 to 250) vs. number of actions (5 to 250); series: Optimal, Concatenate, IceCube, Single log]


Performance of IceCube heuristics

[Figure: execution time in ms (0 to 3000) vs. number of actions (1000 to 10000); series: Total]


4. Convergence

Can a peer-to-peer system converge?

Hard in the general case

Formalise to understand limitations, trade-offs


Convergence

Liveness: sites receive same/all operations
  epidemic multicast
  quickly

Safety: sites compute the same value
  equivalent schedules

Stability: actions eventually not undone
  stable schedules
  Users, external world dependency
  Garbage collection


Schedule soundness & equivalence

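$s$ sound:
  Closed for MustHave: $a \in s \wedge a \triangleright b \Rightarrow b \in s$
  Consistent with Order: $(a, b \in s \wedge a \rightarrow b) \Rightarrow a$ before $b$ in $s$

Equivalence: $s \equiv t$
  $s$, $t$ sound
  $a \in s \Leftrightarrow a \in t$
  ordering is irrelevant!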
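
The two soundness conditions and the equivalence test translate directly into a short checker. In this sketch a schedule is a list of distinct action names, and the MustHave and Order relations are sets of `(a, b)` pairs; these representations and the function names are assumptions made for illustration.

```python
def is_sound(schedule, must_have, order):
    """Check closure under MustHave and consistency with Order.

    must_have: pairs (a, b) meaning "if a is scheduled, b must be too".
    order:     pairs (a, b) meaning "if both are scheduled, a runs before b".
    """
    position = {a: i for i, a in enumerate(schedule)}
    closed = all(b in position for (a, b) in must_have if a in position)
    ordered = all(
        position[a] < position[b]
        for (a, b) in order
        if a in position and b in position
    )
    return closed and ordered


def equivalent(s, t, must_have, order):
    """Two sound schedules are equivalent when they contain the same actions."""
    return (
        is_sound(s, must_have, order)
        and is_sound(t, must_have, order)
        and set(s) == set(t)
    )


# "b" uses the effect of "a": b MustHave a, and a is ordered before b.
must_have = {("b", "a")}
order = {("a", "b")}
ok = is_sound(["a", "c", "b"], must_have, order)
same = equivalent(["a", "c", "b"], ["c", "a", "b"], must_have, order)
```

Equivalence compares only the sets of actions: once both schedules are sound, the position of the unconstrained action `c` does not matter.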


Stability

Peer-to-peer, indefinite tentative update + advisory reconciliation OK

But stability needed:
  Users, external world depend on it
  Garbage collect multilog

Stable: eventually decisions not changed
  committed: definitely included in all schedules
  aborted: definitely excluded


Correctness of stability

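Actions known to be stable at site $i$: $\mathit{stable}_i = \mathit{committed}_i \cup \mathit{aborted}_i$

Live: $\forall$ action $a$, $\forall$ site $i$: eventually $a \in \mathit{stable}_i$

Safe:
  $\forall$ site $i$, schedule $s_i$: $s_i$ sound $\wedge$ $\mathit{committed}_i \subseteq s_i$
  $\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$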

Safety invariant: strong, global!


Maintaining disjointness

$\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$

Different possibilities:
  Unilateral abort
    TWR, Holliday 2000
  Unilateral commit
  Deterministic abort / commit rule
    TWR
  Primary (only one) site decides
    Bayou, CVS
  Consensus before deciding
    Deno, Holliday 2000-2002


Maintaining soundness

$\forall$ site $i$, schedule $s_i$: $s_i$ sound $\wedge$ $\mathit{committed}_i \subseteq s_i$

When aborting a, also abort actions that MustHave a

When committing a, also abort uncommitted actions that are 'Order'ed before a

Maintain both soundness and disjointness.

Peer-to-peer commitment is hard!
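
The two maintenance rules on this slide can be sketched as small set computations. The pair representation of the MustHave and Order relations and both function names are assumptions of this example, and applying the abort rule transitively is an interpretation: an action that needs an aborted action can no longer be scheduled either.

```python
def abort_closure(a, must_have):
    """Actions to abort together with `a`.

    must_have holds pairs (x, y) meaning "x MustHave y". Aborting y forces
    aborting x, and so on transitively.
    """
    aborted, frontier = {a}, [a]
    while frontier:
        y = frontier.pop()
        for (x, needed) in must_have:
            if needed == y and x not in aborted:
                aborted.add(x)
                frontier.append(x)
    return aborted


def aborts_on_commit(a, committed, order):
    """Uncommitted actions ordered before `a`, which must be aborted when `a` commits.

    order holds pairs (x, y) meaning "x before y".
    """
    return {x for (x, y) in order if y == a and x not in committed}


must_have = {("b", "a"), ("c", "b"), ("d", "e")}
cascade = abort_closure("a", must_have)

order = {("p", "q"), ("r", "q")}
victims = aborts_on_commit("q", committed={"p"}, order=order)
```

Both rules only keep one site's schedule sound. Making every site reach the same commit and abort decisions (disjointness) is a separate problem, which is why the slide calls peer-to-peer commitment hard.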


Stability with TWR

Independent objects

Independent writes (no MustHave nor Order)

All sites take same decision:
  Given two writes to same object, abort the earlier
  Whether concurrent or not
  Write stable when seen by all sites

Disjointness: $\mathit{committed}_i = \emptyset$

Soundness: no MustHave (no transactions)


Stability in Bayou

Databases:
  Disjoint
  Independent: no multi-DB transaction
  1 primary / database

Log constraints: transactions, time order

Disjointness: Only 1 site decides about a: the primary for the database that a updates

Soundness: whole transaction commits or aborts


Holliday’s pre-commit protocol

Log constraints:
  multi-object transactions
  happens-before order

Read transactions commit locally

Read-Write transactions: consensus to commit
  convert locks to intentions
  pre-commit, vote
  commit if quorum 'yes'
  abort if anti-quorum 'no' or conflict with committed
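
The vote-counting step can be sketched as follows. This is not Holliday's protocol itself: the slide does not say how the quorum is sized, so a simple majority is assumed here, an anti-quorum is read as "enough 'no' votes that a 'yes' quorum can no longer form", and the separate "conflict with committed" check is left out.

```python
def decide(n_sites, yes, no):
    """Decide a read-write transaction from the votes counted so far.

    Assumes a majority quorum; returns "commit", "abort", or "pending".
    """
    quorum = n_sites // 2 + 1
    if yes >= quorum:
        return "commit"
    if no > n_sites - quorum:     # a 'yes' quorum is now impossible
        return "abort"
    return "pending"              # keep collecting votes


outcomes = [decide(5, yes=3, no=0), decide(5, yes=1, no=3), decide(5, yes=2, no=2)]
```

With five sites, three 'yes' votes commit, three 'no' votes abort, and a 2-2 split stays pending until the last vote arrives.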


Trade-offs

Deterministic rule
  fast, inflexible

Partition + primary
  single point of failure
  no MustHave across partition boundaries

Consensus
  slow
  scalability
  impossibility of consensus in asynchronous systems with failure


5. Conclusions


Need for optimistic replication (OR) is not going away

“Network technology improving: keep everything consistent pessimistically.”

True, but:
  Constant latency; unavailable bandwidth
  Mobile access: unbounded latency
  Increasing numbers of replicas

"Conflicts are rare."

True, but:
  Do occur
  Very high cost


OR pros & cons

Peer-to-peer read/write sharing

OR accepts more updates:
  Performance despite latency
  Availability despite failures

Increased complexity
  Semantic information
  Not transparent

Bottleneck moved to commit
  Hard to make peer-to-peer
  Unless (unacceptable?) restrictions

Unavoidable


The end