Continuous Consistency and Availability
Haifeng Yu
CPS 212 Fall 2002
2
Consistency in Replication
! Reasons for replication: better performance and availability
! Replication comes with a consistency cost:
[Figure: three clients and three replicated servers]
! Replication transforms client-server communication into server-server communication:
• Decreases performance
• Decreases availability
3
Strong Consistency and Optimistic Consistency
! Traditionally, two choices for consistency level:
• Strong consistency: strictly “in sync”
• Optimistic consistency: no guarantee at all
• Associated tradeoffs with each model
[Figure: availability / performance / scalability plotted against consistency, with optimistic consistency and strong consistency as the two points]
4
Problems with Binary Choice
! Strong consistency incurs prohibitive overheads for many WAN apps
• Replication may even decrease performance, availability and scalability relative to a single server!
! Optimistic consistency provides no consistency guarantee at all
• Results in upset users: unbounded reservation conflicts
• Can render the app unusable: traffic data more than 1 hour stale is probably of little use
! Applications cannot tune the consistency level to their environment
• Need to adapt to client, service and network characteristics
5
Continuous Consistency
! Consistency is continuous rather than binary for many WAN apps
• These apps can benefit from exploiting the consistency spectrum between strong and optimistic consistency.
[Figure: two plots of availability / performance / scalability against consistency. The first shows only the two end points, optimistic consistency and strong consistency; the second shows continuous consistency.]
6
Quantifying Consistency
! Many ways:
• Staleness (TTL in web caching): Invalidate
• Limit the number of locally buffered writes
[Figure: a replica holding buffered updates to be sent to other replicas]
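The two metrics on this slide can be made concrete with a minimal sketch. The class name `BoundedReplica` and the parameters `max_staleness` and `max_buffered_writes` are hypothetical names chosen for illustration; they do not come from the lecture or from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class BoundedReplica:
    """Hypothetical replica that tracks two inconsistency metrics."""
    max_staleness: float        # seconds a local copy may go without a sync (TTL)
    max_buffered_writes: int    # local writes allowed before a push is required
    last_sync: float = 0.0
    buffered: list = field(default_factory=list)

    def is_fresh(self, now: float) -> bool:
        # Staleness bound: the copy is usable only within its TTL.
        return now - self.last_sync <= self.max_staleness

    def write(self, update) -> bool:
        # Accept the write locally unless the buffer bound is reached.
        if len(self.buffered) >= self.max_buffered_writes:
            return False
        self.buffered.append(update)
        return True

    def propagate(self, now: float) -> list:
        # Send buffered writes to the other replicas (simulated by
        # returning them) and record the time of the sync.
        sent, self.buffered = self.buffered, []
        self.last_sync = now
        return sent
```

Setting `max_buffered_writes` to 0 makes every write wait for propagation, while a very large value approaches the optimistic end of the spectrum.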
7
Applications?
! Applications:
• Web caching
• Airline reservation
• Distributed games
• Shared editor
! Non-applications:
• Some scientific computing problems
• Banking system
• Any application that has binary output
! An application’s nature determines whether continuous consistency is applicable
8
Trading Consistency for Performance
! Airline reservation: running at Berkeley, Utah, Duke
[Figure: throughput (updates/sec, axis 0 to 50) versus inconsistency (0% to 100%), between the strong consistency and optimistic consistency end points. From [Yu’02, TOCS]]
9
The Cost of Increased Performance
! Increased performance comes with a cost
• Adaptively trade consistency for performance based on client, network, and service conditions
[Figure: reservation conflict rate (axis 0% to 25%) versus inconsistency (0% to 100%)]
10
Model vs. Protocol
! Continuous consistency model is a spec.
! A protocol is anything that can enforce the spec.
• Corollary: a strong consistency protocol is a protocol for any model
! A specific model can have many protocols; some are good, others are not
11
Designing a Continuous Consistency Model
! A model is a spec, thus quantifying consistency (in a bad way) is trivial
! Only the application knows its own definition of consistency
• Airline reservation vs. distributed games
! What is a “good” continuous consistency model?
• Can be used by diverse apps
• Practical
12
Distributed Consensus and Leader Election
! What does “continuous consistency” mean?
• Allow at most k decision values
• Allow at most k leaders
! Helps overcome some impossibilities
• A unique decision value requires a majority (more than ½ of the nodes)
• k decision values allow any partition with more than 1/(k + 1) of the nodes to decide
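The counting argument behind the 1/(k + 1) threshold can be sketched as follows. The function names are hypothetical, and the sketch only computes the size threshold; it is not a consensus protocol. The key observation is that at most k disjoint partitions can each hold strictly more than n/(k + 1) of n nodes, so at most k values can be decided.

```python
def min_partition_size(n: int, k: int) -> int:
    """Smallest partition size that may decide when at most k
    decision values are allowed among n nodes (hypothetical helper)."""
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    # Smallest integer strictly greater than n / (k + 1).
    return n // (k + 1) + 1


def can_decide(partition_size: int, n: int, k: int) -> bool:
    """True if a partition of this size is large enough to decide."""
    return partition_size >= min_partition_size(n, k)
```

With k = 1 the threshold reduces to the usual strict majority; larger k lets smaller partitions keep making progress.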
13
Group Membership Service
! Def: Keep track of which nodes belong to which group
! Traditionally, group membership services maintain only a single group
• Primary-partition membership services
• Corresponds to strong consistency
! Recently, partitionable membership services
• Still an active area of research
• Corresponds to optimistic consistency
! Continuous consistency:
• Allow at most k groups
• Again, helps overcome the ½ majority limitation
14
Continuous Consistency Summary
! WAN replication needs dynamically tunable consistency
! Tradeoff between consistency and performance
! How to design a continuous consistency model
! Continuous consistency in other contexts
! Next: Availability
15
What is Availability?
! No well-accepted availability metric for Internet services
! “Uptime” metric can be misleading for Internet services
• Server may be inaccessible because of network partition
! Available: “present or ready for immediate use”
• From Webster’s Collegiate Dictionary
• What does “immediate” mean?
• Time-out
! Availability = (accepted accesses) / (submitted accesses)
• Implicit time-out in the definition
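A minimal sketch of this metric, making the implicit time-out explicit. The function name `availability` and the convention that `None` marks an access that never received a reply are assumptions for illustration.

```python
def availability(latencies, timeout: float) -> float:
    """Fraction of submitted accesses answered within the time-out.

    latencies: one entry per submitted access, giving the response
    time in seconds, or None if no reply ever arrived (assumed convention).
    """
    if not latencies:
        raise ValueError("no accesses submitted")
    accepted = sum(1 for t in latencies if t is not None and t <= timeout)
    return accepted / len(latencies)
```

The same trace yields different availability numbers under different time-outs, which is why the choice of time-out is part of the definition.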
16
Perform-ability
! User satisfaction is not binary
• What if a partial result is returned before time-out?
• What if the result is sent back after an hour, or a day?
• Availability is related to performance
! Performability = reward function (quality and timeliness of result)
! Determining the reward function is hard!
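To illustrate what a reward function over quality and timeliness might look like, here is one purely hypothetical choice: full credit up to a deadline, then linear decay to zero at twice the deadline. The shape, the `deadline` parameter, and the function names are all assumptions; the lecture does not prescribe any particular reward function.

```python
def reward(quality: float, latency: float, deadline: float) -> float:
    """Hypothetical reward: quality in [0, 1], scaled by timeliness."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be in [0, 1]")
    if deadline <= 0:
        raise ValueError("deadline must be positive")
    if latency <= deadline:
        timeliness = 1.0
    else:
        # Linear decay, reaching zero at twice the deadline.
        timeliness = max(0.0, 1.0 - (latency - deadline) / deadline)
    return quality * timeliness


def performability(results, deadline: float) -> float:
    """Mean reward over (quality, latency) pairs, one per access."""
    return sum(reward(q, t, deadline) for q, t in results) / len(results)
```

A partial result returned on time and a full result returned late can earn the same reward under this choice, which shows how availability and performance blend into one measure.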
17
Availability of an Internet Service
! We use user-observed availability in our study:
Availability = (accepted accesses) / (submitted accesses)
[Figure: a client accessing a server. A failure (×) on the path between them is labeled 2% [Chandra et al., USITS’01]; a reject due to server failure (×) is labeled 0.1% [MS press release, Jan’01].]
18
Effects of Replication
! Consistency may force a replica to reject an otherwise acceptable request
• Network failure rate goes down, replica rejection rate goes up
[Figure: a client and two replicas. A failure (×) between the client and a replica is labeled < 2%; a replica rejects a request because the communication needed to maintain consistency failed, labeled > 0.1%.]
19
Limitations of Strong Consistency
[Figure: replicas and clients, divided into two groups; each column below applies to one group]
Option 1:   accept reads    accept reads
            reject writes   reject writes
Option 2:   accept reads    reject reads
            accept writes   reject writes
20
Effects of Continuous Consistency
[Figure: two groups of replicas, annotated “allow replica to buffer 5 writes”; each column below applies to one group]
Option 1:       accept reads             accept reads
                reject writes            reject writes
New Option 1:   accept reads             accept reads
                accept first 10 writes   accept first 5 writes
21
Effects of Continuous Consistency
[Figure: two groups of replicas, annotated “allow replica to buffer 5 writes”; each column below applies to one group]
Option 2:       accept reads    reject reads
                accept writes   reject writes
New Option 2:   accept reads    accept first few reads
                accept writes   accept first 5 writes
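The effect of the write buffer during a partition can be sketched with a small model. The class `PartitionedReplica` and the helper `accepted_writes` are hypothetical; a buffer limit of 0 stands in for strong consistency, and a limit of 5 matches the annotation on the slide.

```python
class PartitionedReplica:
    """Hypothetical replica that is cut off from its peers."""

    def __init__(self, buffer_limit: int):
        self.buffer_limit = buffer_limit  # 0 models strong consistency
        self.buffered = 0

    def handle_write(self) -> bool:
        # A write is accepted only while it still fits in the buffer,
        # because nothing can be propagated during the partition.
        if self.buffered < self.buffer_limit:
            self.buffered += 1
            return True
        return False


def accepted_writes(buffer_limit: int, submitted: int) -> int:
    """Number of writes accepted out of `submitted` during a partition."""
    replica = PartitionedReplica(buffer_limit)
    return sum(replica.handle_write() for _ in range(submitted))
```

Under this model a replica that would reject every write with strong consistency accepts the first 5 writes once it is allowed to buffer 5.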
22
Consistency Impact is Inherent
[Figure: availability versus inconsistency, showing a hard bound curve between the end points “100% consistency” and “0% consistency, 100% availability”]
! A hard bound always exists
! We always know the two end points, but may not know the exact shape of the curve
23
Effects of Consistency Protocol
! Achieved availability also depends on the protocol
• Design better protocols
• Job of system designers
[Figure: availability versus inconsistency, with curves for the upper bound, Protocol A, and Protocol B]
24
Availability Optimizations
! Technique should not be tied to model
! Focus on two techniques:
• Retiring replicas
• Aggressive write propagation
25
Limitations of Strong Consistency
[Figure: replicas and clients, divided into two groups; each column below applies to one group]
Option 1:   accept reads    accept reads
            reject writes   reject writes
Option 2:   accept reads    reject reads
            accept writes   reject writes
26
Retiring Replicas
! Obviously, such a decision may not be optimal unless we have future knowledge
• Importance of prediction
! Even with future knowledge, it is hard
! In option 2, all replicas must reach an agreement
• Leader election
• We are experiencing partitions
• One option: voting
• What if we don’t have a majority?
27
Aggressive Write Propagation
! Applicable to continuous consistency
! Continuous consistency gives us “buffers” that can be utilized in case of network partition
! Keep the buffer empty:
• Cannot predict the occurrence of network partitions
• Propagate writes more aggressively
• Cut down the amount of inconsistency accumulated in times of good connectivity
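Why an emptier buffer helps can be seen in a toy discrete-time simulation. Everything here is an assumption for illustration: one write arrives per time step, pushes succeed only while connected, the lazy policy pushes only when the buffer is full, and the aggressive policy additionally pushes every `push_period` steps. The function name and parameters are hypothetical.

```python
def writes_accepted_during_partition(buffer_limit, partition_start,
                                     partition_len, push_period=None):
    """Count writes accepted after the network partitions.

    push_period=None models lazy propagation (push only when the
    buffer is full); an integer adds a periodic push while connected.
    """
    buffered = 0
    accepted = 0
    for t in range(partition_start + partition_len):
        connected = t < partition_start
        if buffered >= buffer_limit:
            if not connected:
                continue          # buffer full and no way to push: reject
            buffered = 0          # push on demand
        buffered += 1             # accept this step's write
        if not connected:
            accepted += 1
        elif push_period and (t + 1) % push_period == 0:
            buffered = 0          # periodic (aggressive) push
    return accepted
```

In this toy model, with a buffer of 5 and a partition starting at step 9, the lazy policy enters the partition with 4 writes already buffered and accepts only 1 more, while a periodic push every 3 steps leaves the buffer empty and all 5 slots usable.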
28
Effects of Aggressive Propagation
! Baseline: propagate writes only when necessary (lazily)
! Aggressive: when necessary and every 3 seconds
[Figure: availability (axis 0.993 to 0.998) versus inconsistency for the availability upper bound, the aggressive protocol, and the baseline; 8 replicas with measured faultload. From [Yu’01, SOSP]]
29
More Aggressive Propagation
! Aggressive write propagation does not work in all cases
! Availability optimizations can incur more communication
• Best availability is achieved when we use a strong consistency protocol
! Speaks of availability / performance tradeoffs
30
Availability of Other Systems
! Consensus and leader election
• Blocks without majority
! Group membership
• Blocks without majority
! Relaxing consistency enables them to make progress
• Open question: but will these systems still be useful?
31
Availability Summary
! Availability definition
! Inherent impact of consistency on availability
! Availability also depends on consistency protocols
! Availability optimizations:
• Replica retirement
• Aggressive write propagation
32
Why can we easily approach the upper bound?
! Simple protocols in our study can approach the upper bound closely
• Remember that reaching the upper bound in general needs future knowledge
! Related to the characteristics of the faultloads we measured and simulated
• Most partitions are singleton partitions
• Most transitions are: fully-connected → singleton partition → fully-connected
! These characteristics are consistent with:
• The Internet’s hierarchical architecture
33
Dual Effects of Replication Scale on Availability
! Consistency may force a replica to reject a request
! Adding more replicas:
• Network failure rate goes down, replica rejection rate goes up
! Availability = (1 - Network Failure Rate) * (1 - Rejection Rate)
• Too large or too small a replication scale can hurt availability
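The formula can be evaluated directly to see the dual effect. The function names are hypothetical, and the per-scale rates in the usage note are made-up numbers for illustration only, not measurements from the study.

```python
def replicated_availability(network_failure_rate: float,
                            rejection_rate: float) -> float:
    """Availability = (1 - network failure rate) * (1 - rejection rate)."""
    return (1.0 - network_failure_rate) * (1.0 - rejection_rate)


def best_scale(rates_by_scale: dict) -> int:
    """Pick the number of replicas with the highest availability.

    rates_by_scale maps a replica count to a
    (network_failure_rate, rejection_rate) pair.
    """
    return max(rates_by_scale,
               key=lambda n: replicated_availability(*rates_by_scale[n]))
```

For example, with invented rates `{1: (0.02, 0.0), 2: (0.01, 0.002), 4: (0.005, 0.02)}`, two replicas come out ahead of both one and four: the drop in network failure rate helps at first, then the growing rejection rate dominates.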
34
Optimal Replication Scale
! Optimal replication scale: adding more replicas can hurt!
• Increase in “replica rejection rate” outweighs decrease in “network failure rate”
! Optimal replication scale depends on:
• Consistency level
• Network failure rate among replicas
[Figure: availability upper bound (axis 0.984 to 1) versus number of replicas (1 to 7), with four curves: failure rate = 1% and numerical error = 250; failure rate = 1% and numerical error = 0; failure rate = 5% and numerical error = 250; failure rate = 5% and numerical error = 0]