Continuous Consistency and Availability
Haifeng Yu
CPS 212 Fall 2002
Dec 31, 2015
Consistency in Replication
Reasons for replication: better performance and availability
Replication comes with a consistency cost:
Replication transforms client-server communication into server-server communication:
• Decreases performance
• Decreases availability
[Figure: clients communicating with replicated servers]
Strong Consistency and Optimistic Consistency
Traditionally, two choices for consistency level:
• Strong consistency: Strictly “in sync”
• Optimistic consistency: No guarantee at all
• Associated tradeoffs with each model
[Figure: Availability / Performance / Scalability versus Consistency, with Optimistic Consistency and Strong Consistency as the two endpoints]
Problems with Binary Choice
Strong consistency incurs prohibitive overheads for many WAN apps
• Replication may even decrease performance, availability and scalability relative to a single server!
Optimistic consistency provides no consistency guarantee at all
• Results in upset users: unbounded reservation conflicts
• Potentially renders the app unusable: if traffic data is more than 1 hour stale, it is probably of little use
Applications cannot tune their consistency level based on their environment
• Need to adapt to client, service and network characteristics
Continuous Consistency
Consistency is continuous rather than binary for many WAN apps
• These apps can benefit from exploiting the consistency spectrum between strong and optimistic consistency.
[Figure: Availability / Performance / Scalability versus Consistency; Continuous Consistency covers the spectrum between Optimistic Consistency and Strong Consistency]
Quantifying Consistency
Many ways:
• Staleness (TTL in web caching): Invalidate
• Limit number of locally buffered writes
[Figure: a replica holding buffered updates to be sent to other replicas]
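The two bounds listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `BoundedReplica`, its fields, and the `can_reach_peers` flag are hypothetical names, not part of any system described in the slides, and propagation is reduced to clearing a local list.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BoundedReplica:
    """Hypothetical replica that bounds two kinds of inconsistency."""
    max_buffered_writes: int   # bound on writes not yet sent to other replicas
    ttl_seconds: float         # bound on staleness, as with a web-cache TTL
    buffered: list = field(default_factory=list)
    last_refresh: float = field(default_factory=time.monotonic)

    def is_stale(self, now=None):
        """True once the local copy is older than the TTL and must be invalidated."""
        now = time.monotonic() if now is None else now
        return now - self.last_refresh > self.ttl_seconds

    def propagate(self):
        """Stand-in for pushing buffered writes to the other replicas."""
        pushed = len(self.buffered)
        self.buffered.clear()
        return pushed

    def write(self, update, can_reach_peers):
        """Accept a write only if the buffered-write bound is not violated."""
        if len(self.buffered) >= self.max_buffered_writes:
            if not can_reach_peers:
                return False       # buffer full and peers unreachable: reject
            self.propagate()       # make room by sending buffered writes
        self.buffered.append(update)
        return True
```

Setting `max_buffered_writes` to 0 makes every write depend on reaching the other replicas, which is the strong-consistency end of the spectrum; a very large bound approaches the optimistic end.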
Applications?
Applications:
• Web caching
• Airline reservation
• Distributed games
• Shared editor
Non-Applications:
• Some scientific computing problems
• Banking system
• Any application that has binary output
Application’s nature determines whether continuous consistency is applicable
Trading Consistency for Performance
Airline reservation: running at Berkeley, Utah, Duke
[Figure: Throughput (updates/sec), 0 to 50, versus Inconsistency, 0% to 100%, ranging from Strong Consistency to Optimistic Consistency]
[Yu’02, TOCS]
The Cost of Increased Performance
Increased performance comes with a cost
• Adaptively trade consistency for performance based on client, network, and service conditions
[Figure: Reservation conflict rate, 0% to 25%, versus Inconsistency, 0% to 100%]
Model vs. Protocol
Continuous consistency model is a spec.
Protocol is anything that can enforce the spec.
• Corollary: A strong consistency protocol is a protocol for any model
Many protocols exist for a specific model; some are good, others are not
Designing a Continuous Consistency Model
Model is a spec, thus quantifying consistency (in a bad way) is trivial
Only the application knows its own definition of consistency
• Airline reservation vs. distributed games
What is a “good” continuous consistency model?
• Can be used by diverse apps
• Practical
Distributed Consensus and Leader Election
What does “continuous consistency” mean?
• Allow at most k decision values
• Allow at most k leaders
Helps overcome some impossibilities
• Unique decision value requires a majority (more than ½ of the nodes)
• k decision values allow any partition with more than 1/(k + 1) of the nodes to decide
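The threshold on this slide can be checked with a short Python sketch. It assumes the threshold is strict (a partition decides only if it holds strictly more than 1/(k + 1) of all nodes), which reduces to the usual majority rule for k = 1. The function names are illustrative, not taken from any protocol in the slides.

```python
def can_decide(partition_size, total_nodes, k):
    """True if the partition holds strictly more than total_nodes / (k + 1) nodes."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Integer arithmetic avoids floating-point comparison issues.
    return partition_size * (k + 1) > total_nodes


def deciding_partitions(partition_sizes, k):
    """Count the partitions that are allowed to decide under the k-value rule."""
    total = sum(partition_sizes)
    return sum(1 for size in partition_sizes if can_decide(size, total, k))
```

For example, with 10 nodes split 4/4/2 and k = 2, the two partitions of size 4 may each decide and the partition of size 2 may not. No split can let more than k partitions decide, because k + 1 partitions that each hold more than 1/(k + 1) of the nodes would together exceed the total.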
Group Membership Service
Def: Keep track of which nodes belong to which group
Traditionally, group membership maintains only a single group
• Primary-partition membership services
• Corresponds to strong consistency
Recently, partitionable membership services
• Still active area of research
• Corresponds to optimistic consistency
Continuous consistency:
• Allow at most k groups
• Again, helps overcome the ½ majority limitation
Continuous Consistency Summary
WAN replication needs dynamically tunable consistency
Tradeoff between consistency and performance
How to design a continuous consistency model
Continuous consistency in other contexts
Next: Availability
What is Availability?
No well-accepted availability metric for Internet services
“Uptime” metric can be misleading for Internet services
• Server may be inaccessible because of network partition
Available: “present or ready for immediate use”
• From Webster’s Collegiate Dictionary
• What does “immediate” mean?
• Time-out
Availability = (accepted accesses) / (submitted accesses)
• Implicit time-out in the definition
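The metric above, with its implicit time-out made explicit, can be sketched as follows. The function name and the convention that `None` marks an access that never received a reply are assumptions made for this illustration.

```python
def user_observed_availability(latencies, timeout):
    """Fraction of submitted accesses that were answered within the time-out.

    latencies: one entry per submitted access, giving the response time in
    seconds, or None if the access was never answered.
    """
    if not latencies:
        raise ValueError("no accesses submitted")
    accepted = sum(1 for t in latencies if t is not None and t <= timeout)
    return accepted / len(latencies)
```

With latencies `[0.1, 0.5, None, 3.0]` and a 1-second time-out, two of the four submitted accesses count as accepted, so the availability is 0.5. Changing the time-out changes the result, which is why the definition depends on what “immediate” means.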
Perform-ability
User satisfaction is not binary
• What if a partial result is returned before time-out?
• What if the result is sent back after an hour, or a day?
• Availability is related to performance
Performability = reward function (quality and timeliness of result)
Determining the reward function is hard!
Availability of an Internet Service
We use user-observed availability in our study:
Availability = (accepted accesses) / (submitted accesses)
[Figure: a client accessing a server; about 2% of accesses fail on the path between client and server [Chandra et al., USITS’01]; about 0.1% are rejected due to server failure [MS press release, Jan’01]]
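Effects of Replication
Consistency may force a replica to reject an otherwise acceptable request
• Network failure rate decreases, replica rejection rate increases
[Figure: a client accessing two replicas; the network failure rate seen by the client drops below 2%, but a replica rejects requests when the communication needed to maintain consistency has failed, so the rejection rate rises above 0.1%]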
Limitations of Strong ConsistencyLimitations of Strong Consistency
: Replicas
: Clients
Option 1: accept reads accept reads
reject writes reject writes
Option 2: accept reads reject reads
accept writes reject writes
Effects of Continuous Consistency
Option 1: both sides accept reads and reject writes
New Option 1: one side accepts reads and the first 10 writes; the other side accepts reads and the first 5 writes
[Figure annotation: allow replica to buffer 5 writes]
Effects of Continuous Consistency
Option 2: one side accepts reads and writes; the other side rejects reads and writes
New Option 2: one side accepts reads and writes; the other side accepts the first few reads and the first 5 writes
[Figure annotation: allow replica to buffer 5 writes]
Consistency Impact is Inherent
[Figure: Availability versus Inconsistency with a hard bound curve; labeled points: 100% Consistency, and 0% Consistency with 100% Availability]
A hard bound always exists
We always know the two end points, but may not know the exact shape of the curve
Effects of Consistency Protocol
Achieved availability also depends on protocol
• Design better protocols
• Job of system designers
[Figure: Availability versus Inconsistency, showing the Upper Bound together with curves for Protocol A and Protocol B]
Availability Optimizations
Technique should not be tied to model
Focus on two techniques:
• Retiring replicas
• Aggressive write propagation
Limitations of Strong Consistency
[Figure: replicas and clients separated into two sides by a network partition]
Option 1: both sides accept reads and reject writes
Option 2: one side accepts reads and writes; the other side rejects reads and writes
Retiring Replicas
Obviously, such a decision may not be optimal unless we have future knowledge
• Importance of prediction
Even with future knowledge, it is hard
In option 2, all replicas must reach an agreement
• Leader election
• We are experiencing partitions
• One option: Voting
• What if we don’t have majority?
Aggressive Write Propagation
Applicable to continuous consistency
Continuous consistency gives us “buffers” that can be utilized in case of network partition
Keep the buffer empty:
• Cannot predict the occurrence of network partitions
• Propagate writes more aggressively
• Cut down the amount of inconsistency accumulated in times of good connectivity
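A toy simulation can illustrate why keeping the buffer empty helps. The sketch below is not the protocol from the cited study: it assumes one write arrives per tick, a single replica with a fixed bound on buffered writes, and a `connected` list giving connectivity per tick. All names are hypothetical.

```python
def simulate(connected, bound, period=None):
    """Return the fraction of writes accepted, with one write arriving per tick.

    connected[t] tells whether the replica can reach its peers at tick t.
    bound is the maximum number of locally buffered writes.
    period=None propagates lazily (only when the buffer is full);
    otherwise the replica also propagates every `period` ticks when connected.
    """
    buffered = 0
    accepted = 0
    for tick, up in enumerate(connected):
        if up and period is not None and tick % period == 0:
            buffered = 0               # aggressive propagation empties the buffer
        if buffered >= bound:
            if not up:
                continue               # partitioned with a full buffer: reject
            buffered = 0               # lazy propagation, only when necessary
        buffered += 1
        accepted += 1
    return accepted / len(connected)


# Four connected ticks, a five-tick partition, then reconnection.
timeline = [True] * 4 + [False] * 5 + [True]
lazy = simulate(timeline, bound=5)
aggressive = simulate(timeline, bound=5, period=1)
```

In this timeline the lazy replica enters the partition with four writes already buffered and can accept only one more, while the aggressive replica enters with one buffered write and can accept four more. The gain depends entirely on how much inconsistency had accumulated when the partition began.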
Effects of Aggressive Propagation
Baseline: Propagate writes only when necessary (lazily)
Aggressive: When necessary and every 3 seconds
[Figure: Availability, 0.993 to 0.998, versus Inconsistency, with curves for the availability upper bound, Aggressive, and Baseline; 8 replicas with measured faultload. From [Yu’01, SOSP]]
More Aggressive Propagation
Aggressive write propagation does not work in all cases
Availability optimizations can incur more communication
• Best availability achieved when we use a strong consistency protocol
Speaks of availability / performance tradeoffs
Availability of Other Systems
Consensus and leader election
• Blocks without majority
Group membership
• Blocks without majority
Relaxing consistency enables them to make progress
• Open Question: But will these systems still be useful?
Availability Summary
Availability definition
Inherent impact of consistency on availability
Availability also depends on consistency protocols
Availability optimizations:
• Replica retirement
• Aggressive write propagation
Why can we easily approach the upper bound?
Simple protocols in our study can approach the upper bound closely
• Remember reaching the upper bound in general needs future knowledge
Related to the characteristics of the faultloads we measured and simulated
• Most partitions are singleton partitions
• Most transitions are:
fully-connected → singleton partition → fully-connected
These characteristics are consistent with
• Internet hierarchical architecture
Dual Effects of Replication Scale on Availability
Consistency may force a replica to reject a request
Adding more replicas:
• Network failure rate decreases, replica rejection rate increases
Availability = (1 - Network Failure Rate) * (1 - Rejection Rate)
• Too large or too small replication scale can hurt availability
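The formula above can be evaluated directly, and a deliberately crude model shows how an optimum replication scale can arise. Only `availability` follows the slide. The model in `toy_availability` is an assumption made for illustration: client-to-replica failures are independent with probability `p_client`, and a request is rejected whenever any of the n - 1 links to the other replicas has failed, each with probability `p_replica`, roughly what strong consistency would require.

```python
def availability(network_failure_rate, rejection_rate):
    """Availability = (1 - Network Failure Rate) * (1 - Rejection Rate)."""
    return (1 - network_failure_rate) * (1 - rejection_rate)


def toy_availability(n, p_client=0.02, p_replica=0.01):
    """Availability with n replicas under the illustrative model described above."""
    network_failure = p_client ** n              # client cannot reach any replica
    rejection = 1 - (1 - p_replica) ** (n - 1)   # some inter-replica link is down
    return availability(network_failure, rejection)


def best_scale(max_replicas, **rates):
    """Replica count in 1..max_replicas that maximizes toy_availability."""
    return max(range(1, max_replicas + 1),
               key=lambda n: toy_availability(n, **rates))
```

With the default rates, one replica gives 0.98, two replicas give about 0.9896, and three give about 0.9801, so the optimum in this toy model is two replicas. Relaxing consistency lowers the rejection term, which tends to move the optimum toward more replicas.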
Optimal Replication Scale
Optimal replication scale: Adding more replicas can hurt!
• Increase in “replica rejection rate” outweighs decrease in “network failure rate”
Optimal replication scale depends on
• Consistency level
• Network failure rate among replicas
[Figure: Availability Upper Bound, 0.984 to 1, versus Number of Replicas, 1 to 7, for four settings: Failure Rate = 1% or 5%, each with Numerical Error = 250 or 0]