Logic and Lattices for Distributed Programming
Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein (UC Berkeley)
David Maier (Portland State University)
Feb 25, 2016
Distributed Programming: Key Challenges
Asynchrony
Partial Failure
Dealing with Disorder
Enforce global order
– Paxos, Two-Phase Commit, GCS, …
– “Strong Consistency”
Tolerate disorder
– Programmer must ensure correct behavior for many possible network orders
– “Eventual Consistency”
• Typical goal: replicas converge to same final state
Goal: Make it easier to write programs on top of eventual consistency
This Talk
1. Prior Work
– Convergent Modules (CRDTs)
– Monotonic Logic (CALM)
2. BloomL
3. Case Study
[Diagram: Client0 and Client1 each read Students = {Alice, Bob}. One client then writes {Alice, Bob, Carol} and the other writes {Alice, Bob, Dave}, so the two replicas hold different values of Students.]
How to resolve?
Problem: Replicas perceive different event orders
Goal: Same final state at all replicas
Solution: Use commutative operations (“merge functions”)
[Diagram: after merging, the replicas seen by Client0 and Client1 both hold Students = {Alice, Bob, Carol, Dave}.]
Merge = Set Union
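A minimal sketch in plain Ruby (not Bloom) of why set union works as a merge function: two replicas receive the same two writes in opposite orders and still end in the same state. The `Replica` class and its `merge` method are illustrative names, not part of any library.

```ruby
require 'set'

# Hypothetical replica holding the Students set; merge is set union.
class Replica
  attr_reader :students

  def initialize(initial)
    @students = Set.new(initial)
  end

  # Merge an incoming write into local state. Union is commutative,
  # associative, and idempotent, so delivery order does not matter.
  def merge(write)
    @students = @students | Set.new(write)
    self
  end
end

# The two concurrent writes from the example.
WRITE_CAROL = %w[Alice Bob Carol].freeze
WRITE_DAVE  = %w[Alice Bob Dave].freeze

replica_a = Replica.new(%w[Alice Bob]).merge(WRITE_CAROL).merge(WRITE_DAVE)
replica_b = Replica.new(%w[Alice Bob]).merge(WRITE_DAVE).merge(WRITE_CAROL)

puts replica_a.students.to_a.sort.inspect
puts replica_a.students == replica_b.students
```

Applying the same write twice also leaves the state unchanged, which is what makes redelivery harmless.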
Commutative Operations
• Common design pattern
• Formalized as CRDTs: Convergent and Commutative Replicated Data Types
– Shapiro et al., INRIA (2009-2012)
– Based on join semilattices
Lattices
$\langle S, \sqcup, \bot \rangle$ is a bounded join semilattice iff:
– $S$ is a set
– $\sqcup$ is a binary operator (“least upper bound”)
• Associative, commutative, and idempotent
• Induces a partial order on $S$: $x \le_S y$ if $x \sqcup y = y$
• Informally, “merge function” for elements of $S$
– $\bot$ is the “least” element in $S$
• $\forall x \in S: \bot \sqcup x = x$
[Diagram: three example lattices, each growing over time]
• Set (LUB = Union)
• Increasing Integer (LUB = Max)
• Boolean (LUB = Or)
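The three example lattices can be written down as a bottom element plus a least upper bound, and the semilattice laws checked on a few sample values. This is a sketch in plain Ruby; the names `LATTICES`, `semilattice_laws_hold?`, and `leq?` are illustrative, and checking samples is a sanity test rather than a proof.

```ruby
require 'set'

# Each lattice is described by its bottom element and its least upper bound.
# The sample values are arbitrary and only used for checking the laws.
LATTICES = {
  set:  { bottom: Set.new,          lub: ->(a, b) { a | b },      samples: [Set[1], Set[1, 2], Set[3]] },
  max:  { bottom: -Float::INFINITY, lub: ->(a, b) { [a, b].max }, samples: [0, 3, 7] },
  bool: { bottom: false,            lub: ->(a, b) { a || b },     samples: [false, true] }
}.freeze

# Check the bounded join semilattice laws on every triple of sample values.
def semilattice_laws_hold?(lattice)
  lub = lattice[:lub]
  bottom = lattice[:bottom]
  xs = lattice[:samples]
  xs.product(xs, xs).all? do |a, b, c|
    lub.(a, b) == lub.(b, a) &&                      # commutative
      lub.(lub.(a, b), c) == lub.(a, lub.(b, c)) &&  # associative
      lub.(a, a) == a &&                             # idempotent
      lub.(bottom, a) == a                           # bottom is the least element
  end
end

# x <= y in the induced partial order iff lub(x, y) == y.
def leq?(lattice, x, y)
  lattice[:lub].(x, y) == y
end

LATTICES.each do |name, lattice|
  puts "#{name}: laws hold on samples = #{semilattice_laws_hold?(lattice)}"
end
```

The induced order is the familiar one in each case: subset for sets, numeric order for increasing integers, and false before true for booleans.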
[Diagram: composing two CRDTs, first execution. Both replicas start with Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}. One client reads Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}, then writes Teams = {<Alice, Bob>, <Carol, Dave>}. The other client issues Remove: {Dave}, leaving Students = {Alice, Bob, Carol}. After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>, <Carol, Dave>}.]
[Diagram: composing two CRDTs, second execution. Both replicas again start with Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}. This time Remove: {Dave} takes effect before the read, so the client reads Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}, and the diagram shows no write to Teams. After replica synchronization, both replicas hold Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}.]
Nondeterministic Outcome!
Problem: Composition of CRDTs can result in non-determinism
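A small simulation, in plain Ruby, of the Students/Teams example: each collection converges on its own, yet the final Teams value depends on whether the removal or the read happens first. The helper names `form_teams` and `run_schedule`, and the rule "pair up unassigned students", are assumptions made for illustration.

```ruby
require 'set'

# Client logic assumed for the example: read Students and Teams, then pair
# up students who are not on a team yet.
def form_teams(students, teams)
  assigned = teams.to_a.flatten.to_set
  unassigned = students.reject { |s| assigned.include?(s) }.sort
  new_teams = unassigned.each_slice(2).select { |pair| pair.size == 2 }
  teams | new_teams.to_set
end

# Apply the two events in the given order and return the final state.
def run_schedule(schedule)
  students = Set["Alice", "Bob", "Carol", "Dave"]
  teams = Set[["Alice", "Bob"]]
  schedule.each do |event|
    case event
    when :remove_dave then students -= Set["Dave"]
    when :team_up     then teams = form_teams(students, teams)
    end
  end
  { students: students, teams: teams }
end

first  = run_schedule([:team_up, :remove_dave])
second = run_schedule([:remove_dave, :team_up])

puts first[:students] == second[:students]  # Students agrees in both runs
puts first[:teams] == second[:teams]        # Teams does not
```

In the first schedule Teams ends up naming Dave even though Dave is no longer a student; in the second it never does.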
Possible Solution: Encapsulate all distributed state in a single CRDT
– Hard to design, verify, and test
– Doesn’t scale with application size
Goal: Design a language that allows safe composition of CRDTs
Solution: … Datalog?
• Concurrent work: distributed programming using Datalog
– P2 (2006-2010)
– Bloom (2010-2012)
• Monotonic logic: building block for convergent distributed programs
Monotonic Logic
• As input set grows, output set does not shrink
– “Retraction-free”
• Order independent
• e.g., map, filter, join, union, intersection
Non-Monotonic Logic
• New inputs might retract previous outputs
• Order sensitive
• e.g., aggregation, negation
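To make the contrast concrete, here is a sketch in plain Ruby with one monotone query (filter plus map) and one non-monotone query (negation). The query names and the sample data are hypothetical; `retraction_free?` simply checks that no earlier output disappeared after the input grew.

```ruby
require 'set'

# Monotone query (filter + map): names of students with a grade of at least 60.
def passing(grades)
  grades.select { |_name, grade| grade >= 60 }.map { |name, _grade| name }.to_set
end

# Non-monotone query (negation): enrolled students who have NOT dropped.
def active(enrolled, dropped)
  enrolled.reject { |name| dropped.include?(name) }.to_set
end

# True when every output derived from the smaller input is still derived
# from the larger input, i.e. nothing was retracted.
def retraction_free?(before, after)
  before.subset?(after)
end

grades_small = Set[["Alice", 90]]
grades_large = grades_small | Set[["Bob", 40], ["Carol", 75]]
puts retraction_free?(passing(grades_small), passing(grades_large))

enrolled = Set["Alice", "Bob"]
puts retraction_free?(active(enrolled, Set[]), active(enrolled, Set["Bob"]))
```

Growing `grades` can only add names to `passing`, whereas growing `dropped` removes Bob from `active`, so the second query's answer depends on how much input has arrived.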
Monotonicity and Determinism
Agents learn strictly more knowledge over time
Different learning order, same final outcome
Result: Program is deterministic!
Consistency As Logical Monotonicity (CALM)
CALM Analysis
1. All monotone programs are deterministic
2. Simple syntactic test for monotonicity
Result: Whole-program static analysis for eventual consistency
Problem: CALM only applies to programs over growing sets
• Version Numbers
• Timestamps
• Threshold Tests
Quorum Vote
• A coordinator accepts votes from agents
• Count # of votes
– When Count(Votes) > k, send “success” message
Aggregation is non-monotonic!
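A sketch in plain Ruby of why a set-oriented count is non-monotonic: the result relation holds one tuple with the current count, and each new vote replaces that tuple instead of adding to it. The helper names are illustrative. The threshold `Count(Votes) > k`, by contrast, flips from false to true once and stays there.

```ruby
require 'set'

# Set-oriented aggregation in the Datalog style: the result is a relation
# holding a single tuple, the current count.
def count_relation(votes)
  Set[[votes.size]]
end

# The threshold from the quorum example: true once more than k votes arrived.
def quorum?(votes, k)
  votes.size > k
end

votes_early = Set["a"]
votes_later = Set["a", "b"]

# The tuple [1] is gone from the later result: an output was retracted.
puts count_relation(votes_early).subset?(count_relation(votes_later))

# The threshold only ever moves from false to true as votes accumulate.
puts [quorum?(votes_early, 1), quorum?(votes_later, 1)].inspect
```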
CRDTs
– Limited scope (single object)
– Flexible types (any lattice)
CALM
– Whole program analysis
– Limited types (only sets)
BloomL
– Whole program analysis
– Flexible types (any lattice)
BloomL Constructs
Organization: Collection of agents
Communication: Message passing
State: Lattices
Computation: Functions over lattices
Monotone Functions
$f : S \to T$ is a monotone function iff
$\forall a, b \in S : a \le_S b \Rightarrow f(a) \le_T f(b)$
[Diagram: over time, a Set (LUB = Union) is mapped by size() to an Increasing Integer (LUB = Max), which is mapped by >= 5 to a Boolean (LUB = Or).]
size(): monotone function from set to increase-int
>= 5: monotone function from increase-int to boolean
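The definition of a monotone function can be checked directly on sample values. The sketch below, in plain Ruby with illustrative names, tests `size` (set to increasing integer), `>= 5` (increasing integer to boolean), their composition, and boolean negation as a counterexample. Passing on samples is a sanity check, not a proof.

```ruby
require 'set'

# Partial orders of the three lattices in the diagram.
SET_LEQ  = ->(a, b) { a.subset?(b) }  # set under union
MAX_LEQ  = ->(a, b) { a <= b }        # increasing integer under max
BOOL_LEQ = ->(a, b) { !a || b }       # boolean under or (false <= true)

# f is monotone on the samples if a <= b always implies f(a) <= f(b).
def monotone_on?(samples, f, leq_in, leq_out)
  samples.product(samples).all? do |a, b|
    !leq_in.(a, b) || leq_out.(f.(a), f.(b))
  end
end

SIZE        = ->(s) { s.size }                  # set -> increasing integer
AT_LEAST_5  = ->(n) { n >= 5 }                  # increasing integer -> boolean
QUORUM_TEST = ->(s) { AT_LEAST_5.(SIZE.(s)) }   # composition: set -> boolean
NEGATE      = ->(flag) { !flag }                # boolean -> boolean, NOT monotone

SAMPLE_SETS = (0..6).map { |n| Set.new(1..n) }
SAMPLE_INTS = (0..8).to_a

puts monotone_on?(SAMPLE_SETS, SIZE, SET_LEQ, MAX_LEQ)
puts monotone_on?(SAMPLE_INTS, AT_LEAST_5, MAX_LEQ, BOOL_LEQ)
puts monotone_on?(SAMPLE_SETS, QUORUM_TEST, SET_LEQ, BOOL_LEQ)
puts monotone_on?([false, true], NEGATE, BOOL_LEQ, BOOL_LEQ)
```

The composition of two monotone functions is again monotone, which is why chaining the size and threshold steps stays safe.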
Quorum Vote in BloomL

    require 'bud'

    QUORUM_SIZE = 5
    RESULT_ADDR = "example.org"

    # Annotated Ruby class
    class QuorumVote
      include Bud

      # Program state
      state do
        # Communication interfaces
        channel :vote_chn, [:@addr, :voter_id]
        channel :result_chn, [:@addr]
        # Lattice state declarations
        lset  :votes
        lmax  :vote_cnt
        lbool :got_quorum
      end

      # Program logic
      bloom do
        # Accumulate votes into set (<= applies the merge function for the set lattice)
        votes      <= vote_chn {|v| v.voter_id}
        # Monotone function: set -> max
        vote_cnt   <= votes.size
        # Monotone function: max -> bool
        got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
        # Threshold test on bool (monotone)
        result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
      end
    end
Monotonic CALM
BloomL Features
• Generalizes logic programming to lattices
– Integration of relational-style queries and functions over lattices
– Efficient incremental evaluation scheme
• Library of built-in lattices
– Booleans, increasing/decreasing integers, sets, multisets, maps, …
• API for defining custom lattices
Case Studies
Key-Value Store
– Object versioning via vector clocks
– Quorum replication
Replicated Shopping Cart
– Using custom lattice types to encode domain-specific knowledge
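As a rough illustration of object versioning via vector clocks, a vector clock can be viewed as a map from node IDs to increasing counters, merged key by key with max. The sketch below uses plain Ruby hashes and assumes that representation; the function names are hypothetical and this is not the key-value store code from the case study.

```ruby
# Merge two vector clocks pointwise: keep every key, take the max per key.
def merge_clocks(a, b)
  a.merge(b) { |_node, x, y| [x, y].max }
end

# a <= b in the induced partial order iff merging a into b changes nothing.
def clock_leq?(a, b)
  merge_clocks(a, b) == b
end

# Two versions are concurrent when neither clock dominates the other.
def concurrent?(a, b)
  !clock_leq?(a, b) && !clock_leq?(b, a)
end

clock_a = { "n1" => 2, "n2" => 0 }
clock_b = { "n1" => 1, "n2" => 3 }

puts merge_clocks(clock_a, clock_b).inspect
puts concurrent?(clock_a, clock_b)
```

Because the merge is a pointwise max, it inherits commutativity, associativity, and idempotence from max, so replicas can exchange clocks in any order.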
Case Study: Shopping Carts
Perspectives on Shopping
• CRDTs
– Individual server replicas converge
• Bloom
– Checkout is non-monotonic, so it requires distributed coordination
• Built-in BloomL lattice types
– Checkout is not a monotone function of any of the built-in lattices
Observation: Once a checkout occurs, no more shopping actions can be performed
Observation: Each client knows when a checkout can be processed “safely”
Monotone Checkout
OPS = [1]: Incomplete
OPS = [2]: Incomplete
OPS = [3]: Incomplete
OPS = [1,2]: Incomplete
OPS = [2,3]: Incomplete
OPS = [1,2,3]: Complete
Shopping Takeaways
• Checkout summary is a monotone function of client’s activities
• Custom lattice type captures application-specific notion of “forward progress”
– “Unsafe” state hidden behind ADT interface
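One way such a custom lattice could look, sketched in plain Ruby rather than with the BloomL lattice API. The `CartLattice` class, its fields, and the rule that the checkout operation carries the ID of the last operation are assumptions for illustration, loosely following the OPS = [1,2,3] picture under Monotone Checkout.

```ruby
# Hypothetical cart lattice: the state is a map of numbered operations plus,
# once known, the ID of the final (checkout) operation.
class CartLattice
  attr_reader :ops, :last_id

  def initialize(ops = {}, last_id = nil)
    @ops = ops.dup.freeze
    @last_id = last_id
  end

  # Least upper bound: union of the known operations. This sketch assumes a
  # single checkout per cart, so two different checkout IDs are an error.
  def merge(other)
    ids = [@last_id, other.last_id].compact.uniq
    raise ArgumentError, "conflicting checkouts" if ids.size > 1
    CartLattice.new(@ops.merge(other.ops), ids.first)
  end

  # Monotone threshold: true once the checkout and every operation up to it
  # are known. Under the single-checkout assumption it never reverts.
  def complete?
    !@last_id.nil? && (1..@last_id).all? { |id| @ops.key?(id) }
  end

  # Only a complete cart exposes its contents; partial state stays hidden.
  def summary
    raise "cart is incomplete" unless complete?
    (1..@last_id).map { |id| @ops[id] }
  end
end

op1 = CartLattice.new({ 1 => [:add, "book"] })
op2 = CartLattice.new({ 2 => [:add, "pen"] })
op3 = CartLattice.new({ 3 => [:checkout] }, 3)

puts op1.merge(op2).complete?
puts op3.merge(op1).merge(op2).complete?
```

Hiding the operation map behind `complete?` and `summary` is the point of the ADT interface: callers can only observe the cart after it has reached a state that later merges cannot change.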
Recap
1. How to build eventually consistent systems
– Write disorderly programs
2. Disorderly state
– Lattices
3. Disorderly computation
– Monotone functions over lattices
4. BloomL
– Type system for deterministic behavior
– Support for custom lattice types
Thank You!
http://www.bloom-lang.net
Backup Slides
Strong Consistency in Industry
“… there was a single overarching theme within the keynote talks… strong synchronization of the sort provided by a locking service must be avoided like the plague… [the key] challenge is to find ways of transforming services that might seem to need locking into versions that … can operate correctly without locking.”
-- Birman et al., “Toward a Cloud Computing Research Agenda” (LADIS, 2009)
Bloom Operational Model
Quorum Vote in Bloom

    require 'bud'

    QUORUM_SIZE = 5
    RESULT_ADDR = "example.org"

    # Annotated Ruby class
    class QuorumVote
      include Bud

      # Program state
      state do
        # Communication
        channel :vote_chn, [:@addr, :voter_id]
        channel :result_chn, [:@addr]
        # Persistent storage
        table :votes, [:voter_id]
        # Transient storage
        scratch :cnt, [] => [:cnt]
      end

      # Program logic
      bloom do
        # Accumulate votes
        votes <= vote_chn {|v| [v.voter_id]}
        # Count votes: not (set) monotonic!
        cnt <= votes.group(nil, count(:voter_id))
        # Send message when quorum reached (c.cnt reads the count column)
        result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}
      end
    end
Built-in Lattices

| Name | Description | $\bot$ | $a \sqcup b$ | Sample Monotone Functions |
|---|---|---|---|---|
| lbool | Threshold test | false | $a \lor b$ | when_true() → v |
| lmax | Increasing number | $-\infty$ | max(a, b) | gt(n) → lbool; +(n) → lmax; -(n) → lmax |
| lmin | Decreasing number | $\infty$ | min(a, b) | lt(n) → lbool |
| lset | Set of values | $\emptyset$ | $a \cup b$ | intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax |
| lpset | Non-negative set | $\emptyset$ | $a \cup b$ | sum() → lmax |
| lbag | Multiset of values | $\emptyset$ | $a \cup b$ | mult(v) → lmax; +(lbag) → lbag |
| lmap | Map from keys to lattice values | empty map | | at(v) → any-lat; intersect(lmap) → lmap |
Failure Handling
Great question!
1. Monotone programs handle transient faults very well
– Deterministic, so logging is simple
– Commutative and idempotent, so recovery is simple
2. Future work: “controlled non-determinism”
– Timeout code is fundamentally non-deterministic
– But we still want mostly deterministic programs
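The recovery point can be illustrated with a log replay in plain Ruby: when state is rebuilt by merging logged messages with set union, replaying some messages twice or in a different order gives the same state. The `recover` helper and the sample log are hypothetical.

```ruby
require 'set'

# Rebuild replica state by merging every logged message into an empty set.
# Union is commutative and idempotent, so order and duplicates do not matter.
def recover(log)
  log.reduce(Set.new) { |state, msg| state | msg }
end

SAMPLE_LOG = [Set["v1"], Set["v2"], Set["v3"]].freeze

# A replay after a crash: messages arrive reordered, and two are repeated.
replayed = SAMPLE_LOG.shuffle(random: Random.new(42)) + SAMPLE_LOG.first(2)

puts recover(SAMPLE_LOG) == recover(replayed)
```

Nothing in the replay needs to track which messages were already applied, which is what keeps logging and recovery simple for monotone programs.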
Handling Non-Monotonicity
… is not the focus of this talk
Basic alternatives:
1. Nodes agree on an event order using distributed coordination (e.g., Paxos)
2. Allow non-deterministic outcomes
• If needed, compensate and apologize