CAP THEOREM Large Scale Data Management
Consistency, Availability, Partition-Tolerance
• Conjecture by Eric Brewer at PODC 2000: – It is impossible for a web service to provide the following three guarantees: • Consistency • Availability • Partition-tolerance
• Established as a theorem in 2002: – Gilbert, Seth, and Nancy Lynch. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, vol. 33, issue 2, 2002, pp. 51-59.
Aleksandar Bradic, Vast.com http://fr.slideshare.net/alekbr/cap-theorem
CAP theorem
• Consistency - all nodes should see the same data at the same time
• Availability - node failures do not prevent survivors from continuing to operate
• Partition-tolerance - the system continues to operate despite arbitrary message loss
• A distributed system can satisfy any two of these guarantees at the same time, but not all three
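The trade-off can be sketched with a toy two-replica store (all class and method names here are illustrative, not from any real system): during a partition, a read must either refuse to answer (preserving consistency) or answer from possibly stale local state (preserving availability).

```python
class Replica:
    def __init__(self):
        self.value = None

class TwoReplicaStore:
    """Toy model: two replicas and a flag simulating a network partition."""
    def __init__(self):
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False

    def write(self, value):
        self.a.value = value        # the write lands on replica a
        if not self.partitioned:
            self.b.value = value    # replication succeeds only without a partition

    def read_cp(self):
        # Consistent but unavailable: refuse to answer during a partition
        # rather than risk returning stale data.
        if self.partitioned:
            raise TimeoutError("cannot confirm consistency during partition")
        return self.b.value

    def read_ap(self):
        # Available but possibly inconsistent: answer from the local
        # replica even if it missed recent writes.
        return self.b.value

store = TwoReplicaStore()
store.write("v1")
store.partitioned = True
store.write("v2")            # replica b never sees v2
print(store.read_ap())       # "v1" -- stale but available
```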
Consistency + Availability
• Examples: – Single-site databases – Cluster databases – LDAP – xFS file system
• Traits: – 2-phase commit – cache validation protocols
Consistency + Partition Tolerance
• Examples: – Distributed databases – Distributed locking – Majority protocols
• Traits: – Pessimistic locking – Make minority partitions unavailable (quorums)
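The quorum trait above can be sketched in a few lines (an illustrative helper, not a real protocol): only the side of a partition holding a strict majority of the replicas keeps serving requests, which is why minority partitions become unavailable.

```python
def can_serve(partition_size, total_replicas):
    """A partition side may serve reads/writes only with a strict majority."""
    return partition_size > total_replicas // 2

# With 5 replicas split 3/2, only the 3-node side stays available;
# since two sides cannot both hold a majority, consistency is preserved.
print(can_serve(3, 5))  # True
print(can_serve(2, 5))  # False
```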
Availability + Partition Tolerance
• Examples: – Coda – DNS – Usenet
• Traits: – Expiration/leases – Conflict resolution – Optimistic replication
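Conflict resolution under optimistic replication can be as simple as last-writer-wins; the sketch below is an illustrative example (names and timestamps are made up), not how any particular system implements it.

```python
# Last-writer-wins (LWW) merge: replicas accept writes independently
# during a partition and reconcile afterwards by keeping the newer write.

def merge_lww(local, remote):
    """Each version is a (timestamp, value) pair; keep the newer write."""
    return max(local, remote)  # tuples compare by timestamp first

a = (10, "cart=[book]")       # write accepted on replica A at t=10
b = (12, "cart=[book,pen]")   # concurrent write on replica B at t=12
print(merge_lww(a, b))        # (12, 'cart=[book,pen]')
```

Note that LWW silently discards the losing write; systems that cannot tolerate this keep both versions and push the conflict to the application.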
Data Store and CAP
• RDBMS: CA (master/slave replication, sharding)
• Amazon Dynamo: AP (read-repair, application hooks)
• Terracotta: CA (quorum vote, majority partition survival)
• Apache Cassandra: AP (partitioning, read-repair)
• Apache ZooKeeper: CP (consensus protocol)
• Google BigTable: CA
• Apache CouchDB: AP
http://blog.nahurst.com/visual-guide-to-nosql-systems
Techniques for CAP
• Consistent hashing • Vector clocks • Sloppy quorums • Merkle trees • Gossip-based protocols • CRDTs • More on these later…
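As a taste of what comes later, here is a minimal vector-clock sketch (function names are illustrative): vector clocks let replicas tell whether two updates are causally ordered or concurrent, which is how systems like Dynamo detect conflicting writes.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter bumped."""
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def happens_before(c1, c2):
    """True if c1 causally precedes c2 (c1 <= c2 everywhere, < somewhere)."""
    keys = set(c1) | set(c2)
    return (all(c1.get(k, 0) <= c2.get(k, 0) for k in keys)
            and any(c1.get(k, 0) < c2.get(k, 0) for k in keys))

def concurrent(c1, c2):
    """Neither clock precedes the other: the updates conflict."""
    return not happens_before(c1, c2) and not happens_before(c2, c1)

v1 = increment({}, "A")        # {"A": 1}
v2 = increment(v1, "B")        # extends v1 -> causally after it
v3 = increment(v1, "C")        # also extends v1 -> concurrent with v2
print(happens_before(v1, v2))  # True
print(concurrent(v2, v3))      # True: a conflict to resolve
```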
Idea of the proof
• http://www.youtube.com/watch?v=Jw1iFr4v58M
Atomic Data Object
• Atomic/linearizable consistency: – There must exist a total order on all operations such that each operation looks as if it were completed at a single instant
– This is equivalent to requiring that requests on the distributed shared memory act as if they were executing on a single node, responding to operations one at a time
Available Data Objects
• For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response – That is, any algorithm used by the service must eventually terminate • (In some ways, this is a weak definition of availability: it puts no bound on how long the algorithm may run before terminating, and therefore allows unbounded computation)
• (On the other hand, when qualified by the need for partition tolerance, this can be seen as a strong definition of availability: even when severe network failures occur, every request must terminate)
Partition Tolerance
• In order to model partition tolerance, the network is allowed to lose arbitrarily many messages sent from one node to another
• When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost
• The atomicity requirement implies that every response will be atomic, even though arbitrary messages sent as part of the algorithm might not be delivered
• The availability requirement therefore implies that every node receiving a request from a client must respond, even though arbitrary messages that are sent may be lost
• Partition tolerance: no set of failures less than total network failure is allowed to cause the system to respond incorrectly
Asynchronous Network Model
• There is no clock • Nodes must make decisions based only on messages received and local computation
Asynchronous Networks: impossibility result
• Theorem 1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: – Availability – Atomic consistency in all fair executions (including those in which messages are lost)
Asynchronous Networks: impossibility result
• Proof (by contradiction): – Assume an algorithm A exists that meets the three criteria: • atomicity, availability and partition tolerance
– We construct an execution of A in which there exists a request that returns an inconsistent response
– Assume that the network consists of at least two nodes. Thus it can be divided into two disjoint, non-empty sets G1, G2
– Assume all messages between G1 and G2 are lost. – If a write occurs in G1 and a read occurs in G2, then the read operation cannot return the result of the earlier write operation.
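The proof idea can be simulated directly in a few lines (a toy model, with G1 and G2 represented as plain dictionaries): if no messages cross the partition yet both sides must answer, the read in G2 cannot reflect the write in G1.

```python
# Toy model of the two partition components; each holds its own copy
# of the object, initially v0. All cross-partition messages are lost,
# so no replication between g1 and g2 ever happens.
g1 = {"x": "v0"}
g2 = {"x": "v0"}

g1["x"] = "v1"        # the write completes in G1 (availability)
answer = g2["x"]      # the read must still answer in G2 (availability)
print(answer)         # "v0" -- the read misses the earlier write: atomicity violated
```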
Asynchronous Networks: impossibility result
• Formal proof: – Let v0 be the initial value of the atomic object – Let α1 be the prefix of an execution of A in which a single write of a value not equal to v0 occurs in G1, ending with the termination of the write operation
– Assume that no other client requests occur in either G1 or G2. Assume that no messages from G1 are received in G2, and no messages from G2 are received in G1
– We know that the write operation will complete (by the availability requirement)
Asynchronous Networks: impossibility result
• Let α2 be the prefix of an execution in which a single read occurs in G2 and no other client requests occur, ending with the termination of the read operation
• During α2, no messages from G2 are received in G1, and no messages from G1 are received in G2
• We know that the read must return a value (by the availability requirement)
• The value returned by this execution must be v0, as no write operation has occurred in α2
Asynchronous Networks: impossibility result
• Let α be an execution beginning with α1 and continuing with α2. To the nodes in G2, α is indistinguishable from α2, as all the messages from G1 to G2 are lost (in both α1 and α2, which together make up α), and α1 does not include any client requests to nodes in G2.
• Therefore, in the execution α, the read request (from α2) must still return v0.
• However, the read request does not begin until after the write request (from α1) has completed
• This contradicts the atomicity property, proving that no such algorithm exists
Asynchronous Networks: Impossibility Result
Impossibility results
• Corollary 1.1: It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: – Availability, in all fair executions – Atomic consistency, in fair executions in which no messages are lost
Impossibility results
• Proof: – The main idea is that in the asynchronous model, an algorithm has no way of determining whether a message has been lost or has been arbitrarily delayed in the transmission channel
– Therefore, if there existed an algorithm that guaranteed atomic consistency in executions in which no messages were lost, there would exist an algorithm that guaranteed atomic consistency in all executions.
– This would violate Theorem 1
CAP theorem
• While it is impossible to provide all three properties: atomicity, availability and partition tolerance, any two of these properties can be achieved: – Atomic, Partition Tolerant – Atomic, Available – Available, Partition Tolerant
Atomic, Partition-Tolerant
• If availability is not required, it is easy to achieve atomic data and partition tolerance
• The trivial system that ignores all requests meets these requirements
• Stronger liveness criterion: if all the messages in an execution are delivered, the system is available and all operations terminate
• A simple centralized algorithm meets these requirements: a single designated node maintains the value of an object
• A node receiving a request forwards it to the designated node, which sends a response. When the acknowledgement is received, the node sends a response to the client
• Many distributed databases provide this guarantee, especially algorithms based on distributed locking or quorums: if certain failure patterns occur, the liveness condition is weakened and the service no longer returns responses. If there are no failures, liveness is guaranteed.
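The centralized algorithm described above can be sketched as follows (class names are illustrative): a single coordinator serializes every operation, so responses are atomic, but a lost forwarding message means the client never gets a response, which is exactly the weakened liveness the slide mentions.

```python
class Coordinator:
    """The single designated node holding the authoritative value."""
    def __init__(self, initial):
        self.value = initial

    def handle(self, op, value=None):
        # Operations are applied one at a time, giving a total order.
        if op == "write":
            self.value = value
        return self.value    # acknowledgement / read response

class Node:
    """Any other node: it only forwards requests to the coordinator."""
    def __init__(self, coordinator):
        self.coordinator = coordinator

    def request(self, op, value=None, network_up=True):
        if not network_up:
            # Forwarding message lost: no response ever arrives, so the
            # request never terminates (availability is sacrificed).
            raise TimeoutError("no response: forwarding message lost")
        return self.coordinator.handle(op, value)

c = Coordinator("v0")
n1, n2 = Node(c), Node(c)
n1.request("write", "v1")
print(n2.request("read"))   # "v1" -- atomic: everyone sees the coordinator's value
```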
Atomic, Available
• If there are no partitions, it is possible to provide atomic, available data
• A centralized algorithm with a single designated node maintaining the value of an object meets these requirements
Available, Partition-Tolerant
• It is possible to provide high availability and partition tolerance if atomic consistency is not required
• If there are no consistency requirements, the service can trivially return v0, the initial value, in response to every request
• It is possible to provide weakened consistency in an available, partition-tolerant setting
• Web caches are one example of a weakly consistent network service
Partially Synchronous Model
• The Gilbert-Lynch paper also treats CAP in the partially synchronous model: every node has a clock, and all clocks increase at the same rate
• However, clocks are not synchronized
• While Theorem 1 still holds in the partially synchronous model, the analogue of Corollary 1.1 does not hold
• Weaker consistency (t-connected consistency) can be achieved.
CAP Conclusion
• It is possible to build large-scale distributed data management systems under the CAP theorem: – One property must be sacrificed.
Sacrificing one Property
• If Consistency is sacrificed (AP): – Push consistency problems to applications; they can be more difficult to solve, or not… high programming cost
– Deployment on asynchronous infrastructure…
• If Availability is sacrificed (CP): – Blocking protocols can really block the system – Cheap programming cost on asynchronous infrastructure
• If P is sacrificed (CA): – Need to provide a quasi-synchronous model, where complex failures never happen
– Cheap programming cost with synchronous infrastructure… Stonebraker, CACM 2010
Challenges
• Whatever choice is made (CA/AP/CP), the scalability and throughput that can be achieved with the different approaches will make the difference
• The balance between programming cost and scalability/efficiency will be the key.
• Nice challenges for scientists and engineers…
Clash of cultures
• Classic distributed systems: focused on ACID semantics – A: Atomic – C: Consistent – I: Isolated – D: Durable
• "Modern" Internet systems: focused on BASE – Basically Available – Soft-state (or scalable) – Eventually consistent
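The "eventually consistent" part of BASE can be illustrated with a toy anti-entropy round (a sketch, assuming a newest-timestamp-wins merge rule): replicas accept writes locally and converge once they exchange state.

```python
# Three replicas, each holding key "x" as a (timestamp, value) pair.
replicas = [{"x": (0, "v0")} for _ in range(3)]

replicas[0]["x"] = (1, "v1")   # a local write lands on replica 0 only

def anti_entropy_round(replicas):
    """Every pair exchanges state and keeps the newest version of each key."""
    for a in replicas:
        for b in replicas:
            for k in a:
                newest = max(a[k], b[k])   # tuples compare by timestamp first
                a[k] = b[k] = newest

# Before the round, replicas disagree; after it, all have converged.
anti_entropy_round(replicas)
print(all(r["x"] == (1, "v1") for r in replicas))  # True
```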
NoSQL (CouchDB…) vs NewSQL (VoltDB…) Dan Pritchett, BASE: an ACID Alternative, ACM Queue http://queue.acm.org/detail.cfm?id=1394128
http://blogs.the451group.com/information_management/2011/04/15/nosql-newsql-and-beyond/