Dynamo: Amazon’s Highly Available Key-value Store & Amazon DynamoDB Presented by: Zuhair Khayyat
Dynamo & DynamoDB
What is Dynamo
● Dynamo is an eventually-consistent key-value storage system used in Amazon's web services to support scalable highly available data access.
● Dynamo is used mainly to manage the state of services, such as S3 and e-commerce.
● Optimized for availability (an "always on" experience) to maximize customer satisfaction, at the expense of:
– Data consistency
– Durability
– Performance
Dynamo: Why not relational database
● Many services on Amazon’s platform that require high reliability need only primary-key access to a data store.
● Relational databases are highly optimized for complex query processing; however, they have limited scalability and choose consistency over availability.
● The complicated features of relational databases require expensive hardware and highly skilled administrators.
Dynamo: Amazon's Requirements
● Simple reads and writes to binary objects no larger than 1 MB; no operation spans multiple data items.
● Very fast data access: response time under 300 ms.
● Heterogeneous commodity hardware infrastructure.
● Used by decentralized, loosely coupled services.
● Highly available (always on); expect small frequent network and server failures.
Dynamo: Consistency and Replication
● Strong data consistency and high data availability cannot be achieved simultaneously.
● “Dynamo is designed to be an eventually consistent data store; that is all updates reach all replicas eventually.”
● An “always writable” data store: write operations are not rejected even if the data is inconsistent.
– Imagine you are ordering from Amazon.com and the website rejects adding an item to your cart!
● Conflict resolution: the application is responsible for resolving data conflicts.
Dynamo VS Bigtable
Aspect | Dynamo | Bigtable
Cluster setup | Decentralized | Centralized (GFS)
Data access | (primary key, version*) | (row key, column key, timestamp)
Data partitioning and load balancing | Customized consistent hashing | 64K partitions stored on the least-utilized machines (GFS)
Data query | Zero-hop DHT | Ask the master (GFS)
Read operation | Multiple copies read | Single copy read
Typical value size | Less than 1 MB | Not specified (GFS)
Write operations on inconsistent data | Accept all write operations and resolve conflicts | Make data unavailable until consistent (GFS)
Dynamo: Interface
● Key-value storage system with operators:
– get(key): returns a single object, or a list of objects with conflicting versions.
– put(key, context, object): places the object and writes its replicas to disk. The context contains information about the object, such as its version.
● MD5 hashing is applied to the key to generate a 128-bit identifier.
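The key-to-identifier step can be sketched in a few lines. This is only an illustration of hashing a key with MD5 into a 128-bit number; the function name `key_to_id` and the sample key are assumptions, not part of Dynamo's interface.

```python
import hashlib


def key_to_id(key: bytes) -> int:
    """Map a key to a 128-bit integer identifier using MD5 (hypothetical helper)."""
    digest = hashlib.md5(key).digest()  # MD5 always yields 16 bytes = 128 bits
    return int.from_bytes(digest, "big")


# Example with a made-up key: the identifier always fits in 128 bits.
ident = key_to_id(b"cart:12345")
print(ident < 2**128)
```

The identifier is what gets placed on the ring; the key itself is opaque to the store.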
Dynamo: Partitioning
● Dynamo is designed to scale incrementally one machine at a time.
● Consistent hashing treats the fixed output space of the hash function as a ring.
● A variant of consistent hashing (virtual nodes) is used by Dynamo to dynamically repartition and load balance the data over the storage hosts.
● Each storage host takes on an amount of data that depends on its capacity.
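A minimal sketch of a consistent-hashing ring with virtual nodes follows. It is not Dynamo's implementation: the `Ring` class, the `host#i` labelling of virtual nodes, and the convention that a key belongs to the first virtual node at or after its position are all assumptions made for illustration. Capacity is modelled by giving a larger host more virtual nodes.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Place a label (key or virtual node name) on the 128-bit ring via MD5."""
    return int.from_bytes(hashlib.md5(label.encode()).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring; assumes at least one host is added."""

    def __init__(self):
        self._positions = []  # sorted ring positions of all virtual nodes
        self._owner = {}      # ring position -> physical host name

    def add_host(self, host: str, vnodes: int) -> None:
        # A host with more capacity gets more virtual nodes, hence more keys.
        for i in range(vnodes):
            pos = ring_position(f"{host}#{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = host

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key,
        # wrapping around to the start of the ring if necessary.
        pos = ring_position(key)
        idx = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._owner[self._positions[idx]]


ring = Ring()
ring.add_host("A", vnodes=4)
ring.add_host("B", vnodes=8)  # assumed to have twice the capacity of A
print(ring.lookup("cart:12345"))
```

When a host is added, only the keys that fall into its new virtual nodes' ranges move; every other key keeps its owner.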
Dynamo: Consistent Hashing
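● Before: the ring holds eight nodes, each owning a key range:
– A[1,10], D[11,20], E[21,30], B[31,40], F[41,50], C[51,60], G[61,70], H[71,80]
● Adding a node (storage host) I with range [47,54]:
– F shrinks to [41,46] and C shrinks to [55,60]; all other ranges are unchanged.
– A[1,10], D[11,20], E[21,30], B[31,40], F[41,46], I[47,54], C[55,60], G[61,70], H[71,80]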
Dynamo: Variant of Consistent Hashing
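● Before: four storage hosts (A, B, C, D), each holding two virtual nodes (X and X*):
– A[1,10], D[11,20], C*[21,30], B[31,40], A*[41,50], C[51,60], B*[61,70], D*[71,80]
● Adding a node (storage host) E with virtual nodes E[17,24] and E*[47,54]:
– D shrinks to [11,16], C* to [25,30], A* to [41,46], and C to [55,60].
– A[1,10], D[11,16], E[17,24], C*[25,30], B[31,40], A*[41,46], E*[47,54], C[55,60], B*[61,70], D*[71,80]
– The new host takes a share of data from three existing hosts (A, C, and D).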
Dynamo: Replication
● Each key (k) is assigned to a coordinator node (i).
● Each value (v) is replicated to the (N-1) clockwise successor logical nodes in the ring.
● Node (i) is responsible for updating the other (N-1) replicas of the keys it owns.
● Each key (k) has a preference list of physical nodes that are responsible for maintaining and serving the key's data.
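One way to picture the preference list is to walk the ring clockwise from the coordinator and collect distinct physical hosts. The sketch below does that on a small ring whose positions are the upper bounds of the ranges in the virtual-node example; the function name `preference_list`, the `RING` layout, and the rule of skipping a host that is already on the list are assumptions for illustration.

```python
import bisect

# (ring position, physical host); virtual nodes A* etc. map back to their host.
RING = [(10, "A"), (20, "D"), (30, "C"), (40, "B"),
        (50, "A"), (60, "C"), (70, "B"), (80, "D")]


def preference_list(ring, key_pos, n):
    """Return the first n distinct physical hosts clockwise from key_pos."""
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_left(positions, key_pos) % len(ring)  # coordinator
    hosts = []
    for step in range(len(ring)):
        host = ring[(start + step) % len(ring)][1]
        if host not in hosts:  # skip virtual nodes of hosts already chosen
            hosts.append(host)
        if len(hosts) == n:
            break
    return hosts


print(preference_list(RING, key_pos=35, n=3))  # coordinator B, then A and C
```

Skipping repeated hosts keeps the N replicas on N different machines even though the ring itself is made of logical nodes.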
Dynamo: Data Versioning
● An eventual consistency protocol is used to update all data replicas asynchronously.
● put() returns before all replicas are updated.
● get() can return multiple versions for the same key.
● Dynamo tracks each data mutation as a new version to support the “write always” protocol.
● Dynamo uses vector clocks for versioning.
Dynamo: vector clocks example 1
● Three nodes: A, B, and C. Each update increments the counter of the node that handles it.
– Initial write at A: Value=100, clock (A:1)
– +1 at B: Value=101, clock (A:1, B:1)
– +4 at C: Value=105, clock (A:1, B:1, C:1)
– +100 at B: Value=205, clock (A:1, B:2, C:1)
– +110 at C: Value=315, clock (A:1, B:2, C:2)
● Every version descends from the previous one, so there is no conflict.
Dynamo: vector clocks example 2
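● Three nodes: A, B, and C.
– Initial write at A: Value=100, clock (A:1)
– +1 at B: Value=101, clock (A:1, B:1)
● Two updates are then applied concurrently to Value=101:
– +4 at C: Value=105, clock (A:1, B:1, C:1)
– +100 at B: Value=201, clock (A:1, B:2)
● Each branch then receives +110:
– Value=201 becomes Value=311, clock (A:1, B:2, C:1)
– Value=105 becomes Value=215, clock (A:1, B:1, C:2)
● Conflict! Neither clock dominates the other: B:2 > B:1, but C:1 < C:2.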
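The comparison behind the two examples can be sketched as follows. Clocks are plain dictionaries from node name to counter; the function names `descends` and `compare` are assumptions chosen for this illustration.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a is at least as new as clock b on every node in b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify two vector clocks as equal, ordered, or conflicting."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "conflict"


# Example 1: the last version descends from the one before it.
print(compare({"A": 1, "B": 2, "C": 2}, {"A": 1, "B": 2, "C": 1}))
# Example 2: the two final versions are concurrent.
print(compare({"A": 1, "B": 2, "C": 1}, {"A": 1, "B": 1, "C": 2}))
```

A missing entry is treated as a counter of zero, which is why (A:1, B:2) and (A:1, B:1, C:1) also compare as a conflict.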
Dynamo: resolving conflicts
● Syntactic reconciliation:
– The conflict is resolved automatically when one version's vector clock subsumes the other.
● Semantic reconciliation:
– The application merges the conflicting versions; the user may have to revise the merged values.
– Example: Amazon's shopping cart:
● “Add to cart” items are preserved.
● Deleted items can resurface.
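A union-style merge is one simple way to get the shopping cart behavior described above. The sketch below is an assumption about what such a merge could look like, not Amazon's code; `merge_carts` and the item names are made up.

```python
def merge_carts(versions):
    """Merge conflicting cart versions by taking the union of their items."""
    merged = set()
    for cart in versions:
        merged |= cart
    return merged


# Both versions start from {"book", "bike"}. One branch deletes "bike" and
# adds "lamp"; the other branch is untouched.
branch_1 = {"book", "lamp"}
branch_2 = {"book", "bike"}
print(sorted(merge_carts([branch_1, branch_2])))  # "bike" is back in the cart
```

No added item is ever lost, but the union cannot tell a deletion from an item that was never added, which is why deleted items can resurface.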
Dynamo: Processing put() & get()
● A client can issue requests in either of two ways:
– Through a generic load balancer that routes the request to the least-utilized node.
– Through a partition-aware library that sends the request directly to one of the data owners.
● The system requires two configurable values:
– R: the number of available healthy nodes required for a successful read.
– W: the number of available healthy nodes required for a successful write.
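The role of R and W can be sketched with two small checks. The helper names are assumptions; the condition R + W > N comes from the Dynamo paper rather than from these slides, and it only guarantees overlap between read and write sets in a strict quorum.

```python
def operation_succeeds(healthy_replies: int, required: int) -> bool:
    """A read (or write) succeeds once R (or W) healthy nodes have replied."""
    return healthy_replies >= required


def read_and_write_sets_overlap(n: int, r: int, w: int) -> bool:
    """With R + W > N, every read set shares at least one node with every write set."""
    return r + w > n


N, R, W = 3, 2, 2  # an illustrative configuration
print(read_and_write_sets_overlap(N, R, W))                # overlapping quorums
print(operation_succeeds(healthy_replies=1, required=W))   # one ack is not enough
```

Lowering R or W reduces latency and raises availability, at the cost of a higher chance of reading stale data.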
Dynamo: Hinted Handoff
● Assuming N=3, a failed put() operation on node A is temporarily handled by B.
● After A recovers, B sends the result of put() operation back to A.
● Advantage: a temporary failure has minimal effect on the application.
[Figure: ring diagram with nodes A, B, C, D, A', C', D', A'']
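Hinted handoff can be sketched with two in-memory nodes. Everything here (the `Node` class, `put`, `deliver_hints`, and the single fallback node) is a simplified assumption meant to show the idea of holding a write on behalf of a failed node and returning it later.

```python
class Node:
    """Hypothetical storage node with its own data and hints held for others."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}  # key -> value owned by this node
        self.hints = []  # (intended node, key, value) held for failed nodes


def put(key, value, target, fallback):
    """Write to the target; if it is down, park the write on the fallback."""
    if target.up:
        target.store[key] = value
    else:
        fallback.hints.append((target, key, value))


def deliver_hints(holder):
    """Hand parked writes back to their intended nodes once they recover."""
    remaining = []
    for intended, key, value in holder.hints:
        if intended.up:
            intended.store[key] = value
        else:
            remaining.append((intended, key, value))
    holder.hints = remaining


a, b = Node("A"), Node("B")
a.up = False                 # A fails
put("cart:1", "book", a, b)  # B temporarily handles the write
a.up = True                  # A recovers
deliver_hints(b)             # B sends the write back to A
print(a.store)
```

The client's put() is accepted even while A is down, which is what keeps the failure invisible to the application.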
Dynamo: Scalability
● Adding or removing a node requires a third-party tool or direct user interaction.
● A gossip-based protocol is used to propagate membership throughout the cluster and to detect failures.
● Replica synchronization is done using a Merkle hash tree.
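The Merkle tree comparison can be sketched as follows: two replicas hash their key ranges into a tree, compare roots, and descend only into subtrees whose hashes differ. The use of SHA-256, the function names, and the assumption that both replicas hold the same non-zero number of leaves are choices made for this illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return the tree as a list of levels: leaf hashes first, root last."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pair an odd last node with itself
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def _diff(la, lb, level, idx):
    if la[level][idx] == lb[level][idx]:
        return []      # identical subtree: nothing to transfer
    if level == 0:
        return [idx]   # a leaf that is out of sync
    out = []
    for child in (2 * idx, 2 * idx + 1):
        if child < len(la[level - 1]):
            out += _diff(la, lb, level - 1, child)
    return out


def differing_leaves(a, b):
    """Indexes of leaves that differ; assumes len(a) == len(b) > 0."""
    la, lb = build_levels(a), build_levels(b)
    return _diff(la, lb, len(la) - 1, 0)


replica_1 = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4", b"k5=v5"]
replica_2 = [b"k1=v1", b"k2=v2", b"k3=stale", b"k4=v4", b"k5=v5"]
print(differing_leaves(replica_1, replica_2))
```

If the roots match, the replicas are in sync and nothing else is exchanged; otherwise only the differing leaves need to be transferred.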
Dynamo: Peak Performance
● Shopping cart service during a holiday season:
– 10 million requests
– 3 million checkouts
– 100,000+ concurrent sessions
– No downtime!
DynamoDB
What is DynamoDB
● A NoSQL database service available publicly through Amazon Web Services (AWS); released in 2012.
● Based on Dynamo, a scalable, highly available (key, value) storage system used by Amazon's servers; published at SOSP 2007.
DynamoDB: Data Model
● The database is a collection of tables.
● A table is a collection of items.
● An item is a collection of attributes.
● Primary key is required.
● No nulls or empty strings.
● No schema is required; items can vary in the number of attributes. How is this possible?
DynamoDB: Example
● Table name: ProductCatalog
{
  Id = 101
  ProductName = "Book 101 Title"
  ISBN = "111-1111111111"
  Authors = [ "Author 1", "Author 2" ]
  Price = -2
  Dimensions = "8.5 x 11.0 x 0.5"
  PageCount = 500
  InPublication = 1
  ProductCategory = "Book"
}
{
  Id = 201
  ProductName = "18-Bicycle 201"
  Description = "201 description"
  BicycleType = "Road"
  Brand = "Brand-Company A"
  Price = 100
  Gender = "M"
  Color = [ "Red", "Black" ]
  ProductCategory = "Bike"
}
{
  Id = 202
  ProductName = "21-Bicycle 202"
  Description = "202 description"
  BicycleType = "Road"
  Brand = "Brand-Company A"
  Price = 200
  Gender = "M"
  Color = [ "Green", "Black" ]
  ProductCategory = "Bike"
}
DynamoDB: Example
● Storage in Dynamo:
– <Table_List, {ProductCatalog,....}>
– <ProductCatalog, {101,102,201,202}>
– <101, {ProductName={},ISBN={},Authors={}...}>
– or –
– <Table_List, {ProductCatalog,....}>
– <ProductCatalog, {101,102,201,202}>
– <101, {ProductName,ISBN,Authors...}>
– <101_Authors,{Author 1,Author 2}>
DynamoDB: Table Primary Keys
● A table in DynamoDB must have a primary key.
● A primary key can be either “hash only” or “hash and range”.
● DynamoDB uses an unsorted hash index, while the range index is sorted.
● A hash-only primary key is based on a single attribute.
● A hash-and-range primary key is based on two attributes.
● Data types:
– Scalar data types: Number, String, and Binary.
– Multi-valued types: String Set, Number Set, and Binary Set.
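The difference between the unsorted hash index and the sorted range index can be modelled in memory. This is a toy model, not DynamoDB's implementation or API: the `HashRangeTable` class, its methods, and the thread/timestamp sample data are assumptions.

```python
import bisect
from collections import defaultdict


class HashRangeTable:
    """Toy table: a dict (unsorted) on the hash key, sorted rows on the range key."""

    def __init__(self):
        self._partitions = defaultdict(list)  # hash key -> sorted [(range key, item)]

    def put_item(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        keys = [rk for rk, _ in rows]
        i = bisect.bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)       # same primary key: overwrite
        else:
            rows.insert(i, (range_key, item)) # keep rows sorted by range key

    def query(self, hash_key, low, high):
        """Items under one hash key whose range key lies in [low, high]."""
        rows = self._partitions.get(hash_key, [])
        keys = [rk for rk, _ in rows]
        lo = bisect.bisect_left(keys, low)
        hi = bisect.bisect_right(keys, high)
        return [item for _, item in rows[lo:hi]]


table = HashRangeTable()
table.put_item("thread-1", 30, "third reply")
table.put_item("thread-1", 10, "first reply")
table.put_item("thread-1", 20, "second reply")
print(table.query("thread-1", 10, 20))
```

The hash key only supports exact lookups, while the sorted range key makes range queries within one hash key cheap.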
DynamoDB: Read operation
● Availability and durability are maintained through data replication.
● Updating all the replicas after a data mutation takes some time; DynamoDB eventually synchronizes all the replicas.
● DynamoDB supports two read operations:
– Eventually consistent read
● Does not necessarily reflect the last data mutation.
● Very fast data access; not affected by failures.
– Consistent read
● Always reflects the last data mutation.
● Waits for the data to be consistent in all replicas; affected by network and storage failures.
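The two read modes can be illustrated with a small simulation of lagging replicas. It is a model of the behavior described above, not DynamoDB's mechanism: the `ReplicatedItem` class, the version counters, and the explicit `propagate` step are assumptions.

```python
class ReplicatedItem:
    """Toy item stored on several replicas as (version, value) pairs."""

    def __init__(self, replicas=3):
        self.replicas = [(0, None)] * replicas

    def write(self, value):
        version = max(v for v, _ in self.replicas) + 1
        self.replicas[0] = (version, value)  # the other replicas lag behind

    def propagate(self):
        """Background synchronization: copy the newest version everywhere."""
        newest = max(self.replicas, key=lambda r: r[0])
        self.replicas = [newest] * len(self.replicas)

    def read(self, replica, consistent=False):
        if consistent:
            # Consult every replica and return the newest version.
            return max(self.replicas, key=lambda r: r[0])[1]
        # Eventually consistent: return whatever this one replica holds.
        return self.replicas[replica][1]


item = ReplicatedItem()
item.write("v1")
item.propagate()
item.write("v2")                        # not yet propagated
print(item.read(2))                     # may be stale
print(item.read(2, consistent=True))    # reflects the last write
```

The consistent read has to hear from every replica, which is why it is slower and more exposed to network and storage failures.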
DynamoDB: Similar services
● Datastore on Google App Engine
● Cloudant Data Layer (CouchDB)
DynamoDB: try it today