Dynamo: Amazon’s Highly Available Key-value Store & Amazon DynamoDB Presented by: Zuhair Khayyat
Page 1: Dynamo db

Dynamo: Amazon’s Highly Available Key-value Store

& Amazon DynamoDB

Presented by:

Zuhair Khayyat

Page 2: Dynamo db

Dynamo & DynamoDB

What is Dynamo

● Dynamo is an eventually consistent key-value storage system used in Amazon's web services to support scalable, highly available data access.

● Dynamo is used mainly to manage the state of services, such as S3 and e-commerce.

● Optimized for availability (an “always on” experience) to maximize customer satisfaction, at the expense of:

– Data consistency

– Durability

– Performance

Page 3: Dynamo db

Dynamo: Why not relational database

● Many services on Amazon’s platform that have high reliability requirements need only primary-key access to a data store.

● Relational databases are highly optimized for complex query processing; however, they have limited scalability and choose consistency over availability.

● The complicated features of relational databases require expensive hardware and highly skilled administrators.

Page 4: Dynamo db

Dynamo: Amazon's Requirements

● Simple reads and writes to binary objects no larger than 1 MB; no operation spans multiple data items.

● Very fast data access (< 300 ms response time).

● Heterogeneous commodity hardware infrastructure.

● Used by decentralized, loosely coupled services.

● Highly available (always on); small, frequent network and server failures are expected.

Page 5: Dynamo db

Dynamo: Consistency and Replication

● Strong data consistency and high data availability cannot be achieved simultaneously.

● “Dynamo is designed to be an eventually consistent data store; that is all updates reach all replicas eventually.”

● An “always writable” data store: write operations are not rejected, even if the data is inconsistent.

– Imagine you are ordering from Amazon.com and the website refuses to add an item to your cart!

● Conflict resolution: the application is responsible for resolving data conflicts.

Page 6: Dynamo db

Dynamo VS Bigtable

Feature | Dynamo | Bigtable

Cluster setup | Decentralized | Centralized (GFS)

Data access | (primary key, version*) | (row key, column key, timestamp)

Data partitioning and load balancing | Customized consistent hashing | 64K partitions stored on the least utilized machines (GFS)

Data query | Zero-hop DHT | Ask the master (GFS)

Read operation | Multiple copies read | Single copy read

Typical value size | Less than 1 MB | Not specified (GFS)

Write operation on inconsistent data | Accept all write operations and resolve conflicts | Make data unavailable until consistent (GFS)

Page 7: Dynamo db

Dynamo: Interface

● Key-value storage system with operators:

– get(key): returns a single object, or a list of objects with conflicting versions.

– put(key, context, object): places the object and writes its replicas to disk. The context contains information about the object, such as its version.

● MD5 hashing is applied to the key to generate a 128-bit identifier.
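
The key-to-identifier step can be sketched in a few lines of Python. This is only an illustration of applying MD5 to a key; the function name `key_to_identifier` and the sample key are assumptions made for the example, not part of Dynamo's interface.

```python
import hashlib

def key_to_identifier(key: bytes) -> int:
    """Hash a key with MD5 and read the 16-byte digest as a 128-bit integer."""
    digest = hashlib.md5(key).digest()    # MD5 always yields 16 bytes (128 bits)
    return int.from_bytes(digest, "big")  # unsigned integer in [0, 2**128)

identifier = key_to_identifier(b"cart:customer-42")  # hypothetical key
print(identifier.bit_length() <= 128)
```

The resulting integer is what gets placed on the hash ring described on the next slides.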

Page 8: Dynamo db

Dynamo: Partitioning

● Dynamo is designed to scale incrementally one machine at a time.

● Consistent hashing treats the fixed output range of the hash function as a ring.

● A variant of consistent hashing (virtual nodes) is used by Dynamo to dynamically repartition and load balance the data over the storage hosts.

● Each storage host is assigned an amount of data that depends on its capacity.

Page 9: Dynamo db

Dynamo: Consistent Hashing

Ring before adding a node (each host owns one key range):

A[1,10], D[11,20], E[21,30], B[31,40], F[41,50], C[51,60], G[61,70], H[71,80]

Ring after adding a node (storage host) I[47,54]:

A[1,10], D[11,20], E[21,30], B[31,40], F[41,46], I[47,54], C[55,60], G[61,70], H[71,80]

Page 10: Dynamo db

Dynamo: Variant of Consistent Hashing

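Ring before adding a node (each host appears twice; the starred label marks its second virtual node):

A[1,10], D[11,20], C*[21,30], B[31,40], A*[41,50], C[51,60], B*[61,70], D*[71,80]

Ring after adding a node (storage host) E with virtual nodes E[17,24] and E*[47,54]:

A[1,10], D[11,16], E[17,24], C*[25,30], B[31,40], A*[41,46], E*[47,54], C[55,60], B*[61,70], D*[71,80]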
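
A minimal sketch of a hash ring with virtual nodes may help make the two diagrams concrete. It is an illustration under assumptions, not Dynamo's implementation: the class `Ring`, the `host#i` token naming scheme, and the choice of 8 virtual nodes per host are all invented for the example. A host with more capacity would simply be given more virtual nodes.

```python
import bisect
import hashlib

def ring_hash(text: str) -> int:
    """Position on the ring: the MD5 digest read as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(text.encode()).digest(), "big")

class Ring:
    def __init__(self) -> None:
        self._tokens = []  # sorted token positions on the ring
        self._owner = {}   # token position -> physical host

    def add_host(self, host: str, vnodes: int) -> None:
        """Place `vnodes` virtual nodes for `host` on the ring (hypothetical scheme)."""
        for i in range(vnodes):
            token = ring_hash(f"{host}#{i}")
            bisect.insort(self._tokens, token)
            self._owner[token] = host

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the first token."""
        idx = bisect.bisect_left(self._tokens, ring_hash(key))
        token = self._tokens[idx % len(self._tokens)]  # wrap around the ring
        return self._owner[token]

ring = Ring()
for host in "ABCD":
    ring.add_host(host, vnodes=8)

keys = [f"key-{i}" for i in range(500)]
before = {k: ring.lookup(k) for k in keys}

ring.add_host("E", vnodes=8)  # incremental scaling: one new host
after = {k: ring.lookup(k) for k in keys}

# Only keys that now belong to E change owner; nothing moves between old hosts.
moved = [k for k in keys if before[k] != after[k]]
print(all(after[k] == "E" for k in moved))
```

The final check reflects the property shown in the diagrams: adding a host only takes ranges away from its ring neighbours, and with virtual nodes those neighbours are spread over several physical hosts.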

Page 11: Dynamo db

Dynamo: Replication

● Each key (k) is assigned to a coordinator node (i).

● Each value (v) is replicated to the (N-1) clockwise successor logical nodes in the ring.

● Node (i) is responsible for updating all other (N-1) replicas of the keys it owns.

● Each key (k) has a preference list of physical nodes that are responsible for maintaining and accessing the key's data.
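
The replication rules above can be sketched as a clockwise walk over the ring. The ring below reuses the token layout from the virtual-node diagram (each tuple is the upper bound of a range and its physical host); the function name `preference_list` and the rule of skipping hosts already in the list are assumptions made for this sketch.

```python
import bisect

# (upper bound of range, physical host), taken from the virtual-node slide
RING = [(10, "A"), (20, "D"), (30, "C"), (40, "B"),
        (50, "A"), (60, "C"), (70, "B"), (80, "D")]

def preference_list(position: int, n: int) -> list:
    """Return N distinct physical hosts, walking clockwise from the key's position.

    The first host in the list acts as the coordinator for the key.
    """
    tokens = [token for token, _ in RING]
    start = bisect.bisect_left(tokens, position)
    hosts = []
    for step in range(len(RING)):
        _, host = RING[(start + step) % len(RING)]  # wrap around the ring
        if host not in hosts:                       # skip repeated physical hosts
            hosts.append(host)
        if len(hosts) == n:
            break
    return hosts

print(preference_list(45, n=3))  # ['A', 'C', 'B']
print(preference_list(75, n=3))  # ['D', 'A', 'C']
```

Skipping repeated hosts matters with virtual nodes: without it, two of the N replicas could land on the same physical machine.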

Page 12: Dynamo db

Dynamo: Data Versioning

● An eventual consistency protocol is used to update all data replicas asynchronously.

● put() returns before all replicas are updated.

● get() can return multiple versions for the same key.

● Dynamo tracks each data mutation as a new version to support the “write always” protocol.

● Dynamo uses vector clocks for versioning.

Page 13: Dynamo db

Dynamo: vector clocks example 1

Nodes: A, B, C

● Value=100, clock (A:1)

Page 14: Dynamo db

Dynamo: vector clocks example 1

Nodes: A, B, C

● Value=100, clock (A:1)

● +1: Value=101, clock (A:1, B:1)

Page 15: Dynamo db

Dynamo: vector clocks example 1

Nodes: A, B, C

● Value=100, clock (A:1)

● +1: Value=101, clock (A:1, B:1)

● +4: Value=105, clock (A:1, B:1, C:1)

Page 16: Dynamo db

Dynamo: vector clocks example 1

Nodes: A, B, C

● Value=100, clock (A:1)

● +1: Value=101, clock (A:1, B:1)

● +4: Value=105, clock (A:1, B:1, C:1)

● +100: Value=205, clock (A:1, B:2, C:1)

Page 17: Dynamo db

Dynamo: vector clocks example 1

Nodes: A, B, C

● Value=100, clock (A:1)

● +1: Value=101, clock (A:1, B:1)

● +4: Value=105, clock (A:1, B:1, C:1)

● +100: Value=205, clock (A:1, B:2, C:1)

● +110: Value=315, clock (A:1, B:2, C:2)

Page 18: Dynamo db

Dynamo: vector clocks example 2

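Nodes: A, B, C

● Value=100, clock (A:1)

● +1: Value=101, clock (A:1, B:1)

● +4 (applied to 101): Value=105, clock (A:1, B:1, C:1)

● +100 (applied to 101): Value=201, clock (A:1, B:2)

● +110 (applied to 201): Value=311, clock (A:1, B:2, C:1)

● +110 (applied to 105): Value=215, clock (A:1, B:1, C:2)

● Conflict! Neither (A:1, B:2, C:1) nor (A:1, B:1, C:2) is greater than or equal to the other in every entry, so the two versions are concurrent.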
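
The conflict in this example can be detected mechanically by comparing the two clocks entry by entry. The sketch below is a generic vector-clock comparison, not code from Dynamo; the function names `descends` and `compare` and the result labels are assumptions chosen for the illustration.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock `a` is at least as large as clock `b` for every node in `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a: dict, b: dict) -> str:
    """Classify clock `a` relative to clock `b`."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "after"       # a supersedes b; b can be discarded
    if b_ge_a:
        return "before"      # b supersedes a; a can be discarded
    return "concurrent"      # neither supersedes the other: a conflict

# Example 1: each version extends the previous one, so there is no conflict.
print(compare({"A": 1, "B": 1, "C": 1}, {"A": 1, "B": 1}))          # after

# Example 2: the two final versions from the slide.
print(compare({"A": 1, "B": 2, "C": 1}, {"A": 1, "B": 1, "C": 2}))  # concurrent
```

When the result is "concurrent", both versions have to be kept and handed back to the reader, which is why get() can return a list of objects.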

Page 19: Dynamo db

Dynamo: resolving conflicts

● Syntactic reconciliation:

– The system resolves the conflict automatically, because one version's vector clock supersedes the other's.

● Semantic reconciliation:

– The application merges the conflicting versions; the user may have to revise the merged values.

– Example: Amazon's shopping cart:

  ● “Add to cart” items are preserved.

  ● Deleted items can resurface.
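
The shopping cart behaviour can be sketched as a set union over the conflicting versions. This is an assumed merge rule written for illustration, with a hypothetical function name `merge_carts`; it is not Amazon's actual cart logic.

```python
def merge_carts(versions: list) -> set:
    """Semantic reconciliation sketch: keep every item seen in any version."""
    merged = set()
    for cart in versions:
        merged |= cart
    return merged

# Two concurrent versions of the same cart, both derived from {"book", "lamp"}:
version_1 = {"book", "lamp", "pen"}  # added "pen", never saw the deletion
version_2 = {"book"}                 # deleted "lamp"

merged = merge_carts([version_1, version_2])
print(sorted(merged))  # ['book', 'lamp', 'pen']
```

No added item is lost, but "lamp" comes back even though the customer removed it in one version, which is the resurfacing effect noted on the slide.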

Page 20: Dynamo db

Dynamo: Processing put() & get()

● A client can route a request in one of two ways:

– Through a generic load balancer, which directs the request to the least utilized node.

– Through a partition-aware library, which sends the request directly to one of the data owners.

● The system requires two configurable values:

– R: the number of available healthy nodes required for a successful read.

– W: the number of available healthy nodes required for a successful write.
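
A small sketch shows how R and W act as thresholds on the number of replica responses. The class and function names are assumptions made for the example. The `overlaps` check (R + W > N) is not on the slide; it is the quorum-style condition described in the Dynamo paper, under which the set of nodes read and the set of nodes written share at least one node.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuorumConfig:
    n: int  # number of replicas per key
    r: int  # responses required for a successful read
    w: int  # acknowledgements required for a successful write

    def overlaps(self) -> bool:
        """True when every read set intersects every write set (R + W > N)."""
        return self.r + self.w > self.n

def write_succeeds(cfg: QuorumConfig, acks: int) -> bool:
    return acks >= cfg.w

def read_succeeds(cfg: QuorumConfig, replies: int) -> bool:
    return replies >= cfg.r

cfg = QuorumConfig(n=3, r=2, w=2)
print(cfg.overlaps())                # True
print(write_succeeds(cfg, acks=1))   # False: only one replica acknowledged
print(read_succeeds(cfg, replies=2)) # True
```

Lowering R or W reduces latency, because the coordinator waits for fewer nodes, at the cost of a higher chance of reading stale data.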

Page 21: Dynamo db

Dynamo: Hinted Handoff

● Assuming N=3, a put() operation that fails on node A is temporarily handled by node B.

● After A recovers, B sends the result of the put() operation back to A.

● Advantage: a temporary failure has minimal effect on the application.

(Figure: ring diagram with nodes A, B, C, D and copies labeled A', C', D', A''.)
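
The hinted handoff steps can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the `Node` class, the `up` flag standing in for failure detection, and the `hints` list standing in for wherever a node keeps writes intended for another node.

```python
class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.store = {}   # data this node owns
        self.hints = []   # (intended node, key, value) held for others

def put(intended: Node, fallback: Node, key: str, value: str) -> None:
    """Write to the intended node, or park the write on the fallback with a hint."""
    if intended.up:
        intended.store[key] = value
    else:
        fallback.hints.append((intended, key, value))

def deliver_hints(holder: Node) -> None:
    """Send parked writes back to their intended nodes once they have recovered."""
    remaining = []
    for target, key, value in holder.hints:
        if target.up:
            target.store[key] = value
        else:
            remaining.append((target, key, value))
    holder.hints = remaining

a, b = Node("A"), Node("B")
a.up = False                  # A fails
put(a, b, "cart:42", "book")  # B temporarily takes the write
a.up = True                   # A recovers
deliver_hints(b)              # B hands the write back to A
print(a.store)                # {'cart:42': 'book'}
```

From the application's point of view the put() succeeded while A was down, which is the advantage the slide points out.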

Page 22: Dynamo db

Dynamo: Scalability

● Adding or removing a node requires a third-party tool or direct user interaction.

● A gossip-based protocol is used to propagate membership changes throughout the cluster and to detect failures.

● Replica synchronization is done using Merkle hash trees.
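
A rough sketch of Merkle-tree comparison between two replicas follows. It assumes both replicas split the same key range into the same fixed number of buckets; the bucket scheme, the use of SHA-256 for tree hashes, and all function names are choices made for this illustration rather than details of Dynamo.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf_hashes(store: dict, buckets: int) -> list:
    """One leaf per bucket: the hash of the bucket's sorted key=value pairs."""
    groups = [[] for _ in range(buckets)]
    for key in sorted(store):
        bucket = int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % buckets
        groups[bucket].append(f"{key}={store[key]}")
    return [h("\n".join(group).encode()) for group in groups]

def build_levels(leaves: list) -> list:
    """Tree levels from the leaves (index 0) up to a single root hash."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + (prev[i + 1] if i + 1 < len(prev) else b""))
                       for i in range(0, len(prev), 2)])
    return levels

def differing_leaves(a: list, b: list) -> list:
    """Descend only into subtrees whose hashes differ; return differing leaf indices."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []                      # equal roots: replicas are in sync
    suspects = [0]
    for level in range(top, 0, -1):
        children = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if child < len(a[level - 1]) and a[level - 1][child] != b[level - 1][child]:
                    children.append(child)
        suspects = children
    return suspects

replica_a = {f"k{i}": "v" for i in range(20)}
replica_b = dict(replica_a, k7="stale")  # one out-of-date value

tree_a = build_levels(leaf_hashes(replica_a, buckets=4))
tree_b = build_levels(leaf_hashes(replica_b, buckets=4))
out_of_sync = differing_leaves(tree_a, tree_b)
print(out_of_sync)  # index of the single bucket that contains "k7"
```

Only the buckets reported here need to be exchanged between the replicas; matching subtrees are skipped after comparing a single hash.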

Page 23: Dynamo db

Dynamo: Peak Performance

● Shopping Cart Service during a holiday season:

– 10 million requests

– 3 million checkouts

– 100,000+ concurrent sessions

– No downtime!

Page 24: Dynamo db

DynamoDB

Page 25: Dynamo db

What is DynamoDB

● A NoSQL database service available publicly through Amazon Web Services (AWS); released in 2012.

● Based on Dynamo, a scalable, highly available (key, value) storage system used by Amazon's servers; published at SOSP 2007.

Page 26: Dynamo db

DynamoDB: Data Model

● The database is a collection of tables.

● A table is a collection of items.

● An item is a collection of attributes.

● Primary key is required.

● No nulls or empty strings.

● No schema is required; items can vary in the number of attributes. How is this possible?

Page 27: Dynamo db

DynamoDB: Example

● Table name: ProductCatalog

{
  Id = 101
  ProductName = "Book 101 Title"
  ISBN = "111-1111111111"
  Authors = [ "Author 1", "Author 2" ]
  Price = -2
  Dimensions = "8.5 x 11.0 x 0.5"
  PageCount = 500
  InPublication = 1
  ProductCategory = "Book"
}

{
  Id = 201
  ProductName = "18-Bicycle 201"
  Description = "201 description"
  BicycleType = "Road"
  Brand = "Brand-Company A"
  Price = 100
  Gender = "M"
  Color = [ "Red", "Black" ]
  ProductCategory = "Bike"
}

{
  Id = 202
  ProductName = "21-Bicycle 202"
  Description = "202 description"
  BicycleType = "Road"
  Brand = "Brand-Company A"
  Price = 200
  Gender = "M"
  Color = [ "Green", "Black" ]
  ProductCategory = "Bike"
}

Page 28: Dynamo db

DynamoDB: Example

● Storage in Dynamo:

– <Table_List, {ProductCatalog, ...}>

– <ProductCatalog, {101, 102, 201, 202}>

– <101, {ProductName={}, ISBN={}, Authors={}, ...}>

– or –

– <Table_List, {ProductCatalog, ...}>

– <ProductCatalog, {101, 102, 201, 202}>

– <101, {ProductName, ISBN, Authors, ...}>

– <101_Authors, {Author 1, Author 2}>
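
The first of the two layouts can be sketched with a plain dictionary standing in for the key-value store. This follows the slide's own guess about how a table could be mapped onto key-value pairs; it is not a description of how DynamoDB actually stores data, and the helper names `put_item` and `get_item` are assumptions for the example.

```python
kv = {}  # stand-in for a key-value store addressed by primary key

def put_item(table: str, item_id: int, attributes: dict) -> None:
    """Register the table, register the item in the table, then store its attributes."""
    kv.setdefault("Table_List", set()).add(table)  # <Table_List, {tables}>
    kv.setdefault(table, set()).add(item_id)       # <table, {item ids}>
    kv[item_id] = dict(attributes)                 # <item id, {attributes}>

def get_item(item_id: int) -> dict:
    return kv[item_id]

put_item("ProductCatalog", 101, {"ProductName": "Book 101 Title",
                                 "ISBN": "111-1111111111",
                                 "Authors": ["Author 1", "Author 2"]})
put_item("ProductCatalog", 201, {"ProductName": "18-Bicycle 201",
                                 "BicycleType": "Road",
                                 "Color": ["Red", "Black"]})

print(sorted(kv["ProductCatalog"]))  # [101, 201]
print(sorted(get_item(201)))         # ['BicycleType', 'Color', 'ProductName']
```

Because each item is stored as its own attribute map, the book and the bicycle can carry different attributes without any table-level schema.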

Page 29: Dynamo db

DynamoDB: Table Primary Keys

● A table in DynamoDB must have a primary key.

● A primary key can be either hash only, or hash and range.

● DynamoDB uses an unsorted hash index, while the range index is sorted.

● Hash only primary key is based on only a single attribute.

● Hash and range primary key is based on two attributes.

● Data types:

– Scalar data types: Number, String, and Binary.

– Multi-valued types: String Set, Number Set, and Binary Set.
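
The difference between the unsorted hash index and the sorted range index can be sketched with an in-memory model. This is an illustrative data structure only, not DynamoDB's implementation or API; the class `HashRangeTable` and its methods are hypothetical.

```python
import bisect

class HashRangeTable:
    """Toy table keyed by (hash key, range key)."""

    def __init__(self) -> None:
        # hash key -> (sorted range keys, items in the same order)
        self._partitions = {}

    def put(self, hash_key, range_key, item: dict) -> None:
        keys, items = self._partitions.setdefault(hash_key, ([], []))
        idx = bisect.bisect_left(keys, range_key)
        if idx < len(keys) and keys[idx] == range_key:
            items[idx] = item             # same primary key: overwrite
        else:
            keys.insert(idx, range_key)   # keep range keys sorted
            items.insert(idx, item)

    def query(self, hash_key, low, high) -> list:
        """Items with the given hash key whose range key lies in [low, high]."""
        keys, items = self._partitions.get(hash_key, ([], []))
        return items[bisect.bisect_left(keys, low):bisect.bisect_right(keys, high)]

table = HashRangeTable()
table.put("thread-1", 3, {"msg": "c"})
table.put("thread-1", 1, {"msg": "a"})
table.put("thread-1", 2, {"msg": "b"})
table.put("thread-2", 1, {"msg": "x"})

print(table.query("thread-1", 2, 3))  # [{'msg': 'b'}, {'msg': 'c'}]
```

The hash key only supports exact lookups, while the sorted range key makes ordered scans and range conditions cheap within a single hash key.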

Page 30: Dynamo db

DynamoDB: Read operation

● Availability and durability are maintained through data replication.

● Updating all the replicas after a data mutation takes some time; DynamoDB will eventually synchronize all the replicas.

● DynamoDB supports two read operations:

– Eventually consistent read

  ● Does not necessarily reflect the last data mutation.

  ● Very fast data access; not affected by failures.

– Consistent read

  ● Always reflects the last data mutation.

  ● Waits for the data to be consistent in all replicas; affected by network and storage failures.

Page 31: Dynamo db

DynamoDB: Similar services

● Datastore on Google App Engine

● Cloudant Data Layer (CouchDB)

Page 32: Dynamo db

DynamoDB: try it today