Dynamo and BigTable: Review and Comparison
IEEEI 2014 Grisha Weintraub
Outline
• Introduction to NoSQL
• Introduction to Dynamo and BigTable
• Dynamo vs. BigTable comparison
• Open source implementations
Introduction to NoSQL
• New generation of databases
• Response to a “big data” challenge
• Main characteristics:
– Non-relational
– Distributed
– Fault tolerant
– Scalable
Dynamo and BigTable - Introduction
Dynamo (Amazon)
• Giuseppe DeCandia, et al.: Dynamo: Amazon's Highly Available Key-value Store. SOSP 2007
BigTable (Google)
• Fay Chang, et al.: BigTable: A Distributed Storage System for Structured Data. OSDI 2006
Dynamo vs. BigTable
Dynamo and BigTable are compared on the following criteria:
• Architecture
• Data model
• API
• Security
• Partitioning
• Replication
• Storage
• Membership and failure detection
Architecture
Dynamo
• Decentralized:
– Every node has the same set of responsibilities as its peers.
– There is no single point of failure.
BigTable
• Centralized:
– A single master node maintains all system metadata.
– Other nodes (tablet servers) handle read and write requests.
Data Model
Dynamo
• Key-value: data is stored as <key, value> pairs, where the key is a unique identifier and the value is an arbitrary entry.
BigTable
• Multidimensional sorted map: the map is indexed by a row key and a column key and is ordered by row key. Column keys are grouped into sets called column families.
Dynamo example:
Key | Value
188 | { "Name" : "John", "Email" : "[email protected]", "Card" : "6652" }
145 | { "Name" : "Bob", "Phone" : "781455", "Card" : "9875" }
BigTable example:
User ID | Personal Data                            | Financial Data
145     | Name = "Bob", Phone = "781455"           | Card = "9875"
188     | Name = "John", Email = "[email protected]" | Card = "6652"
Here User ID is the row key, Personal Data and Financial Data are column families, and Name, Phone, Email and Card are column keys.
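The two data models can be contrasted with a minimal Python sketch. It uses plain dictionaries and a subset of the example data above; the names `kv_store`, `table`, `read_cell` and `rows_in_order` are hypothetical and are not part of either system.

```python
# Dynamo-style model: one opaque value per unique key.
# The store does not interpret the bytes it holds.
kv_store = {
    "188": b'{"Name": "John", "Card": "6652"}',
    "145": b'{"Name": "Bob", "Phone": "781455", "Card": "9875"}',
}

# BigTable-style model: row key -> column family -> column key -> value.
table = {
    "188": {
        "Personal Data": {"Name": "John"},
        "Financial Data": {"Card": "6652"},
    },
    "145": {
        "Personal Data": {"Name": "Bob", "Phone": "781455"},
        "Financial Data": {"Card": "9875"},
    },
}


def read_cell(table, row_key, family, column):
    """Address a single cell by row key, column family and column key."""
    return table[row_key][family][column]


def rows_in_order(table):
    """Return row keys in sorted order, mimicking the ordering by row key."""
    return sorted(table)
```

With the key-value model the application has to parse the whole value to reach the card number, while the sorted-map model lets it address `read_cell(table, "145", "Financial Data", "Card")` directly.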
API
Dynamo
• get – returns an object associated with the given key.
• put – associates the given object with the specified key.
BigTable
• get – returns values from an individual row.
• scan – iterates over multiple rows.
• put – inserts a value into the specified table cell.
• delete – deletes a whole row or a specified cell inside a particular row.
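The difference between the two interfaces can be illustrated with an in-memory sketch. Both classes below are hypothetical toys written for this comparison, not the real client libraries; in particular, the method signatures are assumptions that only mirror the operations listed above.

```python
import bisect


class ToyKeyValueStore:
    """Dynamo-like interface: get and put on opaque objects."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, obj):
        self._data[key] = obj


class ToySortedTable:
    """BigTable-like interface: get, scan, put and delete on cells."""

    def __init__(self):
        self._row_keys = []  # kept sorted so that scan can use binary search
        self._rows = {}      # row key -> {(family, column): value}

    def put(self, row_key, family, column, value):
        if row_key not in self._rows:
            bisect.insort(self._row_keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][(family, column)] = value

    def get(self, row_key):
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_key, end_key):
        """Yield rows with start_key <= row key < end_key, in key order."""
        lo = bisect.bisect_left(self._row_keys, start_key)
        hi = bisect.bisect_left(self._row_keys, end_key)
        for row_key in self._row_keys[lo:hi]:
            yield row_key, dict(self._rows[row_key])

    def delete(self, row_key, cell=None):
        """Delete one cell, given as (family, column), or the whole row."""
        if row_key not in self._rows:
            return
        if cell is None:
            del self._rows[row_key]
            self._row_keys.remove(row_key)
        else:
            self._rows[row_key].pop(cell, None)
```

The scan operation is only cheap because rows are kept in key order; a hash-partitioned key-value store has no comparable way to iterate over a key range.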
Security
Dynamo
• No security features
BigTable
• Access control rights are granted at column family level.
Row Key | Personal Data                            | Financial Data
145     | Name = "Bob", Phone = "781455"           | Card = "9875"
188     | Name = "John", Email = "[email protected]" | Card = "6652"
Example access levels for three different users:
– Views Personal Data
– Views/Updates Personal Data
– Views/Updates all the Data
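Access control at column family level can be sketched as a lookup from user to family to permitted actions. The user names `viewer`, `editor` and `admin`, the `acl` structure and the `is_allowed` helper are all hypothetical; they only model the three access levels shown above and do not reflect how BigTable stores its access rights.

```python
READ, WRITE = "read", "write"

# Permissions are attached to whole column families, never to single cells.
acl = {
    "viewer": {"Personal Data": {READ}},
    "editor": {"Personal Data": {READ, WRITE}},
    "admin": {
        "Personal Data": {READ, WRITE},
        "Financial Data": {READ, WRITE},
    },
}


def is_allowed(user, family, action):
    """Return True if the user may perform the action on the column family."""
    return action in acl.get(user, {}).get(family, set())
```

For example, `is_allowed("editor", "Financial Data", READ)` is false in this sketch, because the editor has been granted rights only on the Personal Data family.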
Partitioning
Dynamo
• Consistent hashing:
– Each node is assigned a random position on the ring.
– Each key is hashed to a fixed point on the ring.
– The node is chosen by walking clockwise from the hash location.
BigTable
• Data is stored ordered by row key.
• Each table consists of a set of tablets.
• Each tablet is assigned to exactly one tablet server.
• The METADATA table stores the location of each tablet under a row key.
[Figure: consistent hashing ring with nodes A-G; hash(key) marks the key's position on the ring.]
[Figure: a table ordered by id is split into tablets by key range (Tablet 1: ids 15000-20000, Tablet 2: ids 20001-25000); Tablet Server 1 and Tablet Server 2 each host several tablets.]
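The two partitioning schemes can be compared in a short Python sketch. The `Ring` class, the `tablet_index` list and the assignment of tablets to servers are assumptions made for illustration. Node positions are derived from an MD5 hash of the node name as a deterministic stand-in for the random positions described above.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a node name or key to a fixed point on a 128-bit ring."""
    return int.from_bytes(hashlib.md5(name.encode()).digest(), "big")


class Ring:
    """Consistent hashing: a key belongs to the first node clockwise from it."""

    def __init__(self, nodes):
        self._points = sorted((ring_position(n), n) for n in nodes)
        self._positions = [position for position, _ in self._points]

    def node_for(self, key):
        # First node position strictly greater than the key's position;
        # the modulo wraps around to the start of the ring.
        i = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[i % len(self._points)][1]


# Range partitioning: each entry is (last row key in tablet, tablet, server).
# The server assignment here is illustrative only.
tablet_index = [
    (20000, "Tablet 1", "Tablet Server 1"),
    (25000, "Tablet 2", "Tablet Server 2"),
]


def tablet_for(row_id):
    """Find the tablet whose key range contains row_id by binary search."""
    end_keys = [end for end, _, _ in tablet_index]
    i = bisect.bisect_left(end_keys, row_id)
    if i == len(tablet_index):
        raise KeyError(row_id)
    return tablet_index[i][1], tablet_index[i][2]
```

Hashing spreads neighbouring keys across the ring, whereas the range index keeps neighbouring row keys in the same tablet, which is what makes ordered scans possible.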
Replication
Dynamo
• Each data item is replicated at N nodes (N is a user-defined parameter).
• Each key K is assigned to a coordinator node.
• The coordinator stores the data associated with K locally and also replicates it at the N-1 healthy clockwise successor nodes in the ring.
BigTable
• Each tablet is stored in GFS as a sequence of read-only files called SSTables.
• SSTables are divided into fixed-size chunks, and these chunks are stored on chunkservers.
• Each chunk in GFS is replicated across multiple chunkservers.
[Figure: Dynamo ring with nodes A-G and N = 3; hash(key) marks the key's position on the ring.]
[Figure: SSTable1-SSTable3 are divided into Chunk1-Chunk3; each chunk is stored on two of the three chunkservers (Chunkserver 1, Chunkserver 2, Chunkserver 3).]
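Dynamo's replica placement can be sketched as a walk around the ring. The function below is a simplified illustration, assuming the ring is given as a list of node names in clockwise order and that the coordinator itself is healthy; `preference_list` and its parameters are hypothetical names.

```python
def preference_list(ring_order, coordinator, n, healthy):
    """Return the coordinator plus its N-1 healthy clockwise successors."""
    start = ring_order.index(coordinator)
    replicas = [coordinator]
    for step in range(1, len(ring_order)):
        if len(replicas) == n:
            break
        candidate = ring_order[(start + step) % len(ring_order)]
        # Unhealthy nodes are skipped, so the walk continues further clockwise.
        if candidate in healthy:
            replicas.append(candidate)
    return replicas
```

With the ring A-G and N = 3, a key coordinated by B is stored on B, C and D. If C is unavailable, the walk skips it and the key is stored on B, D and E.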
Storage
Dynamo
• Each node in Dynamo has a local persistence engine where data items are stored as binary objects.
• Different Dynamo instances may use different persistence engines (e.g. MySQL, BDB).
• Applications choose the persistence engine based on their object size distribution.
BigTable
• Data is stored in GFS in SSTable file format.
• SSTable is an immutable ordered map, whose keys and values are arbitrary strings.
• SSTable supports "get by key" and "get by key range" requests.
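The SSTable properties listed above (immutable, ordered, string keys and values, lookup by key and by key range) can be sketched in memory. `ToySSTable` is a hypothetical class for illustration; it does not model the on-disk file format or GFS.

```python
import bisect


class ToySSTable:
    """Immutable ordered map from string keys to string values."""

    def __init__(self, items):
        pairs = sorted(dict(items).items())
        # Tuples cannot be modified after construction.
        self._keys = tuple(key for key, _ in pairs)
        self._values = tuple(value for _, value in pairs)

    def get(self, key):
        """Get by key: binary search over the sorted keys."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def get_range(self, start, end):
        """Get by key range: all pairs with start <= key < end, in order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))
```

Because the structure never changes after it is built, it offers no update method; in this sketch a new version of the data would be written as a new `ToySSTable`.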
Membership and Failure Detection
Dynamo
• Gossip-based protocol:
– Each node contacts a peer chosen at random every second and the two nodes exchange their membership data (every node maintains a persistent view of the membership).
BigTable
• Failed tablet servers are identified by regular handshakes between the master and all tablet servers.
[Figure: Dynamo ring with nodes A-G; BigTable master node.]
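The gossip exchange described for Dynamo can be sketched as a pairwise merge of membership views. This is a simplified model under stated assumptions: each view maps a node name to a version number, the higher version wins on merge, and one call to `gossip_round` stands in for one second of gossip. The names `exchange` and `gossip_round` are hypothetical.

```python
import random


def exchange(view_a, view_b):
    """Reconcile two membership views so that both hold the newest entries."""
    merged = dict(view_a)
    for node, version in view_b.items():
        if version > merged.get(node, -1):
            merged[node] = version
    view_a.clear()
    view_a.update(merged)
    view_b.clear()
    view_b.update(merged)


def gossip_round(views, rng):
    """Each node contacts one peer chosen at random and exchanges views."""
    for node in list(views):
        peer = rng.choice([other for other in views if other != node])
        exchange(views[node], views[peer])


# Initially every node knows only about itself.
views = {name: {name: 1} for name in "ABCDEFG"}
rng = random.Random(0)
for _ in range(5):
    gossip_round(views, rng)
```

No node plays a special role in this scheme, which matches the decentralized architecture. The BigTable approach instead relies on the master contacting every tablet server.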
Dynamo vs. BigTable
Criterion                        | Dynamo                      | BigTable
Architecture                     | decentralized               | centralized
Data model                       | key-value                   | sorted map
API                              | get, put                    | get, put, scan, delete
Security                         | no                          | access control
Partitioning                     | consistent hashing          | key range based
Replication                      | successor nodes in the ring | chunkservers in GFS
Storage                          | plug-in                     | SSTables in GFS
Membership and failure detection | gossip-based protocol       | handshakes initiated by master
Open source implementations
Thank You