Database Architectures & Hypertable
Doug Judd
CEO, Hypertable, Inc.

Posted on 26-Mar-2015

Database Terminology

www.hypertable.org

Structured, Semi-Structured, and Unstructured Data

Structured data is what an RDBMS stores
  Data is broken into discrete components
  A type is associated with each component: integer, floating point, date, string
Unstructured data is free-form text
Semi-structured data is a combination of structured and unstructured

Document-Oriented

Stores semi-structured documents
Accepts documents in a format such as JSON, XML, or YAML
Often schema-less
Auto-indexes fields
Examples: CouchDB, MongoDB
Best fit: XML or Web documents
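To make "schema-less" and "auto-index fields" concrete, here is a minimal sketch of a toy document store. The class and method names (DocumentStore, put, find) are hypothetical and are not the API of CouchDB or MongoDB; the sketch only indexes top-level scalar fields and does not clean up index entries when a document is overwritten.

```python
import json
from collections import defaultdict


class DocumentStore:
    """Toy schema-less store that indexes every top-level scalar field."""

    def __init__(self):
        self.docs = {}  # doc_id -> parsed document
        # field name -> field value -> set of doc_ids
        self.index = defaultdict(lambda: defaultdict(set))

    def put(self, doc_id, raw_json):
        doc = json.loads(raw_json)  # no schema is declared or checked
        self.docs[doc_id] = doc
        for field, value in doc.items():
            # Assumption of this sketch: only scalar values are indexed.
            if isinstance(value, (str, int, float, bool)):
                self.index[field][value].add(doc_id)

    def find(self, field, value):
        return sorted(self.index[field][value])


store = DocumentStore()
store.put("a", '{"type": "page", "title": "Home", "lang": "en"}')
store.put("b", '{"type": "page", "title": "About"}')
store.put("c", '{"type": "feed", "items": [1, 2, 3]}')
print(store.find("type", "page"))  # ['a', 'b']
```

Documents "a", "b", and "c" carry different fields, yet all three are accepted and each field becomes queryable without any up-front table definition.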

Graph Databases

Database designed to represent graphs
APIs for performing graph operations
  Traversal (depth-first, breadth-first)
  Shortest/cheapest path
  Partitioning
Some allow hypergraphs
Examples: Neo4j, HyperGraphDB, InfoGrid, AllegroGraph, Sones, DEX, FlockDB, OrientDB, VertexDB, InfiniteGraph, Filament
More info: sones graphdb landscape
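As an illustration of the traversal and shortest-path operations listed above, the following sketch runs a breadth-first search over an adjacency-list graph. It is plain Python, not the API of any of the listed products; the function name shortest_path and the sample graph are made up for the example, and "shortest" here means fewest hops (unweighted edges).

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a fewest-hops path, or None if unreachable."""
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
print(shortest_path(graph, "a", "e"))  # ['a', 'b', 'd', 'e']
```

A cheapest-path query over weighted edges would swap the queue for a priority queue (Dijkstra's algorithm); the structure of the code stays the same.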

Column-Oriented

Data physically stored by column
RDBMS typically row-oriented
Improved performance for column operations
Better data compression
Examples: Hypertable, HBase, Cassandra, Vertica
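A small sketch of why a column layout helps column operations and compression. The sample rows and the run_length_encode helper are invented for illustration; real column stores use more elaborate encodings, and run-length encoding is just one technique that benefits from similar values sitting next to each other.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "city": "Oslo", "sales": 120},
    {"id": 2, "city": "Oslo", "sales": 80},
    {"id": 3, "city": "Lima", "sales": 200},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# A column operation reads only the column it needs.
total_sales = sum(columns["sales"])


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


print(total_sales)                         # 400
print(run_length_encode(columns["city"]))  # [['Oslo', 2], ['Lima', 1]]
```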

In-Memory

Data set stored in RAM
Extremely fast access
Limited capacity
Examples: Memcached, Redis, MonetDB, VoltDB

Horizontal Scalability

Scale out
Increase capacity by adding machines
Opposite of vertical scalability (scale up)
Commodity hardware

Distributed Hash Table (DHT)

Horizontally scalable
Decentralized
Fast access
Restricted API: GET, SET, DELETE
Peer-to-peer file sharing systems: BitTorrent, Napster, Gnutella, Freenet
Examples: Dynamo, Cassandra, Riak, Project Voldemort, SimpleDB, S3, Redis, Scalaris, Membase
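The restricted GET/SET/DELETE API can be sketched in a few lines. This is a single-process toy, assuming a hypothetical ToyDHT class in which each "node" is a dictionary and keys are placed by hash modulo the node count; a real DHT is decentralized and routes requests between machines over the network.

```python
import hashlib


class ToyDHT:
    """Single-process stand-in for a DHT: each node is just a dict."""

    def __init__(self, node_names):
        self.names = sorted(node_names)
        self.nodes = {name: {} for name in self.names}

    def _node_for(self, key):
        # Hash the key and pick a node; no ordering of keys is preserved.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.names[int.from_bytes(digest, "big") % len(self.names)]

    def set(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)

    def delete(self, key):
        self.nodes[self._node_for(key)].pop(key, None)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.set("cart:42", ["book", "lamp"])
print(dht.get("cart:42"))  # ['book', 'lamp']
dht.delete("cart:42")
print(dht.get("cart:42"))  # None
```

Because placement depends only on the hash of the key, lookups are fast but range scans over adjacent keys are not possible, which is why the API is limited to single-key operations.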

Scalable Database Architectures

Auto-Sharding

Splits table data into horizontal "shards"
Shards managed by traditional RDBMS (e.g. MySQL, Postgres)
Automated "glue" code to handle sharding and request routing
Examples: MongoDB, AsterData, Greenplum
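The "glue" code for sharding and request routing might look roughly like the sketch below. It assumes range-based shards that split in half once they exceed a row limit; the ShardedTable class, its max_rows_per_shard parameter, and the use of a dict in place of each RDBMS instance are all simplifications made up for this example, not how MongoDB, AsterData, or Greenplum implement it.

```python
import bisect


class ShardedTable:
    """Toy auto-sharding: key ranges split when a shard grows too large."""

    def __init__(self, max_rows_per_shard=4):
        self.max_rows = max_rows_per_shard
        self.lower_bounds = [""]  # shard i holds keys >= lower_bounds[i]
        self.shards = [{}]        # each dict stands in for one RDBMS instance

    def _shard_index(self, key):
        # Request routing: find the last shard whose lower bound is <= key.
        return bisect.bisect_right(self.lower_bounds, key) - 1

    def insert(self, key, row):
        i = self._shard_index(key)
        self.shards[i][key] = row
        if len(self.shards[i]) > self.max_rows:
            self._split(i)

    def get(self, key):
        return self.shards[self._shard_index(key)].get(key)

    def _split(self, i):
        # Move the upper half of the key range into a new shard.
        keys = sorted(self.shards[i])
        middle = keys[len(keys) // 2]
        upper = {k: v for k, v in self.shards[i].items() if k >= middle}
        self.shards[i] = {k: v for k, v in self.shards[i].items() if k < middle}
        self.lower_bounds.insert(i + 1, middle)
        self.shards.insert(i + 1, upper)


table = ShardedTable(max_rows_per_shard=4)
for n in range(10):
    table.insert(f"user{n:02d}", {"n": n})
print(len(table.shards))    # 4
print(table.get("user07"))  # {'n': 7}
```

Callers only ever use insert and get; which shard holds a key is decided entirely by the routing layer.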

MongoDB

Dynamo

Developed by Amazon.com for their shopping cart
Designed for high write availability
Eventually consistent DHT
Implementations:
  Cassandra
  Project Voldemort
  Riak
  Dynomite

Eventual Consistency

Database update semantics in a distributed system with data replication
Strong consistency: after an update completes, all processes see the updated value
Eventual consistency: eventually, all processes will see the updated value
The best-known eventually consistent system is DNS
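The difference between the two semantics can be shown with a toy replicated register. This sketch assumes a write is acknowledged after reaching one replica and that the others are updated later by a background step; the class names and the last-writer-wins rule are choices made for the example, not a description of Dynamo's actual protocol.

```python
class Replica:
    def __init__(self):
        self.value, self.version = None, 0

    def apply(self, value, version):
        # Last-writer-wins: keep only the newest version seen so far.
        if version > self.version:
            self.value, self.version = value, version


class EventuallyConsistentRegister:
    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # (replica_index, value, version) not yet delivered
        self.clock = 0

    def write(self, value):
        """Acknowledge after updating one replica; queue the update for the rest."""
        self.clock += 1
        self.replicas[0].apply(value, self.clock)
        for i in range(1, len(self.replicas)):
            self.pending.append((i, value, self.clock))

    def read(self, replica_index):
        return self.replicas[replica_index].value

    def propagate(self):
        """Deliver all queued updates (stands in for background replication)."""
        for i, value, version in self.pending:
            self.replicas[i].apply(value, version)
        self.pending.clear()


register = EventuallyConsistentRegister()
register.write("v1")
print(register.read(0))  # v1
print(register.read(2))  # None (stale read: the update has not arrived yet)
register.propagate()
print(register.read(2))  # v1
```

Under strong consistency the write would not complete until every replica had applied it, so the stale read in the middle could not happen; the price is that a write must wait for, and can be blocked by, slow or unreachable replicas.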

Eventual Consistency

Consistent Hashing
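The slide for this topic is a diagram, so the following is only a general sketch of the technique rather than a reproduction of it. Nodes and keys are hashed onto the same ring, and a key belongs to the first node point found clockwise from the key's position. The HashRing class, the points_per_node parameter, and the choice of SHA-256 are assumptions of this example.

```python
import bisect
import hashlib


def ring_position(text):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, points_per_node=64):
        self.points_per_node = points_per_node
        self.positions = []  # sorted ring positions
        self.owners = []     # owners[i] is the node that placed positions[i]
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node places several points to even out the distribution.
        for i in range(self.points_per_node):
            pos = ring_position(f"{node}#{i}")
            at = bisect.bisect(self.positions, pos)
            self.positions.insert(at, pos)
            self.owners.insert(at, node)

    def node_for(self, key):
        # First point clockwise from the key, wrapping around at the end.
        at = bisect.bisect(self.positions, ring_position(key)) % len(self.positions)
        return self.owners[at]


keys = [f"key-{n}" for n in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}
ring.add("node-d")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(all(ring.node_for(k) == "node-d" for k in moved))  # True
```

The property being checked is the point of the technique: when a node joins, the only keys that change owner are the ones the new node takes over, whereas hash-modulo-N placement would reshuffle most keys.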

Amazon AWS

S3
  Online storage web service
  Designed for larger amounts of data
  Cost: $0.15/GB per month
SimpleDB
  Designed for smaller amounts of data
  Provides indexing and richer query capability
  Cost: $0.27/GB per month + machine utilization fee
RDS
  Managed MySQL instances

Order Preserving Partitioner (Cassandra)

    www.recipezaar.com         1091721999…629750272
  + www.ribbonprinters.com     1091721999…965293103
  / 2 =
    www.rgb????i?pQdp?.???     1091721999…297521687
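The arithmetic on this slide (add two keys as numbers, divide by two, get a split point between them) can be reproduced with a short sketch. It assumes the keys are right-padded with zero bytes to equal length and read as big-endian integers; that padding rule and the function name midpoint_key are assumptions of this example and may differ from what Cassandra's partitioner does internally.

```python
def midpoint_key(low: bytes, high: bytes) -> bytes:
    """Average two keys numerically to get a key that sorts between them."""
    width = max(len(low), len(high))
    # Right-pad with zero bytes so both keys have the same number of digits.
    a = int.from_bytes(low.ljust(width, b"\x00"), "big")
    b = int.from_bytes(high.ljust(width, b"\x00"), "big")
    return ((a + b) // 2).to_bytes(width, "big")


low = b"www.recipezaar.com"
high = b"www.ribbonprinters.com"
mid = midpoint_key(low, high)
print(mid[:7])           # b'www.rgb'
print(low < mid < high)  # True
```

The midpoint begins with "www.rgb" and then continues with bytes that are not printable characters, which is consistent with the "?" marks in the key shown on the slide.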

Order Preserving Partitioner: Balance Problem

Bigtable: the infrastructure that Google is built on

Bigtable underpins 100+ Google services, including: YouTube, Blogger, Google Earth, Google Maps, Orkut, Gmail, Google Analytics, Google Book Search, Google Code, Crawl Database…
Implementations
  Hypertable
  HBase

Google Stack

GFS - Replicates data inter-machine
MapReduce - Efficiently processes data in GFS
Bigtable - Indexed table structure
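For readers who have not seen the MapReduce programming model, here is a single-process word-count sketch of its three steps (map, shuffle, reduce). The function names are chosen for this example; the real system runs the map and reduce functions in parallel across many machines and reads its input from GFS.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


documents = ["big table big data", "data in a big file system"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["big"], counts["data"])  # 3 2
```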

Google File System

Google File System

System Overview

Data Model

Sparse, two-dimensional table with cell versions
Cells are identified by a 4-part key
  Row (string)
  Column Family (byte)
  Column Qualifier (string)
  Timestamp (long integer)
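A minimal sketch of the 4-part key, assuming cells are kept sorted by (row, column family, column qualifier, timestamp) with newer timestamps first. The SparseTable class and its methods are hypothetical and are not the Hypertable API; the column family is represented as a small integer to match the "byte" type above.

```python
import bisect


class SparseTable:
    """Cells keyed by (row, column family, column qualifier, timestamp)."""

    def __init__(self):
        self.keys = []    # sorted; timestamp is negated so newest sorts first
        self.values = {}  # key tuple -> cell value

    def put(self, row, family, qualifier, timestamp, value):
        key = (row, family, qualifier, -timestamp)
        if key not in self.values:
            bisect.insort(self.keys, key)
        self.values[key] = value

    def get_latest(self, row, family, qualifier):
        # Seek to the first (newest) version of this cell.
        i = bisect.bisect_left(self.keys, (row, family, qualifier, float("-inf")))
        if i < len(self.keys) and self.keys[i][:3] == (row, family, qualifier):
            return self.values[self.keys[i]]
        return None

    def scan_row(self, row):
        # All cells of a row are adjacent, so a row scan is one sequential read.
        i = bisect.bisect_left(self.keys, (row,))
        while i < len(self.keys) and self.keys[i][0] == row:
            _, family, qualifier, neg_ts = self.keys[i]
            yield family, qualifier, -neg_ts, self.values[self.keys[i]]
            i += 1


table = SparseTable()
table.put("com.example/index", 1, "title", 100, "Old")
table.put("com.example/index", 1, "title", 200, "New")
table.put("com.example/index", 2, "anchor", 150, "home")
print(table.get_latest("com.example/index", 1, "title"))  # New
```

The table is sparse because only cells that were written occupy space; a row with three cells stores three entries no matter how many columns other rows use.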

Table: Visual Representation

Table: Actual Representation

Scaling (part I)

Scaling (part II)

Scaling (part III)

Request Routing

Hypertable

Hypertable Overview

Massively scalable database
Modeled after Google's Bigtable
High-performance implementation (C++)
Thrift interface for all popular high-level languages: Java, Ruby, Python, PHP, etc.
Open source (GPL license)
Project started March 2007 @ Zvents

Hypertable In Use Today

Hypertable vs. HBase

Hypertable vs. HBase

Test                                 Hypertable Advantage Relative to HBase (%)
Random Read Zipfian 80 GB            925
Random Read Zipfian 20 GB            777
Random Read Zipfian 2.5 GB           100
Random Write 10KB values             51
Random Write 1KB values              102
Random Write 100 byte values         427
Random Write 10 byte values          931
Sequential Read 10KB values          1060
Sequential Read 1KB values           68
Sequential Read 100 byte values      129
Scan 10KB values                     2
Scan 1KB values                      58
Scan 100 byte values                 75
Scan 10 byte values                  220

Annual EC2 Cost Savings

Assuming 200% improvement
Extra large reserved instances
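The figures from this slide did not survive in the transcript, so the sketch below only shows the shape of the calculation. It assumes that a 200% improvement means three times the throughput per machine, and the machine count and per-instance cost in the usage lines are placeholder numbers, not actual EC2 prices or values from the talk.

```python
import math


def machines_needed(baseline_machines, improvement_percent):
    """Machines required for the same load after a per-machine speedup."""
    speedup = 1 + improvement_percent / 100  # assumption: 200% -> 3x throughput
    return math.ceil(baseline_machines / speedup)


def annual_savings(baseline_machines, improvement_percent, annual_cost_per_machine):
    saved = baseline_machines - machines_needed(baseline_machines, improvement_percent)
    return saved * annual_cost_per_machine


# Placeholder inputs: 30 machines at a hypothetical 1000 per machine per year.
print(machines_needed(30, 200))       # 10
print(annual_savings(30, 200, 1000))  # 20000
```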

Resources

Project site: www.hypertable.org
Twitter: hypertable
Commercial support: www.hypertable.com
Performance evaluation write-up: blog.hypertable.com/?p=14

Q&A