Extreme computing: Databases and MapReduce
Stratis D. Viglas
School of Informatics, University of Edinburgh
Stratis D. Viglas www.inf.ed.ac.uk
Databases and MapReduce BigTable
Outline
• Databases and MapReduce
  • Overview
  • Relational databases
  • Relational data processing on Hadoop MR
  • BigTable
  • Hive and Pig
A different data model
• BigTable's data model is not relational
• A table is "a sparse, distributed, persistent multidimensional sorted map"
• The map is indexed by a triplet
  • (row:string, column:string, time:int64)
  • row and column are keys, time is a timestamp
• Bigtables are mutable at the row level
  • Support for insertions, deletions, lookups
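
To make the triplet-indexed map concrete, here is a minimal in-memory sketch. It is not BigTable's API: the class name `TinyBigtable` and its methods are hypothetical, and everything lives in one Python dictionary rather than being sparse, distributed or persistent.

```python
from typing import Dict, Optional, Tuple

# The key is the (row, column, timestamp) triplet from the slide.
Key = Tuple[str, str, int]


class TinyBigtable:
    """Toy in-memory stand-in for the map (all names are hypothetical)."""

    def __init__(self) -> None:
        self._cells: Dict[Key, str] = {}

    def put(self, row: str, column: str, time: int, value: str) -> None:
        """Insert one version of a cell."""
        self._cells[(row, column, time)] = value

    def get(self, row: str, column: str) -> Optional[str]:
        """Return the most recent version of a cell, or None if absent."""
        versions = [(t, v) for (r, c, t), v in self._cells.items()
                    if r == row and c == column]
        return max(versions)[1] if versions else None

    def delete_row(self, row: str) -> None:
        """Remove every cell of a row."""
        for key in [k for k in self._cells if k[0] == row]:
            del self._cells[key]


table = TinyBigtable()
table.put("com.cnn.www", "contents:", 3, "<html>v1")
table.put("com.cnn.www", "contents:", 6, "<html>v2")
print(table.get("com.cnn.www", "contents:"))
```

The lookup returns the version with the largest timestamp, which is one plausible reading of "time is a timestamp"; a real system would also let the caller ask for older versions.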
Rows and columns in more detail
[Figure: row com.cnn.www with three columns. contents: holds three versions of "<html>..." at timestamps t3, t5 and t6; anchor:cnnsi.com holds "CNN" at t9; anchor:my.look.ca holds "CNN.com" at t8]
• Rows are maintained in sorted lexicographic order
  • Applications can exploit this property for efficient row scans
  • Row ranges dynamically partitioned into tablets
• Columns grouped into column families
  • Column key = family:qualifier
  • Column families provide locality hints
  • Unbounded number of columns per table
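
The sketch below illustrates why sorted row keys make scans cheap and how a column key splits into family and qualifier. The helper names are hypothetical, and reversing hostnames is an assumption read off the figure's row key (com.cnn.www); the point is only that related rows end up adjacent, so a scan is two binary searches and a slice.

```python
import bisect
from typing import List, Tuple


def row_key(hostname: str) -> str:
    """Reverse hostname components so pages of one domain sort together."""
    return ".".join(reversed(hostname.split(".")))


def split_column(column_key: str) -> Tuple[str, str]:
    """Split 'family:qualifier' into its two parts."""
    family, _, qualifier = column_key.partition(":")
    return family, qualifier


def scan_prefix(sorted_rows: List[str], prefix: str) -> List[str]:
    """Return all row keys starting with prefix, using binary search.

    Assumes no key contains the character U+FFFF, used here as an upper bound.
    """
    lo = bisect.bisect_left(sorted_rows, prefix)
    hi = bisect.bisect_left(sorted_rows, prefix + "\uffff")
    return sorted_rows[lo:hi]


rows = sorted(row_key(h) for h in
              ["www.cnn.com", "money.cnn.com", "www.bbc.co.uk", "www.cnn.org"])
print(scan_prefix(rows, "com.cnn."))
print(split_column("anchor:cnnsi.com"))
```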
Building blocks: SSTable
• The smallest and most basic building block
• Persistent immutable map from keys to values
  • Stored in GFS
  • Sequence of disk blocks with a (persistent) index for lookup
  • Memory-mapped for fast operation
• Two supported operations
  • Given a key, look up the value associated with it
  • Iterate over key/value pairs within a given key range
[Figure: an SSTable drawn as a sequence of 64kB blocks followed by an index]
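
The two SSTable operations can be sketched as follows. This is a toy under stated assumptions: the class and method names are made up, blocks hold a fixed number of entries rather than 64kB of data, and everything sits in memory instead of GFS. What it does show is the role of the index: one binary search picks the block, then only that block is scanned.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class SSTable:
    """Toy immutable sorted map split into blocks plus an index."""

    def __init__(self, data: Dict[str, str], block_size: int = 2) -> None:
        items = sorted(data.items())
        # Each block is a sorted run of (key, value) pairs.
        self._blocks: List[List[Tuple[str, str]]] = [
            items[i:i + block_size] for i in range(0, len(items), block_size)
        ]
        # The index stores the first key of every block.
        self._index: List[str] = [block[0][0] for block in self._blocks]

    def get(self, key: str) -> Optional[str]:
        """Look up one key: binary-search the index, then scan one block."""
        pos = bisect.bisect_right(self._index, key) - 1
        if pos < 0:
            return None
        for k, v in self._blocks[pos]:
            if k == key:
                return v
        return None

    def scan(self, start: str, end: str) -> Iterator[Tuple[str, str]]:
        """Yield pairs with start <= key < end, in key order."""
        first = max(bisect.bisect_right(self._index, start) - 1, 0)
        for block in self._blocks[first:]:
            for k, v in block:
                if k >= end:
                    return
                if k >= start:
                    yield k, v


sst = SSTable({"a": "1", "b": "2", "c": "3", "d": "4", "e": "5"})
print(sst.get("c"), list(sst.scan("b", "e")))
```

There is deliberately no method to modify the map after construction, mirroring the immutability stated above.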
Building blocks: Tablets and Tables
• Dynamically partitioned range of rows
• Built from multiple SSTables
[Figure: a tablet (start: aardvark, end: apple) built from two SSTables, each a sequence of 64kB blocks plus an index]
• Multiple tablets make up a table
• SSTables can be shared between tablets
[Figure: two tablets, one covering rows aardvark to apple and one covering applepie to boat, drawn over a set of SSTables]
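
One way to picture SSTable sharing is a tablet split in which both halves keep pointing at the same underlying files and simply restrict themselves to their own row range. The sketch below assumes inclusive start and end rows (as the figure labels suggest) and uses plain dictionaries in place of SSTables; `Tablet` and `split` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Tablet:
    start: str                      # first row key served (inclusive)
    end: str                        # last row key served (inclusive)
    sstables: List[Dict[str, str]]  # oldest first; dicts stand in for SSTables

    def get(self, row: str) -> Optional[str]:
        """Return the newest value for row if it lies in this tablet's range."""
        if not (self.start <= row <= self.end):
            return None
        for sstable in reversed(self.sstables):
            if row in sstable:
                return sstable[row]
        return None


def split(tablet: Tablet, left_end: str, right_start: str) -> Tuple[Tablet, Tablet]:
    """Split a tablet in two; both halves reference the same SSTables."""
    left = Tablet(tablet.start, left_end, tablet.sstables)
    right = Tablet(right_start, tablet.end, tablet.sstables)
    return left, right


shared = {"apple": "1", "banana": "2"}
left, right = split(Tablet("aardvark", "boat", [shared]), "apple", "applepie")
print(left.get("apple"), right.get("banana"), left.get("banana"))
```

No data is copied by the split; the range check in `get` is what keeps each tablet from serving rows that belong to its sibling.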
Notes on the architecture
• Similar to GFS
  • Single master server, multiple tablet servers
• BigTable master
  • Assigns tablets to tablet servers
  • Detects addition and expiration of tablet servers
  • Balances tablet server load
  • Handles garbage collection
  • Handles schema evolution
• BigTable tablet servers
  • Each tablet server manages a set of tablets
    • Typically between ten and a thousand tablets
    • Each 100-200MB by default
  • Handles read and write requests to the tablets
  • Splits tablets when they grow too large
Location dereferencing
[Figure: a Chubby file points to the root tablet (1st metadata level), which points to the other metadata tablets, which in turn point to the tablets of user tables 1 to n]
Chubby: replicated, persistent lock service; maintains tablet server locations
root tablet: root of the metadata tree
at most three levels in the metadata hierarchy
B-tree like structure, indexed by table identifier and end row
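
Indexing by table identifier and end row means a lookup is a search for the first entry whose end row is not smaller than the requested row. The sketch below flattens the three-level hierarchy into a single sorted list to show just that step; the table names, rows and server names are invented for illustration.

```python
import bisect
from typing import List, Optional, Tuple

# Metadata entries sorted by (table identifier, end row).
# All table and server names here are made up.
METADATA: List[Tuple[Tuple[str, str], str]] = [
    (("users", "apple"), "tabletserver-1"),
    (("users", "boat"), "tabletserver-2"),
    (("users", "zebra"), "tabletserver-1"),
    (("webtable", "com.cnn.www"), "tabletserver-3"),
]
KEYS = [key for key, _ in METADATA]


def locate(table: str, row: str) -> Optional[str]:
    """Find the server of the tablet whose end row is the first one >= row."""
    pos = bisect.bisect_left(KEYS, (table, row))
    if pos == len(KEYS) or KEYS[pos][0] != table:
        return None  # row lies past the last tablet of this table
    return METADATA[pos][1]


print(locate("users", "applepie"))
```

In the real hierarchy the same kind of search would be repeated once per level, starting from the location stored in Chubby.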
Tablet assignment
• Master keeps track of
  • Set of live tablet servers
  • Assignment of tablets to tablet servers
  • Unassigned tablets
• Each tablet is assigned to one tablet server at a time
  • Tablet server maintains an exclusive lock on a file in Chubby
  • Master monitors tablet servers and handles assignment
• Changes to tablet structure
  • Table creation/deletion (master initiated)
  • Tablet merging (master initiated)
  • Tablet splitting (tablet server initiated)
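
The master's bookkeeping can be sketched with three pieces of state, matching the three things it keeps track of. The least-loaded placement policy and the `server_expired` hook are assumptions made for this example; the slides do not say how the master picks a server or how lock loss is reported.

```python
from typing import Dict, List, Set


class Master:
    """Toy master tracking live servers, assignments and unassigned tablets."""

    def __init__(self) -> None:
        self.live_servers: Set[str] = set()
        self.assignment: Dict[str, str] = {}  # tablet -> tablet server
        self.unassigned: List[str] = []

    def load(self, server: str) -> int:
        """Number of tablets currently assigned to a server."""
        return sum(1 for s in self.assignment.values() if s == server)

    def assign_all(self) -> None:
        """Give each unassigned tablet to the least-loaded live server."""
        while self.unassigned and self.live_servers:
            tablet = self.unassigned.pop(0)
            target = min(sorted(self.live_servers), key=self.load)
            self.assignment[tablet] = target

    def server_expired(self, server: str) -> None:
        """A server lost its lock: its tablets become unassigned again."""
        self.live_servers.discard(server)
        lost = sorted(t for t, s in self.assignment.items() if s == server)
        for tablet in lost:
            del self.assignment[tablet]
        self.unassigned.extend(lost)


master = Master()
master.live_servers = {"a", "b"}
master.unassigned = ["t1", "t2", "t3"]
master.assign_all()
print(master.assignment)
```

Because a tablet is a key of `assignment`, it can map to only one server at a time, which is the invariant stated above.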
Tablet serving and I/O flow
[Figure: writes go to the tablet log in GFS and to the memtable in memory; reads are served from the memtable and from the SSTables stored in GFS]
write operations are logged (in redo records)
recent updates kept sorted in main memory
memtable and SSTables are merged to serve the read request
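
The write and read paths can be sketched as below. Lists and dictionaries stand in for the log and the SSTables in GFS, the class name is hypothetical, and the memtable is an ordinary dictionary that is sorted only when scanned, which simplifies the "kept sorted" property.

```python
from typing import Dict, List, Optional, Tuple


class TabletServer:
    """Toy write and read path for a single tablet."""

    def __init__(self, sstables: List[Dict[str, str]]) -> None:
        self.log: List[Tuple[str, str]] = []  # redo records, appended first
        self.memtable: Dict[str, str] = {}    # recent updates, in memory
        self.sstables = sstables              # oldest first, never modified

    def write(self, key: str, value: str) -> None:
        self.log.append((key, value))  # 1. log the mutation
        self.memtable[key] = value     # 2. apply it to the memtable

    def read(self, key: str) -> Optional[str]:
        """Merged view: memtable first, then SSTables newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    def scan(self) -> List[Tuple[str, str]]:
        """Merged, sorted view of every key."""
        merged: Dict[str, str] = {}
        for sstable in self.sstables:
            merged.update(sstable)
        merged.update(self.memtable)
        return sorted(merged.items())


server = TabletServer([{"a": "old", "b": "2"}])
server.write("a", "new")
print(server.read("a"), server.read("b"), server.scan())
```

Note that a write never touches an SSTable: the newer value in the memtable simply shadows the older one at read time.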
Tablet management
• Minor compaction
  • Converts the memtable into an SSTable
  • Reduces memory footprint and log traffic on restart
• Merging compaction
  • Reads the contents of a few SSTables and the memtable, and writes out a new SSTable
  • Reduces number of SSTables
• Major compaction
  • Merging compaction that results in only one SSTable
  • No deletion records, only live data
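
The three compactions differ mainly in what they read and whether deletion records survive. In the sketch below dictionaries stand in for SSTables, `None` is a hypothetical representation of a deletion record, and merging the newest SSTables (rather than an arbitrary few) is an assumption made to keep newer values winning over older ones.

```python
from typing import Dict, List, Optional

TOMBSTONE = None  # deletion record (hypothetical representation)
Run = Dict[str, Optional[str]]


def minor_compaction(memtable: Run, sstables: List[Run]) -> None:
    """Freeze the memtable into a new SSTable and empty it."""
    sstables.append(dict(sorted(memtable.items())))
    memtable.clear()


def merging_compaction(memtable: Run, sstables: List[Run], count: int) -> None:
    """Merge the `count` newest SSTables and the memtable into one SSTable.

    Deletion records are kept: older SSTables may still hold the deleted key.
    """
    start = max(len(sstables) - count, 0)
    merged: Run = {}
    for run in sstables[start:]:
        merged.update(run)  # newer runs overwrite older ones
    merged.update(memtable)
    sstables[start:] = [dict(sorted(merged.items()))]
    memtable.clear()


def major_compaction(sstables: List[Run]) -> None:
    """Merge everything into one SSTable holding only live data."""
    merged: Run = {}
    for run in sstables:
        merged.update(run)
    live = {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
    sstables[:] = [live]


sstables: List[Run] = [{"a": "1", "b": "2"}]
memtable: Run = {"b": TOMBSTONE, "c": "3"}
minor_compaction(memtable, sstables)
memtable["d"] = "4"
merging_compaction(memtable, sstables, 1)
after_merge = [dict(run) for run in sstables]
major_compaction(sstables)
print(after_merge, sstables)
```

Only the major compaction can drop the deletion record for `b`, because only then is it certain that no older SSTable still contains the key.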
Databases and MapReduce Hive and Pig
High-level data processing
• Hive: data warehousing application in Hadoop
  • Query language is HQL, a variant of SQL
  • Tables stored on HDFS as flat files
  • Developed by Facebook, now open source
• Pig: large-scale data processing system
  • Scripts are written in Pig Latin, a dataflow language
  • Developed by Yahoo!, now open source
  • Accounts for roughly 1/3 of all Yahoo! internal jobs
• Common idea
  • Provide higher-level language to facilitate large-data processing
  • Higher-level language is compiled to Hadoop jobs
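
To give a feel for what "compiled to Hadoop jobs" means, the sketch below hand-translates one grouping query into a map phase, a shuffle and a reduce phase. It is plain Python simulating the three steps on a list, not the plan Hive or Pig would actually generate; the `visits` data and all function names are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Query being illustrated: SELECT page, COUNT(*) FROM visits GROUP BY page


def map_phase(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, int]]:
    """Emit (grouping key, 1) for every (user, page) input record."""
    for _user, page in records:
        yield page, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group mapper output by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Aggregate each group; here the aggregate is COUNT(*)."""
    return {key: sum(values) for key, values in groups.items()}


visits = [("amy", "/home"), ("bob", "/home"), ("amy", "/news")]
print(reduce_phase(shuffle(map_phase(visits))))
```

The one-line query replaces all three functions, which is the productivity argument for the higher-level languages.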
Hive: background and components
• Started at Facebook [1]
  • Data was collected by nightly cron jobs into Oracle DB
  • Extract-transform-load (ETL) via hand-coded Python
  • Grew from 10s of GBs (2006) to 1TB/day new data (2007), now 10x that
• Shell: allows interactive queries
• Driver: session handles, fetch, execute
• Compiler: parse, plan, optimize
• Execution engine: DAG of stages (MR, HDFS, metadata processing)
• Metastore: schema, location in HDFS, SerDe

[1] It had to be good for something apart from wasting my PhD students' time
Logical and physical models
• Tables
  • Typed columns (int, float, string, boolean)
  • Also: list, map
• Partitions
  • For example, range-partition tables by date
• Buckets
  • Hash partitions within ranges (useful for sampling, join optimization)
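
The relationship between partitions and buckets can be sketched as a two-step placement: the date picks the partition, then a hash of some column picks the bucket inside it. The directory layout, the choice of CRC32 and the `user_id` column are illustrative assumptions, not Hive's actual storage format or hash function.

```python
import zlib
from typing import Tuple


def storage_location(table: str, date: str, user_id: str,
                     num_buckets: int = 4) -> Tuple[str, int]:
    """Return (partition directory, bucket number) for one row.

    The layout and the hash function are illustrative choices only.
    """
    partition_dir = f"{table}/date={date}"
    bucket = zlib.crc32(user_id.encode("utf-8")) % num_buckets
    return partition_dir, bucket


print(storage_location("visits", "2010-01-01", "amy"))
```

A query restricted to one date only needs to read one partition, and reading a single bucket of that partition yields a sample of roughly 1/num_buckets of its rows.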