Google Bigtable
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber
Google, Inc.

UWCS OS Seminar Discussion
Erik Paulson
2 October 2006

See also the (other) UW presentation by Jeff Dean in September 2005 (see the link on the seminar page, or just google for “google bigtable”)

Before we begin…
• Intersection of databases and distributed systems
• Will try to explain (or at least warn) when we hit a patch of database
• Remember this is a discussion!

Google Scale
• Lots of data
  – Copies of the web, satellite data, user data, email and USENET, Subversion backing store
• Many incoming requests
• No commercial system big enough
  – Couldn’t afford it if there was one
  – Might not have made appropriate design choices
• Firm believers in the End-to-End argument
• 450,000 machines (NYTimes estimate, June 14th, 2006)

Building Blocks
• Scheduler (Google WorkQueue)
• Google Filesystem
• Chubby Lock service
• Two other pieces helpful but not required
  – Sawzall
  – MapReduce (despite what the Internet says)
• BigTable: build a more application-friendly storage service using these parts

Google File System
• Large-scale distributed “filesystem”
• Master: responsible for metadata
• Chunk servers: responsible for reading and writing large chunks of data
• Chunks replicated on 3 machines, master responsible for ensuring replicas exist
• SOSP ’03 Paper
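The master's job of "ensuring replicas exist" can be pictured with a small sketch. This is only an illustration under stated assumptions, not how GFS is implemented: the function name `plan_re_replication`, the dictionary layout, and the server names are all hypothetical, and real placement decisions involve far more than picking the first free server.

```python
from typing import Dict, List, Set

REPLICATION_TARGET = 3  # from the slide: chunks are replicated on 3 machines


def plan_re_replication(
    chunk_locations: Dict[str, Set[str]],
    live_servers: Set[str],
) -> Dict[str, List[str]]:
    """For each under-replicated chunk, list servers that should get a new copy."""
    plan: Dict[str, List[str]] = {}
    for chunk, holders in chunk_locations.items():
        alive = holders & live_servers
        if not alive:
            continue  # no surviving replica to copy from; out of scope here
        missing = REPLICATION_TARGET - len(alive)
        if missing <= 0:
            continue  # already at (or above) the target
        # Candidates are live servers without a copy; sorted only for determinism.
        candidates = sorted(live_servers - alive)
        plan[chunk] = candidates[:missing]
    return plan
```

With chunk `c2` held by `s1`, `s4`, and a dead `s5`, the plan asks one more live server to take a copy, while a fully replicated chunk is left alone.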

Chubby
• {lock/file/name} service
• Coarse-grained locks, can store small amount of data in a lock
• 5 replicas, need a majority vote to be active
• Also an OSDI ’06 Paper
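The "5 replicas, need a majority" rule is just quorum arithmetic. The helper names below (`majority`, `cell_is_active`) are made up for illustration and are not part of any Chubby API.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def cell_is_active(replicas: int, reachable: int) -> bool:
    """A cell can make progress only while a majority of replicas is reachable."""
    return reachable >= majority(replicas)
```

With 5 replicas the majority is 3, so the service tolerates two replica failures but not three.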

Data model: a big map
• <Row, Column, Timestamp> triple for key - lookup, insert, and delete API
• Arbitrary “columns” on a row-by-row basis
• Column family:qualifier. Family is heavyweight, qualifier lightweight
• Column-oriented physical store - rows are sparse!
• Does not support a relational model
• No table-wide integrity constraints
• No multirow transactions
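To make the "big map" concrete, here is a minimal in-memory sketch keyed by the <Row, Column, Timestamp> triple. It is an assumption-laden toy, not the Bigtable client API: the class name `BigMap`, its method signatures, and the sample row and column names are all hypothetical, and returning the newest version when no timestamp is given is a choice made for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# (row, "family:qualifier", timestamp) -> value
Key = Tuple[str, str, int]


class BigMap:
    """Hypothetical stand-in for the map described on the slide."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def insert(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def lookup(self, row: str, column: str,
               timestamp: Optional[int] = None) -> Optional[bytes]:
        """Return the value at `timestamp`, or the newest version if it is None."""
        if timestamp is not None:
            return self._cells.get((row, column, timestamp))
        versions = [ts for (r, c, ts) in self._cells if r == row and c == column]
        if not versions:
            return None
        return self._cells[(row, column, max(versions))]

    def delete(self, row: str, column: str, timestamp: int) -> None:
        self._cells.pop((row, column, timestamp), None)

    def columns(self, row: str) -> List[str]:
        """Rows are sparse: each row carries only the columns written to it."""
        return sorted({c for (r, c, _) in self._cells if r == row})
```

Two rows can hold entirely different columns, which is what "arbitrary columns on a row-by-row basis" means here; nothing in the sketch spans more than one row, matching the lack of multirow transactions.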

SSTable
• Immutable, sorted file of key-value pairs
• Chunks of data plus an index
  – Index is of block ranges, not values
[Figure: an SSTable made up of 64K blocks plus an index]
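The point that the index covers block ranges rather than individual values can be sketched as follows. This is a toy under loud assumptions: the class below keeps everything in memory, counts block size in entries instead of 64K of bytes, assumes unique keys, and uses the first key of each block as the index entry; the real on-disk format is not described on the slide.

```python
import bisect
from typing import List, Optional, Tuple


class SSTable:
    """Toy immutable sorted table. Block size is counted in entries, not bytes."""

    def __init__(self, items: List[Tuple[str, bytes]],
                 entries_per_block: int = 2) -> None:
        ordered = sorted(items)  # assumes keys are unique
        # Cut the sorted pairs into blocks (stand-ins for the 64K blocks).
        self._blocks = [
            tuple(ordered[i:i + entries_per_block])
            for i in range(0, len(ordered), entries_per_block)
        ]
        # The index holds one entry per block (its first key), not one per value.
        self._index = [block[0][0] for block in self._blocks]

    @property
    def index(self) -> List[str]:
        return list(self._index)

    def get(self, key: str) -> Optional[bytes]:
        # Pick the last block whose first key is <= key, then scan only that block.
        pos = bisect.bisect_right(self._index, key) - 1
        if pos < 0:
            return None
        for k, v in self._blocks[pos]:
            if k == key:
                return v
        return None
```

A lookup touches the small index plus a single block, so the index stays small no matter how many values each block holds.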

Tablet
• Contains some range of rows of the table
• Built out of multiple SSTables
[Figure: a tablet (Start: aardvark, End: apple) built from two SSTables, each made up of 64K blocks plus an index]
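A tablet can be sketched as a row range plus a list of SSTables. Several things here are assumptions rather than facts from the slide: plain dicts stand in for SSTables, both ends of the range are treated as inclusive, and when a row appears in more than one SSTable the later one in the list wins.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Tablet:
    start: str  # first row covered (inclusive, by assumption)
    end: str    # last row covered (inclusive, by assumption)
    # Dicts stand in for SSTables; later entries are assumed to be newer.
    sstables: List[Dict[str, bytes]] = field(default_factory=list)

    def covers(self, row: str) -> bool:
        return self.start <= row <= self.end

    def get(self, row: str) -> Optional[bytes]:
        if not self.covers(row):
            return None
        # Consult the newest SSTable first so newer data shadows older data.
        for sstable in reversed(self.sstables):
            if row in sstable:
                return sstable[row]
        return None
```

Using the range from the figure, `Tablet("aardvark", "apple", [...])` answers for "ant" but not for "banana".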

Table
• Multiple tablets make up the table
• SSTables can be shared
• Tablets do not overlap, SSTables can overlap
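Because tablets do not overlap, finding the tablet for a row reduces to a binary search over tablet start keys. The sketch below illustrates only that routing step; `TabletDirectory`, the tuple layout, the inclusive range ends, and the tablet names are hypothetical, and SSTable sharing is not modeled.

```python
import bisect
from typing import List, Optional, Tuple

# (start_row, end_row, tablet_name); both ends inclusive by assumption
TabletRange = Tuple[str, str, str]


class TabletDirectory:
    def __init__(self, ranges: List[TabletRange]) -> None:
        self._ranges = sorted(ranges)
        # Enforce the slide's rule: tablets do not overlap.
        for (_, left_end, _), (right_start, _, _) in zip(self._ranges, self._ranges[1:]):
            if left_end >= right_start:
                raise ValueError("tablets must not overlap")
        self._starts = [start for start, _, _ in self._ranges]

    def locate(self, row: str) -> Optional[str]:
        """Return the name of the tablet covering `row`, or None if there is none."""
        pos = bisect.bisect_right(self._starts, row) - 1
        if pos < 0:
            return None
        _, end, name = self._ranges[pos]
        return name if row <= end else None
```

The non-overlap check is what makes the answer unique: at most one tablet can claim any given row.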