TriHUG 3/14: HBase in Production
Posted on 23-Aug-2014
HBase In Production
Hey, we're hiring!
Contents
● Bronto Overview
● HBase Architecture
● Operations
● Table Design
● Questions?
Bronto Overview
Bronto Software provides a cloud-based marketing platform for organizations to drive revenue through their email, mobile, and social campaigns.
Bronto Contd.
● ESP for e-commerce retailers
● Our customers are marketers
● Charts, graphs, reports
● Market segmentation
● Automation
● We are also hiring
Where We Use HBase
● High-volume scenarios
● Realtime data
● Batch processing
● HDFS staging area
● Sorting/indexing not a priority
  ○ We are working on this
HBase Overview
● Implementation of Google's BigTable
● Sparse, sorted, versioned map
● Built on top of HDFS
● Row-level ACID
● Get, Put, Scan
● Assorted RMW (read-modify-write) operations
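The "sparse, sorted, versioned map" can be modeled, purely for intuition, as nested maps. A toy sketch (MiniTable and its methods are hypothetical illustrations, not the HBase client API):

```python
class MiniTable:
    """Toy model of HBase's data model: a sorted map of
    row -> family -> qualifier -> timestamp -> value.
    Sparse: absent cells simply have no entry."""

    def __init__(self):
        self.rows = {}  # row key (bytes) -> families

    def put(self, row, family, qualifier, value, ts):
        fam = self.rows.setdefault(row, {}).setdefault(family, {})
        fam.setdefault(qualifier, {})[ts] = value

    def get(self, row, family, qualifier):
        # Like a default Get: return only the newest version
        versions = self.rows.get(row, {}).get(family, {}).get(qualifier, {})
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start, stop):
        # Rows come back in lexicographic key order, [start, stop)
        for key in sorted(self.rows):
            if start <= key < stop:
                yield key, self.rows[key]
```

The nesting mirrors the talk's later points: families group qualifiers, and every cell carries a timestamp for versioning.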
Tables Overview
Tables are sorted (lexicographic) key-value pairs of uninterpreted byte[]s. The keyspace is divided into regions of contiguous keys. Each region is hosted by exactly one machine.
Table Overview

Key | Value
----|-------
a   | byte[]
aa  | byte[]
b   | byte[]
bb  | byte[]
c   | byte[]
ca  | byte[]

Region boundaries:
R1: [a, b)
R2: [b, c)
R3: [c, d)

(Diagram: regions R1–R3 assigned to region servers; each region is hosted by exactly one server.)
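Finding the region that serves a given key is a range lookup over the sorted region start keys. A minimal sketch, using the region boundaries from the slide:

```python
import bisect

# Region start keys, sorted; each region covers [start, next_start)
region_starts = [b"a", b"b", b"c"]
region_names = ["R1", "R2", "R3"]

def region_for(key: bytes) -> str:
    """Find the region whose [start, end) range contains key."""
    i = bisect.bisect_right(region_starts, key) - 1
    return region_names[i]
```

For example, `region_for(b"aa")` falls in R1 because `a <= aa < b` lexicographically.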
Operations
● Layers of complexity
● Normal failure modes
  ○ Hardware dies (or combusts)
  ○ Human error
● JVM
● HDFS considerations
● Lots of knobs
Cascading Failure
1. High write volume fragments heap
2. GC promotion failure
3. Stop-the-world GC
4. ZK timeout
5. Receive YouAreDeadException, die
6. Failover
7. Goto 1
Useful Tunings
● MSLAB enabled
● hbase.regionserver.handler.count
  ○ Increasing it puts more IO load on the RS
  ○ 50 is our sweet spot
● JVM tuning
  ○ -XX:+UseConcMarkSweepGC
  ○ -XX:+UseParNewGC
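These tunings would land in hbase-site.xml and the region server's JVM options. A hedged sketch (the values are the talk's sweet spots, not universal defaults; MSLAB is controlled by `hbase.hregion.memstore.mslab.enabled`):

```
<!-- hbase-site.xml -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>50</value> <!-- more handlers = more IO load on the RS -->
</property>
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value> <!-- MSLAB fights heap fragmentation from heavy writes -->
</property>

# hbase-env.sh -- GC flags from the slide
export HBASE_REGIONSERVER_OPTS="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC"
```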
Monitoring Tools
● Nagios for hardware checks
● Cloudera Manager
  ○ Reporting and health checks
  ○ Apache Ambari and MapR provide similar tools
● Hannibal + custom scripts
  ○ Identify hot regions for splitting
Table Design
● Table design is deceptively simple
● Main considerations:
  ○ Row key structure
  ○ Number of column families
● Know your queries in advance
Additional Context
● SaaS environment
  ○ "Twitter clone" model won't work
● Thousands of users, millions of attributes
● Skewed customer base
  ○ Biggest clients have 10MM+ contacts
  ○ Smallest have thousands
Row Keys
● Most important decision
● The only (native) index in HBase
● Random reads and writes are fast
  ○ Sorted on disk and in memory
  ○ Bloom filters speed read performance (not in use)
Hotspotting
● Associated with monotonically increasing keys
  ○ e.g. MySQL AUTO_INCREMENT
● Writes lock onto one region at a time
● Consequences:
  ○ Flush and compaction storms
  ○ $500K cluster limited by a $10K machine
Row Key Advice
● Read/write ratio should drive design
  ○ We pay a write-time penalty for faster reads
● Identify the queries you need to support
● Consider composite keys instead of indexes
● Bucketed/salted keys are an option
  ○ Distribute writes across N buckets
  ○ Rebucketing is difficult
  ○ Requires N reads, slow workers
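Bucketing/salting can be sketched as prefixing a deterministic bucket to each key, so monotonically increasing keys spread across N regions instead of hammering one. A hypothetical sketch (the bucket count and key layout are assumptions, not Bronto's implementation):

```python
import hashlib

N_BUCKETS = 8  # assumed bucket count

def bucket_of(row_key: bytes) -> int:
    # Stable hash: the same key always lands in the same bucket,
    # which is what makes reads possible at all
    return int(hashlib.md5(row_key).hexdigest(), 16) % N_BUCKETS

def salted_key(row_key: bytes) -> bytes:
    # e.g. b"03::2014-03-14" -- sequential keys now scatter
    # across N_BUCKETS regions instead of one hot region
    return ("%02d::" % bucket_of(row_key)).encode() + row_key
```

The write side is cheap; the read side is the slide's caveat: a range scan now needs N scans, one per bucket prefix, and changing N_BUCKETS later means rewriting keys.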
Variable Width Keys
customer_hash::email
● Allows scans for a single customer
● Hashed id distributes customers
● Sorted by email address
  ○ Could also use reverse domain for gmail, yahoo, etc.
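A sketch of that key scheme (the hash width and helper names are illustrative assumptions):

```python
import hashlib

def variable_key(customer_id: str, email: str) -> bytes:
    """customer_hash::email -- the hash spreads customers across
    regions; within one customer, rows sort by email address."""
    customer_hash = hashlib.md5(customer_id.encode()).hexdigest()[:8]
    return ("%s::%s" % (customer_hash, email)).encode()

def customer_scan_range(customer_id: str):
    """All of one customer's rows share the hash prefix, so one
    scan over [prefix + '::', prefix + ':;') covers them all
    (';' is the byte after ':', closing the range)."""
    prefix = hashlib.md5(customer_id.encode()).hexdigest()[:8]
    return (prefix + "::").encode(), (prefix + ":;").encode()
```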
Fixed Width Keys
site::contact::create::email
● FuzzyRowFilter
  ○ Can fix site, contact, and reverse_create
  ○ Can search for any email address
  ○ Could use a fixed-width encoding for domain
    ■ Search for just gmail, yahoo, etc.
● Distributes sites and users
● Contacts sorted by create date
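Fixed-width components are what make FuzzyRowFilter possible: every field sits at a known byte offset, so any field can be fixed or wildcarded. A sketch of the packing (the 8-byte field widths and the hashed email field are assumptions):

```python
import struct

MAX_LONG = 2**63 - 1

def fixed_key(site_id: int, contact_id: int,
              create_ts: int, email_hash: int) -> bytes:
    """site::contact::reverse_create::email, each 8 bytes wide.
    Big-endian packing makes byte order match numeric order;
    reversing the timestamp (MAX - ts) sorts newest first."""
    return struct.pack(">QQQQ", site_id, contact_id,
                       MAX_LONG - create_ts, email_hash)
```

Because every key is exactly 32 bytes, a filter can fix bytes 0-23 (site, contact, reverse_create) and treat bytes 24-31 as "any email".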
Column Families
● Groupings of named columns
● Versioning, compression, TTL
● Different from BigTable
  ○ BigTable: 100s
  ○ HBase: 1 or 2
Column Family Example

Id       | d {VERSIONS => 2}                  | s7 {TTL => 604800}
         | a (address)     | p (phone)        | o:3-27 (open) | c:3-20 (click)
---------|-----------------|------------------|---------------|---------------
dfajkdh  | byte[]          | byte[]:555-5555  |               | byte[]
hnvdzu9  | byte[]:1234 St. | XXXX             |               |
hnvdzu9  | byte[]:1233 St. |                  |               |
hnvdzu9  |                 |                  | XXXX          | byte[]
er9asyjk | byte[]: 324 Ave |                  |               |
Column Family Example
● PROTIP: Keep CF and qualifier names short
  ○ They are repeated on disk for every cell
● "d" supports 2 versions of each column; maps to demographics
● "s7" has a seven-day TTL; maps to stats kept for 7 days
Column Families In Depth

(Diagram: region my_table,,1328551097416.12921bbc0c91869f88ba6a044a6a1c50. with two column families f1 and f2; each family has its own MemStore and its own StoreFiles (s1, s2, s3) in HDFS.)

● StoreFile(s) for each CF in a region
● Sparse
● One memstore per CF
  ○ Must flush together
● Compactions happen at the region level
Compactions
● Rewrites StoreFiles
  ○ Improves read performance
  ○ IO intensive
● Region scope
● Used to take > 50 hours
● Custom script took it down to 18
  ○ Can (theoretically) run during the day
Compaction Before and After

(Diagram: before, family f1 in region my_table,,1328551097416.12921bbc0c91869f88ba6a044a6a1c50. holds StoreFiles s1–s6 plus the MemStore; after a k-way merge compaction, a single StoreFile s1 remains.)
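The k-way merge at the heart of a compaction can be sketched with a heap: each StoreFile is already sorted, so merging k of them streams in a single pass. A simplified model that ignores tombstones and keeps one version per key (assuming files are passed newest-first):

```python
import heapq

def compact(store_files):
    """Merge k sorted StoreFiles (lists of (key, value) pairs) into
    one sorted StoreFile. heapq.merge streams its inputs, so memory
    stays bounded no matter how large the files are."""
    merged = []
    last_key = None
    for key, value in heapq.merge(*store_files, key=lambda kv: kv[0]):
        if key != last_key:
            # merge is stable, so with newest-first inputs the
            # first cell seen for a key is the one to keep
            merged.append((key, value))
            last_key = key
    return merged
```

This is why compactions are IO-intensive: every cell in every input file is read and rewritten, even if only one small file changed.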
The Table From Hell
● 19 column families
● 60% of our region count
● Skewed write pattern
  ○ KB-sized store files
  ○ Frequent compaction storms
  ○ hbase.hstore.compaction.min.size (HBASE-5461)
● Moved to its own cluster
And Yet...
● Cluster remained operational
  ○ Table is still in use today
● Met read and write demand
● Regions only briefly active
  ○ Row keys by date and customer
What Saved Us
● Keyed by customer and date
● Effectively write-once
  ○ Kept "active" region count low
● Custom compaction script
  ○ Skipped old regions
● More hardware
● Were able to selectively migrate
Column Family Advice
● Bad choice for fine-grained partitioning
● Good for:
  ○ Similarly typed data
  ○ Varying versioning/retention requirements
● Prefer intra-row scans
  ○ CF and qualifiers are sorted
  ○ ColumnRangeFilter
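Because qualifiers within a row are stored sorted, an intra-row column range behaves like a slice over sorted qualifiers. A toy model of what a ColumnRangeFilter does server-side (the function itself is illustrative, not the HBase API):

```python
from bisect import bisect_left

def column_range(row_cells, min_col, max_col):
    """row_cells: dict of qualifier -> value for a single row.
    Return cells with min_col <= qualifier < max_col, in sorted
    qualifier order -- mimicking a min-inclusive, max-exclusive
    column range filter."""
    quals = sorted(row_cells)
    lo = bisect_left(quals, min_col)
    hi = bisect_left(quals, max_col)
    return [(q, row_cells[q]) for q in quals[lo:hi]]
```

With short prefixed qualifiers like `c:3-20` and `o:3-27`, a range such as `[b"c:", b"d")` pulls all click columns from a wide row without touching the rest.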
Questions?