Page 1: MapReduce over Snapshots


MapReduce over Snapshots

HBASE-8369

Enis Soztutar

Enis [at] apache [dot] org

@enissoz

Page 2: MapReduce over Snapshots

About Me


• In the Hadoop space since 2007

• Committer and PMC Member in Apache HBase and Hadoop

• Working at Hortonworks as member of Technical Staff

• Twitter: @enissoz

Page 3: MapReduce over Snapshots

Snapshots

• Currently, a snapshot is a set of reference files together with some metadata

• A table snapshot can contain:

– Table descriptor

– List of regions

– References to files in the regions

– References to WALs for region servers

• The current snapshot implementation is flush based

– Forces a flush on all regions, so that in-memory data is written to disk
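
As a concrete illustration (not from the slides), taking a snapshot from the Java client might look like the following; the table and snapshot names are made up, and the exact Admin API differs slightly across HBase versions:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class TakeSnapshot {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // Creates the reference files and metadata described above; the
      // flush-based implementation first flushes every region of the table.
      admin.snapshot("usertable_snapshot", "usertable");
    } finally {
      admin.close();
    }
  }
}
```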

Page 4: MapReduce over Snapshots

MR over Snapshots

• The idea is to do scans on the client side, bypassing region servers

• Use snapshots since they are immutable

• Similar to short-circuit HDFS reads

• TableSnapshotInputFormat works similarly to TableInputFormat

• TableMapReduceUtil provides methods to configure the job (see the sketch below)
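
A minimal sketch of the job setup, assuming the initTableSnapshotMapperJob variant discussed on HBASE-8369; the snapshot name, restore directory, and NoopMapper are placeholders, and the exact signature was still settling at the time of this talk:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  /** Trivial mapper: sees every row the client-side scan produces. */
  static class NoopMapper extends TableMapper<NullWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable rowKey, Result row, Context context) {
      // per-row processing goes here
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(HBaseConfiguration.create(), "scan-over-snapshot");
    job.setJarByClass(SnapshotScanJob.class);

    Scan scan = new Scan(); // start/stop rows, families, filters as usual

    // Configures TableSnapshotInputFormat instead of TableInputFormat; the
    // restore directory is where the snapshot references are materialized.
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "usertable_snapshot",               // snapshot name (made up here)
        scan,
        NoopMapper.class,
        NullWritable.class,                 // output key class
        NullWritable.class,                 // output value class
        job,
        true,                               // ship HBase jars with the job
        new Path("/tmp/snapshot_restore")); // temp restore dir (made up)

    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```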

Page 5: MapReduce over Snapshots

Deployment Options

HBase online

• Take a snapshot while HBase is running

• Run the MR job over the snapshot

HBase offline

• Take a snapshot while HBase is running

• Export the snapshot to a different HDFS cluster using ExportSnapshot (sketched below)

• Run the MR job over the snapshot, with or without HBase running
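
For the offline path, ExportSnapshot is a Hadoop Tool and can be driven from Java as well as from the command line; a hedged sketch, with the snapshot name and target cluster made up:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
import org.apache.hadoop.util.ToolRunner;

public class ExportSnapshotExample {
  public static void main(String[] args) throws Exception {
    // Copies the snapshot metadata plus the referenced hfiles to another
    // cluster, where the MR job can then run without HBase being online.
    int rc = ToolRunner.run(HBaseConfiguration.create(), new ExportSnapshot(),
        new String[] {
            "-snapshot", "usertable_snapshot",              // hypothetical name
            "-copy-to", "hdfs://backup-cluster:8020/hbase"  // hypothetical target
        });
    System.exit(rc);
  }
}
```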

Page 6: MapReduce over Snapshots

TableSnapshotInputFormat

• Gets a Scan representing the query

• Restores the snapshot to a temporary directory

• For each region in the snapshot:

– Determine whether the region should be scanned (i.e., whether it falls between the scan's start row and stop row)

– Create one split per region in the scan range (this sets the number of map tasks)

– Each RecordReader opens the region (HRegion) the same way an HRegionServer does

– An internal RegionScanner is used for running the scan
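
The region-filtering step boils down to an interval-overlap test. A sketch of that check (an illustration of the idea, not the actual TableSnapshotInputFormat code):

```java
import org.apache.hadoop.hbase.util.Bytes;

public class RegionRangeCheck {
  /**
   * A region [regionStart, regionEnd) gets a split when its key range
   * overlaps the scan range [startRow, stopRow). Empty byte arrays mean
   * "unbounded" on that side, as in HBase's own key-range conventions.
   */
  static boolean regionInScanRange(byte[] regionStart, byte[] regionEnd,
                                   byte[] startRow, byte[] stopRow) {
    boolean startsBeforeStopRow = stopRow.length == 0 || regionStart.length == 0
        || Bytes.compareTo(regionStart, stopRow) < 0;
    boolean endsAfterStartRow = startRow.length == 0 || regionEnd.length == 0
        || Bytes.compareTo(regionEnd, startRow) > 0;
    return startsBeforeStopRow && endsAfterStartRow;
  }
}
```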

Page 7: MapReduce over Snapshots

API
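
The mapper itself is a plain TableMapper, the same contract TableInputFormat uses, so existing mappers carry over. A hedged example that just counts rows (RowCountMapper and its counter are made-up names):

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;

public class RowCountMapper extends TableMapper<NullWritable, NullWritable> {
  public static enum Counters { ROWS }

  @Override
  protected void map(ImmutableBytesWritable rowKey, Result result, Context context)
      throws IOException, InterruptedException {
    // Each Result comes from the client-side RegionScanner reading the
    // snapshot files directly - no region server is involved.
    context.getCounter(Counters.ROWS).increment(1);
  }
}
```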

Page 8: MapReduce over Snapshots

Timeline

• Will (hopefully) be committed to trunk in the next week or so

• There is interest in bringing this to the 0.94 and 0.96 branches as well

• Will ship in HDP-2.1, which will be based on the 0.96 line

Page 9: MapReduce over Snapshots

Security Aspects

• The HBase user owns the files in the filesystem

• Snapshot files are also owned by the HBase user

• The MapReduce job should be able to read the files in the snapshot, plus the actual data files

• HDFS only has POSIX-like permissions based on user/group/other

– The user running the MR job has to either be the HBase user or have group permissions

– HDFS does not have ACLs, so there is no easy way to grant read access at the filesystem layer

• Idea: similar to the current short-circuit read implementation, we can implement a file-descriptor transfer

– Users will submit jobs under their own credentials

– Ask the HBase daemons to open the files, and pass back a handle / token

Page 10: MapReduce over Snapshots

Performance

ScanTest:

• Scan: open a scanner, do a full table scan

• SnapshotScan: open a client-side scanner, do a full table scan

• ScanMR: parallel full table scan from MR

• SnapshotScanMR: parallel full table scan from MR over the snapshot

• 8 region servers, 6 disks each

• HBase trunk

• Hadoop 2.2 (HDP-2.0.7.0-12)

• Load data with IntegrationTestBulkLoad

– Evenly distributed rows, created as bulk-loaded HFiles; 3 column families

• Number of store files per region varies: 3, 6, 9, and 12 (1, 2, 3, 4 files per store)

• Data sizes: 6.6G, 13.2G, 19.8G, 26.4G

Page 11: MapReduce over Snapshots

Scan speed

Page 12: MapReduce over Snapshots

API

• We do not want to limit snapshot scanning only to MapReduce

• Allow client-side scanners over snapshot files
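
A minimal sketch of such a client-side scan using TableSnapshotScanner; the restore directory and snapshot name are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.TableSnapshotScanner;

public class ClientSideScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Snapshot references are restored into this directory, then the
    // hfiles are read directly from HDFS - no region server in the path.
    TableSnapshotScanner scanner = new TableSnapshotScanner(
        conf, new Path("/tmp/snapshot_restore"), "usertable_snapshot", new Scan());
    try {
      for (Result result : scanner) {
        // process each row
      }
    } finally {
      scanner.close();
    }
  }
}
```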

Page 13: MapReduce over Snapshots

ResultScanner is the main scan API
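
For reference, the ResultScanner contract looks roughly like this (a slightly simplified rendition of the real interface), which is why snapshot scanning can slot in behind the same API that region-server-backed scans already use:

```java
import java.io.Closeable;
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;

// Simplified sketch of org.apache.hadoop.hbase.client.ResultScanner:
// callers iterate rows without caring whether they come from a region
// server or from snapshot files on HDFS.
public interface ResultScanner extends Closeable, Iterable<Result> {
  Result next() throws IOException;             // next row, or null when done
  Result[] next(int nbRows) throws IOException; // batch of up to nbRows rows
  void close();                                 // release scanner resources
}
```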

Page 14: MapReduce over Snapshots

API (caution: not final yet)

Page 15: MapReduce over Snapshots

To the future and beyond

• HBASE-8691 High-Throughput Streaming Scan API

• Can we bypass regionservers without taking snapshots?

• Bypass memstore data, or stream memstore data, but read directly from HFiles

• Secure reading from snapshots

• Keep up with the updates at https://issues.apache.org/jira/browse/HBASE-8369

Page 16: MapReduce over Snapshots

Thanks!

Questions?


Enis Söztutar

enis [at] apache [dot] org

@enissoz