MapR's Hadoop Distribution
Who am I?
• Keys Botzum • [email protected] • Senior Principal Technologist, MapR Technologies • MapR Federal and Eastern Region
http://www.mapr.com/company/events/speaking/pdb-10-16-12
Agenda
• What’s a Hadoop? • What’s MapR? • Enterprise Grade Hadoop • Making Hadoop More Open
Hadoop in 15 minutes
How to Scale? Big Data has Big Problems • Petabytes of data • MTBF on 1000s of nodes is < 1 day • Something is always broken • There are limits to scaling Big Iron • Sequential and random access just don’t scale
Example: Update 1% of 1TB
• Data consists of 10^10 records, each 100 bytes • Task: Update 1% of these records
Approach 1: Just Do It
• Each update involves read, modify and write • t = 1 seek + 2 disk rotations = 20 ms • 1% × 10^10 × 20 ms = 2 megaseconds ≈ 23 days
• Total time dominated by seek and rotation times
Approach 2: The “Hard” Way
• Copy the entire database 1GB at a time • Update records on the fly
• t = 2 × 1 GB / 100 MB/s + 20 ms ≈ 20 s • 10^3 × 20 s = 20,000 s ≈ 5.6 hours
• 100x faster to do 100x more work! • Moral: Read data sequentially even if you only want 1% of it
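Putting the two approaches side by side under the same assumptions (10^10 records of 100 bytes, roughly 20 ms per random access, roughly 100 MB/s sequential throughput):

    t_{\mathrm{random}} \approx 0.01 \times 10^{10} \times 20\,\mathrm{ms} = 2 \times 10^{6}\,\mathrm{s} \approx 23\ \mathrm{days}
    t_{\mathrm{sequential}} \approx 10^{3} \times \left( \frac{2 \times 1\,\mathrm{GB}}{100\,\mathrm{MB/s}} + 20\,\mathrm{ms} \right) \approx 2 \times 10^{4}\,\mathrm{s} \approx 5.6\ \mathrm{hours}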
MapReduce: A Paradigm Shift
• Distributed computing platform • Large clusters • Commodity hardware
• Pioneered at Google • BigTable, MapReduce and Google File System
• Commercially available as Hadoop
Hadoop
• Commodity hardware – thousands of nodes
• Handles Big Data – petabytes and more
• Sequential file access – each spindle provides data as fast as possible
• Sharding
  • Data distributed evenly across cluster
  • More spindles and CPUs working on different parts of the data set
• Reliability – self-healing (mostly), self-balancing
• MapReduce
  • Parallel computing framework
  • Function shipping
    § Moves the computation to the data rather than the typical reverse
    § Takes into account sharding
  • Hides most of the complexity from developers
Inside Map-Reduce
[Dataflow: Input → Map → Shuffle and sort → Reduce → Output]
Input: "The time has come," the Walrus said, "To talk of many things: Of shoes—and ships—and sealing-wax"
Map output: (the, 1), (time, 1), (has, 1), (come, 1), …
After shuffle and sort: (come, [3,2,1]), (has, [1,5,2]), (the, [1,2,1]), (time, [10,1,3]), …
Reduce output: (come, 6), (has, 8), (the, 4), (time, 14), …
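The same word-count dataflow expressed against the standard Hadoop MapReduce API (a minimal sketch; the class name and the input/output paths passed on the command line are illustrative):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
      // Map: emit (word, 1) for every word in the input split
      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
              word.set(token);
              context.write(word, ONE);
            }
          }
        }
      }

      // Reduce: the framework has already grouped the 1s by word (shuffle and sort);
      // sum them to get the final count
      public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) sum += v.get();
          context.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }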
Agenda
• What’s a Hadoop? • What’s MapR? • Enterprise Grade Hadoop • Making Hadoop More Open
The MapR Distribution for Apache Hadoop
• Commercial Hadoop Distribution
• Open, enterprise-grade distribution
  • Primarily leveraging open source components
  • Carefully targeted enhancements to make Hadoop more open and enterprise-grade
• Growing fast and a recognized leader
MapR in the Cloud
• Available as a service with Amazon Elastic MapReduce (EMR)
  • http://aws.amazon.com/elasticmapreduce/mapr
• Available as a service with Google Compute Engine
MapR Partners
Agenda
• What’s a Hadoop? • What’s MapR? • Enterprise Grade Hadoop • Making Hadoop More Open
MapR’s Complete Distribution for Apache Hadoop
• Integrated, tested, hardened and supported
• Integrated with Accumulo
• Runs on commodity hardware
• Open source with standards-based extensions for:
  • Security
  • File-based access
  • SQL-based access
• Easiest integration
• High availability
• Best performance
[Component diagram: MapR Control System; MapR Heatmap™; LDAP/NIS integration; quotas, alerts, alarms; CLI and REST API; Hive, Pig, Oozie, Sqoop, HBase, Whirr, Mahout, Cascading, Flume, ZooKeeper; Nagios and Ganglia integration; Direct Access NFS™; real-time streaming; volumes, mirrors, snapshots, data placement; No NameNode architecture; high-performance direct shuffle; stateful failover and self-healing; MapR Storage Services™; Accumulo]
Easy Management at Scale
• Health Monitoring
• Cluster Administration
• Application Resource Provisioning
Same information and tasks available via command line and REST
MapR: Lights Out Data Center Ready
Reliable Compute
• Automated stateful failover
• Automated re-replication
• Self-healing from HW and SW failures
• Load balancing
• Rolling upgrades
• No lost jobs or data
• 99.999% uptime

Dependable Storage
§ Business continuity with snapshots and mirrors
§ Recover to a point in time
§ End-to-end checksumming
§ Strong consistency
§ Built-in compression
§ Mirror across sites to meet Recovery Time Objectives
Storage Architecture
§ How does MapR manage storage and how is this different from generic Hadoop?
What is a Volume?
§ Like a sub-directory
  § Groups related dirs/files together
§ Contains file metadata for this volume
§ Mounted to form a global namespace
§ Logical unit of policy
Volumes help you manage data
Typical Volume Layout
Create lots of volumes, 100K volumes OK!
[Diagram: example volume layout – volumes mounted at /, /binaries, /var/mapr (local volumes), /projects (/build, /test), /hbase, and /users (/mjones, /jsmith)]
Volumes Let You Manage Data
§ Replication factor
§ Quotas
§ Load balancing
§ Snapshots
§ Mirrors
§ Data placement
§ Made of containers
  § A container is the sharding unit
  § 16-32 GB
Storage Architecture
§ Nodes
§ Disks
§ Storage Pools
§ Containers
  – Distributed across cluster
  – 16-32 GB
§ Volumes
No NameNode Architecture: Other Hadoop Distributions vs. MapR

Other Hadoop distributions:
§ HA requires specialized hardware and/or software
§ File scalability hampered by the NameNode bottleneck
§ Metadata must fit in memory

MapR:
§ HA with automatic failover and re-replication
§ Up to 1T files (> 5000x advantage)
§ Higher performance
§ 100% commodity hardware
§ Metadata is persisted to disk

[Diagram: other distributions rely on NameNode servers (metadata on a shared NAS appliance) to track blocks A-F on DataNodes; MapR distributes both data and metadata as container replicas across all cluster nodes, with no NameNode]
MapR Snapshots
§ Snapshots without data duplication
§ Saves space by sharing blocks
§ Lightning fast
§ Zero performance loss on writing to original
§ Scheduled, or on-demand
§ Easy recovery by user

[Diagram: redirect-on-write for snapshots – snapshots 1, 2, and 3 share unchanged data blocks with the read/write original]
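Because recovery is user-driven, an "oops" can often be undone with plain file copying. A minimal sketch, assuming the cluster is NFS-mounted under /mapr and that snapshots are exposed under a volume's .snapshot directory (the cluster, volume, snapshot, and file names below are illustrative, not from these slides):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class RestoreFromSnapshot {
      public static void main(String[] args) throws IOException {
        // Read-only copy of the file as it existed when the snapshot was taken
        Path snapshotCopy = Paths.get(
            "/mapr/my.cluster.com/projects/test/.snapshot/nightly-2012-10-15/report.csv");
        Path live = Paths.get("/mapr/my.cluster.com/projects/test/report.csv");
        // Copy the snapshotted version back over the damaged live file
        Files.copy(snapshotCopy, live, StandardCopyOption.REPLACE_EXISTING);
      }
    }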
MapR Mirroring/COOP Requirements
Business Continuity and Efficiency
Efficient design
§ Differential deltas are updated
§ Compressed and checksummed
Easy to manage
§ Scheduled or on-demand
§ WAN, remote seeding
§ Consistent point-in-time

[Diagram: volumes mirrored over the WAN from a production datacenter to a research datacenter and to the cloud (Google Compute Engine)]
Thought Questions • Consider a cluster with
• Petabytes of data • Hundreds or thousands of jobs running each day, creating new data • Many users and teams all using this cluster
• How do I back this up? • User “oops” protection
• How do I replicate data from one cluster to another in support of disaster recovery? • Protection from power outages, floods, fire, etc
Designed for Performance and Scale (MapR vs. Apache/CDH)

Terasort w/ 1x replication (no compression):
  Total:   MapR 24 min 34 sec | Apache/CDH 49 min 33 sec
  Map:     MapR 9 min 54 sec  | Apache/CDH 28 min 12 sec
  Shuffle: MapR 9 min 8 sec   | Apache/CDH 27 min 0 sec

Terasort w/ 3x replication (no compression):
  Total:   MapR 47 min 4 sec  | Apache/CDH 73 min 42 sec
  Map:     MapR 11 min 2 sec  | Apache/CDH 30 min 8 sec
  Shuffle: MapR 9 min 17 sec  | Apache/CDH 28 min 40 sec

DFSIO / local write:
  Throughput/node: MapR 870 MB/s | Apache/CDH 240 MB/s

YCSB (HBase benchmark, 50% read / 50% update):
  Throughput:    MapR 33,102 ops/sec | Apache/CDH 7,904 ops/sec
  Latency (r/u): MapR 2.9-4 ms / 0.4 ms | Apache/CDH 7-30 ms / 0-5 ms

YCSB (HBase benchmark, 95% read / 5% update):
  Throughput:    MapR 18K ops/sec | Apache/CDH 8,500 ops/sec
  Latency (r/u): MapR 5.5-5.7 ms / 0.6 ms | Apache/CDH 12-30 ms / 1 ms

HW: 10 servers, 2 x 4 cores (2.4 GHz), 11 x 2 TB disks, 32 GB RAM
Large Web 2.0 company
§ 1.4 PB user data
§ 900-1200 MapReduce jobs per day
§ 16 TB/day average IO through each server
§ 85-90% storage utilization (with snapshots)
§ Very low-end hardware (consumer drives)
§ 6B files on a single cluster (+ 3x replication)
§ 2000 servers targeted
§ No degradation during hardware failures
§ Heavy read/write/delete workload
§ 1.7K creates/sec/node

Response time (write/read/delete):
  Atomic workload: 7.8 / 4.5 / 8.7 ms
  Mixed workload:  6.6 / 4.9 / 9.1 ms
Customer Support
• 24x7x365 "Follow-The-Sun" coverage
• Critical customer issues are worked on around the clock
• Dedicated team of Hadoop engineering experts
• Contacting MapR support:
  • Email: [email protected] (automatically opens a case)
  • Phone: 1.855.669.6277
  • Self-service options:
    § http://answers.mapr.com/
    § Web portal: http://mapr.com/support
Two MapR Editions – M3 and M5
M5 Edition:
§ Control System
§ NFS Access
§ Performance
§ High Availability
§ Snapshots & Mirroring
§ 24 x 7 Support
§ Annual Subscription

M3 Edition:
§ Control System
§ NFS Access
§ Performance
§ Unlimited Nodes
§ Free
Also available through Amazon Elastic MapReduce and Google Compute Engine
Agenda
• What’s a Hadoop? • What’s MapR? • Enterprise Grade Hadoop • Making Hadoop More Open
Not All Applications Use the Hadoop APIs
• Applications and libraries that use files and/or SQL
  • These are not legacy applications, they are valuable applications
  • 30 years, 100,000s of applications, 10,000s of libraries, 10s of programming languages
• Applications and libraries that use the Hadoop APIs
Hadoop Needs Industry-Standard Interfaces
• Hadoop API: MapReduce and HBase applications, mostly custom-built
• NFS: file-based applications, supported by most operating systems
• ODBC: SQL-based tools, supported by most BI applications and query builders
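To make the "SQL-based tools" point concrete, here is a hedged sketch of a plain JDBC client querying Hive (which ships in the distribution); BI tools reach the same data over ODBC in the same spirit. The host, port, and clicks table are made-up examples, not something from these slides:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SqlAccessDemo {
      public static void main(String[] args) throws Exception {
        // Illustrative only: host/port/database/table are assumptions
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://clusternode:10000/default");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page")) {
          while (rs.next()) {
            // Each row comes back through standard JDBC, exactly as a BI tool would consume it
            System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
          }
        }
      }
    }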
NFS
Your Data is Important
§ HDFS-based Hadoop distributions do not (cannot) properly support NFS
§ Your data is important, it drives your business – make sure you can access it
  – Why store your data in a system which cannot be accessed by 95% of the world's applications and libraries?
Direct Access NFS™
[Diagram: file browsers access the cluster directly ("drag & drop"); applications do random reads and random writes and log directly into the cluster; standard Linux commands and tools (grep, sed, sort, tar) work on cluster data]
The NFS Protocol
§ RFC 1813
§ Very simple protocol
§ Random reads/writes
  – Read count bytes from offset offset of file file
  – Write buffer data to offset offset of file file
§ HDFS does not support random writes, so it cannot support NFS

    WRITE3res NFSPROC3_WRITE(WRITE3args) = 7;
    struct WRITE3args {
        nfs_fh3     file;
        offset3     offset;
        count3      count;
        stable_how  stable;
        opaque      data<>;
    };

    READ3res NFSPROC3_READ(READ3args) = 6;
    struct READ3args {
        nfs_fh3  file;
        offset3  offset;
        count3   count;
    };
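Because MapR supports random writes, a file in the cluster can be updated in place with ordinary file I/O through the NFS mount. A minimal sketch (the /mapr/my.cluster.com mount point, file path, and record layout are assumptions for illustration):

    import java.io.RandomAccessFile;

    public class RandomUpdate {
      public static void main(String[] args) throws Exception {
        // Update the 4200th 100-byte record of a file that lives in the cluster,
        // accessed through the NFS mount like any local file (path is illustrative)
        try (RandomAccessFile f =
                 new RandomAccessFile("/mapr/my.cluster.com/projects/test/records.dat", "rw")) {
          long offset = 4200L * 100;
          byte[] record = new byte[100];
          f.seek(offset);
          f.readFully(record);   // random read
          record[0] = 1;         // modify the record in place
          f.seek(offset);
          f.write(record);       // random write -- the operation plain HDFS cannot support
        }
      }
    }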
Hadoop Was Designed to Support Multiple Storage Layers
MapReduce runs on top of the Hadoop FileSystem API (the o.a.h.fs.FileSystem interface), with pluggable implementations:
• HDFS – o.a.h.hdfs.DistributedFileSystem
• S3 – o.a.h.fs.s3native.NativeS3FileSystem
• Local file system – o.a.h.fs.LocalFileSystem
• FTP – o.a.h.fs.ftp.FTPFileSystem
• MapR storage layer – com.mapr.fs.MapRFileSystem (also exposed through an NFS interface)
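A hedged sketch of what that interface boundary looks like to application code (the maprfs:/// path below is illustrative; any scheme with a registered FileSystem implementation behaves the same way):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FsApiDemo {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The URI scheme selects the implementation behind o.a.h.fs.FileSystem:
        // hdfs:// -> DistributedFileSystem, file:// -> LocalFileSystem,
        // maprfs:// -> com.mapr.fs.MapRFileSystem, and so on.
        Path path = new Path(args.length > 0 ? args[0] : "maprfs:///user/jsmith/data.txt");
        FileSystem fs = path.getFileSystem(conf);
        try (BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(path)))) {
          String line;
          while ((line = in.readLine()) != null) {
            System.out.println(line);   // application code never sees which storage layer is underneath
          }
        }
      }
    }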
One NFS Gateway
What about scalability and high availability?
Multiple NFS Gateways
Multiple NFS Gateways with Load Balancing
Multiple NFS Gateways with NFS HA (VIPs)
Customer Examples: Import/Export Data • Network security vendor
• Network packet captures from switches are streamed into the cluster • New pattern definitions are loaded into online IPS via NFS
• Online measurement company • Clickstreams from application servers are streamed into the cluster
• SaaS company • Exporting a database to Hadoop over NFS
• Ad exchange • Bids and transactions are streamed into the cluster
Customer Examples: Productivity and Operations
• Retailer
  • Operational scripts are easier with NFS than HDFS + MapReduce
    § chmod/chown, file system searches/greps, perl, awk, tab-complete
  • Consolidate object store with analytics
• Credit card company • User and project home directories on Linux gateways
§ Local files, scripts, source code, … § Administrators manage quotas, snapshots/backups, …
• Large Internet company recommendation system
  • Web servers serve MapReduce results (item relationships) directly from the cluster
• Email marketing company • Object store with HBase and NFS
Apache Drill Interactive Analysis of Large-Scale Datasets
Latency Matters
• Ad-hoc analysis with interactive tools
• Real-time dashboards
• Event/trend detection and analysis • Network intrusion analysis on the fly • Fraud • Failure detection and analysis
Big Data Processing

                      Batch processing     Interactive analysis       Stream processing
Query runtime         Minutes to hours     Milliseconds to minutes    Never-ending
Data volume           TBs to PBs           GBs to PBs                 Continuous stream
Programming model     MapReduce            Queries                    DAG
Users                 Developers           Analysts and developers    Developers
Google project        MapReduce            Dremel                     –
Open source project   Hadoop MapReduce     –                          Storm and S4
Introducing Apache Drill…
Innovations
• MapReduce
  • Scalable IO and compute trumps efficiency with today's commodity hardware
  • With large datasets, schemas and indexes are too limiting
  • Flexibility is more important than efficiency
  • An easy-to-use, scalable, fault-tolerant execution framework is key for large clusters
• Dremel
  • Columnar storage provides significant performance benefits at scale
  • Columnar storage with nesting preserves structure and can be very efficient
  • Avoiding final record assembly as long as possible improves efficiency
  • Optimizing for the query use case avoids the full generality of MapReduce and thus significantly reduces latency: no need to start JVMs, just push compact queries to running agents
• Apache Drill
  • Open source project based upon Dremel's ideas
  • More flexibility and openness
More Reading on Apache Drill
• MapR and Apache Drill: http://www.mapr.com/drill
• Apache Drill project page: http://incubator.apache.org/projects/drill.html
• Google's Dremel: http://research.google.com/pubs/pub36632.html
• Google's BigQuery: https://developers.google.com/bigquery/docs/query-reference
• MIT's C-Store – a columnar database: http://db.csail.mit.edu/projects/cstore/
• Microsoft's Dryad – distributed execution engine: http://research.microsoft.com/en-us/projects/dryad/
• Google's Protobufs: https://developers.google.com/protocol-buffers/docs/proto