Top Banner
Apache Hadoop Goes Realtime at Facebook ~ Borthakur, Sarma, Gray, Muthukkaruppan, Spiegelberg, Kuang, Ranganathan, Molkov, Menon, Rash, Scmidt and Aiyer
32

Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

Dec 25, 2021

Download

Documents

dariahiddleston
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

Apache Hadoop Goes Realtime at Facebook

~ Borthakur, Sarma, Gray, Muthukkaruppan, Spiegelberg, Kuang, Ranganathan, Molkov,

Menon, Rash, Scmidt and Aiyer

Page 2: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

2

Problem and Context● Ever increasing data at Facebook● Launch of Facebook Messages● Other Young Turks at Facebook● Leaving MySQL and its sharding ● Migration challeneges● Problem in words: Unpredictable growth, write

throughput and latency requirements

Page 3: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

3

Problem and Context (contd.)● The Usual Suspects

● Cassandra● Other NoSQL

● Other considerations● Solution: A near realtime Hadoop/HBase that is

modified from the vanilla versions to provide scalability, consistency, availability and a compatible data model.

Page 4: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

4

Key contributions● Making Hadoop and HBase more real-time● Adapting Hadoop and HBase to Facebook's

unique requirements● Implementation of RealTime HDFS● Implementation of Production HBase● Operational Optimizations

Page 5: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

5

Overview● Problem and Context● Facebook stands alone● (Small) Introduction to Hadoop and HBase● Enter the Hs● Realtime HDFS● Production HBase● Operational Optimization● The present future

Page 6: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

6

Facebook's unique requirements

● Facebook and the Hadoop ecosystem● Offline and sequential

● Requirement Type 1 – Realtime concurrent read access to large stream of realtime data

● Example: Scribe

● Requirement Type 2 - Dynamically index a rapidly growing data set for fast random lookups

● Example: Facebook Messages

Page 7: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

7

Facebook's unique requirements● Facebook Messaging:

● Unweildly tables● High Write Throughput● Data Migration

● Facebook Insights● Realtime Analytics● Aggregators

● Facebook Metrics System● Quick reads● Automatic Sharding

Page 8: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

8

Overview● Problem and Context● Facebook stands alone● (Small) Introduction to Hadoop and HBase(Small) Introduction to Hadoop and HBase● Enter the Hs● Realtime HDFS● Production Hbase● Operational Optimization● The present future

Page 9: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

9

Introduction to Hadoop

Page 10: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

10

Introduction to HBase● Hbase: A NoSQL database that utilizes an on-disk

column storage format.

● Hbase USP: Provides fast key-based access to a specific cell or data or a range of cells.

● Based on Google's BigTable but extends it

● Has Row atomicity and read-modify-write consistency

● Simplifies a lot of tasks related to distributed databases.

● Tagline: Random access to web-scale data

Page 11: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

11

Introduction to Zookeeper● Zookeeper: A software service for a distributed

environment that coordinates and configures different machines in a centralized way.

● A change is not considered successful until it has been written to a quorum

● A leader is elected within the ensemble for conflicts

● In HBase, ZooKeeper coordinates and shares state between the Masters and RegionServers.

● Tagline: Enables highly reliable distributed coordination

Page 12: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

12

HBase + Zookeeper

Page 13: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

13

Overview● Problem and Context● Facebook stands alone● (Small) Introduction to Hadoop and HBase● Enter the HsEnter the Hs● Realtime HDFS● Production HBase● Operational Optimization● The present future

Page 14: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

14

The Why Hadoop/HBase question● Scalability● Range Scans● Efficient low-latency strong consistency● Atomic Read-Modify-Write● Random reads● Fault Isolation

Page 15: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

15

The Why Hadoop/HBase question● High write throughput● Data model● High Availability

● Non-requirements● Tolerance of network partitions● Individual data centre failure zero downtime● Federation comfort

Page 16: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

16

Overview● Problem and Context● Facebook stands alone● (Small) Introduction to Hadoop and HBase● Enter the Hs● Realtime HDFSRealtime HDFS● Production HBase● Operational Optimization● The present future

Page 17: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

17

Realtime HDFS - AvatarNode

Page 18: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

18

Realtime HDFS - AvatarNode

Page 19: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

19

Realtime HDFS – AvatarNode view

Page 20: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

20

Realtime HDFS – Logging

● Enhancements to Transcation logging:● Conventional HDFS● Change: Let the StandbyNode always know about

block ids.● Avoidance of partial reads between Active and

Standby node

Page 21: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

21

Improved block availability

● Challenge: Placement of non-local blocks is not optimal; can be on any rack or within any node therein.

● Soution: A new block placement policy which has reduced the probabilty of data loss by orders of magnitude.

● Define a 'window' of logical racks and logical machines around the original block.

Page 22: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

22

Hadoop performance improvements

● RPC Timeout● Live free or fail fast

● File Lease recovery● Local replica awareness● New tricks:

● HDFS sync● Concurrent readers

Page 23: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

23

Overview● Problem and Context● Facebook stands alone● (Small) Introduction to Hadoop and HBase● Enter the Hs● Realtime HDFS● Production HBaseProduction HBase● Operational Optimization● The present future

Page 24: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

24

HBase – ACID compliance

● Requirement: Row-level atomicity and consistency of ACID compliance

● RegionServer failure during log write for row transactions.

● Consistency of replicas

● Solution:

● WAL edits ~ Write Ahead Log policy● Immediate rollback

Page 25: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

25

HBase – Availability Improvements

● Master Rewrite● Store transient state in Zookeeper

● Rolling upgrades● Handled by reassigning of regions

● Distributed Logsplitting● Outsource to Zookeeper

Page 26: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

26

Hbase – Performance Improvement

● Compaction Improvement● put latency dropped from 25 ms to 3 ms!

● Read Optimization – Skipping certain unnecessary files for certain queries, reducing I/O

● Using Bloom filters ● A new special timestamp file selection algorithm

● Ensuring that Regions are local to their data

Page 27: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

27

Overview● Problem and Context● Facebook stands alone● (Small) Introduction to Hadoop and HBase● Enter the Hs● Realtime HDFS● Production Hbase● Operational OptimizationOperational Optimization● The present future

Page 28: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

28

Operational Optimizations● Facebook's HBase testing program● HBase Verify● HBCK● Added metrics for long running operations too!● Manual split instead of automatic

Page 29: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

29

Operational Optimizations

● Dark Launch● Dashboard/ODS integration

● Cross-cluster dashboards for higher analysis● Visualize version differences

● Backups● Do it using Scribe as an alternate application log● Piggyback on the date sent for Hive analytics

Page 30: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

30

Operational Optimizations

● Importing the data● Challenge: Importing legacy data in HBase from a

Hadoop job saturates the production network● Solution: Use Bulk Import with compression

– Enhanced by GZIP of the intermediate map output

● Reducing Network I/O:● Decreased the periodicity of major compactions● Certain column families excluded from logging

Page 31: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

31

The present future● Apache Hadoop 2.0 was released in 2012● One addition was YARN

● A powerful cluster resource management● Added the High Availability feature to NameNode by

introducing the Hot/Standby NameNode.● Greater integration with Zookeeper, especially for

the ZKFC (Implementation of failover in DAFS)

Page 32: Apache Hadoop Goes Realtime at Facebook Borthakur, Sarma ...

32

Thank you and GG!