Proprietary & Confidential. Copyright © 2014. Kishore Kumar Yellamraju, Abhijit Pol

Hado“OPS” or Had “oops”

Jan 27, 2015


Maintaining large-scale distributed systems is a herculean task, and Hadoop is no exception. The scale and velocity at which we operate at Rocket Fuel present a unique challenge. We observed five-fold growth in our data (measured in petabytes) and a five-fold increase in the number of machines, all in just a year’s time. As Hadoop became critical infrastructure at Rocket Fuel, we had to ensure scale and high availability so that our reporting, data mining, and machine learning could continue to excel. We also had to ensure business continuity with disaster recovery plans in the face of this drastic growth. In this presentation, we will discuss what worked well for us and what we learned (the hard way). Specifically, we will (a) describe how we automated installation and dynamic configuration using Puppet and InfraDB, (b) describe the performance tuning for scaling Hadoop, (c) talk about the good, bad, and ugly of scheduling and multi-tenancy, (d) detail some of the hard-fought issues, (e) brief you on our business continuity plans and disaster recovery, (f) touch upon how we monitor our monster Hadoop cluster, and finally, (g) share our experience of YARN at scale at Rocket Fuel.
Transcript
Page 1: Hado“OPS” or Had “oops”

Proprietary & Confidential. Copyright © 2014.

Hado’ops’ or Had’oops’

We’re Hiring: rocketfuel.com/careers

Kishore Kumar Yellamraju, Abhijit Pol

The Web Is Monetized By Advertising

Delivery Methods

» Display
» Video
» Mobile
» Social

Overview

[Diagram: how an ad is bought and served]
1. Page Request
2. Ad Request
3. Bid Request
4. Bid & Ad
5. Rocket Fuel Winning Ad
6. Ad Served

Diagram labels: Browser, Publishers, Ad Exchange, Some Exchange Partners, Advertisers, Data Partners, Rocket Fuel Platform, Real-time Bidder, Automated Decisions, Models, Refresh learning, Optimize, Data Store, Ads & Budget, Model Scores, Events, User Segments, User Engagements

Always buying the best impressions & serving the best ad

Real Time Bidding and Serving

[Figure: grid of per-impression bid prices; the dollar values did not survive extraction]

Site/Page, Geo/Weather, Time of Day, Brand Affinity, User

Real Time Bidding and Serving

Goal: Leads & sales
Goal: Coupon downloads
Goal: Brand awareness

Site/Page, Geo/Weather, Time of Day, Brand Affinity, Demo

Impression Scorecard (one per goal): Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market Behavior, Response

[Figure: three scorecards with per-attribute scores and totals; the numeric values did not survive extraction]

Overview

[Diagram repeated: the six-step ad-serving flow from the earlier Overview slide]

Throughput

Latency

Architecture and Scale

» Datacenters
» Scale
» Growth
» Architecture

Data Center Expansion

Data Center Design

• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop Server: 20 2U servers (85kW)

Bidders: 40 2U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
  – 2655 servers
  – 1874 Teraflops of computing
» 188 Terabytes of memory
  – 13X the memory of the IBM computer Watson that played Jeopardy
» 42 Petabytes of storage
  – 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K Disks
» 15K Cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

In 1 year:
» Servers: 200 → 1400
» Data: 5 PB → 41 PB (8x)

Data Architecture 3.0

Hadoop Setup

[Diagram: HA cluster layout]
» Active NN and Standby NN, each with a ZKFC
» QJM / ZK Quorum
» JT
» DN/TT worker nodes

Hardware profiles shown:
» 6x2TB Disks, 2x6 core, 196 GB RAM, 2x1G NIC
» 12x3TB Disks, 2x6 core, 64 GB RAM, 10G NIC
» Same as DN’s, with a dedicated disk to ZK or JN

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn’t it cool?

Just define once
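
Deriving MR slots from the hardware config could look roughly like the sketch below. It is not the Puppet/Infradb logic from the talk: the function name, the reserved cores and RAM, the memory per slot, and the map/reduce split are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotPlan:
    map_slots: int
    reduce_slots: int


def plan_mr_slots(cores: int, ram_gb: int, reserved_cores: int = 2,
                  reserved_ram_gb: int = 8, gb_per_slot: int = 2,
                  map_fraction: float = 0.67) -> SlotPlan:
    """Hypothetical heuristic: cap slots by whichever of CPU or RAM runs out first."""
    # Leave headroom for the OS and the DataNode/TaskTracker daemons (assumed values).
    usable_cores = max(cores - reserved_cores, 1)
    usable_ram_gb = max(ram_gb - reserved_ram_gb, gb_per_slot)

    by_cpu = usable_cores * 2              # assumed: two task slots per usable core
    by_ram = usable_ram_gb // gb_per_slot  # assumed: fixed heap budget per slot
    total = max(min(by_cpu, by_ram), 2)    # always at least one map and one reduce slot

    map_slots = max(int(total * map_fraction), 1)
    reduce_slots = max(total - map_slots, 1)
    return SlotPlan(map_slots, reduce_slots)
```

A configuration-management template could call a function like this once per hardware class, which is one way to read "just define once".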

Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
» dfs.datanode.handler.count
» dfs.namenode.handler.count
» dfs.datanode.max.transfer.threads
» dfs.image.transfer.timeout

MR
» mapred.reduce.parallel.copies
» mapred.job.tracker.handler.count
» io.sort.mb
» io.sort.factor
» mapreduce.reduce.shuffle.parallelcopies
» IMP: MAPREDUCE-2026

ZK
» maxClientCnxns

JVM
» -XX:+UseConcMarkSweepGC
» -XX:CMSFullGCsBeforeCompaction=1
» -XX:CMSInitiatingOccupancyFraction=60

ha-timeoutms
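
As a rough illustration of how such tunables end up in a site file, the Python sketch below renders name/value pairs into the `<configuration>`/`<property>` layout that Hadoop `*-site.xml` files use. The helper names are hypothetical, and the numeric values are placeholders, not the settings used at Rocket Fuel; only the property names and the GC flags come from the slide.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties: dict) -> str:
    """Render properties as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in sorted(properties.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only; tune against your own cluster.
HDFS_TUNABLES = {
    "dfs.namenode.handler.count": 64,
    "dfs.datanode.handler.count": 16,
    "dfs.datanode.max.transfer.threads": 8192,
}

# GC flags exactly as listed on the slide.
JVM_GC_FLAGS = [
    "-XX:+UseConcMarkSweepGC",
    "-XX:CMSFullGCsBeforeCompaction=1",
    "-XX:CMSInitiatingOccupancyFraction=60",
]


def jvm_opts(flags: list) -> str:
    """Join JVM flags into one option string for a daemon's *_OPTS variable."""
    return " ".join(flags)


if __name__ == "__main__":
    print(render_site_xml(HDFS_TUNABLES))
    print(jvm_opts(JVM_GC_FLAGS))
```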

We Have an Issue

» MAPREDUCE-5351
» MAPREDUCE-5508
» keep.failed.task.files=true

JT OOM

Instances of the “JobInProgress” class = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum

» mapred.jobtracker.completeuserjobs.maximum
» mapred.jobtracker.retirejob.interval
» mapred.jobtracker.retiredjobs.cache.size
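
The JobInProgress formula on this slide is simple arithmetic, sketched below in Python. The per-job heap figure is a made-up placeholder; the real footprint depends on task counts and counters and would have to be measured from a heap dump.

```python
def retained_job_objects(users_with_jobs: int, completeuserjobs_maximum: int) -> int:
    """Slide formula: JobInProgress instances kept in JobTracker memory."""
    return users_with_jobs * completeuserjobs_maximum


def estimated_heap_mb(job_objects: int, mb_per_job: float) -> float:
    """Rough heap estimate; mb_per_job is an assumed figure, not a measurement."""
    return job_objects * mb_per_job


if __name__ == "__main__":
    # Hypothetical inputs: 200 users, 100 completed jobs retained per user.
    jobs = retained_job_objects(200, 100)
    print(jobs, "JobInProgress objects,", estimated_heap_mb(jobs, 0.5), "MB at 0.5 MB each")
```

Lowering the per-user maximum or retiring jobs sooner shrinks the first factor of the product, which is why the slide lists those three properties.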

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

» hadoop.namenode.CallQueueLength
» hadoop.jobtracker.jvm.memheapusedm

Don’t fly blind, you will crash

MR Workload Monitoring

Network Monitoring

Don’t blame the network; instead, monitor it. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting.

Alerts

>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We heavily rely on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

qry=Hadoop:service=NameNode,name=NameNodeInfo
  NameDirStatuses, DeadNodes, NumberOfMissingBlocks

qry=Hadoop:service=NameNode,name=FSNamesystemState
  FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks

qry=hadoop:service=JobTracker,name=JobTrackerInfo
  Blacklisted TT’s, jobs, slots_used, ThreadCount

qry=java.lang:type=Memory
  Used JVM, free JVM, etc.
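
A custom script of the kind described here might look like the Python sketch below. It builds the `/jmx?qry=...` URL, reads the first bean from the JSON reply (the servlet answers with a top-level `beans` list), and folds several NameNode checks into one summary alert. The function names, thresholds, and the treatment of `"Operational"` as the healthy `FSState` value are assumptions for illustration, not the authors' scripts.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen


def jmx_url(host: str, port: int, bean: str) -> str:
    """Build the /jmx query URL for one MBean name."""
    return f"http://{host}:{port}/jmx?" + urlencode({"qry": bean})


def fetch_bean(url: str, timeout_s: float = 5.0) -> Dict:
    """Fetch the URL and return the first bean in the reply (empty dict if none)."""
    with urlopen(url, timeout=timeout_s) as resp:
        payload = json.load(resp)
    beans = payload.get("beans", [])
    return beans[0] if beans else {}


def namenode_summary(info: Dict, state: Dict, max_under_replicated: int = 1000) -> str:
    """Collapse several NameNode checks into a single summary line."""
    problems: List[str] = []
    if state.get("FSState") != "Operational":  # assumed healthy value
        problems.append(f"FSState={state.get('FSState')}")
    if state.get("NumDeadDataNodes", 0) > 0:
        problems.append(f"{state['NumDeadDataNodes']} dead DataNodes")
    if info.get("NumberOfMissingBlocks", 0) > 0:
        problems.append(f"{info['NumberOfMissingBlocks']} missing blocks")
    if state.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        problems.append(f"{state['UnderReplicatedBlocks']} under-replicated blocks")
    return "OK" if not problems else "ALERT: " + "; ".join(problems)
```

In use, a cron job or monitoring check would call `fetch_bean(jmx_url(host, port, "Hadoop:service=NameNode,name=FSNamesystemState"))` and page only on the single summary line, which keeps alert noise down.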

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the “houdah” Ruby gem monitors:
  – Long-running jobs, jobs with more map tasks, blacklisted TT’s with more failure counts, etc.
» Collect details and auto-restart blacklisted TT’s
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White-elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page:
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
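
The workload checks listed above can be sketched as a small rule pass over job records. This Python example is hypothetical throughout: the `JobInfo` fields, the thresholds, and the message format are assumptions, and the talk's in-house tool is Ruby-based, so this shows only the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobInfo:
    job_id: str
    user: str
    runtime_s: int   # seconds since the job started
    map_tasks: int   # number of map tasks in the job


def flag_jobs(jobs: List[JobInfo], max_runtime_s: int = 6 * 3600,
              max_map_tasks: int = 50_000) -> List[str]:
    """Return one alert line per job that breaks a rule (thresholds are assumed)."""
    alerts = []
    for job in jobs:
        reasons = []
        if job.runtime_s > max_runtime_s:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            alerts.append(f"{job.job_id} ({job.user}): {', '.join(reasons)}")
    return alerts
```

The job records could come from any of the sources the slide names, such as the JT log or the metrics page.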

Multi Tenancy

Modeling, OPS, ETL, Ad-hoc

Scheduling

No scheduler is perfect unless you understand and tune it properly

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP → Business Continuity Plan
» Near real time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram: data flow between serving clusters and data clusters]

Diagram labels: INW Data Cluster; US Serving Clusters; EU Serving Clusters; HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of container
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, Storm are in the race

YAY
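
"No more static slots" means that the number of tasks a node runs follows from the container sizes requested, not from a fixed slot count. A minimal Python sketch of that arithmetic follows; the function name and the example node size are assumptions, and real schedulers apply further rules (minimum allocation rounding, queue limits) that are left out here.

```python
def containers_per_node(node_mem_mb: int, node_vcores: int,
                        container_mem_mb: int, container_vcores: int = 1) -> int:
    """How many identical containers fit on one node, limited by memory and vcores."""
    if container_mem_mb <= 0 or container_vcores <= 0:
        raise ValueError("container resources must be positive")
    return min(node_mem_mb // container_mem_mb, node_vcores // container_vcores)


if __name__ == "__main__":
    # Hypothetical node: 48 GB and 12 vcores offered to YARN.
    for mem_mb in (2048, 4096, 8192):
        print(mem_mb, "MB containers ->", containers_per_node(48 * 1024, 12, mem_mb))
```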

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

  • Hado’ops’ or Had’oops’
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions & serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 3.0
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket Fuel
  • Obligatory “we are hiring” slide
  • Slide 41
Page 4: Hado“OPS” or Had “oops”

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Always buying the best impressions & serving the best ad

Real Time Bidding and Serving

[Figure: stream of per-impression bid prices. Impression attributes shown: Site, Page, Geo, Weather, Time of Day, Brand Affinity, User.]

Real Time Bidding and Serving

[Figure: impression scorecards for three campaign goals (leads & sales, coupon downloads, brand awareness). Each scorecard scores an impression on Demo, Brand Affinity, Time of Day, Geo, Weather, Site, Page, Ad Position, In-Market Behavior, and Response.]

Overview (2)

[Diagram: the request-flow diagram from the Overview slide, shown again.]

Throughput

Latency

Architecture and Scale

» Datacenters
» Scale
» Growth
» Architecture

Data Center Expansion

» abc

Data Center Design

• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop Server: 20 2U servers (85kW)

Bidders: 40 2-U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
  – 2655 servers
  – 1874 Teraflops of computing
» 188 Terabytes of memory
  – 13X the memory of IBM computer Watson that played Jeopardy
» 42 Petabytes of storage
  – 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K Disks
» 15K Cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

[Chart: growth over 1 year: 200 servers to 1400 servers; 5 PB to 41 PB (8x).]

Data Architecture 30

Hadoop Setup

[Diagram: Active NN and Standby NN (each with a ZKFC), JT, a QJM/ZK quorum, and DN/TT worker nodes. Hardware callouts:]

» 6x2TB disks, 2x6 core, 196 GB RAM, 2x1G NIC
» 12x3TB disks, 2x6 core, 64 GB RAM, 10G NIC
» Same as DN’s, with a dedicated disk for ZK or JN

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn’t it cool?

Just define once
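As a rough illustration of deriving MR slots from hardware config, here is a minimal Python sketch. The heuristic, the reserved amounts, and all names (`NodeHardware`, `mr_slots`, `gb_per_slot`, `map_percent`) are assumptions made for this example; the slides do not describe the actual rule used by the Puppet and Infradb setup.

```python
from dataclasses import dataclass


@dataclass
class NodeHardware:
    """Hardware facts for one worker node (hypothetical record shape)."""
    cores: int
    ram_gb: int


def mr_slots(hw, gb_per_slot=2, reserved_gb=8, reserved_cores=2, map_percent=70):
    """Return (map_slots, reduce_slots) for one TaskTracker.

    Assumed heuristic: reserve some cores and RAM for the OS and daemons,
    take the tighter of the CPU and RAM limits, then split the total
    between map and reduce slots by a fixed percentage.
    """
    by_cpu = max(hw.cores - reserved_cores, 1)
    by_ram = max((hw.ram_gb - reserved_gb) // gb_per_slot, 1)
    total = min(by_cpu, by_ram)
    map_slots = max(total * map_percent // 100, 1)
    reduce_slots = max(total - map_slots, 1)
    return map_slots, reduce_slots


# A node with 12 cores and 64 GB RAM is CPU-bound under these assumptions.
print(mr_slots(NodeHardware(cores=12, ram_gb=64)))
```

Because the function depends only on hardware facts, the same definition can be applied to every node class, which is the "define once" idea in spirit.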

Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
dfs.datanode.handler.count
dfs.namenode.handler.count
dfs.datanode.max.transfer.threads
dfs.image.transfer.timeout

MR
mapred.reduce.parallel.copies
mapreduce.reduce.shuffle.parallelcopies
mapred.job.tracker.handler.count
io.sort.mb
io.sort.factor

IMP: MAPREDUCE-2026

ZK
maxClientCnxns

JVM
-XX:+UseConcMarkSweepGC
-XX:CMSFullGCsBeforeCompaction=1
-XX:CMSInitiatingOccupancyFraction=60

ha-timeoutms
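Tuning overrides like the ones named above end up as `<property>` entries in Hadoop's XML configuration files. The following Python sketch renders a dictionary of overrides into that format. The numeric values are placeholders chosen for illustration only, not the values used at Rocket Fuel, and `render_properties` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def render_properties(props):
    """Render a dict of Hadoop properties as a <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Placeholder values; pick real ones from measurements on your own cluster.
hdfs_overrides = {
    "dfs.namenode.handler.count": 64,
    "dfs.datanode.handler.count": 16,
}

xml_text = render_properties(hdfs_overrides)
print(xml_text)
```

Generating the file from one data structure keeps the tuned values in a single place, which fits the automation theme of the earlier slides.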

We Have an Issue

MAPREDUCE-5351

MAPREDUCE-5508

keep.failed.task.files=true

JT OOM

Instances of “JobInProgress” class = no. of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum

mapred.jobtracker.completeuserjobs.maximum
mapred.jobtracker.retirejob.interval
mapred.jobtracker.retiredjobs.cache.size
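The relationship on this slide can be turned into a quick back-of-the-envelope estimate. In the Python sketch below, the user count, the per-user limit, and the per-job heap footprint are all made-up example inputs; only the multiplication itself comes from the slide.

```python
def retained_jobs(active_users, completeuserjobs_maximum):
    """Upper bound on JobInProgress instances the JobTracker keeps in memory."""
    return active_users * completeuserjobs_maximum


def retained_heap_mb(active_users, completeuserjobs_maximum, mb_per_job):
    """Rough heap held by retained jobs, given an assumed per-job footprint."""
    return retained_jobs(active_users, completeuserjobs_maximum) * mb_per_job


# Example inputs (assumptions): 50 users, limit of 100 jobs per user, 2 MB per job.
print(retained_jobs(50, 100))
print(retained_heap_mb(50, 100, 2))

# Lowering the per-user limit shrinks the bound proportionally.
print(retained_jobs(50, 25))
```

The point of the estimate is that the bound grows with the number of distinct users, so a limit that was harmless on a small cluster can exhaust the JobTracker heap once many users submit jobs.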

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

hadoop.namenode.CallQueueLength
hadoop.jobtracker.jvm.memheapusedm

Don’t fly blind, you will crash

MR Workload Monitoring

Network Monitoring

Don’t blame the network; monitor it instead. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting.

Alerts

>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We heavily rely on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

qry=Hadoop:service=NameNode,name=NameNodeInfo
NameDirStatuses, DeadNodes, NumberOfMissingBlocks

qry=Hadoop:service=NameNode,name=FSNamesystemState
FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks

qry=hadoop:service=JobTracker,name=JobTrackerInfo
Blacklisted TT’s, jobs, slots_used, ThreadCount

qry=java.lang:type=Memory
Used JVM, free JVM, etc.
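A custom JMX check of the kind described above might look like the following Python sketch. It builds the query URL and evaluates a response body; the host name, port, thresholds, and sample payload are invented for illustration, and the function names are hypothetical. The sketch parses a canned response instead of making a network call so it can run anywhere.

```python
import json
from urllib.parse import urlencode


def jmx_url(host, port, qry):
    """Build a /jmx query URL, leaving ':', '=' and ',' unescaped for readability."""
    return f"http://{host}:{port}/jmx?" + urlencode({"qry": qry}, safe=":=,")


def summarize_namenode(payload, max_dead_nodes=0, max_under_replicated=1000):
    """Return one aggregate alert line for an FSNamesystemState response, or ''."""
    bean = json.loads(payload)["beans"][0]
    problems = []
    if bean["FSState"] != "Operational":
        problems.append(f"FSState={bean['FSState']}")
    if bean["NumDeadDataNodes"] > max_dead_nodes:
        problems.append(f"{bean['NumDeadDataNodes']} dead DataNodes")
    if bean["UnderReplicatedBlocks"] > max_under_replicated:
        problems.append(f"{bean['UnderReplicatedBlocks']} under-replicated blocks")
    return "NameNode: " + "; ".join(problems) if problems else ""


# Made-up sample in the {"beans": [...]} shape returned by the /jmx servlet.
sample = json.dumps({"beans": [{
    "name": "Hadoop:service=NameNode,name=FSNamesystemState",
    "FSState": "Operational",
    "NumDeadDataNodes": 3,
    "UnderReplicatedBlocks": 42,
}]})

print(jmx_url("nn.example.com", 50070,
              "Hadoop:service=NameNode,name=FSNamesystemState"))
print(summarize_namenode(sample))
```

Folding all findings into a single line per service is one simple way to get the summary/aggregate alerts the slide recommends.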

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the “houdah” Ruby gem monitors
  – Long running jobs, jobs with more map tasks, blacklisted TT’s with more failure counts, etc.
» Collect details and auto-restart blacklisted TT’s
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White-elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page:
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
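The workload checks listed above can be sketched as a simple filter over job records. This Python example is not the in-house tool; the `JobSnapshot` record, its fields, and both thresholds are assumptions, and the job data would in practice come from the JT logs or metrics page.

```python
from dataclasses import dataclass


@dataclass
class JobSnapshot:
    """One running job as seen by the monitor (hypothetical record shape)."""
    job_id: str
    user: str
    runtime_minutes: int
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=240, max_map_tasks=50000):
    """Return (job_id, reasons) for every job that breaks a threshold."""
    flagged = []
    for job in jobs:
        reasons = []
        if job.runtime_minutes > max_runtime_minutes:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            flagged.append((job.job_id, reasons))
    return flagged


# Invented example jobs.
jobs = [
    JobSnapshot("job_0001", "etl", runtime_minutes=30, map_tasks=1200),
    JobSnapshot("job_0002", "adhoc", runtime_minutes=600, map_tasks=90000),
]
print(flag_jobs(jobs))
```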

Multi Tenancy

[Diagram: tenants sharing the cluster: Modeling, OPS, ETL, Ad-hoc.]

Scheduling

No scheduler is perfect unless you understand and tune it properly

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP → Business Continuity Plan
» Near real time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; LSV Data Cluster; US, EU, and HK Serving Clusters; Amazon Backup; Processed Data; workloads: Modeling, Reporting, User Queries, Research, Ad-hoc Queries.]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of container
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: Map-Reduce
» No more static slots
» Tez, Spark, Storm are in the race

YAY

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

Page 7: Hado“OPS” or Had “oops”

Proprietary amp Confidential Copyright copy 2014

6 Ad Served

User Segments

3 Bid Reques

t

Overview

Publishers

2 Ad Request

1 Page Request

4 Bid amp Ad

User Engagemen

ts

Data Partners

Advertisers

Browser

Some Exchange Partners

Ad Exchange

Optimize

Rocket Fuel Platform

Real-time BidderAutomated Decisions

Models

Refresh learning

Data Store

Ads ampBudget

ModelScores

Events

5 RocketfuelWinning Ad

Proprietary amp Confidential Copyright copy 2014

Throughput

Proprietary amp Confidential Copyright copy 2014

Latency

Proprietary amp Confidential Copyright copy 2014

Architecture and Scale

raquoDatacentersraquoScaleraquoGrowthraquoArchitecture

Proprietary amp Confidential Copyright copy 2014

Data Center Expansion

raquoabc

Proprietary amp Confidential Copyright copy 2014

Data Center Design

bull Racks custom built at Rocket Fuelbull Leased spacebandwidth in colocation facilities

Hadoop Server20 2U servers (85kW)

Bidders40 2-U Twin 2 servers (17kW)

Proprietary amp Confidential Copyright copy 2014

Rocket Fuel Scale

raquo34474 CPU processor coresndash 2655 serversndash 1874 Teraflops of computing

raquo188 Terabytes of memoryndash 13X the memory of IBM computer Watson that

played Jeopardy

raquo42PB Petabytes of storagendash 106X the data volume of the entire Library of

Congress

Proprietary amp Confidential Copyright copy 2014

Hadoop at Rocket Fuel

raquo 1400 servers

raquo 15K Disks

raquo 15K Cores

raquo 90 TB

raquo 30K MR slots

raquo 12K daily MR jobs

Proprietary amp Confidential Copyright copy 2014

200 Servers 1400 Servers

1 Year

5 PB

41 PB8x

Growth

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

MonitoringhadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

MR Workload Alerting

» Monitor the MR workload and alert
  - In-house tool that uses the "houdah" Ruby gem monitors
  - Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White Elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
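The workload checks listed above can be sketched as a simple threshold filter. The record layout, field names, and thresholds below are hypothetical; the in-house tool itself is not shown in the talk.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical summary of one running job, e.g. scraped from the JT."""
    job_id: str
    user: str
    runtime_minutes: float
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=240, max_map_tasks=50_000):
    """Return (job_id, reason) pairs for jobs that exceed either threshold."""
    flagged = []
    for job in jobs:
        if job.runtime_minutes > max_runtime_minutes:
            flagged.append((job.job_id, "long running"))
        if job.map_tasks > max_map_tasks:
            flagged.append((job.job_id, "too many map tasks"))
    return flagged


if __name__ == "__main__":
    jobs = [
        JobRecord("job_1", "etl", 30, 1_200),
        JobRecord("job_2", "adhoc", 600, 80_000),
    ]
    for job_id, reason in flag_jobs(jobs):
        print(job_id, reason)
```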

Multi Tenancy

[Diagram labels: Modeling, OPS, ETL, Ad-hoc]

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP → Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; US, EU, and HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of container
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: Map-Reduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY
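"No more static slots" means the number of tasks a node can run at once depends on what each container requests, not on a fixed slot count. A minimal sketch of that idea follows. It assumes the scheduler accounts for both memory and vcores, and the node and container sizes are hypothetical (roughly the slide's totals divided by 700 nodes).

```python
def containers_per_node(node_memory_mb: int, node_vcores: int,
                        container_memory_mb: int, container_vcores: int = 1) -> int:
    """Containers that fit on one node, limited by whichever resource runs out first."""
    by_memory = node_memory_mb // container_memory_mb
    by_cores = node_vcores // container_vcores
    return min(by_memory, by_cores)


if __name__ == "__main__":
    # Hypothetical node: 45 GB usable memory and 12 vcores.
    print(containers_per_node(46080, 12, 2048))  # small containers: CPU is the limit
    print(containers_per_node(46080, 12, 4096))  # larger containers: memory is the limit
```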

Obligatory "we are hiring" slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

Page 8: Hado“OPS” or Had “oops”

Throughput

Latency

Architecture and Scale

» Datacenters
» Scale
» Growth
» Architecture

Data Center Expansion

» abc

Data Center Design

• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop Server: 20 2U servers (85kW)
Bidders: 40 2-U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
  - 2655 servers
  - 1874 Teraflops of computing
» 188 Terabytes of memory
  - 13X the memory of IBM computer Watson that played Jeopardy
» 42 Petabytes of storage
  - 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K Disks
» 15K Cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

In 1 year:
» 200 servers → 1400 servers
» 5 PB → 41 PB (8x)

Data Architecture 3.0

Hadoop Setup

[Diagram labels: QJM; ZK Quorum; JT; Active NN; Standby NN; two ZKFCs; six DN/TT nodes]

» 6 x 2 TB disks
» 2 x 6 cores
» 196 GB RAM
» 2 x 1G NIC

» 12 x 3 TB disks
» 2 x 6 cores
» 64 GB RAM
» 10G NIC

» Same as DNs
» Dedicated disk to ZK or JN

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to a Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn't it cool?

Just define once
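The talk does not show how slots are derived from hardware, so the following is only a sketch of one plausible heuristic. It is written in Python rather than Puppet, and the slots-per-core ratio, memory per slot, reserved memory, and the 2:1 map-to-reduce split are all assumptions.

```python
def mr_slots(cores: int, ram_gb: int, slots_per_core: int = 2,
             gb_per_slot: int = 2, reserved_gb: int = 8) -> dict:
    """Derive map and reduce slot counts from a node's cores and RAM.

    The total is capped by CPU and by the memory left after reserving
    reserved_gb for the OS and the DataNode/TaskTracker daemons.
    """
    by_cpu = cores * slots_per_core
    by_ram = max(ram_gb - reserved_gb, 0) // gb_per_slot
    total = max(min(by_cpu, by_ram), 1)
    reduce_slots = total // 3          # about one third of the slots go to reducers
    return {"map": total - reduce_slots, "reduce": reduce_slots}


if __name__ == "__main__":
    # A worker like the one on the Hadoop Setup slide: 2 x 6 cores, 64 GB RAM.
    print(mr_slots(12, 64))
```

Computing the values from facts about the node is what lets a single definition cover mixed hardware, in the spirit of "just define once".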

Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
» dfs.datanode.handler.count
» dfs.namenode.handler.count
» dfs.datanode.max.transfer.threads
» dfs.image.transfer.timeout

MR
» mapred.reduce.parallel.copies
» mapred.jobtracker.handler.count
» io.sort.mb
» io.sort.factor
» mapreduce.reduce.shuffle.parallelcopies
» IMP: MAPREDUCE-2026

ZK
» maxClientCnxns

ha-timeoutms

JVM
» -XX:+UseConcMarkSweepGC
» -XX:CMSFullGCsBeforeCompaction=1
» -XX:CMSInitiatingOccupancyFraction=60
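The slide names the knobs but not the values. As one example of how such a setting can scale with the cluster, the sketch below applies a commonly cited rule of thumb for dfs.namenode.handler.count: 20 times the natural log of the node count. That rule is not from this talk, and the function name and floor value are assumptions.

```python
import math


def namenode_handler_count(cluster_nodes: int, factor: int = 20, floor: int = 10) -> int:
    """Starting point for dfs.namenode.handler.count: factor * ln(nodes), at least floor."""
    if cluster_nodes < 1:
        raise ValueError("cluster_nodes must be at least 1")
    return max(floor, int(factor * math.log(cluster_nodes)))


if __name__ == "__main__":
    # The two cluster sizes from the Growth slide.
    for nodes in (200, 1400):
        print(nodes, namenode_handler_count(nodes))
```

Any value produced this way is only a first guess; the talk's wider point is to tune against what monitoring shows, such as the NameNode call queue length.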

Page 12: Hado“OPS” or Had “oops”

Data Center Design

• Racks custom built at Rocket Fuel
• Leased space/bandwidth in colocation facilities

Hadoop Server: 20 2U servers (85kW)

Bidders: 40 2-U Twin 2 servers (17kW)

Rocket Fuel Scale

» 34474 CPU processor cores
  – 2655 servers
  – 1874 Teraflops of computing
» 188 Terabytes of memory
  – 13X the memory of the IBM computer Watson that played Jeopardy
» 42 Petabytes of storage
  – 106X the data volume of the entire Library of Congress

Hadoop at Rocket Fuel

» 1400 servers
» 15K Disks
» 15K Cores
» 90 TB
» 30K MR slots
» 12K daily MR jobs

Growth

» Servers: 200 → 1400 in 1 year
» Data: 5 PB → 41 PB (8x)

Data Architecture 30

Hadoop Setup

[Diagram: Active NN and Standby NN, each with a ZKFC; JT; QJM / ZK quorum; six DN/TT nodes]

» 6x2TB Disks
» 2x6 core
» 196 GB RAM
» 2x1G NIC

» 12x3TB Disks
» 2x6 core
» 64 GB RAM
» 10G NIC

QJM / ZK Quorum
» Same as DNs
» Dedicated disk to ZK or JN

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Maintenance is Not Easy

Puppet + Infradb

Automation is key

Puppet and Infradb

» Automate as much as you can
» Adding a slave node to a Hadoop cluster: < 120 seconds
» Bringing up a new Hadoop cluster: < 500 seconds
» MR slots are automatically determined based on hardware config

Isn't it cool?

Just define once
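The slide does not say how the slot counts are derived from the hardware config. A minimal sketch of one possible rule, assuming some cores and RAM are reserved for the DataNode and TaskTracker daemons and that slots are capped by whichever resource runs out first. The function name, the reservations, and the 2:1 map-to-reduce split are illustrative assumptions, not Rocket Fuel's actual rule; only the two `mapred.tasktracker.*` property names are standard MR1 settings.

```python
from dataclasses import dataclass


@dataclass
class NodeHardware:
    cores: int    # physical cores on the node
    ram_gb: int   # total RAM in GB


def derive_mr_slots(hw: NodeHardware,
                    reserved_cores: int = 2,
                    reserved_ram_gb: int = 8,
                    ram_per_slot_gb: int = 2,
                    map_fraction: float = 2 / 3) -> dict:
    """Return hypothetical map/reduce slot counts for one TaskTracker."""
    usable_cores = max(hw.cores - reserved_cores, 1)
    usable_ram_gb = max(hw.ram_gb - reserved_ram_gb, ram_per_slot_gb)
    # Slots are limited by whichever resource runs out first (at least 2).
    total_slots = max(min(usable_cores, usable_ram_gb // ram_per_slot_gb), 2)
    map_slots = max(int(total_slots * map_fraction), 1)
    reduce_slots = max(total_slots - map_slots, 1)
    return {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }


# A 2x6-core, 64 GB node like the one listed on the Hadoop Setup slide.
slots = derive_mr_slots(NodeHardware(cores=12, ram_gb=64))
print(slots)
```

A configuration-management template could call such a function per host so that the slot settings follow the hardware inventory instead of being typed by hand.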

Performance Tuning

No issues when the cluster is small. Problems start when it grows.

Performance Tuning

HDFS
» dfs.datanode.handler.count
» dfs.namenode.handler.count
» dfs.datanode.max.transfer.threads
» dfs.image.transfer.timeout

MR
» mapred.reduce.parallel.copies
» mapreduce.reduce.shuffle.parallelcopies
» mapred.job.tracker.handler.count
» io.sort.mb
» io.sort.factor
» IMP: MAPREDUCE-2026

ZK
» maxClientCnxns

JVM
» -XX:+UseConcMarkSweepGC
» -XX:CMSFullGCsBeforeCompaction=1
» -XX:CMSInitiatingOccupancyFraction=60

ha-timeoutms
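The slide names the knobs but not the values that were used. As a sketch of how such overrides might be kept in one place and rendered into a Hadoop `*-site.xml` file, the snippet below uses only the Python standard library. The helper name and the numeric values are placeholders chosen for illustration, not recommendations from the talk.

```python
import xml.etree.ElementTree as ET


def render_site_xml(props: dict) -> str:
    """Render name/value pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Placeholder values only; the slide lists the properties, not the settings.
hdfs_overrides = {
    "dfs.namenode.handler.count": 64,
    "dfs.datanode.handler.count": 16,
}
site_xml = render_site_xml(hdfs_overrides)
print(site_xml)
```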

We Have an Issue

» MAPREDUCE-5351
» MAPREDUCE-5508
» keep.failed.task.files=true

JT OOM

Instances of the "JobInProgress" class = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum

» mapred.jobtracker.completeuserjobs.maximum
» mapred.jobtracker.retirejob.interval
» mapred.jobtracker.retiredjobs.cache.size
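The formula above can be turned into a quick estimate of how many completed JobInProgress objects the JobTracker may keep in heap. A sketch with made-up inputs: the user count and the average per-job size are assumptions for illustration only, and 100 is the stock MR1 default for mapred.jobtracker.completeuserjobs.maximum.

```python
def retained_job_objects(active_users: int, complete_user_jobs_max: int) -> int:
    """Upper bound on completed JobInProgress instances kept in JT memory."""
    return active_users * complete_user_jobs_max


def retained_heap_mb(job_objects: int, avg_job_size_kb: float) -> float:
    """Rough heap held by those objects, given an assumed average size."""
    return job_objects * avg_job_size_kb / 1024


users = 50                   # hypothetical number of users submitting jobs
for limit in (100, 10):      # mapred.jobtracker.completeuserjobs.maximum
    jobs = retained_job_objects(users, limit)
    heap = retained_heap_mb(jobs, avg_job_size_kb=512)  # assumed 512 KB/job
    print(f"limit={limit:>3}: up to {jobs} retained jobs, ~{heap:.0f} MB heap")
```

Because the bound is a plain product, lowering the per-user limit shrinks it linearly, which is why these settings matter on a cluster with many users.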

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

» hadoop.namenode.CallQueueLength
» hadoop.jobtracker.jvm.memheapusedm

Don't fly blind; you will crash.

MR Workload Monitoring

Network Monitoring

Don't blame the network; monitor it instead. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting.

Alerts

>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We heavily rely on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

qry=Hadoop:service=NameNode,name=NameNodeInfo
  NameDirStatuses, DeadNodes, NumberOfMissingBlocks

qry=Hadoop:service=NameNode,name=FSNamesystemState
  FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks

qry=hadoop:service=JobTracker,name=JobTrackerInfo
  Blacklisted TTs, jobs, slots_used, ThreadCount

qry=java.lang:type=Memory
  Used JVM, free JVM, etc.
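A minimal sketch of the kind of custom JMX check described above. The bean and attribute names come from the slide, and the `/jmx` servlet returns JSON with a top-level `beans` list. The function names, thresholds, and sample payload are hypothetical; the network call is defined but not exercised here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def jmx_url(host: str, port: int, bean: str) -> str:
    """Build the /jmx query URL for a single MBean."""
    return f"http://{host}:{port}/jmx?" + urlencode({"qry": bean}, safe=":=,")


def fetch_bean(host: str, port: int, bean: str) -> dict:
    """Fetch one MBean from a live daemon (not called in this sketch)."""
    with urlopen(jmx_url(host, port, bean), timeout=10) as resp:
        beans = json.load(resp)["beans"]
    return beans[0] if beans else {}


def namenode_alerts(state: dict, min_free_bytes: int,
                    max_under_replicated: int) -> list:
    """Turn FSNamesystemState attributes into one list of problems."""
    problems = []
    if state.get("FSState") != "Operational":
        problems.append(f"FSState is {state.get('FSState')}")
    if state.get("NumDeadDataNodes", 0) > 0:
        problems.append(f"{state['NumDeadDataNodes']} dead DataNodes")
    if state.get("CapacityRemaining", 0) < min_free_bytes:
        problems.append("low remaining capacity")
    if state.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        problems.append(
            f"{state['UnderReplicatedBlocks']} under-replicated blocks")
    return problems


# Hypothetical payload standing in for a real FSNamesystemState bean.
sample = {"FSState": "Operational", "CapacityRemaining": 5 * 1024**4,
          "NumDeadDataNodes": 3, "UnderReplicatedBlocks": 120}
alerts = namenode_alerts(sample, min_free_bytes=10 * 1024**4,
                         max_under_replicated=1000)
if alerts:
    # One aggregate alert instead of one page per symptom.
    print("NameNode: " + "; ".join(alerts))
```

Joining all findings into a single message is one simple way to get the summary/aggregate alerts the slide recommends.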

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the "houdah" Ruby gem monitors:
  – Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT logfile for rogue jobs
» Parse the JT log and collect all job-related info
» White Elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
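The checks listed above could be expressed as simple rules over job records. This is a sketch only: the record fields, thresholds, and function name are hypothetical and do not reflect the houdah gem's API or the in-house tool.

```python
from dataclasses import dataclass


@dataclass
class JobInfo:
    job_id: str
    user: str
    runtime_minutes: float
    map_tasks: int


def flag_jobs(jobs, max_runtime_minutes=240, max_map_tasks=50_000):
    """Return (job_id, reason) pairs for jobs that look suspicious."""
    flagged = []
    for job in jobs:
        if job.runtime_minutes > max_runtime_minutes:
            flagged.append((job.job_id, "long running"))
        if job.map_tasks > max_map_tasks:
            flagged.append((job.job_id, "too many map tasks"))
    return flagged


# Made-up job records; a real tool would fill these from the JT.
jobs = [
    JobInfo("job_0001", "etl", runtime_minutes=35, map_tasks=1_200),
    JobInfo("job_0002", "adhoc", runtime_minutes=610, map_tasks=80_000),
]
for job_id, reason in flag_jobs(jobs):
    print(f"{job_id}: {reason}")
```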

Multi Tenancy

» Modeling
» OPS
» ETL
» Ad-hoc

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP → Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram components]
» INW Data Cluster: US Serving Clusters, EU Serving Clusters, HK Serving Clusters; Modeling, Reporting, User Queries
» Amazon Backup
» LSV Data Cluster: US/EU/HK Serving Clusters; Research, Ad-hoc Queries, Processed Data

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of container
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY
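"No more static slots" means a NodeManager advertises a pool of memory and containers of varying sizes are carved out of it. A toy sketch of that accounting, with hypothetical numbers. It does not model the real scheduler (queues, vcores, locality), and the 48 GB figure is an assumed value for `yarn.nodemanager.resource.memory-mb`.

```python
def containers_that_fit(node_memory_mb: int, requests_mb: list) -> list:
    """Greedily admit container requests until node memory is exhausted."""
    admitted, free_mb = [], node_memory_mb
    for req in requests_mb:
        if req <= free_mb:
            admitted.append(req)
            free_mb -= req
    return admitted


# Mixed container sizes (in MB) requested on one node; values are made up.
requests = [2048] * 10 + [4096] * 8 + [1536]
admitted = containers_that_fit(48 * 1024, requests)
print(len(admitted), "containers,", sum(admitted), "MB used")
```

With fixed slots the same node would run a preset number of map and reduce tasks regardless of how much memory each task actually needs.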

Obligatory "we are hiring" slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

Page 16: Hado“OPS” or Had “oops”

Proprietary amp Confidential Copyright copy 2014

Data Architecture 30

Proprietary amp Confidential Copyright copy 2014

Hadoop Setup

QJM ZK Quorum

raquo 6x2TB Disksraquo 2x6 coreraquo 196 GB RAMraquo 2x1G NIC

raquo 12x3TB Disksraquo 2x6 coreraquo 64 GB RAMraquo 10G NIC

raquo same as DNrsquosraquo Dedicated disk

to ZK or JN

JT

Standby NN

ZKFCZKFC

Active NN

DNTT

DNTT

DNTT

DNTT

DNTT

DNTT

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Puppet+

Infradb

Automation is key

Maintenance is Not Easy

Proprietary amp Confidential Copyright copy 2014

Puppet and Infradb

raquo Automate as much as you canraquo Adding a slave node to Hadoop cluster lt 120 secondsraquo Bringing up a new Hadoop cluster lt 500 secondsraquo MR slots are automatically determined based on hardware config

Isnrsquot it cool

Just define once

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

MonitoringhadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Alerting

Monitoring is not enough; you need better alerting.

Alerts

>> Checking whether NN and JT are up is a no-brainer
>> Reduce alert noise by having summary/aggregate alerts
>> We heavily rely on custom scripts that query JMX for NN and JT

http://hostname:port/jmx

qry=Hadoop:service=NameNode,name=NameNodeInfo
NameDirStatuses, DeadNodes, NumberOfMissingBlocks

qry=Hadoop:service=NameNode,name=FSNamesystemState
FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks

qry=hadoop:service=JobTracker,name=JobTrackerInfo
Blacklisted TTs, jobs, slots_used, ThreadCount

qry=java.lang:type=Memory
Used JVM, free JVM, etc.
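The custom scripts themselves are not shown on the slide. A minimal sketch of the idea, assuming the `/jmx` endpoint returns JSON with a top-level `beans` list and using the FSNamesystemState attribute names listed above, could look like the following. The function names and thresholds are hypothetical, and several checks are folded into one summary list in the spirit of reducing alert noise.

```python
import json
from urllib.request import urlopen

FS_STATE_BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def jmx_url(host, port, bean):
    """Build the JMX query URL for one bean."""
    return f"http://{host}:{port}/jmx?qry={bean}"


def fetch_bean(host, port, bean, timeout=10):
    """Fetch one bean as a dict; returns {} if the bean is not reported."""
    with urlopen(jmx_url(host, port, bean), timeout=timeout) as response:
        payload = json.load(response)
    beans = payload.get("beans", [])
    return beans[0] if beans else {}


def summarize_fs_state(bean, max_dead_nodes=0, max_under_replicated=0,
                       min_capacity_remaining=0):
    """Return a list of problems; an empty list means no alert is needed."""
    problems = []
    dead = bean.get("NumDeadDataNodes", 0)
    if dead > max_dead_nodes:
        problems.append(f"NumDeadDataNodes={dead} exceeds {max_dead_nodes}")
    under = bean.get("UnderReplicatedBlocks", 0)
    if under > max_under_replicated:
        problems.append(f"UnderReplicatedBlocks={under} exceeds {max_under_replicated}")
    remaining = bean.get("CapacityRemaining")
    if remaining is not None and remaining < min_capacity_remaining:
        problems.append(f"CapacityRemaining={remaining} below {min_capacity_remaining}")
    return problems
```

In use, a cron job might call `summarize_fs_state(fetch_bean(host, port, FS_STATE_BEAN))` with the NameNode's web host and port, and send a single aggregate alert when the returned list is non-empty.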

MR Workload Alerting

» Monitoring MR workload and alert
  – In-house tool that uses the "houdah" Ruby gem monitors
  – Long-running jobs, jobs with more map tasks, blacklisted TTs with more failure counts, etc.
» Collect details and auto-restart blacklisted TTs
» Parse the JT log file for rogue jobs
» Parse the JT log and collect all job-related info
» White Elephant or hRaven could help
» Parse the scheduler HTML page or use the metrics page
  http://<JT-hostname>:50030/scheduler?advanced
  http://<JT-hostname>:50030/metrics
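The in-house tool is not shown in the deck. As a rough sketch of the kind of rule it describes (flagging long-running jobs and jobs with many map tasks), the snippet below works on already-collected job records. The `JobRecord` fields, the function name, and both thresholds are hypothetical and would need to be set per cluster.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    job_id: str
    runtime_hours: float
    map_tasks: int


def flag_jobs(jobs, max_runtime_hours=6.0, max_map_tasks=50000):
    """Return {job_id: [reasons]} for jobs that break a workload rule."""
    flagged = {}
    for job in jobs:
        reasons = []
        if job.runtime_hours > max_runtime_hours:
            reasons.append("long running")
        if job.map_tasks > max_map_tasks:
            reasons.append("too many map tasks")
        if reasons:
            flagged[job.job_id] = reasons
    return flagged


if __name__ == "__main__":
    sample = [
        JobRecord("job_a", runtime_hours=1.5, map_tasks=1200),
        JobRecord("job_b", runtime_hours=9.0, map_tasks=80000),
    ]
    print(flag_jobs(sample))
```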

Multi Tenancy

Modeling
OPS
ETL
Ad-hoc

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP → Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; US Serving Clusters; EU Serving Clusters; HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of container
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, Storm are in the race

YAY

Obligatory "we are hiring" slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com
apol@rocketfuel.com

Page 17: Hado“OPS” or Had “oops”

Hadoop Setup

QJM ZK Quorum

» 6x2TB disks
» 2x6 cores
» 196 GB RAM
» 2x1G NIC

» 12x3TB disks
» 2x6 cores
» 64 GB RAM
» 10G NIC

» Same as DNs
» Dedicated disk to ZK or JN

[Diagram labels: JT; Active NN with ZKFC; Standby NN with ZKFC; six DN/TT nodes]

Page 21: Hado“OPS” or Had “oops”

Proprietary amp Confidential Copyright copy 2014

No issues when cluster is small Problems starts when it grows

Performance Tuning

Proprietary amp Confidential Copyright copy 2014

dfsdatanodehandlercount dfsnamenodehandlercount

dfsdatanodemaxtransferthreads dfsimagetransfertimeout

mapredreduceparallelcopies

mapredjobtrackerhandlercount

iosortmbiosortfactor

maxClientCnxns ZK

HDFS

MR

IMP MAPREDUCE-2026

-XX+UseConcMarkSweepGC

-XXCMSFullGCsBeforeCompaction=1

-XXCMSInitiatingOccupancyFraction=60

ha-timeoutms

JVM

Performance Tuning

mapreducereduceshuffleparallelcopies

Proprietary amp Confidential Copyright copy 2014

MAPREDUCE-5351

MAPREDUCE-5508

keepfailedtaskfiles=true

We Have an Issue

Proprietary amp Confidential Copyright copy 2014

instances of JobInProgressrdquo class = no of users submitted jobs X mapredjobtrackercompleteuserjobsmaximum

mapredjobtrackercompleteuserjobsmaximum mapredjobtrackerretirejobinterval

mapredjobtrackerretiredjobscachesize

JT OOM

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

Monitoring

Wall of Ops

Proprietary amp Confidential Copyright copy 2014

MonitoringhadoopnamenodeCallQueueLength hadoopjobtrackerjvmmemheapusedm

Donrsquot fly blind you will crash

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

raquo Monitoring MR workload and alertndash In-house tool that use ldquohoudahrdquo ruby gem monitorsndash Long running jobs jobs with more map tasks blacklisted

TTrsquos with more failure counts etchellip

raquo Collect details and auto-restart blacklisted TTrsquosraquo Parse the JT logfile for rouge jobsraquo Parse the JT log and collects all Job related inforaquo White-elephant or hraven could helpraquo Parse the scheduler html page or use metrics page httpltJT-hostnamegt50030scheduleradvanced httpltJT-hostnamegt50030metrics

Proprietary amp Confidential Copyright copy 2014

Modeling

OPS

ETL

Ad-hoc

Multi Tenancy

Proprietary amp Confidential Copyright copy 2014

No Scheduler is perfect unless you understand and tune it properly

Scheduling

Proprietary amp Confidential Copyright copy 2014

Operations

raquo Maintenanceraquo Performance Tuningraquo Monitoringraquo BCPraquo YARN

Proprietary amp Confidential Copyright copy 2014

BCP

raquo BCP rarr Business Continuity Planraquo Near real time reporting over 15+ TB of daily dataraquo Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; US Serving Clusters; EU Serving Clusters; HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  - Global resource scheduler
  - Hierarchical queues
  - Application management
» Node Manager
  - Per-machine agent
  - Manages life cycle of containers
  - Container resource monitoring
» Application Master
  - Per-application
  - Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com

  • Hado’ops’ or Had’oops’ 1
  • The Web Is Monetized By Advertising
  • Delivery Methods
  • Overview
  • Always buying the best impressions & serving the best ad
  • Real Time Bidding and Serving
  • Overview (2)
  • Throughput
  • Latency
  • Architecture and Scale
  • Data Center Expansion
  • Data Center Design
  • Rocket Fuel Scale
  • Hadoop at Rocket Fuel
  • Growth
  • Data Architecture 3.0
  • Hadoop Setup
  • Operations
  • Maintenance is Not Easy
  • Puppet and Infradb
  • Performance Tuning
  • Performance Tuning (2)
  • We Have an Issue
  • JT OOM
  • Operations (2)
  • Monitoring
  • Monitoring (2)
  • MR Workload Monitoring
  • Network Monitoring
  • Alerting
  • Alerts
  • MR Workload Alerting
  • Multi Tenancy
  • Scheduling
  • Operations (3)
  • BCP
  • Data BCP Cluster
  • YARN
  • YARN at Rocket Fuel
  • Obligatory “we are hiring” slide
  • Slide 41
Page 22: Hado“OPS” or Had “oops”

Performance Tuning

HDFS
» dfs.datanode.handler.count
» dfs.namenode.handler.count
» dfs.datanode.max.transfer.threads
» dfs.image.transfer.timeout

MR
» mapred.reduce.parallel.copies
» mapreduce.reduce.shuffle.parallelcopies
» mapred.job.tracker.handler.count
» io.sort.mb
» io.sort.factor
» IMP: MAPREDUCE-2026

JVM
» -XX:+UseConcMarkSweepGC
» -XX:CMSFullGCsBeforeCompaction=1
» -XX:CMSInitiatingOccupancyFraction=60

ZK
» maxClientCnxns

ha-timeoutms

We Have an Issue

» MAPREDUCE-5351
» MAPREDUCE-5508
» keep.failed.task.files=true

JT OOM

» Instances of the “JobInProgress” class = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum
» mapred.jobtracker.completeuserjobs.maximum
» mapred.jobtracker.retirejob.interval
» mapred.job.tracker.retiredjobs.cache.size
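As a rough worked example of the relation on this slide, the sketch below estimates an upper bound on how many completed JobInProgress objects the JobTracker may hold in memory. The user count, the per-user limit, and the per-object size are made-up inputs, not measurements from the talk, and the function names are hypothetical.

```python
def retained_job_objects(users_with_jobs, complete_user_jobs_maximum):
    """Upper bound: each user may keep up to the configured number of completed jobs."""
    return users_with_jobs * complete_user_jobs_maximum


def estimated_heap_mb(job_objects, avg_job_size_mb):
    """Very rough heap estimate; avg_job_size_mb is an assumed per-job footprint."""
    return job_objects * avg_job_size_mb


# Hypothetical inputs: 200 users, limit of 100 completed jobs per user, 0.5 MB per job.
job_objects = retained_job_objects(200, 100)
heap_mb = estimated_heap_mb(job_objects, 0.5)
print(job_objects, heap_mb)
```

Under these assumed inputs, lowering the per-user limit shrinks the bound linearly, which is why the three properties listed on the slide are the natural knobs to look at when the JT runs out of memory.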

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

Monitoring

Wall of Ops

Monitoring

hadoop.namenode.CallQueueLength
hadoop.jobtracker.jvm.memheapusedm

Don’t fly blind; you will crash

MR Workload Monitoring

Network Monitoring

Don’t blame the network; monitor it instead. A network mesh can be a mess.

Alerting

Monitoring is not enough; you need better alerting

Page 28: Hado“OPS” or Had “oops”

Proprietary amp Confidential Copyright copy 2014

MR Workload Monitoring

Proprietary amp Confidential Copyright copy 2014

Network Monitoring

Donrsquot blame network instead monitor it Network Mesh can be mess

Proprietary amp Confidential Copyright copy 2014

Alerting

Monitoring is not enough need better Alerting

Proprietary amp Confidential Copyright copy 2014

Alerts

httphostnameportjmx

qry=Hadoopservice=NameNodename=NameNodeInfo

gtgt Checking whether NN and JT are up is a no brainer gtgt Reduce alert noise by having summaryaggregate alerts gtgt We heavily rely on custom scripts that query jmx for NN and JT

qry=hadoopservice=JobTrackername=JobTrackerInfo

NameDirStatuses DeadNodes NumberOfMissingBlocks

qry=Hadoopservice=NameNodename=FSNamesystemState

FSState CapacityRemaining NumDeadDataNodes UnderReplicatedBlocks

Blacklisted TTrsquos jobs slots_used ThreadCount

qry=javalangtype=Memory

Used jvm free jvm etc

Proprietary amp Confidential Copyright copy 2014

MR Workload Alerting

raquo Monitoring MR workload and alertndash In-house tool that use ldquohoudahrdquo ruby gem monitorsndash Long running jobs jobs with more map tasks blacklisted

TTrsquos with more failure counts etchellip

raquo Collect details and auto-restart blacklisted TTrsquosraquo Parse the JT logfile for rouge jobsraquo Parse the JT log and collects all Job related inforaquo White-elephant or hraven could helpraquo Parse the scheduler html page or use metrics page httpltJT-hostnamegt50030scheduleradvanced httpltJT-hostnamegt50030metrics

Proprietary amp Confidential Copyright copy 2014

Multi Tenancy

[Diagram: cluster shared by Modeling, OPS, ETL, and Ad-hoc workloads]

Scheduling

No scheduler is perfect unless you understand and tune it properly.

Operations

» Maintenance
» Performance Tuning
» Monitoring
» BCP
» YARN

BCP

» BCP → Business Continuity Plan
» Near-real-time reporting over 15+ TB of daily data
» Freshness of models trained over petabytes of data

Data BCP Cluster

[Diagram labels: INW Data Cluster; US Serving Clusters; EU Serving Clusters; HK Serving Clusters; Modeling; Reporting; User Queries; Amazon Backup; LSV Data Cluster; US/EU/HK Serving Clusters; Research; Ad-hoc Queries; Processed Data]

YARN

JobTracker

» Resource Manager
  – Global resource scheduler
  – Hierarchical queues
  – Application management

» Node Manager
  – Per-machine agent
  – Manages life cycle of container
  – Container resource monitoring

» Application Master
  – Per-application
  – Manages application scheduling and task execution

YARN at Rocket Fuel

» YARN is in production
» 700+ nodes
» 31 TB RAM, 8500 disks, 8500 cores
» Primary use case: MapReduce
» No more static slots
» Tez, Spark, and Storm are in the race

YAY
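
To illustrate why dropping static slots matters, here is a deliberately simplified Python sketch. It compares how many tasks one node can run with fixed map and reduce slots against a single memory pool carved into containers. The function names, the memory sizes, and the greedy maps-first packing are assumptions for illustration; the real YARN scheduler is considerably more involved.

```python
def running_with_static_slots(pending_maps, pending_reduces, map_slots, reduce_slots):
    """MR1-style node: map and reduce slots are separate and cannot be traded."""
    return min(pending_maps, map_slots) + min(pending_reduces, reduce_slots)


def running_with_containers(pending_maps, pending_reduces,
                            node_memory_mb, map_mb, reduce_mb):
    """Container-style node: one memory pool, packed maps first, then reduces."""
    if map_mb <= 0 or reduce_mb <= 0:
        raise ValueError("container sizes must be positive")
    maps = min(pending_maps, node_memory_mb // map_mb)
    remaining_mb = node_memory_mb - maps * map_mb
    reduces = min(pending_reduces, remaining_mb // reduce_mb)
    return maps + reduces
```

With a map-only backlog, for example, the static-slot node leaves its reduce slots idle, while the container model can hand that memory to additional map tasks.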

Obligatory “we are hiring” slide

http://rocketfuel.com/careers

THANKS

kishore@rocketfuel.com, apol@rocketfuel.com
