
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Jan 06, 2017


Michael Stack
Transcript
Page 1: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

(Big Data)² How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale

Sangjin Lee @sjlee (Twitter)

Joep Rottinghuis @joep (Twitter)

Page 2: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Outline

• Why v.2?

• Highlights

• Developing for Timeline Service v.2

• Setting up Timeline Service v.2

• Milestones

• Demo

Page 3: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Why v.2?

• YARN Timeline Service v.1.x

  • Gained good adoption: Tez, Hive, Pig, etc.

  • Keeps improving with v.1.5 APIs and storage implementation

  • Still facing some fundamental challenges...

Page 4: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Why v.2?

• Scalability and reliability challenges

  • Single instance of Timeline Server

  • Storage (single local LevelDB instance)

• Usability

  • Flow

  • Metrics and configuration as first-class citizens

  • Metrics aggregation up the entity hierarchy

Page 5: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Highlights

v.1                                   | v.2
Single writer/reader Timeline Server  | Distributed writer/collector architecture
Single local LevelDB storage*         | Scalable storage (HBase)
v.1 entity model                      | New v.2 entity model
No aggregation                        | Metrics aggregation
REST API                              | Richer query REST API

Page 6: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Architecture

• Separation of writers (“collectors”) and readers

• Distributed collectors: one collector for each app

• Dedicated RM collector for RM-generated data

• Collector discovery via RM

• Pluggable storage with HBase as default storage
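
The per-app collector and discovery bullets above can be pictured with a small sketch. This is not the YARN implementation; `CollectorRegistry`, its method names, and the host:port strings are hypothetical, chosen only to show one collector per application plus a dedicated RM collector, with lookups going through a central registry that stands in for the RM.

```python
from typing import Dict, Optional


class CollectorRegistry:
    """Hypothetical stand-in for the RM's view of per-app collectors."""

    def __init__(self, rm_collector_address: str) -> None:
        # Dedicated collector for RM-generated data.
        self.rm_collector_address = rm_collector_address
        self._by_app: Dict[str, str] = {}

    def register(self, app_id: str, address: str) -> None:
        # One collector per application; re-registering replaces the address.
        self._by_app[app_id] = address

    def unregister(self, app_id: str) -> None:
        # Called when the application finishes and its collector goes away.
        self._by_app.pop(app_id, None)

    def discover(self, app_id: str) -> Optional[str]:
        # Writers ask the registry where their app's collector lives.
        return self._by_app.get(app_id)


# Addresses below are made up for illustration.
registry = CollectorRegistry(rm_collector_address="rm-host:10000")
registry.register("application_1466097809000_0001", "nm-host-7:45000")
```

A writer for an application would call `registry.discover(app_id)` and send its entities to the returned address; readers never go through the collectors.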

Page 7: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Distributed collectors & readers

Page 8: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

What is a flow?

• A flow is a group of YARN applications that are launched as parts of a logical app

• Oozie, Scalding, Pig, etc.

• name: “frequent_visitor_stat”

• run id: 1466097809000

• version: “b9b9068”
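
A flow run as described above is identified by three values. The sketch below models that as a small immutable record. The class name `FlowRun` and the `started_at` helper are hypothetical; treating the run id as epoch milliseconds is an assumption based on the shape of the example value, not something the slide states.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FlowRun:
    name: str      # logical app name, e.g. a Scalding or Pig job name
    run_id: int    # distinguishes one run of the flow from another
    version: str   # identifies the code version of the flow

    def started_at(self) -> datetime:
        # Assumption: the run id is a timestamp in epoch milliseconds.
        return datetime.fromtimestamp(self.run_id / 1000, tz=timezone.utc)


run = FlowRun(name="frequent_visitor_stat", run_id=1466097809000, version="b9b9068")
```

Under that assumption, the example run id corresponds to 16 June 2016, 17:23:29 UTC.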

Page 9: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Configuration and metrics

• Now explicit top-level attributes of entities

• Fine-grained updates and queries made possible

  • “update metric A to value x”

  • “query entities where config A = B”
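
The two quoted operations can be sketched against an in-memory entity model. This is an illustration only: `Entity`, `update_metric`, and `query_by_config` are hypothetical names, not the Timeline Service v.2 client API. The point is that configs and metrics are separate top-level maps, so each can be updated or filtered on individually.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    entity_id: str
    # Configuration and metrics are their own top-level attributes,
    # not values buried inside a generic info blob.
    configs: Dict[str, str] = field(default_factory=dict)
    metrics: Dict[str, float] = field(default_factory=dict)


def update_metric(entity: Entity, name: str, value: float) -> None:
    # "update metric A to value x": touches one metric, nothing else.
    entity.metrics[name] = value


def query_by_config(entities: List[Entity], key: str, value: str) -> List[Entity]:
    # "query entities where config A = B"
    return [e for e in entities if e.configs.get(key) == value]


entities = [
    Entity("app_1", configs={"queue": "prod"}),
    Entity("app_2", configs={"queue": "adhoc"}),
]
update_metric(entities[0], "MAP_SLOT_MILLIS", 1200.0)
prod = query_by_config(entities, "queue", "prod")
```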

Page 11: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

HBase Storage

• Scalable backend

• Row Key structure

  • efficient range scans

  • KeyPrefixRegionSplitPolicy

• Filter pushdown

• Coprocessors for flow aggregation (“readless” aggregation)

• Cell tags for metadata (application id, aggregation operation)

• Cell timestamps generated during put

  • left shifted with app id added to avoid overwrites
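
The last bullet, about cell timestamps, can be sketched as follows. The idea is that two applications writing the same metric in the same millisecond would otherwise produce cells with identical timestamps, and one would overwrite the other. The sketch makes room in the low-order digits of the timestamp and stores part of the app id there. The multiplier value, the decimal (rather than bitwise) shift, and the function names are assumptions for illustration.

```python
TS_MULTIPLIER = 1_000_000  # assumed: room for the low digits of the app sequence


def app_sequence(app_id: str) -> int:
    # YARN application ids look like "application_<clusterTimestamp>_<sequence>".
    return int(app_id.rsplit("_", 1)[1])


def supplemented_timestamp(ts_millis: int, app_id: str) -> int:
    # Shift the wall-clock millis left, then add the app sequence so that
    # two apps writing in the same millisecond get distinct cell timestamps.
    return ts_millis * TS_MULTIPLIER + app_sequence(app_id) % TS_MULTIPLIER


def original_timestamp(supplemented: int) -> int:
    # Dropping the low-order digits recovers the wall-clock millis.
    return supplemented // TS_MULTIPLIER


a = supplemented_timestamp(1466097809000, "application_1466097809000_0001")
b = supplemented_timestamp(1466097809000, "application_1466097809000_0002")
```

Both values map back to the same millisecond through `original_timestamp`, while remaining distinct as cell timestamps.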

Page 12: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Tables in HBase

• flow run

• application

• entity

• flow activity

• app to flow

Page 13: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

table: flow run

Row key: clusterId!userName!flowName!inverted(flowRunId)

• most recent flow run stored first

• coprocessor enabled
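
Why inverting the run id puts the most recent run first can be shown with a short sketch. HBase sorts rows by the bytes of the row key, so storing `Long.MAX_VALUE - flowRunId` as a big-endian long makes larger (newer) run ids sort earlier. The exact byte encoding used by the real schema is not given on the slide; the encoding below, the function names, and the sample cluster and user names are assumptions.

```python
import struct

LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE


def invert(value: int) -> int:
    # Larger inputs produce smaller outputs, reversing the sort order.
    return LONG_MAX - value


def flow_run_row_key(cluster: str, user: str, flow: str, run_id: int) -> bytes:
    prefix = "!".join([cluster, user, flow]).encode("utf-8")
    # Big-endian signed long: byte-wise order matches numeric order
    # for the non-negative values produced by invert().
    return prefix + b"!" + struct.pack(">q", invert(run_id))


older = flow_run_row_key("test-cluster", "alice", "frequent_visitor_stat", 1466011409000)
newer = flow_run_row_key("test-cluster", "alice", "frequent_visitor_stat", 1466097809000)
ordered = sorted([older, newer])
```

With this layout, a scan that starts at the flow's key prefix and stops after one row returns the latest run without reading the older ones.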

Page 14: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

table: application

Row key: clusterId!userName!flowName!inverted(flowRunId)!AppId

• applications within a flow run stored together

• most recent flow run stored first

Page 15: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

table: entity

Row key: userName!clusterId!flowName!inverted(flowRunId)!AppId!entityType!entityId

• entities within an application within a flow run stored together per type

  • for example, all containers within a YARN application will be stored together

• pre-split table

• stores information per entity run like info, relatesTo, relatedTo, events, metrics, config
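
Because entity type comes before entity id in the row key, all entities of one type for one application sit next to each other, and a prefix scan retrieves them in a single pass. The sketch below imitates that with a sorted list of string keys. The string encoding of the inverted run id, the helper names, and the sample identifiers are assumptions for illustration, not the real byte layout.

```python
from bisect import bisect_left
from typing import List

LONG_MAX = 2**63 - 1


def entity_row_key(user: str, cluster: str, flow: str, run_id: int,
                   app_id: str, entity_type: str, entity_id: str) -> str:
    # Zero-padded so that string order matches numeric order.
    inverted = f"{LONG_MAX - run_id:019d}"
    return "!".join([user, cluster, flow, inverted, app_id, entity_type, entity_id])


def prefix_scan(sorted_keys: List[str], prefix: str) -> List[str]:
    # Jump to the first key >= prefix, then read until the prefix stops matching.
    start = bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        result.append(key)
    return result


base = ("alice", "test-cluster", "frequent_visitor_stat", 1466097809000)
keys = sorted([
    entity_row_key(*base, "application_1_0001", "YARN_CONTAINER", "container_1"),
    entity_row_key(*base, "application_1_0001", "YARN_CONTAINER", "container_2"),
    entity_row_key(*base, "application_1_0001", "YARN_APPLICATION_ATTEMPT", "attempt_1"),
    entity_row_key(*base, "application_1_0002", "YARN_CONTAINER", "container_9"),
])
container_prefix = entity_row_key(*base, "application_1_0001", "YARN_CONTAINER", "")
containers = prefix_scan(keys, container_prefix)
```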

Page 16: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

table: flow activity

Row key: clusterId!inverted(TopOfTheDay)!userName!flowName

• shows the flows that ran on that day

• stores information per flow like number of runs, the run ids, versions
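
The `TopOfTheDay` component can be sketched as a timestamp truncated to the start of its day, so that every run of a flow on the same day lands in the same row. Using UTC day boundaries, the string encoding of the inverted value, and the function names are assumptions for illustration.

```python
MILLIS_PER_DAY = 24 * 60 * 60 * 1000
LONG_MAX = 2**63 - 1


def top_of_the_day(ts_millis: int) -> int:
    # Truncate to midnight (assumed UTC) of the day containing ts_millis.
    return ts_millis - (ts_millis % MILLIS_PER_DAY)


def flow_activity_row_key(cluster: str, ts_millis: int, user: str, flow: str) -> str:
    # Inverting the day makes the most recent day sort first within a cluster.
    inverted_day = LONG_MAX - top_of_the_day(ts_millis)
    return "!".join([cluster, f"{inverted_day:019d}", user, flow])


morning = flow_activity_row_key("test-cluster", 1466040000000, "alice", "frequent_visitor_stat")
evening = flow_activity_row_key("test-cluster", 1466097809000, "alice", "frequent_visitor_stat")
next_day = flow_activity_row_key("test-cluster", 1466097809000 + MILLIS_PER_DAY,
                                 "alice", "frequent_visitor_stat")
```

Two runs on the same day map to the same row key, and the row for the following day sorts ahead of it.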

Page 17: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

table: appToFlow

Row key: clusterId!appId

• stores mapping of appId to flowName and flowRunId
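
This table is what lets a query that only knows an application id recover the flow it belongs to. A minimal in-memory sketch of that lookup, with hypothetical names (`FlowContext`, `record_app`, `infer_flow_context`) and sample values:

```python
from typing import Dict, NamedTuple, Optional, Tuple


class FlowContext(NamedTuple):
    flow_name: str
    flow_run_id: int


# Keyed the same way as the row key: (clusterId, appId).
app_to_flow: Dict[Tuple[str, str], FlowContext] = {}


def record_app(cluster_id: str, app_id: str, context: FlowContext) -> None:
    app_to_flow[(cluster_id, app_id)] = context


def infer_flow_context(cluster_id: str, app_id: str) -> Optional[FlowContext]:
    # Returns None when the app is unknown in that cluster.
    return app_to_flow.get((cluster_id, app_id))


record_app("test-cluster", "application_1466097809000_0001",
           FlowContext("frequent_visitor_stat", 1466097809000))
found = infer_flow_context("test-cluster", "application_1466097809000_0001")
```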

Page 18: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Metrics aggregation

• Application level

  • Rolls up sub-application metrics

  • Performed in real time in the collectors in memory

• Flow run level

  • Rolls up app level metrics

  • Performed in HBase region servers via coprocessors

• Offline aggregation (TBD)

  • Rolls up on user, queue, and flow offline periodically

  • Phoenix tables
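
The flow run level roll-up can be sketched in a few lines. Each application writes only its own latest value for a metric and never reads the current total first, which is one reading of "readless" aggregation; the sum is computed when the cells are read. In the real system that summing happens inside an HBase coprocessor. The class below is an in-memory stand-in with hypothetical names.

```python
from typing import Dict


class FlowRunMetric:
    """In-memory stand-in for one metric column of a flow run row."""

    def __init__(self) -> None:
        # One cell per application, holding that app's latest value.
        self._latest_by_app: Dict[str, float] = {}

    def put(self, app_id: str, value: float) -> None:
        # A write replaces only the writer's own cell; no read is needed.
        self._latest_by_app[app_id] = value

    def flow_run_sum(self) -> float:
        # Aggregation happens on the read side across all app cells.
        return sum(self._latest_by_app.values())


metric = FlowRunMetric()
metric.put("application_1_0001", 10.0)
metric.put("application_1_0002", 5.0)
metric.put("application_1_0001", 12.0)  # app 0001 reports a newer value
total = metric.flow_run_sum()
```

The newer value from the first application replaces its earlier one, so the flow run total is 17.0 rather than 27.0.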

Page 19: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

[Figure: FlowRun aggregation via the HBase coprocessor. Labels: app metrics cells in HBase; FlowRun metric sum.]

Page 20: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

[Figure: FlowRun aggregation via the HBase coprocessor. Labels: app metrics cells in HBase; FlowRun metric sum.]

Page 21: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Reader REST API: paths

• URLs under /ws/v2/timeline

• Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id

• Path elements may be omitted if they can be inferred

  • flow context can be inferred by app id

  • default cluster is assumed if cluster is omitted
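
The canonical path above can be assembled with a small helper. The sketch builds only the path; host and port are left out because the slide does not give them. The function name is hypothetical, and the shape of the path when the cluster is omitted is an assumption based on the "default cluster" bullet.

```python
from typing import Optional
from urllib.parse import quote

BASE_PATH = "/ws/v2/timeline"


def flow_run_path(user: str, flow: str, run_id: int,
                  cluster: Optional[str] = None) -> str:
    parts = []
    if cluster is not None:
        # Omitting the cluster leaves the reader to assume its default cluster.
        parts += ["clusters", cluster]
    parts += ["users", user, "flows", flow, "runs", str(run_id)]
    # Percent-encode each element so names with special characters stay valid.
    return BASE_PATH + "/" + "/".join(quote(p, safe="") for p in parts)


full = flow_run_path("alice", "frequent_visitor_stat", 1466097809000,
                     cluster="test-cluster")
short = flow_run_path("alice", "frequent_visitor_stat", 1466097809000)
```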

Page 22: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Setting up Timeline Service v.2

• Set up the HBase cluster (1.1.x)

  • Add the timeline service jar to HBase

  • Install the flow run coprocessor

  • Create tables via TimelineSchemaCreator utility

• Configure the YARN cluster

  • Enable Timeline Service v.2

  • Add hbase-site.xml for the timeline collector and readers

• Start the timeline reader daemon
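
For the "Enable Timeline Service v.2" step, the sketch below renders a Hadoop-style configuration fragment from a dictionary. The two property names and values are assumptions taken from memory of later Hadoop documentation and may not match the branch described in this talk; check them against the documentation for your release. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Assumed property names and values; verify against your Hadoop release.
SETTINGS = {
    "yarn.timeline-service.enabled": "true",
    "yarn.timeline-service.version": "2.0f",
}


def render_properties(settings: Dict[str, str]) -> str:
    # Hadoop config files are <configuration> with <property> children,
    # each holding a <name> and a <value>.
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


fragment = render_properties(SETTINGS)
```

The resulting string holds the `<property>` elements that would be merged into yarn-site.xml on the cluster.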

Page 23: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Milestone 1 ("Alpha 1")

• Merge discussion (YARN-2928) in progress as we speak!

✓ Complete end-to-end read/write flow

✓ Real time application and flow aggregation

✓ New entity model

✓ HBase Storage

✓ Rich REST API

✓ Integration with Distributed Shell and MapReduce

✓ YARN generic events and system metrics

Page 24: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Milestones - Future

• Milestone 2 (“Alpha 2”)

• Integration with new YARN UI

• Integration with more frameworks

• Beta

• Freeze API and storage schema

• Security

• Collectors as containers

• Storage fault tolerance

• Production-ready

• Migration-ready

Page 25: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Contributors

• Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks)

• Varun Saxena, Naganarasimha G. R. (Huawei)

• Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter)

• Zhijie Shen (now at Facebook)

• The HBase and Phoenix community!

Page 26: HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Thank you!