Hortonworks Yarn Code Walk Through January 2014

Jan 26, 2015

Hortonworks

This slide deck accompanies the recording of the webinar "YARN Code Walk Through", held on Jan. 22, 2014. The recording is available on Hortonworks.com/webinars under Past Webinars, or at
https://hortonworks.webex.com/hortonworks/lsr.php?AT=pb&SP=EC&rID=129468197&rKey=b645044305775657
Transcript
Page 1: Hortonworks Yarn Code Walk Through January 2014

© Hortonworks Inc. 2013

YARN Code Overview

Ocular bleeding is no reason to stop programming!

Page 2: Hortonworks Yarn Code Walk Through January 2014

Quick Bio – Joseph Niemiec

• Hadoop user for 2+ years
• 1 of 5 authors of Apache Hadoop YARN (March 2014)
• Originally used Hadoop for location-based services
  – Destination prediction
  – Traffic analysis
  – Effects of weather at client locations on call center call types
• Pending patent in the automotive/telematics domain
• Defensive paper on M2M validation
• Started on analytics to be better at an MMORPG

Page 3: Hortonworks Yarn Code Walk Through January 2014

Agenda

• What Is YARN
• YARN Concepts & Architecture
• Code and more Code
• Q&A

Page 4: Hortonworks Yarn Code Walk Through January 2014

From Batch To Anything

HADOOP 1.0: Single Use System (Batch Apps)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)

HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
• MapReduce (data processing)
• Others (data processing)
• YARN (cluster resource management)
• HDFS2 (redundant, reliable storage)

Page 5: Hortonworks Yarn Code Walk Through January 2014

Concepts

• Application
  – An application is a job submitted to the framework
  – Examples
    – MapReduce job
    – MoYa cluster
• Container
  – Basic unit of allocation
  – Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
    – container_0 = 2 GB, 1 CPU
    – container_1 = 1 GB, 6 CPU
  – Replaces the fixed map/reduce slots
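To make the contrast with fixed slots concrete, here is a minimal Java sketch of a node that grants containers of arbitrary size as long as every resource type still fits. `NodeCapacitySketch`, its methods, and the 8 GB / 8 core node size are hypothetical names and values chosen for illustration. This is not the YARN API, and only memory and CPU are modeled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of per-node, multi-resource allocation; not the YARN API. */
final class NodeCapacitySketch {
    private int freeMemoryMb;
    private int freeCores;
    private final List<String> granted = new ArrayList<>();

    NodeCapacitySketch(int totalMemoryMb, int totalCores) {
        this.freeMemoryMb = totalMemoryMb;
        this.freeCores = totalCores;
    }

    /** Grants the container only if both memory and CPU still fit on the node. */
    boolean tryAllocate(String containerName, int memoryMb, int cores) {
        if (memoryMb > freeMemoryMb || cores > freeCores) {
            return false;
        }
        freeMemoryMb -= memoryMb;
        freeCores -= cores;
        granted.add(containerName);
        return true;
    }

    int freeMemoryMb() { return freeMemoryMb; }
    int freeCores() { return freeCores; }
    List<String> granted() { return granted; }

    public static void main(String[] args) {
        NodeCapacitySketch node = new NodeCapacitySketch(8192, 8); // assumed node size
        node.tryAllocate("container_0", 2048, 1); // 2 GB, 1 CPU (from the slide)
        node.tryAllocate("container_1", 1024, 6); // 1 GB, 6 CPU (from the slide)
        // Rejected: memory would fit, but only 1 core is left on the node.
        boolean third = node.tryAllocate("container_2", 1024, 2);
        System.out.println("third granted: " + third);
        System.out.println("granted: " + node.granted());
    }
}
```

The two differently shaped containers from the slide share one node, which a fixed number of identical map or reduce slots could not express.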

Page 6: Hortonworks Yarn Code Walk Through January 2014

Architecture

• Resource Manager
  – Global resource scheduler
  – Hierarchical queues
• Node Manager
  – Per-machine agent
  – Manages the life cycle of containers
  – Container resource monitoring
• Application Master
  – Per-application
  – Manages application scheduling and task execution
  – E.g. MapReduce Application Master

Page 7: Hortonworks Yarn Code Walk Through January 2014

To the code!

Page 8: Hortonworks Yarn Code Walk Through January 2014

Q&A

Page 9: Hortonworks Yarn Code Walk Through January 2014

YARN - ApplicationMaster

• ApplicationMaster
  – ApplicationSubmissionContext is the complete specification of the ApplicationMaster, provided by the Client
  – The ResourceManager is responsible for allocating and launching the ApplicationMaster container

ApplicationSubmissionContext
  – resourceRequest
  – containerLaunchContext
  – appName
  – queue
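The four fields in the diagram can be pictured as a single value object that the client fills in before submission. The Java sketch below is a simplified, hypothetical stand-in: `SubmissionContextSketch`, `isComplete`, the application name, and the `com.example.DemoAppMaster` command are all invented for illustration and do not reproduce the real YARN class or its setters.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the four fields on the slide; not the YARN class. */
final class SubmissionContextSketch {
    final String appName;
    final String queue;
    final int amMemoryMb;          // resourceRequest: memory for the AM container
    final int amCores;             // resourceRequest: CPU for the AM container
    final List<String> amCommands; // containerLaunchContext: how to start the AM

    SubmissionContextSketch(String appName, String queue,
                            int amMemoryMb, int amCores, List<String> amCommands) {
        this.appName = appName;
        this.queue = queue;
        this.amMemoryMb = amMemoryMb;
        this.amCores = amCores;
        this.amCommands = amCommands;
    }

    /** A submission is usable only if every part of the specification is present. */
    boolean isComplete() {
        return appName != null && !appName.isEmpty()
                && queue != null && !queue.isEmpty()
                && amMemoryMb > 0 && amCores > 0
                && amCommands != null && !amCommands.isEmpty();
    }

    public static void main(String[] args) {
        SubmissionContextSketch ok = new SubmissionContextSketch(
                "demo-app", "default", 1024, 1,
                Arrays.asList("java", "-Xmx512m", "com.example.DemoAppMaster"));
        SubmissionContextSketch missingLaunch = new SubmissionContextSketch(
                "demo-app", "default", 1024, 1, Collections.<String>emptyList());
        System.out.println("complete: " + ok.isComplete());
        System.out.println("complete without launch commands: " + missingLaunch.isComplete());
    }
}
```

The check reflects the slide's wording that the context is the complete specification: without launch commands there would be nothing to start in the ApplicationMaster container.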

Page 10: Hortonworks Yarn Code Walk Through January 2014

YARN – Resource Allocation & Usage

• ContainerLaunchContext
  – The context provided by the ApplicationMaster to the NodeManager to launch the Container
  – Complete specification for a process
  – LocalResource is used to specify the container binary and dependencies
    – The NodeManager is responsible for downloading them from a shared namespace (typically HDFS)

ContainerLaunchContext
  – container
  – commands
  – environment
  – localResources

LocalResource
  – uri
  – type
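The relationship between the launch context and its local resources can be sketched as follows. This is an illustrative Java model, not the YARN classes: `LaunchContextSketch`, `LocalResourceSketch`, `urisToDownload`, the `hdfs:///apps/demo/...` paths, and the meaning given to the map key are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the launch-context diagram; not the YARN classes. */
final class LaunchContextSketch {
    enum ResourceType { FILE, ARCHIVE }

    static final class LocalResourceSketch {
        final String uri;
        final ResourceType type;

        LocalResourceSketch(String uri, ResourceType type) {
            this.uri = uri;
            this.type = type;
        }
    }

    final List<String> commands = new ArrayList<>();
    final Map<String, String> environment = new LinkedHashMap<>();
    // Key: the name the resource is assumed to get in the container's working directory.
    final Map<String, LocalResourceSketch> localResources = new LinkedHashMap<>();

    /** URIs a node agent would need to fetch before it can run the commands. */
    List<String> urisToDownload() {
        List<String> uris = new ArrayList<>();
        for (LocalResourceSketch resource : localResources.values()) {
            uris.add(resource.uri);
        }
        return uris;
    }

    public static void main(String[] args) {
        LaunchContextSketch ctx = new LaunchContextSketch();
        ctx.localResources.put("app.jar",
                new LocalResourceSketch("hdfs:///apps/demo/app.jar", ResourceType.FILE));
        ctx.localResources.put("deps",
                new LocalResourceSketch("hdfs:///apps/demo/deps.tar.gz", ResourceType.ARCHIVE));
        ctx.environment.put("CLASSPATH", "./app.jar:./deps/*");
        ctx.commands.add("java com.example.DemoTask");
        System.out.println(ctx.urisToDownload());
    }
}
```

Keeping the download list separate from the commands mirrors the division of labor on the slide: the ApplicationMaster describes what is needed, and the NodeManager fetches it before starting the process.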

Page 11: Hortonworks Yarn Code Walk Through January 2014

YARN – Resource Allocation & Usage

• ResourceRequest

priority  capability     resourceName  numContainers
0         <2gb, 1 core>  host01        1
0         <2gb, 1 core>  rack0         1
0         <2gb, 1 core>  *             1
1         <4gb, 1 core>  *             1
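The table can be read as data. The Java sketch below encodes its four rows and counts how many containers are being asked for in total. It relies on an interpretation that the slide itself does not spell out: the `host01` and `rack0` rows are treated as locality preferences for the same container, and the `*` row is treated as carrying the total for its priority. `ResourceRequestSketch` and its methods are hypothetical names, not the YARN record.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical row type for the slide's table; not the YARN ResourceRequest. */
final class ResourceRequestSketch {
    static final String ANY = "*";

    final int priority;
    final int memoryMb;
    final int cores;
    final String resourceName;
    final int numContainers;

    ResourceRequestSketch(int priority, int memoryMb, int cores,
                          String resourceName, int numContainers) {
        this.priority = priority;
        this.memoryMb = memoryMb;
        this.cores = cores;
        this.resourceName = resourceName;
        this.numContainers = numContainers;
    }

    /** The four rows shown on the slide. */
    static List<ResourceRequestSketch> slideTable() {
        return Arrays.asList(
                new ResourceRequestSketch(0, 2048, 1, "host01", 1),
                new ResourceRequestSketch(0, 2048, 1, "rack0", 1),
                new ResourceRequestSketch(0, 2048, 1, ANY, 1),
                new ResourceRequestSketch(1, 4096, 1, ANY, 1));
    }

    /** Assumption: only the "*" rows count toward the total; host and rack rows are hints. */
    static int totalContainers(List<ResourceRequestSketch> requests) {
        int total = 0;
        for (ResourceRequestSketch request : requests) {
            if (ANY.equals(request.resourceName)) {
                total += request.numContainers;
            }
        }
        return total;
    }

    /** Total memory implied by the "*" rows, under the same assumption. */
    static int totalMemoryMb(List<ResourceRequestSketch> requests) {
        int total = 0;
        for (ResourceRequestSketch request : requests) {
            if (ANY.equals(request.resourceName)) {
                total += request.numContainers * request.memoryMb;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<ResourceRequestSketch> table = slideTable();
        System.out.println("containers: " + totalContainers(table));
        System.out.println("memory MB: " + totalMemoryMb(table));
    }
}
```

Under that reading the table asks for two containers: one 2 GB container, preferably on `host01` or otherwise within `rack0`, and one 4 GB container anywhere.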

Page 12: Hortonworks Yarn Code Walk Through January 2014

YARN – Resource Allocation & Usage

• Container
  – The basic unit of allocation in YARN
  – The result of the ResourceRequest, provided by the ResourceManager to the ApplicationMaster
  – A specific amount of resources (CPU, memory, etc.) on a specific machine

Container
  – containerId
  – resourceName
  – capability
  – tokens
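As a rough illustration of "a specific amount of resources on a specific machine", the Java sketch below models a granted container and checks it against what was asked for. `GrantedContainerSketch`, `satisfies`, and the sample ID and token strings are hypothetical. Rack-level names are deliberately not handled, since resolving them would need a topology mapping that the slides do not describe.

```java
/** Hypothetical model of the Container fields on the slide; not the YARN class. */
final class GrantedContainerSketch {
    final String containerId;
    final String resourceName; // the host the container was placed on
    final int memoryMb;        // capability: memory
    final int cores;           // capability: CPU
    final String token;        // opaque credential; contents are not modeled here

    GrantedContainerSketch(String containerId, String resourceName,
                           int memoryMb, int cores, String token) {
        this.containerId = containerId;
        this.resourceName = resourceName;
        this.memoryMb = memoryMb;
        this.cores = cores;
        this.token = token;
    }

    /**
     * True if this grant covers a request for the given location and size.
     * "*" matches any host; rack names are not resolved in this sketch.
     */
    boolean satisfies(String requestedName, int requestedMemoryMb, int requestedCores) {
        boolean placementOk = "*".equals(requestedName) || resourceName.equals(requestedName);
        return placementOk && memoryMb >= requestedMemoryMb && cores >= requestedCores;
    }

    public static void main(String[] args) {
        GrantedContainerSketch grant =
                new GrantedContainerSketch("container_0", "host01", 2048, 1, "opaque-token");
        System.out.println(grant.satisfies("host01", 2048, 1));
        System.out.println(grant.satisfies("*", 2048, 1));
        System.out.println(grant.satisfies("host02", 2048, 1));
        System.out.println(grant.satisfies("*", 4096, 1));
    }
}
```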