May 06, 2015
Cloudera Manager – APIs & Extensibility
Matt Harris – Systems Engineer
December 2013
CONFIDENTIAL - RESTRICTED
Cloudera Manager
End-to-End Administration for CDH
1. Manage: Easily deploy, configure & optimize clusters
2. Monitor: Maintain a central view of all activity
3. Diagnose: Easily identify and resolve issues
4. Integrate: Use Cloudera Manager with existing tools
©2013 Cloudera, Inc. All Rights Reserved.
Integrating with your IT Mgmt tools
Cloudera Manager
Installation, Deployment tools, e.g. Chef, Puppet etc.
Monitoring Tools, e.g. Orion, Tivoli, BMC etc.
Alerting Tools, e.g. Nagios, SNMP etc.
Hadoop Operations
Datacenter Operations
Various options for integrating Cloudera Manager into your existing Datacenter Operations/Tools
• Cloudera Manager API
  • Introduced in CM4 (June 2012)
  • Installation & deployment
  • Monitoring
• SNMP Alerts
  • Introduced in CM4.5 (Feb 2013)
• And more…
  • Monitoring ‘tsquery’ (Feb 2013)
  • User-defined triggers/alarms (new for C5!)
  • Service extensibility (new for C5!)
Cloudera Manager (CM) API
• API access was introduced in Cloudera Manager 4.0, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics).
• The CM API is an HTTP REST API, using JSON serialization. It is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. API users have the same privileges as they do in the web UI.
• Docs & Examples
  http://cloudera.github.io/cm_api/
  https://github.com/cloudera/cm_api
• Java/Python clients
  http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/
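Because the API is plain HTTP with JSON, a client can be sketched with nothing but the Python standard library. The following is a minimal illustration, not the official `cm_api` client: the function names are hypothetical, and the port `7180`, the `/api/v1/clusters` path and the use of HTTP basic authentication are assumptions to check against the API documentation linked above.

```python
import base64
import json
import urllib.request


def build_cm_request(host, path, user, password, port=7180, use_tls=False):
    """Build (but do not send) an authenticated GET request for the CM REST API.

    Assumptions: the API is reachable on the web UI port (7180 here) and
    accepts HTTP basic authentication with the web UI credentials.
    """
    scheme = "https" if use_tls else "http"
    url = f"{scheme}://{host}:{port}{path}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical host and credentials; replace with real values before sending.
    req = build_cm_request("cm-host.example.com", "/api/v1/clusters", "admin", "admin")
    print(req.full_url)
    # clusters = fetch_json(req)  # uncomment when a CM server is reachable
```

Building the request separately from sending it keeps the URL and header logic easy to inspect without a running Cloudera Manager server.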
Examples of integration with CM API
• Installation & Deployment
  • Chef
  • Puppet
  • Dell Crowbar
    • http://blog.cloudera.com/blog/2013/08/how-to-deploy-hadoop-clusters-automatically-with-dell-crowbar-and-cloudera-manager/
  • StackIQ
    • http://web.stackiq.com/blog/bid/312064/StackIQ-Cluster-Manager-now-integrated-with-Cloudera
  • WANdisco – non-stop NN setup
  • Several other customers/partners leveraging the APIs as part of their install & deployment process
• Monitoring & Alerting
  • Oracle Enterprise Manager (via Big Data Appliance)
  • Nagios
    • https://github.com/cloudera/cm_api/tree/master/nagios
    • https://github.com/harisekhon/nagios-plugins/blob/master/check_hadoop_cloudera_manager_metrics.pl
  • SNMP alerts integration with IBM Netcool
Develop & Contribute your plug-ins using Cloudera Manager API
Cloudera Manager – Monitoring via ‘tsquery’
• Introduced as part of CM4.5 release (Feb 2013)
• Great way to add interesting charts (above & beyond what is provided by default) and monitor metrics that are relevant to your clusters
• The tsquery language is used to specify statements for retrieving time-series data from the Cloudera Manager time-series data store
• Example: How do I compare all disk IO for all the DataNodes that belong to a specific HDFS service?
  select bytes_read, bytes_written where roleType=DATANODE and serviceName=hdfs1
• Retrieved time-series data can be plotted via various options – line, bar, scatter, heat maps, table list etc.
• Extending this concept to create user-defined triggers/alarms (new for C5!).
• More details
  • http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Diagnostics-Guide/cm5dg_chart_time_series_data.html
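The DataNode example above follows a simple "select metrics where predicates" shape, which can be assembled programmatically. This is a rough sketch only: `build_tsquery` and `timeseries_query_string` are hypothetical helpers, and the `query` parameter name is an assumption about how a tsquery would be passed over HTTP, not something stated in the slides.

```python
from urllib.parse import urlencode


def build_tsquery(metrics, filters):
    """Assemble a tsquery statement from metric names and equality predicates."""
    query = "select " + ", ".join(metrics)
    where_clause = " and ".join(f"{key}={value}" for key, value in filters.items())
    if where_clause:
        query += " where " + where_clause
    return query


def timeseries_query_string(query):
    """URL-encode the statement; the 'query' parameter name is an assumption."""
    return urlencode({"query": query})


if __name__ == "__main__":
    statement = build_tsquery(
        ["bytes_read", "bytes_written"],
        {"roleType": "DATANODE", "serviceName": "hdfs1"},
    )
    print(statement)
    print(timeseries_query_string(statement))
```

Only equality predicates joined with `and` are covered here, which matches the example on the slide; richer tsquery syntax is documented at the link above.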
Examples of Cloudera Manager ‘tsquery’
Example 1: How do I track the aggregate Cluster Disk IO?
  select dt0(read_bytes_disk_sum), dt0(write_bytes_disk_sum)
  where category = CLUSTER and clusterId = $CLUSTERID
Example 2: How do I compare CPU usage across hosts?
  select dt0(total_cpu_user) / getHostFact(numCores, 1) * 100,
         dt0(total_cpu_system) / getHostFact(numCores, 1) * 100,
         dt0(total_cpu_nice) / getHostFact(numCores, 1) * 100,
         dt0(total_cpu_iowait) / getHostFact(numCores, 1) * 100,
         dt0(total_cpu_irq) / getHostFact(numCores, 1) * 100,
         dt0(total_cpu_soft_irq) / getHostFact(numCores, 1) * 100
Create & Contribute your ‘tsqueries’! https://github.com/cloudera/cm_charting_scrapbook
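The CPU comparison query repeats one expression per CPU mode, so it lends itself to being generated rather than typed by hand. The sketch below only reproduces the pattern from Example 2; the helper names are hypothetical and the list of modes is taken from the metric names on the slide.

```python
# CPU modes taken from the metric names in Example 2 (total_cpu_<mode>).
CPU_MODES = ("user", "system", "nice", "iowait", "irq", "soft_irq")


def cpu_percent_expr(mode):
    """Per-mode expression: rate of CPU time, divided by core count, as a percentage."""
    return f"dt0(total_cpu_{mode}) / getHostFact(numCores, 1) * 100"


def cpu_usage_tsquery(modes=CPU_MODES):
    """Join one expression per CPU mode into a single select statement."""
    return "select " + ", ".join(cpu_percent_expr(mode) for mode in modes)


if __name__ == "__main__":
    print(cpu_usage_tsquery())
```

Generating the statement keeps the six expressions consistent and makes it easy to chart a subset, for example `cpu_usage_tsquery(("user", "system"))`.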
Cloudera Manager – Service Extensibility
• Introduced in C5
• Still in Beta!
• Some aspects (especially Parcel mgmt) available in CM4.x
• Example: Collaboration with Syncsort to deploy DMX-h libraries
• Single management console for CDH, non-CDH services and ISV applications
• Similar look and feel as existing services
• Easy to write (Java-‐free!)
• Flexible
• Independent release cycle
Analogy from Operating Systems (OS) world
Core OS kernel
Package Mgmt
Process/ Resource Mgmt
Security Mgmt
Data Access Mgmt
ISV’s view of OS
Systems Management
Bringing ISV Apps to CDH
Core Hadoop/CDH kernel
Parcels
Resource Mgmt
Security Mgmt
CDK APIs
ISV’s view of Hadoop
Cloudera Manager
Integrating into the Cloudera Product Portfolio
Cloudera Manager
Features / Description / Examples

Package Mgmt
  Description:
  - Ability to easily package and distribute binaries/JARs via “Parcels”
  Examples:
  - Informatica
  - Syncsort

Resource Mgmt
  Description:
  - Ability to deploy applications as stand-alone processes or via YARN* on the Hadoop grid
  - Resource isolation of cluster resources
  Examples:
  - SAS
  - 0xData
  - Accumulo

Security Mgmt
  Description:
  - Support for Kerberos Mgmt
  - Role-based access control for Tables/Views in Hive/Impala via Sentry

Data Access Mgmt
  Description:
  - HDFS and HBase API abstraction and simplification

Systems Mgmt
  Manage
  - Deploy and upgrade (rolling) services and pkgs
  - Manage configurations
  Monitor
  - Proactive health checks
  - Track resource utilization
  - Custom metrics charts
  Diagnose
  - Distributed log collection and searching
  - Tag and track key events
  Integrate
  - Access operational tools via API
  - Surface overall cluster metrics to ISV dashboard

Non-CDH Apps…
ISVs
Accumulo, Spark, Giraph etc.
* Support for YARN planned as part of CM5.x in FY14
So.. How does it work?
• A JSON file that describes your service
• Set of control scripts
• Packaged as a JAR file
• As promised, Java-free
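Since a JAR file is a ZIP archive, the packaging step can be pictured without any Java tooling. The sketch below is illustrative only: the archive layout (`descriptor/service.json`, `scripts/control.sh`) and the helper names are assumptions, because the slides do not specify how the JAR must be organized or whether a manifest is required.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def package_service(descriptor, control_script, jar_path):
    """Write a JSON descriptor and one control script into a JAR (ZIP) archive.

    The entry names below are hypothetical; the real layout expected by
    Cloudera Manager is not given in the slides.
    """
    with zipfile.ZipFile(jar_path, "w", zipfile.ZIP_DEFLATED) as jar:
        jar.writestr("descriptor/service.json", json.dumps(descriptor, indent=2))
        jar.writestr("scripts/control.sh", control_script)
    return jar_path


def list_entries(jar_path):
    """Return the sorted entry names of the archive."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(jar.namelist())


if __name__ == "__main__":
    descriptor = {"name": "spark", "roles": [{"name": "master"}]}
    script = '#!/bin/bash\necho "control script placeholder"\n'
    with tempfile.TemporaryDirectory() as tmp:
        jar_file = package_service(descriptor, script, Path(tmp) / "spark-service.jar")
        print(list_entries(jar_file))
```

The point of the sketch is that the "Java-free" claim holds for packaging too: the JAR is only a container for the descriptor and the scripts.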
Example: Cloudera Manager Extensions - Spark
Cloudera Manager Extensions
Cloudera Manager Extensions: Spark
The Code

Control script (scripts/control.sh):

    #!/bin/bash
    CMD=$1
    MASTER_PORT=<read in from ./params.properties>

    case $CMD in
      (start_master)
        exec $SPARK_HOME/scripts/spark-start.sh master
        ;;
      (*)
        echo "$timestamp Don't understand [$CMD]"
        ;;
    esac

Service descriptor (JSON):

    name : "spark",
    roles : [{
      name : "master",
      startRunner : {
        program : "scripts/control.sh",
        args : [ "start_master", "./params.properties" ]
      },
      parameters : [{
        name : "master_port",
        type : "port",
        default : 7077
      }],
      configWriter : {
        generators : [{ filename : "params.properties" }]
      }
    }]
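The descriptor declares a `master_port` parameter and a generated `params.properties` file that the control script reads. How Cloudera Manager actually renders that file is not shown in the slides; the sketch below simply assumes a `key=value` properties format and uses hypothetical helper names to illustrate how declared parameters and their defaults could become the file's contents.

```python
# Descriptor fragment mirroring the slide; the dict form is for illustration only.
DESCRIPTOR = {
    "name": "spark",
    "roles": [{
        "name": "master",
        "parameters": [{"name": "master_port", "type": "port", "default": 7077}],
        "configWriter": {"generators": [{"filename": "params.properties"}]},
    }],
}


def render_properties(role, overrides=None):
    """Render a role's parameters as key=value lines (assumed file format).

    Values in 'overrides' replace the declared defaults; parameters of
    type 'port' are checked to be valid TCP port numbers.
    """
    overrides = overrides or {}
    lines = []
    for param in role["parameters"]:
        value = overrides.get(param["name"], param["default"])
        if param["type"] == "port" and not 0 < int(value) < 65536:
            raise ValueError(f"{param['name']}: {value} is not a valid port")
        lines.append(f"{param['name']}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    master = DESCRIPTOR["roles"][0]
    filename = master["configWriter"]["generators"][0]["filename"]
    print(f"# {filename}")
    print(render_properties(master), end="")
```

With a file in this shape, the control script's placeholder for reading `MASTER_PORT` would amount to looking up the `master_port` key.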
Next Steps
• Documentation & SDK as part of C5 Beta2 or later (definitely before GA!)
• Working with select ISVs (SAS, Syncsort, 0xData etc.) as part of Beta to further fine-tune this feature
Develop & Contribute your Cloudera Manager service extensibility plug-ins!
Vision of CM Extensibility
[Diagram: CDH and CM at the center, extended along two axes. Labels: Horizontal Extension (API); Vertical Extension (Service Extensibility); Ops Apps (Capacity Mgr, SLA Mgr, Cost Optimizer); Syncsort, Informatica, Security ISVs, 0xData, SAS, Revolution, Spark, Giraph, Accumulo; Oracle OEM, Dell, Nagios (API, SNMP); Chef/Puppet.]
Q&A