Pivotal: Operationalizing 1000 Node Hadoop Cluster - Analytics Workbench

Post on 10-May-2015

DESCRIPTION

Pivotal has set up and operationalized a 1,000-node Hadoop cluster called the Analytics Workbench. Managing a deployment this large takes special setup and skills. This session shares how we set it up and how we manage it.

Transcript

1 © Copyright 2013 EMC Corporation. All rights reserved.

Operationalizing 1000 Node Hadoop Cluster – Analytics Workbench

Clinton Ooi
Bhavin Modi

Agenda

Introduction

Tools
– Kickstart
– Parallel SSH
– Puppet

Q & A

Meet AWB

Introduction to the Analytics Workbench

Vision Statement

Provide a collaborative platform that is:

AGILE: Support platform for proving mixed mode enterprise readiness at scale.

INNOVATIVE: Showcase ground breaking data science.

ACCESSIBLE: Create a shared environment for rapid innovation of big data and cloud computing technologies.

EDUCATIONAL: Provide a resource for educating developers, partners, and customers on big data and cloud technologies.

Partners

Intel – contributed 2,000 hex-core CPUs

Mellanox – contributed 72 switches, 1000+ network cards, 1400+ cables

Micron – contributed 6,000 memory modules

Seagate – contributed 12,000 2TB drives

Supermicro – contributed 1,000+ servers

Switch – contributed the hosting facility in its state-of-the-art data center

VMware – provided operational support

Quick facts

Largest Hadoop cluster of its kind

Operational since July 2012

Single multi-tenant cluster

Physical cluster (no virtualization)

25 projects - 12 active, 8 in pipeline

Use-case

Pivotal Demonstration

Partner Engagements

Industry and Academia Collaboration

Tools

Scalable Tool Chain & Standardization

AWB Cluster Lifecycle

Kickstart

Generic tool to automate OS install

Requires DHCP, TFTP and HTTP services

TFTP serves the PXELINUX configuration file (named after the node's IP address in hexadecimal), the Linux kernel (vmlinuz), and the in-memory file system (initrd)

HTTP serves the kickstart configuration (kickstart.cfg)

Kickstart (continued)

Example of PXELINUX file - /tftpboot/pxelinux.cfg/AC1C0401

    default install
    label install
      kernel centos/6.2/vmlinuz
      append initrd=centos/6.2/initrd.img ramdisk_size=9025 text console=ttyS2,115200,n,1 sshd=1 install=http://10.1.25.51/centos/6.2/os/x86_64 ks=http://10.1.25.51/centos/6.2/kickstart/conf/kickstart.cfg
    implicit 1
    display message
    prompt 1
    timeout 10
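The file name AC1C0401 in the example above is an IPv4 address written as eight uppercase hex digits, which is one of the names PXELINUX looks up for a booting client. A minimal sketch of that conversion follows; the function name `ip_to_hex` is made up for illustration, and the address 172.28.4.1 is simply what AC1C0401 decodes to, not something stated on the slide.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into the 8-digit uppercase hex
# name that PXELINUX searches for under pxelinux.cfg/.
# (ip_to_hex is a hypothetical helper name, not part of any tool.)
ip_to_hex() (
    # Runs in a subshell, so the caller's IFS and options are untouched.
    IFS=.
    set -f            # disable globbing while splitting on dots
    set -- $1         # $1..$4 now hold the four octets
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

ip_to_hex 172.28.4.1    # prints AC1C0401
```

Naming the file after the full address makes the configuration apply to exactly one node, which is what allows a single node to be re-imaged without affecting its neighbours.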

Kickstart (continued)

Example of kickstart config

    …
    url --url http://10.1.25.51/centos/6.2/os/x86_64
    ...
    %packages
    @core
    @performance
    …
    %post --log=/root/kickstart-post.log
    wget -O /root/post-install.tgz http://10.1.25.51/centos/6.2/post-install.tgz
    …

Kickstart (continued)

Generate PXELINUX and kickstart files

    [cooi@ks ~]$ ./kickstart --generate --os centos --osver 6.2 --restart pxe node0945
    Generating /tftpboot/pxelinux.cfg/AC1C0401
    Setting bootdev on node0945.sp
    Set Boot Device to pxe
    Restarting node0945.sp
    Chassis Power Control: Cycle

    [cooi@ks ~]$ for i in `seq -w 1 200`; do ./kickstart --generate --os centos --osver 6.2 --restart pxe node0$i; done
    … Skipping
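The `./kickstart` script shown above is the presenters' own tool and its source is not part of the slides. As a rough illustration of what a "generate" step could look like, the sketch below writes a per-node PXELINUX file named after the node's hex IP. The function name `generate_pxe_config`, its argument order, and the trimmed-down `append` line are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch of a "--generate" step; NOT the presenters' actual tool.
# Usage: generate_pxe_config <pxe_dir> <node_ip> <os> <osver> <server_ip>
generate_pxe_config() (
    pxe_dir=$1; node_ip=$2; os=$3; osver=$4; server=$5
    base="http://$server/$os/$osver"

    # PXELINUX looks for a file named after the client IP in uppercase hex.
    IFS=.
    set -f
    set -- $node_ip
    name=$(printf '%02X%02X%02X%02X' "$1" "$2" "$3" "$4")

    mkdir -p "$pxe_dir"
    # Reduced version of the boot entry from the slides (assumed layout).
    printf '%s\n' \
        "default install" \
        "label install" \
        "  kernel $os/$osver/vmlinuz" \
        "  append initrd=$os/$osver/initrd.img text install=$base/os/x86_64 ks=$base/kickstart/conf/kickstart.cfg" \
        "prompt 1" \
        "timeout 10" > "$pxe_dir/$name"
    echo "Generating $pxe_dir/$name"
)
```

Under these assumptions, calling `generate_pxe_config /tftpboot/pxelinux.cfg 172.28.4.1 centos 6.2 10.1.25.51` would write `/tftpboot/pxelinux.cfg/AC1C0401`. Setting the boot device and power-cycling the node, as the real tool does, is left out of the sketch.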

Kickstart (continued)

Enables switching or upgrading the OS easily

Kickstart 60 nodes in ~45 minutes:
– 1 kickstart server with software RAID5
– 100Mbps TOR and aggregator switches
– Saturated the 100Mbps network

Kickstart 200 nodes in ~45 minutes:
– 2 kickstart servers with software RAID5
– 100Mbps TOR switches and 1Gbps aggregator switches

Estimate to do >1000 nodes with full 1Gbps network
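The slides do not say how the ">1000 nodes" estimate was derived. One back-of-envelope reading, assuming throughput scales linearly with link speed and with the number of kickstart servers, starts from the measured baseline of 60 nodes per server on a saturated 100Mbps network. The function name `estimate_nodes` and the linear model are assumptions for illustration only.

```shell
#!/bin/sh
# Rough estimate of nodes kickstarted per ~45-minute window.
# Assumption (not from the slides): throughput scales linearly with
# link speed and with the number of kickstart servers.
# Usage: estimate_nodes <link_mbps> <servers>
estimate_nodes() (
    link_mbps=$1; servers=$2
    # Baseline from the slides: 1 server, saturated 100Mbps, 60 nodes.
    echo $(( 60 * link_mbps / 100 * servers ))
)

estimate_nodes 100 1     # prints 60   (the measured baseline)
estimate_nodes 1000 2    # prints 1200 (two servers on a full 1Gbps network)
```

The model is crude: it predicts 120 nodes for two servers at 100Mbps, whereas the measured figure with 1Gbps aggregator switches was 200, so the real bottleneck depends on where in the network the slow links sit.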

Parallel SSH

Sys admin’s lightsaber

Parallel SSH (continued)

Start/Stop Hadoop services

Orchestrate cluster deployments

Perform manual cluster administration tasks

Pick one that is user-friendly and scalable, e.g.
– Massh - http://m.a.tt/er/massh/
– ClusterShell - https://github.com/cea-hpc/clustershell
– Parallel Distributed Shell (pdsh) - https://code.google.com/p/pdsh
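To make the idea behind these tools concrete, here is a minimal fan-out sketch built only on `ssh` and `xargs -P` (available in GNU and BSD xargs). It is not a substitute for the tools listed above, which add host-range syntax, output gathering, and error reporting. The function name `fanout`, the `DRY_RUN` switch, and the host file format (one host name per line) are assumptions for this example.

```shell
#!/bin/sh
# Minimal parallel-SSH sketch (hypothetical helper, not one of the tools above).
# Usage: fanout <parallelism> <hostfile> <command...>
# Set DRY_RUN to any non-empty value to print the ssh commands instead
# of running them.
fanout() {
    parallelism=$1; hostfile=$2
    shift 2
    if [ -n "${DRY_RUN:-}" ]; then
        # Print what would be executed, one line per host.
        xargs -P "$parallelism" -I{} echo ssh -n {} "$@" < "$hostfile"
    else
        # -n keeps ssh from reading stdin; at most <parallelism> run at once.
        xargs -P "$parallelism" -I{} ssh -n {} "$@" < "$hostfile"
    fi
}
```

For instance, `fanout 32 hosts.txt uptime` would run `uptime` on every host in `hosts.txt`, 32 at a time; replace `uptime` with whatever start or stop command the deployment uses. Output from different hosts arrives in completion order, which is one reason the dedicated tools are more pleasant at scale.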

Puppet

Configuration Management framework

Install and configure all applications on the cluster

Configure monitoring system

Currently running Puppet 2.7.x

Puppet (continued)

Puppet sync 600 nodes in ~15 minutes:
– Use parallel SSH tool to trigger Puppet sync across the cluster
– 1 Puppet master with dual hex-core CPU
– Saturated CPU on the Puppet master

Switch versions of Hadoop in 2 hours

Manifests and modules are version-controlled
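The 600-nodes-in-15-minutes figure above works out to roughly 40 nodes per minute through a single CPU-bound Puppet master. A small sketch of extrapolating that rate to other cluster sizes follows; it assumes sync time grows linearly with node count on the same master, which the slides do not claim, and the function name `estimate_sync_minutes` is made up for this example.

```shell
#!/bin/sh
# Extrapolate Puppet sync time from the measured 600 nodes in ~15 minutes.
# Assumption (not from the slides): time scales linearly with node count
# on the same single Puppet master.
# Usage: estimate_sync_minutes <nodes>
estimate_sync_minutes() (
    nodes=$1
    # Integer ceiling of nodes * 15 / 600.
    echo $(( (nodes * 15 + 599) / 600 ))
)

estimate_sync_minutes 600     # prints 15
estimate_sync_minutes 1000    # prints 25
```

Since the master's CPU was the saturated resource, adding masters or reducing catalog compilation cost would be the levers for beating this estimate.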

Puppet (continued)

It took one quarter to learn, design, and deploy our Puppet infrastructure.

– It is an iterative process.

Tasks managed outside of Puppet:
– User account management
– Start/Stop Hadoop services
– Orchestrate deployment
– Rollback/uninstall applications

Cluster Management Tools

Task / Tools: Kickstart | Parallel SSH | Puppet | Nagios | Ganglia

Install OS

Install Apps

Configure Apps

Start / Stop Services

Monitoring

Q & A

http://www.analyticsworkbench.com

Pivotal Sessions at EMC World

Session: The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications
Presenter: Josh Klahr
Dates/Times: Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005

Session: Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action
Presenter: Noelle Sio
Dates/Times: Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F

Session: Pivotal: Operationalizing 1000-node Hadoop Cluster – Analytics Workbench
Presenters: Clinton Ooi, Bhavin Modi
Dates/Times: Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A

Session: Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights
Presenter: SK Krishnamurthy
Dates/Times: Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M

Session: Pivotal: Big & Fast data – merging real-time data and deep analytics
Presenter: Michael Crutcher
Dates/Times: Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M

Session: Pivotal: Virtualize Big Data to Make The Elephant Dance
Presenters: June Yang, Dan Baskette
Dates/Times: Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E

Session: Hadoop Design Patterns
Presenter: Don Miner
Dates/Times: Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005
