A Day in the Life of a Hadoop Administrator

Post on 13-Apr-2017

Category:

Technology

www.edureka.co/hadoop-admin

A day in the life of a Hadoop Administrator!

What will you learn today?

Let us have a quick poll: do you know the following topics?

• The daily tasks Hadoop admins do
• Cluster monitoring tools
• How fault tolerance is maintained in a cluster
• Demo on Hadoop High Availability
• Demo on YARN High Availability

Daily tasks

Cluster Monitoring

First thing in the morning, check the monitoring console (Cloudera Manager, Nagios, Ganglia, etc.) and the JobTracker UI.

A Few Cluster Monitoring Tools

Cluster Plan

Plan the day and review the past tasks in a meeting

Cluster Plan

Typical slave node hardware configurations

Midline configuration (all-around, deep storage, 1 Gb Ethernet)

CPU: 2 × 6 core 2.9 GHz/15 MB cache
Memory: 64 GB DDR3-1600 ECC
Disk controller: SAS 6 Gb/s
Disks: 12 × 3 TB LFF SATA II 7200 RPM
Network controller: 2 × 1 Gb Ethernet
Notes: CPU features such as Intel's Hyper-Threading and QPI are desirable. Allocate memory to take advantage of triple- or quad-channel memory configurations.

High-end configuration (high memory, spindle dense, 10 Gb Ethernet)

CPU: 2 × 6 core 2.9 GHz/15 MB cache
Memory: 96 GB DDR3-1600 ECC
Disk controller: 2 × SAS 6 Gb/s
Disks: 24 × 1 TB SFF Nearline/MDL SAS 7200 RPM
Network controller: 1 × 10 Gb Ethernet
Notes: Same as the midline configuration
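When comparing the two configurations, the raw disk capacity per node follows directly from the disk counts listed above. The sketch below works through that arithmetic. It assumes HDFS's default replication factor of 3 and a hypothetical 25% reserve for non-HDFS data (temporary and intermediate files); neither figure comes from the slides, and the class and method names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """Disk layout of one slave node (values taken from the slides)."""
    name: str
    disk_count: int
    disk_size_tb: float

    def raw_tb(self) -> float:
        # Raw capacity is simply disks x size per disk.
        return self.disk_count * self.disk_size_tb

    def usable_tb(self, replication: int = 3, reserve_fraction: float = 0.25) -> float:
        # ASSUMPTIONS: replication=3 is the HDFS default; the 25% reserve
        # for non-HDFS data is a hypothetical planning figure.
        return self.raw_tb() * (1 - reserve_fraction) / replication


MIDLINE = NodeConfig("midline", disk_count=12, disk_size_tb=3.0)
HIGH_END = NodeConfig("high end", disk_count=24, disk_size_tb=1.0)

for cfg in (MIDLINE, HIGH_END):
    print(f"{cfg.name}: {cfg.raw_tb():.0f} TB raw, "
          f"~{cfg.usable_tb():.1f} TB of unique data per node")
```

Under these assumptions the midline node holds 36 TB raw (about 9 TB of unique data) and the high-end node 24 TB raw (about 6 TB), which illustrates that the high-end layout trades capacity for more spindles and network bandwidth.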

Execute a Few Regular Utility Tasks

Developing and running a file merger so that the many small files and directories our data suppliers create are consolidated into fewer, larger ones.
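The merger itself is not shown in the slides. As one possible sketch, its planning step could group small files into batches that each approach a target size, such as one HDFS block. The 128 MB target, the function name `plan_merge_batches`, and the sample paths are assumptions for illustration; the actual merge (reading and rewriting the files on HDFS) is left out.

```python
from typing import Iterable, List, Tuple

# ASSUMPTION: aim for roughly one 128 MB block per merged file.
TARGET_BYTES = 128 * 1024 * 1024


def plan_merge_batches(
    files: Iterable[Tuple[str, int]],
    target_bytes: int = TARGET_BYTES,
) -> List[List[str]]:
    """Group (path, size_in_bytes) pairs into batches of at most target_bytes.

    Files are visited from largest to smallest and packed greedily into the
    current batch; a new batch starts when the next file would overflow it.
    Files already at or above the target are skipped, and batches containing
    a single file are dropped because merging them would change nothing.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        if size >= target_bytes:
            continue  # already large enough, leave untouched
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return [batch for batch in batches if len(batch) > 1]


if __name__ == "__main__":
    MB = 1024 * 1024
    # Hypothetical listing of a supplier's landing directory.
    listing = [("/in/a", 100 * MB), ("/in/b", 60 * MB), ("/in/c", 50 * MB),
               ("/in/d", 10 * MB), ("/in/big", 200 * MB)]
    for batch in plan_merge_batches(listing):
        print(batch)
```

Keeping the planning step separate from the I/O makes it easy to dry-run the merger against a directory listing before any data is rewritten.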

Backup And Recovery Task

Demo on Achieving Hadoop High Availability

Job Scheduling And Configuration

Keep the farm working: building monitoring, managing resources between our users and our tools, and tuning configurations for the farm stack, for MapReduce and Spark jobs, and for the servers.

Analyzing Failed Tasks

Analyzing overly heavy or failed jobs and fixing the problems behind them

Demo on Achieving YARN High Availability

Evaluating New Host Requests

Collecting and defining requirements for new hosts

Updates And Upgrades

Upgrading and updating the farm from time to time

Try And Finalize New Solutions

Setting benchmarks for new projects

Be In Touch With New Configuration Tools

Setting up a configuration management tool for our test and production environments

Execute a Few DWH Responsibilities

Developing an easy infrastructure for loading data into the cluster and into Hive and HBase

Assisting Hadoop Developers

Daily support for developers who use the Hadoop stack

Checking usage of Resources and User-Permissions

Managing users, permissions, quotas, etc.
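Quota checks lend themselves to scripting. The sketch below parses one line of `hdfs dfs -count -q <path>` output, assuming the documented column order (name quota, remaining name quota, space quota, remaining space quota, directory count, file count, content size, path) and that unset quotas appear as `none` and `inf`. The function names and the sample line are hypothetical and not taken from the slides.

```python
from typing import NamedTuple, Optional


class QuotaReport(NamedTuple):
    path: str
    name_quota: Optional[int]
    name_remaining: Optional[int]
    space_quota: Optional[int]       # bytes, counted after replication
    space_remaining: Optional[int]   # bytes


def _to_int(field: str) -> Optional[int]:
    # ASSUMPTION: unset quotas are printed as "none" / "inf".
    return None if field in ("none", "inf") else int(field)


def parse_count_q_line(line: str) -> QuotaReport:
    """Parse a single output line of `hdfs dfs -count -q`."""
    fields = line.split(None, 7)  # the path is last and may contain spaces
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
    quota, rem_quota, space_quota, rem_space, _dirs, _files, _size, path = fields
    return QuotaReport(path.strip(), _to_int(quota), _to_int(rem_quota),
                       _to_int(space_quota), _to_int(rem_space))


def space_used_fraction(report: QuotaReport) -> Optional[float]:
    """Fraction of the space quota consumed, or None if no quota is set."""
    if not report.space_quota or report.space_remaining is None:
        return None
    return 1 - report.space_remaining / report.space_quota


if __name__ == "__main__":
    # Hypothetical sample line: 1 GiB space quota with 256 MiB remaining.
    sample = "100 40 1073741824 268435456 5 55 268435456 /user/alice"
    report = parse_count_q_line(sample)
    print(report.path, space_used_fraction(report))
```

A daily job could run this over each user directory and flag those above a chosen threshold before writes start failing.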

Demo on User permission and Quota

Troubleshooting

Common Error Messages

NameNode startup fails

Exception when initializing the filesystem

Could only be replicated to 0 nodes instead of 1

Server not available

Could not obtain block blk_-4157273618194597760_1160 from any node

Could not get block locations. Aborting...
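When scanning logs for the messages listed above, a small matcher can help with triage. The sketch below covers the three block-related messages from the list; the category names and the hint attached to each pattern are assumed general starting points, not diagnoses given in the slides.

```python
import re
from typing import Optional, Tuple

# (pattern, category, hint). Categories and hints are ASSUMPTIONS for illustration.
RULES = [
    (re.compile(r"could only be replicated to 0 nodes", re.IGNORECASE),
     "write_no_datanode",
     "No DataNode accepted the block; check that DataNodes are live and have free space."),
    (re.compile(r"could not obtain block", re.IGNORECASE),
     "read_block_unavailable",
     "No replica could be read; check the DataNodes holding the block and run fsck."),
    (re.compile(r"could not get block locations", re.IGNORECASE),
     "block_locations_unavailable",
     "The client could not resolve block locations; check NameNode and DataNode health."),
]

BLOCK_ID = re.compile(r"blk_-?\d+_\d+")


def classify(line: str) -> Optional[Tuple[str, str]]:
    """Return (category, hint) for the first matching rule, else None."""
    for pattern, category, hint in RULES:
        if pattern.search(line):
            return category, hint
    return None


def extract_block_id(line: str) -> Optional[str]:
    """Pull a block ID such as blk_-4157273618194597760_1160 out of a log line."""
    match = BLOCK_ID.search(line)
    return match.group(0) if match else None


if __name__ == "__main__":
    line = "Could not obtain block blk_-4157273618194597760_1160 from any node"
    print(classify(line), extract_block_id(line))
```

Extracting the block ID separately makes it possible to look up the affected file afterwards, for example with `hdfs fsck`.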

Certifications

Edureka's Hadoop Administration course:

• Become a Hadoop Administrator by mastering Hadoop Cluster: Planning & Deployment, Monitoring, Performance Tuning, Security using Kerberos, HDFS High Availability using Quorum Journal Manager (QJM), and Oozie, HCatalog/Hive Administration.
• Online Live Courses: 24 hours
• Assignments: 30 hours
• Project: 20 hours
• Lifetime Access + 24 × 7 Support

Go to www.edureka.co/Hadoop-admin

Batch starts from 21 November (Weekend Batch)

Thank You

Questions/Queries/Feedback

Recording and presentation will be made available to you within 24 hours
