A Day in the Life of a Hadoop Administrator

Apr 13, 2017

Technology

Edureka!
Page 1: A Day in the Life of a Hadoop Administrator

www.edureka.co/hadoop-admin

A day in the life of Hadoop Administrator!

What will you learn today?

Let us have a quick poll: do you know the following topics?

• The daily tasks Hadoop admins do
• Cluster monitoring tools
• How fault tolerance is maintained in a cluster
• Demo on Hadoop High Availability
• Demo on YARN High Availability

Daily tasks

Cluster Monitoring

First thing in the morning, check the monitoring console (Cloudera Manager, Nagios, Ganglia, etc.) and the JobTracker UI.
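A morning check like this can be partly scripted. The sketch below is only an illustration, not part of the original slides: it assumes the NameNode's JMX servlet output has already been fetched and parsed into a Python dict, and that the FSNamesystemState bean exposes the attributes NumDeadDataNodes, CapacityTotal and CapacityUsed (verify the names against your Hadoop version). The function name, the 80% threshold and the sample payload are hypothetical.

```python
from typing import Any, Dict, List

# Bean name as exposed by the NameNode JMX servlet (assumed; check your version).
FS_STATE_BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def check_namenode_health(jmx: Dict[str, Any],
                          max_used_fraction: float = 0.8) -> List[str]:
    """Return human-readable warnings for one parsed JMX payload."""
    bean = next((b for b in jmx.get("beans", [])
                 if b.get("name") == FS_STATE_BEAN), None)
    if bean is None:
        return ["FSNamesystemState bean not found"]

    warnings: List[str] = []

    # Any dead DataNode deserves a look before users notice.
    dead = bean.get("NumDeadDataNodes", 0)
    if dead > 0:
        warnings.append(f"{dead} dead DataNode(s)")

    # Warn when HDFS usage crosses the (hypothetical) threshold.
    total = bean.get("CapacityTotal", 0)
    used = bean.get("CapacityUsed", 0)
    if total > 0 and used / total > max_used_fraction:
        warnings.append(f"HDFS is {used / total:.0%} full")

    return warnings


# Hypothetical sample payload, shaped like the JMX servlet's JSON output.
sample = {"beans": [{
    "name": FS_STATE_BEAN,
    "NumLiveDataNodes": 9,
    "NumDeadDataNodes": 1,
    "CapacityTotal": 1000,
    "CapacityUsed": 850,
}]}

print(check_namenode_health(sample))
```

A script of this kind complements, rather than replaces, tools such as Nagios or Ganglia, which can run the same check on a schedule.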

A Few Cluster Monitoring Tools

Cluster Plan

Plan the day and review the past tasks in a meeting

Cluster Plan

Typical slave node hardware configurations

Midline configuration (all around, deep storage, 1 Gb Ethernet)

CPU: 2 × 6 core 2.9 GHz/15 MB cache
Memory: 64 GB DDR3-1600 ECC
Disk controller: SAS 6 Gb/s
Disks: 12 × 3 TB LFF SATA II 7200 RPM
Network controller: 2 × 1 Gb Ethernet
Notes: CPU features such as Intel's Hyper-Threading and QPI are desirable. Allocate memory to take advantage of triple- or quad-channel memory configurations.

High end configuration (high memory, spindle dense, 10 Gb Ethernet)

CPU: 2 × 6 core 2.9 GHz/15 MB cache
Memory: 96 GB DDR3-1600 ECC
Disk controller: 2 × SAS 6 Gb/s
Disks: 24 × 1 TB SFF Nearline/MDL SAS 7200 RPM
Network controller: 1 × 10 Gb Ethernet
Notes: Same as the midline configuration
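To compare the two configurations when planning capacity, it can help to estimate how much HDFS data one node can actually hold. The following is a rough sketch, not a sizing rule from the slides: it assumes the HDFS default replication factor of 3 and a hypothetical 25% of raw disk reserved for intermediate and non-HDFS data. The function name and the reserve fraction are illustrative choices.

```python
def usable_hdfs_tb(disks: int, disk_tb: float,
                   replication: int = 3,
                   reserve_fraction: float = 0.25) -> float:
    """Estimate usable (post-replication) HDFS capacity of one node in TB.

    replication=3 is the HDFS default; reserve_fraction is an assumed
    share of raw disk kept free for shuffle/intermediate data and the OS.
    """
    raw_tb = disks * disk_tb
    return raw_tb * (1 - reserve_fraction) / replication


# Disk counts and sizes taken from the two configurations above.
midline = usable_hdfs_tb(disks=12, disk_tb=3)   # 36 TB raw
high_end = usable_hdfs_tb(disks=24, disk_tb=1)  # 24 TB raw

print(f"midline:  {midline:.1f} TB usable per node")
print(f"high end: {high_end:.1f} TB usable per node")
```

Under these assumptions the midline node stores more data per node, while the high end node trades capacity for more spindles, more memory and a faster network.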

Execute a Few Regular Utility Tasks

Develop and run a file merger so that the many small files and directories our data suppliers create become fewer, larger files.
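The planning half of such a merger can be sketched in a few lines. This is an illustrative assumption about how a merger might group files, not the tool used in the talk: it takes (path, size) pairs that would come from a directory listing and packs small files into batches of at most one target size (128 MB here, the default HDFS block size in Hadoop 2). The paths and the function name are hypothetical, and the actual concatenation step is left out.

```python
from typing import Iterable, List, Tuple

MB = 1024 * 1024


def plan_merges(files: Iterable[Tuple[str, int]],
                target_bytes: int = 128 * MB) -> List[List[str]]:
    """Group small files into batches whose total size fits target_bytes.

    Files already at or above the target are left alone, and a batch
    containing a single file is dropped because merging it gains nothing.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    def flush() -> None:
        nonlocal current, current_size
        if len(current) > 1:
            batches.append(current)
        current, current_size = [], 0

    for path, size in sorted(files):
        if size >= target_bytes:
            continue  # already big enough
        if current and current_size + size > target_bytes:
            flush()
        current.append(path)
        current_size += size

    flush()
    return batches


# Hypothetical listing: (path, size in bytes).
listing = [
    ("/data/in/a.log", 40 * MB),
    ("/data/in/b.log", 50 * MB),
    ("/data/in/c.log", 30 * MB),
    ("/data/in/d.log", 200 * MB),
    ("/data/in/e.log", 100 * MB),
    ("/data/in/f.log", 10 * MB),
]

for batch in plan_merges(listing):
    print(batch)
```

Fewer, larger files matter because the NameNode keeps metadata for every file and block in memory, so millions of tiny files cost far more than their data size suggests.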

Backup And Recovery Task

Demo on Achieving Hadoop High Availability

Job Scheduling And Configuration

Keep the farm working: build monitoring, manage resources between our users and our tools, and tune configurations for the farm stack, for MapReduce and Spark jobs, and for the servers.

Analyzing Failed Tasks

Analyze jobs that are too heavy or have failed, and fix the underlying problems.

Demo on Achieving YARN High Availability

Evaluating New Host Requests

Collecting and defining requirements for new hosts

Updates And Upgrades

Upgrading and updating the farm from time to time

Try And Finalize New Solutions

Setting Benchmarks for new projects.

Be In Touch With New Configuration Tools

Set up a configuration management tool for our test and production environments

Execute a Few DWH Responsibilities

Develop an easy infrastructure for loading data into the cluster and into Hive and HBase.

Assisting Hadoop Developers

Daily support for developers who use the Hadoop stack

Checking usage of Resources and User-Permissions

Managing users, permissions, quotas, etc.
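Quota usage can be checked from the output of `hdfs dfs -count -q <path>`, which prints the name quota, remaining name quota, space quota, remaining space quota, directory count, file count, content size and path. The sketch below parses one such line; it is a minimal illustration, and the sample line, the user directory and the helper names are hypothetical. Unset quotas appear as `none` and `inf` and are mapped to `None` here.

```python
from typing import NamedTuple, Optional


class QuotaUsage(NamedTuple):
    name_quota: Optional[int]
    name_remaining: Optional[int]
    space_quota: Optional[int]
    space_remaining: Optional[int]
    path: str


def _num(field: str) -> Optional[int]:
    """Convert a column to int; 'none'/'inf' mean no quota is set."""
    return None if field in ("none", "inf") else int(field)


def parse_count_q(line: str) -> QuotaUsage:
    """Parse one output line of `hdfs dfs -count -q <path>`."""
    parts = line.split(None, 7)
    if len(parts) != 8:
        raise ValueError(f"unexpected column count: {line!r}")
    return QuotaUsage(_num(parts[0]), _num(parts[1]),
                      _num(parts[2]), _num(parts[3]), parts[7])


def space_used_fraction(usage: QuotaUsage) -> Optional[float]:
    """Fraction of the space quota consumed, or None if no quota is set."""
    if not usage.space_quota or usage.space_remaining is None:
        return None
    return 1 - usage.space_remaining / usage.space_quota


# Hypothetical sample line: 1 GiB space quota with 256 MiB remaining.
sample_line = ("        1000         900   1073741824   268435456"
               "   12   88   268435456 /user/alice")

usage = parse_count_q(sample_line)
print(usage.path, space_used_fraction(usage))
```

Note that the HDFS space quota is charged for every replica, so a file stored with replication factor 3 consumes three times its size in quota.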

Demo on User permission and Quota

Troubleshooting

Common Error Messages

NameNode startup fails

Exception when initializing the filesystem

Could only be replicated to 0 nodes instead of 1

Server not available

Could not obtain block blk_-4157273618194597760_1160 from any node

Could not get block locations. Aborting...
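When troubleshooting, it is often useful to count how frequently messages like these appear in a log. The following sketch scans log lines for three of the messages listed above; the labels, the regular expressions and the sample lines are illustrative assumptions, and real log formats vary between Hadoop versions.

```python
import re
from collections import Counter
from typing import Iterable

# Patterns derived from the error messages on this slide; labels are hypothetical.
PATTERNS = {
    "under-replicated write": re.compile(
        r"could only be replicated to \d+ nodes", re.IGNORECASE),
    "block unavailable": re.compile(
        r"could not obtain block \S+", re.IGNORECASE),
    "block locations missing": re.compile(
        r"could not get block locations", re.IGNORECASE),
}


def count_known_errors(lines: Iterable[str]) -> Counter:
    """Count how many lines match each known error pattern."""
    counts: Counter = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Hypothetical log excerpt built from the messages above.
sample_log = [
    "java.io.IOException: File /tmp/x could only be replicated to 0 nodes instead of 1",
    "INFO hdfs.DFSClient: Could not obtain block blk_-4157273618194597760_1160 from any node",
    "WARN hdfs.DFSClient: Could not get block locations. Aborting...",
    "INFO namenode: heartbeat ok",
    "java.io.IOException: File /tmp/y could only be replicated to 0 nodes instead of 1",
]

for label, n in count_known_errors(sample_log).items():
    print(f"{label}: {n}")
```

A sudden rise in any of these counts usually points at DataNodes that are down, full, or unreachable from the client.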

Certifications

Edureka's Hadoop Administration course:

• Become a Hadoop Administrator by mastering Hadoop cluster planning and deployment, monitoring, performance tuning, security using Kerberos, HDFS High Availability using Quorum Journal Manager (QJM), and Oozie and HCatalog/Hive administration.
• Online live courses: 24 hours
• Assignments: 30 hours
• Project: 20 hours
• Lifetime access + 24 × 7 support

Go to www.edureka.co/Hadoop-admin

Batch starts from 21 November (Weekend Batch)

Thank You

Questions/Queries/Feedback

Recording and presentation will be made available to you within 24 hours