Hands on Hadoop, Daniel Templeton & Inyoung Cho, Cloudera, Inc.

Java one14 handsonhadoop

Jun 10, 2015

templedf

Slides for the JavaOne 14 Hands-on Lab: Hands-on Hadoop
Transcript
Page 1: Java one14 handsonhadoop

Hands on Hadoop
Daniel Templeton & Inyoung Cho
Cloudera, Inc.

Page 2: Java one14 handsonhadoop

©2014 Cloudera, Inc. All rights reserved.

Your Hosts

Daniel Templeton
• Certification Developer
• Crusty, old HPC guy
• Likes Perl

Inyoung Cho
• Certification Developer
• Recovering Java Evangelist
• Invented JavaOne Hands-on Labs

Page 3: Java one14 handsonhadoop

What is “Big Data”?

• Super-cool marketing buzz word
  • “Come see our new line of BIG DATA toasters…”
• “The Five V’s”
• Any data that is difficult to store in a traditional RDBMS
  • Too big, changes schemas too often, unstructured, …

Page 4: Java one14 handsonhadoop

What is Hadoop?

Page 5: Java one14 handsonhadoop

What is Hadoop?

Page 6: Java one14 handsonhadoop

HDFS in a Nutshell

• Distributed “file system” service
• Highly scalable and fault resilient
• Chunks files into “blocks” that are replicated and distributed across the cluster
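
The chunking and replication idea can be sketched in plain Java. This is an illustration only, not HDFS code: the class and method names are hypothetical, the 128 MB block size and replication factor of 3 mirror common Hadoop 2 defaults, and the round-robin placement is a deliberate simplification of the rack-aware placement a real NameNode performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: models how a file is cut into blocks and how each
// block gets several replicas on different nodes. Not the HDFS API.
final class BlockPlacementSketch {

    /** Number of blocks needed to hold a file of the given size. */
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    /**
     * Assigns every block to `replication` distinct nodes.
     * Assumption: simple round-robin placement, chosen for readability.
     */
    static List<List<String>> place(long fileBytes, long blockBytes,
                                    int replication, List<String> nodes) {
        if (replication > nodes.size()) {
            throw new IllegalArgumentException("not enough nodes for replication");
        }
        List<List<String>> placement = new ArrayList<>();
        long blocks = blockCount(fileBytes, blockBytes);
        for (long b = 0; b < blocks; b++) {
            List<String> replicas = new ArrayList<>();
            for (int r = 0; r < replication; r++) {
                // Consecutive indices modulo the node count are always distinct
                // as long as replication <= nodes.size().
                replicas.add(nodes.get((int) ((b + r) % nodes.size())));
            }
            placement.add(replicas);
        }
        return placement;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        List<List<String>> placement = place(300 * mb, 128 * mb, 3, nodes);
        for (int i = 0; i < placement.size(); i++) {
            System.out.println("block " + i + " -> " + placement.get(i));
        }
    }
}
```

Because every block lives on several nodes, losing one node leaves at least two copies of each block it held, which is where the fault resilience comes from.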

Page 7: Java one14 handsonhadoop

MapReduce in a Nutshell

• Embarrassingly parallel batch execution engine
• Two phases: map and reduce
  • https://www.youtube.com/watch?v=bcjSe0xCHbE
• Tasks are scheduled to run where the data is
• Jobs are written against a Java API
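
The two phases can be sketched with the classic word count, written here in plain Java with no Hadoop dependency. The class and method names are hypothetical and this is not the Hadoop Java API; it only mimics the data flow: map emits (word, 1) pairs, a shuffle groups the pairs by key, and reduce sums each group.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process sketch of the map -> shuffle -> reduce flow.
final class WordCountSketch {

    /** Map phase: one input line in, a list of (word, 1) pairs out. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    /** Reduce phase: all values for one key in, a single total out. */
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    /** Runs map on every line, groups by key (the shuffle), then reduces. */
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }
}
```

Each call to `map` depends only on its own line and each call to `reduce` only on its own key, which is what lets a real cluster run many of them at once on different machines.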

Page 8: Java one14 handsonhadoop

Hive in a Nutshell

• SQL engine for Hadoop
• Translates HiveQL into MapReduce jobs
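
To make the translation idea concrete, here is a rough plain-Java sketch of how a GROUP BY aggregate lines up with the map and reduce phases. The query, the `employees` table, and every name in the code are hypothetical, and Hive's real query planner is far more involved than this; the sketch only shows the shape of the mapping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical query being illustrated:
//   SELECT dept, AVG(salary) FROM employees GROUP BY dept;
final class GroupByAsMapReduceSketch {

    /** Hypothetical row of the employees table. */
    static final class Employee {
        final String dept;
        final double salary;

        Employee(String dept, double salary) {
            this.dept = dept;
            this.salary = salary;
        }
    }

    static Map<String, Double> avgSalaryByDept(List<Employee> rows) {
        // Map phase plus shuffle: the GROUP BY column becomes the key,
        // the aggregated column becomes the value.
        Map<String, List<Double>> shuffled = new TreeMap<>();
        for (Employee e : rows) {
            shuffled.computeIfAbsent(e.dept, k -> new ArrayList<>()).add(e.salary);
        }
        // Reduce phase: the aggregate function runs once per key.
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> group : shuffled.entrySet()) {
            double sum = 0;
            for (double salary : group.getValue()) {
                sum += salary;
            }
            result.put(group.getKey(), sum / group.getValue().size());
        }
        return result;
    }
}
```

Writing the one-line query is much shorter than writing the job by hand, which is the main appeal of a SQL layer on top of MapReduce.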

Page 9: Java one14 handsonhadoop

Impala in a Nutshell

• Hive without the MapReduce

Page 10: Java one14 handsonhadoop

Pig in a Nutshell

• Script-like language for data operations
• Translates into MapReduce jobs

Page 11: Java one14 handsonhadoop

The Lab

• Self-paced
• Should take about 2 hours
• “Additional Exercises” if you finish early
• Inyoung and I are here to answer questions
• Have fun!

Page 12: Java one14 handsonhadoop

Aaron Myers & Daniel Templeton