Hadoop (Pig). Processing of large data (by Eugene Smertenko) - Big Data Tech Hangout - 2013.10.26

Post on 15-Jan-2015

Category:

Education

DESCRIPTION

On Saturday, October 26, the second external meeting of the Tech Hangout Community took place at Creative Space 12, a cultural and educational center in Kiev. The event was held under the motto «Discover the value of Big Data!» Tech Hangout is an event organized by developers for developers to share knowledge and experience. The format is a 30-minute talk on a previously announced topic, followed by a roundtable discussion of the same length. The initiative has proved so popular that Tech Hangout quickly acquired its own logo, blog, and Facebook group where attendees can discuss what they heard. Join the discussion: https://www.facebook.com/groups/techhangout/ Read us: http://hangout.innovecs.com/

Transcript

Hadoop - Pig. Processing of large data

Yevgen Smertenko
Engineering Team Lead. BI Developer.

How it works

[Diagram labels: data, Pig, BI engineer, clear result]

Hadoop - Software Framework

Provides massively parallel processing (MPP) of data

MapReduce program
• Input read
• Map
• Partition / Combine
• Copy / Compare / Merge
• Reduce
• Output write
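The phases listed above can be sketched as a single-process word count. This is only an in-memory illustration in plain Python, not the Hadoop API: the function names (`map_phase`, `combine`, `partition`, `reduce_phase`, `run_job`), the toy hash, and the sample lines are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combine: pre-aggregate one mapper's output before it is shuffled."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """Partition: choose a reducer for a key (toy hash, not Hadoop's partitioner)."""
    return sum(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Reduce: fold all values collected for one key into a single result."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    # Input read + Map + Combine: each line stands in for one input split.
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        for key, value in combine(map_phase(line)):
            # Partition + Copy: route each pair to the bucket of its reducer.
            buckets[partition(key, num_reducers)].append((key, value))

    output = {}
    for bucket in buckets:
        # Compare / Merge: sort by key so equal keys become adjacent.
        bucket.sort(key=itemgetter(0))
        for key, group in groupby(bucket, key=itemgetter(0)):
            # Reduce + Output write (here the "output" is an in-memory dict).
            word, total = reduce_phase(key, (value for _, value in group))
            output[word] = total
    return output


counts = run_job(["pig runs on hadoop", "hadoop runs mapreduce", "pig pig"])
print(counts)
```

In a real cluster the map and reduce calls run on different machines and the copy step moves data over the network; the sketch keeps only the order of the phases.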

MapReduce Data Flow

MapReduce functionality

The Hadoop Ecosystem

PIG

• Data types
• Relational operators
• UDF (user-defined functions)

Pig Latin: a language for describing data flows

Pig. Data Types

Simple Types
• int
• long
• float
• double
• chararray
• bytearray
• boolean
• datetime

Complex Types
• tuple (.., ..)
• map [key#value]
• bag {(), .., ()}
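As a rough analogy for the three complex types, the closest built-in Python structures are shown below. The field values are made up for illustration, and the mapping is approximate: a Pig bag is an unordered collection of tuples that may contain duplicates, which a Python list only imitates.

```python
# Pig tuple (.., ..): an ordered set of fields -> Python tuple.
record = ("alice", 31, 72.5)

# Pig map [key#value]: chararray keys mapped to values -> Python dict.
properties = {"city": "Kiev", "team": "BI"}

# Pig bag {(), .., ()}: a collection of tuples, duplicates allowed
# -> Python list of tuples (the list order carries no meaning here).
visits = [("home", 3), ("search", 1), ("home", 2)]

# Complex types nest: a tuple whose fields are a chararray, a map, and a bag.
user = ("alice", properties, visits)

# Summing one field across the tuples of the bag.
total_hits = sum(hits for _, hits in user[2])
print(total_hits)
```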

Pig. Relational operators

• SPLIT
• UNION
• FILTER
• DISTINCT
• SAMPLE
• FOREACH
• STREAM

• JOIN
• GROUP / COGROUP
• CROSS
• ORDER

• LOAD
• STORE
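To give a feel for how a few of these operators chain into a data flow, here is a plain-Python imitation of a LOAD, FILTER, GROUP, FOREACH, ORDER, STORE pipeline. It is not Pig Latin and does not run on Hadoop; the relation names and the sample rows are hypothetical, and each comment names the operator the step stands in for.

```python
from collections import defaultdict

# LOAD: an in-memory relation of (user, page, seconds) tuples (sample data).
visits = [
    ("ann", "home", 12),
    ("bob", "home", 7),
    ("ann", "search", 30),
    ("bob", "cart", 0),
    ("ann", "home", 5),
]

# FILTER: keep only rows where the visitor spent some time on the page.
engaged = [row for row in visits if row[2] > 0]

# GROUP ... BY user: one (key, bag of rows) entry per user.
groups = defaultdict(list)
for row in engaged:
    groups[row[0]].append(row)

# FOREACH ... GENERATE: per group, emit the key, a row count, and a sum.
summary = [
    (user, len(bag), sum(r[2] for r in bag))
    for user, bag in groups.items()
]

# ORDER ... BY total seconds, descending.
ranked = sorted(summary, key=lambda row: row[2], reverse=True)

# STORE: here the result is simply printed.
for row in ranked:
    print(row)
```

Each step consumes one relation and produces another, which is the sense in which a Pig script describes a data flow.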

PIG. UDF

Eval Functions (EvalFunc)
• Filter Functions
• Aggregate Functions
• Algebraic Interface
• Accumulator Interface

Load/Store Functions (StoreFunc)

piggybank
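The idea behind the Algebraic interface is that an aggregate is split into an initial step, an intermediate step that can be applied repeatedly to partial results, and a final step, so part of the work can be done before the reduce phase. The sketch below illustrates that idea for an average in plain Python. It is an analogy only: real Pig UDFs of this kind are typically Java classes, and the function names `initial`, `intermed`, and `final` and the sample chunks are assumptions made for this example.

```python
def initial(value):
    """Initial step: turn one input value into a partial (sum, count) pair."""
    return (value, 1)


def intermed(partials):
    """Intermediate step: merge partial pairs; safe to apply any number of times."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return (total, count)


def final(partials):
    """Final step: merge the remaining partials and produce the average."""
    total, count = intermed(partials)
    return total / count if count else None


# Two chunks of the same column, as if handled by two separate map tasks.
chunk_a = [4, 8]
chunk_b = [6]

partial_a = intermed([initial(v) for v in chunk_a])
partial_b = intermed([initial(v) for v in chunk_b])

# Only the small (sum, count) pairs need to travel to the final step.
average = final([partial_a, partial_b])
print(average)
```

Carrying (sum, count) pairs instead of partial averages is what makes the merge step correct regardless of how the input was split.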

How it works

[Diagram labels: data, Pig, BI engineer, clear result]

THANKS FOR YOUR ATTENTION!
