Page 1: Big Data, Fast Data - MapReduce in Hazelcast

BIG DATA - FAST DATA
USING MAPREDUCE IN HAZELCAST

Source: http://www.newscientist.com/gallery/dn17805-computer-museums-of-the-world/11

www.hazelcast.com

Page 2: Big Data, Fast Data - MapReduce in Hazelcast

WHO AM I

Christoph Engelbert (@noctarius2k)
8+ years of Java Weirdoness
Performance, GC, traffic topics
Apache DirectMemory PMC
Previous companies incl. Ubisoft and HRS
CastMapR - MapReduce for Hazelcast 3

Page 3: Big Data, Fast Data - MapReduce in Hazelcast

TOPICS

Hazelcast
Distributed Computing
Map & Reduce
Demonstration
Questions

Page 4: Big Data, Fast Data - MapReduce in Hazelcast

HAZELCAST
A SHORT SPACE TRIP

Page 5: Big Data, Fast Data - MapReduce in Hazelcast

WHAT IS HAZELCAST?

In-Memory Data-Grid
Data Partitioning (Sharding)
Java Collections Implementation
Distributed Computing Platform

Page 6: Big Data, Fast Data - MapReduce in Hazelcast

WHY HAZELCAST?

Page 7: Big Data, Fast Data - MapReduce in Hazelcast

WHY IN-MEMORY COMPUTING?

Page 8: Big Data, Fast Data - MapReduce in Hazelcast

TREND OF PRICES

Data Source: http://www.jcmit.com/memoryprice.htm

Page 9: Big Data, Fast Data - MapReduce in Hazelcast

SPEED DIFFERENCE

Data Source: http://i.imgur.com/ykOjTVw.png

Page 10: Big Data, Fast Data - MapReduce in Hazelcast

DISTRIBUTED COMPUTING

OR

MULTICORE CPU ON STEROIDS

Page 11: Big Data, Fast Data - MapReduce in Hazelcast

THE IDEA OF DISTRIBUTED COMPUTING

Source: https://www.flickr.com/photos/stefan_ledwina/1853508040

Page 12: Big Data, Fast Data - MapReduce in Hazelcast

THE BEGINNING

Source: http://en.wikipedia.org/wiki/File:KL_Advanced_Micro_Devices_AM9080.jpg

Page 13: Big Data, Fast Data - MapReduce in Hazelcast

MULTICORE IS NOT NEW

Source: http://en.wikipedia.org/wiki/File:80386with387.JPG

Page 14: Big Data, Fast Data - MapReduce in Hazelcast

CLUSTER IT

Source: http://rarecpus.com/images2/cpu_cluster.jpg

Page 15: Big Data, Fast Data - MapReduce in Hazelcast

SUPER COMPUTER

Source: http://www.dkrz.de/about/aufgaben/dkrz-geschichte/rechnerhistorie-1

Page 16: Big Data, Fast Data - MapReduce in Hazelcast

CLOUD COMPUTING

Source: https://farm6.staticflickr.com/5523/11407118963_e0e0870846_b_d.jpg

Page 17: Big Data, Fast Data - MapReduce in Hazelcast

MAP & REDUCE
THE BLACK MAGIC FROM PLANET GOOGLE

Page 18: Big Data, Fast Data - MapReduce in Hazelcast

USE CASES

Log Analysis
Data Querying
Aggregation and summing
Distributed Sort
ETL (Extract Transform Load)
and more...

Page 19: Big Data, Fast Data - MapReduce in Hazelcast

SIMPLE STEPS

Read
Map / Transform
Reduce

Page 20: Big Data, Fast Data - MapReduce in Hazelcast

FULL STEPS

Read
Map / Transform
Combining
Grouping / Shuffling
Reduce
Collating
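
Of the steps listed above, grouping / shuffling and collating are the least obvious from their names. The following is a minimal sketch in plain Java, not the Hazelcast API. The class `ShuffleAndCollate`, its method names, and the two example nodes are hypothetical, and picking the most frequent key is only one assumed example of what a collating step might do.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ShuffleAndCollate {

    // Hypothetical helper that builds one intermediate (key, value) pair.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // Grouping / shuffling: collect all values emitted for the same key,
    // no matter which node emitted them, so one reducer sees them together.
    static Map<String, List<Integer>> shuffle(List<List<Map.Entry<String, Integer>>> outputPerNode) {
        Map<String, List<Integer>> grouped = new TreeMap<>(); // sorted only for stable output
        for (List<Map.Entry<String, Integer>> nodeOutput : outputPerNode) {
            for (Map.Entry<String, Integer> p : nodeOutput) {
                grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
            }
        }
        return grouped;
    }

    // Collating: one last pass over the already reduced result,
    // here (as an assumed example) picking the key with the highest value.
    static String collateTop(Map<String, Integer> reduced) {
        String top = null;
        for (Map.Entry<String, Integer> e : reduced.entrySet()) {
            if (top == null || e.getValue() > reduced.get(top)) {
                top = e.getKey();
            }
        }
        return top;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> nodeA = Arrays.asList(pair("data", 2), pair("big", 1));
        List<Map.Entry<String, Integer>> nodeB = Arrays.asList(pair("data", 1), pair("fast", 1));
        // Prints {big=[1], data=[2, 1], fast=[1]}
        System.out.println(shuffle(Arrays.asList(nodeA, nodeB)));
    }
}
```

In a real cluster the grouping happens over the network, which is why the combining step before it matters: fewer pairs have to be moved.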

Page 21: Big Data, Fast Data - MapReduce in Hazelcast

MAPREDUCE WORKFLOW

Page 22: Big Data, Fast Data - MapReduce in Hazelcast

SOME PSEUDO CODE (1/3)

MAPPING

Data are mapped / transformed into a set of key-value pairs

map( key:String, document:String ):Void ->
    for each w:word in document:
        emit( w, 1 )
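
The mapping pseudo code translates fairly directly to Java. This is a minimal sketch in plain Java rather than the Hazelcast API: the class name `WordCountMapper`, the returned list standing in for `emit`, and the whitespace-only tokenization are assumptions made for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class WordCountMapper {

    // Mirrors map(key, document): emits one (word, 1) pair per word occurrence.
    // The document key (for example a file name) is not needed for counting words.
    static List<Map.Entry<String, Integer>> map(String key, String document) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        // Assumption: words are separated by whitespace; no case folding or punctuation handling.
        for (String word : document.split("\\s+")) {
            if (!word.isEmpty()) {
                emitted.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Prints [big=1, data=1, fast=1, data=1]
        System.out.println(map("slide-1", "big data fast data"));
    }
}
```

Note that the mapper does not count anything itself: "data" appears twice in the output, and adding the two 1s up is left to the later steps.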

Page 23: Big Data, Fast Data - MapReduce in Hazelcast

SOME PSEUDO CODE (2/3)

COMBINING

Multiple values are combined into an intermediate result to save network traffic

combine( word:String, counts:List[Int] ):Void ->
    emit( word, sum( counts ) )
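
To make the traffic saving concrete, here is a minimal sketch of a combiner in plain Java, not the Hazelcast API. The class name `WordCountCombiner` and the idea of passing in "everything emitted on this node" as one list are assumptions for illustration.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WordCountCombiner {

    // Mirrors combine(word, counts): sums the counts emitted on ONE node, so that node
    // ships a single (word, partialSum) pair per word instead of one pair per occurrence.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> emittedOnThisNode) {
        Map<String, Integer> partialSums = new TreeMap<>(); // sorted only for stable output
        for (Map.Entry<String, Integer> pair : emittedOnThisNode) {
            partialSums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return partialSums;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> emitted = Arrays.<Map.Entry<String, Integer>>asList(
                new AbstractMap.SimpleEntry<>("data", 1),
                new AbstractMap.SimpleEntry<>("big", 1),
                new AbstractMap.SimpleEntry<>("data", 1));
        // Three pairs go in, two leave the node: prints {big=1, data=2}
        System.out.println(combine(emitted));
    }
}
```

The result is only a partial sum for this node; other nodes may hold further counts for the same word, so a reducing step is still needed afterwards.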

Page 24: Big Data, Fast Data - MapReduce in Hazelcast

SOME PSEUDO CODE (3/3)

REDUCING

Values are reduced / aggregated to the requested result

reduce( word:String, counts:List[Int] ):Int ->
    return sum( counts )
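
The reducing pseudo code can be sketched in plain Java as follows. This is not the Hazelcast API; the class name `WordCountReducer` and the example partial sums are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class WordCountReducer {

    // Mirrors reduce(word, counts): adds up every count that was grouped under one word.
    // The counts may be raw 1s from the mappers or partial sums from a combiner.
    static int reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Partial sums 2, 1 and 4 for "data", e.g. from three nodes: prints 7
        System.out.println(reduce("data", Arrays.asList(2, 1, 4)));
    }
}
```

Because addition is associative and commutative, summing partial sums gives the same total as summing the raw 1s, which is why the same logic can serve as both combiner and reducer for word counting.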

Page 25: Big Data, Fast Data - MapReduce in Hazelcast

FOR MATHEMATICIANS

Process: (K × V)* → (L × W)* ⇒ [(l1, w1), …, (lm, wm)]

Mapping: (K × V) → (L × W)* ⇒ (k, v) → [(l1, w1), …, (ln, wn)]

Reducing: L × W* → X* ⇒ (l, [w1, …, wn]) → [x1, …, xn]
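
For readers who prefer types over set notation, the mapping and reducing signatures can be written as Java generics. This is a sketch under assumptions: the class `MapReduceSignatures` and its `process` method are hypothetical, everything runs in a single JVM, and the result is returned as a map from each key l to its reduced values rather than as the flat list shown for the process above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

final class MapReduceSignatures {

    // Mapping:  (K x V) -> (L x W)*  is modelled as BiFunction<K, V, List<Map.Entry<L, W>>>
    // Reducing: L x W*  -> X*        is modelled as BiFunction<L, List<W>, List<X>>
    static <K, V, L, W, X> Map<L, List<X>> process(
            List<Map.Entry<K, V>> input,
            BiFunction<K, V, List<Map.Entry<L, W>>> mapper,
            BiFunction<L, List<W>, List<X>> reducer) {

        // Apply the mapper to every (k, v) and group the emitted w values by l.
        Map<L, List<W>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> kv : input) {
            for (Map.Entry<L, W> lw : mapper.apply(kv.getKey(), kv.getValue())) {
                grouped.computeIfAbsent(lw.getKey(), l -> new ArrayList<>()).add(lw.getValue());
            }
        }

        // Apply the reducer once per distinct l.
        Map<L, List<X>> result = new LinkedHashMap<>();
        for (Map.Entry<L, List<W>> group : grouped.entrySet()) {
            result.put(group.getKey(), reducer.apply(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> docs = new ArrayList<>();
        docs.add(new AbstractMap.SimpleEntry<>("doc1", "big data fast data"));

        // Word count: K = V = L = String, W = X = Integer.
        BiFunction<String, String, List<Map.Entry<String, Integer>>> mapper = (name, text) -> {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            for (String w : text.split(" ")) {
                out.add(new AbstractMap.SimpleEntry<>(w, 1));
            }
            return out;
        };
        BiFunction<String, List<Integer>, List<Integer>> reducer =
                (word, counts) -> Arrays.asList(counts.stream().mapToInt(Integer::intValue).sum());

        // Prints {big=[1], data=[2], fast=[1]}
        System.out.println(process(docs, mapper, reducer));
    }
}
```

For word counting the reducer returns exactly one value per key, but the X* in the signature allows a reducer to return several values, or none.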

Page 26: Big Data, Fast Data - MapReduce in Hazelcast

MAPREDUCE PROGRAMS IN GOOGLE SOURCE TREE

Source: http://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0005.html

Page 27: Big Data, Fast Data - MapReduce in Hazelcast

DEMONSTRATION

Page 28: Big Data, Fast Data - MapReduce in Hazelcast

@noctarius2k
@hazelcast

http://www.sourceprojects.com
http://github.com/noctarius

THANK YOU!
ANY QUESTIONS?

Images: All images are licensed under Creative Commons
