Introduce
Workloads
Distributed compute in a nutshell (where many nuts > few nuts)
Distributed Compute Developer
Big Compute Big Penguin Big Data
BIG ANYTHING
Hello All Worlds
Big Co$t?
That’d be like Microsoft, right?
Microsoft Research Genomics
Inspect
Distributed compute in a nutshell (with many little nuts)
HPC
Head Node
Broker Nodes
Compute Nodes
Allows on-premises and hybrid options
Compare Architectures
Big Data
Name Node
Data Nodes
Allows cloud or on-premises; no hybrid option
Hadoop
HPC
All distributed compute works by taking a large JOB and breaking it into many smaller TASKS, which are then run in parallel
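The JOB-to-TASKS idea can be sketched on a single machine. This is a minimal illustration only: the names `split_job`, `run_task` and `run_job` are hypothetical, the workload (sum of squares) is made up, and a local thread pool stands in for the compute nodes that a real HPC or Hadoop cluster would provide.

```python
from concurrent.futures import ThreadPoolExecutor


def split_job(data, task_size):
    """Break one large JOB (a list) into smaller TASKS (slices of the list)."""
    return [data[i:i + task_size] for i in range(0, len(data), task_size)]


def run_task(chunk):
    """The work a single worker does on its TASK: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_job(data, task_size=1000, workers=4):
    """Split the JOB, run the TASKS on a pool of workers, combine the results."""
    tasks = split_job(data, task_size)
    # Assumption: local threads stand in for the cluster's compute nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_task, tasks))
    return sum(partials)


job = list(range(10_000))
total = run_job(job)
```

The same three steps (split, run independently, combine) are what a scheduler does at cluster scale; only the transport and the number of workers change.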
Develop
Deploy
Examples
How do you take your compute?
OUCH
Lessons learned from getting too close to the coalface
A broken cluster is no place to be diagnosing
Scalability < Elasticity
Hybrid HPC is next to useless
95% through a petabyte is a bad place to find a bug
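One way to act on the last lesson is to run the per-record task over a small sample before submitting the full job. The sketch below is an assumption about how that might look, not something from the slides: `process_record`, `smoke_test` and the `key,value` record format are all hypothetical.

```python
import random


def process_record(record):
    """Hypothetical per-record task: parse 'key,value' into (key, int)."""
    key, value = record.split(",")
    return key, int(value)


def smoke_test(records, sample_size=100, seed=0):
    """Run the task on a random sample and collect (record, error) failures."""
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    failures = []
    for record in sample:
        try:
            process_record(record)
        except Exception as exc:
            failures.append((record, repr(exc)))
    return failures


# Made-up data with one malformed record planted in the middle.
records = [f"k{i},{i}" for i in range(1000)]
records[500] = "k500,not-a-number"

# Sampling every record here so the planted bug is certain to be seen.
failures = smoke_test(records, sample_size=len(records))
```

A random sample will not catch every bad record, but it is cheap, and it moves the most common parsing and logic bugs to the start of the run instead of the end.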