
Carnegie Mellon Yucheng Low Aapo Kyrola Danny Bickson A Framework for Machine Learning and Data Mining in the Cloud Joseph Gonzalez Carlos Guestrin Joe.

Dec 13, 2015


Rodney Bryan
Page 1:

Carnegie Mellon

A Framework for Machine Learning and Data Mining in the Cloud

Yucheng Low, Aapo Kyrola, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Joe Hellerstein

Page 2:

Big Data is Everywhere

72 Hours a Minute: YouTube

28 Million Wikipedia Pages

900 Million Facebook Users

6 Billion Flickr Photos

“… data a new class of economic asset, like currency or gold.”

“…growing at 50 percent a year…”

Page 3:

Big Learning

How will we design and implement Big learning systems?

Page 4:

Shift Towards Use Of Parallelism in ML

GPUs, Multicore, Clusters, Clouds, Supercomputers

ML experts repeatedly solve the same parallel design challenges:

Race conditions, distributed state, communication…

Resulting code is very specialized: difficult to maintain, extend, debug…

Graduate students

Avoid these problems by using high-level abstractions

Page 5:

MapReduce – Map Phase

CPU 1 CPU 2 CPU 3 CPU 4

Page 6:

MapReduce – Map Phase

CPU 1 CPU 2 CPU 3 CPU 4

12.9 42.3 21.3 25.8

Embarrassingly parallel independent computation

No communication needed

Page 7:

MapReduce – Map Phase

CPU 1 CPU 2 CPU 3 CPU 4

12.9 42.3 21.3 25.8

24.1 84.3 18.4 84.4

Embarrassingly parallel independent computation

No communication needed

Page 8:

MapReduce – Map Phase

CPU 1 CPU 2 CPU 3 CPU 4

12.9 42.3 21.3 25.8

17.5 67.5 14.9 34.3

24.1 84.3 18.4 84.4

Embarrassingly parallel independent computation

No communication needed

Page 9:

MapReduce – Reduce Phase

CPU 1 CPU 2

Image Features:

12.9 42.3 21.3 25.8

24.1 84.3 18.4 84.4

17.5 67.5 14.9 34.3

Labels: U A A U U U A A U A U A

Attractive Face Statistics

Ugly Face Statistics

2226. 26

1726. 31

Attractive Faces, Ugly Faces
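
The map and reduce phases on the preceding slides can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `extract_feature`, `map_phase`, and `reduce_phase` are hypothetical names, the "feature" is just a mean pixel value standing in for real feature extraction, and a thread pool stands in for the CPUs.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def extract_feature(image):
    # Hypothetical stand-in for feature extraction: the mean pixel value.
    return sum(image["pixels"]) / len(image["pixels"])


def map_phase(images, workers=4):
    # Each image is processed independently, so no communication is needed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        features = list(pool.map(extract_feature, images))
    return [(img["label"], f) for img, f in zip(images, features)]


def reduce_phase(pairs):
    # Group the mapped features by label and compute per-label statistics.
    groups = defaultdict(list)
    for label, feature in pairs:
        groups[label].append(feature)
    return {
        label: {"count": len(vals), "mean": sum(vals) / len(vals)}
        for label, vals in groups.items()
    }
```

The map step is the embarrassingly parallel part; only the reduce step combines results across inputs.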

Page 10:

MapReduce for Data-Parallel ML

Excellent for large data-parallel tasks!

Data-Parallel (MapReduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics

Graph-Parallel:

Graphical Models: Gibbs Sampling, Belief Propagation, Variational Opt.

Semi-Supervised Learning: Label Propagation, CoEM

Graph Analysis: PageRank, Triangle Counting

Collaborative Filtering: Tensor Factorization

Is there more to Machine Learning?

Page 11:

Carnegie Mellon

Exploit Dependencies

Page 12:

[Figure: graph whose vertices are labeled Hockey or Scuba Diving, with one vertex labeled Underwater Hockey]

Page 13:

Graphs are Everywhere

Collaborative Filtering (Netflix): Users, Movies

Text Analysis (Wiki): Docs, Words

Social Network: Probabilistic Analysis

Page 14:

Properties of Computation on Graphs

Dependency Graph

Iterative Computation

My Interests

Friends Interests

Local Updates

Page 15:

ML Tasks Beyond Data-Parallelism

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics

Graph-Parallel:

Graphical Models: Gibbs Sampling, Belief Propagation, Variational Opt.

Semi-Supervised Learning: Label Propagation, CoEM

Graph Analysis: PageRank, Triangle Counting

Collaborative Filtering: Tensor Factorization

Page 16:

2010 Shared Memory

Bayesian Tensor Factorization

Gibbs Sampling

Matrix Factorization

Lasso

SVM

Belief Propagation

PageRank

CoEM

SVD

LDA

Linear Solvers

Splash Sampler

Alternating Least Squares

…Many others…

Page 17:

Limited CPU Power

Limited Memory

Limited Scalability

Page 18:

Distributed Cloud

Unlimited amount of computation resources! (up to funding limitations)

- Distributing State
- Data Consistency
- Fault Tolerance

Page 19:

The GraphLab Framework

Graph Based Data Representation

Update Functions (User Computation)

Consistency Model

Page 20:

Data Graph

Data associated with vertices and edges

Graph:
• Social Network

Vertex Data:
• User profile
• Current interests estimates

Edge Data:
• Relationship (friend, classmate, relative)
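
One way to picture such a data graph is the small container below. It is only a sketch under assumptions: the class name `DataGraph` and its fields are hypothetical and do not reflect GraphLab's actual API, and edges are treated as undirected.

```python
from dataclasses import dataclass, field


@dataclass
class DataGraph:
    # vertex id -> arbitrary user data (e.g. profile, interest estimates)
    vertex_data: dict = field(default_factory=dict)
    # frozenset({u, v}) -> arbitrary user data (e.g. relationship type)
    edge_data: dict = field(default_factory=dict)
    # vertex id -> set of adjacent vertex ids
    neighbors: dict = field(default_factory=dict)

    def add_vertex(self, v, data):
        self.vertex_data[v] = data
        self.neighbors.setdefault(v, set())

    def add_edge(self, u, v, data):
        # Both endpoints must already have been added as vertices.
        self.edge_data[frozenset((u, v))] = data
        self.neighbors[u].add(v)
        self.neighbors[v].add(u)
```

For example, `g.add_vertex("alice", {"interests": []})` followed by `g.add_edge("alice", "bob", "friend")` stores a user profile on the vertex and the relationship on the edge.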

Page 21:

Distributed Graph

Partition the graph across multiple machines.

Page 22:

Distributed Graph

Ghost vertices maintain adjacency structure and replicate remote data.

[Figure label: “ghost” vertices]
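
A rough sketch of how ghost vertices could be derived from a partition is shown below. The function name `build_partitions` and the `owner` mapping (vertex to machine id) are assumptions made for illustration; the slides do not say how GraphLab represents this internally.

```python
def build_partitions(edges, owner):
    """Return machine id -> {"owned": set, "ghosts": set}.

    edges: iterable of (u, v) pairs.
    owner: dict mapping each vertex to the machine that owns it.
    """
    parts = {}
    for v, machine in owner.items():
        parts.setdefault(machine, {"owned": set(), "ghosts": set()})
        parts[machine]["owned"].add(v)
    for u, v in edges:
        if owner[u] != owner[v]:
            # The edge is cut: each side keeps a ghost copy of the remote
            # endpoint so its local adjacency structure stays complete.
            parts[owner[u]]["ghosts"].add(v)
            parts[owner[v]]["ghosts"].add(u)
    return parts
```

Fewer cut edges means fewer ghosts to keep synchronized, which is why the quality of the partition matters.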

Page 23:

Distributed Graph

Cut efficiently using HPC graph partitioning tools (ParMetis / Scotch / …)

[Figure label: “ghost” vertices]

Page 24:

The GraphLab Framework

Graph Based Data Representation

Update Functions (User Computation)

Consistency Model

Page 25:

Update Functions

User-defined program: applied to a vertex and transforms data in scope of vertex

Pagerank(scope) {
    // Update the current vertex data

    // Reschedule neighbors if needed
    if vertex.PageRank changes then reschedule_all_neighbors;
}

Dynamic computation

Update function applied (asynchronously) in parallel until convergence

Many schedulers available to prioritize computation
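
The update-function pattern above can be sketched as a sequential loop. This is an illustration only, not GraphLab's API: `pagerank_update` and `run` are hypothetical names, the damping factor and tolerance are assumed values, a FIFO queue stands in for the scheduler, and only the out-neighbors (the vertices whose rank depends on the updated vertex) are rescheduled.

```python
from collections import deque


def pagerank_update(v, rank, out_links, in_links, damping=0.85, tol=1e-4):
    """Recompute rank[v] from its in-neighbors; return vertices to reschedule."""
    new_rank = (1 - damping) + damping * sum(
        rank[u] / len(out_links[u]) for u in in_links[v]
    )
    changed = abs(new_rank - rank[v]) > tol
    rank[v] = new_rank
    # Dynamic computation: only ask for more work if the value really moved.
    return out_links[v] if changed else []


def run(out_links):
    """out_links: vertex -> list of vertices it links to (every vertex is a key)."""
    in_links = {v: [] for v in out_links}
    for u, targets in out_links.items():
        for w in targets:
            in_links[w].append(u)
    rank = {v: 1.0 for v in out_links}
    queue = deque(out_links)   # FIFO scheduler, initially holding every vertex
    queued = set(out_links)
    while queue:               # the process repeats until the scheduler is empty
        v = queue.popleft()
        queued.discard(v)
        for w in pagerank_update(v, rank, out_links, in_links):
            if w not in queued:
                queue.append(w)
                queued.add(w)
    return rank
```

Swapping the FIFO queue for a priority queue keyed on the size of the change would give a prioritized schedule instead.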

Page 26:

Shared Memory Dynamic Schedule

[Figure: Scheduler, CPU 1, CPU 2, and graph vertices a through k]

Process repeats until scheduler is empty

Page 27:

Distributed Scheduling

[Figure: graph vertices a through k split across machines, each with its own schedule]

Each machine maintains a schedule over the vertices it owns.

Distributed consensus used to identify completion

Page 28:

Ensuring Race-Free Code

How much can computation overlap?

Page 29:

The GraphLab Framework

Graph Based Data Representation

Update Functions (User Computation)

Consistency Model

Page 30:

Racing Collaborative Filtering

[Plot: RMSE (axis ticks 0.01, 0.1, 1) versus Time (0 to 1600); series: d=20, Racing d=20]

Page 31:

Serializability

For every parallel execution, there exists a sequential execution of update functions which produces the same result.

[Figure: timelines labeled Parallel (CPU 1, CPU 2) and Sequential (Single CPU), with a time axis]

Page 32:

Serializability Example

Edge Consistency

Overlapping regions are only read.

Update functions one vertex apart can be run in parallel.

Stronger / Weaker consistency levels available

User-tunable consistency levels trade off parallelism & consistency

[Figure labels: Read, Write]
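
Under edge consistency, the rule that update functions one vertex apart may run together amounts to a simple adjacency test. The helper below is a hypothetical illustration (the name `can_run_in_parallel` is not from the slides), assuming each update writes its own vertex and adjacent edges and only reads adjacent vertices.

```python
def can_run_in_parallel(u, v, neighbors):
    """True if updates on u and v may overlap under edge consistency.

    neighbors: vertex -> set of adjacent vertices (undirected graph).
    """
    # Adjacent vertices share an edge that both updates may write, and the
    # same vertex obviously conflicts with itself. Non-adjacent vertices
    # overlap only in data that both sides merely read.
    return u != v and v not in neighbors[u]
```

On a path a - b - c, the updates on a and c may overlap (they share only the read-only vertex b), while a and b may not.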

Page 33:

Distributed Consistency

Solution 1: Graph Coloring

Solution 2: Distributed Locking

Page 34:

Edge Consistency via Graph Coloring

Vertices of the same color are all at least one vertex apart. Therefore, all vertices of the same color can be run in parallel!
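
A coloring with this property can be produced by a simple greedy pass. The sketch below is one common heuristic, not necessarily what the authors used; `greedy_coloring` is a hypothetical name and the graph is assumed undirected with sortable vertex ids.

```python
def greedy_coloring(neighbors):
    """Assign each vertex the smallest color not used by a colored neighbor.

    neighbors: vertex -> set of adjacent vertices.
    """
    color = {}
    for v in sorted(neighbors):
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color
```

Because no two adjacent vertices share a color, every color class is a set of vertices whose updates satisfy edge consistency when run together.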

Page 35:

Chromatic Distributed Engine

Time (steps run on every machine):

Execute tasks on all vertices of color 0

Ghost Synchronization Completion + Barrier

Execute tasks on all vertices of color 1

Ghost Synchronization Completion + Barrier
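
The color-by-color schedule can be imitated on a single machine as follows. This is a sketch under assumptions: `chromatic_engine` is a hypothetical name, a thread pool stands in for the machines, and waiting for all tasks of one color plays the role of the ghost synchronization and barrier.

```python
from concurrent.futures import ThreadPoolExecutor


def chromatic_engine(color, update, sweeps=1, workers=4):
    """Run update(v) on all vertices, one color class at a time.

    color: vertex -> color id, where adjacent vertices have different colors.
    update: callable taking a vertex id.
    """
    by_color = {}
    for v, c in color.items():
        by_color.setdefault(c, []).append(v)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(sweeps):
            for c in sorted(by_color):
                # list() blocks until every task of this color has finished,
                # which acts as the barrier before the next color starts.
                list(pool.map(update, by_color[c]))
```

The barrier after every color is also the weakness of this scheme when only a few vertices are active in a round.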

Page 36:

Matrix Factorization

Netflix Collaborative Filtering

Alternating Least Squares Matrix Factorization

Model: 0.5 million nodes, 99 million edges

[Figure: Netflix matrix with Users and Movies axes and dimension labeled d, alongside a Users/Movies graph]

Page 37:

Netflix Collaborative Filtering

[Plot 1, versus # machines: Ideal, D=100, D=20]

[Plot 2, versus # machines: Hadoop, MPI, GraphLab]

Page 38:

The Cost of Hadoop

Page 39:

Problems

Require a graph coloring to be available.

Frequent barriers make it extremely inefficient for highly dynamic systems where only a small number of vertices are active in each round.

Page 40:

Distributed Consistency

Solution 1: Graph Coloring

Solution 2: Distributed Locking

Page 41:

Distributed Locking

Edge Consistency can be guaranteed through locking.

[Figure legend: RW Lock]

Page 42:

Consistency Through Locking

Acquire write-lock on center vertex, read-lock on adjacent.
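
A shared-memory sketch of scope locking is given below. It makes two loud simplifications: Python's standard library has no reader-writer lock, so plain exclusive locks stand in for both the write-lock and the read-locks (safe, but less parallel than the scheme on the slide), and locks are taken in sorted vertex order as an assumed way to avoid deadlock. `ScopeLocks` is a hypothetical name.

```python
import threading
from contextlib import contextmanager


class ScopeLocks:
    def __init__(self, neighbors):
        # neighbors: vertex -> set of adjacent vertices
        self.neighbors = neighbors
        self.locks = {v: threading.Lock() for v in neighbors}

    @contextmanager
    def scope(self, v):
        # Lock the center vertex and its neighbors in one canonical order,
        # so two overlapping scopes can never wait on each other in a cycle.
        order = sorted({v, *self.neighbors[v]})
        for u in order:
            self.locks[u].acquire()
        try:
            yield
        finally:
            for u in reversed(order):
                self.locks[u].release()
```

An update function would then run inside `with locks.scope(v): ...`, writing only the data of `v` and reading its neighbors.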

Page 43:

Consistency Through Locking

Multicore Setting: PThread RW-Locks

Distributed Setting: Distributed Locks

Challenges: Latency

Solution: Pipelining

[Figure: vertices A, B, C, D split across Machine 1 and Machine 2, with a CPU]

Page 44:

No Pipelining

Time:

lock scope 1

Process request 1

scope 1 acquired

update_function 1

release scope 1

Process release 1

Page 45:

Pipelining / Latency Hiding

Hide latency using pipelining

Time:

lock scope 1

Process request 1

scope 1 acquired

update_function 1

release scope 1

Process release 1

lock scope 2

lock scope 3

Process request 2

Process request 3

scope 2 acquired

scope 3 acquired

update_function 2

release scope 2
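
The benefit of pipelining can be estimated with a toy timing model. Everything here is an assumption made for illustration: `simulate` is a hypothetical function, each lock request takes a fixed round-trip `latency`, each update takes a fixed `compute` time on one CPU, scopes never conflict, and releases cost nothing. `depth` is the number of lock requests allowed in flight.

```python
def simulate(n_scopes, latency, compute, depth):
    """Return the time at which the last update function finishes."""
    finish = []
    for i in range(n_scopes):
        # A new request is issued as soon as a pipeline slot is free.
        issued = 0 if i < depth else finish[i - depth]
        granted = issued + latency
        # The single CPU runs updates one after another.
        cpu_free = finish[i - 1] if i > 0 else 0
        start = max(granted, cpu_free)
        finish.append(start + compute)
    return finish[-1] if finish else 0
```

With `depth=1` (no pipelining) three scopes at latency 10 and compute 1 take 33 time units; with `depth=3` the waits overlap and the total drops to 13.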

Page 46:

The GraphLab Framework

Graph Based Data Representation

Update Functions (User Computation)

Consistency Model

Page 47:

What if machines fail? How do we provide fault tolerance?

Page 48:

Checkpoint

1: Stop the world

2: Write state to disk

Page 49:

Snapshot Performance

[Plot: No Snapshot, Snapshot, One slow machine; annotations: Snapshot time, Slow machine]

Because we have to stop the world, one slow machine slows everything down!

Page 50:

How can we do better?

Take advantage of consistency

Page 51:

Checkpointing

1985: Chandy and Lamport invented an asynchronous snapshotting algorithm for distributed systems.

[Figure legend: snapshotted, not snapshotted]

Page 52:

Checkpointing

Fine-grained Chandy-Lamport.

Easily implemented within GraphLab as an Update Function!
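
The slides do not show the snapshot update function itself, so the following is only a plausible sketch of how a vertex-level Chandy-Lamport snapshot could be written as an update function. All names (`snapshot_update`, `take_snapshot`, the dictionary layout of `graph`) are hypothetical, and the sketch runs sequentially; in a real engine it would presumably run under edge consistency alongside ordinary updates.

```python
from collections import deque


def snapshot_update(v, graph, saved_vertices, saved_edges, schedule):
    """Save v and its not-yet-saved edges, then spread to the neighbors."""
    if v in saved_vertices:
        return  # already snapshotted, nothing to do
    saved_vertices[v] = graph["vertex_data"][v]
    for u in graph["neighbors"][v]:
        if u not in saved_vertices:
            # The first endpoint to be snapshotted saves the shared edge.
            edge = frozenset((u, v))
            saved_edges[edge] = graph["edge_data"][edge]
            schedule(u)


def take_snapshot(graph, start):
    saved_vertices, saved_edges = {}, {}
    queue = deque([start])
    while queue:
        snapshot_update(queue.popleft(), graph, saved_vertices,
                        saved_edges, queue.append)
    return saved_vertices, saved_edges
```

Because the snapshot spreads vertex by vertex through the scheduler, no machine has to stop and wait for the others.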

Page 53:

Async. Snapshot Performance

[Plot: No Snapshot, Snapshot, One slow machine]

No penalty incurred by the slow machine!

Page 54:

Summary

Extended GraphLab abstraction to distributed systems

Two different methods of achieving consistency:

Graph Coloring

Distributed Locking with pipelining

Efficient implementations

Asynchronous Fault Tolerance with fine-grained Chandy-Lamport

Performance, Usability, Efficiency, Scalability