Matrix Methods with Hadoop DAVID F. GLEICH ASSISTANT PROFESSOR COMPUTER SCIENCE PURDUE UNIVERSITY David Gleich · Purdue 1 Slides bit.ly/10SIe1A Code github.com/dgleich/matrix-hadoop-tutorial bit.ly/10SIe1A

Matrix methods for Hadoop

Jan 15, 2015


Technology

David Gleich

A quick tutorial on how to tackle problems from a matrix-vector perspective in Hadoop
Transcript
Page 1: Matrix methods for Hadoop

Matrix Methods with Hadoop
David F. Gleich, Assistant Professor, Computer Science, Purdue University

Slides: bit.ly/10SIe1A
Code: github.com/dgleich/matrix-hadoop-tutorial

Page 2: Matrix methods for Hadoop


Page 3: Matrix methods for Hadoop

A bit of philosophy …

Image from rockysprings, deviantart, CC share-alike

Page 4: Matrix methods for Hadoop


Page 5: Matrix methods for Hadoop

Matrix computations

$$A = \begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\
A_{2,1} & A_{2,2} & \cdots & \vdots \\
\vdots & \ddots & \ddots & A_{m-1,n} \\
A_{m,1} & \cdots & A_{m,n-1} & A_{m,n}
\end{bmatrix}$$

Operations: $Ax$
Linear systems: $Ax = b$
Least squares: $\min \|Ax - b\|$
Eigenvalues: $Ax = \lambda x$

Page 6: Matrix methods for Hadoop

Outcomes

Recognize relationships between matrix methods and things you’ve already been doing.
    Example: SQL queries as matrix computations.
Understand how to use Hadoop to compute these matrix methods at scale for BigData.
    Example: recommenders with social network info.
Understand some of the issues that could arise.

Page 7: Matrix methods for Hadoop

Ideal outcomes

How to use techniques from matrix computations in order to solve your problems quickly!

1986

Page 8: Matrix methods for Hadoop

Taking the red pill …

Image from rockysprings, deviantart, CC share-alike

Page 9: Matrix methods for Hadoop

Matrix computations

Physics, Statistics, Engineering, Graphics, Bioinformatics, Databases, Machine learning, Information retrieval, Computer vision, Social networks

Page 10: Matrix methods for Hadoop

matrix computations ≠ linear algebra

Page 11: Matrix methods for Hadoop

A SQL statement as a matrix computation

http://stackoverflow.com/questions/4217449/returning-average-rating-from-a-database-sql

How do I find the average rating for each product?

Page 12: Matrix methods for Hadoop

A SQL statement as a matrix computation

http://stackoverflow.com/questions/4217449/returning-average-rating-from-a-database-sql

How do I find the average rating for each product?

    SELECT
        p.product_id,
        p.name,
        AVG(pr.rating) AS rating_average
    FROM products p
    INNER JOIN product_ratings pr
    ON pr.product_id = p.product_id
    GROUP BY p.product_id
    ORDER BY rating_average DESC

Page 13: Matrix methods for Hadoop

This SQL statement is a matrix computation!

Image from rockysprings, deviantart, CC share-alike

Page 14: Matrix methods for Hadoop

    SELECT
        ...
        AVG(pr.rating)
    ...
    GROUP BY p.product_id

product_ratings

    pid8  uid2  4
    pid9  uid9  1
    pid2  uid9  5
    pid9  uid5  5
    pid6  uid8  4
    pid1  uid2  4
    pid3  uid4  4
    pid5  uid9  2
    pid9  uid8  4
    pid9  uid9  1

Is a matrix!

pid1 pid2 pid3 pid4 pid5 pid6 pid7 pid8 pid9

Page 15: Matrix methods for Hadoop

product_ratings is a matrix!

pid1 pid2 pid3 pid4 pid5 pid6 pid7 pid8 pid9

But it’s a weird matrix: missing entries!

Page 16: Matrix methods for Hadoop

product_ratings is a matrix! But it’s a weird matrix.

[Figure: the ratings from product_ratings placed in a matrix with one row per product, pid1 through pid9; most entries are missing.]

Matrix → SELECT AVG(r) ... GROUP BY pid → Vector (average of ratings)

Page 17: Matrix methods for Hadoop

But it’s a weird matrix and not a linear operator

product_ratings is a matrix!

$$A = \begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\
A_{2,1} & A_{2,2} & \cdots & \vdots \\
\vdots & \ddots & \ddots & A_{m-1,n} \\
A_{m,1} & \cdots & A_{m,n-1} & A_{m,n}
\end{bmatrix}$$

$$\mathrm{avg}(A) = \begin{bmatrix}
\sum_j A_{1,j} \,/\, \sum_j [A_{1,j} \neq 0] \\
\sum_j A_{2,j} \,/\, \sum_j [A_{2,j} \neq 0] \\
\vdots \\
\sum_j A_{m,j} \,/\, \sum_j [A_{m,j} \neq 0]
\end{bmatrix}$$

Here $[A_{i,j} \neq 0]$ is 1 when the entry is present (nonzero) and 0 otherwise, so each denominator counts the ratings stored in row $i$.

Page 18: Matrix methods for Hadoop

matrix computations ≠ linear algebra

Page 19: Matrix methods for Hadoop

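… but there is a linear operator hiding …

$$\mathrm{avg}(A) = Pe$$

$$P = \begin{bmatrix}
A_{1,1} / \sum_j [A_{1,j} \neq 0] & A_{1,2} / \sum_j [A_{1,j} \neq 0] & \cdots \\
A_{2,1} / \sum_j [A_{2,j} \neq 0] & A_{2,2} / \sum_j [A_{2,j} \neq 0] & \cdots \\
\vdots & \ddots & \ddots
\end{bmatrix}$$

$e$ is the vector of all ones, and $[A_{i,j} \neq 0]$ is 1 when the entry is present (nonzero) and 0 otherwise.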
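To make the avg(A) = Pe idea concrete, here is a minimal pure-Python sketch. The dict-of-dicts layout and the name `row_averages` are assumptions made for illustration only; the three rows reuse a few entries from the product_ratings example.

```python
def row_averages(rows):
    """Average the stored (nonzero) entries of each sparse row.

    rows maps a row id to {column id: value}; missing entries are
    simply absent, as in the product_ratings table.
    """
    averages = {}
    for rid, entries in rows.items():
        count = len(entries)  # sum_j [A_ij != 0]
        # Row i of P is row i of A divided by that count ...
        p_row = {cid: val / count for cid, val in entries.items()}
        # ... and (P e)_i is just the sum of row i of P.
        averages[rid] = sum(p_row.values())
    return averages


# A few entries from the product_ratings example (illustrative subset).
ratings = {
    "pid1": {"uid2": 4},
    "pid2": {"uid9": 5},
    "pid9": {"uid5": 5, "uid8": 4, "uid9": 1},
}
avg = row_averages(ratings)
```

Scaling each row by its own count is what makes the map from A to avg(A) nonlinear in A, while multiplying the already-scaled P by e is an ordinary matrix-vector product.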

Page 20: Matrix methods for Hadoop

Hadoop, MapReduce, and Matrix Methods


Page 21: Matrix methods for Hadoop

MapReduce


Page 22: Matrix methods for Hadoop

The MapReduce Framework

Originated at Google for indexing web pages and computing PageRank.

Express algorithms in “data-local operations”.
Implement one type of communication: shuffle.
Shuffle moves all data with the same key to the same reducer.

[Figure: numbered input blocks 1 to 5 feed the Maps (M), followed by the Shuffle and then the Reduce tasks (R). Annotations: input stored in triplicate; map output persisted to disk before shuffle; reduce input/output on disk.]

Data scalable
Fault-tolerance by design

Page 23: Matrix methods for Hadoop

wordcount is a matrix computation too

    map(document):
        for word in document:
            emit (word, 1)

    reduce(word, counts):
        emit (word, sum(counts))

[Figure: documents (D) in input blocks 1 to 5 emit pairs such as (matrix, 1), (bigdata, 1), and (hadoop, 1), which the shuffle groups by word.]
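The pseudocode above can be tried without a cluster. The sketch below simulates one map, shuffle, and reduce round in memory; `run_mapreduce` and the three toy documents are illustrative assumptions, not part of Hadoop or of the tutorial code.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Run one map -> shuffle -> reduce round in memory."""
    groups = defaultdict(list)
    for record in records:
        # Map: each record may emit any number of (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle: values that share a key end up in the same group.
            groups[key].append(value)
    output = []
    # Reduce: each key is handled once, together with all of its values.
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


def count_mapper(document):
    for word in document.split():
        yield word, 1


def count_reducer(word, counts):
    yield word, sum(counts)


docs = ["matrix hadoop bigdata", "hadoop bigdata", "bigdata"]
counts = run_mapreduce(docs, count_mapper, count_reducer)
```

A real job differs mainly in where the groups live: the shuffle moves them across machines and through disk, but the mapper and reducer keep this shape.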

Page 24: Matrix methods for Hadoop

wordcount is a matrix computation too

$$A = \begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\
A_{2,1} & A_{2,2} & \cdots & \vdots \\
\vdots & \ddots & \ddots & A_{m-1,n} \\
A_{m,1} & \cdots & A_{m,n-1} & A_{m,n}
\end{bmatrix}$$

The rows of $A$ are doc1, doc2, …, docm.

word count = $\mathrm{colsum}(A) = A^T e$, where $e$ is the vector of all ones
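As a small check of the identity word count = A^T e, the sketch below builds a dense document-by-word count matrix for three made-up documents and forms the column sums directly. The documents and variable names are assumptions for illustration.

```python
docs = ["matrix hadoop bigdata", "hadoop bigdata", "bigdata bigdata"]

# Column j of A corresponds to vocab[j]; row i corresponds to docs[i].
vocab = sorted({word for doc in docs for word in doc.split()})
A = [[doc.split().count(word) for word in vocab] for doc in docs]

# e is the vector of all ones, one entry per document (row of A).
e = [1] * len(docs)

# (A^T e)_j = sum_i A[i][j] * e[i], i.e. the column sums of A.
word_count = [
    sum(A[i][j] * e[i] for i in range(len(docs)))
    for j in range(len(vocab))
]
```

At scale the matrix is never formed densely; the map step emits only the nonzero entries, which is exactly what the (word, 1) pairs are.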

Page 25: Matrix methods for Hadoop

inverted index is a matrix computation too

$$A = \begin{bmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\
A_{2,1} & A_{2,2} & \cdots & \vdots \\
\vdots & \ddots & \ddots & A_{m-1,n} \\
A_{m,1} & \cdots & A_{m,n-1} & A_{m,n}
\end{bmatrix}$$

The rows of $A$ are doc1, doc2, …, docm.

Page 26: Matrix methods for Hadoop

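inverted index is a matrix computation too

$$A^T = \begin{bmatrix}
A_{1,1} & A_{2,1} & \cdots & A_{m,1} \\
A_{1,2} & A_{2,2} & \cdots & \vdots \\
\vdots & \ddots & \ddots & A_{m,n-1} \\
A_{1,n} & \cdots & A_{m-1,n} & A_{m,n}
\end{bmatrix}$$

The rows of $A^T$ are term1, term2, …, termn (the slide labels the last row "termm", but $A^T$ has $n$ rows).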
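Building an inverted index is then a sparse transpose. A minimal sketch, assuming the document-term matrix is held as a dict of dicts (the layout and the name `transpose` are illustrative assumptions):

```python
from collections import defaultdict


def transpose(doc_rows):
    """Turn {doc: {term: count}} into {term: {doc: count}}."""
    index = defaultdict(dict)
    for doc, terms in doc_rows.items():
        for term, count in terms.items():
            # Entry (doc, term) of A becomes entry (term, doc) of A^T.
            index[term][doc] = count
    return dict(index)


A = {
    "doc1": {"matrix": 2, "hadoop": 1},
    "doc2": {"hadoop": 3},
}
inverted = transpose(A)
```

In MapReduce terms, the map step emits (term, (doc, count)) and the shuffle does the regrouping by term.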

Page 27: Matrix methods for Hadoop

A recommender system with social info

product_ratings

    pid8  uid2  4
    pid9  uid9  1
    pid2  uid9  5
    pid9  uid5  5
    pid6  uid8  4
    pid1  uid2  4
    pid3  uid4  4
    pid5  uid9  2
    pid9  uid8  4
    pid9  uid9  1

friends_links

    uid6  uid1
    uid8  uid9
    uid7  uid7
    uid7  uid4
    uid6  uid2
    uid7  uid1
    uid3  uid1
    uid1  uid8
    uid7  uid3
    uid9  uid1

Page 28: Matrix methods for Hadoop

A recommender system with social info

Both tables are matrices. product_ratings gives a matrix with rows pid1, pid2, …, and friends_links gives a matrix with rows uid1, uid2, …, each of the form

$$\begin{bmatrix}
A_{1,1} & A_{2,1} & \cdots \\
A_{1,2} & A_{2,2} & \cdots \\
\vdots & \ddots & \ddots
\end{bmatrix}$$

Page 29: Matrix methods for Hadoop

A recommender system with social info

The product_ratings matrix is called R and the friends_links matrix is called S.

Page 30: Matrix methods for Hadoop

A recommender system with social info

Recommend each item based on the average rating of all trusted users

“$X = S R^T$” with something that is almost a matrix-matrix product

R has rows pid1, pid2, … and S has rows uid1, uid2, …

$$X_{uid,pid} = \left( \sum_{uid2} S_{uid,uid2} R_{uid2,pid} \right) \cdot \left( \sum_{uid2} [S_{uid,uid2} \text{ and } R_{uid2,pid} \neq 0] \right)^{-1}$$

The bracket is 1 when both entries are present (nonzero) and 0 otherwise, so the second factor divides by the number of trusted users who rated the item.
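The formula can be checked on a toy example before moving to Hadoop. The sketch below assumes S is stored as a set of trusted users per user (so every stored S entry equals 1) and R as a per-user dict of ratings; both layouts and the name `recommend` are illustrative assumptions.

```python
def recommend(S, R):
    """X[uid][pid] = average rating of pid over the users uid trusts.

    S[uid] is the set of users that uid trusts (S_{uid,uid2} = 1).
    R[uid2] maps pid to the rating uid2 gave (absent means unrated).
    """
    X = {}
    for uid, trusted in S.items():
        sums, counts = {}, {}
        for uid2 in trusted:
            for pid, rating in R.get(uid2, {}).items():
                sums[pid] = sums.get(pid, 0) + rating   # numerator
                counts[pid] = counts.get(pid, 0) + 1    # [S and R != 0]
        X[uid] = {pid: sums[pid] / counts[pid] for pid in sums}
    return X


S = {"uid1": {"uid8", "uid9"}, "uid3": {"uid1"}}
R = {"uid8": {"pid6": 4, "pid9": 4}, "uid9": {"pid9": 1, "pid2": 5}}
X = recommend(S, R)
```

The numerator alone is an ordinary sparse matrix-matrix product; the per-entry division by a count is what makes the whole thing only "almost" one.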

Page 31: Matrix methods for Hadoop

Tools I like

hadoop streaming
dumbo
mrjob
hadoopy
C++

Page 32: Matrix methods for Hadoop

Tools I don’t use but other people seem to like …

pig
java
hbase
mahout
Eclipse
Cassandra

Mahout is the closest thing to a library for matrix computations in Hadoop. If you like Java, you should probably start there. I’m a low-level guy.

Page 33: Matrix methods for Hadoop

hadoop streaming

the map function is a program
    (key,value) pairs are sent via stdin
    output (key,value) pairs go to stdout

the reduce function is a program
    (key,value) pairs are sent via stdin
    keys are grouped
    output (key,value) pairs go to stdout
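A rough sketch of what such programs look like for word count follows. It assumes tab-separated text lines (a common streaming convention) and that reducer input arrives sorted by key; the functions take any iterable of lines, so in a real streaming job one would pass `sys.stdin` and print each result. The function names are illustrative.

```python
import itertools


def mapper(lines):
    """Read raw text lines, produce tab-separated 'word<TAB>1' lines."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def reducer(sorted_lines):
    """Read key-sorted 'word<TAB>count' lines, produce one total per word."""
    pairs = (line.rstrip("\n").split("\t") for line in sorted_lines)
    # Equal keys are adjacent because the input is sorted, so groupby
    # sees each word exactly once.
    for word, group in itertools.groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


# Local stand-in for: cat input | mapper | sort | reducer
mapped = sorted(mapper(["Hadoop matrix", "hadoop"]))
reduced = list(reducer(mapped))
```

The `sorted` call plays the role of the shuffle here; the reducer relies on it only for the guarantee that equal keys are adjacent.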

Page 34: Matrix methods for Hadoop

mrjob from

a wrapper around hadoop streaming for map and reduce functions in python

    from mrjob.job import MRJob

    class MRWordFreqCount(MRJob):

        def mapper(self, _, line):
            # emit (word, 1) for every word on the input line
            for word in line.split():
                yield (word.lower(), 1)

        def reducer(self, word, counts):
            # counts iterates over all the 1s emitted for this word
            yield (word, sum(counts))

    if __name__ == '__main__':
        MRWordFreqCount.run()

Page 35: Matrix methods for Hadoop

How can Hadoop streaming possibly be fast?

Hadoop streaming frameworks

    Framework   Iter 1 QR (secs.)   Iter 1 Total (secs.)   Iter 2 Total (secs.)   Overall Total (secs.)
    Dumbo       67725               960                    217                    1177
    Hadoopy     70909               612                    118                    730
    C++         15809               350                    37                     387
    Java        -                   436                    66                     502

Synthetic data test, 100,000,000-by-500 matrix (~500GB)
Codes implemented in MapReduce streaming
Matrix stored as TypedBytes lists of doubles
Python frameworks use Numpy+Atlas
Custom C++ TypedBytes reader/writer with Atlas
New non-streaming Java implementation too

All timing results from the Hadoop job tracker

C++ in streaming beats a native Java implementation.

(Table from an earlier slide: David Gleich (Sandia), MapReduce 2011, slide 16/22.)

500 GB matrix. Computing the R in a QR factorization. See my next talk!

Example available from github.com/dgleich/mrtsqr for verification

mrjob could be faster if it used typedbytes for intermediate storage; see https://github.com/Yelp/mrjob/pull/447

Page 36: Matrix methods for Hadoop

Matrix-vector product

$$Ax = y, \qquad y_i = \sum_k A_{ik} x_k$$

Follow along! matrix-hadoop/codes/smatvec.py

Page 37: Matrix methods for Hadoop

Where do matrix-vector products arise?

Google’s PageRank
Computing cosine-similarity between one document and all other documents
Predictions from kernel methods
Computing averages (the example above)

Page 38: Matrix methods for Hadoop

Matrix-vector product

$$Ax = y, \qquad y_i = \sum_k A_{ik} x_k$$

A is stored by row: each line holds a row index followed by (column, value) pairs.

    $ head samples/smat_5_5.txt
    0 0 0.125 3 1.024 4 0.121
    1 0 0.597
    2 2 1.247
    3 4 -1.45
    4 2 0.061

x is stored entry-wise: each line holds an index and a value.

    $ head samples/vec_5.txt
    0 0.241
    1 -0.98
    2 0.237
    3 -0.32
    4 0.080

Follow along! matrix-hadoop/codes/smatvec.py

Page 39: Matrix methods for Hadoop

Matrix-vector product (in pictures)

$$Ax = y, \qquad y_i = \sum_k A_{ik} x_k$$

Input: A and x
Map 1: align on columns
Reduce 1: output $A_{ik} x_k$ keyed on row $i$
Reduce 2: output $\sum_k A_{ik} x_k$, which gives y

Page 40: Matrix methods for Hadoop

Matrix-vector product (in code)

Map 1: align on columns

    def joinmap(self, key, line):
        vals = line.split()
        if len(vals) == 2:
            # the vector
            yield (vals[0],               # row
                   (float(vals[1]),))     # xi
        else:
            # the matrix
            row = vals[0]
            for i in range(1, len(vals), 2):
                yield (vals[i],           # column
                       (row,              # i, Aij
                        float(vals[i+1])))

Page 41: Matrix methods for Hadoop

Matrix-vector product (in code)

Reduce 1: output $A_{ik} x_k$ keyed on row $i$

    def joinred(self, key, vals):
        vecval = 0.
        matvals = []
        for val in vals:
            if len(val) == 1:
                # a 1-tuple is the vector entry for this column
                vecval += val[0]
            else:
                # a 2-tuple is (row, Aik) from the matrix
                matvals.append(val)
        for val in matvals:
            yield (val[0], val[1]*vecval)

Note that you should use a secondary sort to avoid reading both in memory.

Page 42: Matrix methods for Hadoop

Matrix-vector product (in code)

Reduce 2: output $\sum_k A_{ik} x_k$, which gives y

    def sumred(self, key, vals):
        yield (key, sum(vals))
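The whole two-stage pipeline can be exercised locally on the sample data shown earlier. This is a sketch only: the functions mirror joinmap, joinred, and sumred but are plain functions rather than mrjob methods, and `shuffle` is a hypothetical in-memory stand-in for Hadoop's grouping step.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group values by key, as the shuffle does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def joinmap(line):
    vals = line.split()
    if len(vals) == 2:                      # vector entry: "k x_k"
        yield vals[0], (float(vals[1]),)
    else:                                   # matrix row: "i k A_ik k A_ik ..."
        for i in range(1, len(vals), 2):
            yield vals[i], (vals[0], float(vals[i + 1]))


def joinred(key, vals):
    # key is a column index k; 1-tuples carry x_k, 2-tuples carry (i, A_ik)
    vecval = sum(v[0] for v in vals if len(v) == 1)
    for v in vals:
        if len(v) == 2:
            yield v[0], v[1] * vecval


def sumred(key, vals):
    yield key, sum(vals)


A_lines = ["0 0 0.125 3 1.024 4 0.121", "1 0 0.597", "2 2 1.247",
           "3 4 -1.45", "4 2 0.061"]
x_lines = ["0 0.241", "1 -0.98", "2 0.237", "3 -0.32", "4 0.080"]

stage1 = shuffle(kv for line in A_lines + x_lines for kv in joinmap(line))
products = [kv for k, vals in stage1.items() for kv in joinred(k, vals)]
stage2 = shuffle(products)
y = dict(kv for k, vals in stage2.items() for kv in sumred(k, vals))
```

Note that the row keys of `y` are strings, because nothing in the pipeline needs them to be integers.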

Page 43: Matrix methods for Hadoop

Matrix-matrix product

$$AB = C, \qquad C_{ij} = \sum_k A_{ik} B_{kj}$$

Follow along! matrix-hadoop/codes/matmat.py

Page 44: Matrix methods for Hadoop

Matrix-matrix product

$$AB = C, \qquad C_{ij} = \sum_k A_{ik} B_{kj}$$

A is stored by row

    $ head samples/smat_10_5_A.txt
    0 0 0.599 4 -1.53
    1
    2 2 0.260
    3
    4 0 0.267 1 0.839

B is stored by row

    $ head samples/smat_5_5.txt
    0 0 0.125 3 1.024 4 0.121
    1 0 0.597
    2 2 1.247

Follow along! matrix-hadoop/codes/matmat.py

Page 45: Matrix methods for Hadoop

Matrix-matrix product (in pictures)

$$AB = C, \qquad C_{ij} = \sum_k A_{ik} B_{kj}$$

Input: A and B
Map 1: align on columns
Reduce 1: output $A_{ik} B_{kj}$ keyed on $(i,j)$
Reduce 2: output $\sum_k A_{ik} B_{kj}$, which gives C

Page 46: Matrix methods for Hadoop

Matrix-matrix product (in code)

Map 1: align on columns

    def joinmap(self, key, line):
        mtype = self.parsemat()
        vals = line.split()
        row = vals[0]
        rowvals = [(vals[i], float(vals[i+1]))
                   for i in range(1, len(vals), 2)]
        if mtype == 1:
            # matrix A, output by col
            for val in rowvals:
                yield (val[0], (row, val[1]))
        else:
            # matrix B, output the whole row keyed by its row index
            yield (row, (rowvals,))

Page 47: Matrix methods for Hadoop

Matrix-matrix product (in code)

Reduce 1: output $A_{ik} B_{kj}$ keyed on $(i,j)$

    def joinred(self, key, vals):
        # load the data into memory
        brow = []
        acol = []
        for val in vals:
            if len(val) == 1:
                # a 1-tuple holds row k of B as a list of (j, Bkj)
                brow.extend(val[0])
            else:
                # a 2-tuple is (i, Aik) from column k of A
                acol.append(val)

        for (bcol, bval) in brow:
            for (arow, aval) in acol:
                yield ((arow, bcol), aval*bval)

Page 48: Matrix methods for Hadoop

Matrix-matrix product (in code)

Reduce 2: output $\sum_k A_{ik} B_{kj}$, which gives C

    def sumred(self, key, vals):
        yield (key, sum(vals))
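A local run on two tiny matrices shows the three steps fitting together. This is a sketch under stated assumptions: the matrix type is passed explicitly as "A" or "B" (the slides obtain it from `self.parsemat()`, which is not shown), `shuffle` is a hypothetical in-memory grouping helper, and the 2-by-2 inputs are made up.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group values by key, as the shuffle does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def joinmap(mtype, line):
    vals = line.split()
    row = vals[0]
    rowvals = [(vals[i], float(vals[i + 1])) for i in range(1, len(vals), 2)]
    if mtype == "A":
        for col, val in rowvals:            # A_ik keyed on column k
            yield col, (row, val)
    else:
        yield row, (rowvals,)               # row k of B keyed on k


def joinred(key, vals):
    brow, acol = [], []
    for val in vals:
        if len(val) == 1:
            brow.extend(val[0])
        else:
            acol.append(val)
    for bcol, bval in brow:
        for arow, aval in acol:
            yield (arow, bcol), aval * bval  # A_ik * B_kj keyed on (i, j)


A_lines = ["0 0 1.0 1 2.0", "1 1 3.0"]       # A = [[1, 2], [0, 3]]
B_lines = ["0 0 4.0", "1 0 5.0 1 6.0"]       # B = [[4, 0], [5, 6]]

mapped = [kv for line in A_lines for kv in joinmap("A", line)]
mapped += [kv for line in B_lines for kv in joinmap("B", line)]
stage1 = shuffle(mapped)
products = [kv for k, vals in stage1.items() for kv in joinred(k, vals)]
C = {key: sum(vals) for key, vals in shuffle(products).items()}
```

The first reduce emits one pair per product $A_{ik} B_{kj}$, so the intermediate data can be far larger than either input; that cost is what the later slide on block storage is aimed at.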

Page 49: Matrix methods for Hadoop

Our social recommender

$R^T$ and $S$

Follow along! matrix-hadoop/recsys/recsys.py

R is stored entry-wise (columns: Object ID, User ID, Rating)

    $ gunzip -c data/rating.txt.gz
    139431556 591156 5
    139431556 1312460676 5
    139431556 204358 4
    139431556 368725 5

S is stored entry-wise (columns: My ID, Other ID, Trust)

    $ gunzip -c data/rating.txt.gz
    3287060356 232085 -1
    3288305540 709420 1
    3290337156 204418 -1
    3294138244 269243 -1

Page 50: Matrix methods for Hadoop

Social recommender (in code)

Map 1: align on columns

    def joinmap(self, key, line):
        parts = line.split('\t')
        if len(parts) == 8:    # ratings
            objid = parts[0].strip()
            uid = parts[1].strip()
            rat = int(parts[2])
            yield (uid, (objid, rat))
        elif len(parts) == 4:  # trust
            myid = parts[0].strip()
            otherid = parts[1].strip()
            value = int(parts[2])
            if value > 0:
                yield (otherid, (myid,))

Conceptually, the first step is the same as the matrix-matrix product. We reorganize the data by user-id to be able to map the trust relationships.

Page 51: Matrix methods for Hadoop

Matrix-matrix product (in pictures)

Reduce 1: output $A_{ik} B_{kj}$ keyed on $(i,j)$

    def joinred(self, key, vals):
        tusers = []   # uids that trust key
        ratobjs = []  # objs rated by uid=key
        for val in vals:
            if len(val) == 1:
                tusers.append(val[0])
            else:
                ratobjs.append(val)

        for (objid, rat) in ratobjs:
            for uid in tusers:
                yield ((uid, objid), rat)

Conceptually, the second step is the same as the matrix-matrix product too: we “map” the ratings from each trusted user back to the source.

Page 52: Matrix methods for Hadoop

Matrix-matrix product (in pictures)

Reduce 2: instead of $\sum_k A_{ik} B_{kj}$, output the smoothed average of the ratings for each (uid, objid) key

    def avgred(self, key, vals):
        s = 0.
        n = 0
        for val in vals:
            s += val
            n += 1
        # the smoothed average of ratings
        yield key, (s + self.options.avg) / float(n + 1)
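To see the three recommender steps end to end, here is a small in-memory sketch. It takes already-parsed tuples instead of tab-separated lines, uses a hypothetical `shuffle` helper in place of Hadoop's grouping, and assumes a prior of 3.0 as a stand-in for `self.options.avg`, whose actual value the slides do not give.

```python
from collections import defaultdict

PRIOR = 3.0  # assumed stand-in for self.options.avg


def shuffle(pairs):
    """Group values by key, as the shuffle does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def join_pairs(ratings, trust):
    for objid, uid, rat in ratings:
        yield uid, (objid, rat)             # ratings keyed on the rater
    for myid, otherid, value in trust:
        if value > 0:                       # keep positive trust only
            yield otherid, (myid,)          # keyed on the trusted user


def joinred(key, vals):
    tusers = [v[0] for v in vals if len(v) == 1]   # users who trust key
    ratobjs = [v for v in vals if len(v) == 2]     # (objid, rating) by key
    for objid, rat in ratobjs:
        for uid in tusers:
            yield (uid, objid), rat


def avgred(key, vals, prior=PRIOR):
    vals = list(vals)
    # smoothed average: one extra pseudo-rating equal to the prior
    yield key, (sum(vals) + prior) / float(len(vals) + 1)


ratings = [("p1", "u2", 5), ("p1", "u3", 4), ("p2", "u3", 1)]
trust = [("u1", "u2", 1), ("u1", "u3", 1), ("u2", "u3", -1)]

stage1 = shuffle(join_pairs(ratings, trust))
pairs = [kv for k, vals in stage1.items() for kv in joinred(k, vals)]
X = dict(kv for k, vals in shuffle(pairs).items() for kv in avgred(k, vals))
```

The smoothing pulls items with very few trusted ratings toward the prior, which keeps a single 5-star rating from dominating the recommendations.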

Page 53: Matrix methods for Hadoop

Better ways to store matrices in Hadoop

Block matrices minimize the number of intermediate keys and values used. I’d form them based on the first reduce.

No need for “integer” keys that fall between 1 and n!

Page 54: Matrix methods for Hadoop

Tall-and-skinny matrices are common in BigData

[Figure: A split by rows into submatrices A1, A2, A3, A4.]

A : m x n, m ≫ n
Key is an arbitrary row-id
Value is the 1 x n array for a row
Each submatrix Ai is the input to a map task.

Page 55: Matrix methods for Hadoop

Double-precision floating point was designed for the era where “big” was 1000-10000


Page 56: Matrix methods for Hadoop

Error analysis of summation

    s = 0; for i=1 to n: s = s + x[i]

A simple summation formula has error that is not always small if n is a billion.

$$\mathrm{fl}(x + y) = (x + y)(1 + \varepsilon), \qquad |\varepsilon| \le \mu$$

$$\left| \mathrm{fl}\Big(\sum_i x_i\Big) - \sum_i x_i \right| \le n \mu \sum_i |x_i|, \qquad \mu \approx 10^{-16}$$
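Plugging the slide's own numbers into the bound shows why a billion terms is a problem. This is only the worst-case bound worked out for $n = 10^9$; actual errors are often smaller.

```latex
n\mu \approx 10^{9} \cdot 10^{-16} = 10^{-7}
\quad\Longrightarrow\quad
\left| \mathrm{fl}\Big(\sum_i x_i\Big) - \sum_i x_i \right|
\;\le\; 10^{-7} \sum_i |x_i| .
```

When all $x_i$ have the same sign, $\sum_i |x_i| = |\sum_i x_i|$, so the bound guarantees only about 7 correct digits out of the roughly 16 that double precision carries.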

Page 57: Matrix methods for Hadoop

If your application matters then watch out for this issue. Use quad-precision arithmetic or compensated summation instead.


Page 58: Matrix methods for Hadoop

Compensated Summation

“Kahan summation algorithm” on Wikipedia

    s = 0.; c = 0.;
    for i=1 to n:
        y = x[i] - c
        t = s + y
        c = (t - s) - y
        s = t

Mathematically, c is always zero. On a computer, c can be non-zero. The parentheses matter!

$$\left| \mathrm{fl}(\mathrm{csum}(x)) - \sum_i x_i \right| \le (\mu + n\mu^2) \sum_i |x_i|, \qquad \mu \approx 10^{-16}$$
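The pseudocode translates directly to Python. The sketch below compares it with a plain running sum on one million copies of 0.1, using the standard library's `math.fsum` as the reference value; the test data and variable names are illustrative choices.

```python
import math


def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    s = 0.0  # running sum
    c = 0.0  # running compensation for lost low-order bits
    for x in values:
        y = x - c          # apply the correction from the previous step
        t = s + y          # low-order bits of y may be lost here
        c = (t - s) - y    # recover what was lost; the parentheses matter
        s = t
    return s


data = [0.1] * 1000000

naive = 0.0
for x in data:
    naive += x

exact = math.fsum(data)        # correctly rounded reference sum
compensated = kahan_sum(data)

naive_error = abs(naive - exact)
kahan_error = abs(compensated - exact)
```

In a MapReduce reducer the same loop can replace `sum(vals)`, at the cost of a few extra floating-point operations per value.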

Page 59: Matrix methods for Hadoop

Collaborators, Friends, and People who have taught me

MRTSQR
    Paul Constantine (Stanford)
    Austin Benson (Stanford)
    James Demmel (Berkeley)

Simform
    Jeremy Templeton (Sandia)
    Joe Ruthruff (Sandia)
    Yangyang Hou (Purdue)
    Joe Nichols (Stanford)

Sandia MapReduce
    Todd Plantenga
    Tammy Kolda
    Justin Basilico (now Netflix)

Others
    Margot Gerritsen (Stanford)

Grants
    Sandia CSAR

Page 60: Matrix methods for Hadoop

Questions?

Image from rockysprings, deviantart, CC share-alike