Matrix Factorization Reporter : Sun Yuanshuai E-mail : [email protected]

Matrix Factorization

Jan 19, 2016


Transcript
Page 1: Matrix Factorization

Matrix Factorization

Reporter : Sun Yuanshuai

E-mail : [email protected]

Page 2: Matrix Factorization

Content

1 MF Introduction

2 Application Area

3 My Work

4 Difficulty in my work

Page 3: Matrix Factorization

MF Introduction

Matrix factorization (abbr. MF), as the name suggests, decomposes a big matrix into the product of several small matrices. It is defined mathematically as follows.

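Here we assume the target matrix $R \in \mathbb{R}^{m \times n}$ and the factor matrices $U \in \mathbb{R}^{m \times K}$ and $V \in \mathbb{R}^{n \times K}$, where $K \ll \min(m, n)$, so that

$$R \approx U V^T$$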
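As a minimal sketch of the definition, the snippet below builds two small random factors and forms their product. The sizes `m`, `n`, `K` and the helper name `reconstruct` are illustrative assumptions, not values from the slides; the point is only that storing $U$ and $V$ takes $K(m + n)$ numbers instead of $mn$.

```python
import random

def reconstruct(U, V):
    """Return U V^T for U (m x K) and V (n x K), both given as lists of rows."""
    return [[sum(u_k * v_k for u_k, v_k in zip(u_row, v_row)) for v_row in V]
            for u_row in U]

random.seed(0)
m, n, K = 6, 5, 2  # illustrative sizes only, chosen so that K << min(m, n)
U = [[random.random() for _ in range(K)] for _ in range(m)]
V = [[random.random() for _ in range(K)] for _ in range(n)]

R_hat = reconstruct(U, V)      # m x n approximation of R
entries_full = m * n           # numbers stored for R itself
entries_factors = K * (m + n)  # numbers stored for U and V together
print(len(R_hat), len(R_hat[0]), entries_full, entries_factors)
```

The saving grows with the matrix size, since $K$ stays fixed while $m$ and $n$ grow.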

Page 4: Matrix Factorization

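MF Introduction

We quantify the quality of the approximation with the Euclidean distance, which gives the objective function

$$\arg\min_{U,V} f = \sum_{(i,j) \in R} (r_{i,j} - \tilde{r}_{i,j})^2$$

where $\tilde{r}_{i,j} = u_{i*} v_{j*}^T = \sum_{k=1}^{K} u_{i,k} v_{j,k}$, i.e. $\tilde{r}_{i,j}$ is the predicted value. Here $u_{i*}$ is row $i$ of $U$ and $v_{j*}$ is row $j$ of $V$; their product is written $u_i v_j$ for short.

A divergence-based objective is also shown:

$$\arg\min_{U,V} f = \sum_{(i,j) \in R} \left( r_{i,j} \log \frac{r_{i,j}}{\tilde{r}_{i,j}} - r_{i,j} + \tilde{r}_{i,j} \right)$$

Adding norm penalties on the factors to the Euclidean objective gives

$$\arg\min_{U,V} f = \sum_{(i,j) \in R} (r_{i,j} - u_i v_j)^2 + \lambda_u \sum_i \|u_i\|^2 + \lambda_v \sum_j \|v_j\|^2$$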
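The squared-error objective, with optional norm penalties on the factors, could be evaluated roughly as follows. This is a sketch under assumptions: the observed entries are held in a dictionary keyed by `(i, j)`, and the names `predict`, `squared_loss`, `lam_u` and `lam_v` are hypothetical.

```python
def predict(U, V, i, j):
    """Predicted value for entry (i, j): inner product of row i of U and row j of V."""
    return sum(u * v for u, v in zip(U[i], V[j]))

def squared_loss(ratings, U, V, lam_u=0.0, lam_v=0.0):
    """Sum of squared errors over observed entries plus squared-norm penalties.

    ratings maps (i, j) to the observed value r_ij (assumed storage format).
    """
    err = sum((r - predict(U, V, i, j)) ** 2 for (i, j), r in ratings.items())
    reg_u = lam_u * sum(x * x for row in U for x in row)
    reg_v = lam_v * sum(x * x for row in V for x in row)
    return err + reg_u + reg_v

# Tiny hand-checkable example (made-up numbers).
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 1.0], [2.0, 0.0]]
ratings = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 1.0}
print(squared_loss(ratings, U, V))
print(squared_loss(ratings, U, V, lam_u=0.5, lam_v=0.5))
```

Only observed entries contribute to the error term, which is what makes the objective usable on sparse data.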

Page 5: Matrix Factorization

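MF Introduction

1. Alternating Descent Method

This method only works when the loss function is the Euclidean distance. With $V$ held fixed, we set the partial derivative of $f$ with respect to $U_i$ to zero:

$$\frac{\partial f}{\partial U_i} = -2 \sum_j \left[ (r_{ij} - U_i V_j) V_j \right] + 2 \lambda_u U_i = 0$$

So, we can get

$$U_i = \frac{\sum_j r_{ij} V_j}{\sum_j V_j V_j + \lambda_u}$$

The same holds for $V_j$.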
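A hedged sketch of one alternating update for a single row $U_i$ follows. For $K > 1$ the division in the closed form becomes a small linear system, $U_i (\sum_j V_j^T V_j + \lambda_u I) = \sum_j r_{ij} V_j$, which is solved here with a plain Gauss-Jordan routine. The function names, the dictionary format for the observed entries of row $i$, and the restriction of the sums to observed entries are assumptions for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes A is square and non-singular.
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def update_row(ratings_i, V, lam_u):
    """Closed-form update of U_i with V fixed.

    ratings_i maps column index j to the observed r_ij (assumed format).
    """
    K = len(V[0])
    A = [[lam_u if a == b else 0.0 for b in range(K)] for a in range(K)]
    rhs = [0.0] * K
    for j, r in ratings_i.items():
        for a in range(K):
            rhs[a] += r * V[j][a]                 # accumulates sum_j r_ij V_j
            for b in range(K):
                A[a][b] += V[j][a] * V[j][b]      # accumulates sum_j V_j^T V_j
    return solve(A, rhs)

V = [[1.0, 0.0], [0.0, 1.0]]
print(update_row({0: 2.0, 1: 3.0}, V, lam_u=1.0))
```

Alternating this update between the rows of $U$ and the rows of $V$ gives the full method.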

Page 6: Matrix Factorization

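MF Introduction

2. Gradient Descent Method

The update rule for $U$ is defined as follows,

$$U_i' = U_i - \alpha \frac{\partial f}{\partial U_i}$$

where

$$\frac{\partial f}{\partial U_i} = -2 \sum_j \left[ (r_{ij} - U_i V_j) V_j \right] + 2 \lambda_u U_i$$

and $\alpha$ denotes the step size. The same holds for $V_j$.

The objective being minimized is

$$\arg\min_{U,V} f = \sum_{(i,j) \in R} (r_{i,j} - u_i v_j)^2 + \lambda_u \sum_i \|u_i\|^2 + \lambda_v \sum_j \|v_j\|^2$$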
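One full gradient step on $U$ might look like the sketch below. The step-size name `alpha`, the penalty weight `lam_u`, and the dictionary storage of observed entries are assumptions; the gradient is the one for the squared-error objective with a squared-norm penalty on each row of $U$.

```python
def grad_U_row(i, ratings, U, V, lam_u):
    """Gradient of f with respect to row U_i, summed over the observed entries of row i."""
    K = len(U[i])
    g = [2.0 * lam_u * U[i][k] for k in range(K)]   # penalty part
    for (a, j), r in ratings.items():
        if a != i:
            continue
        err = r - sum(x * y for x, y in zip(U[i], V[j]))
        for k in range(K):
            g[k] -= 2.0 * err * V[j][k]             # data part
    return g

def gd_step_U(ratings, U, V, lam_u, alpha):
    """Return a new U after one step U_i' = U_i - alpha * df/dU_i for every row."""
    grads = [grad_U_row(i, ratings, U, V, lam_u) for i in range(len(U))]
    return [[u - alpha * g for u, g in zip(U[i], grads[i])] for i in range(len(U))]

# One-entry example: r = 1, prediction 0, so the step moves U toward V.
print(gd_step_U({(0, 0): 1.0}, [[0.0]], [[1.0]], lam_u=0.0, alpha=0.25))
```

All gradients are computed before any row is changed, so the step is a true batch update rather than a sequence of in-place updates.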

Page 7: Matrix Factorization

MF Introduction

Gradient Algorithm

Stochastic Gradient Algorithm
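The algorithm listings for this slide did not survive in the transcript. As a rough stand-in, a stochastic gradient loop for the regularized squared-error objective could be sketched as below. It is not the author's listing: the hyperparameter names and default values (`alpha`, `lam`, `epochs`, `seed`) and the initialization range are assumptions.

```python
import random

def sgd_mf(ratings, m, n, K=2, alpha=0.02, lam=0.01, epochs=500, seed=0):
    """Factorize observed entries with plain SGD.

    ratings maps (i, j) to r_ij; returns U (m x K) and V (n x K) as lists of rows.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(m)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(n)]
    samples = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(samples)                  # visit entries in random order
        for (i, j), r in samples:
            err = r - sum(u * v for u, v in zip(U[i], V[j]))
            for k in range(K):
                u_ik, v_jk = U[i][k], V[j][k]  # keep old values for both updates
                U[i][k] += alpha * (2 * err * v_jk - 2 * lam * u_ik)
                V[j][k] += alpha * (2 * err * u_ik - 2 * lam * v_jk)
    return U, V

# Made-up 2 x 2 example.
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (1, 1): 1.0}
U, V = sgd_mf(ratings, m=2, n=2)
sse = sum((r - sum(u * v for u, v in zip(U[i], V[j]))) ** 2
          for (i, j), r in ratings.items())
print(round(sse, 4))
```

Unlike the batch gradient method, each update touches only one row of $U$ and one row of $V$, which is the property the later slides on distributed SGD rely on.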

Page 8: Matrix Factorization

MF Introduction

Online Algorithm

Online-Updating Regularized Kernel Matrix Factorization Models for Large-Scale Recommender Systems

Page 9: Matrix Factorization

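MF Introduction

Loss Function

$$\arg\min_{U,V} f = \sum_{(i,j) \in R} (r_{i,j} - u_i v_j)^2$$

We update the factor $V$ to reduce the objective function $f$ with conventional gradient descent, as follows (in this update $V$ is used as the $K \times n$ factor, so that $R \approx UV$),

$$V_{ij} = V_{ij} + \eta_{ij} \left[ (U^T R)_{ij} - (U^T U V)_{ij} \right]$$

Here we set $\eta_{ij} = \frac{V_{ij}}{(U^T U V)_{ij}}$, so we obtain

$$V_{ij} = V_{ij} \frac{(U^T R)_{ij}}{(U^T U V)_{ij}}$$

The same holds for the factor matrix $U$.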
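The multiplicative form of the update for $V$ can be sketched as follows. Assumptions: $R$ is fully observed and nonnegative, $V$ is stored as a $K \times n$ matrix so that $R \approx UV$, and a small `eps` (a hypothetical safeguard, not from the slides) is added to the denominator to avoid division by zero.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def multiplicative_update_V(R, U, V, eps=1e-12):
    """Return V scaled elementwise by (U^T R) / (U^T U V)."""
    Ut = transpose(U)
    numer = matmul(Ut, R)              # K x n
    denom = matmul(matmul(Ut, U), V)   # K x n
    return [[V[a][j] * numer[a][j] / (denom[a][j] + eps)
             for j in range(len(V[0]))] for a in range(len(V))]

# Rank-one example where R = U V exactly for V = [[3, 4]].
U = [[1.0], [2.0]]
R = [[3.0, 4.0], [6.0, 8.0]]
V_new = multiplicative_update_V(R, U, [[1.0, 1.0]])
print(V_new)
```

Because the update only multiplies by nonnegative ratios, entries of $V$ that start nonnegative stay nonnegative.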

Page 10: Matrix Factorization

MF Introduction

Here, we proceed under the assumption that stratified SGD (SSGD) converges to a set of stationary points.

Page 11: Matrix Factorization

MF Introduction

The idea of DSGD is to specialize the SSGD algorithm, choosing strata with a special layout such that SGD can be run on each stratum in a distributed manner.

We see that each solution depends on the one produced by the previous iteration, i.e. the previous solution has to be known before the current one can be computed. To solve this problem, we introduce the notion of interchangeability:

Page 12: Matrix Factorization

MF Introduction

From the definition of interchangeability we obtain the following theorem:

From the theorem, a training matrix that is block-diagonal can be processed in parallel, i.e. the blocks $Z_i$ can be computed independently.

Page 13: Matrix Factorization

MF Introduction

We can process a block-diagonal matrix in parallel. Our goal, however, is to parallelize the decomposition of a general matrix. How can we do that?

We can stratify the input matrix such that each stratum satisfies the interchangeability condition.

Assume we cut the input matrix into 3*3 blocks, as follows:
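The block diagram for this slide is not in the transcript. One possible way to enumerate strata for a 3*3 blocking is sketched below: each stratum picks three blocks that share no block row and no block column, so SGD on those blocks touches disjoint rows of $U$ and $V$. The cyclic-shift construction and the function names are assumptions; other choices of strata are possible.

```python
def strata(d):
    """Return d strata for a d x d blocking.

    Stratum s holds the blocks (b, (b + s) % d) for b = 0..d-1 (cyclic shifts).
    """
    return [[(b, (b + s) % d) for b in range(d)] for s in range(d)]

def shares_no_row_or_column(stratum):
    """True if no two blocks in the stratum lie in the same block row or block column."""
    rows = [r for r, _ in stratum]
    cols = [c for _, c in stratum]
    return len(set(rows)) == len(rows) and len(set(cols)) == len(cols)

for s, blocks in enumerate(strata(3)):
    print(s, blocks, shares_no_row_or_column(blocks))
```

The three strata together cover all nine blocks, so processing them one after another visits the whole matrix while each stratum can be split across three workers.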

Page 14: Matrix Factorization

MF Introduction

Page 15: Matrix Factorization

Application Area

Any area where dyadic data can be generated.

Dyadic Data : In general, dyadic data are the measurements on dyads, which are pairs of two elements coming from two sets.

= (userId, itemId, rating)

[Figure: a customer buys a product]
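Dyadic data in the (userId, itemId, rating) form can be turned into the sparse index structure that factorization code usually works on. The sketch below is illustrative only: the function name and the sample triples are made up.

```python
def index_triples(triples):
    """Map raw ids to consecutive row/column indices and collect observed entries.

    Returns (user_index, item_index, ratings), where ratings maps (i, j) to the value.
    """
    user_index, item_index, ratings = {}, {}, {}
    for user_id, item_id, rating in triples:
        i = user_index.setdefault(user_id, len(user_index))
        j = item_index.setdefault(item_id, len(item_index))
        ratings[(i, j)] = rating
    return user_index, item_index, ratings

# Hypothetical sample data, not taken from the slides.
triples = [("u1", "a", 5.0), ("u2", "a", 3.0), ("u1", "b", 1.0)]
user_index, item_index, ratings = index_triples(triples)
print(user_index, item_index, ratings)
```

Only the observed pairs are stored, so memory grows with the number of dyads rather than with the product of the two set sizes.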

Page 16: Matrix Factorization

My Work

Page 17: Matrix Factorization

My Work

$M M^T$

Page 18: Matrix Factorization

[Figure: Left Matrix × Right Matrix, computed as three block products that are summed to give the result]

Page 19: Matrix Factorization

Difficulty in my work

DataSet

I use a total of 5 jobs, but the jobs cannot finish, because the intermediate data generated along the way is too big: about 6000 GB. The analysis is as follows: the left matrix is approximately 300 thousand * 250 thousand and the right matrix F is approximately 250 thousand * 10, so the additional data generated is 300K*250K*10*8 bytes = 6000 GB, where 8 is the number of bytes needed to store a double.
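The 6000 GB figure is consistent with emitting one 8-byte product for every pair of a left-matrix entry and a right-matrix column. The arithmetic can be checked as below; treating the left matrix as dense and using decimal gigabytes ($10^9$ bytes) are assumptions.

```python
left_rows, left_cols = 300_000, 250_000   # approximate size of the left matrix
right_cols = 10                           # columns of the right matrix F
bytes_per_double = 8

# One partial product per (left entry, right column) pair.
partial_products = left_rows * left_cols * right_cols
total_bytes = partial_products * bytes_per_double
total_gb = total_bytes / 10**9            # decimal gigabytes

print(partial_products, total_bytes, total_gb)
```

If the left matrix is sparse, the count of stored entries would replace `left_rows * left_cols` and the total would shrink accordingly.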

Page 20: Matrix Factorization

Difficulty in my work

The techniques I have used:

Combiner

Compress

Page 21: Matrix Factorization

THANK YOU FOR YOUR TIME!
