Dec 30, 2015

FACTORIZATION MACHINE: MODEL, OPTIMIZATION AND APPLICATIONS

Yang LIU
Email: yliu@cse.cuhk.edu.hk
Supervisors: Prof. Andrew Yao, Prof. Shengyu Zhang


OUTLINE

- Factorization machine (FM)
  - A generic predictor
  - Auto feature interaction
- Learning algorithm
  - Stochastic gradient descent (SGD)
  - …
- Applications
  - Recommendation systems
  - Regression and classification
  - …


DOUBAN MOVIE


PREDICTION TASK

e.g. Alice rates Titanic 5 at time 13; entries marked ?? are the unknown ratings to predict


PREDICTION TASK

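Format: $y: \mathbb{R}^n \to T$, with $T = \mathbb{R}$ for regression and $T = \{+1, -1\}$ for classification

Training set: $D_{\text{train}} = \{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots\}$

Testing set: $D_{\text{test}} = \{x^{(1)}, x^{(2)}, \dots\}$

Objective: to predict $y(x)$ for every $x \in D_{\text{test}}$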
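
To make the input format concrete, here is a small sketch of how the earlier example ("Alice rates Titanic 5 at time 13") could be turned into a real-valued feature vector and a target. The user and movie vocabularies, the `encode` helper, and the choice of one-hot blocks plus a raw time feature are illustrative assumptions; the slides do not specify an encoding.

```python
from typing import Dict, List

# Hypothetical vocabularies; the slides do not list the users or movies.
USERS: List[str] = ["Alice", "Bob", "Charlie"]
MOVIES: List[str] = ["Titanic", "Notting Hill", "Star Wars"]


def encode(user: str, movie: str, time: float) -> Dict[int, float]:
    """Return a sparse feature vector {index: value} for one rating event."""
    x: Dict[int, float] = {}
    x[USERS.index(user)] = 1.0                   # one-hot user block
    x[len(USERS) + MOVIES.index(movie)] = 1.0    # one-hot movie block
    x[len(USERS) + len(MOVIES)] = float(time)    # real-valued time feature
    return x


# "Alice rates Titanic 5 at time 13" becomes one training pair (x, y).
x = encode("Alice", "Titanic", 13)
y = 5.0
```

Only three of the seven positions are non-zero here, which is the kind of sparsity the summary slide refers to.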


LINEAR MODEL – FEATURE ENGINEERING

Linear SVM: $\hat{y}(x) = \operatorname{sign}(w_0 + w^T x)$

Logistic Regression: $\hat{y}(x) = \dfrac{1}{1 + \exp(-(w_0 + w^T x))}$
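
The two linear predictors can be sketched in a few lines of Python. This is a minimal illustration assuming the standard forms with a bias term `w0`; the function names are not from the slides. Each feature contributes `w[i] * x[i]` on its own, so any interaction between features has to be engineered by hand as an extra input column.

```python
import math
from typing import Sequence


def linear_score(w0: float, w: Sequence[float], x: Sequence[float]) -> float:
    """Bias plus weighted sum; every feature contributes independently."""
    return w0 + sum(wi * xi for wi, xi in zip(w, x))


def svm_predict(w0: float, w: Sequence[float], x: Sequence[float]) -> int:
    """Linear SVM decision: the sign of the score, as a label in {-1, +1}."""
    return 1 if linear_score(w0, w, x) >= 0.0 else -1


def logistic_predict(w0: float, w: Sequence[float], x: Sequence[float]) -> float:
    """Logistic regression: the score squashed into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_score(w0, w, x)))
```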

Comment (Jia): Is this the correct format of linear SVM? Do we need to mention that features are independent in linear models?

FACTORIZATION MODEL

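Model parameters: $\Theta = \{w_0, w_1, \dots, w_n, v_{1,1}, \dots, v_{n,k}\}$, where $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^n$, $V \in \mathbb{R}^{n \times k}$

$k$ is the inner dimension

Linear: $\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i$

FM: $\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$

The last term models the interaction between variables $x_i$ and $x_j$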
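
As a sketch of what the FM predictor computes, the following Python function adds the bias, the linear terms, and every pairwise term $x_i x_j$ weighted by $\langle v_i, v_j \rangle$. It is a direct, unoptimized reading of the model with $O(k n^2)$ cost; the name `fm_predict_naive` and the toy numbers are illustrative assumptions.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def fm_predict_naive(w0: float, w: Sequence[float],
                     V: List[List[float]], x: Sequence[float]) -> float:
    """FM prediction with the pairwise sum written out explicitly."""
    n = len(x)
    y_hat = w0 + dot(w, x)                      # bias + linear part
    for i in range(n):
        for j in range(i + 1, n):               # each unordered pair once
            y_hat += dot(V[i], V[j]) * x[i] * x[j]
    return y_hat


# Toy model with n = 3 features and inner dimension k = 2.
w0 = 0.5
w = [1.0, -1.0, 0.0]
V = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
prediction = fm_predict_naive(w0, w, V, [1.0, 2.0, 0.0])
```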

INTERACTION MATRIX

$w_{i,j} = \langle v_i, v_j \rangle$

[Figure, built up over several slides: the interaction matrix $W$ (shown as "W?" at first) is written as $W = V V^T$, where $V$ has $k$ columns. The entry $w_{ij}$ is the product of row $v_i^T$ of $V$ with column $v_j$ of $V^T$; for example, the (Alice, Titanic) entry is $v_A^T v_{TI}$. Build-step labels: Factorization, Factorization Machine.]
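
The factorization $W = V V^T$ can be illustrated with a short Python sketch. The helper names and the toy matrix are assumptions for illustration; the parameter count uses the 51K features quoted for the EMI data later in the deck together with a hypothetical inner dimension $k = 8$, which the slides do not state.

```python
from typing import List, Tuple


def interaction_matrix(V: List[List[float]]) -> List[List[float]]:
    """Build W = V V^T, so that W[i][j] = <v_i, v_j>."""
    n = len(V)
    return [[sum(a * b for a, b in zip(V[i], V[j])) for j in range(n)]
            for i in range(n)]


def pairwise_parameter_counts(n: int, k: int) -> Tuple[int, int]:
    """Free pairwise weights in a full W versus entries in the factor V."""
    return n * (n - 1) // 2, n * k


V = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]   # n = 3 features, k = 2
W = interaction_matrix(V)

# 51K features as on the EMI slide; k = 8 is a hypothetical choice.
full, factored = pairwise_parameter_counts(51_000, 8)
```

Because every $w_{ij}$ is derived from the shared rows of $V$, a pair of features that never co-occurs in training still receives an interaction weight.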

Comment (Jia): Is "machine" an appropriate description? Why use "machine"?

FM: PROPERTIES

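Expressiveness: for sufficiently large $k$, $V V^T$ can represent any positive semidefinite interaction matrix $W$

Feature dependency: $w_{i,j}$ and $w_{i,l}$ are dependent, because both share the factor $v_i$

Linear computation complexity: $O(kn)$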
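
The linear complexity follows from a standard rewriting of the pairwise term: $\sum_{i<j} \langle v_i, v_j \rangle x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \big[ (\sum_i v_{i,f} x_i)^2 - \sum_i v_{i,f}^2 x_i^2 \big]$. The sketch below, with illustrative function names, checks this identity against the explicit double loop on a toy example.

```python
from typing import List, Sequence


def pairwise_naive(V: List[List[float]], x: Sequence[float]) -> float:
    """Explicit sum over all pairs i < j: O(k n^2)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
    return total


def pairwise_linear(V: List[List[float]], x: Sequence[float]) -> float:
    """Same quantity in O(k n): one pass over the features per factor f."""
    k = len(V[0])
    total = 0.0
    for f in range(k):
        s = 0.0       # running sum of v_{i,f} * x_i
        s_sq = 0.0    # running sum of (v_{i,f} * x_i)^2
        for i, xi in enumerate(x):
            t = V[i][f] * xi
            s += t
            s_sq += t * t
        total += s * s - s_sq
    return 0.5 * total


V = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
x = [1.0, 2.0, 3.0]
```

With sparse inputs the inner loop only needs to visit the non-zero $x_i$, so the cost scales with the number of non-zero features rather than with $n$.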


OPTIMIZATION TARGET

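Min ERROR → Min ERROR + Regularization

$\mathrm{OPT}(\Theta) = \arg\min_{\Theta} \Big( \sum_{(x, y)} l(\hat{y}(x), y) + \sum_{\theta \in \Theta} \lambda_\theta \theta^2 \Big)$, with the first sum taken over the training set

Loss function $l$: e.g. squared loss $(\hat{y}(x) - y)^2$ for regression, logistic loss $-\ln \sigma(\hat{y}(x)\, y)$ for classification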
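
A minimal sketch of the regularized objective, assuming two common loss choices (squared loss for regression, logistic loss for labels in {-1, +1}) and a single shared L2 strength `lam`. The slides do not say which losses or regularization settings were used, so all names and values here are illustrative.

```python
import math
from typing import Callable, Sequence


def squared_loss(y_hat: float, y: float) -> float:
    """Regression loss: (y_hat - y)^2."""
    return (y_hat - y) ** 2


def logistic_loss(y_hat: float, y: float) -> float:
    """Classification loss for y in {-1, +1}: -ln sigmoid(y_hat * y)."""
    return math.log1p(math.exp(-y_hat * y))


def regularized_objective(preds: Sequence[float], targets: Sequence[float],
                          params: Sequence[float], lam: float,
                          loss: Callable[[float, float], float]) -> float:
    """Total training error plus an L2 penalty on every parameter."""
    error = sum(loss(p, t) for p, t in zip(preds, targets))
    penalty = lam * sum(theta * theta for theta in params)
    return error + penalty
```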


STOCHASTIC GRADIENT DESCENT (SGD)

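For item $(x, y)$, update $\theta$ by:

$\theta \leftarrow \theta - \eta \left( \dfrac{\partial}{\partial \theta} l(\hat{y}(x), y) + 2 \lambda_\theta \theta \right)$

$\theta_0$: initial value of $\theta$
$\eta$: learning rate
$\lambda_\theta$: regularization

Pros
- Easy to implement
- Fast convergence on big training data

Cons
- Parameter tuning
- Sequential method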
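
One SGD step for an FM can be sketched as follows, assuming squared loss and one shared regularization strength `lam` for all parameters; both are assumptions, as are the function names and toy values. The gradients of the prediction are 1 for $w_0$, $x_i$ for $w_i$, and $x_i \sum_j v_{j,f} x_j - v_{i,f} x_i^2$ for $v_{i,f}$.

```python
from typing import List, Sequence


def fm_predict(w0: float, w: Sequence[float],
               V: List[List[float]], x: Sequence[float]) -> float:
    """FM prediction using the O(k n) form of the pairwise term."""
    y_hat = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        y_hat += 0.5 * (s * s - s_sq)
    return y_hat


def sgd_step(w0: float, w: List[float], V: List[List[float]],
             x: Sequence[float], y: float, eta: float, lam: float) -> float:
    """Update w and V in place for one item (x, y); return the new w0."""
    k = len(V[0])
    # Derivative of the squared loss with respect to the prediction.
    err = 2.0 * (fm_predict(w0, w, V, x) - y)
    # Per-factor sums, computed once before any entry of V changes.
    sums = [sum(V[i][f] * x[i] for i in range(len(x))) for f in range(k)]
    w0 = w0 - eta * (err + 2.0 * lam * w0)
    for i, xi in enumerate(x):
        w[i] -= eta * (err * xi + 2.0 * lam * w[i])
        for f in range(k):
            grad = xi * sums[f] - V[i][f] * xi * xi
            V[i][f] -= eta * (err * grad + 2.0 * lam * V[i][f])
    return w0
```

The learning rate `eta`, the regularization `lam`, and the initial values of `V` all have to be chosen by hand, which is the parameter-tuning cost listed under Cons.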


APPLICATIONS

EMI Music Hackathon 2012: song recommendation

Given:
- Historical ratings
- User demographics

# features: 51K
# items in training: 188K


RESULTS FOR EMI MUSIC

FM: Root Mean Square Error (RMSE) 13.27626
- Target value in [0, 100]
- The best (SVD++) is 13.24598

Details
- Regression
- Converges in 100 iterations
- Time for each iteration: < 1 s
- Machine: Win 7, Intel Core 2 Duo CPU 2.53 GHz, 6 GB RAM
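
For reference, RMSE is the square root of the mean squared difference between predictions and targets. A minimal sketch (the function name and sample values are illustrative, not taken from the competition data):

```python
import math
from typing import Sequence


def rmse(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Root mean square error between two equal-length sequences."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("need two non-empty sequences of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    return math.sqrt(mse)
```

On a [0, 100] target scale, an RMSE near 13 means a typical prediction is off by roughly 13 rating points.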


OTHER APPLICATIONS

Ads CTR prediction (KDD Cup 2012)
- Features: User_info, Ad_info, Query_info, Position, etc.
- # features: 7.2M
- # items in training: 160M
- Classification
- Performance: AUC 0.80178; the best (SVM) is 0.80893
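
AUC can be read as the probability that a randomly chosen positive example (a click) is scored above a randomly chosen negative one. The sketch below computes it by direct pair counting, with ties counted as half; it is quadratic in the data size and meant only to illustrate the metric, not how it was evaluated at the scale of the competition.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; label 1 is positive."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5    # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))
```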


OTHER APPLICATIONS

HiCloud App Recommendation
- Features: App_info, Smartphone model, installed apps, etc.
- # features: 9.5M
- # items in training: 16M
- Classification
- Performance: Top 5: 8%, Top 10: 18%, Top 20: 32%; AUC: 0.78


SUMMARY

FM: a general predictor
- Works under sparsity
- Linear computation complexity
- Estimates interactions automatically
- Works with any real-valued feature vector

THANKS!