
Large Scale Matrix Factorization

Fei Wang

Division of Health Informatics

Department of Healthcare Policy and Research

Weill Cornell Medical College

Cornell University

12/8/16 IEEE Big Data Conference 2016 1


Outline

• Introduction

• Matrix Factorization Technologies

• Conclusions and Discussions


What is a matrix?


Matrix: A Natural Representation for Networks/Graphs/Relational Data


Matrices in Social Networks


Matrices in Healthcare

[Figure labels: Heart Attack; Rales, Cough, Wheezing, Chest Pain, Fever, Ankle Edema; panel labels: Initial, Expanded]


Matrices in Healthcare

John, Male, 53 Heart Failure

Tom, Male, 30 Hypertension

Roy, Male, 40 Sepsis

Sara, Female, 28 Asthma

Natasha, Female, 42 Hyperlipidemia

Jack, Male, 30 Pneumonia


Outline

• Introduction

• Matrix Factorization Technologies
  • Principal Component Analysis
  • Singular Value Decomposition
  • Nonnegative Matrix Factorization
  • Convolutional Matrix Factorization
  • Regularized Matrix Factorization
  • Inductive Matrix Factorization

• Conclusions and Discussions


Matrix Factorization


Orthogonal Projection


Principal Component Analysis
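
The PCA slides survive here only as titles, so the following is a minimal sketch rather than the author's derivation. It computes principal components through the SVD of the centered data matrix. The function name `pca`, the synthetic data `X`, and the choice `k=2` are illustrative assumptions.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)                            # center each feature (column)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # singular values in descending order
    components = Vt[:k]                                # orthonormal directions, shape (k, d)
    scores = Xc @ components.T                         # coordinates in the principal subspace
    explained_var = s[:k] ** 2 / (X.shape[0] - 1)      # sample variance along each direction
    return components, scores, explained_var

rng = np.random.default_rng(0)
# Hypothetical data: 100 samples, 3 features with very different scales.
X = rng.standard_normal((100, 3)) * np.array([5.0, 1.0, 0.1])
components, scores, explained_var = pca(X, k=2)
```

The rows of `components` form an orthonormal basis, so `scores` is the orthogonal projection of the centered data onto that basis.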


Singular Value Decomposition
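
The SVD slides also survive only as titles. As a hedged illustration of how the SVD gives a low-rank factorization, the sketch below keeps the top k singular triplets returned by `numpy.linalg.svd`. The helper name `truncated_svd`, the synthetic matrix `X`, and the rank `k=2` are assumptions made for the example.

```python
import numpy as np

def truncated_svd(X, k):
    """Return the rank-k factors (U_k, s_k, Vt_k) of X from its thin SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)  # singular values in descending order
    return U[:, :k], s[:k], Vt[:k, :]

rng = np.random.default_rng(0)
# Hypothetical data: a 6 x 5 matrix that has rank 2 by construction.
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))

U_k, s_k, Vt_k = truncated_svd(X, k=2)
X_hat = U_k @ np.diag(s_k) @ Vt_k  # rank-2 reconstruction of X
```

Because `X` has rank 2 here, the rank-2 reconstruction matches it up to floating-point error; for a general matrix the truncation is only an approximation.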


Nonnegative Matrix Factorization

• Consider the following problem
  • m = 2429 facial images
  • Each image has n = 19 by 19 = 361 pixels
  • The n-by-m matrix V is the original dataset
  • We want to approximate V by two lower-rank matrices W (n by 49) and H (49 by m)
    • V ≈ WH

• Constraints
  • All entries of W and H are non-negative
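
To make the sizes in this problem concrete, the arithmetic below counts the entries stored by V and by the pair (W, H). The numbers come from the slide; the variable names are illustrative.

```python
n, m, r = 19 * 19, 2429, 49  # pixels per image, number of images, factorization rank

entries_V = n * m            # entries in the original matrix V (n by m)
entries_WH = n * r + r * m   # entries in W (n by r) plus H (r by m)
compression = entries_V / entries_WH
```

With these sizes V has 876,869 entries while W and H together have 136,710, roughly a 6.4-fold reduction.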


Nonnegative Matrix Factorization

• How well can W and H approximate V?

• How can we interpret the result?

Daniel D. Lee and H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788-791 (21 October 1999)


Nonnegative Matrix Factorization

• Factorizing a nonnegative matrix into the product of two low-rank matrices


Nonnegative Matrix Factorization

• Multiplicative update method

Daniel D. Lee and H. Sebastian Seung (2001). Algorithms for Non-negative Matrix Factorization. NIPS 2001.
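
The update formulas on this slide did not survive extraction. As a hedged sketch, the widely used multiplicative rules for the squared Frobenius objective, from the Lee and Seung paper cited above, can be written in NumPy as follows. The function name `nmf_multiplicative`, the small constant `eps`, the iteration count, and the synthetic `V` are illustrative assumptions, not details from the slides.

```python
import numpy as np

def nmf_multiplicative(V, r, n_iter=200, eps=1e-10, seed=0):
    """Approximate V (n by m, nonnegative) by W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps  # strictly positive start
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
V = rng.random((20, 4)) @ rng.random((4, 30))  # hypothetical nonnegative data of rank 4
W, H = nmf_multiplicative(V, r=4)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The `eps` term only guards against division by zero; it is an implementation convenience assumed here, not part of the published rules.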


Nonnegative Matrix Factorization

• Initialize F and G with nonnegative values

• Iterate the following procedure:
  • Fixing F, solve for G
  • Fixing G, solve for F

(1) Projected Gradient: http://www.csie.ntu.edu.tw/~cjlin/nmf/
(2) Newton Type of Method: http://www.cs.utexas.edu/users/dmkim/Source/software/nnma/index.html
(3) Block Principal Pivoting: https://sites.google.com/site/jingukim/nmf_bpas.zip?attredirects=0

P. Paatero and U. Tapper. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(1):111–126, 1994.

C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19(10):2756–2779, 2007.

D. Kim, S. Sra, and I. S. Dhillon. Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem. SDM 2007.

J. Kim and H. Park. Toward Faster Nonnegative Matrix Factorization: A New Algorithm and Comparisons. ICDM 2008.
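
The alternating procedure above can be sketched as follows. This is not any of the cited solvers: each subproblem is solved only approximately, with a plain fixed-step projected gradient loop swapped in for simplicity. The data matrix name `X`, the function names, and the iteration counts are assumptions made for the example.

```python
import numpy as np

def nls_projected_gradient(A, B, Z, n_steps=50):
    """Approximately solve min over Z >= 0 of ||B - A Z||_F^2 by projected gradient."""
    AtA, AtB = A.T @ A, A.T @ B
    step = 1.0 / (np.linalg.norm(AtA, 2) + 1e-12)  # inverse Lipschitz constant of the gradient
    for _ in range(n_steps):
        # Gradient step on 0.5 * ||B - A Z||_F^2, then projection onto Z >= 0.
        Z = np.maximum(Z - step * (AtA @ Z - AtB), 0.0)
    return Z

def anls(X, r, n_outer=30, seed=0):
    """Alternate between the two nonnegative least squares subproblems of X ~ F G."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, r))
    G = rng.random((r, m))
    for _ in range(n_outer):
        G = nls_projected_gradient(F, X, G)          # fixing F, solve for G
        F = nls_projected_gradient(G.T, X.T, F.T).T  # fixing G, solve for F (transposed problem)
    return F, G

rng = np.random.default_rng(1)
X = rng.random((15, 3)) @ rng.random((3, 12))  # hypothetical nonnegative data of rank 3
F, G = anls(X, r=3)
err = np.linalg.norm(X - F @ G)
```

Warm-starting each subproblem from the previous iterate keeps the objective from increasing between outer iterations, which is why a modest number of inner steps is enough for a sketch like this.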


Nonnegative Matrix Factorization: An Online Algorithm

Wang, Fei, Ping Li, and Arnd Christian König. "Efficient Document Clustering via Online Nonnegative Matrix Factorizations." In SDM, vol. 11, pp. 908-919. 2011.


Online NMF: Updating g

Nonnegative Least Squares (NLS)

● Active Set
● Projected Gradient
● Block Principal Pivoting

C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. Society for Industrial Mathematics, 1995.

C. J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19(10):2756-2779, 2007.

J. Kim and H. Park. Toward faster nonnegative matrix factorization: A new algorithm and comparisons. In Proceedings of the 8th International Conference on Data Mining (ICDM), pages 353-362, 2008.


Online NMF: Updating F

loss

gradient

1st order projected gradient descent

2nd order projected gradient descent


Online NMF: Experiments


NMF with Random Projections

Solution Procedure

Objectives Projected Objectives


Random NMF: Theoretical Analysis

R. I. Arriaga and S. Vempala. An algorithmic theory of learning: Robust concepts and random projection. In FOCS, 1999.


Random NMF: Experiments

Data Dimension: 12600; Data Size: 203


Patient EHR Matrix

Wang, Fei, Noah Lee, Jianying Hu, Jimeng Sun, and Shahram Ebadollahi. "Towards heterogeneous temporal clinical event pattern discovery: a convolutional approach." In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 453-461. ACM, 2012.


Temporal Patterns


One-Side Convolution


One-Side Convolutional NMF


Multiplicative Updates


Synthetic Example


Bag-of-Pattern Representation

[ 1 3 1 1 ]

[ 0 2 1 2 ]

[ 0 3 1 0 ]


Prediction of CHF Onset Risk

• Case: 1,127

• Control: 3,850

• Prediction Window: 180 days

• Observation Window: 360 days

• Logistic Regression


Matrix Factorization


Sparsity Lower Bound


Sparsity

Robert Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1):267–288, 1996.
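
The formulas on this slide are not preserved in the transcript. As a small hedged illustration of why an L1 penalty produces sparsity, the sketch below implements soft-thresholding, the proximal operator of the L1 norm, which sets small entries exactly to zero. The function name `soft_threshold`, the vector `x`, and the threshold `lam=1.0` are assumptions chosen for the example.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1: shrink each entry toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([3.0, -0.5, 0.2, -2.0])
shrunk = soft_threshold(x, lam=1.0)  # entries with magnitude below lam become zero
```

Entries whose magnitude exceeds `lam` are moved toward zero by `lam`, and the rest are zeroed out, which is the mechanism behind the sparse solutions of L1-regularized problems.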


Sparsity

• Group Lasso: L1/L2 mixed norm (L1 across groups, L2 within each group)

• Exclusive Lasso: L2/L1 mixed norm (L2 across groups, L1 within each group)

• Elastic Net Regularization

• Fused Lasso

• Tree Structured Group Lasso
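
As a hedged sketch of the first two penalties in this list, the code below evaluates a group lasso penalty (sum of within-group L2 norms) and an exclusive lasso penalty (sum of squared within-group L1 norms) on a toy vector. The function names, the vector `w`, and the partition `groups` are hypothetical, and group-size weights that some formulations include are omitted.

```python
import numpy as np

def group_lasso_penalty(w, groups):
    """Sum over groups of the L2 norm of the coefficients in each group."""
    return sum(np.linalg.norm(w[g], 2) for g in groups)

def exclusive_lasso_penalty(w, groups):
    """Sum over groups of the squared L1 norm of the coefficients in each group."""
    return sum(np.linalg.norm(w[g], 1) ** 2 for g in groups)

w = np.array([3.0, 4.0, 0.0, 0.0])
groups = [[0, 1], [2, 3]]  # hypothetical partition of the coefficients into two groups

gl = group_lasso_penalty(w, groups)      # 5.0: only the first group is active
el = exclusive_lasso_penalty(w, groups)  # 49.0: (|3| + |4|) squared
```

The group lasso penalty favors switching whole groups off, whereas the exclusive lasso penalty favors keeping only a few nonzero coefficients inside each group.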

Lukas Meier, Sara Van De Geer, Peter Bühlmann. The group lasso for logistic regression. Journal of the Royal Statistical Society: Series B, 70(1), 53–71, 2008.

Y. Zhou, R. Jin, and S. C. H. Hoi. Exclusive Lasso for Multi-task Feature Selection. AISTATS 2010.

Zou, Hui; Hastie, Trevor. Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society, Series B: 301–320. 2005.

R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, K. Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B. 67(1), 91–108. 2005.

J. Liu, J. Ye. Moreau-Yosida Regularization for Grouped Tree Structure Learning. NIPS 2010.

SLEP: A Sparse Learning Package http://www.public.asu.edu/~jye02/Software/SLEP/


Dictionary Learning


Group Sparse Coding


Bengio, Samy, et al. "Group sparse coding." Advances in neural information processing systems. 2009.


Automatic Group Sparse Coding


Wang, Fei, Noah Lee, Jimeng Sun, Jianying Hu, and Shahram Ebadollahi. "Automatic Group Sparse Coding." In AAAI. 2011.


Synthetic Example


Inductive Matrix Factorization


Natarajan, Nagarajan, and Inderjit S. Dhillon. "Inductive matrix completion for predicting gene–disease associations." Bioinformatics 30, no. 12 (2014): i60-i68.
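
The slide's formulas are missing from this transcript. As a hedged sketch of the bilinear model used in inductive matrix completion, where entry (i, j) is scored as x_i^T W H^T y_j from row features x_i and column features y_j, the code below shows how a row with no observed entries can still be scored from its features. All sizes and the random `W` and `H` are placeholders; in practice the two parameter matrices would be learned from observed entries.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_cols, d_row, d_col, k = 8, 6, 5, 4, 2  # hypothetical sizes

X = rng.standard_normal((n_rows, d_row))  # features of the row entities
Y = rng.standard_normal((n_cols, d_col))  # features of the column entities
W = rng.standard_normal((d_row, k))       # low-rank parameters (placeholders, not learned here)
H = rng.standard_normal((d_col, k))

scores = X @ W @ H.T @ Y.T                # entry (i, j) equals x_i^T W H^T y_j

x_new = rng.standard_normal(d_row)        # features of a row with no observed entries
scores_new = x_new @ W @ H.T @ Y.T        # scores against every column, with W and H unchanged
```

The parameter count depends on the feature dimensions and the rank k rather than on the number of rows and columns, which is what lets the model generalize to entities that were not seen during training.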
