Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization, Jialin Dong, ShanghaiTech University

Ranking from Crowdsourced Pairwise Comparisons via Matrix ...dml.cs.byu.edu/icdm17ws/Jialin.pdf · m crowd users,n items to be ranked Underlying weight matrix X:low-rank,unknown Pairwise

Oct 18, 2020

Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization

Jialin Dong

ShanghaiTech University

Outline

Introduction

Four Vignettes:

➢ System Model and Problem Formulation

➢ Problem Analysis

❖ Convex Methods

Disadvantage: Why Not Convex Optimization?

❖ Scalable Nonconvex Optimization

Motivation: Why Nonconvex Optimization?

➢ Matrix Manifold Optimization

❖ Regularized Smoothed Maximum-likelihood Estimation

➢ Simulation Results

Summary

Part I: Introduction

Ranking

Given n items, infer an ordering on the items based on partially sampled data

Applications

[Figure: partially observed matrix, with unobserved entries marked "?"]

Crowdsourcing

Crowdsourcing: harness the power of human computation to solve tasks

Applications: machine learning models, clustering data, scene recognition

Pairwise Measurements

Pairwise relations are measured over a few object pairs, because it is difficult to directly measure each individual object

Applications: localization, alignment, registration and synchronization, community detection, ... [Chen & Goldsmith, 14]

Part II: Four Vignettes

Vignette A: System Model and Problem Formulation

System Model

m crowd users, n items to be ranked

Underlying weight matrix X: low-rank, unknown

Pairwise comparisons

Sampling set

[Figure: partially observed matrix, with unobserved entries marked "?"]

System Model

BTL model [Bradley & Terry, 1952]:

Associated score

Individual ranking list for user i:
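
A minimal sketch of the BTL model, assuming its standard logistic form: item j is preferred over item k with probability exp(s_j) / (exp(s_j) + exp(s_k)), where s_j and s_k are the associated scores. The function names and the example scores are illustrative and do not come from the slides; the individual ranking list is taken here to be the items sorted by decreasing score.

```python
import math

def btl_win_probability(score_j: float, score_k: float) -> float:
    """P(item j is preferred over item k), assuming the logistic BTL form."""
    # Algebraically equal to exp(s_j) / (exp(s_j) + exp(s_k)).
    return 1.0 / (1.0 + math.exp(score_k - score_j))

def ranking_from_scores(scores):
    """Item indices sorted from the highest score to the lowest."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

user_scores = [0.2, 1.5, -0.7]  # hypothetical scores of three items for one user
p = btl_win_probability(user_scores[1], user_scores[0])
order = ranking_from_scores(user_scores)
```

Here `p` is the probability that this user prefers item 1 over item 0, and `order` lists the items from most to least preferred.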

Problem Formulation

Maximum-likelihood estimation (MLE): minimize the negative log-likelihood function

Low-rank matrix optimization problem

How to solve this low-rank matrix optimization problem?
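
One way to write the negative log-likelihood, sketched under the assumption that each sampled comparison (i, j, k) records whether user i preferred item j over item k and that preferences follow the logistic BTL form. The tuple encoding and the name `negative_log_likelihood` are hypothetical, and the rank constraint of the low-rank problem is not enforced here.

```python
import math

def negative_log_likelihood(X, comparisons):
    """Sum of -log P(observed outcome) over the sampled comparisons.

    X: list of rows, X[i][j] is the score of item j for user i.
    comparisons: iterable of (i, j, k, y), with y = 1 if user i preferred
    item j over item k and y = 0 otherwise (assumed encoding).
    """
    total = 0.0
    for i, j, k, y in comparisons:
        diff = X[i][j] - X[i][k]
        # -log sigmoid(margin) = log(1 + exp(-margin)); flip the sign when y = 0.
        margin = diff if y == 1 else -diff
        total += math.log1p(math.exp(-margin))
    return total

X = [[2.0, 0.0, -1.0]]                        # hypothetical 1 x 3 weight matrix
comparisons = [(0, 0, 1, 1), (0, 2, 1, 0)]    # hypothetical sampling set
nll = negative_log_likelihood(X, comparisons)
```

Outcomes that agree with the scores contribute small terms, while outcomes that contradict them are penalized more heavily.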

Vignette B: Problem Analysis

Vignette B.1: Convex Methods

Generic low-rank matrix optimization

Rank-constrained matrix optimization problem

➢ is a real-linear map on matrices

➢ is convex and differentiable

Challenge I: Reliably solve the low-rank matrix problem at scale

Challenge II: Develop optimization algorithms with optimal storage

A brief biased history of convex methods

1990s: Interior-point methods (computationally expensive)

➢ Storage cost for Hessian

2000s: Convex first-order methods

➢ (Accelerated) proximal gradient, spectral bundle methods, and others

➢ Store matrix variable

2008-Present: Storage-efficient convex first-order methods

➢ Conditional gradient method (CGM) and extensions

➢ Store matrix in low-rank form after iterations: no storage guarantees

Interior-point: Nemirovski & Nesterov 1994; ... First-order: Rockafellar 1976; Helmberg & Rendl 1997; Auslender & Teboulle 2006; ... CGM: Frank & Wolfe 1956; Levitin & Poljak 1967; Jaggi 2013; ...

Convex Relaxation Approach

Nuclear norm relaxation to solve the original problem:

Theoretical foundations: Beautiful, nearly complete theory

Effective algorithms: spectral projected-gradient (SPG) [Davenport et al., 14], Newton-ADMM method [Ali et al., 17], ...

➢ For problems that are not huge, use generic methods: high-level language support (CVX/CVXPY/Convex.jl) makes prototyping easy

Why not convex optimization?

Convex methods are slow memory hogs with high computational complexity

➢ Computationally expensive:

➢ Storage issue:

Vignette B.2: Scalable Nonconvex Optimization

Recent advances in nonconvex optimization

2009–Present: Nonconvex heuristics

➢ Burer–Monteiro factorization idea + various nonlinear programming methods

➢ Store low-rank matrix factors

Guaranteed solutions: Global optimality with statistical assumptions

➢ Matrix completion/recovery: [Sun-Luo’14], [Chen-Wainwright’15], [Ge-Lee-Ma’16], ...

➢ Phase retrieval: [Candes et al., 15], [Chen-Candes’15], [Sun-Qu-Wright’16]

➢ Community detection/phase synchronization: [Bandeira-Boumal-Voroninski’16], [Montanari et al., 17], ...

Why Nonconvex Optimization

Convex methods:

➢ Slow memory hogs:

➢ High computational complexity, e.g., singular value decomposition

Nonconvex methods: fast, lightweight

➢ Under certain statistical models with benign global geometry

➢ Store low-rank matrix factors
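
To illustrate why storing low-rank factors is lightweight, the sketch below compares the number of stored values for a dense m x n matrix with that of rank-r factors, and reads one entry of X = L R^T without forming X. The factor names `L` and `R` and all sizes are illustrative assumptions.

```python
def dense_storage(m: int, n: int) -> int:
    """Number of values stored for a full m x n matrix."""
    return m * n

def factored_storage(m: int, n: int, r: int) -> int:
    """Number of values stored for factors L (m x r) and R (n x r)."""
    return r * (m + n)

def entry(L, R, i: int, j: int) -> float:
    """Entry (i, j) of X = L R^T, computed from row i of L and row j of R."""
    return sum(a * b for a, b in zip(L[i], R[j]))

L = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # hypothetical 3 x 2 factor
R = [[1.0, 1.0], [2.0, 0.0]]              # hypothetical 2 x 2 factor
x_21 = entry(L, R, 2, 1)
```

For m = n = 1000 and r = 5, the factors hold 10,000 values while the dense matrix holds 1,000,000.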

Vignette C: Matrix Manifold Optimization

What is matrix manifold optimization?

Matrix manifold (or Riemannian) optimization problem

➢ The objective is a smooth function

➢ The search space is a Riemannian manifold: spheres, orthonormal bases (Stiefel), rotations, positive definite matrices, fixed-rank matrices, Euclidean distance matrices, semidefinite fixed-rank matrices, linear subspaces (Grassmann), phases, essential matrices, fixed-rank tensors, Euclidean spaces, ...

How to reformulate the original problem as a Riemannian optimization problem?

Reformulate to the Riemannian optimization problem

The original problem

The reformulated matrix manifold optimization problem

❖ Problem over the manifold of fixed-rank matrices

➢ F removing the origin

❖ Challenge: nonsmooth elementwise infinity norm constraint

Smoothing

Motivation: derive a smooth objective so that Riemannian optimization can be applied
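
The smoothing step can be illustrated with a log-sum-exp surrogate for the elementwise infinity norm. This is a sketch under the assumption that the surrogate has the common form mu * log(sum_i (exp(x_i / mu) + exp(-x_i / mu))); the exact function used in the slides is not reproduced here, and `smoothed_inf_norm` and `mu` are illustrative names.

```python
import math

def smoothed_inf_norm(values, mu: float) -> float:
    """Log-sum-exp surrogate for max_i |v_i|, with smoothing parameter mu > 0."""
    terms = [v / mu for v in values] + [-v / mu for v in values]
    shift = max(terms)  # subtract the maximum before exponentiating to avoid overflow
    return mu * (shift + math.log(sum(math.exp(t - shift) for t in terms)))

entries = [0.5, -2.0, 1.25]               # hypothetical flattened matrix entries
exact = max(abs(v) for v in entries)      # nonsmooth elementwise infinity norm
approx = smoothed_inf_norm(entries, mu=0.05)
```

For n entries the surrogate lies between the exact norm and the exact norm plus mu * log(2n), so a smaller `mu` gives a tighter but less smooth approximation.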

Regularization

Motivation: handle the constraint on the smoothed surrogate of the elementwise infinity norm

Convex regularized function:

How to develop the algorithm on the manifold?

Manifold optimization paradigms

Generalize Euclidean gradient (Hessian) to Riemannian gradient (Hessian)

We need Riemannian geometry: 1) linearize the search space into a tangent space; 2) pick a metric, i.e., a Riemannian metric, on the tangent space to give intrinsic notions of gradient and Hessian

[Figure labels: Riemannian gradient, Euclidean gradient, retraction operator]

Optimization on the manifold: main idea

Example: Rayleigh quotient

Optimization over (sphere) manifold

➢ The cost function is smooth on the sphere, and the matrix defining it is symmetric

Step 1: Compute the Euclidean gradient in the ambient Euclidean space

Step 2: Compute the Riemannian gradient on the sphere by projecting the Euclidean gradient onto the tangent space using the orthogonal projector
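
The two steps above, followed by a retraction back to the sphere, can be sketched as follows. The sketch assumes the cost f(x) = x^T A x is minimized (the sign convention on the slide is not known), uses plain Riemannian gradient descent with a fixed step, and takes normalization as the retraction; the matrix `A`, the step size, and all function names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [v / norm for v in x]

def riemannian_gradient(A, x):
    """Riemannian gradient of f(x) = x^T A x at a unit vector x."""
    egrad = [2.0 * v for v in matvec(A, x)]  # Step 1: Euclidean gradient 2 A x
    coeff = dot(x, egrad)                    # component of egrad along x
    # Step 2: apply the orthogonal projector I - x x^T onto the tangent space.
    return [g - coeff * xi for g, xi in zip(egrad, x)]

def minimize_rayleigh(A, x0, step=0.1, iters=300):
    x = normalize(x0)
    for _ in range(iters):
        g = riemannian_gradient(A, x)
        # Move along the negative gradient, then retract by normalizing.
        x = normalize([xi - step * gi for xi, gi in zip(x, g)])
    return x

A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]  # hypothetical symmetric matrix
x = minimize_rayleigh(A, [1.0, 1.0, 1.0])
value = dot(x, matvec(A, x))
```

With this diagonal `A`, the iterates approach the eigenvector of the smallest eigenvalue, so `value` tends to 1.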

Ranking problem in this paper

Low-rank optimization for ranking via Riemannian optimization

How to efficiently compute the descent direction on the manifold?

Riemannian trust-region method

Trust-region subproblem

Update the iterate
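
A rough sketch of a Riemannian trust-region iteration on a toy Rayleigh-quotient problem over the sphere, not on the ranking objective. It assumes the simplest subproblem solver, the Cauchy point (a minimizer of the quadratic model along the negative gradient within the trust region), whereas practical solvers usually use truncated conjugate gradients; the thresholds 0.25, 0.75, and 0.1 are conventional choices, and every name is illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def project(x, v):
    """Orthogonal projection of v onto the tangent space of the sphere at x."""
    c = dot(x, v)
    return [vi - c * xi for vi, xi in zip(v, x)]

def retract(x, eta):
    """Retraction: step along eta, then renormalize onto the sphere."""
    y = [xi + ei for xi, ei in zip(x, eta)]
    norm = math.sqrt(dot(y, y))
    return [yi / norm for yi in y]

def cost(A, x):
    return dot(x, matvec(A, x))

def rgrad(A, x):
    return project(x, [2.0 * v for v in matvec(A, x)])

def rhess(A, x, eta):
    """Riemannian Hessian of x^T A x on the sphere, applied to a tangent eta."""
    lam = cost(A, x)
    return [2.0 * p - 2.0 * lam * e
            for p, e in zip(project(x, matvec(A, eta)), eta)]

def trust_region(A, x0, radius=1.0, max_radius=2.0, iters=100):
    x = retract(x0, [0.0] * len(x0))  # normalize the starting point
    for _ in range(iters):
        g = rgrad(A, x)
        gnorm = math.sqrt(dot(g, g))
        if gnorm < 1e-8:
            break
        # Subproblem: minimize the quadratic model along -g within the radius.
        curvature = dot(g, rhess(A, x, g))
        tau = 1.0
        if curvature > 0.0:
            tau = min(1.0, gnorm ** 3 / (radius * curvature))
        eta = [-(tau * radius / gnorm) * gi for gi in g]
        predicted = -dot(g, eta) - 0.5 * dot(eta, rhess(A, x, eta))
        candidate = retract(x, eta)
        rho = (cost(A, x) - cost(A, candidate)) / predicted
        # Adjust the radius according to how well the model predicted the decrease.
        if rho < 0.25:
            radius *= 0.25
        elif rho > 0.75 and tau == 1.0:
            radius = min(2.0 * radius, max_radius)
        # Update the iterate only when the actual decrease is acceptable.
        if rho > 0.1:
            x = candidate
    return x

A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]  # illustrative matrix
x = trust_region(A, [1.0, 1.0, 1.0])
```

The ratio `rho` compares the actual decrease of the cost with the decrease predicted by the quadratic model, and it drives both the radius update and the acceptance of the step.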

Vignette D: Simulation Results

Algorithms & Metric

Algorithms

➢ Proposed Riemannian trust-region algorithm solving log-sum-exp regularized problem (PRTRS)

➢ Bi-factor gradient descent solving log-barrier regularized problem (BFGDB) [Park et al., 16]

➢ Spectral projected-gradient (SPG) [Davenport et al., 14]

Metric

➢ Sampling size: , rescaled sample size : d

➢ Relative mean square error (RMSE):
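
The metric can be computed as below, assuming the relative mean square error is the squared Frobenius norm of the estimation error divided by the squared Frobenius norm of the true matrix. That definition is a common convention and is an assumption here, as is the name `relative_mse`.

```python
def relative_mse(estimate, truth):
    """||estimate - truth||_F^2 / ||truth||_F^2 (assumed definition)."""
    num = sum((e - t) ** 2
              for est_row, true_row in zip(estimate, truth)
              for e, t in zip(est_row, true_row))
    den = sum(t ** 2 for true_row in truth for t in true_row)
    return num / den

truth = [[1.0, 2.0], [3.0, 4.0]]      # hypothetical ground-truth matrix
estimate = [[1.0, 2.0], [3.0, 5.0]]   # hypothetical estimate
error = relative_mse(estimate, truth)
```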

Convergence rate

[Figure: objective value for PRTRS, BFGDB [Park et al., 16], and SPG [Davenport et al., 14]]

PRTRS:

➢ Faster rate of convergence than the BFGDB algorithm

➢ Comparable with the SPG algorithm

Performance

Conclusion

➢ As the rescaled sample size increases, the relative MSE decreases

➢ PRTRS: better performance in terms of relative MSE than both SPG and BFGDB

GLOBECOM 2017 TUTORIAL

Computational Cost

Conclusion

➢ Computational time with different sizes K

➢ PRTRS: dramatic advantage in computational time over both SPG and BFGDB

Part III: Summary

Concluding remarks

Ranking from pairwise comparisons in the crowdsourcing system

Scalable nonconvex optimization algorithms

➢ Store low-rank matrix factors

➢ Global optimality with statistical assumptions

Matrix manifold optimization

➢ Smoothed regularized MLE

➢ Riemannian trust-region algorithm: outperforms the state-of-the-art algorithms in

❖ performance (i.e., relative MSE)

❖ computational cost

❖ convergence rate

Thanks
