IE598 Big Data Optimization Instructor: Niao He Jan 17, 2018 Introduction 1

IE598 Big Data Optimizationniaohe.ise.illinois.edu/IE598/IE598-lecture1-introduction.pdf · Starts with the buzzword 9. Era of Big Data • Big data heat in academia 10. Era of Big

Mar 26, 2020


IE598 Big Data Optimization

Instructor: Niao He

Jan 17, 2018

Introduction


A little about me

• Assistant Professor, ISE & CSL, UIUC, 2016 –

• Ph.D. in Operations Research and M.S. in Computational Sci. & Eng., Georgia Tech, 2010 – 2015

• B.S. in Mathematics, University of Sci. & Tech. of China, 2006 – 2010


A little about the course

Big Data Optimization

• Explore modern optimization theories, algorithms, and big data applications

• Emphasize a deep understanding of the structure of optimization problems and the computational complexity of numerical algorithms

• Expose students to the frontier of research at the intersection of large-scale optimization and machine learning


A little about you

• PhD or Master?

• ISE, ECE, Stat, CS?

• Took any optimization courses?


Course Details

• Prerequisites: none formally, but knowledge of the following is assumed

– linear algebra, real analysis, and probability theory

– basic machine learning and optimization at graduate level

• Textbooks: none required, but reading the references listed on the syllabus is recommended

Ben-Tal & Nemirovski (2011), Nesterov (2003), Beck (2017), Bubeck (2015)


Course Details

• Evaluation: groups of 1~3 students

– Paper Presentation (25%)

– Final Project (75%)

• Proposal (10%): 1~3 pages

• Report (40%): 10~15 pages under given format

• Presentation (25%): 15~20 mins

– Bonus (20 pts): a conference or journal submission

• Deadlines: see syllabus


Course Admin

• Syllabus & Website: http://niaohe.ise.illinois.edu/IE598/ (pwd: spring2018)

• Where to get help

– No TAs

– Email: [email protected] with [IE 598] in your subject

– Office Location: 211 Transport Building

– Office Hours: Mon. 3:00-4:00 or by appointment via email


Introduction


Starts with the buzzword


Era of Big Data

• Big data heat in academia


Era of Big Data

• Big data heat in industry

– LinkedIn: 48,000+ Data Scientist jobs in the United States


More than a buzzword

• Big data revolution in various areas

– Robotics and autonomous cars

– Natural language processing

– Computer vision

– Healthcare

[Figure: application areas: Finance, Healthcare, Environment, Aerospace, Lifestyle]


How to do data analysis?

Key Steps

• Pose a problem

• Collect data

• Pre-process and clean data

• Formulate a mathematical model

• Find a solution

• Evaluate and interpret the results


What is Optimization?

• Find the optimal solution that minimizes or maximizes an objective function subject to constraints
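In symbols, a generic problem of this kind can be written as below. The objective $f$, the constraint functions $g_i$, and the domain $\mathbb{R}^d$ are notation assumed here for illustration, not taken from the slide.

```latex
\min_{x \in \mathbb{R}^d} \; f(x)
\quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \dots, m
```

Maximizing $f$ is the same as minimizing $-f$, so writing the problem as a minimization loses no generality.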


Why do we care?

Optimization lies at the heart of many fields, especially machine learning.

• Finance

– Portfolio selection, asset pricing, etc.

• Electrical Engineering

– Signal and image processing, control and robotics, etc.

• Industrial Engineering

– Supply chain, revenue management, transportation, etc.

• Computer Science

– Machine learning, computer vision, etc.


Example – Portfolio Selection

• Markowitz Mean-Variance Model

where

– 𝑤 is a vector of portfolio weights

– 𝑅 is the vector of expected returns

– Σ is the covariance matrix of asset returns, so 𝑤ᵀΣ𝑤 is the variance of the portfolio return

– 𝜆 > 0 is the risk tolerance factor
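The formula itself is not present in this transcript. One standard statement that matches the symbols listed above is $\min_w \; w^\top \Sigma w - \lambda R^\top w$ subject to $\mathbf{1}^\top w = 1$; this form, including the budget constraint and the absence of a no-short-selling constraint, is an assumption and may differ from the slide. Under that assumption the optimal weights have a closed form, which the following pure-Python sketch computes. The helper names and the numbers are hypothetical.

```python
def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (M[i][n] - s) / M[i][i]
    return y


def markowitz_weights(R, Sigma, lam):
    """Minimize w' Sigma w - lam * R' w subject to sum(w) == 1 (assumed form)."""
    n = len(R)
    a = solve(Sigma, R)                 # Sigma^{-1} R
    c = solve(Sigma, [1.0] * n)         # Sigma^{-1} 1
    nu = (2.0 - lam * sum(a)) / sum(c)  # multiplier of the budget constraint
    return [0.5 * (lam * a[i] + nu * c[i]) for i in range(n)]


R = [0.10, 0.05]                       # hypothetical expected returns
Sigma = [[0.04, 0.00], [0.00, 0.01]]   # hypothetical covariance matrix
w_min_var = markowitz_weights(R, Sigma, lam=0.0)   # pure risk minimization
w_tolerant = markowitz_weights(R, Sigma, lam=0.5)  # more weight on return
```

With $\lambda = 0$ the result is the minimum-variance portfolio; increasing $\lambda$ shifts weight toward the asset with the higher expected return.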


Example – Image Denoising

• Total Variation Denoising Model

where

– 𝑥 is the image matrix

– 𝑂 is the noisy image and 𝑃 is the set of observed entries

– 𝑇𝑉(𝑥) is the total variation
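The model's formula is missing from this transcript. A common form consistent with the symbols above is a squared data-fit term over the observed entries plus $\lambda \, TV(x)$; both that form and the weight $\lambda$ are assumptions here. The sketch below evaluates such an objective using the anisotropic total variation (sum of absolute differences between neighboring pixels), which is one of several definitions and may not be the one the slide intends.

```python
def total_variation(x):
    """Anisotropic TV: sum of absolute differences between adjacent pixels."""
    rows, cols = len(x), len(x[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                tv += abs(x[i + 1][j] - x[i][j])   # vertical neighbor
            if j + 1 < cols:
                tv += abs(x[i][j + 1] - x[i][j])   # horizontal neighbor
    return tv


def denoising_objective(x, O, P, lam):
    """Squared error on the observed pixels P plus lam * TV(x) (assumed form)."""
    fit = sum((x[i][j] - O[i][j]) ** 2 for (i, j) in P)
    return fit + lam * total_variation(x)


flat = [[1.0, 1.0], [1.0, 1.0]]       # constant image, no variation
edge = [[0.0, 1.0], [0.0, 1.0]]       # image with one vertical edge
observed = [(0, 0), (0, 1), (1, 0)]   # hypothetical set of observed pixels
```

A constant image has zero total variation, while every edge adds to it, so the penalty favors piecewise-constant reconstructions.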


Example – Inventory

• Newsvendor Model

where

– 𝑞 is the number of newspapers to be stocked

– 𝐷 is the random demand

– 𝑐 is the unit purchase price

– 𝑝 is the selling price
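The formula is not present in this transcript. A textbook version that uses exactly these symbols is $\max_q \; \mathbb{E}[\,p \min(q, D)\,] - c\,q$, with no salvage value or shortage penalty; that form is an assumption. For it, a well-known result is that an optimal order quantity is the $(p - c)/p$ quantile of the demand distribution. The sketch below checks this on hypothetical demand samples by comparing the quantile with a brute-force search.

```python
import math


def expected_profit(q, demand, p, c):
    """Sample average of p * min(q, D) - c * q over the demand samples."""
    revenue = sum(p * min(q, d) for d in demand) / len(demand)
    return revenue - c * q


def critical_quantile(demand, p, c):
    """Smallest sample value q whose empirical CDF is at least (p - c) / p."""
    ratio = (p - c) / p
    ordered = sorted(demand)
    index = max(math.ceil(ratio * len(ordered)) - 1, 0)
    return ordered[index]


demand = list(range(1, 11))   # hypothetical demand samples 1, ..., 10
p, c = 10.0, 2.5              # hypothetical selling and purchase prices
q_star = critical_quantile(demand, p, c)
q_best = max(demand, key=lambda q: expected_profit(q, demand, p, c))
```

On this data the quantile rule and the exhaustive search agree, which is what the closed-form result predicts.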


Example – Regression

• Linear Regression Model

where

– 𝑥𝑖: predictor vector (feature)

– 𝑦𝑖: response vector (label)

– 𝑤: parameters to be learned

– 𝑛: number of data points
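The formula is missing from this transcript. The usual least-squares form with these symbols is $\min_w \frac{1}{n} \sum_{i=1}^{n} (y_i - w^\top x_i)^2$; the $1/n$ scaling and the scalar response are assumptions. As a minimal sketch, plain gradient descent on that objective looks as follows. The step size, iteration count, and data are hypothetical choices that happen to work for this small example.

```python
def least_squares_gd(X, y, step=0.1, iters=2000):
    """Gradient descent on (1/n) * sum_i (y_i - w . x_i)^2."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            residual = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j in range(d):
                grad[j] += 2.0 * residual * xi[j] / n
        w = [wj - step * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical data generated from y = 2 * x + 1; the constant feature 1.0
# lets the last entry of w act as an intercept.
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [1.0, 3.0, 5.0, 7.0]
w = least_squares_gd(X, y)
```

Each iteration touches all $n$ data points, which is the per-iteration cost that becomes the bottleneck when $n$ is very large.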


Example – Regularization

• Ridge Regression Model

▪ $\|w\|_2^2 = \sum_{j=1}^{d} w_j^2$ is the 𝐿2-regularization

• LASSO (Least Absolute Shrinkage and Selection Operator)

▪ $\|w\|_1 = \sum_{j=1}^{d} |w_j|$ is the 𝐿1-regularization
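To see how the two penalties behave differently, it helps to look at a one-dimensional toy problem, which is an illustration assumed here rather than something on the slide: fit a single number $w$ to a target $z$ with loss $\frac{1}{2}(w - z)^2$ plus either $\lambda w^2$ or $\lambda |w|$. Both minimizers have closed forms, sketched below with hypothetical function names.

```python
def ridge_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * w**2."""
    return z / (1.0 + 2.0 * lam)


def soft_threshold(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


z_values = [3.0, 0.5, -3.0]   # hypothetical unregularized estimates
ridge = [ridge_shrink(z, 1.0) for z in z_values]
lasso = [soft_threshold(z, 1.0) for z in z_values]
```

The 𝐿2 penalty scales every value toward zero without reaching it, while the 𝐿1 penalty sets small values exactly to zero, which is why LASSO performs selection as well as shrinkage.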


Example – Classification

• Maximum Margin Classifier Model

where

– 𝑥𝑖: predictor vector (feature)

– 𝑦𝑖 ∈ {1,−1}: label/class

– 𝑤: parameters to be learned

– 𝑛: number of data points
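The formula is missing from this transcript. The standard hard-margin formulation with these symbols is shown below; writing it without a bias term is an assumption made because the slide lists only $w$ as a parameter.

```latex
\min_{w} \; \tfrac{1}{2}\|w\|_2^2
\quad \text{subject to} \quad y_i \, w^\top x_i \ge 1, \; i = 1, \dots, n
```

The constraints force every point onto the correct side with functional margin at least 1, and minimizing $\|w\|_2$ then maximizes the geometric margin $1 / \|w\|_2$.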


Example – More Classification

• Soft Margin SVM (support vector machine)

• Logistic Regression
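Neither formula is present in this transcript. The usual unconstrained forms are $\min_w \frac{1}{2}\|w\|_2^2 + C \sum_i \max(0, 1 - y_i w^\top x_i)$ for the soft-margin SVM and $\min_w \frac{1}{n} \sum_i \log(1 + e^{-y_i w^\top x_i})$ for logistic regression; the constant $C$, the scalings, and the omission of a bias term are assumptions. The sketch below only evaluates the two objectives on hypothetical data.

```python
import math


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def soft_margin_svm(w, X, y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses max(0, 1 - y_i * w . x_i)."""
    hinge = sum(max(0.0, 1.0 - yi * dot(w, xi)) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * hinge


def logistic_loss(w, X, y):
    """Average of log(1 + exp(-y_i * w . x_i))."""
    total = sum(math.log1p(math.exp(-yi * dot(w, xi))) for xi, yi in zip(X, y))
    return total / len(X)


X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]   # hypothetical features
y = [1, -1, -1]                             # labels in {1, -1}
w = [1.0, 0.0]
svm_value = soft_margin_svm(w, X, y, C=1.0)
logit_value = logistic_loss(w, X, y)
```

The hinge loss is exactly zero for points classified with margin at least 1 and is nonsmooth at the kink, whereas the logistic loss is smooth and strictly positive everywhere.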


Example – Maximum Likelihood Estimation

• Assume data points 𝑥1, . . . , 𝑥𝑛 are drawn i.i.d. from some distribution, and we want to fit the data with a model 𝑝(𝑥|𝑤) with parameter 𝑤. Maximum likelihood estimation solves $\max_{w} \sum_{i=1}^{n} \log p(x_i \mid w)$, i.e., it maximizes the log-likelihood of the data.

– Least square regression as a special case

– Logistic regression as a special case
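To make the least-squares special case concrete, suppose (as an assumption not stated on the slide) that each response follows $y_i = w^\top x_i + \varepsilon_i$ with independent Gaussian noise $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$ and fixed $\sigma^2$. The conditional likelihood and its negative logarithm are then:

```latex
\begin{aligned}
p(y_i \mid x_i, w) &= \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left(-\frac{(y_i - w^\top x_i)^2}{2\sigma^2}\right), \\
-\sum_{i=1}^{n} \log p(y_i \mid x_i, w) &=
  \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - w^\top x_i)^2
  + \frac{n}{2} \log(2\pi\sigma^2).
\end{aligned}
```

The second term does not depend on $w$, so maximizing the likelihood is the same as minimizing the sum of squared residuals. Replacing the Gaussian model with a Bernoulli model whose success probability is a sigmoid of $w^\top x_i$ gives logistic regression in the same way.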


Example – Clustering

• K-Means Model

where

– 𝑥1, … , 𝑥𝑛: data

– 𝜇1, … , 𝜇𝑘: cluster centers to be learned

– 𝐶1, … , 𝐶𝑘: clusters to be assigned to
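The formula is missing from this transcript. The standard K-means objective with these symbols is $\min \sum_{j=1}^{k} \sum_{i \in C_j} \|x_i - \mu_j\|_2^2$ over both the centers and the assignments. The slide only states the model; the sketch below uses Lloyd's algorithm, a common heuristic for it, with a simple deterministic initialization chosen here for reproducibility. It is only guaranteed to reach a local optimum.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lloyd_kmeans(points, k, iters=100):
    """Alternate between assigning points and recomputing cluster centers."""
    centers = [list(p) for p in points[:k]]   # simple deterministic start
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its cluster.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append([sum(m[t] for m in members) / len(members)
                                    for t in range(dim)])
            else:
                new_centers.append(centers[j])   # keep an empty cluster's center
        if new_centers == centers:
            break                                # assignments are stable
        centers = new_centers
    return centers, labels


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]  # hypothetical data
centers, labels = lloyd_kmeans(points, k=2)
```

Each of the two steps can only lower the objective, which is why the procedure terminates, but the result depends on the starting centers.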


Many More Examples in ML

• Supervised learning (predictive models)

– Regression

– Classification

– Neural networks

– Boosting

• Unsupervised learning (data exploration)

– Clustering (K-means)

– Dimension reduction (PCA)

– Density estimation

• Reinforcement learning

• Collaborative filtering

• Graphical models

• Probabilistic inference

• … …

Theme of This Course

How to solve optimization problems efficiently in the Big Data environment?


Structure of Optimization

• Linear vs. Nonlinear

• Deterministic vs. Stochastic

• Continuous vs. Combinatorial

• Smooth vs. Nonsmooth

• Convex vs. Nonconvex

• Low-dimensional vs. High-dimensional

• Static vs. Online

• Single vs. Sequential Decision Making


Easy or Hard?

What makes an optimization problem easy or hard?

• Find minimum volume ellipsoid: polynomially solvable

• Find maximum volume ellipsoid: NP-hard

Example from L. Xiao, CS286 seminar


Easy or Hard?

What makes an optimization problem easy or hard?

• Linear optimization: polynomially solvable

• Polynomial optimization: ranges from polynomially solvable to NP-hard


Complexity and Convexity

“The great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity.”

— R. Rockafellar, SIAM Review 1993
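A small numerical experiment illustrates the point. The two one-dimensional functions below are hypothetical examples chosen for this sketch: gradient descent on the convex one reaches the same minimizer from any start, while on the non-convex one the answer depends on where it starts.

```python
def gradient_descent(grad, x0, step=0.01, iters=5000):
    """Fixed-step gradient descent on a one-dimensional function."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x


def convex_grad(x):
    """Gradient of the convex function (x - 1)^2."""
    return 2.0 * (x - 1.0)


def nonconvex_grad(x):
    """Gradient of the non-convex function x^4 - 3 x^2 + x."""
    return 4.0 * x ** 3 - 6.0 * x + 1.0


convex_runs = [gradient_descent(convex_grad, x0) for x0 in (-2.0, 2.0)]
nonconvex_runs = [gradient_descent(nonconvex_grad, x0) for x0 in (-2.0, 2.0)]
```

Both convex runs end at the unique minimizer, whereas the two non-convex runs end at different stationary points, only one of which is the global minimum.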

[Figure: Non-Convex Optimization vs. Convex Optimization]


Types of Algorithms

• Polynomial-time algorithms (date back to the 1970s or so)

– E.g., ellipsoid method, interior point method (IPM)

• First-order algorithms (date back to the 1900s, revived since 1980)

– E.g., accelerated gradient descent method (AGD)

• Second-order algorithms

– E.g., Newton's method, L-BFGS

• Stochastic and randomized algorithms (date back to the 1950s, revived since 2004)

– E.g., stochastic approximation (SA)
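As a rough illustration of the stochastic family, the sketch below applies stochastic gradient steps to a one-dimensional least-squares problem: each iteration uses the gradient of a single randomly chosen term instead of the full sum. The data, step size, and seed are hypothetical. A constant step size is used only because this data is noise-free; classical stochastic approximation typically uses diminishing step sizes.

```python
import random


def sgd_least_squares(xs, ys, step=0.05, iters=1000, seed=0):
    """Stochastic gradient steps on the average of (w * x_i - y_i)^2."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iters):
        i = rng.randrange(len(xs))                  # sample one data point
        grad_i = 2.0 * xs[i] * (w * xs[i] - ys[i])  # gradient of that term only
        w -= step * grad_i
    return w


xs = [0.5, 1.0, 1.5, 2.0]     # hypothetical noise-free data with y = 3 * x
ys = [3.0 * x for x in xs]
w = sgd_least_squares(xs, ys)
```

The cost of one step does not depend on the number of data points, which is the main appeal of this family when the dataset is very large.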


Central Topics

• Large-Scale Optimization: Algorithms, Complexity, Applications


Courtesy Warning

• If you are looking for software/tools for big data analytics:

– IE 529 Stats of Big Data & Clustering

– Check CS/CSE related data mining courses

• If you are looking for introduction-level optimization:

– ECE 490 Introduction to Optimization

– IE 510 Advanced Nonlinear Programming

• If you are looking for combinatorial optimization:

– IE 598 Advanced Integer Programming

• Otherwise, welcome to the class and see you next week!
