Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee
Source: i-systems.github.io/HSE545/iAI/ML/topics/05_Classification/10_SVM.pdf

Nov 04, 2019

Classification (Linear)

• Autonomously figure out which category (or class) an unknown item should be categorized into

• Number of categories / classes

– Binary: 2 different classes

– Multiclass: more than 2 classes

• Feature

– The measurable parts that make up the unknown item (or the information you have available to categorize)

Distance from a Line

𝝎

• If $\vec{p}$ and $\vec{q}$ are on the decision line
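
The consequence of two points lying on the decision line can be written out as a short derivation. This is a sketch that assumes the decision line is the set where $g(x) = \omega_0 + \omega^T x = 0$, the form the later slides use; the function name $g$ is borrowed from the later distance slides.

```latex
\begin{aligned}
g(\vec{p}) = \omega_0 + \omega^T \vec{p} &= 0 \\
g(\vec{q}) = \omega_0 + \omega^T \vec{q} &= 0 \\
\Longrightarrow \quad \omega^T (\vec{p} - \vec{q}) &= 0
\end{aligned}
```

Since $\vec{p} - \vec{q}$ points along the line, $\omega$ is orthogonal to the line, that is, $\omega$ is a normal vector of the decision line.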

𝒅

• If $x$ is on the line and $x = d\,\dfrac{\omega}{\lVert\omega\rVert}$ (where $d$ is the normal distance from the origin to the line)
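
The value of $d$ follows by substitution. A sketch, assuming the line is $g(x) = \omega_0 + \omega^T x = 0$ and taking the point on the line to be $x = d\,\omega / \lVert\omega\rVert$:

```latex
\begin{aligned}
g(x) = \omega_0 + \omega^T \left( d\,\frac{\omega}{\lVert\omega\rVert} \right)
     &= \omega_0 + d\,\frac{\omega^T\omega}{\lVert\omega\rVert}
      = \omega_0 + d\,\lVert\omega\rVert = 0 \\
\Longrightarrow \quad d &= -\frac{\omega_0}{\lVert\omega\rVert}
\end{aligned}
```

The sign of $d$ tells on which side of the line the origin lies relative to the direction of $\omega$.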

Distance from a Line: 𝒉

• For any vector 𝑥
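
For a general point, a standard result is that the signed distance to the line is $h = g(x) / \lVert\omega\rVert$ with $g(x) = \omega_0 + \omega^T x$: write $x = x_\perp + h\,\omega/\lVert\omega\rVert$ with $x_\perp$ on the line, and $g(x) = h\,\lVert\omega\rVert$ follows. The sketch below evaluates this numerically; the function name `signed_distance` and the example line are assumptions made for illustration.

```python
import math

def signed_distance(w, w0, x):
    """Signed distance h from point x to the line w0 + w^T x = 0."""
    g = w0 + sum(wi * xi for wi, xi in zip(w, x))   # g(x) = w0 + w^T x
    norm = math.sqrt(sum(wi * wi for wi in w))      # ||w||_2
    return g / norm

# Hypothetical line 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5.
w, w0 = (3.0, 4.0), -5.0
h_origin = signed_distance(w, w0, (0.0, 0.0))  # origin is on the negative side
h_point = signed_distance(w, w0, (3.0, 4.0))   # this point is on the positive side
```

A positive value means the point lies on the side that $\omega$ points toward; a negative value means the opposite side.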

Distance from a Line: 𝒉

• Another method to find the distance between $g(x) = 1$ and $g(x) = -1$

• Suppose $g(x_1) = -1$ and $g(x_2) = 1$
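
The distance between the two level sets can be worked out from these two points. A sketch, assuming $g(x) = \omega_0 + \omega^T x$ and that $x_2$ is reached from $x_1$ by moving a distance $s$ along the unit normal, $x_2 = x_1 + s\,\omega/\lVert\omega\rVert$ (the symbol $s$ is introduced here for illustration):

```latex
\begin{aligned}
g(x_2) - g(x_1) &= \omega^T (x_2 - x_1)
                 = s\,\frac{\omega^T\omega}{\lVert\omega\rVert}
                 = s\,\lVert\omega\rVert \\
1 - (-1) &= s\,\lVert\omega\rVert
\quad \Longrightarrow \quad s = \frac{2}{\lVert\omega\rVert}
\end{aligned}
```

So the band between $g(x) = -1$ and $g(x) = 1$ has width $2 / \lVert\omega\rVert$, which grows as $\lVert\omega\rVert$ shrinks.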

Illustrative Example

• Binary classification

– 𝐶1 and 𝐶0

• Features

– The coordinates of the unknown animal 𝑖 in the zoo

Hyperplane

• Is it possible to distinguish between 𝐶1 and 𝐶0 by their coordinates on a map of the zoo?

• We need to find a separating hyperplane (or a line in 2D)

Decision Making

• Given:

– Hyperplane defined by 𝜔 and 𝜔0

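– Animal coordinates (or features) $x$

• Decision making: classify $x$ by the sign of $\omega_0 + \omega^T x$

• Find $\omega$ and $\omega_0$ such that the hyperplane $\omega_0 + \omega^T x = 0$ separates the two classes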
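
Once $\omega$ and $\omega_0$ are known, the decision step is a one-line sign test. The sketch below assumes that the positive side of the hyperplane corresponds to 𝐶1 and the negative side to 𝐶0, and that points exactly on the hyperplane are assigned to 𝐶0 by convention; the function name `classify` and the numbers are hypothetical.

```python
def classify(w, w0, x):
    """Return "C1" if w0 + w^T x > 0, otherwise "C0" (assumed convention)."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return "C1" if score > 0 else "C0"

# Hypothetical hyperplane x1 + x2 - 8 = 0 and two animals' coordinates.
w, w0 = (1.0, 1.0), -8.0
labels = [classify(w, w0, x) for x in [(6.0, 5.0), (1.0, 2.0)]]
```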

Decision Boundary or Band

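• Find $\omega$ and $\omega_0$ such that the decision boundary $\omega_0 + \omega^T x = 0$ separates the two classes

or

• Find $\omega$ and $\omega_0$ such that

– $x \in C_1$ given $\omega_0 + \omega^T x > 1$ and

– $x \in C_0$ given $\omega_0 + \omega^T x < -1$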
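
The band conditions can be checked directly for a candidate hyperplane. The following is a minimal sketch with hypothetical data and a hypothetical helper `satisfies_band`; it also shows that rescaling $(\omega, \omega_0)$ leaves the boundary unchanged but can break the band conditions, which is why the scale of $\omega$ matters.

```python
def g(w, w0, x):
    """Decision function g(x) = w0 + w^T x."""
    return w0 + sum(wi * xi for wi, xi in zip(w, x))

def satisfies_band(w, w0, c1_points, c0_points):
    """True if g(x) > 1 for all C1 points and g(x) < -1 for all C0 points."""
    return (all(g(w, w0, x) > 1.0 for x in c1_points)
            and all(g(w, w0, x) < -1.0 for x in c0_points))

# Hypothetical training points.
c1 = [(6.0, 6.0), (5.0, 5.0)]
c0 = [(1.0, 1.0), (3.0, 3.0)]

ok = satisfies_band((1.0, 1.0), -8.0, c1, c0)          # band conditions hold
ok_scaled = satisfies_band((0.25, 0.25), -2.0, c1, c0)  # same line, smaller scale
```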

Data Generation for Classification
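
A minimal sketch of how linearly separable two-class data could be generated, using only the Python standard library. It is not the original code for this slide: the ground-truth line, the sampling square, the sample counts, and the name `generate_data` are all assumptions made for illustration.

```python
import random

def generate_data(n_c1=50, n_c0=50, seed=0):
    """Sample 2D points on either side of a hypothetical line, leaving a band."""
    rng = random.Random(seed)
    w_true, w0_true = (1.0, 1.0), -8.0   # hypothetical ground truth: x1 + x2 - 8 = 0
    c1, c0 = [], []
    while len(c1) < n_c1 or len(c0) < n_c0:
        x = (rng.uniform(0.0, 8.0), rng.uniform(0.0, 8.0))
        score = w0_true + w_true[0] * x[0] + w_true[1] * x[1]
        if score >= 1.0 and len(c1) < n_c1:
            c1.append(x)          # class C1: at least 1 above the line
        elif score <= -1.0 and len(c0) < n_c0:
            c0.append(x)          # class C0: at least 1 below the line
        # points inside the band (or of an already full class) are discarded
    return c1, c0

X1, X0 = generate_data()
```

Discarding points inside the band guarantees that a separating hyperplane with the band property exists for the generated set.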

Optimization Formulation 1

• 𝑛 (= 2) features

• 𝑁 data points belong to 𝐶1 in the training set

• 𝑀 data points belong to 𝐶0 in the training set

• 𝑚 = 𝑁 + 𝑀 data points in the training set

• 𝜔 and 𝜔0 are the unknown variables
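
With this notation, the separable problem can be posed as a feasibility problem. The statement below is a sketch under assumptions: $X_1 \in \mathbb{R}^{N \times n}$ and $X_0 \in \mathbb{R}^{M \times n}$ are assumed to stack the 𝐶1 and 𝐶0 training points as rows, $\mathbf{1}$ is a vector of ones, the objective is an arbitrary constant because any feasible point is acceptable, and non-strict inequalities are used because solvers work with closed constraints.

```latex
\begin{aligned}
\underset{\omega,\ \omega_0}{\text{minimize}} \quad & 0 \\
\text{subject to} \quad & \omega_0 \mathbf{1} + X_1 \omega \ge \mathbf{1} \\
                        & \omega_0 \mathbf{1} + X_0 \omega \le -\mathbf{1}
\end{aligned}
```

Both constraints are linear in $(\omega, \omega_0)$, so this is a linear program.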

CVXPY 1

Linear Classification: Outlier

• Note that in the real world, you may have noise, errors, or outliers that do not accurately represent the actual phenomena

• Linearly non-separable case

Outliers

• No solution (separating hyperplane) exists

• We have to allow some training examples to be misclassified, but we want their number to be minimized

Optimization Formulation 2

• 𝑛 (= 2) features

• 𝑁 data points belong to 𝐶1 in the training set

• 𝑀 data points belong to 𝐶0 in the training set

• 𝑚 = 𝑁 + 𝑀 data points in the training set

• For the non-separable case, we relax the separating constraints

• Need slack variables 𝑢 and 𝑣, all of whose entries are nonnegative
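
One common way to write the relaxed constraints is shown below. This is a sketch under assumptions: $X_1$ and $X_0$ are assumed to stack the 𝐶1 and 𝐶0 training points as rows, $u \in \mathbb{R}^N$ and $v \in \mathbb{R}^M$ hold one slack value per training point, and $\mathbf{1}$ is a vector of ones.

```latex
\begin{aligned}
\omega_0 \mathbf{1} + X_1 \omega &\ge \mathbf{1} - u \\
\omega_0 \mathbf{1} + X_0 \omega &\le -(\mathbf{1} - v) \\
u \ge 0, \quad v &\ge 0
\end{aligned}
```

A slack entry of zero means the point satisfies the original band constraint; a positive entry measures by how much the point violates it.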

Optimization Formulation 2

• The optimization problem for the non-separable case
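
A natural objective for the relaxed problem is the total slack, $\mathbf{1}^T u + \mathbf{1}^T v$, minimized over $\omega$, $\omega_0$, $u$, and $v$ subject to the relaxed constraints. For a fixed hyperplane, the smallest feasible slacks have a closed form: $u_i = \max(0,\ 1 - g(x_i))$ for 𝐶1 points and $v_j = \max(0,\ 1 + g(x_j))$ for 𝐶0 points, where $g(x) = \omega_0 + \omega^T x$. The sketch below evaluates them; the data, the hyperplane, and the name `minimal_slack` are hypothetical, and the code only scores a given hyperplane rather than solving the optimization.

```python
def g(w, w0, x):
    """Decision function g(x) = w0 + w^T x."""
    return w0 + sum(wi * xi for wi, xi in zip(w, x))

def minimal_slack(w, w0, c1_points, c0_points):
    """Smallest slacks u, v that make the relaxed constraints hold for (w, w0)."""
    u = [max(0.0, 1.0 - g(w, w0, x)) for x in c1_points]  # C1: g(x) >= 1 - u
    v = [max(0.0, 1.0 + g(w, w0, x)) for x in c0_points]  # C0: g(x) <= -(1 - v)
    return u, v

# Hypothetical data: the second C1 point is an outlier on the wrong side.
c1 = [(6.0, 6.0), (2.0, 2.0)]
c0 = [(1.0, 1.0), (4.0, 4.5)]
u, v = minimal_slack((1.0, 1.0), -8.0, c1, c0)
total_slack = sum(u) + sum(v)
```

The outlier dominates the total, which is what lets a single bad point pull the fitted hyperplane toward it.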

Expressed in a Matrix Form

CVXPY 2

Further Improvement

• Notice that the hyperplane does not represent the division accurately because of the outlier

• Can we do better when there are noisy data or outliers?

• Yes, but we need to look beyond linear programming

• Idea: a large margin leads to good generalization on the test data

Maximize Margin

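• Finally, this leads to the Support Vector Machine (SVM)

• Distance (= margin) between $\omega_0 + \omega^T x = 1$ and $\omega_0 + \omega^T x = -1$: $\dfrac{2}{\lVert\omega\rVert_2}$

• Minimize $\lVert\omega\rVert_2$ to maximize the margin (the distance from the decision line to the closest samples)

• Use gamma (𝛾) as a weighting between the following:

– A bigger margin, which gives robustness to outliers

– A hyperplane that has few (or no) errors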
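
Combining the two goals gives an optimization problem of the following shape. This is a sketch under assumptions: $X_1$ and $X_0$ are assumed to stack the 𝐶1 and 𝐶0 training points as rows, $u$ and $v$ are the nonnegative slack vectors, and $\gamma > 0$ multiplies the slack term, so a larger $\gamma$ puts more weight on avoiding errors and a smaller $\gamma$ on a wider margin.

```latex
\begin{aligned}
\underset{\omega,\ \omega_0,\ u,\ v}{\text{minimize}} \quad
  & \lVert\omega\rVert_2 + \gamma \left( \mathbf{1}^T u + \mathbf{1}^T v \right) \\
\text{subject to} \quad
  & \omega_0 \mathbf{1} + X_1 \omega \ge \mathbf{1} - u \\
  & \omega_0 \mathbf{1} + X_0 \omega \le -(\mathbf{1} - v) \\
  & u \ge 0, \quad v \ge 0
\end{aligned}
```

Because of the norm in the objective this is no longer a linear program, but it is still a convex problem.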

Support Vector Machine

• In a more compact form
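
One common compact form uses labels $y_i \in \{+1, -1\}$ (with $+1$ for 𝐶1 and $-1$ for 𝐶0) so that both constraint sets collapse into $y_i(\omega_0 + \omega^T x_i) \ge 1 - \xi_i$, $\xi_i \ge 0$. For a fixed hyperplane the smallest slack is $\xi_i = \max(0,\ 1 - y_i(\omega_0 + \omega^T x_i))$. The sketch below evaluates the resulting objective $\lVert\omega\rVert_2 + \gamma \sum_i \xi_i$; the label encoding, the symbol $\xi$, the data, and the name `svm_objective` are assumptions for illustration, and the code scores a given hyperplane rather than solving for the best one.

```python
import math

def svm_objective(w, w0, X, y, gamma):
    """||w||_2 + gamma * sum of minimal slacks, with labels y_i in {+1, -1}."""
    slack = [max(0.0, 1.0 - yi * (w0 + sum(wj * xj for wj, xj in zip(w, xi))))
             for xi, yi in zip(X, y)]
    return math.sqrt(sum(wj * wj for wj in w)) + gamma * sum(slack)

# Hypothetical data: first two points are C1 (+1), last two are C0 (-1).
X = [(6.0, 6.0), (2.0, 2.0), (1.0, 1.0), (4.0, 4.5)]
y = [1, 1, -1, -1]
value = svm_objective((1.0, 1.0), -8.0, X, y, gamma=1.0)
```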

Classifying Non-linear Separable Data

• Consider the binary classification problem

– each example represented by a single feature 𝑥

– No linear separator exists for this data

• Now map each example as $x \to \{x, x^2\}$

• Data now becomes linearly separable in the new representation

• Linear in the new representation = nonlinear in the old representation
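
A small numerical check of this idea, as a sketch with made-up data: points with large $|x|$ are labeled 1 and points near zero are labeled 0, so no single threshold on $x$ separates them, but a horizontal line in the $(x, x^2)$ plane does. The data, the helper `separable_by_threshold`, and the separating line $x^2 = 2.5$ are all hypothetical.

```python
def separable_by_threshold(xs, labels):
    """True if some threshold on x (in either direction) classifies all points."""
    pts = sorted(xs)
    thresholds = ([pts[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
                  + [pts[-1] + 1.0])
    for t in thresholds:
        for sign in (1.0, -1.0):
            if all((sign * (x - t) > 0) == (lab == 1) for x, lab in zip(xs, labels)):
                return True
    return False

xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1]

before = separable_by_threshold(xs, labels)     # original 1D representation

mapped = [(x, x * x) for x in xs]               # new representation (x, x^2)
w, w0 = (0.0, 1.0), -2.5                        # hypothetical line x^2 = 2.5
after = all((w0 + w[0] * z[0] + w[1] * z[1] > 0) == (lab == 1)
            for z, lab in zip(mapped, labels))
```

In the original coordinate the decision rule $x^2 > 2.5$ is a pair of thresholds, which is the sense in which a linear rule in the new representation is nonlinear in the old one.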

Classifying Non-linear Separable Data

• Let's look at another example

– Each example is defined by two features, $x = \{x_1, x_2\}$

– No linear separator exists for this data

• Now map each example as $x = \{x_1, x_2\} \to z = \{x_1^2, \sqrt{2}\,x_1 x_2, x_2^2\}$

– Each example now has three features (derived from the old representation)

• Data now becomes linearly separable in the new representation
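
To see the effect of this three-feature map, consider two concentric rings, a classic case with no linear separator in 2D. In the mapped space the first and third features sum to the squared radius, so a plane separates the rings. The sketch below uses hypothetical ring data (radii 1 and 3) and a hypothetical separating plane $z_1 + z_3 = 4$.

```python
import math

def phi(x):
    """Map (x1, x2) to (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def ring(radius, k=8):
    """k evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / k),
             radius * math.sin(2 * math.pi * i / k)) for i in range(k)]

inner = ring(1.0)   # hypothetical class C0
outer = ring(3.0)   # hypothetical class C1

# Hypothetical plane in z-space: z1 + z3 - 4 = 0, i.e. x1^2 + x2^2 = 4.
w, w0 = (1.0, 0.0, 1.0), -4.0

def g(z):
    return w0 + sum(wi * zi for wi, zi in zip(w, z))

inner_ok = all(g(phi(x)) < 0 for x in inner)
outer_ok = all(g(phi(x)) > 0 for x in outer)
```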

Kernel

• Often we want to capture nonlinear patterns in the data

– nonlinear regression: input and output relationship may not be linear

– nonlinear classification: classes may not be separable by a linear boundary

• Linear models (e.g. linear regression, linear SVM) are not rich enough to capture such patterns

• Kernels: make linear models work in nonlinear settings

– by mapping data to higher dimensions where it exhibits linear patterns

– apply the linear model in the new input feature space

– mapping = changing the feature representation
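
What makes kernels practical is that the inner product in the mapped space can often be computed from the original features without building the mapped vectors. The sketch below checks this for one specific case, the degree-2 polynomial kernel $k(x, x') = (x^T x')^2$ with the feature map $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. The choice of kernel and the function names are assumptions made for illustration.

```python
import math

def phi(x):
    """Explicit feature map (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel computed in the original 2D space."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(y))   # inner product after mapping to 3D
implicit = poly2_kernel(x, y)    # same quantity without the mapping
```

Both routes give the same number, so a linear method that only needs inner products can work in the higher-dimensional space at the cost of a 2D dot product.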

Nonlinear Classification

https://www.youtube.com/watch?v=3liCbRZPrZA
