Page 1: Machine Learning Queens College Lecture 13: SVM Again.

Machine Learning

Queens College

Lecture 13: SVM Again

Page 2

Today

• Completion of Support Vector Machines

• Project Description and Topics


Page 3

Support Vectors

• Support Vectors are those input points (vectors) closest to the decision boundary.

1. They are vectors.
2. They “support” the decision hyperplane.

Page 4

Support Vectors

• Define this as a decision problem.

• The decision hyperplane: w^T x + b = 0

• No fancy math, just the equation of a hyperplane.
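In standard notation, the hyperplane and the corresponding decision rule are:

```latex
% Decision hyperplane and the corresponding decision rule
w^\top x + b = 0, \qquad f(x) = \operatorname{sign}(w^\top x + b)
```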

Page 5

Support Vectors

• The decision hyperplane: w^T x + b = 0

• Scale invariance

Page 6

Support Vectors

• The decision hyperplane: w^T x + b = 0

• Scale invariance

This scaling does not change the decision hyperplane, or the support vector hyperplanes. But we will eliminate a variable from the optimization.
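In symbols, the invariance and the usual canonical rescaling:

```latex
% Scaling (w, b) by any kappa > 0 leaves the hyperplane unchanged:
\{x : w^\top x + b = 0\} \;=\; \{x : (\kappa w)^\top x + \kappa b = 0\}
% Canonical choice: rescale so the support vectors satisfy
y_i (w^\top x_i + b) = 1
```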

Page 7

What are we optimizing?

• We will represent the size of the margin in terms of w.

• This will allow us to simultaneously:
– Identify a decision boundary
– Maximize the margin
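Under the canonical scaling above, the margin can be written in terms of w alone, which yields the optimization problem:

```latex
% Margin width under the canonical scaling, and the resulting problem
\text{margin} = \frac{2}{\lVert w \rVert},
\qquad
\min_{w,\,b} \; \tfrac{1}{2}\lVert w \rVert^2
\quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 \;\; \forall i
```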

Page 8

Max Margin Loss Function

• For constrained optimization, use Lagrange multipliers.

• Optimize the “Primal” (written out below).
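The primal Lagrangian, in its standard form, attaches one multiplier alpha_i ≥ 0 to each margin constraint:

```latex
% Primal Lagrangian: minimize over (w, b), maximize over alpha_i >= 0
L_P(w, b, \alpha) = \tfrac{1}{2}\lVert w \rVert^2
  - \sum_i \alpha_i \left[ y_i (w^\top x_i + b) - 1 \right]
```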

Page 9

Visualization of Support Vectors


Page 10

Interpretability of SVM parameters

• What else can we tell from the alphas?
– If an alpha is large, then the associated data point is quite important (see the sketch below).
– It’s either an outlier, or incredibly important.

• But this only gives us the best solution for linearly separable data sets…
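As a quick check (a minimal sketch using scikit-learn, which is not part of the lecture), SVC exposes the support vectors and the products alpha_i * y_i directly:

```python
# Fit a linear SVM on toy data and inspect the support vectors
# and their alpha magnitudes (scikit-learn stores alpha_i * y_i in dual_coef_).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)  # toy two-class data
clf = SVC(kernel="linear", C=1.0).fit(X, y)

alphas = np.abs(clf.dual_coef_[0])  # one alpha per support vector
print("support vector indices:", clf.support_)
print("largest alpha:", alphas.max())  # a large alpha marks an outlier or a critical point
```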

Page 11

Basis of Kernel Methods

• The decision process doesn’t depend on the dimensionality of the data.

• We can map the data to a higher-dimensional space.

• Note: data points only appear within a dot product.

• The error is based on the dot product of data points – not the data points themselves.

Page 12

Basis of Kernel Methods

• Data points only appear within a dot product, so we can map to another space through a replacement (see below).

• The error is based on the dot product of data points – not the data points themselves.
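Concretely, the dual objective touches the data only through dot products, so the replacement is a drop-in kernel substitution (standard form):

```latex
% Dual objective: x_i enters only through dot products, so swap in a kernel k
W(\alpha) = \sum_i \alpha_i
  - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j,
\qquad
x_i^\top x_j \;\mapsto\; k(x_i, x_j) = \phi(x_i)^\top \phi(x_j)
```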

Page 13

Learning Theory bases of SVMs

• Theoretical bounds on testing error:
– The upper bound doesn’t depend on the dimensionality of the space.
– The bound is minimized by maximizing the margin, γ, associated with the decision boundary.

Page 14

Why we like SVMs

• They work
– Good generalization

• Easily interpreted
– Decision boundary is based on the data, in the form of the support vectors
– Not so in multilayer perceptron networks

• Principled bounds on testing error from Learning Theory (VC dimension)

Page 15

SVM vs. MLP

• SVMs have many fewer parameters
– SVM: maybe just a kernel parameter
– MLP: number and arrangement of nodes, and the learning rate η

• SVM: convex optimization task
– MLP: likelihood is non-convex – local minima

Page 16

Soft margin classification

• Outliers can fall on the wrong side of the decision boundary, or can force a small margin.

• Solution: introduce a penalty term into the constraint function.
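The standard construction introduces one slack variable ξ_i per point and a cost parameter C:

```latex
% Soft-margin primal: slack xi_i lets point i violate the margin at cost C
\min_{w,\,b,\,\xi} \; \tfrac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0
```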

Page 17

Soft-Margin Dual


Still Quadratic Programming!
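For reference, the soft-margin dual in its standard form; the only change from the hard-margin dual is the box constraint on alpha:

```latex
% Soft-margin dual: still a quadratic program, now with box constraints
\max_{\alpha} \; \sum_i \alpha_i
  - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_i \alpha_i y_i = 0
```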

Page 18

Soft Margin Example

• Points are allowed within the margin, but a cost is introduced.

• Hinge loss
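The hinge loss is the per-point cost behind this picture: zero for points outside the margin, growing linearly for points inside it or misclassified:

```latex
% Hinge loss: zero outside the margin, linear penalty inside or beyond it
\ell(x_i, y_i) = \max\!\big(0,\; 1 - y_i (w^\top x_i + b)\big)
```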

Page 19

Probabilities from SVMs

• Support Vector Machines are discriminant functions
– Discriminant functions: f(x) = c
– Discriminative models: f(x) = argmax_c p(c|x)
– Generative models: f(x) = argmax_c p(x|c)p(c)/p(x)

• No (principled) probabilities from SVMs

• SVMs are not based on probability distribution functions of class instances.
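In practice, libraries bolt probability estimates onto an SVM by calibrating its decision values (Platt scaling). A minimal sketch with scikit-learn (an assumption of this note, not the lecture); note this is post-hoc calibration, not a principled SVM probability:

```python
# Post-hoc "probabilities" via Platt scaling: a sigmoid is fit to the SVM's
# decision values, so these are calibrated scores, not principled probabilities.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
clf = SVC(kernel="linear", probability=True).fit(X, y)  # enables Platt scaling
print(clf.predict_proba(X[:3]))  # rows sum to 1, one column per class
```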

Page 20

Efficiency of SVMs

• Not especially fast.

• Training: O(n^3)
– Quadratic Programming efficiency

• Evaluation: O(n)
– Need to evaluate against each support vector (potentially n)
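The linear evaluation cost is visible in the decision function itself: classifying a new point takes one kernel evaluation per support vector:

```latex
% One kernel evaluation per support vector => O(n) evaluation in the worst case
f(x) = \operatorname{sign}\!\Big( \sum_{i \in \mathrm{SV}} \alpha_i y_i \, k(x_i, x) + b \Big)
```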

Page 21

Research Projects

• Run a machine learning experiment
– Identify a problem/task
– Find appropriate data
– Implement one or more ML algorithms
– Evaluate the performance

• Write a report of the experiment (4 pages including references)
– Abstract: one paragraph describing the experiment
– Introduction: describe the problem/task
– Data: describe the data set, features extracted, cleaning processes
– Method: describe the algorithm/approach
– Results: present and discuss results
– Conclusion: summarize the experiment and results

• Teams of two people are acceptable
– Requires a report from each participant (written independently) describing who was responsible for the components of the work

Page 22

Sample Problems/Tasks

• Vision/Graphics
– Object Classification
– Facial Recognition
– Fingerprint Identification
– Handwriting Recognition
• Non-English languages?

• Language
– Topic classification
– Sentiment analysis
– Speech recognition
– Speaker identification
– Punctuation restoration
– Semantic segmentation
– Recognition of emotion, sarcasm, etc.
– SMS text normalization
– Chat participant ID
– Twitter classification
– Twitter threading

Page 23

Sample Problems/Tasks

• Games
– Chess
– Checkers
– Poker
– Blackjack
– Go

• Recommenders (Collaborative Filtering)
– Netflix
– Courses
– Jokes
– Books
– Facebook

• Video Classification
– Motion classification
– Segmentation

Page 24

ML Topics to explore in the project


Page 25

Data

• UCI Machine Learning Repository
– http://archive.ics.uci.edu/ml/

• Ask Me

• Collect some of your own


Page 26

Next Time

• Kernel Methods
