SVMs in a Nutshell

Jan 31, 2016

Page 1: SVMs in a Nutshell

SVMs in a Nutshell

Page 2: SVMs in a Nutshell

What is an SVM?

• Support Vector Machine
– More accurately called a support vector classifier
– Separates training data into two classes so that they are maximally apart

Page 3: SVMs in a Nutshell

Simpler version

• Suppose the data is linearly separable

• Then we could draw a line between the two classes

Page 4: SVMs in a Nutshell

Simpler version

• But what is the best line?
• In SVM, we’ll use the maximum margin hyperplane
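As an illustration of the idea above, the sketch below fits a maximum-margin linear classifier on two separable clusters. The slides name no library; scikit-learn's SVC and all data values here are assumptions for demonstration only.

```python
# Illustrative sketch (not from the slides): a hard-margin linear SVM
# using scikit-learn's SVC. Data and parameters are made up.
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2-D.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
c = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # very large C approximates a hard margin
clf.fit(X, c)

print(clf.predict([[0.5, 0.5], [3.5, 3.5]]))  # → [-1  1]
print(clf.support_vectors_)  # the training points that lie on the margin
```

Only the support vectors determine the separating line; the other points could move (without crossing the margin) and the fitted hyperplane would not change.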

Page 5: SVMs in a Nutshell

Maximum Margin Hyperplane

Page 6: SVMs in a Nutshell

What if it’s non-linear?

Page 7: SVMs in a Nutshell

Higher dimensions

• SVM uses a kernel function to map the data into a different space where it can be separated
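A minimal sketch of the kernel idea, assuming scikit-learn: the XOR pattern cannot be split by any line, but an RBF kernel separates it. The gamma value is an arbitrary illustrative choice.

```python
# Illustrative sketch (not from the slides): XOR is not linearly
# separable, but an RBF-kernel SVM fits it perfectly.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
c = np.array([-1, -1, 1, 1])  # opposite corners share a class

linear = SVC(kernel="linear").fit(X, c)
rbf = SVC(kernel="rbf", gamma=2.0).fit(X, c)

print(linear.score(X, c))  # a line gets at most 3 of 4 right, so < 1.0
print(rbf.score(X, c))     # → 1.0
```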

Page 8: SVMs in a Nutshell

What if it’s not separable?

• Use linear separation, but allow training errors

• This is called using a “soft margin”
• Higher cost for errors = a more accurate model, but it may not generalize
• Choice of parameters (kernel and cost) determines accuracy of the SVM model
• To avoid over- or under-fitting, use cross-validation to choose parameters
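The parameter-selection step above can be sketched with a cross-validated grid search, assuming scikit-learn; the dataset, grid values, and fold count are all illustrative assumptions, not from the slides.

```python
# Illustrative sketch (not from the slides): choosing C and gamma by
# 5-fold cross-validation over a small grid.
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# A noisy, non-separable toy dataset.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)            # the (C, gamma) pair with best CV accuracy
print(round(search.best_score_, 3))   # mean cross-validated accuracy
```

Because each candidate is scored on held-out folds rather than the training set, the chosen parameters are less likely to over-fit.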

Page 9: SVMs in a Nutshell

Some math

• Data: {(x1, c1), (x2, c2), …, (xn, cn)}

• xi is vector of attributes/features, scaled

• ci is class of vector (-1 or +1)

• Dividing hyperplane: w·x − b = 0
• Linearly separable means there exists a hyperplane such that w·xi − b > 0 for positive examples and w·xi − b < 0 for negative examples

• w points perpendicular to hyperplane

Page 10: SVMs in a Nutshell

More math

• Dividing hyperplane: w·x − b = 0
• Margin hyperplanes, passing through the support vectors: w·x − b = 1 and w·x − b = −1
• Distance between the margin hyperplanes is 2/|w|, so minimize |w|
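The 2/|w| margin width can be checked numerically. A minimal sketch, assuming scikit-learn; the data is contrived so the true margin is known in advance.

```python
# Illustrative sketch (not from the slides): verify the margin width
# 2/|w| on a fitted linear SVM.
import numpy as np
from sklearn.svm import SVC

# Classes sit at x = 0 and x = 2 on the first axis, so the geometric
# margin between them is exactly 2.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
c = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, c)  # large C ≈ hard margin
w = clf.coef_[0]
margin = 2.0 / np.linalg.norm(w)
print(round(margin, 2))  # ≈ 2.0, matching the geometry
```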

Page 11: SVMs in a Nutshell

More math

• For all i, either w·xi − b ≥ 1 or w·xi − b ≤ −1

• Can be rewritten: ci(w·xi − b) ≥ 1

• Minimize (1/2)|w|² subject to ci(w·xi − b) ≥ 1

• This is a quadratic programming problem and can be solved in polynomial time

Page 12: SVMs in a Nutshell

A few more details

• So far, assumed linearly separable
– To get to higher dimensions, use a kernel function instead of the dot product; may be a nonlinear transform
– Radial Basis Function (RBF) is a commonly used kernel: k(x, x′) = exp(−γ‖x − x′‖²) [need to choose γ]

• So far, no errors; soft margin:
– Minimize (1/2)|w|² + C Σi ξi
– Subject to ci(w·xi − b) ≥ 1 − ξi
– C is the error penalty
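The role of the error penalty C can be seen empirically. A minimal sketch, assuming scikit-learn; the overlapping Gaussian data and the two C values are illustrative assumptions.

```python
# Illustrative sketch (not from the slides): effect of the soft-margin
# penalty C on two overlapping (non-separable) classes.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.5, 1.0, (50, 2)),
               rng.normal(1.5, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

soft = SVC(kernel="linear", C=0.01).fit(X, y)   # wide margin, tolerates slack
hard = SVC(kernel="linear", C=100.0).fit(X, y)  # narrow margin, fights errors

# Smaller C permits more margin violations, so more points end up
# inside the margin as support vectors.
print(len(soft.support_), len(hard.support_))
```

Small C trades training errors for a wider, simpler margin; large C fits the training set more tightly at the risk of over-fitting, which is why cross-validation (as on the earlier slide) is used to pick it.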