Support Vector Machines: Optimization of Decision Making Christopher Katinas March 10, 2016

Apr 26, 2018

Page 1: Support Vector Machines: Optimization of the Methodology presentation/2016... · Support Vector Machines: Optimization of Decision Making ... Matlab ‘quadprog’ can solve this!

Support Vector Machines: Optimization of Decision Making

Christopher Katinas March 10, 2016

Page 2

Overview
• Background of Support Vector Machines
• Segregation Functions/Problem Statement
• Methodology
• Training/Testing Results
• Conclusions

Page 3

Support Vector Machines (SVMs)
• Goal: Maximize the margin between two distinct groups via a segregation function
• Certain engineering problems require high-certainty estimates of the equation separating two data sets (e.g., phase diagrams in thermodynamics)
• Distinct phases are separated by functions which may not be described easily in closed form

https://en.wikipedia.org/wiki/Phase_diagram

Can the liquid/vapor line be recreated by using only select points and an SVM to identify the function?

Page 8

Segregation Functions to Match

Parabola: $y(x) = 0.01x^2 + 5$

Line: $y(x) = 0.5x + 25$

Antoine Equation for Vapor Pressure of Water: $y(x) = 10^{\,8.07131 - \frac{1730.63}{x + 233.42}} \cdot \frac{101.325}{760}$

Circle of Radius 23 centered at (54,50): $y(x) = \pm\sqrt{23^2 - (x - 54)^2} + 50$

Circle of Radius 23 centered at (54,50) and Ellipse centered at (84,20): $y(x) = \pm\sqrt{23^2 - (x - 54)^2} + 50$ and $y(x) = \pm\frac{\sqrt{6^2 - (x - 84)^2}}{6} + 20$
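The segregation functions on this slide can be written out as a minimal Python sketch, which may help when generating labeled points. Several things here are assumptions rather than statements from the slides: the Antoine input is taken to be a temperature in deg C with the 101.325/760 factor converting mmHg to kPa, the ellipse semi-axes (6 in x, 1 in y) follow from reading the garbled ellipse formula as sqrt(6^2 - (x - 84)^2)/6 + 20, and the helper name `group_below` and its labeling convention are hypothetical.

```python
import math


def parabola(x):
    return 0.01 * x ** 2 + 5.0


def line(x):
    return 0.5 * x + 25.0


def antoine_kpa(x):
    """Vapor pressure of water; assumes x in deg C and a result in kPa."""
    mmhg = 10.0 ** (8.07131 - 1730.63 / (x + 233.42))
    return mmhg * 101.325 / 760.0


def inside_circle(x, y, cx=54.0, cy=50.0, r=23.0):
    """True if (x, y) lies strictly inside the circle of radius 23 at (54, 50)."""
    return math.hypot(x - cx, y - cy) < r


def inside_ellipse(x, y, cx=84.0, cy=20.0, a=6.0, b=1.0):
    """True if (x, y) lies inside the ellipse; semi-axes a, b are an assumed reading."""
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 < 1.0


def group_below(curve, x, y):
    """Hypothetical labeling: +1 if the point is below the curve, otherwise -1."""
    return 1 if y < curve(x) else -1
```

For the closed curves, the two groups would be "inside" and "outside" rather than "below" and "above".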

Page 9

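Methodology
• Solve Lagrangian Dual Problem

$$\max \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$$

such that $C \ge \alpha_i \ge 0$ and $\sum_{i=1}^{n} y_i \alpha_i = 0$

Equivalently:

$$\min \; -\sum_{i=1}^{n} \alpha_i + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$$

such that $C \ge \alpha_i \ge 0$ and $\sum_{i=1}^{n} y_i \alpha_i = 0$

Matlab ‘quadprog’ can solve this!

Kernel Function

$$K(\mathbf{x}_i, \mathbf{x}_j) = \left(P + A\,\mathbf{x}_i^T \mathbf{x}_j\right)^d$$

Select A to prevent numerical overflow for a given d; P should be large to force the optimizer to solve for the correct weights
- $A^{-1} = \max(\mathbf{x}_i^T \mathbf{x}_j)$ [Normalize inputs]
- $P = (1\mathrm{e}1?)^{1/d}$ [second exponent digit illegible in the transcript]

C was set to 1.0 for all simulations performed in this study (hence the choice of A and P)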
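As a rough illustration of how the minimization form maps onto a generic quadratic program, the sketch below assembles the pieces a solver such as quadprog expects: a matrix H with entries y_i y_j K(x_i, x_j), a linear term f = -1, one equality row for sum(y_i alpha_i) = 0, and the box bounds 0 <= alpha_i <= C. This is plain Python, not the author's Matlab code. The function names are hypothetical, and because the value of P is not legible in the slides, P is left as a caller-supplied parameter.

```python
def poly_kernel(xi, xj, A, P, d):
    """Polynomial kernel K(xi, xj) = (P + A * xi^T xj)^d."""
    dot = sum(a * b for a, b in zip(xi, xj))
    return (P + A * dot) ** d


def build_dual_qp(X, y, d, P, C=1.0):
    """Assemble min 0.5*a'Ha + f'a  subject to  Aeq a = beq, lb <= a <= ub."""
    n = len(X)
    # A^-1 = max(xi^T xj), so A * xi^T xj <= 1 (assumes the maximum is positive).
    max_dot = max(
        sum(a * b for a, b in zip(X[i], X[j])) for i in range(n) for j in range(n)
    )
    A = 1.0 / max_dot
    H = [
        [y[i] * y[j] * poly_kernel(X[i], X[j], A, P, d) for j in range(n)]
        for i in range(n)
    ]
    f = [-1.0] * n                  # linear term: -sum(alpha_i)
    Aeq = [[float(v) for v in y]]   # equality row: sum(y_i * alpha_i) = 0
    beq = [0.0]
    lb = [0.0] * n                  # alpha_i >= 0
    ub = [float(C)] * n             # alpha_i <= C
    return H, f, Aeq, beq, lb, ub, A


def dual_objective(alpha, H, f):
    """Value of the minimization objective at a candidate alpha."""
    n = len(alpha)
    quad = sum(alpha[i] * H[i][j] * alpha[j] for i in range(n) for j in range(n))
    return 0.5 * quad + sum(fi * ai for fi, ai in zip(f, alpha))
```

In Matlab these pieces would correspond to the arguments of `quadprog(H, f, [], [], Aeq, beq, lb, ub)`, with the inequality arguments left empty.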

Page 10

Training Method
• Use Delaunay triangulation to identify the most critical points and query the function close to the boundary
  – RED line segments denote where the segregation function must reside
  – Specify maximum number of refinements
  – Keep only the points which bound the function for faster optimization
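One plausible reading of the refinement step can be sketched as follows: given a triangulation of the current training points, any triangle edge whose endpoints belong to different groups must cross the segregation function, so new query points are proposed along those edges. The slides do not say where on an edge the new point is placed; the midpoint used here is an assumption, and the function names are hypothetical. The triangles are passed in as index triples, so any Delaunay routine could supply them.

```python
def crossing_edges(triangles, labels):
    """Return sorted unique edges (i, j) whose endpoints carry different labels."""
    edges = set()
    for a, b, c in triangles:
        for i, j in ((a, b), (b, c), (a, c)):
            if labels[i] != labels[j]:
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def edge_midpoints(points, edges):
    """Propose one new query point at the midpoint of each crossing edge."""
    return [
        tuple((p + q) / 2.0 for p, q in zip(points[i], points[j]))
        for i, j in edges
    ]


def refine_once(points, labels, triangles, label_fn):
    """One refinement pass: query label_fn at the midpoints and append the results."""
    new_pts = edge_midpoints(points, crossing_edges(triangles, labels))
    new_labels = [label_fn(*pt) for pt in new_pts]
    return points + new_pts, labels + new_labels
```

Repeating `refine_once` after re-triangulating, up to the chosen maximum number of refinements, concentrates the training points near the boundary.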

Page 11

Training/Testing Results

All results are shown for 8 refinements and five random seeding points on each side of the function. An eighth-order polynomial kernel was used. All training points were kept!

Legend: Magenta X = Group 1 Test Points; Blue Area = Testing Group 1; Green X = Group 2 Test Points; Maroon Area = Testing Group 2; Cyan circles = Support Vectors; Yellow Line = Actual Boundary

Legend: Magenta X = Group 1 Test Points; Green X = Group 2 Test Points; Blue Lines = Segregation Function Anti-Gate; Red Lines = Segregation Function Gate; Black circles = Desired New Points

Page 12

Training/Testing Results

Parabolic – 0.68% Error

Line – 0.54% Error

Antoine – 0.94% Error

Circle – 0.13% Error

Circle/Ellipse – 0.60% Error

Page 13

Training/Testing Results

Antoine – 0.94% Error (No Pre-Seeding)

Antoine – 0.26% Error (Pre-Seeded Boundary Points Only)

• Error in the Antoine equation case was due to no test points at the boundaries
  – Created one point at each corner of the domain [Pre-Seeding]
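Pre-seeding as described here amounts to adding the four corners of the domain to the point set before refinement begins. A minimal sketch, assuming a rectangular domain and a caller-supplied labeling function (both the function name `corner_seeds` and the domain bounds in any call are hypothetical, since the slides do not state them):

```python
def corner_seeds(x_min, x_max, y_min, y_max, label_fn):
    """Return [(point, label), ...] for the four corners of a rectangular domain."""
    corners = [
        (x_min, y_min),
        (x_min, y_max),
        (x_max, y_min),
        (x_max, y_max),
    ]
    return [(pt, label_fn(*pt)) for pt in corners]
```

The seeded corners guarantee that the triangulation spans the whole domain, so the boundary near the edges is not missed.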

Page 14

Training/Testing Results with Noise

Antoine – 0.26% Error (Zero Noise)

Antoine – 0.70% Error (5 units of uniform random noise prescribed in each input variable)

• Slack variables are automatically included based on the methodology shown earlier.
• More support vectors than in the no-noise case due to the higher difficulty in fitting the segregation function
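The noise injection can be sketched as below. The slides say only that 5 units of uniform random noise were prescribed in each input variable; reading that as a perturbation drawn uniformly from [-5, +5] per coordinate is an assumption, as are the function name and the fixed seed used for reproducibility.

```python
import random


def add_uniform_noise(points, amplitude=5.0, seed=0):
    """Perturb every coordinate by a draw from U(-amplitude, +amplitude)."""
    rng = random.Random(seed)  # local generator so results are reproducible
    return [
        tuple(v + rng.uniform(-amplitude, amplitude) for v in pt)
        for pt in points
    ]
```

Labels would still come from the noise-free function, so some noisy points end up on the wrong side of the boundary, which is what the box constraint with C = 1.0 (the slack variables) has to absorb.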

Page 15

Conclusions
• SVMs are extremely versatile in allowing for quantifiable decision-making strategies
• Capability of support vector machines was successfully demonstrated via five examples
• Care must be taken in selecting the parameters and training points
  – Poor choice of number of training points can lead to an improper bounding function and ultimately higher error
  – Delaunay triangulation is a new method to acquire more desirable training points over random domain space
  – Modified kernel function constants were based on optimization versatility and general convergence
  – Noise can be included and the SVM is capable of creating a reasonable segregation function