Page 1: How to use SVM for data classification

How to use SVM for data classification

Yiwei Chen, 2016.10

Page 2: How to use SVM for data classification

import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y_train)

Page 3: How to use SVM for data classification

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)
print(novel_X_scaled)
print(clf.predict(novel_X_scaled))

X_test_scaled = scaler.transform(X_test)
print(clf.predict(X_test_scaled))
print(clf.score(X_test_scaled, y_test))

Page 4: How to use SVM for data classification

If you can understand the previous two pages, you can stop reading this slide deck now

Page 5: How to use SVM for data classification

There are many ways to learn

Page 6: How to use SVM for data classification

The goals of learning also differ

not sweet

sweet

Page 7: How to use SVM for data classification

Learn the hidden laws of Mother Nature from experience

Page 8: How to use SVM for data classification

This slide deck focuses on

Supervised classification

Page 9: How to use SVM for data classification

Mother Nature

sweet, not sweet, not sweet, sweet, ??

Page 10: How to use SVM for data classification

[Diagram: train. Fruits labeled sweet / not sweet / not sweet / sweet are used to train a model; the new fruit (??) is still sweet / not sweet?]

Page 11: How to use SVM for data classification

[Diagram: predict. The trained model takes the new fruit (??) and predicts its label: sweet]

Page 12: How to use SVM for data classification

Supervised Classification

● You have training data: some objects/events + their classes

● You train a model; afterwards, when a new object comes in, the model predicts its class

The classes can be two (sweet / not sweet: binary classification) or more (Taiwanese / Japanese / Korean: multi-class classification)

Page 13: How to use SVM for data classification

Support Vector Machine (SVM)

● You have training data: vectors + their classes

● You train a model, which is a function; afterwards, when a new vector comes in, it predicts the vector's class

The classes can be two (sweet / not sweet: binary classification) or more (Taiwanese / Japanese / Korean: multi-class classification)

Page 14: How to use SVM for data classification

[Diagram: train. Labeled vectors such as (1.2, 0, 0, 1, …, 57) → O, (8.7, 1, 0, 0, …, -3) → X, (2.4, 1, 0, 0, …, 22) → O, (0.3, 0, 1, 0, …, 33) → X are used to train a model ƒ: vector → class]

Page 15: How to use SVM for data classification

[Diagram: predict. The trained model ƒ: vector → class maps a new vector such as (1.2, 0, 1, …, 8) to a predicted class (O or X)]

Page 16: How to use SVM for data classification

Feature engineering

● Convert every object into a vector in the same way (see the sketch below)

● Size: 8 cm or 80 mm?
● red/yellow/green: (1,0,0) / (0,1,0) / (0,0,1)
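A minimal sketch of what this can look like in code, using made-up fruit attributes (the names and values below are illustrative, not from the slides):

import numpy as np

# Hypothetical fruit records (size in cm, color); the values are made up.
fruits = [(8.0, "red"), (7.2, "yellow"), (9.5, "green")]

# One-hot encoding for the color attribute.
color_onehot = {"red": (1, 0, 0), "yellow": (0, 1, 0), "green": (0, 0, 1)}

# Use the same unit (cm, never mm) and the same encoding for every fruit,
# so every row has the same meaning: [size_cm, red, yellow, green].
X = np.array([[size, *color_onehot[color]] for size, color in fruits])
print(X)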

Page 17: How to use SVM for data classification

There are many methods for solving supervised classification problems

● SVM
● Decision trees
● Neural networks
● Deep learning
● …

These methods can solve supervised classification problems,

but that does not mean they can only solve supervised classification problems

Page 18: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Page 19: How to use SVM for data classification

[Diagram repeated: labeled vectors train a model ƒ: vector → class, which then predicts the class of a new vector]

Page 20: How to use SVM for data classification

Support Vector Machine ??

Example: two-dimensional vectors, two classes

[Diagram: labeled points plotted on Feature 1 vs Feature 2 → train → Model (function)]

Page 21: How to use SVM for data classification

Support Vector Machine ??

Example: two-dimensional vectors, two classes

[Diagram: predict. The Model assigns a class to each new point marked "?"]

Page 22: How to use SVM for data classification

Maximum Margin

Page 23: How to use SVM for data classification

Properties of SVM

● Distance related
● The wider the separation, the better (maximum margin)

Page 24: How to use SVM for data classification

Characteristics of SVM

● Distance related
● The wider the separation, the better (maximum margin)
● Parameterized

○ The boundary can be curved

○ Misclassification is allowed, but penalized

Page 25: How to use SVM for data classification

Training with different parameters gives different results ...
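A minimal sketch of that effect on the iris data used later in these slides (the specific C / gamma values below are only illustrative):

from sklearn import datasets
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X = MinMaxScaler().fit_transform(dataset.data)
y = dataset.target

# The same data trained with different parameters gives different boundaries
# and different accuracies.
for C, gamma in [(0.1, 0.01), (1.0, 1.0), (1000.0, 100.0)]:
    clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X, y)
    print(C, gamma, clf.score(X, y))  # training accuracy, just to show the spread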

Page 26: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Page 27: How to use SVM for data classification

If you use Python

[Diagram: software stack. scikit-learn (sklearn: SVM, decision trees, …) builds on numpy (arrays, …) and scipy (variance, …), which run on Python]

Page 28: How to use SVM for data classification

Anaconda: all your wishes granted at once

● An open-source scientific platform running on Python

○ Linux / OSX / Windows
● Installs everything you can think of

● Fast. No thinking required.

● https://www.continuum.io/anaconda-overview
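Once Anaconda (or any Python environment with these packages) is installed, a quick sanity check that the pieces are in place:

import numpy, scipy, sklearn

# Print the installed versions of the three libraries used in these slides.
print(numpy.__version__, scipy.__version__, sklearn.__version__)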

Page 29: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Page 30: How to use SVM for data classification

[Diagram repeated: labeled vectors train a model ƒ: vector → class, which then predicts the class of a new vector]

Page 31: How to use SVM for data classification

General workflow

Decide on evaluation criteria + a baseline predictor

train → predict in production

Page 32: How to use SVM for data classification

Evaluation

● Accuracy
○ Training accuracy
○ Testing accuracy

● Precision, recall, Type I / Type II error, AUC, … (see the sketch below)

Before doing any training, decide how you will evaluate the results!
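A minimal sketch of computing some of these metrics with scikit-learn (y_true / y_pred below are placeholder labels, and precision / recall / AUC as written assume a binary problem):

from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

y_true = [0, 1, 1, 0, 1]   # placeholder ground-truth labels
y_pred = [0, 1, 0, 0, 1]   # placeholder predictions from some model

print(accuracy_score(y_true, y_pred))   # fraction predicted correctly
print(precision_score(y_true, y_pred))  # of predicted positives, how many were right
print(recall_score(y_true, y_pred))     # of actual positives, how many were found
print(roc_auc_score(y_true, y_pred))    # AUC (here computed from hard labels)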

Page 33: How to use SVM for data classification

Baseline predictor

● Simple and easy: guess with your eyes closed

● Used for comparison (would you even know if your model did worse than the baseline?)

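One way to get such a blind-guess baseline in scikit-learn (not used in the slides themselves) is DummyClassifier; a minimal sketch on the iris data:

from sklearn import datasets
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

# Always predict the most frequent class seen in training;
# anything you build should beat this score.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print(baseline.score(X_test, y_test))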

Page 34: How to use SVM for data classification

SVM workflow

Decide on evaluation criteria + a baseline predictor

Training: prepare data → scale features → search for the best parameters → train model

Prediction: prepare data → scale features → predict

Page 35: How to use SVM for data classification

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y_train)

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)

Page 36: How to use SVM for data classification

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)
print(novel_X_scaled)
print(clf.predict(novel_X_scaled))

X_test_scaled = scaler.transform(X_test)
print(clf.predict(X_test_scaled))
print(clf.score(X_test_scaled, y_test))

Page 37: How to use SVM for data classification

1. Data preparation

● Transform object → vector
● Whole training data at once

○ X in numpy.array (2-D) or scipy.sparse.csr_matrix
○ y in numpy.array

(1.2, 0, 57) → O
(8.7, 1, 22) → X
(2.4, 1, -3) → O

X = np.array([[2.4, 1, -3], [8.7, 1, 22], [1.2, 0, 57]])

y = np.array([1, 0, 1])
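If most feature values are zero, the same matrix can also be handed over as a scipy.sparse.csr_matrix; a minimal sketch:

import numpy as np
from scipy.sparse import csr_matrix

X_dense = np.array([[2.4, 1, -3], [8.7, 1, 22], [1.2, 0, 57]])
X_sparse = csr_matrix(X_dense)  # SVC also accepts sparse input
y = np.array([1, 0, 1])
print(X_sparse.shape, y.shape)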

Page 38: How to use SVM for data classification

2. Feature Scaling

[Diagram: scale. The first feature ranges 0.3 ~ 10.3 and is mapped by (n − 0.3) × 0.1 into 0 ~ 1; features already in 0 ~ 1 are mapped by (n + 0) × 1. For example (1.2, 0, 0, …) → (0.09, 0, 0, …), (8.7, 1, 0, …) → (0.84, 1, 0, …), (2.4, 1, 0, …) → (0.21, 1, 0, …), (0.3, 0, 1, …) → (0, 0, 1, …)]

Page 39: How to use SVM for data classification

2. Feature Scaling

[Diagram: the same scaling example as on the previous page]

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)
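After fit_transform, the scaler remembers the per-feature mapping it learned (the (n − 0.3) × 0.1 rule above); a minimal sketch of inspecting it on toy data shaped like the diagram:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data: feature 0 spans 0.3 ~ 10.3, feature 1 already spans 0 ~ 1.
X = np.array([[1.2, 0.0], [8.7, 1.0], [0.3, 1.0], [10.3, 0.0]])

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

print(scaler.data_min_)  # per-feature minimum: [0.3, 0.0]
print(scaler.scale_)     # per-feature multiplier: [0.1, 1.0]
print(X_scaled[0])       # about [0.09, 0.0], matching the diagram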

Page 40: How to use SVM for data classification

3. Search for the best parameter

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}

grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)

grid.fit(X_scaled, y_train)

Page 41: How to use SVM for data classification

3. Search for the best (??) C and γ

Page 42: How to use SVM for data classification

3. What is “best”?

[Diagram: fruits labeled sweet / not sweet / not sweet / sweet → train → model; for the new fruit (??) you don't know the answer yet]

Page 43: How to use SVM for data classification

3. Search for the best - validation

[Diagram: hold out some labeled fruits and treat them as new, unseen data; train a model on the rest, then validate it on the held-out fruits]

Page 44: How to use SVM for data classification

3. Search for the best - cross-validation

Cross-validation (CV): each fold validates in turn

[Diagram: the training data is split into folds; each fold in turn serves as the validation set while the remaining folds are used for training]

Given C=12, γ=34, the validation accuracy = 0.56
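A minimal sketch of the cross-validation for a single grid point (C=12 and gamma=34 are just the example numbers from this slide):

from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X = MinMaxScaler().fit_transform(dataset.data)
y = dataset.target

# 5-fold CV for one (C, gamma) pair; GridSearchCV repeats this for every grid point.
scores = cross_val_score(SVC(kernel="rbf", C=12, gamma=34), X, y, cv=5)
print(scores.mean())  # the validation accuracy for this parameter pair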

Page 45: How to use SVM for data classification

3. Search for the best parameter - Grid

[Diagram: a grid of candidate (C, γ) values; each grid point is evaluated with cross-validation]
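After grid.fit(...) (shown on the next page), the score at every grid point can be inspected; a minimal sketch, assuming grid is the fitted GridSearchCV object from these slides:

# Assumes `grid` is the fitted GridSearchCV object.
for params, score in zip(grid.cv_results_["params"],
                         grid.cv_results_["mean_test_score"]):
    print(params, score)                    # validation accuracy per grid point

print(grid.best_params_, grid.best_score_)  # the winning grid point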

Page 46: How to use SVM for data classification

3. Search for the best parameter

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}

grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)

grid.fit(X_scaled, y_train)

Page 47: How to use SVM for data classification

4. Train Model

Use the best parameters found by cross-validation to train the final model

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)
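Since GridSearchCV refits the best parameters on the whole training set by default (refit=True), the retrained model is also available directly, so the manual refit above can be shortened to:

# Equivalent shortcut to refitting SVC with grid.best_params_ by hand.
clf = grid.best_estimator_
print(clf.score(X_scaled, y_train))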

Page 48: How to use SVM for data classification

Predict novel data

● Scaling
● Predict

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)

print(clf.predict(novel_X_scaled))

Page 49: How to use SVM for data classification

Scale Training Data

[Diagram: the training data is scaled as in step 2, e.g. the first feature is mapped by (n − 0.3) × 0.1 into 0 ~ 1, features already in 0 ~ 1 by (n + 0) × 1]

Page 50: How to use SVM for data classification

Scale Testing Data

[Diagram: the testing data is scaled with the same mappings learned from the training data, (n − 0.3) × 0.1 and (n + 0) × 1. For example (2.3, 0, 0, …) → (0.20, 0, 0, …), (-0.7, 1, 1, …) → (-0.1, 1, 1, …), (1.3, 1, 1, …) → (0.10, 1, 1, …), (100, 0, 0, …) → (9.97, 0, 0, …); scaled test values can fall outside 0 ~ 1]
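The key point of this page: the test data must reuse the mapping learned from the training data, i.e. call transform (not fit_transform) with the already-fitted scaler. A minimal sketch of the contrast, continuing from the code in these slides:

# Correct: reuse the min / scale learned on the training data.
X_test_scaled = scaler.transform(X_test)

# Wrong: fitting a fresh scaler on the test data learns a different mapping,
# so train and test vectors would no longer be comparable.
# X_test_scaled = MinMaxScaler().fit_transform(X_test)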

Page 51: How to use SVM for data classification

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y_train)

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)

Page 52: How to use SVM for data classification

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)
print(novel_X_scaled)
print(clf.predict(novel_X_scaled))

X_test_scaled = scaler.transform(X_test)
print(clf.predict(X_test_scaled))
print(clf.score(X_test_scaled, y_test))

Page 53: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Takeaway…

Page 54: How to use SVM for data classification

[Diagram repeated: fruits labeled sweet / not sweet are used to train a model; the new fruit's label (??) is unknown]

Page 55: How to use SVM for data classification

[Diagram repeated: the model predicts the new fruit's label: sweet]

Page 56: How to use SVM for data classification

SVM workflow

Evaluation criteria + Baseline predictor

Training: prepare data → scale features → search best param (CV on grid) → train model

Prediction: prepare data → scale features → predict

Page 57: How to use SVM for data classification

Once you know how to use the microwave correctly...

● Data collection (gathering the ingredients)
● Model evaluation monitoring (are the customers satisfied?)
● Feature engineering (preparing the ingredients)
● Model update from novel data (keeping up with the times)
● Training / prediction at large scale (large quantities of ingredients)
● A robust pipeline that integrates all of these (running a restaurant)

Page 58: How to use SVM for data classification

Happy Training!

Page 59: How to use SVM for data classification

More materials

Page 60: How to use SVM for data classification

“Support” Vectors?

Page 61: How to use SVM for data classification

Maximum Margin

Page 62: How to use SVM for data classification

Why scaling?