Page 1

CS 461: Machine Learning, Lecture 4 (Winter 2009)

Dr. Kiri Wagstaff, wkiri@wkiri.com

Page 2

Plan for Today

- Solution to HW 2
- Support Vector Machines
- Neural Networks
  - Perceptrons
  - Multilayer Perceptrons

Page 3

Review from Lecture 3

- Decision trees: regression trees, pruning, extracting rules
- Evaluation: comparing two classifiers with McNemar's test
- Support Vector Machines: classification
  - Linear discriminants, maximum margin
  - Learning (optimization): gradient descent, QP

Page 4

Neural Networks

Chapter 11

It Is Pitch Dark


Page 5

Perceptron

[Alpaydin 2004 The MIT Press]

Graphical: [Figure: perceptron, with inputs $x_1, \ldots, x_d$ (plus a bias input fixed at 1) feeding a single output $y$]

Math: the perceptron computes a weighted sum of its inputs,

$y = \sum_{j=1}^{d} w_j x_j + w_0 = \mathbf{w}^T \mathbf{x}$

where $\mathbf{w} = [w_0, w_1, \ldots, w_d]^T$ and $\mathbf{x} = [1, x_1, \ldots, x_d]^T$ (the leading 1 feeds the bias weight $w_0$).
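To make the math concrete, here is a minimal NumPy sketch of the perceptron computation; the weights and input below are invented for illustration, not taken from the slides:

```python
import numpy as np

def perceptron_output(w, x):
    """Perceptron output y = w^T x; x carries a leading 1 for the bias w0."""
    return np.dot(w, x)

# Hypothetical weights [w0, w1, w2] and augmented input [1, x1, x2].
w = np.array([-0.5, 1.0, 1.0])
x = np.array([1.0, 0.2, 0.7])
y = perceptron_output(w, x)            # -0.5 + 0.2 + 0.7 = 0.4
print("choose C1" if y > 0 else "choose C2")
```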

Page 6

“Smooth” Output: Sigmoid Function

$y = \mathrm{sigmoid}(\mathbf{w}^T \mathbf{x}) = \dfrac{1}{1 + \exp(-\mathbf{w}^T \mathbf{x})}$

1. Calculate $g(\mathbf{x}) = \mathbf{w}^T \mathbf{x}$ and choose $C_1$ if $g(\mathbf{x}) > 0$, or
2. Calculate $y = \mathrm{sigmoid}(\mathbf{w}^T \mathbf{x})$ and choose $C_1$ if $y > 0.5$

Why?

• Converts output to probability!

• Less “brittle” boundary
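A minimal sketch of the sigmoid version of the same decision (weights and input are again invented):

```python
import numpy as np

def sigmoid(z):
    """Squash w^T x into (0, 1) so it can be read as P(C1 | x)."""
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([-0.5, 1.0, 1.0])    # hypothetical weights, bias first
x = np.array([1.0, 0.2, 0.7])
y = sigmoid(np.dot(w, x))         # sigmoid(0.4) ~ 0.599
print("choose C1" if y > 0.5 else "choose C2")
```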

Page 7

K Outputs

Regression:

$y_i = \sum_{j=1}^{d} w_{ij} x_j + w_{i0} = \mathbf{w}_i^T \mathbf{x}$, i.e., $\mathbf{y} = \mathbf{W}\mathbf{x}$

Classification (softmax):

$o_i = \mathbf{w}_i^T \mathbf{x}$

$y_i = \dfrac{\exp o_i}{\sum_k \exp o_k}$

choose $C_i$ if $y_i = \max_k y_k$

[Alpaydin 2004 The MIT Press]
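A small NumPy sketch of the softmax classification rule; the weight matrix W and the input are made up for illustration:

```python
import numpy as np

def softmax(o):
    """Turn K linear scores o_i = w_i^T x into probabilities that sum to 1."""
    e = np.exp(o - o.max())            # shift by max for numerical stability
    return e / e.sum()

W = np.array([[ 0.1,  1.0, -0.5],     # one row [w_i0, w_i1, w_i2] per class
              [-0.3,  0.2,  0.8],
              [ 0.0, -1.0,  0.4]])
x = np.array([1.0, 0.5, 0.5])         # leading 1 feeds the bias weights
y = softmax(W @ x)                    # probabilities for C1, C2, C3
print("choose C%d" % (np.argmax(y) + 1))
```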

Page 8

Training a Neural Network

1. Randomly initialize weights
2. Update = Learning rate * (Desired - Actual) * Input

$\Delta w_j^t = \eta\,(y^t - \hat{y}^t)\,x_j^t$
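In code, one application of this update might look like the following sketch (eta is the learning rate; all values are placeholders):

```python
import numpy as np

def perceptron_update(w, x, y, y_hat, eta=0.1):
    """Apply delta_w_j = eta * (y - y_hat) * x_j to every weight at once."""
    return w + eta * (y - y_hat) * x

w = np.array([0.0, 0.0, 0.0])   # weights after step 1 (random in practice;
x = np.array([1.0, 1.0, 0.0])   # zeros here so the effect is easy to see)
w = perceptron_update(w, x, y=1.0, y_hat=0.5)
print(w)                        # [0.05 0.05 0.  ]
```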

Page 9

Learning Boolean AND

[Alpaydin 2004 The MIT Press]

$\Delta w_j^t = \eta\,(y^t - \hat{y}^t)\,x_j^t$

Perceptron demo
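As a stand-in for the demo, here is a sketch (my own, not the class demo) that trains a sigmoid perceptron on Boolean AND with the update rule above; the learning rate and epoch count are guesses:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Inputs with a leading 1 for the bias, and AND targets.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
Y = np.array([0.0, 0.0, 0.0, 1.0])

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)      # 1. randomly initialize weights
eta = 0.5

for epoch in range(1000):              # 2. repeatedly apply the update rule
    for x, y in zip(X, Y):
        y_hat = sigmoid(np.dot(w, x))
        w += eta * (y - y_hat) * x

print([int(sigmoid(np.dot(w, x)) > 0.5) for x in X])   # [0, 0, 0, 1]
```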

Page 10

Multilayer Perceptrons = MLP = ANN

[Alpaydin 2004 The MIT Press]

$y_i = \mathbf{v}_i^T \mathbf{z} = \sum_{h=1}^{H} v_{ih} z_h + v_{i0}$

$z_h = \mathrm{sigmoid}(\mathbf{w}_h^T \mathbf{x}) = \dfrac{1}{1 + \exp\left[-\left(\sum_{j=1}^{d} w_{hj} x_j + w_{h0}\right)\right]}$
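A forward-pass sketch of these two equations; the layer sizes and random weights are illustrative only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(W, V, x):
    """z_h = sigmoid(w_h^T x); y_i = v_i^T z, with bias units prepended."""
    z = sigmoid(W @ np.concatenate(([1.0], x)))    # hidden layer, H values
    return V @ np.concatenate(([1.0], z))          # linear output layer

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))   # H=3 hidden units, rows [w_h0, w_h1, w_h2]
V = rng.normal(size=(1, 4))   # K=1 output, row [v_0, v_1, v_2, v_3]
print(mlp_forward(W, V, np.array([0.5, -0.2])))
```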

Page 11

x1 XOR x2 = (x1 AND ~x2) OR (~x1 AND x2)

[Alpaydin 2004 The MIT Press]
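One way to see this with an MLP: hand-pick hidden units for the two AND terms and an output unit for the OR. The weights below are my own choice, not necessarily those in Alpaydin's figure:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xor_mlp(x1, x2):
    z1 = sigmoid(-10 + 20 * x1 - 20 * x2)    # ~ (x1 AND NOT x2)
    z2 = sigmoid(-10 - 20 * x1 + 20 * x2)    # ~ (NOT x1 AND x2)
    return sigmoid(-10 + 20 * z1 + 20 * z2)  # ~ (z1 OR z2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xor_mlp(x1, x2)))   # prints 0, 1, 1, 0
```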

Page 12

Examples

- Digit Recognition
- Ball Balancing

Page 13

ANN vs. SVM

- SVM with sigmoid kernel = 2-layer MLP
- Parameters
  - ANN: # hidden layers, # nodes
  - SVM: kernel, kernel params, C
- Optimization
  - ANN: local minimum (gradient descent)
  - SVM: global minimum (QP)
- Interpretability? About the same... So why SVMs?
  - Sparse solution, geometric interpretation, less likely to overfit the data

Page 14

Summary: Key Points for Today

- Support Vector Machines
- Neural Networks
  - Perceptrons
  - Sigmoid
  - Training by gradient descent
  - Multilayer Perceptrons
- ANN vs. SVM

Page 15

Next Time

Midterm Exam!
- 9:10 – 10:40 a.m.
- Open book, open notes (no computer)
- Covers all material through today

Neural Networks (read Ch. 11.1-11.8)
- Questions to answer from the reading
- Posted on the website (calendar)
- Three volunteers?