Top Banner
1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola, MIT and others
32

1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

Dec 29, 2015

Download

Documents

Irene Young
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

1

CS 512 Machine Learning

Berrin Yanikoglu

Slides are expanded from the

Machine Learning-Mitchell book slidesSome of the extra slides thanks to T. Jaakkola, MIT and others

Page 2: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

2

CS512-Machine Learning Please refer to http://people.sabanciuniv.edu/berrin/cs512/

for course information. This webpage is also linked from SuCourse.

We will use SuCourse mainly for assignments and announcements. You are responsible of checking SuCourse for announcements.

Undergraduates are required to attend the course; graduates are strongly encouraged to.

Page 3: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

3

What to learn in Lecture 1

Course information, expectations

What is learning, different types of learning supervised learning, classification, regression... different applications history of learning

All terminology and concepts raw data, feature extraction, ... feature qualities training, testing, error measures classifiers, decision boundaries/regions

Page 4: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

4

What is learning?

“Learning denotes changes in a system that ... enable a system to do the same task more efficiently the next time.” –Herbert Simon

“Learning is any process by which a system improves performance from experience.” –Herbert Simon

“Learning is constructing or modifying representations of what is being experienced.” –Ryszard Michalski

“Learning is making useful changes in our minds.” –Marvin Minsky

Page 5: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

5

Why learn?

Build software agents that can adapt to their users or to other software agents or to changing environments Personalized news or mail filter Personalized tutoring Mars robot

Develop systems that are too difficult/expensive to construct manually because they require specific detailed skills or knowledge tuned to a specific task Large, complex AI systems cannot be completely derived by hand

and require dynamic updating to incorporate new information.

Discover new things or structure that were previously unknown to humans Examples: data mining, scientific discovery

Page 6: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

6

Page 7: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

7

Page 8: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

8

Related Disciplines

The following are close disciplines: Artificial Intelligence

Machine learning deals with the learning part of AI Pattern Recognition

Concentrates more on “tools” rather than theory Data Mining

More specific about discovery

The following are useful in machine learning techniques or may give insights: Probability and Statistics Information theory

Psychology (developmental, cognitive) Neurobiology Linguistics Philosophy

Page 9: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

12

Major paradigms of machine learning

Rote learning – “Learning by memorization.” Employed by first machine learning systems, in 1950s

Samuel’s Checkers program

Supervised learning – Use specific examples to reach general conclusions or extract general rules

Classification (Concept learning) Regression

Unsupervised learning (Clustering) – Unsupervised identification of natural groups in data

Reinforcement learning– Feedback (positive or negative reward) given at the end of a sequence of steps

Analogy – Determine correspondence between two different representations

Discovery – Unsupervised, specific goal not given

Page 10: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

13

Rote Learning is Limited

Memorize I/O pairs and perform exact matching with new inputs

If a computer has not seen the precise case before, it cannot apply its experience

We want computers to “generalize” from prior experience Generalization is the most important factor in learning

Page 11: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

14

The inductive learning problem

Extrapolate from a given set of examples to make accurate predictions about future examples

Supervised versus unsupervised learning Learn an unknown function f(X) = Y, where X is an input

example and Y is the desired output. Supervised learning implies we are given a training set of

(X, Y) pairs by a “teacher” Unsupervised learning means we are only given the Xs. Semi-supervised learning: mostly unlabelled data

Page 12: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

15

Types of supervised learning

a) Classification: • We are given the label of the training objects: {(x1,x2,y=T/O)}

• We are interested in classifying future objects: (x1’,x2’) with the correct label.

I.e. Find y’ for given (x1’,x2’).

b) Concept Learning:

• We are given positive and negative samples for the concept we want to learn (e.g.Tangerine): {(x1,x2,y=+/-)}

• We are interested in classifying future objects as member of

the class (or positive example for the concept) or not.

I.e. Answer +/- for given (x1’,x2’).

x1=size

x2=

colo

r

Tangerines Oranges

Tangerines Not Tangerines

Page 13: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

16

Types of Supervised Learning

Regression Target function is continuous rather

than class membership For example, you have some the

selling prices of houses as their sizes (sq-mt) changes in a particular location that may look like this. You may hypothesize that the prices are governed by a particular function f(x). Once you have this function that “explains” this relationship, you can guess a given house’s value, given its sq-mt. The learning here is the selection of this function f() .

Note that the problem is more meaningful and challenging if you imagine several input parameters, resulting in a multi-dimensional input space.

60 70 90 120 150 x=size

y=price

f(x)

Page 14: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

17

Classification

Assign object/event to one of a given finite set of categories. Medical diagnosis Credit card applications or transactions Fraud detection in e-commerce Spam filtering in email Recommended books, movies, music Financial investments Spoken words Handwritten letters

Page 15: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

18

Classification System

Input Classifier Output(Features or Feature Vector)

ABCD...Z

x1

x2

x3

.

.xd

Page 16: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

19

Regression System

Input Classifier Output(Features or Feature Vector)

f(x1, x2, x3, ..., xd)

x1

x2

x3

.

.xd

Page 17: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

20

Page 18: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

21

Page 19: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

22

Page 20: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

23

Evaluation of Learning Systems

Experimental Conduct controlled cross-validation experiments to compare

various methods on a variety of benchmark datasets. Gather data on their performance, e.g. test accuracy,

training-time, testing-time… Maybe even analyze differences for statistical significance.

Theoretical Analyze algorithms mathematically and prove theorems about

their: Ability to fit training data Computational complexity Sample complexity (number of training examples needed to learn an

accurate function)

Page 21: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

24

Measuring Performance

Performance of the learner can be measured in one of the following ways, as suitable for the application: Accuracy

Number of mistakes (in classification problems)

Mean Squared Error (in regression problems)

Loss functions (more general, taking into account different costs for different mistakes)

Solution quality (length, efficiency) Speed of performance …

Page 22: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

25

Page 23: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

26

Page 24: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

27

Page 25: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

28

The next slides review the design cycle as a whole processfrom Gutierrez-Osuna, Texas A&M

Page 26: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

29

Page 27: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

30

Page 28: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

31

Page 29: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

32

Page 30: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

33

Page 31: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

34

Page 32: 1 CS 512 Machine Learning Berrin Yanikoglu Slides are expanded from the Machine Learning-Mitchell book slides Some of the extra slides thanks to T. Jaakkola,

35