Transcript
Page 1: Instance Based Learning

Instance Based Learning

Page 2: Instance Based Learning

Nearest Neighbor

• Remember all your data

• When someone asks a question
– Find the nearest old data point
– Return the answer associated with it

• In order to say what point is nearest, we have to define what we mean by "near".

• Typically, we use Euclidean distance between two points.

d(x^{(1)}, x^{(2)}) = \sqrt{(a_1^{(1)} - a_1^{(2)})^2 + (a_2^{(1)} - a_2^{(2)})^2 + \cdots + (a_k^{(1)} - a_k^{(2)})^2}

Nominal attributes: distance is set to 1 if values are different, 0 if they are equal
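A minimal sketch of this rule in Python (the function names and data layout are my own, not from the slides): numeric attributes contribute squared differences, and nominal (string) attributes contribute 1 when the values differ, matching the rule above.

import math

def distance(x, y):
    # Sum squared differences for numeric attributes; nominal
    # attributes add 1 if the values differ, 0 if they are equal.
    total = 0.0
    for a, b in zip(x, y):
        if isinstance(a, str):              # nominal attribute
            total += 0.0 if a == b else 1.0
        else:                               # numeric attribute
            total += float(a - b) ** 2
    return math.sqrt(total)

def nearest_neighbor(train, query):
    # train: list of (attributes, label) pairs.
    # Find the nearest old data point and return its answer.
    attrs, label = min(train, key=lambda point: distance(point[0], query))
    return label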

Page 3: Instance Based Learning

Predicting Bankruptcy

Page 4: Instance Based Learning

Predicting Bankruptcy

• Now, let's say we have a new person with R equal to 0.3 and L equal to 2.

• What y value should we predict?

• The nearest old data point carries the answer "no", and so our answer would be "no".
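With the sketch above, this query is just nearest_neighbor(train, (0.3, 2.0)); the transcript doesn't reproduce the training points from the slide's figure, but the nearest one evidently carries the answer "no".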

Page 5: Instance Based Learning

Scaling

• The naïve Euclidean distance isn't always appropriate.

• Consider the case where we have two features describing a car.

– f1 = weight in pounds

– f2 = number of cylinders.

• Any effect of f2 will be completely lost because of the relative scales.

• So, rescale the inputs to put all of the features on about equal footing:

a_i = \frac{v_i - \min v_i}{\max v_i - \min v_i}
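A sketch of this rescaling in Python (column-wise min-max normalization; the car numbers are invented for illustration):

def rescale(data):
    # data: list of rows of numeric attribute values.
    # Map each attribute to [0, 1] via (v - min) / (max - min).
    columns = list(zip(*data))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in data]

# f1 = weight in pounds, f2 = number of cylinders (made-up values)
cars = [[3200, 4], [4100, 8], [2600, 4]]
print(rescale(cars))   # both features now lie on [0, 1]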

Page 6: Instance Based Learning

Time and Space

• Learning is fast
– We just have to remember the training data.

• Space is proportional to n, the number of training instances.

• What takes longer is answering a query.

• If we do it naively, then for each of the n points in our training set we compute the distance to the query point, which takes about m operations, since there are m features to compare.

• So, overall, this takes about m * n time.

Page 7: Instance Based Learning

Noise

Someone with an apparently healthy financial record goes bankrupt.

Page 8: Instance Based Learning

Remedy: K-Nearest Neighbors

• k-nearest neighbor algorithm:
– Just like the old algorithm, except that when we get a query, we'll search for the k closest points to the query point.

• Output what the majority says.
– In this case, we've chosen k to be 3.
– The three closest points consist of two "no"s and a "yes", so our answer would be "no".

• Find the optimal k using cross-validation (a sketch follows below).
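A sketch of the k-nearest-neighbor variant, reusing the distance function from the earlier sketch, with k = 3 as on the slide. Finding the optimal k by cross-validation then means evaluating this classifier for several values of k on held-out folds and keeping the best.

from collections import Counter

def knn(train, query, k=3):
    # Take the k training points closest to the query...
    neighbors = sorted(train, key=lambda point: distance(point[0], query))[:k]
    # ...and output what the majority of them says.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]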

Page 9: Instance Based Learning

Other Variants

• IB2: save memory, speed up classification
– Works incrementally
– Only incorporates misclassified instances
– Problem: noisy data gets incorporated

• IB3: deal with noise
– Discard instances that don't perform well
– Keep a record of the number of correct and incorrect classification decisions that each exemplar makes.
– Two predetermined thresholds are set on the success ratio.
• If the performance of an exemplar falls below the lower threshold, it is deleted.
• If the performance exceeds the upper threshold, it is used for prediction.

Page 10: Instance Based Learning

Instance-based learning: IB2

• IB2: save memory, speed up classification
– Works incrementally
– Only incorporates misclassified instances
– Problem: noisy data gets incorporated

Data: “Who buys gold jewelry”

(25,60,no) (45,60,no) (50,75,no) (50,100,no)

(50,120,no) (70,110,yes) (85,140,yes) (30,260,yes)

(25,400,yes) (45,350,yes) (50,275,yes) (60,260,yes)

Page 11: Instance Based Learning

Instance-based learning: IB2

• Data:
– (25,60,no)
– (85,140,yes)
– (45,60,no)
– (30,260,yes)
– (50,75,no)
– (50,120,no)
– (70,110,yes)
– (25,400,yes)
– (50,100,no)
– (45,350,yes)
– (50,275,yes)
– (60,260,yes)

This is the final answer: we memorize only 5 of these points. Let's, however, build up the classifier step by step.

Page 12: Instance Based Learning

Instance-based learning: IB2

• Data:

– (25,60,no)

Page 13: Instance Based Learning

Instance-based learning: IB2

• Data:
– (25,60,no) – (85,140,yes)

Since so far the model has only the first instance memorized, this second instance gets wrongly classified. So we memorize it as well.

Page 14: Instance Based Learning

Instance-based learning: IB2

• Data:
– (25,60,no) – (85,140,yes) – (45,60,no)

So far the model has the first two instances memorized. The third instance gets properly classified, since it happens to be closer to the first. So we don't memorize it.

Page 15: Instance Based Learning

Instance-based learning: IB2

• Data:
– (25,60,no) – (85,140,yes) – (45,60,no) – (30,260,yes)

So far the model has the first two instances memorized. The fourth instance gets properly classified, since it happens to be closer to the second. So we don't memorize it.

Page 16: Instance Based Learning

Instance-based learning: IB2

• Data:
– (25,60,no) – (85,140,yes) – (45,60,no) – (30,260,yes) – (50,75,no)

So far the model has the first two instances memorized. The fifth instance gets properly classified, since it happens to be closer to the first. So we don't memorize it.

Page 17: Instance Based Learning

Instance-based learning: IB2

• Data:
– (25,60,no) – (85,140,yes) – (45,60,no) – (30,260,yes) – (50,75,no) – (50,120,no)

So far the model has the first two instances memorized. The sixth instance gets wrongly classified, since it happens to be closer to the second. So we memorize it.

Page 18: Instance Based Learning

Instance-based learning: IB2

• Continuing in a similar way, we finally get the figure on the right.
– The colored points are the ones that get memorized.

This is the final answer, i.e. we memorize only these 5 points.
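The whole IB2 pass fits in a few lines; here is a sketch, reusing nearest_neighbor from the earlier sketch and the "who buys gold jewelry" data from the slides. Depending on tie-breaking (and on whether attributes are rescaled first), the memorized set this code produces may differ slightly from the colored points in the original figure.

def ib2(stream):
    memory = []                                  # exemplars kept so far
    for attrs, label in stream:
        if not memory:
            memory.append((attrs, label))        # the first instance is always kept
        elif nearest_neighbor(memory, attrs) != label:
            memory.append((attrs, label))        # misclassified, so memorize it
    return memory

data = [((25, 60), "no"), ((85, 140), "yes"), ((45, 60), "no"),
        ((30, 260), "yes"), ((50, 75), "no"), ((50, 120), "no"),
        ((70, 110), "yes"), ((25, 400), "yes"), ((50, 100), "no"),
        ((45, 350), "yes"), ((50, 275), "yes"), ((60, 260), "yes")]
print(ib2(data))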

Page 19: Instance Based Learning

Instance-based learning: IB3

• IB3: deal with noise
– Discard instances that don't perform well
– Keep a record of the number of correct and incorrect classification decisions that each exemplar makes.
– Two predetermined thresholds are set on the success ratio.
– An instance is used for classification:
• if its number of incorrect classifications is at most the first (lower) threshold, and
• if its number of correct classifications is at least the second (upper) threshold.

Page 20: Instance Based Learning

Instance-based learning: IB3

• Suppose the lower threshold is 0, and the upper threshold is 1.

• Shuffle the data first:
– (25,60,no)
– (85,140,yes)
– (45,60,no)
– (30,260,yes)
– (50,75,no)
– (50,120,no)
– (70,110,yes)
– (25,400,yes)
– (50,100,no)
– (45,350,yes)
– (50,275,yes)
– (60,260,yes)

Page 21: Instance Based Learning

Instance-based learning: IB3

• Suppose the lower threshold is 0, and the upper threshold is 1.

• Shuffle the data first (records shown as [incorrect, correct]):
– (25,60,no) [1,1]
– (85,140,yes) [1,1]
– (45,60,no) [0,1]
– (30,260,yes) [0,2]
– (50,75,no) [0,1]
– (50,120,no) [0,1]
– (70,110,yes) [0,0]
– (25,400,yes) [0,1]
– (50,100,no) [0,0]
– (45,350,yes) [0,0]
– (50,275,yes) [0,1]
– (60,260,yes) [0,0]

Page 22: Instance Based Learning

Instance-based learning: IB3

• The points that will be used in classification are:
– (45,60,no) [0,1]
– (30,260,yes) [0,2]
– (50,75,no) [0,1]
– (50,120,no) [0,1]
– (25,400,yes) [0,1]
– (50,275,yes) [0,1]
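A sketch of the IB3 bookkeeping shown above: each exemplar carries an [incorrect, correct] record, and only exemplars that clear both thresholds are kept for classification. The transcript doesn't spell out exactly how the slide's records were updated (and the published IB3 uses confidence intervals on the success ratio rather than fixed thresholds), so this simplified version just scores the single nearest stored exemplar on each step.

def ib3(stream, lower=0, upper=1):
    exemplars = []   # each entry: [attrs, label, incorrect, correct]
    for attrs, label in stream:
        if exemplars:
            # score the nearest stored exemplar's prediction for this instance
            e = min(exemplars, key=lambda ex: distance(ex[0], attrs))
            if e[1] == label:
                e[3] += 1          # correct classification decision
            else:
                e[2] += 1          # incorrect classification decision
        exemplars.append([attrs, label, 0, 0])
    # keep exemplars with incorrect <= lower threshold and correct >= upper threshold
    return [(a, l) for a, l, inc, cor in exemplars if inc <= lower and cor >= upper]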

Page 23: Instance Based Learning

Rectangular generalizations

• When a new exemplar is classified correctly, it is generalized by simply merging it with the nearest exemplar.

• The nearest exemplar may be either a single instance or a hyper-rectangle.

Page 24: Instance Based Learning

Rectangular generalizations

• Data:
– (25,60,no)
– (85,140,yes)
– (45,60,no)
– (30,260,yes)
– (50,75,no)
– (50,120,no)
– (70,110,yes)
– (25,400,yes)
– (50,100,no)
– (45,350,yes)
– (50,275,yes)
– (60,260,yes)

Page 25: Instance Based Learning

Classification

[Figure: rectangles for "Class 1" and "Class 2", with a separation line.]

• If the new instance lies within a rectangle, output that rectangle's class.

• If the new instance lies in the overlap of several rectangles, output the class of the rectangle whose center is closest to the new instance.

• If the new instance lies outside all rectangles, output the class of the rectangle closest to the instance (a sketch follows below).

• The distance of a point from a rectangle is:

1. If the instance lies within the rectangle, d = 0.

2. If it lies outside, d = the distance to the closest part of the rectangle, i.e. to some point on the rectangle boundary.
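A sketch of this distance and of the three classification rules (the rectangle representation and names are my own): clamping the query point into the box gives the closest point of the rectangle, so the distance is 0 exactly when the point lies inside.

import math

def rect_distance(point, rect):
    # rect = (mins, maxs), the rectangle's corner vectors.
    mins, maxs = rect
    closest = [min(max(p, lo), hi) for p, lo, hi in zip(point, mins, maxs)]
    return math.dist(point, closest)   # 0 when the point is inside

def classify(point, rects):
    # rects: list of (mins, maxs, label).
    inside = [r for r in rects if rect_distance(point, r[:2]) == 0]
    if len(inside) == 1:
        return inside[0][2]            # within exactly one rectangle
    if inside:
        # overlap: the rectangle whose center is closest wins
        center = lambda r: [(lo + hi) / 2 for lo, hi in zip(r[0], r[1])]
        return min(inside, key=lambda r: math.dist(point, center(r)))[2]
    # outside every rectangle: the closest rectangle wins
    return min(rects, key=lambda r: rect_distance(point, r[:2]))[2]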