Machine Learning using Matlab - Uni Konstanz
Lecture 8: Advice on ML application

Aug 23, 2020

dariahiddleston
Machine Learning using Matlab

Lecture 8 Advice on ML application

Presentation schedule

Time slot        Presentation
10:00 - 10:20    Presentation 1
10:25 - 10:45    Presentation 2
10:50 - 11:10    Presentation 3
11:15 - 11:35    Presentation 4

● 20 minutes for each group (15 minutes talk, and 5 minutes questions)

● Each member should talk for at least 3 minutes

Outline

● Evaluating your machine learning model
● Bias vs. variance
  ○ Feature parameter, e.g., degree of polynomial in linear regression
  ○ Regularization parameter, e.g., C in SVM
  ○ Size of training examples
● Handling skewed/unbalanced classes

Debugging a learning model

Suppose you have implemented regularized linear regression to predict housing prices.
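The cost function shown at this point on the slide is missing from the transcript. As an assumption, a common form of the regularized linear regression cost is given below. The notation (m training examples, n features, hypothesis h_θ, regularization parameter λ) is not confirmed by the source.

```latex
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta\big(x^{(i)}\big) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]
```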

However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

● Get more training examples
● Try smaller sets of features
● Try getting additional features
● Try adding polynomial features
● Try decreasing lambda
● Try increasing lambda

Evaluate your model

To evaluate the performance of your ML model, you should:

● Divide your dataset into a training set (70%) and a test set (30%)
● Learn the hypothesis from the training data
● Predict results on the test set and measure the performance of your model

Example - linear regression

Full dataset:

Size    Price
2104    400
1600    330
2400    369
1416    232
3000    540
1985    300
1534    315
1427    199
1380    212
1494    243

Randomly shuffled and split:

Training set (70%)

Size    Price
2104    400
2400    369
1416    232
3000    540
1534    315
1427    199
1380    212

Test set (30%)

Size    Price
1600    330
1985    300
1494    243
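A minimal sketch of the shuffle-and-split step, using the ten rows from the slide. The variable names and the fixed seed are illustrative assumptions. A random split will generally not reproduce the exact rows shown above.

```matlab
% Dataset from the slide: one row per house, [size, price].
data = [2104 400; 1600 330; 2400 369; 1416 232; 3000 540; ...
        1985 300; 1534 315; 1427 199; 1380 212; 1494 243];

rng(0);                      % fixed seed (assumption) so the split is reproducible
m      = size(data, 1);
idx    = randperm(m);        % random permutation of the row indices
nTrain = round(0.7 * m);     % 70% of the examples go to the training set

trainSet = data(idx(1:nTrain), :);
testSet  = data(idx(nTrain+1:end), :);
```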

Example - linear regression

Training set

Size    Price
2104    400
2400    369
1416    232
3000    540
1534    315
1427    199
1380    212

Minimize the cost function on the training set to obtain the optimal parameters.

Example - linear regression

Test set

Size    Price
1600    330
1985    300
1494    243

Performance measure: mean squared error on the test set.
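The fit-then-evaluate steps can be sketched as follows. This assumes an unregularized straight-line hypothesis fitted by least squares, and a mean squared error normalized by the number of test examples. The slide's own formulas are missing, and some courses use a factor of 1/(2m) instead.

```matlab
% Training and test sets from the slides: [size, price] per row.
trainSet = [2104 400; 2400 369; 1416 232; 3000 540; 1534 315; 1427 199; 1380 212];
testSet  = [1600 330; 1985 300; 1494 243];

% Fit h(x) = theta(1) + theta(2) * x on the training set by least squares.
Xtrain = [ones(size(trainSet, 1), 1), trainSet(:, 1)];
theta  = Xtrain \ trainSet(:, 2);

% Predict on the held-out test set and average the squared errors.
Xtest   = [ones(size(testSet, 1), 1), testSet(:, 1)];
pred    = Xtest * theta;
mseTest = mean((pred - testSet(:, 2)) .^ 2);
```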

Question: how to evaluate the performance of a logistic regression model?
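One common answer, offered here as an assumption rather than the lecture's stated one, is the misclassification (0/1) error on the test set. The parameters and data below are made up for illustration.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

theta = [-3; 1];                  % hypothetical learned parameters
Xtest = [1 1; 1 2; 1 4; 1 5];     % hypothetical test inputs, first column is the intercept
ytest = [0; 0; 1; 0];             % hypothetical true labels

pred      = sigmoid(Xtest * theta) >= 0.5;   % predict 1 when the probability is at least 0.5
testError = mean(pred ~= ytest);             % fraction of misclassified test examples
```

Here one of the four test examples is misclassified, so testError is 0.25.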

Parameter selection

Taking linear regression as an example, you may need to choose the degree of polynomial (d) of the hypothesis.
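The slide's formula is missing from the transcript. Assuming a single input feature x and parameters θ_0, ..., θ_d (notation not confirmed by the source), a degree-d polynomial hypothesis can be written as:

```latex
h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_d x^d
```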

● Suppose you try d from 1 to 10 and find that d = 3 has the lowest mean squared error on the test data, so you claim that d = 3 is the optimal parameter of your model. Is anything wrong?

The parameter d has been fit to the test data, so the test error is no longer a fair estimate: if you apply your model to other data, the performance may decrease. In other words, you do not know how well your model generalizes to new examples.

Parameter selection (cont.)

To select the optimal parameters, there are two options:

● K-fold cross-validation (CV), when you have a small dataset
● Dividing your data into three parts (training, validation, and test), when you have a large dataset

K-fold Cross Validation

● Divide your training set into K parts
● In each iteration, pick (K-1) parts for training and the remaining part for testing, and measure the performance
● Average the performance over those K iterations
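The three steps above can be sketched in base MATLAB, without any toolbox. The data, the straight-line model, and the choice K = 5 are illustrative assumptions.

```matlab
rng(1);
x = linspace(0, 1, 20)';
y = 2 * x + 0.1 * randn(20, 1);         % hypothetical 1-D regression data

K       = 5;
m       = numel(y);
foldId  = mod(randperm(m), K)' + 1;     % assign every example to one of K parts
foldErr = zeros(K, 1);

for k = 1:K
    isVal = (foldId == k);                       % the part held out in this iteration
    Xtr   = [ones(sum(~isVal), 1), x(~isVal)];
    theta = Xtr \ y(~isVal);                     % train on the other K-1 parts
    Xval  = [ones(sum(isVal), 1), x(isVal)];
    foldErr(k) = mean((Xval * theta - y(isVal)) .^ 2);
end

cvErr = mean(foldErr);                  % average performance over the K iterations
```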

Parameter selection with K-fold CV

● Procedure:
  ○ For each parameter value, e.g., each degree of polynomial, compute the average performance using K-fold CV
  ○ Pick the parameter value that gives the best average performance
● Pros and cons:
  ○ Less bias
  ○ Computationally intensive (train K ⨉ d times)
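A sketch of this procedure for the degree of polynomial, using polyfit and polyval. The data and the candidate degrees 1 to 6 are assumptions. With K = 5 and six candidates the loop trains 30 models, which illustrates the computational cost noted above.

```matlab
rng(2);
x = linspace(-1, 1, 30)';
y = x .^ 3 - x + 0.05 * randn(30, 1);   % hypothetical data

K       = 5;
degrees = 1:6;                          % candidate parameter values
foldId  = mod(randperm(numel(y)), K)' + 1;
cvErr   = zeros(size(degrees));

for j = 1:numel(degrees)
    err = zeros(K, 1);
    for k = 1:K
        isVal  = (foldId == k);
        p      = polyfit(x(~isVal), y(~isVal), degrees(j));   % train on K-1 parts
        err(k) = mean((polyval(p, x(isVal)) - y(isVal)) .^ 2);
    end
    cvErr(j) = mean(err);               % average performance for this degree
end

[~, best]  = min(cvErr);
bestDegree = degrees(best);             % degree with the lowest average CV error
```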

Parameter selection with big data

● Procedure:
  ○ Divide your dataset into three parts: training set (60%), validation set (20%), and test set (20%)
  ○ Train your model on the training set with different parameters and measure the performance on the validation set. Choose the optimal parameter, i.e., the one with the best performance on the validation set
  ○ Measure the performance of your model on the test set with the optimal parameter
● Pros and cons:
  ○ Lower computational cost (train d times)
  ○ More bias
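The same selection with a single 60/20/20 split might look like this. The data, candidate degrees, and variable names are assumptions. The test set is touched only once, after the degree has been chosen.

```matlab
rng(3);
m = 200;
x = 2 * rand(m, 1) - 1;
y = x .^ 3 - x + 0.05 * randn(m, 1);    % hypothetical data

idx    = randperm(m);
nTrain = round(0.6 * m);
nVal   = round(0.2 * m);
tr = idx(1:nTrain);                     % training set (60%)
va = idx(nTrain+1 : nTrain+nVal);       % validation set (20%)
te = idx(nTrain+nVal+1 : end);          % test set (20%)

degrees = 1:6;
valErr  = zeros(size(degrees));
for j = 1:numel(degrees)
    p = polyfit(x(tr), y(tr), degrees(j));                 % one training run per candidate
    valErr(j) = mean((polyval(p, x(va)) - y(va)) .^ 2);
end

[~, best] = min(valErr);                % parameter with the best validation performance
p       = polyfit(x(tr), y(tr), degrees(best));
testErr = mean((polyval(p, x(te)) - y(te)) .^ 2);          % reported performance
```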

Example - bias vs. variance on regression

[Figure: three plots of price against size, titled "Underfitting, high bias", "Just right", and "Overfitting, high variance"]

Bias vs. variance on degree of polynomial

Using the mean squared error defined earlier, we have a training error and a validation error.

Q: If we change the degree of polynomial d, what will the training error and validation error look like?
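The formulas for the two errors are missing from the transcript. Assuming a mean squared error normalized by the number of examples in each set (the symbols and subscripts below are assumptions, and the original may include a factor of 1/2), they could be written as:

```latex
\begin{aligned}
J_{\text{train}}(\theta) &= \frac{1}{m_{\text{train}}} \sum_{i=1}^{m_{\text{train}}} \left(h_\theta\big(x_{\text{train}}^{(i)}\big) - y_{\text{train}}^{(i)}\right)^2 \\
J_{\text{val}}(\theta) &= \frac{1}{m_{\text{val}}} \sum_{i=1}^{m_{\text{val}}} \left(h_\theta\big(x_{\text{val}}^{(i)}\big) - y_{\text{val}}^{(i)}\right)^2
\end{aligned}
```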

Diagnosing bias vs. variance on degree of polynomial

Suppose your machine learning model is performing less well than you were hoping. Is it a bias problem or a variance problem?

● Bias (underfitting): both training error and validation error are high
● Variance (overfitting): training error is low, and validation error ≫ training error

[Figure: training error and validation error plotted against degree of polynomial]

Bias vs. variance on regularization

Let’s fix the degree of polynomial at d = 4. What will the hypothesis look like with different values of lambda?

[Figure: three plots of price against size, for small, intermediate, and large lambda]
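To experiment with this, here is a sketch that fits a degree-4 polynomial by regularized least squares for three values of lambda. The data, the lambda values, and the choice not to penalize the intercept are assumptions. The fit minimizes the sum of squared errors plus lambda times the sum of squared non-intercept parameters.

```matlab
rng(4);
x = linspace(-1, 1, 15)';
y = sin(3 * x) + 0.1 * randn(15, 1);    % hypothetical data

d = 4;
X = bsxfun(@power, x, 0:d);             % columns: 1, x, x.^2, x.^3, x.^4
L = eye(d + 1);
L(1, 1) = 0;                            % leave the intercept unpenalized (assumption)

lambdas  = [0, 1, 100];                 % hypothetical small, intermediate, large values
trainErr = zeros(size(lambdas));
for i = 1:numel(lambdas)
    theta = (X' * X + lambdas(i) * L) \ (X' * y);   % regularized normal equation
    trainErr(i) = mean((X * theta - y) .^ 2);
end
```

The training error cannot decrease as lambda grows, because a larger penalty pulls the fit toward a flatter curve (more bias, less variance).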

Diagnosing bias vs. variance on regularization

If we change the value of the regularization parameter lambda, what will the training error and validation error look like?

[Figure: training error and validation error as lambda changes, with the “Just right” value marked]

Q: Now you want to tune both the degree of polynomial d and the regularization parameter lambda. What should you do?

Grid search

● Pick a bunch of values of parameter A
● Pick a bunch of values of parameter B
● For each pair of parameter A and B, evaluate the validation error, either by K-fold CV on the training set or by testing on the validation set
● Pick the pair that gives the minimum value of the validation error
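A sketch of grid search over the degree of polynomial and lambda for regularized linear regression, using a separate validation set. The data, the split, and the candidate values are assumptions.

```matlab
rng(5);
m = 120;
x = 2 * rand(m, 1) - 1;
y = sin(3 * x) + 0.1 * randn(m, 1);     % hypothetical data, already in random order
tr = 1:80;                              % training examples
va = 81:120;                            % validation examples

degrees = [2 4 6];                      % values of parameter A
lambdas = [0.05 2 10];                  % values of parameter B
valErr  = zeros(numel(degrees), numel(lambdas));

for i = 1:numel(degrees)
    d   = degrees(i);
    Xtr = bsxfun(@power, x(tr), 0:d);
    Xva = bsxfun(@power, x(va), 0:d);
    L   = eye(d + 1);
    L(1, 1) = 0;                        % leave the intercept unpenalized (assumption)
    for j = 1:numel(lambdas)
        theta = (Xtr' * Xtr + lambdas(j) * L) \ (Xtr' * y(tr));
        valErr(i, j) = mean((Xva * theta - y(va)) .^ 2);
    end
end

[~, k]         = min(valErr(:));        % position of the smallest validation error
[iBest, jBest] = ind2sub(size(valErr), k);
bestPair       = [degrees(iBest), lambdas(jBest)];
```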

Grid search - regularized linear regression

Validation error for each pair (rows: degree d, columns: lambda)

d \ lambda    0.05    2       10
2             0.22    0.10    0.34
4             0.32    0.05    0.21
6             0.52    0.12    0.43

Optimal parameters: (d, lambda) = (4, 2)
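Picking the optimal pair from the table above takes only a few lines. The variable names are illustrative, and reading the columns as lambda values is an assumption based on the slide title.

```matlab
degrees = [2 4 6];                      % row labels of the table
lambdas = [0.05 2 10];                  % column labels of the table
valErr  = [0.22 0.10 0.34; ...
           0.32 0.05 0.21; ...
           0.52 0.12 0.43];

[minErr, k] = min(valErr(:));           % smallest entry of the table
[r, c]      = ind2sub(size(valErr), k); % its row and column
fprintf('optimal parameters: d = %d, lambda = %g (error %.2f)\n', ...
        degrees(r), lambdas(c), minErr);
```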

Bias vs. variance on size of data

If a learning algorithm is suffering from high bias, what will the training error and validation error look like as the number of training examples increases?

[Figure: training error and validation error plotted against the number of training examples, with a region labelled "high error"]

Increasing the number of training examples will not help much if the model has high bias.

Bias vs. variance on size of training examples

If a learning algorithm is suffering from high variance, what will the training error and validation error look like as the number of training examples increases?

[Figure: training error and validation error plotted against the number of training examples]

Increasing the number of training examples is likely to help if the model has high variance.
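Such learning curves can be computed with a short loop. The data, the fixed validation set, and the training-set sizes below are assumptions. Setting d to a low value illustrates the high-bias case, and a high value illustrates the high-variance case.

```matlab
rng(6);
x  = 2 * rand(200, 1) - 1;
y  = sin(3 * x) + 0.2 * randn(200, 1);  % hypothetical data
xv = x(151:200);                        % fixed validation set
yv = y(151:200);

d     = 1;                              % low degree: high bias; try d = 10 for high variance
sizes = 10:20:150;                      % numbers of training examples to try
trainErr = zeros(size(sizes));
valErr   = zeros(size(sizes));

for i = 1:numel(sizes)
    n = sizes(i);
    p = polyfit(x(1:n), y(1:n), d);     % train on the first n examples only
    trainErr(i) = mean((polyval(p, x(1:n)) - y(1:n)) .^ 2);
    valErr(i)   = mean((polyval(p, xv) - yv) .^ 2);
end
```

Calling plot(sizes, trainErr, sizes, valErr) afterwards draws the two curves.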

Debugging a learning model

Suppose you have implemented regularized linear regression to predict housing prices:

However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

● Get more training examples ➡ fixes high variance
● Try smaller sets of features ➡ fixes high variance
● Try getting additional features ➡ fixes high bias
● Try adding polynomial features ➡ fixes high bias
● Try decreasing lambda ➡ fixes high bias
● Try increasing lambda ➡ fixes high variance

Is your error metric fair?

Suppose you have trained a logistic regression model to predict cancer. In your test set, only 0.5% of patients have cancer (skewed classes). You got 1% error on the test set. Is your model a good classifier?

Positive example (1) - patient has cancer
Negative example (0) - patient does not have cancer

function y = predictCancer(x)
    y = 0;  % ignore the input and always predict "no cancer"
end

With this function you will achieve 0.5% error without learning anything!
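The 0.5% figure can be checked on a made-up test set of 1000 patients, 5 of whom have cancer. The set size and variable names are assumptions.

```matlab
yTrue = [ones(5, 1); zeros(995, 1)];    % hypothetical test set: 0.5% positive examples
yPred = zeros(size(yTrue));             % what predictCancer returns for every patient

errRate   = mean(yPred ~= yTrue);              % fraction of wrong predictions
nDetected = sum(yPred == 1 & yTrue == 1);      % cancer cases actually found
fprintf('error = %.1f%%, cancer cases detected = %d\n', 100 * errRate, nDetected);
```

The error is 0.5%, yet not a single cancer case is detected, which is why the error rate alone is not a fair metric here.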

Precision/Recall

                            Predicted positive     Predicted negative
True condition positive     True Positive (TP)     False Negative (FN)
True condition negative     False Positive (FP)    True Negative (TN)

Precision = TP/(TP+FP)

Recall = TP/(TP+FN)

Precision: of all patients we predicted to have cancer, what fraction actually have cancer?

Recall: of all patients who actually have cancer, what fraction did we correctly detect as having cancer?
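A small worked example of the two formulas above, on ten made-up labels (the data is an assumption):

```matlab
yTrue = [1 1 1 0 0 0 0 0 0 0]';         % hypothetical true conditions
yPred = [1 1 0 1 0 0 0 0 0 0]';         % hypothetical predictions

TP = sum(yPred == 1 & yTrue == 1);      % predicted cancer, has cancer
FP = sum(yPred == 1 & yTrue == 0);      % predicted cancer, no cancer
FN = sum(yPred == 0 & yTrue == 1);      % predicted no cancer, has cancer

precision = TP / (TP + FP);
recall    = TP / (TP + FN);
```

Here TP = 2, FP = 1, and FN = 1, so precision and recall are both 2/3.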

Tradeoff between precision and recall

● Logistic regression:
  ○ Predict 1 if the predicted probability is at least the threshold
  ○ Predict 0 if the predicted probability is below the threshold
● Suppose we want to predict cancer (y = 1) only if very confident
  ○ Higher precision, lower recall (large threshold)
● Suppose we want to avoid missing too many cases of cancer (avoid false negatives)
  ○ Higher recall, lower precision (small threshold)
● Generate the curve by tuning thresholds

[Figure: precision (vertical axis) against recall (horizontal axis), with the large-threshold and small-threshold ends of the curve marked]
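The points of such a curve can be generated by sweeping the threshold. The predicted probabilities, labels, and threshold values below are made up for illustration.

```matlab
prob  = [0.95 0.90 0.80 0.60 0.55 0.40 0.30 0.20 0.10 0.05]';  % hypothetical outputs
yTrue = [1    1    0    1    0    1    0    0    0    0   ]';  % hypothetical labels

thresholds = [0.3 0.5 0.7 0.9];
precision  = zeros(size(thresholds));
recall     = zeros(size(thresholds));

for i = 1:numel(thresholds)
    yPred = prob >= thresholds(i);          % predict 1 at or above the threshold
    TP    = sum(yPred & (yTrue == 1));
    precision(i) = TP / sum(yPred);         % TP / (TP + FP)
    recall(i)    = TP / sum(yTrue == 1);    % TP / (TP + FN)
end
```

On this data, raising the threshold from 0.3 to 0.9 lowers recall from 1 to 0.5 and raises precision from 4/7 to 1.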

F1-measure

Suppose you have the precision and recall of three learning algorithms. Which one is better?

               Precision    Recall
Algorithm 1    0.6          0.3
Algorithm 2    0.2          0.9
Algorithm 3    0.9          0.1

Algorithm 1 has the highest F1-measure.
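The formula itself is missing from the transcript. The standard definition is F1 = 2PR/(P+R), the harmonic mean of precision P and recall R. A quick check of the table above, with illustrative variable names:

```matlab
P = [0.6 0.2 0.9];                      % precision of algorithms 1 to 3
R = [0.3 0.9 0.1];                      % recall of algorithms 1 to 3

F1 = 2 * P .* R ./ (P + R);             % harmonic mean of precision and recall
[~, best] = min(-F1);                   % index of the largest F1
fprintf('F1 = %.3f %.3f %.3f, best: algorithm %d\n', F1, best);
```

This gives F1 values of about 0.400, 0.327, and 0.180, so Algorithm 1 comes out on top.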

Summary

The procedure of a machine learning project:

1. Collect data and divide it into training, validation, and test sets.
2. Choose the machine learning model you would like to use.
3. Select the optimal parameters by means of the training and validation sets.
4. With the optimal parameters, predict results on the test set.
5. Measure and analyze your result, and improve your model if possible.
6. Write your project report.