Design and Analysis of Experiments CS503/CSL603 – Fall 2018 Narayanan C Krishnan [email protected]

Source: cse.iitrpr.ac.in/ckn/courses/f2018/csl603/w7.pdf

May 27, 2020


Design and Analysis of Experiments

CS503/CSL603 – Fall 2018
Narayanan C Krishnan

[email protected]


Outline

• Measures for evaluation
• Experimental design
• Estimating the generalized performance
  • Hypothesis testing
  • Interval estimation
  • Confidence intervals


Confusion Matrix (1)

• 2-Class Scenario

                   Predicted positive       Predicted negative       Total
Actual positive    True positive ($tp$)     False negative ($fn$)    $p$
Actual negative    False positive ($fp$)    True negative ($tn$)     $n$
Total              $p'$                     $n'$                     $N$


Confusion Matrix (2)

• K-Class Scenario


Performance Measures

• Error: $(fp + fn)/N$
• Accuracy: $(tp + tn)/N$
• tp-rate: $tp/p$
• fp-rate: $fp/n$
• Precision: $tp/p'$
• Recall: $tp/p$
• Sensitivity: $tp/p$
• Specificity: $tn/n$
• F measure: $\dfrac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$

                   Predicted positive       Predicted negative       Total
Actual positive    True positive ($tp$)     False negative ($fn$)    $p$
Actual negative    False positive ($fp$)    True negative ($tn$)     $n$
Total              $p'$                     $n'$                     $N$
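The measures above can be computed directly from the four cells of the 2-class confusion matrix. The following is a minimal sketch; the function name `performance_measures` and the example counts are assumptions made for illustration and do not come from the slides.

```python
def performance_measures(tp, fn, fp, tn):
    """Compute the listed measures from 2-class confusion-matrix counts."""
    p = tp + fn          # actual positives (row total p)
    n = fp + tn          # actual negatives (row total n)
    p_pred = tp + fp     # predicted positives (column total p')
    total = p + n        # N
    precision = tp / p_pred
    recall = tp / p      # same quantity as tp-rate and sensitivity
    return {
        "error": (fp + fn) / total,
        "accuracy": (tp + tn) / total,
        "tp_rate": recall,
        "fp_rate": fp / n,
        "precision": precision,
        "recall": recall,
        "specificity": tn / n,
        "f_measure": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts, chosen only for illustration.
measures = performance_measures(tp=40, fn=10, fp=5, tn=45)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that recall, tp-rate, and sensitivity are the same ratio, so the sketch computes it once.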


Receiver Operating Characteristic

• Classification error is an inadequate measure when
  • False positive and false negative errors carry different costs
  • Class distributions are skewed
• ROC: assesses predictive behavior independently of error costs and class distributions
  • Originates from signal detection theory
• Assume a classifier that uses a threshold to determine the class label
  • Classify $x$ as positive if $P(C \mid x) \geq \theta$
  • The number of true and false positives depends on $\theta$


Example (1)

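The last three columns give the predicted class at thresholds 0, 0.5, and 1.

Instance   True class   Score   Threshold 0   Threshold 0.5   Threshold 1
2          1            0.9     1             1               0
5          1            0.9     1             1               0
3          1            0.7     1             1               0
8          1            0.6     1             1               0
1          1            0.5     1             1               0
4          0            0.4     1             0               0
9          0            0.3     1             0               0
6          0            0.2     1             0               0
7          0            0.1     1             0               0

[Figure: ROC plot for this example, both axes from 0 to 1]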


Example (2)

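The last three columns give the predicted class at thresholds 0, 0.5, and 1.

Instance   True class   Score   Threshold 0   Threshold 0.5   Threshold 1
2          1            0.9     1             1               0
5          1            0.9     1             1               0
3          1            0.7     1             1               0
8          1            0.6     1             1               0
1          1            0.2     1             0               0
4          0            0.6     1             1               0
9          0            0.3     1             0               0
6          0            0.2     1             0               0
7          0            0.1     1             0               0

[Figure: ROC plot for this example, both axes from 0 to 1]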
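The two example tables can be turned into ROC points by counting true and false positives at each threshold. The sketch below assumes the rule "predict positive if score >= threshold", which matches the predictions listed in the tables; the helper name `rates_at` is hypothetical.

```python
def rates_at(threshold, labels, scores):
    """Return (fp_rate, tp_rate) when predicting positive if score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    p = sum(labels)            # number of actual positives
    n = len(labels) - p        # number of actual negatives
    return fp / n, tp / p

# Rows in the same order as the example tables (instances 2, 5, 3, 8, 1, 4, 9, 6, 7).
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0]
scores_ex1 = [0.9, 0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
scores_ex2 = [0.9, 0.9, 0.7, 0.6, 0.2, 0.6, 0.3, 0.2, 0.1]

for name, scores in [("Example (1)", scores_ex1), ("Example (2)", scores_ex2)]:
    for theta in (0.0, 0.5, 1.0):
        fpr, tpr = rates_at(theta, labels, scores)
        print(f"{name} threshold={theta}: fp-rate={fpr:.2f}, tp-rate={tpr:.2f}")
```

In Example (1) the scores separate the classes perfectly, so threshold 0.5 gives the ideal point (fp-rate 0, tp-rate 1); in Example (2) the same threshold produces one false positive and one false negative.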


Receiver Operating Characteristic Curve


Domination in ROC Space

• Learner L1 dominates L2 if L1’s ROC curve is always above L2’s curve
• If L1 dominates L2, then L1 is better than L2 for all possible error costs and class distributions
• If neither dominates (L2 and L3), then different classifiers are better under different conditions


Quantitative Measure from ROC Curve

• Area Under the (ROC) Curve


Generating ROC Curve (1)

• Assume classifier outputs $P(C \mid x)$ instead of just $C$ (the predicted class for instance $x$)
• Let $\theta$ be a threshold such that if $P(C \mid x) > \theta$, then $x$ is classified as $C$, else not $C$
• Compute fp-rate and tp-rate for different values of $\theta$ from 0 to 1
• Plot each (fp-rate, tp-rate) and interpolate (or take the convex hull)
• If multiple points have the same fp-rate, then average their tp-rates (k-fold cross-validation)
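The procedure above can be sketched as a threshold sweep followed by a trapezoidal area computation. This is an illustrative sketch, not the course's implementation: it sweeps the threshold over the distinct scores and uses `>=` so that the lowest threshold yields the point (1, 1); the function names `roc_points` and `auc` and the sample data are assumptions.

```python
def roc_points(labels, scores):
    """Sweep the threshold over every distinct score and collect (fp-rate, tp-rate)."""
    p = sum(labels)
    n = len(labels) - p
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for theta in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= theta)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= theta)
        points.append((fp / n, tp / p))
    return points

def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Illustrative data: five positives and four negatives with overlapping scores.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.9, 0.7, 0.6, 0.2, 0.6, 0.3, 0.2, 0.1]
curve = roc_points(labels, scores)
print(curve)
print("AUC:", auc(curve))
```

Linear interpolation between points is one of the two options the slide mentions; taking the convex hull instead would give an area at least as large.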


Generating ROC Curve (2)

• What if the classifier does not provide $P(C \mid x)$, but just $C$?
  • E.g., decision tree, rule
• Generally, even these discrete classifiers maintain statistics for classification
  • E.g., decision tree leaf nodes use the proportion of examples of each class
  • E.g., rules have the number of examples covered by the rule
• These statistics can be compared against a varying threshold ($\theta$)


Other Performance Measures

• Training time and space complexity
• Testing time and space complexity
• Interpretability of the model


Mid-Sem Logistics

• When – Monday, October 1, 2018, 3.00-5.00pm

• Where - TBD

• Cheat sheet – single A4 double sided (hand written)

• Bring calculators

• Syllabus
  • Introduction, decision trees and forests, linear regression, linear classification, artificial neural networks - all material discussed between weeks 1-6 including the linear algebra review

• Study material
  • Lecture slides, class notes, reference material, quizzes and labs, last year’s quiz/exam material (check the website)

• Exam format
  • Numericals, design questions, proofs, derivations, …


Evaluating the Hypothesis (1)

• Can we make any conclusion about the generalization performance of the classifier based on the training set?
• How about the validation set?
  • Could be biased if the validation set is used for
    • Choosing the classifier (over another)
    • Parameter tuning
• Need another test set that is ‘truly’ unseen during training/tuning
  • Options are limited with a small amount of training data


Evaluating the Hypothesis (2)

• Two main difficulties
  • Bias in the estimate: performance of the learned hypothesis on the training set is optimistically biased
  • Variance in the estimate
    • Performance estimated on an unseen test set is unbiased
    • However, the estimate can still vary from the true performance depending on the makeup of the test set
• Interested in the minimum variance unbiased estimate of the generalization performance


Experimental Design (2)

• Train/Test Split
  • Given dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$
  • Perform $K$ random trials, where for each trial
    • Randomly split $D$ into a training set (2/3) and a testing set (1/3)
    • Learn a classifier on the training set
    • Compute the performance (error) on the test set
  • Compute the average performance (error) over the $K$ trials
  • Problem?
    • Train and test sets overlap between trials, which biases the result


Experimental Design (3)

• $K$-Fold Cross Validation
  • Given dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$
  • Partition $D$ into $K$ disjoint subsets $D_1, D_2, \ldots, D_K$
  • For $k = 1:K$ trials
    • Learn the classifier on the training set $D - D_k$
    • Compute the performance on the test set $D_k$
  • Compute the average performance over the $K$ trials
  • A better estimate of generalization performance
    • Test sets do not overlap
  • Stratification
    • Distribution of classes in training and testing sets should be the same as in the original dataset
  • When the size of $D$ is very small
    • Leave-one-out cross validation: $K = N$
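The K-fold procedure with stratification can be sketched as follows. This is a minimal illustration under stated assumptions: folds are built by dealing each class's shuffled examples round-robin, and `majority_error` is a hypothetical stand-in for "learn a classifier on the training folds, then measure its error on the held-out fold".

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Partition indices 0..N-1 into k disjoint folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's examples round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

def cross_validate(labels, k, evaluate):
    """Average evaluate(train_indices, test_indices) over the k folds."""
    folds = stratified_folds(labels, k)
    results = []
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        results.append(evaluate(train, test))
    return sum(results) / k

# Toy labels: 12 examples of class 0 and 8 of class 1.
labels = [0] * 12 + [1] * 8

def majority_error(train, test):
    """Hypothetical learner: always predict the majority class of the training set."""
    train_labels = [labels[i] for i in train]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] != majority for i in test) / len(test)

print(cross_validate(labels, k=4, evaluate=majority_error))
```

Setting `k` equal to the number of examples turns the same loop into leave-one-out cross validation.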


Experimental Design (4)

• 5x2 Cross Validation (Dietterich 1998)
  • For each of 5 trials (shuffling $D$ each time)
    • Divide $D$ into two halves $D_1$ and $D_2$
    • Compute error using $D_1$ as training and $D_2$ as testing
    • Compute error using $D_2$ as training and $D_1$ as testing
  • Compute the average error of all 10 results
  • 5 trials is the best number to minimize overlap among training and testing sets
  • Train and test sets are of similar sizes


Experimental Design (5)

• Bootstrapping
  • Use if there is not enough data for $K$-fold cross validation
  • Generate multiple sets of size $N$ from $D$ by sampling with replacement
  • Each set contains approximately 63% of the distinct examples in $D$
  • Compute the average error over all samples
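The 63% figure follows from a short calculation, sketched here for a dataset of $N$ examples sampled $N$ times with replacement. A single draw misses a given example with probability $1 - 1/N$, so:

```latex
P(\text{example never drawn}) = \left(1 - \frac{1}{N}\right)^{N} \approx e^{-1} \approx 0.368
\quad\Longrightarrow\quad
P(\text{example appears at least once}) \approx 1 - 0.368 = 0.632
```

The approximation improves as $N$ grows; for small $N$ the fraction of distinct examples is slightly higher.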


Interval Estimation (1)

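• Estimate the mean $\mu$ of a normal distribution $\mathcal{N}(\mu, \sigma^2)$
• Given a set $X = \{x_i\}_{i=1}^{N}$
• Estimate
  $$m = \frac{1}{N} \sum_{i=1}^{N} x_i$$
• where $m \sim \mathcal{N}(\mu, \sigma^2/N)$
• Define statistic $Z$ with a unit normal distribution $\mathcal{N}(0, 1)$
  $$\frac{\sqrt{N}\,(m - \mu)}{\sigma} \sim Z$$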


Unit Normal Distribution

• 95% of $Z$ lies in $(-1.96, 1.96)$
• 99% of $Z$ lies in $(-2.58, 2.58)$
• …
• Therefore, $P(-1.96 < Z < 1.96) = 0.95$
  • Two-sided confidence interval


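Interval Estimation (2)

$$P(-1.96 < Z < 1.96) = 0.95$$

$$P\left(-1.96 < \frac{\sqrt{N}\,(m - \mu)}{\sigma} < 1.96\right) = 0.95$$

$$P\left(m - 1.96\,\frac{\sigma}{\sqrt{N}} < \mu < m + 1.96\,\frac{\sigma}{\sqrt{N}}\right) = 0.95$$

$$P\left(m - z_{\alpha/2}\,\frac{\sigma}{\sqrt{N}} < \mu < m + z_{\alpha/2}\,\frac{\sigma}{\sqrt{N}}\right) = 1 - \alpha$$

$z_{\alpha/2}$   $1 - \alpha$
2.58             0.99
2.33             0.98
1.96             0.95
1.64             0.90
1.28             0.80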
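The critical values in the table and the resulting interval can be reproduced with Python's standard library. This is a sketch for the known-variance case only; the function name `z_interval` and the numbers passed to it are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, alpha):
    """Two-sided (1 - alpha) confidence interval for the mean, sigma known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Reproduce the z_{alpha/2} column of the table from the confidence levels.
for confidence in (0.99, 0.98, 0.95, 0.90, 0.80):
    alpha = 1 - confidence
    print(confidence, round(NormalDist().inv_cdf(1 - alpha / 2), 2))

# Hypothetical numbers: m = 3.0, sigma = 0.15, N = 10.
low, high = z_interval(3.0, 0.15, 10, alpha=0.05)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

Passing `1 - alpha` instead of `1 - alpha / 2` to `inv_cdf` gives the one-sided critical value $z_\alpha$.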


Two-Sided Vs One-Sided Confidence Interval

$z_{\alpha}$     2.33   1.64   1.28
$1 - \alpha$     0.99   0.95   0.90

$$P\left(m - 1.64\,\frac{\sigma}{\sqrt{N}} < \mu\right) = 0.95$$

$$P\left(m - z_{\alpha}\,\frac{\sigma}{\sqrt{N}} < \mu\right) = 1 - \alpha$$


Interval Estimation (3)

• Previous analysis required us to know $\sigma^2$
  • But typically this is unknown
• Instead, we can use the sample variance $S^2$
  $$S^2 = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - m)^2$$
• When $x_i \sim \mathcal{N}(\mu, \sigma^2)$, then $(N - 1) S^2 / \sigma^2$ is chi-squared with $N - 1$ degrees of freedom
• $\sqrt{N}\,(m - \mu)/S$ is t-distributed with $N - 1$ degrees of freedom


Student’s t-distribution

• Similar to the normal distribution, but with larger spread (heavier tails)
• It includes the additional uncertainty from using the sample variance
• As $df \to \infty$, it becomes a normal distribution


Interval Estimation (4)

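• When the population variance $\sigma^2$ is unknown, we can use the Student’s t distribution to obtain the interval
  $$S^2 = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - m)^2, \qquad \frac{\sqrt{N}\,(m - \mu)}{S} \sim t_{N-1}$$
• So a two-sided confidence interval estimate would be of the form
  $$P\left(m - t_{\alpha/2,\,N-1}\,\frac{S}{\sqrt{N}} < \mu < m + t_{\alpha/2,\,N-1}\,\frac{S}{\sqrt{N}}\right) = 1 - \alpha$$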


Interval Estimation (5)

• $m = 3$
• $S^2 = 0.022$, $S = 0.149$
• $\alpha = 0.05$, $df = N - 1 = 9$
• $t_{0.025,\,9} = 2.262$
• $P(3 - 0.107 < \mu < 3 + 0.107) = 0.95$
• $P(2.893 < \mu < 3.107) = 0.95$

$i$    $x_i$
1      3.0
2      3.1
3      3.2
4      2.8
5      2.9
6      3.1
7      3.2
8      2.8
9      2.9
10     3.0
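The worked example can be checked numerically. The sketch below computes $m$ and $S$ from the ten observations with the standard library; the critical value `t_crit = 2.262` is an assumption taken from a standard t-table for $t_{0.025,\,9}$ rather than computed, since the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# The ten observations from the data table.
x = [3.0, 3.1, 3.2, 2.8, 2.9, 3.1, 3.2, 2.8, 2.9, 3.0]
n = len(x)
m = mean(x)
s = stdev(x)              # sample standard deviation (divides by N - 1)
t_crit = 2.262            # t_{0.025, 9} from a standard t-table (assumed, not computed)
half_width = t_crit * s / sqrt(n)

print(f"m = {m:.3f}, S^2 = {s**2:.3f}, S = {s:.3f}")
print(f"95% interval: ({m - half_width:.3f}, {m + half_width:.3f})")
```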


Hypothesis Testing (1)

• Want to claim a hypothesis $H_1$
  • E.g., $H_1$: $\mathrm{error}_D(h) < 0.10$
• Define the opposite of $H_1$ to be the null hypothesis $H_0$
  • E.g., $H_0$: $\mathrm{error}_D(h) \geq 0.10$
• Perform an experiment collecting data about $\mathrm{error}_D(h)$
• With what probability can we reject $H_0$?


Hypothesis Testing (2)

• Sample $X = \{x_i\}_{i=1}^{N} \sim \mathcal{N}(\mu, \sigma^2)$
• Estimate the sample mean $m = \frac{1}{N} \sum_{i=1}^{N} x_i$
• Want to test whether $\mu$ differs from some constant $\mu_0$
  • Null hypothesis $H_0: \mu = \mu_0$
  • Alternative hypothesis $H_1: \mu \neq \mu_0$
  • Reject $H_0$ if $m$ is too far from $\mu_0$
• We fail to reject $H_0$ with level of significance $\alpha$ if $\mu_0$ lies in the $1 - \alpha$ confidence interval:
  $$\frac{\sqrt{N}\,(m - \mu_0)}{\sigma} \in (-z_{\alpha/2},\, z_{\alpha/2})$$
• We reject $H_0$ if the statistic falls outside this interval on either side (two-sided test)


Hypothesis Testing (3)

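• Sample $X = \{x_i\}_{i=1}^{N} \sim \mathcal{N}(\mu, \sigma^2)$
• Estimate the sample mean $m = \frac{1}{N} \sum_{i=1}^{N} x_i$
  • Null hypothesis $H_0: \mu \leq \mu_0$
  • Alternative hypothesis $H_1: \mu > \mu_0$
  • Reject $H_0$ if $m$ is too far above $\mu_0$
• We fail to reject $H_0$ with level of significance $\alpha$ if $\mu_0$ lies in the $1 - \alpha$ confidence interval:
  $$\frac{\sqrt{N}\,(m - \mu_0)}{\sigma} \in (-\infty,\, z_{\alpha})$$
• We reject $H_0$ if the statistic falls outside this interval (one-sided test)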


Hypothesis Testing (4)

• Sample $X = \{x_i\}_{i=1}^{N} \sim \mathcal{N}(\mu, \sigma^2)$
• Estimate the sample mean $m = \frac{1}{N} \sum_{i=1}^{N} x_i$
• Want to test $\mu$ against some constant $\mu_0$
  • The variance $\sigma^2$ is unknown, so use the sample variance $S^2$
  • Null hypothesis $H_0: \mu \leq \mu_0$
  • Alternative hypothesis $H_1: \mu > \mu_0$
  • Reject $H_0$ if $m$ is too far above $\mu_0$
• We fail to reject $H_0$ with level of significance $\alpha$ if $\mu_0$ lies in the $1 - \alpha$ confidence interval:
  $$\frac{\sqrt{N}\,(m - \mu_0)}{S} \in (-\infty,\, t_{\alpha,\,N-1})$$
• We reject $H_0$ if the statistic falls outside this interval (one-sided test)


Hypothesis Testing (5)

• $m = 3$, $S^2 = 0.022$, $S = 0.149$
• $\mu_0 = 2.9$
• $H_1: \mu > 2.9$, $H_0: \mu \leq 2.9$
• $\alpha = 0.05$, $df = N - 1 = 9$
• $t_{0.05,\,9} = 1.833$
• $\dfrac{\sqrt{N}\,(m - \mu_0)}{S} = 2.121 \notin (-\infty, 1.833)$
• Therefore, reject the null hypothesis
• Accept the alternate hypothesis

$i$    $x_i$
1      3.0
2      3.1
3      3.2
4      2.8
5      2.9
6      3.1
7      3.2
8      2.8
9      2.9
10     3.0
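The test statistic in this example can be recomputed from the data table. In the sketch below, `t_statistic` is a hypothetical helper name and `t_crit` is the critical value $t_{0.05,\,9} = 1.833$ quoted on the slide.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """sqrt(N) * (m - mu0) / S for a one-sample t-test."""
    return sqrt(len(sample)) * (mean(sample) - mu0) / stdev(sample)

# The ten observations from the data table.
x = [3.0, 3.1, 3.2, 2.8, 2.9, 3.1, 3.2, 2.8, 2.9, 3.0]
t_value = t_statistic(x, mu0=2.9)
t_crit = 1.833            # t_{0.05, 9}, the value quoted on the slide
reject_h0 = t_value > t_crit   # one-sided test of H0: mu <= 2.9

print(f"t = {t_value:.3f}, reject H0: {reject_h0}")
```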


Estimating Classifier Error

• Learn classifier on the training set
• Test classifier on the test set $S$ of size $N$
• Assume probability $p$ of error by the classifier
• $X$ = number of errors made by the classifier on $S$
• $X$ is described by a binomial distribution
  $$P(X = j) = \binom{N}{j} p^{j} (1 - p)^{N - j}$$


Binomial Distribution

• Coin toss experiment
  • Probability of a head: $p$
  • The probability of observing $j$ heads in $N$ coin tosses is
    $$P(j) = \binom{N}{j} p^{j} (1 - p)^{N - j}$$
  • Mean: $Np$
  • Standard deviation: $\sqrt{Np(1 - p)}$


Coin Toss Vs Classification

• Coin toss
  • Toss results in a head
  • Probability of a head: $p$
  • $j$ heads observed in $N$ coin tosses
  • Probability of $j$ heads in $N$ coin tosses: $P(j)$
  • Estimating $p$

• Classification of an instance
  • Classifier misclassifies an instance
  • Probability of misclassification: $p$
  • $j$ misclassified instances in $N$ samples of $S$
  • Probability of $j$ misclassified instances in $S$: $P(X = j)$
  • Estimating $p$


Binomial Test

• Test whether the error probability $p$ is less than or equal to some value $p_0$
  • Null hypothesis $H_0: p \leq p_0$
  • Alternative hypothesis $H_1: p > p_0$
• Reject $H_0$ with significance $\alpha$ if
  $$P(X \geq e) = \sum_{x=e}^{N} \binom{N}{x} p_0^{x} (1 - p_0)^{N - x} < \alpha$$
• where $e$ is the number of errors observed on the test set
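The tail probability in the binomial test can be evaluated exactly with `math.comb`. The sketch below assumes that $e$ is the observed number of test-set errors; the function name `binomial_tail` and the numbers (12 errors on 40 examples, $p_0 = 0.2$) are chosen for illustration.

```python
from math import comb

def binomial_tail(e, n, p0):
    """P(X >= e) for X ~ Binomial(n, p0)."""
    return sum(comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(e, n + 1))

# Illustrative outcome: 12 errors on 40 test examples, testing H0: p <= 0.2.
p_value = binomial_tail(e=12, n=40, p0=0.2)
alpha = 0.05
print(f"P(X >= 12) = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The exact sum costs at most $N$ terms, so it is usually affordable; the normal approximation is mainly a convenience for hand calculation.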


Approximate Normal Test

• Approximating $X$ with a normal distribution
  • $X$ is the sum of $N$ independent random variables from the same distribution
  • $X/N$ is approximately normal for large $N$ with mean $p_0$ and variance $p_0(1 - p_0)/N$ (central limit theorem)
    $$\frac{\sqrt{N}\,(X/N - p_0)}{\sqrt{p_0 (1 - p_0)}} \sim Z$$
• Fail to reject $H_0: p \leq p_0$ with significance $\alpha$ if
  $$\frac{\sqrt{N}\,(X/N - p_0)}{\sqrt{p_0 (1 - p_0)}} \in (-\infty,\, z_{\alpha})$$
  • Reject $H_0$ if outside
• Works well when $N$ is not too small and $p$ is not very close to 0 or 1


Example (1)

• Let $N = 40$, $X = 12$, $\hat{p} = 0.3$
• Set $p_0 = 0.2$, $\alpha = 0.05$
• Alternate hypothesis $H_1: p > p_0$
• Null hypothesis $H_0: p \leq p_0$
• Compute
  $$\frac{\sqrt{N}\,(X/N - p_0)}{\sqrt{p_0 (1 - p_0)}} = 1.58, \qquad z_{0.05} = 1.64$$
• $1.58 \in (-\infty, 1.64)$
• Therefore fail to reject $H_0$
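The numbers in this example can be verified with a few lines of standard-library Python. The helper name `approx_normal_statistic` is hypothetical; the inputs are the values given on the slide.

```python
from math import sqrt
from statistics import NormalDist

def approx_normal_statistic(errors, n, p0):
    """sqrt(N) * (X/N - p0) / sqrt(p0 * (1 - p0))."""
    return sqrt(n) * (errors / n - p0) / sqrt(p0 * (1 - p0))

z_value = approx_normal_statistic(errors=12, n=40, p0=0.2)
z_crit = NormalDist().inv_cdf(1 - 0.05)   # one-sided critical value z_{0.05}
print(f"statistic = {z_value:.2f}, z_0.05 = {z_crit:.2f}, "
      f"reject H0: {z_value > z_crit}")
```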


t-Test

• So far we have looked at a single validation set
• Suppose we do a $K$-fold cross validation
• $K$ error percentages $p_i$, $1 \leq i \leq K$
  $$m = \frac{1}{K} \sum_{i=1}^{K} p_i, \qquad S^2 = \frac{1}{K - 1} \sum_{i=1}^{K} (p_i - m)^2$$
• Hence
  $$\frac{\sqrt{K}\,(m - p_0)}{S} \sim t_{K-1}$$
• Reject the null hypothesis with significance $\alpha$ if this value is greater than $t_{\alpha,\,K-1}$


Comparing two learners

• K-fold cross-validated paired t test
  • Paired test: both learners get the same train/test sets
  • Use $K$-fold cross validation to get the $K$ training/test sets
  • Errors of learners 1 and 2 on fold $i$: $p_i^1$, $p_i^2$
  • Paired difference on fold $i$: $p_i = p_i^1 - p_i^2$
  • Null hypothesis: $p_i$ has mean 0
    $$m = \frac{1}{K} \sum_{i=1}^{K} p_i, \qquad S^2 = \frac{1}{K - 1} \sum_{i=1}^{K} (p_i - m)^2$$
• Hence
  $$\frac{\sqrt{K}\,(m - 0)}{S} \sim t_{K-1}$$
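A minimal sketch of the paired test follows. The per-fold error rates are hypothetical, `paired_t_statistic` is an illustrative helper name, and the two-sided decision rule $|t| > t_{\alpha/2,\,K-1}$ with `t_crit = 2.262` (a standard t-table value for 9 degrees of freedom at $\alpha = 0.05$) is an assumption, since the slide stops at the distribution of the statistic.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(errors_1, errors_2):
    """sqrt(K) * m / S computed on the per-fold differences p_i = p_i^1 - p_i^2."""
    diffs = [a - b for a, b in zip(errors_1, errors_2)]
    return sqrt(len(diffs)) * mean(diffs) / stdev(diffs)

# Hypothetical per-fold error rates for two learners over K = 10 shared folds.
errors_1 = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 0.12, 0.15, 0.14, 0.13]
errors_2 = [0.10, 0.14, 0.12, 0.11, 0.12, 0.13, 0.11, 0.12, 0.13, 0.10]

t_value = paired_t_statistic(errors_1, errors_2)
t_crit = 2.262   # t_{0.025, 9} from a standard t-table (assumed, not computed)
print(f"t = {t_value:.3f}, reject H0 (equal mean error): {abs(t_value) > t_crit}")
```

Pairing matters because both learners see the same folds: the fold-to-fold variation common to both cancels in the differences, which typically makes the test more sensitive than comparing two independent samples.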


Summary

• Measures for evaluation
• Experimental design
• Estimating the generalized performance
  • Hypothesis testing
  • Interval estimation
  • Confidence intervals
