Statistical Comparison of Two or More Systems The most relevant of all the Basic Theory Lectures. No Holidays.

Post on 01-Apr-2015

THE MISSION

Your analysis task involves manipulating conditions of the system of interest from a prescribed set of options.

Design of Experiments: Determine if the different options are really different. Is the best one really statistically better?

Ranking and Selection: What’s the probability that the best sample indicates the best system setting?

VOCABULARY

Factor: An element of the system that will be manipulated

Setting or Level: A value that a Factor may assume

EXAMPLE : Simulation model of Football (EA Sports)

Factors
  Quarterback
  Running Back
  Strong Safety

Settings or Levels for Quarterback
  Dante’
  Bret
  Johnny U.

TYPES OF DESIGNS

One Factor, Two Settings
  Paired samples
  Behrens-Fisher
  Question: Which is Best?

More than one Factor
  Factorial Designs
  Partially Exhaustive Designs
  Question: Are the settings significant difference-makers?

PAIRED SAMPLES

Example: Quarterback Controversy!

Simulate St. Louis Rams vs. Tampa Bay Bucs, recording the Quarterback Rating
  Level 1: Kurt Warner
  Level 2: Marc Bulger

Run the simulation 28 times for each player, resulting in data set
  $W_1, W_2, \dots, W_{28}$
  $B_1, B_2, \dots, B_{28}$

Is E[B] > E[W]?

BRUTE FORCE

Confidence interval on the quantity E[W]-E[B]

If it doesn’t include 0.0, we have statistically significant evidence (at the chosen confidence level) that there is a difference

Equivalent to the Hypothesis Test H0: E[B]=E[W]

CALCULATIONS ON VARIANCES: SOME BASICS

Let X and Y be random variables

$$\mathrm{VAR}[X] = E\left[(X - E[X])^2\right] = \int (x - E[X])^2 \, dF_X(x)$$

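CALCULATIONS ON VARIANCES: SOME BASICS

Let X and Y be random variables

1) $\mathrm{VAR}[X] = E[X^2] - (E[X])^2$
2) $\mathrm{VAR}[X+Y] = \mathrm{VAR}[X] + \mathrm{VAR}[Y] + 2\,\mathrm{COV}[X,Y]$
3) $\mathrm{COV}[X,Y] = E[XY] - E[X]\,E[Y]$
4) $\mathrm{VAR}[cX] = c^2\,\mathrm{VAR}[X]$
5) $\mathrm{VAR}[X-Y] = \mathrm{VAR}[X] + \mathrm{VAR}[Y] - 2\,\mathrm{COV}[X,Y]$

COV[X,Y] = 0 if X and Y are independent.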
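A quick numerical check can make the sign of the covariance term in VAR[X-Y] concrete. The sketch below uses made-up data (Y is deliberately built to be correlated with X) and hypothetical helper names `mean`, `var`, and `cov`; it is an illustration only, not part of the lecture spreadsheet.

```python
import random


def mean(values):
    return sum(values) / len(values)


def var(values):
    """Population variance: average squared deviation from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def cov(u, v):
    """Population covariance computed as E[XY] - E[X]E[Y]."""
    return mean([a * b for a, b in zip(u, v)]) - mean(u) * mean(v)


rng = random.Random(1)  # fixed seed so the run is repeatable
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]  # assumed dependence, for illustration

diff = [a - b for a, b in zip(x, y)]
lhs = var(diff)                          # VAR[X - Y] computed directly
rhs = var(x) + var(y) - 2 * cov(x, y)    # the same quantity via the identity
print(lhs, rhs)
```

Because the covariance here is positive, the variance of the difference comes out smaller than VAR[X] + VAR[Y].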

SAMPLE MEAN

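$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

For independent, identically distributed $X_i$:

$$\mathrm{VAR}[\bar{X}] = \mathrm{VAR}\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{VAR}[X_i] = \frac{1}{n^2}\, n\,\mathrm{VAR}[X] = \frac{\mathrm{VAR}[X]}{n}$$

$$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$$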

CONFIDENCE INTERVAL

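$\alpha/2$ probability of Type I error on each end of the confidence interval

basic interval for X-bar is

$$\bar{X} \pm Z_{\alpha/2}\sqrt{\mathrm{VAR}[\bar{X}]} = \bar{X} \pm Z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} = \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$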
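The basic interval can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that are not in the slides: the unknown sigma is replaced by the sample standard deviation, and the function name `mean_confidence_interval` and the sample ratings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, alpha=0.05):
    """Normal-theory interval: X-bar +/- Z_{alpha/2} * s / sqrt(n)."""
    n = len(samples)
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 quantile of N(0,1)
    half_width = z * stdev(samples) / sqrt(n)    # s stands in for the unknown sigma
    centre = mean(samples)
    return centre - half_width, centre + half_width


ratings = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up data
low, high = mean_confidence_interval(ratings)
print(low, high)
```

With only a handful of samples, a t quantile is often used in place of Z; the normal quantile is kept here to match the slide.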

BASIC CONFIDENCE INTERVAL

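$$(\bar{W} - \bar{B}) \pm Z_{\alpha/2}\sqrt{\mathrm{VAR}[\bar{W} - \bar{B}]}$$

$$\mathrm{VAR}[\bar{W} - \bar{B}] = \mathrm{VAR}[\bar{W}] + \mathrm{VAR}[\bar{B}] - 2\,\mathrm{COV}[\bar{W},\bar{B}] = \frac{\mathrm{VAR}[W] + \mathrm{VAR}[B]}{28} - 0$$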

SPREADSHEET HIGHLIGHTS 1

U is a Uniform(0,1) draw, such as RAND()

(U-0.5)*SQRT(12)
  zero mean, unit stddev

m + (U-0.5)*SQRT(12)*s
  mean m, stddev s
  uniform over an interval centered at m with half-width s*SQRT(12)/2 (total width s*SQRT(12))
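The spreadsheet formula translates directly to Python, which makes it easy to check the claimed mean, standard deviation, and range empirically. The function name `scaled_uniform` and the values m = 80, s = 10 are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def scaled_uniform(u, m=0.0, s=1.0):
    """Map a Uniform(0,1) draw u to a uniform with mean m and stddev s."""
    return m + (u - 0.5) * sqrt(12) * s


rng = random.Random(42)  # fixed seed so the run is repeatable
draws = [scaled_uniform(rng.random(), m=80.0, s=10.0) for _ in range(100_000)]

half_width = 10.0 * sqrt(12) / 2  # every draw lies within m +/- this amount
print(mean(draws), stdev(draws), min(draws), max(draws))
```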

COMMON RANDOM NUMBERS

Correlation is not always BAD!

Suppose we could INDUCE CORRELATION between the W’s and the B’s without adding any bias?

Positive correlation reduces the theoretical variance of $\bar{W} - \bar{B}$, because the covariance term is subtracted

FREE POWER (the probability of correctly rejecting H0: equal means)
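The effect of common random numbers can be seen in a small experiment. The sketch below is an illustration under assumed numbers: the means (85, 88) and standard deviations (10, 12) are invented, and `rating` is a hypothetical helper that reuses the spreadsheet-style uniform transform. The same uniforms drive both quarterbacks in the first comparison; independent uniforms are used in the second.

```python
import random
from math import sqrt
from statistics import variance


def rating(u, m, s):
    """Uniform rating with mean m and stddev s built from a Uniform(0,1) draw u."""
    return m + (u - 0.5) * sqrt(12) * s


rng = random.Random(7)  # fixed seed so the run is repeatable
n = 28
u_shared = [rng.random() for _ in range(n)]
u_other = [rng.random() for _ in range(n)]

w = [rating(u, 85.0, 10.0) for u in u_shared]
b_crn = [rating(u, 88.0, 12.0) for u in u_shared]  # same uniforms as w
b_ind = [rating(u, 88.0, 12.0) for u in u_other]   # fresh uniforms

# Estimated variance of W-bar minus B-bar, from the paired differences.
var_crn = variance([x - y for x, y in zip(w, b_crn)]) / n
var_ind = variance([x - y for x, y in zip(w, b_ind)]) / n
print(var_crn, var_ind)
```

The smaller variance under common random numbers gives a narrower confidence interval for the same 28 runs.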

STREAMING

Segregate the random number generation task into streams connected to phenomena

seed1: Inter-arrival times
seed2: Service times

Each stream is generated by $Z_i = a Z_{i-1} \bmod m$

1. Change features of the service.
2. Use exact same arrival stream for comparing each service setting.
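One way to picture streaming is a single-server queue driven by two separate generators. The sketch below is an assumption-laden illustration: the multiplier and modulus are those of the classic "minimal standard" Lehmer generator (the slides do not say which constants were used), the exponential times and seeds are invented, and `Stream` and `average_wait` are hypothetical names.

```python
from math import log

A, M = 16807, 2**31 - 1  # assumed constants for Z_i = a * Z_{i-1} mod m


class Stream:
    """One random number stream with its own seed."""

    def __init__(self, seed):
        self.z = seed  # seed must be between 1 and M - 1

    def uniform(self):
        self.z = (A * self.z) % M
        return self.z / M

    def exponential(self, mean):
        return -mean * log(self.uniform())  # inverse-transform method


def average_wait(arrival_seed, service_seed, mean_service, customers=1000):
    """Average wait in queue, with arrivals and services on separate streams."""
    arrivals = Stream(arrival_seed)
    services = Stream(service_seed)
    wait, total = 0.0, 0.0
    for _ in range(customers):
        total += wait
        service = services.exponential(mean_service)
        gap = arrivals.exponential(1.0)
        wait = max(0.0, wait + service - gap)  # wait of the next customer
    return total / customers


# Same seeds for both settings: only the service feature changes.
slow = average_wait(12345, 67890, mean_service=0.8)
fast = average_wait(12345, 67890, mean_service=0.6)
print(slow, fast)
```

Because both runs see the identical arrival stream, any difference in average wait comes from the service setting rather than from a luckier set of arrivals.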

SPREADSHEET HIGHLIGHTS 2

Use same results of RAND() for building
  Bulger samples
  Warner samples

Note CI shrinkage

Try with identical sigma

Discuss “Estimation”

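BEHRENS-FISHER PROBLEM

Comparison of Means

No pairing, equal sample sizes, or equal variances assumed

Remember that we are after the variance of $\bar{W} - \bar{B}$

Common use: New samples vs. History

$$\mathrm{VAR}[\bar{W} - \bar{B}] = \mathrm{VAR}[\bar{W}] + \mathrm{VAR}[\bar{B}] - 2\,\mathrm{COV}[\bar{W},\bar{B}] = \frac{\mathrm{VAR}[W]}{n_W} + \frac{\mathrm{VAR}[B]}{n_B} - 0$$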
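With unequal sample sizes and variances, the interval on the difference of means can be sketched as follows. This is a hedged illustration: sample variances stand in for VAR[W] and VAR[B], the normal quantile is kept to match the earlier interval, and the function name `unpaired_interval` and the data are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def unpaired_interval(w, b, alpha=0.05):
    """Interval for E[W] - E[B] from independent samples of any sizes."""
    var_diff = variance(w) / len(w) + variance(b) / len(b)  # covariance term is 0
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = mean(w) - mean(b)
    half_width = z * sqrt(var_diff)
    return centre - half_width, centre + half_width


new_samples = [90.0, 95.0, 100.0]            # made-up data
history = [80.0, 85.0, 90.0, 95.0, 100.0]    # made-up data
low, high = unpaired_interval(new_samples, history)
print(low, high)
```

If the printed interval contains 0.0, these samples do not separate the two means at the chosen level.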

SPREADSHEET HIGHLIGHTS

MULTI-SETTING CASE

Can involve many Factors or just one

Treatment i has mean $\mu_i$

Analysis of Variance (ANOVA)
  Data from treatment 1, 2, ..., n
  H0: $\mu_1 = \dots = \mu_{n-1} = \mu_n$

Are the treatments distinguishable?

DESIGN OF EXPERIMENT

1. Determine Factors and Settings
2. Collect Data According to Design
3. Perform ANOVA
4. State Conclusion

Design = Which Factors, Which Settings for each Treatment

FULL FACTORIAL

Build sample of All Combinations

Factors
  Quarterback (2)
  Running Back (3)
  Strong Safety (3)

2x3x3=18 Treatments
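Enumerating a full factorial design is a Cartesian product of the settings. In the sketch below the level labels (`QB1`, `RB1`, and so on) are placeholders invented for illustration; only the factor names and the counts 2, 3, 3 come from the slide.

```python
from itertools import product

# Placeholder level names; only the number of levels per factor is from the slide.
factors = {
    "Quarterback": ["QB1", "QB2"],
    "Running Back": ["RB1", "RB2", "RB3"],
    "Strong Safety": ["SS1", "SS2", "SS3"],
}

# Each treatment assigns one setting to every factor.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(treatments))
print(treatments[0])
```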

HOW ANOVA WORKS

$X_{i,j}$ is ith sample from jth treatment point

Assumed iid Normal (never!)

Decomposition of variability
  Observation (Obs)
  Treatment vs. Grand Mean (Tr)
  Within Treatment (Res)

$$X_{i,j} = \mu_j + e_{i,j}$$

HYPOTHESIS H0

The treatment variability is random variability

The size of the treatment variability is in-scale with the residual variability

ANOVA uses sums of squares
  g treatments
  $n_t$ samples from treatment t

ANOVA TABLE

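Here $x_{j,t}$ is the jth sample from treatment t, $\bar{x}_t$ is the mean of treatment t, and $\bar{x}$ is the grand mean.

Treatment: $SS_{Tr} = \sum_{t=1}^{g} n_t\,(\bar{x}_t - \bar{x})^2$, degrees of freedom $g - 1$

Residual: $SS_{Res} = \sum_{t=1}^{g}\sum_{j=1}^{n_t} (x_{j,t} - \bar{x}_t)^2$, degrees of freedom $\sum_{t=1}^{g} n_t - g$

Observation: $SS_{Obs} = \sum_{t=1}^{g}\sum_{j=1}^{n_t} (x_{j,t} - \bar{x})^2$, degrees of freedom $\sum_{t=1}^{g} n_t - 1$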

REMEMBER chi-SQUARED?

From our Goodness-of-Fit Test

X~N(0,1): for n independent X’s, the sum of the n values of $X^2$ is chi-SQUARED with n degrees of freedom

If estimates (X-bar, sigma) were used to make the X’s N(0,1), lose one d.f. per estimate

F-distribution
  X is chi-sq with n d.f.
  Y is chi-sq with m d.f., independent of X
  (X/n)/(Y/m) has F distribution

ANOVA HYPOTHESIS TEST

$$\frac{SS_{Tr}/\mathrm{d.f.}_{Tr}}{SS_{Res}/\mathrm{d.f.}_{Res}} \sim F$$

The normalizing $\sigma$ cancels!
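The sums of squares and the F ratio can be computed by hand in a few lines. The sketch below is an illustration on made-up data; `one_way_anova` is a hypothetical helper, and it stops at the test statistic, leaving the comparison with a tabled F value to the reader.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, and the F ratio."""
    g = len(groups)
    n_total = sum(len(x) for x in groups)
    grand_mean = sum(sum(x) for x in groups) / n_total
    means = [sum(x) / len(x) for x in groups]

    # Treatment means vs. the grand mean, weighted by sample count.
    ss_tr = sum(len(x) * (m - grand_mean) ** 2 for x, m in zip(groups, means))
    # Each observation vs. its own treatment mean.
    ss_res = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)
    # Each observation vs. the grand mean.
    ss_obs = sum((v - grand_mean) ** 2 for x in groups for v in x)

    df_tr, df_res = g - 1, n_total - g
    f_stat = (ss_tr / df_tr) / (ss_res / df_res)
    return {"ss_tr": ss_tr, "ss_res": ss_res, "ss_obs": ss_obs,
            "df_tr": df_tr, "df_res": df_res, "f": f_stat}


data = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]  # made-up samples
result = one_way_anova(data)
print(result)
```

On this data the treatment and residual sums of squares add up to the observation sum of squares, which is the decomposition of variability the test relies on.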

ANOVA HYPOTHESIS TEST

Compare the test statistic to an F table

Reject if it’s big and conclude that ...

the Treatments are Different!

SPREADSHEET HIGHLIGHTS
