Lecture 15: Tues., Mar. 2 • Inferences about Linear Combinations of Group Means (Chapter 6.2) • Chi-squared test (Handout/Notes) • Thursday: Simple Linear Regression (Chapter 7)

Dec 20, 2015


Lecture 15: Tues., Mar. 2

• Inferences about Linear Combinations of Group Means (Chapter 6.2)

• Chi-squared test (Handout/Notes)

• Thursday: Simple Linear Regression (Chapter 7)


Review of One-way layout

• Assumptions of ideal model
– All populations have the same standard deviation.
– Each population is normal.
– Observations are independent.

• Planned comparisons: Usual t-test, but use all groups to estimate $\sigma$. If there are many planned comparisons, use Bonferroni to adjust for multiple comparisons.

• Test of $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ vs. the alternative that at least two means differ: one-way ANOVA F-test.

• Unplanned comparisons: Use the Tukey-Kramer procedure to adjust for multiple comparisons.


Case Study 5.1.2: Spock Conspiracy Trial

• In 1968, Dr. Spock was tried in U.S. District Court of Boston on charges of conspiring to violate Selective Service Act by encouraging young men to resist being drafted into military service.

• Defense challenged method by which jurors were selected, claiming that women – many of whom had raised children according to popular methods developed by Dr. Spock - were underrepresented

• Venire for trial contained only one woman.

• Defense argued that the judge in the trial had a history of venires in which women were systematically underrepresented.


Data for Spock Conspiracy Trial

• Percent of women in recent 30-juror venires for Spock Trial judge and six other Boston area district judges (A,B,C,D,E,F). Seven groups (judges) in one-way layout. Data in spock.JMP.

• Key question: How does the mean percentage of women for Spock Trial judge compare to the average of the mean percentage of women for the other six judges, i.e., what is

$$\mu_{Spock} - \frac{\mu_A + \mu_B + \mu_C + \mu_D + \mu_E + \mu_F}{6}$$


Inference about Linear Combinations of Group Means

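• Parameter of interest: $\gamma = C_1\mu_1 + C_2\mu_2 + \cdots + C_I\mu_I$. For Spock study, $C_1 = C_2 = \cdots = C_6 = -\frac{1}{6}$, $C_7 = 1$ (judges A to F are groups 1 to 6, Spock's judge is group 7).

• Point estimate: $g = C_1\bar{Y}_1 + C_2\bar{Y}_2 + \cdots + C_I\bar{Y}_I$

• Standard Error: $SE(g) = s_p\sqrt{\frac{C_1^2}{n_1} + \frac{C_2^2}{n_2} + \cdots + \frac{C_I^2}{n_I}}$

• 95% Confidence Interval for $\gamma$: $g \pm t_{n-I}(.975) \times SE(g)$

• Test of $H_0: \gamma = \gamma^*$ vs. $H_a: \gamma \ne \gamma^*$: For level .05 test, reject $H_0$ if and only if $\gamma^*$ does not belong to the 95% confidence interval.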


[Figure: Oneway Analysis of PERCENT By JUDGE. PERCENT (vertical axis, 5 to 45) plotted for each JUDGE (A, B, C, D, E, F, SPOCK'S).]

Means and Std Deviations

Level      Number   Mean      Std Dev   Std Err Mean   Lower 95%   Upper 95%
A          5        34.1200   11.9418   5.3405         19.29       48.948
B          6        33.6167    6.5822   2.6872         26.71       40.524
C          9        29.1000    4.5929   1.5310         25.57       32.630
D          2        27.0000    3.8184   2.7000         -7.31       61.307
E          6        26.9667    9.0101   3.6784         17.51       36.422
F          9        26.8000    5.9689   1.9896         22.21       31.388
SPOCK'S    9        14.6222    5.0388   1.6796         10.75       18.495

Oneway Anova

Summary of Fit

Rsquare                      0.50826
Adj Rsquare                  0.432608
Root Mean Square Error       6.914209
Mean of Response             26.58261
Observations (or Sum Wgts)   46

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
JUDGE       6   1927.0808        321.180       6.7184    <.0001
Error      39   1864.4453         47.806
C. Total   45   3791.5260
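The Root Mean Square Error in this output is the pooled standard deviation, the estimate of $\sigma$ that uses all groups. As a rough check, the sketch below recomputes it from the group sizes and standard deviations in the Means and Std Deviations table, assuming the usual pooled-variance formula $s_p^2 = \sum_i (n_i - 1)s_i^2 / (n - I)$. The function name `pooled_sd` is illustrative and not part of the lecture.

```python
import math

def pooled_sd(sizes, sds):
    """Pooled standard deviation from group sizes and group standard deviations."""
    # Within-group (error) sum of squares: sum of (n_i - 1) * s_i^2
    error_ss = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    # Error degrees of freedom: total sample size minus number of groups
    error_df = sum(sizes) - len(sizes)
    return math.sqrt(error_ss / error_df), error_df

# Judges A, B, C, D, E, F, Spock's (values copied from the JMP table)
sizes = [5, 6, 9, 2, 6, 9, 9]
sds = [11.9418, 6.5822, 4.5929, 3.8184, 9.0101, 5.9689, 5.0388]

s_p, df = pooled_sd(sizes, sds)
print(f"pooled SD = {s_p:.3f} on {df} degrees of freedom")
```

Up to rounding in the tabled standard deviations, the result should agree with the Root Mean Square Error and the Error DF reported by JMP.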


Spock Trial Analysis

Linear Combination of Interest:

$$\gamma = \mu_{Spock's} - \frac{\mu_A + \mu_B + \mu_C + \mu_D + \mu_E + \mu_F}{6}$$

Point Estimate:

$$g = 14.62 - \frac{34.12 + 33.62 + 29.10 + 27.00 + 26.97 + 26.80}{6} = -14.98$$

Standard Error:

$$SE(g) = 6.91\sqrt{\frac{1^2}{9} + \frac{(1/6)^2}{5} + \frac{(1/6)^2}{6} + \frac{(1/6)^2}{9} + \frac{(1/6)^2}{2} + \frac{(1/6)^2}{6} + \frac{(1/6)^2}{9}} = 2.64$$

95% Confidence Interval for $\gamma$: $-14.98 \pm t_{46-7}(.975) \times 2.64$

Rounding the degrees of freedom (39) down to the nearest entry on Table A.6, $t_{30}(.975) = 2.042$.

Approximate 95% Confidence Interval: $-14.98 \pm 2.042 \times 2.64 = (-20.37, -9.59)$

Hypothesis test of $H_0: \gamma = 0$ vs. $H_a: \gamma \ne 0$ at level 0.05: Reject $H_0$ since 0 is not in the 95% confidence interval.

Conclusion: There is evidence that the mean percent of women in Spock’s trial judge’s venires is less than the average of the mean percent of women in the other six trial judges’ venires. A 95% confidence interval for the difference between the mean percent of women in Spock’s trial judge’s venires and the average of the mean percent of women in the other six trial judges’ venires is (-20.37, -9.59).
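The point estimate, standard error and interval for this linear combination can be checked numerically. What follows is a minimal Python sketch rather than anything from the lecture: the function name `linear_combination_ci` is hypothetical, the group summaries are copied from the JMP output, and the critical value is supplied from the t-table instead of being computed.

```python
import math

def linear_combination_ci(coefs, means, sizes, s_p, t_crit):
    """Estimate g = sum(C_i * Ybar_i), its standard error, and a confidence interval."""
    g = sum(c * m for c, m in zip(coefs, means))
    se = s_p * math.sqrt(sum(c ** 2 / n for c, n in zip(coefs, sizes)))
    return g, se, (g - t_crit * se, g + t_crit * se)

# Group order: judges A-F, then Spock's judge (summaries from the JMP output)
means = [34.1200, 33.6167, 29.1000, 27.0000, 26.9667, 26.8000, 14.6222]
sizes = [5, 6, 9, 2, 6, 9, 9]
coefs = [-1 / 6] * 6 + [1.0]   # Spock's mean minus the average of the other six

s_p = 6.914209                 # Root Mean Square Error (pooled SD)
t_crit = 2.042                 # t_30(.975) from the table, used in place of 39 df

g, se, (lower, upper) = linear_combination_ci(coefs, means, sizes, s_p, t_crit)
reject_h0 = not (lower <= 0.0 <= upper)   # level .05 test of gamma = 0
print(f"g = {g:.2f}, SE(g) = {se:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(f"reject H0: {reject_h0}")
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries and mirrors the table lookup used in the lecture.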


Linear Combinations: Comparing Rates

• In mice diet study, we are interested in the rate of increase in lifetime for each additional kilocalorie of reduced diet.

• For example we are interested in comparing rate of increase in lifetime associated with reduction from 50 to 40 kcal/wk vs. rate of increase in lifetime associated with reduction from 85 to 50 kcal/wk

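Rate of increase from 50 to 40 kcal/wk: $\frac{\mu_{N/R40} - \mu_{N/R50}}{50 - 40}$

Rate of increase from 85 to 50 kcal/wk: $\frac{\mu_{N/R50} - \mu_{N/N85}}{85 - 50}$

Difference in rates:

$$\gamma = \frac{\mu_{N/R40} - \mu_{N/R50}}{10} - \frac{\mu_{N/R50} - \mu_{N/N85}}{35} = -\frac{35 + 10}{350}\mu_{N/R50} + \frac{1}{35}\mu_{N/N85} + \frac{1}{10}\mu_{N/R40} = -\frac{45}{350}\mu_{N/R50} + \frac{1}{35}\mu_{N/N85} + \frac{1}{10}\mu_{N/R40}$$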


Oneway Analysis of LIFETIME By DIET

Oneway Anova

Summary of Fit

Rsquare                  0.454275
Adj Rsquare              0.44632
Root Mean Square Error   6.678239

Means and Std Deviations

Level   Number   Mean      Std Dev   Std Err Mean   Lower 95%   Upper 95%
N/N85   57       32.6912   5.12530   0.67886        31.331      34.051
N/R40   60       45.1167   6.70341   0.86541        43.385      46.848
N/R50   71       42.2972   7.76819   0.92192        40.458      44.136
NP      49       27.4020   6.13370   0.87624        25.640      29.164
R/R50   56       42.8857   6.68315   0.89307        41.096      44.675
Lopro   56       39.6857   6.99169   0.93430        37.813      41.558

Parameter of Interest: $\gamma = -\frac{45}{350}\mu_{N/R50} + \frac{1}{35}\mu_{N/N85} + \frac{1}{10}\mu_{N/R40}$

Point Estimate: $g = -\frac{45}{350} \times 42.3 + \frac{1}{35} \times 32.7 + \frac{1}{10} \times 45.1 = 0.0057$ months/(kcal/wk)

Standard Error: $SE(g) = 6.68\sqrt{\frac{(45/350)^2}{71} + \frac{(1/35)^2}{57} + \frac{(1/10)^2}{60}} = 0.1359$

Degrees of Freedom: 343. $t_{343}(.975) \approx t_{100}(.975) = 1.984$

95% Confidence Interval: $0.0057 \pm 1.984 \times 0.1359 = (-0.2639, 0.2753)$ months/(kcal/wk)

Hypothesis test of $H_0: \gamma = 0$ vs. $H_a: \gamma \ne 0$ does not reject at 0.05 level since 0 is in 95% confidence interval.

Conclusion: No evidence of a difference in rates of increase in lifetime associated with reduction of diet from 85 kcal/wk to 50 kcal/wk compared to reduction in diet from 50 kcal/wk to 40 kcal/wk.
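The coefficients of this rate comparison come from subtracting two slopes, so it is worth checking that they combine as claimed. The sketch below, which is an illustration and not part of the lecture, builds the coefficients with exact fractions and then evaluates the estimate, standard error and interval. It uses the group means rounded to one decimal and the Root Mean Square Error rounded to 6.68, which is an assumption about how the point estimate on the slide was computed; variable names are illustrative.

```python
import math
from fractions import Fraction

# Rate of increase from 50 to 40 kcal/wk minus rate from 85 to 50 kcal/wk,
# written as coefficients on the three group means.
coefs = {
    "N/R40": Fraction(1, 50 - 40),
    "N/R50": -Fraction(1, 50 - 40) - Fraction(1, 85 - 50),
    "N/N85": Fraction(1, 85 - 50),
}
print(coefs["N/R50"])         # exact coefficient on the N/R50 mean
print(sum(coefs.values()))    # coefficients of a comparison of rates sum to zero

# Means rounded to one decimal (assumed rounding), sizes from the JMP table
means = {"N/R40": 45.1, "N/R50": 42.3, "N/N85": 32.7}
sizes = {"N/R40": 60, "N/R50": 71, "N/N85": 57}
s_p = 6.68       # Root Mean Square Error, rounded
t_crit = 1.984   # t_100(.975), standing in for 343 degrees of freedom

g = sum(float(coefs[k]) * means[k] for k in coefs)
se = s_p * math.sqrt(sum(float(coefs[k]) ** 2 / sizes[k] for k in coefs))
lower, upper = g - t_crit * se, g + t_crit * se
print(f"g = {g:.4f}, SE(g) = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Using unrounded means changes the point estimate slightly but not the conclusion, since the interval is wide relative to the estimate.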


Populations of Nominal Data

• So far we have focused on comparing populations of interval data (e.g., heights, scores, incomes).

• We now consider comparing populations of nominal data. Nominal data are data that are categories. Examples:
– Candidate person voted for (Bush or Gore)
– Color of M&Ms (brown, yellow, red, orange, green or blue)

• A population of nominal data with k categories can be described by the proportion in each category: $p_1$ in category 1, $p_2$ in category 2, …, $p_k$ in category k ($\sum_{i=1}^{k} p_i = 1$), e.g., population of M&M’s is supposed to have $p_{brown} = p_{yellow} = p_{red} = p_{blue} = 0.2$, $p_{green} = p_{orange} = 0.1$.


One Sample Test for Nominal Data

• Analogue of one sample problem with interval population: Take random sample of size n from a population of nominal data. We want to test whether population frequencies are $p_1 = p_1^*, p_2 = p_2^*, \ldots, p_k = p_k^*$.

$$H_0: p_1 = p_1^*, \; p_2 = p_2^*, \; \ldots, \; p_k = p_k^*$$

$$H_a: \text{at least one of } p_i \ne p_i^* \quad (i = 1, \ldots, k)$$


SAT example

• People sometimes say that “b” and “c” answers occur most frequently on multiple choice tests. To see if there is any evidence that the answers do not occur with equal frequency, a random SAT exam was selected from The College Board, 10 SATs, New York: College Entrance Examination Board.

$$H_0: p_a = p_b = p_c = p_d = p_e = 0.2$$

$$H_a: \text{at least one of } p_a, p_b, p_c, p_d, p_e \ne 0.2$$


Data (sat.JMP)

 1. d   15. c   29. e   43. a   57. e   71. a
 2. d   16. d   30. b   44. a   58. d   72. c
 3. b   17. a   31. d   45. b   59. c   73. b
 4. b   18. c   32. d   46. e   60. b   74. d
 5. c   19. c   33. b   47. d   61. b   75. e
 6. e   20. b   34. e   48. b   62. d   76. a
 7. b   21. b   35. e   49. d   63. e   77. c
 8. a   22. b   36. c   50. b   64. b   78. c
 9. a   23. c   37. e   51. a   65. d   79. d
10. b   24. b   38. c   52. a   66. e   80. d
11. c   25. c   39. d   53. c   67. b   81. b
12. b   26. a   40. e   54. c   68. d   82. d
13. e   27. c   41. e   55. a   69. c   83. d
14. e   28. e   42. a   56. c   70. c   84. e
                                        85. b


Chi-squared Test

Category   Observed Frequency   Expected Frequency under $H_0$
1          $f_1$                $e_1 = n p_1^*$
2          $f_2$                $e_2 = n p_2^*$
...        ...                  ...
k          $f_k$                $e_k = n p_k^*$

• Chi-squared test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$

• Reject $H_0$ for large values of $\chi^2$. Critical value for level .05 test is .95 quantile of $\chi^2$ distribution with k-1 degrees of freedom (Table A.3).

• Test is only valid if expected frequencies in each cell are 5 or more. When necessary, cells should be combined in order to satisfy this condition.


Chi-Squared Test for SAT data

Letter   Observed Frequency   Expected Frequency   (Observed-Expected)^2/Expected
A        12                   85*0.2=17            1.47
B        22                   85*0.2=17            1.47
C        19                   85*0.2=17            0.24
D        17                   85*0.2=17            0.00
E        15                   85*0.2=17            0.24
Total                                              3.42

The test statistic $\chi^2 = 3.42$. The critical value for rejecting at the 0.05 level is $\chi^2_4(.95) = 9.49$. Since 3.42 < 9.49, we do not reject $H_0$. There is no evidence that the letters are not random on the SAT.
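The table calculation can be reproduced in a few lines. This is a minimal Python sketch, not part of the lecture: the function name `chi_squared_statistic` is hypothetical, the observed counts are taken from the table above, and the critical value 9.49 is supplied from the chi-squared table rather than computed.

```python
def chi_squared_statistic(observed, hypothesized_probs):
    """Pearson chi-squared statistic for one sample of nominal data."""
    n = sum(observed)
    expected = [n * p for p in hypothesized_probs]
    # The test is only valid when every expected frequency is at least 5
    if min(expected) < 5:
        raise ValueError("expected frequency below 5: combine categories first")
    statistic = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return statistic, expected

observed = [12, 22, 19, 17, 15]   # answers a, b, c, d, e
probs = [0.2] * 5                 # hypothesized equal frequencies

statistic, expected = chi_squared_statistic(observed, probs)
critical_value = 9.49             # .95 quantile of chi-squared with 4 df (from the table)
reject_h0 = statistic > critical_value
print(f"chi-squared = {statistic:.2f}, reject H0: {reject_h0}")
```

Summing the unrounded terms gives a statistic that can differ in the second decimal from a total built out of rounded table entries; the decision against the critical value is unaffected.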


Chi-Squared Test in JMP

• (For the SAT example)

• Method I (list all observations in sample): Create a column for answer and list the sample. Then click Analyze, Distribution, put column with answer in Y, click OK, then click red triangle next to answer, click Test Probabilities and then input the hypothesized probabilities (0.2 for each category for SAT example). Then click OK. The row Pearson gives the chi-squared statistic and the p-value.

• Method II (list frequencies for each category): Create a column answer that lists each answer category (a, b, c, d, e) and another column frequency which contains the frequency of each answer. Then click Analyze, Distribution, put column with answer in Y and put column with frequency in Freq and click OK. Then follow the Test Probabilities steps from Method I.


Distributions: Answers

[Figure: bar chart of the answer categories a, b, c, d, e.]

Frequencies

Level   Count   Prob
a       12      0.14118
b       22      0.25882
c       19      0.22353
d       17      0.20000
e       15      0.17647
Total   85      1.00000
5 Levels

Test Probabilities

Level   Estim Prob   Hypoth Prob
a       0.14118      0.20000
b       0.25882      0.20000
c       0.22353      0.20000
d       0.20000      0.20000
e       0.17647      0.20000

Test               ChiSquare   DF   Prob>Chisq
Likelihood Ratio   3.4568      4    0.4845
Pearson            3.4118      4    0.4914

No evidence against $H_0: p_a = p_b = p_c = p_d = p_e = 0.2$, p-value = 0.4914.
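Both rows of the JMP test table can be reproduced by hand. The sketch below is an illustration under two stated assumptions: the Likelihood Ratio row is taken to be the usual statistic $2\sum_i f_i \ln(f_i/e_i)$, and the p-value uses the closed-form upper tail of the chi-squared distribution that holds for even degrees of freedom, $e^{-x/2}\sum_{j=0}^{df/2-1} (x/2)^j/j!$. The function name is hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x), closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

observed = [12, 22, 19, 17, 15]   # counts for a, b, c, d, e
n = sum(observed)
expected = [n * 0.2] * 5
df = len(observed) - 1            # k - 1 = 4

pearson = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
# Likelihood ratio statistic: 2 * sum of f_i * ln(f_i / e_i)
likelihood_ratio = 2 * sum(f * math.log(f / e) for f, e in zip(observed, expected) if f > 0)

p_pearson = chi2_upper_tail_even_df(pearson, df)
p_lr = chi2_upper_tail_even_df(likelihood_ratio, df)
print(f"Pearson          {pearson:.4f}  p = {p_pearson:.4f}")
print(f"Likelihood Ratio {likelihood_ratio:.4f}  p = {p_lr:.4f}")
```

For odd degrees of freedom the tail probability has no such finite sum, so a table or a statistics library would be needed instead.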


Random numbers experiment

• When selecting random numbers (e.g., for a random sample or randomized experiment), you should always use a random number generator or a random number table. People are very bad at picking random numbers themselves.

• Experiment: Everybody pick a random whole number between 1 and 10. We’ll then survey the class and test whether people’s “random” numbers are really random.


Chi-squared test for random numbers experiment

Number   Observed   Expected
1                   0.1*n =
2                   0.1*n =
3                   0.1*n =
4                   0.1*n =
5                   0.1*n =
6                   0.1*n =
7                   0.1*n =
8                   0.1*n =
9                   0.1*n =
10                  0.1*n =

$\chi^2 = $

Critical value: Reject $H_0: p_1 = p_2 = \cdots = p_{10} = 0.1$ if $\chi^2 > \chi^2_9(.95) = 16.92$.
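The class responses are not recorded in these notes, so the following sketch fills in the table for whatever list of numbers it is given. As a stand-in it uses numbers drawn by Python's random number generator with an arbitrary seed; these are generated values, not class data, and the function name `uniform_digit_test` is hypothetical.

```python
import random
from collections import Counter

def uniform_digit_test(numbers, critical_value=16.92):
    """Chi-squared test that the whole numbers 1..10 are equally likely."""
    n = len(numbers)
    expected = 0.1 * n                       # same expected count in every cell
    if expected < 5:
        raise ValueError("need at least 50 responses for the test to be valid")
    counts = Counter(numbers)
    statistic = sum((counts[k] - expected) ** 2 / expected for k in range(1, 11))
    return statistic, statistic > critical_value

rng = random.Random(2)                       # arbitrary seed, for repeatability only
generated = [rng.randint(1, 10) for _ in range(100)]   # stand-in for survey answers
statistic, reject_h0 = uniform_digit_test(generated)
print(f"chi-squared = {statistic:.2f}, reject H0: {reject_h0}")
```

Replacing `generated` with the numbers collected from the class runs the same test on the survey.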


M&M’s

• According to the M&M’s web site, the color distribution in peanut butter M&M’s is 20% brown, 20% yellow, 20% red, 20% blue, 10% green and 10% orange. Test

$$H_0: p_{brown} = p_{yellow} = p_{red} = p_{blue} = 0.2, \; p_{green} = p_{orange} = 0.1$$

$$H_a: \text{at least one of the above probabilities is not true.}$$


Chi-squared test for M&Ms

Color    Observed   Expected
Brown               0.2*n =
Yellow              0.2*n =
Red                 0.2*n =
Blue                0.2*n =
Green               0.1*n =
Orange              0.1*n =

$\chi^2 = $

Critical value: Reject $H_0: p_{brown} = p_{yellow} = p_{red} = p_{blue} = 0.2, \; p_{green} = p_{orange} = 0.1$ if $\chi^2 > \chi^2_5(.95) = 11.07$.
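Because the hypothesized probabilities are unequal here, the expected frequencies differ by color, and the smallest ones decide whether the test is valid. The sketch below, an illustration with hypothetical names, computes the expected column for a given sample size and the smallest sample size for which every expected frequency is at least 5. The sample size of 60 in the usage line is an arbitrary example, not data from the lecture.

```python
import math
from fractions import Fraction

# Hypothesized color distribution for peanut butter M&M's (from the slide)
hypothesized = {
    "brown": Fraction(1, 5), "yellow": Fraction(1, 5), "red": Fraction(1, 5),
    "blue": Fraction(1, 5), "green": Fraction(1, 10), "orange": Fraction(1, 10),
}

def expected_counts(n):
    """Expected frequency n * p for each color under the null hypothesis."""
    return {color: n * p for color, p in hypothesized.items()}

# Smallest n with every expected frequency at least 5: n >= 5 / (smallest probability)
min_n = math.ceil(5 / min(hypothesized.values()))
df = len(hypothesized) - 1        # k - 1 degrees of freedom

print(f"probabilities sum to {sum(hypothesized.values())}")
print(f"minimum sample size = {min_n}, degrees of freedom = {df}")
print({color: float(e) for color, e in expected_counts(60).items()})  # n = 60 is arbitrary
```

With fewer candies than `min_n`, the green and orange cells would need to be combined before applying the test.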