Problem Points Grade 1 2 4 5 6 - University of Wisconsin ...pages.stat.wisc.edu/.../Final_Exam/2012_Spring.pdf · FINAL EXAM Instructions: ... Source df SS MS . F-ratio . p-value

STAT 301, Spring 2012 Name _______________________________

Lec 2: Ismor Fischer Discussion Section: Please circle one!

TA: Shixue Liu: 321 (M 12:05) / 322 (M 4:35) / 323 (M 3:30)

Ju Hee Cho: 324 (M 1:20) / 325 (W 12:05) / 326 (W 1:20)

FINAL EXAM Instructions: Complete any 5 out of the following 6 problems. You may do the remaining problem for extra credit if you wish.

Please show all work!

Problem   Points   Grade
1         30
2         30
3         30
4         30
5         30
6         30
Total     150

1. According to astronomers, many of the stars that are visible with the naked eye are actually “binary systems,” i.e., two stars that orbit each other around a common center of mass. Generally, the variable X = “Luminosity relative to the sun”∗ of a single star is normally distributed, and that of a binary system is bimodal, consisting of two distinct normal distributions, i.e., X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2). Occasionally, however, the primary signal from a single star can display a secondary “false echo,” due to interference. We wish to determine if the true mean luminosity difference is statistically significant (indicating a true binary system) or not (indicating a false image from a single star), using the two independent sample measurements shown below. Answer each of the following. Show all work!

(a) Compute the 95% confidence interval for the true mean difference µ1 − µ2 between the two sources. Show all work! (15 pts)

(b) Calculate the two-sided p-value of this sample, under the null hypothesis H0: µ1 = µ2. Show all work! (10 pts)

(c) Use EACH of your results in (a) and (b) to reach a formal conclusion about whether or not the null hypothesis H0: µ1 = µ2 can be rejected in favor of the alternative HA: µ1 ≠ µ2, at the α = .05 significance level. Interpret in context: What do the findings suggest? (5 pts)

• 95% confidence interval:

• p-value:

∗ The luminosity of the sun is equal to 1, by convention. Thus, for example, a star with luminosity 0.1 is one-tenth as luminous as the sun.

Primary Source: n1 = 400, x̄1 = 0.102, s1 = 0.020

Secondary Source: n2 = 400, x̄2 = 0.100, s2 = 0.015

[Figure: distribution of X = Relative Luminosity, with µ1 and µ2 marked on the horizontal axis]
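As a rough illustration of the kind of calculation Problem 1 calls for, the sketch below computes the standard error, a 95% interval, and a two-sided p-value from the summary statistics above. It assumes the large-sample z procedure with an unpooled standard error is appropriate (both samples have n = 400); the variable names are illustrative and not part of the exam.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the problem (primary vs. secondary source)
n1, xbar1, s1 = 400, 0.102, 0.020
n2, xbar2, s2 = 400, 0.100, 0.015

# Assumption: large samples, so use the unpooled standard error with z
diff = xbar1 - xbar2
se = sqrt(s1**2 / n1 + s2**2 / n2)

z_crit = NormalDist().inv_cdf(0.975)          # two-sided 95% critical value
ci = (diff - z_crit * se, diff + z_crit * se)

z = diff / se                                 # test statistic under H0: mu1 = mu2
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(f"SE = {se:.5f}, z = {z:.2f}")
print(f"95% CI = ({ci[0]:.5f}, {ci[1]:.5f}), p = {p_value:.4f}")
```

Because the interval and the test here share the same standard error, the two criteria in part (c) agree: the interval contains 0 exactly when the p-value exceeds .05.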

2. Two locations in Wisconsin are thought to be infested with independent populations of an insect pest species, each of which is normally distributed, i.e., X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2). A sample of n1 = 4 baited traps is randomly scattered throughout the first location, and a sample of n2 = 5 baited traps is randomly scattered throughout the second location. After a certain period of time, they are collected, and the number of insects caught in each trap is recorded, as shown below.

Location 1: {133, 115, 91, 73} Location 2: {70, 64, 40, 16, 10}

(a) Find the 95% confidence interval for the mean difference µ1 − µ2 between the two populations. Show all work! (12 pts)

(b) Calculate the two-sided p-value of this sample, under the null hypothesis H0: µ1 − µ2 = 0. Show all work! (8 pts)

(c) Use EACH of your results in (a) and (b) to reach a formal conclusion about whether or not the null hypothesis H0: µ1 − µ2 = 0 can be rejected in favor of the alternative HA: µ1 − µ2 ≠ 0, at the α = .05 significance level, based on these samples. (5 pts)

• 95% confidence interval:

• p-value:

Interpret this formal conclusion in context: Exactly what has or has not been demonstrated? (5 pts)
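A minimal sketch of one way to approach Problem 2, assuming the pooled-variance t procedure from the formula sheet applies (small samples, with the informal check 1/4 < s1²/s2² < 4 satisfied). The critical value 2.365 for df = 7 comes from a standard t table, which is not included in this exam, and the numerical integration helper is an illustrative stand-in for a table lookup.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

loc1 = [133, 115, 91, 73]
loc2 = [70, 64, 40, 16, 10]
n1, n2 = len(loc1), len(loc2)

# Assumption: equal population variances, so pool the sample variances
s2_pooled = ((n1 - 1) * variance(loc1) + (n2 - 1) * variance(loc2)) / (n1 + n2 - 2)
se = sqrt(s2_pooled * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

diff = mean(loc1) - mean(loc2)
t_stat = diff / se

# Assumption: t critical value for df = 7, alpha/2 = .025, from a standard t table
t_crit = 2.365
ci = (diff - t_crit * se, diff + t_crit * se)

def t_pdf(x, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def two_sided_p(t, nu, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    h = abs(t) / steps
    total = t_pdf(0, nu) + t_pdf(abs(t), nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 2 * (0.5 - total * h / 3)

p_value = two_sided_p(t_stat, df)
print(f"pooled variance = {s2_pooled:.1f}, SE = {se:.2f}, t = {t_stat:.2f}, df = {df}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), two-sided p = {p_value:.4f}")
```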

3. A criminal study investigates the rate of handgun-related fatalities in a certain area. In one year, it is found that X1 = 18 out of n1 = 45 violent deaths were handgun-related, versus X2 = 126 out of n2 = 180 violent deaths in the following year.

(a) Determine the p-value under the null hypothesis H0: π1 = π2 of equal death rates between the two years, versus the two-sided alternative HA: π1 ≠ π2, and use it to infer a formal conclusion about the null hypothesis at the α = .05 significance level. (8 pts)

Interpret in context: Based on the sample evidence, exactly what has been shown? Explain. (4 pts)

(b) Fill in the 2 × 2 contingency table below (including marginal totals). Using the Chi-squared Test statistic, calculate the χ²-score of this experiment, and verify that it is equal to the square of the z-score found in part (a). Show all work!! (15 pts)

                 Year
               1st    2nd
Died    Yes
        No

(c) Suppose it is desired to test specifically for an increase in handgun-related deaths over time. State the complementary null and alternative hypotheses for the appropriate one-sided test of this study, and determine the associated p-value of the sample data above. (3 pts)

H0:

HA:

p-value =
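The relationship in part (b) between the z-score and the χ²-score can be checked numerically. The sketch below is one possible setup, assuming the pooled standard error from the formula sheet for the test and a table whose rows are "handgun-related: yes / no"; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 18, 45      # handgun-related deaths, first year
x2, n2 = 126, 180    # handgun-related deaths, second year

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
# One-sided alternative HA: pi1 < pi2 (an increase over time) uses the left tail
p_one_sided = NormalDist().cdf(z)

# 2 x 2 table: rows = handgun-related (yes / no), columns = year (1st / 2nd)
observed = [[x1, x2], [n1 - x1, n2 - x2]]
row_tot = [sum(r) for r in observed]
col_tot = [sum(c) for c in zip(*observed)]
total = sum(row_tot)

# Chi-squared statistic: sum of (observed - expected)^2 / expected over all cells
chi2 = sum(
    (observed[i][j] - row_tot[i] * col_tot[j] / total) ** 2
    / (row_tot[i] * col_tot[j] / total)
    for i in range(2)
    for j in range(2)
)

print(f"z = {z:.2f}, z^2 = {z * z:.4f}, chi-squared = {chi2:.4f}")
print(f"two-sided p = {p_two_sided:.6f}, one-sided p = {p_one_sided:.6f}")
```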

4. (a) The results of a survey of a random sample of n = 120 chocolate-lovers who expressed a preference between dark, milk, and white chocolate are shown below.

Dark Chocolate   Milk Chocolate   White Chocolate
48               40               32

Conduct a Chi-squared Test to determine whether or not the null hypothesis H0: πDark = πMilk = πWhite can be rejected, at the α = .05 significance level. Use the included table to find the closest lower and upper bounds for the p-value (Example: .01 < p < .05). (9 pts)

(b) Suppose the sample data is further divided by gender, via the following 2 × 3 contingency table.

         Dark Chocolate   Milk Chocolate   White Chocolate   Total
Men      12               16               20                48
Women    36               24               12                72
Total    48               40               32                120

Conduct a Chi-squared Test for the two categorical variables “I = Gender (Men / Women)” and “J = Chocolate Preference (Dark / Milk / White)” at the α = .05 significance level. Use the included table to find the closest lower and upper bounds for the p-value (Example: .01 < p < .05). (16 pts)

(c) Interpret: Summarize the results of parts (a) and (b) in context. What has been shown in this formal analysis of chocolate preference overall, and its relation to gender? Be precise! (5 pts)
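Both parts of Problem 4 follow the same pattern: compute expected counts, sum (observed − expected)² / expected, then bracket the p-value using the chi-squared table included with the exam. The sketch below is an illustration under that reading; the helper names `chi2_statistic` and `p_bounds` are hypothetical, and only the df = 2 row of the table is copied in.

```python
# Right-tailed probabilities and the df = 2 row of the exam's chi-squared table
probs = [1, 0.5, 0.25, 0.10, 0.05, 0.025, 0.010, 0.005, 0.0025, 0.0010, 0.0005, 0.00025]
row_df2 = [0, 1.386, 2.773, 4.605, 5.991, 7.378, 9.210, 10.597, 11.983, 13.816, 15.202, 16.588]

def chi2_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_bounds(score, row, probs):
    """Closest tabulated (lower, upper) bounds for the right-tailed p-value."""
    lower, upper = 0.0, probs[0]
    for prob, crit in zip(probs, row):
        if score >= crit:
            upper = prob
        else:
            lower = prob
            break
    return lower, upper

# Part (a): goodness of fit to equal proportions, df = 3 - 1 = 2
counts = [48, 40, 32]
expected_a = [sum(counts) / len(counts)] * len(counts)
chi2_a = chi2_statistic(counts, expected_a)
bounds_a = p_bounds(chi2_a, row_df2, probs)

# Part (b): independence in the 2 x 3 table, df = (2 - 1)(3 - 1) = 2
table = [[12, 16, 20], [36, 24, 12]]
row_tot = [sum(r) for r in table]
col_tot = [sum(c) for c in zip(*table)]
total = sum(row_tot)
observed_b = [cell for r in table for cell in r]
expected_b = [rt * ct / total for rt in row_tot for ct in col_tot]
chi2_b = chi2_statistic(observed_b, expected_b)
bounds_b = p_bounds(chi2_b, row_df2, probs)

print(f"(a) chi-squared = {chi2_a:.2f}, {bounds_a[0]} < p < {bounds_a[1]}")
print(f"(b) chi-squared = {chi2_b:.2f}, {bounds_b[0]} < p < {bounds_b[1]}")
```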

5. The efficacy of two chemotherapy treatments, either given alone or in combination, is being tested by comparing the mean survival time X (yrs) of k = 3 groups of cancer patients. Summary statistics are shown below.

Treatment 1 only     Treatment 2 only     Both Treatments
n1 = 22              n2 = 21              n3 = 20
x̄1 = 4.0 yrs         x̄2 = 4.0 yrs         x̄3 = 10.3 yrs
s1² = 8.0 yrs²       s2² = 8.3 yrs²       s3² = 8.0 yrs²

Assume that X is normally distributed in each of the k = 3 patient populations from which these samples were obtained, i.e., X1 ~ N(µ1, σ1), X2 ~ N(µ2, σ2), and X3 ~ N(µ3, σ3). Furthermore, because the three sample variances s1², s2², and s3² are so close in value, it is reasonable to assume equivariance of these populations, that is, σ1² = σ2² = σ3². Given these assumptions, answer the following.

(a) Using this information, complete the ANOVA table below, including the F-statistic and corresponding p-value, relative to .05 (i.e., < .05, > .05, or = .05). (20 pts)

Source df SS MS F-ratio p-value

Treatment

Error

Total

Recall that, for the k groups being compared, and pooled sample size n = n1 + n2 + … + nk,

grand mean x̄ = (n1 x̄1 + n2 x̄2 + … + nk x̄k) / n

SSTrt = n1 (x̄1 – x̄)² + n2 (x̄2 – x̄)² + … + nk (x̄k – x̄)², dfTrt = k – 1

SSErr = (n1 – 1) s1² + (n2 – 1) s2² + … + (nk – 1) sk², dfErr = n – k


(b) Test the null hypothesis H0: µ1 = µ2 = µ3 at the α = .05 significance level. Interpret in context: Exactly what conclusion can be inferred in this comparison of the three groups? (5 pts)

(c) Without conducting any other formal statistical tests, what further conclusions (if any) from (b) are informally suggested upon inspection of the original summary statistics, regarding the efficacy of the two treatments? Be as specific as possible. (5 pts)
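The formulas recalled in part (a) translate directly into a few lines of arithmetic. The sketch below fills in the ANOVA quantities from the summary statistics, assuming the equal-variance setup stated in the problem; the critical value 3.15 for F with (2, 60) degrees of freedom is taken from a standard F table, which is not included in this exam.

```python
# Summary statistics for the k = 3 treatment groups
n = [22, 21, 20]
xbar = [4.0, 4.0, 10.3]
s2 = [8.0, 8.3, 8.0]          # sample variances

k, n_total = len(n), sum(n)
grand_mean = sum(ni * xi for ni, xi in zip(n, xbar)) / n_total

# Treatment (between-group) and error (within-group) sums of squares
ss_trt = sum(ni * (xi - grand_mean) ** 2 for ni, xi in zip(n, xbar))
ss_err = sum((ni - 1) * vi for ni, vi in zip(n, s2))
df_trt, df_err = k - 1, n_total - k

ms_trt, ms_err = ss_trt / df_trt, ss_err / df_err
f_ratio = ms_trt / ms_err

# Assumption: F critical value for (2, 60) df at alpha = .05, from a standard F table
f_crit = 3.15

print(f"{'Source':<10}{'df':>4}{'SS':>10}{'MS':>10}{'F-ratio':>10}")
print(f"{'Treatment':<10}{df_trt:>4}{ss_trt:>10.1f}{ms_trt:>10.2f}{f_ratio:>10.2f}")
print(f"{'Error':<10}{df_err:>4}{ss_err:>10.1f}{ms_err:>10.2f}")
print(f"{'Total':<10}{df_trt + df_err:>4}{ss_trt + ss_err:>10.1f}")
print("p-value < .05" if f_ratio > f_crit else "p-value > .05")
```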

6. In a sociological survey, individuals are asked the following question: “If the older partner of a couple is X years old, then what would you estimate to be the youngest ‘socially acceptable’ age Y of the younger partner?” For each of the n = 5 ages of X shown below, the survey responses for Y are averaged, and presented in the following table along with some summary statistics, and corresponding scatterplot.

X   20   30   40   50   60     x̄ = 40   sx² = 250

Y   18   22   25   32   38     ȳ = 27   sy² = 64

(a) Compute the sample covariance sxy. Show all work. (4 pts)

(b) Compute the sample correlation coefficient r. Use it to determine whether or not X and Y are linearly correlated; if so, classify as positive or negative, and as weak, moderate, or strong. (4 pts)

(c) Determine the equation Ŷ = β̂0 + β̂1 X of the least squares regression line for these data. Calculate the fitted response values ŷi, and sketch a graph of this line on the same scatterplot above. Show all work. (12 pts)

(d) Calculate the residuals ei = yi − ŷi, and the residual sum of squares SSError = Σ (yi − ŷi)², where the sum runs over i = 1, …, n. Show all work. How does this value compare with SSError for any other line that estimates the data? Be as precise as possible. (6 pts)

(e) Calculate the sample coefficient of determination r2, and interpret its value in the context of evaluating the fit of this linear model to the sample data. Be as precise as possible. (4 pts)
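The quantities in parts (a) through (e) build on one another, which a short script makes easy to see. This is a sketch using the usual sample (n − 1) definitions of variance and covariance, consistent with the summary statistics given in the table; the variable names are illustrative.

```python
from math import sqrt

x = [20, 30, 40, 50, 60]
y = [18, 22, 25, 32, 38]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n

# Sample variances and covariance (divide by n - 1)
sxx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
syy = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

r = sxy / sqrt(sxx * syy)            # sample correlation coefficient

b1 = sxy / sxx                       # least squares slope
b0 = ybar - b1 * xbar                # least squares intercept

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
ss_error = sum(e ** 2 for e in residuals)

# Coefficient of determination: share of the variation in Y explained by the line
ss_total = (n - 1) * syy
r_squared = 1 - ss_error / ss_total

print(f"sxy = {sxy}, r = {r:.4f}")
print(f"fitted line: Y-hat = {b0} + {b1} X")
print(f"fitted = {fitted}, residuals = {residuals}")
print(f"SS_Error = {ss_error}, r^2 = {r_squared:.4f}")
```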

Chi-squared scores corresponding to selected right-tailed probabilities of the χ²df distribution

[Figure: right-tailed area of the χ² distribution beyond a given χ²-score; horizontal axis starts at 0]

Columns: right-tailed probability. Rows: degrees of freedom (df).

df  1  0.5  0.25  0.10  0.05  0.025  0.010  0.005  0.0025  0.0010  0.0005  0.00025
1  0  0.455  1.323  2.706  3.841  5.024  6.635  7.879  9.141  10.828  12.116  13.412
2  0  1.386  2.773  4.605  5.991  7.378  9.210  10.597  11.983  13.816  15.202  16.588
3  0  2.366  4.108  6.251  7.815  9.348  11.345  12.838  14.320  16.266  17.730  19.188
4  0  3.357  5.385  7.779  9.488  11.143  13.277  14.860  16.424  18.467  19.997  21.517
5  0  4.351  6.626  9.236  11.070  12.833  15.086  16.750  18.386  20.515  22.105  23.681
6  0  5.348  7.841  10.645  12.592  14.449  16.812  18.548  20.249  22.458  24.103  25.730
7  0  6.346  9.037  12.017  14.067  16.013  18.475  20.278  22.040  24.322  26.018  27.692
8  0  7.344  10.219  13.362  15.507  17.535  20.090  21.955  23.774  26.124  27.868  29.587
9  0  8.343  11.389  14.684  16.919  19.023  21.666  23.589  25.462  27.877  29.666  31.427
10  0  9.342  12.549  15.987  18.307  20.483  23.209  25.188  27.112  29.588  31.420  33.221
11  0  10.341  13.701  17.275  19.675  21.920  24.725  26.757  28.729  31.264  33.137  34.977
12  0  11.340  14.845  18.549  21.026  23.337  26.217  28.300  30.318  32.909  34.821  36.698
13  0  12.340  15.984  19.812  22.362  24.736  27.688  29.819  31.883  34.528  36.478  38.390
14  0  13.339  17.117  21.064  23.685  26.119  29.141  31.319  33.426  36.123  38.109  40.056
15  0  14.339  18.245  22.307  24.996  27.488  30.578  32.801  34.950  37.697  39.719  41.699
16  0  15.338  19.369  23.542  26.296  28.845  32.000  34.267  36.456  39.252  41.308  43.321
17  0  16.338  20.489  24.769  27.587  30.191  33.409  35.718  37.946  40.790  42.879  44.923
18  0  17.338  21.605  25.989  28.869  31.526  34.805  37.156  39.422  42.312  44.434  46.508
19  0  18.338  22.718  27.204  30.144  32.852  36.191  38.582  40.885  43.820  45.973  48.077
20  0  19.337  23.828  28.412  31.410  34.170  37.566  39.997  42.336  45.315  47.498  49.632
21  0  20.337  24.935  29.615  32.671  35.479  38.932  41.401  43.775  46.797  49.011  51.173
22  0  21.337  26.039  30.813  33.924  36.781  40.289  42.796  45.204  48.268  50.511  52.701
23  0  22.337  27.141  32.007  35.172  38.076  41.638  44.181  46.623  49.728  52.000  54.217
24  0  23.337  28.241  33.196  36.415  39.364  42.980  45.559  48.034  51.179  53.479  55.722
25  0  24.337  29.339  34.382  37.652  40.646  44.314  46.928  49.435  52.620  54.947  57.217
26  0  25.336  30.435  35.563  38.885  41.923  45.642  48.290  50.829  54.052  56.407  58.702
27  0  26.336  31.528  36.741  40.113  43.195  46.963  49.645  52.215  55.476  57.858  60.178
28  0  27.336  32.620  37.916  41.337  44.461  48.278  50.993  53.594  56.892  59.300  61.645
29  0  28.336  33.711  39.087  42.557  45.722  49.588  52.336  54.967  58.301  60.735  63.104
30  0  29.336  34.800  40.256  43.773  46.979  50.892  53.672  56.332  59.703  62.162  64.555
31  0  30.336  35.887  41.422  44.985  48.232  52.191  55.003  57.692  61.098  63.582  65.999
32  0  31.336  36.973  42.585  46.194  49.480  53.486  56.328  59.046  62.487  64.995  67.435
33  0  32.336  38.058  43.745  47.400  50.725  54.776  57.648  60.395  63.870  66.403  68.865
34  0  33.336  39.141  44.903  48.602  51.966  56.061  58.964  61.738  65.247  67.803  70.289
35  0  34.336  40.223  46.059  49.802  53.203  57.342  60.275  63.076  66.619  69.199  71.706
36  0  35.336  41.304  47.212  50.998  54.437  58.619  61.581  64.410  67.985  70.588  73.118
37  0  36.336  42.383  48.363  52.192  55.668  59.893  62.883  65.739  69.346  71.972  74.523
38  0  37.335  43.462  49.513  53.384  56.896  61.162  64.181  67.063  70.703  73.351  75.924
39  0  38.335  44.539  50.660  54.572  58.120  62.428  65.476  68.383  72.055  74.725  77.319
40  0  39.335  45.616  51.805  55.758  59.342  63.691  66.766  69.699  73.402  76.095  78.709
41  0  40.335  46.692  52.949  56.942  60.561  64.950  68.053  71.011  74.745  77.459  80.094
42  0  41.335  47.766  54.090  58.124  61.777  66.206  69.336  72.320  76.084  78.820  81.475
43  0  42.335  48.840  55.230  59.304  62.990  67.459  70.616  73.624  77.419  80.176  82.851
44  0  43.335  49.913  56.369  60.481  64.201  68.710  71.893  74.925  78.750  81.528  84.223
45  0  44.335  50.985  57.505  61.656  65.410  69.957  73.166  76.223  80.077  82.876  85.591
46  0  45.335  52.056  58.641  62.830  66.617  71.201  74.437  77.517  81.400  84.220  86.954
47  0  46.335  53.127  59.774  64.001  67.821  72.443  75.704  78.809  82.720  85.560  88.314
48  0  47.335  54.196  60.907  65.171  69.023  73.683  76.969  80.097  84.037  86.897  89.670
49  0  48.335  55.265  62.038  66.339  70.222  74.919  78.231  81.382  85.351  88.231  91.022
50  0  49.335  56.334  63.167  67.505  71.420  76.154  79.490  82.664  86.661  89.561  92.371

MARGIN OF ERROR = product of these two factors: CRITICAL VALUE (2-sided)¹ × STANDARD ERROR (estimate)²

One Sample

POPULATION PARAMETER: Null Hypothesis H0: θ = θ0

SAMPLE STATISTIC: Point Estimate θ̂ = f(x1, …, xn)

Mean* H0: μ = μ0

• Point estimate: μ̂ = x̄ = (∑ xi) / n

• Critical value: n ≥ 30: t_{n−1, α/2} or z_{α/2}; n < 30: t_{n−1, α/2} only

• Standard error: any n: s / √n

Proportion H0: π = π0

• Point estimate: π̂ = p = X / n, where X = # Successes

• Critical value: n ≥ 30: z_{α/2} ~ N(0, 1)

• Standard error, n ≥ 30: for Confidence Interval: √(π̂(1 − π̂) / n); for Acceptance Region, p-value: √(π0(1 − π0) / n)

• n < 30: Use X ~ Bin(n, π) (not explicitly covered)

Two Paired Samples³

Two Independent Samples

POPULATION PARAMETER: Null Hypothesis H0: θ1 – θ2 = 0

SAMPLE STATISTIC: Point Estimate θ̂1 − θ̂2

Means* H0: μ1 – μ2 = 0

• Point estimate: x̄1 − x̄2

• Critical value, n1, n2 ≥ 30: t_{n1+n2−2, α/2} or z_{α/2}

• Critical value, n1, n2 < 30: Is σ1² = σ2²? Informal: 1/4 < s1²/s2² < 4? Yes → t_{n1+n2−2, α/2}. No → Satterthwaite’s Test

• Standard error, n1, n2 ≥ 30: √(s1²/n1 + s2²/n2)

• Standard error, n1, n2 < 30: √(s²pooled (1/n1 + 1/n2)), where s²pooled = [(n1 − 1) s1² + (n2 − 1) s2²] / (n1 + n2 − 2)

Proportions H0: π1 – π2 = 0

• Point estimate: π̂1 − π̂2

• Critical value, n1, n2 ≥ 30: z_{α/2} (or use Chi-squared Test)

• Standard error, n1, n2 ≥ 30: for Confidence Interval: √(π̂1(1 − π̂1)/n1 + π̂2(1 − π̂2)/n2); for Acceptance Region, p-value: √(π̂pooled(1 − π̂pooled)(1/n1 + 1/n2)), where π̂pooled = (X1 + X2) / (n1 + n2)

• n1, n2 < 30: Fisher’s Exact Test (not explicitly covered)

k samples (k ≥ 2)

Null Hypothesis H0: θ1 = θ2 = … = θk

• Means H0: μ1 = μ2 = … = μk. Independent: F-test (ANOVA). Dependent (not covered): Repeated Measures, “Blocks”

• Proportions H0: π1 = π2 = … = πk. Independent: Chi-squared Test. Dependent (not covered): Other techniques

¹ For 1-sided hypothesis tests, replace α/2 by α.

² For means, always use the actual standard error if known – either σ / √n or √(σ1²/n1 + σ2²/n2) – with the Z-distribution.

³ For Paired Means: Apply the appropriate one sample test to the pairwise differences D = X – Y. For Paired Proportions: Apply McNemar’s Test, a “matched” version of the 2 × 2 Chi-squared Test.

* If normality is not established, then use a transformation, or a nonparametric Wilcoxon Test on the median(s).
