STAT 301, Spring 2012    Name _______________________________
Lec 2: Ismor Fischer
Discussion Section: Please circle one!
TA: Shixue Liu.....................................................321 (M 12:05) / 322 (M 4:35) / 323 (M 3:30)
    Ju Hee Cho.....................................................324 (M 1:20) / 325 (W 12:05) / 326 (W 1:20)
FINAL EXAM
Instructions: Complete any 5 out of the following 6 problems. You may do the remaining problem for extra credit if you wish.
Please show all work!
Problem  Points  Grade
1        30
2        30
3        30
4        30
5        30
6        30
Total    150
Source: University of Wisconsin, pages.stat.wisc.edu/.../Final_Exam/2012_Spring.pdf
1. According to astronomers, many of the stars that are visible with the naked eye are actually “binary systems,” i.e., two stars that orbit each other around a common center of mass. Generally, the variable X = “Luminosity relative to the sun”∗ of a single star is normally distributed, and that of a binary system is bimodal, consisting of two distinct normal distributions, i.e., X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2). Occasionally however, the primary signal from a single star can display a secondary “false echo,” due to interference. We wish to determine if the true mean luminosity difference is statistically significant (indicating a true binary system) or not (indicating a false image from a single star), using the two independent sample measurements shown below. Answer each of the following. Show all work!
(a) Compute the 95% confidence interval for the true mean difference µ1 – µ2 between the two sources. Show all work! (15 pts)
(b) Calculate the two-sided p-value of this sample, under the null hypothesis H0: µ1 = µ2. Show all work! (10 pts)
(c) Use EACH of your results in (a) and (b) to reach a formal conclusion about whether or not the null hypothesis H0: µ1 = µ2 can be rejected in favor of the alternative HA: µ1 ≠ µ2, at the α = .05 significance level. Interpret in context: What do the findings suggest? (5 pts)
• 95% confidence interval:
• p-value:
∗ The luminosity of the sun is equal to 1, by convention. Thus, for example, a star with luminosity 0.1 is one-tenth as luminous as the sun.
Source       Sample size   Sample mean    Sample std. dev.
Primary      n1 = 400      x̄1 = 0.102    s1 = 0.020
Secondary    n2 = 400      x̄2 = 0.100    s2 = 0.015
[Figure: horizontal axis X = Relative Luminosity, with µ1 and µ2 marked]
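The calculation requested in Problem 1 can be sketched as follows. This is only an illustration, assuming the large-sample Z approach (both samples have n = 400) with the unpooled standard error sqrt(s1²/n1 + s2²/n2); the variable names are chosen for this sketch and are not part of the exam.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics as given for the two sources
n1, xbar1, s1 = 400, 0.102, 0.020
n2, xbar2, s2 = 400, 0.100, 0.015

diff = xbar1 - xbar2
se = sqrt(s1**2 / n1 + s2**2 / n2)    # standard error of the difference

# Assumption: large samples, so the standard normal critical value is used
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96
ci = (diff - z_crit * se, diff + z_crit * se)

z = diff / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SE = {se:.5f}, 95% CI = ({ci[0]:.5f}, {ci[1]:.5f})")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

Under these assumptions the standard error is 0.00125 and the z-score is 1.6, so the interval contains 0 and the p-value (about .11) exceeds .05; both routes lead to the same formal conclusion.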
2. Two locations in Wisconsin are thought to be infested with independent populations of an insect pest species, each of which is normally distributed, i.e., X1 ~ N(µ1, σ1) and X2 ~ N(µ2, σ2). A sample of n1 = 4 baited traps is randomly scattered throughout the first location, and a sample of n2 = 5 baited traps is randomly scattered throughout the second location. After a certain period of time, they are collected, and the number of insects caught in each trap is recorded, as shown below.
(a) Find the 95% confidence interval for the mean difference µ1 – µ2 between the two populations. Show all work! (12 pts)
(b) Calculate the two-sided p-value of this sample, under the null hypothesis H0: µ1 – µ2 = 0. Show all work! (8 pts)
(c) Use EACH of your results in (a) and (b) to reach a formal conclusion about whether or not the null hypothesis H0: µ1 – µ2 = 0 can be rejected in favor of the alternative HA: µ1 – µ2 ≠ 0, at the α = .05 significance level, based on these samples. (5 pts)
• 95% confidence interval:
• p-value:
Interpret this formal conclusion in context: Exactly what has or has not been demonstrated? (5 pts)
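For small samples such as n1 = 4 and n2 = 5, one common approach is a pooled two-sample t procedure with n1 + n2 – 2 = 7 degrees of freedom. The sketch below assumes equal population variances (an assumption that would need checking against the sample variances) and takes the trap counts as arguments, since they do not appear in this transcript. The function name and the returned dictionary are hypothetical, and the critical value 2.365 is the standard tabulated t value for 7 degrees of freedom with upper-tail area .025.

```python
from math import sqrt
from statistics import mean, variance

T_CRIT_DF7 = 2.365  # tabulated t value: df = 7, upper-tail area .025


def pooled_t_summary(sample1, sample2, t_crit):
    """Pooled two-sample t summary (hypothetical helper, equal variances assumed)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: df-weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean(sample1) - mean(sample2)
    t_score = diff / se
    ci = (diff - t_crit * se, diff + t_crit * se)
    return {"df": df, "diff": diff, "se": se, "t": t_score, "ci": ci}
```

Calling `pooled_t_summary(counts_location_1, counts_location_2, T_CRIT_DF7)` with the recorded counts gives the interval for part (a); the two-sided p-value in part (b) would then be bounded by comparing |t| against a t table with 7 degrees of freedom.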
3. A criminal study investigates the rate of handgun-related fatalities in a certain area. In one year, it is found that X1 = 18 out of n1 = 45 violent deaths were handgun-related, versus X2 = 126 out of n2 = 180 violent deaths in the following year.
(a) Determine the p-value under the null hypothesis H0: π1 = π2 of equal death rates between the two years, versus the two-sided alternative HA: π1 ≠ π2, and use it to infer a formal conclusion about the null hypothesis at the α = .05 significance level. (8 pts)
Interpret in context: Based on the sample evidence, exactly what has been shown? Explain.
(4 pts)
(b) Fill in the 2 × 2 contingency table below (including marginal totals). Using the Chi-squared Test statistic, calculate the χ²-score of this experiment, and verify that it is equal to the square of the z-score found in part (a). Show all work!! (15 pts)

                          Year
Handgun-related?     1st      2nd      Total
Yes
No
Total
(c) Suppose it is desired to test specifically for an increase in handgun-related deaths over time.
State the complementary null and alternative hypotheses for the appropriate one-sided test of this study, and determine the associated p-value of the sample data above. (3 pts)
H0:
HA:
p-value =
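The steps in Problem 3 can be sketched as below, assuming the usual pooled two-proportion Z-test under H0: π1 = π2 and the Pearson Chi-squared statistic without continuity correction. For part (c), the sketch reads "an increase over time" as HA: π1 < π2, so the one-sided p-value is the lower-tail area; that reading and all variable names are assumptions of this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Handgun-related deaths out of all violent deaths, by year
x1, n1 = 18, 45
x2, n2 = 126, 180

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)              # common rate under H0
se0 = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se0

p_two_sided = 2 * NormalDist().cdf(-abs(z))
p_one_sided = NormalDist().cdf(z)             # lower tail, for HA: pi1 < pi2

# 2 x 2 table: rows = handgun-related (Yes / No), columns = year (1st / 2nd)
observed = [[x1, x2], [n1 - x1, n2 - x2]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (observed[i][j] - expected) ** 2 / expected

print(f"z = {z:.4f}, z^2 = {z**2:.4f}, chi-squared = {chi2:.4f}")
print(f"two-sided p = {p_two_sided:.6f}, one-sided p = {p_one_sided:.6f}")
```

With these data the pooled proportion is 0.64 and the null standard error is 0.08, giving z = –3.75 and a Chi-squared score of 14.0625, which matches z² as part (b) asks to verify.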
4. (a) The results of a survey of a random sample of n = 120 chocolate-lovers who expressed a preference between dark, milk, and white chocolate are shown below.

Dark Chocolate    Milk Chocolate    White Chocolate
48                40                32

Conduct a Chi-squared Test to determine whether or not the null hypothesis H0: πDark = πMilk = πWhite can be rejected, at the α = .05 significance level. Use the included table to find the closest lower and upper bounds for the p-value (Example: .01 < p < .05). (9 pts)
(b) Suppose the sample data is further divided by gender, via the following 2 × 3 contingency table.

          Dark Chocolate    Milk Chocolate    White Chocolate    Total
Men       12                16                20                 48
Women     36                24                12                 72
Total     48                40                32                 120

Conduct a Chi-squared Test for the two categorical variables “I = Gender (Men / Women)” and “J = Chocolate Preference (Dark / Milk / White)” at the α = .05 significance level. Use the included table to find the closest lower and upper bounds for the p-value (Example: .01 < p < .05). (16 pts)
(c) Interpret: Summarize the results of parts (a) and (b) in context. What has been shown in this
formal analysis of chocolate preference overall, and its relation to gender? Be precise! (5 pts)
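Both Chi-squared tests in Problem 4 can be sketched as below. Part (a) is a goodness-of-fit test with 3 – 1 = 2 degrees of freedom, and part (b) is a test of independence with (2 – 1)(3 – 1) = 2 degrees of freedom. Because both have exactly 2 degrees of freedom, the right-tail area has the closed form exp(–χ²/2); this shortcut is used here instead of the printed table and does not apply to other degrees of freedom. The helper names are hypothetical.

```python
from math import exp


def chi2_stat(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_right_tail_df2(x):
    """Right-tail area of the Chi-squared distribution, valid for df = 2 only."""
    return exp(-x / 2)


# (a) Goodness of fit: equal preference means 120 / 3 = 40 expected per category
totals = [48, 40, 32]
stat_a = chi2_stat(totals, [sum(totals) / 3] * 3)
p_a = chi2_right_tail_df2(stat_a)

# (b) Independence: expected count = (row total)(column total) / (grand total)
table = [[12, 16, 20], [36, 24, 12]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)
obs_flat = [cell for row in table for cell in row]
exp_flat = [r * c / grand for r in row_totals for c in col_totals]
stat_b = chi2_stat(obs_flat, exp_flat)
p_b = chi2_right_tail_df2(stat_b)

print(f"(a) chi-squared = {stat_a:.3f}, p = {p_a:.4f}")
print(f"(b) chi-squared = {stat_b:.3f}, p = {p_b:.4f}")
```

Under these assumptions part (a) gives a score of 3.2, which is not significant at .05, while part (b) gives 11.25, which is significant at .05.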
5. The efficacy of two chemotherapy treatments, either given alone or in combination, is being tested by comparing the mean survival time X (yrs) of k = 3 groups of cancer patients. Summary statistics are shown below.
Treatment 1 only     Treatment 2 only     Both Treatments
n1 = 22              n2 = 21              n3 = 20
x̄1 = 4.0 yrs        x̄2 = 4.0 yrs        x̄3 = 10.3 yrs
s1² = 8.0 yrs²       s2² = 8.3 yrs²       s3² = 8.0 yrs²

Assume that X is normally distributed in each of the k = 3 patient populations from which these samples were obtained, i.e., X1 ~ N(µ1, σ1), X2 ~ N(µ2, σ2), and X3 ~ N(µ3, σ3). Furthermore, because the three sample variances s1², s2², and s3² are so close in value, it is reasonable to assume equivariance of these populations, that is, σ1² = σ2² = σ3². Given these assumptions, answer the following.
(a) Using this information, complete the ANOVA table below, including the F-statistic and corresponding p-value, relative to .05 (i.e., < .05, > .05, or = .05). (20 pts)
Source df SS MS F-ratio p-value
Treatment
Error
Total
Recall that, for the k groups being compared, and pooled sample size n = n1 + n2 + … + nk,
grand mean x̄ = (n1 x̄1 + n2 x̄2 + … + nk x̄k) / n
SSTrt = n1 (x̄1 – x̄)² + n2 (x̄2 – x̄)² + … + nk (x̄k – x̄)², dfTrt = k – 1
SSErr = (n1 – 1) s1² + (n2 – 1) s2² + … + (nk – 1) sk², dfErr = n – k
(b) Test the null hypothesis H0: µ1 = µ2 = µ3 at the α = .05 significance level. Interpret in
context: Exactly what conclusion can be inferred in this comparison of the three groups? (5 pts)
(c) Without conducting any other formal statistical tests, what further conclusions (if any)
from (b) are informally suggested upon inspection of the original summary statistics, regarding the efficacy of the two treatments? Be as specific as possible. (5 pts)
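The ANOVA table in Problem 5 follows directly from the recalled formulas for the grand mean, SSTrt, and SSErr. The sketch below applies them to the summary statistics. The p-value line uses a closed form for the right-tail area of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here; that shortcut and the variable names are additions of this sketch, not part of the exam.

```python
# Summary statistics from the table: (n, sample mean, sample variance)
groups = [(22, 4.0, 8.0), (21, 4.0, 8.3), (20, 10.3, 8.0)]

k = len(groups)
n = sum(ni for ni, _, _ in groups)
grand_mean = sum(ni * xbar for ni, xbar, _ in groups) / n

ss_trt = sum(ni * (xbar - grand_mean) ** 2 for ni, xbar, _ in groups)
ss_err = sum((ni - 1) * var for ni, _, var in groups)
df_trt, df_err = k - 1, n - k

ms_trt = ss_trt / df_trt
ms_err = ss_err / df_err
f_ratio = ms_trt / ms_err

# Right-tail area of F(2, df_err); valid only because df_trt == 2
p_value = (1 + 2 * f_ratio / df_err) ** (-df_err / 2)

print("Source     df      SS        MS      F-ratio")
print(f"Treatment  {df_trt:<6}  {ss_trt:<8.1f}  {ms_trt:<6.1f}  {f_ratio:.2f}")
print(f"Error      {df_err:<6}  {ss_err:<8.1f}  {ms_err:<6.1f}")
print(f"Total      {df_trt + df_err:<6}  {ss_trt + ss_err:<8.1f}")
print("p-value < .05" if p_value < 0.05 else "p-value >= .05")
```

With these inputs the grand mean is 6.0 yrs, SSTrt = 541.8 on 2 degrees of freedom, and SSErr = 486 on 60 degrees of freedom, so the F-ratio is about 33.4 and the p-value is far below .05.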
6. In a sociological survey, individuals are asked the following question: “If the older partner of a couple is X years old, then what would you estimate to be the youngest ‘socially acceptable’ age Y of the younger partner?” For each of the n = 5 ages of X shown below, the survey responses for Y are averaged, and presented in the following table along with some summary statistics, and corresponding scatterplot.
X    20   30   40   50   60     x̄ = 40    sx² = 250
Y    18   22   25   32   38     ȳ = 27    sy² = 64
(a) Compute the sample covariance sxy. Show all work. (4 pts)
(b) Compute the sample correlation coefficient r. Use it to determine whether or not X and Y are linearly correlated; if so, classify as positive or negative, and as weak, moderate, or strong. (4 pts)
(c) Determine the equation Ŷ = β̂0 + β̂1 X of the least squares regression line for these data. Calculate the fitted response values ŷi, and sketch a graph of this line on the same scatterplot above. Show all work. (12 pts)
(d) Calculate the residuals ei = yi – ŷi, and the residual sum of squares SSError = Σ (yi – ŷi)², summed over i = 1, …, n. Show all work. How does this value compare with SSError for any other line that estimates the data? Be as precise as possible. (6 pts)
(e) Calculate the sample coefficient of determination r2, and interpret its value in the context of evaluating the fit of this linear model to the sample data. Be as precise as possible. (4 pts)
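The quantities in parts (a) through (e) of Problem 6 can be computed step by step as sketched below, using the standard formulas sxy = Σ (xi – x̄)(yi – ȳ) / (n – 1), r = sxy / (sx sy), β̂1 = sxy / sx², and β̂0 = ȳ – β̂1 x̄. The variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import mean

x = [20, 30, 40, 50, 60]
y = [18, 22, 25, 32, 38]
n = len(x)
x_bar, y_bar = mean(x), mean(y)

# Sample variances and covariance (divisor n - 1)
s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

r = s_xy / sqrt(s_xx * s_yy)              # (b) correlation coefficient

b1 = s_xy / s_xx                          # (c) slope
b0 = y_bar - b1 * x_bar                   #     intercept
fitted = [b0 + b1 * xi for xi in x]

residuals = [yi - fi for yi, fi in zip(y, fitted)]   # (d)
ss_error = sum(e ** 2 for e in residuals)

r_squared = r ** 2                        # (e) coefficient of determination

print(f"s_xy = {s_xy}, r = {r:.4f}")
print(f"Y-hat = {b0} + {b1} X, fitted = {fitted}")
print(f"residuals = {residuals}, SS_Error = {ss_error}, r^2 = {r_squared:.4f}")
```

The computed variances reproduce the given sx² = 250 and sy² = 64, which is a useful check on the data entry. As a further check, r² should equal 1 – SSError / ((n – 1) sy²), the proportion of the total variation in Y accounted for by the fitted line.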
[Table: Chi-squared scores corresponding to selected right-tailed probabilities (right-tailed areas) of the χ² distribution]
Means: H0: μ1 = μ2 = … = μk; F-test (ANOVA); Repeated Measures, “Blocks”
Proportions: H0: π1 = π2 = … = πk; Chi-squared Test; Other techniques
1 For 1-sided hypothesis tests, replace α/2 by α.
2 For means, always use the actual standard error if known, either σ/√n or √(σ1²/n1 + σ2²/n2), with the Z-distribution.
3 For Paired Means: Apply the appropriate one sample test to the pairwise differences D = X – Y. For Paired Proportions: Apply McNemar’s Test, a “matched” version of the 2 × 2 Chi-squared Test.
* If normality is not established, then use a transformation, or a nonparametric Wilcoxon Test on the median(s).