STAT 2300: Cumulative Review Spring 2022

Mar 08, 2023

Khang Minh

Note that this cumulative review is in no way meant to be reflective of the length and/or difficulty of the Final Exam. This review cannot be guaranteed to be inclusive or exclusive of all that is covered on the Final Exam; it is merely meant to aid you in the review process.

PART I: Multiple Choice. Circle the letter corresponding to the best answer.

1. After an airplane security scare on Christmas day, 2009, the Gallup organization surveyed 542 American air travelers about increased security measures at airports. They found that 78% of these air travelers were in favor of United States airports using full-body-scan imaging on airline passengers. Identify the sample and the population in this study.

(A) The sample is the 542 American air travelers who were surveyed and the population is all American air travelers.

(B) The population is the 542 American air travelers who were surveyed and the sample is all American air travelers.

(C) The sample is the 78% of air travelers in favor of the full-body-scan imaging and the population is the 542 surveyed American air travelers.

(D) The sample is the 78% of air travelers in favor of the full-body-scan imaging and the population is all American air travelers in favor of the full-body-scan imaging.

2. Officials at a metropolitan transit authority want to get input from people who use a certain bus route about a possible change in the schedule. They randomly select 5 buses during a certain week and poll all riders on those buses about the change. What type of sampling method is this?

(A) Stratified sampling

(B) Cluster sampling

(C) Systematic sampling

(D) Convenience sampling

3. A medical researcher wants to determine whether exercising can lower blood pressure. At a health fair, he measures the blood pressure of 100 individuals, and interviews them about their exercise habits. He divides the individuals into two categories: those whose typical level of exercise is low, and those whose level of exercise is high. Those in the low-exercise group had considerably higher blood pressure, on average, than those in the high-exercise group. Can the researcher conclude that exercise causes blood pressure to decrease?

(A) Yes, a causal relationship can be established because this is an observational study.

(B) No, a causal relationship cannot be established because this is an observational study.

(C) Yes, a causal relationship can be established because this is a well-designed experiment.

(D) No, a causal relationship cannot be established because this is a well-designed experiment.

4. Based on results from the 2013 Consumer Expenditure Survey of 30,000 American consumers, the U.S. Bureau of Labor Statistics reported that Americans spent an average of $219 per month on dining out. Which of the following represents the appropriate symbols for the statistic and the parameter of interest in this study?

(A) The statistic is $\hat{p}$ and the parameter is $p$.

(B) The statistic is $p$ and the parameter is $\hat{p}$.

(C) The statistic is $\bar{x}$ and the parameter is $\mu$.

(D) The statistic is $\mu$ and the parameter is $\bar{x}$.

5. Which of the following summary statistics would be greatly affected by outliers?

I. IQR

II. Range

III. Median

(A) I only

(B) II only

(C) I and II

(D) I and III

6. The JMP output below displays selected summary statistics for a sample of 35 observations.

Which of the following statements is most consistent with this information?

(A) A histogram of the observations will be skewed right.

(B) A histogram of the observations will be skewed left.

(C) A histogram of the observations will be approximately normal.

(D) A histogram of the observations will be bimodal.

7. A recent Nielsen survey asked Americans whether they did the majority of their Christmas shopping in stores, on-line, or did roughly an equal amount of shopping in stores and on-line. Which of the following would be the most appropriate graphical display for the results of this survey?

(A) A histogram

(B) A stem-and-leaf plot

(C) A bar chart

(D) None of the above would be an appropriate graph for these survey results

8. For a random sample of 62 bank tellers, the correlation between their weekly wages and months of employment with their current company is r = 0.67. Which of the following is the best interpretation of this correlation coefficient?

(A) As length of employment increases, weekly wages tend to increase for bank tellers.

(B) As length of employment increases, weekly wages tend to decrease for bank tellers.

(C) Working longer for one company guarantees that a bank teller’s wages will increase.

(D) Working longer for one company guarantees that a bank teller’s wages will decrease.

9. Data on x = weight of a pickup truck (in pounds) and y = distance (in feet) required for a truck traveling 30 miles per hour to come to a complete stop for 16 trucks was used to fit the least squares regression line, $\hat{y} = 23 + 0.02x$. Which of the following statements is the correct interpretation of the intercept of this regression line?

(A) The predicted stopping distance for a truck weighing 0 pounds is 23 feet.

(B) The predicted stopping distance for a truck weighing 23 pounds is 0 feet.

(C) The predicted stopping distance for a truck weighing 23 pounds is 0.02 feet.

(D) It is not reasonable to interpret the intercept in this setting because a weight of 0 is outside the range of data used to fit the regression line.

10. To determine how the mileage of a used car is related to its selling price, a used car dealer took a random sample of 100 used cars of the same make and model sold by dealerships during the past month. Each car was 3 years old, in good condition, and equipped with all the features that come standard with this car. Based on this sample, the regression equation for predicting the price ($1,000) of this used car from the miles (thousands) on the odometer is $\hat{y} = 17.142 - 0.065x$. Which of the following gives the best interpretation of the slope of this regression line?

(A) For each additional 1,000 miles on the odometer, predicted selling price increases by $17.

(B) For each additional 1,000 miles on the odometer, predicted selling price decreases by $65.

(C) For each additional $1,000 in selling price, predicted miles on the odometer increases by 17.

(D) For each additional $1,000 in selling price, predicted miles on the odometer decreases by 65.

11. When manufacturing a certain smartphone, defects occur at different rates depending on the type of defect. Specifically, screen-related defects occur in 1% of phones, defects related to the charging port occur in 2% of phones, and both defects occur in 0.5% of phones. What is the probability a phone has a screen-related defect, given that it has a defect related to the charging port?

(A) 0.0300

(B) 0.0302

(C) 0.2000

(D) 0.2500
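
One way to check an answer to question 11 is to apply the definition of conditional probability, P(A | B) = P(A and B) / P(B), directly to the stated rates. The sketch below is only an illustration; the variable names are made up for this example and are not part of the review.

```python
# Rates stated in question 11, written as proportions of all phones.
p_screen = 0.01   # P(screen-related defect)
p_port = 0.02     # P(charging-port defect)
p_both = 0.005    # P(both defects)

# Conditional probability: P(screen | port) = P(screen and port) / P(port).
p_screen_given_port = p_both / p_port

# General addition rule, for comparison: P(screen or port).
p_either = p_screen + p_port - p_both

print(round(p_screen_given_port, 4), round(p_either, 4))
```

The same addition rule applies to any "either A or B" probability when the overlap P(A and B) is known.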

12. Suppose it has been determined that 16% of all homes with refrigerators own a Brand A refrigerator, while 12% of all homes with refrigerators own Brand B. Also, 2% of homes with refrigerators have both brands. If one home with a refrigerator is selected at random, what is the approximate probability that they will own either Brand A or Brand B?

(A) 0.26

(B) 0.28

(C) 0.30

(D) 0.74

13. A local company is interested in supporting environmentally friendly initiatives such as carpooling among employees. The company surveyed all of the 200 employees at the downtown offices. Employees responded as to whether or not they own a car and to the location of the home where they live. The results are shown in the table below.

Location of Home (columns) by Car Ownership (rows)

Car Ownership    Downtown Area In the City    Elsewhere In the City    Outside The City    Total
Yes              10                           15                       35                  60
No               60                           55                       25                  140
Total            70                           70                       60                  200

Which of the following statements about a randomly chosen person from among these 200 employees is true?

(A) If the person owns a car, he or she is more likely to live elsewhere in the city than to live in the downtown area in the city.

(B) If the person does not own a car, he or she is more likely to live outside the city than to live in the city (downtown area or elsewhere).

(C) The person is more likely to live in the downtown area in the city than elsewhere in the city.

(D) The person is more likely to own a car than to not own a car.

14. For two events A and B, P(A) = 0.40, P(B) = 0.30, and P(A or B) = 0.58. Which of the following statements must be true?

(A) A and B are independent, but not mutually exclusive.

(B) A and B are mutually exclusive, but not independent.

(C) A and B are both mutually exclusive and independent.

(D) A and B are neither mutually exclusive nor independent.

15. Which of the following is an example of a discrete random variable?

(A) The time between oil changes in a car

(B) The time it takes to finish a 60-minute exam

(C) A person’s height in centimeters

(D) The number of students passing an exam in a class of 35

16. An Olympic archer has a 65% probability of hitting a bulls-eye on a shot. If this archer attempts seven shots at the target, what is the probability of making at least six out of seven attempts?

(A) 0.0490

(B) 0.1848

(C) 0.2338

(D) 0.7662

17. About 106,000 people attended the Reagan funeral at the Reagan Library. If the total amount of time people spent waiting in line followed a normal distribution with mean 6.5 hours and standard deviation 0.85 hours, approximately what proportion of attendees spent more than 8 hours in line?

(A) 0.0392

(B) 0.3023

(C) 0.4608

(D) 0.9608

18. The scores on a standardized test are normally distributed with μ = 1000 and σ = 250. What score would be necessary to score at the 85th percentile?

(A) 740

(B) 1075

(C) 1215

(D) 1260
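
The probability calculations in questions 16 to 18 can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative assumptions, and the exact values it produces may differ slightly from answers obtained with a printed z table, where z is rounded to two decimal places.

```python
from math import comb
from statistics import NormalDist

# Question 16: X ~ Binomial(n = 7, p = 0.65); P(X >= 6) = P(X = 6) + P(X = 7).
n, p = 7, 0.65
p_at_least_six = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in (6, 7))

# Question 17: waiting time ~ Normal(mean 6.5, sd 0.85); P(time > 8 hours).
wait = NormalDist(mu=6.5, sigma=0.85)
p_over_8_hours = 1 - wait.cdf(8)

# Question 18: score at the 85th percentile of Normal(mean 1000, sd 250).
scores = NormalDist(mu=1000, sigma=250)
score_85th = scores.inv_cdf(0.85)

print(round(p_at_least_six, 4), round(p_over_8_hours, 4), round(score_85th, 1))
```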

19. An automobile insurer has found that repair claims have a distribution that is skewed right with a mean of $920 and a standard deviation of $870. A sample of 100 repair claims is taken at random. The average of the repair claims in the sample would follow what distribution?

(A) A distribution that is skewed right with mean $920 and standard deviation $870.

(B) A distribution that is skewed right with mean $920 and standard deviation $87.

(C) An approximately normal distribution with mean $920 and standard deviation $870.

(D) An approximately normal distribution with mean $920 and standard deviation $87.

20. A mortgage company gathers data concerning applications for mortgages and has determined that a 95% confidence interval for the proportion of applicants for a new mortgage who are approved is (0.62, 0.72). Which of the following statements gives a valid interpretation of the 95% level of confidence?

(A) There is a 95% chance the sample proportion of approved applications for a new mortgage is between 0.62 and 0.72.

(B) There is a 95% chance the population proportion of approved applications for a new mortgage is between 0.62 and 0.72.

(C) If the procedure were repeated many times, 95% of the resulting sample proportions would be between 0.62 and 0.72.

(D) If the procedure were repeated many times, 95% of the resulting confidence intervals would contain the population proportion of applicants approved for a new mortgage.

21. A polling agency wants to estimate the percentage of voters in favor of extending tax cuts, and it wants to provide a margin of error of no more than 1.8 percentage points. Using 95% confidence, how many respondents must the agency poll?

(A) Cannot be determined from the given information.

(B) 30

(C) 564

(D) 2965

22. Suppose that the producers of a certain brand of peanut butter want to estimate the mean amount of fat in a 16-ounce jar. In a randomly selected sample of 25 jars, the mean amount was 165.6 grams, with a sample standard deviation of 38.8 grams. Which of the following would be a 99% confidence interval for the mean amount of fat for a 16-ounce jar, assuming fat content is normally distributed?

(A) 165.6 ± 19.34

(B) 165.6 ± 19.98

(C) 165.6 ± 21.63

(D) 165.6 ± 21.70
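
The sample-size and margin-of-error arithmetic behind questions 21 and 22 can be sketched as follows. This is an illustration under stated assumptions: the conservative planning value p = 0.5 is assumed for question 21 because no prior estimate is given, and the t critical value 2.797 (99% confidence, 24 degrees of freedom) is taken from a t table and hard-coded, since the Python standard library has no t quantile function.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Question 21: sample size for estimating a proportion.
z_star = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
margin = 0.018                         # 1.8 percentage points
p_plan = 0.5                           # assumed conservative planning value
n_required = ceil((z_star / margin) ** 2 * p_plan * (1 - p_plan))

# Question 22: margin of error for a one-sample t interval.
t_star = 2.797                         # t table value, 99% confidence, df = 24
n, x_bar, s = 25, 165.6, 38.8
margin_of_error = t_star * s / sqrt(n)

print(n_required, round(margin_of_error, 2))
```

Rounding the sample size up rather than to the nearest integer keeps the margin of error at or below the target.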

23. A large simple random sample of people aged nineteen to thirty living in the state of Colorado was surveyed to determine which of two MP3 players just developed by a new company was preferred. To which of the following populations can the results of this survey be safely generalized?

(A) Only people aged nineteen to thirty living in the state of Colorado who were in the survey

(B) Only people aged nineteen to thirty living in the state of Colorado

(C) All people living in the state of Colorado

(D) Only people aged nineteen to thirty living in the United States

24. Are young women delaying marriage and marrying at a later age? This question was addressed in a report issued by the Census Bureau. The report stated that in 1970 the mean age of brides marrying for the first time was 20.8 years. In 2014, a random sample of brides marrying for the first time had a mean age of 23.9 and a standard deviation of 6.4. A hypothesis test was conducted to determine if there is sufficient evidence to support the claim that women are now marrying later in life. The hypotheses tested were:

$H_0: \mu = 20.8$
$H_1: \mu > 20.8$

For these hypotheses, what does $\mu$ represent?

(A) The sample mean age of the 100 brides marrying for the first time.

(B) The population mean age of all brides marrying for the first time in 1970.

(C) The population mean age of all brides marrying for the first time in 2014.

(D) The population proportion of brides that are marrying for the first time in 1970.

25. The null hypothesis of a test is $H_0: \mu = 70$, and the alternative hypothesis is $H_1: \mu > 70$. A sample has been taken, and based on the results, a Type I error has been committed. Which of the following best describes a scenario in which this occurred?

(A) The sample mean was observed to be near 70, but the true mean was greater than 70.

(B) The sample mean was observed to be significantly greater than 70, but the true mean was 70.

(C) The sample mean was observed to be significantly less than 70, but the true mean was 70.

(D) The sample mean was observed to be near 70, and the true mean was greater than 70.

26. In a hypothesis test for a proportion, the null hypothesis is $H_0: p = 0.7$ and the alternative hypothesis is $H_1: p > 0.7$. The test statistic is computed and is $z_0 = 1.68$. For which levels of the test would rejection of the null hypothesis occur?

(A) Reject for $\alpha = 0.10$, $\alpha = 0.05$, and for $\alpha = 0.01$

(B) Reject for $\alpha = 0.10$ and $\alpha = 0.05$, but not for $\alpha = 0.01$

(C) Reject for $\alpha = 0.10$, but not for $\alpha = 0.05$ nor $\alpha = 0.01$

(D) Fail to reject for $\alpha = 0.10$, $\alpha = 0.05$, and for $\alpha = 0.01$
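
The decision rule in question 26 compares a p-value with each significance level. A minimal sketch of that comparison for an upper-tailed z test is given below; the names `z0`, `p_value`, and `decisions` are illustrative only.

```python
from statistics import NormalDist

# Upper-tailed test (H1: p > 0.7) with test statistic z0 = 1.68.
z0 = 1.68
p_value = 1 - NormalDist().cdf(z0)   # area to the right of z0

# Reject H0 at level alpha whenever the p-value is below alpha.
decisions = {alpha: p_value < alpha for alpha in (0.10, 0.05, 0.01)}

print(round(p_value, 4), decisions)
```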

27. To test the null hypothesis $H_0: \mu = 5$ against the alternative hypothesis $H_1: \mu < 5$, a random sample of 35 observations from the population of interest was obtained. The JMP output below displays the results of the hypothesis test.

Which of the following statements gives the best interpretation of the p-value for this test?

(A) If the population mean is equal to 5, the probability of obtaining a test statistic of -1.956 or less is 0.0587.

(B) If the population mean is equal to 5, the probability of obtaining a test statistic of -1.956 or less is 0.0294.

(C) If the population mean is less than 5, the probability of obtaining a test statistic of -1.956 or less is 0.0587.

(D) If the population mean is less than 5, the probability of obtaining a test statistic of -1.956 or less is 0.0294.

28. A dog food company wishes to test a new high-protein formula for puppy food to determine whether it promotes faster weight gain than the existing formula for puppy food. Two puppies were selected from each of seven different litters of Labrador puppies for this experiment. One puppy from each litter was randomly assigned to the high-protein formula puppy food and the other puppy from the same litter to the existing formula puppy food. The weight gains (in pounds) after six months are shown in the following table.

Litter                  1    2    3    4    5    6    7
High-Protein Formula    32   29   27   32   28   30   33
Existing Formula        28   32   22   25   29   25   31

Which of the following best describes this experiment?

(A) This experiment involved only one sample.

(B) This experiment involved paired samples.

(C) This experiment involved two independent samples.

(D) This experiment involved both paired samples and independent samples.

29. Laptop magazine conducted an experiment to determine which of two smartphone models, the Samsung Galaxy S5 or the Apple iPhone 6 Plus, had the longer battery life. They obtained a random sample of 8 of the Samsung phones and a random sample of 8 of the Apple phones. They subjected each of the 16 phones to their battery test, which involves continuous web surfing over 4G LTE. The results are summarized in the table below, with battery lifetimes given in hours.

Samsung Galaxy S5      10.70   10.65   10.00   11.00   10.65   10.25   10.35   10.75
Apple iPhone 6 Plus    10.15    9.50   10.35   10.45    9.90   10.35   10.00    9.90

Which of the following best describes this experiment?

(A) This experiment involved only one sample.

(B) This experiment involved paired samples.

(C) This experiment involved two independent samples.

(D) This experiment involved both paired samples and independent samples.

30. Researchers wondered whether maintaining a patient's body temperature close to normal by heating the patient during surgery would decrease wound infection rates, which would in turn decrease the length of their hospital stay. Patients were assigned at random to two groups: the normothermic (N) group (patients' core temperatures were maintained at near normal, 98.6°F, with heating blankets) and the hypothermic (H) group (patients' core temperatures were allowed to decrease to about 94.1°F). Data on the length of these patients' hospital stays (in days) were used to test the hypotheses

$H_0: \mu_N = \mu_H$
$H_1: \mu_N < \mu_H$.

The resulting JMP output is shown below.

At the 1% significance level, which of the following is the appropriate conclusion?

(A) There is insufficient evidence to conclude that heating a patient during surgery has no effect on the mean length of hospital stay.

(B) There is insufficient evidence to conclude that heating a patient during surgery decreases the mean length of hospital stay.

(C) There is sufficient evidence to conclude that heating a patient during surgery has no effect on the mean length of hospital stay.

(D) There is sufficient evidence to conclude that heating a patient during surgery decreases the mean length of hospital stay.

31. A 90% confidence interval for $\mu_1 - \mu_2$ is (24, 30). Which of the following could be the 95% confidence interval calculated from the same data?

(A) (−30, −24)

(B) (21, 29)

(C) (23, 31)

(D) (26, 28)

Use the following information to answer questions 32 – 33. From a random sample of 125 workers from Company A, 35 admitted to using sick leave when they weren't really ill. From a random sample of 68 workers from Company B, 17 admitted to using sick leave when they weren't really ill.

32. Which of the following is not a condition for constructing a 95% confidence interval for the difference in the proportion of workers at the two companies who would admit to using sick leave when they weren't really ill?

(A) Both populations are normally distributed

(B) The data come from two independent samples

(C) Both samples were chosen at random

(D) Both $n_A\hat{p}_A(1 - \hat{p}_A)$ and $n_B\hat{p}_B(1 - \hat{p}_B)$ are more than 10

33. Which of the following expressions gives a 90% confidence interval for the difference in the proportion of workers at the two companies who would admit to using sick leave when they weren't really ill?

(A) $0.03 \pm 0.05\sqrt{\dfrac{(0.28)(0.72)}{125} + \dfrac{(0.25)(0.75)}{68}}$

(B) $0.03 \pm 1.645\sqrt{\dfrac{(0.28)(0.72)}{125} + \dfrac{(0.25)(0.75)}{68}}$

(C) $0.03 \pm 1.645\sqrt{\dfrac{(0.269)(0.731)}{125} + \dfrac{(0.269)(0.731)}{68}}$

(D) $0.03 \pm 1.96\sqrt{\dfrac{(0.28)(0.72)}{125} + \dfrac{(0.25)(0.75)}{68}}$
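
For the sick-leave data in questions 32 and 33, a two-sample interval for a difference in proportions can be computed as sketched below. This is an illustration using the unpooled standard error that is standard for confidence intervals; the variable names are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from the setup for questions 32 and 33.
x_a, n_a = 35, 125   # Company A: admitted, sample size
x_b, n_b = 17, 68    # Company B: admitted, sample size
p_hat_a, p_hat_b = x_a / n_a, x_b / n_b

# Unpooled standard error of the difference in sample proportions.
se = sqrt(p_hat_a * (1 - p_hat_a) / n_a + p_hat_b * (1 - p_hat_b) / n_b)

# Critical value for 90% confidence (5% in each tail), about 1.645.
z_star = NormalDist().inv_cdf(0.95)

diff = p_hat_a - p_hat_b
interval = (diff - z_star * se, diff + z_star * se)

print(round(diff, 2), tuple(round(end, 4) for end in interval))
```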

34. For their final project, a group of statistics students at a large university investigated their belief that females at their school text more than males at their school. They asked a random sample of 110 students – 50 males (M) and 60 females (F) – from their school to record the number of text messages sent and received over a two-day period. Histograms of their data are shown below.

[Figure: histograms of the number of text messages, one panel for Males and one for Females]

Would it be appropriate to use this data to perform a two-sample mean t-test addressing these students' belief?

(A) No, because the histograms are quite skewed indicating it is not reasonable to assume that the populations are normally distributed.

(B) No, because the data are not from independent random samples.

(C) No, because $n_M\hat{p}_M(1 - \hat{p}_M)$ and $n_F\hat{p}_F(1 - \hat{p}_F)$ are not both more than 10.

(D) Yes, because all of the conditions for this inference procedure have been met.

35. The general manager of a grocery store suspects that female grocery shoppers bring their own reusable bags more often than male grocery shoppers. To test his theory, he selects a random sample of 90 female grocery shoppers and 70 male grocery shoppers. It was found that 35 of the females and 18 of the males brought their own reusable bags. A test of the hypotheses $H_0: p_F = p_M$ versus $H_1: p_F > p_M$ yields a p-value of .04. Which of the following is the best interpretation of this p-value?

(A) There is a 4% chance that female grocery shoppers bring their own reusable bags more often than male grocery shoppers.

(B) There is a 4% chance that there is no difference in the proportion of females and males that bring their own reusable bags.

(C) Assuming there is no difference in the proportion of female and male grocery shoppers who bring their own reusable bags, there is a 4% chance of obtaining a sample result like this or more extreme.

(D) Assuming that female grocery shoppers bring their own reusable bags more often than male grocery shoppers, there is a 4% chance of obtaining a sample result like this or more extreme.

Use the following information to answer questions 36 – 37. The U.S. Department of Agriculture (USDA) conducted a survey to determine if the average price of wheat increased from July (J) to September (S) of the same year. Independent random samples of wheat producers were selected for each of the two months. Given below are summary statistics on the reported price of wheat from the selected producers, in dollars per bushel.

Month n Mean Std Dev

July 90 $2.95 $0.22

September 45 $3.61 $0.19

Do the data give convincing evidence that the average price of wheat increased from July to September?

36. Which of the following represents the appropriate null and alternative hypotheses?

(A) $H_0: \mu_S = \mu_J$; $H_1: \mu_S > \mu_J$

(B) $H_0: p_S = p_J$; $H_1: p_S > p_J$

(C) $H_0: \bar{x}_S = \bar{x}_J$; $H_1: \bar{x}_S > \bar{x}_J$

(D) $H_0: \mu_S = \$2.95$; $H_1: \mu_S > \$2.95$

37. Which of the following is the test statistic for the hypothesis test?

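(A) $\dfrac{3.61 - 2.95}{0.19 / \sqrt{45}}$

(B) $\dfrac{3.61 - 2.95}{\dfrac{0.19}{\sqrt{45}} + \dfrac{0.22}{\sqrt{90}}}$

(C) $\dfrac{3.61 - 2.95}{\sqrt{\dfrac{0.19}{45} + \dfrac{0.22}{90}}}$

(D) $\dfrac{3.61 - 2.95}{\sqrt{\dfrac{0.19^2}{45} + \dfrac{0.22^2}{90}}}$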
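
For the wheat-price summary statistics used in questions 36 and 37, the value of a two-sample t statistic with unpooled variances can be computed as sketched below. The unpooled (Welch) form is an assumption of this sketch, chosen because the two samples are independent; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the table (dollars per bushel).
n_j, mean_j, sd_j = 90, 2.95, 0.22   # July
n_s, mean_s, sd_s = 45, 3.61, 0.19   # September

# Standard error of the difference in sample means, unpooled variances.
se = sqrt(sd_s**2 / n_s + sd_j**2 / n_j)

# Test statistic for H1: mean price in September exceeds mean price in July.
t_stat = (mean_s - mean_j) / se

print(round(se, 4), round(t_stat, 2))
```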

Use the following information to answer questions 38 – 39. A researcher conducted a paired samples study to investigate whether local car dealers tend to charge women more than men for the same car model. Using information from the county tax collector's records, the researcher randomly selected one man and one woman from among everyone who had purchased the same model of an identically equipped car from the same dealer. The process was repeated for a total of 6 randomly selected car models. The purchase prices and the differences, d = woman – man, are shown in the table below. Selected summary statistics are also shown.

Car Model     1          2          3          4          5          6          Mean          Std Dev
Women         $20,100    $17,400    $32,500    $17,710    $29,600    $46,300    $27,268.33    $11,270.69
Men           $19,580    $17,500    $32,300    $17,720    $28,300    $45,630    $26,838.33    $11,028.37
Difference    $520       −$100      $200       −$10       $1,300     $670       $430.00       $519.62

38. Which of the following is the appropriate alternative hypothesis for a matched pairs t-test to address the researcher's question of interest?

(A) $\mu_d > 0$

(B) $\bar{x}_d > 0$

(C) $\mu_d < 0$

(D) $\bar{x}_d < 0$

39. Which of the following expressions gives the value of the test statistic for a matched pairs t-test to address the researcher's question of interest?

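(A) $\dfrac{430 - 0}{519.62 / \sqrt{6}}$

(B) $\dfrac{0 - 430}{519.62 / \sqrt{6}}$

(C) $\dfrac{430 - 0}{519.62 / \sqrt{12}}$

(D) $\dfrac{(27{,}268.33 - 26{,}838.33) - 0}{\sqrt{\dfrac{11{,}270.69^2}{6} + \dfrac{11{,}028.37^2}{6}}}$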
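
The summary statistics for the differences in the car-price table, and the resulting matched pairs t statistic, can be reproduced from the raw prices. The sketch below is illustrative only; the list and variable names are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Purchase prices for the 6 car models, from the table (dollars).
women = [20100, 17400, 32500, 17710, 29600, 46300]
men = [19580, 17500, 32300, 17720, 28300, 45630]

# Paired differences, d = woman - man.
diffs = [w - m for w, m in zip(women, men)]
d_bar = mean(diffs)    # sample mean of the differences
s_d = stdev(diffs)     # sample standard deviation of the differences

# Matched pairs t statistic for H0: mu_d = 0, with df = n - 1.
t_stat = (d_bar - 0) / (s_d / sqrt(len(diffs)))

print(d_bar, round(s_d, 2), round(t_stat, 3))
```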

Part II: Free Response.

1. Caffeine, a chemical found in many popular beverages, is known for reducing fatigue. A student wanted to investigate the caffeine content in popular beverages, such as soft drinks, energy drinks, tea, and coffee. The following data collected by the student show the amounts of caffeine (in milligrams per 12-ounce serving) for twelve popular beverages.

72 55 34 45 38 70 7.5 165 80 105 40 35

(a) Are there any outliers in the data set? Show your calculations.

(b) Construct a boxplot of the amounts of caffeine found in the 12 beverages.

(c) Use the graph in part (b) to write a few sentences describing the distribution of caffeine content for the 12 beverages.

(d) A 12-ounce cup of one popular gourmet coffee contains over 300 milligrams of caffeine. If this value was added to the data set of 12 numbers above, how would the mean and median of the data set above compare with the mean and median of the new data set with the 13 numbers? Explain how this comparison could be made without performing any calculations.
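
The outlier check in part (a) is commonly done with the 1.5 × IQR rule. The sketch below assumes that rule and assumes quartiles are computed as the medians of the lower and upper halves of the sorted data; other quartile conventions exist and can give slightly different fences. The variable names are illustrative.

```python
from statistics import median

# Caffeine amounts (mg per 12-ounce serving) from the problem.
caffeine = [72, 55, 34, 45, 38, 70, 7.5, 165, 80, 105, 40, 35]

data = sorted(caffeine)
half = len(data) // 2          # this split assumes an even number of values

q1 = median(data[:half])       # median of the lower half
q3 = median(data[half:])       # median of the upper half
iqr = q3 - q1

# Fences for the 1.5 * IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr, lower_fence, upper_fence, outliers)
```

The five-number summary needed for the boxplot in part (b) is `data[0]`, `q1`, `median(data)`, `q3`, and `data[-1]`.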

2. Many manufacturers have quality control programs that include inspection of incoming materials for defects. Suppose that a computer manufacturer receives hard drives in bundles of five. From one particular bundle of five hard drives (D1, D2, D3, D4, D5), two will be selected at random for inspection.

(a) List the ten possible outcomes for the two hard drives selected for inspection.

(b) Suppose that D1 and D2 are the only defective hard drives in the bundle of five. Define X to be the number of defective hard drives among the two inspected. Determine the value of X for each of the outcomes listed in part (a) and write it above each outcome.

(c) If the two hard drives were randomly selected from among the five for inspection, then each of the outcomes in part (a) is equally likely. Use this information to find the probability distribution of X, and give the distribution in table format.

(d) Find the expected number of defective hard drives among the two inspected.
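
The enumeration in parts (a) through (d) can be checked by listing the outcomes programmatically. The following is a sketch under the assumptions stated in the problem (D1 and D2 defective, all ten pairs equally likely); the names used are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

drives = ["D1", "D2", "D3", "D4", "D5"]
defective = {"D1", "D2"}

# (a) All unordered pairs of drives that could be inspected.
outcomes = list(combinations(drives, 2))

# (b) X = number of defective drives in each pair.
counts = Counter(sum(d in defective for d in pair) for pair in outcomes)

# (c) Equally likely outcomes, so P(X = x) = (pairs with X = x) / (all pairs).
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# (d) Expected value: sum of x * P(X = x).
expected_x = sum(x * prob for x, prob in distribution.items())

print(len(outcomes), distribution, expected_x)
```

Using `Fraction` keeps the probabilities exact, which makes it easy to confirm that they sum to 1.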

3. Trains carry bauxite ore from a mine in Canada to an aluminum processing plant in northern New York state in hopper cars. Filling equipment is used to load ore into the hopper cars. When functioning properly, the actual weights of ore loaded into each car by the filling equipment at the mine are approximately normally distributed with a mean of 70 tons and a standard deviation of 0.9 tons. If the mean is greater than 70 tons, the loading mechanism is overfilling.

(a) If the filling equipment is functioning properly, what is the probability that the weight of ore in a randomly selected car will be 70.7 tons or more? Show your work.

(b) Suppose that the weight of ore in a randomly selected car is 70.7 tons. Would that fact make you suspect that the loading mechanism is overfilling the cars? Justify your answer.

(c) If the filling equipment is functioning properly, what is the probability that a random sample of 10 cars will have a mean ore weight of 70.7 tons or more? Show your work.

(d) Based on your answer in part (c), if a random sample of 10 cars had a mean ore weight of 70.7 tons, would you suspect that the loading mechanism was overfilling? Justify your answer.
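
Parts (a) and (c) contrast the distribution of a single car's weight with the sampling distribution of the mean of 10 cars, whose standard deviation is σ/√n. A minimal numerical sketch of that contrast follows; the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 70, 0.9   # tons, when the equipment is functioning properly

# (a) One randomly selected car: X ~ Normal(70, 0.9).
single_car = NormalDist(mu, sigma)
p_single = 1 - single_car.cdf(70.7)

# (c) Mean of n = 10 cars: sample mean ~ Normal(70, 0.9 / sqrt(10)).
n = 10
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean = 1 - sample_mean.cdf(70.7)

print(round(p_single, 4), round(p_mean, 4))
```

The same cutoff of 70.7 tons is far less likely for a sample mean than for a single car, because averaging reduces variability.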

4. Some boxes of a certain brand of breakfast cereal include a voucher for two free movie tickets inside the box. The company that makes the cereal claims that a voucher can be found in 25 percent of the boxes. However, based on their experiences eating this cereal at home, a group of students believes that the proportion of boxes with vouchers is less than 0.25. This group of students purchased 65 boxes of the cereal to investigate the company’s claim. The students found a total of 11 vouchers for two free movie tickets in the 65 boxes. Suppose it is reasonable to assume that the 65 boxes purchased by the students are a random sample of all boxes of this cereal. Based on this sample, is there support for the students’ belief that the proportion of boxes with vouchers is less than 0.25? Answer by performing all steps of a hypothesis test with $\alpha = 0.05$.
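
The computational steps of a one-proportion z test for this problem can be sketched as follows. This is an illustration, not a complete written solution: it assumes the large-sample z procedure with the standard error computed under the null value 0.25, and the variable names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

n, vouchers = 65, 11
p0, alpha = 0.25, 0.05     # H0: p = 0.25 versus H1: p < 0.25

# Large-sample condition, checked at the null value.
condition = n * p0 * (1 - p0)    # should be more than 10

p_hat = vouchers / n
se0 = sqrt(p0 * (1 - p0) / n)    # standard error under H0
z = (p_hat - p0) / se0

# Lower-tailed test: p-value is the area to the left of z.
p_value = NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(round(condition, 2), round(z, 3), round(p_value, 4), reject_h0)
```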

5. Investigators at the U.S. Department of Agriculture wished to compare methods of determining the level of E. coli bacteria contamination in beef. Two different methods (A and B) of determining the level of contamination were used on each of ten randomly selected specimens of a certain type of beef. The data obtained, in millimicrobes/liter of ground beef, for each of the methods is shown in the table below along with the difference, d = Method A – Method B, for each specimen.

Specimen      1      2      3      4      5      6      7      8      9      10
Method A      22.7   23.6   24.0   27.1   27.4   27.8   34.4   35.2   40.4   46.8
Method B      23.0   23.1   23.7   26.5   26.6   27.1   33.2   35.0   40.5   47.8
Difference    −0.3   0.5    0.3    0.6    0.8    0.7    1.2    0.2    −0.1   −0.1

The JMP output below gives selected summary measures for each sample.

The investigators would like to report a 99% confidence interval for the true mean difference in the amount of E. coli bacteria detected by the two methods for this type of beef.

(a) What parameter are the investigators interested in estimating? Give the appropriate symbol and define in the context of this study.

(b) Verify that the conditions necessary for constructing the confidence interval have been met.

(c) Find the 99% confidence interval. Round your final answer to two decimal places.

(d) Based on the confidence interval in part (c), can we conclude that there is a true difference between the two methods? Explain.

6. Researchers asked a random sample of 385 never-married college students the question, "Would you marry a person from a lower social class than your own?" Of the 149 men in the sample, 89 said "Yes." Among the 236 women, 117 said "Yes." At the 5% significance level, do the data provide convincing evidence that the true proportion of never-married college students who would marry a person from a lower social class than their own is greater for males than for females?

(a) Define the parameters of interest and state the null and alternative hypotheses.

(b) Write out the formula for the test statistic with the appropriate values substituted. Note that you do not need to calculate the final answer.

(c) If the p-value is .0258, write a conclusion for this hypothesis test.
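
The p-value quoted in part (c) can be reproduced with a two-proportion z test that uses the pooled sample proportion in the standard error. The sketch below assumes that pooled form, which is the usual choice when the null hypothesis states that the two proportions are equal; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

yes_m, n_m = 89, 149     # men who said "Yes", number of men
yes_f, n_f = 117, 236    # women who said "Yes", number of women

p_hat_m, p_hat_f = yes_m / n_m, yes_f / n_f

# Pooled proportion under H0: p_M = p_F.
p_pooled = (yes_m + yes_f) / (n_m + n_f)
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))

# Upper-tailed test, H1: p_M > p_F.
z = (p_hat_m - p_hat_f) / se
p_value = 1 - NormalDist().cdf(z)

print(round(z, 3), round(p_value, 4))
```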

7. In 2012, QSR Magazine conducted a study of drive-through times for fast-food restaurants. They recorded the service times (in seconds) for 362 randomly selected drive-through visits at McDonald's and 318 randomly selected drive-through visits at Burger King. The resulting JMP output for a 99% confidence interval for the difference in the mean service times at McDonald's and Burger King's drive-throughs is given below.

(a) Check that the conditions necessary for the confidence interval have been met.

(b) Use the output to give the 99% confidence interval and the degrees of freedom.

Confidence Interval: ______________________________

Degrees of Freedom: ______________________________

(c) Does the interval in part (b) suggest that one company has faster service times, on average? If so, which company? Explain.