
Ws3 Ir Homework Sol

Oct 09, 2015


Karla Hoffman

Homework Solutions
Transcript
  • Activity 1-6: A Nurse Accused

    a. The observational units are the eight-hour shifts.

    b. One variable is whether Gilbert worked on the shift. This variable is categorical and binary. The other variable is whether a patient died on the shift. This variable is also categorical and binary.

    Homework Activities

    Activity 1-7: Miscellany

    a. Variable: binary categorical; observational units: pennies being spun

    b. Variable: binary categorical; observational units: people leaving the washroom

    c. Variable: quantitative; observational units: fast-food sandwiches

    d. Variable: quantitative; observational units: residents of that country

    e. Variable: binary categorical; observational units: American households

    f. Variable: quantitative (though age might make more sense to interpret); observational units: colleges

    g. Variable: quantitative; observational units: colleges

    h. Variable: categorical; observational units: American voters in 2004

    i. Variable: binary categorical; observational unit: newborn babies

    j. Variable: quantitative; observational units: Alfred Hitchcock movies

    k. Variable: quantitative; observational units: American pennies

    l. Variable: quantitative; observational units: automobiles

    m. Variable: binary categorical; observational units: automobiles

    n. Variable: categorical; observational units: automobiles

    o. Variable: binary categorical; observational units: applicants for graduate school

    p. Variable: quantitative; observational units: college students

    q. Variable: categorical; observational unit: person

    r. Variable: binary categorical; observational units: college students

    s. Variable: binary categorical; observational units: participants in sport

    t. Variable: quantitative; observational units: sport participants

    u. Variable: quantitative; observational units: states

    v. Variable: quantitative; observational units: bartenders (or glasses if just one bartender)

    w. Variable: quantitative; observational unit: person

    x. Variable: quantitative; observational units: brides

    y. Variable: categorical; observational units: brides

    z. Variable: quantitative; observational units: couples getting married


    WS3_IR_U1_T1.indd 7 3/6/08 7:22:12 PM

  • Topic 1: Data and Variables

    Activity 1-8: Top 100 Films

    a. Quantitative

    b. Quantitative (though age might be easier to interpret)

    c. Categorical

    d. Binary categorical

    e. Binary categorical

    f. Binary categorical

    g. Quantitative (notice how this quantity will vary from movie to movie)

    Activity 1-9: Credit Card Usage

    a. Year in school: categorical
       Whether the student has a credit card: binary categorical
       Outstanding balance on the credit card: quantitative
       Whether the outstanding balance exceeds $1000: binary categorical
       Source for selecting a credit card: categorical
       Region of the country: categorical

    b. Answers will vary. Examples include these:

    1. Which class (freshman, sophomore, . . .) tends to have the largest outstanding credit card balance?

    2. Do all regions of the country tend to obtain their credit cards from the same source?

    Activity 1-10: Got a Tip?

    a. Answers will vary. Examples include these:

       The number of customers at each table
       The amount spent on food and drink
       Whether there were children at the table
       Whether a man or woman paid the bill

    b. Answers will vary. Examples include these:

    Which tends to have more influence on the tip: the size of the bill or the number of people in the party?

    Do males tend to be better tippers than females?

    Activity 1-11: Proximity to the Teacher

    a. The observational units are the students.

    b. One variable is the distance the student is sitting from the teacher. This variable is categorical and binary. The other variable is the quiz scores. This variable is quantitative.


  • Activity 1-12: Emergency Rooms

    a. Categorical variable

    b. Quantitative variable

    c. Categorical variable

    d. This is not a variable; it doesn't vary from patient to patient.

    e. This is not a variable; you cannot ask an individual patient to tell you this information.

    f. Binary categorical variable

    g. Quantitative variable

    h. Binary categorical variable

    i. This is not a variable; it needs to be worded as in part h in order to be a variable.

    j. This is not a variable; this is summary information about the emergency room.

    Activity 1-13: Candy Colors

    a. The observational units are the pieces of candy.

    b. The variable is the color of the candy. This variable is categorical (non-binary).

    c. Now the observational units are the samples of 25 pieces of candy.

    d. The variable is the proportion of the sample that is colored orange. This variable is quantitative.

    Activity 1-14: Natural Light and Achievement

    a. The observational units are the students.

    b. One variable is whether the student learned in natural light. The other variable is the score on the standardized test.

    c. The first variable in part b is categorical and binary. The second variable in part b is quantitative.

    Activity 1-15: Children's Television Viewing

    a. The observational units are the third- and fourth-grade students in San Jose.

    b. The quantitative variables are body mass index, triceps skinfold thickness, waist circumference, waist-to-hip ratio, weekly time spent watching television, and weekly time spent playing video games. The categorical variable is which school the student attends.

    Activity 1-16: Nicotine Lozenge

    a. The observational units are smokers.

    b. The categorical variables are whether they received the nicotine or placebo, gender, whether the person made a previous attempt to quit smoking, and whether the subject successfully refrained from smoking during the study.


    c. The quantitative variables are weight and number of cigarettes smoked per day.

    d. Type of lozenge assigned is a binary categorical variable.

    Activity 1-17: Oscar Winners and Super Bowls

    a. Answers will vary from student to student. Examples include these:

    Categorical variables:

    What is the movie's genre? Did the picture also win an Academy Award for best director?

    Quantitative variables:

    What was the total length (in minutes) of the movie? What was the production cost of the movie? How much did the movie gross during its first weekend of release?

    b. Answers will vary from student to student. Examples include these:

    Categorical variables:

    In what city was the game played? Was the game played indoors or outdoors? Which league was the winning team a member of? Was either team a wild card? Did the winner of the coin toss win the game?

    Quantitative variables:

    What was the season percentage of wins for the winning team? What was the total payroll for the winning team? How many people attended the game? What was the point spread?

    Assessment

    Sample Quiz 1A

    Suppose for every email message that you receive in the next week, you keep track of:

    Whether the message is spam
    Whether the sender is a family member, a friend, or someone else
    Whether the message contains an emoticon (such as a smiley face)
    How many words are in the message
    What day of the week the message was sent

    1. Which of these variables is quantitative?

    2. How many of these variables are categorical? How many are binary?

    3. What are the observational units in this study?

    4. State a research question that you could address with these data.

    5. Is "people who send you a message with an emoticon" a legitimate variable in this study? Explain why or why not.


  • Topic 2: Data and Distributions

    regular section, or perhaps students were sleepier in the sports section because it met earlier in the day.

    Homework Activities

    Activity 2-7: Student Data

    How many hours you slept in the past 24 hours: dotplot
    Whether you have slept for at least 7 hours in the past 24 hours: bar graph
    How many states you have visited: dotplot
    Handedness: bar graph
    Day of the week on which you were born: bar graph
    Gender: bar graph
    Average study time per week: dotplot
    Score on the first exam in this course: dotplot

    Activity 2-8: Student Data

    a. In general, do most female students study more than most male students? This does not mean that you would expect to find that all female students study more than all male students. You can also think in terms of the typical female student scoring higher than the typical male student.

    b. In general, do most students who study more score higher on exams than students who don't study as much? Again, this does not mean that you would expect to find that all those who study earn higher grades than all those who do not study.

    Activity 2-9: Value of Statistics

    a. Answers will vary from student to student.

    b. Answers will vary from class to class. Here are some sample answers:

    Rating          1   2   3   4   5   6   7   8   9
    Tally (Count)   0   0   1   0   5   6  11   6   6

    c. Yes, 7 was chosen more often than any other value.

    d. 29/35 or .829 gave a response greater than 5; 1/35 or .029 gave a response less than 5.

    e. The vast majority of this class (more than 80%) feel that statistics is important to society. In fact, more than 65% of the class feel that statistics is very important to society. About 14% of the class are neutral about the importance of statistics and only 1 of the 35 students in this group believe that statistics is not very important to study.
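    The proportions quoted in parts c through e can be checked directly from the sample tally in part b. The short Python sketch below is only an illustration: the helper name `proportion` is made up for this example, and treating ratings 7 through 9 as "very important" is an assumption chosen because it reproduces the "more than 65%" figure.

```python
# Tally from part b: rating -> number of students who chose it.
tally = {1: 0, 2: 0, 3: 1, 4: 0, 5: 5, 6: 6, 7: 11, 8: 6, 9: 6}


def proportion(tally, keep):
    """Proportion of all responses whose rating satisfies the predicate `keep`."""
    total = sum(tally.values())
    kept = sum(count for rating, count in tally.items() if keep(rating))
    return kept / total


total = sum(tally.values())                  # class size
mode = max(tally, key=tally.get)             # most frequently chosen rating
above_5 = proportion(tally, lambda r: r > 5)
below_5 = proportion(tally, lambda r: r < 5)
neutral = proportion(tally, lambda r: r == 5)
# Assumption: "very important" is read here as a rating of 7, 8, or 9.
very_important = proportion(tally, lambda r: r >= 7)

print(f"n = {total}, most common rating = {mode}")
print(f"above 5: {above_5:.3f}, below 5: {below_5:.3f}")
print(f"neutral: {neutral:.3f}, very important: {very_important:.3f}")
```

    With a different class tally, only the `tally` dictionary needs to change.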

    Activity 2-10: Value of Statistics

    i. Class C

    ii. Class D

    iii. Class E

    iv. Class A

    v. Class B


  • Activity 2-11: Quiz Scores

    Many answers are possible. Here are some examples:

    a. Quiz 1: 0, 1, 1, 2, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10

    b. Quiz 2: 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6

    c. Quiz 3: 0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 10, 10, 10

    d. Quiz 4: 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10

    [Figure: dotplots of quiz score (0 to 10) by quiz number, Quiz 1 through Quiz 4]

    Activity 2-12: Responding to Katrina

    More than 85% of the whites who were surveyed did not think race was a factor in the government's slow response to Hurricane Katrina, whereas only about a third of the blacks who were surveyed gave the same response. Equivalently, 60% of the blacks surveyed did think that race was a factor in the government's response rate, compared to just over 10% of the whites surveyed. These results show a very strong difference of opinion between the two groups.

    Activity 2-13: Backpack Weights

    a. The distributions of backpack weights for both the male and female students are roughly bell-shaped and range from a minimum of 2 lbs to a maximum of about 25 lbs. The females' weights are centered at about 10 to 11 lbs, whereas the males' weights are centered slightly higher at about 11 to 12 lbs. The males appear to have one unusually heavy backpack weighing in at 35 lbs.

    b. Yes, it appears that males tend to carry slightly more weight in their backpacks than females. This is shown primarily by the centers in the dotplots. The graph for males appears shifted to the right of the graph for females.

    c. The ratios of backpack weights to body weights for these students range from about .015 to .18 and are roughly the same for both males and females. There is a cluster of ratios from .025 to .075 and another smaller cluster from about .08 to .125. There are five females with high ratios (.146 and above) but only one male with such a high ratio.


    d. No, it does not appear that one sex tends to carry a higher ratio of their body weight in their backpacks than the other sex. Both dotplots look quite similar in terms of shape, center, and spread.

    e. Males tend to weigh more and so tend to carry more weight in their backpacks. But this factor is accounted for when you compute the ratio of backpack weight to body weight, as the ratio carried by each gender tends to be about the same.

    Activity 2-14: Broadway Shows

    For the Broadway plays, the number of seats was fairly evenly distributed in 7 theaters, from just less than 600 to about 1100 seats. The number of seats available for the Broadway musicals in the remaining 22 theaters spread over a much wider range, from 650 seats to more than 1800 seats. Most of the musicals seemed to have either 1000 to 1200 seats or 1400 to 1700 seats.

    Attendance at the musicals was clustered primarily from 85% to more than 100% of the theater's capacity at each show. There were two low musical outliers near 40%. Attendance at the plays was more evenly distributed, with attendance for two plays near 60% of capacity, two near 80% of capacity, and three near or greater than 100% of capacity.

    The average price of a musical tended to be $70, although prices ranged from a low of about $55 to a high of $105. Ticket prices for a Broadway play were generally less than for a musical, with all but two of the play prices less than $60. The remaining two plays had ticket prices of about $83.

    Activity 2-15: Highest Peaks

    The highest peaks in the east tend to be significantly less high than those in the west. The elevations in the east are all below 7,000 feet, whereas more than half of those in the west are above 9,000 feet. The west has a high outlier near 21,000 feet and a large cluster of elevations between 12,000 and 15,000 feet.

    Activity 2-16: Pursuit of Happiness

    The following bar graph displays the results:

    [Figure: bar graph titled "Level of Happiness"; proportion (0 to 1) by response: Very Happy, Pretty Happy, Not Too Happy, Did Not Know]

    Activity 2-17: Roller Coasters

    a. The observational units are the roller coasters.

    b. The quantitative variables are height, length, speed, and number of inversions. The categorical variables are type of coaster (wooden or steel; binary) and design (sit down, stand up, inverted).


  • c. The heights of the steel coasters appear to have both a larger center and much greater variability than the wooden coasters. The steel coasters also seem to have a couple of high outliers at 420 feet.

    d. A typical height for a steel coaster is 148 feet. A typical height for a wooden coaster is 100 feet.

    e. The steel coasters tend to be taller than the wooden coasters. Most of the steel coasters are taller than most of the wooden coasters.

    f. No, one type of coaster is not always taller. There are some very short steel coasters and some relatively tall wooden coasters.

    Activity 2-18: Nicotine Lozenge

    After six weeks, 45% of smokers using the nicotine lozenge had successfully quit smoking, whereas only 30% of those using the placebo had quit smoking. Thus, smokers using the lozenge are 1.5 times as likely to quit smoking. However, after 52 weeks (a year later), only about 18% of those using the nicotine lozenge were still not smoking, compared to 10% of those using the placebo. This result still means the nicotine lozenge users are more likely to quit smoking (1.8 times as likely), but the overall chance that a member of either group will successfully refrain from smoking has dropped significantly after a year.
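    The "1.5 times" and "1.8 times" figures in Activity 2-18 are ratios of the two groups' success proportions. As a minimal sketch, assuming the percentages quoted in the answer (the function name `relative_risk` is chosen for this illustration only):

```python
def relative_risk(p_treatment, p_control):
    """Ratio of the success proportion in the treatment group to that in the control group."""
    return p_treatment / p_control


# Proportions quoted in the answer (lozenge group vs. placebo group).
six_weeks = relative_risk(0.45, 0.30)
one_year = relative_risk(0.18, 0.10)

print(f"After 6 weeks:  {six_weeks:.1f} times as likely to have quit")
print(f"After 52 weeks: {one_year:.1f} times as likely to have quit")
```

    The ratio grows from six weeks to one year even though both success proportions shrink, which is why the answer reports the ratio and the overall proportions together.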

    Activity 2-19: Candy Colors

    a. The observational units are the Reese's Pieces candies. The variable is the color of the candy. This variable is categorical.

    [Figure: bar graph titled "Reese's Pieces Candy"; proportion (0 to 1) by color: Orange, Brown, Yellow]

    b. More than half of the candies were colored orange, just over a fourth (28%) were brown, and only 21% were yellow. This suggests that Hershey does not make equal proportions of each color.

    c. Answers will vary by class.

    Assessment

    Sample Quiz 2A

    You want to compare prices of textbooks, so you ask six friends who are science majors and six friends who are humanities majors to report how much they spent on textbooks this term.


  • Activity 3-5: Childhood Obesity and Sleep

    a. The explanatory variable is the amount of sleep that a child gets per night. This is a quantitative variable, although it would be categorical if the sleep data were reported only in intervals (more sleep vs. less sleep). The response variable is whether the child is obese, which is a binary categorical variable.

    b. This is an observational study because the researchers passively recorded information about the children's sleeping habits. They did not impose a certain amount of sleep on children. Therefore, it is not appropriate to draw a cause-and-effect conclusion that less sleep causes a higher rate of obesity. Children who get less sleep might differ in some other way that could account for the increased rate of obesity. For example, amount of exercise could be a confounding variable. Perhaps children who exercise less have more trouble sleeping, in which case exercise would be confounded with sleep. You have no way of knowing whether the higher rate of obesity is due to less sleep or less exercise, or both, or due to some other variable that is also related to both sleep and obesity.

    c. The population from which these children were selected is apparently all children aged 5 to 10 in primary schools in the city of Trois-Rivières. These Quebec children might not be representative of all children in this age group worldwide, so you should be cautious about generalizing that a relationship between sleep and obesity exists for children around the world.

    Homework Activities

    Activity 3-6: Elvis Presley and Alf Landon

    a. This is a very biased sampling method. You would expect this method to overestimate the proportion of adults who believe that Elvis faked his death because people who feel strongly about this are likely to be the ones responding to such an Internet poll.

    b. This number is a statistic.

    c. Although this number may feel large, you really have no way of knowing, based on the statistic alone, whether a sampling method is biased. It is better to consider the sampling method when assessing whether you believe bias is present.

    d. The sample size is 2032. Taking a larger sample would not reduce bias; if the sampling method is flawed, increasing the sample size will not correct the problem.
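    The point in part d, that a larger sample does not correct a biased sampling method, can be illustrated with a small simulation. The sketch below is hypothetical throughout: the true proportion of believers (10%) and the assumption that believers are five times as likely to respond are invented for illustration and do not come from the poll; only the sample size 2032 comes from the answer above.

```python
import random


def biased_poll(true_p, response_weight, n, rng):
    """Simulate n voluntary responses in which believers are `response_weight`
    times as likely to respond as non-believers; return the sample proportion."""
    # Relative chances that a given respondent is a believer (1) or not (0).
    weights = [true_p * response_weight, (1 - true_p) * 1.0]
    responses = rng.choices([1, 0], weights=weights, k=n)
    return sum(responses) / n


rng = random.Random(2032)    # fixed seed so the run is repeatable
TRUE_P = 0.10                # hypothetical true population proportion
RESPONSE_WEIGHT = 5          # hypothetical: believers respond 5x as often

small = biased_poll(TRUE_P, RESPONSE_WEIGHT, 2_032, rng)
large = biased_poll(TRUE_P, RESPONSE_WEIGHT, 200_000, rng)

# Value the biased method settles on as the sample grows.
long_run = TRUE_P * RESPONSE_WEIGHT / (TRUE_P * RESPONSE_WEIGHT + (1 - TRUE_P))

print(f"true proportion:        {TRUE_P:.3f}")
print(f"biased poll, n=2032:    {small:.3f}")
print(f"biased poll, n=200000:  {large:.3f}")
print(f"long-run biased value:  {long_run:.3f}")
```

    Under these assumptions the larger sample only pins down the wrong value more precisely; it moves toward the long-run biased value, not toward the true proportion.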

    Activity 3-7: Student Data

    a. Answers will vary by school and class.

    b. Answers will vary by school and class.

    c. Answers will vary by school and class.

    Activity 3-8: Generation M

    a. Parameter

    b. Statistic

    c. Statistic

    d. Statistic

    e. Parameter

    f. Statistic

    g. Parameter

    h. Statistic


  • Topic 3: Drawing Conclusions from Studies

    Activity 3-9: Community Ages

    a. This number is a parameter; you would view your community as the population.

    b. Yes, this sampling method would be biased. It would probably overestimate the average age of residents as younger residents do not attend church as frequently as older residents do.

    c. Yes, this would be a biased sampling method. This method would underestimate the average age of residents as most drivers at the daycare facility tend to be young adults, not middle-aged or elderly. This method would also exclude all residents who are not yet old enough to drive.

    Activity 3-10: Penny Thoughts

    a. The number 2136 is the sample size, not the population. The population is all American adults.

    b. The sample is the 2136 people contacted by the Harris Poll; 59% is a statistic.

    c. The variable is whether the person opposes abolishing the penny; 59% is a statistic, not a variable.

    d. The observational units are people, not pennies.

    e. The parameter is a number (of unknown value). The population is all American adults.

    f. The statistic is the percentage of the sample of 2136 people who oppose abolishing the penny, 59%, not an average (whether a person opposes abolishing the penny is a categorical variable).

    Activity 3-11: Class Engagement

    a. No; this is an observational study, and there are at least two potential confounding variables that could explain the higher level of engagement in the statistics class. You cannot attribute the difference to the subject matter.

    b. Two confounding variables are time of class (8:00 am or 11:00 am) and instructor (Newton or Fisher).

    Activity 3-12: Web Addiction

    a. The population is all visitors to the abcnews.com Web site (or Internet users). The sample is the 17,251 users of abcnews.com who responded to the survey.

    b. The corresponding parameter of interest in this study is the proportion of the population who have some sort of addiction to the Internet.

    c. The 6% is probably not a reasonable estimate of the parameter because the survey was voluntary. Those who use the Internet more and are more addicted to it are more likely to respond to an online survey. This makes the 6% higher than the percentage for all visitors to the site or for all Internet users in general. Alternatively, you could argue that many addicts might not be willing to admit to a problem and the 6% is less than the true proportion in the population (but this is more of a nonsampling error [people lying] rather than a sample selection issue). See Topic 4 (Activity 4-20 in particular) for more discussion of nonsampling errors.


  • Activity 3-13: Alternative Medicine

    The sample result is probably not representative of the truth concerning the population of all adult Americans because the sampling method is biased. Only readers of Self magazine were part of the poll, and the readers of this health magazine were probably the type of people who try alternative medicines more than nonreaders (bad sampling frame). Furthermore, strong advocates of alternative medicines would probably be more likely to reply to a mail-in poll (voluntary response bias). Therefore, this result is very likely to be an overestimate of the proportion of all adult Americans who have used alternative medicines.

    Activity 3-14: Courtroom Cameras

    a. The proportion is 800/812 or .985. This number is a statistic.

    b. This sample probably is not representative of the population of all adult Americans. Only those people familiar with the trial and with the fact that they could write letters to the judge about their opinion and who felt very strongly about the issue would take the time to write. Those who didn't mind the use of cameras probably wouldn't feel the need to write in. This sample was voluntary and not random at all.

    Activity 3-15: Junior Golfer Survey

    a. No, this is not a representative sample of all American teenagers because most teenagers do not play golf.

    b. Yes, this sampling procedure is likely to be biased with respect to voting preference. Golfing is an expensive sport, and the wealthy tend to vote Republican, so these teenagers have probably grown up in Republican households.

    c. The following graph displays the responses:

    [Figure: bar graph titled "Junior Golfer Survey"; proportion (0 to 1) by voting preference: Democrat, Republican, Neither, Don't Know]

    This graph shows that the majority of respondents indicated they were more likely to vote for a Republican. If you don't believe most teenagers are Republicans, this gives you evidence that the sampling method is overrepresenting the Republicans in the population.

    d. Yes, this sampling procedure is likely to be biased with regard to both of these variables. If junior golfers tend to come from more affluent families, they almost certainly have a cell phone and computer in their homes, making online access readily available and probably giving them more free time to spend on the computer. Of course, if they are more physically active and training for tournaments, they might tend to spend less time online than a typical teen.


    Activity 3-16: Accumulating Frequent Flyer Miles

    a. The observational units are the visitors of msnbc.com. The variable is whether they use a credit card to accumulate airline miles. The variable is categorical and binary.

    b. This number is a statistic because it is a number computed from a sample (from 1935 online responses).

    c. This sampling method is most likely biased (because it is voluntary) and will provide an overestimate of the proportion of all American adults who use a credit card to accumulate airline miles. People who are willing to respond to an online survey are more likely to be comfortable using their credit cards over the Internet and to take advantage of Internet offers.

    d. The sample size is 1935. No, it does not affect the answer to part c. This is a large sample size, and even if it weren't, a large sample size will not compensate for bias caused by a poor sampling method.

    Activity 3-17: Foreign Language Study

    a. Yes, these are observational studies. Researchers could only have passively observed the association between foreign language study and verbal SAT scores rather than determining for students whether they took a foreign language in high school.

    b. No, it is not legitimate to conclude that foreign language study causes an improvement in students' verbal abilities. You can never draw cause-and-effect conclusions between variables from an observational study. One possible confounding variable is verbal aptitude. Perhaps students with strong verbal aptitudes choose to enroll in foreign language courses and also perform well on the verbal portion of the SAT exam. Students with weaker verbal skills may avoid foreign language courses and may also perform less well on the verbal portion of the SAT.

    Activity 3-18: Smoking and Lung Cancer

    The student needs to explain how diet could be connected to both the explanatory (smoking) and response (lung cancer) variables. How could diet explain the apparent strong connection between smoking and lung cancer? For example, smokers may also tend to have poorer overall diets, and it could be the poor diet that leads to higher rates of cancer.

    Activity 3-19: Smoking and Lung Cancer

    a. The explanatory variable is smoking habits. The response variable is whether the men died of lung cancer.

    b. Yes, this is an observational study. The researchers passively observed the smoking habits and lifespans of their subjects rather than actively imposing smoking habits on the individuals.

    c. Yes, you should have qualms about generalizing these results to a larger population. The subjects were all males and were haphazardly selected by volunteers, so the results definitely should not be extended to women. The results might also be unrepresentative of the general population as well, depending on how the volunteers selected the individuals.


  • Activity 3-20: A Nurse Accused

    a. The observational units are the eight-hour shifts. The explanatory variable is whether Gilbert worked on the shift. The response variable is whether a patient died on the shift.

    b. Yes, this is an observational study because the researchers did not randomly determine which shifts Gilbert would work.

    c. No, because this is an observational study, you cannot draw any cause-and-effect conclusions between the variables. You cannot conclude that Gilbert caused the higher death rate on her shift.

    d. Perhaps Gilbert is a senior-level intensive care nurse whose patients are generally in more critical condition than those seen by nurses on other shifts. If she works primarily with patients who are less likely to survive, then it would not be surprising that the death rate on her shift is higher than that of the hospital average. Or, perhaps Gilbert works night or weekend shifts, which tend to have higher death rates than daytime or weekday shifts.

    Activity 3-21: Buckle Up!

    a. Yes, this is an observational study because you collected existing data about the states.

    b. No, you cannot conclude that the tougher seatbelt laws cause a higher proportion of residents to comply because this is an observational study.

    c. Yes, the data suggest that tougher seatbelt laws may result in lower death rates because the tougher seatbelt laws are associated with higher seatbelt compliance.

    Activity 3-22: Yoga and Middle-Aged Weight Gain

    a. The explanatory variable is whether middle-aged adults practiced yoga. This variable is categorical and binary. The response variable is amount of weight gained/lost between the ages of 45 and 55. This variable is quantitative.

    b. Yes, this is an observational study because the researchers passively collected data through surveys rather than randomly determining who would practice yoga.

    c. No, this study does not allow you to draw a cause-and-effect conclusion between practicing yoga and gaining less weight because it is an observational study and you can never draw such conclusions based on observational studies.

    d. A potential confounding variable is the amount of weekly exercise performed by each adult. Perhaps adults who practice yoga also tend to engage in other forms of exercise on a regular basis, and this is what caused their weight loss. Adults who showed more weight gain may have participated in less overall exercise between the ages of 45 and 55.

    Activity 3-23: Pet Therapy

    a. Yes, this is an observational study because you are passively observing and recording information about the patients instead of randomly determining which individuals own a pet.

    b. The explanatory variable is whether a recovering heart attack patient has a pet. This variable is categorical and binary. The response variable is whether the patient survives for five years. This variable is categorical and binary.


  • 38 Topic 3: Drawing Conclusions from Studies

    c. No, you cannot conclude that pet ownership leads to therapeutic benefits for heart attack patients based on this study because it is an observational study and you can never conclude cause-and-effect from an observational study. There are many potential confounding variables that could explain the association.

    Activity 3-24: Winter Heart Attacks

    a. A possible confounding variable could be weather. An alternative explanation could be that during the months of December and January, the weather is colder, the days are shorter, people tend to get less exercise (or more straining exercise such as shoveling snow), and these factors in turn increase the number of heart attacks.

    b. The Los Angeles study reduces the viability of the change in weather explanation.

    c. A remaining confounding variable might be the length of the days. As the days shorten in the winter (and less sunlight is available), people become depressed, and this may increase the number of heart attacks that occur.

    Activity 3-25: Pursuit of Happiness

    No, these study results do not establish a causal connection between income and happiness because this is an observational study and you can never conclude cause-and-effect from an observational study. There are many potential confounding variables that could explain the association.

    Activity 3-26: Televisions, Computers, and Achievement

    a. Two explanatory variables are whether there was a television in the bedroom (binary categorical) and whether there was a computer in the home (binary categorical). Two response variables are score on mathematics portion of the achievement test (quantitative) and score on language arts portion of the achievement test (quantitative).

    b. Yes, this is an observational study. The researchers passively observed/collected the achievement scores and television/computer information about these children and did not impose any treatments.

    c. No, you cannot make either conclusion because this is an observational study.

    d. There are many possible answers. One confounding variable might be the financial status of the family. Families who are better-off financially are more likely to have computers but are also more likely to expose their children to various forms of literature and language arts, such as books, magazines, and theatre. This exposure, rather than the home computer, could be responsible for the higher scores on the language arts portion of the test.

    e. The sample in this study is the 348 Chicago third-graders.

    f. If you assume the sample was randomly selected, then you could generalize to all third-graders in the Chicago area. As they may not be typical of third-graders in other areas, you probably would not want to generalize beyond this population.

    Activity 3-27: Parking Meter Reliability

    If the meters were randomly selected from Berkeley, you would be willing to generalize to Berkeley. However, because they were not randomly selected from all California parking meters, you wouldn't be willing to generalize the results to this population.


    Activity 3-28: Night Lights and Nearsightedness

    a. No, assuming that these are observational studies, there are potential confounding variables that prevent you from legitimately concluding that sleeping with a night light causes a higher rate of nearsightedness.

    b. This argument is incomplete because the student has not explained how genetics is connected to sleeping with a night light (the explanatory variable) as well as to the rate of nearsightedness (the response variable). The student should have said something such as "Parents' eyesight, because nearsighted parents tend to have nearsighted children (genetics), and it could be that parents who are themselves nearsighted are more likely to need a night light in their children's rooms."

    Assessment

    Sample Quiz 3A

    You want to investigate whether teenagers in England tend to read more Harry Potter books than teenagers in the United States.

    1. Identify the populations in this study.

    2. Identify the explanatory variable, and classify it as categorical or quantitative.

    3. Identify the response variable, and classify it as categorical or quantitative.

    If you read a report that Hospital A has a higher mortality (death) rate than Hospital B when treating heart attack patients, it's possible that the severity of the patients' condition is a confounding variable.

    4 and 5. Describe what it means for patients' condition to be a confounding variable in this context. Be sure to indicate how this potential confounding variable could be related both to the explanatory and the response variable.

    Solution to Sample Quiz 3A

    1. The populations are teenagers in England (1) and teenagers in the United States (2).

    2. The explanatory variable is whether the teenager is from England or the United States. This is a binary categorical variable.

    3. The response variable is the number of Harry Potter books the teenager has read. This is a quantitative variable.

    4 and 5. A confounding variable is an undefined/unrecorded variable whose effects on the response variable are indistinguishable from the explanatory variable. It is possible that most of the patients who go to Hospital A are in critical condition when they arrive, whereas most of the patients who go to Hospital B are in fair to good condition when they arrive. This would necessarily mean that more of Hospital A's heart attack patients would die (because of their prior condition, not because of their treatment), and more of Hospital B's patients would be likely to survive.
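
    The hospital comparison can be made concrete with a small sketch. The counts below are hypothetical, invented only to illustrate the idea (they do not come from any report): Hospital A treats mostly critical patients, Hospital B mostly fair-condition patients, and A ends up with the higher overall death rate even though its rate is lower within each condition group.

```python
# Hypothetical counts, invented purely to illustrate confounding.
# Each entry is (number of patients, number of deaths).
hospitals = {
    "A": {"critical": (80, 24), "fair": (20, 1)},
    "B": {"critical": (20, 7), "fair": (80, 5)},
}

def death_rate(patients, deaths):
    """Proportion of patients who died."""
    return deaths / patients

for name, groups in hospitals.items():
    total_patients = sum(p for p, _ in groups.values())
    total_deaths = sum(d for _, d in groups.values())
    by_condition = {cond: round(death_rate(p, d), 3) for cond, (p, d) in groups.items()}
    # Overall rate first, then the rate within each condition group.
    print(name, round(death_rate(total_patients, total_deaths), 3), by_condition)
```

    With these made-up numbers, comparing only the overall rates would point to Hospital A as worse, while the breakdown by patient condition shows the difference is driven by which patients each hospital receives.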


    participate might differ systematically in some ways from those who were included. Nevertheless, the researchers did use randomness to select their sample, and they probably obtained as representative a sample as reasonably possible.

    d. Perhaps mothers in those groups were in a lower economic class and therefore less likely to have phones in the first place, or perhaps they had to work so their children were in daycare.

    e. These comparisons address the issue of bias, not precision. The sampling method was slightly biased with regard to the mothers' race and age and the infants' birth weight.

    f. These percentages are statistics because they are based on the sample.

    g. The large sample size produces high precision. This means that the sample statistics are likely to be close to their population counterparts. For example, the population proportion of infants who sleep on their backs should be close to the sample proportion who sleep on their backs.

    h. The sample size for subgroups is smaller than for the whole group, so the sample results would be less precise.

    Homework Activities

    Activity 4-6: Rating Chain Restaurants

    a. It seems unlikely that this sample was randomly chosen as it would be extremely difficult to give each Consumer Reports reader an equally likely chance of being selected for the sample and to ensure that everyone selected responded. It is much more likely that the responders self-selected by returning a survey.

    b. The authors probably make the disclaimer because the sample was not randomly selected from the entire population but only from their readers, who may have different habits and attitudes from nonreaders, so the results cannot reasonably be extended to the general population.

    c. Answers will vary, but you probably should generalize these results only to Consumer Reports readers who tend to visit full-service restaurant chains and like to complete surveys.

    Activity 4-7: Sampling Words

    a. Categorical (binary)

    b. 99/268 ≈ .369

    c. The answer to part b is a parameter; .369 is the proportion of all 268 words (the population) in the Gettysburg address that are over 5 letters long.

    d. No; because of sampling variability you would not expect the sample proportion to equal .369, but you would expect it to be reasonably close most of the time. (In fact, with a sample of size 5, the sample proportion could not equal .369; it could only be 0, .2, .4, .6, .8, or 1.)
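
    The arithmetic in parts b and d can be checked with a short sketch. The counts 99 and 268 and the sample size 5 come from the answers above; the variable names are only illustrative.

```python
from fractions import Fraction

# Counts taken from part b: 99 of the 268 words have more than 5 letters.
long_words = 99
total_words = 268
parameter = long_words / total_words

# With a sample of 5 words, the count of long words can only be 0, 1, ..., 5,
# so the sample proportion is restricted to multiples of 1/5.
sample_size = 5
possible_proportions = [Fraction(k, sample_size) for k in range(sample_size + 1)]

print(round(parameter, 3))
print([float(p) for p in possible_proportions])
# True only if some possible sample proportion equals the rounded parameter.
print(any(float(p) == round(parameter, 3) for p in possible_proportions))
```

    The last line prints False, which matches the remark in part d that no sample of size 5 can produce a proportion of exactly .369.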

    Activity 4-8: Sampling Words

    Answers will vary. These are based on one particular running of the applet.

    a. Yes, this distribution should be centered at about .369 (it is .38 in this case).


  • 50 Topic 4: Random Sampling

    b. This distribution should still be centered at .369 (the mean is .37), but with much less variability.

    c. Because you are taking random samples, you expect your sample proportions to center around the parameter (.369), regardless of the sample size. However, as you increase the sample size, you expect your samples to become more precise; that is, you expect the variability between samples to decrease.
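
    The pattern described in part c can be reproduced with a rough simulation. This is only a sketch, not the applet itself: the population of 99 long words out of 268 comes from Activity 4-7, while the sample sizes (5 and 20), the 2000 repetitions, and the seed are assumptions chosen for illustration.

```python
import random
import statistics

# Population from Activity 4-7: 268 words, 99 of them "long" (1) and 169 not (0).
population = [1] * 99 + [0] * 169

def simulate_sample_proportions(sample_size, repetitions, rng):
    """Draw repeated random samples without replacement; return each sample proportion."""
    return [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(repetitions)
    ]

rng = random.Random(1)  # arbitrary fixed seed so reruns give the same results
for n in (5, 20):  # assumed sample sizes, for illustration only
    props = simulate_sample_proportions(n, 2000, rng)
    # Sample size, center of the sample proportions, spread of the sample proportions.
    print(n, round(statistics.mean(props), 3), round(statistics.stdev(props), 3))
```

    Both sets of sample proportions should center near .369, and the standard deviation for the larger sample size should be noticeably smaller.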

    Activity 4-9: Sampling Senators

    a. The observational units are the U.S. senators. The variable is years of service in the senate. The population is the current 100 U.S. senators. The sample is the 5 selected current U.S. senators. The parameter is the average years of service of all 100 U.S. senators. The statistic is the average years of service of the 5 selected senators.

    b. This sampling method would most likely overestimate the average years of service because your classmates would most likely select names of well-known senators who have been serving in the senate for a long time. (You also need to worry about a tendency for students to mention the senators from their own states more than those from other states.)

    c. No, increasing the sample size will not correct for a biased sampling method. Students would still tend to overrepresent the senators who have served longer.

    d. Obtain a list of the current senators. Number each senator in the list from 00 to 99. Select any row of the Random Digits Table and read the row as a sequence of two-digit numbers. These two-digit numbers tell you which senators from your list will make up your sample. Continue selecting senators until you have five senators in your sample. Skip any repeated two-digit numbers.

    e. Obtain a list of the current representatives. Number each representative in the list from 000 to 434. Select any row of the Random Digits Table and read the row as a sequence of three-digit numbers. These three-digit numbers tell you which representatives from your list will make up your sample. Skip any repeated three-digit numbers or numbers greater than 434. Continue selecting representatives until you have five representatives in your sample. If necessary, continue to another row of the Random Digits Table.
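
    The procedure in parts d and e can be sketched in a few lines. The function name `select_from_digits` and the row of digits are hypothetical, made up for illustration; the digits are not taken from the book's Random Digits Table.

```python
def select_from_digits(digits, width, max_label, sample_size):
    """Read a string of random digits as fixed-width labels.

    Skips labels above max_label and labels already chosen, and stops
    once sample_size labels have been collected (or the row runs out).
    """
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if label <= max_label and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Hypothetical row of digits, invented for this example.
row = "174517039945620811203174388002"

# Part d: two-digit labels 00 to 99 for the 100 senators.
print(select_from_digits(row, 2, 99, 5))    # [17, 45, 3, 99, 62]

# Part e: three-digit labels 000 to 434 for the 435 representatives.
print(select_from_digits(row, 3, 434, 5))   # [174, 39, 203, 388, 2]
```

    In the two-digit reading the repeated labels 17 and 45 are skipped; in the three-digit reading the labels above 434 (517, 945, 620, 811) and the repeated 174 are skipped. If a row runs out before the sample is complete, the function returns fewer labels, which corresponds to continuing on another row of the table.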

    Activity 4-10: Responding to Katrina

    Based on sample sizes, the non-Hispanic white adults' responses probably come closer to reflecting the group's population value than the black adults' responses do because there were so many more white adults sampled. If both samples were selected randomly, the larger sample is more likely to produce a sample result similar to the population parameter.

    Activity 4-11: Rose-y Opinions

    a. The observational units are the 1000 individuals. The variable is whether they have a favorable or unfavorable opinion of Pete Rose. This is a categorical variable.

    b. The population is American sports fans. The sample is the first 1000 people leaving an LA Lakers basketball game.


    c. This was not a randomly selected sample. People attending this basketball game are not necessarily sports fans in general or may be extreme LA Lakers fans or simply basketball fans. This is an example of convenience sampling and is unlikely to result in a representative sample.

    d. No, the individuals in the sample may still be only interested in basketball and not sports in general.

    e. If you have a list of subscribers to Sports Illustrated, you could number the list and use a table of random digits or a computer to select a random sample of subscribers. The population who would be represented by this sample would be all readers of Sports Illustrated, which would certainly be more representative of the general sports fan than the previous methods.

    f. The parameter is the percentage of American sports fans who have an unfavorable opinion of Pete Rose. Its value is unknown. The statistic is the 49% of the 1000 people interviewed by the Gallup pollsters who said they had an unfavorable opinion of Pete Rose.

    g. The value of the statistic would most likely change if Gallup had selected another random sample of 1000 people to interview. But the value of the parameter would remain the same.
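
    Part e mentions using a computer to select a random sample from a numbered list. A minimal sketch of that step follows; the list of 20,000 placeholder subscriber names, the sample size of 1000, and the seed are all assumptions for illustration, since no actual subscriber list is given.

```python
import random

# Hypothetical sampling frame: a numbered list standing in for the subscriber list.
subscribers = [f"subscriber_{i:05d}" for i in range(1, 20001)]

rng = random.Random(2024)  # arbitrary seed so the draw can be reproduced
# Simple random sample: every subscriber is equally likely, with no repeats.
sample = rng.sample(subscribers, k=1000)

print(len(sample), len(set(sample)))
print(sample[:3])
```

    Rerunning with a different seed gives a different sample, and so a different statistic, while the parameter for the whole list stays fixed, as noted in part g.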

    Activity 4-12: Sampling on Campus

    a. The observational units are college freshmen. The variable is weight gained during the first term at college. The population is all U.S. college freshmen. The sample is a random sample of college freshmen. The parameter is the average weight gained by all college freshmen during their first term.

    Because it would be impossible to obtain a random sample of all U.S. college freshmen, work with freshmen at a particular college. Obtain a list of all freshmen from the registrar. Number the list and use a table of random digits to obtain a random sample of freshmen.

    b. The observational units are college students. The variable is price paid for textbooks. The population is all U.S. college students. The sample is a random sample of college students. The parameter is average price paid for textbooks by all college students.

    Because it would be impossible to obtain a random sample of all U.S. college students, work with students at a particular college. Obtain a list of all students from the registrar. Number the list and use a table of random digits to obtain a random sample of students.

    c. The observational units are pages of your history book. The variable is number of words on each page. The population is all pages in your history book. The sample is a random sample of pages from your history book. The parameter is average number of words per page in your history book.

    Number all the pages in your history book consecutively. Use a table of random digits to select a sample of pages from your book and count all the words on these pages.

    d. The observational units are college faculty. The variable is political party registration. The population is all U.S. college faculty. The sample is a random sample of U.S. college faculty. The parameter is percentages of U.S. college faculty who are registered in each political party.


    Because it would be impossible to obtain a random sample of all U.S. college faculty, work with faculty at a particular college. Obtain a list of all faculty and number the list. Then use a table of random digits to obtain a random sample of faculty.

    Activity 4-13: Sport Utility Vehicles

    a. The observational units are the vehicles. The variable is whether the vehicle is an SUV. The population is all vehicles on the road in your hometown. The sample is the vehicles that pass by the intersection between 7 and 8 am that morning. The parameter is the proportion of all vehicles on the road in your hometown that are SUVs. The statistic is the proportion of all vehicles that pass by that morning that are SUVs.

    b. The vehicles that you observed between 7 and 8 am may not be representative of all vehicles on the road. For example, the vehicles may be used to carpool children to school and therefore overrepresent larger families with children and larger cars, or they may be predominately commuter vehicles more than weekend recreational vehicles and underrepresent the proportion of SUVs.

    c. The sampling frame is the list of cars sold by that dealer.

    d. The recently purchased vehicles will probably not represent the vehicles on the road in your town. For example, there may have been a backlash against SUVs recently because of high gas prices so that fewer SUVs were purchased in the last year, yet many people would still own them from purchases made several years ago.

    Activity 4-14: Generation M

    a. Your classmates form a sample as they are only a subset of all students at your school.

    b. Answers will vary. This number is a statistic because it is collected from your class (a sample).

    c. Answers will vary from class to class, but the numbers calculated will all be statistics.

    d. No, you and your classmates do not constitute a random sample of the students at your school because every student did not have an equal chance of being selected for the sample.

    e. Answers will vary by school and class.

    f. Answers will vary by school and class.

    Activity 4-15: Emotional Support

    a. Hite's sampling method is likely to be biased in the direction of women who think they give more support than they receive. She sampled women in women's groups who usually join because they aren't getting the kind of companionship they want from their husbands or boyfriends.

    b. Hite's poll surveyed the larger number of women.

    c. The ABC News/Washington Post poll was probably more representative of the truth about the population of all American women because they used random sampling that was presumably unbiased.


    Activity 4-16: College Football Players

    a. Position: categorical

    Weight: quantitative

    Class: categorical

    b. Example answer, using line 13 of the table:

    Note: An early printing of the student book has an error that gives 82 as the number of players rather than 99. There are actually 99 players as some jersey numbers are missing and some are duplicated. If the players are renumbered from 01 to 99, an example answer would be to use line 13 to select players: 54 Danny Rohr (220 lbs), 40 Brandon Williamson (180 lbs), 02 Courtney Brown (205 lbs), 21 Anthony Randolph (220 lbs), 50 David Fullerton (195 lbs), 56 James Chen (240 lbs), 55 Alex Bynum (230 lbs), 87 Kyle Maddux (210 lbs), 52 Kevin Spach (220 lbs), 86 Louis Shepherd (250 lbs), 07 Pat Johnston (195 lbs), 30 Drew Robinson (195 lbs), 34 David Elmerick (185 lbs), 05 Mike Anderson (180 lbs), and 60 Bobby Best (245 lbs). The average weight in this sample is 211.3 lbs. This weight should be fairly close to the average weight of all 99 players because you took a random sample, but you don't expect it to match exactly. In particular, although this value will vary from sample to sample, you don't expect a tendency to consistently overestimate or underestimate the population mean weight.
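
    The sample average for the 99-player numbering is simply the arithmetic mean of the 15 weights listed above, which can be verified with a short sketch (the variable names are illustrative):

```python
# Weights (lbs) of the 15 players listed in the 99-player example above.
weights = [220, 180, 205, 220, 195, 240, 230, 210, 220, 250, 195, 195, 185, 180, 245]

sample_mean = sum(weights) / len(weights)
print(len(weights), round(sample_mean, 1))  # 15 211.3
```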

    A population of 82 players can be considered by deleting the red-shirted freshmen from the list and renumbering the remaining players from 01 to 82. Then, using line 13, you select players 54 Brock Daniels (275 lbs), 40 Aris Borjas (200 lbs), 02 Courtney Brown (205 lbs), 55 Kenny Calderone (285 lbs), 52 Bobby Best (245 lbs), 07 Pat Johnston (195 lbs), 30 Drew Robinson (195 lbs), 34 Martin Mates (185 lbs), 05 Mike Anderson (180 lbs), 60 Lucas Trily (235 lbs), 57 Patrick Koligian (250 lbs), and 62 Julai Tuua (275 lbs).

    The average weight in this sample is 230 lbs. This weight should be fairly close to the average weight of all 82 players because you took a random sample, but you don't expect it to match exactly. In particular, although this value will vary from sample to sample, you don't expect a tendency to consistently overestimate or underestimate the population mean weight.

    Activity 4-17: Phone Book Gender

    a. The parameter is the proportion of women living in San Luis Obispo County. The statistic is the proportion of women listed on the randomly selected phone book page.

    b. This sampling technique will give a biased estimate for the proportion of women living in San Luis Obispo County because the phone listings of many married women are often only under their husbands' names. In addition, many single women choose not to list their phone numbers to avoid harassing phone calls. Therefore, you expect the statistic will be an underestimate of the population parameter.

    Activity 4-18: Sampling Senators

    From most variability to least variability: a, c, d, b.

    As the sample size increases, regardless of the size of the population, the variability in the sample values decreases.


    Activity 4-19: Voter Turnout

    a. 1783/2613 ≈ .682

    b. This is a statistic because it is a number calculated from a sample (of 2613 adults).

    c. The following bar graph displays the proportions who claimed to have voted and not:

    [Bar graph: Voter Turnout in 1996 Presidential Election. Horizontal axis: Response (Voted, Did Not Vote). Vertical axis: Proportion, with ticks from 0 to 1.]

    d. This number (49%) is a parameter because the Federal Election Commission has the records of all registered voters. Everyone who was eligible to vote was included in this number.

    e. No, the sample grossly overestimated the proportion of eligible voters who actually voted.

    f. Although the sample result is unlikely to match the population value exactly, this difference is probably too large to be attributed to sampling variability.

    g. People may be reluctant to tell the truth (and seem unpatriotic) and so may overstate whether they voted. They might not remember that they didn't vote in this particular election. Even with random samples, you have to worry about the honesty of the respondents in surveys.
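
    The claim in part f, that the gap between .682 and .49 is too large to be sampling variability, can be explored with a rough simulation. This is a sketch under stated assumptions: each simulated respondent independently reports having voted with probability .49 (an approximation to random sampling from a very large population with honest answers), and the 200 repetitions and the seed are arbitrary choices.

```python
import random

POPULATION_PROPORTION = 0.49   # parameter from part d
SAMPLE_SIZE = 2613             # sample size from part b
OBSERVED = 1783 / 2613         # sample proportion from part a

def simulated_proportion(rng):
    """Proportion who voted in one simulated random sample of honest respondents."""
    votes = sum(rng.random() < POPULATION_PROPORTION for _ in range(SAMPLE_SIZE))
    return votes / SAMPLE_SIZE

rng = random.Random(7)  # arbitrary fixed seed
simulated = [simulated_proportion(rng) for _ in range(200)]

print(round(OBSERVED, 3))
# Range of sample proportions produced by sampling variability alone.
print(round(min(simulated), 3), round(max(simulated), 3))
# Number of simulated samples at least as extreme as the observed one.
print(sum(p >= OBSERVED for p in simulated))
```

    The simulated proportions stay within a few percentage points of .49, far from the observed .682, which is consistent with looking for a nonsampling explanation such as the one in part g.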

    Activity 4-20: Nonsampling Sources of Bias

    a. The proportions of "yes" responses would most likely differ between these two groups. The question that includes the words "horrific murder" is obviously putting a negative idea into the minds of those surveyed, whereas the other question seems neutral.

    b. The proportions declaring agreement with the policy might differ between these two groups. Those interviewed by the smoker might feel pressured into disagreeing.

    c. The proportion of "yes" responses would probably be lower than the actual proportion of married people in the community who have engaged in extramarital sex. This manner of survey is not very confidential, and the surveyor would be hard-pressed to get honest answers to such a personal and potentially harmful question.

    d. You should not be surprised that the proportions would differ between these two groups. The president's views on foreign policy would be fresh in the minds of one group, whereas the other group would have to recall past speeches or actions of the president in order to form an opinion. Approval ratings tend to rise shortly after rousing speeches but then come back down again over time.

    e. How the question is worded, appearance of the interviewer, lack of confidentiality, knowledge of the topic, and timing of the question are sources of bias.

    Activity 4-21: Prison Terms and Car Trips

    a. Prisoners with longer terms have a higher probability of ending up in the sample (similar to how longer words are more likely to be selected when you point your finger at one spot on the page).

    b. Cars engaged in longer trips have a higher chance of being observed at a particular time point than cars on shorter trips.

    c. Many answers are possible, but one example is estimating the average length of time that people have been employed by a particular company. If you take a random sample of employees, employees who have been around longer have a better chance of ending up in the sample.
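
    The bias described in this activity can be illustrated numerically. The prison terms below are hypothetical, made up for illustration, and the sketch assumes that the chance of a prisoner being present on the day you sample is proportional to the length of that prisoner's term.

```python
# Hypothetical prison terms in years, invented for this example.
terms = [1, 1, 1, 2, 2, 3, 5, 10, 20]

# Average term if every prisoner were equally likely to be chosen.
population_mean = sum(terms) / len(terms)

# If the chance of being observed is proportional to term length, each term
# is weighted by itself, giving the expected term among those observed.
size_biased_mean = sum(t * t for t in terms) / sum(terms)

print(round(population_mean, 2), round(size_biased_mean, 2))  # 5.0 12.11
```

    Under these assumptions, sampling whoever is in prison on a given day would suggest an average term of about 12 years when the average over all terms is 5 years. The same reasoning applies to car trips and to length of employment in part c.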

    Assessment

    Sample Quiz 4A

    The National Retail Federation sponsors surveys of consumer behavior. One such survey, conducted on a yearly basis, asks American adults whether they plan to celebrate Mother's Day, what kind of gift(s) they plan to buy, and how much they plan to spend. The 2007 survey was conducted on April 4–11 with 7859 consumers participating. Of these respondents, 84.5% said that they were planning to celebrate Mother's Day, expecting to spend an average of $139.14.

    1. Identify the population of interest in this survey.

    2. Identify the sample and the sample size.

    3. Are the values listed (84.5%, $139.14) parameters or statistics? Explain.

    4. Identify (in words) the parameters of interest in this study.

    5. The press release describing this survey did not say how the 7859 consumers were selected. Explain why knowing this missing information is important.

    Solution to Sample Quiz 4A

    1. The population is all adult Americans.

    2. The sample is a group of adult American consumers, surveyed during the week of April 4–11, 2007. The sample size is 7859.

    3. These values are statistics because they describe a sample.

    4. The parameters are the proportion of all American adults who plan to celebrate Mother's Day in 2007 and the mean/average amount that those who are planning to celebrate Mother's Day expect to spend.

    5. In order to consider the sample to be representative of the population, you need to know whether it was randomly selected.


    Homework Activities

    Activity 5-7: An Apple a Day

    a. Anecdote

    b. Observational study

    c. Experiment

    Activity 5-8: Treating Parkinson's Disease

    a. Sham surgery is a surgery that has no medical purpose. It is placebo surgery performed so that patients do not know which of them are receiving the Spheramine and which are not.

    b. If an experiment is double-blind, then neither the subjects nor the evaluator knows whether each subject is in the treatment group or in the control group. This is important in this study because it prevents the evaluator from being biased in his/her judgment of the effectiveness of the implant and also prevents the patient from feeling psychologically different based on the perception of receiving treatment or not (see part d also).

    c. Randomized means that subjects are randomly assigned to either the treatment or control (placebo) group. This is important because it should mean that the only difference between the two groups should be the treatment, and so if there is a substantial difference observed between the groups later, you can conclude the effect was caused by the treatment.

    d. Placebo-controlled means the subjects in the control group are given a placebo (in this case, the sham surgery) so they cannot tell they are in the control group, and so that if subjects are going to improve because of the surgery itself (rather than the implant), this will happen at the same rate in both the treatment and control groups.

    Activity 5-9: Ice Cream Servings

    a. The explanatory variables are the large or small bowl (binary categorical) and large or small scoop (binary categorical). The response variable is amount of ice cream eaten (quantitative).

    b. This is an experiment because the researchers actively imposed the treatments on the subjects by randomly assigning the size of the scoops and bowls.

    c. The random assignment was important because it controlled for the potential confounding variable of self-selection. If the nutrition experts were allowed to choose for themselves, those who tended to have small appetites might have chosen the smaller bowls and/or scoops and consequently eaten less ice cream. Then appetite would be confounded with bowl/scoop size.

    d. The nutrition experts did not know that there were two different sizes of bowls and scoops being distributed, so they would not be conscious of the size of the bowl and perhaps adjust the amount of ice cream they ate in order to be more in line with one of the other groups.

    e. Because this study was a well-designed, randomized controlled experiment, it is valid to draw a cause-and-effect conclusion between size of bowl or scoop and size of the ice cream serving.


  • 66 Topic 5: Designing Experiments

    f. You have controlled for this potentially confounding variable by randomly assigning the subjects to the treatment groups. The only difference between the two groups should be the bowl and scoop sizes.

    Activity 5-10: Spelling Errors

    Randomly divide college students into two groups (number a group of student participants and use a random digits table to split them into two groups). Have one group use a computer's spell-checker to proofread a research paper, and have the other group proofread the same research paper without using the spell-checker. Compare the performance of both groups to see whether one group catches more errors than the other.
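
    The random division into two groups could also be done by computer rather than with a random digits table. The sketch below is one way to do it; the 20 placeholder student IDs, the group names, and the seed are assumptions for illustration.

```python
import random

# Hypothetical participant IDs; a real study would use its own roster.
students = [f"student_{i:02d}" for i in range(1, 21)]

rng = random.Random(5)  # arbitrary seed so the assignment can be reproduced
shuffled = students[:]  # copy, so the original roster order is kept
rng.shuffle(shuffled)

# The first half of the shuffled list uses the spell-checker; the rest do not.
half = len(shuffled) // 2
spell_checker_group = shuffled[:half]
no_spell_checker_group = shuffled[half:]

print(sorted(spell_checker_group))
print(sorted(no_spell_checker_group))
```

    Shuffling the whole roster and cutting it in half gives every student the same chance of landing in either group and keeps the group sizes equal.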

    Activity 5-11: Foreign Language Study

    a. No, you cannot conclude that foreign language study improves your verbal skills. Because this was an observational study, there are many confounding variables that could explain the association.

    b. A controlled experiment would need to randomly assign students to different treatment groups (i.e., foreign language study and no foreign language study) and then later compare the verbal SAT scores of the two groups. This would ensure that hidden confounding variables such as verbal aptitude would balance out between the groups.

    c. It might not be feasible to carry out such an experiment because you cannot generally control which courses students do or do not take.

    Activity 5-12: AZT and HIV

    a. The explanatory variable is whether the pregnant woman received AZT (categorical). The response variable is whether the resulting baby was HIV-infected at birth (categorical).

    b. This is an experiment. The researchers actively randomly assigned the mothers to the control and treatment groups.

    c. This study makes use of comparison by having a group who received AZT (treatment group) and a group who received a placebo (control group). This allowed the researchers to compare the babies' infection rates between the two groups. In particular, any changes over time would occur for both groups.

    d. The study used random assignment to decide which mothers would receive AZT and which would receive the placebo. This should even out all variables, so the only difference between the groups of mothers is the AZT.

    e. The study used blindness by giving the mothers in the control group a placebo so that neither group of mothers could tell whether they were actually receiving the AZT. This would control for the placebo effect in both groups.
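
    The idea in part d, that random assignment tends to even out other variables between the groups, can be illustrated with a simulation. Everything in this sketch is hypothetical: the 1000 subjects, the made-up background variable (a baseline health score), and the seed are assumptions and are not data from the AZT study.

```python
import random
import statistics

rng = random.Random(11)  # arbitrary fixed seed

# Hypothetical background variable (say, a baseline health score) for 1000 subjects.
baseline = [rng.gauss(50, 10) for _ in range(1000)]

# Randomly assign half of the subjects to treatment and half to control.
indices = list(range(len(baseline)))
rng.shuffle(indices)
treatment = [baseline[i] for i in indices[:500]]
control = [baseline[i] for i in indices[500:]]

# The two group averages of the background variable should be close.
print(round(statistics.mean(treatment), 1), round(statistics.mean(control), 1))
```

    Because the assignment ignores the background variable entirely, the two groups end up with similar averages for it, so a later difference in outcomes is hard to attribute to that variable.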

    Activity 5-13: Pet Therapy

    a. The explanatory variable is whether the heart attack patient owns a pet (categorical). The response variable is whether the patient survived for five years (categorical).

    b. This is an observational study. The researcher passively observed and recorded information on pet ownership and the patient's recovery rather than assigning some people to own pets and others to not own pets.

    c. Yes, there is a group of patients who do not own pets for comparison.

    d. No, this study does not make use of randomization. Patients were not randomly selected or randomly assigned to treatment groups.

    e. No, you cannot conclude that owning a pet has a therapeutic effect for heart attack survivors because there may be confounding variables that explain the association. You cannot conclude causation with an observational study.

    f. This study could be a controlled experiment if the researcher used randomization to determine whether the patient owned a pet. In this case, the researcher would actively impose the treatment on the subjects. The experimenter would then hope to see the direct effect of pet ownership on the recovery rate of heart attack patients.

    g. This is debatable. Is it feasible to tell someone to own a pet? Probably not.

    Activity 5-14: Studies from Blink

    a. Parts a and b have only one variable, so you cannot distinguish between explanatory and response variables in these cases (you could consider the one variable a response variable in each case). For part c, the explanatory variable is whether their version of the exam asks them to indicate race; the response variable is score on the SAT-like exam. For part d, the explanatory variables are gender and race of customer; the response variable is price negotiated for the car.

    b. The observational studies are part a (height of American CEOs) and part b (marriage counselors). The experiments are part c (SAT-like exam given to African American students) and part d (best prices at car dealerships).

    c. Because parts a and b are observational studies, you cannot draw any cause-and-effect conclusions from either of them. In part a, because the economist took a random sample of American CEOs, you are probably safe in generalizing the results to the population of American CEOs. In part b, you are not told that the psychologist interviewed a random sample of marriage counselors, so you might hesitate to generalize these results to any larger population of counselors. In part c, you should be able to draw a cause-and-effect conclusion if an effect is found, because a randomized, controlled experiment was performed. You should be cautious in generalizing your results to African American college students at similar colleges, however, because you were not told how the 200 students were selected for the study. In part d, if a significant difference is found in average price among the four types of customers, you should be able to attribute the difference to race, gender, or both because you used a comparative, randomized experiment. You should be cautious about generalizing these results because only 10 dealerships were used, and they were all apparently in the same city.

    Activity 5-15: Reducing Cold Durations

    a. The experimental units are the 104 subjects reporting to the lab within 24 hours of getting a cold.

    b. The explanatory variable is amount of zinc nasal spray (full, low, or no dosage). The response variable is duration of cold symptoms.

    c. This is an experiment because the researchers randomly assigned the subjects to the treatment groups (amount of zinc spray) and actively imposed the treatments on the patients.

  • 68 Topic 5: Designing Experiments

    d. The researchers used a placebo to ensure that if the subjects' colds improved because of receiving any treatment, this effect would be seen equally in each of the groups.

    Activity 5-16: Religious Lifetimes

    a. The explanatory variable is whether the subject attends religious services at least once a month. The response variable is lifespan.

    b. This is an observational study because the researchers did not randomly assign the subjects to attend religious services or not.

    c. You cannot conclude that attending religious services will lengthen one's life because this is an observational study. A possible confounding variable is the subjects' health and lifestyle. Perhaps people who attend religious ceremonies take better care of their bodies, which may affect their lifespan.

    d. Yes, if the sample is selected randomly, it should represent the population, regardless of the population size. The important consideration here is how the sample is selected, not the relative size of the sample compared to the size of the population.

    Activity 5-17: Natural Light and Achievement

    a. Researchers would randomly assign the students to two different treatment groups: one with high natural light and one with low natural light. Then the researchers would compare the standardized test scores of the students in these two groups.

    b. It would be difficult to carry out this experiment because there are ethical considerations that could prevent you from depriving students of natural light and also from possibly detrimentally affecting their education.

    c. John B. Lyons could say, "There is a causal relationship between daylight and achievement" if this was a well-designed, randomized comparative experiment.

    Activity 5-18: SAT Coaching

    a. The explanatory variable is before or after attending the coaching program. The response variable is SAT score (improvement).

    b. This is an observational study. The researcher passively observed and recorded information on the students' SAT scores. He or she did not randomly decide who would or would not enroll in the coaching program.

    c. You cannot conclude the SAT coaching program caused the improvements in scores because this was not a randomized, comparative experiment. Perhaps most students would generally improve the second time they take the test regardless of the coaching program (you had no comparison group here). Or there may have been other changes in their study habits in addition to the coaching program.

    Activity 5-19: Capital Punishment

    a. No, this is not an experiment because the researcher did not impose the death penalty statute on the states that have it or prevent other states from having it.

    b. No, you cannot conclude that the death penalty caused the difference in homicide rates because this is an observational study. There may be confounding variables (such as the state's overall crime rate or legal system) that could also affect the response variable.

    c. No, you cannot conclude a lack of causation either because this is an observational study. There could be other variables that are masking the effect of the death penalty.

    Activity 5-20: Literature for Parolees

    a. Committed a crime: 6/32 = .1875; did not commit a crime: 26/32 = .8125

    b. Committed a crime: 18/40 = .45; did not commit a crime: 22/40 = .55

    c. This study did not randomly assign the parolees to the control and treatment groups. Instead, qualifications had to be met in order to get into the literature (treatment) program. Perhaps literacy or motivation was a confounding variable that affected the likelihood of committing a new crime.
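    The figures in parts a and b are conditional proportions computed within each group. As a small illustration, the sketch below recomputes them from the counts given above; the dictionary keys ("part a group", "part b group") are placeholder labels assumed here, because the answer itself does not say which group is which.

```python
# Counts taken from parts a and b; the group labels are placeholders.
counts = {
    "part a group": {"committed a crime": 6, "did not commit a crime": 26},
    "part b group": {"committed a crime": 18, "did not commit a crime": 22},
}


def conditional_proportions(row):
    """Divide each outcome count by the group total."""
    total = sum(row.values())
    return {outcome: n / total for outcome, n in row.items()}


proportions = {group: conditional_proportions(row) for group, row in counts.items()}

for group, row in proportions.items():
    for outcome, p in row.items():
        print(f"{group}: {outcome}: {p:.4f}")
```

    Within each group the proportions sum to 1, which is a quick check that the row totals (32 and 40) were used as the denominators.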

    Activity 5-21: Therapeutic Touch

    a. This was an experiment. Emily imposed the treatment (her hand) on the subjects.

    b. Emily flipped a coin to decide which of the subject's hands she would hold hers over.

    c. This study was not double-blind. Emily was aware of which subjects received which treatments.

    d. No; Emily's sample consisted of volunteers. It was not randomly selected from all practitioners.

    e. No, you should not attribute this tendency to detection of Emily's energy field. Emily used only practitioners of therapeutic touch in her study; she did not have a control group of people who did not claim to participate in this practice with which to compare.

    Activity 5-22: Prayers, Cell Phones, School Uniforms

    Answers will vary. These are example answers.

    a. Randomly divide the subjects into two groups. Have one group talk on the cell phone while driving and prohibit the other group from using a cell phone while driving. Compare the performance of both groups on an obstacle course to see whether they behave differently. An observational study would not allow you to conclude causation because you would be unable to control for confounding variables. Drivers who choose to use a cell phone may be less careful in general and therefore more prone to accidents, regardless of cell phone use.

    b. Locate a group of patients with a common type of pain (cancer, back pain, etc.). Randomly divide the patients into two groups. Have a prayer group pray for one group for a specified period of time and not pray for the other group. Record any decrease in pain in both groups. If you simply passively observe which patients use prayer to try to reduce suffering and which do not, you will not be able to control for confounding variables such as sociability. Perhaps patients who believe in prayer are more sociable than those who do not, and this sociability raises their spirits, which provides pain relief.

    c. Randomly divide the students into two groups. Have one group wear uniforms to school and allow the other group to wear anything they like. Compare the performance of both groups on a standardized test. An observational study would not allow you to conclude causation because you would be unable to control for confounding variables. Perhaps students who choose to wear uniforms (or whose parents require that they wear uniforms) are more studious than those who do not and would perform better on the standardized test, regardless of what they wore.

    Activity 5-23: Proximity to the Teacher

    a. The observational units are the students. The explanatory variable is whether the student sits close to or far away from the teacher. The response variable is performance on quizzes.

    b. Researchers would randomly assign the students to two different treatment groups: one that sits close to the teacher and one that sits far from the teacher. Then the researchers would compare the quiz scores of the students in these two groups.

    c. You will be able to conclude that sitting closer to the teacher does (or does not) cause students to perform better on quizzes if you use a well-designed, controlled experiment.

    d. You would not have to worry about the ethics of assigning seats to students or possibly detrimentally affecting the students' education through their seat assignments.

    Activity 5-24: Smoking While Pregnant

    a. These are almost certainly observational studies. It would be difficult, if not ethically impossible, for researchers to assign the subjects to control and treatment groups; they would have to simply passively observe which women smoke and which do not.

    b. No, it would not be ethical (or feasible) for researchers to randomly assign the pregnant women to control (nonsmoking) and treatment (smoking) groups. You could perhaps use mice or other animals, but then you would have trouble generalizing the results to human subjects.

    Activity 5-25: Dolphin Therapy

    a. This is an experiment because the researcher actively imposed the assignment to the two groups (with or without dolphins) on the subjects.

    b. The explanatory variable is swimming with dolphins or swimming/snorkeling without dolphins. This variable is categorical. The response variable is change in depression symptoms. This variable is presumably categorical.

    c. Assuming that the patients were randomly assigned to the two groups, yes, you can conclude that swimming with dolphins improves depression symptoms because this was a well-designed, controlled experiment.

    d. No, the subjects were not blind as to which treatment they received. It would be impossible to achieve blindness in this experiment because you cannot make people unaware that they are swimming with dolphins.

    Activity 5-26: Cold Attitudes

    a. The explanatory variable is emotional state score (numerical score is quantitative; top third vs. bottom third is categorical). The response variable is whether the subjects catch a cold.

    b. This is an observational study because the researchers passively observed the emotional states of the subjects. They did not randomly assign the subjects to treatment groups (positive vs. negative attitudes).

    c. No, you cannot draw causal conclusions from an observational study. There could be confounding variables, such as lifestyle, that explain the association. Perhaps people with positive emotions tend to exercise more and eat healthier foods, which would help them ward off colds.

    Activity 5-27: Friendly Observers

    a. This study is an experiment because the researcher randomly assigned the subjects to the two groups (only participants would win $3, and participants plus observers would win $3).

    b. The observational units are the subjects playing a video game.

    c. The explanatory variable is whether the observer was to share in the prize. The response variable is whether the threshold was beaten.

    d. This study makes use of blindness because the subjects were not told that there were two different groups or which group they were placed in.

    Activity 5-28: Got a Tip?

    a. These are explanatory variables.

    b. Record the percentage tip per check (rather than looking at how the amounts vary across the different sizes of bills).

    c. Yes, she can conduct an experiment; she can randomly determine whether she introduces herself by name (or stands throughout).

    d. If she conducts this as an experiment she can control for confounding variables and draw cause-and-effect conclusions.

    e. On a customer-to-customer level, she could flip a coin to decide whether to introduce herself by name or to decide whether she stands or squats at the table. She could also flip a coin to decide whether she wears a flower in her hair, but she would probably want to do this on a shift-by-shift basis.

    Assessment

    Sample Quiz 5A

    In a study published in the July 4, 2007, issue of the Journal of the American Medical Association, researchers investigated whether small doses of dark chocolate can reduce blood pressure for people who suffer from mild cases of high blood pressure. They recruited 44 German adults who were otherwise healthy except for mild cases of high blood pressure. These subjects were randomly assigned to either a dark chocolate group or a white chocolate group, and all subjects were instructed to eat one square portion of a chocolate bar (containing about 30 calories) every day for 18 weeks. They were

  • 92 Topic 6: Two-Way Tables

    Homework Activities

    Activity 6-6: Lifetime Achievements

    a. The conditional distribution of preferred achievement for each gender follows:

                      Male     Female
    Olympic Medal     .4583    .2963
    Nobel Prize       .5000    .4444
    Academy Award     .0417    .2593

    b. The following segmented bar graph displays these conditional distributions:

    [Figure: segmented bar graph titled "Classmate Preferences for Lifetime Achievement"; x-axis: Gender (Male, Female); y-axis: Percentage (0 to 100); segments: Academy Award, Nobel Prize, Olympic Medal]

    c. These data indicate that in this class males are much more likely than females to prefer an Olympic medal to an Academy Award. Only 4% of the males would like to win an Academy Award, whereas 26% of the females would. The males indicated a slight preference for Nobel prizes, but 46% of them would like to win an Olympic medal. Only 30% of the females would prefer an Olympic medal, but 44% of them would like a Nobel prize.

    Activity 6-7: Hella Project

    a. This is an observational study.

    b. The explanatory variable is whether the student is from northern or southern California. The response variable is whether the student used "hella" in their everyday vocabulary.

    c. A two-way table of the responses is shown here:

                                    Northern Californians    Southern Californians    Total
    Uses Hella Regularly            10                       3                        13
    Does Not Use Hella Regularly    5                        22                       27
    Total                           15                       25                       40

    d. For southern Californians, 3/25 or .12. For northern Californians, 10/15 or .667.

    e. The following segmented bar graph displays these data:

    [Figure: segmented bar graph titled "Hella Project"; x-axis: Region (Southern California, Northern California); y-axis: Percentage (0 to 100); segments: Yes, No]

    f. Yes, the data seem to support the students' conjecture. Students in this sample from northern California were more than five times as likely (relative risk = .667/.12 ≈ 5.56) to use "hella" in their everyday vocabulary as the students from southern California.
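    The relative risk in part f can be reproduced directly from the two-way table in part c. The following is a minimal sketch; the helper name `relative_risk` and the variable names are assumptions made for illustration.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Ratio of the conditional proportion in group A to that in group B."""
    return (events_a / total_a) / (events_b / total_b)


# Counts from the two-way table: students who use "hella" regularly, and column totals.
north_uses, north_total = 10, 15
south_uses, south_total = 3, 25

p_north = north_uses / north_total  # proportion of northern Californians
p_south = south_uses / south_total  # proportion of southern Californians
rr = relative_risk(north_uses, north_total, south_uses, south_total)

print(f"North: {p_north:.3f}, South: {p_south:.3f}, relative risk: {rr:.2f}")
```

    The larger proportion is placed in the numerator so that the ratio reads as "times as likely" for the northern group.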

    Activity 6-8: Suitability for Politics

    In this sample, about 19% of the liberals (40/208), 23% of the moderates (68/293), and 31% of the conservatives (96/311) polled agreed with the statement on suitability for politics. The increase in percentage as the amount of conservatism increases makes sense from what is known about the general beliefs of liberals, moderates, and conservatives. Therefore, believing the sample is representative of adult Americans, you could say that the more conservative a person is, the more likely he or she is to agree with the statement. The following segmented bar graph displays these results:

    [Figure: segmented bar graph titled "2004 General Social Survey: Suitability for Politics"; x-axis: Political Affiliation (Liberal, Moderate, Conservative); y-axis: Percentage (0 to 100); segments: Disagree, Agree]

    In this sample, about 28% (109/385) of the males and 23% (103/449) of the females polled agreed with this statement on suitability for politics. Therefore, these data suggest that gender does not play a major role in influencing adult Americans' decision regarding the statement. The following segmented bar graph displays these results:

    [Figure: segmented bar graph titled "2004 General Social Survey: Suitability for Politics"; x-axis: Gender (Male, Female); y-axis: Percentage (0 to 100); segments: Disagree, Agree]

    Activity 6-9: Suitability for Politics

    According to the data, in the 1970s about 47% (2398/5049) of those polled agreed with this statement on suitability for politics, and this percentage declined to about 36% (2563/7160) in the 1980s, about 23% in the 1990s (1909/8336), and stayed at just over 23% in the 2000s (802/3411). Therefore, based on these randomly selected samples, you have evidence that over time the population has tended to disagree more and more with this statement until the turn of the century, when opinion may have leveled off slightly. The following segmented bar graph displays these results:

    [Figure: segmented bar graph titled "2004 General Social Survey: Suitability for Politics"; x-axis: Decade (1970s, 1980s, 1990s, 2000s); y-axis: Percentage (0 to 100); segments: Disagree, Agree]
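    The decade-by-decade percentages quoted in Activity 6-9 can be recomputed from the counts given in the text. This is a small sketch under the assumption that each pair is (number agreeing, number polled); the variable names are illustrative.

```python
# (number who agreed, number polled) for each decade, as quoted in the answer.
agree_counts = {
    "1970s": (2398, 5049),
    "1980s": (2563, 7160),
    "1990s": (1909, 8336),
    "2000s": (802, 3411),
}

# Conditional percentage agreeing within each decade.
percent_agree = {
    decade: 100 * agreed / polled
    for decade, (agreed, polled) in agree_counts.items()
}

for decade, pct in percent_agree.items():
    print(f"{decade}: {pct:.1f}% agreed")
```

    Printing one decimal place makes the leveling off visible: the 1990s and 2000s percentages differ by well under one percentage point.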

    Activity 6-10: A Nurse Accused

    a. Here is the 2 × 2 table:

                                       Shifts Gilbert Worked    Shifts Gilbert Didn't Work
    Number of Patients Who Died        40                       34
    Number of Patients Who Survived    217                      1350

    b. For the shifts that Gilbert worked, 40/257 ≈ .156. For the shifts that Gilbert didn't work, 34/1384 ≈ .025.

    c. The relative risk of a patient dying is .156/.025 or 6.24 (6.34 if using more than three decimal places for the proportions).

    d. The risk of dying was over six times greater during shifts on which Gilbert worked than it was during those shifts on which she didn't work.
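    Part c reports two values for the relative risk, depending on whether the proportions are rounded first. The sketch below, using the counts from the table in part a, shows where each value comes from; the variable names are assumptions made for illustration.

```python
# Counts from the 2 x 2 table in part a.
died_worked, shifts_worked = 40, 257           # shifts Gilbert worked
died_not_worked, shifts_not_worked = 34, 1384  # shifts Gilbert didn't work

p_worked = died_worked / shifts_worked
p_not_worked = died_not_worked / shifts_not_worked

# Relative risk from the full-precision proportions.
rr_unrounded = p_worked / p_not_worked

# Relative risk after rounding each proportion to three decimal places first.
rr_rounded = round(p_worked, 3) / round(p_not_worked, 3)

print(f"Unrounded proportions: {rr_unrounded:.2f}")
print(f"Proportions rounded to 3 places: {rr_rounded:.2f}")
```

    Rounding a small denominator such as .025 shifts the ratio noticeably, which is why carrying extra decimal places until the final step is usually the safer habit.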

    Activity 6-11: Children's Television Advertisements

    a. This is a 5 × 3 table.

    b. The proportion of food advertisements on BET that were for fast food is 61/162 or .377.

    c. The proportion of fast-food advertisements that were on BET is 61/93 or .656.

    d. Here is the conditional distribution of the types of food commercials shown:

                 BET     WB      Disney
    Fast Food    .377    .386    .000
    Drinks       .407    .108    .455
    Snacks       .019    .000    .182
    Cereal       .093    .193    .364
    Candy        .105    .313    .000

    e. The following segmented bar graph displays these conditional distributions:

    [Figure: segmented bar graph titled "Children's TV Ads"; x-axis: Network (BET, WB, Disney); y-axis: Percentage (0 to 100); segments: Candy, Cereal, Snacks, Drinks, Fast Food]

    f. The Disney channel showed no commercials for fast food or candy during this time period (in fact, they showed very few food advertisements at all). About 37% of the BET and WB food advertisements were for fast food and almost none of them were for snacks. The percentage of food advertisements for cereal on the WB network was double that of the BET network (19% vs. 9%) and for candy the percentage was tripled (31% vs. 10%). Assuming these data were randomly selected, it appears that the Disney channel shows fewer advertisements than the other networks and generally healthier ones.

    Activity 6-12: Female Senators

    a. The proportion of senators who are women is 16/100 or .16.

    b. The proportion of senators who are Democrats is 49/100 or .49.

    c. No, it is not fair to say that most Democratic senators are women. Only 22% (11/49) of the Democratic senators are women; this is less than a quarter of the Democratic senators.

    d. Yes, it is fair to say that most of the female senators are Democrats because 11/16 or .688 of the females are Democrats.
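    Parts c and d use the same numerator (11 female Democratic senators) but different denominators, which is why the two answers differ. A minimal sketch of the two conditional proportions follows; the variable names are illustrative assumptions.

```python
# Counts quoted in parts a through d.
senators = 100
women = 16
democrats = 49
democratic_women = 11

# Part c: among Democrats, what proportion are women?
p_woman_given_democrat = democratic_women / democrats

# Part d: among women, what proportion are Democrats?
p_democrat_given_woman = democratic_women / women

print(f"Women among Democratic senators: {p_woman_given_democrat:.3f}")
print(f"Democrats among female senators: {p_democrat_given_woman:.3f}")
```

    Deciding which group supplies the denominator is the whole question: "most Democratic senators are women" conditions on party, whereas "most female senators are Democrats" conditions on gender.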

    Activity 6-13: Weighty Feelings

    a. The explanatory variable is gender. The response variable is feeling about one's weight.

    b. Here is the marginal distribution for the variable feeling about one's weight (the variable gender should be ignored):

    Underweight    .066
    About Right    .450
    Overweight     .484
    Total          1.000

    The following bar graph displays the marginal distributions:

    [Figure: bar graph titled "NHANES Survey"; x-axis: Feelings about Current Weight (Underweight, About Right, Overweight); y-axis: Proportion (0 to 1)]

    c. Here is the conditional distribution of weight feelings for each gender:

                   Female    Male
    Underweight    .038      .096
    About Right    .389      .515
    Overweight     .573      .389

    The following segmented bar graph displays the conditional distributions:

    [Figure: segmented bar graph titled "Feelings About Current Weight"; x-axis: Gender (Female, Male); y-axis: Percentage (0 to 100); segments: Overweight, About Right, Underweight]

    d. The distributions of the two genders do appear to differ here. The men in the sample were 2.5 times more likely than the women to feel they are underweight and 1.3 times more likely to feel that their weight is about right. In contrast, the women were almost 1.5 times more likely than the men to feel that they are overweight.

    Activity 6-14: Preventing Breast Cancer

    a. This is an experiment. You know because you are told that the researchers randomly assigned the subjects to the treatments (tamoxifen or raloxifene).

    b. The explanatory variable is drug assigned (tamoxifen or raloxifene). The response variable is whether the woman developed invasive breast cancer.

    c. Here is the 2 × 2 table:

                                     Tamoxifen    Raloxifene
    Developed Breast Cancer          163          167
    Did Not Develop Breast Cancer    9563         9578

    d. For the tamoxifen group, 163/9726 ≈ .0168. For the raloxifene group, 167/9745 ≈ .0171.

    e. The following segmented bar graph displays these conditional proportions:

    [Figure: segmented bar graph titled "Treatment for Postmenopausal Women at Risk for Breast Cancer"; x-axis: Drug (Tamoxifen, Raloxifene); y-axis: Percentage (0 to 100); segments: No cancer, Breast cancer]

    f. The relative risk of developing invasive breast cancer (raloxifene relative to tamoxifen) is .0171/.0168, or about 1.02 (1.0225 if using unrounded proportions).

    g. Because the relative risk is so close to 1.00, these two drugs are almost equally effective in preventing invasive breast cancer. Because this is an experiment, you can conclude that the drugs are roughly equally effective at preventing breast cancer in postmenopausal women (though you would like more information about how the women were selected for the study before you generalize this conclusion to the larger population).
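    As a worked check of the relative risk in part f, the calculation can be written out from the counts in the 2 × 2 table. The notation RR is an assumption introduced here for illustration, with raloxifene in the numerator to match the order used in part f.

```latex
\[
\mathrm{RR}
  = \frac{167/9745}{163/9726}
  = \frac{167 \times 9726}{163 \times 9745}
  = \frac{1{,}624{,}242}{1{,}588{,}435}
  \approx 1.0225
\]
\[
\text{With rounded proportions: } \frac{.0171}{.0168} \approx 1.018
\]
```

    Either way the ratio is within about 2% of 1, which is the basis for the conclusion in part g.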

    Activity 6-15: Preventing Breast Cancer

    The analyses for the risk of developing blood clots in a major vein follow:

    a. Here is the two-way table:

                     Tamoxifen    Raloxifene    Total
    Blood Clot       53           65            118
    No Blood Clot    9673         9680          19353
    Total            9726         9745          19471

    b. Here are the conditional proportions of developing a blood clot for each drug:

                     Tamoxifen    Raloxifene
    Blood Clot       .005         .007
    No Blood Clot    .995         .993

    c. The following segmented bar graph displays the conditional proportions:

    [Figure: segmented bar graph titled "Treatment for Postmenopausal Women at Risk for Breast Cancer"; x-axis: Drug (Tamoxifen, Raloxifene); y-axis: Percentage (0 to 100); segments: No blood clot, Blood clot]

    [Figure: segmented bar graph titled "Treatment for Postmenopausal Women at Risk for Breast Cancer"; x-axis: Drug (Tamoxifen, Raloxifene); y-axis: Percentage (0 to 100); segments: No blood clot in lung, Blood clot in lung]

    d. The relative risk of developing a blood clot is .007/.005 or 1.4 (or 1.224 if using unrounded proportions).

    e. You can conclude that for postmenopausal women the risk of developing blood clots in a major vein is about 1.22 times greater for those on raloxifene compared to those on tamoxifen. You can conclude that this difference in risk can be attributed to the drugs because this was a comparative, randomized experiment. However, you would need more information about how the women were selected for this study before generalizing to the larger population.

    The analyses for the risk of developing a blood clot in a lung follow:

    a. Here is the two-way table:

                     Tamoxifen    Raloxifene    Total
    Blood Clot       54           35            89
    No Blood Clot    9672         9710          19382
    Total            9726         9745          19471

    b. Here are the conditional proportions for the risk of developing a blood clot in a lung for each drug:

                     Tamoxifen    Raloxifene
    Blood Clot       .006         .004
    No Blood Clot    .994         .996

    c. The following segmented bar graph displays the conditional proportions:

    1009080706050403020100