CHAPTER 07 Random Variables

APS Chapter 07 Notes

Sep 01, 2014

WCalhoun

AP Stats
Chapter 07 Notes
William James Calhoun
WWPS - 2009
Test Run
Page 1: APS Chapter 07 Notes

1

CHAPTER 07 Random Variables

Page 2: APS Chapter 07 Notes

2

CASE STUDY

Lost income and the courts

Jane Blaylock joined the Ladies Professional Golf Association (LPGA) in 1969 and by 1972 had become the leading money winner on the tour. During the Bluegrass Invitational Tournament, she was disqualified for an alleged rules infraction. The LPGA appointed a committee of her competitors, who suspended her from the next tournament, the Carling Open. Blaylock sued for damages and expenses under the Sherman Antitrust Act, which says that individuals cannot be prevented by their peers from working in their profession because it would lessen competition.

She won but then had to come up with a method of determining how much money she might reasonably have made if she had been allowed to play in the next tournament. This is a difficult issue in most antitrust cases but was particularly problematic for a professional golfer, who might play well one day and poorly another day. Her task was challenging. She would have to use a measure that the judge and the jury would understand and that would be sufficiently convincing for a ruling in her favor.

She and her legal team used a statistical procedure called the expected value, which we will study in this chapter. Using data from the nine most recent tournaments that Blaylock played in prior to the disqualification, they estimated the probability that she would achieve various scores based on her past performance. The scores for players who won money ranged from 209 (the tournament winner) to 232. To simplify things, the 24 possible scores were reduced to 8, where 210, for example, represents 209, 210, and 211. Here is a table that summarizes her possible outcomes and the probabilities calculated for each of the outcomes:

The probability is 0.07 that she would score 209, 210, or 211; and so forth. Using these numbers, her expected score was calculated to be approximately 218. Had she played in the tournament, her 218 would have earned her $1427.50. Not only was the jury persuaded, but they also believed that Blaylock might well have won the tournament, so they awarded her first-place money, $4500. This amount was then tripled to $13,500 to cover legal expenses, according to the provisions of the Sherman Act. Statistics to the rescue.

Page 3: APS Chapter 07 Notes

3

Activity 7A

The game of craps

Materials: Pair of dice for each pair of students

The game of craps is one of the most famous (or notorious) of all gambling games played with dice. In this game, the player rolls a pair of six-sided dice, and the sum of the numbers that turn up on the two faces is noted. If the sum is 7 or 11, the player wins immediately. If the sum is 2, 3, or 12, the player loses immediately. If any other sum is obtained, the player continues to throw the dice until he either wins by repeating the first sum he obtained or loses by rolling a 7. Your mission in this activity is to estimate the probability of a player winning at craps. But first, let's get a feel for the game. For this activity, your class will be divided into groups of two. Your instructor will provide a pair of dice for each group of two students.

Page 4: APS Chapter 07 Notes

4

1. In your group of two students, play a total of 20 games of craps. One person will roll the dice; the other will keep track of the sums and record the end result (win or lose). If you like, you can switch jobs after 10 games have been completed. How many times out of 20 does the player win? What is the relative frequency (that is, percent, written as a decimal) of wins?

2. Combine your results with those of all the other two-student groups in the class. What is the relative frequency of wins for the entire class?

3. Use simulation techniques to represent 20 games of craps, using either the table of random digits or the random number generating feature of your TI-83/84/89. What is the relative frequency of wins based on the 20 simulations? How does this number compare to the relative frequency you found in Step 2?

4. One of the ways you can win at craps is to roll a sum of 7 or 11 on your first roll. Using your results and those of your fellow students, determine the number of times a player won by rolling a sum of 7 on the first roll. What is the relative frequency of rolling a sum of 7? Repeat these calculations for a sum of 11. Which of these sums appears more likely to occur than the other, based on the class results?

Page 5: APS Chapter 07 Notes

5

5. One of the ways you can lose at craps is to roll a sum of 2, 3, or 12 on your first roll. Using your results and those of your fellow students, determine the number of times a player lost by rolling a sum of 2 on the first roll. What is the relative frequency of rolling a sum of 2? Repeat these calculations for a sum of 3 and a sum of 12. Which of these sums appears more likely to occur than the others, based on the class results?

6. Clearly, the key quantity of interest in craps is the sum of the numbers on the two dice. Let's try to get a better idea of how this sum behaves in general by conducting a simulation. First, determine how you would simulate the roll of a single fair die. (Hint: Just use digits 1 to 6 and ignore the others.) Then determine how you would simulate a roll of two fair dice. Using this model, simulate 36 rolls of a pair of dice and determine the relative frequency of each of the possible sums. Alternatively, use the applet at the Web site nces.ed.gov/nceskids/probability/.

Page 6: APS Chapter 07 Notes

6

7. Construct a relative frequency histogram of the relative frequency results in Step 6. What is the approximate shape of the distribution? What sum appears most likely to occur? Which appears least likely to occur?

8. From the relative frequency data in Step 6, compute the relative frequency of winning and the relative frequency of losing on your first roll in craps. How do these simulated results compare with what the class obtained?
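For anyone who wants to check the class results against a computer, here is a minimal Python sketch of the activity. It is not part of the original notes: the function names, the seed, and the choice of 100,000 simulated games are all assumptions made for illustration. It builds the exact distribution of the sum of two fair dice, then plays craps by the rules stated above to estimate the overall win rate.

```python
import random
from fractions import Fraction
from itertools import product


def two_dice_sum_distribution():
    """Exact distribution of the sum of two fair six-sided dice."""
    dist = {}
    for a, b in product(range(1, 7), repeat=2):
        dist[a + b] = dist.get(a + b, 0) + Fraction(1, 36)
    return dist


def play_craps(rng):
    """Play one game by the rules in Activity 7A; return True on a win."""
    first = rng.randint(1, 6) + rng.randint(1, 6)
    if first in (7, 11):
        return True          # immediate win
    if first in (2, 3, 12):
        return False         # immediate loss
    while True:              # keep rolling until the first sum or a 7 appears
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == first:
            return True
        if roll == 7:
            return False


def estimate_win_rate(n_games, seed=0):
    """Relative frequency of wins over n_games simulated games."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    wins = sum(play_craps(rng) for _ in range(n_games))
    return wins / n_games


dist = two_dice_sum_distribution()
first_roll_win = dist[7] + dist[11]              # win on the first roll
first_roll_loss = dist[2] + dist[3] + dist[12]   # lose on the first roll
win_rate = estimate_win_rate(100_000)

print("P(win on first roll)  =", first_roll_win)
print("P(lose on first roll) =", first_roll_loss)
print("Simulated overall win rate:", round(win_rate, 4))
```

With only 20 games per group the relative frequency will bounce around quite a bit; the larger simulation is there to show where those small-sample results tend to settle.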

Page 7: APS Chapter 07 Notes

7

7. Introduction

Page 8: APS Chapter 07 Notes

8

07.Intro.01: Define what is meant by a random variable.

Random variables are the basic units of sampling distributions, which, in turn, are the foundation for inference.

Two flavors: discrete and continuous.

Page 9: APS Chapter 07 Notes

9

7.1 Discrete and Continuous Random Variables

Page 10: APS Chapter 07 Notes

10

07.01.01: Define a discrete random variable.

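A discrete random variable X has a countable number of possible values x_1, x_2, …, x_k. Its probability distribution assigns a probability p_i to each value x_i.

The probabilities p_i must satisfy two requirements:

1. Every probability p_i is a number between 0 and 1.

2. The sum of the probabilities is 1: p_1 + p_2 + … + p_k = 1.

Find the probability of any event by adding the probabilities p_i of the particular values x_i that make up the event.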

Page 11: APS Chapter 07 Notes

11

Example 7.1 Getting good grades

Finding discrete probabilities

North Carolina State University posts the grade distributions for its courses online. Students in Statistics 101 in the fall 2003 semester received 21% A's, 43% B's, 30% C's, 5% D's, and 1% F's. Choose a Statistics 101 student at random. To “choose at random” means to give every student the same chance to be chosen. The student's grade on a four­point scale (with A = 4) is a random variable X.

The value of X changes when we repeatedly choose students at random, but it is always one of 0, 1, 2, 3, or 4. Here is the distribution of X:

Value of X:    0      1      2      3      4
Probability:   0.01   0.05   0.30   0.43   0.21

The probability that the student got a B or better is the sum of the probabilities of an A and a B. In the language of random variables,

P(X ≥ 3) = P(X = 3) + P(X = 4) = 0.43 + 0.21 = 0.64
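The same calculation can be sketched in Python. The probabilities come straight from the grade percentages in Example 7.1; the helper names `is_valid_distribution` and `event_probability` are made up for this illustration. The sketch first checks the two requirements for a discrete distribution, then adds the probabilities of the values that make up the event "B or better."

```python
# Distribution of X from Example 7.1 (grade on a four-point scale, A = 4).
grade_dist = {0: 0.01, 1: 0.05, 2: 0.30, 3: 0.43, 4: 0.21}


def is_valid_distribution(dist, tol=1e-9):
    """Check both requirements: each p_i in [0, 1] and the p_i sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    sums_to_one = abs(sum(dist.values()) - 1) < tol
    return in_range and sums_to_one


def event_probability(dist, event):
    """Add the probabilities of the values x_i for which event(x_i) is True."""
    return sum(p for x, p in dist.items() if event(x))


p_b_or_better = event_probability(grade_dist, lambda x: x >= 3)

print(is_valid_distribution(grade_dist))
print(round(p_b_or_better, 2))
```

Passing the event as a function (here `lambda x: x >= 3`) keeps the helper reusable for other events, such as "C or worse" with `lambda x: x <= 2`.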

Page 12: APS Chapter 07 Notes

12

07.01.02: Explain what is meant by a probability distribution.

The probability distribution is the organization of possible outcomes of a discrete random variable with the associated probabilities of each outcome.

These distributions can be in a table, as before.

They can be in histograms as well.

Probability histograms are closely related to the density curves used for continuous random variables.

Page 13: APS Chapter 07 Notes

13

Figure 7.1 Probability histograms for (a) random digits 0 to 9 and (b) Benford's law. The height of each bar shows the probability assigned to a single outcome.

Make note of the sums of the bars...it's your destiny...er...density?

Page 14: APS Chapter 07 Notes

14

07.01.03: Construct the probability distribution for a discrete random variable.

Example 7.2 Tossing coins

Values of a random variable

What is the probability distribution of the discrete random variable X that counts the number of heads in four tosses of a coin? We can derive this distribution if we make two reasonable assumptions:

1. The coin is balanced, so each toss is equally likely to give H or T.

2. The coin has no memory, so tosses are independent.

The outcome of four tosses is a sequence of heads and tails such as HTTH. There are 16 possible outcomes in all. Figure 7.2 lists these outcomes along with the value of X for each outcome. The multiplication rule for independent events tells us that, for example,

P(HTTH) = 1/2 × 1/2 × 1/2 × 1/2 = 1/16

Each of the 16 possible outcomes similarly has probability 1/16. That is, these outcomes are equally likely.

Page 15: APS Chapter 07 Notes

15

The number of heads X has possible values 0, 1, 2, 3, and 4. These values are not equally likely. As Figure 7.2 shows, there is only one way that X = 0 can occur: namely, when the outcome is TTTT. So P(X = 0) = 1/16. But the event X = 2 can occur in six different ways, so that

P(X = 2) = (count of ways X = 2 can occur) / 16 = 6/16

Page 16: APS Chapter 07 Notes

16

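We can find the probability of each value of X from Figure 7.2 in the same way. Here is the result:

P(X = 0) = 1/16 = 0.0625
P(X = 1) = 4/16 = 0.25
P(X = 2) = 6/16 = 0.375
P(X = 3) = 4/16 = 0.25
P(X = 4) = 1/16 = 0.0625

These probabilities have sum 1, so this is a legitimate probability distribution. In table form the distribution is

Number of heads:   0        1      2       3      4
Probability:       0.0625   0.25   0.375   0.25   0.0625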

Page 17: APS Chapter 07 Notes

17

Figure 7.3 is a probability histogram for this distribution. The probability distribution is exactly symmetric. It is an idealization of the relative frequency distribution of the number of heads after many tosses of four coins, which would be nearly symmetric but is unlikely to be exactly symmetric.

Page 18: APS Chapter 07 Notes

18

Any event involving the number of heads observed can be expressed in terms of X, and its probability can be found from the distribution of X. For example, the probability of tossing at most two heads is

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.0625 + 0.25 + 0.375 = 0.6875

The probability of at least one head is most simply found by use of the complement rule:

P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.0625 = 0.9375
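The counting argument in Example 7.2 can be sketched in Python as a check. This is an illustration rather than part of the original notes, and the variable names are invented. It lists all 16 equally likely sequences, counts heads in each, and uses exact fractions so the results can be compared directly with the fractions over 16.

```python
from fractions import Fraction
from itertools import product

# All sequences of H and T of length 4, e.g. "HTTH"; there are 2**4 = 16.
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]
prob_each = Fraction(1, len(outcomes))  # balanced coin, independent tosses

# Build the distribution of X = number of heads by adding 1/16 per outcome.
heads_dist = {}
for outcome in outcomes:
    x = outcome.count("H")
    heads_dist[x] = heads_dist.get(x, 0) + prob_each

# "At most two heads" adds three values; "at least one head" uses the complement.
p_at_most_two = sum(p for x, p in heads_dist.items() if x <= 2)
p_at_least_one = 1 - heads_dist[0]

for x in sorted(heads_dist):
    print(f"P(X = {x}) = {heads_dist[x]} = {float(heads_dist[x])}")
print("P(X <= 2) =", float(p_at_most_two))
print("P(X >= 1) =", float(p_at_least_one))
```

Note that `Fraction` reduces automatically, so 6/16 prints as 3/8 and 4/16 as 1/4.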

Page 19: APS Chapter 07 Notes

19

07.01.04: Given a probability distribution for a discrete random variable, construct a probability histogram.

See the last example.

Page 20: APS Chapter 07 Notes

20

Exercises p. 469 #7.2, 3, 4, 5

Page 21: APS Chapter 07 Notes

21

07.01.05: Review: define density curve.

A nonnegative function that has area exactly 1 between it and the horizontal axis.

This corresponds to a total probability of 1.

To be useful, the density curves must be functions we know (like the Normal curve) or they must have simple geometric shapes for area calculation.

Page 22: APS Chapter 07 Notes

22

07.01.06: Explain what is meant by a uniform distribution.

Page 23: APS Chapter 07 Notes

23

Example 7.3 Random numbers and the uniform distribution

Areas under a density curve

The random number generator will spread its output uniformly across the entire interval from 0 to 1 as we allow it to generate a long sequence of numbers. The results of many trials are represented by the density curve of a uniform distribution (Figure 7.5). This density curve has height 1 over the interval from 0 to 1. The area under the density curve is 1, and the probability of any event is the area under the density curve and above the event in question.

Page 24: APS Chapter 07 Notes

24

As Figure 7.5(a) illustrates, the probability that the random number generator produces a number X between 0.3 and 0.7 is

P(0.3 ≤ X ≤ 0.7) = 0.4

because the area under the density curve and above the interval from 0.3 to 0.7 is 0.4. The height of the density curve is 1 and the area of a rectangle is the product of height and length, so the probability of any interval of outcomes is just the length of the interval. So,

Notice that the last event consists of two nonoverlapping intervals, so the total area above the event is found by adding two areas, as illustrated by Figure 7.5(b). This assignment of probabilities obeys all of our rules for probability.
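Here is a small Python sketch of the "probability equals length" idea for the uniform distribution on 0 to 1. The function name and the two-piece event (X ≤ 0.2 or X ≥ 0.9) are hypothetical choices for illustration and are not taken from the original example. The sketch computes areas exactly and then compares one of them with the relative frequency from a seeded random number generator.

```python
import random


def uniform_interval_prob(a, b):
    """P(a <= X <= b) for X uniform on [0, 1].

    The density has height 1, so the area is just the length of the part
    of [a, b] that lies inside [0, 1].
    """
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)


# The interval from the example: area above 0.3 to 0.7.
p_middle = uniform_interval_prob(0.3, 0.7)

# Hypothetical event made of two nonoverlapping intervals: add the two areas.
p_two_pieces = uniform_interval_prob(0.0, 0.2) + uniform_interval_prob(0.9, 1.0)

# Compare with a long run of random numbers (seed chosen arbitrarily).
rng = random.Random(1)
draws = [rng.random() for _ in range(100_000)]
sim_middle = sum(0.3 <= x <= 0.7 for x in draws) / len(draws)

print("Exact P(0.3 <= X <= 0.7):", round(p_middle, 4))
print("Simulated relative frequency:", round(sim_middle, 4))
print("Two-piece event:", round(p_two_pieces, 4))
```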

Page 25: APS Chapter 07 Notes

25

07.01.07: Define a continuous random variable and a probability distribution for a continuous random variable.

Page 26: APS Chapter 07 Notes

26

All continuous probability distributions assign probability 0 to every individual outcome.

P(X = a) = 0 for any individual outcome a.
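One way to make sense of P(X = a) = 0 without calculus is to shrink an interval around a and watch the area shrink with it. The Python sketch below does this for a uniform random variable on 0 to 1. The point a = 0.5 and the list of widths are arbitrary illustrative choices, and the helper name is made up.

```python
def uniform_interval_prob(a, b):
    """P(a <= X <= b) for X uniform on [0, 1]: the length of the overlap."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)


a = 0.5                             # the single outcome of interest
widths = [0.1, 0.01, 0.001, 0.0]    # intervals centered at a, getting narrower
probs = [uniform_interval_prob(a - w / 2, a + w / 2) for w in widths]

for w, p in zip(widths, probs):
    print(f"width {w}: probability {p:.4f}")
```

The last interval has width 0, so it is the single point a, and its area is 0. This is also why, for a continuous random variable, P(X ≤ a) and P(X < a) are the same number.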

Different from discrete probability.

Easy concept for you Calc kids...huh...

Let us get our non-Calcs to make sense of this nonsense.

Page 27: APS Chapter 07 Notes

27

The upcoming problems are much like the Chapter 2 problems, only the language has changed to that involving probabilities.

For example, the next slide takes a new look at probability using the Normal distribution curve.

Page 28: APS Chapter 07 Notes

28

Example 7.4 Cheating in school

Continuous random variables

Students are reluctant to report cheating by other students. A sample survey puts this question to an SRS of 400 undergraduates: “You witness two students cheating on a quiz. Do you go to the professor?” Suppose that if we could ask all undergraduates, 12% would answer “Yes.”

Page 29: APS Chapter 07 Notes

29

Exercises p. 475 #7.7, 9

Page 30: APS Chapter 07 Notes

30

Section 7.1 | Summary

The previous chapter included a general discussion of the idea of probability and the properties of probability models. Two very useful specific types of probability models are distributions of discrete and continuous random variables. In our study of statistics we will employ only these two types of probability models.

A random variable is a variable taking numerical values determined by the outcome of a random phenomenon. The probability distribution of a random variable X tells us what the possible values of X are and what probabilities are assigned to those values.

A random variable X and its distribution can be discrete or continuous.

A discrete random variable has a countable number of possible values. The probability distribution assigns each of these values a probability between 0 and 1 such that the sum of all the probabilities is exactly 1. The probability of any event is the sum of the probabilities of all the values that make up the event.

A continuous random variable takes all values in some interval of numbers. A density curve describes the probability distribution of a continuous random variable. The probability of any event is the area under the curve above the values that make up the event.

Normal distributions are one type of continuous probability distribution.

You can picture a probability distribution by drawing a probability histogram in the discrete case or by graphing the density curve in the continuous case.

When you work problems, get in the habit of first identifying the random variable of interest. X = number of _____ for discrete random variables, and X = amount of _____ for continuous random variables.

Page 31: APS Chapter 07 Notes

31

Exercises p. 477: None specifically assigned

Page 32: APS Chapter 07 Notes

32

7.2 Means and Variances of Random Variables

Page 33: APS Chapter 07 Notes

33

Activity 7B

Means of random variables

To see how means of random variables work, consider a random variable that takes values 1, 1, 2, 3, 5, 8. Do the following.

1. Calculate the mean μ of the population.

2. Make a list of all of the samples of size 2 from this population. (Caution: Notice that in our population, the first two values are the same. To distinguish them from one another, we will use subscripts: 1a and 1b.) As a check, you should have 15 subsets of size 2. Here's the beginning of our list:

3. Find the mean of the 15 x̄-values (the sample means) in the third column. Compare this with the population mean that you calculated in Step 1.

4. Repeat Steps 1 to 3 for a different (but still small) population of your choice. Now compare your results with those of other students in your class.

5. Write a brief statement that describes what you discovered.
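After working Activity 7B by hand, you can check your list with a short Python sketch like the one below. It is an illustration only, and the variable names are invented. It forms every sample of size 2 from the population 1, 1, 2, 3, 5, 8, computes each sample mean, and compares the mean of those sample means with the population mean.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 1, 2, 3, 5, 8]
pop_mean = Fraction(sum(population), len(population))

# combinations() picks by position, so the two 1s count as different
# members of the population (the 1a and 1b of Step 2).
samples = list(combinations(population, 2))
sample_means = [Fraction(a + b, 2) for a, b in samples]
mean_of_sample_means = sum(sample_means) / len(sample_means)

print("Number of samples:", len(samples))
print("Population mean:", pop_mean)
print("Mean of the sample means:", mean_of_sample_means)
```

To try Step 4, replace `population` with a different small list and rerun.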

Page 34: APS Chapter 07 Notes

34

7.02.01: Define what is meant by the mean of a random variable.

Page 35: APS Chapter 07 Notes

35

7.02.02: Calculate the mean of a discrete random variable.

Page 36: APS Chapter 07 Notes

36

7.02.03: Calculate the variance and standard deviation of a discrete random variable.
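The slide content for these objectives is not in the transcript, so the Python sketch below assumes the standard definitions: the mean is μ_X = x_1 p_1 + x_2 p_2 + … + x_k p_k, the variance is σ²_X = (x_1 − μ_X)² p_1 + … + (x_k − μ_X)² p_k, and the standard deviation is the square root of the variance. It applies them to the grade distribution from Example 7.1; the function names are made up for illustration.

```python
from math import sqrt

# Grade distribution from Example 7.1 (A = 4 on a four-point scale).
grade_dist = {0: 0.01, 1: 0.05, 2: 0.30, 3: 0.43, 4: 0.21}


def mean(dist):
    """Weighted average of the values, each weighted by its probability."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Weighted average of squared deviations from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


mu_x = mean(grade_dist)
var_x = variance(grade_dist)
sd_x = sqrt(var_x)

print("mean =", round(mu_x, 4))
print("variance =", round(var_x, 4))
print("standard deviation =", round(sd_x, 4))
```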

Page 37: APS Chapter 07 Notes

37

7.02.04: Explain, and illustrate with an example, what is meant by the law of large numbers.
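One possible illustration of the law of large numbers, sketched in Python: roll a fair die over and over and track the mean of the rolls so far. The choice of a die, the seed, and the number of rolls are assumptions made for this sketch, not the example from the original slide. The mean of a single fair die roll is 3.5, and the running mean should settle near that value as the number of rolls grows.

```python
import random


def running_means(n_rolls, seed=0):
    """Mean of the first n rolls of a fair die, for n = 1, ..., n_rolls."""
    rng = random.Random(seed)  # arbitrary fixed seed for repeatability
    total = 0
    means = []
    for n in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        means.append(total / n)
    return means


means = running_means(200_000)

for n in (10, 100, 1000, 10_000, 200_000):
    print(f"after {n} rolls: mean = {means[n - 1]:.3f}")
```

The early means can be far from 3.5. Only the long run is predictable, which is the point of the law.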

Page 38: APS Chapter 07 Notes

38

7.02.05: Explain what is meant by the law of small numbers.

Page 39: APS Chapter 07 Notes

39

7.02.06: Given μ_X and μ_Y, calculate μ_{a+bX} and μ_{X+Y}.

Page 40: APS Chapter 07 Notes

40

7.02.07: Given σ_X and σ_Y, calculate σ²_{a+bX} and σ²_{X+Y} (where X and Y are independent).

Page 41: APS Chapter 07 Notes

41

7.02.08: Explain how standard deviations are calculated when combining random variables.
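The rules behind the last three objectives are not spelled out in the transcript, so this Python sketch assumes the standard ones: μ_{a+bX} = a + bμ_X, μ_{X+Y} = μ_X + μ_Y, σ²_{a+bX} = b²σ²_X, and, for independent X and Y, σ²_{X+Y} = σ²_X + σ²_Y. It checks them exactly on two fair dice. The constants a = 2 and b = 3 and all function names are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

die = {face: Fraction(1, 6) for face in range(1, 7)}  # one fair die


def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def linear_transform(dist, a, b):
    """Distribution of a + bX."""
    out = {}
    for x, p in dist.items():
        out[a + b * x] = out.get(a + b * x, 0) + p
    return out


def sum_independent(dist_x, dist_y):
    """Distribution of X + Y when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py  # multiplication rule
    return out


a, b = 2, 3                      # hypothetical constants
shifted = linear_transform(die, a, b)
total = sum_independent(die, die)

print("mean of a + bX:", mean(shifted), "vs a + b*mean:", a + b * mean(die))
print("var of a + bX:", variance(shifted), "vs b^2*var:", b**2 * variance(die))
print("mean of X + Y:", mean(total))
print("var of X + Y:", variance(total), "vs sum of variances:", 2 * variance(die))

# Standard deviations do not add: combine variances first, then take the root.
sd_sum = sqrt(variance(total))
sum_of_sds = 2 * sqrt(variance(die))
print("sd of X + Y:", round(sd_sum, 4), "vs sd_X + sd_Y:", round(sum_of_sds, 4))
```

The last two lines show why the notes treat standard deviations separately: the standard deviation of the sum is smaller than the sum of the standard deviations.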

Page 42: APS Chapter 07 Notes

42

7.02.09: Discuss the shape of a linear combination of independent Normal random variables.