Stat 321 – Lecture 19 Central Limit Theorem
Dec 21, 2015
Reminders
HW 6 due tomorrow
Exam solutions on-line
Today’s office hours: 1-3pm
Ch. 5 “reading guide” in Blackboard (ignore page numbers)
Definitions
A statistic is any quantity whose value can be calculated from sample data.
A simple random sample of size n gives every sample of size n the same probability of occurring. Consequently, the Xi are independent random variables and every Xi has the same probability distribution.
As a function of random variables, a statistic is also a random variable and has its own probability distribution called a sampling distribution.
When n is small, we can derive the sampling distribution exactly. In other cases, we can use simulation to investigate properties of the sampling distribution.
A statistic is an unbiased estimator if E(statistic) = parameter.
Previously
Rules for Expected Value: E(X+Y) = E(X) + E(Y)
Rules for Variance: V(X+Y) = V(X) + V(Y) IF X and Y are independent
Moral
It is often possible to find the distribution of combinations of random variables like sums and averages
What about the sample mean…
The Central Limit Theorem
Let X1, …, Xn be independent and identically distributed random variables, each with mean μ and variance σ². Then if n is sufficiently large, X̄ has (approximately) a normal distribution with E(X̄) = μ and V(X̄) = σ²/n.
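The theorem can be checked numerically. The sketch below is an illustration only: it assumes an Exponential(1) population (mean 1, variance 1), which is not from the lecture, and the function name `sample_means` is made up here. If the theorem holds, the sample means should have mean close to 1 and variance close to 1/n.

```python
import random
import statistics

def sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an Exponential(1) population
    (mean 1, variance 1; an assumed example) and return the sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

n = 40
means = sample_means(n=n, reps=20000)

# Compare the simulated values with what the CLT says: mu = 1, sigma^2/n = 1/40.
print("mean of sample means:    ", statistics.fmean(means))
print("variance of sample means:", statistics.variance(means))
print("sigma^2 / n:             ", 1 / n)
```

A histogram of `means` should also look roughly bell-shaped even though the population itself is strongly skewed.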
Example
Ethan Allen, October 5, 2005
There are several possible explanations; could excess passenger weight be one?
Weights of Americans
CDC: mean = 167 lbs, SD = 35 lbs
Want P(T > 7500) for a random sample of n = 47 passengers
Equivalent to P(X̄ > 159.57), since 7500/47 = 159.57
Sampling distribution of X̄ should be approximately normal with mean 167 lbs and standard deviation 35/√47 = 5.11 lbs
Z = (159.57 - 167)/5.11 = -1.45, so P(X̄ > 159.57) = P(Z > -1.45) = .926
92.6% of boats were overweight…
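The arithmetic on this slide can be reproduced with the standard library. This is a minimal sketch using the slide's numbers; the variable names are chosen here for illustration. Because z is not rounded to two decimals, the final probability may differ from the slide's figure in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 167, 35      # CDC mean and SD of adult weights (lbs), from the slide
n = 47                   # number of passengers
total_limit = 7500       # total weight threshold T (lbs), from the slide

threshold = total_limit / n          # per-passenger average equivalent to T > 7500
se = sigma / sqrt(n)                 # standard deviation of the sample mean
z = (threshold - mu) / se            # standardized threshold
p_over = 1 - NormalDist().cdf(z)     # P(sample mean > threshold) under the CLT

print("threshold:", round(threshold, 2))
print("SD of sample mean:", round(se, 2))
print("z:", round(z, 2))
print("P(over limit):", p_over)
```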
Roulette
Total winnings vs. average winnings
Find P(X̄ > 0)
Exact sampling distribution with n = 2
x̄ p(x̄)
-1 .27699
0 .4983
1 .2244
Exact sampling distribution with n = 3
x̄ p(x̄)
-1 .1458
-1/3 .3963
1/3 .3543
1 .1063
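Exact sampling distributions like these can be generated from the binomial distribution. The sketch below assumes a $1 even-money bet with win probability 18/38 (an American wheel); that assumption is not stated on the slide but is consistent with the n = 2 probabilities. The function name `exact_mean_distribution` is hypothetical, and the computed values may differ slightly from the printed figures.

```python
from fractions import Fraction
from math import comb

P_WIN = Fraction(18, 38)  # assumption: even-money bet on an American wheel

def exact_mean_distribution(n, p=P_WIN):
    """Return {sample mean: probability} for n independent +1/-1 bets."""
    dist = {}
    for wins in range(n + 1):
        mean = Fraction(wins - (n - wins), n)   # (wins - losses) / n
        dist[mean] = comb(n, wins) * p**wins * (1 - p)**(n - wins)
    return dist

for n in (2, 3):
    dist = exact_mean_distribution(n)
    print(f"n = {n}")
    for mean in sorted(dist):
        print(f"  mean = {mean}: p = {float(dist[mean]):.4f}")
    p_positive = sum(pr for m, pr in dist.items() if m > 0)
    print(f"  P(mean > 0) = {float(p_positive):.4f}")
```

The number of distinct values of the sample mean is n + 1, so the table grows with n even though each entry is easy to compute.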
Empirical Sampling Distributions
It gets very cumbersome to do this for large n, so we will use simulation instead
Approximately 35% of samples have a positive sample mean
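A simulation of this kind might look like the following sketch. It assumes the same even-money bet (win probability 18/38), and the function name, default arguments, and the value n = 10 in the example call are arbitrary choices made here, since the slide does not give the sample size behind its figure.

```python
import random

def fraction_positive(n, reps=10000, p_win=18 / 38, seed=1):
    """Estimate P(sample mean > 0) for n even-money $1 bets by simulation."""
    rng = random.Random(seed)
    positive = 0
    for _ in range(reps):
        # Each spin pays +1 with probability p_win and -1 otherwise.
        total = sum(1 if rng.random() < p_win else -1 for _ in range(n))
        if total > 0:   # the sample mean is positive exactly when the total is
            positive += 1
    return positive / reps

# n = 10 is an arbitrary illustration, not the sample size used in lecture.
print(fraction_positive(n=10))
```

Results vary from run to run unless the seed is fixed, so a simulated percentage should be read as an estimate.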
Number Bet
y p(y)
-$1 .9737
$35 .0263
E(Y) = -.0526
SD(Y) = 5.76
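The mean and standard deviation in this table follow directly from the definitions. A minimal sketch, assuming the table's probabilities are 37/38 and 1/38 rounded to four decimals:

```python
from math import sqrt

# Payoff -> probability for a $1 single-number bet (assumed 38-slot wheel).
outcomes = {-1: 37 / 38, 35: 1 / 38}

mean = sum(y * p for y, p in outcomes.items())                   # E(Y)
variance = sum((y - mean) ** 2 * p for y, p in outcomes.items()) # V(Y)
sd = sqrt(variance)                                              # SD(Y)

print(round(mean, 4), round(sd, 2))  # -0.0526 5.76
```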
Number bet
What does CLT predict for n = 50 spins?
Approximately 47% of samples have a positive average?
Only 36%
Increases to 49% with large n?
1000 spins
About 5% positive
About 38% positive
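One way to see why the CLT prediction can miss for the number bet is to compare the normal approximation with an exact binomial calculation. The sketch below assumes a $1 single-number bet with win probability 1/38; the function names `clt_positive` and `exact_positive` are hypothetical. Simulated percentages, such as those quoted in lecture, vary from run to run and need not match the exact values.

```python
from math import comb, sqrt
from statistics import NormalDist

P_WIN = 1 / 38                        # assumed 38-slot wheel
MEAN = 35 * P_WIN - (1 - P_WIN)       # E(Y) = -2/38
SD = sqrt(35**2 * P_WIN + (1 - P_WIN) - MEAN**2)   # SD(Y), about 5.76

def clt_positive(n):
    """Normal (CLT) approximation to P(sample mean > 0) after n spins."""
    return 1 - NormalDist(MEAN, SD / sqrt(n)).cdf(0)

def exact_positive(n, p=P_WIN):
    """Exact P(sample mean > 0): total = 35*wins - (n - wins) = 36*wins - n."""
    min_wins = n // 36 + 1            # smallest number of wins with 36*wins > n
    p_too_few = sum(comb(n, k) * p**k * (1 - p) ** (n - k)
                    for k in range(min_wins))
    return 1 - p_too_few

for n in (50, 1000):
    print(n, round(clt_positive(n), 3), round(exact_positive(n), 3))
```

With n = 50 the payoff distribution is still so skewed that the sample mean is far from normal, which is why the approximation and the exact value disagree noticeably.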