Chapter 7: SAMPLING DISTRIBUTIONS & POINT …homepage.stat.uiowa.edu/~rdecook/stat2020/notes/ch7_pt1.pdf · observed x as a point estimate for . ... The Sampling Distribution of X

Chapter 7: SAMPLING DISTRIBUTIONS & POINT ESTIMATION OF PARAMETERS

Part 1: Introduction

Sampling Distributions & the Central Limit Theorem

Point Estimation & Estimators

Sections 7-1 to 7-2

Sample data is collected on a population to draw conclusions, or make statistical inferences, about the population.

Types of statistical inference:

1) parameter estimation (e.g. estimating µ), with a certain level of confidence

2) hypothesis testing (e.g. H0 : µ = 50)


Example of parameter estimation (or point estimation):

We’re interested in the value of µ.

We collected data and we use the observed x̄ as a point estimate for µ.

µ is the unknown parameter being estimated.

NOTATION: µ̂ = X̄

X̄ is the estimator.

{We often show an estimator as a ‘hat’ over its respective parameter.}

The observed x̄ is a single value, or a point estimate.

Prior to data collection, X̄ is a random variable, and it is the statistic of interest from the data.


Sample-to-sample variability

The value we get for X̄ (the sample mean) depends on the specific sample chosen.

[Figure: a sample drawn from a population]

This means X̄ is a random variable!

The distribution of the random variable X̄ is called the sampling distribution of X̄.

We expect X̄ to be close to µ (we ARE using it to estimate µ), but there is variability in X̄ before it is observed because we use random sampling to choose our sample of size n.


The Sampling Distribution of X̄...

• Tells us what kind of values are likely to occur for X̄.

• Puts a probability distribution over the possible values for X̄.
HINT: Its distribution will be normal when conditions are met.

In a simple random sample of n observations from a population,

E(X̄) = µ

⇒ X̄ is an unbiased estimator of µ.

This gives us a measure of center for the sampling distribution for X̄, but what about the variability of the X̄ random variable?


Sampling distribution of X̄

Case 1: Original population is normally distributed.

[Figure: normal population density f(x) plotted against x]

The x̄ I observe depends on the sample (the particular n observations) I chose from this normal distribution.

Let’s look at the distribution of x̄ values if I choose a sample of size n and compute x̄ for that sample, and I repeat this process 1000 times...


[Figure: normal population density f(x) plotted against x]

1) Choose a sample of size n from a normal distribution

2) Compute x̄

3) Plot the x̄ on our frequency histogram

4) Do steps 1-3 1000 times

See applet at: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
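The four steps above can also be sketched in a few lines of Python. This is only an illustration: the population values µ = 50 and σ = 10, the seed, and the function name simulate_sample_means are assumptions chosen for the example, not values from these notes. Instead of plotting a histogram, it prints the mean and standard deviation of the 1000 simulated x̄ values.

```python
import random
import statistics

def simulate_sample_means(n, reps=1000, mu=50.0, sigma=10.0, seed=1):
    """Repeat steps 1-3 `reps` times and return the list of observed x-bar values.

    mu and sigma describe a hypothetical normal population (assumed values).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # Step 1: choose a sample of size n from N(mu, sigma^2)
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # Step 2: compute x-bar; step 3 would add it to the histogram
        means.append(sum(sample) / n)
    return means

for n in (2, 25):
    means = simulate_sample_means(n)
    # Center and spread of the simulated sampling distribution of X-bar
    print(n, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```

Both sets of x̄ values should center near the assumed µ, with the n = 25 values noticeably less spread out than the n = 2 values.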


SKETCH THE PLOTS:

Distribution of X̄ for n = 2 when original population is normal.

Distribution of X̄ for n = 25 when original population is normal.


Turns out, in this case, the random variable X̄ is normally distributed.

This normal distribution is centered at µ (the mean of the original population we were sampling from).

The variability of X̄ depends on the sample size n, and the variability in the original population.

SPECIFICALLY:

When X ∼ N(µ, σ²),

X̄ ∼ N(µ, σ²/n)

NOTE: the distribution for X̄ is less variable than the distribution for X.


X̄ ∼ N(µ, σ²/n)

NOTE: X̄ from n = 25 is less variable than X̄ from n = 2.

More data (larger n) gives us a better estimate of µ from X̄.

The distribution of our estimator X̄ is squished closer, or is tighter, around the thing we’re trying to estimate, which is exactly what we want from an estimator.
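As a small numeric illustration of how σ²/n shrinks with n, the sketch below computes the standard deviation of X̄, which is σ/√n. The value σ = 10 and the function name sd_of_sample_mean are assumptions made up for the example.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of X-bar for a random sample of size n: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation (assumed)
for n in (1, 2, 25, 100):
    print(n, round(sd_of_sample_mean(sigma, n), 3))
```

Note that the spread falls with √n, not n: going from n = 25 to n = 100 only halves it.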


Sampling distribution of X̄

Case 2: Original population is NOT normally distributed.

[Figures: three example non-normal population densities f(x) plotted against x]

Or anything else...


What does the distribution of X̄ look like?

1) Choose a sample of size n from the distribution

2) Compute x̄

3) Plot the x̄ on our frequency histogram

4) Do steps 1-3 1000 times

———————————————————–

Right-skewed with n = 10.


Really non-normal (mass out at the ends) with n = 2.

Really non-normal (mass out at the ends) with n = 25.


Turns out the random variable X̄ is approximately normally distributed no matter what your original distribution was IF n is large enough...

What’s large enough?
Rule of thumb is n ≥ 30

So, what have we learned...

if X is normally distributed, then
X̄ ∼ N(µ, σ²/n) for any n.

if X is NOT normally distributed, then
X̄ ∼ N(µ, σ²/n) approximately for n ≥ 30.

if X is not severely non-normal, then
X̄ ∼ N(µ, σ²/n) is close to true for n < 30.
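The second point in this summary can be checked with a quick simulation. The sketch below is a rough illustration under assumed choices: the population is exponential with rate 1 (strongly right-skewed), and the helper names sample_skewness and skewness_of_sample_means are invented for this example. It measures how lopsided the simulated x̄ values are. A normal distribution has skewness 0.

```python
import random
import statistics

def sample_skewness(values):
    """Average cubed z-score: near 0 for symmetric data, positive for right skew."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def skewness_of_sample_means(n, reps=5000, seed=7):
    """Simulate `reps` sample means from an exponential(rate=1) population (assumed)."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    return sample_skewness(means)

for n in (2, 30):
    print(n, round(skewness_of_sample_means(n), 2))
```

The skewness for n = 30 should come out much closer to 0 than for n = 2, though not exactly 0, which is why the normal model for X̄ is an approximation here.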


Sampling Distributions and the Central Limit Theorem

Section 7-2

Sample data is collected on a population to draw conclusions, or make statistical inferences, about the population.

NOTATION:

− A capital letter like X̄ represents the random variable X̄, and X̄ can take on many values.

− A small letter like x̄ represents an actual observed x̄ from a sample, and it is a fixed quantity once observed.


• Random Sample

The random variables X1, X2, . . . , Xn are a random sample of size n if...

a) the Xi’s are independent random variables, and

b) every Xi has the same probability distribution (i.e. they are drawn from the same population).

NOTE: the observed data x1, x2, . . . , xn is also referred to as a random sample.


• Statistic

– A statistic is any function of the observations in a random sample.

∗ Example:
The mean X̄ is a function of the observations (specifically, a linear combination of the observations).

X̄ = (∑ Xi)/n = (1/n)X1 + (1/n)X2 + · · · + (1/n)Xn,   with the sum running over i = 1, . . . , n

– A statistic is a random variable, and it has a probability distribution.

– The distribution of a statistic is called the sampling distribution of the statistic because it depends on the sample chosen.


– The sampling distribution of the mean is very important.

What is the expected value of the sample mean X̄ in a random sample?

E(X̄) = E((1/n)X1 + (1/n)X2 + · · · + (1/n)Xn)

= (1/n) ∑ E(Xi)

= (1/n) ∑ µ = nµ/n = µ = µX̄

Notation: E(X̄) = µX̄ = µ

where µ is the population mean.

(µ is also the expected value of a single Xi)


What is the variance of the sample mean X̄ in a random sample?

(Xi’s in a random sample are independent.)

V(X̄) = V((1/n)X1 + (1/n)X2 + · · · + (1/n)Xn)

= (1/n)² ∑ V(Xi)

= (1/n)² ∑ σ²

= (1/n)² nσ² = σ²/n

Notation: V(X̄) = σ²X̄ = σ²/n

where σ² is the population variance.

(σ² is also the variance of a single Xi)
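Both results, E(X̄) = µ and V(X̄) = σ²/n, can be verified exactly on a small case by listing every possible sample. The sketch below does this for a fair six-sided die with n = 2. The die population and the function names are assumptions chosen for illustration, not an example from these notes.

```python
from fractions import Fraction
from itertools import product

def mean_and_variance(pmf):
    """Exact mean and variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    return mu, var

def sampling_distribution_of_mean(pmf, n):
    """Exact pmf of X-bar for n independent draws, by enumerating all outcomes."""
    dist = {}
    for outcome in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in outcome:
            prob *= pmf[x]  # independence: probabilities multiply
        xbar = Fraction(sum(outcome), n)
        dist[xbar] = dist.get(xbar, 0) + prob
    return dist

die = {x: Fraction(1, 6) for x in range(1, 7)}  # hypothetical population: fair die
mu, var = mean_and_variance(die)
mu_bar, var_bar = mean_and_variance(sampling_distribution_of_mean(die, n=2))
print(mu, var)          # 7/2 35/12
print(mu_bar, var_bar)  # 7/2 35/24
```

The mean of X̄ matches µ = 7/2, and its variance 35/24 is the population variance 35/12 divided by n = 2.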


As we have described earlier, for n ≥ 30, approximately

X̄ ∼ N(µ, σ²/n)

(and this holds exactly, even for n < 30, if each Xi comes from a normal population).

Using this fact, and what we know about standardizing variables, leads to...

• The Central Limit Theorem

If X1, X2, . . . , Xn is a random sample of size n taken from a population with mean µ and variance σ², the limiting form of the distribution of

Z = (X̄ − µ) / (σ/√n)

as n → ∞ is the standard normal distribution, or N(0, 1).


The quality of the approximation

(X̄ − µ) / (σ/√n) ∼ N(0, 1)

depends on the size of n.

Satisfactory approximation for n ≥ 30 for any population.

Satisfactory approximation for n < 30 for near normal populations.
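To see how the approximation depends on n, the sketch below standardizes simulated sample means and estimates P(Z ≤ 0), which equals 0.5 for a true N(0, 1) variable. The exponential(rate 1) population (µ = 1, σ = 1), the seed, and the function name prob_z_below_zero are assumptions chosen for this illustration.

```python
import math
import random

def prob_z_below_zero(n, reps=20000, seed=3):
    """Estimate P(Z <= 0), where Z = (X-bar - mu) / (sigma / sqrt(n))."""
    mu, sigma = 1.0, 1.0  # mean and sd of the assumed exponential(rate=1) population
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z = (xbar - mu) / (sigma / math.sqrt(n))
        if z <= 0:
            hits += 1
    return hits / reps

for n in (1, 5, 30):
    print(n, round(prob_z_below_zero(n), 3))
```

For this skewed population the estimate starts well above 0.5 at n = 1 and moves toward 0.5 as n grows, consistent with the rule of thumb above.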

————————————————————

The next graphic shows 3 different original populations (one nearly normal, two that are not), and the sampling distribution for X̄ based on a sample of size n = 5 and size n = 30.


The three original distributions are on the far left (one that is nearly symmetric and bell-shaped, one that is right skewed, and one that is highly right skewed).

As shown in: Navidi, W. ‘Statistics for Engineers and Scientists’, McGraw Hill, 2006


Things to notice from the previous graphic:

• The variability of X̄ decreases as n increases.

Recall: V(X̄) = σ²/n.

• If the original population has a shape that’s closer to normal, smaller n is sufficient for X̄ to be approximately normal.

• The normal approximation gets better with larger n when you’re starting with a non-normal population.

• Even when X has a very non-normal distribution, X̄ still has an approximately normal distribution with a large enough n.


• Example: Flaws in a copper wire.

Let X denote the number of flaws in a 1 inch length of copper wire. The probability mass function of X is presented in the following table:

x    P(X = x)
0    0.48
1    0.39
2    0.12
3    0.01

Suppose n = 100 wires are sampled from this population. What is the probability that the average number of flaws per wire in the sample is less than 0.5?


ANS:

P (X̄ < 0.5) =?
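One way to fill in the answer, offered as a sketch rather than the notes’ own worked solution: since n = 100 ≥ 30, treat X̄ as approximately N(µ, σ²/n), compute µ and σ² from the table, standardize 0.5, and evaluate the standard normal CDF. The helper std_normal_cdf is our own name. It is built from math.erf in the standard library.

```python
import math

pmf = {0: 0.48, 1: 0.39, 2: 0.12, 3: 0.01}  # table of P(X = x) from the example
n = 100

# Population mean and variance from the probability mass function
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Standard deviation of X-bar, then the z-score of 0.5
se = math.sqrt(var / n)
z = (0.5 - mu) / se

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

prob = std_normal_cdf(z)
print(round(mu, 2), round(var, 4), round(z, 2), round(prob, 4))
```

Under the normal approximation this gives µ = 0.66, σ² = 0.5244, z ≈ −2.21, and P(X̄ < 0.5) ≈ 0.0136.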


Some Notation: Sampling distribution for sample mean (X̄)

Suppose we have a random sample of size n drawn from a parent (original) population with an expected value µ and variance σ². Then,

X̄ ∼ N(µ, σ²/n)

is approximately true for sample size n ≥ 30 no matter what the distribution of the parent population, but also true for smaller n when the parent population is normal or near-normal.

Notation:

E(X̄) = µX̄ = E(X) = µ

V(X̄) = σ²X̄ = V(X)/n = σ²/n


Terminology:

The term standard deviation refers to the population standard deviation, or

√V(X) = σ,

and...

Z = (X − µ)/σ

The term standard error is a value related to X̄. Stated more fully, it is the standard error of the sample mean, and it is the square root of the variance of X̄, or

√V(X̄) = √(σ²/n) = σ/√n

And then...

Z = (X̄ − µ)/√(σ²/n) = (X̄ − µ)/(σ/√n)
