Inferential Statistics: Estimation (gchang.people.ysu.edu/class/s3743/L374307S_6_Estimation.pdf)
Estimation
Estimation - 1
1
Estimation
Confidence Intervals for Means
2
Inferential Statistics
1. Type of Inference:
   Estimation
   Hypothesis Testing
2. Purpose:
   Make Decisions about Population Characteristics
[Figure: Population?]
3
Estimation Process
Population $(\mu, \sigma)$: the mean, $\mu$, is unknown.
Random sample: $X_1, X_2, \ldots, X_n$, where each $X_i \sim N(\mu, \sigma)$.
Sample: mean $\bar{X} = 98$.
4
Random Sample
A random sample is a set of independent and identically distributed (i.i.d.) random variables.
5
Theorem (Distribution of $\bar{X}$)
If $X_1, X_2, \ldots, X_n$ are observations of a random sample of size $n$ from the normal distribution $N(\mu, \sigma^2)$, then the distribution of the sample mean
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \]
is $N(\mu, \sigma^2/n)$.
6
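Theorem (Distribution of $\bar{X}$)
If $X_1, X_2, \ldots, X_n$ are observations of a random sample of size $n$ from a distribution that has a mean $\mu$ and a finite variance $\sigma^2$, then the distribution of
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma} \]
is $N(0, 1)$, as $n \to \infty$, and the distribution of the sample mean $\bar{X}$ is $N(\mu, \sigma^2/n)$, as $n \to \infty$.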
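The limiting statement in this theorem can be checked numerically. The sketch below is only an illustration, not part of the original slides: it assumes an Exponential(1) population (mean 1, standard deviation 1), and the function name `standardized_sample_means`, the sample size, the trial count, and the seed are all hypothetical choices.

```python
import random
import statistics

def standardized_sample_means(n, trials, seed=0):
    """Simulate Z = (xbar - mu) / (sigma / sqrt(n)) for repeated samples.

    Assumption for illustration: the population is Exponential(1),
    which is clearly non-normal and has mu = 1 and sigma = 1.
    """
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    zs = []
    for _ in range(trials):
        xbar = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        zs.append((xbar - mu) / (sigma / n ** 0.5))
    return zs

zs = standardized_sample_means(n=50, trials=5000)
z_mean = statistics.fmean(zs)      # should be near 0
z_sd = statistics.stdev(zs)        # should be near 1
inside = sum(abs(z) <= 1.96 for z in zs) / len(zs)  # should be near .95
print(round(z_mean, 2), round(z_sd, 2), round(inside, 3))
```

If the approximation holds, the simulated Z values should have mean close to 0, standard deviation close to 1, and roughly 95% of them within ±1.96.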
7
Statistics Used to Estimate Population Parameters
Estimators:
Sample Mean, $\bar{x}$
Sample Variance, $s^2$
Sample Proportion, $\hat{p}$
…

Statistics      Parameters
$\bar{x}$       $\mu$, population mean
$s^2$           $\sigma^2$, population variance
$\hat{p}$       $p$, population proportion
8
Sampling Distribution
Theoretical Probability Distribution of the Sample Statistic.
Example: The distribution of the sample mean $\bar{X}$ from $N(\mu, \sigma^2)$ is $N(\mu, \sigma^2/n)$, as $n \to \infty$.
9
Disadvantage of Point Estimation
1. Provides a Single Value
   Based on Observations from 1 Sample.
   * Sample Mean $\bar{X} = 98$ Is a Point Estimate of the Unknown Population Mean.
2. Gives No Information about How Close the Value Is to the Unknown Population Parameter
Which of the following statistics do you prefer?
a. 32%
b. 32% with a margin of error of 3%
10
Estimation
You’re interested in finding the average body temperature of healthy adults in Northeastern Ohio (the population). What would you do?
How can we estimate this average with a measure of reliability?
98 ± 1 °F     98 ± .5 °F     98 ± .2 °F
11
Interval Estimation
Margin of Error Gives Information about How Close Value Is to the Unknown Population Parameter.
12
Sampling Error
Sample statistic (point estimate): $\bar{x}$
Sampling Error = $|\bar{x} - \mu|$
13
Key Elements of
Interval Estimation
[Figure: a confidence interval on a number line, centered at the sample statistic (point estimate) and running from the lower confidence limit to the upper confidence limit.]
Confidence Level: A probability that the population parameter falls somewhere within the interval.
Confidence interval: $\bar{x} \pm$ Margin of Error, e.g. 98 ± 1 °F
14
Confidence Interval Estimation
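With $\sigma_{\bar{x}} = \sigma/\sqrt{n}$:
\[ P\left(\mu - z_{\alpha/2}\,\sigma_{\bar{x}} \le \bar{X} \le \mu + z_{\alpha/2}\,\sigma_{\bar{x}}\right) = 1 - \alpha \]
\[ P\left(\bar{X} - z_{\alpha/2}\,\sigma_{\bar{x}} \le \mu \le \bar{X} + z_{\alpha/2}\,\sigma_{\bar{x}}\right) = 1 - \alpha \]
\[ P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \]
Interval estimate: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
[Figure: sampling distribution of $\bar{X}$, centered at $\mu$ with standard error $\sigma_{\bar{x}}$; area $1 - \alpha$ between $\mu - z_{\alpha/2}\sigma_{\bar{x}}$ and $\mu + z_{\alpha/2}\sigma_{\bar{x}}$, area $\alpha/2$ in each tail.]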
15
Confidence Interval Estimation
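With $\sigma_{\bar{x}} = \sigma/\sqrt{n}$:
\[ P\left(\mu - 1.96\,\sigma_{\bar{x}} \le \bar{X} \le \mu + 1.96\,\sigma_{\bar{x}}\right) = .95 \]
\[ P\left(\bar{X} - 1.96\,\sigma_{\bar{x}} \le \mu \le \bar{X} + 1.96\,\sigma_{\bar{x}}\right) = .95 \]
\[ P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]
Interval estimate: $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$
[Figure: sampling distribution of $\bar{X}$, centered at $\mu$ with standard error $\sigma_{\bar{x}}$; area .95 between $\mu - 1.96\sigma_{\bar{x}}$ and $\mu + 1.96\sigma_{\bar{x}}$, area .025 in each tail.]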
16
The Confidence Interval
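[Figure: sampling distribution of $\bar{X}$ with standard error $\sigma_{\bar{x}}$. 95% of sample means fall between $\mu - 1.96\sigma_{\bar{x}}$ and $\mu + 1.96\sigma_{\bar{x}}$.]
Confidence Level: $1 - \alpha = .95$
$\alpha/2 = .025$ in each tail
$z_{.025} = 1.96$
Confidence Interval => from $\bar{x} - 1.96\sigma_{\bar{x}}$ to $\bar{x} + 1.96\sigma_{\bar{x}}$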
17
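Confidence Interval for Mean
($\sigma$ Known)
$(1-\alpha)\cdot 100\%$ Confidence Interval Estimate for the mean $\mu$ of a normal population:
\[ \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \]
or
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
Margin of Error: $z_{\alpha/2}\,\sigma/\sqrt{n}$
“$\sigma$ Known” may mean that we have a very good estimate of $\sigma$.
It is not practical to assume that we know $\sigma$.
18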
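Confidence Interval of Mean
($\sigma$ Unknown and $n \ge 30$)
$(1-\alpha)\cdot 100\%$ Confidence Interval Estimate for the mean $\mu$ of a population when the sample size is relatively large:
\[ \left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right) \]
or
\[ \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \]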
19
The Confidence Interval
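[Figure: sampling distribution of $\bar{X}$ with standard error $\sigma_{\bar{x}}$. 95% of samples give a sample mean between $\mu - 1.96\sigma_{\bar{x}}$ and $\mu + 1.96\sigma_{\bar{x}}$.]
Confidence Interval => from $\bar{x} - 1.96\sigma_{\bar{x}}$ to $\bar{x} + 1.96\sigma_{\bar{x}}$
95% Confidence Interval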
20
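The Confidence Interval
[Figure: sampling distribution of $\bar{X}$ with standard error $\sigma_{\bar{x}}$; 95% of samples give sample means in the central region, 2.5% in each tail.]
95% of intervals contain $\mu$.
5% do not.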
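The claim that about 95% of intervals contain $\mu$ can be illustrated by simulation. This is a rough sketch rather than anything from the slides: the function name `coverage`, the normal population with $\mu = 140$ and $\sigma = 10$, the sample size, the number of trials, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics

def coverage(mu, sigma, n, trials, z=1.96, seed=1):
    """Fraction of simulated intervals xbar +/- z*sigma/sqrt(n) that contain mu.

    Assumption for illustration: samples come from a normal population
    with known sigma, so the z-interval applies directly.
    """
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rate = coverage(mu=140, sigma=10, n=25, trials=4000)
print(round(rate, 3))
```

The printed rate should land near .95. Each individual interval either contains $\mu$ or does not; the 95% describes the long-run behavior of the procedure.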
21
Factors Affecting
Interval Width
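1. Data Dispersion
   Measured by $\sigma$
2. Sample Size
   Affects standard error: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
3. Level of Confidence $(1 - \alpha)$
   Affects $z_{\alpha/2}$
\[ \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \]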
22
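Size of Interval
[Figure: sampling distribution of $\bar{X}$ with standard error $\sigma_{\bar{x}}$, showing three nested central regions.]
90% Samples: $\mu \pm 1.65\sigma_{\bar{x}}$
95% Samples: $\mu \pm 1.96\sigma_{\bar{x}}$
99% Samples: $\mu \pm 2.58\sigma_{\bar{x}}$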
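The multipliers 1.65, 1.96, and 2.58 are rounded values of $z_{\alpha/2}$ for the three confidence levels. As a small sketch (the helper name `z_critical` is hypothetical, not from the slides), they can be recovered from the standard normal inverse CDF in Python's standard library:

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}, the value with upper-tail area alpha/2 under N(0, 1)."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

A higher confidence level gives a larger multiplier and therefore a wider interval.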
23
Estimation Example
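Mean ($\sigma$ Known)
The average weight of a random sample of n = 25 subjects is $\bar{X} = 140$. Set up a 95% confidence interval estimate for $\mu$ if $\sigma = 10$. (Assume Normal population.)
$1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$, $z_{.025} = 1.96$.
\[ \left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = \left(140 - 1.96\cdot\frac{10}{\sqrt{25}},\ 140 + 1.96\cdot\frac{10}{\sqrt{25}}\right) = (136.08,\ 143.92) \]
or
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 140 \pm 1.96\cdot\frac{10}{\sqrt{25}} = 140 \pm 3.92 \]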
24
Interpretation
We can be 95% confident that the population mean $\mu$ is in (136.08, 143.92).
We can be 95% confident that the maximum sampling error, when using this interval estimate to estimate the mean, is within 3.92.
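The weight example (n = 25, sample mean 140, $\sigma = 10$) can be reproduced with a few lines of Python. This is a minimal sketch under the slide's assumptions (normal population, known $\sigma$); the function name `z_interval` is hypothetical.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval xbar +/- z_{alpha/2} * sigma / sqrt(n) for known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=140, sigma=10, n=25)
print(round(low, 2), round(high, 2))
```

The result agrees with the interval (136.08, 143.92) to two decimal places; the function uses the unrounded critical value rather than the table value 1.96.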
25
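Confidence Interval of Mean
($\sigma$ Unknown and $n \ge 30$)
$(1-\alpha)\cdot 100\%$ Confidence Interval Estimate for the mean $\mu$ of a population when the sample size is relatively large:
\[ \left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right) \]
or
\[ \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \]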
26
Thinking Challenge
Example: A city uses a certain noise index to monitor the noise pollution in a certain area of the city. A random sample of 100 observations from randomly selected days around noon showed an average index value of $\bar{x} = 1.99$ and a standard deviation of $s = 0.05$. Find the 90% confidence interval estimate of the average noise index at noon.
27
Confidence Interval Solution*
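$1 - \alpha = .90$, $\alpha = .1$, $\alpha/2 = .05$, $z_{\alpha/2} = z_{.05} = 1.64$
\[ \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} = 1.99 \pm 1.64\cdot\frac{.05}{\sqrt{100}} = 1.99 \pm 0.008 \]
( 1.982, 1.998 )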
28
Interval Estimation for Mean
In a survey of a random sample of 64 individuals who gambled at Las Vegas, the average amount of money won on the day the survey was done is –$25.50, with a standard deviation of $100. Find the 95% confidence interval estimate for the average amount of money won by people who gambled at Las Vegas that day.
29
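Finding Sample Sizes for Estimating $\mu$
“I don’t want to sample too much or too little!”
C.I.: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
Margin of Error: $E = z_{\alpha/2}\,\sigma/\sqrt{n}$
Solving for n:
\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2} \]
E (also written B) = Margin of Error or Bound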
30
Sample Size Example
What sample size is needed to be 90%
confident of being correct within 5? A pilot
study suggested that the standard deviation is
45.
\[ n = \frac{z_{.05}^2\,\sigma^2}{E^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \approx 220 \]
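The sample-size calculation can be written as a short function. This is a sketch, with the hypothetical name `sample_size_for_mean`; it takes the critical value from a table, as the slides do, and rounds up because a sample size must be a whole number large enough to meet the margin of error.

```python
import math

def sample_size_for_mean(z, sigma, margin):
    """Smallest n with z * sigma / sqrt(n) <= margin, i.e. n >= (z*sigma/margin)^2."""
    return math.ceil((z * sigma / margin) ** 2)

# 90% confidence (z = 1.645), pilot estimate sigma = 45, margin of error 5.
n_needed = sample_size_for_mean(z=1.645, sigma=45, margin=5)
print(n_needed)
```

Rounding up, never to the nearest integer, keeps the achieved margin of error at or below the target.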
31
Thinking Challenge
You plan to survey residents in your county to find the average health insurance premium that they are paying. You want to be 95% confident that the sample mean is within ± $50 of the population mean.
A pilot study showed that the standard deviation σ was about $400. What sample size should you use?
32
Sample Size Solution*
\[ n = \frac{z_{.025}^2\,\sigma^2}{E^2} = \frac{1.96^2 \cdot 400^2}{50^2} = 245.86 \approx 246 \]
33
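Confidence Interval Mean
($\sigma$ Unknown & n < 30)
1. Assumptions
   Population Standard Deviation Is Unknown
   Population Must Be Normally Distributed
2. Use Student’s t Distribution
3. Confidence Interval Estimate
\[ \left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right) \]
or
\[ \bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \]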
34
Student’s t Distribution
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
[Figure: density curves centered at 0 for the Standard Normal (Z), t (df = 13), and t (df = 5).]
Bell-Shaped
Symmetric
‘Fatter’ Tails
35
Theorem (Distribution of $\bar{X}$ and $S^2$)
If $X_1, X_2, \ldots, X_n$ are observations of a random sample of size $n$ from the normal distribution $N(\mu, \sigma^2)$, then the statistics, sample mean
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \]
and sample variance
\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2, \]
are independent, and
\[ \frac{(n-1)S^2}{\sigma^2} \ \text{is}\ \chi^2(n-1). \]
36
Student’s t Distribution
Let $Z$ be a random variable that is $N(0, 1)$, and $U$ be a random variable that is $\chi^2(r)$, and let $Z$ and $U$ be independent. Then the random variable
\[ T = \frac{Z}{\sqrt{U/r}} \]
has a t-distribution with $r$ degrees of freedom.
37
Student’s t Distribution
If $X_1, X_2, \ldots, X_n$ are observations of a random sample of size $n$ from the normal distribution $N(\mu, \sigma^2)$, then the statistics, sample mean $\bar{X}$ and sample variance $S^2$, are independent and
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a t-distribution with d.f. $n - 1$.
38
t-statistic
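\[ T = \frac{\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
Here the numerator is $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, and $U = \dfrac{(n-1)S^2}{\sigma^2}$ is divided by the d.f. of $U$, which is $n - 1$.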
39 40
Student’s t Table
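t values, or percentiles of the t-distribution
[Figure: t-distribution centered at 0 with upper-tail area .05 to the right of $t_{.05}$.]
For a 90% C.I.:
n = 3
df = n - 1 = 2
$\alpha$ = .10
$\alpha/2$ = .05
$t_{\alpha/2} = t_{.05,\,2} = 2.920$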
41
Estimation Example
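Mean ($\sigma$ Unknown)
A random sample of weights of 25 subjects has a sample mean of 140 and a sample standard deviation of 8. Set up a 95% confidence interval estimate for $\mu$.
$1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$, df = 24, $t_{\alpha/2,\,df} = t_{.025,\,24} = 2.064$
\[ \bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = 140 \pm 2.064\cdot\frac{8}{\sqrt{25}} = 140 \pm 3.30 \]
( 136.70, 143.30 )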
42
Thinking Challenge
The number of community hospital beds per 1000 population available in the different regions of the country is normally distributed. A random sample of 6 regions was selected, and the recorded rates of beds per 1000 are
3.6, 4.2, 4.0, 3.5, 3.8, 3.1.
Find the 90% confidence interval estimate of the mean bed-rate in the country.
43
Confidence Interval Solution*
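$\bar{x} = 3.7$
$s = 0.38987$
\[ \frac{s}{\sqrt{n}} = \frac{.38987}{\sqrt{6}} = .1592 \]
(use 90% confidence level)
n = 6, df = n - 1 = 6 - 1 = 5
$t_{.05,\,5} = 2.015$
\[ \bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \]
( 3.7 - (2.015)(0.1592), 3.7 + (2.015)(0.1592) )
( 3.379, 4.021 )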
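The hospital-bed solution can be checked from the raw data. In this sketch the function name `t_interval` is hypothetical, and the critical value $t_{.05,5} = 2.015$ is passed in from the t table, since Python's standard library has no inverse t-distribution.

```python
import statistics

def t_interval(data, t_crit):
    """Confidence interval xbar +/- t_crit * s / sqrt(n), with t_crit from a t table."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s / n ** 0.5
    return xbar - margin, xbar + margin

beds = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]
low, high = t_interval(beds, t_crit=2.015)  # t_{.05, 5} for a 90% interval
print(round(low, 3), round(high, 3))
```

The interval agrees with ( 3.379, 4.021 ) to three decimal places.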
44
Confidence interval with z-score:
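The $(1-\alpha)\cdot 100\%$ confidence interval estimate for the population mean $\mu$:
Assumption: If sampled from a normal population with known variance $\sigma^2$,
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
Assumption: If the sample is large and the variance is unknown, $s$ replaces $\sigma$,
\[ \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}} \]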
45
Confidence interval with t-score:
The $(1-\alpha)\cdot 100\%$ confidence interval estimate for the population mean $\mu$:
Assumption: If sampled from a normal population with unknown variance $\sigma^2$,
\[ \bar{x} \pm t_{\alpha/2,\,df = n-1}\frac{s}{\sqrt{n}} \]
(If the sample size is large, the normality assumption is insignificant.) $t \to z$ as the sample becomes large.
46
Average Weight for Female Ten
Year Children In US
Info. from a random sample: n = 10, x = 80 lb, s =
Remark: If data are coded as 1 or 0, the sample mean is the same as the sample proportion of 1’s.
Data: 1, 0, 0, 1, 0
\[ \bar{x} = \frac{1 + 0 + 0 + 1 + 0}{5} = \frac{2}{5} = .4 = \hat{p} \]
56
Confidence Interval
Proportion
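1. Assumptions
   Two Categorical Outcomes
   Normal Approximation Can Be Used If np and n(1 – p) are both greater than 5.
2. Confidence Interval Estimate (for large sample)
\[ \left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right) \]
or
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]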
57
Parameters of Sample Proportion
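$X \sim \text{Binomial}(n, p)$,
$E[X] = np$, $\mathrm{Var}[X] = np(1-p)$
$\hat{p} = X/n \sim\ ?$
$E[X/n] = \ ?$, $\mathrm{Var}[X/n] = \ ?$
Answers: $E[X/n] = p$, $\mathrm{Var}[X/n] = \dfrac{p(1-p)}{n}$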
58
Estimation Example
Proportion
A random sample of 400 from a large
community showed that 32 have diabetes. Set up
a 95% confidence interval estimate for p, the
percentage of people that have diabetes.
$\hat{p} = \dfrac{32}{400} = .08$, $n = 400$, $z_{\alpha/2} = z_{.025} = 1.96$
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
59
Estimation Example
Proportion
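The 95% C.I. for p, the percentage of people that have diabetes:
$\hat{p} = \dfrac{32}{400} = .08$, $n = 400$
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = .08 \pm 1.96\sqrt{\frac{.08(1-.08)}{400}} = .08 \pm .027 \]
8% ± 2.7%, or ( .053, .107 )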
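The diabetes example can be reproduced with a short function. This is a sketch of the large-sample (normal approximation) interval only; the name `proportion_interval` is hypothetical, and the critical value is supplied from the z table as in the slides.

```python
def proportion_interval(successes, n, z):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin

# 32 of 400 sampled people have diabetes; 95% confidence uses z = 1.96.
low, high = proportion_interval(successes=32, n=400, z=1.96)
print(round(low, 3), round(high, 3))
```

The function does not check the condition that np and n(1 – p) are both greater than 5, so that check remains the caller's responsibility.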
60
Thinking Challenge
A member of a health department wishes to see what percentage of people in a community will support an environmental policy. Of 200 survey forms sent and received, 35 responded that they support the policy and the rest do not support it.
Find a 90% confidence interval estimate of the percentage of the population in this community that supports the policy.
61
Confidence Interval
Solution*
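$\hat{p} = \dfrac{35}{200} = .175$, $n = 200$, $z_{\alpha/2} = 1.645$
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = .175 \pm 1.645\sqrt{\frac{.175(.825)}{200}} = .175 \pm .0442 \]
17.5% ± 4.42%, or ( 13.08%, 21.92% )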
Example:
Researchers wish to estimate the percentage of hospital employees infected by SARS in a certain country. Out of 500 randomly chosen hospital employees, 14 were infected. Find the 95% confidence interval estimate for the percentage of hospital employees infected by SARS in this country.
63
Sample Size
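C.I.: $\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Margin of Error: $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
\[ n = \frac{z_{\alpha/2}^2}{E^2}\,\hat{p}(1-\hat{p}) \]
if a pilot study is done,
or
\[ n = \frac{z_{\alpha/2}^2}{E^2}\cdot 0.25 \]
to get the largest sample to achieve the goal.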
64
Sample Size (No prior information on p)
Sample Size Example: If one wishes to do a
survey to estimate the population proportion
with 95% confidence and a margin of error of
3%, how large a sample is needed?
$z_{\alpha/2} = 1.96$; $E = .03$
$n = (1.96^2/.03^2) \times .25 = 1067.11$
A sample of size 1068 is needed.
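The same calculation as a short sketch. The function name `sample_size_for_proportion` is hypothetical; its default `p=0.5` encodes the no-prior-information case, where p(1 – p) takes its largest value, .25.

```python
import math

def sample_size_for_proportion(z, margin, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin.

    With no prior information, p = 0.5 maximizes p * (1 - p) at 0.25,
    giving the most conservative (largest) sample size.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_needed = sample_size_for_proportion(z=1.96, margin=0.03)
print(n_needed)
```

When a pilot estimate of the proportion is available, it can be passed as `p`, and the required sample size will be no larger than the conservative value.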
65
Sample Size (With prior information on p)
Sample Size Example: If one wishes to estimate
the percentage of people infected with West Nile in a
population with 95% confidence and a margin of
error of 3%, how large a sample is needed? (A pilot
study has been done, and the sample proportion was