Fundamentals of Mathematical Statistics
Read Wooldridge, Appendix C:
Fundamentals of Mathematical Statistics: Part One . Intensive Course in Mathematics and Statistics . Chairat Aemkulwat
Outline: Fundamentals of Mathematical Statistics
Part One
I. Populations, Parameters, and Random Sampling
II. Finite Sample Properties of Estimators
III. Asymptotic or Large Sample Properties of Estimators

Part Two
IV. General Approaches to Parameter Estimation
V. Interval Estimation and Confidence Intervals

Part Three
VI. Hypothesis Testing
VII. Remarks on Notation
I. Populations, Parameters, and Random Sampling
• Population refers to any well‐defined group of subjects.
• Statistical inference involves learning something about the population from a sample.
• Parameters are constants that determine the directions and strengths of relationships among variables.
• By “learning”, we can mean several things.
– Most important are estimation and hypothesis testing.

Example:
• Suppose our interest is to find the average percentage increase in wage given an additional year of education.
– Population: wage and education of 33 million working people.
– Sample: data on a subset of the population.
Example: Results:
o The return to education is 7.5% ‐ an example of a point estimate.
o The return to education is between 5.6% and 9.4% ‐ an example of an interval estimate.
o Does education affect wage? ‐ an example of hypothesis testing.
Sampling
• Let Y be a random variable representing a population with a probability density function f(y; θ).
• The probability density function (pdf) of Y is assumed to be known except for the value of θ.
– Different values of θ imply different population distributions.
Random Sampling: Definition
• If Y1, …, Yn are independent random variables with a common probability density function f(y; θ), then {Y1, …, Yn} is a random sample from the population represented by f(y; θ).
• We also say the Yi are i.i.d. (independent, identically distributed) random variables from f(y; θ).
Example: a random sample from the normal distribution.
• If Y1, …, Yn are independent random variables with a normal distribution with mean μ and variance σ², then {Y1, …, Yn} is a random sample from the Normal(μ, σ²) population.
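As a quick illustration (not part of the original slides), drawing one realization of such a random sample might look like this in Python with NumPy, using the hypothetical values μ = 2, σ² = 1, n = 10:

```python
import numpy as np

# Hypothetical population parameters and sample size (assumptions for illustration)
mu, sigma, n = 2.0, 1.0, 10

rng = np.random.default_rng(0)
# {Y1, ..., Yn}: i.i.d. draws from the Normal(mu, sigma^2) population
sample = rng.normal(loc=mu, scale=sigma, size=n)
print(sample)
```

Each run with a different seed yields a different realization {y1, …, yn} of the same random sample {Y1, …, Yn}.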
Sampling
Example: working population
• We may obtain a sample of 100 families.
– Note that the data we observe will differ for each different sample. A sample provides a set of numbers, say, {y1, …, yn}.
Example: a random sample from the Bernoulli distribution.
• If Y1, …, Yn are independent random variables, each distributed as Bernoulli(θ), so that
P(Yi = 1) = θ and P(Yi = 0) = 1 - θ,
then {Y1, …, Yn} constitutes a random sample from the Bernoulli(θ) distribution.
• Note that Yi = 1 if passenger i shows up and Yi = 0 otherwise.
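A sketch of this Bernoulli sampling scheme, under an assumed show-up probability θ = 0.85 (the slides do not give a value):

```python
import numpy as np

# Assumed show-up probability and number of passengers (hypothetical values)
theta, n = 0.85, 1000

rng = np.random.default_rng(1)
# Yi = 1 if passenger i shows up, Yi = 0 otherwise; P(Yi = 1) = theta
y = rng.binomial(1, theta, size=n)

print(y[:10])     # first few outcomes
print(y.mean())   # the sample proportion is the natural estimate of theta
```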
II. Finite Sample Properties of Estimators
• A “finite sample” implies a sample of any size, no matter how large or small.
– Small sample properties.
• Asymptotic properties have to do with the behavior of estimators as the sample size grows without bound.
A. Unbiasedness
B. Variance
C. Efficiency
Estimators and Estimates
• Suppose {Y1, …, Yn} is a random sample from a population distribution that depends on an unknown parameter θ.
– An estimator of θ is a rule that assigns a value of θ to each possible outcome of the sample.
– The rule is specified before any sampling is carried out.
• An estimator W of a parameter θ can be expressed as
W = h(Y1, …, Yn)
for some known function h.
• When a particular set of values, say {y1, …, yn}, is plugged into the function h, we obtain the estimate of θ.
Estimators and Estimates: sampling distribution
• The distribution of an estimator is called its sampling distribution.
– It describes the likelihood of various outcomes of W across different random samples.
• The entire sampling distribution of W can be obtained given the probability distribution of the Yi and the function h.
Estimators and Estimates
Example:
• Let {Y1, …, Yn} be a random sample from a population with mean μ. The natural estimator of μ is the average of the random sample:
Ȳ = (1/n)(Y1 + … + Yn)
Ȳ is called the sample average.
• Unlike in Appendix A, where the sample average of a set of numbers was defined as a descriptive statistic, we now view Ȳ as an estimator.
Example C.1: City Unemployment Rates
• For actual data outcomes y1, …, yn, the estimate is the average in the sample:
ȳ = (1/n)(y1 + … + yn)
• Estimator: Ȳ = (1/n)(Y1 + … + Yn)
• Estimate: ȳ = 6.0
– Our estimate of the average city unemployment rate in the U.S. is 6.0%.
Notes:
1) Each sample results in a different estimate.
2) The rule for obtaining the estimate is the same.
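A sketch of the estimator/estimate distinction, using ten made-up city unemployment rates chosen so that the average works out to 6.0 (illustrative numbers, not the actual Table C.1 data):

```python
import numpy as np

# Hypothetical unemployment rates (%) for n = 10 cities (made-up data)
y = np.array([5.1, 6.4, 9.2, 4.1, 7.5, 8.3, 2.6, 3.5, 7.5, 5.8])

# The estimator is the rule "take the sample average"; applying the rule
# to the observed outcomes y1, ..., yn produces the estimate.
estimate = y.mean()
print(round(estimate, 1))  # 6.0
```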
Unbiasedness
Unbiased Estimator: Definition
An estimator W of θ is unbiased if
E(W) = θ
for all possible values of θ.
– Intuitively, if the estimator is unbiased, then its probability distribution has an expected value equal to the parameter it is supposed to be estimating.
• Unbiasedness does not mean that the estimate from a particular sample is equal to θ, or even very close to θ.
• If we could indefinitely draw random samples on Y from the population,
– then averaging the estimates over all random samples would give θ.

Bias of an Estimator: Definition
If W is an estimator of θ, its bias is defined as
Bias(W) = E(W) - θ
• An estimator has a positive bias if E(W) - θ > 0.
• The unbiasedness of an estimator and the size of its bias depend on
– the distribution of Y, and
– the function h.
• We cannot control the distribution of Y, but we can choose the rule h.
• Show: the sample average Ȳ is an unbiased estimator of the population mean μ.
E(Ȳ) = E[(1/n) Σ Yi] = (1/n) Σ E(Yi) = (1/n) Σ μ = (1/n)(nμ) = μ
Unbiasedness
Weaknesses:
(1) Some very good estimators are not unbiased.
(2) Unbiased estimators can be quite poor estimators.
Example: Let W = Y1 (from a random sample of size n, we discard all of the observations except the first). Then E(Y1) = μ, so W is unbiased even though it ignores most of the sample.
• Unbiasedness ensures that the probability distribution of an estimator has a mean value equal to the parameter it is supposed to be estimating.
• Variance shows how spread out the distribution of an estimator is.

The Sampling Variance of Estimators
• The variance of an estimator is a measure of the dispersion in its distribution. It is often called the sampling variance.
• Example: the variance of the sample average from a population.
• Summary: If {Y1, …, Yn} is a random sample from a population with mean μ and variance σ², then
• Ȳ has the same mean μ as the population;
• its sampling variance equals the population variance over the sample size: Var(Ȳ) = σ²/n.

The Sampling Variance of Estimators
• Define the estimator
S² = [1/(n - 1)] Σ (Yi - Ȳ)²
which is usually called the sample variance.
• One can show that the sample variance is an unbiased estimator of σ²: E(S²) = σ².
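A simulation sketch of why the n - 1 divisor matters (assumed values σ² = 1, n = 5): dividing by n - 1 gives E(S²) = σ², while dividing by n is biased downward by the factor (n - 1)/n.

```python
import numpy as np

sigma2, n, reps = 1.0, 5, 200_000  # assumed values for illustration

rng = np.random.default_rng(7)
samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))

s2 = samples.var(axis=1, ddof=1)   # sample variance S^2 (divide by n - 1)
s2_biased = samples.var(axis=1)    # divide by n instead

print(s2.mean())         # close to sigma^2 = 1.0
print(s2_biased.mean())  # close to (n - 1)/n * sigma^2 = 0.8
```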
The Sampling Variance of Estimators
Suppose W1 and W2 are both unbiased estimators of θ, but W1 is more tightly centered about θ. (See graph!)
This implies that the probability that W1 is farther than any given distance from θ is less than the probability that W2 is farther than the same distance from θ.
Example: For a random sample with mean μ and variance σ², let Y1 be the estimator that uses only the first observation drawn.

Estimator:      Ȳ                  Y1
Unbiasedness:   E(Ȳ) = μ           E(Y1) = μ
Variance:       Var(Ȳ) = σ²/n      Var(Y1) = σ²

If the sample size is n = 10, this implies Var(Y1) is ten times larger than Var(Ȳ).
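The comparison of Ȳ and Y1 above can be checked numerically. A sketch with assumed values μ = 2, σ² = 1, n = 10, estimating both sampling variances across many random samples:

```python
import numpy as np

mu, sigma, n, reps = 2.0, 1.0, 10, 100_000  # assumed values for illustration

rng = np.random.default_rng(3)
samples = rng.normal(mu, sigma, size=(reps, n))

var_ybar = samples.mean(axis=1).var()  # sampling variance of Ybar, ~ sigma^2/n = 0.1
var_y1 = samples[:, 0].var()           # sampling variance of Y1, ~ sigma^2 = 1.0

print(var_ybar, var_y1)  # Var(Y1) is about ten times Var(Ybar)
```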
The Sampling Variance of Estimators
Example: From the simulation in Table C.1:
20 random samples of size 10 (n = 10) were generated from the normal distribution with μ = 2 and σ² = 1.
– y1 ranges from -0.64 to 4.27, with mean 1.89.
– ȳ ranges from 1.16 to 2.58, with mean 1.96.
Which estimator is better?
Relative Efficiency
Relative Efficiency: Definition. If W1 and W2 are two unbiased estimators of θ, then W1 is efficient relative to W2 when Var(W1) ≤ Var(W2) for all θ.
Efficiency
Example:
• For estimating the population mean μ, Var(Ȳ) < Var(Y1) for any value of σ² (when n > 1).
• The estimator Ȳ is efficient relative to Y1 for estimating μ.
• In a certain class of estimators, we can show that the sample average has the smallest variance.
Example:
Show that Ȳ has the smallest variance among all unbiased estimators that are also linear functions of Y1, Y2, …, Yn.
– The assumptions are that the Yi have a common mean and variance, and that they are pairwise uncorrelated.
Efficiency
• If we do not restrict our attention to unbiased estimators, then comparing variances is meaningless.
Example: In estimating the population mean μ, consider the trivial estimator equal to zero:
– its mean is zero: E(0) = 0;
– its variance is zero: Var(0) = 0;
– its bias is Bias(0) = E(0) - μ = -μ.
• So this trivial estimator is a very poor estimator when μ is large, despite having zero variance.
Efficiency
• A measure for comparing estimators that are not necessarily unbiased:
– Mean squared error (MSE)
• If W is an estimator of θ, then
MSE(W) = E[(W - θ)²]
       = E{[W - E(W)] + [E(W) - θ]}²
       = Var(W) + [Bias(W)]²
• The MSE measures how far, on average, the estimator is away from θ. It depends on both the variance and the bias.
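The decomposition MSE = Var + Bias² can be illustrated by comparing Ȳ with the trivial estimator 0 from the example above (assumed values μ = 2, σ² = 1, n = 10):

```python
import numpy as np

mu, sigma, n, reps = 2.0, 1.0, 10, 200_000  # assumed values for illustration

rng = np.random.default_rng(11)
ybar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# MSE of Ybar: unbiased, so MSE = Var(Ybar) = sigma^2/n = 0.1
mse_ybar = np.mean((ybar - mu) ** 2)

# MSE of the trivial estimator 0: Var = 0 and Bias = -mu, so MSE = mu^2 = 4
mse_zero = 0.0 + mu ** 2

print(mse_ybar, mse_zero)  # the zero-variance estimator has the far larger MSE
```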
Problem C.1
C.1 Let Y1, Y2, Y3, and Y4 be independent, identically distributed random variables from a population with mean μ and variance σ². Let
Ȳ = (1/4)(Y1 + Y2 + Y3 + Y4)
denote the average of these four random variables.
(i) What are the expected value and variance of Ȳ in terms of μ and σ²? [ans.]
Problem C.1 continued…
(ii) Now, consider a different estimator of μ:
W = (1/8)Y1 + (1/8)Y2 + (1/4)Y3 + (1/2)Y4
This is an example of a weighted average of the Yi. Show that W is also an unbiased estimator of μ. Find the variance of W. [ans.]
(iii) Based on your answers to parts (i) and (ii), which estimator of μ do you prefer, Ȳ or W? [ans.]
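A simulation sketch of part (ii), under assumed values μ = 2 and σ² = 1: across many samples of size 4, the weighted average W = (1/8)Y1 + (1/8)Y2 + (1/4)Y3 + (1/2)Y4 should average to μ (unbiased), with variance 11σ²/32 ≈ 0.344.

```python
import numpy as np

mu, sigma, reps = 2.0, 1.0, 400_000  # assumed values for illustration

rng = np.random.default_rng(5)
y = rng.normal(mu, sigma, size=(reps, 4))

weights = np.array([1/8, 1/8, 1/4, 1/2])  # weights sum to 1, so W is unbiased
w = y @ weights                           # one W per simulated sample of size 4

print(w.mean())  # ~ mu
print(w.var())   # ~ (1/64 + 1/64 + 1/16 + 1/4) * sigma^2 = 11/32
```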
Problem C.1 (i) continued…
(i) This is just a special case of what we covered in the text, with n = 4:
• E(Ȳ) = (1/4) Σ E(Yi) = (1/4)(4μ) = μ
• Var(Ȳ) = (1/16) Σ Var(Yi) = (1/16)(4σ²) = σ²/4
Problem C.1 (iii)
• (iii) Var(W) = 11σ²/32 and Var(Ȳ) = σ²/4 = 8σ²/32.
• Because 11/32 > 8/32 = 1/4, Var(W) > Var(Ȳ) for any σ² > 0, so Ȳ is preferred to W because each is unbiased.
III. Asymptotic or Large Sample Properties of Estimators
• For estimating a population mean μ:
– One notable feature of Y1 is that it has the same variance for any sample size.
– Ȳ improves in the sense that its variance gets smaller as n gets larger.
• Y1 does not improve in this case.
• We can rule out silly estimators by studying the asymptotic or large sample properties of estimators (n → ∞).
A. Consistency
B. Asy. Normality
• How large is a “large” sample?
– This depends on the underlying population distribution.
– Note that large sample approximations have been known to work well for sample sizes as small as 20 observations (n = 20).
Consistency
• Consistency concerns how far the estimator is likely to be from the parameter it is supposed to be estimating as the sample size increases indefinitely.
• Definition: Consistency
Let Wn be an estimator of θ based on Y1, …, Yn of sample size n. Then Wn is a consistent estimator of θ if, for every ε > 0,
P(|Wn - θ| > ε) → 0 as n → ∞
• Note that we index the estimator by the sample size, n, in stating this definition.
Consistency
1. The distribution of Wn becomes more and more concentrated about θ as the sample size increases (n → ∞).
2. For larger sample sizes, Wn is less and less likely to be very far from θ.
3. When Wn is consistent, we say that θ is the probability limit of Wn, written as
plim(Wn) = θ
4. The conclusion that Ȳn is a consistent estimator of μ is known as the law of large numbers (LLN).
Consistency
• Example: the average of a random sample drawn from a population with mean μ and variance σ².
– The sample average Ȳn is unbiased, and
Var(Ȳn) = σ²/n
– Thus, Var(Ȳn) → 0 as n → ∞, so Ȳn is a consistent estimator of μ.
• Unbiased estimators are not necessarily consistent, but those whose variances shrink to zero as the sample size increases are consistent.
• Formally, if Wn is an unbiased estimator of θ and Var(Wn) → 0 as n → ∞, then plim(Wn) = θ.

Law of Large Numbers
Definition: LLN. Let Y1, …, Yn be i.i.d. random variables with mean μ. Then
plim(Ȳn) = μ
Intuitively, the LLN says that if we are interested in the population average μ, we can get arbitrarily close to μ by choosing a sufficiently large sample.
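A sketch of the LLN in action, under an assumed population with μ = 2 and σ = 1: the sample average typically settles closer to μ as n grows.

```python
import numpy as np

mu, sigma = 2.0, 1.0  # assumed population values for illustration

rng = np.random.default_rng(2)
for n in (10, 1_000, 100_000):
    # one sample average Ybar_n for each sample size n
    ybar_n = rng.normal(mu, sigma, size=n).mean()
    print(n, ybar_n)  # the estimate tends toward mu as n grows
```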
Consistency
Property PLIM.1
Let θ be a parameter and define a new parameter γ = g(θ) for some continuous function g(·). Suppose plim(Wn) = θ. Define an estimator of γ as Gn = g(Wn). Then
plim(Gn) = γ
Alternatively,
plim[g(Wn)] = g[plim(Wn)]
for any continuous function g(·).
• What is a continuous function?
– Note that a continuous function is a “function that can be graphed without lifting your pencil from the paper.”

Property PLIM.2
If plim(Tn) = α and plim(Un) = β, then
1) plim(Tn + Un) = α + β
2) plim(TnUn) = αβ
3) plim(Tn/Un) = α/β, provided that β ≠ 0
Consistency
• Example: two estimators of the population mean μ:
(1) Ȳn = (1/n) Σ Yi, with E(Ȳn) = μ
(2) Y* = [1/(n - 1)] Σ Yi = [n/(n - 1)]Ȳn, with E(Y*) = [n/(n - 1)]μ
• As n → ∞, both Ȳn and Y* are consistent estimators of μ:
(1) plim(Ȳn) = μ, since Ȳn is unbiased and Var(Ȳn) → 0 as n → ∞
(2) plim(Y*) = plim[n/(n - 1)] · plim(Ȳn) = 1 · μ = μ
• Y* is also a consistent estimator, since Y* approaches the value of the parameter as the sample size gets larger and larger, even though it is biased for any finite n.
Consistency
Example: estimating the standard deviation σ of a population with mean μ and variance σ².
• Given the sample variance
Sn² = [1/(n - 1)] Σ (Yi - Ȳn)²
– the sample variance is an unbiased estimator of σ²;
– Sn² is also a consistent estimator of σ².
• Sample standard deviation Sn = √Sn²:
– Sn is not an unbiased estimator of σ. Why? The square root is a nonlinear function, so E(Sn) ≠ √E(Sn²) in general.
– Sn is a consistent estimator of σ, since by PLIM.1
plim Sn = √(plim Sn²) = √σ² = σ
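A simulation sketch (assumed σ = 1): for small n, the sample standard deviation Sn underestimates σ on average, yet for a single very large sample Sn is extremely close to σ.

```python
import numpy as np

sigma = 1.0  # assumed population standard deviation

rng = np.random.default_rng(9)

# Bias for small n: average S_n over many samples of size 5 falls below sigma
s_small = rng.normal(0.0, sigma, size=(200_000, 5)).std(axis=1, ddof=1)
print(s_small.mean())  # noticeably less than 1.0

# Consistency: one very large sample gives S_n very close to sigma
s_large = rng.normal(0.0, sigma, size=1_000_000).std(ddof=1)
print(s_large)  # close to 1.0
```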
Consistency
• Because
plim(Ȳn) = μY and plim(Z̄n) = μZ,
it follows from PLIM.1 and PLIM.2 that
Gn = 100 · (Z̄n - Ȳn)/Ȳn
satisfies plim(Gn) = γ.
• Gn is a consistent estimator of γ. It is just the percentage difference between Z̄n and Ȳn in the sample.
Example:
Yi: annual earnings with a high school education (population mean μY)
Zi: annual earnings with a college education (population mean μZ)
• Let {Y1, …, Yn} and {Z1, …, Zn} be random samples of size n from a population of workers, and suppose we want to estimate the percentage difference in annual earnings between the two groups, which is
γ = 100 · (μZ - μY)/μY
Asymptotic Normality and the Central Limit Theorem
• Consistency is a property of point estimators, as is unbiasedness.
Consistency and distribution
• Consistency does not tell us about the shape of an estimator's distribution for a given sample size.
• Most econometric estimators have distributions that are well approximated by a normal distribution for large samples (n → ∞).
Asymptotic Normality
Definition: Asymptotic Normality
• Let {Zn: n = 1, 2, …} be a sequence of random variables such that for all numbers z,
P(Zn ≤ z) → Φ(z) as n → ∞,
where Φ(z) is the standard normal cumulative distribution function (cdf).
• Intuitively, this property means that the cdf of Zn gets closer and closer to the cdf of the standard normal distribution as the sample size n gets large.

Central Limit Theorem (CLT)
• Definition: CLT
If Yi ~ d(μ, σ²), then
Zn = (Ȳn - μ)/(σ/√n)
has an asymptotic standard normal distribution.
Yi ~ d(μ, σ²) means that {Y1, …, Yn} is a random sample from some distribution with mean μ and variance σ².
• Intuitively, the central limit theorem (CLT) says that the average from a random sample from any population, when standardized, has an asymptotic standard normal distribution.
• The variable Zn is the standardized version of Ȳn: we have subtracted off E(Ȳn) = μ and divided by sd(Ȳn) = σ/√n.
• Here “a” stands for “asymptotically” or “approximately”.
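A simulation sketch of the CLT for a decidedly non-normal population, chosen here for illustration to be Exponential(1) (which has μ = σ = 1): with n = 100, the standardized sample average behaves approximately like a standard normal variable.

```python
import numpy as np

mu = sigma = 1.0  # Exponential(1) has mean 1 and standard deviation 1
n, reps = 100, 200_000

rng = np.random.default_rng(4)
ybar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)

# Z_n = (Ybar_n - mu) / (sigma / sqrt(n)), the standardized sample average
z = (ybar - mu) / (sigma / np.sqrt(n))

print(z.mean(), z.std())   # approximately 0 and 1
print(np.mean(z <= 1.96))  # approximately Phi(1.96), about 0.975
```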
Asymptotic Normality and the Central Limit Theorem
• If σ is replaced by its estimator Sn, does
(Ȳn - μ)/(Sn/√n)
still have an approximate standard normal distribution for sample size n?
• The exact distribution of (Ȳn - μ)/(Sn/√n) is not the same as that of (Ȳn - μ)/(σ/√n), but the difference is often small enough to be ignored for large n.
Problem C.3
C.3 Let Ȳ denote the sample average from a random sample with mean μ and variance σ². Consider two alternative estimators of μ:
W1 = [(n - 1)/n]Ȳ and W2 = Ȳ/2.
(i) Show that W1 and W2 are both biased estimators of μ and find the biases. What happens to the biases as n → ∞? Comment on any important differences in bias for the two estimators as the sample size gets large. [ans.]
Problem C.3 continued…
(ii) Find the probability limits of W1 and W2. {Hint: Use properties PLIM.1 and PLIM.2; for W1, note that plim[(n - 1)/n] = 1.} Which estimator is consistent? [ans.]
(iii) Find Var(W1) and Var(W2). [ans.]
(iv) Argue that W1 is a better estimator than Ȳ if μ is “close” to zero. (Consider both bias and variance.) [ans.]
Problem C.3 (i)
• E(W1) = [(n − 1)/n]E(Ȳ) = [(n − 1)/n]µ, so Bias(W1) = [(n − 1)/n]µ − µ = −µ/n. As n → ∞, Bias(W1) → 0.
• Similarly, E(W2) = E(Ȳ)/2 = µ/2, so Bias(W2) = µ/2 − µ = −µ/2 for every n.
• The bias in W1 tends to zero as n → ∞, while the bias in W2 is −µ/2 for all n. This is an important difference.
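A small simulation illustrates the two biases; the Normal(2, 1) population and the sample sizes below are illustrative assumptions, not part of the problem:

```python
import random
import statistics

random.seed(0)
mu = 2.0  # illustrative population mean

def average_estimates(n, reps=20000):
    # Monte Carlo approximations of E(W1) and E(W2) for sample size n.
    w1_vals, w2_vals = [], []
    for _ in range(reps):
        ybar = statistics.fmean(random.gauss(mu, 1.0) for _ in range(n))
        w1_vals.append((n - 1) / n * ybar)  # W1 = [(n-1)/n] * Ybar
        w2_vals.append(ybar / 2.0)          # W2 = Ybar / 2
    return statistics.fmean(w1_vals), statistics.fmean(w2_vals)

for n in (5, 50):
    e_w1, e_w2 = average_estimates(n)
    # Theory: Bias(W1) = -mu/n (vanishes), Bias(W2) = -mu/2 (does not).
    print(f"n={n:2d}: Bias(W1) ~ {e_w1 - mu:+.3f}, Bias(W2) ~ {e_w2 - mu:+.3f}")
```

As n grows, the simulated bias of W1 shrinks toward zero while the bias of W2 stays near −µ/2 = −1.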
Problem C.3 (ii)
• plim(W1) = plim[(n − 1)/n] · plim(Ȳ) = 1 · µ = µ.
• plim(W2) = plim(Ȳ)/2 = µ/2.
• Because plim(W1) = µ and plim(W2) = µ/2, W1 is consistent whereas W2 is inconsistent.
Problem C.3 (iii)
• Var(W1) = [(n − 1)/n]²Var(Ȳ) = [(n − 1)²/n³]σ².
• Var(W2) = Var(Ȳ)/4 = σ²/(4n).
Problem C.3 (iv)
• Because Ȳ is unbiased, its mean squared error is simply its variance: MSE(Ȳ) = Var(Ȳ) + [Bias(Ȳ)]² = σ²/n.
• On the other hand, MSE(W1) = Var(W1) + [Bias(W1)]² = [(n − 1)²/n³]σ² + µ²/n².
• Therefore, MSE(W1) is smaller than MSE(Ȳ) for µ close to zero.
• For large n, the difference between the two estimators is trivial.
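The MSE comparison can be checked directly from the closed-form expressions in parts (i) and (iii); the values of µ, σ², and n below are illustrative:

```python
# Closed-form comparison of MSE(W1) and MSE(Ybar).
def mse_ybar(n, sigma2):
    return sigma2 / n  # Ybar is unbiased, so MSE = Var

def mse_w1(n, mu, sigma2):
    var = (n - 1) ** 2 / n ** 3 * sigma2  # Var(W1)
    bias = -mu / n                        # Bias(W1)
    return var + bias ** 2

n, sigma2 = 10, 1.0
for mu in (0.1, 5.0):
    print(f"mu={mu}: MSE(W1)={mse_w1(n, mu, sigma2):.4f}, "
          f"MSE(Ybar)={mse_ybar(n, sigma2):.4f}")
```

For µ near zero the squared-bias term is negligible and W1 wins on variance; for µ far from zero the bias dominates and Ȳ wins.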
IV. General Approaches to Parameter Estimation
• We have studied finite sample and asymptotic properties of estimators: unbiasedness, consistency, and efficiency.
Question: Are there general approaches that produce estimators with good properties?
• Given a parameter θ appearing in a population distribution, there are usually many ways to obtain an unbiased and consistent estimator of θ.
• There are three main methods:
– Method of Moments
– Method of Maximum Likelihood
– Method of Least Squares
Method of Moments
• The method of moments proceeds as follows:
– The parameter θ is shown to be related to some expected value in the distribution of Y, usually E(Y) or E(Y²).
Example: Population mean
• Suppose θ is a function of µ; i.e., θ = g(µ).
• Because the sample average Ȳ is an unbiased and consistent estimator of µ, it is natural to substitute Ȳ for µ; thus g(Ȳ) is the estimator of θ.
– The estimator g(Ȳ) is a consistent estimator of θ. If g(µ) is a linear function of µ, then g(Ȳ) is an unbiased estimator of θ.
• Why is this a method of moments? Because we replace the population moment µ with the sample moment Ȳ.
Method of Moments — Example: Population covariance
• The population covariance between two random variables X and Y is
σXY = E[(X − µX)(Y − µY)]
• The method of moments suggests the following estimator:
SXY = (1/n) Σᵢ (Xi − X̄)(Yi − Ȳ)
1) This is a consistent estimator of σXY.
2) However, it is a biased estimator.
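A quick numerical check of the consistency claim; the joint distribution of (X, Y) below is an illustrative construction with known covariance, not from the text:

```python
import random
import statistics

random.seed(1)

# Construct (X, Y) with known covariance: Y = X + noise, so Cov(X, Y) = Var(X) = 1.
n = 50000
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]

xbar, ybar = statistics.fmean(x), statistics.fmean(y)
# Method-of-moments estimator: divide by n, not n - 1.
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / n
print(f"method-of-moments covariance: {s_xy:.3f} (population value: 1)")
```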
Example: Sample covariance
• The sample covariance is
SXY = [1/(n − 1)] Σᵢ (Xi − X̄)(Yi − Ȳ)
1) It can be shown that this is an unbiased estimator of σXY.
2) It is also a consistent estimator of σXY.
Method of Moments — Example: Population correlation
• The population correlation is
ρXY = σXY/(σXσY)
• The method of moments suggests estimating ρXY by RXY = SXY/(SXSY).
This is called the sample correlation coefficient.
Notes:
(1) RXY is a consistent estimator of ρXY. (Why?) Because SXY, SX, and SY are all consistent.
(2) RXY is not an unbiased estimator of ρXY. (Why?)
First, SX and SY are not unbiased estimators of σX and σY.
Second, RXY is a ratio of estimators, so it would not be unbiased in any case.
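The sample correlation coefficient can be computed directly from its method-of-moments pieces; a sketch with an illustrative population whose true correlation is 0.8:

```python
import math
import random

random.seed(2)

# Y = 0.8*X + 0.6*e with X, e independent standard normal, so Corr(X, Y) = 0.8.
n = 50000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + 0.6 * random.gauss(0, 1) for xi in x]

xbar = sum(x) / n
ybar = sum(y) / n
s_xy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / (n - 1)
s_x = math.sqrt(sum((a - xbar) ** 2 for a in x) / (n - 1))
s_y = math.sqrt(sum((b - ybar) ** 2 for b in y) / (n - 1))
r_xy = s_xy / (s_x * s_y)  # R_XY = S_XY / (S_X * S_Y)
print(f"sample correlation: {r_xy:.3f} (population value: 0.8)")
```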
Maximum Likelihood
• We now turn to the maximum likelihood estimator (MLE).
• Let {Y1, …, Yn} be a random sample from the population distribution f(y; θ).
• The likelihood function, which is a random variable, is defined as
L(θ; Y1, …, Yn) = f(Y1; θ)f(Y2; θ) ⋯ f(Yn; θ)
1. In the discrete case, this is P(Y1 = y1, Y2 = y2, …, Yn = yn) = P(Y1 = y1)P(Y2 = y2) ⋯ P(Yn = yn).
2. Because the sample is random, the joint distribution of {Y1, …, Yn} can be written as the product of the densities f(Y1; θ)f(Y2; θ) ⋯ f(Yn; θ).
• It is often easier to work with the log-likelihood function
log L(θ; Y1, …, Yn) = Σᵢ log f(Yi; θ)
1) It is obtained by taking the natural log of the likelihood function.
2) The log of the product is the sum of the logs.
• The maximum likelihood estimator of θ, call it W, is the value of θ that maximizes the likelihood function.
– Intuitively, out of all possible values of θ, we choose the value that makes the likelihood of the observed values largest.
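For a concrete case, the MLE of a Bernoulli success probability can be found by maximizing the log-likelihood; the data and grid search below are illustrative (for the Bernoulli model the maximizer is known in closed form to be the sample mean, which the search recovers):

```python
import math

# Observed Bernoulli sample (illustrative data): 7 successes in 10 trials.
data = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

def log_likelihood(theta, ys):
    # log L(theta) = sum_i [ y_i*log(theta) + (1 - y_i)*log(1 - theta) ]
    return sum(y * math.log(theta) + (1 - y) * math.log(1 - theta) for y in ys)

# Grid search over candidate values of theta in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=lambda th: log_likelihood(th, data))
print(f"MLE = {mle:.3f}, sample mean = {sum(data) / len(data):.3f}")
```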
• Properties:
1) The MLE is usually consistent and sometimes unbiased.
2) The MLE is generally the most asymptotically efficient estimator (when the population model f(y; θ) is correctly specified):
– asymptotically, it has the smallest variance among all unbiased estimators of θ;
– it is the minimum variance unbiased estimator.
Least Squares
• Least squares estimators are a third kind of estimator.
• The sample mean Ȳ is a least squares estimator of the population mean µ:
– It can be shown that the value of m that makes the sum of squared deviations Σᵢ (Yi − m)² as small as possible is m = Ȳ.
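This least squares characterization of the sample mean is easy to verify numerically; a sketch with illustrative data:

```python
# Check numerically that the sum of squared deviations sum_i (y_i - m)^2
# is minimized at m = ybar.
y = [2.0, 4.0, 7.0, 9.0, 3.0]  # illustrative sample
ybar = sum(y) / len(y)

def ssd(m, ys):
    return sum((yi - m) ** 2 for yi in ys)

# Search a fine grid of candidate values of m.
grid = [i / 100 for i in range(0, 1500)]
m_star = min(grid, key=lambda m: ssd(m, y))
print(f"least squares minimizer: {m_star}, sample mean: {ybar}")
```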
Properties:
1) The LSE is consistent and unbiased.
2) The LSE is generally the most efficient estimator in finite and large samples:
– it has the smallest variance among all linear unbiased estimators of µ.
Method of Moments, Least Squares, and Maximum Likelihood
• The principles of least squares, method of moments, and maximum likelihood often result in the same estimator.
V. Interval Estimation and Confidence Intervals
• A point estimate is the researcher’s best guess at the population value, but it provides no information about how close the estimate is likely to be to the population parameter.
Example:
• On the basis of a random sample of workers, a researcher reports that job training grants increase hourly wage by 6.4%.
• We cannot know how close this estimate is to the population value for a particular sample, because the population value is unknown.
The Nature of Interval Estimation
• Interval estimation comes in when we make statements involving probabilities.
– One way of assessing the uncertainty in an estimator is its sampling standard deviation.
• Interval estimation combines the point estimate and its standard deviation to construct a confidence interval.
– It shows where the population value is likely to lie in relation to the estimate.
Concept of interval estimation:
• Assume {Y1, …, Yn} is a random sample from the Normal(µ, σ²) population, and suppose that the variance σ² is known (for concreteness, σ² = 1).
– The sample average Ȳ then has a normal distribution with mean µ and variance σ²/n; i.e., Ȳ ~ Normal(µ, σ²/n).
• The standardized version of Ȳ has a standard normal distribution:
P(−1.96 < (Ȳ − µ)/(σ/√n) < 1.96) = .95
• Rewriting the event inside the probability gives
P(Ȳ − 1.96σ/√n < µ < Ȳ + 1.96σ/√n) = .95
• The random interval is
(Ȳ − 1.96σ/√n, Ȳ + 1.96σ/√n)
1) The probability that the random interval (Ȳ − 1.96σ/√n, Ȳ + 1.96σ/√n) contains the population mean µ is .95, or 95%.
2) This information allows us to construct an interval estimate of µ by plugging the sample outcome of the average, ȳ, and σ = 1 into the random interval.
• The result is called a 95% confidence interval.
• A shorthand notation is ȳ ± 1.96σ/√n.
Example:
• Given observed sample data {y1, y2, …, yn}, we can compute ȳ. Suppose n = 16, ȳ = 7.3, and σ = 1.
• The 95% confidence interval for µ is 7.3 ± 1.96/√16 = 7.3 ± .49.
• In interval form, this is [6.81, 7.79].
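The calculation above as code (the numbers are those of the example):

```python
import math

# 95% CI for mu with known sigma: ybar +/- 1.96 * sigma / sqrt(n).
n, ybar, sigma = 16, 7.3, 1.0
half_width = 1.96 * sigma / math.sqrt(n)
lo, hi = ybar - half_width, ybar + half_width
print(f"95% CI: [{lo:.2f}, {hi:.2f}]")
```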
• The meaning of a confidence interval is subtle: we mean that the random interval contains µ with probability .95; there is a 95% chance that the random interval contains µ.
• The random interval is an example of an interval estimator, since its endpoints change with different samples.
Correct interpretation: The random interval contains µ with probability 0.95.
Incorrect interpretation: The probability that µ is in a particular computed interval is 0.95.
• This is because µ is unknown but fixed: it either is or is not in the computed interval.
Example:
• Table C.2 contains calculations for 20 random samples from a Normal(2, 1) distribution with sample size n = 10.
• The interval estimates of µ are ȳ ± .62.
Results:
1) The interval changes with each random sample.
2) 19 of the 20 intervals contain the population value µ = 2.
3) Only for replication number 19 is µ = 2 not in the confidence interval.
4) So 95% of the samples (19 of 20) produced a confidence interval that contains µ.
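The Table C.2 experiment can be replicated on a larger scale; a sketch (the seed and number of replications are arbitrary choices):

```python
import math
import random
import statistics

random.seed(3)

# Draw many samples of size 10 from Normal(2, 1) and count how often
# the interval ybar +/- 1.96/sqrt(10) (about ybar +/- .62) covers mu = 2.
mu, n, reps = 2.0, 10, 10000
half_width = 1.96 / math.sqrt(n)

covered = 0
for _ in range(reps):
    ybar = statistics.fmean(random.gauss(mu, 1.0) for _ in range(n))
    if ybar - half_width < mu < ybar + half_width:
        covered += 1
print(f"coverage: {covered / reps:.3f} (nominal: .95)")
```

Across many replications the empirical coverage settles near the nominal 95%.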
CIs for the Mean from a Normally Distributed Population
• Suppose the variance σ² is known. The 95% confidence interval for µ is ȳ ± 1.96·σ/√n.
• In practice, we rarely know the population variance σ².
• To allow for unknown σ, we can use the estimate S and form Ȳ ± 1.96(S/√n).
• However, this random interval no longer contains µ with probability .95, because the constant σ has been replaced with the random variable S.
• We therefore use the t distribution rather than the standard normal distribution:
(Ȳ − µ)/(S/√n) ~ tn−1
where S is the sample standard deviation of the random sample {Y1, …, Yn}.
– Note that the t distribution arises exactly because σ has been replaced by its estimator S.
• To construct a 95% confidence interval using the t distribution, let c be the 97.5th percentile of the tn−1 distribution, so that
P(−c < T < c) = .95
Critical values are tabulated in Table G.2 in Appendix G.
• Once the critical value c is chosen, the random interval [Ȳ − c·S/√n, Ȳ + c·S/√n] contains µ with probability .95.
Example:
• Let n = 20, so df = n − 1 = 19 and c = 2.093 (see Table G.2 in Appendix G).
• The 95% confidence interval is ȳ ± 2.093(s/√20), where ȳ and s are the values obtained from the sample.
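The same calculation as code; the sample below is illustrative, and only n = 20 and c = 2.093 come from the example:

```python
import math
import statistics

# Exact 95% CI with unknown sigma: ybar +/- c * s / sqrt(n), c = 2.093 for df = 19.
# The data are illustrative, not from the textbook.
y = [7.1, 6.8, 7.9, 7.4, 6.5, 7.2, 7.7, 6.9, 7.3, 7.6,
     7.0, 7.8, 6.6, 7.5, 7.2, 6.7, 7.4, 7.1, 7.3, 7.0]
n = len(y)
ybar = statistics.fmean(y)
s = statistics.stdev(y)  # sample standard deviation (divides by n - 1)
c = 2.093                # 97.5th percentile of t_19, from Table G.2
lo = ybar - c * s / math.sqrt(n)
hi = ybar + c * s / math.sqrt(n)
print(f"ybar={ybar:.3f}, s={s:.3f}, 95% CI=[{lo:.3f}, {hi:.3f}]")
```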
• More generally, let c be the 100(1 − α/2)th percentile of the tn−1 distribution.
• A 100(1 − α)% confidence interval is ȳ ± cα/2·s/√n, where cα/2 is known once we choose α and the degrees of freedom n − 1.
• Recall that sd(Ȳ) = σ/√n.
• The quantity s/√n is the point estimate of sd(Ȳ), called the standard error of Ȳ: se(Ȳ) = s/√n.
• A 100(1 − α)% confidence interval can therefore be written as ȳ ± c·se(ȳ).
• The notion of the standard error of an estimate plays an important role in econometrics.
Example C.2: Effect of Job Training Grants on Worker Productivity
• Consider a sample of firms that received job training grants in 1988. The scrap rate is the number of items per 100 produced that are not usable and must be scrapped.
• Assume the change in scrap rates has a normal distribution.
• A 95% confidence interval for the mean change in scrap rates is ȳ ± 2.093·se(ȳ) = [−2.28, −0.02].
• With 95% confidence, the average change in scrap rates in the population is not zero.
Example C.2, continued
• The analysis above has some potentially serious flaws.
• It assumes that any systematic reduction in scrap rates is due to the job training grants.
– But many things can happen over the course of the year to change worker productivity.
• Note that the t distribution approaches the standard normal distribution as the degrees of freedom get large.
• In particular, for α = .05, cα/2 → 1.96 as n → ∞.
A Simple Rule of Thumb for a 95% Confidence Interval
• A rule of thumb for an approximate 95% confidence interval is ȳ ± 2·se(ȳ).
1) It is slightly too big for large sample sizes.
2) It is slightly too small for small sample sizes.
Asymptotic Confidence Intervals for Nonnormal Populations
• For some applications, the population is nonnormal; in some cases, the nonnormal population has no standard distribution.
• This does not matter as long as the sample size is large enough for the central limit theorem to give a good approximation to the distribution of the sample average Ȳ.
• A large sample has the added benefit of producing a narrow confidence interval, because the standard error se(Ȳ) shrinks to zero as the sample size grows.
• For large n, an approximate 95% confidence interval is ȳ ± 1.96·se(ȳ), where 1.96 is the 97.5th percentile of the standard normal distribution.
• Note that the standard normal distribution is used in place of the t distribution because we are relying on asymptotics: as n increases without bound, the t distribution approaches the standard normal distribution.
Example C.3: Race Discrimination in Hiring
• Matched-pairs analysis: each person in a pair interviews for the same job.
• We are interested in the difference θB − θW, where
θB = probability that the black person is offered a job
θW = probability that the white person is offered a job
• Unbiased estimators of θB and θW are B̄ and W̄, the fractions of interviews for which blacks and whites were offered jobs, where
Bi = 1 if the black person gets a job offer from employer i
Wi = 1 if the white person gets a job offer from employer i
• Define a new variable Yi = Bi − Wi. Yi can take three values:
Yi = −1 if the black person did not get the job but the white person did
Yi = 0 if both or neither got the job
Yi = 1 if the white person did not get the job but the black person did
• Then E(Yi) = E(Bi) − E(Wi) = θB − θW.
Example C.3, continued
• Sample size: n = 241.
– 22.4% of blacks were offered jobs, while 35.7% of whites were offered jobs. This is prima facie evidence of discrimination.
• So b̄ = .224 and w̄ = .357, and ȳ = .224 − .357 = −.133.
• The sample standard deviation is s = .482.
• Find an approximate 95% confidence interval for µ = θB − θW.
• A 95% CI for µ = θB − θW is −.133 ± 1.96(.482/√241) ≈ −.133 ± .061, or [−.194, −.072].
• A 99% CI for µ = θB − θW is −.133 ± 2.58(.482/√241) ≈ −.133 ± .080, or [−.213, −.053].
• We are very confident that the population difference is not zero.
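These intervals can be reproduced from the reported n, ȳ, and s:

```python
import math

# Approximate CIs for mu = theta_B - theta_W from Example C.3.
n, ybar, s = 241, -0.133, 0.482
se = s / math.sqrt(n)

ci95 = (ybar - 1.96 * se, ybar + 1.96 * se)
ci99 = (ybar - 2.58 * se, ybar + 2.58 * se)
print(f"se = {se:.3f}")
print(f"95% CI: [{ci95[0]:.3f}, {ci95[1]:.3f}]")
print(f"99% CI: [{ci99[0]:.3f}, {ci99[1]:.3f}]")
```

Both intervals lie entirely below zero, which is what justifies the conclusion above.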
Problem C.7. The new management at a bakery claims that workers are now more productive than they were under the old management, which is why wages have “generally increased.” Let Wib be worker i’s wage under the old management and Wia be worker i’s wage after the change. The difference is Di = Wia − Wib. Assume that the Di are a random sample from a Normal(µ, σ²) distribution.
(i) Using the following data on 15 workers, construct an exact 95% confidence interval for µ.
obs Wb Wa D=Wa-Wb
1 8.3 9.25 0.95
2 9.4 9 -0.4
3 9 9.25 0.25
4 10.5 10 -0.5
5 11.4 12 0.6
6 8.75 9.5 0.75
7 10 10.25 0.25
8 9.5 9.5 0
9 10.8 11.5 0.7
10 12.55 13.1 0.55
11 12 11.5 -0.5
12 8.65 9 0.35
13 7.75 7.75 0
14 11.25 11.5 0.25
15 12.65 13 0.35
mean 10.16667 10.40667 0.24
Problem C.7 (i)
• The average wage increase is d̄ = .24, or 24 cents.
• The sample standard deviation is about s = .451; with n = 15, se(d̄) = .451/√15 ≈ .1164.
• With df = 14, the 97.5th percentile of the t14 distribution is c = 2.145, so the exact 95% confidence interval is .24 ± 2.145(.1164), or about [−.01, .49].
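The confidence interval can be computed directly from the table of wage differences (c = 2.145 is the 97.5th percentile of t14, taken from a standard t table):

```python
import math
import statistics

# Wage differences D_i = W_i^a - W_i^b for the 15 workers in Problem C.7.
d = [0.95, -0.4, 0.25, -0.5, 0.6, 0.75, 0.25, 0.0, 0.7,
     0.55, -0.5, 0.35, 0.0, 0.25, 0.35]
n = len(d)
dbar = statistics.fmean(d)
s = statistics.stdev(d)          # sample standard deviation
se = s / math.sqrt(n)
c = 2.145                        # 97.5th percentile of t_14
lo, hi = dbar - c * se, dbar + c * se
print(f"dbar={dbar:.3f}, s={s:.3f}, se={se:.4f}, 95% CI=[{lo:.3f}, {hi:.3f}]")
```

Because the interval barely includes zero, the data do not rule out a zero average wage change.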
VI. Hypothesis Testing
• We have reviewed how to evaluate point estimators and how to construct confidence intervals.
– Sometimes the question we are interested in has a definite yes-or-no answer:
1) Does a job training program effectively increase average worker productivity?
2) Are blacks discriminated against in hiring?
• Devising methods for answering such questions, using a sample of data, is known as hypothesis testing.
Fundamentals of Hypothesis Testing
• Suppose the reported election results are: Candidate A, 42%, and Candidate B, 58% of the popular vote.
• Candidate A argues that the election was rigged. A consulting agency takes a sample of 100 voters and finds that 53% voted for Candidate A.
Question: How strong is the sample evidence against the officially reported percentage of 42%?
• One way to proceed is to set up a hypothesis test. Let θ be the true proportion of the population voting for Candidate A.
• The null hypothesis is H0: θ = .42.
• The null hypothesis plays a role similar to that of a defendant on trial:
– A defendant is presumed innocent until proven guilty.
– The null hypothesis is presumed true until the data strongly suggest otherwise.
• The alternative hypothesis is that the true proportion voting for Candidate A is above .42: H1: θ > .42.
• To conclude that H1 is true and H0 is false, we need evidence “beyond reasonable doubt.”
– Observing 43 votes out of a sample of 100 is not enough to overturn the original result; such an outcome is within the expected sampling variation.
– How about observing 53 votes out of a sample of 100?
• There are two kinds of mistakes:
1) Rejecting the null hypothesis when it is true: a Type I error.
Example: We reject H0 when the true proportion voting for Candidate A is in fact .42.
2) “Accepting” (failing to reject) the null hypothesis when it is false: a Type II error.
Example: We fail to reject H0 when in fact θ > .42.
• We can compute the probability of making either a Type I or a Type II error.
• Hypothesis testing requires choosing a significance level, denoted by α:
α = P(Reject H0 | H0 is true)
Read: the probability of rejecting the null hypothesis, given that H0 is true.
• The significance level is the probability of committing a Type I error.
• Classical hypothesis testing requires that we specify a significance level for a test.
• Common values for α are .10, .05, and .01; they quantify our tolerance for Type I error.
• α = .05 means the researcher is willing to falsely reject H0 5% of the time.
• Type II error:
– We want to minimize the probability of a Type II error;
– alternatively, we want to maximize the power of the test.
• The power of a test is one minus the probability of a Type II error. Mathematically,
π(θ) = P(Reject H0 | θ) = 1 − P(Type II error),
where θ is the actual value of the parameter.
• We would like the power to equal unity whenever the null hypothesis is false.
Testing Hypotheses about the Mean in a Normal Population
• In order to test a hypothesis, we need to choose a test statistic and a critical value.
• The test statistic T is some function of the random sample.
• When we compute the statistic for a particular outcome, we obtain an outcome of the test statistic, denoted by t.
• Provided that the null hypothesis is true, the critical value c is determined by the distribution of T and the chosen significance level α.
• All rejection rules compare the outcome of the test statistic, t, with the critical value c.
• Testing a hypothesis about the mean µ of a Normal(µ, σ²) population proceeds as follows. The null hypothesis is
H0: µ = µ0,
where µ0 is a value we specify. In the majority of applications, µ0 = 0.
• The rejection rule depends on the nature of the alternative hypothesis. Three alternatives are of interest:
• One-sided alternatives: H1: µ > µ0 or H1: µ < µ0.
• Two-sided alternative: H1: µ ≠ µ0.
– With the two-sided alternative, we are interested in any departure from the null hypothesis.
• For example, consider the one-sided alternative H1: µ > 0.
• The null hypothesis is then effectively H0: µ ≤ 0.
• We reject the null hypothesis when the value of the sample average, ȳ, is sufficiently greater than 0. But how much greater?
• We use the standardized version of Ȳ, with s in place of σ and µ0 in place of µ:
t = (ȳ − µ0)/se(ȳ), where se(ȳ) = s/√n.
• This is called the t statistic. The t statistic measures the distance from ȳ to µ0 relative to the standard error of ȳ.
• Under the null hypothesis, the random variable

T = √n(Ȳ − μ0)/S

has a tn‐1 distribution.
Example of a one‐tailed test:
• Choose the significance level α = .05. The critical value c is chosen so that

P(T > c | H0) = .05,

where c is the 100(1 − α) percentile of the tn‐1 distribution.
• The rejection rule is t > c. This is an example of a one‐tailed test.
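The one‐tailed procedure above can be sketched in a few lines of Python (a minimal sketch, assuming scipy is available; the sample numbers are illustrative, not from the text):

```python
# One-tailed t test: H0: mu = mu0 against H1: mu > mu0.
# Reject H0 when t > c, where c is the 100(1 - alpha) percentile of t(n-1).
import math
from scipy.stats import t as t_dist

def one_tailed_t_test(ybar, s, n, mu0=0.0, alpha=0.05):
    se = s / math.sqrt(n)              # se(ybar) = s / sqrt(n)
    t_stat = (ybar - mu0) / se         # t statistic
    c = t_dist.ppf(1 - alpha, n - 1)   # critical value: 100(1 - alpha) percentile
    return t_stat, c, t_stat > c       # reject H0 if t > c

# Illustrative numbers (hypothetical sample, not from the slides):
t_stat, c, reject = one_tailed_t_test(ybar=1.2, s=4.0, n=25, mu0=0.0)
```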
Example C.4: Effect of Enterprise Zones on Business Investments
• Y denotes the percentage change in investment from the year before to the year after a city became an enterprise zone.
• Assume that Y has a Normal(μ, σ²) distribution.
H0: μ = 0 (Null hypothesis: enterprise zones have no effect)
H1: μ > 0 (Alternative hypothesis: they have a positive effect)
• Suppose that we wish to test H0 at the 5% level. The test statistic is t = ȳ/se(ȳ).
• A sample of 36 cities: α = .05; c = 1.69 (see Table G.2); ȳ = 8.2; s = 23.9; t = 2.06.
• We conclude that, at the 5% significance level, enterprise zones have a positive effect on average investment.
• At the 1% significance level, do enterprise zones have a positive effect?
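The numbers in Example C.4 can be checked directly (a sketch assuming scipy; the comparison at the 1% level answers the question posed above):

```python
# Numerical check of Example C.4: n = 36, ybar = 8.2, s = 23.9.
import math
from scipy.stats import t as t_dist

n, ybar, s = 36, 8.2, 23.9
se = s / math.sqrt(n)            # 23.9 / 6, about 3.98
t_stat = ybar / se               # about 2.06
c05 = t_dist.ppf(0.95, n - 1)    # about 1.69: 5% one-tailed critical value
c01 = t_dist.ppf(0.99, n - 1)    # about 2.44: 1% one-tailed critical value
# t > c05 but t < c01: reject H0 at the 5% level, fail to reject at the 1% level.
```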
• For the null hypothesis and the alternative hypothesis
H0: μ ≥ μ0, H1: μ < μ0.
• The rejection rule is t < −c.
This means we reject H0 only for values of ȳ that lie sufficiently far below μ0.
Testing Hypotheses about the Mean in a Normal Population
Example C.5: Race Discrimination in Hiring
• μ = θB − θW is the difference in the probabilities that Black and White applicants receive job offers; μ is the population mean of the variable Y = B − W, where B and W are binary variables.
• Testing H0: μ = 0 against H1: μ < 0.
• Given n = 241, ȳ = −.133, and se(ȳ) = .48/√241 ≈ .031.
• The t statistic for testing H0: μ = 0 is t = −.133/.031 = −4.29.
• Critical value = −2.58 (one‐sided test; α = .005).
• Since t < −2.58, there is very strong evidence against H0 in favor of H1.
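A quick check of Example C.5 (a sketch assuming scipy; the slides round se(ȳ) to .031 before dividing, which gives −4.29 instead of the unrounded −4.30):

```python
# Numerical check of Example C.5: n = 241, ybar = -0.133, s = 0.48.
import math

n, ybar, s = 241, -0.133, 0.48
se = s / math.sqrt(n)        # about 0.031
t_stat = ybar / se           # about -4.30 (slides use rounded se, giving -4.29)
# t is far below the critical value -2.58, so H0 is rejected decisively.
```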
• For the null hypothesis and the alternative hypothesis,
H0: μ = μ0, H1: μ ≠ μ0.
• The rejection rule is |t| > c.
This gives a two‐tailed test.
Testing Hypotheses about the Mean in a Normal Population
• We have to be careful in obtaining the critical value, c.
• The critical value c (see graph):
– It is the 100(1 − α/2) percentile of a tn‐1 distribution.
– If α = .05, c is the 97.5th percentile of the tn‐1 distribution.
Testing Hypotheses about the Mean in a Normal Population
Example: Let n = 22.
• c = 2.08, the 97.5th percentile of a t21 distribution (see Table G.2).
• Rejection rule: the absolute value of the t statistic must exceed 2.08.
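The two‐tailed critical value for n = 22 can be recomputed rather than read from Table G.2 (a sketch assuming scipy):

```python
# Two-tailed 5% critical value for n = 22: the 97.5th percentile of t(21).
from scipy.stats import t as t_dist

c = t_dist.ppf(0.975, 21)    # about 2.08, matching Table G.2
```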
Testing Hypotheses about the Mean in a Normal Population
• Proper language for hypothesis testing: “We fail to reject H0 in favor of H1 at the 5% significance level.”
• Incorrect wording: “We accept H0 at the 5% significance level.”
Asymptotic Tests for Nonnormal Populations
• If the sample is large enough, we can invoke the central limit theorem.
• Asymptotic theory is based on n increasing without bound.
• Under the null hypothesis,

T = √n(Ȳ − μ0)/S is asymptotically Normal(0,1).

As n gets large, the tn‐1 distribution converges to the standard normal distribution.
Asymptotic Tests for Nonnormal Populations
• Because asymptotic theory is based on n increasing without bound,
– standard normal and t critical values are pretty much the same.
• Suggestions:
– For moderate values of n, say between 30 and 60, it is traditional to use the t distribution.
– For n ≥ 120, the choice between the two distributions is irrelevant.
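The convergence of t critical values to the standard normal value can be seen directly (a sketch assuming scipy):

```python
# 97.5th percentiles of t(n-1) versus Normal(0,1) as n grows.
from scipy.stats import t as t_dist, norm

z = norm.ppf(0.975)              # about 1.96
t30 = t_dist.ppf(0.975, 29)      # n = 30: about 2.05
t120 = t_dist.ppf(0.975, 119)    # n = 120: about 1.98, very close to 1.96
```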
• Note that our chosen significance levels are only approximate.
• When the sample size is large, the actual significance level will be very close to 5%.
Example C.5: Race Discrimination in Hiring
• Given n = 241, ȳ = −.133, and se(ȳ) = .48/√241 ≈ .031.
• The t statistic for testing H0: μ = 0 is t = −.133/.031 = −4.29.
• Critical value = −2.58 (two‐sided test; α = .01).
• Since t < −2.58, there is very strong evidence against H0 in favor of H1.
• μ = θB − θW is the difference in the probabilities that Black and White applicants receive job offers; μ is the population mean of the variable Y = B − W, where B and W are binary variables.
• Testing H0: μ = 0 against H1: μ ≠ 0.
Computing and Using p‐Values
• The traditional requirement of choosing the significance level ahead of time means that different researchers could wind up with different conclusions,
– even though they use the same set of data and the same procedures.
• p‐value of the test
• p‐value of the testIt is the largest significance level at which we fail to reject the null hypothesis.
• p‐value of the testIt is the smallest significance level at which we reject the null hypothesis.
Computing and Using p‐Values
• One‐sided test: Let H0: μ = 0 in a Normal(μ, σ²) population. The test statistic is

T = √n Ȳ/S.

• The observed value of T for our sample is t = 1.52.
• The p‐value is the area to the right of 1.52, which is

p‐value = P(T > 1.52 | H0) = 1 − Φ(1.52) ≈ .065,

where Φ(·) is the standard normal cumulative distribution function (cdf).
Computing and Using p‐Values
• Interpretation: t = 1.52 and p‐value = .065
– The largest significance level at which we could carry out the test and fail to reject H0 is .065.
– The probability that we observe a value of T as large as 1.52 when the null hypothesis is true.
– If we carry out the test at a significance level above .065, we reject the null hypothesis.
– The smallest significance level at which we reject the null hypothesis is .065.
– We would observe a value of T as large as 1.52 due to chance 6.5% of the time.
Computing and Using p‐Values
• Interpretation: t = 2.85 and (n is large) p‐value = 1 − Φ(2.85) = .0022
– If the null hypothesis is true, we observe a value of T as large as 2.85 with probability .002.
– If we carry out the test at a significance level above .002, we reject the null hypothesis.
– The smallest significance level at which we reject the null hypothesis is .002.
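The asymptotic p‐value 1 − Φ(2.85) is one line of Python (a sketch assuming scipy):

```python
# Asymptotic p-value for t = 2.85: the area under Normal(0,1) to the right of 2.85.
from scipy.stats import norm

p = 1 - norm.cdf(2.85)    # equivalently norm.sf(2.85); about .0022
```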
Example C.6: Effect of Training Grants on Worker Productivity (one‐tailed test)
• ȳ is the average change in scrap rates; n = 20. Assume the change in scrap rates has a normal distribution.
• Hypotheses: H0: μ = 0 (training grants have no effect); H1: μ < 0.
• With n = 20 (one‐tailed test), the t statistic is t = −2.13 and the p‐value is P(T < −2.13) = .023.
• If we carry out the test at a significance level above .023, we reject the null hypothesis.
• The smallest significance level at which we reject the null hypothesis is .023.
Example: Training Grants and Worker Productivity (two tails)
[Figure: t19 density with |t| = 2.13 marked; area in each tail = .023; two‐sided p‐value = .023 + .023 = .046]
• If we carry out the test at a significance level above .046, we reject the null hypothesis.
• The smallest significance level at which we reject the null hypothesis is .046.
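Both p‐values for the training‐grant example can be reproduced from the t19 distribution (a sketch assuming scipy):

```python
# One- and two-tailed p-values for t = -2.13 with n = 20 (df = 19).
from scipy.stats import t as t_dist

df = 19
p_one = t_dist.cdf(-2.13, df)    # P(T < -2.13), about .023
p_two = 2 * p_one                # two-sided p-value, about .046
```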
Two‐sided alternative
• Null hypothesis and two‐sided alternative: H0: μ = μ0 against H1: μ ≠ μ0.
• For t testing about population means, the p‐value is

P(|Tn‐1| > |t|) = 2P(Tn‐1 > |t|),

where t is the value of the test statistic and Tn‐1 is a t random variable.
• The p‐value is computed by finding the area to the right of |t| and multiplying the area by two.
Example C.7: Race Discrimination in Hiring
• Null hypothesis and two‐sided alternative: H0: μ = 0 against H1: μ ≠ 0.
• Given n = 241, ȳ = −.133, and se(ȳ) = .48/√241 = .031.
• The t statistic for testing H0: μ = 0 is t = −.133/.031 = −4.29.
• If Z is a standard normal random variable, P(Z < −4.29) ≈ 0.
• There is very strong evidence against H0 in favor of H1.
– Note that the critical value is −2.58 (α = .01).
For a nonnormal distribution, the exact p‐value can be difficult to obtain, but we can find asymptotic p‐values by using the same calculations.
Computing and Using p‐Values
• Rejection rules for the t value: Summary
1) For H1: μ > μ0, the rejection rule is t > c and the p‐value is P(T > t).
2) For H1: μ < μ0, the rejection rule is t < −c and the p‐value is P(T < t).
3) For H1: μ ≠ μ0, the rejection rule is |t| > c and the p‐value is P(|T| > |t|).
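The three p‐value rules can be collected into one helper (a sketch assuming scipy; the function name and argument names are my own):

```python
# p-value for a t test of H0: mu = mu0 under each of the three alternatives.
from scipy.stats import t as t_dist

def t_test_p_value(t_stat, df, alternative):
    if alternative == "greater":           # H1: mu > mu0
        return t_dist.sf(t_stat, df)       # P(T > t)
    if alternative == "less":              # H1: mu < mu0
        return t_dist.cdf(t_stat, df)      # P(T < t)
    return 2 * t_dist.sf(abs(t_stat), df)  # H1: mu != mu0, P(|T| > |t|)
```

For the Example C.6 numbers (t = −2.13, df = 19), `t_test_p_value(-2.13, 19, "less")` gives about .023 and the two‐sided version about .046.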
• Rejection rules for the p‐value: Summary
Choose a significance level, α.
1) We reject H0 at the 100α% level if p‐value < α.
2) We fail to reject H0 at the 100α% level if p‐value ≥ α.
The Relationship between Confidence Interval and Hypothesis Testing
• Confidence intervals and hypothesis testing are linked.
• Assume α = .05. The confidence interval can be used to test two‐sided alternatives. Suppose
H0: μ = μ0, H1: μ ≠ μ0.
• Rejection rule:
– If μ0 does not lie in the confidence interval, we reject the null hypothesis at the 5% level.
– If the hypothesized value of μ, μ0, lies in the confidence interval, we fail to reject the null hypothesis at the 5% level.
• After a confidence interval is constructed, many values of μ0 can be tested.
– Since a confidence interval contains more than one value, there are many null hypotheses that will not be rejected.
Example C.8: Training Grants and Worker Productivity
• A 95% confidence interval for the mean change in scrap rates is [−2.28, −0.02].
• Since zero is excluded from this interval, we reject H0: μ = 0 against H1: μ ≠ 0 at the 5% level.
• If H0: μ = −2, we fail to reject the null hypothesis.
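The duality for Example C.8 amounts to a containment check (a minimal sketch; the helper name is my own):

```python
# A value mu0 is rejected at the 5% level exactly when it lies
# outside the 95% confidence interval.
def reject_at_5pct(mu0, ci):
    lo, hi = ci
    return not (lo <= mu0 <= hi)

ci = (-2.28, -0.02)                  # 95% CI for the mean change in scrap rates
assert reject_at_5pct(0.0, ci)       # H0: mu = 0 is rejected
assert not reject_at_5pct(-2.0, ci)  # H0: mu = -2 is not rejected
```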
• Don’t say: we “accept” the null hypothesis H0: μ = −1.0 at the 5% significance level.
• This is because, in the same set of data, there are usually many hypotheses that cannot be rejected.
• For example,
– it is logically incorrect to say that H0: μ = −1 and H0: μ = −2 are both “accepted.”
– It is possible that neither is rejected.
– Thus, we say “fail to reject.”
Practical Significance and Statistical Significance
• We have covered three ways of summarizing evidence about population parameters: 1) point estimates, 2) confidence intervals, and 3) hypothesis tests.
• In empirical analysis, we should also put emphasis on the magnitudes of the point estimates!
• Statistical significance depends on the size of the test statistic, not on the size of ȳ.
– It depends on the ratio of ȳ to its standard error:

t = ȳ/se(ȳ).

• The test statistic could be large because ȳ is large or because se(ȳ) is small.
Practical Significance and Statistical Significance
• Note that the magnitude and sign of the test statistic determine statistical significance.
• Practical significance depends on the magnitude of ȳ.
– The estimate can be statistically significant without being large, especially when we work with large sample sizes.
Example C.9: Effect of Freeway Width on Commute Time
• Given n = 900, ȳ = −3.6, sample sd = 32.7, and se(ȳ) = 32.7/√900 = 1.09.
• The t statistic for testing H0: μ = 0 is t = −3.6/1.09 = −3.30, with p‐value ≈ .0005.
• Statistical significance: we conclude that the freeway widening had a statistically significant effect on average commute time.
• Practical significance: the estimated reduction in average commute time is only 3.6 minutes.
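The Example C.9 statistics can be checked numerically (a sketch assuming scipy):

```python
# Numerical check of Example C.9: n = 900, ybar = -3.6, s = 32.7.
import math
from scipy.stats import norm

n, ybar, s = 900, -3.6, 32.7
se = s / math.sqrt(n)       # 32.7 / 30 = 1.09
t_stat = ybar / se          # about -3.30
p = norm.cdf(t_stat)        # one-sided asymptotic p-value, well below .001
```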
• Let Y denote the change in commute time, measured in minutes, for commuters before and after a freeway was widened.
• Assume Y ~ Normal(μ, σ²).
• Hypotheses: H0: μ = 0; H1: μ < 0.
VII. Remarks on Notation
• We have been careful to use standard conventions: W denotes an estimator (a random variable) and w denotes an estimate (an outcome of the random variable W).
• Distinguishing between an estimator and an estimate is important for understanding various concepts in
– estimation and
– hypothesis testing.
Remarks on Notation
• In the main text, we use a simpler convention that is widely used in econometrics.
• If θ is a population parameter, the notation θ̂ (“theta hat”) will be used to denote both an estimator and an estimate of θ.
Example:
• If the population parameter is μ, then μ̂ denotes an estimator or estimate of μ.
• If the parameter is σ², then σ̂² denotes an estimator or estimate of σ².
Problem C.6
C.6 You are hired by the governor to study whether a tax on liquor has decreased average liquor consumption in your state. You are able to obtain, for a sample of individuals selected at random, the difference in liquor consumption (in ounces) for the years before and after the tax. For person i, sampled randomly from the population, Yi denotes the change in liquor consumption. Treat these as a random sample from a Normal(μ, σ²) distribution.
(i) The null hypothesis is that there was no change in average liquor consumption. State this formally in terms of μ. [ans.]
(ii) The alternative is that there was a decline in liquor consumption; state the alternative in terms of μ. [ans.]
Problem C.6 continued
(iii) Now, suppose your sample size is n = 900 and you obtain the estimates ȳ = −32.8 and s = 466.4. Calculate the t statistic for testing H0 against H1; obtain the p‐value for the test. (Because of the large sample size, just use the standard normal distribution tabulated in Table G.1.) Do you reject H0 at the 5% level? At the 1% level? [ans.]
(iv) Would you say that the estimated fall in consumption is large in magnitude? Comment on the practical versus statistical significance of this estimate. [ans.]
(v) What has been implicitly assumed in your analysis about other determinants of liquor consumption over the two‐year period in order to infer causality from the tax change to liquor consumption? [ans.]
Problem C.6 (i) (ii)
Yi – the change in liquor consumption; the Yi are a random sample from a Normal(μ, σ²) distribution.
(i) H0: μ = 0.
(ii) H1: μ < 0.
Problem C.6 (iii)
(iii) • The standard error of ȳ is se(ȳ) = s/√n = 466.4/30 = 15.55.
• Therefore, the t statistic for testing H0: μ = 0 is t = ȳ/se(ȳ) = −32.8/15.55 = −2.11.
• We obtain the p‐value as P(Z ≤ −2.11), where Z ~ Normal(0,1).
• These probabilities are in Table G.1: p‐value = .0174.
• (α = .05) Because the p‐value is below .05, we reject H0 against the one‐sided alternative at the 5% level.
• (α = .01) We do not reject at the 1% level because p‐value = .0174 > .01.
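The Problem C.6 (iii) answer can be reproduced in a few lines (a sketch assuming scipy):

```python
# Numerical check of Problem C.6 (iii): n = 900, ybar = -32.8, s = 466.4.
import math
from scipy.stats import norm

n, ybar, s = 900, -32.8, 466.4
se = s / math.sqrt(n)     # 466.4 / 30, about 15.55
t_stat = ybar / se        # about -2.11
p = norm.cdf(t_stat)      # about .0174
# p < .05 but p > .01: reject at the 5% level, not at the 1% level.
```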
Problem C.6 (iv)
(iv) • The estimated reduction, about 33 ounces, does not seem large for an entire year’s consumption.
– If the alcohol is beer, 33 ounces is less than three 12‐ounce cans of beer.
– Even if this is hard liquor, the reduction seems small.
– (On the other hand, when aggregated across the entire population, alcohol distributors might not think the effect is so small.)
Problem C.6 (v)
(v) • The implicit assumption is that other factors that affect liquor consumption – such as income, or changes in price due to transportation costs – are constant over the two years.
Problem C.7
C.7 The new management at a bakery claims that workers are now more productive than they were under old management, which is why wages have “generally increased.” Let Wib be Worker i’s wage under the old management and let Wia be Worker i’s wage after the change. The difference is Di = Wia − Wib. Assume that the Di are a random sample from a Normal(μ, σ²) distribution.
(i) Using the following data on 15 workers, construct an exact 95% confidence interval for μ. [ans.]
obs Wb Wa D=Wa-Wb
1 8.3 9.25 0.95
2 9.4 9 -0.4
3 9 9.25 0.25
4 10.5 10 -0.5
5 11.4 12 0.6
6 8.75 9.5 0.75
7 10 10.25 0.25
8 9.5 9.5 0
9 10.8 11.5 0.7
10 12.55 13.1 0.55
11 12 11.5 -0.5
12 8.65 9 0.35
13 7.75 7.75 0
14 11.25 11.5 0.25
15 12.65 13 0.35
mean 10.16667 10.40667 0.24
Problem C.7 continued
(ii) Formally state the null hypothesis that there has been no change in average wages. In particular, what is E(Di) under H0? If you are hired to examine the validity of the new management’s claim, what is the relevant alternative hypothesis in terms of E(Di)? [ans.]
(iii) Test the null hypothesis from part (ii) against the stated alternative at the 5% and 1% levels. [ans.]
(iv) Obtain the p‐value for the test in part (iii). [ans.]
Problem C.7 (i)
(i) • The average increase in wage is d̄ = .24, or 24 cents.
• The sample standard deviation is about s = .451; with n = 15, se(d̄) = .1164.
• From Table G.2, the 97.5th percentile of the t14 distribution is 2.145.
• So the 95% CI is .24 ± 2.145(.1164), or about −.010 to .490.
Problem C.7 (ii)
(ii) • If μ = E(Di), then H0: μ = 0.
• The alternative is that management’s claim is true: H1: μ > 0.
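The confidence interval in part (i) can be recomputed directly from the D column of the data table (a sketch assuming scipy):

```python
# Exact 95% CI for mu from the 15 wage changes D = Wa - Wb.
import math
import statistics
from scipy.stats import t as t_dist

d = [0.95, -0.4, 0.25, -0.5, 0.6, 0.75, 0.25, 0, 0.7,
     0.55, -0.5, 0.35, 0, 0.25, 0.35]
n = len(d)                            # 15 workers
dbar = statistics.mean(d)             # 0.24
s = statistics.stdev(d)               # about 0.451
se = s / math.sqrt(n)                 # about 0.1164
c = t_dist.ppf(0.975, n - 1)          # about 2.145 (97.5th percentile of t14)
ci = (dbar - c * se, dbar + c * se)   # about (-0.010, 0.490)
```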
Problem C.7 (iv)
(iv) • We obtain the p‐value as P(T > 2.062), where T has the t14 distribution.
• The p‐value obtained from EViews is .029;
– this is half of the p‐value for the two‐sided alternative.
– (Econometrics packages, including EViews, report the p‐value for the two‐sided alternative.)
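The EViews numbers can be reproduced from the summary statistics (a sketch assuming scipy):

```python
# Reproducing the EViews t statistic and p-value for Problem C.7 (iv):
# sample mean 0.24, sample sd 0.450872, n = 15, df = 14.
import math
from scipy.stats import t as t_dist

dbar, s, n = 0.24, 0.450872, 15
t_stat = dbar / (s / math.sqrt(n))   # about 2.0616 (EViews: 2.061595)
p_two = 2 * t_dist.sf(t_stat, 14)    # about .058 (EViews reports .0583)
p_one = p_two / 2                    # about .029, the one-sided p-value
```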
Hypothesis Testing for Di
Date: 05/07/07   Time: 08:03
Sample: 1 15
Included observations: 15
Test of Hypothesis: Mean = 0.000000
Sample Mean = 0.240000
Sample Std. Dev. = 0.450872

Method        Value       Probability
t-statistic   2.061595    0.0583
View / Test of Descriptive Stats / Simple Hypothesis Tests
Problem C.7 (iv)
Good Luck!
FT19, PT15: See you around!