8/18/2019 15-MVUE
1/28
15. Minimum Variance Unbiased Estimation
ECE 830, Spring 2014
Bias-Variance Trade-Off
Recall that $\mathrm{MSE}(\hat\theta) = \mathrm{Bias}^2(\hat\theta) + \mathrm{Var}(\hat\theta)$.
In general, the minimum MSE estimator has non-zero bias and
non-zero variance.
We can reduce bias only at a potential increase in variance.
Conversely, modifying the estimator to reduce the variance may
lead to an increase in bias.
Example:
Let
$$x_n = A + w_n, \qquad w_n \sim \mathcal{N}(0, \sigma^2)$$
$$\hat{A} = \frac{\alpha}{N} \sum_{n=1}^{N} x_n$$
where $\alpha$ is an arbitrary constant. If
$$S_N \equiv \frac{1}{N} \sum_{n=1}^{N} x_n,$$
then $\hat{A} = \alpha S_N \sim \mathcal{N}\!\big(\alpha A,\ \alpha^2 \sigma^2 / N\big)$.
Example: (cont.)
Let’s find the value of $\alpha$ that minimizes the MSE.
$$\mathrm{Var}(\hat{A}) = \mathrm{Var}(\alpha S_N) = \alpha^2\,\mathrm{Var}(S_N) = \frac{\alpha^2 \sigma^2}{N}$$
$$\mathrm{Bias}(\hat{A}) = E[\hat{A}] - A = \alpha\,E[S_N] - A = \alpha A - A = (\alpha - 1)A$$
Thus the MSE is
$$\mathrm{MSE}(\hat{A}) = \frac{\alpha^2 \sigma^2}{N} + (\alpha - 1)^2 A^2.$$
Aside: alternatively, we could have computed the MSE as follows.
$$E[x_i x_j] = \begin{cases} A^2 + \sigma^2, & i = j \\ A^2, & i \neq j \end{cases}$$
$$\begin{aligned}
\mathrm{MSE}(\hat{A}) &= E\big[(\hat{A} - A)^2\big] = E[\hat{A}^2] - 2E[\hat{A} A] + A^2 \\
&= \alpha^2 E\Big[\frac{1}{N^2} \sum_{i,j=1}^{N} x_i x_j\Big] - 2\alpha\, E\Big[\frac{1}{N} \sum_{n=1}^{N} x_n\Big] A + A^2 \\
&= \alpha^2 \frac{1}{N^2} \sum_{i,j=1}^{N} E[x_i x_j] - 2\alpha \frac{1}{N} \sum_{n=1}^{N} E[x_n]\, A + A^2 \\
&= \alpha^2 \Big(A^2 + \frac{\sigma^2}{N}\Big) - 2\alpha A^2 + A^2 \\
&= \underbrace{\frac{\alpha^2 \sigma^2}{N}}_{\mathrm{Var}(\hat{A})} + \underbrace{(\alpha - 1)^2 A^2}_{\mathrm{Bias}^2(\hat{A})}
\end{aligned}$$
So how practical is the MSE as a design criterion?
In the previous example, the MSE is minimized when
$$\frac{d\,\mathrm{MSE}(\hat{A})}{d\alpha} = \frac{2\alpha \sigma^2}{N} + 2(\alpha - 1)A^2 = 0 \quad\Rightarrow\quad \alpha^* = \frac{A^2}{A^2 + \sigma^2/N}.$$
The optimal (in an MSE sense) value $\alpha^*$ depends on the unknown parameter $A$! Therefore, the estimator is not realizable. This phenomenon occurs for many classes of problems.
We need an alternative to direct MSE minimization.
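To see this concretely, here is a short Monte Carlo sketch (my addition, with illustrative values $A = 2$, $\sigma = 1$, $N = 10$) showing that the empirical MSE of $\alpha S_N$ is minimized near $\alpha^* = A^2/(A^2 + \sigma^2/N)$:

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma, N = 2.0, 1.0, 10          # true mean, noise std, samples per trial
trials = 200_000

x = A + sigma * rng.standard_normal((trials, N))
S = x.mean(axis=1)                   # sample mean S_N for each trial

alphas = np.linspace(0.0, 1.5, 151)
# empirical MSE of A_hat = alpha * S_N, one value per alpha
emp_mse = np.array([np.mean((a * S - A) ** 2) for a in alphas])

alpha_star = A**2 / (A**2 + sigma**2 / N)                # theoretical minimizer
theory_mse = alphas**2 * sigma**2 / N + (alphas - 1)**2 * A**2

best_alpha = alphas[np.argmin(emp_mse)]
print(best_alpha, alpha_star)        # empirical minimizer lands near alpha*
```

Note that the minimizer shrinks the unbiased choice $\alpha = 1$ toward zero, trading a little bias for a larger reduction in variance; but computing it requires knowing $A$.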
Note that in the above example, the problematic dependence on the parameter ($A$) enters through the bias component of the MSE. This occurs in many situations. Thus a reasonable alternative is to constrain the estimator to be unbiased, and then find the estimator that produces the minimum variance (and hence provides the minimum MSE among all unbiased estimators).
Note: Sometimes no unbiased estimator exists, and we cannot proceed at all in this direction.
Definition: Minimum Variance Unbiased Estimator
$\hat\theta$ is a minimum variance unbiased estimator (MVUE) for $\theta$ if
1. $E[\hat\theta] = \theta$ for all $\theta \in \Theta$;
2. whenever $E[\hat\theta\,'] = \theta$ for all $\theta \in \Theta$, then $\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\hat\theta\,')$ for all $\theta \in \Theta$.
Example:
Suppose we observe a single scalar realization $x$ of
$$X \sim \mathrm{Unif}(0, 1/\theta), \qquad \theta > 0.$$
An unbiased estimator of $\theta$ does not exist. To see this, note that
$$p(x|\theta) = \theta \cdot I_{[0, 1/\theta]}(x).$$
If $\hat\theta$ is unbiased, then for all $\theta > 0$,
$$\theta = E[\hat\theta] = \int_0^{1/\theta} \hat\theta(x)\, \theta\, dx$$
$$\Rightarrow\quad 1 = \int_0^{1/\theta} \hat\theta(x)\, dx \quad \forall\, \theta > 0$$
$$\Rightarrow\quad \hat\theta(1/\theta) = 0 \quad \forall\, \theta > 0 \quad \text{(differentiating both sides with respect to } \theta\text{)}.$$
But if this is true for all $\theta$, then we have $\hat\theta(x) = 0$ for all $x > 0$, which is not an unbiased estimator.
Finding the MVUE Estimator
There is no simple, general procedure for finding the MVUE estimator. In the next several lectures we will discuss several approaches:
1. Find a sufficient statistic and apply the Rao-Blackwell theorem.
2. Determine the so-called Cramer-Rao Lower Bound (CRLB) and verify that the estimator achieves it.
3. Further restrict the estimator to a class of estimators (e.g., linear or polynomial functions of the data).
Recipe for finding a MVUE
(1) Find a complete sufficient statistic t = T (X ).
(2) Find any unbiased estimator $\hat\theta_0$ and set $\hat\theta(X) := E[\hat\theta_0(X)\,|\,t = T(X)]$, or find a function $g$ such that $\hat\theta(X) = g(T(X))$ is unbiased.
These notes answer the following questions:
1. What is a sufficient statistic?
2. What is a complete sufficient statistic?
3. What does step (2) do above?
4. Is this estimator unique?
5. How do we know it's the MVUE?
Definition: Sufficient statistic
Let $X$ be an $N$-dimensional random vector and let $\theta$ denote a $p$-dimensional parameter of the distribution of $X$. The statistic $t := T(X)$ is a sufficient statistic for $\theta$ if and only if the conditional distribution of $X$ given $T(X)$ is independent of $\theta$.
See lecture 4 for more information on sufficient statistics and how to find them.
Minimal and Complete Sufficient Statistics
Definition: Minimal Sufficient Statistic
A sufficient statistic $t$ is said to be minimal if the dimension of $t$ cannot be reduced and still be sufficient.
Definition: Complete sufficient statistic
A sufficient statistic $t := T(X)$ is complete if every real-valued function $\phi$ which satisfies
$$E[\phi(t)\,|\,\theta] = 0 \quad \forall\,\theta$$
also satisfies
$$P[\phi(t) = 0\,|\,\theta] = 1 \quad \forall\,\theta.$$
Under very general conditions, if $t$ is a complete sufficient statistic, then $t$ is minimal.
Example: Bernoulli trials
Consider $N$ independent Bernoulli trials
$$x_n \stackrel{iid}{\sim} \mathrm{Bernoulli}(\theta), \qquad \theta \in [0, 1].$$
Recall $k = \sum_{n=1}^{N} x_n$ is sufficient for $\theta$. Now suppose $E[\phi(k)\,|\,\theta] = 0$ for all $\theta$. But
$$E[\phi(k)\,|\,\theta] = \sum_{j=0}^{N} \phi(j) \binom{N}{j} \theta^j (1-\theta)^{N-j} = \mathrm{poly}(\theta)$$
where $\mathrm{poly}(\theta)$ is an $N$-th degree polynomial. Then
$$\mathrm{poly}(\theta) = 0 \ \ \forall\,\theta \in [0,1]$$
$$\Rightarrow\quad \mathrm{poly}(\theta) \text{ is the zero polynomial}$$
$$\Rightarrow\quad \phi(j) = 0 \ \text{ for } j = 0, 1, \ldots, N$$
$$\Rightarrow\quad k \text{ is a complete sufficient statistic.}$$
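To make the completeness argument concrete, here is a small numerical check (my own addition): writing the condition $E[\phi(k)|\theta] = 0$ at $N+1$ distinct values of $\theta$ gives a square linear system in the values $\phi(0), \ldots, \phi(N)$; the Bernstein-type matrix involved is nonsingular, so $\phi$ must vanish identically.

```python
import numpy as np
from math import comb

N = 5
thetas = np.linspace(0.1, 0.9, N + 1)      # N+1 distinct values of theta

# M[i, j] = P(k = j | theta_i) = C(N,j) theta^j (1-theta)^(N-j)
M = np.array([[comb(N, j) * t**j * (1 - t)**(N - j) for j in range(N + 1)]
              for t in thetas])

# E[phi(k)|theta_i] = 0 for all i  <=>  M @ phi = 0.
# A nonsingular M forces phi = 0, i.e. k is complete.
print(np.linalg.matrix_rank(M))            # full rank: N + 1
```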
Rao-Blackwell Theorem
Rao-Blackwell Theorem
Let $Y$, $Z$ be random variables and define the function
$$g(z) := E[Y\,|\,Z = z].$$
Then
$$E[g(Z)] = E[Y]$$
and
$$\mathrm{Var}(g(Z)) \le \mathrm{Var}(Y),$$
with equality iff $Y = g(Z)$ almost surely.
Note that this version of Rao-Blackwell is quite general and has nothing to do with estimation of parameters. However, we can apply it to parameter estimation as follows.
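As an illustration (mine, not part of the slides), take a toy pair where the conditional expectation is known in closed form: $Y = Z + W$ with independent standard normal $Z$, $W$, so that $g(z) = E[Y|Z=z] = z$. A short simulation shows $E[g(Z)] = E[Y]$ while $\mathrm{Var}(g(Z)) < \mathrm{Var}(Y)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
Z = rng.standard_normal(n)
Y = Z + rng.standard_normal(n)     # Y = Z + independent noise

g = Z                              # g(z) = E[Y | Z = z] = z for this toy model
print(Y.mean(), g.mean())          # means agree (both near 0)
print(Y.var(), g.var())            # variance drops from about 2 to about 1
```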
Consider $X \sim p(x|\theta)$. Let $\hat\theta_1$ be an unbiased estimator of $\theta$ and let $t = T(X)$ be a sufficient statistic for $\theta$. Apply Rao-Blackwell with
$$Y := \hat\theta_1(X), \qquad Z := t = T(X).$$
Consider the new estimator
$$\hat\theta_2(X) = g(T(X)) = E[\hat\theta_1(X)\,|\,T(X) = t].$$
Then we may conclude:
1. $\hat\theta_2$ is unbiased;
2. $\mathrm{Var}(\hat\theta_2) \le \mathrm{Var}(\hat\theta_1)$.
In words, if $\hat\theta_1$ is any unbiased estimator, then smoothing $\hat\theta_1$ with respect to a sufficient statistic never increases the variance while preserving unbiasedness.
Therefore, we can restrict our search for the MVUE to functions of a sufficient statistic.
The Rao-Blackwell Theorem
Rao-Blackwell Theorem, special case
Let $X$ be a random variable with pdf $p(X|\theta)$ and let $t(X)$ be a sufficient statistic. Let $\hat\theta_1(X)$ be an estimator of $\theta$ and define
$$\hat\theta_2(t) := E[\hat\theta_1(X)\,|\,t(X)].$$
Then
$$E[\hat\theta_2(t)] = E[\hat\theta_1(X)]$$
and
$$\mathrm{Var}(\hat\theta_2(t)) \le \mathrm{Var}(\hat\theta_1(X)),$$
with equality iff $\hat\theta_1(X) \equiv \hat\theta_2(t(X))$ with probability one (almost surely).
Rao-Blackwell Theorem in Action
Suppose we observe 2 independent realizations from a $\mathcal{N}(\mu, \sigma^2)$ distribution. Denote these observations $x_1$ and $x_2$, with $X = [x_1, x_2]^T$. Consider the simple estimator of $\mu$:
$$\hat\mu = x_1.$$
$$E[\hat\mu] = \mu, \qquad \mathrm{Var}(\hat\mu) = \sigma^2.$$
The MSE is therefore $\mathrm{MSE}(\hat\mu) = \sigma^2$, since the estimator is unbiased.
Intuitively, we expect that the sample mean should be a better estimator, since
$$\bar\mu = \frac{1}{2}(x_1 + x_2)$$
averages the two observations together.
Is this the best possible estimator?
Let’s find a sufficient statistic for µ:
$$\begin{aligned}
p(x_1, x_2) &= \frac{1}{2\pi\sigma^2}\, e^{-(x_1 - \mu)^2/2\sigma^2}\, e^{-(x_2 - \mu)^2/2\sigma^2} \\
&= \frac{1}{2\pi\sigma^2} \exp\Big(-\frac{x_1^2 + x_2^2}{2\sigma^2}\Big) \exp\Big(\frac{\mu (x_1 + x_2)}{\sigma^2} - \frac{\mu^2}{\sigma^2}\Big)
\end{aligned}$$
By the Fisher-Neyman factorization, $t = x_1 + x_2$ is sufficient for $\mu$.
The Rao-Blackwell Theorem states that
$$\mu^* = E[\hat\mu\,|\,t]$$
is as good as or better than $\hat\mu$ in terms of estimator variance. (See Scharf p. 94.) What is $\mu^*$? First we need to compute the mean of the conditional density $p(x_1|t)$:
$$p(x_1|t) = \frac{p(x_1, t)}{p(t)}$$
$$p(x_1, t) = \frac{1}{2\pi\sigma^2}\, e^{-(x_1 - \mu)^2/2\sigma^2}\, e^{-(t - x_1 - \mu)^2/2\sigma^2}$$
$$t \sim \mathcal{N}(2\mu, 2\sigma^2), \qquad E[t] = 2\mu, \qquad \mathrm{Var}(t) = 2\sigma^2.$$
$$\begin{aligned}
p(x_1|t) &= \frac{1/(2\pi\sigma^2)}{1/\sqrt{4\pi\sigma^2}} \exp\Big(-\frac{1}{2\sigma^2}\big[(x_1 - \mu)^2 + (t - x_1 - \mu)^2 - (t - 2\mu)^2/2\big]\Big) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\Big(-\frac{1}{2\sigma^2}\big[x_1^2 - 2\mu x_1 + \mu^2 + t^2 - 2x_1 t + x_1^2 - 2\mu t + 2\mu x_1 + \mu^2 - t^2/2 + 2\mu t - 2\mu^2\big]\Big) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\Big(-\frac{1}{2\sigma^2}\big[2x_1^2 - 2x_1 t + t^2/2\big]\Big) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\Big(-\frac{(x_1 - t/2)^2}{\sigma^2}\Big) \quad\Rightarrow\quad x_1 | t \sim \mathcal{N}\!\Big(\frac{t}{2}, \frac{\sigma^2}{2}\Big)
\end{aligned}$$
$$\mu^* = E[\hat\mu\,|\,t] = \frac{t}{2} = \frac{x_1 + x_2}{2}, \qquad \mathrm{Var}(\mu^*) = \frac{\sigma^2}{2}$$
$$\Rightarrow\quad \mathrm{MSE}(\mu^*) = \frac{\sigma^2}{2} < \sigma^2 = \mathrm{MSE}(\hat\mu).$$
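The variance reduction can be checked numerically; this quick simulation (not in the original notes, using illustrative values $\mu = 3$, $\sigma = 1$) compares $\hat\mu = x_1$ with $\mu^* = (x_1 + x_2)/2$:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, trials = 3.0, 1.0, 400_000

x1 = mu + sigma * rng.standard_normal(trials)
x2 = mu + sigma * rng.standard_normal(trials)

mu_hat = x1                 # naive unbiased estimator
mu_star = (x1 + x2) / 2     # Rao-Blackwellized estimator E[mu_hat | t] = t/2

print(mu_hat.var(), mu_star.var())   # sigma^2 vs sigma^2 / 2
```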
The Lehmann-Scheffé Theorem
The Rao-Blackwell Theorem tells us how to decrease the variance of an unbiased estimator. But when can we know that we get a MVUE?
Answer: When $t$ is a complete sufficient statistic.
Lehmann-Scheffé Theorem
If $t$ is complete, there is at most one unbiased estimator that is a function of $t$.
Proof. Suppose
$$E[\hat\theta_1] = E[\hat\theta_2] = \theta, \qquad \hat\theta_1(X) := g_1(T(X)), \qquad \hat\theta_2(X) := g_2(T(X)).$$
Define
$$\phi(t) := g_1(t) - g_2(t).$$
Then
$$E[\phi(t)] = E[\hat\theta_1] - E[\hat\theta_2] = \theta - \theta = 0 \quad \forall\,\theta.$$
By definition of completeness, we have $P[\phi(t) = 0\,|\,\theta] = 1$ for all $\theta$. In other words, $\hat\theta_1 = \hat\theta_2$ with probability 1.
Recipe for finding a MVUE
This result suggests the following method for finding a MVUE:
(1) Find a complete sufficient statistic $t = T(X)$.
(2) Find any unbiased estimator $\hat\theta_0$ and set $\hat\theta(X) := E[\hat\theta_0(X)\,|\,t = T(X)]$, or find a function $g$ such that $\hat\theta(X) = g(T(X))$ is unbiased.
Rao-Blackwell and Complete Suff. Stats.
Theorem
If $\hat\theta$ is constructed by the recipe above, then $\hat\theta$ is the unique MVUE.
Proof: Note that in either construction, $\hat\theta$ is a function of $t$. Let $\hat\theta_1$ be any unbiased estimator. We must show that
$$\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\hat\theta_1).$$
Define
$$\hat\theta_2(X) := E[\hat\theta_1(X)\,|\,t = T(X)].$$
By Rao-Blackwell, it suffices to show
$$\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\hat\theta_2).$$
Proof (cont.)
But $\hat\theta$ and $\hat\theta_2$ are both unbiased and functions of a complete sufficient statistic, so by Lehmann-Scheffé $\hat\theta = \hat\theta_2$ with probability 1, and hence $\mathrm{Var}(\hat\theta) = \mathrm{Var}(\hat\theta_2) \le \mathrm{Var}(\hat\theta_1)$.
To show uniqueness, in the above argument suppose $\mathrm{Var}(\hat\theta_1) = \mathrm{Var}(\hat\theta)$. Then the Rao-Blackwell bound holds with equality, so $\hat\theta_1 = \hat\theta_2 = \hat\theta$ almost surely.
Example: Uniform distribution.
Suppose $X = [x_1 \cdots x_N]^T$ where
$$x_i \stackrel{iid}{\sim} \mathrm{Unif}[0, \theta], \qquad i = 1, \ldots, N.$$
What is an unbiased estimator of $\theta$?
$$\hat\theta_1 = \frac{2}{N} \sum_{i=1}^{N} x_i$$
is unbiased. However, it is not the MVUE.
Example: (cont.)
From the Fisher-Neyman factorization theorem,
$$p(X|\theta) = \prod_{i=1}^{N} \frac{1}{\theta}\, I_{[0,\theta]}(x_i) = \underbrace{\frac{1}{\theta^N}\, I_{[\max_i x_i,\, \infty)}(\theta)}_{b_\theta(t)} \cdot \underbrace{I_{(-\infty,\, \min_i x_i]}(0)}_{a(X)}$$
we see that
$$T = \max_i x_i$$
is a sufficient statistic. It is left as an exercise to show that $T$ is in fact complete. Since $\hat\theta_1$ is not a function of $T$, it is not the MVUE. However,
$$\hat\theta_2(X) = E[\hat\theta_1(X)\,|\,t = T(X)]$$
is the MVUE.
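For this example the conditioning can be carried out in closed form: $E[\hat\theta_1 \,|\, \max_i x_i = m] = \frac{N+1}{N} m$, since given the maximum, each $x_i$ equals $m$ with probability $1/N$ and is otherwise uniform on $[0, m]$. A quick simulation (my addition, with illustrative values $\theta = 5$, $N = 10$) confirms both estimators are unbiased and that $\hat\theta_2$ has smaller variance:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, N, trials = 5.0, 10, 200_000

x = theta * rng.random((trials, N))           # x_i ~ Unif[0, theta]

theta1 = 2 * x.mean(axis=1)                   # unbiased, not a function of the max
theta2 = (N + 1) / N * x.max(axis=1)          # E[theta1 | max_i x_i] -> the MVUE

print(theta1.mean(), theta2.mean())           # both near theta
print(theta1.var(), theta2.var())             # theta2 has much smaller variance
```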