8/18/2019 15-MVUE
1/28
15. Minimum Variance Unbiased Estimation
ECE 830, Spring 2014
Bias-Variance Trade-Off
Recall that $\mathrm{MSE}(\hat\theta) = \mathrm{Bias}^2(\hat\theta) + \mathrm{Var}(\hat\theta)$.
In general, the minimum MSE estimator has non-zero bias and
non-zero variance.
We can reduce bias only at a potential increase in variance.
Conversely, modifying the estimator to reduce the variance may
lead to an increase in bias.
Example:
Let
$$x_n = A + w_n, \qquad w_n \sim \mathcal{N}(0, \sigma^2)$$
$$\hat{A} = \frac{\alpha}{N} \sum_{n=1}^{N} x_n$$
where $\alpha$ is an arbitrary constant. If
$$S_N \equiv \frac{1}{N} \sum_{n=1}^{N} x_n,$$
then $\hat{A} = \alpha S_N \sim \mathcal{N}\!\big(\alpha A,\ \alpha^2 \sigma^2 / N\big)$.
Example: (cont.)
Let’s find the value of $\alpha$ that minimizes the MSE.
$$\mathrm{Var}(\hat{A}) = \mathrm{Var}(\alpha S_N) = \alpha^2\,\mathrm{Var}(S_N) = \frac{\alpha^2 \sigma^2}{N}$$
$$\mathrm{Bias}(\hat{A}) = E[\hat{A}] - A = \alpha\,E[S_N] - A = \alpha A - A = (\alpha - 1)A$$
Thus the MSE is
$$\mathrm{MSE}(\hat{A}) = \frac{\alpha^2 \sigma^2}{N} + (\alpha - 1)^2 A^2.$$
Aside: alternatively, we could have computed the MSE as follows.
$$E[x_i x_j] = \begin{cases} A^2 + \sigma^2, & i = j \\ A^2, & i \neq j \end{cases}$$
$$\begin{aligned}
\mathrm{MSE}(\hat{A}) &= E\big[(\hat{A} - A)^2\big] = E[\hat{A}^2] - 2E[\hat{A} A] + A^2 \\
&= \alpha^2 E\Big[\frac{1}{N^2} \sum_{i,j=1}^{N} x_i x_j\Big] - 2\alpha\, E\Big[\frac{1}{N} \sum_{n=1}^{N} x_n\Big] A + A^2 \\
&= \alpha^2 \frac{1}{N^2} \sum_{i,j=1}^{N} E[x_i x_j] - 2\alpha \frac{1}{N} \sum_{n=1}^{N} E[x_n]\, A + A^2 \\
&= \alpha^2 \Big(A^2 + \frac{\sigma^2}{N}\Big) - 2\alpha A^2 + A^2 \\
&= \underbrace{\frac{\alpha^2 \sigma^2}{N}}_{\mathrm{Var}(\hat{A})} + \underbrace{(\alpha - 1)^2 A^2}_{\mathrm{Bias}^2(\hat{A})}
\end{aligned}$$
So how practical is the MSE as a design criterion?
In the previous example, the MSE is minimized when
$$\frac{d\,\mathrm{MSE}(\hat{A})}{d\alpha} = \frac{2\alpha \sigma^2}{N} + 2(\alpha - 1)A^2 = 0 \quad\Rightarrow\quad \alpha^* = \frac{A^2}{A^2 + \sigma^2/N}.$$
The optimal (in an MSE sense) value $\alpha^*$ depends on the unknown parameter $A$! Therefore, the estimator is not realizable. This phenomenon occurs for many classes of problems.
We need an alternative to direct MSE minimization.
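To see this concretely, here is a short Monte Carlo sketch (my addition, with illustrative values $A = 2$, $\sigma = 1$, $N = 10$) showing that the empirical MSE of $\alpha S_N$ is minimized near $\alpha^* = A^2/(A^2 + \sigma^2/N)$:

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma, N = 2.0, 1.0, 10          # true mean, noise std, samples per trial
trials = 200_000

x = A + sigma * rng.standard_normal((trials, N))
S = x.mean(axis=1)                   # sample mean S_N for each trial

alphas = np.linspace(0.0, 1.5, 151)
# empirical MSE of A_hat = alpha * S_N, one value per alpha
emp_mse = np.array([np.mean((a * S - A) ** 2) for a in alphas])

alpha_star = A**2 / (A**2 + sigma**2 / N)                # theoretical minimizer
theory_mse = alphas**2 * sigma**2 / N + (alphas - 1)**2 * A**2

best_alpha = alphas[np.argmin(emp_mse)]
print(best_alpha, alpha_star)        # empirical minimizer lands near alpha*
```

Note that the minimizer shrinks the unbiased choice $\alpha = 1$ toward zero, trading a little bias for a larger reduction in variance; but computing it requires knowing $A$.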
Note that in the above example, the problematic dependence on the parameter ($A$) enters through the bias component of the MSE. This occurs in many situations. Thus a reasonable alternative is to constrain the estimator to be unbiased, and then find the estimator that produces the minimum variance (and hence provides the minimum MSE among all unbiased estimators).
Note: Sometimes no unbiased estimator exists, and we cannot proceed at all in this direction.
Definition: Minimum Variance Unbiased Estimator
$\hat\theta$ is a minimum variance unbiased estimator (MVUE) for $\theta$ if
1. $E[\hat\theta] = \theta$ for all $\theta \in \Theta$;
2. whenever $E[\hat\theta\,'] = \theta$ for all $\theta \in \Theta$, then $\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\hat\theta\,')$ for all $\theta \in \Theta$.
Example:
Suppose we observe a single scalar realization $x$ of
$$X \sim \mathrm{Unif}(0, 1/\theta), \qquad \theta > 0.$$
An unbiased estimator of $\theta$ does not exist. To see this, note that
$$p(x|\theta) = \theta \cdot I_{[0, 1/\theta]}(x).$$
If $\hat\theta$ is unbiased, then for all $\theta > 0$,
$$\theta = E[\hat\theta] = \int_0^{1/\theta} \hat\theta(x)\, \theta\, dx$$
$$\Rightarrow\quad 1 = \int_0^{1/\theta} \hat\theta(x)\, dx \quad \forall\, \theta > 0$$
$$\Rightarrow\quad \hat\theta(1/\theta) = 0 \quad \forall\, \theta > 0 \quad \text{(differentiating both sides with respect to } \theta\text{)}.$$
But if this is true for all $\theta$, then we have $\hat\theta(x) = 0$ for all $x > 0$, which is not an unbiased estimator.
Finding the MVUE Estimator
There is no simple, general procedure for finding the MVUE estimator. In the next several lectures we will discuss several approaches:
1. Find a sufficient statistic and apply the Rao-Blackwell theorem.
2. Determine the so-called Cramer-Rao Lower Bound (CRLB) and verify that the estimator achieves it.
3. Further restrict the estimator to a class of estimators (e.g., linear or polynomial functions of the data).
Recipe for finding a MVUE
(1) Find a complete sufficient statistic t = T (X ).
(2) Find any unbiased estimator $\hat\theta_0$ and set $\hat\theta(X) := E[\hat\theta_0(X)\,|\,t = T(X)]$, or find a function $g$ such that $\hat\theta(X) = g(T(X))$ is unbiased.
These notes answer the following questions:
1. What is a sufficient statistic?
2. What is a complete sufficient statistic?
3. What does step (2) do above?
4. Is this estimator unique?
5. How do we know it's the MVUE?
Definition: Sufficient statistic
Let $X$ be an $N$-dimensional random vector and let $\theta$ denote a $p$-dimensional parameter of the distribution of $X$. The statistic $t := T(X)$ is a sufficient statistic for $\theta$ if and only if the conditional distribution of $X$ given $T(X)$ is independent of $\theta$.
See lecture 4 for more information on sufficient statistics and how to find them.
Minimal and Complete Sufficient Statistics
Definition: Minimal Sufficient Statistic
A sufficient statistic $t$ is said to be minimal if the dimension of $t$ cannot be reduced and still be sufficient.
Definition: Complete sufficient statistic
A sufficient statistic $t := T(X)$ is complete if every real-valued function $\phi$ which satisfies
$$E[\phi(t)\,|\,\theta] = 0 \quad \forall\,\theta$$
also satisfies
$$P[\phi(t) = 0\,|\,\theta] = 1 \quad \forall\,\theta.$$
Under very general conditions, if $t$ is a complete sufficient statistic, then $t$ is minimal.
Example: Bernoulli trials
Consider $N$ independent Bernoulli trials
$$x_n \stackrel{iid}{\sim} \mathrm{Bernoulli}(\theta), \qquad \theta \in [0, 1].$$
Recall $k = \sum_{n=1}^{N} x_n$ is sufficient for $\theta$. Now suppose $E[\phi(k)\,|\,\theta] = 0$ for all $\theta$. But
$$E[\phi(k)\,|\,\theta] = \sum_{j=0}^{N} \phi(j) \binom{N}{j} \theta^j (1-\theta)^{N-j} = \mathrm{poly}(\theta)$$
where $\mathrm{poly}(\theta)$ is an $N$-th degree polynomial. Then
$$\mathrm{poly}(\theta) = 0 \ \ \forall\,\theta \in [0,1]$$
$$\Rightarrow\quad \mathrm{poly}(\theta) \text{ is the zero polynomial}$$
$$\Rightarrow\quad \phi(j) = 0 \ \text{ for } j = 0, 1, \ldots, N$$
$$\Rightarrow\quad k \text{ is a complete sufficient statistic.}$$
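To make the completeness argument concrete, here is a small numerical check (my own addition): writing the condition $E[\phi(k)|\theta] = 0$ at $N+1$ distinct values of $\theta$ gives a square linear system in the values $\phi(0), \ldots, \phi(N)$; the Bernstein-type matrix involved is nonsingular, so $\phi$ must vanish identically.

```python
import numpy as np
from math import comb

N = 5
thetas = np.linspace(0.1, 0.9, N + 1)      # N+1 distinct values of theta

# M[i, j] = P(k = j | theta_i) = C(N,j) theta^j (1-theta)^(N-j)
M = np.array([[comb(N, j) * t**j * (1 - t)**(N - j) for j in range(N + 1)]
              for t in thetas])

# E[phi(k)|theta_i] = 0 for all i  <=>  M @ phi = 0.
# A nonsingular M forces phi = 0, i.e. k is complete.
print(np.linalg.matrix_rank(M))            # full rank: N + 1
```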
Rao-Blackwell Theorem
Rao-Blackwell Theorem
Let $Y$, $Z$ be random variables and define the function
$$g(z) := E[Y\,|\,Z = z].$$
Then
$$E[g(Z)] = E[Y]$$
and
$$\mathrm{Var}(g(Z)) \le \mathrm{Var}(Y),$$
with equality iff $Y = g(Z)$ almost surely.
Note that this version of Rao-Blackwell is quite general and has nothing to do with estimation of parameters. However, we can apply it to parameter estimation as follows.
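As an illustration (mine, not part of the slides), take a toy pair where the conditional expectation is known in closed form: $Y = Z + W$ with independent standard normal $Z$, $W$, so that $g(z) = E[Y|Z=z] = z$. A short simulation shows $E[g(Z)] = E[Y]$ while $\mathrm{Var}(g(Z)) < \mathrm{Var}(Y)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
Z = rng.standard_normal(n)
Y = Z + rng.standard_normal(n)     # Y = Z + independent noise

g = Z                              # g(z) = E[Y | Z = z] = z for this toy model
print(Y.mean(), g.mean())          # means agree (both near 0)
print(Y.var(), g.var())            # variance drops from about 2 to about 1
```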
Consider $X \sim p(x|\theta)$. Let $\hat\theta_1$ be an unbiased estimator of $\theta$ and let $t = T(X)$ be a sufficient statistic for $\theta$. Apply Rao-Blackwell with
$$Y := \hat\theta_1(X), \qquad Z := t = T(X).$$
Consider the new estimator
$$\hat\theta_2(X) = g(T(X)) = E[\hat\theta_1(X)\,|\,T(X) = t].$$
Then we may conclude:
1. $\hat\theta_2$ is unbiased;
2. $\mathrm{Var}(\hat\theta_2) \le \mathrm{Var}(\hat\theta_1)$.
In words, if $\hat\theta_1$ is any unbiased estimator, then smoothing $\hat\theta_1$ with respect to a sufficient statistic never increases the variance while preserving unbiasedness.
Therefore, we can restrict our search for the MVUE to functions of a sufficient statistic.
The Rao-Blackwell Theorem
Rao-Blackwell Theorem, special case
Let $X$ be a random variable with pdf $p(X|\theta)$ and let $t(X)$ be a sufficient statistic. Let $\hat\theta_1(X)$ be an estimator of $\theta$ and define
$$\hat\theta_2(t) := E[\hat\theta_1(X)\,|\,t(X)].$$
Then
$$E[\hat\theta_2(t)] = E[\hat\theta_1(X)]$$
and
$$\mathrm{Var}(\hat\theta_2(t)) \le \mathrm{Var}(\hat\theta_1(X)),$$
with equality iff $\hat\theta_1(X) \equiv \hat\theta_2(t(X))$ with probability one (almost surely).
Rao-Blackwell Theorem in Action
Suppose we observe 2 independent realizations from a $\mathcal{N}(\mu, \sigma^2)$ distribution. Denote these observations $x_1$ and $x_2$, with $X = [x_1, x_2]^T$. Consider the simple estimator of $\mu$:
$$\hat\mu = x_1.$$
$$E[\hat\mu] = \mu, \qquad \mathrm{Var}(\hat\mu) = \sigma^2.$$
The MSE is therefore $\mathrm{MSE}(\hat\mu) = \sigma^2$, since the estimator is unbiased.
Intuitively, we expect that the sample mean should be a better estimator, since
$$\bar\mu = \frac{1}{2}(x_1 + x_2)$$
averages the two observations together.
Is this the best possible estimator?
Let’s find a sufficient statistic for µ:
$$\begin{aligned}
p(x_1, x_2) &= \frac{1}{2\pi\sigma^2}\, e^{-(x_1 - \mu)^2/2\sigma^2}\, e^{-(x_2 - \mu)^2/2\sigma^2} \\
&= \frac{1}{2\pi\sigma^2} \exp\Big(-\frac{x_1^2 + x_2^2}{2\sigma^2}\Big) \exp\Big(\frac{\mu (x_1 + x_2)}{\sigma^2} - \frac{\mu^2}{\sigma^2}\Big)
\end{aligned}$$
By the Fisher-Neyman factorization, $t = x_1 + x_2$ is sufficient for $\mu$.
The Rao-Blackwell Theorem states that
$$\mu^* = E[\hat\mu\,|\,t]$$
is as good as or better than $\hat\mu$ in terms of estimator variance. (See Scharf p. 94.) What is $\mu^*$? First we need to compute the mean of the conditional density $p(x_1|t)$:
$$p(x_1|t) = \frac{p(x_1, t)}{p(t)}$$
$$p(x_1, t) = \frac{1}{2\pi\sigma^2}\, e^{-(x_1 - \mu)^2/2\sigma^2}\, e^{-(t - x_1 - \mu)^2/2\sigma^2}$$
$$t \sim \mathcal{N}(2\mu, 2\sigma^2), \qquad E[t] = 2\mu, \qquad \mathrm{Var}(t) = 2\sigma^2.$$
$$\begin{aligned}
p(x_1|t) &= \frac{1/(2\pi\sigma^2)}{1/\sqrt{4\pi\sigma^2}} \exp\Big(-\frac{1}{2\sigma^2}\big[(x_1 - \mu)^2 + (t - x_1 - \mu)^2 - (t - 2\mu)^2/2\big]\Big) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\Big(-\frac{1}{2\sigma^2}\big[x_1^2 - 2\mu x_1 + \mu^2 + t^2 - 2x_1 t + x_1^2 - 2\mu t + 2\mu x_1 + \mu^2 - t^2/2 + 2\mu t - 2\mu^2\big]\Big) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\Big(-\frac{1}{2\sigma^2}\big[2x_1^2 - 2x_1 t + t^2/2\big]\Big) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\Big(-\frac{(x_1 - t/2)^2}{\sigma^2}\Big) \quad\Rightarrow\quad x_1 | t \sim \mathcal{N}\!\Big(\frac{t}{2}, \frac{\sigma^2}{2}\Big)
\end{aligned}$$
$$\mu^* = E[\hat\mu\,|\,t] = \frac{t}{2} = \frac{x_1 + x_2}{2}, \qquad \mathrm{Var}(\mu^*) = \frac{\sigma^2}{2}$$
$$\Rightarrow\quad \mathrm{MSE}(\mu^*) = \frac{\sigma^2}{2} < \sigma^2 = \mathrm{MSE}(\hat\mu).$$
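The variance reduction can be checked numerically; this quick simulation (not in the original notes, using illustrative values $\mu = 3$, $\sigma = 1$) compares $\hat\mu = x_1$ with $\mu^* = (x_1 + x_2)/2$:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, trials = 3.0, 1.0, 400_000

x1 = mu + sigma * rng.standard_normal(trials)
x2 = mu + sigma * rng.standard_normal(trials)

mu_hat = x1                 # naive unbiased estimator
mu_star = (x1 + x2) / 2     # Rao-Blackwellized estimator E[mu_hat | t] = t/2

print(mu_hat.var(), mu_star.var())   # sigma^2 vs sigma^2 / 2
```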
The Lehmann-Scheffé Theorem
The Rao-Blackwell Theorem tells us how to decrease the variance of an unbiased estimator. But when can we know that we get a MVUE?
Answer: When $t$ is a complete sufficient statistic.
Lehmann-Scheffé Theorem
If $t$ is complete, there is at most one unbiased estimator that is a function of $t$.
Proof. Suppose
$$E[\hat\theta_1] = E[\hat\theta_2] = \theta, \qquad \hat\theta_1(X) := g_1(T(X)), \qquad \hat\theta_2(X) := g_2(T(X)).$$
Define
$$\phi(t) := g_1(t) - g_2(t).$$
Then
$$E[\phi(t)] = E[\hat\theta_1] - E[\hat\theta_2] = \theta - \theta = 0 \quad \forall\,\theta.$$
By definition of completeness, we have $P[\phi(t) = 0\,|\,\theta] = 1$ for all $\theta$. In other words, $\hat\theta_1 = \hat\theta_2$ with probability 1.
Recipe for finding a MVUE
This result suggests the following method for finding a MVUE:
(1) Find a complete sufficient statistic $t = T(X)$.
(2) Find any unbiased estimator $\hat\theta_0$ and set $\hat\theta(X) := E[\hat\theta_0(X)\,|\,t = T(X)]$, or find a function $g$ such that $\hat\theta(X) = g(T(X))$ is unbiased.
Rao-Blackwell and Complete Suff. Stats.
Theorem
If $\hat\theta$ is constructed by the recipe above, then $\hat\theta$ is the unique MVUE.
Proof: Note that in either construction, $\hat\theta$ is a function of $t$. Let $\hat\theta_1$ be any unbiased estimator. We must show that
$$\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\hat\theta_1).$$
Define
$$\hat\theta_2(X) := E[\hat\theta_1(X)\,|\,t = T(X)].$$
By Rao-Blackwell, it suffices to show
$$\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\hat\theta_2).$$
Proof (cont.)
But $\hat\theta$ and $\hat\theta_2$ are both unbiased and functions of a complete sufficient statistic, so by Lehmann-Scheffé $\hat\theta = \hat\theta_2$ with probability 1, and hence $\mathrm{Var}(\hat\theta) = \mathrm{Var}(\hat\theta_2) \le \mathrm{Var}(\hat\theta_1)$.
To show uniqueness, in the above argument suppose $\mathrm{Var}(\hat\theta_1) = \mathrm{Var}(\hat\theta)$. Then the Rao-Blackwell bound holds with equality, so $\hat\theta_1 = \hat\theta_2 = \hat\theta$ almost surely.
Example: Uniform distribution.
Suppose $X = [x_1 \cdots x_N]^T$ where
$$x_i \stackrel{iid}{\sim} \mathrm{Unif}[0, \theta], \qquad i = 1, \ldots, N.$$
What is an unbiased estimator of $\theta$?
$$\hat\theta_1 = \frac{2}{N} \sum_{i=1}^{N} x_i$$
is unbiased. However, it is not the MVUE.
Example: (cont.)
From the Fisher-Neyman factorization theorem,
$$p(X|\theta) = \prod_{i=1}^{N} \frac{1}{\theta}\, I_{[0,\theta]}(x_i) = \underbrace{\frac{1}{\theta^N}\, I_{[\max_i x_i,\, \infty)}(\theta)}_{b_\theta(t)} \cdot \underbrace{I_{(-\infty,\, \min_i x_i]}(0)}_{a(X)}$$
we see that
$$T = \max_i x_i$$
is a sufficient statistic. It is left as an exercise to show that $T$ is in fact complete. Since $\hat\theta_1$ is not a function of $T$, it is not the MVUE. However,
$$\hat\theta_2(X) = E[\hat\theta_1(X)\,|\,t = T(X)]$$
is the MVUE.
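For this example the conditioning can be carried out in closed form: $E[\hat\theta_1 \,|\, \max_i x_i = m] = \frac{N+1}{N} m$, since given the maximum, each $x_i$ equals $m$ with probability $1/N$ and is otherwise uniform on $[0, m]$. A quick simulation (my addition, with illustrative values $\theta = 5$, $N = 10$) confirms both estimators are unbiased and that $\hat\theta_2$ has smaller variance:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, N, trials = 5.0, 10, 200_000

x = theta * rng.random((trials, N))           # x_i ~ Unif[0, theta]

theta1 = 2 * x.mean(axis=1)                   # unbiased, not a function of the max
theta2 = (N + 1) / N * x.max(axis=1)          # E[theta1 | max_i x_i] -> the MVUE

print(theta1.mean(), theta2.mean())           # both near theta
print(theta1.var(), theta2.var())             # theta2 has much smaller variance
```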