
15. Minimum Variance Unbiased Estimation
ECE 830, Spring 2014


    Bias-Variance Trade-Off 

Recall that
$$\mathrm{MSE}(\hat\theta) = \mathrm{Bias}^2(\hat\theta) + \mathrm{Var}(\hat\theta).$$

In general, the minimum-MSE estimator has non-zero bias and non-zero variance. We can reduce the bias only at a potential increase in variance. Conversely, modifying the estimator to reduce the variance may lead to an increase in bias.
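As a quick numerical illustration of this decomposition (a minimal sketch of ours, not from the original notes; the model and parameter values are arbitrary choices), the Monte Carlo check below estimates the bias, variance, and MSE of the biased maximum-likelihood variance estimator $\hat\sigma^2 = \frac{1}{N}\sum_n (x_n - \bar{x})^2$ and confirms that $\mathrm{MSE} \approx \mathrm{Bias}^2 + \mathrm{Var}$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma2, trials = 10, 4.0, 200_000

# Biased ML variance estimator: divides by N rather than N - 1 (ddof=0).
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
est = x.var(axis=1)

bias = est.mean() - sigma2          # theory: -sigma2/N
var = est.var()
mse = np.mean((est - sigma2) ** 2)

print(f"bias^2 + var = {bias**2 + var:.4f}")
print(f"mse          = {mse:.4f}")  # agrees up to Monte Carlo error
```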


    Example:

Let
$$x_n = A + w_n, \qquad w_n \sim \mathcal{N}(0, \sigma^2),$$
and consider the estimator
$$\hat{A} = \frac{\alpha}{N} \sum_{n=1}^{N} x_n,$$
where $\alpha$ is an arbitrary constant. If
$$S_N \equiv \frac{1}{N} \sum_{n=1}^{N} x_n,$$
then
$$\hat{A} = \alpha S_N \sim \mathcal{N}\!\left(\alpha A,\ \frac{\alpha^2 \sigma^2}{N}\right).$$


    Example: (cont.)

Let's find the value of $\alpha$ that minimizes the MSE.
$$\mathrm{Var}(\hat{A}) = \mathrm{Var}(\alpha S_N) = \alpha^2\,\mathrm{Var}(S_N) = \frac{\alpha^2 \sigma^2}{N}$$
$$\mathrm{Bias}(\hat{A}) = E[\hat{A}] - A = \alpha\,E[S_N] - A = \alpha A - A = (\alpha - 1)A$$
Thus the MSE is
$$\mathrm{MSE}(\hat{A}) = \frac{\alpha^2 \sigma^2}{N} + (\alpha - 1)^2 A^2.$$


Aside: alternatively, we could have computed the MSE directly as follows. First,
$$E[x_i x_j] = \begin{cases} A^2 + \sigma^2, & i = j \\ A^2, & i \neq j. \end{cases}$$
Then
$$\begin{aligned}
\mathrm{MSE}(\hat{A}) &= E\big[(\hat{A} - A)^2\big] = E[\hat{A}^2] - 2\,E[\hat{A}]\,A + A^2 \\
&= \alpha^2 \frac{1}{N^2} \sum_{i,j=1}^{N} E[x_i x_j] - 2\alpha\,\frac{1}{N} \sum_{n=1}^{N} E[x_n]\,A + A^2 \\
&= \alpha^2 \left(A^2 + \frac{\sigma^2}{N}\right) - 2\alpha A^2 + A^2 \\
&= \underbrace{\frac{\alpha^2 \sigma^2}{N}}_{\mathrm{Var}(\hat{A})} + \underbrace{(\alpha - 1)^2 A^2}_{\mathrm{Bias}^2(\hat{A})}.
\end{aligned}$$


So how practical is the MSE as a design criterion?

In the previous example, the MSE is minimized when
$$\frac{d\,\mathrm{MSE}(\hat{A})}{d\alpha} = \frac{2\alpha\sigma^2}{N} + 2(\alpha - 1)A^2 = 0 \quad\Longrightarrow\quad \alpha^* = \frac{A^2}{A^2 + \sigma^2/N}.$$

The optimal (in an MSE sense) value $\alpha^*$ depends on the unknown parameter $A$! Therefore, the estimator is not realizable. This phenomenon occurs for many classes of problems.

    We need an alternative to direct MSE minimization.
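To make the non-realizability point concrete, here is a small simulation (our own sketch, not part of the notes; $A$, $\sigma$, and $N$ are arbitrary choices) that sweeps $\alpha$ and checks that the empirical MSE of $\hat{A} = \alpha S_N$ is minimized near $\alpha^* = A^2/(A^2 + \sigma^2/N)$, a quantity we could only compute if we already knew $A$:

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma, N, trials = 2.0, 3.0, 10, 100_000

# S_N is the sample mean of x_n = A + w_n, w_n ~ N(0, sigma^2).
S = A + rng.normal(0.0, sigma, size=(trials, N)).mean(axis=1)

alphas = np.linspace(0.0, 1.5, 301)
mse = np.array([np.mean((a * S - A) ** 2) for a in alphas])

alpha_star = A**2 / (A**2 + sigma**2 / N)  # depends on the unknown A!
print(f"empirical argmin alpha = {alphas[mse.argmin()]:.3f}")
print(f"theoretical alpha*     = {alpha_star:.3f}")
```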


Note that in the above example, the problematic dependence on the parameter $A$ enters through the bias component of the MSE. This occurs in many situations. Thus a reasonable alternative is to constrain the estimator to be unbiased, and then find the estimator that produces the minimum variance (and hence the minimum MSE among all unbiased estimators).

Note: sometimes no unbiased estimator exists, and then we cannot proceed at all in this direction.

    Definition: Minimum Variance Unbiased Estimator

$\hat\theta$ is a minimum variance unbiased estimator (MVUE) for $\theta$ if
1. $E[\hat\theta] = \theta$ for all $\theta \in \Theta$, and
2. for every estimator $\hat\theta_0$ with $E[\hat\theta_0] = \theta$ for all $\theta \in \Theta$, we have $\mathrm{Var}(\hat\theta) \leq \mathrm{Var}(\hat\theta_0)$ for all $\theta \in \Theta$.


    Example:

Suppose we observe a single scalar realization $x$ of
$$x \sim \mathrm{Unif}(0, 1/\theta), \qquad \theta > 0.$$
An unbiased estimator of $\theta$ does not exist. To see this, note that
$$p(x \mid \theta) = \theta \cdot I_{[0, 1/\theta]}(x).$$
If $\hat\theta$ is unbiased, then for all $\theta > 0$,
$$\theta = E[\hat\theta] = \int_0^{1/\theta} \hat\theta(x)\,\theta\,dx$$
$$\Longrightarrow \int_0^{1/\theta} \hat\theta(x)\,dx = 1 \quad \forall\,\theta > 0$$
$$\Longrightarrow -\frac{1}{\theta^2}\,\hat\theta(1/\theta) = 0 \quad \forall\,\theta > 0 \quad \text{(differentiating both sides in } \theta\text{)}.$$
But if this is true for all $\theta$, then we must have $\hat\theta(x) = 0$ for all $x > 0$, which is not an unbiased estimator.


Finding the MVUE

There is no simple, general procedure for finding the MVUE. In the next several lectures we will discuss several approaches:

1. Find a sufficient statistic and apply the Rao-Blackwell theorem.
2. Determine the so-called Cramér-Rao Lower Bound (CRLB) and verify that the estimator achieves it.
3. Further restrict the estimator to a class of estimators (e.g., linear or polynomial functions of the data).


Recipe for finding an MVUE

(1) Find a complete sufficient statistic $t = T(X)$.
(2) Find any unbiased estimator $\hat\theta_0$ and set
$$\hat\theta(X) := E[\hat\theta_0(X) \mid t = T(X)],$$
or find a function $g$ such that
$$\hat\theta(X) = g(T(X))$$
is unbiased.

These notes answer the following questions:
1. What is a sufficient statistic?
2. What is a complete sufficient statistic?
3. What does step (2) above do?
4. Is this estimator unique?
5. How do we know it's the MVUE?


Definition: Sufficient statistic
Let $X$ be an $N$-dimensional random vector and let $\theta$ denote a $p$-dimensional parameter of the distribution of $X$. The statistic $t := T(X)$ is a sufficient statistic for $\theta$ if and only if the conditional distribution of $X$ given $T(X)$ is independent of $\theta$.

See Lecture 4 for more information on sufficient statistics and how to find them.


    Minimal and Complete Sufficient Statistics

Definition: Minimal sufficient statistic
A sufficient statistic $t$ is said to be minimal if the dimension of $t$ cannot be reduced while remaining sufficient.

Definition: Complete sufficient statistic
A sufficient statistic $t := T(X)$ is complete if for all real-valued functions $\phi$ satisfying
$$E[\phi(t) \mid \theta] = 0 \quad \forall\,\theta,$$
we have
$$P[\phi(t) = 0 \mid \theta] = 1 \quad \forall\,\theta.$$

Under very general conditions, if $t$ is a complete sufficient statistic, then $t$ is minimal.


    Example: Bernoulli trials

Consider $N$ independent Bernoulli trials
$$x_n \overset{\text{iid}}{\sim} \mathrm{Bernoulli}(\theta), \qquad \theta \in [0, 1].$$
Recall that $k = \sum_{n=1}^{N} x_n$ is sufficient for $\theta$. Now suppose $E[\phi(k) \mid \theta] = 0$ for all $\theta$. But
$$E[\phi(k) \mid \theta] = \sum_{k=0}^{N} \phi(k) \binom{N}{k} \theta^k (1 - \theta)^{N-k} = \mathrm{poly}(\theta),$$
where $\mathrm{poly}(\theta)$ is an $N$th-degree polynomial. Then
$$\mathrm{poly}(\theta) = 0 \;\;\forall\,\theta \in [0,1] \;\Longrightarrow\; \mathrm{poly}(\theta) \text{ is the zero polynomial} \;\Longrightarrow\; \phi(k) = 0 \text{ for } k = 0, \dots, N \;\Longrightarrow\; k \text{ is complete.}$$
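Completeness here is ultimately a linear-algebra fact: $E[\phi(k) \mid \theta]$ is a polynomial of degree at most $N$, so its values at any $N+1$ distinct $\theta$'s determine $\phi(0), \dots, \phi(N)$ through an invertible matrix of binomial pmf values. The sketch below (our illustration, assuming NumPy and SciPy are available) checks this numerically for a small $N$:

```python
import numpy as np
from scipy.stats import binom

N = 5
thetas = np.linspace(0.1, 0.9, N + 1)  # any N+1 distinct points in (0, 1)
ks = np.arange(N + 1)

# P[i, k] = P(k successes out of N | theta_i).
P = np.array([binom.pmf(ks, N, th) for th in thetas])

# Full rank: the only phi with E[phi(k)|theta] = 0 at all these thetas
# (hence, by the polynomial argument, at every theta) is phi = 0.
print(np.linalg.matrix_rank(P))  # prints 6 = N + 1
```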


Rao-Blackwell Theorem

Let $Y, Z$ be random variables and define the function
$$g(z) := E[Y \mid Z = z].$$
Then
$$E[g(Z)] = E[Y]$$
and
$$\mathrm{Var}(g(Z)) \leq \mathrm{Var}(Y),$$
with equality iff $Y = g(Z)$ almost surely.

Note that this version of Rao-Blackwell is quite general and has nothing to do with estimation of parameters. However, we can apply it to parameter estimation as follows.


Consider $X \sim p(x \mid \theta)$. Let $\hat\theta_1$ be an unbiased estimator of $\theta$ and let $t = T(x)$ be a sufficient statistic for $\theta$. Apply Rao-Blackwell with
$$Y := \hat\theta_1(x), \qquad Z := t = T(x).$$
Consider the new estimator
$$\hat\theta_2(x) = g(T(x)) = E[\hat\theta_1(X) \mid T(X) = t].$$
Then we may conclude:
1. $\hat\theta_2$ is unbiased
2. $\mathrm{Var}(\hat\theta_2) \leq \mathrm{Var}(\hat\theta_1)$

In words, if $\hat\theta_1$ is any unbiased estimator, then smoothing $\hat\theta_1$ with respect to a sufficient statistic decreases the variance while preserving unbiasedness. (Sufficiency is what makes $\hat\theta_2$ a valid estimator: it guarantees that $E[\hat\theta_1(X) \mid T(X) = t]$ does not depend on the unknown $\theta$.)

    Therefore, we can restrict our search for the MVUE to functions of a sufficient statistic.
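A quick sanity check of this smoothing effect (our own sketch, not from the notes): in the Bernoulli example above, $\hat\theta_1 = x_1$ is unbiased, and by symmetry smoothing it with respect to $k = \sum_n x_n$ gives $E[x_1 \mid k] = k/N$, the sample proportion. The simulation confirms that smoothing preserves the mean and shrinks the variance by a factor of $N$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, N, trials = 0.3, 20, 200_000

x = rng.binomial(1, theta, size=(trials, N))
theta1 = x[:, 0]         # crude unbiased estimator: first trial only
theta2 = x.mean(axis=1)  # smoothed estimator E[x_1 | k] = k / N

print(f"means:     {theta1.mean():.4f}  {theta2.mean():.4f}")  # both ~ theta
print(f"variances: {theta1.var():.4f}  {theta2.var():.4f}")
# theory: theta(1-theta) = 0.21 vs. theta(1-theta)/N = 0.0105
```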


The Rao-Blackwell Theorem

Rao-Blackwell Theorem, special case
Let $X$ be a random variable with pdf $p(X \mid \theta)$ and let $t(X)$ be a sufficient statistic. Let $\hat\theta_1(X)$ be an estimator of $\theta$ and define
$$\hat\theta_2(t) := E[\hat\theta_1(X) \mid t(X)].$$
Then
$$E[\hat\theta_2(t)] = E[\hat\theta_1(X)]$$
and
$$\mathrm{Var}(\hat\theta_2(t)) \leq \mathrm{Var}(\hat\theta_1(X)),$$
with equality iff $\hat\theta_1(X) \equiv \hat\theta_2(t(X))$ with probability one (almost surely).


Rao-Blackwell Theorem in Action

Suppose we observe 2 independent realizations from a $\mathcal{N}(\mu, \sigma^2)$ distribution. Denote these observations $x_1$ and $x_2$, with $X = [x_1, x_2]^T$. Consider the simple estimator of $\mu$:
$$\hat\mu = x_1.$$
Then
$$E[\hat\mu] = \mu, \qquad \mathrm{Var}(\hat\mu) = \sigma^2.$$
Since $\hat\mu$ is unbiased, the MSE is therefore
$$\mathrm{MSE}(\hat\mu) = \sigma^2.$$
Intuitively, we expect that the sample mean should be a better estimator, since
$$\bar\mu = \frac{1}{2}(x_1 + x_2)$$
averages the two observations together.


    Is this the best possible estimator?

Let's find a sufficient statistic for $\mu$:
$$\begin{aligned}
p(x_1, x_2) &= \frac{1}{2\pi\sigma^2}\, e^{-(x_1 - \mu)^2/2\sigma^2}\, e^{-(x_2 - \mu)^2/2\sigma^2} \\
&= \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x_1^2 + x_2^2 - 2\mu(x_1 + x_2) + 2\mu^2}{2\sigma^2}\right) \\
&= \underbrace{\exp\!\left(\frac{\mu(x_1 + x_2) - \mu^2}{\sigma^2}\right)}_{b_\mu(t)} \cdot \underbrace{\frac{1}{2\pi\sigma^2}\, e^{-(x_1^2 + x_2^2)/2\sigma^2}}_{a(X)},
\end{aligned}$$
so by the Fisher-Neyman factorization, $t = x_1 + x_2$ is a sufficient statistic for $\mu$.


The Rao-Blackwell Theorem states that
$$\mu^* = E[\hat\mu \mid t]$$
is as good as or better than $\hat\mu$ in terms of estimator variance. (See Scharf, p. 94.) What is $\mu^*$? First we need to compute the mean of the conditional density $p(\hat\mu \mid t)$, i.e., of
$$p(x_1 \mid t) = \frac{p(x_1, t)}{p(t)}.$$
Here the joint density of $x_1$ and $t = x_1 + x_2$ is
$$p(x_1, t) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{(x_1 - \mu)^2 + (t - x_1 - \mu)^2}{2\sigma^2}\right),$$
and since $t \sim \mathcal{N}(2\mu, 2\sigma^2)$,
$$p(t) = \frac{1}{\sqrt{4\pi\sigma^2}} \exp\!\left(-\frac{(t - 2\mu)^2}{4\sigma^2}\right), \qquad E[t] = 2\mu, \qquad \mathrm{Var}(t) = 2\sigma^2.$$


$$\begin{aligned}
p(x_1 \mid t) &= \frac{1/(2\pi\sigma^2)}{1/\sqrt{4\pi\sigma^2}} \exp\!\left(-\frac{1}{2\sigma^2}\Big[(x_1 - \mu)^2 + (t - x_1 - \mu)^2 - (t - 2\mu)^2/2\Big]\right) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\!\left(-\frac{1}{2\sigma^2}\Big[x_1^2 - 2\mu x_1 + \mu^2 + t^2 - 2x_1 t + x_1^2 - 2\mu t + 2\mu x_1 + \mu^2 - t^2/2 + 2\mu t - 2\mu^2\Big]\right) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\!\left(-\frac{1}{2\sigma^2}\Big[2x_1^2 - 2x_1 t + t^2/2\Big]\right) \\
&= \frac{1}{\sqrt{\pi\sigma^2}} \exp\!\left(-\frac{(x_1 - t/2)^2}{\sigma^2}\right)
\end{aligned}$$
$$\Longrightarrow\; x_1 \mid t \;\sim\; \mathcal{N}\!\left(\frac{t}{2},\, \frac{\sigma^2}{2}\right).$$
Therefore
$$\mu^* = E[\hat\mu \mid t] = \frac{t}{2} = \frac{x_1 + x_2}{2}, \qquad \mathrm{Var}(\mu^*) = \frac{\sigma^2}{2}$$
$$\Longrightarrow\; \mathrm{MSE}(\mu^*) = \frac{\sigma^2}{2},$$
half the MSE of $\hat\mu = x_1$: the Rao-Blackwellized estimator is exactly the sample mean.
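These numbers are easy to verify by simulation (a sketch of ours, with arbitrary parameter values): draw many pairs $(x_1, x_2)$, compare $\hat\mu = x_1$ with $\mu^* = (x_1 + x_2)/2$, and check that both are unbiased while the MSE is halved:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, trials = 1.5, 2.0, 200_000

x = rng.normal(mu, sigma, size=(trials, 2))
mu_hat = x[:, 0]          # simple estimator: x_1 alone
mu_star = x.mean(axis=1)  # Rao-Blackwellized: E[x_1 | t] = t / 2

print(f"MSE(mu_hat)  = {np.mean((mu_hat - mu) ** 2):.3f}")   # ~ sigma^2 = 4
print(f"MSE(mu_star) = {np.mean((mu_star - mu) ** 2):.3f}")  # ~ sigma^2/2 = 2
```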


The Lehmann-Scheffé Theorem

The Rao-Blackwell theorem tells us how to decrease the variance of an unbiased estimator. But when can we know that we have an MVUE?
Answer: when $t$ is a complete sufficient statistic.

Lehmann-Scheffé Theorem
If $t$ is complete, there is at most one unbiased estimator that is a function of $t$.


Proof: Suppose
$$E[\hat\theta_1] = E[\hat\theta_2] = \theta, \qquad \hat\theta_1(X) := g_1(T(X)), \qquad \hat\theta_2(X) := g_2(T(X)).$$
Define
$$\phi(t) := g_1(t) - g_2(t).$$
Then
$$E[\phi(t)] = E[\hat\theta_1] - E[\hat\theta_2] = \theta - \theta = 0 \quad \forall\,\theta.$$
By the definition of completeness, we have $\phi(t) = 0$ with probability 1. In other words, $\hat\theta_1 = \hat\theta_2$ with probability 1. $\blacksquare$


Recipe for finding an MVUE

This result suggests the following method for finding an MVUE:
(1) Find a complete sufficient statistic $t = T(X)$.
(2) Find any unbiased estimator $\hat\theta_0$ and set
$$\hat\theta(X) := E[\hat\theta_0(X) \mid t = T(X)],$$
or find a function $g$ such that
$$\hat\theta(X) = g(T(X))$$
is unbiased.


Rao-Blackwell and Complete Sufficient Statistics

Theorem
If $\hat\theta$ is constructed by the recipe above, then $\hat\theta$ is the unique MVUE.

Proof: Note that in either construction, $\hat\theta$ is a function of $t$. Let $\hat\theta_1$ be any unbiased estimator. We must show that
$$\mathrm{Var}(\hat\theta) \leq \mathrm{Var}(\hat\theta_1).$$
Define
$$\hat\theta_2(X) := E[\hat\theta_1(X) \mid t = T(X)].$$
By Rao-Blackwell, it suffices to show
$$\mathrm{Var}(\hat\theta) \leq \mathrm{Var}(\hat\theta_2).$$


Proof (cont.)

But $\hat\theta$ and $\hat\theta_2$ are both unbiased and both functions of a complete sufficient statistic, so by the Lehmann-Scheffé theorem $\hat\theta = \hat\theta_2$ with probability 1, and hence $\mathrm{Var}(\hat\theta) = \mathrm{Var}(\hat\theta_2) \leq \mathrm{Var}(\hat\theta_1)$.

To show uniqueness, suppose in the above argument that $\mathrm{Var}(\hat\theta_1) = \mathrm{Var}(\hat\theta)$. Then the Rao-Blackwell bound holds with equality, so $\hat\theta_1 = \hat\theta_2 = \hat\theta$ almost surely. $\blacksquare$


Example: Uniform distribution

Suppose $X = [x_1 \cdots x_N]^T$ where
$$x_i \overset{\text{iid}}{\sim} \mathrm{Unif}[0, \theta], \qquad i = 1, \dots, N.$$
What is an unbiased estimator of $\theta$? Since $E[x_i] = \theta/2$,
$$\hat\theta_1 = \frac{2}{N} \sum_{i=1}^{N} x_i$$
is unbiased. However, it is not the MVUE.


    Example: (cont.)

From the Fisher-Neyman factorization theorem,
$$p(X \mid \theta) = \prod_{i=1}^{N} \frac{1}{\theta}\, I_{[0,\theta]}(x_i) = \underbrace{\frac{1}{\theta^N}\, I_{[\max_i x_i,\,\infty)}(\theta)}_{b_\theta(t)} \cdot \underbrace{I_{(-\infty,\,\min_i x_i]}(0)}_{a(X)},$$
we see that
$$T = \max_i x_i$$
is a sufficient statistic. It is left as an exercise to show that $T$ is in fact complete. Since $\hat\theta_1$ is not a function of $T$, it is not the MVUE. However,
$$\hat\theta_2(X) = E[\hat\theta_1(X) \mid t = T(X)]$$
is the MVUE.
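Here the conditional expectation can be written in closed form: given $T = \max_i x_i = t$, the remaining $N - 1$ observations are iid $\mathrm{Unif}[0, t]$, so $E[\hat\theta_1 \mid T = t] = \frac{2}{N}\left(t + (N - 1)\frac{t}{2}\right) = \frac{N+1}{N}\, t$. The sketch below (ours, not from the notes) compares the two estimators by simulation:

```python
import numpy as np

rng = np.random.default_rng(4)
theta, N, trials = 5.0, 10, 200_000

x = rng.uniform(0.0, theta, size=(trials, N))
theta1 = 2.0 * x.mean(axis=1)         # unbiased, but not the MVUE
theta2 = (N + 1) / N * x.max(axis=1)  # E[theta1 | max] = (N+1)/N * max

print(f"means:     {theta1.mean():.4f}  {theta2.mean():.4f}")  # both ~ theta
print(f"variances: {theta1.var():.4f}  {theta2.var():.4f}")
# theory: theta^2/(3N) ~ 0.833 vs. theta^2/(N(N+2)) ~ 0.208
```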
