ESE 524 Detection and Estimation Theory
Joseph A. O’Sullivan
Samuel C. Sachs Professor
Electronic Systems and Signals Research Laboratory
Electrical and Systems Engineering
Washington University
211 Urbauer Hall
314-935-4173 (Lynda Markham answers)
[email protected]
J. A. O'S. ESE 524, Lecture 7, 02/03/09
Announcements
- Problem Set 1 is due today.
- Problem Set 2 is posted, due 02/13/09.
- No class 02/10, 02/12, and 02/19.
- Make-up classes on Fridays: 02/20, 02/27.
- Class website: http://classes.engineering.wustl.edu/ese524/
- Questions?
Last Class: Information Rate Functions and Performance Bounds
- Motivation
- Chernoff Bound
- Binary Hypothesis Testing
- Tilted Distributions
- Relative Entropy
- Information Rate Functions
- Additivity of Information
- Examples:
  - Gaussian, same covariance, different means
  - Poisson, different means
  - Gaussian, same mean, different variances
Last Class: Information Rate Functions: Motivation
- The Receiver Operating Characteristic is often not easily computable.
- Performance is often a function of a few parameters: SNR, N, other statistics.
- Bounds on performance may be more easily computed.
- The bounds below guarantee a level of performance for optimal tests.
- Bounds are exponential in the rate function.
- The rate function is additive in information (proportional to N).
Last Class: Chernoff Bound
- Let $x$ be a random variable with probability density function $p_x(X)$. Define the moment generating function and the log-moment generating function for real-valued $s$.
- The Chernoff bound is a bound on the tail probability.
- The probability of a rare event is closely approximated by the Chernoff bound.
- The (information) rate function equals the Legendre-Fenchel transform of the log-moment generating function.
- The rate function can be used to bound the probability of any open set in $N$ dimensions.

$$\Phi(s) = E\left[e^{sx}\right] = \int_{-\infty}^{\infty} p_x(X)\, e^{sX}\, dX$$
$$\Phi(s) \ge \int_{A}^{\infty} p_x(X)\, e^{sX}\, dX \ge e^{sA} \int_{A}^{\infty} p_x(X)\, dX, \quad \text{for all } s \ge 0$$
$$\ln P(x \ge A) \le -sA + \ln \Phi(s), \quad \text{for all } s \ge 0$$
$$I(A) = \sup_{s \ge 0} \left[ sA - \ln \Phi(s) \right]$$
$$\ln P(x \ge A) \le -I(A)$$
$$\phi(s) = \ln \Phi(s)$$

For random vectors:
$$\phi(\mathbf{s}) = \ln E\left[e^{\mathbf{s}^T \mathbf{x}}\right], \quad \text{for } \mathbf{s} \in \mathbb{R}^N$$
$$\ln P(\mathbf{x} \in \mathcal{A}) \le -\inf_{\mathbf{X} \in \mathcal{A}} I(\mathbf{X})$$
$$I(\mathbf{X}) = \sup_{\mathbf{s}} \left[ \mathbf{s}^T \mathbf{X} - \ln \Phi(\mathbf{s}) \right]$$
Chernoff Bound Summary
For random variables:
$$\Phi(s) = E\left[e^{sx}\right] = \int_{-\infty}^{\infty} p_x(X)\, e^{sX}\, dX$$
$$\phi(s) = \ln \Phi(s)$$
$$I(A) = \sup_{s \ge 0} \left[ sA - \phi(s) \right]$$
$$\ln P(x \ge A) \le -I(A)$$

For random vectors:
$$\phi(\mathbf{s}) = \ln E\left[e^{\mathbf{s}^T \mathbf{x}}\right], \quad \text{for } \mathbf{s} \in \mathbb{R}^N$$
$$\ln P(\mathbf{x} \in \mathcal{A}) \le -\inf_{\mathbf{X} \in \mathcal{A}} I(\mathbf{X})$$
$$I(\mathbf{X}) = \sup_{\mathbf{s}} \left[ \mathbf{s}^T \mathbf{X} - \ln \Phi(\mathbf{s}) \right]$$
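The scalar Chernoff bound can be checked numerically on a case where the tail probability is known exactly. The sketch below assumes a unit-rate exponential random variable (an illustrative choice, not taken from the slides), for which $\ln \Phi(s) = -\ln(1-s)$ and $P(x \ge A) = e^{-A}$. The function names and the grid search over $s$ are assumptions made for this example.

```python
import math

def log_mgf_exp(s, rate=1.0):
    """ln Phi(s) for an exponential random variable with the given rate (needs s < rate)."""
    return math.log(rate / (rate - s))

def rate_function(A, rate=1.0, grid=20000):
    """Approximate I(A) = sup_{0 <= s < rate} [s*A - ln Phi(s)] by a grid search over s."""
    best = 0.0  # s = 0 always contributes the value 0
    for k in range(grid):
        s = rate * k / grid  # stays strictly below rate
        best = max(best, s * A - log_mgf_exp(s, rate))
    return best

def chernoff_vs_exact(A, rate=1.0):
    """Return (Chernoff bound on P(x >= A), exact tail probability)."""
    return math.exp(-rate_function(A, rate)), math.exp(-rate * A)

bound, exact = chernoff_vs_exact(5.0)
print(f"Chernoff bound: {bound:.4f}, exact tail: {exact:.4f}")
```

For this distribution the supremum has the closed form $I(A) = A - 1 - \ln A$ for $A \ge 1$, which the grid search should reproduce closely.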
Binary Hypothesis Testing
- Probabilities of miss and false alarm are tail probabilities.
- Upper bound them using Chernoff bound information rate functions given hypotheses 1 and 0.
- Performance is better than the computed point: the optimal $(P_D, P_F)$ is "up and to the left" of the point $\left(e^{-I_0(\gamma)},\, 1 - e^{-I_1(\gamma)}\right)$.

$$l(\mathbf{r}) = \ln \frac{p(\mathbf{r} \mid H_1)}{p(\mathbf{r} \mid H_0)}$$
$$\Phi_0(s) = E\left[e^{s\, l(\mathbf{r})} \mid H_0\right] = \int_{-\infty}^{\infty} p_l(L \mid H_0)\, e^{sL}\, dL$$
$$P_F = P(l(\mathbf{r}) \ge \gamma \mid H_0)$$
$$\ln P(l(\mathbf{r}) \ge \gamma \mid H_0) \le -I_0(\gamma)$$
$$I_0(\gamma) = \sup_{s \ge 0} \left[ s\gamma - \ln \Phi_0(s) \right]$$
$$\phi_0(s) = \ln \Phi_0(s)$$
$$P_M = P(l(\mathbf{r}) \le \gamma \mid H_1)$$
$$\ln P_M \le -I_1(\gamma)$$
$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \mid H_1\right] = \int_{-\infty}^{\infty} p_l(L \mid H_1)\, e^{\sigma L}\, dL$$
$$I_1(\gamma) = \sup_{\sigma \le 0} \left[ \sigma\gamma - \ln \Phi_1(\sigma) \right]$$
$$\phi_1(\sigma) = \ln \Phi_1(\sigma)$$
Aside
- Tail probabilities can be on either side.
- For the bound, the variable in the (log-) moment generating function is a dummy variable.
- The variables in the log-moment generating functions under the two hypotheses are different.

$$P_M = P(l(\mathbf{r}) \le \gamma \mid H_1), \qquad \ln P_M \le -I_1(\gamma)$$
$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \mid H_1\right] = \int_{-\infty}^{\infty} p_l(L \mid H_1)\, e^{\sigma L}\, dL \ge \int_{-\infty}^{\gamma} p_l(L \mid H_1)\, e^{\sigma L}\, dL \ge e^{\sigma\gamma} \int_{-\infty}^{\gamma} p_l(L \mid H_1)\, dL, \quad \text{for all } \sigma \le 0$$
$$I_1(\gamma) = \sup_{\sigma \le 0} \left[ \sigma\gamma - \ln \Phi_1(\sigma) \right]$$
Tilted Distributions
- The moment generating functions are for the log-likelihood ratio given the hypotheses.
- If the supremum defining the rate function is achieved at an interior point, then the derivative is zero.
- Tilt the original distribution until the mean of the log-likelihood function equals the threshold.

$$l(\mathbf{r}) = \ln \frac{p(\mathbf{r} \mid H_1)}{p(\mathbf{r} \mid H_0)}$$
$$\Phi_0(s) = E\left[e^{s\, l(\mathbf{r})} \mid H_0\right] = \int_{-\infty}^{\infty} p(\mathbf{R} \mid H_0)\, e^{s \ln \frac{p(\mathbf{R} \mid H_1)}{p(\mathbf{R} \mid H_0)}}\, d\mathbf{R} = \int_{-\infty}^{\infty} \left[p(\mathbf{R} \mid H_0)\right]^{1-s} \left[p(\mathbf{R} \mid H_1)\right]^{s}\, d\mathbf{R}$$
$$I_0(\gamma) = \sup_{s \ge 0} \left[ s\gamma - \ln \Phi_0(s) \right]$$
$$\gamma = \frac{d}{ds} \ln \Phi_0(s^*) = \frac{\frac{d}{ds}\Phi_0(s^*)}{\Phi_0(s^*)} = E_{s^*}\left[l(\mathbf{r})\right]$$
$$p_s(\mathbf{R}) = \frac{p(\mathbf{R} \mid H_0)\, e^{s \ln \frac{p(\mathbf{R} \mid H_1)}{p(\mathbf{R} \mid H_0)}}}{\Phi_0(s)}$$
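The tilting step can be illustrated on a small discrete alphabet, where the integrals become sums. The sketch below is a minimal example under assumed inputs: the two pmfs `p0` and `p1`, the threshold `gamma = 0`, and the bisection solver are illustrative choices, not part of the lecture. It finds the tilt parameter $s^*$ for which the tilted mean of the log-likelihood ratio equals the threshold.

```python
import math

def tilt(p0, p1, s):
    """Return (tilted pmf p_s, Phi_0(s)), where p_s is proportional to p0^(1-s) * p1^s."""
    w = [a ** (1 - s) * b ** s for a, b in zip(p0, p1)]
    phi = sum(w)
    return [x / phi for x in w], phi

def tilted_mean_llr(p0, p1, s):
    """E_s[l]: mean of the log-likelihood ratio ln(p1/p0) under the tilted pmf."""
    ps, _ = tilt(p0, p1, s)
    return sum(q * math.log(b / a) for q, a, b in zip(ps, p0, p1))

def solve_tilt(p0, p1, gamma, lo=0.0, hi=1.0, iters=60):
    """Bisection for s* with E_{s*}[l] = gamma; E_s[l] is nondecreasing in s."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tilted_mean_llr(p0, p1, mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p0 = [0.5, 0.3, 0.2]  # hypothetical pmf under H0
p1 = [0.2, 0.3, 0.5]  # hypothetical pmf under H1
s_star = solve_tilt(p0, p1, gamma=0.0)
print(s_star, tilted_mean_llr(p0, p1, s_star))
```

Bisection works here because $E_s[l]$ is the derivative of the convex function $\ln \Phi_0(s)$; the threshold must lie between $E_0[l]$ and $E_1[l]$ for a solution in $[0, 1]$ to exist.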
Relationship to Relative Entropy
- Relative entropy is a quantitative measure of information, given in bits or nats.
- The information rate function equals the relative entropy between the tilted pdf and the pdf under the hypothesis.
- Duality of the exponential family and its mean: the mean determines the parameter; the parameter determines the mean.

$$D(p \| q) = \int p \log \frac{p}{q}$$
$$\ln x \ge 1 - \frac{1}{x} \;\Rightarrow\; \int p \log \frac{p}{q} \ge \int p \left(1 - \frac{q}{p}\right) = 0$$
$$D(p \| q) \ge 0$$
$$D(p_s \| p_0) = \int p_s(\mathbf{R}) \ln \frac{p_s(\mathbf{R})}{p(\mathbf{R} \mid H_0)}\, d\mathbf{R}$$
$$p_s(\mathbf{R}) = \frac{p(\mathbf{R} \mid H_0)\, e^{s \ln \frac{p(\mathbf{R} \mid H_1)}{p(\mathbf{R} \mid H_0)}}}{\Phi_0(s)}$$
$$D(p_s \| p_0) = E_s\left[s\, l(\mathbf{r})\right] - \ln \Phi_0(s) = s\gamma_s - \ln \Phi_0(s) = I_0(\gamma_s)$$
$$E_{s^*}\left[l(\mathbf{r})\right] = \gamma$$
$$s \to \gamma_s, \qquad \gamma \to s^*$$
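The identity between the rate function and the relative entropy of the tilted distribution can be verified numerically on a discrete alphabet. This is a minimal sketch under assumed inputs: the pmfs `p0`, `p1` and the tilt value `s = 0.3` are hypothetical, and sums replace the integrals on the slide.

```python
import math

def tilted_pmf(p0, p1, s):
    """Return (p_s, Phi_0(s)) with p_s proportional to p0^(1-s) * p1^s."""
    w = [a ** (1 - s) * b ** s for a, b in zip(p0, p1)]
    phi0 = sum(w)
    return [x / phi0 for x in w], phi0

def kl(p, q):
    """Relative entropy D(p || q) in nats."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def rate_from_tilt(p0, p1, s):
    """Return (gamma_s, s*gamma_s - ln Phi_0(s)) for the tilt parameter s."""
    ps, phi0 = tilted_pmf(p0, p1, s)
    gamma_s = sum(q * math.log(b / a) for q, a, b in zip(ps, p0, p1))
    return gamma_s, s * gamma_s - math.log(phi0)

p0 = [0.6, 0.3, 0.1]  # hypothetical pmf under H0
p1 = [0.1, 0.3, 0.6]  # hypothetical pmf under H1
s = 0.3
ps, _ = tilted_pmf(p0, p1, s)
gamma_s, i0 = rate_from_tilt(p0, p1, s)
print(kl(ps, p0), i0)  # the two values should agree
```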
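Relationship to Relative Entropy Part 2
- The relative entropy between the tilted density and the density under hypothesis 1 is simply related to that under hypothesis 0.

$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \mid H_1\right]$$
$$I_1(\gamma) = \sup_{\sigma \le 0} \left[ \sigma\gamma - \ln \Phi_1(\sigma) \right]$$
$$\gamma = \frac{d}{d\sigma} \ln \Phi_1(\sigma^*) = \frac{\frac{d}{d\sigma}\Phi_1(\sigma^*)}{\Phi_1(\sigma^*)}$$
$$\frac{p(\mathbf{R} \mid H_1)\, e^{\sigma \ln \frac{p(\mathbf{R} \mid H_1)}{p(\mathbf{R} \mid H_0)}}}{\Phi_1(\sigma)} = \frac{p(\mathbf{R} \mid H_0)\, e^{(\sigma+1) \ln \frac{p(\mathbf{R} \mid H_1)}{p(\mathbf{R} \mid H_0)}}}{\Phi_0(\sigma+1)}$$
$$p_s(\mathbf{R}) = \frac{p(\mathbf{R} \mid H_0)\, e^{s \ln \frac{p(\mathbf{R} \mid H_1)}{p(\mathbf{R} \mid H_0)}}}{\Phi_0(s)}$$
$$\gamma = \frac{\frac{d}{d\sigma}\Phi_0(\sigma^*+1)}{\Phi_0(\sigma^*+1)} = E_{\sigma^*+1}\left[l(\mathbf{r})\right]$$
$$D(p_s \| p_1) = \int p_s(\mathbf{R}) \ln \frac{p_s(\mathbf{R})}{p(\mathbf{R} \mid H_1)}\, d\mathbf{R} = E_s\left[(s-1)\, l(\mathbf{r})\right] - \ln \Phi_0(s)$$
$$D(p_s \| p_1) = -\gamma_s + s\gamma_s - \ln \Phi_0(s) = -\gamma_s + I_0(\gamma_s)$$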
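The shift relation between the two moment generating functions, and the resulting link between the two rate functions, can be checked on a discrete example. The sketch below assumes hypothetical pmfs `p0`, `p1` and a tilt value `s = 0.4` (so $\sigma = s - 1 = -0.6$); none of these values come from the slides.

```python
import math

def phi0(p0, p1, s):
    """Phi_0(s) = E[exp(s*l) | H0] for pmfs p0 and p1, with l = ln(p1/p0)."""
    return sum(a * (b / a) ** s for a, b in zip(p0, p1))

def phi1(p0, p1, sigma):
    """Phi_1(sigma) = E[exp(sigma*l) | H1]."""
    return sum(b * (b / a) ** sigma for a, b in zip(p0, p1))

def rates(p0, p1, s):
    """Return (gamma_s, I_0(gamma_s), I_1(gamma_s)), with I_1 evaluated at sigma = s - 1."""
    w = [a * (b / a) ** s for a, b in zip(p0, p1)]
    z = sum(w)  # equals Phi_0(s)
    gamma_s = sum(x / z * math.log(b / a) for x, a, b in zip(w, p0, p1))
    i0 = s * gamma_s - math.log(z)
    i1 = (s - 1) * gamma_s - math.log(phi1(p0, p1, s - 1))
    return gamma_s, i0, i1

p0 = [0.6, 0.3, 0.1]  # hypothetical pmf under H0
p1 = [0.1, 0.3, 0.6]  # hypothetical pmf under H1
g, i0, i1 = rates(p0, p1, 0.4)
print(phi1(p0, p1, -0.6), phi0(p0, p1, 0.4))  # Phi_1(sigma) = Phi_0(sigma + 1)
print(i1, i0 - g)                              # I_1 = I_0 - gamma
```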
Summary of Simple Relationships
- Find $I_0$ as a function of the threshold; subtract the threshold to get $I_1$.
- Plot bounds.
- Vary parameters to gain a better understanding.
- Information is additive: $N$ i.i.d. observations give $(N I_0, N I_1)$.

$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \mid H_1\right] = E\left[\exp\left((\sigma+1) \ln \frac{p(\mathbf{r} \mid H_1)}{p(\mathbf{r} \mid H_0)}\right) \,\Big|\, H_0\right]$$
$$\Phi_1(\sigma) = \Phi_0(\sigma+1)$$
$$I_0(\gamma) = I_1(\gamma) + \gamma$$
$$\Phi_1(0) = \Phi_0(0) = 1$$
$\Phi_0(s)$ is a convex function.

Points on the $(I_0, I_1)$ curve:
- $\gamma = -D(p_0 \| p_1)$: $(0, D(p_0 \| p_1))$ is a point.
- $\gamma = D(p_1 \| p_0)$: $(D(p_1 \| p_0), 0)$ is a point.
- $\gamma = 0$: $(D(p_s \| p_0), D(p_s \| p_1))$ is a point, where $\frac{d}{ds} \ln \Phi_0(s) = 0$.
Information Rate Functions and Performance Bounds
- Example: Exponential Distributions
- Additivity of Information
- Examples:
  - Gaussian, same covariance, different means
  - Poisson, different means
  - Gaussian, same mean, different variances
Summary of Simple Relationships
- Find $I_0$ as a function of the threshold; subtract the threshold to get $I_1$.
- Plot bounds.
- Vary parameters to gain a better understanding.

$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \mid H_1\right]$$
$$\Phi_1(\sigma) = \Phi_0(\sigma+1)$$
$$I_0(\gamma) = I_1(\gamma) + \gamma$$
$$\Phi_1(0) = \Phi_0(0) = 1$$
$\Phi_0(s)$ is a convex function.

Points on the $(I_0, I_1)$ curve, for $0 \le s \le 1$:
- $\gamma = -D(p_0 \| p_1)$: $(0, D(p_0 \| p_1))$ is a point.
- $\gamma = D(p_1 \| p_0)$: $(D(p_1 \| p_0), 0)$ is a point.
- $\gamma = 0$: $(D(p_s \| p_0), D(p_s \| p_1))$ is a point, where $\frac{d}{ds} \ln \Phi_0(s) = 0$.
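As a worked illustration of these relationships, consider the Gaussian case from the outline (same variance, different means). The scalar setup here, a single observation with means $0$ and $m$, noise variance $\sigma_n^2$, and $d = m/\sigma_n$, is an assumed model chosen for illustration rather than one taken from the slides.

```latex
\begin{align*}
H_0 &: r \sim \mathcal{N}(0, \sigma_n^2), \qquad
H_1 : r \sim \mathcal{N}(m, \sigma_n^2), \qquad d = m/\sigma_n \\
l(r) &= \frac{m r}{\sigma_n^2} - \frac{d^2}{2}
  \;\Rightarrow\; l \mid H_0 \sim \mathcal{N}\!\left(-\tfrac{d^2}{2},\, d^2\right) \\
\ln \Phi_0(s) &= -\frac{s d^2}{2} + \frac{s^2 d^2}{2} = \frac{s(s-1)\, d^2}{2} \\
\gamma_s &= \frac{d}{ds} \ln \Phi_0(s) = \left(s - \tfrac{1}{2}\right) d^2 \\
I_0(\gamma_s) &= s\gamma_s - \ln \Phi_0(s) = \frac{s^2 d^2}{2} \\
I_1(\gamma_s) &= I_0(\gamma_s) - \gamma_s = \frac{(1-s)^2 d^2}{2}
\end{align*}
```

At $s = 0$ this gives $(I_0, I_1) = (0, d^2/2)$ and at $s = 1$ it gives $(d^2/2, 0)$, consistent with the endpoint values since $D(p_0 \| p_1) = D(p_1 \| p_0) = d^2/2$ in this model.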
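Example: Exponential Distributions
$$H_0: p_{r_i}(R \mid H_0) = \lambda e^{-\lambda R}, \quad R \ge 0, \quad i = 1, 2, \ldots, N, \text{ i.i.d.}$$
$$H_1: p_{r_i}(R \mid H_1) = \alpha e^{-\alpha R}, \quad R \ge 0, \quad i = 1, 2, \ldots, N, \text{ i.i.d.}$$
$$\lambda > \alpha$$
$$l(\mathbf{R}) = (\lambda - \alpha) \sum_{i=1}^{N} R_i + N \ln \frac{\alpha}{\lambda}$$
$$\Phi_0(s) = E\left[e^{s\, l(\mathbf{r})} \mid H_0\right] = \int_{-\infty}^{\infty} \left[p(\mathbf{R} \mid H_0)\right]^{1-s} \left[p(\mathbf{R} \mid H_1)\right]^{s}\, d\mathbf{R}$$
$$\Phi_0(s) = \prod_{i=1}^{N} \int_{-\infty}^{\infty} \left[p(R_i \mid H_0)\right]^{1-s} \left[p(R_i \mid H_1)\right]^{s}\, dR_i = \prod_{i=1}^{N} \Phi_{0i}(s)$$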
Ex: Exponential Distributions
- Compute the moment generating functions and information rate functions.
- Plot parametric curves as functions of $s$: $\gamma_s$, $I_0(\gamma_s)$, and $I_1(\gamma_s) = -\gamma_s + I_0(\gamma_s)$.

$$\Phi_0(s) = \left[\Phi_{0i}(s)\right]^N$$
$$\Phi_{0i}(s) = \int_{0}^{\infty} \lambda^{1-s} \alpha^{s}\, e^{-((1-s)\lambda + s\alpha) R}\, dR = \frac{\lambda^{1-s} \alpha^{s}}{(1-s)\lambda + s\alpha}$$
$$I_0(\gamma) = \sup_{s \ge 0} \left[ s\gamma - \ln \Phi_0(s) \right]$$
$$\gamma = \frac{d}{ds} \ln \Phi_0(s^*) = \frac{\frac{d}{ds}\Phi_0(s^*)}{\Phi_0(s^*)}$$
$$\gamma_s = E_s\left[l(\mathbf{r})\right] = \frac{\lambda - \alpha}{(1-s)\lambda + s\alpha} + \ln \frac{\alpha}{\lambda}, \quad \text{one component}$$
$$I_0(\gamma_s) = s \left[ \frac{\lambda - \alpha}{(1-s)\lambda + s\alpha} + \ln \frac{\alpha}{\lambda} \right] - \ln \frac{\lambda^{1-s} \alpha^{s}}{(1-s)\lambda + s\alpha}$$
$$I_0(\gamma_s) = \frac{\lambda}{(1-s)\lambda + s\alpha} - 1 - \ln \frac{\lambda}{(1-s)\lambda + s\alpha}$$
$$I_0(\gamma_s) = \frac{1}{(1-s) + s\frac{\alpha}{\lambda}} - 1 + \ln\left((1-s) + s\frac{\alpha}{\lambda}\right)$$
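The parametric curves for the exponential example can be tabulated directly from the per-component formulas. The sketch below assumes the illustrative rates $\lambda = 2$ and $\alpha = 1$ (hypothetical values) and compares the definition $I_0 = s\gamma_s - \ln \Phi_{0i}(s)$ against the simplified closed form.

```python
import math

def exp_example_curves(lam, alpha, s):
    """Per-component (gamma_s, I_0(gamma_s), I_1(gamma_s)) for H0 rate lam, H1 rate alpha."""
    d = (1 - s) * lam + s * alpha  # rate of the tilted exponential density
    log_phi0 = (1 - s) * math.log(lam) + s * math.log(alpha) - math.log(d)
    gamma_s = (lam - alpha) / d + math.log(alpha / lam)
    i0 = s * gamma_s - log_phi0
    i1 = i0 - gamma_s
    return gamma_s, i0, i1

def i0_closed_form(lam, alpha, s):
    """Simplified form: lam/d - 1 - ln(lam/d), with d = (1-s)*lam + s*alpha."""
    d = (1 - s) * lam + s * alpha
    return lam / d - 1 - math.log(lam / d)

lam, alpha = 2.0, 1.0  # assumed example rates with lam > alpha
curve = [exp_example_curves(lam, alpha, k / 10) for k in range(11)]
for k, (g, i0, i1) in enumerate(curve):
    print(f"s={k / 10:.1f}  gamma_s={g:+.4f}  I0={i0:.4f}  I1={i1:.4f}")
```

The endpoints $s = 0$ and $s = 1$ give $I_0 = 0$ and $I_1 = 0$ respectively, matching the relative-entropy points on the $(I_0, I_1)$ curve.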
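Example continued
- We know exact performance.
- The thresholds are related.
- The log-probabilities can be compared to bounds.

With $l'(\mathbf{R}) = \sum_{i=1}^{N} R_i$:
$$p_{l'|H_i}(L \mid H_i) = \frac{L^{N-1} e^{-L/\mu_i}}{\mu_i^{N} (N-1)!}, \quad L \ge 0, \qquad \mu_0 = \frac{1}{\lambda}, \quad \mu_1 = \frac{1}{\alpha}$$
$$P_D = \int_{\gamma'}^{\infty} p_{l'|H_1}(L \mid H_1)\, dL, \qquad P_F = \int_{\gamma'}^{\infty} p_{l'|H_0}(L \mid H_0)\, dL$$
$$l(\mathbf{R}) = (\lambda - \alpha) \sum_{i=1}^{N} R_i + N \ln \frac{\alpha}{\lambda}$$
$$\sum_{i=1}^{N} R_i = \frac{1}{\lambda - \alpha} \left[ l(\mathbf{R}) - N \ln \frac{\alpha}{\lambda} \right]$$
$$\gamma' = \frac{1}{\lambda - \alpha} \left[ \gamma - N \ln \frac{\alpha}{\lambda} \right], \qquad \gamma'' = \lambda \gamma'$$
$$P_F = \sum_{k=0}^{N-1} \frac{(\lambda\gamma')^{k}}{k!}\, e^{-\lambda\gamma'} = \sum_{k=0}^{N-1} \frac{(\gamma'')^{k}}{k!}\, e^{-\gamma''}$$
$$P_D = \sum_{k=0}^{N-1} \frac{1}{k!} \left(\frac{\alpha\gamma''}{\lambda}\right)^{k} \exp\left(-\frac{\alpha\gamma''}{\lambda}\right)$$
$$\frac{1}{N} \ln P_F(\gamma'') = \frac{1}{N} \ln P_F\left( \frac{N\lambda}{\lambda - \alpha} \left[ \frac{\gamma}{N} - \ln \frac{\alpha}{\lambda} \right] \right)$$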
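The comparison between exact performance and the Chernoff bounds can be sketched numerically. The code below is a minimal illustration with assumed values ($N = 10$, $\lambda = 2$, $\alpha = 1$, $\gamma'' = 14$); it uses the fact, which follows from the formulas above, that the tilt parameter satisfies $\lambda / ((1-s)\lambda + s\alpha) = \gamma''/N$, so the per-sample rate is $I_0 = x - 1 - \ln x$ with $x = \gamma''/N$. Function names are hypothetical.

```python
import math

def poisson_cdf_sum(x, n):
    """sum_{k=0}^{n-1} x^k e^{-x} / k!, the tail probability of an Erlang-n variable at x."""
    term, total = math.exp(-x), 0.0
    for k in range(n):
        total += term
        term *= x / (k + 1)
    return total

def exact_pf_pd(gpp, n, lam, alpha):
    """Exact (P_F, P_D) at the normalized threshold gamma'' = lam * gamma'."""
    return poisson_cdf_sum(gpp, n), poisson_cdf_sum(alpha * gpp / lam, n)

def chernoff_bounds(gpp, n, lam, alpha):
    """Chernoff bounds (on P_F, on P_M), valid for 1 <= gamma''/N <= lam/alpha."""
    x = gpp / n  # equals lam / ((1-s)*lam + s*alpha)
    gamma_per = (lam - alpha) * x / lam + math.log(alpha / lam)  # threshold per sample
    i0 = x - 1 - math.log(x)
    i1 = i0 - gamma_per
    return math.exp(-n * i0), math.exp(-n * i1)

n, lam, alpha = 10, 2.0, 1.0  # assumed example parameters
gpp = 14.0                    # assumed normalized threshold gamma''
pf, pd = exact_pf_pd(gpp, n, lam, alpha)
pf_bound, pm_bound = chernoff_bounds(gpp, n, lam, alpha)
print(f"P_F={pf:.4f} <= {pf_bound:.4f},  P_M={1 - pd:.4f} <= {pm_bound:.4f}")
```

The range restriction on $\gamma''/N$ corresponds to $0 \le s \le 1$, where both suprema are attained at the same tilted distribution.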
Matlab Code
[Figure: ROC in blue, bound in red; axes P_F and P_D]
[Figure: I_0 in blue, I_1 in red; information rates versus threshold γ]

function [pf,pd]=gammainforate(N,lambda,alpha)
gamma=0:N/200:(N+5);
% ...
plot(exp(-N*info0),1-exp(-N*info1),'LineWidth',1.5,'Color',[1 0 0]);
xlabel('P_F','FontSize',16);
ylabel('P_D','FontSize',16);
title('ROC in blue, Bound in red','FontSize',16);
% ...
plot(gammas,info0,'LineWidth',1.5,'Color',[0 0 1]);
plot(gammas,info1,'LineWidth',1.5,'Color',[1 0 0]);
axis tight
xlabel('Threshold \gamma','FontSize',16);
ylabel('Information Rates','FontSize',16);
title('I_0 in blue, I_1 in red','FontSize',16);
Relationship of Bounds to Log-Probability
gammas1=gamma*((lambda-alpha)/(N*lambda))+log(alpha/lambda);
plot(gammas1,log(pf)/N,'LineWidth',1.5,'Color',[0 1 0])
Information is Additive
- Relative entropy of product distributions equals the sum of the relative entropies.
- Log-moment generating functions add.
- Information rate functions add.
- Information is additive.
- N i.i.d. observations give (N I0, N I1).
- Exponential error bounds.
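Additivity of relative entropy over product distributions can be confirmed by brute force on a small alphabet. The sketch below assumes two hypothetical pmfs `p` and `q` and `n = 4` i.i.d. draws; it builds the joint pmfs explicitly and compares the joint relative entropy with $n$ times the single-letter value.

```python
import math

def kl(p, q):
    """Relative entropy D(p || q) in nats."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def product_pmf(p, n):
    """Joint pmf of n i.i.d. draws from p, listed over all length-n outcome tuples."""
    joint = [1.0]
    for _ in range(n):
        joint = [j * a for j in joint for a in p]
    return joint

p = [0.7, 0.2, 0.1]  # hypothetical pmf
q = [0.3, 0.3, 0.4]  # hypothetical pmf
n = 4
d_joint = kl(product_pmf(p, n), product_pmf(q, n))
print(d_joint, n * kl(p, q))  # the two values should agree
```

Both calls to `product_pmf` enumerate outcome tuples in the same order, so the joint pmfs line up term by term.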