Signal Detection and Estimation
Chapter 2. Binary and M-ary Hypothesis Testing
2.1 Introduction (Levy 2.1)
Detection problems can usually be cast as binary or M-ary hypothesis testing problems.
Applications:
This chapter: simple hypothesis testing problems, where the probability distribution of the observations under each hypothesis is assumed to be known exactly.
Example:
Composite hypothesis testing: problems involving unknown parameters (Chapter 4).
Example:
2.6 Gaussian Detection
2.7 M-ary Hypothesis Testing
2.2 Binary Hypothesis Testing Problem Formulation (Levy 2.2 and 2.4)
Binary hypothesis testing is to decide between two hypotheses based on a (random) observation.
Model contains:
1. Hypothesis and a-priori probability
2. Observation
3. Connection between hypotheses and observation
4. Decision function
5. Performance measure
1. Hypothesis and a-priori probability
Hypotheses: H0 and H1.
π0 = P(H0) and π1 = P(H1) = 1 − π0.
2. Observation
Random vector Y with sample space Y.
An observation is a sample vector y of Y.
3. Connection between hypotheses and observation:
Distributions of Y under H0 and H1.
For continuous Y (PDF under each hypothesis):
H0 : Y ∼ fY(y|H0)
H1 : Y ∼ fY(y|H1)
For discrete Y (PMF under each hypothesis):
H0 : P(Y = y|H0) = p(y|H0)
H1 : P(Y = y|H1) = p(y|H1)
Both are assumed to be known in this chapter.
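As a concrete running example for the sketches added in these notes, one can fully specify the two conditional densities; the Gaussian pair below is an illustrative assumption, not taken from the slides.
```python
# A hypothetical running example (not from the slides): scalar observation with
#   H0: Y ~ N(0, 1)   and   H1: Y ~ N(1, 1),
# i.e., the conditional PDFs f(y|H0) and f(y|H1) are fully specified and known.
from scipy.stats import norm

f0 = lambda y: norm.pdf(y, loc=0.0, scale=1.0)   # f(y|H0)
f1 = lambda y: norm.pdf(y, loc=1.0, scale=1.0)   # f(y|H1)
L = lambda y: f1(y) / f0(y)                      # likelihood ratio L(y) = exp(y - 1/2)

print(L(0.5))                                    # 1.0 at the midpoint of the two means
```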
4. Decision function: decide whether H0 or H1 is true given an observation.
A map from Y to {0, 1}:
$$\delta(y) = \begin{cases} 1 & \text{if decide on } H_1 \\ 0 & \text{if decide on } H_0 \end{cases}$$
Decision regions: $Y_0$ and $Y_1$:
$$Y_0 \triangleq \{y \mid \delta(y) = 0\}, \qquad Y_1 \triangleq \{y \mid \delta(y) = 1\}.$$
We have $Y_0 \cap Y_1 = \emptyset$ and $Y_0 \cup Y_1 = Y$. A decision function defines a partition of the sample space of Y.
Examples on decision rules:
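For instance, a minimal (hypothetical) scalar threshold rule and the partition it induces; the threshold t = 0.5 is arbitrary.
```python
# A hypothetical decision rule: delta maps each observation y to 0 or 1, so
# Y0 = (-inf, t] and Y1 = (t, inf) partition the sample space.
import numpy as np

def delta(y, t=0.5):
    """Decision function: 1 (decide H1) if y > t, else 0 (decide H0)."""
    return np.where(y > t, 1, 0)

y = np.array([-1.0, 0.2, 0.8, 2.3])
print(delta(y))   # -> [0 0 1 1]
```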
Goal: obtain the decision rule which is “optimal” (in some sense).
5. An optimality/performance measure.
Bayesian formulation: all uncertainties are quantifiable, and the costs and benefits of outcomes can be measured.
Cost function: Cij for i = 0, 1, j = 0, 1, the cost of deciding on Hi when Hj holds. The value of Cij depends on the application/nature of the problem.
Examples of cost functions: e.g., the uniform cost Cij = 1 − δij, which charges one unit for every error and nothing for a correct decision.
Assumption: making a correct decision is always less costly than making a mistake, i.e., C00 < C10 and C11 < C01.
Bayes risk of a decision function δ.
Risk under H0:
$$R(\delta|H_0) = C_{00}\,P(\delta(y)=0 \mid H_0) + C_{10}\,P(\delta(y)=1 \mid H_0) = C_{00}\,P(Y_0|H_0) + C_{10}\,P(Y_1|H_0)$$
Risk under H1:
$$R(\delta|H_1) = C_{01}\,P(\delta(y)=0 \mid H_1) + C_{11}\,P(\delta(y)=1 \mid H_1) = C_{01}\,P(Y_0|H_1) + C_{11}\,P(Y_1|H_1)$$
Continuous Y: $P(Y_i|H_j) = \int_{Y_i} f(y|H_j)\, dy$.
Discrete Y: $P(Y_i|H_j) = \sum_{y \in Y_i} P(y|H_j)$.
Bayes risk:
$$R(\delta) = R(\delta|H_0)\,P(H_0) + R(\delta|H_1)\,P(H_1) = \pi_0 C_{00} P(Y_0|H_0) + \pi_0 C_{10} P(Y_1|H_0) + \pi_1 C_{01} P(Y_0|H_1) + \pi_1 C_{11} P(Y_1|H_1) = \sum_{i=0}^{1}\sum_{j=0}^{1} \pi_j C_{ij}\, P(Y_i|H_j).$$
Since $P(Y_0|H_0) + P(Y_1|H_0) = P(Y_0|H_1) + P(Y_1|H_1) = 1$,
$$R(\delta) = \pi_0 C_{00} + \pi_0 (C_{10} - C_{00}) P(Y_1|H_0) + \pi_1 C_{01} + \pi_1 (C_{11} - C_{01}) P(Y_1|H_1).$$
Optimal δ: δ that minimizes R(δ), the Bayes risk.
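A minimal numerical sketch (illustrative, not from the slides) of this minimization, assuming the Gaussian pair H0: Y ~ N(0, 1), H1: Y ~ N(1, 1), uniform costs Cij = 1 − δij, and arbitrary priors: the Bayes risk of the threshold rule δ(y) = 1{y > t} is evaluated over a grid of t, and the minimizer agrees with the Bayes LRT threshold, which for this pair is t* = ln(π0/π1) + 1/2 ≈ 0.905.
```python
# Evaluate R(delta) for threshold rules delta(y) = 1{y > t} and locate the minimizer.
import numpy as np
from scipy.stats import norm

pi0, pi1 = 0.6, 0.4                         # illustrative a-priori probabilities
C00, C10, C01, C11 = 0.0, 1.0, 1.0, 0.0     # uniform cost structure C_ij = 1 - delta_ij

def bayes_risk(t):
    PF = norm.sf(t, loc=0.0)                # P(Y1|H0) = P(Y > t | H0)
    PD = norm.sf(t, loc=1.0)                # P(Y1|H1) = P(Y > t | H1)
    return (pi0 * C00 * (1 - PF) + pi0 * C10 * PF
            + pi1 * C01 * (1 - PD) + pi1 * C11 * PD)

ts = np.linspace(-3.0, 4.0, 2001)
risks = np.array([bayes_risk(t) for t in ts])
print(f"risk-minimizing threshold ~ {ts[np.argmin(risks)]:.3f}")   # ~ ln(pi0/pi1) + 1/2
```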
False alarm: H0 is true but H1 is decided (Type I error).
Detection: H1 is true and H1 is decided.
Missed detection: H1 is true but H0 is decided (Type II error).
Probability of detection: $P_D(\delta) = P(Y_1|H_1)$.
Probability of false alarm: $P_F(\delta) = P(Y_1|H_0)$.
Probability of miss: $P_M(\delta) = P(Y_0|H_1) = 1 - P_D(\delta)$.
ML rule: choose the hypothesis with the larger likelihood function value, i.e., an LRT with threshold τ = 1.
2.3.2 Examples
2.3.3 Asymptotic Performance of LRT (Levy 3.2)
For a binary hypothesis testing problem, $Y_1, Y_2, \ldots, Y_N$ is a sequence of i.i.d. random observations, with $Y_k \in \mathbb{R}^n$.
Assume that Y is continuous and let
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix}.$$
LRT:
$$L(y) = \frac{f(y|H_1)}{f(y|H_0)} = \prod_{k=1}^{N} \frac{f(y_k|H_1)}{f(y_k|H_0)} = \prod_{k=1}^{N} L(y_k) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \tau(N)$$
$$\Longleftrightarrow \quad \frac{1}{N}\sum_{k=1}^{N} \ln L(y_k) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \frac{1}{N}\ln \tau(N) \triangleq \gamma(N).$$
Let $Z_k \triangleq \ln L(Y_k) = \ln \frac{f(Y_k|H_1)}{f(Y_k|H_0)}$ and $S_N \triangleq \frac{1}{N}\sum_{k=1}^{N} Z_k$.
The LRT becomes:
$$S_N \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \gamma(N).$$
Notice that Zk’s are i.i.d. and SN is the sample mean of Zk’s.
When $N \to \infty$, by the strong law of large numbers,
$$H_1: \; S_N \xrightarrow{\text{a.s.}} E[Z_k|H_1] = \int \ln \frac{f(y|H_1)}{f(y|H_0)}\, f(y|H_1)\, dy$$
$$H_0: \; S_N \xrightarrow{\text{a.s.}} E[Z_k|H_0] = \int \ln \frac{f(y|H_1)}{f(y|H_0)}\, f(y|H_0)\, dy$$
Def. For two PDFs f and g, the Kullback-Leibler (KL) divergence is
$$D(f|g) = \int f(x) \ln \frac{f(x)}{g(x)}\, dx.$$
It is a natural notion of distance between probability distributions, but not a true “distance” metric.
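A small numerical check (illustrative, not from the slides): Monte Carlo estimates of D(f|g) and D(g|f) for two zero-mean Gaussians with different variances, showing non-negativity and asymmetry; the closed-form values quoted in the comments follow from the standard Gaussian KL formula.
```python
import numpy as np
from scipy.stats import norm

f = norm(loc=0.0, scale=1.0)    # f = N(0, 1)
g = norm(loc=0.0, scale=2.0)    # g = N(0, 4)

def kl_mc(p, q, n=200_000):
    """Monte Carlo estimate of D(p|q) = E_p[ln p(X) - ln q(X)]."""
    x = p.rvs(size=n, random_state=0)
    return np.mean(p.logpdf(x) - q.logpdf(x))

print(kl_mc(f, g))   # ~ ln 2 + 1/8 - 1/2 = 0.318
print(kl_mc(g, f))   # ~ -ln 2 + 2 - 1/2  = 0.807  (!= D(f|g): non-symmetric)
```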
Properties:
1. $D(f|g) \ge 0$ with equality if and only if $f = g$.
2. Non-symmetric: $D(f|g) \ne D(g|f)$ in general.
3. Does not satisfy the triangle inequality.
Let $f_0(y) \triangleq f(y|H_0)$ and $f_1(y) \triangleq f(y|H_1)$. When $N \to \infty$,
$$H_1: \; S_N \xrightarrow{\text{a.s.}} \int \ln \frac{f_1(y)}{f_0(y)}\, f_1(y)\, dy = D(f_1|f_0) > 0,$$
$$H_0: \; S_N \xrightarrow{\text{a.s.}} \int \ln \frac{f_1(y)}{f_0(y)}\, f_0(y)\, dy = -D(f_0|f_1) < 0.$$
Thus PD(N) → 1 and PF (N) → 0.
* As long as we are willing to collect an arbitrarily large number of independent observations, we can perfectly separate H0 and H1, regardless of π0 and Cij.
How fast do PD(N) → 1 and PF(N) → 0? Exponentially in N.
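A Monte Carlo sketch of this convergence (illustrative, not from the slides), assuming the Gaussian pair H0: Y ~ N(0, 1), H1: Y ~ N(1, 1), for which ln L(y) = y − 1/2 and D(f1|f0) = D(f0|f1) = 1/2.
```python
# The sample mean S_N of the log-likelihood ratios approaches +0.5 under H1
# and -0.5 under H0 as N grows.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

S_N_H1 = np.mean(rng.normal(loc=1.0, size=N) - 0.5)   # ~ +D(f1|f0) = +0.5
S_N_H0 = np.mean(rng.normal(loc=0.0, size=N) - 0.5)   # ~ -D(f0|f1) = -0.5
print(S_N_H1, S_N_H0)
```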
2.4 Mini-max Hypothesis Testing (Levy 2.5)
Assume:
1. A-priori probabilities (π0, π1) are unknown.
2. Cost structure (Cij) is known.
Possible solutions:
1. Guess. May lead to bad performance.
2. Design the test conservatively: assume the least-favorable choice of a-priori probabilities and select the test that minimizes the Bayes risk for this choice, i.e., minimize the maximum risk. This guarantees a minimum level of performance independent of the a-priori probabilities.
Problem statement: Find the test δM and a-priori value π0M that solve the mini-max problem
$$(\delta_M, \pi_{0M}) = \arg\min_{\delta} \max_{\pi_0 \in [0,1]} R(\delta, \pi_0).$$
Approach: Saddle point method.
Def. A saddle point is a point in the domain of a function which is a stationary point but not a local extremum.
If a point (δM, π0M) satisfies
$$R(\delta_M, \pi_0) \le R(\delta_M, \pi_{0M}) \le R(\delta, \pi_{0M}) \quad \text{for any } \delta \text{ and } \pi_0, \tag{1}$$
then it is a saddle point of the function R, and it is the solution of the mini-max problem.
Proof:
Step 1: A saddle point of the form (1) exists.
Step 2: The saddle point is the solution (saddle point property).
Step 3: Construct the saddle point.
Comments:
1. If C00 = C11, the mini-max equation becomes
$$P_D = 1 - \frac{C_{10} - C_{00}}{C_{01} - C_{11}}\, P_F,$$
a line through (0, 1) of the (PF, PD) square. If Cij = 1 − δij, the mini-max equation becomes PD = 1 − PF.
2. The mini-max test corresponds to the intersection of the ROC and the line of the mini-max equation.
3. The LRT threshold τM of the mini-max test, corresponding to this intersection, equals the slope of the ROC at this point. The corresponding a-priori probability can be calculated by
$$\pi_{0M} = \left[ 1 + \frac{C_{10} - C_{00}}{(C_{01} - C_{11})\,\tau_M} \right]^{-1}.$$
4. Another way of finding π0M: $\pi_{0M} = \arg\max_{\pi_0} \min_{\delta} R(\delta, \pi_0)$. Define $V(\pi_0) \triangleq \min_{\delta} R(\delta, \pi_0)$, the minimum Bayes risk with a-priori π0, which is achieved by the LRT. Then
$$\pi_{0M} = \arg\max_{\pi_0} V(\pi_0).$$
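A hedged numerical sketch of this approach (illustrative, not from the slides), assuming the Gaussian pair H0: N(0, 1), H1: N(1, 1) and uniform costs, so that R(δ, π0) = π0 PF + (1 − π0) PM and the Bayes-optimal threshold on y is ln(π0/π1) + 1/2; by symmetry the maximizer is π0M = 1/2.
```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def V(pi0):
    """Minimum Bayes risk as a function of the prior pi0 (achieved by the LRT)."""
    t = np.log(pi0 / (1.0 - pi0)) + 0.5    # Bayes-optimal y-threshold for this pair
    PF = norm.sf(t, loc=0.0)               # P(Y > t | H0)
    PM = norm.cdf(t, loc=1.0)              # P(Y <= t | H1)
    return pi0 * PF + (1.0 - pi0) * PM

res = minimize_scalar(lambda p: -V(p), bounds=(1e-6, 1 - 1e-6), method="bounded")
print(f"pi0M ~ {res.x:.3f}, worst-case Bayes risk ~ {V(res.x):.4f}")   # pi0M ~ 0.5
```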
Examples:
2.5 Neyman-Pearson (NP) Testing (Levy 2.4.1)
Assume:
1. A-priori probabilities (π0, π1) are unknown.
2. Cost structure (Cij) is unknown.
NP testing problem: select the test δ that maximizes PD(δ) while ensuring that the probability of false alarm PF(δ) is no more than α.
$$\mathcal{D}_\alpha \triangleq \{\delta \mid P_F(\delta) \le \alpha\}, \qquad \delta_{NP} = \arg\max_{\delta \in \mathcal{D}_\alpha} P_D(\delta).$$
Lagrangian method for constrained optimization.
$$\delta_{NP} = \arg\max_{\delta} P_D(\delta) \quad \text{subject to} \quad P_F(\delta) \le \alpha.$$
Consider the Lagrangian:
$$\mathcal{L}(\delta, \lambda) \triangleq -P_D(\delta) + \lambda\,(P_F(\delta) - \alpha).$$
A test δ is optimal if it minimizes L(δ, λ) (equivalently, maximizes −L(δ, λ)) for some λ ≥ 0 with PF(δ) ≤ α and λ(α − PF(δ)) = 0.
$$-\mathcal{L}(\delta, \lambda) = \int_{Y_1} f(y|H_1)\, dy + \lambda\alpha - \lambda \int_{Y_1} f(y|H_0)\, dy = \int_{Y_1} \left[ f(y|H_1) - \lambda f(y|H_0) \right] dy + \lambda\alpha.$$
−L(δ, λ) is maximized when
$$\delta(y) = \begin{cases} 1 & \text{if } f(y|H_1) > \lambda f(y|H_0) \\ 0 & \text{if } f(y|H_1) < \lambda f(y|H_0) \\ 0 \text{ or } 1 & \text{if } f(y|H_1) = \lambda f(y|H_0) \end{cases} \;=\; \begin{cases} 1 & \text{if } L(y) > \lambda \\ 0 & \text{if } L(y) < \lambda \\ 0 \text{ or } 1 & \text{if } L(y) = \lambda \end{cases}$$
Thus, δ has to be an LRT. λ must satisfy the KKT condition.
Let $F_L(l|H_0) \triangleq P(L \le l \mid H_0)$ be the CDF of the likelihood ratio $L = L(Y)$ under H0, and let $f_0 \triangleq F_L(0|H_0) = P(L = 0 \mid H_0)$. Define two tests:
$$\delta_{L,\lambda}(y) = \begin{cases} 1 & \text{if } L(y) > \lambda \\ 0 & \text{if } L(y) \le \lambda \end{cases} \qquad\qquad \delta_{U,\lambda}(y) = \begin{cases} 1 & \text{if } L(y) \ge \lambda \\ 0 & \text{if } L(y) < \lambda \end{cases}$$
Case 1: If 1− α < f0, let λ = 0 and δNP = δL,0.
Case 2: If 1 − α ≥ f0 and there exists a λ such that FL(λ|H0) = 1 − α, i.e., 1 − α is in the range of FL(l|H0), choose this λ as the LRT threshold and let δNP = δL,λ.
Case 3: If 1 − α ≥ f0 and 1 − α is not in the range of FL(l|H0), i.e., there is a discontinuity point λ > 0 of FL(l|H0) such that
$$F_L(\lambda^{-}|H_0) < 1 - \alpha < F_L(\lambda|H_0),$$
choose this λ as the LRT threshold; the NP test is the randomized
test: choose δU,λ with probability p and δL,λ with probability 1 − p; equivalently,
$$\delta_{NP}(y) = \begin{cases} 1 & \text{if } L(y) > \lambda \\ 0 & \text{if } L(y) < \lambda \\ 1 \text{ w.p. } p, \; 0 \text{ w.p. } 1-p & \text{if } L(y) = \lambda \end{cases}$$
where p is chosen so that $P_F(\delta_{NP}) = \alpha$.
Comments:
1. When Y is discrete, FL(l|H0) is discontinuous, so a randomized test is usually needed.
2. Similarly, we could consider minimizing PF under the constraint PM(δ) ≤ β; a similar solution is obtained. This problem is called an NP test of Type II, while the one discussed above is called an NP test of Type I.
Example:
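A hedged sketch of a Case 3 (randomized) NP test (illustrative, not from the slides), assuming a single Bernoulli observation with H0: Y ~ Bern(0.2), H1: Y ~ Bern(0.6) and α = 0.1; here L(1) = 3, L(0) = 0.5, and 1 − α = 0.9 falls in the jump of FL(l|H0) at λ = 3 (0.8 < 0.9 < 1), so randomization is needed.
```python
import numpy as np

rng = np.random.default_rng(0)
alpha, p0, p1 = 0.1, 0.2, 0.6

# delta_L (decide 1 iff L > 3) has PF = 0; delta_U (decide 1 iff L >= 3) has PF = p0 = 0.2.
# Mixing them with probability p gives PF = p * p0 = alpha, so p = alpha / p0 = 0.5.
p = alpha / p0

def delta_NP(y):
    if y == 1:                     # L(y) = lambda = 3: decide H1 with probability p
        return int(rng.random() < p)
    return 0                       # L(y) = 0.5 < lambda: decide H0

PD = p * p1                        # P(decide 1 | H1) = p * P(Y = 1 | H1) = 0.3
print(f"randomization probability p = {p}, PD = {PD}")
```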
2.5.1 ROC Properties
Finding the ROC is naturally an NP testing problem, so the test must be an LRT:
$$L(y) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \tau.$$
$$P_D(\tau) = \int_{\tau}^{\infty} f_L(l|H_1)\, dl, \qquad P_F(\tau) = \int_{\tau}^{\infty} f_L(l|H_0)\, dl. \tag{2}$$
As τ varies from 0 to ∞, (PF(τ), PD(τ)) moves continuously along the ROC curve.
1. Let τ = 0−. Then δ(y) = 1 always and PD(δ) = PF(δ) = 1, so (1, 1) belongs to the ROC.
2. Let τ = ∞. Then δ(y) = 0 always and PD(δ) = PF(δ) = 0, so (0, 0) belongs to the ROC.
3. The slope of the ROC at the point (PF(τ), PD(τ)) equals τ.
4. The ROC curve is concave, i.e., the region of achievable pairs (PF, PD) is convex.
5. All points on the ROC curve satisfy PD ≥ PF .
6. The region of feasible tests is symmetric about the point (1/2, 1/2), i.e., if (PF, PD) is feasible, so is (1 − PF, 1 − PD).
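A short sketch (illustrative, not from the slides) tracing the ROC of the Gaussian pair H0: Y ~ N(0, 1), H1: Y ~ N(1, 1) and numerically checking properties 3 and 5: sweeping the y-threshold t gives (PF, PD) = (Q(t), Q(t − 1)), and the LRT threshold corresponding to t is τ = exp(t − 1/2).
```python
import numpy as np
from scipy.stats import norm

t = np.linspace(-4.0, 5.0, 1001)
PF = norm.sf(t)                  # P(Y > t | H0)
PD = norm.sf(t - 1.0)            # P(Y > t | H1)
tau = np.exp(t - 0.5)            # LRT threshold corresponding to y-threshold t

slope = np.gradient(PD, PF)      # numerical dPD/dPF along the ROC
mid = len(t) // 2
print(f"at t = {t[mid]:.2f}: slope ~ {slope[mid]:.3f}, tau = {tau[mid]:.3f}")  # property 3
assert np.all(PD >= PF)          # property 5: PD >= PF everywhere on the ROC
```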