Almost optimal sequential detection in multiple data streams Georgios Fellouris Department of Statistics University of Illinois Joint work with Alexander Tartakovsky University of Michigan Ann Arbor, May 13th, 2015 Georgios Fellouris (UIUC) Almost optimal sequential tests July 9, 2012 1 / 49
Almost optimal sequential detection in multiple data streams
Georgios Fellouris
Department of Statistics
University of Illinois
Joint work with Alexander Tartakovsky
University of Michigan, Ann Arbor, May 13th, 2015
Georgios Fellouris (UIUC) Almost optimal sequential tests July 9, 2012 1 / 49
Outline
1 Simple null against simple alternative
2 A simple null against a finite number of alternatives
3 The continuous-parameter case
Sequential testing of two simple hypotheses
Sequentially acquired observations
$X_1, \ldots, X_t, \ldots \overset{\text{iid}}{\sim} f$.
Stop sampling as soon as possible and distinguish between
H0 : f = f0 and H1 : f = f1.
Let Ft be the history of observations up to time t,
Ft = σ(Xs : 1 ≤ s ≤ t).
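As a minimal sketch of this setup, the code below runs Wald's SPRT (the classical rule from the Wald references, used here only as an illustration) on a stream of observations. The Gaussian densities, the threshold values, and the names `log_lr_gauss` and `sprt` are assumptions made for the example and are not taken from the slides.

```python
import math
import random


def log_lr_gauss(x, mu0=0.0, mu1=1.0):
    """Log-likelihood ratio log f1(x)/f0(x) for unit-variance Gaussians (assumed model)."""
    return (mu1 - mu0) * x - 0.5 * (mu1 ** 2 - mu0 ** 2)


def sprt(stream, a, b, log_lr=log_lr_gauss):
    """Stop the first time the cumulative log-likelihood ratio leaves (-a, b).

    Returns (T, d): the stopping time and the decision (1 selects H1, 0 selects H0).
    The decision at time t depends only on X_1, ..., X_t, i.e. on the history F_t.
    """
    z = 0.0
    t = 0
    for t, x in enumerate(stream, start=1):
        z += log_lr(x)
        if z >= b:
            return t, 1
        if z <= -a:
            return t, 0
    return t, None  # the finite stream ended before a decision was reached


# Usage: observations generated under H1 (mean 1), thresholds chosen arbitrarily.
rng = random.Random(0)
data_h1 = (rng.gauss(1.0, 1.0) for _ in range(10_000))
T, d = sprt(data_h1, a=math.log(100), b=math.log(100))
```

The generator is consumed one observation at a time, so sampling stops as soon as the rule decides.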
Wald’s formulation
Find an Ft-stopping time, T, at which to stop sampling and an FT-measurable r.v., dT, so that
Asymptotic Optimality under H0
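Let $I_0^i = D(f_0 \| f_i)$ for every $1 \le i \le M$ and
$$I_0 = \min_{1 \le i \le M} I_0^i.$$
If there is a unique $i$ that attains $I_0$, then
$$E_0[S] = \frac{1}{I_0} \left[ |\log\beta| + O(1) \right].$$
If not,
$$E_0[S] = \frac{1}{I_0} \left[ |\log\beta| + \Theta\left(\sqrt{\log B}\right) \right].$$
The second-order term is not always constant.
If $A, B$ are selected so that both tests belong to $C_{\alpha,\beta}$, then for each of them
$$E_0[S] \sim \inf_{(T,d) \in C_{\alpha,\beta}} E_0[T].$$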
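To make the first-order term concrete, here is a small sketch that computes the information numbers D(f0||fi), their minimum I0, the indices that attain it, and the leading term |log β| / I0. It assumes unit-variance Gaussian densities; the function names are hypothetical and only illustrate the quantities defined above.

```python
import math


def kl_gauss(mu_a, mu_b, sigma=1.0):
    """D(N(mu_a, sigma^2) || N(mu_b, sigma^2)), the Gaussian Kullback-Leibler number."""
    return (mu_a - mu_b) ** 2 / (2.0 * sigma ** 2)


def h0_first_order(beta, alt_means, null_mean=0.0):
    """Return (I0, indices attaining I0, |log beta| / I0) for Gaussian alternatives."""
    infos = [kl_gauss(null_mean, m) for m in alt_means]
    i0 = min(infos)
    # 1-based indices i with D(f0 || fi) equal to the minimum
    argmins = [i for i, v in enumerate(infos, start=1) if math.isclose(v, i0)]
    return i0, argmins, abs(math.log(beta)) / i0


# Unique minimizer: alternatives with means 1 and 2.
i0, argmins, approx = h0_first_order(1e-3, [1.0, 2.0])
# Tie: alternatives with means +1 and -1 are equally close to the null.
_, ties, _ = h0_first_order(1e-3, [1.0, -1.0])
```

The tie case is the one in which the second-order term is not constant.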
Remarks
These asymptotic results are based on non-linear renewal theory (Lai and Siegmund ’77, ’79, Woodroofe ’82, Zhang ’88) and Dragalin et al. (’99, ’00).
They were known for the GSLRT (Tartakovsky, 2003).
Here, we have shown that they hold for arbitrary weights (and for both tests).
How should one choose these weights?
We will show that a particular choice of weights satisfies an even stronger asymptotic optimality property.
Almost minimax?
What if we select $q$ so that
$$\max_{1 \le i \le M} E_i[S] = \inf_{(T,d_T) \in C_{\alpha,\beta}} \max_{1 \le i \le M} E_i[T] + o(1)?$$
This would require that $E_i[S] = E_j[S] + o(1)$ for every $1 \le i, j \le M$.
However, to have $E_i[S] \sim E_j[S]$ for every $1 \le i, j \le M$, we need
$$E_i[S] \sim \frac{|\log\alpha|}{I_i} \sim \frac{|\log\alpha|}{I_j} \sim E_j[S].$$
This is not possible unless $I_1 = \ldots = I_M$.
Almost optimality with respect to a weighted expected sample size
Let $p = (p_1, \ldots, p_M)$ be a vector of positive numbers that add up to 1.
We will try to design each of the proposed tests so that its stopping time $S$ satisfies
$$\sum_{i=1}^{M} p_i E_i[S] = \inf_{(T,d_T) \in C_{\alpha,\beta}} \sum_{i=1}^{M} p_i E_i[T] + o(1).$$
(How to choose the $p_i$'s is discussed later.)
For this, we need to generalize the class of sequential tests.
Two Families of Sequential Tests
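Let $q_0, q_1$ be $M$-dimensional vectors of positive numbers.
WSLRT
$$S = \inf\{ t \ge 1 : Z_t(q_1) \ge B \ \text{or} \ Z_t(q_0) \le -A \},$$
$$\{d_S = 1\} = \{ Z_S(q_1) \ge B \}, \quad \{d_S = 0\} = \{ Z_S(q_0) \le -A \}.$$
WG-SLRT
$$S = \inf\{ t \ge 1 : Z_t(q_1) \ge B \ \text{or} \ Z_t(q_0) \le -A \},$$
$$\{d_S = 1\} = \{ Z_S(q_1) \ge B \}, \quad \{d_S = 0\} = \{ Z_S(q_0) \le -A \}.$$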
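The following sketch shows how two such stopping rules could be run in code. The definitions of the statistics are not visible in this part of the transcript, so the forms used here are assumptions: a mixture statistic log Σ_i q_i exp(Z_t^i) for the weighted test and a weighted maximum max_i (Z_t^i + log q_i) for the weighted generalized test, where Z_t^i is the cumulative log-likelihood ratio of f_i against f_0. All function names are hypothetical.

```python
import math


def mixture_stat(z, q):
    """Assumed mixture statistic: log sum_i q_i * exp(z_i), computed stably."""
    terms = [zi + math.log(qi) for zi, qi in zip(z, q)]
    m = max(terms)
    return m + math.log(sum(math.exp(v - m) for v in terms))


def weighted_max_stat(z, q):
    """Assumed weighted generalized statistic: max_i (z_i + log q_i)."""
    return max(zi + math.log(qi) for zi, qi in zip(z, q))


def gauss_log_lr(mean):
    """Log-likelihood ratio increment of N(mean, 1) against N(0, 1) (assumed model)."""
    return lambda x: mean * x - 0.5 * mean * mean


def run_test(stream, log_lrs, q1, q0, a, b, stat):
    """Stop when stat(z, q1) >= b (decide 1) or stat(z, q0) <= -a (decide 0)."""
    z = [0.0] * len(log_lrs)
    for t, x in enumerate(stream, start=1):
        z = [zi + f(x) for zi, f in zip(z, log_lrs)]
        if stat(z, q1) >= b:
            return t, 1
        if stat(z, q0) <= -a:
            return t, 0
    return None, None  # no decision on this finite stream


# Usage: two alternatives (means 1 and 2), equal weights, thresholds A = B = 2.
lrs = [gauss_log_lr(1.0), gauss_log_lr(2.0)]
q = [0.5, 0.5]
result_max = run_test([1.0] * 50, lrs, q, q, 2.0, 2.0, weighted_max_stat)
result_mix = run_test([1.0] * 50, lrs, q, q, 2.0, 2.0, mixture_stat)
```

Passing the statistic as a parameter keeps the stopping logic identical for the two families, which mirrors how the two definitions differ only in the statistic.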
Almost optimality
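Theorem (F. & Tartakovsky 2013)
If $A, B$ are chosen so that $(S, d_S) \in C_{\alpha,\beta}$ and
$$q_1^i = p_i / L_i, \quad q_0^i = p_i L_i,$$
then as $\alpha, \beta \to 0$ so that $|\log\alpha| \sim |\log\beta|$ we have
$$\sum_{i=1}^{M} p_i E_i[S] = \inf_{(T,d_T) \in C_{\alpha,\beta}} \sum_{i=1}^{M} p_i E_i[T] + o(1).$$
The $L_i$'s were introduced by Lorden (1977):
$$L_i := \exp\left\{ -\sum_{n=1}^{\infty} n^{-1} \left[ P_0(Z_n^i > 0) + P_i(Z_n^i \le 0) \right] \right\} = \delta_i I_i.$$
The condition $|\log\alpha| \sim |\log\beta|$ is more restrictive than what we had assumed before.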
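As an illustration of how an L number can be evaluated, the sketch below truncates the series for one assumed model: f0 = N(0, 1) against f1 = N(θ, 1). In that model Z_n is N(-nθ²/2, nθ²) under P0 and N(+nθ²/2, nθ²) under P1, so both probabilities in the series equal Φ(-|θ|√n / 2). The function names and the truncation rule are choices made for this example.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def lorden_L_gauss(theta, n_terms=100_000):
    """Truncated series for L when testing N(0, 1) against N(theta, 1) (assumed model)."""
    s = 0.0
    for n in range(1, n_terms + 1):
        # P0(Z_n > 0) and P1(Z_n <= 0) are both equal to p in the Gaussian case
        p = std_normal_cdf(-abs(theta) * math.sqrt(n) / 2.0)
        s += 2.0 * p / n
        if p < 1e-16:  # remaining terms are negligible
            break
    return math.exp(-s)


theta = 1.0
L = lorden_L_gauss(theta)
I = theta ** 2 / 2.0  # Kullback-Leibler number for this Gaussian pair
delta = L / I         # the factor delta in L = delta * I
```

For other models the two probabilities generally have no closed form and would have to be approximated, for example by simulation.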
Ingredients of proof
Formulate a Bayesian problem, in which there is a penalty for a wrong decision under each hypothesis and a cost of sampling, $c$, per observation.
Show that the WSLRT with these particular weights that involve the $L$ numbers (and appropriate thresholds) attains the Bayes risk up to an $o(c)$ term (Lorden (1977)).
A third-order asymptotic expansion for the expected sample size of this rule:
$$E_i[S] = \frac{1}{I_i} \left[ |\log\alpha| + \rho_i + \log\delta_i + C_i(p) \right] + o(1),$$
where
$$C_i(p) = \log\left( \sum_{k=1}^{M} \frac{p_k}{I_k} \right) - \log\left( \frac{p_i}{I_i} \right).$$
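The expansion can be evaluated numerically once its ingredients are available. The sketch below computes the terms C_i(p) and the approximation without the o(1) remainder; the quantities ρ_i and δ_i are treated as given inputs because their definitions are not part of this excerpt, and the function names are hypothetical.

```python
import math


def c_terms(p, infos):
    """C_i(p) = log(sum_k p_k / I_k) - log(p_i / I_i) for every i."""
    total = sum(pk / ik for pk, ik in zip(p, infos))
    return [math.log(total) - math.log(pi / ii) for pi, ii in zip(p, infos)]


def expected_sample_size(alpha, info_i, rho_i, delta_i, c_i):
    """Expansion of E_i[S] without the o(1) remainder; rho_i and delta_i are inputs."""
    return (abs(math.log(alpha)) + rho_i + math.log(delta_i) + c_i) / info_i


# Usage with made-up numbers: two alternatives, uniform weights.
c_uniform = c_terms([0.5, 0.5], [1.0, 1.0])
```

Every C_i(p) is non-negative, since the sum inside the first logarithm includes the term p_i / I_i.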
How to select p?
We have seen that an almost minimax rule does not make sense.
We may design the rule so that
$$\max_{1 \le i \le M} (I_i E_i[S]) = \inf_{(T,d_T) \in C_{\alpha,\beta}} \max_{1 \le i \le M} (I_i E_i[T]) + o(1).$$
This is done when $p_i$ is selected $\propto L_i e^{\rho_i}$.
It is not clear whether this is a good criterion.
Robustness
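Let $S_i$ be the optimal SPRT for testing $f_0$ against $f_i$ and set
$$J_i[S] := \frac{E_i[S] - E_i[S_i]}{E_i[S_i]}$$
when both tests satisfy, at least approximately, the error probability constraints.
Based on the previous approximations,
$$J_i[S] \approx \frac{C_i(p)}{|\log\alpha| + \rho_i + \log\delta_i}, \quad \text{where} \quad C_i(p) = \log\left( \sum_{k=1}^{M} \frac{p_k}{I_k} \right) - \log\left( \frac{p_i}{I_i} \right).$$
Setting $p_i \propto I_i$ guarantees that
$$J_i[S] \sim J_j[S] \quad \forall \ 1 \le i \ne j \le M.$$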
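A short worked derivation of why this choice equalizes the relative losses, assuming the normalization p_i = I_i / Σ_j I_j (the normalized form of p_i ∝ I_i) and the approximation for J_i[S] stated above:

```latex
% Assumption: p_i = I_i / \sum_{j=1}^{M} I_j, i.e. p_i \propto I_i normalized to sum to 1.
\begin{align*}
\frac{p_k}{I_k} &= \frac{1}{\sum_{j=1}^{M} I_j} \quad \text{for every } k,
\qquad
\sum_{k=1}^{M} \frac{p_k}{I_k} = \frac{M}{\sum_{j=1}^{M} I_j}, \\
C_i(p) &= \log\frac{M}{\sum_{j=1}^{M} I_j} - \log\frac{1}{\sum_{j=1}^{M} I_j} = \log M, \\
J_i[S] &\approx \frac{\log M}{|\log\alpha| + \rho_i + \log\delta_i}
\sim \frac{\log M}{|\log\alpha|} \quad \text{as } \alpha \to 0.
\end{align*}
```

Since the leading term log M / |log α| does not depend on i, the relative losses of all alternatives are asymptotically equivalent.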
Example
Two channels with densities
$$f_0^k(x) = h(x) \quad \text{and} \quad f_1^k(x) = e^{\theta_k x - \psi(\theta_k)} h(x), \quad k = 1, 2.$$
Say, $\theta_1 = 4$ (fixed) and let $\theta_2 = x$ vary.
Relative performance loss vs relative signal strength
[Figure: two panels, "First Channel (θ = 4)" and "Second Channel (θ = x)"; horizontal axis x from 0 to 8, vertical axis from 0.0 to 0.6.]
$L_i \le I_i \le e^{\rho_k} L_i$
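Sequential testing of a continuous parameter
Sequentially acquired observations
$X_1, \ldots, X_n, \ldots \overset{\text{iid}}{\sim} f \in \{f_\theta, \theta \in \Theta\}$.
Stop sampling as soon as possible and distinguish between
$H_0 : \theta = \theta_0$ and $H_1 : \theta \in \Theta_1$,
where $\theta_0 \notin \Theta_1 \subset \Theta$.
Then, we would like to minimize $E_{\theta_0}[T]$ and $E_\theta[T]$ for every $\theta \in \Theta_1$ in
$$C_{\alpha,\beta} = \{(T, d_T) : P_{\theta_0}(d_T = 1) \le \alpha \ \text{and} \ \sup_{\theta \in \Theta_1} P_\theta(d_T = 0) \le \beta\}.$$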
A multi-parameter exponential family
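Setup
An exponential family
$$f_\theta(x) := e^{\langle\theta, x\rangle - \psi(\theta)}, \quad x \in \mathbb{R}^d, \ \theta \in \Theta \subset \mathbb{R}^d.$$
$\Theta = \{\theta \in \mathbb{R}^d : \int e^{\langle\theta, x\rangle} \nu(dx) < \infty\}$ is the natural parameter space.
$\psi(\theta) = \log \int e^{\langle\theta, x\rangle} \nu(dx)$ is the log-moment generating function of $X$.
We denote by $\nabla\psi(\theta)$ the gradient and by $\nabla^2\psi(\theta)$ the Hessian matrix of $\psi(\theta)$.
We assume that $\nabla^2\psi(\theta)$ is non-singular for all $\theta \in \Theta$.
The Kullback–Leibler information number between $f_{\theta_2}$ and $f_{\theta_1}$ is
$$I(\theta_2, \theta_1) := E_{\theta_2}\left[ \log\frac{f_{\theta_2}(X)}{f_{\theta_1}(X)} \right] = \langle \theta_2 - \theta_1, \nabla\psi(\theta_2) \rangle - [\psi(\theta_2) - \psi(\theta_1)].$$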
I(θ0, θ1) > 0 ∀ θ0 ∈ Θ0, θ1 ∈ Θ1.
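The closed form for the Kullback–Leibler number only needs ψ and its gradient, which the following sketch makes explicit. The two example families (standard Gaussian with ψ(θ) = |θ|²/2 and Poisson with ψ(θ) = e^θ) and all function names are illustrative choices, not taken from the slides.

```python
import math


def kl_exp_family(theta2, theta1, psi, grad_psi):
    """I(theta2, theta1) = <theta2 - theta1, grad psi(theta2)> - [psi(theta2) - psi(theta1)]."""
    g = grad_psi(theta2)
    inner = sum((a - b) * gi for a, b, gi in zip(theta2, theta1, g))
    return inner - (psi(theta2) - psi(theta1))


# Gaussian family N(theta, identity): psi(theta) = |theta|^2 / 2, gradient = theta.
def psi_gauss(theta):
    return 0.5 * sum(t * t for t in theta)


def grad_psi_gauss(theta):
    return list(theta)


# Poisson family with natural parameter theta = log(rate): psi(theta) = exp(theta).
def psi_poisson(theta):
    return math.exp(theta[0])


def grad_psi_poisson(theta):
    return [math.exp(theta[0])]


kl_g = kl_exp_family([1.0, 2.0], [0.0, 0.0], psi_gauss, grad_psi_gauss)
```

In the Gaussian case the formula reduces to half the squared distance between the two parameters.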
The one-sided setup
Suppose that sampling needs to stop only to reject H0.
Then, we need to minimize Eθ[T] for every θ ∈ Θ1 among stopping times in
Cα = {T : P0(T <∞) ≤ α}.
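Let $\ell_n(\theta)$ be the likelihood of the first $n$ observations under $P_\theta$, i.e.,
$$\ell_n(\theta) = \prod_{k=1}^{n} f_\theta(X_k).$$
Open-ended WSPRT and GSLRT
Let $B > 1$ be a fixed threshold and $g$ a positive function on $\Theta_1$. Define, for the WSPRT and the GSLRT respectively,
$$S_B(g) = \inf\{t \ge 1 : \Lambda_t \ge B\}, \quad \Lambda_t = \frac{1}{\ell_t(\theta_0)} \int_{\Theta_1} \ell_t(\theta) \, g(\theta) \, d\theta,$$
$$S_B = \inf\{t \ge 1 : \Lambda_t \ge B\}, \quad \Lambda_t = \frac{1}{\ell_t(\theta_0)} \sup_{\theta \in \Theta_1} \ell_t(\theta).$$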
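The two open-ended rules can be sketched numerically for one assumed model: N(θ, 1) observations with θ0 = 0 and Θ1 an interval discretized on a grid. The integral is approximated by the trapezoid rule and the supremum by a maximum over the grid; the grid, the uniform weight function, and all names are assumptions made for this example.

```python
import math


def log_lr(theta, s, n):
    """log of l_n(theta) / l_n(0) for N(theta, 1) data with running sum s (assumed model)."""
    return theta * s - 0.5 * n * theta * theta


def mixture_lr(s, n, grid, g):
    """Trapezoid approximation of the integral of [l_n(theta) / l_n(0)] g(theta) over the grid."""
    vals = [math.exp(log_lr(th, s, n)) * g(th) for th in grid]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))


def glr(s, n, grid):
    """Grid approximation of sup over Theta_1 of l_n(theta) / l_n(0)."""
    return math.exp(max(log_lr(th, s, n) for th in grid))


def open_ended_stop(stream, b, stat):
    """First n at which stat(sum of observations, n) >= b; None if it never happens."""
    s = 0.0
    for n, x in enumerate(stream, start=1):
        s += x
        if stat(s, n) >= b:
            return n
    return None


# Usage: Theta_1 = [0.5, 2.0], uniform weight function, threshold B = 20.
grid = [0.5 + 0.01 * k for k in range(151)]
uniform_g = lambda th: 1.0 / 1.5
n_glr = open_ended_stop([1.0] * 100, 20.0, lambda s, n: glr(s, n, grid))
n_mix = open_ended_stop([1.0] * 100, 20.0, lambda s, n: mixture_lr(s, n, grid, uniform_g))
```

With a weight function that integrates to 1, the mixture statistic is an average of likelihood ratios and therefore never exceeds the generalized one at the same threshold, so the mixture rule stops no earlier in this sketch.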
A minimax, second-order property
The weighted idea goes back to Wald (1945).
The GSLRT has been studied by Schwarz (1962), Wong (1968), Lorden (1977), Lai (1988, 2004), etc.
If $\Theta_1$ is a compact set bounded away from 0, both tests attain
$$\inf_{T \in C_\alpha} \sup_{\theta \in \Theta_1} I(\theta, 0) E_\theta[T]$$
within an $O(1)$ term as $\alpha \to 0$.
Pollak (1978) proved this result for the WSPRT with any continuous mixing density whose support includes $\Theta_1$ (for a one-parameter exponential family).
Lai (2004) proved this result for the GSLRT.
Almost Minimax WSPRT
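Asymptotic average overshoot
Consider the one-sided SPRT for testing $f_\theta$ versus $f_0$,
Theorem (F. & Tartakovsky (2013))
Consider the WSPRT $S_B(g)$ with weight function
$$g(\theta) := e^{\kappa_\theta} \sqrt{\det(\nabla^2\psi(\theta))} / I(\theta, 0)$$
and suppose that $P_0(S_B(g) < \infty) = \alpha$. Then, as $\alpha \to 0$,
$$\sup_{\theta \in \Theta_1} I(\theta, 0) E_\theta[S_B(g)] = \inf_{T \in C_\alpha} \sup_{\theta \in \Theta_1} I(\theta, 0) E_\theta[T] + o(1).$$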
Idea of Proof
Auxiliary Bayesian problem
Consider the sequential decision problem with
loss 1 when stopping under $P_0$,
sampling cost per observation equal to $c I_\theta$ under $P_\theta$,
conditional prior distribution on $\Theta_1$ given that $\theta \ne 0$ equal to $g$.
The WSPRT $S_B(g)$ is asymptotically Bayes as $c \to 0$ within an $o(c)$ term.
Almost equalizer
As $B \to \infty$,
$$I(\theta, 0) E_\theta[S_B(g)] = \log B + \frac{d}{2} \log\log B + C + o(1),$$
where $C$ is a constant term that does not depend on $\theta$.
Idea of proof
“Almost Bayesian + Almost Equalizer = Almost Minimax”
Almost Minimax Weighted GSLRT
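A weighted version of the GSLRT turns out to have the same optimality property.
Recall that
$$\Lambda_n = \sup_{\theta \in \Theta_1} \frac{\ell_n(\theta)}{\ell_n(\theta_0)} = \frac{\ell_n(\hat\theta_n)}{\ell_n(\theta_0)},$$
where $\hat\theta_n$ is the MLE of $\theta$ (constrained on $\Theta_1$) based on the first $n$ observations.
We define:
$$S_B(g) = \inf\{n \ge 1 : \Lambda_n \, g(\hat\theta_n) \ge B\},$$
where $g$ is some positive function on $\Theta_1$ and $B > 1$ is a fixed threshold.
Theorem
Consider the WGSLRT with $g(\theta) = e^{\kappa_\theta}$. If $P_0(S_B(g) < \infty) = \alpha$, then as $\alpha \to 0$
$$\sup_{\theta \in \Theta_1} I(\theta, 0) E_\theta[S_B(g)] = \inf_{T \in C_\alpha} \sup_{\theta \in \Theta_1} I(\theta, 0) E_\theta[T] + o(1).$$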
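A minimal sketch of the weighted rule for one assumed model, N(θ, 1) observations with θ0 = 0 and Θ1 = [lo, hi]. In that model the likelihood is concave in θ, so the constrained MLE is the sample mean clipped to the interval. The weight function is passed in by the caller; the nearly minimax choice involves κ_θ, which is not computed here, and all names are hypothetical.

```python
import math


def constrained_mle(s, n, lo, hi):
    """MLE of theta over [lo, hi] for N(theta, 1) data: the sample mean clipped to the interval."""
    return min(max(s / n, lo), hi)


def weighted_glr_stop(stream, b, g, lo, hi):
    """First n with Lambda_n * g(theta_hat_n) >= b, compared on the log scale; None if never."""
    s = 0.0
    for n, x in enumerate(stream, start=1):
        s += x
        th = constrained_mle(s, n, lo, hi)
        log_lam = th * s - 0.5 * n * th * th  # log of l_n(theta_hat) / l_n(0)
        if log_lam + math.log(g(th)) >= math.log(b):
            return n
    return None


# Usage with placeholder weight functions (constants, chosen only for illustration).
n_unweighted = weighted_glr_stop([1.0] * 100, 20.0, lambda th: 1.0, 0.5, 2.0)
n_downweighted = weighted_glr_stop([1.0] * 100, 20.0, lambda th: 0.5, 0.5, 2.0)
```

Working on the log scale avoids overflow in the likelihood ratio for long streams.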
Remarks
The function $\theta \mapsto \kappa_\theta$ usually does not admit a closed-form expression.
As a result, the previous nearly minimax sequential tests can be implemented only approximately, as the corresponding mixture-based and generalized likelihood ratio statistics can only be computed numerically.
For the two-sided testing problem, we need an almost Bayes rule for exponential families (Lorden (1977)) and we need to consider again the $L$ numbers (Keener (2005)).
Work in progress
Extension to a composite null hypothesis.
Extension to multiple hypotheses.
THE END
THANK YOU!
References
Dragalin, V. P., Tartakovsky, A. G., and Veeravalli, V. V. (2000). “Multihypothesis sequential probability ratio tests - Part II: Accurate asymptotic expansions for the expected sample size.” IEEE Trans. Inform. Theory 46, 1366-1383.
Fellouris, G. and Tartakovsky, A. G. (2012). “Nearly minimax mixture-based open-ended sequential tests.” Sequential Analysis 31, 297-325.
Fellouris, G. and Tartakovsky, A. G. (2013). “Almost optimal sequential tests for discrete composite hypotheses.” Statistica Sinica 23.
Lai, T. L. (1988). “Nearly optimal sequential tests of composite hypotheses.” Ann. Statist. 16, 856-886.
Lai, T. L. and Siegmund, D. (1977). “A nonlinear renewal theory with applications to sequential analysis I.” Ann. Statist. 5, 628-643.
Lai, T. L. and Siegmund, D. (1979). “A nonlinear renewal theory with applications to sequential analysis II.” Ann. Statist. 7, 60-76.
Lorden, G. (1967). “Integrated risk of asymptotically Bayes sequential tests.” Ann. Math. Statist. 38, 1399-1422.
Lorden, G. (1977). “Nearly optimal sequential tests for finitely many parameter values.” Ann. Statist. 5, 1-21.
Schwarz, G. (1962). “Asymptotic shapes of Bayes sequential testing regions.” Ann. Math. Statist. 33, 224-236.
Tartakovsky, A. G., Li, X. R., and Yaralov, G. (2003). “Sequential detection of targets in multichannel systems.” IEEE Trans. Inform. Theory 49, 425-445.
Wald, A. and Wolfowitz, J. (1948). “Optimum character of the sequential probability ratio test.” Ann. Math. Statist. 19, 326-339.
Wald, A. (1947). Sequential Analysis. Wiley, New York.