Model-Based Compressive Sensing
Richard G. Baraniuk, Volkan Cevher, Marco F. Duarte, Chinmay Hegde
Department of Electrical and Computer Engineering
Rice University
Abstract
Compressive sensing (CS) is an alternative to Shannon/Nyquist sampling for acquisition of sparse or compressible signals that can be well approximated by just K ≪ N elements from an N-dimensional basis. Instead of taking periodic samples, we measure inner products with M < N random vectors and then recover the signal via a sparsity-seeking optimization or greedy algorithm. The standard CS theory dictates that robust signal recovery is possible from M = O(K log(N/K)) measurements. The goal of this paper is to demonstrate that it is possible to substantially decrease M without sacrificing robustness by leveraging more realistic signal models that go beyond simple sparsity and compressibility by including dependencies between values and locations of the signal coefficients. We introduce a model-based CS theory that parallels the conventional theory and provides concrete guidelines on how to create model-based recovery algorithms with provable performance guarantees. A highlight is the introduction of a new class of model-compressible signals along with a new sufficient condition for robust model-compressible signal recovery that we dub the restricted amplification property (RAmP). The RAmP is the natural counterpart to the restricted isometry property (RIP) of conventional CS. To take practical advantage of the new theory, we integrate two relevant signal models — wavelet trees and block sparsity — into two state-of-the-art CS recovery algorithms and prove that they offer robust recovery from just M = O(K) measurements. Extensive numerical simulations demonstrate the validity and applicability of our new theory and algorithms.
Index Terms
Compressive sensing, sparsity, signal model, union of subspaces, wavelet tree, block sparsity
The authors are listed alphabetically. Email: {richb, volkan, duarte, chinmay}@rice.edu; Web: dsp.rice.edu/cs. This work was supported by the grants NSF CCF-0431150 and CCF-0728867, DARPA/ONR N66001-08-1-2065, ONR N00014-07-1-0936 and N00014-08-1-1112, AFOSR FA9550-07-1-0301, ARO MURI W311NF-07-1-0185, and the Texas Instruments Leadership University Program.
I. INTRODUCTION
We are in the midst of a digital revolution that is enabling the development and deployment
of new sensors and sensing systems with ever increasing fidelity and resolution. The theoretical
foundation is the Shannon/Nyquist sampling theorem, which states that a signal's information is
preserved if it is uniformly sampled at a rate at least two times faster than its Fourier bandwidth.
Unfortunately, in many important and emerging applications, the resulting Nyquist rate can be
so high that we end up with too many samples and must compress in order to store or transmit
them. In other applications the cost of signal acquisition is prohibitive, either because of a high
cost per sample, or because state-of-the-art samplers cannot achieve the high sampling rates
required by Shannon/Nyquist. Examples include radar imaging and exotic imaging modalities
outside visible wavelengths.
Transform compression systems reduce the effective dimensionality of an N-dimensional signal x by re-representing it in terms of a sparse set of coefficients α in a basis expansion x = Ψα, with Ψ an N × N basis matrix. By sparse we mean that only K ≪ N of the coefficients α are nonzero and need to be stored or transmitted. By compressible we mean that the coefficients α, when sorted, decay rapidly enough to zero that α can be well-approximated as K-sparse. The sparsity and compressibility properties are pervasive in many signals of interest. For example, smooth signals and images are compressible in the Fourier basis, while piecewise smooth signals and images are compressible in a wavelet basis [1]; the JPEG and JPEG2000 standards are examples of practical transform compression systems based on these bases.
Compressive sensing (CS) provides an alternative to Shannon/Nyquist sampling when the signal under acquisition is known to be sparse or compressible [2–4]. In CS, we measure not periodic signal samples but rather inner products with M ≪ N measurement vectors. In matrix notation, the measurements y = Φx = ΦΨα, where the rows of the M × N matrix Φ contain the measurement vectors. While the matrix ΦΨ is rank deficient, and hence loses information in general, it can be shown to preserve the information in sparse and compressible signals if it satisfies the so-called restricted isometry property (RIP) [3]. Intriguingly, a large
class of random matrices have the RIP with high probability. To recover the signal from the compressive measurements y, we search for the sparsest coefficient vector α that agrees with the measurements. To date, research in CS has focused primarily on reducing the number of measurements M (as a function of N and K), on increasing the robustness, and on reducing the computational complexity of the recovery algorithm. Today's state-of-the-art CS systems can robustly recover K-sparse and compressible signals from just M = O(K log(N/K)) noisy measurements using polynomial-time optimization solvers or greedy algorithms.
While this represents significant progress over Nyquist-rate sampling, our contention in this paper is that it is possible to do even better by more fully leveraging concepts from state-of-the-art signal compression and processing algorithms. In many such algorithms, the key ingredient is a more realistic signal model that goes beyond simple sparsity by codifying the inter-dependency structure among the signal coefficients α.¹ For instance, JPEG2000 and other modern wavelet image coders exploit not only the fact that most of the wavelet coefficients of a natural image are small but also the fact that the values and locations of the large coefficients have a particular structure. Coding the coefficients according to a model for this structure enables these algorithms to compress images close to the maximum amount possible, significantly better than a naïve coder that just processes each large coefficient independently.
In this paper, we introduce a model-based CS theory that parallels the conventional theory and provides concrete guidelines on how to create model-based recovery algorithms with provable performance guarantees. By reducing the degrees of freedom of a sparse/compressible signal through permitting only certain configurations of the large and zero/small coefficients, signal models provide two immediate benefits to CS. First, they enable us to reduce, in some cases significantly, the number of measurements M required to stably recover a signal. Second, during signal recovery, they enable us to better differentiate true signal information from recovery artifacts, which leads to a more robust recovery.
¹Obviously, sparsity and compressibility correspond to simple signal models where each coefficient is treated independently; for example, in a sparse model, the fact that the coefficient α_i is large has no bearing on the size of any α_j, j ≠ i. We will reserve the use of the term “model” for situations where we are enforcing dependencies between the values and the locations of the coefficients α_i.
To precisely quantify the benefits of model-based CS, we introduce and study several new theoretical concepts that could be of more general interest. We begin with signal models for K-sparse signals and make precise how the structure encoded in a signal model reduces the number of potential sparse signal supports in α. Then, using the model-based restricted isometry property (RIP) from [5, 6], we prove that such model-sparse signals can be robustly recovered from noisy compressive measurements. Moreover, we quantify the required number of measurements M and show that for some models M is independent of N. These results unify and generalize the limited related work to date on signal models for strictly sparse signals [5–9]. We then introduce the notion of a model-compressible signal, whose coefficients α are no longer strictly sparse but have a structured power-law decay. To establish that model-compressible signals can be robustly recovered from compressive measurements, we generalize the CS RIP to a new restricted amplification property (RAmP). For some compressible signal models, the required number of measurements M is independent of N.
To take practical advantage of this new theory, we demonstrate how to integrate signal models into two state-of-the-art CS recovery algorithms, CoSaMP [10] and iterative hard thresholding (IHT) [11]. The key modification is surprisingly simple: we merely replace the nonlinear approximation step in these greedy algorithms with a model-based approximation. Thanks to our new theory, both new model-based recovery algorithms have provable robustness guarantees for both model-sparse and model-compressible signals.
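To illustrate how small this change is, here is a minimal Python/numpy sketch (our illustration, not code from the paper) of IHT with a pluggable approximation step; model_approx is a hypothetical stand-in for any best K-term model-based approximation M(x, K):

    import numpy as np

    def iht(y, Phi, K, approx, iters=100):
        """Iterative hard thresholding with a pluggable pruning step.

        Passing hard_threshold recovers standard IHT; passing a
        model-based approximation routine yields model-based IHT.
        """
        x = np.zeros(Phi.shape[1])
        for _ in range(iters):
            # gradient step followed by (model-based) pruning
            x = approx(x + Phi.T @ (y - Phi @ x), K)
        return x

    def hard_threshold(v, K):
        """Standard best K-term approximation: keep the K largest entries."""
        out = np.zeros_like(v)
        keep = np.argsort(np.abs(v))[-K:]
        out[keep] = v[keep]
        return out

    # Standard IHT:    x_hat = iht(y, Phi, K, hard_threshold)
    # Model-based IHT: x_hat = iht(y, Phi, K, model_approx)  # model_approx: hypothetical M(x, K)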
To validate our theory and algorithms and demonstrate their general applicability and utility, we present two specific instances of model-based CS and conduct a range of simulation experiments. The first model accounts for the fact that the large wavelet coefficients of piecewise smooth signals and images tend to live on a rooted, connected tree structure [12]. Using the fact that the number of such trees is much smaller than the binomial coefficient (N choose K), the number of K-sparse signal supports in N dimensions, we prove that a tree-based CoSaMP algorithm needs only M = O(K) measurements to robustly recover tree-sparse and tree-compressible signals. Figure 1 indicates the potential performance gains on a tree-compressible, piecewise smooth signal.
The second model accounts for the fact that the large coefficients of many sparse signals cluster in blocks under a specific sorting order; we develop this block-sparse model in Section VI.
CoSaMP has a simple iterative, greedy structure based on a best BK-term approximation (with B a small integer) that is easily modified to incorporate a best BK-term model-based approximation M_B(x, K). Pseudocode for the modified algorithm is given in Algorithm 1.

Algorithm 1: Model-based CoSaMP
Inputs: CS matrix Φ, measurements y, model-based approximation algorithm M
Output: K-sparse approximation x̂ to the true signal x

x_0 = 0, r = y, i = 0                      {initialize}
while halting criterion false do
    i ← i + 1
    e ← Φ^T r                              {form signal residual estimate}
    Ω ← supp(M_2(e, K))                    {prune signal residual estimate according to signal model}
    T ← Ω ∪ supp(x_{i−1})                  {merge supports}
    b|_T ← Φ_T^† y, b|_{T^C} ← 0           {form signal estimate}
    x_i ← M(b, K)                          {prune signal estimate according to signal model}
    r ← y − Φx_i                           {update measurement residual}
end while
return x̂ ← x_i
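As a concrete illustration (ours, not the authors' implementation), the following Python/numpy sketch transcribes Algorithm 1. It assumes a user-supplied model_approx(v, K) returning the best K-term model-based approximation M(v, K), and it stands in for the enlarged-model prune supp(M_2(e, K)) with model_approx(e, 2K), a simplification made here for brevity:

    import numpy as np

    def model_cosamp(y, Phi, K, model_approx, iters=20):
        """Sketch of model-based CoSaMP (Algorithm 1)."""
        N = Phi.shape[1]
        x = np.zeros(N)
        r = y.copy()
        for _ in range(iters):
            e = Phi.T @ r                                   # signal residual estimate
            Omega = np.flatnonzero(model_approx(e, 2 * K))  # prune via model (stand-in for M_2)
            T = np.union1d(Omega, np.flatnonzero(x))        # merge supports
            b = np.zeros(N)
            b[T] = np.linalg.lstsq(Phi[:, T], y, rcond=None)[0]  # least-squares estimate on T
            x = model_approx(b, K)                          # prune estimate via model
            r = y - Phi @ x                                 # update measurement residual
        return x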
We now study the performance of model-based CoSaMP signal recovery on model-sparse
signals and model-compressible signals.
B. Performance of model-sparse signal recovery
A robustness guarantee for noisy measurements of model-sparse signals can be obtained
using the model-based RIP (10). The following theorem is proven in Appendix IV.
Theorem 4: Let x ∈ M_K and let y = Φx + n be a set of noisy CS measurements. If Φ has an M_K^4-RIP constant δ_{M_K^4} ≤ 0.1, then the signal estimate x_i obtained from iteration i of the model-based CoSaMP algorithm satisfies

‖x − x_i‖_2 ≤ 2^{−i}‖x‖_2 + 15‖n‖_2.    (13)
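As a quick consequence of (13) (our remark, not part of the theorem): once i ≥ log₂(‖x‖_2/‖n‖_2), the term 2^{−i}‖x‖_2 falls below the noise level ‖n‖_2, so the bound becomes ‖x − x_i‖_2 ≤ 16‖n‖_2; the number of iterations needed thus grows only logarithmically in the signal-to-noise ratio.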
C. Performance of model-compressible signal recovery
Using the new tools introduced in Section III, we can provide a robustness guarantee for noisy measurements of model-compressible signals, using the RAmP as a condition on the measurement matrix Φ.
Theorem 5: Let x ∈ M_s be an s-model-compressible signal from a model M that obeys the NAP, and let y = Φx + n be a set of noisy CS measurements. If Φ has the M_K^4-RIP with δ_{M_K^4} ≤ 0.1 and the (ε_K, r)-RAmP with ε_K ≤ 0.1 and r = s − 1, then the signal estimate x_i obtained from iteration i of the model-based CoSaMP algorithm satisfies

‖x − x_i‖_2 ≤ 2^{−i}‖x‖_2 + 35(‖n‖_2 + |x|_{M_s} K^{−s}(1 + ln⌈N/K⌉)).    (14)
To prove the theorem, we first bound the recovery error for an s-model-compressible signal x ∈ M_s when the matrix Φ has the (ε_K, r)-RAmP with r ≤ s − 1. Then, using Theorems 3 and 4, we can easily prove the result by following the analogous proof in [10].
D. Robustness to model mismatch
We now analyze the robustness of model-based CS recovery tomodel mismatch, which occurs
when the signal being recovered from compressive measurements does not conform exactly to
the model used in the recovery algorithm.
We begin with optimistic results for signals that are “close” to matching the recovery model. First consider a signal x that is not K-model sparse as the recovery algorithm assumes but rather (K + κ)-model sparse for some small integer κ. This signal can be decomposed into x_K, the signal's K-term model-based approximation, and x − x_K, the error of this approximation. For κ ≤ K, we have that x − x_K ∈ R_{2,K}. If the matrix Φ has the (ε_K, r)-RAmP, then it follows that

‖Φ(x − x_K)‖_2 ≤ 2^r √(1 + ε_K) ‖x − x_K‖_2.    (15)
Using equations (13) and (15), we obtain the following guarantee for the ith iteration of model-based CoSaMP:

‖x − x_i‖_2 ≤ 2^{−i}‖x‖_2 + 16 · 2^r √(1 + ε_K) ‖x − x_K‖_2 + 15‖n‖_2.

By noting that ‖x − x_K‖_2 is small, we obtain a guarantee that is close to (13).
Second, consider a signal x that is not s-model compressible as the recovery algorithm assumes but rather (s − ε)-model compressible. The following bound can be obtained under the conditions of Theorem 5 by modifying the argument in Appendix II:

‖x − x_i‖_2 ≤ 2^{−i}‖x‖_2 + 35(‖n‖_2 + |x|_{M_s} K^{−s}(1 + (⌈N/K⌉^ε − 1)/ε)).
As ε becomes smaller, the factor (⌈N/K⌉^ε − 1)/ε approaches ln⌈N/K⌉, matching (14). In summary, as long as the deviations from the model-sparse and model-compressible models are small, our model-based recovery guarantees still apply within a small bounded constant factor.
We end with a more pessimistic, worst-case result for signals that are arbitrarily far away from model-sparse or model-compressible. Consider such an arbitrary x ∈ R^N and compute its nested model-based approximations x_{jK} = M(x, jK), j = 1, . . . , ⌈N/K⌉. If x is not model-compressible, then the model-based approximation error σ_{jK}(x) is not guaranteed to decay as j increases. Additionally, the number of residual subspaces R_{j,K} could be as large as (N choose K); that is, the jth difference between subsequent model-based approximations x_{T_j} = x_{jK} − x_{(j−1)K} might lie in any arbitrary K-dimensional subspace. This worst case is equivalent to setting r = 0 and R_j = (N choose K) in Theorem 2. It is easy to see that this condition on the number of measurements M is nothing but the standard RIP for CS. Hence, if we inflate the number of measurements to M = O(K log(N/K)) (the usual number for conventional CS), the performance of model-based CoSaMP recovery on an arbitrary signal x follows the K-term model-based approximation of x within a bounded constant factor.
E. Computational complexity of model-based recovery
The computational complexity of a model-based signal recovery algorithm differs from that of a standard algorithm by two factors. The first factor is the reduction in the number of measurements M necessary for recovery: since most current recovery algorithms have a computational complexity that is linear in the number of measurements, any reduction in M reduces the total complexity. The second factor is the cost of the model-based approximation. The K-term approximation used in most current recovery algorithms can be implemented with a simple sorting operation (O(N log N) complexity, in general). Ideally, the signal model should support a similarly efficient approximation algorithm.
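For concreteness (our illustration), the sorting-based K-term step takes a few lines of numpy; np.argpartition gives the O(N) selection variant, which is the efficiency target a model-based approximation should aim for:

    import numpy as np

    def kterm_approx(v, K):
        """Best K-term approximation via selection rather than a full sort."""
        out = np.zeros_like(v)
        keep = np.argpartition(np.abs(v), -K)[-K:]  # O(N) selection of the K largest
        out[keep] = v[keep]
        return out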
To validate our theory and algorithms and demonstrate their general applicability and utility, we now present two specific instances of model-based CS and conduct a range of simulation experiments.
V. EXAMPLE: WAVELET TREE MODEL
Wavelet decompositions have found wide application in the analysis, processing, and compression of smooth and piecewise smooth signals because these signals are K-sparse and compressible, respectively [1]. Moreover, the wavelet coefficients can be naturally organized into a tree structure, and for many kinds of natural and manmade signals the largest coefficients cluster along the branches of this tree. This motivates a connected tree model for the wavelet coefficients [26–28].
While CS recovery for wavelet-sparse signals has been considered previously [14–16], the resulting algorithms integrated the tree constraint in an ad hoc fashion. Furthermore, the algorithms provide no recovery guarantees or bounds on the necessary number of compressive measurements.
A. Tree-sparse signals
We first describe tree sparsity in the context of sparse wavelet decompositions. We focus on one-dimensional signals and binary wavelet trees, but all of our results extend directly to d-dimensional signals and 2^d-ary wavelet trees.

Consider a signal x of length N = 2^I, for an integer value of I. The wavelet representation
Fig. 2. Binary wavelet tree for a one-dimensional signal. The squares denote the large wavelet coefficients that arise from the discontinuities in the piecewise smooth signal drawn below; the support of the large coefficients forms a rooted, connected tree.
of x is given by

x = v_0 ν + Σ_{i=0}^{I−1} Σ_{j=0}^{2^i−1} w_{i,j} ψ_{i,j},

where ν is the scaling function and ψ_{i,j} is the wavelet function at scale i and offset j. The wavelet transform consists of the scaling coefficient v_0 and wavelet coefficients w_{i,j} at scale i, 0 ≤ i ≤ I − 1, and position j, 0 ≤ j ≤ 2^i − 1. In terms of our earlier matrix notation, x has the representation x = Ψα, where Ψ is a matrix containing the scaling and wavelet functions as columns, and α = [v_0 w_{0,0} w_{1,0} w_{1,1} w_{2,0} . . .]^T is the vector of scaling and wavelet coefficients. We are, of course, interested in sparse and compressible α.
The nested supports of the wavelets at different scales create a parent/child relationship between wavelet coefficients at different scales. We say that w_{i−1,⌊j/2⌋} is the parent of w_{i,j} and that w_{i+1,2j} and w_{i+1,2j+1} are the children of w_{i,j}. These relationships can be expressed graphically by the wavelet coefficient tree in Figure 2.
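In code, this index arithmetic is immediate (a small sketch using the (scale i, offset j) convention above):

    def parent(i, j):
        """Index of the parent of wavelet coefficient w[i, j]."""
        return (i - 1, j // 2)

    def children(i, j):
        """Indices of the two children of wavelet coefficient w[i, j]."""
        return (i + 1, 2 * j), (i + 1, 2 * j + 1)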
Wavelet functions act as local discontinuity detectors, and using the nested support property of wavelets at different scales, it is straightforward to see that a signal discontinuity will give rise to a chain of large wavelet coefficients along a branch of the wavelet tree from a leaf to the root. Moreover, smooth signal regions will give rise to regions of small wavelet coefficients. This “connected tree” property has been well-exploited in a number of wavelet-based processing
[12, 29, 30] and compression [31, 32] algorithms. In this section, we will specialize the theory developed in Sections III and IV to a connected tree model T.

A set of wavelet coefficients Ω forms a connected subtree if, whenever a coefficient w_{i,j} ∈ Ω, then its parent w_{i−1,⌊j/2⌋} ∈ Ω as well. Each such set Ω defines a subspace of signals whose support is contained in Ω; that is, all wavelet coefficients outside Ω are zero. In this way, we define the model T_K as the union of all K-dimensional subspaces corresponding to supports Ω that form connected subtrees.
Definition 9: Define the set of K-tree sparse signals as

T_K = { x = v_0 ν + Σ_{i=0}^{I−1} Σ_{j=0}^{2^i−1} w_{i,j} ψ_{i,j} : w|_{Ω^C} = 0, |Ω| = K, Ω forms a connected subtree }.
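Membership in T_K is straightforward to test, since a support is a connected subtree exactly when every non-root coefficient's parent is also present. A minimal sketch, representing a support Omega as a Python set of (i, j) pairs and treating scale-0 coefficients as roots:

    def is_connected_subtree(Omega):
        """True if every coefficient's parent is also in the support."""
        return all(i == 0 or (i - 1, j // 2) in Omega for (i, j) in Omega)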
To quantify the number of subspaces in T_K, it suffices to count the number of distinct connected subtrees of size K in a binary tree of size N. We prove the following result in Appendix V.

Proposition 1: The number of subspaces in T_K obeys T_K ≤ 4^{K+4}/(K e²) for K ≥ log₂ N and T_K ≤ (2e)^K/(K + 1) for K < log₂ N.
B. Tree-based approximation
To implement tree-based signal recovery, we seek an efficient algorithm T(x, K) to solve the optimal approximation

x_{T_K} = arg min_{x̄ ∈ T_K} ‖x − x̄‖_2.    (16)
Fortuitously, an efficient solver exists, called the condensing sort and select algorithm (CSSA) [26–28]. Recall that subtree approximation coincides with standard K-term approximation (and hence can be solved by simply sorting the wavelet coefficients) when the wavelet coefficients are monotonically nonincreasing along the tree branches out from the root. The CSSA solves (16) in the case of general wavelet coefficient values by condensing the nonmonotonic segments of the tree branches using an iterative sort-and-average routine. The condensed nodes are called “supernodes”. Condensing a large coefficient far down the tree accounts for the potentially large cost (in terms of the total budget of tree nodes K) of growing the tree to that point.
The CSSA can also be interpreted as a greedy search among the nodes. For each node in the tree, the algorithm calculates the average wavelet coefficient magnitude for each subtree rooted at that node, and records the largest average among all the subtrees as the energy for that node. The CSSA then searches for the unselected node with the largest energy and adds the subtree corresponding to the node's energy to the estimated support as a supernode [28].
Since the first step of the CSSA involves sorting all of the wavelet coefficients, overall it requires O(N log N) computations. However, once the CSSA grows the optimal tree of size K, it is trivial to determine the optimal trees of size < K and computationally efficient to grow the optimal trees of size > K [26].
The constrained optimization (16) can be rewritten as an unconstrained problem by introducing the Lagrange multiplier λ [33]:

min_{x̄ ∈ T} ‖x − x̄‖_2² + λ(‖ᾱ‖_0 − K),

where T = ∪_{n=1}^{N} T_n and ᾱ are the wavelet coefficients of x̄. Except for the inconsequential λK term, this optimization coincides with Donoho's complexity penalized sum of squares [33], which can be solved in only O(N) computations using coarse-to-fine dynamic programming on the tree. Its primary shortcoming is the nonobvious relationship between the tuning parameter λ and the resulting size K of the optimal connected subtree.
C. Tree-compressible signals
Specializing Definition 2 from Section III-C toT , we make the following definition.
Definition 10: Define the set ofs-tree compressible signalsas
Ts =x ∈ R
N : ‖x− T(x,K)‖2 ≤ SK−s, 1 ≤ K ≤ N, S <∞.
Furthermore, define|x|Ts as the smallest value ofS for which this condition holds forx ands.
Tree approximation classes contain signals whose wavelet coefficients have a loose (and possibly interrupted) decay from coarse to fine scales. These classes have been well-characterized for wavelet-sparse signals [27, 28, 32] and are intrinsically linked with the Besov spaces B_q^s(L_p([0, 1])). Besov spaces contain functions of one or more continuous variables that have (roughly speaking) s derivatives in L_p([0, 1]); the parameter q provides finer distinctions of smoothness. When a Besov space signal x_a with s > 1/p − 1/2 is sampled uniformly and converted to a length-N vector x, its wavelet coefficients belong to the tree approximation space T_s, with

|x|_{T_s} ≍ ‖x_a‖_{L_p([0,1])} + ‖x_a‖_{B_q^s(L_p([0,1]))},

where “≍” denotes an equivalent norm. The same result holds if s = 1/p − 1/2 and q ≤ p.
D. Stable tree-based recovery from compressive measurements
For tree-sparse signals, by applying Theorem 1 and Proposition 1, we find that a subgaussian random matrix has the T_K-RIP property with constant δ_{T_K} and probability 1 − e^{−t} if the number of measurements obeys

M ≥ (2/(c δ_{T_K}²)) (K ln(24e/δ_{T_K}) + ln(2/(K+1)) + t)    if K < log₂ N,

M ≥ (2/(c δ_{T_K}²)) (K ln(48/δ_{T_K}) + ln(512/(K e²)) + t)   if K ≥ log₂ N.
Thus, the number of measurements necessary for stable recovery of tree-sparse signals is linear in K, without the dependence on N present in conventional non-model-based CS recovery.
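For a rough sense of scale (our back-of-the-envelope figures, ignoring constants): with N = 1024 and K = 26, the setting of the experiment in Figure 1, the conventional rate K log₂(N/K) evaluates to roughly 138 measurements, while the tree-model rate grows only like K = 26.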
For tree-compressible signals, we must quantify the number of subspaces R_j in each residual set R_{j,K} for the approximation class. We can then apply the theory of Section IV-C with Proposition 1 to calculate the smallest allowable M via Theorem 5.
Proposition 2: The number of K-dimensional subspaces that comprise R_{j,K} obeys

R_j ≤ (2e)^{K(2j+1)} / ((Kj + K + 1)(Kj + 1))        if 1 ≤ j < ⌊log₂(N)/K⌋,

R_j ≤ 2^{(3j+2)K+8} e^{jK} / ((Kj + 1)K(j + 1)e²)    if j = ⌊log₂(N)/K⌋,

R_j ≤ 4^{(2j+1)K+8} / (K² j(j + 1) e⁴)               if j > ⌊log₂(N)/K⌋.    (17)
Using Proposition 2 and Theorem 5, we obtain the following condition for the matrix Φ to have the RAmP, which is proved in Appendix VI.
Proposition 3: Let Φ be an M × N matrix with i.i.d. subgaussian entries. If

M ≥ (2/(√(1 + ε_K) − 1)²) (10K + 2 ln(N/(K(K+1)(2K+1))) + t)   if K ≤ log₂ N,

M ≥ (2/(√(1 + ε_K) − 1)²) (10K + 2 ln(601N/K³) + t)            if K > log₂ N,

then the matrix Φ has the (ε_K, s)-RAmP for the model T and all s > 0.5 with probability 1 − e^{−t}.
Both cases give a simplified bound on the number of measurements required as M = O(K), which is a substantial improvement over the M = O(K log(N/K)) required by conventional CS recovery methods. Thus, when Φ satisfies Proposition 3, we have the guarantee (14) for sampled Besov space signals from B_q^s(L_p([0, 1])).
E. Experiments
We now present the results of a number of numerical experiments that illustrate the
effectiveness of a tree-based recovery algorithm. Our consistent observation is that explicit
incorporation of the model in the recovery process significantly improves the quality of recovery
for a given number of measurements. In addition, model-based recovery remains stable when the
inputs are no longer tree-sparse, but rather are tree-compressible and/or corrupted with differing
levels of noise. We employ the model-based CoSaMP recovery of Algorithm 1 with a CSSA-based approximation step in all experiments.
We first study one-dimensional signals that match the connected wavelet-tree model described
above. Among such signals is the class of piecewise smooth functions, which are commonly
encountered in analysis and practice.
Figure 1 illustrates the results of recovering the tree-compressible HeaviSine signal of length N = 1024 from M = 80 noise-free random Gaussian measurements using CoSaMP, ℓ1-norm minimization using the l1eq solver from the ℓ1-Magic toolbox,³ and our tree-based recovery algorithm. It is clear that the number of measurements (M = 80) is far fewer than the minimum number required by CoSaMP and ℓ1-norm minimization to accurately recover the signal. In contrast, tree-based recovery using K = 26 is accurate and uses fewer iterations to converge

³ http://www.acm.caltech.edu/l1magic.
[Figure 3 plot: average normalized error magnitude vs. M/K, with curves for model-based recovery and CoSaMP.]
Fig. 3. Performance of CoSaMP vs. wavelet tree-based recovery on a class of piecewise-cubic signals as a function
of M/K.
than conventional CoSaMP. Moreover, the normalized magnitude of the squared error for tree-based recovery is equal to 0.037, which is remarkably close to the error between the noise-free signal and its best K-term tree-approximation (0.036).
Figure 3 illustrates the results of a Monte Carlo simulation study on the impact of the number of measurements M on the performance of model-based and conventional recovery for a class of tree-sparse piecewise-polynomial signals. Each data point was obtained by measuring the normalized recovery error of 500 sample trials. Each sample trial was conducted by generating a new piecewise-polynomial signal with five polynomial pieces of cubic degree and randomly placed discontinuities, computing its best K-term tree-approximation using the CSSA, and then measuring the resulting signal using a matrix with i.i.d. Gaussian entries. Model-based recovery attains near-perfect recovery at M = 3K measurements, while CoSaMP only matches this performance at M = 5K. We defer a full Monte Carlo comparison of our method with the much more computationally demanding ℓ1-norm minimization to future work. In practice, we have noticed that CoSaMP and ℓ1-norm minimization offer similar recovery trends; consequently, we can expect that model-based recovery will offer a similar degree of improvement over ℓ1-norm minimization.
Further, we demonstrate that model-based recovery performs stably in the presence of
[Figure 4 plot: maximum normalized recovery error vs. expected signal-to-noise ratio; curves include CoSaMP (M = 5K).]
Fig. 4. Robustness to measurement noise for standard and wavelet tree-based CS recovery algorithms. We plot the maximum normalized recovery error over 200 sample trials as a function of the expected signal-to-noise ratio. The linear growth demonstrates that model-based recovery possesses the same robustness to noise as CoSaMP and ℓ1-norm minimization.
measurement noise. We generated sample piecewise-polynomial signals as above, computed their best K-term tree-approximations, computed M measurements of each approximation, and finally added Gaussian noise of expected norm ‖n‖_2 to each measurement. Then, we recovered the signal using CoSaMP and model-based recovery and measured the recovery error in each case. For comparison purposes, we also tested the recovery performance of an ℓ1-norm minimization algorithm that accounts for the presence of noise, which has been implemented as the l1qc solver in the ℓ1-Magic toolbox. First, we determined the lowest value of M for which the respective algorithms provided near-perfect recovery in the absence of noise in the measurements. This corresponds to M = 3.5K for model-based recovery, M = 5K for CoSaMP, and M = 4.5K for ℓ1-norm minimization. Next, we generated 200 sample tree-modeled signals, computed M noisy measurements, recovered the signal using the given algorithm, and recorded the recovery error.
Figure 4 illustrates the growth in maximum normalized recovery error (over the 200 sample trials) as a function of the expected measurement signal-to-noise ratio for the three algorithms. We observe similar stability curves for all three algorithms, while noting that model-based recovery offers this kind of stability using significantly fewer measurements.
Fig. 5. Example performance of standard and model-based recovery on images. (a) N = 128 × 128 = 16384-pixel Peppers test image. Image recovery from M = 5000 compressive measurements using (b) conventional CoSaMP (RMSE = 22.8) and (c) our wavelet tree-based algorithm (RMSE = 11.1).
Finally, we turn to two-dimensional images and a wavelet quadtree model. The connected wavelet-tree model has proven useful for compressing natural images [27]; thus, our algorithm provides a simple and provably efficient method for recovering a wide variety of natural images from compressive measurements. An example of recovery performance is given in Figure 5. The test image (Peppers) is of size N = 128 × 128 = 16384 pixels, and we computed M = 5000 random Gaussian measurements. Model-based recovery again offers higher performance than standard signal recovery algorithms like CoSaMP, both in terms of recovery mean-squared error and visual quality.
VI. EXAMPLE: BLOCK-SPARSE SIGNALS AND SIGNAL ENSEMBLES
In a block-sparse signal, the locations of the significant coefficients cluster in blocks under a specific sorting order. Block-sparse signals have been previously studied in CS applications, including DNA microarrays and magnetoencephalography [7, 8]. An equivalent problem arises in CS for signal ensembles, such as sensor networks and MIMO communication [8, 9, 34]. In this case, several signals share a common coefficient support set. For example, when a frequency-sparse acoustic signal is recorded by an array of microphones, then all of the recorded signals
contain the same Fourier frequencies but with different amplitudes and delays. Such a signal
ensemble can be re-shaped as a single vector by concatenation, and then the coefficients can be
rearranged so that the concatenated vector exhibits block sparsity.
It has been shown that the block-sparse structure enables signal recovery from a reduced number of CS measurements, both for the single signal case [7, 8] and the signal ensemble case [9], through the use of specially tailored recovery algorithms [7, 8, 35]. However, the robustness guarantees for such algorithms either are restricted to exactly sparse signals and noiseless measurements, do not have explicit bounds on the number of necessary measurements, or are asymptotic in nature.
In this section, we formulate the block sparsity signal model as a union of subspaces and pose an approximation algorithm on this union of subspaces. The approximation algorithm is used to implement block-based signal recovery. We also define the corresponding class of block-compressible signals and quantify the number of measurements necessary for robust recovery.
A. Block-sparse signals
Consider a classS of signal vectorsx ∈ RJN , with J andN integers. This signal can be
reshapped into aJ × N matrix X, and we use both notations interchangeably in the sequel.
We will restrict entire columns ofX to be part of the support of the signal as a group. That
is, signalsX in a block-sparse model have entire columns as zeros or nonzeros. The measure
of sparsity forX is its number of nonzero columns. More formally, we make the following
definition.
Definition 11: [7, 8] Define the set of K-block sparse signals as

S_K = { X = [x_1 . . . x_N] ∈ R^{J×N} : x_n = 0 for n ∉ Ω, Ω ⊆ {1, . . . , N}, |Ω| = K }.
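For concreteness (our sketch, not from the paper), a random element of S_K can be drawn as follows:

    import numpy as np

    def random_block_sparse(J, N, K, rng=None):
        """Draw a J x N matrix with exactly K nonzero columns (an element of S_K)."""
        rng = np.random.default_rng() if rng is None else rng
        X = np.zeros((J, N))
        Omega = rng.choice(N, size=K, replace=False)   # support: K column indices
        X[:, Omega] = rng.standard_normal((J, K))
        return X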
It is important to note that a K-block sparse signal has sparsity KJ, which is dependent on the size of the block J. We can extend this formulation to ensembles of J length-N signals with common support. Denote this signal ensemble by {x_1, . . . , x_J}, with x_j ∈ R^N, 1 ≤ j ≤ J. We formulate a matrix representation X of the ensemble that features the signal x_j in its jth row: X = [x_1 . . . x_J]^T. The matrix X features the same structure as the matrix X obtained from a block-sparse signal; thus, the matrix X can be converted into a block-sparse vector x that represents the signal ensemble.
B. Block-based approximation
To pose the block-based approximation algorithm, we need to define the mixed norm of a matrix.
Definition 12: The (p, q) mixed norm of the matrix X = [x_1 x_2 . . . x_N] is defined as

‖X‖_{(p,q)} = ( Σ_{n=1}^{N} ‖x_n‖_p^q )^{1/q}.
When q = 0, ‖X‖_{(p,0)} simply counts the number of nonzero columns in X.
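Definition 12 transcribes directly into numpy (our sketch; the q = 0 case is returned as a column count, matching the remark above):

    import numpy as np

    def mixed_norm(X, p, q):
        """(p, q) mixed norm: the q-norm of the vector of column p-norms."""
        col_norms = np.linalg.norm(X, ord=p, axis=0)
        if q == 0:
            return np.count_nonzero(col_norms)  # number of nonzero columns
        return np.sum(col_norms ** q) ** (1.0 / q)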
We immediately find that ‖X‖_{(p,p)} = ‖x‖_p, with x the vectorization of X. Intuitively, we pose the algorithm S(X, K) to obtain the best block-based approximation of the signal X as follows:

X_{S_K} = arg min_{X̄ ∈ R^{J×N}} ‖X − X̄‖_{(2,2)} subject to ‖X̄‖_{(2,0)} ≤ K.    (18)
It is easy to show that to obtain the approximation, it suffices to perform column-wise hard thresholding: let ρ be the Kth largest ℓ2-norm among the columns of X. Then our approximation keeps the K columns whose ℓ2-norms are at least ρ and sets all remaining columns to zero.
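The thresholding step just described is equally short in code (our sketch of the approximation S(X, K), keeping the K columns of largest ℓ2-norm):

    import numpy as np

    def block_approx(X, K):
        """Best K-block approximation by column-wise hard thresholding."""
        col_norms = np.linalg.norm(X, axis=0)   # l2-norm of each column
        keep = np.argsort(col_norms)[-K:]       # K columns of largest norm
        out = np.zeros_like(X)
        out[:, keep] = X[:, keep]
        return out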