A Task-specific Approach to Computational
Imaging System Design
by
Amit Ashok
A Dissertation Submitted to the Faculty of the
Department of Electrical and Computer Engineering
In Partial Fulfillment of the Requirements For the Degree of
Doctor of Philosophy
In the Graduate College
The University of Arizona
2008
THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE
As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Amit Ashok entitled "A Task-Specific Approach to Computational Imaging System Design" and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy _______________________________________________________________________ Date: 07/30/2008
Prof. Mark A. Neifeld _______________________________________________________________________ Date: 07/30/2008
Prof. Raymond K. Kostuk _______________________________________________________________________ Date: 07/30/2008
Prof. William E. Ryan _______________________________________________________________________ Date: 07/30/2008
Prof. Michael W. Marcellin _______________________________________________________________________ Date:
Final approval and acceptance of this dissertation is contingent upon the candidate’s submission of the final copies of the dissertation to the Graduate College. I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement. ________________________________________________ Date: 07/30/2008 Dissertation Director: Prof. Mark A. Neifeld
Statement by Author
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
Signed: Amit Ashok
Approval by Dissertation Director
This dissertation has been approved on the date shown below:
Mark A. Neifeld
Professor of Electrical and Computer Engineering
Date
Acknowledgements
Signal processing has found a multitude of applications ranging from communications to pattern recognition. Its application to various imaging modalities such as sonar, radar, tomography, and optical imaging systems has been a very interesting topic of research to me. I am fortunate to have had the opportunity to conduct dissertation research in the multi-disciplinary area of computational imaging systems, which involves subjects such as optics, statistics, optimization, and, of course, signal processing.
I would like to express my sincere gratitude to my advisor, Prof. Mark Neifeld, who has always provided invaluable guidance and steadfast support. He has been an inspiring mentor who has set a very high standard to achieve. Thanks to my colleagues in the OCPL lab, in particular Ravi Pant, Pawan Baheti, and Jun Ke, who were very helpful and supportive and helped create an exciting and friendly work environment. I wish to express my heartfelt thanks to my parents and my wife, Sabina, who have always believed in me and encouraged me to persist. I want to thank Prof. W. Ryan, Prof. R. Kostuk, and Prof. M. Marcellin for serving on my dissertation committee and providing invaluable feedback on my dissertation research work.
Table 3.1. Imaging system performance for K = 1, K = 4, K = 9, and K = 16 on training set . . . 68
Table 3.2. Imaging system performance for K = 1, K = 4, K = 9, and K = 16 on validation set . . . 75
Table 5.1. TSI (in bits) for candidate compressive imagers at three representative values of SNR: low (s = 0.5), medium (s = 5.0), and high (s = 20.0) . . . 155
List of Figures
Figure 1.1. System layout of (a) a traditional imaging system and (b) a computational imaging system . . . 17
Figure 1.2. Extended depth of field imaging system layout (image examples are taken from Ref. [7]) . . . 17
Figure 2.1. Schematic depicting the effect of pixel-limited resolution: (a) optical PSF is impulse-like and (b) engineered optical PSF is extended . . . 27
Figure 2.2. Imaging system setup used in the simulation study . . . 30
Figure 2.3. Example simulated PSFs: (a) Conventional sinc²(·) PSF and (b)
Figure 2.11. Experimentally measured Rayleigh resolution versus number of frames for both the PRPEL and conventional imagers . . . 41
Figure 2.12. The USAF resolution target: (a) Group 0 element 1 and (b) Group 0 elements 2 and 3 . . . 42
Figure 2.13. Raw detector measurements obtained using USAF Group 0 element 1 from (a) the conventional imager and (b) the PRPEL imager . . . 43
Figure 2.14. LMMSE reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 44
Figure 2.15. Horizontal line scans through the USAF target and its LMMSE reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3 . . . 45
Figure 2.16. LMMSE reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 46
Figure 2.17. Richardson-Lucy reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 48
Figure 2.18. Richardson-Lucy reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 49
Figure 2.19. Horizontal line scans through the USAF target and its Richardson-Lucy reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3 . . . 50
Figure 2.20. (a) Rayleigh resolution and (b) RMSE versus number of frames for multi-frame imagers that employ smaller pixels and lower measurement SNR . . . 51
Figure 2.21. The optical PSF obtained using PRPEL with both narrowband (10 nm) and broadband (150 nm) illumination . . . 52
Figure 2.22. (a) Rayleigh resolution and (b) RMSE versus number of frames for broadband PRPEL and conventional imagers . . . 53
Figure 3.1. PSF-engineered multi-aperture imaging system layout . . . 57
Figure 3.2. Iris examples from the training dataset . . . 60
Figure 3.3. Examples of (a) iris-segmentation, (b) masked iris-texture region, (c) unwrapped iris, and (d) iris-code . . . 62
Figure 3.4. Illustration of FRR and FAR definitions in the context of intra-class and inter-class probability densities . . . 65
Figure 3.5. Optimized ZPEL imager with K = 1: (a) pupil-phase, (b) optical PSF, and (c) optical PSF of conventional imager . . . 70
Figure 3.6. Cross-section MTF profiles of optimized ZPEL imager with K = 1 . . . 71
Figure 3.7. Optimized ZPEL imager with K = 4: (a) pupil-phase and (b)
Figure 4.1. (a) A 256 × 256 image, (b) the compressed version of image in (a) using JPEG2000, and (c) 64 × 64 image obtained by rescaling image in (a) . . . 80
Figure 4.2. Block diagram of an imaging chain . . . 83
Figure 4.3. Example scenes from the deterministic encoder . . . 83
Figure 4.4. Example scenes from the stochastic encoder . . . 84
Figure 4.5. (a) mmse and (b) TSI versus signal to noise ratio for the scalar
T and position vector ρ and (b) clutter profile matrix Vc and mixing vector β . . . 90
Figure 4.7. Structure of T and ρ matrices for the two-class problem . . . 92
Figure 4.8. Structure of T and Λ matrices for the joint detection/localization problem . . . 94
Figure 4.9. Structure of T and Ω matrices for the joint classification/localization problem . . . 96
Figure 4.10. Example scenes: (a) Tank in the middle of the scene, (b) Tank in the top of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene . . . 98
Figure 4.11. Detection task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 101
Figure 4.12. Scene partitioned into four regions: (a) Tank in the top left region of the scene, (b) Tank in the top right region of the scene, (c) Tank in the bottom left region of the scene, and (d) Tank in the bottom right region of the scene . . . 103
Figure 4.13. Joint detection/localization task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 104
Figure 4.14. Classification task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 106
Figure 4.15. Joint classification/localization task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 107
Figure 4.16. Example scenes with optical blur: (a) Tank in the top of the scene, (b) Tank in the middle of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene . . . 108
Figure 4.17. Block diagram of a compressive imager . . . 109
Figure 4.18. Detection task: TSI for PC compressive imager versus signal to
versus signal to noise ratio . . . 115
Figure 4.22. Example textures (a) from each of the 16 texture classes and (b) within one of the texture classes . . . 116
Figure 4.23. TSI versus signal to noise ratio at various values of defocus . . . 117
Figure 4.24. TSI versus defocus at s = 10 and s = 4 for the texture classifi-
and cubic phase-mask imager with γ = 2.0 at (c) Wd = 0, (d) Wd = 3 . . . 119
Figure 4.26. Depth of Field and TSI versus γ parameter at s = 10 . . . 122
Figure 4.27. TSI versus defocus at s = 10: DOF of conventional imager and
Figure 5.5. Example scenes with optical blur and noise: (a) Tank in the topof the scene, (b) Tank in the middle of the scene . . . . . . . . . . . . . 136
Figure 5.6. Example projection vectors in the PC projection basis, clockwisefrom upper left, #2,#6,#16,#31. . . . . . . . . . . . . . . . . . . . . . . 140
Figure 5.7. TSI versus SNR for PC compressive imager . . . 141
Figure 5.8. Example projection vectors in the GMF projection basis, clockwise from upper left, #1, #16, #32, #64 . . . 143
Figure 5.9. Example projection vectors in the GFD1 projection basis, clockwise from upper left, #1, #10, #11, #14 . . . 146
Figure 5.10. Projection vector in the GFD2 projection basis . . . 147
Figure 5.11. Example projection vectors in the IC projection basis, clockwise
Figure 5.12. Optimized compressive imagers: TSI versus SNR for candidate CI system and conventional imager . . . 150
Figure 5.13. Optimal photon allocation vectors for PC compressive imager at: (a) s = 0.5, (b) s = 5.0, and (c) s = 20.0 . . . 151
Figure 5.14. Optimal photon allocation vectors for GFD1 compressive imager at: (a) s = 0.5, (b) s = 5.0, and (c) s = 20.0 . . . 156
Figure 5.15. Lower bound on probability of error as a function of TSI . . . 158
Figure 5.16. Comparison of probability of error obtained via Bayes' detector versus lower bound obtained by Fano's inequality as a function of SNR . . . 159
Abstract
The traditional approach to imaging system design places the sole burden of image
formation on optical components. In contrast, a computational imaging system relies
on a combination of optics and post-processing to produce the final image and/or
output measurement. Therefore, the joint-optimization (JO) of the optical and the
post-processing degrees of freedom plays a critical role in the design of computa-
tional imaging systems. The JO framework also allows us to incorporate task-specific
performance measures to optimize an imaging system for a specific task. In this
dissertation, we consider the design of computational imaging systems within a JO
framework for two separate tasks: object reconstruction and iris-recognition. The
goal of these design studies is to optimize the imaging system to overcome the perfor-
mance degradations introduced by under-sampled image measurements. Within the
JO framework, we engineer the optical point spread function (PSF) of the imager,
representing the optical degrees of freedom, in conjunction with the post-processing
algorithm parameters to maximize the task performance. For the object reconstruc-
tion task, the optimized imaging system achieves a 50% improvement in resolution
and nearly 20% lower reconstruction root-mean-square error (RMSE) as compared to
the un-optimized imaging system. For the iris-recognition task, the optimized imaging
system achieves a 33% improvement in false rejection ratio (FRR) at a fixed false alarm
ratio (FAR) relative to the conventional imaging system. The effect of the performance
measures like resolution, RMSE, FRR, and FAR on the optimal design highlights
the crucial role of task-specific design metrics in the JO framework. We introduce a
fundamental measure of task-specific performance known as task-specific information
(TSI), an information-theoretic measure that quantifies the information content of an
image measurement relevant to a specific task. A variety of source-models are derived
to illustrate the application of a TSI-based analysis to conventional and compressive
imaging (CI) systems for various tasks such as target detection and classification. A
TSI-based design and optimization framework is also developed and applied to the
design of CI systems for the task of target detection, yielding a six-fold performance
improvement over the conventional imaging system at low signal-to-noise ratios.
Chapter 1
Introduction
1.1. Evolution of Imaging Systems
The first imaging systems simply imaged a scene onto a screen for viewing purposes.
One of the earliest imaging devices, the "camera obscura," invented in the 10th century,
relied on a pinhole and a screen to form an inverted image [1]. The next signifi-
cant step in the evolution of imaging systems was the development of photo-sensitive
material that allowed the image to be recorded for later viewing. The perfection
of photographic film gave birth to a multitude of new applications, ranging from
medical imaging using X-rays for diagnosis purposes to aerial imaging for surveil-
lance. Development of the charge-coupled device (CCD) in 1969 by George Smith
and Willard Boyle at Bell Labs [2], combined with advances in communication
theory revolutionized imaging system design and its applications. The electronic
recording of an image allowed it to be stored digitally and transmitted over long dis-
tances reliably using digital communication systems. Furthermore, the advent of
computer-aided optical design, coupled with the development of modern machining
tools and new optical materials such as plastics/polymers, allowed imaging system
designs that were light-weight, low-cost, and high-performance. This led to an ex-
plosion of applications, such as medical imaging for diagnosis, military applications
involving surveillance, tracking, recognition, weapon guidance, and a host of com-
mercial imaging applications such as security, consumer photography, automotive,
aerospace, and entertainment. Advances in the semiconductor industry have allowed
the processing power of computers and embedded processors to grow at an expo-
nential rate following Moore’s law [3]. This has led to real-time implementations of
sophisticated image processing algorithms that can further enhance the capabilities
of digital imaging systems. The post-processing algorithms, operating on acquired
images, have been developed for a variety of tasks such as pattern-recognition in se-
curity and surveillance, image restoration, detection in medical diagnosis, estimation
in computer vision, and compression of still images and video for storage/transmission
applications. However, due to the separate evolutionary paths of imaging system design
and image processing technology, they have been viewed as two separate processes by
imaging system designers. As a result, there has been a disconnect between the imag-
ing system design and the post-processing algorithm design. Recently, this disconnect
has been addressed with the emergence of a new imaging system paradigm known
as computational imaging [4, 5, 6]. Computational imaging offers several advantages
over traditional imaging techniques, especially when dealing with specific tasks. This
dissertation investigates the task-specific aspects of design methodologies for compu-
tational imaging system design. Before discussing the specific contributions of this
dissertation we begin by defining computational imaging and outlining its various
benefits relative to traditional imaging.
1.2. Computational Imaging and Task-specific Design
In a traditional imaging system, the optics has the sole burden of the image formation.
The post-processing algorithm, which is not an essential part of the imaging system,
operates on the image measurement to extract the desired information. Note that the
optics and the post-processing algorithms are designed separately. Fig. 1.1(a) shows
the architecture of a traditional imaging system. In contrast, a computational imaging
system involves the use of both a front-end optical system and a post-processing
algorithm in the image formation process. As shown in Fig. 1.1(b), the post-processing
algorithm forms an integral part of the overall imaging system design. Here the
front-end optics does not yield the final image directly but instead relies on the
Figure 1.1. System layout of (a) a traditional imaging system and (b) a computational imaging system.
Figure 1.2. Extended depth of field imaging system layout (image examples are taken from Ref. [7]).
post-processing sub-system to form the image. The extended depth of field (EDOF)
imaging system, described in Ref. [4], is an example of a computational imaging
system. Fig. 1.2 shows the system layout of this EDOF imaging system. Note that
it consists of a front-end optical system to form an intermediate image on the sensor
array that is subsequently processed by an image reconstruction algorithm to yield
the final focused image. The EDOF is achieved by modifying a traditional optical
imaging system with the addition of a cubic-phase mask in the aperture stop. The
resulting optical point spread function (PSF) has a larger support compared to a
traditional PSF and therefore, the optical image formed on the sensor array appears
to be blurred. However, as the optical PSF is invariant over an extended range of
object distances, a simple reconstruction filter can be used in the post-processing step
to form the final image that is focused throughout an extended object volume. This
imaging system demonstrates the potential of the computational imaging paradigm to
yield designs with novel capabilities, like EDOF, that simply could not be achieved by
a traditional imaging system without significant performance trade-offs. Nevertheless,
it is important to recognize that this EDOF imaging system does not fully exploit
the capabilities of the computational imaging paradigm.
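The defocus invariance that makes this design work can be illustrated with a short numerical sketch. The Python fragment below is a toy Fourier-optics model, not the actual system of Ref. [4]: the grid size and the mask strength (alpha_waves = 20) are arbitrary illustrative choices. It computes the incoherent PSF as the squared magnitude of the Fourier transform of the pupil function and compares how much the MTF of a conventional pupil and of a cubic-phase pupil change under three waves of defocus.

```python
import numpy as np

N = 256
x = np.linspace(-1, 1, N)                      # normalized pupil coordinates
X, Y = np.meshgrid(x, x)
aperture = ((X**2 + Y**2) <= 1.0).astype(float)  # clear circular aperture

def psf(defocus_waves, alpha_waves=0.0):
    """Incoherent PSF of a pupil with defocus and an optional cubic phase mask."""
    phase = 2 * np.pi * (defocus_waves * (X**2 + Y**2)
                         + alpha_waves * (X**3 + Y**3))
    pupil = aperture * np.exp(1j * phase)
    field = np.fft.fft2(pupil, s=(2 * N, 2 * N))  # zero-pad before transforming
    p = np.abs(field)**2
    return p / p.sum()

def mtf(defocus_waves, alpha_waves=0.0):
    """Modulation transfer function: magnitude of the OTF, normalized to 1 at DC."""
    H = np.abs(np.fft.fft2(psf(defocus_waves, alpha_waves)))
    return H / H[0, 0]

def defocus_sensitivity(alpha_waves):
    """Mean absolute MTF change between in-focus and 3 waves of defocus."""
    return np.mean(np.abs(mtf(0.0, alpha_waves) - mtf(3.0, alpha_waves)))

print("conventional pupil MTF change:", defocus_sensitivity(0.0))
print("cubic-phase pupil MTF change :", defocus_sensitivity(20.0))
```

The cubic-phase pupil shows a much smaller MTF change over the same defocus range, which is why a single, fixed reconstruction filter suffices over an extended object volume.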
The true potential of computational imaging can only be realized via a joint-
optimization of the optical and the post-processing degrees of freedom. The joint
design methodology yields a larger and richer design space for the designer. In order
to understand this advantage, let us examine the multi-dimensional design space de-
picted in Fig. 1.3: the optical design parameters are represented on the vertical axis
and the post-processing design parameters are shown on the horizontal axis. Note
that the traditional approach constrains the designer to a relatively small design sub-
space, outlined in brown and green. The region outlined in brown represents a design
sub-space resulting from optimization of only optical parameters without any con-
sideration to the degrees of freedom available in the post-processing domain. In the
traditional design methodology, the optical design is followed by the optimization of
Figure 1.3. A two-dimensional illustration of the joint optical and post-processing design space.
post-processing parameters, represented by the sub-space in the green region. This
approach does not guarantee an overall optimal system design and usually leads to
sub-optimal system performance. In contrast, the joint-optimization design method
combines the degrees of freedom available from the optical and the post-processing do-
mains, expanding the design space to a larger volume, represented by the red outlined
region. This larger design space encompasses potential designs that offer benefits
such as lower system cost, reduced complexity, improved yields, and, perhaps most
importantly, optimal/near-optimal system performance.
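The gap between the sequential and the joint design paths can be made concrete with a deliberately contrived toy problem. In the sketch below, task_metric is a hypothetical task-performance surface over one optical parameter a and one post-processing parameter b, and fidelity_metric is the image-fidelity proxy a traditional optics-only stage might optimize; neither corresponds to a real imager model.

```python
import numpy as np

# Toy task metric J(a, b): peak task performance requires a = 1,
# i.e., a deliberately non-impulse-like optical setting.
def task_metric(a, b):
    return -((a - 1.0)**2 + (b - a)**2)

# Proxy metric used by the traditional optics-only stage: raw image
# fidelity, maximized by the sharpest PSF (a = 0), ignoring post-processing.
def fidelity_metric(a):
    return -a**2

grid = np.linspace(-2.0, 2.0, 401)

# Sequential design: optimize optics against fidelity, then post-processing.
a_seq = grid[np.argmax([fidelity_metric(a) for a in grid])]
b_seq = grid[np.argmax([task_metric(a_seq, b) for b in grid])]

# Joint design: search the full (a, b) design space against the task metric.
A, B = np.meshgrid(grid, grid)
J = task_metric(A, B)
i, j = np.unravel_index(np.argmax(J), J.shape)
a_jnt, b_jnt = A[i, j], B[i, j]

print(f"sequential: a={a_seq:.2f}, b={b_seq:.2f}, J={task_metric(a_seq, b_seq):.2f}")
print(f"joint:      a={a_jnt:.2f}, b={b_jnt:.2f}, J={task_metric(a_jnt, b_jnt):.2f}")
```

The sequential path commits to a = 0 (the sharpest PSF) and can never recover, while the joint search over both parameters reaches the global optimum. This is exactly the sub-space versus full-space picture of Fig. 1.3.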
Another key aspect of the joint design methodology is that it inherently supports
a task-specific approach to imaging system design. To support this assertion let us
consider an example of imaging system design for a classification task. The traditional
design approach would involve: 1) designing an optical imaging system to maximize the
fidelity of the output image measurement and 2) designing a classification algorithm that
operates on the image measurement and minimizes the probability of misclassifica-
tion. Note that in this approach the optical imaging system and the classification
algorithm are designed separately (and sequentially). Typically, a classification al-
gorithm involves two steps: the feature extraction step and the classification step.
In the feature extraction step, the original high-dimensional image measurement is
transformed (compressed) into a low-dimensional data vector that is referred to as
a feature vector. This dimensionality reduction step effectively lowers the computa-
tional complexity of the subsequent classification step. Acquiring a high-dimensional
image measurement and subsequently reducing it to a low-dimensional feature clearly
represents an inefficient data measurement process and a poor utilization of optical
design resources. Thus, the traditional approach results in an imaging system design
with sub-optimal performance for the classification task. Alternatively, a more logi-
cal approach would suggest an optical imaging system design that directly measures
the optimal low-dimensional feature(s) for post-processing so as to maximize the
task performance within the system constraints. This approach yields a computa-
tional imaging system design that offers two main advantages: a) a direct feature
measurement yields a higher measurement signal to noise ratio (SNR) and b) the
number of detectors required is significantly reduced. The high measurement SNR
directly translates into improved system performance. This type of imaging system,
referred to as a feature-specific imager (FSI) or a compressive imager, is an exam-
ple of a computational imaging system [6]. This example clearly illustrates that the
computational imaging paradigm supports and enables a task-specific approach to
imaging system design.
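The measurement-SNR advantage of direct feature measurement can be illustrated with a small Monte Carlo sketch. The model below is an assumption-laden toy, not the FSI analysis itself: a single binary-mask feature, identical additive detector noise in both architectures, and no photon-count constraints; the scene size and noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pixels = 64                    # toy scene dimension
mask = np.zeros(n_pixels)
mask[:16] = 1.0                  # one binary "feature": sum over 16 pixels

x = rng.uniform(0.0, 1.0, n_pixels)   # toy scene
true_feature = mask @ x

sigma = 0.1                      # additive noise std per detector reading
trials = 20000

# Conventional path: detect every pixel (64 noisy readings), then extract
# the feature in post-processing.
noisy_pixels = x + sigma * rng.normal(size=(trials, n_pixels))
f_conv = noisy_pixels @ mask

# Feature-specific path: the optics forms the inner product before detection,
# so the noise enters once, at a single feature detector.
f_direct = true_feature + sigma * rng.normal(size=trials)

rmse_conv = np.sqrt(np.mean((f_conv - true_feature)**2))
rmse_direct = np.sqrt(np.mean((f_direct - true_feature)**2))
print(f"post-processing extraction RMSE: {rmse_conv:.4f}")
print(f"direct feature measurement RMSE: {rmse_direct:.4f}")
```

Because the conventional path sums 16 noisy pixel readings, its feature-noise standard deviation grows by a factor of √16 = 4 relative to detecting the optically formed inner product directly, under this toy noise model.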
1.3. Main Contributions
The task-specific approach to computational imaging system design is an emerging
area of research. Barrett et al. have conducted an extensive task-based analysis of
imaging systems for detection and classification tasks in the area of medical imag-
ing [8, 9, 10]. Their focus has been primarily on the performance of ideal Bayesian
observers and human observers. However, the application of the task-specific ap-
proach within a joint-optimization design framework is a relatively unexplored area.
In this dissertation, we apply a task-specific approach to maximize the performance
of a computational imaging system for a given task within a joint-optimization design
framework. We consider two separate example tasks in this work: a reconstruction
task and a classification task. In each case, the computational imaging system is
optimized to maximize the task performance as measured by a task-specific metric.
For example, the reconstruction task employs the traditional root mean square error
(RMSE) and resolution metrics to quantify the quality of the reconstructed images.
In the case of the classification task, false rejection ratio (FRR) and false alarm ra-
tio (FAR) statistics are used as task-specific metrics to evaluate the overall system
performance. In addition to the two design studies, a novel information theoretic task-
specific metric is also derived. A formal design framework based on this task-specific
metric is developed and applied to the design of a compressive imaging system for the
task of target detection. More specifically, the main contributions of this dissertation
work are as follows:
1. The application of the optical PSF engineering method to optimize the imaging
system performance for a specific task is considered. This task-specific method
is first applied to a reconstruction task to overcome the distortions introduced by
the detector under-sampling in the sensor array. Simulation results show nearly
a 20% improvement in RMSE for the optimized imaging system design relative
to the conventional imaging system. The optical PSF engineering method is
also successfully applied to the design of an iris-recognition imaging system
to minimize the impact of detector under-sampling on the overall performance.
The optimized iris-recognition imaging system design achieves a 33% lower FRR
compared to the conventional imaging system design.
2. Development of a formal task-specific framework for computational imaging sys-
tem design based on a novel information theoretic task-specific metric. This
metric, known as task-specific information (TSI), quantifies the information
content of an imaging system measurement relevant to a specific task. The
TSI metric can also be used to derive an upper-bound on the performance of
any post-processing algorithm for a specific task. Therefore, within the pro-
posed design framework, the TSI metric can be used to improve the upper-bound
on imaging system performance thereby allowing the designer to optimize the
imaging system for a particular task. The utility of the TSI metric is investi-
gated for a variety of target detection and classification tasks. The application
of the TSI-based design framework to extend the depth of field of an imager by
optical PSF engineering is also considered.
3. The TSI-based design framework is used to design several compressive imaging
systems for a target detection task. The resulting optimized imaging system
designs show a significant performance improvement over the un-optimized
imaging system designs.
1.4. Dissertation Organization
The rest of the dissertation is organized as follows:
• Chapter 2 presents the application of the optical PSF engineering method,
within a multi-aperture imaging architecture, to overcome the distortions due
to under-sampling in the detector array. The reconstruction task is considered
in this study. RMSE and resolution are used as task-specific metrics during
the imaging system optimization process. In the simulation study, the opti-
mized imaging system designs show significant improvement, both in terms of
RMSE and resolution metrics, compared to an imaging system with a traditional
diffraction-limited PSF. The experimental results support the performance im-
provements predicted by the simulation study.
• The task of iris-recognition, in the presence of detector under-sampling, is con-
sidered in Chapter 3. A multi-aperture imaging system in conjunction with
optical PSF engineering is employed to optimize the overall performance of the
imaging system. The task-specific design framework employs the FAR and FRR
metrics to quantify the imaging system performance in this study. The simula-
tion results show a substantial improvement in iris-recognition performance as
a result of PSF optimization compared to the design that employs a traditional
optical PSF.
• As emphasized by the design studies described in Chapter 2 and Chapter 3, the
performance metric plays a crucial role in the task-specific approach to imaging
system design. In Chapter 4, the notion of task-specific information is introduced
as an objective metric for task-specific design. TSI is an information theoretic
metric that is derived using the recently discovered relationship between esti-
mation theory and mutual-information. This metric is applied to a variety of
detection and classification tasks to demonstrate its utility for task-specific per-
formance evaluation. A brief analysis of a TSI-based optical PSF engineering
approach for extending the depth of field of an imager is also presented in the
context of a texture-classification task.
• Chapter 5 presents a formal task-specific design framework that utilizes the
TSI metric to optimize a compressive imaging system for a target detection
task. The optimized imaging system designs deliver substantial performance
improvement over the conventional design. The implementation issues regarding
compressive imaging systems and the computational complexity associated with
the TSI-based design framework are also discussed.
• Chapter 6 draws conclusions from the various aspects of the task-specific ap-
proach investigated in this dissertation and provides direction for future work
relevant to the further development of the joint-optimization design framework
for computational imaging systems.
Chapter 2
Optical PSF Engineering: Object
Reconstruction Task
The optical PSF represents a degree of freedom that can be exploited to optimize
an imaging system for a specific task. In a digital imaging system, the detector
can limit the overall resolution when the optical PSF is smaller than the extent of
the detector, leading to under-sampling or aliasing. In this chapter, we apply the
optical PSF engineering method to improve the overall system resolution beyond
the detector-limit and also increase the object reconstruction fidelity in such under-
sampled imaging systems.
2.1. Introduction
In a traditional (i.e., film-based) design paradigm the optical PSF is typically viewed
as the resolution-limiting element, and therefore optical designers strive for an impulse-
like PSF. Digital imagers, however, employ photodetectors that are sometimes large
relative to the extent of the optical PSF and in such cases the resulting pixel-blur
and/or aliasing can become the dominant distortion limiting overall imager perfor-
mance. This is illustrated by Fig. 2.1(a). This figure is a one-dimensional depiction
of the image formed by a traditional camera when two point objects are separated
by a sub-pixel distance. We see that the resulting impulse-like PSFs are imaged onto
essentially the same pixel leading to spatial ambiguity and hence a loss of resolution.
In such an imager the resolution is said to be pixel-limited [11].
The effect depicted in Fig. 2.1(a) may also be understood by noting that the
detector array under-samples the image and therefore introduces aliasing. The gen-
eralized sampling theorem by Papoulis [12] provides a mechanism through which this
aliasing distortion can be mitigated. The theorem states that a bandlimited signal
(−Ω ≤ ω ≤ Ω) can be perfectly reconstructed from the sampled outputs of R
non-redundant (i.e., diverse) linear channels, each of which employs a sample rate of
2Ω/R (i.e., each of the R signals is under-sampled at 1/R of the Nyquist rate). This theo-
rem suggests that the aliasing distortion can be reduced by combining multiple under-
sampled/low-resolution images to obtain a high-resolution image. A detailed descrip-
tion of this technique can be found in Borman [13]. This approach has been used by
several researchers in the image processing community [11, 14, 15, 16, 17, 18] and was
recently adopted for use in the TOMBO (thin observation module by bound optics)
imaging architecture [19, 20]. The TOMBO system was designed to simultaneously
acquire multiple low-resolution images of an object through multiple lenslets in an
integrated aperture. The resulting collection of low-resolution measurements is then
processed to yield a high-resolution image. Within the TOMBO system the multiple
non-redundant images were obtained via a diverse set of sub-pixel shifts. The use of
other forms of diversity including magnification, rotation, and defocus has also been
considered [21]. However, it is important to note that these methods of obtaining
measurement diversity do not fully exploit the optical degrees of freedom available to
the designer. The approach described in this chapter will utilize PSF engineering in
order to obtain additional diversity from a set of sub-pixel shifted measurements.
The optical PSF of a digital imager may be viewed as a mechanism for encoding
object information so as to better tolerate distortions introduced by the detector ar-
ray. From this viewpoint an impulse-like optical PSF may be sub-optimal [22, 23].
To support this assertion, consider the scenario depicted in Fig. 2.1(b), which shows
an image of two point objects formed using a non-impulse-like PSF. The two point
objects are displaced by the same amount as in Fig. 2.1(a). We see that the use of
an extended PSF enables the extraction of sub-pixel position information from the
sampled detector outputs. For example, a simple correlation-based processor [24] can
Figure 2.1. Schematic depicting the effect of pixel-limited resolution: (a) optical PSF is impulse-like and (b) engineered optical PSF is extended.
yield the PSF centroid/point-source location to sub-pixel accuracy, given sufficient
measurement signal-to-noise ratio (SNR). In this chapter, we study the performance
of one such extended PSF design obtained by placing a pseudo-random phase mask
in the aperture-stop of a conventional imager. Our choice of pseudo-random phase
mask has been motivated in part by the pseudo-random sequences found in CDMA
multi-user communication systems [25, 26] and in part by a study in Ref. [27] which
found pseudo-random phase masks to be efficient in an information-theoretic sense
for imaging sparse volumetric scenes. In the context of multi-user communications,
pseudo-random sequences are used to encode the information of each end-user. These
encoded messages are combined and transmitted over a common channel. The struc-
ture of the encoding is then used at the receiver side to extract individual messages
from the superposition. In a digital imaging system, the optical PSF serves a simi-
lar purpose in terms of encoding the location of individual resolution elements that
comprise the object. The pixels within a semiconductor detector array measure a
superposition of responses from each resolution element in the object. Further, the
spatial integration across the finite pixel size of the detector array leads to spatial
blurring. These signal transformations imposed by the detector array must be in-
verted via decoding. In the next section, we describe the mathematical model of the
imaging system and the pseudo-random phase mask used to engineer the extended
optical PSF.
2.2. Imaging System Model
Consider a linear model of a digital imaging system. Mathematically, we can represent
the system as
g = Hcdfc + n, (2.1)
where fc is the continuous object, g is the detector-array measurement vector, Hcd
is the continuous-to-discrete imaging operator, and n is the additive measurement-noise
vector. For simulation purposes we use a discrete representation f of the continuous
object fc. This discrete representation f can be obtained from fc as follows [28]
f_i = ∫_{S∩Φ_i} f_c(r) φ_i(r) d²r, (2.2)
where S is the object support, φ_i is the ith analysis basis function, Φ_i is the support of φ_i, and f_i is the ith element of the object vector f. Note that we
obtain an approximation fa of the original continuous object fc from its discrete
representation f as follows [28]
f_a(r) = Σ_{i=1}^{N} f_i ψ_i(r), (2.3)
where N is the dimension of the discrete object vector and ψi is a synthesis basis
set which can be chosen to be the same as the analysis basis set φi. Here we use the
pixel function to construct our analysis and synthesis basis sets. The pixel function
is defined as
φ_i(r) = (1/Ω_r) rect((r − iΩ_r)/Ω_r) (2.4)
and
∫_{Φ_i∩Φ_j} φ_i(r) φ_j(r) d²r = δ_ij,
where 2Ωr is the size of the resolution element in the continuous object that can
be accurately represented by this choice of basis set. Note that the pixel functions
φi form an orthonormal basis. We set the object resolution element size equal to
the diffraction-limited optical resolution of the imager to ensure that the discrete
representation of the object does not incur any loss of spatial resolution. Here we
adopt the Rayleigh criterion [29] to define resolution. Henceforth, all references to
resolution will represent the Rayleigh resolution.
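For concreteness, the analysis step of Eq. (2.2) and the synthesis step of Eq. (2.3) can be sketched numerically with the pixel-function basis; the 1D object, grid sizes, and normalization below are illustrative assumptions rather than the parameters used in this study:

```python
import numpy as np

# Hypothetical band-limited "continuous" object on a fine grid.
fine = 1024
r = np.linspace(0.0, 1.0, fine, endpoint=False)
f_c = np.sin(2 * np.pi * 3 * r) + 0.5 * np.cos(2 * np.pi * 7 * r)

N = 64                    # number of resolution elements
w = fine // N             # fine samples per resolution element

# Analysis (Eq. 2.2): project onto the pixel (rect) basis; with the
# normalisation absorbed, each coefficient is the cell average.
f = f_c.reshape(N, w).mean(axis=1)

# Synthesis (Eq. 2.3): piecewise-constant approximation f_a of f_c.
f_a = np.repeat(f, w)
print(np.max(np.abs(f_a - f_c)))   # small when f_c varies slowly per cell
```

With the resolution-element size matched to the optical resolution, as above, the discrete vector f loses no spatial resolution relative to f_c.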
The imaging equation is modified to include the discrete object representation as
follows
g = Hf + n, (2.5)
where H is the equivalent discrete-to-discrete imaging operator: H is therefore a
matrix. The imaging operator H includes the optical PSF, the detector PSF, and
the detector sampling. The vectors f , g, and n are lexicographically arranged one-
dimensional representations of the two-dimensional object, image, and noise arrays,
respectively.
Consider a diffraction-limited PSF of the form h(r) = sinc²(r/R), with Rayleigh
resolution R. The Nyquist sampling theorem requires the detector spacing to be
at most R/2. When this requirement is met, the imaging operator H has full rank
(condition-number → 1) allowing a reconstruction of the object up to the optical
resolution. However, when the optical PSF has an extent (2R) that is smaller than
the detector spacing, the image measurement is aliased and the imaging operator H
becomes singular (condition-number → ∞). Under these conditions the object cannot
be reconstructed up to the optical resolution. Also note that due to under-sampling
the imaging operator H is no longer shift-invariant but only block-wise shift-invariant
even if the imaging optics itself is shift-invariant.
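A small numerical sketch (with hypothetical dimensions, not the system parameters used later) makes the rank deficiency caused by under-sampling concrete:

```python
import numpy as np

N, F = 64, 4     # object samples and under-sampling factor (hypothetical)

# Impulse-like optical PSF, narrower than one detector pixel.
psf = np.array([0.1, 0.8, 0.1])
rows = np.stack([np.roll(np.pad(psf, (0, N - psf.size)), i - 1) for i in range(N)])
B = rows.T       # circular optical-blur matrix: column i is the PSF centred at i

# Detector integration and sampling: each detector sums F object samples.
D = np.kron(np.eye(N // F), np.ones((1, F)))

H = D @ B        # single-frame operator, as in Eq. (2.5)
print(H.shape, np.linalg.matrix_rank(H))   # (16, 64) 16 — rank far below N
```

Because rank(H) = M = N/F, a single under-sampled frame cannot determine the object up to the optical resolution, matching the singular-operator argument above.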
As mentioned in the previous section, one method to overcome the resolution
constraint imposed by the pixel-size is to use multiple sub-pixel shifted image mea-
surements. The sub-pixel shift δ may be obtained either by a shift in the imager
position or through object movement. The ith sub-pixel shifted image measurement
Figure 2.2. Imaging system setup used in the simulation study: object, lens system, pseudo-random phase mask at the aperture stop, and detector array.
gi with shift δi can be represented as
gi = Hif + ni, (2.6)
where Hi represents the imaging operator associated with the sub-pixel shift δi.
For a set of K such measurements we can write the composite image measurement by concatenating the individual vectors as g = [g_1^T g_2^T · · · g_K^T]^T and similarly n = [n_1^T n_2^T · · · n_K^T]^T. The overall multi-frame composite imaging system can be expressed as
g = Hcf + n, (2.7)
where Hc is the composite imaging operator. By combining several sub-pixel shifted
image measurements, the condition number of the composite imaging operator Hc
can be progressively improved and the overall resolution can be increased towards
the optical resolution limit. Ideally, the sub-pixel shifts should be chosen in multiples of D/K so as to minimize the condition-number of the forward imaging operator Hc, where D is the detector spacing [30].
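Continuing the sketch, stacking sub-pixel-shifted copies of a single under-sampled operator progressively restores rank. The 75% fill-factor detector kernel below is an assumption made so that the pixel response has no in-band spectral nulls on this grid; it is not the fill factor used in the study:

```python
import numpy as np

N, F = 64, 4                        # hypothetical grid and under-sampling factor
psf = np.array([0.1, 0.8, 0.1])     # impulse-like optical PSF

rows = np.stack([np.roll(np.pad(psf, (0, N - psf.size)), i - 1) for i in range(N)])
B = rows.T                          # circular optical-blur matrix

# Detector kernel with 75% fill factor (assumed so the pixel response has
# no spectral nulls here); each detector pitch spans F object samples.
pix = np.array([1.0, 1.0, 1.0, 0.0])
D = np.stack([np.roll(np.pad(pix, (0, N - pix.size)), k * F) for k in range(N // F)])

def shifted_operator(s):
    """Single-frame operator H_i for a sub-pixel object shift of s samples."""
    S = np.roll(np.eye(N), s, axis=1)
    return D @ B @ S

ranks = []
for K in (1, 2, 4):
    shifts = [k * F // K for k in range(K)]          # multiples of D/K
    Hc = np.vstack([shifted_operator(s) for s in shifts])
    ranks.append(int(np.linalg.matrix_rank(Hc)))
print(ranks)                        # → [16, 32, 64]: rank recovers with K
```

At K = F with shifts in multiples of D/K the composite operator reaches full rank, which is the discrete analogue of the Papoulis multi-channel result quoted earlier.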
We are interested in designing an extended optical PSF for use within the sub-pixel
shifting framework. The use of an extended optical PSF can improve the condition-
number of the imaging operator Hc. We consider an extended optical PSF obtained
by placing a pseudo-random phase mask in the aperture-stop of a conventional imager,
as shown in Fig. 2.2. For simulation purposes the aperture-stop is defined on a discrete
spatial grid. Therefore, the pseudo-random phase mask is represented by an array,
Figure 2.3. Example simulated PSFs (amplitude versus spatial dimension in µm): (a) conventional sinc²(·) PSF and (b) PSF obtained from the PRPEL imager.
each element of which corresponds to the phase at a given position on the discrete
spatial grid. The pseudo-random phase mask is synthesized in two steps: (1) generate
a set of independent, identically distributed random numbers distributed uniformly
on the interval [0, ∆] to populate the phase array and (2) convolve this phase array
with a Gaussian filter kernel, i.e., a Gaussian function with standard deviation ρ
sampled on the discrete spatial grid. The resulting set of random numbers defines
the phase distribution Φ(r) of the pseudo-random phase mask. The phase mask is
thus a realization of a spatial Gaussian random process which is parameterized by
its roughness ∆ and correlation length ρ. The auto-correlation function of this phase
distribution is given by
R_ΦΦ(r) = (∆²/12) exp[−r²/(4ρ²)]. (2.8)
The incoherent PSF is related to the phase-mask profile Φ(r) as follows [28]
psf(r) =Ac
(λf)4
∣∣∣∣Tpupil
(− r
λf
)∣∣∣∣2
, (2.9)
Tpupil(ω) = F exp[j2π(nr − 1)Φ(r)/λ]tap(r) , (2.10)
where Ac is normalization constant with units of area, nr is the refractive index of
the lens, f is the back focal length, tap(r) is the aperture function and F denotes the
forward Fourier transform operator.
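The two-step mask synthesis and the PSF computation of Eqs. (2.9) and (2.10) can be sketched in 1D; the grid size, refractive index, and aperture geometry below are illustrative assumptions, not the values used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

n_grid = 512                   # discrete aperture grid (hypothetical size)
wav = 1.0                      # centre wavelength, normalised units
delta = 1.5 * wav              # roughness parameter Δ
rho = 10.0                     # correlation length ρ, in grid samples (assumed)

# Step 1: i.i.d. phase heights, uniform on [0, Δ].
phase = rng.uniform(0.0, delta, n_grid)

# Step 2: convolve with a Gaussian kernel of standard deviation ρ.
x = np.arange(-4 * int(rho), 4 * int(rho) + 1)
kernel = np.exp(-x**2 / (2 * rho**2))
kernel /= kernel.sum()
phase = np.convolve(phase, kernel, mode='same')

# Incoherent PSF from the generalised pupil, Eqs. (2.9)-(2.10), up to scale.
n_r = 1.5                                        # assumed refractive index
t_ap = np.zeros(n_grid)
t_ap[n_grid // 4: 3 * n_grid // 4] = 1.0         # clear central aperture
pupil = t_ap * np.exp(1j * 2 * np.pi * (n_r - 1) * phase / wav)
psf = np.abs(np.fft.fftshift(np.fft.fft(pupil))) ** 2
psf /= psf.sum()                                 # unit-energy normalisation
```

Increasing delta spreads the PSF energy over a wider support, which is the roughness/SNR trade explored next.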
Fig. 2.3(a) shows a simulated impulse-like PSF and Fig. 2.3(b) an extended PSF
resulting from simulating a pseudo-random phase mask with parameters ∆ = 1.5λc
and ρ = 10λc, where λc is the operating center wavelength. Here we set λc = 550 nm
and the imager F/# = 1.8. Assuming a detector size of 7.5 µm, the support of the
extended PSF spans roughly six detectors, in contrast with a sub-pixel extent
of 2 µm for the impulse-like PSF. The extended PSF will therefore accomplish the
desired encoding; however, it will do so at the cost of measurement SNR. Because the
extended PSF is spread over several pixels, its photon count per detector is lower than
that for the impulse-like PSF for a point-like object. Assuming a constant detector
noise, the measurement SNR per detector for the extended PSF is thus lower than
that of the impulse-like PSF. For more general objects, the extended PSF results in
a reduced contrast image with a commensurate SNR reduction, though smaller than
for point-like objects. In the next section, we present a simulation study to quantify
the tradeoff between the overall imaging resolution and the SNR for two candidate
imagers that use multiple sub-pixel shifted measurements: (a) the conventional imager
and (b) the pseudo-random phase enhanced lens (PRPEL) imager.
2.3. Simulation results
For the purposes of the simulation study, we consider only one-dimensional objects
and image measurements. The target imaging system has a modest specification with
an angular resolution of 0.2 mrad and an angular field of view (FOV) of 0.1 rad. The
conventional imager uses a lens of F/# = 1.8 and back focal length 5mm. We assume
that the lens is diffraction-limited and the optical PSF is shift-invariant. The detector
array in the image plane has a pixel size of 7.5µm with a full-well capacity (FWC) of
45000 electrons and a 100% fill factor. We further assume that the imager’s spectral
bandwidth is limited to 10 nm centered at λc =550 nm. For the PRPEL imager the
only modification is that the lens is followed by a pseudo-random phase mask with
parameters ∆ and ρ.
We assume a shot-noise limited SNR = 20 log₁₀(√FWC) ≈ 46 dB, given by the FWC
of the detector element. The shot-noise is modeled as equivalent AWGN with variance
σ² = FWC. The under-sampling factor for this imager is F = 15. This implies that
for an object vector f of size N × 1 the resulting image measurement vector g_i is of size
M × 1, where M = N/F. For the target imager, these values are N = 512 and M = 34.
Note that the block-wise shift-invariant imaging operator Hc is of size KM ×N .
To improve the overall imager performance we consider multiple sub-pixel shifted
image measurements or frames. These frames result from moving the imager with
respect to the object by a sub-pixel distance δi. Here it is important to constrain
the number of photons per frame to ensure a fair comparison among imagers using
multiple frames. We have two options: (a) assume that each imager has access to
the same finite number of photons and (b) assume that each frame of each imager
has access to the same finite number of photons. Option (b) may be physically realizable under
certain conditions; however, the results that are obtained will be unable to distinguish
between improvements arising from frame diversity versus improvements arising from
increased SNR. We therefore utilize option (a) because it is the only option that
allows us to study how best to use fixed photon resources. As a result, the photon
count for each frame is normalized to F/K in this simulation study.
The inversion of the composite imaging equation, Eq. (2.7), is based on the optimal linear-
minimum-mean-squared-error (LMMSE) operator W. The resulting object estimate
is given by
f̂ = Wg, (2.11)
where W is defined as [31]
W = R_f H_c^T (H_c R_f H_c^T + R_n)^{−1}. (2.12)
Rf is the auto-correlation matrix for the object vector f and Rn is the auto-correlation
matrix of the noise vector n. Because the composite imaging operator Hc is not shift-
Figure 2.4. Reconstruction incorporates object priors: (a) object class used for training and (b) log power spectral density versus angular frequency (Burg estimate and power laws η = 1.0, 1.4, 2.0), with the best power-law fit used to define the LMMSE operator.
invariant, the LMMSE solution does not reduce to the well-known Wiener filter. The
noise auto-correlation matrix reduces to a diagonal matrix under the assumption of
independent and identically distributed (i.i.d.) noise and therefore can be written as
R_n = σ²I. The object auto-correlation matrix R_f incorporates prior object knowledge
within the reconstruction process as a regularizing term. Here we obtain the
object auto-correlation matrix from a power-law power spectral density (PSD), 1/f^η,
that serves as a good model for natural images [32, 33, 34]. A power-law PSD was
computed to model the class of 10 objects shown in Fig. 2.4(a) chosen to represent
a wide variety of scenes (rows and columns of these scenes are used as 1D objects).
Fig. 2.4(b) shows several power law PSDs plotted along with the PSD obtained using
Burg’s method [35] on 3 objects chosen from the set in Fig. 2.4(a). The power-law
PSD (η = 1.4) is used to model the PSD of the object class, as it is applicable to a wider
range of natural images compared to PSD models such as Burg’s that are obtained
for a specific set of objects. The value of the power-law PSD parameter η was obtained
by a least-squares fit to the Burg PSD estimate.
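The way the power-law prior enters the reconstruction can be sketched as follows: R_f is assembled from the 1/f^η PSD via the Wiener–Khinchin relation, and the LMMSE operator of Eq. (2.12) is then formed. The composite operator Hc below is a random stand-in for the calibrated operator, and the frequency floor at the f = 0 bin is an assumption made to avoid the PSD singularity:

```python
import numpy as np

rng = np.random.default_rng(1)
N, eta, sigma2 = 64, 1.4, 0.01

# Power-law PSD 1/f^eta; floor the f = 0 bin to avoid the singularity.
freqs = np.abs(np.fft.fftfreq(N))
psd = 1.0 / np.maximum(freqs, 1.0 / N) ** eta

# Wiener-Khinchin: the autocorrelation is the inverse DFT of the PSD.
r = np.real(np.fft.ifft(psd))
idx = np.arange(N)
Rf = r[np.abs(idx[:, None] - idx[None, :])]      # stationarity => Toeplitz Rf

# LMMSE operator, Eq. (2.12), with a random stand-in for Hc.
M = 24
Hc = rng.normal(size=(M, N))
Rn = sigma2 * np.eye(M)                          # i.i.d. noise, Rn = sigma^2 I
W = Rf @ Hc.T @ np.linalg.inv(Hc @ Rf @ Hc.T + Rn)
print(W.shape)                                   # f_hat = W g, as in Eq. (2.11)
```

Because the prior is stationary, R_f is Toeplitz; the same construction applies to the block-wise shift-invariant operators used in the study.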
In order to quantify the performance of both the PRPEL and the conventional
Figure 2.5. Rayleigh resolution estimation for multi-frame imagers using a sinc²(·) fit to the post-processed PSF (estimated resolution = 0.4 mrad).
imaging systems we employ two metrics: (a) Rayleigh resolution and (b) normalized
root-mean-square-error (RMSE). The Rayleigh resolution of a composite multi-frame
imager is found by using a point-source object and applying the LMMSE operator to
the K image frames. The resulting point-source reconstruction represents the overall
PSF of the computational imager. A least-squares fit of a diffraction-limited sinc²(·) PSF to the overall imager PSF is used to obtain the resolution estimate. Fig. 2.5
illustrates this resolution estimation method with an example of a post-processed PSF
and the associated sinc2(·) fit. The second imager performance metric uses RMSE
to quantify the quality of a reconstructed object. The RMSE metric is defined as
RMSE = (√⟨‖f − f̂‖²⟩ / 255) × 100%, (2.13)
where 255 is the peak object pixel value. Here, the expectation ⟨·⟩ is taken over both
the object and the noise ensembles. We have used all columns and rows of the 2D
objects shown in Fig. 2.4(a) to form a set of 1D objects for computing the RMSE
metric in the simulation study.
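The sinc²(·)-fitting step of the resolution metric can be sketched with a simple grid search over candidate resolutions (synthetic noisy samples stand in for a post-processed PSF; a nonlinear least-squares routine would serve equally well):

```python
import numpy as np

def sinc2(x, amp, R):
    """Diffraction-limited PSF model; np.sinc(u) = sin(pi u)/(pi u),
    so the first zero (the Rayleigh resolution) falls at x = R."""
    return amp * np.sinc(x / R) ** 2

# Hypothetical post-processed PSF: sinc^2 of resolution 0.4 mrad plus noise.
rng = np.random.default_rng(2)
x = np.linspace(-2.0, 2.0, 201)          # angular dimension [mrad]
y = sinc2(x, 1.0, 0.4) + rng.normal(scale=0.01, size=x.size)

# Least-squares fit over a grid of candidate resolutions; for each R the
# optimal amplitude has the closed form amp = <y, m> / <m, m>.
best = None
for R in np.linspace(0.1, 1.0, 451):
    m = np.sinc(x / R) ** 2
    amp = (y @ m) / (m @ m)
    sse = np.sum((y - amp * m) ** 2)
    if best is None or sse < best[0]:
        best = (sse, R, amp)
print(f"estimated Rayleigh resolution = {best[1]:.2f} mrad")
```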
First, we consider the conventional imager. The sub-pixel shift for each frame
is chosen randomly. The performance metrics are computed and averaged over 30
Figure 2.6. Conventional imager performance with number of frames: (a) RMSE and (b) Rayleigh resolution.
sub-pixel shift-sets for each value of K. Fig. 2.6(a) shows a plot of the RMSE versus
the number of frames K. We observe that the RMSE decreases with the number
of frames, as expected. This result demonstrates that additional object information
is accumulated through the use of diverse (i.e., shifted) channels: as the number of
frames increases, the condition-number of the composite imaging operator Hc im-
proves. The RMSE does not converge to zero for K = 16 because the detector noise
ultimately limits the minimum reconstruction error. The resolution
of the overall imager is plotted against the number of frames K in Fig. 2.6(b). Ob-
serve that the resolution improves with increasing K, converging towards the optical
resolution limit of 0.2 mrad. The resolution obtained with K = 16 is not equal to
the diffraction limit because this data represents an average resolution over a set of
random sub-pixel shift-sets. When the sub-pixel shifts are chosen as multiples of D/F,
the resolution achieved for K = 16 is indeed equal to the optical resolution limit.
The PRPEL imager employs a pseudo-random phase mask to modify the impulse-
like optical PSF. The phase mask parameters ∆ and ρ jointly determine the statistics
of the spatial intensity distribution and the extent of the optical PSF. We design an
optimal phase mask by setting ρ to a constant (10λc) and finding the value of ∆ that
Figure 2.7. PRPEL imager performance versus mask roughness parameter ∆ with ρ = 10λc and K = 3: (a) Rayleigh resolution and (b) RMSE.
maximizes the imager performance for a given K. Fig. 2.7(a) presents representative
data quantifying imager resolution as a function of ∆ with ρ = 10λc and K = 3. This
plot shows the fundamental tradeoff between the condition number of the imaging
operator and the SNR cost. Note that for small values of ∆ the PSF is impulse-like.
As the value of ∆ increases the PSF becomes more diffuse as shown in Fig. 2.3(b).
This results in an improvement in condition number; however, as the PSF becomes
more diffuse the photon-count per detector decreases resulting in an overall decrease in
measurement SNR. Fig. 2.7(a) shows that optimal resolution is achieved for ∆ = 7λc.
Fig. 2.7(b) demonstrates a similar trend in RMSE versus ∆ with ρ = 10λc and K = 3.
The optimal value of ∆ under the RMSE metric is ∆ = 1.5λc. Note that the optimal
values of ∆ are different for the resolution and RMSE metrics. The resolution of an
imager is determined by its spatial frequency response alone; whereas, the RMSE is
dependent on the spatial frequency response as well as the object statistics. Therefore,
the value of ∆ that maximizes the resolution metric may result in an imager with a
particular spatial frequency response that may not achieve the minimum RMSE given
the object statistics and detector noise. All the subsequent results for the PRPEL
imager are obtained for the optimal value of ∆ which will therefore be a function of
Figure 2.8. PRPEL and conventional imager performance versus number of frames: (a) Rayleigh resolution and (b) RMSE.
K, σ and the metric (RMSE or resolution).
Fig. 2.8(a) presents the resolution performance of both the PRPEL and the con-
ventional imagers as a function of the number of frames K. We note that the PRPEL
imager converges faster than the conventional imager. A resolution of 0.3mrad is
achieved with only K = 4 by the PRPEL imager in contrast with K = 12 for the
conventional imager. A plot comparing the RMSE performance of the two imagers
is shown in Fig. 2.8(b). We note that the PRPEL imager is consistently superior to
the conventional imager. For K = 4 the PRPEL imager achieves an RMSE of 3.5%
as compared with an RMSE of 4.3% for the conventional imager.
2.4. Experimental results
An experimental demonstration of the PRPEL imager was undertaken in order to
validate the performance improvements predicted by simulation. Fig. 2.9 shows the
experimental setup along with the relevant physical dimensions. A Santa Barbara
Instrument Group ST2000XM CCD was used as the detector array. The CCD consists
of a 1600 × 1200 detector array, with a detector size of 7.4µm, 100% fill factor and
a FWC of 45000 electrons. The detector output from the CCD is quantized with a
Figure 2.9. Schematic of the optical setup used for experimental validation of the PRPEL imager: fiber-tip source, Fujinon lens (20 mm aperture), holographic diffuser (phase mask) with a 2.5× zoom lens, and SBIG CCD array (7.4 µm pixels); object distance 540 mm.
16-bit analog-to-digital converter, yielding a dynamic range of [0, 64000] digital counts.
During the experiment the CCD is cooled to −10 °C to minimize electronic noise.
The experimental setup uses a Fujinon CF16HA-1 TV lens operated at F/# = 4.0. A
circular holographic diffuser from Physical Optics Corporation is used as a pseudo-
random phase mask. The divergence angle (full-width at half-maximum) of the diffuser
is 0.1. A zoom lens with magnification 2.5× is used to decrease the divergence angle
of the diffuser. The actual phase statistics of the diffuser are not disclosed by the
manufacturer. Therefore, to relate the physical diffuser to the pseudo-random phase
mask model, we compute phase mask parameters ∆ and ρ that yield a PSF similar
to the one produced by the physical diffuser. The phase mask parameters ∆ = 2.0λc
and ρ = 175λc yield the PSF shown in Fig. 2.10(c). Comparing this PSF to the
PRPEL experimental PSF shown in Fig. 2.10(b), we note that they are similar in
appearance. This comparison, although qualitative, suggests that the physical diffuser
might possess statistics similar to the pseudo-random phase mask model described
here.
The Rayleigh resolution of the conventional optical PSF was estimated to be 5µm
or 0.31mrad. This yields an under-sampling factor of F = 3 along each direction.
This implies that a total of F 2 = 9 frames are required to achieve the full optical
Figure 2.10. Experimentally measured PSFs obtained from (a) the conventional imager and (b) the PRPEL imager, and (c) simulated PRPEL PSF with phase mask parameters ∆ = 2.0λc and ρ = 175λc.
resolution. The FOV for the experiment is 10mrad×10mrad consisting of 64 × 64
pixels each of size 0.156mrad×0.156mrad. The highly under-sampled nature of the
conventional imager as well as the extended nature of the PRPEL PSF demand
careful system calibration. Our calibration apparatus consisted of a fiber-tip point-
source mounted on an X-Y translation stage that can be scanned across the object
FOV. The 50 µm fiber core diameter in object space yields a 0.6 µm diameter point
in image space (system magnification = 1/84), which is much smaller than the detector
size of 7.4 µm. Therefore, we can assume that the fiber-tip serves as a good point-source
approximation for imager calibration purposes. Also note that the radiation exiting
the fiber-tip (numerical aperture = 0.22) overfills the entrance aperture of the
imager optics by a factor of 12. The motorized translation stage is controlled by a
Newport EPS300 motion controller. The fiber tip is illuminated by a white light-
source filtered by a 10 nm bandpass filter centered at λc=535 nm. The calibration
procedure involves scanning the fiber-tip over each object pixel position in the FOV
and for each such position, recording the discrete PSF at the CCD. To obtain reliable
PSF data during calibration we average 32 CCD frames to increase the measurement
SNR. To obtain PSF data with a particular sub-pixel shift, the calibration process is
repeated after shifting the FOV by that sub-pixel amount. This calibration data is
Figure 2.11. Experimentally measured Rayleigh resolution versus number of frames for both the PRPEL and conventional imagers.
subsequently used to construct the composite imaging operator Hc and compute the
LMMSE operator W using Eq. (2.12). The same calibration procedure is used for
both the conventional and the PRPEL imagers.
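The calibration loop amounts to measuring one column of the composite operator per object-pixel position and per sub-pixel shift. A schematic sketch, in which `measure_psf` is a hypothetical stand-in for translating the fiber tip and reading a CCD frame (all sizes are toy values, not the experimental dimensions):

```python
import numpy as np

def build_operator(measure_psf, n_obj, n_det, shifts, n_avg=32):
    """Assemble the composite operator Hc column by column from
    calibration measurements. measure_psf(i, s) stands in for scanning
    the fiber tip to object pixel i at sub-pixel FOV shift s and
    reading one detector frame of n_det samples."""
    Hc = np.zeros((len(shifts) * n_det, n_obj))
    for i in range(n_obj):
        for k, s in enumerate(shifts):
            # Average n_avg frames to raise the calibration SNR.
            frames = np.stack([measure_psf(i, s) for _ in range(n_avg)])
            Hc[k * n_det:(k + 1) * n_det, i] = frames.mean(axis=0)
    return Hc

# Toy stand-in: a known linear system observed through read noise.
rng = np.random.default_rng(3)
H_true = rng.random((4 * 8, 16))
def measure_psf(i, s):
    return H_true[s * 8:(s + 1) * 8, i] + rng.normal(scale=0.05, size=8)

Hc = build_operator(measure_psf, n_obj=16, n_det=8, shifts=[0, 1, 2, 3])
print(np.abs(Hc - H_true).max() < 0.1)   # → True: averaging suppresses noise
```

The measured Hc then feeds Eq. (2.12) directly, exactly as described for the experiment.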
The experimental PSFs for these two imagers are shown in Fig. 2.10(a) and
Fig. 2.10(b). The PSF of the conventional imager is seen to be impulse-like; whereas,
the PSF of the PRPEL imager has a diffused/extended shape as expected. The reso-
lution estimation procedure described in the previous section is once again employed
to estimate the resolution of the two experimental imagers. Fig. 2.11 presents the
plot of resolution versus number of frames K from the experimental data. Three data
points are obtained at K = 1, 4, and 9. The sub-pixel shifts (in microns) used for
these measurements were: (0,0) for K=1, (0,0), (0,3.7), (3.7,0), (3.7,3.7) for K=4,
and (0,0), (0,2.5), (0,5), (2.5,0), (2.5,2.5), (2.5,5), (5,0), (5,2.5), (5,5) for K = 9. Note that
the imager resolution is estimated using test data that is distinct from the calibration
data. As predicted in simulation, we see that the PRPEL imager outperforms the
conventional imager at all values of K. We observe that the PRPEL resolution nearly
saturates by K = 4. A maximum resolution gain of 13% is achieved at K = 4 by
the PRPEL imager relative to the conventional imager. Note that even at K = 9 the
Figure 2.12. The USAF resolution target: (a) group 0 element 1 and (b) group 0 elements 2 and 3.
resolution achieved by both the imagers is slightly poorer than the estimated optical
resolution of 0.31mrad. This can be attributed to errors in the calibration process,
which include non-zero noise in the PSF measurements and shift errors due to the
finite positioning accuracy of the computer-controlled translation stages.
A USAF resolution target was used to compare the object reconstruction quality
of the two imagers. Because the imager FOV is relatively small (10 mrad × 10 mrad,
i.e., 13.44 mm × 13.44 mm), we used two small areas of the USAF resolution target shown in
Fig. 2.12(a) and Fig. 2.12(b). In Fig. 2.12(a) the spacing between lines of group 0 el-
ement 1 is 500µm in object space or equivalently 0.37mrad. Similarly in Fig. 2.12(b)
the line spacings for group 0 elements 2 and 3 are 0.33mrad and 0.30mrad respec-
tively. Given the optical resolution of the experimental system, we expect that group
0 element 3 should be resolvable by both the conventional and PRPEL imagers.
Fig. 2.13 presents the raw detector measurements of USAF group 0 element 1
from the two imagers. Consistent with the measured degree of under-sampling, the
imagers are unable to resolve the constituent line elements in the raw data. Fig. 2.14
shows reconstructions from the two multi-frame imagers for the same object using
K = 1, 4, and 9 sub-pixel shifted frames. We observe that for K = 1 neither imager
can resolve the object. For K = 4 however, the PRPEL imager clearly resolves the
Figure 2.13. Raw detector measurements obtained using USAF group 0 element 1 from (a) the conventional imager and (b) the PRPEL imager.
lines in the object; whereas, the conventional imager does not resolve them clearly.
Fig. 2.15(a) shows a horizontal line scan through the object and LMMSE reconstruc-
tions for K = 4, affirming our observation that the PRPEL imager achieves superior
contrast to that of the conventional imager. For K = 9 we note that both imagers
resolve the object equally well. Next we consider the USAF group 0 elements 2 and 3
target, whose reconstructions are shown in Fig. 2.16. As before, for K = 1 neither
imager can resolve the object. However, for K = 4 the PRPEL imager clearly re-
solves element 2 and barely resolves element 3. In contrast, the conventional imager
barely resolves only element 2. This is also evident in the horizontal line scan of the
object and the LMMSE reconstructions shown in Fig. 2.15(b). Both imagers achieve
comparable performance for K = 9, completely resolving the object.
We observe that despite having precise channel knowledge we obtain poor reconstruction results for the case K = 1. This points to the limitations of linear reconstruction techniques, which cannot incorporate powerful object constraints such as positivity and finite support. However, non-linear reconstruction techniques such as iterative back-projection (IBP) [36] and maximum-likelihood expectation-maximization (MLEM) [37] can easily incorporate these constraints. The Richardson-Lucy (RL) algorithm [38, 39], based on the MLEM principle, has been shown to be one such effective reconstruction
Figure 2.14. LMMSE reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
Figure 2.15. Horizontal line scans through the USAF target and its LMMSE reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3.
Figure 2.16. LMMSE reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
technique. The RL algorithm is a multiplicative iterative scheme in which the (k+1)th object update, denoted by $f^{(k+1)}$, is defined as [28]

$$f^{(k+1)}_n = f^{(k)}_n \, \frac{1}{s_n} \sum_{m=1}^{KM} \frac{g_m}{\left(H_c f^{(k)}\right)_m} H_{c,mn}, \qquad (2.14)$$

$$s_n = \sum_{m=1}^{KM} H_{c,mn},$$
where the subscript denotes the corresponding element of a vector or a matrix. Note
that if all elements of the composite imaging matrix Hc, the raw image measurement
g and the initial object estimate f (0) are positive then all subsequent estimates of
the object are guaranteed to be positive, thereby achieving the positivity constraint.
Further, by setting the appropriate elements of f (0) to 0 we can implement the finite
support constraint in the RL algorithm.
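The multiplicative update of Eq. (2.14), together with the positivity and finite-support behavior just described, can be sketched in a few lines of Python. The toy 1-D system matrix, sizes, and iteration count below are illustrative assumptions, not the experimental configuration:

```python
import numpy as np

def richardson_lucy(g, Hc, n_iters=200, f0=None, eps=1e-12):
    """RL update of Eq. (2.14): f_n <- f_n * (1/s_n) * sum_m [g_m / (Hc f)_m] * Hc[m, n].
    Zeros in f0 remain zero, which is how the finite-support constraint is imposed."""
    s = Hc.sum(axis=0)                        # s_n = sum_m Hc[m, n]
    f = np.full(Hc.shape[1], g.mean()) if f0 is None else f0.astype(float).copy()
    for _ in range(n_iters):
        ratio = g / np.maximum(Hc @ f, eps)   # g_m / (Hc f)_m
        f = f * (Hc.T @ ratio) / np.maximum(s, eps)
    return f

# Toy 1-D example (illustrative sizes): blur followed by 2x down-sampling
H = np.zeros((8, 17))
for m in range(8):
    H[m, 2 * m:2 * m + 3] = [0.25, 0.5, 0.25]
rng = np.random.default_rng(0)
f_true = np.abs(rng.normal(1.0, 0.3, 17))
g = H @ f_true
f_hat = richardson_lucy(g, H)
```

Because every factor in the update is non-negative, the positivity of the estimate is preserved automatically at each iteration, exactly as noted above.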
We apply the RL algorithm described above to the experimental data in an effort
to improve reconstruction quality, especially for K = 1. A constant positive vector
is used as an initial object estimate, i.e. $f^{(0)} = c$ where $c_i = a > 0$ for all i. Fig. 2.17 and Fig. 2.18 show the RL object reconstructions of USAF group 0 element 1 and USAF group 0 elements 2 and 3 respectively. As expected, the RL algorithm yields a
substantial improvement in reconstruction quality over the LMMSE processor. This
improvement is most notable for the K = 1 case. In Fig. 2.17 we observe that the
PRPEL imager delivers better results compared to the conventional imager for K = 1
and K = 4. The horizontal line scans in Fig. 2.19(a) show that the PRPEL imager
maintains a superior contrast compared to the conventional imager for K = 4. From
Fig. 2.18 we observe that for K = 1 the PRPEL imager begins to resolve element 2
whereas the conventional imager still fails to resolve element 2. For K = 4, element 2
is clearly resolved and element 3 is just resolved by the PRPEL imager. In comparison
the conventional imager barely resolves element 2. These observations are confirmed
by the horizontal line scan plots shown in Fig. 2.19(b).
Overall the experimental reconstruction and resolution results confirm the conclu-
Figure 2.17. Richardson-Lucy reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
Figure 2.18. Richardson-Lucy reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
Figure 2.19. Horizontal line scans through the USAF target and its Richardson-Lucy reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3.
sions drawn from our simulation study; the PRPEL imager offers superior resolution
and reconstruction performance compared to the conventional multi-frame imager.
2.5. Imager parameters
The results reported here have demonstrated the utility of the PRPEL imager. In
order to motivate a more general applicability of the PRPEL approach, there are
two important parameters that require further investigation: pixel size and spectral-
Figure 2.20. (a) Rayleigh resolution and (b) RMSE versus number of frames for multi-frame imagers that employ smaller pixels and lower measurement SNR.
bandwidth. We consider two case studies in which these imaging system parameters
are modified in order to study their impact on overall imager performance.
2.5.1. Pixel size
Here we consider the effect of a smaller pixel size, typical of the CMOS detector arrays now commonly employed in many imagers. Consider a sensor having a pixel size of 3.2µm, resulting in less severe under-sampling as compared with the 7.5µm pixel size assumed earlier. This detector has a 100% fill-factor and a smaller FWC of 28,000 electrons (lower SNR). All other parameters of the imaging system remain unchanged.
The under-sampling factor for the new sensor is F = 7 and the photon-limited SNR
is now 22dB. We repeat the simulation study of the overall imaging system perfor-
mance for both the conventional imager and the PRPEL imager. Fig. 2.20(a) shows
the plot of the resolution versus the number of frames for both imaging systems. This
plot shows that for K = 2 the PRPEL imager achieves a resolution of 0.3mrad while
the conventional imager resolution is only 0.5mrad. Fig. 2.20(b) shows the RMSE
performance of the two imagers versus the number of frames. For K = 2 the PRPEL
Figure 2.21. The optical PSF obtained using PRPEL with both narrowband (10 nm) and broadband (150 nm) illumination.
imager achieves an RMSE of 3.2% compared to 4.0% for the conventional imager, an
improvement of nearly 20%. From these results we conclude that the PRPEL imager
remains a useful option for imagers with CMOS sensors that have smaller pixels and
a lower SNR.
2.5.2. Broadband operation
Recall that all our simulation studies have assumed a 10 nm spectral bandwidth so
far. In this section, we will relax this constraint and allow the spectral bandwidth
to increase to 150 nm, roughly equal to the bandwidth of the green band of the
visible spectrum. All other imaging system parameters remain unchanged (using the
original 7.5µm sensor). There is a two-fold implication of the increased bandwidth.
First, because we accept a wider bandwidth, the photon count increases resulting
in an improved measurement SNR. Within the PRPEL imager however, this SNR
increase is accompanied by increased chromatic dispersion and a smoothing of the
PRPEL PSF. This smoothing results in a worsening of the condition number for
the PRPEL imager. To illustrate the dispersion effect, Fig. 2.21 shows a plot of
the extended PRPEL PSF for both the 10 nm and the 150 nm bandwidths. The
Figure 2.22. (a) Rayleigh resolution and (b) RMSE versus number of frames for broadband PRPEL and conventional imagers.
smoothing of the PSF affects the optical transfer function of the imager by attenuating
the higher spatial frequencies. Hence, we can expect a trade-off between the higher
SNR and the worsening of the condition number, especially for the PRPEL imaging
system. The plot in Fig. 2.22(a) shows that the conventional imager resolution is
relatively unaffected by broadband operation. The PRPEL imager performance on
the other hand suffers due to dispersion despite the increase in SNR. Similar trends
in RMSE performance can be observed for the two imagers as shown by the plot in
Fig. 2.22(b). The performance of the broadband PRPEL imager deteriorates relative
to narrowband operation for small values of K; however, note that for medium and
large values of K the performance of the PRPEL imager actually improves due to
increased SNR.
2.6. Conclusions
The optical PSF engineering approach for improving imager resolution and object reconstruction fidelity in under-sampled imaging systems was successfully demonstrated.
The simulation study of the PRPEL imager predicted substantial performance im-
provements over a conventional multi-frame imager. The PRPEL imager was shown
to offer as much as 50% resolution improvement and 20% RMSE improvement as
compared to the conventional imager. The experimental results confirmed these pre-
dicted performance improvements. We also applied the non-linear Richardson-Lucy
reconstruction technique to the experimental data. The results obtained showed
that imager performance is substantially improved with non-linear techniques. In this chapter, the application of the optical PSF engineering method to the object reconstruction task has demonstrated the potential benefits of the joint-optimization design approach. In the next chapter, we extend the application of the optical PSF engineering method to an iris-recognition task.
Chapter 3
Optical PSF Engineering: Iris Recognition Task
In this chapter we will apply the optical PSF engineering approach to the task of iris-recognition to overcome the performance degradations introduced by an under-sampled imaging system. Note that the metric for quantifying imaging system performance for a particular task plays a critical role in the joint-optimization design approach. For the object reconstruction task we employed two metrics: 1) resolution and 2) RMSE. Here we will use the statistical metric of false rejection ratio
(FRR) (evaluated at a fixed false acceptance ratio (FAR)) for quantifying the imaging
system performance for the iris-recognition task.
3.1. Introduction
Many modern defense and security applications require automatic recognition and
verification services that employ a variety of biometrics such as facial features, hand
shape, voice, fingerprints, and iris. The iris is the annular region between the pupil
and the outer white sclera of the eye. Iris-based recognition has been gaining pop-
ularity in recent years and it has several advantages compared to other traditional
biometrics such as fingerprints and facial features. The iris-texture pattern represents a high density of information, and the resulting statistical uniqueness can yield false recognition rates as low as 1 in $10^{10}$ [41, 42, 43]. Further, it has been found that
the human iris is stable over the lifetime of an individual and is therefore considered
to be a reliable biometric [44]. Iris-based recognition systems rely on capturing the
iris-texture pattern with a high-resolution imaging system. This places stringent de-
mands on imaging optics and sensor design. In the case where the detector pixel size
limits the overall resolution of the imaging system, the under-sampling in the sensor
array can lead to degradation of the iris-recognition performance. Therefore, over-
coming the detector-induced under-sampling becomes a vital issue in the design of
an iris-recognition imaging system. One approach to improve the resolution beyond
the detector limit employs multiple sub-pixel shifted measurements within a TOMBO
imaging system architecture [19, 20]. However, this approach does not exploit the
optical degrees of freedom available to the designer and more importantly it does not
address the specific nature of the iris-recognition task. We note that there are some
studies that have exploited the optical degrees of freedom to extend the depth-of-field
of iris-recognition systems [45, 46], but we are not aware of any previous work that
has examined under-sampling in iris-recognition imaging systems. In this chapter,
we propose an approach that involves engineering the optical point spread function
(PSF) of the imaging system in conjunction with use of multiple sub-pixel shifted
measurements. It is important to note that the goal of our approach is to maxi-
mize the iris-recognition performance and not necessarily the overall resolution of the
imaging system. To accomplish this goal, we employ an optimization framework to
engineer the optical PSF and optimize the post-processing system parameters. The
task-specific performance metric used within our optimization framework is FRR for
a given FAR [47]. The mechanism of modifying the optical PSF employs a phase-
mask in the aperture-stop of the imaging system. The phase-mask is defined with
Zernike polynomials and the coefficients of these polynomials serve as the optical design parameters. The optimization framework is used to design imaging systems for
various numbers of sub-pixel shifted measurements. The CASIA iris database [48] is
used in the optimization framework and it also serves to quantify the performance of
the resulting optimized imaging system designs.
Figure 3.1. PSF-engineered multi-aperture imaging system layout.
3.2. Imaging System Model
In this study, our iris-recognition imaging system is composed of three components: 1)
the optical imaging system, 2) the reconstruction algorithm, and 3) the recognition al-
gorithm. The optical imaging system consists of multiple sub-apertures with identical
optics. This multi-aperture imaging system produces a set of sub-pixel shifted im-
ages on the sensor array. The task of the reconstruction algorithm is to combine these
image measurements to form an estimate of the object. Finally, the iris-recognition
algorithm operates on this object estimate and either accepts or rejects the iris as a
match. We begin by describing the multi-aperture imaging system.
3.2.1. Multi-aperture imaging system
Fig. 3.1 shows the system layout of the multi-aperture (MA) imaging system. The
number of sub-imagers comprising the MA imaging system is denoted by K. The
sensor array in the focal plane of the MA imager generates K image measurements,
where the kth measurement (also referred to as a frame) is denoted by gk. The detector
pitch d of the sensor array relative to the Nyquist sampling interval δ, determined by the optical cut-off spatial frequency, defines the under-sampling factor $F = \frac{d}{\delta} \times \frac{d}{\delta}$. Therefore, for an object of size N × N pixels the under-sampled kth sub-imager measurement $g_k$ is of dimension M × M, where $M = \lceil N/\sqrt{F}\,\rceil$. Mathematically, the kth frame can be expressed as
gk = Hkf + nk, (3.1)
where f is an $N^2 \times 1$ vector formed by a lexicographic arrangement of a two-dimensional (N × N) discretized representation of the object, $H_k$ is the $M^2 \times N^2$ discrete-to-discrete imaging operator of the kth sub-imager, and $n_k$ denotes the $M^2 \times 1$ measurement error vector. Here we model the measurement error $n_k$ as zero-mean additive white Gaussian noise (AWGN) with variance $\sigma_n^2$. Note that the imaging operator $H_k$ is different for each sub-imager and is expressed as
Hk = DCSk, (3.2)
where $S_k$ is the $N^2 \times N^2$ shift operator that produces a two-dimensional sub-pixel shift $(\Delta X_k, \Delta Y_k)$ in the kth sub-imager, C is the $N^2 \times N^2$ convolution operator that represents the optical PSF, and D is the $M^2 \times N^2$ down-sampling operator, which includes the effect of spatial integration over the detector and the under-sampling caused by the sensor array. Note that the convolution operator C does not vary with
k because the optics are assumed to be identical in all sub-imagers. By combining the K measurements we can form a composite measurement $g = [g_1;\, g_2;\, \dots;\, g_K]$ that can be expressed in terms of the object vector f as follows

$$g = H_c f + n, \qquad (3.3)$$

where $H_c = [H_1;\, H_2;\, \dots;\, H_K]$ is the composite imaging operator of size $KM^2 \times N^2$, obtained by stacking the K imaging operators corresponding to each of the K sub-imagers, and n is the composite noise vector defined as $n = [n_1;\, n_2;\, \dots;\, n_K]$.

As mentioned earlier, the optical PSF is engineered by placing a phase-mask in the
aperture-stop of each sub-imager. The pupil-function tpupil(ρ, θ) of each sub-imager
is expressed as [29]

$$t_{pupil}(\rho, \theta) = t_{amp}(\rho)\,\exp\!\left(\frac{j2\pi(n_r - 1)\,t_{phase}(\rho,\theta)}{\lambda}\right), \qquad (3.4)$$
where ρ and θ are the polar coordinate variables in the pupil, $n_r$ is the refractive index of the phase-mask, $t_{amp}(\rho) = \mathrm{circ}(\rho/D_{ap})$ is the circular pupil-amplitude function ($D_{ap}$ denotes the aperture diameter), $t_{phase}(\rho, \theta)$ represents the pupil-phase function, and λ is the wavelength. A Zernike polynomial expansion of order P is used to define the pupil-phase function as follows
$$t_{phase}(\rho,\theta) = \sum_{i=1}^{P} a_i \, Z_i(\rho,\theta), \qquad (3.5)$$
where ai is the coefficient of the ith Zernike polynomial denoted by Zi(ρ, θ) [49]. In
this work, we will use Zernike polynomials up to order P = 24. The resulting optical
PSF h(ρ, θ) is expressed as [28]
$$h(\rho,\theta) = \frac{A_c}{(\lambda f_l)^4}\left| T_{pupil}\!\left(-\frac{\rho}{\lambda f_l},\, \theta\right)\right|^2, \qquad (3.6)$$

$$T_{pupil}(\omega) = \mathcal{F}_2\{\, t_{pupil}(\rho, \theta) \,\}, \qquad (3.7)$$
where ω is the two-dimensional spatial frequency vector, Ac is a normalization con-
stant with units of area, fl is the back focal length, and F2 denotes the 2-dimensional
forward Fourier transform operator.
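Eqs. (3.4)-(3.7) can be sketched numerically: form the circular pupil, apply a Zernike phase, and take the squared magnitude of the 2-D Fourier transform. Only a single defocus term $Z_4 = \sqrt{3}(2\rho^2 - 1)$ is included here, and the grid size, coefficient (in waves), and unit-energy normalization are illustrative assumptions:

```python
import numpy as np

def psf_from_zernike(n_grid=256, a_defocus=0.5):
    """Sketch of Eqs. (3.4)-(3.7): PSF = |F2{t_pupil}|^2 with a Zernike phase
    in the aperture stop. Only the defocus term is included; grid size and
    coefficient are illustrative."""
    x = np.linspace(-1, 1, n_grid)
    X, Y = np.meshgrid(x, x)
    rho = np.hypot(X, Y)
    aperture = (rho <= 1.0)                            # t_amp = circ(rho), normalized radius
    phase = a_defocus * np.sqrt(3) * (2 * rho**2 - 1)  # waves of defocus (Z4)
    pupil = aperture * np.exp(2j * np.pi * phase)      # t_pupil, Eq. (3.4)
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(pupil)))  # T_pupil, Eq. (3.7)
    psf = np.abs(field) ** 2                           # Eq. (3.6), up to A_c/(lambda f_l)^4
    return psf / psf.sum()                             # normalize to unit energy

psf = psf_from_zernike()
```

Sweeping the Zernike coefficients in such a sketch is the same mechanism the optimization framework of this chapter uses to search the optical design space.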
A discrete representation of the optical PSF $h_d(l,m)$, required for defining the C operator, is obtained as follows

$$h_d(l,m) = \int_{-d/2}^{d/2}\!\int_{-d/2}^{d/2} h(x - ld,\, y - md)\, dx\, dy, \quad l = -L,\dots,L,\;\; m = -L,\dots,L, \qquad (3.8)$$

where $(2L+1)^2$ is the number of samples used to represent the optical PSF. Note that a lexicographic ordering of $h_d(l,m)$ yields one row of C, and all other rows are obtained by lexicographically ordering appropriately shifted versions of this discrete optical PSF.
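The factorization $H_k = D\,C\,S_k$ of Eq. (3.2) can be made concrete with explicit matrices in one dimension. The circular boundary handling, integer-pixel shift, and sizes below are simplifying assumptions (the actual system uses 2-D operators and sub-pixel shifts):

```python
import numpy as np

def shift_op(N, delta):
    """S_k: integer circular shift (a stand-in for the sub-pixel shift operator)."""
    return np.roll(np.eye(N), delta, axis=0)

def conv_op(N, h):
    """C: circulant matrix built from the discrete PSF h_d of Eq. (3.8)."""
    C = np.zeros((N, N))
    L = len(h) // 2
    for n in range(N):
        for l in range(-L, L + 1):
            C[n, (n + l) % N] += h[l + L]
    return C

def downsample_op(N, F):
    """D: integrate over each detector pixel (F-sample average), then sample."""
    M = N // F
    D = np.zeros((M, N))
    for m in range(M):
        D[m, m * F:(m + 1) * F] = 1.0 / F
    return D

# H_k = D C S_k (Eq. 3.2), here in 1-D with illustrative sizes
N, F = 16, 4
h_d = np.array([0.25, 0.5, 0.25])
Hk = downsample_op(N, F) @ conv_op(N, h_d) @ shift_op(N, 1)
```

Stacking such $H_k$ matrices row-wise gives the composite operator $H_c$ of Eq. (3.3).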
3.2.2. Reconstruction algorithm
The measurements from the K sub-imagers comprising the MA imaging system form
the input to the reconstruction algorithm. We employ a reconstruction algorithm
Figure 3.2. Iris examples from the training dataset.
based on the linear minimum mean square error (LMMSE) criterion. The LMMSE
method is essentially a generalized form of the Wiener filter and operates on the
measurement in the spatial domain without the assumption of shift-invariance. Given
the imaging model specified in Eq. (3.3), the LMMSE operator W can be written as [31]

$$W = R_{ff} H_c^T \left( H_c R_{ff} H_c^T + R_{nn} \right)^{-1}, \qquad (3.9)$$
where $R_{ff}$ is the object auto-correlation matrix and $R_{nn}$ is the noise auto-correlation matrix. Here we assume the noise is zero-mean AWGN with variance $\sigma_n^2$; therefore $R_{nn} = \sigma_n^2 I$. Note that for an object of size $N^2$ and a measurement of size $KM^2$, the size of the W matrix is $N^2 \times KM^2$. For even a modest object size of 280 × 280, as is the case here, computing the W matrix becomes computationally very expensive. Therefore, we adopt an alternate approach that does not rely on directly computing matrix
inverses but instead uses a conjugate-gradient method to compute the LMMSE solu-
tion iteratively. Before we describe the iterative algorithm, we first need a method to
estimate the object auto-correlation matrix Rff . We use a training set of 40 subjects
with 4 iris samples for each subject, randomly selected from the CASIA iris database
yielding a total of 160 iris object samples. Fig. 3.2 shows example iris-objects in the
training dataset. The kth iris object yields the sample auto-correlation function $r^k_{ff}$, which is used to estimate the actual auto-correlation function as follows

$$R_{ff} = \frac{1}{160} \sum_{k=1}^{160} r^k_{ff}. \qquad (3.10)$$
The corresponding power spectral density $S_{ff}$ can be written as [50]

$$S_{ff}(\rho) = \mathcal{F}_2(R_{ff}). \qquad (3.11)$$

To obtain a smooth approximation of the power spectral density we use the following parametric function [51]

$$S_{ff}(\rho) = \frac{\sigma_f^2}{\left(1 + 2\pi\mu_d\,\rho^2\right)^{3/2}}. \qquad (3.12)$$

Note that because the iris is circular, we assume a radially symmetric power spectrum $S_{ff}$. A least-squares fit to $S_{ff}(\rho)$ yields $\sigma_f = 43589$ and $\mu_d = 1.5$.
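The parametric model of Eq. (3.12) and the spirit of the least-squares fit can be sketched as follows. Here the fit is a coarse grid search over $\mu_d$ against samples generated from the reported parameters, purely to illustrate the procedure; the actual fit used the measured auto-correlation data:

```python
import numpy as np

def s_ff(rho, sigma_f, mu_d):
    """Parametric radially symmetric PSD of Eq. (3.12)."""
    return sigma_f**2 / (1.0 + 2.0 * np.pi * mu_d * rho**2) ** 1.5

# Illustrative fit: recover mu_d by grid search on synthetic samples
rho = np.linspace(0.01, 5.0, 200)
target = s_ff(rho, 43589.0, 1.5)
grid = np.linspace(0.5, 3.0, 251)
errs = [np.sum((s_ff(rho, 43589.0, m) - target) ** 2) for m in grid]
mu_best = grid[int(np.argmin(errs))]
```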
In general, a conjugate-gradient algorithm minimizes the following form of quadratic objective function Q [28]

$$Q(f) = \frac{1}{2} f^T A f - b^T f. \qquad (3.13)$$

For the LMMSE criterion, $A = H_c^T H_c + \sigma_n^2 R_{ff}^{-1}$ and $b = H_c^T g$. Within our iterative conjugate-gradient algorithm we use a conjugate vector $p_j$ instead of the gradient of the objective Q(f) to achieve faster convergence to the LMMSE solution [52]. The (k+1)th update rule can be expressed as [28]

$$f_{k+1} = f_k + \alpha_k p_k, \qquad (3.14)$$

$$\alpha_k = -\frac{p_k^T \nabla Q_k}{d_k}, \qquad (3.15)$$

where $\nabla Q_k$ denotes the gradient of the objective function Q evaluated at the kth step, $p_k$ is conjugate to all previous $p_j$, $j < k$ (i.e. $p_j^T A p_k = d_j \delta_{jk}$), $\delta_{jk}$ is the Kronecker delta, and $d_k$ is the $\|\cdot\|_2$ norm of $p_k$. The stopping criterion is met when the residual vector $r_k = \nabla Q_k = A f_k - b$ changes by less than β% over the last 4 iterations (i.e. $\frac{\|r_{k-4}\| - \|r_k\|}{\|r_{k-4}\|} \le \frac{\beta}{100}$).
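The iteration of Eqs. (3.14)-(3.15) with a residual-based stopping rule can be sketched as below. The A-conjugation step, the choice $d_k = p_k^T A p_k$ (the standard conjugate-gradient convention), and the toy LMMSE normal equations are illustrative assumptions, not the dissertation's exact implementation:

```python
import numpy as np

def conjugate_gradient(A, b, beta=1e-6, max_iter=500):
    """CG for the quadratic Q(f) of Eq. (3.13); update per Eqs. (3.14)-(3.15),
    here with d_k = p_k^T A p_k. Stops when the residual norm changes by less
    than beta% over 4 iterations, or when the residual is essentially zero."""
    f = np.zeros_like(b)
    r = A @ f - b                      # residual = gradient of Q
    p = -r
    hist = [np.linalg.norm(r)]
    for _ in range(max_iter):
        Ap = A @ p
        alpha = -(p @ r) / (p @ Ap)    # Eq. (3.15)
        f = f + alpha * p              # Eq. (3.14)
        r = A @ f - b
        hist.append(np.linalg.norm(r))
        if hist[-1] < 1e-12:
            break
        if len(hist) > 4 and (hist[-5] - hist[-1]) / hist[-5] <= beta / 100:
            break
        gamma = (r @ Ap) / (p @ Ap)    # enforce p_{k+1}^T A p_k = 0
        p = -r + gamma * p
    return f

# Toy LMMSE normal equations: A = Hc^T Hc + sigma^2 Rff^{-1}, b = Hc^T g
rng = np.random.default_rng(1)
Hc = rng.normal(size=(20, 10))
g = Hc @ rng.normal(size=10) + 0.01 * rng.normal(size=20)
A = Hc.T @ Hc + 0.01 * np.eye(10)     # Rff = I assumed for illustration
b = Hc.T @ g
f_hat = conjugate_gradient(A, b)
```

The key design point is that only matrix-vector products with A are required, which is why this route scales to the 280 × 280 objects for which forming W directly is infeasible.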
Figure 3.3. Examples of (a) iris-segmentation, (b) masked iris-texture region, (c) unwrapped iris, and (d) iris-code.
3.2.3. Iris-recognition algorithm
The object estimate obtained with the reconstruction algorithm is processed by the
iris-recognition algorithm to make the final decision. There are three main processing
steps that form the basis of the iris-recognition algorithm. The first step involves
a segmentation algorithm that extracts the iris, pupil, and the eye-lid regions from
the reconstructed object. The segmentation algorithm used in this work is adapted
from Ref. [53] with the addition of eye-lid boundary detection. The output of the
segmentation algorithm yields an estimate of the center and radius of the circular
pupil and iris regions and also the boundaries of the upper and lower eyelids in
the object. Fig. 3.3(a) shows an example iris image that was processed with the
segmentation algorithm. The pupil and iris regions are outlined by circular boundaries
and the upper/lower eyelid edges are represented by the elliptical boundaries. This
information is used to generate a mask M(x, y) that extracts the annular region
between iris and pupil boundaries which contains only the unobscured iris-texture
region. An example of the masked iris region is shown in Fig. 3.3(b). The extracted
iris-texture region is the input to the next processing step. Given the center and
radius of the pupil and the iris regions, the annular iris-texture region is unwrapped
into a rectangular area a(ρ, θ) using Daugman's homogeneous rubber sheet model [54].
The size of the rectangular region is specified as Lρ×Lθ with Lρ rows along the radial
direction and Lθ columns along the angular direction. Fig. 3.3(c) shows an example
of an unwrapped rectangular region with Lρ = 36 and Lθ = 224. In the next step,
a complex log-scale Gabor filter is applied to each row to extract the phase of the
underlying iris-texture pattern. The complex log-scale Gabor filter spectrum Glog(ρ)
is defined as [55]

$$G_{log}(\rho) = \exp\!\left( -\frac{\left[\log\left(\rho/\rho_o\right)\right]^2}{2\left[\log\left(\sigma_g/\rho_o\right)\right]^2} \right), \qquad (3.16)$$
where ρo is the center frequency of the filter and σg specifies its bandwidth. Note that
this filter is only applied along the angular direction which corresponds to pixels on
the circumference of a circle in the original object. The angular direction is chosen
over the radial direction because the maximum texture variation occurs along this
direction [53]. The phase of the complex output of each Gabor filter is then quantized
into four quadrants using two bits. The 4-level quantized phase is coded using a Gray code so that the difference between two adjacent quadrants is one bit. The Gray coding scheme also ensures that any misalignment between two similar iris-codes results in a minimum of errors. The quantized phase results in a binary pattern,
shown in Fig. 3.3(d), which is referred to as an “iris-code.”
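The row-wise filtering and phase quantization can be sketched as follows. The filter is applied in the frequency domain and made one-sided so that the response is complex; the center frequency, bandwidth ratio, and test row are illustrative assumptions, not the tuned parameters of this chapter:

```python
import numpy as np

def log_gabor(n, rho_o=0.1, sigma_ratio=0.5):
    """One-sided log-Gabor spectrum in the spirit of Eq. (3.16);
    sigma_ratio = sigma_g / rho_o."""
    rho = np.fft.fftfreq(n)
    G = np.zeros(n)
    pos = rho > 0                          # keep positive frequencies -> complex response
    G[pos] = np.exp(-np.log(rho[pos] / rho_o) ** 2 / (2 * np.log(sigma_ratio) ** 2))
    return G

def iris_code_row(row):
    """Filter one unwrapped-iris row along the angular direction and encode the
    phase quadrant as 2 bits; (Re>=0, Im>=0) is a Gray code over quadrants."""
    response = np.fft.ifft(np.fft.fft(row) * log_gabor(len(row)))
    return np.stack([np.real(response) >= 0, np.imag(response) >= 0], axis=1).astype(np.uint8)

row = np.sin(np.linspace(0, 8 * np.pi, 224))   # stand-in for one row of a(rho, theta)
code = iris_code_row(row)
```

Note that the (Re ≥ 0, Im ≥ 0) bit pair changes in exactly one position between adjacent phase quadrants, which is the Gray-code property described above.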
In the final step, the iris-recognition task is performed based on the iris-code
obtained from a test object. To determine whether the given iris-code denoted by
tcode, matches any iris-code in the database, a score is computed. The score denoted
by $s(t_{code})$ is defined as

$$s(t_{code}) = \min_{k,\,i}\; d_{hd}\!\left(t_{code}\, c^k_{mask},\; R_i(r^k_{code})\, c^k_{mask}\right), \qquad (3.17)$$

where $r^k_{code}$ is the kth reference iris-code in the database, $c^k_{mask}$ is a mask that represents the unobscured bits common to both the test and the reference iris-codes, $R_i$ is a shift operator which performs an i-pixel shift along the angular direction, and $d_{hd}$ is the Hamming distance operator. All shifts in the range $i \in \{-O, \dots, +O\}$ are considered, where O denotes the maximum shift. The $d_{hd}$ operator is defined as
follows
$$d_{hd}(t_{code}\, c_{mask},\; r_{code}\, c_{mask}) = \frac{\sum \left( t_{code}\, c_{mask} \oplus r_{code}\, c_{mask} \right)}{W}, \qquad (3.18)$$
where W is the weight (i.e. number of all 1s) of the mask cmask. The normalized
Hamming distance score defined in Eq. (3.18) is computed over all iris-codes in the
database. The iris-code is shifted to account for any rotation of the iris in the object.
Finally, the following decision rule is applied to the minimum iris score $s(t_{code})$:

$$s(t_{code}) \;\underset{H_1}{\overset{H_0}{\lessgtr}}\; T_{HD}, \qquad (3.19)$$

which translates to: accept the null hypothesis $H_0$ if the score is less than the threshold $T_{HD}$.
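The masked, shift-minimized score of Eqs. (3.17)-(3.18) reduces to a few lines. The code length, all-ones mask, and maximum shift O = 8 below are illustrative assumptions:

```python
import numpy as np

def hamming_score(t_code, r_code, mask, max_shift=8):
    """Eqs. (3.17)-(3.18): minimum over shifts i in {-O, ..., +O} of the
    normalized Hamming distance between the masked test and reference codes."""
    W = mask.sum()                               # weight = number of unmasked bits
    best = 1.0
    for i in range(-max_shift, max_shift + 1):
        shifted = np.roll(r_code, i, axis=0)     # R_i: shift along the angular direction
        d = np.logical_xor(t_code, shifted)[mask].sum() / W
        best = min(best, d)
    return best

rng = np.random.default_rng(3)
t = rng.integers(0, 2, size=(224, 2)).astype(bool)
mask = np.ones_like(t, dtype=bool)               # no occluded bits in this toy example
same = np.roll(t, 3, axis=0)                     # same code, rotated by 3 pixels
other = rng.integers(0, 2, size=(224, 2)).astype(bool)
```

The shift search recovers the rotated copy exactly (score 0), while an unrelated random code scores near 0.5, which is the separation the threshold $T_{HD}$ exploits.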
where X, ~Vtarget, and ~Vbg are the same as in Eq. (4.1). Clutter components ~Vtree
and ~Vshrub represent tree and shrub profiles respectively and are weighted by random
variables β1 and β2. Note that CS2 will depend on random variables β1 and β2;
therefore, CS2 is a stochastic operator. Fig. 4.4(a) and Fig. 4.4(b) show examples of
scene realizations generated by this stochastic encoding operator.
As X is the only parameter of interest for a given task, it is important to note
that the entropy of X defines the maximum task-specific information content of any
image measurement. Other blocks in the imaging chain may add entropy to the
image measurement R; however, only the entropy of the virtual source X is relevant
to the task. We may therefore define TSI as the Shannon mutual-information I(X;R)
between the virtual source X and the image measurement R as follows [67]
TSI ≡ I(X;R) = J(X) − J(X|R), (4.3)
where J(X) = −Elog(pr(X)) denotes the entropy of virtual source X, J(X|R) =
−Elog(pr(X|R) denotes the entropy of X conditioned on the measurement R, E·denotes statistical expectation, pr(·) denotes the probability density function, and
all the logarithms are taken to be base 2. Note that from this definition of TSI
we have I(X;R) ≤ J(X) indicating that an image cannot contain more TSI than
there is entropy in the variable representing the task. However, for most realistic
imaging problems computing TSI from Eq. (4.3) directly is intractable owing to the
dimensionality and non-Gaussianity of R. Numerical approaches may also prove
to be computationally prohibitive, even when using methods such as importance sampling, Markov chain Monte Carlo (MCMC), or Bahl-Cocke-Jelinek-Raviv (BCJR) [68, 69, 70, 71, 72].
Recently, Guo et al. [73] demonstrated a direct relationship between the minimum
mean square error (mmse) in estimating X from R, and the mutual-information
I(X;R) for an additive Gaussian channel. Although the relation between estimation
mmse and Fisher information has been known via the Van Trees inequality [74], Guo's
result connects estimation mmse with the Shannon information for the first time.
The result expresses mmse as a derivative of the mutual-information I(X;R) with
respect to signal to noise ratio. For a simple additive Gaussian noise channel we have
$$R = \sqrt{s}\, X + N, \qquad (4.4)$$

where N is the additive Gaussian noise with variance $\sigma^2 = 1$ and s is the signal-to-noise ratio. For this simple case we find that [73]

$$\frac{d}{ds} I(X;R) = \frac{1}{2}\,\mathrm{mmse} = \frac{1}{2}\, E\!\left[\,|X - E(X|R)|^2\,\right], \qquad (4.5)$$
where E(X|R) is the conditional mean estimator. This relation allows us to compute
mutual-information indirectly from mmse for an additive Gaussian channel without
any restrictions on the distribution of the virtual source variable X. It is interesting
to note that even though the source variable X is discrete valued, the conditional
mean estimator is a continuous variable which does not necessarily take values in the
range of the source variable X. For example, when X is a binary variable (0/1) the
conditional mean estimator will yield a real number between 0 and 1.
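Guo's relation can be checked numerically for exactly this binary example: estimate mmse(s) by Monte Carlo using the posterior-mean estimator, then integrate over SNR per Eq. (4.5) to recover I(X;R). The SNR grid and sample count are illustrative assumptions:

```python
import numpy as np

def mmse_binary(s, n_mc=200_000, seed=4):
    """Monte-Carlo mmse for R = sqrt(s) X + N, equiprobable X in {0, 1}, N ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, n_mc)
    r = np.sqrt(s) * x + rng.normal(size=n_mc)
    l1 = np.exp(-0.5 * (r - np.sqrt(s)) ** 2)    # likelihood under X = 1
    l0 = np.exp(-0.5 * r ** 2)                   # likelihood under X = 0
    x_hat = l1 / (l0 + l1)                       # E[X | R]: a real number in (0, 1)
    return np.mean((x - x_hat) ** 2)

# Eq. (4.5): I(X; R) = (1/2) * integral_0^s mmse(s') ds'  (trapezoid rule, nats -> bits)
s_grid = np.linspace(0.0, 10.0, 41)
m = np.array([mmse_binary(s) for s in s_grid])
tsi_bits = 0.5 * np.sum(0.5 * (m[:-1] + m[1:]) * np.diff(s_grid)) / np.log(2)
```

As expected, the integrated value stays below the 1-bit entropy of the binary source and approaches it as the SNR range grows.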
This result has been extended to the linear vector Gaussian channel for which
$\mathcal{H}[\vec{X}] = H\vec{X}$, where H denotes the matrix channel operator and $\vec{X}$ is the vector channel input. The output of such a channel can be written as

$$\vec{R} = \sqrt{s}\, H\vec{X} + \vec{N}, \qquad (4.6)$$

where $\vec{N}$ follows a multivariate Gaussian distribution with covariance $\Sigma_{\vec{N}}$. In this case, Guo's result becomes [75]

$$\frac{d}{ds} I(\vec{X}; \vec{R}) = \frac{1}{2}\, E\!\left[\, \big\| H\vec{X} - E[H\vec{X}\,|\,\vec{R}] \big\|^2 \,\right]. \qquad (4.7)$$
The right hand side of Eq. (4.7) is the mmse in estimating H ~X rather than ~X and
therefore, we denote it by mmseH throughout the rest of this work to avoid confusion.
For an arbitrary noise covariance $\Sigma_{\vec{N}}$, $\mathrm{mmse}_H$ can be computed as $\mathrm{Tr}(H^\dagger \Sigma_{\vec{N}}^{-1} H E)$, where $E = E[(\vec{X} - E[\vec{X}|\vec{R}])(\vec{X} - E[\vec{X}|\vec{R}])^T]$, $H^\dagger$ denotes the Hermitian conjugate of H, and $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix. Therefore, the relationship between mutual information and $\mathrm{mmse}_H$ can be written as

$$\frac{d}{ds} I(\vec{X}; \vec{R}) = \frac{1}{2}\,\mathrm{mmse}_H = \frac{1}{2}\,\mathrm{Tr}\!\left(H^\dagger \Sigma_{\vec{N}}^{-1} H E\right). \qquad (4.8)$$
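For a Gaussian source the relation in Eq. (4.8) can be verified numerically, since both $I(\vec{X};\vec{R})$ and the error covariance E then have closed forms. The channel matrix, identity covariances, and finite-difference step below are illustrative assumptions:

```python
import numpy as np

def mutual_info_gaussian(s, H, Sx, Sn):
    """I(X;R) in nats for R = sqrt(s) H X + N with Gaussian X ~ N(0, Sx), N ~ N(0, Sn)."""
    M = np.eye(H.shape[0]) + s * np.linalg.inv(Sn) @ H @ Sx @ H.T
    return 0.5 * np.log(np.linalg.det(M))

def mmse_H(s, H, Sx, Sn):
    """Tr(H^T Sn^{-1} H E) of Eq. (4.8), using the Gaussian posterior error covariance E."""
    E = np.linalg.inv(np.linalg.inv(Sx) + s * H.T @ np.linalg.inv(Sn) @ H)
    return np.trace(H.T @ np.linalg.inv(Sn) @ H @ E)

rng = np.random.default_rng(5)
H = rng.normal(size=(4, 3))
Sx, Sn = np.eye(3), np.eye(4)
s, ds = 2.0, 1e-5
lhs = (mutual_info_gaussian(s + ds, H, Sx, Sn)
       - mutual_info_gaussian(s - ds, H, Sx, Sn)) / (2 * ds)  # dI/ds, central difference
rhs = 0.5 * mmse_H(s, H, Sx, Sn)
```

The finite-difference derivative of the mutual information matches the half-trace expression, which is the identity the TSI computations in this chapter rely on.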
These results have also been extended to the case for which the channel input is a random function of $\vec{X}$, denoted by $\vec{Y} = C(\vec{X})$. The relation between $I(\vec{X};\vec{R})$ and $\mathrm{mmse}_H$ for a random function $C(\vec{X})$ is slightly different from the previous expression in Eq. (4.8). Using the stochastic encoding model we have

$$\vec{R} = \sqrt{s}\, H\, C(\vec{X}) + \vec{N}. \qquad (4.9)$$
In this case the relation between mutual information and mmse can be expressed
We now combine the encoding model for localization, defined in Eq. (4.22), with
the detection and classification models described in the previous section. For the joint
detection/localization task we are interested in detecting the presence of a target and
if present, localizing it in one of Q regions. The imaging model from Eq. (4.22)
becomes

$$\vec{R} = \sqrt{s}\, H\, T\, \Lambda(X)\,\vec{\rho}\,\alpha + \vec{N}_c, \qquad (4.23)$$

where α is a binary variable indicating the presence or absence of the target. Therefore, the virtual source in this case is a (Q+1)-ary variable defined as $X' \in \{X, 0\}$, so that when α = 0, X' = 0, and when α = 1, X' = X. Comparing Eq. (4.10) with the imaging model shown in Eq. (4.23), we note that $\vec{X}$ and $\vec{Y}$ in Eq. (4.10) are equal to the virtual source X' and the term $T\Lambda(X)\vec{\rho}\,\alpha$ respectively.
The channel operator H is replaced with H and ~N is replaced by ~Nc. Therefore, TSI
and mmseH for this task can be expressed as
TSI = I(X′; ~R) = (1/2) ∫_0^s mmseH(s′) ds′, (4.24)

where mmseH(s) = Tr(H† Σ~Nc^−1 H (E~Y − E~Y|X′)), (4.25)

X′ ∈ {X, 0}, ~Y = TΛ(X)~ρα. (4.26)
The (Q+1)-ary nature of the virtual source variable in the joint detection/localization
task increases the upper bound on TSI as compared to that for the simple detection
task. For the probabilities Pr(α = 1) = p and Pr(α = 0) = 1 − p, the TSI is upper
bounded by
J(X′) = −(1 − p) log(1 − p) − Σ_{q=1}^{Q} Pr(X = q) log Pr(X = q), (4.27)

where Σ_{q=1}^{Q} Pr(X = q) = p. For the case of p = 1/2 and Pr(X = q) = p/Q, the maximum TSI is [1 + (1/2) log Q] bits.
Finally, we consider the joint classification/localization task where the task of
interest is to identify one of the two targets from H1 or H2 and localize it in one of
Q regions. The exact position of the target within each region remains a nuisance
parameter. The imaging model for this task is given by
~R = √s H T Ω(X) ρ ~α + ~Nc. (4.28)
This model is the same as the one given in Eq. (4.23) except for minor modifications.
The total number of positions that each target can take remains unchanged. However, now T has dimensions M² × 2P and is given by T = [TH1 TH2], where THi is the target profile matrix for target i. The arrangement of the target profiles in TH1 and TH2 is similar to the arrangement described in Subsection 4.2.3. The virtual source in this case is 2Q-ary and given by ~X′ = [X, ~α], where X ∈ {1, 2, …, Q} indicates the region and ~α ∈ {[1, 0]^T, [0, 1]^T} represents one of the two targets. The localization matrix Ω(X = i) now has dimensions 2P × 2Pi for selecting the H1 and H2 profiles in the
Figure 4.10. Example scenes: (a) Tank in the middle of the scene, (b) Tank in the top of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene.
region i and is given by
Ω(X = i) = [ Λ(X = i)   0
             0   Λ(X = i) ],  (4.29)
where the matrices Λ(X = i) and 0 are of dimension P × Pi. The matrix Λ(X) is identical to the one in Eq. (4.22). Fig. 4.9 illustrates the role of TΩ(X) in choosing the H1 and H2 profiles at all positions in the region specified by X. This example uses X = 2, Q = 4, and Pi = P/4 for i = 1, 2, 3, 4. The matrix TΩ(X) in Eq. (4.28) is post-multiplied by the matrix ρ of dimension 2Pi × 2 to yield the targets H1 and H2 at one of the positions in region i. Here ρ is defined as
ρ = [ ~ρH   0
      0   ~ρH ],  (4.30)

where 0 is an all-zero Pi-dimensional column vector and ~ρH ∈ {~e1, ~e2, …, ~ePi}, where ~ek
is an indicator vector as before. Therefore, for ~ρH = ~ek, TΩ(X)ρ results in an M² × 2 matrix with its first column representing H1 at the kth position in region i and its second column representing H2 at the same position. This result is then multiplied by ~α, which selects either H1 or H2 for ~α = [1, 0]^T or ~α = [0, 1]^T respectively.
The TSI expression in Eq. (4.24) requires only minor modifications to remain valid
for the joint classification and localization problem. The upper bound for TSI in this
task is given by
J(~X′) = −Σ_{i=1}^{2} Σ_{q=1}^{Q} Pr(X = q, ~αi) log Pr(X = q, ~αi), (4.31)

where ~α1 = [0, 1]^T, ~α2 = [1, 0]^T, Σ_{q=1}^{Q} Pr(X = q, ~α1) = 1 − p, and Σ_{q=1}^{Q} Pr(X = q, ~α2) = p. For the case when p = 1/2, Pr(X = q, ~α1) = (1 − p)/Q and Pr(X = q, ~α2) = p/Q, the maximum TSI is [1 + log Q] bits.
4.3. Simple Imaging Examples
The TSI framework described in the previous section allows us to evaluate the task-
specific performance of an imaging system for a task defined by a specific encoding
operator and virtual source variable. Three encoding operators, corresponding to three different tasks, have been defined: (a) detection, (b) classification, and (c) joint detection/classification and localization. Now we apply the TSI framework to evaluate the performance of both a geometric imager and a diffraction-limited imager on these three tasks.
We begin by describing the source, object, and clutter used in the scene model.
The source variable X in the detection task represents “tank present” or “tank absent” conditions with equal probability, i.e., p = 1/2. In the classification task, the source
variable ~X represents “tank present” or “jeep present” states with equal probability.
The joint localization task adds the position parameter to both the detection and
classification tasks. From Eq. (4.16) we see that the source parameter is the input
to the encoding operator, which in turn generates a scene consisting of both object
and clutter. Here the scene ~Y is of dimension 80 × 80 pixels (M = 80). The object
in the scene can be either a tank or a jeep at one of 64 equally likely positions
(P = 64). Therefore, the matrix T has dimensions of 6400 × 64 for the detection
task and 6400 × 128 for the classification task. In our scene model, the number of
clutter components is set to K = 6. Recall that the clutter components are arranged
as column vectors in the clutter matrix Vc. Clutter is generated by combining these
components with relative weights specified by the column vector ~β. Note that each
clutter vector is non-random but the weight vector ~β follows a multivariate Gaussian
distribution. In the simulation study the mean of ~β is set to ~µ~β = [160 80 40 40 64 40]
and the covariance to Σ~β = ~µ~β^T I/5. The clutter-to-noise ratio, denoted by c, is set to 1.
The noise ~N is zero mean with identity covariance matrix Σ ~N = I.
Monte-Carlo simulations with importance sampling are used to estimate mmseH
using the conditional mean estimators for a given task. The mmseH estimates are
numerically integrated to obtain TSI over a range of s. For each value of s, we use 160,000 clutter and noise realizations in the Monte-Carlo simulations.
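The two-stage numerical procedure described above, Monte-Carlo estimation of mmseH followed by numerical integration over s, can be sketched on a toy detection problem. All sizes, target profiles, and the clutter-free, H = I channel below are illustrative assumptions, not the dissertation's 80 × 80 scene model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy detection task: a d-pixel scene, target at one of P equally likely
# positions (position is a nuisance parameter), target present with prob 1/2.
d, P = 16, 4
T = np.zeros((d, P))
for k in range(P):
    T[4 * k:4 * k + 2, k] = 1.0                  # hypothetical target profiles

hyp = np.vstack([T.T, np.zeros((1, d))])         # P position hypotheses + "absent"
prior = np.array([0.5 / P] * P + [0.5])

def mmse_difference(s, n=10_000):
    present = rng.random(n) < 0.5
    pos = rng.integers(0, P, n)
    Y = T[:, pos].T * present[:, None]            # true scenes, (n, d)
    R = np.sqrt(s) * Y + rng.standard_normal((n, d))
    ll = -0.5 * np.sum((R[:, None, :] - np.sqrt(s) * hyp[None]) ** 2, axis=2)
    w = np.exp(ll - ll.max(axis=1, keepdims=True)) * prior
    w /= w.sum(axis=1, keepdims=True)
    Y_hat = w @ hyp                               # E[Y|R]
    wp = w[:, :P] / np.clip(w[:, :P].sum(axis=1, keepdims=True), 1e-300, None)
    Y_hat_X = (wp @ T.T) * present[:, None]       # E[Y|R,X] (zero when absent)
    e_y = np.mean(np.sum((Y - Y_hat) ** 2, axis=1))
    e_yx = np.mean(np.sum((Y - Y_hat_X) ** 2, axis=1))
    return e_y - e_yx                             # mmse_H at this s

s_grid = np.linspace(0.0, 30.0, 31)
diff = np.array([mmse_difference(s) for s in s_grid])
# TSI(s_max) = 0.5 * integral of mmse_H over s (nats), converted to bits
tsi_bits = 0.5 * np.sum((diff[1:] + diff[:-1]) / 2 * np.diff(s_grid)) / np.log(2)
print(tsi_bits)   # should approach the 1-bit entropy of the binary source
```

Consistent with the trends discussed below, the integrated difference mmse saturates near the 1-bit source entropy once s is large enough for reliable detection.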
Figure 4.11. Detection task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
4.3.1. Ideal Geometric Imager
The geometric imager represents an ideal imaging system with no blur and therefore, we set H = I. Fig. 4.10 shows some example scenes resulting from object realizations
measured in the presence of noise. Note that the object in the scene is either a tank
or a jeep at one of the 64 positions.
We begin by describing the results for the detection task. Fig. 4.11(a) and
Fig. 4.11(b) show the plots of mmseH and TSI versus s respectively. Recall that
the mmseH is equal to the difference of E~Y and E~Y |X represented by the dotted and
dashed curves in Fig. 4.11(a) respectively. The term E~Y |X represents the mmse in
estimating ~Y given the knowledge of both the measurement ~R and source X. There-
fore, we expect it to always be less than E~Y , which is the mmse in estimating ~Y
given only the measurement ~R. Fig. 4.11 confirms this behavior. In the low s region,
mmseH (in solid line) is small as both E~Y and E~Y |X are nearly equal. Despite the
additional conditioning on X, E~Y |X does not significantly improve upon E~Y as the
noise remains the dominating factor. However, in the moderate s region E~Y |X im-
proves faster than E~Y and therefore the mmseH increases here. In the high s regime,
the noise has negligible effect and hence the additional knowledge of X does not sig-
nificantly improve E~Y |X . This leads to the mmseH converging towards zero as both
the mmse components become equal. The solid line in Fig. 4.11(b) shows the plot
of TSI versus s. As expected the TSI increases with s eventually saturating at 1 bit.
The saturation occurs because TSI is always upper bounded by the entropy of the
virtual source X. The TSI plot confirms our expectations regarding blur-free imaging
system performance with increasing s.
Now we consider TSI for the joint task of detecting and localizing a target. The
scene is partitioned into four regions, i.e., Q = 4. There are a total of 64 allowable
target positions, with 16 positions in each region. Fig. 4.12 shows some examples
scenes. Recall that the position of the target within each region is a nuisance parameter.

Figure 4.12. Scene partitioned into four regions: (a) Tank in the top left region of the scene, (b) Tank in the top right region of the scene, (c) Tank in the bottom left region of the scene, and (d) Tank in the bottom right region of the scene.

We assume that the probability of the target being present or absent is 1/2 and the conditional probability of the target being in any of the four regions is 1/4, given that the target is present. The entropy of the source variable therefore increases to 2 bits as
per Eq. (4.27). Fig. 4.13(a) shows a plot of mmse versus s for the joint detection and
localization task. The dotted line represents the mmse of the estimator conditioned
over the image measurement only. The dashed line corresponds to the mmse of the
estimator conditioned jointly on the virtual source variable and the image measure-
ment. As expected we see that E~Y |X ≤ E~Y . The solid line represents mmseH , the
difference between the dotted and dashed curves, and is integrated to yield TSI. The TSI of the geometric imager is plotted as a solid line versus s in Fig. 4.13(b). We note that the TSI saturates at 2 bits as expected.
The previous two examples have demonstrated how the formalism of Section 4.2
Figure 4.13. Joint detection/localization task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
can be applied to either a detection task or a joint detection/localization task. These
examples have also confirmed the two important TSI trends: (1) TSI is a monotoni-
cally increasing function of signal to noise ratio and (2) TSI saturates at the entropy
of the virtual source. Section 4.2 also described how a classification task or a joint
classification/localization task may be captured within the TSI formalism. The solid
curve in Fig. 4.14 depicts the TSI obtained from an ideal geometric imager for a
classification task in which the two classes are equally probable. Recall that for the
classification task we treat the position as the nuisance parameter, so the equiprobable assumption results in a virtual source entropy of 1 bit. As expected, the TSI
in Fig. 4.14 saturates at 1 bit. Fig. 4.15 presents the results of the TSI analysis of the
joint classification/localization task. Once again we have used two equally probable
targets and Q = 4 equally probable regions resulting in a source entropy of 3 bits. We
see that once again despite the measurement entropy that results from random clut-
ter and noise, the TSI provides an accurate estimate of the task-specific information,
saturating at 3 bits.
4.3.2. Ideal Diffraction-limited Imager
The previous subsection presented the TSI results for an ideal geometric imager.
Those results should therefore be interpreted as upper bounds on the performance of
any real-world imager. In this subsection, we examine the effect of optical blur on TSI.
We will assume aberration-free, space-invariant, diffraction-limited performance. The
discretized optical point spread function (PSF) associated with a rectangular pupil
can be expressed as [29]
hi,j = ∫_{−∆/2}^{∆/2} ∫_{−∆/2}^{∆/2} sinc²((x − i∆)/W) sinc²((y − j∆)/W) dx dy, (4.32)
where ∆ is the detector pitch and W quantifies the degree of optical blur associated
with the imager. Lexicographic ordering of this two-dimensional PSF yields one row
of H and all other rows are obtained by lexicographically ordering shifted versions of
Figure 4.14. Classification task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
this PSF. The optical blur is set to W = 2 and the detector pitch is set to ∆ = 1 so
that the optical PSF is sampled at the Nyquist rate. The clutter and noise statistics
remain unchanged.
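A minimal numerical sketch of the PSF in Eq. (4.32) and of building the blur matrix H follows. The small scene size and the truncated PSF support are illustrative assumptions (the dissertation uses M = 80 and full rows of H):

```python
import numpy as np

# Discretized diffraction-limited PSF for a rectangular pupil, Eq. (4.32),
# with detector pitch Delta = 1 and blur width W = 2 (Nyquist sampling).
W, Delta = 2.0, 1.0

def psf_entry(i, j, nq=64):
    # Midpoint quadrature of sinc^2((x - i*Delta)/W) * sinc^2((y - j*Delta)/W)
    # over one detector cell [-Delta/2, Delta/2]^2; np.sinc(u) = sin(pi u)/(pi u)
    x = (np.arange(nq) + 0.5) / nq * Delta - Delta / 2
    gx = np.sinc((x - i * Delta) / W) ** 2
    gy = np.sinc((x - j * Delta) / W) ** 2
    return np.outer(gx, gy).sum() * (Delta / nq) ** 2

# Build H for a small M x M scene: each row is a lexicographically ordered,
# shifted copy of the 2-D PSF (truncated support "half" is a sketch shortcut)
M, half = 8, 4
offsets = range(-half, half + 1)
kernel = np.array([[psf_entry(i, j) for j in offsets] for i in offsets])
H = np.zeros((M * M, M * M))
for row in range(M):
    for col in range(M):
        for di in offsets:
            for dj in offsets:
                rr, cc = row + di, col + dj
                if 0 <= rr < M and 0 <= cc < M:
                    H[row * M + col, rr * M + cc] = kernel[di + half, dj + half]
print(kernel[half, half])   # central PSF weight (dominant tap)
```

The separable kernel is symmetric, and its central tap, roughly 0.87 for W = 2 and ∆ = 1, shows how much energy the blur leaves on the on-axis detector.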
Fig. 4.16 shows examples of images that demonstrate the effects of both optical
blur and noise. The object, as before, is either a tank or a jeep at one of the 64
positions. The plots of TSI versus s are represented by dash-dot curves for the
detection and classification tasks in Fig. 4.11(b) and Fig. 4.14 respectively. The TSI
metric verifies that imager performance is degraded due to optical blur compared to
the geometric imager. For example, in the detection task, s = 34 yields TSI = 0.9 bit
for the geometric imager, whereas a higher signal to noise ratio s = 43 is required to
achieve the same TSI for the diffraction-limited imager.
The dash-dot curves in Fig. 4.13(b) and Fig. 4.15 show the TSI versus s plots
for the joint detection/localization and classification/localization tasks respectively.
Once again we see that TSI is reduced due to optical blur. In Fig. 4.13(b) TSI = 1.8 bit
Figure 4.15. Joint classification/localization task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
is achieved at s = 35 for the diffraction-limited imager as opposed to s = 28 in the case of the geometric imager for the detection/localization task. Similarly, for the
classification/localization task the signal to noise ratio required to achieve TSI =
2.7 bit increases by 10 due to the optical blur associated with the diffraction-limited
imager.
In this section, we have presented several numerical examples that demonstrate
how the TSI analysis can be applied to various tasks and/or imaging systems. The
results obtained herein are consistent with our expectations that (1) TSI increases
with increasing signal to noise ratio, (2) TSI is upper bounded by J(X), and (3)
blur degrades TSI. Although these general trends were known in advance of our
analysis, we are encouraged by our ability to quantify these trends using a formal
approach. In the next section we will use a TSI analysis to evaluate the target-
detection performance of two candidate compressive imagers.
Figure 4.16. Example scenes with optical blur: (a) Tank in the top of the scene, (b) Tank in the middle of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene.
Figure 4.17. Block diagram of a compressive imager: the virtual source X is mapped to the scene Y by the encoding C[·], imaged through the channel H[·], projected by P[·], and corrupted by the noise N[·] to yield the measurement R.
4.4. Compressive imager
For task-specific applications (e.g. detection) an isomorphic measurement (i.e. a
pretty picture) may not represent an optimal approach for extracting TSI in the
presence of detector noise and a fixed photon budget. The dimensionality of the
measurement vector has a direct effect on the measurement signal to noise ratio [6].
Therefore, we strive to design an imager that directly measures the scene information
most relevant to the task while minimizing the number of detector measurements and
thereby increasing the measurement signal to noise ratio. One approach towards this
goal is to measure linear projections of the scene, yielding as many detector measure-
ments as there are projections. We refer to such an imager as a compressive imager; it is sometimes also called a projective or feature-specific imager. Fig. 4.17 shows
the imaging chain block diagram modified to include a projective transformation P.
For the compressive imager the measurement can be written as
R = N(P(H(C(X)))). (4.33)
We only consider discrete linear projections here, therefore the P operator is
represented by the matrix P. If we consider the detection task from Subsection 4.2.2
then the measurement model for the compressive imager can be written as

~R = √s P H T ~ρX + ~N′c, (4.34)

where ~N′c = √c P H Vc ~β + ~N.
The TSI and the mmseH expressions for the compressive imager are found by substituting PH for H in Eqs. (4.18)-(4.25), yielding

TSI ≡ I(X; ~R) = (1/2) ∫_0^s mmseH(s′) ds′, (4.35)

where mmseH(s) = Tr(H† P† Σ~N′c^−1 P H (E~Y − E~Y|X)), (4.36)

with ~Y = T~ρX, where E~Y and E~Y|X are given earlier in Eq. (4.10).
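The compressive measurement model of Eq. (4.34) can be sketched numerically. The dimensions, the random stand-in projection matrix, and the geometric (H = I) channel below are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy instance of Eq. (4.34): d = M^2 pixels, P_pos target positions,
# K linear projections, L clutter components, H = I for brevity.
d, P_pos, K, L, s, c = 64, 8, 4, 6, 10.0, 1.0
T = rng.random((d, P_pos))                     # stand-in target-profile matrix
Vc = rng.standard_normal((d, L))
Vc /= np.linalg.norm(Vc, axis=0)               # unit-norm clutter components
Pmat = rng.standard_normal((K, d))             # K rows = K linear projections

def measure(present):
    rho = np.zeros(P_pos)
    if present:
        rho[rng.integers(P_pos)] = 1.0         # indicator of the target position
    beta = rng.standard_normal(L)              # random clutter weights
    colored = np.sqrt(c) * Pmat @ Vc @ beta    # projected clutter, part of N'_c
    return np.sqrt(s) * Pmat @ T @ rho + colored + rng.standard_normal(K)

r = measure(True)
print(r.shape)   # K measurements instead of d pixels
```

The point of the model is visible in the shapes: the detector records only K inner products of the scene rather than all d pixel values.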
Similarly, for the joint detection/localization task from Subsection 4.2.4 the modified expressions for the imaging model and TSI are given by

~R = √s P H T Λ(X) ~ρα + ~Nc, (4.37)

TSI ≡ I(X′; ~R) = (1/2) ∫_0^s mmseH(s′) ds′, (4.38)

where mmseH(s) = Tr(H† P† Σ~Nc^−1 P H (E~Y − E~Y|X′)), (4.39)

with X′ ∈ {X, 0} and ~Y = TΛ(X)~ρα.
We consider compressive imagers based on two classes of projection: a) princi-
pal component projections and b) matched filter projections. Their performance is
compared with that of the conventional diffraction-limited imager.
4.4.1. Principal component projection
Principal component (PC) projections are determined by the statistics of the object
ensemble. For a set of objects O, the PC projections are defined as the eigenvectors
of the object auto-correlation matrix ROO given by
ROO = E(o o^T), (4.40)
where o ∈ O is a column vector formed by lexicographically arranging the elements
of a two-dimensional object. Note that the expectation is over all objects in the set
O. These PC projection vectors are used as rows of the projection matrix P∗. In
our numerical study, example objects in the set O are obtained by generating sample
The terms E(~Y |~R, s) and E(~Y |~R,X, s) represent the conditional estimators for ~Y, with the former conditioned on ~R and the latter conditioned on both ~R and X. Explicit expressions for these estimators, required for evaluating the expectations
in Eq. (5.8), can be found in Appendix A. Note that the relation between TSI and mmse specified in Eq. (5.7) suggests that TSI increases with increasing mmse, which is counterintuitive. However, note that the actual mmse expression in Eq. (5.7) is
composed of two individual mmse terms E~Y and E~Y |X . The first mmse term E~Y is
the expected error in estimating ~Y given the measurement ~R, while the second mmse
term E~Y |X denotes the expected error given the joint knowledge of both ~R and X.
Fig. 5.4 shows a plot of these two mmse terms along with the difference mmse as a
function of SNR for a conventional imager. Note that in the low SNR region the two
mmse terms have similar values as the additional knowledge of X does not improve
the error significantly because the noise dominates in this region. In the mid SNR
region, the effect of noise is reduced and therefore the second mmse error is lower, leading to an increase in the difference mmse. The two mmse terms converge in the high SNR region as the noise becomes negligible with increasing SNR, thereby making
the difference mmse smaller. Given that it is the difference between these mmse terms
whose integral is equal to TSI, it is expected that increasing the difference mmse leads
to a higher TSI. Note that TSI cannot be increased arbitrarily by simply increasing the difference mmse because the integral of the difference mmse is upper-bounded by the entropy of the source variable X.
Figure 5.4. Difference mmse and mmse components versus SNR for a conventional imager.
5.2.2. Simulation details
The source variable X in our detection task represents “tank present” or “tank absent” conditions with equal probability, i.e., p = 1/2. Here we consider a scene of dimension 80 × 80 pixels (i.e. M = 80). The object in the scene is a “tank” at one of P = 64 equally likely positions and therefore, the matrix T is of dimensions 6400 × 64. In our scene model, the number of clutter components is set to L = 6 with the L2 norm of each column vector in Vc set to unity. The mean of the mixing vector ~β is set to ~µ~β = [160 80 40 40 64 40] and the covariance to Σ~β = ~µ~β^T I/5. The CNR is set to c = 1 and the detector noise ~N is AWGN with zero mean and covariance Σ~N = I.
We assume that the imaging optics in Fig. 5.1 (b) exhibits aberration-free, space-
invariant, and diffraction-limited performance for each lenslet. The discretized optical
point spread function (PSF) associated with a rectangular pupil therefore assumes
Figure 5.5. Example scenes with optical blur and noise: (a) Tank in the top of the scene, (b) Tank in the middle of the scene.
the following form [29]
h(i, j) = ∫_{−∆/2}^{∆/2} ∫_{−∆/2}^{∆/2} sinc²((x − i∆)/W) sinc²((y − j∆)/W) dx dy, (5.9)
where W quantifies the degree of optical blur associated with the imaging optics and
∆ is the pixel pitch of the mask in Fig. 5.1(b). The optical blur is set to W = 2 and
the pixel pitch is set to ∆ = 1 so that the optical PSF is sampled at the Nyquist rate.
Note that a lexicographic ordering of the two-dimensional PSF yields one row of H
and all other rows are obtained by lexicographically ordering the appropriately shifted
version of this PSF. Fig. 5.5 shows example images that demonstrate the effects of
both optical blur and noise.
To ensure a fair comparison between CI and conventional imaging, we introduce
a system constraint based on the total photon-count. This system constraint has two
physical implications: 1) the total number of photons incident on the detector array
is always less than or equal to the total number of photons entering the entrance pupil
(i.e., the CI system is passive) and 2) the total number of photons available at the
entrance pupil is fixed (i.e., the CI system uses the same pupil and observation time
as the conventional imager). Mathematically, this total photon-count constraint can
be expressed as
P = (1/ω) P∗, (5.10)

where ω = max_j Σ_{i=1}^{K} |P∗ij| denotes the maximum absolute column sum of the
matrix operator P∗. Here P∗ represents the original unnormalized projection matrix
and P refers to the normalized photon-count-constrained matrix that is implemented
optically. Note that a conventional imager does not employ an SLM and instead uses a sensor array for image measurement; in this case both P and P∗ are equal to an identity matrix of dimension M² × M².
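The normalization in Eq. (5.10) is a one-line operation. The small matrix below is a made-up example used only to show the maximum-absolute-column-sum scaling:

```python
import numpy as np

# Photon-count constraint of Eq. (5.10): divide the projection matrix by its
# maximum absolute column sum so the optical implementation remains passive.
def normalize_photon_count(P_star):
    omega = np.abs(P_star).sum(axis=0).max()   # omega = max_j sum_i |P*_ij|
    return P_star / omega

P_star = np.array([[0.5, -1.0, 0.25],
                   [1.0,  0.5, -0.75]])        # toy unnormalized projections
P = normalize_photon_count(P_star)
print(np.abs(P).sum(axis=0).max())             # constraint now met with equality
```

After scaling, no pixel of the SLM mask routes more light than arrives at it, while the relative weighting of the projections is preserved.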
Monte-Carlo simulations with importance sampling [68] are used to estimate the
mmse via the conditional mean estimators defined in Eq. (5.8). For each value of
s, 8000 clutter and noise realizations are used to estimate the mmse. These mmse
estimates are then numerically integrated with respect to SNR over the interval [0, s],
via the adaptive Lobatto quadrature method [96], to yield the TSI at s.
5.3. Optimization framework
We now describe the optimization framework for designing a CI system to maximize
task-specific performance. The degrees of freedom available in a CI system include
all the elements comprising the projection matrix P (i.e. all elements of P are valid
design variables). The constrained optimization problem can therefore be expressed
as
max_P [TSI], such that max_j Σ_{i=1}^{K} |Pij| = 1. (5.11)
However, we note that the computational complexity resulting from the use of TSI
as a design metric increases exponentially with the number of design variables (which
is equal to K ×M2). As a result, this optimization approach becomes computation-
ally intractable for realistic scene dimensionality. Therefore, we pursue an alternate
approach that attempts to find the optimal photon-allocation per feature for a given
projection basis. This approach reduces the number of design parameters from K × M² to K, and therefore lowers the computational burden to a manageable level. We expect a TSI improvement from the non-uniform photon-allocation scheme because the photon budget can now be distributed among the basis vectors according to their task-relevance. Note that in this approach the projection basis is pre-determined and not optimized.
Within this optimization framework, the fraction of photons associated with the ith basis vector ~P∗i (i.e. the ith row of P∗) is denoted by the design variable πi. Therefore, for a given projection basis P∗ there is an associated photon-allocation vector ~π defined as ~π = [π1, π2, · · · , πK]. Note that the non-uniform photon-allocation vector ~π can be implemented via the use of non-uniform lenslet diameters in the parallel CI architecture. Designing a CI system within the proposed optimization framework involves three steps: 1) construct the unnormalized projection matrix P∗ = [~P∗1, ~P∗2, · · · , ~P∗K]^T by choosing K projection vectors from the pre-defined basis, 2) construct the normalized projection matrix P = diag(~π)P∗ by choosing a ~π that satisfies the photon-count constraint, where diag(·) denotes a diagonal matrix whose diagonal is equal to its vector argument, and 3) optimize the associated photon-allocation vector ~π in the presence of the total photon-count constraint to maximize the TSI for a given value of SNR. Mathematically, this constrained optimization
problem can be expressed as
max_~π [TSI], such that max_j Σ_{i=1}^{K} |[diag(~π)P∗]ij| = 1. (5.12)
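A toy stand-in for the constrained search of Eq. (5.12) is sketched below. The real design maximizes the Monte-Carlo TSI with simulated tunneling; here a cheap quadratic surrogate objective and plain hill climbing (both assumptions, not the dissertation's method) illustrate only the structure of the constrained optimization over ~π:

```python
import numpy as np

rng = np.random.default_rng(3)

K, d = 4, 16
P_star = rng.standard_normal((K, d))           # fixed, pre-determined basis
weights = np.array([4.0, 2.0, 1.0, 0.5])       # hypothetical per-feature relevance

def feasible(pi):
    """Scale diag(pi) P* so its maximum absolute column sum equals 1."""
    P = np.diag(pi) @ P_star
    return P / np.abs(P).sum(axis=0).max()

def surrogate_tsi(pi):
    P = feasible(pi)
    return float(weights @ np.sum(P ** 2, axis=1))   # placeholder, NOT real TSI

pi = np.full(K, 1.0 / K)                       # start from uniform allocation
best_val = surrogate_tsi(pi)
for _ in range(2000):
    cand = np.clip(pi + 0.05 * rng.standard_normal(K), 1e-6, None)
    val = surrogate_tsi(cand)
    if val > best_val:                         # greedy accept (tunneling would
        pi, best_val = cand, val               # also allow escapes from optima)
print(best_val >= surrogate_tsi(np.full(K, 1.0 / K)))
```

Because only improving moves are accepted and the search starts from the uniform allocation, the final objective value is never worse than the uniform-allocation baseline, mirroring the TSI gain expected from non-uniform photon allocation.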
We use an optimization algorithm based on simulated tunneling [56] to maximize
the TSI for a given value of s. The simulated tunneling approach guarantees conver-
gence to the global maximum/minimum of an optimization problem as the number
of iterations tends to infinity. We observe convergence to a common solution after
5000 iterations from multiple different initial conditions, giving confidence that our
TSI optimization framework results in a global optimum. Note that the computational
complexity of each iteration step is a function of the number of target positions P , the
number of projection vectors K, the SNR parameter s and the number of clutter/noise
realizations NCN used in the Monte-Carlo simulation. The number of floating-point operations (Flops) involved in each evaluation of the objective function can be expressed as ⌊√(10s)⌋ NCN (2P⁴ + 2P³ + 3P² + PK). For example, at s = 5, K = 1, NCN = 8000 and P = 64, 1778 GFlops were required to compute the TSI. Therefore,
as the number of target positions P is increased the computational cost grows quartically, O(P⁴). Some practical tricks could be employed for large values of P, such as 1) Monte-Carlo simulations over the diverse perspectives only, and 2) parametrization of the target library with far fewer than P parameters. However, it is important to realize that the actual target-detection problem does not become more complex as P increases (for the same number of measured features) [84]. As mentioned earlier, several differ-
ent projection bases are considered for use in the CI system design. Now we describe
each of these projection bases in the context of our target-detection task.
5.3.1. Principal component projections
Principal component (PC) projections are derived from principal component analysis,
and are frequently employed for data dimensionality reduction in pattern recognition
problems [81]. The salient aspect of this basis is its strong energy compaction property
that leads to dimensionality reduction with the smallest reconstruction RMSE for certain types of signals [97, 98]; normally distributed signals fall into this category.
In practice, PC projections are computed using second-order statistics of a training
set chosen to represent an object ensemble. Specifically, for a training set O, the PC
projections are defined as the eigenvectors of the object auto-correlation matrix ROO
defined as
ROO = E{~o ~o^T}, (5.13)
Figure 5.6. Example projection vectors in the PC projection basis; clockwise from upper left: #2, #6, #16, #31.
where ~o ∈ O is a column vector formed by lexicographically arranging the elements of
a two-dimensional image in O. Note that the expectation, denoted by the operator E{·}, is over the complete training set O. In this work the object samples in the training
set O were obtained by generating sample realizations of scenes with varying clutter
levels, target strength, and target position using the stochastic encoder C defined in
Eq. (5.2). The K dominant eigenvectors of ROO are used to create the projection
matrix P∗PC. Fig. 5.6 shows some example projection vectors from this PC projection
basis.
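The construction of the PC projection matrix from Eq. (5.13) can be sketched with toy sizes and synthetic correlated training data (both assumptions; the dissertation uses 6400-pixel scene realizations from the stochastic encoder):

```python
import numpy as np

rng = np.random.default_rng(4)

# PC projections: dominant eigenvectors of the sample auto-correlation matrix
# of lexicographically ordered training images, Eq. (5.13).
n, d, K = 500, 64, 8
mixing = rng.standard_normal((d, d)) * 0.1
O = rng.standard_normal((n, d)) @ mixing       # correlated training samples
R_oo = O.T @ O / n                             # sample estimate of E{o o^T}
evals, evecs = np.linalg.eigh(R_oo)            # eigenvalues in ascending order
order = np.argsort(evals)[::-1]
P_pc = evecs[:, order[:K]].T                   # K dominant eigenvectors as rows
energy = evals[order[:K]].sum() / evals.sum()  # fraction of eigenvalue sum kept
print(P_pc.shape, round(float(energy), 3))
```

The rows of the resulting matrix are orthonormal, and the retained eigenvalue fraction plays the same role as the 99.99% criterion used below to choose K.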
In Chapter 4, it was demonstrated that the PC compressive imager, with a uniform photon allocation, achieves a higher TSI than that of the conventional imager. This is the result of a higher measurement fidelity in a PC compressive imager due to
its strong image-energy compaction property. Fig. 5.7 shows the plot of TSI versus s
for the PC compressive imager for various choices of K. Observe that TSI increases
Figure 5.7. TSI versus SNR for PC compressive imager.
monotonically with s, eventually saturating at 1.0 bit. Also note that for a particular
SNR, TSI increases with K up to a certain value and then starts decreasing. We refer
to this behavior as the “rollover effect,” which is the result of a trade-off between
two competing processes: 1) as K increases, the projective measurements provide
more target-detection information, leading to an increase in TSI and 2) with a fixed
photon-budget, the measurement fidelity per feature decreases with increasing K,
resulting in a decrease in TSI. The tradeoff between these two processes results in
an optimal value of K that maximizes the TSI for a given value of SNR. For the
example in Fig. 5.7, the optimal value of K is 24 for s = 20. Note that the optimal
K is a function of SNR. From here onwards, we will refer to this effect of decreasing
measurement fidelity with increasing K as the “noise cost.”
In the next section we will further improve the PC compressive imager performance
by using the optimal photon-allocation. A PC projection matrix with K = 32 is
chosen as it accounts for more than 99.99% of the total eigenvalue sum. Here, the
total eigenvalue sum is defined as the sum of all eigenvalues of the auto-correlation matrix ROO. It is important to remember that the PC projection basis itself is not an optimal
choice for the target-detection task.
5.3.2. Generalized matched-filter projections
The generalized matched-filter (GMF) is commonly used for the purpose of target-
detection in radar applications. For a target-detection problem, in which the target
and the background are known exactly, the GMF provides optimal performance in
terms of maximizing the probability of detection for a fixed false alarm rate [47]. Re-
call that in our target-detection problem, the target position is a nuisance parameter
that must be estimated implicitly. In such a case, instead of a matched-filter (e.g.
correlator), we consider a set of matched projections, as described in Ref. [85]. Each
matched projection corresponds to the target at a given position. Therefore, the re-
sulting compressive imager yields an inner-product between the scene and the target
at a particular position as specified by each projection vector. The GMF projection
matrix P∗GMF is defined as
P∗GMF = T Σ~Nc^−1, (5.14)
where T is the modified target profile matrix, each row of which corresponds to a
target profile at a particular position. The number of positions chosen is K and
therefore, the dimension of the matrix T is K ×M2. The whitening transformation
Σ~Nc^−1 accounts for the joint effect of clutter and detector noise and is pre-multiplied by T, resulting in the final projection matrix P∗GMF [47]. We choose K = 64 to construct
the GMF projection basis matrix, thus accounting for all allowed target positions in
our scene model. Fig. 5.8 shows some examples of projection vectors from the GMF
projection matrix.
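The construction in Eq. (5.14) can be sketched as follows. The target profile, scene size, and the clutter-plus-detector-noise covariance below are random placeholders chosen only to make the example runnable, not values from this chapter:

```python
import numpy as np

rng = np.random.default_rng(1)
M2 = 256        # number of scene pixels (M x M, flattened) - illustrative
K = 64          # number of allowed target positions
target = rng.random(16)   # hypothetical 1-D target profile

# Build T: each row is the target profile shifted to one of the
# K allowed positions (zero elsewhere).
T = np.zeros((K, M2))
for k in range(K):
    pos = k * (M2 // K)
    T[k, pos:pos + target.size] = target[: M2 - pos]

# Combined clutter + detector-noise covariance; here a random
# symmetric positive-definite placeholder.
A = rng.standard_normal((M2, M2))
Sigma = A @ A.T + M2 * np.eye(M2)

# Eq. (5.14): P*_GMF = T Sigma^{-1}
P_gmf = T @ np.linalg.inv(Sigma)
```

Each row of `P_gmf` is a whitened matched projection; measuring the scene with these K rows yields the inner products the detector needs for every allowed target position.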
Figure 5.8. Example projection vectors in the GMF projection basis; clockwise from upper left: #1, #16, #32, #64.
For the detection task in Subsection 4.2.2, the virtual source variable X is binary;
therefore, substituting (A15) and (A3) into (A13) and simplifying we obtain the
following expressions for the estimator
E(~Y | ~R, X = 1) = [ Σ_{l=1}^{P} ~Y_l · exp(−(1/2)(Θ_{2l} + Θ_{3l})) ] / [ Σ_{m=1}^{P} exp(−(1/2)(Θ_{2m} + Θ_{3m})) ],   (A16)
E(~Y | ~R, X = 0) = 0,
where ~Y_l in (A16) is the target profile at the lth position. Θ_{2l} and Θ_{3l} in (A16)
are evaluated using (A6) and (A7), respectively. Similarly, for the classification task
defined in Subsection 4.2.3, the estimator in (A13) can be written as
E(~Y | ~R, ~X) = [ Σ_{l=1}^{P} ~Y_l · exp(−(1/2)(Θ_{2l} + Θ_{3l})) ] / [ Σ_{m=1}^{P} exp(−(1/2)(Θ_{2m} + Θ_{3m})) ],   (A17)
where ~Y_l in this case is the target profile specified by ~X at the lth position.
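The estimator in (A16)/(A17) is a convex combination of the candidate target profiles, with weights proportional to exp(−(Θ_{2l} + Θ_{3l})/2). A minimal numerical sketch follows; the profiles and the Θ terms are random stand-ins, whereas in the text they are evaluated from (A6) and (A7):

```python
import numpy as np

rng = np.random.default_rng(2)
P = 8                        # number of candidate target positions
Y = rng.random((P, 32))      # Y[l]: target profile at position l (placeholder)
theta = rng.random(P)        # stands in for Theta_2l + Theta_3l

# Normalize the weights exp(-theta/2) in the log domain for
# numerical stability (a standard log-sum-exp trick).
log_w = -0.5 * theta
log_w = log_w - log_w.max()
w = np.exp(log_w)
w = w / w.sum()

# Conditional-mean estimate: a convex combination of the profiles.
Y_hat = w @ Y
```

Because the weights are nonnegative and sum to one, each element of the estimate lies between the smallest and largest corresponding profile values.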
Recall from Subsection 4.2.4 that for the joint detection and localization task, the
virtual source variable X′ is (Q + 1)-ary. Note that X′ = X, where X denotes the
region in which the target is present when α = 1, and X′ = 0 when α = 0. The estimator
in (A13) for this case is given by
E(~Y | ~R, X = i, α = 1) = [ Σ_{l=1}^{P_i} ~Y_{i,l} · exp(−(1/2)(Θ_{i,2l} + Θ_{i,3l})) ] / [ Σ_{m=1}^{P_i} exp(−(1/2)(Θ_{i,2m} + Θ_{i,3m})) ],   (A18)
E(~Y | ~R, α = 0) = 0,
where X = i implies that the target is present in region i, and ~Y_{i,l} is the target
profile at the lth position of region i. Once again, Θ_{i,2l} and Θ_{i,3l} are evaluated
using (A6) and (A7), respectively, by substituting ~Y_l with the appropriate ~Y_{i,l}. In a similar manner, the
estimator E(~Y |~R, ~X) for the joint classification and localization task can be expressed
as
E(~Y | ~R, X = i, ~α) = [ Σ_{l=1}^{P_i} ~Y_{i,l,~α} · exp(−(1/2)(Θ_{i,2l} + Θ_{i,3l})) ] / [ Σ_{m=1}^{P_i} exp(−(1/2)(Θ_{i,2m} + Θ_{i,3m})) ],   (A19)
where ~Y_{i,l,~α}, Θ_{i,2l} and Θ_{i,3l} have the same meaning as in (A12).
References
[1] N. J. Wade and S. Finger, “The eye as an optical instrument: from camera obscura to Helmholtz’s perspective,” Perception 30(10), 1157-1177 (2001).
[2] W. Boyle and G. Smith, “Charge Coupled Semiconductor Devices,” Bell System Technical Journal 49, 587 (1970).
[3] G. E. Moore, “Cramming more components onto integrated circuits,” Electronics Magazine 38(8), (1965).
[4] E. R. Dowski and W. T. Cathey, “Extended Depth of Field Through Wavefront Coding,” Applied Optics 34(11), 1859-1866 (1995).
[5] P. Potuluri, U. Gopinathan, J. R. Adleman, and D. J. Brady, “Lensless sensor system using a reference structure,” Optics Express 11, 965-974 (2003).
[6] M. A. Neifeld and P. Shankar, “Feature-Specific Imaging,” Applied Optics 42, 3379-3389 (2003).
[7] http://www.cdm-optics.com
[8] H. H. Barrett, “Objective assessment of image quality: effects of quantum noise and object variability,” J. Opt. Soc. Am. A 7, 1266-1278 (1990).
[9] H. H. Barrett, J. L. Denny, R. F. Wagner, and K. J. Myers, “Objective assessment of image quality. II. Fisher information, Fourier crosstalk, and figures of merit for task performance,” J. Opt. Soc. Am. A 12, 834-852 (1995).
[10] H. H. Barrett, C. K. Abbey, and E. Clarkson, “Objective assessment of image quality. III. ROC metrics, ideal observers, and likelihood-generating functions,” J. Opt. Soc. Am. A 15, 1520-1535 (1998).
[11] L. Poletto and P. Nicolosi, “Enhancing the spatial resolution of a two-dimensional discrete array detector,” Optical Eng. 38, 1748-1757 (1999).
[13] S. Borman, “Topics in Multiframe Superresolution Restoration,” Ph.D. dissertation (University of Notre Dame, Notre Dame, 2004).
[14] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, “Fast and robust multi-frame super-resolution,” IEEE Trans. Image Process. 13, 1327-1344 (2004).
[15] N. Galatsanos and R. Chin, “Digital restoration of multichannel images,” IEEE Trans. on Acoustics, Speech, and Signal Process. 37, 415-421 (1989).
[16] S. P. Kim, N. K. Bose, and H. M. Valenzuela, “Recursive reconstruction of high resolution image from noisy undersampled multiframes,” IEEE Trans. on Acoustics, Speech, and Signal Process. 38, 1013-1027 (1990).
[17] H. Ur and D. Gross, “Improved resolution from subpixel shifted pictures,” Computer Vision Graphics Image Processing: Graph. Models Image Process. 54, 181-186 (1992).
[18] M. Elad and A. Feuer, “Restoration of a single superresolution image from several blurred, noisy and undersampled images,” IEEE Trans. on Image Process. 6, 1646-1658 (1997).
[19] J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, K. Ishida, T. Morimoto, N. Kondou, D. Miyazaki, and Y. Ichioka, “Thin Observation Module by Bound Optics (TOMBO): Concept and Experimental Verification,” Applied Optics 40, 1806-1813 (2001).
[20] Y. Kitamura, R. Shogenji, K. Yamada, S. Miyatake, M. Miyamoto, T. Morimoto, Y. Masaki, N. Kondou, D. Miyazaki, J. Tanida, and Y. Ichioka, “Reconstruction of a High-Resolution Image on a Compound-Eye Image-Capturing System,” Applied Optics 43, 1719-1727 (2004).
[21] P. M. Shankar, W. C. Hasenplaugh, R. L. Morrison, R. A. Stack, and M. A. Neifeld, “Multiaperture imaging,” Applied Optics 45, 2871-2883 (2006).
[22] M. A. Neifeld and A. Ashok, “Imaging using alternate point spread functions: Lenslets with pseudo-random phase diversity,” in Proceedings of OSA Topical Meeting: Computational Optical Sensing and Imaging (COSI), Charlotte, NC, June 6-8, paper CMB1 (2005).
[23] A. Ashok and M. A. Neifeld, “Engineering the point spread function for super-resolution from multiple low-resolution sub-pixel shifted frames,” in Proceedings of OSA Annual Meeting, Tucson, AZ, Oct 16-20 (2005).
[24] Q. Tian and M. N. Huhns, “Algorithms for subpixel registration,” Computer Vision Graphics Image Processing 35, 220-233 (1986).
[25] S. Verdu, Multiuser Detection, (Cambridge University Press, 1998), Chap. 2.
[26] J. Solomon, Z. Zalevsky, and D. Mendlovic, “Geometric Superresolution by Code Division Multiplexing,” Applied Optics 44, 32-40 (2005).
[27] A. Ashok and M. A. Neifeld, “Information-based analysis of simple incoherent imaging systems,” Optics Express 11, 2153-2162 (2003).
[28] H. H. Barrett and K. J. Myers, Foundations of Image Science, (Wiley-Interscience, 2004).
[29] J. W. Goodman, Introduction to Fourier Optics, (McGraw-Hill, 1996), Chap. 7.
[30] E. Y. Lam, “Noise in superresolution reconstruction,” Optics Letters 28, 2234-2236 (2003).
[31] H. C. Andrews and B. R. Hunt, Digital Image Restoration, (Prentice-Hall, Englewood Cliffs, N.J., 1977).
[32] D. J. Tolhurst, Y. Tadmor, and T. Chao, “Amplitude spectra of natural images,” Ophthalm. Physiol. Opt. 12, 229-232 (1992).
[33] D. L. Ruderman, “Origins of scaling in natural images,” Vision Res. 37, 3385-3398 (1997).
[34] D. J. Field and N. Brady, “Visual sensitivity, blur and the sources of variability in the amplitude spectra of natural scenes,” Vision Res. 37, 3367-3383 (1997).
[36] M. Irani and S. Peleg, “Improving resolution by image registration,” CVGIP: Graph. Models Image Process. 53, 231-239 (1991).
[37] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Roy. Stat. Soc. Ser. B 39, 1-38 (1977).
[38] L. B. Lucy, “An iterative technique for the rectification of observed distribution,” Astron. J. 79, 745-754 (1974).
[39] W. H. Richardson, “Bayesian-based iterative method of image restoration,” J. Opt. Soc. Am. A 56, 1141-1142 (1972).
[40] A. Ashok and M. A. Neifeld, “Recent progress on multidomain optimization for ultrathin cameras,” Proc. SPIE 6232, 62320N (2006).
[41] J. G. Daugman, “High confidence visual recognition of persons by a test of statistical independence,” IEEE Trans. PAMI 15, 1148-1161 (1993).
[42] J. G. Daugman, “The importance of being random: statistical principles of iris recognition,” Pattern Recognition 36, 279-291 (2003).
[43] J. G. Daugman, “How iris recognition works,” IEEE Trans. Circuits and Systems for Video Tech. 14(1), 21-30 (2004).
[44] R. Barnard, V. P. Pauca, T. C. Torgersen, R. J. Plemmons, S. Prasad, J. van der Gracht, J. Nagy, J. Chung, G. Behrmann, S. Mathews, and M. Mirotznik, “High-Resolution Iris Image Reconstruction from Low-Resolution Imagery,” Proc. SPIE 6313, 1-13 (2006).
[45] R. Narayanswamy, P. Silveira, H. Setty, V. Pauca, and J. van der Gracht, “Extended depth-of-field iris recognition system for a workstation environment,” Proc. SPIE 5779, 41-50 (2005).
[46] R. Narayanswamy, G. E. Johnson, P. E. X. Silveira, and H. B. Wach, “Extending the imaging volume for biometric iris recognition,” Appl. Opt. 44, 701-712 (2005).
[47] S. Kay, Fundamentals of Statistical Signal Processing: Detection Theory, (Prentice Hall, 1993).
[49] M. Born and E. Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference, and Diffraction of Light, (Pergamon Press, 1989), Chap. 9.
[50] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes, (McGraw-Hill, 2001).
[51] C. L. Fales, F. O. Huck, and R. W. Samms, “Imaging system design for improved information capacity,” Applied Optics 23, 873-888 (1984).
[52] N. X. Nguyen, “Numerical Algorithms for Image Superresolution,” Ph.D. dissertation, Stanford University (2000).
[53] L. Masek, “Recognition of Human Iris Patterns for Biometric Identification,” Technical report, University of Western Australia (2003).
[54] S. Sanderson and J. Erbetta, “Authentication for secure environments based on iris scanning technology,” IEE Colloquium on Visual Biometrics (2000).
[55] D. J. Field, “Relations between the statistics of natural images and the response properties of cortical cells,” Journal of the Optical Society of America A 4, 2379-2394 (1987).
[56] W. Wenzel and K. Hamacher, “A stochastic tunneling approach for global minimization,” Phys. Rev. Lett. 82(15), 3003-3007 (1999).
[58] J. A. O’Sullivan, R. E. Blahut, and D. L. Snyder, “Information-theoretic image formation,” IEEE Trans. on Image Processing 44, 2094-2123 (1998).
[59] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” IEEE Signal Processing Magazine 15, 23-50 (1998).
[60] F. O. Huck, C. L. Fales, and Z. Rahman, “An Information Theory of Visual Communication,” Phil. Trans. R. Soc. A: Phys. Sci. and Engr. 354, 2193-2248 (1996).
[61] F. O. Huck and C. L. Fales, “Information-theoretic assessment of sampled imaging systems,” Optical Engineering 38, 742-762 (1999).
[62] J. Ahlberg and I. Renhorn, “An information-theoretic approach to band selection,” Proc. SPIE 5811, 15-23 (2005).
[63] S. P. Awate, T. Tasdizen, N. Foster, and R. T. Whitaker, “Adaptive, Nonparametric Markov Modeling for Unsupervised, MRI Brain-Tissue Classification,” Med. Image Anal. (to be published).
[64] J. Liu and P. Moulin, “Information-Theoretic Analysis of Interscale and Intrascale Dependencies Between Image Wavelet Coefficients,” IEEE Trans. on Image Processing 10, 1647-1658 (2001).
[65] Z. Liu and L. J. Karam, “Mutual information-based analysis of JPEG2000 contexts,” IEEE Trans. on Image Processing 14, 411-422 (2005).
[66] D. S. Taubman and M. W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, (Springer Publishing, 2002).
[67] T. Cover and J. Thomas, Elements of Information Theory, (John Wiley and Sons, New York, 1991).
[68] M. Tanner, Tools for Statistical Inference, (Springer, 2nd edition, 1993).
[69] J. S. Liu, Monte Carlo Strategies in Scientific Computing, (Springer, 2001).
[70] C. P. Robert and G. Casella, Monte Carlo Statistical Methods, (Springer, 2004).
[71] A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice, (Springer, 2001).
[72] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory 20, 284-287 (1974).
[73] D. Guo, S. Shamai, and S. Verdu, “Mutual information and minimum mean-square error in Gaussian channels,” IEEE Trans. on Inform. Theory 51, 1261-1282 (2005).
[74] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory, (Wiley, New York, 1968).
[75] D. P. Palomar and S. Verdu, “Gradient of mutual information in linear vector Gaussian channels,” IEEE Trans. on Inform. Theory 52, 141-154 (2006).
[76] W. T. Cathey and E. R. Dowski, “New Paradigm for Imaging Systems,” Applied Optics 41, 6080-6092 (2002).
[77] A. Ashok and M. A. Neifeld, “Pseudorandom phase masks for superresolution imaging from subpixel shifting,” Applied Optics 46, 2256-2268 (2007).
[78] M. D. Stenner, A. Ashok, and M. A. Neifeld, “Multi-Domain Optimization for Ultra-Thin Cameras,” Frontiers in Optics, Rochester, NY (2006).
[80] M. A. Neifeld and J. Ke, “Optical architectures for compressive imaging,” Applied Optics 46, 5293-5303 (2007).
[81] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience 3, 71-86 (1991).
[82] P. Belhumeur, J. Hespanha, and D. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Trans. on Pattern Analysis and Machine Intelligence 19, 711-720 (1997).
[83] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, “Face recognition by independent component analysis,” IEEE Trans. on Neural Networks 13, 1450-1464 (2002).
[84] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, (Wiley-Interscience, 2000).
[85] M. A. Neifeld, A. Ashok, and P. K. Baheti, “Task Specific Information for Imaging System Analysis,” J. Opt. Soc. Am. A 24, B25-B41 (2007).
[86] H. Pal and M. A. Neifeld, “Multispectral principal component imaging,” Optics Express 11, 2118-2125 (2003).
[87] D. L. Donoho, “Compressed sensing,” IEEE Trans. on Information Theory 52, 1289-1306 (2006).
[88] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, “Compressed Sensing MRI [A look at how CS can improve on current imaging techniques],” IEEE Signal Processing Magazine 25(2), 72-82 (2008).
[89] A. Mahalanobis, “Optical Systems for Task Specific Compressed Sensing and Image Reconstruction,” Annual Meeting of the IEEE Lasers and Electro-Optics Society, 157-158 (2007).
[90] M. F. Duarte, M. A. Davenport, M. B. Wakin, and R. G. Baraniuk, “Sparse signal detection from incoherent projections,” in Proc. of IEEE International Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 3, 14-19 (2006).
[91] D. Takhar, J. N. Laska, M. B. Wakin, M. F. Duarte, D. Baron, S. Sarvotham, K. Kelly, and R. G. Baraniuk, “A new compressive imaging camera architecture using optical-domain compression,” Proc. SPIE 6065, 43-52 (2006).
[92] D. P. Palomar and S. Verdu, “Representation of Mutual Information Via Input Estimates,” IEEE Trans. on Inform. Theory 53, 453-470 (2007).
[93] N. Towghi and B. Javidi, “Optimum receivers for pattern recognition in the presence of Gaussian noise with unknown statistics,” J. Opt. Soc. Am. A 18, 1844-1852 (2001).
[94] R. Patnaik and D. Casasent, “MINACE filter classification algorithms for ATR using MSTAR data,” Proc. SPIE 5807, 100-111 (2005).
[95] R. Patnaik and D. Casasent, “SAR classification and confuser and clutter rejection tests on MSTAR ten-class data using Minace filters,” Proc. SPIE 6574, 657402:1-15 (2007).
[96] W. Gander and W. Gautschi, “Adaptive Quadrature - Revisited,” BIT 40, 84-101 (2000).
[97] I. T. Jolliffe, Principal Component Analysis, (Springer, 2002).
[98] D. Barber and F. V. Agakov, “The IM Algorithm: A Variational Approach to Information Maximization,” in NIPS (MIT Press, 2003).
[99] A. J. Bell and T. J. Sejnowski, “An information-maximization approach to blind separation and blind deconvolution,” Neural Computation 7, 1129-1159 (1995).
[100] A. Hyvarinen, J. Karhunen, and E. Oja, Independent Component Analysis, (Wiley, 2001).