A Task-specific Approach to Computational
Imaging System Design
by
Amit Ashok
A Dissertation Submitted to the Faculty of the
Department of Electrical and Computer Engineering
In Partial Fulfillment of the Requirements For the Degree of
Doctor of Philosophy
In the Graduate College
The University of Arizona
2008
THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE
As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Amit Ashok entitled "A Task-Specific Approach to Computational Imaging System Design" and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy _______________________________________________________________________ Date: 07/30/2008
Prof. Mark A. Neifeld _______________________________________________________________________ Date: 07/30/2008
Prof. Raymond K. Kostuk _______________________________________________________________________ Date: 07/30/2008
Prof. William E. Ryan _______________________________________________________________________ Date: 07/30/2008
Prof. Michael W. Marcellin _______________________________________________________________________ Date:
Final approval and acceptance of this dissertation is contingent upon the candidate’s submission of the final copies of the dissertation to the Graduate College. I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement. ________________________________________________ Date: 07/30/2008 Dissertation Director: Prof. Mark A. Neifeld
Statement by Author
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
Signed: Amit Ashok
Approval by Dissertation Director
This dissertation has been approved on the date shown below:
Mark A. Neifeld
Professor of Electrical and Computer Engineering
Date
Acknowledgements
Signal processing has found a multitude of applications ranging from communications to pattern recognition. Its application to various imaging modalities such as sonar, radar, tomography, and optical imaging systems has been a very interesting topic of research to me. I am fortunate to have had the opportunity to conduct dissertation research in the multi-disciplinary area of computational imaging systems, which involves subjects such as optics, statistics, optimization, and, of course, signal processing.
I would like to express my sincere gratitude to my advisor, Prof. Mark Neifeld, who has always provided invaluable guidance and steadfast support. He has been an inspiring mentor who has set a very high standard to achieve. Thanks to my colleagues in the OCPL lab, in particular Ravi Pant, Pawan Baheti, and Jun Ke, who were very helpful and supportive and helped create an exciting and friendly work environment. I wish to express my heartfelt thanks to my parents and my wife, Sabina, who have always believed in me and encouraged me to persist. I want to thank Prof. W. Ryan, Prof. R. Kostuk, and Prof. M. Marcellin for serving on my dissertation committee and providing invaluable feedback on my dissertation research work.
Table 3.1. Imaging system performance for K = 1, K = 4, K = 9, and K = 16 on training set . . . 68
Table 3.2. Imaging system performance for K = 1, K = 4, K = 9, and K = 16 on validation set . . . 75
Table 5.1. TSI (in bits) for candidate compressive imagers at three representative values of SNR: low (s = 0.5), medium (s = 5.0), and high (s = 20.0) . . . 155
List of Figures
Figure 1.1. System layout of (a) a traditional imaging system and (b) a computational imaging system . . . 17
Figure 1.2. Extended depth of field imaging system layout (image examples are taken from Ref. [7]) . . . 17
Figure 2.1. Schematic depicting the effect of pixel-limited resolution: (a) optical PSF is impulse-like and (b) engineered optical PSF is extended . . . 27
Figure 2.2. Imaging system setup used in the simulation study . . . 30
Figure 2.3. Example simulated PSFs: (a) Conventional sinc²(·) PSF and (b)
Figure 2.11. Experimentally measured Rayleigh resolution versus number of frames for both the PRPEL and conventional imagers . . . 41
Figure 2.12. The USAF resolution target: (a) Group 0 element 1 and (b) Group 0 elements 2 and 3 . . . 42
Figure 2.13. Raw detector measurements obtained using USAF Group 0 element 1 from (a) the conventional imager and (b) the PRPEL imager . . . 43
Figure 2.14. LMMSE reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 44
Figure 2.15. Horizontal line scans through the USAF target and its LMMSE reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3 . . . 45
Figure 2.16. LMMSE reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 46
Figure 2.17. Richardson-Lucy reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 48
Figure 2.18. Richardson-Lucy reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9 . . . 49
Figure 2.19. Horizontal line scans through the USAF target and its Richardson-Lucy reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3 . . . 50
Figure 2.20. (a) Rayleigh resolution and (b) RMSE versus number of frames for multi-frame imagers that employ smaller pixels and lower measurement SNR . . . 51
Figure 2.21. The optical PSF obtained using PRPEL with both narrowband (10 nm) and broadband (150 nm) illumination . . . 52
Figure 2.22. (a) Rayleigh resolution and (b) RMSE versus number of frames for broadband PRPEL and conventional imagers . . . 53
Figure 3.1. PSF-engineered multi-aperture imaging system layout . . . 57
Figure 3.2. Iris examples from the training dataset . . . 60
Figure 3.3. Examples of (a) iris-segmentation, (b) masked iris-texture region, (c) unwrapped iris, and (d) iris-code . . . 62
Figure 3.4. Illustration of FRR and FAR definitions in the context of intra-class and inter-class probability densities . . . 65
Figure 3.5. Optimized ZPEL imager with K = 1: (a) pupil-phase, (b) optical PSF, and (c) optical PSF of conventional imager . . . 70
Figure 3.6. Cross-section MTF profiles of optimized ZPEL imager with K = 1 . . . 71
Figure 3.7. Optimized ZPEL imager with K = 4: (a) pupil-phase and (b)
Figure 4.1. (a) A 256 × 256 image, (b) the compressed version of image in (a) using JPEG2000, and (c) 64 × 64 image obtained by rescaling image in (a) . . . 80
Figure 4.2. Block diagram of an imaging chain . . . 83
Figure 4.3. Example scenes from the deterministic encoder . . . 83
Figure 4.4. Example scenes from the stochastic encoder . . . 84
Figure 4.5. (a) mmse and (b) TSI versus signal to noise ratio for the scalar
T and position vector ρ and (b) clutter profile matrix Vc and mixing vector β . . . 90
Figure 4.7. Structure of T and ρ matrices for the two-class problem . . . 92
Figure 4.8. Structure of T and Λ matrices for the joint detection/localization problem . . . 94
Figure 4.9. Structure of T and Ω matrices for the joint classification/localization problem . . . 96
Figure 4.10. Example scenes: (a) Tank in the middle of the scene, (b) Tank in the top of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene . . . 98
Figure 4.11. Detection task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 101
Figure 4.12. Scene partitioned into four regions: (a) Tank in the top left region of the scene, (b) Tank in the top right region of the scene, (c) Tank in the bottom left region of the scene, and (d) Tank in the bottom right region of the scene . . . 103
Figure 4.13. Joint detection/localization task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 104
Figure 4.14. Classification task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 106
Figure 4.15. Joint classification/localization task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers . . . 107
Figure 4.16. Example scenes with optical blur: (a) Tank in the top of the scene, (b) Tank in the middle of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene . . . 108
Figure 4.17. Block diagram of a compressive imager . . . 109
Figure 4.18. Detection task: TSI for PC compressive imager versus signal to
versus signal to noise ratio . . . 115
Figure 4.22. Example textures (a) from each of the 16 texture classes and (b) within one of the texture classes . . . 116
Figure 4.23. TSI versus signal to noise ratio at various values of defocus . . . 117
Figure 4.24. TSI versus defocus at s = 10 and s = 4 for the texture classifi-
and cubic phase-mask imager with γ = 2.0 at (c) Wd = 0, (d) Wd = 3 . . . 119
Figure 4.26. Depth of Field and TSI versus γ parameter at s = 10 . . . 122
Figure 4.27. TSI versus defocus at s = 10: DOF of conventional imager and
Figure 5.5. Example scenes with optical blur and noise: (a) Tank in the topof the scene, (b) Tank in the middle of the scene . . . . . . . . . . . . . 136
Figure 5.6. Example projection vectors in the PC projection basis, clockwisefrom upper left, #2,#6,#16,#31. . . . . . . . . . . . . . . . . . . . . . . 140
Figure 5.7. TSI versus SNR for PC compressive imager . . . 141
Figure 5.8. Example projection vectors in the GMF projection basis, clockwise from upper left, #1, #16, #32, #64 . . . 143
Figure 5.9. Example projection vectors in the GFD1 projection basis, clockwise from upper left, #1, #10, #11, #14 . . . 146
Figure 5.10. Projection vector in the GFD2 projection basis . . . 147
Figure 5.11. Example projection vectors in the IC projection basis, clockwise
Figure 5.12. Optimized compressive imagers: TSI versus SNR for candidate CI system and conventional imager . . . 150
Figure 5.13. Optimal photon allocation vectors for PC compressive imager at: (a) s = 0.5, (b) s = 5.0, and (c) s = 20.0 . . . 151
Figure 5.14. Optimal photon allocation vectors for GFD1 compressive imager at: (a) s = 0.5, (b) s = 5.0, and (c) s = 20.0 . . . 156
Figure 5.15. Lower bound on probability of error as a function of TSI . . . 158
Figure 5.16. Comparison of probability of error obtained via Bayes' detector versus lower bound obtained by Fano's inequality as a function of SNR . . . 159
Abstract
The traditional approach to imaging system design places the sole burden of image
formation on optical components. In contrast, a computational imaging system relies
on a combination of optics and post-processing to produce the final image and/or
output measurement. Therefore, the joint-optimization (JO) of the optical and the
post-processing degrees of freedom plays a critical role in the design of computa-
tional imaging systems. The JO framework also allows us to incorporate task-specific
performance measures to optimize an imaging system for a specific task. In this
dissertation, we consider the design of computational imaging systems within a JO
framework for two separate tasks: object reconstruction and iris-recognition. The
goal of these design studies is to optimize the imaging system to overcome the perfor-
mance degradations introduced by under-sampled image measurements. Within the
JO framework, we engineer the optical point spread function (PSF) of the imager,
representing the optical degrees of freedom, in conjunction with the post-processing
algorithm parameters to maximize the task performance. For the object reconstruc-
tion task, the optimized imaging system achieves a 50% improvement in resolution
and nearly 20% lower reconstruction root-mean-square error (RMSE) as compared to
the un-optimized imaging system. For the iris-recognition task, the optimized imaging
system achieves a 33% improvement in false rejection ratio (FRR) at a fixed false alarm
ratio (FAR) relative to the conventional imaging system. The effect of the performance
measures like resolution, RMSE, FRR, and FAR on the optimal design highlights
the crucial role of task-specific design metrics in the JO framework. We introduce a
fundamental measure of task-specific performance known as task-specific information
(TSI), an information-theoretic measure that quantifies the information content of an
image measurement relevant to a specific task. A variety of source-models are derived
to illustrate the application of a TSI-based analysis to conventional and compressive
imaging (CI) systems for various tasks such as target detection and classification. A
TSI-based design and optimization framework is also developed and applied to the
design of CI systems for the task of target detection, yielding a six-fold performance
improvement over the conventional imaging system at low signal-to-noise ratios.
Chapter 1
Introduction
1.1. Evolution of Imaging Systems
The first imaging systems simply imaged a scene onto a screen for viewing purposes.
One of the earliest imaging devices, the "camera obscura," invented in the 10th century,
relied on a pinhole and a screen to form an inverted image [1]. The next signifi-
cant step in the evolution of imaging systems was the development of photo-sensitive
material that allowed the image to be recorded for later viewing. The perfection
of photographic film gave birth to a multitude of new applications, ranging from
medical imaging using X-rays for diagnosis purposes to aerial imaging for surveil-
lance. Development of the charge-coupled device (CCD) in 1969 by George Smith
and Willard Boyle at Bell Labs [2], combined with advances in communication
theory revolutionized imaging system design and its applications. The electronic
recording of an image allowed it to be stored digitally and transmitted over long dis-
tances reliably using digital communication systems. Furthermore, the advent of
computer-aided optical design, coupled with the development of modern machining
tools and new optical materials such as plastics/polymers, allowed imaging system
designs that were light-weight, low-cost, and high-performance. This led to an ex-
plosion of applications, such as medical imaging for diagnosis, military applications
involving surveillance, tracking, recognition, weapon guidance, and a host of com-
mercial imaging applications such as security, consumer photography, automotive,
aerospace, and entertainment. Advances in the semiconductor industry have allowed
the processing power of computers and embedded processors to grow at an expo-
nential rate following Moore’s law [3]. This has led to real-time implementations of
sophisticated image processing algorithms that can further enhance the capabilities
of digital imaging systems. The post-processing algorithms, operating on acquired
images, have been developed for a variety of tasks such as pattern-recognition in se-
curity and surveillance, image restoration, detection in medical diagnosis, estimation
in computer vision, and compression of still images and video for storage/transmission
applications. However, due to the separate evolutionary paths of imaging system design
and image processing technology, they have been viewed as two separate processes by
imaging system designers. As a result, there has been a disconnect between the imag-
ing system design and the post-processing algorithm design. Recently, this disconnect
has been addressed with the emergence of a new imaging system paradigm known
as computational imaging [4, 5, 6]. Computational imaging offers several advantages
over traditional imaging techniques, especially when dealing with specific tasks. This
dissertation investigates the task-specific aspects of design methodologies for compu-
tational imaging system design. Before discussing the specific contributions of this
dissertation we begin by defining computational imaging and outlining its various
benefits relative to traditional imaging.
1.2. Computational Imaging and Task-specific Design
In a traditional imaging system, the optics has the sole burden of the image formation.
The post-processing algorithm, which is not an essential part of the imaging system,
operates on the image measurement to extract the desired information. Note that the
optics and the post-processing algorithms are designed separately. Fig. 1.1(a) shows
the architecture of a traditional imaging system. In contrast, a computational imaging
system involves the use of both a front-end optical system and a post-processing
algorithm in the image formation process. As shown in Fig. 1.1(b), the post-processing
algorithm forms an integral part of the overall imaging system design. Here the
front-end optics does not yield the final image directly but instead relies on the
Figure 1.1. System layout of (a) a traditional imaging system and (b) a computational imaging system.
Figure 1.2. Extended depth of field imaging system layout (image examples are taken from Ref. [7]).
post-processing sub-system to form the image. The extended depth of field (EDOF)
imaging system, described in Ref. [4], is an example of a computational imaging
system. Fig. 1.2 shows the system layout of this EDOF imaging system. Note that
it consists of a front-end optical system to form an intermediate image on the sensor
array that is subsequently processed by an image reconstruction algorithm to yield
the final focused image. The EDOF is achieved by modifying a traditional optical
imaging system with the addition of a cubic-phase mask in the aperture stop. The
resulting optical point spread function (PSF) has a larger support compared to a
traditional PSF and therefore, the optical image formed on the sensor array appears
to be blurred. However, as the optical PSF is invariant over an extended range of
object distances, a simple reconstruction filter can be used in the post-processing step
to form the final image that is focused throughout an extended object volume. This
imaging system demonstrates the potential of the computational imaging paradigm to
yield designs with novel capabilities, like EDOF, that simply could not be achieved by
a traditional imaging system without significant performance trade-offs. Nevertheless,
it is important to recognize that this EDOF imaging system does not fully exploit
the capabilities of the computational imaging paradigm.
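The defocus invariance that makes this design work can be illustrated with a short numerical sketch. The Python fragment below is a toy Fourier-optics model, not the actual system of Ref. [4]: the grid size and the mask strength (alpha_waves = 20) are arbitrary illustrative choices. It computes the incoherent PSF as the squared magnitude of the Fourier transform of the pupil function and compares how much the MTF of a conventional pupil and of a cubic-phase pupil change under three waves of defocus.

```python
import numpy as np

N = 256
x = np.linspace(-1, 1, N)                      # normalized pupil coordinates
X, Y = np.meshgrid(x, x)
aperture = ((X**2 + Y**2) <= 1.0).astype(float)  # clear circular aperture

def psf(defocus_waves, alpha_waves=0.0):
    """Incoherent PSF of a pupil with defocus and an optional cubic phase mask."""
    phase = 2 * np.pi * (defocus_waves * (X**2 + Y**2)
                         + alpha_waves * (X**3 + Y**3))
    pupil = aperture * np.exp(1j * phase)
    field = np.fft.fft2(pupil, s=(2 * N, 2 * N))  # zero-pad before transforming
    p = np.abs(field)**2
    return p / p.sum()

def mtf(defocus_waves, alpha_waves=0.0):
    """Modulation transfer function: magnitude of the OTF, normalized to 1 at DC."""
    H = np.abs(np.fft.fft2(psf(defocus_waves, alpha_waves)))
    return H / H[0, 0]

def defocus_sensitivity(alpha_waves):
    """Mean absolute MTF change between in-focus and 3 waves of defocus."""
    return np.mean(np.abs(mtf(0.0, alpha_waves) - mtf(3.0, alpha_waves)))

print("conventional pupil MTF change:", defocus_sensitivity(0.0))
print("cubic-phase pupil MTF change :", defocus_sensitivity(20.0))
```

The cubic-phase pupil shows a much smaller MTF change over the same defocus range, which is why a single, fixed reconstruction filter suffices over an extended object volume.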
The true potential of computational imaging can only be realized via a joint-
optimization of the optical and the post-processing degrees of freedom. The joint
design methodology yields a larger and richer design space for the designer. In order
to understand this advantage, let us examine the multi-dimensional design space de-
picted in Fig. 1.3: the optical design parameters are represented on the vertical axis
and the post-processing design parameters are shown on the horizontal axis. Note
that the traditional approach constrains the designer to a relatively small design sub-
space, outlined in brown and green. The region outlined in brown represents a design
sub-space resulting from optimization of only optical parameters without any con-
sideration to the degrees of freedom available in the post-processing domain. In the
traditional design methodology, the optical design is followed by the optimization of
Figure 1.3. A two-dimensional illustration of the joint optical and post-processing design space.
post-processing parameters, represented by the sub-space in the green region. This
approach does not guarantee an overall optimal system design and usually leads to
sub-optimal system performance. In contrast, the joint-optimization design method
combines the degrees of freedom available from the optical and the post-processing do-
mains, expanding the design space to a larger volume, represented by the red outlined
region. This larger design space encompasses potential designs that offer benefits
such as lower system cost, reduced complexity, improved yields, and, perhaps most
importantly, optimal/near-optimal system performance.
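The gap between the sequential and the joint design paths can be made concrete with a deliberately contrived toy problem. In the sketch below, task_metric is a hypothetical task-performance surface over one optical parameter a and one post-processing parameter b, and fidelity_metric is the image-fidelity proxy a traditional optics-only stage might optimize; neither corresponds to a real imager model.

```python
import numpy as np

# Toy task metric J(a, b): peak task performance requires a = 1,
# i.e., a deliberately non-impulse-like optical setting.
def task_metric(a, b):
    return -((a - 1.0)**2 + (b - a)**2)

# Proxy metric used by the traditional optics-only stage: raw image
# fidelity, maximized by the sharpest PSF (a = 0), ignoring post-processing.
def fidelity_metric(a):
    return -a**2

grid = np.linspace(-2.0, 2.0, 401)

# Sequential design: optimize optics against fidelity, then post-processing.
a_seq = grid[np.argmax([fidelity_metric(a) for a in grid])]
b_seq = grid[np.argmax([task_metric(a_seq, b) for b in grid])]

# Joint design: search the full (a, b) design space against the task metric.
A, B = np.meshgrid(grid, grid)
J = task_metric(A, B)
i, j = np.unravel_index(np.argmax(J), J.shape)
a_jnt, b_jnt = A[i, j], B[i, j]

print(f"sequential: a={a_seq:.2f}, b={b_seq:.2f}, J={task_metric(a_seq, b_seq):.2f}")
print(f"joint:      a={a_jnt:.2f}, b={b_jnt:.2f}, J={task_metric(a_jnt, b_jnt):.2f}")
```

The sequential path commits to a = 0 (the sharpest PSF) and can never recover, while the joint search over both parameters reaches the global optimum. This is exactly the sub-space versus full-space picture of Fig. 1.3.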
Another key aspect of the joint design methodology is that it inherently supports
a task-specific approach to imaging system design. To support this assertion let us
consider an example of imaging system design for a classification task. The traditional
design approach would involve: 1) designing an optical imaging system to maximize the
fidelity of the output image measurement and 2) designing a classification algorithm that
operates on the image measurement and minimizes the probability of misclassifica-
tion. Note that in this approach the optical imaging system and the classification
algorithm are designed separately (and sequentially). Typically, a classification al-
gorithm involves two steps: the feature extraction step and the classification step.
In the feature extraction step, the original high-dimensional image measurement is
transformed (compressed) into a low-dimensional data vector that is referred to as
a feature vector. This dimensionality reduction step effectively lowers the computa-
tional complexity of the subsequent classification step. Acquiring a high-dimensional
image measurement and subsequently reducing it to a low-dimensional feature clearly
represents an inefficient data measurement process and a poor utilization of optical
design resources. Thus, the traditional approach results in an imaging system design
with sub-optimal performance for the classification task. Alternatively, a more logi-
cal approach would suggest an optical imaging system design that directly measures
the optimal low-dimensional feature(s) for post-processing so as to maximize the
task performance within the system constraints. This approach yields a computa-
tional imaging system design that offers two main advantages: a) a direct feature
measurement yields a higher measurement signal to noise ratio (SNR) and b) the
number of detectors required is significantly reduced. The high measurement SNR
directly translates into improved system performance. This type of imaging system,
referred to as a feature-specific imager (FSI) or a compressive imager, is an exam-
ple of a computational imaging system [6]. This example clearly illustrates that the
computational imaging paradigm supports and enables a task-specific approach to
imaging system design.
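The measurement-SNR advantage of direct feature measurement can be illustrated with a small Monte Carlo sketch. The model below is an assumption-laden toy, not the FSI analysis itself: a single binary-mask feature, identical additive detector noise in both architectures, and no photon-count constraints; the scene size and noise level are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pixels = 64                    # toy scene dimension
mask = np.zeros(n_pixels)
mask[:16] = 1.0                  # one binary "feature": sum over 16 pixels

x = rng.uniform(0.0, 1.0, n_pixels)   # toy scene
true_feature = mask @ x

sigma = 0.1                      # additive noise std per detector reading
trials = 20000

# Conventional path: detect every pixel (64 noisy readings), then extract
# the feature in post-processing.
noisy_pixels = x + sigma * rng.normal(size=(trials, n_pixels))
f_conv = noisy_pixels @ mask

# Feature-specific path: the optics forms the inner product before detection,
# so the noise enters once, at a single feature detector.
f_direct = true_feature + sigma * rng.normal(size=trials)

rmse_conv = np.sqrt(np.mean((f_conv - true_feature)**2))
rmse_direct = np.sqrt(np.mean((f_direct - true_feature)**2))
print(f"post-processing extraction RMSE: {rmse_conv:.4f}")
print(f"direct feature measurement RMSE: {rmse_direct:.4f}")
```

Because the conventional path sums 16 noisy pixel readings, its feature-noise standard deviation grows by a factor of √16 = 4 relative to detecting the optically formed inner product directly, under this toy noise model.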
1.3. Main Contributions
The task-specific approach to computational imaging system design is an emerging
area of research. Barrett et al. have conducted an extensive task-based analysis of
imaging systems for detection and classification tasks in the area of medical imag-
ing [8, 9, 10]. Their focus has been primarily on the performance of ideal Bayesian
observers and human observers. However, the application of the task-specific ap-
proach within a joint-optimization design framework is a relatively unexplored area.
In this dissertation, we apply a task-specific approach to maximize the performance
of a computational imaging system for a given task within a joint-optimization design
framework. We consider two separate example tasks in this work: a reconstruction
task and a classification task. In each case, the computational imaging system is
optimized to maximize the task performance as measured by a task-specific metric.
For example, the reconstruction task employs the traditional root mean square error
(RMSE) and resolution metrics to quantify the quality of the reconstructed images.
In the case of the classification task, false rejection ratio (FRR) and false alarm ra-
tio (FAR) statistics are used as task-specific metrics to evaluate the overall system
performance. In addition to the two design studies, a novel information theoretic task-
specific metric is also derived. A formal design framework based on this task-specific
metric is developed and applied to the design of a compressive imaging system for the
task of target detection. More specifically, the main contributions of this dissertation
work are as follows:
1. The application of the optical PSF engineering method to optimize the imaging
system performance for a specific task is considered. This task-specific method
is first applied to a reconstruction task to overcome the distortions introduced by
the detector under-sampling in the sensor array. Simulation results show nearly
a 20% improvement in RMSE for the optimized imaging system design relative
to the conventional imaging system. The optical PSF engineering method is
also successfully applied to the design of an iris-recognition imaging system
to minimize the impact of detector under-sampling on the overall performance.
The optimized iris-recognition imaging system design achieves a 33% lower FRR
compared to the conventional imaging system design.
2. Development of a formal task-specific framework for computational imaging sys-
tem design based on a novel information theoretic task-specific metric. This
metric, known as task-specific information (TSI), quantifies the information
content of an imaging system measurement relevant to a specific task. The
TSI metric can also be used to derive an upper-bound on the performance of
any post-processing algorithm for a specific task. Therefore, within the pro-
posed design framework, the TSI metric can be used to improve the upper-bound
on imaging system performance thereby allowing the designer to optimize the
imaging system for a particular task. The utility of the TSI metric is investi-
gated for a variety of target detection and classification tasks. The application
of the TSI-based design framework to extend the depth of field of an imager by
optical PSF engineering is also considered.
3. The TSI-based design framework is used to design several compressive imaging
systems for a target detection task. The resulting optimized imaging system
designs show a significant performance improvement over the un-optimized
imaging system designs.
1.4. Dissertation Organization
The rest of the dissertation is organized as follows:
• Chapter 2 presents the application of the optical PSF engineering method,
within a multi-aperture imaging architecture, to overcome the distortions due
to under-sampling in the detector array. The reconstruction task is considered
in this study. RMSE and resolution are used as task-specific metrics during
the imaging system optimization process. In the simulation study, the opti-
mized imaging system designs show significant improvement, both in terms of
RMSE and resolution metrics, compared to an imaging system with a traditional
diffraction-limited PSF. The experimental results support the performance im-
provements predicted by the simulation study.
• The task of iris-recognition, in the presence of detector under-sampling, is con-
sidered in Chapter 3. A multi-aperture imaging system in conjunction with
optical PSF engineering is employed to optimize the overall performance of the
imaging system. The task-specific design framework employs the FAR and FRR
metrics to quantify the imaging system performance in this study. The simula-
tion results show a substantial improvement in iris-recognition performance as
a result of PSF optimization compared to the design that employs a traditional
optical PSF.
• As emphasized by the design studies described in Chapter 2 and Chapter 3, the
performance metric plays a crucial role in the task-specific approach to imaging
system design. In Chapter 4, the notion of task-specific information is introduced
as an objective metric for task-specific design. TSI is an information theoretic
metric that is derived using the recently discovered relationship between esti-
mation theory and mutual-information. This metric is applied to a variety of
detection and classification tasks to demonstrate its utility for task-specific per-
formance evaluation. A brief analysis of a TSI-based optical PSF engineering
approach for extending the depth of field of an imager is also presented in the
context of a texture-classification task.
• Chapter 5 presents a formal task-specific design framework that utilizes the
TSI metric to optimize a compressive imaging system for a target detection
task. The optimized imaging system designs deliver substantial performance
improvement over the conventional design. The implementation issues regarding
compressive imaging systems and the computational complexity associated with
the TSI-based design framework are also discussed.
• Chapter 6 draws conclusions from the various aspects of the task-specific ap-
proach investigated in this dissertation and provides direction for future work
relevant to the further development of the joint-optimization design framework
for computational imaging systems.
Chapter 2
Optical PSF Engineering: Object
Reconstruction Task
The optical PSF represents a degree of freedom that can be exploited to optimize
an imaging system for a specific task. In a digital imaging system, the detector
can limit the overall resolution when the optical PSF is smaller than the extent of
the detector, leading to under-sampling or aliasing. In this chapter, we apply the
optical PSF engineering method to improve the overall system resolution beyond
the detector-limit and also increase the object reconstruction fidelity in such under-
sampled imaging systems.
2.1. Introduction
In a traditional (i.e., film-based) design paradigm the optical PSF is typically viewed
as the resolution-limiting element, and therefore optical designers strive for an impulse-
like PSF. Digital imagers, however, employ photodetectors that are sometimes large
relative to the extent of the optical PSF and in such cases the resulting pixel-blur
and/or aliasing can become the dominant distortion limiting overall imager perfor-
mance. This is illustrated by Fig. 2.1(a). This figure is a one-dimensional depiction
of the image formed by a traditional camera when two point objects are separated
by a sub-pixel distance. We see that the resulting impulse-like PSFs are imaged onto
essentially the same pixel leading to spatial ambiguity and hence a loss of resolution.
In such an imager the resolution is said to be pixel-limited [11].
The effect depicted in Fig. 2.1(a) may also be understood by noting that the
detector array under-samples the image and therefore introduces aliasing. The gen-
eralized sampling theorem by Papoulis [12] provides a mechanism through which this
aliasing distortion can be mitigated. The theorem states that a bandlimited signal
(−Ω ≤ ω ≤ Ω) can be perfectly reconstructed from the sampled outputs of R
non-redundant (i.e., diverse) linear channels, each of which employs a sample rate of
2Ω/R (i.e., each of the R signals is under-sampled at 1/R of the Nyquist rate). This theo-
rem suggests that the aliasing distortion can be reduced by combining multiple under-
sampled/low-resolution images to obtain a high-resolution image. A detailed descrip-
tion of this technique can be found in Borman [13]. This approach has been used by
several researchers in the image processing community [11, 14, 15, 16, 17, 18] and was
recently adopted for use in the TOMBO (thin observation module by bound optics)
imaging architecture [19, 20]. The TOMBO system was designed to simultaneously
acquire multiple low-resolution images of an object through multiple lenslets in an
integrated aperture. The resulting collection of low-resolution measurements is then
processed to yield a high-resolution image. Within the TOMBO system the multiple
non-redundant images were obtained via a diverse set of sub-pixel shifts. The use of
other forms of diversity including magnification, rotation, and defocus has also been
considered [21]. However, it is important to note that these methods of obtaining
measurement diversity do not fully exploit the optical degrees of freedom available to
the designer. The approach described in this chapter will utilize PSF engineering in
order to obtain additional diversity from a set of sub-pixel shifted measurements.
The optical PSF of a digital imager may be viewed as a mechanism for encoding
object information so as to better tolerate distortions introduced by the detector ar-
ray. From this viewpoint an impulse-like optical PSF may be sub-optimal [22, 23].
To support this assertion, consider the scenario depicted in Fig. 2.1(b), which shows
an image of two point objects formed using a non-impulse-like PSF. The two point
objects are displaced by the same amount as in Fig. 2.1(a). We see that the use of
an extended PSF enables the extraction of sub-pixel position information from the
sampled detector outputs. For example, a simple correlation-based processor [24] can
Figure 2.1. Schematic depicting the effect of pixel-limited resolution: (a) optical PSF is impulse-like and (b) engineered optical PSF is extended.
yield the PSF centroid/point-source location to sub-pixel accuracy, given sufficient
measurement signal-to-noise ratio (SNR). In this chapter, we study the performance
of one such extended PSF design obtained by placing a pseudo-random phase mask
in the aperture-stop of a conventional imager. Our choice of pseudo-random phase
mask has been motivated in part by the pseudo-random sequences found in CDMA
multi-user communication systems [25, 26] and in part by a study in Ref. [27] which
found pseudo-random phase masks to be efficient in an information-theoretic sense
for imaging sparse volumetric scenes. In the context of multi-user communications,
pseudo-random sequences are used to encode the information of each end-user. These
encoded messages are combined and transmitted over a common channel. The struc-
ture of the encoding is then used at the receiver side to extract individual messages
from the superposition. In a digital imaging system, the optical PSF serves a simi-
lar purpose in terms of encoding the location of individual resolution elements that
comprise the object. The pixels within a semiconductor detector array measure a
superposition of responses from each resolution element in the object. Further, the
spatial integration across the finite pixel size of the detector array leads to spatial
blurring. These signal transformations imposed by the detector array must be in-
verted via decoding. In the next section, we describe the mathematical model of the
imaging system and the pseudo-random phase mask used to engineer the extended
optical PSF.
2.2. Imaging System Model
Consider a linear model of a digital imaging system. Mathematically, we can represent
the system as
g = Hcdfc + n, (2.1)
where fc is the continuous object, g is the detector-array measurement vector, Hcd
is the continuous-to-discrete imaging operator, and n is the additive measurement-noise
vector. For simulation purposes we use a discrete representation f of the continuous
object fc. This discrete representation f can be obtained from fc as follows [28]
f_i = ∫_{S∩Φ_i} f_c(r) φ_i(r) d²r, (2.2)
where S is the object support, φ_i is the ith analysis basis function, Φ_i is the support of φ_i, and f_i is the ith element of the object vector f. Note that we
obtain an approximation fa of the original continuous object fc from its discrete
representation f as follows [28]
f_a(r) = Σ_{i=1}^{N} f_i ψ_i(r), (2.3)
where N is the dimension of the discrete object vector and ψi is a synthesis basis
set which can be chosen to be the same as the analysis basis set φi. Here we use the
pixel function to construct our analysis and synthesis basis sets. The pixel function
is defined as
φ_i(r) = (1/Ω_r) rect((r − iΩ_r)/Ω_r) (2.4)
and
∫_{Φ_i∩Φ_j} φ_i(r) φ_j(r) d²r = δ_ij,
where 2Ωr is the size of the resolution element in the continuous object that can
be accurately represented by this choice of basis set. Note that the pixel functions
φi form an orthonormal basis. We set the object resolution element size equal to
the diffraction-limited optical resolution of the imager to ensure that the discrete
representation of the object does not incur any loss of spatial resolution. Here we
adopt the Rayleigh criterion [29] to define resolution. Henceforth, all references to
resolution will represent the Rayleigh resolution.
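For concreteness, the analysis step of Eq. (2.2) and the synthesis step of Eq. (2.3) can be sketched numerically with the pixel-function basis; the 1D object, grid sizes, and normalization below are illustrative assumptions rather than the parameters used in this study:

```python
import numpy as np

# Hypothetical band-limited "continuous" object on a fine grid.
fine = 1024
r = np.linspace(0.0, 1.0, fine, endpoint=False)
f_c = np.sin(2 * np.pi * 3 * r) + 0.5 * np.cos(2 * np.pi * 7 * r)

N = 64                    # number of resolution elements
w = fine // N             # fine samples per resolution element

# Analysis (Eq. 2.2): project onto the pixel (rect) basis; with the
# normalisation absorbed, each coefficient is the cell average.
f = f_c.reshape(N, w).mean(axis=1)

# Synthesis (Eq. 2.3): piecewise-constant approximation f_a of f_c.
f_a = np.repeat(f, w)
print(np.max(np.abs(f_a - f_c)))   # small when f_c varies slowly per cell
```

With the resolution-element size matched to the optical resolution, as above, the discrete vector f loses no spatial resolution relative to f_c.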
The imaging equation is modified to include the discrete object representation as
follows
g = Hf + n, (2.5)
where H is the equivalent discrete-to-discrete imaging operator: H is therefore a
matrix. The imaging operator H includes the optical PSF, the detector PSF, and
the detector sampling. The vectors f , g, and n are lexicographically arranged one-
dimensional representations of the two-dimensional object, image, and noise arrays,
respectively.
Consider a diffraction-limited PSF of the form h(r) = sinc²(r/R), with Rayleigh
resolution R. The Nyquist sampling theorem requires the detector spacing to be
at most R/2. When this requirement is met, the imaging operator H has full rank
(condition-number → 1) allowing a reconstruction of the object up to the optical
resolution. However, when the optical PSF has an extent (2R) that is smaller than
the detector spacing, the image measurement is aliased and the imaging operator H
becomes singular (condition-number → ∞). Under these conditions the object cannot
be reconstructed up to the optical resolution. Also note that due to under-sampling
the imaging operator H is no longer shift-invariant but only block-wise shift-invariant
even if the imaging optics itself is shift-invariant.
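A small numerical sketch (with hypothetical dimensions, not the system parameters used later) makes the rank deficiency caused by under-sampling concrete:

```python
import numpy as np

N, F = 64, 4     # object samples and under-sampling factor (hypothetical)

# Impulse-like optical PSF, narrower than one detector pixel.
psf = np.array([0.1, 0.8, 0.1])
rows = np.stack([np.roll(np.pad(psf, (0, N - psf.size)), i - 1) for i in range(N)])
B = rows.T       # circular optical-blur matrix: column i is the PSF centred at i

# Detector integration and sampling: each detector sums F object samples.
D = np.kron(np.eye(N // F), np.ones((1, F)))

H = D @ B        # single-frame operator, as in Eq. (2.5)
print(H.shape, np.linalg.matrix_rank(H))   # (16, 64) 16 — rank far below N
```

Because rank(H) = M = N/F, a single under-sampled frame cannot determine the object up to the optical resolution, matching the singular-operator argument above.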
As mentioned in the previous section, one method to overcome the resolution
constraint imposed by the pixel-size is to use multiple sub-pixel shifted image mea-
surements. The sub-pixel shift δ may be obtained either by a shift in the imager
position or through object movement. The ith sub-pixel shifted image measurement
Figure 2.2. Imaging system setup used in the simulation study: object, lens system, pseudo-random phase mask at the aperture stop, and detector array.
gi with shift δi can be represented as
gi = Hif + ni, (2.6)
where Hi represents the imaging operator associated with the sub-pixel shift δi.
For a set of K such measurements we can write the composite image measurement by concatenating the individual vectors as g = [g_1^T g_2^T · · · g_K^T]^T and similarly n = [n_1^T n_2^T · · · n_K^T]^T. The overall multi-frame composite imaging system can be expressed as
g = Hcf + n, (2.7)
where Hc is the composite imaging operator. By combining several sub-pixel shifted
image measurements, the condition number of the composite imaging operator Hc
can be progressively improved and the overall resolution can be increased towards
the optical resolution limit. Ideally, the sub-pixel shifts should be chosen in multiples of D/K so as to minimize the condition-number of the forward imaging operator Hc, where D is the detector spacing [30].
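Continuing the sketch, stacking sub-pixel-shifted copies of a single under-sampled operator progressively restores rank. The 75% fill-factor detector kernel below is an assumption made so that the pixel response has no in-band spectral nulls on this grid; it is not the fill factor used in the study:

```python
import numpy as np

N, F = 64, 4                        # hypothetical grid and under-sampling factor
psf = np.array([0.1, 0.8, 0.1])     # impulse-like optical PSF

rows = np.stack([np.roll(np.pad(psf, (0, N - psf.size)), i - 1) for i in range(N)])
B = rows.T                          # circular optical-blur matrix

# Detector kernel with 75% fill factor (assumed so the pixel response has
# no spectral nulls here); each detector pitch spans F object samples.
pix = np.array([1.0, 1.0, 1.0, 0.0])
D = np.stack([np.roll(np.pad(pix, (0, N - pix.size)), k * F) for k in range(N // F)])

def shifted_operator(s):
    """Single-frame operator H_i for a sub-pixel object shift of s samples."""
    S = np.roll(np.eye(N), s, axis=1)
    return D @ B @ S

ranks = []
for K in (1, 2, 4):
    shifts = [k * F // K for k in range(K)]          # multiples of D/K
    Hc = np.vstack([shifted_operator(s) for s in shifts])
    ranks.append(int(np.linalg.matrix_rank(Hc)))
print(ranks)                        # → [16, 32, 64]: rank recovers with K
```

At K = F with shifts in multiples of D/K the composite operator reaches full rank, which is the discrete analogue of the Papoulis multi-channel result quoted earlier.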
We are interested in designing an extended optical PSF for use within the sub-pixel
shifting framework. The use of an extended optical PSF can improve the condition-
number of the imaging operator Hc. We consider an extended optical PSF obtained
by placing a pseudo-random phase mask in the aperture-stop of a conventional imager,
as shown in Fig. 2.2. For simulation purposes the aperture-stop is defined on a discrete
spatial grid. Therefore, the pseudo-random phase mask is represented by an array,
Figure 2.3. Example simulated PSFs (amplitude versus spatial dimension in µm): (a) conventional sinc²(·) PSF and (b) PSF obtained from the PRPEL imager.
each element of which corresponds to the phase at a given position on the discrete
spatial grid. The pseudo-random phase mask is synthesized in two steps: (1) generate
a set of independent, identically distributed random numbers distributed uniformly
on the interval [0, ∆] to populate the phase array and (2) convolve this phase array
with a Gaussian filter kernel, i.e., a Gaussian function with standard deviation ρ
sampled on the discrete spatial grid. The resulting set of random numbers defines
the phase distribution Φ(r) of the pseudo-random phase mask. The phase mask is
thus a realization of a spatial Gaussian random process which is parameterized by
its roughness ∆ and correlation length ρ. The auto-correlation function of this phase
distribution is given by
R_ΦΦ(r) = (∆²/12) exp[−r²/(4ρ²)]. (2.8)
The incoherent PSF is related to the phase-mask profile Φ(r) as follows [28]
psf(r) =Ac
(λf)4
∣∣∣∣Tpupil
(− r
λf
)∣∣∣∣2
, (2.9)
Tpupil(ω) = F exp[j2π(nr − 1)Φ(r)/λ]tap(r) , (2.10)
where Ac is normalization constant with units of area, nr is the refractive index of
the lens, f is the back focal length, tap(r) is the aperture function and F denotes the
forward Fourier transform operator.
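The two-step mask synthesis and the PSF computation of Eqs. (2.9) and (2.10) can be sketched in 1D; the grid size, refractive index, and aperture geometry below are illustrative assumptions, not the values used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

n_grid = 512                   # discrete aperture grid (hypothetical size)
wav = 1.0                      # centre wavelength, normalised units
delta = 1.5 * wav              # roughness parameter Δ
rho = 10.0                     # correlation length ρ, in grid samples (assumed)

# Step 1: i.i.d. phase heights, uniform on [0, Δ].
phase = rng.uniform(0.0, delta, n_grid)

# Step 2: convolve with a Gaussian kernel of standard deviation ρ.
x = np.arange(-4 * int(rho), 4 * int(rho) + 1)
kernel = np.exp(-x**2 / (2 * rho**2))
kernel /= kernel.sum()
phase = np.convolve(phase, kernel, mode='same')

# Incoherent PSF from the generalised pupil, Eqs. (2.9)-(2.10), up to scale.
n_r = 1.5                                        # assumed refractive index
t_ap = np.zeros(n_grid)
t_ap[n_grid // 4: 3 * n_grid // 4] = 1.0         # clear central aperture
pupil = t_ap * np.exp(1j * 2 * np.pi * (n_r - 1) * phase / wav)
psf = np.abs(np.fft.fftshift(np.fft.fft(pupil))) ** 2
psf /= psf.sum()                                 # unit-energy normalisation
```

Increasing delta spreads the PSF energy over a wider support, which is the roughness/SNR trade explored next.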
Fig. 2.3(a) shows a simulated impulse-like PSF and Fig. 2.3(b) an extended PSF
resulting from simulating a pseudo-random phase mask with parameters ∆ = 1.5λc
and ρ = 10λc, where λc is the operating center wavelength. Here we set λc = 550 nm
and the imager F/# = 1.8. Assuming a detector size of 7.5 µm, the support of the
extended PSF spans roughly six detectors, in contrast with a sub-pixel extent
of 2 µm for the impulse-like PSF. The extended PSF will therefore accomplish the
desired encoding; however, it will do so at the cost of measurement SNR. Because the
extended PSF is spread over several pixels, its photon count per detector is lower than
that for the impulse-like PSF for a point-like object. Assuming a constant detector
noise, the measurement SNR per detector for the extended PSF is thus lower than
that of the impulse-like PSF. For more general objects, the extended PSF results in
a reduced contrast image with a commensurate SNR reduction, though smaller than
for point-like objects. In the next section, we present a simulation study to quantify
the tradeoff between the overall imaging resolution and the SNR for two candidate
imagers that use multiple sub-pixel shifted measurements: (a) the conventional imager
and (b) the pseudo-random phase enhanced lens (PRPEL) imager.
2.3. Simulation results
For the purposes of the simulation study, we consider only one-dimensional objects
and image measurements. The target imaging system has a modest specification with
an angular resolution of 0.2 mrad and an angular field of view (FOV) of 0.1 rad. The
conventional imager uses a lens of F/# = 1.8 and back focal length 5mm. We assume
that the lens is diffraction-limited and the optical PSF is shift-invariant. The detector
array in the image plane has a pixel size of 7.5µm with a full-well capacity (FWC) of
45000 electrons and a 100% fill factor. We further assume that the imager’s spectral
bandwidth is limited to 10 nm centered at λc =550 nm. For the PRPEL imager the
only modification is that the lens is followed by a pseudo-random phase mask with
parameters ∆ and ρ.
We assume a shot-noise limited SNR = 20 log₁₀(√FWC) ≈ 46 dB, given by the FWC
of the detector element. The shot-noise is modeled as equivalent AWGN with variance
σ² = FWC. The under-sampling factor for this imager is F = 15. This implies that
for an object vector f of size N × 1 the resulting image measurement vector g_i is of size
M × 1, where M = N/F. For the target imager, these values are N = 512 and M = 34.
Note that the block-wise shift-invariant imaging operator Hc is of size KM ×N .
To improve the overall imager performance we consider multiple sub-pixel shifted
image measurements or frames. These frames result from moving the imager with
respect to the object by a sub-pixel distance δi. Here it is important to constrain
the number of photons per frame to ensure a fair comparison among imagers using
multiple frames. We have two options: (a) assume that each imager has access to
the same finite number of photons and (b) assume that each frame of each imager
has access to the same finite number of photons. Option (b) may be physically realizable under
certain conditions; however, the results that are obtained will be unable to distinguish
between improvements arising from frame diversity versus improvements arising from
increased SNR. We therefore utilize option (a) because it is the only option that
allows us to study how best to use fixed photon resources. As a result, the photon
count for each frame is normalized to F/K in this simulation study.
The inversion of the composite imaging equation, Eq. (2.7), is based on the optimal linear-
minimum-mean-squared-error (LMMSE) operator W. The resulting object estimate
is given by
f̂ = Wg, (2.11)
where W is defined as [31]
W = R_f H_c^T (H_c R_f H_c^T + R_n)^{−1}. (2.12)
Rf is the auto-correlation matrix for the object vector f and Rn is the auto-correlation
matrix of the noise vector n. Because the composite imaging operator Hc is not shift-
Figure 2.4. Reconstruction incorporates object priors: (a) object class used for training and (b) log power spectral density versus angular frequency (Burg estimate and power laws η = 1.0, 1.4, 2.0), with the best power-law fit used to define the LMMSE operator.
invariant, the LMMSE solution does not reduce to the well-known Wiener filter. The
noise auto-correlation matrix reduces to a diagonal matrix under the assumption of
independent and identically distributed (i.i.d.) noise and therefore can be written as
R_n = σ²I. The object auto-correlation matrix R_f incorporates prior object knowledge
within the reconstruction process as a regularizing term. Here we obtain the
object auto-correlation matrix from a power-law power spectral density (PSD), 1/f^η,
that serves as a good model for natural images [32, 33, 34]. A power-law PSD was
computed to model the class of 10 objects shown in Fig. 2.4(a) chosen to represent
a wide variety of scenes (rows and columns of these scenes are used as 1D objects).
Fig. 2.4(b) shows several power law PSDs plotted along with the PSD obtained using
Burg’s method [35] on 3 objects chosen from the set in Fig. 2.4(a). The power-law
PSD (η = 1.4) is used to model the PSD of the object class, as it is applicable to a wider
range of natural images compared to PSD models such as Burg’s that are obtained
for a specific set of objects. The value of the power-law PSD parameter η was obtained
by a least-squares fit to the Burg PSD estimate.
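The way the power-law prior enters the reconstruction can be sketched as follows: R_f is assembled from the 1/f^η PSD via the Wiener–Khinchin relation, and the LMMSE operator of Eq. (2.12) is then formed. The composite operator Hc below is a random stand-in for the calibrated operator, and the frequency floor at the f = 0 bin is an assumption made to avoid the PSD singularity:

```python
import numpy as np

rng = np.random.default_rng(1)
N, eta, sigma2 = 64, 1.4, 0.01

# Power-law PSD 1/f^eta; floor the f = 0 bin to avoid the singularity.
freqs = np.abs(np.fft.fftfreq(N))
psd = 1.0 / np.maximum(freqs, 1.0 / N) ** eta

# Wiener-Khinchin: the autocorrelation is the inverse DFT of the PSD.
r = np.real(np.fft.ifft(psd))
idx = np.arange(N)
Rf = r[np.abs(idx[:, None] - idx[None, :])]      # stationarity => Toeplitz Rf

# LMMSE operator, Eq. (2.12), with a random stand-in for Hc.
M = 24
Hc = rng.normal(size=(M, N))
Rn = sigma2 * np.eye(M)                          # i.i.d. noise, Rn = sigma^2 I
W = Rf @ Hc.T @ np.linalg.inv(Hc @ Rf @ Hc.T + Rn)
print(W.shape)                                   # f_hat = W g, as in Eq. (2.11)
```

Because the prior is stationary, R_f is Toeplitz; the same construction applies to the block-wise shift-invariant operators used in the study.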
In order to quantify the performance of both the PRPEL and the conventional
Figure 2.5. Rayleigh resolution estimation for multi-frame imagers using a sinc²(·) fit to the post-processed PSF (estimated resolution = 0.4 mrad).
imaging systems we employ two metrics: (a) Rayleigh resolution and (b) normalized
root-mean-square-error (RMSE). The Rayleigh resolution of a composite multi-frame
imager is found by using a point-source object and applying the LMMSE operator to
the K image frames. The resulting point-source reconstruction represents the overall
PSF of the computational imager. A least-squares fit of a diffraction-limited sinc²(·) PSF to the overall imager PSF is used to obtain the resolution estimate. Fig. 2.5
illustrates this resolution estimation method with an example of a post-processed PSF
and the associated sinc2(·) fit. The second imager performance metric uses RMSE
to quantify the quality of a reconstructed object. The RMSE metric is defined as
RMSE = (√⟨‖f − f̂‖²⟩ / 255) × 100%, (2.13)
where 255 is the peak object pixel value. Here, the expectation ⟨·⟩ is taken over both
the object and the noise ensembles. We have used all columns and rows of the 2D
objects shown in Fig. 2.4(a) to form a set of 1D objects for computing the RMSE
metric in the simulation study.
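The sinc²(·)-fitting step of the resolution metric can be sketched with a simple grid search over candidate resolutions (synthetic noisy samples stand in for a post-processed PSF; a nonlinear least-squares routine would serve equally well):

```python
import numpy as np

def sinc2(x, amp, R):
    """Diffraction-limited PSF model; np.sinc(u) = sin(pi u)/(pi u),
    so the first zero (the Rayleigh resolution) falls at x = R."""
    return amp * np.sinc(x / R) ** 2

# Hypothetical post-processed PSF: sinc^2 of resolution 0.4 mrad plus noise.
rng = np.random.default_rng(2)
x = np.linspace(-2.0, 2.0, 201)          # angular dimension [mrad]
y = sinc2(x, 1.0, 0.4) + rng.normal(scale=0.01, size=x.size)

# Least-squares fit over a grid of candidate resolutions; for each R the
# optimal amplitude has the closed form amp = <y, m> / <m, m>.
best = None
for R in np.linspace(0.1, 1.0, 451):
    m = np.sinc(x / R) ** 2
    amp = (y @ m) / (m @ m)
    sse = np.sum((y - amp * m) ** 2)
    if best is None or sse < best[0]:
        best = (sse, R, amp)
print(f"estimated Rayleigh resolution = {best[1]:.2f} mrad")
```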
First, we consider the conventional imager. The sub-pixel shift for each frame
is chosen randomly. The performance metrics are computed and averaged over 30
Figure 2.6. Conventional imager performance with number of frames: (a) RMSE and (b) Rayleigh resolution.
sub-pixel shift-sets for each value of K. Fig. 2.6(a) shows a plot of the RMSE versus
the number of frames K. We observe that the RMSE decreases with the number
of frames, as expected. This result demonstrates that additional object information
is accumulated through the use of diverse (i.e., shifted) channels: as the number of
frames increases, the condition-number of the composite imaging operator Hc im-
proves. The RMSE does not converge to zero for K = 16 because the detector noise
ultimately limits the minimum reconstruction error. The resolution
of the overall imager is plotted against the number of frames K in Fig. 2.6(b). Ob-
serve that the resolution improves with increasing K, converging towards the optical
resolution limit of 0.2 mrad. The resolution obtained with K = 16 is not equal to
the diffraction limit because this data represents an average resolution over a set of
random sub-pixel shift-sets. When the sub-pixel shifts are chosen as multiples of D/F,
the resolution achieved for K = 16 is indeed equal to the optical resolution limit.
The PRPEL imager employs a pseudo-random phase mask to modify the impulse-
like optical PSF. The phase mask parameters ∆ and ρ jointly determine the statistics
of the spatial intensity distribution and the extent of the optical PSF. We design an
optimal phase mask by setting ρ to a constant (10λc) and finding the value of ∆ that
Figure 2.7. PRPEL imager performance versus mask roughness parameter ∆ with ρ = 10λc and K = 3: (a) Rayleigh resolution and (b) RMSE.
maximizes the imager performance for a given K. Fig. 2.7(a) presents representative
data quantifying imager resolution as a function of ∆ with ρ = 10λc and K = 3. This
plot shows the fundamental tradeoff between the condition number of the imaging
operator and the SNR cost. Note that for small values of ∆ the PSF is impulse-like.
As the value of ∆ increases the PSF becomes more diffuse as shown in Fig. 2.3(b).
This results in an improvement in condition number; however, as the PSF becomes
more diffuse the photon-count per detector decreases resulting in an overall decrease in
measurement SNR. Fig. 2.7(a) shows that optimal resolution is achieved for ∆ = 7λc.
Fig. 2.7(b) demonstrates a similar trend in RMSE versus ∆ with ρ = 10λc and K = 3.
The optimal value of ∆ under the RMSE metric is ∆ = 1.5λc. Note that the optimal
values of ∆ are different for the resolution and RMSE metrics. The resolution of an
imager is determined by its spatial frequency response alone; whereas, the RMSE is
dependent on the spatial frequency response as well as the object statistics. Therefore,
the value of ∆ that maximizes the resolution metric may result in an imager with a
particular spatial frequency response that may not achieve the minimum RMSE given
the object statistics and detector noise. All the subsequent results for the PRPEL
imager are obtained for the optimal value of ∆ which will therefore be a function of
Figure 2.8. PRPEL and conventional imager performance versus number of frames: (a) Rayleigh resolution and (b) RMSE.
K, σ and the metric (RMSE or resolution).
Fig. 2.8(a) presents the resolution performance of both the PRPEL and the con-
ventional imagers as a function of the number of frames K. We note that the PRPEL
imager converges faster than the conventional imager. A resolution of 0.3mrad is
achieved with only K = 4 by the PRPEL imager in contrast with K = 12 for the
conventional imager. A plot comparing the RMSE performance of the two imagers
is shown in Fig. 2.8(b). We note that the PRPEL imager is consistently superior to
the conventional imager. For K = 4 the PRPEL imager achieves an RMSE of 3.5%
as compared with an RMSE of 4.3% for the conventional imager.
2.4. Experimental results
An experimental demonstration of the PRPEL imager was undertaken in order to
validate the performance improvements predicted by simulation. Fig. 2.9 shows the
experimental setup along with the relevant physical dimensions. A Santa Barbara
Instrument Group ST2000XM CCD was used as the detector array. The CCD consists
of a 1600 × 1200 detector array, with a detector size of 7.4µm, 100% fill factor and
a FWC of 45000 electrons. The detector output from the CCD is quantized with a
Figure 2.9. Schematic of the optical setup used for experimental validation of the PRPEL imager: fiber-tip source, Fujinon lens (20 mm aperture), holographic diffuser (phase mask) with a 2.5× zoom lens, and SBIG CCD array (7.4 µm pixels); object distance 540 mm.
16-bit analog-to-digital converter, yielding a dynamic range of [0, 64000] digital counts.
During the experiment the CCD is cooled to −10 °C to minimize electronic noise.
The experimental setup uses a Fujinon CF16HA-1 TV lens operated at F/# = 4.0. A
circular holographic diffuser from Physical Optics Corporation is used as a pseudo-
random phase mask. The divergence angle (full-width at half-maximum) of the diffuser
is 0.1. A zoom lens with magnification 2.5× is used to decrease the divergence angle
of the diffuser. The actual phase statistics of the diffuser are not disclosed by the
manufacturer. Therefore, to relate the physical diffuser to the pseudo-random phase
mask model, we compute phase mask parameters ∆ and ρ that yield a PSF similar
to the one produced by the physical diffuser. The phase mask parameters ∆ = 2.0λc
and ρ = 175λc yield the PSF shown in Fig. 2.10(c). Comparing this PSF to the
PRPEL experimental PSF shown in Fig. 2.10(b), we note that they are similar in
appearance. This comparison, although qualitative, suggests that the physical diffuser
might possess statistics similar to the pseudo-random phase mask model described
here.
The Rayleigh resolution of the conventional optical PSF was estimated to be 5µm
or 0.31mrad. This yields an under-sampling factor of F = 3 along each direction.
This implies that a total of F 2 = 9 frames are required to achieve the full optical
Figure 2.10. Experimentally measured PSFs obtained from (a) the conventional imager and (b) the PRPEL imager, and (c) simulated PRPEL PSF with phase mask parameters ∆ = 2.0λc and ρ = 175λc.
resolution. The FOV for the experiment is 10mrad×10mrad consisting of 64 × 64
pixels each of size 0.156mrad×0.156mrad. The highly under-sampled nature of the
conventional imager as well as the extended nature of the PRPEL PSF demand
careful system calibration. Our calibration apparatus consisted of a fiber-tip point-
source mounted on an X-Y translation stage that can be scanned across the object
FOV. The 50 µm fiber core diameter in object space yields a 0.6 µm diameter point
in image space (system magnification = 1/84), which is much smaller than the detector
size of 7.4 µm. Therefore, we can assume that the fiber-tip serves as a good point-source
approximation for imager calibration purposes. Also note that the radiation exiting
the fiber-tip (numerical aperture = 0.22) overfills the entrance aperture of the
imager optics by a factor of 12. The motorized translation stage is controlled by a
Newport EPS300 motion controller. The fiber tip is illuminated by a white light-
source filtered by a 10 nm bandpass filter centered at λc=535 nm. The calibration
procedure involves scanning the fiber-tip over each object pixel position in the FOV
and for each such position, recording the discrete PSF at the CCD. To obtain reliable
PSF data during calibration we average 32 CCD frames to increase the measurement
SNR. To obtain PSF data with a particular sub-pixel shift, the calibration process is
repeated after shifting the FOV by that sub-pixel amount. This calibration data is
Figure 2.11. Experimentally measured Rayleigh resolution versus number of frames for both the PRPEL and conventional imagers.
subsequently used to construct the composite imaging operator Hc and compute the
LMMSE operator W using Eq. (2.12). The same calibration procedure is used for
both the conventional and the PRPEL imagers.
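The calibration loop amounts to measuring one column of the composite operator per object-pixel position and per sub-pixel shift. A schematic sketch, in which `measure_psf` is a hypothetical stand-in for translating the fiber tip and reading a CCD frame (all sizes are toy values, not the experimental dimensions):

```python
import numpy as np

def build_operator(measure_psf, n_obj, n_det, shifts, n_avg=32):
    """Assemble the composite operator Hc column by column from
    calibration measurements. measure_psf(i, s) stands in for scanning
    the fiber tip to object pixel i at sub-pixel FOV shift s and
    reading one detector frame of n_det samples."""
    Hc = np.zeros((len(shifts) * n_det, n_obj))
    for i in range(n_obj):
        for k, s in enumerate(shifts):
            # Average n_avg frames to raise the calibration SNR.
            frames = np.stack([measure_psf(i, s) for _ in range(n_avg)])
            Hc[k * n_det:(k + 1) * n_det, i] = frames.mean(axis=0)
    return Hc

# Toy stand-in: a known linear system observed through read noise.
rng = np.random.default_rng(3)
H_true = rng.random((4 * 8, 16))
def measure_psf(i, s):
    return H_true[s * 8:(s + 1) * 8, i] + rng.normal(scale=0.05, size=8)

Hc = build_operator(measure_psf, n_obj=16, n_det=8, shifts=[0, 1, 2, 3])
print(np.abs(Hc - H_true).max() < 0.1)   # → True: averaging suppresses noise
```

The measured Hc then feeds Eq. (2.12) directly, exactly as described for the experiment.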
The experimental PSFs for these two imagers are shown in Fig. 2.10(a) and
Fig. 2.10(b). The PSF of the conventional imager is seen to be impulse-like; whereas,
the PSF of the PRPEL imager has a diffused/extended shape as expected. The reso-
lution estimation procedure described in the previous section is once again employed
to estimate the resolution of the two experimental imagers. Fig. 2.11 presents the
plot of resolution versus number of frames K from the experimental data. Three data
points are obtained at K = 1, 4, and 9. The sub-pixel shifts (in microns) used for
these measurements were: (0,0) for K=1, (0,0), (0,3.7), (3.7,0), (3.7,3.7) for K=4,
and (0,0), (0,2.5), (0,5), (2.5,0), (2.5,2.5), (2.5,5), (5,0), (5,2.5), (5,5) for K = 9. Note that
the imager resolution is estimated using test data that is distinct from the calibration
data. As predicted in simulation, we see that the PRPEL imager outperforms the
conventional imager at all values of K. We observe that the PRPEL resolution nearly
saturates by K = 4. A maximum resolution gain of 13% is achieved at K = 4 by
the PRPEL imager relative to the conventional imager. Note that even at K = 9 the
Figure 2.12. The USAF resolution target: (a) group 0 element 1 and (b) group 0 elements 2 and 3.
resolution achieved by both the imagers is slightly poorer than the estimated optical
resolution of 0.31mrad. This can be attributed to errors in the calibration process,
which include non-zero noise in the PSF measurements and shift errors due to the
finite positioning accuracy of the computer-controlled translation stages.
A USAF resolution target was used to compare the object reconstruction quality
of the two imagers. Because the imager FOV is relatively small (10 mrad × 10 mrad,
i.e., 13.44 mm × 13.44 mm), we used two small areas of the USAF resolution target shown in
Fig. 2.12(a) and Fig. 2.12(b). In Fig. 2.12(a) the spacing between lines of group 0 el-
ement 1 is 500µm in object space or equivalently 0.37mrad. Similarly in Fig. 2.12(b)
the line spacings for group 0 elements 2 and 3 are 0.33mrad and 0.30mrad respec-
tively. Given the optical resolution of the experimental system, we expect that group
0 element 3 should be resolvable by both the conventional and PRPEL imagers.
Fig. 2.13 presents the raw detector measurements of USAF group 0 element 1
from the two imagers. Consistent with the measured degree of under-sampling, the
imagers are unable to resolve the constituent line elements in the raw data. Fig. 2.14
shows reconstructions from the two multi-frame imagers for the same object using
K = 1, 4, and 9 sub-pixel shifted frames. We observe that for K = 1 neither imager
can resolve the object. For K = 4 however, the PRPEL imager clearly resolves the
Figure 2.13. Raw detector measurements obtained using USAF group 0 element 1 from (a) the conventional imager and (b) the PRPEL imager.
lines in the object; whereas, the conventional imager does not resolve them clearly.
Fig. 2.15(a) shows a horizontal line scan through the object and LMMSE reconstruc-
tions for K = 4, affirming our observation that the PRPEL imager achieves superior
contrast to that of the conventional imager. For K = 9 we note that both imagers
resolve the object equally well. Next we consider the USAF group 0 elements 2 and 3
target, whose reconstructions are shown in Fig. 2.16. As before, for K = 1 neither
imager can resolve the object. However, for K = 4 the PRPEL imager clearly re-
solves element 2 and barely resolves element 3. In contrast, the conventional imager
barely resolves only element 2. This is also evident in the horizontal line scan of the
object and the LMMSE reconstructions shown in Fig. 2.15(b). Both imagers achieve
comparable performance for K = 9, completely resolving the object.
We observe that despite having precise channel knowledge we obtain poor reconstruction results for the case K = 1. This points to the limitations of linear reconstruction techniques, which cannot incorporate powerful object constraints such as positivity and finite support. However, non-linear reconstruction techniques such as iterative back-projection (IBP) [36] and maximum-likelihood expectation-maximization (MLEM) [37] can easily incorporate these constraints. The Richardson-Lucy (RL) algorithm [38, 39], based on the MLEM principle, has been shown to be one such effective reconstruction
Figure 2.14. LMMSE reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
Figure 2.15. Horizontal line scans through the USAF target and its LMMSE reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3.
Figure 2.16. LMMSE reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
technique. The RL algorithm is a multiplicative iterative scheme in which the (k+1)th object update, denoted by $f^{(k+1)}$, is defined as [28]

$$f^{(k+1)}_n = f^{(k)}_n \, \frac{1}{s_n} \sum_{m=1}^{KM} \frac{g_m}{\left(H_c f^{(k)}\right)_m} H_{c,mn}, \qquad (2.14)$$

$$s_n = \sum_{m=1}^{KM} H_{c,mn},$$
where the subscript denotes the corresponding element of a vector or a matrix. Note
that if all elements of the composite imaging matrix Hc, the raw image measurement
g and the initial object estimate f (0) are positive then all subsequent estimates of
the object are guaranteed to be positive, thereby achieving the positivity constraint.
Further, by setting the appropriate elements of f (0) to 0 we can implement the finite
support constraint in the RL algorithm.
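The multiplicative update of Eq. (2.14), together with the positivity and finite-support behavior just described, can be sketched in a few lines of Python. The toy 1-D system matrix, sizes, and iteration count below are illustrative assumptions, not the experimental configuration:

```python
import numpy as np

def richardson_lucy(g, Hc, n_iters=200, f0=None, eps=1e-12):
    """RL update of Eq. (2.14): f_n <- f_n * (1/s_n) * sum_m [g_m / (Hc f)_m] * Hc[m, n].
    Zeros in f0 remain zero, which is how the finite-support constraint is imposed."""
    s = Hc.sum(axis=0)                        # s_n = sum_m Hc[m, n]
    f = np.full(Hc.shape[1], g.mean()) if f0 is None else f0.astype(float).copy()
    for _ in range(n_iters):
        ratio = g / np.maximum(Hc @ f, eps)   # g_m / (Hc f)_m
        f = f * (Hc.T @ ratio) / np.maximum(s, eps)
    return f

# Toy 1-D example (illustrative sizes): blur followed by 2x down-sampling
H = np.zeros((8, 17))
for m in range(8):
    H[m, 2 * m:2 * m + 3] = [0.25, 0.5, 0.25]
rng = np.random.default_rng(0)
f_true = np.abs(rng.normal(1.0, 0.3, 17))
g = H @ f_true
f_hat = richardson_lucy(g, H)
```

Because every factor in the update is non-negative, the positivity of the estimate is preserved automatically at each iteration, exactly as noted above.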
We apply the RL algorithm described above to the experimental data in an effort
to improve reconstruction quality, especially for K = 1. A constant positive vector
is used as an initial object estimate, i.e. $f^{(0)} = c$ where $c_i = a > 0$ for all i. Fig. 2.17 and Fig. 2.18 show the RL object reconstructions of USAF group 0 element 1 and USAF group 0 elements 2 and 3 respectively. As expected, the RL algorithm yields a
substantial improvement in reconstruction quality over the LMMSE processor. This
improvement is most notable for the K = 1 case. In Fig. 2.17 we observe that the
PRPEL imager delivers better results compared to the conventional imager for K = 1
and K = 4. The horizontal line scans in Fig. 2.19(a) show that the PRPEL imager
maintains a superior contrast compared to the conventional imager for K = 4. From
Fig. 2.18 we observe that for K = 1 the PRPEL imager begins to resolve element 2
whereas the conventional imager still fails to resolve element 2. For K = 4, element 2
is clearly resolved and element 3 is just resolved by the PRPEL imager. In comparison
the conventional imager barely resolves element 2. These observations are confirmed
by the horizontal line scan plots shown in Fig. 2.19(b).
Overall the experimental reconstruction and resolution results confirm the conclu-
Figure 2.17. Richardson-Lucy reconstructions of USAF group 0 element 1 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
Figure 2.18. Richardson-Lucy reconstructions of USAF group 0 elements 2 and 3 with left column for PRPEL imager and right column for conventional imager: top row for K=1, middle row for K=4, and bottom row for K=9.
Figure 2.19. Horizontal line scans through the USAF target and its Richardson-Lucy reconstruction for conventional and PRPEL imagers for K=4: (a) group 0 element 1 and (b) group 0 elements 2 and 3.
sions drawn from our simulation study; the PRPEL imager offers superior resolution
and reconstruction performance compared to the conventional multi-frame imager.
2.5. Imager parameters
The results reported here have demonstrated the utility of the PRPEL imager. In
order to motivate a more general applicability of the PRPEL approach, there are
two important parameters that require further investigation: pixel size and spectral-
Figure 2.20. (a) Rayleigh resolution and (b) RMSE versus number of frames for multi-frame imagers that employ smaller pixels and lower measurement SNR.
bandwidth. We consider two case studies in which these imaging system parameters
are modified in order to study their impact on overall imager performance.
2.5.1. Pixel size
Here we consider the effect of a smaller pixel size, typical of the CMOS detector arrays now commonly employed in many imagers. Consider a sensor having a pixel size of 3.2µm, resulting in less severe under-sampling as compared with the 7.5µm pixel size assumed earlier. This detector has a 100% fill-factor and a smaller FWC of 28,000 electrons (lower SNR). All other parameters of the imaging system remain unchanged.
The under-sampling factor for the new sensor is F = 7 and the photon-limited SNR
is now 22dB. We repeat the simulation study of the overall imaging system perfor-
mance for both the conventional imager and the PRPEL imager. Fig. 2.20(a) shows
the plot of the resolution versus the number of frames for both imaging systems. This
plot shows that for K = 2 the PRPEL imager achieves a resolution of 0.3mrad while
the conventional imager resolution is only 0.5mrad. Fig. 2.20(b) shows the RMSE
performance of the two imagers versus the number of frames. For K = 2 the PRPEL
Figure 2.21. The optical PSF obtained using PRPEL with both narrowband (10 nm) and broadband (150 nm) illumination.
imager achieves an RMSE of 3.2% compared to 4.0% for the conventional imager, an
improvement of nearly 20%. From these results we conclude that the PRPEL imager
remains a useful option for imagers with CMOS sensors that have smaller pixels and
a lower SNR.
2.5.2. Broadband operation
Recall that all our simulation studies have assumed a 10 nm spectral bandwidth so
far. In this section, we will relax this constraint and allow the spectral bandwidth
to increase to 150 nm, roughly equal to the bandwidth of the green band of the
visible spectrum. All other imaging system parameters remain unchanged (using the
original 7.5µm sensor). There is a two-fold implication of the increased bandwidth.
First, because we accept a wider bandwidth, the photon count increases resulting
in an improved measurement SNR. Within the PRPEL imager however, this SNR
increase is accompanied by increased chromatic dispersion and a smoothing of the
PRPEL PSF. This smoothing results in a worsening of the condition number for
the PRPEL imager. To illustrate the dispersion effect, Fig. 2.21 shows a plot of
the extended PRPEL PSF for both the 10 nm and the 150 nm bandwidths. The
Figure 2.22. (a) Rayleigh resolution and (b) RMSE versus number of frames for broadband PRPEL and conventional imagers.
smoothing of the PSF affects the optical transfer function of the imager by attenuating
the higher spatial frequencies. Hence, we can expect a trade-off between the higher
SNR and the worsening of the condition number, especially for the PRPEL imaging
system. The plot in Fig. 2.22(a) shows that the conventional imager resolution is
relatively unaffected by broadband operation. The PRPEL imager performance on
the other hand suffers due to dispersion despite the increase in SNR. Similar trends
in RMSE performance can be observed for the two imagers as shown by the plot in
Fig. 2.22(b). The performance of the broadband PRPEL imager deteriorates relative
to narrowband operation for small values of K; however, note that for medium and
large values of K the performance of the PRPEL imager actually improves due to
increased SNR.
2.6. Conclusions
The optical PSF engineering approach for improving imager resolution and object reconstruction fidelity in under-sampled imaging systems was successfully demonstrated.
The simulation study of the PRPEL imager predicted substantial performance im-
provements over a conventional multi-frame imager. The PRPEL imager was shown
to offer as much as 50% resolution improvement and 20% RMSE improvement as
compared to the conventional imager. The experimental results confirmed these pre-
dicted performance improvements. We also applied the non-linear Richardson-Lucy
reconstruction technique to the experimental data. The results obtained showed
that imager performance is substantially improved with non-linear techniques. In this chapter, the application of the optical PSF engineering method to the object reconstruction task has demonstrated the potential benefits of the joint-optimization design approach. In the next chapter, we extend the application of the optical PSF engineering method to an iris-recognition task.
Chapter 3
Optical PSF Engineering: Iris Recognition Task
In this chapter we will apply the optical PSF engineering approach to the task of iris-recognition to overcome the performance degradations introduced by an under-sampled imaging system. Note that the metric for quantifying imaging system performance for a particular task plays a critical role in the joint-optimization design approach. For the object reconstruction task we employed two metrics: 1) resolution and 2) RMSE. Here we will use the statistical metric of false rejection ratio
(FRR) (evaluated at a fixed false acceptance ratio (FAR)) for quantifying the imaging
system performance for the iris-recognition task.
3.1. Introduction
Many modern defense and security applications require automatic recognition and
verification services that employ a variety of biometrics such as facial features, hand
shape, voice, fingerprints, and iris. The iris is the annular region between the pupil
and the outer white sclera of the eye. Iris-based recognition has been gaining pop-
ularity in recent years and it has several advantages compared to other traditional
biometrics such as fingerprints and facial features. The iris-texture pattern represents a high density of information, and the resulting statistical uniqueness can yield false recognition rates as low as 1 in $10^{10}$ [41, 42, 43]. Further, it has been found that
the human iris is stable over the lifetime of an individual and is therefore considered
to be a reliable biometric [44]. Iris-based recognition systems rely on capturing the
iris-texture pattern with a high-resolution imaging system. This places stringent de-
mands on imaging optics and sensor design. In the case where the detector pixel size
limits the overall resolution of the imaging system, the under-sampling in the sensor
array can lead to degradation of the iris-recognition performance. Therefore, over-
coming the detector-induced under-sampling becomes a vital issue in the design of
an iris-recognition imaging system. One approach to improve the resolution beyond
the detector limit employs multiple sub-pixel shifted measurements within a TOMBO
imaging system architecture [19, 20]. However, this approach does not exploit the
optical degrees of freedom available to the designer and more importantly it does not
address the specific nature of the iris-recognition task. We note that there are some
studies that have exploited the optical degrees of freedom to extend the depth-of-field
of iris-recognition systems [45, 46], but we are not aware of any previous work that
has examined under-sampling in iris-recognition imaging systems. In this chapter,
we propose an approach that involves engineering the optical point spread function
(PSF) of the imaging system in conjunction with use of multiple sub-pixel shifted
measurements. It is important to note that the goal of our approach is to maxi-
mize the iris-recognition performance and not necessarily the overall resolution of the
imaging system. To accomplish this goal, we employ an optimization framework to
engineer the optical PSF and optimize the post-processing system parameters. The
task-specific performance metric used within our optimization framework is FRR for
a given FAR [47]. The mechanism of modifying the optical PSF employs a phase-
mask in the aperture-stop of the imaging system. The phase-mask is defined with
Zernike polynomials and the coefficients of these polynomials serve as the optical design parameters. The optimization framework is used to design imaging systems for
various numbers of sub-pixel shifted measurements. The CASIA iris database [48] is
used in the optimization framework and it also serves to quantify the performance of
the resulting optimized imaging system designs.
Figure 3.1. PSF-engineered multi-aperture imaging system layout.
3.2. Imaging System Model
In this study, our iris-recognition imaging system is composed of three components: 1)
the optical imaging system, 2) the reconstruction algorithm, and 3) the recognition al-
gorithm. The optical imaging system consists of multiple sub-apertures with identical
optics. This multi-aperture imaging system produces a set of sub-pixel shifted im-
ages on the sensor array. The task of the reconstruction algorithm is to combine these
image measurements to form an estimate of the object. Finally, the iris-recognition
algorithm operates on this object estimate and either accepts or rejects the iris as a
match. We begin by describing the multi-aperture imaging system.
3.2.1. Multi-aperture imaging system
Fig. 3.1 shows the system layout of the multi-aperture (MA) imaging system. The
number of sub-imagers comprising the MA imaging system is denoted by K. The
sensor array in the focal plane of the MA imager generates K image measurements,
where the kth measurement (also referred to as a frame) is denoted by gk. The detector
pitch d of the sensor array relative to the Nyquist sampling interval δ, determined by the optical cut-off spatial frequency, defines the under-sampling factor $F = \frac{d}{\delta} \times \frac{d}{\delta}$. Therefore, for an object of size N × N pixels the under-sampled kth sub-imager measurement $g_k$ is of dimension M × M, where $M = \lceil N/\sqrt{F}\,\rceil$. Mathematically, the kth frame can be expressed as
gk = Hkf + nk, (3.1)
where f is an $N^2 \times 1$ vector formed by a lexicographic arrangement of a two-dimensional (N × N) discretized representation of the object, $H_k$ is the $M^2 \times N^2$ discrete-to-discrete imaging operator of the kth sub-imager, and $n_k$ denotes the $M^2 \times 1$ measurement error vector. Here we model the measurement error $n_k$ as zero-mean additive white Gaussian noise (AWGN) with variance $\sigma_n^2$. Note that the imaging operator $H_k$ is different for each sub-imager and is expressed as
Hk = DCSk, (3.2)
where $S_k$ is the $N^2 \times N^2$ shift operator that produces a two-dimensional sub-pixel shift $(\Delta X_k, \Delta Y_k)$ in the kth sub-imager, C is the $N^2 \times N^2$ convolution operator that represents the optical PSF, and D is the $M^2 \times N^2$ down-sampling operator, which includes the effect of spatial integration over the detector and the under-sampling caused by the sensor array. Note that the convolution operator C does not vary with
k because the optics are assumed to be identical in all sub-imagers. By combining the K measurements we can form a composite measurement $g = [g_1;\, g_2;\, \dots;\, g_K]$ that can be expressed in terms of the object vector f as follows

$$g = H_c f + n, \qquad (3.3)$$

where $H_c = [H_1;\, H_2;\, \dots;\, H_K]$ is the composite imaging operator of size $KM^2 \times N^2$, obtained by stacking the K imaging operators corresponding to each of the K sub-imagers, and n is the composite noise vector defined as $n = [n_1;\, n_2;\, \dots;\, n_K]$.

As mentioned earlier, the optical PSF is engineered by placing a phase-mask in the
aperture-stop of each sub-imager. The pupil-function tpupil(ρ, θ) of each sub-imager
is expressed as [29]

$$t_{pupil}(\rho, \theta) = t_{amp}(\rho)\,\exp\!\left(\frac{j2\pi(n_r - 1)\,t_{phase}(\rho,\theta)}{\lambda}\right), \qquad (3.4)$$
where ρ and θ are the polar coordinate variables in the pupil, $n_r$ is the refractive index of the phase-mask, $t_{amp}(\rho) = \mathrm{circ}(\rho/D_{ap})$ is the circular pupil-amplitude function ($D_{ap}$ denotes the aperture diameter), $t_{phase}(\rho, \theta)$ represents the pupil-phase function, and λ is the wavelength. A Zernike polynomial expansion of order P is used to define the pupil-phase function as follows
$$t_{phase}(\rho,\theta) = \sum_{i=1}^{P} a_i \, Z_i(\rho,\theta), \qquad (3.5)$$
where ai is the coefficient of the ith Zernike polynomial denoted by Zi(ρ, θ) [49]. In
this work, we will use Zernike polynomials up to order P = 24. The resulting optical
PSF h(ρ, θ) is expressed as [28]
$$h(\rho,\theta) = \frac{A_c}{(\lambda f_l)^4}\left| T_{pupil}\!\left(-\frac{\rho}{\lambda f_l},\, \theta\right)\right|^2, \qquad (3.6)$$

$$T_{pupil}(\omega) = \mathcal{F}_2\{\, t_{pupil}(\rho, \theta) \,\}, \qquad (3.7)$$
where ω is the two-dimensional spatial frequency vector, Ac is a normalization con-
stant with units of area, fl is the back focal length, and F2 denotes the 2-dimensional
forward Fourier transform operator.
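Eqs. (3.4)-(3.7) can be sketched numerically: form the circular pupil, apply a Zernike phase, and take the squared magnitude of the 2-D Fourier transform. Only a single defocus term $Z_4 = \sqrt{3}(2\rho^2 - 1)$ is included here, and the grid size, coefficient (in waves), and unit-energy normalization are illustrative assumptions:

```python
import numpy as np

def psf_from_zernike(n_grid=256, a_defocus=0.5):
    """Sketch of Eqs. (3.4)-(3.7): PSF = |F2{t_pupil}|^2 with a Zernike phase
    in the aperture stop. Only the defocus term is included; grid size and
    coefficient are illustrative."""
    x = np.linspace(-1, 1, n_grid)
    X, Y = np.meshgrid(x, x)
    rho = np.hypot(X, Y)
    aperture = (rho <= 1.0)                            # t_amp = circ(rho), normalized radius
    phase = a_defocus * np.sqrt(3) * (2 * rho**2 - 1)  # waves of defocus (Z4)
    pupil = aperture * np.exp(2j * np.pi * phase)      # t_pupil, Eq. (3.4)
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(pupil)))  # T_pupil, Eq. (3.7)
    psf = np.abs(field) ** 2                           # Eq. (3.6), up to A_c/(lambda f_l)^4
    return psf / psf.sum()                             # normalize to unit energy

psf = psf_from_zernike()
```

Sweeping the Zernike coefficients in such a sketch is the same mechanism the optimization framework of this chapter uses to search the optical design space.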
A discrete representation of the optical PSF $h_d(l,m)$, required for defining the C operator, is obtained as follows

$$h_d(l,m) = \int_{-d/2}^{d/2}\!\int_{-d/2}^{d/2} h(x - ld,\, y - md)\, dx\, dy, \quad l = -L,\dots,L,\;\; m = -L,\dots,L, \qquad (3.8)$$

where $(2L+1)^2$ is the number of samples used to represent the optical PSF. Note that a lexicographic ordering of $h_d(l,m)$ yields one row of C, and all other rows are obtained by lexicographically ordering appropriately shifted versions of this discrete optical PSF.
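The factorization $H_k = D\,C\,S_k$ of Eq. (3.2) can be made concrete with explicit matrices in one dimension. The circular boundary handling, integer-pixel shift, and sizes below are simplifying assumptions (the actual system uses 2-D operators and sub-pixel shifts):

```python
import numpy as np

def shift_op(N, delta):
    """S_k: integer circular shift (a stand-in for the sub-pixel shift operator)."""
    return np.roll(np.eye(N), delta, axis=0)

def conv_op(N, h):
    """C: circulant matrix built from the discrete PSF h_d of Eq. (3.8)."""
    C = np.zeros((N, N))
    L = len(h) // 2
    for n in range(N):
        for l in range(-L, L + 1):
            C[n, (n + l) % N] += h[l + L]
    return C

def downsample_op(N, F):
    """D: integrate over each detector pixel (F-sample average), then sample."""
    M = N // F
    D = np.zeros((M, N))
    for m in range(M):
        D[m, m * F:(m + 1) * F] = 1.0 / F
    return D

# H_k = D C S_k (Eq. 3.2), here in 1-D with illustrative sizes
N, F = 16, 4
h_d = np.array([0.25, 0.5, 0.25])
Hk = downsample_op(N, F) @ conv_op(N, h_d) @ shift_op(N, 1)
```

Stacking such $H_k$ matrices row-wise gives the composite operator $H_c$ of Eq. (3.3).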
3.2.2. Reconstruction algorithm
The measurements from the K sub-imagers comprising the MA imaging system form
the input to the reconstruction algorithm. We employ a reconstruction algorithm
Figure 3.2. Iris examples from the training dataset.
based on the linear minimum mean square error (LMMSE) criterion. The LMMSE
method is essentially a generalized form of the Wiener filter and operates on the
measurement in the spatial domain without the assumption of shift-invariance. Given
the imaging model specified in Eq. (3.3), the LMMSE operator W can be written as [31]

$$W = R_{ff} H_c^T \left( H_c R_{ff} H_c^T + R_{nn} \right)^{-1}, \qquad (3.9)$$
where $R_{ff}$ is the object auto-correlation matrix and $R_{nn}$ is the noise auto-correlation matrix. Here we assume the noise is zero-mean AWGN with variance $\sigma_n^2$; therefore $R_{nn} = \sigma_n^2 I$. Note that for an object of size $N^2$ and a measurement of size $KM^2$, the size of the W matrix is $N^2 \times KM^2$. For even a modest object size of 280 × 280, as is the case here, computing the W matrix becomes computationally very expensive. Therefore, we adopt an alternate approach that does not rely on directly computing matrix
inverses but instead uses a conjugate-gradient method to compute the LMMSE solu-
tion iteratively. Before we describe the iterative algorithm, we first need a method to
estimate the object auto-correlation matrix Rff . We use a training set of 40 subjects
with 4 iris samples for each subject, randomly selected from the CASIA iris database
yielding a total of 160 iris object samples. Fig. 3.2 shows example iris-objects in the
training dataset. The kth iris object yields the sample auto-correlation function $r^k_{ff}$, which is used to estimate the actual auto-correlation function as follows

$$R_{ff} = \frac{1}{160} \sum_{k=1}^{160} r^k_{ff}. \qquad (3.10)$$
The corresponding power spectral density $S_{ff}$ can be written as [50]

$$S_{ff}(\rho) = \mathcal{F}_2(R_{ff}). \qquad (3.11)$$

To obtain a smooth approximation of the power spectral density we use the following parametric function [51]

$$S_{ff}(\rho) = \frac{\sigma_f^2}{\left(1 + 2\pi\mu_d\,\rho^2\right)^{3/2}}. \qquad (3.12)$$

Note that because the iris is circular, we assume a radially symmetric power spectrum $S_{ff}$. A least-squares fit to $S_{ff}(\rho)$ yields $\sigma_f = 43589$ and $\mu_d = 1.5$.
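The parametric model of Eq. (3.12) and the spirit of the least-squares fit can be sketched as follows. Here the fit is a coarse grid search over $\mu_d$ against samples generated from the reported parameters, purely to illustrate the procedure; the actual fit used the measured auto-correlation data:

```python
import numpy as np

def s_ff(rho, sigma_f, mu_d):
    """Parametric radially symmetric PSD of Eq. (3.12)."""
    return sigma_f**2 / (1.0 + 2.0 * np.pi * mu_d * rho**2) ** 1.5

# Illustrative fit: recover mu_d by grid search on synthetic samples
rho = np.linspace(0.01, 5.0, 200)
target = s_ff(rho, 43589.0, 1.5)
grid = np.linspace(0.5, 3.0, 251)
errs = [np.sum((s_ff(rho, 43589.0, m) - target) ** 2) for m in grid]
mu_best = grid[int(np.argmin(errs))]
```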
In general, a conjugate-gradient algorithm minimizes the following form of quadratic objective function Q [28]

$$Q(f) = \frac{1}{2} f^T A f - b^T f. \qquad (3.13)$$

For the LMMSE criterion, $A = H_c^T H_c + \sigma_n^2 R_{ff}^{-1}$ and $b = H_c^T g$. Within our iterative conjugate-gradient algorithm we use a conjugate vector $p_j$ instead of the gradient of the objective Q(f) to achieve faster convergence to the LMMSE solution [52]. The (k+1)th update rule can be expressed as [28]

$$f_{k+1} = f_k + \alpha_k p_k, \qquad (3.14)$$

$$\alpha_k = -\frac{p_k^T \nabla Q_k}{d_k}, \qquad (3.15)$$

where $\nabla Q_k$ denotes the gradient of the objective function Q evaluated at the kth step, $p_k$ is conjugate to all previous $p_j$, $j < k$ (i.e. $p_j^T A p_k = d_j \delta_{jk}$), $\delta_{jk}$ is the Kronecker delta, and $d_k$ is the $\|\cdot\|_2$ norm of $p_k$. The stopping criterion is met when the residual vector $r_k = \nabla Q_k = A f_k - b$ changes by less than β% over the last 4 iterations (i.e. $\frac{\|r_{k-4}\| - \|r_k\|}{\|r_{k-4}\|} \le \frac{\beta}{100}$).
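The iteration of Eqs. (3.14)-(3.15) with a residual-based stopping rule can be sketched as below. The A-conjugation step, the choice $d_k = p_k^T A p_k$ (the standard conjugate-gradient convention), and the toy LMMSE normal equations are illustrative assumptions, not the dissertation's exact implementation:

```python
import numpy as np

def conjugate_gradient(A, b, beta=1e-6, max_iter=500):
    """CG for the quadratic Q(f) of Eq. (3.13); update per Eqs. (3.14)-(3.15),
    here with d_k = p_k^T A p_k. Stops when the residual norm changes by less
    than beta% over 4 iterations, or when the residual is essentially zero."""
    f = np.zeros_like(b)
    r = A @ f - b                      # residual = gradient of Q
    p = -r
    hist = [np.linalg.norm(r)]
    for _ in range(max_iter):
        Ap = A @ p
        alpha = -(p @ r) / (p @ Ap)    # Eq. (3.15)
        f = f + alpha * p              # Eq. (3.14)
        r = A @ f - b
        hist.append(np.linalg.norm(r))
        if hist[-1] < 1e-12:
            break
        if len(hist) > 4 and (hist[-5] - hist[-1]) / hist[-5] <= beta / 100:
            break
        gamma = (r @ Ap) / (p @ Ap)    # enforce p_{k+1}^T A p_k = 0
        p = -r + gamma * p
    return f

# Toy LMMSE normal equations: A = Hc^T Hc + sigma^2 Rff^{-1}, b = Hc^T g
rng = np.random.default_rng(1)
Hc = rng.normal(size=(20, 10))
g = Hc @ rng.normal(size=10) + 0.01 * rng.normal(size=20)
A = Hc.T @ Hc + 0.01 * np.eye(10)     # Rff = I assumed for illustration
b = Hc.T @ g
f_hat = conjugate_gradient(A, b)
```

The key design point is that only matrix-vector products with A are required, which is why this route scales to the 280 × 280 objects for which forming W directly is infeasible.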
Figure 3.3. Examples of (a) iris-segmentation, (b) masked iris-texture region, (c) unwrapped iris, and (d) iris-code.
3.2.3. Iris-recognition algorithm
The object estimate obtained with the reconstruction algorithm is processed by the
iris-recognition algorithm to make the final decision. There are three main processing
steps that form the basis of the iris-recognition algorithm. The first step involves
a segmentation algorithm that extracts the iris, pupil, and the eye-lid regions from
the reconstructed object. The segmentation algorithm used in this work is adapted
from Ref. [53] with the addition of eye-lid boundary detection. The output of the
segmentation algorithm yields an estimate of the center and radius of the circular
pupil and iris regions and also the boundaries of the upper and lower eyelids in
the object. Fig. 3.3(a) shows an example iris image that was processed with the
segmentation algorithm. The pupil and iris regions are outlined by circular boundaries
and the upper/lower eyelid edges are represented by the elliptical boundaries. This
information is used to generate a mask M(x, y) that extracts the annular region
between iris and pupil boundaries which contains only the unobscured iris-texture
region. An example of the masked iris region is shown in Fig. 3.3(b). The extracted
iris-texture region is the input to the next processing step. Given the center and
radius of the pupil and the iris regions, the annular iris-texture region is unwrapped
into a rectangular area a(ρ, θ) using Daugman's homogeneous rubber sheet model [54].
The size of the rectangular region is specified as Lρ×Lθ with Lρ rows along the radial
direction and Lθ columns along the angular direction. Fig. 3.3(c) shows an example
of an unwrapped rectangular region with Lρ = 36 and Lθ = 224. In the next step,
a complex log-scale Gabor filter is applied to each row to extract the phase of the
underlying iris-texture pattern. The complex log-scale Gabor filter spectrum Glog(ρ)
is defined as [55]

$$G_{log}(\rho) = \exp\!\left( -\frac{\left[\log\left(\rho/\rho_o\right)\right]^2}{2\left[\log\left(\sigma_g/\rho_o\right)\right]^2} \right), \qquad (3.16)$$
where ρo is the center frequency of the filter and σg specifies its bandwidth. Note that
this filter is only applied along the angular direction which corresponds to pixels on
the circumference of a circle in the original object. The angular direction is chosen
over the radial direction because the maximum texture variation occurs along this
direction [53]. The phase of the complex output of each Gabor filter is then quantized
into four quadrants using two bits. The 4-level quantized phase is coded using a Gray code so that the difference between two adjacent quadrants is one bit. The Gray coding scheme also ensures that any misalignment between two similar iris-codes results in a minimum of errors. The quantized phase results in a binary pattern,
shown in Fig. 3.3(d), which is referred to as an “iris-code.”
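The row-wise filtering and phase quantization can be sketched as follows. The filter is applied in the frequency domain and made one-sided so that the response is complex; the center frequency, bandwidth ratio, and test row are illustrative assumptions, not the tuned parameters of this chapter:

```python
import numpy as np

def log_gabor(n, rho_o=0.1, sigma_ratio=0.5):
    """One-sided log-Gabor spectrum in the spirit of Eq. (3.16);
    sigma_ratio = sigma_g / rho_o."""
    rho = np.fft.fftfreq(n)
    G = np.zeros(n)
    pos = rho > 0                          # keep positive frequencies -> complex response
    G[pos] = np.exp(-np.log(rho[pos] / rho_o) ** 2 / (2 * np.log(sigma_ratio) ** 2))
    return G

def iris_code_row(row):
    """Filter one unwrapped-iris row along the angular direction and encode the
    phase quadrant as 2 bits; (Re>=0, Im>=0) is a Gray code over quadrants."""
    response = np.fft.ifft(np.fft.fft(row) * log_gabor(len(row)))
    return np.stack([np.real(response) >= 0, np.imag(response) >= 0], axis=1).astype(np.uint8)

row = np.sin(np.linspace(0, 8 * np.pi, 224))   # stand-in for one row of a(rho, theta)
code = iris_code_row(row)
```

Note that the (Re ≥ 0, Im ≥ 0) bit pair changes in exactly one position between adjacent phase quadrants, which is the Gray-code property described above.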
In the final step, the iris-recognition task is performed based on the iris-code
obtained from a test object. To determine whether the given iris-code denoted by
tcode, matches any iris-code in the database, a score is computed. The score denoted
by $s(t_{code})$ is defined as

$$s(t_{code}) = \min_{k,\,i}\; d_{hd}\!\left(t_{code}\, c^k_{mask},\; R_i(r^k_{code})\, c^k_{mask}\right), \qquad (3.17)$$

where $r^k_{code}$ is the kth reference iris-code in the database, $c^k_{mask}$ is a mask that represents the unobscured bits common to both the test and the reference iris-codes, $R_i$ is a shift operator which performs an i-pixel shift along the angular direction, and $d_{hd}$ is the Hamming distance operator. All shifts in the range $i \in \{-O, \dots, +O\}$ are considered, where O denotes the maximum shift. The $d_{hd}$ operator is defined as
follows
$$d_{hd}(t_{code}\, c_{mask},\; r_{code}\, c_{mask}) = \frac{\sum \left( t_{code}\, c_{mask} \oplus r_{code}\, c_{mask} \right)}{W}, \qquad (3.18)$$
where W is the weight (i.e. number of all 1s) of the mask cmask. The normalized
Hamming distance score defined in Eq. (3.18) is computed over all iris-codes in the
database. The iris-code is shifted to account for any rotation of the iris in the object.
Finally, the following decision rule is applied to the minimum iris score $s(t_{code})$:

$$s(t_{code}) \;\underset{H_1}{\overset{H_0}{\lessgtr}}\; T_{HD}, \qquad (3.19)$$

which translates to: accept the null hypothesis $H_0$ if the score is less than the threshold $T_{HD}$.
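The masked, shift-minimized score of Eqs. (3.17)-(3.18) reduces to a few lines. The code length, all-ones mask, and maximum shift O = 8 below are illustrative assumptions:

```python
import numpy as np

def hamming_score(t_code, r_code, mask, max_shift=8):
    """Eqs. (3.17)-(3.18): minimum over shifts i in {-O, ..., +O} of the
    normalized Hamming distance between the masked test and reference codes."""
    W = mask.sum()                               # weight = number of unmasked bits
    best = 1.0
    for i in range(-max_shift, max_shift + 1):
        shifted = np.roll(r_code, i, axis=0)     # R_i: shift along the angular direction
        d = np.logical_xor(t_code, shifted)[mask].sum() / W
        best = min(best, d)
    return best

rng = np.random.default_rng(3)
t = rng.integers(0, 2, size=(224, 2)).astype(bool)
mask = np.ones_like(t, dtype=bool)               # no occluded bits in this toy example
same = np.roll(t, 3, axis=0)                     # same code, rotated by 3 pixels
other = rng.integers(0, 2, size=(224, 2)).astype(bool)
```

The shift search recovers the rotated copy exactly (score 0), while an unrelated random code scores near 0.5, which is the separation the threshold $T_{HD}$ exploits.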
where X, ~Vtarget, and ~Vbg are the same as in Eq. (4.1). Clutter components ~Vtree
and ~Vshrub represent tree and shrub profiles respectively and are weighted by random
variables β1 and β2. Note that CS2 will depend on random variables β1 and β2;
therefore, CS2 is a stochastic operator. Fig. 4.4(a) and Fig. 4.4(b) show examples of
scene realizations generated by this stochastic encoding operator.
As X is the only parameter of interest for a given task, it is important to note
that the entropy of X defines the maximum task-specific information content of any
image measurement. Other blocks in the imaging chain may add entropy to the
image measurement R; however, only the entropy of the virtual source X is relevant
to the task. We may therefore define TSI as the Shannon mutual-information I(X;R)
between the virtual source X and the image measurement R as follows [67]
TSI ≡ I(X;R) = J(X) − J(X|R), (4.3)
where J(X) = −Elog(pr(X)) denotes the entropy of virtual source X, J(X|R) =
−Elog(pr(X|R) denotes the entropy of X conditioned on the measurement R, E·denotes statistical expectation, pr(·) denotes the probability density function, and
all the logarithms are taken to be base 2. Note that from this definition of TSI
we have I(X;R) ≤ J(X) indicating that an image cannot contain more TSI than
there is entropy in the variable representing the task. However, for most realistic
imaging problems computing TSI from Eq. (4.3) directly is intractable owing to the
dimensionality and non-Gaussianity of R. Numerical approaches may also prove
to be computationally prohibitive, even when using methods such as importance sampling, Markov chain Monte Carlo (MCMC), or Bahl-Cocke-Jelinek-Raviv (BCJR) [68, 69, 70, 71, 72].
Recently, Guo et al. [73] demonstrated a direct relationship between the minimum
mean square error (mmse) in estimating X from R, and the mutual-information
I(X;R) for an additive Gaussian channel. Although the relation between estimation
mmse and Fisher information has been known via the Van Trees inequality [74], Guo's
result connects estimation mmse with the Shannon information for the first time.
The result expresses mmse as a derivative of the mutual-information I(X;R) with
respect to signal to noise ratio. For a simple additive Gaussian noise channel we have
$$R = \sqrt{s}\, X + N, \qquad (4.4)$$

where N is the additive Gaussian noise with variance $\sigma^2 = 1$ and s is the signal-to-noise ratio. For this simple case we find that [73]

$$\frac{d}{ds} I(X;R) = \frac{1}{2}\,\mathrm{mmse} = \frac{1}{2}\, E\!\left[\,|X - E(X|R)|^2\,\right], \qquad (4.5)$$
where E(X|R) is the conditional mean estimator. This relation allows us to compute
mutual-information indirectly from mmse for an additive Gaussian channel without
any restrictions on the distribution of the virtual source variable X. It is interesting
to note that even though the source variable X is discrete valued, the conditional
mean estimator is a continuous variable which does not necessarily take values in the
range of the source variable X. For example, when X is a binary variable (0/1) the
conditional mean estimator will yield a real number between 0 and 1.
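Guo's relation can be checked numerically for exactly this binary example: estimate mmse(s) by Monte Carlo using the posterior-mean estimator, then integrate over SNR per Eq. (4.5) to recover I(X;R). The SNR grid and sample count are illustrative assumptions:

```python
import numpy as np

def mmse_binary(s, n_mc=200_000, seed=4):
    """Monte-Carlo mmse for R = sqrt(s) X + N, equiprobable X in {0, 1}, N ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, n_mc)
    r = np.sqrt(s) * x + rng.normal(size=n_mc)
    l1 = np.exp(-0.5 * (r - np.sqrt(s)) ** 2)    # likelihood under X = 1
    l0 = np.exp(-0.5 * r ** 2)                   # likelihood under X = 0
    x_hat = l1 / (l0 + l1)                       # E[X | R]: a real number in (0, 1)
    return np.mean((x - x_hat) ** 2)

# Eq. (4.5): I(X; R) = (1/2) * integral_0^s mmse(s') ds'  (trapezoid rule, nats -> bits)
s_grid = np.linspace(0.0, 10.0, 41)
m = np.array([mmse_binary(s) for s in s_grid])
tsi_bits = 0.5 * np.sum(0.5 * (m[:-1] + m[1:]) * np.diff(s_grid)) / np.log(2)
```

As expected, the integrated value stays below the 1-bit entropy of the binary source and approaches it as the SNR range grows.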
This result has been extended to the linear vector Gaussian channel for which
$\mathcal{H}[\vec{X}] = H\vec{X}$, where H denotes the matrix channel operator and $\vec{X}$ is the vector channel input. The output of such a channel can be written as

$$\vec{R} = \sqrt{s}\, H\vec{X} + \vec{N}, \qquad (4.6)$$

where $\vec{N}$ follows a multivariate Gaussian distribution with covariance $\Sigma_{\vec{N}}$. In this case, Guo's result becomes [75]

$$\frac{d}{ds} I(\vec{X}; \vec{R}) = \frac{1}{2}\, E\!\left[\, \big\| H\vec{X} - E[H\vec{X}\,|\,\vec{R}] \big\|^2 \,\right]. \qquad (4.7)$$
The right hand side of Eq. (4.7) is the mmse in estimating H ~X rather than ~X and
therefore, we denote it by mmseH throughout the rest of this work to avoid confusion.
For an arbitrary noise covariance $\Sigma_{\vec{N}}$, $\mathrm{mmse}_H$ can be computed as $\mathrm{Tr}(H^\dagger \Sigma_{\vec{N}}^{-1} H E)$, where $E = E[(\vec{X} - E[\vec{X}|\vec{R}])(\vec{X} - E[\vec{X}|\vec{R}])^T]$, $H^\dagger$ denotes the Hermitian conjugate of H, and $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix. Therefore, the relationship between mutual information and $\mathrm{mmse}_H$ can be written as

$$\frac{d}{ds} I(\vec{X}; \vec{R}) = \frac{1}{2}\,\mathrm{mmse}_H = \frac{1}{2}\,\mathrm{Tr}\!\left(H^\dagger \Sigma_{\vec{N}}^{-1} H E\right). \qquad (4.8)$$
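For a Gaussian source the relation in Eq. (4.8) can be verified numerically, since both $I(\vec{X};\vec{R})$ and the error covariance E then have closed forms. The channel matrix, identity covariances, and finite-difference step below are illustrative assumptions:

```python
import numpy as np

def mutual_info_gaussian(s, H, Sx, Sn):
    """I(X;R) in nats for R = sqrt(s) H X + N with Gaussian X ~ N(0, Sx), N ~ N(0, Sn)."""
    M = np.eye(H.shape[0]) + s * np.linalg.inv(Sn) @ H @ Sx @ H.T
    return 0.5 * np.log(np.linalg.det(M))

def mmse_H(s, H, Sx, Sn):
    """Tr(H^T Sn^{-1} H E) of Eq. (4.8), using the Gaussian posterior error covariance E."""
    E = np.linalg.inv(np.linalg.inv(Sx) + s * H.T @ np.linalg.inv(Sn) @ H)
    return np.trace(H.T @ np.linalg.inv(Sn) @ H @ E)

rng = np.random.default_rng(5)
H = rng.normal(size=(4, 3))
Sx, Sn = np.eye(3), np.eye(4)
s, ds = 2.0, 1e-5
lhs = (mutual_info_gaussian(s + ds, H, Sx, Sn)
       - mutual_info_gaussian(s - ds, H, Sx, Sn)) / (2 * ds)  # dI/ds, central difference
rhs = 0.5 * mmse_H(s, H, Sx, Sn)
```

The finite-difference derivative of the mutual information matches the half-trace expression, which is the identity the TSI computations in this chapter rely on.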
These results have also been extended to the case for which the channel input is a random function of $\vec{X}$, denoted by $\vec{Y} = C(\vec{X})$. The relation between $I(\vec{X};\vec{R})$ and $\mathrm{mmse}_H$ for a random function $C(\vec{X})$ is slightly different from the previous expression in Eq. (4.8). Using the stochastic encoding model we have

$$\vec{R} = \sqrt{s}\, H\, C(\vec{X}) + \vec{N}. \qquad (4.9)$$
In this case the relation between mutual information and mmse can be expressed
We now combine the encoding model for localization, defined in Eq. (4.22), with
the detection and classification models described in the previous section. For the joint
detection/localization task we are interested in detecting the presence of a target and
if present, localizing it in one of Q regions. The imaging model from Eq. (4.22)
becomes

$$\vec{R} = \sqrt{s}\, H\, T\, \Lambda(X)\,\vec{\rho}\,\alpha + \vec{N}_c, \qquad (4.23)$$

where α is a binary variable indicating the presence or absence of the target. Therefore, the virtual source in this case is a (Q+1)-ary variable defined as $X' \in \{X, 0\}$, so that when α = 0, X' = 0, and when α = 1, X' = X. Comparing Eq. (4.10) with the imaging model shown in Eq. (4.23), we note that $\vec{X}$ and $\vec{Y}$ in Eq. (4.10) are equal to the virtual source X' and the term $T\Lambda(X)\vec{\rho}\,\alpha$ respectively.
The channel operator H is replaced with H and ~N is replaced by ~Nc. Therefore, TSI
and mmseH for this task can be expressed as
TSI = I(X′; ~R) = (1/2) ∫_0^s mmseH(s′) ds′, (4.24)

where mmseH(s) = Tr(H† Σ~Nc^−1 H (E~Y − E~Y|X′)), (4.25)

X′ ∈ {X, 0}, ~Y = TΛ(X)~ρα. (4.26)
The (Q+1)-ary nature of the virtual source variable in the joint detection/localization
task increases the upper bound on TSI as compared to that for the simple detection
task. For the probabilities Pr(α = 1) = p and Pr(α = 0) = 1 − p, the TSI is upper
bounded by
J(X′) = −(1 − p) log(1 − p) − Σ_{q=1}^{Q} Pr(X = q) log Pr(X = q), (4.27)

where Σ_{q=1}^{Q} Pr(X = q) = p. For the case of p = 1/2 and Pr(X = q) = p/Q, the maximum TSI is [1 + (1/2) log Q] bits.
Finally, we consider the joint classification/localization task where the task of
interest is to identify one of the two targets from H1 or H2 and localize it in one of
Q regions. The exact position of the target within each region remains a nuisance
parameter. The imaging model for this task is given by
~R = √s H T Ω(X) ρ ~α + ~Nc. (4.28)
This model is the same as the one given in Eq. (4.23) except for minor modifications.
The total number of positions that each target can take remains unchanged. However, now T has dimensions M² × 2P and is given by T = [TH1 TH2], where THi is the target profile matrix for target i. The arrangement of the target profiles in TH1 and TH2 is similar to the arrangement described in Subsection 4.2.3. The virtual source in this case is 2Q-ary and given by ~X′ = [X, ~α], where X ∈ {1, 2, …, Q} indicates the region and ~α ∈ {[1, 0]^T, [0, 1]^T} represents one of the two targets. The localization matrix Ω(X = i) now has dimensions 2P × 2Pi for selecting the H1 and H2 profiles in the
Figure 4.10. Example scenes: (a) Tank in the middle of the scene, (b) Tank in the top of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene.
region i and is given by
Ω(X = i) = [ Λ(X = i)   0
             0   Λ(X = i) ],  (4.29)
where the matrices Λ(X = i) and 0 are of dimension P × Pi. The matrix Λ(X) is identical to the one in Eq. (4.22). Fig. 4.9 illustrates the role of TΩ(X) in choosing the H1 and H2 profiles at all positions in the region specified by X. This example uses X = 2, Q = 4, and Pi = P/4 for i = 1, 2, 3, 4. The matrix TΩ(X) in Eq. (4.28) is post-multiplied by the matrix ρ of dimension 2Pi × 2 to yield the targets H1 and H2 at one of the positions in region i. Here ρ is defined as
ρ = [ ~ρH   0
      0   ~ρH ],  (4.30)

where 0 is an all-zero Pi-dimensional column vector and ~ρH ∈ {~e1, ~e2, …, ~ePi}, where ~ek
is an indicator vector as before. Therefore, for ~ρH = ~ek, TΩ(X)ρ results in an M² × 2 matrix with its first column representing H1 at the kth position in region i and its second column representing H2 at the same position. This result is then multiplied by ~α, which selects either H1 or H2 for ~α = [1, 0]^T or ~α = [0, 1]^T respectively.
The TSI expression in Eq. (4.24) requires only minor modifications to remain valid
for the joint classification and localization problem. The upper bound for TSI in this
task is given by
J(~X′) = −Σ_{i=1}^{2} Σ_{q=1}^{Q} Pr(X = q, ~αi) log Pr(X = q, ~αi), (4.31)

where ~α1 = [0, 1]^T, ~α2 = [1, 0]^T, Σ_{q=1}^{Q} Pr(X = q, ~α1) = 1 − p, and Σ_{q=1}^{Q} Pr(X = q, ~α2) = p. For the case when p = 1/2, Pr(X = q, ~α1) = (1 − p)/Q and Pr(X = q, ~α2) = p/Q, the maximum TSI is [1 + log Q] bits.
4.3. Simple Imaging Examples
The TSI framework described in the previous section allows us to evaluate the task-
specific performance of an imaging system for a task defined by a specific encoding
operator and virtual source variable. Three encoding operators, corresponding to three different tasks, have been defined: (a) detection, (b) classification, and (c) joint detection/classification and localization. Now we apply the TSI framework to evaluate the performance of both a geometric imager and a diffraction-limited imager on these three tasks.
We begin by describing the source, object, and clutter used in the scene model.
The source variable X in the detection task represents “tank present” or “tank absent” conditions with equal probability, i.e., p = 1/2. In the classification task, the source
variable ~X represents “tank present” or “jeep present” states with equal probability.
The joint localization task adds the position parameter to both the detection and
classification tasks. From Eq. (4.16) we see that the source parameter is the input
to the encoding operator, which in turn generates a scene consisting of both object
and clutter. Here the scene ~Y is of dimension 80 × 80 pixels (M = 80). The object
in the scene can be either a tank or a jeep at one of 64 equally likely positions
(P = 64). Therefore, the matrix T has dimensions of 6400 × 64 for the detection
task and 6400 × 128 for the classification task. In our scene model, the number of
clutter components is set to K = 6. Recall that the clutter components are arranged
as column vectors in the clutter matrix Vc. Clutter is generated by combining these
components with relative weights specified by the column vector ~β. Note that each
clutter vector is non-random but the weight vector ~β follows a multivariate Gaussian
distribution. In the simulation study the mean of ~β is set to ~µ~β = [160 80 40 40 64 40]
and the covariance to Σ~β = ~µ~β^T I/5. The clutter-to-noise ratio, denoted by c, is set to 1.
The noise ~N is zero mean with identity covariance matrix Σ ~N = I.
Monte-Carlo simulations with importance sampling are used to estimate mmseH
using the conditional mean estimators for a given task. The mmseH estimates are
numerically integrated to obtain TSI over a range of s. For each value of s, we use 160,000 clutter and noise realizations in the Monte-Carlo simulations.
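The two-stage numerical procedure described above, Monte-Carlo estimation of mmseH followed by numerical integration over s, can be sketched on a toy detection problem. All sizes, target profiles, and the clutter-free, H = I channel below are illustrative assumptions, not the dissertation's 80 × 80 scene model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy detection task: a d-pixel scene, target at one of P equally likely
# positions (position is a nuisance parameter), target present with prob 1/2.
d, P = 16, 4
T = np.zeros((d, P))
for k in range(P):
    T[4 * k:4 * k + 2, k] = 1.0                  # hypothetical target profiles

hyp = np.vstack([T.T, np.zeros((1, d))])         # P position hypotheses + "absent"
prior = np.array([0.5 / P] * P + [0.5])

def mmse_difference(s, n=10_000):
    present = rng.random(n) < 0.5
    pos = rng.integers(0, P, n)
    Y = T[:, pos].T * present[:, None]            # true scenes, (n, d)
    R = np.sqrt(s) * Y + rng.standard_normal((n, d))
    ll = -0.5 * np.sum((R[:, None, :] - np.sqrt(s) * hyp[None]) ** 2, axis=2)
    w = np.exp(ll - ll.max(axis=1, keepdims=True)) * prior
    w /= w.sum(axis=1, keepdims=True)
    Y_hat = w @ hyp                               # E[Y|R]
    wp = w[:, :P] / np.clip(w[:, :P].sum(axis=1, keepdims=True), 1e-300, None)
    Y_hat_X = (wp @ T.T) * present[:, None]       # E[Y|R,X] (zero when absent)
    e_y = np.mean(np.sum((Y - Y_hat) ** 2, axis=1))
    e_yx = np.mean(np.sum((Y - Y_hat_X) ** 2, axis=1))
    return e_y - e_yx                             # mmse_H at this s

s_grid = np.linspace(0.0, 30.0, 31)
diff = np.array([mmse_difference(s) for s in s_grid])
# TSI(s_max) = 0.5 * integral of mmse_H over s (nats), converted to bits
tsi_bits = 0.5 * np.sum((diff[1:] + diff[:-1]) / 2 * np.diff(s_grid)) / np.log(2)
print(tsi_bits)   # should approach the 1-bit entropy of the binary source
```

Consistent with the trends discussed below, the integrated difference mmse saturates near the 1-bit source entropy once s is large enough for reliable detection.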
Figure 4.11. Detection task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
4.3.1. Ideal Geometric Imager
The geometric imager represents an ideal imaging system with no blur and therefore, we set H = I. Fig. 4.10 shows some example scenes resulting from object realizations
measured in the presence of noise. Note that the object in the scene is either a tank
or a jeep at one of the 64 positions.
We begin by describing the results for the detection task. Fig. 4.11(a) and
Fig. 4.11(b) show the plots of mmseH and TSI versus s respectively. Recall that
the mmseH is equal to the difference of E~Y and E~Y |X represented by the dotted and
dashed curves in Fig. 4.11(a) respectively. The term E~Y |X represents the mmse in
estimating ~Y given the knowledge of both the measurement ~R and source X. There-
fore, we expect it to always be less than E~Y , which is the mmse in estimating ~Y
given only the measurement ~R. Fig. 4.11 confirms this behavior. In the low s region,
mmseH (in solid line) is small as both E~Y and E~Y |X are nearly equal. Despite the
additional conditioning on X, E~Y |X does not significantly improve upon E~Y as the
noise remains the dominating factor. However, in the moderate s region E~Y |X im-
proves faster than E~Y and therefore the mmseH increases here. In the high s regime,
the noise has negligible effect and hence the additional knowledge of X does not sig-
nificantly improve E~Y |X . This leads to the mmseH converging towards zero as both
the mmse components become equal. The solid line in Fig. 4.11(b) shows the plot
of TSI versus s. As expected the TSI increases with s eventually saturating at 1 bit.
The saturation occurs because TSI is always upper bounded by the entropy of the
virtual source X. The TSI plot confirms our expectations regarding blur-free imaging
system performance with increasing s.
Now we consider TSI for the joint task of detecting and localizing a target. The
scene is partitioned into four regions, i.e., Q = 4. There are a total of 64 allowable
target positions, with 16 positions in each region. Fig. 4.12 shows some examples
scenes. Recall that the position of the target within each region is a nuisance parameter.

Figure 4.12. Scene partitioned into four regions: (a) Tank in the top left region of the scene, (b) Tank in the top right region of the scene, (c) Tank in the bottom left region of the scene, and (d) Tank in the bottom right region of the scene.

We assume that the probability of the target being present or absent is 1/2 and the conditional probability of the target being in any of the four regions is 1/4, given that the target is present. The entropy of the source variable therefore increases to 2 bits as
per Eq. (4.27). Fig. 4.13(a) shows a plot of mmse versus s for the joint detection and
localization task. The dotted line represents the mmse of the estimator conditioned
over the image measurement only. The dashed line corresponds to the mmse of the
estimator conditioned jointly on the virtual source variable and the image measure-
ment. As expected we see that E~Y |X ≤ E~Y . The solid line represents mmseH , the
difference between the dotted and dashed curves, and is integrated to yield TSI. The TSI of the geometric imager is plotted as a solid line versus s in Fig. 4.13(b). We note that the TSI saturates at 2 bits as expected.
The previous two examples have demonstrated how the formalism of Section 4.2
Figure 4.13. Joint detection/localization task: (a) mmse versus signal to noise ratio for an ideal geometric imager and (b) TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
can be applied to either a detection task or a joint detection/localization task. These
examples have also confirmed the two important TSI trends: (1) TSI is a monotoni-
cally increasing function of signal to noise ratio and (2) TSI saturates at the entropy
of the virtual source. Section 4.2 also described how a classification task or a joint
classification/localization task may be captured within the TSI formalism. The solid
curve in Fig. 4.14 depicts the TSI obtained from an ideal geometric imager for a
classification task in which the two classes are equally probable. Recall that for the
classification task we treat the position as the nuisance parameter, so the equiprobable assumption results in a virtual source entropy of 1 bit. As expected, the TSI
in Fig. 4.14 saturates at 1 bit. Fig. 4.15 presents the results of the TSI analysis of the
joint classification/localization task. Once again we have used two equally probable
targets and Q = 4 equally probable regions resulting in a source entropy of 3 bits. We
see that once again despite the measurement entropy that results from random clut-
ter and noise, the TSI provides an accurate estimate of the task-specific information,
saturating at 3 bits.
4.3.2. Ideal Diffraction-limited Imager
The previous subsection presented the TSI results for an ideal geometric imager.
Those results should therefore be interpreted as upper bounds on the performance of
any real-world imager. In this subsection, we examine the effect of optical blur on TSI.
We will assume aberration-free, space-invariant, diffraction-limited performance. The
discretized optical point spread function (PSF) associated with a rectangular pupil
can be expressed as [29]
hi,j = ∫_{−∆/2}^{∆/2} ∫_{−∆/2}^{∆/2} sinc²((x − i∆)/W) sinc²((y − j∆)/W) dx dy, (4.32)
where ∆ is the detector pitch and W quantifies the degree of optical blur associated
with the imager. Lexicographic ordering of this two-dimensional PSF yields one row
of H and all other rows are obtained by lexicographically ordering shifted versions of
Figure 4.14. Classification task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
this PSF. The optical blur is set to W = 2 and the detector pitch is set to ∆ = 1 so
that the optical PSF is sampled at the Nyquist rate. The clutter and noise statistics
remain unchanged.
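A minimal numerical sketch of the PSF in Eq. (4.32) and of building the blur matrix H follows. The small scene size and the truncated PSF support are illustrative assumptions (the dissertation uses M = 80 and full rows of H):

```python
import numpy as np

# Discretized diffraction-limited PSF for a rectangular pupil, Eq. (4.32),
# with detector pitch Delta = 1 and blur width W = 2 (Nyquist sampling).
W, Delta = 2.0, 1.0

def psf_entry(i, j, nq=64):
    # Midpoint quadrature of sinc^2((x - i*Delta)/W) * sinc^2((y - j*Delta)/W)
    # over one detector cell [-Delta/2, Delta/2]^2; np.sinc(u) = sin(pi u)/(pi u)
    x = (np.arange(nq) + 0.5) / nq * Delta - Delta / 2
    gx = np.sinc((x - i * Delta) / W) ** 2
    gy = np.sinc((x - j * Delta) / W) ** 2
    return np.outer(gx, gy).sum() * (Delta / nq) ** 2

# Build H for a small M x M scene: each row is a lexicographically ordered,
# shifted copy of the 2-D PSF (truncated support "half" is a sketch shortcut)
M, half = 8, 4
offsets = range(-half, half + 1)
kernel = np.array([[psf_entry(i, j) for j in offsets] for i in offsets])
H = np.zeros((M * M, M * M))
for row in range(M):
    for col in range(M):
        for di in offsets:
            for dj in offsets:
                rr, cc = row + di, col + dj
                if 0 <= rr < M and 0 <= cc < M:
                    H[row * M + col, rr * M + cc] = kernel[di + half, dj + half]
print(kernel[half, half])   # central PSF weight (dominant tap)
```

The separable kernel is symmetric, and its central tap, roughly 0.87 for W = 2 and ∆ = 1, shows how much energy the blur leaves on the on-axis detector.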
Fig. 4.16 shows examples of images that demonstrate the effects of both optical
blur and noise. The object, as before, is either a tank or a jeep at one of the 64
positions. The plots of TSI versus s are represented by dash-dot curves for the
detection and classification tasks in Fig. 4.11(b) and Fig. 4.14 respectively. The TSI
metric verifies that imager performance is degraded due to optical blur compared to
the geometric imager. For example, in the detection task, s = 34 yields TSI = 0.9 bit
for the geometric imager, whereas a higher signal to noise ratio s = 43 is required to
achieve the same TSI for the diffraction-limited imager.
The dash-dot curves in Fig. 4.13(b) and Fig. 4.15 show the TSI versus s plots
for the joint detection/localization and classification/localization tasks respectively.
Once again we see that TSI is reduced due to optical blur. In Fig. 4.13(b) TSI = 1.8 bit
Figure 4.15. Joint classification/localization task: TSI versus signal to noise ratio for geometric and diffraction-limited imagers.
is achieved at s = 35 for the diffraction-limited imager as opposed to s = 28 in the case of the geometric imager for the detection/localization task. Similarly, for the
classification/localization task the signal to noise ratio required to achieve TSI =
2.7 bit increases by 10 due to the optical blur associated with the diffraction-limited
imager.
In this section, we have presented several numerical examples that demonstrate
how the TSI analysis can be applied to various tasks and/or imaging systems. The
results obtained herein are consistent with our expectations that (1) TSI increases
with increasing signal to noise ratio, (2) TSI is upper bounded by J(X), and (3)
blur degrades TSI. Although these general trends were known in advance of our
analysis, we are encouraged by our ability to quantify these trends using a formal
approach. In the next section we will use a TSI analysis to evaluate the target-
detection performance of two candidate compressive imagers.
Figure 4.16. Example scenes with optical blur: (a) Tank in the top of the scene, (b) Tank in the middle of the scene, (c) Jeep at the bottom of the scene, and (d) Jeep in the middle of the scene.
Figure 4.17. Block diagram of a compressive imager: the virtual source X is mapped to the scene Y by the encoding C[·], imaged through the channel H[·], projected by P[·], and corrupted by the noise N[·] to yield the measurement R.
4.4. Compressive imager
For task-specific applications (e.g. detection) an isomorphic measurement (i.e. a
pretty picture) may not represent an optimal approach for extracting TSI in the
presence of detector noise and a fixed photon budget. The dimensionality of the
measurement vector has a direct effect on the measurement signal to noise ratio [6].
Therefore, we strive to design an imager that directly measures the scene information
most relevant to the task while minimizing the number of detector measurements and
thereby increasing the measurement signal to noise ratio. One approach towards this
goal is to measure linear projections of the scene, yielding as many detector measure-
ments as there are projections. We refer to such an imager as a compressive imager; it is sometimes also called a projective or feature-specific imager. Fig. 4.17 shows
the imaging chain block diagram modified to include a projective transformation P.
For the compressive imager the measurement can be written as
R = N(P(H(C(X)))). (4.33)
We only consider discrete linear projections here, therefore the P operator is
represented by the matrix P. If we consider the detection task from Subsection 4.2.2
then the measurement model for the compressive imager can be written as

~R = √s P H T ~ρX + ~N′c, (4.34)

where ~N′c = √c P H Vc ~β + ~N.
The TSI and the mmseH expressions for the compressive imager are found by substituting PH for H in Eqs. (4.18)-(4.25), yielding

TSI ≡ I(X; ~R) = (1/2) ∫_0^s mmseH(s′) ds′, (4.35)

where mmseH(s) = Tr(H† P† Σ~N′c^−1 P H (E~Y − E~Y|X)), (4.36)

with ~Y = T~ρX, where E~Y and E~Y|X are given earlier in Eq. (4.10).
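The compressive measurement model of Eq. (4.34) can be sketched numerically. The dimensions, the random stand-in projection matrix, and the geometric (H = I) channel below are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy instance of Eq. (4.34): d = M^2 pixels, P_pos target positions,
# K linear projections, L clutter components, H = I for brevity.
d, P_pos, K, L, s, c = 64, 8, 4, 6, 10.0, 1.0
T = rng.random((d, P_pos))                     # stand-in target-profile matrix
Vc = rng.standard_normal((d, L))
Vc /= np.linalg.norm(Vc, axis=0)               # unit-norm clutter components
Pmat = rng.standard_normal((K, d))             # K rows = K linear projections

def measure(present):
    rho = np.zeros(P_pos)
    if present:
        rho[rng.integers(P_pos)] = 1.0         # indicator of the target position
    beta = rng.standard_normal(L)              # random clutter weights
    colored = np.sqrt(c) * Pmat @ Vc @ beta    # projected clutter, part of N'_c
    return np.sqrt(s) * Pmat @ T @ rho + colored + rng.standard_normal(K)

r = measure(True)
print(r.shape)   # K measurements instead of d pixels
```

The point of the model is visible in the shapes: the detector records only K inner products of the scene rather than all d pixel values.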
Similarly, for the joint detection/localization task from Subsection 4.2.4 the modified expressions for the imaging model and TSI are given by

~R = √s P H T Λ(X) ~ρα + ~Nc, (4.37)

TSI ≡ I(X′; ~R) = (1/2) ∫_0^s mmseH(s′) ds′, (4.38)

where mmseH(s) = Tr(H† P† Σ~Nc^−1 P H (E~Y − E~Y|X′)), (4.39)

with X′ ∈ {X, 0} and ~Y = TΛ(X)~ρα.
We consider compressive imagers based on two classes of projection: a) princi-
pal component projections and b) matched filter projections. Their performance is
compared with that of the conventional diffraction-limited imager.
4.4.1. Principal component projection
Principal component (PC) projections are determined by the statistics of the object
ensemble. For a set of objects O, the PC projections are defined as the eigenvectors
of the object auto-correlation matrix ROO given by
ROO = E(o o^T), (4.40)
where o ∈ O is a column vector formed by lexicographically arranging the elements
of a two-dimensional object. Note that the expectation is over all objects in the set
O. These PC projection vectors are used as rows of the projection matrix P∗. In
our numerical study, example objects in the set O are obtained by generating sample
The terms E(~Y |~R, s) and E(~Y |~R,X, s) represent the conditional estimators for ~Y, with the former conditioned on ~R and the latter conditioned on both ~R and X. Explicit expressions for these estimators, required for evaluating the expectations
in Eq. (5.8), can be found in Appendix A. Note that the relation between TSI and mmse specified in Eq. (5.7) suggests that TSI increases with increasing mmse, which is counterintuitive. However, note that the actual mmse expression in Eq. (5.7) is
composed of two individual mmse terms E~Y and E~Y |X . The first mmse term E~Y is
the expected error in estimating ~Y given the measurement ~R, while the second mmse
term E~Y |X denotes the expected error given the joint knowledge of both ~R and X.
Fig. 5.4 shows a plot of these two mmse terms along with the difference mmse as a
function of SNR for a conventional imager. Note that in the low SNR region the two
mmse terms have similar values as the additional knowledge of X does not improve
the error significantly because the noise dominates in this region. In the mid SNR
region, the effect of noise is reduced and therefore the second mmse error is lower, leading to an increase in the difference mmse. The two mmse terms converge in the high SNR region as the noise becomes negligible with increasing SNR, thereby making
the difference mmse smaller. Given that it is the difference between these mmse terms
whose integral is equal to TSI, it is expected that increasing the difference mmse leads
to a higher TSI. Note that TSI cannot be increased arbitrarily by simply increasing the difference mmse because the integral of the difference mmse is upper-bounded by the entropy of the source variable X.
Figure 5.4. Difference mmse and mmse components versus SNR for a conventional imager.
5.2.2. Simulation details
The source variable X in our detection task represents “tank present” or “tank absent” conditions with equal probability, i.e., p = 1/2. Here we consider a scene of dimension 80 × 80 pixels (i.e. M = 80). The object in the scene is a “tank” at one of P = 64 equally likely positions and therefore, the matrix T is of dimensions 6400 × 64. In our scene model, the number of clutter components is set to L = 6 with the L2 norm of each column vector in Vc set to unity. The mean of the mixing vector ~β is set to ~µ~β = [160 80 40 40 64 40] and the covariance to Σ~β = ~µ~β^T I/5. The CNR is set to c = 1 and the detector noise ~N is AWGN with zero mean and covariance Σ~N = I.
We assume that the imaging optics in Fig. 5.1 (b) exhibits aberration-free, space-
invariant, and diffraction-limited performance for each lenslet. The discretized optical
point spread function (PSF) associated with a rectangular pupil therefore assumes
Figure 5.5. Example scenes with optical blur and noise: (a) Tank in the top of the scene, (b) Tank in the middle of the scene.
the following form [29]
h(i, j) = ∫_{−∆/2}^{∆/2} ∫_{−∆/2}^{∆/2} sinc²((x − i∆)/W) sinc²((y − j∆)/W) dx dy, (5.9)
where W quantifies the degree of optical blur associated with the imaging optics and
∆ is the pixel pitch of the mask in Fig. 5.1(b). The optical blur is set to W = 2 and
the pixel pitch is set to ∆ = 1 so that the optical PSF is sampled at the Nyquist rate.
Note that a lexicographic ordering of the two-dimensional PSF yields one row of H
and all other rows are obtained by lexicographically ordering the appropriately shifted
version of this PSF. Fig. 5.5 shows example images that demonstrate the effects of
both optical blur and noise.
To ensure a fair comparison between CI and conventional imaging, we introduce
a system constraint based on the total photon-count. This system constraint has two
physical implications: 1) the total number of photons incident on the detector array
is always less than or equal to the total number of photons entering the entrance pupil
(i.e., the CI system is passive) and 2) the total number of photons available at the
entrance pupil is fixed (i.e., the CI system uses the same pupil and observation time
as the conventional imager). Mathematically, this total photon-count constraint can
be expressed as
P = (1/ω) P∗, (5.10)

where ω = max_j Σ_{i=1}^{K} |P∗ij| denotes the maximum absolute column sum of the
matrix operator P∗. Here P∗ represents the original unnormalized projection matrix
and P refers to the normalized photon-count-constrained matrix that is implemented
optically. Note that a conventional imager does not employ an SLM and instead uses a sensor array for image measurement; in this case both P and P∗ are equal to an identity matrix of dimension M² × M².
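The normalization in Eq. (5.10) is a one-line operation. The small matrix below is a made-up example used only to show the maximum-absolute-column-sum scaling:

```python
import numpy as np

# Photon-count constraint of Eq. (5.10): divide the projection matrix by its
# maximum absolute column sum so the optical implementation remains passive.
def normalize_photon_count(P_star):
    omega = np.abs(P_star).sum(axis=0).max()   # omega = max_j sum_i |P*_ij|
    return P_star / omega

P_star = np.array([[0.5, -1.0, 0.25],
                   [1.0,  0.5, -0.75]])        # toy unnormalized projections
P = normalize_photon_count(P_star)
print(np.abs(P).sum(axis=0).max())             # constraint now met with equality
```

After scaling, no pixel of the SLM mask routes more light than arrives at it, while the relative weighting of the projections is preserved.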
Monte-Carlo simulations with importance sampling [68] are used to estimate the
mmse via the conditional mean estimators defined in Eq. (5.8). For each value of
s, 8000 clutter and noise realizations are used to estimate the mmse. These mmse
estimates are then numerically integrated with respect to SNR over the interval [0, s],
via the adaptive Lobatto quadrature method [96], to yield the TSI at s.
5.3. Optimization framework
We now describe the optimization framework for designing a CI system to maximize
task-specific performance. The degrees of freedom available in a CI system include
all the elements comprising the projection matrix P (i.e. all elements of P are valid
design variables). The constrained optimization problem can therefore be expressed
as
max_P [TSI], such that max_j Σ_{i=1}^{K} |Pij| = 1. (5.11)
However, we note that the computational complexity resulting from the use of TSI
as a design metric increases exponentially with the number of design variables (which
is equal to K ×M2). As a result, this optimization approach becomes computation-
ally intractable for realistic scene dimensionality. Therefore, we pursue an alternate
approach that attempts to find the optimal photon-allocation per feature for a given
projection basis. This approach reduces the number of design parameters from K × M² to K, and therefore lowers the computational burden to a manageable level. We expect a TSI improvement from the non-uniform photon-allocation scheme because the photon budget can now be distributed among the basis vectors according to their task-relevance. Note that in this approach the projection basis is pre-determined and not optimized.
Within this optimization framework, the fraction of photons associated with the ith basis vector ~P∗i (i.e. the ith row of P∗) is denoted by the design variable πi. Therefore, for a given projection basis P∗ there is an associated photon-allocation vector ~π defined as ~π = [π1, π2, · · · , πK]. Note that the non-uniform photon-allocation vector ~π can be implemented via the use of non-uniform lenslet diameters in the parallel CI architecture. Designing a CI system within the proposed optimization framework involves three steps: 1) construct the unnormalized projection matrix P∗ = [~P∗1, ~P∗2, · · · , ~P∗K]^T by choosing K projection vectors from the pre-defined basis, 2) construct the normalized projection matrix P = diag(~π)P∗ by choosing a ~π that satisfies the photon-count constraint, where diag(·) denotes a diagonal matrix whose diagonal is equal to its vector argument, and 3) optimize the associated photon-allocation vector ~π in the presence of the total photon-count constraint to maximize the TSI for a given value of SNR. Mathematically, this constrained optimization
problem can be expressed as
max_~π [TSI], such that max_j Σ_{i=1}^{K} |[diag(~π)P∗]ij| = 1. (5.12)
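A toy stand-in for the constrained search of Eq. (5.12) is sketched below. The real design maximizes the Monte-Carlo TSI with simulated tunneling; here a cheap quadratic surrogate objective and plain hill climbing (both assumptions, not the dissertation's method) illustrate only the structure of the constrained optimization over ~π:

```python
import numpy as np

rng = np.random.default_rng(3)

K, d = 4, 16
P_star = rng.standard_normal((K, d))           # fixed, pre-determined basis
weights = np.array([4.0, 2.0, 1.0, 0.5])       # hypothetical per-feature relevance

def feasible(pi):
    """Scale diag(pi) P* so its maximum absolute column sum equals 1."""
    P = np.diag(pi) @ P_star
    return P / np.abs(P).sum(axis=0).max()

def surrogate_tsi(pi):
    P = feasible(pi)
    return float(weights @ np.sum(P ** 2, axis=1))   # placeholder, NOT real TSI

pi = np.full(K, 1.0 / K)                       # start from uniform allocation
best_val = surrogate_tsi(pi)
for _ in range(2000):
    cand = np.clip(pi + 0.05 * rng.standard_normal(K), 1e-6, None)
    val = surrogate_tsi(cand)
    if val > best_val:                         # greedy accept (tunneling would
        pi, best_val = cand, val               # also allow escapes from optima)
print(best_val >= surrogate_tsi(np.full(K, 1.0 / K)))
```

Because only improving moves are accepted and the search starts from the uniform allocation, the final objective value is never worse than the uniform-allocation baseline, mirroring the TSI gain expected from non-uniform photon allocation.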
We use an optimization algorithm based on simulated tunneling [56] to maximize
the TSI for a given value of s. The simulated tunneling approach guarantees conver-
gence to the global maximum/minimum of an optimization problem as the number
of iterations tends to infinity. We observe convergence to a common solution after
5000 iterations from multiple different initial conditions, giving confidence that our
TSI optimization framework results in a global optimum. Note that the computational
complexity of each iteration step is a function of the number of target positions P , the
number of projection vectors K, the SNR parameter s and the number of clutter/noise
realizations NCN used in the Monte-Carlo simulation. The number of floating-point operations (Flops) involved in each evaluation of the objective function can be expressed as ⌊√(10s)⌋ NCN (2P⁴ + 2P³ + 3P² + PK). For example, at s = 5, K = 1, NCN = 8000 and P = 64, 1778 GFlops were required to compute the TSI. Therefore,
as the number of target positions P is increased the computational cost grows quartically, O(P⁴). Some practical tricks could be employed for large values of P, such as 1) Monte-Carlo simulations over the diverse perspectives only, and 2) parametrization of the target library with far fewer than P parameters. However, it is important to realize that the actual target-detection problem does not become more complex as P increases (for the same number of measured features) [84]. As mentioned earlier, several differ-
ent projection bases are considered for use in the CI system design. Now we describe
each of these projection bases in the context of our target-detection task.
5.3.1. Principal component projections
Principal component (PC) projections are derived from principal component analysis,
and are frequently employed for data dimensionality reduction in pattern recognition
problems [81]. The salient aspect of this basis is its strong energy compaction property
that leads to dimensionality reduction with the smallest reconstruction RMSE for certain types of signals [97, 98]; normally distributed signals fall into this category.
In practice, PC projections are computed using second-order statistics of a training
set chosen to represent an object ensemble. Specifically, for a training set O, the PC
projections are defined as the eigenvectors of the object auto-correlation matrix ROO
defined as
ROO = E{~o ~o^T}, (5.13)
Figure 5.6. Example projection vectors in the PC projection basis; clockwise from upper left: #2, #6, #16, #31.
where ~o ∈ O is a column vector formed by lexicographically arranging the elements of
a two-dimensional image in O. Note that the expectation, denoted by the operator E{·}, is over the complete training set O. In this work the object samples in the training
set O were obtained by generating sample realizations of scenes with varying clutter
levels, target strength, and target position using the stochastic encoder C defined in
Eq. (5.2). The K dominant eigenvectors of ROO are used to create the projection
matrix P∗PC. Fig. 5.6 shows some example projection vectors from this PC projection
basis.
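The construction of the PC projection matrix from Eq. (5.13) can be sketched with toy sizes and synthetic correlated training data (both assumptions; the dissertation uses 6400-pixel scene realizations from the stochastic encoder):

```python
import numpy as np

rng = np.random.default_rng(4)

# PC projections: dominant eigenvectors of the sample auto-correlation matrix
# of lexicographically ordered training images, Eq. (5.13).
n, d, K = 500, 64, 8
mixing = rng.standard_normal((d, d)) * 0.1
O = rng.standard_normal((n, d)) @ mixing       # correlated training samples
R_oo = O.T @ O / n                             # sample estimate of E{o o^T}
evals, evecs = np.linalg.eigh(R_oo)            # eigenvalues in ascending order
order = np.argsort(evals)[::-1]
P_pc = evecs[:, order[:K]].T                   # K dominant eigenvectors as rows
energy = evals[order[:K]].sum() / evals.sum()  # fraction of eigenvalue sum kept
print(P_pc.shape, round(float(energy), 3))
```

The rows of the resulting matrix are orthonormal, and the retained eigenvalue fraction plays the same role as the 99.99% criterion used below to choose K.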
In Chapter 4, it was demonstrated that the PC compressive imager, with a uniform photon allocation, achieves a higher TSI than that of the conventional imager. This is the result of a higher measurement fidelity in a PC compressive imager due to
its strong image-energy compaction property. Fig. 5.7 shows the plot of TSI versus s
for the PC compressive imager for various choices of K. Observe that TSI increases
Figure 5.7. TSI versus SNR for PC compressive imager.
monotonically with s, eventually saturating at 1.0 bit. Also note that for a particular
SNR, TSI increases with K up to a certain value and then starts decreasing. We refer
to this behavior as the “rollover effect,” which is the result of a trade-off between
two competing processes: 1) as K increases, the projective measurements provide
more target-detection information, leading to an increase in TSI and 2) with a fixed
photon-budget, the measurement fidelity per feature decreases with increasing K,
resulting in a decrease in TSI. The tradeoff between these two processes results in
an optimal value of K that maximizes the TSI for a given value of SNR. For the
example in Fig. 5.7, the optimal value of K is 24 for s = 20. Note that the optimal
K is a function of SNR. From here onwards, we will refer to this effect of decreasing
measurement fidelity with increasing K as the “noise cost.”
In the next section we will further improve the PC compressive imager performance
by using the optimal photon-allocation. A PC projection matrix with K = 32 is
chosen as it accounts for more than 99.99% of the total eigenvalue sum. Here, the
total eigenvalue sum is defined as the sum of all eigenvalues of the auto-correlation matrix ROO. It is important to remember that the PC projection basis itself is not an optimal
choice for the target-detection task.
5.3.2. Generalized matched-filter projections
The generalized matched-filter (GMF) is commonly used for the purpose of target-
detection in radar applications. For a target-detection problem, in which the target
and the background are known exactly, the GMF provides optimal performance in
terms of maximizing the probability of detection for a fixed false alarm rate [47]. Re-
call that in our target-detection problem, the target position is a nuisance parameter
that must be estimated implicitly. In such a case, instead of a matched-filter (e.g.
correlator), we consider a set of matched projections, as described in Ref. [85]. Each
matched projection corresponds to the target at a given position. Therefore, the re-
sulting compressive imager yields an inner-product between the scene and the target
at a particular position as specified by each projection vector. The GMF projection
matrix P∗GMF is defined as
P∗GMF = T Σ~Nc^−1, (5.14)
where T is the modified target profile matrix, each row of which corresponds to a
target profile at a particular position. The number of positions chosen is K and
therefore, the dimension of the matrix T is K ×M2. The whitening transformation
Σ~Nc^−1 accounts for the joint effect of clutter and detector noise and is pre-multiplied by T, resulting in the final projection matrix P∗GMF [47]. We choose K = 64 to construct
the GMF projection basis matrix, thus accounting for all allowed target positions in
our scene model. Fig. 5.8 shows some examples of projection vectors from the GMF
projection matrix.
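The construction in Eq. (5.14) can be sketched as follows. The target profile, scene size, and the clutter-plus-detector-noise covariance below are random placeholders chosen only to make the example runnable, not values from this chapter:

```python
import numpy as np

rng = np.random.default_rng(1)
M2 = 256        # number of scene pixels (M x M, flattened) - illustrative
K = 64          # number of allowed target positions
target = rng.random(16)   # hypothetical 1-D target profile

# Build T: each row is the target profile shifted to one of the
# K allowed positions (zero elsewhere).
T = np.zeros((K, M2))
for k in range(K):
    pos = k * (M2 // K)
    T[k, pos:pos + target.size] = target[: M2 - pos]

# Combined clutter + detector-noise covariance; here a random
# symmetric positive-definite placeholder.
A = rng.standard_normal((M2, M2))
Sigma = A @ A.T + M2 * np.eye(M2)

# Eq. (5.14): P*_GMF = T Sigma^{-1}
P_gmf = T @ np.linalg.inv(Sigma)
```

Each row of `P_gmf` is a whitened matched projection; measuring the scene with these K rows yields the inner products the detector needs for every allowed target position.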
Figure 5.8. Example projection vectors in the GMF projection basis; clockwise from upper left: #1, #16, #32, #64.
For the detection task in Subsection 4.2.2, the virtual source variable X is binary;
therefore, substituting (A15) and (A3) into (A13) and simplifying we obtain the
following expressions for the estimator
E(~Y | ~R, X = 1) = [ Σ_{l=1}^{P} ~Y_l · exp(−(1/2)(Θ_{2l} + Θ_{3l})) ] / [ Σ_{m=1}^{P} exp(−(1/2)(Θ_{2m} + Θ_{3m})) ],   (A16)
E(~Y | ~R, X = 0) = 0,
where ~Y_l in (A16) is the target profile at the lth position. Θ_{2l} and Θ_{3l} in (A16)
are evaluated using (A6) and (A7), respectively. Similarly, for the classification task
defined in Subsection 4.2.3, the estimator in (A13) can be written as
E(~Y | ~R, ~X) = [ Σ_{l=1}^{P} ~Y_l · exp(−(1/2)(Θ_{2l} + Θ_{3l})) ] / [ Σ_{m=1}^{P} exp(−(1/2)(Θ_{2m} + Θ_{3m})) ],   (A17)
where ~Y_l in this case is the target profile specified by ~X at the lth position.
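The estimator in (A16)/(A17) is a convex combination of the candidate target profiles, with weights proportional to exp(−(Θ_{2l} + Θ_{3l})/2). A minimal numerical sketch follows; the profiles and the Θ terms are random stand-ins, whereas in the text they are evaluated from (A6) and (A7):

```python
import numpy as np

rng = np.random.default_rng(2)
P = 8                        # number of candidate target positions
Y = rng.random((P, 32))      # Y[l]: target profile at position l (placeholder)
theta = rng.random(P)        # stands in for Theta_2l + Theta_3l

# Normalize the weights exp(-theta/2) in the log domain for
# numerical stability (a standard log-sum-exp trick).
log_w = -0.5 * theta
log_w = log_w - log_w.max()
w = np.exp(log_w)
w = w / w.sum()

# Conditional-mean estimate: a convex combination of the profiles.
Y_hat = w @ Y
```

Because the weights are nonnegative and sum to one, each element of the estimate lies between the smallest and largest corresponding profile values.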
Recall from Subsection 4.2.4 that for the joint detection and localization task, the
virtual source variable X′ is (Q + 1)-ary. Note that X′ = X, where X denotes the
region in which the target is present when α = 1, and X′ = 0 when α = 0. The estimator
in (A13) for this case is given by
E(~Y | ~R, X = i, α = 1) = [ Σ_{l=1}^{P_i} ~Y_{i,l} · exp(−(1/2)(Θ_{i,2l} + Θ_{i,3l})) ] / [ Σ_{m=1}^{P_i} exp(−(1/2)(Θ_{i,2m} + Θ_{i,3m})) ],   (A18)
E(~Y | ~R, α = 0) = 0,
where X = i implies that the target is present in region i, and ~Y_{i,l} is the target
profile at the lth position of region i. Once again, Θ_{i,2l} and Θ_{i,3l} are evaluated
using (A6) and (A7), respectively, by substituting ~Y_l with the appropriate ~Y_{i,l}. In a similar manner, the
estimator E(~Y |~R, ~X) for the joint classification and localization task can be expressed
as
E(~Y | ~R, X = i, ~α) = [ Σ_{l=1}^{P_i} ~Y_{i,l,~α} · exp(−(1/2)(Θ_{i,2l} + Θ_{i,3l})) ] / [ Σ_{m=1}^{P_i} exp(−(1/2)(Θ_{i,2m} + Θ_{i,3m})) ],   (A19)
where ~Y_{i,l,~α}, Θ_{i,2l} and Θ_{i,3l} have the same meaning as in (A12).
References
[1] N. J. Wade and S. Finger, “The eye as an optical instrument: from camera obscura to Helmholtz’s perspective,” Perception 30(10), 1157-1177 (2001).
[2] W. Boyle and G. Smith, “Charge Coupled Semiconductor Devices,” Bell System Technical Journal 49, 587 (1970).
[3] G. E. Moore, “Cramming more components onto integrated circuits,” Electronics Magazine 38(8), (1965).
[4] E. R. Dowski and W. T. Cathey, “Extended Depth of Field Through Wavefront Coding,” Applied Optics 34(11), 1859-1866 (1995).
[5] P. Potuluri, U. Gopinathan, J. R. Adleman, and D. J. Brady, “Lensless sensor system using a reference structure,” Optics Express 11, 965-974 (2003).
[6] M. A. Neifeld and P. Shankar, “Feature-Specific Imaging,” Applied Optics 42, 3379-3389 (2003).
[7] http://www.cdm-optics.com
[8] H. H. Barrett, “Objective assessment of image quality: effects of quantum noise and object variability,” J. Opt. Soc. Am. A 7, 1266-1278 (1990).
[9] H. H. Barrett, J. L. Denny, R. F. Wagner, and K. J. Myers, “Objective assessment of image quality. II. Fisher information, Fourier crosstalk, and figures of merit for task performance,” J. Opt. Soc. Am. A 12, 834-852 (1995).
[10] H. H. Barrett, C. K. Abbey, and E. Clarkson, “Objective assessment of image quality. III. ROC metrics, ideal observers, and likelihood-generating functions,” J. Opt. Soc. Am. A 15, 1520-1535 (1998).
[11] L. Poletto and P. Nicolosi, “Enhancing the spatial resolution of a two-dimensional discrete array detector,” Optical Eng. 38, 1748-1757 (1999).
[13] S. Borman, “Topics in Multiframe Superresolution Restoration,” Ph.D. dissertation (University of Notre Dame, Notre Dame, 2004).
[14] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, “Fast and robust multi-frame super-resolution,” IEEE Trans. Image Process. 13, 1327-1344 (2004).
[15] N. Galatsanos and R. Chin, “Digital restoration of multichannel images,” IEEE Trans. on Acoustics, Speech, and Signal Process. 37, 415-421 (1989).
[16] S. P. Kim, N. K. Bose, and H. M. Valenzuela, “Recursive reconstruction of high resolution image from noisy undersampled multiframes,” IEEE Trans. on Acoustics, Speech, and Signal Process. 38, 1013-1027 (1990).
[17] H. Ur and D. Gross, “Improved resolution from subpixel shifted pictures,” Computer Vision Graphics Image Processing: Graph. Models Image Process. 54, 181-186 (1992).
[18] M. Elad and A. Feuer, “Restoration of a single superresolution image from several blurred, noisy and undersampled images,” IEEE Trans. on Image Process. 6, 1646-1658 (1997).
[19] J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, K. Ishida, T. Morimoto, N. Kondou, D. Miyazaki, and Y. Ichioka, “Thin Observation Module by Bound Optics (TOMBO): Concept and Experimental Verification,” Applied Optics 40, 1806-1813 (2001).
[20] Y. Kitamura, R. Shogenji, K. Yamada, S. Miyatake, M. Miyamoto, T. Morimoto, Y. Masaki, N. Kondou, D. Miyazaki, J. Tanida, and Y. Ichioka, “Reconstruction of a High-Resolution Image on a Compound-Eye Image-Capturing System,” Applied Optics 43, 1719-1727 (2004).
[21] P. M. Shankar, W. C. Hasenplaugh, R. L. Morrison, R. A. Stack, and M. A. Neifeld, “Multiaperture imaging,” Applied Optics 45, 2871-2883 (2006).
[22] M. A. Neifeld and A. Ashok, “Imaging using alternate point spread functions: Lenslets with pseudo-random phase diversity,” in Proceedings of OSA Topical Meeting: Computational Optical Sensing and Imaging (COSI), Charlotte, NC, June 6-8, paper CMB1 (2005).
[23] A. Ashok and M. A. Neifeld, “Engineering the point spread function for super-resolution from multiple low-resolution sub-pixel shifted frames,” in Proceedings of OSA Annual Meeting, Tucson, AZ, Oct 16-20 (2005).
[24] Q. Tian and M. N. Huhns, “Algorithms for subpixel registration,” Computer Vision Graphics Image Processing 35, 220-233 (1986).
[25] S. Verdu, Multiuser Detection, (Cambridge University Press, 1998), Chap. 2.
[26] J. Solomon, Z. Zalevsky, and D. Mendlovic, “Geometric Superresolution by Code Division Multiplexing,” Applied Optics 44, 32-40 (2005).
[27] A. Ashok and M. A. Neifeld, “Information-based analysis of simple incoherent imaging systems,” Optics Express 11, 2153-2162 (2003).
[28] H. H. Barrett and K. J. Myers, Foundations of Image Science, (Wiley-Interscience, 2004).
[29] J. W. Goodman, Introduction to Fourier Optics, (McGraw-Hill, 1996), Chap. 7.
[30] E. Y. Lam, “Noise in superresolution reconstruction,” Optics Letters 28, 2234-2236 (2003).
[31] H. C. Andrews and B. R. Hunt, Digital Image Restoration, (Prentice-Hall, Englewood Cliffs, N.J., 1977).
[32] D. J. Tolhurst, Y. Tadmor, and T. Chao, “Amplitude spectra of natural images,” Ophthalm. Physiol. Opt. 12, 229-232 (1992).
[33] D. L. Ruderman, “Origins of scaling in natural images,” Vision Res. 37, 3385-3398 (1997).
[34] D. J. Field and N. Brady, “Visual sensitivity, blur and the sources of variability in the amplitude spectra of natural scenes,” Vision Res. 37, 3367-3383 (1997).
[36] M. Irani and S. Peleg, “Improving resolution by image registration,” CVGIP: Graph. Models Image Process. 53, 231-239 (1991).
[37] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Roy. Stat. Soc. Ser. B 39, 1-38 (1977).
[38] L. B. Lucy, “An iterative technique for the rectification of observed distribution,” Astron. J. 79, 745-754 (1974).
[39] W. H. Richardson, “Bayesian-based iterative method of image restoration,” J. Opt. Soc. Am. A 56, 1141-1142 (1972).
[40] A. Ashok and M. A. Neifeld, “Recent progress on multidomain optimization for ultrathin cameras,” Proc. SPIE 6232, 62320N (2006).
[41] J. G. Daugman, “High confidence visual recognition of persons by a test of statistical independence,” IEEE Trans. PAMI 15, 1148-1161 (1993).
[42] J. G. Daugman, “The importance of being random: statistical principles of iris recognition,” Pattern Recognition 36, 279-291 (2003).
[43] J. G. Daugman, “How iris recognition works,” IEEE Trans. Circuits and Systems for Video Tech. 14(1), 21-30 (2004).
[44] R. Barnard, V. P. Pauca, T. C. Torgersen, R. J. Plemmons, S. Prasad, J. van der Gracht, J. Nagy, J. Chung, G. Behrmann, S. Mathews, and M. Mirotznik, “High-Resolution Iris Image Reconstruction from Low-Resolution Imagery,” Proc. SPIE 6313, 1-13 (2006).
[45] R. Narayanswamy, P. Silveira, H. Setty, V. Pauca, and J. van der Gracht, “Extended depth-of-field iris recognition system for a workstation environment,” Proc. SPIE 5779, 41-50 (2005).
[46] R. Narayanswamy, G. E. Johnson, P. E. X. Silveira, and H. B. Wach, “Extending the imaging volume for biometric iris recognition,” Appl. Opt. 44, 701-712 (2005).
[47] S. Kay, Fundamentals of Statistical Signal Processing: Detection Theory, (Prentice Hall, 1993).
[49] M. Born and E. Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference, and Diffraction of Light, (Pergamon Press, 1989), Chap. 9.
[50] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes, (McGraw-Hill, 2001).
[51] C. L. Fales, F. O. Huck, and R. W. Samms, “Imaging system design for improved information capacity,” Applied Optics 23, 873-888 (1984).
[52] N. X. Nguyen, “Numerical Algorithms for Image Superresolution,” Ph.D. dissertation, Stanford University (2000).
[53] L. Masek, “Recognition of Human Iris Patterns for Biometric Identification,” Technical report, University of Western Australia (2003).
[54] S. Sanderson and J. Erbetta, “Authentication for secure environments based on iris scanning technology,” IEE Colloquium on Visual Biometrics (2000).
[55] D. J. Field, “Relations between the statistics of natural images and the response properties of cortical cells,” Journal of the Optical Society of America A 4, 2379-2394 (1987).
[56] W. Wenzel and K. Hamacher, “A stochastic tunneling approach for global minimization,” Phys. Rev. Lett. 82(15), 3003-3007 (1999).
[58] J. A. O’Sullivan, R. E. Blahut, and D. L. Snyder, “Information-theoretic image formation,” IEEE Trans. on Image Processing 44, 2094-2123 (1998).
[59] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” IEEE Signal Processing Magazine 15, 23-50 (1998).
[60] F. O. Huck, C. L. Fales, and Z. Rahman, “An Information Theory of Visual Communication,” Phil. Trans. R. Soc. A: Phys. Sci. and Engr. 354, 2193-2248 (1996).
[61] F. O. Huck and C. L. Fales, “Information-theoretic assessment of sampled imaging systems,” Optical Engineering 38, 742-762 (1999).
[62] J. Ahlberg and I. Renhorn, “An information-theoretic approach to band selection,” Proc. SPIE 5811, 15-23 (2005).
[63] S. P. Awate, T. Tasdizen, N. Foster, and R. T. Whitaker, “Adaptive, Nonparametric Markov Modeling for Unsupervised, MRI Brain-Tissue Classification,” Med. Image Anal. (to be published).
[64] J. Liu and P. Moulin, “Information-Theoretic Analysis of Interscale and Intrascale Dependencies Between Image Wavelet Coefficients,” IEEE Trans. on Image Processing 10, 1647-1658 (2001).
[65] Z. Liu and L. J. Karam, “Mutual information-based analysis of JPEG2000 contexts,” IEEE Trans. on Image Processing 14, 411-422 (2005).
[66] D. S. Taubman and M. W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, (Springer Publishing, 2002).
[67] T. Cover and J. Thomas, Elements of Information Theory, (John Wiley and Sons, New York, 1991).
[68] M. Tanner, Tools for Statistical Inference, (Springer, 2nd edition, 1993).
[69] J. S. Liu, Monte Carlo Strategies in Scientific Computing, (Springer, 2001).
[70] C. P. Robert and G. Casella, Monte Carlo Statistical Methods, (Springer, 2004).
[71] A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice, (Springer, 2001).
[72] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory 20, 284-287 (1974).
[73] D. Guo, S. Shamai, and S. Verdu, “Mutual information and minimum mean-square error in Gaussian channels,” IEEE Trans. on Inform. Theory 51, 1261-1282 (2005).
[74] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory, (Wiley, New York, 1968).
[75] D. P. Palomar and S. Verdu, “Gradient of mutual information in linear vector Gaussian channels,” IEEE Trans. on Inform. Theory 52, 141-154 (2006).
[76] W. T. Cathey and E. R. Dowski, “New Paradigm for Imaging Systems,” Applied Optics 41, 6080-6092 (2002).
[77] A. Ashok and M. A. Neifeld, “Pseudorandom phase masks for superresolution imaging from subpixel shifting,” Applied Optics 46, 2256-2268 (2007).
[78] M. D. Stenner, A. Ashok, and M. A. Neifeld, “Multi-Domain Optimization for Ultra-Thin Cameras,” Frontiers in Optics, Rochester, NY (2006).
[80] M. A. Neifeld and J. Ke, “Optical architectures for compressive imaging,” Applied Optics 46, 5293-5303 (2007).
[81] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience 3, 71-86 (1991).
[82] P. Belhumeur, J. Hespanha, and D. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Trans. on Pattern Analysis and Machine Intelligence 19, 711-720 (1997).
[83] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, “Face recognition by independent component analysis,” IEEE Trans. on Neural Networks 13, 1450-1464 (2002).
[84] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, (Wiley-Interscience, 2000).
[85] M. A. Neifeld, A. Ashok, and P. K. Baheti, “Task Specific Information for Imaging System Analysis,” J. Opt. Soc. Am. A 24, B25-B41 (2007).
[86] H. Pal and M. A. Neifeld, “Multispectral principal component imaging,” Optics Express 11, 2118-2125 (2003).
[87] D. L. Donoho, “Compressed sensing,” IEEE Trans. on Information Theory 52, 1289-1306 (2006).
[88] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, “Compressed Sensing MRI [A look at how CS can improve on current imaging techniques],” IEEE Signal Processing Magazine 25(2), 72-82 (2008).
[89] A. Mahalanobis, “Optical Systems for Task Specific Compressed Sensing and Image Reconstruction,” Annual Meeting of the IEEE Lasers and Electro-Optics Society, 157-158 (2007).
[90] M. F. Duarte, M. A. Davenport, M. B. Wakin, and R. G. Baraniuk, “Sparse signal detection from incoherent projections,” in Proc. of IEEE International Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 3, 14-19 (2006).
[91] D. Takhar, J. N. Laska, M. B. Wakin, M. F. Duarte, D. Baron, S. Sarvotham, K. Kelly, and R. G. Baraniuk, “A new compressive imaging camera architecture using optical-domain compression,” Proc. SPIE 6065, 43-52 (2006).
[92] D. P. Palomar and S. Verdu, “Representation of Mutual Information Via Input Estimates,” IEEE Trans. on Inform. Theory 53, 453-470 (2007).
[93] N. Towghi and B. Javidi, “Optimum receivers for pattern recognition in the presence of Gaussian noise with unknown statistics,” J. Opt. Soc. Am. A 18, 1844-1852 (2001).
[94] R. Patnaik and D. Casasent, “MINACE filter classification algorithms for ATR using MSTAR data,” Proc. SPIE 5807, 100-111 (2005).
[95] R. Patnaik and D. Casasent, “SAR classification and confuser and clutter rejection tests on MSTAR ten-class data using Minace filters,” Proc. SPIE 6574, 657402:1-15 (2007).
[96] W. Gander and W. Gautschi, “Adaptive Quadrature - Revisited,” BIT 40, 84-101 (2000).
[97] I. T. Jolliffe, Principal Component Analysis, (Springer, 2002).
[98] D. Barber and F. V. Agakov, “The IM Algorithm: A Variational Approach to Information Maximization,” in NIPS (MIT Press, 2003).
[99] A. J. Bell and T. J. Sejnowski, “An information-maximization approach to blind separation and blind deconvolution,” Neural Computation 7, 1129-1159 (1995).
[100] A. Hyvarinen, J. Karhunen, and E. Oja, Independent Component Analysis, (Wiley, 2001).