Single-Frame Image Processing Techniques for Low-SNR Infrared Imagery

Rich Edmondson (a), Mike Rodgers (a), Michele Banish (a), Michelle Johnson (a), Heggere Ranganath (b)

(a) Polaris Sensor Technologies, 200 West Side Square, Suite 320, Huntsville, AL 35801
(b) Chairman, Computer Science Department, University of Alabama in Huntsville, 300 Technology Hall, Huntsville, AL 35899
ABSTRACT

Polaris Sensor Technologies, Inc. is identifying target pixels in IR imagery at signal-to-noise ratios (SNR) from 1.25 to 3 with a mixed set of algorithms that are candidates for next generation focal planes. Some of these yield fewer than 50 false targets and a 95% probability of detection in this low SNR range. What has been discovered is that single frame imagery combined with IMU data can be input into a host of algorithms, such as neural networks and filters, to isolate signals and cull noise. Nonlinear thresholding approaches can be solved using both genetic algorithms and neural networks. This paper addresses how to implement these approaches and apply them to point target detection scenarios. Large format focal planes will flood the downstream image processing pipelines used in real time systems, and this team is investigating whether data can be thinned near the FPA using one of these techniques. Delivering all the target pixels with a minimum of false positives is the goal addressed by the group. Algorithms that can be digitally implemented in a ROIC are discussed, as are the performance statistics of probability of detection and false alarm rate. Results from multiple focal planes for varied scenarios are presented.

Keywords: Infrared, SNR, neural network, genetic algorithm, PCNN, filter
1. INTRODUCTION

A method currently employed for detecting single pixel targets involves summing frames to drive up the signal and lower the background noise [1]. The noise is typically an uncorrelated combination of focal plane noise, system noise, thermal noise, target motion, and kinematic motion in flight system applications during acquisition. The process is much the same for any dynamic system application such as UAVs, UGVs, product tracking, and process control. Signals are separated from an atmospheric or uncluttered background by selecting pixels above a threshold some K sigma over the expected or tolerated noise. Discriminating signal from background in this manner produces a frame delay and a reduction in frame rate due to the summation. Dim, and possibly significant, signals are hidden below the threshold floor. As a result, the entire contents of the focal plane array currently must be read out to present the data to a downstream mathematical process that interprets it. Our effort addresses the need for high data rate, computationally efficient methods that can separate signal from background at signal-to-noise ratios (SNR) down to 2.
2. METHOD AND APPROACH

In the mathematical experiments presented in this paper, the data set is derived from data collected with cooled HgCdTe and InSb focal plane arrays (FPAs), both 256 pixels square. This paper presents several algorithms for the detection of small unresolved features in noisy images, where each signal is nominally a single pixel but is most likely spread among more than one pixel.
2.1 Data
The signal is added to the background with a computer program that scales an unresolved point spread function to a specific signal-to-noise ratio and randomly places it in the FPA background scene. The signal is classically described as a point spread function, an Airy disk, from an optical system designed to have a two-pixel full width at half maximum. The signal is spread among a number of pixels because it is not stationary in the field of view; if no motion were added, the full width at half maximum would be two pixels. Example imagery can be seen in Figure 1. Background data
was obtained experimentally within a laboratory, and then large
data sets were manufactured. While we recognize that there are alternative methods for determining true signal, such as the use of a first-order Bessel function, the work presented herein is focused on algorithms; the description of the optical signal is simply for determination of truth in scoring.

Infrared Technology and Applications XXXIV, edited by Bjørn F. Andresen, Gabor F. Fulop, Paul R. Norton, Proc. of SPIE Vol. 6940, 69402G (2008) · 0277-786X/08/$18 · doi: 10.1117/12.778076
Two types of target imagery are studied:
1. Targets that tend to dwell in a small number of pixels and
follow a predictable path in the field of view
2. Targets that tend to move randomly in the field of view and
whose extent is not predictable
The data under study have a maximum linear motion per frame of between 3.5 and 12.0 pixels, and each frame shows some target smearing due to motion effects. In scenarios where the camera imagery is not motion compensated, the direction of the smear is likely to remain constant for several frames. A scene generation tool was developed to manufacture data to these specifications. A typical image and a magnified view of the signal are seen in Figure 1.
Figure 1. Typical Input imagery used for algorithm
development
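As an illustrative sketch of the scene generation step described above (not the authors' actual tool), a target can be injected by scaling a point spread function to a requested SNR and placing it at a random location. Here the Airy disk is approximated by a Gaussian with a two-pixel FWHM, and `inject_target` is a hypothetical helper name:

```python
import numpy as np

def inject_target(background, snr, rng=None):
    """Scale a point target to the requested SNR and place it at a
    random location in a background frame.

    SNR is defined as in the paper: peak counts of the centered PSF
    divided by the background standard deviation.  The Airy disk is
    approximated here by a Gaussian with a two-pixel FWHM."""
    rng = np.random.default_rng(rng)
    frame = background.astype(float).copy()
    sigma_psf = 2.0 / 2.355                  # FWHM = 2.355 sigma for a Gaussian
    peak = snr * background.std()            # peak counts for the requested SNR
    r, c = rng.integers(5, background.shape[0] - 5, size=2)
    y, x = np.mgrid[-4:5, -4:5]              # 9x9 stamp centered on the target
    psf = peak * np.exp(-(x**2 + y**2) / (2.0 * sigma_psf**2))
    frame[r-4:r+5, c-4:c+5] += psf
    return frame, (r, c)
```

Motion smear, as described above, would be added by distributing this stamp along a short track rather than at a single location.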
This experiment was motivated by data collected in a hardware-in-the-loop (HWIL) simulation in which an optical system was misaligned and there were focal plane array defects representative of those seen in real world applications. Data from this experiment are shown below. As seen in Figure 2, the diagonal fringes are an optical misalignment among components, the vertical white bar is a focal plane defect, and the horizontal striping is typical readout noise from an Indigo ROIC. The large thermal bloom at the bottom and the top of the frame is of no strategic interest and buries the unresolved concentric rings of single pixel target signals.
Figure 2. Imagery from Phase I HWIL data collection. Annotations from the figure: the IR HWIL data are a difficult data set to process (artifacts, low S/N); PCNN offers a solution with significant improvements over thresholding; the best performance to date offers >7000 fewer false target pixels.
This poor quality data could easily have been generated by a system defect, and the mathematical algorithms presented in this paper were developed to function in spite of such partial system failure.

Our processing of the HWIL data led to the belief that the Pulse Coupled Neural Network (PCNN) can provide better results than simple thresholding. As seen in Figure 2, the image with the average background removed is shown next to the raw image, followed by a processing result for an "optimal threshold", which requires that the signal intensity be known. The optimal thresholding technique filters out all pixels above the peak signal intensity, and all pixels below some fixed number of counts less than the peak signal intensity, which in this instance was 1 count. The result obtained using this optimal thresholding technique is 8000 potential target pixels, much of which is noise. Shown below the thresholded image is the PCNN output, an alternative processing result that yields all signal with significantly fewer false alarm pixels.
We found that signals could be correctly separated from background, albeit not perfectly, with the PCNN. The number of pixels eliminated by the PCNN far exceeded all alternatives, opening system bandwidth for discrimination mathematics downstream. The unfortunate nature of the process at the time was that the PCNN control parameters could be set only by trial and error. Efforts to cast the PCNN in a solvable system of equations failed. The man in the loop posed an unrecoverable implementation problem, which was solved with a Genetic Algorithm approach presented in a companion paper, DSS08-6979-21.
The HgCdTe focal plane array (MCT FPA), which has relatively higher noise than InSb, as well as InSb FPAs, were used during data collections, and multiple integration times were considered in the experiments. Because the focus of our research was a technique to detect regions of interest in imagery that could be implemented near the imager, no non-uniformity correction using gain and offset was performed on the raw infrared imagery. However, a background image was created by averaging several instances of background scenery without signal, and this background image was subtracted from the raw imagery as a rough-order non-uniformity correction. Image statistics reported in this paper are on the background subtracted imagery.
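The rough-order correction just described amounts to averaging signal-free frames and subtracting the result; a minimal sketch (the function name `rough_nuc` is our own, not from the paper):

```python
import numpy as np

def rough_nuc(raw_frames, background_frames):
    """Rough-order non-uniformity correction: subtract the mean of
    several signal-free background frames from each raw frame."""
    background = np.mean(background_frames, axis=0)  # average background image
    return [f.astype(float) - background for f in raw_frames]
```

This removes fixed-pattern offsets only; it is not a full gain-and-offset correction, which the paper deliberately avoids.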
There are no meaningful geometric features that are useful in
unresolved signal detection. The peak value in the frame is most
often an error artifact such as a dead pixel. A number of
algorithms were evaluated that base their decision (signal or
background) on local contrast in intensity. Usually, there will be many areas in the image with acceptable local contrast that qualify them for consideration as target pixels. The existence of
stray bright pixels due to random noise is quite common. Under
these conditions, if the decision is made strictly based on the
information available in one frame, it is not easy to keep the
number of false targets at an acceptable level. Therefore,
algorithms which process a sequence of image frames rather than a
single image frame are expected to perform better not only by
increasing the detection rate of the true target, but also by
reducing the number of false detections. The algorithms which are
used to detect targets in individual frames through pixel
classification are given with simulation results which represent
detection accuracy and level of tolerance to noise. Initial results
prompted developing target detection algorithms which base their
decision on the information extracted from a time sequence of
images, work that is currently underway at Polaris.
2.2 Figures of Merit
The figures of merit for the mathematical experiment were
probability of detection (Pd) and false alarm rate (FAR) as a
function of SNR. Probability of detection is measured by reporting
the number of target smears in the truth data that are co-located
with active pixels in the output mask, and is reported as a
fraction of targets detected. FAR is a count of active pixels per
output frame that are not on a pixel with signal in the input
image, and is reported as pixels per frame. The stated goals at the
outset were probability of detection of 80% at Signal to Noise
Ratio of 2.0 with less than 100 false alarm pixels being passed by
the algorithm. A restriction levied for our research was that the
algorithm should be able to be implemented in hardware at or near
the focal plane so as to allow for higher frame rates, lower
bandwidth, or larger format focal planes by passing only those
regions which likely contained signal to the downstream image
processing systems.
By minimizing the false alarm rate, the potential for reading
out a small fraction of the focal plane data becomes possible. If
only a fraction of the FPA data is read out, very high frame rates
can be achieved without resorting to using additional taps into the
dewar to read out more column data in parallel, a solution which
has thermal disadvantages for the system. If a minimum number of
false alarms (say, up to 50) could be tolerated and all true
targets could be identified, then the process would be a beneficial
image preprocessor for high data rate low SNR imaging
applications.
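The two figures of merit can be scored per frame as sketched below; `score_frame` is a hypothetical helper illustrating the definitions above (detection via co-location with the truth smear, false alarms as active pixels off the smear):

```python
import numpy as np

def score_frame(output_mask, truth_mask):
    """Score one binary output frame against a truth mask.

    Returns (detected, false_alarm_pixels): detected is True if any
    active output pixel is co-located with the target smear; false
    alarms are active pixels carrying no signal in the truth."""
    hits = output_mask & truth_mask
    detected = bool(hits.any())
    false_alarms = int((output_mask & ~truth_mask).sum())
    return detected, false_alarms
```

Averaging `detected` over many frames gives Pd as a fraction of targets detected; averaging `false_alarms` gives FAR in pixels per frame.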
In this paper, SNR is computed as the peak pixel counts produced by the PSF when the PSF is centered on a pixel, divided by the standard deviation of the image background. For a given SNR in the range of 0 to 6, we report a performance metric of single frame probability of detection and the number of false detections per 256x256 frame. The effort is focused on developing a robust algorithm able to operate on single frame imagery, applicable to image processing applications where the prior frame is not available or relevant. Examples of situations which could render image history of little help are FPA partial failure, thermal blooming, flash events, and process control of highly dynamic chemical systems. A
common thread through these applications is that all or part of a
frame can be saturated or of no interest. Additionally, in hardware
implementations near the FPA where there is no memory available and
no stored history of the image sequence, it is necessary to treat
each image frame as a problem unto itself. The developed algorithms
are resilient to these types of challenges. In addition, we
attempted to develop an approach to avoid a gain and offset
correction for focal plane non-uniformity. This desire arose from a
“wish” to alter the standard readout circuit on a focal plane array
with a simple algorithm that allowed image preprocessing without
complicated non-uniformity correction. Technology that puts analog to digital conversion within the readout circuit affords on-chip signal processing to improve data quality, preprocessing the data and eliminating fixed pattern noise within the readout.
3. TARGET DETECTION ALGORITHMS – SINGLE FRAME APPROACHES

The single frame approach is suitable if the target appears as a small blur of a single pixel where the potential target appears brighter than its surrounding pixels. The tolerance for false detections is offset by potential gains in frame rate or processing efficiency downstream. Six algorithms are presented and compared in terms of performance and computational efficiency. The first algorithm classifies a pixel as a target pixel or background pixel based on local pixel intensity statistics. The second algorithm is based on the box filter concept. Each of the next three algorithms bases its decision on moving averages of intensity of pixels centered about the pixel to be classified. The last algorithm uses the linking property of the Pulse Coupled Neural Network (PCNN) to identify potential targets. In all cases, the merits and the detractors are discussed. Due to success in Phase I of this effort, the PCNN was the main focus of our research; the alternative filter methodologies were studied to measure the effectiveness of the PCNN relative to these simple filter techniques and obtain an accurate indication of any performance enhancement available when using the PCNN.
A Note on Implementation

Global mean and standard deviation are expected to change very little from frame to frame. Therefore, their values can be computed using intensity data from a frame prior to the one used for the computation of local means and detection of potential target areas. In all the spatial filters presented, our assumption is that the global mean and standard deviation are computed in the prior frame and used for thresholding in the current frame. For an NxN input image, the computation of μ_spatial requires N^2 additions and one division. Computation of σ_spatial requires N^2 subtractions, N^2 additions, N^2 multiplications, one division and one square root operation.
3.1 Statistically Based Spatial Filtering Techniques
Local Mean with Global Threshold
During image readout, a global mean and standard deviation can
be computed. This can be compared to a local neighborhood of pixel
intensities. Based on the assumption that the blur of the point
target will be extended only across two or three pixels, a 3 by 3
kernel is convolved across the image, with the mean of the pixel
intensities in the neighborhood being assigned to the center pixel
of the convolution kernel. Every region that is significantly
bright relative to global mean and global standard deviation can be
classified as a potential target. The filter equations and a description of their use follow.
• Compute the single frame spatial mean μ_spatial and sigma σ_spatial
• μ_ij is the average of the 3x3 neighborhood S_ij surrounding the pixel of interest:
  μ_ij = (1/9) Σ_{(k,l) ∈ S_ij} S_kl
• Y_ij is the binary output:
  Y_ij = 1 if μ_ij > μ_spatial + 2 σ_spatial, and 0 otherwise
In our efforts we varied the threshold and found that 2 times the spatial sigma worked reliably. Pixels whose local mean was above this value were interpreted as potential targets. Processing the resulting binary image with a median filter can further eliminate single pixel false alarms.
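A minimal software sketch of the Local Mean with Global Threshold filter follows (an illustrative implementation of the equations above, not the hardware pipeline; the function name is ours):

```python
import numpy as np

def local_mean_global_threshold(image, k=2.0):
    """Mark pixels whose 3x3 local mean exceeds
    mu_spatial + k * sigma_spatial (k = 2 worked reliably)."""
    img = image.astype(float)
    mu, sigma = img.mean(), img.std()        # global statistics
    # 3x3 local mean by summing the nine shifted copies of the image
    local = sum(np.roll(np.roll(img, dr, 0), dc, 1)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 9.0
    out = local > mu + k * sigma
    # border pixels lack a full 3x3 neighborhood (np.roll wraps around)
    out[0, :] = out[-1, :] = out[:, 0] = out[:, -1] = False
    return out
```

In deployment the paper assumes the global mean and sigma come from the prior frame; here they are taken from the current frame for simplicity.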
Analysis: Let the input image be an array of N x N pixels. Computation of the global mean requires N^2 additions and one division. Computation of the global standard deviation requires N^2 multiplications, N^2 additions, one division, one subtraction and one square root operation. Therefore, computation of (μ_spatial + 2 σ_spatial) requires (2N^2 + 1) additions, (N^2 + 1) multiplications, one subtraction, and one square root operation. Computation of local means requires 9(N-2)^2 additions and (N-2)^2 divisions. Finally, (N-2)^2 comparisons are needed to quantize the input image. The numbers of additions, multiplications and comparisons depend on the image size and must be considered as basic operations. If addition and subtraction are considered equivalent, and multiplication and division are equivalent, then the method requires approximately 2N^2 multiplications, 2N^2 comparisons and 11N^2 additions.
Implementation of this algorithm is very simple. Local means can be computed in parallel using an array processor. However, this may not be a feasible approach for a real-time application. A better approach is to use a pipeline architecture to compute the local means as the image is read out. Once μ_ij is computed, Y_ij can be set to zero or one using a comparator.
The shift register architecture for the computation of μ_ij consists of a 3xN array of shift registers, where N is the number of columns in the image. The image at the focal plane is read out in raster scan order starting from pixel (0, 0) into the shift register array. After 3N clock cycles, the first three rows of the image will reside in the array of shift registers. At this time μ_11 is computed as the average of the nine pixels in the three rightmost columns of the shift register array. The comparator evaluates μ_11 against (μ_spatial + 2 σ_spatial), and sets Y_11 to 1 if μ_11 is greater than (μ_spatial + 2 σ_spatial). Otherwise, Y_11 is set to 0.

During the next clock cycle, the 3x3 array will be positioned over the pixel located at (1, 2), and the process will be repeated. The hardware will be idle when the kernel is positioned over border pixels, a side effect of the fact that there must be a full 3x3 neighborhood about the pixel being processed.
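The raster-scan pipeline just described can be modeled in software with a three-row buffer standing in for the shift registers (a behavioral sketch only, not a hardware description):

```python
from collections import deque

def stream_local_means(image_rows, ncols):
    """Pipeline sketch of the shift-register architecture: pixels arrive
    in raster-scan order; once three rows are buffered, emit the 3x3
    mean centered on each interior pixel of the middle buffered row."""
    rows = deque(maxlen=3)             # stands in for the 3xN shift registers
    for row in image_rows:
        rows.append(list(row))         # a new row completes; oldest row drops out
        if len(rows) == 3:
            for c in range(1, ncols - 1):
                window = [rows[r][c + dc]
                          for r in range(3) for dc in (-1, 0, 1)]
                yield sum(window) / 9.0
```

Each emitted mean would be fed to the comparator against (μ_spatial + 2 σ_spatial) from the prior frame.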
Box Filter with Local Threshold
As target pixels are expected to be brighter than the pixels surrounding them, this algorithm uses the global standard deviation and a box filter to identify target smears which are significantly brighter than surrounding pixels. The kernel shape for this filter can be seen in Figure 3 below.
Figure 3. Box Filter Convolution Kernel
where m and n specify the 24 border pixels in a 7x7 neighborhood about the pixel of interest.
• Compute the single frame spatial sigma σ_spatial
• μ_ij is the average of the 3x3 neighborhood surrounding the pixel of interest (center)
• b_ij is the average of the 24 border pixels on a box that is ±3 pixels from the pixel of interest (border)
• Y_ij is the binary output
Analysis: Computation of the local means requires 34(N-6)^2 additions and 2(N-6)^2 divisions. Also, (N-6)^2 comparisons are needed to obtain the binary image.
μ_ij = (1/9) Σ_{(k,l) ∈ S_ij} S_kl
b_ij = (1/24) Σ_{(m,n) ∈ B_ij} S_mn
Y_ij = 1 if μ_ij > b_ij + 2 σ_spatial, and 0 otherwise

where S_ij is the 3x3 center neighborhood and B_ij is the set of 24 border pixels.
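A direct (unoptimized) sketch of the box filter follows, comparing each 3x3 center mean against the 24-pixel border mean at offset 3; the function name and loop structure are illustrative, not the shift-register implementation:

```python
import numpy as np

def box_filter(image, sigma_spatial, k=2.0):
    """Box Filter with Local Threshold: flag pixels whose 3x3 center
    mean exceeds the 24-pixel border-ring mean (offset +/-3) by
    k * sigma_spatial."""
    img = image.astype(float)
    n = img.shape[0]
    out = np.zeros(img.shape, dtype=bool)
    # the 24 positions on the border of a 7x7 neighborhood
    ring = [(m, w) for m in range(-3, 4) for w in range(-3, 4)
            if max(abs(m), abs(w)) == 3]
    for i in range(3, n - 3):
        for j in range(3, n - 3):
            center = img[i-1:i+2, j-1:j+2].mean()
            border = np.mean([img[i+m, j+w] for m, w in ring])
            out[i, j] = center > border + k * sigma_spatial
    return out
```

A three-pixel boundary of the image cannot be evaluated, matching the (N-6)^2 operation counts in the analysis.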
Implementation of this algorithm makes use of the previous image's global sigma. The computation of the two local means can be implemented using a shift register architecture similar to the one described in the previous section, storing seven rows of raster scanned data. Because the convolution kernel is 7x7 in this filter, there is a boundary of three pixels around the border of the input image which cannot be evaluated.
Cross Average Filter
This algorithm uses the cross average filtering concept to identify blobs which are significantly brighter than surrounding pixels. The filter kernel can be seen in Figure 4 below.
Figure 4. Cross Average Filter Convolution Kernel
• μ_ij is the average of the outer pixels of a cross through the center of a 19x19 neighborhood surrounding the pixel of interest (center)
• Y_ij is the binary output
Analysis: Computation of the local values of the cross-average requires 20(N-20)^2 additions and (N-20)^2 divisions. Finally, (N-20)^2 comparisons are needed to transform the input image to a binary image.
Implementation of this algorithm does not require computation of
global mean or standard deviation. At each location, the local
cross-average can be computed using a shift register architecture
similar to the one previously described. In this case it is
necessary to store nineteen rows of image data.
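A software sketch of the cross average filter follows, implementing the two-stage scheme given in the paper's equations (a 3σ point test against the mean of the 20 outer cross pixels, then a 5-of-25 consolidation over a 5x5 kernel); the function name and loop form are illustrative:

```python
import numpy as np

def cross_average_filter(image, sigma_spatial):
    """Cross Average Filter: compare each pixel to the mean of the 20
    outer cross pixels (offsets 5..9 along rows and columns) of a
    19x19 neighborhood, then require 5 agreeing pixels in a 5x5 window."""
    img = image.astype(float)
    n = img.shape[0]
    # 20 cross positions: offsets +/-5..+/-9 vertically and horizontally
    offs = [(d, 0) for a in (5, 6, 7, 8, 9) for d in (a, -a)] + \
           [(0, d) for a in (5, 6, 7, 8, 9) for d in (a, -a)]
    T = np.zeros(img.shape, dtype=bool)
    for i in range(9, n - 9):
        for j in range(9, n - 9):
            mu = np.mean([img[i + r, j + c] for r, c in offs])
            T[i, j] = img[i, j] > mu + 3.0 * sigma_spatial
    Y = np.zeros_like(T)                       # consolidation stage
    for i in range(2, n - 2):
        for j in range(2, n - 2):
            Y[i, j] = T[i-2:i+3, j-2:j+3].sum() >= 5
    return Y
```

The gap of four pixels between the center and the cross arms keeps a smeared target out of its own background estimate.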
Line Average Filter
This algorithm uses the line average filtering concept to identify pixels which are significantly brighter than surrounding pixels. The orientation of the filter kernel can be horizontal (rows) or vertical (columns), and can be chosen based upon the noise characteristics of the data being filtered. The filter kernel can be seen in Figure 5 below. This filter is particularly good at finding signal embedded in raw imagery corrupted by column readout noise.
μ_ij = (1/10) Σ_{(k,l) ∈ L_ij} S_kl, where L_ij specifies the 10 pixels that reside in a 19-pixel line, at least 4 pixels from the pixel of interest
T_ij = 1 if S_ij > μ_ij + 3 σ_spatial, and 0 otherwise
Y_ij = 1 if Σ T_kl ≥ 5, where k and l specify a 5x5 kernel about the pixel of interest, and 0 otherwise
Figure 5. Line Average Filter Convolution Kernel
• µij is the average of the 10 outer pixels in a 19 pixel linear
neighborhood surrounding the pixel of interest
• Yij is the binary output
The corresponding equations for the Cross Average Filter of the previous section are:
μ_ij = (1/20) Σ_{(k,l) ∈ C_ij} S_kl, where C_ij specifies the 20 pixels that form a cross pattern in a 19x19 neighborhood, at least 4 pixels from the pixel of interest
T_ij = 1 if S_ij > μ_ij + 3 σ_spatial, and 0 otherwise
Y_ij = 1 if Σ T_kl ≥ 5, where k and l specify a 5x5 kernel about the pixel of interest, and 0 otherwise
Analysis: Once again, addition and comparison are basic operations. The computation of the local values of the line-average requires 10N(N-20) additions and N(N-20) divisions. Finally, N(N-20) comparisons are needed to transform the input image to a binary image. Implementation of this algorithm does not require computation of the global mean or standard deviation. At each location, the local line-average can be computed using a shift register architecture similar to the one previously described. In this case it is necessary to store only one row of image data.
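The first stage of the line average filter can be sketched as below (the 5x5 consolidation stage, shared with the cross filter, is omitted here for brevity; names are illustrative):

```python
import numpy as np

def line_average_filter(image, sigma_spatial, axis=1):
    """Line Average Filter, first stage: compare each pixel to the mean
    of the 10 outer pixels (offsets 5..9 on each side) of a 19-pixel
    line, oriented along rows (axis=1) or columns (axis=0)."""
    img = image.astype(float)
    if axis == 0:
        img = img.T                    # vertical orientation: work on columns
    n_rows, n_cols = img.shape
    offs = [d for a in (5, 6, 7, 8, 9) for d in (a, -a)]  # 10 outer offsets
    T = np.zeros(img.shape, dtype=bool)
    for i in range(n_rows):
        for j in range(9, n_cols - 9):
            mu = np.mean([img[i, j + d] for d in offs])
            T[i, j] = img[i, j] > mu + 3.0 * sigma_spatial
    return T if axis == 1 else T.T
```

Because only one row (or column) of history is needed, this is the cheapest of the spatial filters to realize in a readout pipeline.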
Post Processing Spatially Filtered Data with Median Filter
In general, implementation of the median filter in real time is
a challenging task as it requires sorting. However, when the image
to be filtered is a binary image, it is possible to implement
median filter using high speed hardware for real-time operation. A
shift register as wide as three rows of the output image that is
capable of storing 1 bit of data per pixel, and the ability to add
9 of those pixel values is all that is required. The 9 pixels about
the pixel in question are added together. If 5 or more pixels in
the 3x3 neighborhood are on then the pixel value is turned on,
otherwise it is set to zero. If the goal is to compute the median based on a 5x5 window, then the size of the shift register array and the threshold for the comparator change to five rows and 13, respectively. In fact, the above architecture can easily be modified to find blobs of desired size and shape in binary images.
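On a binary image the median reduces to a majority vote (5 of 9 for a 3x3 window, 13 of 25 for 5x5), which is just an add-and-compare; a software sketch of that hardware-friendly form:

```python
import numpy as np

def binary_median(mask, size=3):
    """Median filter on a binary image: a pixel stays on only when the
    majority of its size x size neighborhood is on."""
    m = mask.astype(np.uint8)
    n_rows, n_cols = m.shape
    half = size // 2
    need = size * size // 2 + 1        # majority count: 5 for 3x3, 13 for 5x5
    out = np.zeros(m.shape, dtype=bool)
    for i in range(half, n_rows - half):
        for j in range(half, n_cols - half):
            out[i, j] = m[i-half:i+half+1, j-half:j+half+1].sum() >= need
    return out
```

Isolated single-pixel false alarms are removed while solid target smears survive.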
3.2 Pulse Coupled Neural Network

In the past, researchers have used PCNNs for image smoothing, image segmentation, feature extraction, and region of interest detection [6]. Our own prior
efforts have demonstrated superior image processing results in a
variety of applications ranging from mine detection to medical
image processing. The PCNN design is based on the visual cortex of
the cat, and has been shown to be useful in a variety of
applications by Eckhorn et al [2], Banish and Ranganath [3],
Kuntimad [4], and Kinser [6] among others. The model is based on
both feeding inputs and linking inputs to each neuron. The feeding
inputs are weighted portions of the input scene, while the linking inputs are weighted portions of the neuron's neighbors. Each neuron
that fires will contribute to its neighbor some amount of energy
which will increase the likelihood that it will fire on a
successive pulse. By choosing the correct set of linking and
feeding weights and kernel sizes as well as the threshold decay
rate the PCNN can be used very effectively to segment regions of
interest from clutter or high noise backgrounds [4]. These parameters are typically set by trial and error; a systematic, heuristic means of determining the weights was not evident.
Image processing can be done using PCNN by mapping each pixel
value as an input to a PCNN neuron in the network such as seen in
Figure 6. Most image processing work with PCNNs, including Banish
and Ranganath [3], Kuntimad [4], and Kinser [6] among others, was
predicated on finding a set of network coefficients that produced
the desired segmentation in a single pulse. Our classical PCNN uses feeding and linking inputs to each neuron, as described above. Groups of neurons that are spatially near one another and at approximately the same intensity will tend to fire in the same pulse. Neurons firing together excite
contiguous regions, and small spatial gaps within the region are
spanned. Global thresholds decay at each pulse producing a set of
binary outputs that identify regions within the input that can be
segmented even if the input intensities are not as bright as the
background clutter within the scene [4].
Figure 6. Connectivity of a PCNN; each input image pixel is mapped to a neuron 1:1
The mathematical formalism is depicted in the set of equations
as shown below.
F_ij(n) = e^{-α_F} F_ij(n-1) + S_ij + V_F Σ_{kl} W_ijkl Y_kl(n-1)
L_ij(n) = e^{-α_L} L_ij(n-1) + V_L Σ_{kl} M_ijkl Y_kl(n-1)
U_ij(n) = F_ij(n) [1 + β L_ij(n)]
Y_ij(n) = 1 if U_ij(n) > Θ_ij(n-1), and 0 otherwise
Θ_ij(n) = e^{-α_Θ} Θ_ij(n-1) + V_Θ Y_ij(n)
• S is the input signal intensity, (n is time, i is row, j is
column)
• F is the feeding signal
• L is the linking signal
• U is the internal activity
• Y is the binary image output state
• Θ is the dynamic threshold
• V_F, V_L and V_Θ are normalizing constants
• M and W are the synaptic weight kernels through which neighboring neurons communicate, typically 1/r or 1/r^2
• α_F, α_L and α_Θ are decay constants

Practically, groups of neurons that are spatially near one another and at approximately the same intensity tend to fire in the same pulse, exciting contiguous regions and spanning small spatial gaps, as described above. Unlike other neural networks, the neuron weights in a
PCNN are not solved by training. Linking and feeding weights and
kernel sizes can be selected by trial and error to empirically
solve for a network that will produce desirable results, such as
identifying features within the image. However, this proves to be a
time consuming, if not frustrating, exercise that does not
generally produce a result that excels at segmenting only features
of interest while rejecting background noise or clutter in all
circumstances. The network is evaluated iteratively, and there is a
state for each of the network equations that can be examined
following each of these “pulses”. The threshold on each pixel
decays until the internal activity for the pixel exceeds the
threshold, at which time the threshold is boosted for the
subsequent pulse. If the cycle were to repeat indefinitely the
output on each pixel would eventually oscillate.
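The pulse iteration can be sketched as follows. The parameter values and the 3x3 1/r kernel below are illustrative assumptions for a small demonstration, not the coefficients the authors obtained; `pcnn` is our own helper name:

```python
import numpy as np

def pcnn(S, n_pulses=4, beta=0.2, vf=0.1, vl=0.5, vt=20.0,
         af=0.3, al=0.3, at=0.2):
    """One neuron per pixel; 3x3 linking/feeding kernel with 1/r
    weights and no self term.  Returns Y after the final pulse."""
    K = np.array([[0.707, 1.0, 0.707],
                  [1.0,   0.0, 1.0],
                  [0.707, 1.0, 0.707]])
    F = np.zeros_like(S, dtype=float)
    L = np.zeros_like(F)
    Y = np.zeros_like(F)
    theta = np.full_like(F, S.max())          # threshold decays from the peak

    def conv(y):                              # sum of weighted neighbor firings
        out = np.zeros_like(y)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                out += K[dr + 1, dc + 1] * np.roll(np.roll(y, dr, 0), dc, 1)
        return out

    for _ in range(n_pulses):
        link = conv(Y)
        F = np.exp(-af) * F + S + vf * link   # feeding: decays, driven by S
        L = np.exp(-al) * L + vl * link       # linking: decays, driven by Y
        U = F * (1.0 + beta * L)              # internal activity
        Y = (U > theta).astype(float)         # fire where activity beats threshold
        theta = np.exp(-at) * theta + vt * Y  # decay, then boost fired neurons
    return Y.astype(bool)
```

As in the paper's heuristic, the output is taken at a fixed pulse (here the fourth) rather than run to oscillation.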
Our experience reveals the following algorithm features:
• Locates spatially contiguous regions of signal
• Tolerant to spatial noise
• Few control parameters, which can be determined heuristically
• For this application we forced the outcome at a given pulse (4) as a part of the heuristic
• It is the only method to achieve perfect segmentation when the intensity ranges of regions and background overlap
As an alternative to manually tuning the PCNN for each use, this
group developed a framework for determining the PCNN coefficients,
linking and feeding kernel sizes and weights, and the number of
PCNN pulses based on the use of a Genetic Algorithm (GA). The GA
uses a representative data set and a scoring heuristic to find a
PCNN solution that will work on data that is statistically similar
to that used in training. Through the course of this research it was determined that, in addition to a quality GA using innovative breeding and sub-populations, the quality of the PCNN solution was heavily influenced by the data set provided for training, and by
the scoring heuristic used to judge the fitness of a particular
PCNN individual that had been bred by the GA. Among the innovations was the ability to force the solution to segment the target from the image on a specific PCNN pulse. These separate discoveries
addressed previous difficulties in using PCNN for segmentation –
the inability to predict when the segmentation would occur, and the
need to manually select a set of coefficients and weights to
perform the segmentation.
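The shape of such a search can be sketched with a minimal real-valued GA (tournament selection, elitism, Gaussian mutation). This is a generic stand-in, not the companion paper's GA, which additionally uses innovative breeding and sub-populations; in practice `fitness` would run the PCNN on training imagery and apply the scoring heuristic:

```python
import numpy as np

def genetic_search(fitness, bounds, pop=20, gens=40, seed=0):
    """Maximize fitness over a real-valued genome within bounds.
    bounds is a list of (low, high) pairs, one per parameter."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    P = rng.uniform(lo, hi, size=(pop, len(bounds)))
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in P])
        nxt = [P[scores.argmax()].copy()]            # elitism: keep the best
        while len(nxt) < pop:
            i, j = rng.choice(pop, 2, replace=False)  # tournament of two
            parent = P[i] if scores[i] > scores[j] else P[j]
            child = parent + rng.normal(0.0, 0.1 * (hi - lo))  # mutation
            nxt.append(np.clip(child, lo, hi))
        P = np.array(nxt)
    scores = np.array([fitness(ind) for ind in P])
    return P[scores.argmax()]
```

For PCNN tuning, the genome would hold the linking/feeding weights, kernel sizes, decay constants, and pulse count, with the fitness penalizing false alarm pixels beyond the allowable budget.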
Figure 7. Low contrast input imagery used in this research (left
two images), and PCNN output image (right)
Moreover, we have found that this implementation works across a
family of focal planes. Solutions for both InSb and MCT systems
were tested, and algorithmic results were similar. Sample input
imagery and the results of the PCNN processing on it are shown in
Figure 7 above. The left images are actually the same image, shown
with different scales. The rightmost image is the PCNN output with
pixels on target shown in green, and false alarm pixels shown in
yellow.
4. RESULTS

To determine the value of this research, our PCNN is compared with segmentation results from the spatial filter algorithms in a low signal-to-noise-ratio experiment. Results of each method can be seen in Figure 8 below. Figure 9 shows the false alarm performance of each method tested. Note that two PCNNs are reported. For all algorithms, the false alarm rate is practically invariant to SNR.
Figure 8. Single Frame Probability of Detection for a Variety of Algorithms. (Plot of Pd versus SNR from 0 to 6 for: PCNN trained with 50 allowable false alarms, Local Mean, Box Filter, Cross Average, Line Average, Vertical Line Average, and PCNN trained with 250 allowable false alarms.)
Figure 9. False alarm pixels per frame for a variety of algorithms (plotted versus SNR from 0 to 6 for the Box Filter, Cross Average, Line Average, Vertical Line Average, Local Mean, and the PCNNs trained with 50 and 250 allowable false alarms). Shown are the false alarm pixels for the spatial filters and for the PCNN solved using the Genetic Algorithm. Note that the false alarm pixels per frame for the PCNN are approximately equal to the allowable false alarms per frame used in the GA.
4.1 Comparison of Single Frame Algorithms
We found through this research that a variety of methods can be applied to detect point targets in infrared imagery. The key hurdle to overcome appears to be that the image
statistics – background mean and image standard deviation – must
remain somewhat constant for the segmentation filters to work
consistently. Focal plane non-uniformity is a large problem when
considering single frames of imagery, and for our research this was
overcome by simply doing a background subtraction from the current
image using an average background frame collected while looking at
the same background with the same integration time and image frame
rate. This simple correction resulted in a sufficiently uniform
image to allow for successful segmentation using a variety of
techniques.
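The background-subtraction step described above can be sketched as follows; the k-sigma threshold used for the final segmentation rule is one illustrative choice, not the specific rule from this work:

```python
import numpy as np

def subtract_background(frame, bg_frames):
    """Subtract an average background frame (collected while viewing the
    same background, with the same integration time and frame rate) to
    flatten focal-plane non-uniformity in a single frame."""
    bg = np.mean(np.asarray(bg_frames, dtype=float), axis=0)
    return frame.astype(float) - bg

def threshold_segment(residual, k=3.0):
    """Flag pixels more than k standard deviations above the residual mean."""
    return residual > residual.mean() + k * residual.std()
```

Because the residual image has a roughly constant mean and standard deviation, the segmentation filters that depend on stable image statistics can then work consistently.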
Our research has resulted in a number of spatial filtering
techniques that can easily be implemented in a shift register
hardware configuration that is suitable for implementation near the
focal plane. These hardware solutions can be implemented
immediately post readout, and the background subtraction can be
performed using a small memory device and an address sequencer as
the pixels are converted and read off the focal plane.
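As one example of such a filter, a line-average contrast filter streamed through a shift-register-style buffer might look like the sketch below; the window size and the contrast measure (centre pixel minus neighbour mean) are assumptions for illustration, and a hardware version would hold the taps in registers as pixels are read off the focal plane:

```python
import numpy as np
from collections import deque

def line_average_filter(row, half=3):
    """Stream one readout row through a (2*half + 1)-tap buffer and emit,
    for each fully covered pixel, its contrast against the mean of its
    line neighbours. Border pixels are left at zero."""
    n = 2 * half + 1
    taps = deque(maxlen=n)          # the shift register
    out = np.zeros(len(row), dtype=float)
    for i, px in enumerate(row):
        taps.append(float(px))
        if len(taps) == n:          # register full: centre pixel is i - half
            window = list(taps)
            centre = window[half]
            neighbours = (sum(window) - centre) / (n - 1)
            out[i - half] = centre - neighbours
    return out
```

The box, cross, and vertical-line variants differ only in which neighbours feed the running mean, which is why all of them map onto the same near-FPA register structure.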
The PCNN solution performed the best, although its complexity makes it more suitable for a software implementation. The PCNN solution found by the GA uses floating point coefficients repeatedly throughout the network, which are not easily implemented in hardware, and the network must be evaluated iteratively, making it less suitable for a hardware implementation than a single-pass spatial filter. We have considered using a GA to find a PCNN solution restricted to small integer coefficients, which would allow for an analog circuit implementation of the neuron at the unit cell level of the focal plane, but that exercise has been deferred to a later date.
Additionally, the scoring heuristic greatly affects the quality of the results obtained by the PCNN. Over the course
of this effort it was determined that being too restrictive on
false alarm pixels resulted in a PCNN solution that had a very poor
probability of detection since the solution was rewarded for having
very few pixels on in the output image. Likewise, having a
heuristic that rewarded target detection disproportionately to low
false alarms resulted in a PCNN solution that simply turned on a
lot of pixels in the output image, thus guaranteeing a high
detection rate. The optimum tradeoff seemed to be in tolerating
some number of false alarm pixels before heavily penalizing the
PCNN solution. The result was almost always a PCNN that had
approximately that number of false alarm pixels in each output
image. The best PCNN solutions that we found using the GA had
around 250 allowable false alarm pixels per frame. As Figure 9 above shows, this was the case regardless of the SNR of the input imagery. This number can be reduced by further processing the
output using a median filter. Any further signal conditioning of
the output to remove false alarm pixels will likely need to be done
via temporal processing.
The simple filters tested in this study performed as well as the PCNN once a satisfactory SNR was achieved. If the input data
being segmented is sufficiently bright then these simple filters
should be considered. If the data set in question requires
detection in the range of signal to noise ratio of 1.5 to 3 then
the PCNN solution determined by the GA is a recommended option.
The use of these algorithms near the focal plane will allow for
the possibility of data thinning by passing only regions of
interest to the downstream data processing systems which track or
characterize targets, allowing for either higher data rates or
larger imaging systems, both of which would be difficult to
implement if processing every pixel were required.
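Such region-of-interest thinning might be sketched as follows; the chip size and the (origin, chip) output format are assumptions for illustration:

```python
import numpy as np

def thin_to_rois(img, mask, half=4):
    """Ship only small image chips around detected pixels downstream,
    instead of the full frame. `mask` is the segmentation output; each
    ROI is returned as its (row, col) origin plus the pixel chip."""
    rois = []
    for r, c in zip(*np.nonzero(mask)):
        r0, r1 = max(0, r - half), min(img.shape[0], r + half + 1)
        c0, c1 = max(0, c - half), min(img.shape[1], c + half + 1)
        rois.append(((r0, c0), img[r0:r1, c0:c1].copy()))
    return rois
```

For a large-format focal plane with a handful of detections, the downstream tracker then receives a few small chips per frame rather than every pixel.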
Although the probability of detection is good for all the
presented approaches, the number of false alarm pixels reported is
unacceptably high for most applications to use the filter output
directly. In several cases the use of a median filter on the output
of the presented algorithm is suggested as a means to eliminate
stray single pixel false alarms. While this method does improve the
false alarm rate for each algorithm, it also negatively impacts the
probability of detection by inadvertently deleting true target
detections from the segmented output imagery.
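The median-filter cleanup can be sketched as below; note, as discussed, that isolated single-pixel false alarms vanish but small true detections can be lost along with them:

```python
import numpy as np

def median_clean(mask, size=3):
    """Apply a size x size median filter to a binary segmentation mask.
    A pixel survives only if at least half of its neighbourhood is on,
    so lone false-alarm pixels are removed."""
    h = size // 2
    P = np.pad(mask.astype(int), h)           # zero-pad the borders
    out = np.zeros(mask.shape, dtype=bool)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            out[i, j] = np.median(P[i:i + size, j:j + size]) > 0
    return out
```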
Sample output imagery produced by the algorithms presented can
be seen in Figure 10. The input image and truth image are shown for
reference. The data in this case represents an SNR of 2.5, with
targets having a blur circle of approximately 9 to 16 pixels, with
only 2 of those being above the noise floor of the image. Algorithm
outputs are shown.
(input image) (truth)
(local mean) (box filter) (cross average)
(line average) (matched filter) (PCNN)
Figure 10. Sample image segmentation results for a variety of techniques, including spatial filtering, correlation, and a Pulse Coupled Neural Network. Shown are results for input imagery at SNR 2.5, which is below the capability of the spatial filters; at SNR 4 they all work equally well.
4.2 Research in Progress
Following this research on segmentation of relatively stationary
point targets in infrared imagery we have begun to address the
problem of point targets that have been smeared in the image due to
motion of the target or imaging platform. Addressing only the case
where the target motion can be ignored and smear is due to a change
in pose of the imaging system during integration, a matched filter
approach can be used to detect segments within the image that match
the predicted shape of the streak based upon a sequence of
measurements indicating camera pose. Using these measurements a
correlation kernel can be constructed and convolved across the
input image. Peaks in the image can then be evaluated against a
threshold to find likely target segments in the image. A variety of
thresholding techniques can be applied to find likely targets in
the correlation image. Polaris Sensor Technologies will address one
of these techniques in a companion paper to this one, DSS08-
6967-22.
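The streak matched filter can be sketched as follows; constructing the kernel from a single per-frame image-plane motion estimate (dx, dy) is a simplification that ignores the point spread function and any curvature in the pose history:

```python
import numpy as np

def streak_kernel(dx, dy, steps=16):
    """Build a normalized correlation kernel shaped like the smear streak
    predicted from camera-pose measurements: a line covering (dx, dy)
    pixels of image-plane motion during integration."""
    h = int(abs(round(dy))) + 1
    w = int(abs(round(dx))) + 1
    k = np.zeros((h, w))
    for t in np.linspace(0.0, 1.0, steps):
        r = int(round(t * (h - 1))) if dy >= 0 else int(round((1 - t) * (h - 1)))
        c = int(round(t * (w - 1))) if dx >= 0 else int(round((1 - t) * (w - 1)))
        k[r, c] = 1.0
    return k / k.sum()

def correlate(img, k):
    """Naive valid-mode 2-D correlation of the image with the kernel;
    peaks mark segments that match the predicted streak shape."""
    H, W = img.shape
    h, w = k.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * k)
    return out
```

Peaks in the correlation image are then evaluated against a threshold, using any of the thresholding techniques discussed above, to find likely target segments.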
In addition to further research using shaped kernels to detect
streaks, Polaris is addressing a PCNN approach that utilizes
multiple frames of source imagery as input. By considering a set of
frames and the relative motion of the imaging system legitimate
targets can be separated from artifacts that may appear to be of
interest in only a single frame. This work is ongoing, but shows
promise to provide a means to detect small targets without image
stabilization techniques.
When the target appears small, consisting of only a few pixels, it is common to have a high number of false targets in the segmented image. This
is because all locally bright points need to be considered as
potential targets. However, if frames in the time sequence of
images are transformed to binary form, spatially registered and
stacked to form a 3-D binary image then the pixels that represent a
true object will form a curve of significant length in the 3-D
binary image. As noise is expected to appear at random locations, streaks that are persistent and smooth should be scored more highly when deciding which detections are targets and which are noise. Because of
the effects of object visibility and digitization – both related to
intensity of the target as it slews across detectors, and due to
blanking time during readout – the curve may not be perfectly smooth and may even have small breaks. Techniques to link streaks
from consecutive frames are being addressed, but must still be
perfected to be useful.
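The 3-D binary stacking idea can be sketched as a persistence score over registered frames; integer-shift registration via np.roll (which wraps at the image borders) is a simplifying assumption standing in for real spatial registration:

```python
import numpy as np

def persistence_score(binary_frames, shifts):
    """Register each binary frame by its known platform shift (dr, dc),
    stack the registered frames, and score each spatial location by how
    many frames it stays lit. A true object traces a persistent curve
    through the 3-D binary volume; random noise does not accumulate."""
    acc = np.zeros(binary_frames[0].shape, dtype=int)
    for frame, (dr, dc) in zip(binary_frames, shifts):
        acc += np.roll(frame.astype(int), (-dr, -dc), axis=(0, 1))
    return acc
```

Thresholding the persistence score then separates targets from single-frame artifacts; bridging the small breaks caused by visibility and readout blanking would require the streak-linking techniques still being perfected.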
5. REFERENCES
[1] D. L. Webb, L. L. Hung, and D. F. Elliott, "Algorithms for EO Sensor SNR Enhancement and Smoothing," Conference Record of the Twenty-Ninth Asilomar Conference on Signals, Systems and Computers (ASILOMAR '95), IEEE, 1995.
[2] R. Eckhorn, H. J. Reitboeck, M. Arndt, and P. Dicke, "Feature Linking via Synchronization among Distributed Assemblies: Simulations of Results from Cat Visual Cortex," Neural Computation, Vol. 2, pp. 293-307, 1990.
[3] M. R. Banish and H. S. Ranganath, "Neural network based element, image pre-processor, and method of pre-processing using a neural network," United States Patent number 20030076992, 2003.
[4] G. Kuntimad, “Pulse coupled neural networks for image
processing”, Ph.D. dissertation, Computer Science department, The
University of Alabama in Huntsville, 1995.
[5] G. Kuntimad and H. S. Ranganath, "Perfect Image Segmentation using Pulse Coupled Neural Networks," IEEE Transactions on Neural Networks, Vol. 10, pp. 591-599, May 1999.
[6] T. Lindblad and J. M. Kinser, Image Processing Using Pulse-Coupled Neural Networks, Second Revised Edition, Springer, 2005.