Single-Frame Image Processing Techniques for Low-SNR Infrared Imagery

Rich Edmondson (a), Mike Rodgers (a), Michele Banish (a), Michelle Johnson (a), Heggere Ranganath (b)

(a) Polaris Sensor Technologies, 200 West Side Square, Suite 320, Huntsville, AL 35801
(b) Chairman, Computer Science Department, University of Alabama in Huntsville, 300 Technology Hall, Huntsville, AL 35899
ABSTRACT

Polaris Sensor Technologies, Inc. is identifying target pixels in IR imagery at signal-to-noise ratios (SNR) from 1.25 to 3 with a mixed set of algorithms that are candidates for next generation focal planes. Some of these yield fewer than 50 false targets and a 95% probability of detection in this low SNR range. What has been discovered is that single frame imagery combined with IMU data can be input into a host of algorithms, such as neural networks and filters, to isolate signals and cull noise. Nonlinear thresholding approaches can be solved using both genetic algorithms and neural networks. This paper addresses how to implement these approaches and apply them to point target detection scenarios. Large format focal planes will flood the downstream image processing pipelines used in real time systems, and this team is investigating whether data can be thinned near the FPA using one of these techniques. Delivering all the target pixels with a minimum of false positives is the goal addressed by the group. Algorithms that can be digitally implemented in a ROIC are discussed, as are the performance statistics of probability of detection and false alarm rate. Results from multiple focal planes for varied scenarios are presented.

Keywords: Infrared, SNR, neural network, genetic algorithm, PCNN, filter
1. INTRODUCTION

A method currently employed for detecting single pixel targets involves summing frames to drive up the signal and lower the background noise [1]. The noise is typically an uncorrelated combination of focal plane noise, system noise, thermal noise, target motion, and kinematic motion in flight system applications during acquisition. The process is much the same for any dynamic system application such as UAVs, UGVs, product tracking, and process control. Signals are separated from an atmospheric or uncluttered background by selecting pixels above a threshold some K sigma over the expected or tolerated noise. Discriminating signal from background in this manner produces a frame delay and a reduction in frame rate due to the summation. Dim, and possibly significant, signals are hidden below the threshold floor. As a result, the entire contents of the focal plane array currently must be read out to present the data to a downstream mathematical process that interprets it. Our effort addresses the need for high data rate, computationally efficient methods that can separate signal from background at signal-to-noise ratios (SNR) down to 2.
2. METHOD AND APPROACH

In the mathematical experiments presented in this paper, the data set is derived from data collected with cooled HgCdTe and InSb focal plane arrays (FPAs), both 256 pixels square. This paper presents several algorithms for the detection of small unresolved features in noisy images, where each signal is nominally a single pixel but is most likely spread among more than one pixel.
2.1 Data
The signal is added to the background with a computer program that scales an unresolved point spread function to a specific signal-to-noise ratio and randomly places it in the FPA background scene. The signal is classically described as a point spread function, an Airy disk, from an optical system designed to have a two-pixel full width at half maximum. The signal is spread among a number of pixels because it is not stationary in the field of view; if no motion were added, the full width at half maximum would be two pixels. Example imagery can be seen in Figure 1. Background data
was obtained experimentally within a laboratory, and then large
data sets were manufactured. While we recognize that there are alternative methods for determining true signal, such as the use of a first-order Bessel function, the work presented herein is focused on algorithms; the description of the optical signal is simply for determination of truth in scoring.

Infrared Technology and Applications XXXIV, edited by Bjørn F. Andresen, Gabor F. Fulop, Paul R. Norton, Proc. of SPIE Vol. 6940, 69402G (2008) · 0277-786X/08/$18 · doi: 10.1117/12.778076
Two types of target imagery are studied:
1. Targets that tend to dwell in a small number of pixels and
follow a predictable path in the field of view
2. Targets that tend to move randomly in the field of view and
whose extent is not predictable
The data under study have a maximum linear motion per frame of between 3.5 and 12.0 pixels, and each frame shows some target smearing due to motion effects. In scenarios where the camera imagery is not motion compensated, the direction of the smear is likely to remain constant for several frames. A scene generation tool was developed to manufacture data to these specifications. A typical image and a magnified view of the signal are seen in Figure 1.
Figure 1. Typical Input imagery used for algorithm
development
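As an illustrative sketch of the scene generation step described above (not the authors' actual tool), a target can be injected by scaling a point spread function to a requested SNR and placing it at a random location. Here the Airy disk is approximated by a Gaussian with a two-pixel FWHM, and `inject_target` is a hypothetical helper name:

```python
import numpy as np

def inject_target(background, snr, rng=None):
    """Scale a point target to the requested SNR and place it at a
    random location in a background frame.

    SNR is defined as in the paper: peak counts of the centered PSF
    divided by the background standard deviation.  The Airy disk is
    approximated here by a Gaussian with a two-pixel FWHM."""
    rng = np.random.default_rng(rng)
    frame = background.astype(float).copy()
    sigma_psf = 2.0 / 2.355                  # FWHM = 2.355 sigma for a Gaussian
    peak = snr * background.std()            # peak counts for the requested SNR
    r, c = rng.integers(5, background.shape[0] - 5, size=2)
    y, x = np.mgrid[-4:5, -4:5]              # 9x9 stamp centered on the target
    psf = peak * np.exp(-(x**2 + y**2) / (2.0 * sigma_psf**2))
    frame[r-4:r+5, c-4:c+5] += psf
    return frame, (r, c)
```

Motion smear, as described above, would be added by distributing this stamp along a short track rather than at a single location.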
This experiment was motivated by data collected in a hardware-in-the-loop (HWIL) simulation in which an optical system was misaligned and there were focal plane array defects representative of those seen in real world applications. Data from this experiment are shown below. As seen in Figure 2, the diagonal fringes are an optical misalignment among components, the vertical white bar is a focal plane defect, and the horizontal striping is typical readout noise from an Indigo ROIC. The large thermal bloom at the bottom and the top of the frame is of no strategic interest and buries the unresolved concentric rings of single pixel target signals.
Figure 2. Imagery from Phase I HWIL data collection. Annotations from the figure: the IR HWIL data are a difficult data set to process (artifacts, low S/N); PCNN offers a solution with significant improvements over thresholding; the best performance to date offers >7000 fewer false target pixels.
This poor quality data could easily have been generated by a system defect, and the mathematical algorithms presented in this paper were developed to function in spite of such partial system failure.

Our processing of the HWIL data led to the belief that the Pulse Coupled Neural Network (PCNN) can provide better results than simple thresholding. As seen in Figure 2, the image with the average background removed is shown next to the raw image, followed by a processing result for an "optimal threshold", which requires that the signal intensity be known. The optimal thresholding technique filters out all pixels above the peak signal intensity, and all pixels below some fixed number of counts less than the peak signal intensity, which in this instance was 1 count. The result obtained using this optimal thresholding technique is 8000 potential target pixels, much of which is noise. Shown below the thresholded image is the PCNN output, an alternative processing result that yields all signal with significantly fewer false alarm pixels.
We found that signals could be correctly separated from background, albeit not perfectly, with the PCNN. The number of pixels eliminated by the PCNN far exceeded all alternatives, opening system bandwidth for discrimination mathematics downstream. The unfortunate nature of the process at the time was that the PCNN control parameters could be set only by trial and error. Efforts to cast the PCNN in a solvable system of equations failed. The man in the loop posed an unrecoverable implementation problem, which was solved with a Genetic Algorithm approach presented in a companion paper, DSS08-6979-21.
The HgCdTe focal plane array (MCT FPA), which has relatively higher noise than InSb, as well as InSb FPAs, were used during data collections, and multiple integration times were considered in the experiments. Because the focus of our research was a technique to detect regions of interest in imagery that could be implemented near the imager, no non-uniformity correction using gain and offset was performed on the raw infrared imagery. However, a background image was created by averaging several instances of background scenery without signal, and this background image was subtracted from the raw imagery as a rough-order non-uniformity correction. Image statistics reported in this paper are on the background subtracted imagery.
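The rough-order correction just described amounts to averaging signal-free frames and subtracting the result; a minimal sketch (the function name `rough_nuc` is our own, not from the paper):

```python
import numpy as np

def rough_nuc(raw_frames, background_frames):
    """Rough-order non-uniformity correction: subtract the mean of
    several signal-free background frames from each raw frame."""
    background = np.mean(background_frames, axis=0)  # average background image
    return [f.astype(float) - background for f in raw_frames]
```

This removes fixed-pattern offsets only; it is not a full gain-and-offset correction, which the paper deliberately avoids.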
There are no meaningful geometric features that are useful in
unresolved signal detection. The peak value in the frame is most
often an error artifact such as a dead pixel. A number of
algorithms were evaluated that base their decision (signal or
background) on local contrast in intensity. Usually, there will be many areas in the image with acceptable local contrast that qualify them for consideration as target pixels. The existence of
stray bright pixels due to random noise is quite common. Under
these conditions, if the decision is made strictly based on the
information available in one frame, it is not easy to keep the
number of false targets at an acceptable level. Therefore,
algorithms which process a sequence of image frames rather than a
single image frame are expected to perform better not only by
increasing the detection rate of the true target, but also by
reducing the number of false detections. The algorithms which are
used to detect targets in individual frames through pixel
classification are given with simulation results which represent
detection accuracy and level of tolerance to noise. Initial results
prompted developing target detection algorithms which base their
decision on the information extracted from a time sequence of
images, work that is currently underway at Polaris.
2.2 Figures of Merit
The figures of merit for the mathematical experiment were
probability of detection (Pd) and false alarm rate (FAR) as a
function of SNR. Probability of detection is measured by reporting
the number of target smears in the truth data that are co-located
with active pixels in the output mask, and is reported as a
fraction of targets detected. FAR is a count of active pixels per
output frame that are not on a pixel with signal in the input
image, and is reported as pixels per frame. The stated goals at the
outset were probability of detection of 80% at Signal to Noise
Ratio of 2.0 with less than 100 false alarm pixels being passed by
the algorithm. A restriction levied for our research was that the
algorithm should be able to be implemented in hardware at or near
the focal plane so as to allow for higher frame rates, lower
bandwidth, or larger format focal planes by passing only those
regions which likely contained signal to the downstream image
processing systems.
By minimizing the false alarm rate, the potential for reading
out a small fraction of the focal plane data becomes possible. If
only a fraction of the FPA data is read out, very high frame rates
can be achieved without resorting to using additional taps into the
dewar to read out more column data in parallel, a solution which
has thermal disadvantages for the system. If a minimum number of
false alarms (say, up to 50) could be tolerated and all true
targets could be identified, then the process would be a beneficial
image preprocessor for high data rate low SNR imaging
applications.
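The two figures of merit can be scored per frame as sketched below; `score_frame` is a hypothetical helper illustrating the definitions above (detection via co-location with the truth smear, false alarms as active pixels off the smear):

```python
import numpy as np

def score_frame(output_mask, truth_mask):
    """Score one binary output frame against a truth mask.

    Returns (detected, false_alarm_pixels): detected is True if any
    active output pixel is co-located with the target smear; false
    alarms are active pixels carrying no signal in the truth."""
    hits = output_mask & truth_mask
    detected = bool(hits.any())
    false_alarms = int((output_mask & ~truth_mask).sum())
    return detected, false_alarms
```

Averaging `detected` over many frames gives Pd as a fraction of targets detected; averaging `false_alarms` gives FAR in pixels per frame.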
In this paper, SNR is computed as the peak pixel counts produced by the PSF when the PSF is centered on a pixel, divided by the standard deviation of the image background. For a given SNR in the range of 0 to 6, we report a performance metric of single frame probability of detection and the number of false detections per 256x256 frame. The effort is focused on developing a robust algorithm able to operate on single frame imagery, applicable to image processing applications where the prior frame is not available or relevant. Examples of situations which could render image history of little help are FPA partial failure, thermal blooming, flash events, and process control of highly dynamic chemical systems. A
common thread through these applications is that all or part of a
frame can be saturated or of no interest. Additionally, in hardware
implementations near the FPA where there is no memory available and
no stored history of the image sequence, it is necessary to treat
each image frame as a problem unto itself. The developed algorithms
are resilient to these types of challenges. In addition, we
attempted to develop an approach to avoid a gain and offset
correction for focal plane non-uniformity. This desire arose from a
“wish” to alter the standard readout circuit on a focal plane array
with a simple algorithm that allowed image preprocessing without
complicated non-uniformity correction. Technology that puts analog to digital conversion within the readout circuit affords on-chip signal processing to improve data quality, preprocessing the data and eliminating fixed pattern noise within the readout.
3. TARGET DETECTION ALGORITHMS – SINGLE FRAME APPROACHES

The single frame approach is suitable if the target appears as a small blur of a single pixel where the potential target appears brighter than its surrounding pixels. The tolerance for false detections is offset by potential gains in frame rate or processing efficiency downstream. Six algorithms are presented and compared in terms of performance and computational efficiency. The first algorithm classifies a pixel as a target pixel or background pixel based on local pixel intensity statistics. The second algorithm is based on the box filter concept. Each of the next three algorithms bases its decision on moving averages of intensity of pixels centered about the pixel to be classified. The last algorithm uses the linking property of the Pulse Coupled Neural Network (PCNN) to identify potential targets. In all cases, the merits and the detractors are discussed. Due to success in Phase I of this effort, the PCNN was the main focus of our research; the alternative filter methodologies were studied to measure the effectiveness of the PCNN relative to these simple filter techniques and obtain an accurate indication of any performance enhancement available when using the PCNN.
A Note on Implementation

Global mean and standard deviation are expected to change very little from frame to frame. Therefore, their values can be computed using intensity data from a frame prior to the one used for the computation of local means and detection of potential target areas. In all the spatial filters presented, our assumption is that the global mean and standard deviation are computed in the prior frame and used for thresholding in the current frame. For an NxN input image, the computation of μ_spatial requires N^2 additions and one division. Computation of σ_spatial requires N^2 subtractions, N^2 additions, N^2 multiplications, one division and one square root operation.
3.1 Statistically Based Spatial Filtering Techniques
Local Mean with Global Threshold
During image readout, a global mean and standard deviation can
be computed. This can be compared to a local neighborhood of pixel
intensities. Based on the assumption that the blur of the point
target will be extended only across two or three pixels, a 3 by 3
kernel is convolved across the image, with the mean of the pixel
intensities in the neighborhood being assigned to the center pixel
of the convolution kernel. Every region that is significantly
bright relative to global mean and global standard deviation can be
classified as a potential target. The filter equations and a description of their use follow.
• Compute the single frame spatial mean μ_spatial and sigma σ_spatial
• μ_ij is the average of the 3x3 neighborhood S_ij surrounding the pixel of interest:
  μ_ij = (1/9) Σ_{(k,l) ∈ S_ij} S_kl
• Y_ij is the binary output:
  Y_ij = 1 if μ_ij > μ_spatial + 2 σ_spatial, and 0 otherwise
In our efforts we varied the threshold and found that 2 times the spatial sigma worked reliably. Pixels whose local mean was above this value were interpreted as potential targets. Processing the resulting binary image with a median filter can further eliminate single pixel false alarms.
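A minimal software sketch of the Local Mean with Global Threshold filter follows (an illustrative implementation of the equations above, not the hardware pipeline; the function name is ours):

```python
import numpy as np

def local_mean_global_threshold(image, k=2.0):
    """Mark pixels whose 3x3 local mean exceeds
    mu_spatial + k * sigma_spatial (k = 2 worked reliably)."""
    img = image.astype(float)
    mu, sigma = img.mean(), img.std()        # global statistics
    # 3x3 local mean by summing the nine shifted copies of the image
    local = sum(np.roll(np.roll(img, dr, 0), dc, 1)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 9.0
    out = local > mu + k * sigma
    # border pixels lack a full 3x3 neighborhood (np.roll wraps around)
    out[0, :] = out[-1, :] = out[:, 0] = out[:, -1] = False
    return out
```

In deployment the paper assumes the global mean and sigma come from the prior frame; here they are taken from the current frame for simplicity.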
Analysis: Let the input image be an array of N x N pixels. Computation of the global mean requires N^2 additions and one division. Computation of the global standard deviation requires N^2 multiplications, N^2 additions, one division, one subtraction and one square root operation. Therefore, computation of (μ_spatial + 2 σ_spatial) requires (2N^2 + 1) additions, (N^2 + 1) multiplications, one subtraction, and one square root operation. Computation of local means requires 9(N-2)^2 additions and (N-2)^2 divisions. Finally, (N-2)^2 comparisons are needed to quantize the input image. The numbers of additions, multiplications and comparisons depend on the image size and must be considered as basic operations. If addition and subtraction are considered equivalent, and multiplication and division are equivalent, then the method requires approximately 2N^2 multiplications, 2N^2 comparisons and 11N^2 additions.
Implementation of this algorithm is very simple. Local means can be computed in parallel using an array processor. However, this may not be a feasible approach for a real-time application. A better approach is to use a pipeline architecture to compute the local means as the image is read out. Once μ_ij is computed, Y_ij can be set to zero or one using a comparator.
The shift register architecture for the computation of μ_ij consists of a 3xN array of shift registers, where N is the number of columns in the image. The image at the focal plane is read out in raster scan order starting from pixel (0, 0) into the shift register array. After 3N clock cycles, the first three rows of the image will reside in the array of shift registers. At this time μ_11 is computed as the average of the nine pixels in the three rightmost columns of the shift register array. The comparator evaluates μ_11 against (μ_spatial + 2 σ_spatial), and sets Y_11 to 1 if μ_11 is greater than (μ_spatial + 2 σ_spatial). Otherwise, Y_11 is set to 0.

During the next clock cycle, the 3x3 array will be positioned over the pixel located at (1, 2), and the process will be repeated. The hardware will be idle when the kernel is positioned over border pixels, a side effect of the fact that there must be a full 3x3 neighborhood about the pixel being processed.
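The raster-scan pipeline just described can be modeled in software with a three-row buffer standing in for the shift registers (a behavioral sketch only, not a hardware description):

```python
from collections import deque

def stream_local_means(image_rows, ncols):
    """Pipeline sketch of the shift-register architecture: pixels arrive
    in raster-scan order; once three rows are buffered, emit the 3x3
    mean centered on each interior pixel of the middle buffered row."""
    rows = deque(maxlen=3)             # stands in for the 3xN shift registers
    for row in image_rows:
        rows.append(list(row))         # a new row completes; oldest row drops out
        if len(rows) == 3:
            for c in range(1, ncols - 1):
                window = [rows[r][c + dc]
                          for r in range(3) for dc in (-1, 0, 1)]
                yield sum(window) / 9.0
```

Each emitted mean would be fed to the comparator against (μ_spatial + 2 σ_spatial) from the prior frame.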
Box Filter with Local Threshold
As target pixels are expected to be brighter than the pixels surrounding them, this algorithm uses the global standard deviation and a box filter to identify target smears which are significantly brighter than surrounding pixels. The kernel shape for this filter can be seen in Figure 3 below.
Figure 3. Box Filter Convolution Kernel
where m and n specify the 24 border pixels in a 7x7 neighborhood about the pixel of interest.
• Compute the single frame spatial sigma σ_spatial
• μ_ij is the average of the 3x3 neighborhood surrounding the pixel of interest (center)
• b_ij is the average of the 24 border pixels on a box that is ±3 pixels from the pixel of interest (border)
• Y_ij is the binary output
Analysis: Computation of the local means requires 34(N-6)^2 additions and 2(N-6)^2 divisions. Also, (N-6)^2 comparisons are needed to obtain the binary image.
μ_ij = (1/9) Σ_{(k,l) ∈ S_ij} S_kl
b_ij = (1/24) Σ_{(m,n) ∈ B_ij} S_mn
Y_ij = 1 if μ_ij > b_ij + 2 σ_spatial, and 0 otherwise

where S_ij is the 3x3 center neighborhood and B_ij is the set of 24 border pixels.
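A direct (unoptimized) sketch of the box filter follows, comparing each 3x3 center mean against the 24-pixel border mean at offset 3; the function name and loop structure are illustrative, not the shift-register implementation:

```python
import numpy as np

def box_filter(image, sigma_spatial, k=2.0):
    """Box Filter with Local Threshold: flag pixels whose 3x3 center
    mean exceeds the 24-pixel border-ring mean (offset +/-3) by
    k * sigma_spatial."""
    img = image.astype(float)
    n = img.shape[0]
    out = np.zeros(img.shape, dtype=bool)
    # the 24 positions on the border of a 7x7 neighborhood
    ring = [(m, w) for m in range(-3, 4) for w in range(-3, 4)
            if max(abs(m), abs(w)) == 3]
    for i in range(3, n - 3):
        for j in range(3, n - 3):
            center = img[i-1:i+2, j-1:j+2].mean()
            border = np.mean([img[i+m, j+w] for m, w in ring])
            out[i, j] = center > border + k * sigma_spatial
    return out
```

A three-pixel boundary of the image cannot be evaluated, matching the (N-6)^2 operation counts in the analysis.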
Implementation of this algorithm makes use of the previous image's global sigma. The computation of the two local means can be implemented using a shift register architecture similar to the one described in the previous section, storing seven rows of raster scanned data. Because the convolution kernel is 7x7 in this filter, there is a boundary of three pixels around the border of the input image which cannot be evaluated.
Cross Average Filter
This algorithm uses the cross average filtering concept to identify blobs which are significantly brighter than surrounding pixels. The filter kernel can be seen in Figure 4 below.
Figure 4. Cross Average Filter Convolution Kernel
• μ_ij is the average of the outer pixels of a cross through the center of a 19x19 neighborhood surrounding the pixel of interest (center)
• Y_ij is the binary output
Analysis: Computation of the local values of the cross-average requires 20(N-20)^2 additions and (N-20)^2 divisions. Finally, (N-20)^2 comparisons are needed to transform the input image to a binary image.
Implementation of this algorithm does not require computation of
global mean or standard deviation. At each location, the local
cross-average can be computed using a shift register architecture
similar to the one previously described. In this case it is
necessary to store nineteen rows of image data.
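A software sketch of the cross average filter follows, implementing the two-stage scheme given in the paper's equations (a 3σ point test against the mean of the 20 outer cross pixels, then a 5-of-25 consolidation over a 5x5 kernel); the function name and loop form are illustrative:

```python
import numpy as np

def cross_average_filter(image, sigma_spatial):
    """Cross Average Filter: compare each pixel to the mean of the 20
    outer cross pixels (offsets 5..9 along rows and columns) of a
    19x19 neighborhood, then require 5 agreeing pixels in a 5x5 window."""
    img = image.astype(float)
    n = img.shape[0]
    # 20 cross positions: offsets +/-5..+/-9 vertically and horizontally
    offs = [(d, 0) for a in (5, 6, 7, 8, 9) for d in (a, -a)] + \
           [(0, d) for a in (5, 6, 7, 8, 9) for d in (a, -a)]
    T = np.zeros(img.shape, dtype=bool)
    for i in range(9, n - 9):
        for j in range(9, n - 9):
            mu = np.mean([img[i + r, j + c] for r, c in offs])
            T[i, j] = img[i, j] > mu + 3.0 * sigma_spatial
    Y = np.zeros_like(T)                       # consolidation stage
    for i in range(2, n - 2):
        for j in range(2, n - 2):
            Y[i, j] = T[i-2:i+3, j-2:j+3].sum() >= 5
    return Y
```

The gap of four pixels between the center and the cross arms keeps a smeared target out of its own background estimate.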
Line Average Filter
This algorithm uses the line average filtering concept to identify pixels which are significantly brighter than surrounding pixels. The orientation of the filter kernel can be horizontal (rows) or vertical (columns), and can be chosen based upon the noise characteristics of the data being filtered. The filter kernel can be seen in Figure 5 below. This filter is particularly good at finding signal embedded in raw imagery corrupted by column readout noise.
μ_ij = (1/10) Σ_{(k,l) ∈ L_ij} S_kl, where L_ij specifies the 10 pixels that reside in a 19-pixel line, at least 4 pixels from the pixel of interest
T_ij = 1 if S_ij > μ_ij + 3 σ_spatial, and 0 otherwise
Y_ij = 1 if Σ T_kl ≥ 5, where k and l specify a 5x5 kernel about the pixel of interest, and 0 otherwise
Figure 5. Line Average Filter Convolution Kernel
• µij is the average of the 10 outer pixels in a 19 pixel linear
neighborhood surrounding the pixel of interest
• Yij is the binary output
The corresponding equations for the Cross Average Filter of the previous section are:
μ_ij = (1/20) Σ_{(k,l) ∈ C_ij} S_kl, where C_ij specifies the 20 pixels that form a cross pattern in a 19x19 neighborhood, at least 4 pixels from the pixel of interest
T_ij = 1 if S_ij > μ_ij + 3 σ_spatial, and 0 otherwise
Y_ij = 1 if Σ T_kl ≥ 5, where k and l specify a 5x5 kernel about the pixel of interest, and 0 otherwise
Analysis: Once again, addition and comparison are basic operations. The computation of the local values of the line-average requires 10N(N-20) additions and N(N-20) divisions. Finally, N(N-20) comparisons are needed to transform the input image to a binary image. Implementation of this algorithm does not require computation of the global mean or standard deviation. At each location, the local line-average can be computed using a shift register architecture similar to the one previously described. In this case it is necessary to store only one row of image data.
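The first stage of the line average filter can be sketched as below (the 5x5 consolidation stage, shared with the cross filter, is omitted here for brevity; names are illustrative):

```python
import numpy as np

def line_average_filter(image, sigma_spatial, axis=1):
    """Line Average Filter, first stage: compare each pixel to the mean
    of the 10 outer pixels (offsets 5..9 on each side) of a 19-pixel
    line, oriented along rows (axis=1) or columns (axis=0)."""
    img = image.astype(float)
    if axis == 0:
        img = img.T                    # vertical orientation: work on columns
    n_rows, n_cols = img.shape
    offs = [d for a in (5, 6, 7, 8, 9) for d in (a, -a)]  # 10 outer offsets
    T = np.zeros(img.shape, dtype=bool)
    for i in range(n_rows):
        for j in range(9, n_cols - 9):
            mu = np.mean([img[i, j + d] for d in offs])
            T[i, j] = img[i, j] > mu + 3.0 * sigma_spatial
    return T if axis == 1 else T.T
```

Because only one row (or column) of history is needed, this is the cheapest of the spatial filters to realize in a readout pipeline.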
Post Processing Spatially Filtered Data with Median Filter
In general, implementation of the median filter in real time is
a challenging task as it requires sorting. However, when the image
to be filtered is a binary image, it is possible to implement
median filter using high speed hardware for real-time operation. A
shift register as wide as three rows of the output image that is
capable of storing 1 bit of data per pixel, and the ability to add
9 of those pixel values is all that is required. The 9 pixels about
the pixel in question are added together. If 5 or more pixels in
the 3x3 neighborhood are on then the pixel value is turned on,
otherwise it is set to zero. If the goal is to compute the median based on a 5x5 window, then the size of the shift register array and the threshold for the comparator change to five rows and 13, respectively. In fact, the above architecture can easily be modified to find blobs of desired size and shape in binary images.
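On a binary image the median reduces to a majority vote (5 of 9 for a 3x3 window, 13 of 25 for 5x5), which is just an add-and-compare; a software sketch of that hardware-friendly form:

```python
import numpy as np

def binary_median(mask, size=3):
    """Median filter on a binary image: a pixel stays on only when the
    majority of its size x size neighborhood is on."""
    m = mask.astype(np.uint8)
    n_rows, n_cols = m.shape
    half = size // 2
    need = size * size // 2 + 1        # majority count: 5 for 3x3, 13 for 5x5
    out = np.zeros(m.shape, dtype=bool)
    for i in range(half, n_rows - half):
        for j in range(half, n_cols - half):
            out[i, j] = m[i-half:i+half+1, j-half:j+half+1].sum() >= need
    return out
```

Isolated single-pixel false alarms are removed while solid target smears survive.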
3.2 Pulse Coupled Neural Network

In the past, researchers have used PCNNs for image smoothing, image segmentation, feature extraction, and region of interest detection [6]. Our own prior
efforts have demonstrated superior image processing results in a
variety of applications ranging from mine detection to medical
image processing. The PCNN design is based on the visual cortex of
the cat, and has been shown to be useful in a variety of
applications by Eckhorn et al [2], Banish and Ranganath [3],
Kuntimad [4], and Kinser [6] among others. The model is based on
both feeding inputs and linking inputs to each neuron. The feeding
inputs are weighted portions of the input scene, while the linking inputs are weighted portions of the neuron's neighbors. Each neuron
that fires will contribute to its neighbor some amount of energy
which will increase the likelihood that it will fire on a
successive pulse. By choosing the correct set of linking and
feeding weights and kernel sizes as well as the threshold decay
rate the PCNN can be used very effectively to segment regions of
interest from clutter or high noise backgrounds [4]. These parameters are typically set by trial and error; a systematic, heuristic means of determining the weights was not evident.
Image processing can be done using PCNN by mapping each pixel
value as an input to a PCNN neuron in the network such as seen in
Figure 6. Most image processing work with PCNNs, including Banish
and Ranganath [3], Kuntimad [4], and Kinser [6] among others, was
predicated on finding a set of network coefficients that produced
the desired segmentation in a single pulse. Our classical PCNN uses feeding and linking inputs to each neuron, as described above. Groups of neurons that are spatially near one another and at approximately the same intensity will tend to fire in the same pulse. Neurons firing together excite
contiguous regions, and small spatial gaps within the region are
spanned. Global thresholds decay at each pulse producing a set of
binary outputs that identify regions within the input that can be
segmented even if the input intensities are not as bright as the
background clutter within the scene [4].
Figure 6. Connectivity of a PCNN; each input image pixel is mapped to a neuron 1:1
The mathematical formalism is depicted in the set of equations
as shown below.
F_ij(n) = e^{-α_F} F_ij(n-1) + S_ij + V_F Σ_{kl} W_ijkl Y_kl(n-1)
L_ij(n) = e^{-α_L} L_ij(n-1) + V_L Σ_{kl} M_ijkl Y_kl(n-1)
U_ij(n) = F_ij(n) [1 + β L_ij(n)]
Y_ij(n) = 1 if U_ij(n) > Θ_ij(n-1), and 0 otherwise
Θ_ij(n) = e^{-α_Θ} Θ_ij(n-1) + V_Θ Y_ij(n)
• S is the input signal intensity, (n is time, i is row, j is
column)
• F is the feeding signal
• L is the linking signal
• U is the internal activity
• Y is the binary image output state
• Θ is the dynamic threshold
• V_F, V_L and V_Θ are normalizing constants
• M and W are the synaptic weight kernels through which neighboring neurons communicate, typically 1/r or 1/r^2
• α_F, α_L and α_Θ are decay constants

Practically, groups of neurons that are spatially near one another and at approximately the same intensity tend to fire in the same pulse, exciting contiguous regions and spanning small spatial gaps, as described above. Unlike other neural networks, the neuron weights in a
PCNN are not solved by training. Linking and feeding weights and
kernel sizes can be selected by trial and error to empirically
solve for a network that will produce desirable results, such as
identifying features within the image. However, this proves to be a
time consuming, if not frustrating, exercise that does not
generally produce a result that excels at segmenting only features
of interest while rejecting background noise or clutter in all
circumstances. The network is evaluated iteratively, and there is a
state for each of the network equations that can be examined
following each of these “pulses”. The threshold on each pixel
decays until the internal activity for the pixel exceeds the
threshold, at which time the threshold is boosted for the
subsequent pulse. If the cycle were to repeat indefinitely the
output on each pixel would eventually oscillate.
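The pulse iteration can be sketched as follows. The parameter values and the 3x3 1/r kernel below are illustrative assumptions for a small demonstration, not the coefficients the authors obtained; `pcnn` is our own helper name:

```python
import numpy as np

def pcnn(S, n_pulses=4, beta=0.2, vf=0.1, vl=0.5, vt=20.0,
         af=0.3, al=0.3, at=0.2):
    """One neuron per pixel; 3x3 linking/feeding kernel with 1/r
    weights and no self term.  Returns Y after the final pulse."""
    K = np.array([[0.707, 1.0, 0.707],
                  [1.0,   0.0, 1.0],
                  [0.707, 1.0, 0.707]])
    F = np.zeros_like(S, dtype=float)
    L = np.zeros_like(F)
    Y = np.zeros_like(F)
    theta = np.full_like(F, S.max())          # threshold decays from the peak

    def conv(y):                              # sum of weighted neighbor firings
        out = np.zeros_like(y)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                out += K[dr + 1, dc + 1] * np.roll(np.roll(y, dr, 0), dc, 1)
        return out

    for _ in range(n_pulses):
        link = conv(Y)
        F = np.exp(-af) * F + S + vf * link   # feeding: decays, driven by S
        L = np.exp(-al) * L + vl * link       # linking: decays, driven by Y
        U = F * (1.0 + beta * L)              # internal activity
        Y = (U > theta).astype(float)         # fire where activity beats threshold
        theta = np.exp(-at) * theta + vt * Y  # decay, then boost fired neurons
    return Y.astype(bool)
```

As in the paper's heuristic, the output is taken at a fixed pulse (here the fourth) rather than run to oscillation.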
Our experience reveals the following algorithm features:
• Locates spatially contiguous regions of signal
• Tolerant to spatial noise
• Few control parameters, which can be determined heuristically
• For this application we forced the outcome at a given pulse (4) as a part of the heuristic
• It is the only method to achieve perfect segmentation when the intensity ranges of regions and background overlap
As an alternative to manually tuning the PCNN for each use, this
group developed a framework for determining the PCNN coefficients,
linking and feeding kernel sizes and weights, and the number of
PCNN pulses based on the use of a Genetic Algorithm (GA). The GA
uses a representative data set and a scoring heuristic to find a
PCNN solution that will work on data that is statistically similar
to that used in training. Through the course of this research it was determined that, in addition to a quality GA using innovative breeding and sub-populations, the quality of the PCNN solution was heavily influenced by the data set provided for training, and by
the scoring heuristic used to judge the fitness of a particular
PCNN individual that had been bred by the GA. Among the innovations was the ability to force the solution to segment the target from the image on a specific PCNN pulse. These separate discoveries
addressed previous difficulties in using PCNN for segmentation –
the inability to predict when the segmentation would occur, and the
need to manually select a set of coefficients and weights to
perform the segmentation.
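The shape of such a search can be sketched with a minimal real-valued GA (tournament selection, elitism, Gaussian mutation). This is a generic stand-in, not the companion paper's GA, which additionally uses innovative breeding and sub-populations; in practice `fitness` would run the PCNN on training imagery and apply the scoring heuristic:

```python
import numpy as np

def genetic_search(fitness, bounds, pop=20, gens=40, seed=0):
    """Maximize fitness over a real-valued genome within bounds.
    bounds is a list of (low, high) pairs, one per parameter."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    P = rng.uniform(lo, hi, size=(pop, len(bounds)))
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in P])
        nxt = [P[scores.argmax()].copy()]            # elitism: keep the best
        while len(nxt) < pop:
            i, j = rng.choice(pop, 2, replace=False)  # tournament of two
            parent = P[i] if scores[i] > scores[j] else P[j]
            child = parent + rng.normal(0.0, 0.1 * (hi - lo))  # mutation
            nxt.append(np.clip(child, lo, hi))
        P = np.array(nxt)
    scores = np.array([fitness(ind) for ind in P])
    return P[scores.argmax()]
```

For PCNN tuning, the genome would hold the linking/feeding weights, kernel sizes, decay constants, and pulse count, with the fitness penalizing false alarm pixels beyond the allowable budget.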
Figure 7. Low contrast input imagery used in this research (left
two images), and PCNN output image (right)
Moreover, we have found that this implementation works across a
family of focal planes. Solutions for both InSb and MCT systems
were tested, and algorithmic results were similar. Sample input
imagery and the results of the PCNN processing on it are shown in
Figure 7 above. The left images are actually the same image, shown
with different scales. The rightmost image is the PCNN output with
pixels on target shown in green, and false alarm pixels shown in
yellow.
4. RESULTS

To determine the value of this research, our PCNN is compared with segmentation results from the spatial filter algorithms in a low signal-to-noise-ratio experiment. Results of each method can be seen in Figure 8 below. Figure 9 shows the false alarm performance of each method tested. Note that two PCNNs are reported. For all algorithms, the false alarm rate is practically invariant to SNR.
Figure 8. Single Frame Probability of Detection for a Variety of Algorithms. (Plot of Pd versus SNR from 0 to 6 for: PCNN trained with 50 allowable false alarms, Local Mean, Box Filter, Cross Average, Line Average, Vertical Line Average, and PCNN trained with 250 allowable false alarms.)
Figure 9. False alarm pixels per frame for a variety of algorithms (plotted versus SNR from 0 to 6 for the Box Filter, Cross Average, Line Average, Vertical Line Average, Local Mean, and the PCNNs trained with 50 and 250 allowable false alarms). Shown are the false alarm pixels for the spatial filters and for the PCNN solved using the Genetic Algorithm. Note that the false alarm pixels per frame for the PCNN are approximately equal to the allowable false alarms per frame used in the GA.
4.1 Comparison of Single Frame Algorithms
We found through this research that a variety of methods can be applied to detect point targets in infrared imagery. The key hurdle to overcome appears to be that the image
statistics – background mean and image standard deviation – must
remain somewhat constant for the segmentation filters to work
consistently. Focal plane non-uniformity is a large problem when
considering single frames of imagery, and for our research this was
overcome by simply doing a background subtraction from the current
image using an average background frame collected while looking at
the same background with the same integration time and image frame
rate. This simple correction resulted in a sufficiently uniform
image to allow for successful segmentation using a variety of
techniques.
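The background-subtraction step described above can be sketched as follows; the k-sigma threshold used for the final segmentation rule is one illustrative choice, not the specific rule from this work:

```python
import numpy as np

def subtract_background(frame, bg_frames):
    """Subtract an average background frame (collected while viewing the
    same background, with the same integration time and frame rate) to
    flatten focal-plane non-uniformity in a single frame."""
    bg = np.mean(np.asarray(bg_frames, dtype=float), axis=0)
    return frame.astype(float) - bg

def threshold_segment(residual, k=3.0):
    """Flag pixels more than k standard deviations above the residual mean."""
    return residual > residual.mean() + k * residual.std()
```

Because the residual image has a roughly constant mean and standard deviation, the segmentation filters that depend on stable image statistics can then work consistently.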
Our research has resulted in a number of spatial filtering
techniques that can easily be implemented in a shift register
hardware configuration that is suitable for implementation near the
focal plane. These hardware solutions can be implemented
immediately post readout, and the background subtraction can be
performed using a small memory device and an address sequencer as
the pixels are converted and read off the focal plane.
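As one example of such a filter, a line-average contrast filter streamed through a shift-register-style buffer might look like the sketch below; the window size and the contrast measure (centre pixel minus neighbour mean) are assumptions for illustration, and a hardware version would hold the taps in registers as pixels are read off the focal plane:

```python
import numpy as np
from collections import deque

def line_average_filter(row, half=3):
    """Stream one readout row through a (2*half + 1)-tap buffer and emit,
    for each fully covered pixel, its contrast against the mean of its
    line neighbours. Border pixels are left at zero."""
    n = 2 * half + 1
    taps = deque(maxlen=n)          # the shift register
    out = np.zeros(len(row), dtype=float)
    for i, px in enumerate(row):
        taps.append(float(px))
        if len(taps) == n:          # register full: centre pixel is i - half
            window = list(taps)
            centre = window[half]
            neighbours = (sum(window) - centre) / (n - 1)
            out[i - half] = centre - neighbours
    return out
```

The box, cross, and vertical-line variants differ only in which neighbours feed the running mean, which is why all of them map onto the same near-FPA register structure.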
The PCNN solution performed the best, although its complexity makes it more suitable for a software implementation. The PCNN solution found by the GA uses floating point coefficients repeatedly throughout the network, which are not easily implemented in hardware, and the network must be evaluated iteratively, making it less suitable for a hardware implementation than a single-pass spatial filter. We have considered using a GA to find a PCNN solution restricted to small integer coefficients, which would allow for an analog circuit implementation of the neuron at the unit cell level of the focal plane, but that exercise has been deferred to a later date.
Additionally, the scoring heuristic greatly affects the quality of the results obtained by the PCNN. Over the course
of this effort it was determined that being too restrictive on
false alarm pixels resulted in a PCNN solution that had a very poor
probability of detection since the solution was rewarded for having
very few pixels on in the output image. Likewise, having a
heuristic that rewarded target detection disproportionately to low
false alarms resulted in a PCNN solution that simply turned on a
lot of pixels in the output image, thus guaranteeing a high
detection rate. The optimum tradeoff seemed to be in tolerating
some number of false alarm pixels before heavily penalizing the
PCNN solution. The result was almost always a PCNN that had
approximately that number of false alarm pixels in each output
image. The best PCNN solutions that we found using the GA had
around 250 allowable false alarm pixels per frame. As Figure 9 above shows, this was the case regardless of the SNR of the input imagery. This number can be reduced by further processing the
output using a median filter. Any further signal conditioning of
the output to remove false alarm pixels will likely need to be done
via temporal processing.
The simple filters tested in this study performed as well as the PCNN once a satisfactory SNR was achieved. If the input data
being segmented is sufficiently bright then these simple filters
should be considered. If the data set in question requires
detection in the range of signal to noise ratio of 1.5 to 3 then
the PCNN solution determined by the GA is a recommended option.
The use of these algorithms near the focal plane will allow for
the possibility of data thinning by passing only regions of
interest to the downstream data processing systems which track or
characterize targets, allowing for either higher data rates or
larger imaging systems, both of which would be difficult to
implement if processing every pixel were required.
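Such region-of-interest thinning might be sketched as follows; the chip size and the (origin, chip) output format are assumptions for illustration:

```python
import numpy as np

def thin_to_rois(img, mask, half=4):
    """Ship only small image chips around detected pixels downstream,
    instead of the full frame. `mask` is the segmentation output; each
    ROI is returned as its (row, col) origin plus the pixel chip."""
    rois = []
    for r, c in zip(*np.nonzero(mask)):
        r0, r1 = max(0, r - half), min(img.shape[0], r + half + 1)
        c0, c1 = max(0, c - half), min(img.shape[1], c + half + 1)
        rois.append(((r0, c0), img[r0:r1, c0:c1].copy()))
    return rois
```

For a large-format focal plane with a handful of detections, the downstream tracker then receives a few small chips per frame rather than every pixel.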
Although the probability of detection is good for all the
presented approaches, the number of false alarm pixels reported is
unacceptably high for most applications to use the filter output
directly. In several cases the use of a median filter on the output
of the presented algorithm is suggested as a means to eliminate
stray single pixel false alarms. While this method does improve the
false alarm rate for each algorithm, it also negatively impacts the
probability of detection by inadvertently deleting true target
detections from the segmented output imagery.
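The median-filter cleanup can be sketched as below; note, as discussed, that isolated single-pixel false alarms vanish but small true detections can be lost along with them:

```python
import numpy as np

def median_clean(mask, size=3):
    """Apply a size x size median filter to a binary segmentation mask.
    A pixel survives only if at least half of its neighbourhood is on,
    so lone false-alarm pixels are removed."""
    h = size // 2
    P = np.pad(mask.astype(int), h)           # zero-pad the borders
    out = np.zeros(mask.shape, dtype=bool)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            out[i, j] = np.median(P[i:i + size, j:j + size]) > 0
    return out
```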
Sample output imagery produced by the algorithms presented can
be seen in Figure 10. The input image and truth image are shown for
reference. The data in this case represents an SNR of 2.5, with
targets having a blur circle of approximately 9 to 16 pixels, with
only 2 of those being above the noise floor of the image. Algorithm
outputs are shown.
(input image) (truth)
(local mean) (box filter) (cross average)
(line average) (matched filter) (PCNN)
Figure 10. Sample image segmentation results for a variety of techniques, including spatial filtering, correlation, and a Pulse Coupled Neural Network. Shown are results for input imagery at SNR 2.5, which is below the capability of the spatial filters; at SNR 4 they all work equally well.
4.2 Research in Progress
Following this research on segmentation of relatively stationary
point targets in infrared imagery we have begun to address the
problem of point targets that have been smeared in the image due to
motion of the target or imaging platform. Addressing only the case
where the target motion can be ignored and smear is due to a change
in pose of the imaging system during integration, a matched filter
approach can be used to detect segments within the image that match
the predicted shape of the streak based upon a sequence of
measurements indicating camera pose. Using these measurements a
correlation kernel can be constructed and convolved across the
input image. Peaks in the image can then be evaluated against a
threshold to find likely target segments in the image. A variety of
thresholding techniques can be applied to find likely targets in
the correlation image. Polaris Sensor Technologies will address one
of these techniques in a companion paper to this one, DSS08-
6967-22.
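The streak matched filter can be sketched as follows; constructing the kernel from a single per-frame image-plane motion estimate (dx, dy) is a simplification that ignores the point spread function and any curvature in the pose history:

```python
import numpy as np

def streak_kernel(dx, dy, steps=16):
    """Build a normalized correlation kernel shaped like the smear streak
    predicted from camera-pose measurements: a line covering (dx, dy)
    pixels of image-plane motion during integration."""
    h = int(abs(round(dy))) + 1
    w = int(abs(round(dx))) + 1
    k = np.zeros((h, w))
    for t in np.linspace(0.0, 1.0, steps):
        r = int(round(t * (h - 1))) if dy >= 0 else int(round((1 - t) * (h - 1)))
        c = int(round(t * (w - 1))) if dx >= 0 else int(round((1 - t) * (w - 1)))
        k[r, c] = 1.0
    return k / k.sum()

def correlate(img, k):
    """Naive valid-mode 2-D correlation of the image with the kernel;
    peaks mark segments that match the predicted streak shape."""
    H, W = img.shape
    h, w = k.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * k)
    return out
```

Peaks in the correlation image are then evaluated against a threshold, using any of the thresholding techniques discussed above, to find likely target segments.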
In addition to further research using shaped kernels to detect
streaks, Polaris is addressing a PCNN approach that utilizes
multiple frames of source imagery as input. By considering a set of
frames and the relative motion of the imaging system legitimate
targets can be separated from artifacts that may appear to be of
interest in only a single frame. This work is ongoing, but shows
promise to provide a means to detect small targets without image
stabilization techniques.
When the target appears small, consisting of only a few pixels, it is common to have a high number of false targets in the segmented image. This
is because all locally bright points need to be considered as
potential targets. However, if frames in the time sequence of
images are transformed to binary form, spatially registered and
stacked to form a 3-D binary image then the pixels that represent a
true object will form a curve of significant length in the 3-D
binary image. As noise is expected to appear at random locations, streaks that are persistent and smooth should be scored more highly when deciding which detections are targets and which are noise. Because of
the effects of object visibility and digitization – both related to
intensity of the target as it slews across detectors, and due to
blanking time during readout – the curve may not be perfectly smooth and may even have small breaks. Techniques to link streaks
from consecutive frames are being addressed, but must still be
perfected to be useful.
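The 3-D binary stacking idea can be sketched as a persistence score over registered frames; integer-shift registration via np.roll (which wraps at the image borders) is a simplifying assumption standing in for real spatial registration:

```python
import numpy as np

def persistence_score(binary_frames, shifts):
    """Register each binary frame by its known platform shift (dr, dc),
    stack the registered frames, and score each spatial location by how
    many frames it stays lit. A true object traces a persistent curve
    through the 3-D binary volume; random noise does not accumulate."""
    acc = np.zeros(binary_frames[0].shape, dtype=int)
    for frame, (dr, dc) in zip(binary_frames, shifts):
        acc += np.roll(frame.astype(int), (-dr, -dc), axis=(0, 1))
    return acc
```

Thresholding the persistence score then separates targets from single-frame artifacts; bridging the small breaks caused by visibility and readout blanking would require the streak-linking techniques still being perfected.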
5. REFERENCES
[1] D. L. Webb, L. L. Hung, and D. F. Elliott, "Algorithms for EO Sensor SNR Enhancement and Smoothing," Conference Record of the Twenty-Ninth Asilomar Conference on Signals, Systems and Computers (ASILOMAR '95), IEEE, 1995.
[2] R. Eckhorn, H. J. Reitboeck, M. Arndt, and P. Dicke, "Feature Linking via Synchronization among Distributed Assemblies: Simulations of Results from Cat Visual Cortex," Neural Computation, Vol. 2, pp. 293-307, 1990.
[3] M. R. Banish and H. S. Ranganath, "Neural network based element, image pre-processor, and method of pre-processing using a neural network," United States Patent number 20030076992, 2003.
[4] G. Kuntimad, “Pulse coupled neural networks for image
processing”, Ph.D. dissertation, Computer Science department, The
University of Alabama in Huntsville, 1995.
[5] G. Kuntimad and H. S. Ranganath, "Perfect Image Segmentation using Pulse Coupled Neural Networks," IEEE Transactions on Neural Networks, Vol. 10, pp. 591-599, May 1999.
[6] T. Lindblad and J. M. Kinser, Image Processing Using Pulse-Coupled Neural Networks, Second Revised Edition, Springer, 2005.