1. INTRODUCTION

1.1 Information Fusion

Pixel-level image fusion defines the process of fusing visual information from

a number of registered images into a single fused image. It is part of the much

broader subject of multisensor information fusion, which has attracted a considerable

amount of research attention in the last two decades.

Multisensor information fusion utilizes information obtained from a number

of different sensors surveying an environment. The aim is to achieve better situation

assessment and more rapid and accurate completion of a pre-defined task than would

be possible using any of the sensors individually. The only formal definition of

information fusion (data fusion) to date, is that given by the U.S. Department of

Defense, Joint Directors of Laboratories Data Fusion Subpanel which represents the

first formal body explicitly dealing with the process of data fusion. Their definition

reads: a multilevel, multifaceted process dealing with the automatic

detection, association, correlation, estimation and combination of data and

information from multiple sources.

Image fusion represents a specific case of multisensor information fusion in

which all the information sources used represent imaging sensors. Information fusion

can be achieved at any level of the image information representation. Image fusion is

usually performed at one of the three different processing levels: signal, feature and

decision. Signal-level image fusion, also known as pixel-level image fusion,

represents fusion at the lowest level, where a number of raw input image signals are

combined to produce a single fused image signal. Object level image fusion, also

called feature level image fusion, fuses feature and object labels and property

descriptor information that have already been extracted from individual input images.

Finally, the highest level, decision or symbol level image fusion represents fusion of

probabilistic decision information obtained by local decision makers operating on the

results of feature level processing on image data produced from individual sensors.

Figure 1.1 illustrates a system using image fusion at all three levels of processing.


[Figure 1.1 (block diagram): Sensors 1 and 2 observe the scene and produce input signals 1 and 2; the signals are combined by pixel-level fusion, feature extraction yields feature vectors that are combined by feature-level fusion into a fused feature vector, and local decision makers produce decision vectors that are combined by symbol-level fusion into a fused decision.]

Figure 1.1: An example of a system using information fusion at all three processing levels.

The aim would be to detect and correctly classify objects in a presented scene.

The two sensors (1 and 2) survey the scene and register their observations in the form

of image signals. The two images are then pixel-level fused to produce a third, fused

image and are also passed independently to local feature extraction processes. The

fused image can be directly displayed for a human operator to aid better scene

understanding or used in a further local feature extractor.


Feature extractors act as simple automatic target detection systems, including

processing elements such as segmentation, region characterization, morphological

processing and even neural networks to locate regions of interest in the scene.

Decision level fusion is performed on the decisions reached by the local

classifiers, on the basis of the relative reliability of individual sensor outputs and the

fused feature set. Fusion is achieved using statistical methods such as Bayesian

inference and the Dempster-Shafer method with the aim of maximizing the

probability of correct classification for each object of interest. The output of the

whole system is a set of classification decisions associated with the objects found in the

observed scene.

1.2 Project Objectives

The objectives of the project work are:

1. The design of improved performance pixel-level image fusion algorithms,

when compared with existing schemes in terms of:

i) Minimizing information loss and distortion effects and

ii) Reducing overall computational complexity.

2. The design of perceptually meaningful objective measures of pixel-level

image fusion performance.

1.3 Types of Image Fusion Technique

Image fusion methods can be broadly classified into two categories: spatial domain fusion and transform domain fusion. Fusion methods such as averaging, the Brovey method, principal component analysis (PCA) and IHS based methods fall under the spatial domain approaches. Another important spatial domain fusion method is the high-pass filtering based technique, in which the high-frequency details are injected into an up-sampled version of the multispectral (MS) images. The disadvantage of spatial domain approaches is that they produce spatial and spectral distortion in the fused image; spectral distortion becomes a serious drawback when the fused image is used for further processing, such as classification. Such distortion is handled well by transform domain approaches to image fusion. Multiresolution analysis has become a very useful tool for analysing remote sensing images, and the discrete wavelet transform in particular has become a very useful tool for fusion. Other fusion methods also exist, such as Laplacian pyramid


based and curvelet transform based fusion. These methods show better performance in the spatial and spectral quality of the fused image compared to the spatial domain methods of fusion.

The images used in image fusion should already be registered. Misregistration is a

major source of error in image fusion. Some well-known image fusion methods are:

1. High pass filtering technique

2. IHS transform based image fusion

3. PCA based image fusion

4. Wavelet transform image fusion

5. Pair-wise spatial frequency matching

1.4 Application of Image fusion

1. Image Classification

2. Aerial and Satellite imaging

3. Medical imaging

4. Robot vision

5. Concealed weapon detection

6. Multi-focus image fusion

7. Digital camera application

8. Battle field monitoring

1.5 Medical Image Fusion

Medical imaging has become increasingly important in medical analysis and diagnosis. Different medical imaging techniques such as X-rays, computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) provide different perspectives on the human body that are important in the diagnosis of diseases or physical disorders. For example, CT scans provide high-resolution information on bone structure while MRI scans provide detailed information on tissue types within the body. Therefore, an improved


understanding of a patient’s condition can be achieved through the use of different imaging modalities. A powerful technique used in medical imaging analysis is medical image fusion, where streams of information from medical images of different modalities are combined into a single fused image.

In the fused image of an MRI scan and a CT scan, both the bone structure and the tissue structure can be clearly identified in a single image. Therefore, image fusion allows a physician to obtain a better visualization of the patient’s overall condition.

1.6 Pixel-Level Image Fusion

Medical image fusion usually employs the pixel level fusion techniques.

Pixel-level image fusion represents fusion of visual information of the same scene,

from any number of registered image signals, obtained using different sensors. The

goal of pixel-level image fusion can broadly be defined as:

To represent the visual information present in any number of input images, in

a single fused image without the introduction of distortion or loss of information.

In simpler terms, the main condition for successful fusion is that “all” visible

information in the input images should also appear visible in the fused image. In

practice, however, the complete representation of all of the visual information from a

number of input images into a single one is almost impossible.

Thus, the practical goal of pixel-level image fusion is modified to the fusion,

or preservation in the output fused image, of the “most important” visual information

that exists in the input image set.

The main requirement of the fusion process then, is to identify the most

significant features in the input images and to transfer them without loss into the

fused image. What defines important visual information is generally application

dependent. In most applications, and in image fusion for display purposes in

particular, it means perceptually important information.

A simple diagram of a system using pixel-level image fusion is shown in the

block diagram in Figure 1.2. For simplicity, only two imaging sensors survey the

environment, producing two different representations of the same scene.


The representations of the environment are, again, in the form of image

signals which are corrupted by noise arising from the atmospheric aberrations, sensor

design, quantization, etc.

The image signals produced by the sensors are input into a registration

process, which ensures that the input images to the fusion process correspond

spatially, by geometrically warping one of them.

Multisensor image registration is another widely researched area. In Figure

1.2, the registered input images are fused and the resulting fused image can then be used directly for display purposes or passed on for further processing (see Figure 1.1).

[Figure 1.2 (block diagram): Sensors 1 and 2 observe the environment and produce noise-corrupted image signals A and B; the images are spatially aligned by image registration, combined by image fusion into a fused image, and the result is passed to a display or to further processing.]

Figure 1.2: Basic structure of a multisensor system using pixel-level image fusion.

The pixel-level image fusion work presented in this report assumes that the

input images meet a number of requirements. Firstly, input images must be of the

same scene, i.e. the fields of view of the sensors must contain a spatial overlap.

Furthermore, inputs are assumed to be spatially registered and of equal size and

spatial resolution. In practice, resampling one of the input images often satisfies size

and resolution constraints.


2. LITERATURE SURVEY

[1] Firooz Sadjadi Lockheed Martin Corporation, [email protected]

“Comparative image fusion analysis”

In this they have proposed the results of a study to provide a quantitative

comparative analysis of a typical set of image fusion algorithms. The results were

based on the application of these algorithms on two sets of collocated visible (electro-

optic) and infrared (IR) imagery. The quantitative comparative analysis of their

performances was based on using 5 different measures of effectiveness. These metrics

were based on measuring information content and/or measures of contrast. The results

of this study indicate that the comparative merit of each fusion method is very much

dependent on the measures of effectiveness being used. However, many of the fusion

methods produced results that had lower measures of effectiveness than their input

imagery. The highest relative MOE values were associated with the Fechner-Weber

and Entropy measures in both sets. Fisher metrics showed large values mainly due to

low pixel variances in the target background areas.

[2] V.P.S. Naidu and J.R. Raol National Aerospace Laboratories, Bangalore.

“Pixel-level Image Fusion using Wavelets and Principal Component Analysis”

Pixel-level image fusion using wavelet transform and principal component

analysis are implemented in PC MATLAB. Different image fusion performance

metrics with and without reference image have been evaluated. The simple averaging

fusion algorithm shows degraded performance. Image fusion using wavelets with

higher level of decomposition shows better performance in some metrics while in

other metrics, the PCA shows better performance.

[3] Stavri Nikolov, Paul Hill, David Bull, Nishan Canagarajah, Image Communications Group, Centre for Communications Research, University of Bristol, Merchant Venturers Building, Woodland Road, Bristol BS8 1UB, UK, [email protected], [email protected], “Wavelets For Image Fusion”

In this they have compared some newly developed wavelet transform fusion

methods with existing fusion techniques.


For an effective fusion of images a technique should aim to retain important

features from all input images. These features often appear at different positions and

scales.

Multiresolution analysis tools such as the wavelet transform are therefore ideally suited to image fusion. Simple non-multiresolution methods for image fusion, such as averaging and PCA, have produced limited results, whereas wavelet fusion schemes have many special advantages and benefit from a well understood theoretical background. Many image processing steps, for example denoising, contrast enhancement, edge detection, segmentation, texture analysis and compression, can be easily and successfully performed in the wavelet domain. Wavelet techniques thus provide a powerful set of tools for image enhancement and analysis, together with a common framework for various fusion tasks.


3. SYSTEM ANALYSIS

3.1 Existing Work

Fusion of two medical and normal images using the wavelet transform.

3.1.1 Drawback In Existing Work

The wavelet transform has two main disadvantages:

1. Lack of shift invariance, which means that small shifts in the input signal can cause major variations in the distribution of energy between DWT coefficients at different scales.

2. Poor directional selectivity for diagonal features, because the wavelet filters are separable and real.

3.2 Proposed Work

Fusion of two medical and normal images using a multiresolution algorithm, gradient fusion, in order to overcome these drawbacks of the wavelet transform and to determine which method is best suited for medical images and normal images.


4. DIGITAL IMAGE PROCESSING

4.1 Image

An image is a two-dimensional function that represents a measure of some characteristic, such as brightness or colour, of a viewed scene. An image is the projection of a 3D scene onto a 2D projection plane. It can be defined as a two-variable function f(x,y), where for each position (x,y) in the projection plane, f(x,y) defines the light intensity at this point.

4.2 Analog Image

An analog image can be mathematically represented as a continuous range of

values representing position and intensity. An analog image is characterized by a

physical magnitude varying continuously in space. For example, the image produced

on screen of a CRT monitor is analog in nature.

4.3 Digital Image

A digital image is composed of picture elements called pixels. Pixels are the

smallest sample of an image. A pixel represents the brightness at one point.

Conversion of an analog image into a digital image involves two important

operations, namely sampling and quantization, which are illustrated in Figure 4.1.

Analog image → Sampling → Quantisation → Digital image

Figure 4.1: Digital image from an analog image
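As a small illustration of these two operations, the Python sketch below (a toy example; the dense 1-D "analog" signal is simulated, which is an assumption made purely for demonstration) samples the signal and then quantizes the samples to 256 grey levels.

```python
import numpy as np

# Simulated "analog" signal: a densely evaluated 1-D intensity profile in [0, 1].
t = np.linspace(0.0, 1.0, 10000)
analog = 0.5 + 0.4 * np.sin(2 * np.pi * 5 * t)

# Sampling: keep every k-th value (spatial discretization).
k = 40
sampled = analog[::k]

# Quantization: map each sample to one of 256 discrete grey levels.
digital = np.round(sampled * 255).astype(np.uint8)

print(digital[:10])  # first few pixel values of the digital signal
```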

4.4 Advantages Of Digital Images

The advantages of digital images are summarized below:

1. The processing of image is faster and cost effective.

2. Digital image can be effectively stored and efficiently transmitted from one

place to another.

3. When shooting a digital image, one can immediately see if the image is good or not.


4. Copying a digital image is easy. The quality of a digital image will not be degraded even if it is copied several times.

5. Whenever the image is in digital format, the reproduction of the image is both

faster and cheaper.

6. Digital technology offers plenty of scope for versatile image manipulation.

4.5 Drawback of Digital Images

Some of the drawbacks of digital images are:

1. Misuse of copyright has become easier because images can be copied from the internet with just a couple of mouse clicks.

2. A digital file cannot be enlarged beyond a certain size without compromising

on quality.

3. The memory required to store and process good quality digital images is very

high.

4. For real-time implementation of digital image processing algorithms, the processor has to be very fast because the volume of the data is very high.

4.6 Digital Image Processing

The processing of an image by means of a computer is generally termed

digital image processing. The advantages of using computers for the processing of

images are summarized below:

1. Flexibility and adaptability

The main advantage of digital computers when compared to analog electronic and optical information processing devices is that no hardware modifications are necessary in order to reprogram digital computers to solve different tasks. This feature makes digital computers an ideal device for processing digital image signals adaptively.


2. Data storage and transmission

With the development of different image-compression algorithms, the digital

data can be effectively stored. The digital data within the computer can be

easily transmitted from one place to another.

The only limitations of digital imaging and digital image processing are the memory and processing speed capabilities of computers. Different image

processing techniques include image enhancement, image restoration, image

fusion and image watermarking.

4.7 Types of digital image processing

1. Binary image processing

2. Grayscale image processing

3. Colour image processing

4. Wavelet based image processing

4.7.1 Binary image processing

The simplest type of image which is used widely in a variety of industrial and

medical applications is binary, i.e. a black-and-white or silhouette image. Binary

image processing has several advantages but some corresponding drawbacks:

Advantages

Easy to acquire: simple digital cameras can be used together with very simple

framestores, or low-cost scanners, or thresholding may be applied to grey-

level images.

Low storage: no more than 1 bit/pixel, often this can be reduced as such

images are very amenable to compression (e.g. run-length coding).

Simple processing: the algorithms are in most cases much simpler than those

applied to grey-level images.


Disadvantages

Limited application: as the representation is only a silhouette, application is

restricted to tasks where internal detail is not required as a distinguishing

characteristic.

Does not extend to 3D: the 3D nature of objects can rarely be represented by

silhouettes. (The 3D equivalent of binary processing uses voxels, spatial

occupancy of small cubes in 3D space).

Specialised lighting is required for silhouettes: it is difficult to obtain reliable

binary images without restricting the environment. The simplest example is an

overhead projector or light box.

4.7.2 Gray scale image processing

Grayscale images are distinct from one-bit bi-tonal black-and-white images,

which in the context of computer imaging are images with only two colors, black and white (also called bi-level or binary images). Grayscale images have many shades

of gray in between. Grayscale images are also called monochromatic, denoting the

presence of only one (mono) color (chrome).

Grayscale images are often the result of measuring the intensity of light at

each pixel in a single band of the electromagnetic spectrum (e.g. infrared, visible light,

ultraviolet, etc.), and in such cases they are monochromatic proper when only a given

frequency is captured. They can also be synthesized from a full color image, as illustrated below.
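A minimal sketch of such a synthesis from a full-colour image follows, using the common luminance weights 0.299, 0.587 and 0.114; these weights are an assumption here, since the text does not prescribe a particular conversion.

```python
import numpy as np

def rgb_to_grayscale(rgb):
    """Convert an (H, W, 3) RGB image to a single-channel grayscale image."""
    weights = np.array([0.299, 0.587, 0.114])   # assumed luminance weights
    return rgb.astype(np.float64) @ weights

# Example: a random 4x4 colour image.
rgb = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
gray = rgb_to_grayscale(rgb)
print(gray.shape)   # (4, 4)
```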

4.7.3 Color image processing

Colour image processing covers the methods and applications used in the design and implementation of various image and video processing tasks for cutting-edge applications.


Features:

1. Details recent advances in digital color image acquisition, analysis, processing, and

display

2. Explains the latest techniques, algorithms, and solutions for digital color imaging

3. Provides comprehensive coverage of system design, implementation, and application

aspects of digital color imaging

4. Explores new color image, video, multimedia, and biomedical processing applications

5. Contains numerous examples, illustrations, online access to full-color results, and

tables summarizing results from quantitative studies

Application

1. Secure imaging
2. Object recognition and feature detection
3. Facial and retinal image analysis
4. Digital camera image processing
5. Spectral and superresolution imaging
6. Image and video colorization
7. Virtual restoration of artwork
8. Video shot segmentation and surveillance


4.7.4 Wavelet image processing

A wavelet is a wave-like oscillation with an amplitude that starts out at zero,

increases, and then decreases back to zero. It can typically be visualized as a "brief

oscillation" like one might see recorded by a seismograph or heart monitor.

Generally, wavelets are purposefully crafted to have specific properties that make

them useful for signal processing. Wavelets can be combined, using a "reverse, shift,

multiply and sum" technique called convolution, with portions of an unknown signal

to extract information from the unknown signal.

For example, a wavelet could be created to have a frequency of Middle C and a short

duration of roughly a 32nd note. If this wavelet were to be convolved at periodic

intervals with a signal created from the recording of a song, then the results of these

convolutions would be useful for determining when the Middle C note was being

played in the song. Mathematically, the wavelet will resonate if the unknown signal

contains information of similar frequency - just as a tuning fork physically resonates

with sound waves of its specific tuning frequency. This concept of resonance is at the

core of many practical applications of wavelet theory.

As a mathematical tool, wavelets can be used to extract information from many

different kinds of data, including - but certainly not limited to - audio signals and

images. Sets of wavelets are generally needed to analyze data fully. A set of

"complementary" wavelets will deconstruct data without gaps or overlap so that the

deconstruction process is mathematically reversible. Thus, sets of complementary

wavelets are useful in wavelet based compression/decompression algorithms where it

is desirable to recover the original information with minimal loss.

In formal terms, this representation is a wavelet series representation of a square-

integrable function with respect to either a complete, orthonormal set of basis

functions, or an overcomplete set or frame of a vector space, for the Hilbert space of

square integrable functions.


5. IMAGE FUSION

5.1 Introduction

Multisensor image fusion has attracted a considerable amount of research

attention in the last ten years. Soon after the introduction of the first multisensor

arrays in image dependent systems, researchers began considering image fusion as a

necessity to solve the growing problem of information overload. Since the end of the

1980s and throughout the 1990s image, and in particular pixel-level, fusion was

established as a subject through a stream of publications presenting fusion algorithms.

[Figure 5.1 (block diagram): input images 1 and 2 are combined by pixel-level fusion in the wavelet transform domain to produce the fused image.]

Figure 5.1: Block diagram of the image fusion technique

Furthermore, towards the end of the last decade, research attention was also

beginning to focus on the problem of performance evaluation of different image

fusion systems. In this chapter the literature published on the subject of pixel-level

image fusion from its beginnings, at the end of the 1980s, until the present day is reviewed.

5.2 General Pixel-level Image Fusion Techniques

The multiresolution and multiscale methods dominate the field of pixel-level

fusion; arithmetic fusion algorithms are the simplest and sometimes effective fusion

methods. Arithmetic fusion algorithms produce the fused image pixel by pixel, as an

arithmetic combination of the corresponding pixels in the input images. Arithmetic

fusion can be summarized by the expression given in Equation (5.1)

F(n, m) = kA A(n, m) + kB B(n, m) + C (5.1)


where A, B, and F represent the inputs and the fused images respectively at

location (n,m). kA, kB and C are all constants defining the fusion method, with kA and

kB defining the relative influence of the individual inputs on the fused image and C

the mean offset. Image averaging is the most commonly used example of such fusion

methods. In this case, the fused signal is evaluated as the average value between the

inputs, i.e. kA =½, kB =½ and C=0. In general, averaging produces reasonable image

quality in areas where input images are similar but the quality rapidly decreases in

regions where inputs are different.
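A minimal NumPy sketch of the arithmetic fusion rule (5.1) follows; with kA = kB = 1/2 and C = 0 it reduces to simple image averaging, and the registered inputs are assumed to be equally sized arrays.

```python
import numpy as np

def arithmetic_fusion(A, B, kA=0.5, kB=0.5, C=0.0):
    """Pixel-wise arithmetic fusion: F(n, m) = kA*A(n, m) + kB*B(n, m) + C."""
    return kA * A.astype(np.float64) + kB * B.astype(np.float64) + C

# Averaging fusion of two registered images of equal size.
A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
F = arithmetic_fusion(A, B)          # kA = kB = 1/2, C = 0  ->  averaging
```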

The Intensity-Hue-Saturation (IHS) colour representation is another format

suitable for information fusion. It relates to the principles of human colour perception

and is easily obtained by a simple arithmetic transformation from the more common

RGB space. In IHS fusion the intensity channel of the colour input image is replaced

by the monochrome input image. The fused colour image is then obtained by reversed

transformation to the RGB space. Contrast stretching is commonly applied to the IHS

channels prior to inverse transformation to obtain enhanced colour images.
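A hedged sketch of this intensity substitution follows, using the simple linear intensity I = (R + G + B)/3; the full IHS transform, the hue and saturation channels and the contrast stretching step are omitted, so this is only an approximation of the method described above.

```python
import numpy as np

def ihs_fusion(rgb, mono):
    """Fast IHS-style fusion: replace the intensity of the colour image with the
    monochrome (e.g. higher-resolution) image and transform back to RGB."""
    rgb = rgb.astype(np.float64)
    intensity = rgb.mean(axis=2)                 # I = (R + G + B) / 3
    delta = mono.astype(np.float64) - intensity  # intensity replacement term
    fused = rgb + delta[..., np.newaxis]         # same offset added to R, G and B
    return np.clip(fused, 0, 255)

rgb = np.random.randint(0, 256, size=(128, 128, 3))
mono = np.random.randint(0, 256, size=(128, 128))
fused = ihs_fusion(rgb, mono)
```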

Principal component analysis (PCA) is another powerful tool used for

merging remotely sensed images. It is a statistical technique that transforms a set of

intercorrelated variables into a set of new uncorrelated linear combinations of the

original variables. Evaluation of principal components (PCs) of an image signal also

involves calculation of the covariance matrix and its eigenvalues (eigenvectors). An inverse PCA

transforms the data back to the original image space. Principal component analysis is

used in pixel-level image fusion through the component substitution technique: the low-resolution colour image is transformed into principal components, PC1 is substituted by the high-resolution monochrome data, and inverse PCA is applied to get the fused image. The scale-separation properties discussed in the following sections make multiresolution fusion algorithms potentially more robust than these single-resolution fusion approaches.
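Before moving on to multiresolution methods, the following NumPy sketch illustrates the PCA component substitution just described; the mean/standard-deviation matching of the monochrome image to PC1 is an assumption of the sketch, not a step stated in the text.

```python
import numpy as np

def pca_substitution_fusion(ms, pan):
    """ms: (H, W, K) multispectral/colour bands, pan: (H, W) monochrome image."""
    H, W, K = ms.shape
    X = ms.reshape(-1, K).astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Principal components via eigen-decomposition of the band covariance matrix.
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]   # sort by decreasing variance
    pcs = Xc @ eigvecs                                # project onto the PCs
    # Match the pan image to PC1 statistics (assumed step), then substitute it.
    pc1 = pcs[:, 0]
    p = pan.reshape(-1).astype(np.float64)
    p = (p - p.mean()) / (p.std() + 1e-12) * pc1.std() + pc1.mean()
    pcs[:, 0] = p
    # Inverse PCA back to the original band space.
    fused = pcs @ eigvecs.T + mean
    return fused.reshape(H, W, K)
```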


5.3 Multiresolution Image Fusion Based on the Gaussian Pyramid

Representation

Multiresolution processing methods enable an image fusion system to fuse

image information in a suitable pyramid format.

Image pyramids are made up of a series of sub-band signals, organized into

pyramid levels, of decreasing resolution (or size) each representing a portion of the

original image spectrum. Information contained within the individual sub-band

signals corresponds to a particular scale range, i.e. each sub-band contains features of

a certain size. By fusing information in the pyramid domain, superposition of features

from different input images is achieved with a much smaller loss of information than

in the case of single resolution processing.

Fusing images in their pyramid representation therefore, enables the fusion

system to consider image features of different scales separately even when they

overlap in the original image.

Furthermore, this scale separability also limits damage of sub-optimal fusion

decisions, made during the feature selection process, to a small portion of the

spectrum.

Figure 5.2: The structure of multiresolution pixel-level image fusion

systems based on the derivatives of the Gaussian pyramid


Multiresolution image processing was first applied to pixel-level image fusion

using derivatives of the Gaussian pyramid representation in which the information

from the original image signal is represented through a series of (coarser) low-pass

approximations of decreasing resolution. The pyramid is formed by iterative

application of low-pass filtering, usually with a 5x5 pixel Gaussian template,

followed by subsampling with a factor 2, a process also known as reduction.

All multiresolution image fusion systems based on this general approach

exhibit a very similar structure, which is shown in the block diagram of Figure 5.2.

Input images obtained from different sensors are first decomposed into their Gaussian

pyramid representations.

Gaussian pyramids are then used as a basis for another type of high pass (HP)

pyramids, such as the Laplacian, which contain, at each level, only information

exclusive to the corresponding level of the Gaussian pyramid.

HP pyramids represent a suitable representation for image fusion. Important

features from the input images are identified as significant coefficients in the HP

pyramids and they are transferred (fused) into the fused image by producing a new,

fused, HP pyramid from the coefficients of the input pyramids.

The process of selecting significant information from the input pyramids is

usually referred to as feature selection and the whole process of forming a new

composite pyramid is known as pyramid fusion. The fused pyramid is transformed

into the fused image using a multiresolution reconstruction process. This process is

dual to the decomposition and involves iterative expansion (up-sampling) of the

successive levels of the fused Gaussian pyramid and combination (addition in the

case of Laplacian pyramids) with the corresponding levels of the fused HP pyramid,

known as expand operation.
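As a rough illustration of this decompose-fuse-reconstruct pipeline, the sketch below builds Laplacian (HP) pyramids with OpenCV's pyrDown/pyrUp as the reduce/expand operators and fuses them by simple maximum-absolute-value selection; the choice of OpenCV operators, the selection rule and the assumption that image dimensions are divisible by 2**levels are all assumptions of the sketch, not details fixed by the text.

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels):
    """Build a Laplacian (HP) pyramid plus the final Gaussian base band."""
    gp = [img.astype(np.float32)]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))                       # reduce
    lp = []
    for i in range(levels):
        up = cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
        lp.append(gp[i] - up)                                # detail at level i
    return lp, gp[-1]

def fuse_pyramids(lpA, baseA, lpB, baseB):
    """Select the larger-magnitude detail coefficient, average the base bands."""
    fused_lp = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(lpA, lpB)]
    return fused_lp, 0.5 * (baseA + baseB)

def reconstruct(lp, base):
    """Expand and add, coarsest level first (dual of the decomposition)."""
    img = base
    for detail in reversed(lp):
        img = cv2.pyrUp(img, dstsize=(detail.shape[1], detail.shape[0])) + detail
    return img

A = np.random.rand(256, 256).astype(np.float32)
B = np.random.rand(256, 256).astype(np.float32)
lpA, bA = laplacian_pyramid(A, 3)
lpB, bB = laplacian_pyramid(B, 3)
fused = reconstruct(*fuse_pyramids(lpA, bA, lpB, bB))
```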

The contrast pyramid was also used in another interesting fusion approach: Toet et al. introduced an image fusion technique which preserves local luminance contrast in the sensor images. The technique is based on selection of image features with maximum contrast rather than maximum magnitude.


A contrast pyramid is formed by dividing each level of the Gaussian low-pass

pyramid with the expanded version of the next, coarser, level. Each level of the

contrast pyramid contains only information exclusive to the corresponding level of

the Gaussian pyramid.

5.4 Multiresolution Image Fusion Based on the Wavelet Transform

The Discrete Wavelet Transform (DWT) was successfully applied in

the field of image processing with the appearance of Mallat’s algorithm that enabled

the implementation of two dimensional DWT using one dimensional filter banks.

This significant multiresolution approach is discussed in more detail in the next

chapter.

Figure 5.3: The structure of an image fusion system based on wavelet

multiresolution analysis

Its general structure, briefly described here, is very similar to that of the Gaussian pyramid based approach. The structure of a wavelet based image fusion system is shown in Figure 5.3. Input signals are transformed using the wavelet

decomposition process into the wavelet pyramid representation.

In contrast to Gaussian pyramid based methods, high pass information is also

separated into different sub-band signals according to orientation as well as scale. The

scale structure remains logarithmic, i.e. for every new pyramid level the scale is

reduced by a factor of 2 in both directions.


The wavelet pyramid representation has three different sub-band signals

containing information in the horizontal, vertical and diagonal orientation at each

pyramid level. The size of the pyramid coefficients corresponds to “contrast” at that

particular scale in the original signal, and can therefore, be used directly as a

representation of saliency. In addition, wavelet representation is compact.

One of the first wavelet based fusion systems was presented by Li et al. in

1995. It uses Mallat's technique to decompose the input images and an area based

feature selection for pyramid fusion.

In the proposed system, Li et al. use a 3x3 or a 5x5 neighbourhood to evaluate

a local activity measure associated with the centre pixel.

It is given as the largest absolute coefficient size within the neighbourhood. In

case of coefficients from the two input pyramids exhibiting dissimilar values, the

coefficient with the largest activity associated with it is chosen for the fused pyramid.

Otherwise, similar coefficients are simply averaged to get the fused value. Finally,

after the selection process, a majority filter is applied to the binary decision map to

remove bad selection decisions caused by noise “hot-spots”.

This fusion technique works well at lower pyramid levels, but for coarser

resolution levels, the area selection and majority filtering, especially with larger

neighbourhood sizes, can significantly bias feature selection towards one of the

inputs.
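The following Python sketch is a hedged illustration of this kind of area-based selection for one pair of corresponding detail sub-bands: local activity is the largest absolute coefficient in a 3x3 window, dissimilar coefficients are selected by maximum activity, similar ones are averaged, and a median filter plays the role of the majority filter on the binary decision map. The window size, the similarity threshold and the use of a median filter are assumptions of the sketch rather than the exact parameters of Li et al.

```python
import numpy as np
from scipy.ndimage import maximum_filter, median_filter

def area_based_fuse(subA, subB, size=3, sim_thresh=0.75):
    """Fuse two corresponding wavelet detail sub-bands of the same shape."""
    actA = maximum_filter(np.abs(subA), size=size)   # local activity of input A
    actB = maximum_filter(np.abs(subB), size=size)   # local activity of input B
    choose_A = actA >= actB
    # Majority (here: median) filtering of the binary decision map removes
    # isolated selection decisions caused by noise "hot-spots".
    choose_A = median_filter(choose_A.astype(np.uint8), size=size).astype(bool)
    # Where activities are similar, average; otherwise select by the decision map.
    similar = np.minimum(actA, actB) > sim_thresh * np.maximum(actA, actB)
    selected = np.where(choose_A, subA, subB)
    return np.where(similar, 0.5 * (subA + subB), selected)
```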


6. MULTIRESOLUTION WAVELET IMAGE FUSION

6.1 Introduction

Multiresolution analysis represents image signals in a multiresolution pyramid

form. This means that performing image fusion in this “pyramid” domain enables the

fusion of features from different input images at various scales even when they

occupy overlapping areas of the observed scene. Segmentation of the image spectrum

into pyramid levels corresponding to narrow ranges of scale and the use of selective

pyramid fusion techniques introduces robustness to the fusion system by minimizing

the information loss produced when applying fusion algorithms on a single resolution

basis. As a result, a number of multiresolution image processing techniques have been

proposed in the field of pixel-level image fusion.

The DWT multiresolution representation, or wavelet pyramid, has a number

of advantages over the Gaussian pyramid based multiresolution techniques. One of

the most fundamental issues is that wavelet functions used in this type of

multiresolution image analysis form an orthonormal basis that results in a

nonredundant signal representation. In other words, the size of the multiresolution

pyramid is exactly the same as that of the original image; Gaussian based pyramid

representations are 4/3 of the original image size. Further to their redundant signal representation, the computational complexity of the multiresolution analysis process used to obtain Gaussian based pyramids far exceeds that of the wavelet

decomposition process, which can be implemented using one-dimensional filters

only.

The reconstruction process, which transforms the fused multiresolution

pyramid back to the original image representation, is the dual of the decomposition

process. In this chapter, we describe a new system for pixel-level image fusion of

gray-level image modalities, using the DWT multiresolution approach. It is based on

a novel cross-sub-band feature selection and fusion mechanism that is used to fuse

information from different input pyramids. Results obtained with this new method,

show that this form of pyramid fusion significantly reduces information loss and

ringing artifacts exhibited by more conventional wavelet based fusion systems.


6.2 Wavelet transform

The basic idea of the wavelet transform is to represent an arbitrary signal f as

a weighted superposition of wavelets. Wavelets are functions generated by dilations

and translations of a single prototype function, called the mother wavelet, ψ(t):

ψa,b(t) = (1/√a) ψ((t − b) / a) (6.1)

The wavelet transform is useful in image fusion applications due to its good

localization properties in both the spectral and spatial domain. These properties arise

from the nature of the process in which the wavelets are produced from the prototype

function. Dilations of the orthogonal wavelet ensure that the signal is analyzed at

different spectral ranges providing spectral localization, while translations provide the

spatial analysis resulting in good spatial domain localization. The reconstruction of

the original signal from the wavelet representation is possible if the wavelet prototype

function satisfies the decay condition:

∫ |Ψ(ω)|² / |ω| dω < ∞ (6.2)

where Ψ(ω) represents the Fourier transform of ψ(t).

The integral wavelet transform of a signal f(t) with respect to some analyzing

wavelet is defined as

Wf(a, b) = ∫ f(t) ψa,b(t) dt (6.3)

where the integration runs over −∞ < t < ∞.

The parameters a and b are called dilation and translation parameters

respectively.

The equations above relate to continuous wavelets and the continuous wavelet transform (CWT); however, for practical reasons, and in the applications of interest here, a discrete version, the Discrete Wavelet Transform (DWT), is preferred.

6.3 Two Dimensional QMF Multiresolution Decomposition

The multiresolution image analysis technique used in our fusion system is

based on the Quadrature Mirror Filter (QMF) implementation of the discrete wavelet

transform, embodied in Mallat’s algorithm.


Quadrature Mirror Filters represent a class of wavelet

decomposition/reconstruction filters developed independently for subband coding and

compression of discrete signals.

They satisfy the condition which defines the relationships between the low

and high-pass analysis and synthesis impulse responses (changing the sign of every

other sample between the LP and HP response and mirroring the analysis filters in

time to produce the synthesis bank), capable of pyramid decomposition and perfect

reconstruction of the original signal from its decomposed pyramid representation.

QM filter banks used in multiresolution signal processing are made up of two

pairs of power-complementary conjugate FIR filters.

Signal decomposition is performed in the analysis filter bank by the analysis

QMF pair h0(n) and h1(n). Signal reconstruction takes place in a synthesis bank

consisting of the QMF synthesis pair g0(n) and g1(n).

Figure 6.1 QMF decomposition structure: analysis bank

Figure 6.2 QMF reconstruction structure: synthesis bank


Two-dimensional signals are decomposed with one-dimensional FIR filters by

applying the filters in both directions independently.

The structure of the QMF analysis and synthesis filter banks is shown in

Figure 6.1 and Figure 6.2. In the analysis bank, a one dimensional decomposition

filter bank is first applied in the horizontal direction to the input image or its

approximation ALLk+1. The image is filtered along the rows with the low and high

pass analysis filters, H0 and H1, and the resulting signals are critically decimated in

the horizontal direction by keeping one column out of two.

The two half images produced in this way are themselves inputs into identical

filter banks which operate in vertical direction. Signals are filtered along their

columns and only every other row of the processed signals is kept.

Image reconstruction from the multiresolution pyramid is through a series of

synthesis filter banks. The reconstruction process is dual to the decomposition

process. In each stage of the reconstruction, all the sub-band signals of the same

resolution level are input into the synthesis bank to produce the low-pass

approximation of the higher resolution level. Initially, all signals are interpolated in

the vertical direction by inserting a row of zeros after each sub-band row. Interpolated

signals are then filtered along the columns with the QMF synthesis pair G0 and G1.

The results of the two one dimensional synthesis banks are input into a further

synthesis bank where they are processed in the horizontal dimension by inserting

columns of zeros followed by filtering along the rows. Finally, the reconstructed

signal is obtained as the sum of the outputs of the low and high-pass filtering

branches of the last filter bank.
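The separable row/column filtering, decimation and reconstruction described above can be illustrated with PyWavelets, which implements the same analysis/synthesis structure for standard orthogonal wavelets; the 'db4' wavelet below is an arbitrary stand-in for the QMF pair of this chapter, so the sketch demonstrates the structure rather than the specific filters used in the proposed system.

```python
import numpy as np
import pywt

x = np.random.rand(256, 256)

# One analysis stage: rows and then columns are filtered and critically decimated,
# giving the approximation cA (LL) and the detail sub-bands cH, cV, cD (LH, HL, HH).
cA, (cH, cV, cD) = pywt.dwt2(x, 'db4')

# One synthesis stage: interpolate, filter with the synthesis pair and sum the branches.
x_rec = pywt.idwt2((cA, (cH, cV, cD)), 'db4')

print(np.allclose(x, x_rec[:x.shape[0], :x.shape[1]]))   # perfect reconstruction (True)
```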

6.4 Wavelet Fusion Structure

Wavelet based pixel-level image fusion schemes increase the information

content of fused images by selecting the most significant features from input images

and transferring them into the composite image.

This process takes place in the multiresolution pyramid domain reached by the

process of multiresolution analysis. Information fusion is achieved by creating a new,

fused pyramid representation that contains all the significant information from the

multiresolution pyramids of the input images.


Input images A and B, are first decomposed into multiresolution pyramids

using a series of multiresolution QMF Analysis filter banks.

Then, a new pyramid array is initialized containing no information, i.e. it is

filled with zeros. The pyramid fusion algorithm then considers, in a systematic way,

individual or groups of pixels from the multiresolution pyramid representations of the

input images, and forms values for the corresponding pixels of the new pyramid. The

coefficients of the new pyramid are formed either by transferring the input coefficient

values directly or as arithmetic combinations of the corresponding coefficients from

the input pyramids. Criteria for the selection and fusion of input pyramid coefficients

are determined in the design of the feature selection process.

Thus the feature selection process searches the input pyramids and identifies

the most significant image features at each position and scale. The aim then, is to

transfer these features from the input image pyramids into the fused pyramid without loss of

information. For each level of scale (resolution) all spatial positions have to be

considered and features from input images compared with each other. The pyramid

fusion process used in the proposed system is based on a cross-band feature

evaluation and selection approach. It integrates feature information from a number of

sub-band signals and levels at once, to make a decision on how to fuse particular

input pyramid pixels. When the pyramid fusion process is completed and all the fused

pyramid coefficients have been produced, the fused pyramid is input into the wavelet

reconstruction process to obtain the final fused image.

6.5 Conventional Feature Selection and Pyramid Fusion Mechanisms

Feature selection mechanisms can be broadly divided into pixel and area

based schemes, and pyramid coefficient fusion methods are purely selective, purely

arithmetic or composite, a combination of the first two.

Pixel based feature selection systems make a fusion decision for each pyramid

pixel individually, based on its value. In contrast, area based methods use a

neighbourhood of coefficient values to form a selection criterion for the center pixel.

A diagram illustrating these two types of coefficient selection mechanisms is shown

for a single pyramid sub-band fusion in Figure 6.3.


In terms of pyramid fusion, selective schemes form the composite pyramid by

direct transfer of coefficient values from the input pyramids into the fused, according

to a selection map produced by the feature selection process.

Arithmetic methods on the other hand, evaluate fused pyramid coefficients as

an arithmetic combination, usually a weighted sum, equation (5.1), of the input

pyramid values. Composite methods use both of the above approaches.

Figure 6.3: Pixel-based and Area-based selection method

Robustness can be added into the system by the use of area based selection

criteria, such as those used in the schemes of Burt and Kolczynski and of Li et al.

Decisions based on a neighbourhood around the center coefficient (Figure 6.3)

remove most of the selection map randomness due to noise and random large values

in the sub-band signal.

They also reduce the contrast loss by ensuring that all the coefficients

belonging to a particular dominant feature are selected. The performance of these

methods however, depends on the image content and the size of the neighbourhood

used.


The feature selection mechanism proposed in this chapter is based on a

crossband coefficient selection criterion that exploits the high level of correlation

present between the different levels and sub-bands of input pyramids, to form a more

robust and complete evaluation of input image features. The processes of forming a

selection decision based upon information from multiple sub-bands and selecting

multiple input coefficients at once are also referred to as the integration of selection

information. Integration of selection information refers to the process of using

information from more than one resolution level of the input pyramids to aid pyramid

coefficient selection and fusion.

By using coefficient values from more than one resolution level in our selection criterion, we gain an even better evaluation of the saliency of the original feature. In the top-down approach used in the proposed system, the values of the “father” coefficients are used in the selection criterion of their “children”. The selection criterion thus becomes

(FL^LH, FL^HL, FL^HH) = (AL^LH, AL^HL, AL^HH)   if AL + AL+1 > BL + BL+1
(FL^LH, FL^HL, FL^HH) = (BL^LH, BL^HL, BL^HH)   otherwise (6.4)

where

AL = |AL^LH| + |AL^HL| + |AL^HH| (6.5)
BL = |BL^LH| + |BL^HL| + |BL^HH| (6.6)
AL+1 = |AL+1^LH| + |AL+1^HL| + |AL+1^HH| (6.7)
BL+1 = |BL+1^LH| + |BL+1^HL| + |BL+1^HH| (6.8)

AL^sb, BL^sb and FL^sb represent coefficients of sub-band sb on level L, with AL+1^sb and BL+1^sb being the corresponding “father” coefficients in the input

pyramids. The cross-band feature selection and pyramid fusion mechanism described

above has a constraint in that it can be implemented only on pyramid levels which are

not at the end of the pyramid.
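A hedged NumPy sketch of the cross-band selection rule (6.4)-(6.8) at one resolution level follows; it assumes the child sub-bands are exactly twice the size of their "father" sub-bands and uses nearest-neighbour expansion of the father activity, both of which are assumptions of the sketch rather than details given in the text.

```python
import numpy as np

def cross_band_select(A_subs, A_parent_subs, B_subs, B_parent_subs):
    """A_subs/B_subs: dicts with 'LH', 'HL', 'HH' sub-bands at level L;
    *_parent_subs: the corresponding sub-bands at the coarser level L+1."""
    def activity(subs):
        # Summed magnitudes across the three detail sub-bands, as in (6.5)-(6.8).
        return sum(np.abs(subs[k]) for k in ('LH', 'HL', 'HH'))

    def upsample(a, shape):
        # Nearest-neighbour expansion of the father activity to child resolution
        # (assumes child dimensions are twice the father dimensions).
        a = np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)
        return a[:shape[0], :shape[1]]

    shape = A_subs['LH'].shape
    A_L, B_L = activity(A_subs), activity(B_subs)
    A_L1 = upsample(activity(A_parent_subs), shape)
    B_L1 = upsample(activity(B_parent_subs), shape)

    take_A = (A_L + A_L1) > (B_L + B_L1)          # criterion (6.4)
    return {k: np.where(take_A, A_subs[k], B_subs[k]) for k in ('LH', 'HL', 'HH')}
```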


7. GRADIENT BASED MULTIRESOLUTION IMAGE FUSION

7.1 Introduction

In this chapter, a novel approach to multiresolution image analysis designed

specifically for the use in pixel-level image fusion systems is presented. The aim is to

eliminate the main problems encountered in conventional multiresolution fusion

approaches: i.e. reconstruction errors, loss of contrast information and prohibitively

high computational complexity. At the same time, the ability to operate successfully

across a wide range of pixel-level fusion applications has also been an important

objective in this part of the program.

As mentioned in Chapter 5 and Chapter 6, most of the previously proposed

pixel-level image fusion systems have been based on wavelet signal analysis

techniques. These multiresolution fusion systems that employ the Discrete Wavelet

Transform (DWT) achieve high fused image quality and robust performance at

reasonable computational cost.

However, the multiresolution structure of the wavelet analysis also introduces

a number of characteristic problems in the image fusion domain. The most important

of these is certainly the problem of reconstruction errors or “ringing” artifacts.

Ringing artifacts are the result of compromising the perfect reconstruction property of

the wavelet multiresolution analysis, by introducing discontinuities into the sub-band

signals. This is almost unavoidable in the process of image fusion.

In this chapter a novel approach to multiresolution wavelet analysis is

described. The approach is based on a Gradient (edge) signal representation of image

information which is particularly well suited to pixel-level image fusion. Edge signal

representation is compressed into a related Gradient (edge) map representation that is

easily derived from the original image signal.

Edge maps express the information contained in the original image signal as

changes in the signal value, rather than absolute signal values. This edge map

representation can be incorporated into the multiresolution decomposition process by

using alternative Gradient (edge) filters.


Information fusion is performed in the multiresolution gradient map domain,

resulting in a new decomposition-fusion processing structure. At each level of scale

input signals are first transformed into their edge map representations, and fused to

produce fused edge maps. High-pass information from fused edge map signals is then

decomposed into a simplified wavelet pyramid representation using edge filters.

The basic multiresolution structure is preserved in the system and the fused

image is obtained through a conventional multiresolution reconstruction process. The

method provides clear advantages over conventional wavelet fusion systems both in

terms of a more robust feature selection and also a significant reduction in the amount

of ringing artifacts and information loss.

7.2 Gradient Map Representation of Image Information

The practical goal of pixel-level image fusion algorithms is to identify

(detect), compare and transfer the most important visual information from the inputs

into the fused image. Visual information, contained within image signals, is mainly in

the form of edges, i.e. changes or discontinuities in the signal rather than the absolute

gray level value of each pixel. Larger, more perceptually meaningful, information

carrying image structures such as patterns, features and objects can be considered as

collections of basic edge elements of different scales and orientations with specific

spatial arrangements. The aim of image information fusion is to transfer, without loss,

all the most important edge information from any number of registered input images

into the fused image. Indeed, an ideally fused image can be defined as an image that

contains all the edge information of all input images.

Edge signals form a representation of image information that enables the

fusion process to avoid most problems described in the previous section. It is

particularly well suited for image fusion applications in that it operates directly on the

input signal (image) level, rather than on a sub-band at a time. Moreover, the edge

signal representation is possible at any resolution, which allows the preservation of

the multiresolution structure. In this way, information from the entire spectrum is

used to make feature selection decisions. The Gradient map of a one dimensional

signal x is defined as

∇x(n) = x(n) − x(n−1), for all n (7.1)


In a gradient signal representation, an image signal is expressed as a sum of

appropriately translated and weighted edge signals. Basically, each edge signal

captures a single gray level change, edge element, from the original image and is

constant elsewhere.

For two-dimensional (2-D) image signals, gradient maps are defined in the

horizontal and vertical directions independently.

They are 2-D signals that represent at each position the horizontal and vertical

gradient information as the difference between the corresponding pixel and the pixel

directly above or to the left The horizontal and vertical image gradient map signals

are defined as

∇xH(n, m) = x(n, m) − x(n, m−1), for all n, m (7.2)

∇xV(n, m) = x(n, m) − x(n−1, m), for all n, m (7.3)

Figure 7.1: Two-dimensional Gradient map signal representation:

a) input signal b) vertical gradient map representation c) horizontal gradient

map representation.

An example of an image and its horizontal and vertical edge map

representations is shown in Figure 7.1. The input image (in Figure 7.1 a) contains

significant features (information) in all possible directions.


Its horizontal edge map representation (shown in Figure 7.1 b) contains mainly

vertical edges and, to a certain extent, diagonal ones. Large grayish areas indicate

regions where there is a considerable amount of small detail, which is usually

omnidirectional.

Horizontally oriented patterns are mostly visible in the vertical edge map

(shown in Figure 7.1 c), as are, to a certain extent, the diagonal ones. In both edge map

representations the signal takes both positive and negative values and the images

displayed in Figure 7.1 are scaled absolute values of the original edge map signals.

Finally, the gradient map representation is a spatially non-redundant image

signal representation. In other words, edge map signals contain all the information

from the input signal using the same number of pixels. Compared to the gradient

signal representation this enables a significant reduction in complexity.

The original image can be perfectly reconstructed from the edge map

representation through the cumulative sum of the horizontal gradient map along each row:

x(n, m) = ∑ ∇xH(n, k), with the sum taken over k = 1, …, m (7.4)
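As a small worked illustration, the NumPy sketch below computes the horizontal and vertical gradient maps of (7.2) and (7.3) and inverts the horizontal map with the cumulative sum of (7.4); keeping the original boundary values in the first column (and first row) is an assumed boundary convention that makes the reconstruction exact.

```python
import numpy as np

def gradient_maps(x):
    """Horizontal and vertical gradient maps of a 2-D image x."""
    gh = x.astype(np.float64).copy()
    gh[:, 1:] = x[:, 1:] - x[:, :-1]      # difference with the pixel to the left
    gv = x.astype(np.float64).copy()
    gv[1:, :] = x[1:, :] - x[:-1, :]      # difference with the pixel above
    return gh, gv

def reconstruct_from_horizontal(gh):
    """Invert the horizontal gradient map by a cumulative sum along each row."""
    return np.cumsum(gh, axis=1)

x = np.random.rand(64, 64)
gh, gv = gradient_maps(x)
print(np.allclose(x, reconstruct_from_horizontal(gh)))   # True
```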

7.3 Gradient Filters

The gradient map image information domain is particularly relevant to image

fusion. However, gradient maps contain information from the entire spectrum. At the

same time, it is the multiresolution structure, in which inputs are fused at a range of

scales independently, that ensures the robustness of fusion performance. Such a

multiresolution structure is directly obtainable from gradient map signals by using

gradient filters.

In each QMF analysis stage, the upper half of the image spectrum in a

particular direction is decomposed into a “detail” subband signal. This is achieved by

filtering the input signal along that axis with the high-pass QMF H1.


The same result is obtained when one filters the corresponding gradient map

with a gradient filter He. The equivalence is best demonstrated for 1-D signals in the z-transform domain:

X(z) H1(z) = ∇X(z) He(z) (7.5)

The gradient map signal ∇X(z) is derived from the input signal X(z) according to the z-transform of the expression in (7.1):

∇X(z) = X(z) − z⁻¹ X(z) = (1 − z⁻¹) X(z) (7.6)

and by using (7.6) and (7.5) the gradient filter is defined in terms of the impulse

response of the QMF high pass filter h1(n)

He(z) = H1(z) / (1 − z⁻¹),   i.e.   he(n) = h1(n) * u(n) (7.7)

he (n) is therefore obtained by convolving the QMF high-pass impulse

response with a step function u(n) . Using the causal convolution sum and exploiting

the fact that for k > n, the step function u (n-k) is zero and otherwise one, we get a

simplified expression for he(n) as follows:

he(n) = ∑ h1(k), with the sum taken over k = 0, …, n (7.8)

Thus, the gradient filter he(n) defined in (7.8) is a finite-impulse response

(FIR) filter with N coefficients, where N is the length of the QMF h1(n), symmetrical

about the central coefficient.

Figure 7.2: Impulse response of a) high-pass QMF filter and

b) the corresponding edge filter he(n)


As an example, the impulse responses of the Johnston 16A QMF high-pass

filter and the gradient filter derived from it are plotted in Figure 7.2 (a) and (b),

respectively.
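A minimal sketch of equation (7.8) follows: the gradient filter is simply the running sum of the high-pass impulse response. The short zero-mean example filter below is an arbitrary stand-in, not the Johnston 16A coefficients, which are not reproduced here.

```python
import numpy as np

def gradient_filter(h1):
    """he(n) = sum of h1(k) for k = 0..n: cumulative sum of the high-pass response."""
    return np.cumsum(np.asarray(h1, dtype=np.float64))

# Arbitrary zero-mean high-pass example (stand-in for a real QMF response).
h1 = np.array([0.05, -0.25, 0.45, -0.45, 0.25, -0.05])
he = gradient_filter(h1)          # symmetric about its central coefficient

# Differencing he recovers h1, i.e. H1(z) = (1 - z^-1) * He(z).
print(np.allclose(np.diff(he, prepend=0.0), h1))   # True
```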

7.4 Gradient Fusion Structure

Information fusion in the proposed gradient-based fusion system takes place

in the gradient map domain. The system uses an alternative, fuse-then-decompose

strategy, which yields a novel fusion-decomposition architecture based on gradient

filters and the gradient map representation of image signals. Information fusion is

achieved by applying sophisticated feature selection and fusion algorithms to gradient

maps.

Figure 7.3: Simplified spectral decomposition for a) gradient based fusion and

b) resulting pyramid structure with sub-bands of different sizes.

Image fusion in the gradient-based fusion system is performed within the

general framework of the conventional, logarithmic, multiresolution structure.

Information in horizontal and vertical directions is fused half-band at a time.

At each resolution level only the upper ¾ of the image spectrum is fused

(Figure 7.3). Decomposition is extended by applying further stages of analysis banks

until the vast majority of the information contained in the input spectra is fused. The

remaining base band residuals are fused last using alternative methods.


7.4.1 Block diagram

Figure 7.4: Structure of a single resolution level of the gradient-based multiresolution fusion system

The general structure of a single resolution level of the gradient-based multiresolution fusion system, illustrated in Figure 7.4, is based on combined fusion-analysis filter banks; for simplicity, only two input images, A and B, are fused.

At each resolution level, input image signals are transformed into their

horizontal gradient map representations, which are in turn fused into a single

horizontal gradient map signal. Gradient filters are then applied to this map and the

resulting signal is decimated (by a factor of 2) to produce the fused horizontal subband at the k-th resolution level. This subband contains fused information exclusive to the upper half of the input signal spectrum in the horizontal direction.

At the same time, the input image signals are filtered with low-pass filters in

the horizontal direction, which produces their low-pass approximations containing

only the lower half of the spectrum in this direction.

In the second stage, these low-pass approximations are processed in the same

manner as the input signals but in the vertical direction.


This produces the vertical fused subband signal and the quarter-band low-pass

input image approximations, A1 and B1. The low-pass approximations created by the structure are further input into an equivalent bank operating at the (k+1)-th resolution level. Finally, when all the high-pass information from the input spectra has been fused and decomposed, or a certain decomposition depth is reached, the remaining input

image basebands A1 and B1 are fused using arithmetic fusion methods. The gradient-

based multiresolution image fusion architecture of Figure 7.4 uses gradient maps and

gradient filters to effectively implement the QMF high-pass filtering branches of

multiresolution analysis filter banks.

The detailed block diagram of a single resolution stage of this analysis process

is shown in Figure 7.5. Both input images are initially processed using the horizontal

delay elements in high-pass filtering branches He to produce horizontal gradient

maps. These gradient maps are fused into a single horizontal gradient map, which is

then filtered along the rows with the He filter. The filtered signal is decimated by a

factor of 2 to produce the fused horizontal subband. Input signals are also filtered

along the rows using Ho and are decimated by 2.

The resulting low-pass approximations are processed in the vertical direction

by using vertical delay elements; input signals are expressed as vertical gradient maps

that are then fused.

Figure 7.5: Implementation structure of the gradient-based fusion-decomposition process


The resulting fused gradient map is gradient filtered along the columns and

decimated by ignoring every other row of the filtered signal, to produce the vertical

subband signal. The half-band approximations are also low-pass filtered and

decimated in the vertical direction resulting in ¼ band low-pass subband signals

A1and B1. These are used as inputs into further decomposition stages.
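The horizontal part of one such fusion-decomposition stage can be sketched in MATLAB as follows. This is a simplified reading of Figures 7.4-7.5: the Haar filters, the select-max rule standing in for the full cross-band selection of Section 7.5, and all variable names are our own assumptions, not the report's exact implementation.

    % One horizontal fusion-decomposition stage for two registered inputs A, B.
    A = rand(256);  B = rand(256);
    ho = [1 1] / sqrt(2);                    % stand-in low-pass filter (Haar)
    he = cumsum([1 -1] / sqrt(2));           % matching gradient filter, eq. (7.8)
    % 1) horizontal gradient maps (differences along the rows)
    gA = [A(:,1), diff(A, 1, 2)];
    gB = [B(:,1), diff(B, 1, 2)];
    % 2) fuse the gradient maps (pixel-wise select-max, eq. (7.9))
    gF = gA;  idx = abs(gB) > abs(gA);  gF(idx) = gB(idx);
    % 3) gradient-filter the fused map along the rows and decimate by 2
    subH = conv2(gF, he, 'same');
    subH = subH(:, 1:2:end);                 % fused horizontal subband
    % 4) low-pass filter the inputs along the rows and decimate by 2
    A1 = conv2(A, ho, 'same');  A1 = A1(:, 1:2:end);
    B1 = conv2(B, ho, 'same');  B1 = B1(:, 1:2:end);
    % A1 and B1 are then processed identically in the vertical direction.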

A high-resolution fused image is obtained from this fused multiresolution

pyramid by applying a modified version of the conventional QMF pyramid

reconstruction process. Image reconstruction is implemented through a series of

cascaded, two-dimensional synthesis filter banks (same as conventional wavelet

reconstruction).

7.5 Gradient information fusion

In gradient-based multiresolution image fusion, information fusion is

performed in the gradient map domain.

Unlike wavelet pyramid coefficients, whose size is only an indication of the

saliency of features collocated within a neighborhood, the absolute size of the

gradient map elements is a spatially accurate direct measure of feature contrast.

Furthermore, gradient map signals contain the information from the entire spectrum,

which adds reliability to the process of feature selection and fusion.

This also enables the fusion system to transfer into the fused pyramid all the

high frequency information of a particular feature. Due to these properties, gradient-

based fusion exhibits improved performance in terms of robust feature selection and

achieves significant reductions in fused visual information distortion.

The simplest method of feature selection and fusion is the pixel-based select

max approach. In this approach, the fused gradient map pixel takes the value of the

corresponding input gradient map pixel with the largest absolute value, i.e.,

F = A,  if |A| > |B|;   F = B,  otherwise                                (7.9)

However, this method is not always as reliable as more complex subband fusion techniques. The cross-band fusion method used in the edge-based fusion system employs the same principles as the cross-band selection scheme used with the conventional wavelet pyramid.


In this case however, sub-band coefficients of the wavelet pyramid are

replaced with the edge elements (pixels) of the edge map.

Furthermore, there is no straightforward integration of selection decisions

since there is no direct spatial correspondence between pixels of the horizontal and

vertical edge maps (they are of different sizes; see Figure 7.3). The basic feature selection

used in horizontal and vertical edge map fusion is expressed as:

F_x^L = A_x^L,  if S^L(A_x) > S^L(B_x);   F_x^L = B_x^L,  otherwise      (7.10)

F_y^L = A_y^L,  if S^L(A_y) > S^L(B_y);   F_y^L = B_y^L,  otherwise

where S^L(·) denotes the cross-band saliency of the corresponding edge element, computed from the current and coarser resolution levels.

In the saliency computation, k is a constant, experimentally determined to be k = 3, and L and L+1 indicate edge map information from the current and coarser resolution levels respectively.

Consistency verification of selection decisions can change the edge element fusion

method from selective to arithmetic fusion, if the majority of corresponding selection

decisions made on the higher resolution level (L-1) do not agree with the current

decision.

The spatial correspondence between edge elements at neighbouring resolution

levels is the same as in the conventional wavelet pyramid case.

Exact weighting coefficients of the arithmetic fusion method are again based

on the distance between the edge elements:

D = | |A_x^L(n,m)| - |B_x^L(n,m)| | / max( |A_x^L(n,m)|, |B_x^L(n,m)| )  (7.11)


Weighting coefficients of the arithmetic fusion are evaluated by comparing the difference D to a threshold T. If the distance between the coefficients is very large, i.e. D > T, the input edge elements are added to form the fused

value. Otherwise they are considered similar and their average value is taken for the

fused edge map. The optimal value for the threshold parameter T was experimentally

determined to be in the region of 0.8. The complete cross-band feature selection and

edge map fusion method is illustrated in graphical form in Figure 7.6.
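A small MATLAB sketch of this selective/arithmetic decision for a single pair of corresponding edge elements, following the reading of (7.11) above with T = 0.8. The variable names are ours, and the cross-band saliency and consistency verification steps are omitted for brevity.

    T = 0.8;
    a = 12.0;  b = 3.5;                              % example edge-map values from A and B
    D = abs(abs(a) - abs(b)) / max(abs(a), abs(b));  % normalized distance, eq. (7.11)
    if D > T
        f = a + b;                                   % very different: edge elements are added
    else
        f = (a + b) / 2;                             % similar: their average is taken
    end
    f                                                % fused edge element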

7.6 Baseband Fusion

Baseband signals are the residual, low-pass approximations of the input

signals. These baseband signals contain only the very large-scale features that form

the background of input images and are important for their natural appearance.

In the proposed fusion system, baseband fusion is performed using an arithmetic combination of the input basebands as follows:

F_k(n,m) = A1_k(n,m) + B1_k(n,m) - (Ā + B̄)/2                             (7.13)

where F_k, A1_k and B1_k are the fused and input baseband signals, Ā and B̄ are the mean values of the two input basebands, and k denotes the coarsest

resolution level. Generally, baseband fusion methods have little influence on the

overall fusion performance.
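A minimal sketch of (7.13), assuming the coarsest-level basebands are already available as matrices A1k and B1k (names are ours):

    A1k = rand(16);  B1k = rand(16);                        % stand-in coarsest-level basebands
    Fk  = A1k + B1k - (mean(A1k(:)) + mean(B1k(:))) / 2;    % eq. (7.13)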

7.7 Fusion Complexity

In terms of computational complexity, the edge-based multiresolution decomposition-fusion approach proposed in this chapter reduces the computational effort required to fuse two images compared to the conventional QMF implementation. The most significant portion of this reduction comes from the smaller number of filters used in the decomposition and reconstruction (analysis and synthesis) filter banks. In both the analysis and synthesis banks, the elimination of a one-dimensional filter at the second and first stages of filtering, respectively, reduces the complexity by around one quarter relative to the direct implementation.


8. OBJECTIVE EVALUATION OF PIXEL-LEVEL IMAGE FUSION PERFORMANCE

8.1 Introduction

This chapter addresses the issue of objectively measuring pixel-level image

fusion performance. Multisensor image fusion is widely recognized as valuable in

image based application areas such as remote and airborne sensing and medical

imaging. As a consequence, with the constant improvements in the availability of

multispectral/ multisensor equipment, considerable research effort has been directed

towards the development of advanced image fusion techniques. Fusion performance

metrics are used in this context to identify suitable and robust fusion approaches and

to optimize the system parameters.

In this chapter, a framework for the objective evaluation of pixel-level image fusion performance is proposed. The framework models the amount of, and the accuracy with

which visual information is transferred from the inputs to the fused image by the

fusion process. It is based on the principle that visual information conveyed by an

image signal relates to edge information. Therefore, by comparing the edge

information of the inputs to that of the fused image, the success of information

transfer from the input images into the fused output image can be measured. This

quantity then represents a measure of fusion performance. Perceptual importance of

different regions within the image is also taken into account in the form of perceptual

weighting coefficients associated with each gradient (edge) point in the inputs. The

objective fusion performance measure produces a single, numerical, fusion

performance score obtained as a sum of perceptually weighted measures of local

information fusion success.


8.2 Edge Information Extraction

As mentioned previously, human observers extract visual information by resolving the uncertainties (i.e. gray-level changes) in the image. In real image signals, these

changes are not concentrated in any predefined region but are commonly distributed

according to content throughout the image signal. Spatial locations where the signal

changes value form a part of the uncertainty associated with the image signal.

An observer searches the visual stimulus (image signal) for these areas of

“uncertainty” and extracts information from them.

However, information is not only contained in the detectable changes of the

signal value fixated by the observer. The lack of signal change (zero edge) carries a

small but finite amount of information, i.e. that there is no edge there.

Therefore, in order to capture all the information contained within an image,

all possible “uncertainties” of that signal have to be considered. This is done by

measuring edge (gradient) information at all spatial locations within the presented

image.

    -1  -2  -1                -1   0   1
     0   0   0                -2   0   2
     1   2   1                -1   0   1

Figure 8.1: a) Horizontal and b) vertical Sobel templates

Visual information from the image signal is represented, at each position,

through edge strength and orientation parameters. These parameters are extracted

using a simple Sobel edge operator, defined in its basic form by the two 3×3

templates shown in Figure 8.1. These templates represent the horizontal and vertical

edge operators that measure edge components in the horizontal and vertical directions

respectively.

For the purpose of edge information extraction in the proposed objective

measure, all three images, A, B and F, are two-dimensionally filtered with the two


Sobel templates. The result of filtering each image is a pair of images, s_x and s_y, that contain the edge components in the x and y directions, respectively.

From these components, the edge strength, g(n,m), and orientation, α(n,m),

information is easily obtained for each pixel p(n,m) of an input image (say image A)

according to:

g_A(n,m) = \sqrt{ s_x^A(n,m)^2 + s_y^A(n,m)^2 }                          (8.1)

α_A(n,m) = tan^{-1}( s_x^A(n,m) / s_y^A(n,m) )                           (8.2)

for 1 ≤ n ≤ N and 1 ≤ m ≤ M, where N and M are the dimensions of the input image.
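As a sketch, the edge strength and orientation of (8.1)-(8.2) can be computed with plain 2-D convolution. The assignment of the two Sobel outputs to s_x and s_y below follows the usual Sobel convention, and the use of the four-quadrant atan2 is our choice for numerical robustness.

    img = rand(128);                               % stands in for A, B or F
    tH  = [-1 -2 -1; 0 0 0; 1 2 1];                % horizontal Sobel template (Fig. 8.1a)
    tV  = [-1 0 1; -2 0 2; -1 0 1];                % vertical Sobel template (Fig. 8.1b)
    sx  = conv2(img, tV, 'same');                  % edge component in the x direction
    sy  = conv2(img, tH, 'same');                  % edge component in the y direction
    g   = sqrt(sx.^2 + sy.^2);                     % edge strength, eq. (8.1)
    alf = atan2(sx, sy);                           % edge orientation, cf. eq. (8.2)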

8.3 Perceptual Loss of Edge Strength and Orientation

The edge information preservation estimator is a crucial part of the objective

fusion performance measure. It provides a measure of how well edge information in

the fused image represents the edge information that can be found in the inputs. This

measurement represents a comparison with the theoretical aim of the fusion process

which is to preserve, as truthfully as possible, all input information in a single fused

image. This comparison is the basis of the measurement of image fusion performance

achieved by the fusion system.

Edge information extracted from the input and fused images is in the form of edge strength and orientation maps: g_A(n,m), g_B(n,m) and g_F(n,m), and α_A(n,m), α_B(n,m) and α_F(n,m).

The change in edge strength is evaluated as the ratio between the strength of

the fused and of the input gradient for the case when there is a loss of contrast, i.e. the

input gradient is larger than the fused. In the opposite case, when the fused gradient is

larger than the input, we have unintended contrast enhancement which is treated in

the same way as an inverted loss in contrast and the ratio is inverted. The strength

change parameter of information in F with respect to A, GAF can therefore be

expressed as:


G^{AF}(n,m) = g_F(n,m) / g_A(n,m),   if g_A(n,m) > g_F(n,m)              (8.3)
G^{AF}(n,m) = g_A(n,m) / g_F(n,m),   otherwise

From the expression in equation (8.3), it can be seen that the parameter G^{AF} has a value of unity when the fused gradient strength g_F(n,m) is a perfect representation of, i.e. it is equal to, the input gradient strength g_A(n,m). For an increasing difference between the two values, G^{AF} decreases linearly towards zero.

The change of orientation information in F with respect to A, A^{AF}, can be expressed as a normalized relative distance between the input and fused edge orientations:

A^{AF}(n,m) = | |α_A(n,m) - α_F(n,m)| - π/2 | / (π/2)                    (8.4)

These are used to derive the edge strength and orientation preservation values:

Q_g^{AF}(n,m) = Γ_g / ( 1 + exp( κ_g ( G^{AF}(n,m) - σ_g ) ) )           (8.5)

Q_α^{AF}(n,m) = Γ_α / ( 1 + exp( κ_α ( A^{AF}(n,m) - σ_α ) ) )           (8.6)

Q_g^{AF}(n,m) and Q_α^{AF}(n,m) model the perceptual loss of information in F, in terms of how well the strength and orientation values of a pixel p(n,m) in A are represented in the fused image. The constants Γ_g, κ_g, σ_g and Γ_α, κ_α, σ_α determine the exact shape of the sigmoid functions used to form the

edge strength and orientation preservation values, see equations (8.5) and

(8.6).

Edge information preservation values are then defined as

Q^{AF}(n,m) = Q_g^{AF}(n,m) · Q_α^{AF}(n,m)

with 0 ≤ Q^{AF}(n,m) ≤ 1. A value of 0 corresponds to the complete loss of edge information, at location (n,m), as transferred from A into F, while Q^{AF}(n,m) = 1 indicates fusion from A to F with no loss of information.

The overall objective fusion performance measurement of an image fusion

process p, operating on input images A and B to produce a fused image F, is

evaluated as a perceptually weighted, normalized sum of edge information

preservation coefficients across the input image set:

Q^{AB/F} = Σ_{n=1}^{N} Σ_{m=1}^{M} [ Q^{AF}(n,m) w^A(n,m) + Q^{BF}(n,m) w^B(n,m) ] / Σ_{i=1}^{N} Σ_{j=1}^{M} [ w^A(i,j) + w^B(i,j) ]        (8.7)

The edge preservation values Q^{AF}(n,m) and Q^{BF}(n,m) are weighted by w^A(n,m) = [g_A(n,m)]^L and w^B(n,m) = [g_B(n,m)]^L respectively, where L is a constant. A reasonable importance distribution is obtained only with L in the region 0.8 < L < 1.2; higher and lower values place excessive emphasis on strong or weak edges, respectively.
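The complete measure of (8.3)-(8.7) can be sketched as a small MATLAB function (save as qabf_sketch.m). The sigmoid constants and the weighting exponent below are placeholders chosen by us for illustration, not the values used in the report.

    function Q = qabf_sketch(A, B, F)
        % Objective fusion performance measure Q^{AB/F}, eqs. (8.1)-(8.7) (sketch).
        Gg = 1;  kg = -10;  sg = 0.5;     % strength sigmoid constants (assumed)
        Ga = 1;  ka = -20;  sa = 0.75;    % orientation sigmoid constants (assumed)
        Lw = 1;                           % perceptual weighting exponent (assumed)
        [gA, aA] = grad_info(A);  [gB, aB] = grad_info(B);  [gF, aF] = grad_info(F);
        [QAF, wA] = preservation(gA, aA, gF, aF, Gg, kg, sg, Ga, ka, sa, Lw);
        [QBF, wB] = preservation(gB, aB, gF, aF, Gg, kg, sg, Ga, ka, sa, Lw);
        Q = sum(QAF(:).*wA(:) + QBF(:).*wB(:)) / sum(wA(:) + wB(:));   % eq. (8.7)
    end

    function [g, a] = grad_info(X)
        % Sobel edge strength and orientation, eqs. (8.1)-(8.2).
        sx = conv2(double(X), [-1 0 1; -2 0 2; -1 0 1], 'same');
        sy = conv2(double(X), [-1 -2 -1; 0 0 0; 1 2 1], 'same');
        g  = sqrt(sx.^2 + sy.^2);
        a  = atan2(sx, sy);
    end

    function [Q, w] = preservation(gI, aI, gF, aF, Gg, kg, sg, Ga, ka, sa, Lw)
        % Edge information preservation of one input in F and its perceptual weight.
        G   = min(gF, gI) ./ max(max(gF, gI), eps);          % eq. (8.3), ratio form
        Aor = abs(abs(aI - aF) - pi/2) / (pi/2);              % eq. (8.4)
        Qg  = Gg ./ (1 + exp(kg * (G   - sg)));               % eq. (8.5)
        Qa  = Ga ./ (1 + exp(ka * (Aor - sa)));               % eq. (8.6)
        Q   = Qg .* Qa;                                       % edge information preservation
        w   = gI .^ Lw;                                       % perceptual weight
    end

Calling Q = qabf_sketch(A, B, F) returns a single score; values close to 1 indicate good transfer of edge information from both inputs.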


9. SOFTWARE DESCRIPTION

9.1 Introduction

MATLAB is a programming environment for algorithm development, data

analysis, visualization, and numerical computation. Using MATLAB, you can solve

technical computing problems faster than with traditional programming languages,

such as C, C++, and Fortran. MATLAB is used in a wide range of applications, including

signal and image processing, communications, control design, test and measurement,

financial modeling and analysis, and computational biology.

9.2 Structures

MATLAB supports structure data types. Since all variables in MATLAB are

arrays, a more adequate name is "structure array", where each element of the array

has the same field names. In addition, MATLAB supports dynamic field names.

Unfortunately, the MATLAB JIT does not support MATLAB structures; therefore, even a simple bundling of variables into a structure comes at a cost.
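For illustration (our own example, not drawn from the report), a structure array with dynamic field access looks like:

    s(1).name = 'imageA';  s(1).data = rand(4);
    s(2).name = 'imageB';  s(2).data = rand(4);
    fld = 'data';                    % dynamic field name
    mean(s(2).(fld)(:))              % access the field selected at run time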

9.3 Function handles

MATLAB supports elements of lambda-calculus by introducing function

handles, or function references, which are implemented either in .m files or

anonymous/nested functions.
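A short example of both kinds of handles (again our own illustration):

    f = @sin;                        % handle to a named function
    g = @(x) x.^2 + 1;               % anonymous function
    f(pi/2)                          % evaluates to 1
    g(3)                             % evaluates to 10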

9.4 MATLAB Fundamentals

Working with the MATLAB user interface

Entering commands and creating variables

Performing analysis on vectors and matrices

Visualizing vector and matrix data

Working with data files

Working with data types

Automating commands with scripts

Writing programs with logic and flow control
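A tiny script touching several of the fundamentals listed above (our own illustrative example):

    v = 0:0.1:2*pi;            % create a vector
    y = sin(v) .^ 2;           % element-wise operations
    m = mean(y);               % simple analysis
    for k = 1:3                % flow control
        fprintf('pass %d, mean = %.3f\n', k, m);
    end
    plot(v, y), xlabel('v'), ylabel('sin(v)^2')   % visualization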


9.5 ADVANTAGES OF MATLAB:

Algorithm Development

Develop algorithms using the high-level language and development tools in

MATLAB.

Data Analysis

Analyze, visualize, and explore data with MATLAB.

Data Visualization

Visualize engineering and scientific data with a wide variety of plotting

functions in MATLAB.

Numeric Computation

Perform mathematical operations and analyze data with MATLAB functions.

Publishing and Deploying

Share your work by publishing MATLAB code from the Editor to HTML and

other formats.


10. RESULTS

Figure 10.1: Image fusion of input image 1 (focus on left part) and input image 2 (focus on right part) with the image averaging and wavelet fusion methods.


Figure 10.2: Image fusion of input image 1 (focus on left part) and input image 2 (focus on right part) with gradient-based image fusion.


Figure 10.3: Image fusion of input image 1 (CT image) and input image 2 (MRI image) with the wavelet fusion method.


Figure 10.4: Image fusion of input image 1 (CT image) and input image 2 (MRI image) with gradient-based image fusion.


11. CONCLUSION

This chapter summarizes and concludes the investigation of pixel-level image fusion presented in this report. A novel multiresolution, signal-level image fusion method, whose architecture belongs to the same broad system class as the DWT, has been presented. The method uses an alternative gradient map image information representation and a new “fuse-then-decompose” approach within the framework of a novel, combined fusion/decomposition multiresolution architecture. Furthermore, the image information representation in the form of gradient map signals allows for reliable feature selection, realized using cross-band information fusion. Thus, the proposed fusion system significantly reduces reconstruction error artefacts and the loss of contrast information, conditions which are commonly observed in conventional DWT-based fusion. The objective performance

evaluation results demonstrate the superiority of gradient-based multiresolution image

fusion with respect to more complex multiresolution fusion approaches.

Further Enhancement

The biggest effort required for further development is connected with the practical side of image fusion, such as data gathering. In future work, neural networks could be used to identify objects in the fused images, and fuzzy logic could be used to generate a database that helps the physician diagnose patients more effectively.


12. REFERENCES

[1] A. Abd-el-kader, H. El-Din Moustafa and S. Rehan, "Performance measure for image fusion based on wavelet transform and curvelet transform", National Telecommunication Institute, April 26-28, 2011.

[2] A. Malviya and S. G. Bhirud, "Objective criterion for performance evaluation of image fusion techniques", International Journal of Computer Applications (0975-8887), Vol. 1, No. 25, 2010.

[3] YI Zheng-jun, LI Hua-feng and SONG Rui-jing, "Spatial frequency ratio image fusion method based on improved lifting wavelet transform", Opto-Electronic Engineering, Vol. 36, No. 7, pp. 65-70, 2009.

[4] Oliver Rockinger's collection [EB/OL], [2010-3-19], http://www.imagefusion.org/.

[5] XU Kai-yu and LI Shuang-yi, "An image fusion algorithm based on wavelet transform", Infrared Technology, Vol. 29, No. 8, pp. 455-458, 2007.

[6] SUN Yan-kui, Wavelet Analysis and Its Application, Machinery Industry Press, Beijing, 2005.

[7] M. I. Smith and J. P. Heather, "Review of image fusion technology in 2005", Proceedings of the SPIE, Vol. 5782, pp. 29-45, 2005.

[8] I. De and B. Chanda, "A simple and efficient algorithm for multi-focus image fusion using morphological wavelets", Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata 700108, India, 2005.

[9] P. Hill, N. Canagarajah and D. Bull, "Image fusion using complex wavelets", Dept. of Electrical and Electronic Engineering, The University of Bristol, Bristol BS5 1UB, UK, 2002.

[10] A. Toet and J. Ijspeert, "Perceptual evaluation of different image fusion schemes", Proc. SPIE, Vol. 4380, pp. 427-435, Aug. 2001.

[11] C. Xydeas and V. Petrović, "Objective image fusion performance measure", Electronics Letters, Vol. 36, No. 4, pp. 308-309, Feb. 2000.

[12] H. Li, B. Manjunath and S. Mitra, "Multisensor image fusion using the wavelet transform", Graphical Models and Image Processing, Vol. 57, No. 3, pp. 235-245, 1995.

[13] D. Esteban and C. Galand, "Application of quadrature mirror filters to split band voice coding schemes", Proc. Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Hartford, May 1977, pp. 191-195.

[14] Z. Zhang, "Investigations of image fusion", www.eecs.lehigh.edu/SPCRL/spcrl.htm, Lehigh University, May 2000.

[15] P. Burt and R. Kolczynski, "Enhanced image capture through fusion", Proc. 4th International Conference on Computer Vision, Berlin, 1993, pp. 173-182.

[16] W. Hendee and P. Wells, The Perception of Visual Information, Springer, New York, 1997.

[17] J. Johnston, "A filter family designed for use in quadrature mirror filter banks", Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 1980, pp. 291-294.

[18] V. Petrović and C. Xydeas, "Multiresolution image fusion using cross band feature selection", Proc. SPIE, Vol. 3719, pp. 319-326, Apr. 1999.

[19] M. Sonka, V. Hlavac and R. Boyle, Image Processing, Analysis and Machine Vision, PWS Publishing, Pacific Grove, 1998.