A
MINI PROJECT REPORT ON
IMAGE ENHANCEMENT FOR IMPROVING FACE DETECTION UNDER NON UNIFORM LIGHTING
CONDITIONS
Submitted in partial fulfillment for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
By
N.NIKITHA (08601A0480) V.SUKUMAR (08601A04C0)
Under the esteemed guidance of
Mrs. P.SHRAVANI Asst. Professor
Department of Electronics and Communication Engineering
SSJ ENGINEERING COLLEGE (Affiliated to J.N.T.U, Hyderabad)
2010-2011
SSJ ENGINEERING COLLEGE (Approved by A.I.C.T.E, Affiliated to J.N.T.U, Hyderabad)
Hyderabad-500075, A.P.
Department of Electronics and Communication Engineering
CERTIFICATE
This is to certify that the project entitled
“IMAGE ENHANCEMENT FOR IMPROVING FACE DETECTION UNDER NON UNIFORM LIGHTING CONDITIONS”
is a bona fide work carried out in partial fulfillment of the requirements for the award of the Degree of Bachelor of Technology in Electronics and Communication Engineering from SSJ Engineering College, affiliated to the Jawaharlal Nehru Technological University, Hyderabad, during the academic year 2011-2012. The results embodied in this project report have not been submitted to any university for the award of any degree.
By
N.NIKITHA (08601A0480) V.SUKUMAR (08601A04C0)
Internal Guide                      Head of the Department
Mrs. P. SHRAVANI                    Mr. S. JAGADEESH
Asst. Professor                     Associate Professor
External Examiner
ACKNOWLEDGMENT
My sincere thanks to Prof. C. Ashok, Principal, Sri Sai Jyothi Engineering
College, V.N. Pally, Ranga Reddy, for providing this opportunity to carry out the
present project work.
I gratefully acknowledge Mr. S. Jagadeesh, Professor and Head of the
Department of ECE, for his encouragement and advice during the course of this work.
We sincerely thank our internal guide Mr. Naresh Kumar for giving us the
encouragement for the successful completion of project work and providing the
necessary facilities.
We are grateful to VISION KREST EMBEDDED TECHNOLOGIES
for providing us an opportunity to carry out this project within the allotted time.
I would like to express my thanks to all the faculty members of ECE, who have
rendered valuable advice towards the completion of this project report.
BY
N.NIKITHA (08601A0480)
V.SUKUMAR (08601A04C0)
ABSTRACT
A robust and efficient image enhancement technique has been developed to
improve the visual quality of digital images that exhibit dark shadows due to the
limited dynamic ranges of imaging and display devices, which are incapable of
handling high dynamic range scenes. The proposed technique processes images in
two separate steps: dynamic range compression and local contrast enhancement.
Dynamic range compression is a neighborhood-dependent intensity
transformation which is able to enhance the luminance in dark shadows while keeping
the overall tonality consistent with that of the input image. In this manner the image
visibility can be largely and properly improved without creating unnatural rendition.
A neighborhood-dependent local contrast enhancement method is used to enhance the
image's contrast following the dynamic range compression. Experimental results
demonstrate that the proposed image enhancement technique has a strong capability
to improve the performance of the convolution face finder compared to histogram
equalization and multiscale Retinex with color restoration, without compromising the
false alarm rate.
CONTENTS
Chapters
1. INTRODUCTION
1.1 Introduction
1.2 Image processing
2. IMAGE PROCESSING
2.1 Introduction
2.2 Digital data
2.3 Data formats for digital satellite imagery
2.4 Image resolution
2.5 How to improve your image
2.6 Digital image processing
2.7 Pre-processing of the remotely sensed image
3. AERIAL IMAGERY
3.1 Uses of imagery
3.2 Types of aerial photography
3.3 Non linear image enhancement techniques
4. WAVELETS
4.1 Continuous wavelet transform
4.2 Discrete wavelet transform
4.3 Algorithm
4.4 Wavelet dynamic range compression and contrast enhancement
5. RESULT ANALYSIS
5.1 Results
5.2 Conclusion
5.3 Future work
REFERENCES
APPENDIX
LIST OF FIGURES
Chapters
3. Aerial Imagery
3.1 Aerial imagery of the test building
4. Wavelets
4.1 Wavelets of CWT
4.2 Same transform rotated to 250 degrees looking from 45 degrees above
4.3 Cosine signals corresponding to various scales
4.4 Block diagram
4.5 DWT waveforms
LIST OF TABLES
Chapters
2. Image Processing
2.1 Elements of image interpretation
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
Aerial images captured from aircraft, spacecraft, or satellites usually suffer
from lack of clarity, since the atmosphere enclosing the Earth has effects upon the
images, such as turbidity caused by haze, fog, clouds or heavy rain. The visibility of
such aerial images may decrease drastically, and sometimes the conditions at which
the images are taken may lead to near-zero visibility even for the human eye. Even
though human observers may not see much more than smoke, there may exist useful
information in images taken under such poor conditions. Captured images are
usually not the same as what we see in a real-world scene, and are generally a poor
rendition of it.
The high dynamic range of real-life scenes and the limited dynamic range of
imaging devices result in images with locally poor contrast. The Human Visual System
(HVS) deals with high dynamic range scenes by compressing the dynamic range
and adapting locally to each part of the scene. There are some exceptions, such as
turbid (e.g. fog, heavy rain or snow) imaging conditions, under which the acquired
images and the direct observation possess a close parity. The extremely narrow dynamic
range of such scenes leads to extremely low contrast in the acquired images.
To deal with the problems caused by the limited dynamic range of imaging
devices, many image processing algorithms have been developed. These algorithms
also provide contrast enhancement to some extent. Recently we have developed a
wavelet-based dynamic range compression (WDRC) algorithm to improve the visual
quality of digital images of high dynamic range scenes with non-uniform lighting
conditions. The WDRC algorithm is modified by introducing a histogram adjustment
and a non-linear color restoration process so that it provides color constancy and deals
with "pathological" scenes having very strong spectral characteristics in a single band.
This fast image enhancement algorithm, which provides dynamic range compression
while preserving the local contrast and tonal rendition, is a very good candidate for
aerial imagery applications such as image interpretation for defense and security
tasks. The algorithm can further be applied to video streaming for aviation safety. In
this project, the application of the WDRC algorithm to aerial imagery is presented.
The results obtained from a large variety of aerial images show strong robustness and
high image quality, indicating promise for aerial imagery during poor-visibility flight
conditions.
1.2 IMAGE PROCESSING
In electrical engineering and computer science, image processing is any form
of signal processing for which the input is an image, such as a photograph or video
frame; the output of image processing may be either an image or a set of
characteristics or parameters related to the image. Most image-processing techniques
involve treating the image as a two-dimensional signal and applying standard signal-
processing techniques to it.
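The "image as a 2-D signal" idea can be made concrete with a small sketch. The 3x3 mean filter and the tiny test image below are our own illustrative choices, not part of this report:

```python
# Illustrative sketch: an image treated as a 2-D signal and smoothed with a
# standard signal-processing operation (a 3x3 mean filter).
# The 4x4 test image is hypothetical.

def mean_filter_3x3(image):
    """Apply a 3x3 mean filter to a 2-D list of pixel values.
    Border pixels are left unchanged for simplicity."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # copy of the input
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Average the 3x3 neighborhood centered on (r, c)
            total = sum(image[r + dr][c + dc]
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            out[r][c] = total // 9
    return out

image = [
    [10, 10, 10, 10],
    [10, 90, 90, 10],
    [10, 90, 90, 10],
    [10, 10, 10, 10],
]
smoothed = mean_filter_3x3(image)
print(smoothed[1][1])  # interior pixel pulled toward its neighborhood mean: 45
```

The same neighborhood-averaging pattern, with different kernel weights, underlies most of the linear filtering operations mentioned later in this report.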
Image processing usually refers to digital image processing, but optical and
analog image processing are also possible. This section covers general techniques
that apply to all of them. The acquisition of images (producing the input image in the
first place) is referred to as imaging. Image processing is a physical process used to
convert an image signal into a physical image. The image signal can be either digital
or analog. The actual output can be a physical image or the characteristics of an
image. The most common type of image processing is photography. In this process,
an image is captured using a camera to create a digital or analog image. In order to
produce a physical picture, the image is processed using the appropriate technology
based on the input source type.
In digital photography, the image is stored as a computer file. This file is
translated using photographic software to generate an actual image. The colors,
shading, and nuances are all captured at the time the photograph is taken; the
software then translates this information into an image.
Typical image processing operations include:
Euclidean geometry transformations such as enlargement, reduction, and
rotation.
Color corrections such as brightness and contrast adjustments, color mapping,
color balancing, quantization, or color translation to a different color space.
Digital compositing or optical compositing (combination of two or more
images), which is used in film-making to make a "matte".
Interpolation, demosaicing, and recovery of a full image from a raw image
format using a Bayer filter pattern.
Image registration, the alignment of two or more images.
Image differencing and morphing.
Image recognition, for example, may extract the text from the image using
optical character recognition or checkbox and bubble values using optical
mark recognition.
Image segmentation.
High dynamic range imaging by combining multiple images.
Geometric hashing for 2-D object recognition with affine invariance.
1.2.1 DIGITAL IMAGE PROCESSING
Digital image processing is the use of computer algorithms to perform image
processing on digital images. As a subcategory or field of digital signal processing,
digital image processing has many advantages over analog image processing. It
allows a much wider range of algorithms to be applied to the input data and can avoid
problems such as the build-up of noise and signal distortion during processing. Since
images are defined over two dimensions (perhaps more), digital image processing may
be modeled in the form of multidimensional systems.
Many of the techniques of digital image processing, or digital picture
processing as it often was called, were developed in the 1960s at the Jet Propulsion
Laboratory, Massachusetts Institute of Technology, Bell Laboratories, University of
Maryland, and a few other research facilities, with application to satellite imagery,
wire-photo standards conversion, medical imaging, videophone, character
recognition, and photograph enhancement. The cost of processing was fairly high,
however, with the computing equipment of that era. That changed in the 1970s, when
digital image processing proliferated as cheaper computers and dedicated hardware
became available. Images then could be processed in real time, for some dedicated
problems such as television standards conversion. As general-purpose computers
became faster, they started to take over the role of dedicated hardware for all but the
most specialized and computer-intensive operations.
With the fast computers and signal processors available in the 2000s, digital
image processing has become the most common form of image processing, and is
generally used because it is not only the most versatile method but also the
cheapest. Digital image processing technology for medical applications was inducted
into the Space Foundation Space Technology Hall of Fame in 1994. Digital
image processing allows the use of much more complex algorithms for image
processing, and hence can offer both more sophisticated performance at simple tasks
and the implementation of methods which would be impossible by analog means. In
particular, digital image processing is the only practical technology for:
Classification
Feature extraction
Pattern recognition
Projection
Multi-scale signal analysis
Some techniques which are used in digital image processing include:
Pixelization
Linear filtering
Principal components analysis
Independent component analysis
Hidden Markov models
Anisotropic diffusion
Partial differential equations
Self-organizing maps
Neural networks
Wavelets
CHAPTER 2
IMAGE PROCESSING
2.1 INTRODUCTION
Image Processing and Analysis can be defined as the "act of examining
images for the purpose of identifying objects and judging their significance". Image
analysts study the remotely sensed data and attempt, through logical processes of
detecting, identifying, classifying, measuring and evaluating, to judge the significance
of physical and cultural objects, their patterns and spatial relationships.
2.2 DIGITAL DATA
In the most generalized way, a digital image is an array of numbers depicting the
spatial distribution of a certain field parameter (such as the reflectivity of EM radiation,
emissivity, temperature, or some geophysical or topographical elevation). A digital
image consists of discrete picture elements called pixels. Associated with each pixel
is a number, represented as a DN (Digital Number), that depicts the average radiance of a
relatively small area within a scene. The range of DN values is normally 0 to 255.
The size of this area affects the reproduction of details within the scene. As the pixel
size is reduced, more scene detail is preserved in the digital representation.
Remote sensing images are recorded in digital form and then processed by
computers to produce images for interpretation purposes. Images are available in
two forms - photographic film form and digital form. Variations in the scene
characteristics are represented as variations in brightness on photographic films. A
particular part of a scene reflecting more energy will appear bright, while a different
part of the same scene reflecting less energy will appear dark.
2.3 DATA FORMATS FOR DIGITAL SATELLITE IMAGERY
Digital data from the various satellite systems is supplied to the user in the form of
computer readable tapes or CD-ROM. No worldwide standard for the storage and
transfer of remotely sensed data has been agreed upon, though the CEOS (Committee
on Earth Observation Satellites) format is becoming accepted as the standard. Digital
remote sensing data are often organised using one of three common formats. For
instance, consider an image consisting of four spectral channels, which can be
visualized as four superimposed images, with corresponding pixels in one band
registering exactly to those in the other bands. These common formats are:
Band Interleaved by Pixel (BIP)
Band Interleaved by Line (BIL)
Band Sequential (BSQ)
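For a concrete picture of the three interleaving schemes, consider a hypothetical 2-band, 2x2 image (a sketch of the pixel orderings only; real products add headers, more bands and larger rasters):

```python
# Sketch of BIP / BIL / BSQ orderings for a hypothetical 2-band, 2x2 image.
# bands[b][r][c] is the DN of band b at row r, column c.
bands = [
    [[1, 2], [3, 4]],   # band 1
    [[5, 6], [7, 8]],   # band 2
]
n_bands, n_rows, n_cols = len(bands), len(bands[0]), len(bands[0][0])

# Band Sequential (BSQ): all of band 1, then all of band 2
bsq = [bands[b][r][c] for b in range(n_bands)
       for r in range(n_rows) for c in range(n_cols)]

# Band Interleaved by Line (BIL): row 1 of every band, then row 2 of every band
bil = [bands[b][r][c] for r in range(n_rows)
       for b in range(n_bands) for c in range(n_cols)]

# Band Interleaved by Pixel (BIP): all bands of pixel (0,0), then (0,1), ...
bip = [bands[b][r][c] for r in range(n_rows)
       for c in range(n_cols) for b in range(n_bands)]

print(bsq)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(bil)  # [1, 2, 5, 6, 3, 4, 7, 8]
print(bip)  # [1, 5, 2, 6, 3, 7, 4, 8]
```

All three lists hold the same eight DNs; only the order on disk differs, which is why readers must know the format before interpreting the byte stream.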
Digital image analysis is usually conducted using raster data structures - each
image is treated as an array of values. This offers advantages for the manipulation of
pixel values by an image processing system, as it is easy to find and locate pixels and
their values. The disadvantages become apparent when one needs to represent the
array of pixels as discrete patches or regions, whereas vector data structures use
polygonal patches and their boundaries as the fundamental units for analysis and
manipulation. The vector format is, however, not appropriate for digital analysis of
remotely sensed data.
2.4 IMAGE RESOLUTION
Resolution can be defined as "the ability of an imaging system to record fine
details in a distinguishable manner". A working knowledge of resolution is essential
for understanding both practical and conceptual details of remote sensing. Along with
the actual positioning of spectral bands, the resolutions are of paramount importance in
determining the suitability of remotely sensed data for a given application. The
major characteristics of imaging remote sensing instruments operating in the visible
and infrared spectral regions are described in terms of the following:
Spectral resolution
Radiometric resolution
Spatial resolution
Temporal resolution
2.4.1 Spectral Resolution refers to the width of the spectral bands. Different
materials on the earth's surface exhibit different spectral reflectances and emissivities;
these spectral characteristics define the spectral position and spectral sensitivity needed
in order to distinguish materials. There is a tradeoff between spectral resolution and
signal to noise. The use of well-chosen and sufficiently numerous spectral bands is a
necessity, therefore, if different targets are to be successfully identified on remotely
sensed images.
2.4.2 Radiometric Resolution or radiometric sensitivity refers to the number of
digital levels used to express the data collected by the sensor. It is commonly
expressed as the number of bits (binary digits) needed to store the maximum level. For
example, Landsat TM data are quantised to 256 levels (equivalent to 8 bits). Here also
there is a tradeoff between radiometric resolution and signal to noise: there is no
point in having a step size less than the noise level in the data. A low-quality
instrument with a high noise level would therefore necessarily have a lower
radiometric resolution compared with a high-quality, high signal-to-noise-ratio
instrument.
2.4.3 Spatial Resolution of an imaging system is defined through various
criteria: the geometric properties of the imaging system, the ability to distinguish
between point targets, the ability to measure the periodicity of repetitive targets, and
the ability to measure the spectral properties of small targets.
The most commonly quoted quantity is the instantaneous field of view
(IFOV), which is the angle subtended by the geometrical projection of a single detector
element onto the Earth's surface. It may also be given as the distance D measured along
the ground, in which case the IFOV is clearly dependent on sensor height, from the
relation D = hb, where h is the height and b is the angular IFOV in radians. An
alternative measure of the IFOV is based on the PSF, e.g., the width of the PSF at
half its maximum value.
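As a worked example of D = hb (the altitude and IFOV below are illustrative assumptions, roughly of the order used by Landsat-class sensors, not figures from this report):

```python
# Worked example of the ground-projected IFOV, D = h * b.
# Both numbers are assumed illustrative values, not data from this report.
h = 705_000         # sensor altitude in metres (assumed)
b = 42.5e-6         # angular IFOV in radians (assumed)
D = h * b           # ground distance covered by one detector element
print(round(D, 1))  # roughly 30 m on the ground
```

A coarser IFOV (larger b) or a higher orbit (larger h) both enlarge the ground patch each detector element averages over, directly degrading spatial resolution.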
A problem with IFOV definition, however, is that it is a purely geometric
definition and does not take into account spectral properties of the target. The
effective resolution element (ERE) has been defined as "the size of an area for which
a single radiance value can be assigned with reasonable assurance that the response is
within 5% of the value representing the actual relative radiance". Being based on
actual image data, this quantity may be more useful in some situations than the IFOV.
Other methods of defining the spatial resolving power of a sensor are based on the
ability of the device to distinguish between specified targets. One of these concerns the
ratio of the modulation of the image to that of the real target. Modulation, M, is defined as:

M = (Emax - Emin) / (Emax + Emin)

where Emax and Emin are the maximum and minimum radiance values recorded
over the image.
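A quick numerical check of the modulation formula (the radiance values are made up for illustration):

```python
# Modulation M = (Emax - Emin) / (Emax + Emin); example radiances are assumed.
def modulation(e_max, e_min):
    return (e_max - e_min) / (e_max + e_min)

print(modulation(200.0, 100.0))  # 100 / 300, i.e. about 0.333
```

M ranges from 0 (a flat, featureless image) to 1 (full contrast between the brightest and darkest recorded radiances).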
2.4.4 Temporal Resolution refers to the frequency with which images of a given
geographic location can be acquired. Satellites not only offer the best chances of
frequent data coverage but also of regular coverage. The temporal resolution is
determined by orbital characteristics and the swath width, the width of the imaged area.
Swath width is given by 2h tan(FOV/2), where h is the altitude of the sensor and FOV
is the angular field of view of the sensor.
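The swath-width relation can likewise be checked numerically. The altitude and FOV below are illustrative assumptions; with h of about 705 km and a FOV of about 15 degrees, the result comes out close to the familiar Landsat swath of roughly 185 km:

```python
import math

# Swath width = 2 * h * tan(FOV / 2); h and FOV are assumed example values.
h = 705_000                      # sensor altitude in metres (assumed)
fov = math.radians(15.0)         # angular field of view (assumed)
swath = 2 * h * math.tan(fov / 2)
print(round(swath / 1000, 1))    # swath width in kilometres
```

A wider swath means a shorter revisit interval, which is the link between this geometric formula and temporal resolution.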
2.5 How to Improve Your Image?
Analysis of remotely sensed data is done using various image processing techniques
and methods that include:
Analog image processing
Digital image processing.
2.5.1 Visual or Analog processing techniques are applied to hard copy data
such as photographs or printouts. Image analysis using visual techniques adopts certain
elements of interpretation (Table 2.1). The use of these fundamental elements depends
not only on the area being studied, but also on the knowledge the analyst has of the
study area. For example, the texture of an object is very useful in distinguishing
objects that may appear the same when judged solely on tone (e.g., water and tree
canopy may have the same mean brightness values, but their textures are quite
different). Association is a very powerful image analysis tool when coupled with
general knowledge of the site.
Thus we are adept at applying collateral data and personal knowledge to the task of
image processing. Examining remotely sensed data with a multi-concept approach -
multispectral, multitemporal, multiscale and multidisciplinary - allows us to judge
not only what an object is but also its importance. Apart from this, analog image
processing also includes optical photogrammetric techniques allowing for precise
measurement of the height, width, location, etc. of an object.
2.5.2 Digital Image Processing is a collection of techniques for the manipulation of
digital images by computers. The raw data received from the imaging sensors on
satellite platforms contains flaws and deficiencies. To overcome these and recover
the original data, the imagery needs to undergo several steps of processing. These
steps vary from image to image depending on the type of image format, the initial
condition of the image, the information of interest and the composition of the image
scene.
TABLE 2.1: ELEMENTS OF IMAGE INTERPRETATION

Primary Elements:                       Black and White Tone, Color, Stereoscopic Parallax
Spatial Arrangement of Tone & Color:    Size, Shape, Texture, Pattern
Based on Analysis of Primary Elements:  Height, Shadow
Contextual Elements:                    Site, Association
2.6 Digital Image Processing undergoes three general steps:
Pre-processing
Display and enhancement
Information extraction
2.6.1 Pre-processing consists of those operations that prepare the data for subsequent
analysis by attempting to correct or compensate for systematic errors. The digital
imageries are subjected to several corrections such as geometric, radiometric and
atmospheric, though all of these corrections might not necessarily be applied in all
cases. These errors are systematic and can be removed before they reach the user. The
investigator should decide which pre-processing techniques are relevant on the basis
of the nature of the information to be extracted from the remotely sensed data. After
pre-processing is complete, the analyst may use feature extraction to reduce the
dimensionality of the data. Feature extraction is thus the process of isolating the most
useful components of the data for further study while discarding the less useful
aspects (errors, noise, etc.). It reduces the number of variables that must be examined,
thereby saving time and resources.
2.6.2 Image Enhancement operations are carried out to improve the
interpretability of the image by increasing the apparent contrast among various features
in the scene. The enhancement techniques mainly depend upon two factors:
The digital data (i.e. with spectral bands and resolution)
The objectives of interpretation
As an image enhancement technique often drastically alters the original
numeric data, it is normally used only for visual (manual) interpretation and not for
further numeric analysis. Common enhancements include image reduction, image
rectification, image magnification, transect extraction, contrast adjustments, band
ratioing, spatial filtering, Fourier transformations, principal component analysis and
texture transformation.
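One of the simplest of these enhancements, a linear contrast stretch, can be sketched as follows (the DN values are hypothetical; production software also handles saturation percentiles and multiple bands):

```python
# Minimal linear contrast stretch: map an image's observed [min, max] DN range
# onto the full 0-255 display range. The input DNs are hypothetical.
def contrast_stretch(dns):
    lo, hi = min(dns), max(dns)
    if hi == lo:                       # flat image: nothing to stretch
        return dns[:]
    return [round((dn - lo) * 255 / (hi - lo)) for dn in dns]

dns = [60, 70, 80, 90, 100]            # a low-contrast strip of pixels
print(contrast_stretch(dns))           # [0, 64, 128, 191, 255]
```

The stretched pixels span the whole display range, which is exactly the "increased apparent contrast" this section describes; as noted above, the original numeric data are altered, so the stretched values are for visual interpretation only.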
2.6.3 Information Extraction is the last step toward the final output of the
image analysis. After pre-processing and image enhancement the remotely sensed
data is subjected to quantitative analysis to assign individual pixels to specific classes.
Classification of the image is based on pixels of known identity, which are used to
classify the remainder of the image consisting of pixels of unknown identity. After
classification is complete, it is necessary to evaluate its accuracy by comparing the
categories on the classified images with the areas of known identity on the ground.
The final result of the analysis consists of maps (or images), data and a report. These
three components of the result provide the user with full information concerning the
source data, the method of analysis and the outcome and its reliability.
2.7 Pre-Processing of the Remotely Sensed Images
When remotely sensed data is received from the imaging sensors on satellite
platforms, it contains flaws and deficiencies. Pre-processing refers to those operations
that are preliminary to the main analysis. Pre-processing includes a wide range of
operations, from the very simple to extremes of abstractness and complexity. These
are categorized as follows:
1. Feature Extraction
2. Radiometric Corrections
3. Geometric Corrections
4. Atmospheric Correction
Restoration techniques involve the removal of unwanted and distracting elements,
such as image/system noise, atmospheric interference and sensor motion, that occur
in image data due to limitations in signal sensing and digitization, or in the data
recording or transmission process. Data from which these effects have been removed
are said to be "restored" to their correct or original condition, although we can, of
course, never know what the correct values might be, and must always remember that
attempts to correct the data may themselves introduce errors. Thus image
restoration includes the efforts to correct for both radiometric and geometric errors.
2.7.1 Feature Extraction
Feature extraction does not mean extracting geographical features visible on the
image; rather, it refers to "statistical" characteristics of the image data, such as
individual bands or combinations of band values that carry information concerning
systematic variation within the scene. Thus, for multispectral data, it helps in
portraying the essential elements of the image. It also reduces the number of spectral
bands that have to be analyzed. After feature extraction is complete, the analyst can
work with the desired channels or bands, and the retained bands carry most of the
useful information. Finally, such pre-processing increases the speed and reduces the
cost of analysis.
2.7.2 Radiometric corrections
Radiometric corrections are carried out because, when image data is recorded by the
sensors, it contains errors in the measured brightness values of the pixels. These
errors are referred to as radiometric errors and can result from:
1. The instruments used to record the data
2. The effect of the atmosphere
Radiometric processing influences the brightness values of an image to correct
for sensor malfunctions or to adjust the values to compensate for atmospheric
degradation. Radiometric distortion can be of two types:
1. The relative distribution of brightness over an image in a given band can be
different from that in the ground scene.
2. The relative brightness of a single pixel from band to band can be distorted
compared with the spectral reflectance character of the corresponding region on
the ground.
CHAPTER 3
AERIAL IMAGERY
Aerial imagery can expose a great deal about soil and crop conditions. The
"bird's eye" view an aerial image provides, combined with field knowledge, allows
growers to observe issues that affect yield. Our imagery technology enhances the
ability to be proactive and recognize a problematic area, thus minimizing yield loss
and limiting exposure to other areas of your field. Hemisphere GPS Imagery uses
infrared technology to help you see the big picture and identify these everyday issues.
Digital infrared sensors are very sensitive to subtle differences in plant health and
growth rate. Anything that changes the appearance of leaves (such as curling, wilting,
and defoliation) has an effect on the image. Computer enhancement makes these
variations within the canopy stand out, often indicating disease, water, weed, or
fertility problems.
Because of Hemisphere GPS technology, aerial imagery is over 30 times more
detailed than any commercially available satellite imagery and is available in selected
areas for the 2010 growing season. Images can be taken on a scheduled or as-needed
basis. Aerial images provide a snapshot of the crop condition. The example on the
right shows healthy crop conditions in red and less-than-healthy conditions in green.
These snapshots of crop variations can then be turned into variable rate prescription
maps (PMaps), as shown on the right. Imagery can also be used to identify crop
stress over a period of time. In the images to the left, the problem areas identified
with yellow arrows show potential plant damage (e.g. disease, insects, etc.).
Aerial images, however, store information about the electro-magnetic radiance
of the complete scene in almost continuous form. Therefore they support the
localization of break lines and linear or spatial objects. The Map Mart Aerial Image
Library covers all of the continental United States as well as a growing number of
International locations. The aerial imagery ranges in date from 1926 to the present
day depending upon the location. Imagery can be requested and ordered by selecting
an area on an interactive map, or larger areas, such as cities or counties, can be
purchased in bundles. Many of the current digital datasets are available for download
within a few minutes of purchase.
Aerial image measurement includes non-linear, 3-dimensional, and materials
effects on imaging, but excludes the processing effects of printing and etching on the
wafer. Aerial image emulation has been applied successfully to CDU measurement;
traditionally, aerial image metrology systems are used to evaluate defect printability
and repair success.
Figure 3.1: Aerial image of the test building
Digital aerial imagery should remain in the public domain and be archived
to secure its availability for future scientific, legal, and historical purposes.
Aerial photography is the taking of photographs of the ground from an elevated
position. The term usually refers to images in which the camera is not supported by a
ground-based structure. Cameras may be hand held or mounted, and photographs may
be taken by a photographer, triggered remotely or triggered automatically.

coefficients and finally 256 level 1 coefficients. Note that fewer and fewer samples
are used at lower frequencies; therefore the time resolution decreases as
frequency decreases, but since the frequency interval also decreases at low
frequencies, the frequency resolution increases. Obviously, the first few coefficients
would not carry a whole lot of information, simply due to the greatly reduced time
resolution.
To illustrate this richly bizarre DWT representation let us take a look at a real
world signal. Our original signal is a 256-sample long ultrasonic signal, which was
sampled at 25 MHz. This signal was originally generated by using a 2.25 MHz
transducer, therefore the main spectral component of the signal is at 2.25 MHz. The
last 128 samples correspond to [6.25 12.5] MHz range. As seen from the plot, no
information is available here, hence these samples can be discarded without any loss
of information. The preceding 64 samples represent the signal in the [3.12 6.25] MHz
range, which also does not carry any significant information. The little glitches
probably correspond to the high frequency noise in the signal. The preceding 32
samples represent the signal in the [1.5 3.1] MHz range. As you can see, the majority
of the signal’s energy is focused in these 32 samples, as we expected to see. The
previous 16 samples correspond to [0.75 1.5] MHz and the peaks that are seen at this
level probably represent the lower frequency envelope of the signal. The earlier
samples probably do not carry any other significant information. It is safe to say that
we can get by with the 3rd and 4th level coefficients; that is, we can represent this
256-sample-long signal with 16 + 32 = 48 samples, a significant data reduction which
would make your computer quite happy.
One area that has benefited the most from this particular property of the
wavelet transforms is image processing. As you may well know, images, particularly
high-resolution images, claim a lot of disk space. As a matter of fact, if this tutorial is
taking a long time to download, that is mostly because of the images. DWT can be
used to reduce the image size without losing much of the resolution. Here is how:
For a given image, you can compute the DWT of, say, each row, and discard
all values in the DWT that are less than a certain threshold. We then save only those
DWT coefficients that are above the threshold for each row, and when we need to
reconstruct the original image, we simply pad each row with as many zeros as the
number of discarded coefficients and use the inverse DWT to reconstruct each row of
the original image. We can also analyze the image at different frequency bands, and
reconstruct the original image by using only the coefficients of a particular band.
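This row-by-row scheme can be sketched as follows. The sketch zeroes discarded coefficients in place rather than literally padding with zeros, which is equivalent at reconstruction time, and a single-level Haar transform stands in for whatever wavelet and decomposition depth one actually uses.

```python
import numpy as np

def haar_dwt(x):
    """Single-level orthonormal Haar DWT: the approximation half
    concatenated with the detail half."""
    pairs = x.reshape(-1, 2)
    return np.concatenate([(pairs[:, 0] + pairs[:, 1]) / np.sqrt(2),
                           (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)])

def haar_idwt(c):
    """Exact inverse of haar_dwt."""
    half = len(c) // 2
    a, d = c[:half], c[half:]
    out = np.empty(len(c))
    out[0::2] = (a + d) / np.sqrt(2)
    out[1::2] = (a - d) / np.sqrt(2)
    return out

row = np.array([10.0, 10.5, 9.8, 10.2, 50.0, 49.5, 10.1, 9.9])
coeffs = haar_dwt(row)
threshold = 1.0
kept = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)  # discard small values
approx_row = haar_idwt(kept)  # reconstruction from the kept coefficients only
```

Here only the large approximation coefficients survive, yet the reconstructed row stays within a fraction of a gray level of the original.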
Another issue that is receiving more and more attention is carrying out the
decomposition (subband coding) not only on the low-pass side but on both sides; in
other words, zooming into both the low and high frequency bands of the signal
separately. This can be visualized as having both sides of the tree structure of Figure
4.1. What results is known as wavelet packets. We will not discuss wavelet packets
here, since they are beyond the scope of this tutorial. Anyone who is interested in
wavelet packets, or in more information on the DWT, can find it in any of the
numerous texts available.
And this concludes our mini-series on the wavelet transform. If I could be of any
assistance to anyone struggling to understand wavelets, I would consider the time
and the effort that went into this tutorial well spent. I would like to remind the reader
that this tutorial is neither a complete nor a thorough coverage of the wavelet
transforms. It is merely an overview of the concept of wavelets, and it was intended
to serve as a first reference for those who find the available texts on wavelets rather
complicated. There might be many structural and/or technical mistakes, and I would
appreciate it if you could point those out to me. Your feedback is of utmost
importance for the success of this tutorial.
4.3 ALGORITHM
The proposed enhancement algorithm consists of three stages: the first and
the third stages are applied in the spatial domain, and the second in the discrete
wavelet domain.
4.3.1 Histogram Adjustment
Our motivation in making a histogram adjustment for minimizing the
illumination effect is based on some assumptions about image formation and human
vision behavior. The sensor signal S(x, y) incident upon an imaging system can be
approximated as the product [8], [26]
S(x,y) = L(x,y)R(x,y), (1)
where R(x, y) is the reflectance and L(x, y) is the illuminance at each point (x,
y). In lightness algorithms, assuming that the sensors and filters used in artificial
visual systems possess the same nonlinear property as human photoreceptors, i.e.,
logarithmic responses to the physical intensities incident on their photoreceptors [8],
Equation 1 can be decomposed into a sum of two components by using the
transformation I(x, y) = log(S(x, y)):
I(x, y) = log(L(x, y)) + log(R(x, y)), (2)
where I(x, y) is the intensity of the image at pixel location (x, y). Equation 2
implies that illumination has an effect on the image histogram as a linear shift. This
shift, intrinsically, is not the same in different spectral bands.
Another assumption of the lightness algorithms is the gray world
assumption stating that the average surface reflectance of each scene in each
wavelength band is the same: gray [8].
From an image processing standpoint, this assumption indicates that images of
natural scenes should contain pixels having almost equal average gray levels in each
spectral band.
Combining Equation 2 with the gray-world assumption, we perform
histogram adjustment as follows:
1. The amount of shift corresponding to illuminance is determined from the beginning
of the lower tail of the histogram, such that a predefined amount (typically
0.5%) of image pixels is clipped.
2. The shift is subtracted from each pixel value.
3. This process is repeated separately for each color channel.
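Steps 1-3 above can be sketched in NumPy as follows. The function name and the percentile-based estimate of the clip point are illustrative choices, not the report's exact code.

```python
import numpy as np

def histogram_adjust(img, clip_fraction=0.005):
    """Steps 1-3 above: estimate the illumination shift from the lower
    tail of each channel's histogram (the gray level below which
    clip_fraction of the pixels fall) and subtract it, clamping the
    clipped pixels at zero."""
    out = np.empty_like(img, dtype=np.float64)
    for c in range(img.shape[2]):
        channel = img[..., c].astype(np.float64)
        shift = np.percentile(channel, 100.0 * clip_fraction)  # step 1
        out[..., c] = np.maximum(channel - shift, 0.0)         # steps 2-3
    return out

rng = np.random.default_rng(0)
img = rng.uniform(30.0, 200.0, size=(32, 32, 3))  # scene with a ~30-level lift
adjusted = histogram_adjust(img)
```

Because the shift is estimated and subtracted per channel, the adjustment also compensates for an illuminant whose shift differs between spectral bands, as noted above.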
4.4 WAVELET BASED DYNAMIC RANGE
COMPRESSION AND CONTRAST ENHANCEMENT
4.4.1 Dynamic Range Compression
Dynamic range compression and the local contrast enhancement in WDRC are
performed on the luminance channel. For input color images, the intensity image
I(x, y) can be obtained with the following equation:
I(x, y) = max[Ii(x, y)], i ∈ {R, G, B}. (3)
The enhancement algorithm is applied on this intensity image. The luminance values
are decomposed using orthonormal wavelet transform as shown in (4):
where aJ,k,l are the approximation coefficients at scale J with corresponding scaling
functions ΦJ,k,l(x, y), and dj,k,l are the detail coefficients at each scale with
corresponding wavelet functions Ψj,k,l(x, y). A raised hyperbolic sine function, given
by Equation 5, maps the normalized range [0, 1] of the aJ,k,l to the same range, and is
used for compressing the dynamic range represented by the coefficients. The
compressed coefficients at level J can be obtained by
where a′J,k,l are the normalized coefficients given by
and r is the curvature parameter which adjusts the shape of the hyperbolic sine
function. Applying the mapping operator to the coefficients and taking the inverse
wavelet transform would result in a compressed dynamic range with a significant loss
of contrast. Thus, a center/surround procedure that preserves and enhances the local
contrast is applied to those mapped coefficients.
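Since Equation (5) itself is not reproduced above, the following is only an assumed stand-in compressive curve with the qualitative behavior described: a raised-hyperbolic-sine mapping of [0, 1] onto itself, steep near the dark end and flatter in the middle, with r as the curvature parameter.

```python
import numpy as np

def compress(a_norm, r=3.0):
    """Maps normalized approximation coefficients in [0, 1] back onto
    [0, 1] with a raised-hyperbolic-sine curve; r adjusts the curvature.
    This exact expression is an assumption, not the report's Equation (5)."""
    return 0.5 * (1.0 + np.sinh(r * (2.0 * a_norm - 1.0)) / np.sinh(r))

a = np.linspace(0.0, 1.0, 11)
mapped = compress(a)   # stays within [0, 1], monotone increasing
```

Whatever the precise form, the key properties are the ones checked here: the endpoints 0 and 1 are fixed, and the mapping is monotone, so coefficient ordering is preserved while the dynamic range is compressed.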
4.4.2 Local Contrast Enhancement
The local contrast enhancement, which employs a center/surround approach, is
carried out as follows:
The surrounding intensity information related to each coefficient is obtained
by filtering the normalized approximation coefficients with a Gaussian kernel.
where s is the surround space constant, and k is determined under the constraint that
A local average image representing the surround is obtained by 2-D convolution
of (7) with the image A′, whose elements are the normalized approximation
coefficients a′J,k,l given by (6):
The contrast-enhanced coefficient matrix Anew, which will replace the original
approximation coefficients aJ,k,l, is given by
where R is the centre/surround ratio; d is the enhancement strength constant with a
default value of 1; and A is the matrix whose elements are the output of the
hyperbolic sine function in (5).
A linear combination of three kernels at three different scales, the combined-
scale Gaussian (Gc), is used for improved rendition and is given by
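The combined-scale surround can be sketched as below. The particular scales and the equal weights are assumptions for illustration; the report's actual kernel parameters are not reproduced above.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, normalized to unit sum."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable 2-D Gaussian filtering with edge padding ('same' size)."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    conv = lambda v: np.convolve(np.pad(v, pad, mode='edge'), k, mode='valid')
    return np.apply_along_axis(conv, 0, np.apply_along_axis(conv, 1, img))

def combined_surround(a_norm, sigmas=(1.0, 2.0, 4.0), weights=(1/3, 1/3, 1/3)):
    """Weighted sum of three Gaussian-blurred copies of the normalized
    approximation coefficients; scales and weights are assumed values."""
    return sum(w * blur(a_norm, s) for w, s in zip(weights, sigmas))

flat = np.ones((16, 16))
surround = combined_surround(flat)   # a constant image is left unchanged
```

Because each kernel sums to one, a flat region produces a surround equal to its own value, so the centre/surround ratio is 1 there and only genuine local contrast is amplified.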
4.4.3 Detail Coefficient Modification
The detail coefficients are modified using the ratio between the enhanced and
the original approximation coefficients. This ratio is applied as an adaptive gain
mask as follows:
where A and Anew are the original and the enhanced approximation coefficient
matrices at level 1; Dh, Dv, and Dd are the detail coefficient matrices for the
horizontal, vertical, and diagonal details at the same level; and Dhnew, Dvnew, and
Ddnew are the corresponding modified matrices, respectively. If the wavelet
decomposition is carried out for more than one level, this procedure is repeated for
each level.
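The gain-mask step above is simple enough to sketch directly; the function and variable names are illustrative.

```python
import numpy as np

def modify_details(A, A_new, details, eps=1e-8):
    """Apply the adaptive gain mask A_new / A (the ratio of enhanced to
    original approximation coefficients) to each detail matrix.
    `details` is the (Dh, Dv, Dd) tuple; eps guards against division
    by zero."""
    gain = A_new / (A + eps)
    return tuple(gain * D for D in details)

A = np.full((2, 2), 2.0)
A_new = np.full((2, 2), 4.0)     # enhancement doubled the approximations
Dh, Dv, Dd = np.ones((2, 2)), 2.0 * np.ones((2, 2)), 3.0 * np.ones((2, 2))
Dh_new, Dv_new, Dd_new = modify_details(A, A_new, (Dh, Dv, Dd))
```

Wherever the approximation was boosted, the horizontal, vertical, and diagonal details at that location are boosted by the same factor, which keeps edges consistent with the compressed luminance.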
4.4.4 Color Restoration
A linear color restoration process was used to obtain the final color image in our
previous work. For WDRC with color restoration, a non-linear approach is employed.
The RGB values of the enhanced color image, Ienh,i(x, y), along with the CR factor,
are given as:
where Ii(x, y) is the RGB value of the input color image at the corresponding pixel
location, and Ienh(x, y) is the resulting enhanced intensity image derived from the
inverse wavelet transform of the modified coefficients. Here the non-linear gain
factor has a canonical value and increases the color saturation, resulting in a more
appealing color rendition. Since the coefficients are normalized during the
enhancement process, the enhanced intensity image obtained by the inverse
transform of the enhanced coefficients, along with the enhanced color image given
by (15), spans almost only the lower half of the full range of the histogram. For the
final display-domain output, the Ienh,i's in (15) are stretched to represent the full
dynamic range. Histogram clipping from the upper tail of the histogram in each
channel gives the best results in converting the output to the display domain.
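A hedged sketch of this final stage follows: a linear ratio-based restoration stands in for the non-linear form of (15), whose gain factor is not reproduced above, and the stretch clips the upper tail globally rather than per channel for brevity.

```python
import numpy as np

def restore_and_stretch(rgb, intensity, enhanced, clip_fraction=0.005, eps=1e-8):
    """Ratio-based color restoration (a linear stand-in for Equation (15))
    followed by a full-range stretch that clips clip_fraction of pixels
    from the upper tail of the histogram."""
    # restore each channel in proportion to its share of the original intensity
    out = rgb / (intensity[..., None] + eps) * enhanced[..., None]
    # stretch to [0, 1], clipping the brightest clip_fraction of values
    hi = np.percentile(out, 100.0 * (1.0 - clip_fraction))
    return np.clip(out / (hi + eps), 0.0, 1.0)

rgb = np.array([[[0.2, 0.4, 0.8]]])     # one pixel, channel ratios 1:2:4
intensity = rgb.max(axis=-1)            # I(x, y) from Equation (3)
enhanced = np.array([[0.4]])            # hypothetical enhanced intensity
restored = restore_and_stretch(rgb, intensity, enhanced)
```

The ratio form preserves the hue (the relative channel proportions) of each pixel while the stretch fills the display range, which is the behavior the text describes.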
CHAPTER 5
RESULTS ANALYSIS
5.1 RESULTS
Figure 5.1 Histogram adjusted image
Figure 5.2 Orthogonal wavelet decomposition
Figure 5.3 WDRC Approximation
Figure 5.4 WDRC Reconstructed spatial domain image
Figure 5.5 Final result
5.2 CONCLUSION
In this project, the application of the WDRC algorithm to aerial imagery is
presented. The results obtained from a large variety of aerial images show strong
robustness, high image quality, and improved visibility, indicating promise for aerial
imagery during poor-visibility flight conditions. This algorithm can further be applied
to real-time video streaming, and the enhanced video can be projected onto the pilot's
heads-up display for aviation safety.
5.3 FUTURE SCOPE
As future work:
This work can be extended to increase the accuracy by increasing the level of
transformation.
The DWT can also be used as a building block in full-fledged applications; that is,
applications such as compression and watermarking can be developed on top of it.
REFERENCES
[1] D. J. Jobson, Z. Rahman, G. A. Woodell, G.D.Hines, “A Comparison of Visual
Statistics for the Image Enhancement of FORESITE Aerial Images with Those of
Major Image Classes,” Visual Information Processing XV, Proc. SPIE 6246, (2006)
[2] S. M. Pizer, J. B. Zimmerman, and E. Staab, “Adaptive grey level assignment in
CT scan display,” Journal of Computer Assisted Tomography, vol. 8, pp. 300-305,
(1984).
[3] J. B. Zimmerman, S. B. Cousins, K. M. Hartzell, M. E. Frisse, and M. G. Kahn,
“A psychophysical comparison of two methods for adaptive histogram equalization,”
Journal of Digital Imaging, vol. 2, pp. 82-91, (1989).
[4] S. M. Pizer and E. P. Amburn, “Adaptive histogram equalization and its
variations,” Computer Vision, Graphics, and Image Processing, vol. 39, pp. 355-368,
(1987).
[5] K. Rehm and W. J. Dallas, “Artifact suppression in digital chest radiographs
enhanced with adaptive histogram equalization,” SPIE: Medical Imaging III, (1989).
APPENDIX: SOURCE CODE
%%% this is the main program for the implementation of Aerial image