Proceedings of TEQIP II sponsored National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
Boundary Detection in Medical Images using Edge Field Vector based on Law's Texture and Canny Method
Swetha M.1, Jyohsna C.1
1Department of E&C, KLS VDRIT, Haliyal, Karnataka, India
email: [email protected], [email protected]
Abstract—Detecting the correct boundary in noisy images is a difficult task. Images are used in many fields, including surveillance, medical diagnostics, and non-destructive testing. Boundaries are mainly used to detect the shape of an object. Image segmentation is used to locate objects and boundaries in images; it assigns a label to every pixel in an image such that pixels with the same label share certain visual characteristics. We propose an edge detection technique for detecting the correct boundary of objects in an image. It can detect the boundaries of an object using intensity gradient information from the vector model and texture gradient information from the edge map model. The results show that the technique performs very well and yields better performance than the classical contour models. The proposed method is robust and applicable to various kinds of noisy images without prior knowledge of the noise properties.
Keywords— Boundary extraction, vector field model, edge
mapping model, edge following technique, boundary detection.
I. INTRODUCTION
Boundary detection is mainly used to detect the outline or
shape of the object, so we can easily identify objects based
upon the outline or shape. Segmentation is the process in
which an image is divided into its constituent objects or parts.
The main goal of segmentation is to simplify and/or change an
image representation into something that is analyzed easily.
Image segmentation is an initial step before performing high-
level tasks such as object recognition and understanding.
Image segmentation is typically used to locate objects and
boundaries in images. In medical imaging, segmentation is
important for feature extraction, image measurements, and
image display. In some applications it may be useful to extract boundaries of objects of interest from ultrasound images [1], [2], and microscopic images [3]–[5].
In recent years, there have been several new methods to solve
the problem of boundary detection, e.g., active contour model
(ACM), geodesic active contour (GAC) model, active
contours without edges (ACWE), gradient vector flow (GVF)
snake model, etc. The snake models have become popular
especially in boundary detection where the problem is more
challenging due to the poor quality of the images. To remedy
the problem, we propose a new technique for boundary
detection for ill-defined edges in noisy images using a novel
edge following. The proposed edge following technique is
based on the vector image model and the edge map. The
vector image model provides a more complete description of
an image by considering both directions and magnitudes of
image edges. The proposed edge vector field is generated by averaging the magnitudes and directions in the vector image. The edge map is derived from Law's texture features and Canny edge detection. The vector image model and the edge map are applied together to select the best edges.
II. PROPOSED SYSTEM
The proposed boundary detection algorithm detects the boundary of an object in an image. The boundary extraction algorithm consists of the following three phases:
1. Average edge vector field
2. Edge map model
3. Edge following technique
III. BLOCK DIAGRAM
A. Average Edge Vector Field Model
We exploit the edge vector field to devise a new boundary
extraction algorithm [29]. Given an image f(x, y), the edge
vector field is calculated according to the following equations:
[Block diagram: input image → average edge vector field and edge map → initial position → edge following technique → boundary detected.]
M(i, j) = sqrt(Mx(i, j)^2 + My(i, j)^2) ..........(1)
D(i, j) = arctan(My(i, j) / Mx(i, j)) ..........(2)
Fig. 1. (a) Original unclear image. (b) Result from the edge vector field and
zoomed-in image. (c) Result from the proposed average edge vector field and
zoomed-in image.
Each component is the convolution between the image and the
corresponding difference mask, i.e.,
Mx(i, j) = −Gy ∗ f(x, y) ..........(4)
My(i, j) = Gx ∗ f(x, y) ..........(5)
where Gx and Gy are the difference masks of the Gaussian weighted image moment vector operator in the x and y directions, respectively [29]:
Gx(x, y) = x exp(−(x^2 + y^2)/(2σ^2)) ..........(6)
Gy(x, y) = y exp(−(x^2 + y^2)/(2σ^2)) ..........(7)
Edge vectors of an image indicate the magnitudes and
directions of edges which form a vector stream flowing
around an object. However, in an unclear image, the vectors
derived from the edge vector field may be distributed randomly in
magnitude and direction. Therefore, we extend the capability
of the previous edge vector field by applying a local averaging
operation where the value of each vector is replaced by the
average of all the values in the local neighborhood, i.e.,
M̄(i, j) = (1/Mr) Σ_(p,q)∈N M(p, q) ..........(8)
D̄(i, j) = (1/Mr) Σ_(p,q)∈N D(p, q) ..........(9)
where Mr is the total number of pixels in the neighborhood N. We apply a 3 × 3 window as the neighborhood N throughout our research.
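The averaging step of (8) and (9) can be sketched as follows. This is a minimal pure-Python illustration (not the authors' MATLAB code): each value of a field is replaced by the mean over its 3 × 3 neighborhood, with border pixels using only the neighbors that fall inside the image.

```python
def average_field(M):
    """Average a 2-D field over 3x3 neighborhoods (border-aware)."""
    rows, cols = len(M), len(M[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [M[p][q]
                    for p in range(max(0, i - 1), min(rows, i + 2))
                    for q in range(max(0, j - 1), min(cols, j + 2))]
            out[i][j] = sum(vals) / len(vals)  # divide by Mr (Eq. 8)
    return out

# A noisy magnitude field: averaging spreads out the isolated spike.
field = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(average_field(field)[1][1])  # 1.0 (9 spread over 9 pixels)
```

The same routine applies unchanged to the direction field D(i, j) in (9).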
B. Edge Map
The edge map consists of the edges of objects in an image, derived from Law's texture and Canny edge detection.
1) Law's Texture: The texture feature images of Law's texture are computed by convolving an input image with each of the masks. Given a column vector L = (1, 4, 6, 4, 1)^T, the 2-D mask l(i, j) used for texture discrimination in this research is generated by L × L^T. The output image is obtained by convolving the input image with the texture mask.
2) Canny Edge Detection: The first step of Canny edge detection is to convolve the output image t(i, j) obtained from the aforementioned Law's texture with a Gaussian filter. The second step is to calculate the magnitude and direction of the gradient. The third step is nonmaximal suppression to identify edges. The last step is a double-threshold algorithm to detect and link edges.
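The double-threshold (hysteresis) step can be sketched as follows; a pure-Python illustration in which pixels above the high threshold are strong edges and pixels above the low threshold are kept only if connected to a strong edge. The threshold values here are illustrative, not the paper's.

```python
def hysteresis(g, t_low, t_high):
    """Double-threshold edge linking on a 2-D gradient-magnitude grid."""
    rows, cols = len(g), len(g[0])
    strong = [(i, j) for i in range(rows) for j in range(cols)
              if g[i][j] >= t_high]
    edges = set(strong)
    stack = list(strong)
    while stack:                      # grow weak edges from strong seeds
        i, j = stack.pop()
        for p in range(max(0, i - 1), min(rows, i + 2)):
            for q in range(max(0, j - 1), min(cols, j + 2)):
                if (p, q) not in edges and g[p][q] >= t_low:
                    edges.add((p, q))
                    stack.append((p, q))
    return edges

g = [[0, 40, 0],
     [0, 90, 50],
     [0, 0, 0]]
print(sorted(hysteresis(g, 30, 80)))  # [(0, 1), (1, 1), (1, 2)]
```

The weak pixels at 40 and 50 survive only because they touch the strong pixel at 90; an isolated 40 elsewhere would be discarded.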
The edge map carries important edge information. This idea is exploited for extracting objects' boundaries in unclear images. Examples of edge maps are shown in Fig. 2.
Fig. 2. (a) Synthetic noisy image. (b) Left ventricle in the MR image. (c) Prostate ultrasound image. (d)–(f) Corresponding edge maps derived from Law's texture and Canny edge detection.
C. Edge Following Technique
The edge following technique is performed to find the
boundary of an object. Most edge following algorithms take
into account the edge magnitude as primary information for
edge following. However, the edge magnitude information is
not efficient enough for searching the correct boundary of
objects in noisy images because it can be very weak in some
contour areas.
The magnitude and direction of the average edge vector field provide information about the boundary which flows around an object. In addition, the edge map provides information about edges which may be part of the object boundary. Hence, both the average edge vector field and the edge map are exploited in the decision of the edge following technique. At the position (i, j) of an
image, the successive positions of the edges are then
calculated by a 3 × 3 matrix.
D. Initial Position
In this section, we present a technique for determining a good
initial position of edge following that can be used for the
Page 3
Proceedings of TEQIP II sponsored National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
195
boundary detection. In this proposed technique, the initial
position of edge following is determined by the following
steps. The first step is to calculate the average magnitude [M̄(i, j)] using (8). A position with high magnitude is a good candidate for a strong edge in the image. The second step is to calculate the density of edge length for each pixel from an edge map. The edge map [E(i, j)], a binary image, is obtained by Law's texture and Canny edge detection. The idea of using density is to obtain a measurement of edge length. The density of edge length [L(i, j)] at each pixel can be calculated from
L(i, j) = C(i, j) / max C(i, j) .........................(15)
where C(i, j) is the number of connected edge pixels at each pixel position. The third step is to calculate the initial position map P(i, j) from the summation of the average magnitude and the density of edge length, i.e.,
P(i, j) = M̄(i, j) + L(i, j) ..................(16)
The last step is the thresholding of the initial position map.
We have to threshold the map in order to detect the initial
position of edge following.
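The four steps above can be sketched as follows: the initial-position map P(i, j) is the sum of the averaged magnitude and the edge-length density, and candidate start points are the pixels whose P value exceeds a fraction t_max of the map's peak (t_max = 0.95 as in Fig. 5). This is a pure-Python sketch on toy inputs, not the authors' implementation.

```python
def initial_positions(m_avg, l_den, t_max=0.95):
    """Return (i, j) start candidates from the thresholded map P = M + L."""
    p = [[m + l for m, l in zip(mr, lr)] for mr, lr in zip(m_avg, l_den)]
    peak = max(max(row) for row in p)
    return [(i, j) for i, row in enumerate(p)
            for j, v in enumerate(row) if v >= t_max * peak]

m_avg = [[0.1, 0.9], [0.2, 0.3]]   # averaged magnitude, Eq. (8)
l_den = [[0.0, 0.8], [0.1, 0.2]]   # density of edge length, Eq. (15)
print(initial_positions(m_avg, l_den))  # [(0, 1)]
```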
Fig. 5. (a) Aorta in cardiovascular MR image. (b) Averaged magnitude [M̄(i, j)]. (c) Density of edge length [L(i, j)]. (d) Initial position map [P(i, j)] and initial position of edge following derived by thresholding with Tmax = 0.95.
IV. RESULTS
Fig 1: Original image.
Fig 2: Preprocessed image.
Fig 3: Magnitude of the image.
Fig 4: Direction of the image.
Fig 5: Law's texture output image.
Fig 6: Canny magnitude of the image.
Fig 7: Canny direction of the image.
Fig 8: Non-maximal suppression image.
Fig 9: First thresholding image.
Fig 10: Edge map of the image.
Fig 11: Density of the image.
Fig 12: Position image.
Fig 13: Boundary detected.
V. CONCLUSION
We have designed a new edge following technique for boundary detection and applied it to the object segmentation problem in medical images. Our edge following technique incorporates a vector image model and edge map information. The results of detecting object boundaries in noisy images show that the proposed technique is much better than the classical contour models. We have successfully applied the edge following technique to detect ill-defined object boundaries in medical images. The proposed method can be applied not only to medical imaging, but also to any image processing problem in which ill-defined edge detection is encountered.
ACKNOWLEDGEMENT
We sincerely thank our college, KLS VDRIT, for the facilities and necessary infrastructure made available during the course of our work. We wish to express our thanks and sincere gratitude to our Principal, Head of the Department, and guide for their guidance and enthusiastic encouragement in completing this work successfully.
REFERENCES
[1] J. Guerrero, S. E. Salcudean, J. A. McEwen, B. A. Masri, and S. Nicolaou, "Real-time vessel segmentation and tracking for ultrasound imaging applications," IEEE Trans. Med. Imag., vol. 26, no. 8, pp. 1079–1090, Aug. 2007.
[2] F. Destrempes, J. Meunier, M.-F. Giroux, G. Soulez, and G. Cloutier, "Segmentation in ultrasonic B-mode images of healthy carotid arteries using mixtures of Nakagami distributions and stochastic optimization," IEEE Trans. Med. Imag., vol. 28, no. 2, pp. 215–229, Feb. 2009.
[3] N. Theera-Umpon and P. D. Gader, "System level training of neural networks for counting white blood cells," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 32, no. 1, pp. 48–53, Feb. 2002.
[4] N. Theera-Umpon, "White blood cell segmentation and classification in microscopic bone marrow images," Lecture Notes Comput. Sci., vol. 3614, pp. 787–796, 2005.
[5] N. Theera-Umpon and S. Dhompongsa, "Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification," IEEE Trans. Inf. Technol. Biomed., vol. 11, no. 3, pp. 353–359, May 2007.
[6] J. Carballido-Gamio, S. J. Belongie, and S. Majumdar, "Normalized cuts in 3-D for spinal MRI segmentation," IEEE Trans. Med. Imag., vol. 23, no. 1, pp. 36–44, Jan. 2004.
COLOR BASED CLASSIFICATION OF PEANUTS
Chaitra C.1, K. V. Suresh2, Partha Das3
1 M. Tech. (Signal Processing), Dept. of E&C, SIT, Tumkur, Karnataka, INDIA
2 Professor and Head, Dept. of E&C, SIT, Tumkur, Karnataka, INDIA
3 R&D Engineer, Opto-Electronic Color Sorter Division, M/S Fowler Westrup (India) Pvt. Ltd., Bangalore, Karnataka, INDIA
1 [email protected]
2 [email protected]
3 [email protected]
Abstract—Peanuts are rich in energy and contain health-benefiting nutrients, minerals, antioxidants, and vitamins that are essential for optimum health. The price of peanuts depends on their quality; thus, quality classification of food grains facilitates the proper marketing of agricultural products. In this paper we propose an image processing technique to ensure good quality of peanuts. The main objective of this paper is to classify peanuts into good and bad based on color features. The captured images are first pre-processed, and a database is prepared automatically. Statistical and histogram features are then extracted for classification using a Feed Forward Neural Network (FFNN). Red and white peanut samples are considered for experimentation. The results show that the proposed method gives classification accuracy of around 90% for red good peanuts, 84% for red bad peanuts, 96% for white good peanuts, and 71% for white bad peanuts. The proposed algorithm is developed using MATLAB 7.12.
Keywords - Peanut, quality, database, features, Feed Forward
Neural Network (FFNN).
I. INTRODUCTION
Peanuts contain substantial levels of mono-unsaturated fatty acids, especially oleic acid, which helps to lower "bad cholesterol" and increase "good cholesterol" levels in the blood. Peanuts are also a good source of dietary protein, composed of fine-quality amino acids that are essential for growth and development. Peanuts contain cancer-fighting compounds such as resveratrol and beta-sitosterol; beta-sitosterol has been shown to inhibit breast, prostate, and colon cancer cell growth [1].
Quality inspection of peanuts plays a very important role in the food grain industry. The possible types of classes that could be considered for quality analysis are freshness, good, broken, skin removed, dull color, and shrivelled. Peanuts can also be classified based on size, shape, and nutrition content [2]. In
[3], a simple imaging system was developed for color image
based sorting of red and white wheat kernels. Here, the
combination of statistical and histogram features are
considered for classification. In [4], a comparative study was
made to classify and grade bulk seed samples using artificial
neural network. Three sets of features are extracted for
classification, and combinations of these features are tested
with different artificial neural networks. A review paper [5] describes how the selection of features plays an important role in classification. Each feature, namely color, size, and texture, carries useful information with which good classification can be achieved. In some cases the RGB color space is used for classification of fruits and nuts, and the HSI color space is used for classification of wheat. If these color spaces are not enough,
then multispectral imaging will give the best results. In [6], a machine was developed to detect toxin in peanuts using the k-means clustering algorithm. This algorithm uses the average values of the R, G, and B components; the classification accuracy reported is 100%. In [7], a method was presented to trace the origin of peanut pods using image recognition; features such as texture, color, and shape are considered for classification using a neural network. Change in color is also one of the best properties of a peanut for assessing quality.
Color based classification separates almost all bad peanuts.
Further, size and texture based classification improves the quality of peanut. Hence developing a rapid detection and classification algorithm based on colour, size and texture is useful work for industrial applications. In this paper, two types of peanuts, red and white, are used for experimentation. The objective of this paper is to develop an algorithm to prepare a peanut database automatically and to select suitable features that classify peanuts into good and bad. The paper is organized as follows. In section II, we present the proposed algorithm. In section III we discuss the experimental results. Section IV contains a brief summary and closing remarks.
II. PROPOSED METHOD
The block diagram of the proposed algorithm is shown in figure 1. In the image acquisition part, peanut images are captured in the
visible range. The captured color images are converted into binary, and morphological processing is performed on the binary images to retain the pixel information. In the segmentation and object extraction part, each peanut is segmented from its background; further, each segmented peanut is stored as a subimage to form a peanut database. The features are extracted from all the peanut images in the database and are classified using the FFNN classifier.
Figure 1. Block diagram of the proposed method.
A. Image Acquisition and Pre-processing
Images are captured using a Sony Cybershot HX200V (18 Megapixel, SLR) camera with a fluorescent lamp as the light source and a black background. The R component of the input image shows a good difference between the foreground and background. Hence, the R component of the image is considered for further processing.
B. Binary Thresholding and Morphological Processing
The R component image is first converted into binary by thresholding:
B(i, j) = 1 if IR(i, j) ≥ T, and 0 otherwise ..........(1)
where T is the threshold, 1 ≤ i ≤ M, 1 ≤ j ≤ N, and M × N is the size of the image IR. Binary thresholding results in a dark background and a bright foreground image. Two problems faced after thresholding are loss of foreground pixel information, and some dust particles in the original image appearing as new objects. Hence, morphological operations such as hole filling and erosion are performed on the image [8].
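The thresholding step can be sketched in pure Python as follows; the default T = 110 is the value used later in the experiments, and the subsequent hole filling and erosion steps (MATLAB's morphological operators in the original work) are not reproduced here.

```python
def binarize(channel, t=110):
    """Binarize a single image channel: 1 for foreground, 0 for background."""
    return [[1 if v >= t else 0 for v in row] for row in channel]

# Toy R-component values: bright peanut pixels against a dark background.
r = [[30, 200, 120],
     [90, 180, 40]]
print(binarize(r))  # [[0, 1, 1], [0, 1, 0]]
```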
C. Segmentation and Object Extraction
Once the foreground objects are separated from
background, it is necessary to label each foreground object in
order to extract them separately. Each labelled object is extracted and stored to prepare the peanut database. The following algorithm is developed for object extraction.
Consider that there are N peanuts in the original image and L is the label matrix. All pixels belonging to one peanut are given the same label; similarly, the pixels of each of the other peanuts are given their own labels. The detailed procedure is:
Step 1: Read a label and its location details from the label matrix L.
Step 2: Consider the pixel locations obtained in step 1 and go to the same locations in the original image I. Retrieve the R, G and B values of each pixel location and store them in the same location of a new sub image.
Step 3: Save the new sub image in a temporary database.
Step 4: Repeat step 1 to step 3 for all N peanuts.
With the above algorithm, foreground objects are extracted from the original image, but extra background remains in each sub image.
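Steps 1 to 4 above can be sketched in pure Python: pixels whose label equals k are copied from the original image into a same-sized subimage, with zeros elsewhere. A single channel stands in for the R, G and B components here; this is an illustration, not the authors' MATLAB code.

```python
def extract_object(img, labels, k):
    """Copy pixels with label k into a same-sized subimage (zeros elsewhere)."""
    return [[img[i][j] if labels[i][j] == k else 0
             for j in range(len(img[0]))] for i in range(len(img))]

img    = [[9, 9, 3],
          [9, 3, 3]]
labels = [[1, 1, 2],     # two labelled peanuts: label 1 and label 2
          [1, 2, 2]]
print(extract_object(img, labels, 2))  # [[0, 0, 3], [0, 3, 3]]
```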
D. Database Preparation
An image database is prepared using an automatic database preparation algorithm. This algorithm removes the dark background from the images that result from the algorithm given in section II (C).
Following are the steps of the algorithm:
Step 1: Take R, G and B component matrices of a peanut sub
image.
Step 2: For R component matrix, check for all zero rows. If
the row is all zero, then assign 0 to a corresponding row in a
column vector, otherwise assign 1.
Step 3: Repeat Step 2 for all columns of the matrix to get a
row vector.
Step 4: Repeat step 2 and step 3 for G, and B components
matrices.
Step 5: Consider the column vectors from step 2. Check each
row of the column vector R and corresponding row of the G,
and B component. If all three rows of column vectors are zero,
then delete corresponding rows of R, G, and B components in
the matrices.
Step 6: Consider the row vectors from step 3. Check each
column of the row vector R and corresponding column of the
G, and B component. If all three columns of the row vectors
are zero, then delete corresponding column of R, G, and B
components in the matrices.
Step 7: The R, G and B matrix values are moved to a new variable, to get a color subimage of the peanut free from extra background.
Step 8: Resize all subimages to same size.
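The background-removal idea in steps 1 to 7 can be sketched as follows: rows and columns whose values are all zero across the channels are dropped, leaving the tight bounding box around the peanut. A single channel stands in for R, G and B in this pure-Python illustration.

```python
def crop_zero_border(img):
    """Drop all-zero rows and columns, keeping the tight bounding box."""
    rows = [r for r in img if any(r)]                       # steps 2 and 5
    keep = [j for j in range(len(rows[0]))                  # steps 3 and 6
            if any(r[j] for r in rows)]
    return [[r[j] for j in keep] for r in rows]             # step 7

img = [[0, 0, 0, 0],
       [0, 5, 7, 0],
       [0, 2, 0, 0],
       [0, 0, 0, 0]]
print(crop_zero_border(img))  # [[5, 7], [2, 0]]
```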
E. Feature Extraction and Classification
The features of an object play a key role in classification. The best feature subset selected in this work for classification is: the percentage of red and green pixels with intensity less than 100, and the percentage of blue pixels with intensity less than 120. Statistical features, namely the mean, median, and standard deviation of the R, G, and B components, together with 24 histogram features, are also extracted. In total, 36 features are extracted for classification. In this work an FFNN with three layers is used for classification. It has 36 input neurons and 2 output neurons, with 20 neurons in the hidden layer.
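The per-channel feature computations described above can be sketched with the standard library only; the pixel list here is a toy stand-in for one color channel of a peanut subimage, and the FFNN itself is not reproduced.

```python
import statistics

def pct_below(vals, cutoff):
    """Percentage of pixel values strictly below an intensity cutoff."""
    return 100.0 * sum(v < cutoff for v in vals) / len(vals)

def channel_stats(vals):
    """Mean, median, and (population) standard deviation of a channel."""
    return (statistics.mean(vals),
            statistics.median(vals),
            statistics.pstdev(vals))

red = [50, 90, 150, 200]
print(pct_below(red, 100))       # 50.0 (two of four pixels below 100)
print(channel_stats(red)[:2])    # (122.5, 120.0)
```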
III. EXPERIMENTAL RESULTS AND DISCUSSION
Images are captured using the Sony Cybershot camera, with each image containing more than one peanut kernel. Figure 2(a) is the input color image; from figures 2(b), (c) and (d) it is observed that the histogram of the R component shows a good difference between foreground and background. Hence the R component is considered for further processing. Figure 2(e) is the R component of the image in figure 2(a). Figure 2(f) is the result after binary thresholding, with a threshold value of 110. After binary thresholding some of the foreground pixels appear like background, hence it is necessary to fill the foreground region; figure 2(g) is the result after the hole filling
Figure 2. Colour images of peanut samples: (a) original image, (b)(c)(d) histograms of the R, G and B components, (e) R component of image (a), (f) image after binary thresholding, (g) image after hole filling operation, (h) image after erosion operation.
operation. Binary thresholding also results in small bright dots which appear like new objects. These objects need to be removed; hence a morphological erosion operation is performed using a disk as the structuring element with radius 5. A few of the dust particles are difficult to remove even by erosion. Hence, to avoid these particles in the database, a pixel counter is set within the object extraction algorithm: if the pixel count is less than a threshold, that object is considered a dust particle and is discarded from the database. Figure 2(h) is the image after the erosion operation. After pre-processing and the morphological operations, objects are extracted and stored as new sub images using the algorithm given in section II (C). Figure 3 is the result of the object extraction algorithm. Each subimage in Figure 3 contains unwanted dark background. This dark background is removed using the background removal algorithm given in section II (D), and the result of this algorithm is given in Figure 4.
Figure 3. Subimages of individual peanuts extracted from original image in
Figure 2(a).
Figure 4. Peanut database (resized images), after removing extra
background from images in figure 3.
After the database preparation, the next step is to extract features for classification. An FFNN with 36 neurons in the input layer is considered, since we have extracted 36 input features for training; also, we have two classes, good and bad, hence 2 neurons are taken at the output layer.
Figure 5. Red peanut samples.
Figure 6. White peanut samples.
Figure 7. Bad peanut samples.
Figures 5, 6 and 7 are the images of red, white and bad peanut samples respectively. Initially 260 red peanut samples are taken for training, with 130 good and 130 bad peanuts. 64 samples are taken for testing, with 32 good and 32 bad peanuts. Classification results are tabulated in Table I.
TABLE I: CLASSIFICATION OF RED PEANUT SAMPLES USING FFNN

Category            Class   Input   Success   Failure   Accuracy (%)
Red peanut samples  Good     32       29         3        90.62
                    Bad      32       27         5        84.37
Features from the 64 red peanut test samples are input to the network. From the 32 good samples, 29 are correctly classified as good and 3 are wrongly classified as bad; similarly, from the 32 bad peanuts, 27 are correctly classified and 5 are wrongly classified as good. Thus, our algorithm classifies good red samples with an accuracy of 90.62% and bad red samples with an accuracy of 84.37%.
TABLE II: CLASSIFICATION OF WHITE PEANUT SAMPLES USING FFNN

Category              Class   Input   Success   Failure   Accuracy (%)
White peanut samples  Good     32       31         1        96.87
                      Bad      32       23         9        71.87
The experiment is repeated for white peanut samples and the results are tabulated in Table II. From 32 white good peanut samples, 31 are correctly classified as good and 1 peanut is wrongly classified as bad. Similarly, from 32 bad samples, 23 are correctly classified as bad and 9 are wrongly classified as good. The classification accuracy achieved is 96.87% for good peanuts and 71.87% for bad peanuts. The failure numbers for red peanuts and white peanuts are 8 and 10 respectively. This is because in most of the bad peanuts only a part of the endocarp is damaged; since the major part of the endocarp of such samples is similar to good ones, they are misjudged as good peanuts.
IV. CONCLUSION
This paper presents an automatic database preparation and classification algorithm for peanuts. The thresholding and morphological operations separate the foreground and background with minimum loss of information. The proposed object extraction and database preparation algorithms are able to prepare a color image database of peanuts. The selected features are able to classify good and bad red peanuts with accuracies of 90.62% and 84.37% respectively, and good and bad white peanuts with accuracies of 96.87% and 71.87% respectively. Thus, the FFNN gives good classification results for both red and white peanut samples. By solving the touching-kernel problem, the proposed algorithm can be used for automatic training of a sorting machine.
ACKNOWLEDGMENT
The authors would like to thank M/S Fowler Westrup India Pvt. Ltd., Bangalore for the lab facility and financial support given to carry out this work.
REFERENCES
[1] Jocelyne Tan, "Good Eating Tip of the Month," Univ. of Michigan Health System: Patient Food and Nutrition Services, February 2011.
[2] Hong Chen, Jing Wang, Qiaoxia Yuan, and Peng Wan, "Quality classification of peanuts based on image processing," Journal of Food, Agriculture & Environment, vol. 9 (3&4), pp. 205–209, 2011.
[3] Tom Pearson, Dan Brabec, and Scott Haley, "Color image based sorter for separating red and white wheat," Sens. & Instrumen. Food Qual., vol. 2, pp. 280–288, 2008.
[4] Anil Kannur, Asha Kannur, and Vijay S. Rajapurohit, "Classification and grading of bulk seeds using artificial neural networks," International Journal of Machine Intelligence, vol. 3, pp. 62–73, 2011.
[5] Chaoxin Zheng, Da-Wen Sun, and Liyun Zheng, "Recent developments and applications of image features for food quality evaluation and inspection: a review," Trends in Food Science & Technology, vol. 17, pp. 642–655, 2006.
[6] Atris Suyantohadi and Rudiati Evi Masithoh, "Development of machine vision based on image processing technique to identify toxin contamination in peanuts," Australian Journal of Basic and Applied Sciences, vol. 6, pp. 135–141, 2012.
[7] Han Zhongzhi, Deng Limiao, and Yu Renshi, "Study on origin traceability of peanut pods based on image recognition," International Conference on System Science, Engineering Design and Manufacturing Informatization, IEEE, vol. 2, pp. 93–96, 2011.
[8] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Prentice Hall, 2002.
WAVELET BASED IMAGE COMPRESSION USING ASWDR TECHNIQUE: A COMPARATIVE STUDY
KISHORE D. V.1, ARCHANAA NAWANDHAR1
Department of Telecommunication Engineering, CMRIT, Bangalore, Karnataka, India
[email protected], [email protected]
Abstract-The objective of this paper is to implement and evaluate the effectiveness of the wavelet based Adaptively Scanned Wavelet Difference Reduction (ASWDR) compression technique using MATLAB R2007b. Performance parameters such as Peak Signal to Noise Ratio (PSNR), Mean Squared Error (MSE), and Compression Ratio (CR) are evaluated for the algorithms. Recently the discrete wavelet transform and wavelet packets have emerged as popular techniques for image compression. The wavelet transform is one of the major processing components of image compression. The results of the compression change as the nature of the image and the type of wavelet change. This paper compares the compression performance of Daubechies and Biorthogonal wavelets, with results for images chosen from the MATLAB toolbox. Based on the results, it is proposed that proper selection of the wavelet on the basis of the nature of the image improves the quality as well as the compression ratio remarkably. The prime objective is to select the proper wavelet during the transform phase to compress the image. This paper will provide a good reference for application developers choosing a wavelet compression system for their application.
Key words: Compression, Wavelet, Daubechies, Biorthogonal.
I. INTRODUCTION
In recent years, many studies have been made on wavelets. An excellent overview of what wavelets have brought to fields as diverse as biomedical applications, wireless communications, computer graphics, and turbulence is given in the literature. Image compression is one of the most visible applications of wavelets. The rapid increase in the range and use of electronic imaging justifies attention to the systematic design of an image compression system and to providing the image quality needed in different applications. A typical still image contains a large amount of spatial redundancy in plain areas where adjacent picture elements (pixels, pels) have almost the same values; that is, the pixel values are highly correlated. In addition, a still image can contain subjective redundancy, which is determined by the properties of the human visual system (HVS). An HVS presents some tolerance to distortion, depending upon the image content and viewing conditions. Consequently, pixels need not always be reproduced exactly as originated, and the HVS will not detect the difference between the original image and the reproduced image.
The redundancy (both statistical and subjective) can be removed to achieve compression of the image data. The basic
measure for the performance of a compression algorithm is
compression ratio (CR), defined as a ratio between original
data size and compressed data size. In a lossy compression
scheme, the image compression algorithm should achieve a
tradeoff between compression ratio and image quality. Higher
compression ratios will produce lower image quality and vice
versa. Quality and compression can also vary according to
input image characteristics and content. Transform coding is a
widely used method of compressing image information. In a
transform-based compression system two-dimensional (2-D)
images are transformed from the spatial domain to the
frequency domain. An effective transform will concentrate
useful information into a few of the low-frequency transform
coefficients. An HVS is more sensitive to energy with low
spatial frequency than with high spatial frequency. Therefore,
compression can be achieved by quantizing the coefficients, so that important coefficients (low-frequency coefficients) are
transmitted and the remaining coefficients are discarded. Very effective and popular ways to achieve compression of image data are based on the discrete cosine transform (DCT) and discrete wavelet transform (DWT). Current standards for
compression of still (e.g., JPEG) and moving images (e.g.,
MPEG-1, MPEG-2) use DCT, which represents an image as a
superposition of cosine functions with different discrete
frequencies. The transformed signal is a function of two
spatial dimensions, and its components are called DCT
coefficients or spatial frequencies. DCT coefficients measure
the contribution of the cosine functions at different discrete
frequencies. DCT provides excellent energy compaction, and
a number of fast algorithms exist for calculating the DCT.
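The energy compaction and coefficient quantization described above can be illustrated with a small sketch (a Python/NumPy sketch for illustration only; the 8×8 gradient test block and the quantizer scale q = 32 are arbitrary choices, not values from this paper):

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis: row k holds the cosine at discrete frequency k.
    c = np.zeros((n, n))
    for k in range(n):
        alpha = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        for i in range(n):
            c[k, i] = alpha * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    return c

C = dct_matrix(8)
block = np.outer(np.arange(8), np.ones(8)) * 16.0  # smooth 8x8 gradient block
coeffs = C @ block @ C.T      # 2-D DCT: energy concentrates in few coefficients
q = 32.0                      # quantizer scale (arbitrary for the sketch)
quantized = np.round(coeffs / q)    # near-zero coefficients become exactly 0
recon = C.T @ (quantized * q) @ C   # dequantize + inverse 2-D DCT
```

For the smooth block only a handful of the 64 quantized coefficients are nonzero, which is what makes the subsequent entropy coding effective; raising q increases both compression and reconstruction error.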
Most existing compression systems use square DCT blocks of
regular size. The image is divided into blocks of NxN samples
and each block is transformed independently to give NxN coefficients. For many blocks within the image, most of the
DCT coefficients will be near zero. DCT in itself does not
give compression. To achieve the compression, DCT
coefficients should be quantized so that the near-zero
coefficients are set to zero and the remaining coefficients are
represented with reduced precision that is determined by
quantizer scale. The quantization results in loss of
information, but also in compression. Increasing the quantizer
scale leads to coarser quantization, which gives high
compression and poor decoded image quality. The use of
uniformly sized blocks simplifies the compression system, but
it does not take into account the irregular shapes within real
images. The block-based segmentation of source image is a
fundamental limitation of the DCT-based compression
system. The degradation is known as the "blocking effect" and
depends on block size. A larger block leads to more
efficient coding, but requires more computational power.
Fig 1.1 Scaling and wavelet function.
Image distortion is less annoying for small than for large DCT
blocks, but coding efficiency tends to suffer. Therefore, most
existing systems use blocks of 8×8 or 16×16 pixels as a
compromise between coding efficiency and image quality. In
recent times, much of the research activities in image coding
have been focused on the DWT, which has become a standard
tool in image compression applications because of its data
reduction capability. In a wavelet compression system, the entire image is transformed and compressed as a single data
object rather than block by block as in a DCT-based
compression system. It allows a uniform distribution of
compression error across the entire image. DWT offers
adaptive spatial-frequency resolution (better spatial resolution
at high frequencies and better frequency resolution at low
frequencies) that is well suited to the properties of an HVS. It
can provide better image quality than DCT, especially on a
higher compression ratio. However, the implementation of the
DCT is less expensive than that of the DWT. For example, the
most efficient algorithm for the 2-D 8×8 DCT requires only 54
multiplications, while the complexity of calculating the DWT
depends on the length of wavelet filters. A wavelet image
compression system can be created by selecting a type of
wavelet function, quantizer, and statistical coder. In this paper,
we do not intend to give a technical description of a wavelet
image compression system. We used a few general types of wavelets and compared the effects of wavelet analysis and
representation, compression ratio, image content, and
resolution to image quality. According to this analysis, we
show that searching for the optimal wavelet needs to be done
taking into account not only objective picture quality
measures, but also subjective measures. We highlight the
performance gain of the DWT over the DCT. Quantizers for
the DCT and wavelet compression systems should be tailored
to the transform structure, which is quite different for the DCT
and the DWT.
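A single level of the 2-D Haar DWT, the simplest case, can be sketched to show the whole-image transform and its subband energy compaction (an illustrative Python/NumPy sketch; practical codecs use longer filters such as the Daubechies and biorthogonal families compared in Section IV):

```python
import numpy as np

def haar_dwt2(img):
    # One level of the 2-D Haar DWT: the whole image is transformed at
    # once into four subbands (LL approximation, LH/HL/HH details).
    s = 1.0 / np.sqrt(2.0)
    lo = (img[:, 0::2] + img[:, 1::2]) * s   # row-wise averages
    hi = (img[:, 0::2] - img[:, 1::2]) * s   # row-wise differences
    ll = (lo[0::2, :] + lo[1::2, :]) * s
    lh = (lo[0::2, :] - lo[1::2, :]) * s
    hl = (hi[0::2, :] + hi[1::2, :]) * s
    hh = (hi[0::2, :] - hi[1::2, :]) * s
    return ll, lh, hl, hh

img = np.add.outer(np.arange(8.0), np.arange(8.0))  # smooth test ramp
ll, lh, hl, hh = haar_dwt2(img)
```

The transform is orthonormal, so the total energy is preserved, and for a smooth image almost all of it lands in the LL subband; the detail subbands quantize to near zero.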
1.1 COMPRESSION
Figure 1.2: Image/Video Compression Techniques
In predictive coding the present sample is predicted from
previous samples. Predictive coding techniques operate
directly on image pixels and thus are called spatial domain
methods. Delta modulation and Differential pulse code
modulation are different techniques in predictive coding.
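A minimal closed-loop DPCM sketch of this idea (the step size of 4 is an arbitrary choice for illustration; practical coders add adaptive prediction and entropy-code the residuals):

```python
def dpcm_encode(samples, q_step=4):
    # Closed-loop DPCM: predict each sample from the previously
    # *reconstructed* one, then quantize the prediction error.
    prev = 0
    residuals = []
    for s in samples:
        q = round((s - prev) / q_step)   # quantized prediction error
        residuals.append(q)
        prev += q * q_step               # track the decoder's reconstruction
    return residuals

def dpcm_decode(residuals, q_step=4):
    recon, prev = [], 0
    for q in residuals:
        prev += q * q_step
        recon.append(prev)
    return recon
```

For a slowly varying signal the residuals are small integers, which are much cheaper to code than the raw samples; the closed prediction loop bounds the reconstruction error by half the step size.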
Transform coding compression techniques are based on
modifying the transform of an image. In this, a reversible and
linear transform such as Discrete Cosine Transform (DCT),
Discrete Fourier Transform (DFT) and Wavelet Transform are
used to map the image into a set of transform coefficients,
which are then quantized and coded. The DCT-based image
compression standard is a lossy coding method that will result
in some loss of details and unrecoverable distortion. Fourier analysis breaks down a signal into sinusoids of different
frequencies. The main drawback of the Fourier transform is that it
provides only frequency information; the temporal
information is lost in the transformation process, whereas
wavelets preserve both frequency and temporal information.
II. WAVELET TRANSFORMS
Wavelets are functions generated from one single
function (basis function) called the prototype or mother
wavelet by dilations (scaling) and translations (shifts) in time
(frequency) domain. If the mother wavelet is denoted by ψ(t),
the other wavelets ψa,b(t) can be represented as
ψa,b(t) = (1/√|a|) ψ((t − b)/a) ------- 2.1
where a and b are two arbitrary real numbers. The
variables a and b represent the parameters for dilations and
translations respectively in the time axis. From Eq. 2.1, it is
obvious that the mother wavelet can be essentially represented
as
ψ(t) = ψ1,0(t) -------- 2.2
For any arbitrary a ≠ 1 and b = 0, it is possible to derive that
ψa,0(t) = (1/√|a|) ψ(t/a) ---------- 2.3
As shown in Eq. 2.3, ψa,0(t) is nothing but a time-scaled (by
a) and amplitude-scaled (by 1/√|a|) version of the mother wavelet
function ψ(t) in Eq. 2.2. The parameter a causes contraction of
ψ(t) in the time axis when a < 1 and expansion or stretching
when a > 1. That is why the parameter a is called the dilation
(scaling) parameter. For a < 0, the function ψa,b(t) results in
time reversal with dilation. Mathematically, substituting t in
Eq. 2.3 by t − b causes a translation or shift in the time axis,
resulting in the wavelet function ψa,b(t). The function ψa,b(t)
is a shift of ψa,0(t) to the right along the time axis by an amount
b when b > 0, and a shift to the left by an amount b when b < 0. That is why the variable b
represents the translation in time (shift in frequency) domain.
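The roles of a and b in Eq. 2.1 can be verified numerically; a Python/NumPy sketch, using the Mexican-hat wavelet as an arbitrary illustrative mother wavelet:

```python
import numpy as np

def mexican_hat(t):
    # A common mother wavelet (second derivative of a Gaussian, negated).
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def psi_ab(psi, t, a, b):
    # Eq. 2.1: psi_{a,b}(t) = |a|^(-1/2) * psi((t - b) / a)
    return np.abs(a) ** -0.5 * psi((t - b) / a)

t = np.linspace(-8.0, 8.0, 1601)              # grid step 0.01
mother = psi_ab(mexican_hat, t, 1.0, 0.0)     # a = 1, b = 0: the mother itself
stretched = psi_ab(mexican_hat, t, 2.0, 0.0)  # a > 1: expansion, lower amplitude
shifted = psi_ab(mexican_hat, t, 1.0, 3.0)    # b > 0: shift right by 3
```

The stretched version is wider and scaled down in amplitude by 1/√2, and the shifted version peaks at t = 3, exactly as the text describes.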
Figure 2. A wavelet shown at different scales
Figure 2 shows an illustration of a mother wavelet and its
dilations in the time domain with the dilation parameter a = α.
For the mother wavelet ψ(t) shown in Figure 2(a), a
contraction of the signal in the time axis when α < 1 is shown
in Figure 2(b) and expansion of the signal in the time axis
when α > 1 is shown in Figure 2(c). Based on this definition
of wavelets, the wavelet transform (WT) of a function (signal)
f(t) is mathematically represented by
W(a, b) = (1/√|a|) ∫ f(t) ψ*((t − b)/a) dt ---- 2.4
The inverse transform to reconstruct f(t) from W(a, b) is
mathematically represented by
f(t) = (1/C) ∫∫ W(a, b) ψa,b(t) (1/a²) da db ----- 2.5
where
C = ∫ |Ψ(ω)|² / |ω| dω ----- 2.6
and Ψ(ω) is the Fourier transform of the mother wavelet ψ(t).
If a and b are two continuous (nondiscrete) variables and f(t)
is also a continuous function, W(a,b) is called the continuous
wavelet transform (CWT). Hence the CWT maps a one-
dimensional function f(t) to a function W(a, b) of two
continuous real variables a (dilation) and b (translation).
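Eq. 2.4 can be discretized directly as a Riemann sum; a Python/NumPy sketch (the Mexican-hat mother wavelet and the sampling grid are illustrative choices, not from this paper):

```python
import numpy as np

def mexican_hat(t):
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt_point(f_vals, psi, t, a, b):
    # Discretized Eq. 2.4: W(a,b) ~ |a|^(-1/2) * sum f(t) psi*((t-b)/a) dt
    dt = t[1] - t[0]
    return np.sum(f_vals * np.conj(psi((t - b) / a))) * dt / np.sqrt(abs(a))

t = np.linspace(-8.0, 8.0, 1601)
f_vals = mexican_hat(t)             # analyze the wavelet against itself
w_00 = cwt_point(f_vals, mexican_hat, t, 1.0, 0.0)  # perfect match at b = 0
w_05 = cwt_point(f_vals, mexican_hat, t, 1.0, 5.0)  # poor match at b = 5
```

W(1, 0) comes out as the squared norm of the wavelet, (3/4)√π ≈ 1.33, while |W(1, 5)| is much smaller: W(a, b) measures how well f matches the wavelet at each scale and position.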
III. DESIGN AND IMPLEMENTATION USING ASWDR
ALGORITHM
Step 1 (Initialize). Choose an initial threshold, T = T0, such that
all transform values satisfy |w(m)| < T0 and at least one
transform value satisfies |w(m)| ≥ T0/2. Set the initial scan
order to be the baseline scan order.
Step 2 (Update threshold). Let Tk = Tk−1/2.
Step 3 (Significance pass). Perform the following procedure
on the insignificant indices in the scan order:
Initialize step-counter C = 0
Let Cold = 0
Do
Get next insignificant index m
Increment step-counter C by 1
If |w(m)| ≥ Tk then
Output the sign of w(m) and set wQ(m) = Tk
Move m to end of sequence of significant indices
Let n = C - Cold
Set Cold = C
If n > 1 then
Output reduced binary expansion of n
Else if |w(m)| < Tk then
Let wQ(m) retain its initial value of 0.
Loop until end of insignificant indices
Output end-marker as per WDR Step 3.
Step 4 (Refinement pass). Scan through significant values found with higher threshold values Tj , for j < k (if k = 1 skip
this step). For each significant value w(m), do the following:
If |w(m)| ∈ [wQ(m), wQ(m) + Tk), then
Output bit 0
Else if |w(m)| ∈ [wQ(m) + Tk, wQ(m) + 2Tk), then
Output bit 1
Replace value of wQ(m) by wQ(m) + Tk.
Step 5 (Create new scan order). For each level j in the wavelet
transform (except for j = 1), scan through the significant
values using the old scan order. The initial part of the new
scan order at level j - 1 consists of the indices for insignificant values corresponding to the child indices of these level j
significant values. Then, scan again through the insignificant
values at level j using the old scan order. Append to the initial
part of the new scan order at level j -1 the indices for
insignificant values corresponding to the child indices of these level j significant values. Note: No change is made to the scan
order at level L, where L is the number of levels in the
wavelet transform.
Step 6 (Loop). Repeat steps 2 through 5.
The creation of the new scanning order only adds a small
degree of complexity to the original WDR algorithm.
Moreover, ASWDR retains all of the attractive features of
WDR: simplicity, progressive transmission capability, and
ROI capability.
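The thresholding core of Steps 1 through 4 can be sketched as follows (a Python toy sketch, not the paper's MATLAB implementation; the reduced-binary index coding of Step 3 and the adaptive scan reordering of Step 5 are omitted, so this models the quantizer, not the actual bit stream, and the coefficient values are invented for illustration):

```python
import numpy as np

def aswdr_sketch(coeffs, passes=6):
    # Simplified significance + refinement loop shared by WDR/ASWDR.
    w = coeffs.astype(float).ravel()
    t = 2.0 ** (np.floor(np.log2(np.max(np.abs(w)))) + 1)  # T0: all |w| < T0
    wq = np.zeros_like(w)        # quantized magnitudes the decoder would hold
    for _ in range(passes):
        t /= 2.0                                         # Step 2: Tk = Tk-1/2
        newly = (wq == 0) & (np.abs(w) >= t)             # Step 3: significance
        wq[newly] = t
        refine = (wq > 0) & ~newly & (np.abs(w) >= wq + t)   # Step 4: bit 1
        wq[refine] += t
    return np.sign(w) * (wq + (t / 2.0) * (wq > 0))  # midpoint reconstruction

coeffs = np.array([[34.0, -5.0], [9.0, 0.5]])   # hypothetical wavelet values
recon = aswdr_sketch(coeffs, passes=6)
```

After k passes every significant coefficient is known to within Tk, so the midpoint reconstruction error is at most Tk/2; each additional pass halves the distortion at the cost of more output bits, which is what gives the embedded, progressive bit stream.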
IV. IMPLEMENTATION
The following tables consist of the PSNR and time taken values
for each wavelet, for both indexed and intensity images.
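The PSNR figures reported in the tables follow the usual definition; a minimal Python sketch (the peak of 255 assumes 8-bit images; identical images give an infinite PSNR, which this sketch does not guard against):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    # PSNR in dB: 10 * log10(peak^2 / mean squared error).
    err = original.astype(float) - reconstructed.astype(float)
    mse = np.mean(err ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher PSNR means the reconstructed image is closer to the original; a constant error of the full dynamic range gives 0 dB, the worst case.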
Table 4.1 shows the PSNR and time taken (in seconds)
values of Daubechies family wavelets for compressing an
indexed image.
Table 4.2 shows the PSNR and time taken (in seconds)
values of Biorthogonal family wavelets for compressing an
indexed image.
Table 4.3 shows the PSNR and time taken (in
seconds) values of Daubechies family wavelets for
compressing an intensity image.
Table 4.4 shows the PSNR and time taken (in
seconds) values of Biorthogonal family wavelets for
compressing an intensity image.
wavelets PSNR Time Taken
Bior1.1 44.5183 1.9344
Bior1.3 44.6206 2.0904
Bior1.5 44.4889 2.1216
Bior2.2 43.4227 2.0592
Bior2.4 43.4750 2.0436
Bior2.6 43.4597 2.1060
Bior2.8 43.4261 2.1996
Bior3.1 40.8095 1.9812
Bior3.3 41.8626 2.0748
Bior3.5 42.0891 2.0592
Bior3.7 42.1651 2.1528
Bior3.9 42.1622 2.2932
Bior4.4 44.3776 2.1060
Bior5.5 44.5349 2.1216
Bior6.8 44.2869 2.3088
Wavelets PSNR Time Taken
db1 44.5183 1.8564
db2 44.6469 1.9032
db3 44.5166 1.9344
db4 44.5815 1.9344
db5 44.5523 1.9656
db10 44.5559 2.1372
db15 44.5740 2.5896
db20 44.6888 2.9328
db25 44.7002 3.5100
db30 44.7770 4.0092
db35 44.7878 4.8984
db40 44.8612 5.8032
Wavelets PSNR Time Taken
db1 52.5105 0.9375
db2 52.2644 0.9688
db3 52.0599 1.0313
db4 51.9944 0.9531
db5 51.9673 1.0625
db10 51.8946 1.1250
db15 51.9586 1.3125
db20 51.9239 1.4063
db25 51.9805 1.6094
db30 51.9263 1.6875
db35 51.9573 2.2344
db40 51.9794 2.4531
Table 5.1 Best results for Intensity images from each family
Table 5.2 Best results for Indexed images from each family
Figure 5.1 shows the PSNR values and the time taken values of the best wavelet from each family obtained by compressing an
intensity image.
Figure 5.2 shows the PSNR values and the time taken values
of the best wavelet from each family obtained by compressing an
indexed image.
V. RESULTS
Figure 5.1 Test image (intensity image), Daubechies
family wavelets. Peak Signal to Noise Ratio: 52.5105, time
taken: 0.9375
Figure 5.2 The approximation of the intensity image,
Daubechies family wavelets
Figure 5.3 The reconstructed and difference intensity images,
Daubechies family wavelets
wavelets PSNR Time Taken
Bior1.1 52.5105 0.9844
Bior1.3 52.5242 1.0781
Bior1.5 52.4703 1.0625
Bior2.2 50.9283 1.1094
Bior2.4 50.9749 1.0469
Bior2.6 50.9368 1.0781
Bior2.8 50.8914 1.1094
Bior3.1 48.9349 1.0469
Bior3.3 49.7433 0.9219
Bior3.5 49.9444 1.0313
Bior3.7 50.0045 1.0625
Bior3.9 50.0539 1.1875
Bior4.4 51.779 1
Bior5.5 52.0525 1.0781
Bior6.8 51.7006 1.0781
WAVELETS PSNR TIME TAKEN (seconds)
Daubechies db1 (52.5105) db1 (0.9375)
Biorthogonal bior1.3 (52.5242) bior3.3 (0.9219)
WAVELETS PSNR TIME TAKEN (seconds)
Daubechies db40 (44.8612) db1 (1.8564)
Biorthogonal bior1.3 (44.6206) bior1.1 (1.9344)
Figure 5.3 Biorthogonal family wavelets, Peak Signal to Noise
Ratio: 52.5242, time taken: 0.9219
Figure 5.4 The approximation images of the Biorthogonal family
wavelet.
Figure 5.5 The reconstructed and difference images of the
Biorthogonal family wavelet.
Figure 5.6 Daubechies family wavelets, Peak Signal to Noise
Ratio: 44.8612, time taken: 1.8564
Figure 5.7 The approximation images of the Daubechies family
wavelets.
Figure 5.8 The reconstructed and difference images of the
Daubechies family wavelets.
Figure 5.9 Biorthogonal family wavelets, Peak Signal to
Noise Ratio: 44.6206, time taken: 1.9344
Figure 5.10 The approximation images of the Biorthogonal family
wavelets.
Figure 5.11 The reconstructed and difference images of the
Biorthogonal family wavelets.
VI. CONCLUSION
In this paper, the results were compared for different
wavelet-based image compression techniques. The effects of
different wavelet functions, filter orders, approximation and
reconstruction were examined. The results of the
ASWDR technique were compared using parameters
such as the PSNR and the time taken to compress, measured from the
reconstructed images.
These techniques were successfully tested on many images,
yielding the PSNR values and the time taken values of the best wavelet from
each family obtained by compressing intensity and
indexed images. The experimental results show that the
ASWDR technique performs better for indexed images,
giving a higher PSNR and less time to compress, and it is an
alternative to the SPIHT method due to its low complexity.
Finally, it is identified that ASWDR for indexed image
compression performs better when compared to intensity image
compression.
FFT Based Technique for Automatic Image Mosaicing Vinod G R1, Mrs.Anita R2
1VLSI Design and Embedded System(E&C)
Visvesvaraya Technological University
East Point College Of Engineering and Technology Bangalore,India
2Associate Prof Dept.of Electronics and Communication
Visvesvaraya Technological University
East Point College Of Engineering and Technology
Bangalore,India
Abstract— This paper proposes a technique to generate a
panoramic view by combining images. Image mosaicing is useful
for a variety of tasks in vision and computer graphics. It presents
a complete system for stitching a sequence of still images with
some amount of overlapping between every two successive
images. There are two contributions in this paper. The first is an image
registration method which handles rotation and translation
between the two images using FFT phase correlation. The second
is an efficient method of stitching registered images using the
registration parameters obtained in previous step. It removes the
redundancy of pasting pixels in the overlapped regions between
the images with the help of an empty canvas.
Keywords— Image Stitching, FFT Phase Correlation, Registration,
Rotation, Translation
1. INTRODUCTION
An image mosaic is a composition generated from a
sequence of still images. By estimating the
parameters relating the images, it is possible to construct
a single image from many images with some
overlapping areas, covering the entire visible area
of the scene. The steps in mosaicing are image
registration and image stitching.
Image registration refers to the geometric alignment
of a set of images. The set may consist of two or
more digital images taken of a single scene from
different sensors, or from different viewpoints. The
goal of registration is to establish geometric
correspondence between the images so that they
may be transformed, compared, and analyzed in a
common reference frame. This is of practical
importance in many fields, including remote
sensing, medical imaging, and computer vision.
The registration method presented here uses the Fourier
domain approach to match images that are translated and
rotated with respect to one another. Fourier methods
differ from other registration strategies because they
search for the optimal match according to information in
the frequency domain. The algorithm uses the property
of phase correlation for automatic registration, which
gives the translation parameters between two images by
showing a distinct peak at the point of the displacement.
With this as the basis, rotation is also found.
The next step, following registration, is image stitching.
Image integration or image stitching is a process of
overlaying images together on a bigger canvas. The
images are placed appropriately on the bigger canvas
using registration parameters to get the final mosaic. At
this stage, the main concerns are in respect of the quality
of the mosaic and the efficiency of the algorithm used.
In this paper, an efficient method for stitching multiple
images has been proposed.
2. IMAGE REGISTRATION
2.1 ESTIMATION OF ROTATION PARAMETERS:
Suppose the two images I1 and I2 to be registered involve
both translation and rotation with angle of rotation being
θ between them. When I2 is rotated by θ, there will be only
translation left between the images and the phase correlation
with I1 should give maximum peak. So by rotating I2 by one
degree each time and computing the correlation peak for that
angle, we reach a stage where there is only translation left
between the images, which are characterized by the highest
peak for the phase correlation. That angle becomes the angle
of rotation, θ.
2.2 TRANSLATION PARAMETER ESTIMATION:
If f(x, y) ⇔ F(ξ, η), then
f(x, y) exp[j2π(ξ0x + η0y)/N] ⇔ F(ξ − ξ0, η − η0) and
f(x − x0, y − y0) ⇔ F(ξ, η) exp[−j2π(ξx0 + ηy0)/N]
where the double arrow (⇔ ) indicates the
correspondence between f (x, y) and its Fourier
transform F. According to this property, also called
the Fourier Shift Theorem, if a certain function's
origin is translated by certain units, then the
translation appears in the phase of the Fourier
transform. That is, if f and f′ are two images that differ
only by a displacement (x0, y0), i.e.,
f′(x, y) = f(x − x0, y − y0)
Then, their corresponding Fourier transforms F
and F′ are related by
F′(ξ, η) = e^(−j2π(ξx0 + ηy0)/N) F(ξ, η).
The cross-power spectrum of two images f and f′
with Fourier transforms F and F′ is defined as
F(ξ, η) F′*(ξ, η) / |F(ξ, η) F′*(ξ, η)| = e^(j2π(ξx0 + ηy0)/N)
where F′* is the complex conjugate of F′. The
shift theorem guarantees that the phase of the cross-
power spectrum is equivalent to the phase
difference between the images. By taking inverse
Fourier transform of the representation in the
frequency domain, we will have a function that is
an impulse, that is, it is approximately zero
everywhere except at the displacement that is
needed to optimally register the two images. If there
is no transformation between the two images other
than translation, then there is a distinct peak at the
point of the displacement.
The discussion above shows that
whenever pure translation is present between
two images, the phase correlation has a maximum peak,
and the corresponding location gives the translation
parameters (x0, y0).
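This translation recovery can be sketched directly (a Python/NumPy sketch rather than the MATLAB implementation described in Section 4; the small epsilon guards against division by zero, and the test image and cyclic shift are synthetic):

```python
import numpy as np

def phase_correlation(f1, f2):
    # Normalized cross-power spectrum; its inverse FFT is (ideally) an
    # impulse at the displacement between the two images.
    F1 = np.fft.fft2(f1)
    F2 = np.fft.fft2(f2)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12        # keep only the phase
    corr = np.real(np.fft.ifft2(cross))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx, corr

# Recover a known cyclic shift of (3, 5) pixels.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
shifted = np.roll(np.roll(img, 3, axis=0), 5, axis=1)
dy, dx, corr = phase_correlation(shifted, img)
```

For an exact cyclic shift the correlation surface is essentially zero everywhere except a single peak of height close to 1 at the displacement, which is the distinct peak the text refers to.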
2.3 PROPOSED ALGORITHM:
Now, we present the algorithm for estimation of the
rotation and translation parameters
discussed in the previous two sections. (We can
downsample the images to speed up the process of
registration.)
Algorithm1:
Input:
Two overlapping images I1 and I2
Output:
Registration parameters (tx, ty, θ)
where tx and ty are translation in x and y directions
respectively and θ is the rotation parameter.
Steps:
1. Read and resize the two images. Let the resized
images be I1' and I2'.
2. For i = 1: step: 360
2.1) Rotate I2' by i degrees. Let the rotated image be
I2'rot.
2.2) Compute the Fourier transforms FI1' and
FI2'rot of images I1' and I2'rot respectively.
2.3) Let Q(u,v) be the Phase correlation value of I1'
and I2'rot, based on FI1' and FI2'rot.
Q(u,v) = FI1′(u,v)·FI2′rot*(u,v) / |FI1′(u,v)·FI2′rot*(u,v)|
2.4) Compute the inverse Fourier transform q(x, y) of
Q(u,v).
2.5) Locate the peak of q(x,y).
2.6) Store the peak value in a vector at position i. End
For.
3. Find the index of maximum peak from the values
stored in the vector in step 2.6. It gives the angle of
rotation. Let it be θ'.
4. Repeat steps 2.1 to 2.6 for i = θ'-step : θ'+step.
5. Find the angle of maximum peak from step 4. It
becomes the angle of rotation. Let it be θ.
6. Rotate the original image I2 by θ. Let the rotated
image be I2rot.
7. Phase correlate I1 and I2rot. Let the result be P(u,v).
8. Compute the inverse Fourier transform p(x,y) of
P(u,v).
9. Locate the position (tx, ty) of the peak of p(x, y)
which become the translation parameters.
10. Output the parameters (tx, ty, θ).
The above algorithm is capable of finding rotation
between the images. The maximum peak occurs only at
the point where there exists pure translation between the
images.
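A toy version of the rotation search in steps 2 through 5 (a Python/NumPy sketch, not the MATLAB implementation; it is restricted to multiples of 90 degrees, where np.rot90 is exact and no interpolation is needed, whereas the paper sweeps all angles in one-degree steps; the images and shift are synthetic):

```python
import numpy as np

def pc_peak(f1, f2):
    # Height of the phase-correlation peak between two images.
    cross = np.fft.fft2(f1) * np.conj(np.fft.fft2(f2))
    cross /= np.abs(cross) + 1e-12
    return np.real(np.fft.ifft2(cross)).max()

rng = np.random.default_rng(1)
img1 = rng.random((64, 64))
img2 = np.roll(np.rot90(img1, k=1), 7, axis=1)   # rotate 90 deg, then shift

# Undo each candidate rotation and record the correlation peak;
# the strongest peak identifies the true rotation (k = 1, i.e. 90 deg).
peaks = [pc_peak(img1, np.rot90(img2, k=-k)) for k in range(4)]
best_k = int(np.argmax(peaks))
```

Only the correct counter-rotation reduces the relationship between the images to pure translation, so its phase-correlation peak is near 1 while the other candidates stay near zero, mirroring fig 3.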
As an example, consider an input pair of images, fig 1 and fig
2, with a translation of 31 pixels along the column (x)
direction, 201 pixels along the row (y) direction, and
a rotation of 90 degrees. The plots of the phase
correlation between the images are shown in fig 3 and fig
4.
fig 1 fig 2
In fig 3 we can see a peak at the 90 degree point, showing
that the 2nd input image is rotated by 90 degrees with respect to
the 1st input image.
Fig 4 shows the peak at the point of the
exact translation between the two images along the X
and Y directions (X = 31 and Y = 201).
fig 4
3. IMAGE STITCHING
Image stitching is the next step following the
registration. At this stage, the reference image is
overlaid on the source image by pasting its pixels on a
canvas at the appropriate location using the
transformation parameters obtained in the registration
process. In this section, we present a general algorithm
for stitching any number of images.
3.1 PROPOSED ALGORITHM:
Algorithm2:
1. Create a canvas: The canvas is for the mosaic of all
the images. We call it image canvas.
2. Make the entire canvas black.
3. For a given image I, For each pixel in the image I,
Paste a mapped pixel on the canvas, taking into
consideration the translational and rotational parameters.
Fig 4 shows mosaic image of two input images.
Fig 4
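Algorithm 2 can be sketched for the translation-only case as follows (a Python/NumPy sketch rather than the MATLAB implementation; grayscale images and non-negative integer offsets are assumed, and rotation handling is left out):

```python
import numpy as np

def stitch(images, offsets, canvas_shape):
    # Paste each registered image onto a black canvas at its (row, col)
    # offset, writing each canvas pixel only once.
    canvas = np.zeros(canvas_shape)
    filled = np.zeros(canvas_shape, dtype=bool)
    for img, (oy, ox) in zip(images, offsets):
        h, w = img.shape
        region = canvas[oy:oy + h, ox:ox + w]      # view into the canvas
        mask = ~filled[oy:oy + h, ox:ox + w]       # pixels not yet pasted
        region[mask] = img[mask]                   # skip already-pasted pixels
        filled[oy:oy + h, ox:ox + w] = True
    return canvas

imgs = [np.ones((4, 4)), np.full((4, 4), 2.0)]
canvas = stitch(imgs, [(0, 0), (0, 2)], (4, 6))    # second image overlaps by 2
```

The boolean mask is what implements the "paste each pixel only once" property discussed next: overlap pixels keep the value from the first image that covered them.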
3.2 ADVANTAGES OF THE ABOVE METHOD:
This algorithm is very efficient in stitching multiple
images with large overlaps. Consider a sequence of
images with, let us say, 80% overlap between
successive images. If the entire image is pasted
every time, then some of the pixels in the overlap
region get mapped four times, thus leading to a
300% redundancy in pasting, whereas algorithm 2
pastes each pixel only once. This approach not only
improves the efficiency of the stitching but at the
same time keeps the quality of the mosaic closer to
that of the input images.
4. EXPERIMENTAL RESULTS
Algorithm 1 for registration and algorithm 2 for
stitching, as described in sections 2 and 3 respectively,
have been implemented in MATLAB R2011a. These
algorithms have been tested on different sets of images,
especially real images involving large amounts of
rotational and translational changes for registration and
illumination and view changes for image composition.
5. CONCLUSION
In this paper, we have presented two algorithms
for still image sequences. The first is a simple and
reliable algorithm for finding the rotation and
translation of planar transformations based on
phase correlation. The overall complexity is
dominated by FFT. A key feature of Fourier-based
registration methods is the speed offered by the use
of FFT routines. The next is a method of stitching
images which overcomes redundancy in re-pasting
pixels in the final mosaic. All these algorithms add
quality and efficiency to the mosaicing process.
Application of Numerical Approximation Techniques for Face
Detection & Recognition Kayala Rahul1, Nikhil Bharat1
1Department of Electronics & Communication, RV College of Engineering
RV Vidyaniketan Post, 8th Mile, Mysore Road, Bangalore, India
[email protected] , [email protected]
Abstract—The "Eigen Faces" technique for face detection in an
image is a standard algorithm, first proposed in 1987. It involves
the decomposition of a given image into its Eigen components for
the purpose of recognition of a face. The computation of Eigen
values of images is resource intensive. This paper proposes the
application of Numerical Approximation Techniques,
particularly the Jacobi‘s Method for Asymmetric matrices to
compute the Eigen values and thereby reduce the computational
intensity. The application of this technique is particularly useful
in low resource portable embedded systems.
Keywords— Face detection, Eigen Faces, Numerical Methods,
Jacobi's method, Principal Component Analysis
I. INTRODUCTION
Face detection is primarily used in Face Recognition
Systems (FRS), in which an image or a video is scanned for
a face and then selected features are compared with a database to
obtain a match. Facial recognition is primarily used as an
alternative to existing biometric identification such as
fingerprinting and eye iris scanning systems. Face detection is yet to be widely adopted as a security feature, primarily due to the high
computational resources required in extracting and matching
components in a face.
II. DETECTION TECHNIQUES
Several methods are used in the detection of faces in an
image. All the techniques make the following key assumptions:
The image of the face is of reasonable resolution
The face stored in the database and that submitted
for detection were taken under similar external
conditions such as lighting, camera noise etc.
The expression of the subjects is the same,
preferably neutral.
A. Traditional Detection Techniques
Some facial recognition algorithms extract prominent
features of the face such as relative position/shape of the eyes,
mouth, cheek bones, jaw etc.
Other algorithms normalize a gallery of face images and
then compress the face data, only saving the data in the image
that is useful for face detection. A probe image is then
compared with the face data.
Geometric algorithms are those which look at the
distinguishing features. Photometric algorithms are those
using a statistical approach that distils an image into values
and compares the values with templates to eliminate
variances.
Principal Component Analysis using Eigen Faces is a
traditional detection technique.
B. 3D Detection Technique
This technique uses 3D sensors to capture information
about the shape of a face. This information is then used to
identify distinctive features on the surface of a face such as
nose, chin etc.
Three-dimensional data points from a face vastly improve
the precision of facial recognition. It is not affected by
changes in lighting, and the pose angle of the face does not affect
the accuracy of the recognition algorithm.
III. TYPICAL FACE RECOGNITION SYSTEM
The block diagram of a typical face recognition system is
shown in Fig. 1
Fig. 1 Block diagram of a typical system
There are several stages in FRS, all of which affect the
accuracy of the system as a whole.
A. Acquisition
This is the entry point of the face recognition process. It is
the module where the face image under consideration is
presented to the system. The image can either be from a stored
database or can be a live picture captured at the module. Usually, a picture captured live is harder to process.
B. Pre-processing
Pre-processing is performed on the captured image in order to reduce the effect of environmental factors such as lighting. The image is normalized using well-known processes such as resizing to the standard of the current system, histogram processing to improve the quality, median filtering, high-pass filtering, rotational normalization and illumination normalization.
C. Feature Extractor
This segment of the FRS is used to determine which
features/landmarks of the face are to be extracted and
compared. The features extracted depend upon the algorithm
used by the system. In the Eigen Faces based system, the
feature extractor obtains the Eigen Vectors of the image using
Principal Component Analysis (PCA).
D. Training Sets
A set of images which are faces are used to train the system
with regard to the features of a typical face it will encounter.
The features extracted are compared with that of those
extracted from the training set in order to determine whether
an image is a face.
E. Classifier
The classifier determines if the image is a known face by
comparing the extracted features with the features stored in
the database. It adopts a Maximum Likelihood approach by
measuring the likelihood that a given image is a face and
belongs to a super set of the training set. If the training set is
small, it may result in errors and false positives.
IV. PRINCIPAL COMPONENT ANALYSIS
Principal component analysis (PCA) is a mathematical
procedure that uses an orthogonal transformation to convert a
set of observations of possibly correlated variables into a set
of values of linearly uncorrelated variables called principal
components. The number of principal components is less than
or equal to the number of original variables. In the context of face detection, and image processing in general, PCA refers to the extraction of the Eigen vectors of the image.
The main principle behind Face detection using PCA is
that all faces have similar characteristics and can be expressed
as a weighted sum of the Eigen vectors of those images in the
training set.
The accuracy of the detection depends on a variety of
factors such as number of the training set, variations in
lighting conditions, variations in subject expressions etc.
A. Procedure for PCA
The first step is to obtain a set S of M face images, which have been pre-processed. Each image is transformed into a vector of length N (the total number of pixels) and placed into the set.
Fig. 2 Training set images
The mean image Ψ is calculated for the training set as Ψ = (1/M) Σ_n Γ_n.
Fig. 3 Mean image of the training set
The mean-subtracted image Φ_i is then calculated as Φ_i = Γ_i − Ψ.
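These two steps can be sketched in a few lines of NumPy. This is an illustration of ours, using a tiny random "training set" in place of real face images; the shapes, not the data, are what matter:

```python
import numpy as np

# Tiny stand-in training set: M = 4 images of 8x8 pixels, each flattened
# to a row vector Gamma_n of length N = 64 (the paper uses 128x128 -> 16384).
rng = np.random.default_rng(0)
M, N = 4, 64
S = rng.random((M, N))     # the set S of face vectors

psi = S.mean(axis=0)       # mean image Psi
Phi = S - psi              # mean-subtracted images Phi_i, one per row
```

By construction, the mean-subtracted rows sum to zero in every pixel position.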
A set of M orthonormal vectors u_n is chosen which best describes the distribution of the data. The k-th vector u_k is chosen such that

λ_k = (1/M) Σ_n (u_k^T Φ_n)^2

is a maximum, subject to the orthonormality constraint u_l^T u_k = δ_lk.
From the concept of Singular Value Decomposition (SVD), we know that u_k and λ_k are the eigenvectors and eigenvalues of the covariance matrix C = (1/M) Σ_n Φ_n Φ_n^T. We now compute the Eigen values and vectors of the covariance matrix C, and then express each image in the training set in terms of the Eigen vectors u_l obtained. This gives us the set of Eigen faces of the training set.
Fig.4 Eigen faces for the given training set
B. Recognition of Faces
A new face image Γ is transformed into its Eigen face components. We compare the input image with our mean image and project the difference onto each eigenvector of the C matrix; each value w_k = u_k^T (Γ − Ψ) represents a weight and is saved in a vector Ω. We then determine the Euclidean distance between the weights of the images in the training set and those of the input image, ε_k = ||Ω − Ω_k||.
The input image is considered to be a known face if ε_k is below an established threshold θ_k. If the difference is above that threshold but below a second threshold θ_u, the image is classified as an unknown face. If an unknown face is discovered, i.e. θ_k < ε_k < θ_u, most FR systems add it to the training set and recompute the Eigen faces. This feedback technique helps improve the accuracy.
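The threshold logic above can be sketched as follows. This is a Python illustration of ours; the function and variable names are not from the paper:

```python
import numpy as np

def classify(omega, known_omegas, theta_k, theta_u):
    """Classify a weight vector omega against training-set weight vectors.

    Returns the index of the matched known face, "unknown" when the best
    distance falls between the two thresholds, or "not a face" above
    theta_u. Illustrative helper, not the paper's own code.
    """
    dists = np.linalg.norm(known_omegas - omega, axis=1)  # epsilon_k
    k = int(np.argmin(dists))
    if dists[k] < theta_k:
        return k                 # known face
    if dists[k] < theta_u:
        return "unknown"         # candidate for addition to the training set
    return "not a face"
```

The "unknown" branch is what feeds the retraining loop described in the text.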
V. LIMITATIONS OF PCA
Consider a training set of 20 faces of standard size
128x128. Each face is represented using a single row vector of
length 16384.
Correspondingly, the size of the Mean Subtracted image
matrix is 16384 x 20. Hence, the size of the Covariance matrix
C is 16384 x 16384.
Subsequently C has 16384 Eigen vectors and values, which
makes computation highly complex, resource intensive and
time consuming.
However, the number of Eigen faces depends upon the size of the training set. This makes it reasonable to assume that the total number of significant Eigen vectors is equal to the size of the training set. In this example, the number of significant Eigen vectors is 20.
Despite the reduction, the processing and computation of the M largest Eigen values and corresponding vectors is still computationally intensive, especially in the context of the limited resources available in an embedded system as compared to a PC.
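One well-known way to exploit this observation, from the Eigenfaces literature (ref [1]), is to eigendecompose the small M × M matrix A·Aᵀ instead of the huge N × N covariance matrix, then map each small eigenvector back. A NumPy sketch of ours (random data standing in for face images, N shrunk for speed):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 20, 1024                      # the paper has N = 16384
A = rng.random((M, N))
A -= A.mean(axis=0)                  # mean-subtracted images, one per row

L = A @ A.T                          # small M x M matrix, cheap to decompose
evals, V = np.linalg.eigh(L)

# Each eigenvector v of L maps to an eigenvector u = A^T v of A^T A
# (proportional to the covariance C), with the same eigenvalue. These
# are exactly the M significant eigenvectors.
u = A.T @ V[:, -1]
u /= np.linalg.norm(u)

C = A.T @ A                          # built here only to verify the claim
residual = np.linalg.norm(C @ u - evals[-1] * u)
```

The identity behind the mapping: C(Aᵀv) = Aᵀ(AAᵀ)v = λ·Aᵀv.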
VI. JACOBI‘S METHOD FOR EIGEN VALUES
The computational complexity involved in Eigen value calculation can be mitigated by adopting numerical approximation techniques.
Jacobi's method for non-symmetric matrices is used to convert any given matrix into a diagonal matrix. The Eigen values are then the elements of the leading diagonal of the diagonal matrix thus obtained.
This is an iterative technique and the error involved in the
approximation is systematically reduced with each iteration.
A. Mathematical Analysis
For a symmetric matrix A, the Jacobi method constructs a series of orthogonal matrices T_i such that the matrix

D_(i+1) = T_i^T ... T_1^T A T_1 ... T_i

converges to a diagonal matrix D, where i denotes the iteration number.
The orthogonal transformation matrix T comprises a rotation component R and a shear component S, such that T_i = R_i S_i. Consider a rotation taking place in the (k, l) plane of the matrix, with k < l. Then Jacobi's method states that
r_kk = r_ll = cos θ,   r_kl = -r_lk = -sin θ
s_kk = s_ll = cosh p,  s_kl = s_lk = -sinh p

tan θ = (a_kl + a_lk) / (a_kk - a_ll),   tanh p = (ed - h) / (g + 2(e^2 + d^2))

where

e = a_kl - a_lk
d = (a_kk - a_ll) cos 2θ + (a_kl + a_lk) sin 2θ
g = Σ_i (a_ki^2 + a_ik^2 + a_li^2 + a_il^2)
h = cos 2θ Σ_i (a_ki a_li - a_ik a_il) - 0.5 sin 2θ Σ_i (a_ki^2 + a_ik^2 + a_li^2 + a_il^2)

Here a_ij is an element of A, r_ij an element of R and s_ij an element of S; each sum runs over all i except i = k and i = l.
These give us R_i and S_i, and hence T_i. For each iteration, T_(i+1) is calculated and a better approximation of the diagonal matrix D is obtained. This is repeated until T_i = I_n. The leading diagonal elements of D give the Eigen values, while the product of all the T_i gives the Eigen vectors corresponding to those values.
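In the symmetric special case the shear component reduces to the identity (S = I, so T = R), and the iteration above becomes the classical Jacobi eigenvalue algorithm. The following plain-Python sketch of ours shows the rotate-until-diagonal loop the text describes:

```python
import math

def jacobi_eigenvalues(A, sweeps=20):
    """Cyclic Jacobi iteration for a real symmetric matrix A (list of lists).

    Each plane rotation T = R in the (k, l) plane zeroes the off-diagonal
    pair a_kl; the diagonal of the limit matrix D holds the eigenvalues.
    Symmetric special case (shear S = I) of the scheme in the text.
    """
    n = len(A)
    A = [row[:] for row in A]                 # work on a copy
    for _ in range(sweeps):
        for k in range(n - 1):
            for l in range(k + 1, n):
                if abs(A[k][l]) < 1e-15:
                    continue
                # rotation angle: tan 2*theta = 2*a_kl / (a_kk - a_ll)
                theta = 0.5 * math.atan2(2 * A[k][l], A[k][k] - A[l][l])
                c, s = math.cos(theta), math.sin(theta)
                for i in range(n):            # apply R^T from the left
                    aki, ali = A[k][i], A[l][i]
                    A[k][i] = c * aki + s * ali
                    A[l][i] = -s * aki + c * ali
                for i in range(n):            # apply R from the right
                    aik, ail = A[i][k], A[i][l]
                    A[i][k] = c * aik + s * ail
                    A[i][l] = -s * aik + c * ail
    return sorted(A[i][i] for i in range(n))
```

Only trigonometric functions and multiply-adds appear in the loop, which is the property the Advantages section relies on.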
B. Application in Face Detection
The Jacobi technique described above is best suited for large real non-symmetric matrices. In the domain of face detection, the method is applied to the covariance matrix C to extract the Eigen vectors. In order to further simplify the computation, under the assumption that the number of significant Eigen values equals the size of the training set M, the matrix C is first reduced to a matrix of rank M and then subjected to Jacobi's iteration.
C. Implementation
The above scheme can be implemented for an image of a standard type such as .jpg, .gif or .png, as well as any proprietary standard used in the FRS. We have implemented the same using .jpg images of faces with a standard size of 320 x 243 pixels.
VII.ADVANTAGES & LIMITATIONS
The application of Jacobi's technique in face detection lies in the most crucial stage of any FRS, i.e. feature extraction. The results at this stage determine the key performance indicators of the system.
The key advantage of this technique stems from its iterative nature, which reduces the code density and thereby makes its integration into portable embedded systems easier. Additionally, the simplicity of the technique makes it possible to implement on any basic floating-point processor with support for trigonometric instructions. This eliminates the need for the dedicated, high-cost DSP units usually required for complex matrix operations on images.
Notwithstanding the advantages, the inherent disadvantage of incorporating Jacobi's technique is that the solution is an approximation, resulting in slight errors and deviations from the expected result. This can be overcome by increasing the number of iterations or by increasing the size of the training set (statically by the programmer, or dynamically by the system incorporating images classified as unknown). However, this increases the time complexity of the system.
The challenge is to choose a suitable trade-off between the accuracy of the FRS, its cost/portability and, importantly, its time complexity.
VIII. CONCLUSIONS
This paper demonstrates the main challenges involved in
incorporating a robust FRS in a portable embedded
system/low resource computing environment. These can be
overcome to a certain degree by incorporating numerical
approximation techniques in the FRS. This paper
demonstrated the use of Jacobi's method for the Eigen values/vectors of a real non-symmetric matrix in an FRS based on the principle of PCA using Eigen faces for face detection.
The challenges involved in the incorporation, primarily
time complexity and accuracy of the system were discussed
vis-à-vis the inherent advantage of code density and the
ability to be implemented in a low-resource computing environment. In conclusion, numerical approximation techniques can be applied to FRS in low-resource environments where there is sufficient scope for a trade-off between accuracy and time complexity.
ACKNOWLEDGEMENT
We wish to acknowledge the valuable guidance provided to
us by Sri. P L Rajashekhar, Assistant Professor, Department
of Mathematics, RVCE. We also would like to thank Smt.
Veena Devi, Assistant Professor, Department of Electronics &
Communication, RVCE.
We also express our gratitude to The Department of
Electronics & Communication for providing us the necessary
support and environment to work on this endeavour.
REFERENCES
[1] M. Turk and A. Pentland, "Eigen faces for recognition", Vision and Modeling Group, The Media Laboratory, MIT.
[2] I. Atalay, "Face detection using Eigen faces", M.Sc. Thesis, Istanbul Technical University, 19
[3] A. Bigdeli, C. Sim, M. Biglari-Abhari and B. C. Lovell, "Face Detection on Embedded Systems", Department of Electrical and Computer Engineering, The University of Auckland, Auckland, New Zealand.
[4] Z. Li and X. Tang, "Eigen Face Recognition Using Different Training Data Sizes", Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong.
[5] C.-O. Lee, "Basic Numerical Methods for Image Processing".
[6] G. Aubert and P. Kornprobst, "Mathematical Problems in Image Processing", Applied Mathematical Sciences 147, Springer, 2002.
Fault Management Tool Development for an Optical Node
Ajay Kumar V1, M. B. Kamakshi2, Venkatesha B V3
1, 2 Dept. of Telecommunication, R.V. College of Engineering, Bangalore, India
3 Alcatel-Lucent India Pvt. Ltd.
[email protected], [email protected], [email protected]
Abstract - Network management and security have become a very sensitive and important topic for manufacturers and network operators. In optical networks, fault management deals with the detection and isolation of faults based on the alarms received from the network elements. This paper describes the development of a fault management tool for an enterprise optical node managed by an enterprise network management system. The tool is implemented in order to reduce the manual effort in validating the alarms, and a description of it is presented. For any network management system, FCAPS is the term used to represent its functionalities: Fault, Configuration, Accounting, Performance and Security management. The fault management aspect of an enterprise optical node is presented in this paper. Validated results from the tool's execution are captured as screenshots and presented.
Keywords - FCAPS: Fault, Configuration, Accounting, Performance, Security management; NMS: Network Management System; PSS: Photonic Service Switch; SAM: Service Aware Manager; DWDM: Dense Wavelength Division Multiplexing; Fault Management; Alarm Management; SNMP: Simple Network Management Protocol; UDP: User Datagram Protocol
I. Introduction
Today's optical networks are capable of carrying large
amounts of data. Commercial systems have been deployed with a fiber capacity of several terabits per second, which is equivalent to millions of simultaneous telephone
conversations. As optical network technology advances and
higher bandwidth is demanded, the amount of data transmitted
over a single optical fiber is expected to increase further. Due
to such high data rates, even a short service disruption may
cause large amounts of data to be affected. Many different
types of service disruptions occur frequently in practice. They
include bending and cutting of fiber, loss of signal, equipment failure, and human error. Besides faults, optical networks are
vulnerable to sophisticated attacks that are not possible in
electronic networks. Therefore, it is of critical importance for
such networks to have fast and effective methods or tools for
identifying and locating network failures. This is especially
important for the physical layer, where any physical failure
should be detected, located and corrected before it is noticed
by the upper layer protocols. Fault detection in optical
networks depends on alarms/traps generated by different types
of network monitoring equipment in response to unexpected
events. Depending on the placement and capabilities of the
monitoring devices, the network fault manager or a network
management system may receive a large number of
redundant alarms for some network failures, while it may not
receive any alarms for other network failures. In order for the fault detection and localization mechanism to be fast and effective, it is important to reduce the number of redundant alarms. This reduces the alarm processing time as well as the ambiguity in fault localization.
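As a toy illustration of redundant-alarm reduction (a sketch of ours, not part of any NMS), alarms sharing the same network element and condition can be collapsed into one record carrying a repeat count:

```python
def reduce_alarms(alarms):
    """Collapse duplicate alarms: keep one record per (element, condition)
    pair and count how many times it was reported. Field names are
    illustrative, not taken from any real alarm schema."""
    reduced = {}
    for alarm in alarms:
        key = (alarm["element"], alarm["condition"])
        if key in reduced:
            reduced[key]["count"] += 1
        else:
            reduced[key] = dict(alarm, count=1)
    return list(reduced.values())
```

Real systems go further and correlate alarms across elements to a common root cause, but even this simple dedup cuts the processing load.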
In optical networks, the physical layer generally consists of
several basic network components. Optical components are
passive or active. Passive optical components do not have
monitoring equipment capable of detecting and reporting
alarms. Active optical components have monitoring
equipment and therefore are capable of reporting alarms to the
network management system or network manager.
Optical networks, and all networks in general, need a fault management system or tool that is able to identify the faults that occur from the information given by the network elements. A fault is defined as an accidental interruption of the ideal functioning of the network due to component wear. Faults produce either signal degradation or complete signal interruption; the former are called soft faults, and the latter hard faults.
Section II deals with the introduction of the Alcatel-Lucent
developed enterprise-specific optical node called Photonic
Service Switch (PSS) and the enterprise-specific Network
Management System called Service Aware Manager (SAM).
Section III deals with the implementation procedure for fault
management. Section IV shows the snapshots of the validated
results obtained for one card supported by PSS and also the
corresponding traces of traps using a network monitoring tool,
Wireshark. Section V gives the conclusion. The terms SAM and NMS, PSS and optical node, and alarms and traps are used interchangeably.
II. Photonic Service Switch (PSS) and Service
Aware Manager (SAM).
A. Photonic Service Switch
The Alcatel-Lucent 1830 Photonic Service Switch (PSS)
represents a new breed of photonic switch for next generation
access, metro and long-haul wavelength division multiplexing
(WDM). It is a multi-reach platform that spans access, metro,
regional and long-haul applications and supports a wide range
of data rates, enabling service delivery in a variety of
environments and applications. It is used in broadband transport networks by telecommunication operators and enterprises such as telcos to provide high-bandwidth connectivity over distances of up to 4000 km. It also supports the full range of network topologies, including ring, point-to-point and optical mesh topologies. Fig.1 shows the schematic of a PSS
node managed by the SAM.
Fig.1 Graphical Representation of PSS node (simulator node)
Key features of the PSS (optical node):
- A static, tunable/reconfigurable optical add/drop multiplexer (T/ROADM) with single-wavelength add/drop granularity
- WDM functionality
- Colorless and any-direction add/drop capabilities
- Up to 88 wavelengths per fiber at 50 GHz ITU spacing
- ODU1, ODU2, ODU3 and ODU4 interfaces according to the G.709 standard
- 100 Gb/s and 40 Gb/s channel capacity, best in class
- Wavelength Tracker technology enabling end-to-end power control, monitoring, tracing and fault localization for each individual wavelength channel
- Support for various cards according to the service required, i.e., Optical Transponder Cards, Filter Cards and Amplifier Cards
B. Service Aware Manager
The Service Aware Manager (SAM) is a network management application designed using industry standards such as the Java framework, multi-tier layering and standard web service interfaces. The use of industry-standard interfaces allows the SAM to interoperate with other network systems. Fig.2 shows the user interface of SAM.
Fig.2 SAM GUI showing Alarms window and topology
1) Key features of SAM (NMS)
- Use of open standards that promote interoperation with other systems
- Distributed server processing
- A multi-tier model that groups functions into separate, well-defined elements
- Creation of web services
- Component redundancy
- SNMP as the underlying protocol
2) Key components of SAM
- The server, a Java-based network-management processing engine
- The database, an Oracle relational database that provides persistent storage
- Java-based GUI clients, which provide a graphical interface for the network operators
3) Network management capabilities of SAM
- Alarm correlation up to the service level
- Service and routing provisioning
- Inventory reporting at the equipment level
- Network performance and accounting data collection
III. Implementation of Fault Management Tool.
As mentioned in the previous sections, Service Aware
Manager is an NMS which manages the Photonic Service
Switch, which is an optical network element. Fig.3 shows the
flow in implementation of fault management tool for PSS
managed by SAM. In Fig.3, the major elements are PSS, SAM
and an OSSI, which is an Alcatel-Lucent-specific coding methodology. Initially, a dump of the alarms supported by a card is taken from the node (PSS) by logging into the node through the SSH protocol. The listed alarms are then raised on the node with the help of a binary tool supported by the node. Since the node is managed by the SAM (NMS), all alarms raised on the node are reported to the NMS. The underlying protocol for communication between the node and the NMS is SNMP. The alarms seen by the NMS are extracted using the XML API client and are compared with the node alarm parameters. If both alarm specifics match, they are validated as Passed; otherwise they are validated as Failed. The binary tool
supported by the node is called FMDH, which stands for Fault
Management Defect Handler. The tool developed has both a frontend and a backend. The frontend includes a graphical user interface for the selection of a particular card at a particular slot; the backend includes the code for extracting all the alarms supported by the node.
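The Passed/Failed comparison at the heart of the tool can be sketched as follows. This is a Python illustration of ours; the field names ("name", "severity") stand in for the Alcatel-Lucent alarm specifics the real tool compares:

```python
def validate_alarms(node_alarms, nms_alarms):
    """Mark each node alarm Passed if the NMS reports an alarm with
    matching specifics, else Failed. Illustrative sketch only."""
    reported = {(a["name"], a["severity"]) for a in nms_alarms}
    return {
        a["name"]: "Passed" if (a["name"], a["severity"]) in reported
        else "Failed"
        for a in node_alarms
    }
```

Validating the whole dump in one pass like this is what replaces the manual alarm-by-alarm check.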
Fig.3 Method implemented in developing the fault management tool
A. Back End Implementation Procedure
- A particular card supported by the node is chosen for alarm validation.
- That particular card is configured on the node.
- The dump of the alarms supported by the node is extracted by logging into the node through the Secure Shell protocol.
- The alarms are syntactically arranged for execution on the node, using the binary tool supported on the node.
B. Front End Implementation Procedure
- On the frontend, which is implemented using Java, a particular card supported by the optical node can be chosen.
- The frontend also provides options to choose the particular slot and shelf on which the card is configured.
- The node IP address is also provided as an input.
Fig.4 shows the developed GUI, which gives options for
validating a card.
Fig.4 Developed GUI showing various fields on the Frontend
IV. Results and Discussion
As mentioned in Section III, one particular card supported by the node, the A2P2125 Amplifier Card, is chosen, and its validated results are given in the figures below. The characteristics of the card are as follows.
The optical amplification function is performed via multistage EDFA amplifiers, most with mid-stage DCM access. These amplifiers are implemented as integrated variable-gain optical amplifier modules which include fast feedback for transient control. The card provides a maximum gain of 25 dB. Fig.5 shows the validated test results for the A2P2125 card.
Fig.5 SAM GUI showing Alarms captured for the A2P2125 card
Since the underlying protocol for communication between the node and the NMS is SNMP in this case, the alarms can also be referred to as traps. Fig.6 shows the traps recorded using Wireshark, a network monitoring tool, generated during the tool's execution for the mentioned card. The figure gives an insight into various attributes of the traps, such as the source from which the trap is generated, the destination of the trap, the underlying protocol, the timestamp, etc. Fig.7 shows the packet format for one particular alarm/trap, in which layer 2, layer 3, layer 4 and layer 7 information is recorded using the Wireshark tool; the layer 7 information, i.e., the SNMP-related information, is of importance in this context.
Fig.6 Screenshot of alarms/traps recorded from the tool using Wireshark
Fig.7 Screenshot of packet information recorded using Wireshark
V. Conclusion
The tool discussed reduces the manual effort of validating one individual alarm at a time, which is an inefficient validation procedure, by providing for bulk validation of all the alarms supported by the optical node. The front end developed, i.e., the user interface, provides flexibility and granularity in the validation of the alarms.
REFERENCES
[1] Ma-kun Guo, Yi-min Wang and Qi Yu, "Research and Implementation of Network Management System Based on XML View", International Conference on Logistics Engineering and Intelligent Transportation Systems (LEITS), 2010, pp. 1-4.
[2] S. Stanic, S. Subramaniam, G. Sahin, H. Choi and H.-A. Choi, "Active monitoring and alarm management for fault localization in transparent all-optical networks", IEEE Transactions on Network and Service Management, 2010, Vol. 7, Issue 2, pp. 118-131.
[3] Wang Haitao and Chang Chun Qin, "Network management system based on Java technology", 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), 2010, pp. V1-685 - V1-688.
[4] Xiaolin Lu, "An Architecture for Web Based and Distributed Telecommunication Network Management System", Third International Symposium on Intelligent Information Technology Application (IITA 2009), 22 Nov. 2009, Vol. 1, pp. 152-155.
[5] M. N. Ismail, "Network Management System Framework and Development", International Conference on Future Computer and Communication (ICFCC 2009), 3-5 April 2009, pp. 450-454.
[6] S. Stanic and S. Subramaniam, "Distributed Hierarchical Monitoring and Alarm Management in Transparent Optical Networks", IEEE International Conference on Communications, 2008, pp. 5281-5285.
[7] S. Stanic, G. Sahin, Hongsik Choi, S. Subramaniam and Hyeong-Ah Choi, "Monitoring and alarm management in transparent optical networks", Fourth International Conference on Broadband Communications, Networks and Systems, 2007, pp. 828-836.
[8] Alcatel-Lucent reference guides and data sheets.
[9] www.snmp.org
Toolbox for material classification of MMWR and FLIR
for Enhanced Vision System
Devendran.B*, Sudesh Kumar Kashyap**, T.V.Rama Murthy***
*PG Student, REVA Institute of Technology and Management, Bangalore, [email protected]
**MSDF lab, FMCD, CSIR-National Aerospace Laboratories, Bangalore, India, [email protected]
***Dept. of Electronics and Communication Engineering, REVA Institute of Technology and Management,
Bangalore, [email protected]
Abstract - To improve the situational awareness of an aircrew during poor visibility, different approaches have emerged during the past few years. Enhanced vision systems (EVS, based on sensor images) are one of those. Typically, the enhanced vision system concept is a combination of sensor data, environmental variables, internal and external state data, and a material database of the given geographical area. EVS uses weather-penetrating forward-looking image sensors such as Forward Looking Infrared Radar (FLIR) and HiVision Millimeter Wave Radar (HiVision MMWR). To generate the backscatter image from the imaging sensors, it is important to develop a material-classified database of the given geographical area, but the development of such a database requires various image processing tasks. Hence the main contribution of this paper is the implementation of a GUI-based toolbox capable of developing a material database for both MMWR and FLIR, using different material properties, for enhanced vision system functionalities.
Keywords - Enhanced Vision Systems (EVS), Graphical
User Interface (GUI), Region of Interest (ROI) and Phong like
lighting model, Normalized Radar Cross Section (NRCS).
I. INTRODUCTION
The typical enhanced vision system concept is shown in fig.1. The performance of the enhanced vision system relies on the performance of the imaging sensors [1]. The reliability of the imaging sensors depends highly on the accuracy of the material-classified database used, since the imaging sensors generate the backscatter image by taking the material-classified database of the given geographical area as a reference [2]. Hence designing such a database accurately is a significant challenge, and it requires a number of image processing functions to classify the objects in the given image.
Fig.1. Typical Enhanced Vision System concept [3]
As mentioned earlier, sensor vision is one of the integral parts of enhanced vision system technology. Sensor vision uses weather-penetrating Forward Looking Infrared Radar (FLIR) and MilliMeter Wave Radar (MMWR) sensors for imaging. In the case of MMWR, imaging is based on the backscattering coefficient (Normalized Radar Cross Section, NRCS, σ0), whereas for FLIR the solar absorptance, albedo, emissivity, conductivity, thickness, slope and surface azimuth parameters are to be considered. More details about the radar modelling requirements are given in refs [1-4].
For a given geographical area (airport image), the classifications of terrain are: (1) Asphalt or Bitumen, (2) Concrete, (3) Tar, (4) Bare Soil, (5) Green Grass, (6) Trees, (7) Desert Sand, (8) Ice or White Paint, (9) Snow, (10) Water, (11) Brick or Urban Areas, (12) Wood.
Material classification has to be done with respect to the
above terrain types. The material-coded airport database can subsequently be used in the modelling and simulation of MMWR and FLIR for the purposes of EVS studies.
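Once the terrain classes are fixed, material coding itself is a per-pixel lookup. A hypothetical sketch of ours (the numeric codes are invented for illustration; the paper does not fix them):

```python
# Hypothetical material-code table for the twelve terrain classes above.
MATERIAL_CODES = {
    "asphalt_or_bitumen": 1, "concrete": 2, "tar": 3, "bare_soil": 4,
    "green_grass": 5, "trees": 6, "desert_sand": 7, "ice_or_white_paint": 8,
    "snow": 9, "water": 10, "brick_or_urban": 11, "wood": 12,
}

def encode(label_image):
    """Turn a 2-D grid of terrain labels into a material-coded database."""
    return [[MATERIAL_CODES[label] for label in row] for row in label_image]
```

The hard part, assigning the right label to each region in the first place, is what the toolbox's ROI workflow below addresses.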
Material classification of an airport image is purely an image processing task, which requires several algorithms to be applied to an image at a time. Material classification can also be done using edge detection and colour coding methods, but the results of these methods will not meet the desired accuracy. This drawback motivates the development of a software toolbox to create a material database based on the different aforementioned parameters.
Fig.2. Block diagram for toolbox design
In this paper, details of the implementation of a MATLAB-based GUI toolbox for material classification, and the realization of a Phong-like lighting model for different terrain materials based on the Normalized Radar Cross Section, are presented. The authors have attempted to present the complete procedure required for developing a material database using the designed toolbox.
II. TOOLBOX DESIGN USING MATLAB GRAPHICAL USER
INTERFACE DEVELOPMENT ENVIRONMENT (GUIDE)
A graphical user interface (GUI) is a graphical display in one or more windows containing controls, called components, which enable a user to perform interactive tasks. The user of the GUI does not have to create a script or type commands at the command line to accomplish the tasks and, unlike with coded programs, need not understand the details of how the tasks are performed. GUI components can include menus, toolbars, push buttons, radio buttons, list boxes and sliders, to name a few. GUIs created using MATLAB tools can also perform any type of computation, read and write data files, communicate with other GUIs, and display data as tables or as plots [8].
Most GUIs wait for their user to manipulate a control and then respond to each action in turn. Each control, and the GUI itself, has one or more user-written routines (executable MATLAB code) known as callbacks, named for the fact that they "call back" to MATLAB to ask it to do things. The execution of each callback is triggered by a particular user action such as pressing a screen button, clicking a mouse button, selecting a menu item, typing a string or a numeric value, or passing the cursor over a component. The GUI then responds to these events. The creator of the GUI provides callbacks which define what the components do to handle events. This kind of programming is often referred to as event-driven programming; in the example, a button click is one such event. In event-driven programming, callback execution is asynchronous [6-8].
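The register-then-dispatch pattern behind callbacks can be shown in a few lines. This toy dispatcher is ours and written in Python for brevity, not part of MATLAB or the toolbox:

```python
class Gui:
    """Toy event-driven dispatcher: components register callbacks,
    and each user event triggers the matching one."""

    def __init__(self):
        self._callbacks = {}

    def register(self, event, callback):
        # a component wires a user-written routine to an event name
        self._callbacks[event] = callback

    def fire(self, event, *args):
        # the GUI dispatches an event to its callback, if one exists
        callback = self._callbacks.get(event)
        return callback(*args) if callback else None
```

GUIDE generates exactly this kind of wiring automatically: one callback stub per component, invoked when the matching event fires.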
MATLAB GUIs can be built in two ways: by using GUIDE (GUI Development Environment), an interactive GUI construction kit, or by creating code files that generate GUIs as functions or scripts (programmatic GUI construction). In this paper the first approach is selected. It starts with a figure that is populated with components from within a graphic layout editor. GUIDE creates an associated code file containing callbacks for the GUI and its components, and saves both the figure (as a FIG-file) and the code file; opening either one also opens the other to run the GUI [8]. The initial view of the material coding toolbox is shown in fig.3.
Important callbacks used in designing the toolbox
Some of the most important callback functions used in designing the material coding toolbox are discussed as follows. To write the other callback
functions used in the toolbox, [6]-[8] were used as a guide.
Load_Image_Callback: using this callback function,
the user can load an image from any directory. Once the
image is selected, it is displayed on the main axes. The MATLAB
functions used for this purpose are described as follows.

    [h, path] = uigetfile( ...
        {'*.jpg;*.tif;*.png;*.gif', 'All Image Files'; ...
         '*.*', 'All Files'}, 'mytitle');
    handles.a = imread(fullfile(path, h));
    axes(handles.axes3)
    set(gcf, 'CurrentAxes', handles.axes3)
    set(handles.Original_img, 'String', h)
    imshow(handles.a)
Select_ROI_Callback: this callback function can be
called the heart of the designed toolbox, because it
provides the facility for the user to select a region of his
own interest. The MATLAB function used for this task is
described as follows,

    I = handles.a;              % the loaded image data
    [BW, x, y] = myroipoly(I);

The above function returns the binary mask (BW) of the
selected region of interest and the (x, y) co-ordinates of
that mask. A region of interest (ROI) is a portion of an
image on which the user wants to filter or perform some other
operation. An ROI can be defined by creating
a binary mask, which is a binary image of the same
size as the image to be processed, with pixels that
define the ROI set to 1 and all other pixels set to 0. By
using the above callback function, the user can select more than
one ROI in an image.
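For readers outside MATLAB, the effect of a roipoly-style selection can be sketched in a few lines of Python. The polygon below is hypothetical; the sketch only illustrates how a binary mask of the same size as the image, with 1 inside the ROI and 0 elsewhere, can be built from polygon vertices.

```python
def point_in_polygon(x, y, poly):
    """Even-odd rule test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x co-ordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def roi_mask(height, width, poly):
    """Binary mask the same size as the image: 1 inside the ROI, 0 outside."""
    return [[1 if point_in_polygon(c + 0.5, r + 0.5, poly) else 0
             for c in range(width)] for r in range(height)]

# Hypothetical square ROI with corners (1,1)-(4,4) in a 6x6 image.
mask = roi_mask(6, 6, [(1, 1), (4, 1), (4, 4), (1, 4)])
```

Each pixel whose centre falls inside the polygon is set to 1, mirroring the BW mask returned by the callback.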
Fig.3. Initial view of the designed material coding toolbox using GUIDE
Delete_ID_Callback: this callback function lets the
user delete an erroneous region of interest together with its
related data, such as the (x, y) co-ordinates, ID value, and
material coding parameters. The following code is used to
delete the unwanted data,

    del_file = load(file_name);
    del_file.material.XY(ID_entry) = [];
    del_file.material.sigma(ID_entry) = [];
    del_file.material.ID(ID_entry) = [];

The poly2mask function is used to get back the region
of interest and its co-ordinates which the user wants to delete.

    mask = poly2mask(x, y, m, n);   % m-by-n mask from polygon (x, y)
Select_Materials_Callback: this callback function is
used to select the required terrain material corresponding to the
selected region of interest. The following code is used to
select the different materials in both the MMWR and FLIR cases
of material classification.

    function Select_Materials_Callback(hObject, eventdata, handles)
    v = get(handles.Select_Materials, 'Value');
    s = get(handles.Select_Materials, 'String');
    handles.sig = importdata('Phong.txt');
    switch s{v}
        case 'Asphalt/Bitumen'
            % parameters for asphalt/bitumen
        case ' '
            % remaining material cases (elided in the original)
    end
Save_Callback: this callback function is used to save
the material database for both MMWR and FLIR. The material
database is saved in a structure along with the number of IDs,
the (x, y) co-ordinates of each ID, and, in the sigma field,
either the NRCS calculation parameters (MMWR case) or the
IR material classification parameters (FLIR case) for
each selected ID. It is implemented as follows.

    function Save_Callback(hObject, eventdata, handles)
    material.sigma = handles.mycm;
    material.mdb = handles.matdb;
    material.ID = handles.myid;
    material.XY = handles.Coordinates;
    handles.material = material;
    [file, path] = uiputfile( ...
        {'*.m;*.fig;*.mat;*.mdl', 'MATLAB Files (*.m,*.mat)'; ...
         '*.m', 'program files (*.m)'; ...
         '*.mat', 'MAT-files (*.mat)'; ...
         '*.*', 'All Files (*.*)'}, 'Save as', ...
        '..\project\MMWR_DataBase\');
    save([path, file], 'material');
    guidata(hObject, handles)
Operating procedure of the Material Classification Toolbox / GUI Model
The procedural steps for developing the database of
both MMWR and FLIR, using the toolbox designed with
the help of the callbacks discussed above, are described as
follows.
Step 1:
Run the program file
(Material_Coding_Tool_Box.m) on the MATLAB
editor platform.
Step 2:
A GUI model appears on the screen. To start
material classification, click on the 'Load Image' button.
It opens the folder where the images are stored; select
the required image by double clicking on it. The
selected image is displayed on the main axes.
Step 3:
MMWR case: Move to the MMWR panel, click on
the 'Select Materials' popup menu and select the
required material type.
FLIR case: Move to the FLIR panel and enter the
Thickness, Slope and Surface Azimuth values of the
required material in the respective editor windows. If the
material classification is independent of these three
parameters, enter zero. Then click on the 'Select
Materials' popup menu and select the required material
type.
Step 4:
Click on the 'Select ROI' button; it works on the
graphical input method. A '+' cursor appears on the main axes
where the required image was loaded, and it moves with the mouse
pointer. Move this cursor over the desired material type
in the image and mark the points on its boundary. Once the
last point meets the first point, the '+' cursor changes shape.
After the boundary shape is complete, finish the region of
interest by double clicking on it. The boundary turns red with
a number tag. The number tag is the count of the number of ROIs
created, and it is placed at the first co-ordinate of the
selected region.
Step 5:
Repeat step 3 and step 4 for the different materials
in the same image. After all the regions of interest have
been selected properly in an image tile, click on the
'Interpolate' button. It assigns the un-coded region as
Green grass and completes the database of the given
image tile.
Step 6:
Click on the 'Save' button to save the complete
material coded database of that image in '.mat' format.
The database file name should be the same as the name
of the image file loaded on the main axes.
Step 7:
If the user wishes to see the coded regions, click
on the 'Update' button. It displays the coded regions on
the auxiliary axes of the respective image, and also shows
how many pixels are coded and how many are left out, in
the slot given in the toolbox.
Step 8:
Suppose a region of interest is erroneously
coded, or the user is not satisfied with the coded region;
in this case the user can delete that particular region. First
select whether it is in FLIR mode or MMWR mode by
clicking on the radio buttons (by default MMWR mode is
taken), then move to the 'Enter ID Value' panel,
enter the number tag of the region to be deleted in
the editor window, and click on the 'Delete ID' button.
This clears all the data related to that particular region.
Step 9:
Suppose the user stopped the material classification
at some stage and wants to continue with the
previously coded data of the same image. To continue
the material classification of the previous session, the
previously selected regions of interest must be loaded,
in order to avoid repetitive coding. This is done as
follows: first select the appropriate mode (MMWR or
FLIR, by clicking the radio button), then click on the
'Plot ID' button. This plots all the previously selected
regions of the image on the main axes and loads the
previously coded data into the workspace.
Step 10:
If the user wants to see the coded image with
respect to the different parameters considered for
material classification of both MMWR and FLIR, first
select MMWR or FLIR mode, then move to the View Coded
Images panel.
MMWR case: Enter the grazing angle at which the
user wants to see the image and click on the 'MMWR Image'
button. The coded image with respect to the NRCS and
grazing angle is displayed on the auxiliary axes.
FLIR case: Click on the 'Select Parameters'
popup menu and select the required parameter for which the
user wants to see the coded version of the original image
tile.
Step 11:
To see the Phong model plots for different
terrain materials, move to the Phong Model Plots panel,
click on the 'Terrain objects' popup menu and select the
required material.
Step 12:
To close the GUI, click on the 'Close' button.
Phong-like lighting backscatter (NRCS) generator
model implementation for MMWR
Radar simulation involves the computation of a
radar response based on the terrain's normalized radar
cross section. MMWR imaging depends on the
backscattering coefficient. The amount of radiated
energy is proportional to the target size, orientation,
physical shape and material, which are all lumped
together in one target-specific parameter called Radar
Cross Section (RCS), denoted by "σ". The radar cross
section is defined as the ratio of the power reflected back
to the radar to the power density incident on the target:

    σ = Pr / Pd   (m²)

where Pd is the power delivered to the radar signal
processor by the antenna. Radar simulation involves the
computation of a radar response based on the terrain's
normalized radar cross-section (NRCS). To compute the
NRCS for different types of terrain objects, a well-known
model called the Phong-like lighting model is used.
Phong lighting is an
empirically derived BRDF model for the computation of
optical reflections [5]. The method is very popular in
computer graphics and is broadly supported by different
software and hardware platforms. Although the Phong
lighting model is not physically correct since it does not
obey all the laws of physics involved, it has easily
interpretable parameters, which may explain its
popularity. Using the Phong model, the mean normalized
radar cross section is computed as

    σ0(θ) = a·sin(θ) + b·sin^c(θ)

where θ is the grazing angle and a, b and c are the model
parameters: a controls the amount of diffuse reflection of
a material, b is the specular reflection coefficient, and c
is the specularity, that is, the sharpness of the directional
highlight for a material (see [5] for more details about
terrain types and the a, b and c values).
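As an illustration, the model can be evaluated directly. The sketch below assumes the reconstructed form σ0(θ) = a·sin(θ) + b·sin^c(θ) and uses hypothetical parameter values; the real per-material values are tabulated in [5].

```python
import math

def nrcs_phong(theta_deg, a, b, c):
    """Mean NRCS from the Phong-like model:
    sigma0(theta) = a*sin(theta) + b*sin(theta)**c, theta = grazing angle."""
    s = math.sin(math.radians(theta_deg))
    return a * s + b * s ** c

# Hypothetical parameters: a = diffuse amount, b = specular coefficient,
# c = specularity (sharpness of the directional highlight).
a, b, c = 0.1, 0.9, 20.0
table = {g: nrcs_phong(g, a, b, c) for g in (10, 45, 90)}
```

With a large specularity c, the specular term only contributes near θ = 90°, which is why the toolbox asks for the grazing angle before rendering an MMWR-coded image.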
III. SIMULATION RESULTS
Toolbox Views
In this section, views of the designed toolbox at
different stages of material classification, and Phong
model plots of different materials, are presented in
figures 4, 5 and 6. The parameters used for developing
the database of both MMWR and FLIR were obtained
through a literature survey. In the MMWR case, the toolbox
lets the user see the coded image with respect to the
grazing angle, whereas FLIR coded images can be seen
with respect to different parameters.
Fig.4. Demo of selecting a ROI using toolbox
Fig.5. Toolbox after selecting all the ROIs of a given image tile and
updating the coded image on the auxiliary axes.
Fig.6. Phong model plots of different terrain materials implemented on
toolbox; (a) Tree (b) Green Grass (c) Brick/Urban areas (d) Wood
Results of Database Comparison
In this section, the databases obtained from other
methods and from the toolbox are compared. Fig. 9 shows the
comparison for the FLIR database and fig. 10 for the MMWR
database. Figures 7 to 11 also show the accuracy of the
database obtained using the toolbox.
Fig.7. Image of the material classified database developed using color coding
method
Fig.8. Image of the material classified database developed using our designed
toolbox
Fig.9. Comparison of FLIR Images generated using two different databases.
(a) Image generated using Database created outside the toolbox; (b) Image
generated using Database created by toolbox.
Fig.10. Comparison of MMWR Images generated using two different
databases. (a) Image generated using Database created outside the toolbox; (b)
Image generated using Database created by toolbox.
Fig.11. Comparison of MMWR Images generated using two databases
developed from color coding method and toolbox. (a) Original image (b)
Image generated using Database created using color coding methods; (c)
Image generated using Database created by toolbox.
IV. CONCLUSION
Weather penetrating sensors play a prominent
role in Enhanced Vision System functionalities. The
backscatter image generation capability of these sensors
depends heavily on the material database of the given
geographical area: the quality of the database is reflected
in the quality of the backscatter image from the sensor.
Hence, this paper described a toolbox implementation
that provides the facility for the user to develop the material
database for both HiVision millimeter wave radar and
forward looking infrared radar. The procedure for
developing the database using the toolbox was presented,
and the material database obtained from the designed
toolbox was compared with the one obtained from the
colour coding method. A comparison of the quality of the
databases in the simulator was also presented. However,
developing a quality material database still requires
considerable user attention.
ACKNOWLEDGMENT
This work was sponsored by National Aerospace
Laboratories (NAL), Bangalore. Programmatic support and
technical advice from the MSDF group of FMCD, NAL is
gratefully appreciated. Gratitude is also owed to the Director
of NAL, the Head of FMCD, and the Principal of REVA ITM,
Bangalore.
REFERENCES
[1] Maxime E. Bonjean, Fabian D. Lapierre, Jens Schiefele, and Jacques G. Verly, "Flight Simulator with IR and MMW Radar Image Generation Capabilities," Proc. of SPIE, vol. 6226, 62260A, April 2006.
[2] Bernd Korn, Hans-Ullrich Doehler, and Peter Hecker, Institute of Flight Guidance, "Weather Independent Flight Guidance: Analysis of MMW Radar Images for Approach and Landing," IEEE, 2000.
[3] B. Korn, H.-U. Doehler, and P. Hecker, "MMW radar data processing for enhanced vision," in J. G. Verly, editor, Enhanced and Synthetic Vision 1999, vol. 3691, pp. 29-38, SPIE, Apr. 1999.
[4] H.-U. Doehler, P. Hecker, and R. Rodolff, "Image data fusion for future Enhanced Vision Systems," Sensor Data Fusion and Integration of the Human Element, System Concepts and Integration (SCI) Symposium, Ottawa, 14-17 Sep. 1998, RTO Meeting Proceedings 12.
[5] Niklas Peinecke, Hans-Ullrich Doehler, and Bernd R. Korn, "Phong-like Lighting for MMW Radar Simulation," Millimeter Wave and Terahertz Sensors and Technology, edited by Keith A. Krapels and Neil A. Salmon, Proc. of SPIE, vol. 7117, 71170M, 2008.
[6] Patrick Marchand and O. Thomas Holland, Graphics and GUIs with MATLAB.
[7] Hunt, Lipsman, and Rosenberg, A Guide to MATLAB for Beginners and Experienced Users.
[8] "Build GUI in MATLAB," MATLAB learning guide, Mathworks.com.
Development and hardware implementation of QPSK
modulator with polyphase filters
R. Kannan, Seetha Rama Raju Sanapala
Reva Institute of Technology and Management, Bangalore ([email protected])
Reva Institute of Technology and Management, Bangalore ([email protected])
Abstract— In this paper, a QPSK modulator with
polyphase filters is presented and implemented on a
field programmable gate array (FPGA). All the
components, such as the data generator, IQ mapping,
polyphase filters, and NCO (Numerically Controlled
Oscillator), are realized as digital, discrete-time
components before feeding the DAC. The NCO is
developed using the ROM lookup table method. The
main motivation behind this work is to develop a
working model of a QPSK modulator with polyphase
filters suitable for application in data links for high
data rate applications. QPSK has the advantage of
requiring half the channel bandwidth of BPSK. Two
solutions were explored for the polyphase
implementation: the first is the direct realization of
the polyphase filters, and the second uses the
transposed polyphase design, which is a compromise
between area and speed.
Keywords— QPSK modulator, polyphase filters,
noble identities.
I. INTRODUCTION
Streaming real-time telemetry / video is an important
feature of modern Unmanned Aerial Vehicles (UAVs)
used for surveillance. Analog modulation schemes like
FM, which are generally used for telemetry and command
control, cannot be used for such high bandwidth data
links. Quadrature Phase Shift Keying (QPSK) digital
modulation and its variants provide the best compromise
between the bandwidth requirements and the power
available onboard a UAV for transmission. In the proposed
design, a polyphase filter is used in the QPSK modulator
rather than a conventional filter; the polyphase approach
exploits the computational efficiency of FIR filters.
This paper is organized as follows. Section II gives a
general overview of the interpolating filter and the noble
identities used in the current filter implementation.
Section III presents the polyphase filter approach.
Section IV discusses the QPSK modulator, the application
of the polyphase filters in this design, and the difference
between the two implementations in terms of area and
speed; finally, conclusions are drawn.
INTERPOLATING FILTER
Fig. 1. Block diagram of interpolation filter.
The rate conversion is accomplished by increasing
the rate by a factor of L to the higher rate f'' = L × f,
so that T'' = T / L, where T is the input sampling period
and T'' is the output sampling period.
II. NOBLE IDENTITIES FOR
INTERPOLATION
Fig. 2. Noble identity for upsampling: x(n) → [↑L] → [H(z^L)] → y(m)
is equivalent to x(n) → [H(z)] → [↑L] → y(m).
This noble identity is a very useful result in
filter theory. Noble identities are also
applicable to the decimator. Interpolation
increases the sampling rate, while the
decimator does the opposite, i.e. decreases the
sampling rate. The noble identities for the
decimator case are different from those for the
interpolation case. Upsampling a sequence x(n) creates a
new sequence in which every Lth sample is taken
from x(n), with all the others zero. The upsampled
sequence contains L replicas of the original signal's
spectrum. To restore the original spectrum, the
upsampler should be followed by a low-pass
filter with gain L and cutoff frequency π/L. In
this application, such an anti-aliasing filter is
referred to as an interpolation filter, and the
combined process of upsampling and filtering is
called interpolation.
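The upsample-then-filter chain described above can be sketched as follows, in Python, with a toy 3-tap linear-interpolation filter of gain L = 2; the actual filters in multirate systems are much longer.

```python
def upsample(x, L):
    """Insert L-1 zeros between consecutive samples of x."""
    y = []
    for s in x:
        y.append(s)
        y.extend([0.0] * (L - 1))
    return y

def fir(x, h):
    """Direct-form FIR filter: y(n) = sum_k h(k) * x(n - k)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(len(x))]

L = 2
x = [1.0, 2.0, 3.0]
# Toy low-pass interpolation filter with gain L (a linear interpolator).
h = [0.5, 1.0, 0.5]
y = fir(upsample(x, L), h)   # interpolated sequence at rate L*fx
```

Here the filter fills the inserted zeros with midpoints, so y is a linearly interpolated (and delayed) version of x at twice the rate.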
III. POLYPHASE FILTERS
Polyphase filters are used to reduce a large filtering
length (e.g. M samples) into a set of smaller filtering
lengths K (where K is defined as M/L, M being a multiple
of the integer L). Consider the interpolator. Since the
up-sampling process inserts L-1 zeros between two
consecutive samples, only K out of the M input values in
the FIR filter are non-zero. At any one time, these
non-zero values coincide with, and are multiplied by, the
filter coefficients h(0), h(L), ..., h(M-L). This gives the
polyphase unit sample responses as:

    p_k(n) = h(k + nL)   (1)
Fig2.Transforming a single stage interpolator into
parallel configuration
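Equation (1) can be checked numerically: filtering the input with each sub-filter p_k and interleaving the branch outputs with a commutator reproduces, sample for sample, the direct upsample-then-filter result. A Python sketch with hypothetical coefficients (M = 6, L = 2, so K = 3 taps per sub-filter):

```python
def polyphase_split(h, L):
    """Sub-filter impulse responses p_k(n) = h(k + n*L), k = 0..L-1."""
    return [h[k::L] for k in range(L)]

def fir(x, h):
    """Direct-form FIR filter, output truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(len(x))]

def interpolate_polyphase(x, h, L):
    """Filter x with every sub-filter, then interleave (commutate) outputs."""
    branches = [fir(x, p) for p in polyphase_split(h, L)]
    return [branches[n % L][n // L] for n in range(len(x) * L)]

def interpolate_direct(x, h, L):
    """Reference: upsample by L (insert zeros), then filter with h."""
    up = []
    for s in x:
        up.append(s)
        up.extend([0.0] * (L - 1))
    return fir(up, h)

x = [1.0, -2.0, 3.0, 0.5]             # hypothetical input samples
h = [0.25, 0.5, 1.0, 0.5, 0.25, 0.1]  # hypothetical M = 6 coefficients
L = 2
y_direct = interpolate_direct(x, h, L)
y_poly = interpolate_polyphase(x, h, L)   # matches y_direct sample for sample
```

The polyphase form never multiplies by the inserted zeros, which is exactly the computational saving the section describes.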
IV. QPSK MODULATOR
Fig .3 Block diagram of QPSK modulator.
Data Generator
The message bits are generated here: this block
produces random data of 1s and 0s at the
specified rate.
IQ mapping and serial to parallel converter
The serial data stream is converted to parallel data
streams. An appropriate mapping table converts the bits
from the two parallel arms into Imap and Qmap symbols.

    S_k(t) = A cos(2*pi*f_c*t + θ_k)   (3)

    S_k(t) = A cos θ_k cos(2*pi*f_c*t) - A sin θ_k sin(2*pi*f_c*t)   (4)

    S_k(t) = I_k cos(2*pi*f_c*t) - Q_k sin(2*pi*f_c*t)   (5)

where S_k(t) is the modulated waveform, A is the
output amplitude, f_c is the carrier frequency and
θ_k is the instantaneous phase. I_k = A cos θ_k
and Q_k = A sin θ_k are the in-phase and quadrature
components.
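Equations (3)-(5) and the serial-to-parallel mapping can be sketched as follows. The Gray-coded mapping table and the numeric values of f_c, f_s and samples per symbol are illustrative assumptions, since the paper does not list its exact mapping table.

```python
import math

# Hypothetical Gray-coded QPSK table: dibit -> (I, Q) = A*(cos, sin) of phase.
A = 1.0 / math.sqrt(2.0)
QPSK_MAP = {(0, 0): (A, A), (0, 1): (-A, A), (1, 1): (-A, -A), (1, 0): (A, -A)}

def qpsk_symbols(bits):
    """Serial-to-parallel: group bits in pairs, map each pair to (I_k, Q_k)."""
    pairs = list(zip(bits[0::2], bits[1::2]))
    return [QPSK_MAP[p] for p in pairs]

def modulate(symbols, fc, fs, sps):
    """s_k(t) = I_k*cos(2*pi*fc*t) - Q_k*sin(2*pi*fc*t), sampled at fs."""
    out = []
    for k, (i_k, q_k) in enumerate(symbols):
        for n in range(sps):
            t = (k * sps + n) / fs
            out.append(i_k * math.cos(2 * math.pi * fc * t)
                       - q_k * math.sin(2 * math.pi * fc * t))
    return out

waveform = modulate(qpsk_symbols([0, 0, 1, 1]), fc=20.0, fs=80.0, sps=4)
```

In the hardware design the cos/sin terms come from the NCO lookup tables and the baseband I/Q streams are shaped by the polyphase filters before this mixing step.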
Page 41
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
233
Polyphase Filtering
For the polyphase filter, the upsampling factor used is
32. The Direct Digital Synthesizer (DDS) produces a high
carrier frequency. Each cycle of the sine wave should have
at least 4 samples; therefore the sampling rate is 80 MBPS.
The filter coefficients have been generated using MATLAB
Simulink, and the number of coefficients has been limited
to 80, which is equivalent to 5 symbol durations. A Kaiser
window has been applied to the FIR filter for sidelobe
attenuation. The filter coefficients are quantized to 16 bits.
For the polyphase filter implementation on the Virtex-4
FPGA, two designs are used to meet speed and area
constraints. The first uses the direct realization of the
sample rate converter (L = 32).
Fig. 4. Block diagram of polyphase filter with commutator
(input x(n) at rate f_x, output y(n) at rate f_y = L·f_x).
Fig .5 Block diagram of Direct realization
polyphase filter (L=32).
Figure 5 above shows the implementation of the
sub-filters for the direct realization; each sub-filter has
3 coefficients. The second realization uses the sub-filter
structure shown in figure 6, obtained after adding a cut-set
by moving the delay line to the adder line. This has the
advantage of reducing the critical path of each sub-filter.
Fig .6 Block diagram of a sub filter with transposed
implementation.
The modulator was implemented in VHDL with
both polyphase designs on a Virtex-4 FPGA kit.
The Xilinx synthesis tool reported the resource
utilization of the entire implementation as described
below.
For Direct Realization

Parameter                  Used    Available on FPGA   Percentage
No. of slices              1539    15360               10%
No. of slice flip flops     438    30720                1%
No. of 4-input LUTs        2182    30720                7%
No. of DSP48s                 8      192                4%

For Transposed Polyphase Design

Parameter                  Used    Available on FPGA   Percentage
No. of slices              1629    15360               10%
No. of slice flip flops     524    30720                1%
No. of 4-input LUTs        2376    30720                7%
No. of DSP48s                 8      192                4%

Comparison of maximum frequency

Method                              Maximum frequency
Direct realization of polyphase     133.190 MHz
Transposed polyphase                186.519 MHz
HARDWARE RESULTS
Fig. 7. QPSK output using direct polyphase
realization.
Fig. 8. QPSK output using transposed polyphase
realization.
Analysis was done with the help of a Vector Signal
Analyser.
Fig 9. QPSK output using Direct Realization of
Polyphase filter tested by Vector Signal Analyser.
Fig 10. QPSK output using transposed Realization of
Polyphase filter tested by Vector Signal Analyser.
Conclusion
By implementing both polyphase filter designs, a
compromise between area and speed can be made, as the
above results show: the transposed design achieves a higher
maximum frequency at the cost of slightly more resources.
Distinguishing Identical Twins using their Facial Marks as Biometric
Signature
Avinash JL1, H K Chandrashekar1
1Dept. ECE, AIT-Chikmagalur, India-577102,
[email protected] , [email protected]
Abstract- Due to the high degree of correlation found in the
fingerprints, palm prints, irises and overall facial appearance
of identical twins, and because studies show that commercial
face recognition systems perform poorly at distinguishing
identical twins, another technique is needed to distinguish
monozygotic (identical) twins. The proposed work uses facial
marks as biometric signatures to distinguish identical twins,
based on the fact that, although identical twins may have the
same number of facial marks, those marks are organized at
different positions. The proposed work presents a multi-scale
facial mark detection process based on the Fast Radial
Symmetry Transform (FRST), which detects regions of interest
with high radial symmetry at different scales. Prominent facial
marks are then detected across the different scales. Finally, the
detected marks are used to distinguish the identical twins.
Key Words - Identical Twins, Face Recognition, Facial Marks,
Haar-like features, Fast Radial Symmetry Transform (FRST).
I. Introduction
There are two types of twins, dizygotic and monozygotic
twins. Dizygotic twins result from two different fertilized eggs
resulting in different Deoxyribo Nucleic Acid (DNA).
Monozygotic twins, also called identical twins are the results
of a single fertilized egg splitting into two individual cells and
developing into two individuals. Therefore, identical twins
have the same genetic expressions. The frequency of identical
twins is about 0.5% across different populations. Some
researchers believe that this is the performance limit of face
recognition systems to distinguishing identical twins.
The increase in twin births has created a requirement
for biometric systems to accurately determine the identity of a
person who has an identical twin. The discriminability of
some of the identical twin biometric traits, such as
fingerprints, iris, and palm prints, is supported by anatomy
and the formation process of the biometric Characteristic,
which state they are different even in identical twins due to a
number of random factors during the gestation period.
In spite of the fact that the biometric of identical twins is
affected by many factors, some of them such as facial features
are still very similar. Some identical twins share not only
similar facial features but also the same signatures. Confusion
over their identities has made it difficult for others to know
who owns what and who does what. As a result, some
identical twins partake in commercial scams such as
fraudulent insurance compensation. Most importantly, if one
of the identical twins commits a serious crime, their unclear identities cause confusion and uncertainty in court trials. The
ability to distinguish between identical twins based on
different biometric modalities such as face, iris, fingerprint,
etc., is a challenging and interesting problem in the biometric
area. Identical twins are formed when a zygote splits and
forms two embryos. They cannot be discriminated based on
DNA. Therefore, other biometric traits are needed to
distinguish between identical twins. Using face recognition to
differentiate between identical twins is very difficult, because
of the high degree of similarity in their overall facial
appearance. This project focuses on distinguishing between
monozygotic twins based on localized facial features known
as facial marks.
Traditionally, biometrics research has focused primarily on
developing robust characterizations and systems to deal with
challenges posed by variations in acquisition conditions and
the presence of noise in the acquired data. Only recently have
researchers started to look at the challenges involved in
dealing with the task of distinguishing identical twins.
Developing techniques and systems that improve twin face
recognition should also improve generic face recognition
systems. Although identical twins represent only 0.5% of the
global population, failure to correctly identify each twin has
led to problems for law enforcement agencies. There have
been several criminal cases in which either both or neither of
the identical twins was convicted due to the difficulty in
determining the correct identity of the perpetrator.
This project proposes to differentiate between
identical twins using facial marks alone. Facial marks are
Page 45
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
237
considered to be unique and inherent characteristics of an
individual. Although they are similar in appearance, they can
be distinguished using facial marks. High-resolution images
enable us to capture these finer details on the face. Facial
marks are defined as visible changes in the skin and they
differ in texture, shape and colour from the surrounding skin.
Facial marks appear at random positions on the face. By
extracting different facial mark features, this project aims to
differentiate between identical twins.
II. Proposed System
This work proposes a multi-scale facial mark detector based
on the fast radial symmetry transform (FRST). The transform
detects dark regions with high radial symmetry. An overview
of the proposed work is shown in Figure 1. Initially, Haar-like
features are used to detect the face and primary facial
features such as the eyes and lips. From the output of the
Haar detector, a mask is created (by setting those regions to
zero) to remove the primary facial features. Next, a 5-level
Gaussian pyramid is constructed and the FRST is applied to
each image in the pyramid to detect dark regions with radial
symmetry. Finally, the detections are tracked across scales.
Detected facial marks are characterized only by their
geometric location. Small rectangular windows are created
around the detected marks for texture comparison across the
detected facial marks. A vector set is created for the detected
marks and is then compared with all the vector sets in the
database; after comparison, the correct match is detected.
A brief study of Haar-like features, the Gaussian pyramid
and the FRST is needed to carry out the proposed work, and
they are discussed below.
Figure 1: Block Diagram of Proposed work
a) Haar-like features
Haar-like features are digital image features used in object
recognition. They owe their name to their intuitive similarity
to Haar wavelets, and they were used in the first real-time face
detector. A Haar-like feature considers adjacent rectangular
regions at a specific location in a detection window, sums up
the pixel intensities in each region, and calculates the
difference between them. This difference is then used to
categorize subsections of an image. For example, consider an
image database with human faces. It is a common
observation that, among all faces, the region of the eyes is
darker than the region of the cheeks. Therefore, a common
Haar feature for face detection is a set of two adjacent
rectangles that lie above the eye and the cheek regions. The
position of these rectangles is defined relative to a detection
window that acts like a bounding box for the target object.
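The eye/cheek two-rectangle feature described above is usually computed with an integral image, so that each rectangle sum costs only four array lookups. A Python sketch on a hypothetical toy patch:

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and cols < c (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, via four integral-image lookups."""
    return (ii[top + height][left + width] - ii[top][left + width]
            - ii[top + height][left] + ii[top][left])

def two_rect_haar(img, top, left, height, width):
    """Upper rectangle (eyes, darker) vs lower rectangle (cheeks, brighter)."""
    ii = integral_image(img)
    upper = rect_sum(ii, top, left, height, width)
    lower = rect_sum(ii, top + height, left, height, width)
    return lower - upper   # large value = dark band above a bright band

# Toy 4x4 patch: dark top half ("eyes"), bright bottom half ("cheeks").
patch = [[10, 10, 10, 10],
         [10, 10, 10, 10],
         [200, 200, 200, 200],
         [200, 200, 200, 200]]
feature = two_rect_haar(patch, 0, 0, 2, 4)
```

A detector thresholds such feature values over many window positions, which is how the face and primary features are localized here.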
Figure 2: Examples of Haar-like features
Using Haar-like features in the proposed work, the face
region in the image and the primary features in the detected
face region are found successfully.
Figure 3: Identical Twins
Figure 4: Face and primary features on the face detected
using Haar-like features
b) Gaussian pyramid
The objective is to detect facial marks that are stable across
different scales. This can be achieved by using a Gaussian
pyramid: each subsequent image in the pyramid is obtained
by smoothing the previous level and reducing its resolution.
In the proposed work, the Gaussian pyramid is constructed
for the image with the primary features masked, as shown below.
Figure 5: Primary Features are masked on Detected Face
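The blur-and-downsample construction can be sketched as follows. The paper does not state its pyramid parameters, so the 5-tap binomial kernel (a common Gaussian approximation) and three levels below are assumptions:

```python
import numpy as np

def gaussian_blur(img, kernel=(1, 4, 6, 4, 1)):
    """Separable 5-tap binomial approximation of a Gaussian filter."""
    k = np.asarray(kernel, dtype=float)
    k /= k.sum()
    # Filter rows, then columns; mode='same' keeps the image size.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, rows)

def gaussian_pyramid(img, levels=3):
    """Each level: blur, then drop every other row and column."""
    pyramid = [img]
    for _ in range(levels - 1):
        img = gaussian_blur(img)[::2, ::2]
        pyramid.append(img)
    return pyramid

pyr = gaussian_pyramid(np.random.rand(64, 64), levels=3)
print([p.shape for p in pyr])  # [(64, 64), (32, 32), (16, 16)]
```

A mark that survives as a local extremum across several levels is a candidate stable facial mark.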
c) Fast radial symmetry transform
The proposed work uses the fast radial symmetry transform (FRST) to detect the facial marks. The value of the transform at range n ∈ N indicates the contribution to radial symmetry of the gradients a distance n away from each point. While the transform can be calculated for a continuous set of ranges, this is generally unnecessary, as a small subset of ranges is normally sufficient to obtain a representative result. At each
range n an orientation projection image On and a magnitude
projection image Mn are formed. These images are generated
by examining the gradient g at each point p from which a
corresponding positively-affected pixel p+ve(p) and
negatively-affected pixel p−ve(p) are determined, as shown in
Figure 6. The positively-affected pixel is defined as the pixel
that the gradient vector g(p) is pointing to, a distance n away
from p, and the negatively-affected pixel is the pixel a
distance n away that the gradient is pointing directly away
from.
Figure 6: The locations of pixels p+ve(p) and p−ve(p) affected
by the gradient element g(p) for a range of n = 2. The dotted
circle shows all the pixels which can be affected by the
gradient at p for a range n.
The coordinates of the positively-affected pixel are given by

p+ve(p) = p + round( (g(p)/‖g(p)‖) n )

while those of the negatively-affected pixel are given by

p−ve(p) = p − round( (g(p)/‖g(p)‖) n )

where 'round' rounds each vector element to the nearest integer. The orientation and magnitude projection images are initially zero. For each pair of affected pixels, the point corresponding to p+ve in the orientation projection image On and in the magnitude projection image Mn is incremented by 1 and ‖g(p)‖ respectively, while the point corresponding to p−ve is decremented by the same quantities in each image.
That is
On(p+ve(p)) = On(p+ve(p)) + 1
On(p-ve(p)) = On(p-ve(p)) – 1
Mn(p+ve(p)) = Mn(p+ve(p)) + ||g(p)||
Mn(p-ve(p)) = Mn(p-ve(p)) - ||g(p)||
The radial symmetry contribution at range n is defined as the convolution

Sn = Fn * An

where

Fn(p) = (‖Õn(p)‖)^α · M̃n(p),

Õn(p) = On(p) / max_p ‖On(p)‖
M̃n(p) = Mn(p) / max_p ‖Mn(p)‖

where α is the radial strictness parameter and An is a two-dimensional Gaussian. The full transform is defined as the sum of the symmetry contributions over all the ranges considered:

S = Σ_{n∈N} Sn
If the gradient is calculated so that it points from dark to light, the output image S will have positive values corresponding to bright radially symmetric regions and negative values indicating dark symmetric regions.
In the proposed work, FRST is applied to each image in the Gaussian image pyramid; the resulting symmetry map of an image is shown below.
Figure 7: FRST is applied to each image in the Gaussian
image pyramid
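The projection and combination steps above can be sketched for a single radius n as follows. This is an illustrative NumPy version, not the paper's code; it omits the gradient-count clamp and the final convolution with the Gaussian An:

```python
import numpy as np

def frst_single_radius(img, n, alpha=2.0):
    """One-radius FRST sketch: accumulate the orientation (On) and
    magnitude (Mn) projection images, then combine them as
    Fn = (|On| / max|On|)**alpha * (Mn / max|Mn|)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    On = np.zeros(img.shape)
    Mn = np.zeros(img.shape)
    rows, cols = img.shape
    for y, x in zip(*np.nonzero(mag > 1e-9)):
        dy, dx = gy[y, x] / mag[y, x], gx[y, x] / mag[y, x]
        # p+ve lies n pixels along the gradient, p-ve n pixels against it.
        for sign in (+1, -1):
            py = y + sign * int(round(dy * n))
            px = x + sign * int(round(dx * n))
            if 0 <= py < rows and 0 <= px < cols:
                On[py, px] += sign
                Mn[py, px] += sign * mag[y, x]
    On_norm = (np.abs(On) / max(np.abs(On).max(), 1e-9)) ** alpha
    return On_norm * (Mn / max(np.abs(Mn).max(), 1e-9))

# A dark 5x5 spot on a bright background: the gradient points from dark
# to light, so the transform goes negative at the spot's centre.
img = np.full((21, 21), 200.0)
img[8:13, 8:13] = 50.0
S = frst_single_radius(img, n=3)
print(S[10, 10] < 0)  # True
```

A dark facial mark therefore shows up as a strong negative response at its centre, which is what the mark detector looks for at each pyramid level.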
d) Flow diagram of proposed work
III. Discussion of results
The proposed work detects the facial marks by applying FRST to each image in a Gaussian image pyramid constructed from the primary-features-masked version of the input image. Once marks are detected, a small rectangular window is created around each of them, a vector set is created for the detected marks, and the vector sets are compared in order to identify the correct person among the identical twins.
IV. Applications
This proposal can be employed by forensics and security departments to resolve criminal cases involving identical twins, and by insurance companies to prevent one twin from claiming insurance in the other's name.
V. Conclusion
The proposed work considers facial marks as biometric signatures to distinguish between identical twins. It is an efficient technique in cases where existing techniques such as fingerprint detection, palm print detection, iris recognition and manual annotation of facial marks fail to distinguish identical twins reliably.
The future scope of the proposed work is to adopt facial features and face texture along with facial marks to distinguish identical twins.
Better and Clear Explanation of the Technique of Linear Convolution of Long Sequences by Sectioning
Seetha Rama Raju Sanapala
#ECE Department, Reva Institute of Technology and Management, Bangalore – 560 064.
*[email protected] , [email protected]
Abstract— Convolution (LTI filtering) and correlation are the
fundamental processes in digital signal processing. As such they
are ubiquitous in every DSP area and every DSP
implementation. The computational requirement of convolution is O(N²). Hence, the required number of computations increases phenomenally with increasing order/length (N) of the convolution. The strategy followed in computing the convolution of long sequences is 'divide and conquer', i.e. to break the long
sequences into small sequences, find the convolution of the small
sequences and use these results formed from the small sequences
to build the convolution of the original long sequences – similar
to what is done in the FFT algorithm for the computation of
DFT. This method is called 'convolution by sectioning' and is covered by many standard DSP books through complex mathematics, arrays of symbols, notation and DSP jargon such as time aliasing. It is the observation of the author that even top-notch students do not have a good understanding of this topic and only remember the method as a trick of the trade. This need not be the case: just like everything in any branch of science, this method too has its rationale and reasoning.
Since correlation is nothing but convolution with one of the
sequences time-reversed, the above technique can also be applied
to correlation computation – basically by converting the
correlation problem into the convolution problem. This paper
presents a simple and straight way of explanation of this method.
More importantly, this paper demonstrates how all of the DSP
concepts can be explained in simple terms without compromising
on the rigour and precision. This method follows ‗teach-by-
example‘ and ‗expand with precision and rigor‘. Once the
overall idea of the method is well understood by the student
through the example, the methods of formalism can be used to
bring the student face to face with the language, symbols,
conventions and tools of the topic to make him comfortable with
the mainstream DSP literature. This paper in essence, and in
general, is about better teaching - taking the convolution of long
sequences simply as an example.
Keywords— Signal processing, teaching, education, convolution,
correlation, convolution by sectioning, overlap-add and overlap-
save, overlap-discard, overlap-drop.
I. CONVENTIONS & ORGANISATION
In this paper the sequences (i.e. discrete time signals) are
referred by their names in bold font like x and h – without the
sample/index number n in brackets. If we want to refer to the
nth sample of the sequence x we represent it by x(n) in regular
font. LC and * denote linear convolution in functional form and symbolic form respectively. Similarly, CC and ⊛ denote circular convolution in functional and symbolic forms. Frequently occurring phrases and names are abbreviated in brackets at their first occurrence and used thenceforth by abbreviation only. Equations are named (E1, E2, and so on) for back reference.
Convolution of long sequences by sectioning (COLS) is divided into two techniques, namely (i) the overlap and add (OA) method and (ii) the overlap and save method – which is more meaningfully called the overlap and drop (OD) method. These methods are usually explained along with the technique of calculating linear convolution (LC) through circular convolution (CC), though the two – convolution by sectioning (COLS) and computation of LC through CC (L3C) – are independent. COLS does not require that we use CC to
compute LC. But CC is usually used to compute LC because
CC can be computed through DFT and the DFT has a very
efficient algorithm in the form of FFT.
In section II, the first method of COLS i.e. overlap and add
(OA) will be explained in detail using an approach aimed at
providing a better understanding of the OA method to the
student. We do not use any figure; the reasoning given makes the steps amply clear even without the aid of figures. Also, a figure without proper explanation is as good as absent.
In section III, the overlap and drop (OD) method will be explained briefly, as its reasoning can be developed along similar lines to that of OA.
The limitations of the direct computation of linear convolution (DLC), i.e. by the convolution sum formula, when one of the sequences (usually the input to the filter, x) is very long compared to the other sequence (usually the impulse response h of an FIR filter), have been well explained in many standard books [1-2] on this subject. In this situation of widely different sequence lengths, instead of waiting until all the input samples are acquired, we can start working with blocks of input data – i.e. by sectioning the input data. Let the length of the impulse response be l and that of the input sequence be L, with L ≫ l. Let the section length of the
input sequence be m. We would like to work on the impulse response h of length l and on sequences of length m, i.e. the xi that are the sections of the input sequence x. Our ultimate objective is to compute the output of the filter y = LC(h,x) = h * x by using LC(h,xi) – the linear convolution of h and xi – where LC and * denote linear convolution as explained in the conventions paragraph above. To make things less abstract and to concretise the ideas in the mind of the student, we choose sequences with the following parameters and explain the logic and the steps of the procedure:

l = 2, L = 15, m = 5.
II. OVERLAP AND ADD METHOD
This method can be explained clearly and simply using the property that the linear convolution of x and h can be obtained by the multiplication of the polynomials x(p) and h(p), where x(p) and h(p) are polynomials formed from the sequences x and h, with the sample values of the corresponding sequence forming the polynomial coefficients and the index of each sample giving the power of p, i.e.

z(p) = z(0) + z(1)p + z(2)p² + …

For the parameters we have considered, the following equations apply:

x(p) = x(0) + x(1)p + x(2)p² + … + x(14)p¹⁴ (E1)

h(p) = h(0) + h(1)p (E2)
In summary, y = h * x can be obtained from the
multiplication of x(p) and h(p) and this is true for all finite
sequences h and x– not just for the x and h under
consideration. Since y = h * x, we can say that conversely,
the samples of y can be obtained from the coefficients of the
polynomial y(p) = h(p)x(p).
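This equivalence between polynomial multiplication and linear convolution is easy to check numerically; in the NumPy sketch below the sequences are chosen arbitrarily:

```python
import numpy as np

# Multiplying the polynomials h(p) and x(p) gives the same coefficients
# as the linear convolution y = h * x.
h = [3.0, 1.0]            # h(p) = 3 + p
x = [2.0, 0.0, 5.0]       # x(p) = 2 + 5p^2

# np.polymul expects the highest power first, so reverse in and out.
poly_product = np.polymul(h[::-1], x[::-1])[::-1]

print(poly_product)        # coefficients 6, 2, 15, 5 of y(p)
print(np.convolve(h, x))   # identical: (3+p)(2+5p^2) = 6 + 2p + 15p^2 + 5p^3
```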
Now we can note that, from (E2),

y(p) = x(p)h(p) = h(0)x(p) + h(1)x(p)p (E3)

= h(0)x(0) + h(0)x(1)p + h(0)x(2)p² + h(0)x(3)p³ + … + h(0)x(14)p¹⁴
+ [h(1)x(0)p + h(1)x(1)p² + h(1)x(2)p³ + … + h(1)x(14)p¹⁵]

= h(0)x(0) + [h(0)x(1)+h(1)x(0)]p + [h(0)x(2)+h(1)x(1)]p²
+ [h(0)x(3)+h(1)x(2)]p³ + … + [h(0)x(14)+h(1)x(13)]p¹⁴ + [h(1)x(14)]p¹⁵
You can observe that

y(0) = h(0)x(0) = coefficient of the p⁰ term in y(p)
y(1) = h(0)x(1) + h(1)x(0) = coefficient of the p¹ term in y(p)
y(2) = h(0)x(2) + h(1)x(1) = coefficient of the p² term in y(p)
y(3) = h(0)x(3) + h(1)x(2) = coefficient of the p³ term in y(p)
⋮
y(14) = h(0)x(14) + h(1)x(13) = coefficient of the p¹⁴ term in y(p)
y(15) = h(1)x(14) = coefficient of the p¹⁵ term in y(p).
Now let us decompose x(p) into a sum of polynomials as given below:

x(p) = x1(p) + p⁵x2(p) + p¹⁰x3(p) (E4)

where

x1(p) = x(0) + x(1)p + x(2)p² + x(3)p³ + x(4)p⁴
x2(p) = x(5) + x(6)p + x(7)p² + x(8)p³ + x(9)p⁴ and
x3(p) = x(10) + x(11)p + x(12)p² + x(13)p³ + x(14)p⁴
From the above it is clear that

h(p)x(p) = h(p)[x1(p) + p⁵x2(p) + p¹⁰x3(p)] (E5)
= h(p)x1(p) + p⁵h(p)x2(p) + p¹⁰h(p)x3(p) (E6)
Now consider

h(p)x1(p) = polynomial of the convolution of h and x1
= [h(0) + h(1)p][x(0) + x(1)p + x(2)p² + x(3)p³ + x(4)p⁴]
= y1(p) (say)

h(p)x2(p) = polynomial of the convolution of h and x2
= [h(0) + h(1)p][x(5) + x(6)p + x(7)p² + x(8)p³ + x(9)p⁴]
= y2(p) (say)

h(p)x3(p) = polynomial of the convolution of h and x3
= [h(0) + h(1)p][x(10) + x(11)p + x(12)p² + x(13)p³ + x(14)p⁴]
= y3(p) (say)
Therefore,

y(p) = h(p)x(p) = h(p)[x1(p) + p⁵x2(p) + p¹⁰x3(p)] (E7)
= h(p)x1(p) + p⁵h(p)x2(p) + p¹⁰h(p)x3(p)
= y1(p) + p⁵y2(p) + p¹⁰y3(p) (E8)

y(p) contains the sequence y, with y(i) coming from the coefficient of the term pⁱ in y(p). Summarising, we can obtain the output samples of the filter – the sequence y – (i) by DLC of h and x, using the convolution sum; (ii) by computing y(p) through polynomial multiplication of x(p) and h(p) from E3; or (iii) by computing the same through E7.
We can observe that y1(p) is the product polynomial of h(p)
and x1(p). Hence it can be obtained from LC(h,x1) and vice
versa. Similar statements hold for y2(p) and y3(p). We also
note that y(p) is not the direct sum of y1(p), y2(p) and y3(p).
We multiply y2(p) by p⁵ and y3(p) by p¹⁰ before the addition. Let us see the effect of these multiplications.
y1(p) is the product polynomial of h(p) – a first-degree polynomial – and x1(p) – a fourth-degree polynomial. As a result it will be a fifth-degree polynomial, so it contributes to the 0th to 5th samples of y.
y2(p) is the product polynomial of h(p) – a first-degree polynomial – and x2(p) – a fourth-degree polynomial. As a result it will be a fifth-degree polynomial. Before the addition, however, as given in E8, it is multiplied by p⁵ to get y(p). After this multiplication, the resulting polynomial contains powers of p from 5 to 10, so it contributes to the 5th to 10th samples of y.
y3(p) is the product polynomial of h(p) – a first-degree polynomial – and x3(p) – a fourth-degree polynomial. As a result it will be a fifth-degree polynomial. Before the addition, however, as given in E8, it is multiplied by p¹⁰ to get y(p). After this multiplication, the resulting polynomial contains powers of p from 10 to 15, so it contributes to the 10th to 15th samples of y.
Therefore, we can summarise the steps of the OA algorithm as follows.
1) Section the input of 15 samples into 3 (= 15/5 = L/m) non-overlapping sequences of 5 samples each: x1 from samples 0 to 4, x2 from samples 5 to 9 and x3 from samples 10 to 14. The input is sectioned like this because we fixed m = 5.
2) Find y1 = LC(h,x1), y2 = LC(h,x2) and y3 = LC(h,x3).
3) Shift y2 by 5 samples and y3 by 10 samples. Thus the range of the indices of the shifted y2 is 5 to 10 and that of the shifted y3 is 10 to 15.
4) Now add y1 and the shifted versions of y2 and y3 by adding the sample values with the same index.
5) You get samples with indices from 0 to 15. This is LC(h,x) – our original objective.
The reader can easily observe that in our example h contained only 2 samples, x contained only 15 samples and each section of the input contained only 5 samples because we initially chose l = 2, L = 15 and m = 5. These parameters were chosen to concretise the idea; l, L and m can be chosen in a quite general manner, and the arguments extend to the general case quite easily. This is what is explained in the standard books.
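The five steps above, with the example parameters l = 2, L = 15, m = 5, can be sketched in a few lines of NumPy:

```python
import numpy as np

# The worked example: l = 2, L = 15, m = 5.
h = np.array([1.0, 2.0])                # impulse response, length l = 2
x = np.arange(15, dtype=float)          # input, length L = 15
m = 5                                   # section length

# Step 1: non-overlapping sections x1, x2, x3.
sections = [x[i:i + m] for i in range(0, len(x), m)]

# Steps 2-4: convolve each section with h, shift by its offset, add.
y = np.zeros(len(x) + len(h) - 1)
for i, xi in enumerate(sections):
    yi = np.convolve(h, xi)             # length m + l - 1 = 6
    y[i * m : i * m + len(yi)] += yi    # shift by 0, 5, 10 and add

assert np.allclose(y, np.convolve(h, x))  # matches direct LC(h, x)
print(len(y))  # 16 samples, indices 0..15
```

Note how adjacent partial outputs overlap by l − 1 = 1 sample before the addition, which is exactly the overlap the method's name refers to.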
Now the question remains about step 2: the computation of the three LCs – y1 = LC(h,x1), y2 = LC(h,x2) and y3 = LC(h,x3). As we know, the LCs can be computed simply as multiplications of the corresponding polynomials. We can use that approach. Or, better still, we can compute the LCs through CCs of the sequences modified in a certain way, as explained in the next paragraph. The advantage of this approach is that CCs can be obtained much more efficiently via the FFT route. So we can use L3C to obtain the LCs of step 2. This step is generally combined in the books as part of COLS, but it is quite independent and can be used wherever LC computation is required.
We explain L3C below. Consider the case where we want to find the LC of two sequences a and b of lengths p and q, i.e.

a = a(0), a(1), …, a(p−1)
b = b(0), b(1), …, b(q−1)

Let c = LC(a,b). We want to compute c = LC(a,b) = a * b.
c can be obtained by the direct convolution sum of a and b, or from the product polynomial c(p) = a(p)b(p). a(p) is a polynomial of degree p−1 and b(p) is a polynomial of degree q−1, so the product polynomial c(p) is a polynomial of degree p+q−2 containing the p+q−1 coefficients, or samples, of c.

Let d = CC(ā,b̄) = ā ⊛ b̄, where

ā = a(0), a(1), …, a(p−1), followed by (q−1) zeros
b̄ = b(0), b(1), …, b(q−1), followed by (p−1) zeros

Thus both ā and b̄ are sequences of length p+q−1, obtained from a and b respectively by padding the appropriate number of zeros. We can show that for all finite a and b, c(p) = d(p); hence c can be obtained from d. To compute c, we resort to computing d because we can use –
(i) the circular convolution property of DFT and
(ii) the efficiency of FFT to compute DFT
The FFT is nothing but an algorithm that computes the DFT efficiently, reducing the number of multiplications from O(N²) down to O(N log₂ N). The circular convolution property states that

d = CC(ā,b̄) = ā ⊛ b̄ = IDFT(DFT(ā) × DFT(b̄)).

FFTs are used to compute the DFT and IDFT to reduce the computational operations.
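The L3C recipe above – pad both sequences to length p + q − 1, take DFTs, multiply, take the IDFT – can be sketched with NumPy's FFT, whose second argument performs the zero-padding:

```python
import numpy as np

def lc_via_cc(a, b):
    """Linear convolution via circular convolution: zero-pad both
    sequences to length p + q - 1, then use the DFT convolution
    property, computed with the FFT."""
    n = len(a) + len(b) - 1
    A = np.fft.fft(a, n)      # np.fft.fft(x, n) zero-pads x to length n
    B = np.fft.fft(b, n)
    return np.fft.ifft(A * B).real

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0]
print(lc_via_cc(a, b))        # [ 4. 13. 22. 15.]
```

The result matches np.convolve(a, b); for long sequences the FFT route is the cheaper of the two.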
III. OVERLAP AND DROP METHOD
The second method of COLS is the OD method. The logic of this technique can be explained along similar lines to the OA method given above.
In the OA method the input sections are non-overlapping but the partial output sections overlap – in the particular case we have considered, both y1 and y2 contribute to the 5th sample of y, for example – and the final output sequence is obtained by adding the partially overlapping output sections.
In the OD method the input sections overlap, and we compute the partial output sections by finding the LC of h and the input sections. We discard some samples from the beginning of each partial output section and form the final output sequence by arranging the resulting partial output sections contiguously, i.e. without overlapping.
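A minimal NumPy sketch of the OD method, under the same example parameters as before. In a streaming setting the final tail sample of y would come from one extra zero-padded input block, which is omitted here for brevity:

```python
import numpy as np

def overlap_drop(h, x, m):
    """Overlap-and-drop (overlap-save) sketch: input sections overlap
    by l - 1 samples; the first l - 1 samples of each partial output
    are dropped and the remainders are placed contiguously."""
    l = len(h)
    x_pad = np.concatenate([np.zeros(l - 1), x])   # prepend l-1 zeros
    y = []
    for start in range(0, len(x), m):
        block = x_pad[start : start + m + l - 1]   # overlapping section
        yi = np.convolve(h, block)
        y.extend(yi[l - 1 : l - 1 + m])            # drop the first l-1 samples
    return np.array(y)

h = np.array([1.0, 2.0])
x = np.arange(15, dtype=float)
y = overlap_drop(h, x, m=5)
print(np.allclose(y, np.convolve(h, x)[: len(y)]))  # True
```

Each kept sample is a steady-state output value, so no addition of overlapping outputs is needed – the dropped samples are precisely the ones contaminated by the section boundary.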
IV. CONCLUSIONS
In this paper, we have shown an alternative and better approach to DSP teaching – teaching by example and then expanding for rigour and precision – and have used the technique of COLS to demonstrate it. This method, which gives a clear understanding of the topic, should be followed by a formal presentation of the topic in the standard way, through mathematical and more formal reasoning.
REFERENCES
[1] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, 3rd ed., Prentice Hall Signal Processing Series.
[2] Monson H. Hayes, Schaum's Outlines of Theory and Problems of Digital Signal Processing, McGraw-Hill, 1999.
Abstract—In the field of image processing there are many techniques to detect edges. Some of the methods are the first derivative method, the second derivative method, Sobel, and the Laplacian of Gaussian (LoG), and these methods are used for image segmentation and object identification. In this paper we present a basic idea of how to detect edges using Simulink. Generally we detect edges of a given image using MATLAB by extracting its features, or with the Fuzzy Logic toolbox (artificial intelligence), but here we detect edges using Simulink and fuzzy logic. The main aim is to reduce the number of bits used to represent the pixel values. Experimentation is performed on a grayscale image in Simulink, MATLAB and fuzzy logic using the first derivative, second derivative and Sobel operators. The results are compared and plotted. The pixel values of the output image are reduced and encrypted.
Index Terms—Edge detection, first and second derivative,
Sobel, Simulink, Fuzzy logic, PSNR.
I. Introduction
Edge detection is an important field in image processing.
It can be used in many applications such as
segmentation, registration, feature extraction, and
identification of objects in a scene. An effective edge
detector reduces a large amount of data but still keeps
most of the important features of the image. Edge
detection refers to the process of locating sharp
discontinuities in an image. These discontinuities
originate from different scene features such as
discontinuities in depth, discontinuities in surface
orientation, and changes in material properties and
variations in scene illumination as stated in [6].
The main principle in edge detection is to analyse the pixel value of each cell and decide whether there is an edge. Edges can be modeled based on intensity profiles [1] as follows.
A) Ramp edge
Ramp edge is modeled as a gradual increase in image
amplitude from low to high level, or vice versa. The
edge is characterized by its height, slope angle and
horizontal coordinate of the slope midpoint.
B) Step edge
If the slope angle of the ramp edge equals 90 degrees the
resultant edge is called a step edge. In the digital
imaging system, step edges exist only for artificially
generated images such as test patterns and bi-level
graphics data. There is a sudden change in the pixel
value from high to low or vice-versa.
C) Line edge
A line edge is a combination of two ramp edges. The range is divided into two parts: the pixel values increase linearly in the first part and decrease linearly in the second.
D) Roof edge
In the limit, as the width of a line edge approaches zero, the resulting amplitude discontinuity is called a roof edge.
Edge Detection Using Feature Extraction
Pericherla S. K. Rohit Varma and R. Rohit, B.E III year, ECE, SDMCET, Karnataka, India
[email protected] , [email protected]
Edges are detected in four steps: smoothing, enhancement, detection and localization. Smoothing suppresses as much noise as possible without destroying the true edges. Enhancement involves applying a filter to improve the quality of the edges in the image. Detection determines which edge pixels should be discarded as noise and which should be retained (usually, thresholding provides the criterion used for detection). Localization determines the exact location of an edge.
II. Edge Detection Methods
a. First derivative principle
An edge in a continuous-domain edge segment F(x,y) can be detected by forming the continuous one-dimensional gradient G(x,y). If the gradient is sufficiently large, i.e. above some threshold value, an edge is deemed present. The classical methods for edge detection are based on the first derivative and second derivative principles. The operators used to carry out the differentiation are called gradient operators.
In the discrete domain, in terms of a row gradient G_R(j,k) and a column gradient G_C(j,k), the spatial gradient amplitude is given [2] as:

G(j,k) = [G_R(j,k)² + G_C(j,k)²]^(1/2) --eq 1

For computational efficiency, the gradient amplitude is sometimes approximated by the magnitude combination

G(j,k) = |G_R(j,k)| + |G_C(j,k)| --eq 2

Using Newton's forward and backward difference methods for calculating the derivatives of the pixels, the row gradient is given by eq 3 and the column gradient by eq 4:

G_R(j,k) = F(j,k) − F(j,k−1) --eq 3

G_C(j,k) = F(j,k) − F(j−1,k) --eq 4
If the first derivative method is applied once again to the output of the differentiation, we get the second derivative results.
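The paper carries out these steps in MATLAB/Simulink; as a language-neutral illustration, here is a NumPy sketch of the first-derivative detector using backward differences and the eq-2 magnitude combination. The test image and threshold are arbitrary:

```python
import numpy as np

def gradient_edges(F, threshold):
    """First-derivative edge detector: backward-difference row and
    column gradients, magnitude-sum approximation, thresholding."""
    G_R = np.zeros(F.shape)
    G_C = np.zeros(F.shape)
    G_R[:, 1:] = F[:, 1:] - F[:, :-1]     # F(j,k) - F(j,k-1)
    G_C[1:, :] = F[1:, :] - F[:-1, :]     # F(j,k) - F(j-1,k)
    G = np.abs(G_R) + np.abs(G_C)         # magnitude combination (eq 2)
    return (G > threshold).astype(np.uint8) * 255

# Vertical step edge between a dark half and a bright half.
F = np.hstack([np.zeros((5, 5)), np.full((5, 5), 200.0)])
edges = gradient_edges(F, threshold=100)
print(edges[2, 4], edges[2, 5])  # 0 255: edge marked only at the transition
```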
b. Second Derivative Principle
Edge points are detected at the zero crossings of the second derivative. For a continuous function the second derivative is taken according to eq 5:

G(x,y) = ∂²F/∂x² + ∂²F/∂y² --eq 5

For a discrete function the formulae are given in [1] as

G_R(j,k) = F(j,k+1) − 2F(j,k) + F(j,k−1)
G_C(j,k) = F(j+1,k) − 2F(j,k) + F(j−1,k)

G(j,k) = [G_R(j,k)² + G_C(j,k)²]^(1/2) --eq 6
c. Sobel Edge Detection Method
The Sobel operator is a discrete differentiation operator that computes an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel operator is either the corresponding gradient vector or the norm of this vector. The operator is based on convolving the image with a small, separable, integer-valued filter in the horizontal and vertical directions, and is therefore relatively inexpensive in terms of computation. It uses two 3×3 kernels which are convolved with the original image to calculate approximations of the derivatives – one for horizontal changes and one for vertical. The kernels in Table I are obtained from [4] and are employed in the algorithm.
The results show that the edges detected through the Sobel method are more accurate than those of the other methods. But if the image contains a significant amount of noise, the edges detected through the Sobel method are not accurate; in that case we detect edges through either the first derivative or the second derivative method.
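A direct NumPy sketch of the Sobel method using the two kernels of Table I (a naive double loop for clarity rather than speed; the step-edge test image is arbitrary):

```python
import numpy as np

def sobel_edges(F):
    """Convolve with the two 3x3 Sobel kernels (Table I) and return
    the gradient magnitude at each interior pixel."""
    Kx = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
    Ky = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)
    rows, cols = F.shape
    G = np.zeros((rows, cols))
    for j in range(1, rows - 1):
        for k in range(1, cols - 1):
            patch = F[j - 1:j + 2, k - 1:k + 2]
            gx = np.sum(Kx * patch)       # response to horizontal changes
            gy = np.sum(Ky * patch)       # response to vertical changes
            G[j, k] = np.hypot(gx, gy)    # gradient magnitude (eq 1 form)
    return G

F = np.hstack([np.zeros((5, 5)), np.full((5, 5), 200.0)])
G = sobel_edges(F)
print(G[2, 4], G[2, 2])  # 800.0 0.0: strong at the edge, zero in flat areas
```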
TABLE I: Sobel row and column gradient kernels
III. Methodology
Edges are detected using all three approaches, namely the first derivative, the second derivative and the Sobel operator. First the grayscale image is read, then all three approaches are applied and thresholding is done using MATLAB R2010. The peak signal-to-noise ratio (PSNR) is calculated. Using Simulink, edges are detected according to the algorithm described below.
a. In Simulink
i. Algorithm
Step 1: Select an image file in grayscale format.
Step 2: Define the Sobel gradients as 3×3 matrices.
Step 3: Perform matrix multiplication of the two matrices defined above.
Step 4: Concatenate the results obtained in step 3 sequentially.
Step 5: The edge detected image is obtained.
Fig.1: Block diagram of an edge detected image using Sobel operator in
Simulink
Fig.1 shows the Simulink toolboxes that have been used
to detect edges through Sobel edge detection method.
b. In MATLAB
i. Algorithm
These are the steps employed in [2] to find the edge detected image.
Step 1: Select an image in grayscale format.
Step 2: Define the Sobel operator matrix.
Step 3: Apply the operator to every pixel in the image to form a new matrix.
Step 4: The matrix obtained is the edge detected image of the given image.
c. Using the fuzzy toolbox
These are the steps used to detect edges using fuzzy logic, as in [8].
Step 1: Construct a fuzzy inference system that detects the edges and classifies them as small, medium and large edges.
Step 2: Give an image as input to the fuzzy inference system.
Step 3: Obtain the output and apply a threshold to the image.
The edge detected image shows the edges as white lines on a black background. Grayscale values lie between 0 and 255, where 0 represents black, moving toward white as the value increases. From experience with the images tested in this study, the best results are achieved by treating gray values from 0 to 126 as black and values from 126 to 255 as white.
The edge detected matrix has a wide range of pixel values from 0 to 255. In order to increase the accuracy we assign a threshold value: pixel values above the threshold are made white and those below are made black. The matrix then contains only the values 0 and 255.
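The thresholding step can be sketched in NumPy; the value 126 follows the observation above, and the tiny input matrix is arbitrary:

```python
import numpy as np

# Threshold an edge-magnitude image at 126: values above become
# white (255), the rest black (0).
G = np.array([[10, 130], [200, 50]], dtype=float)
binary = np.where(G > 126, 255, 0).astype(np.uint8)
print(binary)  # entries above 126 become 255, the others 0
```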
IV. Results
The results obtained after performing the edge detection techniques on the image shown below are as follows.
The two 3×3 Sobel kernels of Table I, for horizontal and vertical changes respectively, are:

 1  0 -1        1  2  1
 2  0 -2        0  0  0
 1  0 -1       -1 -2 -1
Fig. 2. Original grayscale image
Fig 2 shows the actual image whose edge has to be
detected.
Fig. 3. Edge detected image using First Derivative Principle
Fig 3 shows the edge detected image of the original
image using First Derivative Principle.
Fig. 4. Edge detected image using Second Derivative Principle
Fig 4 shows the edge detected image of the original image
using Second Derivative Principle.
Fig. 5. Edge detected image using Sobel edge detection method.
Fig 5 shows the edge detected image using the Sobel edge detection method.
Fig. 6. Edge detected image using Fuzzy logic toolbox
Fig 6 shows the output of an edge detected image that has
been obtained using Fuzzy logic toolbox. The approach that
has been used here is the First Derivative principle method.
By comparing all three methods it can be said that the edge detected image obtained using the Sobel edge detection method is the most accurate.
TABLE II: PSNR for the 3 approaches
PSNR calculated for all the images obtained from
different approach is represented in table 2.
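The PSNR values compared here can be computed as follows (a sketch; MAX is 255 for 8-bit grayscale images):

```python
import numpy as np

def psnr(original, processed, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(MAX^2 / MSE)."""
    mse = np.mean((original.astype(float) - processed.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# toy example: two flat images differing by 10 gray levels everywhere
a = np.full((4, 4), 100, dtype=np.uint8)
b = np.full((4, 4), 110, dtype=np.uint8)
val = psnr(a, b)
print(round(val, 2))
```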
The file sizes calculated after detecting edges and after applying the threshold are given in Table III.
V. Conclusion
This paper describes the detection of edges in an image using three different approaches: First Derivative, Second Derivative, and the Sobel operator. PSNR is calculated for all three approaches and compared. The Sobel operator detects edges best, as its PSNR of 6.572 is higher than that of the other two approaches.
Acknowledgment
The authors wish to thank Daneshwari I. Hatti, Assistant Professor at SDMCET, for her valuable support and assistance in carrying out this work. The authors would also like to thank the faculty of SDMCET for their support.
TABLE II: PSNR for the three approaches

METHOD                  PSNR
First Derivative        6.286
Second Derivative       6.507
Sobel gradient method   6.572
TABLE III: File size for edge-detected images

METHOD                  Original file size   Edge-detected size   Size after threshold
First Derivative        64 kB                61.5 kB              4.22 kB
Second Derivative       64 kB                58.6 kB              4.32 kB
Sobel gradient method   64 kB                62.6 kB              11.4 kB
Face Recognition In Video Surveillance Applications
Shilpa Anbalgan1, Manjunath R Kounte2, C. Manogaran3
1Student, 2Asst. Professor, 1,2Dept. of Electronic Engineering, REVA ITM, Bangalore
3Deputy General Manager, BHEL
Email: [email protected], [email protected], [email protected]
Abstract—Face recognition is rapidly finding its way into intelligent surveillance systems. Recognizing faces in a crowd in real time is one of the key features that will significantly enhance such systems. The main challenge is that the high volumes of data generated by high-resolution sensors can be computationally prohibitive for mainstream processors. In this paper we report on the prototype development of an automated face recognition system using basic high-resolution techniques. In the proposed technique, the camera extracts all the faces from the full-resolution frame and sends only the pixel information from these face areas to the main processing unit. Face recognition software running on the main processing unit then performs the required pattern recognition.
1. INTRODUCTION
Image processing is a form of signal processing for which the input is an image or video, and the output may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it.
Purpose of Image processing
The purpose of image processing is
1. Visualization – To observe objects that are not clearly visible.
2. Image sharpening and restoration – To create a better image from the available data.
3. Image retrieval - Seek for the image
of interest or region of interest.
4. Measurement of pattern – Measures
various objects in an image such as
features or any region of interest.
5. Image Recognition – Distinguish the
objects in an image.
Fig 1: System diagram showing main
computational components of the Face
recognition Surveillance System
It is the set of computational techniques for analyzing, enhancing, compressing, and reconstructing image data. Its main components are the input, in
which an image is captured through
scanning or digital photography; analysis
and manipulation of the image,
accomplished using various specialized
software applications; and output (e.g., to a
printer or monitor display). Image
processing has various applications in
wide areas, including astronomy,
medicine, industrial robotics, remote
sensing by satellites and biometrics
A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. One way to do this is by comparing selected facial features from the image against a facial database. It is typically used in security systems and can be compared to other biometrics such as fingerprint or eye-iris recognition. Recognition algorithms can be divided into two main approaches: geometric, which looks at distinguishing features, and photometric, a statistical approach that distills an image into values and compares those values with templates to eliminate variances. Popular recognition algorithms include Principal Component Analysis using eigenfaces.
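The eigenface idea mentioned above can be sketched in a few lines of NumPy. This is a generic PCA-via-SVD illustration on toy data, not the system used in this paper; all array sizes and variable names are illustrative:

```python
import numpy as np

def eigenfaces(faces, k=2):
    """Compute the top-k eigenfaces from a stack of flattened face images.

    faces: (n_samples, n_pixels) array. The SVD of the mean-centred data
    is equivalent to PCA on the covariance matrix; the rows of vt are
    the principal directions (the eigenfaces).
    """
    mean = faces.mean(axis=0)
    centred = faces - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:k]

def project(face, mean, components):
    """Represent a face by its coordinates in eigenface space."""
    return components @ (face - mean)

rng = np.random.default_rng(0)
faces = rng.random((10, 64))          # 10 toy "faces" of 64 pixels each
mean, comps = eigenfaces(faces, k=3)
proj = project(faces[0], mean, comps)
print(proj.shape)
```

Recognition then amounts to comparing these low-dimensional projections (e.g., by nearest neighbour) instead of the raw pixels.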
2. PRIOR WORK
A. Face Recognition at a Distance for Video Surveillance Applications
Face recognition at a distance is
concerned with the automatic recognition
of co-operative and non-cooperative
subjects over a wide area. The system
features predictive subject targeting and an
adaptive target selection mechanism based
on the current action and past history of
each targeted subject to help ensure that
facial images are captured for all subjects
in view. Experimental tests designed to
simulate operation in large transportation
hubs show that the system can track
subjects and capture facial images at
distances of 35–50 m and can recognize them using a commercial face recognition system at distances of 10–20 m [1].
Faces are detected using a
combination of motion detection,
background modeling and skin detection.
The camera is then directed to the detected
faces for higher resolution facial image
capture. People moving in the field of view of the stationary camera are detected and tracked. The person detection process operates on the video at about 10-15 Hz. An extended Kalman filter [2], [3] is applied to the detected persons in the ground plane. This makes the system robust to momentary occlusions and provides a velocity estimate for each tracked subject, allowing the prediction of future subject locations. Also at GE Global Research, Krahnstoever et al. [4] have developed a multi-camera tracking framework and a prototype face-capture-at-a-distance system for video surveillance applications.
B. Object Detection by a Boosted Cascade of Simple Features
This is a machine learning approach for visual object detection capable of processing images extremely fast while achieving high detection rates. The work is distinguished by three key contributions. The first is the introduction of a new image representation called the "Integral Image". The second is a learning algorithm which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers [7]. The third contribution is a method for combining increasingly complex classifiers in a "cascade", which allows background regions of the image to be quickly discarded while spending more computation on promising object-like
regions. In the domain of face detection
the system yields detection rates
comparable to the best previous systems.
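The integral image mentioned above lets the sum of any rectangular region be computed with at most four array lookups, which is what makes the simple features so cheap to evaluate. A minimal sketch:

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of all pixels above and to the left of (y, x), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using four lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
s = rect_sum(ii, 1, 1, 2, 2)   # sum of the central 2x2 block
print(s)
```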
The speed of the cascaded detector is directly related to the number of features evaluated per scanned sub-window. This is possible because a large majority of sub-windows are rejected by the first or second layer in the cascade. The detector is roughly 15 times faster than the Rowley-Baluja-Kanade detector [8] and about 600 times faster than the Schneiderman-Kanade detector [9].
D. Reliable Face Recognition for CCTV
One ambition of CCTV is to help prevent terrorism, and a key enabling technology is reliable face recognition. With passport-quality photographs, current face recognition technologies can approach 90% recognition accuracy, yet trials show that performance drops to only 15-25% when there are significant changes in lighting, pose, and facial expression. We describe the approach taken by the authors to address these issues and provide reliable face recognition performance in real time. Our
system has three major components
comprising: a) A Viola-Jones face
detection module based on cascaded
simple binary features to rapidly detect
and locate multiple faces from the input
still image or video sequences, b) A Pose
Normalization Module to estimate facial
pose and compensate for extreme pose
angles, and c) Adaptive Principal
Component Analysis to recognize the
normalized faces. Experimental results
show that our approach can achieve good
recognition rates on face images across a
wide range of head poses with different
lighting and expressions.[10]
3. PROPOSED SYSTEM
Face detection against a complex background is a challenging task. The complexity in such detection systems stems from variations in image background, view, illumination, articulation, and facial expression. This paper introduces a new algorithm for face detection based on skin color detection and template matching. First, the non-skin parts of the picture are removed, and the result is then passed to the template matching stage. Easy implementation and high detection accuracy are the features of our work.
4. FACE SEGMENTATION
USING COLOR MODEL
TECHNIQUE
Among the standard color models (RGB, CMYK, HSV, YCbCr), we have used HSV for face recognition to extract features of the person in a video surveillance system. After several experiments, we used only the Saturation (S) and Value (V) components of the person and obtained a threshold value for each person, which is stored as a template.
This thresholding process is done in MATLAB. After the threshold values of all the data are collected, their uniform average is taken and used for further processing.
The collected HSV thresholds are then used in SIMULINK to segment the skin color from the background, as can be observed in the video screenshots of Fig. 2 and Fig. 3.
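The S/V thresholding described above can be sketched as follows. This is a NumPy illustration of the idea, not the MATLAB/SIMULINK implementation used here, and the range values are illustrative assumptions, not the paper's stored templates:

```python
import numpy as np

def skin_mask(hsv, s_range=(0.15, 0.65), v_range=(0.35, 0.95)):
    """Boolean mask of pixels whose Saturation and Value fall inside the
    template ranges (illustrative values; H is ignored, as in the text)."""
    s, v = hsv[..., 1], hsv[..., 2]
    return ((s >= s_range[0]) & (s <= s_range[1]) &
            (v >= v_range[0]) & (v <= v_range[1]))

# two pixels: one inside the ranges, one outside
hsv = np.array([[[0.1, 0.4, 0.6], [0.1, 0.9, 0.1]]])
mask = skin_mask(hsv)
print(mask)
```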
Fig. 2: Screenshot of video
Fig 3. Segmented video using HSV
After segmenting the person from the background, we overlay the result on the original video to obtain the person's face in color against an empty background, as can be observed in Fig. 4.
Fig.4 Obtaining face segmented from
background.
5. FACE IDENTIFICATION
After segmentation, we apply an eigenface technique using PCA to identify the person in the video (Fig. 5 and Fig. 6).
Fig.5 Maximum of white pixels
represents the person identification
Fig.6 Boundary box to identify face
6. CONCLUSION
We have presented approaches to the automatic detection of human faces in color images. The proposed approach consists of two parts: human skin segmentation, to identify probable regions corresponding to human faces, and view-based face detection, to further identify the location of each face. The human skin segmentation employs a model-based approach to represent and differentiate the background colors and skin colors. The
face detection method based on template matching is a promising approach given its satisfactory results. The algorithm retains its high performance even in the presence of structural objects such as beards, spectacles, and mustaches. However, more features and a more sophisticated algorithm need to be added in order to use it for more general applications.
REFERENCES
[1] W. Wolf, B. Ozer, and T. Lv, "Smart cameras as embedded systems," Computer, vol. 35, pp. 48-53, 2002.
[2] S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems. Artech House Publishers, 1999.
[3] N. Krahnstoever, P. Tu, T. Sebastian, A. Perera, and R. Collins, "Multiview detection and tracking of travelers and luggage in mass transit environments," in Proc. Ninth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), 2006.
[4] N. Krahnstoever, T. Yu, S.-N. Lim, K. Patwardhan, and P. Tu, "Collaborative real-time control of active cameras in large scale surveillance systems," in Proc. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications (M2SFA2), October 2008.
[5] J. Harris, C. Koch, and J. Luo, "Resistive fuses: Analog hardware for detecting discontinuities in early vision," in Analog VLSI Implementation of Neural Systems, ed. C. Mead and M. Ismail, pp. 27-55, Kluwer Academic Publishers, 1989.
[6] P. C. Yu, S. J. Decker, H. S. Lee, C. G. Sodini, and J. L. Wyatt, Jr., "CMOS resistive fuses for image smoothing and segmentation," IEEE J. Solid-State Circuits, vol. 27, pp. 545-553, 1992.
[7] Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational Learning Theory: Eurocolt '95, pp. 23-37, Springer-Verlag, 1995.
[8] H. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, pp. 22-38, 1998.
[9] H. Schneiderman and T. Kanade, "A statistical method for 3D object detection applied to faces and cars," in International Conference on Computer Vision, 2000.
[10] P. B. Gibbons, B. Karp, Y. Ke, S. Nath, and S. Seshan, "IrisNet: An architecture for a worldwide sensor web," IEEE Pervasive Computing, vol. 2, no. 4, Oct.-Dec. 2003.
Pulse Compression for Target Tracking
Harish Naidu V1, S. N. Prasad2
1M.Tech, VLSI Design and Embedded Systems (ECE), VTU, Reva Institute of Technology & Management (RITM), Bangalore, India-560064
2Associate Professor, Department of ECE, VTU, Reva Institute of Technology & Management (RITM), Bangalore, India-560064
[email protected], [email protected]
Abstract—Pulse compression plays an important role in the design of radar systems. Pulse compression using linear frequency modulation (LFM) techniques is very popular in modern radar. Linear frequency modulation is used to resolve two small targets located at long range with very small separation between them. Pulse compression is commonly used in radar applications to improve range resolution while keeping the transmitted peak power low. This is achieved by modulating the transmitted pulse and then correlating the received signal with the transmitted pulse. Range resolution is the ability of radar to resolve two targets on the same bearing but at slightly different ranges. The primary focus of this paper is to implement pulse compression using digital technology. With high-performance digital computing, the convolution operation required for pulse compression can be done digitally. This digital approach eliminates the calibration requirements and the limited reconfigurability of analog approaches. The digital processing is done in the frequency domain: a Fast Fourier Transform (FFT) is used to transform both the reference waveform and the return signal waveform into the frequency domain. The complex conjugate of the reference waveform's FFT is then multiplied (point by point) with the returned signal waveform's FFT. The result is transformed back into the time domain (with an inverse FFT) to produce the output signal, with peaks that represent the targets.
Keywords—Pulse compression, Linear Frequency Modulation, Matched filtering, Digital down conversion (DDC), Range resolution
1. INTRODUCTION
In modern pulsed radar, range resolution (ΔR) is proportional to the pulse duration (τ), so improved range resolution necessitates a shorter pulse duration. At the same time, the energy (E) of the signal is also proportional to the pulse duration (τ), and the detection probability depends on it, so to improve detection the pulse duration should be longer. To resolve these two conflicting requirements, pulse compression is used. Pulse compression is usually done through frequency modulation or phase modulation, both of which are very popular in radar. Frequency modulation can be classified as Linear Frequency Modulation (LFM) and Nonlinear Frequency Modulation (NLFM).
LFM is the most popular radar waveform due to its good range resolution and Doppler sensitivity. LFM waveform generation schemes are classified into analog and digital techniques. Analog pulse compression techniques are based on surface acoustic wave (SAW) devices. However, the design and fabrication of a SAW device for a large time-bandwidth product chirp signal is very complex and expensive, while the digital technique offers programmability, flexibility, better stability, accuracy, and repeatability. This paper describes LFM generation and implementation in a Field Programmable Gate Array (FPGA).
2. LINEAR FREQUENCY MODULATION(LFM)
In LFM, the frequency of the modulating signal increases linearly during the pulse duration. In a linear chirp, the time-domain chirp signal is given by equation (1.1):

S(t) = exp(jΦ(t))                              (1.1)

where Φ(t) is the instantaneous phase, given by:

Φ(t) = 2π(f0·t + (k/2)·t²)                     (1.2)
Figure 1: Typical LFM waveforms: (a) up-chirp, (b) down-chirp, where f0 is the centre frequency (at time t = 0) and k is the chirp rate (the rate of frequency increase).
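A baseband LFM chirp following equations (1.1) and (1.2) can be generated as below. This is a NumPy sketch (the paper generates the waveform in Scilab); the sample rate and chirp rate are illustrative values:

```python
import numpy as np

def lfm_chirp(f0, k, duration, fs):
    """Complex LFM chirp s(t) = exp(j*2*pi*(f0*t + 0.5*k*t^2))."""
    t = np.arange(int(duration * fs)) / fs
    phase = 2 * np.pi * (f0 * t + 0.5 * k * t ** 2)
    return np.exp(1j * phase)

fs = 1000.0                                   # sample rate (Hz), illustrative
# 0.1 s chirp sweeping from 0 Hz to k*duration = 400 Hz
chirp = lfm_chirp(f0=0.0, k=4000.0, duration=0.1, fs=fs)
print(len(chirp))
```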
3. MATCHED FILTER
A matched filter can be implemented in the time domain or the frequency domain. This paper mainly analyses the frequency-domain matched filter, which realizes pulse compression efficiently. The matched filter algorithm process is shown in Figure 2.
Figure 2: Matched filter algorithm process
Matched filtering correlates the received and transmitted pulses in the frequency domain. Echoes correlating with the transmitted pulse produce a high peak; others are suppressed. An IFFT brings the signal back to the time domain.
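The frequency-domain matched filtering described above amounts to multiplying the return's FFT by the conjugate of the reference's FFT and transforming back. A minimal NumPy sketch (the chirp parameters and the 40-sample echo delay are illustrative):

```python
import numpy as np

def pulse_compress(rx, ref):
    """Correlate the received signal with the transmitted reference via
    the frequency domain: IFFT(FFT(rx) * conj(FFT(ref)))."""
    n = len(rx)
    spec = np.fft.fft(rx, n) * np.conj(np.fft.fft(ref, n))  # ref is zero-padded to n
    return np.fft.ifft(spec)

# reference: short complex chirp; received signal: the same chirp delayed
t = np.arange(64) / 64.0
ref = np.exp(1j * 2 * np.pi * (4 * t + 8 * t ** 2))
rx = np.zeros(256, dtype=complex)
rx[40:40 + 64] = ref                     # echo delayed by 40 samples
out = np.abs(pulse_compress(rx, ref))
print(int(np.argmax(out)))               # peak index marks the echo delay
```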
4. SYSTEM OVERVIEW
Figure 3: Digital receiver subsystem block diagram (DRS)
The pulse compression module is implemented in the Virtex-5 SX95T FPGA of the PMC-E2202. The PMC-E2202 is a 16-bit digital receiver mezzanine card consisting of 4 ADC channels with a sampling rate of up to 160 Msps and a Xilinx Virtex-5 SX95T FPGA. The card also has 4 DAC channels that can be used to generate test patterns for testing the ADC. A brief overview of the data path of one channel is shown below. Two channels are implemented in one E2202 card, so two E2202 cards are used for the four channels. The data path is the same for all channels.
Figure 4: Data flow in FPGA
Digital down Conversion (DDC)
Figure 5: DDC
The function of a digital down converter is to translate the frequency of a signal either to a lower, intermediate frequency or to baseband. The usual reason for this is to reduce the data rate required to represent the desired signal.
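Conceptually the DDC mixes the band of interest to baseband with a complex exponential, low-pass filters, and decimates. A NumPy sketch of the idea (not the FPGA core; block averaging stands in for a proper FIR low-pass filter, and all parameters are illustrative):

```python
import numpy as np

def ddc(signal, f_c, fs, decim=4):
    """Digital down-conversion sketch: mix to baseband, crudely low-pass
    by block averaging, then decimate by `decim`."""
    n = np.arange(len(signal))
    baseband = signal * np.exp(-2j * np.pi * f_c * n / fs)  # frequency shift
    trimmed = baseband[: len(baseband) // decim * decim]
    # block-average as a stand-in for a proper FIR low-pass filter
    return trimmed.reshape(-1, decim).mean(axis=1)

fs = 1000.0
n = np.arange(1000)
carrier = np.cos(2 * np.pi * 100.0 * n / fs)   # 100 Hz tone
out = ddc(carrier, f_c=100.0, fs=fs, decim=4)  # tone lands at DC
print(len(out))
```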
The complex data stream from the digital down converter flows into the pulse compression unit. This unit:
1. Performs a time-domain multiplication, to reduce the magnitude of the side lobes.
2. Performs an FFT operation, to transform the time-domain complex sample stream into a frequency-domain representation of the return signal.
3. Performs a single point-by-point multiplication (in the frequency domain) by the transverse equalization filter coefficients (to compensate for the non-flat response of the analog receiver on the front end) and by the conjugated coefficients of the FFT of the reference waveform (to detect correlations with the transmitted waveform).
4. Performs an inverse FFT operation, to generate a time-domain data stream with peaks that correspond to targets.
Figure 6: Pulse compression module block diagram
The pulse compression module takes the output of the DDC core and correlates it with the waveform data stored in the ROM. Correlation is performed in the frequency domain, hence FFT cores and complex multipliers are used in the design.
Block Description
1. Both the FFT and IFFT cores are configured to a specific size (N-point) by the Ddc_fft_data_controller_interface, depending on the number of samples acquired.
2. The DDC output, with or without zero padding, is fed to the FFT core to compute the FFT.
3. Simultaneously, the data from the TX ROM is popped out one sample at a time and multiplied (complex multiplication) with the corresponding FFT output sample.
4. The complex product is then fed to the IFFT module to compute the final pulse compression output.
TX-rom Module
The TX ROM module is an important block that contains the stored transmit waveform, which is correlated with the received waveform during pulse compression, together with a small addressing logic that selects a TX waveform of a specific size from the generalized waveform stored in the ROM.
Tool used
SCILAB: Scilab is a freely distributed open-source scientific software package. It is similar to, and almost as powerful as, the commercial product MATLAB.
5. RESULTS
Transmitted (reference) waveform coefficients are processed and stored offline using Scilab and the Xilinx ROM generator. The TX ROMs of both the I and Q channels contain the stored ideal chirp waveform, which is taken as the transmit waveform. A Hanning window is applied offline in Scilab to reduce sidelobes. The FFT conjugate is computed offline in Scilab and a ROM coefficient file is generated, which is given as input to the ROM generator. The
complex conjugate of the reference waveform's FFT is then multiplied (point by point) with the returned signal waveform's FFT. The result is transformed back into the time domain (with an inverse FFT) to produce the output signal, with peaks that represent the targets.
The following figures show the snapshot of the LFM
signal and pulse compression output done in SCILAB.
Figure 7: LFM pulse envelope
Figure 8: LFM pulse spectrum
Figure 9: Chirp I and Q channel
Figure 10: Pulse compression output
The following figure shows the snapshot of zero padded
output from FIFO using Xilinx ISim
Figure11: Waveform of Zero padded output from the FIFO
6. ADVANTAGES
Pulse compression increases the range resolution as well as the signal-to-noise ratio (SNR). The main advantages of LFM are that it is quite insensitive to Doppler shifts, it is the easiest waveform to generate, and a variety of hardware is available to form and process it.
7. CONCLUSION
Pulse compression allows coverage of a large range area using reduced transmitter power while achieving high range resolution. It is best for resolving overlapping returns coming from closely spaced targets. The cost of applying pulse compression is paid in the form of complexity added to both transmitter and receiver. The other challenge is proper suppression of range side lobes. The advantages generally outweigh the disadvantages, so pulse compression is the best technique for radar systems where the main challenges are transmit power and high range resolution.
ACKNOWLEDGEMENT
This work has been fully supported by Mistral Solutions Private Limited, Bangalore, and the E & C department of RITM, Bangalore.
EFFICIENT SORTING AND COUNTING OF
GRAINS USING IMAGE PROCESSING
Manoj N1
Department of E & C,
Siddaganga Institute of Technology,
Tumkur -572 103, Karnataka, INDIA
[email protected]
Y Harshalatha2
Asst. Professor
Department of E & C,
Siddaganga Institute of Technology,
Tumkur -572 103, Karnataka, INDIA
[email protected]
Partha Das3
R & D Engineer,
Fowler Westrup (INDIA) Pvt. Ltd.,
Bommasandra Industrial Area,
Bangalore, Karnataka, INDIA
[email protected]
Abstract—With increased expectations for food products of high quality and safety standards, the need for accurate, fast and objective determination of these characteristics in food products continues to grow. Computer vision provides an automated, non-destructive and cost-effective technique to sort good and defected grains based on their color. Many systems have already been developed to sort grains based on color, but these systems fail to detect minute color defects. In this paper, we present a set of image processing algorithms that accurately differentiate all types of color-defected grains. When the images of grains are captured using a line scan CCD camera, the grains in the image contain a non-uniform distribution of gray values, hence a Gaussian-based smoothening algorithm is implemented to overcome the non-uniform distribution of gray values inside the grain. A modified adaptive thresholding algorithm is proposed to segment the grains in the image based on their gray-scale values. Finally, the good and defected grains are counted using a connected component algorithm and separated by using air valves.
Keywords: Image processing, CCD line scan camera, computer vision, falling grains.
I. INTRODUCTION
The basis of quality assessment is often subjective, with attributes such as appearance, smell, texture, and flavor frequently examined by human inspectors, but human perception can easily be fooled. Together with high labor costs, the inconsistency and variability associated with human inspection accentuate the need for objective measurement systems. Automatic sorting systems, mainly based on image processing, have therefore been investigated for sorting agricultural and food products.
The grain samples are fed into the input hopper automatically, as shown in Fig 1. Then, via the in-feed vibrator, the grains are fed onto flat, channeled gravity chutes. The surface of the chutes is smooth enough to reduce the friction between the grains and the chutes. The grains then pass into an optical inspection area, where a decision on whether to accept or reject each grain is made. To make the right decision, the image captured by the camera in the optical inspection area is processed. In the inspection area, two cameras are placed for each chute: one views the front portion of the grain and the other views the rear portion at the same time. Each camera has two front illuminators and one rear illuminator. The front illuminators illuminate the grain, and the rear illuminator is used to remove the shadow of the grain. The front illuminators are blue for rice grains and red for masoor dhal grains; the rear illuminator is white for both. Based on the images captured in the inspection area, the processors decide whether to accept or reject each grain. If the decision is to reject, a signal is sent to the ejectors to blow air at that defected grain. So finally the good and
defected grains will be separated. The images are acquired using a CCD line scan camera; each line scan gives an image of size 1×2048 pixels. Here we consider four line scans as one frame, and for further processing a 2048×2048 image is considered.
Fig 1: Working principle of the color sorter
The goal of this paper is to present a computational algorithm as well as an efficient decision support tool to
automatically detect all types of color defects in the grains.
Specifically, the pale yellow defect in rice grains and the yellow defect in masoor dhal have to be detected accurately. We exploit the pixel information as a reliable feature for deciding whether a grain is good or defected. The pixels belonging to the same grain may vary in their gray values because of the nature of the grain, which reduces the accuracy of detecting defected grains; hence we propose a Gaussian-based smoothening algorithm to distribute the gray values uniformly within each grain. A modified adaptive thresholding algorithm is then applied to segment the good and defected grains in the image based on their gray-level values. According to the segmentation, the gray-scale image is converted into a binary image in which only the defected grains are marked dark, while the good grains merge with the white background. The device then performs the sorting by using the air valves to blow off the defected grains. The numbers of good and defected grains are counted using a connected component algorithm. The paper is organised as
follows: We will briefly review some related methods in
section II, methodology of the proposed work is discussed in
section III, experimental results are shown in section IV,
conclusion along with the future work are given in section V.
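The connected component counting mentioned above can be sketched as a simple breadth-first labeling of a binary image (a pure-Python/NumPy illustration using 4-connectivity, not the production implementation):

```python
import numpy as np
from collections import deque

def count_components(binary):
    """Count 4-connected components of True pixels in a binary image."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y, x] and not seen[y, x]:
                count += 1                    # found an unvisited blob
                q = deque([(y, x)])
                seen[y, x] = True
                while q:                      # flood-fill this blob
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w and
                                binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
    return count

img = np.array([[1, 1, 0, 0],
                [0, 0, 0, 1],
                [1, 0, 0, 1]], dtype=bool)
n_blobs = count_components(img)
print(n_blobs)
```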
II. LITERATURE SURVEY
In this application the images obtained from the CCD line scan camera are gray-scale images, so we can divide the pixels in the image into two dominant groups according to their gray level. These gray levels may serve as "detectors" to distinguish between background and objects in the image. On the other hand, if the image is one of smooth-edged objects, it will not be a pure black-and-white image, and we would not be able to find two distinct gray levels characterizing the background and the objects. Fig 2 shows the grain image and its histogram plot. A solution in the literature to this problem is to select a gray level T between those two dominant levels to serve as a 'threshold' distinguishing the two classes (objects and background). Using this threshold, a new binary image can be produced in which grain kernels are painted completely dark and the remaining pixels are white. But this approach is not suitable for segmenting the defects in the grain, because the defects have several gray levels. In the literature this principle is therefore generalized to deal with images having several dominant gray levels: more than a single threshold is used to classify the image's components. Such an approach is referred to as multilevel thresholding. But multilevel thresholding is not always the best solution for segmenting an image having several gray levels.
(a) (b)
Fig 2: (a) original image (b) Histogram plot of (a).
In fact, multilevel thresholding usually gives poor results [4].
The reason for this is the difficulty of globally establishing
multiple thresholds that effectively isolate regions of interest,
especially when the number of corresponding histogram
modes (dominant areas) is large. Furthermore, the
establishment of a globally predefined set of thresholds cannot
take into consideration the variations within the image.
The image considered here has several gray values, so these drawbacks cannot be overcome in a global context: no single threshold fits the entire image. This leads to the conclusion that a more local threshold must be used.
III. METHODOLOGY
A. WORK FLOW
Fig 3: Work flow of proposed work: Image acquisition → Gaussian smoothing → Modified adaptive thresholding → Counting of grains → Decision making → Accept / Reject
B. GAUSSIAN SMOOTHING
A Gaussian is an ideal filter in the sense that it
reduces the magnitude of high spatial frequencies in an image
proportional to their frequencies. That is, it reduces magnitude
of higher frequencies more. This will be at the cost of more
computation time when compared to mean filtering. A
Gaussian extends to infinity in all directions, but because it
approaches zero exponentially, it can be truncated to three or
four standard deviations away from its center without affecting
the result noticeably. Speed up is achieved by splitting a 2-D
Gaussian into two 1-D Gaussians, G(x, y) = G(x) G(y), and
carrying out filtering in 1-D, first row by row and then column
by column. Gaussian filters can also be designed by computing the mask weights directly from the discrete Gaussian distribution

G(i, j) = exp(−(i² + j²) / (2σ²)) .......... (1)

Choosing a value for σ, we can evaluate equation (1) over an n x n window to obtain a kernel, or mask, for which the value at [0,0] equals 1. For example, choosing σ² = 2 and n = 7, equation (1) yields the array shown in Table 1. However, we desire the filter weights to be integer values for ease of computation. Therefore, we take the value at one of the corners of the array (0.011) and choose k = 1/0.011 ≈ 91 so that this value becomes 1. We get
Table 1: Array of mask
Now, by multiplying the rest of the weights by k and rounding, we obtain Table 2.
Table 2: Gaussian Kernel
This is the resulting convolution mask for the Gaussian filter.
However, the weights of the mask do not sum to 1. Therefore, when performing the convolution, the output pixel values must be normalized by the sum of the mask weights, to ensure that regions of uniform intensity are not affected, i.e.

g(x, y) = (1 / Σ w(i, j)) · Σi Σj w(i, j) f(x + i, y + j)

where the weights w(i, j) are all integer values and their sum here is 1115.
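The construction of Tables 1 and 2 can be sketched in a few lines of Python (an illustrative sketch, not the paper's code; note that the tabulated values correspond to 2σ² = 4, i.e. σ² = 2):

```python
import math

def gaussian_integer_kernel(n=7, sigma2=2.0):
    """Sample G(i, j) = exp(-(i^2 + j^2) / (2*sigma^2)) over an n x n window,
    round to 3 decimals (Table 1), then scale so the corner value becomes 1
    and round to integers (Table 2)."""
    half = n // 2
    g = [[round(math.exp(-(i * i + j * j) / (2.0 * sigma2)), 3)
          for j in range(-half, half + 1)]
         for i in range(-half, half + 1)]
    k = 1.0 / g[0][0]                      # corner value 0.011 -> k ~ 91
    return [[int(round(k * v)) for v in row] for row in g]

kernel = gaussian_integer_kernel()
norm = sum(sum(row) for row in kernel)     # divisor used after convolution
```

The center weight comes out as 91 and the weights sum to 1115, which is the normalization constant applied to the convolution output.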
C. IMAGE SEGMENTATION
Image segmentation classifies or clusters an image into
several parts (regions) according to the feature of image, for
example, the pixel value or the frequency response. Image
segmentation is useful in many applications. It can identify the
regions of interest in a scene or annotate the data. Often when
approaching a Computer-Vision issue, the first problem we
face is that of Segmentation. That is, in order to extract
valuable information from the image at hand, we first need to
divide the image into distinctive components, which can then
be further analyzed. This is needed, in order to separate the
"interesting" components from the subordinate ones, since
computers have much difficulty in performing classification in
comparison with the Human brain.
1) Adaptive Thresholding:
In order to overcome the drawbacks mentioned above, and since no single global threshold fits the entire image, a more local threshold must be used. Accordingly, the general definition of a threshold can be written in the following manner:

T = T[x, y, p(x, y), f(x, y)]

where f(x, y) is the gray level of point (x, y) in the original image and p(x, y) denotes some local property of that point. When T depends only on the gray level at the point, the definition degenerates into a simple global threshold. In practice, p(x, y) is one of the more important components in the calculation of
the threshold for a certain point. In order to take into consideration the influence of noise or illumination, the
calculation of this property is usually based on an environment
of the point at hand. An example of a property may be the
average gray-level in a predefined environment, the centre of
which is the point at hand.
Threshold calculation techniques: There are two main approaches to calculating the threshold for a certain point in the image: one is the Chow and Kaneko approach, the other is local thresholding. Both approaches are based on the assumption that smaller image regions are more likely to have approximately uniform illumination, and are thus more suitable
Table 1: Array of mask (values of equation (1) for σ² = 2, n = 7)

 i\j    -3    -2    -1     0     1     2     3
 -3   .011  .039  .082  .105  .082  .039  .011
 -2   .039  .135  .287  .368  .287  .135  .039
 -1   .082  .287  .606  .779  .606  .287  .082
  0   .105  .368  .779 1.000  .779  .368  .105
  1   .082  .287  .606  .779  .606  .287  .082
  2   .039  .135  .287  .368  .287  .135  .039
  3   .011  .039  .082  .105  .082  .039  .011

Table 2: Gaussian Kernel (integer weights, k ≈ 91)

 i\j  -3  -2  -1   0   1   2   3
 -3    1   4   7  10   7   4   1
 -2    4  12  26  33  26  12   4
 -1    7  26  55  71  55  26   7
  0   10  33  71  91  71  33  10
  1    7  26  55  71  55  26   7
  2    4  12  26  33  26  12   4
  3    1   4   7  10   7   4   1
for thresholding. Each region is treated like an independent
image, and a "global" threshold is computed for it.
The Chow and Kaneko Approach: According to Chow and Kaneko, the original image is divided into an array of
overlapping sub-images. A gray-level distribution histogram is
produced for each sub-image, and the optimal threshold for
that sub-image is calculated based on this histogram. Since the
sub-images overlap, it is then possible to produce a threshold
for each individual pixel by interpolating the thresholds of the
sub-images. This method gives reasonable results, but its
major drawback is the fact that it requires much computation.
This causes it to be too slow and heavy for real-time
applications (i.e. for use in computer vision).
The Local Approach: An alternative approach is to statistically
examine the intensity values of the local neighbourhood of
each pixel. The first problem facing us when choosing this
method is the choice of statistic by which the measurement is
made. The appropriate statistic may vary from one image to
another, and is largely dependent on the nature of the image.
For example, if the image contains a strong illumination-
gradient, then use of the average may prove to be effective in
eliminating the ill influence. A few examples of statistics
commonly used in the calculation of the threshold are: the
average, the median, and the average between the minimal and
the maximal gray level in the neighbourhood. As is clear from Fig 4, adaptive thresholding is successful in extracting the grain kernels from the image despite the strong illumination gradient. On the other hand, the result is not very pleasing in background areas. The reason for this phenomenon is that for pixels centred in an environment containing enough background and foreground (grain), the selected threshold falls at about the middle between these two extremes, thus separating them nicely. For pixels centred in an environment of background only, the range of gray levels within the environment is very small, and the result is that the average is very close to the value of the centre pixel; thus an unsuitable threshold value is computed.
Fig 4: Adaptive thresholding
2) Modified adaptive thresholding:
In order to fix the drawback of adaptive thresholding, a
combination of adaptive and global threshold can be
employed. If we compute the threshold of a pixel as the
average of its environment minus a predefined fixed threshold,
then pixels centred in a relatively uniform environment would
be classified as background, which yields a good result.
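A sketch of this modified scheme (illustrative Python on a nested-list gray image; the neighbourhood size and the fixed constant c are assumed values, not taken from the paper):

```python
def modified_adaptive_threshold(image, window=3, c=10):
    """Binarize: a pixel is marked dark (defect, 0) when it is darker than
    the mean of its (2*window+1)^2 neighbourhood minus a fixed constant c;
    otherwise it is background / good grain (1)."""
    h, w = len(image), len(image[0])
    out = [[1] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # mean of the local neighbourhood, clipped at the image borders
            ys = range(max(0, y - window), min(h, y + window + 1))
            xs = range(max(0, x - window), min(w, x + window + 1))
            vals = [image[yy][xx] for yy in ys for xx in xs]
            mean = sum(vals) / len(vals)
            if image[y][x] < mean - c:
                out[y][x] = 0      # defective (dark) pixel
    return out
```

On a uniform region the local mean equals the pixel value, so no pixel falls below mean − c and everything is classified as background, which is exactly the fix described above.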
D. DECISION MAKING AND EJECTION PRINCIPLE
After applying the modified adaptive thresholding to the grain image, the gray scale image is converted into a binary image in which all dark pixels are considered to belong to defective grains. When a certain number of continuous dark pixels is reached, a signal is passed to the ejectors to blow air on that set of pixels. The ejection system
contains a number of valves through which air is blown.
Defects are rejected from the product stream by pulses of compressed air. These pulses are accurately aimed at the unwanted items by nozzles, a compressed air source being connected to the nozzles via a rigid duct and switched on and off by high-speed valves [1]. The principal aim of a single ejection event is to remove a defective grain.
E. COUNTING OF GRAINS
The counting of grains is done by analyzing the connected
components. A connected component in a binary image is a
set of pixels that form a connected group. Connected
component labelling is the process of identifying the
connected components in an image and assigning each one a
unique label. Properties of Connectivity: For simplicity, we
will consider a pixel to be connected to itself (trivial
connectivity). In this way, connectivity is reflexive. It is pretty
easy to see that connectivity is also symmetric: a pixel and its
neighbour are mutually connected. 4-connectivity and 8-
connectivity are also transitive: if pixel A is connected to pixel
B, and pixel B is connected to pixel C, then there exists a
connected path between pixels A and C. A relation (such as
connectivity) is called an equivalence relation if it is reflexive, symmetric, and transitive.
Connected Component Labelling: Finding all equivalence classes of connected pixels in a binary image is called connected component labelling. The result is another image in which everything in one connected region is labelled "1" (for example), everything in another connected region is labelled "2", and so on.
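The labelling described above can be sketched with a breadth-first search (illustrative Python; a two-pass union-find labelling is the other common choice):

```python
from collections import deque

def label_components(binary, connectivity=4):
    """Label the 4- or 8-connected components of foreground (value 1)
    pixels via breadth-first search; returns (label image, count)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    if connectivity == 4:
        nbrs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        nbrs = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)]
    current = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 1 and labels[y][x] == 0:
                current += 1                      # start a new component
                labels[y][x] = current
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in nbrs:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current
```

The returned count is the number of grains (or defect blobs) in the binary image.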
IV. EXPERIMENTAL RESULTS
The images captured with the developed image acquisition board were processed by the proposed algorithms and analyzed in MATLAB 7.0. The image is first passed through the proposed Gaussian filter to smooth the image and remove noise. The proposed modified adaptive thresholding is then applied to this smoothed image and compared with the conventional global threshold approach. As an example, we have considered images of Masoor dhal and rice. Masoor dhal exhibits three types of color defects: brown, green and yellow, while the good grain is red. Yellow and red have nearby gray values, so it is difficult for the system to differentiate between these two colors when global thresholding is applied. Rice grains contain pale yellow,
orange, chalky and black defects. Here pale yellow and good grains have nearby gray values, so they are difficult to differentiate. The proposed algorithms differentiate more accurately than conventional methods.
V. CONCLUSION AND FUTURE WORK
In this paper, we have described recent improvements in the
algorithm of the color sorter to obtain more accuracy without
compromising speed. These improvements include the
removal of noise from the image and the use of modified
adaptive thresholding for image segmentation. The experimental results show that the proposed set of algorithms gives efficient segmentation of defective grains from good grains. The proposed algorithm is designed to work well in real-time applications: it gives accurate results in the same time that conventional color sorters would take. To obtain more efficiency in sorting good and defective grains,
two more algorithms can be applied. Image enhancement
algorithm and separation of overlapping algorithms can be
applied on the gray scale image [2]. The grain kernels sliding
down from the chute will be overlapped sometimes, so in the
image the grain kernels are not clearly viewed, hence a
separation of overlapping grain kernel algorithm can be
applied on the image to separate the kernels from one another.
Image enhancement processes consist of a collection of techniques that seek to improve the visual appearance of the image.
Hardware implementation can be done on FPGA or DSP
chips.
REFERENCES
[1] G. Hamid, M. J. Honeywood, S. Mills, S. C. Bee, W. He, "Optical sorting for the cereal grain industry", Research and Development Department, Sortex Ltd, London, England, UK.
[2] Lei Yan, Sang-Ryong Lee, Seung-Han Yang, Choon-Young Lee, "CCD Rice Grain Positioning Algorithm for Color Sorter with Line CCD Camera", Department of Mechanical Engineering, Kyungpook National University, Daegu, Korea.
[3] Weixing Wang, "Image Analysis of Grains from a Falling Stream", Fourth International Conference on Image and Graphics.
[4] Ping-Sung Liao, Tse-Sheng Chen and Pau-Choo Chung, "A Fast Algorithm for Multilevel Thresholding", Journal of Information Science and Engineering 17, 713-727 (2001).
[5] Tadhg Brosnan, Da-Wen Sun, "Improving quality inspection of food products by computer vision - a review", FRCFT Group, Department of Agricultural and Food Engineering, University College Dublin, National University of Ireland, Earlsfort Terrace, Dublin 2, Ireland. Received 29 April 2002; accepted 6 May 2003.
[6] E. R. Davies, "Computer and Machine Vision: Theory, Algorithms and Practicalities".
[7] Nello Zuech, "Understanding and Applying Machine Vision", Second Edition, Revised and Expanded.
Table 3: Number of good and defective grains obtained by the proposed method

Figure   Good grains          Yellow defects       Total
no.      Manual  Proposed     Manual  Proposed     Manual  Proposed
5        26      23           0       2            26      25
6        0       0            23      23           23      23
7        4       4            6       6            10      10

Fig 5: parboiled rice good (a) original image (b) global thresholding (c) modified thresholding
Fig 6: parboiled rice Pale yellow (a) original image (b) global thresholding (c) modified thresholding
Fig 7: parboiled rice good and pale yellow (a) original image (b) global thresholding (c) modified thresholding
Foreground Analysis for Real Time Moving Object Detection, Tracking and object classification
Roopashree. B. G1, Vidyasagar K.N2 ,C.Manogaran3
Student1, Asst. Professor2, Sr. Deputy General Manager3
Department of Electronics and communication1,2
Reva Institute of Technology,Bangalore, Karnataka, India1
Quality Service, BHEL, Bangalore,Karnataka,India3
Email : [email protected] ,[email protected] @bheledn.co.in
ABSTRACT-In a robust tracking system where a camera is installed on a freely moving platform, motion detection and tracking of an object (e.g. a human, a vehicle or a robot) becomes much more difficult; hence it is necessary to develop an algorithm to perform functions such as moving object detection, tracking and object classification.
The aim is to model foreground pixels using the background subtraction method, followed by preprocessing of the data for object modeling, in order to recognize and classify the moving objects and understand human activity in the scene. The object detection algorithm is applied in uncontrolled environments and must be suitable for an automated video surveillance system for detection and monitoring of moving objects in both indoor and outdoor environments.
Keywords- Motion Detection;Object detection;Tracking; Human
Model; Motion Analysis; video surveillance
I. INTRODUCTION
Over the past three decades, motion tracking has become an increasingly accepted and accessible method for accurately plotting organic movement into computers, bypassing the tricky and cumbersome process of manually animating models with realistic movements. As the technology matures and the systems become widely accessible, their usefulness is increasing exponentially. The current acceptable quality of motion tracking marks a clear milestone in the long quest to create an intelligent machine: a computer can now 'see'.
PRESENT SCENARIO
A closely related system is the DETER system, which can detect and track humans and can analyze the trajectory of their motion for threat evaluation.
The W4 system detects humans and their body parts and analyzes some simple interactions between humans and objects.
IBM S3-R1 can detect, track, and classify moving objects in a scene, encodes detected activities and stores all the information on a database server.
All these systems work only in a static camera environment; our objective is to work as well in an uncontrolled moving camera environment.
A. Different Types of Motion capturing
There are a few different methodologies when it comes to
capturing data in real-time. The three main techniques are
prosthetic, magnetic and optical.
1. Prosthetic Motion Capture
Prosthetic motion capture uses potentiometers on the plastic
exoskeleton that an actor must 'wear', and then act out his or
her movements. This technique is obviously only of use in
humanoid character animation, but is very accurate and
transmits real-time data at a far greater range than any other
technology. The suit is cumbersome but its advantages mean
that prosthetic motion capture has thrived.
Figure 1: The Gypsy 3 prosthetic full body suit
2. Magnetic Motion Capture
Using magnetic motion capture, sensors attached to the body being animated are manipulated inside a magnetic field. This technique is the least power-hungry in terms of computational number crunching, so it is the closest to real time, with up to one hundred samples a second possible. The sensors also provide details of their orientation in full 3D. One of the few drawbacks is the obvious effect that any metal has on the generated magnetic field. Magnetic motion capture also requires a very tight space, with a range of only three meters.
Figure 2: A suit with magnetic sensors in a closed environment
3. Optical Motion Capture
Optical systems utilize data captured from image sensors
to triangulate the 3D position of a subject between one or
more cameras calibrated to provide overlapping projections.
Data acquisition is traditionally implemented using special
markers attached to an actor. Optical motion capture systems do away with a suit or exoskeleton, and do not necessarily even need physical markers such as LED lights or reflective dots: systems exist that are capable of mapping patches of color or brightness to areas on a 3D model in real time. The greatest advantage of optical motion capture is that it is not limited to movements in a closed motion-capture studio; any movement on a video stream, live or recorded, can be analyzed: water flow in a river, traffic speeds on motorways, and so on.
Figure 3: Optical facial tracking (2 cameras)
B. Background Subtraction
Background subtraction is a particularly common technique for motion segmentation in static scenes. It
attempts to detect moving regions of interest by subtracting the current image pixel-by-pixel from a reference background
image that is created by averaging images over time in an
initialization period. The pixels where the difference is above
a threshold are classified as foreground. After creating a
foreground pixel map, some morphological post processing
operations such as erosion, dilation are performed to reduce
the effects of noise and enhance the detected regions. The
reference background is updated with new images over time
to adapt to dynamic scene changes.
There are different approaches to this basic scheme of
background subtraction in terms of foreground region
detection, background maintenance and post processing.
1. Background Subtraction Techniques:
Heikkila and Silven use the simple version of this scheme, where the pixel at location (x, y) in the current image It is marked as foreground if

|It(x, y) − Bt(x, y)| > T .......... (1)

is satisfied, where T is a predefined threshold. The background image Bt is updated by the use of an Infinite Impulse Response (IIR) filter as follows:

Bt+1 = αIt + (1 − α)Bt .......... (2)
The foreground pixel map creation is followed by morphological closing and the elimination of small-sized
regions.
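Equations 1 and 2 amount to the following per-pixel step (an illustrative Python sketch on flat lists of intensities; the threshold and learning-rate values are assumptions, not from the paper):

```python
def subtract_and_update(frame, background, tau=30, alpha=0.05):
    """One step of simple background subtraction (eq. 1) followed by the
    IIR background update of eq. 2, over flat lists of pixel intensities."""
    # eq. 1: foreground where the frame deviates from the background model
    foreground = [abs(i - b) > tau for i, b in zip(frame, background)]
    # eq. 2: blend the current frame into the background model
    new_bg = [alpha * i + (1 - alpha) * b for i, b in zip(frame, background)]
    return foreground, new_bg
```

Repeating the update lets the reference background slowly absorb gradual scene changes while still flagging fast-moving pixels.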
Although background subtraction techniques perform well at extracting most of the relevant pixels of moving regions, even when they stop, they are usually sensitive to dynamic changes, for instance when stationary objects uncover the background (e.g. a parked car moves out of the parking lot) or when sudden illumination changes occur.
2. Statistical Methods
More advanced methods that make use of the statistical
characteristics of individual pixels have been developed to
overcome the shortcomings of basic background subtraction
methods. These statistical methods are mainly inspired by the
background subtraction methods in terms of keeping and
dynamically updating statistics of the pixels that belong to the
background image process. Foreground pixels are identified
by comparing each pixel‘s statistics with that of the
background model. This approach is becoming more popular
due to its reliability in scenes that contain noise, illumination
changes and shadow.
The W4 system uses a statistical background model where
each pixel is represented with its minimum (M) and maximum
(N) intensity values and maximum intensity difference (D)
between any consecutive frames observed during initial
training period where the scene contains no moving objects. A
pixel in the current image It is classified as foreground if it
satisfies:
|M(x, y) – It(x, y)| > D(x, y) or
|N(x, y) – It(x, y)| > D(x, y)…………… (1)
After thresholding, a single iteration of morphological erosion
is applied to the detected foreground pixels to remove one-
pixel thick noise. In order to grow the eroded regions to their
original sizes, a sequence of erosion and dilation is performed
on the foreground pixel map. The statistics of the background
pixels that belong to the non-moving regions of current
image are updated with new image data
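The W4 rule in equation 1 can be written directly as follows (illustrative sketch; m, n and d stand for the per-pixel trained minimum, maximum and maximum interframe difference):

```python
def w4_is_foreground(pixel, m, n, d):
    """W4 rule (eq. 1): a pixel is foreground if it deviates from either
    the trained minimum m or the trained maximum n by more than the
    largest interframe difference d observed during training."""
    return abs(m - pixel) > d or abs(n - pixel) > d
```

A pixel that stays close to its trained intensity range is kept as background; anything far outside it is flagged for the morphological clean-up step.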
Stauffer and Grimson presented a novel adaptive online background mixture model that can robustly deal with lighting
changes, repetitive motions, clutter, removing objects from the
scene and slowly moving objects. The motivation was that a
background model could not handle image acquisition noise,
light change and multiple surfaces for a particular pixel at the
same time. Thus, they used a mixture of Gaussian
distributions to represent each pixel in the model. Due to its promising features, this model is implemented and integrated in our visual surveillance system. The history of the values of a particular pixel (e.g. scalars for gray values or vectors for color images) over time is considered as a "pixel process", and the recent history of each pixel, X1, . . ., Xt, is modeled by a mixture of K Gaussian distributions.
The probability of observing the current pixel value then becomes:

P(Xt) = Σ (i = 1..K) ωi,t · η(Xt, μi,t, ∑i,t) .......... (2)
where ωi,t is an estimate of the weight (the portion of the data accounted for by this Gaussian) of the i-th Gaussian (Gi,t) in the mixture at time t, μi,t is the mean value of Gi,t, ∑i,t is the covariance matrix of Gi,t, and η is a Gaussian probability density function:

η(Xt, μ, ∑) = (1 / ((2π)^(n/2) |∑|^(1/2))) · exp(−(1/2) (Xt − μ)^T ∑^(−1) (Xt − μ)) .......... (3)
The decision on K depends on the available memory and computational power. Also, for computational efficiency, the covariance matrix is assumed to be of the form:

∑k,t = σk² I .......... (4)
which assumes that red, green, blue color components are
independent and have the same variance. The procedure for
detecting foreground pixels is as follows. At the beginning of
the system, the K Gaussian distributions for a pixel are
initialized with predefined mean, high variance and low prior
weight. When a new pixel is observed in the image sequence, its RGB vector is checked against the K Gaussians until a match is found. A match is defined as a pixel value within 2.5 standard deviations of a distribution.
Next, the prior weights of the K distributions at time t, ωk,t, are updated as follows:

ωk,t = (1 − α) ωk,t−1 + α Mk,t .......... (5)

where α is the learning rate and Mk,t is 1 for the matching Gaussian distribution and 0 for the remaining distributions.
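The weight update of equation 5 can be checked with a small sketch (illustrative Python); note that it keeps the weights normalized: if they sum to 1 before the update, they sum to (1 − α) + α = 1 afterwards:

```python
def update_weights(weights, matched, alpha=0.01):
    """Eq. 5: decay every weight by (1 - alpha) and add alpha to the
    weight of the matched Gaussian (M_k,t = 1 only for the match)."""
    return [(1 - alpha) * w + alpha * (1.0 if k == matched else 0.0)
            for k, w in enumerate(weights)]
```

Repeated matches pull the matched Gaussian's weight toward 1 while the unmatched distributions decay toward 0.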
(a) (b)
Figure 1: Two different views of a sample pixel processes (in blue) and
corresponding Gaussian Distributions shown as alpha blended red spheres.
Figure 1 shows sample pixel processes and the Gaussian
distributions as spheres covering these processes. The
accumulated pixels define the background Gaussian
distribution whereas scattered pixels are classified as
foreground.
3. Temporal Differencing
Temporal differencing attempts to detect moving regions by
making use of the pixel-by-pixel difference of consecutive
frames (two or three) in a video sequence. This method is
highly adaptive to dynamic scene changes, however, it
generally fails in detecting whole relevant pixels of some
types of moving objects.
The temporal differencing algorithm fails to extract all pixels of a human's moving region. Also, this method fails to detect stopped objects in the scene, so additional methods need to be adopted in order to detect stopped objects for the success of higher-level tasks.
Lipton presented a two-frame differencing scheme where the
pixels that satisfy the following equation are marked as
foreground.
|It(x, y) − It−1(x, y)| > τ .......... (1)
In order to overcome shortcomings of two frame differencing
in some cases, three frame differencing can be used.
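Both variants can be sketched on flat pixel lists (illustrative Python; the threshold value is an assumption). Three-frame differencing only marks a pixel as moving when the current frame differs from both previous frames, which suppresses the 'ghost' that two-frame differencing leaves at an object's old position:

```python
def two_frame_diff(prev, curr, tau=25):
    # eq. 1: moving where the current frame differs from the previous one
    return [abs(c - p) > tau for p, c in zip(prev, curr)]

def three_frame_diff(prev2, prev1, curr, tau=25):
    # moving only where the current frame differs from BOTH previous frames
    return [abs(c - p1) > tau and abs(c - p2) > tau
            for p2, p1, c in zip(prev2, prev1, curr)]
```

In the ghost case (a pixel that changed one frame ago but is now stable), two-frame differencing still fires while three-frame differencing stays quiet.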
4. Optical Flow
Optical flow methods make use of the flow vectors of moving
objects over time to detect moving regions in an image. They
can detect motion in video sequences even from a moving
camera; however, most optical flow methods are computationally complex and cannot be used in real time without specialized hardware.
C. OBJECT CLASSIFICATION
Moving regions detected in video may correspond to different
objects in real-world such as pedestrians, vehicles, clutter, etc.
It is very important to recognize the type of a detected object
in order to track it reliably and analyze its activities correctly.
Currently, there are two major approaches towards moving object classification: shape-based and motion-based methods. Shape-based methods make use of the objects' 2D spatial information, whereas motion-based methods use temporally tracked features of objects for the classification solution.
1 Shape-Based Classification
Common features used in shape-based classification schemes are the bounding rectangle, area, silhouette and gradient of detected object regions. The approach presented in makes use of the object silhouette's contour length and area information to classify detected objects into three groups: human, vehicle and other. The method depends on the assumption that humans are, in general, smaller than vehicles and have complex shapes. Dispersedness is used as the classification
metric, and it is defined in terms of the object's area and contour length (perimeter) as follows:

Dispersedness = Perimeter² / Area .......... (1)
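Equation 1 then gives a one-line classifier (illustrative Python; the decision threshold of 60 is a hypothetical value chosen for illustration, not from the paper):

```python
def dispersedness(perimeter, area):
    """Eq. 1: dispersedness = perimeter^2 / area; compact regions
    (vehicles) score low, complex silhouettes (humans) score high."""
    return perimeter ** 2 / area

def classify_region(perimeter, area, threshold=60.0):
    # threshold is a hypothetical value chosen for illustration
    return "human" if dispersedness(perimeter, area) > threshold else "vehicle"
```

A compact blob (short perimeter for its area) classifies as vehicle; a ragged silhouette of the same area classifies as human.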
2 Motion-based Classification
Some of the methods in the literature use only temporal motion features of objects in order to recognize their classes. In general, these are used to distinguish non-rigid objects (e.g. humans) from rigid objects (e.g. vehicles). Optical flow analysis is also useful for distinguishing rigid and non-rigid objects.
II. LITERATURE SURVEY
1. PAST RESEARCH
The need to capture motion has been identified for decades in
various fields, many of them unsurprisingly with no direct
relation to character animation, interactivity or digital media - for instance, intruder detection systems. However, any
advances in these industries go unaccredited in documented
histories of motion capture, which prefer to point to a method
called 'rotoscoping', first used by Walt Disney, who traced
animation over film footage of live actors playing out the
scenes of the cartoon ‗Snow White‘. The quality of animation
produced is high, but as the tracing must be done by hand,
many of the advantages of automated motion capture are lost.
Aggarwal and Cai gave another survey of human motion analysis, which covered the work prior to 1997. The paper provided an overview of the various tasks involved in motion analysis of the human body. The focus was on three major areas related to interpreting human motion: (a) motion analysis involving human body parts, (b) tracking a moving human from a single view or multiple camera perspectives, and (c) recognizing human activities from image sequences.
III. PROPOSED WORK
The proposed system for real time video object detection,
tracking and classification system is shown in Figure 1
Figure 1: Proposed System Block Diagram.
The background is the image which contains the non-moving
objects in a video. Obtaining a background model is done in
two steps: first, the background initialization, where we obtain
the background image from a specific time from the video
sequence, then, the background maintenance, where the
background is updated due to the changes that may occur in
the real scene.
In any indoor or outdoor scene, there are many changes that
may occur over time and may be classified as changes to the background scene. These changes can be classified according to their sources as follows:
Illumination changes: such as the change of the sun's location, the change between cloudy and sunny weather, and turning the light on/off.
Motion changes: such as a small camera displacement or tree branches moving.
Changes introduced to the background: such as objects entering the scene and staying without moving for a long period of time.
1. Image Acquisition
The input device can be a digital video camera in a free environment connected to the computer, or a storage device on which a video file or individual video frames are stored. Video files are stored as Audio Video Interleave (AVI), a multimedia container format introduced by Microsoft.
2. Background Model
The background model must tolerate these kinds of changes.
The background maintenance helps the background model to
adapt to the many changes that may occur. The background maintenance model update uses two adaptations: sequential adaptation and periodic adaptation. The first is done by using a statistical background model, which provides a mechanism to adapt to slow changes in the scene; this adaptation is performed using a low pass filter and is applied to each pixel. The periodic adaptation is used to adapt to large illumination and physical changes that may happen in the scene, like deposited or removed objects.
3. Foreground Detection
The purpose of image segmentation is to separate foreground
regions from background area in order to detect any moving
objects. Thresholding is the simplest method of image
segmentation. Thresholding can be used to create binary
images, so that objects of interest are separated from the
background. A foreground pixel is given a value of "1" while a background pixel is given a value of "0".
Foreground regions can be extracted using the background subtraction method, which detects moving regions by making use of the pixel-by-pixel difference from a reference background image. The background image maintained over time is subtracted from the current acquired frame. The resulting image pixels are flagged as foreground pixels if the difference is large, and considered background pixels if it is near zero.
The foreground extraction process may produce images
containing holes, which can cause an object to be split into more
than one connected region and hence be detected as multiple
objects. We therefore restore objects to their original state and
size by applying a sequence of dilations and erosions.
Here It(x, y) is the pixel at location (x, y) in the current frame and
Bt(x, y) is the corresponding background pixel. The pixel is
marked as foreground if

|It(x, y) − Bt(x, y)| > T .................. (1)

is satisfied, where T is a predefined threshold.
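As a minimal sketch of Eq. (1) with NumPy (our illustration, not the authors' code; the function name and threshold value are arbitrary):

```python
import numpy as np

def foreground_mask(current, background, tau=30):
    """Flag pixels whose absolute difference from the background
    exceeds a predefined threshold tau, as in Eq. (1)."""
    diff = np.abs(current.astype(np.int32) - background.astype(np.int32))
    return (diff > tau).astype(np.uint8)  # 1 = foreground, 0 = background

# toy example: a bright 2x2 object appears on a dark background
bg = np.zeros((4, 4), dtype=np.uint8)
frame = bg.copy()
frame[1:3, 1:3] = 200
mask = foreground_mask(frame, bg, tau=30)
```

The cast to a signed integer type before subtraction avoids the wrap-around that unsigned image arithmetic would otherwise produce.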
Thresholding: Otsu's Method
Otsu's method is used to automatically perform histogram
shape-based image thresholding, i.e. the reduction of a gray-level
image to a binary image. The algorithm assumes that the image
to be thresholded contains two classes of pixels (e.g. foreground
and background), i.e. has a bimodal histogram, and then
calculates the optimum threshold separating those two classes
so that their combined spread (intra-class variance) is minimal.
In Otsu's method we exhaustively search for the threshold that
minimizes the intra-class variance, defined as a weighted sum of
the variances of the two classes:

σw²(t) = ω1(t) σ1²(t) + ω2(t) σ2²(t) .....(2)

The weights ω1, ω2 are the probabilities of the two classes
separated by a threshold t, and σ1², σ2² are the variances of
these classes. Otsu shows that minimizing the intra-class
variance is the same as maximizing the inter-class variance:

σb²(t) = σ² − σw²(t) = ω1(t) ω2(t) [μ1(t) − μ2(t)]² .....(3)

which is expressed in terms of the class probabilities ωi and
class means μi. The class probability ω1(t) is computed from the
histogram as:

ω1(t) = Σ i=1..t p(i) .....(4)

while the class mean μ1(t) is:

μ1(t) = [ Σ i=1..t p(i) x(i) ] / ω1(t) .....(5)

where x(i) is the value at the center of the ith histogram bin.
Similarly, ω2(t) and μ2(t) are computed on the right-hand side of
the histogram for bins greater than t.
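The exhaustive search over t described above can be sketched as follows (our own illustration in Python; MATLAB's graythresh, discussed next, implements the same idea internally):

```python
import numpy as np

def otsu_threshold(hist):
    """Exhaustively search for the threshold t that maximizes the
    between-class variance (equivalently minimizes the within-class
    variance) of a 256-bin grayscale histogram."""
    p = hist / hist.sum()                      # bin probabilities
    bins = np.arange(256)
    best_t, best_sigma_b = 0, -1.0
    for t in range(1, 256):
        w1, w2 = p[:t].sum(), p[t:].sum()      # class probabilities
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (bins[:t] * p[:t]).sum() / w1    # class means
        mu2 = (bins[t:] * p[t:]).sum() / w2
        sigma_b = w1 * w2 * (mu1 - mu2) ** 2   # between-class variance
        if sigma_b > best_sigma_b:
            best_t, best_sigma_b = t, sigma_b
    return best_t

# bimodal toy histogram: one mode near 50, one near 200
hist = np.zeros(256)
hist[45:56] = 100
hist[195:206] = 100
t = otsu_threshold(hist)
```

For a well-separated bimodal histogram like this one, the chosen threshold falls in the gap between the two modes.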
MATLAB functions for global thresholding:

level = graythresh(I)
BW = im2bw(I, level)

level = graythresh(I) computes a global threshold (level) that
can be used to convert an intensity image to a binary image
with im2bw. level is a normalized intensity value that lies in
the range [0, 1]. The graythresh function uses Otsu's method,
which chooses the threshold to minimize the intra-class
variance of the black and white pixels.

BW = im2bw(I, level) converts the grayscale image I to a binary
image based on the threshold. The output image BW replaces all
pixels in the input image with luminance greater than level with
the value 1 (white) and replaces all other pixels with the value 0
(black). Specify level in the range [0, 1]; this range is relative to
the signal levels possible for the image's class.
Figure 2: Background-subtracted image
4. Object and Feature Extraction
Object tracking can be implemented, regardless of the tracking
algorithm, only when features are efficiently identified and
extracted from frame to frame. The features used in this
system are the center of gravity (centroid), the velocity and the
size. The center of gravity is used to classify the object and
detect its location in the frame sequence. Object size is the
number of foreground pixels in the detected object. Figure 3
shows an illustration of the features.
Figure 3: Features -centroid and size of foreground image
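The centroid and size features can be computed directly from the binary foreground mask. A minimal NumPy sketch (the function name and toy mask are ours, not from the paper):

```python
import numpy as np

def centroid_and_size(mask):
    """Compute the centre of gravity and the size (number of
    foreground pixels) of a binary foreground mask."""
    ys, xs = np.nonzero(mask)
    size = len(xs)
    if size == 0:
        return None, 0
    centroid = (xs.mean(), ys.mean())   # (x, y) centre of gravity
    return centroid, size

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:6, 3:7] = 1                      # a 4x4 foreground blob
(cx, cy), size = centroid_and_size(mask)
```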
5. Object Tracking
The aim of object tracking is to establish a correspondence
between objects or object parts in consecutive frames, to
extract temporal information about objects such as trajectory,
and to track objects as a whole from frame to frame. Object
features such as size and center of gravity are used to create a
bounding box around the object in motion. The object tracking
algorithm uses the extracted object features together with a
correspondence matching scheme to track objects from frame to
frame.
Figure 4: Bounding Box enclosing human.
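One simple correspondence-matching scheme of the kind described above is greedy nearest-centroid matching between consecutive frames. This sketch is purely illustrative; the paper does not specify which matching rule it uses:

```python
import math

def match_objects(prev_objs, curr_objs, max_dist=50.0):
    """Greedy nearest-centroid correspondence matching between
    objects detected in consecutive frames. Each object carries a
    'centroid' (x, y); a matched current object inherits the id of
    the previous-frame object it is closest to."""
    matches = {}
    used = set()
    for pid, prev in prev_objs.items():
        best, best_d = None, max_dist
        for cid, curr in enumerate(curr_objs):
            if cid in used:
                continue
            d = math.dist(prev['centroid'], curr['centroid'])
            if d < best_d:
                best, best_d = cid, d
        if best is not None:
            matches[pid] = best
            used.add(best)
    return matches

prev = {0: {'centroid': (10.0, 10.0)}, 1: {'centroid': (80.0, 40.0)}}
curr = [{'centroid': (82.0, 41.0)}, {'centroid': (12.0, 11.0)}]
m = match_objects(prev, curr)
```

Objects that move less than max_dist pixels between frames keep their identity; new or fast-moving objects remain unmatched and would start new tracks.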
6. Object Classification
Motion-based classification involves extracting temporal
motion features of objects in order to recognize their classes,
in general to distinguish non-rigid objects (e.g. humans) from
rigid objects (e.g. vehicles). The method is based on the
temporal self-similarity of a moving object: as an object that
exhibits periodic motion evolves, its self-similarity measure
also varies periodically. The method exploits this clue to
categorize moving objects using periodicity.
Time-dependent features carry a considerable amount of
information concerning the identity of an object. For example,
the periodicity of human gait is very effective for separating a
walking human from a moving car.
7. Object Processing
Once an object is identified and classified as human or
vehicle, it is represented by indexing it by name or number, or
an alarm is generated when an identified object enters a
restricted area. The indexed images of objects can be compared
with a stored database.
IV. RESULTS
Figure 1: Object detected with background eliminated and tracked
Figure 2: Human detected with background eliminated
V. CONCLUSION
In the proposed system for foreground analysis of real-time
moving object detection and tracking, foreground regions of
potential moving objects are extracted, morphological operations
are applied to eliminate noise, and moving objects and humans
are detected as shown in the figures above. Further analysis,
such as behavior analysis, remains to be performed.
REFERENCES
[1] M. Hussein, W. Abd-Almageed, Y. Ran and L. Davis, "Real-Time Human
Detection, Tracking, and Verification in Uncontrolled Camera Motion
Environments," Institute for Advanced Computer Studies, University of
Maryland, IEEE.
[2] Y. Dedeoglu, "Moving Object Detection, Tracking and Classification for
Smart Video Surveillance."
[3] J. K. Aggarwal and Q. Cai, "Human Motion Analysis: A Review,"
Computer Vision and Image Understanding, 73(3):428–440, March 1999.
[4] J.-X. Xu, D. Huang, V. Venkataramanan and H. T. C. Tuong, "An Extreme
Precise Motion Tracking of Piezoelectric Positioning Stage Using
Sampled-Data Iterative Learning Control," 2011.
Naturalness of Machine Synthesized Tamil Speech
*A. G. Ramakrishnan, #Kusumika Krori Dutta, #Lakshmi Chithambaran
*Professor, Department of Electrical Engineering, Indian Institute of Science, Bangalore-560012
#Department of Electrical and Electronics Engineering, MSRIT, Bangalore-560054
Abstract: In general, Text-To-Speech (TTS) conversion
systems generate un-intonated speech that lacks the
naturalness of human expression. This paper proposes a
method to modify the un-intonated speech output of a
Tamil TTS engine into intonated speech for
interrogative sentences. This is achieved by applying a
model of the intonated interrogative contour, representing
the proper speech parameters, to the TTS output. The
modeling and implementation have been performed using
PRAAT interfaced with MATLAB.
Key Words: Tamil speech, TTS, interrogation, Pitch,
MATLAB, PRAAT.
1. INTRODUCTION
Over the last decade, speech processing usage has
become ubiquitous, fueled by the increasing
demand for automated systems with spoken
language interfaces. Considerable amount of
research has been done in prosody modification of
synthesized speech for European languages.
However, research on prosody in Indian languages
is very limited [1-2]. In this paper, we attempt to
model the pitch, amplitude and duration of
interrogative sentences in Tamil, which is one of
the oldest languages of Indian sub-continent.
Speech has an important role in human emotional
expression. The same sentence, uttered with
different emotions, may give different meanings.
This work aims to help people with cerebral palsy,
who have all their emotions intact but cannot talk.
Text-To-Speech (TTS) conversion systems help
them to overcome the disability, but the output of
TTS has no intonation and thus no information on
the feelings of the speaker. Information conveyed
through such speech can be misunderstood [3], so
the whole purpose of communication may be lost
and may not evoke the expected response from
people [3]. The work reported here modifies the
prosody of an un-intonated interrogative sentence to
one that has a more natural expression. For
introducing interrogative expression in machine-
synthesized speech, a quantitative model is created
for changes in the different parameters responsible
for prosody [1-2], such as pitch, duration and
energy. As per this model, pitch is modified using
the DCT on the pitch-synchronous linear predictive
coding (LPC) residual signal [2, 4].
2. INTERROGATIVE PROSODY MODEL
Fifteen interrogative sentences were recorded twice
from seven native Tamil speakers, irrespective of
gender, age and accent. They were asked to utter
each sentence first without intonation and then with
intonation. The pitch, energy and durational features
of all the recorded sentences were analyzed with
the help of [1-2].
2.1 Pitch
Pitch is the most expressive feature of speech and
an indicator of different emotions. For example, a
narrow pitch range indicates boredom, depression,
or controlled anger [1].
As shown in Figs. 1 and 2, the pitch contour of a
naturally uttered (intonated) interrogative sentence
shows a pronounced rise-and-fall pattern compared
to the un-intonated utterance of the same sentence.
The contour can be approximately modeled as a
Gaussian curve. The rise-and-fall pattern is
governed by the location of the question word: if the
question word is at the end of the sentence, the
previous word forms the rise part and the question
word forms the fall part; otherwise, the question
word forms the rise part and the next word in the
sentence forms the fall part.
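The Gaussian approximation of the interrogative pitch contour can be sketched numerically. All parameter values below (baseline F0, bump height, bump position and width) are illustrative assumptions, not values from the paper:

```python
import numpy as np

def gaussian_pitch_contour(n_frames, f0_base=120.0, peak_rise=60.0,
                           center_frac=0.7, width_frac=0.15):
    """Model the rise-and-fall pitch contour of an intonated
    interrogative sentence as a Gaussian bump added to a flat
    baseline F0 (in Hz). Time is normalized to [0, 1]."""
    t = np.linspace(0.0, 1.0, n_frames)
    bump = peak_rise * np.exp(-0.5 * ((t - center_frac) / width_frac) ** 2)
    return f0_base + bump

contour = gaussian_pitch_contour(100)
```

The bump's center would sit over the rise/fall word pair identified by the position of the question word.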
2.2 Energy
The mean energy of the whole speech signal (the sum
of the squares of all the samples divided by the total number of
samples) of an intonated sentence is 10 to 28%
more than that of the corresponding un-intonated
sentence.
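The mean-energy definition above is straightforward to compute. In this toy check (our own, not from the paper), scaling a signal's amplitude by 1.1 raises its mean energy by 21%, within the 10–28% range the authors report:

```python
import numpy as np

def mean_energy(x):
    """Mean energy of a speech signal: the sum of squared samples
    divided by the total number of samples."""
    x = np.asarray(x, dtype=float)
    return np.sum(x ** 2) / len(x)

un_intonated = np.sin(np.linspace(0, 20 * np.pi, 1000))
intonated = 1.1 * un_intonated          # ~10% amplitude increase
ratio = mean_energy(intonated) / mean_energy(un_intonated)
```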
2.3 Duration
Duration is the time in seconds for which the basic
unit of speech under consideration lasts. In this
work, the unit studied is a word. Table 1 lists the
mean relative durations of intonated sentences.
Table 1. Variation in the duration characteristics of
intonated sentences compared to those of un-intonated ones.
3. VALIDATION
To validate the heuristic model of the prosodic
parameters, three varieties of semitones were
created: pure sinusoids, half-wave rectified
sinusoids (to include the second harmonic) and
combinations of the two in different proportions,
and a Mean Opinion Score (MOS) was obtained on
a scale of one to ten from four subjects.
Table 2. MOS of tones

Nature of synthetic tones studied | MOS (on a scale of 10)
Pure sinusoids | 6
Rectified (harmonics) | 4
1:1 combination of sinusoid & harmonics | 7
4. EXPERIMENTAL STUDY
The above model was applied on the output of the
TTS developed by the MILE LAB, Dept. of Electrical
Engineering, Indian Institute of Science [1, 6].
Fig. 3: Block diagram of the process to modify the
TTS output.
The TTS output [2-3] is segmented at the word level
and analyzed with the Praat software. The pitch
marks obtained from Praat are used in MATLAB to
extract each pitch period of the speech signal for
pitch-synchronous analysis. Linear predictive
analysis is done on each period of the speech
signal, by which the basic excitation signal is
decoupled from the spectral shaping effects of the
vocal tract. The excitation signal contains the pitch
information.
Human speech production is modeled as

s(n) = Σ k=1..p ak s(n−k) + G u(n) --------------(1)

where s(n) are the speech signal samples, u(n) is
the excitation and ak are the LP coefficients. LPC
predicts the current sample using a linear
combination of p past samples:

s^(n) = Σ k=1..p ak s(n−k) -----------------------(2)

where s^(n) is the estimated speech signal.
When the error between the original speech signal
and the estimated signal is taken, the error gives us
only the excitation pulse, while the impulse response
of the vocal tract (formant information) is captured
in the ak's:

e(n) = s(n) − s^(n) -------------------------(3)
e(n) = s(n) − Σ k=1..p ak s(n−k) -----------(4)
e(n) = G u(n) --------------------------------(5)

where e(n) is the error and u(n) is the pitch
excitation. Therefore, from equations (3), (4) and (5)
we can say that the error between the original
speech signal and the estimated signal gives us the
basic excitation, with the formants captured in ak.
Taking the Z-transform of (4), the error is the output
of a system with transfer function

A(z) = 1 − Σ k=1..p ak z^−k -----------------------(6)
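A rough sketch of the LP analysis and inverse filtering described above, using the autocorrelation normal equations rather than Praat/MATLAB (function names and the synthetic test signal are ours, not the paper's pipeline):

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate LP coefficients a_k by solving the autocorrelation
    normal equations R a = r (a simplified alternative to the
    Levinson-Durbin recursion)."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])   # s^(n) = sum_k a_k s(n-k)

def lpc_residual(frame, a):
    """Pass the frame through the inverse filter A(z) = 1 - sum a_k z^-k,
    as in Eq. (6); the output e(n) approximates the excitation."""
    p = len(a)
    e = frame.astype(float).copy()
    for n in range(len(frame)):
        for k in range(1, p + 1):
            if n - k >= 0:
                e[n] -= a[k - 1] * frame[n - k]
    return e

# synthetic AR(2) "speech" signal: the residual energy should be
# far below the signal energy once the vocal-tract model is removed
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
s = np.zeros(2000)
for n in range(2, 2000):
    s[n] = 1.6 * s[n - 1] - 0.64 * s[n - 2] + 0.01 * u[n]
a = lpc_coefficients(s, 2)
e = lpc_residual(s, a)
```

Because the test signal is itself an order-2 autoregressive process, the estimated coefficients recover the generating ones and the residual is (approximately) the scaled excitation.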
The error of each period is obtained by passing the
period through the filter described in (6), and the
error is manipulated in the DCT domain with the
modeled pitch contour as reference. The pitch
Duration feature | Relative duration in intonated sentences
Duration of the entire sentence | Longer by 19.7% on average
First word in the curve | Shorter by 31.5% on average
Second word in the curve | Longer by 32% on average
contour is modeled as a normal distribution curve
(from Section 2.1). The block diagram of pitch
modification is shown in Fig. 4.
The hypothesized durational features are
implemented by inserting or deleting frames
appropriately; the hypothesized energy features are
implemented by multiplying the signal by the
hypothesized factor.
5. RESULTS AND DISCUSSIONS
Un-intonated interrogative sentences were recorded
from people and modified by the algorithm; then
intonated interrogative sentences were recorded
from people and the MOS was taken. The MOS is
as follows:
Table 3. MOS of intonated interrogative speech
compared to naturalness.

Speech | MOS (out of 5)
Un-intonated recording | 0
Un-intonated recording modified by the algorithm | 3.5
Natural intonated recording | 5
From the MOS we can say that the algorithm
proposed in this paper modifies un-intonated speech
to be very close to natural speech.
Next, the output of the TTS was taken for an
interrogative sentence in Tamil and was modified by
the algorithm, and then the same sentence was
recorded with intonation from people.
Table 4. MOS of the algorithm in this paper as
compared to TTS and natural recording.
From the MOS of the TTS we can see that the TTS
output in itself does not have a flat contour, as
shown for the un-intonated recording in Fig. 1.
Imposing our model on the existing random contour
of the TTS output causes deterioration in the quality
of the modified speech.
6. CONCLUSION
As aimed, with the hypothesis and algorithm
followed in this paper, the interrogative intonation
of non-intonated sentences is brought very close to
natural intonation.
Further work can be done on improving the quality
of the TTS-modified speech, and the same algorithm
can be followed for other emotions.
ACKNOWLEDGEMENT
We whole heartedly thank all the members of MILE
LAB, Department of Electrical Engineering, Indian
Institute of Science. A special thanks to Mr. H R
Shiva Kumar for developing the TTS and making it
available as a web demo.
REFERENCES
[1] G. L. Jayavardhana Rama et al., "Thirukkural: A Speech
Synthesis System in Tamil," Proc. Tamil Internet 2001,
Kuala Lumpur, Malaysia, Aug. 26-28, 2001.
[2] R. Muralishankar and A. G. Ramakrishnan, "Human Touch
to the Tamil Synthesizer," Proc. Tamil Internet 2001, Kuala
Lumpur, August 26-28, 2001, pp. 103-109.
Speech | MOS (out of 5)
TTS | 1
Algorithm as proposed in this paper | 2
Intonated recording | 5
OPTICAL FIBER COMMUNICATION
1K. B. MOHD. UMAR ANSARI, 2SATYENDRA VISHWAKARMA, 3ANUP KUMAR
1,2 M.Tech (Electrical Power & Energy Systems), Department of Electrical & Electronics Engineering
3 B.Tech (Mechanical Engineering), Department of Mechanical Engineering
Ajay Kumar Garg Engineering College, Ghaziabad, Uttar Pradesh, India
[email protected], [email protected], [email protected]
Abstract: Fiber-optic communication is a method of transmitting
information from one place to another by sending pulses
of light through an optical fiber. The light forms
an electromagnetic carrier wave that is modulated to carry
information. First developed in the 1970s, fiber-
optic communication systems have revolutionized the
telecommunications industry and have played a major role in the
advent of the Information Age. Because of their advantages over
electrical transmission, optical fibers have largely replaced
copper wire communications in core networks in the developed
world.
The process of communicating using fiber-optics involves the
following basic steps: Creating the optical signal involving the
use of a transmitter, relaying the signal along the fiber, ensuring
that the signal does not become too distorted or weak, receiving
the optical signal, and converting it into an electrical signal.
Keywords: Optical fiber, Principle, Modes, Elements,
Applications.
1. INTRODUCTION
The phenomenon of total internal reflection, responsible for
the guiding of light in optical fibers, has been known since 1854
[1]. Although glass fibers were made in the 1920s, their use
became practical only in the 1950s, when the use of a cladding
layer led to considerable improvement in their guiding
characteristics. Before 1970, optical fibers were used mainly
for medical imaging over short distances [2]. Their use for
communication purposes was considered impractical because
of high losses. However, the situation changed drastically in
1970 when, following an earlier suggestion, the loss of optical
fibers was reduced to below 20 dB/km [2]. Further progress
resulted by 1979 in a loss of only 0.2 dB/km near the 1.55-µm
spectral region [3]. The availability of low-loss fibers led to a
revolution in the field of light wave technology and started the
era of fiber-optic communications.
2. GEOMETRICAL OPTICS DESCRIPTION
In its simplest form an optical fiber consists of a cylindrical
core of silica glass surrounded by a cladding whose refractive
index is lower than that of the core. Because of the abrupt index
change at the core–cladding interface, such fibers are called
step-index fibers. In a different type of fiber, known as
graded-index fiber, the refractive index decreases gradually
inside the core. Figure 1 shows schematically the index profile
and the cross section for the two kinds of fibers. Considerable
insight into the guiding properties of optical fibers can be gained
by using a ray picture based on geometrical optics [3]. The
geometrical-optics description, although approximate, is valid
when the core radius a is much larger than the light
wavelength λ.
Figure 1: Cross section and refractive-index profile for step-
index and graded-index fibers.
Step-Index Fibers
Consider the geometry of Fig. 2, where a ray making an
angle θi with the fiber axis is incident at the core center.
Because of refraction at the fiber–air interface, the ray bends
toward the normal. The angle θr of the refracted ray is given by
[3]

n0 sin θi = n1 sin θr ........(1)

where n1 and n0 are the refractive indices of the fiber core and
air, respectively. The refracted ray hits the core–cladding
interface and is refracted again. However, refraction into the
cladding is possible only for an angle of incidence φ such that

sin φ < n2/n1 ........(2)

Figure 2: Light confinement through total internal reflection in
step-index fibers. Rays for which φ < φc are refracted out of the
core.

One can use Eqs. (1) and (2) to find the maximum angle that
the incident ray should make with the fiber axis to remain
confined inside the core. Noting that θr = π/2 − φc for such a ray,
where the critical angle φc is defined by sin φc = n2/n1, and
substituting it in Eq. (1), we obtain

n0 sin θi = n1 cos φc = (n1² − n2²)^1/2 ........(3)

In analogy with lenses, n0 sin θi is known as the numerical
aperture (NA) of the fiber.
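The NA relation is easy to evaluate numerically. In the sketch below, the function names and the index values n1 = 1.48, n2 = 1.46 (typical of silica fibers) are illustrative, not taken from the paper:

```python
import math

def numerical_aperture(n1, n2):
    """NA of a step-index fiber: n0 sin(theta_i) = sqrt(n1^2 - n2^2)."""
    return math.sqrt(n1 ** 2 - n2 ** 2)

def acceptance_angle_deg(n1, n2, n0=1.0):
    """Maximum incidence angle (launching from a medium with
    index n0, here air) for rays that remain guided."""
    return math.degrees(math.asin(numerical_aperture(n1, n2) / n0))

na = numerical_aperture(1.48, 1.46)
theta = acceptance_angle_deg(1.48, 1.46)
```

For these indices the NA is about 0.24, giving an acceptance half-angle of roughly 14 degrees.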
Graded-Index Fibers
The refractive index of the core in graded-index fibers is not
constant but decreases gradually from its maximum value n1 at
the core center to its minimum value n2 at the core–cladding
interface. Most graded-index fibers are designed to have a
nearly quadratic decrease and are analyzed by using the
α-profile, given by [3]

n(ρ) = n1[1 − Δ(ρ/a)^α] for ρ < a; n(ρ) = n1(1 − Δ) = n2 for ρ ≥ a ........(4)

where a is the core radius and Δ = (n1 − n2)/n1. The parameter
α determines the index profile. A step-index profile is
approached in the limit of large α. A parabolic-index fiber
corresponds to α = 2.
It is easy to understand qualitatively why intermodal (or
multipath) dispersion is reduced in graded-index fibers. Figure
3 shows schematically the paths of three different rays. As in
step-index fibers, the path is longer for more oblique rays.
However, the ray velocity changes along the path because of
variations in the refractive index. More specifically, the ray
propagating along the fiber axis takes the shortest path but
travels most slowly as the index is largest along this path.
Oblique rays have a large part of their path in a medium of
lower refractive index, where they travel faster. It is therefore
possible for all rays to arrive together at the fiber output by a
suitable choice of the refractive-index profile.
Figure 3: Ray trajectory in a graded index fibers
Geometricalopticscanbeusedtoshowthataparabolic-index
profileleadsto
nondispersivepulsepropagationwithintheparaxialapproxi
mation. Thetrajectory
ofaparaxialrayisobtainedbysolving[3]
where α is the radial distance of the ray from the axis.
3. GENERAL OVERVIEW OF OPTICAL FIBER
COMMUNICATION SYSTEM :
Like all other communication systems, the primary objective
of an optical fiber communication system is to transfer a
signal containing information (voice, data, video) from the
source to the destination. The general block diagram of an
optical fiber communication system is shown in Figure 4.
The source provides information in the form of an electrical
signal to the transmitter. The electrical stage of the
transmitter drives an optical source to produce a modulated
light-wave carrier; semiconductor LASERs or LEDs are
usually used as the optical source. The information-carrying
light wave then passes through the transmission medium, i.e.
the optical fiber cable. It then reaches the receiver stage,
where an optical detector demodulates the optical carrier and
gives an electrical output signal to the electrical stage. The
common types of optical detectors used are photodiodes
(p-i-n, avalanche), phototransistors,
photoconductors etc. Finally, the electrical stage recovers the original information and delivers it to the destination.
Figure 4: OPTICAL FIBER COMMUNICATION SYSTEM.
4. PRIMARY ELEMENTS OF OPTICAL FIBER
COMMUNICATION SYSTEM:
Figure 5 shows the major elements used in an optical fiber
communication system. As we can see, the transmitter stage
consists of a light source and the associated drive circuitry,
while the receiver section includes a photodetector, a signal
amplifier and a signal restorer.
Additional components such as optical amplifiers, connectors,
splices and couplers are also present. The regenerator section
is a key part of the system, as it amplifies and reshapes the
distorted signal on long-distance links.
Figure 5: Elements Of an Optical Fiber Communication
System.
4.1 Transmitter section:
The main parts of the transmitter section are a source (either
an LED or a LASER), an efficient coupling means to couple
the output power into the fiber, a modulation circuit and, for
LASERs, a level controller. At present, for longer repeater
spacing, the use of single-mode fibers and LASERs appears
essential, whereas the earlier transmitters, operating in the
0.8 µm to 0.9 µm wavelength range, used double-
heterostructure LASERs or LEDs as optical sources.
Direct coupling of the source to optical fibers results in high
coupling losses; for LASERs, two types of lenses are used for
this purpose, namely discrete lenses and integral lenses.
4.1.1 LED vs LASER as optical source:
A larger fraction of the output power can be coupled into the
optical fiber in the case of LASERs, as they emit a more
directional light beam than LEDs. That is why LASERs are
more suitable for high bit-rate systems. LASERs also have a
narrow spectral width and a faster response time;
consequently, LASER-based systems are capable of operating
at much higher modulation frequencies than LED-based
systems. On the other hand, typical LEDs have lifetimes in
excess of 10^7 hours, whereas LASERs last only about 10^5
hours, and LEDs can operate at much lower input currents
than LASERs. So, according to the situation and requirements,
either an LED or a LASER can be utilized as the optical
source.
A number of factors pose limitations on transmitter design,
such as electrical power requirement, speed of response,
linearity, thermal behavior, spectral width etc.
4.1.2 Drive circuitry:
These are the circuits used in the transmitter to switch a
current in the range of ten to several hundred milliamperes,
as required for proper functioning of the optical source. For
LEDs there are drive circuits such as the common-emitter
saturating switch, low-impedance, emitter-coupled and
transconductance drive circuits. For LASERs, shunt drive
circuits, bias-control drive circuits, ECL-compatible LASER
drives etc. are notable.
4.2 Receiver section:
As shown in Figure 5, the general structure of a receiver
section includes a photodetector, a low-noise front-end
amplifier, a voltage amplifier and a decision-making circuit to
recover the exact information signal. The high-impedance
amplifier and the transimpedance amplifier are the two
popular configurations of the front-end amplifier, whose
design is critical for sensible receiver performance. The two
most common photodetectors are p-i-n diodes and avalanche
photodiodes; quantum efficiency, responsivity and speed of
response are the key parameters behind the choice of
photodetector. The most important requirements of an optical
receiver are sensitivity, bit-rate transparency, bit-pattern
independence, dynamic range, acquisition time etc. As the
noise contributed by the receiver is higher than that of the
other elements in the system, we must keep a keen check
on it.
5. BENEFITS OF OPTICAL FIBER COMMUNICATION SYSTEM:
Some of the innumerable benefits of optical fiber
communication systems are:
- Immense bandwidth to utilize
- Total electrical isolation in the transmission medium
- Very low transmission loss
- Small size and light weight
- High signal security
- Immunity to interference and crosstalk
- Very low power consumption and wide scope of system expansion
6. APPLICATIONS
Due to its variety of advantages, the optical fiber communication
system has a wide range of applications in different fields,
namely:
- The public network field, which includes trunk networks,
junction networks, local access networks, submerged systems,
synchronous systems etc.
- The field of military applications
- Civil, consumer and industrial applications
- The field of computers, which is at the center of research right
now
7. CONCLUSION
Though there are some negatives of the optical fiber
communication system in terms of fragility, splicing, coupling,
set-up expense etc., it is an unavoidable fact that optical fiber
has revolutionized the field of communication. As soon as
computers become capable of processing optical signals, the
entire arena of communication will become optical.
REFERENCES
[1] K. C. Kao and G. A. Hockham, Proc. IEE 113, 1151 (1966);
A. Werts, Onde Electr. 45, 967 (1966).
[2] N. S. Kapany, Fiber Optics: Principles and Applications,
Academic Press, San Diego, CA, 1967.
[3] J. Gower, Optical Communication Systems, 2nd ed.,
Prentice Hall, London.
Improved Face Recognition with Multilevel
BTC using YCbCr Colour Space
Shoan Herman Pinto, ECE Department, SJBIT, Bangalore, India, [email protected]
Chitra V Kumar, ECE Department, SJBIT, Bangalore, India, [email protected]
Shreyas S Sogal, ECE Department, SJBIT, Bangalore, India, [email protected]
Abstract: The motive of the work presented in this
paper is to achieve better efficiency in face
recognition using Block Truncation Coding (BTC)
in the RGB and YCbCr colour spaces. Multilevel
Block Truncation Coding is applied to images in the
RGB and YCbCr colour spaces up to four levels for
face recognition. The experimental analysis shows
an improved result for Block Truncation Coding at
level 4 (BTC-level 4) as compared to the other BTC
levels in the RGB colour space. Results displaying a
similar pattern are realized when the YCbCr colour
space is used; in addition, an improvement on all
four levels is observed for the YCbCr colour space.
Keywords- Face recognition; BTC; RGB; YCbCr;
Multilevel BTC; Mean Square Error;
1. INTRODUCTION
Face recognition refers to identifying and verifying a face
image. A face recognition system accomplishes this by
comparing the input query face image with the existing face
images stored in the database. It exploits the unique
characteristics of an individual's face. Face recognition is the
fastest growing biometric technology. Biometrics may be
defined as an automated method of recognizing a person based
on physiological and behavioral characteristics. There are
many biometric systems, such as fingerprints, iris, voice,
retina and face. Among these systems, face recognition has
proved to be the most effective and universal. These systems
are used in a wide range of applications that require reliable
recognition of humans. Some of the applications of face
recognition include security, physical and computer access
controls, law enforcement [11], [12], criminal list verification,
surveillance at various places [14], authentication at airports,
forensics, etc.
Face recognition has become a centre of attention for
researchers from the fields of biometrics, computer vision,
image processing, neural networks and pattern recognition.
Many algorithms are used to make effective face recognition
systems. Some of the algorithms include Principal Component
Analysis (PCA) [2], [3], [4], Linear Discriminant Analysis
(LDA) [5], [6], [7], Independent Component Analysis (ICA)
[8], [9], [10] etc.
The face images in a database might not be of constant size.
Thus, to make the algorithm independent of the size of a face
image, Block Truncation Coding (BTC) [12], [13] has been
used. This coding technique has been implemented up to four
levels on two colour face image databases.
2. BLOCK TRUNCATION CODING
Block Truncation Coding (BTC) [1], [11], [12], [13] is a
relatively simple image coding technique developed in the
early years of digital imaging, more than 29 years ago.
Although it is a simple technique, BTC has played an
important role in the history of digital image coding, in the
sense that many advanced coding techniques have been
developed based on BTC or inspired by its success. BTC was
first developed in 1979 for grayscale image coding [13]. In
the given implementation of BTC, the colour face image
database in the RGB (Red, Green and Blue) colour space
[16], [17] has been used; it is later converted to the YCbCr
colour space.
3. COLOUR SPACE
Various colour spaces exist because they present colour
information in ways that make certain calculations more
convenient, or because they provide a more intuitive way to
identify colours. A few examples are RGB, XYZ, xyY,
L*a*b, YCbCr and HSV. In this paper the RGB and YCbCr
colour spaces are used.
Y represents the luminance and Cb & Cr represent
the chrominance components. This format is
typically used in video coding.
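The conversion from RGB to YCbCr needed later in the paper can be sketched as below. We assume the full-range ITU-R BT.601 transform commonly used in image pipelines; the paper does not state which variant it uses:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an 8-bit RGB image to full-range YCbCr using the
    ITU-R BT.601 coefficients (assumed variant, see lead-in)."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b            # luminance
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0  # blue-difference
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0  # red-difference
    return np.stack([y, cb, cr], axis=-1)

# a pure white pixel maps to (Y, Cb, Cr) = (255, 128, 128)
white = np.full((1, 1, 3), 255, dtype=np.uint8)
ycc = rgb_to_ycbcr(white)
```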
4. MULTI-LEVEL BTC
To calculate the feature vector in this algorithm, Block
Truncation Coding has been used (for further information
refer to [11], [12], [13]). It has been implemented on four
levels, which are explained below:
4.1 BTC Level 1
A face image is taken from the database and the average intensity value of each colour plane of the image is calculated. The colour space considered in this algorithm is the RGB colour space [16], [17], so the average intensity value of each of the RGB planes of a face image is calculated. The further discussion uses the Red plane of an image; the same has to be carried out for the Blue and Green planes.
After obtaining the average intensity value of the Red plane of the face image, each pixel is compared with the mean value and the image is divided into two regions: Upper Red and Lower Red [18]. The average intensity values of these regions are calculated and stored in the feature vector as UR and LR. Thus, after repeating this procedure for the Blue and the Green planes, the feature vector has six elements: Upper Red, Lower Red, Upper Green, Lower Green, Upper Blue, Lower Blue (UR, LR, UG, LG, UB, LB) [18]. Refer to figure 1.
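The level-1 computation above can be sketched as follows (an illustrative Python sketch, not the paper's MATLAB implementation; the function names are ours):

```python
import numpy as np

def btc_level1(plane):
    """BTC level 1 for one colour plane: split the pixels around the
    plane mean and return the means of the upper and lower regions."""
    m = plane.mean()
    upper = plane[plane >= m]   # pixels at or above the mean
    lower = plane[plane < m]    # pixels below the mean
    return upper.mean(), lower.mean()

def btc_level1_rgb(img):
    """Six-element feature vector (UR, LR, UG, LG, UB, LB) for an RGB image."""
    feats = []
    for c in range(3):          # R, G, B planes in turn
        u, l = btc_level1(img[:, :, c].astype(float))
        feats.extend([u, l])
    return feats
```

For a constant plane the lower region is empty, so a production version would guard against an empty split.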
4.2 BTC Level 2
At level two the values Upper Red and Lower Red are extracted from the feature vector of BTC level 1 and, using these values, the Red plane of the face image is now divided into four regions: Upper-Upper Red, Upper-Lower Red, Lower-Upper Red and Lower-Lower Red [18]. The average intensity values in these four regions are calculated and stored in the feature vector. The above process is repeated for the Blue and Green planes of the face image. Thus the feature vector at this level has 12 elements, 4 elements for each plane. Refer to figure 1.
4.3 BTC Level 3 and Level 4
Using the procedures described in Levels 1 and 2, the face images are further divided into more regions in each colour plane. These regions are depicted in figure 1. The average intensity values in these regions are calculated and stored in the feature vector. The feature vector has 24 elements at BTC level 3 and 48 elements at BTC level 4. The feature vectors obtained in BTC levels 1, 2, 3 and 4 are used for comparison with the database image set. Figure 1 depicts the four BTC levels with their feature vectors.
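Since each level splits every region of the previous level around that region's own mean, the whole hierarchy can be expressed recursively. A sketch (our formulation, not the paper's code) yielding 2, 4, 8 and 16 values per plane for levels 1 to 4:

```python
import numpy as np

def btc_features(pixels, level):
    """Recursively split a 1-D array of pixel values around its mean.
    Level 1 returns [upper_mean, lower_mean]; each further level splits
    both regions again, doubling the length of the feature vector."""
    m = pixels.mean()
    upper, lower = pixels[pixels >= m], pixels[pixels < m]
    if level == 1:
        return [upper.mean(), lower.mean()]
    return btc_features(upper, level - 1) + btc_features(lower, level - 1)
```

Applied to the three RGB planes this gives 6, 12, 24 and 48 elements at levels 1 to 4, matching the vector sizes stated above.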
4.4 BTC for YCbCr plane
YCbCr is a family of colour spaces used as a part of the colour image pipeline in video and digital photography systems. Y is the luma component, and Cb and Cr are the blue-difference and red-difference chroma components. The same algorithm used for the RGB planes is implemented on the YCbCr planes by converting the RGB image to a YCbCr image.
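A minimal sketch of that conversion, using the full-range ITU-R BT.601 coefficients (MATLAB's rgb2ycbcr uses a studio-swing variant with slightly different scaling, so this is illustrative only):

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr for 8-bit channel values."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```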
5. PROPOSED METHOD
5.1 Feature vector extraction
The feature vector at each BTC level for the query image and the database set is extracted using the method described in the previous section (section 4). Feature vectors for the Red, Green and Blue components of the image are obtained. These feature vectors are then used in the face recognition system.
5.2 Implementation using feature vectors
The feature vectors obtained in each level of BTC are used to compare with the database images (training set). The comparison (similarity measure) is done by the Mean Square Error (MSE), given by

MSE = (1 / mn) ΣᵢΣⱼ [X(i,j) − X′(i,j)]²

where X and X′ are the two feature vectors of size m×n being compared. False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR) are used to evaluate the performance of the different BTC-level based face recognition techniques.
The mean square error is calculated between the query image's feature vector and every feature vector in the database; the database image with the minimum MSE is returned as the recognized image.
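The matching rule can be sketched as a nearest-neighbour search under MSE (illustrative Python; the names are ours):

```python
import numpy as np

def mse(x, y):
    """Mean square error between two equal-sized feature vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.mean((x - y) ** 2)

def recognise(query_vec, train_vecs):
    """Index of the training feature vector closest to the query in MSE."""
    errors = [mse(query_vec, t) for t in train_vecs]
    return int(np.argmin(errors))
```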
6. IMPLEMENTATION
6.1 Platform
The implementation of the Multilevel BTC is done in
MATLAB 2010b. It was carried out on a computer using an
Intel Core i3 processor.
6.2 Database
The experiments were performed on a face database created by Dr Libor Spacek. This database has 1000 images (each 180 by 200 pixels), corresponding to 100 persons in 10 poses each, including both males and females. All the images are taken against a dark or bright homogeneous background, with little variation of illumination but different facial expressions and details. The subjects sit at a fixed distance from the camera and are asked to speak while a sequence of images is taken; the speech is used to introduce facial expression variation. The images were taken in a single session. Six poses from the face database are shown in Figure 2.
Figure 2
6.3 YCbCr based BTC Algorithm
Step 1: Training images are read into a vector and the RGB colour space is converted to the YCbCr colour space.
Step 2: BTC levels 1, 2, 3 and 4 are applied to the training images.
Step 3: BTC levels are applied to the test image.
Step 4: For each training image, the mean square error with respect to the test image is determined.
Step 5: The training image with the least mean square error is the recognized image for the test image.
7. RESULTS AND DISCUSSION
False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR) are standard performance evaluation parameters of a face recognition system.
The False Acceptance Rate (FAR) is a measure of the likelihood that the biometric security system will incorrectly accept an access attempt by an unauthorized user. A system's FAR is typically stated as the ratio of the number of false acceptances to the number of identification attempts:
FAR = (False Claims Accepted / Total Claims) × 100
The Genuine Acceptance Rate (GAR) is evaluated by subtracting the FAR value from 100:
GAR = 100 - FAR (percentage)
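In code, the two rates defined above are simply:

```python
def far(false_accepts, total_claims):
    """False Acceptance Rate as a percentage of all claims."""
    return 100.0 * false_accepts / total_claims

def gar(far_percent):
    """Genuine Acceptance Rate, defined here as the complement of FAR."""
    return 100.0 - far_percent
```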
In all, 99 queries are fired on the face database (132 images are considered). For each query, FAR and GAR values are calculated for each BTC-level based face recognition technique. At the end, the average FAR and GAR over all queries are considered for performance ranking of the BTC-level based face recognition techniques.
7.1 Face Database [15]
In all, 99 queries are tested on the Libor database of 100 images to analyze the performance of the proposed algorithms. The feature vectors of each image for all four BTC levels were calculated and then compared with the database. The FAR for the algorithm was found to be zero. A graph of the efficiency of the program is shown below.
The graph gives the efficiency values of the different BTC levels for the face database. It is observed that with each successive level of BTC the efficiency values increase; thus BTC level 4 gives the best performance for the RGB colour space.
It is also observed that the efficiency for the YCbCr colour space is higher than that for the RGB colour space; 100% efficiency is obtained for Levels 2, 3 and 4 when applied to the YCbCr colour space.
8. CONCLUSION
The three primary aspects on which face recognition depends are cost, accuracy of the algorithm and execution time of the program. As the BTC level increases, the GAR increases. The highest GAR is obtained for the level 4 implementation of BTC for the RGB colour space. The YCbCr colour space gives an improved result, with the highest GAR for BTC Levels 2, 3 and 4. This can be attributed to the relatively larger size of the feature vector at these levels. The proposed technique can be implemented in real-world scenarios by choosing the appropriate BTC level.
REFERENCES
[1] H.B. Kekre, Sudeep D. Thepade, Sanchit Khandelwal, Karan Dhamejani, Adnan Azmi, "Face Recognition using Multilevel Block Truncation Coding", International Journal of Computer Applications (IJCA), December 2011 Edition.
[2] Xiujuan Li, Jie Ma and Shutao Li, "A novel face recognition method based on Principal Component Analysis and Kernel Partial Least", IEEE International Conference on Robotics and Biomimetics (ROBIO), 2007.
[3] Shermin J., "Illumination invariant face recognition using Discrete Cosine Transform and Principal Component Analysis", 2011 International Conference on Emerging Trends in Electrical and Computer Technology (ICETECT).
[4] Zhao Lihong, Guo Zikui, "Face Recognition Method Based on Adaptively Weighted Block-Two Dimensional Principal Component Analysis", 2011 Third International Conference on Computational Intelligence, Communication Systems and Networks (CICSyN).
[5] Gomathi, E., Baskaran, K., "Recognition of Faces Using Improved Principal Component Analysis", 2010 Second International Conference on Machine Learning and Computing (ICMLC).
[6] Haitao Zhao, Pong Chi Yuen, "Incremental Linear Discriminant Analysis for Face Recognition", IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics.
[7] Tae-Kyun Kim, Kittler, J., "Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image", IEEE Transactions on Pattern Analysis and Machine Intelligence, March 2005.
[8] James, E.A.K., Annadurai, S., "Implementation of incremental linear discriminant analysis using singular value decomposition for face recognition", First International Conference on Advanced Computing (ICAC), 2009.
[9] Zhao Lihong, Wang Ye, Teng Hongfeng, "Face recognition based on independent component analysis", 2011 Chinese Control and Decision Conference (CCDC).
[10] Yunxia Li, Changyuan Fan, "Face Recognition by Non-negative Independent Component Analysis", Fifth International Conference on Natural Computation (ICNC), 2009.
[11] Yanchuan Huang, Mingchu Li, Chuang Lin and Linlin Tian, "Gabor Based Kernel Independent Component Analysis", Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP).
[12] H.B. Kekre, Sudeep D. Thepade, Varun Lodha, Pooja Luthra, Ajoy Joseph, Chitrangada Nemani, "Augmentation of Block Truncation Coding based Image Retrieval by using Even and Odd Images with Sundry Colour Space", Int. Journal on Computer Science and Engg., Vol. 02, No. 08, 2010, pp. 2535-2544.
[13] H.B. Kekre, Sudeep D. Thepade, Shrikant P. Sanas, "Improved CBIR using Multileveled Block Truncation Coding", International Journal on Computer Science and Engineering, Vol. 02, No. 08, 2010, pp. 2535-2544.
[14] H.B. Kekre, Sudeep D. Thepade, "Boosting Block Truncation Coding using Kekre's LUV Colour Space for Image Retrieval", WASET International Journal of Electrical, Computer and System Engineering (IJECSE), Volume 2, Number 3, pp. 172-180, Summer 2008.
[15] H.B. Kekre, Sudeep D. Thepade, "Image Retrieval using Augmented Block Truncation Coding Techniques", ACM International Conference on Advances in Computing, Communication and Control (ICAC3-2009), pp. 384-390, 23-24 Jan 2009, Fr. Conceicao Rodrigues College of Engg., Mumbai.
[16] Face database developed by Dr. Libor Spacek. Available online at: http://cswww.essex.ac.uk/mv/otherprojects.html
[17] Mark D. Fairchild, "Colour Appearance Models", 2nd Edition, Wiley-IS&T, Chichester, UK, 2005. ISBN 0-470-01216-1.
[Graph: efficiency (%) of BTC Levels 1-4 for the RGB and YCbCr colour spaces; vertical axis from 86 to 102.]
Online Gesture Body Recognition
Latha K1, Kavitha Vasanth2
1 IV Sem, M.Tech, Dept. of CSE, AIT, Bangalore, Karnataka, INDIA
2 Asst. Professor, Dept. of CSE, AIT, Bangalore, Karnataka, INDIA
[email protected]
[email protected]
Abstract - This paper presents a robust framework for online full-body gesture spotting from visual hull data. Using full-body gesture features as observations, SVMs (Support Vector Machines) are trained for gesture spotting from continuous movement data streams. The major contribution of this paper is a systematic approach to automatically detecting and modelling specific gesture movement patterns and using their HMMs for outlier rejection in gesture spotting. The proposed framework is tested on the IXMAS gesture dataset [3], and the gesture spotting results are superior to those reported on the same dataset using existing state-of-the-art gesture spotting methods [1]. The aim is to develop an efficient algorithm for an online body gesture recognition system.
Index Terms—Online gesture spotting, SVM, multilinear analysis, IXMAS, hidden Markov models.
1. INTRODUCTION
A gesture is a form of non-verbal communication in which visible body actions communicate particular messages, either in place of speech or together and in parallel with spoken words. Gestures include movements of the hands, face, or other parts of the body.
Gestures are expressive, meaningful body motions involving physical
movements of the fingers, hands, arms, head, face, or body with the intent of: (1)conveying meaningful
Information or (2) interacting with the environment. They constitute
one interesting small subspace of possible human motion. A gesture
may also be perceived by the environment as a compression technique for the information to be transmitted elsewhere and
subsequently reconstructed by the receiver.
Gesture Recognition is a technology that achieves dynamic human-
system interactions that do not require physical, touch, or contact based input mechanisms. Gesture recognition enables humans to
interface with the machine (HMI) and interact naturally without any
mechanical devices. Using the concept of gesture recognition, it is
possible to point a finger at the computer screen so that the cursor will move accordingly. This could potentially make conventional
input devices such as mouse, keyboards and even touch-screens
redundant.
Gesture recognition can target the hands alone or the full body. Gestures are most commonly used for communication among humans; using continuous motion information reduces the chances of misclassifying static poses. Gestures can be divided into two types [3]:
i) Communicative gestures (key gestures, or meaningful gestures).
ii) Non-communicative gestures (garbage gestures, or transition gestures).
A key gesture is motion that carries an explicit meaning to express
goals, and a transition gesture is motion that connects key gestures to cater to subconscious goals. Figure 1.1 shows a key gesture and
transition gesture.
Fig 1.1: Motion example consisting of a sequence of key gestures and
transition gestures
1.1 Scope of the project
This project is implemented in Matlab R2010a using the IXMAS gesture data set, which contains video clips of human actions like walking, waving and running.
1.2 Methodology
• Image/frame acquisition defining human actions.
• Human blob generation using the CCA algorithm.
• Contour extraction.
• Convex hull point generation using Graham's scan algorithm.
• Classification using Support Vector Machines (SVM).
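The convex hull step can be illustrated with Andrew's monotone-chain algorithm, a close relative of the Graham's scan named above (our sketch, not the project's code):

```python
def convex_hull(points):
    """Convex hull of 2-D points (monotone chain), returned counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                      # build lower hull
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build upper hull
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # drop duplicated endpoints
```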
1.3 Applications of gesture recognition
It has the following applications [2].
• Developing aids for the hearing impaired.
• Enabling very young children to interact with computers.
• Designing techniques for forensic identification. • Recognizing sign language.
• Medically monitoring patients‘ emotional states or stress levels.
• Navigating and/or manipulating in virtual environments.
• Communicating in video conferencing. • Distance learning/tele-teaching assistance.
2. TRAINING A MODEL
The result of running a machine learning algorithm can be expressed as a function ϕ(x), parameterized by the model parameters, which takes an input vector x and generates an output vector y, encoded in the same format as the target vector t.
Fig 2.1: A machine learning algorithm, expressed as a function with input x and output y, parameterized by the model parameters.
2.1 Pre-processing
It is common for the original input variables to be pre-processed in some way to reduce both the computational load and the complexity of the recognition problem. The pre-processing stage could consist of simply scaling or normalizing the data to a standard range, smoothing it to remove noise, or transforming it into some new subspace of variables where, it is hoped, the recognition problem will be easier to solve. This pre-processing stage is sometimes also called feature extraction. It should be noted that new test data must be pre-processed using the same feature extraction method as used on the training data.
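For example, scaling a feature sequence to a standard [0, 1] range is one line of arithmetic per value (an illustrative sketch):

```python
def normalize(xs):
    """Min-max scale a sequence of numbers to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:                      # constant input: avoid division by zero
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]
```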
2.2 Post-processing
In addition to pre-processing the raw data before it reaches a machine learning algorithm, it is also common to process the algorithm's output before using it to make a decision, such as triggering or not triggering a sound to play. This post-processing stage could consist of waiting for a number of consecutive 'trigger' classification results before a sample is triggered, or of combining the classification results of several machine learning algorithms to create one super-classifier. The post-processing stage also enables the output of the machine learning algorithm to be combined with additional domain-specific information to provide context. Figure 2.2 illustrates the processing chain of gesture recognition.
Fig 2.2: An illustration of the processing chain for
a gesture recognition system
3. MACHINE LEARNING MODELS
3.1 Hidden Markov Model (HMM)
A time-domain process demonstrates a Markov property if the conditional probability density of the current event, given all present and past events, depends only on the j-th most recent events. If the current event depends solely on the most recent past event, the process is termed a first-order Markov process. A hidden Markov model (HMM) is a finite state machine which generates a sequence of discrete-time observations. At each time unit, the HMM changes state according to a state transition probability and then generates observation data according to the output probability distribution of the current state.
An N-state HMM is specified by:
• The set of states S = s1, s2, s3, ..., sN.
• The set of parameters λ = (π, A, B):
a) the state transition probabilities a(i,j), where i, j = 1...N;
b) the output probability distributions B = bi(o), where i = 1...N;
c) the initial state probabilities π(i), where i = 1...N.
Fig 3.1: A three-state left-to-right HMM.
Fig 3.1 shows a 3-state left-to-right model, in which the state index increases or stays the same as time increments. Left-to-right models are often used as speech units to model speech parameter sequences, since they can appropriately model signals whose properties change successively. [4]
Table 3.1: Output probabilities bi(o) for the three states.
State 1: a 0.8, b 0.1, c 0.1
State 2: a 0.2, b 0.6, c 0.2
State 3: a 0.7, b 0.3, c 0.1
What is the probability of the HMM producing "a, a, b, c"?
Pr(a, a, b, c) via 1,1,2,3 = 0.8 × 0.5 × 0.8 × 0.3 × 0.6 × 0.5 × 0.1 = 0.00288.
Pr(a, a, b, c) via 1,2,3,3 = 0.8 × 0.3 × 0.2 × 0.5 × 0.3 × 1.0 × 0.1 = 0.00072.
Pr(a, a, b, c) via 1,3,3,3 = 0.8 × 0.2 × 0.7 × 1.0 × 0.3 × 1.0 × 0.1 = 0.00336.
Because of the above-mentioned problems related to HMMs, the proposed framework classifies gestures with a Support
Vector Machines (SVM) classifier. Prior work employed SVMs to recognize hand gestures [5].
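The path probabilities in the Table 3.1 example can be reproduced by scoring one fixed state path, assuming the chain starts in the path's first state with probability 1 (tables transcribed from the worked example; the code itself is our sketch):

```python
# Transition (A) and output (B) probabilities from the worked example.
A = {1: {1: 0.5, 2: 0.3, 3: 0.2}, 2: {2: 0.5, 3: 0.5}, 3: {3: 1.0}}
B = {1: {'a': 0.8, 'b': 0.1, 'c': 0.1},
     2: {'a': 0.2, 'b': 0.6, 'c': 0.2},
     3: {'a': 0.7, 'b': 0.3, 'c': 0.1}}

def path_prob(obs, path, A, B):
    """Joint probability of an observation string along one state path."""
    p = B[path[0]][obs[0]]            # emit the first symbol
    for t in range(1, len(obs)):
        p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
    return p
```

Summing path_prob over all feasible paths gives the total probability of the observation sequence; the forward algorithm computes the same sum efficiently.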
3.2 SVM Algorithm
Support vector machine is a classifier derived from statistical
learning theory invented by Vapnik and first introduced at the
Computational Learning Theory (COLT) 1992 conference. A support
vector machine (SVM) is a concept in statistics and computer science
for a set of related supervised learning methods that analyze data and
recognize patterns, used for classification and regression analysis.
The standard SVM takes a set of input data and predicts, for each given input, which of two possible classes the input belongs to, making the SVM a non-probabilistic binary linear classifier. Given a set of
training examples, each marked as belonging to one of two
categories, an SVM training algorithm builds a model that assigns
new examples into one category or the other. An SVM model is a
representation of the examples as points in space, mapped so that the
examples of the separate categories are divided by a clear gap that is
as wide as possible.
Fig 3.2: A 1-dimensional linearly inseparable
classification problem.
Fig 3.3: Mapping the data into a new, 2-dimensional feature space to make the data linearly separable, achieved using an RBF kernel.
Following are some of the kernel functions commonly used to map input features into a new feature space.
Linear kernel: the simplest kernel, which performs well for linearly separable data: K(x, y) = x · y.
RBF (Gaussian) kernel: K(x, y) = exp(−‖x − y‖² / 2σ²).
By using one of these kernels, inseparable data can be converted into linearly separable data.
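Both kernels are a few lines each (an illustrative sketch):

```python
import math

def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return sum(xi * yi for xi, yi in zip(x, y))

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))
```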
4. GESTURE RECOGNITION PROCESS Recognition of human actions from video sequences involves
extraction of relevant visual information from a video sequence,
representation of that information in a suitable form, and
interpretation of visual information for the purpose of
recognition and learning human actions. Video sequences
contain large amounts of data, but most of this data does not
carry much useful information. Therefore, the first step in
recognizing human actions is to extract relevant information
which can be used for further processing. This can be achieved
through visual tracking. Tracking involves detection of regions
of interest in image sequences, which are changing with respect
to time. Tracking also involves finding frame to frame
correspondence of each region so that location, shape, extent,
etc., of each region can be reliably extracted. The recognition involves two phases, training and classification, and is carried out in offline mode using the IXMAS gesture data set.
4.1 Block diagram
Fig 4.1: Top- level diagram of proposed system
Gesture recognition using SVM follows the methodology described in block diagram 4.1. The gesture recognition is done in offline mode, considering the IXMAS gesture data set. The data set contains video clips of actions like waving, punching and kicking, performed by several persons.
Training phase
Initially the video has n frames in RGB scale. The frame which contains the human subject has to be separated from the background frame, which results in segmentation of the human object. The human object has to be converted to binary in order to generate the human blob using the CCA algorithm; conversion from RGB to binary is done through grayscale conversion. Once the human blob is generated, features like boundary points need to be extracted from the human subject. The boundary points are obtained by the contour method. From the contour points, the hull points are obtained; these hull points are the valid points among the contour points and have to be extracted from each and every frame. These hull points are the feature points generated for every action or gesture. Subtraction of hull points from consecutive frames results in motion vectors, or feature vectors. Dynamic Time Warping (DTW) is used in the scenario where the computed hull points for the current frame are fewer than in the previous or next frame. Each feature vector is a scalar value which represents the displacement of the human position from one frame to the next. In general, spatio-temporal features are extracted for each action; the spatio-temporal features here are the contour and convex hull points. The obtained features for each and every gesture, along with the class labels like type 1, type 2, etc. representing the various actions, are passed to the SVM for training. Training the samples using the SVM results in a knowledge base.
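One simple reading of the "subtraction of hull points" step, though not the only one, is the displacement of the hull centroid between consecutive frames; a sketch under that assumption:

```python
import numpy as np

def motion_feature(hull_prev, hull_curr):
    """Scalar displacement of the hull centroid between two frames."""
    c_prev = np.mean(np.asarray(hull_prev, float), axis=0)
    c_curr = np.mean(np.asarray(hull_curr, float), axis=0)
    return float(np.linalg.norm(c_curr - c_prev))
```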
Testing phase or classification
The IXMAS gesture data set is used for the testing phase, and the above procedure is repeated for testing. The feature vectors generated from the gestures, together with the knowledge base resulting from training, are used for classification of the gestures.
5. RESULTS
5.1 Key Features
The proposed method is tested on the IXMAS gesture data set. Actions like get-up, sit-down, walking and waving are classified using the SVM algorithm. The process for classification is followed as in the menu.
Fig 5.1: Menu showing the overall recognition process.
It has the following steps to be carried out: i) creating a database of all actions; ii) performing SVM training; iii) selecting a query action for classification; iv) displaying the results.
5.2 Consider the person getting up
a) Test clip, 5th frame
Fig 5.2: Test clip: i) grayscale image, ii) binary image, iii) RGB image (5th frame).
In the test video, initially the person is in the sitting position, lifting his arms.
• Subtracting the test frame from the background frame results in the grayscale image.
• Binary image: grayscale-to-binary conversion.
• RGB image: regionprops is applied to the binary image to obtain the human blob, which is multiplied by the colour component.
• Red rectangle: bounding box; blue point: centroid; blue outline: contour.
• Contour points: 3348; convex hull points: 40 for the 5th frame.
b) Test video, 20th frame
Fig 5.3: Test video: i) grayscale image, ii) binary image, iii) RGB image (20th frame).
In the test video, the person is about to stand.
• Subtracting the test frame from the background frame results in the grayscale image.
• Binary image: grayscale-to-binary conversion.
• RGB image: regionprops is applied to the binary image to obtain the human blob, which is multiplied by the colour component.
• Red rectangle: bounding box; blue point: centroid; blue outline: contour.
• Contour points: 4672; convex hull points: 46 for the 20th frame.
Convex hull points are extracted for consecutive frames and the resultant feature vectors are passed to the SVM in order to train actions. Once a query action is chosen for testing, feature vectors are extracted and compared with the ones already trained. If the feature vectors match, the action is classified and the result is displayed.
Fig 5.3: iv) The action is classified and the result is displayed.
6. CONCLUSION AND FUTURE WORK
Gesture recognition using an SVM classifier is implemented in Matlab. In this project, we present a gesture spotting framework from hull data. Invariant features are extracted using non-linear analysis and used as input to the SVM classifier. In future, the proposed features can be used to achieve body-shape invariance, and the SVM classifier can be trained to model unseen data.
REFERENCES
[1] C. Cruz-Neira, D.J. Sandin, T.A. DeFanti, R.V. Kenyon, and J.C. Hart, "The Cave: Audio Visual Experience Automatic Virtual Environment", Comm. ACM, vol. 35, no. 6, pp. 64-72, 1992.
[2] Sushmita Mitra and Tinku Acharya, "Gesture Recognition: A Survey", IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, May 2007, Vol. 37, Issue 3, pp. 311-324.
[3] Hee-Deok Yang, A-Yeon Park and Seong-Whan Lee, "Gesture Spotting and Recognition for Human-Robot Interaction", IEEE Transactions on Robotics, April 2007, Vol. 23, Issue 2, pp. 256-270.
[4] Stjepan Rajko and Gang Qian, "HMM Parameter Reduction for Practical Gesture Recognition", IEEE International Conference on Automatic Face and Gesture Recognition, September 2008, pp. 1-6.
[5] Yu Yuan and Kenneth Barner, "Hybrid Feature Selection for Gesture Recognition Using Support Vector Machines", IEEE International Conference on Acoustics, Speech and Signal Processing, April 2008, pp. 1-24.
Rice Quality Analysis Using Image Processing Mahalakshmi M.N.1, K.V. Suresh2, Partha Das3
1IV sem, M. Tech. (Signal Processing), Dept. of E & C, SIT, Tumkur, Karnataka, India
2Professor and Head, Dept. of E & C, SIT, Tumkur, Karnataka, India.
3 R & D Engineer, Opto-Electronic color sorter division, Fowler Westrup (India) Pvt. Ltd., Bangalore, Karnataka, India.
[email protected]
[email protected]
[email protected]
Abstract-Quality assessment of rice is a very important task in
food industries. Various methods have been proposed for the
quality analysis of rice based on determining the size distribution
of rice grain, percentage of broken rice kernels, assessing
breakage and cracks of rice kernels, counting the number of long
and small seeds. This paper proposes a colour based method for
analyzing the quality of sona masuri rice in terms of counting the
number of good rice kernels and defects in a given sample of rice
using image processing technology. The proposed method has been developed using MATLAB 7.12 and works for non-touching rice kernels. The accuracy of the proposed method is above 90% for counting chalky defects, above 90% for good rice kernels, between 80% and 90% for orange rice kernels, and 70% to 90% for black defects.
Keywords: quality analysis; rice kernels; counting; sona masuri;
defects
I. INTRODUCTION
Rice is the most widely consumed staple food for a large part
of the world‘s human population especially in Asia and West
Indies. Therefore the quality of rice is an important parameter to be analyzed so as to protect consumers from substandard
quality of rice. Quality control of rice is of great importance in
the food industry because after harvesting, based on quality
parameter rice will be sorted and graded. The quality of rice is
based on few properties such as color, size, shape, cooking
texture and the number of broken rice kernels. When milled
rice reaches the market, it may contain defective kernels such as chalky, orange, black and broken kernels along with good ones. After reaching the market, the quality of rice becomes the determinant of its saleability. Hence the quality analysis of rice is of major importance.
The most commonly used method for quality analysis of rice
is through human visual inspection. This method is time
consuming and the accuracy of analysis varies from person to
person. Also it demands experienced inspectors to accurately
analyze the quality. There is a need to explore the use of
technology for quality analysis of rice. The literature records a few papers which focus on quality analysis of rice. G. Van Dalen [1] developed a method for
determining the size distribution of rice and percentage of
broken kernels of rice using flatbed scanning and image
analysis. This method was able to measure the area, length,
width and perimeter of each rice kernel. Chaoxin Zheng et al.
[2] presented a review of techniques available for image
feature extraction and their applications in the food industry.
Zhao Ping and LI Yongkui [3] proposed a method based on
image processing technology to improve the efficiency and
precision of grain counting. Francis Courtois et al. [4] dealt
with a method for the measurement of breakage ratio and the
estimation of fissures on parboiled rice. This method used a
flat bed scanner for image acquisition and a gap filling method
to separate the touching grains. Yong Wu and Yi Pan [5]
developed a method for the measurement of cereal grain size
based on image processing. They have used 2D Otsu method
for segmentation. L.A.I. Pabamalie and H.L.Premaratne [6]
proposed a method for identification of rice quality using
neural network and image processing. This method used a
back propagation network with two hidden layers for classification. Chetna V.Maheshwari et al. [7] proposed a
technique for counting normal, long and small seeds using
computer vision and image processing. All the methods recorded in the literature dealt with gray-level images of rice
sample for quality analysis. Since the defects found in sona
masuri rice differ in colour, the proposed method focuses on
color based quality analysis so as to be able to detect all the
types of defects and good rice kernels. Hence the proposed
method uses colour image. The flow chart of the proposed
method is explained in section II of this paper. In section III,
the apparatus and software used, the procedure used for image
acquisition, image processing steps involved and the results
obtained are explained. The conclusion is given in section IV.
II. METHODOLOGY
The flowchart of the proposed method for quality analysis of rice is shown in Fig. 1. The acquired color image of the rice sample is subjected to segmentation in order to extract each type of defect and the good rice kernels separately. The resulting images of segmentation are converted to binary images.
Fig.1. Flow chart of the proposed method
For the resulting binary images, morphological erosion and opening operations are performed to remove the unwanted objects and thus to improve the accuracy of counting. From the resulting images, the number of connected components is counted and the results of counting are displayed. The steps explained above are applied to count chalky defects, orange defects, black defects and good rice kernels. These counts reflect the quality of rice.
III. EXPERIMENTAL SETUP AND RESULTS
a. Apparatus and software
A flatbed scanner (HP Scanjet G3110) has been used to capture the image of the rice sample. A red color paper is used as the background for the rice sample. MATLAB 7.12 has been used for the development of the software for quality analysis.
b. Rice sample preparation
Rice samples of sona masuri were taken such that the samples contained good rice kernels along with chalky, orange and black defects. Sona masuri was chosen because it is the most commonly used rice type in southern Karnataka.
c. Image acquisition
A flatbed scanner, also called a desktop scanner, is used to obtain the images of rice kernels. Scanners are versatile machines commonly found in offices; they are independent of external light conditions and were found to be suitable for this application, and hence were used for image acquisition. The rice sample contains good grains as well as defected grains. The color of a grain is an important parameter which can be used to distinguish good kernels from defected kernels. The color of good rice kernels and also of chalky rice kernels is close to white, and hence a white background is not appropriate. Since the rice sample also contains black defects, a black background was not used either. Finally, a red color paper was used as the background for the rice sample.
A sample of rice was placed on the glass of the flatbed
scanner. The rice kernels were spread so that they do not
touch each other. A red color background paper was placed on
the rice sample and the color image was captured. The
captured color image was stored in JPEG format. One such
image is shown in Fig. 2.
The captured image is in RGB color space. As can be seen from Fig. 2, the chalky kernels are bright white in color and are similar to good rice kernels. A chalky defect occurs when part of the starch is not developed properly, leaving a point of weakness. When chalky rice is milled, it is more likely to break, reducing the amount of rice recovered after milling. The other defects are orange rice kernels and black defects, as shown in Fig. 2. Even though black rice kernels are high in nutritional value, amino acids and several important vitamins, they are usually considered defects because of their colour.
d. Image processing
The captured RGB images are processed using the following
techniques of image processing.
1) Segmentation: Thresholding in RGB color space is used for segmentation. Each pixel of the input color image contains three color component values, i.e. R, G and B. The R, G and B components of good rice kernels and of each type of defected rice kernel are observed; the component values of each type of defect and of good rice kernels fall within a common range. Using these ranges of values, a thresholding range is fixed in order to segment chalky, good, orange and black rice kernels separately.
Fig. 2. Image of a sample of rice
Equations (1), (2) and (3) represent the thresholding of the R, G and B components respectively of each pixel of the input color image:

T_R1 ≤ f_R(i,j) ≤ T_R2 (1)
T_G1 ≤ f_G(i,j) ≤ T_G2 (2)
T_B1 ≤ f_B(i,j) ≤ T_B2 (3)

In equation (1), f_R(i,j) represents the R component of each pixel of the input image f(i,j), where i represents the row number and j the column number; 255 is the maximum value the R component can take. The range between T_R1 and T_R2 is the thresholding range of R component values fixed for segmentation of a particular type of rice kernel. Likewise, in equations (2) and (3), f_G(i,j) and f_B(i,j) represent the G and B components of each pixel, with [T_G1, T_G2] and [T_B1, T_B2] as the thresholding ranges of G and B component values fixed for segmentation of a particular type of rice kernel.
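The per-channel thresholding described above can be sketched as follows. The (low, high) ranges in this example are illustrative placeholders only, not the values used for the rice samples:

```python
# Per-channel RGB thresholding for segmenting one kernel type.
# The (low, high) ranges below are illustrative placeholders;
# the actual ranges must be fixed by observing sample kernels.

def threshold_rgb(image, r_range, g_range, b_range):
    """Return a binary mask: 1 where all three components
    fall inside their thresholding ranges, else 0."""
    mask = []
    for row in image:
        mask_row = []
        for (r, g, b) in row:
            inside = (r_range[0] <= r <= r_range[1] and
                      g_range[0] <= g <= g_range[1] and
                      b_range[0] <= b <= b_range[1])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

# Tiny 2x2 example image: two reddish pixels, two others.
image = [[(200, 40, 30), (250, 248, 245)],
         [(30, 30, 30), (210, 60, 40)]]
# Hypothetical thresholding range for "reddish" pixels.
mask = threshold_rgb(image, (180, 255), (0, 100), (0, 100))
print(mask)  # [[1, 0], [0, 1]]
```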
2) Conversion to binary image: The images resulting from thresholding are converted to gray scale images. This conversion is done by eliminating the hue and saturation information while retaining the luminance information. The gray scale value of each pixel is computed using equation (4):

Q = 0.2989 R + 0.5870 G + 0.1140 B (4)

In equation (4), R, G and B represent the red, green and blue components respectively of every pixel of the image resulting after thresholding, and Q represents the resulting gray scale value for the corresponding pixel. A global threshold is then computed, using which the gray scale image is converted to a binary image.
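A minimal sketch of this conversion, using the luminance weights of equation (4); the global threshold here is a fixed illustrative value rather than one computed from the image:

```python
# Convert an RGB image to gray scale using the luminance weights
# of equation (4), then binarize with a global threshold.
# The threshold value 128 is illustrative; in practice a global
# threshold would be computed from the image itself.

def to_gray(image):
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b
             for (r, g, b) in row]
            for row in image]

def to_binary(gray, threshold=128):
    return [[1 if q >= threshold else 0 for q in row]
            for row in gray]

image = [[(255, 255, 255), (0, 0, 0)],
         [(200, 40, 30), (250, 248, 245)]]
binary = to_binary(to_gray(image))
print(binary)  # [[1, 0], [0, 1]]
```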
3) Morphological opening: This operation is applied to the binary images resulting from the previous step to remove all connected components that have fewer pixels than a threshold value. It involves determining the connected components, computing the area of each component, and fixing a threshold in terms of pixels so as to remove the unwanted objects.
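The area-filtering step above can be sketched as a small-component removal pass; this pure-Python flood-fill version is an illustrative stand-in for MATLAB's morphological routines:

```python
# Remove connected components smaller than a pixel-count threshold
# from a binary image (8-connectivity), mimicking the area-opening
# step described above. Pure-Python flood fill for illustration.

def remove_small_components(binary, min_pixels):
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [row[:] for row in binary]
    for i in range(rows):
        for j in range(cols):
            if binary[i][j] == 1 and not seen[i][j]:
                # Flood-fill one component, collecting its pixels.
                stack, pixels = [(i, j)], []
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] == 1
                                    and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                if len(pixels) < min_pixels:
                    for (y, x) in pixels:
                        out[y][x] = 0
    return out

binary = [[1, 1, 0, 0],
          [1, 1, 0, 1],   # lone pixel at (1, 3) is noise
          [0, 0, 0, 0]]
print(remove_small_components(binary, 2))
# [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
```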
4) Counting: A connected component labeling technique is used to count the number of defects and good rice kernels. For labeling connected components, 8-connectivity is used. The labeling algorithm involves scanning the image along the columns, assigning preliminary labels, recording label equivalences in a local equivalence table, resolving the equivalence classes, and relabeling the runs based on the resolved equivalence classes.
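The two-pass labeling described above can be sketched as follows; this is a generic implementation of the classical algorithm, not the paper's MATLAB code:

```python
# Count connected components in a binary image with 8-connectivity
# using the classical two-pass labeling scheme: assign preliminary
# labels, record equivalences, then resolve and count them.

def count_components(binary):
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = {}  # union-find over preliminary labels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    next_label = 1
    # First pass: preliminary labels + equivalence recording.
    for i in range(rows):
        for j in range(cols):
            if binary[i][j] != 1:
                continue
            # 8-connected neighbours already visited in raster order.
            neighbours = [labels[y][x]
                          for (y, x) in ((i - 1, j - 1), (i - 1, j),
                                         (i - 1, j + 1), (i, j - 1))
                          if 0 <= y < rows and 0 <= x < cols
                          and labels[y][x] > 0]
            if not neighbours:
                parent[next_label] = next_label
                labels[i][j] = next_label
                next_label += 1
            else:
                labels[i][j] = min(neighbours)
                for n in neighbours:
                    union(labels[i][j], n)
    # Second pass: count distinct resolved labels.
    return len({find(labels[i][j])
                for i in range(rows) for j in range(cols)
                if labels[i][j] > 0})

binary = [[1, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 0, 1, 1]]
print(count_components(binary))  # 3
```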
To study the performance of the algorithm, samples of raw sona masuri rice were taken and images were captured using the flatbed scanner. Fig. 3, Fig. 4 and Fig. 5 show the input images, and Table I shows the results obtained for these three images. The accuracy of the proposed method is above 90% for counting chalky defects, between 80% and 90% for orange rice kernels, above 90% for good rice kernels, and between 70% and 90% for black defects. The accuracy of counting black defects is lower because the color of a black defect is not uniform throughout the kernel.
Since chalky kernels are bulgier than other rice kernels, they suffer from shadow effects, which in turn reduce the accuracy of counting black defects. A few orange kernels are very light in colour; such kernels are close to good rice kernels in appearance. Hence the accuracy of counting good and orange rice kernels is sometimes low.
IV. CONCLUSION
A methodology for quality analysis of a rice sample based on counting the number of defects (chalky, orange and black) and good rice kernels using image processing has been developed using MATLAB 7.12. The proposed algorithm has been tested on 10 images of rice samples. The accuracy of counting is above 90% for chalky defects, between 80% and 90% for orange rice kernels, above 90% for good rice kernels, and between 70% and 90% for black defects. This method works for non-touching rice kernels. Further work includes developing a methodology for counting the number of broken good rice kernels and extending the algorithm to touching rice kernels.
Fig. 3. Example (1)
Fig. 4. Example (2)
Fig. 5. Example (3)
TABLE I
RESULTS OF COUNTING
ACKNOWLEDGEMENT
The authors express their gratitude to M/S Fowler Westrup India Pvt. Ltd., Bangalore for the immense support given to carry out the work.
REFERENCES
[1] G. Van Dalen, "Determination of the size distribution and percentage of broken kernels of rice using flatbed scanning and image analysis", Food Research International, vol. 37, pp. 51-58, 2004.
[2] Chaoxin Zheng, Da-Wen Sun and Liyun Zheng, "Recent developments and applications of image features for food quality evaluation and inspection - a review", Trends in Food Science & Technology, vol. 17, pp. 642-655, 2006.
[3] Zhao Ping and Li Yongkui, "Grain counting method based on image processing", International Conference on Information Engineering and Computer Science (ICIECS), pp. 1-3, 2009.
[4] Francis Courtois, Matthieu Faessel and Catherine Bonazzi, "Assessing breakage and cracks of parboiled rice kernels by image analysis techniques", Food Control, vol. 21, pp. 567-572, 2010.
[5] Yong Wu and Yi Pan, "Cereal grain size measurement based on image processing technology", International Conference on Intelligent Control and Information Processing, Aug 13-15, 2010.
[6] L.A.I. Pabamalie and H.L. Premaratne, "A grain quality classification system", International Conference on Bioengineering, pp. 56-61, 2010.
[7] Chetna V. Maheshwari, Kavindra R. Jain and Chintan K. Modi, "Non-destructive quality analysis of Indian basmati oryza sativa using image processing", International Conference on Communication Systems and Network Technologies (IEEE), pp. 189-193, 2012.
Progressive Image Transmission over Coded
OFDM system with LDPC Thejaswi K V1, K Nagamani2
1M.Tech, Digital Communication,
2Assistant Professor, Dept of Telecommunication Engg.,
1,2R.V College of Engineering, Bengaluru
Email:[email protected] ,[email protected]
Abstract
Image compression and transmission are everyday challenges in the field of multimedia. Progressive image transmission over a coded Orthogonal Frequency Division Multiplexing (OFDM) system with Low Density Parity Check (LDPC) coding is a new scheme. It improves the error resilience and transmission efficiency of progressive image transmission over an Additive White Gaussian Noise (AWGN) channel. The Set Partitioning in Hierarchical Trees (SPIHT) algorithm is used for source coding of the images to be transmitted. To improve the BER performance of the OFDM system, a combination of the high-spectral-efficiency OFDM modulation technique and LDPC coding is used, which improves the reconstructed image quality. Simulation results of image transmission confirm the effectiveness of the proposed scheme.
Keywords: OFDM, SPIHT, LDPC, PSNR, MSE.
I. Introduction
OFDM modulation has been adopted by several wireless multimedia transmission standards, such as Digital Audio Broadcasting (DAB) and Digital Video Broadcasting (DVB-T), because it provides a high degree of immunity to multipath fading and impulsive noise. High spectral efficiency and efficient modulation and demodulation by IFFT/FFT are further advantages of OFDM. In a frequency-selective radio transmission channel, fading and Inter-Symbol Interference (ISI) result in severe losses of transmitted image quality. OFDM divides the frequency-selective channel into several parallel, non-frequency-selective narrow-band channels and modulates the signal onto different frequencies. It can significantly improve channel transmission performance without employing complex equalization schemes, and it has broad application prospects in wireless image and video communications [1, 2].
The SPIHT algorithm was introduced by Said and Pearlman [3]. It is based on the wavelet transform and restricts the necessity of random access to the whole image to small sub-images. The principle of SPIHT is partial ordering by magnitude with a set partitioning sorting algorithm, ordered bit-plane transmission, and exploitation of self-similarity across different scales of the image wavelet transform. The compression efficiency and simplicity of this algorithm have made it a well-known benchmark for embedded wavelet image coding. SPIHT is used for image transmission over OFDM systems in several research works [4, 5] because it has good rate-distortion performance for still images with comparatively low complexity, and it is scalable and completely embeddable.
To improve the BER performance of the OFDM system, several error correcting codes have been applied to OFDM. The combination of the high-spectral-efficiency OFDM modulation technique and LDPC coding is a good candidate for high-speed broadband wireless applications; LDPC has been adopted in the DVB-S2 standard. An (N, K) LDPC code can be represented by a very sparse parity-check matrix having M rows, N columns and code rate R = K/N, where K = N - M. LDPC codes were originally invented by Gallager in 1963 [6] and rediscovered by MacKay and Neal [7]. The BER performance of the Low Density Parity Check coded Orthogonal Frequency Division Multiplexing (LDPC-COFDM) system is influenced by the subchannels which have deep fades due to frequency-selective fading. Accordingly, several algorithms have been introduced into the LDPC-COFDM system to improve the BER by adaptive bit loading and power allocation on each subcarrier [8], [9].
The proposed scheme concentrates on improving the quality of the reconstructed images. It considers transmission of images over an Additive White Gaussian Noise (AWGN) channel with SPIHT as the source code over an OFDM system.
II. SPIHT Algorithm
The SPIHT algorithm defines and partitions sets in the wavelet-decomposed image using a special data structure called a spatial orientation tree. A spatial orientation tree is a group of wavelet coefficients organized into a tree rooted in the lowest frequency (coarsest scale) subband, with offspring in several generations along the same spatial orientation in the higher frequency subbands. Figure 1 shows a spatial orientation tree and the parent-children dependency defined by the SPIHT algorithm across subbands in the wavelet image. The tree is defined in such a way that each node has either no offspring (the leaves) or four offspring at the same spatial location in the next finer subband level. The pixels in the lowest frequency subband (the tree roots) are grouped into blocks of 2x2 adjacent pixels, and in each block one of them, marked by a star in Fig. 1, has no descendants. SPIHT describes this collocation with one-to-four parent-children relationships:
children(i, j) = {(2i, 2j), (2i, 2j+1), (2i+1, 2j), (2i+1, 2j+1)}, where (i, j) is the parent coordinate (1)
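The parent-children mapping of equation (1) is simple enough to state directly in code:

```python
# Parent-children mapping of the SPIHT spatial orientation tree,
# as in equation (1): parent (i, j) has four offspring at the same
# spatial location in the next finer subband level.

def children(i, j):
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]

print(children(1, 2))  # [(2, 4), (2, 5), (3, 4), (3, 5)]
```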
The SPIHT algorithm consists of three stages: initialization, sorting and refinement. It sorts the wavelet coefficients into three ordered
lists: the list of insignificant sets (LIS), the List of Insignificant
Pixels (LIP), and the List of Significant Pixels (LSP). At the
initialization stage the SPIHT algorithm first defines a start threshold based on the maximum value in the wavelet pyramid, then sets the
LSP as an empty list and puts the coordinates of all coefficients in the
coarsest level of the wavelet pyramid (i.e. the lowest frequency band;
LL band) into the LIP and those which have descendants also into the LIS.
In the sorting pass, the algorithm first sorts the elements of the
LIP and then the sets with roots in the LIS. For each pixel in the LIP it performs a significance test against the current threshold and
outputs the test result to the output bit stream. All test results are
encoded as either 0 or 1, depending on the test outcome, so that the
SPIHT algorithm directly produces a binary bit stream. If a coefficient is significant, its sign is coded and its coordinate is moved
to the LSP. During the sorting pass of LIS, the SPIHT encoder
carries out the significance test for each set in the LIS and outputs the
significance information. If a set is significant, it is partitioned into its offspring and leaves. Sorting and partitioning are carried out until
all significant coefficients have been found and stored in the LSP.
After the sorting pass for all elements in the LIP and LIS, SPIHT does a refinement pass with the current threshold for all entries in the
LSP, except those which have been moved to the LSP during the last
sorting pass. Then the current threshold is divided by two and the
sorting and refinement stages are continued until a predefined bit-budget is exhausted [3].
Fig.1: Parent–children dependency and spatial orientation trees
across wavelet subbands in SPIHT.
A. Low Density Parity Check Codes
Low-density parity-check (LDPC) codes are a class of linear block codes. The name comes from the characteristic of their parity-check matrix, which contains only a few 1's in comparison to the number of 0's. LDPC codes provide reliable transmission, with coding performance very close to the Shannon limit, and can outperform Turbo codes at long block lengths with relatively low decoding complexity. LDPC codes are finding increasing use in applications requiring reliable and highly efficient information transfer over bandwidth-constrained or return-channel-constrained links in the presence of data-corrupting noise.
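The parity-check view can be sketched as follows: a word c is a valid codeword exactly when H·c = 0 over GF(2). The matrix H here is a toy example for illustration, far smaller and denser than a practical LDPC matrix:

```python
# Syndrome check with a parity-check matrix over GF(2).
# H is a toy 3x6 matrix for illustration only; practical LDPC
# matrices are much larger and very sparse.

H = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 0, 1, 0, 0, 1]]

def syndrome(H, c):
    return [sum(h * x for h, x in zip(row, c)) % 2 for row in H]

def is_codeword(H, c):
    return all(s == 0 for s in syndrome(H, c))

print(is_codeword(H, [1, 1, 0, 0, 1, 1]))  # True
print(is_codeword(H, [1, 0, 0, 0, 0, 0]))  # False
```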
III. OFDM System
The block diagram of the proposed LDPC-COFDM system is illustrated in Figure 2. The SPIHT coder is chosen as the source coding technique due to its flexible code rate and the simplicity of designing an optimal system. SPIHT divides the image stream into several layers according to the importance of the progressive image stream, and the image stream is then converted to binary format. Afterwards the information bits are LDPC encoded at the LDPC encoder.
The OFDM considered in this paper utilizes N frequency tones (subcarriers), hence the baseband data is first converted into parallel data of N subchannels so that each bit of a codeword is carried on a different subcarrier. The N subcarriers are chosen to be orthogonal, that is, f_n = nΔf, where Δf = 1/T and T is the
OFDM symbol duration. Then, the transmitted data of each parallel subchannel is modulated by Binary Phase Shift Keying (BPSK), because it provides high throughput and the best performance when combined with OFDM. Finally, the modulated data are fed into an IFFT circuit, such that the OFDM signal is generated. The resulting OFDM signal can be expressed as follows:

x(t) = (1/√N) Σ_{n=0}^{N-1} X_n e^{j2πf_n t}, 0 ≤ t ≤ T (2)
where X_n is the data symbol modulated onto the nth subcarrier.
Fig. 2: The LDPC COFDM system model with trigonometric
transforms.
Each data block is padded with a cyclic prefix (CP) of a length longer than the channel impulse response to mitigate Inter-Block Interference (IBI). The continuous COFDM signal xg(t) is generated at the output of the digital-to-analog (D/A) converter.
At the receiver, the guard interval is removed and the time interval [0, T] is evaluated. Afterwards, the OFDM subchannel demodulation
is implemented using an FFT, and then Parallel-to-Serial (P/S) conversion is performed. The received OFDM symbols are demodulated at the demodulator. The demodulated bits are decoded for each LDPC encoded block and the data bits are restored. These data are converted back into image format, so that the SPIHT decoder can reconstruct the image.
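The IFFT/FFT transmit-receive chain above can be sketched end-to-end as a toy baseband model: no channel noise is modelled, and a naive O(N^2) DFT stands in for the fast transform:

```python
# Toy baseband OFDM chain: BPSK symbols -> IDFT -> cyclic prefix,
# then CP removal -> DFT recovers the symbols. A naive DFT stands
# in for the IFFT/FFT; no channel impairment is modelled.
import cmath

def idft(symbols):
    n_sub = len(symbols)
    return [sum(s * cmath.exp(2j * cmath.pi * k * n / n_sub)
                for n, s in enumerate(symbols)) / n_sub
            for k in range(n_sub)]

def dft(samples):
    n_sub = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / n_sub)
                for n, x in enumerate(samples))
            for k in range(n_sub)]

bits = [1, 0, 1, 1, 0, 0, 1, 0]
bpsk = [1 if b else -1 for b in bits]          # BPSK mapping
tx = idft(bpsk)                                # OFDM modulation
cp_len = 2
tx_cp = tx[-cp_len:] + tx                      # add cyclic prefix
rx = tx_cp[cp_len:]                            # remove CP at receiver
recovered = [0 if s.real < 0 else 1 for s in dft(rx)]
print(recovered == bits)  # True
```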
IV. Simulation Results
The transmission of SPIHT-coded images over the LDPC-COFDM system over an AWGN channel is simulated using MATLAB. The parameters used in the simulation are: the number of subcarriers of the LDPC coded OFDM system (N) is 512; the rate of the SPIHT coder (r) ranges from 0 to 1; an LDPC code of R = 1/2 is employed, where R denotes the code rate, with a (128, 256) parity check matrix. The input images are 256x256, 8 bits per pixel, grayscale test images. The PSNR is evaluated at different rates.
The Peak Signal-to-Noise Ratio is defined as

PSNR = 10 log10 (Peak^2 / MSE) (4)

where MSE is the mean squared error between the original and the reconstructed image, and Peak is the maximum possible magnitude of a pixel in the image. The peak value is 255 for an 8 bits/pixel original image.
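Equation (4) can be computed directly; the tiny 2x2 images below are illustrative only:

```python
# PSNR between an original and a reconstructed image, as in
# equation (4), with Peak = 255 for 8 bits/pixel images.
import math

def psnr(original, reconstructed, peak=255):
    diffs = [(o - r) ** 2
             for row_o, row_r in zip(original, reconstructed)
             for o, r in zip(row_o, row_r)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * math.log10(peak ** 2 / mse)

orig = [[52, 55], [61, 59]]
recon = [[52, 54], [60, 59]]
print(round(psnr(orig, recon), 2))  # 51.14
```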
To verify the effectiveness of the proposed method, image transmission is carried out over the COFDM system using the SPIHT coder as source coding. Simulation was carried out for different SPIHT rates, as shown in Table 1, for the Cameraman and Lena images of size 256x256.
Table 1: PSNR values for Cameraman and Lena image
Rate (r in bpp)   Cameraman PSNR (dB)   Lena PSNR (dB)
0.1               22.94                 22.86
0.2               25.29                 24.92
0.3               27.15                 26.51
0.4               28.46                 27.68
0.5               29.61                 28.87
0.6               30.79                 29.83
0.7               31.67                 30.88
0.8               32.66                 31.66
0.9               33.72                 32.55
1.0               34.81                 33.42
Fig 3: PSNR of the proposed scheme for different rates
From Table 1 and Figure 3, the PSNR increases with the rate of transmission. Figure 4 shows the original images transmitted through the system; the reconstructed images at the receiver for SPIHT rates 0.5 and 1 are given in Figure 5 and Figure 6 respectively. It is evident that the PSNR of the images at rate 0.5 is lower than the PSNR at rate 1. As the rate increases the PSNR increases, and hence so does the quality of the image at the receiver.
Fig. 4: Original transmitted images
PSNR = 29.61 dB PSNR = 28.87 dB
Fig. 5: Reconstructed images at SPIHT rate = 0.5
PSNR = 34.81 dB PSNR = 33.42 dB
Fig. 6: Reconstructed images at SPIHT rate = 1
V. CONCLUSION
An efficient LDPC-coded OFDM system supporting image transmission using the SPIHT compression technique has been proposed and analyzed. The effectiveness of the proposed system is investigated through simulations over an AWGN channel. It is found that the system must be designed carefully in order to achieve good PSNR performance. For LDPC-COFDM with code rate R = 0.5 and SPIHT rate r = 1, the transmitted image is reconstructed effectively. The PSNR of the received image at different rates has been evaluated.
REFERENCES
[1] H. Schulze and C. Luders, Theory and Application of OFDM and CDMA: Wideband Wireless Communication. John Wiley, 2005.
[2] Gusmao, R. Dinis and N. Esteves, "On Frequency Domain Equalization and Diversity Combining for Broadband Wireless Communications", IEEE Communication Letters, Vol. 51, No. 7, July 2003.
[3] Said and W. A. Pearlman, "A New, Fast and Efficient Image Codec Based on Set Partitioning In Hierarchical Trees", IEEE Trans. Circuits Syst. Video Technol., Vol. 6, pp. 243-250, 1996.
[4] Y. Sun, X. Wang and K. J. R. Liu, "A Joint Channel Estimation and Unequal Error Protection Scheme for Image Transmission in Wireless OFDM Systems", IEEE Multimedia Signal Processing, pp. 380-383, 2002.
[5] S. Wang, J. Dai, C. Hou and X. Liu, "Progressive Image Transmission over Wavelet Packet Based OFDM", Canadian Conference on Electrical and Computer Engineering, pp. 950-953, 2006.
[6] R. G. Gallager, "Low Density Parity Check Codes", IRE Trans. Inform. Theory, Vol. IT-8, pp. 21-28, Jan. 1962.
[7] D. J. C. MacKay, "Good Error-Correcting Codes Based on Very Sparse Matrices", IEEE Trans. Inform. Theory, Vol. 45, pp. 399-431, Mar. 1999.
[8] Y. Li and W. E. Ryan, "Mutual-Information-Based Adaptive Bit-Loading Algorithms for LDPC-Coded OFDM", IEEE Transactions on Wireless Communications, Vol. 6, pp. 1670-1680, May 2007.
[9] C. Yuan Yang and M. Kai Ku, "LDPC Coded OFDM Modulation for High Spectral Efficiency Transmission", Proceedings of ECCSC 2008, pp. 280-284, July 2008.
[10] Charles Pandana, Yan Sun and K. J. Ray Liu, "Channel-Aware Priority Transmission Scheme Using Joint Channel Estimation and Data Loading for OFDM Systems", IEEE Transactions on Signal Processing, Vol. 53, No. 8, August 2005.
[11] Bagadi K. Praveen, Susmita Das and Sridhar K., "Image Transmission over Space Time Coded MIMO-OFDM System with Punctured Turbo Codes", International Journal of Computer Applications, Volume 51, No. 15, August 2012.
[12] Sashuang Wang, Jufeng Dai, Chunping Hou and Xueqing Liu, "Progressive Image Transmission Over Wavelet Packet Based OFDM", IEEE CCECE/CCGEI, Ottawa, May 2006.
[13] Srikanth N., "Progressive Image Transmission over STC-OFDM Based MIMO Systems", International Conference on Computing and Control Engineering (ICCCE 2012), April 2012.
[14] Usama S. Mohammed and H. A. Hamada, "Image Transmission over OFDM Channel with Rate Allocation Scheme and Minimum Peak-to-Average Power Ratio", Journal of Telecommunications, Volume 2, Issue 2, May 2010.
[15] R. Orzechowski, "Performance Analysis of LDPC Coded OFDM System", XIV Poznan Telecommunications Workshop (PWT), 2010.
[16] M. M. Salah, A. A. Elrahman and A. Elmoghazy, "Unequal Power Allocation of Image Transmission in OFDM Systems", 13th International Conference on Aerospace Sciences & Aviation Technology (ASAT-13), May 2009.
[17] Naglaa F. Soliman, Abd Alhamid A. Shaalan and Mohammed M. Fouad, "Robust Image Transmission with OFDM over an AWGN Channel", National Telecommunication Institute, Egypt, March 2011.
A Joint Digital Watermarking of Compressed & Encrypted images for implementation of digital
rights
Chetana.R1, Chaithra.A2
1Associate Professor, ECE Dept.
SJBIT, Bangalore,India Email:[email protected]
2 PG Student, ECE Dept.
SJBIT, Bangalore,India E-mail:[email protected]
Abstract—We propose an efficient, robust watermarking algorithm to watermark compressed and encrypted images. The proposed algorithm uses a symmetric stream cipher with additive homomorphic properties for encryption, and for watermarking we use Spread Spectrum (SS) frequency-domain DCT2 watermarking schemes. Digital asset management systems (DAMS) generally handle media data in a compressed and encrypted form. It is sometimes necessary to watermark these compressed encrypted media items in the compressed-encrypted domain itself for tamper detection, ownership declaration or copyright management purposes. Watermarking such streams is a challenge, as the compression process packs the information of the raw media into a low number of bits and encryption randomizes the compressed bit stream; attempting to watermark a randomized bit stream can cause a dramatic degradation of media quality. Thus it is necessary to choose an encryption scheme that is secure and yet allows watermarking in a predictable manner in the compressed-encrypted domain.
Applications: copyright violation detection, proof of ownership or distributorship, media authentication
I. INTRODUCTION
DIGITAL media content creation, capturing, processing and distribution has witnessed phenomenal growth over the past decade. This media content is often distributed in compressed and encrypted format, and watermarking of the media for copyright violation detection, proof of ownership or distributorship, or media authentication sometimes needs to be carried out in the compressed-encrypted domain. One such example is distribution through DRM systems [1]-[4], where the owner of multimedia content distributes it in a compressed and encrypted format to consumers through a multilevel distributor network. In DRM systems with content owners, multiple levels of distributors and consumers, the distributors do not have access to the plain (un-encrypted) content. They distribute the encrypted content (in fact compressed-encrypted content, as most content is compressed and then encrypted) and request the license server in the DRM system to deliver the associated licence, containing the decryption keys needed to open the encrypted content, to the consumers. Distributors do not need plain content, as they are not consumers. However, each distributor sometimes needs to watermark the content for media authentication, traitor tracing or proving distributorship. Thus they have no choice but to watermark in the compressed-encrypted domain. In this paper we focus on watermarking of
compressed-encrypted JPEG2000 images, where the
encryption refers to the ciphering of complete JPEG2000
compressed stream except headers and marker segments,
which are left in plaintext for format compliance [5]. There
have been several related image watermarking techniques
proposed to date [6]–[11]. In [6], Deng et al. proposed an
efficient buyer-seller watermarking protocol based on
composite signal representation given in [7]. However, when
the content is accessible only in encrypted form to the
watermark embedder, the embedding scheme
proposed in [6] might not be applicable, as the host and watermark signals are represented in composite signal form using the plaintext features of the host signal; in [6] this is possible because the seller embeds the watermark. Also, there is a ciphertext expansion of 3.7 times that of the plaintext. In [8] and [9], some sub-bands of lower resolutions are chosen for encryption while the rest of the higher resolution sub-bands are watermarked, while in [10] the encryption is performed on the most significant bit planes while the rest of the lower significant bit planes are watermarked. If fewer sub-bands/bit planes are used for encryption, an attacker can manipulate the un-encrypted sub-bands/bit planes and further extract some useful information from the image, although the image may not be of good quality. On the other hand, if more sub-bands/bit planes are encrypted and only the remaining few sub-bands/bit planes are watermarked, it might be possible for an attacker to remove the watermarked sub-bands/bit planes while maintaining the image quality. Prins et al. in [11] proposed a robust quantization index modulation (QIM) based watermarking technique which embeds the watermark in the encrypted domain. In the technique proposed in [11], the addition or subtraction of a watermark bit to a sample is based on the value of the quantized plaintext sample. However, in our algorithm the watermark embedder does not have access to the plaintext values; it has only the compressed-encrypted content and does not have the key to decrypt and obtain the plain compressed values. Thus, watermarking in the compressed-encrypted domain using the technique proposed in [11] is very challenging. In [12] Li et al. proposed a content-dependent watermarking technique which embeds the watermark in an encrypted format, but the host signal is still in plaintext format. The algorithm may not be directly applicable when the content is in encrypted format, in which case the distortion introduced in the host signal may be large. In [13] Sun et al. proposed a semi-fragile authentication system for JPEG2000 images. However, this scheme is not fully compatible with compressed and encrypted domain watermarking as it derives the content-based features for watermarking from the plaintext. We propose a robust watermarking technique for JPEG2000 images in which the watermark can be embedded in a predictable manner in the compressed-encrypted bytestream.
Notation:
• L denotes the length in bytes.
• M = {m_i}, m_i ∈ [0, 255] ∀ i = 0, 1, …, L-1 denotes the packetized JPEG2000 bytestream; M_w = {m_wi}, m_wi ∈ [0, 255] ∀ i = 0, 1, …, L-1 is the watermarked copy of M.
• C = {c_i}, c_i ∈ [0, 255] ∀ i = 0, 1, …, L-1 denotes the encrypted M; C_w = {c_wi}, c_wi ∈ [0, 255] ∀ i = 0, 1, …, L-1 is the watermarked copy of C.
• b = {b_j}, b_j ∈ {-1, 1} ∀ j = 0, 1, …, N-1 denotes the watermark information.
• E(·) and D(·) denote the encryption and decryption functions, respectively.
• K = {k_i}, k_i ∈ [0, 254] ∀ i = 0, 1, …, L-1 denotes the encryption key.
• r denotes the chip rate in SS.
• α denotes the watermark strength factor in SS.
• P = {p_i}, p_i ∈ {-1, 1} ∀ i = 0, 1, …, L-1 denotes a PN sequence with zero mean and variance σ_p².
• SS denotes Spread Spectrum.
II. PROPOSED SCHEME
Fig. 1. (a) Watermark embedding; (b) watermark extraction.
The proposed algorithm works on JPEG2000 compressed
code stream. JPEG2000 compression is divided into five
different stages. In the first stage the input image is
preprocessed by dividing it into non-overlapping rectangular tiles; the unsigned samples are then reduced by a constant to make them symmetric around zero, and finally a multi-component
transform is performed. In the second stage, the discrete
wavelet transform (DWT) is applied followed by quantization
in the third stage. Multiple levels of DWT give a multi-resolution representation: the lowest resolution contains the
low-pass image, while the higher resolutions contain the high-pass detail. These resolutions are further divided into smaller
blocks known as code-blocks where each code-block is
encoded independently. Further, the quantized DWT
coefficients are divided into different bit planes and coded
through multiple passes at embedded block coding with
optimized truncation (EBCOT) to give compressed byte
stream in the fourth stage. The compressed byte stream is
arranged into different wavelet packets based on resolution,
precincts, components and layers in the fifth and final stage.
Thus, it is possible to select bytes generated from different bit
planes of different resolutions for encryption and watermarking.
A. Encryption Algorithm
JPEG2000 gives out the packetized byte stream M as its output. In order to encrypt the message M, we choose K, a
randomly generated key-stream produced using RC4. The encryption is then done byte by byte, as given in (1), to get
the ciphered signal C:

C = E(M, K) = {c_i}, c_i = (m_i + k_i) mod 255, i = 0, 1, …, L−1 (1)
Page 107
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
where the addition operation is arithmetic addition. Here, mod 255 is required to preserve the format compliance of the
JPEG2000 bit stream [5]. In the JPEG2000 bit stream, a header syntax occurs as a two-byte value greater than 0xFF89,
i.e., two consecutive bytes with values 255 and higher than 137 in decimal. If mod 256 were used, it might generate a
byte of value 255 followed by a byte greater than 137, which corresponds to a header syntax and is undesirable. Thus,
in order to prevent the generation of header segments, mod 255 is used. Let C1 = E(M1, K1) and C2 = E(M2, K2). For
K = K1 + K2, the additive homomorphism property gives

D(C1 + C2, K) = M1 + M2 (2)

Here M1 = {m1_i} has been preprocessed by the owner such that 0 ≤ M1 + M2 < 255. The owner does the preprocessing by
limiting the values to m1_i ∈ [α, 255 − α + 1), where α is a positive integer. However, the preprocessing is not
applied when m1_i = 255 and m1_{i+1} > 137, because this case indicates the presence of a header segment, which must
be preserved to maintain bitstream compliance.
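The byte-wise mod-255 cipher of (1) and its inverse can be sketched in Python. This is a hypothetical illustration: the paper's RC4 keystream is replaced here by a seeded stand-in generator.

```python
import random

def keystream(length, seed=42):
    # Stand-in for the paper's RC4 keystream; key bytes lie in [0, 254]
    rng = random.Random(seed)
    return [rng.randrange(0, 255) for _ in range(length)]

def encrypt(m, k):
    # Eq. (1): c_i = (m_i + k_i) mod 255. Mod 255 (not 256) avoids creating
    # JPEG2000 marker segments (0xFF followed by a byte above 0x89).
    return [(mi + ki) % 255 for mi, ki in zip(m, k)]

def decrypt(c, k):
    return [(ci - ki) % 255 for ci, ki in zip(c, k)]

M = [12, 200, 77, 136]
K = keystream(len(M))
C = encrypt(M, K)
print(decrypt(C, K))  # recovers M
```

Because addition modulo 255 is invertible, decryption exactly recovers the original bytes for any key stream.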
B. Embedding Algorithm
The encryption algorithm used is an additive privacy
homomorphic one, so the watermark embedding is performed
by using a robust additive watermarking technique. Since the
embedding is done in the compressed ciphered byte stream,
the embedding position plays a crucial role in deciding the
watermarked image quality. Hence, for watermarking, we
consider the ciphered bytes from the less significant bit planes
of the middle resolutions, because inserting watermark in
ciphered bytes from most significant bit planes degrades the
image quality to a greater extent. Also, the higher resolutions
are vulnerable to transcoding operations and lower resolution
contains a lot of information, whose modification leads to loss
of quality. In our experiments we study the impact of watermarking in this compressed-encrypted domain on quality, and
show how the watermark can be inserted in the less significant bit planes of the middle resolutions without affecting
the image quality much. We now explain the embedding process.
SS: The embedding process is carried out by first generating the watermark signal W using the watermark information
bits b, the chip rate r and the PN sequence P. The watermark information bits b = {b_i}, b_i ∈ {1, −1}, are spread by
r, which gives

a_j = b_i, ir ≤ j < (i + 1)r (3)

The sequence a_j is then multiplied by α > 0 and by P to form the watermark signal W = {w_j}, where

w_j = α a_j p_j (4)

with p_j ∈ {1, −1}. The watermark signal generated in (4) is added to the encrypted signal C to give the watermarked
signal C_w:

C_w = C + W, i.e., c_wi = c_i + w_i ∀ i = 0, 1, …, L−1 (5)
Here, C and W can be considered to be C1 and C2, respectively. Although W is added in plaintext form, it can be
considered to be encrypted using a key K2 that is a stream of bytes with value zero. In other words, as M2 in (2)
corresponds to W of (5), M2 can be assumed to be encrypted using a byte key stream K2 = {k2_i} ∀ i = 0, 1, …, L−1.
Now, if k2_i = 0 ∀ i = 0, 1, …, L−1, then the encrypted value of M2, denoted by C2, is

c2_i = (m2_i + k2_i) mod 255 ∀ i = 0, 1, …, L−1 (6)

Thus we get C2 = M2; that is, encryption of M2 still produces M2, since the addition of zero makes no change in (6).
Also, the decryption key K = K1 + K2 for decrypting C1 + C2 can be written as K = K1, since K2 = 0. Thus, according to
the homomorphic property, we can write

D(C1 + C2, K (= K1 + K2)) = D(C1 + M2, K (= K1))
D(C1 + M2, K) = M1 + M2. (7)

If c_wi would exceed 255, a lesser watermark strength (possibly zero) is used so that c_wi remains below 255. Thus,
decrypting C_w we get M + W, since W is inserted in plaintext form.
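The homomorphic argument above — a watermark added to the ciphertext survives decryption as M + W — can be checked on toy byte values. This is a hypothetical sketch; the paper additionally limits the watermark strength so that c_wi stays in range, which the toy data ignores.

```python
import random

def spread(bits, r, alpha, pn):
    # Eqs. (3)-(4): a_j = b_i for i*r <= j < (i+1)*r, then w_j = alpha*a_j*p_j
    return [alpha * bits[j // r] * pn[j] for j in range(len(bits) * r)]

rng = random.Random(0)
M = [rng.randrange(0, 200) for _ in range(8)]   # toy "compressed" bytes
K = [rng.randrange(0, 255) for _ in range(8)]   # encryption key stream
C = [(m + k) % 255 for m, k in zip(M, K)]       # Eq. (1)

bits, r, alpha = [1, -1], 4, 2
P = [rng.choice([-1, 1]) for _ in range(len(M))]
W = spread(bits, r, alpha, P)

Cw = [c + w for c, w in zip(C, W)]              # Eq. (5): embed in ciphertext
Mw = [(cw - k) % 255 for cw, k in zip(Cw, K)]   # decrypt the watermarked copy
print(Mw == [(m + w) % 255 for m, w in zip(M, W)])  # True: decryption yields M + W
```

The identity holds because ((m + k) mod 255 + w − k) mod 255 = (m + w) mod 255, which is exactly the additive homomorphism of (2) and (7).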
C. Watermark Detection
The watermark can be detected either in the encrypted or in the decrypted compressed domain. We also discuss
uncompressed-domain detection in the case of the SS technique. We first discuss detection in the encrypted domain,
followed by the decrypted domain.
1) Encrypted Domain Detection: In the encrypted domain, as shown in Fig. 1, C_w is given directly to the watermark
extraction module, and the detection process is as follows.
SS: The received encrypted-watermarked signal C_w = C + W is applied to a correlator detector. It is multiplied by the
PN sequence P used for embedding, followed by summation over the chip-rate window r, yielding the correlation sum S_i.
Assuming zero correlation between C and P,

S_i = Σ_{j=ir}^{(i+1)r−1} c_wj p_j = Σ_j (c_j + w_j) p_j ≈ b_i σ_p² α r (8)

The first term in (8), i.e., Σ c_j p_j, is zero if C and P are uncorrelated. However, this is not always the case for
real compressed data. Thus, we can apply the non-blind detection technique, i.e., subtract C from C_w to remove the
correlation effect completely and thus get a better watermark detection rate. The sign of S_i gives the watermark
information bit:

sign(S_i) = sign(b_i σ_p² α r) = sign(b_i) = b_i (9)

However, the distributors can also use a prefiltered (semi-blind) detection technique, which may be required in
ownership-proving applications. In this case, the watermarked message is first passed through a high-pass filter to
reduce the cross-talk between the watermark signal and the host samples. The filtered message is then multiplied by
the PN sequence, thereby extracting the watermark.
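The non-blind correlator of (8)-(9) can be sketched on hypothetical toy data; subtracting the host ciphertext C before correlating removes the host interference term entirely, so the sign of each window sum recovers the embedded bit.

```python
import random

rng = random.Random(1)
bits, r, alpha = [1, -1, 1], 5, 4
N = len(bits) * r
P = [rng.choice([-1, 1]) for _ in range(N)]          # PN sequence
W = [alpha * bits[j // r] * P[j] for j in range(N)]  # spread watermark, Eqs. (3)-(4)
C = [rng.randrange(0, 255) for _ in range(N)]        # host ciphertext
Cw = [c + w for c, w in zip(C, W)]                   # watermarked ciphertext, Eq. (5)

detected = []
for i in range(len(bits)):
    # Eq. (8): correlate over one chip-rate window; C is subtracted (non-blind)
    Si = sum((Cw[j] - C[j]) * P[j] for j in range(i * r, (i + 1) * r))
    detected.append(1 if Si > 0 else -1)             # Eq. (9): sign(Si) = b_i
print(detected)  # [1, -1, 1]
```

After subtraction each window sum is exactly α b_i r (since p_j² = 1), so detection is error-free in this idealized setting.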
2) Decrypted Domain Detection: The received compressed
encrypted watermarked image is first passed through the
decryption module, shown in Fig. 1, and is decrypted using
(10), which defines the byte-by-byte decryption corresponding to the encryption defined in (1). The received signal
C_w is decrypted to give M_w as

M_w = D(C_w, K) = {m_wi},
m_wi = (c_wi − k_i) mod 255 = (c_i + w_i − k_i) mod 255 = m_i + w_i ∀ i = 0, 1, …, L−1 (10)
It can be seen from (10) that m_wi = m_i + w_i: the watermarked compressed byte stream is merely the sum of the
compressed byte stream m_i and the watermark signal w_i. Thus, by controlling the strength of w_i and the choice of
resolution levels and bit planes, the quality of the watermarked signal can be easily controlled. The watermarked
quality would be poor if we picked a larger number of resolution levels and bit planes to watermark, but the watermark
embedding capacity would then be higher, and vice versa.

For SS detection, the embedded watermark information W can be estimated from M_w using a correlation detector even
without knowledge of the corresponding originals M or C. However, M and P may not always be uncorrelated, and hence
the noise due to M may not be completely eliminated. Therefore, to obtain better detection results, we can encrypt M_w
with K, which gives C_w, and subtracting C gives

S_i = Σ_j w_j p_j = Σ_j α a_j p_j p_j = b_i σ_p² α r (11)

Thus, the sign of S_i gives the watermark information bit:

sign(S_i) = sign(b_i σ_p² α r) = sign(b_i) = b_i (12)
III. CONCLUSION
The proposed work introduces a robust watermarking technique for images that can identify authorized/unauthorized
users, and touches on the limitations and possibilities of each approach. Although only the very surface of the field
was scratched, it was still enough to draw several conclusions about digital watermarking. LSB is the straightforward
method of watermark embedding, used to embed the watermark into the least significant bits of the cover image. Another
observation is that frequency domains are typically better candidates for watermarking than the spatial domain, both
for robustness and for the visual quality of the recovered watermark image. Embedding in the DCT domain proved to be
highly resistant to ".bmp" compression as well as to significant amounts of random noise. The algorithm is simple to
implement as it is performed directly in the compressed-encrypted domain, i.e., it does not require decryption or
partial decompression of the content. Our scheme also preserves the confidentiality of the content, as the embedding
is done on encrypted data. The homomorphic property of the cryptosystem is exploited, which allows us to detect the
watermark after decryption and to control the image quality as well. Detection is carried out in the compressed or
decompressed domain; in the decompressed domain, non-blind detection is used. We analyze the relation between payload
capacity and image quality (in terms of PSNR and SSIM) for different resolutions. Experimental results show that the
higher resolutions carry higher payload capacity without affecting the quality much, whereas the middle resolutions
carry lesser capacity and degrade quality more than watermarking the higher resolutions does. However, the higher
resolutions might be truncated to meet bandwidth requirements, and in that case the middle resolutions provide a good
space for embedding. The distortion due to the round-off process also plays a significant role in determining the BER,
and this effect is analyzed by comparison against the results of the original watermarking schemes. Future work aims
at extending the proposed scheme to other image compression schemes such as JPEG and JPEG-LS.
ACKNOWLEDGEMENT
The authors wish to acknowledge SJB Institute of
Technology for providing guidance and resources to carry out
this work.
REFERENCES
[1] S. Hwang, K. Yoon, K. Jun, and K. Lee, "Modeling and implementation of digital rights," J. Syst. Softw., vol. 73, no. 3, pp. 533–549, 2004.
[2] A. Sachan, S. Emmanuel, A. Das, and M. S. Kankanhalli, "Privacy preserving multiparty multilevel DRM architecture," in Proc. 6th IEEE Consumer Communications and Networking Conf., Workshop on Digital Rights Management, 2009, pp. 1–5.
[3] T. Thomas, S. Emmanuel, A. Subramanyam, and M. Kankanhalli, "Joint watermarking scheme for multiparty multilevel DRM architecture," IEEE Trans. Inf. Forensics Security, vol. 4, no. 4, pp. 758–767, Dec. 2009.
[4] A. Subramanyam, S. Emmanuel, and M. Kankanhalli, "Compressed-encrypted domain JPEG2000 image watermarking," in Proc. IEEE Int. Conf. Multimedia and Expo, 2010, pp. 1315–1320.
[5] H. Wu and D. Ma, "Efficient and secure encryption schemes for JPEG2000," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, 2004, vol. 5, pp. 869–872.
[6] M. Deng, T. Bianchi, A. Piva, and B. Preneel, "An efficient buyer-seller watermarking protocol based on composite signal representation," in Proc. 11th ACM Workshop on Multimedia and Security, 2009, pp. 9–18.
[7] T. Bianchi, A. Piva, and M. Barni, "Composite signal representation for fast and storage-efficient processing of encrypted signals," IEEE Trans. Inf. Forensics Security, vol. 5, no. 1, pp. 180–187, Mar. 2010.
[8] S. Lian, Z. Liu, R. Zhen, and H. Wang, "Commutative watermarking and encryption for media data," Opt. Eng., vol. 45, pp. 1–3, 2006.
[9] F. Battisti, M. Cancellaro, G. Boato, M. Carli, and A. Neri, "Joint watermarking and encryption of color images in the Fibonacci-Haar domain," EURASIP J. Adv. Signal Process., vol. 2009.
[10] M. Cancellaro, F. Battisti, M. Carli, G. Boato, F. De Natale, and A. Neri, "A joint digital watermarking and encryption method," in Proc. SPIE Security, Forensics, Steganography, and Watermarking of Multimedia Contents X, 2008, vol. 6819, pp. 68191C–68191C.
[11] J. Prins, Z. Erkin, and R. Lagendijk, "Anonymous fingerprinting with robust QIM watermarking techniques," EURASIP J. Inf. Security, vol. 2007.
[12] Z. Li, X. Zhu, Y. Lian, and Q. Sun, "Constructing secure content-dependent watermarking scheme using homomorphic encryption," in Proc. IEEE Int. Conf. Multimedia and Expo, 2007, pp. 627–630.
[13] Q. Sun, S. Chang, M. Kurato, and M. Suto, "A quantitative semi-fragile JPEG2000 image authentication system," in Proc. Int. Conf. Image Processing, 2002, vol. 2, pp. 921–924.
[14] R. Rivest, A. Shamir, and L. Adleman, "A method for obtaining digital signatures and public-key cryptosystems," Commun. ACM, vol. 21, no. 2, pp. 120–126, 1978.
[15] S. Goldwasser and S. Micali, "Probabilistic encryption," J. Comput. Syst. Sci., vol. 28, no. 2, pp. 270–299, 1984.
Cancellation of Noise in ECG Signal Using Low Pass IIR Filters
Menaka S. Naik1, Akshatha N2, Mrs. Nirmala L3
1M.Tech (VLSI and Embedded Systems), REVA Institute of Technology and Management, Bengaluru, India
2Nagarjuna College of Engineering, Bengaluru, India
3Dept. of Electronics and Communication Engg., REVA Institute of Technology and Management, Bengaluru, India
[email protected], [email protected]
Abstract: In diagnosis from the ECG signal, signal acquisition must be noise free. Experienced physicians are able to
make an informed medical diagnosis of heart condition by observing the ECG signal. This paper deals with the
application of digital IIR filters to the raw ECG signal; at the end, these filter types are compared. While being
recorded, the ECG signal gets corrupted by different noise interferences and artefacts. Noise and interference are
usually large enough to obscure small-amplitude features of the ECG that are of physiological or clinical interest. We
have used MATLAB for this purpose, as it is the most advanced tool for DSP applications. The filters are designed
using the MATLAB FDA Tool by specifying the filter order, cut-off frequency and sampling frequency.
Keywords:- ECG signal, IIR filters, MATLAB, Signal to Noise
Ratio (SNR)
I. INTRODUCTION
ECG is a method to measure and record the different electrical potentials of the heart. Owing to the intensifying
importance of biomedical signal processing, increasing efforts are devoted to noise reduction of the biomedical ECG
signal. The task can become complicated when quality is degraded due to interference in the ECG signal, making
interpretation quite difficult. So removal of this noise is needed in ECG analysis for correct diagnosis. As Fig. 1
depicts, each ECG signal of a normal heartbeat consists of six successive peaks and waves, namely P, Q, R, S, T and U.
The P wave reflects the activation of the right and left atria. The QRS complex shows the depolarization of the right
and left ventricles. The T wave, which follows the QRS complex, reflects ventricular repolarization. The
repolarization of the atria is not recorded on the ECG reading. The electrocardiogram can
measure the rate and rhythm of the heartbeat, as
well as provide indirect evidence of blood flow to
the heart muscle.
Fig 1: Normal ECG Signal.
The signal is normally corrupted by two major groups of noise, generated by biological and environmental sources. The
first group includes muscle contraction or electromyographic (EMG) interference, baseline drift, ECG amplitude
modulation due to respiration, and motion artifacts caused by changes in electrode-skin impedance with electrode
motion. The second group includes power line interference, electrode contact noise, instrumentation noise generated by
the electronic devices used in signal processing, electrosurgical noise and radio frequency interference. The
different types of interference are listed in Fig. 2. For meaningful and accurate detection, steps have to be taken to
filter out or discard all these noise sources.
Fig 2: Interferences in the ECG signal
II. LITERATURE SURVEY
Several techniques have been presented in the literature to effectively reduce the noise in ECG analysis.
Ferdjallah M. and Barr R. E. introduced frequency-domain digital filtering techniques for the removal of PLI [1].
Sornmo L. has applied time-varying filtering techniques to the problem of baseline shift [2]. McManus C. D., Neubert
K. D. and Cramer E. have compared digital filtering methods for the elimination of AC noise in ECG [3]. Patricia Arand
patented a method and apparatus for removing baseline wander from an ECG signal [4]. Pei S. C. and Tseng C. C.
proposed an IIR notch filter with transient suppression for ECG [5]. Hamid Gholam-Hosseini, Homer Nazeran and Karen J.
Reynolds elaborated on ECG noise cancellation using digital filters [6]. A nonlinear adaptive method for the
elimination of powerline interference in ECG
signals was developed by Ziarani A. K. and Konrad A. [7]. Mitov I. P. presented a method for the reduction of power
line interference in the ECG [8]. Yong Lian and Poh Choo Ho focused on multiplier-free digital filters [9]. Lisette P.
Harting, Nikolay M. Fedotov and Cornelis H. Slump discussed baseline drift suppression in ECG recordings [10].
Dotsinky I. and Stayanov T. discussed power-line interference cancellation in ECG signals [11]. Jacek M. Leski and
Norbert Henzel have proposed a combination of ECG baseline wander and PLI reduction using a nonlinear filter bank
[12]. Lu G. et al. have suggested a fast-convergence recursive least squares algorithm to enable the filter to track
complex dystonic EMGs and to effectively remove ECG noise. The adaptive filter procedure proved a reliable and
efficient tool to remove ECG artifact from surface EMGs with mixed and varied patterns of transient, short and
long-lasting dystonic
contractions [13]. A new asynchronous averaging and filtering (AAF) algorithm was proposed by Gautam A. et al. for ECG
signal denoising. The AAF algorithm reduces random noise (a major component of EMG noise) in an ECG signal and
provides comparatively good results for baseline wander noise cancellation. The SNR improves in the filtered ECG
signal, while the signal shape remains undistorted. The AAF algorithm is more advantageous than adaptation algorithms
like the Wiener and LMS algorithms [14]. Sorensen J. S. et al. have described a comparison of IIR and wavelet
filtering for noise reduction of the ECG. Ideally, the output of the optimal filter has perfect noise removal, no
distortion and low computation time. This criterion was used to select one wavelet filter and one IIR filter to be
used on the ECG with transient muscle activity. For this signal, the root mean square errors (RMSE) of the non-noise
and noise segments were calculated using the selected wavelet and IIR filters [15]. Gupta R., Bera J. N. and Mitra M.
have developed a simple, cost-effective online ECG acquisition system for further data processing. An 8051-based
dedicated embedded system is used for converting the digitized ECG into serial data, which is delivered to a
standalone PC through a COM port for storage and analysis. A serial link is preferred as it minimizes cable costs and
interference effects compared with a parallel link. The developed MATLAB-based graphical user interface (GUI)
facilitates a user in controlling the operations of the entire system [16]. Luo S. and Johnston P. have discussed, as
a review, the issues related to the inaccuracy of ECG preprocessing filters, in the context of facilitating efficient
ECG interpretation and diagnosis [17].
In this paper, Chebyshev and Butterworth filters are designed and implemented. Finally, the improvement in the ECG
signal due to noise reduction is presented and discussed. This paper aims at a MATLAB-based design of digital filters
which can further be extended to interface with an FPGA.
III. DIGITAL IIR FILTER
IIR systems have an impulse response that is non-zero over an infinite length of time. An IIR filter may be
implemented as either an analog or a digital filter. In a digital IIR filter, the output feedback is immediately
apparent in the equation defining the output.
3.1 Butterworth Filter
The Butterworth filter provides the best Taylor series approximation to the ideal lowpass filter response at the
analog frequencies Ω = 0 and Ω = ∞: for any order N, the magnitude-squared response has 2N − 1 zero derivatives at
these locations (maximal flatness). The response is monotonic overall, decreasing smoothly from Ω = 0 to Ω = ∞, with
|H(jΩ)| = 1/√2 at Ω = 1 (i)
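The textbook normalized Butterworth magnitude, |H(jΩ)| = 1/√(1 + Ω^(2N)), makes the −3 dB point of Eq. (i) easy to verify numerically. This is a small illustrative check, not code from the paper.

```python
import math

def butter_mag(omega, N):
    # Magnitude of an N-th order normalized (cutoff Omega = 1) Butterworth lowpass
    return 1.0 / math.sqrt(1.0 + omega ** (2 * N))

for N in (2, 4, 8):
    assert abs(butter_mag(1.0, N) - 1 / math.sqrt(2)) < 1e-12  # Eq. (i): -3 dB at cutoff
    assert butter_mag(0.0, N) == 1.0                           # unity gain at Omega = 0
    assert butter_mag(10.0, N) < butter_mag(10.0, N - 1)       # steeper rolloff as N grows
```

The last assertion reflects the trade-off discussed in the text: increasing the order sharpens the transition while the response stays monotonic.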
3.2 Chebyshev Type-I Filter
The Chebyshev Type-I filter minimizes the absolute difference between the ideal and actual frequency response over the
passband by incorporating an equal ripple of Rp dB in the passband. The stopband response is maximally flat. The
transition from passband to stopband is more rapid than for the Butterworth filter, with

|H(jΩ)| = 10^(−Rp/20) at Ω = 1 (ii)
3.3 Chebyshev Type-II Filter
The Chebyshev Type-II filter minimizes the absolute difference between the ideal and actual response over the entire
stopband by incorporating an equal ripple of Rs dB in the stopband. The passband response is maximally flat. The
stopband does not approach zero as quickly as for the Type-I filter. The absence of ripple in the passband, however,
is often an important advantage, with

|H(jΩ)| = 10^(−Rs/20) at Ω = 1 (iii)
3.4 Adaptive Filter
The typical least mean square (LMS) algorithm is employed for updating the tap weights of the adaptive filter, whose
output is of the form:

y(n) = Σ_{k=0}^{M−1} w_k(n) x(n − d − k) (iv)

where M is the number of taps of the adaptive filter. The error signal e(n) is:

e(n) = x(n) − y(n) (v)

The tap weights of the adaptive filter are updated according to the rule:

w_k(n + 1) = w_k(n) + μ e(n) x(n − d − k) (vi)

where k = 0, 1, …, M−1 and μ is the step size controlling the speed of convergence. After completing the learning,
with x(n) = s1(n) + n1(n) and the filter output converging to the noise estimate n1′(n), the error signal becomes:

e(n) = x(n) − y(n) = s1(n) + n1(n) − n1′(n) ≈ s1(n) (vii)

Hence the output of the filter is the clean ECG signal.
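A minimal sketch of the LMS update of Eqs. (iv)-(vi), assuming delay d = 0 and using a 60 Hz reference input to cancel mains interference from a toy slow-wave signal. All parameters here (tap count, step size, signal frequencies) are illustrative, not taken from the paper.

```python
import math

def lms_filter(ref, primary, M=8, mu=0.01):
    # Eqs. (iv)-(vi) with d = 0: y(n) = sum_k w_k(n) ref(n-k),
    # e(n) = primary(n) - y(n), w_k(n+1) = w_k(n) + mu*e(n)*ref(n-k)
    w = [0.0] * M
    err = []
    for n in range(len(ref)):
        taps = [ref[n - k] if n - k >= 0 else 0.0 for k in range(M)]
        y = sum(wk * xk for wk, xk in zip(w, taps))
        e = primary[n] - y
        w = [wk + mu * e * xk for wk, xk in zip(w, taps)]
        err.append(e)
    return err, w

fs = 360.0
sig = [math.sin(2 * math.pi * 1.2 * i / fs) for i in range(2000)]        # slow "ECG-like" wave
hum = [0.5 * math.sin(2 * math.pi * 60 * i / fs + 0.3) for i in range(2000)]  # mains hum
ref = [math.sin(2 * math.pi * 60 * i / fs) for i in range(2000)]         # reference input
e, _ = lms_filter(ref, [s + h for s, h in zip(sig, hum)])
# After convergence the error output tracks the clean signal, as in Eq. (vii)
tail_err = sum((e[i] - sig[i]) ** 2 for i in range(1500, 2000)) / 500
```

Since an FIR combination of the 60 Hz reference can match the hum's amplitude and phase but cannot synthesize the 1.2 Hz component, the filter cancels only the interference and the error converges to the signal.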
IV. METHODOLOGY
We take an ECG data signal as the input in the analysis of noise removal using IIR filter design techniques. The first
group is intended to serve as a representative sample of the variety of waveforms and artifacts that an arrhythmia
detector might encounter in routine clinical use. The bandpass-filtered signals were digitized at 360 Hz per signal
relative to real time using hardware constructed at the MIT Biomedical Engineering Center and at the BIH Biomedical
Engineering Laboratory. The sampling frequency was chosen to facilitate implementations of 60 Hz (mains frequency)
digital notch filters in arrhythmia detectors. Since the recorders were battery-powered, most of the 60 Hz noise
present in the database arose during playback. The sampling frequency of the data signal is 360 Hz and the amplitude
±1 mV. Filtering of the noisy ECG signal is set up in two steps: in the first step the baseline drift is removed from
the input data signal and then 10 dB AWGN noise is introduced into it; in the second step a filter is designed with
the help of the FDA Tool in MATLAB. The FDA Tool parameters are set for a lowpass IIR filter with a sampling frequency
of 360 Hz and
minimum order. The passband (Fp) and stopband (Fs) edge frequencies are 54 Hz and 60 Hz, and the filter attenuations
Ap = 1 dB and As = 80 dB are set in the FDA Tool. The original ECG data signal before and after baseline removal is
shown in Fig. 3.
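The same specification (Fp = 54 Hz, Fs = 60 Hz, Ap = 1 dB, As = 80 dB, sampling at 360 Hz) can be reproduced outside the FDA Tool. The following is a hypothetical SciPy sketch of the minimum-order designs, not the authors' MATLAB session.

```python
import numpy as np
from scipy import signal

fs = 360.0
nyq = fs / 2
wp, ws = 54 / nyq, 60 / nyq      # passband / stopband edges, normalized to Nyquist
ap, as_db = 1, 80                # Ap = 1 dB, As = 80 dB

Nb, _ = signal.buttord(wp, ws, ap, as_db)        # minimum Butterworth order
Nc, Wnc = signal.cheb1ord(wp, ws, ap, as_db)     # minimum Chebyshev-I order
sos = signal.cheby1(Nc, ap, Wnc, output='sos')   # second-order sections for stability

# Verify the stopband attenuation of the Chebyshev-I design
w, h = signal.sosfreqz(sos, worN=8192)
mag_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))
stop_max = mag_db[w >= ws * np.pi].max()
print(Nb, Nc, stop_max)
```

The narrow 54-60 Hz transition with 80 dB attenuation forces a much higher Butterworth order than Chebyshev-I, which matches the text's remark that the Chebyshev transition is more rapid for a given order.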
Proposed Work:
In the proposed work, digital filters such as the adaptive filter, Butterworth filter and Chebyshev filter are
designed using MATLAB, and the results are shown below.
1. Results of Adaptive filter:
Fig 3: Original ECG, noisy ECG signal, signal after noise removal, and magnitude response of the adaptive filter.
2. Results of Butterworth filter:
Fig 4: Original ECG signal, noisy ECG signal, signal after noise removal, and magnitude response of the Butterworth filter.
3. Results of Chebyshev-1 filter:
Fig 5: Original ECG, noisy ECG signal, signal after noise removal, and magnitude response of the Chebyshev filter.
V. CONCLUSION
The digital filter is a prominent solution that provides noise reduction up to a satisfactory level. Digital filtering
techniques are well suited for ECG analysis and thereby help in improving the quality of the ECG signal. The results
obtained from the adaptive, Butterworth and Chebyshev filters are compared on the basis of signal-to-noise ratio.
Future work
will involve implementing these digital filters in VHDL on an FPGA platform.
REFERENCES
[1] Ferdjallah M., Barr R. E., "Frequency domain digital filtering techniques for the removal of powerline noise with application to the electrocardiogram," Comput. Biomed. Res., vol. 23, no. 5, pp. 473–489, Oct. 1990.
[2] Manish Kansal, Hardeep Singh Saini, Dinesh Arora, "Designing & FPGA implementation of IIR filter used for detecting clinical information from ECG," International Journal of Engineering and Advanced Technology (IJEAT), vol. 1, issue 1, Oct. 2011.
[3] Sonal K. Jagtap, M. D. Uplane, "A real time approach: ECG noise reduction in Chebyshev Type II digital filter," International Journal of Computer Applications (0975–8887), vol. 49, no. 9, July 2012.
[4] Mohandas Choudhary, Ravindra Pratap Narwaria, "Suppression of noise in ECG signal using low pass IIR filters," International Journal of Electronics and Computer Science Engineering (ISSN 2277-1956), 2011.
[5] Yogesh Sharma, Anurag Shrivastava, "Periodic noise suppression from ECG signal using novel adaptive filtering techniques," International Journal of Electronics and Computer Science Engineering (IJCSE), vol. 1.
[6] Jane R., Laguna P., Thakor N., Caminal P., "Adaptive baseline wander removal in the ECG: comparative analysis with cubic spline technique," IEEE Proc. Computers in Cardiology, pp. 143–146, 1992.
[7] Hamid Gholam-Hosseini, Homer Nazeran, Karen J. Reynolds, "ECG noise cancellation using digital filters," 2nd International Conference on Bioelectromagnetism, Feb. 1998.
[8] Seema Rani, Amarpreet Kaur, J. S. Ubhi, "Comparative study of FIR and IIR filters for the removal of baseline noises from ECG signal," International Journal of Computer Science and Information Technologies, vol. 2, no. 3, 2011.
[9] Lisette P. Harting, Nikolay M. Fedotov, Cornelis H. Slump, "On baseline drift suppressing in ECG recordings," 2004 IEEE Benelux Signal Processing Symposium.
[10] Choy T. T., Leung P. M., "Real time microprocessor-based 50 Hz notch filter for ECG," J. Biomed. Eng., vol. 10, no. 5, pp. 285–288, May 1988.
[11] Ferdjallah M., Barr R. E., "Frequency-domain digital filtering techniques for the removal of powerline noise with application to the electrocardiogram," Comput. Biomed. Res., vol. 23, no. 5, pp. 475–489, Oct. 1990.
SYSTOLIC BASED OPTIMIZATION TOOL FOR 1D & 2D FIR FILTERS USING TOURNAMENT SELECTION METHOD
Chandra Shekar P1 B.E., M.Tech, Divya K S2 B.E., M.Tech, Chaya P3 B.E., M.Tech., Associate Professor
1,2Dept. of ECE, VTU Belgaum, KVG College of Engineering, Sullia - 574 327, D K, Karnataka, India.
3Dept. of ISE, VTU Belgaum, GSSSIETW, KRS Road, Mysore - 570016, Karnataka, India.
[email protected], [email protected], [email protected]
Abstract - The project is concerned with the design of systolic arrays by using linear mapping techniques on regular
dependence graphs (DG). The mapping technique transforms a dependence graph to a space-time representation, where each
node is mapped to a certain processing element and is scheduled to a certain time instance. The systolic design
methodology maps an N-dimensional DG to a lower-dimensional systolic architecture. The basis vectors involved in the
systolic array design should satisfy the feasibility condition for designing the tool. MATLAB version 7.01 is the
platform used to design the FIR tool for faster implementation and to achieve low-level designs for the selected
vectors. The tool designed can also be used in the selection of the scheduling inequalities and the projection vector
to meet the feasibility condition, and to achieve 100% HUE using the "Tournament Selection" method.
Tournament selection, typically used in evolutionary programming, allows tuning the degree of stringency of the
selection imposed: rather than selecting on the basis of each solution's fitness or error in light of the objective
function at hand, selection is made on the basis of the number of wins earned in a competition.
Index Terms – Systolic array, Dependence graph,
Processing element, Tournament Selection method.
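Tournament selection as described in the abstract can be sketched as follows. This is a generic, hypothetical illustration of the method (the population, objective and tournament size q are invented for the example, and self-competition is not excluded), not the paper's optimization tool.

```python
import random

def tournament_select(population, error, q=7, rng=random):
    # Each solution competes against q randomly chosen opponents and earns a
    # "win" for every opponent whose error is no better than its own.
    wins = []
    for f in error:
        opponents = [rng.randrange(len(population)) for _ in range(q)]
        wins.append(sum(1 for j in opponents if error[j] >= f))
    # Survivors are the solutions with the most wins, not the best raw error;
    # larger q makes the selection pressure more stringent.
    order = sorted(range(len(population)), key=lambda i: wins[i], reverse=True)
    return [population[i] for i in order[: len(population) // 2]]

rng = random.Random(0)
pop = [rng.uniform(-5, 5) for _ in range(20)]
err = [x * x for x in pop]                 # toy objective: minimize x^2
survivors = tournament_select(pop, err, q=7, rng=rng)
print(len(survivors))  # 10
```

Selecting on win counts rather than raw fitness is what lets the tournament size q tune the stringency of selection, as the abstract notes.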
I. INTRODUCTION
The flow approach used in systolic design is shown in Fig. 1.1.
Fig. 1.1 Flow of systolic design
II. SYSTOLIC ARCHITECTURE DESIGN
A. Basic principle of Systolic array
High performance, special-purpose computer systems are
typically used to meet specific application requirements or to
off-load computations that are especially taxing to general-
purpose computers. As hardware cost and size continue to drop
and processing requirements become well-understood in areas
such as signals and image processing, more special-purpose
systems are being constructed. However, since most of these systems are built on an ad hoc basis for specific tasks,
methodological work in this area is rare. Because the knowledge gained from individual experiences is neither
accumulated nor properly organized, the same errors are repeated. I/O and computation imbalance is a notable example:
often, the fact that I/O interfaces cannot keep up with a device's speed is discovered only after constructing a
high-speed, special-purpose device.
We intend to help correct this ad hoc approach by providing a general guideline: specifically, the concept of systolic
architecture, a general methodology for mapping high-level computations into hardware structures. In a systolic
system, data flows from the computer memory in a rhythmic fashion, passing through many processing elements before it
returns to memory, much as blood circulates to and from the heart. The system works like an automobile assembly line
where different people work on the same car at different times and many cars are assembled simultaneously. An assembly
line is always linear, however, whereas systolic systems are sometimes two-dimensional.
The systolic architectural concept was developed at
Carnegie-Mellon University and versions of systolic
processors are being designed and built by several industrial
and governmental organizations.
Instead of at most 5 million operations per second, up to
30 million operations per second become possible.
Fig. 2.1 Organization of memory and processing elements
B. Systolic Systems
Systolic systems consist of an array of processing elements
(PEs) called cells. Each cell is connected to a small number
of nearest neighbors in a mesh-like topology. Each cell
performs a sequence of operations on data that flows between
them. Generally the operations are the same in each cell; each
cell performs an operation or a small number of operations on
a data item and then passes it to its neighbor. Systolic arrays
compute in "lock-step", with each cell (processor) undertaking
alternate compute/communicate phases.
C. Processing element
Fig. 2.2 shows the processing element architecture. The IN
input stores data coming from the other processors in a
dedicated I/O register file. The CTE input is connected to a
broadcast bus for receiving constant data from the outside
world and from the memory. These data are stored in a
specific constant register file before being used.
Fig. 2.2 Internal processing element architecture
As the arithmetic operations involved are very limited, two
specific units are implemented, namely an adder and a
minimizer. These two units are pipelined due to the regular and
repetitive structure of the computation. The accumulator can
be loaded either from the adder or from the minimizer.
The processors of the array are activated by micro-
commands which control actions such as accumulator loading,
I/O and CTE register selection, data acquisition, etc. These
micro-commands are specified by an instruction which is
received from the outside and decoded. In the same way, the
memory actions are programmed depending on the calculation
being performed.
The processor array, reference memory and data reference array
thus operate synchronously, since they each execute one
instruction every machine cycle. One instruction specifies
actions to be realized concurrently on all these units. In that
sense, the processor array can be considered to have an SIMD
execution mode: all the cells are executing the same
instruction at the same time.
D. Features of Systolic arrays
A systolic array is a computing network possessing the
following features: synchrony, modularity, regularity, spatial
locality, temporal locality, pipelinability, and parallel computing.
Synchrony means that the data is rhythmically computed
(timed by a global clock) and passed through the network.
Modularity means that the array (finite or infinite) consists of
modular processing units.
Regularity means that the modular processing units are
interconnected homogeneously.
Spatial locality means that each cell has a local
communication interconnection.
Temporal locality means that a cell transmits signals to
another cell with at least one unit of time delay.
Pipelinability means that the array can achieve a high speed.
E. Mapping to Systolic Array
Various approaches to map computational algorithms onto
systolic array structures have been proposed [13]. Generally
they can be classified into three categories: functional
transformation, retiming, and dependence mapping. Dependence
mapping consists of several steps:
Step 1: map the algorithm to a dependence graph (DG).
Step 2: map the dependence graph to a signal flow
graph (SFG).
Step 3: map the SFG onto an array processor.
A dependency diagram is a visual representation of a
dependency graph; for a dependency graph without circular
dependencies, it can be interpreted as a Hasse diagram of the
graph. Dependency diagrams are integral to software
development, outlining the complex interrelationships of
various functional elements. Typically, in a dependency
diagram, arrows point from each module to the other modules
upon which it depends. The mapping of dependencies is
shown below in Fig. 3.3.
Fig.3.3 Space representation
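The DG-to-SFG-to-array mapping can be made concrete with a small software simulation of a 1-D systolic FIR filter, the design the TOOL is later compared against. This is only a sketch under assumed design choices (weight-stationary cells, with the input stream moving at half the speed of the partial sums), not the paper's actual tool:

```python
def systolic_fir(weights, samples):
    """Cycle-accurate simulation of a 1-D systolic FIR filter.

    Each cell holds one tap weight. Input samples flow left-to-right
    through two registers per cell (half speed), partial sums flow
    through one register per cell, and all cells update in lock-step
    on a global clock, as in a systolic array.
    """
    k = len(weights)
    x1 = [0.0] * k   # first input register of each cell
    x2 = [0.0] * k   # second input register (slows x to half the sum speed)
    y = [0.0] * k    # partial-sum register of each cell
    out = []
    for x_in in list(samples) + [0.0] * 2 * k:  # trailing zeros flush the pipe
        new_x1, new_x2, new_y = [0.0] * k, [0.0] * k, [0.0] * k
        for i in range(k):
            x_into = x_in if i == 0 else x2[i - 1]   # sample entering cell i
            y_into = 0.0 if i == 0 else y[i - 1]     # partial sum entering cell i
            new_x1[i] = x_into
            new_x2[i] = x1[i]
            new_y[i] = y_into + weights[i] * x_into  # multiply-accumulate
        x1, x2, y = new_x1, new_x2, new_y
        out.append(y[-1])                            # tap the last cell
    # discard the initial pipeline-fill latency of k-1 cycles
    return out[k - 1 : k - 1 + len(samples)]
```

For example, `systolic_fir([1, 2], [3, 4, 5])` reproduces the direct convolution output `[3, 10, 13]`, confirming that the lock-step schedule realizes y[n] = Σ w[i]·x[n−i].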
III TOURNAMENT SELECTION
In tournament selection, a number Tour of individuals is
chosen randomly from the population and the best individual
from this group is selected as a parent. This process is repeated
as often as parents must be chosen. The selected parents then
produce offspring, paired uniformly at random. The parameter
for tournament selection is the tournament size Tour, which
takes values ranging from 2 to Nind (the number of individuals
in the population).
In selection the offspring producing individuals are chosen.
The first step is fitness assignment. Each individual in the
selection pool receives a reproduction probability depending
on the own objective value and the objective value of all other
individuals in the selection pool. This fitness is used for the
actual selection step afterwards.
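The selection loop described above can be sketched as follows; the fitness-list representation and the seeding are illustrative assumptions, not taken from the paper:

```python
import random

def tournament_select(population, fitness, tour, rng=random):
    """Pick one parent: draw `tour` individuals uniformly at random
    (with replacement) and return the one with the highest fitness."""
    competitors = [rng.randrange(len(population)) for _ in range(tour)]
    winner = max(competitors, key=lambda i: fitness[i])
    return population[winner]

def select_parents(population, fitness, n_parents, tour, seed=0):
    """Repeat the tournament until the required number of parents is chosen."""
    rng = random.Random(seed)
    return [tournament_select(population, fitness, tour, rng)
            for _ in range(n_parents)]
```

Increasing `tour` raises the selective pressure: with a larger tournament, weak individuals almost never win, so the mean fitness of the selected parents rises.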
Page 119
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
311
A. Selection schemes
Throughout this section some terms are used for comparing
the different selection schemes.
Selective pressure: the probability of the best individual being
selected compared to the average selection probability of all
individuals.
Bias: the absolute difference between an individual's
normalized fitness and its expected probability of reproduction.
Spread: the range of possible values for the number of
offspring of an individual.
Loss of diversity: the proportion of individuals of a population
that is not selected during the selection phase.
Selection intensity: the expected average fitness value of the
population after applying a selection method to the normalized
Gaussian distribution.
Selection variance: the expected variance of the fitness
distribution of the population after applying a selection method
to the normalized Gaussian distribution.
Table 3-1 and the figure show the relation between tournament
size and selection intensity.
Tournament size      1     2     3     5     10    30
Selection intensity  0     0.56  0.85  1.15  1.53  2.04
Tab. 3-1: Relation between tournament size and selection
intensity
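The intensity values in Tab. 3-1 can be checked numerically: for tournament selection, the selection intensity equals the expected fitness of the tournament winner drawn from a normalized Gaussian population, i.e. the expected maximum of Tour standard-normal samples. A Monte Carlo sketch (the trial count and seed are arbitrary choices):

```python
import random

def tournament_intensity(tour, n_trials=100_000, seed=42):
    """Estimate selection intensity as the mean fitness of a tournament
    winner drawn from a standard normal (normalized Gaussian) population."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # fitness of the winner = maximum of `tour` N(0, 1) samples
        total += max(rng.gauss(0.0, 1.0) for _ in range(tour))
    return total / n_trials
```

Running this for Tour = 2, 3, 5, 10 and 30 reproduces the tabulated values 0.56, 0.85, 1.15, 1.53 and 2.04 to within Monte Carlo error.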
B. Analysis of tournament selection
An analysis of tournament selection covers its selection
intensity, its loss of diversity (about 50% of the population is
lost at tournament size Tour = 5), and its selection variance.
Fig. 3.1: Properties of tournament selection
Importance of Tournament Selection
A small group of individuals is selected from the whole
population.
The best individual in this group is selected and returned
by the operator.
Tournament selection prevents the best individual from
dominating.
The chosen group size can decrease the selective
pressure.
Page 120
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
312
IV RESULTS
V APPLICATIONS
• Matrix Inversion and Decomposition.
• Faster approaches in the field of design.
• Polynomial Evaluation.
• Matrix multiplication.
• Image Processing Convolution.
• Systolic lattice filters used for speech and seismic
signal processing.
• Artificial neural network.
• Robotics
• Equation Solving
• Signal processing
• Image processing
• Solutions of differential equations
• Graph algorithms
• Biological sequence comparison
• Other computationally intensive tasks.
VI CONCLUSION
The systolic architecture is a massively parallel processing
architecture with limited I/O communication with the host
computer, suitable for many regular iterative operations. The
programs written to design the TOOL by making use of the
"Tournament Selection Method" in evolutionary programming
efficiently utilize engineering theory in order to optimize the
systolic array design. The optimized design is compared with
the hand calculations done for a 1D FIR tap filter and the
Reduced Dependency Graph (RDG). The results match; hence
I conclude that the design of the TOOL produces correct
results and is verified. The TOOL designed in this project can
be used by any third party for a better understanding of other
selection methods.
VII FUTURE WORK
The analysis of the tool helps in a better understanding of
other selection methods, whereby further optimization of the
given task can be carried out to achieve 100% HUE. The
TOOL provides ideas for designing Polyphase-FIR filter, DFT,
Polyphase-DFT and IDFT-Polyphase functions.
REFERENCES
[1]. S.Y. Kung, VLSI Array Processors, Prentice Hall, 1988.
[2]. H.T. Kung, "Why systolic architectures?", Computer, vol.
15, p. 37, 1982.
[3]. D.I. Moldovan, Parallel Processing: From Applications
to Systems, Morgan Kaufmann Publishers, 1993.
[4]. S.K. Rao, Regular Iterative Algorithms and their
Implementation on Processor Arrays, Ph.D. Dissertation,
Stanford University, Stanford, CA, 1985.
[5]. M. Flynn, Some Computer Organizations and Their
Effectiveness, IEEE Trans. Comput., vol. C-21, pp. 948, 1972.
[6]. Ralph Duncan, "A Survey of Parallel Computer
Architectures", IEEE, Feb. 1990.
[7]. K.A. De Jong, Genetic Algorithms: A 25-Year
Perspective, in Computational Intelligence: Imitating Life,
IEEE, 1994.
[8]. L.J. Fogel, Evolutionary Programming in Perspective:
The Top-Down View, in Computational Intelligence,
pp. 135-146, IEEE, 1994.
[9]. N. Saravanan, David E. Fogel, Kevin M. Nelson, "A
Comparison of Methods for Self-Adaptation in Evolutionary
Algorithms", IEEE, 1995.
[10]. M.J. Foster and H.T. Kung, "The Design of Special-
Purpose VLSI Chips", Computer, pp. 26-40, Jan. 1980, IEEE.
[11]. Keshab K. Parhi, Chapter 7, pp. 189-210.
[12]. "Evolutionary Computation 1 & 2", by Fogel, 1980.
[13]. P. Quinton, "The Systematic Design of Systolic Arrays",
IRISA Rept., March 1983.
[14]. S.G. Sedukhin and I.S. Sedukhin, An Interactive Graphic
CAD Tool for the Synthesis and Analysis of VLSI Systolic
Structures, Proc. of Int. Conf. "Parallel Computing
Technologies", Obninsk, Russia, 1993, Vol. 1, pp. 163-175.
[15]. Jonathan Break, "Systolic Arrays & Their Applications".
[16]. H.T. Kung and C.E. Leiserson, Systolic Arrays (for
VLSI), Sparse Matrix Proc. 1978, Society for Industrial and
Applied Mathematics, 1979, pp. 256-282.
[17]. G.J. Li and B.W. Wah, The Design of Optimal Systolic
Arrays, IEEE Trans. Comput., C-34(10), 1985, pp. 66-75.
Implementation of Optimized Flame Detection in a Video using Image Processing
Pramod G. Devalatkar Department of Electronics & Communication Engineering,
Visvesvaraya Technological University
Belgaum, India. [email protected]
Abstract— The present work is an in-depth study of detecting
flames in video by processing the data captured by an ordinary
camera. Previous vision-based methods were based on color
difference, motion detection of flame pixels and flame edge
detection. This paper focuses on optimizing flame detection by
identifying gray-cycle pixels near the flame, which are generated
by smoke, by the spreading of fire pixels and by the area spread
of the flame. These techniques can be used to reduce false alarms
along with fire detection methods. The proposed system simulates
the existing fire detection techniques together with the new
techniques given above and provides an optimized way to detect
fire with fewer false alarms, giving an accurate indication of fire
occurrence. The strength of using video in fire detection is the
ability to monitor large and open spaces.
Keywords—Fire detection, Video processing, Edge detection,
Color detection, Gray cycle pixel, Fire pixel spreading.
I. INTRODUCTION
Fire detection sensors are used to detect the occurrence
of fire and to make decisions based on it. However, most
of the available sensors, such as smoke detectors, flame
detectors, heat detectors etc., take time to respond. They
have to be carefully placed in various locations, and they
are not suitable for open spaces. Due to rapid developments
in digital camera technology and video processing
techniques, conventional fire detection methods are being
replaced by computer vision based systems.
Conventional point smoke and fire detectors are widely
used in buildings. They typically detect the presence of
certain particles generated by smoke and fire by
ionization or photometry. Alarm is not issued unless
particles reach the sensors to activate them. Therefore,
they cannot be used in open spaces and large covered
areas. Video based fire detection systems can be useful
to detect fire in large auditoriums, tunnels, atriums, etc.
The strength of using video in fire detection makes it
possible to serve large and open spaces. In addition,
closed circuit television (CCTV) surveillance systems
are currently installed in various public places
monitoring indoors and outdoors. Such systems may
gain an early fire detection capability with the use of fire
detection software processing the outputs of CCTV
cameras in real time. Current vision based techniques
mainly follow the color clues, motion in fire pixels and
edge detection of flame. Fire detection scheme can be
made more robust by identifying the gray cycle pixels
nearby to the flame and measuring flame area
dispersion.
II. OVERVIEW OF FIRE DETECTION
This section covers the details of the previously proposed
fire detection methods. It is assumed that the image
capturing device produces its output in RGB format.
During an occurrence of fire, smoke and flame can be
seen, and they become more visible as the fire intensity
increases. In order to detect the occurrence of fire, both
flame and smoke need to be analysed. Many researchers
have used characteristic properties of fire such as color,
motion, edge and shape. Lai et al. [3] suggested that features
of fire event can be utilized for fire detection in early
stages. Han et. al. [2] used color and motion features
while Kandil et al. [1] and M. Nixon, A. Aguando [6]
utilized shape and color features to detect an occurrence
of fire.
A. Edge detection
The edge detection method is used to detect the color variance
in an image. The edge detection system compares the color
difference and provides an edge based on it [8], which can be
used in fire detection.
B. Color detection
A fire in an image can be described by its color
properties. Color pixels can be decomposed into the
individual elements R, G and B, which can be used for
color detection. C.-L. Lai, J.-C. Y [3] used the R and G
elements and found a correlation between the G/R ratio
and temperature distribution: as temperature increases,
the G/R ratio also increases. Consequently, the color of a
flame can provide useful information for estimating the
temperature of a fire and also the fire phase [9]. In terms
of RGB values, this fact corresponds to the following
inter-relation between the R, G and B color channels:
R > G and G > B. The combined condition for the fire
region in the captured image is R > G > B.
Besides, R should be more stressed than the other
components, because R becomes the dominant color
channel in an RGB image of flames. This imposes
another condition: R must exceed some pre-determined
threshold, RT [1]. However, lighting conditions in the
background may adversely affect the saturation values of
flames, resulting in similar R, G and B values, which may
cause non-flame pixels to be classified as flame colours
[4]. Therefore, the saturation values of the pixels under
consideration should also be above some threshold value.
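The combined rule R > G > B, with a red threshold and a saturation floor, can be written directly as a per-pixel mask. The values RT = 190 and 0.2 below are illustrative assumptions; [1] and [4] do not fix them here:

```python
import numpy as np

def fire_color_mask(rgb, r_threshold=190, sat_threshold=0.2):
    """Mask pixels satisfying the fire-colour rule R > G > B, with R above
    a threshold and sufficient saturation to reject washed-out background."""
    img = rgb.astype(np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    mx = img.max(axis=-1)
    mn = img.min(axis=-1)
    # HSV-style saturation; guard against division by zero on black pixels
    sat = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-9), 0.0)
    return (r > g) & (g > b) & (r > r_threshold) & (sat > sat_threshold)
```

The function returns a boolean mask with the same height and width as the input frame; the masked region is the candidate fire area passed on to the later detection stages.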
C. Motion detection
Motion detection is used to detect any occurrence of
movement in a video. It is done by analysing the differences
between images of consecutive video frames. There are three
main parts in moving pixel detection: frame/background
subtraction, background registration, and moving pixel
detection [10].
The first step is to compute the binary frame difference
map, by thresholding the difference between two
consecutive input frames. At the same time, the binary
background difference map is generated by comparing
the current input frame with the background frame
stored in the background buffer. The binary background
difference map is used as primary information for
moving pixel detection.
In the second step, according to the frame difference
maps of the past several frames, pixels which have not
moved for a long time are considered reliable background
in the background registration [11]. This step
maintains an updated background buffer as well as a
background registration map indicating whether the
background information of a pixel is available or not.
In the third step, the binary background difference map
and the binary frame difference map are used together to
create the binary moving pixel map [6]. If the
background registration map indicates that the
background information of a pixel is available, the
background difference map is used as the initial binary
moving pixel map.
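The three steps above can be sketched as a small stateful detector; the difference threshold and the number of still frames required for background registration are assumed values:

```python
import numpy as np

class MovingPixelDetector:
    """Sketch of the three-step moving-pixel scheme: frame difference,
    background registration, and the combined moving-pixel map."""

    def __init__(self, shape, diff_threshold=15, still_frames=5):
        self.diff_threshold = diff_threshold
        self.still_frames = still_frames   # frames a pixel must stay still
        self.prev = None
        self.still_count = np.zeros(shape, dtype=np.int32)
        self.background = np.zeros(shape, dtype=np.float64)
        self.registered = np.zeros(shape, dtype=bool)

    def update(self, frame):
        frame = frame.astype(np.float64)
        if self.prev is None:
            self.prev = frame
            return np.zeros(frame.shape, dtype=bool)
        # Step 1: binary frame-difference map between consecutive frames
        frame_diff = np.abs(frame - self.prev) > self.diff_threshold
        # Step 2: pixels unchanged for several frames become background
        self.still_count = np.where(frame_diff, 0, self.still_count + 1)
        newly_bg = self.still_count >= self.still_frames
        self.background = np.where(newly_bg, frame, self.background)
        self.registered |= newly_bg
        # Step 3: use background difference where registered, else frame diff
        bg_diff = np.abs(frame - self.background) > self.diff_threshold
        moving = np.where(self.registered, bg_diff, frame_diff)
        self.prev = frame
        return moving
```

Feeding a static scene for a few frames registers it as background; any pixel that subsequently departs from that background is flagged as moving.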
III. PROPOSED TECHNIQUE
The aim of this paper is to develop an identification
system to detect an occurrence of fire based on the video
image. In this paper, I use flame properties to conduct
the fire detection, as shown in the block diagram below:
Fig. 1. Proposed fire detection system.
This method gives the flexibility to use different
combinations of detection methods, so that the system can
be implemented according to the specific requirements of
use [4]. For example:
(1) For a highly sensitive area, the OR (||) operator can be
applied, so that the system signals fire if any one method
detects an occurrence.
(2) For general purposes, a combination of any two methods
can be applied, so that the system signals fire if at least two
methods detect it.
(Fig. 1 blocks: image frames from video; edge & color
detection; motion & gray-cycle detection; decision-making
algorithm; fire alarm signal.)
(3) For a less sensitive area, the AND (&&) operator can be
applied, so that the system signals fire only if all methods
detect it.
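The three combination modes above reduce to a simple vote over the individual detector outputs; a minimal sketch, where the function and mode names are my own:

```python
def fire_decision(detections, mode="majority"):
    """Combine individual detector outputs (booleans) into one alarm.
    'any'      - OR  (||): highly sensitive areas
    'majority' - at least two methods agree: general purpose
    'all'      - AND (&&): less sensitive areas"""
    votes = sum(bool(d) for d in detections)
    if mode == "any":
        return votes >= 1
    if mode == "all":
        return votes == len(detections)
    return votes >= 2
```

For instance, with outputs from motion, gray-cycle and area-dispersion detection, `fire_decision([True, False, False], "any")` raises the alarm while the `"majority"` and `"all"` modes do not.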
IV. METHODOLOGY
The purpose of this paper is to develop an optimized
system to detect an occurrence of fire based on video
images. In this project I have used the previously
proposed methods to conduct fire detection and propose
new techniques to run in parallel, which gives more
optimized flame detection results. Developing the system
involves the following stages.
V. PROPOSED ALGORITHM
The algorithm is based on the fact that visual color
images of fire have high absolute values in the red
component of the RGB coordinates [4]. This property
permits simple threshold-based criteria on the red
component of the color images to segment fire images in
natural scenarios. However, fire is not the only source of
high values in the red component. Another characteristic
of fire is the ratio between the red component and the blue
and green components. An image is loaded into the color
detection system and mapped with the extracted edge
detection image [7]. The color detection system applies
the specific RGB pixel property and outputs an image
with a selected area of color detection.
A. Edge detection
The edge detection method is used to detect the color
variance in an image. The block diagram of the edge
detection system is shown in Figure 2. Using MATLAB,
an edge detection model is built based on this block diagram.
Fig. 2 Block diagram of Edge Detection system
Fig.3 (a). Original Image.
Fig. 3 (a) shows the original frame, i.e. frame 1, and Fig. 3 (b)
shows the result of edge detection on this image.
Fig. 3(b). Image after applying edge detection.
The edge detection system compares the intensity
differences in the image and provides an image in a
black-and-white color space, where high intensity areas
are filled with white and low intensity areas are filled
with black [2]. The intensity difference is categorized
using a global intensity threshold which MATLAB
calculates separately for each image; the output provides
the shape of the flame [4]. Thus, edge detection can be
used to analyse the color detection of fire.
After getting the output from the color detection, we can
apply different detection techniques by mapping the
detected coordinates onto the corresponding original
image in different combinations. We have three
techniques to implement further:
Motion detection.
Gray-cycle pixel detection.
Area dispersion.
B. Motion detection
Motion detection is used to detect any occurrences of
movement in a sample video. The block diagram of the
motion detection system is shown below:
Fig.4. Block diagram of motion detection
Here we take two sequential images from the video
frames [4]. After applying the two basic methods, edge
detection and color detection, we get the probable fire
pixel area; we then compare the RGB values of frame 1
with frame 2 for each corresponding pixel, and if the
pixel values differ, the motion detector reports motion
and passes the resulting output to the combination operator.
C. Gray-cycle pixel detection
Gray-cycle detection is used to detect occurrences of
smoke pixels in the selected area, namely the upper half
of the area detected by the color detection method. A
gray-cycle pixel has certain properties in terms of RGB
values. This method checks these properties inside the
selected area and then, depending on the result obtained,
provides its result to the operator.
D. Area dispersion
The area detection method is used to detect the dispersion
of the fire pixel area across sequential frames. In this
method we compare the fire pixel areas of two sequential
frames on the basis of the minimum and maximum values
of x and y [5]. In case of fire, if any extreme value along
the x or y axis increases in the next frame, area dispersion
has taken place and the system provides output to the
operator.
After that, the operator performs the operation on the
basis of the logic combination selected by the system. The
detected fire pixel area of the image in Fig. 3 (a) is shown
in Fig. 4 below.
Fig. 4. Detected fire pixel area.
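The extreme-value comparison can be sketched directly on lists of detected fire-pixel coordinates; the (x, y) tuple representation is an assumption of this sketch:

```python
def bounding_box(fire_pixels):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y)
    of a list of (x, y) fire-pixel coordinates."""
    xs = [x for x, _ in fire_pixels]
    ys = [y for _, y in fire_pixels]
    return min(xs), min(ys), max(xs), max(ys)

def area_dispersed(prev_pixels, curr_pixels):
    """Dispersion: any extreme of the current frame's box grows
    past the previous frame's box."""
    if not prev_pixels or not curr_pixels:
        return False
    px0, py0, px1, py1 = bounding_box(prev_pixels)
    cx0, cy0, cx1, cy1 = bounding_box(curr_pixels)
    return cx0 < px0 or cy0 < py0 or cx1 > px1 or cy1 > py1
```

A growing fire extends at least one extreme of its bounding box from frame to frame, so the function returns True; a static bright object keeps the same box and returns False.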
VI. CONCLUSIONS
I have collected a number of sequential image frames
from two originally created videos which contain both
fire and non-fire images. The table below shows the
number of false alarm detections over non-fire images.
Method                           No. of Faulty Detections   System Performance
Motion Detection                 (10/50) = 20%              80%
Gray Cycle Detection             (08/50) = 16%              84%
Area Dispersion                  (07/50) = 14%              86%
Proposed Fire Detection System   (03/50) = 6%               94%
The results show that applying the proposed fire detection
system yields better system performance in terms of fewer
false alarms.
REFERENCES
[1]. M. Kandil, M. Salama (2009): A New Hybrid Algorithm for Fire Vision
Recognition, in IEEE Eurocon, pp. 1460-1466.
[2]. D. Han, Byoungmoo Lee (2009): Flame and Smoke Detection method
for early real-time detection of a tunnel fire, Fire Safety Journal, vol. 44,
pp. 951-961.
[3]. C.-L. Lai, J.-C. Y (2008): Advanced Real Time Fire Detection in Video
Surveillance System, in IEEE International Symposium on Circuits and
Systems (ISCAS), pp. 3542-3545.
[4]. Gaurav Yadav et al. (2012): Image Processing based fire and flame
detection technique, in The Indian Journal of Computer Science and
Engineering (IJCSE).
[5]. Turgay Celik (2010): Fast and Efficient Method for Fire Detection Using
Image Processing, ETRI Journal, Volume 32, Number 6.
[6]. M. Nixon, A. Aguando (2008): Feature Extraction & Image Processing,
2nd ed., London, Academic Press, pp. 115-136.
[7]. B.C. Ko, K.H. Chong, J.Y. Nam (2009): Fire Detection based on vision
sensor and support vector machines, Fire Safety Journal, vol. 44, pp. 322-329.
[8]. Q. Zhu, S. Avidan, M.-C. Yeh, K.-W. Cheng (2006): "Fast Human
Detection Using a Cascade of Histograms of Oriented Gradients",
Proceedings of the IEEE Computer Society Conference on Computer Vision
and Pattern Recognition, ISSN 1063-6919, Volume 2, pp. 1491-1498.
[9]. Fire and Gas Protection Systems Part 3 (2009): PETRONAS Fire and
Gas Protection Systems.
[10]. T. Chen, P. Wu, and Y. Chiou (2004): An early fire-detection method
based on image processing, in ICIP '04, pp. 1707-1710.
[11]. C.-B. Liu, N. Ahuja (2004): Vision based fire detection, Proceedings of
the 17th International Conference on Pattern Recognition (ICPR '04), Vol. 4,
pp. 134-137.
An Enhanced Threshold Based Technique for
WBC Nuclei Segmentation Arpita kulkarni1, Md Riyaz Ahmed2
1,2Electronics and Communication, RITM
Bangalore, India [email protected]
[email protected]
Abstract— Blood testing is the most important step aging adults
can take to prevent life-threatening disease. With blood test
results in hand, we can identify critical changes in the body
before they manifest as heart disease, cancer, diabetes, or worse.
A reliable and cost-effective method is therefore important for
uncovering abnormalities in a blood sample. Since the nucleus
of the white blood cell carries major information about
abnormalities in the blood cell, we present a simple method of
white blood cell nuclei segmentation which is robust in noise
removal and obtains the required information about the cell by
global feature extraction and Gabor convolution. This is simpler
and more cost effective than other methods.
Keywords—Blood test; segmentation; nuclei; leukocytes; red
blood cells
I. INTRODUCTION
Blood tests are a critical part of medical diagnosis. A
blood smear image used for laboratory testing measures
the concentration of white blood cells, red blood cells,
and platelets in the blood. The white blood cell count
helps in identifying hormone and enzyme levels in blood
cancer cell detection, in coronary artery disease to avoid
heart attacks, and more. The normal adult white blood
cell count ranges from 5,000 to 10,000 [1] [3]. WBCs
are cells of the immune system and are found throughout
the body, including the bone marrow and blood [9].
White blood cells are also called leukocytes. There are
two different types of leukocytes, depending on the
presence of granules: granulocytes and agranulocytes.
Granulocytes contain membrane-bound enzymes which
help in digestion; they comprise neutrophils, basophils
and eosinophils. Agranulocytes include lymphocytes,
monocytes, and macrophages [2]. The manual white
blood cell counting method is time consuming and prone
to mistakes, as it depends on the technician/expert.
Hence it is necessary to identify and analyze the major
blood cell abnormalities within a short period of time. The
methodologies involved in blood testing need to be simple,
reliable, fast and cost effective. Visual sample automation
is one such method. Researchers are keen on improving
the classical segmentation methods to get advanced
results. Automation involves five basic processes: image
acquisition, image pre-processing, image segmentation,
image post-processing and image analysis [4]. Image
acquisition involves selection of the image; image
pre-processing involves enhancing the image contrast and
other required features. Image segmentation is the most
important step and the one addressed in this work.
Post-processing is just the removal of some residual noise,
and finally the output image obtained is analysed. The
flow is shown in Figure 1. This work addresses the
segmentation process in the automation method. Here, the
algorithm proposed by Mostafa Mohammed and Behrouz
Far is modified to obtain qualitative results. The proposed
method avoids the dependence of the data on the green
component of the image, so more information about the
features of the cell is obtained. As of now we are using
sub-images of the blood smear containing a single WBC.
The paper is organized as follows. Section 2 presents the
literature review. Section 3 explains the proposed method.
Section 4 presents the results and discussion, and Section 5
draws the conclusions and outlines future work, followed
by the references.
Figure 1: Flow diagram for the visual automation method
II. LITERATURE REVIEW
Several methods have been proposed in the medical field
to improve the efficiency of the blood test for faster
detection of diseases. S. H. Rezatofighi et al. [5]
developed an automated method for white blood cell
nucleus segmentation based on Gram-Schmidt
orthogonalization; the orthogonality theory is used for
amplifying the desired colour vectors and weakening the
undesired ones. Farnoosh Sadeghian et al. [6] introduced
a method to separate the nucleus and the cytoplasm with
the help of two types of active contour models, parametric
and geometric, which worked only on grayscale images.
Scientific reports presented a minimum-model approach:
the detection and segmentation of the white blood cell
nuclei was performed with the help of virtual microscopy
images. It avoided the a priori information required in
model-based approaches and was able to work with
minimal prior information for WBC nuclei segmentation.
The authors introduced a contour value which helps in
measuring the rank and selecting the best image objects
[7]. A novel framework was proposed by Ja-Won Gim
et al. [8], which explained a method for white blood cell
(WBC) segmentation using a region merging scheme and
a GVF (gradient vector flow) snake. It described two schemes:
nuclei segmentation and cytoplasm segmentation. For
nuclei segmentation, they created a probability map
using probability density function estimated from
samples of WBC‘s nuclei and cropped the sub-images to
include nucleus by using the fact that nuclei have salient
colour against background and red blood cells. Then,
mean-shift clustering was performed for region
segmentation and merging rules were applied to merge
particle clusters into the nucleus. A hybrid approach was
used for cytoplasm segmentation, combining the spatial
characteristics of the cytoplasm with GVF snakes to
delineate the boundary of the region of interest. This
method showed a higher average accuracy of 67.7% vs.
37.6%. An adaptive threshold detector method was
proposed by Der-Chen Huang et al. [9]. The method
consists of two steps: the first step enhances the WBC
nuclei by combining two different colour spaces in a
blood smear image; then a thresholding-based adaptive
segmentation method is applied. The experimental results
showed that promising segmentation results can be
obtained even for different colour tones and sizes of
smear images. Dorini et al. [10] presented a simple
morphological operation which explores scale-space
properties and improves the segmentation accuracy.
Nucleus segmentation is done by creating a binary image
with the help of a thresholding method, and the gradient
is computed by the watershed transform method. A
granulometric function is used for cytoplasm
segmentation. Madhloom et al. [11] established an
automatic algorithm for detection and classification of
leukocytes. It had a major drawback in that its accuracy
was highly dependent on the contrast of the image. This
drawback was overcome by Mostafa et al. [4] by using
some regulations, such as requiring the nucleus minimum
segment size to be half of the RBC average size.
Chinwaraphat et al. [12] proposed a modified fuzzy
C-means clustering (FCM) method which segments the
cytoplasm and nuclei of the WBC. The FCM method
classifies the blood sample into white blood cell
nucleus, white blood cell cytoplasm, plasma and red
blood cells. FCM is modified to avoid false clustering
due to the unclear pixel similarity between the
cytoplasm and plasma regions. The output efficiently
extracts the nucleus and cytoplasm
regions compared to the normal FCM method. Byoung
Chul Ko et al. [13] proposed an image segmentation
method using stepwise merging rules based on
mean-shift clustering and boundary removal rules with a
GVF (gradient vector flow) snake. Nuclei segmentation
is done by calculating the probability density function
from the WBC nuclei. For cytoplasm segmentation,
boundary edges and noise edges are removed using
removal rules, while a GVF snake is forced to deform to
the cytoplasm boundary edges. Nicola Ritter et al. [14]
presented a blood cell segmentation algorithm for
images taken from peripheral blood smear slides.
III. PROPOSED ALGORITHM
The key task of image segmentation is to extract all the
WBCs from a complicated background and then
segment them into their morphological components,
such as the nuclei and cytoplasm [14]. Here we focus on
segmenting the most informative part of the WBC, the
nucleus, for further analysis.
In this research we modify the algorithm proposed by
Mostafa et al. [4] to improve its accuracy. The modified
algorithm avoids the over-staining of components that
leads to inexact segmentation. It removes noisy material
from the blood smear through an efficient edge
detection step, and smoothing is performed by a Gabor
filter, which gives an efficient output. In addition, the
method need not depend on the green component of the
data set: the RGB image is converted to grayscale for
further processing.
I. Proposed algorithm steps
The blood smear images are processed as follows:
1) Read the input image; if it is RGB, convert it to
grayscale.
2) Perform edge detection and morphological
operations on the image.
3) Remove noise by calculating the region
properties of the components in the image.
4) Obtain the valid region details of the image to
achieve robustness.
5) Resize the image by width normalization.
6) Extract the global features of the components
of the image.
7) Calculate the Gabor vector by Gabor
convolution.
8) Combine the Gabor vector and the global
feature vector to obtain the WBC count in the
given blood smear image.
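The noise-removal step (3) can be sketched as an area filter over connected components. The following is a minimal pure-Python illustration, not the paper's MATLAB code: the 4-connectivity flood fill and the min_area threshold are assumptions made for the sketch.

```python
from collections import deque

def remove_small_components(binary, min_area):
    """Keep only connected components (4-connectivity) whose pixel
    area is at least min_area; smaller components are treated as noise."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                # Flood-fill to collect one connected component.
                comp, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                # Retain the component only if its area clears the threshold.
                if len(comp) >= min_area:
                    for y, x in comp:
                        out[y][x] = 1
    return out

img = [[1, 1, 0, 0, 1],
       [1, 1, 0, 0, 0],
       [0, 0, 0, 1, 0]]
clean = remove_small_components(img, min_area=3)
```

Here the two single-pixel specks are discarded while the four-pixel component survives, mirroring how the threshold area retains only components above the chosen value.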
II. Details of the proposed Algorithm
The input image is checked: if it is not already
grayscale, it is converted from RGB to gray using the
appropriate MATLAB functions. The image is
preprocessed by several operations, namely edge
detection and a morphological operation. The properties
of the connected components in the image are
calculated, and an area threshold is set; each connected
component's area is compared against it and only
components above the threshold are retained. This
removes most of the noisy components from the image.
From this set of components we select the valid region
using a logical image, to get the details of the valid
region. The image is then resized to perform width
normalization of all the segments present in the data
set. The properties of this segmented image are
calculated to obtain the global feature vector. Gabor
convolution is performed on the image to obtain the
Gabor vector; the Gabor filter also smooths the image.
The global feature vector and the Gabor vector are then
concatenated to obtain the required set of data about the
WBCs in the given image data set.
Gabor convolution convolves each row of the image
with a 1-D log-Gabor filter,
G(w) = exp( -(log(w/w0))^2 / (2 (log(k/w0))^2) )
where w0 is the filter's centre frequency. To obtain
constant-shape-ratio filters, the term k/w0 must be held
constant as w0 varies.
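The transfer function above can be evaluated pointwise; the sketch below does so in pure Python. The numeric choices of w0 and k are illustrative only, not taken from the paper.

```python
import math

def log_gabor(w, w0, k):
    """Transfer function of a 1-D log-Gabor filter with centre
    frequency w0; the ratio k/w0 fixes the filter's shape (bandwidth)."""
    if w <= 0:
        # log-Gabor filters are defined to have zero response at DC.
        return 0.0
    num = -(math.log(w / w0)) ** 2
    den = 2 * (math.log(k / w0)) ** 2
    return math.exp(num / den)

# The response peaks (value 1) exactly at the centre frequency w0.
peak = log_gabor(0.25, w0=0.25, k=0.1)
off_peak = log_gabor(0.5, w0=0.25, k=0.1)
```

Because the Gaussian is taken on a log frequency axis, halving or doubling w relative to w0 attenuates the response symmetrically, which is what makes the k/w0 ratio a constant-shape parameter.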
The flow diagram of the proposed method is
shown in figure 2.
Figure 2: Workflow of proposed method
III. RESULTS AND DISCUSSION
In the experiment we used an RGB colour image,
which was converted to grayscale for processing; this
constitutes a qualitative analysis of the segmentation
method. The image data set is stored in the file
E:\emath\eMath_Arpita\Dataset. Though the paper has
some limitations, the method can be successfully
implemented at the algorithm level to get efficient
results. The input blood smear image is processed to
obtain binary image data, from which the valid region
details are extracted. The valid region is identified by
taking the cell with the maximum area, which is the
single WBC in the sub-image. We then collect all the
required information about the WBC, the output is
displayed, and the matrix of properties is obtained. The
initial input is a sub-image containing only a single
WBC, as proposed by Mostafa et al. [4]. We can obtain
information about all the physical features of the WBC
within the bounding box selected by the Gabor filter.
This work can also be extended to count the WBCs,
which helps in characterizing the cells when they are
large in number.
The input data is given as a colour or grayscale image.
Figure 3: Input image used for processing
The input image is a blood smear taken to the
laboratory for abnormality testing. This image is
initially in colour; sub-images of single WBCs are then
identified and used for further processing. The binary
image is an intermediate image obtained after
thresholding. Valid region details are obtained to
calculate the features of the required cell. The output
image displays the segmented white blood cell obtained
after processing.
Figure 5: Output image
CONCLUSION
We have presented a method for WBC segmentation
based on a threshold-based technique. Noise is removed
efficiently with the help of a Gabor filter and edge
detection methods. The process has been worked out as
a qualitative output, since we have taken sub-images of
the blood smear to determine the properties of a single
cell. The method can further be extended to count white
blood cells in a complicated environment, where the
data contains more than one WBC.
ACKNOWLEDGMENT
I would like to thank the head of the department,
S. S. Manvi, my guide, Md. Riyaz Ahmed, and our
coordinator, H. S. Aravind, for helping me carry out my
project; their support is always precious.
REFERENCES
[1] Siamak T. Nabili, MD, MPH and William C. Shiel Jr., MD, FACP,
FACR , ―Complete blood count,‖ URL:
http://www.medicinenet.com/complete_blood_count/article.htm#tocb.
[2] ―Laboratory blood tests,‖ Health hub from Cleveland
clinic.URL
:http://my.clevelandclinic.org/heart/services/tests/labtests/default.aspx.
[3] ―What are the types of White Blood Cells, ‖University
of Rochester Medical Centre, URL :
http://www.urmc.rochester.edu/encyclopedia/content.aspx/C
ontentTypeID=160&ContentID=35
[4] Mostafa Mohammed and Behrouz Far, ―An Enhanced
Threshold Based Technique for White Blood Cells
Nuclei Automatic Segmentation,‖2012 IEEE
14thInternational Conference on e-Health Networking,
Applications and Services (Healthcom).
[5] S. H. Rezatofighi, H. Soltanian Zadeh, R. Sharifian
and R.A. Zoroofi, ― A New Approach to White Blood
Cell Nucleus Segmentation based on Gram- Schmidt
Orthogonalization,‖ International Conference on
Digital Image Processing, Wayne State university
,2009.
[6] Farnoosh Sadeghian, Zainina Seman, Abdul Rahman
Ramli, Badrul Hisham Abdul Kahar, and M-Iqbal
Saripan, ―A Framework for White Blood Cell
Segmentation in Microscopic Blood Images Using
Digital Image Processing,‖ Shulin Li (ed.), Biological
Procedures Online, Volume 11, Number 1,2009.
[7] Stephan Wienert1, Daniel Heim, Kai Saeger, Albrecht
Stenzinger, Michael Beil, Peter Hufnagl,
Manfred Dietel, Carsten Denkert and Frederick
Klauschen1,‖Detection and Segmentation of Nuclei in
Virtual Microscopy Images: A Minimum Model
Approach,‖ Scientific Reports: Bioinformatics
Software Medical Research Imaging, 2012.
[8] Ja-Won Gima, Junoh Parka, Ji-Hyeon Leea, Byoung
Chul Koa and Jae-Yeal Nama, ―A novel framework
for white blood cell segmentation based on stepwise
rules and morphological features,‖Image Processing:
Machine Vision Applications IV, edited by David
Fofi,Philip R. Bingham, Proc. of SPIE-IS&T
Electronic Imaging, SPIE Vol. 7877, 78770H, 2011.
[9] Der Chen Huang, Kun-Ding Hung and Yung-Kuan
Chan,‖White Blood Cell Nucleus Segmentation Based
on Adaptive Threshold Detector,‖ IEEE Transactions
on Image Processing, Vol. 19, No. 12, pp. 3243-3254,
2010.
[10] Leyza Baldo Dorini ,Rodrigo Minetto, and Neucimar
Jeronimo Leite, ―White blood Cell Segmentation
using Morphological operators and Scale-space
Analysis,‖IEEE Computer Society Washington, DC,
USA, 2007.
[11] H.T. Madhloom, S.A. Kareem, H. Ariffin, A. A.
Zaidan, H.O. Alanazi and B. B. Zaidan , ―An
Automated White Blood Cell Nucleus Localization
and Segmentation using Image Arithmetic and
Automatic Threshold,‖ Journal of Applied Sciences,
vol.10, no.11, pp. 959-966,2010.
[12] S. Chinwaraphat, A. Sanpanich, C. Pintavirooj, M.
Sangworasil and P. Tosranon, ―A Modified Fuzzy
Clustering for White Blood Cell Segmentation,‖ The
An Automated Method of Enabling Portability of
Health Records for People with Restricted Mobility
Pushpa R Iyer & Tej Ganapathy K B1, Sreedhar Menon R & Devdipta Pal2, P Meena3
1 and 2 Final Year B.E. Electrical & Electronics, BMS College of Engineering, Bangalore, India
3Assoc. Prof., Department of Electrical and Electronics, BMS College of Engineering, Bangalore, India
Email: [email protected] , [email protected] , [email protected] ,
[email protected] , [email protected]
Abstract - Filling out initial paperwork and providing a
medical history to a new hospital or physician on every visit is
an inconvenience to any patient, and more so to one with
restricted mobility. This work proposes a centralized database,
maintained by the government or another central body, that
keeps up-to-date records of any patient who registers for the
service. The paper presents a successful attempt in this
direction by making the details portable, enabling access
through the internet by any authorized person or hospital.
Keywords – Portable Records, Up-to-date records, Centralized
database, Hospital, Restricted Mobility, Authorized person
I. INTRODUCTION
People with restricted mobility form a sizeable section of
the Indian population. Although most hospitals maintain a
record of each patient for subsequent visits, there is currently
no mechanism for sharing information between hospitals, so
that an incoming patient's records are available to every
hospital regardless of whether the patient has visited that
hospital before. This causes particular inconvenience to
patients with restricted mobility and in cases of emergency.
Centralized Access to Patient Records (CASPAR) overcomes
this shortfall by keeping medical records of any patient who
registers with CASPAR and providing authorized persons
access to individual patient records on demand through an
internet-based front-end. This makes visits to the care facility
less time-consuming, since the patient's medical records can
be accessed by any medical facility that uses CASPAR.
Our review of existing systems for patient information
portability is as follows. In [1], the authors proposed a
portable patient information integration system that combines
Radio Frequency Identification (RFID) technology, a wireless
network, Personal Digital Assistants (PDAs) and a
front-monitoring system. With the help of the proposed system,
medical workers can identify patients by contactless
identification and retrieve medical records immediately. The
system also records the history of interactions between
medical personnel and patients, sends alarms to the
corresponding medical workers when a high-risk test result is
reported, and gives medicine safety suggestions. It can thus
improve the correctness and immediacy of patients' medical
information and provide a safe environment for patients. [2]
presents a Point-of-Care Information System (PCIS) that uses
portable terminals and supports the entire loop of daily
nursing work for the first time. The system consists of personal
digital assistants (PDAs) that nurses carry in the hospital
wards and a server computer located in the nursing station.
It has three main functions: 1) data browsing, which
provides patient information such as a brief history, vital-sign
charts, and handwritten notes; 2) schedule planning, which
helps nurses organize doctors' instructions and make to-do
lists for the day; and 3) care management, which reminds
nurses when they should execute the doctors' orders and
provides tools for data entry. In [3], the Society for Promotion
of IT in Chandigarh proposed that government hospitals and
community health centres should have a centralized patient
database. According to the proposal, once a patient has visited
any health centre or hospital, his or her record will be
uploaded to the common database and be accessible from all
places.
The feature common to all the above projects is that patient
records are maintained only on a localized scale, in one medical
facility or for one city. The model proposed here allows
portability of information between all participating medical
facilities, spanning the nation or even the world. The major
requirement for this model to be successful is for patients to
volunteer their information to be stored in the CASPAR database.
II. SCHEMATIC OF CASPAR
In this section we explain the proposed model and its
functionalities. Fig.1 depicts the schematic.
The patient or the patient's authorized guardian is allowed to
access and update information in the CASPAR database
through an internet-based front-end. Authentication is
provided through a login feature where the patient chooses a
suitable password. The patient may choose to share this
password with his authorized guardian.
The front-end developed provides the patient accessing the
database with a user-friendly interface that allows for easy
navigation between the different pages created to enable
access to the database system. The patient is permitted to
change his contact information, insurance information and
records of his medical history.
The central database is intended to be maintained by a
central and trusted organization, which could be the
government. Limited access to this database can be purchased
by individual medical facilities, whose access privileges are
restricted by the database administrator in accordance with the
privacy requirements.
Fig.1 Schematic of CASPAR
III. STRUCTURE OF DATABASE
The database has been designed to hold all information
required by hospitals to make an admission. Fig.2 gives a
blueprint of the database.
To make the mechanism work, hospitals purchase access to
CASPAR on signing a privacy agreement.
The rationale for designing the database this way is:
1) Aadhaar Number: inclusion of this number in the
database facilitates efficient searching of patient records, since Aadhaar is a unique ID.
2) Emergency Contact Numbers and Address: included so
that the hospital knows exactly whom to contact in the
event of an emergency.
3) Insurance Details: information about the insurance
provider and insurance number that the hospital needs
in order to proceed in an emergency, after consent from
the guardian.
4) Patient's Past Records: enables doctors to access the
patient's medical history so that they can treat the
patient more effectively.
5) Patient or User ID: a computer-generated ID used at the
patient's end to log onto the database and view, review or update information.
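As an illustration only, the blueprint above could be realized as a single relational table. The sketch below uses Python's built-in sqlite3 module rather than the MySQL database the system actually uses, and the column names are assumptions read off Fig.2.

```python
import sqlite3

# In-memory database standing in for the central CASPAR store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        patient_id         INTEGER PRIMARY KEY AUTOINCREMENT, -- computer-generated user ID
        aadhaar_number     TEXT UNIQUE NOT NULL,              -- unique national ID, used for search
        name               TEXT NOT NULL,
        emergency_contact  TEXT,
        address            TEXT,
        insurance_provider TEXT,
        insurance_number   TEXT,
        past_records       TEXT,                              -- medical history
        password           TEXT NOT NULL                      -- stored unencrypted, per Section V
    )
""")
conn.execute(
    "INSERT INTO patient (aadhaar_number, name, password) VALUES (?, ?, ?)",
    ("1234-5678-9012", "A. Patient", "secret"),
)
# A hospital with purchased access looks a patient up by Aadhaar number.
row = conn.execute(
    "SELECT patient_id, name FROM patient WHERE aadhaar_number = ?",
    ("1234-5678-9012",),
).fetchone()
```

Making the Aadhaar column UNIQUE enforces point 1 above at the schema level: a lookup by Aadhaar number can return at most one record.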
Fig.2 Blueprint of the Database
IV. INTERFACE FOR PATIENT ACCESS
For the designed system to be tested with its features
operational, it is necessary that people register with CASPAR
and volunteer the requisite information. To make this process
simple and user-friendly, a front-end was designed; it is
explained in the following section.
A. Registration Page
The patient enters the details given in the blueprint in Fig.2,
along with his desired password. Some important fields in the
registration form have been made mandatory; the details are
as shown in Fig.3. On successful registration the patient is
redirected to a confirmation page, which also provides a
computer-generated unique patient identification number, as
shown in Fig.4.
Fig.3 Generated Registration page
Fig.4 Generated Success page with Patient ID
B. Login page
The patient, or anybody authorized by the patient, can
access the patient records to view, review or update the
information by logging on to CASPAR through this page. The
patient ID generated during registration is used as the user ID,
and the login completes when the correct password is
provided. The patient can thus update his records to ensure that
the information is current. This is shown in Fig.5.
Fig.5 Generated Login page
Fig.6 Generated Patient Details
Fig.7 Generated Update Records form
Fig.6 and Fig.7 show the display obtained on logging in
and the information that the system is designed to generate.
Fig.8 shows the display generated after the update is
completed.
Fig.8 Update Success Page
C. Database to store patient information
A view of the MySQL database generated to store all the
patient information is shown in Fig.9, along with some sample
records.
Fig.9 Records on the Database
V. SCOPE
Although we have provided an authentication feature through
the user ID and password required to view patient details,
we have not yet provided for encryption of the password
stored in the database. This feature may be added in future to
improve the security of patient information.
VI. CONCLUSION
Through this paper, we have demonstrated the use and
importance of a centralized database (CASPAR) for patient
information portability. It gives the patient the flexibility to
visit the nearest hospital, or any desirable one, irrespective of
prior visits, thanks to information transferability through
CASPAR, as long as the hospitals are registered users of the
database. The generation of a unique user ID ensures the
authenticity of the user, enabling the user to access, confirm
or modify the information stored in CASPAR.
REFERENCES
[1] Chu, Jin, Chiang and Kao, in Frontiers of High Performance Computing
and Networking, Proceedings of the international conference ISPA'07,
Lecture Notes in Computer Science, Vol. 4743, 2007, pp. 87-95.
[2] Sasaki H, Sukeda H, Matsuo H, Oka Y, Kaneko M and Sasaki,
"Point-of-care information systems with portable terminals," Medinfo,
9 Pt 2, pp. 990-994, 1998.
[3] Society for Promotion of IT, Chandigarh, India; Khushboo Sandhu,
Indian Express, Chandigarh, 2011.
[4] www.ieee.org
[5] Joseph M. Hellerstein, Michael Stonebraker and James Hamilton,
"Architecture of a Database System," 2007.
[6] Raghu Ramakrishnan, Database Management Systems, McGraw Hill, 3rd
Edition, 2002.
[7] Carlo Zaniolo, Stefano Ceri, Christos Faloutsos, Richard T. Snodgrass,
V. S. Subrahmanian, and Roberto Zicari, Advanced Database Systems,
Morgan Kaufmann, 1997.
Adaptive Techniques for Cancellation of Noisy
Signal
Abstract- In the transmission of information from source to
receiver, noise from the surroundings is inevitably added to the
signal. In numerous application areas, including biomedical
engineering, radar and sonar engineering, and digital
communication, the objective is to extract the useful signal
from the noise-corrupted signal. The adaptive filter is one of
the most popular proposed solutions for reducing the signal
corruption caused by unpredictable noise. The proposed work
compares the performance of MATLAB simulations of an
adaptive filter for the Least Mean Squares (LMS) and
Normalized Least Mean Squares (NLMS) algorithms. The input
to the filter is a noisy signal, and the output of the filter
approximates the clean signal. The designed filter is tested
using MATLAB, and the performance is analyzed on the basis of
Signal-to-Noise Ratio (SNR), Percentage of Noise Removal
(PNR) and stability. The adaptive noise canceller is also
implemented in hardware on a TMS320C6713 DSP processor.
Key-words: Adaptive filter, Least Mean Squares (LMS),
Normalized Least Mean Squares (NLMS)
I. INTRODUCTION
All physical systems, when they operate, produce physical
noise, which is a mixture of an infinite number of sinusoidal
harmonics at different frequencies. The original signal
information is therefore corrupted with this noise. The
resulting signal may be so noisy that the human ear, or
whatever system follows, cannot recover the correct original
signal.
An algorithm is therefore needed that can separate the
physical noise from the information signal and output the
information signal without noise. Perfect separation is not
possible, since no such system exists, so the algorithm should
instead reduce the noise level as much as it can.
An adaptive filter [1][2] has the property of self-modifying its
frequency response to change its behaviour with time; this
allows the filter to adjust its response as the input signal
characteristics change. Adaptive filters work on the principle
of minimizing the mean squared error between the filter
output and a target (or desired) signal. The general adaptive
filter configuration is illustrated in Fig.1.
Figure 1: Block diagram of adaptive filter
The adaptive filter has two inputs: the primary input d(n),
which represents the desired signal corrupted with undesired
noise, and the reference signal x(n), which carries the
undesired noise to be filtered out of the system. The basic idea
of the adaptive filter is to predict the amount of noise in the
primary signal and then subtract that noise from it. The
prediction is based on the reference signal x(n), which
provides a solid reference for the noise present in the primary
signal. The noise in the reference signal is filtered to
compensate for amplitude, phase and time delay, and then
subtracted from the primary signal. This filtered noise, y(n), is
the system's prediction of the noise portion of the primary
signal. The resulting signal is called the error signal e(n), and
it constitutes the output of the system. Ideally, the error signal
would be only the desired portion of the primary signal.
In this paper we have implemented the adaptive filters in
software to compare their relative performance, using
MATLAB for simulation. The
resulting simulation outputs are compared in terms of SNR,
PNR and stability for a noisy input signal.
II. ADAPTIVE ALGORITHMS
The algorithm used to perform the adaptation and the
configuration of the filter depend directly on the application of
the filter. Two classes of adaptive filtering algorithms, namely
Least Mean Squares (LMS) and Recursive Least Squares
(RLS), are capable of adapting the filter coefficients. The
LMS-based algorithms are simple to understand and easy to
implement, whereas the RLS-based algorithms are complex
and require considerably more memory to implement. In this
work we have therefore focused on LMS-based algorithms.
A. Least Mean Square Algorithm
The LMS algorithm [3] belongs to the class of adaptive filters
known as stochastic gradient-based algorithms, as it utilizes
the gradient vector of the filter tap weights to converge on the
optimal Wiener solution. With each iteration of the LMS
algorithm, the filter tap weights of the adaptive filter are
updated according to the formula

w(n+1) = w(n) + 2μ e(n) x(n)    (1)

where e(n) = d(n) - y(n) is the error between the desired
signal and the filter output

y(n) = w^T(n) x(n)    (2)

Here x(n) = [x(n) x(n-1) ... x(n-N+1)]^T is the input vector of
time-delayed input values, and the vector
w(n) = [w0(n) w1(n) w2(n) ... wN-1(n)]^T
represents the coefficients of the adaptive filter tap weight
vector at time n. The parameter μ is known as the step size
parameter and is a small positive constant. This step size
parameter controls the influence of the updating factor.
Selection of a suitable value for μ is imperative to the
performance of the LMS algorithm: if the value is too small,
the time taken by the adaptive filter to converge on the
optimal solution will be too long; if μ is too large, the adaptive
filter becomes unstable and its output diverges.
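The iteration described above can be sketched in a few lines. The following pure-Python illustration (the paper's own implementation is in MATLAB) applies the LMS tap-weight update to cancel a white-noise reference from a noisy sine wave; the filter order, step size and signal parameters are illustrative choices, not the paper's.

```python
import math
import random

def lms_filter(d, x, n_taps=4, mu=0.01):
    """Adaptive LMS noise canceller: d is the noisy primary signal,
    x the noise reference.  Returns the error signal e, which ideally
    converges to the clean portion of d."""
    w = [0.0] * n_taps                  # tap weight vector w(n)
    e = []
    for n in range(len(d)):
        # Time-delayed input vector [x(n) x(n-1) ... x(n-N+1)]
        xv = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(wk * xk for wk, xk in zip(w, xv))   # filter output y(n)
        err = d[n] - y                              # e(n) = d(n) - y(n)
        # LMS tap-weight update: w <- w + 2*mu*e(n)*x(n)
        w = [wk + 2 * mu * err * xk for wk, xk in zip(w, xv)]
        e.append(err)
    return e

random.seed(0)
s = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]  # clean signal
x = [random.gauss(0.0, 0.5) for _ in range(2000)]            # noise reference
d = [sn + xn for sn, xn in zip(s, x)]   # primary input = signal + noise
e = lms_filter(d, x)
```

Since the primary noise here equals the reference exactly, the filter should learn an identity-like weight vector, and the tail of e should closely track the clean sine wave.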
B. Normalized LMS Algorithm
In the standard LMS algorithm, when the convergence factor
μ is large the algorithm suffers from a gradient noise
amplification problem. To overcome this difficulty we can
use the NLMS (Normalized Least Mean Square) algorithm [3].
The correction applied to the weight vector w(n) at iteration
n+1 is "normalized" with respect to the squared Euclidean
norm of the input vector x(n) at iteration n. We may view the
NLMS algorithm as a time-varying step-size algorithm, with
the convergence factor calculated as in Eq. (3) [1]:

μ(n) = α / (c + ||x(n)||²)    (3)

where α is the NLMS adaptation constant, which optimizes
the convergence rate of the algorithm and should satisfy the
condition 0 < α < 2, and c is a small constant term for
normalization, always less than 1. The filter weights are then
updated by Eq. (4):

w(n+1) = w(n) + μ(n) e(n) x(n)    (4)
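A single NLMS iteration, combining the normalized step size of Eq. (3) with the weight update of Eq. (4), can be sketched as follows. This is pure Python with illustrative values of α and c, and the toy 2-tap system being identified is an assumption made for the demo.

```python
def nlms_step(w, xv, d_n, alpha=0.5, c=1e-3):
    """One NLMS iteration: the step size is normalized by the squared
    Euclidean norm of the current input vector xv (Eq. 3), and the
    weights are updated with that time-varying step (Eq. 4)."""
    y = sum(wk * xk for wk, xk in zip(w, xv))        # filter output
    err = d_n - y                                    # error e(n)
    mu_n = alpha / (c + sum(xk * xk for xk in xv))   # Eq. (3)
    w_new = [wk + mu_n * err * xk for wk, xk in zip(w, xv)]  # Eq. (4)
    return w_new, err

# Identify a fixed 2-tap system h = [1.0, -0.5] from input/target pairs.
w = [0.0, 0.0]
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -0.5), ([1.0, 1.0], 0.5)] * 200
for xv, d_n in data:
    w, err = nlms_step(w, xv, d_n)
```

Because the step size shrinks when the input vector is large, the update is insensitive to the input power, which is exactly what suppresses the gradient noise amplification seen with plain LMS.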
III. ADAPTIVE NOISE CANCELLATION
Adaptive noise cancellation (ANC) [4][5] is performed by
subtracting noise from a received signal, with the operation
controlled adaptively so as to obtain an improved
signal-to-noise ratio. Uncontrolled noise subtraction from a
received signal could produce disastrous results by increasing
the average power of the output noise. However, when the
filtering and subtraction are controlled by an adaptive process,
it is possible to achieve system performance superior to direct
filtering of the received signal. Fig.2 shows the adaptive noise
cancelling system.
Fig.2. Adaptive Noise Cancellation System
The system is composed of two separate inputs: a primary
input or ECG signal source, shown as s(n), and a reference
input carrying the noise, shown as x(n). The primary signal is
corrupted by noise x1(n), which is highly correlated with the
reference noise signal x(n). The desired signal d(n) results
from the addition of the primary signal s(n) and the correlated
noise signal x1(n). The reference signal x(n) is fed into the
adaptive filter, and its output y(n) is subtracted from the
desired signal d(n). The output of the summer block is then
fed back to the adaptive filter to update the filter coefficients.
This process runs recursively to obtain a noise-free signal,
which should be the same as, or very similar to, the primary
signal s(n).
IV. EXPERIMENTAL SETUP
MATLAB Simulation
For the simulation of the adaptive algorithms, a MATLAB
program was written which implements the mathematical
equations of the LMS
and NLMS algorithms as given in Eq. 1 to Eq. 4. The
reference input signal x(n) is white Gaussian noise generated
using the WGN function in MATLAB, and the source signal
s(n) is a clean sine wave generated using the sin function; the
desired signal d(n) is obtained by adding the noise to the clean
signal, i.e. d(n) = s(n) + x1(n), as shown in Fig.3. We can
observe that the output signal is noise-free and closely
matches the input. The FFT of the mixed signal shows where
noise is present, and the FFT of the LMS output shows that the
noise is cancelled, as in Fig.4; it also shows that a small
amount of noise remains, due to the gradient noise
amplification problem. Fig.5 shows the FFT of the NLMS
output, where this error is minimized by normalizing the
correction applied to the weight vector w(n).
Fig 3. Input signal, noise signal, input + noise
Fig 4. LMS output, FFT of mixed signal, FFT of output
Fig 5. NLMS output, FFT of mixed signal, FFT of output
V. SIMULINK MODEL
In order to estimate the parameter values for each algorithm
discussed in the previous sections, MATLAB Simulink
models are used. The following figures depict the Simulink
implementation of each algorithm and the estimation of its
parameter values from the corresponding graphs.
A. LMS ALGORITHM
The Simulink model for LMS with its statistical parameters
is shown in Fig 6, the speech signal corrupted with white
Gaussian noise is shown in Fig 7, the output of the LMS
algorithm is shown in Fig 8, and the simulated power spectral
analysis is shown in Fig 9.
Fig 6. Simulink block diagram for LMS algorithm
Fig 7. Input corrupted with white Gaussian noise
Fig 8. Output of LMS algorithm
Fig 9.Power spectral analysis of LMS algorithm
B. N-LMS ALGORITHM
The Simulink model for NLMS with its statistical parameters
is shown in Fig 10; the speech signal corrupted with white
Gaussian noise is the same as in Fig 7. The NLMS output is
shown in Fig 11, and the simulated power spectral analysis is
shown in Fig 12.
Fig 10. Simulink block diagram for NLMS algorithm
Fig 11. Output of NLMS algorithm
Fig 12. Power spectral analysis of NLMS algorithm
VI. RESULTS & DISCUSSION
A. Simulation
The simulation of the LMS and NLMS algorithms is carried
out with the following specifications: filter order N = 2, step
size μ = 0.005, iterations n = 1200. The LMS and NLMS
algorithms are compared in terms of Signal-to-Noise Ratio
(SNR), Percentage of Noise Removal (PNR) and stability, as
shown in Table I.

Table I: Comparison of the performance of the LMS and NLMS algorithms
SL.NO  Algorithm  SNR      PNR     Stability
1.     LMS        11.0429  0.9145  2401
2.     NLMS       25.6713  0.5429  6001

From Table I it is clear that the performance of the NLMS
algorithm is better than that of the LMS algorithm.
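The SNR figures above follow the usual decibel definition. A small sketch of that computation is shown below on made-up sample vectors; the paper does not give the exact formula for its PNR metric, so only SNR is illustrated.

```python
import math

def snr_db(clean, denoised):
    """SNR of a denoised signal in dB: ratio of the clean-signal
    power to the residual-error power, on a decibel scale."""
    n = len(clean)
    p_sig = sum(v * v for v in clean) / n
    p_err = sum((c - d) ** 2 for c, d in zip(clean, denoised)) / n
    return 10 * math.log10(p_sig / p_err)

# Hypothetical clean signal and filter output, for illustration only.
clean = [1.0, -1.0, 1.0, -1.0]
denoised = [0.9, -1.1, 1.0, -0.9]
snr = snr_db(clean, denoised)
```

A higher value means the residual after cancellation is smaller relative to the signal, which is the sense in which NLMS outperforms LMS in Table I.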
B. HARDWARE IMPLEMENTATION
Hardware implementation of noise cancellation [6][7] is
performed by writing C code for adaptive noise cancellation
in Code Composer Studio V3.1 (CCS V3.1), building it, and
loading it onto the TMS320C6713 DSP processor. The input
is a noisy sine wave obtained by adding a sine wave and
high-frequency noise using an OPAMP 741 IC. The noisy
input and the noise are fed to the left and right channels of the
line-in port of the DSP respectively, and a noise-free output
signal is obtained at the line-out port, as shown in Fig 13.
VII. CONCLUSIONS
The adaptive algorithms (LMS and NLMS) have been
successfully implemented in MATLAB for a noisy signal, and
the results are compared in terms of SNR, PNR and stability.
The comparison shows that the performance of NLMS is
better than that of LMS. The adaptive noise canceller has also
been implemented on a TMS320C6713 DSP processor. To
sum up, the adaptive noise canceller is a very efficient and
useful system in many applications involving sound, video,
etc. Its only disadvantage is that it needs a digital signal
processor (DSP) for its operation.
Fig 13. Setup for hardware implementation for noise cancellation
REFERENCES
[1] S. Haykin, Adaptive Filter Theory, Prentice Hall, 2002.
[2] Bernard Widrow, John R. Glover, John M. McCool, John Kaunitz,
Charles S. Williams, Robert H. Hearn, James R. Zeidler, Eugene
Dong, Jr. and Robert C. Goodlin, "Adaptive Noise Cancelling:
Principles and Applications," Proceedings of the IEEE, Vol. 63,
No. 12, 1975, pp. 1692-1716.
[3] D. C. Dhubkarya and Aastha Katara, "Comparative Performance
Analysis of Adaptive Algorithms for Simulation & Hardware
Implementation of an ECG Signal," Volume 1, Number 4,
pp. 2184-2191.
[4] Yaghoub Mollaei, "Hardware Implementation of Adaptive Filter,"
Proceedings of the IEEE, Nov 2009.
[5] Raj Kumar Thenua and S. K. Agrawal, "Hardware Implementation
of Adaptive Algorithms for Noise Cancellation," IACSIT, Vol. 2,
No. 2, March 2012.
[6] Nirmal R. Bhalani, Jaikaran Singh and Mukesh Tiwari,
"Implementation of Karaoke Machine on the DSK6713 Processor,"
Volume 62, No. 7, January 2013.
[7] Vijay Kumar Gupta, Mahesh Chandra and S. N. Sharan, "Real Time
Implementation of Adaptive Noise Canceller," Paper Identification
Number CS-1.6.