Proceedings of TEQIP II sponsored National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
Boundary Detection in Medical Images using Edge Field Vector based on Law's Texture and Canny Method
Swetha M.1, Jyohsna C.1
1Department of E&C, KLS VDRIT, Haliyal, Karnataka, India
email: [email protected], [email protected]
Abstract—Detecting the correct boundary in noisy images is a difficult task. Images are used in many fields, including surveillance, medical diagnostics, and non-destructive testing. Boundaries are mainly used to detect the shape of an object. Image segmentation is used to locate objects and boundaries in images; it assigns a label to every pixel in an image such that pixels with the same label share certain visual characteristics. We propose an edge detection technique for detecting the correct boundary of objects in an image. It can detect the boundaries of an object using intensity gradient information from the vector model and texture gradient information from the edge map model. The results show that the technique performs very well and yields better performance than the classical contour models. The proposed method is robust and applicable to various kinds of noisy images without prior knowledge of the noise properties.
Keywords— Boundary extraction, vector field model, edge
mapping model, edge following technique, boundary detection.
I. INTRODUCTION
Boundary detection is mainly used to detect the outline or
shape of the object, so we can easily identify objects based
upon the outline or shape. Segmentation is the process in
which an image is divided into its constituent objects or parts.
The main goal of segmentation is to simplify and/or change an
image representation into something that is analyzed easily.
Image segmentation is an initial step before performing high-
level tasks such as object recognition and understanding.
Image segmentation is typically used to locate objects and
boundaries in images. In medical imaging, segmentation is
important for feature extraction, image measurements, and
image display. In some applications it may be useful to extract boundaries of objects of interest from ultrasound images [1], [2], and microscopic images [3]–[5].
In recent years, there have been several new methods to solve
the problem of boundary detection, e.g., active contour model
(ACM), geodesic active contour (GAC) model, active
contours without edges (ACWE), gradient vector flow (GVF)
snake model, etc. The snake models have become popular
especially in boundary detection where the problem is more
challenging due to the poor quality of the images. To remedy
the problem, we propose a new technique for boundary
detection for ill-defined edges in noisy images using a novel
edge following. The proposed edge following technique is
based on the vector image model and the edge map. The
vector image model provides a more complete description of
an image by considering both directions and magnitudes of
image edges. The proposed edge vector field is generated by averaging the magnitudes and directions in the vector image. The edge map is derived from Law's texture features and Canny edge detection. The vector image model and the edge map are applied together to select the best edges.
II. PROPOSED SYSTEM
The proposed boundary detection algorithm detects the boundary of an object in an image. The boundary extraction algorithm consists of the following three phases:
1. Average edge vector field
2. Edge map model
3. Edge following technique
III. BLOCK DIAGRAM
A. Average Edge Vector Field Model
We exploit the edge vector field to devise a new boundary
extraction algorithm [29]. Given an image f(x, y), the edge
vector field is calculated according to the following equations:
[Block diagram: input image → average edge vector field and edge map → initial position → edge following technique → boundary detected.]
M(i, j) = sqrt(Mx(i, j)^2 + My(i, j)^2) ..........(1)
D(i, j) = arctan(My(i, j) / Mx(i, j)) ..........(2)
Fig. 1. (a) Original unclear image. (b) Result from the edge vector field and
zoomed-in image. (c) Result from the proposed average edge vector field and
zoomed-in image.
Each component is the convolution between the image and the
corresponding difference mask, i.e.,
Mx(i, j) = −Gy ∗ f(x, y) ..........(4)
My(i, j) = Gx ∗ f(x, y) ..........(5)
where Gx and Gy are the difference masks of the Gaussian weighted image moment vector operator in the x and y directions, respectively [29]:
Gx(x, y) = x exp(−(x^2 + y^2)/(2σ^2)) ..........(6)
Gy(x, y) = y exp(−(x^2 + y^2)/(2σ^2)) ..........(7)
Edge vectors of an image indicate the magnitudes and
directions of edges which form a vector stream flowing
around an object. However, in an unclear image, the vectors
derived from the edge vector field may be distributed randomly in
magnitude and direction. Therefore, we extend the capability
of the previous edge vector field by applying a local averaging
operation where the value of each vector is replaced by the
average of all the values in the local neighborhood, i.e.,
M̄(i, j) = (1/Mr) Σ_(p,q)∈N M(p, q) ..........(8)
D̄(i, j) = (1/Mr) Σ_(p,q)∈N D(p, q) ..........(9)
where Mr is the total number of pixels in the neighborhood N. We apply a 3 × 3 window as the neighborhood N throughout our research.
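The averaging step of (8) and (9) can be sketched as follows. This is a minimal pure-Python illustration (not the authors' MATLAB code): each value of a field is replaced by the mean over its 3 × 3 neighborhood, with border pixels using only the neighbors that fall inside the image.

```python
def average_field(M):
    """Average a 2-D field over 3x3 neighborhoods (border-aware)."""
    rows, cols = len(M), len(M[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [M[p][q]
                    for p in range(max(0, i - 1), min(rows, i + 2))
                    for q in range(max(0, j - 1), min(cols, j + 2))]
            out[i][j] = sum(vals) / len(vals)  # divide by Mr (Eq. 8)
    return out

# A noisy magnitude field: averaging spreads out the isolated spike.
field = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(average_field(field)[1][1])  # 1.0 (9 spread over 9 pixels)
```

The same routine applies unchanged to the direction field D(i, j) in (9).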
B. Edge Map
The edge map consists of the edges of objects in an image, derived from Law's texture and Canny edge detection.
1) Law's Texture: The texture feature images of Law's texture are computed by convolving an input image with each of the masks. Given a column vector L = (1, 4, 6, 4, 1)^T, the 2-D mask l(i, j) used for texture discrimination in this research is generated by L × L^T. The output image is obtained by convolving the input image with the texture mask.
2) Canny Edge Detection: The first step of Canny edge detection is to convolve the output image t(i, j) obtained from the aforementioned Law's texture with a Gaussian filter. The second step is to calculate the magnitude and direction of the gradient. The third step is nonmaximal suppression to identify edges. The last step is a double-threshold algorithm to detect and link edges.
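The double-threshold (hysteresis) step can be sketched as follows; a pure-Python illustration in which pixels above the high threshold are strong edges and pixels above the low threshold are kept only if connected to a strong edge. The threshold values here are illustrative, not the paper's.

```python
def hysteresis(g, t_low, t_high):
    """Double-threshold edge linking on a 2-D gradient-magnitude grid."""
    rows, cols = len(g), len(g[0])
    strong = [(i, j) for i in range(rows) for j in range(cols)
              if g[i][j] >= t_high]
    edges = set(strong)
    stack = list(strong)
    while stack:                      # grow weak edges from strong seeds
        i, j = stack.pop()
        for p in range(max(0, i - 1), min(rows, i + 2)):
            for q in range(max(0, j - 1), min(cols, j + 2)):
                if (p, q) not in edges and g[p][q] >= t_low:
                    edges.add((p, q))
                    stack.append((p, q))
    return edges

g = [[0, 40, 0],
     [0, 90, 50],
     [0, 0, 0]]
print(sorted(hysteresis(g, 30, 80)))  # [(0, 1), (1, 1), (1, 2)]
```

The weak pixels at 40 and 50 survive only because they touch the strong pixel at 90; an isolated 40 elsewhere would be discarded.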
The edge map carries important edge information. This idea is exploited for extracting objects' boundaries in unclear images. Examples of edge maps are shown in Fig. 2.
Fig. 2. (a) Synthetic noisy image. (b) Left ventricle in the MR image. (c) Prostate ultrasound image. (d)–(f) Corresponding edge maps derived from Law's texture and Canny edge detection.
C. Edge Following Technique
The edge following technique is performed to find the
boundary of an object. Most edge following algorithms take
into account the edge magnitude as primary information for
edge following. However, the edge magnitude information is
not efficient enough for searching the correct boundary of
objects in noisy images because it can be very weak in some
contour areas.
The magnitude and direction of the average edge vector field provide information about the boundary which flows around an object. In addition, the edge map provides information about edges which may be part of the object boundary. Hence, both the average edge vector field and the edge map are exploited in the decision of the edge following technique. At the position (i, j) of an
image, the successive positions of the edges are then
calculated by a 3 × 3 matrix.
D. Initial Position
In this section, we present a technique for determining a good
initial position of edge following that can be used for the
Page 3
Proceedings of TEQIP II sponsored National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
195
boundary detection. In this proposed technique, the initial
position of edge following is determined by the following
steps. The first step is to calculate the average magnitude [M̄(i, j)] using (8). A position with high magnitude is a good candidate for a strong edge in the image. The second step is to calculate the density of edge length for each pixel from an edge map. The edge map [E(i, j)], a binary image, is obtained by Law's texture and Canny edge detection. The idea of using density is to obtain a measurement of edge length. The density of edge length [L(i, j)] at each pixel can be calculated from
L(i, j) = C(i, j) / max C(i, j) .........................(15)
where C(i, j) is the number of connected edge pixels at each pixel position. The third step is to calculate the initial position map P(i, j) from the summation of the average magnitude and the density of edge length, i.e.,
P(i, j) = M̄(i, j) + L(i, j) ..................(16)
The last step is the thresholding of the initial position map.
We have to threshold the map in order to detect the initial
position of edge following.
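The four steps above can be sketched as follows: the initial-position map P(i, j) is the sum of the averaged magnitude and the edge-length density, and candidate start points are the pixels whose P value exceeds a fraction t_max of the map's peak (t_max = 0.95 as in Fig. 5). This is a pure-Python sketch on toy inputs, not the authors' implementation.

```python
def initial_positions(m_avg, l_den, t_max=0.95):
    """Return (i, j) start candidates from the thresholded map P = M + L."""
    p = [[m + l for m, l in zip(mr, lr)] for mr, lr in zip(m_avg, l_den)]
    peak = max(max(row) for row in p)
    return [(i, j) for i, row in enumerate(p)
            for j, v in enumerate(row) if v >= t_max * peak]

m_avg = [[0.1, 0.9], [0.2, 0.3]]   # averaged magnitude, Eq. (8)
l_den = [[0.0, 0.8], [0.1, 0.2]]   # density of edge length, Eq. (15)
print(initial_positions(m_avg, l_den))  # [(0, 1)]
```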
Fig. 5. (a) Aorta in cardiovascular MR image. (b) Averaged magnitude [M̄(i, j)]. (c) Density of edge length [L(i, j)]. (d) Initial position map [P(i, j)] and initial position of edge following derived by thresholding with Tmax = 0.95.
IV. RESULTS
Fig 1: Original image.
Fig 2: Preprocessed image.
Fig 3: Magnitude of the image.
Fig 4: Direction of the image.
Fig 5: Law's texture output image.
Fig 6: Canny magnitude of the image.
Fig 7: Canny direction of the image.
Fig 8: Non-maximal suppression image.
Fig 9: First thresholding image.
Fig 10: Edge map of the image.
Fig 11: Density of the image.
Fig 12: Position image.
Fig 13: Boundary detected.
V. CONCLUSION
We have designed a new edge following technique for boundary detection and applied it to the object segmentation problem in medical images. Our edge following technique incorporates a vector image model and edge map information. The results of detecting object boundaries in noisy images show that the proposed technique is much better than the classical contour models. We have successfully applied the edge following technique to detect ill-defined object boundaries in medical images. The proposed method can be applied not only to medical imaging, but also to any image processing problem in which ill-defined edge detection is encountered.
ACKNOWLEDGEMENT
We sincerely thank our college, KLS VDRIT, for the facilities and necessary infrastructure made available during the course of our work. We wish to express our thanks and sincere gratitude to our Principal, Head of the Department, and guide for their guidance and enthusiastic encouragement in completing this work successfully.
REFERENCES
[1] J. Guerrero, S. E. Salcudean, J. A. McEwen, B. A. Masri, and S. Nicolaou, "Real-time vessel segmentation and tracking for ultrasound imaging applications," IEEE Trans. Med. Imag., vol. 26, no. 8, pp. 1079–1090, Aug. 2007.
[2] F. Destrempes, J. Meunier, M.-F. Giroux, G. Soulez, and G. Cloutier, "Segmentation in ultrasonic B-mode images of healthy carotid arteries using mixtures of Nakagami distributions and stochastic optimization," IEEE Trans. Med. Imag., vol. 28, no. 2, pp. 215–229, Feb. 2009.
[3] N. Theera-Umpon and P. D. Gader, "System level training of neural networks for counting white blood cells," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 32, no. 1, pp. 48–53, Feb. 2002.
[4] N. Theera-Umpon, "White blood cell segmentation and classification in microscopic bone marrow images," Lecture Notes Comput. Sci., vol. 3614, pp. 787–796, 2005.
[5] N. Theera-Umpon and S. Dhompongsa, "Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification," IEEE Trans. Inf. Technol. Biomed., vol. 11, no. 3, pp. 353–359, May 2007.
[6] J. Carballido-Gamio, S. J. Belongie, and S. Majumdar, "Normalized cuts in 3-D for spinal MRI segmentation," IEEE Trans. Med. Imag., vol. 23, no. 1, pp. 36–44, Jan. 2004.
COLOR BASED CLASSIFICATION OF PEANUTS
Chaitra C.1, K. V. Suresh2, Partha Das3
1 M. Tech. (Signal Processing), Dept. of E&C, SIT, Tumkur, Karnataka, INDIA
2 Professor and Head, Dept. of E&C, SIT, Tumkur, Karnataka, INDIA
3 R&D Engineer, Opto-Electronic Color Sorter Division, M/S Fowler Westrup (India) Pvt. Ltd., Bangalore, Karnataka, INDIA
1 [email protected]
2 [email protected]
3 [email protected]
Abstract—Peanuts are rich in energy and contain health-benefiting nutrients, minerals, antioxidants, and vitamins that are essential for optimum health. The price of peanuts depends on their quality; thus, quality classification of food grains facilitates the proper marketing of agricultural products. In this paper we propose an image processing technique to ensure good quality of peanuts. The main objective of this paper is to classify peanuts into good and bad based on color features. The captured images are first pre-processed, and a database is prepared automatically. Statistical and histogram features are then extracted for classification using a Feed Forward Neural Network (FFNN). Red and white peanut samples are considered for experimentation. The results show that the proposed method gives classification accuracy of around 90% for red good peanuts, 84% for red bad peanuts, 96% for white good peanuts, and 71% for white bad peanuts. The proposed algorithm is developed using MATLAB 7.12.
Keywords - Peanut, quality, database, features, Feed Forward
Neural Network (FFNN).
I. INTRODUCTION
Peanuts contain substantial levels of mono-unsaturated fatty acids, especially oleic acid, which helps to lower "bad cholesterol" and increase "good cholesterol" levels in the blood. Peanuts are also a good source of dietary protein, composed of fine-quality amino acids that are essential for growth and development. Peanuts contain cancer-fighting compounds such as resveratrol and beta-sitosterol; beta-sitosterol has been shown to inhibit breast, prostate, and colon cancer cell growth [1].
Quality inspection of peanuts plays a very important role in the food grain industry. The possible types of classes that could be considered for quality analysis are freshness, good, broken, skin removed, dull color, and shrivelled. Peanuts can also be classified based on size, shape, and nutrition content [2]. In
[3], a simple imaging system was developed for color image
based sorting of red and white wheat kernels. Here, the
combination of statistical and histogram features are
considered for classification. In [4], a comparative study was
made to classify and grade bulk seed samples using artificial
neural network. Three sets of features are extracted for
classification, and combinations of these features are tested
with different artificial neural networks. A review paper [5] describes how the selection of features plays an important role in classification. Each feature, namely color, size, and texture, carries useful information with which good classification can be achieved. In some cases the RGB color space is used for classification of fruits and nuts, and the HSI color space is used for classification of wheat. If these color spaces are not enough,
then multispectral imaging will give the best results. In [6], a machine was developed to detect toxin in peanuts using the k-means clustering algorithm. This algorithm uses the average values of the R, G, and B components; the classification accuracy reported is 100%. In [7], a method was presented to trace the origin of peanut pods using image recognition; features such as texture, color, and shape are considered for classification using a neural network. Change in color is also one of the best properties of a peanut for assessing quality.
Color based classification separates almost all bad peanuts.
Further, size and texture based classification improves the quality of peanut. Hence developing a rapid detection and classification algorithm based on colour, size and texture is useful work for industrial applications. In this paper, two types of peanuts, red and white, are used for experimentation. The objective of this paper is to develop an algorithm to prepare a peanut database automatically and to select suitable features that classify peanuts into good and bad. The paper is organized as follows. In section II, we present the proposed algorithm. In section III we discuss the experimental results. Section IV contains a brief summary and closing remarks.
II. PROPOSED METHOD
The block diagram of the proposed algorithm is shown in figure 1. In the image acquisition part, peanut images are captured in the
visible range. The captured color images are converted into binary, and morphological processing is performed on the binary images to retain the pixel information. In the segmentation and object extraction part, each peanut is segmented from its background; further, each segmented peanut is stored as a subimage to form a peanut database. The features are extracted from all the peanut images in the database and are classified using the FFNN classifier.
Figure 1. Block diagram of the proposed method.
A. Image Acquisition and Pre-processing
Images are captured using a Sony Cybershot HX200V (18 Megapixel, SLR) camera with a fluorescent lamp as the light source and a black background. The R component of the input image shows a good difference between the foreground and background. Hence, the R component of the image is considered for further processing.
B. Binary Thresholding and Morphological Processing
The R component image is first converted into binary by thresholding:
B(i, j) = 1 if IR(i, j) ≥ T, and 0 otherwise ..........(1)
where T is the threshold, 1 ≤ i ≤ M, 1 ≤ j ≤ N, and M × N is the size of the image IR. Binary thresholding results in a dark background and a bright foreground image. Two problems faced after thresholding are loss of foreground pixel information, and some dust particles in the original image appearing as new objects. Hence, morphological operations such as hole filling and erosion are performed on the image [8].
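The thresholding step can be sketched in pure Python as follows; the default T = 110 is the value used later in the experiments, and the subsequent hole filling and erosion steps (MATLAB's morphological operators in the original work) are not reproduced here.

```python
def binarize(channel, t=110):
    """Binarize a single image channel: 1 for foreground, 0 for background."""
    return [[1 if v >= t else 0 for v in row] for row in channel]

# Toy R-component values: bright peanut pixels against a dark background.
r = [[30, 200, 120],
     [90, 180, 40]]
print(binarize(r))  # [[0, 1, 1], [0, 1, 0]]
```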
C. Segmentation and Object Extraction
Once the foreground objects are separated from
background, it is necessary to label each foreground object in
order to extract them separately. Each labelled object is extracted and stored to prepare the peanut database. The following algorithm is developed for object extraction.
Consider that there are N peanuts in the original image and L is the label matrix. All pixels belonging to one peanut are given the same label; similarly, the pixels of each of the other peanuts are given their own labels. The detailed procedure is:
Step 1: Read a label and its location details from the label matrix L.
Step 2: Consider the pixel locations obtained in step 1 and go to the same locations in the original image I. Retrieve the R, G and B values of each pixel location and store them in the same location of a new sub image.
Step 3: Save the new sub image in a temporary database.
Step 4: Repeat step 1 to step 3 for all N peanuts.
With the above algorithm, foreground objects are extracted from the original image, but extra background remains in each sub image.
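Steps 1 to 4 above can be sketched in pure Python: pixels whose label equals k are copied from the original image into a same-sized subimage, with zeros elsewhere. A single channel stands in for the R, G and B components here; this is an illustration, not the authors' MATLAB code.

```python
def extract_object(img, labels, k):
    """Copy pixels with label k into a same-sized subimage (zeros elsewhere)."""
    return [[img[i][j] if labels[i][j] == k else 0
             for j in range(len(img[0]))] for i in range(len(img))]

img    = [[9, 9, 3],
          [9, 3, 3]]
labels = [[1, 1, 2],     # two labelled peanuts: label 1 and label 2
          [1, 2, 2]]
print(extract_object(img, labels, 2))  # [[0, 0, 3], [0, 3, 3]]
```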
D. Database Preparation
An image database is prepared using an automatic database preparation algorithm. This algorithm removes the dark background from the images that result from the algorithm given in section II (C).
Following are the steps of the algorithm:
Step 1: Take R, G and B component matrices of a peanut sub
image.
Step 2: For R component matrix, check for all zero rows. If
the row is all zero, then assign 0 to a corresponding row in a
column vector, otherwise assign 1.
Step 3: Repeat Step 2 for all columns of the matrix to get a
row vector.
Step 4: Repeat step 2 and step 3 for G, and B components
matrices.
Step 5: Consider the column vectors from step 2. Check each
row of the column vector R and corresponding row of the G,
and B component. If all three rows of column vectors are zero,
then delete corresponding rows of R, G, and B components in
the matrices.
Step 6: Consider the row vectors from step 3. Check each
column of the row vector R and corresponding column of the
G, and B component. If all three columns of the row vectors
are zero, then delete corresponding column of R, G, and B
components in the matrices.
Step 7: The R, G and B matrix values are moved to a new variable, to get a color subimage of the peanut free from extra background.
Step 8: Resize all subimages to same size.
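The background-removal idea in steps 1 to 7 can be sketched as follows: rows and columns whose values are all zero across the channels are dropped, leaving the tight bounding box around the peanut. A single channel stands in for R, G and B in this pure-Python illustration.

```python
def crop_zero_border(img):
    """Drop all-zero rows and columns, keeping the tight bounding box."""
    rows = [r for r in img if any(r)]                       # steps 2 and 5
    keep = [j for j in range(len(rows[0]))                  # steps 3 and 6
            if any(r[j] for r in rows)]
    return [[r[j] for j in keep] for r in rows]             # step 7

img = [[0, 0, 0, 0],
       [0, 5, 7, 0],
       [0, 2, 0, 0],
       [0, 0, 0, 0]]
print(crop_zero_border(img))  # [[5, 7], [2, 0]]
```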
E. Feature Extraction and Classification
The features of an object play a key role in classification. The best feature subset selected in this work for classification is: the percentage of red and green pixels with intensity less than 100, and the percentage of blue pixels with intensity less than 120. Statistical features, namely the mean, median, and standard deviation of the R, G, and B components, together with 24 histogram features, are also extracted. In total, 36 features are extracted for classification. In this work an FFNN with three layers is used for classification. It has 36 input neurons and 2 output neurons, with 20 neurons in the hidden layer.
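The per-channel feature computations described above can be sketched with the standard library only; the pixel list here is a toy stand-in for one color channel of a peanut subimage, and the FFNN itself is not reproduced.

```python
import statistics

def pct_below(vals, cutoff):
    """Percentage of pixel values strictly below an intensity cutoff."""
    return 100.0 * sum(v < cutoff for v in vals) / len(vals)

def channel_stats(vals):
    """Mean, median, and (population) standard deviation of a channel."""
    return (statistics.mean(vals),
            statistics.median(vals),
            statistics.pstdev(vals))

red = [50, 90, 150, 200]
print(pct_below(red, 100))       # 50.0 (two of four pixels below 100)
print(channel_stats(red)[:2])    # (122.5, 120.0)
```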
III. EXPERIMENTAL RESULTS AND DISCUSSION
Images are captured using the Sony Cybershot camera, with each image containing more than one peanut kernel. Figure 2(a) is the input color image; from figures 2(b), (c) and (d) it is observed that the histogram of the R component shows a good difference between foreground and background. Hence the R component is considered for further processing. Figure 2(e) is the R component of the image in figure 2(a). Figure 2(f) is the result after binary thresholding, with a threshold value of 110. After binary thresholding some of the foreground pixels appear like background, hence it is necessary to fill the foreground region; figure 2(g) is the result after the hole filling
Figure 2. Colour images of peanut samples: (a) original image, (b)(c)(d) histograms of the R, G and B components, (e) R component of image (a), (f) image after binary thresholding, (g) image after hole filling operation, (h) image after erosion operation.
operation. Binary thresholding also results in small bright dots which appear like new objects. These objects need to be removed; hence a morphological erosion operation is performed using a disk as the structuring element with radius 5. A few of the dust particles are difficult to remove even by erosion. Hence, to avoid these particles in the database, a pixel counter is set within the object extraction algorithm: if the pixel count is less than a threshold, that object is considered a dust particle and is discarded from the database. Figure 2(h) is the image after the erosion operation. After pre-processing and the morphological operations, objects are extracted and stored as new sub images using the algorithm given in section II (C). Figure 3 is the result of the object extraction algorithm. Each subimage in Figure 3 contains unwanted dark background. This dark background is removed using the background removal algorithm given in section II (D), and the result of this algorithm is given in Figure 4.
Figure 3. Subimages of individual peanuts extracted from original image in
Figure 2(a).
Figure 4. Peanut database (resized images), after removing extra
background from images in figure 3.
After the database preparation, the next step is to extract features for classification. An FFNN with 36 neurons in the input layer is considered, since we have extracted 36 input features for training; also, we have two classes, good and bad, hence 2 neurons are taken at the output layer.
Figure 5. Red peanut samples.
Figure 6. White peanut samples.
Figure 7. Bad peanut samples.
Figures 5, 6 and 7 are the images of red, white and bad peanut samples respectively. Initially 260 red peanut samples are taken for training, with 130 good and 130 bad peanuts. 64 samples are taken for testing, with 32 good and 32 bad peanuts. Classification results are tabulated in Table I.
TABLE I: CLASSIFICATION OF RED PEANUT SAMPLES USING FFNN

Category            Class   Input   Success   Failure   Accuracy (%)
Red peanut samples  Good     32       29         3        90.62
                    Bad      32       27         5        84.37
Features from the 64 red peanut test samples are input to the network. From the 32 good samples, 29 are correctly classified as good and 3 are wrongly classified as bad; similarly, from the 32 bad peanuts, 27 are correctly classified and 5 are wrongly classified as good. Thus, our algorithm classifies good red samples with an accuracy of 90.62% and bad red samples with an accuracy of 84.37%.
TABLE II: CLASSIFICATION OF WHITE PEANUT SAMPLES USING FFNN

Category              Class   Input   Success   Failure   Accuracy (%)
White peanut samples  Good     32       31         1        96.87
                      Bad      32       23         9        71.87
The experiment is repeated for white peanut samples and the results are tabulated in Table II. From 32 white good peanut samples, 31 are correctly classified as good and 1 peanut is wrongly classified as bad. Similarly, from 32 bad samples, 23 are correctly classified as bad and 9 are wrongly classified as good. The classification accuracy achieved is 96.87% for good peanuts and 71.87% for bad peanuts. The failure numbers for red peanuts and white peanuts are 8 and 10 respectively. This is because in most of the bad peanuts only a part of the endocarp is damaged; since the major part of the endocarp of such samples is similar to good ones, they are misjudged as good peanuts.
IV. CONCLUSION
This paper presents an automatic database preparation and classification algorithm for peanuts. The thresholding and morphological operations separate the foreground and background with minimum loss of information. The proposed object extraction and database preparation algorithms are able to prepare a color image database of peanuts. The selected features are able to classify good and bad red peanuts with accuracies of 90.62% and 84.37% respectively, and good and bad white peanuts with accuracies of 96.87% and 71.87% respectively. Thus, the FFNN gives good classification results for both red and white peanut samples. By solving the touching-kernel problem, the proposed algorithm can be used for automatic training of a sorting machine.
ACKNOWLEDGMENT
The authors would like to thank M/S Fowler Westrup India Pvt. Ltd., Bangalore for the lab facility and financial support given to carry out this work.
REFERENCES
[1] Jocelyne Tan, "Good Eating Tip of the Month," Univ. of Michigan Health System: Patient Food and Nutrition Services, February 2011.
[2] Hong Chen, Jing Wang, Qiaoxia Yuan, and Peng Wan, "Quality classification of peanuts based on image processing," Journal of Food, Agriculture & Environment, vol. 9 (3&4), pp. 205–209, 2011.
[3] Tom Pearson, Dan Brabec, and Scott Haley, "Color image based sorter for separating red and white wheat," Sens. & Instrumen. Food Qual., vol. 2, pp. 280–288, 2008.
[4] Anil Kannur, Asha Kannur, and Vijay S. Rajapurohit, "Classification and grading of bulk seeds using artificial neural networks," International Journal of Machine Intelligence, vol. 3, pp. 62–73, 2011.
[5] Chaoxin Zheng, Da-Wen Sun, and Liyun Zheng, "Recent developments and applications of image features for food quality evaluation and inspection: a review," Trends in Food Science & Technology, vol. 17, pp. 642–655, 2006.
[6] Atris Suyantohadi and Rudiati Evi Masithoh, "Development of machine vision based on image processing technique to identify toxin contamination in peanuts," Australian Journal of Basic and Applied Sciences, vol. 6, pp. 135–141, 2012.
[7] Han Zhongzhi, Deng Limiao, and Yu Renshi, "Study on origin traceability of peanut pods based on image recognition," International Conference on System Science, Engineering Design and Manufacturing Informatization, IEEE, vol. 2, pp. 93–96, 2011.
[8] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Prentice Hall, 2002.
WAVELET BASED IMAGE COMPRESSION USING ASWDR TECHNIQUE: A COMPARATIVE STUDY
KISHORE D. V.1, ARCHANAA NAWANDHAR1
Department of Telecommunication Engineering, CMRIT, Bangalore, Karnataka, India
[email protected], [email protected]
Abstract-The objective of this paper is to implement and evaluate the effectiveness of the wavelet based Adaptively Scanned Wavelet Difference Reduction (ASWDR) compression technique using MATLAB R2007b. Performance parameters such as Peak Signal to Noise Ratio (PSNR), Mean Squared Error (MSE), and Compression Ratio (CR) are evaluated for the algorithms. Recently the discrete wavelet transform and wavelet packets have emerged as popular techniques for image compression. The wavelet transform is one of the major processing components of image compression. The results of the compression change as the nature of the image and the type of wavelet change. This paper compares the compression performance of Daubechies and Biorthogonal wavelets, with results for images chosen from the MATLAB toolbox. Based on the results, it is proposed that proper selection of the wavelet on the basis of the nature of the image improves the quality as well as the compression ratio remarkably. The prime objective is to select the proper wavelet during the transform phase to compress the image. This paper will provide a good reference for application developers choosing a wavelet compression system for their application.
Key words: Compression, Wavelet, Daubechies, Biorthogonal.
I. INTRODUCTION
In recent years, many studies have been made on wavelets. An excellent overview of what wavelets have brought to fields as diverse as biomedical applications, wireless communications, computer graphics, and turbulence is given in the literature. Image compression is one of the most visible applications of wavelets. The rapid increase in the range and use of electronic imaging justifies attention to the systematic design of an image compression system and to providing the image quality needed in different applications. A typical still image contains a large amount of spatial redundancy in plain areas where adjacent picture elements (pixels, pels) have almost the same values; that is, the pixel values are highly correlated. In addition, a still image can contain subjective redundancy, which is determined by the properties of the human visual system (HVS). An HVS presents some tolerance to distortion, depending upon the image content and viewing conditions. Consequently, pixels need not always be reproduced exactly as originated, and the HVS will not detect the difference between the original image and the reproduced image.
The redundancy (both statistical and subjective) can be removed to achieve compression of the image data. The basic
measure for the performance of a compression algorithm is
compression ratio (CR), defined as a ratio between original
data size and compressed data size. In a lossy compression
scheme, the image compression algorithm should achieve a
tradeoff between compression ratio and image quality. Higher
compression ratios will produce lower image quality and vice
versa. Quality and compression can also vary according to
input image characteristics and content. Transform coding is a
widely used method of compressing image information. In a
transform-based compression system two-dimensional (2-D)
images are transformed from the spatial domain to the
frequency domain. An effective transform will concentrate
useful information into a few of the low-frequency transform
coefficients. An HVS is more sensitive to energy with low
spatial frequency than with high spatial frequency. Therefore,
compression can be achieved by quantizing the coefficients, so that important coefficients (low-frequency coefficients) are
transmitted and the remaining coefficients are discarded. Very effective and popular ways to achieve compression of image data are based on the discrete cosine transform (DCT) and discrete wavelet transform (DWT). Current standards for
compression of still (e.g., JPEG) and moving images (e.g.,
MPEG-1, MPEG-2) use DCT, which represents an image as a
superposition of cosine functions with different discrete
frequencies. The transformed signal is a function of two
spatial dimensions, and its components are called DCT
coefficients or spatial frequencies. DCT coefficients measure
the contribution of the cosine functions at different discrete
frequencies. DCT provides excellent energy compaction, and
a number of fast algorithms exist for calculating the DCT.
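The energy compaction and coefficient quantization described above can be illustrated with a small sketch (a Python/NumPy sketch for illustration only; the 8×8 gradient test block and the quantizer scale q = 32 are arbitrary choices, not values from this paper):

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis: row k holds the cosine at discrete frequency k.
    c = np.zeros((n, n))
    for k in range(n):
        alpha = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        for i in range(n):
            c[k, i] = alpha * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    return c

C = dct_matrix(8)
block = np.outer(np.arange(8), np.ones(8)) * 16.0  # smooth 8x8 gradient block
coeffs = C @ block @ C.T      # 2-D DCT: energy concentrates in few coefficients
q = 32.0                      # quantizer scale (arbitrary for the sketch)
quantized = np.round(coeffs / q)    # near-zero coefficients become exactly 0
recon = C.T @ (quantized * q) @ C   # dequantize + inverse 2-D DCT
```

For the smooth block only a handful of the 64 quantized coefficients are nonzero, which is what makes the subsequent entropy coding effective; raising q increases both compression and reconstruction error.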
Most existing compression systems use square DCT blocks of
regular size. The image is divided into blocks of NxN samples
and each block is transformed independently to give NxN coefficients. For many blocks within the image, most of the
DCT coefficients will be near zero. DCT in itself does not
give compression. To achieve the compression, DCT
coefficients should be quantized so that the near-zero
coefficients are set to zero and the remaining coefficients are
represented with reduced precision that is determined by
quantizer scale. The quantization results in loss of
information, but also in compression. Increasing the quantizer
scale leads to coarser quantization, which gives high
compression and poor decoded image quality. The use of
uniformly sized blocks simplifies the compression system, but
it does not take into account the irregular shapes within real
images. The block-based segmentation of source image is a
fundamental limitation of the DCT-based compression
system. The degradation is known as the "blocking effect" and
depends on block size. A larger block leads to more
efficient coding, but requires more computational power.
Fig 1.1 Scaling and wavelet function.
Image distortion is less annoying for small than for large DCT
blocks, but coding efficiency tends to suffer. Therefore, most
existing systems use blocks of 8×8 or 16×16 pixels as a
compromise between coding efficiency and image quality. In
recent times, much of the research activities in image coding
have been focused on the DWT, which has become a standard
tool in image compression applications because of its data
reduction capability. In a wavelet compression system, the entire image is transformed and compressed as a single data
object rather than block by block as in a DCT-based
compression system. It allows a uniform distribution of
compression error across the entire image. DWT offers
adaptive spatial-frequency resolution (better spatial resolution
at high frequencies and better frequency resolution at low
frequencies) that is well suited to the properties of an HVS. It
can provide better image quality than DCT, especially on a
higher compression ratio. However, the implementation of the
DCT is less expensive than that of the DWT. For example, the
most efficient algorithm for the 2-D 8×8 DCT requires only 54
multiplications, while the complexity of calculating the DWT
depends on the length of wavelet filters. A wavelet image
compression system can be created by selecting a type of
wavelet function, quantizer, and statistical coder. In this paper,
we do not intend to give a technical description of a wavelet
image compression system. We used a few general types of wavelets and compared the effects of wavelet analysis and
representation, compression ratio, image content, and
resolution to image quality. According to this analysis, we
show that searching for the optimal wavelet needs to be done
taking into account not only objective picture quality
measures, but also subjective measures. We highlight the
performance gain of the DWT over the DCT. Quantizers for
the DCT and wavelet compression systems should be tailored
to the transform structure, which is quite different for the DCT
and the DWT.
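A single level of the 2-D Haar DWT, the simplest case, can be sketched to show the whole-image transform and its subband energy compaction (an illustrative Python/NumPy sketch; practical codecs use longer filters such as the Daubechies and biorthogonal families compared in Section IV):

```python
import numpy as np

def haar_dwt2(img):
    # One level of the 2-D Haar DWT: the whole image is transformed at
    # once into four subbands (LL approximation, LH/HL/HH details).
    s = 1.0 / np.sqrt(2.0)
    lo = (img[:, 0::2] + img[:, 1::2]) * s   # row-wise averages
    hi = (img[:, 0::2] - img[:, 1::2]) * s   # row-wise differences
    ll = (lo[0::2, :] + lo[1::2, :]) * s
    lh = (lo[0::2, :] - lo[1::2, :]) * s
    hl = (hi[0::2, :] + hi[1::2, :]) * s
    hh = (hi[0::2, :] - hi[1::2, :]) * s
    return ll, lh, hl, hh

img = np.add.outer(np.arange(8.0), np.arange(8.0))  # smooth test ramp
ll, lh, hl, hh = haar_dwt2(img)
```

The transform is orthonormal, so the total energy is preserved, and for a smooth image almost all of it lands in the LL subband; the detail subbands quantize to near zero.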
1.1 COMPRESSION
Figure 1.2: Image/Video Compression Techniques
In predictive coding the present sample is predicted from
previous samples. Predictive coding techniques operate
directly on image pixels and thus are called spatial domain
methods. Delta modulation and Differential pulse code
modulation are different techniques in predictive coding.
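A minimal closed-loop DPCM sketch of this idea (the step size of 4 is an arbitrary choice for illustration; practical coders add adaptive prediction and entropy-code the residuals):

```python
def dpcm_encode(samples, q_step=4):
    # Closed-loop DPCM: predict each sample from the previously
    # *reconstructed* one, then quantize the prediction error.
    prev = 0
    residuals = []
    for s in samples:
        q = round((s - prev) / q_step)   # quantized prediction error
        residuals.append(q)
        prev += q * q_step               # track the decoder's reconstruction
    return residuals

def dpcm_decode(residuals, q_step=4):
    recon, prev = [], 0
    for q in residuals:
        prev += q * q_step
        recon.append(prev)
    return recon
```

For a slowly varying signal the residuals are small integers, which are much cheaper to code than the raw samples; the closed prediction loop bounds the reconstruction error by half the step size.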
Transform coding compression techniques are based on
modifying the transform of an image. In this, a reversible and
linear transform such as Discrete Cosine Transform (DCT),
Discrete Fourier Transform (DFT) and Wavelet Transform are
used to map the image into a set of transform coefficients,
which are then quantized and coded. The DCT-based image
compression standard is a lossy coding method that will result
in some loss of details and unrecoverable distortion. Fourier analysis breaks down a signal into sinusoids of different
frequencies. The main drawback of the Fourier transform is that it
provides only frequency information; the temporal
information is lost in the transformation process, whereas
wavelets preserve both frequency and temporal information.
II. WAVELET TRANSFORMS
Wavelets are functions generated from one single
function (basis function) called the prototype or mother
wavelet by dilations (scaling) and translations (shifts) in time
(frequency) domain. If the mother wavelet is denoted by ψ(t),
the other wavelets ψa,b(t) can be represented as
ψa,b(t) = (1/√|a|) ψ((t − b)/a) ------- 2.1
where a and b are two arbitrary real numbers. The
variables a and b represent the parameters for dilations and
translations respectively in the time axis. From Eq. 2.1, it is
obvious that the mother wavelet can be essentially represented
as
ψ(t) = ψ1,0(t) -------- 2.2
For any arbitrary a ≠ 1 and b = 0, it is possible to derive that
ψa,0(t) = (1/√|a|) ψ(t/a) ---------- 2.3
As shown in Eq. 2.3, ψa,0(t) is nothing but a time-scaled (by
a) and amplitude-scaled (by 1/√|a|) version of the mother wavelet
function ψ(t) in Eq. 2.2. The parameter a causes contraction of
ψ(t) in the time axis when a < 1 and expansion or stretching
when a > 1. That is why the parameter a is called the dilation
(scaling) parameter. For a < 0, the function ψa,b(t) results in
time reversal with dilation. Mathematically, substituting t in
Eq. 2.3 by t − b causes a translation or shift in the time axis,
resulting in the wavelet function ψa,b(t). The function ψa,b(t)
is a shift of ψa,0(t) to the right along the time axis by an amount
b when b > 0, and a shift to the left by an amount b when b < 0. That is why the variable b
represents the translation in time (shift in frequency) domain.
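The roles of a and b in Eq. 2.1 can be verified numerically; a Python/NumPy sketch, using the Mexican-hat wavelet as an arbitrary illustrative mother wavelet:

```python
import numpy as np

def mexican_hat(t):
    # A common mother wavelet (second derivative of a Gaussian, negated).
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def psi_ab(psi, t, a, b):
    # Eq. 2.1: psi_{a,b}(t) = |a|^(-1/2) * psi((t - b) / a)
    return np.abs(a) ** -0.5 * psi((t - b) / a)

t = np.linspace(-8.0, 8.0, 1601)              # grid step 0.01
mother = psi_ab(mexican_hat, t, 1.0, 0.0)     # a = 1, b = 0: the mother itself
stretched = psi_ab(mexican_hat, t, 2.0, 0.0)  # a > 1: expansion, lower amplitude
shifted = psi_ab(mexican_hat, t, 1.0, 3.0)    # b > 0: shift right by 3
```

The stretched version is wider and scaled down in amplitude by 1/√2, and the shifted version peaks at t = 3, exactly as the text describes.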
Figure 2. A wavelet shown at different scales
Figure 2 shows an illustration of a mother wavelet and its
dilations in the time domain with the dilation parameter a = α.
For the mother wavelet ψ(t) shown in Figure 2(a), a
contraction of the signal in the time axis when α < 1 is shown
in Figure 2(b) and expansion of the signal in the time axis
when α > 1 is shown in Figure 2(c). Based on this definition
of wavelets, the wavelet transform (WT) of a function (signal)
f(t) is mathematically represented by
W(a, b) = (1/√|a|) ∫ f(t) ψ*((t − b)/a) dt ---- 2.4
The inverse transform to reconstruct f(t) from W(a, b) is
mathematically represented by
f(t) = (1/C) ∫∫ W(a, b) ψa,b(t) (1/a²) da db ----- 2.5
where
C = ∫ |Ψ(ω)|² / |ω| dω ----- 2.6
and Ψ(ω) is the Fourier transform of the mother wavelet ψ(t).
If a and b are two continuous (nondiscrete) variables and f(t)
is also a continuous function, W(a,b) is called the continuous
wavelet transform (CWT). Hence the CWT maps a one-
dimensional function f(t) to a function W(a, b) of two
continuous real variables a (dilation) and b (translation).
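Eq. 2.4 can be discretized directly as a Riemann sum; a Python/NumPy sketch (the Mexican-hat mother wavelet and the sampling grid are illustrative choices, not from this paper):

```python
import numpy as np

def mexican_hat(t):
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt_point(f_vals, psi, t, a, b):
    # Discretized Eq. 2.4: W(a,b) ~ |a|^(-1/2) * sum f(t) psi*((t-b)/a) dt
    dt = t[1] - t[0]
    return np.sum(f_vals * np.conj(psi((t - b) / a))) * dt / np.sqrt(abs(a))

t = np.linspace(-8.0, 8.0, 1601)
f_vals = mexican_hat(t)             # analyze the wavelet against itself
w_00 = cwt_point(f_vals, mexican_hat, t, 1.0, 0.0)  # perfect match at b = 0
w_05 = cwt_point(f_vals, mexican_hat, t, 1.0, 5.0)  # poor match at b = 5
```

W(1, 0) comes out as the squared norm of the wavelet, (3/4)√π ≈ 1.33, while |W(1, 5)| is much smaller: W(a, b) measures how well f matches the wavelet at each scale and position.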
III. DESIGN AND IMPLEMENTATION USING ASWDR
ALGORITHM
Step 1 (Initialize). Choose an initial threshold, T = T0, such that
all transform values satisfy |w(m)| < T0 and at least one
transform value satisfies |w(m)| ≥ T0/2. Set the initial scan
order to be the baseline scan order.
Step 2 (Update threshold). Let Tk = Tk−1/2.
Step 3 (Significance pass). Perform the following procedure
on the insignificant indices in the scan order:
Initialize step-counter C = 0
Let Cold = 0
Do
Get next insignificant index m
Increment step-counter C by 1
If |w(m)| ≥ Tk then
Output the sign of w(m) and set wQ(m) = Tk
Move m to end of sequence of significant indices
Let n = C - Cold
Set Cold = C
If n > 1 then
Output reduced binary expansion of n
Else if |w(m)| < Tk then
Let wQ(m) retain its initial value of 0.
Loop until end of insignificant indices
Output end-marker as per WDR Step 3.
Step 4 (Refinement pass). Scan through significant values found with higher threshold values Tj , for j < k (if k = 1 skip
this step). For each significant value w(m), do the following:
If |w(m)| ∈ [wQ(m), wQ(m) + Tk), then
Output bit 0
Else if |w(m)| ∈ [wQ(m) + Tk, wQ(m) + 2Tk), then
Output bit 1
Replace value of wQ(m) by wQ(m) + Tk.
Step 5 (Create new scan order). For each level j in the wavelet
transform (except for j = 1), scan through the significant
values using the old scan order. The initial part of the new
scan order at level j - 1 consists of the indices for insignificant values corresponding to the child indices of these level j
significant values. Then, scan again through the insignificant
values at level j using the old scan order. Append to the initial
part of the new scan order at level j -1 the indices for
insignificant values corresponding to the child indices of these level j significant values. Note: No change is made to the scan
order at level L, where L is the number of levels in the
wavelet transform.
Step 6 (Loop). Repeat steps 2 through 5.
The creation of the new scanning order only adds a small
degree of complexity to the original WDR algorithm.
Moreover, ASWDR retains all of the attractive features of
WDR: simplicity, progressive transmission capability, and
ROI capability.
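The thresholding core of Steps 1 through 4 can be sketched as follows (a Python toy sketch, not the paper's MATLAB implementation; the reduced-binary index coding of Step 3 and the adaptive scan reordering of Step 5 are omitted, so this models the quantizer, not the actual bit stream, and the coefficient values are invented for illustration):

```python
import numpy as np

def aswdr_sketch(coeffs, passes=6):
    # Simplified significance + refinement loop shared by WDR/ASWDR.
    w = coeffs.astype(float).ravel()
    t = 2.0 ** (np.floor(np.log2(np.max(np.abs(w)))) + 1)  # T0: all |w| < T0
    wq = np.zeros_like(w)        # quantized magnitudes the decoder would hold
    for _ in range(passes):
        t /= 2.0                                         # Step 2: Tk = Tk-1/2
        newly = (wq == 0) & (np.abs(w) >= t)             # Step 3: significance
        wq[newly] = t
        refine = (wq > 0) & ~newly & (np.abs(w) >= wq + t)   # Step 4: bit 1
        wq[refine] += t
    return np.sign(w) * (wq + (t / 2.0) * (wq > 0))  # midpoint reconstruction

coeffs = np.array([[34.0, -5.0], [9.0, 0.5]])   # hypothetical wavelet values
recon = aswdr_sketch(coeffs, passes=6)
```

After k passes every significant coefficient is known to within Tk, so the midpoint reconstruction error is at most Tk/2; each additional pass halves the distortion at the cost of more output bits, which is what gives the embedded, progressive bit stream.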
IV. IMPLEMENTATION
The following tables consist of the PSNR and time taken values
for each wavelet, for both indexed and intensity images.
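The PSNR figures reported in the tables follow the usual definition; a minimal Python sketch (the peak of 255 assumes 8-bit images; identical images give an infinite PSNR, which this sketch does not guard against):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    # PSNR in dB: 10 * log10(peak^2 / mean squared error).
    err = original.astype(float) - reconstructed.astype(float)
    mse = np.mean(err ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher PSNR means the reconstructed image is closer to the original; a constant error of the full dynamic range gives 0 dB, the worst case.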
Table 4.1 shows the PSNR and time taken (in seconds)
values of Daubechies family wavelets for compressing an
indexed image.
Table 4.2 shows the PSNR and time taken (in seconds)
values of Biorthogonal family wavelets for compressing an
indexed image.
Table 4.3 shows the PSNR and time taken (in
seconds) values of Daubechies family wavelets for
compressing an intensity image.
Table 4.4 shows the PSNR and time taken (in
seconds) values of Biorthogonal family wavelets for
compressing an intensity image.
wavelets PSNR Time Taken
Bior1.1 44.5183 1.9344
Bior1.3 44.6206 2.0904
Bior1.5 44.4889 2.1216
Bior2.2 43.4227 2.0592
Bior2.4 43.4750 2.0436
Bior2.6 43.4597 2.1060
Bior2.8 43.4261 2.1996
Bior3.1 40.8095 1.9812
Bior3.3 41.8626 2.0748
Bior3.5 42.0891 2.0592
Bior3.7 42.1651 2.1528
Bior3.9 42.1622 2.2932
Bior4.4 44.3776 2.1060
Bior5.5 44.5349 2.1216
Bior6.8 44.2869 2.3088
Wavelets PSNR Time Taken
db1 44.5183 1.8564
db2 44.6469 1.9032
db3 44.5166 1.9344
db4 44.5815 1.9344
db5 44.5523 1.9656
db10 44.5559 2.1372
db15 44.5740 2.5896
db20 44.6888 2.9328
db25 44.7002 3.5100
db30 44.7770 4.0092
db35 44.7878 4.8984
db40 44.8612 5.8032
Wavelets PSNR Time Taken
db1 52.5105 0.9375
db2 52.2644 0.9688
db3 52.0599 1.0313
db4 51.9944 0.9531
db5 51.9673 1.0625
db10 51.8946 1.1250
db15 51.9586 1.3125
db20 51.9239 1.4063
db25 51.9805 1.6094
db30 51.9263 1.6875
db35 51.9573 2.2344
db40 51.9794 2.4531
Table 5.1 Best results for Intensity images from each family
Table 5.2 Best results for Indexed images from each family
Figure 5.1 shows the PSNR values and the time taken values of the best wavelet from each family obtained by compressing an
intensity image.
Figure 5.2 shows the PSNR values and the time taken values
of the best wavelet from each family obtained by compressing an
indexed image.
V. RESULTS
Figure 5.1 Test image (intensity image), Daubechies
family wavelets. Peak Signal to Noise Ratio: 52.5105, time
taken: 0.9375
Figure 5.2 The approximation of the intensity image,
Daubechies family wavelets
Figure 5.3 The reconstructed and difference intensity images,
Daubechies family wavelets
wavelets PSNR Time Taken
Bior1.1 52.5105 0.9844
Bior1.3 52.5242 1.0781
Bior1.5 52.4703 1.0625
Bior2.2 50.9283 1.1094
Bior2.4 50.9749 1.0469
Bior2.6 50.9368 1.0781
Bior2.8 50.8914 1.1094
Bior3.1 48.9349 1.0469
Bior3.3 49.7433 0.9219
Bior3.5 49.9444 1.0313
Bior3.7 50.0045 1.0625
Bior3.9 50.0539 1.1875
Bior4.4 51.779 1
Bior5.5 52.0525 1.0781
Bior6.8 51.7006 1.0781
WAVELETS PSNR TIME TAKEN (seconds)
Daubechies db1 (52.5105) db1 (0.9375)
Biorthogonal bior1.3 (52.5242) bior3.3 (0.9219)
WAVELETS PSNR TIME TAKEN (seconds)
Daubechies db40 (44.8612) db1 (1.8564)
Biorthogonal bior1.3 (44.6206) bior1.1 (1.9344)
Figure 5.3 Biorthogonal family wavelets, Peak Signal to Noise
Ratio: 52.5242, time taken: 0.9219
Figure 5.4 The approximation images of the Biorthogonal family
wavelet.
Figure 5.5 The reconstructed and difference images of the
Biorthogonal family wavelet.
Figure 5.6 Daubechies family wavelets, Peak Signal to Noise
Ratio: 44.8612, time taken: 1.8564
Figure 5.7 The approximation images of the Daubechies family
wavelets.
Figure 5.8 The reconstructed and difference images of the
Daubechies family wavelets.
Figure 5.9 Biorthogonal family wavelets, Peak Signal to
Noise Ratio: 44.6206, time taken: 1.9344
Figure 5.10 The approximation images of the Biorthogonal family
wavelets.
Figure 5.11 The reconstructed and difference images of the
Biorthogonal family wavelets.
VI. CONCLUSION
In this paper, the results were compared for different
wavelet-based image compression techniques. The effects of
different wavelet functions, filter orders, approximation and
reconstruction were examined. The results of the
ASWDR technique were compared using parameters
such as the PSNR and the time taken to compress, measured from the
reconstructed images.
These techniques were successfully tested on many images,
yielding the PSNR values and the time taken values of the best wavelet from
each family obtained by compressing intensity and
indexed images. The experimental results show that the
ASWDR technique performs better for indexed images,
giving a higher PSNR and less time to compress, and it is an
alternative to the SPIHT method due to its low complexity.
Finally, it is identified that ASWDR for indexed image
compression performs better when compared to intensity image
compression.
FFT Based Technique for Automatic Image Mosaicing Vinod G R1, Mrs.Anita R2
1VLSI Design and Embedded System(E&C)
Visvesvaraya Technological University
East Point College Of Engineering and Technology Bangalore,India
2Associate Prof Dept.of Electronics and Communication
Visvesvaraya Technological University
East Point College Of Engineering and Technology
Bangalore,India
Abstract— This paper proposes a technique to generate a
panoramic view by combining images. Image mosaicing is useful
for a variety of tasks in vision and computer graphics. It presents
a complete system for stitching a sequence of still images with
some amount of overlapping between every two successive
images. There are two contributions in this paper. The first is an image
registration method which handles rotation and translation
between the two images using FFT phase correlation. The second
is an efficient method of stitching registered images using the
registration parameters obtained in previous step. It removes the
redundancy of pasting pixels in the overlapped regions between
the images with the help of an empty canvas.
Keywords— Image Stitching, FFT Phase Correlation, Registration,
Rotation, Translation
1. INTRODUCTION
An image mosaic is a composition generated from a
sequence of still images. By estimating the
parameters relating the images, it is possible to construct
a single image from many images with some
overlapping areas, covering the entire visible area
of the scene. The steps in mosaicing are image
registration and image stitching.
Image registration refers to the geometric alignment
of a set of images. The set may consist of two or
more digital images taken of a single scene from
different sensors, or from different viewpoints. The
goal of registration is to establish geometric
correspondence between the images so that they
may be transformed, compared, and analyzed in a
common reference frame. This is of practical
importance in many fields, including remote
sensing, medical imaging, and computer vision.
The registration method presented here uses the Fourier
domain approach to match images that are translated and
rotated with respect to one another. Fourier methods
differ from other registration strategies because they
search for the optimal match according to information in
the frequency domain. The algorithm uses the property
of phase correlation for automatic registration, which
gives the translation parameters between two images by
showing a distinct peak at the point of the displacement.
With this as the basis, rotation is also found.
The next step, following registration, is image stitching.
Image integration or image stitching is a process of
overlaying images together on a bigger canvas. The
images are placed appropriately on the bigger canvas
using registration parameters to get the final mosaic. At
this stage, the main concerns are in respect of the quality
of the mosaic and the efficiency of the algorithm used.
In this paper, an efficient method for stitching multiple
images has been proposed.
2. IMAGE REGISTRATION
2.1 ESTIMATION OF ROTATION PARAMETERS:
Suppose the two images I1 and I2 to be registered involve
both translation and rotation with angle of rotation being
θ between them. When I2 is rotated by θ, there will be only
translation left between the images and the phase correlation
with I1 should give maximum peak. So by rotating I2 by one
degree each time and computing the correlation peak for that
angle, we reach a stage where there is only translation left
between the images, which are characterized by the highest
peak for the phase correlation. That angle becomes the angle
of rotation, θ.
2.2 TRANSLATION PARAMETER ESTIMATION:
If f(x, y) ⇔ F(ξ, η), then
f(x, y) exp[j2π(ξ0x + η0y)/N] ⇔ F(ξ − ξ0, η − η0) and
f(x − x0, y − y0) ⇔ F(ξ, η) exp[−j2π(ξx0 + ηy0)/N]
where the double arrow (⇔ ) indicates the
correspondence between f (x, y) and its Fourier
transform F. According to this property, also called
the Fourier Shift Theorem, if a certain function's
origin is translated by certain units, then the
translation appears in the phase of the Fourier
transform. That is, if f and f′ are two images that differ
only by a displacement (x0, y0), i.e.,
f′(x, y) = f(x − x0, y − y0)
Then, their corresponding Fourier transforms F
and F′ are related by
F′(ξ, η) = e^(−j2π(ξx0 + ηy0)/N) F(ξ, η).
The cross-power spectrum of two images f and f′
with Fourier transforms F and F′ is defined as
F(ξ, η) F′*(ξ, η) / |F(ξ, η) F′*(ξ, η)| = e^(j2π(ξx0 + ηy0)/N)
where F′* is the complex conjugate of F′. The
shift theorem guarantees that the phase of the cross-
power spectrum is equivalent to the phase
difference between the images. By taking inverse
Fourier transform of the representation in the
frequency domain, we will have a function that is
an impulse, that is, it is approximately zero
everywhere except at the displacement that is
needed to optimally register the two images. If there
is no transformation between the two images other
than translation, then there is a distinct peak at the
point of the displacement.
The discussion above shows that
whenever pure translation is present between
two images, the phase correlation has a maximum peak,
and the corresponding location gives the translation
parameters (x0, y0).
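This translation recovery can be sketched directly (a Python/NumPy sketch rather than the MATLAB implementation described in Section 4; the small epsilon guards against division by zero, and the test image and cyclic shift are synthetic):

```python
import numpy as np

def phase_correlation(f1, f2):
    # Normalized cross-power spectrum; its inverse FFT is (ideally) an
    # impulse at the displacement between the two images.
    F1 = np.fft.fft2(f1)
    F2 = np.fft.fft2(f2)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12        # keep only the phase
    corr = np.real(np.fft.ifft2(cross))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx, corr

# Recover a known cyclic shift of (3, 5) pixels.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
shifted = np.roll(np.roll(img, 3, axis=0), 5, axis=1)
dy, dx, corr = phase_correlation(shifted, img)
```

For an exact cyclic shift the correlation surface is essentially zero everywhere except a single peak of height close to 1 at the displacement, which is the distinct peak the text refers to.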
2.3 PROPOSED ALGORITHM:
Now, we present the algorithm for estimation of the
rotation and translation parameters
discussed in the previous two sections. (We can
downsample the images to speed up the process of
registration.)
Algorithm1:
Input:
Two overlapping images I1 and I2
Output:
Registration parameters (tx, ty, θ)
where tx and ty are translation in x and y directions
respectively and θ is the rotation parameter.
Steps:
1. Read and resize the two images. Let the resized
images be I1' and I2'.
2. For i = 1: step: 360
2.1) Rotate I2' by i degrees. Let the rotated image be
I2'rot.
2.2) Compute the Fourier transforms FI1' and
FI2'rot of images I1' and I2'rot respectively.
2.3) Let Q(u,v) be the Phase correlation value of I1'
and I2'rot, based on FI1' and FI2'rot.
Q(u,v) = FI1′(u,v)·FI2′rot*(u,v) / |FI1′(u,v)·FI2′rot*(u,v)|
2.4) Compute the inverse Fourier transform q(x, y) of
Q(u,v).
2.5) Locate the peak of q(x,y).
2.6) Store the peak value in a vector at position i. End
For.
3. Find the index of maximum peak from the values
stored in the vector in step 2.6. It gives the angle of
rotation. Let it be θ'.
4. Repeat steps 2.1 to 2.6 for i = θ'-step : θ'+step.
5. Find the angle of maximum peak from step 4. It
becomes the angle of rotation. Let it be θ.
6. Rotate the original image I2 by θ. Let the rotated
image be I2rot.
7. Phase correlate I1 and I2rot. Let the result be P(u,v).
8. Compute the inverse Fourier transform p(x,y) of
P(u,v).
9. Locate the position (tx, ty) of the peak of p(x, y)
which become the translation parameters.
10. Output the parameters (tx, ty, θ).
The above algorithm is capable of finding rotation
between the images. The maximum peak occurs only at
the point where there exists pure translation between the
images.
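A toy version of the rotation search in steps 2 through 5 (a Python/NumPy sketch, not the MATLAB implementation; it is restricted to multiples of 90 degrees, where np.rot90 is exact and no interpolation is needed, whereas the paper sweeps all angles in one-degree steps; the images and shift are synthetic):

```python
import numpy as np

def pc_peak(f1, f2):
    # Height of the phase-correlation peak between two images.
    cross = np.fft.fft2(f1) * np.conj(np.fft.fft2(f2))
    cross /= np.abs(cross) + 1e-12
    return np.real(np.fft.ifft2(cross)).max()

rng = np.random.default_rng(1)
img1 = rng.random((64, 64))
img2 = np.roll(np.rot90(img1, k=1), 7, axis=1)   # rotate 90 deg, then shift

# Undo each candidate rotation and record the correlation peak;
# the strongest peak identifies the true rotation (k = 1, i.e. 90 deg).
peaks = [pc_peak(img1, np.rot90(img2, k=-k)) for k in range(4)]
best_k = int(np.argmax(peaks))
```

Only the correct counter-rotation reduces the relationship between the images to pure translation, so its phase-correlation peak is near 1 while the other candidates stay near zero, mirroring fig 3.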
As an example, consider an input pair of images, fig 1 and fig
2, with a translation of 31 pixels along the column (x)
direction, 201 pixels along the row (y) direction, and
a rotation of 90 degrees. The plots of the phase
correlation between the images are shown in fig 3 and fig
4.
fig 1 fig 2
In fig 3 we can see a peak at the 90 degree point, showing
that the 2nd input image is rotated by 90 degrees with respect to
the 1st input image.
Fig 4 shows the peak at the point of the
exact translation between the two images along the X
and Y directions (X = 31 and Y = 201).
fig 4
3. IMAGE STITCHING
Image stitching is the next step following the
registration. At this stage, the reference image is
overlaid on the source image by pasting its pixels on a
canvas at the appropriate location using the
transformation parameters obtained in the registration
process. In this section, we present a general algorithm
for stitching any number of images.
3.1 PROPOSED ALGORITHM:
Algorithm2:
1. Create a canvas: The canvas is for the mosaic of all
the images. We call it image canvas.
2. Make the entire canvas black.
3. For a given image I, For each pixel in the image I,
Paste a mapped pixel on the canvas, taking into
consideration the translational and rotational parameters.
Fig 4 shows mosaic image of two input images.
Fig 4
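Algorithm 2 can be sketched for the translation-only case as follows (a Python/NumPy sketch rather than the MATLAB implementation; grayscale images and non-negative integer offsets are assumed, and rotation handling is left out):

```python
import numpy as np

def stitch(images, offsets, canvas_shape):
    # Paste each registered image onto a black canvas at its (row, col)
    # offset, writing each canvas pixel only once.
    canvas = np.zeros(canvas_shape)
    filled = np.zeros(canvas_shape, dtype=bool)
    for img, (oy, ox) in zip(images, offsets):
        h, w = img.shape
        region = canvas[oy:oy + h, ox:ox + w]      # view into the canvas
        mask = ~filled[oy:oy + h, ox:ox + w]       # pixels not yet pasted
        region[mask] = img[mask]                   # skip already-pasted pixels
        filled[oy:oy + h, ox:ox + w] = True
    return canvas

imgs = [np.ones((4, 4)), np.full((4, 4), 2.0)]
canvas = stitch(imgs, [(0, 0), (0, 2)], (4, 6))    # second image overlaps by 2
```

The boolean mask is what implements the "paste each pixel only once" property discussed next: overlap pixels keep the value from the first image that covered them.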
3.2 ADVANTAGES OF THE ABOVE METHOD:
This algorithm is very efficient in stitching multiple
images with large overlaps. Consider a sequence of
images with, let us say, 80% overlap between
successive images. If the entire image is pasted
every time, then some of the pixels in the overlap
region get mapped four times, thus leading to a
300% redundancy in pasting, whereas algorithm 2
pastes each pixel only once. This approach not only
improves the efficiency of the stitching but at the
same time keeps the quality of the mosaic closer to
that of the input images.
4. EXPERIMENTAL RESULTS
Algorithm 1 for registration and algorithm 2 for
stitching, as described in sections 2 and 3 respectively,
have been implemented in MATLAB R2011a. These
algorithms have been tested on different sets of images,
especially real images involving large amounts of
rotational and translational changes for registration and
illumination and view changes for image composition.
5. CONCLUSION
In this paper, we have presented two algorithms
for still image sequences. The first is a simple and
reliable algorithm for finding the rotation and
translation of planar transformations based on
phase correlation. The overall complexity is
dominated by FFT. A key feature of Fourier-based
registration methods is the speed offered by the use
of FFT routines. The next is a method of stitching
images which overcomes redundancy in re-pasting
pixels in the final mosaic. All these algorithms add
quality and efficiency to the mosaicing process.
Application of Numerical Approximation Techniques for Face
Detection & Recognition Kayala Rahul1, Nikhil Bharat1
1Department of Electronics & Communication, RV College of Engineering
RV Vidyaniketan Post, 8th Mile, Mysore Road, Bangalore, India
[email protected] , [email protected]
Abstract—The "Eigen Faces" technique for face detection in an
image is a standard algorithm, first proposed in 1987. It involves
the decomposition of a given image into its Eigen components for
the purpose of recognition of a face. The computation of Eigen
values of images is resource intensive. This paper proposes the
application of Numerical Approximation Techniques,
particularly the Jacobi‘s Method for Asymmetric matrices to
compute the Eigen values and thereby reduce the computational
intensity. The application of this technique is particularly useful
in low resource portable embedded systems.
Keywords— Face detection, Eigen Faces, Numerical Methods,
Jacobi's method, Principal Component Analysis
I. INTRODUCTION
Face detection is primarily used in Face Recognition
Systems (FRS), in which an image or a video is scanned for
a face and then selected features are compared with a database to
obtain a match. Facial recognition is primarily used as an
alternative to existing biometric identification such as
fingerprinting and eye iris scanning systems. Face detection is yet to be widely adopted as a security feature, primarily due to the high
computational resources required in extracting and matching
components in a face.
II. DETECTION TECHNIQUES
Several methods are used in the detection of faces in an
image. All the techniques make the following key assumptions:
The image of the face is of reasonable resolution
The face stored in the database and that submitted
for detection were taken under similar external
conditions such as lighting, camera noise etc.
The expression of the subjects is the same,
preferably neutral.
A. Traditional Detection Techniques
Some facial recognition algorithms extract prominent
features of the face such as relative position/shape of the eyes,
mouth, cheek bones, jaw etc.
Other algorithms normalize a gallery of face images and
then compress the face data, only saving the data in the image
that is useful for face detection. A probe image is then
compared with the face data.
Geometric algorithms are those which look at the
distinguishing features. Photometric algorithms are those
using a statistical approach that distils an image into values
and compares the values with templates to eliminate
variances.
Principal Component Analysis using Eigen Faces is a
traditional detection technique.
B. 3D Detection Technique
This technique uses 3D sensors to capture information
about the shape of a face. This information is then used to
identify distinctive features on the surface of a face such as
nose, chin etc.
Three-dimensional data points from a face vastly improve
the precision of facial recognition. It is not affected by
changes in lighting, and the pose angle of the face does not affect
the accuracy of the recognition algorithm.
III. TYPICAL FACE RECOGNITION SYSTEM
The block diagram of a typical face recognition system is
shown in Fig. 1
Fig. 1 Block diagram of a typical system
There are several stages in FRS, all of which affect the
accuracy of the system as a whole.
A. Acquisition
This is the entry point of the face recognition process. It is
the module where the face image under consideration is
presented to the system. The image can either be from a stored
database or can be a live picture captured at the module. Usually, a picture captured live is harder to process.
B. Pre-processing
Pre-processing is performed on the captured image in order to reduce the effect of environmental factors such as lighting. The image is normalized using well-known processes such as resizing to the standard of the current system, histogram processing to improve the quality, median filtering, high-pass filtering, rotational normalization and illumination normalization.
C. Feature Extractor
This segment of the FRS is used to determine which
features/landmarks of the face are to be extracted and
compared. The features extracted depend upon the algorithm
used by the system. In the Eigen Faces based system, the
feature extractor obtains the Eigen Vectors of the image using
Principal Component Analysis (PCA).
D. Training Sets
A set of images which are faces are used to train the system
with regard to the features of a typical face it will encounter.
The features extracted are compared with that of those
extracted from the training set in order to determine whether
an image is a face.
E. Classifier
The classifier determines if the image is a known face by
comparing the extracted features with the features stored in
the database. It adopts a Maximum Likelihood approach by
measuring the likelihood that a given image is a face and
belongs to a super set of the training set. If the training set is
small, it may result in errors and false positives.
IV. PRINCIPAL COMPONENT ANALYSIS
Principal component analysis (PCA) is a mathematical
procedure that uses an orthogonal transformation to convert a
set of observations of possibly correlated variables into a set
of values of linearly uncorrelated variables called principal
components. The number of principal components is less than
or equal to the number of original variables. In the context of face detection, and image processing in general, PCA refers to the extraction of the Eigen vectors of the image.
The main principle behind Face detection using PCA is
that all faces have similar characteristics and can be expressed
as a weighted sum of the Eigen vectors of those images in the
training set.
The accuracy of the detection depends on a variety of
factors such as number of the training set, variations in
lighting conditions, variations in subject expressions etc.
A. Procedure for PCA
The first step is to obtain a set S of M face images, which have been pre-processed. Each image is transformed into a vector of length N (the total number of pixels) and placed into the set.
Fig. 2 Training set images
The mean image Ψ is calculated for the training set as Ψ = (1/M) Σ_n Γ_n.
Fig. 3 Mean image of the training set
The mean-subtracted image Φ_i is then calculated as Φ_i = Γ_i − Ψ.
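These two steps can be sketched in a few lines of NumPy. This is an illustration of ours, using a tiny random "training set" in place of real face images; the shapes, not the data, are what matter:

```python
import numpy as np

# Tiny stand-in training set: M = 4 images of 8x8 pixels, each flattened
# to a row vector Gamma_n of length N = 64 (the paper uses 128x128 -> 16384).
rng = np.random.default_rng(0)
M, N = 4, 64
S = rng.random((M, N))     # the set S of face vectors

psi = S.mean(axis=0)       # mean image Psi
Phi = S - psi              # mean-subtracted images Phi_i, one per row
```

By construction, the mean-subtracted rows sum to zero in every pixel position.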
A set of M orthonormal vectors u_n is chosen which best describes the distribution of the data. The k-th vector u_k is chosen such that

λ_k = (1/M) Σ_n (u_k^T Φ_n)^2

is a maximum, subject to the orthonormality constraint u_l^T u_k = δ_lk.
From the concept of Singular Value Decomposition (SVD), we know that u_k and λ_k are the eigenvectors and eigenvalues of the covariance matrix C = (1/M) Σ_n Φ_n Φ_n^T. We now compute the Eigen values and vectors of the covariance matrix C, and then express each image in the training set in terms of the Eigen vectors u_l obtained. This gives us the set of Eigen faces of the training set.
Fig.4 Eigen faces for the given training set
B. Recognition of Faces
A new face image Γ is transformed into its Eigen face components. We compare the input image with our mean image and project the difference onto each eigenvector of the C matrix; each value w_k = u_k^T (Γ − Ψ) represents a weight and is saved in a vector Ω. We then determine the Euclidean distance between the weights of the images in the training set and those of the input image, ε_k = ||Ω − Ω_k||.
The input image is considered to be a known face if ε_k is below an established threshold θ_k. If the difference is above that threshold but below a second threshold θ_u, the image is classified as an unknown face. If an unknown face is discovered, i.e. θ_k < ε_k < θ_u, most FR systems add it to the training set and recompute the Eigen faces. This feedback technique helps improve the accuracy.
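The threshold logic above can be sketched as follows. This is a Python illustration of ours; the function and variable names are not from the paper:

```python
import numpy as np

def classify(omega, known_omegas, theta_k, theta_u):
    """Classify a weight vector omega against training-set weight vectors.

    Returns the index of the matched known face, "unknown" when the best
    distance falls between the two thresholds, or "not a face" above
    theta_u. Illustrative helper, not the paper's own code.
    """
    dists = np.linalg.norm(known_omegas - omega, axis=1)  # epsilon_k
    k = int(np.argmin(dists))
    if dists[k] < theta_k:
        return k                 # known face
    if dists[k] < theta_u:
        return "unknown"         # candidate for addition to the training set
    return "not a face"
```

The "unknown" branch is what feeds the retraining loop described in the text.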
V. LIMITATIONS OF PCA
Consider a training set of 20 faces of standard size
128x128. Each face is represented using a single row vector of
length 16384.
Correspondingly, the size of the Mean Subtracted image
matrix is 16384 x 20. Hence, the size of the Covariance matrix
C is 16384 x 16384.
Subsequently C has 16384 Eigen vectors and values, which
makes computation highly complex, resource intensive and
time consuming.
However, the number of Eigen faces depends upon the size of the training set. This makes it reasonable to assume that the total number of significant Eigen vectors is equal to the size of the training set. In this example, the number of significant Eigen vectors is 20.
Despite the reduction, the processing and computation of the M largest Eigen values and corresponding vectors is still computationally intensive, especially in the context of the limited resources available in an embedded system as compared to a PC.
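One well-known way to exploit this observation, from the Eigenfaces literature (ref [1]), is to eigendecompose the small M × M matrix A·Aᵀ instead of the huge N × N covariance matrix, then map each small eigenvector back. A NumPy sketch of ours (random data standing in for face images, N shrunk for speed):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 20, 1024                      # the paper has N = 16384
A = rng.random((M, N))
A -= A.mean(axis=0)                  # mean-subtracted images, one per row

L = A @ A.T                          # small M x M matrix, cheap to decompose
evals, V = np.linalg.eigh(L)

# Each eigenvector v of L maps to an eigenvector u = A^T v of A^T A
# (proportional to the covariance C), with the same eigenvalue. These
# are exactly the M significant eigenvectors.
u = A.T @ V[:, -1]
u /= np.linalg.norm(u)

C = A.T @ A                          # built here only to verify the claim
residual = np.linalg.norm(C @ u - evals[-1] * u)
```

The identity behind the mapping: C(Aᵀv) = Aᵀ(AAᵀ)v = λ·Aᵀv.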
VI. JACOBI‘S METHOD FOR EIGEN VALUES
The computational complexity involved in Eigen value calculation can be mitigated by adopting numerical approximation techniques.
Jacobi's method for non-symmetric matrices is used to convert any given matrix into a diagonal matrix. The Eigen values are then the elements of the leading diagonal of the diagonal matrix thus obtained.
This is an iterative technique and the error involved in the
approximation is systematically reduced with each iteration.
A. Mathematical Analysis
For a symmetric matrix A, the Jacobi method constructs a series of orthogonal matrices T_i such that the matrix

D_(i+1) = T_i^T ... T_1^T A T_1 ... T_i

converges to a diagonal matrix D, where i denotes the iteration number.
The orthogonal transformation matrix T comprises a rotation component R and a shear component S, such that T_i = R_i S_i. Consider a rotation taking place in the (k, l) plane of the matrix, with k < l. Then Jacobi's method states that
r_kk = r_ll = cos θ,   r_kl = -r_lk = -sin θ
s_kk = s_ll = cosh p,  s_kl = s_lk = -sinh p

tan θ = (a_kl + a_lk) / (a_kk - a_ll),   tanh p = (ed - h) / (g + 2(e^2 + d^2))

where

e = a_kl - a_lk
d = (a_kk - a_ll) cos 2θ + (a_kl + a_lk) sin 2θ
g = Σ_i (a_ki^2 + a_ik^2 + a_li^2 + a_il^2)
h = cos 2θ Σ_i (a_ki a_li - a_ik a_il) - 0.5 sin 2θ Σ_i (a_ki^2 + a_ik^2 + a_li^2 + a_il^2)

Here a_ij is an element of A, r_ij an element of R and s_ij an element of S; each sum runs over all i except i = k and i = l.
These give us R_i and S_i, and hence T_i. For each iteration, T_(i+1) is calculated and a better approximation of the diagonal matrix D is obtained. This is repeated until T_i = I_n. The leading diagonal elements of D give the Eigen values, while the product of all the T_i gives the Eigen vectors corresponding to those values.
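In the symmetric special case the shear component reduces to the identity (S = I, so T = R), and the iteration above becomes the classical Jacobi eigenvalue algorithm. The following plain-Python sketch of ours shows the rotate-until-diagonal loop the text describes:

```python
import math

def jacobi_eigenvalues(A, sweeps=20):
    """Cyclic Jacobi iteration for a real symmetric matrix A (list of lists).

    Each plane rotation T = R in the (k, l) plane zeroes the off-diagonal
    pair a_kl; the diagonal of the limit matrix D holds the eigenvalues.
    Symmetric special case (shear S = I) of the scheme in the text.
    """
    n = len(A)
    A = [row[:] for row in A]                 # work on a copy
    for _ in range(sweeps):
        for k in range(n - 1):
            for l in range(k + 1, n):
                if abs(A[k][l]) < 1e-15:
                    continue
                # rotation angle: tan 2*theta = 2*a_kl / (a_kk - a_ll)
                theta = 0.5 * math.atan2(2 * A[k][l], A[k][k] - A[l][l])
                c, s = math.cos(theta), math.sin(theta)
                for i in range(n):            # apply R^T from the left
                    aki, ali = A[k][i], A[l][i]
                    A[k][i] = c * aki + s * ali
                    A[l][i] = -s * aki + c * ali
                for i in range(n):            # apply R from the right
                    aik, ail = A[i][k], A[i][l]
                    A[i][k] = c * aik + s * ail
                    A[i][l] = -s * aik + c * ail
    return sorted(A[i][i] for i in range(n))
```

Only trigonometric functions and multiply-adds appear in the loop, which is the property the Advantages section relies on.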
B. Application in Face Detection
The Jacobi technique described above is best suited for large real non-symmetric matrices. In the domain of face detection, the method is applied to the covariance matrix C to extract the Eigen vectors. In order to further simplify the computation, under the assumption that the number of significant Eigen values equals the size of the training set M, the matrix C is first reduced to a matrix of rank M and then subjected to Jacobi's iteration.
C. Implementation
The above scheme can be implemented for an image of a standard type such as .jpg, .gif or .png, as well as any proprietary standard used in the FRS. We have implemented the same using .jpg images of faces with a standard size of 320 x 243 pixels.
VII.ADVANTAGES & LIMITATIONS
The application of Jacobi's technique in face detection lies in the most crucial stage of any FRS, i.e. feature extraction. The results at this stage determine the key performance indicators of the system.
The key advantage of this technique stems from its iterative nature, which reduces the code density and thereby makes its integration into portable embedded systems easier. Additionally, the simplicity of the technique makes it possible to implement on any basic floating-point processor with support for trigonometric instructions. This eliminates the need for the dedicated, high-cost DSP units usually required for complex matrix operations on images.
Notwithstanding the advantages, the inherent disadvantage of incorporating Jacobi's technique is that the solution is an approximation, resulting in slight errors and deviations from the expected result. This can be overcome by increasing the number of iterations or by increasing the size of the training set (statically by the programmer, or dynamically by the system incorporating images classified as unknown). However, this increases the time complexity of the system.
The challenge is to choose a suitable trade-off between the accuracy of the FRS, its cost/portability and, importantly, its time complexity.
VIII. CONCLUSIONS
This paper demonstrates the main challenges involved in
incorporating a robust FRS in a portable embedded
system/low resource computing environment. These can be
overcome to a certain degree by incorporating numerical
approximation techniques in the FRS. This paper
demonstrated the use of Jacobi's method for the Eigen values/vectors of a real non-symmetric matrix in an FRS based on the principle of PCA using Eigen faces for face detection.
The challenges involved in the incorporation, primarily
time complexity and accuracy of the system were discussed
vis-à-vis the inherent advantage of code density and the
ability to be implemented in a low-resource computing environment. In conclusion, numerical approximation techniques can be applied to FRS in low-resource environments where there is sufficient scope for a trade-off between accuracy and time complexity.
ACKNOWLEDGEMENT
We wish to acknowledge the valuable guidance provided to
us by Sri. P L Rajashekhar, Assistant Professor, Department
of Mathematics, RVCE. We also would like to thank Smt.
Veena Devi, Assistant Professor, Department of Electronics &
Communication, RVCE.
We also express our gratitude to The Department of
Electronics & Communication for providing us the necessary
support and environment to work on this endeavour.
REFERENCES
[1] M. Turk and A. Pentland, "Eigen faces for recognition", Vision and Modeling Group, The Media Laboratory, MIT.
[2] I. Atalay, "Face detection using Eigen faces", M.Sc. Thesis, Istanbul Technical University, 19
[3] A. Bigdeli, C. Sim, M. Biglari-Abhari and B. C. Lovell, "Face Detection on Embedded Systems", Department of Electrical and Computer Engineering, The University of Auckland, Auckland, New Zealand.
[4] Z. Li and X. Tang, "Eigen Face Recognition Using Different Training Data Sizes", Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong.
[5] C.-O. Lee, "Basic Numerical Methods for Image Processing".
[6] G. Aubert and P. Kornprobst, "Mathematical Problems in Image Processing", Applied Mathematical Sciences 147, Springer, 2002.
Fault Management Tool Development for an Optical Node
Ajay Kumar V1, M. B. Kamakshi2, Venkatesha B V3
1, 2 Dept. of Telecommunication, R.V. College of Engineering, Bangalore, India
3 Alcatel-Lucent India Pvt. Ltd.
[email protected], [email protected], [email protected]
Abstract - Network management and security have become a very sensitive and important topic for manufacturers and network operators. In optical networks, fault management deals with the detection and isolation of faults based on the alarms received from the network elements. This paper describes the development of a fault management tool for an enterprise optical node managed by an enterprise network management system. The tool is implemented in order to reduce the manual effort in validating the alarms, and a description of it is presented. For any network management system, FCAPS is the term used to represent its functionalities: Fault, Configuration, Accounting, Performance and Security management. The fault management aspect of an enterprise optical node is presented in this paper. Validated results from the tool's execution are captured as screenshots and presented.
Keywords - FCAPS: Fault, Configuration, Accounting, Performance, Security management; NMS: Network Management System; PSS: Photonic Service Switch; SAM: Service Aware Manager; DWDM: Dense Wavelength Division Multiplexing; Fault Management; Alarm Management; SNMP: Simple Network Management Protocol; UDP: User Datagram Protocol
I. Introduction
Today's optical networks are capable of carrying large
amounts of data. Commercial systems have been deployed with a fiber capacity of several terabits per second, which is equivalent to millions of simultaneous telephone
conversations. As optical network technology advances and
higher bandwidth is demanded, the amount of data transmitted
over a single optical fiber is expected to increase further. Due
to such high data rates, even a short service disruption may
cause large amounts of data to be affected. Many different
types of service disruptions occur frequently in practice. They
include bending and cutting of fiber, loss of signal, equipment failure, and human error. Besides faults, optical networks are
vulnerable to sophisticated attacks that are not possible in
electronic networks. Therefore, it is of critical importance for
such networks to have fast and effective methods or tools for
identifying and locating network failures. This is especially
important for the physical layer, where any physical failure
should be detected, located and corrected before it is noticed
by the upper layer protocols. Fault detection in optical
networks depends on alarms/traps generated by different types
of network monitoring equipment in response to unexpected
events. Depending on the placement and capabilities of the
monitoring devices, the network fault manager or a network
management system may receive a large number of
redundant alarms for some network failures, while it may not
receive any alarms for other network failures. In order for the fault detection and localization mechanism to be fast and effective, it is important to reduce the number of redundant alarms. This reduces the alarm processing time as well as the ambiguity in fault localization.
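As a toy illustration of redundant-alarm reduction (a sketch of ours, not part of any NMS), alarms sharing the same network element and condition can be collapsed into one record carrying a repeat count:

```python
def reduce_alarms(alarms):
    """Collapse duplicate alarms: keep one record per (element, condition)
    pair and count how many times it was reported. Field names are
    illustrative, not taken from any real alarm schema."""
    reduced = {}
    for alarm in alarms:
        key = (alarm["element"], alarm["condition"])
        if key in reduced:
            reduced[key]["count"] += 1
        else:
            reduced[key] = dict(alarm, count=1)
    return list(reduced.values())
```

Real systems go further and correlate alarms across elements to a common root cause, but even this simple dedup cuts the processing load.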
In optical networks, the physical layer generally consists of
several basic network components. Optical components are
passive or active. Passive optical components do not have
monitoring equipment capable of detecting and reporting
alarms. Active optical components have monitoring
equipment and therefore are capable of reporting alarms to the
network management system or network manager.
Optical networks, and all networks in general, need a fault management system or tool that is able to identify the faults that occur from the information given by the network elements. A fault is defined as an accidental interruption of the ideal functioning of the network due to component wear. Faults produce either signal degradation or complete signal interruption; the former are called soft faults, and the latter hard faults.
Section II deals with the introduction of the Alcatel-Lucent
developed enterprise-specific optical node called Photonic
Service Switch (PSS) and the enterprise-specific Network
Management System called Service Aware Manager (SAM).
Section III deals with the implementation procedure for fault
management. Section IV shows the snapshots of the validated
results obtained for one card supported by PSS and also the
corresponding traces of traps using a network monitoring tool,
Wireshark. Section V gives the conclusion. The terms SAM and NMS, PSS and optical node, and alarms and traps are used interchangeably.
II. Photonic Service Switch (PSS) and Service
Aware Manager (SAM).
A. Photonic Service Switch
The Alcatel-Lucent 1830 Photonic Service Switch (PSS)
represents a new breed of photonic switch for next generation
access, metro and long-haul wavelength division multiplexing
(WDM). It is a multi-reach platform that spans access, metro,
regional and long-haul applications and supports a wide range
of data rates, enabling service delivery in a variety of
environments and applications. It is used in broadband transport networks by telecommunication operators and enterprises such as telcos to provide high-bandwidth connectivity over distances of up to 4000 km. It also supports the full range of network topologies, including ring, point-to-point and optical mesh topologies. Fig.1 shows the schematic of a PSS
node managed by the SAM.
Fig.1 Graphical Representation of PSS node (simulator node)
Key features of the PSS (optical node):
- A static, tunable/reconfigurable optical add/drop multiplexer (T/ROADM) with single-wavelength add/drop granularity
- WDM functionality
- Colorless and any-direction add/drop capabilities
- Up to 88 wavelengths per fiber at 50 GHz ITU spacing
- ODU1, ODU2, ODU3 and ODU4 interfaces according to the G.709 standard
- 100 Gb/s and 40 Gb/s channel capacity, best in class
- Wavelength Tracker technology enabling end-to-end power control, monitoring, tracing and fault localization for each individual wavelength channel
- Support for various cards according to the service required, i.e., Optical Transponder Cards, Filter Cards and Amplifier Cards
B. Service Aware Manager
The Service Aware Manager (SAM) is a network management application designed using industry standards such as the Java framework, multi-tier layering and standard web service interfaces. The use of industry-standard interfaces allows the SAM to interoperate with other network systems. Fig.2 shows the user interface of SAM.
Fig.2 SAM GUI showing Alarms window and topology
1) Key features of SAM (NMS)
- Use of open standards that promote interoperation with other systems
- Distributed server processing
- A multi-tier model that groups functions into separate, well-defined elements
- Creation of web services
- Component redundancy
- SNMP as the underlying protocol
2) Key components of SAM
- The server, a Java-based network-management processing engine
- The database, an Oracle relational database that provides persistent storage
- Java-based GUI clients, which provide a graphical interface for the network operators
3) Network management capabilities of SAM
- Alarm correlation up to the service level
- Service and routing provisioning
- Inventory reporting at the equipment level
- Network performance and accounting data collection
III. Implementation of Fault Management Tool.
As mentioned in the previous sections, Service Aware
Manager is an NMS which manages the Photonic Service
Switch, which is an optical network element. Fig.3 shows the
flow in implementation of fault management tool for PSS
managed by SAM. In Fig.3, the major elements are PSS, SAM
and an OSSI, which is an Alcatel-Lucent-specific coding methodology. Initially, a dump of the alarms supported by a card is taken from the node (PSS) by logging into the node through the SSH protocol. The listed alarms are then raised on the node with the help of a binary tool supported by the node. Since the node is managed by the SAM (NMS), all alarms raised on the node are reported to the NMS. The underlying protocol for communication between the node and the NMS is SNMP. The alarms seen by the NMS are extracted using the XML API client and are compared with the node alarm parameters. If both alarm specifics match, they are validated as Passed; otherwise they are validated as Failed. The binary tool
supported by the node is called FMDH, which stands for Fault
Management Defect Handler. The tool developed has both a frontend and a backend. The frontend includes a graphical user interface for the selection of a particular card at a particular slot; the backend includes the code for extracting all the alarms supported by the node.
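The Passed/Failed comparison at the heart of the tool can be sketched as follows. This is a Python illustration of ours; the field names ("name", "severity") stand in for the Alcatel-Lucent alarm specifics the real tool compares:

```python
def validate_alarms(node_alarms, nms_alarms):
    """Mark each node alarm Passed if the NMS reports an alarm with
    matching specifics, else Failed. Illustrative sketch only."""
    reported = {(a["name"], a["severity"]) for a in nms_alarms}
    return {
        a["name"]: "Passed" if (a["name"], a["severity"]) in reported
        else "Failed"
        for a in node_alarms
    }
```

Validating the whole dump in one pass like this is what replaces the manual alarm-by-alarm check.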
Fig.3 Method implemented in developing the fault management tool
A. Back End Implementation Procedure
- A particular card supported by the node is chosen for alarm validation.
- That particular card is configured on the node.
- The dump of the alarms supported by the node is extracted by logging into the node through the Secure Shell protocol.
- The alarms are syntactically arranged for execution on the node, using the binary tool supported on the node.
B. Front End Implementation Procedure
- On the frontend, which is implemented using Java, a particular card supported by the optical node can be chosen.
- The frontend also provides options to choose the particular slot and shelf on which the card is configured.
- The node IP address is also provided as an input.
Fig.4 shows the developed GUI, which gives options for
validating a card.
Fig.4 Developed GUI showing various fields on the Frontend
IV. Results and Discussion
As mentioned in Section III, one particular card supported by the node, the A2P2125 Amplifier Card, is chosen, and its validated results are given in the figures below. The characteristics of the card are as follows.
The optical amplification function is performed via multistage EDFA amplifiers, most with mid-stage DCM access. These amplifiers are implemented as integrated variable-gain optical amplifier modules which include fast feedback for transient control. The card provides a maximum gain of 25 dB. Fig.5 shows the validated test results for the A2P2125 card.
Fig.5 SAM GUI showing Alarms captured for the A2P2125 card
Since the underlying protocol for communication between the node and the NMS is SNMP in this case, the alarms can also be referred to as traps. Fig.6 shows the traps recorded using Wireshark, a network monitoring tool, generated during the tool's execution for the mentioned card. The figure gives an insight into various attributes of the traps, such as the source from which the trap is generated, the destination of the trap, the underlying protocol, the timestamp, etc. Fig.7 shows the packet format for one particular alarm/trap, in which layer 2, layer 3, layer 4 and layer 7 information is recorded using the Wireshark tool; the layer 7 information, i.e., the SNMP-related information, is of importance in this context.
Fig.6 Screenshot of alarms/traps recorded from the tool using Wireshark
Fig.7 Screenshot of packet information recorded using Wireshark
V. Conclusion
The tool discussed reduces the manual effort of validating one individual alarm at a time, which is an inefficient validation procedure, by providing for bulk validation of all the alarms supported by the optical node. The front end developed, i.e., the user interface, provides flexibility and granularity in the validation of the alarms.
REFERENCES
[1] Ma-kun Guo, Yi-min Wang and Qi Yu, "Research and Implementation of Network Management System Based on XML View", International Conference on Logistics Engineering and Intelligent Transportation Systems (LEITS), 2010, pp. 1-4.
[2] S. Stanic, S. Subramaniam, G. Sahin, H. Choi and H.-A. Choi, "Active monitoring and alarm management for fault localization in transparent all-optical networks", IEEE Transactions on Network and Service Management, 2010, Vol. 7, Issue 2, pp. 118-131.
[3] Wang Haitao and Chang Chun Qin, "Network management system based on Java technology", 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), 2010, pp. V1-685 - V1-688.
[4] Xiaolin Lu, "An Architecture for Web Based and Distributed Telecommunication Network Management System", Third International Symposium on Intelligent Information Technology Application (IITA 2009), 22 Nov. 2009, Vol. 1, pp. 152-155.
[5] M. N. Ismail, "Network Management System Framework and Development", International Conference on Future Computer and Communication (ICFCC 2009), 3-5 April 2009, pp. 450-454.
[6] S. Stanic and S. Subramaniam, "Distributed Hierarchical Monitoring and Alarm Management in Transparent Optical Networks", IEEE International Conference on Communications, 2008, pp. 5281-5285.
[7] S. Stanic, G. Sahin, Hongsik Choi, S. Subramaniam and Hyeong-Ah Choi, "Monitoring and alarm management in transparent optical networks", Fourth International Conference on Broadband Communications, Networks and Systems, 2007, pp. 828-836.
[8] Alcatel-Lucent reference guides and data sheets.
[9] www.snmp.org
Toolbox for material classification of MMWR and FLIR
for Enhanced Vision System
Devendran.B*, Sudesh Kumar Kashyap**, T.V.Rama Murthy***
*PG Student, REVA Institute of Technology and Management, Bangalore, [email protected]
**MSDF lab, FMCD, CSIR-National Aerospace Laboratories, Bangalore, India, [email protected]
***Dept. of Electronics and Communication Engineering, REVA Institute of Technology and Management,
Bangalore, [email protected]
Abstract - To improve the situational awareness of an aircrew during poor visibility, different approaches have emerged during the past few years. Enhanced vision systems (EVS, based on sensor images) are one of those. Typically, the enhanced vision system concept is a combination of sensor data, environmental variables, internal and external state data, and a material database of the given geographical area. EVS uses weather-penetrating forward-looking image sensors such as Forward Looking Infrared Radar (FLIR) and HiVision Millimeter Wave Radar (HiVision MMWR). To generate the backscatter image from the imaging sensors, it is important to develop a material-classified database of the given geographical area, but the development of such a database requires various image processing tasks. Hence the main contribution of this paper is the implementation of a GUI-based toolbox capable of developing a material database for both MMWR and FLIR, using different material properties, for enhanced vision system functionalities.
Keywords - Enhanced Vision Systems (EVS), Graphical
User Interface (GUI), Region of Interest (ROI) and Phong like
lighting model, Normalized Radar Cross Section (NRCS).
I. INTRODUCTION
The typical enhanced vision system concept is shown in fig.1. The performance of the enhanced vision system relies on the performance of the imaging sensors [1]. The reliability of the imaging sensors depends highly on the accuracy of the material-classified database used, since the imaging sensors generate the backscatter image by taking the material-classified database of the given geographical area as a reference [2]. Hence designing such a database accurately is a significant challenge, and it requires a number of image processing functions to classify the objects in the given image.
Fig.1. Typical Enhanced Vision System concept [3]
As mentioned earlier, sensor vision is one of the integral parts of enhanced vision system technology. Sensor vision uses weather-penetrating Forward Looking Infrared Radar (FLIR) and MilliMeter Wave Radar (MMWR) sensors for imaging. In the case of MMWR, imaging is based on the backscattering coefficient (Normalized Radar Cross Section, NRCS, σ0), whereas for FLIR the solar absorptance, albedo, emissivity, conductivity, thickness, slope and surface azimuth parameters are to be considered. More details about the radar modelling requirements are given in refs [1-4].
For a given geographical area (airport image), the classifications of terrain are: (1) Asphalt or Bitumen, (2) Concrete, (3) Tar, (4) Bare Soil, (5) Green Grass, (6) Trees, (7) Desert Sand, (8) Ice or White Paint, (9) Snow, (10) Water, (11) Brick or Urban Areas, (12) Wood.
Material classification has to be done with respect to the
above terrain types. The material-coded airport database can subsequently be used in the modelling and simulation of MMWR and FLIR for the purposes of EVS studies.
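Once the terrain classes are fixed, material coding itself is a per-pixel lookup. A hypothetical sketch of ours (the numeric codes are invented for illustration; the paper does not fix them):

```python
# Hypothetical material-code table for the twelve terrain classes above.
MATERIAL_CODES = {
    "asphalt_or_bitumen": 1, "concrete": 2, "tar": 3, "bare_soil": 4,
    "green_grass": 5, "trees": 6, "desert_sand": 7, "ice_or_white_paint": 8,
    "snow": 9, "water": 10, "brick_or_urban": 11, "wood": 12,
}

def encode(label_image):
    """Turn a 2-D grid of terrain labels into a material-coded database."""
    return [[MATERIAL_CODES[label] for label in row] for row in label_image]
```

The hard part, assigning the right label to each region in the first place, is what the toolbox's ROI workflow below addresses.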
Material classification of an airport image is purely an image processing task, which requires several algorithms to be applied to an image at a time. Material classification can also be done using edge detection and colour coding methods, but the results of these methods will not meet the desired accuracy. This drawback motivates the development of a software toolbox to create a material database based on the different aforementioned parameters.
Fig.2. Block diagram for toolbox design
In this paper, details of the implementation of a MATLAB-based GUI toolbox for material classification, and the realization of a Phong-like lighting model for different terrain materials based on the Normalized Radar Cross Section, are presented. The authors have attempted to present the complete procedure required for developing a material database using the designed toolbox.
II. TOOLBOX DESIGN USING MATLAB GRAPHICAL USER
INTERFACE DEVELOPMENT ENVIRONMENT (GUIDE)
A graphical user interface (GUI) is a graphical display in one or more windows containing controls, called components, which enable a user to perform interactive tasks. The user of the GUI does not have to create a script or type commands at the command line to accomplish the tasks and, unlike with coded programs, need not understand the details of how the tasks are performed. GUI components can include menus, toolbars, push buttons, radio buttons, list boxes and sliders, to name a few. GUIs created using MATLAB tools can also perform any type of computation, read and write data files, communicate with other GUIs, and display data as tables or as plots [8].
Most GUIs wait for their user to manipulate a control and then respond to each action in turn. Each control, and the GUI itself, has one or more user-written routines (executable MATLAB code) known as callbacks, named for the fact that they "call back" to MATLAB to ask it to do things. The execution of each callback is triggered by a particular user action such as pressing a screen button, clicking a mouse button, selecting a menu item, typing a string or a numeric value, or passing the cursor over a component. The GUI then responds to these events. The creator of the GUI provides callbacks which define what the components do to handle events. This kind of programming is often referred to as event-driven programming; in the example, a button click is one such event. In event-driven programming, callback execution is asynchronous [6-8].
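The register-then-dispatch pattern behind callbacks can be shown in a few lines. This toy dispatcher is ours and written in Python for brevity, not part of MATLAB or the toolbox:

```python
class Gui:
    """Toy event-driven dispatcher: components register callbacks,
    and each user event triggers the matching one."""

    def __init__(self):
        self._callbacks = {}

    def register(self, event, callback):
        # a component wires a user-written routine to an event name
        self._callbacks[event] = callback

    def fire(self, event, *args):
        # the GUI dispatches an event to its callback, if one exists
        callback = self._callbacks.get(event)
        return callback(*args) if callback else None
```

GUIDE generates exactly this kind of wiring automatically: one callback stub per component, invoked when the matching event fires.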
MATLAB GUIs can be built in two ways: by using GUIDE (GUI Development Environment), an interactive GUI construction kit, or by creating code files that generate GUIs as functions or scripts (programmatic GUI construction). In this paper the first approach is selected. It starts with a figure that is populated with components from within a graphic layout editor. GUIDE creates an associated code file containing callbacks for the GUI and its components, and saves both the figure (as a FIG-file) and the code file; opening either one also opens the other to run the GUI [8]. The initial view of the material coding toolbox is shown in fig.3.
Important callbacks used in designing the toolbox
Some of the most important callback functions used in designing the material coding toolbox are discussed as follows. To write the other callback
functions used in the toolbox, [6]-[8] were used as a guide.
Load_Image_Callback: using this callback function,
the user can load an image from any directory. Once the
image is selected, it is displayed on the main axes. The MATLAB
functions used for this purpose are described as follows.

    [h, path] = uigetfile( ...
        {'*.jpg;*.tif;*.png;*.gif', 'All Image Files'; ...
         '*.*', 'All Files'}, 'mytitle');
    handles.a = imread(fullfile(path, h));
    axes(handles.axes3)
    set(gcf, 'CurrentAxes', handles.axes3)
    set(handles.Original_img, 'String', h)
    imshow(handles.a)
Select_ROI_Callback: this callback function can be
called the heart of the designed toolbox, because it
provides the facility for the user to select a region of his
own interest. The MATLAB function used for this task is
described as follows,

    I = handles.a;              % the loaded image data
    [BW, x, y] = myroipoly(I);

The above function returns the binary mask (BW) of the
selected region of interest and the (x, y) co-ordinates of
that mask. A region of interest (ROI) is a portion of an
image on which the user wants to filter or perform some other
operation. An ROI can be defined by creating
a binary mask, which is a binary image of the same
size as the image to be processed, with pixels that
define the ROI set to 1 and all other pixels set to 0. By
using the above callback function, the user can select more than
one ROI in an image.
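For readers outside MATLAB, the effect of a roipoly-style selection can be sketched in a few lines of Python. The polygon below is hypothetical; the sketch only illustrates how a binary mask of the same size as the image, with 1 inside the ROI and 0 elsewhere, can be built from polygon vertices.

```python
def point_in_polygon(x, y, poly):
    """Even-odd rule test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x co-ordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def roi_mask(height, width, poly):
    """Binary mask the same size as the image: 1 inside the ROI, 0 outside."""
    return [[1 if point_in_polygon(c + 0.5, r + 0.5, poly) else 0
             for c in range(width)] for r in range(height)]

# Hypothetical square ROI with corners (1,1)-(4,4) in a 6x6 image.
mask = roi_mask(6, 6, [(1, 1), (4, 1), (4, 4), (1, 4)])
```

Each pixel whose centre falls inside the polygon is set to 1, mirroring the BW mask returned by the callback.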
Fig.3. Initial view of the designed material coding toolbox using GUIDE
Delete_ID_Callback: this callback function lets the
user delete an erroneous region of interest together with its
related data, such as the (x, y) co-ordinates, ID value, and
material coding parameters. The following code is used to
delete the unwanted data,

    del_file = load(file_name);
    del_file.material.XY(ID_entry) = [];
    del_file.material.sigma(ID_entry) = [];
    del_file.material.ID(ID_entry) = [];

The poly2mask function is used to get back the region
of interest and its co-ordinates which the user wants to delete.

    mask = poly2mask(x, y, m, n);   % m-by-n mask from polygon (x, y)
Select_Materials_Callback: this callback function is
used to select the required terrain material corresponding to the
selected region of interest. The following code is used to
select the different materials in both the MMWR and FLIR cases
of material classification.

    function Select_Materials_Callback(hObject, eventdata, handles)
    v = get(handles.Select_Materials, 'Value');
    s = get(handles.Select_Materials, 'String');
    handles.sig = importdata('Phong.txt');
    switch s{v}
        case 'Asphalt/Bitumen'
            % parameters for asphalt/bitumen
        case ' '
            % remaining material cases (elided in the original)
    end
Save_Callback: this callback function is used to save
the material database for both MMWR and FLIR. The material
database is saved in a structure along with the number of IDs,
the (x, y) co-ordinates of each ID, and, in the sigma field,
either the NRCS calculation parameters (MMWR case) or the
IR material classification parameters (FLIR case) for
each selected ID. It is implemented as follows.

    function Save_Callback(hObject, eventdata, handles)
    material.sigma = handles.mycm;
    material.mdb = handles.matdb;
    material.ID = handles.myid;
    material.XY = handles.Coordinates;
    handles.material = material;
    [file, path] = uiputfile( ...
        {'*.m;*.fig;*.mat;*.mdl', 'MATLAB Files (*.m,*.mat)'; ...
         '*.m', 'program files (*.m)'; ...
         '*.mat', 'MAT-files (*.mat)'; ...
         '*.*', 'All Files (*.*)'}, 'Save as', ...
        '..\project\MMWR_DataBase\');
    save([path, file], 'material');
    guidata(hObject, handles)
Operating procedure of the Material Classification Toolbox / GUI Model
The procedural steps for developing the database of
both MMWR and FLIR, using the toolbox designed with
the help of the callbacks discussed above, are described as
follows.
Step 1:
Run the program file
(Material_Coding_Tool_Box.m) on the MATLAB
editor platform.
Step 2:
A GUI model appears on the screen. To start
material classification, click on the 'Load Image' button.
It opens the folder where the images are stored; select
the required image by double clicking on it. The
selected image is displayed on the main axes.
Step 3:
MMWR case: Move to the MMWR panel, click on
the 'Select Materials' popup menu and select the
required material type.
FLIR case: Move to the FLIR panel and enter the
Thickness, Slope and Surface Azimuth values of the
required material in the respective editor windows. If the
material classification is independent of these three
parameters, enter zero. Then click on the 'Select
Materials' popup menu and select the required material
type.
Step 4:
Click on the 'Select ROI' button; it works on the
graphical input method. A '+' cursor appears on the main axes
where the required image was loaded, and it moves with the mouse
pointer. Move this cursor over the desired material type
in the image and mark the points on its boundary. Once the
last point meets the first point, the '+' cursor changes shape.
After the boundary shape is complete, finish the region of
interest by double clicking on it. The boundary turns red with
a number tag. The number tag is the count of the number of ROIs
created, and it is placed at the first co-ordinate of the
selected region.
Step 5:
Repeat step 3 and step 4 for the different materials
in the same image. After all the regions of interest have
been selected properly in an image tile, click on the
'Interpolate' button. It assigns the un-coded region as
Green grass and completes the database of the given
image tile.
Step 6:
Click on the 'Save' button to save the complete
material coded database of that image in '.mat' format.
The database file name should be the same as the name
of the image file loaded on the main axes.
Step 7:
If the user wishes to see the coded regions, click
on the 'Update' button. It displays the coded regions on
the auxiliary axes of the respective image, and also shows
how many pixels are coded and how many are left out, in
the slot given in the toolbox.
Step 8:
Suppose a region of interest is erroneously
coded, or the user is not satisfied with the coded region;
in this case the user can delete that particular region. First
select whether it is in FLIR mode or MMWR mode by
clicking on the radio buttons (by default MMWR mode is
taken), then move to the 'Enter ID Value' panel,
enter the number tag of the region to be deleted in
the editor window, and click on the 'Delete ID' button.
This clears all the data related to that particular region.
Step 9:
Suppose the user stopped the material classification
at some stage and wants to continue with the
previously coded data of the same image. To continue
the material classification of the previous session, the
previously selected regions of interest must be loaded,
in order to avoid repetitive coding. This is done as
follows: first select the appropriate mode (MMWR or
FLIR, by clicking the radio button), then click on the
'Plot ID' button. This plots all the previously selected
regions of the image on the main axes and loads the
previously coded data into the workspace.
Step 10:
If the user wants to see the coded image with
respect to the different parameters considered for
material classification of both MMWR and FLIR, first
select MMWR or FLIR mode, then move to the View Coded
Images panel.
MMWR case: Enter the grazing angle at which the
user wants to see the image and click on the 'MMWR Image'
button. The coded image with respect to the NRCS and
grazing angle is displayed on the auxiliary axes.
FLIR case: Click on the 'Select Parameters'
popup menu and select the required parameter for which the
user wants to see the coded version of the original image
tile.
Step 11:
To see the Phong model plots for different
terrain materials, move to the Phong Model Plots panel,
click on the 'Terrain objects' popup menu and select the
required material.
Step 12:
To close the GUI, click on the 'Close' button.
Phong-like lighting backscatter (NRCS) generator
model implementation for MMWR
Radar simulation involves the computation of a
radar response based on the terrain's normalized radar
cross section. MMWR imaging depends on the
backscattering coefficient. The amount of radiated
energy is proportional to the target size, orientation,
physical shape and material, which are all lumped
together in one target-specific parameter called Radar
Cross Section (RCS), denoted by "σ". The radar cross
section is defined as the ratio of the power reflected back
to the radar to the power density incident on the target:

    σ = Pr / Pd   (m²)

where Pd is the power delivered to the radar signal
processor by the antenna. Radar simulation involves the
computation of a radar response based on the terrain's
normalized radar cross-section (NRCS). To compute the
NRCS for different types of terrain objects, a well-known
model called the Phong-like lighting model is used.
Phong lighting is an
empirically derived BRDF model for the computation of
optical reflections [5]. The method is very popular in
computer graphics and is broadly supported by different
software and hardware platforms. Although the Phong
lighting model is not physically correct since it does not
obey all the laws of physics involved, it has easily
interpretable parameters, which may explain its
popularity. Using the Phong model, the mean normalized
radar cross section is computed as

    σ0(θ) = a·sin(θ) + b·sin^c(θ)

where θ is the grazing angle and a, b and c are the model
parameters: a controls the amount of diffuse reflection of
a material, b is the specular reflection coefficient, and c
is the specularity, that is, the sharpness of the directional
highlight for a material (see [5] for more details about
terrain types and the a, b and c values).
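As an illustration, the model can be evaluated directly. The sketch below assumes the reconstructed form σ0(θ) = a·sin(θ) + b·sin^c(θ) and uses hypothetical parameter values; the real per-material values are tabulated in [5].

```python
import math

def nrcs_phong(theta_deg, a, b, c):
    """Mean NRCS from the Phong-like model:
    sigma0(theta) = a*sin(theta) + b*sin(theta)**c, theta = grazing angle."""
    s = math.sin(math.radians(theta_deg))
    return a * s + b * s ** c

# Hypothetical parameters: a = diffuse amount, b = specular coefficient,
# c = specularity (sharpness of the directional highlight).
a, b, c = 0.1, 0.9, 20.0
table = {g: nrcs_phong(g, a, b, c) for g in (10, 45, 90)}
```

With a large specularity c, the specular term only contributes near θ = 90°, which is why the toolbox asks for the grazing angle before rendering an MMWR-coded image.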
III. SIMULATION RESULTS
Toolbox Views
In this section, views of the designed toolbox at
different stages of material classification, and Phong
model plots of different materials, are presented in
figures 4, 5 and 6. The parameters used for developing
the database of both MMWR and FLIR were obtained
through a literature survey. In the MMWR case, the toolbox
lets the user see the coded image with respect to the
grazing angle, whereas FLIR coded images can be seen
with respect to different parameters.
Fig.4. Demo of selecting a ROI using toolbox
Fig.5. Toolbox after selecting all the ROIs of a given image tile and
updating the coded image on the auxiliary axes.
Fig.6. Phong model plots of different terrain materials implemented on
toolbox; (a) Tree (b) Green Grass (c) Brick/Urban areas (d) Wood
Results of Database Comparison
In this section, the databases obtained from other
methods and from the toolbox are compared. Fig. 9 shows the
comparison for the FLIR database and fig. 10 for the MMWR
database. Figures 7 to 11 also show the accuracy of the
database obtained using the toolbox.
Fig.7. Image of the material classified database developed using color coding
method
Fig.8. Image of the material classified database developed using our designed
toolbox
Fig.9. Comparison of FLIR Images generated using two different databases.
(a) Image generated using Database created outside the toolbox; (b) Image
generated using Database created by toolbox.
Fig.10. Comparison of MMWR Images generated using two different
databases. (a) Image generated using Database created outside the toolbox; (b)
Image generated using Database created by toolbox.
Fig.11. Comparison of MMWR Images generated using two databases
developed from color coding method and toolbox. (a) Original image (b)
Image generated using Database created using color coding methods; (c)
Image generated using Database created by toolbox.
IV. CONCLUSION
Weather penetrating sensors play a prominent
role in Enhanced Vision System functionalities. The
backscatter image generation capability of these sensors
depends heavily on the material database of the given
geographical area: the quality of the database is reflected
in the quality of the backscatter image from the sensor.
Hence, this paper described a toolbox implementation
that provides the facility for the user to develop the material
database for both HiVision millimeter wave radar and
forward looking infrared radar. The procedure for
developing the database using the toolbox was presented,
and the material database obtained from the designed
toolbox was compared with the one obtained from the
colour coding method. A comparison of the quality of the
databases in the simulator was also presented. However,
developing a quality material database still requires
considerable user attention.
ACKNOWLEDGMENT
This work was sponsored by National Aerospace
Laboratories (NAL), Bangalore. Programmatic support and
technical advice from the MSDF group of FMCD, NAL is
gratefully appreciated. Gratitude is also owed to the Director
of NAL, the Head of FMCD, and the Principal of REVA ITM,
Bangalore.
REFERENCES
[1] Maxime E. Bonjean, Fabian D. Lapierre, Jens Schiefele, and Jacques G. Verly, "Flight Simulator with IR and MMW Radar Image Generation Capabilities," Proc. of SPIE, vol. 6226, 62260A, April 2006.
[2] Bernd Korn, Hans-Ullrich Doehler, and Peter Hecker, Institute of Flight Guidance, "Weather Independent Flight Guidance: Analysis of MMW Radar Images for Approach and Landing," IEEE, 2000.
[3] B. Korn, H.-U. Doehler, and P. Hecker, "MMW radar data processing for enhanced vision," in J. G. Verly, editor, Enhanced and Synthetic Vision 1999, vol. 3691, pp. 29-38, SPIE, Apr. 1999.
[4] H.-U. Doehler, P. Hecker, and R. Rodolff, "Image data fusion for future Enhanced Vision Systems," Sensor Data Fusion and Integration of the Human Element, System Concepts and Integration (SCI) Symposium, Ottawa, 14-17 Sep. 1998, RTO Meeting Proceedings 12.
[5] Niklas Peinecke, Hans-Ullrich Doehler, and Bernd R. Korn, "Phong-like Lighting for MMW Radar Simulation," Millimeter Wave and Terahertz Sensors and Technology, edited by Keith A. Krapels and Neil A. Salmon, Proc. of SPIE, vol. 7117, 71170M, 2008.
[6] Patrick Marchand and O. Thomas Holland, Graphics and GUIs with MATLAB.
[7] Hunt, Lipsman, and Rosenberg, A Guide to MATLAB for Beginners and Experienced Users.
[8] "Build GUI in MATLAB," MATLAB learning guide, Mathworks.com.
Development and hardware implementation of QPSK
modulator with polyphase filters
R. Kannan, Seetha Rama Raju Sanapala
Reva Institute of Technology and Management, Bangalore ([email protected])
Reva Institute of Technology and Management, Bangalore ([email protected])
Abstract— In this paper, a QPSK modulator with
polyphase filters is presented and implemented on a
field programmable gate array (FPGA). All the
components, such as the data generator, IQ mapping,
polyphase filters, and NCO (Numerically Controlled
Oscillator), are realized as digital, discrete-time
components before feeding the DAC. The NCO is
developed using the ROM lookup table method. The
main motivation behind this work is to develop a
working model of a QPSK modulator with polyphase
filters suitable for application in data links for high
data rate applications. QPSK has the advantage of
requiring half the channel bandwidth of BPSK. Two
solutions were explored for the polyphase
implementation: the first is the direct realization of
the polyphase filters, and the second uses the
transposed polyphase design, which is a compromise
between area and speed.
Keywords— QPSK modulator, polyphase filters,
noble identities.
I. INTRODUCTION
Streaming real-time telemetry / video is an important
feature of modern Unmanned Aerial Vehicles (UAVs)
used for surveillance. Analog modulation schemes like
FM, which are generally used for telemetry and command
control, cannot be used for such high bandwidth data
links. Quadrature Phase Shift Keying (QPSK) digital
modulation and its variants provide the best compromise
between the bandwidth requirements and the power
available onboard a UAV for transmission. In the proposed
design, a polyphase filter is used in the QPSK modulator
rather than a conventional filter; the polyphase approach
exploits the computational efficiency of FIR filters.
This paper is organized as follows. Section II gives a
general overview of the interpolating filter and the noble
identities used in the current filter implementation.
Section III presents the polyphase filter approach.
Section IV discusses the QPSK modulator, the application
of the polyphase filters in this design, and the difference
between the two implementations in terms of area and
speed; finally, conclusions are drawn.
INTERPOLATING FILTER
Fig. 1. Block diagram of interpolation filter.
The rate conversion is accomplished by increasing
the rate by a factor of L to the higher rate f'' = L × f,
so that T'' = T / L, where T is the input sampling period
and T'' is the output sampling period.
II. NOBLE IDENTITIES FOR
INTERPOLATION
Fig. 2. Noble identity for upsampling: x(n) → [↑L] → [H(z^L)] → y(m)
is equivalent to x(n) → [H(z)] → [↑L] → y(m).
This noble identity is a very useful result in
filter theory. Noble identities are also
applicable to the decimator. Interpolation
increases the sampling rate, while the
decimator does the opposite, i.e. decreases the
sampling rate. The noble identities for the
decimator case are different from those for the
interpolation case. Upsampling a sequence x(n) creates a
new sequence in which every Lth sample is taken
from x(n), with all the others zero. The upsampled
sequence contains L replicas of the original signal's
spectrum. To restore the original spectrum, the
upsampler should be followed by a low-pass
filter with gain L and cutoff frequency π/L. In
this application, such an anti-aliasing filter is
referred to as an interpolation filter, and the
combined process of upsampling and filtering is
called interpolation.
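The upsample-then-filter chain described above can be sketched as follows, in Python, with a toy 3-tap linear-interpolation filter of gain L = 2; the actual filters in multirate systems are much longer.

```python
def upsample(x, L):
    """Insert L-1 zeros between consecutive samples of x."""
    y = []
    for s in x:
        y.append(s)
        y.extend([0.0] * (L - 1))
    return y

def fir(x, h):
    """Direct-form FIR filter: y(n) = sum_k h(k) * x(n - k)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(len(x))]

L = 2
x = [1.0, 2.0, 3.0]
# Toy low-pass interpolation filter with gain L (a linear interpolator).
h = [0.5, 1.0, 0.5]
y = fir(upsample(x, L), h)   # interpolated sequence at rate L*fx
```

Here the filter fills the inserted zeros with midpoints, so y is a linearly interpolated (and delayed) version of x at twice the rate.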
III. POLYPHASE FILTERS
Polyphase filters are used to reduce a large filtering
length (e.g. M samples) into a set of smaller filtering
lengths K (where K is defined as M/L, M being a multiple
of the integer L). Consider the interpolator. Since the
up-sampling process inserts L-1 zeros between two
consecutive samples, only K out of the M input values in
the FIR filter are non-zero. At any one time, these
non-zero values coincide with, and are multiplied by, the
filter coefficients h(0), h(L), ..., h(M-L). This gives the
polyphase unit sample responses as:

    p_k(n) = h(k + nL)   (1)
Fig2.Transforming a single stage interpolator into
parallel configuration
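Equation (1) can be checked numerically: filtering the input with each sub-filter p_k and interleaving the branch outputs with a commutator reproduces, sample for sample, the direct upsample-then-filter result. A Python sketch with hypothetical coefficients (M = 6, L = 2, so K = 3 taps per sub-filter):

```python
def polyphase_split(h, L):
    """Sub-filter impulse responses p_k(n) = h(k + n*L), k = 0..L-1."""
    return [h[k::L] for k in range(L)]

def fir(x, h):
    """Direct-form FIR filter, output truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(len(x))]

def interpolate_polyphase(x, h, L):
    """Filter x with every sub-filter, then interleave (commutate) outputs."""
    branches = [fir(x, p) for p in polyphase_split(h, L)]
    return [branches[n % L][n // L] for n in range(len(x) * L)]

def interpolate_direct(x, h, L):
    """Reference: upsample by L (insert zeros), then filter with h."""
    up = []
    for s in x:
        up.append(s)
        up.extend([0.0] * (L - 1))
    return fir(up, h)

x = [1.0, -2.0, 3.0, 0.5]             # hypothetical input samples
h = [0.25, 0.5, 1.0, 0.5, 0.25, 0.1]  # hypothetical M = 6 coefficients
L = 2
y_direct = interpolate_direct(x, h, L)
y_poly = interpolate_polyphase(x, h, L)   # matches y_direct sample for sample
```

The polyphase form never multiplies by the inserted zeros, which is exactly the computational saving the section describes.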
IV. QPSK MODULATOR
Fig .3 Block diagram of QPSK modulator.
Data Generator
The message bits are generated here: this block
produces random data of 1s and 0s at the
specified rate.
IQ mapping and serial to parallel converter
The serial data stream is converted to parallel data
streams. An appropriate mapping table converts the bits
from the two parallel arms into Imap and Qmap symbols.

    S_k(t) = A cos(2*pi*f_c*t + θ_k)   (3)

    S_k(t) = A cos θ_k cos(2*pi*f_c*t) - A sin θ_k sin(2*pi*f_c*t)   (4)

    S_k(t) = I_k cos(2*pi*f_c*t) - Q_k sin(2*pi*f_c*t)   (5)

where S_k(t) is the modulated waveform, A is the
output amplitude, f_c is the carrier frequency and
θ_k is the instantaneous phase. I_k = A cos θ_k
and Q_k = A sin θ_k are the in-phase and quadrature
components.
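Equations (3)-(5) and the serial-to-parallel mapping can be sketched as follows. The Gray-coded mapping table and the numeric values of f_c, f_s and samples per symbol are illustrative assumptions, since the paper does not list its exact mapping table.

```python
import math

# Hypothetical Gray-coded QPSK table: dibit -> (I, Q) = A*(cos, sin) of phase.
A = 1.0 / math.sqrt(2.0)
QPSK_MAP = {(0, 0): (A, A), (0, 1): (-A, A), (1, 1): (-A, -A), (1, 0): (A, -A)}

def qpsk_symbols(bits):
    """Serial-to-parallel: group bits in pairs, map each pair to (I_k, Q_k)."""
    pairs = list(zip(bits[0::2], bits[1::2]))
    return [QPSK_MAP[p] for p in pairs]

def modulate(symbols, fc, fs, sps):
    """s_k(t) = I_k*cos(2*pi*fc*t) - Q_k*sin(2*pi*fc*t), sampled at fs."""
    out = []
    for k, (i_k, q_k) in enumerate(symbols):
        for n in range(sps):
            t = (k * sps + n) / fs
            out.append(i_k * math.cos(2 * math.pi * fc * t)
                       - q_k * math.sin(2 * math.pi * fc * t))
    return out

waveform = modulate(qpsk_symbols([0, 0, 1, 1]), fc=20.0, fs=80.0, sps=4)
```

In the hardware design the cos/sin terms come from the NCO lookup tables and the baseband I/Q streams are shaped by the polyphase filters before this mixing step.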
Page 41
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
233
Polyphase Filtering
For the polyphase filter, the upsampling factor used is
32. The Direct Digital Synthesizer (DDS) produces a high
carrier frequency. Each cycle of the sine wave should have
at least 4 samples; therefore the sampling rate is 80 MBPS.
The filter coefficients have been generated using MATLAB
Simulink, and the number of coefficients has been limited
to 80, which is equivalent to 5 symbol durations. A Kaiser
window has been applied to the FIR filter for sidelobe
attenuation. The filter coefficients are quantized to 16 bits.
For the polyphase filter implementation on the Virtex-4
FPGA, two designs are used to meet speed and area
constraints. The first uses the direct realization of the
sample rate converter (L = 32).
Fig. 4. Block diagram of polyphase filter with commutator
(input x(n) at rate f_x, output y(n) at rate f_y = L·f_x).
Fig .5 Block diagram of Direct realization
polyphase filter (L=32).
Figure 5 above shows the implementation of the
sub-filters for the direct realization; each sub-filter has
3 coefficients. The second realization uses the sub-filter
structure shown in figure 6, obtained after adding a cut-set
by moving the delay line to the adder line. This has the
advantage of reducing the critical path of each sub-filter.
Fig .6 Block diagram of a sub filter with transposed
implementation.
The modulator was implemented in VHDL with
both polyphase designs on a Virtex-4 FPGA kit.
The Xilinx synthesis tool reported the resource
utilization of the entire implementation as described
below.
For Direct Realization

Parameter                  Used    Available on FPGA   Percentage
No. of slices              1539    15360               10%
No. of slice flip flops     438    30720                1%
No. of 4-input LUTs        2182    30720                7%
No. of DSP48s                 8      192                4%

For Transposed Polyphase Design

Parameter                  Used    Available on FPGA   Percentage
No. of slices              1629    15360               10%
No. of slice flip flops     524    30720                1%
No. of 4-input LUTs        2376    30720                7%
No. of DSP48s                 8      192                4%

Comparison of maximum frequency

Method                              Maximum frequency
Direct realization of polyphase     133.190 MHz
Transposed polyphase                186.519 MHz
HARDWARE RESULTS
Fig. 7. QPSK output using direct polyphase
realization.
Fig. 8. QPSK output using transposed polyphase
realization.
Analysis was done with the help of a Vector Signal
Analyser.
Fig 9. QPSK output using Direct Realization of
Polyphase filter tested by Vector Signal Analyser.
Fig 10. QPSK output using transposed Realization of
Polyphase filter tested by Vector Signal Analyser.
Conclusion
By implementing both polyphase filter designs, a
compromise between area and speed can be made, as the
above results show: the transposed design achieves a higher
maximum frequency at the cost of slightly more resources.
Distinguishing Identical Twins using their Facial Marks as Biometric
Signature
Avinash JL1, H K Chandrashekar1
1Dept. ECE, AIT-Chikmagalur, India-577102,
[email protected] , [email protected]
Abstract- Due to the high degree of correlation found in the
fingerprints, palm prints, irises and overall facial appearance
of identical twins, and because studies show that commercial
face recognition systems perform poorly at distinguishing
identical twins, another technique is needed to distinguish
monozygotic (identical) twins. The proposed work uses facial
marks as biometric signatures to distinguish identical twins,
based on the fact that, although identical twins may have the
same number of facial marks, those marks are organized at
different positions. The proposed work presents a multi-scale
facial mark detection process based on the Fast Radial
Symmetry Transform (FRST), which detects regions of interest
with high radial symmetry at different scales. Prominent facial
marks are then detected across the different scales. Finally, the
detected marks are used to distinguish the identical twins.
Key Words - Identical Twins, Face Recognition, Facial Marks,
Haar-like features, Fast Radial Symmetry Transform (FRST).
I. Introduction
There are two types of twins, dizygotic and monozygotic
twins. Dizygotic twins result from two different fertilized eggs
resulting in different Deoxyribo Nucleic Acid (DNA).
Monozygotic twins, also called identical twins are the results
of a single fertilized egg splitting into two individual cells and
developing into two individuals. Therefore, identical twins
have the same genetic expressions. The frequency of identical
twins is about 0.5% across different populations. Some
researchers believe that this is the performance limit of face
recognition systems to distinguishing identical twins.
The increase in twin births has created a requirement
for biometric systems to accurately determine the identity of a
person who has an identical twin. The discriminability of
some of the identical twin biometric traits, such as
fingerprints, iris, and palm prints, is supported by anatomy
and the formation process of the biometric Characteristic,
which state they are different even in identical twins due to a
number of random factors during the gestation period.
In spite of the fact that the biometric of identical twins is
affected by many factors, some of them such as facial features
are still very similar. Some identical twins share not only
similar facial features but also the same signatures. Confusion
over their identities has made it difficult for others to know
who owns what and who does what. As a result, some
identical twins partake in commercial scams such as
fraudulent insurance compensation. Most importantly, if one
of the identical twins commits a serious crime, their unclear identities cause confusion and uncertainty in court trials. The
ability to distinguish between identical twins based on
different biometric modalities such as face, iris, fingerprint,
etc., is a challenging and interesting problem in the biometric
area. Identical twins are formed when a zygote splits and
forms two embryos. They cannot be discriminated based on
DNA. Therefore, other biometric traits are needed to
distinguish between identical twins. Using face recognition to
differentiate between identical twins is very difficult, because
of the high degree of similarity in their overall facial
appearance. This project focuses on distinguishing between
monozygotic twins based on localized facial features known
as facial marks.
Traditionally, biometrics research has focused primarily on
developing robust characterizations and systems to deal with
challenges posed by variations in acquisition conditions and
the presence of noise in the acquired data. Only recently have
researchers started to look at the challenges involved in
dealing with the task of distinguishing identical twins.
Developing techniques and systems that improve twin face
recognition should also improve generic face recognition
systems. Although identical twins represent only 0.5% of the
global population, failure to correctly identify each twin has
led to problems for law enforcement agencies. There have
been several criminal cases in which either both or neither of
the identical twins was convicted due to the difficulty in
determining the correct identity of the perpetrator.
This project proposes to differentiate between
identical twins using facial marks alone. Facial marks are
Page 45
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
237
considered to be unique and inherent characteristics of an
individual. Although they are similar in appearance, they can
be distinguished using facial marks. High-resolution images
enable us to capture these finer details on the face. Facial
marks are defined as visible changes in the skin and they
differ in texture, shape and colour from the surrounding skin.
Facial marks appear at random positions on the face. By
extracting different facial mark features, this project aims to
differentiate between identical twins.
II. Proposed System
This work proposes a multi-scale facial mark detector based
on the fast radial symmetry transform (FRST). The transform
detects dark regions with high radial symmetry. An overview
of the proposed work is shown in Figure 1. Initially, Haar-like
features are used to detect the face and primary facial
features such as the eyes and lips. From the output of the
Haar detector, a mask is created (by setting those regions to
zero) to remove the primary facial features. Next, a 5-level
Gaussian pyramid is constructed and the FRST is applied to
each image in the pyramid to detect dark regions with radial
symmetry. Finally, the detections are tracked across scales.
Detected facial marks are characterized only by their
geometric location. Small rectangular windows are created
around the detected marks for texture comparison across the
detected facial marks. A vector set is created for the detected
marks and is then compared with all the vector sets in the
database; after comparison, the correct match is detected.
A brief study of Haar-like features, the Gaussian pyramid
and the FRST is needed to carry out the proposed work, and
they are discussed below.
Figure 1: Block Diagram of Proposed work
a) Haar-like features
Haar-like features are digital image features used in object
recognition. They owe their name to their intuitive similarity
to Haar wavelets, and they were used in the first real-time face
detector. A Haar-like feature considers adjacent rectangular
regions at a specific location in a detection window, sums up
the pixel intensities in each region, and calculates the
difference between them. This difference is then used to
categorize subsections of an image. For example, consider an
image database with human faces. It is a common
observation that, among all faces, the region of the eyes is
darker than the region of the cheeks. Therefore, a common
Haar feature for face detection is a set of two adjacent
rectangles that lie above the eye and the cheek regions. The
position of these rectangles is defined relative to a detection
window that acts like a bounding box for the target object.
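The eye/cheek two-rectangle feature described above is usually computed with an integral image, so that each rectangle sum costs only four array lookups. A Python sketch on a hypothetical toy patch:

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and cols < c (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, via four integral-image lookups."""
    return (ii[top + height][left + width] - ii[top][left + width]
            - ii[top + height][left] + ii[top][left])

def two_rect_haar(img, top, left, height, width):
    """Upper rectangle (eyes, darker) vs lower rectangle (cheeks, brighter)."""
    ii = integral_image(img)
    upper = rect_sum(ii, top, left, height, width)
    lower = rect_sum(ii, top + height, left, height, width)
    return lower - upper   # large value = dark band above a bright band

# Toy 4x4 patch: dark top half ("eyes"), bright bottom half ("cheeks").
patch = [[10, 10, 10, 10],
         [10, 10, 10, 10],
         [200, 200, 200, 200],
         [200, 200, 200, 200]]
feature = two_rect_haar(patch, 0, 0, 2, 4)
```

A detector thresholds such feature values over many window positions, which is how the face and primary features are localized here.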
Figure 2: Examples of Haar-like features
Using Haar-like features in the proposed work, the face
region in the image and the primary features in the detected
face region are found successfully.
Figure 3: Identical Twins
Figure 4: Face and primary features on the face detected
using Haar-like features
b) Gaussian pyramid
The objective is to detect facial marks that are stable across
different scales. This can be achieved by using a Gaussian
pyramid: each subsequent image in the pyramid is obtained
by smoothing the previous level and reducing its resolution.
In the proposed work, the Gaussian pyramid is constructed
for the image with the primary features masked, as shown below.
Figure 5: Primary Features are masked on Detected Face
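The blur-and-downsample construction can be sketched as follows. The paper does not state its pyramid parameters, so the 5-tap binomial kernel (a common Gaussian approximation) and three levels below are assumptions:

```python
import numpy as np

def gaussian_blur(img, kernel=(1, 4, 6, 4, 1)):
    """Separable 5-tap binomial approximation of a Gaussian filter."""
    k = np.asarray(kernel, dtype=float)
    k /= k.sum()
    # Filter rows, then columns; mode='same' keeps the image size.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, rows)

def gaussian_pyramid(img, levels=3):
    """Each level: blur, then drop every other row and column."""
    pyramid = [img]
    for _ in range(levels - 1):
        img = gaussian_blur(img)[::2, ::2]
        pyramid.append(img)
    return pyramid

pyr = gaussian_pyramid(np.random.rand(64, 64), levels=3)
print([p.shape for p in pyr])  # [(64, 64), (32, 32), (16, 16)]
```

A mark that survives as a local extremum across several levels is a candidate stable facial mark.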
c) Fast radial symmetry transform
The proposed work uses the fast radial symmetry transform (FRST) to detect the facial marks. The value of the transform at range n ∈ N indicates the contribution to radial symmetry of the gradients a distance n away from each point. While the transform can be calculated for a continuous set of ranges, this is generally unnecessary, as a small subset of ranges is normally sufficient to obtain a representative result. At each
range n an orientation projection image On and a magnitude
projection image Mn are formed. These images are generated
by examining the gradient g at each point p from which a
corresponding positively-affected pixel p+ve(p) and
negatively-affected pixel p−ve(p) are determined, as shown in
Figure 6. The positively-affected pixel is defined as the pixel
that the gradient vector g(p) is pointing to, a distance n away
from p, and the negatively-affected pixel is the pixel a
distance n away that the gradient is pointing directly away
from.
Figure 6: The locations of pixels p+ve(p) and p−ve(p) affected
by the gradient element g(p) for a range of n = 2. The dotted
circle shows all the pixels which can be affected by the
gradient at p for a range n.
The coordinates of the positively-affected pixel are given by

p+ve(p) = p + round( (g(p)/‖g(p)‖) n )

while those of the negatively-affected pixel are given by

p−ve(p) = p − round( (g(p)/‖g(p)‖) n )

where 'round' rounds each vector element to the nearest integer. The orientation and magnitude projection images are initially zero. For each pair of affected pixels, the point corresponding to p+ve in the orientation projection image On and in the magnitude projection image Mn is incremented by 1 and ‖g(p)‖ respectively, while the point corresponding to p−ve is decremented by the same quantities in each image.
That is
On(p+ve(p)) = On(p+ve(p)) + 1
On(p-ve(p)) = On(p-ve(p)) – 1
Mn(p+ve(p)) = Mn(p+ve(p)) + ||g(p)||
Mn(p-ve(p)) = Mn(p-ve(p)) - ||g(p)||
The radial symmetry contribution at range n is defined as the convolution

Sn = Fn * An

where

Fn(p) = (‖Õn(p)‖)^α · M̃n(p),

Õn(p) = On(p) / max_p ‖On(p)‖
M̃n(p) = Mn(p) / max_p ‖Mn(p)‖

where α is the radial strictness parameter and An is a two-dimensional Gaussian. The full transform is defined as the sum of the symmetry contributions over all the ranges considered:

S = Σ_{n∈N} Sn
If the gradient is calculated so that it points from dark to light, the output image S will have positive values corresponding to bright radially symmetric regions and negative values indicating dark symmetric regions.
In the proposed work, FRST is applied to each image in the Gaussian image pyramid; the resulting symmetry map of an image is shown below.
Figure 7: FRST is applied to each image in the Gaussian
image pyramid
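The projection and combination steps above can be sketched for a single radius n as follows. This is an illustrative NumPy version, not the paper's code; it omits the gradient-count clamp and the final convolution with the Gaussian An:

```python
import numpy as np

def frst_single_radius(img, n, alpha=2.0):
    """One-radius FRST sketch: accumulate the orientation (On) and
    magnitude (Mn) projection images, then combine them as
    Fn = (|On| / max|On|)**alpha * (Mn / max|Mn|)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    On = np.zeros(img.shape)
    Mn = np.zeros(img.shape)
    rows, cols = img.shape
    for y, x in zip(*np.nonzero(mag > 1e-9)):
        dy, dx = gy[y, x] / mag[y, x], gx[y, x] / mag[y, x]
        # p+ve lies n pixels along the gradient, p-ve n pixels against it.
        for sign in (+1, -1):
            py = y + sign * int(round(dy * n))
            px = x + sign * int(round(dx * n))
            if 0 <= py < rows and 0 <= px < cols:
                On[py, px] += sign
                Mn[py, px] += sign * mag[y, x]
    On_norm = (np.abs(On) / max(np.abs(On).max(), 1e-9)) ** alpha
    return On_norm * (Mn / max(np.abs(Mn).max(), 1e-9))

# A dark 5x5 spot on a bright background: the gradient points from dark
# to light, so the transform goes negative at the spot's centre.
img = np.full((21, 21), 200.0)
img[8:13, 8:13] = 50.0
S = frst_single_radius(img, n=3)
print(S[10, 10] < 0)  # True
```

A dark facial mark therefore shows up as a strong negative response at its centre, which is what the mark detector looks for at each pyramid level.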
d) Flow diagram of proposed work
III. Discussion of results
The proposed work detects the facial marks by applying FRST to each image in a Gaussian image pyramid constructed from the primary-features-masked version of the input image. Once marks are detected, a small rectangular window is created around each of them, a vector set is created for the detected marks, and the vector sets are compared in order to identify the correct person among the identical twins.
IV. Applications
This proposal can be employed by forensics and security departments to resolve criminal cases involving identical twins, and by insurance companies to prevent one twin from claiming insurance in the other's name.
V. Conclusion
The proposed work considers facial marks as biometric signatures to distinguish between identical twins. It is an efficient technique in cases where existing techniques such as fingerprint detection, palm print detection, iris recognition and manual annotation of facial marks fail to distinguish identical twins reliably.
The future scope of the proposed work is to adopt facial features and face texture along with facial marks to distinguish identical twins.
Better and Clear Explanation of the Technique of Linear Convolution of Long Sequences by Sectioning
Seetha Rama Raju Sanapala
#ECE Department, Reva Institute of Technology and Management, Bangalore – 560 064.
*[email protected] , [email protected]
Abstract— Convolution (LTI filtering) and correlation are the
fundamental processes in digital signal processing. As such they
are ubiquitous in every DSP area and every DSP
implementation. The computational requirement of convolution is O(N²). Hence, the required number of computations increases phenomenally with increasing order/length (N) of the convolution. The strategy followed in computing the convolution of long sequences is 'divide and conquer', i.e. to break the long
sequences into small sequences, find the convolution of the small
sequences and use these results formed from the small sequences
to build the convolution of the original long sequences – similar
to what is done in the FFT algorithm for the computation of
DFT. This method is called 'convolution by sectioning' and is covered by many standard DSP books through complex mathematics, arrays of symbols, notation and DSP jargon such as time aliasing. It is the observation of the author that even top-notch students do not have a good understanding of this topic and only remember the method as a trick of the trade. This need not be the case: just like everything in any branch of science, this method too has its rationale and reasoning.
Since correlation is nothing but convolution with one of the
sequences time-reversed, the above technique can also be applied
to correlation computation – basically by converting the
correlation problem into the convolution problem. This paper
presents a simple and straight way of explanation of this method.
More importantly, this paper demonstrates how all of the DSP
concepts can be explained in simple terms without compromising
on the rigour and precision. This method follows ‗teach-by-
example‘ and ‗expand with precision and rigor‘. Once the
overall idea of the method is well understood by the student
through the example, the methods of formalism can be used to
bring the student face to face with the language, symbols,
conventions and tools of the topic to make him comfortable with
the mainstream DSP literature. This paper in essence, and in
general, is about better teaching - taking the convolution of long
sequences simply as an example.
Keywords— Signal processing, teaching, education, convolution,
correlation, convolution by sectioning, overlap-add and overlap-
save, overlap-discard, overlap-drop.
I. CONVENTIONS & ORGANISATION
In this paper the sequences (i.e. discrete time signals) are
referred by their names in bold font like x and h – without the
sample/index number n in brackets. If we want to refer to the
nth sample of the sequence x we represent it by x(n) in regular
font. LC and * denote linear convolution in functional form and symbolic form respectively. Similarly, CC and ⊛ denote circular convolution in functional and symbolic forms. Frequently occurring phrases and names are abbreviated in brackets at their first occurrence and used thenceforth by abbreviation only. Equations are named (E1, E2, and so on) for back reference.
Convolution of long sequences by sectioning (COLS) is divided into two techniques, namely (i) the overlap and add (OA) method and (ii) the overlap and save method – which is more meaningfully called the overlap and drop (OD) method. These methods are usually explained along with the technique of calculating linear convolution (LC) through circular convolution (CC), though the two – convolution by sectioning (COLS) and computation of LC through CC (L3C) – are independent. COLS does not require that we use CC to
compute LC. But CC is usually used to compute LC because
CC can be computed through DFT and the DFT has a very
efficient algorithm in the form of FFT.
In section II, the first method of COLS i.e. overlap and add
(OA) will be explained in detail using an approach aimed at
providing a better understanding of the OA method to the
student. We do not use any figure; the reasoning given makes the steps amply clear even without the aid of figures. Also, a figure without proper explanation is as good as absent.
In section III, the overlap and drop (OD) method will be explained briefly, as its reasoning can be developed along similar lines to that of OA.
The limitations of the direct computation of linear convolution (DLC), i.e. by the convolution sum formula, when one of the sequences (usually the input to the filter, x) is very long compared to the other sequence (usually the impulse response h of an FIR filter), have been well explained in many standard books [1-2] on this subject. In this situation of widely different sequence lengths, instead of waiting until all the input samples are acquired, we can start working with blocks of input data – i.e. by sectioning the input data. Let the length of the impulse response be l and that of the input sequence be L, with L ≫ l. Let the section length of the
input sequence be m. We would like to work on the impulse response h of length l and on sequences of length m, i.e. the xi that are the sections of the input sequence x. Our ultimate objective is to compute the output of the filter y = LC(h,x) = h * x by using LC(h,xi) – the linear convolution of h and xi – where LC and * denote linear convolution as explained in the conventions paragraph above. To make things less abstract and to concretise the ideas in the mind of the student, we choose sequences with the following parameters and explain the logic and the steps of the procedure:

l = 2, L = 15, m = 5.
II. OVERLAP AND ADD METHOD
This method can be explained clearly and simply using the property that the linear convolution of x and h can be obtained by the multiplication of the polynomials x(p) and h(p), where x(p) and h(p) are polynomials formed from the sequences x and h, with the sample values of the corresponding sequence forming the polynomial coefficients and the index of each sample giving the power of p, i.e.

z(p) = z(0) + z(1)p + z(2)p² + …

For the parameters we have considered, the following equations apply:

x(p) = x(0) + x(1)p + x(2)p² + … + x(14)p¹⁴ (E1)

h(p) = h(0) + h(1)p (E2)
In summary, y = h * x can be obtained from the
multiplication of x(p) and h(p) and this is true for all finite
sequences h and x– not just for the x and h under
consideration. Since y = h * x, we can say that conversely,
the samples of y can be obtained from the coefficients of the
polynomial y(p) = h(p)x(p).
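This equivalence between polynomial multiplication and linear convolution is easy to check numerically; in the NumPy sketch below the sequences are chosen arbitrarily:

```python
import numpy as np

# Multiplying the polynomials h(p) and x(p) gives the same coefficients
# as the linear convolution y = h * x.
h = [3.0, 1.0]            # h(p) = 3 + p
x = [2.0, 0.0, 5.0]       # x(p) = 2 + 5p^2

# np.polymul expects the highest power first, so reverse in and out.
poly_product = np.polymul(h[::-1], x[::-1])[::-1]

print(poly_product)        # coefficients 6, 2, 15, 5 of y(p)
print(np.convolve(h, x))   # identical: (3+p)(2+5p^2) = 6 + 2p + 15p^2 + 5p^3
```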
Now we can note that, from (E2),

y(p) = x(p)h(p) = h(0)x(p) + h(1)x(p)p (E3)

= h(0)x(0) + h(0)x(1)p + h(0)x(2)p² + h(0)x(3)p³ + … + h(0)x(14)p¹⁴
+ [h(1)x(0)p + h(1)x(1)p² + h(1)x(2)p³ + … + h(1)x(14)p¹⁵]

= h(0)x(0) + [h(0)x(1)+h(1)x(0)]p + [h(0)x(2)+h(1)x(1)]p²
+ [h(0)x(3)+h(1)x(2)]p³ + … + [h(0)x(14)+h(1)x(13)]p¹⁴ + [h(1)x(14)]p¹⁵
You can observe that

y(0) = h(0)x(0) = coefficient of the p⁰ term in y(p)
y(1) = h(0)x(1) + h(1)x(0) = coefficient of the p¹ term in y(p)
y(2) = h(0)x(2) + h(1)x(1) = coefficient of the p² term in y(p)
y(3) = h(0)x(3) + h(1)x(2) = coefficient of the p³ term in y(p)
⋮
y(14) = h(0)x(14) + h(1)x(13) = coefficient of the p¹⁴ term in y(p)
y(15) = h(1)x(14) = coefficient of the p¹⁵ term in y(p).
Now let us decompose x(p) into a sum of polynomials as given below:

x(p) = x1(p) + p⁵x2(p) + p¹⁰x3(p) (E4)

where

x1(p) = x(0) + x(1)p + x(2)p² + x(3)p³ + x(4)p⁴
x2(p) = x(5) + x(6)p + x(7)p² + x(8)p³ + x(9)p⁴ and
x3(p) = x(10) + x(11)p + x(12)p² + x(13)p³ + x(14)p⁴
From the above it is clear that

h(p)x(p) = h(p)[x1(p) + p⁵x2(p) + p¹⁰x3(p)] (E5)
= h(p)x1(p) + p⁵h(p)x2(p) + p¹⁰h(p)x3(p) (E6)
Now consider

h(p)x1(p) = polynomial of the convolution of h and x1
= [h(0) + h(1)p][x(0) + x(1)p + x(2)p² + x(3)p³ + x(4)p⁴]
= y1(p) (say)

h(p)x2(p) = polynomial of the convolution of h and x2
= [h(0) + h(1)p][x(5) + x(6)p + x(7)p² + x(8)p³ + x(9)p⁴]
= y2(p) (say)

h(p)x3(p) = polynomial of the convolution of h and x3
= [h(0) + h(1)p][x(10) + x(11)p + x(12)p² + x(13)p³ + x(14)p⁴]
= y3(p) (say)
Therefore,

y(p) = h(p)x(p) = h(p)[x1(p) + p⁵x2(p) + p¹⁰x3(p)] (E7)
= h(p)x1(p) + p⁵h(p)x2(p) + p¹⁰h(p)x3(p)
= y1(p) + p⁵y2(p) + p¹⁰y3(p) (E8)

y(p) contains the sequence y, with y(i) coming from the coefficient of the term pⁱ in y(p). Summarising, we can obtain the output samples of the filter – the sequence y – (i) by DLC of h and x, using the convolution sum; (ii) by computing y(p) through polynomial multiplication of x(p) and h(p) from E3; or (iii) by computing the same through E7.
We can observe that y1(p) is the product polynomial of h(p)
and x1(p). Hence it can be obtained from LC(h,x1) and vice
versa. Similar statements hold for y2(p) and y3(p). We also
note that y(p) is not the direct sum of y1(p), y2(p) and y3(p).
We multiply y2(p) by p⁵ and y3(p) by p¹⁰ before the addition. Let us see the effect of these multiplications.
y1(p) is the product polynomial of h(p) – a first-degree polynomial – and x1(p) – a fourth-degree polynomial. As a result it will be a fifth-degree polynomial, so it contributes to the 0th to 5th samples of y.
y2(p) is the product polynomial of h(p) – a first-degree polynomial – and x2(p) – a fourth-degree polynomial. As a result it will be a fifth-degree polynomial. Before the addition, however, as given in E8, it is multiplied by p⁵ to get y(p). After this multiplication, the resulting polynomial contains powers of p from 5 to 10, so it contributes to the 5th to 10th samples of y.
y3(p) is the product polynomial of h(p) – a first-degree polynomial – and x3(p) – a fourth-degree polynomial. As a result it will be a fifth-degree polynomial. Before the addition, however, as given in E8, it is multiplied by p¹⁰ to get y(p). After this multiplication, the resulting polynomial contains powers of p from 10 to 15, so it contributes to the 10th to 15th samples of y.
Therefore, we can summarise the steps of the OA algorithm as follows.
1) Section the input of 15 samples into 3 (= 15/5 = L/m) non-overlapping sequences of 5 samples each: x1 from samples 0 to 4, x2 from samples 5 to 9 and x3 from samples 10 to 14. The input is sectioned like this because we fixed m = 5.
2) Find y1 = LC(h,x1), y2 = LC(h,x2) and y3 = LC(h,x3).
3) Shift y2 by 5 samples and y3 by 10 samples. Thus the range of the indices of the shifted y2 is 5 to 10 and that of the shifted y3 is 10 to 15.
4) Now add y1 and the shifted versions of y2 and y3 by adding the sample values with the same index.
5) You get samples with indices from 0 to 15. This is LC(h,x) – our original objective.
The reader can easily observe that in our example h contained only 2 samples, x contained only 15 samples and each section of the input contained only 5 samples because we initially chose l = 2, L = 15 and m = 5. These parameters were chosen to concretise the idea; l, L and m can be chosen in a quite general manner, and the arguments extend to the general case quite easily. This is what is explained in the standard books.
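The five steps above, with the example parameters l = 2, L = 15, m = 5, can be sketched in a few lines of NumPy:

```python
import numpy as np

# The worked example: l = 2, L = 15, m = 5.
h = np.array([1.0, 2.0])                # impulse response, length l = 2
x = np.arange(15, dtype=float)          # input, length L = 15
m = 5                                   # section length

# Step 1: non-overlapping sections x1, x2, x3.
sections = [x[i:i + m] for i in range(0, len(x), m)]

# Steps 2-4: convolve each section with h, shift by its offset, add.
y = np.zeros(len(x) + len(h) - 1)
for i, xi in enumerate(sections):
    yi = np.convolve(h, xi)             # length m + l - 1 = 6
    y[i * m : i * m + len(yi)] += yi    # shift by 0, 5, 10 and add

assert np.allclose(y, np.convolve(h, x))  # matches direct LC(h, x)
print(len(y))  # 16 samples, indices 0..15
```

Note how adjacent partial outputs overlap by l − 1 = 1 sample before the addition, which is exactly the overlap the method's name refers to.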
Now the question remains about step 2: the computation of the three LCs – y1 = LC(h,x1), y2 = LC(h,x2) and y3 = LC(h,x3). As we know, the LCs can be computed simply as multiplications of the corresponding polynomials. We can use that approach. Or, better still, we can compute the LCs through CCs of the sequences modified in a certain way, as explained in the next paragraph. The advantage of this approach is that CCs can be obtained much more efficiently via the FFT route. So we can use L3C to obtain the LCs of step 2. This step is generally combined in the books as part of COLS, but it is quite independent and can be used wherever LC computation is required.
We explain L3C below. Consider the case where we want to find the LC of two sequences a and b of lengths p and q, i.e.

a = a(0), a(1), …, a(p−1)
b = b(0), b(1), …, b(q−1)

Let c = LC(a,b). We want to compute c = LC(a,b) = a * b.
c can be obtained by the direct convolution sum of a and b, or from the product polynomial c(p) = a(p)b(p). a(p) is a polynomial of degree p−1 and b(p) is a polynomial of degree q−1, so the product polynomial c(p) is a polynomial of degree p+q−2 containing the p+q−1 coefficients, or samples, of c.

Let d = CC(ā,b̄) = ā ⊛ b̄, where

ā = a(0), a(1), …, a(p−1), followed by (q−1) zeros
b̄ = b(0), b(1), …, b(q−1), followed by (p−1) zeros

Thus both ā and b̄ are sequences of length p+q−1, obtained from a and b respectively by padding the appropriate number of zeros. We can show that for all finite a and b, c(p) = d(p); hence c can be obtained from d. To compute c, we resort to computing d because we can use –
(i) the circular convolution property of DFT and
(ii) the efficiency of FFT to compute DFT
The FFT is nothing but an algorithm that computes the DFT efficiently, reducing the number of multiplications from O(N²) down to O(N log₂ N). The circular convolution property states that

d = CC(ā,b̄) = ā ⊛ b̄ = IDFT(DFT(ā) × DFT(b̄)).

FFTs are used to compute the DFT and IDFT to reduce the computational operations.
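The L3C recipe above – pad both sequences to length p + q − 1, take DFTs, multiply, take the IDFT – can be sketched with NumPy's FFT, whose second argument performs the zero-padding:

```python
import numpy as np

def lc_via_cc(a, b):
    """Linear convolution via circular convolution: zero-pad both
    sequences to length p + q - 1, then use the DFT convolution
    property, computed with the FFT."""
    n = len(a) + len(b) - 1
    A = np.fft.fft(a, n)      # np.fft.fft(x, n) zero-pads x to length n
    B = np.fft.fft(b, n)
    return np.fft.ifft(A * B).real

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0]
print(lc_via_cc(a, b))        # [ 4. 13. 22. 15.]
```

The result matches np.convolve(a, b); for long sequences the FFT route is the cheaper of the two.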
III. OVERLAP AND DROP METHOD
The second method of COLS is the OD method. The logic of this technique can be explained along similar lines to the OA method given above.
In the OA method the input sections are non-overlapping but the partial output sections overlap – in the particular case we have considered, both y1 and y2 contribute to the 5th sample of y, for example – and the final output sequence is obtained by adding the partially overlapping output sections.
In the OD method the input sections overlap, and we compute the partial output sections by finding the LC of h and the input sections. We discard some samples from the beginning of each partial output section and form the final output sequence by arranging the resulting partial output sections contiguously, i.e. without overlapping.
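A minimal NumPy sketch of the OD method, under the same example parameters as before. In a streaming setting the final tail sample of y would come from one extra zero-padded input block, which is omitted here for brevity:

```python
import numpy as np

def overlap_drop(h, x, m):
    """Overlap-and-drop (overlap-save) sketch: input sections overlap
    by l - 1 samples; the first l - 1 samples of each partial output
    are dropped and the remainders are placed contiguously."""
    l = len(h)
    x_pad = np.concatenate([np.zeros(l - 1), x])   # prepend l-1 zeros
    y = []
    for start in range(0, len(x), m):
        block = x_pad[start : start + m + l - 1]   # overlapping section
        yi = np.convolve(h, block)
        y.extend(yi[l - 1 : l - 1 + m])            # drop the first l-1 samples
    return np.array(y)

h = np.array([1.0, 2.0])
x = np.arange(15, dtype=float)
y = overlap_drop(h, x, m=5)
print(np.allclose(y, np.convolve(h, x)[: len(y)]))  # True
```

Each kept sample is a steady-state output value, so no addition of overlapping outputs is needed – the dropped samples are precisely the ones contaminated by the section boundary.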
IV. CONCLUSIONS
In this paper, we have shown an alternative and better approach to DSP teaching – teaching by example and then expanding for rigour and precision – and have used the technique of COLS to demonstrate it. This method, which gives a clear understanding of the topic, should be followed by a formal presentation of the topic in the standard way, through mathematical and more formal reasoning.
REFERENCES
[1] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, 3rd ed., Prentice Hall Signal Processing Series.
[2] Monson H. Hayes, Schaum's Outlines of Theory and Problems of Digital Signal Processing, McGraw-Hill, 1999.
Abstract—In the field of image processing there are many techniques to detect edges. Some of the methods are the first derivative method, the second derivative method, Sobel, and the Laplacian of Gaussian (LoG), and these methods are used for image segmentation and object identification. In this paper we present a basic idea of how to detect edges using Simulink. Generally we detect edges of a given image using MATLAB by extracting its features, or with the Fuzzy Logic toolbox (artificial intelligence), but here we detect edges using Simulink and fuzzy logic. The main aim is to reduce the number of bits used to represent the pixel values. Experimentation is performed on a grayscale image in Simulink, MATLAB and fuzzy logic using the first derivative, second derivative and Sobel operators. The results are compared and plotted. The pixel values of the output image are reduced and encrypted.
Index Terms—Edge detection, first and second derivative,
Sobel, Simulink, Fuzzy logic, PSNR.
I. Introduction
Edge detection is an important field in image processing.
It can be used in many applications such as
segmentation, registration, feature extraction, and
identification of objects in a scene. An effective edge
detector reduces a large amount of data but still keeps
most of the important features of the image. Edge
detection refers to the process of locating sharp
discontinuities in an image. These discontinuities
originate from different scene features such as
discontinuities in depth, discontinuities in surface
orientation, and changes in material properties and
variations in scene illumination as stated in [6].
The main principle in edge detection is to analyse the pixel value of each cell and decide whether there is an edge. Edges can be modeled based on intensity profiles [1] as follows.
A) Ramp edge
Ramp edge is modeled as a gradual increase in image
amplitude from low to high level, or vice versa. The
edge is characterized by its height, slope angle and
horizontal coordinate of the slope midpoint.
B) Step edge
If the slope angle of the ramp edge equals 90 degrees the
resultant edge is called a step edge. In the digital
imaging system, step edges exist only for artificially
generated images such as test patterns and bi-level
graphics data. There is a sudden change in the pixel
value from high to low or vice-versa.
C) Line edge
A line edge is a combination of two ramp edges. The range is divided into two parts: the pixel values increase linearly in the first part and decrease linearly in the second.
D) Roof edge
In the limit, as the width of a line edge approaches zero, the resulting amplitude discontinuity is called a roof edge.
Edge Detection Using Feature Extraction
Pericherla S. K. Rohit Varma and R. Rohit, B.E III year, ECE, SDMCET, Karnataka, India
[email protected] , [email protected]
Edges are detected in four steps: smoothing, enhancement, detection and localization. Smoothing suppresses as much noise as possible without destroying the true edges. Enhancement involves applying a filter to improve the quality of the edges in the image. Detection determines which edge pixels should be discarded as noise and which should be retained (usually, thresholding provides the criterion used for detection). Localization determines the exact location of an edge.
II. Edge Detection Methods
a. First derivative principle
An edge in a continuous-domain edge segment F(x,y) can be detected by forming the continuous one-dimensional gradient G(x,y). If the gradient is sufficiently large, i.e. above some threshold value, an edge is deemed present. The classical methods for edge detection are based on the first derivative and second derivative principles. The operators used to carry out the differentiation are called gradient operators.
In the discrete domain, in terms of a row gradient G_R(j,k) and a column gradient G_C(j,k), the spatial gradient amplitude is given [2] as:

G(j,k) = [G_R(j,k)² + G_C(j,k)²]^(1/2) --eq 1

For computational efficiency, the gradient amplitude is sometimes approximated by the magnitude combination

G(j,k) = |G_R(j,k)| + |G_C(j,k)| --eq 2

Using Newton's forward and backward difference methods for calculating the derivatives of the pixels, the row gradient is given by eq 3 and the column gradient by eq 4:

G_R(j,k) = F(j,k) − F(j,k−1) --eq 3

G_C(j,k) = F(j,k) − F(j−1,k) --eq 4
If the first derivative method is applied once again to the output of the differentiation, we get the second derivative results.
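The paper carries out these steps in MATLAB/Simulink; as a language-neutral illustration, here is a NumPy sketch of the first-derivative detector using backward differences and the eq-2 magnitude combination. The test image and threshold are arbitrary:

```python
import numpy as np

def gradient_edges(F, threshold):
    """First-derivative edge detector: backward-difference row and
    column gradients, magnitude-sum approximation, thresholding."""
    G_R = np.zeros(F.shape)
    G_C = np.zeros(F.shape)
    G_R[:, 1:] = F[:, 1:] - F[:, :-1]     # F(j,k) - F(j,k-1)
    G_C[1:, :] = F[1:, :] - F[:-1, :]     # F(j,k) - F(j-1,k)
    G = np.abs(G_R) + np.abs(G_C)         # magnitude combination (eq 2)
    return (G > threshold).astype(np.uint8) * 255

# Vertical step edge between a dark half and a bright half.
F = np.hstack([np.zeros((5, 5)), np.full((5, 5), 200.0)])
edges = gradient_edges(F, threshold=100)
print(edges[2, 4], edges[2, 5])  # 0 255: edge marked only at the transition
```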
b. Second Derivative Principle
Edge points are detected at the zero crossings of the second derivative. For a continuous function the second derivative is taken according to eq 5:

G(x,y) = ∂²F/∂x² + ∂²F/∂y² --eq 5

For a discrete function the formulae are given in [1] as

G_R(j,k) = F(j,k+1) − 2F(j,k) + F(j,k−1)
G_C(j,k) = F(j+1,k) − 2F(j,k) + F(j−1,k)

G(j,k) = [G_R(j,k)² + G_C(j,k)²]^(1/2) --eq 6
c. Sobel Edge Detection Method
The Sobel operator is a discrete differentiation operator that computes an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel operator is either the corresponding gradient vector or the norm of this vector. The operator is based on convolving the image with a small, separable, integer-valued filter in the horizontal and vertical directions, and is therefore relatively inexpensive in terms of computation. It uses two 3×3 kernels which are convolved with the original image to calculate approximations of the derivatives – one for horizontal changes and one for vertical. The kernels in Table I are obtained from [4] and are employed in the algorithm.
The results show that the edges detected through the Sobel method are more accurate than those of the other methods. But if the image contains a significant amount of noise, the edges detected through the Sobel method are not accurate; in that case we detect edges through either the first derivative or the second derivative method.
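A direct NumPy sketch of the Sobel method using the two kernels of Table I (a naive double loop for clarity rather than speed; the step-edge test image is arbitrary):

```python
import numpy as np

def sobel_edges(F):
    """Convolve with the two 3x3 Sobel kernels (Table I) and return
    the gradient magnitude at each interior pixel."""
    Kx = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
    Ky = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)
    rows, cols = F.shape
    G = np.zeros((rows, cols))
    for j in range(1, rows - 1):
        for k in range(1, cols - 1):
            patch = F[j - 1:j + 2, k - 1:k + 2]
            gx = np.sum(Kx * patch)       # response to horizontal changes
            gy = np.sum(Ky * patch)       # response to vertical changes
            G[j, k] = np.hypot(gx, gy)    # gradient magnitude (eq 1 form)
    return G

F = np.hstack([np.zeros((5, 5)), np.full((5, 5), 200.0)])
G = sobel_edges(F)
print(G[2, 4], G[2, 2])  # 800.0 0.0: strong at the edge, zero in flat areas
```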
TABLE I: Sobel row and column gradient kernels
III. Methodology
Edges are detected using all three approaches, namely the first derivative, the second derivative and the Sobel operator. First the grayscale image is read, then all three approaches are applied and thresholding is done using MATLAB R2010. The peak signal-to-noise ratio (PSNR) is calculated. Using Simulink, edges are detected according to the algorithm described below.
a. In Simulink
i. Algorithm
Step 1: Select an image file in grayscale format.
Step 2: Define the Sobel gradients as 3×3 matrices.
Step 3: Perform matrix multiplication of the two matrices defined above.
Step 4: Concatenate the results obtained in step 3 sequentially.
Step 5: The edge detected image is obtained.
Fig.1: Block diagram of an edge detected image using Sobel operator in
Simulink
Fig.1 shows the Simulink toolboxes that have been used
to detect edges through Sobel edge detection method.
b. In MATLAB
i. Algorithm
These are the steps employed in [2] to find the edge detected image.
Step 1: Select an image in grayscale format.
Step 2: Define the Sobel operator matrix.
Step 3: Apply the operator to every pixel in the image to form a new matrix.
Step 4: The matrix obtained is the edge detected image of the given image.
c. Using the fuzzy toolbox
These are the steps used to detect edges using fuzzy logic, as in [8].
Step 1: Construct a fuzzy inference system that detects the edges and classifies them as small, medium and large edges.
Step 2: Give an image as input to the fuzzy inference system.
Step 3: Obtain the output and apply a threshold to the image.
The edge detected image shows the edges as white lines on a black background. Grayscale values lie between 0 and 255, where 0 represents black, moving toward white as the value increases. From experience with the images tested in this study, the best results are achieved by treating gray values from 0 to 126 as black and values from 126 to 255 as white.
The edge detected matrix has a wide range of pixel values from 0 to 255. In order to increase the accuracy we assign a threshold value: pixel values above the threshold are made white and those below are made black. The matrix then contains only the values 0 and 255.
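The thresholding step can be sketched in NumPy; the value 126 follows the observation above, and the tiny input matrix is arbitrary:

```python
import numpy as np

# Threshold an edge-magnitude image at 126: values above become
# white (255), the rest black (0).
G = np.array([[10, 130], [200, 50]], dtype=float)
binary = np.where(G > 126, 255, 0).astype(np.uint8)
print(binary)  # entries above 126 become 255, the others 0
```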
IV. Results
The results obtained after performing the edge detection techniques on the image shown below are as follows.
The two 3×3 Sobel kernels of Table I, for horizontal and vertical changes respectively, are:

 1  0 -1        1  2  1
 2  0 -2        0  0  0
 1  0 -1       -1 -2 -1
Fig. 2. Original grayscale image
Fig 2 shows the actual image whose edge has to be
detected.
Fig. 3. Edge detected image using First Derivative Principle
Fig 3 shows the edge detected image of the original
image using First Derivative Principle.
Fig. 4. Edge detected image using Second Derivative Principle
Fig 4 shows the edge detected image of the original image
using Second Derivative Principle.
Fig. 5. Edge detected image using Sobel edge detection method.
Fig 5 shows the edge detected image using the Sobel edge detection method.
Fig. 6. Edge detected image using Fuzzy logic toolbox
Fig 6 shows the output of an edge detected image that has
been obtained using Fuzzy logic toolbox. The approach that
has been used here is the First Derivative principle method.
By comparing all three methods it can be said that the edge detected image obtained using the Sobel edge detection method is the most accurate.
TABLE II: PSNR for the 3 approaches
PSNR calculated for all the images obtained from
different approach is represented in table 2.
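The PSNR values compared here can be computed as follows (a sketch; MAX is 255 for 8-bit grayscale images):

```python
import numpy as np

def psnr(original, processed, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(MAX^2 / MSE)."""
    mse = np.mean((original.astype(float) - processed.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# toy example: two flat images differing by 10 gray levels everywhere
a = np.full((4, 4), 100, dtype=np.uint8)
b = np.full((4, 4), 110, dtype=np.uint8)
val = psnr(a, b)
print(round(val, 2))
```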
The file sizes calculated after detecting edges and after applying the threshold are given in Table III.
V. Conclusion
This paper describes the detection of edges in an image using three different approaches: First Derivative, Second Derivative, and the Sobel operator. PSNR is calculated for all three approaches and compared. The Sobel operator detects edges best, as its PSNR of 6.572 is higher than that of the other two approaches.
Acknowledgment
The authors wish to thank Daneshwari I. Hatti, Assistant Professor at SDMCET, for her valuable support and assistance in carrying out this work. The authors would also like to thank the faculty of SDMCET for their support.
TABLE II: PSNR for the three approaches

METHOD                  PSNR
First Derivative        6.286
Second Derivative       6.507
Sobel gradient method   6.572
TABLE III: File size for edge-detected images

METHOD                  Original file size   Edge-detected size   Size after threshold
First Derivative        64 kB                61.5 kB              4.22 kB
Second Derivative       64 kB                58.6 kB              4.32 kB
Sobel gradient method   64 kB                62.6 kB              11.4 kB
Face Recognition In Video Surveillance Applications
Shilpa Anbalgan1, Manjunath R Kounte2, C. Manogaran3
1Student, 2Asst. Professor, 1,2Dept. of Electronic Engineering, REVA ITM, Bangalore
3Deputy General Manager, BHEL
Email: [email protected], [email protected], [email protected]
Abstract—Face recognition is rapidly finding its way into intelligent surveillance systems. Recognizing faces in a crowd in real time is one of the key features that will significantly enhance such systems. The main challenge is that the high volumes of data generated by high-resolution sensors can be computationally prohibitive for mainstream processors. In this paper we report on the prototype development of an automated face recognition system using basic high-resolution techniques. In the proposed technique, the camera extracts all the faces from the full-resolution frame and sends only the pixel information from these face areas to the main processing unit. Face recognition software running on the main processing unit then performs the required pattern recognition.
1. INTRODUCTION
Image processing is a form of signal processing for which the input is an image or video, and the output may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it.
Purpose of Image processing
The purpose of image processing is
1. Visualization – To observe objects that are not clearly visible.
2. Image sharpening and restoration – To create a better image from the available data.
3. Image retrieval - Seek for the image
of interest or region of interest.
4. Measurement of pattern – Measures
various objects in an image such as
features or any region of interest.
5. Image Recognition – Distinguish the
objects in an image.
Fig 1: System diagram showing main
computational components of the Face
recognition Surveillance System
It is the set of computational techniques for analyzing, enhancing, compressing, and reconstructing image data. Its main components are the input, in
which an image is captured through
scanning or digital photography; analysis
and manipulation of the image,
accomplished using various specialized
software applications; and output (e.g., to a
printer or monitor display). Image
processing has various applications in
wide areas, including astronomy,
medicine, industrial robotics, remote
sensing by satellites and biometrics
A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. One way to do this is by comparing selected facial features from the image against a facial database. It is typically used in security systems and can be compared to other biometrics such as fingerprint or eye-iris recognition. Recognition algorithms can be divided into two main approaches: geometric, which looks at distinguishing features, and photometric, a statistical approach that distills an image into values and compares those values with templates to eliminate variances. Popular recognition algorithms include Principal Component Analysis using eigenfaces.
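The eigenface idea mentioned above can be sketched in a few lines of NumPy. This is a generic PCA-via-SVD illustration on toy data, not the system used in this paper; all array sizes and variable names are illustrative:

```python
import numpy as np

def eigenfaces(faces, k=2):
    """Compute the top-k eigenfaces from a stack of flattened face images.

    faces: (n_samples, n_pixels) array. The SVD of the mean-centred data
    is equivalent to PCA on the covariance matrix; the rows of vt are
    the principal directions (the eigenfaces).
    """
    mean = faces.mean(axis=0)
    centred = faces - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:k]

def project(face, mean, components):
    """Represent a face by its coordinates in eigenface space."""
    return components @ (face - mean)

rng = np.random.default_rng(0)
faces = rng.random((10, 64))          # 10 toy "faces" of 64 pixels each
mean, comps = eigenfaces(faces, k=3)
proj = project(faces[0], mean, comps)
print(proj.shape)
```

Recognition then amounts to comparing these low-dimensional projections (e.g., by nearest neighbour) instead of the raw pixels.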
2. PRIOR WORK
A. Face Recognition at a Distance for Video Surveillance Applications
Face recognition at a distance is
concerned with the automatic recognition
of co-operative and non-cooperative
subjects over a wide area. The system
features predictive subject targeting and an
adaptive target selection mechanism based
on the current action and past history of
each targeted subject to help ensure that
facial images are captured for all subjects
in view. Experimental tests designed to
simulate operation in large transportation
hubs show that the system can track
subjects and capture facial images at
distances of 35–50 m and can recognize them using a commercial face recognition system at distances of 10–20 m [1].
Faces are detected using a
combination of motion detection,
background modeling and skin detection.
The camera is then directed to the detected
faces for higher resolution facial image
capture. People moving in the field of view of the stationary camera are detected and tracked. The person detection process operates on the video at about 10-15 Hz. An extended Kalman filter [2], [3] is applied to the detected persons in the ground plane. This makes the system robust to momentary occlusions and provides a velocity estimate for each tracked subject, allowing the prediction of future subject locations. Also at GE Global Research, Krahnstoever et al. [4] have developed a multi-camera tracking framework and a prototype face-capture-at-a-distance system for video surveillance applications.
B. Object Detection by a Boosted Cascade of Simple Features
This is a machine learning approach for visual object detection capable of processing images extremely fast while achieving high detection rates. The work is distinguished by three key contributions. The first is the introduction of a new image representation called the "Integral Image". The second is a learning algorithm which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers [7]. The third contribution is a method for combining increasingly complex classifiers in a "cascade", which allows background regions of the image to be quickly discarded while spending more computation on promising object-like
regions. In the domain of face detection
the system yields detection rates
comparable to the best previous systems.
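The integral image mentioned above lets the sum of any rectangular region be computed with at most four array lookups, which is what makes the simple features so cheap to evaluate. A minimal sketch:

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of all pixels above and to the left of (y, x), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using four lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
s = rect_sum(ii, 1, 1, 2, 2)   # sum of the central 2x2 block
print(s)
```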
The speed of the cascaded detector is directly related to the number of features evaluated per scanned sub-window. This is possible because a large majority of sub-windows are rejected by the first or second layer in the cascade. The detector is roughly 15 times faster than the Rowley-Baluja-Kanade detector [8] and about 600 times faster than the Schneiderman-Kanade detector [9].
D. Reliable Face Recognition for CCTV
One ambition of CCTV is to help prevent terrorism, and a key enabling technology is reliable face recognition. With passport-quality photographs, current face recognition technologies can approach 90% recognition accuracy, yet trials show that performance drops to only 15-25% when there are significant changes in lighting, pose, and facial expression. We describe the approach taken by the authors to address these issues and provide reliable face recognition performance in real time. Our
system has three major components
comprising: a) A Viola-Jones face
detection module based on cascaded
simple binary features to rapidly detect
and locate multiple faces from the input
still image or video sequences, b) A Pose
Normalization Module to estimate facial
pose and compensate for extreme pose
angles, and c) Adaptive Principal
Component Analysis to recognize the
normalized faces. Experimental results
show that our approach can achieve good
recognition rates on face images across a
wide range of head poses with different
lighting and expressions.[10]
3. PROPOSED SYSTEM
Face detection against a complex background is a challenging task. The complexity in such detection systems stems from variations in image background, view, illumination, articulation, and facial expression. This paper introduces a new algorithm for face detection based on skin color detection and template matching. First, the non-skin parts of the picture are removed, and the result is then passed to the template matching stage. Easy implementation and high detection accuracy are the features of our work.
4. FACE SEGMENTATION
USING COLOR MODEL
TECHNIQUE
Among the standard color models (RGB, CMYK, HSV, YCbCr), we have used HSV for face recognition to extract features of the person in a video surveillance system. After several experiments, we used only the Saturation (S) and Value (V) components of the person and obtained a threshold value for each person, which is stored as a template.
This thresholding process is done in MATLAB. After the threshold values of all the data are collected, their uniform average is taken and used for further processing.
The collected HSV thresholds are then used in SIMULINK to segment the skin color from the background, as can be observed in the video screenshots of Fig. 2 and Fig. 3.
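The S/V thresholding described above can be sketched as follows. This is a NumPy illustration of the idea, not the MATLAB/SIMULINK implementation used here, and the range values are illustrative assumptions, not the paper's stored templates:

```python
import numpy as np

def skin_mask(hsv, s_range=(0.15, 0.65), v_range=(0.35, 0.95)):
    """Boolean mask of pixels whose Saturation and Value fall inside the
    template ranges (illustrative values; H is ignored, as in the text)."""
    s, v = hsv[..., 1], hsv[..., 2]
    return ((s >= s_range[0]) & (s <= s_range[1]) &
            (v >= v_range[0]) & (v <= v_range[1]))

# two pixels: one inside the ranges, one outside
hsv = np.array([[[0.1, 0.4, 0.6], [0.1, 0.9, 0.1]]])
mask = skin_mask(hsv)
print(mask)
```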
Fig. 2: Screenshot of video
Fig 3. Segmented video using HSV
After segmenting the person from the background, we overlay the result on the original video to obtain the person's face in color against an empty background, as can be observed in Fig. 4.
Fig.4 Obtaining face segmented from
background.
5. FACE IDENTIFICATION
After segmentation, we apply an eigenface technique using PCA to identify the person in the video (Fig. 5 and Fig. 6).
Fig.5 Maximum of white pixels
represents the person identification
Fig.6 Boundary box to identify face
6. CONCLUSION
We have presented approaches to the automatic detection of human faces in color images. The proposed approach consists of two parts: human skin segmentation, to identify probable regions corresponding to human faces, and view-based face detection, to further identify the location of each face. The human skin segmentation employs a model-based approach to represent and differentiate the background colors and skin colors. The
face detection method based on template matching is a promising approach given its satisfactory results. The algorithm retains its high performance even in the presence of structural objects such as beards, spectacles, and mustaches. However, more features and a more sophisticated algorithm need to be added in order to use it for more general applications.
REFERENCES
[1] W. Wolf, B. Ozer, and T. Lv, "Smart cameras as embedded systems," Computer, vol. 35, pp. 48-53, 2002.
[2] S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems. Artech House Publishers, 1999.
[3] N. Krahnstoever, P. Tu, T. Sebastian, A. Perera, and R. Collins, "Multiview detection and tracking of travelers and luggage in mass transit environments," in Proc. Ninth IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), 2006.
[4] N. Krahnstoever, T. Yu, S.-N. Lim, K. Patwardhan, and P. Tu, "Collaborative real-time control of active cameras in large scale surveillance systems," in Proc. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications (M2SFA2), October 2008.
[5] J. Harris, C. Koch, and J. Luo, "Resistive fuses: Analog hardware for detecting discontinuities in early vision," in Analog VLSI Implementation of Neural Systems, ed. C. Mead and M. Ismail, pp. 27-55, Kluwer Academic Publishers, 1989.
[6] P. C. Yu, S. J. Decker, H. S. Lee, C. G. Sodini, and J. L. Wyatt, Jr., "CMOS resistive fuses for image smoothing and segmentation," IEEE J. Solid-State Circuits, vol. 27, pp. 545-553, 1992.
[7] Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational Learning Theory: Eurocolt '95, pp. 23-37, Springer-Verlag, 1995.
[8] H. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, pp. 22-38, 1998.
[9] H. Schneiderman and T. Kanade, "A statistical method for 3D object detection applied to faces and cars," in International Conference on Computer Vision, 2000.
[10] P. B. Gibbons, B. Karp, Y. Ke, S. Nath, and S. Seshan, "IrisNet: An architecture for a worldwide sensor web," IEEE Pervasive Computing, vol. 2, no. 4, Oct.-Dec. 2003.
Pulse Compression for Target Tracking
Harish Naidu V1, S. N. Prasad2
1M.Tech, VLSI Design and Embedded Systems (ECE), VTU, Reva Institute of Technology & Management (RITM), Bangalore, India-560064
2Associate Professor, Department of ECE, VTU, Reva Institute of Technology & Management (RITM), Bangalore, India-560064
[email protected], [email protected]
Abstract—Pulse compression plays an important role in the design of radar systems. Pulse compression using linear frequency modulation (LFM) techniques is very popular in modern radar. Linear frequency modulation is used to resolve two small targets located at long range with very small separation between them. Pulse compression is commonly used in radar applications to improve range resolution while keeping the transmitted peak power low. This is achieved by modulating the transmitted pulse and then correlating the received signal with the transmitted pulse. Range resolution is the ability of radar to resolve two targets on the same bearing but at slightly different ranges. The primary focus of this paper is to implement pulse compression using digital technology. With high-performance digital computing, the convolution operation required for pulse compression can be done digitally. This digital approach eliminates the calibration requirements and the limited reconfigurability of analog approaches. The digital processing is done in the frequency domain: a Fast Fourier Transform (FFT) is used to transform both the reference waveform and the return signal waveform into the frequency domain. The complex conjugate of the reference waveform's FFT is then multiplied (point by point) with the returned signal waveform's FFT. The result is transformed back into the time domain (with an inverse FFT) to produce the output signal, with peaks that represent the targets.
Keywords—Pulse compression, Linear Frequency Modulation, Matched filtering, Digital down conversion (DDC), Range resolution
1. INTRODUCTION
In modern pulsed radar, range resolution (ΔR) is proportional to the pulse duration (τ), so improved range resolution necessitates a shorter pulse duration. At the same time, the energy (E) of the signal is also proportional to the pulse duration (τ), and the detection probability depends on it, so to improve detection the pulse duration should be longer. To resolve these two conflicting requirements, pulse compression is used. Pulse compression is usually done through frequency modulation or phase modulation, both of which are very popular in radar. Frequency modulation can be classified as Linear Frequency Modulation (LFM) and Nonlinear Frequency Modulation (NLFM).
LFM is the most popular radar waveform due to its good range resolution and Doppler sensitivity. LFM waveform generation schemes are classified into analog and digital techniques. Analog pulse compression techniques are based on surface acoustic wave (SAW) devices. However, the design and fabrication of a SAW device for a large time-bandwidth product chirp signal is very complex and expensive, while the digital technique offers programmability, flexibility, better stability, accuracy, and repeatability. This paper describes LFM generation and implementation in a Field Programmable Gate Array (FPGA).
2. LINEAR FREQUENCY MODULATION(LFM)
In LFM, the frequency of the modulating signal increases linearly during the pulse duration. In a linear chirp, the time-domain chirp signal is given by equation (1.1):

S(t) = exp(jΦ(t))                              (1.1)

where Φ(t) is the instantaneous phase, given by:

Φ(t) = 2π(f0·t + (k/2)·t²)                     (1.2)
Figure 1: Typical LFM waveforms: (a) up-chirp, (b) down-chirp, where f0 is the centre frequency (at time t = 0) and k is the chirp rate (the rate of frequency increase).
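A baseband LFM chirp following equations (1.1) and (1.2) can be generated as below. This is a NumPy sketch (the paper generates the waveform in Scilab); the sample rate and chirp rate are illustrative values:

```python
import numpy as np

def lfm_chirp(f0, k, duration, fs):
    """Complex LFM chirp s(t) = exp(j*2*pi*(f0*t + 0.5*k*t^2))."""
    t = np.arange(int(duration * fs)) / fs
    phase = 2 * np.pi * (f0 * t + 0.5 * k * t ** 2)
    return np.exp(1j * phase)

fs = 1000.0                                   # sample rate (Hz), illustrative
# 0.1 s chirp sweeping from 0 Hz to k*duration = 400 Hz
chirp = lfm_chirp(f0=0.0, k=4000.0, duration=0.1, fs=fs)
print(len(chirp))
```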
3. MATCHED FILTER
A matched filter can be implemented in the time domain or the frequency domain. This paper mainly analyses the frequency-domain matched filter, which realizes pulse compression efficiently. The matched filter algorithm process is shown in Figure 2.
Figure 2: Matched filter algorithm process
Matched filtering correlates the received and transmitted pulses in the frequency domain. Echoes correlating with the transmitted pulse produce a high peak; others are suppressed. An IFFT brings the signal back to the time domain.
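The frequency-domain matched filtering described above amounts to multiplying the return's FFT by the conjugate of the reference's FFT and transforming back. A minimal NumPy sketch (the chirp parameters and the 40-sample echo delay are illustrative):

```python
import numpy as np

def pulse_compress(rx, ref):
    """Correlate the received signal with the transmitted reference via
    the frequency domain: IFFT(FFT(rx) * conj(FFT(ref)))."""
    n = len(rx)
    spec = np.fft.fft(rx, n) * np.conj(np.fft.fft(ref, n))  # ref is zero-padded to n
    return np.fft.ifft(spec)

# reference: short complex chirp; received signal: the same chirp delayed
t = np.arange(64) / 64.0
ref = np.exp(1j * 2 * np.pi * (4 * t + 8 * t ** 2))
rx = np.zeros(256, dtype=complex)
rx[40:40 + 64] = ref                     # echo delayed by 40 samples
out = np.abs(pulse_compress(rx, ref))
print(int(np.argmax(out)))               # peak index marks the echo delay
```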
4. SYSTEM OVERVIEW
Figure 3: Digital receiver subsystem block diagram (DRS)
The pulse compression module is implemented in the Virtex-5 SX95T FPGA of the PMC-E2202. The PMC-E2202 is a 16-bit digital receiver mezzanine card consisting of 4 ADC channels with a sampling rate of up to 160 Msps and a Xilinx Virtex-5 SX95T FPGA. The card also has 4 DAC channels that can be used to generate test patterns for testing the ADC. A brief overview of the data path of one channel is shown below. Two channels are implemented in one E2202 card, so two E2202 cards are used for the four channels. The data path is the same for all channels.
Figure 4: Data flow in FPGA
Digital down Conversion (DDC)
Figure 5: DDC
The function of a digital down converter is to translate the frequency of a signal either to a lower, intermediate frequency or to baseband. The usual reason for this is to reduce the data rate required to represent the desired signal.
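Conceptually the DDC mixes the band of interest to baseband with a complex exponential, low-pass filters, and decimates. A NumPy sketch of the idea (not the FPGA core; block averaging stands in for a proper FIR low-pass filter, and all parameters are illustrative):

```python
import numpy as np

def ddc(signal, f_c, fs, decim=4):
    """Digital down-conversion sketch: mix to baseband, crudely low-pass
    by block averaging, then decimate by `decim`."""
    n = np.arange(len(signal))
    baseband = signal * np.exp(-2j * np.pi * f_c * n / fs)  # frequency shift
    trimmed = baseband[: len(baseband) // decim * decim]
    # block-average as a stand-in for a proper FIR low-pass filter
    return trimmed.reshape(-1, decim).mean(axis=1)

fs = 1000.0
n = np.arange(1000)
carrier = np.cos(2 * np.pi * 100.0 * n / fs)   # 100 Hz tone
out = ddc(carrier, f_c=100.0, fs=fs, decim=4)  # tone lands at DC
print(len(out))
```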
The complex data stream from the digital down converter flows into the pulse compression unit. This unit:
1. Performs a time-domain multiplication, to reduce the magnitude of the side lobes.
2. Performs an FFT operation, to transform the time-domain complex sample stream into a frequency-domain representation of the return signal.
3. Performs a single point-by-point multiplication (in the frequency domain) by the transverse equalization filter coefficients (to compensate for the non-flat response of the analog receiver on the front end) and by the conjugated coefficients of the FFT of the reference waveform (to detect correlations with the transmitted waveform).
4. Performs an inverse FFT operation, to generate a time-domain data stream with peaks that correspond to targets.
Figure 6: Pulse compression module block diagram
The pulse compression module takes the output of the DDC core and correlates it with the waveform data stored in the ROM. Correlation is performed in the frequency domain, hence FFT cores and complex multipliers are used in the design.
Block Description
1. Both the FFT and IFFT cores are configured to a specific size (N-point) by the Ddc_fft_data_controller_interface, depending on the number of samples acquired.
2. The DDC output, with or without zero padding, is fed to the FFT core to compute the FFT.
3. Simultaneously, the data from the TX ROM is popped out one sample at a time and multiplied (complex multiplication) with the corresponding FFT output sample.
4. The complex product is then fed to the IFFT module to compute the final pulse compression output.
TX-rom Module
The TX ROM module is an important block that contains the stored transmit waveform, which is correlated with the received waveform during pulse compression, together with a small addressing logic that selects a TX waveform of a specific size from the generalized waveform stored in the ROM.
Tool used
SCILAB: Scilab is a freely distributed open-source scientific software package. It is similar to, and almost as powerful as, the commercial product MATLAB.
5. RESULTS
Transmitted (reference) waveform coefficients are processed and stored offline using Scilab and the Xilinx ROM generator. The TX ROMs of both the I and Q channels contain the stored ideal chirp waveform, which is taken as the transmit waveform. A Hanning window is applied offline in Scilab to reduce sidelobes. The FFT conjugate is computed offline in Scilab and a ROM coefficient file is generated, which is given as input to the ROM generator. The
complex conjugate of the reference waveform's FFT is then multiplied (point by point) with the returned signal waveform's FFT. The result is transformed back into the time domain (with an inverse FFT) to produce the output signal, with peaks that represent the targets.
The following figures show the snapshot of the LFM
signal and pulse compression output done in SCILAB.
Figure 7: LFM pulse envelope
Figure 8: LFM pulse spectrum
Figure 9: Chirp I and Q channel
Figure 10: Pulse compression output
The following figure shows the snapshot of zero padded
output from FIFO using Xilinx ISim
Figure11: Waveform of Zero padded output from the FIFO
6. ADVANTAGES
Pulse compression increases the range resolution as well as the signal-to-noise ratio (SNR). The main advantages of LFM are that it is quite insensitive to Doppler shifts, it is the easiest waveform to generate, and a variety of hardware is available to form and process it.
7. CONCLUSION
Pulse compression allows coverage of a large range area using reduced transmitter power while achieving high range resolution. It is best for resolving overlapping returns coming from closely spaced targets. The cost of applying pulse compression is paid in the form of complexity added to both transmitter and receiver. The other challenge is proper suppression of range side lobes. The advantages generally outweigh the disadvantages, so pulse compression is the best technique for radar systems where the main challenges are transmit power and high range resolution.
ACKNOWLEDGEMENT
This work has been fully supported by Mistral Solutions Private Limited, Bangalore, and the E & C department of RITM, Bangalore.
EFFICIENT SORTING AND COUNTING OF
GRAINS USING IMAGE PROCESSING
Manoj N1
Department of E & C,
Siddaganga Institute of Technology,
Tumkur -572 103, Karnataka, INDIA
[email protected]
Y Harshalatha2
Asst. Professor
Department of E & C,
Siddaganga Institute of Technology,
Tumkur -572 103, Karnataka, INDIA
[email protected]
Partha Das3
R & D Engineer,
Fowler Westrup (INDIA) Pvt. Ltd.,
Bommasandra Industrial Area,
Bangalore, Karnataka, INDIA
[email protected]
Abstract—With increased expectations for food products of high quality and safety standards, the need for accurate, fast and objective determination of these characteristics in food products continues to grow. Computer vision provides an automated, non-destructive and cost-effective technique to sort good and defected grains based on their color. Many systems have already been developed to sort grains based on color, but these systems fail to detect minute color defects. In this paper, we present a set of image processing algorithms that accurately differentiate all types of color-defected grains. When the images of grains are captured using a line scan CCD camera, the grains in the image contain a non-uniform distribution of gray values, hence a Gaussian-based smoothening algorithm is implemented to overcome the non-uniform distribution of gray values inside the grain. A modified adaptive thresholding algorithm is proposed to segment the grains in the image based on their gray-scale values. Finally, the good and defected grains are counted using a connected component algorithm and separated by using air valves.
Keywords: Image processing, CCD line scan camera, computer vision, falling grains.
I. INTRODUCTION
The basis of quality assessment is often subjective, with attributes such as appearance, smell, texture, and flavor frequently examined by human inspectors, but human perception can easily be fooled. Together with high labor costs, the inconsistency and variability associated with human inspection accentuate the need for objective measurement systems. Automatic sorting systems, mainly based on image processing, have therefore been investigated for sorting agricultural and food products.
The grain samples are fed into the input hopper automatically, as shown in Fig 1. Then, via the in-feed vibrator, the grains are fed onto flat, channeled gravity chutes. The surface of the chutes is smooth enough to reduce the friction between the grains and the chutes. The grains then pass into an optical inspection area, where a decision on whether to accept or reject each grain is made. To make the right decision, the image captured by the camera in the optical inspection area is processed. In the inspection area, two cameras are placed for each chute: one views the front portion of the grain and the other views the rear portion at the same time. Each camera has two front illuminators and one rear illuminator. The front illuminators illuminate the grain, and the rear illuminator is used to remove the shadow of the grain. The front illuminators are blue for rice grains and red for masoor dhal grains; the rear illuminator is white for both. Based on the images captured in the inspection area, the processors decide whether to accept or reject each grain. If the decision is to reject, a signal is sent to the ejectors to blow air at that defected grain. So finally the good and
defected grains will be separated. The images are acquired using a CCD line scan camera; each line scan gives an image of size 1×2048 pixels. Here we consider four line scans as one frame, and for further processing a 2048×2048 image is considered.
Fig 1: Working principle of the color sorter
The goal of this paper is to present a computational algorithm as well as an efficient decision support tool to
automatically detect all types of color defects in the grains.
Specifically, the pale yellow defect in rice grains and the yellow defect in masoor dhal have to be detected accurately. We exploit the pixel information as a reliable feature for deciding whether a grain is good or defected. The pixels belonging to the same grain may vary in their gray values because of the nature of the grain, which reduces the accuracy of detecting defected grains; hence we propose a Gaussian-based smoothening algorithm to distribute the gray values uniformly within each grain. A modified adaptive thresholding algorithm is then applied to segment the good and defected grains in the image based on their gray-level values. According to the segmentation, the gray-scale image is converted into a binary image in which only the defected grains are marked dark, while the good grains merge with the white background. The device then performs the sorting by using the air valves to blow off the defected grains. The numbers of good and defected grains are counted using a connected component algorithm. The paper is organised as
follows: We will briefly review some related methods in
section II, methodology of the proposed work is discussed in
section III, experimental results are shown in section IV,
conclusion along with the future work are given in section V.
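The connected component counting mentioned above can be sketched as a simple breadth-first labeling of a binary image (a pure-Python/NumPy illustration using 4-connectivity, not the production implementation):

```python
import numpy as np
from collections import deque

def count_components(binary):
    """Count 4-connected components of True pixels in a binary image."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y, x] and not seen[y, x]:
                count += 1                    # found an unvisited blob
                q = deque([(y, x)])
                seen[y, x] = True
                while q:                      # flood-fill this blob
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w and
                                binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
    return count

img = np.array([[1, 1, 0, 0],
                [0, 0, 0, 1],
                [1, 0, 0, 1]], dtype=bool)
n_blobs = count_components(img)
print(n_blobs)
```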
II. LITERATURE SURVEY
In this application the images obtained from the CCD line scan camera are gray-scale images, so we can divide the pixels in the image into two dominant groups according to their gray level. These gray levels may serve as "detectors" to distinguish between background and objects in the image. On the other hand, if the image is one of smooth-edged objects, it will not be a pure black-and-white image, and we would not be able to find two distinct gray levels characterizing the background and the objects. Fig 2 shows the grain image and its histogram plot. A solution in the literature to this problem is to select a gray level T between those two dominant levels to serve as a 'threshold' distinguishing the two classes (objects and background). Using this threshold, a new binary image can be produced in which grain kernels are painted completely dark and the remaining pixels are white. But this approach is not suitable for segmenting the defects in the grain, because the defects have several gray levels. In the literature this principle is therefore generalized to deal with images having several dominant gray levels: more than a single threshold is used to classify the image's components. Such an approach is referred to as multilevel thresholding. But multilevel thresholding is not always the best solution for segmenting an image having several gray levels.
(a) (b)
Fig 2: (a) original image (b) Histogram plot of (a).
In fact, multilevel thresholding usually gives poor results [4].
The reason for this is the difficulty of globally establishing
multiple thresholds that effectively isolate regions of interest,
especially when the number of corresponding histogram
modes (dominant areas) is large. Furthermore, the
establishment of a globally predefined set of thresholds cannot
take into consideration the variations within the image.
The image considered here has several gray values, so these drawbacks cannot be overcome in a global context: no single threshold fits the entire image. This leads to the conclusion that a more local threshold must be used.
III. METHODOLOGY
A. WORK FLOW
Fig 3: Work flow of proposed work: Image acquisition → Gaussian smoothing → Modified adaptive thresholding → Counting of grains → Decision making → Accept / Reject
B. GAUSSIAN SMOOTHING
A Gaussian is an ideal filter in the sense that it
reduces the magnitude of high spatial frequencies in an image
proportional to their frequencies. That is, it reduces magnitude
of higher frequencies more. This will be at the cost of more
computation time when compared to mean filtering. A
Gaussian extends to infinity in all directions, but because it
approaches zero exponentially, it can be truncated to three or
four standard deviations away from its center without affecting
the result noticeably. Speed up is achieved by splitting a 2-D
Gaussian into two 1-D Gaussians, G(x, y) = G(x) G(y), and
carrying out filtering in 1-D, first row by row and then column
by column. Gaussian filters can also be designed by computing the mask weights directly from the discrete Gaussian distribution

G(i, j) = exp(−(i² + j²) / (2σ²)) .......... (1)

Choosing a value for σ, we can evaluate equation (1) over an n x n window to obtain a kernel, or mask, for which the value at [0,0] equals 1. For example, choosing σ² = 2 and n = 7, equation (1) yields the array shown in Table 1. However, we desire the filter weights to be integer values for ease of computation. Therefore, we take the value at one of the corners of the array (0.011) and choose k = 1/0.011 ≈ 91 so that this value becomes 1. We get
Table 1: Array of mask
Now, by multiplying the rest of the weights by k and rounding, we obtain Table 2.
Table 2: Gaussian Kernel
This is the resulting convolution mask for the Gaussian filter.
However, the weights of the mask do not sum to 1. Therefore, when performing the convolution, the output pixel values must be normalized by the sum of the mask weights, to ensure that regions of uniform intensity are not affected, i.e.

g(x, y) = (1 / Σ w(i, j)) · Σi Σj w(i, j) f(x + i, y + j)

where the weights w(i, j) are all integer values and their sum here is 1115.
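The construction of Tables 1 and 2 can be sketched in a few lines of Python (an illustrative sketch, not the paper's code; note that the tabulated values correspond to 2σ² = 4, i.e. σ² = 2):

```python
import math

def gaussian_integer_kernel(n=7, sigma2=2.0):
    """Sample G(i, j) = exp(-(i^2 + j^2) / (2*sigma^2)) over an n x n window,
    round to 3 decimals (Table 1), then scale so the corner value becomes 1
    and round to integers (Table 2)."""
    half = n // 2
    g = [[round(math.exp(-(i * i + j * j) / (2.0 * sigma2)), 3)
          for j in range(-half, half + 1)]
         for i in range(-half, half + 1)]
    k = 1.0 / g[0][0]                      # corner value 0.011 -> k ~ 91
    return [[int(round(k * v)) for v in row] for row in g]

kernel = gaussian_integer_kernel()
norm = sum(sum(row) for row in kernel)     # divisor used after convolution
```

The center weight comes out as 91 and the weights sum to 1115, which is the normalization constant applied to the convolution output.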
C. IMAGE SEGMENTATION
Image segmentation classifies or clusters an image into
several parts (regions) according to the feature of image, for
example, the pixel value or the frequency response. Image
segmentation is useful in many applications. It can identify the
regions of interest in a scene or annotate the data. Often when
approaching a Computer-Vision issue, the first problem we
face is that of Segmentation. That is, in order to extract
valuable information from the image at hand, we first need to
divide the image into distinctive components, which can then
be further analyzed. This is needed, in order to separate the
"interesting" components from the subordinate ones, since
computers have much difficulty in performing classification in
comparison with the Human brain.
1) Adaptive Thresholding:
In order to overcome the drawbacks mentioned above, and since no single global threshold fits the entire image, a more local threshold must be used. Accordingly, the general definition of a threshold can be written in the following manner:

T = T[x, y, p(x, y), f(x, y)]

where f(x, y) is the gray level of point (x, y) in the original image and p(x, y) denotes some local property of that point. When T depends only on the gray level at the point, the definition degenerates into a simple global threshold. In practice, p(x, y) is one of the more important components in the calculation of
the threshold for a certain point. In order to take into consideration the influence of noise or illumination, the
calculation of this property is usually based on an environment
of the point at hand. An example of a property may be the
average gray-level in a predefined environment, the centre of
which is the point at hand.
Threshold calculation techniques: There are two main approaches to calculating the threshold for a certain point in the image: one is the Chow and Kaneko approach, the other is local thresholding. Both approaches are based on the assumption that smaller image regions are more likely to have approximately uniform illumination, and are thus more suitable
Table 1: Array of mask (values of equation (1) for σ² = 2, n = 7)

 i\j    -3    -2    -1     0     1     2     3
 -3   .011  .039  .082  .105  .082  .039  .011
 -2   .039  .135  .287  .368  .287  .135  .039
 -1   .082  .287  .606  .779  .606  .287  .082
  0   .105  .368  .779 1.000  .779  .368  .105
  1   .082  .287  .606  .779  .606  .287  .082
  2   .039  .135  .287  .368  .287  .135  .039
  3   .011  .039  .082  .105  .082  .039  .011

Table 2: Gaussian Kernel (integer weights, k ≈ 91)

 i\j  -3  -2  -1   0   1   2   3
 -3    1   4   7  10   7   4   1
 -2    4  12  26  33  26  12   4
 -1    7  26  55  71  55  26   7
  0   10  33  71  91  71  33  10
  1    7  26  55  71  55  26   7
  2    4  12  26  33  26  12   4
  3    1   4   7  10   7   4   1
for thresholding. Each region is treated like an independent
image, and a "global" threshold is computed for it.
The Chow and Kaneko Approach: According to Chow and Kaneko, the original image is divided into an array of
overlapping sub-images. A gray-level distribution histogram is
produced for each sub-image, and the optimal threshold for
that sub-image is calculated based on this histogram. Since the
sub-images overlap, it is then possible to produce a threshold
for each individual pixel by interpolating the thresholds of the
sub-images. This method gives reasonable results, but its
major drawback is the fact that it requires much computation.
This causes it to be too slow and heavy for real-time
applications (i.e. for use in computer vision).
The Local Approach: An alternative approach is to statistically
examine the intensity values of the local neighbourhood of
each pixel. The first problem facing us when choosing this
method is the choice of statistic by which the measurement is
made. The appropriate statistic may vary from one image to
another, and is largely dependent on the nature of the image.
For example, if the image contains a strong illumination-
gradient, then use of the average may prove to be effective in
eliminating the ill influence. A few examples of statistics
commonly used in the calculation of the threshold are: the
average, the median, and the average between the minimal and
the maximal gray level in the neighbourhood. As is clear from Fig 4, adaptive thresholding is successful in extracting the grain kernels from the image despite the strong illumination gradient. On the other hand, the result is not very pleasing in background areas. The reason for this phenomenon is that for pixels centred in an environment containing enough background and foreground (grain), the selected threshold falls at about the middle between these two extremes, thus separating them nicely. For pixels centred in an environment of background only, the range of gray levels within the environment is very small, and the result is that the average is very close to the value of the centre pixel; thus an unsuitable threshold value is computed.
Fig 4: Adaptive thresholding
2) Modified adaptive thresholding:
In order to fix the drawback of adaptive thresholding, a
combination of adaptive and global threshold can be
employed. If we compute the threshold of a pixel as the
average of its environment minus a predefined fixed threshold,
then pixels centred in a relatively uniform environment would
be classified as background, which yields a good result.
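A sketch of this modified scheme (illustrative Python on a nested-list gray image; the neighbourhood size and the fixed constant c are assumed values, not taken from the paper):

```python
def modified_adaptive_threshold(image, window=3, c=10):
    """Binarize: a pixel is marked dark (defect, 0) when it is darker than
    the mean of its (2*window+1)^2 neighbourhood minus a fixed constant c;
    otherwise it is background / good grain (1)."""
    h, w = len(image), len(image[0])
    out = [[1] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # mean of the local neighbourhood, clipped at the image borders
            ys = range(max(0, y - window), min(h, y + window + 1))
            xs = range(max(0, x - window), min(w, x + window + 1))
            vals = [image[yy][xx] for yy in ys for xx in xs]
            mean = sum(vals) / len(vals)
            if image[y][x] < mean - c:
                out[y][x] = 0      # defective (dark) pixel
    return out
```

On a uniform region the local mean equals the pixel value, so no pixel falls below mean − c and everything is classified as background, which is exactly the fix described above.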
D. DECISION MAKING AND EJECTION PRINCIPLE
After applying the modified adaptive thresholding to the grain image, the gray scale image is converted into a binary image in which all dark pixels are considered to belong to defective grains. When a certain number of continuous dark pixels is reached, a signal is passed to the ejectors to blow air on that set of pixels. The ejection system
contains a number of valves through which air is blown.
Defects are rejected from the product stream by pulses of compressed air. These pulses are accurately aimed at the unwanted items by nozzles, a compressed air source being connected to the nozzles via a rigid duct and switched on and off by high-speed valves [1]. The principal aim of a single ejection event is to remove a defective grain.
E. COUNTING OF GRAINS
The counting of grains is done by analyzing the connected
components. A connected component in a binary image is a
set of pixels that form a connected group. Connected
component labelling is the process of identifying the
connected components in an image and assigning each one a
unique label. Properties of Connectivity: For simplicity, we
will consider a pixel to be connected to itself (trivial
connectivity). In this way, connectivity is reflexive. It is pretty
easy to see that connectivity is also symmetric: a pixel and its
neighbour are mutually connected. 4-connectivity and 8-
connectivity are also transitive: if pixel A is connected to pixel
B, and pixel B is connected to pixel C, then there exists a
connected path between pixels A and C. A relation (such as
connectivity) is called an equivalence relation if it is reflexive, symmetric, and transitive.
Connected Component Labelling: Finding all equivalence classes of connected pixels in a binary image is called connected component labelling. The result is another image in which everything in one connected region is labelled "1" (for example), everything in another connected region is labelled "2", and so on.
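The labelling described above can be sketched with a breadth-first search (illustrative Python; a two-pass union-find labelling is the other common choice):

```python
from collections import deque

def label_components(binary, connectivity=4):
    """Label the 4- or 8-connected components of foreground (value 1)
    pixels via breadth-first search; returns (label image, count)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    if connectivity == 4:
        nbrs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        nbrs = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)]
    current = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 1 and labels[y][x] == 0:
                current += 1                      # start a new component
                labels[y][x] = current
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in nbrs:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current
```

The returned count is the number of grains (or defect blobs) in the binary image.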
IV. EXPERIMENTAL RESULTS
The images captured with the developed image acquisition board were processed by the proposed algorithms and analyzed in MATLAB 7.0. The image is first passed through the proposed Gaussian filter to smooth the image and remove noise. The proposed modified adaptive thresholding is then applied to this smoothed image and compared with the conventional global threshold approach. As an example, we have considered images of Masoor dhal and rice. Masoor dhal exhibits three types of color defects: brown, green and yellow, while the good grain is red. Yellow and red have nearby gray values, so it is difficult for the system to differentiate between these two colors when global thresholding is applied. Rice grains contain pale yellow,
orange, chalky and black defects. Here pale yellow and good grains have nearby gray values, so they are difficult to differentiate. The proposed algorithms differentiate more accurately than conventional methods.
V. CONCLUSION AND FUTURE WORK
In this paper, we have described recent improvements in the
algorithm of the color sorter to obtain more accuracy without
compromising speed. These improvements include the
removal of noise from the image and the use of modified
adaptive thresholding for image segmentation. The experimental results show that the proposed set of algorithms gives efficient segmentation of defective grains from good grains. The proposed algorithm is designed to work well in real-time applications: it gives accurate results in the same time that conventional color sorters would take. To obtain more efficiency in sorting good and defective grains,
two more algorithms can be applied. Image enhancement
algorithm and separation of overlapping algorithms can be
applied on the gray scale image [2]. The grain kernels sliding
down from the chute will be overlapped sometimes, so in the
image the grain kernels are not clearly viewed, hence a
separation of overlapping grain kernel algorithm can be
applied on the image to separate the kernels from one another.
Image enhancement processes consist of a collection of techniques that seek to improve the visual appearance of the image.
Hardware implementation can be done on FPGA or DSP
chips.
REFERENCES
[1] G. Hamid, M. J. Honeywood, S. Mills, S. C. Bee, W. He, "Optical sorting for the cereal grain industry", Research and Development Department, Sortex Ltd, London, England, UK.
[2] Lei Yan, Sang-Ryong Lee, Seung-Han Yang, Choon-Young Lee, "CCD Rice Grain Positioning Algorithm for Color Sorter with Line CCD Camera", Department of Mechanical Engineering, Kyungpook National University, Daegu, Korea.
[3] Weixing Wang, "Image Analysis of Grains from a Falling Stream", Fourth International Conference on Image and Graphics.
[4] Ping-Sung Liao, Tse-Sheng Chen and Pau-Choo Chung, "A Fast Algorithm for Multilevel Thresholding", Journal of Information Science and Engineering 17, 713-727 (2001).
[5] Tadhg Brosnan, Da-Wen Sun, "Improving quality inspection of food products by computer vision - a review", FRCFT Group, Department of Agricultural and Food Engineering, University College Dublin, National University of Ireland, Earlsfort Terrace, Dublin 2, Ireland. Received 29 April 2002; accepted 6 May 2003.
[6] E. R. Davies, "Computer and Machine Vision: Theory, Algorithms and Practicalities".
[7] Nello Zuech, "Understanding and Applying Machine Vision", Second Edition, Revised and Expanded.
Table 3: Number of good and defective grains obtained by the proposed method

Figure   Good grains          Yellow defects       Total
no.      Manual  Proposed     Manual  Proposed     Manual  Proposed
5        26      23           0       2            26      25
6        0       0            23      23           23      23
7        4       4            6       6            10      10

Fig 5: parboiled rice good (a) original image (b) global thresholding (c) modified thresholding
Fig 6: parboiled rice Pale yellow (a) original image (b) global thresholding (c) modified thresholding
Fig 7: parboiled rice good and pale yellow (a) original image (b) global thresholding (c) modified thresholding
Foreground Analysis for Real Time Moving Object Detection, Tracking and object classification
Roopashree. B. G1, Vidyasagar K.N2 ,C.Manogaran3
Student1, Asst. Professor2, Sr. Deputy General Manager3
Department of Electronics and communication1,2
Reva Institute of Technology,Bangalore, Karnataka, India1
Quality Service, BHEL, Bangalore,Karnataka,India3
Email : [email protected] ,[email protected] @bheledn.co.in
ABSTRACT-In a robust tracking system where a camera is installed on a freely moving platform, motion detection and tracking of an object (e.g. a human, a vehicle or a robot) becomes much more difficult; hence it is necessary to develop an algorithm to perform functions such as moving object detection, tracking and object classification.
The aim is to model foreground pixels using the background subtraction method, followed by preprocessing of the data for object modeling, in order to recognize and classify the moving objects and understand human activity in the scene. The object detection algorithm is applied in uncontrolled environments and must be suitable for an automated video surveillance system for detection and monitoring of moving objects in both indoor and outdoor environments.
Keywords- Motion Detection;Object detection;Tracking; Human
Model; Motion Analysis; video surveillance
I. INTRODUCTION
Over the past three decades, motion tracking has become an increasingly accepted and accessible method for accurately plotting organic movement into computers, bypassing the tricky and cumbersome process of manually animating models with realistic movements. As the technology matures and the systems become widely accessible, their usefulness is increasing exponentially. The current acceptable quality of motion tracking marks a clear milestone in the long quest to create an intelligent machine: a computer can now 'see'.
PRESENT SCENARIO
A closely related system is the DETER system, which can detect and track humans and can analyze the trajectory of their motion for threat evaluation.
The W4 system detects humans and their body parts and analyzes some simple interactions between humans and objects.
IBM S3-R1 can detect, track, and classify moving objects in a scene, encodes detected activities and stores all the information on a database server.
All these systems work only in a static camera environment; our objective is to work as well in an uncontrolled moving camera environment.
A. Different Types of Motion capturing
There are a few different methodologies when it comes to
capturing data in real-time. The three main techniques are
prosthetic, magnetic and optical.
1. Prosthetic Motion Capture
Prosthetic motion capture uses potentiometers on the plastic
exoskeleton that an actor must 'wear', and then act out his or
her movements. This technique is obviously only of use in
humanoid character animation, but is very accurate and
transmits real-time data at a far greater range than any other
technology. The suit is cumbersome but its advantages mean
that prosthetic motion capture has thrived.
Figure 1: The Gypsy 3 prosthetic full body suit
2. Magnetic Motion Capture
Using magnetic motion capture, sensors attached to the body being animated are manipulated inside a magnetic field. This technique is the least power-hungry in terms of computational number crunching, so it is the closest to real time, with up to one hundred samples a second possible. The sensors also provide details of their orientation in full 3D. One of the few drawbacks is the obvious effect that any metal has on the generated magnetic field. Magnetic motion capture also requires a very tight space, with a range of only three meters.
Figure 2: A suit with magnetic sensors in a closed environment
3. Optical Motion Capture
Optical systems utilize data captured from image sensors
to triangulate the 3D position of a subject between one or
more cameras calibrated to provide overlapping projections.
Data acquisition is traditionally implemented using special
markers attached to an actor. Optical motion capture systems do away with a suit or exoskeleton, and do not necessarily even need physical markers such as LED lights or reflective dots: systems exist that are capable of mapping patches of color or brightness to areas on a 3D model in real time. The greatest advantage of optical motion capture is that it is not limited to movements in a closed motion-capture studio; any movement on a video stream, live or recorded, can be analyzed: water flow in a river, traffic speeds on motorways, and so on.
Figure 3: Optical facial tracking (2 cameras)
B. Background Subtraction
Background subtraction is a particularly common technique for motion segmentation in static scenes. It
attempts to detect moving regions of interest by subtracting the current image pixel-by-pixel from a reference background
image that is created by averaging images over time in an
initialization period. The pixels where the difference is above
a threshold are classified as foreground. After creating a
foreground pixel map, some morphological post processing
operations such as erosion, dilation are performed to reduce
the effects of noise and enhance the detected regions. The
reference background is updated with new images over time
to adapt to dynamic scene changes.
There are different approaches to this basic scheme of
background subtraction in terms of foreground region
detection, background maintenance and post processing.
1. Background Subtraction Techniques:
Heikkila and Silven use the simple version of this scheme, where the pixel at location (x, y) in the current image It is marked as foreground if

|It(x, y) − Bt(x, y)| > T .......... (1)

is satisfied, where T is a predefined threshold. The background image Bt is updated by the use of an Infinite Impulse Response (IIR) filter as follows:

Bt+1 = αIt + (1 − α)Bt .......... (2)
The foreground pixel map creation is followed by morphological closing and the elimination of small-sized
regions.
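Equations 1 and 2 amount to the following per-pixel step (an illustrative Python sketch on flat lists of intensities; the threshold and learning-rate values are assumptions, not from the paper):

```python
def subtract_and_update(frame, background, tau=30, alpha=0.05):
    """One step of simple background subtraction (eq. 1) followed by the
    IIR background update of eq. 2, over flat lists of pixel intensities."""
    # eq. 1: foreground where the frame deviates from the background model
    foreground = [abs(i - b) > tau for i, b in zip(frame, background)]
    # eq. 2: blend the current frame into the background model
    new_bg = [alpha * i + (1 - alpha) * b for i, b in zip(frame, background)]
    return foreground, new_bg
```

Repeating the update lets the reference background slowly absorb gradual scene changes while still flagging fast-moving pixels.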
Although background subtraction techniques perform well at extracting most of the relevant pixels of moving regions, even when they stop, they are usually sensitive to dynamic changes, for instance when stationary objects uncover the background (e.g. a parked car moves out of the parking lot) or when sudden illumination changes occur.
2. Statistical Methods
More advanced methods that make use of the statistical
characteristics of individual pixels have been developed to
overcome the shortcomings of basic background subtraction
methods. These statistical methods are mainly inspired by the
background subtraction methods in terms of keeping and
dynamically updating statistics of the pixels that belong to the
background image process. Foreground pixels are identified
by comparing each pixel‘s statistics with that of the
background model. This approach is becoming more popular
due to its reliability in scenes that contain noise, illumination
changes and shadow.
The W4 system uses a statistical background model where
each pixel is represented with its minimum (M) and maximum
(N) intensity values and maximum intensity difference (D)
between any consecutive frames observed during initial
training period where the scene contains no moving objects. A
pixel in the current image It is classified as foreground if it
satisfies:
|M(x, y) – It(x, y)| > D(x, y) or
|N(x, y) – It(x, y)| > D(x, y)…………… (1)
After thresholding, a single iteration of morphological erosion
is applied to the detected foreground pixels to remove one-
pixel thick noise. In order to grow the eroded regions to their
original sizes, a sequence of erosion and dilation is performed
on the foreground pixel map. The statistics of the background
pixels that belong to the non-moving regions of current
image are updated with new image data
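The W4 rule in equation 1 can be written directly as follows (illustrative sketch; m, n and d stand for the per-pixel trained minimum, maximum and maximum interframe difference):

```python
def w4_is_foreground(pixel, m, n, d):
    """W4 rule (eq. 1): a pixel is foreground if it deviates from either
    the trained minimum m or the trained maximum n by more than the
    largest interframe difference d observed during training."""
    return abs(m - pixel) > d or abs(n - pixel) > d
```

A pixel that stays close to its trained intensity range is kept as background; anything far outside it is flagged for the morphological clean-up step.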
Stauffer and Grimson presented a novel adaptive online background mixture model that can robustly deal with lighting
changes, repetitive motions, clutter, removing objects from the
scene and slowly moving objects. The motivation was that a
background model could not handle image acquisition noise,
light change and multiple surfaces for a particular pixel at the
same time. Thus, they used a mixture of Gaussian
distributions to represent each pixel in the model. Due to its promising features, this model is implemented and integrated in our visual surveillance system. The history of the values of a particular pixel (e.g. scalars for gray values or vectors for color images) over time is considered as a "pixel process", and the recent history of each pixel, X1, . . ., Xt, is modeled by a mixture of K Gaussian distributions.
The probability of observing the current pixel value then becomes:

P(Xt) = Σ (i = 1..K) ωi,t · η(Xt, μi,t, ∑i,t) .......... (2)
where ωi,t is an estimate of the weight (the portion of the data accounted for by this Gaussian) of the i-th Gaussian (Gi,t) in the mixture at time t, μi,t is the mean value of Gi,t, ∑i,t is the covariance matrix of Gi,t, and η is a Gaussian probability density function:

η(Xt, μ, ∑) = (1 / ((2π)^(n/2) |∑|^(1/2))) · exp(−(1/2) (Xt − μ)^T ∑^(−1) (Xt − μ)) .......... (3)
The decision on K depends on the available memory and computational power. Also, for computational efficiency, the covariance matrix is assumed to be of the form:

∑k,t = σk² I .......... (4)
which assumes that red, green, blue color components are
independent and have the same variance. The procedure for
detecting foreground pixels is as follows. At the beginning of
the system, the K Gaussian distributions for a pixel are
initialized with predefined mean, high variance and low prior
weight. When a new pixel is observed in the image sequence, its RGB vector is checked against the K Gaussians until a match is found. A match is defined as a pixel value within 2.5 standard deviations of a distribution.
Next, the prior weights of the K distributions at time t, ωk,t, are updated as follows:

ωk,t = (1 − α) ωk,t−1 + α Mk,t .......... (5)

where α is the learning rate and Mk,t is 1 for the matching Gaussian distribution and 0 for the remaining distributions.
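The weight update of equation 5 can be checked with a small sketch (illustrative Python); note that it keeps the weights normalized: if they sum to 1 before the update, they sum to (1 − α) + α = 1 afterwards:

```python
def update_weights(weights, matched, alpha=0.01):
    """Eq. 5: decay every weight by (1 - alpha) and add alpha to the
    weight of the matched Gaussian (M_k,t = 1 only for the match)."""
    return [(1 - alpha) * w + alpha * (1.0 if k == matched else 0.0)
            for k, w in enumerate(weights)]
```

Repeated matches pull the matched Gaussian's weight toward 1 while the unmatched distributions decay toward 0.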
(a) (b)
Figure 1: Two different views of a sample pixel processes (in blue) and
corresponding Gaussian Distributions shown as alpha blended red spheres.
Figure 1 shows sample pixel processes and the Gaussian
distributions as spheres covering these processes. The
accumulated pixels define the background Gaussian
distribution whereas scattered pixels are classified as
foreground.
3. Temporal Differencing
Temporal differencing attempts to detect moving regions by
making use of the pixel-by-pixel difference of consecutive
frames (two or three) in a video sequence. This method is
highly adaptive to dynamic scene changes, however, it
generally fails in detecting whole relevant pixels of some
types of moving objects.
The temporal differencing algorithm fails to extract all pixels of a human's moving region. Also, this method fails to detect stopped objects in the scene, so additional methods need to be adopted in order to detect stopped objects for the success of higher-level tasks.
Lipton presented a two-frame differencing scheme where the
pixels that satisfy the following equation are marked as
foreground.
|It(x, y) − It−1(x, y)| > τ .......... (1)
In order to overcome shortcomings of two frame differencing
in some cases, three frame differencing can be used.
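Both variants can be sketched on flat pixel lists (illustrative Python; the threshold value is an assumption). Three-frame differencing only marks a pixel as moving when the current frame differs from both previous frames, which suppresses the 'ghost' that two-frame differencing leaves at an object's old position:

```python
def two_frame_diff(prev, curr, tau=25):
    # eq. 1: moving where the current frame differs from the previous one
    return [abs(c - p) > tau for p, c in zip(prev, curr)]

def three_frame_diff(prev2, prev1, curr, tau=25):
    # moving only where the current frame differs from BOTH previous frames
    return [abs(c - p1) > tau and abs(c - p2) > tau
            for p2, p1, c in zip(prev2, prev1, curr)]
```

In the ghost case (a pixel that changed one frame ago but is now stable), two-frame differencing still fires while three-frame differencing stays quiet.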
4. Optical Flow
Optical flow methods make use of the flow vectors of moving
objects over time to detect moving regions in an image. They
can detect motion in video sequences even from a moving
camera; however, most optical flow methods are computationally complex and cannot be used in real time without specialized hardware.
C. OBJECT CLASSIFICATION
Moving regions detected in video may correspond to different
objects in real-world such as pedestrians, vehicles, clutter, etc.
It is very important to recognize the type of a detected object
in order to track it reliably and analyze its activities correctly.
Currently, there are two major approaches towards moving object classification: shape-based and motion-based methods. Shape-based methods make use of the objects' 2D spatial information, whereas motion-based methods use temporally tracked features of objects for the classification solution.
1 Shape-Based Classification
Common features used in shape-based classification schemes are the bounding rectangle, area, silhouette and gradient of detected object regions. The approach presented in makes use of the object silhouette's contour length and area information to classify detected objects into three groups: human, vehicle and other. The method depends on the assumption that humans are, in general, smaller than vehicles and have complex shapes. Dispersedness is used as the classification
metric, and it is defined in terms of the object's area and contour length (perimeter) as follows:

Dispersedness = Perimeter² / Area .......... (1)
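Equation 1 then gives a one-line classifier (illustrative Python; the decision threshold of 60 is a hypothetical value chosen for illustration, not from the paper):

```python
def dispersedness(perimeter, area):
    """Eq. 1: dispersedness = perimeter^2 / area; compact regions
    (vehicles) score low, complex silhouettes (humans) score high."""
    return perimeter ** 2 / area

def classify_region(perimeter, area, threshold=60.0):
    # threshold is a hypothetical value chosen for illustration
    return "human" if dispersedness(perimeter, area) > threshold else "vehicle"
```

A compact blob (short perimeter for its area) classifies as vehicle; a ragged silhouette of the same area classifies as human.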
2 Motion-based Classification
Some of the methods in the literature use only temporal motion features of objects in order to recognize their classes. In general, these are used to distinguish non-rigid objects (e.g. humans) from rigid objects (e.g. vehicles). Optical flow analysis is also useful for distinguishing rigid and non-rigid objects.
II. LITERATURE SURVEY
1. PAST RESEARCH
The need to capture motion has been identified for decades in
various fields, many of them unsurprisingly with no direct
relation to character animation, interactivity or digital media - for instance, intruder detection systems. However, any
advances in these industries go unaccredited in documented
histories of motion capture, which prefer to point to a method
called 'rotoscoping', first used by Walt Disney, who traced
animation over film footage of live actors playing out the
scenes of the cartoon ‗Snow White‘. The quality of animation
produced is high, but as the tracing must be done by hand,
many of the advantages of automated motion capture are lost.
Aggarwal and Cai gave another survey of human motion analysis, which covered the work prior to 1997. The paper provided an overview of the various tasks involved in motion analysis of the human body. The focus was on three major areas related to interpreting human motion: (a) motion analysis involving human body parts, (b) tracking a moving human from a single view or multiple camera perspectives, and (c) recognizing human activities from image sequences.
III. PROPOSED WORK
The proposed system for real time video object detection,
tracking and classification system is shown in Figure 1
Figure 1: Proposed System Block Diagram.
The background is the image which contains the non-moving
objects in a video. Obtaining a background model is done in
two steps: first, the background initialization, where we obtain
the background image from a specific time from the video
sequence, then, the background maintenance, where the
background is updated due to the changes that may occur in
the real scene.
In any indoor or outdoor scene, there are many changes that
may occur over time and may be classified as changes to the background scene. These changes can be classified according to their sources as follows:
Illumination changes: such as the change of the sun's location, the change between cloudy and sunny weather, and turning the light on/off.
Motion changes: such as a small camera displacement or tree branches moving.
Changes introduced to the background: such as objects entering the scene and staying without moving for a long period of time.
1. Image Acquisition
The input device can be a digital video camera in a free environment connected to the computer, or a storage device on which a video file or individual video frames are stored. Video files are stored as Audio Video Interleave (AVI), a multimedia container format introduced by Microsoft.
2. Background Model
The background model must tolerate these kinds of changes.
The background maintenance helps the background model to
adapt to the many changes that may occur. The background maintenance model update uses two adaptations: sequential adaptation and periodic adaptation. The first is done by using a statistical background model, which provides a mechanism to adapt to slow changes in the scene; this adaptation is performed using a low pass filter and is applied to each pixel. The periodic adaptation is used to adapt to large illumination and physical changes that may happen in the scene, like deposited or removed objects.
3. Foreground Detection
The purpose of image segmentation is to separate foreground
regions from background area in order to detect any moving
objects. Thresholding is the simplest method of image
segmentation. Thresholding can be used to create binary
images, so that objects of interest are separated from the
background. A foreground pixel is given a value of "1" while a background pixel is given a value of "0".
Foreground regions can be extracted using the background subtraction method, which detects moving regions by making use of the pixel-by-pixel difference from a reference background image. The background image maintained over time is subtracted from the current acquired frame. The resulting image pixels are flagged as foreground pixels if the difference is large, and considered background pixels if it is near zero.
The foreground extraction process may produce images
containing holes, which can cause an object to be split into more
than one connected region and hence be detected as multiple
objects. We therefore restore objects to their original state and
size by applying a sequence of dilations and erosions.
Here It(x, y) is the pixel at location (x, y) in the current frame and
Bt(x, y) is the corresponding background pixel. The pixel is
marked as foreground if

|It(x, y) − Bt(x, y)| > T .................. (1)

is satisfied, where T is a predefined threshold.
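As a minimal sketch of Eq. (1) with NumPy (our illustration, not the authors' code; the function name and threshold value are arbitrary):

```python
import numpy as np

def foreground_mask(current, background, tau=30):
    """Flag pixels whose absolute difference from the background
    exceeds a predefined threshold tau, as in Eq. (1)."""
    diff = np.abs(current.astype(np.int32) - background.astype(np.int32))
    return (diff > tau).astype(np.uint8)  # 1 = foreground, 0 = background

# toy example: a bright 2x2 object appears on a dark background
bg = np.zeros((4, 4), dtype=np.uint8)
frame = bg.copy()
frame[1:3, 1:3] = 200
mask = foreground_mask(frame, bg, tau=30)
```

The cast to a signed integer type before subtraction avoids the wrap-around that unsigned image arithmetic would otherwise produce.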
Thresholding: Otsu's Method
Otsu's method is used to automatically perform histogram
shape-based image thresholding, i.e. the reduction of a gray-level
image to a binary image. The algorithm assumes that the image
to be thresholded contains two classes of pixels (e.g. foreground
and background), i.e. has a bimodal histogram, and then
calculates the optimum threshold separating those two classes
so that their combined spread (intra-class variance) is minimal.
In Otsu's method we exhaustively search for the threshold that
minimizes the intra-class variance, defined as a weighted sum of
the variances of the two classes:

σw²(t) = ω1(t) σ1²(t) + ω2(t) σ2²(t) .....(2)

The weights ω1, ω2 are the probabilities of the two classes
separated by a threshold t, and σ1², σ2² are the variances of
these classes. Otsu shows that minimizing the intra-class
variance is the same as maximizing the inter-class variance:

σb²(t) = σ² − σw²(t) = ω1(t) ω2(t) [μ1(t) − μ2(t)]² .....(3)

which is expressed in terms of the class probabilities ωi and
class means μi. The class probability ω1(t) is computed from the
histogram as:

ω1(t) = Σ i=1..t p(i) .....(4)

while the class mean μ1(t) is:

μ1(t) = [ Σ i=1..t p(i) x(i) ] / ω1(t) .....(5)

where x(i) is the value at the center of the ith histogram bin.
Similarly, ω2(t) and μ2(t) are computed on the right-hand side of
the histogram for bins greater than t.
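The exhaustive search over t described above can be sketched as follows (our own illustration in Python; MATLAB's graythresh, discussed next, implements the same idea internally):

```python
import numpy as np

def otsu_threshold(hist):
    """Exhaustively search for the threshold t that maximizes the
    between-class variance (equivalently minimizes the within-class
    variance) of a 256-bin grayscale histogram."""
    p = hist / hist.sum()                      # bin probabilities
    bins = np.arange(256)
    best_t, best_sigma_b = 0, -1.0
    for t in range(1, 256):
        w1, w2 = p[:t].sum(), p[t:].sum()      # class probabilities
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (bins[:t] * p[:t]).sum() / w1    # class means
        mu2 = (bins[t:] * p[t:]).sum() / w2
        sigma_b = w1 * w2 * (mu1 - mu2) ** 2   # between-class variance
        if sigma_b > best_sigma_b:
            best_t, best_sigma_b = t, sigma_b
    return best_t

# bimodal toy histogram: one mode near 50, one near 200
hist = np.zeros(256)
hist[45:56] = 100
hist[195:206] = 100
t = otsu_threshold(hist)
```

For a well-separated bimodal histogram like this one, the chosen threshold falls in the gap between the two modes.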
MATLAB functions for global thresholding:

level = graythresh(I)
BW = im2bw(I, level)

level = graythresh(I) computes a global threshold (level) that
can be used to convert an intensity image to a binary image
with im2bw. level is a normalized intensity value that lies in
the range [0, 1]. The graythresh function uses Otsu's method,
which chooses the threshold to minimize the intra-class
variance of the black and white pixels.

BW = im2bw(I, level) converts the grayscale image I to a binary
image based on the threshold. The output image BW replaces all
pixels in the input image with luminance greater than level with
the value 1 (white) and replaces all other pixels with the value 0
(black). Specify level in the range [0, 1]; this range is relative to
the signal levels possible for the image's class.
Figure 2: Background-subtracted image
4. Object and Feature Extraction
Object tracking can be implemented, regardless of the tracking
algorithm, only when features are efficiently identified and
extracted from frame to frame. The features used in this
system are the center of gravity (centroid), the velocity and the
size. The center of gravity is used to classify the object and
detect its location in the frame sequence. Object size is the
number of foreground pixels in the detected object. Figure 3
shows an illustration of the features.
Figure 3: Features -centroid and size of foreground image
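The centroid and size features can be computed directly from the binary foreground mask. A minimal NumPy sketch (the function name and toy mask are ours, not from the paper):

```python
import numpy as np

def centroid_and_size(mask):
    """Compute the centre of gravity and the size (number of
    foreground pixels) of a binary foreground mask."""
    ys, xs = np.nonzero(mask)
    size = len(xs)
    if size == 0:
        return None, 0
    centroid = (xs.mean(), ys.mean())   # (x, y) centre of gravity
    return centroid, size

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:6, 3:7] = 1                      # a 4x4 foreground blob
(cx, cy), size = centroid_and_size(mask)
```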
5. Object Tracking
The aim of object tracking is to establish a correspondence
between objects or object parts in consecutive frames, to
extract temporal information about objects such as trajectory,
and to track objects as a whole from frame to frame. Object
features such as size and center of gravity are used to create a
bounding box around the object in motion. The object tracking
algorithm uses the extracted object features together with a
correspondence matching scheme to track objects from frame to
frame.
Figure 4: Bounding Box enclosing human.
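One simple correspondence-matching scheme of the kind described above is greedy nearest-centroid matching between consecutive frames. This sketch is purely illustrative; the paper does not specify which matching rule it uses:

```python
import math

def match_objects(prev_objs, curr_objs, max_dist=50.0):
    """Greedy nearest-centroid correspondence matching between
    objects detected in consecutive frames. Each object carries a
    'centroid' (x, y); a matched current object inherits the id of
    the previous-frame object it is closest to."""
    matches = {}
    used = set()
    for pid, prev in prev_objs.items():
        best, best_d = None, max_dist
        for cid, curr in enumerate(curr_objs):
            if cid in used:
                continue
            d = math.dist(prev['centroid'], curr['centroid'])
            if d < best_d:
                best, best_d = cid, d
        if best is not None:
            matches[pid] = best
            used.add(best)
    return matches

prev = {0: {'centroid': (10.0, 10.0)}, 1: {'centroid': (80.0, 40.0)}}
curr = [{'centroid': (82.0, 41.0)}, {'centroid': (12.0, 11.0)}]
m = match_objects(prev, curr)
```

Objects that move less than max_dist pixels between frames keep their identity; new or fast-moving objects remain unmatched and would start new tracks.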
6. Object Classification
Motion-based classification involves extracting temporal
motion features of objects in order to recognize their classes,
in general to distinguish non-rigid objects (e.g. humans) from
rigid objects (e.g. vehicles). The method is based on the
temporal self-similarity of a moving object: as an object that
exhibits periodic motion evolves, its self-similarity measure
also varies periodically. The method exploits this clue to
categorize moving objects using periodicity.
Time-dependent features carry a considerable amount of
information concerning the identity of an object. For example,
the periodicity of human gait is very effective for separating a
walking human from a moving car.
7. Object Processing
Once an object is identified and classified as human or
vehicle, it is represented by indexing it by name or number, or
an alarm is generated when an identified object enters a
restricted area. The indexed images of objects can be compared
with a stored database.
IV. RESULTS
Figure 1: Object detected with background eliminated and tracked
Figure 2: Human detected with background eliminated
V. CONCLUSION
In the proposed system for foreground analysis of real-time
moving object detection and tracking, foreground regions of
potential moving objects are extracted, morphological operations
are applied to eliminate noise, and moving objects and humans
are detected as shown in the figures above. Further analysis,
such as behavior analysis, remains to be performed.
REFERENCES
[1] M. Hussein, W. Abd-Almageed, Y. Ran and L. Davis, "Real-Time Human
Detection, Tracking, and Verification in Uncontrolled Camera Motion
Environments," Institute for Advanced Computer Studies, University of
Maryland, IEEE.
[2] Y. Dedeoglu, "Moving Object Detection, Tracking and Classification for
Smart Video Surveillance."
[3] J. K. Aggarwal and Q. Cai, "Human Motion Analysis: A Review,"
Computer Vision and Image Understanding, 73(3):428–440, March 1999.
[4] J.-X. Xu, D. Huang, V. Venkataramanan and H. T. C. Tuong, "An Extreme
Precise Motion Tracking of Piezoelectric Positioning Stage Using
Sampled-Data Iterative Learning Control," 2011.
Naturalness of Machine Synthesized Tamil Speech
*A. G. Ramakrishnan, #Kusumika Krori Dutta, #Lakshmi Chithambaran
*Professor, Department of Electrical Engineering, Indian Institute of Science, Bangalore-560012
#Department of Electrical and Electronics Engineering, MSRIT, Bangalore-560054
Abstract: In general, Text-To-Speech (TTS) conversion
systems generate un-intonated speech that lacks the
naturalness of human expression. This paper proposes a
method to modify the un-intonated speech output of a
Tamil TTS engine into intonated speech for
interrogative sentences. This is achieved by applying a
model of the intonated interrogative contour, representing
the proper speech parameters, to the TTS output. The
modeling and implementation have been performed using
PRAAT interfaced with MATLAB.
Key Words: Tamil speech, TTS, interrogation, Pitch,
MATLAB, PRAAT.
1. INTRODUCTION
Over the last decade, speech processing usage has
become ubiquitous, fueled by the increasing
demand for automated systems with spoken
language interfaces. Considerable amount of
research has been done in prosody modification of
synthesized speech for European languages.
However, research on prosody in Indian languages
is very limited [1-2]. In this paper, we attempt to
model the pitch, amplitude and duration of
interrogative sentences in Tamil, which is one of
the oldest languages of Indian sub-continent.
Speech has an important role in human emotional
expression. The same sentence, uttered with
different emotions, may give different meanings.
This work aims to help people with cerebral palsy,
who have all their emotions intact but cannot talk.
Text-To-Speech (TTS) conversion systems help
them to overcome the disability, but the output of
TTS has no intonation and thus no information on
the feelings of the speaker. Information conveyed
through such speech can be misunderstood [3], so
the whole purpose of communication may be lost
and may not evoke the expected response from
people [3]. The work reported here modifies the
prosody of an un-intonated interrogative sentence to
one that has a more natural expression. For
introducing interrogative expression in machine-
synthesized speech, a quantitative model is created
for changes in the different parameters responsible
for prosody [1-2], such as pitch, duration and
energy. As per this model, pitch is modified using
the DCT on the pitch-synchronous linear predictive
coding (LPC) residual signal [2, 4].
2. INTERROGATIVE PROSODY MODEL
Fifteen interrogative sentences were recorded twice
from seven native Tamil speakers, irrespective of
gender, age and accent. They were asked to utter
each sentence first without intonation and then with
intonation. The pitch, energy and durational features
of all the recorded sentences were analyzed with
the help of [1-2].
2.1 Pitch
Pitch is the most expressive feature of speech and
an indicator of different emotions. For example, a
narrow pitch range indicates boredom, depression,
or controlled anger [1].
As shown in Figs. 1 and 2, the pitch contour of a
naturally uttered (intonated) interrogative sentence
shows a pronounced rise-and-fall pattern compared
to the un-intonated utterance of the same sentence.
The contour can be approximately modeled as a
Gaussian curve. The rise-and-fall pattern is
governed by the location of the question word: if the
question word is at the end of the sentence, the
previous word forms the rise part and the question
word forms the fall part; otherwise, the question
word forms the rise part and the next word in the
sentence forms the fall part.
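The Gaussian approximation of the interrogative pitch contour can be sketched numerically. All parameter values below (baseline F0, bump height, bump position and width) are illustrative assumptions, not values from the paper:

```python
import numpy as np

def gaussian_pitch_contour(n_frames, f0_base=120.0, peak_rise=60.0,
                           center_frac=0.7, width_frac=0.15):
    """Model the rise-and-fall pitch contour of an intonated
    interrogative sentence as a Gaussian bump added to a flat
    baseline F0 (in Hz). Time is normalized to [0, 1]."""
    t = np.linspace(0.0, 1.0, n_frames)
    bump = peak_rise * np.exp(-0.5 * ((t - center_frac) / width_frac) ** 2)
    return f0_base + bump

contour = gaussian_pitch_contour(100)
```

The bump's center would sit over the rise/fall word pair identified by the position of the question word.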
2.2 Energy
The mean energy of the whole speech signal (the sum
of the squares of all the samples divided by the total number of
samples) of an intonated sentence is 10 to 28%
more than that of the corresponding un-intonated
sentence.
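The mean-energy definition above is straightforward to compute. In this toy check (our own, not from the paper), scaling a signal's amplitude by 1.1 raises its mean energy by 21%, within the 10–28% range the authors report:

```python
import numpy as np

def mean_energy(x):
    """Mean energy of a speech signal: the sum of squared samples
    divided by the total number of samples."""
    x = np.asarray(x, dtype=float)
    return np.sum(x ** 2) / len(x)

un_intonated = np.sin(np.linspace(0, 20 * np.pi, 1000))
intonated = 1.1 * un_intonated          # ~10% amplitude increase
ratio = mean_energy(intonated) / mean_energy(un_intonated)
```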
2.3 Duration
Duration is the time in seconds for which the basic
unit of speech under consideration lasts. In this
work, the unit studied is a word. Table 1 lists the
mean relative durations of intonated sentences.
Table 1. Variation in the duration characteristics of
intonated sentences compared to those of un-intonated ones.
3. VALIDATION
To validate the heuristic model of the prosodic
parameters, three varieties of semitones were
created: pure sinusoids, half-wave rectified
sinusoids (to include the second harmonic) and
combinations of the two in different proportions,
and a Mean Opinion Score (MOS) was obtained on
a scale of one to ten from four subjects.
Table 2. MOS of tones

Nature of synthetic tones studied | MOS (on a scale of 10)
Pure sinusoids | 6
Rectified (harmonics) | 4
1:1 combination of sinusoid & harmonics | 7
4. EXPERIMENTAL STUDY
The above model was applied on the output of the
TTS developed by the MILE LAB, Dept. of Electrical
Engineering, Indian Institute of Science [1, 6].
Fig. 3: Block diagram of the process to modify the
TTS output.
The TTS output [2-3] is segmented at the word level
and analyzed with the Praat software. The pitch
marks obtained from Praat are used in MATLAB to
extract each pitch period of the speech signal for
pitch-synchronous analysis. Linear predictive
analysis is done on each period of the speech
signal, by which the basic excitation signal is
decoupled from the spectral shaping effects of the
vocal tract. The excitation signal contains the pitch
information.
Human speech production is modeled as

s(n) = Σ k=1..p ak s(n−k) + G u(n) --------------(1)

where s(n) are the speech signal samples, u(n) is
the excitation and ak are the LP coefficients. LPC
predicts the current sample using a linear
combination of p past samples:

s^(n) = Σ k=1..p ak s(n−k) -----------------------(2)

where s^(n) is the estimated speech signal.
When the error between the original speech signal
and the estimated signal is taken, the error gives us
only the excitation pulse, while the impulse response
of the vocal tract (formant information) is captured
in the ak's:

e(n) = s(n) − s^(n) -------------------------(3)
e(n) = s(n) − Σ k=1..p ak s(n−k) -----------(4)
e(n) = G u(n) --------------------------------(5)

where e(n) is the error and u(n) is the pitch
excitation. Therefore, from equations (3), (4) and (5)
we can say that the error between the original
speech signal and the estimated signal gives us the
basic excitation, with the formants captured in ak.
Taking the Z-transform of (4), the error is the output
of a system with transfer function

A(z) = 1 − Σ k=1..p ak z^−k -----------------------(6)
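A rough sketch of the LP analysis and inverse filtering described above, using the autocorrelation normal equations rather than Praat/MATLAB (function names and the synthetic test signal are ours, not the paper's pipeline):

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate LP coefficients a_k by solving the autocorrelation
    normal equations R a = r (a simplified alternative to the
    Levinson-Durbin recursion)."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])   # s^(n) = sum_k a_k s(n-k)

def lpc_residual(frame, a):
    """Pass the frame through the inverse filter A(z) = 1 - sum a_k z^-k,
    as in Eq. (6); the output e(n) approximates the excitation."""
    p = len(a)
    e = frame.astype(float).copy()
    for n in range(len(frame)):
        for k in range(1, p + 1):
            if n - k >= 0:
                e[n] -= a[k - 1] * frame[n - k]
    return e

# synthetic AR(2) "speech" signal: the residual energy should be
# far below the signal energy once the vocal-tract model is removed
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
s = np.zeros(2000)
for n in range(2, 2000):
    s[n] = 1.6 * s[n - 1] - 0.64 * s[n - 2] + 0.01 * u[n]
a = lpc_coefficients(s, 2)
e = lpc_residual(s, a)
```

Because the test signal is itself an order-2 autoregressive process, the estimated coefficients recover the generating ones and the residual is (approximately) the scaled excitation.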
The error of each period is obtained by passing the
period through the filter described in (6), and the
error is manipulated in the DCT domain with the
modeled pitch contour as reference. The pitch
Duration feature | Relative duration in intonated sentences
Duration of the entire sentence | Longer by 19.7% on average
First word in the curve | Shorter by 31.5% on average
Second word in the curve | Longer by 32% on average
contour is modeled as a normal distribution curve
(from Section 2.1). The block diagram of pitch
modification is shown in Fig. 4.
The hypothesized durational features are
implemented by inserting or deleting frames
appropriately; the hypothesized energy features are
implemented by multiplying the signal by the
hypothesized factor.
5. RESULTS AND DISCUSSIONS
Un-intonated interrogative sentences were recorded
from people and modified by the algorithm; then
intonated interrogative sentences were recorded
from people and the MOS was taken. The MOS is
as follows:
Table 3. MOS of intonated interrogative speech
compared to naturalness.

Speech | MOS (out of 5)
Un-intonated recording | 0
Un-intonated recording modified by the algorithm | 3.5
Natural intonated recording | 5
From the MOS we can say that the algorithm
proposed in this paper modifies un-intonated speech
to be very close to natural speech.
Next, the output of the TTS was taken for an
interrogative sentence in Tamil and was modified by
the algorithm, and then the same sentence was
recorded with intonation from people.
Table 4. MOS of the algorithm in this paper as
compared to TTS and natural recording.
From the MOS of the TTS we can see that the TTS
output in itself does not have a flat contour, as
shown for the un-intonated recording in Fig. 1.
Imposing our model on the existing random contour
of the TTS output causes deterioration in the quality
of the modified speech.
6. CONCLUSION
As aimed, with the hypothesis and algorithm
followed in this paper, the interrogative intonation
of non-intonated sentences is brought very close to
natural intonation.
Further work can be done on improving the quality
of the TTS-modified speech, and the same algorithm
can be followed for other emotions.
ACKNOWLEDGEMENT
We whole heartedly thank all the members of MILE
LAB, Department of Electrical Engineering, Indian
Institute of Science. A special thanks to Mr. H R
Shiva Kumar for developing the TTS and making it
available as a web demo.
REFERENCES
[1] G. L. Jayavardhana Rama et al., "Thirukkural: A Speech
Synthesis System in Tamil," Proc. Tamil Internet 2001,
Kuala Lumpur, Malaysia, Aug. 26-28, 2001.
[2] R. Muralishankar and A. G. Ramakrishnan, "Human Touch
to the Tamil Synthesizer," Proc. Tamil Internet 2001, Kuala
Lumpur, August 26-28, 2001, pp. 103-109.
Speech | MOS (out of 5)
TTS | 1
Algorithm as proposed in this paper | 2
Intonated recording | 5
OPTICAL FIBER COMMUNICATION
1K. B. MOHD. UMAR ANSARI, 2SATYENDRA VISHWAKARMA, 3ANUP KUMAR
1,2 M.Tech (Electrical Power & Energy Systems), Department of Electrical & Electronics Engineering
3 B.Tech (Mechanical Engineering), Department of Mechanical Engineering
Ajay Kumar Garg Engineering College, Ghaziabad, Uttar Pradesh, India
[email protected], [email protected], [email protected]
Abstract: Fiber-optic communication is a method of transmitting
information from one place to another by sending pulses
of light through an optical fiber. The light forms
an electromagnetic carrier wave that is modulated to carry
information. First developed in the 1970s, fiber-
optic communication systems have revolutionized the
telecommunications industry and have played a major role in the
advent of the Information Age. Because of their advantages over
electrical transmission, optical fibers have largely replaced
copper wire communications in core networks in the developed
world.
The process of communicating using fiber-optics involves the
following basic steps: Creating the optical signal involving the
use of a transmitter, relaying the signal along the fiber, ensuring
that the signal does not become too distorted or weak, receiving
the optical signal, and converting it into an electrical signal.
Keywords: Optical fiber, Principle, Modes, Elements,
Applications.
1. INTRODUCTION
The phenomenon of total internal reflection, responsible for
the guiding of light in optical fibers, has been known since 1854
[1]. Although glass fibers were made in the 1920s, their use
became practical only in the 1950s, when the use of a cladding
layer led to considerable improvement in their guiding
characteristics. Before 1970, optical fibers were used mainly
for medical imaging over short distances [2]. Their use for
communication purposes was considered impractical because
of high losses. However, the situation changed drastically in
1970 when, following an earlier suggestion, the loss of optical
fibers was reduced to below 20 dB/km [2]. Further progress
resulted by 1979 in a loss of only 0.2 dB/km near the 1.55-µm
spectral region [3]. The availability of low-loss fibers led to a
revolution in the field of light wave technology and started the
era of fiber-optic communications.
2. GEOMETRICAL OPTICS DESCRIPTION
In its simplest form an optical fiber consists of a cylindrical
core of silica glass surrounded by a cladding whose refractive
index is lower than that of the core. Because of the abrupt index
change at the core–cladding interface, such fibers are called
step-index fibers. In a different type of fiber, known as
graded-index fiber, the refractive index decreases gradually
inside the core. Figure 1 shows schematically the index profile
and the cross section for the two kinds of fibers. Considerable
insight into the guiding properties of optical fibers can be gained
by using a ray picture based on geometrical optics [3]. The
geometrical-optics description, although approximate, is valid
when the core radius a is much larger than the light
wavelength λ.
Figure 1: Cross section and refractive-index profile for step-
index and graded-index fibers.
Step-Index Fibers
Consider the geometry of Fig. 2, where a ray making an
angle θi with the fiber axis is incident at the core center.
Because of refraction at the fiber–air interface, the ray bends
toward the normal. The angle θr of the refracted ray is given by
[3]

n0 sin θi = n1 sin θr ........(1)

where n1 and n0 are the refractive indices of the fiber core and
air, respectively. The refracted ray hits the core–cladding
interface and is refracted again. However, refraction into the
cladding is possible only for an angle of incidence φ such that

sin φ < n2/n1 ........(2)

Figure 2: Light confinement through total internal reflection in
step-index fibers. Rays for which φ < φc are refracted out of the
core.

One can use Eqs. (1) and (2) to find the maximum angle that
the incident ray should make with the fiber axis to remain
confined inside the core. Noting that θr = π/2 − φc for such a ray,
where the critical angle φc is defined by sin φc = n2/n1, and
substituting it in Eq. (1), we obtain

n0 sin θi = n1 cos φc = (n1² − n2²)^1/2 ........(3)

In analogy with lenses, n0 sin θi is known as the numerical
aperture (NA) of the fiber.
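The NA relation is easy to evaluate numerically. In the sketch below, the function names and the index values n1 = 1.48, n2 = 1.46 (typical of silica fibers) are illustrative, not taken from the paper:

```python
import math

def numerical_aperture(n1, n2):
    """NA of a step-index fiber: n0 sin(theta_i) = sqrt(n1^2 - n2^2)."""
    return math.sqrt(n1 ** 2 - n2 ** 2)

def acceptance_angle_deg(n1, n2, n0=1.0):
    """Maximum incidence angle (launching from a medium with
    index n0, here air) for rays that remain guided."""
    return math.degrees(math.asin(numerical_aperture(n1, n2) / n0))

na = numerical_aperture(1.48, 1.46)
theta = acceptance_angle_deg(1.48, 1.46)
```

For these indices the NA is about 0.24, giving an acceptance half-angle of roughly 14 degrees.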
Graded-Index Fibers
The refractive index of the core in graded-index fibers is not
constant but decreases gradually from its maximum value n1 at
the core center to its minimum value n2 at the core–cladding
interface. Most graded-index fibers are designed to have a
nearly quadratic decrease and are analyzed by using the
α-profile, given by [3]

n(ρ) = n1[1 − Δ(ρ/a)^α] for ρ < a; n(ρ) = n1(1 − Δ) = n2 for ρ ≥ a ........(4)

where a is the core radius and Δ = (n1 − n2)/n1. The parameter
α determines the index profile. A step-index profile is
approached in the limit of large α. A parabolic-index fiber
corresponds to α = 2.
It is easy to understand qualitatively why intermodal (or
multipath) dispersion is reduced in graded-index fibers. Figure
3 shows schematically the paths of three different rays. As in
step-index fibers, the path is longer for more oblique rays.
However, the ray velocity changes along the path because of
variations in the refractive index. More specifically, the ray
propagating along the fiber axis takes the shortest path but
travels most slowly as the index is largest along this path.
Oblique rays have a large part of their path in a medium of
lower refractive index, where they travel faster. It is therefore
possible for all rays to arrive together at the fiber output by a
suitable choice of the refractive-index profile.
Figure 3: Ray trajectory in a graded index fibers
Geometricalopticscanbeusedtoshowthataparabolic-index
profileleadsto
nondispersivepulsepropagationwithintheparaxialapproxi
mation. Thetrajectory
ofaparaxialrayisobtainedbysolving[3]
where α is the radial distance of the ray from the axis.
3. GENERAL OVERVIEW OF OPTICAL FIBER
COMMUNICATION SYSTEM :
Like all other communication systems, the primary objective
of an optical fiber communication system is to transfer a
signal containing information (voice, data, video) from the
source to the destination. The general block diagram of an
optical fiber communication system is shown in Figure 4.
The source provides information in the form of an electrical
signal to the transmitter. The electrical stage of the
transmitter drives an optical source to produce a modulated
light-wave carrier; semiconductor LASERs or LEDs are
usually used as the optical source. The information-carrying
light wave then passes through the transmission medium, i.e.
the optical fiber cable. It then reaches the receiver stage,
where an optical detector demodulates the optical carrier and
gives an electrical output signal to the electrical stage. The
common types of optical detectors used are photodiodes
(p-i-n, avalanche), phototransistors,
photoconductors etc. Finally, the electrical stage recovers the original information and delivers it to the destination.
Figure 4: OPTICAL FIBER COMMUNICATION SYSTEM.
4. PRIMARY ELEMENTS OF OPTICAL FIBER
COMMUNICATION SYSTEM:
Figure 5 shows the major elements used in an optical fiber
communication system. As we can see, the transmitter stage
consists of a light source and the associated drive circuitry,
while the receiver section includes a photodetector, a signal
amplifier and a signal restorer.
Additional components such as optical amplifiers, connectors,
splices and couplers are also present. The regenerator section
is a key part of the system, as it amplifies and reshapes the
distorted signal on long-distance links.
Figure 5: Elements Of an Optical Fiber Communication
System.
4.1 Transmitter section:
The main parts of the transmitter section are a source (either
an LED or a LASER), an efficient coupling means to couple
the output power into the fiber, a modulation circuit and, for
LASERs, a level controller. At present, for longer repeater
spacing, the use of single-mode fibers and LASERs appears
essential, whereas the earlier transmitters, operating in the
0.8 µm to 0.9 µm wavelength range, used double-
heterostructure LASERs or LEDs as optical sources.
Direct coupling of the source to optical fibers results in high
coupling losses; for LASERs, two types of lenses are used for
this purpose, namely discrete lenses and integral lenses.
4.1.1 LED vs LASER as optical source:
A larger fraction of the output power can be coupled into the
optical fiber in the case of LASERs, as they emit a more
directional light beam than LEDs. That is why LASERs are
more suitable for high bit-rate systems. LASERs also have a
narrow spectral width and a faster response time;
consequently, LASER-based systems are capable of operating
at much higher modulation frequencies than LED-based
systems. On the other hand, typical LEDs have lifetimes in
excess of 10^7 hours, whereas LASERs last only about 10^5
hours, and LEDs can operate at much lower input currents
than LASERs. So, according to the situation and requirements,
either an LED or a LASER can be utilized as the optical
source.
A number of factors pose limitations on transmitter design,
such as electrical power requirement, speed of response,
linearity, thermal behavior, spectral width etc.
4.1.2 Drive circuitry:
These are the circuits used in the transmitter to switch a
current in the range of ten to several hundred milliamperes,
as required for proper functioning of the optical source. For
LEDs there are drive circuits such as the common-emitter
saturating switch, low-impedance, emitter-coupled and
transconductance drive circuits. For LASERs, shunt drive
circuits, bias-control drive circuits, ECL-compatible LASER
drives etc. are notable.
4.2 Receiver section:
As shown in Figure 5, the general structure of a receiver
section includes a photodetector, a low-noise front-end
amplifier, a voltage amplifier and a decision-making circuit to
recover the exact information signal. The high-impedance
amplifier and the transimpedance amplifier are the two
popular configurations of the front-end amplifier, whose
design is critical for sensible receiver performance. The two
most common photodetectors are p-i-n diodes and avalanche
photodiodes; quantum efficiency, responsivity and speed of
response are the key parameters behind the choice of
photodetector. The most important requirements of an optical
receiver are sensitivity, bit-rate transparency, bit-pattern
independence, dynamic range, acquisition time etc. As the
noise contributed by the receiver is higher than that of the
other elements in the system, we must keep a keen check
on it.
5. BENEFITS OF OPTICAL FIBER COMMUNICATION SYSTEM:
Some of the innumerable benefits of optical fiber
communication systems are:
- Immense bandwidth to utilize
- Total electrical isolation in the transmission medium
- Very low transmission loss
- Small size and light weight
- High signal security
- Immunity to interference and crosstalk
- Very low power consumption and wide scope of system expansion
6. APPLICATIONS
Due to its variety of advantages, the optical fiber communication
system has a wide range of applications in different fields,
namely:
- The public network field, which includes trunk networks,
junction networks, local access networks, submerged systems,
synchronous systems etc.
- The field of military applications
- Civil, consumer and industrial applications
- The field of computers, which is at the center of research right
now
7. CONCLUSION
Though there are some negatives of the optical fiber
communication system in terms of fragility, splicing, coupling,
set-up expense etc., it is an unavoidable fact that optical fiber
has revolutionized the field of communication. As soon as
computers become capable of processing optical signals, the
entire arena of communication will become optical.
REFERENCES
[1] K. C. Kao and G. A. Hockham, Proc. IEE 113, 1151 (1966);
A. Werts, Onde Electr. 45, 967 (1966).
[2] N. S. Kapany, Fiber Optics: Principles and Applications,
Academic Press, San Diego, CA, 1967.
[3] J. Gower, Optical Communication Systems, 2nd ed.,
Prentice Hall, London.
Improved Face Recognition with Multilevel
BTC using YCbCr Colour Space
Shoan Herman Pinto, ECE Department, SJBIT, Bangalore, India, [email protected]
Chitra V Kumar, ECE Department, SJBIT, Bangalore, India, [email protected]
Shreyas S Sogal, ECE Department, SJBIT, Bangalore, India, [email protected]
Abstract: The motive of the work presented in this
paper is to achieve better efficiency in face
recognition using Block Truncation Coding (BTC)
in the RGB and YCbCr colour spaces. Multilevel
Block Truncation Coding is applied to images in the
RGB and YCbCr colour spaces up to four levels for
face recognition. The experimental analysis shows
an improved result for Block Truncation Coding at
level 4 (BTC-level 4) as compared to the other BTC
levels in the RGB colour space. Results displaying a
similar pattern are realized when the YCbCr colour
space is used; in addition, an improvement on all
four levels is observed for the YCbCr colour space.
Keywords- Face recognition; BTC; RGB; YCbCr;
Multilevel BTC; Mean Square Error;
1. INTRODUCTION
Face recognition refers to identifying and verifying a face
image. A face recognition system accomplishes this by
comparing the input query face image with the existing face
images stored in the database. It exploits the unique
characteristics of an individual's face. Face recognition is the
fastest growing biometric technology. Biometrics may be
defined as an automated method of recognizing a person based
on physiological and behavioral characteristics. There are
many biometric systems, such as fingerprints, iris, voice,
retina and face. Among these systems, face recognition has
proved to be the most effective and universal. These systems
are used in a wide range of applications that require reliable
recognition of humans. Some of the applications of face
recognition include security, physical and computer access
controls, law enforcement [11], [12], criminal list verification,
surveillance at various places [14], authentication at airports,
forensics, etc.
Face recognition has become a centre of attention for
researchers from the fields of biometrics, computer vision,
image processing, neural networks and pattern recognition.
Many algorithms are used to make effective face recognition
systems. Some of the algorithms include Principal Component
Analysis (PCA) [2], [3], [4], Linear Discriminant Analysis
(LDA) [5], [6], [7], Independent Component Analysis (ICA)
[8], [9], [10] etc.
The face images in a database might not be of constant size.
Thus, to make the algorithm independent of the size of a face
image, Block Truncation Coding (BTC) [12], [13] has been
used. This coding technique has been implemented up to four
levels on two colour face image databases.
2. BLOCK TRUNCATION CODING
Block Truncation Coding (BTC) [1], [11], [12], [13] is a
relatively simple image coding technique developed in the
early years of digital imaging, more than 29 years ago.
Although it is a simple technique, BTC has played an
important role in the history of digital image coding, in the
sense that many advanced coding techniques have been
developed based on BTC or inspired by its success. BTC was
first developed in 1979 for grayscale image coding [13]. In
the given implementation of BTC, the colour face image
database in the RGB (Red, Green and Blue) colour space
[16], [17] has been used; it is later converted to the YCbCr
colour space.
3. COLOUR SPACE
Various colour spaces exist because they present colour
information in ways that make certain calculations more
convenient, or because they provide a more intuitive way to
identify colours. A few examples are RGB, XYZ, xyY,
L*a*b, YCbCr and HSV. In this paper the RGB and YCbCr
colour spaces are used.
Y represents the luminance and Cb & Cr represent
the chrominance components. This format is
typically used in video coding.
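The conversion from RGB to YCbCr needed later in the paper can be sketched as below. We assume the full-range ITU-R BT.601 transform commonly used in image pipelines; the paper does not state which variant it uses:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an 8-bit RGB image to full-range YCbCr using the
    ITU-R BT.601 coefficients (assumed variant, see lead-in)."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b            # luminance
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0  # blue-difference
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0  # red-difference
    return np.stack([y, cb, cr], axis=-1)

# a pure white pixel maps to (Y, Cb, Cr) = (255, 128, 128)
white = np.full((1, 1, 3), 255, dtype=np.uint8)
ycc = rgb_to_ycbcr(white)
```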
4. MULTI-LEVEL BTC
To calculate the feature vector in this algorithm, Block
Truncation Coding has been used (for further information
refer to [11], [12], [13]). It has been implemented on four
levels, which are explained below:
4.1 BTC Level 1
A face image is taken from the database and the average intensity value of each colour plane of the image is calculated. The colour space considered in this algorithm is the RGB colour space [16], [17], so the average intensity value of each of the RGB planes of a face image is calculated. The further discussion uses the Red plane of an image; the same has to be carried out for the Blue and Green planes.
After obtaining the average intensity value of the Red plane of the face image, each pixel is compared with the mean value and the image is divided into two regions: Upper Red and Lower Red [18]. The average intensity values of these regions are calculated and stored in the feature vector as UR and LR. Thus, after repeating this procedure for the Blue and the Green planes, the feature vector has six elements: Upper Red, Lower Red, Upper Green, Lower Green, Upper Blue, Lower Blue (UR, LR, UG, LG, UB, LB) [18]. Refer to figure 1.
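The level-1 computation above can be sketched as follows (an illustrative Python sketch, not the paper's MATLAB implementation; the function names are ours):

```python
import numpy as np

def btc_level1(plane):
    """BTC level 1 for one colour plane: split the pixels around the
    plane mean and return the means of the upper and lower regions."""
    m = plane.mean()
    upper = plane[plane >= m]   # pixels at or above the mean
    lower = plane[plane < m]    # pixels below the mean
    return upper.mean(), lower.mean()

def btc_level1_rgb(img):
    """Six-element feature vector (UR, LR, UG, LG, UB, LB) for an RGB image."""
    feats = []
    for c in range(3):          # R, G, B planes in turn
        u, l = btc_level1(img[:, :, c].astype(float))
        feats.extend([u, l])
    return feats
```

For a constant plane the lower region is empty, so a production version would guard against an empty split.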
4.2 BTC Level 2
At level two the values Upper Red and Lower Red are extracted from the feature vector of BTC level 1 and, using these values, the Red plane of the face image is now divided into four regions: Upper-Upper Red, Upper-Lower Red, Lower-Upper Red and Lower-Lower Red [18]. The average intensity values in these four regions are calculated and stored in the feature vector. The above process is repeated for the Blue and Green planes of the face image. Thus the feature vector at this level has 12 elements, 4 elements for each plane. Refer to figure 1.
4.3 BTC Level 3 and Level 4
Using the procedures described in Levels 1 and 2, the face images are further divided into more regions in each colour plane. These regions are depicted in figure 1. The average intensity values in these regions are calculated and stored in the feature vector. The feature vector has 24 elements at BTC level 3 and 48 elements at BTC level 4. The feature vectors obtained in BTC levels 1, 2, 3 and 4 are used for comparison with the database image set. Figure 1 depicts the four BTC levels with their feature vectors.
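Since each level splits every region of the previous level around that region's own mean, the whole hierarchy can be expressed recursively. A sketch (our formulation, not the paper's code) yielding 2, 4, 8 and 16 values per plane for levels 1 to 4:

```python
import numpy as np

def btc_features(pixels, level):
    """Recursively split a 1-D array of pixel values around its mean.
    Level 1 returns [upper_mean, lower_mean]; each further level splits
    both regions again, doubling the length of the feature vector."""
    m = pixels.mean()
    upper, lower = pixels[pixels >= m], pixels[pixels < m]
    if level == 1:
        return [upper.mean(), lower.mean()]
    return btc_features(upper, level - 1) + btc_features(lower, level - 1)
```

Applied to the three RGB planes this gives 6, 12, 24 and 48 elements at levels 1 to 4, matching the vector sizes stated above.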
4.4 BTC for YCbCr plane
YCbCr is a family of colour spaces used as a part of the colour image pipeline in video and digital photography systems. Y is the luma component, and Cb and Cr are the blue-difference and red-difference chroma components. The same algorithm used for the RGB planes is implemented on the YCbCr planes by converting the RGB image to a YCbCr image.
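A minimal sketch of that conversion, using the full-range ITU-R BT.601 coefficients (MATLAB's rgb2ycbcr uses a studio-swing variant with slightly different scaling, so this is illustrative only):

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr for 8-bit channel values."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```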
5. PROPOSED METHOD
5.1 Feature vector extraction
The feature vector at each BTC level for the query image and the database set is extracted using the method described in the previous section (section 4). Feature vectors for the Red, Green and Blue components of the image are obtained. These feature vectors are then used in the face recognition system.
5.2 Implementation using feature vectors
The feature vectors obtained in each level of BTC are used to compare with the database images (training set). The comparison (similarity measure) is done by the Mean Square Error (MSE), given by

MSE = (1 / mn) ΣᵢΣⱼ [X(i,j) − X′(i,j)]²

where X and X′ are the two feature vectors of size m×n being compared. False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR) are used to evaluate the performance of the different BTC-level based face recognition techniques.
The mean square error is calculated between the query image's feature vector and every feature vector in the database; the database image with the minimum MSE is returned as the recognized image.
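The matching rule can be sketched as a nearest-neighbour search under MSE (illustrative Python; the names are ours):

```python
import numpy as np

def mse(x, y):
    """Mean square error between two equal-sized feature vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.mean((x - y) ** 2)

def recognise(query_vec, train_vecs):
    """Index of the training feature vector closest to the query in MSE."""
    errors = [mse(query_vec, t) for t in train_vecs]
    return int(np.argmin(errors))
```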
6. IMPLEMENTATION
6.1 Platform
The implementation of the Multilevel BTC is done in
MATLAB 2010b. It was carried out on a computer using an
Intel Core i3 processor.
6.2 Database
The experiments were performed on a face database created by Dr Libor Spacek. This database has 1000 images (each 180 by 200 pixels), corresponding to 100 persons in 10 poses each, including both males and females. All the images are taken against a dark or bright homogeneous background, with little variation of illumination but different facial expressions and details. The subjects sit at a fixed distance from the camera and are asked to speak while a sequence of images is taken; the speech is used to introduce facial expression variation. The images were taken in a single session. Six poses from the face database are shown in Figure 2.
Figure 2
6.3 YCbCr based BTC Algorithm
Step 1: Training images are read into a vector and the RGB colour space is converted to the YCbCr colour space.
Step 2: BTC levels 1, 2, 3 and 4 are applied to the training images.
Step 3: BTC levels are applied to the test image.
Step 4: For each training image, the mean square error with respect to the test image is determined.
Step 5: The training image with the least mean square error is the recognized image for the test image.
7. RESULTS AND DISCUSSION
False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR) are standard performance evaluation parameters of a face recognition system.
The False Acceptance Rate (FAR) is a measure of the likelihood that the biometric security system will incorrectly accept an access attempt by an unauthorized user. A system's FAR is typically stated as the ratio of the number of false acceptances to the number of identification attempts:
FAR = (False Claims Accepted / Total Claims) × 100
The Genuine Acceptance Rate (GAR) is evaluated by subtracting the FAR value from 100:
GAR = 100 - FAR (percentage)
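In code, the two rates defined above are simply:

```python
def far(false_accepts, total_claims):
    """False Acceptance Rate as a percentage of all claims."""
    return 100.0 * false_accepts / total_claims

def gar(far_percent):
    """Genuine Acceptance Rate, defined here as the complement of FAR."""
    return 100.0 - far_percent
```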
In all, 99 queries are fired on the face database (132 images are considered). For each query, FAR and GAR values are calculated for each BTC-level based face recognition technique. At the end, the average FAR and GAR over all queries are considered for performance ranking of the BTC-level based face recognition techniques.
7.1 Face Database [15]
In all, 99 queries are tested on the Libor database of 100 images to analyze the performance of the proposed algorithms. The feature vectors of each image for all four BTC levels were calculated and then compared with the database. The FAR for the algorithm was found to be zero. A graph of the efficiency of the program is shown below.
The graph gives the efficiency values of the different BTC levels for the face database. It is observed that with each successive level of BTC the efficiency values increase; thus BTC level 4 gives the best performance for the RGB colour space.
It is also observed that the efficiency for the YCbCr colour space is higher than that for the RGB colour space; 100% efficiency is obtained for Levels 2, 3 and 4 when applied to the YCbCr colour space.
8. CONCLUSION
The three primary aspects on which face recognition depends are cost, accuracy of the algorithm and execution time of the program. As the BTC level increases, the GAR increases. The highest GAR is obtained for the level 4 implementation of BTC for the RGB colour space. The YCbCr colour space gives an improved result, with the highest GAR for BTC Levels 2, 3 and 4. This can be attributed to the relatively larger size of the feature vector at these levels. The proposed technique can be implemented in real-world scenarios by choosing the appropriate BTC level.
REFERENCES
[1] H.B. Kekre, Sudeep D. Thepade, Sanchit Khandelwal, Karan Dhamejani, Adnan Azmi, "Face Recognition using Multilevel Block Truncation Coding", International Journal of Computer Applications (IJCA), December 2011 Edition.
[2] Xiujuan Li, Jie Ma and Shutao Li, "A novel face recognition method based on Principal Component Analysis and Kernel Partial Least", IEEE International Conference on Robotics and Biomimetics (ROBIO), 2007.
[3] Shermin J., "Illumination invariant face recognition using Discrete Cosine Transform and Principal Component Analysis", 2011 International Conference on Emerging Trends in Electrical and Computer Technology (ICETECT).
[4] Zhao Lihong, Guo Zikui, "Face Recognition Method Based on Adaptively Weighted Block-Two Dimensional Principal Component Analysis", 2011 Third International Conference on Computational Intelligence, Communication Systems and Networks (CICSyN).
[5] Gomathi, E., Baskaran, K., "Recognition of Faces Using Improved Principal Component Analysis", 2010 Second International Conference on Machine Learning and Computing (ICMLC).
[6] Haitao Zhao, Pong Chi Yuen, "Incremental Linear Discriminant Analysis for Face Recognition", IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics.
[7] Tae-Kyun Kim, Kittler, J., "Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image", IEEE Transactions on Pattern Analysis and Machine Intelligence, March 2005.
[8] James, E.A.K., Annadurai, S., "Implementation of incremental linear discriminant analysis using singular value decomposition for face recognition", First International Conference on Advanced Computing (ICAC), 2009.
[9] Zhao Lihong, Wang Ye, Teng Hongfeng, "Face recognition based on independent component analysis", 2011 Chinese Control and Decision Conference (CCDC).
[10] Yunxia Li, Changyuan Fan, "Face Recognition by Non-negative Independent Component Analysis", Fifth International Conference on Natural Computation (ICNC), 2009.
[11] Yanchuan Huang, Mingchu Li, Chuang Lin and Linlin Tian, "Gabor Based Kernel Independent Component Analysis", Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP).
[12] H.B. Kekre, Sudeep D. Thepade, Varun Lodha, Pooja Luthra, Ajoy Joseph, Chitrangada Nemani, "Augmentation of Block Truncation Coding based Image Retrieval by using Even and Odd Images with Sundry Colour Space", Int. Journal on Computer Science and Engg., Vol. 02, No. 08, 2010, pp. 2535-2544.
[13] H.B. Kekre, Sudeep D. Thepade, Shrikant P. Sanas, "Improved CBIR using Multileveled Block Truncation Coding", International Journal on Computer Science and Engineering, Vol. 02, No. 08, 2010, pp. 2535-2544.
[14] H.B. Kekre, Sudeep D. Thepade, "Boosting Block Truncation Coding using Kekre's LUV Colour Space for Image Retrieval", WASET International Journal of Electrical, Computer and System Engineering (IJECSE), Volume 2, Number 3, pp. 172-180, Summer 2008.
[15] H.B. Kekre, Sudeep D. Thepade, "Image Retrieval using Augmented Block Truncation Coding Techniques", ACM International Conference on Advances in Computing, Communication and Control (ICAC3-2009), pp. 384-390, 23-24 Jan 2009, Fr. Conceicao Rodrigues College of Engg., Mumbai.
[16] Face database developed by Dr. Libor Spacek. Available online at: http://cswww.essex.ac.uk/mv/otherprojects.html
[17] Mark D. Fairchild, "Colour Appearance Models", 2nd Edition, Wiley-IS&T, Chichester, UK, 2005. ISBN 0-470-01216-1.
[Graph: efficiency (%) of BTC Levels 1-4 for the RGB and YCbCr colour spaces; vertical axis from 86 to 102.]
Online Gesture Body Recognition
Latha K1, Kavitha Vasanth2
1 IV Sem, M.Tech, Dept. of CSE, AIT, Bangalore, Karnataka, INDIA
2 Asst. Professor, Dept. of CSE, AIT, Bangalore, Karnataka, INDIA
[email protected]
[email protected]
Abstract - This paper presents a robust framework for online full-body gesture spotting from visual hull data. Using full-body gesture features as observations, SVMs (Support Vector Machines) are trained for gesture spotting from continuous movement data streams. The major contribution of this paper is a systematic approach to automatically detecting and modelling specific gesture movement patterns and using their HMMs for outlier rejection in gesture spotting. The proposed framework is tested on the IXMAS gesture dataset [3], and the gesture spotting results are superior to those reported on the same dataset using existing state-of-the-art gesture spotting methods [1]. The aim is to develop an efficient algorithm for an online body gesture recognition system.
Index Terms—Online gesture spotting, SVM, multilinear analysis, IXMAS, hidden Markov models.
1. INTRODUCTION
A gesture is a form of non-verbal communication in which visible body actions communicate particular messages, either in place of speech or together and in parallel with spoken words. Gestures include movements of the hands, face, or other parts of the body.
Gestures are expressive, meaningful body motions involving physical
movements of the fingers, hands, arms, head, face, or body with the intent of: (1)conveying meaningful
Information or (2) interacting with the environment. They constitute
one interesting small subspace of possible human motion. A gesture
may also be perceived by the environment as a compression technique for the information to be transmitted elsewhere and
subsequently reconstructed by the receiver.
Gesture Recognition is a technology that achieves dynamic human-
system interactions that do not require physical, touch, or contact based input mechanisms. Gesture recognition enables humans to
interface with the machine (HMI) and interact naturally without any
mechanical devices. Using the concept of gesture recognition, it is
possible to point a finger at the computer screen so that the cursor will move accordingly. This could potentially make conventional
input devices such as mouse, keyboards and even touch-screens
redundant.
Gesture recognition can target the hands alone or the full body. Gestures are most commonly used for communication among humans; using continuous motion information reduces the chances of misclassifying static poses. Gestures can be divided into two types [3]:
i) Communicative gestures (key gestures, or meaningful gestures).
ii) Non-communicative gestures (garbage gestures, or transition gestures).
A key gesture is motion that carries an explicit meaning to express
goals, and a transition gesture is motion that connects key gestures to cater to subconscious goals. Figure 1.1 shows a key gesture and
transition gesture.
Fig 1.1: Motion example consisting of a sequence of key gestures and
transition gestures
1.1 Scope of the project
This project is implemented in Matlab R2010a using the IXMAS gesture data set, which contains video clips of human actions like walking, waving and running.
1.2 Methodology
• Image/frame acquisition defining human actions.
• Human blob generation using the CCA algorithm.
• Contour extraction.
• Convex hull point generation using Graham's scan algorithm.
• Classification using Support Vector Machines (SVM).
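The convex hull step can be illustrated with Andrew's monotone-chain algorithm, a close relative of the Graham's scan named above (our sketch, not the project's code):

```python
def convex_hull(points):
    """Convex hull of 2-D points (monotone chain), returned counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                      # build lower hull
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build upper hull
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # drop duplicated endpoints
```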
1.3 Applications of gesture recognition
It has the following applications [2].
• Developing aids for the hearing impaired.
• Enabling very young children to interact with computers.
• Designing techniques for forensic identification. • Recognizing sign language.
• Medically monitoring patients‘ emotional states or stress levels.
• Navigating and/or manipulating in virtual environments.
• Communicating in video conferencing. • Distance learning/tele-teaching assistance.
2. TRAINING A MODEL
The result of running a machine learning algorithm can be expressed as a function ϕ(x), parameterized by the model parameters, which takes an input vector x and generates an output vector y, encoded in the same format as the target vector t.
Fig 2.1: A machine learning algorithm, expressed as a function with input x and output y, parameterized by the model parameters.
2.1 Pre-processing
It is common for the original input variables to be pre-processed in some way to reduce both the computational load and the complexity of the recognition problem. The pre-processing stage could consist of simply scaling or normalizing the data to a standard range, smoothing it to remove noise, or transforming it into some new subspace of variables where, it is hoped, the recognition problem will be easier to solve. This pre-processing stage is sometimes also called feature extraction. It should be noted that new test data must be pre-processed using the same feature extraction method as used on the training data.
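For example, scaling a feature sequence to a standard [0, 1] range is one line of arithmetic per value (an illustrative sketch):

```python
def normalize(xs):
    """Min-max scale a sequence of numbers to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:                      # constant input: avoid division by zero
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]
```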
2.2 Post-processing
In addition to pre-processing the raw data before it reaches a machine learning algorithm, it is also common to process the algorithm's output before using it to make a decision, such as triggering or not triggering a sound to play. This post-processing stage could consist of waiting for a number of consecutive 'trigger' classification results before a sample is triggered, or of combining the classification results of several machine learning algorithms to create one super-classifier. The post-processing stage also enables the output of the machine learning algorithm to be combined with additional domain-specific information to provide context. Figure 2.2 illustrates the processing chain of gesture recognition.
Fig 2.2: An illustration of the processing chain for
a gesture recognition system
3. MACHINE LEARNING MODELS
3.1 Hidden Markov Model (HMM)
A time-domain process demonstrates a Markov property if the conditional probability density of the current event, given all present and past events, depends only on the j-th most recent events. If the current event depends solely on the most recent past event, the process is termed a first-order Markov process. A hidden Markov model (HMM) is a finite state machine which generates a sequence of discrete-time observations. At each time unit, the HMM changes state according to a state transition probability and then generates observation data according to the output probability distribution of the current state.
An N-state HMM is specified by:
• The set of states S = s1, s2, s3, ..., sN.
• The set of parameters λ = (π, A, B):
a) the state transition probabilities a(i,j), where i, j = 1...N;
b) the output probability distributions B = bi(o), where i = 1...N;
c) the initial state probabilities π(i), where i = 1...N.
Fig 3.1: A three-state left-to-right HMM.
Fig 3.1 shows a 3-state left-to-right model, in which the state index increases or stays the same as time increments. Left-to-right models are often used as speech units to model speech parameter sequences, since they can appropriately model signals whose properties change successively. [4]
Table 3.1: Output probabilities bi(o) for the three states.
State 1: a 0.8, b 0.1, c 0.1
State 2: a 0.2, b 0.6, c 0.2
State 3: a 0.7, b 0.3, c 0.1
What is the probability of the HMM producing "a, a, b, c"?
Pr(a, a, b, c) via 1,1,2,3 = 0.8 × 0.5 × 0.8 × 0.3 × 0.6 × 0.5 × 0.1 = 0.00288.
Pr(a, a, b, c) via 1,2,3,3 = 0.8 × 0.3 × 0.2 × 0.5 × 0.3 × 1.0 × 0.1 = 0.00072.
Pr(a, a, b, c) via 1,3,3,3 = 0.8 × 0.2 × 0.7 × 1.0 × 0.3 × 1.0 × 0.1 = 0.00336.
Because of the above-mentioned problems related to HMMs, the proposed framework classifies gestures with a Support
Vector Machines (SVM) classifier. Prior work employed SVMs to recognize hand gestures [5].
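The path probabilities in the Table 3.1 example can be reproduced by scoring one fixed state path, assuming the chain starts in the path's first state with probability 1 (tables transcribed from the worked example; the code itself is our sketch):

```python
# Transition (A) and output (B) probabilities from the worked example.
A = {1: {1: 0.5, 2: 0.3, 3: 0.2}, 2: {2: 0.5, 3: 0.5}, 3: {3: 1.0}}
B = {1: {'a': 0.8, 'b': 0.1, 'c': 0.1},
     2: {'a': 0.2, 'b': 0.6, 'c': 0.2},
     3: {'a': 0.7, 'b': 0.3, 'c': 0.1}}

def path_prob(obs, path, A, B):
    """Joint probability of an observation string along one state path."""
    p = B[path[0]][obs[0]]            # emit the first symbol
    for t in range(1, len(obs)):
        p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
    return p
```

Summing path_prob over all feasible paths gives the total probability of the observation sequence; the forward algorithm computes the same sum efficiently.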
3.2 SVM Algorithm
Support vector machine is a classifier derived from statistical
learning theory invented by Vapnik and first introduced at the
Computational Learning Theory (COLT) 1992 conference. A support
vector machine (SVM) is a concept in statistics and computer science
for a set of related supervised learning methods that analyze data and
recognize patterns, used for classification and regression analysis.
The standard SVM takes a set of input data and predicts, for each given input, which of two possible classes the input belongs to, making the SVM a non-probabilistic binary linear classifier. Given a set of
training examples, each marked as belonging to one of two
categories, an SVM training algorithm builds a model that assigns
new examples into one category or the other. An SVM model is a
representation of the examples as points in space, mapped so that the
examples of the separate categories are divided by a clear gap that is
as wide as possible.
Fig 3.2: A 1-dimensional linearly inseparable
classification problem.
Fig 3.3: Mapping the data into a new, 2-dimensional feature space to make the data linearly separable, achieved using an RBF kernel.
Following are some of the kernel functions commonly used to map input features into a new feature space.
Linear kernel: the simplest kernel, which performs well for linearly separable data: K(x, y) = x · y.
RBF (Gaussian) kernel: K(x, y) = exp(−‖x − y‖² / 2σ²).
By using one of these kernels, inseparable data can be converted into linearly separable data.
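Both kernels are a few lines each (an illustrative sketch):

```python
import math

def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return sum(xi * yi for xi, yi in zip(x, y))

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))
```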
4. GESTURE RECOGNITION PROCESS Recognition of human actions from video sequences involves
extraction of relevant visual information from a video sequence,
representation of that information in a suitable form, and
interpretation of visual information for the purpose of
recognition and learning human actions. Video sequences
contain large amounts of data, but most of this data does not
carry much useful information. Therefore, the first step in
recognizing human actions is to extract relevant information
which can be used for further processing. This can be achieved
through visual tracking. Tracking involves detection of regions
of interest in image sequences, which are changing with respect
to time. Tracking also involves finding frame to frame
correspondence of each region so that location, shape, extent,
etc., of each region can be reliably extracted. The recognition involves two phases, training and classification, and is carried out in offline mode using the IXMAS gesture data set.
4.1 Block diagram
Fig 4.1: Top- level diagram of proposed system
Gesture recognition using SVM follows the methodology described in block diagram 4.1. The gesture recognition is done in offline mode, considering the IXMAS gesture data set. The data set contains video clips of actions like waving, punching and kicking, performed by several persons.
Training phase
Initially the video has n frames in RGB scale. The frame which contains the human subject has to be separated from the background frame, which results in segmentation of the human object. The human object has to be converted to binary in order to generate the human blob using the CCA algorithm; conversion from RGB to binary is done through grayscale conversion. Once the human blob is generated, features like boundary points need to be extracted from the human subject. The boundary points are obtained by the contour method. From the contour points, the hull points are obtained; these hull points are the valid points among the contour points and have to be extracted from each and every frame. These hull points are the feature points generated for every action or gesture. Subtraction of hull points from consecutive frames results in motion vectors, or feature vectors. Dynamic Time Warping (DTW) is used in the scenario where the computed hull points for the current frame are fewer than in the previous or next frame. Each feature vector is a scalar value which represents the displacement of the human position from one frame to the next. In general, spatio-temporal features are extracted for each action; the spatio-temporal features here are the contour and convex hull points. The obtained features for each and every gesture, along with the class labels like type 1, type 2, etc. representing the various actions, are passed to the SVM for training. Training the samples using the SVM results in a knowledge base.
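One simple reading of the "subtraction of hull points" step, though not the only one, is the displacement of the hull centroid between consecutive frames; a sketch under that assumption:

```python
import numpy as np

def motion_feature(hull_prev, hull_curr):
    """Scalar displacement of the hull centroid between two frames."""
    c_prev = np.mean(np.asarray(hull_prev, float), axis=0)
    c_curr = np.mean(np.asarray(hull_curr, float), axis=0)
    return float(np.linalg.norm(c_curr - c_prev))
```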
Testing phase or classification
The IXMAS gesture data set is used for the testing phase, and the above procedure is repeated for testing. The feature vectors generated from the gestures, together with the knowledge base resulting from training, are used for classification of the gestures.
5. RESULTS
5.1 Key Features
The proposed method is tested on the IXMAS gesture data set. Actions like get-up, sit-down, walking and waving are classified using the SVM algorithm. The process for classification is followed as in the menu.
Fig 5.1: Menu showing the overall recognition process.
It has the following steps to be carried out: i) creating a database of all actions; ii) performing SVM training; iii) selecting a query action for classification; iv) displaying the results.
5.2 Consider the person getting up
a) Test clip, 5th frame
Fig 5.2: Test clip: i) grayscale image, ii) binary image, iii) RGB image (5th frame).
In the test video, initially the person is in the sitting position, lifting his arms.
• Subtracting the test frame from the background frame results in the grayscale image.
• Binary image: grayscale-to-binary conversion.
• RGB image: regionprops is applied to the binary image to obtain the human blob, which is multiplied by the colour component.
• Red rectangle: bounding box; blue point: centroid; blue outline: contour.
• Contour points: 3348; convex hull points: 40 for the 5th frame.
b) Test video, 20th frame
Fig 5.3: Test video: i) grayscale image, ii) binary image, iii) RGB image (20th frame).
In the test video, the person is about to stand.
• Subtracting the test frame from the background frame results in the grayscale image.
• Binary image: grayscale-to-binary conversion.
• RGB image: regionprops is applied to the binary image to obtain the human blob, which is multiplied by the colour component.
• Red rectangle: bounding box; blue point: centroid; blue outline: contour.
• Contour points: 4672; convex hull points: 46 for the 20th frame.
Convex hull points are extracted for consecutive frames and the resultant feature vectors are passed to the SVM in order to train actions. Once a query action is chosen for testing, feature vectors are extracted and compared with the ones already trained. If the feature vectors match, the action is classified and the result is displayed.
Fig 5.3: iv) The action is classified and the result is displayed.
6. CONCLUSION AND FUTURE WORK
Gesture recognition using an SVM classifier is implemented in Matlab. In this project, we present a gesture spotting framework from hull data. Invariant features are extracted using non-linear analysis and used as input to the SVM classifier. In future, the proposed features can be used to achieve body-shape invariance, and the SVM classifier can be trained to model unseen data.
REFERENCES
[1] C. Cruz-Neira, D.J. Sandin, T.A. DeFanti, R.V. Kenyon, and J.C. Hart, "The Cave: Audio Visual Experience Automatic Virtual Environment", Comm. ACM, vol. 35, no. 6, pp. 64-72, 1992.
[2] Sushmita Mitra and Tinku Acharya, "Gesture Recognition: A Survey", IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, May 2007, Vol. 37, Issue 3, pp. 311-324.
[3] Hee-Deok Yang, A-Yeon Park and Seong-Whan Lee, "Gesture Spotting and Recognition for Human-Robot Interaction", IEEE Transactions on Robotics, April 2007, Vol. 23, Issue 2, pp. 256-270.
[4] Stjepan Rajko and Gang Qian, "HMM Parameter Reduction for Practical Gesture Recognition", IEEE International Conference on Automatic Face and Gesture Recognition, September 2008, pp. 1-6.
[5] Yu Yuan and Kenneth Barner, "Hybrid Feature Selection for Gesture Recognition Using Support Vector Machines", IEEE International Conference on Acoustics, Speech and Signal Processing, April 2008, pp. 1-24.
Rice Quality Analysis Using Image Processing Mahalakshmi M.N.1, K.V. Suresh2, Partha Das3
1IV sem, M. Tech. (Signal Processing), Dept. of E & C, SIT, Tumkur, Karnataka, India
2Professor and Head, Dept. of E & C, SIT, Tumkur, Karnataka, India.
3 R & D Engineer, Opto-Electronic color sorter division, Fowler Westrup (India) Pvt. Ltd., Bangalore, Karnataka, India.
[email protected]
[email protected]
[email protected]
Abstract-Quality assessment of rice is a very important task in
food industries. Various methods have been proposed for the
quality analysis of rice based on determining the size distribution
of rice grain, percentage of broken rice kernels, assessing
breakage and cracks of rice kernels, counting the number of long
and small seeds. This paper proposes a colour based method for
analyzing the quality of sona masuri rice in terms of counting the
number of good rice kernels and defects in a given sample of rice
using image processing technology. The proposed method has been developed using MATLAB 7.12 and works for non-touching rice kernels. The accuracy of the proposed method is above 90% for counting chalky defects, above 90% for good rice kernels, between 80% and 90% for orange rice kernels, and 70% to 90% for black defects.
Keywords: quality analysis; rice kernels; counting; sona masuri;
defects
I. INTRODUCTION
Rice is the most widely consumed staple food for a large part
of the world‘s human population especially in Asia and West
Indies. Therefore the quality of rice is an important parameter to be analyzed so as to protect consumers from substandard
quality of rice. Quality control of rice is of great importance in
the food industry because after harvesting, based on quality
parameter rice will be sorted and graded. The quality of rice is
based on few properties such as color, size, shape, cooking
texture and the number of broken rice kernels. When milled
rice reaches the market, it may contain defective kernels such as chalky, orange, black and broken kernels along with good ones. After reaching the market, the quality of rice becomes the determinant of its saleability. Hence the quality analysis of rice is of major importance.
The most commonly used method for quality analysis of rice
is through human visual inspection. This method is time
consuming and the accuracy of analysis varies from person to
person. Also it demands experienced inspectors to accurately
analyze the quality. There is a need to explore the use of
technology for quality analysis of rice. The literature records a few papers which focus on quality analysis of rice. G. Van Dalen [1] developed a method for
determining the size distribution of rice and percentage of
broken kernels of rice using flatbed scanning and image
analysis. This method was able to measure the area, length,
width and perimeter of each rice kernel. Chaoxin Zheng et al.
[2] presented a review of techniques available for image
feature extraction and their applications in the food industry.
Zhao Ping and LI Yongkui [3] proposed a method based on
image processing technology to improve the efficiency and
precision of grain counting. Francis Courtois et al. [4] dealt
with a method for the measurement of breakage ratio and the
estimation of fissures on parboiled rice. This method used a
flat bed scanner for image acquisition and a gap filling method
to separate the touching grains. Yong Wu and Yi Pan [5]
developed a method for the measurement of cereal grain size
based on image processing. They have used 2D Otsu method
for segmentation. L.A.I. Pabamalie and H.L.Premaratne [6]
proposed a method for identification of rice quality using
neural network and image processing. This method used a
back propagation network with two hidden layers for classification. Chetna V.Maheshwari et al. [7] proposed a
technique for counting normal, long and small seeds using
computer vision and image processing. All the methods recorded in the literature dealt with gray-level images of rice
sample for quality analysis. Since the defects found in sona
masuri rice differ in colour, the proposed method focuses on
color based quality analysis so as to be able to detect all the
types of defects and good rice kernels. Hence the proposed
method uses colour image. The flow chart of the proposed
method is explained in section II of this paper. In section III,
the apparatus and software used, the procedure used for image
acquisition, image processing steps involved and the results
obtained are explained. The conclusion is given in section IV.
II. METHODOLOGY
The flowchart of the proposed method for quality analysis of rice is shown in Fig. 1. The acquired color image of the rice sample is subjected to segmentation in order to extract each type of defect and the good rice kernels separately. The resulting images of segmentation are converted to binary images.
Fig.1. Flow chart of the proposed method
For the resulting binary images, morphological erosion and opening operations are performed to remove the unwanted objects and thus to improve the accuracy of counting. From the resulting images, the number of connected components is counted and the results of counting are displayed. The steps explained above are applied to count chalky defects, orange defects, black defects and good rice kernels. These counts reflect the quality of rice.
III. EXPERIMENTAL SETUP AND RESULTS
a. Apparatus and software
A flatbed scanner (HP Scanjet G3110) has been used to capture the image of the rice sample. A red color paper is used as the background for the rice sample. MATLAB 7.12 has been used for the development of the software for quality analysis.
b. Rice sample preparation
Rice samples of sona masuri were taken such that the samples contained good rice kernels along with chalky, orange and black defects. Sona masuri was chosen because it is the most commonly used rice type in southern Karnataka.
c. Image acquisition
A flatbed scanner, also called a desktop scanner, is used to obtain the images of rice kernels. Scanners are versatile machines commonly found in offices; they are independent of external light conditions and were found to be suitable for this application, and hence were used for image acquisition. The rice sample contains good grains as well as defected grains. The color of a grain is an important parameter which can be used to distinguish good kernels from defected kernels. The color of good rice kernels and also of chalky rice kernels is close to white, and hence a white background is not appropriate. Since the rice sample also contains black defects, a black background was not used either. Finally, a red color paper was used as the background for the rice sample.
A sample of rice was placed on the glass of the flatbed
scanner. The rice kernels were spread so that they do not
touch each other. A red color background paper was placed on
the rice sample and the color image was captured. The
captured color image was stored in JPEG format. One such
image is shown in Fig. 2.
The captured image is in RGB color space. As can be seen from Fig. 2, the chalky kernels are bright white in color and are similar to good rice kernels. A chalky defect occurs when part of the starch is not developed properly, leaving a point of weakness. When chalky rice is milled, it is more likely to break, reducing the amount of rice recovered after milling. The other defects are orange rice kernels and black defects, as shown in Fig. 2. Even though black rice kernels are high in nutritional value, amino acids and several important vitamins, they are usually considered defects because of their colour.
d. Image processing
The captured RGB images are processed using the following
techniques of image processing.
1) Segmentation: Thresholding in RGB color space is used for segmentation. Each pixel of the input color image contains three color component values, i.e. R, G and B. The R, G and B components of good rice kernels and of each type of defected rice kernel are observed; the component values of each type of defect and of good rice kernels fall within a common range. Using these ranges of values, a thresholding range is fixed in order to segment chalky, good, orange and black rice kernels separately.
Fig. 2. Image of a sample of rice
Equations (1), (2) and (3) represent the thresholding of the R, G and B components respectively of each pixel of the input color image:

T_R1 ≤ f_R(i,j) ≤ T_R2 (1)
T_G1 ≤ f_G(i,j) ≤ T_G2 (2)
T_B1 ≤ f_B(i,j) ≤ T_B2 (3)

In equation (1), f_R(i,j) represents the R component of each pixel of the input image f(i,j), where i represents the row number and j the column number; 255 is the maximum value the R component can take. The range between T_R1 and T_R2 is the thresholding range of R component values fixed for segmentation of a particular type of rice kernel. Likewise, in equations (2) and (3), f_G(i,j) and f_B(i,j) represent the G and B components of each pixel, with [T_G1, T_G2] and [T_B1, T_B2] as the thresholding ranges of G and B component values fixed for segmentation of a particular type of rice kernel.
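The per-channel thresholding described above can be sketched as follows. The (low, high) ranges in this example are illustrative placeholders only, not the values used for the rice samples:

```python
# Per-channel RGB thresholding for segmenting one kernel type.
# The (low, high) ranges below are illustrative placeholders;
# the actual ranges must be fixed by observing sample kernels.

def threshold_rgb(image, r_range, g_range, b_range):
    """Return a binary mask: 1 where all three components
    fall inside their thresholding ranges, else 0."""
    mask = []
    for row in image:
        mask_row = []
        for (r, g, b) in row:
            inside = (r_range[0] <= r <= r_range[1] and
                      g_range[0] <= g <= g_range[1] and
                      b_range[0] <= b <= b_range[1])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

# Tiny 2x2 example image: two reddish pixels, two others.
image = [[(200, 40, 30), (250, 248, 245)],
         [(30, 30, 30), (210, 60, 40)]]
# Hypothetical thresholding range for "reddish" pixels.
mask = threshold_rgb(image, (180, 255), (0, 100), (0, 100))
print(mask)  # [[1, 0], [0, 1]]
```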
2) Conversion to binary image: The images resulting from thresholding are converted to gray scale images. This conversion is done by eliminating the hue and saturation information while retaining the luminance information. The gray scale value of each pixel is computed using equation (4):

Q = 0.2989 R + 0.5870 G + 0.1140 B (4)

In equation (4), R, G and B represent the red, green and blue components respectively of every pixel of the image resulting after thresholding, and Q represents the resulting gray scale value for the corresponding pixel. A global threshold is then computed, using which the gray scale image is converted to a binary image.
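A minimal sketch of this conversion, using the luminance weights of equation (4); the global threshold here is a fixed illustrative value rather than one computed from the image:

```python
# Convert an RGB image to gray scale using the luminance weights
# of equation (4), then binarize with a global threshold.
# The threshold value 128 is illustrative; in practice a global
# threshold would be computed from the image itself.

def to_gray(image):
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b
             for (r, g, b) in row]
            for row in image]

def to_binary(gray, threshold=128):
    return [[1 if q >= threshold else 0 for q in row]
            for row in gray]

image = [[(255, 255, 255), (0, 0, 0)],
         [(200, 40, 30), (250, 248, 245)]]
binary = to_binary(to_gray(image))
print(binary)  # [[1, 0], [0, 1]]
```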
3) Morphological opening: This operation is applied to the binary images resulting from the previous step to remove all connected components that have fewer pixels than a threshold value. It involves determining the connected components, computing the area of each component, and fixing a threshold in terms of pixels so as to remove the unwanted objects.
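The area-filtering step above can be sketched as a small-component removal pass; this pure-Python flood-fill version is an illustrative stand-in for MATLAB's morphological routines:

```python
# Remove connected components smaller than a pixel-count threshold
# from a binary image (8-connectivity), mimicking the area-opening
# step described above. Pure-Python flood fill for illustration.

def remove_small_components(binary, min_pixels):
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [row[:] for row in binary]
    for i in range(rows):
        for j in range(cols):
            if binary[i][j] == 1 and not seen[i][j]:
                # Flood-fill one component, collecting its pixels.
                stack, pixels = [(i, j)], []
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] == 1
                                    and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                if len(pixels) < min_pixels:
                    for (y, x) in pixels:
                        out[y][x] = 0
    return out

binary = [[1, 1, 0, 0],
          [1, 1, 0, 1],   # lone pixel at (1, 3) is noise
          [0, 0, 0, 0]]
print(remove_small_components(binary, 2))
# [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
```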
4) Counting: A connected component labeling technique is used to count the number of defects and good rice kernels. For labeling connected components, 8-connectivity is used. The labeling algorithm involves scanning the image along the columns, assigning preliminary labels, recording label equivalences in a local equivalence table, resolving the equivalence classes, and relabeling the runs based on the resolved equivalence classes.
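The two-pass labeling described above can be sketched as follows; this is a generic implementation of the classical algorithm, not the paper's MATLAB code:

```python
# Count connected components in a binary image with 8-connectivity
# using the classical two-pass labeling scheme: assign preliminary
# labels, record equivalences, then resolve and count them.

def count_components(binary):
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = {}  # union-find over preliminary labels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    next_label = 1
    # First pass: preliminary labels + equivalence recording.
    for i in range(rows):
        for j in range(cols):
            if binary[i][j] != 1:
                continue
            # 8-connected neighbours already visited in raster order.
            neighbours = [labels[y][x]
                          for (y, x) in ((i - 1, j - 1), (i - 1, j),
                                         (i - 1, j + 1), (i, j - 1))
                          if 0 <= y < rows and 0 <= x < cols
                          and labels[y][x] > 0]
            if not neighbours:
                parent[next_label] = next_label
                labels[i][j] = next_label
                next_label += 1
            else:
                labels[i][j] = min(neighbours)
                for n in neighbours:
                    union(labels[i][j], n)
    # Second pass: count distinct resolved labels.
    return len({find(labels[i][j])
                for i in range(rows) for j in range(cols)
                if labels[i][j] > 0})

binary = [[1, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 0, 1, 1]]
print(count_components(binary))  # 3
```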
To study the performance of the algorithm, samples of raw sona masuri rice were taken and images were captured using the flatbed scanner. Fig. 3, Fig. 4 and Fig. 5 show the input images, and Table I shows the results obtained for these three images. The accuracy of the proposed method is above 90% for counting chalky defects, between 80% and 90% for orange rice kernels, above 90% for good rice kernels, and between 70% and 90% for black defects. The accuracy of counting black defects is lower because the color of a black defect is not uniform throughout the kernel.
Since chalky kernels are bulgier than other rice kernels, they suffer from shadow effects, which in turn reduce the accuracy of counting black defects. A few orange kernels are very light in colour; such kernels are close to good rice kernels in appearance. Hence the accuracy of counting good and orange rice kernels is sometimes low.
IV. CONCLUSION
A methodology for quality analysis of a rice sample based on counting the number of defects (chalky, orange and black) and good rice kernels using image processing has been developed using MATLAB 7.12. The proposed algorithm has been tested on 10 images of rice samples. The accuracy of counting is above 90% for chalky defects, between 80% and 90% for orange rice kernels, above 90% for good rice kernels, and between 70% and 90% for black defects. This method works for non-touching rice kernels. Further work includes developing a methodology for counting the number of broken good rice kernels and extending the algorithm to touching rice kernels.
Fig. 3. Example (1)
Fig. 4. Example (2)
Fig. 5. Example (3)
TABLE I
RESULTS OF COUNTING
ACKNOWLEDGEMENT
The authors express their gratitude to M/S Fowler Westrup India Pvt. Ltd., Bangalore for the immense support given to carry out the work.
REFERENCES
[1] G. Van Dalen, "Determination of the size distribution and percentage of broken kernels of rice using flatbed scanning and image analysis", Food Research International, vol. 37, pp. 51-58, 2004.
[2] Chaoxin Zheng, Da-Wen Sun and Liyun Zheng, "Recent developments and applications of image features for food quality evaluation and inspection - a review", Trends in Food Science & Technology, vol. 17, pp. 642-655, 2006.
[3] Zhao Ping and Li Yongkui, "Grain counting method based on image processing", International Conference on Information Engineering and Computer Science (ICIECS), pp. 1-3, 2009.
[4] Francis Courtois, Matthieu Faessel and Catherine Bonazzi, "Assessing breakage and cracks of parboiled rice kernels by image analysis techniques", Food Control, vol. 21, pp. 567-572, 2010.
[5] Yong Wu and Yi Pan, "Cereal grain size measurement based on image processing technology", International Conference on Intelligent Control and Information Processing, Aug 13-15, 2010.
[6] L.A.I. Pabamalie and H.L. Premaratne, "A grain quality classification system", International Conference on Bioengineering, pp. 56-61, 2010.
[7] Chetna V. Maheshwari, Kavindra R. Jain and Chintan K. Modi, "Non-destructive quality analysis of Indian basmati oryza sativa using image processing", International Conference on Communication Systems and Network Technologies (IEEE), pp. 189-193, 2012.
Progressive Image Transmission over Coded
OFDM system with LDPC Thejaswi K V1, K Nagamani2
1M.Tech, Digital Communication,
2Assistant Professor, Dept of Telecommunication Engg.,
1,2R.V College of Engineering, Bengaluru
Email:[email protected] ,[email protected]
Abstract
Image compression and transmission are everyday challenges in the field of multimedia. Progressive image transmission over a coded Orthogonal Frequency Division Multiplexing (OFDM) system with Low Density Parity Check (LDPC) coding is a new scheme. It improves the error resilience and transmission efficiency of progressive image transmission over an Additive White Gaussian Noise (AWGN) channel. The Set Partitioning in Hierarchical Trees (SPIHT) algorithm is used for source coding of the images to be transmitted. To improve the BER performance of the OFDM system, a combination of the high-spectral-efficiency OFDM modulation technique and LDPC coding is used, which improves the reconstructed image quality. Simulation results of image transmission confirm the effectiveness of the proposed scheme.
Keywords: OFDM, SPIHT, LDPC, PSNR, MSE.
I. Introduction
OFDM modulation has been adopted by several wireless multimedia transmission standards, such as Digital Audio Broadcasting (DAB) and Digital Video Broadcasting (DVB-T), because it provides a high degree of immunity to multipath fading and impulsive noise. High spectral efficiency and efficient modulation and demodulation by IFFT/FFT are further advantages of OFDM. In a frequency-selective radio transmission channel, fading and Inter-Symbol Interference (ISI) result in severe losses of transmitted image quality. OFDM divides the frequency-selective channel into several parallel, non-frequency-selective narrow-band channels and modulates the signal onto different frequencies. It can significantly improve channel transmission performance without employing complex equalization schemes, and it has broad application prospects in wireless image and video communications [1, 2].
The SPIHT algorithm was introduced by Said and Pearlman [3]. It is based on the wavelet transform and restricts the necessity of random access to the whole image to small sub-images. The principle of SPIHT is partial ordering by magnitude with a set partitioning sorting algorithm, ordered bit-plane transmission, and exploitation of self-similarity across different scales of the image wavelet transform. The compression efficiency and simplicity of this algorithm have made it a well-known benchmark for embedded wavelet image coding. SPIHT is used for image transmission over OFDM systems in several research works [4, 5] because it has good rate-distortion performance for still images with comparatively low complexity, and it is scalable and completely embeddable.
To improve the BER performance of the OFDM system, several error correcting codes have been applied to OFDM. The combination of the high-spectral-efficiency OFDM modulation technique and LDPC coding is a good candidate for high-speed broadband wireless applications; LDPC has been adopted in the DVB-S2 standard. An (N, K) LDPC code can be represented by a very sparse parity-check matrix having M rows, N columns and code rate R = K/N, where K = N - M. LDPC codes were originally invented by Gallager in 1963 [6] and rediscovered by MacKay and Neal [7]. The BER performance of the Low Density Parity Check coded Orthogonal Frequency Division Multiplexing (LDPC-COFDM) system is influenced by the subchannels which have deep fades due to frequency-selective fading. Accordingly, several algorithms have been introduced into the LDPC-COFDM system to improve the BER by adaptive bit loading and power allocation on each subcarrier [8], [9].
The proposed scheme concentrates on improving the quality of the reconstructed images. It considers transmission of images over an Additive White Gaussian Noise (AWGN) channel with SPIHT as the source code over an OFDM system.
II. SPIHT Algorithm
The SPIHT algorithm defines and partitions sets in the wavelet-decomposed image using a special data structure called a spatial orientation tree. A spatial orientation tree is a group of wavelet coefficients organized into a tree rooted in the lowest frequency (coarsest scale) subband, with offspring in several generations along the same spatial orientation in the higher frequency subbands. Figure 1 shows a spatial orientation tree and the parent-children dependency defined by the SPIHT algorithm across subbands in the wavelet image. The tree is defined in such a way that each node has either no offspring (the leaves) or four offspring at the same spatial location in the next finer subband level. The pixels in the lowest frequency subband (the tree roots) are grouped into blocks of 2x2 adjacent pixels, and in each block one of them, marked by a star in Fig. 1, has no descendants. SPIHT describes this collocation with one-to-four parent-children relationships:
children(i, j) = {(2i, 2j), (2i, 2j+1), (2i+1, 2j), (2i+1, 2j+1)}, where (i, j) is the parent coordinate (1)
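The parent-children mapping of equation (1) is simple enough to state directly in code:

```python
# Parent-children mapping of the SPIHT spatial orientation tree,
# as in equation (1): parent (i, j) has four offspring at the same
# spatial location in the next finer subband level.

def children(i, j):
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]

print(children(1, 2))  # [(2, 4), (2, 5), (3, 4), (3, 5)]
```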
The SPIHT algorithm consists of three stages: initialization, sorting and refinement. It sorts the wavelet coefficients into three ordered
lists: the list of insignificant sets (LIS), the List of Insignificant
Pixels (LIP), and the List of Significant Pixels (LSP). At the
initialization stage the SPIHT algorithm first defines a start threshold based on the maximum value in the wavelet pyramid, then sets the
LSP as an empty list and puts the coordinates of all coefficients in the
coarsest level of the wavelet pyramid (i.e. the lowest frequency band;
LL band) into the LIP and those which have descendants also into the LIS.
In the sorting pass, the algorithm first sorts the elements of the
LIP and then the sets with roots in the LIS. For each pixel in the LIP it performs a significance test against the current threshold and
outputs the test result to the output bit stream. All test results are
encoded as either 0 or 1, depending on the test outcome, so that the
SPIHT algorithm directly produces a binary bit stream. If a coefficient is significant, its sign is coded and its coordinate is moved
to the LSP. During the sorting pass of LIS, the SPIHT encoder
carries out the significance test for each set in the LIS and outputs the
significance information. If a set is significant, it is partitioned into its offspring and leaves. Sorting and partitioning are carried out until
all significant coefficients have been found and stored in the LSP.
After the sorting pass for all elements in the LIP and LIS, SPIHT does a refinement pass with the current threshold for all entries in the
LSP, except those which have been moved to the LSP during the last
sorting pass. Then the current threshold is divided by two and the
sorting and refinement stages are continued until a predefined bit-budget is exhausted [3].
Fig.1: Parent–children dependency and spatial orientation trees
across wavelet subbands in SPIHT.
A. Low Density Parity Check Codes
Low-density parity-check (LDPC) codes are a class of linear block codes. The name comes from the characteristic of their parity-check matrix, which contains only a few 1's in comparison to the number of 0's. LDPC codes provide reliable transmission, with coding performance very close to the Shannon limit, and can outperform Turbo codes at long block lengths with relatively low decoding complexity. LDPC codes are finding increasing use in applications requiring reliable and highly efficient information transfer over bandwidth-constrained or return-channel-constrained links in the presence of data-corrupting noise.
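The parity-check view can be sketched as follows: a word c is a valid codeword exactly when H·c = 0 over GF(2). The matrix H here is a toy example for illustration, far smaller and denser than a practical LDPC matrix:

```python
# Syndrome check with a parity-check matrix over GF(2).
# H is a toy 3x6 matrix for illustration only; practical LDPC
# matrices are much larger and very sparse.

H = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 0, 1, 0, 0, 1]]

def syndrome(H, c):
    return [sum(h * x for h, x in zip(row, c)) % 2 for row in H]

def is_codeword(H, c):
    return all(s == 0 for s in syndrome(H, c))

print(is_codeword(H, [1, 1, 0, 0, 1, 1]))  # True
print(is_codeword(H, [1, 0, 0, 0, 0, 0]))  # False
```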
III. OFDM System
The block diagram of the proposed LDPC-COFDM system is illustrated in Figure 2. The SPIHT coder is chosen as the source coding technique due to its flexible code rate and the simplicity of designing an optimal system. SPIHT divides the image stream into several layers according to the importance of the progressive image stream, and the image stream is then converted to binary format. Afterwards the information bits are LDPC encoded at the LDPC encoder.
The OFDM considered in this paper utilizes N frequency tones (subcarriers), hence the baseband data is first converted into parallel data of N subchannels so that each bit of a codeword is carried on a different subcarrier. The N subcarriers are chosen to be orthogonal, that is, f_n = nΔf, where Δf = 1/T and T is the
OFDM symbol duration. Then, the transmitted data of each parallel subchannel is modulated by Binary Phase Shift Keying (BPSK), because it provides high throughput and the best performance when combined with OFDM. Finally, the modulated data are fed into an IFFT circuit, such that the OFDM signal is generated. The resulting OFDM signal can be expressed as follows:

x(t) = (1/√N) Σ_{n=0}^{N-1} X_n e^{j2πf_n t}, 0 ≤ t ≤ T (2)
where X_n is the data symbol modulated onto the nth subcarrier.
Fig. 2: The LDPC COFDM system model with trigonometric
transforms.
Each data block is padded with a cyclic prefix (CP) of a length longer than the channel impulse response to mitigate Inter-Block Interference (IBI). The continuous COFDM signal xg(t) is generated at the output of the digital-to-analog (D/A) converter.
At the receiver, the guard interval is removed and the time interval [0, T] is evaluated. Afterwards, the OFDM subchannel demodulation
is implemented using an FFT, and then Parallel-to-Serial (P/S) conversion is performed. The received OFDM symbols are demodulated at the demodulator. The demodulated bits are decoded for each LDPC encoded block and the data bits are restored. These data are converted back into image format, so that the SPIHT decoder can reconstruct the image.
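The IFFT/FFT transmit-receive chain above can be sketched end-to-end as a toy baseband model: no channel noise is modelled, and a naive O(N^2) DFT stands in for the fast transform:

```python
# Toy baseband OFDM chain: BPSK symbols -> IDFT -> cyclic prefix,
# then CP removal -> DFT recovers the symbols. A naive DFT stands
# in for the IFFT/FFT; no channel impairment is modelled.
import cmath

def idft(symbols):
    n_sub = len(symbols)
    return [sum(s * cmath.exp(2j * cmath.pi * k * n / n_sub)
                for n, s in enumerate(symbols)) / n_sub
            for k in range(n_sub)]

def dft(samples):
    n_sub = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / n_sub)
                for n, x in enumerate(samples))
            for k in range(n_sub)]

bits = [1, 0, 1, 1, 0, 0, 1, 0]
bpsk = [1 if b else -1 for b in bits]          # BPSK mapping
tx = idft(bpsk)                                # OFDM modulation
cp_len = 2
tx_cp = tx[-cp_len:] + tx                      # add cyclic prefix
rx = tx_cp[cp_len:]                            # remove CP at receiver
recovered = [0 if s.real < 0 else 1 for s in dft(rx)]
print(recovered == bits)  # True
```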
IV. Simulation Results
The transmission of SPIHT-coded images over the LDPC-COFDM system over an AWGN channel is simulated using MATLAB. The parameters used in the simulation are: the number of subcarriers of the LDPC coded OFDM system (N) is 512; the rate of the SPIHT coder (r) ranges from 0 to 1; an LDPC code of R = 1/2 is employed, where R denotes the code rate, with a (128, 256) parity check matrix. The input images are 256x256, 8 bits per pixel, grayscale test images. The PSNR is evaluated at different rates.
The Peak Signal-to-Noise Ratio is defined as

PSNR = 10 log10 (Peak^2 / MSE) (4)

where MSE is the mean squared error between the original and the reconstructed image, and Peak is the maximum possible magnitude of a pixel in the image. The peak value is 255 for an 8 bits/pixel original image.
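Equation (4) can be computed directly; the tiny 2x2 images below are illustrative only:

```python
# PSNR between an original and a reconstructed image, as in
# equation (4), with Peak = 255 for 8 bits/pixel images.
import math

def psnr(original, reconstructed, peak=255):
    diffs = [(o - r) ** 2
             for row_o, row_r in zip(original, reconstructed)
             for o, r in zip(row_o, row_r)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * math.log10(peak ** 2 / mse)

orig = [[52, 55], [61, 59]]
recon = [[52, 54], [60, 59]]
print(round(psnr(orig, recon), 2))  # 51.14
```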
To verify the effectiveness of the proposed method, image transmission is carried out over the COFDM system using the SPIHT coder as source coding. Simulation was carried out for different SPIHT rates, as shown in Table 1, for the Cameraman and Lena images of size 256x256.
Table 1: PSNR values for Cameraman and Lena image
Rate (r in bpp)   Cameraman PSNR (dB)   Lena PSNR (dB)
0.1               22.94                 22.86
0.2               25.29                 24.92
0.3               27.15                 26.51
0.4               28.46                 27.68
0.5               29.61                 28.87
0.6               30.79                 29.83
0.7               31.67                 30.88
0.8               32.66                 31.66
0.9               33.72                 32.55
1.0               34.81                 33.42
Fig 3: PSNR of the proposed scheme for different rates
From Table 1 and Figure 3, the PSNR increases with the rate of transmission. Figure 4 shows the original images transmitted through the system; the reconstructed images at the receiver for SPIHT rates 0.5 and 1 are given in Figure 5 and Figure 6 respectively. It is evident that the PSNR of the images at rate 0.5 is lower than the PSNR at rate 1. As the rate increases the PSNR increases, and hence so does the quality of the image at the receiver.
Fig. 4: Original transmitted images
PSNR = 29.61 dB PSNR = 28.87 dB
Fig. 5: Reconstructed images at SPIHT rate = 0.5
PSNR = 34.81 dB PSNR = 33.42 dB
Fig. 6: Reconstructed images at SPIHT rate = 1
V. CONCLUSION
An efficient LDPC-coded OFDM system supporting image transmission using the SPIHT compression technique has been proposed and analyzed. The effectiveness of the proposed system is investigated through simulations over an AWGN channel. It is found that the system must be designed carefully in order to achieve good PSNR performance. For LDPC-COFDM with code rate R = 0.5 and SPIHT rate r = 1, the transmitted image is reconstructed effectively. The PSNR of the received image at different rates has been evaluated.
REFERENCES
[1] H. Schulze and C. Luders, Theory and Application of OFDM and CDMA: Wideband Wireless Communication. John Wiley, 2005.
[2] Gusmao, R. Dinis and N. Esteves, "On Frequency Domain Equalization and Diversity Combining for Broadband Wireless Communications", IEEE Communication Letters, Vol. 51, No. 7, July 2003.
[3] Said and W. A. Pearlman, "A New, Fast and Efficient Image Codec Based on Set Partitioning In Hierarchical Trees", IEEE Trans. Circuits Syst. Video Technol., Vol. 6, pp. 243-250, 1996.
[4] Y. Sun, X. Wang and K. J. R. Liu, "A Joint Channel Estimation and Unequal Error Protection Scheme for Image Transmission in Wireless OFDM Systems", IEEE Multimedia Signal Processing, pp. 380-383, 2002.
[5] S. Wang, J. Dai, C. Hou and X. Liu, "Progressive Image Transmission over Wavelet Packet Based OFDM", Canadian Conference on Electrical and Computer Engineering, pp. 950-953, 2006.
[6] R. G. Gallager, "Low Density Parity Check Codes", IRE Trans. Inform. Theory, Vol. IT-8, pp. 21-28, Jan. 1962.
[7] D. J. C. MacKay, "Good Error-Correcting Codes Based on Very Sparse Matrices", IEEE Trans. Inform. Theory, Vol. 45, pp. 399-431, Mar. 1999.
[8] Y. Li and W. E. Ryan, "Mutual-Information-Based Adaptive Bit-Loading Algorithms for LDPC-Coded OFDM", IEEE Transactions on Wireless Communications, Vol. 6, pp. 1670-1680, May 2007.
[9] C. Yuan Yang and M. Kai Ku, "LDPC Coded OFDM Modulation for High Spectral Efficiency Transmission", Proceedings of ECCSC 2008, pp. 280-284, July 2008.
[10] Charles Pandana, Yan Sun and K. J. Ray Liu, "Channel-Aware Priority Transmission Scheme Using Joint Channel Estimation and Data Loading for OFDM Systems", IEEE Transactions on Signal Processing, Vol. 53, No. 8, August 2005.
[11] Bagadi K. Praveen, Susmita Das and Sridhar K., "Image Transmission over Space Time Coded MIMO-OFDM System with Punctured Turbo Codes", International Journal of Computer Applications, Volume 51, No. 15, August 2012.
[12] Sashuang Wang, Jufeng Dai, Chunping Hou and Xueqing Liu, "Progressive Image Transmission Over Wavelet Packet Based OFDM", IEEE CCECE/CCGEI, Ottawa, May 2006.
[13] Srikanth N., "Progressive Image Transmission over STC-OFDM Based MIMO Systems", International Conference on Computing and Control Engineering (ICCCE 2012), April 2012.
[14] Usama S. Mohammed and H. A. Hamada, "Image Transmission over OFDM Channel with Rate Allocation Scheme and Minimum Peak-to-Average Power Ratio", Journal of Telecommunications, Volume 2, Issue 2, May 2010.
[15] R. Orzechowski, "Performance Analysis of LDPC Coded OFDM System", XIV Poznan Telecommunications Workshop (PWT), 2010.
[16] M. M. Salah, A. A. Elrahman and A. Elmoghazy, "Unequal Power Allocation of Image Transmission in OFDM Systems", 13th International Conference on Aerospace Sciences & Aviation Technology (ASAT-13), May 2009.
[17] Naglaa F. Soliman, Abd Alhamid A. Shaalan and Mohammed M. Fouad, "Robust Image Transmission with OFDM over an AWGN Channel", National Telecommunication Institute, Egypt, March 2011.
A Joint Digital Watermarking of Compressed & Encrypted images for implementation of digital
rights
Chetana.R1, Chaithra.A2
1Associate Professor, ECE Dept.
SJBIT, Bangalore,India Email:[email protected]
2 PG Student, ECE Dept.
SJBIT, Bangalore,India E-mail:[email protected]
Abstract—We propose an efficient, robust watermarking algorithm to watermark compressed and encrypted images. The proposed algorithm uses a symmetric stream cipher with additive homomorphic properties for encryption, and for watermarking we use Spread Spectrum (SS) frequency-domain DCT2 watermarking schemes. Digital asset management systems (DAMS) generally handle media data in a compressed and encrypted form. It is sometimes necessary to watermark these compressed encrypted media items in the compressed-encrypted domain itself for tamper detection, ownership declaration or copyright management purposes. Watermarking such streams is a challenge, as the compression process packs the information of the raw media into a low number of bits and encryption randomizes the compressed bit stream; attempting to watermark a randomized bit stream can cause a dramatic degradation of media quality. Thus it is necessary to choose an encryption scheme that is secure and yet allows watermarking in a predictable manner in the compressed-encrypted domain.
Applications: copyright violation detection, proof of ownership or distributorship, media authentication
I. INTRODUCTION
DIGITAL media content creation, capturing, processing and distribution has witnessed phenomenal growth over the past decade. This media content is often distributed in compressed and encrypted format, and watermarking of the media for copyright violation detection, proof of ownership or distributorship, or media authentication sometimes needs to be carried out in the compressed-encrypted domain. One such example is distribution through DRM systems [1]-[4], where the owner of multimedia content distributes it in a compressed and encrypted format to consumers through a multilevel distributor network. In DRM systems with content owners, multiple levels of distributors and consumers, the distributors do not have access to the plain (un-encrypted) content. They distribute the encrypted content (in fact compressed-encrypted content, as most content is compressed and then encrypted) and request the license server in the DRM system to deliver the associated licence, containing the decryption keys needed to open the encrypted content, to the consumers. Distributors do not need plain content, as they are not consumers. However, each distributor sometimes needs to watermark the content for media authentication, traitor tracing or proving distributorship. Thus they have no choice but to watermark in the compressed-encrypted domain. In this paper we focus on watermarking of
compressed-encrypted JPEG2000 images, where the
encryption refers to the ciphering of complete JPEG2000
compressed stream except headers and marker segments,
which are left in plaintext for format compliance [5]. There
have been several related image watermarking techniques
proposed to date [6]–[11]. In [6], Deng et al. proposed an
efficient buyer-seller watermarking protocol based on
composite signal representation given in [7]. However, when
the content is accessible only in encrypted form to the
watermark embedder, the embedding scheme
proposed in [6] might not be applicable, as the host and watermark signals are represented in composite signal form using the plaintext features of the host signal; in [6] this is possible because the seller embeds the watermark. Also, there is a ciphertext expansion of 3.7 times that of the plaintext. In [8] and [9], some sub-bands of lower resolutions are chosen for encryption while the rest of the higher resolution sub-bands are watermarked, while in [10] the encryption is performed on the most significant bit planes while the rest of the lower significant bit planes are watermarked. If fewer sub-bands/bit planes are used for encryption, an attacker can manipulate the un-encrypted sub-bands/bit planes and further extract some useful information from the image, although the image may not be of good quality. On the other hand, if more sub-bands/bit planes are encrypted and only the remaining few sub-bands/bit planes are watermarked, it might be possible for an attacker to remove the watermarked sub-bands/bit planes while maintaining the image quality. Prins et al. in [11] proposed a robust quantization index modulation (QIM) based watermarking technique which embeds the watermark in the encrypted domain. In the technique proposed in [11], the addition or subtraction of a watermark bit to a sample is based on the value of the quantized plaintext sample. However, in our algorithm the watermark embedder does not have access to the plaintext values; it has only the compressed-encrypted content and does not have the key to decrypt and obtain the plain compressed values. Thus, watermarking in the compressed-encrypted domain using the technique proposed in [11] is very challenging. In [12] Li et al. proposed a content-dependent watermarking technique which embeds the watermark in an encrypted format, but the host signal is still in plaintext format. The algorithm may not be directly applicable when the content is in encrypted format, in which case the distortion introduced in the host signal may be large. In [13] Sun et al. proposed a semi-fragile authentication system for JPEG2000 images. However, this scheme is not fully compatible with compressed and encrypted domain watermarking as it derives the content-based features for watermarking from the plaintext. We propose a robust watermarking technique for JPEG2000 images in which the watermark can be embedded in a predictable manner in the compressed-encrypted bytestream.
Notation:
• L denotes the length in bytes.
• M = {m_i}, m_i ∈ [0, 255] ∀ i = 0, 1, …, L-1 denotes the packetized JPEG2000 bytestream; M_w = {m_wi}, m_wi ∈ [0, 255] ∀ i = 0, 1, …, L-1 is the watermarked copy of M.
• C = {c_i}, c_i ∈ [0, 255] ∀ i = 0, 1, …, L-1 denotes the encrypted M; C_w = {c_wi}, c_wi ∈ [0, 255] ∀ i = 0, 1, …, L-1 is the watermarked copy of C.
• b = {b_j}, b_j ∈ {-1, 1} ∀ j = 0, 1, …, N-1 denotes the watermark information.
• E(·) and D(·) denote the encryption and decryption functions, respectively.
• K = {k_i}, k_i ∈ [0, 254] ∀ i = 0, 1, …, L-1 denotes the encryption key.
• r denotes the chip rate in SS.
• α denotes the watermark strength factor in SS.
• P = {p_i}, p_i ∈ {-1, 1} ∀ i = 0, 1, …, L-1 denotes a PN sequence with zero mean and variance σ_p².
• SS denotes Spread Spectrum.
II. PROPOSED SCHEME
Fig. 1. (a) Watermark embedding; (b) watermark extraction.
The proposed algorithm works on JPEG2000 compressed
code stream. JPEG2000 compression is divided into five
different stages. In the first stage the input image is
preprocessed by dividing it into non-overlapping rectangular tiles; the unsigned samples are then reduced by a constant to make them symmetric around zero, and finally a multi-component
transform is performed. In the second stage, the discrete
wavelet transform (DWT) is applied followed by quantization
in the third stage. Multiple levels of DWT give a multi-resolution representation: the lowest resolution contains the
low-pass image, while the higher resolutions contain the high-pass detail. These resolutions are further divided into smaller
blocks known as code-blocks where each code-block is
encoded independently. Further, the quantized DWT
coefficients are divided into different bit planes and coded
through multiple passes at embedded block coding with
optimized truncation (EBCOT) to give compressed byte
stream in the fourth stage. The compressed byte stream is
arranged into different wavelet packets based on resolution,
precincts, components and layers in the fifth and final stage.
Thus, it is possible to select bytes generated from different bit
planes of different resolutions for encryption and watermarking.
A. Encryption Algorithm
JPEG2000 gives out the packetized byte stream M as its output. In order to encrypt the message M, we choose K, a
randomly generated key-stream produced using RC4. The encryption is then done byte by byte, as given in (1), to get
the ciphered signal C:

C = E(M, K) = {c_i}, c_i = (m_i + k_i) mod 255, i = 0, 1, …, L−1 (1)
Page 107
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
where the addition operation is arithmetic addition. Here, mod 255 is required to preserve the format compliance of the
JPEG2000 bit stream [5]. In the JPEG2000 bit stream, a header syntax occurs as a two-byte value greater than 0xFF89,
i.e., two consecutive bytes with values 255 and higher than 137 in decimal. If mod 256 were used, it might generate a
byte of value 255 followed by a byte greater than 137, which corresponds to a header syntax and is undesirable. Thus,
in order to prevent the generation of header segments, mod 255 is used. Let C1 = E(M1, K1) and C2 = E(M2, K2). For
K = K1 + K2, the additive homomorphism property gives

D(C1 + C2, K) = M1 + M2 (2)

Here M1 = {m1_i} has been preprocessed by the owner such that 0 ≤ M1 + M2 < 255. The owner does the preprocessing by
limiting the values to m1_i ∈ [α, 255 − α + 1), where α is a positive integer. However, the preprocessing is not
applied when m1_i = 255 and m1_{i+1} > 137, because this case indicates the presence of a header segment, which must
be preserved to maintain bitstream compliance.
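The byte-wise mod-255 cipher of (1) and its inverse can be sketched in Python. This is a hypothetical illustration: the paper's RC4 keystream is replaced here by a seeded stand-in generator.

```python
import random

def keystream(length, seed=42):
    # Stand-in for the paper's RC4 keystream; key bytes lie in [0, 254]
    rng = random.Random(seed)
    return [rng.randrange(0, 255) for _ in range(length)]

def encrypt(m, k):
    # Eq. (1): c_i = (m_i + k_i) mod 255. Mod 255 (not 256) avoids creating
    # JPEG2000 marker segments (0xFF followed by a byte above 0x89).
    return [(mi + ki) % 255 for mi, ki in zip(m, k)]

def decrypt(c, k):
    return [(ci - ki) % 255 for ci, ki in zip(c, k)]

M = [12, 200, 77, 136]
K = keystream(len(M))
C = encrypt(M, K)
print(decrypt(C, K))  # recovers M
```

Because addition modulo 255 is invertible, decryption exactly recovers the original bytes for any key stream.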
B. Embedding Algorithm
The encryption algorithm used is an additive privacy
homomorphic one, so the watermark embedding is performed
by using a robust additive watermarking technique. Since the
embedding is done in the compressed ciphered byte stream,
the embedding position plays a crucial role in deciding the
watermarked image quality. Hence, for watermarking, we
consider the ciphered bytes from the less significant bit planes
of the middle resolutions, because inserting watermark in
ciphered bytes from most significant bit planes degrades the
image quality to a greater extent. Also, the higher resolutions
are vulnerable to transcoding operations and lower resolution
contains a lot of information, whose modification leads to loss
of quality. In our experiments we study the impact of watermarking in this compressed-encrypted domain on quality, and
show how the watermark can be inserted in the less significant bit planes of the middle resolutions without affecting
the image quality much. We now explain the embedding process.
SS: The embedding process is carried out by first generating the watermark signal W using the watermark information
bits b, the chip rate r and the PN sequence P. The watermark information bits b = {b_i}, b_i ∈ {1, −1}, are spread by
r, which gives

a_j = b_i, ir ≤ j < (i + 1)r (3)

The sequence a_j is then multiplied by α > 0 and by P to form the watermark signal W = {w_j}, where

w_j = α a_j p_j (4)

with p_j ∈ {1, −1}. The watermark signal generated in (4) is added to the encrypted signal C to give the watermarked
signal C_w:

C_w = C + W, i.e., c_wi = c_i + w_i ∀ i = 0, 1, …, L−1 (5)
Here, C and W can be considered to be C1 and C2, respectively. Although W is added in plaintext form, it can be
considered to be encrypted using a key K2 that is a stream of bytes with value zero. In other words, as M2 in (2)
corresponds to W of (5), M2 can be assumed to be encrypted using a byte key stream K2 = {k2_i} ∀ i = 0, 1, …, L−1.
Now, if k2_i = 0 ∀ i = 0, 1, …, L−1, then the encrypted value of M2, denoted by C2, is

c2_i = (m2_i + k2_i) mod 255 ∀ i = 0, 1, …, L−1 (6)

Thus we get C2 = M2; that is, encryption of M2 still produces M2, since the addition of zero makes no change in (6).
Also, the decryption key K = K1 + K2 for decrypting C1 + C2 can be written as K = K1, since K2 = 0. Thus, according to
the homomorphic property, we can write

D(C1 + C2, K (= K1 + K2)) = D(C1 + M2, K (= K1))
D(C1 + M2, K) = M1 + M2. (7)

If c_wi would exceed 255, a lesser watermark strength (possibly zero) is used so that c_wi remains below 255. Thus,
decrypting C_w we get M + W, since W is inserted in plaintext form.
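The homomorphic argument above — a watermark added to the ciphertext survives decryption as M + W — can be checked on toy byte values. This is a hypothetical sketch; the paper additionally limits the watermark strength so that c_wi stays in range, which the toy data ignores.

```python
import random

def spread(bits, r, alpha, pn):
    # Eqs. (3)-(4): a_j = b_i for i*r <= j < (i+1)*r, then w_j = alpha*a_j*p_j
    return [alpha * bits[j // r] * pn[j] for j in range(len(bits) * r)]

rng = random.Random(0)
M = [rng.randrange(0, 200) for _ in range(8)]   # toy "compressed" bytes
K = [rng.randrange(0, 255) for _ in range(8)]   # encryption key stream
C = [(m + k) % 255 for m, k in zip(M, K)]       # Eq. (1)

bits, r, alpha = [1, -1], 4, 2
P = [rng.choice([-1, 1]) for _ in range(len(M))]
W = spread(bits, r, alpha, P)

Cw = [c + w for c, w in zip(C, W)]              # Eq. (5): embed in ciphertext
Mw = [(cw - k) % 255 for cw, k in zip(Cw, K)]   # decrypt the watermarked copy
print(Mw == [(m + w) % 255 for m, w in zip(M, W)])  # True: decryption yields M + W
```

The identity holds because ((m + k) mod 255 + w − k) mod 255 = (m + w) mod 255, which is exactly the additive homomorphism of (2) and (7).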
C. Watermark Detection
The watermark can be detected either in the encrypted or in the decrypted compressed domain. We also discuss
uncompressed-domain detection in the case of the SS technique. We first discuss detection in the encrypted domain,
followed by the decrypted domain.
1) Encrypted Domain Detection: In the encrypted domain, as shown in Fig. 1, C_w is given directly to the watermark
extraction module, and the detection process is as follows.
SS: The received encrypted-watermarked signal C_w = C + W is applied to a correlator detector. It is multiplied by the
PN sequence P used for embedding, followed by summation over the chip-rate window r, yielding the correlation sum S_i.
Assuming zero correlation between C and P,

S_i = Σ_{j=ir}^{(i+1)r−1} c_wj p_j = Σ_j (c_j + w_j) p_j ≈ b_i σ_p² α r (8)

The first term in (8), i.e., Σ c_j p_j, is zero if C and P are uncorrelated. However, this is not always the case for
real compressed data. Thus, we can apply the non-blind detection technique, i.e., subtract C from C_w to remove the
correlation effect completely and thus get a better watermark detection rate. The sign of S_i gives the watermark
information bit:

sign(S_i) = sign(b_i σ_p² α r) = sign(b_i) = b_i (9)

However, the distributors can also use a prefiltered (semi-blind) detection technique, which may be required in
ownership-proving applications. In this case, the watermarked message is first passed through a high-pass filter to
reduce the cross-talk between the watermark signal and the host samples. The filtered message is then multiplied by
the PN sequence, thereby extracting the watermark.
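The non-blind correlator of (8)-(9) can be sketched on hypothetical toy data; subtracting the host ciphertext C before correlating removes the host interference term entirely, so the sign of each window sum recovers the embedded bit.

```python
import random

rng = random.Random(1)
bits, r, alpha = [1, -1, 1], 5, 4
N = len(bits) * r
P = [rng.choice([-1, 1]) for _ in range(N)]          # PN sequence
W = [alpha * bits[j // r] * P[j] for j in range(N)]  # spread watermark, Eqs. (3)-(4)
C = [rng.randrange(0, 255) for _ in range(N)]        # host ciphertext
Cw = [c + w for c, w in zip(C, W)]                   # watermarked ciphertext, Eq. (5)

detected = []
for i in range(len(bits)):
    # Eq. (8): correlate over one chip-rate window; C is subtracted (non-blind)
    Si = sum((Cw[j] - C[j]) * P[j] for j in range(i * r, (i + 1) * r))
    detected.append(1 if Si > 0 else -1)             # Eq. (9): sign(Si) = b_i
print(detected)  # [1, -1, 1]
```

After subtraction each window sum is exactly α b_i r (since p_j² = 1), so detection is error-free in this idealized setting.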
2) Decrypted Domain Detection: The received compressed
encrypted watermarked image is first passed through the
decryption module, shown in Fig. 1, and is decrypted using
(10), which defines the byte-by-byte decryption corresponding to the encryption defined in (1). The received signal
C_w is decrypted to give M_w as

M_w = D(C_w, K) = {m_wi},
m_wi = (c_wi − k_i) mod 255 = (c_i + w_i − k_i) mod 255 = m_i + w_i ∀ i = 0, 1, …, L−1 (10)
It can be seen from (10) that m_wi = m_i + w_i: the watermarked compressed byte stream is merely the sum of the
compressed byte stream m_i and the watermark signal w_i. Thus, by controlling the strength of w_i and the choice of
resolution levels and bit planes, the quality of the watermarked signal can be easily controlled. The watermarked
quality would be poor if we picked a larger number of resolution levels and bit planes to watermark, but the watermark
embedding capacity would then be higher, and vice versa.

For SS detection, the embedded watermark information W can be estimated from M_w using a correlation detector even
without knowledge of the corresponding originals M or C. However, M and P may not always be uncorrelated, and hence
the noise due to M may not be completely eliminated. Therefore, to obtain better detection results, we can encrypt M_w
with K, which gives C_w, and subtracting C gives

S_i = Σ_j w_j p_j = Σ_j α a_j p_j p_j = b_i σ_p² α r (11)

Thus, the sign of S_i gives the watermark information bit:

sign(S_i) = sign(b_i σ_p² α r) = sign(b_i) = b_i (12)
III. CONCLUSION
The proposed work introduces a robust watermarking technique for images that can identify authorized/unauthorized
users, and touches on the limitations and possibilities of each approach. Although only the very surface of the field
was scratched, it was still enough to draw several conclusions about digital watermarking. LSB is the straightforward
method of watermark embedding, used to embed the watermark into the least significant bits of the cover image. Another
observation is that frequency domains are typically better candidates for watermarking than the spatial domain, both
for robustness and for the visual quality of the recovered watermark image. Embedding in the DCT domain proved to be
highly resistant to ".bmp" compression as well as to significant amounts of random noise. The algorithm is simple to
implement as it is performed directly in the compressed-encrypted domain, i.e., it does not require decryption or
partial decompression of the content. Our scheme also preserves the confidentiality of the content, as the embedding
is done on encrypted data. The homomorphic property of the cryptosystem is exploited, which allows us to detect the
watermark after decryption and to control the image quality as well. Detection is carried out in the compressed or
decompressed domain; in the decompressed domain, non-blind detection is used. We analyze the relation between payload
capacity and image quality (in terms of PSNR and SSIM) for different resolutions. Experimental results show that the
higher resolutions carry higher payload capacity without affecting the quality much, whereas the middle resolutions
carry lesser capacity and degrade quality more than watermarking the higher resolutions does. However, the higher
resolutions might be truncated to meet bandwidth requirements, and in that case the middle resolutions provide a good
space for embedding. The distortion due to the round-off process also plays a significant role in determining the BER,
and this effect is analyzed by comparison against the results of the original watermarking schemes. Future work aims
at extending the proposed scheme to other image compression schemes such as JPEG and JPEG-LS.
ACKNOWLEDGEMENT
The authors wish to acknowledge SJB Institute of
Technology for providing guidance and resources to carry out
this work.
REFERENCES
[1] S. Hwang, K. Yoon, K. Jun, and K. Lee, "Modeling and implementation of digital rights," J. Syst. Softw., vol. 73, no. 3, pp. 533–549, 2004.
[2] A. Sachan, S. Emmanuel, A. Das, and M. S. Kankanhalli, "Privacy preserving multiparty multilevel DRM architecture," in Proc. 6th IEEE Consumer Communications and Networking Conf., Workshop on Digital Rights Management, 2009, pp. 1–5.
[3] T. Thomas, S. Emmanuel, A. Subramanyam, and M. Kankanhalli, "Joint watermarking scheme for multiparty multilevel DRM architecture," IEEE Trans. Inf. Forensics Security, vol. 4, no. 4, pp. 758–767, Dec. 2009.
[4] A. Subramanyam, S. Emmanuel, and M. Kankanhalli, "Compressed-encrypted domain JPEG2000 image watermarking," in Proc. IEEE Int. Conf. Multimedia and Expo, 2010, pp. 1315–1320.
[5] H. Wu and D. Ma, "Efficient and secure encryption schemes for JPEG2000," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, 2004, vol. 5, pp. 869–872.
[6] M. Deng, T. Bianchi, A. Piva, and B. Preneel, "An efficient buyer-seller watermarking protocol based on composite signal representation," in Proc. 11th ACM Workshop on Multimedia and Security, 2009, pp. 9–18.
[7] T. Bianchi, A. Piva, and M. Barni, "Composite signal representation for fast and storage-efficient processing of encrypted signals," IEEE Trans. Inf. Forensics Security, vol. 5, no. 1, pp. 180–187, Mar. 2010.
[8] S. Lian, Z. Liu, R. Zhen, and H. Wang, "Commutative watermarking and encryption for media data," Opt. Eng., vol. 45, pp. 1–3, 2006.
[9] F. Battisti, M. Cancellaro, G. Boato, M. Carli, and A. Neri, "Joint watermarking and encryption of color images in the Fibonacci-Haar domain," EURASIP J. Adv. Signal Process., vol. 2009.
[10] M. Cancellaro, F. Battisti, M. Carli, G. Boato, F. De Natale, and A. Neri, "A joint digital watermarking and encryption method," in Proc. SPIE Security, Forensics, Steganography, and Watermarking of Multimedia Contents X, 2008, vol. 6819, pp. 68191C–68191C.
[11] J. Prins, Z. Erkin, and R. Lagendijk, "Anonymous fingerprinting with robust QIM watermarking techniques," EURASIP J. Inf. Security, vol. 2007.
[12] Z. Li, X. Zhu, Y. Lian, and Q. Sun, "Constructing secure content-dependent watermarking scheme using homomorphic encryption," in Proc. IEEE Int. Conf. Multimedia and Expo, 2007, pp. 627–630.
[13] Q. Sun, S. Chang, M. Kurato, and M. Suto, "A quantitative semi-fragile JPEG2000 image authentication system," in Proc. Int. Conf. Image Processing, 2002, vol. 2, pp. 921–924.
[14] R. Rivest, A. Shamir, and L. Adleman, "A method for obtaining digital signatures and public-key cryptosystems," Commun. ACM, vol. 21, no. 2, pp. 120–126, 1978.
[15] S. Goldwasser and S. Micali, "Probabilistic encryption," J. Comput. Syst. Sci., vol. 28, no. 2, pp. 270–299, 1984.
Cancellation of Noise in ECG Signal Using Low Pass IIR Filters
Menaka S. Naik1, Akshatha N2, Mrs. Nirmala L3
1M.Tech (VLSI and Embedded Systems), REVA Institute of Technology and Management, Bengaluru, India
2Nagarjuna College of Engineering, Bengaluru, India
3Dept. of Electronics and Communication Engg., REVA Institute of Technology and Management, Bengaluru, India
[email protected], [email protected]
Abstract: In diagnosis from the ECG signal, signal acquisition must be noise free. Experienced physicians are able to
make an informed medical diagnosis of heart condition by observing the ECG signal. This paper deals with the
application of digital IIR filters to the raw ECG signal; at the end, these filter types are compared. While being
recorded, the ECG signal gets corrupted by different noise interferences and artefacts. Noise and interference are
usually large enough to obscure small-amplitude features of the ECG that are of physiological or clinical interest. We
have used MATLAB for this purpose, as it is the most advanced tool for DSP applications. The filters are designed
using the MATLAB FDA Tool by specifying the filter order, cut-off frequency and sampling frequency.
Keywords:- ECG signal, IIR filters, MATLAB, Signal to Noise
Ratio (SNR)
I. INTRODUCTION
ECG is a method to measure and record the different electrical potentials of the heart. Owing to the intensifying
importance of biomedical signal processing, increasing efforts are devoted to noise reduction of the biomedical ECG
signal. The task can become complicated when quality is degraded due to interference in the ECG signal, making
interpretation quite difficult. So removal of this noise is needed in ECG analysis for correct diagnosis. As Fig. 1
depicts, each ECG signal of a normal heartbeat consists of six successive peaks and waves, namely P, Q, R, S, T and U.
The P wave reflects the activation of the right and left atria. The QRS complex shows the depolarization of the right
and left ventricles. The T wave, which follows the QRS complex, reflects ventricular repolarization. The
repolarization of the atria is not recorded on the ECG reading. The electrocardiogram can
measure the rate and rhythm of the heartbeat, as
well as provide indirect evidence of blood flow to
the heart muscle.
Fig 1: Normal ECG Signal.
The signal is normally corrupted by two major groups of noise, generated by biological and environmental sources. The
first group includes muscle contraction or electromyographic (EMG) interference, baseline drift, ECG amplitude
modulation due to respiration, and motion artifacts caused by changes in electrode-skin impedance with electrode
motion. The second group includes power line interference, electrode contact noise, instrumentation noise generated by
the electronic devices used in signal processing, electrosurgical noise and radio frequency interference. The
different types of interference are listed in Fig. 2. For meaningful and accurate detection, steps have to be taken to
filter out or discard all these noise sources.
Fig 2: Interferences in the ECG signal
II. LITERATURE SURVEY
Several techniques have been presented in the literature to effectively reduce the noise in ECG analysis.
Ferdjallah M. and Barr R. E. introduced frequency-domain digital filtering techniques for the removal of PLI [1].
Sornmo L. has applied time-varying filtering techniques to the problem of baseline shift [2]. McManus C. D., Neubert
K. D. and Cramer E. have compared digital filtering methods for the elimination of AC noise in ECG [3]. Patricia Arand
patented a method and apparatus for removing baseline wander from an ECG signal [4]. Pei S. C. and Tseng C. C.
proposed an IIR notch filter with transient suppression for ECG [5]. Hamid Gholam-Hosseini, Homer Nazeran and Karen J.
Reynolds elaborated on ECG noise cancellation using digital filters [6]. A nonlinear adaptive method for the
elimination of powerline interference in ECG
signals was developed by Ziarani A. K. and Konrad A. [7]. Mitov I. P. presented a method for the reduction of power
line interference in the ECG [8]. Yong Lian and Poh Choo Ho focused on multiplier-free digital filters [9]. Lisette P.
Harting, Nikolay M. Fedotov and Cornelis H. Slump discussed baseline drift suppression in ECG recordings [10].
Dotsinky I. and Stayanov T. discussed power-line interference cancellation in ECG signals [11]. Jacek M. Leski and
Norbert Henzel have proposed a combination of ECG baseline wander and PLI reduction using a nonlinear filter bank
[12]. Lu G. et al. have suggested a fast-convergence recursive least squares algorithm to enable the filter to track
complex dystonic EMGs and to effectively remove ECG noise. The adaptive filter procedure proved a reliable and
efficient tool to remove ECG artifact from surface EMGs with mixed and varied patterns of transient, short and
long-lasting dystonic
contractions [13]. A new asynchronous averaging and filtering (AAF) algorithm was proposed by Gautam A. et al. for ECG
signal denoising. The AAF algorithm reduces random noise (a major component of EMG noise) in an ECG signal and
provides comparatively good results for baseline wander noise cancellation. The SNR improves in the filtered ECG
signal, while the signal shape remains undistorted. The AAF algorithm is more advantageous than adaptation algorithms
like the Wiener and LMS algorithms [14]. Sorensen J. S. et al. have described a comparison of IIR and wavelet
filtering for noise reduction of the ECG. Ideally, the output of the optimal filter has perfect noise removal, no
distortion and low computation time. This criterion was used to select one wavelet filter and one IIR filter to be
used on the ECG with transient muscle activity. For this signal, the root mean square errors (RMSE) of the non-noise
and noise segments were calculated using the selected wavelet and IIR filters [15]. Gupta R., Bera J. N. and Mitra M.
have developed a simple, cost-effective online ECG acquisition system for further data processing. An 8051-based
dedicated embedded system is used for converting the digitized ECG into serial data, which is delivered to a
standalone PC through a COM port for storage and analysis. A serial link is preferred as it minimizes cable costs and
interference effects compared with a parallel link. The developed MATLAB-based graphical user interface (GUI)
facilitates a user in controlling the operations of the entire system [16]. Luo S. and Johnston P. have discussed, as
a review, the issues related to the inaccuracy of ECG preprocessing filters, in the context of facilitating efficient
ECG interpretation and diagnosis [17].
In this paper, Chebyshev and Butterworth filters are designed and implemented. Finally, the improvement in the ECG
signal due to noise reduction is presented and discussed. This paper aims at a MATLAB-based design of digital filters
which can further be extended to interface with an FPGA.
III. DIGITAL IIR FILTER
IIR systems have an impulse response that is non-zero over an infinite length of time. An IIR filter may be
implemented as either an analog or a digital filter. In a digital IIR filter, the output feedback is immediately
apparent in the equation defining the output.
3.1 Butterworth Filter
The Butterworth filter provides the best Taylor series approximation to the ideal lowpass filter response at the
analog frequencies Ω = 0 and Ω = ∞: for any order N, the magnitude-squared response has 2N − 1 zero derivatives at
these locations (maximal flatness). The response is monotonic overall, decreasing smoothly from Ω = 0 to Ω = ∞, with
|H(jΩ)| = 1/√2 at Ω = 1 (i)
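The textbook normalized Butterworth magnitude, |H(jΩ)| = 1/√(1 + Ω^(2N)), makes the −3 dB point of Eq. (i) easy to verify numerically. This is a small illustrative check, not code from the paper.

```python
import math

def butter_mag(omega, N):
    # Magnitude of an N-th order normalized (cutoff Omega = 1) Butterworth lowpass
    return 1.0 / math.sqrt(1.0 + omega ** (2 * N))

for N in (2, 4, 8):
    assert abs(butter_mag(1.0, N) - 1 / math.sqrt(2)) < 1e-12  # Eq. (i): -3 dB at cutoff
    assert butter_mag(0.0, N) == 1.0                           # unity gain at Omega = 0
    assert butter_mag(10.0, N) < butter_mag(10.0, N - 1)       # steeper rolloff as N grows
```

The last assertion reflects the trade-off discussed in the text: increasing the order sharpens the transition while the response stays monotonic.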
3.2 Chebyshev Type-I Filter
The Chebyshev Type-I filter minimizes the absolute difference between the ideal and actual frequency response over the
passband by incorporating an equal ripple of Rp dB in the passband. The stopband response is maximally flat. The
transition from passband to stopband is more rapid than for the Butterworth filter, with

|H(jΩ)| = 10^(−Rp/20) at Ω = 1 (ii)
3.3 Chebyshev Type-II Filter
The Chebyshev Type-II filter minimizes the absolute difference between the ideal and actual response over the entire
stopband by incorporating an equal ripple of Rs dB in the stopband. The passband response is maximally flat. The
stopband does not approach zero as quickly as for the Type-I filter. The absence of ripple in the passband, however,
is often an important advantage, with

|H(jΩ)| = 10^(−Rs/20) at Ω = 1 (iii)
3.4 Adaptive Filter
The typical least mean square (LMS) algorithm is employed for updating the tap weights of the adaptive filter, whose
output is of the form:

y(n) = Σ_{k=0}^{M−1} w_k(n) x(n − d − k) (iv)

where M is the number of taps of the adaptive filter. The error signal e(n) is:

e(n) = x(n) − y(n) (v)

The tap weights of the adaptive filter are updated according to the rule:

w_k(n + 1) = w_k(n) + μ e(n) x(n − d − k) (vi)

where k = 0, 1, …, M−1 and μ is the step size controlling the speed of convergence. After completing the learning,
with x(n) = s1(n) + n1(n) and the filter output converging to the noise estimate n1′(n), the error signal becomes:

e(n) = x(n) − y(n) = s1(n) + n1(n) − n1′(n) ≈ s1(n) (vii)

Hence the output of the filter is the clean ECG signal.
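A minimal sketch of the LMS update of Eqs. (iv)-(vi), assuming delay d = 0 and using a 60 Hz reference input to cancel mains interference from a toy slow-wave signal. All parameters here (tap count, step size, signal frequencies) are illustrative, not taken from the paper.

```python
import math

def lms_filter(ref, primary, M=8, mu=0.01):
    # Eqs. (iv)-(vi) with d = 0: y(n) = sum_k w_k(n) ref(n-k),
    # e(n) = primary(n) - y(n), w_k(n+1) = w_k(n) + mu*e(n)*ref(n-k)
    w = [0.0] * M
    err = []
    for n in range(len(ref)):
        taps = [ref[n - k] if n - k >= 0 else 0.0 for k in range(M)]
        y = sum(wk * xk for wk, xk in zip(w, taps))
        e = primary[n] - y
        w = [wk + mu * e * xk for wk, xk in zip(w, taps)]
        err.append(e)
    return err, w

fs = 360.0
sig = [math.sin(2 * math.pi * 1.2 * i / fs) for i in range(2000)]        # slow "ECG-like" wave
hum = [0.5 * math.sin(2 * math.pi * 60 * i / fs + 0.3) for i in range(2000)]  # mains hum
ref = [math.sin(2 * math.pi * 60 * i / fs) for i in range(2000)]         # reference input
e, _ = lms_filter(ref, [s + h for s, h in zip(sig, hum)])
# After convergence the error output tracks the clean signal, as in Eq. (vii)
tail_err = sum((e[i] - sig[i]) ** 2 for i in range(1500, 2000)) / 500
```

Since an FIR combination of the 60 Hz reference can match the hum's amplitude and phase but cannot synthesize the 1.2 Hz component, the filter cancels only the interference and the error converges to the signal.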
IV. METHODOLOGY
We take an ECG data signal as the input in the analysis of noise removal using IIR filter design techniques. The first
group is intended to serve as a representative sample of the variety of waveforms and artifacts that an arrhythmia
detector might encounter in routine clinical use. The bandpass-filtered signals were digitized at 360 Hz per signal
relative to real time using hardware constructed at the MIT Biomedical Engineering Center and at the BIH Biomedical
Engineering Laboratory. The sampling frequency was chosen to facilitate implementations of 60 Hz (mains frequency)
digital notch filters in arrhythmia detectors. Since the recorders were battery-powered, most of the 60 Hz noise
present in the database arose during playback. The sampling frequency of the data signal is 360 Hz and the amplitude
±1 mV. Filtering of the noisy ECG signal is set up in two steps: in the first step the baseline drift is removed from
the input data signal and then 10 dB AWGN noise is introduced into it; in the second step a filter is designed with
the help of the FDA Tool in MATLAB. The FDA Tool parameters are set for a lowpass IIR filter with a sampling frequency
of 360 Hz and
minimum order. The passband (Fp) and stopband (Fs) edge frequencies are 54 Hz and 60 Hz, and the filter attenuations
Ap = 1 dB and As = 80 dB are set in the FDA Tool. The original ECG data signal before and after baseline removal is
shown in Fig. 3.
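The same specification (Fp = 54 Hz, Fs = 60 Hz, Ap = 1 dB, As = 80 dB, sampling at 360 Hz) can be reproduced outside the FDA Tool. The following is a hypothetical SciPy sketch of the minimum-order designs, not the authors' MATLAB session.

```python
import numpy as np
from scipy import signal

fs = 360.0
nyq = fs / 2
wp, ws = 54 / nyq, 60 / nyq      # passband / stopband edges, normalized to Nyquist
ap, as_db = 1, 80                # Ap = 1 dB, As = 80 dB

Nb, _ = signal.buttord(wp, ws, ap, as_db)        # minimum Butterworth order
Nc, Wnc = signal.cheb1ord(wp, ws, ap, as_db)     # minimum Chebyshev-I order
sos = signal.cheby1(Nc, ap, Wnc, output='sos')   # second-order sections for stability

# Verify the stopband attenuation of the Chebyshev-I design
w, h = signal.sosfreqz(sos, worN=8192)
mag_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))
stop_max = mag_db[w >= ws * np.pi].max()
print(Nb, Nc, stop_max)
```

The narrow 54-60 Hz transition with 80 dB attenuation forces a much higher Butterworth order than Chebyshev-I, which matches the text's remark that the Chebyshev transition is more rapid for a given order.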
Proposed Work:
In the proposed work, digital filters such as the adaptive filter, Butterworth filter and Chebyshev filter are
designed using MATLAB, and the results are shown below.
1. Results of Adaptive filter:
Fig 3: Original ECG, noisy ECG signal, signal after noise removal, and magnitude response of the adaptive filter.
2. Results of Butterworth filter:
Fig 4: Original ECG signal, noisy ECG signal, signal after noise removal, and magnitude response of the Butterworth filter.
3. Results of Chebyshev-1 filter:
Fig 5: Original ECG, noisy ECG signal, signal after noise removal, and magnitude response of the Chebyshev filter.
V. CONCLUSION
The digital filter is a prominent solution that provides noise reduction up to a satisfactory level. Digital filtering
techniques are well suited for ECG analysis and thereby help in improving the quality of the ECG signal. The results
obtained from the adaptive, Butterworth and Chebyshev filters are compared on the basis of signal-to-noise ratio.
Future work
will involve implementing these digital filters in VHDL on an FPGA platform.
REFERENCES
[1] Ferdjallah M., Barr R. E., "Frequency domain digital filtering techniques for the removal of powerline noise with application to the electrocardiogram," Comput. Biomed. Res., vol. 23, no. 5, pp. 473–489, Oct. 1990.
[2] Manish Kansal, Hardeep Singh Saini, Dinesh Arora, "Designing & FPGA implementation of IIR filter used for detecting clinical information from ECG," International Journal of Engineering and Advanced Technology (IJEAT), vol. 1, issue 1, Oct. 2011.
[3] Sonal K. Jagtap, M. D. Uplane, "A real time approach: ECG noise reduction in Chebyshev Type II digital filter," International Journal of Computer Applications (0975–8887), vol. 49, no. 9, July 2012.
[4] Mohandas Choudhary, Ravindra Pratap Narwaria, "Suppression of noise in ECG signal using low pass IIR filters," International Journal of Electronics and Computer Science Engineering (ISSN 2277-1956), 2011.
[5] Yogesh Sharma, Anurag Shrivastava, "Periodic noise suppression from ECG signal using novel adaptive filtering techniques," International Journal of Electronics and Computer Science Engineering (IJCSE), vol. 1.
[6] Jane R., Laguna P., Thakor N., Caminal P., "Adaptive baseline wander removal in the ECG: comparative analysis with cubic spline technique," IEEE Proc. Computers in Cardiology, pp. 143–146, 1992.
[7] Hamid Gholam-Hosseini, Homer Nazeran, Karen J. Reynolds, "ECG noise cancellation using digital filters," 2nd International Conference on Bioelectromagnetism, Feb. 1998.
[8] Seema Rani, Amarpreet Kaur, J. S. Ubhi, "Comparative study of FIR and IIR filters for the removal of baseline noises from ECG signal," International Journal of Computer Science and Information Technologies, vol. 2, no. 3, 2011.
[9] Lisette P. Harting, Nikolay M. Fedotov, Cornelis H. Slump, "On baseline drift suppressing in ECG recordings," 2004 IEEE Benelux Signal Processing Symposium.
[10] Choy T. T., Leung P. M., "Real time microprocessor-based 50 Hz notch filter for ECG," J. Biomed. Eng., vol. 10, no. 5, pp. 285–288, May 1988.
[11] Ferdjallah M., Barr R. E., "Frequency-domain digital filtering techniques for the removal of powerline noise with application to the electrocardiogram," Comput. Biomed. Res., vol. 23, no. 5, pp. 475–489, Oct. 1990.
SYSTOLIC BASED OPTIMIZATION TOOL FOR 1D & 2D FIR FILTERS USING TOURNAMENT SELECTION METHOD
Chandra Shekar P1 B.E., M.Tech, Divya K S2 B.E., M.Tech, Chaya P3 B.E., M.Tech., Associate Professor
1,2Dept. of ECE, VTU Belgaum, KVG College of Engineering, Sullia - 574 327, D K, Karnataka, India.
3Dept. of ISE, VTU Belgaum, GSSSIETW, KRS Road, Mysore - 570016, Karnataka, India.
[email protected], [email protected], [email protected]
Abstract - The project is concerned with the design of systolic arrays by using linear mapping techniques on regular
dependence graphs (DG). The mapping technique transforms a dependence graph to a space-time representation, where each
node is mapped to a certain processing element and is scheduled to a certain time instance. The systolic design
methodology maps an N-dimensional DG to a lower-dimensional systolic architecture. The basis vectors involved in the
systolic array design should satisfy the feasibility condition for designing the tool. MATLAB version 7.01 is the
platform used to design the FIR tool for faster implementation and to achieve low-level designs for the selected
vectors. The tool designed can also be used in the selection of the scheduling inequalities and the projection vector
to meet the feasibility condition, and to achieve 100% HUE using the "Tournament Selection" method.
Tournament selection, typically used in evolutionary programming, allows tuning the degree of stringency of the
selection imposed: rather than selecting on the basis of each solution's fitness or error in light of the objective
function at hand, selection is made on the basis of the number of wins earned in a competition.
Index Terms – Systolic array, Dependence graph,
Processing element, Tournament Selection method.
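Tournament selection as described in the abstract can be sketched as follows. This is a generic, hypothetical illustration of the method (the population, objective and tournament size q are invented for the example, and self-competition is not excluded), not the paper's optimization tool.

```python
import random

def tournament_select(population, error, q=7, rng=random):
    # Each solution competes against q randomly chosen opponents and earns a
    # "win" for every opponent whose error is no better than its own.
    wins = []
    for f in error:
        opponents = [rng.randrange(len(population)) for _ in range(q)]
        wins.append(sum(1 for j in opponents if error[j] >= f))
    # Survivors are the solutions with the most wins, not the best raw error;
    # larger q makes the selection pressure more stringent.
    order = sorted(range(len(population)), key=lambda i: wins[i], reverse=True)
    return [population[i] for i in order[: len(population) // 2]]

rng = random.Random(0)
pop = [rng.uniform(-5, 5) for _ in range(20)]
err = [x * x for x in pop]                 # toy objective: minimize x^2
survivors = tournament_select(pop, err, q=7, rng=rng)
print(len(survivors))  # 10
```

Selecting on win counts rather than raw fitness is what lets the tournament size q tune the stringency of selection, as the abstract notes.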
I. INTRODUCTION
The flow approach used in systolic design is shown in Fig. 1.1.
Fig. 1.1 Flow of systolic design
II. SYSTOLIC ARCHITECTURE DESIGN
A. Basic principle of Systolic array
High performance, special-purpose computer systems are
typically used to meet specific application requirements or to
off-load computations that are especially taxing to general-
purpose computers. As hardware cost and size continue to drop
and processing requirements become well-understood in areas
such as signals and image processing, more special-purpose
systems are being constructed. However, since most of these systems are built on an ad hoc basis for specific tasks,
methodological work in this area is rare. Because the knowledge gained from individual experiences is neither
accumulated nor properly organized, the same errors are repeated. I/O and computation imbalance is a notable example:
often, the fact that I/O interfaces cannot keep up with a device's speed is discovered only after constructing a
high-speed, special-purpose device.
We intend to help correct this ad hoc approach by providing a general guideline: specifically, the concept of systolic
architecture, a general methodology for mapping high-level computations into hardware structures. In a systolic
system, data flows from the computer memory in a rhythmic fashion, passing through many processing elements before it
returns to memory, much as blood circulates to and from the heart. The system works like an automobile assembly line
where different people work on the same car at different times and many cars are assembled simultaneously. An assembly
line is always linear, however, whereas systolic systems are sometimes two-dimensional.
The systolic architectural concept was developed at
Carnegie-Mellon University and versions of systolic
processors are being designed and built by several industrial
and governmental organizations.
Instead of at most 5 million operations per second, up to
30 million operations per second become possible.
Fig. 2.1 Organization of memory and processing elements
B. Systolic Systems
Systolic systems consist of an array of processing elements
(PEs) called cells. Each cell is connected to a small number
of nearest neighbors in a mesh-like topology. Each cell
performs a sequence of operations on data that flows between
them. Generally the operations are the same in each cell; each
cell performs an operation or a small number of operations on
a data item and then passes it to its neighbor. Systolic arrays
compute in "lock-step", with each cell (processor) undertaking
alternate compute/communicate phases.
C. Processing element
Fig. 2.2 shows the processing element architecture. The IN
input stores data coming from the other processors in a
dedicated I/O register file. The CTE input is connected to a
broadcast bus for receiving constant data from the outside
world and from the memory. These data are stored in a
specific constant register file before being used.
Fig. 2.2 Internal processing element architecture
As the arithmetic operations involved are very limited, two
specific units are implemented, namely an adder and a
minimizer. These two units are pipelined due to the regular and
repetitive structure of the computation. The accumulator can
be loaded either from the adder or from the minimizer.
The processors of the array are activated by micro-
commands which control actions such as accumulator loading,
I/O and CTE register selection, data acquisition, etc. These
micro-commands are specified by an instruction which is
received from the outside and decoded. In the same way, the
memory actions are programmed depending on the calculation
being performed.
The processor array, reference memory and data reference array
thus operate synchronously, since they each execute one
instruction every machine cycle. One instruction specifies
actions to be realized concurrently on all these units. In that
sense, the processor array can be considered to have an SIMD
execution mode: all the cells are executing the same
instruction at the same time.
D. Features of Systolic arrays
A systolic array is a computing network possessing the
following features: synchrony, modularity, regularity, spatial
locality, temporal locality, pipelinability, and parallel computing.
Synchrony means that the data is rhythmically computed
(timed by a global clock) and passed through the network.
Modularity means that the array (finite or infinite) consists of
modular processing units.
Regularity means that the modular processing units are
interconnected homogeneously.
Spatial locality means that each cell has a local
communication interconnection.
Temporal locality means that a cell transmits signals to
another cell with at least one unit of time delay.
Pipelinability means that the array can achieve a high speed.
E. Mapping to Systolic Array
Various approaches to map computational algorithms onto
systolic array structures have been proposed [13]. Generally
they can be classified into three categories: functional
transformation, retiming, and dependence mapping. Dependence
mapping consists of several steps:
Step 1: map the algorithm to a dependence graph (DG).
Step 2: map the dependence graph to a signal flow
graph (SFG).
Step 3: map the SFG onto an array processor.
A dependency diagram is a visual representation of a
dependency graph; for a dependency graph without circular
dependencies, it can be interpreted as a Hasse diagram of the
graph. Dependency diagrams are integral to software
development, outlining the complex interrelationships of
various functional elements. Typically, in a dependency
diagram, arrows point from each module to the other modules
upon which it depends. The mapping of dependencies is
shown below in Fig. 3.3.
Fig.3.3 Space representation
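The DG-to-SFG-to-array mapping can be made concrete with a small software simulation of a 1-D systolic FIR filter, the design the TOOL is later compared against. This is only a sketch under assumed design choices (weight-stationary cells, with the input stream moving at half the speed of the partial sums), not the paper's actual tool:

```python
def systolic_fir(weights, samples):
    """Cycle-accurate simulation of a 1-D systolic FIR filter.

    Each cell holds one tap weight. Input samples flow left-to-right
    through two registers per cell (half speed), partial sums flow
    through one register per cell, and all cells update in lock-step
    on a global clock, as in a systolic array.
    """
    k = len(weights)
    x1 = [0.0] * k   # first input register of each cell
    x2 = [0.0] * k   # second input register (slows x to half the sum speed)
    y = [0.0] * k    # partial-sum register of each cell
    out = []
    for x_in in list(samples) + [0.0] * 2 * k:  # trailing zeros flush the pipe
        new_x1, new_x2, new_y = [0.0] * k, [0.0] * k, [0.0] * k
        for i in range(k):
            x_into = x_in if i == 0 else x2[i - 1]   # sample entering cell i
            y_into = 0.0 if i == 0 else y[i - 1]     # partial sum entering cell i
            new_x1[i] = x_into
            new_x2[i] = x1[i]
            new_y[i] = y_into + weights[i] * x_into  # multiply-accumulate
        x1, x2, y = new_x1, new_x2, new_y
        out.append(y[-1])                            # tap the last cell
    # discard the initial pipeline-fill latency of k-1 cycles
    return out[k - 1 : k - 1 + len(samples)]
```

For example, `systolic_fir([1, 2], [3, 4, 5])` reproduces the direct convolution output `[3, 10, 13]`, confirming that the lock-step schedule realizes y[n] = Σ w[i]·x[n−i].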
III TOURNAMENT SELECTION
In tournament selection, a number Tour of individuals is
chosen randomly from the population and the best individual
from this group is selected as a parent. This process is repeated
as often as parents must be chosen. The selected parents then
produce offspring, paired uniformly at random. The parameter
for tournament selection is the tournament size Tour, which
takes values ranging from 2 to Nind (the number of individuals
in the population).
In selection the offspring producing individuals are chosen.
The first step is fitness assignment. Each individual in the
selection pool receives a reproduction probability depending
on the own objective value and the objective value of all other
individuals in the selection pool. This fitness is used for the
actual selection step afterwards.
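The selection loop described above can be sketched as follows; the fitness-list representation and the seeding are illustrative assumptions, not taken from the paper:

```python
import random

def tournament_select(population, fitness, tour, rng=random):
    """Pick one parent: draw `tour` individuals uniformly at random
    (with replacement) and return the one with the highest fitness."""
    competitors = [rng.randrange(len(population)) for _ in range(tour)]
    winner = max(competitors, key=lambda i: fitness[i])
    return population[winner]

def select_parents(population, fitness, n_parents, tour, seed=0):
    """Repeat the tournament until the required number of parents is chosen."""
    rng = random.Random(seed)
    return [tournament_select(population, fitness, tour, rng)
            for _ in range(n_parents)]
```

Increasing `tour` raises the selective pressure: with a larger tournament, weak individuals almost never win, so the mean fitness of the selected parents rises.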
Page 119
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
311
A. Selection schemes
Throughout this section some terms are used for comparing
the different selection schemes.
Selective pressure: the probability of the best individual being
selected compared to the average selection probability of all
individuals.
Bias: the absolute difference between an individual's
normalized fitness and its expected probability of reproduction.
Spread: the range of possible values for the number of
offspring of an individual.
Loss of diversity: the proportion of individuals of a population
that is not selected during the selection phase.
Selection intensity: the expected average fitness value of the
population after applying a selection method to the normalized
Gaussian distribution.
Selection variance: the expected variance of the fitness
distribution of the population after applying a selection method
to the normalized Gaussian distribution.
Table 3-1 and the figure show the relation between tournament
size and selection intensity.
Tournament size      1     2     3     5     10    30
Selection intensity  0     0.56  0.85  1.15  1.53  2.04
Tab. 3-1: Relation between tournament size and selection
intensity
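The intensity values in Tab. 3-1 can be checked numerically: for tournament selection, the selection intensity equals the expected fitness of the tournament winner drawn from a normalized Gaussian population, i.e. the expected maximum of Tour standard-normal samples. A Monte Carlo sketch (the trial count and seed are arbitrary choices):

```python
import random

def tournament_intensity(tour, n_trials=100_000, seed=42):
    """Estimate selection intensity as the mean fitness of a tournament
    winner drawn from a standard normal (normalized Gaussian) population."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # fitness of the winner = maximum of `tour` N(0, 1) samples
        total += max(rng.gauss(0.0, 1.0) for _ in range(tour))
    return total / n_trials
```

Running this for Tour = 2, 3, 5, 10 and 30 reproduces the tabulated values 0.56, 0.85, 1.15, 1.53 and 2.04 to within Monte Carlo error.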
B. Analysis of tournament selection
An analysis of tournament selection covers its selection
intensity, its loss of diversity (about 50% of the population is
lost at tournament size Tour = 5), and its selection variance.
Fig. 3.1: Properties of tournament selection
Importance of Tournament Selection
A small group of individuals is selected from the whole
population.
The best individual in this group is selected and returned
by the operator.
Tournament selection prevents the best individual from
dominating.
The chosen group size can decrease the selective
pressure.
Page 120
Proceedings of National Conference on Wireless Communication, Signal Processing, Embedded Systems-WISE 2013
312
IV RESULTS
V APPLICATIONS
• Matrix Inversion and Decomposition.
• Faster approaches in the field of design.
• Polynomial Evaluation.
• Matrix multiplication.
• Image Processing Convolution.
• Systolic lattice filters used for speech and seismic
signal processing.
• Artificial neural network.
• Robotics
• Equation Solving
• Signal processing
• Image processing
• Solutions of differential equations
• Graph algorithms
• Biological sequence comparison
• Other computationally intensive tasks.
VI CONCLUSION
The systolic architecture is a massively parallel processing
architecture with limited I/O communication with the host
computer, suitable for many regular iterative operations. The
programs written to design the TOOL by making use of the
"Tournament Selection Method" in evolutionary programming
efficiently utilize engineering theory in order to optimize the
systolic array design. The optimized design is compared with
the hand calculations done for a 1D FIR tap filter and the
Reduced Dependency Graph (RDG). The results match; hence
I conclude that the design of the TOOL produces correct
results and is verified. The TOOL designed in this project can
be used by any third party for a better understanding of other
selection methods.
VII FUTURE WORK
The analysis of the tool helps in a better understanding of
other selection methods, whereby further optimization of the
given task can be carried out to achieve 100% HUE. The
TOOL provides ideas for designing Polyphase-FIR filter, DFT,
Polyphase-DFT and IDFT-Polyphase functions.
REFERENCES
[1]. S.Y. Kung, VLSI Array Processors, Prentice Hall, 1988.
[2]. H.T. Kung, "Why systolic architectures?", Computer, vol.
15, p. 37, 1982.
[3]. D.I. Moldovan, Parallel Processing: From Applications
to Systems, Morgan Kaufmann Publishers, 1993.
[4]. S.K. Rao, Regular Iterative Algorithms and their
Implementation on Processor Arrays, Ph.D. Dissertation,
Stanford University, Stanford, CA, 1985.
[5]. M. Flynn, Some Computer Organizations and Their
Effectiveness, IEEE Trans. Comput., vol. C-21, pp. 948, 1972.
[6]. Ralph Duncan, "A Survey of Parallel Computer
Architectures", IEEE, Feb. 1990.
[7]. K.A. De Jong, Genetic Algorithms: A 25-Year
Perspective, in Computational Intelligence: Imitating Life,
IEEE, 1994.
[8]. L.J. Fogel, Evolutionary Programming in Perspective:
The Top-Down View, in Computational Intelligence,
pp. 135-146, IEEE, 1994.
[9]. N. Saravanan, David E. Fogel, Kevin M. Nelson, "A
Comparison of Methods for Self-Adaptation in Evolutionary
Algorithms", IEEE, 1995.
[10]. M.J. Foster and H.T. Kung, "The Design of Special-
Purpose VLSI Chips", Computer, pp. 26-40, Jan. 1980, IEEE.
[11]. Keshab K. Parhi, Chapter 7, pp. 189-210.
[12]. "Evolutionary Computation 1 & 2", by Fogel, 1980.
[13]. P. Quinton, "The Systematic Design of Systolic Arrays",
IRISA Rept., March 1983.
[14]. S.G. Sedukhin and I.S. Sedukhin, An Interactive Graphic
CAD Tool for the Synthesis and Analysis of VLSI Systolic
Structures, Proc. of Int. Conf. "Parallel Computing
Technologies", Obninsk, Russia, 1993, Vol. 1, pp. 163-175.
[15]. Jonathan Break, "Systolic Arrays & Their Applications".
[16]. H.T. Kung and C.E. Leiserson, Systolic Arrays (for
VLSI), Sparse Matrix Proc. 1978, Society for Industrial and
Applied Mathematics, 1979, pp. 256-282.
[17]. G.J. Li and B.W. Wah, The Design of Optimal Systolic
Arrays, IEEE Trans. Comput., C-34(10), 1985, pp. 66-75.
Implementation of Optimized Flame Detection in a Video using Image Processing
Pramod G. Devalatkar Department of Electronics & Communication Engineering,
Visvesvaraya Technological University
Belgaum, India. [email protected]
Abstract— The present work is an in-depth study of detecting
flames in video by processing the data captured by an ordinary
camera. Previous vision-based methods were based on color
difference, motion detection of flame pixels and flame edge
detection. This paper focuses on optimizing flame detection by
identifying gray-cycle pixels near the flame, which are generated
by smoke, by the spreading of fire pixels and by the area spread
of the flame. These techniques can be used to reduce false alarms
along with fire detection methods. The proposed system simulates
the existing fire detection techniques together with the new
techniques given above and provides an optimized way to detect
fire with fewer false alarms, giving an accurate indication of fire
occurrence. The strength of using video in fire detection is the
ability to monitor large and open spaces.
Keywords—Fire detection, Video processing, Edge detection,
Color detection, Gray cycle pixel, Fire pixel spreading.
I. INTRODUCTION
Fire detection sensors are used to detect the occurrence
of fire and to make decisions based on it. However, most
of the available sensors, such as smoke detectors, flame
detectors, heat detectors etc., take time to respond. They
have to be carefully placed in various locations, and they
are not suitable for open spaces. Due to rapid developments
in digital camera technology and video processing
techniques, conventional fire detection methods are being
replaced by computer vision based systems.
Conventional point smoke and fire detectors are widely
used in buildings. They typically detect the presence of
certain particles generated by smoke and fire by
ionization or photometry. Alarm is not issued unless
particles reach the sensors to activate them. Therefore,
they cannot be used in open spaces and large covered
areas. Video based fire detection systems can be useful
to detect fire in large auditoriums, tunnels, atriums, etc.
The strength of using video in fire detection makes it
possible to serve large and open spaces. In addition,
closed circuit television (CCTV) surveillance systems
are currently installed in various public places
monitoring indoors and outdoors. Such systems may
gain an early fire detection capability with the use of fire
detection software processing the outputs of CCTV
cameras in real time. Current vision based techniques
mainly follow the color clues, motion in fire pixels and
edge detection of flame. Fire detection scheme can be
made more robust by identifying the gray cycle pixels
nearby to the flame and measuring flame area
dispersion.
II. OVERVIEW OF FIRE DETECTION
This section covers the details of the previously proposed
fire detection methods. It is assumed that the image
capturing device produces its output in RGB format.
During an occurrence of fire, smoke and flame can be
seen, and they become more visible as the fire intensity
increases. In order to detect the occurrence of fire, both
flame and smoke need to be analysed. Many researchers
have used characteristic properties of fire such as color,
motion, edge and shape. Lai et al. [3] suggested that features
of fire event can be utilized for fire detection in early
stages. Han et. al. [2] used color and motion features
while Kandil et al. [1] and M. Nixon, A. Aguando [6]
utilized shape and color features to detect an occurrence
of fire.
A. Edge detection
The edge detection method is used to detect the color variance
in an image. The edge detection system compares the color
difference and provides an edge based on it [8], which can be
used in fire detection.
B. Color detection
A fire in an image can be described by its color
properties. Color pixels can be decomposed into the
individual elements R, G and B, which can be used for
color detection. C.-L. Lai, J.-C. Y [3] used the R and G
elements and found a correlation between the G/R ratio
and temperature distribution: as temperature increases,
the G/R ratio also increases. Consequently, the color of a
flame can provide useful information for estimating the
temperature of a fire and also the fire phase [9]. In terms
of RGB values, this fact corresponds to the following
inter-relation between the R, G and B color channels:
R > G and G > B. The combined condition for the fire
region in the captured image is R > G > B.
Besides, R should be more stressed than the other
components, because R becomes the dominant color
channel in an RGB image of flames. This imposes
another condition: R must exceed some pre-determined
threshold, RT [1]. However, lighting conditions in the
background may adversely affect the saturation values of
flames, resulting in similar R, G and B values, which may
cause non-flame pixels to be classified as flame colours
[4]. Therefore, the saturation values of the pixels under
consideration should also be above some threshold value.
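The combined rule R > G > B, with a red threshold and a saturation floor, can be written directly as a per-pixel mask. The values RT = 190 and 0.2 below are illustrative assumptions; [1] and [4] do not fix them here:

```python
import numpy as np

def fire_color_mask(rgb, r_threshold=190, sat_threshold=0.2):
    """Mask pixels satisfying the fire-colour rule R > G > B, with R above
    a threshold and sufficient saturation to reject washed-out background."""
    img = rgb.astype(np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    mx = img.max(axis=-1)
    mn = img.min(axis=-1)
    # HSV-style saturation; guard against division by zero on black pixels
    sat = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-9), 0.0)
    return (r > g) & (g > b) & (r > r_threshold) & (sat > sat_threshold)
```

The function returns a boolean mask with the same height and width as the input frame; the masked region is the candidate fire area passed on to the later detection stages.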
C. Motion detection
Motion detection is used to detect any occurrence of
movement in a video. It is done by analysing the differences
between images of consecutive video frames. There are three
main parts in moving pixel detection: frame/background
subtraction, background registration, and moving pixel
detection [10].
The first step is to compute the binary frame difference
map, by thresholding the difference between two
consecutive input frames. At the same time, the binary
background difference map is generated by comparing
the current input frame with the background frame
stored in the background buffer. The binary background
difference map is used as primary information for
moving pixel detection.
In the second step, according to the frame difference
maps of the past several frames, pixels which have not
moved for a long time are considered reliable background
in the background registration [11]. This step
maintains an updated background buffer as well as a
background registration map indicating whether the
background information of a pixel is available or not.
In the third step, the binary background difference map
and the binary frame difference map are used together to
create the binary moving pixel map [6]. If the
background registration map indicates that the
background information of a pixel is available, the
background difference map is used as the initial binary
moving pixel map.
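The three steps above can be sketched as a small stateful detector; the difference threshold and the number of still frames required for background registration are assumed values:

```python
import numpy as np

class MovingPixelDetector:
    """Sketch of the three-step moving-pixel scheme: frame difference,
    background registration, and the combined moving-pixel map."""

    def __init__(self, shape, diff_threshold=15, still_frames=5):
        self.diff_threshold = diff_threshold
        self.still_frames = still_frames   # frames a pixel must stay still
        self.prev = None
        self.still_count = np.zeros(shape, dtype=np.int32)
        self.background = np.zeros(shape, dtype=np.float64)
        self.registered = np.zeros(shape, dtype=bool)

    def update(self, frame):
        frame = frame.astype(np.float64)
        if self.prev is None:
            self.prev = frame
            return np.zeros(frame.shape, dtype=bool)
        # Step 1: binary frame-difference map between consecutive frames
        frame_diff = np.abs(frame - self.prev) > self.diff_threshold
        # Step 2: pixels unchanged for several frames become background
        self.still_count = np.where(frame_diff, 0, self.still_count + 1)
        newly_bg = self.still_count >= self.still_frames
        self.background = np.where(newly_bg, frame, self.background)
        self.registered |= newly_bg
        # Step 3: use background difference where registered, else frame diff
        bg_diff = np.abs(frame - self.background) > self.diff_threshold
        moving = np.where(self.registered, bg_diff, frame_diff)
        self.prev = frame
        return moving
```

Feeding a static scene for a few frames registers it as background; any pixel that subsequently departs from that background is flagged as moving.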
III. PROPOSED TECHNIQUE
The aim of this paper is to develop an identification
system to detect an occurrence of fire based on the video
image. In this paper, I use flame properties to conduct
the fire detection, as shown in the block diagram below:
Fig. 1. Proposed fire detection system.
This method gives the flexibility to use different
combinations of detection methods, so that the system can
be implemented according to the specific requirements of
use [4]. For example:
(1) For a highly sensitive area, the OR (||) operator can be
applied, so that the system signals fire if any one method
detects an occurrence.
(2) For general purposes, a combination of any two methods
can be applied, so that the system signals fire if at least two
methods detect it.
(Fig. 1 blocks: image frames from video; edge & color
detection; motion & gray-cycle detection; decision-making
algorithm; fire alarm signal.)
(3) For a less sensitive area, the AND (&&) operator can be
applied, so that the system signals fire only if all methods
detect it.
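The three combination modes above reduce to a simple vote over the individual detector outputs; a minimal sketch, where the function and mode names are my own:

```python
def fire_decision(detections, mode="majority"):
    """Combine individual detector outputs (booleans) into one alarm.
    'any'      - OR  (||): highly sensitive areas
    'majority' - at least two methods agree: general purpose
    'all'      - AND (&&): less sensitive areas"""
    votes = sum(bool(d) for d in detections)
    if mode == "any":
        return votes >= 1
    if mode == "all":
        return votes == len(detections)
    return votes >= 2
```

For instance, with outputs from motion, gray-cycle and area-dispersion detection, `fire_decision([True, False, False], "any")` raises the alarm while the `"majority"` and `"all"` modes do not.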
IV. METHODOLOGY
The purpose of this paper is to develop an optimized
system to detect an occurrence of fire based on video
images. In this project I have used the previously
proposed methods to conduct fire detection and propose
new techniques to run in parallel, which gives more
optimized flame detection results. Developing the system
involves the following stages.
V. PROPOSED ALGORITHM
The algorithm is based on the fact that visual color
images of fire have high absolute values in the red
component of the RGB coordinates [4]. This property
permits simple threshold-based criteria on the red
component of the color images to segment fire images in
natural scenarios. However, fire is not the only source of
high values in the red component. Another characteristic
of fire is the ratio between the red component and the blue
and green components. An image is loaded into the color
detection system and mapped with the extracted edge
detection image [7]. The color detection system applies
the specific RGB pixel property and outputs an image
with a selected area of color detection.
A. Edge detection
The edge detection method is used to detect the color
variance in an image. The block diagram of the edge
detection system is shown in Figure 2. Using MATLAB,
an edge detection model is built based on this block diagram.
Fig. 2 Block diagram of Edge Detection system
Fig.3 (a). Original Image.
Fig. 3 (a) shows the original frame, i.e. frame 1, and Fig. 3 (b)
shows the result of edge detection on this image.
Fig. 3(b). Image after applying edge detection.
The edge detection system compares the intensity
differences in the image and provides an image in a
black-and-white color space, where high intensity areas
are filled with white and low intensity areas are filled
with black [2]. The intensity difference is categorized
using a global intensity threshold which MATLAB
calculates separately for each image; the output provides
the shape of the flame [4]. Thus, edge detection can be
used to analyse the color detection of fire.
After getting the output from the color detection, we can
apply different detection techniques by mapping the
detected coordinates onto the corresponding original
image in different combinations. We have three
techniques to implement further:
Motion detection.
Gray-cycle pixel detection.
Area dispersion.
B. Motion detection
Motion detection is used to detect any occurrences of
movement in a sample video. The block diagram of the
motion detection system is shown below:
Fig.4. Block diagram of motion detection
Here we take two sequential images from the video
frames [4]. After applying the two basic methods, edge
detection and color detection, we get the probable fire
pixel area; we then compare the RGB values of frame 1
with frame 2 for each corresponding pixel, and if the
pixel values differ, the motion detector reports motion
and passes the resulting output to the combination operator.
C. Gray-cycle pixel detection
Gray-cycle detection is used to detect occurrences of
smoke pixels in the selected area, namely the upper half
of the area detected by the color detection method. A
gray-cycle pixel has certain properties in terms of RGB
values. This method checks these properties inside the
selected area and then, depending on the result obtained,
provides its result to the operator.
D. Area dispersion
The area detection method is used to detect the dispersion
of the fire pixel area across sequential frames. In this
method we compare the fire pixel areas of two sequential
frames on the basis of the minimum and maximum values
of x and y [5]. In case of fire, if any extreme value along
the x or y axis increases in the next frame, area dispersion
has taken place and the system provides output to the
operator.
After that, the operator performs the operation on the
basis of the logic combination selected by the system. The
detected fire pixel area of the image in Fig. 3 (a) is shown
in Fig. 4 below.
Fig. 4. Detected fire pixel area.
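The extreme-value comparison can be sketched directly on lists of detected fire-pixel coordinates; the (x, y) tuple representation is an assumption of this sketch:

```python
def bounding_box(fire_pixels):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y)
    of a list of (x, y) fire-pixel coordinates."""
    xs = [x for x, _ in fire_pixels]
    ys = [y for _, y in fire_pixels]
    return min(xs), min(ys), max(xs), max(ys)

def area_dispersed(prev_pixels, curr_pixels):
    """Dispersion: any extreme of the current frame's box grows
    past the previous frame's box."""
    if not prev_pixels or not curr_pixels:
        return False
    px0, py0, px1, py1 = bounding_box(prev_pixels)
    cx0, cy0, cx1, cy1 = bounding_box(curr_pixels)
    return cx0 < px0 or cy0 < py0 or cx1 > px1 or cy1 > py1
```

A growing fire extends at least one extreme of its bounding box from frame to frame, so the function returns True; a static bright object keeps the same box and returns False.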
VI. CONCLUSIONS
I have collected a number of sequential image frames
from two originally created videos which contain both
fire and non-fire images. The table below shows the
number of false alarm detections over non-fire images.
Method                           No. of Faulty Detections   System Performance
Motion Detection                 (10/50) = 20%              80%
Gray Cycle Detection             (08/50) = 16%              84%
Area Dispersion                  (07/50) = 14%              86%
Proposed Fire Detection System   (03/50) = 6%               94%
The results show that applying the proposed fire detection
system yields better system performance in terms of fewer
false alarms.
REFERENCES
[1]. M. Kandil, M. Salama (2009): A New Hybrid Algorithm for Fire Vision
Recognition, in IEEE Eurocon, pp. 1460-1466.
[2]. D. Han, Byoungmoo Lee (2009): Flame and Smoke Detection method
for early real-time detection of a tunnel fire, Fire Safety Journal, vol. 44,
pp. 951-961.
[3]. C.-L. Lai, J.-C. Y (2008): Advanced Real Time Fire Detection in Video
Surveillance System, in IEEE International Symposium on Circuits and
Systems (ISCAS), pp. 3542-3545.
[4]. Gaurav Yadav et al. (2012): Image Processing based fire and flame
detection technique, in The Indian Journal of Computer Science and
Engineering (IJCSE).
[5]. Turgay Celik (2010): Fast and Efficient Method for Fire Detection Using
Image Processing, ETRI Journal, Volume 32, Number 6.
[6]. M. Nixon, A. Aguando (2008): Feature Extraction & Image Processing,
2nd ed., London, Academic Press, pp. 115-136.
[7]. B.C. Ko, K.H. Chong, J.Y. Nam (2009): Fire Detection based on vision
sensor and support vector machines, Fire Safety Journal, vol. 44, pp. 322-329.
[8]. Q. Zhu, S. Avidan, M.-C. Yeh, K.-W. Cheng (2006): "Fast Human
Detection Using a Cascade of Histograms of Oriented Gradients",
Proceedings of the IEEE Computer Society Conference on Computer Vision
and Pattern Recognition, ISSN 1063-6919, Volume 2, pp. 1491-1498.
[9]. Fire and Gas Protection Systems Part 3 (2009): PETRONAS Fire and
Gas Protection Systems.
[10]. T. Chen, P. Wu, and Y. Chiou (2004): An early fire-detection method
based on image processing, in ICIP '04, pp. 1707-1710.
[11]. C.-B. Liu, N. Ahuja (2004): Vision based fire detection, Proceedings of
the 17th International Conference on Pattern Recognition (ICPR '04), Vol. 4,
pp. 134-137.
An Enhanced Threshold Based Technique for
WBC Nuclei Segmentation Arpita kulkarni1, Md Riyaz Ahmed2
1,2Electronics and Communication, RITM
Bangalore, India [email protected]
[email protected]
Abstract— Blood testing is the most important step aging adults
can take to prevent life-threatening disease. With blood test
results in hand, we can identify critical changes in the body
before they manifest as heart disease, cancer, diabetes, or worse.
A reliable and cost-effective method is therefore important for
uncovering abnormalities in a blood sample. Since the nucleus
of the white blood cell carries major information about
abnormalities in the blood cell, we present a simple method of
white blood cell nuclei segmentation which is robust in noise
removal and obtains the required information about the cell by
global feature extraction and Gabor convolution. This is simpler
and more cost effective than other methods.
Keywords—Blood test; segmentation; nuclei; leukocytes; red
blood cells
I. INTRODUCTION
Blood tests are a critical part of medical diagnosis. A
blood smear image used for laboratory testing measures
the concentration of white blood cells, red blood cells,
and platelets in the blood. The white blood cell count
helps in identifying hormone and enzyme levels in blood
cancer cell detection, in coronary artery disease to avoid
heart attacks, and more. The normal adult white blood
cell count ranges from 5,000 to 10,000 [1] [3]. WBCs
are cells of the immune system and are found throughout
the body, including the bone marrow and blood [9].
White blood cells are also called leukocytes. There are
two different types of leukocytes, depending on the
presence of granules: granulocytes and agranulocytes.
Granulocytes contain membrane-bound enzymes which
help in digestion; they comprise neutrophils, basophils
and eosinophils. Agranulocytes include lymphocytes,
monocytes, and macrophages [2]. The manual white
blood cell counting method is time consuming and prone
to mistakes, as it depends on the technician/expert.
Hence it is necessary to identify and analyze the major
blood cell abnormalities within a short period of time. The
methodologies involved in blood testing need to be simple,
reliable, fast and cost effective. Visual sample automation
is one such method. Researchers are keen on improving
the classical segmentation methods to get advanced
results. Automation involves five basic processes: image
acquisition, image pre-processing, image segmentation,
image post-processing and image analysis [4]. Image
acquisition involves selection of the image; image
pre-processing involves enhancing the image contrast and
other required features. Image segmentation is the most
important step and the one addressed in this work.
Post-processing is just the removal of some residual noise,
and finally the output image obtained is analysed. The
flow is shown in Figure 1. This work addresses the
segmentation process in the automation method. Here, the
algorithm proposed by Mostafa Mohammed and Behrouz
Far is modified to obtain qualitative results. The proposed
method avoids the dependence of the data on the green
component of the image, so more information about the
features of the cell is obtained. As of now we are using
sub-images of the blood smear containing a single WBC.
The paper is organized as follows. Section 2 presents the
literature review. Section 3 explains the proposed method.
Section 4 presents the results and discussion, and Section 5
draws the conclusions and outlines future work, followed
by the references.
Figure 1: Flow diagram for the visual automation method
II. LITERATURE REVIEW
Several methods have been proposed in the medical field
to improve the efficiency of the blood test for faster
detection of diseases. S. H. Rezatofighi et al. [5]
developed an automated method for white blood cell
nucleus segmentation based on Gram-Schmidt
orthogonalization; the orthogonality theory is used for
amplifying the desired colour vectors and weakening the
undesired ones. Farnoosh Sadeghian et al. [6] introduced
a method to separate the nucleus and the cytoplasm with
the help of two types of active contour models, parametric
and geometric, which worked only on grayscale images.
Scientific reports presented a minimum-model approach:
the detection and segmentation of the white blood cell
nuclei was performed with the help of virtual microscopy
images. It avoided the a priori information required in
model-based approaches and was able to work with
minimal prior information for WBC nuclei segmentation.
The authors introduced a contour value which helps in
measuring the rank and selecting the best image objects
[7]. A novel framework was proposed by Ja-Won Gim
et al. [8], which explained a method for white blood cell
(WBC) segmentation using a region merging scheme and
a GVF (gradient vector flow) snake. It described two schemes:
nuclei segmentation and cytoplasm segmentation. For
nuclei segmentation, they created a probability map
using probability density function estimated from
samples of WBC‘s nuclei and cropped the sub-images to
include nucleus by using the fact that nuclei have salient
colour against background and red blood cells. Then,
mean-shift clustering was performed for region
segmentation and merging rules were applied to merge
particle clusters into the nucleus. A hybrid approach was
used for cytoplasm segmentation, combining the spatial
characteristics of the cytoplasm with GVF snakes to
delineate the boundary of the region of interest. This
method showed a higher average accuracy of 67.7% vs.
37.6%. An adaptive threshold detector method was
proposed by Der-Chen Huang et al. [9]. The method
consists of two steps: the first step enhances the WBC
nuclei by combining two different colour spaces in a
blood smear image; then a thresholding-based adaptive
segmentation method is applied. The experimental results
showed that promising segmentation results can be
obtained even for different colour tones and sizes of
smear images. Dorini et al. [10] presented a simple
morphological operation which explores scale-space
properties and improves the segmentation accuracy.
Nucleus segmentation is done by creating a binary image
with the help of a thresholding method, and the gradient
is computed by the watershed transform method. A
granulometric function is used for cytoplasm
segmentation. Madhloom et al. [11] established an
automatic algorithm for detection and classification of
leukocytes. It had a major drawback in that its accuracy
was highly dependent on the contrast of the image. This
drawback was overcome by Mostafa et al. [4] by using
some regulations, such as requiring the nucleus minimum
segment size to be half of the RBC average size.
Chinwaraphat et al. [12] proposed a modified fuzzy
C-means clustering (FCM) method which segments the
cytoplasm and nuclei of the WBC. The FCM method
classifies the blood sample into white blood cell
nucleus, white blood cell cytoplasm, plasma and red
blood cells. FCM is modified to avoid false clustering
due to the unclear pixel similarity between the
cytoplasm and plasma regions. The output efficiently
extracts the nucleus and cytoplasm
regions compared to the normal FCM method. Byoung
Chul Ko et al. [13] proposed an image segmentation
method using stepwise merging rules based on
mean-shift clustering and boundary removal rules with a
GVF (gradient vector flow) snake. Nuclei segmentation
is done by calculating the probability density function
from the WBC nuclei. For cytoplasm segmentation,
boundary edges and noise edges are removed using
removal rules, while a GVF snake is forced to deform to
the cytoplasm boundary edges. Nicola Ritter et al. [14]
presented a blood cell segmentation algorithm for
images taken from peripheral blood smear slides.
III. PROPOSED ALGORITHM
The key task of image segmentation is to extract all the
WBCs from a complicated background and then
segment them into their morphological components,
such as the nuclei and cytoplasm [14]. Here we focus on
segmenting the most informative part of the WBC, the
nucleus, for further analysis.
In this research we modify the algorithm proposed by
Mostafa et al. [4] to improve its accuracy. The modified
algorithm avoids the over-staining of components that
leads to inexact segmentation. It removes noisy material
from the blood smear through an efficient edge
detection step, and smoothing is performed by a Gabor
filter, which gives an efficient output. In addition, the
method need not depend on the green component of the
data set: the RGB image is converted to grayscale for
further processing.
I. Proposed algorithm steps
The blood smear images are processed as follows:
1) Read the input image; if it is RGB, convert it to
grayscale.
2) Perform edge detection and morphological
operations on the image.
3) Remove noise by calculating the region
properties of the components in the image.
4) Obtain the valid region details of the image to
achieve robustness.
5) Resize the image by width normalization.
6) Extract the global features of the components
of the image.
7) Calculate the Gabor vector by Gabor
convolution.
8) Combine the Gabor vector and the global
feature vector to obtain the WBC count in the
given blood smear image.
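The noise-removal step (3) can be sketched as an area filter over connected components. The following is a minimal pure-Python illustration, not the paper's MATLAB code: the 4-connectivity flood fill and the min_area threshold are assumptions made for the sketch.

```python
from collections import deque

def remove_small_components(binary, min_area):
    """Keep only connected components (4-connectivity) whose pixel
    area is at least min_area; smaller components are treated as noise."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                # Flood-fill to collect one connected component.
                comp, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                # Retain the component only if its area clears the threshold.
                if len(comp) >= min_area:
                    for y, x in comp:
                        out[y][x] = 1
    return out

img = [[1, 1, 0, 0, 1],
       [1, 1, 0, 0, 0],
       [0, 0, 0, 1, 0]]
clean = remove_small_components(img, min_area=3)
```

Here the two single-pixel specks are discarded while the four-pixel component survives, mirroring how the threshold area retains only components above the chosen value.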
II. Details of the proposed Algorithm
The input image is checked: if it is not already
grayscale, it is converted from RGB to gray using the
appropriate MATLAB functions. The image is
preprocessed by several operations, namely edge
detection and a morphological operation. The properties
of the connected components in the image are
calculated, and an area threshold is set; each connected
component's area is compared against it and only
components above the threshold are retained. This
removes most of the noisy components from the image.
From this set of components we select the valid region
using a logical image, to get the details of the valid
region. The image is then resized to perform width
normalization of all the segments present in the data
set. The properties of this segmented image are
calculated to obtain the global feature vector. Gabor
convolution is performed on the image to obtain the
Gabor vector; the Gabor filter also smooths the image.
The global feature vector and the Gabor vector are then
concatenated to obtain the required set of data about the
WBCs in the given image data set.
Gabor convolution convolves each row of the image
with a 1-D log-Gabor filter,
G(w) = exp( -(log(w/w0))^2 / (2 (log(k/w0))^2) )
where w0 is the filter's centre frequency. To obtain
constant-shape-ratio filters, the term k/w0 must be held
constant as w0 varies.
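The transfer function above can be evaluated pointwise; the sketch below does so in pure Python. The numeric choices of w0 and k are illustrative only, not taken from the paper.

```python
import math

def log_gabor(w, w0, k):
    """Transfer function of a 1-D log-Gabor filter with centre
    frequency w0; the ratio k/w0 fixes the filter's shape (bandwidth)."""
    if w <= 0:
        # log-Gabor filters are defined to have zero response at DC.
        return 0.0
    num = -(math.log(w / w0)) ** 2
    den = 2 * (math.log(k / w0)) ** 2
    return math.exp(num / den)

# The response peaks (value 1) exactly at the centre frequency w0.
peak = log_gabor(0.25, w0=0.25, k=0.1)
off_peak = log_gabor(0.5, w0=0.25, k=0.1)
```

Because the Gaussian is taken on a log frequency axis, halving or doubling w relative to w0 attenuates the response symmetrically, which is what makes the k/w0 ratio a constant-shape parameter.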
The flow diagram of the proposed method is
shown in figure 2.
Figure 2: Workflow of proposed method
III. RESULTS AND DISCUSSION
In the experiment we used an RGB colour image,
which was converted to grayscale for processing; this
constitutes a qualitative analysis of the segmentation
method. The image data set is stored in the file
E:\emath\eMath_Arpita\Dataset. Though the paper has
some limitations, the method can be successfully
implemented at the algorithm level to get efficient
results. The input blood smear image is processed to
obtain binary image data, from which the valid region
details are extracted. The valid region is identified by
taking the cell with the maximum area, which is the
single WBC in the sub-image. We then collect all the
required information about the WBC, the output is
displayed, and the matrix of properties is obtained. The
initial input is a sub-image containing only a single
WBC, as proposed by Mostafa et al. [4]. We can obtain
information about all the physical features of the WBC
within the bounding box selected by the Gabor filter.
This work can also be extended to count the WBCs,
which helps in characterizing the cells when they are
large in number.
The input data is given as a colour or grayscale image.
Figure 3: Input image used for processing
The input image is a blood smear taken to the
laboratory for abnormality testing. This image is
initially in colour; sub-images of single WBCs are then
identified and used for further processing. The binary
image is an intermediate image obtained after
thresholding. Valid region details are obtained to
calculate the features of the required cell. The output
image displays the segmented white blood cell obtained
after processing.
Figure 5: Output image
CONCLUSION
We have presented a method for WBC segmentation
based on a threshold-based technique. Noise is removed
efficiently with the help of a Gabor filter and edge
detection methods. The process has been worked out as
a qualitative output, since we have taken sub-images of
the blood smear to determine the properties of a single
cell. The method can further be extended to count white
blood cells in a complicated environment, where the
data contains more than one WBC.
ACKNOWLEDGMENT
I would like to thank the head of the department,
S. S. Manvi, my guide, Md. Riyaz Ahmed, and our
coordinator, H. S. Aravind, for helping me carry out my
project; their support is always precious.
REFERENCES
[1] Siamak T. Nabili, MD, MPH and William C. Shiel Jr., MD, FACP,
FACR , ―Complete blood count,‖ URL:
http://www.medicinenet.com/complete_blood_count/article.htm#tocb.
[2] ―Laboratory blood tests,‖ Health hub from Cleveland
clinic.URL
:http://my.clevelandclinic.org/heart/services/tests/labtests/default.aspx.
[3] ―What are the types of White Blood Cells, ‖University
of Rochester Medical Centre, URL :
http://www.urmc.rochester.edu/encyclopedia/content.aspx/C
ontentTypeID=160&ContentID=35
[4] Mostafa Mohammed and Behrouz Far, ―An Enhanced
Threshold Based Technique for White Blood Cells
Nuclei Automatic Segmentation,‖2012 IEEE
14thInternational Conference on e-Health Networking,
Applications and Services (Healthcom).
[5] S. H. Rezatofighi, H. Soltanian Zadeh, R. Sharifian
and R.A. Zoroofi, ― A New Approach to White Blood
Cell Nucleus Segmentation based on Gram- Schmidt
Orthogonalization,‖ International Conference on
Digital Image Processing, Wayne State university
,2009.
[6] Farnoosh Sadeghian, Zainina Seman, Abdul Rahman
Ramli, Badrul Hisham Abdul Kahar, and M-Iqbal
Saripan, ―A Framework for White Blood Cell
Segmentation in Microscopic Blood Images Using
Digital Image Processing,‖ Shulin Li (ed.), Biological
Procedures Online, Volume 11, Number 1,2009.
[7] Stephan Wienert1, Daniel Heim, Kai Saeger, Albrecht
Stenzinger, Michael Beil, Peter Hufnagl,
Manfred Dietel, Carsten Denkert and Frederick
Klauschen1,‖Detection and Segmentation of Nuclei in
Virtual Microscopy Images: A Minimum Model
Approach,‖ Scientific Reports: Bioinformatics
Software Medical Research Imaging, 2012.
[8] Ja-Won Gima, Junoh Parka, Ji-Hyeon Leea, Byoung
Chul Koa and Jae-Yeal Nama, ―A novel framework
for white blood cell segmentation based on stepwise
rules and morphological features,‖Image Processing:
Machine Vision Applications IV, edited by David
Fofi,Philip R. Bingham, Proc. of SPIE-IS&T
Electronic Imaging, SPIE Vol. 7877, 78770H, 2011.
[9] Der Chen Huang, Kun-Ding Hung and Yung-Kuan
Chan,‖White Blood Cell Nucleus Segmentation Based
on Adaptive Threshold Detector,‖ IEEE Transactions
on Image Processing, Vol. 19, No. 12, pp. 3243-3254,
2010.
[10] Leyza Baldo Dorini ,Rodrigo Minetto, and Neucimar
Jeronimo Leite, ―White blood Cell Segmentation
using Morphological operators and Scale-space
Analysis,‖IEEE Computer Society Washington, DC,
USA, 2007.
[11] H.T. Madhloom, S.A. Kareem, H. Ariffin, A. A.
Zaidan, H.O. Alanazi and B. B. Zaidan , ―An
Automated White Blood Cell Nucleus Localization
and Segmentation using Image Arithmetic and
Automatic Threshold,‖ Journal of Applied Sciences,
vol.10, no.11, pp. 959-966,2010.
[12] S. Chinwaraphat, A. Sanpanich, C. Pintavirooj, M.
Sangworasil and P. Tosranon, ―A Modified Fuzzy
Clustering for White Blood Cell Segmentation,‖ The
An Automated Method of Enabling Portability of
Health Records for People with Restricted Mobility
Pushpa R Iyer & Tej Ganapathy K B1, Sreedhar Menon R & Devdipta Pal2, P Meena3
1 and 2 Final Year B.E. Electrical & Electronics, BMS College of Engineering, Bangalore, India
3Assoc. Prof., Department of Electrical and Electronics, BMS College of Engineering, Bangalore, India
Email: [email protected] , [email protected] , [email protected] ,
[email protected] , [email protected]
Abstract - Filling out initial paperwork and providing a
medical history to a new hospital or physician on every visit is
an inconvenience to any patient, and more so to one with
restricted mobility. This work proposes a centralized database,
maintained by the government or another central body, that
keeps up-to-date records of any patient who registers for the
service. The paper presents a successful attempt in this
direction by making the details portable, enabling access
through the internet by any authorized person or hospital.
Keywords – Portable Records, Up-to-date records, Centralized
database, Hospital, Restricted Mobility, Authorized person
I. INTRODUCTION
People with restricted mobility form a sizeable section of
the Indian population. Although most hospitals maintain a
record of each patient for subsequent visits, there is currently
no mechanism for sharing information between hospitals, so
that an incoming patient's records are available to every
hospital regardless of whether the patient has visited that
hospital before. This causes particular inconvenience to
patients with restricted mobility and in cases of emergency.
Centralized Access to Patient Records (CASPAR) overcomes
this shortfall by keeping medical records of any patient who
registers with CASPAR and providing authorized persons
access to individual patient records on demand through an
internet-based front-end. This makes visits to the care facility
less time-consuming, since the patient's medical records can
be accessed by any medical facility that uses CASPAR.
Our review of existing systems for patient information
portability is as follows. In [1], the authors proposed a
portable patient information integration system that combines
Radio Frequency Identification (RFID) technology, a wireless
network, Personal Digital Assistants (PDAs) and a
front-monitoring system. With the help of the proposed system,
medical workers can identify patients by contactless
identification and retrieve medical records immediately. The
system also records the history of interactions between
medical personnel and patients, sends alarms to the
corresponding medical workers when a high-risk test result is
reported, and gives medicine safety suggestions. It can thus
improve the correctness and immediacy of patients' medical
information and provide a safe environment for patients. [2]
presents a Point-of-Care Information System (PCIS) that uses
portable terminals and supports the entire loop of daily
nursing work for the first time. The system consists of personal
digital assistants (PDAs) that nurses carry in the hospital
wards and a server computer located in the nursing station.
It has three main functions: 1) data browsing, which
provides patient information such as a brief history, vital-sign
charts, and handwritten notes; 2) schedule planning, which
helps nurses organize doctors' instructions and make to-do
lists for the day; and 3) care management, which reminds
nurses when they should execute the doctors' orders and
provides tools for data entry. In [3], the Society for Promotion
of IT in Chandigarh proposed that government hospitals and
community health centres should have a centralized patient
database. According to the proposal, once a patient has visited
any health centre or hospital, his or her record will be
uploaded to the common database and be accessible from all
places.
The feature common to all the above projects is that patient
records are maintained only on a localized scale, in one medical
facility or for one city. The model proposed here allows
portability of information between all participating medical
facilities, spanning the nation or even the world. The major
requirement for this model to be successful is for patients to
volunteer their information to be stored in the CASPAR database.
II. SCHEMATIC OF CASPAR
In this section we explain the proposed model and its
functionalities. Fig.1 depicts the schematic.
The patient or the patient's authorized guardian is allowed to
access and update information in the CASPAR database
through an internet-based front-end. Authentication is
provided through a login feature where the patient chooses a
suitable password. The patient may choose to share this
password with his authorized guardian.
The front-end developed provides the patient accessing the
database with a user-friendly interface that allows for easy
navigation between the different pages created to enable
access to the database system. The patient is permitted to
change his contact information, insurance information and
records of his medical history.
The central database is intended to be maintained by a
central and trusted organization, which could be the
government. Limited access to this database can be purchased
by individual medical facilities, whose access privileges are
restricted by the database administrator in accordance with the
privacy requirements.
Fig.1 Schematic of CASPAR
III. STRUCTURE OF DATABASE
The database has been designed to hold all information
required by hospitals to make an admission. Fig.2 gives a
blueprint of the database.
To make the mechanism work, hospitals purchase access to
CASPAR on signing a privacy agreement.
The rationale for designing the database this way is:
1) Aadhaar Number: inclusion of this number in the
database facilitates efficient searching of patient records, since Aadhaar is a unique ID.
2) Emergency Contact Numbers and Address: included so
that the hospital knows exactly whom to contact in the
event of an emergency.
3) Insurance Details: information about the insurance
provider and insurance number that the hospital needs
in order to proceed in an emergency, after consent from
the guardian.
4) Patient's Past Records: enables doctors to access the
patient's medical history so that they can treat the
patient more effectively.
5) Patient or User ID: a computer-generated ID used at the
patient's end to log onto the database and view, review or update information.
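As an illustration only, the blueprint above could be realized as a single relational table. The sketch below uses Python's built-in sqlite3 module rather than the MySQL database the system actually uses, and the column names are assumptions read off Fig.2.

```python
import sqlite3

# In-memory database standing in for the central CASPAR store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        patient_id         INTEGER PRIMARY KEY AUTOINCREMENT, -- computer-generated user ID
        aadhaar_number     TEXT UNIQUE NOT NULL,              -- unique national ID, used for search
        name               TEXT NOT NULL,
        emergency_contact  TEXT,
        address            TEXT,
        insurance_provider TEXT,
        insurance_number   TEXT,
        past_records       TEXT,                              -- medical history
        password           TEXT NOT NULL                      -- stored unencrypted, per Section V
    )
""")
conn.execute(
    "INSERT INTO patient (aadhaar_number, name, password) VALUES (?, ?, ?)",
    ("1234-5678-9012", "A. Patient", "secret"),
)
# A hospital with purchased access looks a patient up by Aadhaar number.
row = conn.execute(
    "SELECT patient_id, name FROM patient WHERE aadhaar_number = ?",
    ("1234-5678-9012",),
).fetchone()
```

Making the Aadhaar column UNIQUE enforces point 1 above at the schema level: a lookup by Aadhaar number can return at most one record.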
Fig.2 Blueprint of the Database
IV. INTERFACE FOR PATIENT ACCESS
For the designed system to be tested with its features
operational, it is necessary that people register with CASPAR
and volunteer the requisite information. To make this process
simple and user-friendly, a front-end was designed; it is
explained in the following section.
A. Registration Page
The patient enters the details given in the blueprint in Fig.2,
along with his desired password. Some important fields in the
registration form have been made mandatory; the details are
as shown in Fig.3. On successful registration the patient is
redirected to a confirmation page, which also provides a
computer-generated unique patient identification number, as
shown in Fig.4.
Fig.3 Generated Registration page
Fig.4 Generated Success page with Patient ID
B. Login page
The patient, or anybody authorized by the patient, can
access the patient records to view, review or update the
information by logging on to CASPAR through this page. The
patient ID generated during registration is used as the user ID,
and the login completes when the correct password is
provided. The patient can thus update his records to ensure that
the information is current. This is shown in Fig.5.
Fig.5 Generated Login page
Fig.6 Generated Patient Details
Fig.7 Generated Update Records form
Fig.6 and Fig.7 show the display obtained on logging in
and the information that the system is designed to generate.
Fig.8 shows the display generated after the update is
completed.
Fig.8 Update Success Page
C. Database to store patient information
A view of the MySQL database generated to store all the
patient information is shown in Fig.9, along with some sample
records.
Fig.9 Records on the Database
V. SCOPE
Although we have provided an authentication feature through
the user ID and password required to view patient details,
we have not yet provided for encryption of the password
stored in the database. This feature may be added in future to
improve the security of patient information.
VI. CONCLUSION
Through this paper, we have demonstrated the use and
importance of a centralized database (CASPAR) for patient
information portability. It gives the patient the flexibility to
visit the nearest hospital, or any desirable one, irrespective of
prior visits, thanks to information transferability through
CASPAR, as long as the hospitals are registered users of the
database. The generation of a unique user ID ensures the
authenticity of the user, enabling the user to access, confirm
or modify the information stored in CASPAR.
REFERENCES
[1] Chu, Jin, Chiang and Kao, in Frontiers of High Performance Computing
and Networking, Proceedings of the international conference ISPA'07,
Lecture Notes in Computer Science, Vol. 4743, 2007, pp. 87-95.
[2] Sasaki H, Sukeda H, Matsuo H, Oka Y, Kaneko M and Sasaki,
"Point-of-care information systems with portable terminals," Medinfo,
9 Pt 2, pp. 990-994, 1998.
[3] Society for Promotion of IT, Chandigarh, India; Khushboo Sandhu,
Indian Express, Chandigarh, 2011.
[4] www.ieee.org
[5] Joseph M. Hellerstein, Michael Stonebraker and James Hamilton,
"Architecture of a Database System," 2007.
[6] Raghu Ramakrishnan, Database Management Systems, McGraw Hill, 3rd
Edition, 2002.
[7] Carlo Zaniolo, Stefano Ceri, Christos Faloutsos, Richard T. Snodgrass,
V. S. Subrahmanian, and Roberto Zicari, Advanced Database Systems,
Morgan Kaufmann, 1997.
Adaptive Techniques for Cancellation of Noisy
Signal
Abstract- In the transmission of information from source to
receiver, noise from the surroundings is inevitably added to the
signal. In numerous application areas, including biomedical
engineering, radar and sonar engineering, and digital
communication, the objective is to extract the useful signal
from the noise-corrupted signal. The adaptive filter is one of
the most popular proposed solutions for reducing the signal
corruption caused by unpredictable noise. The proposed work
compares the performance of MATLAB simulations of an
adaptive filter for the Least Mean Squares (LMS) and
Normalized Least Mean Squares (NLMS) algorithms. The input
to the filter is a noisy signal, and the output of the filter
approximates the clean signal. The designed filter is tested
using MATLAB, and the performance is analyzed on the basis of
Signal-to-Noise Ratio (SNR), Percentage of Noise Removal
(PNR) and stability. The adaptive noise canceller is also
implemented in hardware on a TMS320C6713 DSP processor.
Key-words: Adaptive filter, Least Mean Squares (LMS),
Normalized Least Mean Squares (NLMS)
I. INTRODUCTION
All physical systems, when they operate, produce physical
noise, which is a mixture of an infinite number of sinusoidal
harmonics at different frequencies. The original signal
information is therefore corrupted with this noise. The
resulting signal may be so noisy that the human ear, or
whatever system follows, cannot recover the correct original
signal.
An algorithm is therefore needed that can separate the
physical noise from the information signal and output the
information signal without noise. Perfect separation is not
possible, since no such system exists, so the algorithm should
instead reduce the noise level as much as it can.
An adaptive filter [1][2] has the property of self-modifying its
frequency response to change its behaviour with time; this
allows the filter to adjust its response as the input signal
characteristics change. Adaptive filters work on the principle
of minimizing the mean squared error between the filter
output and a target (or desired) signal. The general adaptive
filter configuration is illustrated in Fig.1.
Figure 1: Block diagram of adaptive filter
The adaptive filter has two inputs: the primary input d(n),
which represents the desired signal corrupted with undesired
noise, and the reference signal x(n), which carries the
undesired noise to be filtered out of the system. The basic idea
of the adaptive filter is to predict the amount of noise in the
primary signal and then subtract that noise from it. The
prediction is based on the reference signal x(n), which
provides a solid reference for the noise present in the primary
signal. The noise in the reference signal is filtered to
compensate for amplitude, phase and time delay, and then
subtracted from the primary signal. This filtered noise, y(n), is
the system's prediction of the noise portion of the primary
signal. The resulting signal is called the error signal e(n), and
it constitutes the output of the system. Ideally, the error signal
would be only the desired portion of the primary signal.
In this paper we have implemented the adaptive filters in
software to compare their relative performance, using
MATLAB for simulation. The
resulting simulation outputs are compared in terms of SNR,
PNR and stability for a noisy input signal.
II. ADAPTIVE ALGORITHMS
The algorithm used to perform the adaptation and the
configuration of the filter depend directly on the application of
the filter. Two classes of adaptive filtering algorithms, namely
Least Mean Squares (LMS) and Recursive Least Squares
(RLS), are capable of adapting the filter coefficients. The
LMS-based algorithms are simple to understand and easy to
implement, whereas the RLS-based algorithms are complex
and require considerably more memory to implement. In this
work we have therefore focused on LMS-based algorithms.
A. Least Mean Square Algorithm
The LMS algorithm [3] belongs to the class of adaptive filters
known as stochastic gradient-based algorithms, as it utilizes
the gradient vector of the filter tap weights to converge on the
optimal Wiener solution. With each iteration of the LMS
algorithm, the filter tap weights of the adaptive filter are
updated according to the formula

w(n+1) = w(n) + 2μ e(n) x(n)    (1)

where e(n) = d(n) - y(n) is the error between the desired
signal and the filter output

y(n) = w^T(n) x(n)    (2)

Here x(n) = [x(n) x(n-1) ... x(n-N+1)]^T is the input vector of
time-delayed input values, and the vector
w(n) = [w0(n) w1(n) w2(n) ... wN-1(n)]^T
represents the coefficients of the adaptive filter tap weight
vector at time n. The parameter μ is known as the step size
parameter and is a small positive constant. This step size
parameter controls the influence of the updating factor.
Selection of a suitable value for μ is imperative to the
performance of the LMS algorithm: if the value is too small,
the time taken by the adaptive filter to converge on the
optimal solution will be too long; if μ is too large, the adaptive
filter becomes unstable and its output diverges.
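The iteration described above can be sketched in a few lines. The following pure-Python illustration (the paper's own implementation is in MATLAB) applies the LMS tap-weight update to cancel a white-noise reference from a noisy sine wave; the filter order, step size and signal parameters are illustrative choices, not the paper's.

```python
import math
import random

def lms_filter(d, x, n_taps=4, mu=0.01):
    """Adaptive LMS noise canceller: d is the noisy primary signal,
    x the noise reference.  Returns the error signal e, which ideally
    converges to the clean portion of d."""
    w = [0.0] * n_taps                  # tap weight vector w(n)
    e = []
    for n in range(len(d)):
        # Time-delayed input vector [x(n) x(n-1) ... x(n-N+1)]
        xv = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(wk * xk for wk, xk in zip(w, xv))   # filter output y(n)
        err = d[n] - y                              # e(n) = d(n) - y(n)
        # LMS tap-weight update: w <- w + 2*mu*e(n)*x(n)
        w = [wk + 2 * mu * err * xk for wk, xk in zip(w, xv)]
        e.append(err)
    return e

random.seed(0)
s = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]  # clean signal
x = [random.gauss(0.0, 0.5) for _ in range(2000)]            # noise reference
d = [sn + xn for sn, xn in zip(s, x)]   # primary input = signal + noise
e = lms_filter(d, x)
```

Since the primary noise here equals the reference exactly, the filter should learn an identity-like weight vector, and the tail of e should closely track the clean sine wave.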
B. Normalized LMS Algorithm
In the standard LMS algorithm, when the convergence factor
μ is large the algorithm suffers from a gradient noise
amplification problem. To overcome this difficulty we can
use the NLMS (Normalized Least Mean Square) algorithm [3].
The correction applied to the weight vector w(n) at iteration
n+1 is "normalized" with respect to the squared Euclidean
norm of the input vector x(n) at iteration n. We may view the
NLMS algorithm as a time-varying step-size algorithm, with
the convergence factor calculated as in Eq. (3) [1]:

μ(n) = α / (c + ||x(n)||²)    (3)

where α is the NLMS adaptation constant, which optimizes
the convergence rate of the algorithm and should satisfy the
condition 0 < α < 2, and c is a small constant term for
normalization, always less than 1. The filter weights are then
updated by Eq. (4):

w(n+1) = w(n) + μ(n) e(n) x(n)    (4)
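A single NLMS iteration, combining the normalized step size of Eq. (3) with the weight update of Eq. (4), can be sketched as follows. This is pure Python with illustrative values of α and c, and the toy 2-tap system being identified is an assumption made for the demo.

```python
def nlms_step(w, xv, d_n, alpha=0.5, c=1e-3):
    """One NLMS iteration: the step size is normalized by the squared
    Euclidean norm of the current input vector xv (Eq. 3), and the
    weights are updated with that time-varying step (Eq. 4)."""
    y = sum(wk * xk for wk, xk in zip(w, xv))        # filter output
    err = d_n - y                                    # error e(n)
    mu_n = alpha / (c + sum(xk * xk for xk in xv))   # Eq. (3)
    w_new = [wk + mu_n * err * xk for wk, xk in zip(w, xv)]  # Eq. (4)
    return w_new, err

# Identify a fixed 2-tap system h = [1.0, -0.5] from input/target pairs.
w = [0.0, 0.0]
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -0.5), ([1.0, 1.0], 0.5)] * 200
for xv, d_n in data:
    w, err = nlms_step(w, xv, d_n)
```

Because the step size shrinks when the input vector is large, the update is insensitive to the input power, which is exactly what suppresses the gradient noise amplification seen with plain LMS.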
III. ADAPTIVE NOISE CANCELLATION
Adaptive noise cancellation (ANC) [4][5] is performed by
subtracting noise from a received signal, with the operation
controlled adaptively so as to obtain an improved
signal-to-noise ratio. Uncontrolled noise subtraction from a
received signal could produce disastrous results by increasing
the average power of the output noise. However, when the
filtering and subtraction are controlled by an adaptive process,
it is possible to achieve system performance superior to direct
filtering of the received signal. Fig.2 shows the adaptive noise
cancelling system.
Fig.2. Adaptive Noise Cancellation System
The system is composed of two separate inputs: a primary
input or ECG signal source, shown as s(n), and a reference
input carrying the noise, shown as x(n). The primary signal is
corrupted by noise x1(n), which is highly correlated with the
reference noise signal x(n). The desired signal d(n) results
from the addition of the primary signal s(n) and the correlated
noise signal x1(n). The reference signal x(n) is fed into the
adaptive filter, and its output y(n) is subtracted from the
desired signal d(n). The output of the summer block is then
fed back to the adaptive filter to update the filter coefficients.
This process runs recursively to obtain a noise-free signal,
which should be the same as, or very similar to, the primary
signal s(n).
IV. EXPERIMENTAL SETUP
MATLAB Simulation
For the simulation of the adaptive algorithms, a MATLAB
program was written which implements the mathematical
equations of the LMS
and NLMS algorithms as given in Eq. 1 to Eq. 4. The
reference input signal x(n) is white Gaussian noise generated
using the WGN function in MATLAB, and the source signal
s(n) is a clean sine wave generated using the sin function; the
desired signal d(n) is obtained by adding the noise to the clean
signal, i.e. d(n) = s(n) + x1(n), as shown in Fig.3. We can
observe that the output signal is noise-free and closely
matches the input. The FFT of the mixed signal shows where
noise is present, and the FFT of the LMS output shows that the
noise is cancelled, as in Fig.4; it also shows that a small
amount of noise remains, due to the gradient noise
amplification problem. Fig.5 shows the FFT of the NLMS
output, where this error is minimized by normalizing the
correction applied to the weight vector w(n).
Fig 3. Input signal, noise signal, input + noise
Fig 4. LMS output, FFT of mixed signal, FFT of output
Fig 5. NLMS output, FFT of mixed signal, FFT of output
V. SIMULINK MODEL
In order to estimate the parameter values for each algorithm
discussed in the previous sections, MATLAB Simulink
models are used. The following figures depict the Simulink
implementation of each algorithm and the estimation of its
parameter values from the corresponding graphs.
A. LMS ALGORITHM
The Simulink model for LMS with its statistical parameters
is shown in Fig 6, the speech signal corrupted with white
Gaussian noise is shown in Fig 7, the output of the LMS
algorithm is shown in Fig 8, and the simulated power spectral
analysis is shown in Fig 9.
Fig 6. Simulink block diagram for LMS algorithm
Fig 7. Input corrupted with white Gaussian noise
Fig 8. Output of LMS algorithm
Fig 9.Power spectral analysis of LMS algorithm
B. N-LMS ALGORITHM
The Simulink model for NLMS with its statistical parameters
is shown in Fig 10; the speech signal corrupted with white
Gaussian noise is the same as in Fig 7. The NLMS output is
shown in Fig 11, and the simulated power spectral analysis is
shown in Fig 12.
Fig 10. Simulink block diagram for NLMS algorithm
Fig 11. Output of NLMS algorithm
Fig 12. Power spectral analysis of NLMS algorithm
VI. RESULTS & DISCUSSION
A. Simulation
The simulation of the LMS and NLMS algorithms is carried
out with the following specifications: filter order N = 2, step
size μ = 0.005, iterations n = 1200. The LMS and NLMS
algorithms are compared in terms of Signal-to-Noise Ratio
(SNR), Percentage of Noise Removal (PNR) and stability, as
shown in Table I.

Table I: Comparison of the performance of the LMS and NLMS algorithms
SL.NO  Algorithm  SNR      PNR     Stability
1.     LMS        11.0429  0.9145  2401
2.     NLMS       25.6713  0.5429  6001

From Table I it is clear that the performance of the NLMS
algorithm is better than that of the LMS algorithm.
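The SNR figures above follow the usual decibel definition. A small sketch of that computation is shown below on made-up sample vectors; the paper does not give the exact formula for its PNR metric, so only SNR is illustrated.

```python
import math

def snr_db(clean, denoised):
    """SNR of a denoised signal in dB: ratio of the clean-signal
    power to the residual-error power, on a decibel scale."""
    n = len(clean)
    p_sig = sum(v * v for v in clean) / n
    p_err = sum((c - d) ** 2 for c, d in zip(clean, denoised)) / n
    return 10 * math.log10(p_sig / p_err)

# Hypothetical clean signal and filter output, for illustration only.
clean = [1.0, -1.0, 1.0, -1.0]
denoised = [0.9, -1.1, 1.0, -0.9]
snr = snr_db(clean, denoised)
```

A higher value means the residual after cancellation is smaller relative to the signal, which is the sense in which NLMS outperforms LMS in Table I.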
B. HARDWARE IMPLEMENTATION
Hardware implementation of noise cancellation [6][7] is
performed by writing C code for adaptive noise cancellation
in Code Composer Studio V3.1 (CCS V3.1), building it, and
loading it onto the TMS320C6713 DSP processor. The input
is a noisy sine wave obtained by adding a sine wave and
high-frequency noise using an OPAMP 741 IC. The noisy
input and the noise are fed to the left and right channels of the
line-in port of the DSP respectively, and a noise-free output
signal is obtained at the line-out port, as shown in Fig 13.
VII. CONCLUSIONS
The adaptive algorithms (LMS and NLMS) have been
successfully implemented in MATLAB for a noisy signal, and
the results are compared in terms of SNR, PNR and stability.
The comparison shows that the performance of NLMS is
better than that of LMS. The adaptive noise canceller has also
been implemented on a TMS320C6713 DSP processor. To
sum up, the adaptive noise canceller is a very efficient and
useful system in many applications involving sound, video,
etc. Its only disadvantage is that it needs a digital signal
processor (DSP) for its operation.
Fig 13. Setup for hardware implementation for noise cancellation
REFERENCES
[1] S. Haykin, Adaptive Filter Theory, Prentice Hall, 2002.
[2] Bernard Widrow, John R. Glover, John M. McCool, John Kaunitz,
Charles S. Williams, Robert H. Hearn, James R. Zeidler, Eugene
Dong, Jr. and Robert C. Goodlin, "Adaptive Noise Cancelling:
Principles and Applications," Proceedings of the IEEE, Vol. 63,
No. 12, 1975, pp. 1692-1716.
[3] D. C. Dhubkarya and Aastha Katara, "Comparative Performance
Analysis of Adaptive Algorithms for Simulation & Hardware
Implementation of an ECG Signal," Volume 1, Number 4,
pp. 2184-2191.
[4] Yaghoub Mollaei, "Hardware Implementation of Adaptive Filter,"
Proceedings of the IEEE, Nov 2009.
[5] Raj Kumar Thenua and S. K. Agrawal, "Hardware Implementation
of Adaptive Algorithms for Noise Cancellation," IACSIT, Vol. 2,
No. 2, March 2012.
[6] Nirmal R. Bhalani, Jaikaran Singh and Mukesh Tiwari,
"Implementation of Karaoke Machine on the DSK6713 Processor,"
Volume 62, No. 7, January 2013.
[7] Vijay Kumar Gupta, Mahesh Chandra and S. N. Sharan, "Real Time
Implementation of Adaptive Noise Canceller," Paper Identification
Number CS-1.6.