MEMS Based MicroResonator Design & Simulation Based On Comb-Drive Structure
Mr. Prashant Gupta, [email protected], Ideal Institute of Technology, Ghaziabad

Abstract:- Resonators serve as essential components in Radio-Frequency (RF) electronics, forming the backbone of filters and tuned amplifiers. However, traditional solid-state or mechanical implementations of resonators and filters tend to be bulky and power-hungry, limiting the versatility of communications, guidance, and avionics systems. Micro-Electro-Mechanical Systems (MEMS) are promising replacements for traditional RF circuit components. In this paper we discuss the MEMS resonator, one of the most versatile components in RF circuits, based on a promising architecture known as the comb-drive structure.

Introduction: A resonator is a device or system that exhibits resonance or resonant behavior; that is, it naturally oscillates at some frequencies, called its resonant frequencies, with greater amplitude than at others. The oscillations in a resonator can be either electromagnetic or mechanical (including acoustic). Resonators are used either to generate waves of specific frequencies or to select specific frequencies from a signal. A physical system can have as many resonant frequencies as it has degrees of freedom; each degree of freedom can vibrate as a harmonic oscillator. Systems with one degree of freedom, such as a mass on a spring, pendulums, balance wheels, and LC tuned circuits, have one resonant frequency. Systems with two degrees of freedom, such as coupled pendulums and resonant transformers, can have two resonant frequencies. The vibrations in them begin to travel through the coupled harmonic oscillators in waves, from one oscillator to the next. Resonators can be viewed as being made of millions of coupled moving parts (such as atoms). Therefore they can have millions of resonant frequencies, although only a few may be used in practical resonators.
The vibrations inside them travel as waves, at an approximately constant velocity, bouncing back and forth between the sides of the resonator. The oppositely moving waves interfere with each other to create a pattern of standing waves in the resonator. If the distance between the sides is L, the length of a round trip is 2L. In order to cause resonance, the phase of a sinusoidal wave after a round trip has to be equal to the initial phase, so the waves reinforce. The condition for resonance in a resonator is therefore that the round-trip distance, 2L, be equal to an integral number N of wavelengths λ of the wave:

2L = N·λ,   N = 1, 2, 3, ...

If the velocity of the wave is v, its frequency is f = v/λ, so the resonance frequencies are:

f_N = N·v / (2L)

So the resonant frequencies of resonators, called normal modes, are equally spaced multiples (harmonics) of a lowest frequency called the fundamental frequency. The above analysis assumes the medium inside the resonator is homogeneous, so the waves travel at a constant speed, and that the shape of the resonator is rectilinear. If the resonator is inhomogeneous or has a non-rectilinear shape, like a circular drumhead or a cylindrical microwave cavity, the resonant frequencies may not occur at equally spaced multiples of the fundamental frequency; they are then called overtones instead of harmonics. There may be several such series of resonant frequencies in a single resonator, corresponding to different modes of vibration.
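The harmonic series described above can be illustrated numerically. A minimal sketch of f_N = N·v/(2L); the wave speed and cavity length below are illustrative values (sound in air, a 0.5 m cavity), not taken from the paper.

```python
# Resonance frequencies of an idealized 1-D resonator: f_N = N * v / (2 * L).
def resonant_frequencies(v, L, n_modes):
    """Wave speed v (m/s), resonator length L (m); returns the first n_modes
    resonance frequencies (Hz), assuming a homogeneous rectilinear resonator."""
    return [n * v / (2.0 * L) for n in range(1, n_modes + 1)]

# Illustrative values: speed of sound in air (~343 m/s), a 0.5 m cavity.
modes = resonant_frequencies(v=343.0, L=0.5, n_modes=4)
# The normal modes are equally spaced multiples of the fundamental (343 Hz here).
```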
MEMS Resonators:- Mechanical resonators are highly sensitive probes for physical or chemical parameters which alter their potential or kinetic energy [1,2]. Silicon resonant microsensors for measurement of pressure, acceleration, and vapor concentration have been demonstrated. Recently, polysilicon micromechanical structures have been resonated electrostatically parallel to the plane of the substrate by means of one or more interdigitated capacitors (electrostatic combs). Some advantages of this approach are (1) less damping on the structure, leading to higher quality factors, (2) linearity of the electrostatic-comb drive, and (3) flexibility in the design of the suspension for the resonator. For example, folded-beam suspensions can be fabricated without increased process complexity, which is attractive for releasing residual strain and for achieving large-amplitude vibrations.

There are different types of resonator; here we focus only on vibrating resonators:
• Lateral movement, parallel to the substrate, e.g. the folded-beam comb structure
• Vertical movement, perpendicular to the substrate, e.g. the clamped-clamped beam (c-c beam) and the free-free beam (f-f beam)

Example of a simple resonator: mass and spring. This resonator is used by many physicists as the elemental simple mechanical resonator, to explain the properties of more complex resonances and resonators. For vertical displacement y from the equilibrium position, mass m, spring constant k = f/y, and damping coefficient R, the governing homogeneous differential equation is

m·(d²y/dt²) + R·(dy/dt) + k·y = 0

and the angular resonant frequency is given by

ω₀ = √( k/m − (R/2m)² )

which reduces to √(k/m) for light damping.
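A minimal numeric sketch of this mass-spring resonator: the undamped angular resonant frequency is √(k/m), reduced slightly by the damping coefficient R. The spring constant and mass values below are illustrative only.

```python
import math

# Mass-spring resonator: m*y'' + R*y' + k*y = 0.
# Undamped angular resonant frequency: w0 = sqrt(k/m); with damping R the
# oscillation frequency drops to sqrt(k/m - (R/(2m))**2).
def angular_resonant_frequency(k, m, R=0.0):
    """k: spring constant (N/m), m: mass (kg), R: damping coefficient (kg/s)."""
    return math.sqrt(k / m - (R / (2.0 * m)) ** 2)

# Illustrative values: a 1 N/m spring and a 1 microgram mass.
w0 = angular_resonant_frequency(k=1.0, m=1e-9)
f0 = w0 / (2.0 * math.pi)  # cyclic resonant frequency in Hz
```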
Folded-Flexure Comb-Drive Microresonator:- In the design of a resonator, the spring constant plays a vital role. Different types of spring designs have been applied in comb-drive actuators: (1) clamped-clamped beams, (2) the crab-leg flexure, and (3) the folded-beam flexure. Among these, the folded-beam structure is the most widely used for microresonator design. The folded-flexure electrostatic comb-drive micromechanical resonator shown in Figure 1 was first introduced by Tang [4,5,6]. This device has been well researched and is commonly used for MEMS process characterization. The microresonator consists of a movable central shuttle mass which is suspended by folded-flexure springs on either side. The other ends of the folded-flexure springs are fixed to the lower layer. The microresonator can be thought of as a spring-mass-damper system, the damping being provided by the air below and above the movable part. By applying a voltage across the fixed and movable comb fingers, an electrostatic force is produced which sets the mass into motion in the x-direction. The microresonator has been used in building filters, oscillators, and resonant positioning systems. Figure 1 shows the overhead view of a µresonator which utilizes interdigitated-comb finger transduction in a typical bias and excitation configuration. The resonator consists of a finger-supporting shuttle mass suspended above the substrate by folded flexures, which are anchored to the substrate at two central points. The shuttle mass is free to move in the direction
indicated, parallel to the plane of the silicon substrate. Folding the suspending beams as shown provides two main advantages: first, post-fabrication residual stress is relieved if all beams expand or contract by the same amount; and second, spring stiffening nonlinearity in the suspension is reduced, since the folding truss is free to move in a direction perpendicular to the resonator motion. The black areas are the places where the polysilicon structure is anchored to the bottom layer.
Fig. 1: Layout of the lateral folded-flexure comb-drive microresonator

Modeling the Oscillation Modes of the Microresonator:- The preferred direction of motion of the microresonator is the x-direction. However, the microresonator structure can vibrate in other modes: the three translational modes along x, y and z, the three rotational modes about x, y and z, and oscillation modes due to the movement of the folded-flexure beams and the comb drive. Each oscillation mode is described by a lumped second-order equation of motion. For any generalized displacement ζ, we can write:

m_ζ·(d²ζ/dt²) + B_ζ·(dζ/dt) + k_ζ·ζ = F_e,ζ
where F_e,ζ is the external force (in the x-mode this force is generated by the comb drives), m_ζ is the effective mass, B_ζ is the damping coefficient, and k_ζ is the spring constant. The fundamental frequency of the structure can be obtained from Rayleigh's quotient. The fundamental resonance frequency of this mechanical resonator is, again, determined largely by material properties and by geometry, and is given by the expression

f_r = (1/2π) · √[ 2Eh(W/L)³ / ( M_p + (1/4)M_t + (12/35)M_b ) ]
where E is the Young's modulus of the structural material, M_p is the shuttle mass, M_t is the mass of the folding trusses, M_b is the total mass of the suspending beams, W and h are the cross-sectional width and thickness, respectively, of the suspending beams, and L is indicated in Fig. 1. The expression for the damping coefficient is
B ≈ µ·( A_s + A_t/2 + A_b/2 )·( 1/d + 1/δ ) + µ·A_c/g

where µ is the viscosity of air, d is the fixed spacer gap between the ground plane and the bottom surface of the comb fingers, δ is the penetration depth of airflow above the structure, g is the gap between comb fingers, and A_s, A_t, A_b, and A_c are layout areas for the shuttle, truss beams, flexure beams, and comb finger sidewalls, respectively.
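The Rayleigh-quotient frequency estimate for the folded-flexure resonator can be evaluated numerically. A minimal sketch, assuming the form f_r = (1/2π)·√(2Eh(W/L)³ / (M_p + M_t/4 + (12/35)M_b)) with the truss and beam mass weightings as in Tang's analysis [4]; every input value below is illustrative, not a design value from this paper.

```python
import math

# Rayleigh-quotient estimate of the fundamental frequency of the folded-flexure
# resonator: f_r = (1/2pi) * sqrt( 2*E*h*(W/L)**3 / (Mp + Mt/4 + (12/35)*Mb) ).
# The 1/4 and 12/35 mass weightings follow Tang [4]; inputs are illustrative.
def fundamental_frequency(E, h, W, L, Mp, Mt, Mb):
    k_sys = 2.0 * E * h * (W / L) ** 3           # system spring constant, N/m
    m_eff = Mp + Mt / 4.0 + (12.0 / 35.0) * Mb   # effective mass, kg
    return math.sqrt(k_sys / m_eff) / (2.0 * math.pi)

# Illustrative polysilicon geometry: E ~ 160 GPa, 2 um thick and 2 um wide beams,
# 100 um long, with guessed (not the paper's) masses.
f_r = fundamental_frequency(E=160e9, h=2e-6, W=2e-6, L=100e-6,
                            Mp=5e-11, Mt=1e-11, Mb=1e-11)  # tens of kHz
```

Note the cubic dependence on W/L: halving the beam length L raises f_r by a factor of about 2.8 for a fixed effective mass.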
Working Principle:- To bias and excite the device, a dc-bias voltage VP is applied to the resonator and its underlying ground plane, while an ac excitation voltage is applied to one (or more) drive electrodes. A specific resonance mode may be emphasized by using multiple drive electrodes, placing them at the displacement maxima of the desired mode, and applying properly phased drive signals to the electrodes. To avoid unnecessary notational complexity, however, we focus on the case of fundamental-mode resonance in the present discussion. We also assume that the electrodes are concentrated at the center of the beam and that the beam length is much greater than the electrode lengths. This allows us to neglect beam displacement variations across the lengths of the electrodes due to the beam's mode shape (i.e., we may assume that x(y) ~ x for y near the center of the beam). A more rigorous analysis which accounts for all of these effects is certainly possible, but obscures the main points. When an ac excitation with frequency close to the fundamental resonance frequency of the µresonator is applied, the µresonator begins to oscillate, creating a time-varying capacitance between the µresonator and the electrodes. Since the dc bias VPn = VP − Vn is effectively applied across the time-varying capacitance at port n, a motional output current arises at port n. For this resonator design, the transducer capacitors consist of overlap capacitance between the interdigitated shuttle and electrode fingers. As the shuttle moves, these capacitors vary linearly with displacement. Thus, ∂Cn/∂x is a constant, given approximately by the expression
∂Cn/∂x ≈ α·Ng·ε₀·h/d

where Ng is the number of finger gaps, h is the film thickness, d is the gap between electrode and resonator fingers, and ε₀ is the permittivity of free space. α is a constant that models additional capacitance due to fringing electric fields; for comb geometries, α = 1.2. Note that, again, ∂Cn/∂x is inversely proportional to the gap distance.

Linear equations for the spring constants are derived using energy methods. A force (or moment) is applied to the free end(s) of the spring in the direction of interest, and the displacement is calculated symbolically (as a function of the design variables and the applied force). In these calculations, different boundary conditions are applied for the different modes of deformation of the spring. When forces (moments) are applied at the end-points of the flexure, the total energy of deformation, U, is calculated as:

U = Σ_i ∫₀^{Li} [ M_i²/(2E·I_i) + T_i²/(2G·J_i) ] dξ
where L_i is the length of the i-th beam in the flexure, M_i is the bending moment transmitted through beam i, E is the Young's modulus of the material of the beam (polysilicon, in our case), I_i is the moment of inertia of beam i about the relevant axis, T_i is the torsion transmitted through beam i, G is the shear modulus, J_i is the torsion constant of beam i, and ξ is the variable along the length of the beam. The bending moment and the torsion are linear functions of the forces and moments applied to the end-points of the flexure. The displacement of an end-point of the flexure in any direction ζ is given (by Castigliano's theorem) as:

ζ = ∂U/∂F_ζ
where F_ζ is the force applied in that direction at that end-point. Similarly, angular displacements can be related to applied moments. Our aim here is to obtain the displacement in the direction of interest as a function of the applied force in that direction. Applying the boundary conditions, we obtain a set of linear
equations in terms of the applied forces and moments and the unknown displacement. Solving the set of equations yields a linear relationship between the displacement and applied force in the direction of interest. The constant of proportionality gives the spring constant as a function of the physical dimensions of the flexure. The effect of spring mass on the resonance frequency is incorporated into an effective mass for each lateral mode. The effective mass for each mode of interest is calculated by normalizing the total maximum kinetic energy of the spring by the maximum shuttle velocity, V_max:

m_eff = Σ_i (m_i/L_i) ∫₀^{Li} [ v_i(ξ)/V_max ]² dξ
where m_i and L_i are the mass and length of the i-th beam in the flexure. Analytic expressions for the velocities v_i along the flexure's beams are approximated from static deformation shapes, and are found from the spring-constant derivations.

Design Variables:- Fifteen design variables are identified for the µresonator. They are listed in Table I and shown in Fig. 2. These include 13 geometrical parameters (shown in Fig. 2), the number of fingers in the comb drive, N, and the effective voltage, V, applied to the comb drive.
Fig. 2: Dimensions of the microresonator elements. (a) shuttle mass, (b) folded flexure, (c) comb drive with N movable 'rotor' fingers, (d) close-up view of comb fingers.
The displacement as a function of the driving voltage was measured while applying a dc voltage between the rotor (the movable set of fingers) and a stator (a stationary set).
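The comb transduction described above can be sketched numerically. This is a minimal sketch assuming the constant ∂Cn/∂x ≈ α·Ng·ε₀·h/d from the text, together with the standard comb-drive relations for the electrostatic force, F = ½V²·(∂C/∂x), and the motional output current, i = VP·(∂C/∂x)·(dx/dt); all numeric values are illustrative, not the paper's design values.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m

# Comb transducer sensitivity (per the text): dC/dx ~= alpha * Ng * eps0 * h / d,
# with alpha ~= 1.2 accounting for fringing fields. All numbers are illustrative.
def dC_dx(Ng, h, d, alpha=1.2):
    """Ng: number of finger gaps, h: film thickness (m), d: finger gap (m)."""
    return alpha * Ng * EPS0 * h / d

def drive_force(V, dCdx):
    """Electrostatic comb-drive force for drive voltage V: F = 0.5 * V**2 * dC/dx."""
    return 0.5 * V * V * dCdx

def motional_current(V_P, dCdx, velocity):
    """Output current of the dc-biased time-varying capacitance: i = V_P * dC/dx * dx/dt."""
    return V_P * dCdx * velocity

sens = dC_dx(Ng=40, h=2e-6, d=2e-6)                # ~0.42 fF per micron of travel
F = drive_force(V=10.0, dCdx=sens)                 # drive force at 10 V
i_out = motional_current(V_P=20.0, dCdx=sens, velocity=0.01)
```

Because the overlap capacitance varies linearly with x, the force is independent of displacement, which is the linearity advantage of the comb drive noted earlier.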
Table 1: Design variables for the microresonator. Upper and lower bounds are in units of µm, except for N and V.

Quality Factor (Q):- The quality factor describes how underdamped an oscillator or resonator is; a higher Q indicates a lower rate of energy loss relative to the stored energy. For the x-mode,

Q = √(k·m) / B

where, for the x-direction, m is the mass, k is the spring constant, and B is the damping coefficient.
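Treating the x-mode as a second-order mass-spring-damper, Q can be computed directly from the lumped parameters; a minimal sketch with illustrative values (not the paper's extracted parameters):

```python
import math

# Quality factor of a second-order mass-spring-damper mode: Q = sqrt(k*m) / B.
# Higher Q means lower energy loss per cycle relative to the stored energy.
def quality_factor(k, m, B):
    """k: spring constant (N/m), m: mass (kg), B: damping coefficient (N*s/m)."""
    return math.sqrt(k * m) / B

# Illustrative values only:
Q = quality_factor(k=5.0, m=5e-11, B=1e-7)  # on the order of a few hundred
```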
Simulation Process:- Steps for the IntelliSuite simulator:
1. Design the appropriate mask or masks for your design in IntelliMask.
2. Fabricate the device using IntelliFab and visualize it.
3. Perform the different types of analysis (static or frequency) with the help of the TEM analysis module.
4. Obtain the results.
Fig. 3: MEMS microresonator mask structure using IntelliMask
Fig. 4: MEMS microresonator process flow using IntelliFab
Fig. 5: MEMS microresonator TEM structure using TEM analysis
Fig. 6: MEMS microresonator pressure distribution
Fig. 7: MEMS microresonator charge distribution
*Capacitance Report
Number of conductors: 2
CAPACITANCE MATRIX, 1e-6 nanofarads * 1e-6
C11  9.334000    C12 -1.037000
C21 -1.037000    C22  2.767000

*Natural Frequency Report (unit: Hz; number of modes: 6)
Mode 1: 23347.1 (natural/resonant frequency)
Mode 2: 39248.8
Mode 3: 40138
Mode 4: 51.6151
Mode 5: 70.8529

Resonator Simulation Results:- With the help of the simulation process we obtain the resonant frequency for different parameters. We can also find the displacement, pressure distribution, charge distribution, stress, linear motion, etc. The pressure distribution and charge distribution are shown in Fig. 6 and Fig. 7.

Table: Comb characteristics and resonant frequency (kHz).
Conclusion and Future Work:- In this project we designed and simulated a microresonator based on the comb-drive structure introduced by Tang. We designed it and calculated the resonance frequency for different geometry parameters. There are two types of constraints in the comb-drive structure, geometric and functional, which we have not discussed here and which are left for future work. The project can be extended in a number of directions. Manufacturing variations need to be incorporated for accurate synthesis results. Fabrication of the MEMS resonator is also a significant issue which we have not addressed in this work and which is left for future work. The spring constant can also be designed in different styles, likewise left for future work. After designing and calculating the resonance frequency for different shapes, we carried out the simulation process and obtained the results shown in the table. From all this work, the following points can be concluded. To achieve a high resonance frequency:
• the total spring constant should increase, or
• the dynamic mass should decrease (difficult, since a given number of fingers is needed for electrostatic actuation);
• k and m depend on material choice, layout, and dimensions;
• the frequency can be increased by using a material with a larger stiffness-to-mass ratio than silicon.
Acknowledgements: This research work was carried out at CARE, IIT Delhi, under the supervision of Prof. Sudhir Chandra, CARE, IIT Delhi. I am also grateful to my college Director, Dr. G. P. Govil, and my Head of Department, Mr. N. P. Gupta, for their kind-hearted support and motivation during the research work.

References:
1. S. M. Sze, Semiconductor Sensors, John Wiley & Sons Inc., New York, 1994
2. Ljubisa Ristic, “Sensor Technology and Devices”, Artech House ISBN 0-89006-532-2, 1994
3. G.K. Fedder and T. Mukherjee, "Automated Optimal Synthesis of Microresonators," Proc 9th Intl. Conf on Solid-State Sensors and Actuators (Transducers ’97), Chicago, IL, June 16-19, 1997.
4. W.C. Tang, T.-C. H. Nguyen, M. W. Judy, and R. T. Howe, "Electrostatic Comb Drive of Lateral Polysilicon Resonators," Sensors and Actuators A, 21 (1990) 328-31.
5. X. Zhang and W. C. Tang, "Viscous Air Damping in Laterally Driven Microresonators," Sensors and Materials, v. 7, no. 6, 1995, pp.415-430.
6. W. C. Tang, T.-C. H. Nguyen and R. T. Howe, "Laterally driven polysilicon resonant microstructures," IEEE Micro Electro Mechanical Systems Workshop, Salt Lake City, UT, USA, Feb. 20-22, 1989, pp. 53-59.
Abstract: This paper proposes a new way of comparing two video codecs on a rate-distortion basis. Scalable coding provides a straightforward solution for video coding that can serve a broad range of applications without the need for transcoding. Even though the latest international video-coding standards do not provide fully scalable methods, H.264 provides the best rate-distortion performance. Against H.264, we evaluate the rate-distortion performance of the Motion Compensated Embedded Zero Block Context (MC-EZBC) coder, which is fully scalable.
Keywords— MC-EZBC, ME/MC sub-pixel accuracy, temporal-level subband coding, YSNR.
I. INTRODUCTION
MODERN video compression coding technologies have improved significantly over the last few years and have enabled broadcasting of digital video signals over various networks [1]. Motion-compensated wavelet-based video coding has also emerged as an important research topic because of its ability to provide better quality. MC-EZBC [2][3] is a codec that encodes the motion information in a non-scalable manner, which results in reduced coding-efficiency performance at low bit rates. H.264 [4], although a non-scalable coding technique, provides good-quality video at substantially lower bit rates than previous standards like MPEG-2, H.263, or MPEG-4 Part 2, without increasing design complexity and cost.
In this paper we analyze the joint region of applicability of the MC-EZBC and H.264 video codecs. In MC-EZBC, by using three and four levels of temporal decomposition of the input video sequence (thereby obtaining GOP structures of 8 and 16 frames) and sub-pixel-accurate motion estimation and compensation, a fair comparison with H.264 is achieved in terms of coding efficiency [5].
The outline of the paper is as follows. After introducing the examined compression schemes in Section II, an overview of the applied methodology is provided in Section III. The obtained results are described in Section IV, while the conclusions are drawn in Section V.
II. VIDEO CODEC OVERVIEW

The two video codecs that were used in the tests are summed up in this section. Due to space constraints, the reader is referred to the references for further information on these codecs. The first one is a scalable wavelet-based video codec developed by J. Woods et al. (motion compensated embedded zero block coding, MC-EZBC) [6][7]. The second video codec is the Ad Hoc Model 2.0 (AHM 2.0) implementation of the H.264 standard [4][8], which extends the JM 6.1 implementation [9] with a rate-control algorithm [10].
III. MATERIALS AND METHODS
A. Encoding Process
This section describes how the two codecs were configured and used in order to obtain the bit streams necessary for performing the various measurements.
TABLE I: Sequences Used in Our Experiment
Name No. of frames Abbreviation
Akiyo 300 AK
Foreman 300 FO
Hall 300 HA
As input, three progressive video sequences were used in raw YCbCr 4:2:0 format. These were downloaded from the Hannover FTP server.
An overview of the sequences is given in Table I. The resolution used is the Common Intermediate Format (CIF, 352×288), thus resulting in 3 input video sequences. These sequences were encoded using constant-bit-rate (CBR) coding. Ten different target bit rates were used, covering both very low and very high rates: 100, 200, 300, ..., 1000 kbps. At each bit rate, encoding was performed at 30 frames per second. The detailed settings for the different encoding parameters can be found in Table II and Table III.
The code of MC-EZBC was downloaded from the MPEG CVS server. Each input video sequence was encoded once and then pulled several times in order to get decodable bit streams for all target bit rates. The H.264 bitstreams conform to the Baseline and Main Profiles. The GOP structure is IBBBP and the GOP length is 16.
TABLE II: Parameter Settings for the MC-EZBC Compressor

Parameter         Value (CIF)                    Comment
-inname           akiyo.yuv                      Name of the input file containing a 4:2:0 sequence
-statname         akiyo_tpyrlev3_cif_mv0.stat    Name of the output file containing statistical information generated during encoding
-start            0                              Index number of the first frame (0 means first frame in file)
-last             299                            Index number of the last frame
-size             352 288 176 144                Size of each input frame: pixel width and height of the luminance component, then pixel width and height of the chrominance component
-frame rate       30                             Number of input frames per second
-tPyrLev          3                              Levels of temporal subband decomposition
-searchrange      16                             Maximum search range (in pixels) at the first temporal decomposition level; the search range is doubled with each decomposition level
-maxsearchrange   64                             Upper limit for the search range
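The search-range rule in Table II (the base range doubled at each temporal decomposition level, capped by -maxsearchrange) can be sketched as:

```python
# Effective MC-EZBC motion-search range per temporal decomposition level:
# the base -searchrange is doubled at each level, capped by -maxsearchrange.
def search_ranges(base, max_range, levels):
    return [min(base * 2 ** lev, max_range) for lev in range(levels)]

ranges = search_ranges(base=16, max_range=64, levels=4)
# With -searchrange 16 and -maxsearchrange 64: [16, 32, 64, 64]
```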
TABLE III: Parameter Settings for the H.264 AHM 2.0 Encoder

Parameter                  Value (CIF)
Input File                 "../Akiyo300_cif.yuv"
Frames To Be Encoded       300
Source Width               352
Source Height              288
Trace File                 "trace_enc.txt"
Recon File                 "trace_rec.yuv"
Output File                "test.264"
Search Range               16
Number Reference Frames    1
Restrict Search Range      2
RD Optimization            1
Context Init Method        1
Rate Control Enable        1
Rate Control Type          0
Bit rate                   100 Kbps
B. Quality Measurement
The PSNR-Y is calculated as defined in [11]. In order to get a PSNR value for an entire sequence, the average of the PSNR-Y values of the individual frames is calculated. This is not the only way to obtain a value for an entire sequence; another method could be, for instance, to take the minimum of the individual PSNR-Y values (because a video sequence may be evaluated based on its worst part). PSNR is based on a distance between two images (derived from the mean square error, MSE) and does not take into account any property of the human visual system (HVS).
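The per-frame PSNR-Y and the two sequence-level summaries discussed above (mean and worst-frame minimum) can be sketched as follows; this assumes 8-bit luma samples with a peak value of 255, and the per-frame MSE values are purely illustrative.

```python
import math

# Per-frame PSNR-Y from the mean square error of the luma plane (8-bit samples),
# plus the two sequence-level summaries discussed in the text: mean and minimum.
def psnr_y(mse, peak=255.0):
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def sequence_psnr(frame_mses, summary="mean"):
    values = [psnr_y(m) for m in frame_mses]
    if summary == "min":              # rate the sequence by its worst frame
        return min(values)
    return sum(values) / len(values)  # the averaging used in this paper

# Illustrative per-frame MSE values for a short sequence:
mses = [10.0, 20.0, 40.0]
avg = sequence_psnr(mses)             # mean of the per-frame PSNR-Y values
worst = sequence_psnr(mses, "min")    # PSNR-Y of the worst frame
```

The minimum summary is always no larger than the mean, which is why the two conventions can rank codecs differently on sequences with isolated bad frames.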
IV. EXPERIMENTAL RESULTS
In the experiment, the performance of the codecs is evaluated on a rate-distortion basis. Due to the size of the experiments and space constraints, not all results can be presented; a subset of the results is given in Table IV and Table V.

The coding efficiency of MC-EZBC is compared with that of H.264 on different sequences at different bit rates. MC-EZBC is a fully scalable coding architecture which utilizes MCTF and wavelet filtering. The software available for download at the website of CIPR, RPI [7] is used for testing the video material. H.264, on the other hand, has a non-scalable coding structure. All tests were run on a Linux-based personal computer (AMD Turion 64 X2, 1.9 GHz, 1 GB RAM) with Ubuntu 9.04 installed and no other software running in the background.

The measurement results of both codecs provide an assessment of the coding efficiency of current wavelet-based codecs compared to state-of-the-art single-layered codecs. A first general remark is that, for certain bit rates, there are no measurement points for MC-EZBC: MC-EZBC is not able to encode that particular video sequence at such low target bit rates. At low bit rates, a codec may also decide to skip some frames.
TABLE IV: Average coding gain of MC-EZBC and H.264 between 500-1000 Kbps

Video Codec    Foreman (YSNR), dB
MC-EZBC        37.90
H.264          38.06
For video sequences with a higher amount of movement (FO), the results indicate that, on average, H.264 JM 6.0 performs significantly better than MC-EZBC in terms of PSNR-Y at almost all bit rates. It is also observed that H.264 performs well throughout the bit-rate range for high-complexity content.
TABLE V: Subset of Quality Measurements for the CIF Video Sequences

Bit Rate (Kbps)    MC-EZBC (Foreman)    H.264 (Foreman)
100                27.86                30.33
400                34.88                35.73
1000               39.12                39.30
V. CONCLUSION
In this paper, an overview was given of the rate-distortion performance of two state-of-the-art video codec technologies in terms of YSNR. From the above results it is clear that the tools incorporated in the H.264 standard outperform MC-EZBC, although at around 1000 Kbps the performance of MC-EZBC is comparable with that of H.264 for high-complexity sequences.
REFERENCES
[1] M. Ghanbari, Standard Codecs: Image Compression to Advanced Video Coding, IEE Telecommunications Series, 2003.
[2] S. S. Tsai, Motion Information Scalability for Interframe Wavelet Video Coding, MS Thesis, National Chiao Tung University, Hsinchu, Taiwan, R.O.C., Jun. 2003.
[3] S. S. Tsai, Motion Information Scalability for Interframe Wavelet Video Coding, MS Thesis, National Chiao Tung University, Hsinchu, Taiwan, R.O.C., Jun. 2003.
[4] J. W. Woods and P. S. Chen, "Improved MC-EZBC with Quarter-pixel Motion Vectors," ISO/IEC JTC1/SC29/WG11 doc. no. m8366, Fairfax, May 2002.
[5] T. Wiegand, G. Sullivan and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. on CSVT, vol. 13, pp. 560-576, July 2003.
[6] I. E. G. Richardson, H.264 and MPEG-4 Video Compression, Hoboken, NJ: Wiley, 2003.
[7] S. T. Hsiang and J. W. Woods, "Embedded Video Coding using Invertible Motion Compensated 3-D Subband/Wavelet Filter Bank," Signal Process.: Image Communication, vol. 16, pp. 705-724, May 2001.
[8] T. Wiegand, H. Schwarz, A. Joch, F. Kossentini and G. J. Sullivan, "Rate-Constrained Coder Control and Comparison of Video Coding Standards," IEEE Trans. Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 688-703, July 2003.
Customized VLSI chips influenced the earlier research implementing digital filters. The architecture of these filters is largely determined by the target application. Typical DSP chips like Texas Instruments' TMS320, Freescale's MSC81xx, Motorola's 56000, and Analog Devices' ADSP-2100 families efficiently perform filtering operations in the audio range. For higher frequency domains, CMOS and Bi-CMOS technology is used. There are some drawbacks to customized chips. The biggest shortcoming is low flexibility, as they are application-specific. The lack of adaptability in these chips is also severe: typical custom approaches do not allow the function of a device to be modified during evaluation, for example for fault correction. The FPGA approach is therefore necessary to provide design freedom. Many of the popular FPGAs are in-system programmable, which allows modification of the operation using simple programming.

For filtering purposes, FIR filters [3] have commonly been used. In this particular work, IIR filters are implemented, as they require fewer calculations and less memory. IIR filters also outperform FIR filters [5] for narrow transition bands, and they can provide a better approximation of traditionally analog systems in digital applications than competing filter types. IIR filters are mainly used in audio applications such as speakers and sound-processing functions. In this work, the Xilinx Spartan-3E series is used for implementing various digital filtering algorithms. The Xilinx Spartan-3E consists of reconfigurable combinational logic blocks with multiple inputs and outputs, a router or switching matrix for connections, and buffers.
III. PROPOSED ARCHITECTURE
IIR filter implementations on an FPGA board illustrate that the FPGA approach is both flexible and provides performance superior to traditional approaches. Because of the programmability of this technology, the examples in this paper can be extended to a variety of other high-performance IIR filter realizations. Using powerful computer-based software tools to perform the repetitive calculations of the filter design process enables a designer to achieve the best design in the shortest time. When implementing a filter in hardware, the biggest challenge is to achieve the specified system performance at minimum hardware cost. In this paper we pursue this goal by designing a digital filter, which also gives a better noise margin and less component ageing than an analog filter. One of the hurdles is to understand, estimate, and where possible overcome the effects of using a finite word length to represent the infinite-precision coefficients. Selecting a non-optimized word length [6] can result in a filter transfer function different from the one expected. The effects of finite word length representation can be minimized by analytical or qualitative methods, or simply by implementing higher-order filters in cascaded or parallel form.
Digital filters [7] are often described and implemented in terms of the difference equation that defines how the output signal is related to the input signal. We have modeled the equation as
y[n] = (1/a0) * (b0*x[n] + b1*x[n-1] + ... + bP*x[n-P]
                 - a1*y[n-1] - a2*y[n-2] - ... - aQ*y[n-Q])    (1)
Where:
• P is the feed-forward filter order
• b0, ..., bP are the feed-forward filter coefficients
• Q is the feedback filter order
• a1, ..., aQ are the feedback filter coefficients (a0 normalizes the output)
• x[n] is the input signal
• y[n] is the output signal.
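The difference equation can be exercised directly; this short sketch (illustrative Python, not the paper's VHDL) evaluates equation (1) sample by sample:

```python
def iir_filter(x, b, a):
    """Evaluate the IIR difference equation (1) sample by sample.

    b: feed-forward coefficients b0..bP; a: feedback coefficients a0..aQ,
    with a[0] normalizing the output as in equation (1)."""
    P, Q = len(b) - 1, len(a) - 1
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(P + 1) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, Q + 1) if n - k >= 0)
        y.append(acc / a[0])
    return y
```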
Now from the above equation we model the transfer function of the second-order IIR section used here as

H(z) = Y(z)/X(z) = (b0 + b1*z^-1 + b2*z^-2) / (1 + a1*z^-1 + a2*z^-2)    (2)

For the hardware representation of the digital filter we have modeled this transfer function using an adder, a multiplier, and a delay unit.
Figure 1: Direct Form-2 Structure of Digital Filter
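The Direct Form-2 structure of Figure 1 shares one delay line between the feedback and feed-forward paths, computing the intermediate signal w(n) first. A minimal behavioral sketch of one second-order section (in Python for illustration; the paper's implementation is in VHDL):

```python
def df2_biquad(x, b, a):
    """Second-order Direct Form-2 section as in Figure 1.

    b = (b0, b1, b2), a = (a1, a2); one shared delay line holds
    w(n-1) and w(n-2), so only two delay units are needed."""
    b0, b1, b2 = b
    a1, a2 = a
    w1 = w2 = 0.0                              # w(n-1), w(n-2)
    y = []
    for xn in x:
        w0 = xn - a1 * w1 - a2 * w2            # feedback path first
        y.append(b0 * w0 + b1 * w1 + b2 * w2)  # then feed-forward path
        w2, w1 = w1, w0                        # shift the delay line
    return y
```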
A basic IIR filter consists of three main blocks:
(i) Adder (ii) Multiplier (iii) Delay unit
A. Implementation of Adder
We have implemented this system using a serial adder. A serial adder is a binary adder that adds two numbers bit-pair-wise. Each bit pair is added in a single clock pulse, and the carry of each pair is propagated to the next pair.
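As a behavioral model (not the VHDL itself), the serial adder can be sketched as:

```python
def serial_add(a_bits, b_bits):
    """Serial adder: add two equal-length bit lists LSB-first,
    one bit pair per 'clock pulse', propagating the carry."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        s = a ^ b ^ carry                     # sum bit of the full adder
        carry = (a & b) | (carry & (a ^ b))   # carry out to the next pair
        out.append(s)
    out.append(carry)                         # final carry-out bit
    return out
```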
B. Implementation of Multiplier
The multiplier has been configured to perform multiplication of signed numbers in two's-complement notation. We have used signed multiplication, where an n-bit by n-bit multiplication takes place and results in a 2n-bit value.
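A behavioral sketch of the signed n × n → 2n-bit multiplication (illustrative Python, not the hardware description):

```python
def twos_comp_multiply(a, b, n):
    """Multiply two n-bit two's-complement values; the result is a
    2n-bit two's-complement pattern. a, b are raw n-bit patterns."""
    def to_signed(v, bits):
        # Interpret the raw pattern as a signed two's-complement value.
        return v - (1 << bits) if v & (1 << (bits - 1)) else v
    prod = to_signed(a, n) * to_signed(b, n)
    return prod & ((1 << 2 * n) - 1)   # wrap into the 2n-bit pattern
```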
C. Implementation of Delay Unit
We have used a shift register for the delay. A shift register is a group of flip-flops set up in a linear fashion, with their inputs and outputs connected together in such a way that the data is shifted from one device to another when the circuit is active: (i) the shift register provides the data-movement function; (ii) it "shifts" its output once every clock cycle.
IV. SIMULATION RESULTS
To check the response of the proposed filter we have used the Filter Design and Analysis Tool (FDA Tool), a graphical user interface (GUI) available in the Signal Processing Toolbox of MATLAB for designing and analyzing filters. It takes the filter specifications as inputs. Table 1 shows the specifications of an IIR low-pass elliptic filter of order 6.
Table 1: IIR filter specifications

Filter performance parameter    Value
Pass band ripple                0.5 dB
Pass band frequency             11000 Hz
Stop band frequency             12000 Hz
Stop band attenuation           35 dB
Sampling frequency              48000 Hz
A. Software Simulation
The sampling frequency is chosen as 4 times the stop-band frequency, and the filter has a steep transition band with a width of 1000 Hz. These specifications are fed as inputs to the FDA tool in MATLAB R2009a. The tool performs the filter design calculations using double-precision floating-point representation and displays the response of an IIR elliptic low-pass filter of order 6. Figure 2 shows the filter design window of the FDA tool after completion of the design process.
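The Table 1 design can also be reproduced outside MATLAB; the following SciPy sketch (an assumption for illustration — the authors used the FDA tool) designs the same order-6 elliptic low-pass filter:

```python
from scipy import signal

fs = 48000.0
# Order-6 elliptic low pass: 0.5 dB pass-band ripple, 35 dB stop-band
# attenuation, pass-band edge 11000 Hz (Table 1 specifications).
b, a = signal.ellip(N=6, rp=0.5, rs=35, Wn=11000 / (fs / 2), btype='low')
# Frequency response for plotting/inspection
w, h = signal.freqz(b, a, worN=2048, fs=fs)
```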
Figure 2: Filter design using the MATLAB FDA tool
We have designed the IIR filter in direct form-2. Using VHDL we have simulated it and downloaded it to the Xilinx Spartan-3E kit. The response obtained by simulating the VHDL code is shown below.
Figure 3: The simulation output of the IIR filter in Xilinx ISE 7.01
The coding scheme we are using is VHDL (Very High Speed Integrated Circuit Hardware Description Language). Since we have designed the filter in the digital domain, to accommodate it in an existing analog system we have to add an A/D converter before the system and a D/A converter after it.
B. Hardware Implementation
We have implemented the digital IIR filter using the FPGA-based Xilinx Spartan-3E kit, which consists of an interior array of 64 CLBs surrounded by a ring of 64 input-output interface blocks. The FPGA architecture is shown below.
Figure 4: Internal Block Diagram of FPGA Architecture
V. CONCLUSION
We have implemented the IIR filter on an FPGA, and our results show an improvement over existing filter design architectures. In the future we will apply our scheme to real-time applications.
REFERENCES
[1] U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate Arrays, Second Edition, Springer, p. 109.
[2] U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate Arrays, Second Edition, Springer, p. 110.
[3] D. M. Kodek, "Design of optimal finite word length FIR digital filters using integer programming techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-28, no. 3, June 1980.
[4] W. Sung and K.-I. Kum, "Simulation-based word-length optimization method for fixed-point digital signal processing systems," IEEE Transactions on Signal Processing, vol. 43, no. 12, December 1995.
[5] X. Hu, L. S. DeBrunner, and V. DeBrunner, "An efficient design for FIR filters with variable precision," Proc. 2002 IEEE Int. Symp. on Circuits and Systems, vol. 4, pp. 365-368, May 2002.
[6] Y. C. Lim, R. Yang, D. Li, and J. Song, "Signed-power-of-two term allocation scheme for the design of digital filters," IEEE Transactions on Circuits and Systems II, vol. 46, pp. 577-584, May 1999.
[7] S. C. Chan, W. Liu, and K. L. Ho, "Multiplierless perfect reconstruction modulated filter banks with sum-of-powers-of-two coefficients," IEEE Signal Processing Letters, vol. 8, no. 6, pp. 163-166, June 2001.
Enhanced Clocking Rule for A5/1 Encryption Algorithm
Rosepreet Kaur Bhogal, ECE Dept., [email protected], Nikesh Bajaj, Asst. Prof., ECE Dept., [email protected], Lovely Professional University, India

Abstract—GSM, an acronym for Global System for Mobile communications, uses various encryption algorithms such as A5/1, A5/2, and A5/3, which encrypt the information transmitted from the mobile station to the base station during communication. Although A5/1 is regarded as the strong variant, the attacks mounted on it, for example on its linear complexity and clocking taps, expose weaknesses. In this paper we propose a concept to improve the A5/1 encryption algorithm by modifying the clocking mechanism of its registers; the modified version of A5/1 remains fast and easy to implement, which makes it well suited for future use.

Index Terms—GSM, encryption, A5/1 stream cipher, clock controlling unit, correlation
I. INTRODUCTION
Wireless communication is an effective and convenient way of sharing information [7], and GSM is a prominent example of it. This information must be secure, so that nobody, such as an eavesdropper, can interfere; cryptography therefore plays a vital role in protecting it. However, when information is sent from the mobile station to the base station, the air interface poses a serious security threat between the communicating parties [10]. The question then arises of how to protect the communication. For this, GSM uses the A5/x series of encryption algorithms to encrypt voice and data over the GSM link. Among the different variants, A5/0 provides no encryption, A5/1 is the strong version, A5/2 is a weaker version targeting markets outside Europe, and A5/3, based on block ciphering, is a strong version created as part of the 3rd Generation Partnership Project (3GPP) [5].
In this paper we explore A5/1, which, although the strong version, has been shown to be weaker by the attacks mounted on it. A5/1 is based on stream ciphering [1], which is very fast: it performs a bit-by-bit XOR of the plaintext with a key stream. In the simplest case, each plaintext bit is XORed with a secret key bit to produce the ciphertext bit, and the reverse process is decryption.
A5/1 is built from linear feedback shift registers (LFSRs). The initial value of an LFSR is called the seed; because the operation of the register is deterministic, the stream of values it produces is completely determined by its current (or previous) state. However, an LFSR with a well-chosen feedback function can produce a sequence of bits which appears random and has a long cycle [2]. In cryptography, correlation attacks are a class of known-plaintext attacks for breaking stream ciphers whose key stream is generated by combining the outputs of several linear feedback shift registers using a Boolean function. Correlation attacks [6] exploit a statistical weakness that arises from a poor choice of the Boolean function; it is possible to select a function which avoids correlation attacks, so this type of cipher is not inherently insecure, but it is essential to consider susceptibility to correlation attacks when designing stream ciphers of any type. In this paper we propose a new clocking mechanism, in place of the m-rule (majority rule) used by the A5/1 stream cipher, in order to resist correlation attacks. The paper is organized as follows. Section 2 describes the A5/1 stream cipher. Section 3 analyzes the correlation attack. Section 4 proposes the modified structure of the A5/1 key stream generator. Finally, we conclude.
II. DESCRIPTION OF A5/1
A5/1 is a stream cipher [11] that produces a key stream, so it is called a key stream generator. It is made up of three linear feedback shift registers, of lengths 19, 22, and 23, used to generate a sequence of binary bits. GSM conversations are organized in frames of 228 bits, i.e. 114 bits for each direction, which are encrypted/decrypted with this key stream [4]. A5/1 is initialized with a 64-bit key together with a publicly known 22-bit frame number. Its linear feedback shift registers R1, R2, and R3 have feedback taps at bit positions (13, 16, 17, 18), (20, 21), and (7, 20, 21, 22) respectively. The registers are clocked using a rule called the majority rule, with clocking taps A, B, and C taken from R1(8), R2(10), and R3(10) respectively. Before a register is clocked, its feedback is calculated using the linear XOR operator; the register is then shifted one bit to the right (discarding the rightmost bit), and the bit produced by the feedback is stored in the leftmost location. This cycle repeats 64 times during initialization. Clocking is irregular, according to the majority rule over the three clocking bits A, B, C of the LFSRs: if two or more of the clocking bits are 0, then m = 0, and every register whose clocking bit matches m is clocked; similarly, if two or more of the clocking bits are 1, then m = 1, and the registers whose clocking bits match m are clocked. At each clocking step the LFSRs each generate one bit, and these bits are combined by a linear function (XOR). In A5/1, the probability of an individual LFSR being clocked is 3/4. The majority bit m is defined in Boolean algebra as m = A·B (+) B·C (+) A·C, as shown in Figure 1, the structure of the A5/1 stream cipher; for the possible cases refer to Table 1.
Figure 1: Structure of A5/1 stream cipher
Table 1: Possible cases for clocking the A5/1 registers
As shown in Table 1, the possible clocking cases follow the m-rule explained above. Each register is clocked with probability 3/4 [8], so each output bit yields some information about the state of the LFSRs [3]. Because of this, the whole scheme falls to a correlation attack, through which state bits can be determined.
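The m-rule and Table 1 can be checked by enumeration; this sketch confirms the 3/4 clocking probability:

```python
from itertools import product

def majority(a, b, c):
    """m-rule used by A5/1: m = A·B (+) B·C (+) A·C (the majority bit)."""
    return (a & b) ^ (b & c) ^ (a & c)

# A register is clocked when its clocking bit agrees with m.
clock_counts = {'R1': 0, 'R2': 0, 'R3': 0}
for a, b, c in product((0, 1), repeat=3):
    m = majority(a, b, c)
    for name, bit in (('R1', a), ('R2', b), ('R3', c)):
        if bit == m:
            clock_counts[name] += 1
# Each register agrees with m in 6 of the 8 cases: probability 3/4.
```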
III. ANALYSIS OF THE CORRELATION ATTACK
Analyzing a stream cipher is easier than analyzing a block cipher. Two main factors must be considered when designing any stream cipher: correlation and linear complexity. Linear complexity is important because the Berlekamp-Massey algorithm can recover the state of the LFSRs when some of the LFSR bits are related to the generated output sequence. Linear complexity should be large for more security, but a large value alone does not indicate a secure cipher. Correlation immunity and higher linear complexity are obtained by combining the LFSR output sequences in a more non-linear manner. Insecurity arises when the output of the combining function is correlated with the output of an individual LFSR, because this correlation makes a correlation attack possible: by observing the output sequence, an attacker obtains information about the internal state, from which other internal states can be determined, and with that the entire stream cipher generator is broken. Now to the main point: the A5/1 stream cipher also uses three LFSRs, and its clocking taps look strong, but attacks have shown it to be cryptographically weak. The output of the generator equals the output of LFSR2 about 75% of the time; if the feedback is known, we can guess the initial state of LFSR2, generate its output sequence, and count the number of times the LFSR2 output agrees with the output of the generator. If the two sequences agree about 50% of the time, the guess is wrong; if they agree about 75% of the time, the guess is right. Similarly, the output sequence agrees about 75% of the time with LFSR3. Using this correlation, the cipher can easily be cracked by a known-plaintext attack. The basic idea behind A5/1 is good, and it passes statistical tests such as the NIST test suite [12], but it still has the weakness that the LFSR lengths are short enough to make cryptanalysis feasible; A5/1 should be made as long as possible for more security.
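The 75% agreement that the attack exploits can be demonstrated on a toy model (illustrative only; a real attack operates on the actual LFSR states):

```python
import random

random.seed(1)

def majority(a, b, c):
    return (a & b) ^ (b & c) ^ (a & c)

# Toy model of the correlation weakness: the majority combiner's output
# agrees with each individual input stream about 75% of the time, so a
# correctly guessed register state can be confirmed by counting agreements.
N = 10000
streams = [[random.getrandbits(1) for _ in range(N)] for _ in range(3)]
out = [majority(a, b, c) for a, b, c in zip(*streams)]
agree = sum(o == s for o, s in zip(out, streams[1])) / N
# agree is close to 0.75, far from the ~0.5 expected of a wrong guess
```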
IV. MODIFIED A5/1 STREAM CIPHER
A new clock control mechanism is proposed to overcome the problem of the 3/4 clocking probability explained above. With the proposed concept, the probability becomes 1/2, using a modified clock controlling unit. Consider three bits A, B, and C of the respective registers R1, R2, and R3, called the clocking bits. The structure of the proposed A5/1 stream cipher is shown in figure 2.
Figure 2: Modified stream cipher
A. Clocking controlling unit
In the new clock control mechanism each register has one clocking tap: bit 8 for R1, bit 10 for R2, and bit 10 for R3. The clocking bit is generated using the Boolean expression written next; an AND gate is used, which also increases the linear complexity. In the following, ¬ denotes NOT and (+) denotes XOR:
y = ¬A·(B (+) C) + A·¬(B (+) C)   (1)
With the expression above, built from these gates, consider the clocking bits A, B, and C of the respective registers. In each cycle, every register whose clocking tap agrees with y of equation (1) is clocked and shifted. For example, if A, B, and C are the clocking taps of R1, R2, and R3 respectively, then Table 2 shows all the possible clocking combinations.
Table 2: Possible cases for clocking the registers of the modified stream cipher.
As the table shows, in each cycle at least one register is clocked; the mechanism was designed with this in mind, so that the generator can never stop at a position where no register is clocked. Case 1: A=0, B=0, C=0; equation (1) gives y=0, so every register whose clocking bit agrees with this value is clocked: R1, R2, and R3 all agree, so all registers are clocked and shifted to the right (discarding the rightmost bit). Case 2: A=0, B=0, C=1; y=1, so R3 is clocked and shifted. Case 3: A=0, B=1, C=0; y=1, so R2 is clocked and shifted. Case 4: A=0, B=1, C=1; y=0, so R1 is clocked and shifted. Case 5: A=1, B=0, C=0; y=1, so R1 is clocked and shifted. Case 6: A=1, B=0, C=1; y=0, so R2 is clocked and shifted. Case 7: A=1, B=1, C=0; y=0, so R3 is clocked and shifted. Finally, case 8: A=1, B=1, C=1; y=1, so all registers are clocked and shifted. Comparing the possible clocking outcomes in Tables 1 and 2: in Table 1, at least two registers are shifted in each cycle, and each register is clocked with probability 3/4; in Table 2 this probability is reduced to 1/2, with at least one register shifted per cycle. The resulting output bits are unrelated to the state of the LFSRs for 6 clock cycles.
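Table 2 and the 1/2 clocking probability can be verified by enumerating equation (1) over all eight cases:

```python
from itertools import product

def y_rule(a, b, c):
    """Proposed clocking bit, equation (1): y = ¬A·(B(+)C) + A·¬(B(+)C),
    which simplifies to A (+) B (+) C."""
    return ((1 - a) & (b ^ c)) | (a & (1 - (b ^ c)))

clock_counts = {'R1': 0, 'R2': 0, 'R3': 0}
for a, b, c in product((0, 1), repeat=3):
    y = y_rule(a, b, c)
    assert y == a ^ b ^ c          # equivalent closed form
    for name, bit in (('R1', a), ('R2', b), ('R3', c)):
        if bit == y:
            clock_counts[name] += 1
# Each register now clocks in 4 of the 8 cases: probability 1/2.
```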
V. CONCLUSION
The A5/1 key stream generator is easy to implement and an efficient encryption algorithm, used in the GSM communication application. Nevertheless, it exhibits weaknesses: the LFSR lengths are short, and it is vulnerable to the basic correlation attack discussed in Section 3. The analysis here decreases the possibility of a correlation attack. A modified A5/1 structure has been given in Section 4 which is easy to implement and fast. Compared with the clocking mechanism based on the majority rule, which has been shown to leave the encryption algorithm insecure, the enhancement proposed in the new clock mechanism increases the level of security and decreases the possibility of the correlation attack. The probability of a linear feedback shift register being clocked is reduced from 3/4 to 1/2, which prevents the state from being identified from the output sequence; that is, the generator produces bits which are unrelated to the output sequence for up to 6 cycles, as shown by the modified structure of the A5/1 stream cipher in Section 4.
ACKNOWLEDGMENT
This work is part of the completion of a master's dissertation. Many people contributed in assorted ways to this work and its writing, and they deserve special mention; it is a pleasure to convey my gratitude to them all in this humble acknowledgment. I thank my guide, Mr. Nikesh Bajaj, for his supervision, advice, and guidance at every stage of this paper, as well as for giving me extraordinary experiences throughout the work. Above all, he provided unflinching encouragement and support in various ways. His intuition has made him a constant oasis of ideas and passion in electronics, which has exceptionally inspired and enriched my growth as a student. Last but not least, I would like to thank my fellow students for the stimulating discussions and the successful realization of this work.
REFERENCES
[1] E. Barkan, E. Biham, and N. Keller, "Instant ciphertext-only cryptanalysis of GSM encrypted communication," Advances in Cryptology - CRYPTO 2003.
[2] P. Ekdahl, "On LFSR based stream ciphers: analysis and design."
[3] M. Sharaf, H. A. K. Mansour, H. H. Zayed, and M. L. Shore, "A complex linear feedback shift register design for the A5 keystream generator."
[4] D. Margrave, "GSM security and encryption," George Mason University.
[5] O. Dunkelman, N. Keller, and A. Shamir, "A practical-time attack on the A5/3 cryptosystem used in third generation GSM telephony."
[6] G. Rose, "A précis of the new attacks on GSM encryption," QUALCOMM Australia.
[7] P. Bouška and M. Drahanský, "Communication security in GSM networks," Faculty of Information Technology, Brno University of Technology.
[8] M. Ahmad and Izharuddin, "Enhanced A5/1 cipher with improved linear complexity."
[9] M. Peuhkuri, "Mobile networks security," 2008-04-22.
[10] N. Komninos, B. Honary, and M. Darnell, "Security enhancements for A5/1 without losing hardware efficiency in future mobile systems."
[11] C.-C. Lo and Y.-J. Chen, "Stream ciphers for GSM networks," Institute of Information Management, National Chiao-Tung University.
Rosepreet Kaur Bhogal is pursuing the master's degree in signal processing at Lovely Professional University, Punjab, India. She is currently working on her dissertation under the supervision of Mr. Nikesh Bajaj, assistant professor in the Electronics Department. Her research interests include different aspects of cryptography, such as cryptographic assumptions and the encryption algorithms used in GSM.
Nikesh Bajaj received his bachelor's degree in Electronics & Telecommunication from the Institute of Electronics and Telecommunication Engineers, and his master's degree in Communication & Information Systems from Aligarh Muslim University, India. He is now working at LPU as Assistant Professor in the Department of ECE. His research interests include Cryptography, Cryptanalysis, and Signal & Image Processing.
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0107-1
An Application of Kalman Filter in State Estimation of a
Dynamic System
Vishal Awasthi1, Krishna Raj2
Abstract-- Most wireless communication systems for indoor positioning and tracking suffer from different error sources, including process errors and measurement errors. A state estimation algorithm deals with recovering desired state variables of a dynamic system from available noisy measurements. Correct and accurate state estimation of a linear or non-linear system can be improved by selecting the proper estimation technique. The Kalman filter is an often-used technique that provides linear, unbiased, minimum-variance estimates of an unknown state vector, and its variants extend this to non-linear systems. In this paper we try to bridge the gap between the Kalman Filter and its variant, the Extended Kalman Filter (EKF), by comparing their algorithms and performance in the state estimation of a car moving under a constant force.
Index Terms-- Stochastic filtering, Bayesian filtering, Adaptive filter, Unscented transform, Digital filters.
1. INTRODUCTION
In the area of telecommunications, signals are mixtures of different frequencies. The least squares method, proposed by Carl Friedrich Gauss in 1795, was the first method for forming an optimal estimate from noisy data, and it provides an important connection between the experimental and theoretical sciences. Before Kalman, in the 1940s, Norbert Wiener proposed his famous filter, the Wiener filter, which was restricted to stationary scalar signals and noises. The solution obtained by this filter is not recursive and needs the storage of the entire past observed data. In the early 1960s, Kalman filtering theory, a novel recursive filtering algorithm, was developed by Kalman and Bucy, and it did not require the stationarity assumption [1], [2]. The Kalman filter is a generalization of the Wiener filter. The significance of this filter lies in its ability to accommodate vector signals and noises which may be non-stationary. The solution is recursive in that each updated estimate of the state is computed from the previous estimate and the new input data; contrary to the Wiener filter, only the previous estimate requires storage, so the Kalman filter eliminates the need for storing the entire past observed data. Most of the existing approaches need an a priori kinematic model of the target for the prediction. Although such a predictor can successfully filter out the noisy measurements, its parameters might need to change for different dynamic targets.
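As a concrete illustration of the recursion, here is a minimal 1-D sketch of one predict/update cycle for a car state [position, velocity] with only the position measured; the model matrices and noise levels are illustrative, not the paper's exact setup:

```python
# Minimal 1-D Kalman filter for a car with state [position, velocity].
# F = [[1, dt], [0, 1]] (constant velocity), H = [1, 0]; q and r are
# illustrative process/measurement noise levels, not from the paper.
dt = 1.0

def kalman_step(x, P, z, q=0.01, r=1.0):
    """One predict/update cycle; z is a noisy position measurement."""
    # Predict: x' = F x,  P' = F P F^T + Q  (expanded by hand for 2x2)
    xp = [x[0] + dt * x[1], x[1]]
    Pp = [[P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q,
           P[0][1] + dt * P[1][1]],
          [P[1][0] + dt * P[1][1],
           P[1][1] + q]]
    # Update: K = P' H^T / (H P' H^T + r), with H = [1, 0]
    S = Pp[0][0] + r
    K = [Pp[0][0] / S, Pp[1][0] / S]
    innov = z - xp[0]
    xn = [xp[0] + K[0] * innov, xp[1] + K[1] * innov]
    # Covariance update: P = (I - K H) P'
    Pn = [[(1 - K[0]) * Pp[0][0], (1 - K[0]) * Pp[0][1]],
          [Pp[1][0] - K[1] * Pp[0][0], Pp[1][1] - K[1] * Pp[0][1]]]
    return xn, Pn
```

Note that only the previous estimate (x, P) is carried between steps, which is exactly the storage advantage over the Wiener filter described above.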
1Member IETE, Lecturer, Deptt. of Electronics & Comm. Engg., UIET., CSJM.University, Kanpur-24, U.P., (email: [email protected]) 2Fellow IETE, Associate Professor, Deptt. of Electronics Engineering,
SIP0110-2
Spectral and Cepstral Analysis Using Modified Bartlett-Hanning Window
Rohit Pandey1, Rohit Kumar Agrawal1, Sneha Shree1
1Department of Electronics & Communication Engineering, Jaypee University of Engineering & Technology, Guna, MP, India
[email protected], [email protected]

... training, speech and speaker recognition systems, and vocoders [6],[8]. To accurately detect and estimate the fundamental frequency of a speaker we use cepstrum analysis [5], which is also called the spectrum of the spectrum. It is used to separate the excitation signal (pitch) from the transfer function (voice quality). One of the algorithms that shows good performance for quasi-periodic signals is the cepstrum (CEP) algorithm. However, its ability to separate the source signal (which conveys the pitch information) from the vocal tract response fails wherever the speech frame cannot be contemplated as simply the result of a linear convolution between the two components, as occurs in transitions or non-stationary speech segments, or when the recorded speech signal includes additive noise [5],[7].
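The cepstrum computation described above can be sketched as follows (an illustrative NumPy version; parameter choices such as the pitch search band are assumptions):

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum
    (the 'spectrum of the spectrum')."""
    spectrum = np.fft.fft(frame)
    return np.fft.ifft(np.log(np.abs(spectrum) + 1e-12)).real

def pitch_from_cepstrum(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency from the cepstral peak in the
    quefrency band corresponding to [fmin, fmax] Hz."""
    c = real_cepstrum(frame)
    lo, hi = int(fs / fmax), int(fs / fmin)
    q = lo + np.argmax(c[lo:hi])   # quefrency (in samples) of the peak
    return fs / q
```

For a quasi-periodic frame, the excitation shows up as a peak at the quefrency of the pitch period, well separated from the low-quefrency vocal-tract contribution.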
II. WINDOW FUNCTIONS
These are the window functions used for spectrum and cepstrum analysis. MBH windows [1] are used for the estimation techniques. The Modified Bartlett-Hanning (MBH) window is extended to the form given in [1].
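For reference, the standard (unextended) Modified Bartlett-Hanning window can be generated as below; the extended form of [1] is not reproduced here:

```python
import math

def barthann(N):
    """Standard Modified Bartlett-Hanning (Bartlett-Hann) window:
    w(x) = 0.62 - 0.48*|x| + 0.38*cos(2*pi*x),  x = n/(N-1) - 1/2,
    a weighted sum of a triangular (Bartlett) and a Hanning-like term."""
    return [0.62 - 0.48 * abs(n / (N - 1) - 0.5)
            + 0.38 * math.cos(2 * math.pi * (n / (N - 1) - 0.5))
            for n in range(N)]
```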
ABSTRACT: Face recognition has been under research for the last couple of decades. With the advancement of 3D imaging technology, 3D face recognition emerges as an alternative that overcomes problems inherent to 2D face recognition, i.e., sensitivity to the illumination conditions and position of a subject. But 3D face recognition still needs to tackle the problem of the deformation of facial geometry that results from the expression changes of a subject. To deal with this issue, a 3D face recognition framework is proposed in this paper. It is a combination of three subsystems: an expression recognition system, an expressional face recognition system, and a neutral face recognition system. A system for the recognition of faces with one type of expression (smile) and neutral faces was implemented and tested on a database of 30 subjects. The results proved the feasibility of this framework.
Index Terms- face recognition, databases, neutral face, smiling face, image acquisition.
I. INTRODUCTION
Most face recognition attempts have used 2D intensity images as the data format for processing. In spite of the success achieved by 2D recognition methods, certain problems still exist. 2D face images depend not only on the face of a subject but also on imaging factors, such as the environmental illumination and the orientation of the subject. These variable factors can cause the failure of a 2D face recognition system. With the advancement of 3D imaging technology, more attention is being given to 3D face recognition, which is robust with respect to illumination variation and pose orientation. In [1], Bowyer et al. provide a survey of 3D face recognition technology. Most 3D face recognition systems treat the 3D face surface as a rigid surface. In reality, the face surface is deformed by the different expressions of the subject, which causes the failure of systems that treat the face as rigid. The involvement of facial expression has become a big challenge in 3D face recognition systems. In this paper, we propose an approach to tackle this problem through the integration of expression recognition and face recognition in one system.
II. EXPRESSION AND FACE RECOGNITION
From the psychological point of view, it is still not known whether facial expression recognition information aids the recognition of faces by human beings. It has been found that people are slower in identifying happy and angry faces than they are in identifying faces with a neutral expression. The proposed framework involves an initial assessment of the expression of an unknown face, and uses that assessment to assist the progress of its recognition. The incoming 3D range image is processed by an expression recognition system to find the most appropriate expression label for it. The expression labels include the six prototypical expressions of the face, which are happiness, sadness, anger, fear, surprise, and disgust, plus the neutral expression. According to the expression, a matching face recognition system is then applied. If the expression is recognized as neutral, the incoming 3D range image is passed directly to the neutral expression face recognition system, which matches the features of the probe image directly against those of the gallery images, which are all neutral, to get the closest match. If the expression found is not neutral, then for each of the six expressions a separate face recognition subsystem is used. The system finds the right face by modelling the variations of the face features between the neutral face and the face with expression. Figure 1 shows a simplified version of this framework. This simplified diagram only deals with the smiling expression, which is the one most commonly displayed by people publicly.
III. DATA ACQUISITION AND PROCESSING
To test the proposed approach, a database of 30 subjects was built. In this database, we test the different processing of the two most common expressions, i.e., smiling versus neutral. Each subject participated in two sessions of the data acquisition process, which took place on two different days. In each session, two 3D scans were acquired with a Polhemus FastSCAN scanner: one with a neutral expression, the other with a happy (smiling) expression. The resulting database contains 60 3D neutral scans and 60 3D smiling scans of the 30 subjects.
Figure 1- Simplified framework of 3D face recognition
The left image in Figure 2 shows an example of the 3D scans obtained using this scanner, the right image is the 2.5D range image used in the algorithm.
Figure 2- 3D surface (left) and a mesh plot of the converted range image (right)
IV. EXPRESSION RECOGNITION
Facial expression is a basic mode of nonverbal communication among people. In [5], Ekman and Friesen proposed six primary emotions, each possessing a distinctive content together with a unique facial expression. These six emotions are happiness, sadness, fear, disgust, surprise, and anger. Together with the neutral expression, they form the seven basic prototypical facial expressions.
In our experiment, we aim to recognize social smiles, which were posed by each subject. Smiling is generated by contraction of the zygomatic major muscle. This muscle lifts the corner of the mouth obliquely upwards and laterally, producing a characteristic "smiling expression". So the most distinctive features associated with the smile are the bulging of the cheek muscle and the uplift of the corners of the mouth, as shown in Figure 3. The following steps are followed to extract six representative features for the smiling expression:-
1. An algorithm is developed to obtain the coordinates of five characteristic points in the face range image as shown in Figure 3. A and D are the extreme points of the base of the nose. B and E are the points defined by the corners of the mouth. C is in the middle of the lower lip.
Figure 3- Illustration of features of a smiling face versus a neutral face
2. The first feature is the width of the mouth, BE, normalized by the length of AD. Obviously, while smiling the mouth becomes wider. The first feature is represented by mw.
3. The second feature is the depth of the mouth (the differences between the Z coordinates of points B and C, and of points E and C), normalized by the height of the nose, to capture the fact that the smiling expression pulls back the mouth. This second feature is represented by md.
4. The third feature is the uplift of the corners of the mouth compared with the middle of the lower lip, d1 and d2 as shown in the figure, normalized by the differences of the Y coordinates of points A and B, and of points D and E, respectively, and represented by lc.
5. The fourth feature is the angle of line AB and line DE with the central vertical profile, represented by ag.
6. The last two features are extracted from the semicircular areas shown, which are defined by using line AB and line DE as diameters. The histograms of the range (Z coordinates) of all the points within these two semicircles are calculated.
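Steps 2 and 3 above can be sketched from the five landmark points; the function and coordinate conventions below are hypothetical illustrations, not the paper's code:

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def smile_features(A, B, C, D, E, nose_height):
    """A, D: extreme points of the base of the nose; B, E: mouth corners;
    C: middle of the lower lip. Points are (x, y, z) tuples.
    Returns the hypothetical (mw, md) features of steps 2 and 3."""
    mw = dist(B, E) / dist(A, D)                        # normalized mouth width
    md = ((B[2] - C[2]) + (E[2] - C[2])) / nose_height  # normalized mouth depth
    return mw, md
```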
Figure 4 shows the histograms for the smiling and the neutral faces of the subject in Figure 3. The two figures in the first row are the histograms of the range values for the left and right cheek of the neutral face image; the two figures in the second row are the histograms of the range values for the left and right cheek of the smiling face image.
Figure 4- Histogram of range of cheeks (L &R) for neutral (top row), and smiling (bottom row) face.
From the above figures, we can see that the range histograms of the neutral and smiling expressions are different. The smiling face tends to have large values at the high end of the histogram because of the bulge of the cheek muscle. On the other hand, a neutral face has large values at the low end of the histogram distribution. Therefore two features can be obtained from the histogram.
One is called the ‘histogram ratio’, represented by hr, the other is called the ‘histogram maximum’, represented by hm.
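A minimal sketch of these two histogram features follows; the paper does not give exact formulas, so the definitions below (hm as the arg-max bin index, hr as the fraction of samples in the upper half of the range) are assumptions:

```python
import numpy as np

def histogram_features(z_values, bins=10):
    """Assumed definitions (the paper does not spell them out):
    hm: index of the bin holding the histogram maximum;
    hr: fraction of samples falling in the upper half of the range."""
    counts, _ = np.histogram(z_values, bins=bins)
    hm = int(np.argmax(counts))
    hr = counts[bins // 2:].sum() / counts.sum()
    return hr, hm
```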
After the six features have been extracted, this becomes a general classification problem. Two pattern classification methods are applied to recognize the expression of the incoming faces. The first method used is a linear discriminant analysis (LDA) classifier, which seeks the best set of features to separate the classes. The other method used is a support vector machine (SVM).
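A minimal stand-in for the LDA stage, using a hand-rolled Fisher discriminant on synthetic six-dimensional feature vectors (the data and class separation below are made up for illustration; they are not the paper's features):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 6-dimensional feature vectors (standing in for mw, md, lc, ag, hr, hm).
neutral = rng.normal(0.0, 0.1, size=(60, 6))
smiling = rng.normal(0.5, 0.1, size=(60, 6))

# Fisher linear discriminant: project onto w = Sw^-1 (mu1 - mu0) and
# threshold at the midpoint of the projected class means.
mu0, mu1 = neutral.mean(axis=0), smiling.mean(axis=0)
Sw = np.cov(neutral, rowvar=False) + np.cov(smiling, rowvar=False)
w = np.linalg.solve(Sw, mu1 - mu0)
threshold = (neutral @ w).mean() / 2 + (smiling @ w).mean() / 2
predict = lambda x: (x @ w > threshold).astype(int)  # 0 = neutral, 1 = smiling
```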
V. 3D FACE RECOGNITION
A. Neutral face recognition
In our earlier research work, we have found that the
central vertical profile and the contour are both discriminant features for every person. Therefore, for neutral face recognition, the results of central vertical profile matching and contour matching are combined. The combination of the two classifiers improves the overall performance significantly. The final similarity score for the probe image is the product of ranks for each of the two classifiers (based on the central vertical profile and contour). The image with the smallest score in the gallery will be chosen as the matching face for the probe image.
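The rank-product fusion described above can be sketched as follows; `profile_ranks` and `contour_ranks` are hypothetical rank tables (rank 1 = best match) for the two classifiers:

```python
# Rank-product fusion: each classifier ranks the gallery for a probe; the
# final score of a gallery face is the product of its two ranks, and the
# face with the smallest product is chosen as the match.
def fuse_by_rank_product(profile_ranks, contour_ranks):
    scores = {face: profile_ranks[face] * contour_ranks[face]
              for face in profile_ranks}
    return min(scores, key=scores.get)
```

For example, a gallery face ranked 1st by the profile matcher and 2nd by the contour matcher (product 2) beats one ranked 3rd and 1st (product 3).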
B. Smiling face recognition
For the recognition of smiling faces we have adopted the probabilistic subspace method proposed by B. Moghaddam et al. [8, 9]. It is an unsupervised technique for visual learning, based on density estimation in high-dimensional spaces using eigendecomposition. Using the probabilistic subspace method, a multi-class classification problem can be converted into a binary classification problem.
In the experiment for smiling face recognition, because of the limited number of subjects (30), the central vertical profile and the contour are not used directly as vectors in a high-dimensional subspace. Instead, they are down-sampled to a dimension of 17. The dimension of the difference-in-feature-space is set to 10, which contains approximately 97% of the total variance. The dimension of the difference-from-feature-space is 7.
In this case also, the results of central vertical profile matching and contour matching are combined, improving the overall performance. The final similarity score for the probe image is the product of ranks for each of the two classifiers. The image with the smallest score in the gallery will be chosen as the matching face for the probe image.
VI. EXPERIMENTS AND RESULTS
One gallery and three probe databases were used for evaluation. The gallery database has 30 neutral faces, one for each subject, recorded in the first data acquisition session. Three probe sets are formed as follows:
Probe set 1: 30 neutral faces acquired in the second session.
Probe set 2: 30 smiling faces acquired in the second session.
Probe set 3: 60 faces (probe set 1 and probe set 2).
Experiment 1: Testing the expression recognition module
The leave-one-out cross validation method is used to test the expression recognition classifier. Each time, the faces collected from 29 subjects in both data acquisition sessions are used to train the classifier, and the four faces of the remaining subject collected in both sessions are used to test it. Two classifiers are used: one is the linear discriminant classifier; the other is a support vector machine classifier. LDA tries to find the subspace that best discriminates different classes by maximizing the between-class scatter while minimizing the within-class scatter in the projective subspace. The support vector machine is a relatively new technique for classification. It relies on preprocessing the data to represent patterns in a dimension typically much higher than the original feature space. With an appropriate nonlinear mapping to a sufficiently high dimension, data from two categories can always be separated by a hyperplane.
Table 1- Expression recognition results

Method                        LDA     SVM
Expression recognition rate   90.8    92.5
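The leave-one-subject-out protocol used in Experiment 1 can be sketched as follows; `train_and_test` is a hypothetical callback standing in for training and scoring either classifier:

```python
# Leave-one-subject-out evaluation: for each subject, train on all other
# subjects' faces (from both sessions) and test on the held-out subject's faces.
def leave_one_subject_out(samples, train_and_test):
    """samples: dict subject_id -> list of that subject's face samples;
    train_and_test(train, test) -> number of correct predictions on test."""
    correct = total = 0
    for held_out in samples:
        train = [s for subj, faces in samples.items() if subj != held_out
                 for s in faces]
        test = samples[held_out]
        correct += train_and_test(train, test)
        total += len(test)
    return correct / total
```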
Experiment 2: Testing the neutral and smiling recognition modules separately
In the first two sub-experiments, probe faces are directly fed to the neutral face recognition module. In the third sub-experiment, leave-one-out cross validation is used to verify the performance of the smiling face recognition module.
a. Neutral face recognition: probe set 1 (neutral face recognition module used).
b. Neutral face recognition: probe set 2 (neutral face recognition module used).
c. Smiling face recognition: probe set 2 (smiling face recognition module used).
From Figure 5, it can be seen that when the incoming faces are all neutral, the algorithm which treats all the faces as neutral achieves a very high recognition rate.
Figure 5- Results of Experiment 2 (three sub-experiments)
On the other hand, if the incoming faces are smiling, then the neutral face recognition algorithm does not perform well: only a 57% rank-one recognition rate is obtained. (Rank one means only the face which scores highest is selected from the gallery; the rank-one recognition rate is the ratio between the number of faces correctly recognized and the number of probe faces. Rank three means the three highest-scoring faces are selected instead of one.) In contrast, when the smiling face recognition algorithm is used to deal with smiling faces, the recognition rate is as high as 80%.
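The rank-k recognition rate defined above can be sketched as:

```python
# A probe counts as recognized at rank k if its true identity is among the
# k best-scoring gallery faces for that probe.
def rank_k_rate(score_lists, true_ids, k):
    """score_lists[i]: gallery ids sorted best-first for probe i."""
    hits = sum(1 for ranked, truth in zip(score_lists, true_ids)
               if truth in ranked[:k])
    return hits / len(true_ids)
```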
Experiment 3: Testing a practical scenario
These experiments emulate a realistic situation in which a mixture of neutral and smiling faces (probe set 3) must be recognized. Sub-experiment 1 investigates the performance obtained if the expression recognition front end is bypassed, and the recognition of all the probe faces is attempted with the neutral face recognition module alone. The last two sub-experiments implement the full framework shown in Figure 1. In 3.2 the expression recognition is performed with the linear discriminant classifier, while in 3.3 it is implemented through the support vector machine approach.
a. Neutral face recognition module used alone: probe set 3 is used.
b. Integrated expression and face recognition: probe set 3 is used. (Linear discriminant classifier for expression recognition.)
c. Integrated expression and face recognition: probe set 3 is used. (Support vector machine for expression recognition.)
It can be seen in Figure 6 that if the incoming faces include both neutral faces and smiling faces, the recognition rate can be improved by about 10 percent by using the integrated framework proposed here.
CONCLUSION
The work reported in this paper represents an attempt to acknowledge and account for the presence of expression in 3D face images, towards their improved identification. The method introduced here is computationally efficient. Furthermore, this method also yields, as a secondary result, the information of the expression found in the faces. Based on these findings, we believe that the acknowledgement of the impact of expression on 3D face recognition and the development of systems that
account for it, such as the framework introduced here, will be keys to future enhancements in the field of 3D Automatic Face Recognition.
REFERENCES
[1] K. Bowyer, K. Chang, and P. Flynn, "A Survey of Approaches to 3D and Multi-Modal 3D+2D Face Recognition," IEEE Intl. Conf. on Pattern Recognition, 2004.
[2] R. Chellappa, C. Wilson, and S. Sirohey, "Human and Machine Recognition of Faces: A Survey," Proceedings of the IEEE, 1995, 83(5): pp. 705-740.
[3] www.polhemus.com.
[4] C. Li, A. Barreto, J. Zhai and C. Chin, "Exploring Face Recognition Using 3D Profiles and Contours," IEEE SoutheastCon 2005, Fort Lauderdale.
[5] P. Ekman, W. Friesen, "Constants across cultures in the face and emotion," Journal of Personality and Social Psychology, 1971, 17(2): pp. 124-129.
[6] Y. Hu, D. Jiang, S. Yan, L. Zhang, and H. Zhang, "Automatic 3D Reconstruction for Face Recognition," International Conference on Automatic Face and Gesture Recognition, Seoul, 2004.
[7] "Notre Dame 3D Face Database," http://www.nd.edu/~cvrl/.
[8] B. Moghaddam, A. Pentland, "Probabilistic Visual Learning for Object Detection," International Conference on Computer Vision (ICCV '95), 1995.
[9] B. Moghaddam, A. Pentland, "Probabilistic Visual Learning for Object Representation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 1997, 19(7): pp. 696-710.
Figure 6- Results of Experiment 3 (rank 1 and rank 3 recognition rates for sub-experiments a, b and c)
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0112-1
Performance Evaluation of Signal Selective DOA Tracking for Wideband Cyclostationary Sources

Sandeep Santosh(1), O. P. Sahu(2), Monika Aggarwal(3)
(1) Asst. Prof., Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra
(2) Associate Prof., Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra
(3) Associate Prof., Centre For Applied Research in Electronics (CARE)
III. STRUCTURE OF PROPOSED SCHEME

The green subimage is coded first and the non-green subimage follows, based on the green subimage as a reference. To reduce the spectral redundancy, the non-green subimage is processed in the color difference domain, whereas the green subimage is processed in the intensity domain. Both subimages are processed in raster-scan sequence with a context-matching based prediction technique to remove the spatial dependency. The prediction residue planes of the two subimages are then entropy encoded sequentially with our proposed realization scheme of adaptive Rice code.

IV. WORKING OF THE SCHEME

The proposed scheme mainly consists of two parts: prediction on the green plane and prediction on the non-green planes.

Prediction on the green plane

The green plane is raster scanned during the prediction and all prediction errors are recorded. When processing a particular green-plane sample g(i,j), the four nearest processed neighboring samples of g(i,j) form a candidate set. We can find the directions associated with the green pixels through the same process.

Fig 4: Prediction on the green plane

Let g(mk,nk) ∈ Φg(i,j) for k = 1, 2, 3, 4 be the four ranked candidates of sample g(i,j), such that D(Sg(i,j), Sg(mu,nu)) <= D(Sg(i,j), Sg(mv,nv)) for 1 <= u <= v <= 4.

If the direction of g(i,j) is identical to the directions of all green samples in Sg(i,j), pixel (i,j) is considered to be in a homogeneous region and the prediction of g(i,j) uses the weights {w1,w2,w3,w4} = {1, 0, 0, 0}. Otherwise, g(i,j) is in a heterogeneous region and its predicted value uses the weights {w1,w2,w3,w4} = {5/8, 2/8, 1/8, 0}.

Flow chart for prediction on the green plane

Adaptive color difference estimation for the non-green planes

When compressing the non-green color planes, color difference information is exploited to remove the color spectral dependency. Let c(m,n) be the intensity value at a non-green sampling position (m,n). The Green-Red (Green-Blue) color difference of pixel (m,n) is

d(m,n) = g'(m,n) - c(m,n)

where g'(m,n) is the estimated green component intensity value, obtained from the horizontal and vertical neighbor averages

GH = (g(m,n-1) + g(m,n+1))/2 and GV = (g(m-1,n) + g(m+1,n))/2

Prediction on the non-green planes

The color difference prediction of a non-green sample c(i,j) with color difference value d(i,j) uses the weights {w1, w2, w3, w4} = {4/8, 2/8, 1/8, 1/8}, where wk is the predictor coefficient and d(mk,nk) is the kth ranked candidate in Φc(i,j).

Compression scheme

The prediction error of pixel (i,j) in the CFA image, say e(i,j), is given by the difference between the actual and the predicted sample values, where g(i,j) and d(i,j) are, respectively, the real green sample value and the color difference value of pixel (i,j).

The error residue e(i,j) is then mapped to a nonnegative integer E(i,j) so as to reshape its value distribution from a Laplacian one to an exponential one.

The E(i,j)'s from the green sub-image are raster scanned and coded with Rice code first. Rice code is employed to code E(i,j) because of its simplicity and high efficiency in handling exponentially distributed sources. When Rice code is used, each mapped residue E(i,j) is split into a quotient Q and a remainder R, where the parameter k is a nonnegative integer; the quotient and remainder are then saved for storage and transmission. The length of the code word used for representing E(i,j) is k-dependent.

The parameter k is critical to the compression performance, as it determines the code length of E(i,j). The optimal parameter k is obtained using the golden ratio. For a geometric source with distribution parameter ρ, as long as the mean μ is known, the optimal coding parameter k for the whole source can be determined easily; μ is estimated adaptively in the course of encoding, both when coding E(i,j) of the green plane and when coding E(i,j) of the non-green planes.

Decoding process

Decoding is just the reverse of encoding. The green sub-image is decoded first, and then the non-green sub-image is decoded with the decoded green sub-image as a reference. The original CFA image is then reconstructed by combining the two sub-images.

Fig 5: Structure of decoder

Bitrate analysis

From the above figure, it can be seen that α = 1 provides a good compression performance. We assume the prediction residue is a local variable and estimate the mean of its value distribution adaptively; the divisor used to generate the Rice code is adjusted accordingly so as to improve the efficiency of the Rice code.

V. COMPRESSION PERFORMANCE

Simulations were carried out to evaluate the performance of the proposed compression scheme. 24-bit color images of size 512*768 were sub-sampled according to the Bayer pattern to form 8-bit testing CFA images. These images are directly coded by the proposed compression scheme for evaluation. Some representative lossless compression schemes, such as JPEG-LS, JPEG 2000 (lossless mode) and LCMI, were used for comparison of results.

Table I

          JPEG LS   JPEG 2000   Proposed
Image 1   5.467     5.039       4.803
Image 2   6.188     5.218       4.847
Image 3   6.828     4.525       3.847

If we alter the values of the weighting factors, we get improved results in terms of compression ratio and also reduce the bit rates of the CFA.

Table II

          Overall CFA Bit Rate (in bpp)   Compression Ratio
α = 0     4.9496                          1.6163
α = 0.6   4.8486                          1.6496
α = 0.8   4.8437                          1.6516
α = 1     4.8366                          1.6537

ADVANTAGES OF PROPOSED METHOD

We can reduce the spectral redundancy and at the same time obtain high quality images. The number of sensors in digital cameras is reduced from 3 to 1, and the design complexity is low. Compared with JPEG 2000, the scheme gives better performance.

VI. EXPERIMENTAL RESULTS
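The residue-mapping and Rice-coding steps described above can be sketched as follows; this is a minimal illustration of a standard Rice code, not the paper's exact adaptive realization:

```python
# Map a signed prediction residue to a nonnegative integer so that a roughly
# Laplacian distribution becomes a one-sided (exponential-like) one.
def map_residue(e):
    return 2 * e if e >= 0 else -2 * e - 1

# Standard Rice code with parameter k >= 1: the mapped residue E is split into
# a unary-coded quotient and a k-bit binary remainder; total length q + 1 + k.
def rice_encode(E, k):
    q, r = E >> k, E & ((1 << k) - 1)
    return "1" * q + "0" + format(r, "b").zfill(k)
```

For example, with k = 2 the mapped residue 9 splits into quotient 2 and remainder 1, giving the 5-bit codeword "11001".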
VII. CONCLUSION
The CFA image compression scheme encodes the sub-images separately with predictive coding. Lossless prediction is carried out in the intensity domain for the green plane, while it is carried out in the color difference domain for the non-green planes.
VIII. ACKNOWLEDGMENT
The first author expresses his gratitude to the remaining two authors towards the completion of this project.
SIP0201-1
OPTIMAL RECEIVER FILTER DESIGN

Vivek Kumar, Deptt. of Electronics Engg., IITM, Kanpur; 5/414, Avas Vikas, Farrukhabad
Dr. K. Raj, Deptt. of Electronics Engg., Harcourt Butler Technological Institute, Kanpur - 208002, India
Ajmer Singh, Student, Lovely Professional University (LPU), India
Nikesh Bajaj, Asst. Prof., ECE Dept. (LPU)
[email protected], [email protected]
Abstract- The Fractional Fourier Transform (FRFT) is the generalization of the classic Fourier Transform (FT). When dealing with time-varying signals, the FRFT is an important tool for their analysis. This paper contains the results for the variation of basic signals, such as the rectangular pulse, sine wave and Gaussian signal, in the Fractional Fourier Domain (FRFD). The correlation results between the FRFD signal and the time domain (TD) signal, and between the FRFT at the α-domain and the FRFT at the (α-1)-domain, are also shown and discussed. A graphical demonstration of the scaling property of the FRFT is also given.
Index Terms— FRFT, FRFD, Signal Processing, α-domain, Analysis FRFT, FRFT scaling property, α-domain’s correlation
I. INTRODUCTION
The FT is one of the most frequently used tools in signal analysis. However, the FT is not very suitable for dealing with signals whose frequency content changes with time, because of its assumption that the signal is stationary. A generalization of the FT was proposed by V. Namias in [1] and is known as the FRFT. The FRFT can also be described as performing a spectral rotation of the signal in the time-frequency plane as the parameter α varies. In recent years, the FRFT has been applied in many areas, such as solving differential equations [2], quantum mechanics [1], optical signal processing [6], time-variant filtering and multiplexing [3]-[5], and swept-frequency filters [6]. Several properties of the FRFT in signal analysis have been summarized in [6].
This paper is organized as follows. Section II covers the basic concept of the FRFT and discusses some of its properties. Section III analyzes different signals, namely the rectangular pulse, sine wave and Gaussian signal, together with the correlation results for these signals in the FRFD. Section IV concludes the paper.
II. BASIC CONCEPT OF FRFT
The FRFT with angle parameter α of a signal f(t) is defined as

F_α(u) = ∫ f(t) K_α(t, u) dt

where the kernel K_α(t, u) is

K_α(t, u) = sqrt((1 - j cot α)/(2π)) · exp(j ((t² + u²)/2) cot α - j t u csc α),  if α is not a multiple of π
K_α(t, u) = δ(t - u),  if α is a multiple of 2π
K_α(t, u) = δ(t + u),  if α + π is a multiple of 2π

F_α(u) is called the α-order FRFT of the signal f(t), where α = Aπ/2 and 'A' is a real number called the order of the FRFT. A lies in the interval [-2, 2] and can be extended to any real number according to A + 4k = A, where k is any integer [..., -3, -2, -1, 0, 1, 2, 3, ...].
Some basic properties of the FRFT are:
• Linearity.
• Zero rotation / time domain. When A = 0 or 4 (α = 0 or 2π), the FRFT operator corresponds to the identity operator: F_0(u) = f(u), where f(t) is the time domain signal.
• The FT is a special case of the FRFT. When A = 1 (α = π/2), the FRFT operator corresponds to the FT: F_{π/2}(u) = F(u), where F denotes the Fourier transform of the time domain signal f(t).
• Flipped operation / time inversion. When A = 2 (α = π), the FRFT operator corresponds to the flip operator: F_π(u) = f(-u), the flipped version of the input signal f(t).
• Inverse Fourier domain. When A = 3 (α = 3π/2), the FRFT operator corresponds to the inverse Fourier domain: F_{3π/2}(u) = F(-u), the flipped version of the Fourier transform of f(t).
The above properties of the FRFT are easily understood from figure 1.
Figure 1: Time- frequency plane for FRFT.
In this paper, we use the Digital Computation method of
the FRFT which is given in [7].
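As a small numerical sanity check of the flipped-operation property (A = 2, α = π), note that with the discrete Fourier transform, applying the transform twice returns the circularly time-reversed signal scaled by N, i.e. F{F{x}}[n] = N · x[(-n) mod N]. A sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=16)

# Apply the DFT twice and undo the scale factor N.
double_ft = np.fft.fft(np.fft.fft(x)) / len(x)

# Circular time reversal: index n maps to (-n) mod N.
flipped = x[(-np.arange(len(x))) % len(x)]
```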
III. ANALYSIS OF DIFFERENT SIGNALS
We always store our information or data in some memory space; that set of information or data is known as a signal. There are some basic signals, such as the rectangular pulse, sine wave and Gaussian signal, which are fundamental to signal processing. In signal processing, different transform techniques are used to analyze the frequency spectrum of signals, because the frequency spectrum tells more about the signal behavior than the time domain representation does.
The FRFT, moreover, describes the signal representation in both the time and frequency domains through the FRFT operator F_α(u): α = 0 gives the time domain representation and α = π/2 gives the frequency domain representation. Values 0 < α < π/2 give intermediate domains, known as α-domains. These domains do not give exact information about the time or frequency components, but give mixed information about both.
So, in this section we discuss the variation of some signals as α varies.
A. Analysis of Rectangular pulse in FRFD
The rectangular pulse (also known as the rectangle function, rectangular function, gate function, or unit pulse) is defined as:

rect(t) = 1 for |t| <= 1/2, and 0 otherwise.

And the FT of the rectangular function is the sinc function:

F{rect(t)}(u) = sinc(u) = sin(πu)/(πu)
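This transform pair can be checked numerically by approximating the Fourier integral of the unit pulse with a midpoint Riemann sum (a sketch, not part of the original paper's method):

```python
import numpy as np

def rect_ft(u, n=4000):
    """Approximate F(u) = integral of rect(t) * exp(-2*pi*i*u*t) dt,
    using n midpoints over the pulse support [-1/2, 1/2]."""
    t = (np.arange(n) + 0.5) / n - 0.5
    return np.sum(np.exp(-2j * np.pi * u * t)) / n
```

At u = 0 this gives the pulse area (1), at u = 1 the first sinc zero, and at u = 0.5 the value sinc(0.5) = 2/π.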
Now, let us discuss an example of the rectangular pulse in the FRFD and its results. Figure 2 shows the results
(a) A=0/α=0 (b) A=0.2/α=π/10
(c) A=0.4/α=π/5 (d) A=0.6/α=3π/10
(e) A=0.8/α=2π/5 (f) A=1/α=π/2
Figure 2: FRFT of the rectangular pulse for different values of angle α/A. Solid line: real part; dashed line: imaginary part.
for six different FRFD values, of which figure 2(a), for α = 0, shows the rectangular pulse in the time domain, and figure 2(f), for α = π/2, shows the spectrum of the rectangular pulse, that is, the sinc function; the remaining four panels show the FRFT results for the rectangular pulse at α = π/10, π/5, 3π/10, and 2π/5.
The two FRFDs for the rectangular pulse at α = 0 and α = π/2 are the ordinary time and frequency domains, respectively. By looking at figures 2(a) to 2(e), anyone can easily understand how a rectangular pulse becomes a sinc function in the frequency domain, without any mathematical expression. We can also see how much these domains are correlated to each other, though not the actual value of the correlation coefficient. To analyze this, figure 3 contains two graphs: the first shows the normalized correlation of the α-domain signal to the time domain signal, and the second shows the normalized correlation of the α-domain to the (α-1)-domain. For better resolution we take 90 domains at 90 different values of A between 0 < A < 1.
In figures 3(a) and 3(b), at α = 0 the correlation coefficient has its maximum value of 1. This confirms that the FRFT at α = 0 gives the actual time domain signal, i.e. no rotation. But when there is a small change of 1° (one degree) in the α value, the correlation coefficient drops to a minimum value clearly different from that of the time domain signal, though still correlated up to 95%, and so on. In figure 3(b) we can see that when 1° < α < 45° the α-domain signal is highly correlated to the previous α-domain; a similar result holds for 45° < α < 90°.
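The normalized correlation measure plotted in Figure 3 can be sketched as follows; the paper does not spell out the exact normalization, so this assumes the peak cross-correlation magnitude divided by the product of the signal norms:

```python
import numpy as np

def max_norm_corr(a, b):
    """Peak of the full cross-correlation of a and b, normalized so that
    identical signals give exactly 1.0."""
    c = np.correlate(a, b, mode="full")
    return np.max(np.abs(c)) / (np.linalg.norm(a) * np.linalg.norm(b))
```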
(a)
(b)
Figure 3: Correlation results for rectangular pulse.
B. Analysis of Sine wave in FRFD
The sine wave or sinusoid is a mathematical function that describes a smooth repetitive oscillation. It occurs often in pure mathematics, as well as in physics, signal processing,
electrical engineering and many other fields. Its most basic form, x(t), as a function of time t, is defined as:

x(t) = M sin(2πft + θ)

where M is the amplitude of the sine wave, f is the frequency, t is time, and θ is the phase, which specifies where in its cycle the oscillation begins at t = 0.
Now, let us discuss the results for the sine wave in the FRFD. Figure 4 shows the results for six values of α, of which figure 4(a), for α = 0, shows the sine wave in the time domain, and figure 4(f), for α = π/2, shows the spectrum of the sine wave, that is, an impulse function; the remaining four panels show the FRFT results for the sine wave at α = π/10, π/5, 3π/10, and 2π/5.
Similar to the results discussed in section 3(A), figure 4 shows six α-domains for the sine wave, of which two are identical to the ordinary time domain and frequency domain: figures 4(a) and 4(f), respectively. The remaining four figures, 4(b), 4(c), 4(d) and 4(e), show the results of the FRFT of the sine wave at different values of α. The correlation results for the sine wave in the α-domain with the TD signal, and with the (α-1)-domain signal, are shown in figures 5(a) and 5(b), respectively.
(a) A = 0, α = 0  (b) A = 0.2, α = π/10
(c) A = 0.4, α = π/5  (d) A = 0.6, α = 3π/10
(e) A = 0.8, α = 2π/5  (f) A = 1, α = π/2
Figure 4: FRFT of the sine wave for different values of the angle α (A = 2α/π). Solid line: real part; dashed line: imaginary part.
Figure 5(a) makes clear that for 1° < α < 10° the α-domain signal for the sine wave is somewhat correlated with the TD signal, but for 10° < α < 90° it is not: in these domains the correlation coefficient tends to zero.
Figure 5(b) shows the correlation results for each α-domain against the (α-1)-domain. For 1° < α < 90° these domains are equally correlated with each other, though only weakly correlated with the TD signal.
Figure 5: Correlation results for the sine wave: (a) α-domain signal vs. time-domain signal; (b) α-domain signal vs. (α-1)-domain signal.
C. Analysis of Gaussian Signal in FRFD

A Gaussian signal has a bell-shaped curve. Gaussian tuning curves are used extensively because their analytical expression is easily manipulated in mathematical derivations. Mathematically, the Gaussian signal is defined as:

x(t) = e^(-t²)

Having discussed the rectangular pulse and the sine wave in sections 3(A) and 3(B), the third signal of interest is the Gaussian. Gaussian functions are widely used in statistics, where they describe normal distributions, and in signal processing, where they define Gaussian filters, among many other applications.
Finally, we take the Gaussian signal as an example and compute its FRFT for analysis in the FRFDs. Figure 6 shows six different FRFDs for the Gaussian signal, of which two are again identical to the TD and FD, while the remaining four are intermediate domains between TD and FD.
For a Gaussian signal the Fourier transform is again a Gaussian. Looking from figure 6(a) to 6(f), the variation from TD to FD is easy to follow. Our point of interest is how the FRFD signals are correlated with each other; for this, figure 7 gives two plots, showing the correlation of the α-domain signal with the TD signal and with the (α-1)-domain signal in figures 7(a) and 7(b) respectively.
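The statement that the Fourier transform of a Gaussian is again a Gaussian can be checked numerically; the sampling grid below is an illustrative choice:

```python
import numpy as np

# x(t) = exp(-t^2); its magnitude spectrum is again bell-shaped
t = np.linspace(-10, 10, 1024)
x = np.exp(-t ** 2)
X = np.abs(np.fft.fftshift(np.fft.fft(x)))
X /= X.max()
# after fftshift the spectrum peaks at the centre bin and falls off
# symmetrically, mirroring the Gaussian shape seen in figure 6(f)
```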
(a) A = 0, α = 0  (b) A = 0.2, α = π/10
(c) A = 0.4, α = π/5  (d) A = 0.6, α = 3π/10
(e) A = 0.8, α = 2π/5  (f) A = 1, α = π/2
Figure 6: FRFT of the Gaussian signal for different values of the angle α (A = 2α/π). Solid line: real part; dashed line: imaginary part.
Figure 7: Correlation results for the Gaussian signal: (a) α-domain signal vs. time-domain signal; (b) α-domain signal vs. (α-1)-domain signal.
Analyzing these three signals in the FRFDs makes it clear that each α-domain signal is highly correlated with the (α-1)-domain signal, as figures 3(b), 5(b), and 7(b) show: over the interval 1° < α < 90° the signals are similar to each other. Looking at figures 2(b) to 2(e), 4(b) to 4(e), and 6(b) to 6(e), we can see that each FRFD signal is essentially a scaled version of the previous FRFD signal.
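This chain of gradually rotating domains can be reproduced numerically with one common (though not the only) discrete FRFT definition: the fractional power of the unitary DFT matrix. This is a sketch of the idea, not the computation method of the paper (which follows the algorithm of [7]):

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

N = 64
n = np.arange(N)
# unitary DFT matrix: applying F is the alpha = pi/2 rotation
F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

# the "half-way" domain between time and frequency (alpha = pi/4)
F_half = fractional_matrix_power(F, 0.5)
# applying the half transform twice reproduces the ordinary DFT,
# consistent with the FRFT's additivity of rotation angles
```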
IV. CONCLUSION
We have discussed the FRFT concept and some of its properties, examined the behavior of three different signals in the FRFD, and presented those signals in the FRFD. Correlation was computed between each α-domain signal and both the TD signal and the (α-1)-domain signal, showing that each α-domain signal is just a scaled version of the previous α-domain signal. This graphically demonstrates the scaling property of the FRFT discussed in [6].
The work presented in this paper is a useful basis for further research, and the graphical demonstration of the scaling property of the FRFT helps in understanding how the FRFT transforms a time-domain signal into a frequency-domain signal.
REFERENCES
[1] V. Namias, “The fractional order Fourier transform and its application to quantum mechanics,” J. Inst. Math. Applicat., vol. 25, pp. 241–265, 1980.
[2] A. C. McBride and F. H. Kerr, “On Namias’ fractional Fourier transforms,” IMA J. Appl. Math., vol. 39, pp. 159–175, 1987.
[3] H. M. Ozaktas, B. Barshan, D. Mendlovic, and L. Onural, “Convolution, filtering, and multiplexing in fractional Fourier domains and their relationship to chirp and wavelet transforms,” J. Opt. Soc. Amer. A, vol. 11, pp. 547–559, Feb. 1994.
[4] R. G. Dorsch, A. W. Lohmann, Y. Bitran, and D. Mendlovic, “Chirp filtering in the fractional Fourier domain,” Appl. Opt., vol. 33, pp. 7599–7602, 1994.
[5] A. W. Lohmann and B. H. Soffer, “Relationships between the Radon–Wigner and fractional Fourier transforms,” J. Opt. Soc. Amer. A, vol. 11, pp. 1798–1801, June 1994.
[6] L. B. Almeida, “The fractional Fourier transform and time-frequency representation,” IEEE Trans. Signal Processing, vol. 42, pp. 3084–3091, Nov. 1994.
[7] Haldun M. Ozaktas, Orhan Arikan, M. A. Kutay and G. Bozdag, “Digital Computation of the Fractional Fourier Transform”, IEEE Trans. Signal Processing vol. 44, pp. 2141-2150, Sept. 1996.
Ajmer Singh (M'22) was born in Punjab, India. He is pursuing a master's degree in signal processing at Lovely Professional University, Punjab, India (2011), and is currently doing his dissertation under the supervision of Mr. Nikesh Bajaj, Assistant Professor in the Electronics Department. His research interests include various aspects of FRFD filter design.
Nikesh Bajaj received his bachelor's degree in Electronics & Telecommunication from the Institute of Electronics and Telecommunication Engineers, and his master's degree in Communication & Information Systems from Aligarh Muslim University, India. He is currently an Assistant Professor in the Department of ECE at LPU. His research interests include cryptography, cryptanalysis, and signal and image processing.
Parzen-Cos6 (πt) combinational window family based QMF bank
Narendra Singh (*)
and Rajiv Saxena,
Jaypee University of Engineering and Technology, Raghogarh, Guna (MP)
The FIR filter of a two-channel Quadrature Mirror Filter (QMF) bank is introduced. Three variable windows, viz. the Blackman window, the Kaiser window, and the Parzen-cos6(πt) (PC6) window, are used to design prototype filters. The design equations of the filter banks based on these window functions are also given in this article. The reconstruction error, used as the objective function, is minimized by optimizing the cutoff frequency of the designed prototype filters; a gradient-based iterative optimization algorithm is used. The performances of the filter banks designed with these window functions are compared on the basis of reconstruction error. The combinational PC6 window provides the QMF bank with the better reconstruction error.

Keywords: QMF, Filter Bank, Combinational Window.
1. INTRODUCTION
Window functions are widely used in digital signal processing for applications in signal analysis and estimation, digital filter design, and speech processing. Digital filter banks are used in a number of communication applications. The theory and design of the QMF bank was first introduced by Johnston [1]. These filter banks find wide application in many signal processing fields, such as trans-multiplexers [2]-[3], equalization of wireless communication channels [4], sub-band coding of speech and image signals [5]-[8], and sub-band acoustic echo cancellation [9]-[12].

In a QMF bank the input signal x(n) is split into two sub-band signals of equal bandwidth by the low-pass and high-pass analysis filters H0(z) and H1(z) respectively. These sub-band signals are down-sampled by a factor of two to reduce processing complexity. At the output, the corresponding synthesis bank applies a two-fold interpolator to both sub-band signals, followed by the synthesis filters G0(z) and G1(z), whose outputs are combined to obtain the reconstructed signal y(n). The reconstructed output is not a perfect replica of the input x(n), owing to three types of error: aliasing error, amplitude error, and phase error [12]-[13]. Since the inception of QMF banks, most researchers have concentrated on eliminating or minimizing these errors to obtain near-perfect reconstruction (NPR). In several design methods [14]-[18], aliasing and phase distortion are eliminated completely by deriving all the analysis and synthesis filters from a single low-pass prototype: an even-order, symmetric, FIR linear-phase filter. Amplitude distortion cannot be eliminated completely, but it can be minimized using optimization techniques [12]-[13]. Figure 1 shows the two-channel quadrature mirror filter bank designed by Johnston [1], in which a Hanning window was used to design the low-pass prototype FIR filter and a nonlinear optimization technique was employed to minimize the reconstruction error.
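The analysis/down-sample/up-sample/synthesis structure described above can be sketched with the simplest perfect-reconstruction filter pair, the two-tap Haar QMF. This is an illustrative stand-in, not the Hanning-window prototype of Johnston [1]:

```python
import numpy as np

def analysis(x, h0, h1):
    """Split x into two sub-bands and down-sample each by two."""
    return np.convolve(x, h0)[::2], np.convolve(x, h1)[::2]

def synthesis(y0, y1, g0, g1):
    """Up-sample both sub-bands by two, filter, and recombine."""
    u0 = np.zeros(2 * len(y0)); u0[::2] = y0
    u1 = np.zeros(2 * len(y1)); u1[::2] = y1
    return np.convolve(u0, g0) + np.convolve(u1, g1)

s = 1 / np.sqrt(2)
h0, h1 = np.array([s, s]), np.array([s, -s])   # analysis: low-pass / high-pass
g0, g1 = np.array([s, s]), np.array([-s, s])   # matching synthesis pair

x = np.random.default_rng(0).standard_normal(64)
y = synthesis(*analysis(x, h0, h1), g0, g1)
# y reproduces x exactly, delayed by one sample:
# no aliasing, amplitude, or phase error for this particular pair
```

Longer window-designed prototypes only approximate this identity, which is why the reconstruction error becomes the objective function to minimize.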
This paper uses the algorithm proposed by Creusere and Mitra [6], with certain modifications, to optimize the objective function. Combinational window functions [19]-[21] with large SLFOR have been devised and used for designing FIR prototype filters. Due to the closed-form expressions of the
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0333-1
Abstract— Due to changing lifestyle trends, there is an increasing risk of people developing cardiac disorders. This is the impetus for developing a system that can diagnose a cardiac disorder along with the patient's risk level, so that effective medication can be started in the initial stages. This paper enables comprehensive diagnosis of a patient without the doctor being in the same geographical location, which is advantageous for deployment in villages where doctors are not easily accessible. In this paper, the atrial rate, ventricular rate, QRS width, and PR interval are extracted from the ECG signal, so that the arrhythmia disorders sinus tachycardia (ST), supra-ventricular tachycardia (SVT), ventricular tachycardia (VT), junctional tachycardia (JT), and ventricular and atrial fibrillation (VF and AF) are diagnosed with their respective risk levels; the system thus acts as a risk analyzer, indicating how prone the subject is to arrhythmia. LabVIEW Signal Express is used to read the ECG, and for analysis this information is passed to the fuzzy module, in which various "if-then" rules have been framed to identify the risk level of the patient. The extracted information is then published from the server to the client using an online publishing tool. After the report developed by the system is passed to the doctor, he or she can send medical advice back to the server, i.e., the system where the patient's ECG is extracted and analyzed.

Index Terms— LabVIEW; arrhythmia: sinus tachycardia (ST), supra-ventricular tachycardia (SVT), ventricular tachycardia (VT), junctional tachycardia (JT), ventricular and atrial fibrillation (VF and AF); online publishing tool; QRS width; atrial rate; ventricular rate
I. INTRODUCTION
According to the World Health Organization (WHO), heart disease and stroke kill around 17 million people a year, almost one-third of all deaths globally. By 2020, heart disease and stroke will have become the leading cause of both death and disability worldwide, so proper diagnosis of heart disease is clearly important for patients' survival. The electrocardiogram (ECG) is an important tool for diagnosing heart disease, but it has some drawbacks:
1) Special skill is required to administer and interpret the results of an ECG.
2) The cost of ECG equipment is high.
K. A. Sunitha(1), N. Senthil Kumar(2), K. Prema(3), Sandeep Kotikalapudi(4)
(1,3) Assistant Professor, Instrumentation and Control Engineering Department, SRM University,
3) Limited availability of ECG equipment.
Because of these drawbacks, telemedicine contacts were in the past mostly used for consultations between specialist telemedicine centres in hospitals and clinics. More recently, however, providers have begun to experiment with telemedicine contacts between health-care providers and patients at home, to monitor conditions such as chronic diseases [1].
LabVIEW (Laboratory Virtual Instrument Engineering Workbench) is a graphical programming environment suited to high-level or system-level design. A LabVIEW-based telemedicine system has been shown to have the following features:
1) It replaces multiple stand-alone devices at the cost of a single instrument using virtual instrumentation, and its functionality is expandable [2].
2) It facilitates the extraction of valuable diagnostic information using embedded advanced biomedical signal processing algorithms [2].
3) It can be connected to the internet to create an internet-based telemedicine infrastructure, which provides a comfortable way for physicians to communicate with friends, family, and colleagues [3].
Several systems have been developed for acquisition and analysis of the ECG using LabVIEW [4]-[8]. Some systems [5], [7], [8] also identify the cardiac disorder, but they lack both identification of the patient's risk level for the disorder and an online publishing system.
In this paper, we developed a program not only to access the patient's data but also to diagnose heart abnormalities, which can serve as a reference for the doctor or physician in deciding further procedures; this can be done from anywhere an internet connection is available. A fuzzy system is developed to identify the risk level of the patient. A fuzzy system is more accurate than a conventional controller because, instead of a condition being either true or false, a partially true case can also be expressed, so risk scores can be calculated accurately and exactly for the specific records of a person.
II. PROPOSED SYSTEM

Figure 1 shows the proposed fuzzy analyser with the online system.

Fig 1. Proposed system
The ECG waveforms are obtained from the MIT-BIH database. LabVIEW Signal Express is used to read and analyse the ECG and pass the information to the fuzzy module, in which various "if-then" rules have been written to identify the risk level of the patient. The extracted information is then published from the server to the client using different online publishing tools. After the information extracted from the ECG signal (atrial rate, ventricular rate, QRS width, and PR interval) is passed from the patient's system to the doctor's system, the doctor can send medical advice back to the server, i.e., the system where the patient's ECG is extracted and analyzed.
A. Internet-based System

The internet is used as a two-way vehicle to deliver the virtual medical instruments, the medical data, and the prescription from the doctor in real time. An internet-based telemedicine system is shown in fig. 2. This work involves an internet-based telemonitoring system, developed as an instance of the general client-server architecture presented in the figure.
The client-server architecture is defined as follows: the client application provides visualization, archiving, transmission, and contact facilities to the remote user (i.e., the patient). The server, which is
located at the physician's end, takes care of the incoming data and organizes patient sessions.
Fig.2 Internet based system
B. LabVIEW

LabVIEW is a graphical programming language developed by National Instruments. Programming with LabVIEW gives a vivid picture of data flow through its graphical block representation. LabVIEW is used here to acquire the ECG waveform and to analyse parameters such as the PR interval, QRS width, and heart rates, which are later passed to the fuzzy system.
LabVIEW offers a modular approach and parallel computing, which makes developing complex systems easier. Debugging tools such as probes and highlight execution are handy for analyzing where an error actually occurred.
C. Fuzzy System

Fuzzy controllers are widely employed because they are efficient when working with vague values. A fuzzy controller has a rule base in "IF-THEN" form, which is used to identify the risk level of the disease using the rule weights. A fuzzy system is shown in general form in Fig 3.

Fig 3. Fuzzy system
A. Fuzzification
In this system we consider the atrial and ventricular heart rates, the QRS complex width, and the PR interval values as the input linguistic variables, which are passed to the inference engine. Based on the rule base and these linguistic variables, the fuzzy system output is obtained.
B. Defuzzification
The defuzzified values are the risk levels (high risk, medium risk, low risk), obtained according to the weights of the fuzzy variables.
C. Relation between input and output variables
The relationship between the input and output variables is shown in the 3-dimensional plot of figure 4 below.
Fig 4. Relation between input and output
D. Fuzzy Rules
In this fuzzy system we use the centre-of-area method as the defuzzification method. The rule base of the fuzzy system consists of rules in "If-Then" form. The risk levels depend on how many conditions are met by the input variables for the respective cardiac disorder. Since there is no single rule for identifying an arrhythmia from heart rate alone, which can differ from patient to patient, this system is more accurate in determining the arrhythmia, as it is not based only on heart rate.
The fuzzy rule base acts like a database of rules for selecting the output based on the input quantities. Some of the rules are:
1. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '60,75' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'Medium Risk'
2. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '75,90' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'Medium Risk'
3. IF 'PR interval' IS 'Normal' AND 'vHR' IS '30,40' AND 'aHR' IS '90,100' THEN 'First Degree Block' IS 'No' ALSO 'Third Degree Block' IS 'High Risk'
4. IF 'vHR' IS '150,180' AND 'QRS Width' IS 'Narrow QRS' THEN 'Ventricular Tachycardia at' IS 'Low Risk' ALSO 'Junctional Tachycardia at' IS 'Low Risk' ALSO 'Supra Ventricular Tachy at' IS 'High Risk'
5. IF 'vHR' IS '180,210' AND 'QRS Width' IS 'Normal QRS' THEN 'Ventricular Tachycardia at' IS 'Low Risk' ALSO 'Junctional Tachycardia at' IS 'High Risk' ALSO 'Supra Ventricular Tachy at' IS 'Low Risk'
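Rule 4 above can be sketched in plain code with trapezoidal membership functions and min for AND. This is an illustrative sketch, not the LabVIEW fuzzy module; the membership boundaries and the QRS-width figures are assumed values:

```python
def trap(x, a, b, c, d):
    """Trapezoidal membership function rising over [a, b], falling over [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# illustrative membership functions (all boundary values are assumptions)
vhr_150_180 = lambda v: trap(v, 145, 150, 180, 185)   # beats per minute
qrs_narrow = lambda w: trap(w, 0.0, 0.0, 0.10, 0.12)  # width in seconds

def rule4(vhr, qrs):
    """Rule 4: vHR in 150-180 AND narrow QRS -> SVT at high risk."""
    strength = min(vhr_150_180(vhr), qrs_narrow(qrs))  # fuzzy AND
    return {"SVT": ("High Risk", strength)}

print(rule4(165, 0.08))
```

Defuzzification (here, centre of area over the output memberships) would then turn the rule strengths into a single risk score.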
In this manner, based on the PR interval, QRS width, and atrial and ventricular heart rates, a fuzzy system is developed to identify the cardiac disorder as well as its level of risk.
III. ONLINE PUBLISHING
One of the unique features of this system is its ability to publish the extracted information to the client, usually a doctor's computer, which helps in implementing a telediagnosis system. The doctor is able to see the diagnosis result along with the risk levels and then return further advice. Since the internet is used to pass the values to the doctor, this becomes immensely helpful for immediate action. It will cater to the needs of public health-care centres in rural areas where it is difficult to have cardiologists, and the system can also be used to assist the doctor in monitoring the patient's heart during surgery.
IV. RESULTS
This system is able to measure the arrhythmias accurately and also to publish the results online.

Fig 5. Block diagram for extracting the ECG waveform

The block diagram in Fig 5 performs the function of passing the HR value obtained from Signal Express to the fuzzy system.

Fig 6. Block diagram for calling the fuzzy system in LabVIEW
Figure 6 above shows the block diagram for risk-level detection: it shows how we call the fuzzy system into the main panel for diagnosis and risk-level indication.
Fig 7 shows the front panel developed from the fuzzy system, which is sent to the doctor using the web publishing tool for a second opinion. The system also has a database to save patient details such as name, age, sex, and symptoms, which can be used the next time.
Fig 7. Front panel
V. CONCLUSION
In this way we have developed, in LabVIEW, a fuzzy system with good accuracy in determining cardiac disorders and their risk levels compared with a conventional system, taking the atrial and ventricular heart rates, the QRS complex width, and the PR interval values as the input linguistic variables. The report is successfully sent to the doctor's system using the web publishing tool for a second opinion.
REFERENCES

[1] N. Noury and P. Pilichowski, "A telematic system tool for home health care," in Proc. 14th Annu. Int. Conf. IEEE EMBS, Paris, Oct. 1992, pp. 1175-1177.
[2] Zhenyu Guo and John C. Moulder, "An internet based telemedicine system," IEEE Transactions, 2000.
[3] Volodymyr Hrusha, Olexandr Osolinskiy, Pasquale Daponte, and Domenico Grimaldi, "Distributed web-based measurement system," in IEEE Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, 5-7, 2005.
1. Lina Zhang and Xinhua Jiang, "Acquisition and analysis system of the ECG signal based on LabVIEW."
2. Kevin P. Cohen, Willis J. Tompkins, Adrianus Djohan, John G. Webster, and Yu H. Hu, "QRS detection using a fuzzy neural network."
3. "Classification of ECG arrhythmias using type-2 fuzzy clustering neural network."
4. "Robust techniques for remote real-time arrhythmias classification system."
5. S. Zarei Mahmoodabadi, A. Ahmadian, M. D. Abolhassani, J. Alireazie, and P. Babyn, "ECG arrhythmia detection using fuzzy classifiers."
6. E. Chowdhury and L. C. Ludeman, "Discrimination of cardiac arrhythmias using a fuzzy rule-based method."
7. W. Zong and D. Jiang, "Automated ECG rhythm analysis using fuzzy reasoning."
8. Jodie Usher, Duncan Campbell, Jitu Vohra, and Jim Cameron, "Fuzzy classification of intra-cardiac arrhythmias."
Projected View & Novel Application of Context Based Image Retrieval Techniques
Abstract— Image searching has been one of the fascinating topics of advanced research since the 1990s. Rapid advancement in computer and network technologies, coupled with relatively cheap high-volume data storage devices, has brought tremendous growth in the amount of digital images, and the development of pattern recognition has likewise increased exponentially. Pattern recognition is the act of taking in raw data and classifying it into predefined categories using statistical and empirical methods. Content-based image retrieval (CBIR) is one of the widely used applications of pattern recognition, for finding images in vast, un-annotated image databases. In CBIR, images are indexed on the basis of low-level features, such as color, texture, and shape, which can be derived automatically from the visual content of the images. The paper discusses the techniques and algorithms used to extract these image features from the visual content of images, and the advancements that can be made using CBIR. Various similarity measures are used to identify closely associated patterns: these methods compute the distance between the features generated for different patterns, identify the closely related patterns, and return them as the result. This paper unfolds a novel application, using context-based image retrieval, for retrieving the detailed description of an image without knowing a single word about it, and proposes algorithms to create such a utility.

Keywords: Context Based Image Retrieval, Image Searching.
INTRODUCTION
The initial techniques used were based on textual annotation of images. Using text descriptions, images can be organized in topical or semantic hierarchies to facilitate easy navigation and browsing based on standard Boolean queries. Content-based image retrieval is one of the major approaches to image retrieval and has drawn significant attention in the past decade; it uses visual content to search images from large-scale image databases according to users' interests. Low-level image features such as color, texture, shape, and structure are extracted from images, and relevant images are retrieved based on the similarity of their image features. Examples of prominent systems are QBIC, Photobook, and NETRA. In this paper we discuss the different algorithms used to extract the different features of an image, the future advancement of context-based image retrieval techniques and how it can be beneficial in different fields, and futuristic approaches for realizing this technique in a more advanced way.
1. Image Retrieval
A recent study of the literature on image indexing and retrieval was conducted on the basis of 100 papers from Web of Science. Two major research approaches, text-based (description-based) and content-based, were identified; researchers in the information science community appear to focus on the text-based approach, while researchers in computer science focus on the content-based approach. Text-based image retrieval (TBIR) makes use of text descriptors to retrieve relevant images. Some recent studies have found that text descriptors such as time, location, events, objects,
formats, aboutness of image content, and topical terms are the most helpful to users. The advantage of this approach is that it enables widely approved text information retrieval systems to be used for visual retrieval systems.

1.1. Content-based image retrieval
In CBIR, the images are indexed by features derived directly from the images themselves. These features are always consistent with the image, and they are extracted and analyzed automatically by computer processing rather than by manual annotation. Owing to the difficulty of automatic object recognition, the information extracted from images in CBIR is rather low level: colors, textures, shapes, structure, and combinations of these. A number of representative generic CBIR systems have been developed in the last ten years, implemented in different environments; some are Web based while others are GUI-based applications. QBIC, Photobook, and NETRA are the most prominent examples.

QBIC was developed at the IBM Almaden Research Centre [1, 2, 3]. It is the first commercial CBIR application and plays an important role in the evolution of CBIR systems. The QBIC system supports the low-level image features of average color, color histogram, color layout, texture, and shape. Additionally, users can provide pictures or draw sketches as example images in a query, and the visual queries can be combined with textual keyword predicates.

Photobook [4], developed at the MIT Media Lab, is a tool for performing queries on image databases based on image content. It works by comparing features associated with images, not the images themselves; these features are in turn the parameter values of particular models fitted to each image. The models are commonly color, texture, and shape, though Photobook will work with features from any model. It is a set of interactive tools for searching and querying images, divided into three specialized systems that can also be used in combination: Appearance Photobook (face images), Texture Photobook, and Shape Photobook. Features are compared using one of a library of matching algorithms that Photobook provides, including Euclidean, Mahalanobis, divergence, vector-space angle, histogram, Fourier peak, and wavelet-tree distances, as well as any linear combination of these.

NETRA is a prototype image retrieval system developed at the University of California, Santa Barbara (UCSB) [5, 6]. NETRA supports features of color, texture, shape, and spatial information of segmented image regions for region-based search. Images are segmented into homogeneous regions and, using the region as the basic unit, users can submit queries based on features that combine regions of multiple images. For example, a user may compose a query such as "retrieve all images that contain regions having the color of a region of image A, the texture of a region of image B, and the shape of a region of image C."

1.1.1 Image features

One of the main foci in CBIR is the means of extracting features from images and evaluating the similarity between those features. Image features are the characteristics that describe the contents of an image; in this paper, image features are confined to visual features derived from an image directly. There have been extensive studies of various sorts of visual features. The simplest form of visual feature is based directly on the pixel values of the image; however, such features are very sensitive to noise and to brightness, hue, and saturation changes, and are not invariant to spatial transformations such as translation and rotation. As a result, CBIR systems based on raw pixel values do not generally give satisfactory results. Much of the research in this area has therefore emphasized computing useful characteristics from images using image processing and computer vision techniques. General-purpose features in CBIR have usually included text, color, texture, shape, and layout.

Color representations

The color histogram is the standard representation of the color feature in CBIR systems, initially investigated by Swain and Ballard. Histograms of intensity values are used to represent the color distribution. This captures the global chromatic information of an image and is invariant under translation and rotation about the view axis.
Despite changes in view, changes in scale, and occlusion, the histogram changes only slightly. A color histogram H(M) of an image M is a 1-D discrete function representing the probabilities of occurrence of colors in the image, typically defined as:

H(M) = [h_1, h_2, ..., h_n], with h_k = n_k / N, k = 1, 2, 3, ..., n [Equation 1]

where N is the number of pixels in image M and n_k is the number of pixels with image value k. The division normalizes the histogram so that:

h_1 + h_2 + ... + h_n = 1.0 [Equation 2]
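Equations 1 and 2 can be sketched directly; the random array below is a stand-in for a real image:

```python
import numpy as np

def color_histogram(img, n_bins=256):
    """Normalized intensity histogram: h_k = n_k / N (Equation 1)."""
    counts, _ = np.histogram(img, bins=n_bins, range=(0, n_bins))
    return counts / img.size

img = np.random.randint(0, 256, size=(64, 64))  # stand-in for a real image
h = color_histogram(img)
# Equation 2: the normalized bins sum to 1.0
```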
Texture representations
Many texture features have been investigated in the past, including the conventional pyramid-structured wavelet transform (PWT) features, tree-structured wavelet transform (TWT) features, the multi-resolution simultaneous autoregressive model (MR-SAR) features, and the Gabor wavelet features. Experiments have found that the Gabor features [7, 8] produce the best performance. The computation of Gabor features is given as follows. A two-dimensional Gabor function can be formulated as:

G(x, y) = (1 / (2*pi*sigma_x*sigma_y)) exp[ -(1/2)(x^2/sigma_x^2 + y^2/sigma_y^2) + 2*pi*j*W*x ]        [Equation 3]

A self-similar filter dictionary can be obtained from the mother Gabor wavelet G(x, y) by appropriate dilations and rotations of Eq. (3):

G_mn(x, y) = a^(-m) G(x', y')

where h = height of the image, w = width of the image, hside = (h-1)/2, wside = (w-1)/2, and

x' = a^(-m) [ (x - hside) cos(n*pi/K) + (y - wside) sin(n*pi/K) ]
y' = a^(-m) [ -(x - hside) sin(n*pi/K) + (y - wside) cos(n*pi/K) ]

with a > 1 and m, n integers. Given an image with luminance I(x, y), the Gabor decomposition is obtained by convolving the luminance with the Gabor wavelet and taking the magnitude of the result:

|W_mn(x, y)| = | I(x, y) * G_mn(x, y) |        [Equation 4]

The mean and standard deviation of the magnitudes of the transform coefficients are used to represent the texture feature for classification and retrieval purposes:
mu_mn = mean( |W_mn(x, y)| )        [Equation 5]
sigma_mn = std( |W_mn(x, y)| )        [Equation 6]

The Gabor feature vector is constructed by using mu_mn and sigma_mn as feature components:

f = [ mu_00, sigma_00, mu_01, sigma_01, ..., mu_(S-1)(K-1), sigma_(S-1)(K-1) ]

where S is the number of scales and K is the number of orientations.

Shape representations
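Returning to the texture features above, the Gabor filter-bank pipeline of Equations 3-6 can be sketched as follows. All filter parameters (kernel size, sigma, center frequency) are illustrative assumptions, not values from the paper:

```python
import numpy as np

def gabor_kernel(size, sigma, freq, theta):
    """Complex 2-D Gabor kernel: Gaussian envelope times a complex carrier.

    Illustrative parameterization, not the exact constants of Equation 3.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # rotate coordinates by theta, as in the x'/y' equations
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    return envelope * np.exp(2j * np.pi * freq * xr)

def gabor_features(image, scales=3, orientations=4, a=2.0):
    """Texture vector of mu_mn and sigma_mn over an S x K filter bank."""
    image = np.asarray(image, dtype=float)
    feats = []
    for m in range(scales):
        for n in range(orientations):
            g = gabor_kernel(size=15, sigma=3.0 * a**m,
                             freq=0.25 / a**m, theta=n * np.pi / orientations)
            # convolve via FFT, then take the magnitude (Equation 4)
            F = np.fft.fft2(image)
            G = np.fft.fft2(g, s=image.shape)
            mag = np.abs(np.fft.ifft2(F * G))
            feats.extend([mag.mean(), mag.std()])  # mu_mn, sigma_mn (Eqs. 5-6)
    return np.array(feats)
```

The resulting vector has 2*S*K components, one mean and one standard deviation per scale/orientation pair.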
APPLICATION BASED ON CONTENT-BASED IMAGE RETRIEVAL AND WORKING PROCEDURE
One future advancement of CBIR is to develop a platform on which a user uploads an image, the query processor calculates the distance between that image and the images in the database, and, according to the closeness of the images, the system shows the related results. Suppose, for example, that I am a newcomer to Egypt walking through the streets of Cairo. I see a monument and am eager to know about it, so I capture an image of it and upload it through an application on my mobile phone. The application processes the query image and shows as output detailed information about that monument. We can create desktop and mobile applications for this purpose. There are many GPL and closed-license projects on image retrieval; TinEye and GazoPa are among the most famous and effective image-search websites. These projects use different feature-extraction algorithms for content-based image retrieval. However, the search results provided by these websites are limited to other images: if we upload an image of some celebrity, we get other similar images of that celebrity, but nothing about the person. Here we give the concept of an application that works as a combination of TinEye and Wikipedia. To achieve this goal we design our web crawlers so that, whenever they index an image into the database, they also index the data related to that image using the meta tags and keywords produced by different algorithms applied to that page. A page may contain many words alongside a single image, so we must identify which words actually describe that image. To achieve this we follow the procedure described below:

(A) First, filter out all non-content words (prepositions, adjectives, etc.) from the whole text, then apply the following rules to assign priorities to the remaining words:
(I) Words in the metadata receive higher priority than other words on the page.
(II) Words in the top 3 or 4 lines (after filtration) receive higher priority.
(III) Words frequently repeated on the page receive higher priority.
(IV) Words in bold letters receive higher priority.
(B) We now have an image and the top-priority words from each page.
(C) The user uploads an image to search for related images and their description.
(D) Content-based image searching is performed to find the related images.
(E) After searching, the words are also collected along with the related images of the query image.
(F) One more filtering algorithm is applied to find the exact keyword related to the image: the frequency of each word is calculated across the different results.
(G) The top priority is assigned to the word with the highest frequency.
(H) That word is sent to Wikipedia, and the resulting description is shown along with the query image.
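The keyword-priority rules (A)(I)-(IV) above can be sketched as a simple scoring function. The numeric weights below are illustrative assumptions; the paper only orders the rules, it does not give values:

```python
import re
from collections import Counter

# Small illustrative stopword list for step (A); a real system would use a
# fuller list.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "and", "is", "for"}

def keyword_priorities(page_text, meta_words, bold_words, top_lines=4):
    """Rank candidate keywords for an image on a page using rules (I)-(IV)."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", page_text)
             if w.lower() not in STOPWORDS]          # step (A): filtration
    head = {w.lower() for line in page_text.splitlines()[:top_lines]
            for w in re.findall(r"[A-Za-z]+", line)} # rule (II): top lines
    meta = {m.lower() for m in meta_words}
    bold = {b.lower() for b in bold_words}
    freq = Counter(words)                            # rule (III): repetition
    scores = {}
    for w, f in freq.items():
        score = f                                    # frequency baseline
        if w in meta:
            score += 10                              # rule (I): metadata
        if w in head:
            score += 5                               # rule (II): top lines
        if w in bold:
            score += 3                               # rule (IV): bold text
        scores[w] = score
    return sorted(scores, key=scores.get, reverse=True)
```

The highest-ranked word would then be the one sent on to Wikipedia in step (H).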
CONCLUSION & FUTURE WORK
There are many methods for feature extraction in content-based image retrieval, and multiple comparison algorithms can be applied for more exact search results. Here we have discussed the color, texture, and shape representations used in content-based image retrieval. We have also discussed generating an application using CBIR that can play a vital role for the current generation. There is ample room for future advancement of content-based image retrieval: many applications can be built that play a vital role in different fields, and some visual abilities are still absent from current CBIR, leaving scope for work on topics such as perceptual organization and similarity between semantic concepts.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the ARYA Development and Research Center, ACEIT, Jaipur.
REFERENCES
[1] M. Flickner, H. Sawhney and W. Niblack, "Query by image and video content: the QBIC system," IEEE Computer, September 1995.
[2] J. Hafner, H.S. Sawhney, W. Equitz, M. Flickner and W. Niblack, "Efficient color histogram indexing for quadratic form distance functions," IEEE Transactions on Pattern Analysis and Machine Intelligence 17(7) (1995) 729-736.
[3] W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos and G. Taubin, "The QBIC project: querying images by content using colors, texture and shape," in W. Niblack (ed.), SPIE Proceedings Vol. 1908, Storage and Retrieval for Image and Video Databases, 2-3 February 1993, San Jose, California (SPIE, San Jose, 1993) 173-187.
[4] A. Pentland, R. Picard and S. Sclaroff, "Photobook: content-based manipulation of image databases," Storage and Retrieval for Image and Video Databases II, No. 2185, San Jose, CA, February 1994.
[5] W.Y. Ma and B.S. Manjunath, "NeTra: a toolbox for navigating large image databases," Multimedia Systems 7(3) (1999) 184-198.
[6] B.S. Manjunath and W.Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(8) (1996) 837-842.
[7] C.C. Chen and C.C. Chen, "Filtering methods for texture discrimination," Pattern Recognition Letters 20(8) (1999) 783-790.
[8] B.S. Manjunath and W.Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Transactions on Pattern Analysis and Machine Intelligence 18(8) (1996) 837-842.
Abstract--AUDIO+ is an electronic device that alters how a musical instrument or other audio source sounds, and can best be termed a "Digital Effect Processor". Some effects subtly "colour" a sound, while others transform it dramatically. Effects can be used during live performances (typically with keyboard, electric guitar or bass) or in the studio; faithful reproduction of the sound signal is heard when AUDIO+ is used in the audio line.
AUDIO+ has a unique ability to modify sound signals and make them soothing to the human ear. The device is provided with a control panel of "Volume", "Bass", "Treble" and "Balance" to make it suitable for ears sensitive to high- and low-frequency sound. AUDIO+ is an easy-to-use portable device with a single signal input/output port and an internal battery power supply.
Keywords: Digital audio players, Digital signal processors, Mixed analog digital integrated circuits, Digital filters, Equalizers, Digital controls.
I. INTRODUCTION
AUDIO+ is all about the musical sound box, which can take raw mp3 or mpeg data and process it digitally. What is interesting is that it can sample and play many sound formats, with sampling rates from 8 kHz to 96 kHz, which is more than enough to play any common sound format. It improves sound quality with a significant reduction of noise, plus Dolby sound effects.
II. SYSTEM DESCRIPTION
AUDIO+ is built around a combination of ICs from Texas Instruments and National Semiconductor. The DRV134 and INA2134 from Texas Instruments are used to design a circuit that enhances sound performance.
This project is supported by Associated Electronics Research Foundation.
Mr. Abhay Kumar is with the Associated Electronics Research Foundation, C-53, Phase-II, Noida (U.P.) as a Research Scholar (Phone: +919650109759, [email protected]).
Very low distortion, low noise, and wide bandwidth provide superior performance in high quality audio applications.
The LM1036 from National Semiconductor is a DC-controlled tone (bass/treble), volume and balance circuit for stereo applications in car radio, TV and audio systems. An additional control input allows loudness compensation to be easily applied.
III. DRV134
The DRV134 is a differential output amplifier that converts a single-ended input to a balanced output pair. These balanced audio drivers consist of high performance op amps with on-chip precision resistors. They are fully specified for high performance audio applications, including low distortion (0.0005% at 1 kHz). Wide output voltage swing and high output drive capability allow use in a wide variety of demanding applications. They easily drive the large capacitive loads associated with long audio cables. Laser-trimmed matched resistors provide optimum output common-mode rejection (typically 68 dB), especially when compared to circuits implemented with op amps and discrete precision resistors. In addition, high slew rate (15 V/μs) and fast settling time (2.5 μs to 0.01%) ensure excellent dynamic response. The DRV134 has excellent distortion characteristics; noise is below 0.003% throughout the audio frequency range under various output conditions. A gain of 6 dB is seen at the output of the differential amplifier.
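The benefit of a balanced driver/receiver pair like the DRV134/INA2134 can be illustrated numerically. The sketch below is a simplified signal model, not a simulation of the actual parts: the driver produces an anti-phase pair, the cable couples the same interference onto both legs, and the receiver's difference cancels it while doubling the signal (the 6 dB gain mentioned above):

```python
import numpy as np

# Simplified model of a balanced audio link (not a SPICE model of the
# DRV134/INA2134).
t = np.linspace(0, 1e-3, 1000)
signal = 0.5 * np.sin(2 * np.pi * 1000 * t)   # 1 kHz source tone

plus_leg = +signal                            # driver: non-inverted leg
minus_leg = -signal                           # driver: inverted leg

hum = 0.2 * np.sin(2 * np.pi * 50 * t)        # common-mode interference
plus_leg = plus_leg + hum                     # same hum couples onto
minus_leg = minus_leg + hum                   # both legs of the cable

received = plus_leg - minus_leg               # receiver takes the difference:
# the common-mode hum cancels and the signal is doubled (6 dB of gain)
```

In the real parts, the quality of this cancellation is what the common-mode rejection specification measures.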
Fig 1: Gain Vs Frequency graph for DRV134
IV. INA2134
The INA2134 comprises differential line receivers consisting of high performance op amps with on-chip precision resistors. They are fully specified for high performance audio applications and have excellent ac specifications, including low distortion (0.0005% at 1 kHz) and high slew rate (14 V/μs), assuring good dynamic response. In addition, wide output-voltage swing and high output drive capability allow use in a wide variety of demanding applications. The dual version features completely independent circuitry for lowest crosstalk and freedom from interaction, even when overdriven or overloaded. The INA2134 on-chip resistors are laser-trimmed for accurate gain and optimum common-mode rejection. It has unity gain.

The LM1036 gives the user the ability to control the components of the sound with multi-turn potentiometers; the graphs in Section V below illustrate the different control operations.
Fig 2: Gain Vs Frequency graph for INA2134
V. LM1036
The LM1036 has four control inputs that provide control of the bass, treble, balance and volume functions through application of DC voltages from a remote control system or, alternatively, from four potentiometers which may be biased from a zener-regulated supply. The LM1036 has the following features:

• Large volume control range, 75 dB typical
• Tone control, ±15 dB typical
• Channel separation, 75 dB typical
• Low distortion, 0.06% typical for an input level of 0.3 Vrms
• High signal-to-noise, 80 dB typical for an input level of 0.3 Vrms
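In a digital implementation, the same bass/volume/balance controls can be modeled in software. The sketch below applies a simple first-order low-shelf "bass" control, master volume, and balance to a stereo buffer; it is an illustrative analogue of the LM1036's functions, not a model of the chip itself, and the 250 Hz split frequency is an assumed value:

```python
import numpy as np

def first_order_lowpass(x, fs, fc):
    """One-pole low-pass used to split off the band the bass control acts on."""
    a = np.exp(-2 * np.pi * fc / fs)
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = (1 - a) * x[n] + a * (y[n - 1] if n else 0.0)
    return y

def tone_volume_balance(left, right, fs, bass_db=0.0, volume_db=0.0, balance=0.0):
    """Apply bass shelf, master volume, and balance (-1 = full left, +1 = full right)."""
    def bass(x):
        low = first_order_lowpass(x, fs, fc=250.0)  # 250 Hz split: assumed value
        gain = 10 ** (bass_db / 20)
        return (x - low) + gain * low               # boost/cut only the low band
    vol = 10 ** (volume_db / 20)
    lgain = vol * min(1.0, 1.0 - balance)
    rgain = vol * min(1.0, 1.0 + balance)
    return bass(left) * lgain, bass(right) * rgain
```

With all controls at their neutral settings (0 dB, centered balance), the audio passes through unchanged, mirroring the "flat" position of the analog potentiometers.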
Fig 3: Volume control LM1036
Fig 4: Tone control LM1036
Fig 5: Balance control LM1036
VI. DRV 134 SIMULATION
[Plot: output voltage (V) vs. input voltage (V), sweep 0.00-50.00 V, output range -3.00 to +3.00 V]
Fig 8: DC analysis of DRV 134
Fig 8 shows how the input to the DRV134 is balanced and how the input line can be modulated.
Fig 6: TINA-TI simulation window for DRV134
The above result shows how a circuit for the DRV134 can be built in TINA-TI software. The input to the circuit has to be in the range of 8 kHz to 96 kHz, and the input voltage should be 200 mVrms to 2 Vrms. The result can be judged by taking the voltages at VM1 and VM2. The output is balanced because the DRV134 acts as a balanced modulator.
[Plot: output noise (V/√Hz) vs. frequency (Hz), 1 Hz-1 MHz, noise falling from roughly 30 μV/√Hz toward zero]
Fig 7: Noise analysis of DRV 134
The above figure shows the noise analysis of the DRV134 circuit. The noise significantly reduces as the frequency increases.
VII. SIMULATION OF DRV 134 WITH INA 137

Fig 9: TINA-TI simulation window for DRV134 and INA 137
The above diagram shows how the balanced output can be amplified and two channels can be made using the INA137 (gain = 1/2) and INA134 (gain = 1).
[Plot: output noise (V/√Hz) vs. frequency (Hz), 1 Hz-1 MHz, noise in the 100-400 nV/√Hz range]
Fig 10: Analysis of DRV 134 with INA 137
The above graph shows how the noise is significantly reduced after the introduction of the INA137 or INA134: the input signal is balanced and amplified, reducing the noise effect to the desired level.
[Plot: output voltage (V) vs. input voltage (V), sweep 0.00-100.00 V, output range -3.00 to +3.00 V]
Fig 11: DC analysis of DRV 134 with INA 137
Fig 11 above shows an output voltage range of 200 mVrms to 2 Vrms at sampling frequencies of 8 kHz to 96 kHz.
VIII. CONCLUSION
AUDIO+ maintains the originality of five major components of sound signals:

a. Pitch: the frequency of sound signals. Low frequencies (bass) make the sound powerful; midrange frequencies give sound its energy (human beings are most sensitive to midrange frequencies); high frequencies (treble) give sound its presence and life-like quality and let us feel that we are close to the sound source.

b. Timbre: the unique combination of fundamental frequency, harmonics, and overtones that gives each voice, musical instrument, and sound effect its unique colouring and character.

c. Harmonics: when an object vibrates, it propagates sound waves of a certain frequency. This frequency, in turn, sets in motion frequency waves called harmonics.

d. Loudness: the loudness of a sound depends on the intensity of the sound stimulus.

e. Rhythm: a recurring sound that alternates between strong and weak elements.

In combination with all the components of sound listed above, AUDIO+ concentrates on the high frequencies with 6 dB overall gain and preserves the presence of the original reproduction of sound; it is therefore most useful for high-quality audio systems and long-distance telephone calls.
IX. FUTURE WORK
AUDIO+ has great advantages in audio systems and audio communication, which creates an opportunity to use it in digital communication and VoIP phones.
X. REFERENCES
1) Software support and information about digital speakers from Texas Instruments (www.TI.com)
2) Audio: www.ti.com/audio
3) Data converters: dataconverter.ti.com
4) DSP: dsp.ti.com
5) Digital control: www.ti.com/digitalcontrol
6) Clocks and timers: www.ti.com/clocks
7) Logic: logic.ti.com
8) Power management: power.ti.com
9) Microcontrollers: microcontroller.ti.com
10) Hardware support from Farnell India (http://in.farnell.com/)
11) Audio codec: www.ti.com/tlv320aic3101.pdf
12) Audio digital processor: www.ti.com/tas3103.pdf
13) Audio line driver: www.ti.com/drv134.pdf
14) Input amplifier: www.ti.com/ina2134.pdf
15) Voltage regulators: www.ti.com/tps62007.pdf, www.ti.com/tps74801.pdf, www.ti.com/tps74701.pdf
16) Control IC: www.national.com
Speaker Identification
Prerana & Aditi Choudhary
Abstract-Humans use voice recognition every day to distinguish between speakers and genders; other animals use voice recognition to differentiate among sound sources. Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. This technique makes it possible to use the speaker's voice to verify their identity and control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers.

Speaker identification has been a wide and attractive area of research, and many works based on speech features have been proposed. A speaker recognition system has three important components: the feature extraction component, the speaker models, and the matching algorithm.

The speech signal conveys information about the identity of the speaker, and the area of speaker identification is concerned with extracting the identity of the person speaking the utterance. As speech interaction with computers becomes more pervasive in activities such as telephone use, financial transactions, and information retrieval from speech databases, the utility of automatically identifying a speaker based solely on vocal characteristics grows.
FEATURES OF SPEECH

One might wonder what information is needed to classify between genders or to classify the speech of multiple speakers. In fact, speech contains a great deal of information that allows a listener to determine both gender and speaker identity. In addition, speech can reveal much about the emotional state and age of the speaker. For example, an Israeli engineer created a signal-processing lie detector system that outperforms the traditional polygraph test.
PITCH

Pitch is the most distinctive difference between male and female speakers. A person's pitch originates in the vocal cords/folds, and the rate at which the vocal folds vibrate is the frequency of the pitch. So, when the vocal folds oscillate 300 times per second, they are said to be producing a pitch of 300 Hz. When the air passing through the vocal folds vibrates at the frequency of the pitch, harmonics are also created. The harmonics occur at integer multiples of the pitch and decrease in
amplitude at a rate of about 12 dB per octave (i.e., with each doubling of frequency).
The reason pitch differs between the sexes is the size, mass, and tension of the laryngeal tract, which includes the vocal folds and the glottis (the space between and behind the vocal folds). Just before puberty, the fundamental frequency, or pitch, of the human voice is about 250 Hz, and the vocal fold length is about 10.4 mm. After puberty the human body grows to its full adult size, changing the dimensions of the larynx area. The vocal fold length in males increases to about 15-25 mm, while the female vocal fold length increases to about 13-15 mm. These increases in size correlate with decreased frequencies coming from the vocal folds. In males, the average pitch falls between 60 and 120 Hz, and the range of a female's pitch can be found between 120 and 200 Hz. Females have a higher pitch range than males because their larynx is smaller. However, these are not the only differences between male and female speech patterns.
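A common way to estimate the pitch described above is autocorrelation of a short voiced frame: the lag of the strongest correlation peak within the plausible pitch range gives the fundamental period. A minimal sketch (the frame length and search bounds are illustrative choices):

```python
import numpy as np

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame by
    autocorrelation; fmin/fmax bound the search to plausible human pitch."""
    frame = frame - frame.mean()
    # one-sided autocorrelation (lag 0 .. len-1)
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)          # shortest period to consider
    lag_max = int(fs / fmin)          # longest period to consider
    lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return fs / lag

# e.g. a synthetic 120 Hz tone, within the male pitch range cited above:
fs = 16000
t = np.arange(int(0.05 * fs)) / fs
tone = np.sin(2 * np.pi * 120 * t)
```

On such a tone the estimate lands close to 120 Hz; real speech frames would first need voicing detection and windowing.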
FORMANT FREQUENCIES

When sound is emitted from the human mouth, it passes through two different systems before it takes its final form. The first system is the pitch generator, and the second system modulates the pitch harmonics created by the first. Scientists call the first system the laryngeal tract and the second the supralaryngeal (vocal) tract. The supralaryngeal tract consists of structures such as the oral cavity, nasal cavity, velum, epiglottis, tongue, etc.

When air flows through the laryngeal tract, it vibrates at the pitch frequency formed there, as mentioned above. The air then flows through the supralaryngeal tract, which begins to reverberate at particular frequencies determined by the diameter and length of the cavities in the supralaryngeal tract. These reverberations are called "resonances" or "formant frequencies"; in speech, resonances are called formants. Those harmonics of the pitch that are closest to the formant frequencies of the vocal tract become amplified, while the others are attenuated.
INTRODUCTION

Most signal processing involves processing a signal without concern for the quality or information content of that signal. In speech processing, speech is processed on a frame-by-frame basis, usually only with the concern that the frame is either speech or silence. The usable speech frames can be defined as frames of speech that contain higher information content compared to unusable frames with reference to a particular application. We have been
[Figure 1(a): speaker identification — input speech → feature extraction → similarity against reference models (Speaker #1 … Speaker #N) → maximum selection → identification result (speaker ID)]
investigating a speaker identification system to identify usable speech frames; we then determine a method for identifying those frames as usable using a different approach. Knowing how reliable the information in a frame of speech is can be very important and useful, and this is where usable speech detection and extraction play a very important role.
PARADIGMS OF SPEECH RECOGNITION

1. Speaker recognition - recognize which member of a population of subjects spoke a given utterance.

2. Speaker verification - verify that a given speaker is who he claims to be. The system prompts the user who claims to be the speaker to provide an ID, then verifies the user by comparing the codebook of the given speech utterance with the one stored for the claimed speaker. If the match exceeds the set threshold, the identity claim of the user is accepted; otherwise it is rejected.

3. Speaker identification - detect a particular speaker from a known population. The system prompts the user to provide a speech utterance, identifies the user by comparing the codebook of the utterance with those stored in the database, and lists the most likely speakers that could have given that utterance.
At the highest level, all speaker recognition systems contain two main modules (refer to Figure 1): feature extraction and feature matching. Feature extraction is the process that extracts a small amount of data from the voice signal that can later be used to represent each speaker. Feature matching involves the actual procedure of identifying the unknown speaker by comparing features extracted from his/her voice input with the ones from a set of known speakers.
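The identification structure just described (feature extraction, one reference model per speaker, maximum-similarity selection) can be sketched as follows. The mean-vector "model" and negative-distance "similarity" used here are deliberately simple stand-ins for the codebooks or statistical models a real system would use:

```python
import numpy as np

def train_models(features_per_speaker):
    """One reference model per enrolled speaker: here simply the mean
    of that speaker's training feature vectors."""
    return {spk: np.mean(feats, axis=0)
            for spk, feats in features_per_speaker.items()}

def identify(models, test_features):
    """Maximum-similarity selection over all reference models.

    Similarity is negative Euclidean distance, so the closest model wins.
    """
    test_vec = np.mean(test_features, axis=0)
    scores = {spk: -np.linalg.norm(test_vec - model)
              for spk, model in models.items()}
    return max(scores, key=scores.get)   # speaker ID with highest similarity
```

This mirrors Figure 1(a): every reference model produces a similarity score and the maximum is reported as the identification result.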
[Figure: input speech → feature extraction → similarity against the reference model for the claimed speaker ID (#M) → decision against threshold → verification result (Accept/Reject)]
(b) Speaker verification
Figure 1. Basic structures of speaker
recognition systems
Figure 1 shows the basic structures of speaker identification and verification systems. The system that we describe is classified as a text-independent speaker identification system, since its task is to identify the person who speaks regardless of what is being said.
Concepts of speaker identification systems:

Speaker identification systems may be classified into two categories based on their principle of operation. Text-dependent systems make use of a fixed utterance for test and training and rely on specific features of the test utterance in order to effect a match. Text-independent systems make use of different utterances for test and training and rely on long-term statistical characteristics of speech for making a successful identification.
Text-dependent systems require less training than text-independent systems and are capable of producing good results with a fraction of the test speech sample required by a text-independent system. The pitch period, or fundamental frequency, of speech varies from one individual to another; pitch frequency is high for female voices and low for male voices. This suggests that pitch might be a suitable parameter to distinguish one speaker from another, or at least to narrow down the set of probable matches. Analysis of the frequency spectrum of the test utterance provides valuable information for speaker identification. The spectrum contains both pitch harmonics and vocal-tract resonant peaks, making it possible to identify the speaker with a high probability of being correct. The vocal-tract filter parameters (filter coefficients) can also be used to good effect for speaker identification, because different speakers have different vocal-tract configurations for the same utterance, depending on their physical and emotional conditions, as well as on whether the speaker is a native or non-native speaker.

In any text-dependent speaker identification system, an important decision is the choice of test utterance. The source-filter model is most accurate at representing voiced sounds, such as the vowels. Vowels have a definite, consistent pitch period, and the vocal-tract configuration for vowel utterances exhibits a clear formant (resonant) structure. The frequency spectrum corresponding to vowel utterances therefore contains a wealth of information that can be used for speaker
identification. In general, it is difficult to guarantee one hundred percent recognition even with the best speaker identification approaches.

Generally speaking, two parameters may be used to describe the overall performance of a speaker identification system.

A false acceptance occurs when the system incorrectly identifies an unregistered individual as an enrolled one, or when one registered individual is mistaken for another. The FAR (False Acceptance Ratio) is the ratio of the number of false acceptances to the total number of trials. The value of FAR can be reduced by setting a strictly low threshold.

A false rejection occurs when the system incorrectly refuses to identify an individual who is registered with the system. The FRR (False Rejection Ratio) is the ratio of the number of false rejections to the total number of trials. Setting the threshold to a liberally high value can minimize the value of FRR. The requirements for low FAR and low FRR conflict, and both parameters cannot be lowered simultaneously. However, a low FAR is vital for good speaker identification systems, and most systems are biased for good FAR performance at the expense of FRR.
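The FAR/FRR trade-off described above is straightforward to compute from labelled trial scores. A sketch, assuming the common convention that a higher score means a stronger claim, and using per-trial-type denominators:

```python
def far_frr(impostor_scores, genuine_scores, threshold):
    """FAR = accepted impostor trials / impostor trials;
    FRR = rejected genuine trials / genuine trials.
    A trial is accepted when its score meets the threshold."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

# Illustrative scores: raising the threshold lowers FAR but raises FRR.
impostors = [0.1, 0.3, 0.4, 0.6]
genuines = [0.5, 0.7, 0.8, 0.9]
```

Sweeping the threshold over such scores traces out the trade-off curve; biasing a system for low FAR, as the text notes, means choosing a stricter threshold and accepting a higher FRR.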
APPROACHES TO SPEECH RECOGNITION

1. The acoustic-phonetic approach
2. The pattern recognition approach
3. The artificial intelligence approach

A. The Acoustic-Phonetic Approach

The acoustic-phonetic approach is based upon the theory of acoustic phonetics, which postulates that there exists a finite set of distinctive phonetic units in spoken language and that these phonetic units are broadly characterized by a set of properties that can be seen in the speech signal, or its spectrum, over time. Even though the acoustic properties of phonetic units are highly variable, both with the speaker and with the neighboring phonetic units, it is assumed that the rules governing the variability are straightforward and can readily be learned and applied in practical situations. Hence the first step in this approach, called the segmentation and labeling phase, involves segmenting the speech signal into discrete (in time) regions where the acoustic properties of the signal are representative of one of several phonetic units or classes, and then attaching one or more phonetic labels to each segmented region according to its acoustic properties.

For speech recognition, a second step is required. This step attempts to determine a valid word (or string of words) from the sequence of phonetic labels produced in the first step that is consistent with the constraints of the speech recognition task.
B. The Pattern Recognition Approach
The pattern recognition approach to speech is basically one in which the speech patterns are used directly, without explicit feature determination (in the acoustic-phonetic sense) and segmentation. As in most pattern recognition approaches, the method has two steps: training of speech patterns, and recognition of patterns via pattern comparison. Speech is brought into the system via a training procedure. The concept is that if enough versions of a pattern to be recognized (be it a sound, a word, a phrase, etc.) are included in the training set provided to the algorithm, the training procedure should be able to adequately characterize the acoustic properties of the pattern (with no regard for, or knowledge of, any other pattern presented to the training procedure). This type of characterization of speech via training is called pattern classification. Here the machine learns which acoustic properties of the speech class are reliable and repeatable across all training tokens of the pattern. The utility of this method lies in the pattern comparison stage: the unknown speech is compared with each possible pattern learned in the training phase and classified according to how well the patterns match.
Advantages of the Pattern Recognition Approach

• Simplicity of use. The method is relatively easy to understand, is rich in mathematical and communication-theory justification for the individual procedures used in training and decoding, and is widely used and best understood.

• Robustness and invariance to different speech vocabularies, users, feature sets, pattern comparison algorithms, and decision rules. This property makes the approach appropriate for a wide range of speech units, word vocabularies, speaker populations, background environments, transmission conditions, etc.

• Proven high performance. The pattern recognition approach to speech recognition consistently provides high performance on any task that is reasonable for the technology and provides a clear path for extending the technology in a wide range of directions.
C. The Artificial Intelligence Approach

The artificial intelligence approach to speech is a hybrid of the acoustic-phonetic approach and the pattern recognition approach, exploiting ideas and concepts of both methods. It attempts to mechanize the recognition procedure according to the way a person applies intelligence in visualizing, analyzing, and finally making a decision on the measured acoustic features. In particular, among the techniques used within this class of methods is the use of an expert system for segmentation and labeling. The use of neural networks could represent a separate structural approach to speech recognition, or could be regarded as an implementational architecture that may be incorporated into any of the above classical approaches.
FUTURE SCOPE
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011
SIP0502-7
A range of future improvements is possible:
• Text-independent speaker identification
• The number of users can be increased
• Identification of male, female, child, and adult speakers
Abstract
This paper focuses on the analysis of the Film Bulk Acoustic Wave Resonator (FBAR), comprising a zinc oxide (ZnO) piezoelectric thin film sandwiched between two gold (Au) electrodes and located on a silicon substrate with a low-stress silicon nitride (Si3N4) supporting membrane, for high-frequency wireless applications. Film bulk acoustic wave technology is a promising technology for manufacturing miniaturized, high-performance filters for the gigahertz range.
Keywords: FBAR, Quartz crystal, APLAC.
Quartz Crystal
Crystal quartz is the most important resonator material presently available. It has been used for 50 years, and thus growth, characterization, and fabrication techniques are quite mature. Its low coupling is usually not a disadvantage when it is used for frequency-control applications. For reasonable transducer areas, the resistance falls in the 10-20 ohm range at 5 to 20 MHz, a range that is ideal for oscillator circuits. Its Q is somewhat lower than that of ferroelectric materials, but at lower frequencies it is more than adequate; because the stoichiometry of crystal quartz is simple and its growth technology well established, there are few crystal defects and the attenuation has a frequency-squared dependence. Only when very high frequencies or wide inductive regions are required do designers look beyond quartz.
So at higher frequencies, e.g. in the GHz range, quartz cannot be used; instead, FBAR and SAW devices, which are much smaller in size, are used. Quartz also has the disadvantage that it is harder to integrate with mechanical structures and integrated circuits than silicon, and furthermore the cost of quartz wafers is significantly higher than that of silicon. [1-7]
FBAR Devices
FBAR stands for Film Bulk Acoustic Resonator. FBAR is a breakthrough resonator technology developed by Agilent Technologies. The technology can be used to create the essential frequency-selecting elements found in modern wireless systems, including filters, duplexers, and resonators for oscillators. [1-3]
Why FBAR
The rapid growth of wireless mobile telecommunication systems has led to increasing demand for high-frequency oscillators, filters, and duplexers capable of operating in the GHz frequency band. Conventionally, quartz crystal and microwave ceramic resonators, transmission lines, and SAW devices have been used as high-frequency band devices. Although they provide high performance at a reasonable price, they are too large to integrate into wireless applications. SAW devices offer better electrical performance in a smaller size, but they suffer from relatively poor temperature stability, high insertion loss, and limited power handling.
To cope with these limitations, FBAR devices have been developed; they can easily replace these devices at higher frequencies for wireless communication applications. A thin-film bulk acoustic wave resonator consists basically of a thin piezoelectric layer sandwiched between two electrodes. In such a resonator, a mechanical wave is piezoelectrically excited in response to an electric field applied between the electrodes. The propagation direction of this acoustic wave is perpendicular to the surface of the resonator. For a standing wave situation to arise, the thickness of the resonator must hold an integer number of half-wavelengths of the acoustic wave.
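The thickness-mode standing-wave condition fixes the resonant frequencies at f = n·v/(2t). The sketch below evaluates this for a ZnO film; the acoustic velocity (~6350 m/s for longitudinal waves in ZnO) and the 1.5 µm thickness are assumed illustrative values, not figures taken from this paper.

```python
V_ZNO = 6350.0        # assumed longitudinal acoustic velocity in ZnO, m/s
THICKNESS = 1.5e-6    # assumed ZnO film thickness, m

def fbar_resonance(n=1, v=V_ZNO, t=THICKNESS):
    """Frequency (Hz) of the n-th thickness-mode resonance:
    n acoustic half-wavelengths must fit in the film thickness t,
    so f = n * v / (2 * t)."""
    return n * v / (2.0 * t)

f0 = fbar_resonance()
print(f"fundamental resonance: {f0 / 1e9:.2f} GHz")  # ~2.12 GHz for these values
```

The inverse relation is what makes FBARs attractive for the GHz range: a film only a micron or two thick resonates at frequencies where a quartz plate would have to be impractically thin.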
Modeling of FBAR Resonator and Simulation using APLAC
192-bit Blowfish key (estimated key strength unknown).
(E) Nautilus
Nautilus is a free secure communications program. It lacks many of the features of other communications programs, and its interface is best described as user-hateful. Unlike most other voice encryption programs, Nautilus uses a proprietary algorithm with a key negotiated via the Diffie-Hellman key exchange.
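The Diffie-Hellman negotiation mentioned above can be sketched in a few lines: each party keeps a private exponent, publishes g^priv mod p, and combines the other party's public value with its own private exponent, so both arrive at the same shared secret without it ever being transmitted. The prime below is a small toy modulus chosen for illustration; a real system would use a standardized 2048-bit group and add authentication, since unauthenticated DH is open to man-in-the-middle attacks.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; large enough for a demonstration
G = 5            # public generator

def keypair():
    priv = secrets.randbelow(P - 3) + 2      # random private exponent in [2, P-2]
    return priv, pow(G, priv, P)             # (private, public = G^priv mod P)

a_priv, a_pub = keypair()                    # Alice's keys
b_priv, b_pub = keypair()                    # Bob's keys
shared_a = pow(b_pub, a_priv, P)             # Alice's view of the secret
shared_b = pow(a_pub, b_priv, P)             # Bob's view of the secret
print(shared_a == shared_b)                  # True: both derive the same key
```

The agreed value (or a hash of it) would then seed the voice cipher's session key.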
(F) Speak Freely
Speak Freely is a versatile, simple voice encryption system. It offers a selection of voice encryption techniques (IDEA or DES), permits conferencing, and contains several other useful functions. Unlike most voice encryption platforms, Speak Freely includes options that allow it to connect to other encrypting and non-encrypting internet telephones.
(G) SEU-8201 Cipher System
The SEU-8201 is a high-security voice ciphering system mainly used by authorities, governmental agencies, police, and military or paramilitary organizations. The ciphering algorithm is a new approach, providing the highest security needed by such user groups. From a practical standpoint, it is not susceptible to attack by eavesdroppers or by current crypto-analytical methods [4].
Fig 4: SEU-8201 Voice Encryption System
V. CONCLUSION
Speech scrambling and encryption is an important technique for voice security and plays a significant role in the field of voice communication. It enhances the security of voice communication through a large number of complex operations that convert the sound wave from its original form into a scrambled waveform, which is very difficult for any unauthorized third party to convert back to the original. The advantage of speech scrambling and encryption is that it provides better security: even if the transmitted wave is intercepted by an intruder, the confidentiality of the original wave can still be maintained.

The study of speech scrambling and encryption techniques aims to enhance the potential of upcoming communication technologies and their implications for defense and government users. The implementation of voice scrambling and encryption is a strong, positive move toward defining a standard for secure voice communication. However, as the amount of confidential voice communication over insecure wireless channels increases, speech scrambling and encryption must also be kept under review from a security perspective.
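One of the scrambling operations the conclusion refers to, time-segment permutation, can be sketched as follows. This is a toy illustration, not any of the systems surveyed above: the sample stream is cut into fixed-size segments and the segments are reordered with a key-seeded pseudo-random permutation, so only a receiver knowing the key can invert it. Integers stand in for audio samples, and the segment size is an assumed illustrative value.

```python
import random

SEGMENT = 4  # samples per segment (assumed illustrative value)

def _permutation(key, n):
    """Key-seeded permutation of segment indices; reproducible from the key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order

def scramble(samples, key):
    """Emit the segments of `samples` in key-dependent order."""
    segs = [samples[i:i + SEGMENT] for i in range(0, len(samples), SEGMENT)]
    order = _permutation(key, len(segs))
    return [x for i in order for x in segs[i]]

def descramble(samples, key):
    """Invert scramble(): put each received segment back at its original index."""
    segs = [samples[i:i + SEGMENT] for i in range(0, len(samples), SEGMENT)]
    order = _permutation(key, len(segs))
    out = [None] * len(segs)
    for pos, src in enumerate(order):
        out[src] = segs[pos]
    return [x for seg in out for x in seg]

signal = list(range(16))                         # stand-in for audio samples
garbled = scramble(signal, key=1234)
assert descramble(garbled, key=1234) == signal   # correct key recovers the speech
```

Permutation alone only rearranges the signal, which is why practical systems combine scrambling with encryption of the samples themselves, as the conclusion argues.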
VI. REFERENCES
1. "SIGSALY", http://history.sandiego.edu/gen/recording/sigsaly.html
2. Owens, F. J. (1993). Signal Processing of Speech. Houndmills: MacMillan Press. ISBN 0333519221.