On the Study of Feature Extraction Methods for an Electronic Nose


    Cosimo Distante a,*, Marco Leo b, Pietro Siciliano a, Krishna C. Persaud c

    a Istituto per la Microelettronica e Microsistemi IMM-CNR, via Arnesano, 73100 Lecce, Italy

    b Istituto di Studi sui Sistemi Intelligenti per l'Automazione ISSIA-CNR, via Amendola, 166/5-70126 Bari, Italy

    c Department of Instrumentation and Analytical Science DIAS-UMIST, 3DIAS, UMIST, P.O. Box 88, Sackville Street, Manchester M60 1QD, UK

    Received 8 March 2002; received in revised form 12 June 2002; accepted 26 June 2002

    Abstract

    In this study, we analyzed the transient response of microsensors based on tin oxide sol-gel thin films. A method for data analysis and discrimination among different volatile organic compounds, novel to this research field, is presented. Moreover, several feature extraction methods have been considered, using both steady-state (fractional change, relative, difference and log) and transient (Fourier and wavelet descriptors, integral and derivatives) information. The feature extraction methods have been validated qualitatively (by using principal component analysis) and quantitatively on the classification rate (by using a radial basis function neural network). © 2002 Elsevier Science B.V. All rights reserved.

    Keywords: Electronic nose; Radial basis function; Wavelet analysis; Feature extraction

    1. Introduction

    Semiconductor metal-oxide-based gas sensors have been studied for many years and are now used in many fields of application. Despite this widespread use, further research is needed, mainly to improve sensitivity, selectivity and stability. In fact, improving both the sensor selectivity toward specific gaseous substances and the discrimination capability has been the goal of a great deal of work over the last few years. One strategy consists of using non-selective sensor arrays and an appropriate pattern recognition (PARC) system capable of recognizing simple or complex vapors based on the conductance or current at the saturation point of each transient response. We believe that the most useful information (high and low frequencies in the curve) can be taken from the transient response. One of the most important processing parts of an intelligent system is its ability to extract useful information that is less redundant than the original data, to aid fast processing and pattern classification. In other words, any selected feature (1) must discriminate clearly between two or more classes of objects, (2) must not be correlated with another feature to any moderately strong extent, and (3) should have meaning for humans. The first step toward the pre-processing of e-nose data was based on methods that extract information about the transient only from the steady-state and baseline response values. However, these methods take into account only stationary information about the transient (i.e. steady-state and baseline), and all the information related to the kinetics of the rise and recovery times is lost. Each sensor has its own behavior in response to an odor presentation, and this behavior is recorded along the response. For example, starting from the fact that light molecules (e.g. alcohols) disperse more readily than heavier ones (e.g. acid oils), it is possible to analyze the recovery time of the response after the end of the exposure to recognize the odor being presented. In some cases of long sensor response times (of the order of several minutes), it is possible to analyze the rise time of the transient response, which can be computed in a few seconds, instead of considering the saturation parameters. Several methods compute the gradient of the response to detect relevant features. However, these methods have the drawback of amplifying the noise. The problem of analyzing the transient was addressed in 1997 in [1]. That paper noted that the rise time is strongly dependent on the type of odor being presented to an array of tin oxide sensors.

    Several feature extraction methods, such as parameters extracted from curve fitting, derivatives and integrals, have been investigated in [2] using only one Pt-MOSFET sensor, together with a proposed method for their comparison. A method for comparing selected features based on a root mean square error from a multilinear regression model is proposed in [3].

    Sensors and Actuators B 87 (2002) 274-288

    * Corresponding author. Tel.: 39-8-3232-0253; fax: 39-8-3232-5299. E-mail addresses: [email protected] (C. Distante), [email protected] (M. Leo), [email protected] (P. Siciliano), [email protected] (K.C. Persaud).

    0925-4005/02/$ see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0925-4005(02)00247-2

    In [4] the transient response has been modeled as a sum of exponential functions that model the process of injecting the gas into the test chamber and the following odor-absorbing phase of the active elements. However, the extraction of the features is more complex than in the first two methods, and it does not provide repeatable parameters because the exponential functions are not orthogonal. Starting from this fact, in [5] meat freshness is detected by using tin oxide sensors whose responses have been modeled with orthogonal functions by using the Fourier transform. The extracted features are then fed into a neural network for recognition. The authors report successful results using Fourier descriptors, which improve the recognition rate with respect to the steady-state parameters. However, the problem with the Fourier transform is the impossibility of localizing frequencies in the time domain. This problem can be overcome by adding information about the localization of these frequencies in the time domain, which is what the theory of the wavelet transform provides. The method for data processing consists of transient curve analysis through a novel set of shape descriptors that represent the digitized pattern in a concise way and that are particularly well suited for recognition with an electronic nose. The set of descriptors is derived from the wavelet transform of a pattern's contour. The motivation for using an orthonormal wavelet basis is that wavelet coefficients provide localized frequency information and that wavelets allow us to decompose a function into a multiresolution hierarchy of localized frequency bands. In [6] a comparative study of the Gram-Schmidt, FFT and Haar wavelet transforms is presented, showing the superiority of the last method in producing more informative patterns. Recently, the discrete wavelet transform and the FFT have been proposed [7] in combination with Fuzzy ARTMAP [8] for the identification of CO and NO2.

    This paper describes a compound recognition system that relies upon wavelet descriptors to analyze the transient shape at multiple levels of resolution. In fact, the transient response of tin oxide sensors to gas mixtures contains relevant information related to the species present in the mixture. Sensors usually show a response curve that depends on the odor with which they interact, and the shape of the transient depends on the kinetics of the chemical interaction. Moreover, more information can be obtained from the analysis of this curve than from the traditional methods suggested in the literature (for example, relative, fractional, etc.). A set of five tin oxide thin film microsensors was prepared by the sol-gel method and used for this application.

    2. Electronic nose and signal processing

    An electronic nose incorporates an array of chemical sensors, whose response constitutes an odor pattern. A single sensor in the array should not be highly specific in its responses but should respond to a broad range of compounds, so that different patterns are expected to be related to different odors.

    Let us consider a simple odor (pure gas) or a complex one represented as a concentration vector of the $j$th odor class $\mathbf{c}_j(t) = (c_{1j}, c_{2j}, \ldots, c_{pj})$, where $p \ge 1$ is the number of odorant components. In the case of equality ($p = 1$) we are in the presence of a simple odor; otherwise, for $p > 1$, of a complex odor.

    The first stage of the description of the system is the impact of the odor with the sensor surfaces. The most likely effect of the transduction process is a change in the measured electrical resistance, but in other cases it could be a change in mass (for BAW sensors) or electrical potential (for Pd-gate MOSFET) [9].

    The signal generated by the sensor material is then converted into an electrical signal and conditioned. The output signal, for example the resistance $R_{ij}(t)$, is digitized. The converted signal is given by the vector

    $$\mathbf{y}_j(t) = \begin{bmatrix} y_{1j} \\ y_{2j} \\ \vdots \\ y_{nj} \end{bmatrix} \tag{1}$$

    and the array response may be preprocessed for noise and complexity reduction in order to accomplish the odor recognition task and for visualization. A typical gas sensor response is shown in Fig. 1 (ideal case), where the sensor is exposed to a certain odorant $j$ with a certain concentration $c_j(t)$.

    Fig. 1. Characteristic response of a chemical gas sensor. The odor is given with a certain concentration $c(t)$, and the system, constituted of a sensor and the electronics, converts the signal into an electrical measure; in this case a resistance $R(t)$ is read out.


    Usually, the rise time $t_r$ and the decay time $t_d$ are different. However, the output signal $R_{ij}(t)$ diverges from the ideal case because of interfering signals.

    There are several interfering inputs, the most common being changes in the temperature and relative humidity of odors. Usually, the heater of a chemoresistor is maintained at constant voltage, but in reality the operating temperature varies with any change in ambient temperature.

    Humidity also has a strong effect on most sensors. The baseline resistance usually decreases as the humidity increases, although the exact slope depends on the operating temperature of the sensors. Chemical sensors suffer from poor long-term stability, caused by physical changes in the sensors and in the chemical background, which gives an unstable signal over time. Attempts at drift counteraction have been made both in the pre-processing and in the recognition stage (see [10-12]).

    Several methods have been applied for an initial data reduction; they are listed in Table 1 for a metal-oxide sensor. The difference method is usually applied when there is an additive error $E_i$ on both the baseline $y_0$ and the steady-state response $y_s$. In the case of a multiplicative error, it is better to apply the relative method, which represents the ratio of the steady-state response to the baseline, so that the error cancels out. The relative parameter is often used with metal-oxide gas sensors, due to their high sensitivity to concentration. However, the baseline of these sensors is usually referenced to a specific gas rather than air, since in air they are very sensitive to the presence of any gas, which causes baseline stability problems. This pre-processing method has some limitations due to its quite restricted concentration range. To overcome this problem the fractional change has been proposed in [9]. This method has shown good results in odor recognition using neural networks [13] for the discrimination of several coffees.

    The last method, the log parameter, is more suitable when the variation of the concentration is very large. A log relative resistance parameter has the benefit of linearising the sensor output and taking the value of zero in the absence of an odor input.
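The four steady-state parameters of Table 1 can be illustrated with a minimal sketch. The function name `steady_state_features` and the resistance values are hypothetical and chosen only for illustration; the example merely checks the two cancellation arguments made above (additive error for the difference, multiplicative error for the relative parameter).

```python
import math


def steady_state_features(r_s, r_0):
    """Return the four Table 1 parameters for one sensor.

    r_s: steady-state resistance, r_0: baseline resistance (both > 0).
    """
    if r_s <= 0 or r_0 <= 0:
        raise ValueError("resistances must be positive")
    return {
        "difference": r_s - r_0,
        "relative": r_s / r_0,
        "fractional": (r_s - r_0) / r_0,
        "log": math.log(r_s / r_0),
    }


# Hypothetical readings in ohms (not taken from the paper).
clean = steady_state_features(r_s=50_000.0, r_0=100_000.0)
# Multiplicative error: both readings scaled by 1.2.
scaled = steady_state_features(r_s=60_000.0, r_0=120_000.0)
# Additive error: both readings shifted by 5000 ohms.
shifted = steady_state_features(r_s=55_000.0, r_0=105_000.0)
# No odor present: steady state equals baseline.
no_odor = steady_state_features(r_s=100_000.0, r_0=100_000.0)
```

In this sketch the relative parameter is unchanged between `clean` and `scaled`, the difference is unchanged between `clean` and `shifted`, and the log parameter of `no_odor` is zero.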

    3. Wavelet analysis

    The wavelet transform is an extension of the Fourier transform, generalized to any wideband transient. Let us think of our input as a time-varying signal. To analyze signal structures of very different sizes, it is necessary to use time-frequency atoms with different time supports. The wavelet transform decomposes signals over dilated and translated wavelets [14]. The signal may be sampled at discrete wavelength values, yielding a spectrum. In the continuous wavelet transform the input signal is correlated with an analyzing continuous wavelet. The latter is a function of two parameters, scale and position. The widely used Fourier transform (FT) maps the input data into a new space whose basis functions are sines and cosines. Such basis functions are defined in an infinite space and are periodic, which means that the FT is best suited to signals with these same features. The wavelet transform maps the input signal into a new space whose basis functions are quite localized in space. They usually have compact support. The term wavelet comes from well-localized wave-like functions. In fact, they are well localized in space and frequency, i.e. their rate of variation is restricted. The Fourier transform is not local in space but only in frequency. Furthermore, Fourier analysis is unique, but wavelet analysis is not, since there are many possible sets of wavelets from which one can choose. The trade-off between different wavelet sets is compactness versus smoothness.

    Working with fixed windows, as in the short-term Fourier transform (STFT), may lead to problems. In fact, if the signal details are much smaller than the width of the window, they can be detected but the transform will not localize them. If the signal details are larger than the window size, then they will not be detected properly. The scale is defined by the width of a modulation function. To solve this problem we must define a transform independent of the scale. This means that the function should not have a fixed scale but should vary. To achieve this, we start from a function $\psi(t)$ as a candidate modulation function and obtain a family from it by varying the scale $s$ as follows:

    $$\psi_s(u) = |s|^{-p}\,\psi\!\left(\frac{u}{s}\right) = \frac{1}{|s|^{p}}\,\psi\!\left(\frac{u}{s}\right), \qquad p \ge 0,\ \forall s \in \mathbb{R},\ s \ne 0 \tag{2}$$

    If $\psi$ has width $T$, then the width of $\psi_s$ is $sT$. In terms of frequencies, small scales imply that $\psi_s$ has high frequencies, and as $s$ increases the frequency of $\psi_s$ decreases.

    3.1. Continuous wavelet transform

    Table 1
    Pre-processing formulas for metal-oxide sensors, where the resistance is used to compute the output signal $x_{ij}$

    Method               Formula
    Difference           $x_{ij} = R_s - R_0$
    Relative             $x_{ij} = R_s / R_0$
    Fractional change    $x_{ij} = (R_s - R_0) / R_0$
    Log parameter        $x_{ij} = \ln(R_s / R_0)$

    As is well known, the FT uses basis functions consisting of sine and cosine functions. These functions are time-independent. Hence, the description of a signal provided by Fourier analysis is purely in the frequency domain. The windowed Fourier transform and the wavelet transform aim at the analysis of both time and frequency. For non-stationary analysis, the windowed Fourier transform, or short-term FT (STFT), is best suited. The smaller the window size, the more the number of discrete frequencies is reduced, leading to a weak discrimination potential among frequencies. Given a signal $f(t)$, the STFT with a window $g$ around the time-point $\tau$ and frequency $\omega$ is

    $$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{\infty} f(t)\, g(t - \tau)\, e^{-j\omega t}\, dt \tag{3}$$

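    Now considering

    $$k_{\tau,\omega}(t) = g(t - \tau)\, e^{-j\omega t} \tag{4}$$

    as a new basis, and rewriting this with a window $s$ inversely proportional to the frequency $\omega$ and with the position parameter $b$ replacing $\tau$, gives the following

    $$k_{b,s}(t) = \frac{1}{\sqrt{s}}\,\psi^{*}\!\left(\frac{t - b}{s}\right) \tag{5}$$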

    which yields the continuous wavelet transform (CWT). While in the STFT the basis functions are sinusoids, in the CWT they are scaled versions of the so-called mother wavelet $\psi$ ($\psi^{*}$ denotes the complex conjugate, and $\bar{\psi}_s(u) = \frac{1}{\sqrt{s}}\,\psi^{*}(-u/s)$). A mother wavelet function can be constructed in several ways, but it is subject to admissibility constraints.

    Definition 1. The Morlet-Grossmann definition of the continuous wavelet transform for a one-dimensional signal $f(t) \in L^2(\mathbb{R})$, the space of all square integrable functions, is given as follows:

    $$\tilde{f}(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt \tag{6}$$

    where $\tilde{f}(a, b)$ is the wavelet coefficient of the function $f(t)$, $\psi(t)$ is the mother wavelet, $a > 0$ is the scale parameter, and $b$ is the position parameter.

    We can rewrite Eq. (6) as a convolution product:

    $$\tilde{f}(a, b) = (f \star \bar{\psi}_a)(b), \qquad \bar{\psi}_a(u) = \frac{1}{\sqrt{a}}\,\psi^{*}\!\left(-\frac{u}{a}\right) \tag{7}$$

    The continuous wavelet transform is the result of the scalar product of the original signal $f$ with the shifted and scaled version of a prototype analyzing function $\psi(t)$, called the mother wavelet, which has the characteristics of a band-pass filter impulse response. The coefficients $\tilde{f}$ of the transformed signal $f$ represent how closely correlated the mother wavelet is with the section of the signal being analyzed. The higher the coefficient, the greater the similarity (see footnote 1).

    The continuous wavelet transform has the following properties:

    1. CWT is a linear transformation,
    2. CWT is covariant under translation,
    3. CWT is covariant under dilation.
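The first two properties can be checked numerically with a minimal sketch that evaluates Eq. (6) by a Riemann sum. The real-valued "Mexican hat" wavelet, the test signals, and the names `mexican_hat` and `cwt_coefficient` are assumptions made for illustration only; they are not the wavelet or data used in this paper. For a real wavelet the conjugate in Eq. (6) has no effect.

```python
import math


def mexican_hat(t):
    """Real-valued example mother wavelet (an assumption, not the paper's choice)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_coefficient(signal, dt, a, b, wavelet=mexican_hat):
    """Riemann-sum approximation of Eq. (6) for samples signal[i] = f(i * dt)."""
    if a <= 0:
        raise ValueError("scale a must be positive")
    total = 0.0
    for i, value in enumerate(signal):
        total += value * wavelet((i * dt - b) / a)
    return total * dt / math.sqrt(a)


dt, n = 0.1, 400
f = [math.exp(-((i * dt - 15.0) ** 2)) for i in range(n)]   # bump centred at t = 15
g = [math.sin(0.5 * i * dt) for i in range(n)]              # slow oscillation
mix = [2.0 * x + 3.0 * y for x, y in zip(f, g)]

a, b = 2.0, 15.0
# Linearity: the transform of 2f + 3g equals 2*CWT(f) + 3*CWT(g).
lhs = cwt_coefficient(mix, dt, a, b)
rhs = 2.0 * cwt_coefficient(f, dt, a, b) + 3.0 * cwt_coefficient(g, dt, a, b)

# Translation covariance: shifting the bump by 5 shifts the position b by 5.
f_shifted = [math.exp(-((i * dt - 20.0) ** 2)) for i in range(n)]
original = cwt_coefficient(f, dt, a, b)
translated = cwt_coefficient(f_shifted, dt, a, b + 5.0)
```

Here `lhs` and `rhs` agree up to rounding, and so do `original` and `translated`, because the bump is negligible at both ends of the sampled interval.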

    3.2. Haar wavelet function

    As is known, to analyze signals of very different sizes, it is necessary to use time-frequency atoms with different time supports. A wavelet $\psi \in L^2(\mathbb{R})$ is a function with zero average

    $$\int_{-\infty}^{\infty} \psi(t)\, dt = 0 \tag{8}$$

    that is also normalized, $\|\psi\| = 1$, and centered in the neighborhood of $t = 0$. The scaled and translated versions $\psi_s(u)$ remain normalized as well.

    In 1910 Haar realized that one can construct a simple piece-wise constant function, shown in Fig. 2 and defined as follows:

    $$\psi(t) = \begin{cases} 1, & \text{if } 0 \le t < \tfrac{1}{2} \\ -1, & \text{if } \tfrac{1}{2} \le t < 1 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

    whose dilations and translations generate an orthonormal basis of $L^2(\mathbb{R})$. Applications of this transform to data smoothing and periodicity detection have been investigated.
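As a quick check of Eqs. (8) and (9), the following minimal sketch evaluates the Haar function and verifies numerically that it has zero average and unit norm. The helper `integrate` is a hypothetical midpoint-rule integrator introduced only for this illustration; the grid is chosen so that the jumps of the Haar function fall on cell edges.

```python
def haar(t):
    """Haar mother wavelet of Eq. (9)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def integrate(func, lo, hi, steps):
    """Midpoint rule; exact for piecewise-constant functions whose jumps lie on cell edges."""
    h = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * h) for k in range(steps)) * h


# Zero average, Eq. (8).
mean = integrate(haar, -1.0, 2.0, 3000)
# Squared L2 norm, which should equal 1 for a normalized wavelet.
energy = integrate(lambda t: haar(t) ** 2, -1.0, 2.0, 3000)
```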

    3.3. Discrete wavelet transform

    Calculating wavelet coefficients at every possible scale is a fair amount of work, and it generates an awful lot of data. What if we choose only a subset of scales and positions at which to make our calculations?

    It turns out that if we choose scales and positions based on powers of two (called dyadic scales and positions), then our analysis will be much more efficient. This analysis is called the discrete wavelet transform.

    In the discrete case, the WT is sampled at discrete mesh points, using smoother basis functions. But how should the time-scale domain be discretized in order to obtain a discrete wavelet transform (DWT)?

    In the windowed Fourier transform the time-frequency domain was discretized using a uniform lattice $\Delta_{t_0,\omega_0} = \{(m t_0, n \omega_0) \mid m, n \in \mathbb{Z}\}$, where $t_0$ and $\omega_0$ are the sampling steps in time and frequency, respectively.

    It is known that the scaling operation acts in a multiplicative way, that is, composing two consecutive scalings is attained by multiplying the scale factors. Thus, starting from an initial scale $s_0 > 1$ we consider all the discrete scales

    $$s_m = s_0^m, \qquad m \in \mathbb{Z} \tag{10}$$

    Now, how do we discretize the time? It is important to note that we must obtain a lattice in the time-scale domain in order to sample the continuous wavelet transform seen before (with a minimum-redundancy reconstruction of the original signal), obtaining the samples $\tilde{f}_{m,n}$ from the time-scale domain.

    1 Note that the result will depend on the shape of the wavelet you choose.

    Increasing the scale increases the width of the wavelet. Conversely, when the width of the wavelet is reduced by a scale reduction operation, we must increase the frequency. One of the most important properties of the WT is the invariance under scale changes. In fact, if we change the scale in the function $f$ and the scale of the underlying space by the same scaling factor, the WT does not change. If we take $f_{s_0}(t) = s_0^{-1/2} f(t/s_0)$, this implies

    $$\tilde{f}_{s_0}(s_0 s, s_0 t) = \tilde{f}(s, t).$$

    The invariance property of the wavelet transform is very important and should be preserved even when the WT is discretized. The preservation can be accomplished if, when we pass from one scale $s_m = s_0^m$ to the next $s_{m+1} = s_0^{m+1}$, we increment the time by the scaling factor $s_0$. We can choose an initial time $t_0$ and take the length of the sampling time intervals as $\Delta t = t_0 s_0^m$. The time discretization lattice for each scale $s_0^m$ is given by

    $$t_{m,n} = n\, s_0^m\, t_0, \qquad n \in \mathbb{Z} \tag{11}$$

    and the time-scale discretization domain in the lattice is

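    $$\Delta_{s_0,t_0} = \{(s_0^m,\ n s_0^m t_0) \mid m, n \in \mathbb{Z}\} \tag{12}$$

    The discretization of the WT $\tilde{f}(s, t) = \langle f, \psi_{s,t}(u) \rangle$ in the time-scale lattice is given by

    $$\tilde{f}_{m,n} = \langle f, \psi_{m,n}(u) \rangle \tag{13}$$

    where

    $$\psi_{m,n}(u) = s_0^{-m/2}\, \psi\!\left(s_0^{-m} u - n t_0\right) \tag{14}$$

    The discrete WT has the following characteristics: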

    - the sequence $\langle f, \psi_{m,n} \rangle$, $m, n \in \mathbb{Z}$, is an exact representation of $f$,
    - it is possible to reconstruct $f$ from the family of wavelet time-scale atoms $\psi_{m,n}$,
    - $\{\psi_{m,n}\}$ constitutes an orthonormal basis for $L^2(\mathbb{R})$.
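The lattice of Eq. (12) and the atoms of Eq. (14) can be sketched for the dyadic case. The choices $s_0 = 2$, $t_0 = 1$ and the Haar wavelet are assumptions made for illustration, as are the helper names `atom`, `inner` and `lattice`. The sketch checks numerically that a few atoms are mutually orthonormal, as stated in the last item above.

```python
def haar(t):
    """Haar mother wavelet (illustrative choice)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def atom(m, n, u, s0=2.0, t0=1.0):
    """psi_{m,n}(u) = s0^(-m/2) * psi(s0^(-m) * u - n * t0), as in Eq. (14)."""
    return s0 ** (-m / 2.0) * haar(s0 ** (-m) * u - n * t0)


def inner(m1, n1, m2, n2, lo=-8.0, hi=8.0, steps=1024):
    """Midpoint-rule inner product <psi_{m1,n1}, psi_{m2,n2}> on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        u = lo + (k + 0.5) * h
        total += atom(m1, n1, u) * atom(m2, n2, u)
    return total * h


def lattice(m_values, n_values, s0=2.0, t0=1.0):
    """Points (s0^m, n * s0^m * t0) of the time-scale lattice, as in Eq. (12)."""
    return [(s0 ** m, n * s0 ** m * t0) for m in m_values for n in n_values]


points = lattice([0, 1], [0, 1])
same_atom = inner(1, 0, 1, 0)          # squared norm of psi_{1,0}
shifted_atoms = inner(0, 0, 0, 1)      # same scale, different position
different_scales = inner(0, 0, 1, 0)   # different scales, overlapping supports
```

With these settings `same_atom` is 1 up to rounding, while `shifted_atoms` and `different_scales` vanish.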

    3.4. Multiresolution and subband coding

    In the previous sections, we have seen the continuous wavelet transform and its discretization in the time-scale domain.

    The idea of scale is mainly related to the problem of point sampling of the signal. When we sample a signal we have to fix the sampling frequency (footnote 2) and the sampling period (footnote 3). If we are sampling the signal at scale $2^j$, frequencies (details) outside the scale magnitude of the samples will be lost in the sampling process. All of the details captured at a certain scale will also be present at every finer scale.

    The scaling process gives rise to a subspace generation. In fact, sampling at scale $2^j$ gives rise to the subspace $V_j \subset L^2(\mathbb{R})$, which is constituted by the functions in $L^2(\mathbb{R})$ whose details are well represented at the scale $2^j$. Now let us define a representation operator that will represent the function $f \in L^2(\mathbb{R})$ at the scale $2^j$. Let us suppose that there exists a function $\phi \in L^2(\mathbb{R})$ such that the family of functions

    $$\phi_{j,k}(u) = 2^{-j/2}\, \phi(2^{-j} u - k), \qquad j, k \in \mathbb{Z} \tag{15}$$

    represents an orthonormal basis for the subspace $V_j$.

    Fig. 2. The Haar function.

    2 The number of samples per unit time.
    3 The length of the sample interval.


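    Defining different scales of $\phi$ we have

    $$\phi_s(u) = \frac{1}{|s|^{1/2}}\, \phi\!\left(\frac{u}{s}\right) \tag{16}$$

    where the width of $\phi_s$ is $s$ times the width of $\phi$. Thus, as the scale increases or decreases, the width of $\phi_s$ does the same.

    Now taking $s = 2^j$, $j \in \mathbb{Z}$, we have that

    $$\phi_{j,k}(u) = \phi_{2^j}(u - 2^j k) = 2^{-j/2}\, \phi(2^{-j} u - k) \tag{17}$$

    is an orthonormal basis for $V_j$.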

    For each space $V_j$ with scale $2^j$ we can define the operator $R_j : L^2(\mathbb{R}) \to V_j$. This is the representation operator that orthogonally projects a function $f \in L^2(\mathbb{R})$ onto the space $V_j$ as follows:

    $$R_j(f) = \mathrm{Proj}_{V_j}(f) = \sum_k \langle f, \phi_{j,k} \rangle\, \phi_{j,k} \tag{18}$$

    From the previous relationships it emerges that we can represent a function $f$ at several scales. It is important to be able to change the representation from one scale to another without losing information. The details at one scale $2^j$ must appear at a smaller scale $2^{j-1}$. Thus $V_j \subset V_{j-1}$, which means: given a function $f \in L^2(\mathbb{R})$, $f(u) \in V_j$ iff $f(2u) \in V_{j-1}$, and recursively we obtain

    $$f(u) \in V_j \iff f(2^j u) \in V_0 \tag{19}$$

    Let us now define the multiresolution representation as follows:

    Definition 2. A multiresolution representation in $L^2(\mathbb{R})$ is defined as a sequence of closed subspaces $V_j \subset L^2(\mathbb{R})$, $j \in \mathbb{Z}$, that satisfies the following properties:

    1. $V_j \subset V_{j-1}$,
    2. $f(u) \in V_j$ iff $f(2u) \in V_{j-1}$,
    3. $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$,
    4. $\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R})$,
    5. $\exists\, \phi \in V_0$ such that $\{\phi(u - k) \mid k \in \mathbb{Z}\}$ is an orthonormal basis of $V_0$.

    The function $\phi$ is called the scaling function of the multiresolution representation. Each of the spaces $V_j$ is called a scale space, or, more precisely, the space of scale $2^j$.

    The orthogonal projection of $f \in L^2(\mathbb{R})$ onto the space $V_j$ is obtained by a filtering process of $f$ with the different kernels $\phi_{j,k}$, $k \in \mathbb{Z}$, which define low-pass filters. The Haar multiresolution representation is defined by the scaling function

    $$\phi(t) = \begin{cases} 0, & \text{if } t < 0 \text{ or } t \ge 1 \\ 1, & \text{if } t \in [0, 1) \end{cases} \tag{20}$$

    whose family represents a basis of the subspace

    $$V_j = \{ f \in L^2(\mathbb{R});\ f|_{[2^j k,\, 2^j (k+1))} = \text{constant},\ k \in \mathbb{Z} \} \tag{21}$$

    That is, the projection of $f$ on the scale space $V_j$ is given by a function that is constant on each interval $[2^j k, 2^j (k+1))$.

    Let us now interpret geometrically the sequence of nested scale spaces in a multiresolution representation. Denoting by $a_j$ the cut-off frequency of these low-pass filters, we can say that the space $V_j$ is constituted of functions whose frequencies are contained in the interval $[-a_j, a_j]$, $a_j > 0$. Going to a finer scale $V_{j-1}$ we change to the interval $[-a_{j-1}, a_{j-1}]$, where the relation between the two subspaces $V_j$ and $V_{j-1}$ is given by

    $$V_{j-1} = V_j \oplus W_j \tag{22}$$

    where $W_j$ is the detail space that comprises all the functions of $L^2(\mathbb{R})$ with frequencies in the band $[a_j, a_{j-1}]$ of the spectrum. Thus, $W_j$ is orthogonal to $V_j$, and the above states that a function represented on a finer scale space $V_{j-1}$ is obtained from the representation on a coarser scale space $V_j$ by adding the details $W_j$. The details can be obtained by projecting a function $f$ onto each subspace $W_j$ using band-pass filtering whose pass-band is exactly $[a_j, a_{j-1}]$. In fact, this filtering process can be computed by projecting $f$ onto an orthogonal basis of wavelets. For each $j \in \mathbb{Z}$ there exists an orthonormal basis of wavelets $\{\psi_{j,k},\ k \in \mathbb{Z}\}$ of the space $W_j$. Therefore, if $R_j$ is the representation operator on the scale space $V_j$, we have, for all $f \in L^2(\mathbb{R})$,

    $$R_{j-1}(f) = R_j(f) + \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \psi_{j,k} \tag{23}$$

    The second term represents the orthogonal projection of the function $f$ onto the space $W_j$ and will be denoted by $\mathrm{Proj}_{W_j}(f)$. Now rewriting Eq. (22) in terms of filters we have

    $$R_{j-1}(f) = R_j(f) + \mathrm{Proj}_{W_j}(f), \qquad R_{j-2}(f) = R_{j-1}(f) + \mathrm{Proj}_{W_{j-1}}(f) \tag{24}$$

    Iterating this equation for $R_{j-2}, \ldots, R_{j-j_0}$, summing up both sides and performing the proper cancellations, we obtain

    $$R_{j-j_0}(f) = R_j(f) + \mathrm{Proj}_{W_j}(f) + \mathrm{Proj}_{W_{j-1}}(f) + \cdots + \mathrm{Proj}_{W_{j-j_0+1}}(f) \tag{25}$$

    The projection $R_j(f)$ represents a low-resolution version (approximation) of the signal, obtained using the successive low-pass filters $\phi_j, \phi_{j-1}, \ldots, \phi_{j-j_0}$.

    The terms $\mathrm{Proj}_{W_j}(f), \ldots, \mathrm{Proj}_{W_{j-j_0+1}}(f)$ represent the details of the signal lost in each low-pass filtering. These details are obtained by filtering the signal using the wavelets $\psi_j, \psi_{j-1}, \ldots, \psi_{j-j_0+1}$.
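The decomposition into a coarse approximation plus successive details can be sketched in the coefficient domain with the orthonormal Haar filters (pairwise sums and differences divided by the square root of two). This is only an illustration under stated assumptions: the Haar choice, the function names, and the sample values are hypothetical, and the convention used here is that each analysis step moves to a coarser scale, so that each synthesis step adds one detail level back, in the spirit of Eq. (24).

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """One analysis level: (approximation, detail) coefficients of an even-length list."""
    half = len(x) // 2
    approx = [(x[2 * k] + x[2 * k + 1]) / SQRT2 for k in range(half)]
    detail = [(x[2 * k] - x[2 * k + 1]) / SQRT2 for k in range(half)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """One synthesis level: finer approximation = coarser approximation + details."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def haar_decompose(x, levels):
    """Return (coarsest approximation, list of details ordered from finest to coarsest)."""
    approx, details = list(x), []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def haar_reconstruct(approx, details):
    """Add the details back, coarsest first, to recover the original samples."""
    x = list(approx)
    for d in reversed(details):
        x = haar_inverse_step(x, d)
    return x


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # hypothetical samples
approx, details = haar_decompose(signal, 3)
rebuilt = haar_reconstruct(approx, details)
```

Because the filters are orthonormal, the coefficients carry the same energy as the samples and `rebuilt` matches `signal` up to rounding, which is the discrete counterpart of recovering the fine-scale representation from the coarse one plus all the details.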

    4. Experimental setup

    The active layers of the array consist of pure and doped SnO2 thin films prepared by means of sol-gel technology. Pd, Pt, Os, and Ni were chosen as doping elements, starting from different precursors for the preparation of the modified films. The films, whose thickness was 100 nm, were deposited on alumina substrates supplied with interdigitated electrodes and a platinum heater, by the spin coating technique at 3000 rpm, dried at 80 °C and heat treated in air at 600 °C. After deposition, the sensors were mounted onto a TO8 socket and inserted in the test chamber.

    Samples of different compounds were introduced into a vial kept at room temperature by a thermostatic tank. Many subsequent measurements were performed for each sample, fixing the exposure time and the purging time at 20 min. The responses were acquired with a sampling interval of 32 s, giving 75 points for each response.

    Three gases have been taken into consideration (216 measurements in total): acetone, hexanal and pentanone, in 50% relative humidity (RH) and in dry air. For a preliminary analysis, these gases in 50% RH were considered.

    5. Results and discussion

    Three different kinds of analysis have been carried out by using PCA and a neural network. The first analysis is purely qualitative and for visualization purposes: it projects the data obtained by the several feature extraction methods previously discussed onto the first two useful principal components.

    Principal component analysis is usually carried out as a low-pass filter, in order to reduce noise in the signal, keeping the components corresponding to the first few eigenvectors, which capture most of the variance contained in the data set. Usually, the transformation is made into two- or three-dimensional spaces. Let $A$ be the truncated transformation matrix constituted of the first useful principal components of the correlation matrix of the data set. In the experiments that follow in this paper, the first three principal components from the measurements $y_{ij}$ have been extracted, so the matrix $A$ is $3 \times n$ and the observation becomes the three-dimensional vector $x_j = A y_j$.

    The second analysis is deeper because it is based on the classification results of a radial basis function neural network, which is given a pattern composed of the coefficients extracted with a pre-processing method. The training and validation are performed using the leave-one-out procedure, which provides an estimate of the generalization performance of the final classifier.
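The projection $x_j = A y_j$ used in the first analysis can be sketched as follows. This is a minimal illustration, not the authors' implementation: the leading eigenvectors of the correlation matrix are estimated by power iteration with deflation (an assumption chosen to keep the example dependency-free), the toy data are hypothetical, and only two components are kept because the toy data have three features.

```python
import math


def standardize(rows):
    """Centre each column and scale it to unit variance (columns must not be constant)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(d)]
    stds = [math.sqrt(sum((r[c] - means[c]) ** 2 for r in rows) / (n - 1)) for c in range(d)]
    return [[(r[c] - means[c]) / stds[c] for c in range(d)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized observations."""
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)] for i in range(d)]


def power_iteration(mat, iters=500):
    """Dominant eigenpair of a symmetric matrix (simple sketch, fixed start vector)."""
    d = len(mat)
    v = [1.0 + 0.1 * i for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v


def pca_matrix(rows, k):
    """Return (A, z): the k x d truncated transformation matrix and the standardized data."""
    z = standardize(rows)
    c = correlation_matrix(z)
    a = []
    for _ in range(k):
        lam, v = power_iteration(c)
        a.append(v)
        # Deflation: remove the component just found before searching for the next one.
        c = [[c[i][j] - lam * v[i] * v[j] for j in range(len(v))] for i in range(len(v))]
    return a, z


def project(a, z):
    """Compute x = A y for every standardized observation y."""
    return [[sum(ai * yi for ai, yi in zip(row, obs)) for row in a] for obs in z]


# Hypothetical feature vectors: six observations, three features.
data = [[1.0, 2.0, 0.5], [2.0, 4.1, 0.1], [3.0, 5.9, 0.9],
        [4.0, 8.2, 0.3], [5.0, 9.8, 0.7], [6.0, 12.1, 0.2]]
A, Z = pca_matrix(data, k=2)
scores = project(A, Z)
```

The rows of `A` are orthonormal, and the first column of `scores` carries more variance than the second, which is what makes the first components suitable for a score plot.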

    The RBF network used creates neurons one at a time. At each iteration, the input vector that lowers the network sum-squared error the most is used to create a new radial basis neuron. The error of the new network is checked, and if it is low enough the learning phase is finished. Otherwise the next neuron is added. This procedure is repeated until the error goal (0.001) is met, or the maximum number of neurons is reached (i.e. the number of training vectors, 215). A spread of 0.8 is used for the radial basis functions in order to ensure that more than one neuron can respond to overlapping regions of the input space.
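The incremental construction described above can be sketched as a greedy loop. This is a simplified illustration under several assumptions that are not stated in the paper: a single scalar output, Gaussian units whose response falls to 0.5 at a distance equal to the spread, no output bias, and output weights refitted by least squares through the normal equations at every step. All function names and the toy data are hypothetical.

```python
import math


def gaussian(x, center, spread):
    """Radial basis response; the scaling (0.5 at distance == spread) is an assumption."""
    b2 = -math.log(0.5) / (spread * spread)
    dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-b2 * dist2)


def solve(a, y):
    """Solve the square system a w = y by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(a)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w


def fit_weights(phi, targets):
    """Least-squares output weights via the normal equations (adequate for tiny problems)."""
    k = len(phi[0])
    ata = [[sum(row[i] * row[j] for row in phi) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * t for row, t in zip(phi, targets)) for i in range(k)]
    return solve(ata, aty)


def sum_squared_error(phi, weights, targets):
    return sum((sum(p * w for p, w in zip(row, weights)) - t) ** 2
               for row, t in zip(phi, targets))


def train_rbf(inputs, targets, goal=0.001, spread=0.8, max_neurons=None):
    """Add one neuron per iteration, centred on the input that lowers the error the most."""
    max_neurons = max_neurons or len(inputs)
    centers, weights, history = [], [], []
    while len(centers) < max_neurons:
        best = None
        for cand in inputs:
            if cand in centers:
                continue
            trial = centers + [cand]
            phi = [[gaussian(x, c, spread) for c in trial] for x in inputs]
            w = fit_weights(phi, targets)
            err = sum_squared_error(phi, w, targets)
            if best is None or err < best[0]:
                best = (err, cand, w)
        if best is None:
            break
        history.append(best[0])
        centers.append(best[1])
        weights = best[2]
        if best[0] <= goal:
            break
    return centers, weights, history


# Hypothetical one-dimensional toy problem with alternating targets.
inputs = [[0.0], [1.0], [2.0], [3.0]]
targets = [0.0, 1.0, 0.0, 1.0]
centers, weights, history = train_rbf(inputs, targets)
```

In `history` the sum-squared error never increases from one neuron to the next, and the loop stops either when the goal of 0.001 is met or when every training vector has been used as a centre.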

    Analysis in the frequency domain has been carried out andFig. 3 shows three curves relative to one of the three gasesunder consideration. The electronic nose responses the samefrequency domain but different magnitudes for differentodors. So an appropriate low-pass lter is suitable for featureextraction since the responses have the useful informationwe need lying in low frequencies, as opposed to noise that ispresent in high frequencies. Unfortunately, drift also residesin low frequencies and this method carries also this problemin the transformation, but the reduction of the drift is outsidethe scope of this paper. As introduced in the previoussections, at each level (scale) of signal decomposition,wavelet, coef cients are divided by approximation (lowfrequencies) and details (high frequencies). Also, the coef -cients are obtained by convolving the wavelet functionand the response curve, thus, measuring their correlation

    degree.The mother wavelet used is the Daubechies family since

    they guarantee an orthogonal analysis a necessary conditionfor feature extraction. After choosing the mother wavelet touse. the next investigation is the level to stop for getting theapproximation coef cients of the DWT. The higher the scalethe more low frequencies are ampli ed and high frequenciesare cut off. But this is not an in nite process, in fact thedecomposition level has a lower limit that is the samplingperiod and a superior limit that is the signal support. The

    Fig. 3. Power spectra density computed on three different odor responses.

    280 C. Distante et al. / Sensors and Actuators B 87 (2002) 274–288


    maximum decomposition level of the DWT is the base-two logarithm of the length of the signal. Since each transient has a length of approximately 80 samples, the sixth level is appropriate, where $2^6 = 64 < 80$ (the seventh level would require a signal longer than 80 samples, since $2^7 = 128$).
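As a worked check of this bound, taking the stated transient length of about $N = 80$ samples (the symbol $L_{\max}$ is introduced here only for illustration):

```latex
L_{\max} = \lfloor \log_2 N \rfloor = \lfloor \log_2 80 \rfloor = \lfloor 6.32 \rfloor = 6,
\qquad 2^{6} = 64 \le 80 < 128 = 2^{7}
```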

    The graph of the coefficients for one sensor and three exposures of the sensor array to the same odor (hexanal) is given in Fig. 5, where coefficients up to the sixth decomposition level are shown. The first 13 coefficients on the x-axis show the approximation and the remaining ones show the details. It is interesting to note that the first coefficients of the three curves all fall in the same position, as shown by the leftmost patterns (they overlap each other). This suggests that coefficients of the same order related to the same odor take the same values. So in this graph the first three orders of coefficients can be taken as features to be subsequently classified by a suitable classifier. As a counter-example, Fig. 4 shows coefficients related to different odors. Here, the first coefficients of the approximation of the three odor curves do not discriminate well, since they overlap. The discrimination starts from the third order of coefficient and extends to the sixth, and the scale is larger than that of the coefficients shown in Fig. 5. This indicates that six coefficients are suitable as features and that the wavelet transform enlarges the range of the coefficients for different odors while keeping small the variability of the coefficients related to the same odor (compare the two figures for the fifth coefficient).
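A minimal sketch of this kind of wavelet feature extraction is given below. It is not the code used for the experiments: the order of the Daubechies filter is not stated here, so the Haar wavelet (the shortest member of the Daubechies family, db1) is swapped in, odd-length signals are padded by repeating the last sample, and the function names are hypothetical. With a longer Daubechies filter the number of approximation coefficients at level six would differ (13 are reported for Fig. 5).

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(x) % 2:
        x = np.append(x, x[-1])  # boundary handling is an assumption
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return approx, detail

def haar_wavedec(signal, level):
    """Coefficients ordered as [cA_level, cD_level, ..., cD_1]."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(level):
        approx, detail = haar_step(approx)
        details.append(detail)
    return np.concatenate([approx] + details[::-1])

def wavelet_features(responses, level=6, n_coeffs=6):
    """First n_coeffs coefficients of each sensor response, concatenated."""
    return np.concatenate([haar_wavedec(r, level)[:n_coeffs] for r in responses])
```

For an array of five sensors, `wavelet_features` returns a pattern of 30 values, matching the size of the wavelet feature vector used for classification.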

    The transient has been analyzed in three time parts of the response: rise time, recovery time and the complete curve (i.e. from the rise time to the next time a new odor is presented). Over all these experiments, the extraction of the first six coefficients over the complete transient has shown the best results. The recovery time is more informative than the rise time, but in general better results are obtained with the complete curve. Fig. 6 shows the score plot of the first two principal components computed in the wavelet space for visualization. Tables 2 and 3 give the classification results with RBF for input patterns of 30 wavelet descriptors and for the relative method, respectively.
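The projection used for such score plots can be sketched as below. The sketch assumes that the feature matrix is only mean-centered before the decomposition (whether the data were also scaled to unit variance is not stated), and `pca_scores` is a hypothetical name.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X (observations x features) onto the leading
    principal components; also return the explained-variance ratios."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # mean-centering only
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T             # coordinates in the PC plane
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]
```

Plotting the two columns of `scores` against each other, with one marker style per odor class, gives a score plot of the kind shown in Fig. 6.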

    Traditional methods such as the relative, fractional, difference and log parameters show almost the same behavior for the data under consideration in terms of classification.
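For reference, the four steady-state parameters can be written as in the sketch below. The relative form $I_s/I_0$ is the one quoted in the caption of Fig. 7; the other three follow the definitions commonly used in the electronic nose literature and are assumed here, as are the natural logarithm and the function name.

```python
import math

def steady_state_features(i0, i_s):
    """Steady-state parameters from the baseline value i0 and the
    saturation value i_s of one sensor response."""
    return {
        "relative": i_s / i0,
        "fractional": (i_s - i0) / i0,
        "difference": i_s - i0,
        "log": math.log(i_s / i0),  # base of the logarithm is an assumption
    }
```

Each method yields a single feature per sensor, so an array of five sensors gives a five-element pattern.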

    As a comparison, Fourier descriptors have also been tested against wavelet descriptors. Fourier descriptors provide a frequency measure of the curve under consideration without localizing the frequencies in the time domain. The fast Fourier transform (FFT) is used, which allows us to compute the discrete transformation very quickly with a good approximation. The first 10 descriptors have been

    Fig. 4. Coefficients extracted up to the sixth level of DWT decomposition for three responses of the three considered odors (pentanone, hexanal, and acetone).



    considered for each curve, since they possess the greatest magnitude and low frequencies dominate in the response curve. The feature vector pattern is then composed of 50 descriptors, which have been projected onto the first two principal components for visualization, as shown in Figs. 7 and 8, and given to the RBF network to assess their discrimination capability (Table 4).
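The construction of the 50-element Fourier pattern can be sketched as follows. It is assumed here that the descriptors are the magnitudes of the first 10 FFT coefficients, including the DC term; the function names are hypothetical.

```python
import numpy as np

def fourier_descriptors(response, n_desc=10):
    """Magnitudes of the first n_desc FFT coefficients of one response."""
    spectrum = np.fft.rfft(np.asarray(response, dtype=float))
    return np.abs(spectrum[:n_desc])

def fourier_feature_vector(responses, n_desc=10):
    """Concatenate the descriptors of all sensor responses."""
    return np.concatenate([fourier_descriptors(r, n_desc) for r in responses])
```

With five sensor responses the resulting vector has $5 \times 10 = 50$ elements.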

    Integrals and derivatives have also been investigated to benchmark them against the feature extraction methods discussed above.

    Fig. 5. Coefficients extracted up to the sixth level of DWT decomposition for three responses to the same odor (hexanal).

    Table 2
    Confusion matrix of the wavelet analysis

          1    2    3    4    5    6    7
     1   28    0    0    0    0    0    0
     2    0   21    0    0    0    0    0
     3    0    0   25    0    0    0    0
     4    0    0    0   29    0    0    0
     5    0    0    0    0   35    0    0
     6    0    0    0    0    0   22    0
     7    0    0    0    0    0    0   60

    The total recognition percentage is 100%. (1) Humidity; (2) acetone; (3) acetone in 50% RH; (4) hexanal; (5) hexanal in 50% RH; (6) pentanone; (7) pentanone in 50% RH.

    Table 3
    Confusion matrix of the relative method feature extraction

          1    2    3    4    5    6    7
     1   26    0    2    0    0    0    0
     2    0   21    0    0    0    0    0
     3    4    0   16    0    5    0    0
     4    0    0    0   28    0    1    0
     5    1    0    1    0   31    0    2
     6    0    0    0   22    0    0    0
     7    0    0    0    0    3    0   57

    The total recognition percentage is 81.36%. (1) Humidity; (2) acetone; (3) acetone in 50% RH; (4) hexanal; (5) hexanal in 50% RH; (6) pentanone; (7) pentanone in 50% RH.
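The total recognition percentage quoted under each confusion matrix is the sum of the diagonal divided by the total number of patterns. The short sketch below reproduces the figure for Table 3; it assumes that rows correspond to the true class and columns to the assigned class, and the function name is hypothetical.

```python
import numpy as np

# Values of Table 3 (relative method); rows assumed to be the true class
table3 = np.array([
    [26,  0,  2,  0,  0,  0,  0],
    [ 0, 21,  0,  0,  0,  0,  0],
    [ 4,  0, 16,  0,  5,  0,  0],
    [ 0,  0,  0, 28,  0,  1,  0],
    [ 1,  0,  1,  0, 31,  0,  2],
    [ 0,  0,  0, 22,  0,  0,  0],
    [ 0,  0,  0,  0,  3,  0, 57],
])

def recognition_rate(cm):
    """Percentage of patterns on the diagonal of a confusion matrix."""
    cm = np.asarray(cm)
    return 100.0 * np.trace(cm) / cm.sum()

print(f"{recognition_rate(table3):.2f}%")  # 179 of 220 patterns -> 81.36%
```

Note that all 22 pentanone patterns in dry air (row 6) fall in the hexanal column, which accounts for most of the loss.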



    The aim is the computation of the integral of the function $f$ in the interval $[a, b]$, where $a$ represents the time step of the concentration change in the rise time, and $b$ the end of the recovery time (where $\dot{f}(t) = 0$). The applied Newton–Cotes method [15] is based on the substitution of the function that represents the transient response with a simpler approximating function:

    $$I = \int_a^b f(x)\,dx \approx \int_a^b f_n(x)\,dx \tag{26}$$

    where $f_n(x)$ is a polynomial function of order $n$ defined as follows:

    $$f_n(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} + a_n x^n \tag{27}$$

    The order of the polynomial function determines the accuracy of the method; in our experiments we used Simpson's 1/3 rule, which uses a second-order polynomial function.

    Simpson's rule finds the area under the parabola which passes through three points on a curve (the two end points and the midpoint, i.e. $x_0$, $x_2$ and $x_1$). In essence, the rule approximates the curve by a series of parabolic arcs, and the area under the parabolas is approximately the area under the curve. Simpson's rule, as well as the other Newton–Cotes methods,

    Fig. 6. Principal component analysis done in the wavelet space for the three gases measured in dry and humid air. Each observation has 30 coefficients (features) since, for each response of the five sensors, the first six coefficients are extracted (compare the result with the traditional method shown in Fig. 7).

    Table 4
    Confusion matrix of the FFT method

          1    2    3    4    5    6    7
     1   28    0    0    0    0    0    0
     2    0   21    0    0    0    1    0
     3    0    0   25    0    2    0    0
     4    0    0    0   29    0    1    0
     5    0    0    1    0   34    0    1
     6    0    0    0    0    0   22    0
     7    0    0    0    0    2    0   60

    The total recognition percentage is 96.36%. (1) Humidity; (2) acetone; (3) acetone in 50% RH; (4) hexanal; (5) hexanal in 50% RH; (6) pentanone; (7) pentanone in 50% RH.



    can be applied only if the points are equally spaced. The transient response has been split into 25 intervals, where the area of each is computed as follows:

    $$I \approx (x_2 - x_0)\,\frac{f(x_0) + 4 f(x_1) + f(x_2)}{6} \tag{28}$$
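Applied to a sampled response, Eq. (28) can be sketched as below. The sketch assumes equally spaced samples and an odd number of points, so that the curve splits into consecutive triples $(x_0, x_1, x_2)$; the function name and the `dx` parameter are illustrative.

```python
import numpy as np

def simpson_area(y, dx=1.0):
    """Composite Simpson's 1/3 rule over equally spaced samples y."""
    y = np.asarray(y, dtype=float)
    if y.size < 3 or y.size % 2 == 0:
        raise ValueError("need an odd number (>= 3) of equally spaced samples")
    f0, f1, f2 = y[0:-2:2], y[1:-1:2], y[2::2]  # end, mid and end points of each arc
    # Eq. (28) for every interval, with x2 - x0 = 2 * dx
    return float(np.sum(2.0 * dx * (f0 + 4.0 * f1 + f2) / 6.0))
```

With 51 samples the response is covered by 25 parabolic arcs, in line with the 25 intervals mentioned above. Because each arc is a second-order fit, the rule is exact for polynomial curves up to third order.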

    The observation pattern is composed of five features, each one describing the area of the corresponding sensor response. Fig. 9 shows the PCA plot of the extracted features. The five features are given to the RBF for training and classification with the leave-one-out procedure, and the confusion matrix is given in Table 5. It is interesting to note that this method produces very informative features, comparable with the results obtained with the wavelet descriptors.
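The leave-one-out procedure itself can be sketched independently of the classifier. In the sketch below a nearest-centroid rule is swapped in for the RBF network purely to keep the example short; it is not the classifier used here, class labels are assumed to be integers starting from 0, and all names are hypothetical.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier: assign x to the class with the closest mean."""
    labels = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in labels])
    return labels[np.argmin(((centroids - x) ** 2).sum(axis=1))]

def leave_one_out_confusion(X, y, n_classes, predict=nearest_centroid_predict):
    """Hold out each pattern in turn, train on the rest, and accumulate
    the confusion matrix (rows: true class, columns: assigned class)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        pred = predict(X[keep], y[keep], X[i])
        cm[y[i], pred] += 1
    return cm
```

Any classifier with the same `predict(X_train, y_train, x)` signature, such as an RBF network trained on the retained patterns, can be passed in place of the stand-in.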

    Another applied method is the study of the local gradient over the whole transient response. In this case too, the study of the local gradient has been carried out by approximation, using the Taylor series. Starting from a first-order Taylor series, $f(x_{i+1}) \approx f(x_i) + f'(x_i)(x_{i+1} - x_i)$, we can approximate the derivative at the point $x_i$ as follows:

    $$f'(x_i) \approx \frac{f(x_{i+1}) - f(x_i)}{x_{i+1} - x_i} \tag{29}$$

    The mean derivative has been computed over intervals of 10 point samples, so that for the whole transient response a

    Fig. 7. Result of the traditional relative method ($I_s/I_0$) for the extraction of the features from one response of each sensor, followed by principal component analysis on the observation space of five elements (one observation from each sensor, as opposed to the six extracted with the wavelet analysis).

    Table 5
    Confusion matrix of the integration method

          1    2    3    4    5    6    7
     1   28    0    0    0    0    0    0
     2    0   21    0    0    0    0    0
     3    0    0   25    0    0    0    0
     4    0    0    0   29    0    0    0
     5    0    0    0    0   34    0    1
     6    0    0    0    0    0   22    0
     7    0    0    0    0    0    0   60

    The total recognition percentage is 99.54%. (1) Humidity; (2) acetone; (3) acetone in 50% RH; (4) hexanal; (5) hexanal in 50% RH; (6) pentanone; (7) pentanone in 50% RH.



    number of $\lfloor 75/10 \rfloor = 7$ features are obtained, leading to 35 features for each observation of the array of five sensors. Fig. 10 shows the result of the mean derivative method by using PCA, and Table 6 gives the confusion matrix of these feature vectors (seven features per sensor) as classified by the RBF.
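A sketch of the mean derivative features is given below. It assumes a forward difference between consecutive samples, a unit sampling step by default, and non-overlapping windows of 10 differences, with any incomplete final window discarded; the function name is hypothetical.

```python
import numpy as np

def mean_derivative_features(response, dt=1.0, window=10):
    """Mean of the forward-difference derivative over consecutive windows."""
    y = np.asarray(response, dtype=float)
    slopes = np.diff(y) / dt              # forward difference at each sample
    n_win = len(slopes) // window         # incomplete last window is dropped
    return slopes[: n_win * window].reshape(n_win, window).mean(axis=1)
```

For a 75-sample transient this gives seven values per sensor, and concatenating the five sensors gives the 35-element pattern.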

    All of the compounds in dry air have shown good results. Only the cluster related to humidity is well defined and separable from the others in all the experiments, while hexanal and acetone are mixed up in dry air. Among all the steady-state methods, the difference method has shown the best results.

    However, when the compounds are analyzed in the presence of humidity (50% RH), the performance of the classifier decreases. It is interesting to note that, after inspecting the data with PCA, humid compounds and dry compounds are separated from each other. Traditional methods provide poor results in the first two principal dimensions; in fact, in the two macro-clusters (dry and humid) the compounds have disordered positions. Hexanal and pentanone in dry air have correlated features, making it difficult for the RBF to classify them correctly. It has been shown that features made only of steady-state information discriminate weakly between classes of odors. Features extracted from the transient response are more informative but, as opposed to traditional steady-state methods, they inherit the long-term stability problems of the sensors. These are shown by the humidity and pentanone measurements. The derivative method gives better results, with a 95% classification rate when the whole curve is analyzed. Some of the transient analysis methods have also been investigated for the rise and recovery time. A study of the derivative method restricted to the rise time and recovery time has produced 95 and 93% recognition rates, respectively. The integration method has shown better results both in the PCA space

    Fig. 8. Principal component analysis done in the Fourier space for the three gases measured in dry and humid air. Each observation has 50 features since, for each response of the five sensors, the first 10 coefficients are extracted.

    Table 6
    Confusion matrix of the gradient method

          1    2    3    4    5    6    7
     1   28    0    0    0    0    0    0
     2    0   20    0    1    0    0    0
     3    0    0   25    0    0    0    0
     4    0    0    0   29    0    0    0
     5    0    0    4    0   29    0    2
     6    0    1    0    0    0   21    0
     7    0    0    0    0    2    0   58

    The total recognition percentage is 95.45%. (1) Humidity; (2) acetone; (3) acetone in 50% RH; (4) hexanal; (5) hexanal in 50% RH; (6) pentanone; (7) pentanone in 50% RH.



    Fig. 9. Principal component analysis with the integration method for gases measured in dry and humid air. Each observation has five features corresponding to the areas computed on the responses starting from the release of the aroma in the test chamber.

    Fig. 10. PCA plot of the features extracted with the derivative method.


    and with RBF classification, leading to 99% over the whole response and 96 and 98% for the rise and recovery time, respectively. The multiresolution approach with wavelet analysis conducted on the whole response has shown the best performance over all the methods presented here. In this case, both linear separation and better classification results have been obtained.

    The rise time and recovery time parts of the curve gave classification rates of 99 and 100%, respectively.

    6. Conclusions

    A new method to characterize several odors from an array of gas sensor responses has been described in this paper. The problem addressed was the study of the transient response in terms of the informative content stored in the curve (rise part, recovery part and both). The recovery part is more informative than the rise part, but the use of the whole curve carries more information. The wavelet analysis is then performed on each transient response. The feature vector is obtained from the approximation coefficients of the multiresolution analysis with the wavelet transform and is analyzed with principal component analysis and a radial basis function neural network. Because the wavelet transform captures the frequency content of the signal at each decomposition level, it would also have been reasonable to build a feature vector composed of the maximum approximation coefficients at each level. However, this performed worse than the method we used, since a good compression rate and the most discriminative features were obtained with the coefficients of the highest decomposition level.

    The new method is compared with the most common signal processing techniques in the electronic nose community. The measurements performed have shown that, when wavelet descriptors are used, the capability of the recognition system to classify three organic compounds improves notably. Table 7 gives the recognition results obtained by using an RBF neural network trained with the features extracted with the considered methods.

    References

    [1] E. Llobet, J. Brezmes, X. Vilanova, J. Sueiras, Qualitative and quantitative analysis of volatile organic compounds using transient and steady-state responses of a thick-film tin oxide gas sensor array, Sens. Actuators B 41 (1997) 13–21.

    [2] T. Eklöv, P. Mårtensson, I. Lundström, Enhanced selectivity of MOSFET gas sensors by systematical analysis of transient parameters, Anal. Chim. Acta 353 (1997) 291–300.

    [3] T. Eklöv, P. Mårtensson, I. Lundström, Selection of variables for interpreting multivariate gas sensor data, Anal. Chim. Acta 381 (1999) 221–232.

    [4] R. Gutierrez-Osuna, H. Troy Nagle, S.S. Schiffman, Transient response analysis of an electronic nose using multi-exponential models, Sens. Actuators B 61 (1999) 170–182.

    [5] A. Galdikas, A. Mironas, D. Senuliene, V. Strazdiene, A. Setkus, D. Zelenin, Response time based output metal oxide gas sensors applied to evaluation of meat freshness with neural signal analysis, Sens. Actuators B 69 (2000) 258–265.

    [6] J. Ratton, T. Kunt, T. McAvoy, T. Fuja, R. Cavicchi, S. Semancik, A comparative study of signal processing techniques for clustering microsensor data, Sens. Actuators B 41 (1997) 105–120.

    [7] E. Llobet, J. Brezmes, R. Ionescu, X. Vilanova, S. Al-Khalifa, J.W. Gardner, N. Barsan, Wavelet transform and fuzzy ARTMAP-based pattern recognition for fast gas identification using a micro-hotplate gas sensor, Sens. Actuators B 83 (2002) 238–244.

    [8] C. Distante, P. Siciliano, L. Vasanelli, Odour discrimination using adaptive resonance theory, Sens. Actuators B 69 (2000) 248–252.

    [9] J.W. Gardner, P.N. Bartlett, Electronic Noses: Principles and Applications, Oxford University Press, Oxford, 1999.

    [10] M. Holmberg, F. Winquist, I. Lundström, F. Davide, C. Di Natale, A. D'Amico, Drift counteraction for an electronic nose, Sens. Actuators B 35/36 (1996) 528–535.

    [11] M. Holmberg, F. Winquist, I. Lundström, F. Davide, C. Di Natale, A. D'Amico, Drift counteraction in odour recognition application: lifelong calibration method, Sens. Actuators B 42 (1997) 185–194.

    [12] C. Distante, P. Siciliano, K.C. Persaud, Dynamic cluster recognition using multiple self-organizing maps, Pattern Analysis and Applications, 2002, in press.

    [13] J.W. Gardner, H.V. Shurmer, T.T. Tan, Application of an electronic nose to the discrimination of coffees, Sens. Actuators B 6 (1992) 71–75.

    [14] S. Mallat, A Wavelet Tour of Signal Processing, 2nd Edition, Academic Press, New York, 1999.

    [15] S.C. Chapra, R.P. Canale, Numerical Methods for Engineers with Personal Computer Applications, 2nd Edition, McGraw-Hill, New York, 1988.

    Biographies

    Cosimo Distante was born in Francavilla Fontana (Brindisi province), Italy, in 1970. He received a Laurea degree in Computer Science from the University of Bari in 1997, and a PhD degree in Material Engineering from the University of Lecce, Italy. He was a visiting researcher at the Computer Science Department of the University of Massachusetts at Amherst in 1998–1999. In 2001 he joined the Institute for Microelectronics and Microsystems (IMM) of the Italian National Research Council (CNR) as a research scientist. Dr. Distante is mainly interested in the fields of pattern recognition, robot learning, computer vision and intelligent interfaces for networked transducers.

    Marco Leo received a Laurea degree in Computer Engineering from the University of Lecce in 2001, with a thesis on signal processing techniques applied to the electronic nose. He is currently at the Institute for the Study of Intelligent Systems for Automation (ISSIA) of the Italian

    Table 7
    Comparison between several feature extraction methods in terms of informative contents contained in each pattern

    Pre-processing method    Recognition rate (%)
    Relative                 81
    Log parameter            81
    Difference               83
    Fractional               82
    Derivative               95.45
    Fourier coefficient      96
    Integration              99.5
    Wavelet coefficient      100

    The results are based on the leave-one-out procedure performed with the RBF neural network.



    National Research Council (CNR), where his main research activity is the study of human perception algorithms.

    Pietro Siciliano received his degree in Physics in 1985 from the University of Lecce. He gained his PhD in Physics in 1989 at the University of Bari. Initially, he was involved in research in the field of electrical characterization of semiconductor devices. He is currently a senior member of the National Research Council at the Institute for Microelectronics and Microsystems (IMM), where he has been working in the field of preparation and characterization of thin films for gas sensors. Dr. Siciliano is now responsible for the IMM branch in Lecce, Italy.

    Krishna C. Persaud BSc (Hons.) Biochemistry (1976), University of Newcastle-upon-Tyne, UK; MSc Molecular Enzymology (1977), University of Warwick, UK; PhD Olfactory Biochemistry (1980), University of Warwick, UK. He has research interests in the area of olfaction from physiology to chemistry and has been involved in the development of gas sensor arrays for sensing odors based on conducting polymers. He has worked in olfactory research in Italy and the USA, was appointed as Lecturer in the Department of Instrumentation and Analytical Science, University of Manchester Institute of Science and Technology, UK, in 1988, and is currently Senior Lecturer in the department.
