CHAPTER IV PREPROCESSING & FEATURE EXTRACTION IN ECG … shodhganga.inflibnet.ac.in/bitstream/10603/29063/12/12_chapter4.pdf

CHAPTER IV

PREPROCESSING & FEATURE EXTRACTION IN ECG SIGNALS

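The proposed ECG classification approach consists of three phases:

• Preprocessing
• Feature Extraction and Selection
• Classification

The complete process of the proposed approach is shown in Figure 4.1.

[Figure 4.1 blocks: MIT-BIH Arrhythmia Database → Noise Removal using Morphology Filter → Feature Extraction and Selection using DWT and AR Modeling → Classification using Machine Learning Techniques]

Figure 4.1: Block Diagram of the Complete Process of ECG Signal Classification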

4.1. DATASET DESCRIPTION

The experiments were conducted on ECG data from the MIT–BIH Arrhythmia Database [107]. This database was the first commonly available set of standard test material for the evaluation of arrhythmia detectors, and it has been used for that purpose, as well as for basic research into cardiac dynamics, at more than 500 sites worldwide.

The MIT–BIH Arrhythmia Database (Figure 4.2) includes 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979.

Figure 4.2: MIT–BIH Arrhythmia Database

Twenty-three recordings were selected arbitrarily from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were chosen from the same set to include less common but clinically significant arrhythmias that would not be well represented in a small random sample. The recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10 mV range. A sample ECG wave form is shown in Figure 4.3.

Figure 4.3: Sample ECG Wave from Physionet 2011 (MIT-BIH)

In particular, the considered beats belong to the following classes: normal sinus rhythm (N), atrial premature beat (A), ventricular premature beat (V), right bundle branch block (RB), left bundle branch block (LB) and paced beat (/). The beats were selected from the recordings of 20 patients, which correspond to the following files: 100, 102, 104, 105, 106, 107, 118, 119, 200, 201, 202, 203, 205, 208, 209, 212, 213, 214, 215 and 217. The notes and statistics shown in Figure 4.4 describe the content of record 100.
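As a quick check on the digitization figures quoted in this section, the Python sketch below works out the nominal amplitude step and the number of samples in one half-hour channel. It uses only the numbers stated in the text. The mid-scale baseline code and the helper name `adc_to_mv` are assumptions made for illustration.

```python
# Nominal figures taken from the dataset description in this section.
FS_HZ = 360                 # samples per second per channel
ADC_BITS = 11               # resolution in bits
RANGE_MV = 10.0             # full-scale input range in millivolts
RECORD_SECONDS = 30 * 60    # each excerpt lasts half an hour

levels = 2 ** ADC_BITS                       # number of quantization levels
step_mv = RANGE_MV / levels                  # nominal amplitude per ADC unit
samples_per_channel = FS_HZ * RECORD_SECONDS


def adc_to_mv(code, baseline_code=levels // 2):
    """Convert a raw ADC code to millivolts.

    Assumption: zero volts sits at mid-scale (hypothetical default).
    """
    return (code - baseline_code) * step_mv
```

With these numbers, one ADC unit corresponds to roughly 4.9 µV and each channel of a record holds 648,000 samples.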

Figure 4.4: Notes and Statistics of Record 100 in MIT–BIH Arrhythmia Database

The wave forms of the different beat types are shown in the following figure.

(a) Wave form of Normal Beat for Patient ID: 100
(b) Wave form of Atrial Fibrillation Disease for Patient ID: 201
(c) Wave form of Ventricular Premature Beat for Patient ID: 106
(d) Wave form of Right Bundle Branch Block Disease for Patient ID: 118
(e) Wave form of Paced Heart Beat for Patient ID: 1…
(f) Wave form of Left Bundle Branch Block for Patient ID: 214

Figure 4.5: Six Types of ECG Signal Wave forms

4.2. PREPROCESSING

The performance of the classification depends not only on the classifier but also on the features and on enhanced ECG signal processing. A Morphology Filter (MF), available as a built-in function in MATLAB, is used to remove the noise component while preserving the ECG morphology and time-domain features.

ECG signals taken from the MIT-BIH arrhythmia database are used. MF-based pre-processing removes high-frequency noise components and baseline drift while preserving the ECG morphology. MF has the good quality of preserving the sharpness of the QRS complex. It filters the baseline drift and high-frequency ECG noise with low distortion of the original ECG signal and with a low computational burden [118].

The following command is used for preprocessing the ECG signal using the Morphology Filter.

    Rsig = bwmorph(sig, 'clean');
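For readers who want to see the idea behind morphological filtering of a one-dimensional signal, the following Python sketch estimates baseline drift with an opening followed by a closing and subtracts it from the signal. It is an illustration under stated assumptions rather than the MATLAB command used in this section: `bwmorph` is designed for binary images, whereas this sketch works directly on sample amplitudes with a flat structuring element. The function names and the structuring-element widths are hypothetical choices.

```python
def erode(x, width):
    """Sliding-window minimum with a flat structuring element of odd width."""
    h, n = width // 2, len(x)
    return [min(x[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]


def dilate(x, width):
    """Sliding-window maximum with a flat structuring element of odd width."""
    h, n = width // 2, len(x)
    return [max(x[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]


def opening(x, width):
    """Erosion then dilation: removes peaks narrower than the element."""
    return dilate(erode(x, width), width)


def closing(x, width):
    """Dilation then erosion: fills pits narrower than the element."""
    return erode(dilate(x, width), width)


def remove_baseline(x, open_width, close_width):
    """Estimate the slowly varying baseline and subtract it from the signal.

    The widths are illustrative; in practice they would be chosen relative
    to the duration of the waves that must be excluded from the baseline.
    """
    baseline = closing(opening(x, open_width), close_width)
    corrected = [s - b for s, b in zip(x, baseline)]
    return corrected, baseline


# Usage: a constant 2.0 offset with one narrow spike at sample 10.
signal = [2.0] * 20
signal[10] = 5.0
corrected, baseline = remove_baseline(signal, open_width=5, close_width=7)
```

In this toy case the spike is narrower than the structuring elements, so it is excluded from the baseline estimate and survives the subtraction, while the constant offset is removed.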

Figure 4.6: Preprocessed ECG Signal of Record 100

4.3. FEATURE EXTRACTION AND SELECTION

ECG beat recognition and classification is performed with temporal and morphological features. Since these features are highly sensitive to variations in ECG morphology and in the temporal characteristics of the ECG, it is difficult to distinguish one beat type from another on the basis of the time waveform or frequency representation [66, 68]. In this phase, two different classes of feature set belonging to the isolated ECG beats are used: auto-regressive model parameters and the variance of the discrete wavelet transform detail coefficients for the different scales (1–6 scales) [97].

A. Wavelet Transformation

In this research work, feature extraction was done by applying the Discrete Wavelet Transform (DWT). The benefit of the wavelet transformation lies in its capacity to highlight a significant amount of information about the ECG signal.

Physiological signals used for diagnosis are frequently characterized by non-stationary time behavior. For such patterns, joint time and frequency representations are desirable. The frequency characteristics and the temporal behavior can be described together, subject to the uncertainty principle. The wavelet transform can represent signals at different resolutions by dilating and compressing its basis functions [72]. While the dilated functions adapt to slow wave activity, the compressed functions capture fast activity and sharp spikes. The most favorable choice of wavelet function for pre-processing is problem dependent. In this phase, the Daubechies wavelet function (Db5), which belongs to the compactly supported orthonormal wavelets, is used [101]. The DWT is obtained by discretizing the scaling factor and the position factor. For the orthonormal wavelet transform, the discrete signal x(n) can be expanded into the scaling function at level j as follows:

$$x(n) = \sum_{k} a_{j,k}\,\varphi_{j,k}(n) + \sum_{k} d_{j,k}\,\psi_{j,k}(n) \qquad (4.1)$$

where $a_{j,k}$ and $d_{j,k}$ represent the approximation and the detailed signal at level j. Note that j controls the dilation or contraction of the scale function $\varphi(t)$, k denotes the position of the wavelet function $\psi(t)$, and n represents the sample number of $x(n)$. Here $k \in Z$, where Z represents the set of integers. The frequency spectrum of the signal is divided into high-frequency and low-frequency bands for wavelet decomposition as the band index increases $(j = 1, \ldots, J)$. The wavelet transform is a two-dimensional time-scale processing method for non-stationary signals with adequate scale values and shifting in time [102].

Multiresolution decomposition efficiently provides a simultaneous representation of the signal at multiple resolutions corresponding to different time scales. Feature vectors are constructed from the normalized variances of the detail coefficients and the P-QRS-T coefficients of the DWT that belong to the related scales.

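The wavelet decomposition of the ECG signal was done with the following commands.

% Six-level DWT of the ECG signal using the Daubechies db5 wavelet.
% Each call splits the previous approximation into a new approximation (aN)
% and a detail band (dN). In the GUI, each plot is assumed to be drawn on
% its own axes; only the first axes selection appears in the source.
axes(handles.axes1);
plot(sig);
ylabel('Signal');

[a1, d1] = dwt(sig, 'db5');   % level 1
plot(d1);
ylabel('d1');

[a2, d2] = dwt(a1, 'db5');    % level 2
plot(d2);
ylabel('d2');

[a3, d3] = dwt(a2, 'db5');    % level 3
plot(d3);
ylabel('d3');

[a4, d4] = dwt(a3, 'db5');    % level 4
plot(d4);
ylabel('d4');

[a5, d5] = dwt(a4, 'db5');    % level 5
plot(d5);
ylabel('d5');

[a6, d6] = dwt(a5, 'db5');    % level 6
plot(d6);
ylabel('d6');

plot(a6);                     % final approximation
ylabel('a6');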
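The same cascade can be sketched outside MATLAB. The Python example below applies six levels of decomposition and returns the variance of each detail band, normalized so that the six values sum to one. Two simplifications are assumptions of this sketch and not the method described in this chapter: it uses the Haar wavelet instead of db5 so that no wavelet library is needed, and the sum-to-one normalization is only one plausible reading of "normalized variances". All function names are hypothetical.

```python
import math


def haar_dwt(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    x = list(x)
    if len(x) % 2:                 # pad odd-length input by repeating the last sample
        x = x + [x[-1]]
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def detail_variance_features(x, levels=6):
    """Variance of the detail band at each level, normalized to sum to one."""
    approx = list(x)
    variances = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)   # feed the approximation to the next level
        variances.append(variance(detail))
    total = sum(variances)
    return [v / total if total else 0.0 for v in variances]


# Usage on a synthetic sinusoid standing in for an ECG segment.
demo = [math.sin(2 * math.pi * i / 8) for i in range(128)]
features = detail_variance_features(demo, levels=6)
```

Each level halves the number of coefficients, so a six-level decomposition needs a segment of at least 64 samples to leave one coefficient at the coarsest scale.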

The following figures show the output wave forms of preprocessing and morphology feature extraction.

Figure 4.7: Output Screen Demanding the Signal Number from User

Figure 4.8: Resulted Signal of Record 100 after Preprocessing & DWT

The P-QRS-T points are then constructed from the normalized variances of the DWT detail coefficients that belong to the related scales, according to the following procedure.

To detect the peaks, specific detail levels of the signal are selected. R peaks are located as the largest-amplitude points in the wave that exceed a threshold. These maxima are stored and the R-R intervals are determined; their mean value is used to delimit the portion of the signal that belongs to a single beat. The Q and S points are found as the local minima before and after the R wave. The Q peak is located by measuring the distance from the zero (or near-zero) point on the left side of the R peak, within the threshold limit. The onset is the beginning of the Q wave and the offset is the end of the S wave. Normally, the onset of the QRS complex contains high-frequency components, which are detected at finer scales. The S peak is located in the same way from the zero (or near-zero) point on the right side of the R peak, within the threshold limit. The P and T points are then determined from the PR interval and the QT interval, respectively.
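The peak-location steps can be sketched in a few lines of Python. This is a minimal illustration, assuming a fixed amplitude threshold and operating on a plain list of samples rather than on selected DWT detail levels. The refractory distance between R peaks is an added assumption that does not appear in the procedure described here, and the function names, threshold and search window are hypothetical.

```python
def detect_r_peaks(sig, threshold, refractory):
    """Indices of local maxima above threshold, at least `refractory` samples apart."""
    peaks = []
    for i in range(1, len(sig) - 1):
        is_local_max = sig[i] >= sig[i - 1] and sig[i] > sig[i + 1]
        if sig[i] > threshold and is_local_max:
            if peaks and i - peaks[-1] < refractory:
                if sig[i] > sig[peaks[-1]]:
                    peaks[-1] = i          # keep the taller of two close candidates
            else:
                peaks.append(i)
    return peaks


def mean_rr(peaks, fs):
    """Mean R-R interval in seconds, or None when fewer than two peaks exist."""
    if len(peaks) < 2:
        return None
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return sum(intervals) / len(intervals)


def q_and_s(sig, r, window):
    """Q and S as the lowest samples within `window` samples before and after R."""
    left = range(max(0, r - window), r)
    right = range(r + 1, min(len(sig), r + window + 1))
    q = min(left, key=lambda i: sig[i], default=None)
    s = min(right, key=lambda i: sig[i], default=None)
    return q, s


# Usage on three copies of a crude synthetic beat (R at offset 3 of each beat).
beat = [0.0, 0.0, -0.1, 1.0, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
ecg = beat * 3
r_peaks = detect_r_peaks(ecg, threshold=0.5, refractory=5)
rr = mean_rr(r_peaks, fs=10)
q_idx, s_idx = q_and_s(ecg, r_peaks[1], window=2)
```

The mean R-R interval returned by `mean_rr` is the quantity the procedure uses to cut out the portion of the signal that belongs to a single beat.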

Figure 4.9: Detected P-QRS-T Features for Signal 100

B. Higher-order Statistics and AR Modeling

Additional statistical data are used for ECG signal feature detection. For this purpose, this research work proposes a complete procedure to extract temporal features using third-order-cumulant-based AR modeling.

The main problem in automatic ECG beat recognition and classification is that the related features are very susceptible to variations in ECG morphology and in the temporal characteristics of the ECG. In the study [96], the set of original QRS complexes typical of six types of arrhythmia taken from the MIT-BIH arrhythmia database shows great variation among beats belonging to the same type of arrhythmia. To address this problem, this approach relies on the statistical features of the ECG beats. For this purpose, the third-order cumulant has been taken into account. For zero-mean signals, the second-, third- and fourth-order cumulants can be determined as follows:

$$C_{2x}(k) = E\{x(n)\,x(n+k)\} \qquad (4.2)$$

$$C_{3x}(k,l) = E\{x(n)\,x(n+k)\,x(n+l)\} \qquad (4.3)$$

$$C_{4x}(k,l,m) = E\{x(n)\,x(n+k)\,x(n+l)\,x(n+m)\} - C_{2x}(k)\,C_{2x}(m-l) - C_{2x}(l)\,C_{2x}(m-k) - C_{2x}(m)\,C_{2x}(l-k) \qquad (4.4)$$

where E represents the expectation operator, and k, l, and m are the time lags. In this phase, the third-order cumulant of selected ECG beats is used. The cumulant is represented by ten normalized points evenly distributed within a range of 25 lags.

Linear prediction models each successive sample of a signal as a linear combination of previous samples, that is, as the output of an all-pole IIR filter. This process finds the coefficients of an nth-order auto-regressive linear process that models the time series x as

$$\hat{x}(k) = -a(2)\,x(k-1) - a(3)\,x(k-2) - \cdots - a(n+1)\,x(k-n) \qquad (4.5)$$

where x represents the real input time series (a vector) and n is the order of the denominator polynomial a(z). In block processing, the autocorrelation method is one of the all-pole modeling methods used to find the linear prediction coefficients. This method is also called the Maximum Entropy Method (MEM) of spectral analysis.
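To make the third-order cumulant concrete, the Python sketch below estimates E{x(n) x(n+k) x(n+l)} from a finite record and then samples it at ten lags spread over a range of 25. It is a sketch under stated assumptions: the chapter does not say which slice of the cumulant is sampled or how it is normalized, so the diagonal slice (k = l) and the division by the largest absolute value are hypothetical choices, as are the function names.

```python
import math


def third_order_cumulant(x, k, l):
    """Biased sample estimate of E{x(n) x(n+k) x(n+l)} after removing the mean."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]              # enforce the zero-mean assumption
    start = max(0, -k, -l)                 # keep every index inside the record
    stop = min(n, n - k, n - l)
    if stop <= start:
        return 0.0
    return sum(z[i] * z[i + k] * z[i + l] for i in range(start, stop)) / n


def cumulant_features(x, max_lag=25, n_points=10):
    """Ten normalized cumulant samples at evenly spaced lags (diagonal slice assumed)."""
    lags = [round(i * (max_lag - 1) / (n_points - 1)) for i in range(n_points)]
    values = [third_order_cumulant(x, lag, lag) for lag in lags]
    peak = max(abs(v) for v in values)
    return [v / peak if peak else 0.0 for v in values]


# Usage on a skewed synthetic waveform standing in for an isolated beat.
beat = [math.sin(0.3 * i) + 0.5 * math.sin(0.3 * i) ** 2 for i in range(200)]
features = cumulant_features(beat)
```

For a symmetric (for example Gaussian) signal the third-order cumulant tends toward zero, which is why it is useful for describing the asymmetric shape of ECG beats.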

The following commands are used for temporal feature extraction from the preprocessed ECG signal using AR modeling and the third-order cumulant. The inputs to the function are x, the input signal vector; p, the optimal AR model order; and Fs, the sampling frequency. This part of the code determines the AR parameters.

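% Power spectrum of the fitted AR model:
% Spectrum(f) = e(L) / |1 + A(L,1)*exp(-j*2*pi*f/Fs) + ...
%                         + A(L,L)*exp(-j*2*pi*f*L/Fs)|^2
% Assumes a = [1 A1 ... Ap] and e are the outputs A and E of AR below,
% order = p, fsamp = Fs, and Nfreq is the number of frequency bins.
power = zeros(1,Nfreq);
for i = 1:Nfreq
    den = 0;
    for k = 2:order+1
        den = a(k)*exp(-1j*2*pi*(i-1)*(k-1)/Nfreq) + den;
    end
    power(i) = e(order)/abs(1+den)^2;
end
% Frequency axis in Hz for the first half of the spectrum
freq = 0:fsamp/Nfreq:(Nfreq/2-1)*fsamp/Nfreq;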

function [A,E,K] = AR(x,p,Fs)
% Fs (sampling frequency) is accepted as an input but is not used here.
y = x(:).';                 % work on a row vector
A = zeros(p+1,p+1);
K = zeros(1,p);
E = zeros(1,p);
N = length(y);

% Initialization (order 1)
Rxx = (y*y')/N;             % signal power (zero-lag autocorrelation)
ef = y;                     % forward prediction error,  ef(n)=y(n)
eb = y;                     % backward prediction error, eb(n)=y(n)
Den = y(2:N)*y(2:N)' + y(1:N-1)*y(1:N-1)';
Num = y(2:N)*y(1:N-1)';
K(1) = 2*Num/Den;           % first reflection coefficient
A(1,1) = -K(1);
E(1) = Rxx*(1-K(1)^2);
ef(2:N) = y(2:N) - K(1)*y(1:N-1);
eb(1:N-1) = y(1:N-1) - K(1)*y(2:N);

% Recursion for orders 2..p
for L = 2:p
    Num = ef(L+1:N)*eb(1:N-L)';
    Den = ef(L+1:N)*ef(L+1:N)' + eb(1:N-L)*eb(1:N-L)';
    K(L) = 2*Num/Den;
    E(L) = E(L-1)*(1-K(L)^2);
    A(L,L) = -K(L);
    for j = 1:L-1
        A(L,j) = A(L-1,j) - K(L)*A(L-1,L-j);
    end
    efm = ef;
    ebm = eb;
    ef(L+1:N) = efm(L+1:N) - K(L)*ebm(1:N-L);
    eb(1:N-L) = ebm(1:N-L) - K(L)*efm(L+1:N);
end

% Return the final coefficient row with a leading 1: A = [1 A1 A2 ... Ap]
A = [1, A(p,1:p)];
end

The outputs are:

A: the final set of p AR parameters, taken from the pth row of the AR parameter matrix, i.e. A=[1 A1 A2...Ap], such that
   e(n)=y(n)+A1*y(n-1)+A2*y(n-2)+... +Ap*y(n-p);

E: error variance vector=[E(1),E(2),...,E(p)];

K: a row vector of reflection coefficients at each calculation step (from 1 to order p)
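The listing in this section estimates the reflection coefficients from forward and backward prediction errors, which appears to follow a Burg-type recursion. For comparison, the autocorrelation method named in the text can be sketched with the Levinson-Durbin recursion. The Python code below is a minimal illustration that keeps the same sign convention, e(n)=y(n)+A1*y(n-1)+...+Ap*y(n-p). The function names are hypothetical and the biased autocorrelation estimate is an assumption.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of the mean-removed signal."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    return [sum(z[i] * z[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations; returns ([1, a1..ap], error, reflections).

    Assumes r[0] > 0 (a non-constant signal).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    reflections = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err                     # reflection coefficient at step m
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k                 # prediction error shrinks at each step
        reflections.append(k)
    return a, err, reflections


def ar_features(x, order):
    """AR coefficients a1..ap and final prediction error for a signal."""
    r = autocorrelation(x, order)
    a, err, _ = levinson_durbin(r, order)
    return a[1:], err


# Usage: autocorrelation values of an ideal first-order process, r[k] = 0.5**k.
coeffs, err, refl = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the second-order fit collapses to a first-order model, since the second coefficient comes out as zero.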

In summary, the noise components in the ECG signal are removed in the preprocessing phase using the morphology filter. From the preprocessed signal, the P-QRS-T points are constructed using the DWT. Temporal features of the preprocessed ECG signal are extracted using third-order-cumulant-based AR modeling. These two feature sets form the input vectors for the classifiers.
