LAB 1 SignalProcessing


École Nationale Supérieure d'Ingénieurs de Poitiers (ENSIP)

REPORT ON ADVANCED SIGNAL PROCESSING- LAB 1

Power Spectral Density estimate using Matlab

ABSTRACT:

In this lab, we estimate the Power Spectral Density (PSD) of stationary, ergodic random signals using Matlab. PSDs are important because they make it possible to estimate the frequency content of signals, to calculate transfer functions of Linear Time Invariant (LTI) systems, and to estimate relations between signals using coherence. In Part I, a routine implementing Welch's method is constructed and tested. In Part II, this routine is used to estimate the properties of signals measured during the study of air conditioning exhaust noise.

Prepared by: Suman Khanal, International Masters in Turbulence, ENSIP

Submitted to: David MARX, Chargé de Recherche au CNRS, Institut PPRIME

Part I

1.1 Study of a windowing function

a) Create a Hanning window w of size 128 (use the hanning Matlab command). Plot it.

In signal processing, a window function is a mathematical function that is zero-valued outside of some chosen interval. When another function or waveform/data sequence is multiplied by a window function, the product is also zero-valued outside the interval: all that is left is the part where they overlap, the "view through the window".

The Hanning window (also known as the Hann window) is defined, for the $N$ samples returned by the Matlab command hanning(N), as:

$$w(k) = \frac{1}{2}\left(1 - \cos\frac{2\pi k}{N+1}\right), \qquad k = 1, \dots, N$$

The advantage of the Hanning window is very low spectral leakage (low side lobes); the tradeoff is slightly decreased frequency resolution (widening of the main lobe).

The figure below shows the plot of a Hanning window of size 128 obtained by using the command wvtool in Matlab

which shows both the time and frequency domain plots of the Hanning window. The frequency domain plot is the

magnitude squared of the Fourier transform of the window vector in decibels (dB).

Figure-1: Hanning window of size 128 in time and frequency domain


b) Can you recover the value of the corrective factor Cw given in the table 1 for the Hanning window?

The corrective factor for the Hanning window, calculated in Matlab, is Cw = N/sum(w.^2) = 2.6460.

This is very close to the value of 2.67 given in table 1.

The corrective factor compensates for the window effect: the window tapers the signal and thus modifies its energy, so the factor depends on the window used. As given in table 1, its value is 1 for the rectangular window and 3.24 for the Blackman window.
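The value above can be checked without the Signal Processing Toolbox. The following is a minimal sketch, assuming that hanning(N) returns the samples 0.5*(1 - cos(2*pi*k/(N+1))) for k = 1..N; the variable names are illustrative and are not taken from the lab files.

```matlab
% Corrective factor Cw = N / sum(w.^2) for a Hanning window of size N.
N = 128;
k = (1:N).';
% Assumption: same samples as hanning(N), written out explicitly
w = 0.5 * (1 - cos(2*pi*k/(N+1)));
Cw = N / sum(w.^2);                  % compensates the energy removed by the taper
Cw_rect = N / sum(ones(N,1).^2);     % rectangular window: no taper, factor 1
fprintf('Cw (Hanning, N=%d) = %.4f\n', N, Cw);
fprintf('Cw (rectangular)   = %.4f\n', Cw_rect);
```

For this window definition the sum has the closed form sum(w.^2) = 3(N+1)/8, so Cw = (8/3) N/(N+1). This tends to 8/3 (about 2.67) for large N, which is consistent with the value in table 1.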

c) We would like to study the first side-lobe attenuation for the continuous Hanning window. For this we need to append some zeros to the Hanning window created with Matlab. This is called zero-padding. In the following command lines, the window initially has a size of 128 and trailing zeros are added to obtain a length of 2048. This is done directly in the fft command.

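N = 128;                % window length
Npadding = 2048;        % FFT length after zero-padding
w = hanning(N);         % time-domain window (lowercase w)
W = fft(w, Npadding);   % fft appends the trailing zeros itself
W = fftshift(W);        % move the zero frequency to the center
f = (-Npadding/2:Npadding/2-1)/Npadding;    % normalized frequency axis
figure
plot(f, 20*log10(abs(W)./max(abs(W))), 'b') % gain in dB, 0 dB at the peak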

Create a new m-file that includes the above commands. Run this program.

The figure below shows the two plots obtained by running the m-file containing the above commands. The blue curve shows the gain in dB, 20 log10(|W|), versus frequency when W is calculated with the fft command alone; the red curve shows the same quantity after the fftshift command has been applied.

Figure: 2


What is the effect of the Matlab command fftshift?

The Matlab command fftshift re-orders the fft vector W so that its elements are sorted in ascending order of their corresponding frequencies. In other words, it rearranges the output of the fft command by moving the zero-frequency component to the center of the array. This is useful for visualizing a Fourier transform with the zero-frequency component in the middle of the spectrum and the negative and positive frequencies extending symmetrically on either side of it, as can be seen in figure 2 above.

The gain in dB is defined by 20log(|W|). What is the difference of gain in dB between the main lobe and the first side-

lobe?

The difference in gain between the main lobe and the first side lobe is 31.52 dB, as shown in figure 2 above.

Add on the plot the result obtained for a rectangular window (in that case, take w=ones(N,1). What compromise do

these curves illustrate?

Figure 3 below shows the gain in dB for a rectangular window and a Hanning window. Two things are clear from the figure:

1) For a rectangular window, the main lobe is narrow, which gives good precision in the frequency domain (better frequency resolution), but the secondary lobes have a high amplitude, which increases the risk of detecting spurious frequency peaks.

2) For a Hanning window, the main lobe is wide, which gives only low precision in the frequency domain (decreased frequency resolution), but the secondary lobes have a low amplitude, which decreases the risk of detecting spurious frequency peaks.

Therefore, the main point to remember when windowing signals is that there is a trade-off in the frequency domain between the width of the main lobe (frequency resolution) and the amplitude of the side lobes (risk of spurious frequency peak detection).

Figure: 3
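The trade-off between main-lobe width and side-lobe level can also be read off numerically instead of from the plot. The sketch below is one possible way to do it, assuming the hanning(N) samples are written out explicitly; the peak-search logic and all variable names are illustrative and are not part of the lab files.

```matlab
% Compare main-lobe width and first side-lobe level of two windows.
N = 128;                 % window length
Npadding = 2048;         % zero-padded FFT length
k = (1:N).';
% Assumption: hanning(N) samples written out explicitly (no toolbox needed)
wins  = {ones(N,1), 0.5*(1 - cos(2*pi*k/(N+1)))};
names = {'rectangular', 'Hanning'};
fnull = zeros(1,2);      % normalized frequency of the first null
atten = zeros(1,2);      % first side-lobe attenuation in dB
for i = 1:2
    G = abs(fft(wins{i}, Npadding));
    % gain in dB for f >= 0, normalized to 0 dB at the main-lobe peak
    GdB = 20*log10(G(1:Npadding/2)/max(G) + eps);
    inull = find(diff(GdB) > 0, 1);   % first local minimum: edge of the main lobe
    fnull(i) = (inull - 1)/Npadding;
    rest = GdB(inull:end);
    ipk = find(diff(rest) < 0, 1);    % first local maximum after that minimum
    atten(i) = -rest(ipk);
    fprintf('%-12s first null at f = %.4f, first side lobe at -%.1f dB\n', ...
            names{i}, fnull(i), atten(i));
end
```

The Hanning window should show a first null roughly twice as far from zero as the rectangular window (wider main lobe), in exchange for a much lower first side lobe.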


1.2 Raw Periodogram

a) When the noise is null (std_noise = 0 in the program), the signal is a cosine. This is then a deterministic signal and we can estimate its PSD using the raw periodogram.

Given the signal definition in the main program, is Shannon condition verified?

The Shannon condition states that it is possible to reconstruct a signal from its samples without loss of information if the signal is band-limited and if the sampling frequency is at least twice as large as the maximal frequency contained in the signal, i.e. if $f_s \geq 2B$, where $f_s$ is the sampling frequency (also called the sampling rate) and $B$ is the maximal frequency contained in the signal (also called the bandwidth).

In the main program, the maximal frequency contained in the signal is 100 Hz and the sampling frequency is 1000 Hz,

which is greater than 2B, i.e. 200 Hz. Hence, the Shannon condition is verified.

Open and complete the function fct_raw(x, fs, iwindow). This should return the periodogram of a signal x according to

Eq. (1). It should also return the frequency vector. Both the frequency and the PSD should be ordered so that the

negative frequencies come first. Once the function works the PSD of the signal is given in figure 2. Do you obtain the

expected result? What is the effect of changing the value of N (say N=500 and N=2000)?

The Fourier transform of a cosine of frequency $f_0$ (here, $f_0$ = 100 Hz) is a pair of impulses (Diracs) at $f = f_0$ and $f = -f_0$. That means that all the energy of the cosine is localized at the frequencies given by $|f| = f_0$.

So, as expected, the PSD of the signal in the following figure shows two impulses (Diracs) at the frequencies -100 Hz and 100 Hz.

Figure: 4


When we change the value of N (the number of points used for the raw periodogram), we notice that the peak becomes sharper for the higher value of N, because the spacing between frequency bins, $f_s/N$, becomes smaller.

This is clear from figures 5 and 6 below:

PSD obtained for the value of N=500

Figure: 5

PSD obtained for the value of N=2000

Figure: 6
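The effect of N on the raw periodogram of the cosine can be reproduced with a few lines. This is only a sketch: it assumes a rectangular window and the normalization |X|^2/N, which may differ from the Eq. (1) implemented in fct_raw, and the names Nvals, fpeak and df are illustrative.

```matlab
% Raw periodogram of a 100 Hz cosine for two record lengths.
fs = 1000;               % sampling frequency in Hz
f0 = 100;                % frequency of the cosine in Hz
Nvals = [500 2000];      % record lengths to compare
fpeak = zeros(size(Nvals));
df = zeros(size(Nvals));
for i = 1:numel(Nvals)
    N = Nvals(i);
    n = (0:N-1).';
    x = cos(2*pi*f0*n/fs);
    % assumed normalization, negative frequencies first thanks to fftshift
    Sxx = abs(fftshift(fft(x))).^2 / N;
    f = (-N/2:N/2-1).' * fs/N;          % matching frequency vector
    [~, imax] = max(Sxx .* (f > 0));    % peak on the positive-frequency side
    fpeak(i) = f(imax);
    df(i) = fs/N;                       % spacing between frequency bins
    fprintf('N = %4d: peak at %g Hz, bin spacing %g Hz\n', N, fpeak(i), df(i));
end
```

In both cases the peak sits at 100 Hz, but the bin spacing drops from 2 Hz to 0.5 Hz, which is why the peak looks sharper for the larger N.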


b) Open and complete the functions fct_power_time and fct_power_freq. They should return the power of the signal,

calculated according to Eqns. (9) and (10) respectively. Is Parseval relation satisfied?

The energy or power contained in the signal can be calculated either in the time-domain or in the frequency domain by

application of the Parseval theorem or Parseval relation.

Parseval theorem states that the total power contained in a signal x(t) summed across all of time t is equal to the total

power of the signal's Fourier Transform X(f) summed across all of its frequency components f.

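$$P = \lim_{T \to \infty} \frac{1}{T} \int_0^T |x(t)|^2 \, dt = \lim_{T \to \infty} \frac{1}{T} \int_{-\infty}^{+\infty} |X_T(f)|^2 \, df = \int_{-\infty}^{+\infty} S_{xx}(f) \, df$$

where $X_T(f)$ is the Fourier transform of $x(t)$ restricted to $[0, T]$ and $S_{xx}(f)$ is the Power Spectral Density (PSD) of the signal.

For discrete signals with finite power, Parseval's relation takes the form:

$$P = \frac{1}{N} \sum_{n=0}^{N-1} x^2(n) = \frac{1}{N} \sum_{k=0}^{N-1} S_{xx}(k)$$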

So, in the program, the average power calculated by the functions fct_power_time and fct_power_freq in time and

frequency domains respectively according to the above equation gives the following values:

Power calculated in the time domain: Pt=0.50008

Power calculated in the frequency domain: Pf_raw=0.50008

Therefore, the Parseval relation is clearly satisfied.
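A quick numerical check of the discrete Parseval relation is sketched below. It assumes a rectangular window and a periodogram normalized as |X(k)|^2/N, so that the power is the mean of the PSD values; the actual fct_power_time and fct_power_freq may be written differently.

```matlab
% Power of a unit-amplitude cosine, computed in time and in frequency.
fs = 1000; f0 = 100; N = 1000;
n = (0:N-1).';
x = cos(2*pi*f0*n/fs);
Pt = sum(x.^2) / N;            % time domain: mean square of the samples
Sxx = abs(fft(x)).^2 / N;      % assumed periodogram normalization
Pf = sum(Sxx) / N;             % frequency domain: mean of the PSD values
fprintf('Pt = %.5f, Pf = %.5f\n', Pt, Pf);
```

With an integer number of periods in the record, both values equal 0.5, the power of a unit-amplitude cosine.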

c) Now we can try to use the raw periodogram for a random signal. Let us use a white noise with variance $\sigma^2$. Set A=0 and std_noise=5 and run the program. From the PSD in figure 2, can you estimate properly the PSD of the noise?

The figure 7 below shows a white noise.

Figure: 7


Figure 8 below shows the PSD of the white noise estimated using the raw periodogram. As we can see, it is not possible to estimate the PSD of the noise properly in this way: the raw periodogram is a perfectly valid estimate if the signal is deterministic, but it is only a crude estimate for a random signal such as white noise. For such a random signal, the typical error in the estimate $\hat{S}_{xx}(f)$ is:

$$\frac{\sigma\left(\hat{S}_{xx}(f)\right)}{S_{xx}(f)} \sim O(1)$$

This means that the error is of the order of the estimated quantity itself, which is a poor result, as can be seen in the figure below. So, for a random signal, the averaged periodogram (Welch's method) should be used to obtain a better estimate of the PSD.

Figure: 8

d) Since the PSD of a white noise should be flat and equal to $\sigma^2$ for all frequencies (see Eq. (11)), there is a simple way to estimate the mean and the standard deviation of the PSD estimate: we can estimate these quantities by using only one realization, $\hat{S}_{xx}$, and considering the values at the different frequencies as different realizations of the PSD estimate. Then use the Matlab command mean to calculate the mean of $\hat{S}_{xx}$, and use the Matlab command std to calculate its standard deviation, $\sigma[\hat{S}_{xx}]$ (this is a single number that does not depend on frequency given the chosen method). How do these compare with Eq. (3)?

The mean and the standard deviation of $\hat{S}_{xx}$ calculated using the Matlab commands mean and std are:

Mean of $\hat{S}_{xx}$ = 24.6168

Standard deviation of $\hat{S}_{xx}$ = 24.8151

The mean is close to the variance of the noise (std_noise^2 = 25). Since the standard deviation of $\hat{S}_{xx}$ is of the same order as its mean, the result agrees with Eq. (3), which is

$$\frac{\sigma\left(\hat{S}_{xx}(f)\right)}{S_{xx}(f)} \sim O(1)$$

This means that the error is of the order of the estimated quantity, which is a poor result.
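The same check can be sketched on synthetic white noise. The block below assumes a rectangular window and the normalization |X(k)|^2/N, for which the expected PSD level equals the noise variance; the record length and the seed are arbitrary choices, not the values used in the lab program.

```matlab
% Mean and standard deviation of the raw periodogram of white noise.
rng(0);                        % fixed seed so the run is repeatable
N = 4096;                      % number of samples (arbitrary choice)
sigma = 5;                     % same role as std_noise in the lab program
x = sigma * randn(N, 1);
S = abs(fft(x)).^2 / N;        % assumed periodogram normalization
m = mean(S);                   % frequency bins treated as realizations
s = std(S);
ratio = s / m;                 % relative error of the estimate
fprintf('mean = %.2f, std = %.2f, std/mean = %.2f\n', m, s, ratio);
```

The ratio std/mean should come out close to 1, whatever the value of N, which is the O(1) behavior described above.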


1.3 Averaged Periodogram

a) Consider again the white noise only. We want to estimate the PSD of this signal using the averaged periodogram.

Open and complete the function fct_welch(x, fs, N, iwindow). This should return: the frequency vector going from

negative to positive, the PSD calculated according to Eq. (4), and the number of blocks M that have been used. The

result will be displayed in figure 2. Use N=300. Is the result more satisfying than with the raw periodogram?

Figure 9 below shows the PSD of the white noise estimated using the raw periodogram and the averaged periodogram (Welch's method). The result with the averaged periodogram is clearly more satisfying than with the raw periodogram, because averaging over blocks decreases the typical error of the estimate.

For the averaged periodogram, the typical error in the estimate $\hat{S}_{xx}(f)$ is:

$$\frac{\sigma\left(\hat{S}_{xx}(f)\right)}{S_{xx}(f)} \sim O\left(\frac{1}{\sqrt{M}}\right)$$

where $M$ is the number of blocks. This is smaller than the $O(1)$ error of the raw periodogram.

Figure: 9
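The averaging step can be sketched as follows. This is a simplified version with non-overlapping blocks and a rectangular window (sometimes called Bartlett's method), not the fct_welch routine of the lab, which also takes the window choice iwindow into account; the signal length and all variable names are assumptions made for illustration.

```matlab
% Averaged periodogram of white noise with non-overlapping blocks.
rng(0);                              % fixed seed so the run is repeatable
L = 30000;                           % total number of samples (arbitrary)
N = 300;                             % block size, as in the question
sigma = 5;
x = sigma * randn(L, 1);
M = floor(L / N);                    % number of blocks
blocks = reshape(x(1:M*N), N, M);    % one block per column
Sblk = abs(fft(blocks)).^2 / N;      % raw periodogram of each block (fft acts on columns)
Swelch = mean(Sblk, 2);              % average over the M blocks
ratio = std(Swelch) / mean(Swelch);  % relative error, frequency bins as realizations
fprintf('M = %d, std/mean = %.3f, 1/sqrt(M) = %.3f\n', M, ratio, 1/sqrt(M));
```

With M = 100 blocks the relative error should drop to about 0.1, compared with about 1 for the raw periodogram.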

b) Does Parseval relation hold?

In the program, the average power calculated in the time and frequency domains is as follows:

Power calculated in the time domain: Pt=25.4486

Power calculated in the frequency domain (using the raw periodogram): Pf_raw=25.4486

Power calculated in the frequency domain (using the averaged periodogram): Pf_welch=25.5241

Since the powers calculated in the time and frequency domains are almost equal, the Parseval relation is satisfied.


1.4 Variance of the PSD estimator

Here we propose to check Eq. (6) in the same way as we have checked Eq. (3). For this we vary the value of N (the size of one block) used as input to the function fct_welch, since this will correspond to changing M (the number of blocks). Then, for each value of N belonging to the vector Nvec, we want to calculate the mean and the standard deviation of $\hat{S}_{xx}$. To do this, open and complete the function fct_error(x, fx, Nvec, iwindow). The standard deviation is then plotted versus the number of blocks in figure 3, and this is compared with the result of Eq. (6).

The following figures 10 and 11 show the two graphs of the standard deviation plotted versus the number of blocks for

two cases with Nvec=[100:1000:4000] and Nvec=[100:200:4000] respectively.

Figure: 10 with Nvec = [100:1000:4000]

Figure: 11 with Nvec = [100:200:4000]


Nvec is a vector containing different block sizes. If we change the block size, the number of blocks also changes, and so does the standard deviation of $\hat{S}_{xx}$. This can be seen in the two graphs shown above. In the first case, with Nvec=[100:1000:4000], the observed standard deviation shows a large error compared to the theoretical value because of the smaller number of blocks. In the second case, with Nvec=[100:200:4000], the observed standard deviation shows less error and is closer to the theoretical value because of the larger number of blocks.

So, this result is in good agreement with Eq. (6), which gives the typical error in the estimate $\hat{S}_{xx}(f)$:

$$\frac{\sigma\left(\hat{S}_{xx}(f)\right)}{S_{xx}(f)} \sim O\left(\frac{1}{\sqrt{M}}\right)$$

This means that the error decreases as the number of blocks increases, which is clear from the figures above. The price to pay is that the frequency resolution decreases at the same time, since more blocks means shorter blocks.
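The price paid in resolution can be made explicit with a short worked relation. It assumes non-overlapping blocks and introduces two symbols that are not in the report: $L$, the total number of samples, and $T = L/f_s$, the total duration of the record. With $M = L/N$ blocks of $N$ points and a bin spacing $\Delta f = f_s/N$:

```latex
\frac{\sigma\left(\hat{S}_{xx}\right)}{S_{xx}}
  \sim \frac{1}{\sqrt{M}}
  = \sqrt{\frac{N}{L}}
  = \frac{1}{\sqrt{T \, \Delta f}},
\qquad
M = \frac{L}{N} = \frac{L}{f_s} \cdot \frac{f_s}{N} = T \, \Delta f
```

For a record of fixed duration, halving the relative error therefore requires a bin spacing four times coarser.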


Part II Application to the sound produced by an air conditioning exhaust

Now open the main program for part II: go_part_II.m

At the beginning of this program, either the file signaux1.1vm or the file signaux2.1vm is read. These files contain the data for two different positions of the mobile hot wire. For the moment, one can load signaux1.1vm and work only with this file.

a) Plot in figure 1 and compare the signals x(t) and y(t) over a time span of 50ms.

The figure 12 below shows the two signals x(t) and y(t) over a time span of 50ms (=0.05 seconds)

Figure: 12

Signal x(t) is the velocity measurement in the potential core (where the flow is laminar) provided by the first hot wire

which is fixed. Signal y(t) is the velocity measurement provided by the second hot wire which is a flying hot wire that can

be displaced in the shear layer along the direction. Both signals are random due to noise and turbulence. As it is seen

in the figure above, the signal x(t) is more random in nature than the signal y(t) since the velocity fluctuations in the

signal x(t) show more randomness than the velocity fluctuations in the signal y(t).

b) Compute the mean velocity of signal x(t) and y(t) (the mean of x(t) is actually U). Remove these means from the

signals.

Mean velocity of signal x(t) =20.1882 m/s

Mean velocity of signal y(t) =17.914 m/s

After removing these means from the signals, we get the corresponding velocity fluctuations of the signals.


c) Compute and display in figure 2 the auto-PSDs $S_{xx}$, $S_{yy}$ and $S_{zz}$. Use Welch's method. Display the result in dB (plot $10 \log_{10}(S_{xx})$ for example).

What is the frequency that has high energy content ($f_0$)? This frequency corresponds to vortices passing by in the shear layer (a vortex shedding frequency). This is the frequency appearing in Eq. (13).

The auto-PSDs $S_{xx}$, $S_{yy}$ and $S_{zz}$ computed using Welch's method are shown in figure 13 below:

Figure: 13

From the figure above, it is clear that the auto-PSDs $S_{xx}$, $S_{yy}$ and $S_{zz}$ have a high energy content at a frequency near 1000 Hz (more precisely, at 1030 Hz), which corresponds to vortices passing by in the shear layer (the vortex shedding frequency).


d) Compute the cross power spectral density $S_{yx}$ of x and y (open and complete the Matlab function fct_iwelch(u, v, fs, N, iwindow)). Display the modulus of $S_{yx}$ as a function of frequency in figure 3a and its phase in figure 3b.

The cross power spectral density $S_{yx}$ of x and y is computed. The following figure shows the modulus of the cross-PSD $S_{yx}$ and its phase as a function of frequency. As can be seen from the figure, there is a high energy content at the frequency 1030 Hz, shown by the peak in the $S_{yx}$ plot, where the phase is $\varphi(f_0)$ = 0.8093 rad.

Figure: 14
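How the phase of a cross-PSD encodes a delay can be illustrated on synthetic data. The sketch below does not use the lab measurements: it builds a tone plus independent noise on two channels, delays the tone on the second channel by a known number of samples, and averages Y.*conj(X) over non-overlapping blocks. The sign convention (conjugate on X), the rectangular window and every parameter value are assumptions chosen for the illustration.

```matlab
% Phase of the cross-PSD for a tone delayed by d samples.
rng(0);                                   % fixed seed so the run is repeatable
fs = 20000; f0 = 1000;                    % sampling and tone frequencies in Hz
L = 20000; N = 200; M = L/N;              % total length, block size, number of blocks
d = 3;                                    % delay of the second channel in samples
n = (0:L-1).';
s = cos(2*pi*f0*n/fs);                    % common tone (integer number of periods)
x = s + 0.5*randn(L, 1);                  % channel 1: tone + noise
y = circshift(s, d) + 0.5*randn(L, 1);    % channel 2: delayed tone + independent noise
X = fft(reshape(x, N, M));                % block-wise spectra, one block per column
Y = fft(reshape(y, N, M));
Syx = mean(Y .* conj(X), 2) / N;          % assumed convention for the cross-PSD
k0 = round(f0*N/fs) + 1;                  % index of the bin at f0
phi = angle(Syx(k0));                     % measured phase in rad
phi_expected = -2*pi*f0*d/fs;             % phase lag of a pure delay d/fs
fprintf('phase at f0: %.4f rad (expected %.4f rad)\n', phi, phi_expected);
```

With this convention a delay on y appears as a negative phase proportional to the frequency and to the delay, which is the property used to obtain a convection velocity from two phase measurements.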

e) We would like to estimate the convection velocity of the vortices in the shear layer by using Eqs. (16) and (17). For this you need to work sequentially with files signaux1.1vm and signaux2.1vm. First open signaux1.1vm and compute the cross-PSD $S_{yx}$. Determine its phase, $\arg[S_{yx}](f_0)$, at the frequency $f_0$. Then work with file signaux2.1vm and determine $\arg[S_{yx}](f_0)$ at the frequency $f_0$. From these you can determine the convection speed (remember that

= ).

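For signaux1.1vm, we get $\varphi_1(f_0)$ = 0.8093 rad (as shown in the figure above).

Similarly, for signaux2.1vm, we get $\varphi_2(f_0)$ = -2.785 rad.

Therefore, $\Delta\varphi(f_0) = \varphi_1(f_0) - \varphi_2(f_0) = 0.8093 - (-2.785) = 3.5943$ rad.

Convection speed, with $\Delta x = 6 \times 10^{-3}$ m the distance between the two positions:

$$U_c = \frac{2\pi f_0 \, \Delta x}{\Delta\varphi(f_0)} = \frac{2 \times 3.14 \times 1030 \times 6 \times 10^{-3}}{0.809 - (-2.785)} = 10.80 \ \text{m/s}$$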
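The arithmetic above can be reproduced directly. The values are those quoted in this report (1030 Hz, a spacing of 6 mm, and the two measured phases); the variable names are illustrative.

```matlab
% Convection speed from the phase difference between the two positions.
f0 = 1030;             % vortex shedding frequency in Hz
dx = 6e-3;             % distance between the two hot-wire positions in m
phi1 = 0.8093;         % phase of the cross-PSD at f0, first file, in rad
phi2 = -2.785;         % phase of the cross-PSD at f0, second file, in rad
dphi = phi1 - phi2;    % phase difference in rad
Uc = 2*pi*f0*dx / dphi;
fprintf('dphi = %.4f rad, Uc = %.2f m/s\n', dphi, Uc);
```

Because the phase is only known modulo 2*pi, the result relies on the true phase difference between the two positions being smaller than one full turn.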


f) Determine the coherence $\gamma_{yx}(f)$ of x and y, and plot its modulus in figure 3c. Remember that the coherence is nothing but a normalized cross-PSD, and so it is very easily determined using the functions fct_welch and fct_iwelch. Determine $S_{zx}$. Plot its modulus and phase in figure 4a and 4b. Plot the modulus of the coherence in figure 4c. What information does coherence bring to you?

Cross-PSD of x and y, its phase and the modulus of the coherence as a function of frequency

Figure: 15

Cross-PSD of x and z, its phase and the modulus of the coherence as a function of frequency

Figure: 16

The coherence function between two signals x and y is given by:

$$\gamma_{yx}(f) = \frac{S_{yx}(f)}{\sqrt{S_{xx}(f) \, S_{yy}(f)}}$$

It is also called the normalized cross-PSD. Coherence is used to test for a linear stationary relationship between two random processes. In other words, it is used to test whether the two signals are linked by an LTI system. Coherence is a complex number. Its phase is that of the cross-PSD. Its modulus ranges from 0 to 1. Three cases may be distinguished based on the modulus of the coherence.

1) $|\gamma_{yx}(f)| = 1$: x and y are completely coherent at frequency f; there exists a linear and stationary relationship between them.

2) $|\gamma_{yx}(f)| = 0$: x and y are completely incoherent at frequency f.

3) $0 < |\gamma_{yx}(f)| < 1$: there are 3 possibilities:

a) Noise is present.

b) The relation between x(t) and y(t) is not linear.

c) The signal y(t) depends on x(t) and on some other signals as well.

Often, a coherence larger than 0.8 or 0.9 is judged significant enough for a linear and stationary relationship between the two signals to exist.

From figures 15 and 16 above, we can see that the moduli of the coherences $\gamma_{yx}$ and $\gamma_{zx}$ are very low at all frequencies except at 1030 Hz, the vortex shedding frequency, where the coherence is high: $|\gamma_{yx}|$ = 0.8949 (between the signals x and y) and $|\gamma_{zx}|$ = 0.9483 (between the signals x and z). This means that there exists a linear time-invariant link between the random processes x and y, and also between x and z, at the vortex shedding frequency of 1030 Hz.
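The behavior described above (coherence close to 1 at one frequency, low elsewhere) can be reproduced on synthetic data. The sketch below is not the lab computation: it uses a shared tone plus independent noise on each channel, non-overlapping blocks and a rectangular window, and normalizes the averaged cross-PSD by the square root of the product of the averaged auto-PSDs. All parameter values and names are assumptions.

```matlab
% Coherence of two noisy channels that share a single tone.
rng(0);                                   % fixed seed so the run is repeatable
fs = 20000; f0 = 1000;                    % sampling and tone frequencies in Hz
L = 20000; N = 200; M = L/N;              % total length, block size, number of blocks
n = (0:L-1).';
s = cos(2*pi*f0*n/fs);                    % common tone
x = s + 0.5*randn(L, 1);                  % channel 1
y = circshift(s, 3) + 0.5*randn(L, 1);    % channel 2: delayed tone, independent noise
X = fft(reshape(x, N, M));                % block-wise spectra, one block per column
Y = fft(reshape(y, N, M));
Sxx = mean(abs(X).^2, 2) / N;             % averaged auto-PSD of x
Syy = mean(abs(Y).^2, 2) / N;             % averaged auto-PSD of y
Syx = mean(Y .* conj(X), 2) / N;          % averaged cross-PSD (assumed convention)
gam = Syx ./ sqrt(Sxx .* Syy);            % complex coherence
k0 = round(f0*N/fs) + 1;                  % index of the bin at f0
fprintf('|coherence| at f0: %.3f, median over all bins: %.3f\n', ...
        abs(gam(k0)), median(abs(gam)));
```

The averaging over blocks is essential here: with a single block (M = 1) the modulus of this ratio is exactly 1 at every frequency and carries no information.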

Conclusion

This lab was primarily intended to make us familiar with the basic tools of digital signal processing, such as windowing, sampling, the Discrete Fourier Transform, estimation of the Power Spectral Density using the raw and averaged periodograms, calculation of the power contained in a signal in both the time and frequency domains, and estimation of relations between signals using coherence. With the help of this lab, I am able to better understand these concepts of digital signal processing taught in the lectures. Matlab was used for the calculations and the plots, so the lab also sharpened my Matlab programming skills.
