This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
FrequencyDomain Speech Analysis
ShortTime Fourier Analysis Windowed (shorttime) Fourier Transform Spectrogram of speech signals Filter bank implementation*
Cepstral Analysis (Real) cepstrum and complex cepstrum Complex cepstrum for speech Pitch detection
Echo hiding
Fourier Transform
Joseph Fourier(17681830)
F w =∫−∞
∞f t e− jwt
f t =12π
∫−∞
∞F w e jwtdw
It’s a simple and powerful idea:Can any signal be represented by linear combination of sines and cosines?
Deep Impact of FT
A versatile tool for solving many problems in science and engineering Mathematics: functional/harmonic analysis Physics: thermodynamics, Fourier optics Astronomy: radar imaging, FT Spectrometer Biomedical engineering: MRI, FT infrared
Speech Signal Analysis Why (longterm) FT is not appropriate for speech
signals? FT is the ideal tool for analyzing periodic or stationary
signals – frequency domain representation greatly helps the analysis
Like many other phenomena we observe in the natural worlds, speeches are transient or nonstationary signals whose properties change markedly as a function of time
Due to Heisenberg’s uncertainty principle, we can only find some compromised solution between time and frequency localization
ShortTime (Windowed) FT(Normal) Fourier Transform of a Sequence
TimeDependent Fourier Transform
X e jw=∑
−∞
∞
x n e− jwn
Xn ejw=∑
−∞
∞
wn−m x m e− jwm
window (typically Hamming)time frequency
Definition of Spectrogram (Sonogram) Windowed speech
window
speech
time freq.
Interpretation of Spectrogram
Time
Frequency
STFT at time nn
Evolution ofST spectrumat frequency walong time
w
History of Sonagraph (Visual Speech) Spectrogram has been
used in almost every phase of speech research for over 70 years (DSP has been around for 57 years)
Before DSP, a device called sound spectrograph (also called wave analyzer) was widely used
Example of Spectrogram: Chirps
Chirps are analytic signals which have a particular instantaneous frequency
Example of Speech Spectrogram
Spectrogram Reading
We will use MATLAB demo to test your spectrogram reading capability
Spectrogram via Filter Bank*
Filter Bank
Spectrogram Calculation Under MATLABMethod 1: Use COLEA toolbox (it has a nice GUI)
Method 2: Use the demo program on the right
Cutandpaste it and saveIt as specgram_demo.m
>[x,fs]=wavread(filename);>specgram_demo(x,fs);
% function specgram_demo(y,fs)% display the spectrogram of speech signal% demo for EE493Q Fall 2006
function specgram_demo(y,fs)
% calculate the table of amplitudes[B,f,t]=specgram(y,1024,fs,256,192);% calculate amplitude 50dB down from maximumbmin=max(max(abs(B)))/300;% plot top 50dB as imageimagesc(t,f,20*log10(max(abs(B),bmin)/bmin));% label plotaxis xy;xlabel('Time (s)');ylabel('Frequency (Hz)');% build and use a grey scalelgrays=zeros(100,3);for i=1:100 lgrays(i,:) = 1i/100;endcolormap(hot);
/Hello/
/Don’t ask me/
Impact of Window on Spectrogram
FrequencyDomain Speech Analysis
ShortTime Fourier Analysis Windowed (shorttime) Fourier Transform Spectrogram of speech signals Filter bank implementation*
Cepstral Analysis (Real) cepstrum and complex cepstrum Complex cepstrum for speech pitch detection
Echo hiding
From Spectrum to Cepstrum Recall fundamental assumption about
speech signal: speech can be represented as the output of a linear filtering system whose excitation and system response vary slowly with time To separate excitation e(n) from the system
response h(n), we need to perform some kind of deconvolution in the frequency domain: X(w)=E(w)H(w)
Multiplication is not as easy as addition to deal with; can we convert product into sum?
The Power of Logarithm
log()X e jw =Ee jw He jw X e jw = Ee jw He jw
Note on Complex Logarithm
Since X(w) (FT of x(n)) is typically a complex signal, we need to definecomplex logarithm as follows (i.e., take the logarithm of magnitude)