Frequency Domain Coding of Speech
Post on 02-Jan-2016
Presenter: 虞台文
Content
– Introduction
– The Short-Time Fourier Transform
– The Short-Time Discrete Fourier Transform
– Wide-Band Analysis/Synthesis
– Sub-Band Coding
Introduction
Speech Coders
Waveform Coders
– Attempt to reproduce the original waveform according to some fidelity criterion.
– Performance: successful at producing good-quality, robust speech.
Vocoders
– Tied to a speech production model.
– Performance: more fragile and more model dependent.
– Lower bit rate.
Frequency-Domain Coders
– Sub-Band Coder (SBC)
– Adaptive Transform Coding (ATC)
– Multi-Band Excited Vocoder (MBEV)
– Noise Shaping in Speech Coders
Classification of Speech Coders
The Short-Time Fourier Transform
Definition of STFT
$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$
Interpretations:
– Filter Bank Interpretation
– Block Transform Interpretation
Filter Bank Interpretation
$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$
With $\omega$ fixed at $\omega_0$, the sum is a convolution:
$X_n(e^{j\omega_0}) = h(n) * [x(n)\,e^{-j\omega_0 n}]$
Here h(n) acts as the analysis filter.
Modulation: multiplying x(n) by $e^{-j\omega_0 n}$ shifts its spectrum so that the band around $\omega_0$ moves to $\omega = 0$:
$x(n)\,e^{-j\omega_0 n} \overset{FT}{\longleftrightarrow} X(e^{j(\omega+\omega_0)})$
The lowpass filter h(n) then keeps only the band that was originally centered at $\omega_0$.
[Figure: filter bank. x(n) feeds M parallel channels; channel k multiplies x(n) by $e^{-j\omega_k n}$, k = 1, ..., M, and filters the product with h(n). The outputs $X_n(e^{j\omega_k})$ are the modulated sub-band signals.]
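The filter bank interpretation can be checked numerically. The following is a minimal sketch, assuming a random test signal, a Hamming window as the analysis filter, and an arbitrary frequency $\omega_0 = 0.3\pi$ (none of these come from the slides). It compares the defining sum with "modulate, then lowpass filter".

```python
import numpy as np

def stft_direct(x, h, n, omega):
    """Evaluate X_n(e^{j*omega}) = sum_m h(n-m) x(m) e^{-j*omega*m} directly.

    x is taken to be zero outside 0..len(x)-1 and h zero outside 0..len(h)-1.
    """
    m = np.arange(max(0, n - len(h) + 1), min(len(x), n + 1))
    return np.sum(h[n - m] * x[m] * np.exp(-1j * omega * m))

def stft_filter_bank(x, h, omega):
    """Same quantity for all n at once: modulate x, then filter with h."""
    n = np.arange(len(x))
    modulated = x * np.exp(-1j * omega * n)
    return np.convolve(h, modulated)[:len(x)]

rng = np.random.default_rng(0)
x = rng.standard_normal(64)   # assumed test signal
h = np.hamming(16)            # assumed causal analysis filter
omega0 = 0.3 * np.pi          # assumed fixed analysis frequency

direct = np.array([stft_direct(x, h, n, omega0) for n in range(len(x))])
filtered = stft_filter_bank(x, h, omega0)
max_error = np.max(np.abs(direct - filtered))
print(max_error)              # difference is at rounding-error level
```

Both routes give the same channel signal; the convolution form is the one a filter bank would actually implement.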
Block Transform Interpretation
$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$
With n fixed at $n_0$:
$X_{n_0}(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n_0-m)\,x(m)\,e^{-j\omega m}$
Here $h(n_0-m)$ is the analysis window and $h(n_0-m)\,x(m)$ is the windowed data, so the STFT is the Fourier transform of the windowed data:
$X_{n_0}(e^{j\omega}) = FT[h(n_0-n)\,x(n)]$
[Figure: the window is placed at successive times $n_1, n_2, n_3, \ldots, n_r$; each position yields one block transform $X_{n_1}(e^{j\omega}), X_{n_2}(e^{j\omega}), X_{n_3}(e^{j\omega}), \ldots, X_{n_r}(e^{j\omega})$.]
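As an illustration of the block transform view, the sketch below computes one frame of the STFT with a single FFT of the windowed data and compares it with the defining sum at the frequencies $2\pi k/M$. The signal, the Hann window, the frame position `n0`, and the FFT size `M` are all assumptions made for the example; the window length is assumed not to exceed `M`.

```python
import numpy as np

def stft_block(x, h, n0, M):
    """Return X_{n0}(e^{j*2*pi*k/M}), k = 0..M-1, as one FFT of the windowed data.

    Assumes len(h) <= M so the windowed frame fits in one length-M buffer.
    """
    L = len(h)
    buf = np.zeros(M, dtype=complex)
    for m in range(max(0, n0 - L + 1), min(len(x), n0 + 1)):
        buf[m % M] += h[n0 - m] * x[m]   # windowed data h(n0 - m) x(m)
    return np.fft.fft(buf)

def stft_sum(x, h, n0, omega):
    """Defining sum of the STFT at time n0 and frequency omega."""
    m = np.arange(max(0, n0 - len(h) + 1), min(len(x), n0 + 1))
    return np.sum(h[n0 - m] * x[m] * np.exp(-1j * omega * m))

rng = np.random.default_rng(1)
x = rng.standard_normal(100)  # assumed test signal
h = np.hanning(32)            # assumed analysis window
M, n0 = 32, 57                # assumed FFT size and frame position

block = stft_block(x, h, n0, M)
reference = np.array([stft_sum(x, h, n0, 2 * np.pi * k / M) for k in range(M)])
max_error = np.max(np.abs(block - reference))
print(max_error)              # difference is at rounding-error level
```

Storing sample m at index `m % M` keeps the absolute time origin, so the FFT phases match the definition without a separate correction.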
Analysis/Synthesis Equations
Analysis:
$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega m}$
Synthesis:
$\hat{x}(n) = \sum_{r=-\infty}^{\infty} f(n-r)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega$
Under what condition do we have $\hat{x}(n) = x(n)$?
The inverse Fourier transform of $X_r(e^{j\omega})$ evaluated at time n is $h(r-n)\,x(n)$, so
$\hat{x}(n) = \sum_{r} f(n-r)\,h(r-n)\,x(n) = x(n)\sum_{r} f(n-r)\,h(r-n)$
Replacing r with n + r:
$\hat{x}(n) = x(n)\sum_{r} f(-r)\,h(r)$
Therefore, $\hat{x}(n) = x(n)$ if
$\sum_{n=-\infty}^{\infty} f(-n)\,h(n) = 1$
More generally, in the frequency domain:
$\sum_{n=-\infty}^{\infty} f(-n)\,h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\omega})\,H(e^{j\omega})\,d\omega = 1$
Examples
Reconstruction condition:
$\sum_{n} f(-n)\,h(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\omega})\,H(e^{j\omega})\,d\omega = 1$
Example 1: $f(n) = \delta(n)/h(0)$, with $h(0) \neq 0$. Then $\sum_n f(-n)\,h(n) = 1$, and
$\hat{x}(n) = \frac{1}{h(0)}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} X_n(e^{j\omega})\,e^{j\omega n}\,d\omega = \frac{1}{h(0)}\,h(0)\,x(n) = x(n)$
Example 2: $f(n) = 1/H(e^{j0})$ for all n, with $H(e^{j0}) \neq 0$. Then $F(e^{j\omega}) = 2\pi\,\delta(\omega)/H(e^{j0})$, so $\frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{j\omega})\,H(e^{j\omega})\,d\omega = 1$, and
$\hat{x}(n) = \frac{1}{H(e^{j0})}\sum_{r}\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega$
Since $H(e^{j\omega}) = \sum_n h(n)\,e^{-j\omega n}$, we have $H(e^{j0}) = \sum_n h(n)$, and therefore
$\hat{x}(n) = \frac{1}{\sum_r h(r)}\sum_{r} FT^{-1}[X_r(e^{j\omega})] = \frac{1}{\sum_r h(r)}\sum_{r} h(r-n)\,x(n) = x(n)$
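Example 1 can be tried out numerically. This is a minimal sketch under stated assumptions: the window is an FIR Hamming window (so $h(0) \neq 0$), and the frequency integral is replaced by a K-point Riemann sum, which happens to be exact when K is at least the window length. The names `K`, `stft`, and `x_hat` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(40)        # assumed test signal
h = np.hamming(16)                 # assumed causal analysis window, h(0) != 0
K = 64                             # number of frequency samples, K >= len(h)
omega = 2 * np.pi * np.arange(K) / K

def stft(n):
    """X_n(e^{j*omega}) on the K-point frequency grid."""
    m = np.arange(max(0, n - len(h) + 1), min(len(x), n + 1))
    return np.exp(-1j * np.outer(omega, m)) @ (h[n - m] * x[m])

x_hat = np.empty(len(x))
for n in range(len(x)):
    X_n = stft(n)
    # (1/2pi) * integral of X_n(e^{jw}) e^{jwn} dw, as a K-point Riemann sum
    inverse_ft = np.mean(X_n * np.exp(1j * omega * n))
    x_hat[n] = (inverse_ft / h[0]).real   # synthesis with f(n) = delta(n) / h(0)

max_error = np.max(np.abs(x_hat - x))
print(max_error)                   # reconstruction error at rounding-error level
```

The inverse transform at time n returns $h(0)\,x(n)$, so dividing by $h(0)$ recovers the signal sample by sample.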
The Short-Time Discrete Fourier Transform
Definition of STDFT
Analysis:
$X_n(k) = X_n[e^{j(2\pi/M)k}] = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,W_M^{km}$, k = 0, 1, ..., M-1, where $W_M = e^{-j(2\pi/M)}$
Synthesis:
$\hat{x}(n) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{r=-\infty}^{\infty} f(n-r)\,X_r(k)\,W_M^{-kn}$
Under what condition do we have $\hat{x}(n) = x(n)$?
These are the sampled-frequency counterparts of
$X_n(e^{j\omega}) = \sum_{m} h(n-m)\,x(m)\,e^{-j\omega m}$ and $\hat{x}(n) = \sum_{r} f(n-r)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} X_r(e^{j\omega})\,e^{j\omega n}\,d\omega$
Synthesis
$\hat{x}(n) = \sum_{r} f(n-r)\,\frac{1}{M}\sum_{k=0}^{M-1} X_r(k)\,W_M^{-kn}$
Proceeding as in the continuous-frequency case:
$\hat{x}(n) = \sum_{r} f(n-r)\,h(r-n)\,x(n) = x(n)\sum_{r} f(n-r)\,h(r-n)$
which equals x(n) if $\sum_{r} f(n-r)\,h(r-n) = 1$.
However, with only M frequency samples the inverse DFT is periodic with period M, so both sequences are treated as periodic:
$x(n) = x(n+M)$, $\hat{x}(n) = \hat{x}(n+M)$
Equivalently, the M-point inverse DFT time-aliases the windowed data: $\hat{x}(n) = \sum_{p} x(n-pM)\sum_{r} f(n-r)\,h[r-n+pM]$.
We need only one period. Therefore, the condition is restated as:
$\sum_{r} f(n-r)\,h[r-n+pM] = \delta(p)$
Implementation Consideration
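[Figure: spectrogram in the (n, k) plane, with time n on the horizontal axis and frequency index k on the vertical axis.]
Sampling
[Figure: the same spectrogram sampled in time every R samples, at n = 0, R, 2R, 3R, 4R, giving $X_0(k)$, $X_R(k)$, $X_{2R}(k)$, $X_{3R}(k)$, $X_{4R}(k)$.]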
Sampled STDFT
STDFT computed at every sample:
Analysis: $X_n(k) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,W_M^{km}$
Synthesis: $\hat{x}(n) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{r=-\infty}^{\infty} f(n-r)\,X_r(k)\,W_M^{-kn}$
Condition for $\hat{x}(n) = x(n)$: $\sum_{r} f(n-r)\,h[r-n+pM] = \delta(p)$
STDFT sampled every R samples (n = sR):
Analysis: $X_{sR}(k) = \sum_{m=-\infty}^{\infty} h(sR-m)\,x(m)\,W_M^{km}$
Synthesis: $\hat{x}(n) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{s=-\infty}^{\infty} f(n-sR)\,X_{sR}(k)\,W_M^{-kn}$
Under what condition do we have $\hat{x}(n) = x(n)$? Restricting r to multiples of R gives:
$\sum_{s} f(n-sR)\,h[sR-n+pM] = \delta(p)$
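One way to satisfy the sampled condition is sketched below. The choices are assumptions for illustration, not taken from the slides: a square-root periodic Hann window for h, a scaled copy of it for f, window length L = M = 16, and hop R = L/4. With $L \le M$ the terms with $p \neq 0$ vanish, and the $p = 0$ term reduces to an overlap-add sum of the product window. The convention $W_M = e^{-j 2\pi/M}$ is also assumed.

```python
import numpy as np

M = 16          # DFT size (number of frequency samples)
L = 16          # window length, L <= M
R = 4           # hop: the STDFT is kept only at n = sR

u = np.arange(L)
hann = 0.5 * (1 - np.cos(2 * np.pi * u / L))  # periodic Hann; overlaps sum to 2 at hop L/4
h = np.sqrt(hann)            # analysis window h(u), u = 0..L-1
f_rev = np.sqrt(hann) / 2    # synthesis window stored as f(-u), u = 0..L-1

rng = np.random.default_rng(3)
x = rng.standard_normal(48)  # assumed test signal
k = np.arange(M)

def analyze(n):
    """X_n(k) = sum_m h(n-m) x(m) W_M^{km}, with W_M = exp(-2j*pi/M)."""
    X = np.zeros(M, dtype=complex)
    for m in range(max(0, n - L + 1), min(len(x), n + 1)):
        X += h[n - m] * x[m] * np.exp(-2j * np.pi * k * m / M)
    return X

num_frames = (len(x) + L - 2) // R + 1
x_hat = np.zeros(len(x), dtype=complex)
for s in range(num_frames):
    X = analyze(s * R)
    for offset in range(L):          # f(n - sR) is nonzero for n = sR - offset
        n = s * R - offset
        if 0 <= n < len(x):
            idft = np.sum(X * np.exp(2j * np.pi * k * n / M)) / M
            x_hat[n] += f_rev[offset] * idft

# p = 0 part of the condition: sum_s f(n - sR) h(sR - n) = 1 for every n
overlap = np.array([sum(f_rev[v] * h[v] for v in range(L) if (v + n) % R == 0)
                    for n in range(R)])
max_error = np.max(np.abs(x_hat - x))
print(overlap, max_error)
```

The loops are written for clarity rather than speed; a practical implementation would use an FFT per frame.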
Wide-Band Analysis/Synthesis
Short-Time Synthesis --- Filter Bank Summation
The STFT evaluated at frequency $\omega_k$:
$X_n(e^{j\omega_k}) = \sum_{m=-\infty}^{\infty} h(n-m)\,x(m)\,e^{-j\omega_k m} = h(n) * [x(n)\,e^{-j\omega_k n}]$
[Figure: x(n) is multiplied by $e^{-j\omega_k n}$ and passed through the lowpass filter h(n), giving $X_n(e^{j\omega_k})$.]
Substituting $m \to n-m$:
$X_n(e^{j\omega_k}) = \sum_{m} x(n-m)\,h(m)\,e^{-j\omega_k (n-m)} = e^{-j\omega_k n}\sum_{m} x(n-m)\,h(m)\,e^{j\omega_k m}$
Defining $h_k(n) = h(n)\,e^{j\omega_k n}$:
$X_n(e^{j\omega_k}) = e^{-j\omega_k n}\sum_{m} x(n-m)\,h_k(m)$
In the frequency domain, $H_k(e^{j\omega}) = H(e^{j(\omega-\omega_k)})$: H is a lowpass filter and $H_k$ is a bandpass filter centered at $\omega_k$.
[Figure: $|H(e^{j\omega})|$, a lowpass response centered at 0, and $|H_k(e^{j\omega})|$, the same response shifted to $\omega_k$.]
Two equivalent structures for one channel:
– Bandpass form: x(n) is filtered by the bandpass filter $h_k(n)$, then multiplied by $e^{-j\omega_k n}$.
– Lowpass form: x(n) is multiplied by $e^{-j\omega_k n}$, then filtered by the lowpass filter h(n).
$X_n(e^{j\omega_k})$ is the lowpass representation of the signal in a band centered at $\omega_k$.
Encoding one band produces $X_n(e^{j\omega_k})$; decoding one band multiplies it by $e^{j\omega_k n}$:
$y_k(n) = X_n(e^{j\omega_k})\,e^{j\omega_k n} = x(n) * h_k(n)$
[Figure: analysis/synthesis filter bank with N channels, k = 0, 1, ..., N-1. Analysis: x(n) is filtered by $h_k(n)$ and multiplied by $e^{-j\omega_k n}$ to give $X_n(e^{j\omega_k})$. Synthesis: each $X_n(e^{j\omega_k})$ is multiplied by $e^{j\omega_k n}$ to give $y_k(n)$, and the $y_k(n)$ are summed to give y(n).]
Equally Spaced Ideal Filters
$H_k(e^{j\omega}) = H(e^{j(\omega-\omega_k)})$, $\omega_k = \frac{2\pi k}{N}$, k = 0, 1, ..., N-1
[Figure: N = 6 ideal bandpass responses, each of width $2\pi/N$, centered at $\omega_0, \omega_1, \ldots, \omega_5$ and tiling the frequency axis.]
[Figure: x(n) feeds the bandpass filters $h_0(n), h_1(n), \ldots, h_{N-1}(n)$; their outputs $y_0(n), y_1(n), \ldots, y_{N-1}(n)$ are summed to give y(n).]
Composite response:
$\tilde{H}(e^{j\omega}) = \sum_{k=0}^{N-1} H_k(e^{j\omega})$
What condition should be satisfied so that y(n) = x(n)?
Sampling $H(e^{j\omega})$ at N equally spaced frequencies and taking the inverse discrete Fourier transform gives a time-aliased version of h(n):
$\frac{1}{N}\sum_{k=0}^{N-1} H(e^{j\omega_k})\,e^{j\omega_k n} = \sum_{r=-\infty}^{\infty} h(n+rN)$
Consider the FIR case, i.e., h(n) has a duration of L samples, $0 \le n \le L-1$.
If $N \ge L$, there is no time aliasing, and at n = 0:
$\frac{1}{N}\sum_{k=0}^{N-1} H(e^{j\omega_k}) = h(0)$
For $N \ge L$ the composite response does not depend on $\omega$, so
$\tilde{H}(e^{j\omega}) = \sum_{k=0}^{N-1} H(e^{j(\omega-\omega_k)}) = \sum_{k=0}^{N-1} H(e^{j\omega_k}) = N\,h(0)$
and therefore
$y(n) = N\,h(0)\,x(n)$
x(n) can always be reconstructed if $N \ge L$.
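The $N \ge L$ result can be checked with a short experiment. The sketch below assumes a Hamming window as the lowpass prototype (any FIR h with at most N taps and $h(0) \neq 0$ would do) and a random test signal; it sums the outputs of the N bandpass channels and compares the result with $N\,h(0)\,x(n)$.

```python
import numpy as np

N = 8                      # number of equally spaced bands
L = 6                      # FIR length, L <= N
h = np.hamming(L)          # assumed causal lowpass prototype, h(0) != 0
n = np.arange(L)

rng = np.random.default_rng(4)
x = rng.standard_normal(50)   # assumed test signal

y = np.zeros(len(x) + L - 1, dtype=complex)
for k in range(N):
    omega_k = 2 * np.pi * k / N
    h_k = h * np.exp(1j * omega_k * n)   # bandpass filter h_k(n) = h(n) e^{j w_k n}
    y += np.convolve(x, h_k)             # y_k(n) = x(n) * h_k(n)

# Composite impulse response: sum of all bandpass impulse responses
composite = sum(h * np.exp(2j * np.pi * k * n / N) for k in range(N))
max_error = np.max(np.abs(y[:len(x)] - N * h[0] * x))
print(np.round(composite.real, 6), max_error)
```

The composite impulse response collapses to a single tap of height $N\,h(0)$ at n = 0, which is why the bank output is just a scaled copy of the input.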
Can x(n) still be reconstructed if N < L? If so, what condition should be satisfied?
The bandpass impulse responses are
$h_k(n) = h(n)\,e^{j\omega_k n}$, $\omega_k = \frac{2\pi k}{N}$
so the composite impulse response is
$\tilde{h}(n) = \sum_{k=0}^{N-1} h_k(n) = h(n)\sum_{k=0}^{N-1} e^{j\omega_k n} = h(n)\,p(n)$
where
$p(n) = \sum_{k=0}^{N-1} e^{j\omega_k n} = N\sum_{r=-\infty}^{\infty} \delta(n-rN)$
Hence
$\tilde{h}(n) = h(n)\,p(n) = N\sum_{r=-\infty}^{\infty} h(rN)\,\delta(n-rN)$
The signal can be reconstructed if $\tilde{h}(n)$ equals $\delta(n-m)$, i.e., a pure delay of m samples.
Typical Sequences of h(n)
$\tilde{h}(n) = h(n)\,p(n)$, where p(n) consists of impulses of height N at n = 0, ±N, ±2N, ...
– Ideal lowpass filter with cutoff at $\pi/N$: $h(n) = \frac{\sin(\pi n/N)}{\pi n}$, with h(0) = 1/N and zeros at all other multiples of N. Result: $\tilde{h}(n) = \delta(n)$.
– FIR filter of duration L with $N \ge L$: only h(0) coincides with an impulse of p(n). Result: $\tilde{h}(n) = N\,h(0)\,\delta(n)$.
– A causal FIR lowpass filter with h(2N) = 1/N and h(rN) = 0 for all other r. Result: $\tilde{h}(n) = \delta(n-2N)$.
– A causal IIR lowpass filter with h(N) = 1/N and h(rN) = 0 for all other r. Result: $\tilde{h}(n) = \delta(n-N)$.
[Figure: for each case, p(n) (impulses of height N at multiples of N) is plotted above the corresponding h(n).]
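The causal FIR case with N < L can be illustrated as follows. This is a sketch under assumptions: the prototype is a Hamming-windowed, delayed version of the ideal lowpass response (windowing keeps the zeros at multiples of N), with N = 4 bands and a delay of 2N samples. The names `delay` and `composite` are illustrative.

```python
import numpy as np

N = 4                                   # number of bands
delay = 2 * N                           # assumed delay of the causal prototype
n = np.arange(4 * N + 1)                # filter length L = 4N + 1 = 17 > N

# Windowed sinc: h(2N) = 1/N, and h(rN) = 0 for the other multiples of N
h = np.sinc((n - delay) / N) / N * np.hamming(len(n))

# Composite impulse response h~(n) = h(n) * sum_k e^{j 2 pi k n / N}
composite = np.zeros(len(n), dtype=complex)
for k in range(N):
    composite += h * np.exp(2j * np.pi * k * n / N)

rng = np.random.default_rng(5)
x = rng.standard_normal(30)             # assumed test signal
y = np.convolve(x, composite)           # output of the summed filter bank
max_error = np.max(np.abs(y[delay:delay + len(x)] - x))
print(np.round(composite.real, 6), max_error)
```

Even though the filter is longer than N, the composite response is a single unit impulse at n = 2N, so the output is the input delayed by 2N samples.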
Filter Bank Implementation for a Single Channel
[Figure: one channel k of the analysis/synthesis system, in two equivalent forms. Bandpass form: x(n) is filtered by $h_k(n)$ and multiplied by $e^{-j\omega_k n}$ to give $X_n(e^{j\omega_k})$. Lowpass form: x(n) is multiplied by $e^{-j\omega_k n}$ and filtered by h(n). In both, synthesis multiplies $X_n(e^{j\omega_k})$ by $e^{j\omega_k n}$ to give $y_k(n)$.]
Because $X_n(e^{j\omega_k})$ is a lowpass signal, a decimator (R:1) can follow the analysis stage and an interpolator (1:R) can precede the synthesis stage.
R = ? It depends on the bandwidth of h(n).
Sub-Band Coding
Filter Bank Implementation (Direct Implementation)
[Figure: N-channel filter bank. Analysis: in channel k (k = 0, 1, ..., N-1), x(n) is multiplied by $W_N^{kn}$, filtered by h(n), and decimated R:1 to give $X_{sR}(k)$. Synthesis: each $X_{sR}(k)$ is interpolated 1:R, filtered by f(n), and multiplied by $W_N^{-kn}$; the channel outputs are summed to give $\hat{x}(n)$.]
Complex channels: each $X_{sR}(k)$ is a complex lowpass signal of bandwidth B/2, which allows $R = 2\pi/B$.
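Filter Bank Implementation (Practical Implementation)
[Figure: spectra at successive stages for band k. The band of width B is centered at $\omega_k$; modulation by $W_N^{kn}$ shifts it to baseband, occupying -B/2 to B/2; a further modulation by $e^{jBn/2}$ shifts it to occupy 0 to B, so that the corresponding real signal occupies -B to B.]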
Write the complex channel signal in terms of its real and imaginary parts:
$X_n(e^{j\omega_k}) = a_k(n) - j\,b_k(n)$
where $a_k(n) = h(n) * [x(n)\cos\omega_k n]$ and $b_k(n) = h(n) * [x(n)\sin\omega_k n]$. The real band signal is then
$y_k(n) = 2a_k(n)\cos(Bn/2) + 2b_k(n)\sin(Bn/2)$
[Figure: x(n) is multiplied by $\cos\omega_k n$ and $\sin\omega_k n$; each product is lowpass filtered by h(n) to give $a_k(n)$ and $b_k(n)$; these are multiplied by $\cos(Bn/2)$ and $\sin(Bn/2)$ and added to give $\frac{1}{2}y_k(n)$.]
The output can be decimated D:1 with $D = \pi/B$. Why? $y_k(n)$ is a real signal confined to $|\omega| \le B$, so keeping every D-th sample stretches its band to fill $|\omega| \le \pi$ without aliasing.
Moving the decimators ahead of the second modulators, with n = sD:
$\frac{1}{2}y_k(sD) = a_k(sD)\cos(BsD/2) + b_k(sD)\sin(BsD/2) = a_k(sD)\cos(\pi s/2) + b_k(sD)\sin(\pi s/2)$
The modulating sequences become
$\cos(\pi s/2)$: 1, 0, -1, 0, 1, 0, -1, ...
$\sin(\pi s/2)$: 0, 1, 0, -1, 0, 1, 0, ...
so the nonzero multipliers are just alternating signs of the form $(-1)^s$.
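The effect of choosing $D = \pi/B$ on the second modulators can be seen by evaluating them at the decimated times. A small sketch, assuming an arbitrary band width $B = \pi/8$ (so D = 8) purely for illustration:

```python
import numpy as np

B = np.pi / 8                 # assumed band width in radians per sample
D = int(round(np.pi / B))     # decimation factor D = pi / B (here 8)
s = np.arange(8)              # index of the decimated samples, n = s * D

# Modulators cos(Bn/2) and sin(Bn/2) evaluated at n = sD
cos_seq = np.rint(np.cos(B * s * D / 2)).astype(int)
sin_seq = np.rint(np.sin(B * s * D / 2)).astype(int)

# Nonzero entries alternate in sign
alternating = (-1) ** np.arange(4)

print(cos_seq.tolist())   # [1, 0, -1, 0, 1, 0, -1, 0]
print(sin_seq.tolist())   # [0, 1, 0, -1, 0, 1, 0, -1]
```

Half of the multipliers are zero and the rest are ±1, so no actual multiplications are needed after decimation, and each branch only has to supply every second decimated sample.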
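Because $\cos(\pi s/2)$ vanishes for odd s and $\sin(\pi s/2)$ vanishes for even s, the $a_k$ branch contributes only at even s and the $b_k$ branch only at odd s. Each branch therefore needs only every second decimated sample: the D:1 decimators are replaced by 2D:1 decimators producing $a_k(2Ds)$ and $b_k(2Ds)$, and the remaining multipliers reduce to the sign sequence $(-1)^s$.
[Figure: x(n) is multiplied by $\cos\omega_k n$ and $\sin\omega_k n$, lowpass filtered by h(n), decimated 2D:1 to give $a_k(2Ds)$ and $b_k(2Ds)$, multiplied by $(-1)^s$, and combined into $\frac{1}{2}y_k(sD)$.]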
[Figure: complete sub-band coder for band k. Analysis: x(n) is multiplied by $\cos\omega_k n$ and $\sin\omega_k n$, filtered by h(n), decimated 2D:1, and multiplied by $(-1)^s$ to give $a_k(2Ds)$ and $b_k(2Ds)$, which enter the ADPCM codec. Synthesis: the decoded $\hat{a}_k(2Ds)$ and $\hat{b}_k(2Ds)$ are multiplied by $(-1)^s$, interpolated 1:2D, filtered by f(n), multiplied by $\cos\omega_k n$ and $\sin\omega_k n$, and added to give $\hat{x}_k(n)$.]
Overall structure: Filter Bank Analysis, then Sub-Band Coder (Modification), then Filter Bank Synthesis.