E85.2607: Lecture 8 – Source-Filter Processing, 2010-04-01 (21 slides)


Jul 06, 2020


dariahiddleston
Transcript
  • E85.2607: Lecture 8 – Source-Filter Processing

    E85.2607: Lecture 8 – Source-Filter Processing 2010-04-01 1 / 21

  • Source-filter analysis/synthesis

    [Figure: analysis, transformation, and synthesis stages operating on a source signal (versus time n) and a spectral envelope (versus frequency f)]

    Separate the signal into:

    Source/excitation: fine time/frequency structure (e.g. pitch)

    Filter: broad spectral shape (resonances)

    Similar to subtractive synthesis

    Satisfying physical interpretation for real-world signals

    Easier to make sense of than e.g. phase


  • Human speech production

    Reasonable approximation to speech signals:

    Source is oscillation of the vocal cords

    e.g. normal speech (varying pitches) vs whispering

    Filtered by the vocal tract (throat + tongue + lips)

    e.g. “oooh” vs “aaah”; resonances = formants

    Both are time-varying


  • Source filter model

    [Figure: an excitation source (versus time t) drives a resonance filter (versus frequency f). Panels: time signal of the prediction error e(n) versus n; magnitude spectra |X(f)| and |G·H(f)| in dB versus f/kHz.]

  • Formants in speech

    [Figure: spectrogram of an utterance (“watch thin as a dime”) annotated with phone labels]

  • How to separate the source and filter?

    [Figure: source-filter processing block diagram with input x(n), spectral envelope estimation (channel vocoder, LPC, or cepstrum), source signal e1(n), filters 1/H1(z) and H2(z), spectral envelope transformation, source signal processing, and output y(n)]

    Short-time analysis

    For each frame, estimate the spectral envelope (filter response):

    1. Channel vocoder (frequency-domain)

    2. Linear Predictive Coding (LPC) (time-domain)

    3. Cepstral analysis

    Source signal is what's left over (residual) after “whitening”

  • Channel vocoder

    Wideband STFT filterbank

    but using relatively few filters

    Linearly spaced with equal bandwidth (STFT)

    Logarithmically spaced (constant-Q filter bank)

    Take RMS energy in each frequency band

    [Figure: (a) channel vocoder block diagram: x(n) feeds bandpass filters BP 1 … BP k; each band signal is squared and lowpass filtered to give x_RMS1(n) … x_RMSk(n). (b) Octave-spaced and equally-spaced channel stacking. Plot: short-time spectrum and spectral envelope, X(f)/dB versus f/Hz.]
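
    The band-energy idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the bandpass filters are idealized as FFT masks over equally spaced bands, the lowpass is a moving average, and the names `band_rms_envelopes`, `n_bands`, and `smooth_ms` are hypothetical, not taken from the lecture.

```python
import numpy as np

def band_rms_envelopes(x, sr, n_bands=8, smooth_ms=20.0):
    """Per-band RMS envelopes, shape (n_bands, len(x))."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    # Assumption: equally spaced bands from 0 Hz to Nyquist.
    edges = np.linspace(0.0, sr / 2.0, n_bands + 1)
    # Assumption: a moving average stands in for the lowpass filter.
    win_len = max(1, int(round(smooth_ms * 1e-3 * sr)))
    lowpass = np.ones(win_len) / win_len
    envelopes = np.empty((n_bands, n))
    for b in range(n_bands):
        lo = edges[b]
        hi = edges[b + 1] if b < n_bands - 1 else np.inf
        mask = (freqs >= lo) & (freqs < hi)
        # Idealized bandpass: keep only the FFT bins inside the band.
        band = np.fft.irfft(spectrum * mask, n=n)
        # Square, lowpass, then square root gives a running RMS.
        power = np.convolve(band ** 2, lowpass, mode="same")
        envelopes[b] = np.sqrt(np.maximum(power, 0.0))
    return envelopes

# Toy input: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)
env = band_rms_envelopes(x, sr)
```

    With this toy input the energy should land in the lowest band (0 to 500 Hz), with a mid-signal RMS close to 1/√2. A practical vocoder would use real bandpass filters and process the signal frame by frame.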

  • Channel vocoder using FFT

    [Figure: channel vocoder block diagram, octave-spaced and equally-spaced channel stacking, and a plot of the short-time spectrum with its spectral envelope (X(f)/dB versus f/Hz)]

    Lowpass filter magnitude of each STFT frame

    i.e. filter columns of the spectrogram
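
    A minimal sketch of this smoothing step, assuming a Hann window and a moving average along the frequency axis as the lowpass filter. The function name `stft_envelope` and its parameters are hypothetical. Each row of the returned arrays is one frame, i.e. one column of the spectrogram.

```python
import numpy as np

def stft_envelope(x, frame_len=256, hop=128, smooth_bins=9):
    """Return (magnitudes, envelopes), each shaped (n_frames, n_bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Assumption: moving average over neighbouring frequency bins.
    kernel = np.ones(smooth_bins) / smooth_bins
    mags, envs = [], []
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * window
        mag = np.abs(np.fft.rfft(frame))
        mags.append(mag)
        # Lowpass filtering along frequency removes the harmonic comb
        # and leaves the broad spectral shape.
        envs.append(np.convolve(mag, kernel, mode="same"))
    return np.array(mags), np.array(envs)

# Toy input: a pulse train with a 40-sample period (a harmonic-rich "buzz").
x = np.zeros(2048)
x[::40] = 1.0
mags, envs = stft_envelope(x)
```

    The smoothing width trades detail for smoothness: it should span at least one harmonic spacing so that individual harmonics are averaged out.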


  • Linear predictive coding

    Predict next input sample as linear combination of previous samples

    [Figure: excitation source → synthesis filter (spectral envelope model) → sound. The prediction x̂(n) is formed from delayed copies of x(n) (z^-1 taps weighted by a1, a2, …, ap) and subtracted from x(n) to give e(n).]

    Filter is described by a few filter coefficients for each frame

    $$x[n] \approx \hat{x}[n] = \sum_{k=1}^{p} a_k x[n-k]$$

    Excitation is what's left after filtering (residual, aka prediction error)

    $$e[n] = x[n] - \hat{x}[n] = x[n] - \sum_{k=1}^{p} a_k x[n-k]$$
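
    The prediction and residual equations can be checked numerically. The sketch below assumes the coefficients are already known and that samples before n = 0 are zero; `lpc_residual` and the toy AR(2) signal are illustrative choices, not part of the lecture.

```python
import numpy as np

def lpc_residual(x, a):
    """e[n] = x[n] - sum_k a[k-1] * x[n-k], with x[n] = 0 for n < 0."""
    x_hat = np.zeros_like(x)
    for k in range(1, len(a) + 1):
        # Add a_k * x[n-k] to the prediction for every n >= k.
        x_hat[k:] += a[k - 1] * x[:-k]
    return x - x_hat

# Toy signal: white noise driving a known (hypothetical) 2nd-order predictor.
rng = np.random.default_rng(0)
a_true = np.array([1.3, -0.6])
e_true = rng.standard_normal(500)
x = np.zeros(500)
for n in range(500):
    x[n] = e_true[n]
    for k in range(1, 3):
        if n - k >= 0:
            x[n] += a_true[k - 1] * x[n - k]

e = lpc_residual(x, a_true)
```

    Because the signal was generated with exactly these coefficients, the residual recovers the driving noise up to rounding error.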

  • LPC analysis/synthesis

    [Figure: (a) LPC analysis: x(n) passes through P(z) to give x̂(n), which is subtracted from x(n) to give e(n). (b) LPC synthesis: the excitation ẽ(n) plus the output fed back through P(z) gives y(n).]

    P(z) is just an FIR filter:

    $$P(z) = \sum_{k=1}^{p} a_k z^{-k}$$

    Excitation is still a filtered version of the input:

    $$E(z) = X(z)\,(1 - P(z))$$

    For synthesis, pass the (approximate) excitation through the inverse filter:

    $$Y(z) = \tilde{E}(z)\,H(z), \qquad H(z) = \frac{1}{1 - P(z)}$$

    all-pole “autoregressive” (AR) modeling
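
    A small round-trip sketch of the two filters: the FIR analysis filter 1 − P(z) followed by the all-pole synthesis filter 1 / (1 − P(z)). The function names and the example coefficients are assumptions made for illustration.

```python
import numpy as np

def analysis_filter(x, a):
    """Apply 1 - P(z): e[n] = x[n] - sum_k a_k x[n-k]."""
    inverse = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(x, inverse)[: len(x)]

def synthesis_filter(e, a):
    """Apply H(z) = 1 / (1 - P(z)): y[n] = e[n] + sum_k a_k y[n-k]."""
    a = np.asarray(a, dtype=float)
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, min(len(a), n) + 1):
            acc += a[k - 1] * y[n - k]
        y[n] = acc
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(300)
a = [1.3, -0.6]  # hypothetical coefficients; poles lie inside the unit circle
e = analysis_filter(x, a)
y = synthesis_filter(e, a)
```

    Feeding the exact residual back through the synthesis filter reproduces the input; anything done to the excitation in between (quantization, replacement, pitch changes) is what turns this into a processing chain.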

  • LPC - varying filter order

    LPC filter H(z) models the spectrum of x[n]

    Minimizing the energy of the residual e[n] gives the optimal coefficients

    $$\{a_k\} = \arg\min_{a_k} \sum_n \Big( x[n] - \sum_k a_k x[n-k] \Big)^2$$

    The approximation improves with increasing filter order p

    [Figure: spectra of the original and of LPC filters, |X(f)|/dB versus f/kHz, for p = 10, 20, 40, 60, 80, 120]
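
    The effect of the filter order can be illustrated by solving the minimization directly as a least-squares problem. This is a sketch of one possible formulation (summing the error over n = p … N−1); `lpc_least_squares` and the toy signal are assumptions, not the lecture's implementation.

```python
import numpy as np

def lpc_least_squares(x, p):
    """Minimize sum_n (x[n] - sum_k a_k x[n-k])^2 over n = p..N-1."""
    N = len(x)
    # Column k-1 holds x[n-k] for n = p..N-1.
    A = np.column_stack([x[p - k : N - k] for k in range(1, p + 1)])
    b = x[p:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = b - A @ a
    return a, float(np.sum(residual ** 2))

# Toy signal: three sinusoids plus a little noise.
rng = np.random.default_rng(1)
n = np.arange(2000)
x = (np.sin(0.3 * n) + 0.5 * np.sin(1.1 * n + 0.4)
     + 0.25 * np.sin(2.0 * n) + 0.1 * rng.standard_normal(2000))

orders = [2, 4, 8, 16]
energies = [lpc_least_squares(x, p)[1] for p in orders]
```

    In this formulation the residual energy cannot increase as p grows, since a higher-order predictor can always set its extra coefficients to zero.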

  • Estimating LPC parameters

    Set the derivative of $\sum_n e^2[n]$ w.r.t. $a_k$ to zero and solve for $a_k$:

    $$\frac{\partial}{\partial a_k} \sum_n e^2[n] = 0$$

    End up with p linear equations involving autocorrelations of x:

    $$\sum_m x[m]\,x[m-k] = \sum_i a_i \sum_m x[m-i]\,x[m-k]$$

    Solve using Levinson-Durbin recursion
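
    A sketch of the recursion, assuming the autocorrelation form of the normal equations, i.e. $\sum_i a_i\, r[|i-k|] = r[k]$ for k = 1 … p with $r[k] = \sum_m x[m]\,x[m-k]$. The function names and the toy AR(2) signal are illustrative.

```python
import numpy as np

def autocorrelation(x, p):
    """r[k] = sum_m x[m] x[m-k] for k = 0..p."""
    return np.array([np.dot(x[k:], x[: len(x) - k]) for k in range(p + 1)])

def levinson_durbin(r, p):
    """Solve sum_i a_i r[|i-k|] = r[k], k = 1..p, one order at a time."""
    a = np.zeros(p)
    err = r[0]  # prediction error energy of the order-0 predictor
    for m in range(p):
        # Reflection coefficient for order m + 1.
        acc = r[m + 1] - np.dot(a[:m], r[m:0:-1])
        k = acc / err
        # Update the lower-order coefficients, then append the new one.
        a[:m] = a[:m] - k * a[:m][::-1]
        a[m] = k
        err *= 1.0 - k * k
    return a, err

# Toy signal: white noise through a known (hypothetical) AR(2) filter.
rng = np.random.default_rng(0)
a_true = np.array([1.3, -0.6])
noise = rng.standard_normal(5000)
x = np.zeros(5000)
for n in range(5000):
    x[n] = noise[n]
    if n >= 1:
        x[n] += a_true[0] * x[n - 1]
    if n >= 2:
        x[n] += a_true[1] * x[n - 2]

r = autocorrelation(x, 4)
a_est, err = levinson_durbin(r, 2)
```

    The recursion exploits the Toeplitz structure of the autocorrelation matrix, so it needs on the order of p² operations instead of the p³ of a general linear solve.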

  • LPC example

    [Figure: LPC example. Panels: windowed original and LPC residual (time / samp); original spectrum, LPC spectrum, and residual spectrum (dB versus freq / Hz); filter poles in the z-plane.]

  • Short-time LPC analysis

    [Slide from E4896 Music Signal Processing (Dan Ellis), 2010-02-22: “Short-Time LP Analysis”]

    Solve LPC for each ~20 ms frame

    [Figure: spectrograms (freq / kHz versus time / s) and a z-plane pole plot (real part versus imaginary part)]

  • Cepstral analysis

    cepstrum = String.reverse(“spec”) + “trum”

    Entire lexicon of funny anagrams

    Insight: source and filter add in the log spectral domain

    X (z) = E (z)H(z)

    log X (z) = log E (z) + log H(z)

    Makes them easy to separate

    [Figure: cepstrum block diagram. y(n) = x(n) * h(n) is windowed by w(n), then FFT → log|Y(k)| → IFFT gives the real cepstrum c(n). A lowpass window w_LP(n) followed by an FFT gives the spectral envelope C_h(k) = log|H(k)| from c_h(n); a highpass window w_HP(n) followed by an FFT gives the source envelope C_x(k) = log|X(k)| from c_x(n).]
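
    The cepstral chain (FFT, log magnitude, IFFT, then a lowpass window on the cepstrum) might be sketched as follows. The function names, the rectangular lifter, the small constant added before the log, and the toy frame are all assumptions made for illustration.

```python
import numpy as np

def real_cepstrum(frame):
    """IFFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-12)  # guard against log(0)
    return np.real(np.fft.ifft(log_mag))

def lifter_envelope(frame, n_keep=30):
    """Log-magnitude spectral envelope from the low-quefrency cepstrum."""
    c = real_cepstrum(frame)
    n = len(c)
    lifter = np.zeros(n)
    lifter[:n_keep] = 1.0
    # Keep the mirrored samples too, so the liftered cepstrum stays symmetric
    # and its FFT stays real.
    lifter[n - n_keep + 1 :] = 1.0
    return np.real(np.fft.fft(c * lifter))

# Toy frame: pulse train (source) convolved with a decaying resonance (filter).
n = 512
pulses = np.zeros(n)
pulses[::64] = 1.0
ir = 0.9 ** np.arange(40) * np.cos(0.5 * np.arange(40))
frame = np.convolve(pulses, ir)[:n] * np.hanning(n)

cepstrum = real_cepstrum(frame)
envelope = lifter_envelope(frame, n_keep=30)
```

    Keeping more cepstral samples follows the spectrum more closely; keeping all of them simply returns the log magnitude spectrum. A highpass lifter, the complement of the one above, would keep the source fine structure instead.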

  • Liftering example

    By low-pass “liftering” the cepstrum we obtain the spectral envelope of the signal


  • Liftering example 2

    Original waveform has excitation fine structure convolved with resonances

    DFT shows harmonics modulated by resonances

    Log DFT is sum of harmonic ‘comb’ and resonant bumps

    IDFT separates out resonant bumps (low quefrency) and regular, fine structure (‘pitch pulse’)

    Selecting low-n cepstrum separates resonance information (deconvolution / ‘liftering’)

    [Figure: four panels. Waveform and min. phase IR (samps); abs(dft) and liftered (freq / Hz); log(abs(dft)) and liftered (dB versus freq / Hz); real cepstrum and lifter (quefrency), with the pitch pulse marked.]

  • Applications - Speech coding

    [Slide from E4896 Music Signal Processing (Dan Ellis), 2010-02-22: “LPC Synthesis”]

    LP analysis on ~20 ms frames gives prediction filter and residual

    recombining them should yield perfect reconstruction

    coding applications compress further

    e.g. simple pitch tracker → “buzz-hiss” encoding

    [Figure: encoder/decoder block diagram. Encoder: input s[n] → LPC analysis → filter coefficients {a_i} and residual e[n], each passed to “represent & encode”. Decoder: excitation generator ê[n] → all-pole filter H(z) = 1 / (1 − Σ a_i z^-i) → output ŝ[n]. Plot: 16 ms frame boundaries and pitch period values versus time / s.]

    Low bitrate speech codec used in cell phones is based on LPC

    Quantize LPC filter parameters, use crude approximation to residual

    Many different ways to represent filter params: prediction coefficients {a_k}, roots of 1 − P(z), line spectral frequencies

    Switch between noise and pulse train for excitation


    Use codebook of excitations (CELP: Code Excited Linear Prediction)
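
    The “buzz-hiss” idea (pulse train for voiced frames, noise for unvoiced frames) can be sketched as an excitation generator. This is a toy under stated assumptions: the voiced/unvoiced decision and pitch period are given from outside, pulses are scaled to unit average power, and `buzz_hiss_excitation` is a hypothetical name. It does not model CELP codebooks.

```python
import numpy as np

def buzz_hiss_excitation(n_samples, voiced, pitch_period=80, rng=None):
    """Crude stand-in for the LPC residual of one frame."""
    if voiced:
        # "Buzz": one pulse per pitch period, scaled so the average power
        # is about 1 (exactly 1 when n_samples is a multiple of the period).
        e = np.zeros(n_samples)
        e[::pitch_period] = np.sqrt(pitch_period)
        return e
    # "Hiss": unit-variance white noise.
    rng = np.random.default_rng() if rng is None else rng
    return rng.standard_normal(n_samples)

# One voiced and one unvoiced frame of 160 samples each (hypothetical values).
buzz = buzz_hiss_excitation(160, voiced=True, pitch_period=80)
hiss = buzz_hiss_excitation(160, voiced=False, rng=np.random.default_rng(0))
excitation = np.concatenate([buzz, hiss])
```

    A decoder along these lines would pass each frame's excitation through that frame's all-pole filter, so only the filter parameters, a voicing flag, and a pitch period need to be transmitted.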


  • Applications - Cross-synthesis/Vocoding

    [Figure: spectrograms (freq / Hz, 0 to 4000, versus time / s). Top: original (mpgr1_sx419). Bottom: noise-excited LPC resynthesis with pole freqs.]

    Reconstruct using excitation from one sound and filter from another

    Whisperization: replace excitation with white noise

    http://www.ee.columbia.edu/~ronw/adst/lectures/matlab/wavs/mpgr1_sx419-8k.wav

    http://www.ee.columbia.edu/~ronw/adst/lectures/matlab/wavs/mpgr1_sx419-8k-whisper.wav
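
    Both operations can be sketched with LPC. For brevity this toy version fits a single filter to the whole signal rather than working frame by frame, and uses the autocorrelation method with a direct linear solve. All function names and the test signal are assumptions, not the code behind the linked examples.

```python
import numpy as np

def lpc(x, p):
    """Autocorrelation-method LPC coefficients a_1..a_p."""
    r = np.array([np.dot(x[k:], x[: len(x) - k]) for k in range(p + 1)])
    R = r[np.abs(np.subtract.outer(np.arange(p), np.arange(p)))]  # Toeplitz
    return np.linalg.solve(R, r[1:])

def residual(x, a):
    """Excitation: filter x with 1 - P(z)."""
    return np.convolve(x, np.concatenate(([1.0], -a)))[: len(x)]

def all_pole(e, a):
    """Filter e with 1 / (1 - P(z))."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        past = y[max(0, n - len(a)) : n][::-1]  # y[n-1], y[n-2], ...
        y[n] = e[n] + np.dot(a[: len(past)], past)
    return y

def cross_synthesize(source_sound, filter_sound, p=12):
    """Excitation from one sound, spectral envelope from another."""
    e = residual(source_sound, lpc(source_sound, p))
    return all_pole(e, lpc(filter_sound, p))

def whisperize(x, p=12, seed=0):
    """Replace the excitation with white noise of matching power."""
    a = lpc(x, p)
    e = residual(x, a)
    noise = np.random.default_rng(seed).standard_normal(len(x))
    noise *= np.sqrt(np.mean(e ** 2))
    return all_pole(noise, a)

# Toy "voice": white noise through a fixed (hypothetical) resonance.
rng = np.random.default_rng(2)
x = all_pole(rng.standard_normal(1000), np.array([1.3, -0.6]))
same = cross_synthesize(x, x, p=4)
whisper = whisperize(x, p=4)
```

    Cross-synthesizing a sound with itself returns the original, which is a handy sanity check. For real speech or music the analysis would be repeated on short overlapping frames so that the filter can vary over time.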

  • Still more applications

    [Slide from E4896 Music Signal Processing (Dan Ellis), 2010-02-22: “LPC Warping”]

    Replacing delays $z^{-1}$ with allpass elements $\frac{z + \alpha}{\alpha z + 1}$ warps frequencies but not magnitudes

    [Figure: frequency warping curves for α = 0.6 and α = −0.6; spectrograms (frequency, 0 to 8000, versus time) of the original and of the warped LPC resynthesis with α = −0.2]

    Process formants independent of pitch

    Pitch-shifting while preserving formants

    Shift formants while preserving pitch

    Voice transformation

    Pitch-analysis

    http://www.ee.columbia.edu/~dpwe/resources/matlab/polewarp/

    http://www.ee.columbia.edu/~dpwe/resources/matlab/polewarp/sm1_cln.wav

    http://www.ee.columbia.edu/~dpwe/resources/matlab/polewarp/sm1_cln_anp2.wav
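
    The frequency mapping of a first-order allpass can be computed in closed form. The sketch below assumes the allpass is written as (z + α)/(αz + 1) evaluated on the unit circle and takes its phase as the warped frequency; the sign convention and the name `allpass_warp` are assumptions and may differ from the linked polewarp code.

```python
import numpy as np

def allpass_warp(omega, alpha):
    """Phase of (z + alpha) / (alpha*z + 1) at z = exp(j*omega), |alpha| < 1."""
    omega = np.asarray(omega, dtype=float)
    return omega - 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 + alpha * np.cos(omega))

omega = np.linspace(0.0, np.pi, 257)
warped = allpass_warp(omega, 0.6)

# The allpass has unit magnitude everywhere on the unit circle,
# so only the frequency axis is reshaped.
z = np.exp(1j * omega)
magnitude = np.abs((z + 0.6) / (0.6 * z + 1.0))
```

    The map keeps 0 and π fixed and is monotonic for |α| < 1, so formant positions are stretched or compressed without reordering; α = 0 leaves the frequency axis unchanged.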

  • Reading

    DAFX 9.1 – 9.3 - Source-Filter Processing
