Transcript
Slide 1
Slide 2
MELP Vocoder
Slide 3
Outline
- Introduction
- MELP Vocoder Features
- Algorithm Description
- Parameters & Comparison
Slide 4
Introduction
- Traditional pitch-excited LPC vocoders use either a periodic pulse train or white noise as the excitation for the synthesis filter
- They produce intelligible speech at very low bit rates
- But they sometimes sound mechanical or buzzy and are prone to tonal noise
Slide 5
Introduction
- These problems arise from the inability of a simple pulse train to reproduce all kinds of voiced speech
- The MELP vocoder uses a mixed-excitation model, which represents a richer ensemble of speech characteristics
- It produces more natural-sounding speech
Slide 6
MELP vocoder
- Robust in background noise environments
- Based on the traditional LPC model, with additional features:
  - Mixed excitation
  - Aperiodic pulses
  - Adaptive spectral enhancement
  - Pulse dispersion
Mixed Excitation
- Mixed excitation is implemented using a multi-band mixing model
- This model can simulate frequency-dependent voicing strength
- It uses a mixture of periodic or aperiodic pulses and white noise as excitation
- The primary effect of this unit is to reduce the buzz in broadband acoustic noise
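
The multi-band mixing idea can be sketched as below. This is a minimal illustration, not the coder's actual filter design: the five band edges, the 8 kHz sampling rate, the 31-tap windowed-sinc filters, and all function names are assumptions made for this example. Each band filter is weighted by its voicing strength v for the pulse path and by 1 - v for the noise path, so a fully voiced band passes only pulses and a fully unvoiced band passes only noise.

```python
import math

# Assumed five-band split at 8 kHz sampling (not listed on the slides).
BAND_EDGES_HZ = [(0, 500), (500, 1000), (1000, 2000), (2000, 3000), (3000, 4000)]
FS = 8000
TAPS = 31  # odd length gives a symmetric, linear-phase FIR


def bandpass_fir(lo_hz, hi_hz, fs=FS, taps=TAPS):
    """Windowed-sinc bandpass FIR using a Hamming window."""
    mid = (taps - 1) // 2
    h = []
    for n in range(taps):
        m = n - mid
        if m == 0:
            ideal = 2.0 * (hi_hz - lo_hz) / fs
        else:
            ideal = (math.sin(2 * math.pi * hi_hz * m / fs)
                     - math.sin(2 * math.pi * lo_hz * m / fs)) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(ideal * window)
    return h


def shaping_filters(voicing):
    """Return (pulse_fir, noise_fir) as voicing-weighted sums of band filters."""
    pulse = [0.0] * TAPS
    noise = [0.0] * TAPS
    for (lo, hi), v in zip(BAND_EDGES_HZ, voicing):
        for i, c in enumerate(bandpass_fir(lo, hi)):
            pulse[i] += v * c          # voiced share of this band
            noise[i] += (1.0 - v) * c  # unvoiced share of this band
    return pulse, noise


def fir_filter(h, x):
    """Direct-form FIR convolution, truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def mixed_excitation(pulses, noise, voicing):
    """Sum the shaped pulse train and the shaped noise."""
    pulse_fir, noise_fir = shaping_filters(voicing)
    shaped_pulses = fir_filter(pulse_fir, pulses)
    shaped_noise = fir_filter(noise_fir, noise)
    return [p + q for p, q in zip(shaped_pulses, shaped_noise)]
```

With voicing strengths such as [1.0, 1.0, 0.5, 0.0, 0.0], the low bands are driven by pulses and the high bands by noise, which is the frequency-dependent behaviour the slide describes.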
Slide 15
Aperiodic pulses
- When the input signal is voiced, the MELP vocoder can synthesize speech using either aperiodic or periodic pulses
- Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal
- This produces erratic glottal pulses without tonal noise
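
A rough sketch of how periodic and aperiodic pulse positions might be generated is shown here. The jitter range of 25% of the pitch period, the uniform distribution, and the function name are assumptions for illustration; the slides do not give these values.

```python
import random


def pulse_positions(pitch_period, n_samples, aperiodic,
                    jitter_fraction=0.25, rng=None):
    """Return sample indices of excitation pulses.

    When `aperiodic` is true, each period is scaled by a random factor in
    [1 - jitter_fraction, 1 + jitter_fraction] (an assumed jitter model).
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    positions = []
    t = 0.0
    while t < n_samples:
        positions.append(int(t))
        period = float(pitch_period)
        if aperiodic:
            period *= 1.0 + rng.uniform(-jitter_fraction, jitter_fraction)
        t += period
    return positions
```

Calling pulse_positions(40, 180, aperiodic=False) gives evenly spaced pulses, while aperiodic=True spreads them irregularly, which breaks up the strict periodicity that causes tonal artifacts.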
Slide 16
Pulse Dispersion
- Pulse dispersion is implemented using a fixed pulse dispersion filter based on a spectrally flattened triangle pulse
- The pulse dispersion filter improves the match between bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance
- It spreads the excitation energy within a pitch period
- This reduces the harsh quality of the synthetic speech
Slide 17
Adaptive spectral enhancement filter
- Based on the poles of the vocal tract filter
- Used to enhance the formant structure in the synthetic speech
- Improves the match between synthetic and natural bandpass waveforms, giving more natural speech output
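
One common way to build an enhancement filter from the vocal tract poles is a pole-zero filter A(z/alpha) / A(z/beta), where A(z) is the LPC polynomial and both numerator and denominator are bandwidth-expanded copies of it. The sketch below assumes that form; the values alpha = 0.5 and beta = 0.8, the omission of any tilt compensation, and the function names are assumptions for illustration, not details taken from the slides.

```python
def bandwidth_expand(a, gamma):
    """Scale coefficient a[i] by gamma**i, which evaluates A(z / gamma)."""
    return [c * gamma ** i for i, c in enumerate(a)]


def pole_zero_filter(b, a, x):
    """y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k], assuming a[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


def spectral_enhance(x, lpc, alpha=0.5, beta=0.8):
    """Apply A(z/alpha) / A(z/beta) to x; lpc is [1, a1, ..., ap]."""
    numerator = bandwidth_expand(lpc, alpha)
    denominator = bandwidth_expand(lpc, beta)
    return pole_zero_filter(numerator, denominator, x)
```

Because beta is larger than alpha, the poles sit closer to the unit circle than the zeros, so the formant peaks are emphasized relative to the valleys between them. Setting alpha equal to beta makes the filter an identity.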
Slide 18
MELP Algorithm Description (Encoder)
1. Filter out any low-frequency noise
2. The filtered speech is filtered again to perform the initial pitch search for the pitch estimate
3. Perform the bandpass voicing analysis; in this step we decide whether to use a periodic or aperiodic pulse train or the white noise model
Slide 19
MELP Algorithm Description (Encoder), contd.
- In this stage, a voicing strength parameter is estimated in each band, based on the normalized correlation function of the speech signal and of the smoothed rectified signal in the non-DC bands
- Let s_k(n) denote the speech signal in band k, and u_k(n) denote the DC-removed, smoothed, rectified signal of s_k(n)
- The correlation function: (equation not preserved in the transcript)
  - P: the pitch of the current frame
  - N: the frame length
  - The voicing strength for band k is defined as max(R_sk(P), R_uk(P))
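
The correlation equation itself is missing from the transcript. A commonly used normalized correlation is R_x(P) = sum x(n) x(n+P) / sqrt(sum x(n)^2 * sum x(n+P)^2), with sums over n = 0..N-1. The sketch below assumes that form; the summation limits and function names are assumptions, and the original slide may define them differently.

```python
import math


def normalized_correlation(x, lag, frame_len):
    """Normalized correlation of x at the given lag over frame_len samples.

    Requires len(x) >= frame_len + lag.
    """
    num = sum(x[n] * x[n + lag] for n in range(frame_len))
    e0 = sum(x[n] ** 2 for n in range(frame_len))
    e1 = sum(x[n + lag] ** 2 for n in range(frame_len))
    denom = math.sqrt(e0 * e1)
    return num / denom if denom > 0.0 else 0.0


def band_voicing_strength(s_k, u_k, pitch, frame_len):
    """Voicing strength for one band: max(R_sk(P), R_uk(P))."""
    return max(normalized_correlation(s_k, pitch, frame_len),
               normalized_correlation(u_k, pitch, frame_len))
```

A strongly periodic band gives a value near 1 at the true pitch lag, while a noise-like band gives a value near 0, so the result can be used directly as a per-band voicing decision.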
Slide 20
MELP Algorithm Description (Encoder), contd.
- The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n): (equation not preserved in the transcript)
- If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)
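
The peakiness equation is also missing from the transcript. A common definition is the ratio of the RMS value of the residual to its mean absolute value, which is assumed in the sketch below. The threshold is left as a caller-supplied parameter because the slides do not state it; the function names are illustrative.

```python
import math


def peakiness(residual):
    """Ratio of RMS to mean absolute value of the LP residual (assumed form)."""
    n = len(residual)
    rms = math.sqrt(sum(e * e for e in residual) / n)
    mean_abs = sum(abs(e) for e in residual) / n
    return rms / mean_abs if mean_abs > 0.0 else 0.0


def is_jittered(residual, threshold):
    """Flag the frame as jittered when peakiness exceeds the given threshold."""
    return peakiness(residual) > threshold
```

A flat residual has peakiness 1, while a residual with a few isolated pulses has a much larger value; a single unit pulse in 16 samples gives sqrt(16) = 4.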
Slide 21
MELP Algorithm Description (Encoder), contd.
4. Apply LPC analysis
5. Calculate the final pitch estimate
6. Calculate the gain estimate
7. Quantize the LPC coefficients, pitch, gain, and bandpass voicing
8. Determine and quantize the Fourier magnitudes; the information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies
Slide 22
MELP Encoder (block diagram)
- Input: input signal
- Blocks: pre-filter; pitch search; bandpass voicing decision; gain calculator; LPC analysis filter; final pitch and voicing decision; LSF quantization; quantization of gain, pitch, voicing, and jitter; Fourier magnitude calculation; forward error correction
- Output: transmitted bitstream
Slide 23
MELP Algorithm (Decoder)
1. Decode the pitch
2. Apply gain attenuation
3. Linearly interpolate all of the synthesis parameters pitch-synchronously
4. Generate the mixed excitation
Slide 24
MELP Algorithm (Decoder), contd.
5. Apply the adaptive spectral enhancement filter
6. Perform LPC synthesis and apply the gain factor
7. Apply dispersion filtering
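
Decoder step 3, pitch-synchronous linear interpolation, can be sketched as follows: at the start of each pitch period, every synthesis parameter is blended between the previous and current frame values according to the position within the frame. The dictionary layout, parameter names, and function names are assumptions for illustration only.

```python
def interpolate_params(prev, curr, pulse_start, frame_len):
    """Linearly blend two frames' parameters at a pitch-period start."""
    w = pulse_start / frame_len  # 0.0 at the frame start, near 1.0 at its end
    return {name: (1.0 - w) * prev[name] + w * curr[name] for name in curr}


def synthesis_schedule(prev, curr, frame_len):
    """Walk through one frame, one pitch period at a time.

    Returns a list of (start_sample, parameters) pairs. The parameter
    dictionaries must contain a "pitch" entry in samples.
    """
    schedule = []
    t = 0
    while t < frame_len:
        params = interpolate_params(prev, curr, t, frame_len)
        schedule.append((t, params))
        t += max(1, int(round(params["pitch"])))
    return schedule
```

For a 180-sample frame with the pitch moving from 40 to 60 samples, the periods lengthen gradually across the frame instead of jumping at the frame boundary.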
Slide 25
MELP Decoder (block diagram)
- Input: received bitstream
- Blocks: decode parameters; noise generator; noise shaping filter; pulse generator; pulse position jitter; pulse shaping filter; summation (+); adaptive spectral enhancement; LPC synthesis filter; gain; pulse dispersion filter
- Output: synthesized speech
Slide 26
Parameter Quantization

Parameters                   Voiced   Unvoiced
LSF parameters               25       25
Fourier magnitudes           8        -
Gain (2 per frame)           8        8
Pitch, overall voicing       7        7
Bandpass voicing             4        -
Aperiodic flag               1        -
Error protection             -        13
Sync bit                     1        1
Total bits / 22.5 ms frame   54       54
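
The bit allocation above can be checked with a few lines of arithmetic: both columns should sum to 54 bits, and 54 bits every 22.5 ms gives the 2400 bps rate. The dictionary below simply restates the table, with "-" entries written as 0.

```python
# (voiced_bits, unvoiced_bits) per parameter, restating the table above.
BITS = {
    "LSF parameters": (25, 25),
    "Fourier magnitudes": (8, 0),
    "Gain (2 per frame)": (8, 8),
    "Pitch, overall voicing": (7, 7),
    "Bandpass voicing": (4, 0),
    "Aperiodic flag": (1, 0),
    "Error protection": (0, 13),
    "Sync bit": (1, 1),
}
FRAME_MS = 22.5

voiced_total = sum(v for v, _ in BITS.values())
unvoiced_total = sum(u for _, u in BITS.values())
bit_rate_bps = voiced_total * 1000 / FRAME_MS

print(voiced_total, unvoiced_total, bit_rate_bps)
```

The 13 bits that voiced frames spend on Fourier magnitudes, bandpass voicing, and the aperiodic flag are reused for error protection in unvoiced frames, which keeps the frame size constant.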
Slide 27
Bit transmission order
Slide 28
Comparison of the 2400 bps MELP with other Standard Coders
- Measure: Diagnostic Acceptability Measure
- Two conditions: Quiet, Office
- Coders compared:
  - Continuously Variable Slope Delta Modulation (CVSD), 16,000 bps
  - Code Excited Linear Prediction (CELP), 4800 bps, FS1016
  - Mixed Excitation Linear Prediction (MELP), 2400 bps
  - Linear Predictive Coding (LPC), 2400 bps, FIPS Publication 137
Slide 29
Comparison of the 2400 bps MELP with other Standard Coders (contd.)
- Measure: Mean Opinion Score in six conditions
  - Quiet: anechoic sound chamber, dynamic microphone
  - Quiet - H250: anechoic sound chamber, H250 microphone
  - 1% random bit errors: anechoic sound chamber, dynamic microphone
  - 0.5% random block errors: anechoic sound chamber, dynamic microphone, 50% errors within a 35 ms block
  - Office: modern office environment, dynamic microphone
  - Mobile command environment: field shelter, EV M87 microphone
Slide 30
Comparison of the 2400 bps MELP with other Standard Coders (contd.)
- Complexity, with three measurements: RAM, ROM, MIPS
Slide 31
LPC 10 Voice samples
Slide 32
Original Sound MELP 1800 MELP 2000 MELP 2200