MELP Vocoders
Nima Moghadam, SN#: 82245502
Saeed Nari, SN#: 82270309
Supervisor: Dr. Saameti
April 2005, Sharif University of Technology
Jan 12, 2016

Outline
- Introduction
- MELP Vocoder Features
- Algorithm Description
- Parameters & Comparison

Introduction
Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise.
This produces intelligible speech at very low bit rates.
But the output sometimes sounds mechanical or buzzy and is prone to tonal noise.

Introduction
These problems arise from:
- The inability of a simple pulse train to reproduce all kinds of voiced speech
The MELP vocoder uses a mixed-excitation model that represents a richer ensemble of speech characteristics.
It produces more natural-sounding speech.

MELP vocoder
Robust in background noise environments
Based on the traditional LPC model, with additional features:
- Aperiodic pulses
- Adaptive spectral enhancement
- Mixed excitation
- Pulse dispersion

Mixed Excitation
Mixed excitation is implemented using a multi-band mixing model.
This model can simulate frequency-dependent voicing strength.
The excitation is a mixture of pulses (periodic or aperiodic) and white noise.
The primary effect of this unit is to reduce buzz, particularly in broadband acoustic noise.
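As a rough sketch of the multi-band mixing idea, the Python below weights a pulse spectrum and a noise spectrum band by band. The five band edges, the names BAND_EDGES_HZ, mixing_gains and mix_spectra, and the rule "noise gain = 1 - voicing strength" are illustrative assumptions, not details taken from these slides.

```python
# Assumed five-band split for 8 kHz speech (edges in Hz). The slides only say
# "multi-band", so these edges are an illustrative choice.
BAND_EDGES_HZ = [0, 500, 1000, 2000, 3000, 4000]


def mixing_gains(freq_hz, voicing):
    """Return (pulse_gain, noise_gain) at freq_hz.

    voicing holds one strength in [0, 1] per band; 1 means fully voiced.
    """
    if len(voicing) != len(BAND_EDGES_HZ) - 1:
        raise ValueError("need one voicing strength per band")
    last = len(voicing) - 1
    for band, strength in enumerate(voicing):
        lo, hi = BAND_EDGES_HZ[band], BAND_EDGES_HZ[band + 1]
        # The top edge (4000 Hz) is counted as part of the last band.
        if lo <= freq_hz < hi or (band == last and freq_hz == hi):
            return strength, 1.0 - strength
    raise ValueError("frequency outside the 0-4000 Hz range")


def mix_spectra(pulse_spec, noise_spec, bin_freqs_hz, voicing):
    """Combine pulse and noise spectra bin by bin using the band gains."""
    mixed = []
    for pulse, noise, freq in zip(pulse_spec, noise_spec, bin_freqs_hz):
        pulse_gain, noise_gain = mixing_gains(freq, voicing)
        mixed.append(pulse_gain * pulse + noise_gain * noise)
    return mixed
```

With voicing strengths such as [1, 1, 0.5, 0, 0], the low bands are driven by pulses only, the middle band by an equal mix, and the top bands by noise only.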

Aperiodic pulses
When the input signal is voiced, the MELP vocoder can synthesize speech using either periodic or aperiodic pulses.
Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal.
They produce erratic glottal pulses without tonal noise.
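One way to picture aperiodic excitation is to perturb each pitch period by a random amount. The sketch below does that; the function name pulse_positions, the uniform distribution, and the 25% jitter range are assumptions made for illustration rather than values stated in the slides.

```python
import random


def pulse_positions(pitch_period, frame_len, aperiodic, jitter=0.25, rng=None):
    """Return the sample indices of excitation pulses within one frame.

    With aperiodic=True each period is scaled by a random factor in
    [1 - jitter, 1 + jitter]; the 25% default is an assumed value.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    positions = []
    t = 0.0
    while t < frame_len:
        positions.append(int(t))
        step = float(pitch_period)
        if aperiodic:
            step *= 1.0 + rng.uniform(-jitter, jitter)
        t += step
    return positions
```

With aperiodic=False the pulses fall exactly one pitch period apart; with aperiodic=True the spacing wanders, which breaks up the strict periodicity that is heard as a tone.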

Pulse Dispersion
Pulse dispersion is implemented using a fixed pulse dispersion filter based on a spectrally flattened triangle pulse.
The pulse dispersion filter improves the match between bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance.
It spreads the excitation energy within a pitch period, which reduces the harsh quality of the synthetic speech.
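The following sketch shows one plausible way to derive such a filter: take the DFT of a triangle pulse, set every magnitude to one while keeping the phase, and transform back. The 65-tap length, the asymmetric rise and fall lengths, and the function names are assumptions; the slides do not give these details. An asymmetric triangle is assumed because a symmetric one has linear phase and would flatten to a pure delay.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short filters)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT matching dft() above."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def dispersion_filter(length=65, rise=24, fall=8):
    """Impulse response of an all-pass filter with the phase of a triangle pulse.

    length, rise and fall are illustrative values, not taken from the slides.
    """
    tri = [0.0] * length
    for t in range(rise):
        tri[t] = t / rise              # slow linear rise
    for t in range(fall + 1):
        tri[rise + t] = 1.0 - t / fall  # faster linear fall back to zero
    spec = dft(tri)
    # Keep only the phase; bins with (near) zero magnitude get unit gain.
    flat = [s / abs(s) if abs(s) > 1e-12 else 1.0 for s in spec]
    return [value.real for value in idft(flat)]
```

Because every DFT bin has unit magnitude, the filter leaves the spectral envelope unchanged and only redistributes energy in time.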

Adaptive spectral enhancement filter
Based on the poles of the vocal tract filter
Used to enhance the formant structure in the synthetic speech
Improves the match between synthetic and natural bandpass waveforms, giving more natural speech output
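A filter "based on the poles of the vocal tract filter" is commonly built by bandwidth expansion of the LPC polynomial A(z). The sketch below assumes that construction; the expansion factors 0.5 and 0.8, the omission of any tilt compensation, and the function names are illustrative assumptions rather than details from the slides.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a[k] is scaled by gamma**k."""
    return [coef * gamma ** k for k, coef in enumerate(a)]


def pole_zero_filter(x, num, den):
    """Direct-form filter: den[0]*y[n] = sum num[k]*x[n-k] - sum den[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc / den[0])
    return y


def enhance(x, lpc, num_gamma=0.5, den_gamma=0.8):
    """Apply H(z) = A(z/num_gamma) / A(z/den_gamma) to x.

    lpc is [1, a1, ..., ap]; the two gamma defaults are assumed values.
    """
    num = bandwidth_expand(lpc, num_gamma)
    den = bandwidth_expand(lpc, den_gamma)
    return pole_zero_filter(x, num, den)
```

The denominator keeps poles at the formant frequencies but pulls them toward the origin, so formant peaks are emphasized moderately; the numerator partially offsets the resulting spectral tilt.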

MELP Algorithm Description (Encoder)
1. Filter out any low-frequency noise.
2. Filter the result again to perform the initial pitch search for the pitch estimation.
3. Perform the bandpass voicing analysis.
   - This step decides whether to use a periodic or aperiodic pulse train or the white noise model.
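The initial pitch search in step 2 can be sketched as a search for the lag that maximizes a normalized autocorrelation. The lag range of 40 to 160 samples (which assumes 8 kHz sampling), the tie tolerance, and the function names are assumptions made for this illustration.

```python
import math


def normalized_autocorrelation(x, lag, frame_len):
    """Correlation between x[n] and x[n + lag], normalized to [-1, 1]."""
    num = sum(x[n] * x[n + lag] for n in range(frame_len))
    e0 = sum(x[n] ** 2 for n in range(frame_len))
    e1 = sum(x[n + lag] ** 2 for n in range(frame_len))
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0


def integer_pitch_search(x, frame_len, min_lag=40, max_lag=160, tie_eps=1e-9):
    """Return (lag, correlation) for the best integer pitch lag.

    A later lag replaces the current best only if it is better by more than
    tie_eps, so exact multiples of the true period do not win by rounding.
    """
    if len(x) < frame_len + max_lag:
        raise ValueError("signal too short for the requested lag range")
    best_lag = min_lag
    best_r = normalized_autocorrelation(x, min_lag, frame_len)
    for lag in range(min_lag + 1, max_lag + 1):
        r = normalized_autocorrelation(x, lag, frame_len)
        if r > best_r + tie_eps:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

For a signal that repeats every 50 samples, lags of 50, 100 and 150 all correlate almost perfectly; preferring the smallest such lag is a simple guard against pitch doubling.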

MELP Algorithm Description (Encoder) cont'd
In this stage a voicing degree parameter is estimated in each band, based on the normalized correlation function of the speech signal and of the smoothed rectified signal in the non-DC band.
Let s_k(n) denote the speech signal in band k and u_k(n) denote the DC-removed, smoothed, rectified version of s_k(n). The correlation function of a signal x(n) at lag p is:
\[
R_x(p) = \frac{\sum_{n=0}^{N-1} x(n)\,x(n+p)}{\left[\sum_{n=0}^{N-1} x^2(n)\,\sum_{n=0}^{N-1} x^2(n+p)\right]^{1/2}}
\]
P: the pitch of the current frame
N: the frame length
The voicing strength for band k is defined as max(R_{s_k}(P), R_{u_k}(P)).
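The band voicing strength defined above can be sketched as follows. The moving-average smoother, its 8-sample span, and the function names are assumptions; the slides only say that the rectified signal is smoothed and has its DC removed.

```python
import math


def normalized_correlation(x, p, n):
    """R_x(p) over n samples, following the correlation function above."""
    num = sum(x[i] * x[i + p] for i in range(n))
    e0 = sum(x[i] ** 2 for i in range(n))
    ep = sum(x[i + p] ** 2 for i in range(n))
    den = math.sqrt(e0 * ep)
    return num / den if den > 0.0 else 0.0


def smoothed_envelope(s, span=8):
    """Full-wave rectify, smooth with a moving average (assumed), remove DC."""
    rect = [abs(v) for v in s]
    smooth = []
    for i in range(len(rect)):
        window = rect[max(0, i - span + 1):i + 1]
        smooth.append(sum(window) / len(window))
    mean = sum(smooth) / len(smooth)
    return [v - mean for v in smooth]


def band_voicing_strength(s_k, pitch, frame_len):
    """max(R_sk(P), R_uk(P)) for one band signal s_k.

    s_k must hold at least frame_len + pitch samples.
    """
    u_k = smoothed_envelope(s_k)
    return max(normalized_correlation(s_k, pitch, frame_len),
               normalized_correlation(u_k, pitch, frame_len))
```

Taking the larger of the two correlations lets a band count as voiced when its envelope is periodic even if the waveform itself does not correlate well at the pitch lag.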

MELP Algorithm Description (Encoder) cont'd
The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n):
\[
\text{Peakiness} = \frac{\left[\frac{1}{N}\sum_{n=0}^{N-1} e^2(n)\right]^{1/2}}{\frac{1}{N}\sum_{n=0}^{N-1} |e(n)|}
\]
If the peakiness exceeds a threshold, the speech frame is flagged as jittery and the aperiodic flag is set.
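A small sketch of the peakiness measure: it is the ratio of the RMS value to the mean absolute value of the residual. The function names are illustrative, and the threshold is left as a required argument because the slides do not state its value.

```python
import math


def peakiness(residual):
    """Ratio of RMS value to mean absolute value of the LP residual."""
    n = len(residual)
    rms = math.sqrt(sum(e * e for e in residual) / n)
    mean_abs = sum(abs(e) for e in residual) / n
    return rms / mean_abs if mean_abs > 0.0 else 0.0


def is_jittery(residual, threshold):
    """Flag the frame when peakiness exceeds a caller-supplied threshold."""
    return peakiness(residual) > threshold
```

A constant-magnitude residual gives a peakiness of 1, while a single impulse in N samples gives sqrt(N), so isolated pulses stand out clearly from noise-like residuals.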

MELP Algorithm Description (Encoder) cont'd
4. Apply LPC analysis.
5. Calculate the final pitch estimate.
6. Calculate the gain estimate.
7. Quantize the LPC coefficients, pitch, gain and bandpass voicing.
8. Determine and quantize the Fourier magnitudes. The information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies.
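For the LPC analysis in step 4, a common textbook approach is the autocorrelation method solved with the Levinson-Durbin recursion. The sketch below shows that approach; the slides do not say which method the coder uses, so treat the choice and the function names as assumptions.

```python
def autocorrelation(x, order):
    """Autocorrelation values r[0..order] of the sequence x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err) where a = [1, a1, ..., ap] defines
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p and err is the prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input: stop rather than divide by zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

For an autocorrelation sequence r[k] = 0.9^k, which corresponds to a first-order process, the recursion returns a1 = -0.9 and a zero second coefficient, so the higher-order fit adds nothing.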

MELP Encoder
[Figure: MELP encoder block diagram. Input: input signal. Blocks: pre-filter; pitch search; bandpass voicing decision; gain calculator; LPC analysis filter; final pitch and voicing decision; LSF quantization; quantization of gain, pitch, voicing and jitter; Fourier magnitude calculation; forward error correction. Output: transmitted bitstream.]

MELP Algorithm (Decoder)
1. Decode the pitch.
2. Apply gain attenuation.
3. Linearly interpolate all of the synthesis parameters pitch-synchronously.
4. Generate the mixed excitation.
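Pitch-synchronous linear interpolation in step 3 can be sketched like this: at the start of each pitch period, every parameter is blended between the previous and current frame values according to how far into the frame the period begins. The 180-sample frame (22.5 ms at an assumed 8 kHz sampling rate), the dictionary layout, and the function names are illustrative assumptions.

```python
def interpolate_params(prev, cur, pulse_start, frame_len=180):
    """Blend previous and current frame parameters at sample pulse_start."""
    w = pulse_start / frame_len  # 0 at frame start, approaching 1 at the end
    return {name: (1.0 - w) * prev[name] + w * cur[name] for name in cur}


def pitch_synchronous_schedule(prev, cur, frame_len=180):
    """List (start_sample, params) for each pitch period in one frame.

    Each period's length is the interpolated pitch, rounded to whole samples.
    """
    schedule = []
    t = 0
    while t < frame_len:
        params = interpolate_params(prev, cur, t, frame_len)
        schedule.append((t, params))
        t += max(1, int(round(params["pitch"])))
    return schedule
```

Going from a pitch of 40 to 60 samples across one frame, the periods lengthen gradually rather than jumping at the frame boundary, which avoids an audible discontinuity.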

MELP Algorithm (Decoder) cont'd
5. Apply the adaptive spectral enhancement filter.
6. Perform LPC synthesis and apply the gain factor.
7. Apply dispersion filtering.

MELP Decoder
[Figure: MELP decoder block diagram. Input: received bitstream. Blocks: decode parameters; noise generator; noise shaping filter; pulse generator; pulse position jitter; pulse shaping filter; summation of the shaped pulse and noise; adaptive spectral enhancement; LPC synthesis filter; gain; pulse dispersion filter. Output: synthesized speech.]

Parameter Quantization
Parameters                     Voiced   Unvoiced
LSF parameters                   25        25
Fourier magnitudes                8         -
Gain (2 per frame)                8         8
Pitch, overall voicing            7         7
Bandpass voicing                  4         -
Aperiodic flag                    1         -
Error protection                  -        13
Sync bit                          1         1
Total bits per 22.5 ms frame     54        54
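As a quick arithmetic check, the snippet below sums both columns of the table and converts 54 bits per 22.5 ms frame into a bit rate. The dictionary and function names are made up for this illustration; the numbers are copied from the table.

```python
# (voiced_bits, unvoiced_bits) per parameter; a dash in the table is 0 here.
BIT_ALLOCATION = {
    "LSF parameters": (25, 25),
    "Fourier magnitudes": (8, 0),
    "Gain (2 per frame)": (8, 8),
    "Pitch, overall voicing": (7, 7),
    "Bandpass voicing": (4, 0),
    "Aperiodic flag": (1, 0),
    "Error protection": (0, 13),
    "Sync bit": (1, 1),
}
FRAME_MS = 22.5


def total_bits(voiced):
    """Sum the voiced or the unvoiced column of the allocation table."""
    column = 0 if voiced else 1
    return sum(bits[column] for bits in BIT_ALLOCATION.values())


def bit_rate_bps(bits_per_frame, frame_ms=FRAME_MS):
    """Bits per second for a given frame size and frame duration."""
    return bits_per_frame * 1000.0 / frame_ms
```

Both columns add up to 54 bits, and 54 bits every 22.5 ms gives 2400 bps. The 13 bits that carry Fourier magnitudes, bandpass voicing and the aperiodic flag in a voiced frame are reused for error protection in an unvoiced frame.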

Bit transmission order

Comparison of the 2400 bps MELP with other Standard Coders
Diagnostic Acceptability Measure
Two conditions:
- Quiet
- Office
Coders compared:
- Continuously Variable Slope Delta Modulation (CVSD): 16,000 bps
- Code Excited Linear Prediction (CELP): 4800 bps, FS1016
- Mixed Excitation Linear Prediction (MELP): 2400 bps, FIPS Publication 137
- Linear Predictive Coding (LPC): 2400 bps

Comparison of the 2400 bps MELP with other Standard Coders (cont'd)
Mean Opinion Score in six conditions:
- Quiet: anechoic sound chamber, dynamic microphone
- Quiet (H250): anechoic sound chamber, H250 microphone
- 1% random bit errors: anechoic sound chamber, dynamic microphone
- 0.5% random block errors: anechoic sound chamber, dynamic microphone, 50% errors within a 35 ms block
- Office: modern office environment, dynamic microphone
- Mobile Command Environment: field shelter, EV M87 microphone

Comparison of the 2400 bps MELP with other Standard Coders (cont'd)
Complexity, by three measurements:
- RAM
- ROM
- MIPS

Voice samples
Original Sound
MELP 1800
MELP 2000
MELP 2200

Any questions?
Thanks!