Prosodic Manipulation. Advanced Signal Processing, SE 3.12.03. David Ludwig [email protected]

Prosodic Manipulation

Dec 31, 2015

Page 1: Prosodic Manipulation

Prosodic Manipulation

Advanced Signal Processing, SE

3.12.03

David Ludwig

[email protected]

Page 2: Prosodic Manipulation

Contents

Introduction
SOLA, PSOLA
LP-PSOLA
RELP
Sinusoidal/harmonic-plus-noise modeling
MBROLA
Application

Page 3: Prosodic Manipulation

Introduction

Page 4: Prosodic Manipulation

Definition

prosody (noun) 

1. the study of poetic meter and the art of versification

2. the patterns of stress and intonation in a language (synonyms: inflection)

3. a system of versification (synonyms: poetic rhythm, rhythmic pattern)

pitch, duration, amplitude (gestures)

Function: stress, non-lexical information, discourse, emotion

Page 5: Prosodic Manipulation

Pitch 'n' Time (Robot: His name is R1D1...)

Playful_Time: draw a random number between 10 and 400 milliseconds and use it as the phone duration.

Serious_Time: assign the same duration value to each phone.

Playful_Pitch: random melody for the sentence

Serious_Pitch: same pitch values, monotone
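
The four settings above can be sketched in a few lines of Python. This is only an illustration: the function name `assign_prosody`, the fixed duration of 100 ms, the fixed pitch of 120 Hz and the 80 to 300 Hz melody range are assumptions that do not come from the slides.

```python
import random

def assign_prosody(phones, playful_time, playful_pitch, seed=None,
                   fixed_ms=100, fixed_hz=120.0, pitch_range_hz=(80.0, 300.0)):
    """Return (phone, duration_ms, pitch_hz) triples for one sentence."""
    rng = random.Random(seed)
    plan = []
    for phone in phones:
        # Playful_Time: random duration between 10 and 400 ms per phone.
        # Serious_Time: the same (assumed) duration for every phone.
        dur = rng.randint(10, 400) if playful_time else fixed_ms
        # Playful_Pitch: random melody. Serious_Pitch: monotone.
        f0 = rng.uniform(*pitch_range_hz) if playful_pitch else fixed_hz
        plan.append((phone, dur, f0))
    return plan

phones = ["h", "I", "z", "n", "EI", "m"]   # hypothetical phone labels
serious = assign_prosody(phones, playful_time=False, playful_pitch=False)
playful = assign_prosody(phones, playful_time=True, playful_pitch=True, seed=1)
```

The two flags are independent, so mixed settings such as playful timing with monotone pitch are possible as well.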

Page 6: Prosodic Manipulation

SOLA, PSOLA

(pitch-)synchronous overlap-and-add

Page 7: Prosodic Manipulation

SOLA

Time-segment processing
Segmentation of x[n] into overlapping frames
Shifting according to the scaling factor
Repositioning, overlap/add
Cross-correlation (CC) in the overlap interval
Maximum of CC
Fade in / fade out
Flexible time lag
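
The SOLA steps listed above could look roughly like the following NumPy sketch. The frame length, hop size and search range are arbitrary assumptions, the function name `sola_stretch` is hypothetical, and the normalized cross-correlation is just one common choice of similarity measure.

```python
import numpy as np

def sola_stretch(x, alpha, frame=1024, hop=256, search=128):
    """Time-scale x by alpha (alpha > 1 lengthens) with a basic SOLA scheme."""
    x = np.asarray(x, dtype=float)
    syn_hop = int(round(hop * alpha))         # frame shift after scaling
    if syn_hop < 1:
        raise ValueError("alpha too small for this hop size")
    y = x[:frame].copy()                      # first frame is copied unchanged
    prev_start, k = 0, 1
    while k * hop + frame <= len(x):
        seg = x[k * hop:k * hop + frame]      # next overlapping input frame
        nominal = k * syn_hop                 # repositioned start of the frame
        n_cc = len(y) - (nominal + search)    # samples compared for every lag
        if n_cc < 16:
            raise ValueError("too little overlap; reduce alpha, hop or search")
        best_lag, best_cc = 0, -np.inf
        for lag in range(-search, search + 1):
            start = nominal + lag
            if start <= prev_start:           # keep frames in order
                continue
            a, b = y[start:start + n_cc], seg[:n_cc]
            cc = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
            if cc > best_cc:                  # maximum of the cross-correlation
                best_lag, best_cc = lag, cc
        start = nominal + best_lag            # flexible time lag
        ov = len(y) - start                   # length of the overlap interval
        fade = np.linspace(0.0, 1.0, ov)
        # Fade out the existing output, fade in the new frame, then append.
        y[start:] = y[start:] * (1.0 - fade) + seg[:ov] * fade
        y = np.concatenate([y, seg[ov:]])
        prev_start, k = start, k + 1
    return y

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220 * t)               # one second of a 220 Hz tone
y = sola_stretch(x, 1.5)                      # roughly 1.5 times as long
```

Because each frame may move by up to `search` samples, the output length only approximates `alpha` times the input length.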

Page 8: Prosodic Manipulation

PSOLA

Variation especially for voice processing

Splitting the signal into overlapping windows

Synchronized with the fundamental frequency

Avoids pitch discontinuities

Necessitates preliminary pitch marking

Analysis:

Pitch period P(t) at pitch mark t_i

Segment extraction by windowing with pitch mark as its center
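
A minimal sketch of this analysis step, assuming the pitch marks are already known (pitch marking itself is not shown). The function name `extract_st_signals`, the Hann window and the two-period window length are assumptions chosen for illustration.

```python
import numpy as np

def extract_st_signals(x, pitch_marks):
    """Cut one Hann-windowed segment per pitch mark, centred on the mark."""
    x = np.asarray(x, dtype=float)
    marks = np.asarray(pitch_marks, dtype=int)
    segments = []
    for i in range(1, len(marks) - 1):
        # Local pitch period P(t_i): here the distance to the next mark.
        period = int(marks[i + 1] - marks[i])
        lo, hi = marks[i] - period, marks[i] + period + 1
        if lo < 0 or hi > len(x):
            continue                          # skip segments leaving the signal
        window = np.hanning(hi - lo)          # two periods long, peak at the mark
        segments.append((int(marks[i]), period, x[lo:hi] * window))
    return segments

sr, f0 = 8000, 100
n = np.arange(sr)
x = np.sin(2 * np.pi * f0 * n / sr)
marks = np.arange(0, sr, sr // f0)            # idealised marks, one per period
segments = extract_st_signals(x, marks)
```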

Page 9: Prosodic Manipulation

Page 10: Prosodic Manipulation

Synthesis

Time-scaling: short-term (ST) signals must be added (or suppressed) without altering the distance between adjacent pitch periods

Pitch-shifting: the synthesis time axis keeps the same duration, but the local pitch period must be scaled

ST-signals might be discarded (compression/lower pitch)

ST-signals might be used twice (stretching/higher pitch)

Artefacts: transient smearing, audible slices, distortion due to phase errors
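
The synthesis rules above can be illustrated with a simplified time-domain PSOLA sketch. It assumes known pitch marks and Hann windows two periods long. The function name `td_psola` and the nearest-mark mapping between synthesis and analysis time are assumptions for illustration, not the exact procedure from the slides.

```python
import numpy as np

def td_psola(x, marks, time_scale=1.0, pitch_scale=1.0):
    """Overlap-add ST-signals at new synthesis marks (simplified TD-PSOLA)."""
    x = np.asarray(x, dtype=float)
    marks = np.asarray(marks, dtype=int)
    periods = np.diff(marks)                  # local pitch periods
    out_len = int(round(len(x) * time_scale))
    y = np.zeros(out_len)
    used = []                                 # analysis index per synthesis mark
    t_syn = marks[1] * time_scale
    while t_syn < out_len:
        # Analysis mark closest to the corresponding input time.
        i = int(np.argmin(np.abs(marks - t_syn / time_scale)))
        i = min(max(i, 1), len(marks) - 2)
        p = int(periods[i])
        lo, hi = marks[i] - p, marks[i] + p + 1
        if lo >= 0 and hi <= len(x):
            seg = x[lo:hi] * np.hanning(2 * p + 1)
            c = int(round(t_syn))             # centre of the ST-signal in y
            a, b = max(c - p, 0), min(c + p + 1, out_len)
            y[a:b] += seg[a - (c - p):b - (c - p)]
            used.append(i)
        # Pitch-shifting: scale the local period between synthesis marks.
        t_syn += p / pitch_scale
    return y, used

sr, f0 = 8000, 100
n = np.arange(sr)
x = np.sin(2 * np.pi * f0 * n / sr)
marks = np.arange(0, sr, sr // f0)
y_same, used_same = td_psola(x, marks)                    # identity settings
y_high, used_high = td_psola(x, marks, pitch_scale=2.0)   # same duration
y_long, used_long = td_psola(x, marks, time_scale=2.0)    # twice as long
```

With `pitch_scale=2.0` or `time_scale=2.0` each ST-signal ends up being used about twice, which matches the "used twice" case on the slide. Scales below 1 lead to segments being skipped.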

Page 11: Prosodic Manipulation

Page 12: Prosodic Manipulation

Page 13: Prosodic Manipulation

LP-PSOLA, RELP

Page 14: Prosodic Manipulation

LP-PSOLA

LP residual (error function) e(t) is used

Spectrally flat

Separates excitation and vocal tract

Little correlation within each pitch period

TD-PSOLA algorithm is applied to the residual part

Advantages:

Control of spectral structure

No additional computation time
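
The separation into residual and vocal-tract filter might be sketched as follows. The autocorrelation method is one standard way to estimate LP coefficients and is an assumption here, as are the function names. In LP-PSOLA the PSOLA modification would be applied to `e` between the inverse filter and the synthesis filter; that step is left out.

```python
import numpy as np

def lpc(x, order):
    """LP coefficients a[1..p] by the autocorrelation method."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def lp_residual(x, a):
    """Inverse filter: e[n] = x[n] - sum_k a[k] x[n-k]."""
    inv = np.concatenate([[1.0], -a])
    return np.convolve(x, inv)[:len(x)]

def lp_synthesis(e, a):
    """All-pole filter: x[n] = e[n] + sum_k a[k] x[n-k]."""
    x = np.zeros(len(e))
    for n in range(len(e)):
        past = x[max(0, n - len(a)):n][::-1]  # x[n-1], x[n-2], ...
        x[n] = e[n] + np.dot(a[:len(past)], past)
    return x

# Test signal: noise shaped by a known (hypothetical) two-pole filter.
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.6])
x = lp_synthesis(rng.standard_normal(4000), true_a)

a = lpc(x, order=2)            # estimate of the filter
e = lp_residual(x, a)          # spectrally flat excitation estimate
x_rec = lp_synthesis(e, a)     # filtering the residual restores the signal
```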

Page 15: Prosodic Manipulation

RELP

Residual-Excited LPC

Vocoding technique for speech transmission (e.g. mobile phones)

Residual signal is compressed:

Low-pass filtering

Downsampling

Re-quantisation
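
The three compression steps can be sketched as below. The decimation factor, bit depth and windowed-sinc filter are illustrative assumptions, and the function name `compress_residual` is hypothetical. Real RELP coders differ in their details.

```python
import numpy as np

def compress_residual(e, factor=4, bits=4, taps=31):
    """Low-pass filter, downsample and re-quantise an LP residual."""
    e = np.asarray(e, dtype=float)
    # 1. Low-pass filtering: windowed-sinc FIR, cutoff at the new Nyquist rate.
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(n / factor) * np.hamming(taps)
    h /= h.sum()                              # unity gain at DC
    low = np.convolve(e, h, mode="same")
    # 2. Downsampling: keep every factor-th sample.
    dec = low[::factor]
    # 3. Re-quantisation: uniform quantiser with 2**bits levels.
    peak = float(np.max(np.abs(dec))) or 1.0
    levels = 2 ** (bits - 1)
    codes = np.clip(np.round(dec / peak * levels), -levels, levels - 1)
    return codes.astype(int), peak

rng = np.random.default_rng(0)
residual = rng.standard_normal(1000)          # stand-in for an LP residual
codes, peak = compress_residual(residual)
```

A decoder would rescale `codes` by `peak / 2**(bits - 1)`, upsample, and feed the result to the LP synthesis filter.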

Page 16: Prosodic Manipulation

Page 17: Prosodic Manipulation

Source-Filter Models

Source = oscillation of the vocal cords

Voiced (Dirac impulses)

Unvoiced (noise)

Filter = transfer function (TF) of the vocal tract

LP approximation of the spectral envelope

Problem: estimation of the filter coefficients
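
A source-filter synthesis along these lines might look like the sketch below. The single-resonance filter near 700 Hz, the sampling rate of 8000 Hz and the function names are hypothetical choices made only to keep the example small.

```python
import numpy as np

def excitation(n, voiced, f0=100.0, sr=8000, seed=0):
    """Source signal: Dirac impulse train (voiced) or white noise (unvoiced)."""
    if voiced:
        e = np.zeros(n)
        e[::int(round(sr / f0))] = 1.0        # one impulse per pitch period
        return e
    return np.random.default_rng(seed).standard_normal(n)

def all_pole(e, a):
    """Vocal-tract filter 1 / (1 - sum_k a[k] z^-k) applied to the source."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        past = y[max(0, n - len(a)):n][::-1]  # y[n-1], y[n-2], ...
        y[n] = e[n] + np.dot(a[:len(past)], past)
    return y

# Hypothetical single-formant filter: pole pair at 700 Hz, radius 0.97.
sr = 8000
r, theta = 0.97, 2 * np.pi * 700 / sr
a = np.array([2 * r * np.cos(theta), -r * r])

voiced = all_pole(excitation(sr, voiced=True), a)
unvoiced = all_pole(excitation(sr, voiced=False), a)
```

In practice the coefficients `a` would come from LP analysis of recorded speech rather than being set by hand, which is the estimation problem mentioned above.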

Page 18: Prosodic Manipulation

Sinusoidal/Harmonic + Residual Model (HNM)

Page 19: Prosodic Manipulation

Analysis/Synthesis

Signal is decomposed into a harmonic part and a noise part:

Harmonic model: number of harmonics, fundamental frequency, time-variant amplitudes

Analysis: peak detection/continuation, pitch detection, subtraction

Residual ~ time-pulsed, filtered noise

Synthesis: additive/subtractive synthesis

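\[ s(t) = \sum_{k=-L(t)}^{L(t)} A_k(t) \, e^{j k \omega_0(t) t} + e(t) \]

\[ e(t) = w(t) \, [h(\tau, t) * n(t)] \]

with L(t) the number of harmonics, \omega_0(t) the fundamental frequency, A_k(t) the time-variant amplitudes, w(t) the time envelope, h(\tau, t) the time-varying filter and n(t) the noise.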
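
The harmonic-plus-noise decomposition on this slide can be illustrated by a small synthesis sketch. To stay short it assumes a constant fundamental frequency, constant amplitudes, a fixed two-tap filter and a pitch-synchronous raised-cosine envelope. All of these, and the function name `hnm_synthesize`, are simplifying assumptions rather than part of the model description.

```python
import numpy as np

def hnm_synthesize(f0, amps, sr=16000, dur=0.5, noise_gain=0.1,
                   h=(0.5, -0.5), seed=0):
    """Return (s, harmonic, e) for a stationary harmonic + noise signal."""
    t = np.arange(int(sr * dur)) / sr
    # Harmonic part: real-valued form of the symmetric complex sum, with
    # amps[k-1] taken as the cosine amplitude of harmonic k.
    harmonic = np.zeros(len(t))
    for k, a_k in enumerate(amps, start=1):
        harmonic += a_k * np.cos(2 * np.pi * k * f0 * t)
    # Noise part: white noise n(t), filtered by h, then time-pulsed by w(t).
    rng = np.random.default_rng(seed)
    filtered = np.convolve(rng.standard_normal(len(t)), h, mode="same")
    w = 0.5 * (1.0 - np.cos(2 * np.pi * f0 * t))   # one pulse per pitch period
    e = noise_gain * w * filtered
    return harmonic + e, harmonic, e

s, harmonic, e = hnm_synthesize(200.0, [1.0, 0.5, 0.25])
```

A time-varying version would replace `k * f0 * t` by the running phase of each harmonic and let the amplitudes and the filter change from frame to frame.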

Page 20: Prosodic Manipulation

Page 21: Prosodic Manipulation

Features

Voiced/unvoiced decision

Crucial:

Pitch estimation

Peak continuation

McAulay + Quatieri Algorithm

Phase Relationships

Page 22: Prosodic Manipulation

MBROLA

Page 23: Prosodic Manipulation

MBROLA

Multiband Resynthesis OLA

Faculté polytechnique de Mons (Belgium)

Open source synthesizer

As many voices, dialects and languages as possible

Currently 27 languages!

Diphone concatenation

Time-domain approach (MBR-PSOLA)

Smoothing of spectral discontinuities in the time domain enhances fluidity

Page 24: Prosodic Manipulation

Examples

German

U.K. English

U.S. English

Japanese

Page 25: Prosodic Manipulation

Manipulation

Manipulation in the frequency domain

Pitch-shifting:

Direct access to sinusoidal components

Frequency shifting with/without formant preservation

Time-scaling:

No change of input/output hop size

Superior to phase vocoder

Computationally expensive
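
Pitch-shifting on sinusoidal components, with and without formant preservation, might be sketched like this. The spectral envelope is approximated by linear interpolation through the original harmonic peaks, which is an assumption. The function names and the example envelope are hypothetical.

```python
import numpy as np

def shift_harmonics(freqs, amps, ratio, preserve_formants=True):
    """Scale partial frequencies by ratio, optionally keeping the envelope."""
    freqs = np.asarray(freqs, dtype=float)
    amps = np.asarray(amps, dtype=float)
    new_freqs = freqs * ratio
    if preserve_formants:
        # Re-sample the envelope at the shifted frequencies. np.interp holds
        # the end values constant outside the original frequency range.
        new_amps = np.interp(new_freqs, freqs, amps)
    else:
        new_amps = amps.copy()                # envelope moves with the partials
    return new_freqs, new_amps

def resynthesize(freqs, amps, sr=16000, dur=0.25):
    """Additive synthesis of the partials below the Nyquist frequency."""
    t = np.arange(int(sr * dur)) / sr
    y = np.zeros(len(t))
    for f, a in zip(freqs, amps):
        if f < sr / 2:
            y += a * np.cos(2 * np.pi * f * t)
    return y

freqs = 100.0 * np.arange(1, 11)                      # harmonics of 100 Hz
amps = np.exp(-((freqs - 500.0) / 200.0) ** 2)        # one formant near 500 Hz
pf, pa = shift_harmonics(freqs, amps, 1.5, preserve_formants=True)
sf, sa = shift_harmonics(freqs, amps, 1.5, preserve_formants=False)
y = resynthesize(pf, pa)
```

With formant preservation the strongest partial stays close to the original formant. Without it the formant is shifted upward together with the pitch.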

Page 26: Prosodic Manipulation

Application

Page 27: Prosodic Manipulation

Application

Mean value <F0>

Macro-Prosody ΔF0

Micro-Prosody MF0

Pitch Modification by:

Page 28: Prosodic Manipulation

Page 29: Prosodic Manipulation

References

PSOLA:

U. Zoelzer: DAFX, John Wiley & Sons.

E. Moulines and F. Charpentier: Pitch-Synchronous Waveform Processing Techniques for Text-to-Speech Synthesis Using Diphones, Speech Communication, vol. 9, pp. 453-467, 1990.

HNM:

J. Laroche, Y. Stylianou, and E. Moulines: HNS: Speech Modification Based on a Harmonic+Noise Model, Proc. of ICASSP 1993, vol. 2, pp. 550-553.

MBROLA: tcts.fpms.ac.be/synthesis/mbrola.html

Page 30: Prosodic Manipulation

THANK YOU