Speech Timing and Rhythm Dafydd Gibbon Jinan University, Guangzhou, China November 2017
Background: the Architecture of Speech and Language
The Ranks and Interpretations Model
Guangzhou, November 2017 D. Gibbon: Acoustic Phonetics 2: Speech Timing 3
The architecture of language: Ranks and Interpretations
[Figure: the Ranks and Interpretations architecture]
Categorial ranks: (MORPHO)PHONEME, MORPHEME, LEXICAL ROOT, DERIVED WORD, COMPOUND WORD, PHRASE, CLAUSE, SENTENCE, TEXT, DISCOURSE
LEXICON – holistic properties, opacity
Grammar – compositionality
Interpretations:
MULTIMODAL HIERARCHIES: speech, writing, gesture
SEMANTICS/PRAGMATIC HIERARCHIES: concepts, objects, events
Semiotic relation between meaning and phonetic form
Prosody in the Ranks and Interpretations Model
[Figure: the Ranks and Interpretations architecture, with the prosodic-phonetic interpretation of each rank]
Categorial ranks: (MORPHO)PHONEME, MORPHEME, LEXICAL ROOT, DERIVED WORD, COMPOUND WORD, PHRASE, CLAUSE, SENTENCE, TEXT, DISCOURSE
LEXICON – holistic properties, opacity
Grammar – compositionality
Prosodic-phonetic Interpretation, by rank:
phoneme rank: segment/tone/accent/stress
word rank: morphological tone/accent/stress
sentence, clause, phrase rank: intonation, rhythm, phrasal accent, boundary tone
utterance rank: intonation, rhythm
discourse rank: intonation, rhythm
Speech Timing:
Regularities and Rhythm
Speech timing
Relevance of speech timing
– Studies in the prosodic typology of timing
  e.g. mora, syllable, foot timing (depending on annotation)
– Studies in musicology
  e.g. song, music performance
– Speech technology
  ● measuring foreign language phonetic proficiency
  ● diagnosis and therapy in speech pathology
  ● natural duration models for speech synthesis
  ● designing disambiguation models in speech recognition
Speech timing
● Discourse rank:
  – prosodic adaptation, turn-taking (sequence, interruption)
  – back channel intonation
● Text/utterance rank:
  – rhetorical pause, rhythm
  – timing of new/old information
● Sentence, phrase rank:
  – stress or syllable timed regularities
  – phrase-final lengthening
● Word rank (simple, derived, compound, inflected):
  – mora, syllable, foot timing (depending on annotation)
Measuring Timing Regularities
Analysis of timing relations
1. Empirical resources
   1. corpus creation: planning, recording, storage
   2. transcription, annotation
2. Method
   1. Recording, transcription, annotation
   2. Measurement:
      1. Manual, with spreadsheets
      2. Automatic analysis
         e.g. TGA, Time Group Analysis – an online tool for studying speech timing and rhythm
         http://wwwhomes.uni-bielefeld.de/gibbon/TGA/
Speech data: corpus creation
● Pre-recording phase:
  ● definition of purposes for which the data will be used
  ● scenario: domain, activities, speakers
  ● equipment and technical operator:
    – general: digital audio (recorder / laptop), digital video
    – specialised: laryngograph, etc.
● Recording phase:
  ● negotiate scenario with chiefs, elders, speakers
  ● ensure the recording location is quiet
  ● if possible ensure the microphones, video tripod etc. can be stably positioned
● Post-recording phase:
  ● provide recordings with metadata immediately
  ● label the data media immediately
  ● make safety copies immediately
Timing and stress: pitch pattern and syllable duration
Questions for discussion:
● Measure the durations of the speech sounds in these words.
● Can you order the types of speech sound by their average durations?
Regularity and Rhythm in Speech
Rhythm is an emergent property of timing, determined by multiple factors in the typology of languages:
● Functional factors – well investigated:
  Discourse: speech rate, pauses
● Grammatical factors – well investigated:
  Lexical: contrastive duration (2 or 3 values)
  Phrasal: relations between stressed and unstressed items
  Discoursal: rhetorical pause
● Phonetic factors – somewhat controversial:
  Inherent consonant and vowel duration
  Balance of consonant and vowel duration in syllables
  Language-specific compositional units of timing: mora – syllable – foot
A common theory of timing regularity distinguishes between different kinds of regularity in timing in different types of language:
● stress timing (or foot timing) – a ‘foot’ or ‘rhythm unit’ is a syllable sequence consisting of a stressed syllable and its neighbouring unstressed syllables; there are different theories of foot structure
● syllable timing – a ‘syllable’ is a sequence of speech sounds consisting of a vowel and its neighbouring consonants; there are different theories of syllable structure
● mora timing – a ‘mora’ is a unit of timing which is smaller than the syllable
Duration properties of speech sounds:
● intrinsic duration:
  – vowels
  – consonants
● contrastive duration:
  – vowels
  – consonants
● rhythmic duration:
  – strong – weak durations:
    ● syllable patterns of consonant and vowel alternations: C – V – C – V ...
    ● foot patterns of stressed and unstressed syllable alternations: CVC – cvc – CVC – cvc ...
Regularity and Rhythm in Speech: A Basic Rhythm Model
Dafydd Gibbon, Guangzhou Prosody Lectures, November 2016
Lecture 6: Speech Timing 20
Discovering regularities in rhythm:
Relations between neighbouring syllables
Partial recovery of alternation: Wagner quadrants
Wagner (2006) proposes a topological procedure for recovering non-absolute differences by plotting DUR(i) against DUR(i+1), the durations of neighbouring syllables.
Note: still binary relations
However, 4 quadrants permit distinguishing between long-short & short-long
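The quadrant idea can be tried out on a list of syllable durations. The following is only a minimal sketch, not Wagner's own implementation: it assumes that durations are z-score normalised and that the quadrant boundaries lie at the mean (z = 0). The function names and the example durations are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(durations):
    """Standardise durations: (x - mean) / standard deviation."""
    mu = mean(durations)
    sigma = pstdev(durations)
    return [(d - mu) / sigma for d in durations]

def wagner_quadrants(durations):
    """Label each neighbouring pair (DUR(i), DUR(i+1)) by quadrant.

    Assumption: quadrant boundaries lie at the mean duration (z = 0).
    """
    z = z_scores(durations)
    labels = []
    for current, following in zip(z, z[1:]):
        first = "long" if current > 0 else "short"
        second = "long" if following > 0 else "short"
        labels.append(f"{first}-{second}")
    return labels

# Hypothetical syllable durations in seconds (not measured data).
example = [0.25, 0.10, 0.30, 0.12, 0.28, 0.40]
print(wagner_quadrants(example))
```

Each pair is still a binary relation, but the labels keep long-short and short-long apart, which an absolute difference between neighbours would not.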
Binary duration relations: Wagner Quadrants for German
[Figure: Wagner quadrant scatter plot, axes D(i) and D(i+1). Green: stressed->unstressed; Blue: unstressed->stressed; Red: ...->phrase-final]
Comment: stress timed - green & blue disjoint
Binary duration relations: Wagner Quadrants for English
[Figure: Wagner quadrant scatter plot, axes D(i) and D(i+1). Green: stressed->unstressed; Blue: unstressed->stressed; Red: ...->phrase-final]
Comment: stress timed - green & blue disjoint
Binary duration relations: Wagner Quadrants for French
[Figure: Wagner quadrant scatter plot, axes D(i) and D(i+1). Green: stressed->unstressed; Blue: unstressed->stressed; Red: ...->phrase-final]
Comment: syllable timed - green & blue overlap
Binary duration relations: Wagner Quadrants for Italian
[Figure: Wagner quadrant scatter plot, axes D(i) and D(i+1). Green: stressed->unstressed; Blue: unstressed->stressed; Red: ...->phrase-final]
Comment: stress timed - green & blue disjoint
Binary duration relations: Wagner Quadrants for Polish
[Figure: Wagner quadrant scatter plot, axes D(i) and D(i+1). Green: stressed->unstressed; Blue: unstressed->stressed; Red: ...->phrase-final]
Comment: highly syllable timed - green & blue overlap
Discovering regularities in rhythm:
dynamic timing models
Models of rhythm and entrainment
● Fred Cummins and Robert Port
  – rhythm
  – entrainment of rhythm
● Plinio Barbosa
  – rhythm as oscillation
  – different domains of oscillation
● Petra Wagner and associates
  – examination of different oscillator models
Barbosa’s dynamic timing model
Def. “rhythm”: speech rhythm is understood as the consequence of the variation of perceived duration along the entire utterance.
Two levels of duration encoding / control / specification, with coupling between 2 oscillators:
  syllabic: intrinsic lexical level
  phrasal: extrinsic, properly rhythmic level
  entrainment (coupling) of the oscillators
Emulation of results of other rhythm studies:
  the greater w0, the more like stress-timing
  the smaller w0, the more like syllable-timing
Barbosa’s dynamic rhythm model
[Figure labels: phrase pulse; syllable oscillations (for English these could be stress oscillations)]
Note also work by Cummins, Port, Wagner, Windmann and others on oscillator models of rhythm.
Regularity and Rhythm in Speech:
‘Isolated’ words in citation contexts
Timing and stress: pitch pattern and syllable duration
Questions for discussion:
● What are the durations of the syllables?
● What are the ratios between the durations of syllables in each word?
● Assuming (a big assumption) that the effect of stress is the same, whether the syllable is first or second, what is the effect of final lengthening?
Word      1st syllable (s)   2nd syllable (s)   Ratio (2nd : 1st)
IMport!   0.18               0.47               2.6:1
imPORT!   0.22               0.49               2.2:1
IMport?   0.22               0.49               2.2:1
imPORT?   0.17               0.5                2.9:1

Which is more important for stress – pitch change or duration?
Which is more important for duration, stress or final lengthening?
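The ratios on this slide can be recomputed from the syllable durations. The sketch below is illustrative only: it assumes the durations are in seconds and that each ratio is second syllable to first syllable. The dictionary keys and the function name are hypothetical labels.

```python
# Syllable durations as given on the slide (assumed to be seconds):
# (first syllable, second syllable) for each rendering of "import".
measurements = {
    "IMport!": (0.18, 0.47),
    "imPORT!": (0.22, 0.49),
    "IMport?": (0.22, 0.49),
    "imPORT?": (0.17, 0.50),
}

def duration_ratio(first, second):
    """Ratio of second-syllable duration to first-syllable duration."""
    return second / first

ratios = {word: round(duration_ratio(*d), 1) for word, d in measurements.items()}

for word, ratio in ratios.items():
    print(f"{word:8s} {ratio}:1")
```

In all four measurements the second syllable is more than twice as long as the first, whichever syllable carries the stress, which is worth keeping in mind for the question about stress versus final lengthening.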
Regularity and Rhythm in Speech:
Words and phrases in utterance contexts
Timing and stress: pitch pattern and syllable duration
Tasks for discussion:
● Enter the durations of the syllables into a spreadsheet.
● Find the mean & standard deviation of all syllable durations (no pauses).
● Find the mean and standard deviation of the stressed syllable durations.
● Find the mean and standard deviation of the unstressed syllable durations.
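The same tasks can also be done in a few lines of code instead of a spreadsheet. This is a minimal sketch under stated assumptions: the syllable rows are hypothetical (not measured data), pauses are assumed to be excluded already, and the standard deviation uses the divide-by-n form.

```python
from statistics import mean, pstdev

# Hypothetical (syllable, duration in seconds, stressed?) rows, standing in
# for values read off an annotation; these are not measured data.
syllables = [
    ("to", 0.10, False),
    ("MOR", 0.26, True),
    ("row", 0.14, False),
    ("MOR", 0.24, True),
    ("ning", 0.18, False),
]

def summarise(durations):
    """Return (mean, standard deviation) using the divide-by-n form."""
    return mean(durations), pstdev(durations)

all_durations = [d for _, d, _ in syllables]
stressed = [d for _, d, s in syllables if s]
unstressed = [d for _, d, s in syllables if not s]

for label, values in [("all", all_durations),
                      ("stressed", stressed),
                      ("unstressed", unstressed)]:
    mu, sigma = summarise(values)
    print(f"{label:10s} mean={mu:.3f} s  sd={sigma:.3f} s  n={len(values)}")
```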
Words and phrases in utterance contexts
● Procedure:
  – Select a Praat TextGrid
  – Enter into a spreadsheet the timestamps of items in the tier you are examining (e.g. syllables):
    start timestamp
    end timestamp
  – Calculate durations of the items: end – start
  – Calculate the mean (average) duration
  – Calculate the standard deviation of the duration
  – Calculate the coefficient of variation (relative standard deviation) of the duration
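The procedure above can be sketched in code as follows. The timestamps are hypothetical values of the kind one might copy from a TextGrid tier, not measured data, and the function name is an illustrative choice. No TextGrid parsing is attempted here.

```python
from statistics import mean, pstdev

# Hypothetical (start, end) timestamps in seconds for one syllable tier,
# as they might be copied from a Praat TextGrid; not measured data.
intervals = [(0.00, 0.20), (0.20, 0.50), (0.50, 0.60), (0.60, 1.00)]

def timing_summary(intervals):
    """Durations, mean, standard deviation and coefficient of variation."""
    durations = [end - start for start, end in intervals]
    mu = mean(durations)
    sigma = pstdev(durations)      # divide-by-n standard deviation
    return durations, mu, sigma, sigma / mu

durations, mu, sigma, cv = timing_summary(intervals)
print(f"mean={mu:.3f} s  sd={sigma:.3f} s  CV={cv:.3f}")
```

Because the coefficient of variation divides the standard deviation by the mean, it has no unit and can be compared across speakers or recordings with different speech rates.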
Some data processing functions used in timing analysis
Mean (average):
\[ \mu = \frac{\sum_{i=1}^{n} x_i}{n} \]
Standard deviation (of sample):
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}} \]
Coefficient of variation (relative standard deviation), ‘varco’:
\[ CV = \frac{\sigma}{\mu} \]
Normalisation or standardisation of data values from different sources in order to make them comparable: z-score (standard score)
\[ z = \frac{x - \mu}{\sigma} \]
Task for discussion:
First calculate the z-scores for your data, then calculate the mean, standard deviation and coefficient of variation.
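As a worked illustration of the z-score formula, the sketch below standardises a short list of durations and then checks the mean and standard deviation of the result. The durations are hypothetical, and the divide-by-n standard deviation is used, as in the formula above.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardise values: z = (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)         # divide-by-n form, as in the formula above
    return [(x - mu) / sigma for x in values]

# Hypothetical syllable durations in seconds (not measured data).
durations = [0.20, 0.30, 0.10, 0.40]
z = z_scores(durations)

print("z-scores:", [round(v, 3) for v in z])
print("mean of z:", round(mean(z), 6))      # close to 0 by construction
print("sd of z:  ", round(pstdev(z), 6))    # close to 1 by construction
```

One point the sketch makes visible: z-scores always have a mean of 0 and a standard deviation of 1, so a coefficient of variation computed on the z-scores themselves would divide by (nearly) zero. The coefficient of variation is therefore only informative on the raw durations.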
Speech timing – regularity measures
‘Rhythm metrics’ of relative isochrony:
– measures of regularity...irregularity of timing units:
  – σ: Standard Deviation
  – PIM: Pairwise Irregularity Measure
  – PFD: Pairwise Foot Difference
  – rPVI, nPVI: raw and normalised Pairwise Variability Index
– not rhythm: they ignore rhythmic alternation
rPVI: used for consonants, whose duration is not so variable
nPVI: used for vowels, whose duration is quite variable
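The slides name rPVI and nPVI without spelling out the formulas. The sketch below uses the commonly used definitions (mean absolute difference between neighbouring durations, raw or normalised by the mean of each pair and scaled by 100), which is an assumption about what is intended here. The function names and the two duration sequences are hypothetical.

```python
def rpvi(durations):
    """Raw PVI: mean absolute difference between neighbouring durations."""
    pairs = list(zip(durations, durations[1:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def npvi(durations):
    """Normalised PVI: each difference is divided by the pair's mean, times 100."""
    pairs = list(zip(durations, durations[1:]))
    return 100 * sum(abs(a - b) / ((a + b) / 2) for a, b in pairs) / len(pairs)

# Hypothetical duration sequences in arbitrary units (not measured data).
alternating = [1, 2, 1, 2, 1, 2]        # short-long alternation
accelerating = [32, 16, 8, 4, 2, 1]     # steadily shrinking, no alternation

print("nPVI alternating: ", round(npvi(alternating), 2))
print("nPVI accelerating:", round(npvi(accelerating), 2))
print("rPVI alternating: ", round(rpvi(alternating), 2))
print("rPVI accelerating:", round(rpvi(accelerating), 2))
```

The two sequences receive the same nPVI although only the first one alternates, because the absolute value discards the direction of each difference. This illustrates the point that such metrics measure irregularity and ignore rhythmic alternation.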
Timing: temporal relations – rhythm metrics
Task:
1. Choose a Mandarin TextGrid and an English TextGrid.
2. Calculate the z-score (optional step).
3. Calculate the nPVI for each.
4. Compare these results with the results for Standard Deviation.
5. Can you draw any conclusions about syllable timing and stress timing in English and Mandarin?