Focus Marking in German
Categorial and gradient prosody (Baumann et al. 2006)
Results and discussion
Articulatory gestures and focus marking
The experiment
Results and discussion
Focus Marking in German
Kordula De Kuthy
HS Neuere Arbeiten zur Fokusprojektion WS 09/10
February 2, 2010
Prosodic Marking of Focus Domains: Categorial or Gradient?
(1) a. Q: Who did you call?
b. A: [I called]background [MAry]F
(2) a. Q: Did you call John?
b. A: No, [I called]background [MAry]F
(3) a. Q: What happened?
b. A: [I called MAry]F
- The differences between the answers in (1b), (2b), and (3b) are
  discrete:
  - it is either MAry or I called MAry which is in focus, and
  - MAry is either contrasted with another specific person, or is
    singled out from a larger set.
- Are these differences marked prosodically, and
- does the prosodic marking involve
  - discrete means, i.e. phonological categories such as pitch
    accent type, or
  - gradient means, such as duration, or F0 timing and scaling
    differences (which do not lead to a difference in phonological
    categories)?
Prosodic marking of broad vs narrow focus in German
- Féry (1993) looked for categorial distinctions in the prosodic
  marking of broad versus narrow focus in German.
- The results of a production experiment revealed that speakers
  used the same nuclear pitch accent type (H*L) in both broad and
  narrow focus, as in (4) and (5).
(4) a. Q: Was ist los?
b. A: [ANna ist weggelaufen.]F
(5) a. Q: Wer ist weggelaufen?
b. A: [ANna]F [ist weggelaufen.]background
Production experiment
Baumann et al. (2006) designed a production experiment to
investigate whether prosodic means are used in German to differentiate
- between three sizes of focus domains involving focus projection,
  and
- between these and narrow focus, and
- between narrow focus and contrastive focus.
position), in German have later and higher peak placement than
non-contrastive ones [3].
Furthermore, stressed vowels have a significantly longer duration
in contrastive themes. Similarly, Second Occurrence Focus (SOF;
[10]) involves a longer duration of the target word. SOF is induced
by a focus operator such as even after the main focus of the phrase,
that is, in the unaccented stretch following the nuclear accent.
In both of the above cases gradient means are used to express a
binary opposition: contrastive or non-contrastive themes; SOF or no
focus. In our own study, the size of focus domains can be seen as
discrete but not necessarily binary, as the size of focus domains
can be extended step by step to include more and more constituents,
bounded only by sentence length. Contrastive or non-contrastive
narrow focus, on the other hand, can be considered a binary
distinction, although not all theories distinguish narrow focus
from contrast, since narrow focus is also contrastive in some
way [20].
2. Production experiment
2.1. Recordings
A production experiment was designed to investigate whether
prosodic means are used in German to differentiate between
three different sizes of focus domain involving focus
projection, and between these and narrow focus, and, within the
narrow focus category, contrastive focus. Our hypotheses are based
on the fact that gradient variation has been found to express other
differences in information structure (see 1.2). However, this
variation does not preclude a categorical distinction, e.g. in
pitch accent type.
2.1.1. Reading material
Reading material consisted of five question-answer pairs with the
answer ‘Manuela will Blumen malen.’ (Manuela wants to paint
flowers.). The main criterion the target sentence had to fulfill was
its continuous voicing, so as to be able to accurately measure exact
peaks and valleys in the F0 contour. The questions are listed below,
followed by the focus domains according to question-answer
congruence.
Questions:
1. Was gibt’s Neues? (What’s new?)
2. Was gibt’s Neues von Manuela? (What about Manuela?)
3. Was will Manuela? (What does Manuela want?)
4. Was will Manuela malen? (What does Manuela want to paint?)
5. Manuela will Gesichter malen? (Manuela wants to paint faces?)
Answers: Manuela will Blumen malen. (lit.: Manuela wants flowers paint)
1. [ ]focus (broad)
2. [ ]focus
3. [ ]focus
4. [ ]focus (narrow)
5. Nein, [ ]focus (contrastive)
2.1.2. Speakers and recording procedure
Six speakers (three female, three male) between the ages of 23
and 27 took part in the experiment. All of them were students at the
University of Cologne. Four speakers originated from the north-west
of Germany, one from the west (just below the Benrath isogloss), and
one from the north of Bavaria.
The recordings were carried out in a soundproof room, with the
instructor reading out the questions, and the subjects giving the
answers. The five sentences were interspersed with fillers and read
aloud four times in randomised orders by each speaker, leading to 20
tokens per speaker. Thus, 120 utterances in total entered the
analysis.
2.2. Analysis
Using the speech analysis tool EMU [5], we labelled the onset and
the end of the nuclear word (which was the word Blumen in all
cases), and the start and end of each segment. The prenuclear and
nuclear pitch accents were transcribed in GToBI [11] with an
additional label for the beginning of the nuclear rise. Example
contours are given in Fig.1.
Figure 1: Example F0 contours for broad and narrow focus (answers
1 and 4, speaker CB)
3. Results and discussion
3.1. Categorical means
As a first result, contrary to predictions in the literature,
both the size of the focus domain and type of focus affect the choice
of accent type on the focus exponent: in broad(er) focus structures
(sentences 1 and 2) a downstepped nuclear accent was produced in 42%
of all cases, while in narrower focus domains (sentences 3 and 4)
fewer downsteps occurred (25% and 17%, respectively). In
contrastively focussed utterances no downstep was produced at all
(Fig.2).
A Spearman’s Rho correlation analysis showed a significant
interaction between nuclear pitch accent type and sentence type
(p
Labeling of the resulting data
Higher accent peaks
- Two speakers show a highly significant effect of sentence type on
  nuclear accent pitch height.
support the finding of [1] for English that !H* was perceived as
significantly less prominent than L+H* or H*.
Figure 2: Differences in nuclear pitch accent type in relation to
sentence type, all speakers (N=120). [Bar chart; y-axis: nuclear
pitch accent type (%), 0 to 100; x-axis: sentence type, 1 to 5;
series: downstep, no downstep.]
In 20% of all cases there was no prenuclear H tone, since the
nuclear accent was the only accent in the phrase (78% of the
prenuclear accents were of the type (L+)H* and 2% of the type L*+H).
Half of the single-accent phrases occurred in contrastive
utterances, which is in line with the observation of [7] that the
prominence of an accent can be increased by deaccenting other words
in the phrase.
The observation already made by [3] in their investigation of
contrastive and non-contrastive themes in German, that speakers vary
considerably as to the (combination of) means they employ for
signalling aspects of information structure, is supported by our
data.
As for the use of different accent types, for example, four out
of six speakers use downstepped nuclear accents for marking broad
focus and non-downstepped peak accents for marking narrow and, in
particular, contrastive focus. The other two speakers do not use
downstepping contours at all, i.e. all prenuclear and nuclear
accents were of the type (L+)H*.
3.2. Gradient means
As the focus domain narrows, we also observe the use of the
following gradient means:
a) increased duration of the focus exponent
b) higher peak on the nuclear accent (marking the focus exponent)
c) greater pitch excursion to the peak of the nuclear accent
d) delay in the nuclear accent peak.
Across all speakers, duration varied consistently with the size
of focus domain but it did not distinguish between contrast and
non-contrast (Fig.3). In a one-way ANOVA with sentence type as
independent factor, the focus domain had a highly significant effect
on the duration of the focus exponent (p
The Experiment
Articulatory gestures and focus marking in German
Anne Hermes, Johannes Becker, Doris Mücke, Stefan Baumann &
Martine Grice
IfL Phonetik, University of Cologne, Germany {anne.hermes;
becker.johannes; doris.muecke; stefan.baumann;
martine.grice}@uni-koeln.de
Abstract
This study reports on a production experiment investigating
tonal and articulatory means of encoding different focus
structures in German. Using an electromagnetic articulograph,
we examined the movements of the upper and lower lips
(related to sonority expansion) during the production of target
words occurring in four different focus conditions. We found
systematic differences not only between unaccented vs.
accented target words (background vs. contrastive focus), but
also within the category ‘accented’: the differences in
articulatory expression for broad vs. contrastive focus were
expressed by greater displacements and lower stiffness of lip
aperture (opening and closing movements). Our results
suggest that German speakers express discrete linguistic
differences, namely differences in focus structure, by
gradually but systematically varying sonority expansion in
focus exponents across consonants and adjacent vowels, thus
enhancing the syntagmatic contrast.
1. Introduction
In most studies dealing with information structure, focus and
background are regarded as two distinct categories.
Consequently, it is often assumed that the prosodic marking
of these categories should be categorically distinct as well,
thus reducing the prosodic analysis to the question of whether
a constituent is accented or unaccented. More recent studies
(e.g. [2]) have shown, however, that different focus structures
are encoded by different accent types and/or by varying
continuous parameters such as duration or pitch excursion on
the focus exponents, thus creating different degrees of
prominence on the respective items.
In the present study we are primarily concerned with the
role of articulatory gestures in focus marking. The few
previous investigations in this field are restricted to words in
maximally diverging focus structures (contrastive focus vs.
background) and thus to the accented-unaccented dichotomy
(e.g. [5] for English and [1] for Italian). It is unclear from
these studies, however, whether the articulatory differences
found (e.g. greater jaw lowering or lip aperture in contrastive
focus) are simply due to accentuation or whether the
articulatory expression of different focus structures can be
regarded as a continuum of prominence or emphasis (as
reported in [6] for French).
In order to shed light on this question, we explore the
variation in articulatory parameters which are related to lip
kinematics (greater displacement, longer duration, higher
peak velocity and lower stiffness of lip opening to enhance
prominence) in the marking of target words occurring in
different types of focus (contrastive, non-contrastive) and
different sizes of focus domain (broad, narrow; see e.g. [9]),
or in the background. In particular, we investigate differences
within the category ‘accent’ (broad vs. narrow focus, broad
vs. contrastive focus) as well as between accented and
unaccented words (contrastive focus vs. background). Results
on differences between background and broad focus as well
as narrow and contrastive focus are not presented here.
1.1. Reading material
The speech material included question-answer sets eliciting
four different focus structures: the NP under investigation
occurred either as part of the previously mentioned
background or in broad, narrow or contrastive focus. The
target words, i.e. the fictitious names following the title Dr.,
were always disyllabic, with the stressed syllable containing
one of the four long target vowels /i:/, /a:/, /o:/ or /u:/. An
example of a question-answer set is given below:
Questions:
1. Will Norbert Dr. Bahber treffen? (Does Norbert want to meet
   Dr. Bahber?)
2. Was gibt's Neues? (What's new?)
3. Wen will Melanie treffen? (Whom does Melanie want to meet?)
4. Will Melanie Dr. Werner treffen? (Does Melanie want to meet
   Dr. Werner?)
Answers: Melanie will Dr. Bahber treffen.
(lit.: Melanie wants Dr. Bahber to-meet)
   focus domain    test word in:
1. [ ]focus        background
2. [ ]focus        broad focus
3. [ ]focus        narrow focus
4. [ ]focus        contrastive focus
1.2. Speakers and recordings
Three native speakers of Standard German (aged 26, 27 and
37) were recorded with a 2D Electromagnetic Midsagittal
Articulograph (EMMA) and a time-synchronized DAT-recorder.
The kinematic data were recorded at 500 Hz, downsampled to
200 Hz and smoothed with a 40 Hz low-pass filter. The acoustic
data were digitized at 44.1 kHz.
The subjects listened to the questions (which were
presented both visually and auditorily) and were instructed
to
answer these questions in a contextually appropriate manner
and at a normal speech rate. After a test block of five
question-answer-pairs each subject read out the target
sentences (four focus structures, four target words, seven
repetitions) in pseudo-randomised order, leading to 112
tokens per speaker in total.
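The token count follows from crossing the three design factors (4 focus structures x 4 target words x 7 repetitions = 112). As a minimal sketch of such a design, assuming a plain seeded shuffle (the source does not say which constraints the pseudo-randomisation imposed) and using the hypothetical names `FOCUS_STRUCTURES`, `TARGET_VOWELS` and `build_token_list`:

```python
import itertools
import random

# Factor levels as described in the text; the identifiers are illustrative.
FOCUS_STRUCTURES = ["background", "broad", "narrow", "contrastive"]
TARGET_VOWELS = ["i:", "a:", "o:", "u:"]  # one disyllabic target word per long vowel
REPETITIONS = 7


def build_token_list(seed=0):
    """Cross all factors and shuffle the resulting list reproducibly.

    Assumption: a simple shuffle stands in for the unspecified
    pseudo-randomisation procedure.
    """
    tokens = [
        {"focus": focus, "vowel": vowel, "repetition": rep}
        for focus, vowel, rep in itertools.product(
            FOCUS_STRUCTURES, TARGET_VOWELS, range(1, REPETITIONS + 1)
        )
    ]
    random.Random(seed).shuffle(tokens)
    return tokens


tokens = build_token_list()
print(len(tokens))  # 4 * 4 * 7 tokens per speaker
```

Each focus structure then contributes 28 tokens per speaker, and each focus-by-word cell contributes 7.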
Lip movements were monitored by EMMA (Carstens
AG100), with sensors placed on the vermilion border of the
upper and lower lip within the midsagittal plane. Two
additional sensors on the nose and the upper gums served as a
reference in order to correct helmet movements during the
recordings.
Labeling of the data
[Figure: values per speaker (DM, AH, WP) and focus condition
(background, broad, narrow, contrastive); axis ticks 200, 300, 400;
differences marked ***.]
1.3. Labelling procedure
Acoustic and articulatory data were labelled by hand using the
EMU speech database system. A screen shot including all tiers and
labels described below is given in Fig.1.
Segment boundaries of consonants and vowels of the accented and
post-accented syllables (c1, v1, c2, v2) were annotated in the
acoustic waveform.
In the tonal analysis we identified three different GToBI accent
types on the target word (as proposed in [2]): !H* (downstep), ^H*
(upstep) or H* (neither downstep nor upstep). Note that up- and
downsteps are always related to a preceding prenuclear LH accent on
the subject argument. Deaccentuation of the target word was marked
with ‘Ø’.
For the kinematic data, the lip aperture (LA) index was
calculated in terms of the Euclidean distance between the two
sensors of upper and lower lip, including movements both in the
horizontal and vertical dimension [4]. Minima and maxima of opening
and closing gestures (min1, max1, min2, max2) were located at
zero-crossings in the respective velocity trace. Additionally, we
labelled peak velocities at zero-crossings in the respective
acceleration trace (p1, p2, p3). Twelve utterances (all from
speaker WP) were removed from the analysis because no clear turning
points for the lip kinematics could be identified.
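The kinematic measures described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' labelling procedure (which was done by hand in EMU): sensor positions are taken to be lists of (x, y) pairs, the velocity trace is approximated by central differences, and the function names `lip_aperture`, `derivative` and `turning_points` are hypothetical.

```python
import math


def lip_aperture(upper, lower):
    """Euclidean distance between upper- and lower-lip sensors, per sample.

    upper, lower: equally long lists of (x, y) sensor positions.
    """
    return [math.hypot(ux - lx, uy - ly) for (ux, uy), (lx, ly) in zip(upper, lower)]


def derivative(signal, rate_hz):
    """Central-difference derivative (one-sided at the two endpoints)."""
    n = len(signal)
    if n < 2:
        return [0.0] * n
    dt = 1.0 / rate_hz
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((signal[hi] - signal[lo]) / ((hi - lo) * dt))
    return out


def turning_points(velocity):
    """Locate zero-crossings of a velocity trace.

    Returns (index, kind) pairs, where kind is "max" for a change from
    positive to negative velocity and "min" for the reverse. The index is
    the sample closest to zero around the sign change.
    """
    points = []
    prev_sign, prev_idx = 0, None
    for i, v in enumerate(velocity):
        sign = (v > 0) - (v < 0)
        if sign == 0:
            continue  # exact zeros are resolved when the next non-zero sample arrives
        if prev_sign != 0 and sign != prev_sign:
            nearest = min(range(prev_idx, i + 1), key=lambda k: abs(velocity[k]))
            points.append((nearest, "max" if prev_sign > 0 else "min"))
        prev_sign, prev_idx = sign, i
    return points
```

Applying `turning_points` to the derivative of the velocity trace (the acceleration) locates the peak velocities in the same way.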
Figure 1: Labelling scheme; from top to bottom: oscillogram, F0
curve, velocity and position curve
of lip aperture (LA); target word B/i:/ber.
2. Results and Discussion
We analysed all measures with one-way-ANOVAs for each speaker
separately and with a Tukey post hoc test. The dependent variables
included accent type and word duration for the acoustic measures,
and displacement, peak velocity, duration and stiffness for the
articulatory measures. The independent variable FOCUS STRUCTURE
included broad focus, narrow focus, contrastive focus and
background.
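For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as below. This is a sketch only: the function name `one_way_anova_f` is hypothetical, the numbers in the usage example are invented for illustration and are not data from the study, and the p-value and the Tukey post hoc comparisons are not covered.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: dict mapping a condition label (e.g. a focus structure)
    to a list of measurements from one speaker.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of the group means around the grand mean
    ss_within = 0.0   # variation of the observations around their group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented toy values, NOT measurements from the experiment.
toy = {"background": [1, 2, 3], "broad": [2, 3, 4], "contrastive": [5, 6, 7]}
f_stat, df_b, df_w = one_way_anova_f(toy)
print(f_stat, df_b, df_w)
```

Running the analysis separately for each speaker, as in the text, means calling the function once per speaker with that speaker's measurements grouped by focus structure.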
2.1. Accent types
Table 1 shows the accent types preferably used by the three
speakers in the different focus structures. As expected, all
speakers deaccented the target words when they occurred in the
background. In broad focus, speakers DM and AH almost exclusively
used downsteps (DM 85.2%; AH 100%), whereas
speaker WP typically produced upsteps (84%; only 4% downsteps).
In the narrow focus condition, speakers DM and WP both produced
upsteps (DM 82.6%; WP 100%), while speaker AH used all three accent
types nearly to the same extent (36% upsteps; 32% downsteps; 32%
unmodified H*). In contrastive focus, all three speakers always
used upsteps.
Table 1: Most frequently produced accent types per speaker and
focus condition.
2.2. Acoustic durations
We examined the duration of the target words for all speakers.
Since our target words are disyllabic, the domain ‘word’ is
identical with the domain ‘foot’. Fig.2 shows mean durations of the
target word B/i:/ber for the different focus conditions. For all
three speakers, we found a significant increase in word duration
from background to contrastive focus (e.g. AH: 33ms longer, p
Kinematic results
- Kinematic results are presented for two speakers for the
  vowel /i:/.
- The figure shows averaged trajectories for the distance
  between upper and lower lip during the production of the target
  word, for each focus condition separately.
- Low displacements indicate that the lips are closed for the
  production of the stop consonants.
- High values indicate open lips during the vowels.
- Going from background through broad and narrow to
  contrastive focus there is an increase in duration and lip
  aperture.