
Focus Marking in German

Categorial and gradient prosody (Baumann et al. 2006)

Results and discussion

Articulatory gestures and focus marking

The experiment


Focus Marking in German

Kordula De Kuthy

HS Neuere Arbeiten zur Fokusprojektion WS 09/10

February 2, 2010



Prosodic Marking of Focus Domains: Categorial or Gradient

(1) a. Q: Who did you call?

b. A: [I called]background [MAry]F

(2) a. Q: Did you call John?

b. A: No, [I called]background [MAry]F

(3) a. Q: What happened?

b. A: [I called MAry]F

- The differences between the answers in (1b), (2b), and (3b) are discrete:
  - it is either MAry or I called MAry which is in focus, and
  - MAry is either contrasted with another specific person, or is singled out from a larger set.
- Are these differences marked prosodically, and
- does the prosodic marking involve
  - discrete means, i.e. phonological categories such as pitch accent type, or
  - gradient means, such as duration, or F0 timing and scaling differences (which do not lead to a difference in phonological categories)?



Prosodic marking of broad vs narrow focus in German

- Féry (1993) looked for categorial distinctions in the prosodic marking of broad versus narrow focus in German.
- The result of a production experiment revealed that speakers used the same nuclear pitch accent type (H*L) in both broad and narrow focus, as in (4) and (5).

(4) a. Q: Was ist los?

b. A: [ANna ist weggelaufen.]F

(5) a. Q: Wer ist weggelaufen?

b. A: [ANna]F [ist weggelaufen.]background



Production experiment

Baumann et al. (2006) design a production experiment to investigate whether

- prosodic means are used in German to differentiate between three sizes of focus domains involving focus projection and
- between these and narrow focus and
- between narrow focus and contrastive focus

position), in German have later and higher peak placement than non-contrastive ones [3].

Furthermore, stressed vowels have a significantly longer duration in contrastive themes. Similarly, Second Occurrence Focus (SOF; [10]) involves a longer duration of the target word. SOF is induced by a focus operator such as even after the main focus of the phrase, that is, in the unaccented stretch following the nuclear accent.

In both of the above cases gradient means are used to express a binary opposition: contrastive or non-contrastive themes; SOF or no focus. In our own study, the size of focus domains can be seen as discrete but not necessarily binary, as the size of focus domains can be extended step by step to include more and more constituents, bounded only by sentence length. Contrastive or non-contrastive narrow focus, on the other hand, can be considered a binary distinction, although not all theories distinguish narrow focus from contrast, since narrow focus is also contrastive in some way [20].

2. Production experiment

2.1. Recordings

A production experiment was designed to investigate whether prosodic means are used in German to differentiate between three different sizes of focus domain involving focus projection, and between these and narrow focus, and, within the narrow focus category, contrastive focus. Our hypotheses are based on the fact that gradient variation has been found to express other differences in information structure (see 1.2). However, this variation does not preclude a categorical distinction, e.g. in pitch accent type.

2.1.1. Reading material

Reading material consisted of five question-answer pairs with the answer ‘Manuela will Blumen malen.’ (Manuela wants to paint flowers.). The main criterion the target sentence had to fulfill was its continuous voicing, so as to be able to accurately measure exact peaks and valleys in the F0 contour. The questions are listed below, followed by the focus domains according to question-answer congruence.

Questions:

1. Was gibt’s Neues? What’s new?
2. Was gibt’s Neues von Manuela? What about Manuela?
3. Was will Manuela? What does Manuela want?
4. Was will Manuela malen? What does Manuela want to paint?
5. Manuela will Gesichter malen? Manuela wants to paint faces?

Answers: Manuela will Blumen malen.
lit.: Manuela wants flowers paint

1. [ ] focus broad
2. [ ] focus
3. [ ] focus
4. [ ] focus narrow
5. Nein, [ ] focus contrastive

2.1.2. Speakers and recording procedure

Six speakers (three female, three male) between the ages of 23 and 27 took part in the experiment. All of them were students at the University of Cologne. Four speakers originated from the north-west of Germany, one from the west (just below the Benrath isogloss), and one from the north of Bavaria.

The recordings were carried out in a soundproof room, with the instructor reading out the questions, and the subjects giving the answers. The five sentences were interspersed with fillers and read aloud four times in randomised orders by each speaker, leading to 20 tokens per speaker. Thus, 120 utterances in total entered the analysis.
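The token counts follow directly from the design. The sketch below is only an illustration under stated assumptions: fillers are left out, and each repetition is treated as an independent shuffle of the five questions (the text only says "randomised orders"). The name `build_session` and the fixed seed are hypothetical and not part of the original procedure.

```python
import random

QUESTIONS = [1, 2, 3, 4, 5]   # the five context questions from 2.1.1
REPETITIONS = 4               # each answer was read four times
N_SPEAKERS = 6                # six speakers took part


def build_session(rng):
    """Return one speaker's presentation order (fillers omitted)."""
    order = []
    for _ in range(REPETITIONS):
        block = QUESTIONS[:]   # copy so the master list stays untouched
        rng.shuffle(block)     # assumption: shuffle within each repetition
        order.extend(block)
    return order


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
sessions = [build_session(rng) for _ in range(N_SPEAKERS)]

tokens_per_speaker = len(sessions[0])
total_tokens = sum(len(s) for s in sessions)
print(tokens_per_speaker, total_tokens)  # prints "20 120"
```

Five sentences times four repetitions gives the 20 tokens per speaker, and six speakers give the 120 utterances reported above.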

2.2. Analysis

Using the speech analysis tool EMU [5], we labelled the onset and the end of the nuclear word (which was the word Blumen in all cases), and the start and end of each segment. The prenuclear and nuclear pitch accents were transcribed in GToBI [11] with an additional label for the beginning of the nuclear rise. Example contours are given in Fig.1.

Figure 1: Example F0 contours for broad and narrow focus (answers 1 and 4, speaker CB)

3. Results and discussion

3.1. Categorical means

As a first result, contrary to predictions in the literature, both the size of the focus domain and type of focus affect the choice of accent type on the focus exponent: in broad(er) focus structures (sentences 1 and 2) a downstepped nuclear accent was produced in 42% of all cases, while in narrower focus domains (sentences 3 and 4) fewer downsteps occurred (25% and 17%, respectively). In contrastively focussed utterances no downstep was produced at all (Fig.2).

A Spearman’s Rho correlation analysis showed a significant interaction between nuclear pitch accent type and sentence type (p







Labeling of the resulting data








Higher accent peaks

- Two speakers show a highly significant effect of sentence type on nuclear accent pitch height.

support the finding of [1] for English that !H* was perceived as significantly less prominent than L+H* or H*.

Figure 2: Differences in nuclear pitch accent type in relation to sentence type, all speakers (N=120). [Bar chart; x-axis: sentence type (1 to 5); y-axis: nuclear pitch accent type (%), 0 to 100; categories: downstep, no downstep.]

In 20% of all cases there was no prenuclear H tone, since the nuclear accent was the only accent in the phrase (78% of the prenuclear accents were of the type (L+)H* and 2% of the type L*+H). Half of the single-accent phrases occurred in contrastive utterances, which is in line with the observation of [7] that the prominence of an accent can be increased by deaccenting other words in the phrase.

The observation already made by [3] in their investigation for contrastive and non-contrastive themes in German, that speakers vary considerably as to the (combination of) means they employ for signalling aspects of information structure, is supported by our data.

As for the use of different accent types, for example, four out of six speakers use downstepped nuclear accents for marking broad focus and non-downstepped peak accents for marking narrow and, in particular, contrastive focus. The other two speakers do not use downstepping contours at all, i.e. all prenuclear and nuclear accents were of the type (L+)H*.

3.2. Gradient means

As the focus domain narrows, we also observe the use of the following gradient means:

a) increased duration of the focus exponent
b) higher peak on the nuclear accent (marking the focus exponent)
c) greater pitch excursion to the peak of the nuclear accent
d) delay in the nuclear accent peak.

Across all speakers, duration varied consistently with the size of focus domain but it did not distinguish between contrast and non-contrast (Fig.3). In a one-way ANOVA with sentence type as independent factor, the focus domain had a highly significant effect on the duration of the focus exponent (p







The Experiment

Articulatory gestures and focus marking in German

Anne Hermes, Johannes Becker, Doris Mücke, Stefan Baumann & Martine Grice

IfL Phonetik, University of Cologne, Germany {anne.hermes; becker.johannes; doris.muecke; stefan.baumann; martine.grice}@uni-koeln.de

Abstract

This study reports on a production experiment investigating tonal and articulatory means of encoding different focus structures in German. Using an electromagnetic articulograph, we examined the movements of the upper and lower lips (related to sonority expansion) during the production of target words occurring in four different focus conditions. We found systematic differences not only between unaccented vs. accented target words (background vs. contrastive focus), but also within the category ‘accented’: the differences in articulatory expression for broad vs. contrastive focus were expressed by greater displacements and lower stiffness of lip aperture (opening and closing movements). Our results suggest that German speakers express discrete linguistic differences, namely differences in focus structure, by gradually but systematically varying sonority expansion in focus exponents across consonants and adjacent vowels, thus enhancing the syntagmatic contrast.

1. Introduction

In most studies dealing with information structure, focus and background are regarded as two distinct categories. Consequently, it is often assumed that the prosodic marking of these categories should be categorically distinct as well, thus reducing the prosodic analysis to the question of whether a constituent is accented or unaccented. More recent studies (e.g. [2]) have shown, however, that different focus structures are encoded by different accent types and/or by varying continuous parameters such as duration or pitch excursion on the focus exponents, thus creating different degrees of prominence on the respective items.

In the present study we are primarily concerned with the role of articulatory gestures in focus marking. The few previous investigations in this field are restricted to words in maximally diverging focus structures (contrastive focus vs. background) and thus to the accented-unaccented dichotomy (e.g. [5] for English and [1] for Italian). It is unclear from these studies, however, whether the articulatory differences found (e.g. greater jaw lowering or lip aperture in contrastive focus) are simply due to accentuation or whether the articulatory expression of different focus structures can be regarded as a continuum of prominence or emphasis (as reported in [6] for French).

In order to shed light on this question, we explore the variation in articulatory parameters which are related to lip kinematics (greater displacement, longer duration, higher peak velocity and lower stiffness of lip opening to enhance prominence) in the marking of target words occurring in different types of focus (contrastive, non-contrastive) and different sizes of focus domain (broad, narrow; see e.g. [9]), or in the background. In particular, we investigate differences within the category ‘accent’ (broad vs. narrow focus, broad vs. contrastive focus) as well as between accented and unaccented words (contrastive focus vs. background). Results on differences between background and broad focus as well as narrow and contrastive focus are not presented here.

1.1. Reading material

The speech material included question-answer sets eliciting four different focus structures: the NP under investigation occurred either as part of the previously mentioned background or in broad, narrow or contrastive focus. The target words, i.e. the fictitious names after Dr. (e.g. Bahber), were always disyllabic, with the stressed syllable containing one of the four long target vowels /i:/, /a:/, /o:/ or /u:/. An example of a question-answer set is given below:

Questions:

1. Will Norbert Dr. Bahber treffen? Does Norbert want to meet Dr. Bahber?
2. Was gibt’s Neues? What’s new?
3. Wen will Melanie treffen? Whom does Melanie want to meet?
4. Will Melanie Dr. Werner treffen? Does Melanie want to meet Dr. Werner?

Answers: Melanie will Dr. Bahber treffen.
(lit.: Melanie wants Dr. Bahber to-meet)

Test word in:

1. [ ] focus background
2. [ ] focus broad focus
3. [ ] focus narrow focus
4. [ ] focus contrastive focus

1.2. Speakers and recordings

Three native speakers of Standard German (aged 26, 27 and 37) were recorded with a 2D Electromagnetic Midsagittal Articulograph (EMMA) and a time-synchronized DAT-recorder. The kinematic data were recorded at 500Hz, downsampled to 200Hz and smoothed with a 40Hz low-pass filter. The acoustic data were digitized at 44.1kHz.

The subjects listened to the questions (which were presented both visually and auditorily) and were instructed to answer these questions in a contextually appropriate manner and at a normal speech rate. After a test block of five question-answer-pairs each subject read out the target sentences (four focus structures, four target words, seven repetitions) in pseudo-randomised order, leading to 112 tokens per speaker in total.

Lip movements were monitored by EMMA (Carstens AG100), with sensors placed on the vermillion border of the upper and lower lip within the midsagittal plane. Two additional sensors on the nose and the upper gums served as a reference in order to correct helmet movements during the recordings.



Labeling of the data

[Figure: bar chart by speaker (DM, AH, WP) and focus condition (background, broad, narrow, contrastive); axis ticks 200 to 400; significance markers ***.]

1.3. Labelling procedure

Acoustic and articulatory data were labelled by hand using the EMU speech database system. A screen shot including all tiers and labels described below is given in Fig.1.

Segment boundaries of consonants and vowels of the accented and post-accented syllables (c1, v1, c2, v2) were annotated in the acoustic waveform.

In the tonal analysis we identified three different GToBI accent types on the target word (as proposed in [2]): !H* (downstep), ^H* (upstep) or H* (neither downstep nor upstep). Note that up- and downsteps are always related to a preceding prenuclear LH accent on the subject argument. Deaccentuation of the target word was marked with ‘Ø’.

For the kinematic data, the lip aperture (LA) index was calculated in terms of the Euclidean distance between the two sensors of upper and lower lip, including movements both in the horizontal and vertical dimension [4]. Minima and maxima of opening and closing gestures (min1, max1, min2, max2) were located at zero-crossings in the respective velocity trace. Additionally, we labelled peak velocities at zero-crossings in the respective acceleration trace (p1, p2, p3). Twelve utterances (all from speaker WP) were removed from the analysis because no clear turning points for the lip kinematics could be identified.
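The lip aperture measure and the landmark search can be sketched as follows. This is a minimal illustration, not the EMU labelling procedure itself (the labels were set by hand): it assumes each sensor delivers (x, y) positions at a fixed sampling rate, uses a simple finite-difference derivative, and runs on a synthetic open-close cycle rather than measured data. The function names are hypothetical.

```python
import math


def lip_aperture(upper, lower):
    """Euclidean distance between upper- and lower-lip sensor positions.

    upper, lower: sequences of (x, y) coordinates, one pair per sample.
    """
    return [math.hypot(ux - lx, uy - ly)
            for (ux, uy), (lx, ly) in zip(upper, lower)]


def derivative(signal, rate_hz):
    """Central-difference derivative; endpoints use one-sided differences."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((signal[hi] - signal[lo]) * rate_hz / (hi - lo))
    return out


def sign_changes(signal):
    """Indices i where the signal changes sign between samples i and i+1."""
    return [i for i in range(len(signal) - 1)
            if (signal[i] > 0) != (signal[i + 1] > 0)]


# Synthetic demo (NOT measured data): one open-close lip cycle at 200 Hz.
RATE_HZ = 200
n = RATE_HZ + 1
upper = [(0.0, 0.0)] * n
lower = [(0.0, -(5.0 - 5.0 * math.cos(2 * math.pi * i / RATE_HZ)))
         for i in range(n)]

la = lip_aperture(upper, lower)
velocity = derivative(la, RATE_HZ)
acceleration = derivative(velocity, RATE_HZ)

extrema = sign_changes(velocity)             # interior min/max of the gesture
velocity_peaks = sign_changes(acceleration)  # candidate peak-velocity points
print(extrema, velocity_peaks)
```

In the synthetic cycle the single velocity zero-crossing marks the point of maximal opening, and the two acceleration zero-crossings mark the peak velocities of the opening and closing movements. Extrema at the very edges of a trace are not found by a sign-change search and would need separate handling.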

Figure 1: Labelling scheme; from top to bottom: oscillogram, F0 curve, velocity and position curve of lip aperture (LA); target word B/i:/ber.

2. Results and Discussion

We analysed all measures with one-way-ANOVAs for each speaker separately and with a Tukey post hoc test. The dependent variables included accent type and word duration for the acoustic measures, and displacement, peak velocity, duration and stiffness for the articulatory measures. The independent variable FOCUS STRUCTURE included broad focus, narrow focus, contrastive focus and background.
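As an illustration of the per-speaker test, the F statistic of a one-way ANOVA can be computed as below. This is a sketch under assumptions: the durations are invented placeholder values, not the study's data, and the p-value and the Tukey post hoc comparisons are left out because they require distribution tables or a statistics package. The function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: dict mapping a condition label to a list of measurements.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of the group means around the grand mean
    ss_within = 0.0   # variation of the values around their own group mean
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_between += len(values) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Placeholder word durations in ms for ONE hypothetical speaker (invented).
durations = {
    "background": [300.0, 310.0, 305.0],
    "broad": [320.0, 330.0, 325.0],
    "narrow": [335.0, 345.0, 340.0],
    "contrastive": [350.0, 360.0, 355.0],
}
f_value, df_b, df_w = one_way_anova_f(durations)
print(f_value, df_b, df_w)
```

Running the function once per speaker, with FOCUS STRUCTURE as the grouping factor, mirrors the "for each speaker separately" design described above.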

2.1. Accent types

Table 1 shows the accent types preferably used by the three speakers in the different focus structures. As expected, all speakers deaccented the target words when they occurred in the background. In broad focus, speakers DM and AH almost exclusively used downsteps (DM 85.2%; AH 100%), whereas speaker WP typically produced upsteps (84%; only 4% downsteps). In the narrow focus condition, speakers DM and WP both produced upsteps (DM 82.6%; WP 100%), while speaker AH used all three accent types nearly to the same extent (36% upsteps; 32% downsteps; 32% unmodified H*). In contrastive focus, all three speakers always used upsteps.

Table 1: Most frequently produced accent types per speaker and focus condition.

2.2. Acoustic durations

We examined the duration of the target words for all speakers. Since our target words are disyllabic, the domain ‘word’ is identical with the domain ‘foot’. Fig.2 shows mean durations of the target word B/i:/ber for the different focus conditions. For all three speakers, we found a significant increase in word duration from background to contrastive focus (e.g. AH: 33ms longer, p







Kinematic results

- Kinematic results are presented for two speakers for the vowel /i:/.
- The figure shows averaged trajectories for the distance between upper and lower lip during the production of the target word, for each focus condition separately.
- Low displacements indicate that the lips are closed for the production of the stop consonants.
- High values indicate open lips during the vowels.
- Going from background through broad and narrow to contrastive focus there is an increase in duration and lip aperture.

