Activation of referents in the bilingual mind (pre-publication version)
Jacopo Torregrossa (1) & Christiane Bongartz (2)
(1) University of Hamburg, Department of Romance languages
(2) University of Cologne, English Department
Abstract. This paper investigates reference production by bilingual and monolingual children. We focus on the degree to which the activation of referents is encoded by different types of referring expressions among bilinguals and monolinguals. The study is based on forty-six story retellings produced in German by twenty-five Greek-German bilingual children and twenty-one monolingual children, respectively. The activation of referents is assessed based on a multi-factorial analysis of cognitive and linguistic factors that are involved in the use of referring expressions. The results show that pronouns produced by bilingual children tend to encode a lower degree of activation of referents and to be underspecific. This may be an effect of reduced processing speed experienced by bilinguals in the mapping of the referent’s activation onto the use of a certain referring expression. More generally, we account for the observed differences between bilinguals and monolinguals in terms of cognitive mechanisms underlying bilingual language production.
1. Introduction [1]
This paper investigates reference production in bilingual children. In order
to refer to an entity, speakers can use different types of referring expressions
(RE henceforth) which vary in terms of explicitness. For instance, if a
professor wants to refer to a student in her class, she could use the RE the
student in the front row which is more explicit than the RE the student,
which is in turn more explicit than the pronoun she/he. According to the
traditional view on reference production, speakers use REs to mark the
activation (alias prominence, accessibility, salience – see Arnold 2010) of a
referent in discourse, and hence, to guide the listeners in the identification of
the referent in their mental model of discourse (Ariel 1990). However,
recent literature has shown that the use of REs does not depend only on the
referent’s discourse status. Under the same discourse conditions, reference
production may vary across different groups of speakers. For instance,
processing constraints may affect the referent’s representation in the
speaker’s mind, and hence, the production of REs (see Arnold 2010 for a
review).
[1] The data presented in this paper were collected and analyzed within the CoLiBi project – Cognition, Literacy and Bilingualism in Greek-German speaking children (principal investigators: Christiane Bongartz and Ianthi Tsimpli), jointly funded by IKY (Greek State Scholarship Foundation) and DAAD (German Academic Exchange Service). We thank Maria Andreou, Eva Knopp, Ianthi Tsimpli, the audience of the workshop “Cognitive and linguistic effects in anaphora resolution” (15-16.05.2015 – Thessaloniki) and our anonymous reviewer. All remaining errors “refer back” to us.
Bilingual reference production represents a privileged viewpoint for
studying the interaction between cognitive and linguistic factors that affect
the use of REs. We will investigate to what extent bilingual children diverge
from their monolingual peers in the use of REs and whether possible
differences can be accounted for by cognitive mechanisms underlying
bilingual language production.
In Section 2, we introduce the notion of a referent’s activation, as used in
Kibrik (2011). The author elaborates a model of reference production that
accounts for discourse and cognitive factors involved in the representation
of the referent in the speaker’s mind. In Section 3, we review previous
studies concerning the processing mechanisms underlying reference
production across different groups of speakers, with special focus on
bilingual language production. In Section 4, we present our study which
compares bilingual and monolingual children in the use of REs, based on
the analysis of narratives elicited from both groups. Finally, Section 5
presents a discussion of the results.
2. Activation of referents in narrative discourse
The production of a coherent narrative involves establishing discourse
relations between discourse units. For instance, when telling a story,
speakers have to keep track of different events, deciding when to introduce
them in the plot, which of them to foreground (or background), and how to
connect them by means of rhetorical relations (e.g., elaboration or
explanation). At the sentence level, this is expressed, for example, by the
use of certain tense-aspect markers or the distinction between different
levels of embedding (see Tomasello 2003: 271 for an overview).
The focus of this paper is on discourse relations established by reference
chains tracking different characters in the narrative and on the linguistic
means used to express these relations. After a character is introduced in the
story, reference to it may be maintained across two (or more) adjacent
discourse units or reintroduced after a hiatus. Reference introduction,
maintenance, or reintroduction are signaled by the use of REs, depending
on the inventory of referring forms available in each language. For instance,
in languages that distinguish between indefinites, definites, and
pronouns (like English or German), speakers are likely to use indefinite
nouns for reference introduction, reduced REs (e.g., pronouns) for
maintenance, and more informative REs (i.e., phonologically fully-fledged,
such as definite nouns) for reference reintroduction [2].
[2] At a more global level, referring functions (i.e., reference introduction, maintenance or reintroduction) may be signaled by a specific informational partitioning of the sentence (e.g., topic-comment, given-new or focus-background). For example, indefinite nouns expressing introduction tend to appear post-verbally, contrary to given referents, which tend to occur pre-verbally.
The analysis presented in this paper goes beyond this static approach to the
mapping between forms (i.e., types of REs) and functions (reference
introduction, reintroduction and maintenance). We follow Kibrik (2011) in
arguing that the production of REs is the result of a dynamic interaction
between cognitive and linguistic factors. In particular, Kibrik (2011:53)
claims that the use of REs is grounded on two distinct cognitive processes,
i.e., attention and activation. First mention of a referent in discourse implies
that the referent is attended to by the speaker. Once attended to, a referent
becomes activated in the speaker’s working memory (WM, henceforth).
The referent’s degree of activation varies throughout the narrative,
depending on several factors. For example, recency of mention (alias
distance) affects activation [3]. Recently mentioned referents are more
activated than referents that have been mentioned less recently. The
syntactic position and the grammatical role of the referent’s previous
mention (alias antecedent) are relevant factors, too. Distance being equal, a
referent whose antecedent is a subject is usually more activated than a
referent whose antecedent is a direct or an indirect object. Likewise, the
referent is more activated if its antecedent occurs in a main clause rather
than in a subordinate clause [4]. Moreover, the degree of activation depends on
the number of characters intervening between two mentions of the same
character. Our analysis will take into account all these factors (we refer to
Kibrik 2011 and Arnold 2010 for an overview of additional ones).
[3] In this paper we will measure distance in terms of number of clauses (see Section 4.3). Kibrik’s (2011) assessment of distance, by contrast, is based on the representation of the hierarchical structure of discourse (ibid.: 403).
[4] For example, in (i) the pronoun he, which encodes a high degree of a referent’s activation, tends to refer to the subject of the main clause (John) and not to the subject of the subordinate clause (Luke), even if the latter is mentioned more recently.
(i) John_i said that Luke_j felt bad. He_i (…)
Whenever the speaker intends to refer to a character in the unfolding
narrative, she maps the referent’s degree of activation at the given point in
discourse onto the use of an RE. Typically, pronouns (or, more generally,
reduced REs) encode a higher degree of a referent’s activation than more
specific forms, such as definite nouns or proper names (see Ariel 1990 as a
main reference on this issue). However, the threshold of activation that is
relevant for the use of a reduced vs. fully-fledged RE is subject to inter- and
intra-speaker variation (see Section 3).
As an example of this, let us consider (1), which is the English translation of
the first clauses of a narrative told in German by an 8-year-old Greek-
German bilingual child (see Section 4 for a description of the task and the
participants of the study), and focus on the reference chain corresponding to
the referent denoted by the dog.
(1) There was a dog
and he had a cart with a balloon attached to it.
Then a rabbit came.
and he wanted to play with the balloon.
and the dog said…
In the first line of the narrative the character is introduced by means of an
indefinite noun (a dog) and gets activated. In the second line, the referent’s
degree of activation is high, given that the antecedent a dog is in a main
clause, in subject position and only one clause away. The speaker maps
this high degree of activation onto the use of the pronoun he. The next
mention of the dog occurs after three clauses and a new character intervenes
between the new mention and its antecedent. This involves a decay in the
referent’s activation. The resulting degree of activation is mapped onto the
use of a definite noun.
Kibrik (2011) develops a quantitative model to assign an activation score to
referents at any given point in discourse. He observes that not all the above-
mentioned factors (distance, argumenthood, etc.) have an equal effect in
determining the referent’s degree of activation and, based on a trial-and-
error heuristic procedure, derives the activation score of each factor (see
Kibrik 2011: 396-428). The referent’s activation score is derived as
the sum of all these weighted scores. It should be pointed out that the
referent is endowed with an activation score independently of its being
mentioned and of the RE used to refer to it. For example, coming back to
(1), the dog remains activated in the third and fourth line, even if it is not
mentioned. Furthermore, the activation score associated with the dog in the
second clause would have been the same even if the speaker had used a
more specific RE (e.g., the dog, this dog).
The analysis to be presented in Section 4 complies with Kibrik’s multi-
factorial approach in its main lines. We will derive the weight of each
activation factor by means of a learning algorithm (InfoGain function)
which is implemented in a machine learning software (i.e., WEKA, see Hall
et al. 2009). This will allow us to avoid weighting the contribution of each
factor by means of a trial-and-error procedure (as is the case in Kibrik 2011)
and to control for the interaction between the different factors (see Grüning
and Kibrik 2002 for a similar approach).
Before proceeding to the analysis, it is important to state clearly why
Kibrik’s model provides a useful theoretical tool to assess individual
variation in reference production (e.g., between bilingual children and their
monolingual peers). We mentioned that Kibrik defines a referent’s
activation in very specific terms, i.e., as activation in the speaker’s WM.
This entails that individual differences in WM capacity may affect the
referent’s activation and the corresponding use of REs. The picture is
further complicated by the impact of mechanisms of processing efficiency
on WM capacity (see, e.g., Bayliss et al. 2005). The quantitative approach
outlined in this section will allow us to understand whether a type of RE is
associated with the same activation score across different groups of
speakers. Any possible difference is interpreted in the light of differences in
the speakers’ cognitive profiles.
3. Activation in the bilingual mind
Section 2 contains a list of discourse factors that influence the referent’s
degree of activation, and hence, the use of REs. However, recent literature
has shown that cognitive factors play an important role, too. For instance,
Rosa & Arnold (2011) analyze the effect of a speaker’s distraction on the
performance in a narrative task: they asked the participants to do a shape-
sorting task while telling a picture-based narrative. The authors noticed that,
compared to non-distracted controls, distracted speakers tend to use REs
encoding lower degrees of the referent’s activation (i.e., definite nouns vs.
pronouns). Interpreted in the light of the activation model presented in
Section 2, these results show that the cognitive load imposed by the
distraction task renders the referent less activated in WM [5]. This is reflected
in the use of more specific (or less reduced) REs. Thus, overspecification
seems to be the outcome of WM load. However, WM is only one of the
cognitive factors involved in the use of REs.
[5] Rosa and Arnold do not explicitly mention the notion of WM, but their analysis is fully consistent with the theoretical framework reported in Section 2.
Van Rij et al. (2011) elaborate a model in which the role of WM in
reference production is kept distinct from mechanisms of processing
efficiency (i.e., processing speed). WM is involved in computing the factors
that affect a referent’s activation (e.g., establishing the discourse topic based
on grammatical information). Processing speed is involved in perspective
taking, which consists in checking whether the listener is able to recover the
intended referent. By considering both cognitive processes, the authors
account for patterns of variation in reference production across different
groups of speakers. For example, let us consider again Rosa & Arnold’s
observation that under WM load adult speakers tend to overspecify.
According to van Rij et al. (2011), WM load affects the identification of the
antecedent in subject position as the discourse topic, but does not influence
processing speed. This explains why speakers rely on an overly cautious
strategy, using a fully-fledged form (e.g., a definite noun) instead of a
pronoun, to avoid ambiguities for the listener.
Child reference production is explained according to a different pattern.
Children who are younger than 7 years have low WM capacity (as is the case
for adults under a WM load condition) and low processing speed (see van Rij et
al. 2011 and references therein). Problems in the identification of the
discourse topic coupled with deficits in perspective taking lead to the
production of underspecific forms, which are often ambiguous for the
listener.
In this paper, we do not consider the cognitive resources involved in
perspective taking. However, the distinction between WM and processing
speed is consistent with the referent’s activation model sketched in Section
2. On the one hand, the referent’s activation score is represented in WM. On
the other hand, we assume that processing speed is involved in the mapping
of the activation score onto a certain RE, i.e., in lexical retrieval of REs
(pronouns or definite descriptions). Under low processing speed, the speaker
should follow an underspecification strategy, given that reduced forms are
less demanding to access and produce (see Almor 1999, Hendriks et al.
2008 and Rosa & Arnold 2011 for a review).
The focus of this paper is on bilingual reference production. Based on the
assumptions that bilingualism affects cognitive and linguistic processes and
that reference production depends on cognitive and linguistic factors, we
expect bilingual speakers to exhibit a different use of REs than their
monolingual peers. We will consider two possible scenarios.
One possibility is that bilinguals use fully-fledged REs in association with a
high degree of the referent’s activation, i.e., they overspecify. Following
Rosa & Arnold (2011), overspecification may be the result of the cognitive
load experienced by bilinguals, due to the inhibition of the language which
is not in current use (see Levy et al. 2007, Philipp & Koch 2009 and Bialystok
et al. 2011 for a review). This cognitive load reduces the referent’s degree of
activation, as explained above.
An alternative possibility is that bilinguals underspecify as an effect of
reduced processing speed in the mapping of the referent’s activation score
onto the use of an RE. In particular, slow processing may disfavor the
retrieval of (complex) full nouns and lead to the production of lighter forms,
such as pronouns. Several studies have argued for a bilingual disadvantage
in tasks of lexical competence and access, which may be due either to the
competition between the two language systems (see, a.o., Bialystok et al.
2011) or to a frequency of use effect (see, a.o., Gollan et al. 2008). When
mapping the referent’s activation score onto the use of a certain RE,
bilinguals have to deal with alternatives both within their current language
(e.g., use of a pronoun vs. definite noun) and between their two languages.
In particular, Greek-German bilinguals have to cope with two distinct
referential systems. In German (a non-pro-drop language with no clitics), in
a scaled cline of activation, pronouns encode a higher degree of activation
than do definite nouns. In Greek (a pro-drop and clitic language), null
subjects and clitics are in complementary distribution and encode higher
degrees of activation than do both overt pronouns and definite nouns [6]. Overt
pronouns in German, thus, encode a lower degree of activation than their
null counterparts in Greek.
For both scenarios outlined above (overspecification vs. underspecification),
we predict, in line with Blumenfeld & Marian (2007), that the effects of
parallel language activation are modulated by language proficiency: the
greater the proficiency in the language not currently used, the more active
this language will remain in bilingual language use.
[6] In this contribution, we deal only with the German referential system. We will not consider the production of the demonstrative pronouns der, die, das, mainly due to their low frequency in our corpus. We refer to Hinterwimmer (2015) for an analysis of the conditions of use of demonstrative pronouns and to Torregrossa (to appear) for their use by German monolingual children between 8 and 10 years.
Some studies suggest that the first scenario, i.e., overspecification, is more
likely to occur. Andreou et al. (2015) account for reference production by
two groups of Greek-German bilingual children, one living in Germany and
one living in Greece (the present paper focuses on a subclass of the latter
group – see Section 4.1). After labelling each RE based on its linguistic
function (reference introduction, reintroduction and maintenance), the
authors show that in Greek, bilingual children living in Germany tend to use
explicit forms (definite nouns) in contexts in which the use of more reduced
forms (null and clitic pronouns) would have been appropriate as well. In this
respect, bilingual children differ from their monolingual peers. The authors
account for this pattern of production in terms of language dominance,
given that, among the bilinguals in Germany, Greek is the less dominant
language, in terms of vocabulary score, early literacy preparedness and current
language use.
Examples of overspecification in bilingual reference production are also
found in the studies by Serratrice (2007) and Chen & Lei (2012). Serratrice
(2007) analyzes reference production in 8-year-old English-Italian bilingual
children in both their languages, as compared to English and Italian
monolinguals. She found that for reference maintenance in Italian, bilingual
children use more definite nouns (vs. clitics) compared to monolinguals, but
only in object position. Chen & Lei (2012) account for narrative production
by English-Chinese bilingual children, English monolinguals and Chinese
monolinguals, and show that in reference reintroduction in Chinese,
bilingual children produce more definite nouns and fewer null pronouns
than monolinguals. Interestingly, in both studies, overspecification occurred
in just one of the two languages of each bilingual child, which may
ultimately be attributed to dominance of one language over the other and
related cross-linguistic effects [7]. What becomes apparent, however, is that
overspecification is a specific production strategy in need of further
explication, and not a default compensatory strategy associated with
bilingualism as such. Since a consistent pattern of results has not emerged
yet, the present paper aims to shed some new light on bilingual reference
production and the cognitive factors affecting it. By relying on the
activation model for reference production presented in Section 2, we will
show whether bilinguals overspecify or underspecify and interpret the
results in terms of the abovementioned theories (with overspecification
being the result of WM-load and underspecification of low processing
speed). Finally, in order to understand to what extent language proficiency
affects bilingual reference production, we will perform a correlational
analysis between the activation scores encoded by different types of REs
and vocabulary measures.
[7] In both studies, the authors do not profile the participants for language dominance. Serratrice claims that bilinguals prefer definite nouns to clitics, because definite nouns are morpho-syntactically less complex. Chen & Lei (2012) claim that the preference for nouns over pronouns in Chinese is an effect of cross-linguistic influence of English on Chinese.
4. The study
4.1. Participants
This study is based on a selection of 25 bilingual children among the 77
Greek-German bilinguals analyzed by Andreou et al. (2015). In particular,
we consider the ones living in Greece and ranging in age from 8 to 10 years
(mean age: 9 years) at the time of testing. This group of children consists of
simultaneous bilinguals (4/25), early sequential bilinguals (10/25), late
sequential bilinguals that have Greek as their L1 (10/25) and a late
sequential bilingual child with German as his L1. At the time of testing, all
children attended the German school in Thessaloniki, in which German is
the main medium of instruction (between 19 and 25 hours per week). The
hours of instruction in Greek vary between 4 and 7 hours per week. The
opportunity to attend a school in which the dominant language differs from
the dominant language in the society contributes to rendering this group of
children the most balanced among the bilinguals analyzed by Andreou et al.
(2015). This relatively balanced profile emerges also from the analysis of
other ethnographic measures extracted from the questionnaires administered
to the children before the testing (cf. Andreou et al. 2015 and Bongartz
2016), such as language use before schooling, early literacy preparedness
and current language use.
We also considered 21 age-matched German monolinguals (range = 8.1 to
10.6; mean age: 9.4) who attended a German school in Cologne (Germany).
All children (bilinguals and monolinguals) had no history of cognitive or
linguistic impairment or hearing loss (Andreou et al. 2015). To assess their
language proficiency in German (i.e., the language under analysis in this
contribution), we administered the productive vocabulary test normed for
German children (Petermann et al. 2010).
4.2. Materials
The material consists of 46 story retellings produced in German (25 by the
bilingual children and 21 by the monolinguals). The narratives were elicited
using the Edmonton Narrative Norms Instrument (ENNI) designed by
Schneider et al. (2005). ENNI includes six stories, divided into three groups
of increasing complexity. For our task, we used the two most complex ones.
Each of them consists of 13 pictures (with no text) which represent a series
of events involving two major characters (of different gender) and two
minor ones (of different gender, too). Bilinguals were asked to produce two
stories (one for each language). We will consider only the stories told in
German. Monolinguals produced only one story.
The task was administered as a sequence of PowerPoint slides on a
computer screen. First, participants had to choose one of three envelopes,
each containing one of the two stories. They were told that each envelope
contained a different story. Then, they looked at the story pictures two by
two, while listening to the model story on the headphones. Finally, once the
thirteen-picture synopsis had appeared on the screen, they had to retell the
story to the investigator, who feigned ignorance of the plot. The stories were
audio-recorded and then transcribed by German native speakers. We refer to
Andreou et al. (2015) for further details concerning the methodology and
the procedure of the experiment.
4.3. Analysis
The unit of analysis is the clause, defined by the occurrence of a verb. For each
clause, we identified REs denoting animate characters. In line with Kibrik
(2011) – see Section 2 – our analysis focuses on the activation encoded by
REs, distinguishing between pronouns (PRON) and definite noun phrases
(DEFDP, including proper names and possessive noun phrases consisting of
a possessive adjective followed by a noun, e.g., seine Mutter ‘his mother’).
By contrast, we will not take into account REs used to attend to a
certain referent (i.e., indefinite nouns for reference introduction).
In German, PRONs, determiners in DEFDPs and possessive adjectives are
marked for case (nominative, accusative and dative), number (singular and
plural) and gender (masculine, feminine and neuter). This is exemplified in
the narrative extract in (2). In the first unit, two REs occur: der Hase (the
rabbit), which is a DEFDP whose determiner is masculine, singular and
marked for nominative, and seine Mutter (his mother), a DEFDP whose
possessive adjective is singular, feminine and marked for accusative (which
is not different from nominative in this case). The second unit contains two
pronouns, i.e., the nominative case-marked singular, masculine pronoun er
and the accusative case-marked singular, feminine pronoun sie (also in this
case, the form is not distinct from its nominative counterpart).
(2) Da fand der Hase seine Mutter
There found the NOM.SING.MASC. rabbit his ACC.SING.FEM. mother
Und er fragte sie (…)
and he NOM.SING.MASC. asked her ACC.SING.FEM.
(bilingual; 8.1)
Based on the list of activation factors given in Section 2, we coded each RE
for its linguistic features (i.e., type, syntactic position and grammatical role),
for linguistic features of the referent’s previous mention (i.e., grammatical
role and syntactic position of the RE’s antecedent), for distance between the
RE and its antecedent and for number of intervening characters. We
identified three types of REs, i.e., indefinites (INDEF) – which are not
considered in the next steps of the analysis –, pronouns (PRON) and definite
nouns (DEFDP). The factor referring to the syntactic position of the RE and
its antecedent (CLAUSE and A-CLAUSE respectively) can have two
values, i.e., MAIN (i.e., occurrence in a main clause) and SUB (i.e.,
occurrence in a subordinate clause). For grammatical role
(GRAMMATICAL), we introduced three values, i.e., SUBJ (subject), OBJ
(direct object) and OTHER (indirect object or adjunct). The distance
between the RE and its antecedent was measured in units (i.e., clauses). We
refer to Torregrossa et al. (2015) for the first presentation of this model and
Torregrossa and Bongartz (submitted) for a recent elaboration. Table 1
contains an example of the analysis, based on the story told by a 9-year-old
bilingual child.
Table 1: Example of the coding.
Transcription | English translation | CHAIN | TYPE | CLAUSE | GRAMMATICAL | A-CLAUSE | A-GRAMMATICAL | A-TYPE | CHARACTERS | DISTANCE
Es war eine Hündin und ein Hase. | There were a female dog and a rabbit | 1 | INDEF | MAIN | SUBJ | INTRO | INTRO | INTRO | INTRO | INTRO
Es war eine Hündin und ein Hase. | There were a female dog and a rabbit | 2 | INDEF | MAIN | SUBJ | INTRO | INTRO | INTRO | INTRO | INTRO
Der Hase sah die Freundin von ihm | The rabbit saw the female friend of him | 2 | DEFDP | MAIN | SUBJ | MAIN | SUBJ | INDEF | 0 | 1
Der Hase sah die Freundin von ihm | The rabbit saw the female friend of him | 1 | DEFDP | MAIN | OBJ | MAIN | SUBJ | INDEF | 1 | 1
Der Hase sah die Freundin von ihm | The rabbit saw the female friend of him | 2 | PRON | MAIN | OTHER | MAIN | SUBJ | DEFDP | 1 | 0
dass sie einen Handwagen hatte mit einem Luftballon dran. | that she had a cart with a balloon attached. | 1 | PRON | SUB | SUBJ | MAIN | OBJ | DEFDP | 1 | 1
Er wollte mit dem Ballon spielen, | He wanted to play with the balloon | 2 | PRON | MAIN | SUBJ | MAIN | OTHER | PRON | 1 | 2
und die Freundin sagte, | and the female friend said, | 1 | DEFDP | MAIN | SUBJ | SUB | SUBJ | PRON | 1 | 2
er muss ihn zuerst losbinden, den Ballon. | he has to untie it first, the balloon | 2 | PRON | MAIN | SUBJ | MAIN | SUBJ | PRON | 1 | 2
Da wollte er ihn losbinden. | Then he wanted to untie it | 2 | PRON | MAIN | SUBJ | MAIN | SUBJ | PRON | 1 | 3
Und ausversehen rutschte er ihm aus der Hand. | and suddenly it slips out (to him) from the hands. | 2 | PRON | MAIN | OTHER | MAIN | SUBJ | PRON | 0 | 1
The first column contains the transcription of the narrative and the second
the English translation. If a unit contains more than one RE, it is repeated
(as many times as it contains REs) and underlined. Repeated sentences are
considered as a single unit for the measure of distance. The third column
(CHAIN) assigns an index to each character (1 for the female dog and 2 for
the rabbit, in the case at issue). The number of characters intervening
between two mentions of the same referent (CHARACTERS) is counted
based on linear criteria. For example, in (3) reference to the rabbit (von ihm
‘of his’) intervenes between the two mentions of the female dog (die
Freundin ‘the female friend’ and sie ‘she’, respectively). (4) is a
modification of (3), in which the prepositional phrase von ihm ‘of his’ is
substituted by the prenominal possessive adjective seine ‘his’. In this case,
the rabbit character does not intervene between the two mentions of the dog
character.
(3) Der Hase sah die Freundin von ihm,
The rabbit NOM.SING.MASC. saw the friend ACC.SING.FEM. of he DAT.SING.MASC.
dass sie einen Handwagen hatte.
that she NOM.SING.FEM. a ACC.SING.MASC. cart had
(4) Der Hase sah seine Freundin, dass
The rabbit NOM.SING.MASC. saw his ACC.SING.FEM. friend ACC.SING.FEM. that
sie einen Handwagen hatte.
she NOM.SING.FEM. a ACC.SING.MASC. cart had
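The linear counting criterion illustrated by (3) and (4) can be sketched as follows. This is only an illustration, not the annotation procedure actually used: the function name and the list encoding of mentions are hypothetical, and the sketch assumes that each intervening character is counted once, however often it is mentioned.

```python
def intervening_characters(mentions, index):
    """Count the distinct other characters mentioned between the mention at
    `index` and the closest earlier mention of the same character.

    `mentions` is a list of (chain_id, form) pairs in linear order.
    Returns None if there is no earlier mention (coded INTRO in Table 1).
    """
    chain = mentions[index][0]
    # Walk backwards to the closest earlier mention of the same chain.
    for prev in range(index - 1, -1, -1):
        if mentions[prev][0] == chain:
            between = {c for c, _ in mentions[prev + 1:index]}
            return len(between)
    return None


# Chain 1 = female dog, chain 2 = rabbit, as in Table 1.
example_3 = [(2, "Der Hase"), (1, "die Freundin"), (2, "von ihm"), (1, "sie")]
example_4 = [(2, "Der Hase"), (2, "seine"), (1, "Freundin"), (1, "sie")]

print(intervening_characters(example_3, 3))  # 1: von ihm intervenes
print(intervening_characters(example_4, 3))  # 0: seine precedes Freundin
```

Under these assumptions, the pronoun sie receives the value 1 in (3) and 0 in (4), in line with the description above.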
The final dataset was loaded into WEKA, a machine learning
software tool that includes several algorithms for data mining (Hall et al. 2009).
We used WEKA to determine the extent to which the use of REs is affected
by the factors considered in the analysis, with special attention to those
involved in reference activation, i.e., the linguistic features of the antecedent,
distance and the number of intervening characters (see Section 2). We applied
the information gain (info gain) function, which derives the weight of each
factor in determining a certain outcome (in our case, the use of a pronoun vs.
a definite noun) based on a decision tree analysis (see Torregrossa et al. 2015
for the representation of the decision tree corresponding to the use of REs in
Italian and Greek).
The bilingual and the monolingual data have been analyzed separately, to
understand whether the use of REs was sensitive to different factors in the
two groups of speakers.
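To make the notion of information gain concrete, the following minimal sketch computes it from scratch using the standard entropy-based definition. It is not WEKA's code, and whether WEKA's implementation matches it in every detail (e.g., in the treatment of numeric factors) is not stated in the text. The four annotated mentions are invented for illustration only and are not taken from the dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(rows, factor, outcome="type"):
    """Reduction in outcome entropy obtained by splitting the rows on `factor`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[factor], []).append(row[outcome])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[outcome] for row in rows]) - remainder


# Hypothetical toy data: distance fully predicts RE type, a_clause does not.
toy_rows = [
    {"type": "PRON", "distance": 1, "a_clause": "MAIN"},
    {"type": "PRON", "distance": 1, "a_clause": "SUB"},
    {"type": "DEFDP", "distance": 3, "a_clause": "MAIN"},
    {"type": "DEFDP", "distance": 3, "a_clause": "SUB"},
]

print(info_gain(toy_rows, "distance"))  # 1.0
print(info_gain(toy_rows, "a_clause"))  # 0.0
```

In this toy case a factor that separates pronouns from definite nouns perfectly receives the maximal weight, while an uninformative factor receives a weight of zero.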
As has been mentioned in Section 2, the referent’s activation score was
derived as the weighted sum of the values corresponding to each activation
factor. The weights were provided by the info gain function. For each factor,
we assigned a number to each of its possible values. The greater the number,
the greater its effect in strengthening the referent’s activation. For instance,
a referent whose antecedent is a subject is more activated than a referent
whose antecedent is an object (Section 2). This is reflected in the
assignment of the number 0.6 to the value SUBJ and 0.4 to the value OBJ
(see (b) below).
(a) Clause of the antecedent: MAIN (0.4) > SUBORDINATE (0.2)
(b) Grammatical role of the antecedent: SUBJ (0.6) > OBJ (0.4) > OTHER
(d) Number of intervening characters: 0 (0.8) > 1 (0.6) > 2 (0.4) > 3 (0.2) >
+3 (0)
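The weighted sum can be sketched as follows. The sketch rests on several assumptions: the weights are those used in the worked example in (5) (0.003, 0.05, 0.14 and 0.21); the distance values are read off Table 3 and the running text, since they are not listed above; and the value for OTHER is left out because it is not given. The function and variable names are hypothetical.

```python
# Values of the activation factors, as far as they are recoverable from the text.
VALUES = {
    "a_clause": {"MAIN": 0.4, "SUB": 0.2},
    "a_grammatical": {"SUBJ": 0.6, "OBJ": 0.4},      # value for OTHER not given
    "characters": {0: 0.8, 1: 0.6, 2: 0.4, 3: 0.2},  # more than 3 -> 0
    "distance": {1: 0.6, 2: 0.4, 3: 0.2},            # assumption: read off Table 3
}

# Weights as used in example (5); note that Table 2 lists 0.003 for a-clause
# in the monolingual column.
WEIGHTS = {"a_clause": 0.003, "a_grammatical": 0.05,
           "characters": 0.14, "distance": 0.21}


def activation_score(a_clause, a_grammatical, characters, distance,
                     weights=WEIGHTS):
    """Weighted sum of the values of the four activation factors."""
    characters_value = VALUES["characters"].get(characters, 0.0)
    return (VALUES["a_clause"][a_clause] * weights["a_clause"]
            + VALUES["a_grammatical"][a_grammatical] * weights["a_grammatical"]
            + characters_value * weights["characters"]
            + VALUES["distance"][distance] * weights["distance"])


# Mentions of the female dog character, coded as in Table 3.
print(round(activation_score("MAIN", "SUBJ", 1, 1), 4))  # 0.2412
print(round(activation_score("MAIN", "OBJ", 1, 1), 4))   # 0.2312
print(round(activation_score("SUB", "SUBJ", 1, 2), 4))   # 0.1986
```

With these inputs the sketch reproduces the scores reported for the three mentions of the female dog in Table 3.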
In the next section, we show the results of the analysis. In Section 4.4.1, we
report the weights of each factor in monolingual and bilingual reference
production, and provide some examples of how the referent’s activation
score is derived. Then, we compare the activation scores associated with the
use of pronouns (Section 4.4.2) and definite nouns (Section 4.4.3) in both
groups of speakers.
4.4. Results
4.4.1. Weight of the activation factors and derivation of the activation score
Table 2 reports the weights of each activation factor in determining the use
of REs in monolingual and bilingual reference production. The factors are
ordered from the weakest to the strongest predictor of RE type. It should be
noted that even if the values of the weights are slightly different across the
two groups, bilingual and monolingual reference production is sensitive to
the same hierarchy of factors. Distance and number of intervening
characters are the strongest factors, followed by the antecedent’s
grammatical role and syntactic position, respectively.
FACTOR MONOLINGUALS BILINGUALS
a-clause 0.003 0.007
a-grammatical 0.02 0.05
characters 0.17 0.14
distance 0.27 0.21
Table 2: Weight of each activation factor in monolingual and bilingual reference production.
The result concerning the antecedent’s syntactic position is expected, given
that child narratives exhibit a lower level of syntactic complexity than adult
narratives (for which syntactic complexity plays a more important role – see
Bongartz and Torregrossa, to appear, and Andreou et al. 2015). The data
related to the antecedent’s grammatical role require further investigation.
Van Rij et al. (2011) claim that, at the age of 7, children do not have
sufficient WM resources to identify the discourse topic based on
grammatical information (Section 3). These cognitive resources may still be
developing in the age range considered in this paper.
Table 3 shows how the referent’s activation score is calculated at each unit
of the narrative reported in Table 1. We focus on the chain referring to the
female dog character.
Table 3: Activation scores associated with the female dog character in the narrative extract in Table 1.
First, let us consider the referent’s activation score at the point at which the
referent is mentioned (see the activation scores in red in the table). In line 4,
the female dog is referred to with a definite noun, i.e., die Freundin (von
ihm) ‘the female friend of his’. The antecedent is the indefinite noun eine
Hündin ‘a female dog’ in line 1. The referent’s activation score is derived as
indicated in (5); see (a)-(d) in Section 4.3 for the values and Table 2 for the
weights.
(5) Activation score of the antecedent’s syntactic position: (0.4*0.003) +
Activation score of the antecedent’s grammatical role: (0.6*0.05) +
Activation score of the number of intervening characters: (0.6*0.14) +
Activation score of the distance RE-antecedent: (0.6*0.21) =
Referent’s activation score: 0.2412
UNIT | TYPE | A-CLAUSE | WEIGHT_A-CLAUSE | GRAMMATICAL | WEIGHT_GRAMMATICAL | CHARACTERS | WEIGHT_CHARACTER | DISTANCE | WEIGHT_DISTANCE | SCORE
Es war eine Hündin und ein Hase. | INDEF | INTRO | INTRO | INTRO | INTRO | INTRO | INTRO | INTRO | INTRO | INTRO
Es war eine Hündin und ein Hase | - | - | - | - | - | - | - | - | - | 0.2832
Der Hase sah die Freundin von ihm, | - | - | - | - | - | - | - | - | - | 0.2412
Der Hase sah die Freundin von ihm, | DEFDP | MAIN (0.4) | 0.003 | SUBJ (0.6) | 0.05 | 1 (0.6) | 0.14 | 1 (0.6) | 0.21 | 0.2412
Der Hase sah die Freundin von ihm, | - | - | - | - | - | - | - | - | - | 0.2732
dass sie einen Handwagen hatte mit einem Luftballon dran. | PRON | MAIN (0.4) | 0.003 | OBJ (0.4) | 0.05 | 1 (0.6) | 0.14 | 1 (0.6) | 0.21 | 0.2312
Er wollte mit dem Ballon spielen, | - | - | - | - | - | - | - | - | - | 0.2406
und die Freundin sagte, | DEFDP | SUB (0.2) | 0.003 | SUBJ (0.6) | 0.05 | 1 (0.6) | 0.14 | 2 (0.4) | 0.21 | 0.1986
er muss ihn zuerst losbinden, den Ballon. | - | - | - | - | - | - | - | - | - | 0.2412
Da wollte er ihn losbinden. | - | - | - | - | - | - | - | - | - | 0.1992
Und ausversehen rutschte er ihm aus der Hand. | - | - | - | - | - | - | - | - | - | 0.1572
Therefore, in line 4, the dog’s activation score equals 0.2412. This
activation score is mapped onto the use of a definite noun (die Freundin von
ihm). Employing this general procedure, we can calculate individual values
for each story participant at each mention, and then arrive at the average
activation score associated with the use of a pronoun or a definite noun.
Based on this, we will compare bilingual and monolingual reference
production (Section 4.4.2).
Before proceeding, however, it should be pointed out that once the dog
referent is introduced it will stay activated and hence associated with an
activation score independently of its being actually mentioned (cf. the
activation scores in black in Table 3). For instance, in the last line of Table 3
(und ausversehen rutschte er ihm aus der Hand ‘and suddenly it slipped to
him out of his hands’), only the rabbit is mentioned (e.g., ihm ‘to him’).
However, the dog is active, too. Its activation depends on the fact that its
last mention is in a main clause (0.4*0.003), in subject position (0.6*0.05),
three units distant (0.2*0.21), with one intervening character, the rabbit
(0.6*0.14) (the balloon is excluded because it is inanimate). The resulting
score equals 0.1572, as indicated in the table.
4.4.2. Activation of pronouns in bilinguals and monolinguals
Figure 1 compares the activation scores encoded by bilingual pronouns (on
the left) and monolingual pronouns (on the right). The activation scores
have been z-score normalized, given that the activation factors have
different weights across the two groups of participants (Table 2 in Section
4.4.1). Bilingual pronouns tend to encode a lower degree of activation than
monolingual pronouns (one-way ANOVA: F(1) = 60.56, p < .001).
Figure 1: Box plot of the z-score normalized activation scores associated with the use of pronouns by
bilinguals (on the left) and monolinguals (on the right).
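The two steps behind this comparison, z-score normalization and a one-way ANOVA, can be sketched with the Python standard library alone. The text does not specify over which set of scores the normalization was computed, so the sketch simply normalizes whatever list it is given, using the sample standard deviation (an assumption). All numbers below are invented for illustration and are not the study's data.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Rescale scores to mean 0 and (sample) standard deviation 1."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical toy scores for two groups of speakers.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]

print(z_scores(group_a))                  # [-1.0, 0.0, 1.0]
print(one_way_anova_f(group_a, group_b))  # (1.5, 1, 4)
```

The F value is the ratio of between-group to within-group variance; with two groups, the between-group degrees of freedom equal 1.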
Figure 2 shows the regression of the average activation score of
pronouns (for each participant) on the score obtained in the vocabulary
test. Vocabulary score is a significant predictor of the referent’s activation
score in pronoun production (r² = .24, p < .01): the lower the child’s proficiency,
the lower the activation score.
Figure 2: Scatter plot of the referent’s activation encoded by pronouns (activation) against the
vocabulary score in all children (bilinguals and monolinguals).
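As a minimal sketch of the kind of regression reported here, the following code fits a least-squares line and computes the coefficient of determination (r²). The variable names and the four data points are hypothetical and serve only to show the computation; they are not the children's scores.

```python
from statistics import mean


def linear_fit(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical vocabulary scores and average activation scores.
vocabulary = [1.0, 2.0, 3.0, 4.0]
activation = [1.0, 3.0, 2.0, 4.0]

slope, intercept, r_squared = linear_fit(vocabulary, activation)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))  # 0.8 0.5 0.64
```

A positive slope corresponds to the pattern described above: lower vocabulary scores go together with lower activation scores.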
4.4.3. Activation of definite nouns in bilinguals and monolinguals
Figure 3 compares the (z-score normalized) activation scores encoded by
definite nouns across the two groups (bilinguals and monolinguals). The
result is consistent with the one shown in Figure 1. Bilingual definite nouns
encode lower degrees of activation than monolingual definite nouns (one-
way ANOVA: F(1) = 6.04, p < .02).
A 2x2 mixed design ANOVA with Group (Bilingual vs. Monolingual) as a
between-subject variable and Type of Referring Expression (pronoun vs.
definite noun) as a within-subject variable reveals a significant interaction
between the two factors (F(3) = 132.39, p < .01). This suggests that the
difference between bilingual and monolingual definite nouns is not as
marked as the difference between bilingual and monolingual
pronouns. Furthermore, the regression between vocabulary scores and
average activation scores of definite nouns is not significant (r² = .011, p >
.5).
Figure 3: Box plot of the z-score normalized activation scores associated with the use of definite
nouns by bilinguals (on the left) and monolinguals (on the right).
5. Discussion and conclusions
The aim of our study was to understand whether bilingual children differ
from their monolingual peers in the use of REs and whether this difference
is related to processing mechanisms specific to bilingual language
production.
In Section 2, we identified two cognitive processes underlying reference
production. On the one hand, WM is involved in keeping track of the factors
that affect a referent’s activation (e.g., grammatical role of the referent’s
previous mention). On the other hand, processing mechanisms are involved
in the retrieval of REs appropriate to a given degree of activation. Based on
previous studies, we claimed that individual differences in WM capacity or
processing speed might affect reference production. More specifically, low
WM involves a decay in the referent’s activation which is reflected in the
use of REs encoding lower degrees of activation, i.e., definite nouns vs.
pronouns (overspecification). Slow processing speed leads to the production
of reduced referring forms (i.e., pronouns vs. definite nouns), which are
easier to access and process (underspecification).
The results reported in Figure 1 (Section 4.4.2) show that bilinguals tend to
produce pronouns encoding lower levels of activation than monolinguals.
Table 4 shows an example of the production of an underspecific pronoun.
The rabbit (der Hase, in bold) is the subject of the coordinated VPs rennt
(runs) and will (wants) and is picked up by the (underspecific) pronoun er
(he) in the last sentence. Notice that the two mentions of the rabbit are
separated by four clauses, in which two characters occur, i.e., the balloon-
seller and the plural character denoting the rabbit and the dog (sie ‘they’ in
the penultimate line). Thus, in the last sentence, the rabbit has a relatively
low activation score, which is nevertheless mapped onto the use of a reduced
RE, i.e., a pronoun. Reference to the young rabbit is underspecific and
ambiguous between the rabbit himself and the balloon-seller.
(…) Da kommt ein Mann mit Luftballons. (There arrives a man with balloons)
Da rennt der Hase hin, (Then the rabbit runs there)
um seine Freundin glücklich zu machen (to make his female friend happy)
und will ihr einen Luftballon kaufen. (and he wants to buy her a balloon)
Aber dann sagt der Luftballonverkäufer, (but then the balloon seller says)
dass er 50 Cent will für einen Luftballon. (that he wants 50 cents for a balloon)
und dann betteln sie ihn an (…) (and then they beg him)
Also dann sucht er in seinen Taschen. (so then he looks in his pockets)
Table 4: Extract from a story told by a bilingual child (8.11)
Table 5 shows that in a very similar context, an age-matched monolingual
child does not produce an underspecific reference. In the last line, the rabbit
is referred to by means of a definite noun (der Hase ‘the rabbit’),
corresponding to the referent’s low degree of activation (distance of five
clauses between the first and last mention of the rabbit and occurrence of
one intervening character)8.
8 As mentioned in footnote 6, we did not consider the use of the demonstrative pronoun der. In Table 5, the first occurrence of der refers to the young rabbit and the second to the balloon-seller. Neither reference is ambiguous.
Dann war da ein alter Hase mit uhm # mit Ballons (then there was an old rabbit with balloons)
Dann hat der gesagt: (and he [the rabbit] said)
“Können Sie mir den größten und schönsten Ballon geben (Could you please give me the biggest and prettiest balloon)
den Sie hatten?” (that you have?)
Dann hat der gesagt. (then he [the balloon seller] said)
er wollte dafür Geld haben. (he wanted to have money for it)
Aber der Hase hatte kein Geld dabei (…) (but the rabbit had no money with him)
Table 5: Extract from a story told by a monolingual child (8.9)
Based on the literature discussed in Section 3, we interpret the data in
Figure 1 as showing that bilingual reference production is constrained by
processing speed. When mapping the referent’s activation score onto the use
of an RE, bilinguals go for the “easiest” option. Furthermore, Figure 2 reveals
that this effect is modulated by language proficiency: the lower the
proficiency, the greater the processing costs and the more likely the
production of underspecific forms. In this paper, we do not report any data
revealing a bilingual disadvantage in processing speed, and the correlation
between slow processing and underspecification remains partly speculative.
We refer to the study by Bongartz & Torregrossa (submitted), which
examines a group of Greek-German bilingual children and reports a clear
correlation between activation scores and the scores obtained in a lexical
decision task (as a measure of processing speed).
Finally, it is noteworthy that in this study, overspecification does not emerge
as a bilingual referential strategy. Overspecification is expected as an effect
of low WM capacity (Section 3). The literature reports no evidence that, in
WM tasks, bilinguals perform worse than monolinguals. Rather, the
opposite seems to hold true (Bialystok et al. 2011).
Figure 3 (Section 4.4.3) shows that bilingual definite nouns encode lower
degrees of activation than monolingual definite nouns, as is the case for
pronouns. While the mapping of a (relatively) low activation onto the use of
a pronoun indicates underspecification (i.e., ambiguity), definite nouns are
supposed to express low activation. Our data suggest that monolinguals tend
to use definite nouns in cases in which they could have used pronouns.
Table 6, which reports an extract from a story told by a monolingual child,
provides empirical evidence in favor of this claim (see, for example, the
second line, where der Jungenhase ‘the young rabbit’ is used although er
‘he’ would have been possible).
(…) und dann war das Hundemädchen sauer auf den Jungenhasen (then the female dog was angry with the young male rabbit)
Und dann hat der Jungenhase ein älteren Hasen mit ganz vielen Ballons gesehen. (and then the young male rabbit saw an older rabbit with many balloons)
Und dann hat der Jungenhase um einen den schönsten Ballons gebeten. (and then the young male rabbit asked for the most beautiful balloon)
und der ältere Hase wollte für den Ballon Geld haben (and the older rabbit wanted to have money for the balloon)
und der Jungenhase fand kein Geld in seinen Hosentaschen (and the young male rabbit found no money in his pockets)
Table 6: Extract from a story told by a monolingual child (8.7)
However, a qualitative analysis of the data reveals that this pattern (i.e., the
use of definite nouns vs. pronouns) is not consistent across all monolingual
participants, as also shown by the fact that the difference between the two
groups in Figure 3 is not as marked as in Figure 1 (Sections 4.4.3
and 4.4.2, respectively). When producing a fully-fledged form in association with a
relatively high degree of a referent’s activation, monolinguals seem to rely
on a cautious strategy for reference production which avoids ambiguities for
the listener. In the context reported in Table 6, this strategy might be
motivated by the occurrence of two referents of the same type and gender
(i.e., two male rabbits). Interestingly, the fact that there is no great
difference between bilinguals and monolinguals in the production of definite
nouns suggests that the two groups rely on a similar discourse-pragmatic
sensitivity. The observed differences might be due to diverging mechanisms
available in the two groups for the integration of syntax and discourse-
pragmatic information.
In this paper, we have provided a processing account for the observed
differences between monolinguals and bilinguals in reference production.
This does not mean, however, that cross-linguistic influences between
Greek and German do not play any relevant role. For example, the
production of overspecific pronouns could be the product of reinterpretation
of Greek null and clitic pronouns as German overt pronouns. Likewise, the
similar use of definite nouns may be motivated by the fact that the morpho-
syntactic structure of definite nouns is very similar across the two languages
(i.e., both Greek and German mark definite nouns for case, gender and
number). Our data do not allow us to disentangle the effects of processing from
cross-linguistic influences of one language on the other. An in-depth
investigation of the interaction between these two factors can only be based
on the comparison between bilinguals speaking languages with similar
referential systems and bilinguals speaking languages with different ones
(cf. Torregrossa et al. submitted for an analysis of reference production in
Albanian-Greek and German-Greek bilinguals).
Before concluding, a last remark is in order. The results reported in this
paper diverge from the ones in Andreou et al. (2015), which show that
bilingual children tend to overspecify. This is surprising given that the
children considered in this paper are a subset of the ones taken into account
by the authors. However, the two studies are not comparable for several
reasons. First, they are based on two different methodologies for data
analysis (Section 3). Moreover, the authors observe overspecification in
association with the production of null subjects and clitics in Greek, which
we do not consider in the present study9. Finally, the analysis by Andreou et
al. (2015) does not discuss whether the bilinguals living in Greece and the
ones living in Germany perform differently. In our future work, we will
investigate to what extent these differences in sociological and ethnographic
measures (e.g., type of bilingualism, schooling, and language use in
different contexts; see Bongartz 2016 on this issue) affect reference
production.
9 We refer to Torregrossa & Bongartz (submitted) for the application of the multi-factorial analysis to reference production in Greek by the same group of bilinguals presented in this paper.
References
Almor, Amit. 1999. Noun-phrase anaphora and focus: The informational