This article was downloaded by: [University of Alabama at Tuscaloosa]
On: 25 August 2014, At: 17:21
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Discourse Processes
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hdsp20
Applying the Landscape Model to Comprehending Discourse From TV News Stories
Mina Lee a, Beverly Roskos-Ewoldsen a & David R. Roskos-Ewoldsen a
a Department of Psychology, University of Alabama
Published online: 11 Dec 2008.
To cite this article: Mina Lee, Beverly Roskos-Ewoldsen & David R. Roskos-Ewoldsen (2008) Applying the Landscape Model to Comprehending Discourse From TV News Stories, Discourse Processes, 45:6, 519-544
To link to this article: http://dx.doi.org/10.1080/01638530802359566
Discourse Processes, 45:519–544, 2008
Copyright © Taylor & Francis Group, LLC
ISSN: 0163-853X print/1532-6950 online
DOI: 10.1080/01638530802359566
Applying the Landscape Model to Comprehending Discourse
From TV News Stories
Mina Lee, Beverly Roskos-Ewoldsen, and David R. Roskos-Ewoldsen
Department of Psychology
University of Alabama
The Landscape Model of text comprehension was extended to the comprehension
of audiovisual discourse from text and video TV news stories. Concepts from the
story were coded for activation after each sequence, creating a matrix of activations
that was reduced to a vector of the degree of total activation for each concept.
In Study 1, the degree vector correlated well with participants’ ratings of how
much the sequence made them think of each concept. In Study 2, the degree
vector, vectors based on the number of activations, and the degree of co-activation
were used to predict participants’ recall. The model predicted recall for the text
version well, but only moderately well for the video version. The Landscape Model
was modified using Dual Code Theory by coding and analyzing audio and visual
information as separate components. It predicted students’ recall well, indicating
its robustness as a model of discourse processing.
Although most studies of discourse comprehension focus on the written or
spoken word, much of the discourse we process is visually based. Face-to-
face conversation involves gestures and facial expressions in addition to the
interlocutors’ shared visual scene; discourse on television or in movies typically
has visuals in the background or foreground, in addition to showing speakers’
gestures and facial expressions. In all these cases, the visual context can be
Correspondence concerning this article should be addressed to Beverly Roskos-Ewoldsen,
Department of Psychology, University of Alabama, 348 Gordon Palmer Hall, Box 870348,
Tuscaloosa, AL 35487-0348. E-mail: [email protected]
important for understanding the discourse. In this study, we examine comprehen-
sion of discourse from television news stories as a focused way of understanding
the relation between linguistic and visual processing more generally.
When people watch a television news story, one of their implicit goals is
to process and understand (i.e., comprehend) the story that they are watching,
much as it is when they read a newspaper story. As discourse research has
shown, comprehending a story involves integrating past knowledge and new
knowledge into a coherent mental model so that the story is meaningful to the
reader or viewer. Research on the reading and comprehension of a written text
has consistently shown that people construct mental models of what they are
reading (Albrecht & O’Brien, 1993; Bower & Morrow, 1990; Garnham, 1997;
Gyselinck & Tardieu, 1999; O’Brien & Albrecht, 1992; Radvansky & Zacks,
1991), as well as mental representations of situations and what is occurring
in those situations (Morrow, Greenspan, & Bower, 1987; Zwaan & Radvansky,
1998). Just as text comprehension relies on the construction of a mental model,
it makes sense that viewing visual stories also involves the construction of a
mental model.
What is known about the nature of such models and how they are constructed?
Most prior research on mass media has not focused on processing,
although there has been some research on the effects of structural changes
in visual information on perception or comprehension of a story (e.g., edits,
panning and zooming, camera angles). For example, one particular area of focus
is an assessment of the effects of formal features of television programs on
processing information. Unlike a print story, a televised story benefits from
its presentational structure. Structural features such as quick scene shifts or
strange camera angles are defined as having perceptual salience as long as those
features attract immediate attention. The salient structural features of television
programs are hypothesized to keep viewers aroused (Singer, 1980). However,
salient structural features do not automatically lead to attention and further
comprehension. Some researchers argue that the comprehension process, rather
than salience, guides attention (Anderson & Lorch, 1983; Collins, 1983; Gunter,
1987; Schmitt, Anderson, & Collins, 1999)—that is, attention to a television
program is not a reactive response to salient elements of a story but is a
comprehension-driven process.
Another area of focus is in the literature on children’s television, where
the visual elements of the story have different effects on memory than the
verbal elements. Television visuals change how children recount a story they
saw, compared to reading or hearing the story. Specifically, children who saw
a story on TV described more visual elements of the story than children who
read or heard the same story (Beagles-Roos & Gat, 1983; Meringoff, 1980),
and they retold the story in a different order than did children who read the
story (Hoffner, Cantor, & Thorson, 1988). The visual presentation of stories
also results in more memory errors in children, such as intrusions of extraneous
information (Hayes, Kelly, & Mandel, 1986).
We argue that a fruitful way to think about visual media effects is to ap-
proach visual media as an extension of verbal discourse (Roskos-Ewoldsen,
Roskos-Ewoldsen, & Carpentier, 2002; Yang, Roskos-Ewoldsen, & Roskos-
Ewoldsen, 2004) and to use the research tools of discourse psychologists to
examine how people mentally represent or process discourse in media stories
(Roskos-Ewoldsen et al., 2002). In so doing, we follow in the tradition of a few
quantitative studies that have explored this issue. For example, in Livingstone’s
(1987, 1989) research on soap operas, participants who regularly viewed a soap
opera rated their agreement with statements about its story line, where the state-
ments represented different perspectives depending on the characters involved.
Livingstone (1987, 1989) found that viewers’ comprehension of a soap opera was
tied to their perceptions of the characters within the soap opera (see also Cohen,
2002). For example, characters were seen as varying along three dimensions:
moral–immoral, mature–immature, and traditional–modern. In another study,
Magliano, Dijkstra, and Zwaan (1996) explored whether sources of information
in narrative films, such as Moonraker (Broccoli & Gilbert, 1979), influenced
whether and when viewers made predictive inferences. The sources examined were
mise en scène (i.e., costumes, lighting, placement of actors and props within a
scene, etc.), montage (i.e., edits within a film), dialogue, and music. Generally, they found that
when these sources of information were available, viewers were more likely to
make accurate predictive inferences than when they were not available. Finally,
Magliano, Miller, and Zwaan (2001) explored the role of changes in different
dimensions within a feature-length movie on perceptions of events in the movie.
Included were changes in time, changes in where the action was occurring, and
changes in the location of characters. When there was a change along one or
more of these dimensions, participants were more likely to say that a new event
had begun.
In this study, we focus on whether models of discourse processing of text can
be extended to processing of discourse in visual media. Specifically, we tested
a well-known model of discourse processing, the Landscape Model (van den
Broek, Risden, Fletcher, & Thurlow, 1996; van den Broek, Risden & Husebye-
Hartmann, 1995; van den Broek, Young, Tzeng, & Linderholm, 1999). As with
the more general mental models framework, the Landscape Model is concerned
with how people generate a coherent understanding of a story. Specifically, it
focuses on how concepts within a story are activated in memory while people
read a text story. The model gets its name from the observation that various
concepts are activated by a story to varying degrees across time (i.e., across the
sentences of the story).
What is unique about the Landscape Model as a model of text comprehension
is that it focuses on coherence by looking at the relation between the online
processing of a story and the memorial representation of that story. By looking
at a participant’s memory for a text, the model takes advantage of the well-
established finding that greater levels of activation of a particular concept result
in greater memory for that concept. Thus, by using the theory’s predictions for
how active various concepts are in memory, one can test whether those concepts
are indeed more likely to be remembered.
In addition, the model recognizes several sources of activations and asso-
ciations in the comprehension process. It assumes that there are four general
sources of concept activation while attending to a story (van den Broek et al.,
1996; van den Broek et al., 1999). First, the immediate environment will activate
concepts in memory. Specifically, concepts within the current sentence (e.g.,
in a book) or a scene (e.g., in a movie) will be activated. Second, because
activation dissipates across time (Higgins, Bargh, & Lombardi, 1985), concepts
from the immediately preceding sentence or scene should still be activated,
albeit at a lower level of activation. Third, concepts from earlier in the story
may be reactivated when they are necessary for maintaining the coherence of the
story. Fourth, world knowledge that is necessary for understanding the story will
be activated. According to the model, the cognitive representation of the story
will reflect these four sources of activation. The model seems to capture one’s
mental representation because it predicts participants’ memory for text-based
stories very well (van den Broek & Gustafson, 1990).
APPLYING THE LANDSCAPE MODEL TO
TV NEWS STORIES
Our strategy was to follow the methods originally used to test the Landscape
Model. In these studies, the stories used to test the Landscape Model have tended
to be short; for example, in one study (van den Broek et al., 1996), the stimulus
story was 13 sentences long and included only 26 concepts. In branching out
to the visual realm, we similarly began relatively simply with a TV news story.
A TV news story has both a video version and a transcript that can serve as
a text-only version. The two different forms of the TV news story (i.e., video,
text) enable a direct comparison of text and video processing, as well as a com-
parison of the text version with previous research using text. However, there are
several complexities involved in preparing a TV news story for testing, including
determining the meaningful units of analysis and identifying the concepts.
To address these complexities, extensive discussion and piloting went into
establishing the structure of the video news story itself, from which a theoretical
landscape of activations was created. In addition, we ensured that the theoret-
ically driven model captured readers’ online processing of concepts in the TV
news story. However, the main thrust of the study was to investigate how well the
Landscape Model (i.e., the theoretically derived activations) predicted recall for
the video version of the TV news story. To preview, the model predicted recall for the text
version very well, but predicted recall for the video version only moderately well.
We then modified the Landscape Model based on Dual Code Theory (Paivio,
1971, 1986, 1991; Sadoski & Paivio, 2001) so that verbal and visual components
of the story were treated separately, and this modification worked well.
Preparing the Stimulus
Video news story. The news story chosen for the study was an excerpt
from the TV show Daybreak, which appears on Cable News Network (CNN).
The story aired on April 4, 2002, and was about a robot exhibition that had
been held in Japan at that time. The story highlights Japan’s latest robotics
innovations, including robots that save lives by detecting land mines; robots
that mimic human expressions, language, and talents; and robots that serve a
therapeutic purpose by responding to human touch. The news clip was 2 min,
10 s long. The transcript of the story was obtained from CNN.
Constructing meaningful units of analysis. The first step in preparing
the stimulus was to decide on the meaningful units of analysis. With text, this is
straightforward because sentences or clauses typically serve as meaningful units.
With TV news, the task is trickier because a meaningful unit in the transcript
does not always correspond with a break in the visual elements of the news story.
For example, the first two sentences of the transcript are, “Move over Madonna?
Maybe not quite yet.” These would typically be treated as two different units in
a text-based story. However, it was impossible to create a clean break between
these two sentences in the video news story. Either the video version clipped the
sentences short so that the end of one sentence and the beginning of the next sentence
seemed cut off too abruptly, or they appeared in both clips, creating redundancy.
As a result, to determine meaningful units in the video version, we classified
units using a combination of sentence structure and edits, such as changes in
camera angle and changes in scene. For example, between the first two sentences
and the third sentence (i.e., “These new entertainment robots made by Sony
aren’t about to win a Grammy, but they are talented.”), there was a shift in
the camera shot from the news reporter reporting from a studio news set to the
robots in action at the exhibition, with the reporter’s voice-over. In this case, the
first two sentences were designated as the first sequence in the story, and the
third sentence was designated as the second sequence. Based on these criteria,
the three authors plus two other graduate students familiar with the Landscape
Model mutually agreed on 19 meaningful units. The same units were used for
the text version of the story because we were interested in a comparison of a
text version and a video version (Appendix A).
Identifying concepts. The next step, following procedures for testing the
Landscape Model, was to identify the concepts that occurred in the story. Con-
cepts were defined as words or images that seemed important for understanding
the story. Identifying the concepts in this news story was made difficult by the
complexity of the story compared with single sentences. To begin the task, the
first author read the transcript, writing down all concepts that came to mind.
Then, the five researchers separately read the transcript while looking at the
original list of concepts; additional concepts were noted. Through subsequent
discussions among the authors, concepts were added, deleted, or combined.
Some concepts became word phrases rather than single words because the
single words did not seem to capture the information in the story and, more
important, if all single concepts had been included there would have been well
over 100 concepts. As an example, the sentences, “Aside from singing and
dancing, they can recognize human faces and names” and “This one recognizes
and follows different colored objects,” produced the compound concepts of
can sing/dance/entertain and can recognize objects, among others. The five
researchers agreed on 54 concepts for the text version (Appendix B).
For the video version, visual information, including background visuals, was
added to the list of concepts using the same procedures as for the text version.
Again, we collapsed single concepts into compound concepts. For example,
we used dog-like-robot rather than dog and robot separately. Including visual
information resulted in a longer list of 79 concepts, agreed on by all five
researchers (Appendix B).
Creating Theoretical Activation Matrices
Coding for degree of activation. Two graduate students, who were oth-
erwise not involved in the project, coded the activation levels of the concepts.
Each coder independently rated both the video and text versions of the story.
The text version appeared as 19 sequences of sentences, each on a separate
page. The video version was edited so that there was a 10-s blank screen after
each sequence. This allowed the rater to pause the video cassette recorder for
the ratings that followed each sequence. No specific order of video and text was
specified.
After reading or watching each sequence, the coders rated each of the con-
cepts (54 for text and 79 for video) for activation using the 5-point rating scheme
introduced by van den Broek et al. (1996). Using this coding scheme, a concept
was assigned a 5 if it was either explicitly mentioned in the dialogue or text
or was visually salient in the scene. A concept was assigned a 4 if it aided in
creating a coherent understanding of the sequence or was causally related to what
was occurring. A concept that acted as an enabler was assigned a value of 3.
Enablers are concepts that enable actions to occur within a scene. For example,
a stairway could be an enabler in the scene in which the robots demonstrate
that they are mobile. Finally, a concept that could be inferred from a scene,
dialogue, or text was assigned a value of 2. If a concept did not fit into any of the
aforementioned categories, it was assigned a 0. A 1 is not used in this coding
scheme to allow for dissipation of inferred concepts (dissipation is described
later). For the video version, coders had a separate page of concepts for each
sequence. For the text version, each coder received a packet with the sequences
and concepts to be rated. Specifically, the first sequence was on the first page,
followed by a page of the concepts to be rated. This was followed by a page
with the second sequence, which was followed by the list of concepts to be
rated, and so on. The coders were instructed not to return to a previous page
once they began to code the concepts.
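The coding scheme can be summarized as a small lookup table. The following is a minimal sketch, not the coders' actual materials; the category labels and the helper `validate_ratings` are hypothetical names introduced only for illustration.

```python
# Activation codes from the 5-point scheme described above
# (after van den Broek et al., 1996). The dictionary keys are
# hypothetical labels chosen for this sketch.
ACTIVATION_CODES = {
    "explicit_or_salient": 5,  # explicitly mentioned or visually salient
    "coherence_or_causal": 4,  # aids coherence or is causally related
    "enabler": 3,              # enables an action within the scene
    "inferred": 2,             # can be inferred from scene, dialogue, or text
    "none": 0,                 # fits none of the categories above
}

VALID_RATINGS = set(ACTIVATION_CODES.values())


def validate_ratings(ratings):
    """Return True if every coder rating is one of the allowed codes.

    A rating of 1 is never assigned by a coder; that value is left free
    so that an inferred concept (2) can later dissipate to 1.
    """
    return all(r in VALID_RATINGS for r in ratings)
```

A check of this kind would flag, for example, a stray rating of 1 in a coder's packet before the matrices are assembled.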
Constructing activation matrices and vectors. An activation matrix was
constructed for each version of the story (video, text) and for each coder. The
matrix represented the amount of activation (i.e., rating) each concept received
after each sequence. Thus, the video version resulted in a 19 × 79 matrix of
activations, and the text version resulted in a 19 × 54 matrix for each coder.
These matrices were then modified because the Landscape Model assumes
that the activation of a concept will dissipate across subsequent sequences if
it is not reinstated. Thus, those concepts that were not reactivated in the next
sequence were assigned a value equal to one half of their value from the previous
sequence. If they were still not reactivated by the story in the sequence after
that, their activation was reduced to 0. If the concept
was reactivated in the next sequence, it received a rating based on the current
sequence; that is, the ratings from one sequence to the next were not additive.
Figures 1 and 2 show the activation matrices for the text and video versions of
the story, respectively.
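The dissipation rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure as implemented: the function name `apply_dissipation` is hypothetical, and the matrix is assumed to be stored as a list of sequences, each holding one coder rating per concept.

```python
def apply_dissipation(raw):
    """Apply the one-step dissipation rule to a matrix of coder ratings.

    raw[t][c] is the rating of concept c after sequence t. A concept that
    is not rated in sequence t but was rated in sequence t - 1 carries over
    at half its previous rating. Because the rule looks only at the coder's
    rating in the immediately preceding sequence, the carried-over value
    drops to 0 in the sequence after that unless the concept is reactivated.
    """
    out = [row[:] for row in raw]  # copy so the raw ratings stay intact
    for t in range(1, len(raw)):
        for c, rating in enumerate(raw[t]):
            if rating == 0 and raw[t - 1][c] > 0:
                out[t][c] = raw[t - 1][c] / 2
    return out


# Two concepts over four sequences (invented ratings).
raw_ratings = [
    [4, 2],
    [0, 0],
    [0, 0],
    [5, 0],
]
dissipated = apply_dissipation(raw_ratings)
```

In this invented example the first concept goes from 4 to 2 to 0 and is then reinstated at 5, and the second goes from 2 to 1 to 0. The latter case shows why the coders never assign a 1 directly.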
The matrices were reduced to concept activation vectors by adding the ac-
tivation level of each concept across sequences, as dictated by the Landscape
Model of text comprehension (van den Broek et al., 1996). This resulted in a
1 × 54 vector for the text version and a 1 × 79 vector for the video version.
Again, there were two vectors per version—one for each coder.
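Reducing a matrix to a concept activation vector amounts to summing each concept's column across sequences. A minimal sketch follows, assuming the same sequence-by-concept layout as the matrices described above; the function name `degree_vector` is a hypothetical label.

```python
def degree_vector(matrix):
    """Sum each concept's activation across all sequences.

    matrix[t][c] is the activation of concept c after sequence t, so a
    19 x 54 matrix yields a vector with 54 entries.
    """
    return [sum(column) for column in zip(*matrix)]


# Three sequences by two concepts (invented values).
example_matrix = [
    [5, 0],
    [2.5, 4],
    [0, 2],
]
example_vector = degree_vector(example_matrix)
```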
Reliability of activation vectors. To assess reliability, we focused on the
activation vectors rather than the full matrices for the simple reason that they
are the primary predictors of story recall (see Study 2). The vectors for each
rater were correlated in both the text version and the video version to determine
an interrater reliability. For the text version, the correlation was r = .92 (p <
.01); and for the video version, it was r = .76 (p < .01). In addition, there
were no significant differences in overall ratings between the coders for each
of the versions. These high correlations and lack of mean differences provide
FIGURE 1 Concept activations for the text version of the news story (number of concepts = 54).
FIGURE 2 Concept activations for the video version of the news story (number of concepts = 79).
evidence that the coding scheme was reliable; therefore, we concluded that the
activation vectors, and by inference the matrices, were reliable. Finding that the
text version afforded more reliable activation ratings than the video version is not
surprising because some people are more likely than others to notice background
visual information. As a consequence, however, the lower reliability of the video
version compared to the text version may result in lower predictive power of
the Landscape Model for predicting story recall.
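The interrater check reported above is a Pearson correlation between the two coders' activation vectors, with each concept serving as one observation. The sketch below shows that computation under the assumption that each vector is a plain list of per-concept totals; `pearson_r` is a hypothetical helper and the data are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Invented activation totals for five concepts from two coders.
coder_a = [12.0, 7.5, 0.0, 20.0, 4.0]
coder_b = [10.0, 9.0, 2.0, 18.5, 3.0]
interrater_r = pearson_r(coder_a, coder_b)
```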
Theoretical matrices and vectors of activations. The coders’ matrices
were averaged and then reduced to vectors for subsequent analyses. Specifi-
cally, their activations for each sequence-concept combination were averaged
separately for the text and video versions, and then collapsed to two vectors—
one for each version.
STUDY 1: ESTABLISHING EXTERNAL VALIDITY OF
THE THEORETICAL MATRIX OF ACTIVATION
The next step was to establish that there was a relation between the theoretical
activations, based on the matrices formed by the coders, and actual participants’
activations (i.e., empirical activations) for both the text and video versions of
the story. This relation is important for two reasons. First, it provides external
validity for the theoretical matrix. A strong correlation between the theoretical
matrices and the empirical matrices would indicate that the theoretical activa-
tion matrix is adequately capturing the participants’ mental models. Second, it
provides further evidence that the coding scheme, and therefore the activation
matrix, is reliable.
Method
Participants. Thirteen students were recruited from introductory mass com-
munication courses. They received either extra credit or credit toward a course
requirement for their participation. Seven of the students were assigned to the
text version of the story, and 6 were assigned to the video version. None of
the participants in this study reported having seen the news story previously. A
power analysis (Faul & Erdfelder, 1992) indicated that 6 to 7 participants would
be enough to detect a correlation of r = .76 (from the interrater reliability of the
video version), with alpha = .05 and power of around .80 (.76, .84). However,
the more important power analysis used the number of concepts as the unit of
analysis. In this case, 54 and 79 are enough units to detect an effect size as
small as r = .40. From this perspective, the critical question is whether 6 to 7
participants would be enough to establish reliable findings. It was, as seen later.
Materials and procedure. The text and video versions of the story were
used. After each sequence, participants rated on an 11-point scale how much the
sentence or scene made them think of each of the concepts. The scale ranged
from 0 (not at all) to 10 (a lot), and was modeled after van den Broek et al.’s
(1996) scale. Participants in the video version received a packet of 19 concept
lists, 1 for each sequence. Participants in the text version received a packet with
sequences and concept lists interwoven, and were instructed not to return to a
previous page once they had begun rating the concepts. The procedure lasted no
longer than 1 hr.
Results and Discussion
Empirical activation matrices. The ratings for each sequence-concept com-
bination were averaged across participants. This resulted in two averaged acti-
vation matrices—one for the video version and the other for the text version. It
was not necessary to modify the empirical matrices to account for dissipation
because of the way the rating scale is designed—that is, if a person is thinking
less about a concept on its n + 1 sequence, he or she will assign it a lower value.
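Averaging across participants is an element-wise mean over the participants' sequence-by-concept matrices. The following is a sketch under the assumption that each participant's ratings are stored as a list of sequences, each a list of 0 to 10 ratings; `average_matrices` is a hypothetical name.

```python
def average_matrices(matrices):
    """Element-wise mean of several equally shaped rating matrices.

    matrices[p][t][c] is participant p's rating of concept c after
    sequence t. The result has one row per sequence and one column
    per concept.
    """
    n = len(matrices)
    return [
        [sum(values) / n for values in zip(*rows)]
        for rows in zip(*matrices)
    ]


# Two participants, two sequences, two concepts (invented ratings).
participant_1 = [[0, 10], [4, 6]]
participant_2 = [[2, 8], [0, 2]]
averaged = average_matrices([participant_1, participant_2])
```

No dissipation step is applied to the averaged matrix, in keeping with the reasoning given above.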
Reliability of empirical matrices. The matrices generated by each partici-
pant were reduced to concept activation vectors by adding the activation level of
each concept across sequences. Activation vectors were used because they are
the primary predictors of recall (Study 2). To assess the internal reliability of the
matrices, Cronbach’s alpha was calculated for both the text and video versions,
with the concepts as the cases and the participants as the items. Reliability was
calculated in this way because analyzing simple correlations (and means) for 6
to 7 students was unwieldy. For the text version, Cronbach’s alpha was .89; and
for the video version, it was .83. These results indicate high internal reliability
for the two matrices and show that 6 to 7 participants were enough to obtain
stable data.
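For readers unfamiliar with this arrangement of the data, Cronbach's alpha with concepts as cases and participants as items can be computed from the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of case totals), where k is the number of items. The sketch below assumes one row per concept and one column per participant; the function name and data are illustrative only.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a cases-by-items table.

    scores[i][j] is the activation total of concept i (the case) for
    participant j (the item). Sample variances (n - 1 denominator) are
    used throughout.
    """
    k = len(scores[0])  # number of items, here participants
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Four concepts rated by three participants (invented totals).
example_scores = [
    [30, 28, 33],
    [12, 15, 10],
    [5, 4, 8],
    [22, 20, 25],
]
example_alpha = cronbach_alpha(example_scores)
```

When participants order the concepts in the same way, alpha approaches 1. With 6 or 7 participants as items, this is a more compact summary than inspecting every pairwise correlation.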
Relation between theoretical and empirical activation matrices. To
analyze the relation between the theoretical and empirical activation matrices,
the participants’ matrices were averaged and then reduced to vectors by adding
across sequences. Specifically, their ratings for each sequence-concept com-
bination were averaged separately for the text and video versions, and then
collapsed to two vectors—one for each version. Next, the theoretical vectors
were correlated with the empirically derived vectors. For the text version, the
correlation was r = .66 (p < .01); and for the video version, it was r = .70
(p < .01).
Discussion. These are strong correlations, and they indicate that the the-
oretical activation matrices based on the coders’ ratings captured the empirical
activation matrices based on participants’ thoughts about the concepts. In other
words, the theoretical activation matrices have external validity. This is important
because the theoretical model is being used to predict participants’ recall of the
video news story, and a strong relation between the theoretical and empirical
models bodes well for this task.
STUDY 2: PREDICTING RECALL
In this main study, the theoretical activation matrices are used to predict free
recall of the news story, thus determining how well the Landscape Model of
text comprehension applies to a story that has substantial visual components.
As a comparison, a study using the Landscape Model to predict free recall for
a text story found that 64% of the variance in recall was accounted for by the
theoretical activation matrix (van den Broek et al., 1996). This level is used as a
guide to judge whether the Landscape Model is applicable to TV news stories.
Three attributes of the activated concepts are used by the Landscape Model
to predict memory for the news story: the number of times a concept is acti-
vated, the degree to which a concept is activated, and the associations formed
between simultaneously activated concepts. According to the model, each of
these attributes reflects a unique aspect of the mental representation of a story.
First, the number of times a concept is activated within a story is expected to
influence recall of that concept: a concept that is activated fewer times within
the story should be remembered less well than a concept that is activated more
often. Second, the degree to which a
concept is activated across all the scenes of the story is expected to influence
recall of that concept because concepts with more overall activation are more
likely to be recalled than concepts with lower levels of overall activation. Finally,
associations between concepts that are activated simultaneously should influence
the likelihood that a concept is recalled. This is because two concepts that are
activated simultaneously will be linked in memory, and concepts with more
linkages in memory are assumed to be more likely to be recalled than concepts
with fewer linkages.
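The three attributes can be illustrated with a short sketch that derives one value per concept from an activation matrix. The operationalizations here are assumptions made for illustration and are not taken from the procedures of van den Broek et al. (1996): a concept is counted as activated whenever its value exceeds 0, and co-activation is taken as the sum, over sequences, of the products of a concept's activation with every other concept's activation. All function names are hypothetical.

```python
def activation_counts(matrix):
    """Number of sequences in which each concept has activation above 0."""
    return [sum(1 for v in column if v > 0) for column in zip(*matrix)]


def activation_degrees(matrix):
    """Total activation of each concept summed across sequences."""
    return [sum(column) for column in zip(*matrix)]


def coactivation_strengths(matrix):
    """Assumed co-activation index for each concept.

    For each sequence, a concept's activation is multiplied by the summed
    activation of all other concepts in that sequence, and the products
    are accumulated across sequences.
    """
    strengths = [0.0] * len(matrix[0])
    for row in matrix:
        row_total = sum(row)
        for i, activation in enumerate(row):
            strengths[i] += activation * (row_total - activation)
    return strengths


# Two sequences by three concepts (invented activations).
toy_matrix = [
    [5, 4, 0],
    [0, 2, 2],
]
counts = activation_counts(toy_matrix)
degrees = activation_degrees(toy_matrix)
links = coactivation_strengths(toy_matrix)
```

In the toy matrix the second concept is activated in both sequences and co-occurs with both other concepts, so it receives the highest value on all three attributes.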
Method
Participants. Thirteen students who had not participated in Study 1 were
recruited from introductory mass communication courses. They received either
extra credit or credit toward a course requirement for their participation. Seven
of the students were assigned to the text version of the story, and 6 were assigned
to the video version. None of the participants in Study 2 reported having seen
the news story previously.
Materials and procedure. Participants watched the unedited version of the
video news story or read the unedited version of the text presented in transcript
form. Next, they completed an unrelated task for 10 min to avoid memory
recency effects. Finally, participants wrote down as much of the story as they
could recall. They were not told before reading or viewing the story that they
would have to recall the story. They were given as much time as they needed
during recall. The entire procedure lasted 30 min.
Results and Discussion
Recall data. Overall, participants recalled fewer concepts from
the text-based version of the story (M = 8.00 out of 54 concepts, SD = 3.51)
than from the video version of the story (M = 12.67 out of 79 concepts, SD =
3.56), t(11) = 2.37, p = .04. Although it appears that the research participants
had better memory for the video version of the story, there were more concepts
in this version. When the proportion of possible items was analyzed, there was
no difference in recall between the text version (M = 0.15, SD = 0.65) and
the video version of the story (M = 0.16, SD = 0.45), t(11) = .03, p = .98.
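The proportions reported above can be checked directly from the mean counts and the number of codable concepts in each version. The short sketch below only restates that arithmetic; the variable names are illustrative and not part of the original analysis.

```python
# Mean number of concepts recalled and total codable concepts, as reported above.
mean_recalled = {"text": 8.00, "video": 12.67}
total_concepts = {"text": 54, "video": 79}

# Proportion of possible concepts recalled in each version.
proportions = {
    version: mean_recalled[version] / total_concepts[version]
    for version in mean_recalled
}

for version, proportion in proportions.items():
    print(f"{version}: {proportion:.2f}")
```

Dividing by the number of available concepts removes the advantage that the video version gains simply from containing more concepts.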
Creating the recall vector. Two new coders not otherwise involved in
the project coded the recall data. Each coder coded each participant’s written
response for the number of times each of the concepts (54 concepts for text, 79
concepts for video) was included in the response. Thus, each participant had a
vector of the number of times each concept was recalled. After discussion, the
coders achieved 100% agreement for both the text and video versions of the
story for each participant.
To create the set of recall vectors, the number of times each concept was
recalled was added across all participants, resulting in two recall vectors—one
for the text version and one for the video version.
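As a minimal sketch of this aggregation step, assume the coded data for one story version are stored as a participants-by-concepts array of counts. The function name and the toy data are hypothetical and are not the study's data.

```python
import numpy as np


def recall_vector(counts):
    """Sum per-participant recall counts into one vector for a story version.

    counts has shape (n_participants, n_concepts); entry [p, c] is the number
    of times participant p included concept c in the written recall.
    """
    counts = np.asarray(counts)
    return counts.sum(axis=0)


# Hypothetical toy data: 3 participants, 4 concepts.
toy_counts = [
    [1, 0, 2, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
]
toy_recall = recall_vector(toy_counts)
```

The same function would be applied once to the text group (54 concepts) and once to the video group (79 concepts).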
Creating three types of activation vectors. Three sets of theoretical
activation vectors were formed using the procedures by van den Broek et al.
(1996). Each set contained a vector for the text version and a vector for the
video version. The first set of vectors comprised the theoretical activation matrix
described earlier. This vector set is called the degree of activation vector set. The
other two sets of vectors used these activation matrices as a base. For one set,
activation was recoded in an all-or-none fashion based on whether a concept
received any activation during each of the 19 sequences. A concept received
a 1 if there was any activation and a 0 if there was none. These 1s and 0s
were summed across sequences, resulting in a number of activations vector set
representing the total number of times each concept was activated. The final set
of vectors measured how much co-activation occurred among the concepts. This
co-activation represents a particular concept’s associative strength with other
concepts. To calculate the amount of co-activation between a given concept and
all other concepts, the following formula was used, based on van den Broek
et al. (1996):
\[
S(x, y) = \sum_{i=1}^{I} \sum_{y=1}^{Y} A_{xi} A_{yi}
\]
$S(x, y)$ is the strength of co-activation between a concept $x$ and all other
concepts $y$; $i$ represents the sequence ($I = 19$); $A_x$ is the activation of
concept $x$, and $A_y$ represents the activations of each of the other concepts
$y$ ($Y = 54 - 1$ for text and $Y = 79 - 1$ for video). In other words, for a
given concept, one
begins with the first sequence and multiplies the activation of the concept with
each of the other concepts’ activations from that sequence separately, and then
sums the products. Once this is completed, the co-activations are summed across
sequences. This two-step procedure is repeated for each of the concepts. The
resulting vector set is called the association vector set.
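The three vector types described above can be sketched as follows. This is an illustration rather than the authors' code. It assumes the activation matrix is stored with concepts as rows and sequences as columns, that activations are non-negative (so "any activation" means a value above zero), and that the degree vector is the activation summed across sequences. The function name and toy matrix are hypothetical.

```python
import numpy as np


def activation_vectors(A):
    """Return (degree, number, association) vectors from an activation matrix.

    A has shape (n_concepts, n_sequences); A[x, i] is the activation of
    concept x during sequence i.
    """
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1)        # total activation across sequences
    number = (A > 0).sum(axis=1)  # count of sequences with any activation
    # Co-activation: within each sequence, multiply a concept's activation by
    # the summed activation of all *other* concepts, then sum over sequences.
    others = A.sum(axis=0, keepdims=True) - A
    association = (A * others).sum(axis=1)
    return degree, number, association


# Hypothetical toy matrix: 3 concepts, 3 sequences.
toy_A = np.array([
    [5, 0, 2],
    [3, 4, 0],
    [0, 1, 1],
])
degree, number, association = activation_vectors(toy_A)
```

Subtracting a concept's own activation from the column total is equivalent to multiplying it by each other concept separately and summing the products, which is the two-step procedure the text describes.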
Correlations among the activation vectors. Correlations among these
vectors were high. For the text version, the correlations were r = .83 (degree,
number), r = .96 (degree, association), and r = .82 (number, association); all
ps < .01, N = 54. For the video version, they were r = .85 (degree, number),
r = .99 (degree, association), and r = .86 (number, association); all ps < .01,
N = 79.
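Pairwise correlations of this kind can be obtained from the three vectors in one call. The sketch below uses made-up vectors for five concepts purely to show the computation; the values are not the study's data.

```python
import numpy as np

# Hypothetical activation vectors for five concepts.
degree = np.array([7.0, 7.0, 2.0, 10.0, 0.0])
number = np.array([2.0, 2.0, 2.0, 3.0, 0.0])
association = np.array([17.0, 19.0, 6.0, 40.0, 0.0])

# Each row is treated as one variable; r[i, j] is the Pearson correlation
# between row i and row j (order: degree, number, association).
r = np.corrcoef(np.vstack([degree, number, association]))
```

High off-diagonal values in such a matrix are what signal the multicollinearity among the predictors.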
Predicting recall. The following analyses are a direct test of the model’s
hypothesis that the activation of a concept while watching the news story con-
tributes to the formation of a stable memorial representation of the story, which
can be recalled at a later time (van den Broek et al., 1996). Two regression
equations were computed—one for the text version and one for the video
version. In both cases, the recall vector was the criterion variable; and the degree,
number, and association vectors were treated as the predictor variables. All three
were entered into the analysis simultaneously because of the multicollinearity of
the vectors (Cohen, Cohen, West, & Aiken, 2003). Multicollinearity influences
the interpretation of individual betas such that the relative influence of each
component cannot be determined. However, this is not perceived to be a problem
because the focus of the study is how well the Landscape Model predicts recall
for the story and not the relative influence of the components of the model.
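A simultaneous-entry regression of this form can be sketched with ordinary least squares. The code below is an assumed, minimal implementation that z-scores the predictors and the criterion so that the coefficients are standardized beta weights. The function name and the noise-free toy data are hypothetical, and the original analysis was presumably run in a statistics package.

```python
import numpy as np


def standardized_regression(X, y):
    """Regress y on all columns of X at once; return (betas, r_squared)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # z-score predictors and criterion so the coefficients are beta weights;
    # with centered variables no intercept term is needed.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residuals = yz - Xz @ betas
    r_squared = 1.0 - (residuals @ residuals) / (yz @ yz)
    return betas, r_squared


# Hypothetical toy data: 54 "concepts", 3 predictors, noise-free criterion.
rng = np.random.default_rng(0)
toy_X = rng.normal(size=(54, 3))
toy_y = toy_X @ np.array([1.0, -0.5, 0.2])
toy_betas, toy_r2 = standardized_regression(toy_X, toy_y)
```

With highly correlated predictors the overall R² remains interpretable even though the individual betas are unstable, which is the point made in the text.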
For the text version, the model predicted recall well (R² = .86, p <
.01), F(3, 50) = 98.61, p < .001. The beta weights for degree, number, and
association were 1.47, −0.57, and −0.14, respectively.¹ Caution should be used
in interpreting these beta weights because of the multicollinearity. Consistent
with past research, the Landscape Model does an excellent job of predicting
participants’ memory for a text-based story. For the video version, the model
did not fare as well, although it still predicted 32% of the variance in recall
(R² = .32, p < .01), F(3, 75) = 11.81, p < .001.² Beta weights for degree,
number, and association were 1.61, −0.43, and −0.71, respectively. Because of
the multicollinearity among the predictor variables here and elsewhere, caution
must be used when interpreting these weights.
Although the Landscape Model does an acceptable, reliable job of predicting
participants’ memory for the TV news story, its predictive ability is much higher
for the text story than for the video story. This result raises the question of
whether the Landscape Model can be revised to predict the video recall data
better.
Before modifying the model, however, one can ask whether it is simply
misspecifying the visual components. Perhaps if only the concepts that the two
versions had in common were tested, the Landscape Model would predict recall
equally well. To test this, we used only the 54 concepts that both the text
and the video versions had in common. In many respects, these versions were
very similar. First, although the video version had higher overall activation than
the text version in degree, number, and association (all ps < .05), the degree,
number, and association vectors for the two versions were correlated: rs = .63,
.62, and .63, respectively; all ps < .001. Second, in terms of the average number
of times each concept was recalled, the correlation between the two versions
was high (r = .93, p < .001). Further, the means were statistically equivalent:
text, M = 0.24, SD = 0.56; and video, M = 0.32, SD = 0.71; t(53) =
1.91, p = .06; although there is a trend for higher recall in the video version
than in the text version. Regression analyses revealed a different story. A new
regression analysis for the reduced 54-concept video version produced R² = .45,
F(3, 50) = 13.42, p < .001, which is higher than for the 79-concept video
version but still far below that for the text version (R² = .86). The beta
weights for degree, number, and association were 0.89, −0.46, and 0.16,
respectively. Therefore,
although recall for the concepts that both versions had in common was the same
and the theoretical models were similar, the model for the video story still did
not predict recall as well as the model for the text story. In fact, when the model
for the text version is used to predict recall for the video version, the model
predicts recall quite well, R² = .83, F(3, 50) = 81.40, p < .001 (βs = 1.57,
−0.51, and −0.30, respectively).
¹Another way to analyze the data is to conduct a forward stepwise regression, which adds the
variable that accounts for the most variance first; and if another accounts for additional variance, it
is added next. In this analysis, degree predicts R² = .75 of the variance (β = 1.35), F(1, 52) =
156.72, p < .001; and number accounts for another R² = .10 (β = −0.58), Fchange(1, 51) =
35.89, p < .001, for a total of R² = .85, F(2, 51) = 148.88, p < .001. Association did not
account for any additional amount of variance.
²A forward stepwise regression analysis shows that degree predicts R² = .27 of the variance
(β = 0.91), F(1, 77) = 29.01, p < .001; and number accounts for another R² = .04 (β = −0.44),
Fchange(1, 76) = 4.78, p < .03, for a total of R² = .32, F(2, 76) = 17.61, p < .001. Association
did not account for any additional amount of variance.
One could argue that we should stop here. However, this would miss the
critical point that online (i.e., sequence by sequence) processing of video news
stories does not capture comprehension of the video story nearly as well as
online processing of text news stories does. It is important to understand what
was being processed while participants watched the video and why the model
did not predict their recall well. Further, many videos rely more heavily on
visual information than does a news story. In these cases, meaning often results
mostly from the visuals, and coding text only would not capture that. Therefore,
we pursued the question of how the online processing of video news stories
differed from the online processing of text news stories, even when the concepts
were identical. We believe that the answer derives from the basic differences in
processing text and pictures or images or, more specifically, representing verbal
and visual information in memory (e.g., Kosslyn, 1994; Paivio, 1986; Palmer,
1999; see also Baggett, 1989; Farah, 1989; Levie, 1987; Moliter, Ballstaedt, &
Mandl, 1989).
A primary theory that focuses on the independence of visual and verbal
information is Dual Code Theory (Paivio, 1971, 1986, 1991; Sadoski & Paivio,
2001). According to Dual Code Theory, there are two types of representa-
tions in memory—one verbal and the other nonverbal. Within each type of
representation, concepts are represented as nodes (i.e., logogens in the verbal
system and imagens in the nonverbal system) and are connected to each other
in an associative structure that differs for verbal and nonverbal representations.
Further, verbal and nonverbal representations are independent of each other,
and the activations of concepts that are dually coded are additive. For example,
assume that the visual element of the story involves a robot going up some stairs
and that there is a reporter voice-over saying, “Robots can be taught to climb
stairs.” The activations of the concept “robot” from the verbal and nonverbal
representations would be added together to create a higher level of activation
of the concept “robot” than either the verbal element or the nonverbal element
of the story alone. Conversely, activation of the concept “reporter” would not
involve adding activation from both the verbal and nonverbal codes because the
reporter is only heard during this segment. Its activation would derive only from
the verbal system. As a result, dually coded information is predicted to have
higher levels of overall activation, which, in turn, lead to better memory for the
information.
In our study, it could be the case that recall participants in the video group
comprehended and remembered only the verbal aspects of the news story, and
that is why their recall matched the text group so well. In this case, a verbal-
only coding of the 79 concepts would suffice for determining recall for the
video version. On the other hand, the recall participants in the video group may
have comprehended and remembered both visual and verbal aspects of the news
story, but the verbal aspects of the news story outweighed its visual aspects;
this possibility cannot be tested with the current data because the expert coders
rated the visual and verbal aspects of the news story simultaneously. A third, but
unlikely, possibility is that the visual components of the news story outweighed
the verbal information in terms of predicting recall. Perhaps the reason that the
text version of the Landscape Model predicted the 54 concepts so well is that
these concepts had strong verbal components in the news story. However, the text
version cannot capture the concepts that are only visual, and maybe a visual-only
model would predict recall for the entire 79 concepts better than a verbal-only
model.
To test these possibilities, it is necessary to code the verbal aspects of the
news story separately from the visual aspects. In doing so, we incorporated
two key aspects of Dual Code Theory. First, to capture the separate nature of
verbal and nonverbal representations, we coded activation for visual information
(visual only) separately from verbal information, which in our case is audio
in nature (audio only). Second, reflecting the additive nature of activation in
the two representational systems, we used each type of activation as separate
predictors in a new regression equation, using the video recall data described
earlier as the criterion variable.
There were two other aspects of Dual Code Theory we did not include to keep
the analysis as simple as possible, at least to begin with. First, we decided not
to include qualitatively different kinds of associations for verbal and nonverbal
information. Instead, co-activations for verbal and nonverbal information were
calculated in the same manner. Second, we chose to include only co-activations
within a type of representation (i.e., associations within the verbal and nonverbal
systems), rather than including co-activations across types of representations
(i.e., referential connections).
Creating audio-only and visual-only activation vectors. The coders from
Study 1 recoded the video news story to obtain the separate visual-only and
audio-only activation matrices for the news story. For the visual-only version,
coders watched a sequence of the news story with the volume off, paused
the news story during the blank screen at the end of the sequence, and then
coded how much each of the 79 concepts was activated by the visual images in
that sequence. This was repeated for each of the 19 sequences. For the audio-
only version, coders listened to a sequence in the audio story, with the visual
information blocked. They paused the story at the end of each sequence, and
then coded how much each of the 79 concepts was activated by the sound portion
of each segment. There was no order specified for coding the visual and audio
information. There was a lag of around 1 month between the original coding
and the present coding.
Two sets of activation matrices were created—one for the audio-only version
of the news story and the other for the visual-only version. Within each set
were the two coders’ activation matrices. The coders’ activation matrices were
modified for the dissipation of activation over time and then averaged. This
resulted in a single degree of activation matrix for the audio-only version and
another for the visual-only version. These matrices were each reduced to vectors
by summing the activations across sequences.
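The averaging and reduction steps can be sketched as below. The sketch deliberately omits the modification for dissipation of activation over time, because the rule for it is not given in this section. The function name and toy matrices are hypothetical.

```python
import numpy as np


def degree_vector_from_coders(coder_a, coder_b):
    """Average two coders' activation matrices, then sum across sequences.

    Each matrix has shape (n_concepts, n_sequences). The dissipation
    adjustment mentioned in the text is NOT applied here.
    """
    a = np.asarray(coder_a, dtype=float)
    b = np.asarray(coder_b, dtype=float)
    mean_matrix = (a + b) / 2.0
    return mean_matrix.sum(axis=1)


# Hypothetical toy matrices: 2 concepts, 2 sequences.
toy_coder_a = [[4, 0], [2, 2]]
toy_coder_b = [[2, 0], [4, 0]]
toy_degree = degree_vector_from_coders(toy_coder_a, toy_coder_b)
```

The same reduction would be run separately on the audio-only and the visual-only matrices to obtain the two degree vectors.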
The activation matrices and, consequently, the activation vectors showed
different configurations of activation (Figures 3 and 4, audio only and visual
only, respectively). A correlation between the visual-only and audio-only degree
vectors was reliable but moderate in strength (r = .36, p = .001),
corroborating the differences between audio and visual online processing seen
in Figure 2. For the number and association vectors, the correlations between
the visual-only and audio-only versions were r = .07 (p = .27) and r = .32
(p = .002), respectively.
Predicting recall. The participants’ recall vector from the video version
of the news story was used as the criterion variable, and the six activation
vectors were used as the predictor variables in a regression analysis. As in
the original analysis, the six predictor variables were entered into the analysis
simultaneously because of the multicollinearity of some of the vectors, and
interpreting individual beta weights for each variable requires caution. The new
model predicted recall very well: R² = .73, F(6, 72) = 32.51, p < .001.³ The
beta weights for all predictor variables except visual-only number and audio-
only association were reliably different than zero (audio only: 0.87, −0.26, and
0.04; visual only: 2.72, 0.06, and −2.52; all ps < .04). The new model was
on par with a previous test of the Landscape Model for recall of text material
(R² = .64; van den Broek et al., 1996), and accounted for much more of the
variance in recall of the video version of the news story than the original model
(R² = .32). In comparison, the original model accounted for R² = .86 of the
variance in the text version of the news story.
³A forward stepwise regression analysis shows that audio-only degree predicts 64% of the
variance, F(1, 77) = 138.36, p < .001; and visual-only degree accounts for another 4%,
ΔF(1, 76) = 10.77, p = .002. Visual-only association added another 2.2%, ΔF(1, 75) = 5.79,
p = .02; and audio-only number added 1.9%, ΔF(1, 74) = 5.27, p = .02. Visual-only number
and audio-only association did not account for any additional amount of variance. For the total
model, F(4, 74) = 49.67, p < .001.
FIGURE 3 Concept activations based on audio-only information of the video news story
(number of concepts = 79).
FIGURE 4 Concept activations based on visual-only information of the news story
(number of concepts = 79).
To test the three possibilities described earlier, we performed three additional
regression analyses. First, we used the audio-only vectors to predict recall. In
this case, R² = .66, F(3, 75) = 48.87, p < .001 (βs = 1.10, −0.26, and
−0.08 for degree, number, and association vectors, respectively). For the second
possibility, we entered the audio-only vectors into the equation first, followed
by the visual-only vectors. The audio-only vectors produced the same results
as the first hypothesis, of course; but the visual-only vectors added reliably
to the variance accounted for, ΔR² = .07, ΔF(3, 72) = 6.13, p = .001.
For the overall model, R² = .73, F(6, 72) = 32.51, p < .001 (audio-only
βs = 0.87, −0.26, and 0.04; visual-only βs = 2.72, 0.06, and −2.52). For the
third hypothesis, we used the visual-only vectors to predict recall: R² = .33,
F(3, 75) = 12.18, p < .001 (βs = 5.20, −0.04, and −4.71). It appears that
participants’ memorial representation primarily reflected the verbal information
in the video. Their memorial representation also reflected the visual information,
but this information was overwhelmed by the verbal information.
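The increment test in the second analysis follows the standard F-change formula for nested regression models. The sketch below applies that textbook formula to the rounded R² values reported above; the function name is illustrative. Because the inputs are rounded to two decimals, the result (about 6.2) is close to, but not identical with, the reported ΔF of 6.13.

```python
def f_change(r2_full, r2_reduced, df_added, df_residual):
    """F statistic for the R-squared gained by adding a block of predictors.

    df_added is the number of predictors added; df_residual is the residual
    degrees of freedom of the full model.
    """
    gained = (r2_full - r2_reduced) / df_added
    unexplained = (1.0 - r2_full) / df_residual
    return gained / unexplained


# Audio-only model (R² = .66) versus audio-only plus visual-only (R² = .73),
# with 3 added predictors and 72 residual degrees of freedom.
f_visual = f_change(0.73, 0.66, 3, 72)
print(round(f_visual, 2))
```

The same function gives the overall model F when the reduced model is taken to have R² = 0 and df_added equals the total number of predictors.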
GENERAL DISCUSSION
The results of these studies suggest that the mental models approach, and the
Landscape Model in particular, provide a viable way to study how viewers
process and comprehend video stories like TV news stories. However, video
stories, unlike text stories, are harder to study using the Landscape Model,
in part because there are no simple units to study. Unlike a written or auditory
story that is told in sentences, video stories often have no such easily identifiable
units of meaning. Instead, structural features, such as camera angles and scene
changes, need to be incorporated when identifying the units of meaning in the
story. However, a change in camera angle does not always signal the end of one
unit and the beginning of the next. For example, a dialog between two people
may have several changes in camera angle and yet one would not call each
camera angle a unit of information (Lang, Bradley, Park, Shin, & Chung, 2006).
However, these difficulties can be overcome, at least with TV news stories.
In the two studies presented here, we investigated how well models of text
comprehension could be extended to comprehension of media stories that contain
both visual and verbal components—in particular, TV news stories. Empirical
findings have been accumulating in the mass media literature on factors that
affect comprehension of and memory for print news stories including visual
presentation styles (Gunter, 1987; Gunter, Furnham, & Griffiths, 2000), prior
knowledge of topics (Robinson & Davis, 1986), and gender and age (Robinson
& Levy, 1986). However, there has been very little research on how people build
mental models of a visual media story (Roskos-Ewoldsen et al., 2002). The
Landscape Model, as an example of the broader category of mental models, has
been used successfully to capture the online processing of text, thus predicting
comprehension of and subsequent recall for text news stories (van den Broek
et al., 1996, 1999).
In these studies, we adapted the Landscape Model for use with TV news
stories to test its robustness as a model for comprehension of a video story. We
created theoretically based landscapes of activation for text and video versions
of a TV news story, based on the Landscape Model of text comprehension. The
theoretically driven landscapes were found to capture the moment-to-moment
thoughts of participants as they read or watched the TV story (Study 1). In
Study 2 we used the theoretically based landscapes to predict recall of the
text and video versions of the TV story. For the text version, the landscape
of activation accounted for an impressive 86% of the variance in participants’
recall. However, for the video version of the story, the landscape accounted
for only 32% of the variance in recall. Although 32% is still remarkable, it
is clearly not as impressive as is the variance in recall for text stories (86%
here, 64% in van den Broek et al., 1996). The discrepancy between the text and
video versions suggested that comprehension of TV news stories differs from
the comprehension of text stories.
The question is, how are they different? Our results indicated that our par-
ticipants paid most attention to the verbal aspects of the news story. To the
extent that our participants are typical, this suggests that the visual aspects
of a TV news story are represented in memory but are overwhelmed by the
verbal representations. However, these representations are not identical. This
is supported by the findings when our expert coders coded the verbal (i.e.,
audio) information separately from the visual information, and each kind of
information was treated as a separate variable in predicting recall for the video
story. These independent ratings implemented Dual Code Theory, which is a
theory of memory positing that there are two different representational codes in
memory—one for verbal information and one for nonverbal information (Paivio,
1986, 1991; Sadoski & Paivio, 2001). When visual and verbal information were
treated separately, they accounted for 73% of the variance in recall. This result is
on par with a previous study of text comprehension (van den Broek et al., 1996).
A follow-up question is why the original theoretical coding of the video did
not predict recall of the video story. The answer appears to be that the expert
coders were paying more attention to the visual aspects of the video story than
those who were recalling the video story. In our view, this discrepancy reflects
a bottleneck in the experts’ coding of activations, rather than a bottleneck in
the activations themselves. The discrepancy is similar to the discrepancy in
investigations of iconic memory (Sperling, 1960). When participants were asked
to recall all of the alphanumeric characters within a very briefly presented matrix
of characters (whole report method), recall was poor (25%). However, when
participants were cued to report a specific part of the matrix (partial report
method), recall was much higher (75%). Regarding our results, it was likely
difficult for our coders to report immediately both visual and auditory activations
simultaneously (whole report); but when they could focus on a particular type
of information (partial report), their coding adequately captured activations.
Participants who watched the news story and later recalled the story were able
to consolidate both sources of information into longer term memories in both
the verbal and nonverbal systems, resulting in the ability to recall both verbal
and nonverbal information, albeit to different extents (i.e., audio only captured
66% of the variance in recall, whereas visual only added an additional 7% of
the variance). Of course, this answer is speculative, and more research is needed
to test it.
Although the verbal and visual online processing of the video story we used
predicts recall for the video story well, TV news stories are similar in many ways
to text stories and to discourse found in textbooks. Discourse in TV news stories,
like text discourse, has an introduction and then a series of facts connected
by transitions, followed by a conclusion. There is little inferencing occurring
other than to which object a pronoun refers (e.g., “it” refers to “the robot”). In
particular, there is little reason to predict future events or news stories (forward
inferencing) and almost no reason to use current information to understand
previous news stories (backward inferencing). As a result, perhaps it should not
be surprising that a combination of verbal and visual online processing was able
to predict recall well.
However, it remains to be seen how well the online processing suggested
by the Landscape Model predicts more complex video stories, such as TV
dramas or movies. In these situations, there is reason to predict future events
or reinterpret past events. Thus, inferencing, both forward and backward, is
much more important. Further, not only is a story (or several stories) told
within a single episode of a TV series, and a single movie within a genre
of movies, but stories cross episodes or movies within a genre—that is, there
is intertextual discourse that needs keeping track of, in addition to intratextual
discourse (Roskos-Ewoldsen, Roskos-Ewoldsen, Yang, & Lee, 2007; see also
Eco, 1990, 1992). In addition, many dramas have emotional content that is not
obviously incorporated into the Landscape Model. For some episodes or series,
this may be very important for predicting recall; for others, it may not be.
It may be that people process different genres of stories differently (e.g.,
Zwaan, 1994); or, it may be that movies require a higher level of organization
than simple unit-by-unit activations, such as clustering activations into events,
to capture recall. We suspect that the event indexing model is a good place to
start identifying these higher order events (Zwaan, Langston, & Graesser, 1995;
Zwaan, Radvansky, Hilliard, & Curiel, 1998).
REFERENCES
Albrecht, J. E., & O’Brien, E. J. (1993). Updating a mental model: Maintaining both local and
global coherence. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19,
1061–1070.
Anderson, D. R., & Lorch, E. P. (1983). Looking at television: Action or reaction? In J. Bryant & D.
Anderson (Eds.), Children’s understanding of television: Research on attention and comprehension
(pp. 1–34). New York: Academic.
Baggett, P. (1989). Understanding visual and verbal messages. In H. Mandl & J. R. Levin (Eds.),
Knowledge acquisition from text and pictures (pp. 101–124). Amsterdam: Elsevier.
Beagles-Roos, J., & Gat, I. (1983). Specific impact of radio and television on children’s story
comprehension. Journal of Educational Psychology, 75, 128–137.
Bower, G. H., & Morrow, D. G. (1990). Mental models in narrative comprehension. Science, 247,
44–48.
Broccoli, A. R. (Producer), & Gilbert, L. (Director). (1979). Moonraker. [Videotape]. London, UK:
Eon Productions Ltd./Les Artistes Associes. [Distributed by MBM/UA Home Video].
Cohen, J. (2002). Deconstructing Ally: Exploring viewers’ interpretations of popular television.
Media Psychology, 4, 253–277.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression: Correlation
analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Collins, W. A. (1983). Cognitive processing in television viewing. In E. Wartella & D. C. Whitney
(Vol. Eds.), Mass communication review yearbook (pp. 195–209). Beverly Hills, CA: Sage.
Eco, U. (1990). The limits of interpretation. Bloomington: Indiana University Press.
Eco, U. (1992). Interpretation and overinterpretation/Umberto Eco with Richard Rorty, Jonathan
Culler, Christine Brooke-Rose; edited by Stefan Collini. New York: Cambridge University Press.
Farah, M. J. (1989). Knowledge from text and pictures: A neuropsychological perspective. In
H. Mandl & J. R. Levin (Eds.), Knowledge acquisition from text and pictures (pp. 59–71).
Amsterdam: Elsevier.
Faul, F., & Erdfelder, E. (1992). G Power: A Priori Post–Hoc, and Compromise Power Analysis
for MS–DOS (Version 2.0) [Computer program]. Bonn, FRG: Bonn University, Department of
Psychology.
Garnham, A. (1997). Representing information in mental models. In C. Martin (Ed.), Cognitive
models of memory (pp. 149–172). Cambridge, MA: MIT Press.
Gunter, B. (1987). Poor reception. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gunter, B., Furnham, A., & Griffiths, S. (2000). Children’s memory for news: A comparison of
three presentation media. Media Psychology, 2, 93–118.
Gyselinck, V., & Tardieu, H. (1999). The role of illustrations in text comprehension: What, when,
for whom, and why? In S. R. Goldman & H. van Oostendorp (Eds.), The construction of mental
representations during reading (pp. 195–218). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Hayes, D. S., Kelly, S. B., & Mandel, M. (1986). Media differences in children’s story synopses:
Radio and television contrasted. Journal of Educational Psychology, 78, 341–346.
Higgins, E. T., Bargh, J. A., & Lombardi, W. L. (1985). Nature of priming effects on categorization.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 59–69.
Hoffner, C., Cantor, J., & Thorson, E. (1988). Children’s understanding of a televised narrative:
Developmental differences in processing video and audio content. Communication Research, 15,
227–245.
Kosslyn, S. M. (1994). Image and brain. Cambridge, MA: Bradford/MIT Press.
Lang, A., Bradley, S. D., Park, B., Shin, M., & Chung, Y. (2006). Parsing the resource pie: Using
STRTs to measure attention to mediated messages. Media Psychology, 8, 369–394.
Levie, W. H. (1987). Research on pictures: A guide to the literature. In D. M. Willows & H. A.
Houghton (Eds.), The psychology of illustration (Vol. 1, pp. 1–50). New York: Springer.
Livingstone, S. M. (1987). The implicit representation of characters in “Dallas”: A multidimensional
scaling approach. Human Communication Research, 13, 399–420.
Livingstone, S. M. (1989). Interpretive viewers and structured programs. Communication Research,
16, 25–57.
Magliano, J. P., Dijkstra, K., & Zwaan, R. A. (1996). Generating predictive inferences while viewing
a movie. Discourse Processes, 22, 199–224.
Magliano, J. P., Miller, J., & Zwaan, R. A. (2001). Indexing space and time in film understanding.
Applied Cognitive Psychology, 15, 533–545.
Meringoff, L. K. (1980). Influence of the medium on children’s story apprehension. Journal of
Educational Psychology, 72, 240–249.
Moliter, S., Ballstaedt, S., & Mandl, H. (1989). Problems in knowledge acquisition from text and
pictures. In H. Mandl & J. R. Levin (Eds.), Knowledge acquisition from text and pictures (pp. 3–35).
Amsterdam: Elsevier.
Morrow, D. G., Greenspan, S. L., & Bower, G. H. (1987). Accessibility and situation models during
narrative comprehension. Journal of Memory and Language, 26, 165–187.
O’Brien, E. J., & Albrecht, J. E. (1992). Comprehension strategies in the development of a mental
model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 777–784.
Paivio, A. (1971). Imagery and verbal processes. Oxford, England: Holt, Rinehart & Winston.
Paivio, A. (1986). Mental representations: A dual coding approach. New York: Oxford University
Press.
Paivio, A. (1991). Images in mind: The evolution of a theory. Hertfordshire, England: Harvester
Wheatsheaf.
Palmer, S. E. (1999). Vision science: Photons to phenomenology. Cambridge, MA: Bradford/MIT
Press.
Radvansky, G. A., & Zacks, R. T. (1991). Mental models and the fan effect. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 17, 940–953.
Robinson, J. P., & Davis, D. K. (1986). Comprehension of a single evening’s news. In J. P. Robinson
& M. R. Levy (Eds.), The main source: Learning from television news (pp. 107–132). Beverly
Hills, CA: Sage.
Robinson, J. P., & Levy, M. R. (1986). Comprehension of a week’s news. In J. P. Robinson & M. R.
Levy (Eds.), The main source: Learning from television news (pp. 87–105). Beverly Hills, CA:
Sage.
Roskos-Ewoldsen, D., Roskos-Ewoldsen, B., & Carpentier, F. (2002). Media priming: A synthesis.
In J. B. Bryant & D. Zillmann (Eds.), Media effects: Advances in theory and research (2nd ed., pp. 97–120).
Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Roskos-Ewoldsen, B., Roskos-Ewoldsen, D. R., Yang, M., & Lee, M. (2007). Comprehension of
the media. In D. R. Roskos-Ewoldsen & J. L. Monahan (Eds.), Communication and social
cognition: Theory and methods (pp. 319–348). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Sadoski, M., & Paivio, A. (2001). Imagery and text: A dual coding theory of reading and writing.
Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Schmitt, K. L., Anderson, D. R., & Collins, P. A. (1999). Form and content: Looking at visual
features of television. Developmental Psychology, 35, 1156–1167.
Singer, J. (1980). The power and limitations of television: A cognitive-affective analysis. In P.
Tannenbaum (Ed.), The entertainment functions of television (pp. 31–65). Hillsdale, NJ: Lawrence
Erlbaum Associates, Inc.
Sperling, G. (1960). The information available in brief visual presentations. Psychological
Monographs, 74 (Issue X, Whole No. 498).
van den Broek, P., & Gustafson, M. (1990). Comprehension and memory for texts: Three generations
of reading research. In S. R. Goldman, A. C. Graesser, & P. van den Broek (Eds.), Narrative
comprehension, causality, and coherence (pp. 15–34). Mahwah, NJ: Lawrence Erlbaum Associates,
Inc.
van den Broek, P., Risden, K., Fletcher, C., & Thurlow, R. (1996). A “landscape” view of reading:
Fluctuating patterns of activation and the construction of a stable memory representation. In B.
Britton & A. Graesser (Eds.), Models of understanding text (pp. 165–188). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
van den Broek, P., Risden, K., & Husebye-Hartmann, E. (1995). The role of readers’ standards for
coherence in the generation of inferences during reading. In R. R. Lorch & E. J. O’Brien (Eds.),
Sources of coherence in reading (pp. 353–373). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1999). The Landscape Model of reading:
Inferences and the online construction of a memory representation. In H. van Oostendorp & S. R.
Goldman (Eds.), The construction of mental representations during reading (pp. 71–98). Mahwah,
NJ: Lawrence Erlbaum Associates, Inc.
Yang, M., Roskos-Ewoldsen, B., & Roskos-Ewoldsen, D. (2004). Implications of the Landscape
Model of text memory for brand placement. In L. J. Shrum (Ed.), The psychology of entertainment
media: Blurring the lines between entertainment and persuasion (pp. 79–98). Mahwah, NJ:
Lawrence Erlbaum Associates, Inc.
Zwaan, R. A. (1994). Effect of genre expectations on text comprehension. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 20, 920–933.
Zwaan, R. A., Langston, M. C., & Graesser, A. C. (1995). The construction of situation models in
narrative comprehension: An event-indexing model. Psychological Science, 6, 292–297.
Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and memory.
Psychological Bulletin, 123, 162–185.
Zwaan, R. A., Radvansky, G. A., Hilliard, A. E., & Curiel, J. M. (1998). Constructing
multidimensional situation models during reading. Scientific Studies of Reading, 2, 199–220.
APPENDIX A
Nineteen Sequences of the News Clip
1. Move over Madonna? Maybe not quite yet.
2. These new entertainment robots made by Sony aren’t about to win a
Grammy, but they are talented.
3. Aside from singing and dancing, they can recognize human faces and
names.
4. This is ASIMO. Honda’s latest pride and joy.
5. It moves with greater flexibility and spontaneity than any other robot on
earth.
6. It also learns. One museum already uses it as a tour guide.
7. These robots are part of Japan’s second ever robot expo, Robodex.
8. Its goal: To show off Japan’s latest robotics innovations and encourage
engineers to share ideas.
9. Some robots now under development in Japan are meant to save lives.
This one detects land mines.
10. Still others mimic humans, this one with facial expressions.
11. This one recognizes and follows different colored objects.
12. Yet another has been taught to play the flute with special sensitivity in
its robotic lips, tongue and fingertips.
13. Many speak at least some Japanese.
14. “Konichiwa.” “Konichiwa.”
15. Robots may not yet take over our jobs—at least in our lifetime—but they
are likely to enhance our lives both at home and at work in ways we
still can’t predict. And Japan wants to be at the forefront of the robot
revolution.
16. A healing robot made to look like a baby seal is already used for therapy
with senior citizens and in pediatric wards in Japan.
17. It responds to human touch. Like a real pet, it has been clinically proven
to reduce stress without the hygiene issues of real animals.
18. Robot expo organizers hope to inspire a new generation of robot
inventors.
19. “I’d like to create a robot that can communicate with human beings in
a warm way,” says this university student. “I think robots will change
human culture. It will be like coexisting with beings from another planet.”
APPENDIX B
Concept List
Alien-Beings
ASIMO
Can-reduce-stress
Can-dance/sing/entertain
Can-not-predict-ways
Can-recognize-objects
Can-recognize-human-face/name
Clinically-proven
Coexist-with humans
Communicate-with-humans
Detects
Engineers
Enhance-lives
Forefront
Honda
Inspire
Inventors
Japan
Japanese
Konichiwa
Landmines
Learn
Like a pet
Mimics-human/facial-expression
Moves-with-flexibility/spontaneity
Madonna
Museum
No-hygiene-issues
Not-as-talented/skilled-as-human
Not-in our lifetimes
Not-real-animal
Organizers
Pediatric-wards
Play-flute
Pride
ROBODEX
Robot
Robot that-heals
Robot-expo
Robotic-innovations
Saves-lives
Seal-responds-to touch
Senior-citizens
Sensitive-robot-fingertips/lips/tongue
Share-ideas
Show-off robots
Sony
Speaking
Talented
Therapy
Under-development
University-student
Used-as-tour-guide
Will-change-culture
Concepts Presented Only in the Video Version of the News Story
Can-move-eyes/head
Crowd
Disc-like attachment
Disc-sweeping movement
Disgust/dislike
Dog-like-Robot
Exhibition hall
Gesturing robot
Going-down-stairs
Holding/patting-seal
Man
Microphone
Raise-lip
Reporter
Robot-with-woman-face
Robot blinking-eyes
Robot-doll
Robot-face/fingers
Robot-machine
Small-red-ball
Tracking-objects
Walking
Waving-robot
White-baby-seal robot
Woman