News, Copyright, and Online Aggregators
Preliminary
Lesley Chiou and Catherine E. Tucker
October 1, 2010
Abstract
This paper examines how the practices of online aggregators of
news content affect consumers' search for news online. The aggregation
of content by third-party websites has become a controversial
digital copyright issue. On the one hand, aggregators argue that
they provide a useful service to consumers and that the amount and
substantiality of the news article used in their links is small
enough to be considered fair use. On the other hand, content
providers argue that such practices represent copyright
infringement and that third-party aggregators' profits from
advertising associated with content reduce the potential market for
or value of copyrighted material. We empirically examine how the
severing of a relationship between a major content provider and a
major news aggregator affected consumers' search for online news.
Specifically, we investigate how the removal of all hosted articles
by The Associated Press from Google News at the end of 2009 (due to
a dispute in licensing negotiations) affected which news sites
consumers visited. Our empirical analysis suggests that the
removal of The Associated Press's content was correlated with a
decline in subsequent visits to traditional news sites immediately
after visiting Google News compared to other news aggregators that
continued to host Associated Press content.
Economics Department, Occidental College, CA
MIT Sloan School of Management, MIT, Cambridge, MA.
We thank Christopher Hafer of Experian Hitwise.
1 Introduction
The Internet has reshaped the news and media industry by
providing a wealth of information
and easy access to news, stories, and events that occur locally
and globally. Over 74 million
users visit newspaper sites each month, accounting for more
than a third of all web users
(Advertising Age, 2009). With the proliferation of search
engines and news sites, consumers
face an array of options and sources for information. Search
engines have responded to
consumers' demand for information by creating online news
aggregators, such as Google
News and Yahoo! News. These news aggregators feature a
collection of stories and headlines
from various online news sources, which are collated into a
single site. As such, aggregators
offer a convenient place for users to consolidate their news
reading.
We study empirically whether using a news aggregator shifts
consumers' consumption of
news online and whether news aggregators are substitutes or
complements for the primary
sources they feature. Surprisingly, there has been no empirical
work that quantifies the
effect of aggregators on primary news sources, even though the
US Copyright Office requires
proof of fair use and asks users to determine whether their use
of the copyrighted material
reduces its potential market or value. On the one hand, news
aggregators may steal
traffic from media sites if users rely solely on the aggregators'
abbreviated descriptions of
the article and do not visit the source site. In fact, this
accusation has been levied by several
major media organizations, including The Associated Press and
News Corporation. Rupert
Murdoch, chairman of News Corp., has accused news aggregators of
stealing content and
violating copyrights (Sandoval, 2009). Mark Cuban, chairman of
HDNet, even referred to
such practices as "vampiric," saying that "Newspapers are getting
their blood sucked by Google
and content aggregators" (Kaplan, 2010). Moreover, some anecdotal
evidence exists that a
significant fraction of readers scan only the headlines from
Google News and do not visit
the source site (Sullivan, 2010). On the other hand, news
aggregators may expose readers
to news sources and sites that they might otherwise not visit,
and therefore generate traffic
for the primary news source. As argued by Arrington (2010), "When
an aggregator puts up
a link to your site, they are doing you a favor by sending you
traffic."
Therefore, the overall effect of news aggregators on consumers'
news-seeking habits is an empirical question. Our paper focuses
on how the practices of news aggregators affect consumers' search
for news online. We use a rich dataset of consumers' online search
behavior and site visits to examine the effect of a policy change
at a major online news aggregator regarding the content it displayed
from a primary news source. The decision to host
news sources may be correlated with other factors that influence
a consumer's consumption
of news, so we use a discontinuous event that altered the set of
news sources provided by a
major news aggregator.
In January 2010, after a breakdown in licensing negotiations,
Google removed all news
articles by The Associated Press from its news aggregator
(Haddad, 2010). We compare
consumers' site visits before and after this policy relative to
traffic from Yahoo! News, which
continued to provide Associated Press content in this period.
Yahoo! News and Google News
play a large role in the online news market and are among the
top 5 news sites visited by
readers. We find that after Associated Press content was removed
from Google News, fewer
consumers subsequently visited traditional news sites (the
sources for much of Associated
Press content) relative to consumers using Yahoo! News. We
checked the robustness of the
result in a variety of ways. Our finding suggests that the
aggregation of news content actually
complements the original content. In other words, users are more
likely to be provoked to
seek the original source and read further when they come across
a story summarized by an
aggregator, rather than being merely content with the
summary.
Our paper builds on a growing literature that documents how the
Internet has affected
the consumption of traditional news media. Gentzkow (2007)
investigated the relationship
between offline and online newspapers. He found some evidence of
complementarity but ultimately concluded that this was an artifact of customer
heterogeneity and that the provision
of online news was costly to newspaper print edition revenues.
This finding was supported
in separate research by George (2008) and Filistrucchi (2005).
Kaiser and Kongsted (2005)
find evidence of complementarity for online magazines, which
suggests that substitution is
particularly important for news content rather than more
magazine-type content. To our
knowledge, however, we are the first to study how the practice
of online aggregation affects
online news consumption. Our distinction between traditional
media, which is primarily
local, and news aggregators that have national reach builds on
earlier research that has
documented the importance of the interaction between national
and local news distribution
practices for understanding consumption of news (George and
Waldfogel, 2006; Oberholzer-
Gee and Waldfogel, 2009). Our work also relates to research that
has evaluated the conflict
between digitization and copyright. These studies have focused
predominantly on the issues
relating to the piracy of film and musical content (Rob and
Waldfogel, 2006; Oberholzer-Gee
and Strumpf, 2007; Danaher et al., 2010) by unauthorized
distribution channels.
Our paper has several implications for media markets and public
policy. First, a fierce
debate exists over intellectual property and copyrights for
content posted online. Search engines and aggregators accumulate
information from primary sources, and controversy exists
over whether news aggregators violate existing copyrights and
whether content can be provided freely by a third party. Second,
policymakers have long stressed the importance of
the diversity of news consumption and how consumption of local
news in particular encourages civic engagement. Our results
suggest that news aggregators
actually provoke readers
to seek further news. Given that local news sites comprise a
substantial fraction of online
news, aggregators may promote more diversity in consumption
patterns.
2 Data and Institutional Setting
2.1 Contractual Dispute between Google and Associated Press
The Associated Press, founded in 1846, is one of the most
powerful news agencies in the
world. Since the demise of United Press International, it is the
only national news service
in the US, and its major competitors are now the United
Kingdom-based Reuters and the
France-based Agence France-Presse. It is a cooperative that is
owned by various newspapers
and radio and television stations in the United States. These
stakeholders both contribute
stories to the Associated Press and use material that is
written by its staff journalists.
During the past decade, The Associated Press has been at the
forefront of efforts by copyright
holders to circumscribe fair use for digital content and protect
copyright holders' rights. For
example, in June 2008, the Associated Press invoked the Digital
Millennium Copyright Act
and insisted that various bloggers remove Associated Press
content (Ardia, 2008).
Google News is ranked as the fifth most visited news website by
Hitwise. Receiving 2.90%
of all news site visits, it is the second most popular news
aggregator service after Yahoo!
News, which received 7.09% of all news site visits. Founded in
April 2002, Google News
electronically aggregates different news sources based upon a
proprietary algorithm. As of
December 2009, Google News claimed that it received news content
from 25,000 publishers
across the world and that it sent 1 billion clicks to these
publishers every month (Cohen,
2009). Google News has been supported by advertising revenues in
the US since February
2009. Figure 1 provides a screenshot of Google News. Google News
has two noticeable
features that distinguish it from traditional news sites.
First, a variety of sources are
listed for each story. Second, the order of news is
electronically determined based on users'
preferences, the recency of the story, and the interest it has
received from other users.
Since both The Associated Press and Google News are key players
in the distribution
of news, it is not surprising that they have forged a
partnership. Table 1 summarizes the
Figure 1: Screenshot of Google News
Note: On June 30, 2010, the formatting of Google News changed
somewhat and reduced the ability of users to customize the placement
of the columns containing news. Therefore the screenshot above,
which was produced after this formatting change, may be slightly
different from what users viewed during the period that we
study.
major events of their relationship. We study a discontinuity in
this relationship, which was
engendered by negotiations surrounding the contract renewal at
the end of January 2010. As
part of their existing contract, Google and The Associated Press
agreed that Associated Press
content could be hosted by Google for a period of 30 days.
Therefore, if the contract ended
in January 2010 and was not renewed, Google would have to stop
posting new Associated
Press content 30 days prior to the end of the contract.
Presumably, to make this clean
break a credible outside option, Google did indeed stop posting
content for seven weeks
during these contract negotiations. We should emphasize that
this is necessarily based on
the observations of industry outsiders, since both Google and
the Associated Press signed
binding non-disclosure agreements, which prevented them from
ever commenting on the
course or outcome of negotiations (Sullivan, 2010).
This removal of Associated Press content represents a useful
natural experiment for
empirical researchers. Since the removal of content was provoked
by the intricacies of contract
negotiation, its timing can be thought of as reasonably exogenous,
as it was determined by the
expiration of the contract rather than any considerations of the
popularity (or lack thereof)
for Associated Press content at that time. As detailed in Table
1, the dispute with the
Associated Press led Google to remove content by the Associated
Press from December 23,
2009 to February 9, 2010. Fortunately for our purposes, Yahoo!
News continued to host
Associated Press content without interruption during this time,
which enables us to use its
web users' behavior as a control in our regressions. We compare
which websites consumers
navigated to after visiting a news aggregator before and after
the removal of content on
Google for both visitors to Google News and Yahoo! News.
Table 1: Timeline of negotiations between Google and Associated Press

Date                 Event
March 2005           Google is sued by Agence France Presse for copyright
                     infringement after AFP content appeared on Google News.
August 2006          Google and Associated Press first sign contract to enable
                     Associated Press content to appear on Google News for
                     30 day window.
December 24, 2009    Associated Press content no longer appears on Google.
                     Industry press speculates that this is in preparation for
                     the expiration of contract between Associated Press and
                     Google in a month's time.
End January 2010     Associated Press and Google contract set to expire.
February 2010        Associated Press content returns to Google News.

Critics and supporters of news aggregators alike have proposed
numerous arguments for
why the removal of Associated Press content may either
benefit or hurt news websites.
On the one hand, if consumers are no longer able to obtain
Associated Press news content,
they may be more likely to seek the news directly from the
Associated Press member organizations and newspapers. On the
other hand, consumers may
simply be less likely to seek
further information about news. In essence, this distinction can
be boiled down to whether
consumers view news aggregators as a complement or substitute to
original news sources. Do
they use news aggregators to identify news stories that they
then pursue in greater depth,
or do they simply stop after reading the first news item? For
instance, the Associated Press
ran a news story about economic depression in Michigan in August
2010. The screenshot
of how the story appeared on Google News is depicted in Figure
2. The links relating to
the Associated Press story that appear at the bottom of a
typical story are also depicted
in Figure 2. After reading the Associated Press summary of the
story, readers are free to
explore the issue further in local newspapers such as the
Detroit News and Lansing State
Journal. The question we ask is whether the presence of the
Associated Press content on
Google News makes it more or less likely that a news consumer
would then trouble to visit
Detroit News or the Lansing State Journal, both of which are
members of the Associated
Press Network.
Our analysis is focused on the period immediately prior to and
during the removal of
Figure 2: Example screenshot of Associated Press article hosted on Google News
Note: Google News, August 1st 2010. Text of article has been slightly edited to fit on page.
Associated Press articles from Google News for two reasons.
First, it is not immediately clear
at which point in February Google News and Associated Press
resumed their relationship
resumed their relationship
and reached a new agreement. Second, it is not apparent whether
the reinstatement during
this time consisted of the older, missing content or new content
or whether Google changed
the presentation of AP articles afterwards. For example, it
would be problematic if Google
decided to highlight Associated Press content after the contract
negotiations were concluded,
perhaps as a sweetener to the deal. For these reasons, we focus
on visits to news sites
during the months of December 2009 and January 2010.
2.2 Data Description
Our data derive from Experian Hitwise. Hitwise develops
proprietary software that Internet
Service Providers (ISPs) use to analyze website logs created on
their network. Once the ISP
aggregates the anonymous data, the data are provided to Hitwise.
According to their website,
Hitwise collects these usage data from a geographically diverse
range of ISP networks and
opt-in panels, representing all types of Internet usage,
including home, work, education
and public access. Currently, Hitwise has usage data from a
sample of 25 million people
worldwide.
We collected information on the sites that users visit
immediately after navigating to
Google News or Yahoo! News. We use weekly data from the week
ending December 5, 2009
to the week ending February 27, 2010 for the top 1500 sites
navigated after Google News or
Yahoo! News. Hitwise reports the fraction of total traffic that
arrives at each downstream
site immediately after a visit to Google News and Yahoo! News.
We constructed a 2-
month panel where the unit of observation is the percentage of
weekly clicks a downstream
website received from either Google News or Yahoo! News.
Twenty-six percent of websites
received incoming traffic from both Google and Yahoo! News. The
remainder of websites
were only visited after navigating to one particular aggregator.
This may reflect internal
complementarities for these companies. For instance, someone
using Google News is unlikely
to navigate to Yahoo! Mail, and similarly someone using Yahoo!
News is unlikely to navigate
to Gmail.
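
The panel structure described above, and the share of downstream sites reached from both aggregators, can be illustrated with a minimal sketch. The record layout, the function names, and every number below are hypothetical and chosen only for illustration; they are not Hitwise data or the authors' code.

```python
from collections import defaultdict

# Hypothetical records: (aggregator, downstream_site, week_ending, pct_clicks).
# All values are invented for illustration and are not Hitwise data.
records = [
    ("google", "nytimes.com", "2009-12-05", 2.9),
    ("yahoo", "nytimes.com", "2009-12-05", 2.7),
    ("google", "gmail.com", "2009-12-05", 1.6),
    ("yahoo", "mail.yahoo.com", "2009-12-05", 9.9),
    ("google", "nytimes.com", "2009-12-12", 2.8),
    ("yahoo", "mail.yahoo.com", "2009-12-12", 10.1),
]


def aggregators_by_site(rows):
    """Map each downstream site to the set of aggregators that sent it traffic."""
    sources = defaultdict(set)
    for aggregator, site, _week, _pct in rows:
        sources[site].add(aggregator)
    return dict(sources)


def share_reached_from_both(rows):
    """Fraction of distinct downstream sites that appear under both aggregators."""
    sources = aggregators_by_site(rows)
    both = sum(1 for aggs in sources.values() if {"google", "yahoo"} <= aggs)
    return both / len(sources)


print(share_reached_from_both(records))
```

In this toy panel one of the three distinct downstream sites receives traffic from both aggregators; the paper reports the corresponding share in its data as twenty-six percent.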
We categorized the websites into two main classes: non-news
(e.g., Yahoo! Mail, myspace.com) and traditional news (e.g.,
newyorktimes.com, bostonherald.com). We applied
Hitwise's own categorization of news websites to identify
traditional news media, but we
excluded weather sites and news aggregators from the 5 major
search engines (such as Yahoo! News, Google News, Huffington Post)
from the category.1 We identified a site as an
aggregator based upon whether or not it produced its own
original content.
We also constructed a separate category for international news
(e.g., bbc.com/news,
hindustantimes.com), which we use in our robustness checks. We
would expect the removal
of Associated Press content to affect traditional news media
sites, but the removal should
not affect visits to international sites that tend to either
generate their own content or rely
on non-American news agencies for their content.
Table 2 reports the summary statistics for our data. It is
striking that 20 percent of
the time, consumers navigate to a traditional news media website
from the news aggregator.
Traditional news sites captured most traffic. International news
received less traffic (5.5
percent of sites visited) than traditional news sites.
1 Hitwise reports the top 10,000 ranked news and media sites in
November 2009.
Table 2: Summary statistics for downstream websites from Google News and Yahoo! News

                           Mean      Std Dev   Min   Max    Observations
% clicks                   0.016     0.19      0     18.3   100503
Google News                0.50      0.50      0     1      100503
PeriodDispute              0.67      0.47      0     1      100503
Traditional News Site      0.20      0.40      0     1      100503
News Aggregator Site       0.0011    0.033     0     1      100503
International News Site    0.055     0.23      0     1      100503
Observations               100503

Notes: This table reports statistics for websites visited
immediately after Google News and Yahoo! News during December 2009
and January 2010. The period during which the dispute occurred
between Associated Press and Google News was after December 23,
2009. Traditional news sites refer to news and media sites as
defined by Hitwise, excluding weather sites, international news
sites, and news aggregators from the top 5 search engines.
Table 3 displays the top 50 (traditional) news websites in our
dataset and the average
percentage of downstream clicks they receive. Table 4 displays
the top 50 non-news websites
in our dataset and the average percentage of downstream clicks
they receive. As shown in
Table 4, the top non-news websites reflect the top website
brands on the Internet. This is
suggestive evidence that users of news aggregator sites both
have mainstream Internet tastes
and regard the sites as part of their normal Internet
consumption.
Table 3: Top 50 news websites visited after Google News and Yahoo! News

Website                        Avg Visit Pct
abcnews.com                    2.11
associatedcontent.com          0.11
bleacherreport.com             0.17
bloomberg.com                  0.51
boston.com                     0.24
bostonherald.com               0.19
businessweek.com               0.15
cbsnews.com                    0.19
chron.com                      0.13
cnn.com                        1.85
csmonitor.com                  0.15
dallasnews.com                 0.11
drudgereport.com               0.64
edition.cnn.com                0.20
examiner.com                   0.65
foxnews.com                    1.13
foxnews.com/entertainment      0.082
foxnews.com/politics           0.20
freep.com                      0.13
gather.com                     0.34
latimes.com                    0.48
mcclatchydc.com                0.095
mercurynews.com                0.44
miamiherald.com                0.15
msnbc.com                      0.83
news.com                       0.12
nj.com                         0.11
npr.org                        0.16
nydailynews.com                1.59
nypost.com                     0.26
nytimes.com                    2.88
people.com                     0.39
philly.com                     0.15
politico.com                   0.53
radaronline.com                0.060
reuters.com                    0.69
seattlep-i.nwsource.com        0.11
seattletimes.nwsource.com      0.11
sfgate.com                     0.17
sportsillustrated.cnn.com      0.10
startribune.com                0.084
thedailybeast.com              0.17
theweek.com                    0.14
time.com                       1.16
upi.com                        0.093
usatoday.com                   0.72
usmagazine.com                 0.23
usnews.com                     0.082
voanews.com                    0.13
washingtonpost.com             1.74
wsj.com                        0.86

Table 4: Top 50 Non-news websites visited after Google News and Yahoo! News

Website                        Avg Visit Pct
address.yahoo.com              0.12
amazon.com                     0.59
aol.com                        0.46
aralifestyle.com               0.14
ask.com                        0.19
bankofamerica.com              0.18
bing.com                       0.62
blogsearch.google.com          0.77
buzz.yahoo.com                 0.21
chase.com                      0.14
cosmos.bcst.yahoo.com          0.95
ebay.com                       1.00
education.yahoo.net            0.34
espn.com                       0.56
facebook.com                   6.23
fastflip.googlelabs.com        3.60
finance.google.com             0.36
finance.yahoo.com              0.60
games.yahoo.com                0.099
gmail.com                      1.55
google.com                     11.6
howlifeworks.com               1.04
huffingtonpost.com             0.96
images.google.com              0.50
latimesblogs.latimes.com       0.16
livescience.com                0.38
mail.live.com                  1.28
mail.yahoo.com                 9.94
maps.google.com                0.23
members.yahoo.com              0.29
movies.yahoo.com               0.13
msn.com                        1.03
my.yahoo.com                   0.67
myspace.com                    1.54
news.google.com                0.24
omg.yahoo.com                  0.32
rivals.com                     0.10
search.yahoo.com               2.20
shine.yahoo.com                0.13
space.com                      0.15
sports.yahoo.com               0.26
sports.yahoo.com/nfl           0.13
tmz.aol.com                    0.20
tv.yahoo.com                   0.12
video.google.com               0.27
weather.com                    0.67
weather.yahoo.com              0.39
wikipedia.org                  0.50
yahoo.com                      7.20
youtube.com                    2.47
Table 5: Demographic description of users

Measure        Yahoo! News   Google News   New York Times
Male           59.95         63.8          61.21
Age 18-24      12.12         13.89         6.17
Age 25-34      18.05         14.72         13.93
Age 35-44      19.03         17.08         12.98
Age 45-54      21.41         22.24         19.45
Age 55+        29.38         32.06         47.47
Income 150k    9.29          9.6           10.77

Source: Hitwise
Notes: This table reports the fraction of users of a particular
website within each demographic category. Statistics are reported
for users of Yahoo! News, Google News, and the New York Times
website.
To verify that Yahoo! News could be considered an appropriate
control group for Google
News, we checked that the users shared similar observable
demographics. Table 5 reports the
fraction of users within each demographic category for a
particular site. The users of Yahoo!
News and Google News do indeed look reasonably similar; they are
skewed towards being
older, predominantly male, and wealthier than the general U.S.
population. For comparison,
we also report demographics for users of the New York Times
website. The users of the
New York Times site are similar to, though significantly older than,
the average users of a news
aggregator. Table 5 also provides suggestive evidence of why the
debate over ad revenues from
news content is so contentious. Users such as these are a
remarkably attractive demographic
group from an advertiser's perspective.
3 Analysis
Figure 3 summarizes our main analysis. Figure 3 illustrates the
mean percentage of downstream traffic for users that visited
Google News and Yahoo!
News during our period. As
seen in the graph, little change occurs in downstream site
navigation for Yahoo!. However,
news sites experience a decline in visits from Google News after
the removal of Associated
Press relative to the change in traffic from Yahoo! News.
Figure 4 extends this analysis to show how visit behavior varies
for international news
sites as well. Once again, little change exists in user behavior
for these additional types of
websites on either Yahoo! News or Google News, suggesting that
these sites were not affected
by the removal of Associated Press content. As expected, these
international websites are
unlikely to be affected by the removal of AP content due to the
nature of their content. As
seen in Figure 5, no such change in clicks occurred in the prior
year during the same calendar
months of December 2008 and January 2009.
Figure 3: Downstream sites visited after Google News and Yahoo! News
Notes: This figure shows the average percentage of clicks for
news and non-news sites navigated to after visiting Google News
and Yahoo! News before and after the removal of The Associated
Press from Google News.

Figure 4: Downstream sites visited after Google News and Yahoo! News
Notes: This figure shows the average percentage of clicks for a
variety of website types navigated to after visiting Google
News and Yahoo! News before and after the removal of The Associated
Press from Google News.
To formalize the insights provided by Figures 3 and 4, we run a
difference-in-differences
regression for the policy change and estimate the following
regression for the percentage of
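clicks to website i after visiting news aggregator j in week t:

\[
\begin{aligned}
\%\text{clicks}_{ijt} = {} & \beta_0
  + \beta_1\,\text{News}_i \times \text{Google}_j \times \text{PeriodDispute}_t
  + \beta_2\,\text{News}_i \times \text{PeriodDispute}_t \\
  & + \beta_3\,\text{News}_i \times \text{Google}_j
  + \beta_4\,\text{Google}_j
  + \gamma_i + \text{week}_t + \epsilon_{ijt}
\end{aligned}
\tag{1}
\]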
where News is an indicator variable equal to 1 if the website is
a traditional news source,
Google is an indicator variable equal to 1 if the traffic
originated after viewing Google News,
and PeriodDispute is an indicator variable equal to 1 for the
weeks after the removal of
Associated Press from Google News. The controls are
downstream-website fixed effects.
The vector weekt contains weekly fixed effects to capture
national variation in the volume
Figure 5: Downstream sites visited after Google News and Yahoo! News in prior year (December 2008 and January 2009)
Notes: This figure shows the average percentage of clicks for
news and non-news sites navigated to after visiting Google News
and Yahoo! News in December 2008 and January 2009, the year
prior to the removal of The Associated Press from Google News.
and interest generated by news stories in that week. The
coefficient on the interaction
term News × Google × PeriodDispute captures the effect of the
Associated Press removal on visits to traditional news sites
compared to non-news sites from Google News, with the
corresponding change in traditional news and non-traditional
news sites on Yahoo! as a
control. We estimate this specification using ordinary least
squares and cluster our standard
errors at the website level to avoid the downward bias reported
by Bertrand et al. (2004).
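
The logic of the triple interaction can be illustrated with a minimal sketch that compares cell means. This is not the authors' estimation code: it omits the website and week fixed effects and the clustered standard errors of the specification above, and it matches the coefficient on News × Google × PeriodDispute only in a fully saturated model without fixed effects. The data and function names are hypothetical.

```python
from statistics import mean

# Hypothetical observations: (is_news, is_google, in_dispute, pct_clicks).
# Values are invented for illustration only.
rows = [
    (1, 1, 0, 3.0), (1, 1, 1, 2.4),  # news sites reached from Google News
    (0, 1, 0, 1.0), (0, 1, 1, 1.0),  # non-news sites reached from Google News
    (1, 0, 0, 2.0), (1, 0, 1, 2.1),  # news sites reached from Yahoo! News
    (0, 0, 0, 1.5), (0, 0, 1, 1.5),  # non-news sites reached from Yahoo! News
]


def cell_mean(data, news, google, dispute):
    """Average pct_clicks within one (news, google, dispute) cell."""
    return mean(
        pct for n, g, d, pct in data if (n, g, d) == (news, google, dispute)
    )


def triple_difference(data):
    """Change for news relative to non-news sites, Google relative to Yahoo!."""

    def change(news, google):
        # Dispute-period mean minus pre-dispute mean for one group of sites.
        return cell_mean(data, news, google, 1) - cell_mean(data, news, google, 0)

    google_gap = change(1, 1) - change(0, 1)
    yahoo_gap = change(1, 0) - change(0, 0)
    return google_gap - yahoo_gap


print(round(triple_difference(rows), 3))
```

A negative value means that, relative to non-news sites and relative to Yahoo! News, news sites lost downstream traffic from Google News during the dispute period.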
Table 6 reports the results for various regression
specifications, incrementally building
up to our full specification described by equation (1). Very
little variation exists in the size
or precision of our coefficient of interest in each of the
columns. The negative coefficient on
News × Google × PeriodDispute implies that during the dispute with
Associated Press, Google News users were less likely to visit
traditional news websites after visiting Google News. This
suggests that the presence of Associated Press articles in
Google News prompted users to
seek further information at traditional news sites and thereby
encouraged more diversity in
18
-
Table 6: Downstream traffic Google and Yahoo! News before and
after the policy change
(1) (2) (3)% clicks % clicks % clicks
PeriodDispute X Google X News -0.00583 -0.00583 -0.00583
(0.00271) (0.00284) (0.00284)PeriodDispute X Google 0.00152
0.00152 0.00152
(0.00219) (0.00229) (0.00229)PeriodDispute -0.000514 -0.000514
-0.000560
(0.000655) (0.000686) (0.00105)Google -0.00372 -0.0115
-0.0115
(0.00367) (0.00617) (0.00617)News -0.000393
(0.00356)PeriodDispute X News 0.00143 0.00143 0.00143
(0.000963) (0.00101) (0.00101)News X Google 0.0184 0.0324
0.0324
(0.00600) (0.00753) (0.00753)Website Fixed Effects Yes Yes
YesWeek Fixed Effects No No YesObservations 100503 100503
100503R-Squared 0.000543 0.581 0.581
Robust standard errors clustered at website level. *p < 0.1,
**p < 0.05, ***p < 0.01. Thedependent variable is the
fraction of traffic to websites after visiting Google News or
Yahoo!News. The policy change is the removal of hosted articles by
The Associated Press fromGoogle News.
news consumption.
News sites on Google experience a 0.6 percentage point decrease in
clicks after the removal
of Associated Press articles. Compared to the mean percentage
share of 2.9 percent before
the policy change, this drop represents an approximately 20
percent decrease in traffic to
news sites after the removal of Associated Press articles from
Google. If the claim in Cohen
(2009) is true that Google sends a billion clicks each month to
its partner news providers, then
this percentage translates into a very large change in the
number of clicks that traditional
news websites receive. While we do not know precisely the
international breakdown, our
data from Hitwise suggest that 40 percent of all clicks before
the policy change went to
traditional news media websites hosted in the US. Therefore,
this 20 percent decrease could
imply an 80 million decrease in visits each month from Google
News users to
traditional news media websites hosted in the US.
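
The back-of-the-envelope calculation above can be retraced as follows. This is only a sketch of the arithmetic: it reads the estimated decrease as roughly 0.6 percentage points (the Table 6 coefficient of -0.00583 placed on the same scale as the 2.9 percent baseline, which is an interpretation assumed here) and takes the one billion monthly clicks from Cohen (2009) at face value.

\[
\frac{0.00583}{0.029} \approx 0.20,
\qquad
10^{9} \times 0.40 \times 0.20 = 8 \times 10^{7} \text{ visits per month.}
\]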
Our results suggest that news aggregators complement the news
sources that they feature by directing traffic to these news sites.
The provision of
content on news aggregators
encourages readers to seek further information from other news
sources.
4 Robustness Checks
We conducted various robustness checks as reported in Table 7.
Columns (1) and (2) check
the robustness of our results to alternative specifications. We
apply a Tobit regression to
account for sites that receive zero clicks in a given week and
also a semi-log regression.2 Both
regressions have similar signs for the coefficients of interest;
news sites receive less traffic from
Google after the policy change.
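
The dependent variable of the semi-log specification can be sketched as below. The 0.01 offset comes from the paper's footnote; the choice of the natural logarithm and the function name are assumptions made for this illustration.

```python
import math

# Constant added before taking logs so that zero-click weeks stay defined.
OFFSET = 0.01


def semi_log(pct_clicks):
    """Return log(pct_clicks + 0.01); the natural log is an assumption here."""
    if pct_clicks < 0:
        raise ValueError("pct_clicks must be non-negative")
    return math.log(pct_clicks + OFFSET)


# A site with zero clicks in a week maps to log(0.01) instead of being undefined.
print(semi_log(0.0))
```

The offset keeps observations with zero clicks in the estimation sample, which a plain log transformation would drop.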
Columns (3)-(5) check robustness of the results to alternative
definitions of the control group. As described previously, users
navigated to a variety of non-traditional news
sites after visiting a news aggregator. These sites included
both non-traditional and non-Associated Press sources of news.
In columns (3) and (4), our robustness checks omit the
top news aggregators and international websites, respectively,
from the control group. These alternative definitions of the
control group could be warranted if the removal of Associated Press
content also affected navigation to these sites directly (e.g.,
if Associated Press content had
previously encouraged people to visit international websites, or
if the removal of Associated
Press content on Google altered people's perceptions of news
aggregators). In column (5),
we check robustness to removing both news aggregators and
international sites from our
data. Generally, the results are robust in sign.
2 For the semi-log regression, we use log(%clicks+0.01) as the
dependent variable.
Table 7: Robustness checks: Downstream traffic to local news sites from Google News and Yahoo! News before and after the policy change

                                     (1)          (2)          (3)              (4)                (5)
                                     Tobit        Semi-log     No Aggregators   No International   News vs Non-News
PeriodDispute X Google X All News    -0.0229      -0.0216      -0.00583         -0.00620           -0.00632
                                     (0.00924)    (0.0127)     (0.00284)        (0.00303)          (0.00306)
PeriodDispute X Google               0.00394      -0.00783     0.00152          0.00178            0.00179
                                     (0.00557)    (0.00572)    (0.00230)        (0.00249)          (0.00251)
PeriodDispute                        -0.00556     -0.00685     -0.000464        -0.000762          -0.000763
                                     (0.00446)    (0.00527)    (0.00108)        (0.00111)          (0.00112)
Google                               0.0248       -0.0207      -0.0114          -0.0153            -0.0155
                                     (0.00874)    (0.0123)     (0.00619)        (0.00674)          (0.00680)
All News                             0.112
                                     (0.0208)
PeriodDispute X All News             0.0152       0.0155       0.00142          0.00148            0.00154
                                     (0.00564)    (0.00876)    (0.00101)        (0.00104)          (0.00105)
All News X Google                    -0.0126      0.0785       0.0323           0.0364             0.0362
                                     (0.0142)     (0.0232)     (0.00754)        (0.00804)          (0.00812)
Website Fixed Effects                No           Yes          Yes              Yes                Yes
Week Fixed Effects                   Yes          Yes          Yes              Yes                Yes
Observations                         100503       100503       100395           94959              94203
R-Squared                                         0.684        0.580            0.581              0.581

Robust standard errors clustered at website level. *p