UNIVERSITAT AUTÒNOMA DE BARCELONA
Non-Sponsored Technology
Adoption: How Network Effects
Create Dominant Standards
PUE Final Project
Josep Nueno Guitart
Tutor: Gabriel Izard Granados
Abstract
This project studies the role of network effects in the diffusion of technical innovations. To do so, we study the transition from an incumbent video container format (AVI) to a newer, more efficient one (MP4) in the context of The Pirate Bay, a peer-to-peer file sharing network, using a database of the 2.1 million files available in its catalog. To carry out our analysis we divide the peers active in The Pirate Bay into those that upload files and those that download them, and we find statistically significant evidence of network effects in the MP4 adoption shares of both groups. Lastly, we propose a theoretical model to explain some of the observed phenomena.
This project studies the role that network effects play in the diffusion of innovations. We analyze the transition from an incumbent video container format (AVI) to a newer, more efficient one (MP4) in the context of The Pirate Bay, a peer-to-peer file sharing network, using a database of the 2.1 million files available in its catalog. To carry out our analysis we divide the users of The Pirate Bay into those who upload files and those who download them, and we find statistically significant evidence of network effects in the proportion of users adopting MP4 in both groups. Finally, we propose a theoretical model to explain some of the observed phenomena.
PROGRAMA UNIVERSITAT-EMPRESA
PRESENTATION OF THE FINAL DEGREE PROJECT
Josep Nueno Guitart, student of the twenty-third cohort of the Programa Universitat-Empresa, submits in duplicate the final degree project entitled Non-sponsored technology adoption: how network effects create dominant standards, with which he takes part in the nineteenth call of the Beques Universitat Empresa. He declares that he knows and accepts the rules of the call.
He likewise declares that the final degree project he submits is unpublished and not plagiarized, and that he has respected the confidentiality commitment with the companies of the PUE.
He also authorizes the Programa Universitat-Empresa to publish his work.
Bellaterra (Cerdanyola del Vallès), 30 May 2013
Signed:
Table of contents
1. Introduction
2. Overview of BitTorrent and The Pirate Bay
3. Overview of digital audiovisual formats
4. Literature review
5. Econometric model
6. Data
7. Results
8. Theoretical model
9. Simulation
10. Discussion
11. Appendix 1 – Code used for database cleanup
12. Appendix 2 – Descriptive statistics for television shows
13. Appendix 3 – Stata output
14. Appendix 4 – MKV regressions and comments
15. Appendix 5 – Theoretical model derivations
16. Appendix 6 – Code used for the simulation
17. References
Index of figures
Figure 1. Revolutionary innovations in media technology
Figure 2. Revolutionary and Evolutionary innovations within media paradigms
Figure 3. Monthly uploaded AVI and MP4 files to TPB
Figure 4. Monthly format share of uploaded AVI and MP4 files to TPB
Figure 5. Broadband penetration and TPB use
Figure 6. Monthly uploads to TPB by format
Figure 7. HD television sets sold in 2005-2010
Figure 8. Monthly uploaded media files to TPB per codec
Figure 9. Dataset creation process
Figure 10. Average number of seeders per file for AVI and MP4 files
Figure 11. Average number of seeders per file (TV Shows sample)
Figure 12. Average number of comments per file (TV Shows sample)
Figure 13. Net share effects for = 0.9
Figure 14. Equilibrium adoption shares for firms varying and
Figure 15. Equilibrium adoption shares for firms varying n and fkmax
Figure 16. Equilibrium adoption shares for consumers varying and
Figure 17. Equilibrium adoption shares for consumers varying n and fkmax
Index of tables
Table 1. Descriptive statistics
Table 2. Random effects MLE regression of MP4 share for users
Table 3. Random effects MLE regression of AVI share for users
Table 4. Random effects MLE regression of MP4 share for uploaders
Table 5. Possible interactions between firms and consumers
Table 6. List of parameters used in the simulation
Table 7. List of variables used in the simulation
1. Introduction
After increasing significantly during the late 20th century, the rate at which technological innovations are generated has accelerated unusually fast over the last decade: digitalization creates very demanding environments, as the relative ease of transition between technologies results in a faster innovation-substitution cycle. Even if these changes rarely represent a drastic improvement over what they replace, the high frequency at which they take place makes them an interesting subject.
Varian and Shapiro (1999) categorize technical innovations depending on whether the new standard is compatible with the old one. If it is, they call it an “evolutionary innovation”; if it is not, it is a “revolutionary innovation”. Looking at the literature on standards from this perspective, one notices that most work addresses the dynamics of revolutionary innovations (i.e. disruptive changes among standards), while there is much less focus on evolutionary ones (the substitution activity within a technology as the many innovative alternatives fight to become the dominant standard). Another classification prevalent in the reviewed literature is the distinction between sponsored and unsponsored standards. The former are proprietary technologies sold by an agent who is capable of strategic maneuvering in order to maximize the chances of its standard becoming the dominant one. In the case of unsponsored standards, on the other hand, no one besides the final consumers stands to gain anything from adoption. While some theoretical papers on unsponsored standards have been written, most notably Katz and Shapiro (1985, 1986) and Farrell and Saloner (1986), little empirical research has been conducted on them, and the emphasis has always been on competition rather than on the replacement of an incumbent format by a more effective new one.
One point on which most research agrees is the relevance that direct and indirect network effects have in the adoption of new technologies. Direct network effects are a consequence of adoption by other users: the classical example is the increase in utility that users connected to a telephone network experience when an additional user decides to join. Indirect network effects are different in that, while their impact may also increase with user adoption, they are not a direct consequence of it. An example is software variety across operating systems for personal computers: a higher adoption rate for Mac computers increases the incentive for developers to create products that work on that platform, which in turn increases the variety of applications available for that operating system. This is a self-reinforcing loop, since more variety makes the platform more attractive, which increases adoption by users. While harder to identify than direct effects, indirect network effects play a huge role in determining whether or not a specific technology will succeed in carving out a user base¹.
Figure 1: Revolutionary innovations in media technology
Figure 2: Revolutionary (red) and Evolutionary (blue) innovations within media
paradigms
1 For a detailed empirical investigation of the relevance of indirect network effects see Ohashi (2004)
This project intends to study the dynamics of adoption in the case of evolutionary innovations, paying special attention to the impact of network effects. A case in point is the gradual replacement that has taken place in digital video formats, where Audio Video Interleave (AVI), a format originally designed by Microsoft but freely licensed, was gradually replaced by ISO’s MPEG-4 Part 14 (MP4). This change was by no means revolutionary, since it did not challenge the governing technical paradigm (digital video, see Figures 1 and 2), but it did carry incremental improvements to the quality and usefulness of the contents offered. Figures 3 and 4 show the count and share of AVI and MP4 multimedia files uploaded monthly to The Pirate Bay which, for the time being, can be taken as the dominant catalog of files being shared in BitTorrent, a protocol for peer-to-peer sharing of files. We can see how during the second half of the 2000s MP4 gained ground over AVI as the preferred format in which to distribute media files, until it became the dominant one by mid-2012. This setting is ideal to study how non-disruptive, non-sponsored technologies bid for dominance within a technical paradigm, in this case digital video. Furthermore, the exponential growth of MP4 is suggestive of self-reinforcing dynamics, which in turn points to network effects as a driver of adoption. Files shared in BitTorrent are ideal to study the role of indirect network effects, since peers on the network can be divided into file uploaders (a small subset of total peers) and regular users who only download files; by studying how the adoption decisions of one group affect the adoption decisions of the other we can separate and assess the impact of direct and indirect network effects. Additionally, since the studied technologies are freely used in this context, we can zero in on network effects while paying only limited attention to strategic maneuvering by the original designers or other parties.
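The format shares plotted in Figure 4 are simple ratios of monthly counts. As a minimal sketch of that computation, assuming each upload is available as a (month, format) pair: the function name `monthly_format_share` and the sample records below are hypothetical and are not taken from the TPB database.

```python
from collections import Counter, defaultdict


def monthly_format_share(uploads):
    """Return {month: {format: share}} from an iterable of (month, format) pairs."""
    counts = defaultdict(Counter)
    for month, fmt in uploads:
        counts[month][fmt] += 1

    shares = {}
    for month, by_format in counts.items():
        total = sum(by_format.values())
        # Share of each format among that month's uploads.
        shares[month] = {fmt: n / total for fmt, n in by_format.items()}
    return shares


# Hypothetical records for illustration only (not real TPB data).
sample = [
    ("2008-01", "AVI"), ("2008-01", "AVI"), ("2008-01", "AVI"), ("2008-01", "MP4"),
    ("2012-08", "AVI"), ("2012-08", "MP4"), ("2012-08", "MP4"), ("2012-08", "MP4"),
]
shares = monthly_format_share(sample)
print(shares["2008-01"]["MP4"], shares["2012-08"]["MP4"])
```

Working with shares rather than raw counts removes the effect of overall growth in uploads, which is why the two figures are shown side by side.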
Sections 2 and 3 give a quick overview of the workings of BitTorrent, The Pirate Bay and video container formats (AVI and MP4). Section 4 gives an overview of the literature written on the subject. Section 5 presents an econometric model to explain the changes in adoption of video formats by peers, and section 6 describes the data manipulations that were carried out in order to test the model. Section 7 reports the results, section 8 proposes a theoretical model to explain them, and in section 9 a simulation is run in order to better understand its dynamics. Finally, section 10 discusses the findings of the project.
Figure 3: Monthly uploads to TPB of AVI and MP4 files (x-axis: month, 2004-08 to 2013-01; y-axis: monthly uploads to TPB, 0 to 4,500; series: MP4, AVI)
Figure 4: Monthly share of MP4 and AVI for media files uploaded to TPB (x-axis: month, 2004-08 to 2013-01; y-axis: monthly upload share, 0% to 100%; series: MP4, AVI)
2. Overview of BitTorrent and The Pirate Bay
BitTorrent is a protocol that facilitates peer-to-peer sharing of large files. Peer-to-peer downloading differs from traditional client-server downloading in that the transfer of a file is not handled by a single central server, but is instead carried out by a network of computers running peer-to-peer file-sharing software (a client). When a file is downloaded from a peer-to-peer network, each computer that has the requested file transfers a small part of it, which greatly improves efficiency, both in terms of congestion and in terms of download time. BitTorrent was devised with the goal of incentivizing sharing and minimizing the free-rider behavior that is so prevalent in peer-to-peer sharing (mainly, disconnecting from the network as soon as the download is complete). It does so through a “tit-for-tat” system in which each peer preferentially uploads to the peers that upload to it, so that peers who contribute more enjoy better download speeds. Computers that are sharing the complete file are known as “seeders”, and their number is an indicator of the availability of a file in the network. Since for most peers upload speed is significantly lower than download speed, BitTorrent allows the parallel download of file chunks from several peers, thereby bypassing the bandwidth bottleneck other peer-to-peer networks face.
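The bandwidth argument can be illustrated with a back-of-the-envelope calculation. The sketch below is a deliberate simplification and not a model of the actual protocol: it assumes that every seeder devotes its full upload rate to a single downloader and that the downloader's own link caps the total. The function name and the figures used are hypothetical.

```python
def download_time_seconds(file_mb, download_mbps, seeder_upload_mbps):
    """Estimate download time when chunks arrive from several seeders in parallel.

    file_mb: file size in megabytes.
    download_mbps: downloader's link speed in megabits per second.
    seeder_upload_mbps: list with the upload speed of each seeder (Mbit/s).
    Assumption: seeders serve only this downloader at their full upload rate.
    """
    if file_mb <= 0 or download_mbps <= 0 or not seeder_upload_mbps:
        raise ValueError("sizes and speeds must be positive, with at least one seeder")
    # Parallel chunks add up the seeders' upload rates, capped by the download link.
    aggregate_mbps = min(download_mbps, sum(seeder_upload_mbps))
    return file_mb * 8 / aggregate_mbps


# Hypothetical 700 MB file on an 8 Mbit/s line, seeders uploading at 1 Mbit/s each.
one_seeder = download_time_seconds(700, 8, [1])
eight_seeders = download_time_seconds(700, 8, [1] * 8)
sixteen_seeders = download_time_seconds(700, 8, [1] * 16)
print(one_seeder, eight_seeders, sixteen_seeders)
```

Under these assumptions additional seeders shorten the download until the downloader's own link becomes the constraint, which is the sense in which the number of seeders measures the availability of a file.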
All these factors have contributed to a steady increase over the last decade in the number of users and companies using the BitTorrent protocol to share files, legally or illegally. As Figure 5 illustrates, the advent of broadband for mass consumption has facilitated the exchange of increasingly large files, and since the early 2000s sharing video files (be it movies, video clips or television shows) has become commonplace for many internet users. Following this surge in popularity, websites started appearing that indexed the files in the BitTorrent network and provided the information necessary to access them (Torrent Files and Magnet Links²).
2 Torrent Files contain data about the locations of a file within the BitTorrent network while Magnet
Links contain a unique identifier that is derived from the contents of the file. Both can be used to start peer-to-peer downloads in the BitTorrent protocol.
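To make the idea of a content-derived identifier concrete: in the original BitTorrent specification the identifier carried by a Magnet Link is the SHA-1 hash of the bencoded "info" dictionary of the torrent, which itself contains the hashes of the file's pieces. The following is a minimal sketch of that derivation; the helper names and the sample dictionary (file name, sizes, placeholder piece hash) are hypothetical.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts using BitTorrent's bencoding."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def magnet_link(info):
    """Build a magnet URI from the SHA-1 of the bencoded info dictionary."""
    info_hash = hashlib.sha1(bencode(info)).hexdigest()
    return "magnet:?xt=urn:btih:" + info_hash


# Hypothetical info dictionary; "pieces" would normally hold real SHA-1 piece hashes.
sample_info = {
    "name": "example.mp4",
    "length": 1024,
    "piece length": 262144,
    "pieces": b"\x00" * 20,
}
link = magnet_link(sample_info)
print(link)
```

Because the hash covers the piece hashes, any change in the shared content yields a different identifier, so two peers holding the same Magnet Link are guaranteed to be exchanging the same data.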
Figure 5: Broadband penetration and TPB use (series: broadband penetration in OECD countries, in %, 0 to 30; bi-quarterly file uploads to TPB, 0 to 300,000). Source: OECD, own preparation
One of the biggest repositories is The Pirate Bay (TPB), established in 2004 and indexing more than 2 million files as of February 2013. Throughout its history TPB has frequently been in the spotlight due to its status as one of the biggest sites providing access to pirated content, which has made it the target of intellectual property enforcers who have repeatedly taken legal action against it. Despite their efforts the website remains active, although it has been forced to change its country domain several times (currently all its traffic is being routed from the island of Saint Martin). Its long trajectory, along with its popularity, makes it an ideal candidate to study how the MP4 format has come to dominate most video downloads. The website has a page for each file in its catalog, which contains information on the file and allows users to leave comments. Further increasing its attractiveness for research, a dump of its database was carried out in February 2013 by programmer Karel Bilek, who publicly posted the results and made them available for download by anyone interested.
The motivations and incentives of peers operating in a BitTorrent network are clearer for those that download files than for those that upload them. The incentives for the downloading side need little explanation, since there is an obvious benefit to downloading media for free and at a decent speed. However, the uploading side, the one that creates the file and prepares it for sharing, invests a significant effort in it apparently without any kind of compensation other than the occasional thank you from people that download. In some cases Torrent files contain text documents which lead to the uploading party’s website, thus generating traffic which may translate into ad revenue. Other files may contain malware or may require the completion of surveys or the disclosure of personal information in order to unlock the contents. Finally, some peers seem to operate out of idealism, with their final goal set on the free circulation of information, whatever its shape or content. Regardless of the motivation behind the upload one thing is clear: the more a file is shared the better. All the peers that upload files wish to maximize their impact by making sure the files are widely distributed. Format is one of the choices that factor into the success of a file, and therefore peers that upload files decide which format to adopt with this maximization goal in mind. For clarity, during the rest of the project those users that upload files will be referred to as “uploaders” while those that download them will just be “users”.
The fact that downloading a file from BitTorrent increases the download speeds for
that specific file for all users makes the system ideal to study direct network effects.
Indirect network effects can also be assessed, in particular the effect that the diversity of
contents available in either format has on the adoption rates of each type of peers.
3. Overview of digital audiovisual formats
In order to be played, a digital video first needs to be packaged in a wrapper or container format. This wrapper contains data about the video which the media player, the application in charge of turning the digital information contained in the file into actual images and sounds, needs in order to play it. There are several competing formats available, each with its strengths and weaknesses. Figure 6 shows the number of monthly uploads for four of the most frequently used wrappers, which for convenience will be called “the big four”: MKV, WMV, MP4 and AVI. Despite this variety it quickly becomes clear that two formats have dominated the files available for download on TPB over the decade: Microsoft’s Audio Video Interleave (AVI) and the International Organization for Standardization’s MP4. The fate of WMV, another Microsoft wrapper, is very closely tied to that of its more popular and successful cousin AVI. Furthermore, due to methodological constraints, WMV was not suitable for our study: as will be explained later on, the empirical part centers on format changes for television shows, of which only a very small part is wrapped in WMV (a disproportionate share of WMV files are “adult” files). MKV, on the other hand, also experienced a big increase in usage during the same period. While this project focuses mostly on the competition between AVI and MP4, we provide some estimates and comments on MKV adoption in Appendix 4, and its usefulness for future research is examined in the discussion section.
For the better part of the last decade AVI was in a dominant position, with most audiovisual files exchanged on BitTorrent wrapped in that format. An AVI file is divided into three parts, also known as “chunks”. The first one contains metadata, such as the video’s resolution (width and height) or frame rate. The second one contains the audiovisual content proper, which is encoded using a software library known as a codec: before a video is packaged into its container it needs to be encoded, that is, transformed from raw audiovisual data into a (usually compressed) digital stream. There is a large variety of codecs, and many are freely available to the public. The final chunk is optional and contains additional metadata on the file.
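The chunk structure comes from RIFF, the generic container layout on which AVI is built: a file opens with the four bytes "RIFF", a little-endian 32-bit size and a four-byte form type, which for AVI is "AVI " (with a trailing space). As a minimal sketch of how a program can recognize the wrapper before looking at any chunk, the function names below are hypothetical and the sample bytes are synthetic rather than taken from a real video.

```python
import struct


def read_riff_header(data):
    """Parse the 12-byte RIFF header: magic, payload size and form type."""
    if len(data) < 12:
        raise ValueError("need at least 12 bytes")
    # "<" means little-endian without padding: 4 bytes, uint32, 4 bytes.
    magic, size, form_type = struct.unpack("<4sI4s", data[:12])
    if magic != b"RIFF":
        raise ValueError("not a RIFF file")
    return {"size": size, "form_type": form_type}


def is_avi(data):
    """Return True if the bytes start with a RIFF header whose form type is AVI."""
    try:
        return read_riff_header(data)["form_type"] == b"AVI "
    except ValueError:
        return False


# Synthetic header built for illustration; a real file would be followed by chunks.
sample = struct.pack("<4sI4s", b"RIFF", 4, b"AVI ")
print(is_avi(sample), is_avi(b"\x00" * 12))
```

Checking the header rather than the file extension matters in a catalog such as TPB's, where the name of a file is chosen freely by the uploader.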
Figure 6: Monthly uploads to TPB for the big four (x-axis: month, 2004-08 to 2013-01; y-axis: monthly uploads to TPB, 0 to 4,500; series: MP4, AVI, WMV, MKV)
Despite its initial success, the AVI format has several limitations, especially regarding compression and aspect ratio, as well as a lack of standardization for features such as the time code, which are important for professional use of the file. Some competing formats have solved these issues, allowing for more efficient file transfer and manipulation. One of these is ISO’s MP4 format (also known as MPEG-4 Part 14), developed as a version of Apple’s QuickTime File Format. Although it was first published in 2001, it did not start enjoying success until the mid-2000s, mostly due to the popularization of High Definition (HD) media. Before HD the most successful codec was DivX, which allowed the efficient compression of large videos into a digital file. Most AVI files had their second chunk encoded with DivX, so the popularity of the codec fueled the diffusion of the format. However, one of the shortcomings of the public version of the codec was its inefficiency when encoding HD videos. Alternative codecs existed (such as x264), although at first they did not enjoy much success. This is attributable to the lack of demand for HD files, since most monitors at the beginning of the 2000s were not able to display this content. However, as the decade advanced, the widespread adoption of HD television sets and screens (see Figure 7) changed this situation, greatly increasing the demand for HD media files.
Figure 7: HD television sets sold in 2005 – 2010
Source: GfK Retail and Technology, July 2010
The fact that many of these television sets could play encoded digital files increased the demand for digital media. However, this increase in demand was not spread evenly among all codecs and, as can be seen in Figure 8, those that were better suited for encoding HD files (in particular x264) absorbed most of the increase.
While MP4 was not the only container format capable of carrying x264-encoded video files, it benefited greatly from the switch to the new codec since, due to the factors examined later in this section, it was in an advantageous position to exploit any weakness AVI showed. The similarity between the x264 trend in Figure 8 and the MP4 trend in Figure 6 illustrates a very strong tie between the x264 codec and the format.
Figure 8: Number of monthly uploaded files to TPB per codec (x-axis: month, 2004-04 to 2013-01; y-axis: monthly uploaded files, 0 to 12,000; series: DivX, x264)
Another force that played an important role in the adoption of MP4 was the advent of smartphones and digital music. One of the first companies to take advantage of the improvements MP4 offered was Apple, which sold the music in its iTunes Store as AAC-encoded audio stored in an MP4 wrapper. Other portable devices also offered compatibility with the standard, and the advent of mobile computing entailed a large increase in the installed base of devices compatible with the new format. In 2004 Apple introduced its Apple Lossless codec, which also stores audio in an MP4 wrapper and was later released as open source and royalty free, further fuelling the format's growth. At that point most of the content distributed in MP4 was music and, despite its advantages over AVI, the format was hardly used for packaging audiovisual content: mobile devices had neither the memory nor the resources necessary to play those videos, and HD screens were not yet popular. Furthermore, on the TPB site the huge popularity of AVI had locked users and uploaders into the incumbent format. The release of additional MP4-compatible devices such as the PlayStation 3 and the incipient penetration of media centers into households further increased the attractiveness of the format.
This account summarizes external trends that explain the adoption of MP4 by
users. Without them network effects alone would not have been sufficient to move the
user base of TPB and the wider BitTorrent community out of the old format. In later
parts of the project we will take this into account, despite the fact that our focus is on
the network effects.
4. Literature review
The first literature on technology adoption was written in the 1980s and focuses largely on the role network effects play in the process. In their seminal paper, Katz and Shapiro (1985) describe how the consumption externalities generated by the users of a product affect its demand. They identify two possible types of consumption externalities. The first, corresponding to direct network effects, is a “direct physical effect of the number of purchasers on the quality of the product”, as in the telephone case described in the introduction. The second, corresponding to indirect network effects, comprises variables that can be related to the number of users of a product but do not depend exclusively on it: market share or the cost of post-sale services are instances of these. They go on to build a model in which firms compete through pricing to attract consumers to their networks, and find that under certain circumstances the optimal solution is to make the networks compatible, since this amplifies network effects and maximizes adherence by consumers.
Further developing their approach, Katz and Shapiro (1985, 1986 and 1995) extend their model and apply it to technology adoption. They introduce the distinction between sponsored and unsponsored standards, and identify the inefficiencies that can arise from adoption. Specifically, they demonstrate that the incumbent technology has an advantage over the new one in the case of unsponsored standards, while a sponsored standard has a strategic advantage over unsponsored ones, even if it is inferior, since its sponsor can behave strategically to make sure its technology is the one that finally succeeds. This process of replacement of a superior technology by an inferior or less mature one is called excess momentum. They go on to describe the maneuvers the sponsor of a standard can use in order to gain the upper hand, such as committing to future prices or, in the case of the software-hardware paradigm, integrating vertically.
Another pillar of the late-1980s technology adoption literature is the work of Farrell and Saloner (1985, 1986). Their model describes the adoption of an unsponsored standard and adds two factors to the consumers’ choice. First, they assume consumers can form expectations as to which standard will succeed, which can lead to a bandwagon effect in which the choice of the first consumer triggers a cascade that makes all subsequent adoption decisions identical to the first one. Second, they introduce the notion of an installed base, which reflects the number of users committed to a standard on day zero. Under uncertainty, this installed base can trap the market in an old, inferior standard: although each consumer would individually prefer to adopt the new one, none dares to do so without knowing what subsequent adopters will choose, a process they call excess inertia.
A final set of models proposed in the 1980s focused on increasing returns and path dependence, with Arthur (1987, 1990 and 1994) as one of the main exponents of this current. In the series of papers compiled in the book Increasing returns and path dependence in the economy he provides models that illustrate how historical accidents may explain why an inferior technological standard ends up being adopted over a superior one. The main analytical tool he uses is a statistical model known as a Polya urn process: “it can be pictured by imagining a table to which balls are added one at a time; they can be of several possible colors – white, red, green or blue. The color of the ball to be added next is unknown, but the probability of a given color depends on the current proportion of colors on the table”³. Arthur goes on to describe several economic processes governed by similar self-reinforcing dynamics, such as the choice of geographical location by firms, or technology adoption. In the case of technology adoption he defines the adoption choice as a random walk with critical bounds: once the process crosses a threshold, all future choices go to the same technology. He goes on to demonstrate the existence of several stable equilibria for such problems, and examines how historical accidents condition which one is eventually reached. His models fit into the Evolutionary Economics school and are a great introduction to path dependence, a notion that has gained a lot of relevance in development, spatial and financial economics.
3 Arthur, Brian, 1994, Increasing Returns and Technology adoption p. 6
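The path dependence of the urn process quoted above is easy to see in a small simulation. The sketch below is a simplification under stated assumptions, not Arthur's own model: it uses only two colors (which could stand for two competing formats), starts with one ball of each, and draws the next color with probability exactly equal to its current proportion. The function name and parameter values are hypothetical.

```python
import random


def polya_urn_share(steps, seed, initial=(1, 1)):
    """Simulate a two-color Polya urn and return the final share of color A.

    At each step one ball is added; it is of color A with probability equal
    to the current proportion of A balls, and of color B otherwise.
    """
    rng = random.Random(seed)  # seeded generator, so each run is reproducible
    a, b = initial
    for _ in range(steps):
        if rng.random() < a / (a + b):
            a += 1
        else:
            b += 1
    return a / (a + b)


# Five independent histories with identical rules and starting conditions.
final_shares = [polya_urn_share(steps=5000, seed=s) for s in range(5)]
print([round(share, 3) for share in final_shares])
```

Each history settles on a stable proportion, but the proportion differs from one history to the next depending on the early draws, which is the sense in which historical accidents condition the equilibrium that is eventually reached.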
During the 1990s and 2000s most of the literature turned to the study of sponsored standards and technologies and to the strategic aspects of the problem. Some papers follow the tradition of the earlier studies, while others develop their theories within the two-sided markets approach. An example of the former is Besen and Farrell (1994), who describe different competitive strategies in standard setting, the mechanisms by which a firm may try to steer the market in its favor, and how the market structure influences the outcome. They demonstrate that when firms are similar they choose the same compatibility strategy and therefore facilitate the emergence of a single, consolidated standard. However, if firms are dissimilar a standards battle is likely to occur: bigger firms may want to prevent new entrants from joining their network. Along the same lines, Götz (1999) analyzes the adoption and diffusion of a technology in markets with monopolistic competition, and demonstrates how in a non-cooperative setting identical firms may adopt a new technology at different dates. For non-identical firms he assigns a rank to the good of each firm, which alters consumers’ demand for it, and demonstrates that bigger firms have a bigger incentive to adopt. This project proposes a variant of Götz’s model in which consumers, and not only firms, face an adoption choice (see Section 8).
Varian and Shapiro (ibid) and Varian et al. (2004) give a business-oriented
overview of the state of the art in the standards literature and create many useful
classifications for technical innovations. Their focus is on strategic maneuvering and on
discussing past cases, specifically standards wars such as VHS vs. Beta or Standard
Gauge vs. Broad Gauge in the early days of railroads.
A last, and quite recent, contribution to this line of inquiry is the manual
compiled by Farrell and Klemperer (2006). Although it focuses mainly on network
externalities, it studies competition under both switching costs and network effects. They show
how these effects can lock customers into their early choices and how arrangements
that are suboptimal from a social welfare point of view can prevail under these conditions.
Literature on platform competition in two-sided markets is also relevant when
studying technology adoption, in particular competition between sponsored
non-compatible standards, although the bulk of it does not deal with it explicitly. This is
illustrated by the change in language, with platform being used more often than
technology or standard. Still, models of two-sided markets share many similarities with
models of technology adoption, as in essence both deal with network effects and the
incentives they generate. Rochet and Tirole (2002, 2006) have published several
works on the dynamics of competition in two-sided markets, with a special focus on
pricing. In their models, platforms compete to attract a demand which is split between two
sides, with at least one of them experiencing a positive membership externality when
additional customers join the opposite side. They demonstrate that the profit-maximizing
decision of a monopolistic platform is to subsidize the side of the demand
that has higher price elasticity and overcharge the inelastic side. Armstrong (2002)
develops his own model and pays special attention to competition between platforms.
He also allows the demand side to perform a broader range of behaviors, specifically he
allows those agents to multihome (i.e. use several platforms at the same time). In the
technology adoption context his model could be applied to
Empirical literature
A lot of the empirical literature on technology adoption has been done within the
marketing field (and a surprising amount of it focuses on adoption of electronic
payment systems), and tends to be based on attitudinal rather than hard data. There is a
lot of variety in the frameworks these researchers use, but most construct their
investigations around the Technology Acceptance Model (TAM). Proposed by Davis
(1989), TAM is applied mostly in research on the diffusion of information technology, and
focuses on the perception users have of what he considers the two main drivers behind
an adoption decision: ease of use and usefulness. The model has undergone several
extensions and modifications and has been widely used since its introduction.
Some research has also been dedicated to the exploration of network effects using
data from other sources. For example, Rysman (2003) studies competition between
networks by studying how Yellow Pages directories compete. His final goal is to
determine whether or not standardization would be preferable to competition from a
social welfare point of view, since that would allow maximizing the magnitude of
network effects. His conclusion is that, in the specific case of Yellow Pages directories,
competition is preferred to standardization.
As for technology adoption, Ohashi (2003) studies the competition
between Beta and VHS between 1978 and 1986, and the impact network effects may have had
on the final victory of VHS. In order to identify the network effects he incorporates into
the consumers’ utility function an installed base variable for each of the competing
standards. He then estimates adoption using a nested logit model, in
which he first estimates the likelihood of adoption of any VCR device, and then the
likelihood of choosing either VHS or Beta. His model allows him to run simulations
with which he can contrast hypotheticals, and one of the most remarkable results he
obtains is that the success of VHS would have been unlikely had its price been lower
during the first stages of competition.
Along this line of work, Clements and Ohashi (2004) study the role of indirect
network effects on the videogame market in the United States. Videogame platforms are
an instance of sponsored non-compatible technological standards, and securing a broad
variety of software products early on in order to get a large installed base is one of the
main concerns of platform rivals. The paper goes on to model the strategic interactions
between platforms and software providers.
5. Econometric Model
The objective of the empirical part of this project is to determine the impact of direct
and indirect network effects on the substitution of AVI by MP4 for audiovisual
file-sharing in TPB. In order to do so, two models have been developed to explain the
variations in the share of adoption of MP4 files (one for each side of the BitTorrent
ecosystem).
Direct network effects are a consequence of adoption of MP4 by other users: as
explained before, download speed for a specific file increases with the number of users
that have a copy of it, which makes a widely adopted format more attractive.
However, identifying which files each user is interested in is impossible with the data
accessible to us, so we decided to cluster our sample by television show: we
assume that someone who downloads an episode of a television show is likely to
be interested in other episodes, and therefore stands to benefit from the direct network
effect generated by larger shares in a specific format within that subset of files.
Furthermore, clustering by television show holds the additional advantage of allowing
us to follow a group of users and uploaders over time, since new episodes are uploaded
into TPB after they are broadcast through traditional television, so dummy variables for
unobserved demographic characteristics and time can be added.
As for the indirect network effect, we model it as a function of the variety of
audiovisual media being offered in MP4. In order to parameterize this variety we
use the lagged share of audiovisual media being offered in that format, or to put it
another way, the percentage of uploaded media files in MP4 up to that date.
With this in mind we define the following MP4 adoption share function, written here in general form

$$ S^{U}_{TV,t} = f\left( IB_{TV,t},\; V_{t-1},\; \delta_{TV},\; \tau_{t} \right) $$

where $S^{U}_{TV,t}$ is the MP4 adoption share among users for files uploaded at
date t for television show TV; $IB_{TV,t}$ is the format adoption share of users for
previous episodes of that television show; $V_{t-1}$ is the lagged proportion of media
files being offered in MP4, and $\delta_{TV}$ and $\tau_{t}$ are television show and time dummies. We
decided to lag $V$ since, given the large number of media files available for
download, it is unlikely that users would react immediately to variations in the format
composition of the aggregate total of files available.
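The lagged share of media offered in MP4 can be computed along the following lines. This is a minimal sketch under stated assumptions: the input is taken to be a list of (upload date, format) pairs, the lag is interpreted as "all files uploaded strictly before the date in question", and the function name `lagged_mp4_share` is hypothetical.

```python
from collections import defaultdict
from datetime import date


def lagged_mp4_share(uploads):
    """Map each upload date to the MP4 share among files uploaded before it.

    uploads: iterable of (upload_date, fmt) pairs, with fmt such as "MP4"
    or "AVI". Dates with no earlier uploads map to None.
    """
    per_day = defaultdict(lambda: [0, 0])  # date -> [MP4 count, total count]
    for day, fmt in uploads:
        per_day[day][1] += 1
        if fmt == "MP4":
            per_day[day][0] += 1

    shares = {}
    mp4_so_far = 0
    total_so_far = 0
    for day in sorted(per_day):
        # The share uses only files uploaded strictly before `day` (the lag).
        shares[day] = mp4_so_far / total_so_far if total_so_far else None
        mp4_so_far += per_day[day][0]
        total_so_far += per_day[day][1]
    return shares


if __name__ == "__main__":
    sample = [
        (date(2009, 1, 1), "AVI"),
        (date(2009, 1, 1), "MP4"),
        (date(2009, 1, 2), "MP4"),
        (date(2009, 1, 3), "AVI"),
    ]
    for day, share in lagged_mp4_share(sample).items():
        print(day, share)
```

Excluding same-day uploads keeps the regressor predetermined with respect to the adoption share observed on that date, which is the purpose of the lag.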
Time dummies are included to account for unobserved changes in the utility of
MP4. As was discussed in Section 3, starting in the mid ‘00s MP4 became more
attractive thanks to its compatibility with HD and mobile devices. Lacking any way
to model these effects, we include yearly time dummies as a means to capture the
increase in popularity the format had during the decade for reasons other than network
effects. Finally a dummy is also included for television shows, which is useful mainly to
cluster the sample in differentiated user groups and to a lesser extent to control for
unobserved demographic characteristics affecting the adoption decision.
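The role of the television show and yearly dummies can be made concrete by building the regressor matrix by hand. This is only a sketch: it assumes a linear specification with an intercept, and the keys `show`, `year`, `show_share_lag` and `agg_share_lag` are hypothetical names for the variables described above.

```python
def build_design_matrix(observations):
    """Build regressors with television show and year dummies.

    observations: list of dicts with the (hypothetical) keys 'show', 'year',
    'show_share_lag' and 'agg_share_lag'. Returns (column_names, rows).
    """
    shows = sorted({o["show"] for o in observations})
    years = sorted({o["year"] for o in observations})
    # Drop the first show and the first year as reference categories so the
    # dummies are not perfectly collinear with the intercept.
    show_cols = shows[1:]
    year_cols = years[1:]

    columns = ["const", "show_share_lag", "agg_share_lag"]
    columns += [f"show={s}" for s in show_cols]
    columns += [f"year={y}" for y in year_cols]

    rows = []
    for o in observations:
        row = [1.0, o["show_share_lag"], o["agg_share_lag"]]
        row += [1.0 if o["show"] == s else 0.0 for s in show_cols]
        row += [1.0 if o["year"] == y else 0.0 for y in year_cols]
        rows.append(row)
    return columns, rows


if __name__ == "__main__":
    obs = [
        {"show": "show_a", "year": 2008, "show_share_lag": 0.1, "agg_share_lag": 0.05},
        {"show": "show_b", "year": 2008, "show_share_lag": 0.2, "agg_share_lag": 0.05},
        {"show": "show_a", "year": 2009, "show_share_lag": 0.3, "agg_share_lag": 0.08},
    ]
    cols, rows = build_design_matrix(obs)
    print(cols)
    for r in rows:
        print(r)
```

Each yearly dummy absorbs whatever shifted MP4 adoption in that year across all shows, so the coefficients on the two share variables are identified from variation within a show and within a year.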
The model estimating the adoption rate among uploaders of a specific television
show is similar, only in this case both the direct and indirect network effects come from
users. This is based on the assumption that uploaders do not experience any direct
benefit from other uploaders switching to MP4 and instead benefit from it
indirectly through a ricochet effect: more uploads in MP4 mean more users in MP4,
which in turn makes the format more attractive to uploaders. In order to capture it we
use the total installed base share for users at time t, taking into account all audiovisual
media. For the modeling of the direct network effect we proceed in the same fashion as
in the users’ case, taking into account only the user installed base within a single
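television show. Thus we specify the following model for upload share, written here in general form

$$ S^{UP}_{TV,t} = g\left( IB_{t},\; IB_{TV,t},\; \delta_{TV},\; \tau_{t} \right) $$

where $S^{UP}_{TV,t}$ is the MP4 upload share for television show TV at date t, $IB_{t}$ is the share of installed base MP4 has among all users downloading
media files, and $IB_{TV,t}$, $\delta_{TV}$ and $\tau_{t}$ are the same as in the user specification
(installed base share for a specific television show, television show dummies and time
dummies, respectively).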
6. Data
The TPB database used was compiled by Karel Bilek4 by running a Perl script on the
TPB website. The whole process took “about six months” and according to his own
account between 100 and 300 Torrent files are missing (which is negligible when
compared to the more than 2 million Torrent files he did manage to compile). The data
was stored in XML format and the uncompressed file is 4.4 GB in size. For each
Torrent file the dump contained several fields, all of which were discarded except the
following:
- Identification number: a unique identifier for each file
- Title: the title of the file, often specifying the format as well in the case of media
- Seeders: the number of users that have a full copy of the file and that are sharing it
- Upload date: the date on which the Torrent file was created
- Information: comments left by the uploader, which were also checked for file format
- User comments: comments left by downloaders of the file
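A dump of this size can also be filtered in a streaming fashion, one record at a time. The sketch below is not the procedure used in this project: it uses Python's standard `xml.etree.ElementTree.iterparse`, and the tag names (`torrent`, `id`, `title`, `seeders`, `uploaded`, `info`, `comment`) are hypothetical, since the actual schema of the dump is not reproduced here.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical tag names for the fields that are kept.
KEPT_TAGS = ("id", "title", "seeders", "uploaded", "info")


def iter_torrents(source):
    """Yield one dict per <torrent> element, keeping only the listed fields.

    source: a filename or a binary file-like object containing the XML dump.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "torrent":
            continue
        record = {tag: elem.findtext(tag) for tag in KEPT_TAGS}
        record["comments"] = [c.text or "" for c in elem.iter("comment")]
        yield record
        # Discard this record's children and text to limit memory use.
        elem.clear()


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<dump><torrent><id>1</id><title>Some Show S01E01 MP4</title>"
        b"<seeders>12</seeders><uploaded>2010-01-02</uploaded>"
        b"<info>mp4 rip</info><size>123</size>"
        b"<comments><comment>thanks</comment></comments></torrent></dump>"
    )
    for rec in iter_torrents(sample):
        print(rec)
```

Processing each record as soon as its closing tag is read avoids loading the whole file into memory, which matters for a multi-gigabyte dump.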
Due to the large size of the database most of the cleaning and sampling was carried out
with the UNIX terminal. A first step consisted in creating a new XML field containing
the number of comments each Torrent file had (which although available on the TPB
website was not included in Bilek’s dump) and removing the actual comments in order
4 Karel Bilek’s Github page: http://runn1ng.github.io/piratebay.html