Usage-Based Pricing of the Internet*
Aviv Nevo†
Northwestern University
John Turner‡
University of Georgia
Jonathan Williams§
University of Georgia
Preliminary and Incomplete
November 2012
Abstract
We estimate demand for residential broadband to study the efficiency properties of usage-based billing. Using detailed, high-frequency Internet Protocol data records, we exploit variation in the intertemporal tradeoffs faced by subscribers to estimate the distribution of subscribers' preferences for different characteristics of service: access and overage fees, usage allowances, and connection speeds. We find significant heterogeneity in tastes along each dimension of service. Using these estimates, we examine the efficiency of various three-part tariff pricing schedules. We find that usage-based pricing models currently being employed in North America are successful at eliminating large volumes of low-value traffic while having a minimal impact on subscriber welfare. These findings provide strong support for the FCC's backing of the industry's move away from flat-rate pricing.
Keywords: Demand, Broadband, Dynamics, Usage-based Pricing, Welfare.
JEL Codes: L13.
*We thank those North American Internet Service Providers that provided the data used in this paper. We thank Terry Shaw, Jacob Malone, Scott Atkinson, and seminar participants at Ga. Tech and UGA for insightful comments that significantly improved this paper. Jim Metcalf provided expert computational and storage support for this project. All remaining errors are ours.
†Department of Economics, Northwestern University, nevo@northwestern.edu, ph: (847) 491-7001.
‡Department of Economics, University of Georgia, jlturner@uga.edu, ph: (706) 542-3376.
§Corresponding Author: Department of Economics, University of Georgia, jonwms@uga.edu, ph: (706) 542-3689.
1 Introduction
In the U.S., "last mile" connectivity to the internet is privately provided by telecom (e.g.,
AT&T) and/or cable (e.g., Comcast) companies. This leaves the problem of allocating
scarce network resources (i.e., bandwidth) to the discretion of the internet service provider,
or ISP. The ability of ISPs to price the delivery of content from the internet to subscribers
(and vice versa) has important implications for the future development of online content,
communications, and, more generally, the way in which people use the internet. This
has led to significant discussion of how the internet should be regulated.
During the past decade in the U.S., ISPs have typically sold unlimited access to the
internet for a fixed monthly fee. During this time, the average residential subscriber's usage
has grown 50% annually. This dramatic growth in usage has led to a shift in the industry
towards usage-based pricing plans similar to those commonly associated with cellular phones.
Typically, these plans take the form of a three-part tariff: a fixed access price, a usage
allowance, and a marginal price for usage in excess of the allowance. Just this year, two of
the largest cable providers, Comcast and Time Warner Cable, conducted trials of usage-based
pricing in select markets.
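As a minimal sketch of how a three-part tariff maps usage into a monthly charge, the function below applies the three components described above. The function name and the plan parameters in the usage lines are hypothetical illustrations, not values from the data.

```python
def monthly_bill(usage_gb, access_fee, allowance_gb, overage_price):
    """Monthly charge under a three-part tariff.

    access_fee    : fixed fee paid regardless of usage
    allowance_gb  : usage covered by the access fee
    overage_price : price per GB of usage above the allowance
    """
    overage_gb = max(usage_gb - allowance_gb, 0.0)
    return access_fee + overage_price * overage_gb


# Hypothetical plan: $40 access fee, 50 GB allowance, $2 per GB of overage.
below_cap = monthly_bill(30.0, 40.0, 50.0, 2.0)   # only the access fee is paid
above_cap = monthly_bill(65.0, 40.0, 50.0, 2.0)   # 15 GB of overage is billed
```

Below the allowance the marginal price of a GB is zero; above it the marginal price equals the overage price, which is the nonlinearity the rest of the paper exploits.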
ISPs argue that usage-based pricing is necessary to curtail the usage of the small number
of subscribers who dramatically drive up network costs and degrade the quality of service
for other subscribers. On this view, usage-based pricing is a type of Pigouvian tax that helps
equate a subscriber's private benefit to the costs borne by the ISP (i.e., network investment)
and other subscribers (i.e., degraded service). ISPs also argue that usage-based pricing gives
content developers the right incentives to minimize the bandwidth requirements of their
applications. For example, YouTube recently added an option for users to reduce the quality
of video streams. This allows subscribers to lower quality to a level acceptable to them,
while avoiding overage charges and minimizing any costs that the traffic they generate
imposes on the ISP or other users. These types of arguments led the Federal Communications
Commission (FCC) to recently back the practice: "Usage-based pricing would help drive efficiency in the
networks," Julius Genachowski, FCC Chairman (Chicago Tribune, May 22, 2012).
The recent shift in the industry towards usage-based pricing models, along with the support
of government regulators, has given rise to numerous organizations devoted to preventing
it. These include web sites (e.g., www.stopthecap.com and openmedia.ca/meter)
that monitor ISPs' activities for indications (e.g., reporting usage on subscribers' bills) that
usage-based pricing will be introduced. The web sites then ask their followers to bombard
the ISP with complaints, often providing direct contact information for the companies'
executives, in the hope of preventing or delaying it. More formal organizations lobby regulators
and legislators directly. Geekdom is one such organization: "It's like locking the doors to the
library," Nicholas Longo, Geekdom Director (NY Times, June 26, 2012). Generally, these
organizations believe that the activities of high-volume subscribers are of high value and that any
type of "caps" or usage-based pricing will result in significant welfare losses for subscribers.
The extant academic literature on topics related to the economics of internet access is
very limited and almost exclusively theoretical in nature. Existing theoretical studies (e.g.,
Odlyzko (2012)) often reach very strong and conflicting conclusions regarding the welfare
consequences of usage-based pricing. To date, the lack of detailed data on consumption of
internet content has limited empirical work on the topic to only a couple of papers: Goolsbee
and Klenow (2001) and Lambrecht et al. (2007). Goolsbee and Klenow (2001) use Forrester
Technographics Survey data on individuals' time spent on the internet and earnings to
innovatively estimate the private benefit to subscribers of residential broadband. However,
the authors' estimate relies on the potentially dubious assumption that an hour spent on the
internet is an hour of forgone wages.
The lack of more empirical studies on these important issues is largely due to the proprietary
nature of much of the required data and the technological constraints associated with
collecting it. However, the dramatic growth in usage over the past decade
has forced ISPs to invest in technology to track usage and better manage scarce network
resources. These investments in data collection have created an opportunity for academic
researchers to better understand the economics of broadband internet access. In this paper,
we use detailed hourly data on internet utilization, obtained from a group
of North American ISPs, some of which employ usage-based pricing, to study the impact
of usage-based pricing on subscribers. We observe the total volume of content downloaded
and uploaded each hour for approximately 5 months from late 2011 to early 2012 for over
30,000 subscribers.
To study the welfare implications of usage-based pricing, we begin by building a dynamic
model of subscribers' intertemporal decision making throughout a billing cycle under
usage-based pricing. Specifically, we model subscribers as utility-maximizing agents that solve a
dynamic optimization program each billing cycle. While we do not have variation in the
service plans or tiers offered to subscribers during the sample period, the high-frequency
nature of our data and the variety of three-part tariffs offered to consumers allow us to accurately
estimate demand. In particular, the high-frequency data allow us to exploit variation in
the shadow price, i.e., the implications of current consumption for the probability of incurring
overage charges later in the billing cycle, to trace out marginal utility for subscribers. In
addition, selection into plans, or choice of a particular three-part tariff, reveals a great deal
about preferences by revealing an average willingness to pay for content and a preference over
the speed of one's connection (i.e., Mb/s). Our provider offers plans ranging from almost
linear tariffs (i.e., very low usage allowances) to plans with allowances well over 100 GBs.
The connection speeds, overage prices, and usage allowances are all non-decreasing in the
fixed access fee.
A potential concern with such a model is that it may be wrong to have
each subscriber solve his/her optimization problem in isolation. That is, network externalities
among users could make the problem a dynamic game in which subscribers choose when to use
the internet based on preferences and expectations regarding congestion in the network.
However, the data we use in this paper come from an ISP that operates an over-provisioned
and pristine network. This allows us to accurately model a subscriber's usage decision as
an independent one. We discuss how we can measure the absence of network externalities
in this provider's network in Section 2.
To estimate the model, we adapt the techniques of Ackerberg (2009), Bajari et al. (2007)
and Fox et al. (2012). First, these techniques avoid very computationally expensive fixed-point
estimation algorithms. In particular, it is only necessary to solve the dynamic programming
problem for each type of agent a single time. Second, one can relax parametric/distributional
assumptions typically made when estimating such dynamic models. This is critical for our
purposes given the limited amount of information we have about a subscriber and the extreme
heterogeneity in usage behavior, which is difficult to model. Finally, the techniques naturally deal
with difficult-to-model forms of selection, which is an important issue in our application:
subscribers select into a service tier or plan and we only observe usage under that optimally
selected plan.
The application itself is of interest and has important policy implications. As desirable,
yet bandwidth-intensive, applications continue to be developed and subscriber usage grows,
it will be increasingly important to have accurate measures of the demand for content. Our
results largely support the current regulatory stance of the FCC. We find that usage-based
pricing, as currently implemented by North American providers, is successful at removing
a great deal of low-value traffic from the networks. This is largely due to the negative
correlation between the value and volume of typical online activities (e.g., 5 GBs to stream
a movie and only bytes to send 1000s of emails).1 We show that subscribers derive a great
deal of utility from the first bytes of traffic generated on a broadband connection, but utility
diminishes rapidly thereafter. This supports the goal of the FCC to increase the reach of
basic broadband service to more rural and under-served areas as stated in their National
Broadband Plan.2 Finally, our results are important for how content will be delivered in
the future by telecom and cable companies. There is an appeal to unicasting, or allowing
each user to view content at their convenience. However, the low willingness to pay we find
for this convenience and its high costs (i.e., the large amount of additional traffic to accommodate on
networks) suggest that into the foreseeable future (a short time in this industry), the cost
effectiveness of broadcasting will continue to dominate arguments against usage-based billing
of the internet.
1 This is consistent with Netflix's failed attempt to raise the prices of their service in 2012.
2 The FCC is currently working with telecom and cable providers to offer a basic $9.99 tier to low-income
households. See Greenstein and Prince (2008) for more on the historical reach of broadband services.
The remainder of the paper is organized as follows. In Section 2 we discuss our data in greater
detail and provide reduced-form results that motivate assumptions made in the structural
model and demonstrate a high price elasticity for online content. In Section 3 we discuss
the model used to capture the intertemporal decisions of subscribers regarding consumption
of content under usage-based billing. Section 4 presents our methodology for estimating
the structural model and presents the results. Sections 5 and 6 present, respectively, the results of our
counterfactual exercise to identify the benefit to subscribers of removing usage allowances
and our final conclusions.
2 Data
The data used in this paper are Internet Protocol Data Record (IPDR) data. IPDR is a
standardized framework for collecting usage and performance data from IP-based services
and is currently the most popular way for cable operators to efficiently measure subscriber
usage. The IPDR framework is supported by a DOCSIS (Data Over Cable Service Interface
Specification) 2.0/3.0 compliant CMTS (Cable Modem Termination System). A CMTS uses
a collector to gather data at a minimum configurable reporting period of 15 minutes. Our
data report the volume of a subscriber's usage every hour, linking subscribers to their
usage through their cable modem's (CM) MAC (Media Access Control) address. The
usage reported for a subscriber does not reflect DOCSIS framing overhead; however, it does
include operator-initiated management, control traffic, and Internet-originated traffic (pings,
port scans, etc.), which must be considered when metering internet usage. See Clarke (2009)
for more details on the structure of DOCSIS cable networks.
The unit of observation for the IPDR data is a MAC address and a record creation time
(month, day, and hour). The data also report the DOCSIS mode, 2.0 or 3.0, of each
subscriber's modem. DOCSIS 3.0 modems permit greater provisioned speeds, as DOCSIS
2.0 modems limit a subscriber's connection to no more than 42.88 Mb/s. In the data, we
observe bytes and packets passed by a subscriber, both in the upstream (e.g., uploading a
file to Dropbox) and downstream (e.g., streaming a movie from Netflix) directions. For
billing purposes, and consequently our purposes, the direction of the traffic is ignored and
we examine the total traffic in both directions.
In addition to the number of bytes and packets passed by a subscriber, we also observe
the number of packets that are delayed or dropped by the network for each subscriber.
Delayed packets correspond to those that are requested by a subscriber in excess of their
connection's provisioned speed, e.g., requesting packets at a rate of 10 Mb/s on a connection
that is provisioned for 8 Mb/s. Typically, delayed packets are ultimately passed. Dropped
packets are those that never reach their destination. Observing the extent of dropped
packets is extremely important, because in a network that is inadequately provisioned (i.e.,
not enough bandwidth to handle requests for content) externalities among users can result
in interdependent demands. Fortunately, our data come from a market and internet service
provider (ISP) that operates an over-provisioned and pristine network. Over a 5-month
period, not a single subscriber had more than 0.001% of packets dropped in any one-hour
period. Our discussions with industry experts suggest that dropped packets in excess of 1%
correspond to a degraded quality of service that would be noticeable to the subscriber. For
this reason, in modeling a subscriber's usage decision, we assume that the
decision does not depend on concerns over congestion in the network. Otherwise stated,
we assume independence of subscribers' demand functions. IPDR data do identify the
CMTS interface that a user is linked to. Using this information, one can infer which users
share a CMTS interface and allow for interdependent demands. See Malone (2012)
for an empirical study of network externalities in broadband networks.
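The congestion check described above can be sketched as follows. This is an illustration only: the record layout (a pair of passed and dropped packet counts per subscriber-hour) and the choice of passed plus dropped packets as the denominator are assumptions, since the text does not spell out how the drop share is computed.

```python
def max_drop_share(records):
    """Largest share of dropped packets over all subscriber-hours.

    records: iterable of (packets_passed, packets_dropped) pairs, one per
    subscriber-hour (hypothetical layout). Hours with no packets are skipped.
    """
    worst = 0.0
    for passed, dropped in records:
        total = passed + dropped
        if total > 0:
            worst = max(worst, dropped / total)
    return worst


def is_uncongested(records, threshold=0.01):
    """True if no subscriber-hour reaches the drop share (1% by default)
    that industry experts describe as noticeable to subscribers."""
    return max_drop_share(records) < threshold
```

In the sample described above, the worst subscriber-hour is at most 0.001% (a share of 0.00001), three orders of magnitude below the 1% threshold.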
As mentioned above, a subscriber's usage is linked to the MAC address of their cable
modem. Through the MAC address, we are able to link in information about the subscriber's service
tier (e.g., a usage allowance of 50 GB and a provisioned downstream (upstream) speed of 8
Mb/s (1 Mb/s)) and the day that the billing cycle resets (e.g., usage counters reset to zero on
the 7th of each month). We have monthly reports that give the service tier for each subscriber.
Not surprisingly, since our provider did not change any features of any tiers, we see very
few subscribers (less than 0.1%) switch tiers during the five months for which we have data.
This lack of variation in the features of tiers would seem to be discouraging for identification
purposes. However, as we discuss below, the high-frequency nature of the data allows us
to see intertemporal decisions made by every subscriber at each point in the billing cycle,
compensating for the lack of variation in features of the service tiers.
2.1 Sample and Descriptive Statistics
Our sample includes hourly usage for approximately 30,000 subscribers from October 1st,
2011 to February 14th, 2012 in a single metropolitan market. Due to the sheer volume of
data and the fact that over 85% of residential usage occurs during peak hours (7pm-11pm), we
aggregate usage to a daily level. See Figure 1 for average subscriber utilization in Kb/s
over the day, averaged across all subscribers and tiers. In addition, we remove subscriber-day
observations that are not part of a complete billing cycle for a subscriber. This results in
either 3 or 4 complete billing cycles for each subscriber, depending on when the subscriber's
usage counter resets each month.
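The two data-preparation steps above (summing hourly records to subscriber-days and assigning each day to a billing cycle) might look like the sketch below. The record layout and function names are hypothetical, and the cycle logic assumes a reset day no later than the 28th so that every month contains it.

```python
from collections import defaultdict
from datetime import date, datetime


def daily_usage(hourly_records):
    """Sum hourly traffic to the subscriber-day level.

    hourly_records: iterable of (mac, timestamp, bytes_up, bytes_down).
    Direction is ignored, as it is for billing, so both are added together.
    """
    totals = defaultdict(int)
    for mac, ts, up, down in hourly_records:
        totals[(mac, ts.date())] += up + down
    return dict(totals)


def cycle_start(day, reset_day):
    """First day of the billing cycle that contains `day`.

    Assumes 1 <= reset_day <= 28 (an assumption made for this sketch).
    """
    if day.day >= reset_day:
        return day.replace(day=reset_day)
    if day.month == 1:
        return date(day.year - 1, 12, reset_day)
    return date(day.year, day.month - 1, reset_day)
```

A subscriber-day would then be kept only if its cycle, identified by `cycle_start`, begins on or after the first sample day and ends on or before the last one.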
The internet service provider offers multiple tiers, which differ along a few dimensions of
significance. Similar to broadband service offered by a typical North American provider, our
provider differentiates service tiers by provisioned speed, ranging from 2 Mb/s (1 Mb/s) to
60 Mb/s (2 Mb/s) downstream (upstream). The fairly uncommon aspect of our provider's broadband
service is that each tier is priced with a three-part tariff, similar to many cell phone plans,
which includes an access fee, a usage allowance, and a per-GB overage fee. The access fee
is a fixed fee paid each month, irrespective of usage, while the usage allowance permits the
subscriber to use a certain amount of data before incurring overage fees for each GB of data
in excess of the allowance. From the least to the most expensive tier (lowest to highest
access fee), the usage allowance and provisioned speed are non-decreasing.
Figures 2a and 2b plot monthly usage quantiles, as a percentage of the usage allowance, for
subscribers on the least and most expensive tiers, respectively. Figure 2a (2b) shows
that on the lowest (highest) tier approximately 30% (20%) of subscribers exceed their usage
allowance. The large number of subscribers well below the usage allowance demonstrates
the importance of allowing for satiation for online content in subscribers' preferences. These
figures also point to the large degree of heterogeneity in usage across subscribers, even within
a tier, with the heaviest users on each tier in a month using 20 times more than the median
user. We discuss how we control for selection into service tiers when estimating demand in
Section 4.
Table 1 breaks down usage at a daily frequency, the unit of observation for the remainder
of our analysis, aggregating across service tiers. Average usage in a month is 21.7 GBs,
while median usage is only 8.5 GBs. This corresponds to an interquartile range of 56 GBs,
with the 75th percentile (62 GBs) over 10 times the 25th percentile (6 GBs). On average,
approximately 6% of users exceed their usage allowance. Of those who exceed their usage
allowance, the average (median) overage is 26.9 GBs (14.2 GBs). For all subscribers, the
median price paid per GB of content is $5.73, while the 25th percentile is $1.79 and the
75th percentile is $19.73.
As we discuss in Section 4.1, these average willingness-to-pay statistics will be important for
inferring preferences for the subscribers that have a negligible probability of exceeding their
usage allowance in a given month (i.e., those who face a shadow price near zero for consuming
content at each point in the billing cycle).3
2.2 Preliminary Analysis
Before moving to the structural model, we provide evidence that suggests that subscribers
are aware of their state (position relative to usage allowance and time remaining in month)
and are forward looking. Similar to many cell phone operators, our provider gives notices
via e-mail and text as a subscriber nears their allowance, allows the subscriber to log in and
check their usage to date, and provides an application for web browsers that monitors usage
in real time. Thus, the cost of verifying one's state should be small. We begin by running
the following regression:
$$
c_{ikmt} = \alpha_0 + \alpha_1 \left( \frac{C_{ikm(t-1)}}{C_k} \right) + \alpha_2\, \text{daysleft}_{mt} + \text{dow}_{mt}\gamma + \text{time}_t + \eta_{im} + \epsilon_{ikmt}, \tag{1}
$$
where the dependent variable, $c_{ikmt}$, is subscriber $i$'s usage on plan $k$ on day $t$ of
the billing cycle in month $m$. The ratio, $C_{ikm(t-1)}/C_k$, is the proportion of the usage allowance
used to date and is given by the subscriber's total usage in the previous $(t-1)$ days of the
billing cycle, $C_{ikm(t-1)} = \sum_{\tau=1}^{t-1} c_{ikm\tau}$, divided by the usage allowance on plan $k$, $C_k$. We
also include $\text{daysleft}_{mt}$, the number of days left in the billing cycle, dummies for the days of
the week, $\text{dow}_{mt}$, and a time trend, $\text{time}_t$. The inclusion of subscriber-billing month fixed
effects, $\eta_{im}$, removes persistent forms of heterogeneity across subscribers as well as any billing-cycle
specific shocks to usage (e.g., seasonality or trends in usage).
3 Past studies, e.g., Lambrecht et al. (2007), have relied completely on such variation to identify demand.
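To make the construction of the regressors concrete, the sketch below builds, for one subscriber and one billing cycle, the share of the allowance used before each day and the days remaining. The function name is hypothetical, and measuring days left as the cycle length minus the day index is an assumption about how that variable is defined.

```python
def cycle_regressors(daily_gb, allowance_gb):
    """Rows of (usage on day t, share of allowance used before day t, days left).

    daily_gb: usage in GB for days 1..T of a single complete billing cycle.
    The share uses cumulative usage through day t-1 only, so current usage
    never appears on both sides of the regression.
    """
    rows = []
    cumulative = 0.0
    n_days = len(daily_gb)
    for t, usage in enumerate(daily_gb, start=1):
        rows.append((usage, cumulative / allowance_gb, n_days - t))
        cumulative += usage
    return rows


# Hypothetical three-day cycle with a 4 GB allowance.
rows = cycle_regressors([1.0, 2.0, 3.0], 4.0)
```

The subscriber-billing month fixed effects can then be absorbed by demeaning every column within each subscriber-cycle before running least squares.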
Intuitively, $\alpha_1$ should be negative while the sign of $\alpha_2$ is ambiguous. As the probability
that a subscriber will exceed the usage allowance increases, i.e., the shadow price of current
consumption increases, a subscriber with a high price elasticity of demand will tend to
pull back ($\alpha_1 < 0$) on usage. This reduction in usage may occur well in advance of reaching the
usage allowance if the subscriber wants to ensure that overage charges will not be incurred.
Similarly, for any level of previous usage, a subscriber further from the end of the billing
cycle may want to reduce current usage ($\alpha_2 < 0$) to ensure the usage allowance is not exceeded.
However, early in the billing cycle, a great deal of uncertainty regarding future demand
is yet to be resolved, so the user may want to ensure that they use the entire allowance
($\alpha_2 > 0$). It is important to note that any form of positive serial correlation in usage
will work against finding a negative relationship between consumption ($c_{ikmt}$) and previous
consumption ($C_{ikm(t-1)}/C_k$). Such correlation in usage may arise from the dynamics of the
subscribers' intertemporal decision process itself. For example, a subscriber that enters an
undesirable state (i.e., high cumulative usage early in a billing cycle) may respond by using
the service consistently less throughout the remainder of the billing cycle. This points to
the importance of modeling the entire process for consumption, not just the process near
any nonlinearities in the pricing schedule. We discuss these issues further in Section 4.
The estimates of Equation 1, and variations of it, are reported in Tables 1a-1d. In each of
the tables, Columns 1 and 2 report the estimates of Equation 1 where the dependent variable
is in levels and log-transformed, respectively. Columns 1 and 2 both report a negative sign
for the proportion of the usage allowance used to date. The pull back in consumption
of 0.255 GB, or approximately 17% of daily consumption, is statistically and economically
meaningful. These results are consistent with subscribers being aware of their states and
adjusting consumption in response to the probability of exceeding the usage allowance and
incurring overage charges, i.e., an increase in the shadow price of consumption. Only one
of the estimates of the coefficient on the days remaining in the billing cycle is statistically
significant. This may be due in part to the strong correlation between the proportion of
the usage allowance used in a month and the number of days remaining in the month. The
other controls show that the weekend has the heaviest usage (Sunday is the omitted day) and
there is some evidence of a weak positive trend in daily usage.
The linear relationship between a subscriber's current usage and their position relative
to the usage allowance assumed in Columns 1 and 2 is clearly ad hoc. In particular, one
may expect a highly nonlinear relationship, as it is not clear when and how a subscriber
will begin to respond to an increasing probability of exceeding their usage allowance. This
would depend on a number of things, including any uncertainty in future demand for internet
content. To better capture any such dynamics in usage, we specify a set of indicators for
a subscriber's position relative to their usage allowance: between 50% and 75%, 75% and
90%, 90% and 95%, 95% and 100%, and over the allowance. Columns 3 and 4 of Table 1
report these estimates with the dependent variable in levels and log-transformed, respectively.
Column 3 shows a monotonically decreasing consumption profile for subscribers nearing (and
exceeding) their usage allowance. Subscribers increasingly reduce consumption as it becomes
clear that the usage allowance will be binding and the shadow price of current consumption
approaches the per-unit overage price. The results in Column 4 are very similar: once a user
exceeds 95% of their usage allowance, they have reduced consumption by approximately 27%
and have fully internalized the overage price. Yet, again, well in advance of these levels,
subscribers begin to account for the possibility of exceeding the usage allowance. In the next
section, we formalize this intuition by modeling the intertemporal decision making process
facing subscribers.
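The indicator specification can be sketched as a simple binning of the share of the allowance used to date. The bin labels are hypothetical, and the treatment of the boundaries (each bin closed on the left, with exactly 100% counted as over) is an assumption, since the text does not say which side is inclusive.

```python
def allowance_bin(share_used):
    """Bin label for a subscriber's position relative to the usage allowance.

    share_used: cumulative usage to date divided by the allowance.
    Shares below 50% form the omitted (baseline) category.
    """
    if share_used >= 1.00:
        return "over"
    if share_used >= 0.95:
        return "95-100"
    if share_used >= 0.90:
        return "90-95"
    if share_used >= 0.75:
        return "75-90"
    if share_used >= 0.50:
        return "50-75"
    return "under-50"
```

Each label other than the baseline would enter the regression as a separate dummy variable in place of the linear share term.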
3 Model
3.1 Utility
We assume consumers derive utility from content and a numeraire good. To consume
content, each consumer must choose a tier or plan, indexed by $k$. Each plan is characterized
by the speed $s_k$ at which content is delivered over the internet, by the usage allowance $C_k$, by
the fixed fee $F_k$ and by the per-unit price of usage in excess of the allowance, $p_k$. Specifically,
$F_k$ pays for all consumption up to $C_k$, while all units above $C_k$ cost $p_k$ per unit. For any
plan, the number of days in the billing cycle is $T$.
Utility is additively separable over all days in the billing cycle.4 Let consumption of
content on day $t$ of the billing cycle be $c_t$ and let consumption of the numeraire good on day
$t$ be $y_t$. We specify a simple quasi-linear form, where the flow of utility is logarithmic in $c_t$.
Specifically, a consumer of type $h$ on plan $k$ has
$$
u_h(c_t, y_t; k) = \upsilon_t \ln(1 + c_t) - c_t \left( \kappa_{1h} - \kappa_{2h} \ln(s_k) \right) + y_t,
$$
where the time-varying unobservable, $\upsilon_t$, is not known to the subscriber until period $t$ and is
independently and identically distributed on $[0, \bar{\upsilon}]$ according to a distribution $G_h$.5 Hence,
the consumer's marginal utility varies randomly across days in ways that the consumer cannot
predict. The specification includes a constant marginal cost of consuming online content,
$\kappa_{1h} - \kappa_{2h} \ln(s_k)$, that is decreasing in the speed of the connection, $s_k$. This implies that the
consumer has a satiation point, which captures key features of the data. All parameters
differ across types of consumers.
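The satiation point follows from setting the marginal utility of content, the shock divided by one plus consumption, equal to the constant marginal cost. A minimal sketch is below; the names `upsilon`, `kappa1` and `kappa2` are labels chosen here for the shock and the two cost parameters of the utility function, and the function assumes the marginal cost is strictly positive.

```python
import math


def marginal_cost(kappa1, kappa2, speed):
    """Constant per-unit cost of consuming content: kappa1 - kappa2 * ln(speed)."""
    return kappa1 - kappa2 * math.log(speed)


def satiation_point(upsilon, kappa1, kappa2, speed):
    """Consumption at which marginal utility upsilon / (1 + c) equals marginal cost.

    Returns 0 when even the first unit is not worth consuming. A finite
    satiation point requires a strictly positive marginal cost.
    """
    mc = marginal_cost(kappa1, kappa2, speed)
    if mc <= 0:
        raise ValueError("marginal cost must be positive for a finite satiation point")
    return max(upsilon / mc - 1.0, 0.0)
```

With a positive speed coefficient, a faster connection lowers the marginal cost and therefore raises the satiation point, so a subscriber well below the allowance consumes more on a faster tier for the same shock.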
Letting income be $I$ and letting total consumption since the beginning of the billing cycle
be $C_t \equiv \sum_{j=1}^{t} c_j$ and $Y_t \equiv \sum_{j=1}^{t} y_j$, respectively, define the monthly budget constraint as
$$
F_k + p_k (C_T - C_k)\, 1\left[ C_T > C_k \right] + Y_T \le I, \tag{2}
$$
where $1[\cdot]$ is the indicator function.
4 In this way, we assume that content with a similar marginal utility is generated each day or constantly refreshed. This may not be the case for a subscriber that has not previously had access to the internet.
5 The right-truncation of $G$ is necessary to ensure that the consumer can afford any efficient level of daily consumption. Let $\bar{\kappa}_h$, $\bar{s}_k$ and $\bar{p}_k$ be the highest levels of these parameters. It suffices to assume that $T \bar{p}_k \left( \bar{\kappa}_h + \ln(\bar{s}_k) + \bar{\upsilon} \right) < I$.
Denote the discount factor by $\beta \in (0, 1)$. Conditional on choosing plan $k$, the consumer's
problem is to choose daily consumption to maximize
$$
U = \sum_{t=1}^{T} \beta^{t-1} E\left[ u_h(c_t, y_t; k) \right], \quad \text{subject to (2)}.
$$
Throughout this paper, we will assume that all consumers have sufficient income to pay for
satiation levels of content.
3.2 Optimal Consumption
The subscriber's problem is a finite-horizon dynamic-programming problem. Consider the
terminal period ($T$) of a billing cycle and denote the remaining allowance by
$C_{kT} \equiv \max\{C_k - C_{T-1}, 0\}$. The efficiency condition for optimal consumption depends on whether it is optimal
to exceed $C_{kT}$. Intuitively, if the consumer is well below the cap (i.e., $C_{kT}$ is high) and does
not have a particularly high draw of $\upsilon_T$, then she consumes content up to the point where
the marginal utility of content is zero. If marginal utility at $c_T = C_{kT}$ is positive but below
$p_k$, then it is optimal to consume the remaining allowance. If one is already above the cap
(i.e., $C_{kT} = 0$) or draws an extremely high $\upsilon_T$, then it is optimal to consume up to the
point where the marginal utility of content equals the overage price.
In each of these situations in the last period, there are no intertemporal tradeoffs. Usage
today has no impact on next period's state, as cumulative consumption resets to zero at
the beginning of each billing cycle. Thus, the problem is reduced to solving a static utility
maximization problem, given a subscriber's cumulative usage up until period $T$, $C_{T-1}$,
and the realization of the preference shock, $\upsilon_T$, which together determine the implications for
overage charges and the marginal utility of usage, respectively. Denote this optimal level of
consumption, for a given realization of $\upsilon_T$, by $c^*_{hkT}$.
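The three cases for the terminal period can be written as a short piecewise rule. The sketch below is an illustration under the log utility specification of Section 3.1, where marginal utility net of the consumption cost is the shock divided by one plus consumption, minus the marginal cost. The argument names are labels chosen here, and `mc` stands for the (assumed positive) constant marginal cost of consuming content.

```python
def terminal_consumption(upsilon, cum_usage, allowance, price, mc):
    """Optimal last-day consumption for a given shock and cumulative usage.

    upsilon   : realized preference shock on the last day
    cum_usage : usage accumulated before the last day
    allowance : usage allowance of the plan
    price     : per-unit overage price
    mc        : constant marginal cost of consuming content (assumed > 0)
    """
    remaining = max(allowance - cum_usage, 0.0)
    # Case 1: satiation is reached inside the remaining allowance.
    c_free = max(upsilon / mc - 1.0, 0.0)
    if c_free <= remaining:
        return c_free
    # Consumption at which net marginal utility equals the overage price.
    c_paid = max(upsilon / (mc + price) - 1.0, 0.0)
    # Case 2: stop exactly at the allowance if the next unit is not worth the
    # overage price. Case 3: otherwise consume past it, up to c_paid.
    return max(c_paid, remaining)
```

For a subscriber already over the allowance, `remaining` is zero and the rule collapses to the third case whenever any consumption is worthwhile.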
For a given realization of the preference shock in period $T$, the subscriber's utility from
entering the final period with state $C_{T-1}$ and behaving optimally in the terminal period is
$$
\begin{aligned}
V_{hkT}(C_{T-1}, \upsilon_T) = {} & \upsilon_T \ln(1 + c^*_{hkT}) - c^*_{hkT} \left( \kappa_{1h} - \kappa_{2h} \ln(s_k) \right) + y_T \\
& - p_k \left\{ c^*_{hkT}\, 1\left[ C_{T-1} > C_k \right] + (C_{T-1} + c^*_{hkT} - C_k)\, 1\left[ C_{T-1} < C_k < C_{T-1} + c^*_{hkT} \right] \right\}.
\end{aligned}
$$
Prior to the realization of $\upsilon_T$, the subscriber's expected utility is then
$$
E\left[ V_{hkT}(C_{T-1}) \right] = \int_0^{\bar{\upsilon}} V_{hkT}(C_{T-1}, \upsilon_T)\, dG_h(\upsilon_T).
$$
The expected value function, $E\left[ V_{hkT}(C_{T-1}) \right]$, is defined for all $C_{T-1} > 0$. Similarly, the
expected usage of a subscriber prior to the realization of $\upsilon_T$ is given by
$$
E\left[ c^*_{hkT}(C_{T-1}) \right] = \int_0^{\bar{\upsilon}} c^*_{hkT}(C_{T-1}, \upsilon_T)\, dG_h(\upsilon_T).
$$
Other conditional moments of optimal consumption can be calculated similarly for each
state, $(C_{T-1}, t)$.
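Similarly, for any day in the billing period besides the last day, $t < T$, the optimal policy
function for a subscriber of type $h$ on plan $k$ is
$$
\begin{aligned}
c^*_{hkt}(C_{t-1}, \upsilon_t) = \arg\max_{c_t} \Big[ & \upsilon_t \ln(1 + c_t) - c_t \left( \kappa_{1h} - \kappa_{2h} \ln(s_k) \right) + y_t \\
& - p_k \left\{ c_t\, 1\left[ C_{t-1} > C_k \right] + (C_t - C_k)\, 1\left[ C_{t-1} < C_k < C_t \right] \right\} \\
& + \beta E\left[ V_{hk(t+1)}(C_{t-1} + c_t) \right] \Big],
\end{aligned}
$$
and the value functions are given by
$$
\begin{aligned}
V_{hkt}(C_{t-1}, \upsilon_t) = {} & \upsilon_t \ln(1 + c^*_{hkt}) - c^*_{hkt} \left( \kappa_{1h} - \kappa_{2h} \ln(s_k) \right) + y_t \\
& - p_k \left\{ c^*_{hkt}\, 1\left[ C_{t-1} > C_k \right] + (C_{t-1} + c^*_{hkt} - C_k)\, 1\left[ C_{t-1} < C_k < C_{t-1} + c^*_{hkt} \right] \right\} \\
& + \beta E\left[ V_{hk(t+1)}(C_{t-1} + c^*_{hkt}) \right]
\end{aligned}
$$
for each ordered pair $(C_{t-1}, \upsilon_t)$.6 Similar to the terminal period, the expected value function
is
$$
E\left[ V_{hkt}(C_{t-1}) \right] = \int_0^{\bar{\upsilon}} V_{hkt}(C_{t-1}, \upsilon_t)\, dG_h(\upsilon_t)
$$
for all $t < T = 30$, and the mean of a subscriber's usage at each state is
$$
E\left[ c^*_{hkt}(C_{t-1}) \right] = \int_0^{\bar{\upsilon}} c^*_{hkt}(C_{t-1}, \upsilon_t)\, dG_h(\upsilon_t). \tag{3}
$$
6 Notice that this formulation of the optimization problem assumes that the subscriber is aware of their cumulative consumption, $C_{t-1}$, on each day in the billing cycle. This is a realistic assumption for our data, as the results in Table 2 demonstrate.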
The policy functions for each type ($h$) of subscriber imply a distribution for the time spent
in particular states $(t, C_{t-1})$ over a billing cycle. We next discuss how to solve for this distribution,
generated by optimal subscriber behavior, and how it, along with the moments of usage,
forms the basis of the method of moments approach presented in Section 4.
3.3 Model Solution and Stationary Distribution
Let $G_h$ denote a normal distribution, truncated at $\bar{\upsilon}$, with mean $\mu_h$ and variance $\sigma^2_h$. For a plan, $k$, and subscriber type, $h$, characterized by the vector $(\kappa_{1h}, \kappa_{2h}, \mu_h, \sigma_h)$, the finite-horizon dynamic program described above can be solved recursively, starting at the end of each billing cycle ($t = T$). To do so, we discretize the state space for $C_t$ to a grid of 1800 points with spacing of size $cs_k$ GBs for each plan, $k$. Our data are hourly, so time is naturally discrete, but we aggregate time up to the day ($t = 1, 2, \ldots, 30$ over a billing cycle with $T = 30$ days).^7 This discretization leaves $\upsilon_t$ as the only continuous state variable. Because the subscriber does not know $\upsilon_t$ prior to period $t$, we can integrate it out, and the solution to the dynamic programming problem for a subscriber of each type $h$ can be characterized by the expected value functions, $EV_{hkt}(C_{t-1})$, and policy functions, $c^*_{hkt}(C_{t-1}, \upsilon_t)$. To perform the numerical integration over the bounded support $[0, \bar{\upsilon}]$ of $\upsilon_t$, we use adaptive Simpson quadrature.
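The recursion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the estimates: the function names (`simpson_weights`, `solve_dp`), the coarse grid, the fixed-step composite Simpson rule (in place of adaptive quadrature), the truncation of choices at the top of the grid, and all default parameter values are hypothetical. The income term $y_t$ is dropped because it does not affect the optimal choice.

```python
import math

def simpson_weights(n_nodes, upper):
    """Composite Simpson nodes and weights on [0, upper]; n_nodes must be odd."""
    h = upper / (n_nodes - 1)
    nodes = [i * h for i in range(n_nodes)]
    weights = [h / 3.0 * (1 if i in (0, n_nodes - 1) else 4 if i % 2 else 2)
               for i in range(n_nodes)]
    return nodes, weights

def solve_dp(kappa1, kappa2, mu, sigma, speed, allowance, price,
             T=30, n_grid=40, step=1.0, beta=1.0, v_max=10.0, n_nodes=21):
    """Backward induction for one (type, plan) pair on a coarse illustrative grid.

    Returns EV[t][i], the expected value at day t and grid state i, and
    use[t][i], the expected usage at that state (both indexed by t = 1..T).
    """
    nodes, w = simpson_weights(n_nodes, v_max)
    # Truncated-normal density on the nodes, rescaled so the weights sum to one.
    dens = [math.exp(-0.5 * ((v - mu) / sigma) ** 2) for v in nodes]
    total = sum(wi * di for wi, di in zip(w, dens))
    probs = [wi * di / total for wi, di in zip(w, dens)]

    cost = kappa1 - kappa2 * math.log(speed)      # per-GB opportunity cost of time

    def over(c_total):
        """GBs consumed beyond the allowance at cumulative usage c_total."""
        return max(c_total - allowance, 0.0)

    EV = [[0.0] * n_grid for _ in range(T + 2)]   # EV[T + 1] = 0 after the cycle
    use = [[0.0] * n_grid for _ in range(T + 1)]
    for t in range(T, 0, -1):
        for i in range(n_grid):
            C = i * step
            # Shock-independent payoff of advancing z steps (choices stay on the grid).
            base = [-cost * z * step - price * (over(C + z * step) - over(C))
                    + beta * EV[t + 1][i + z] for z in range(n_grid - i)]
            for v, p in zip(nodes, probs):
                vals = [v * math.log1p(z * step) + b for z, b in enumerate(base)]
                z_best = max(range(len(vals)), key=vals.__getitem__)
                EV[t][i] += p * vals[z_best]
                use[t][i] += p * z_best * step
    return EV, use
```

The overage term uses the difference in GBs beyond the allowance before and after the day's usage, which reproduces the two indicator cases in the value function above.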
Having solved the program for a subscriber of type $h$, one can then generate the transition process for the state vector implied by the solution to the dynamic program. The transition probabilities between the 54,000 possible states ($1800 \times 30$) are implicitly defined by threshold values for $\upsilon_t$. For example, consider a subscriber of type $h$ on plan $k$ who has consumed $C_{t-1}$ prior to period $t$. The value of $\upsilon_t$ that makes a subscriber indifferent between setting $c_t = z\,cs_k$ rather than $c_t = (z+1)\,cs_k$ (advancing cumulative consumption by $z$ or $z+1$ steps of size $cs_k$) equates the marginal utility (net of any overage charges) of an additional unit of consumption to the loss in the net present value of future utility,
\[
\beta\left[ EV_{hk(t+1)}\left(C_{t-1} + (z+1)\,cs_k\right) - EV_{hk(t+1)}\left(C_{t-1} + z\,cs_k\right) \right].
\]
These thresholds, along with all subscribers' initial condition ($C_0 = 0$), define the transition process between states. Subscribers will consume no less if speed ($s_k$) is higher (lower opportunity cost of time), the overage price is lower, and the gradient of the expected value function is not too steep in cumulative consumption.
^7 This aggregation loses very little information, as over 80% of usage is on peak (between 6pm and 11pm).
For each subscriber type, $h$, and plan, $k$, we characterize this transition process by the cdf of the stationary distribution that it generates,
\[
\Gamma_{hkt}(C) = P\left(C_{t-1} < C\right),
\]
the proportion of subscribers that have consumed less than $C$ through period $t$ of the billing cycle.^8 These probabilities, for different values of $C$, are directly observable in our data and form the basis for our method of moments approach discussed in Section 4.
3.4 Optimal Plan Choice
After solving the dynamic program for a subscriber type, $h$, under every plan, $k$, selection into plans by subscribers can be dealt with naturally. A subscriber selects a plan with knowledge of their type, $(\kappa_{1h}, \kappa_{2h}, \mu_h, \sigma_h)$, and the features of the plan, but not the realization of their particular needs (realizations of $\upsilon_t$ for $t = 1, \ldots, T$) over the course of a billing cycle. In this case, the subscriber will select the plan, $k = 1, \ldots, K$, with the highest expected utility, or choose no plan at all, $k = 0$. To identify the optimal plan for each type, one can simply find the plan that gives the highest expected utility at the beginning of a billing cycle, $E[V_{hk1}(0)]$, and then ensure that this is greater than zero (the outside option's value, $E[V_{h01}(0)]$, is normalized to 0). The optimal plan for a type $h$ subscriber is then
\[
k^*_h = \arg\max_{k \in \{0, 1, \ldots, K\}} \left\{ E\left[V_{hk1}(0)\right] - F_k \right\},
\]
where the fixed fee for the outside option is $F_0 = 0$.
^8 The discretized state space makes this cdf a step function.
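The plan-choice rule can be illustrated with a few lines of code. This is only a sketch of the argmax above: the function name `choose_plan` and the example numbers are hypothetical, and the expected utilities are taken as given (in practice they would come from the solved dynamic program at $t = 1$, $C_0 = 0$).

```python
def choose_plan(expected_utility, fixed_fee):
    """Return 0 for the outside option, or the 1-based index of the best plan.

    expected_utility[k-1] stands in for E[V_hk1(0)] and fixed_fee[k-1] for F_k.
    """
    best_k, best_net = 0, 0.0   # outside option: value and fee normalized to 0
    for k, (ev, fee) in enumerate(zip(expected_utility, fixed_fee), start=1):
        net = ev - fee
        if net > best_net:      # ties are resolved in favor of the earlier option
            best_k, best_net = k, net
    return best_k
```

For example, with expected utilities of 50, 80, and 90 and fixed fees of 40, 60, and 95, the net values are 10, 20, and -5, so the second plan is selected.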
4 Estimation
We use a method of moments approach to recover the primitives of the model, the joint distribution of the parameter vector $(\kappa_{1h}, \kappa_{2h}, \mu_h, \sigma_h)$. Our model predicts moments of optimal behavior at each state, along with the time spent in different states, $(C_{t-1}, t)$, for each subscriber type. We seek to find the distribution of subscriber types that matches the distribution of $C_{t-1}$ in the population of subscribers at each point in the billing cycle, $t$. This approach has the advantage of exploiting the high-frequency nature of our data, as it allows us to use variation in the intertemporal decisions made by subscribers at different states, rather than the end-product of these decisions (e.g., monthly internet usage).
Our approach to estimation is most similar to the two-step algorithms advocated by Ackerberg (2009), Bajari et al. (2007), and Fox et al. (2011). The first step is to recover the moments to be matched from the data and solve the dynamic program for a wide variety of subscriber types, $(\kappa_{1h}, \kappa_{2h}, \mu_h, \sigma_h)$.^9 We recover both the cdf of cumulative consumption, $C_{t-1}$, for each plan, $k$, at each point in the billing cycle and the unconditional mean and variance of usage at each state, $(C_{t-1}, t)$. In the second step, we follow Fox et al. (2011) by searching for the weights, or density, of each type that best match the moments recovered from the data. We chose the moments to match for their identifying power and computational ease. In particular, these moments are linear in the type-specific weights, which reduces the matching process to a linear regression subject to a linear constraint and non-negativity restrictions. In addition to the computational advantages, this approach places no parametric restrictions on the shape of the subscriber type distribution and naturally deals with selection (i.e., we identify each type's optimal plan, $k^*_h$, in the first step).
^9 Fox et al. (2011) correctly point out that identifying the correct support for the parameter vector, $(\kappa_{1h}, \kappa_{2h}, \mu_h, \sigma_h)$, may in fact be viewed as an additional step in the estimation process. Yet their motivating example of a random coefficient demand model and aggregated data (i.e., market shares for each product) is much different from our application. In particular, the authors assume that one observes only aggregate data, making it impossible to know exactly what range of types is consistent with the data. In our application, however, we know the complete distribution of usage, and this dramatically simplifies identifying the support of the type distribution that is consistent with even the most infrequent occurrences in the data.
4.1 Identification
To realize the full computational advantages of the Fox et al. (2012) approach, we consider those moments with the most identifying power and then decompose the moments into parts that are linear in the parameters. The advantage of our data is that we observe the distribution of actions for subscribers at each state, $(C_{t-1}, t)$, along with the distribution of subscribers across states. Thus, we observe how consumers respond to marginal (shadow) prices ranging from zero up to the overage price. This allows us to consider any moments of the conditional distribution of consumption at those states at which a subscriber is present. We focus on the conditional mean and variance of usage at each state. These moments are determined by the probability that different types reach a particular state and the actions taken at this state. Thus, it is important that our econometric approach correctly account for both the composition of types at each state and the actions those types take there.
To see this, consider a model with two types, low usage (L) and high usage (H) subscribers. Consider those states that can only be reached by the low types (i.e., low cumulative consumption well into a billing period). At these states, subscribers are essentially solving a static utility maximization problem with a marginal price of zero, as there is a negligible probability they will exceed the usage allowance (i.e., the shadow price of consumption is nearly zero). Knowing that only low usage subscribers are present in these states and observing a subscriber solve this problem each day, equating marginal utility to zero, identifies the parameters of the utility function for these L types. Similarly, high demand subscribers are likely to exceed the usage allowance, equating marginal utility to the overage price from the beginning of the billing cycle. Thus, observing variation in usage at these states identifies the utility function for high demand subscribers.
One might then argue that the weights for each type, the relative mass of H and L types in the population, are identified by the mixture of actions taken at intermediate states that can be reached by both types. However, this is a very weak source of identification in our data due to the large degree of heterogeneity among users. Specifically, consumers sort themselves out across the state space so quickly at the beginning of the month that the only real identifying variation for the weights comes on the very first day of the billing cycle, for which all types are at the same state. After that point, the types are essentially spread over disjoint portions of the state space. Thus, the conditional moments by themselves may identify the types, $h$, of subscribers that are present in the data but provide very little information on their relative weights.
Along with the lack of identifying power for the weights, considering the conditional moments also has the problem of being nonlinear in the weights. For any reasonable number of types (e.g., 500 or more), this results in an infeasible constrained-nonlinear optimization problem. For example, the conditional mean of consumption at each state is a mixture of type-specific policy functions,
\[
E\left[c^*_{kt}(C_{t-1})\right] = \sum_{h=1}^{H} E\left[c^*_{hkt}(C_{t-1})\right] \omega_{ht}(C_{t-1}),
\]
where
\[
\omega_{ht}(C_{t-1}) = \frac{\gamma_{ht}(C_{t-1})\,\theta_h}{\sum_{h'=1}^{H} \gamma_{h't}(C_{t-1})\,\theta_{h'}}.
\]
Thus, $\omega_{ht}(C_{t-1})$ is a nonlinear function of both the probability a type reaches a particular state, $\gamma_{ht}(C_{t-1})$, and the relative mass of the type in the population, $\theta_h$. The conditional variance of usage is defined similarly.
To remedy both the computational and identification difficulties with these moments, we decompose the conditional moments into two parts, the numerator and the denominator. The numerator,
\[
\sum_{h=1}^{H} E\left[c^*_{hkt}(C_{t-1})\right] \gamma_{ht}(C_{t-1})\,\theta_h,
\]
is just the unconditional mean of usage at each state, while the denominator,
\[
\sum_{h=1}^{H} \gamma_{ht}(C_{t-1})\,\theta_h,
\]
is the mass of subscribers at a particular level of cumulative consumption, $C_{t-1}$, on day $t$ of the billing cycle. Both of these moments are linear in the weights, $\theta_h$, and together they solve the identification problem. In particular, by matching both sets of moments, we match the conditional usage at each state (useful for identifying the utility of each type) while also pinning down the relative weights of each type by matching the distribution of subscribers across the state space. The details of the matching procedure are discussed in Section 4.3.
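The linearity argument can be checked numerically with a small two-type example. The sketch below is illustrative only: the function name `state_moments` and the numbers in the usage note are hypothetical, and a single state is considered.

```python
def state_moments(theta, reach, mean_use):
    """Numerator, denominator and conditional mean of usage at one state.

    theta[h]    : population weight of type h
    reach[h]    : probability that type h is at this state
    mean_use[h] : expected usage of type h at this state
    """
    numerator = sum(m * g * th for m, g, th in zip(mean_use, reach, theta))
    denominator = sum(g * th for g, th in zip(reach, theta))
    return numerator, denominator, numerator / denominator
```

With reach probabilities (0.9, 0.1) and type-specific mean usage (1.0, 5.0), an equal mixture of the two types gives a numerator of 0.7 and a denominator of 0.5, exactly the averages of the single-type values (0.9 and 0.5; 0.9 and 0.1). The conditional mean, however, is 1.4 rather than the average of 1.0 and 5.0, which is why the ratio is not linear in the weights.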
4.2 Recovering Empirical Moments
The large number of observations and high frequency of our data, along with the low dimensionality of our state space, $(C_{t-1}, t)$, allow us to adopt a flexible nonparametric approach for recovering moments from the data to match to our model. We recover both the cdf of cumulative consumption for each day in the billing cycle, $t$, and the conditional mean and variance of usage at each state. The unconditional mean and variance are then the product of the pdf of cumulative consumption and the conditional moments.
4.2.1 CDF of Cumulative Consumption
To recover the cumulative distribution of $C_{t-1}$ at each point in the billing cycle, $t$, for each plan, $k$, we use a smooth version of a simple Kaplan-Meier estimator,
\[
\hat{\Gamma}_{kt}(C) = \frac{1}{N_k} \sum_{i=1}^{N_k} 1\left[C_{i(t-1)} < C\right].
\]
We estimate these moments for each $k$ and $t$, considering values of $C$ such that $\hat{\Gamma}_{kt}(C) \in [0.01, 0.99]$, ensuring that we fit the tails of the usage distribution. This results in approximately 30,000 moments to match for each plan. Let $\hat{m}^{cdf}_k$ denote the vector of moments for plan $k$.^10
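As a rough illustration of this step, the sketch below computes the unsmoothed empirical cdf for one plan and one day and keeps the grid points whose estimated probability lies in [0.01, 0.99]. The kernel smoothing applied in the text is omitted, and the function names `empirical_cdf` and `trimmed_cdf_moments` are hypothetical.

```python
def empirical_cdf(cumulative_usage, grid):
    """Share of subscribers with cumulative usage strictly below each grid value."""
    n = len(cumulative_usage)
    return [sum(1 for c in cumulative_usage if c < C) / n for C in grid]

def trimmed_cdf_moments(cumulative_usage, grid, lower=0.01, upper=0.99):
    """(C, cdf value) pairs retained as moments for one plan and one day."""
    values = empirical_cdf(cumulative_usage, grid)
    return [(C, f) for C, f in zip(grid, values) if lower <= f <= upper]
```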
To compute point-wise standard errors for our estimates of these distributions, we draw on the literature on resampling methods with dependent data; see Lahiri (2003). The dependence in our data comes from the panel nature of the data, as we observe individuals making daily decisions on consumption over 3 or 4 full billing cycles. The straightforward structure of our panel significantly simplifies the resampling procedure. We repeatedly estimate the cumulative distribution functions, leaving out different groups of subscribers. We choose 1,000 randomly sampled groups of 5,000 subscribers and re-estimate each distribution, omitting a different group of subscribers each time. These estimates are then used to calculate a variance-covariance matrix, $\hat{V}^{cdf}_k$, for the moments for each plan, $k$. This weighting matrix is used to account for the different scales of our moments and to inversely weight more variable moments.
^10 We use a normal kernel and adaptive bandwidth to smooth the empirical cdf.
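The leave-group-out procedure can be sketched as follows. This is an assumption-laden illustration rather than the procedure used for the reported estimates: the function name `leave_group_out_cov`, the default replication count and group size, and the use of the plain sample covariance of the replicate estimates (with no delete-group rescaling factor) are all choices made here for brevity.

```python
import random

def leave_group_out_cov(data_by_subscriber, estimator, n_reps=200, group_size=50, seed=0):
    """Covariance matrix of moment estimates across leave-group-out replications.

    data_by_subscriber : dict mapping subscriber id -> that subscriber's data
    estimator          : function taking a list of subscriber data and
                         returning a list of moment estimates
    """
    rng = random.Random(seed)
    ids = list(data_by_subscriber)
    reps = []
    for _ in range(n_reps):
        dropped = set(rng.sample(ids, group_size))   # whole subscribers are omitted
        kept = [data_by_subscriber[i] for i in ids if i not in dropped]
        reps.append(estimator(kept))
    m = len(reps[0])
    means = [sum(r[j] for r in reps) / n_reps for j in range(m)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in reps) / (n_reps - 1)
             for b in range(m)] for a in range(m)]
```

Dropping entire subscribers, rather than individual subscriber-days, preserves the within-subscriber dependence across days in each replication.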
Figures 2a, 2b, and 2c present the recovered cdf of cumulative consumption for each day of the billing cycle, for the least expensive, most popular, and most expensive plans, respectively. The least and most expensive plans are the two least popular plans offered by our provider. Yet there is still a more than adequate number of observations to get an accurate characterization of the time spent in different states by subscribers on these plans. On both the least and most expensive plans, a significant proportion of subscribers exceed their usage allowance, 20% and 30%, respectively. While the proportion of subscribers exceeding the allowance on the most popular plan is small, the absolute number of such users is actually larger than the total number of users exceeding the allowance on all other plans combined.
4.2.2 Unconditional Mean and Variance of Consumption
The large number of observations in and richness of our data, along with the low dimensionality of our state space, $(C_{t-1}, t)$, allow us to adopt a very flexible estimation approach to recover the moments of usage at each state. Our problem essentially reduces to estimating a surface defined over the $(C_{t-1}, t)$ plane.
To flexibly estimate the conditional moments, we adopt a nearest neighbor approach. Consider a point in the state space, $(\tilde{C}_{\tilde{t}-1}, \tilde{t})$. A neighbor is an observation in the data for which $t = \tilde{t}$ and $C_{t-1}$ is within some distance of $\tilde{C}_{\tilde{t}-1}$ (e.g., 0.5 GBs). Denote the fixed number of nearest neighbors, those with the smallest distance from $\tilde{C}_{\tilde{t}-1}$, used to estimate the moments at $(\tilde{C}_{\tilde{t}-1}, \tilde{t})$ under plan $k$ by $N_{kt}(\tilde{C}_{\tilde{t}-1})$. The estimate of the conditional mean at $(\tilde{C}_{\tilde{t}-1}, \tilde{t})$ is
\[
\hat{E}\left[c^*_{kt}(\tilde{C}_{\tilde{t}-1})\right] = \frac{1}{N_{kt}(\tilde{C}_{\tilde{t}-1})} \sum_{i=1}^{N_{kt}(\tilde{C}_{\tilde{t}-1})} c_i,
\]
where $i = 1, \ldots, N_{kt}(\tilde{C}_{\tilde{t}-1})$ indexes the set of nearest neighbors. Similarly, our estimator of the conditional variance is
\[
\hat{V}\left[c^*_{kt}(\tilde{C}_{\tilde{t}-1})\right] = \frac{1}{N_{kt}(\tilde{C}_{\tilde{t}-1}) - 1} \sum_{i=1}^{N_{kt}(\tilde{C}_{\tilde{t}-1})} \left( c_i - \hat{E}\left[c^*_{kt}(\tilde{C}_{\tilde{t}-1})\right] \right)^2.
\]
If $N_{kt}(\tilde{C}_{\tilde{t}-1}) < 100$, we do not estimate the conditional mean. If there are at least 100 but fewer than 500 neighbors, we use all neighbors to estimate the conditional mean. If there are more than 500 neighbors, we use the 500 neighbors nearest to $\tilde{C}_{\tilde{t}-1}$. The unconditional mean is then recovered as the product of the probability of observing a subscriber at state $(\tilde{C}_{\tilde{t}-1}, \tilde{t})$, estimated from the cdf of cumulative consumption we recover, and the conditional mean. Let the vectors of estimates for the unconditional means and variances for plan $k$ be denoted by $\hat{m}^{avg}_k$ and $\hat{m}^{var}_k$, respectively.
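The nearest neighbor rule described above can be sketched in a few lines. The function name `nn_moments` and its argument layout are hypothetical; the defaults (0.5 GB radius, a minimum of 100 and a maximum of 500 neighbors) follow the values mentioned in the text, and the observations are assumed to be pre-filtered to a single plan and day.

```python
def nn_moments(observations, target_c, radius=0.5, min_n=100, max_n=500):
    """Conditional mean and variance of usage near cumulative usage target_c.

    observations : (C_prev, usage) pairs for one plan on one day t
    Returns None when fewer than min_n neighbors lie within the radius.
    """
    near = sorted((abs(c_prev - target_c), use)
                  for c_prev, use in observations
                  if abs(c_prev - target_c) <= radius)
    if len(near) < min_n:
        return None
    used = [use for _, use in near[:max_n]]     # the closest max_n neighbors
    n = len(used)
    mean = sum(used) / n
    var = sum((u - mean) ** 2 for u in used) / (n - 1)
    return mean, var
```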
The nearest neighbor approach has a number of advantages over other estimators for our application. First, as with any nonparametric estimator, it imposes no parametric restrictions on the surface. Second, nearest neighbor estimators are inherently bandwidth adaptive; see Pagan and Ullah (1999). This is particularly useful in our application. The number of users reaching very high volumes of cumulative consumption can be small for some plans. In these low-density situations, nearest neighbor estimators will expand the bandwidth appropriately until a given number of observations are included in the estimator of the surface. We do restrict the degree to which the estimator can expand the bandwidth in these low-density situations in order to limit any potential bias such expansion might introduce. Our results are very robust to varying both the minimum number of neighbors ($N_{kt}(\tilde{C}_{\tilde{t}-1}) > 100$) required for a conditional moment to be estimated and the cutoff that determines how much the bandwidth can adapt in low-density situations to identify $N_{kt}(\tilde{C}_{\tilde{t}-1})$ neighbors. We estimate each surface at the same discrete set of state space points used when numerically solving the dynamic programming problem for each subscriber type. We again use a block-resampling procedure to compute a variance-covariance matrix for our estimates of the conditional mean and variance, $\hat{V}^{avg}_k$ and $\hat{V}^{var}_k$, respectively. We use these matrices to inversely weight more variable moments.
While we will match the unconditional mean and variance at each state, it is useful and intuitive to present the conditional means, which demonstrate a few properties of our data more clearly than the analogous unconditional moments. These results are summarized in Figures 3a-3c and 4a-4c for the mean and variance, respectively, for the least expensive, most popular, and most expensive plans. For each plan, the surfaces characterizing the conditional means have the same pattern.
Very early in the billing period the different types of subscribers reveal themselves. The high types sort themselves to high cumulative consumption states and continue to consume at a very high level. Interestingly, we see that consumption is relatively smooth across the billing cycle, which suggests that (high volume) subscribers are quite adept at smoothing consumption. That is, we do not see much of a drop in average consumption for the highest volume subscribers as they near the point at which overage charges apply, reinforcing our decision to model subscribers as forward-looking and rational economic agents. The low-volume types tend to migrate to low cumulative consumption states as the billing cycle progresses and continue to consume at low levels. In addition, there is a wide variety of intermediate types that consume at a fairly constant level throughout the billing cycle. The estimates of the standard deviation in usage at each state follow a similar pattern to the means. The sorting is again evident: higher mean types tend to have much more variable usage, with the standard deviation of usage roughly proportional to the mean.
4.3 Matching Moments
4.3.1 Objective Function
The second step of our estimation approach follows the method of moments approach due to Bajari et al. (2007) and Fox et al. (2011). Our objective is to match, as closely as possible, the empirical moments we recover from the data to those predicted by our model. The parameters we minimize over are the relative masses of different types, $\theta_h$, in the population of subscribers that choose a plan, $k$.
Specifically, for each plan, our estimates of these weights are chosen to satisfy
\[
\hat{\theta}_k = \arg\min_{\theta_k}\; g_k(\theta_k)'\, \hat{V}_k^{-1}\, g_k(\theta_k),
\]
subject to the weights for all types choosing plan $k$ summing to unity,
\[
\sum_{h=1}^{H_k} \theta_{kh} = 1,
\]
and each of the $H_k$ weights being nonnegative,
\[
\theta_{kh} \geq 0 \quad \forall h.
\]
The vector
\[
g_k(\theta_k) = \begin{pmatrix} \hat{m}^{avg}_k - m^{avg}_k \theta_k \\ \hat{m}^{var}_k - m^{var}_k \theta_k \\ \hat{m}^{cdf}_k - m^{cdf}_k \theta_k \end{pmatrix}
\]
is the difference between the moments recovered from the data ($\hat{m}_k$) and the weighted average of type-specific moments predicted by the model ($m_k$ is a matrix with $H_k$ columns). Thus, each element of the cdf block of $g_k(\theta_k)$ corresponds to a unique ordered pair, $t$ and $C$, and is of the form
\[
\hat{\Gamma}_{kt}(C) - \sum_{h=1}^{H_k} \theta_{kh}\, \Gamma_{hkt}(C).
\]
As in Fox et al. (2011), one can then think of $g_k(\theta_k)$ as a vector of random variables, where the randomness is a result of sampling variability in the empirical moments (measurement error in observed market shares in their example).
The weighting of moments by the block diagonal matrix,
\[
\hat{V}_k^{-1} = \begin{pmatrix} \hat{V}^{avg}_k & 0 & 0 \\ 0 & \hat{V}^{var}_k & 0 \\ 0 & 0 & \hat{V}^{cdf}_k \end{pmatrix}^{-1},
\]
ensures that more variable moments receive relatively less weight, although we find virtually no difference between our estimates and unweighted estimates. After estimating the weights associated with each type that selects a plan, we appropriately normalize the weights to reflect the number of subscribers choosing each plan to get the joint distribution of types across all plans.
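A minimal sketch of the constrained least squares problem is given below. It is not the solver used for the reported estimates: it takes the weighting matrix to be the identity, uses projected gradient descent with a Euclidean projection onto the unit simplex (one of several ways to impose the adding-up and nonnegativity constraints), and the names `project_to_simplex` and `fit_weights` are hypothetical.

```python
def project_to_simplex(x):
    """Euclidean projection of x onto {theta : theta >= 0, sum(theta) = 1}."""
    u = sorted(x, reverse=True)
    cumulative, tau = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            tau = t              # the last index satisfying the condition sets tau
    return [max(xi - tau, 0.0) for xi in x]

def fit_weights(M, m_hat, n_iter=2000):
    """Minimize ||m_hat - M theta||^2 over the unit simplex.

    M[r][h] : moment r predicted by the model for type h
    m_hat[r]: moment r recovered from the data
    """
    n_types = len(M[0])
    theta = [1.0 / n_types] * n_types
    # Conservative step size: 2 * squared Frobenius norm bounds the gradient's
    # Lipschitz constant.
    lipschitz = 2.0 * sum(a * a for row in M for a in row)
    for _ in range(n_iter):
        resid = [sum(a * th for a, th in zip(row, theta)) - b
                 for row, b in zip(M, m_hat)]
        grad = [2.0 * sum(row[h] * r for row, r in zip(M, resid))
                for h in range(n_types)]
        theta = project_to_simplex([th - g / lipschitz for th, g in zip(theta, grad)])
    return theta
```

Because the objective is convex and the feasible set is a simplex, any point the iteration converges to is a global minimizer, which mirrors the convexity argument in the text.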
As pointed out by Bajari et al. (2007) and Fox et al. (2011), least squares minimization over a bounded support and subject to linear constraints is a well-defined convex optimization problem. Thus, convergence ensures a global minimum, although not necessarily a unique one. In cases with many types, some of which behave similarly, identification issues can arise regardless of the richness of the moments one is matching. The approach of Bajari et al. (2007) and Fox et al. (2012), as with any approach, requires that the type-specific matrices of moments ($m^{avg}_k$, $m^{var}_k$, and $m^{cdf}_k$) be of full rank. If types are too similar in their behavior, collinearity issues arise and it is not possible to separately identify the weights associated with each type. Fortunately, in our application, it is not necessary to accurately identify the weights associated with each individual type, but only the total mass of types that behave similarly.
We take intuitive steps to reduce any such issues associated with collinearity. After extensive experimentation to identify the support for the parameter vector that completely encompasses the range of individual behaviors observed in the data, we solve the dynamic programming problem over an extensive grid. In total, we solve the dynamic programming problem for 10,000 types (10 values for each parameter), identifying the optimal plan for each type, or whether the type subscribes at all. This leads to a total of 6,409 types selecting a plan rather than not subscribing. Among the types selecting a particular plan, we use the following algorithm to identify a set of types that are not too similar to one another.
We begin by constructing the matrix of correlation coefficients of each column (each corresponding to a type-specific vector of moments) of $g_k(\theta_k)$ with every other column. Thus, the $(i, j)$ element of this matrix is the correlation coefficient of the moments predicted by the model for types $i$ and $j$, two types that chose plan $k$. We then take the first type and remove all those types whose moments have a correlation coefficient greater than 0.99 with the first type. We take the next type, of those remaining, and eliminate all types that have a correlation coefficient greater than 0.99 with this type. We continue this process, cycling through the types, until we are left with a set of types whose moments have correlation coefficients less than 0.99 with all other types' moments. This process results in a well-defined optimization problem, such that $g_k(\theta_k)$ is of full rank (the moments of the remaining types are not too near to being collinear). Notice the algorithm would give a different set of types, depending on which type it is initiated with. However, the resulting set of types (number of columns in $g_k(\theta_k)$) will always be the same and be indistinguishable from the perspective of their ability to match the behavior observed in the data. The algorithm results in a total, across all plans, of 1,189 types for which we will estimate weights. For each plan, the search algorithm takes less than one minute to converge.
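The pruning step can be sketched as a single greedy pass: a type is kept only if its moments are not correlated above the cutoff with any type already kept, which is equivalent to the sequential removal described above. The function names `correlation` and `prune_similar_types` are hypothetical, and each type's moment vector is assumed to be non-constant so that the correlation is defined.

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length moment vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prune_similar_types(columns, cutoff=0.99):
    """Indices of types kept after dropping near-duplicates.

    columns[j] holds the model-predicted moments for type j. Types are visited
    in order, so the kept set depends on which type comes first.
    """
    kept = []
    for j, col in enumerate(columns):
        if all(correlation(col, columns[i]) <= cutoff for i in kept):
            kept.append(j)
    return kept
```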
4.3.2 Results
Similar to Bajari et al. (2007), we find that a relatively small number of types are assigned a nonzero weight despite estimating weights for 1,189 types. No plan has more than 28 types receiving positive weights, while the average across plans is only 15. This has the advantage of significantly simplifying the counterfactual analysis, where the dynamic programming problem is solved repeatedly. We return to this point in Section 5.
Plans with higher usage allowances attract types with higher average (i.e., high $\mu_h$) and more variable (i.e., high $\sigma_h$) usage. This reinforces the finding of Lambrecht et al. (2007) that uncertainty plays an important role in subscribers' plan choice. Those types with a higher preference for speed (i.e., high $\kappa_{2h}$) select the more expensive plans with greater provisioned speeds, while those with the highest opportunity cost of time (i.e., high $\kappa_{1h}$) often select plans with lower fixed fees and lower usage allowances.
The 4-dimensional joint distribution of the taste parameters is difficult to visualize, so in Figures 5a-5d we present the marginal distribution of types across all plans. These figures make clear the benefit of the nonparametric approach of Bajari et al. (2007) and Fox et al. (2012), as common parametric densities would give an extremely poor fit. We find significant outliers along each dimension.
More important than the weights themselves is what the weights imply about the model's ability to fit the behavior observed in the data. Figures 6a, 6b, and 6c present the estimates of the cdf of cumulative consumption for the least expensive, most popular, and most expensive plans at each day in the billing cycle. Comparing the model's fit for these plans to the moments recovered from the data, Figures 2a, 2b, and 2c, it is clear that the model fits the data quite well. The one exception is at the very low end of the distribution, where the model tends to over-predict usage over the course of the billing cycle. In particular, the model has difficulty rationalizing an individual subscribing to broadband service and not using the service at all (less than 0.5 GBs a month). This situation is likely to arise in the data as a result of individuals taking extended vacations or simply not cancelling the service in periods when their demand is extremely low. To better visualize the model fit, Figures 7a, 7b, and 7c plot both the empirical moments and the model fit for the last day of the billing cycle. For over 95% of the usage distribution, the fit is very tight.
5 Welfare Implications of Usage-Based Pricing
Currently, some of the largest providers of residential broadband in the United States are in the process of implementing or conducting trials of usage-based pricing (i.e., usage allowances and a non-zero overage price).^11 The estimates of the structural model provide an opportunity to explore the implications of usage-based pricing for consumer welfare and its potential to drive efficiency in broadband networks.
To accomplish this, we consider two alternative scenarios. We first examine how usage and consumer surplus change when overage fees are simply eliminated and all other features (i.e., fixed fees and provisioned speeds) are held constant. We compare these outcomes to those when the provider is permitted to choose new fixed fees.^12
There are a couple of caveats to consider with this counterfactual exercise. First, our flexible nonparametric approach to identifying the joint distribution of subscribers' preferences limits what we can say about types that do not select a plan under the current usage-based pricing schedule. In particular, we only identify weights for those types that actually selected a plan.^13 If those types not subscribing currently were to subscribe once overages were eliminated, we could understate the benefit to subscribers. This is likely not much of a concern for qualitative conclusions, as those types with the greatest demand for broadband will choose to subscribe in either case. Second, we do not allow the provider to change the number of plans offered or provisioned speeds. This can only reduce revenues to the provider and decrease the likelihood of finding that usage-based pricing is welfare improving. Permitting the provider to select the number of plans or alter speeds requires nesting a solution algorithm for the dynamic program, for the many types of subscribers for which we estimate a positive weight, inside a high-dimensional optimization problem. This problem is computationally prohibitive.
^11 http://www.firstcoastnews.com/topstories/article/276426/483/Cable-companies-cap-data-use-for-revenue
^12 It is important to note that our analysis only accounts for private welfare. In particular, we do not account for any positive (e.g., education) or negative (e.g., violent games) externalities due to exposure to content.
Usage is significantly higher for users at the top end of the distribution under both alternative scenarios, nearly doubling, while the bottom end of the distribution is basically identical, as one would expect. The only difference in usage between the two alternative scenarios is that the change in the fixed fees causes some subscriber types to switch to a lower tier. Overall, we find that average usage increases by approximately 38% under both alternative pricing regimes, with all of this increase coming from the top end of the distribution. Average usage increases from 21.7 GBs under the current pricing regime to approximately 30 GBs for both alternatives. We find that the overall increase in subscriber surplus is only 4% and 2% in the two respective scenarios. This results in a very large drop in the average value of each GB of data consumed. The average value of the additional GBs consumed by subscribers is less than $0.18. If variable costs increase linearly with usage, as ISPs argue, the small increase in consumer surplus associated with eliminating usage-based pricing does not warrant the additional costs.
^13 One option would be to impose some type of smoothness restrictions on the distribution of the taste parameters. However, given the very unsmooth nature of the marginal distributions in Figures 5a-5d, it is not clear what those restrictions would be.
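A back-of-the-envelope check, using only the rounded figures reported above and assuming the $0.18 figure is the average surplus per additional GB, shows how the magnitudes fit together:
\[
21.7 \times 1.38 \approx 29.9 \text{ GBs}, \qquad 29.9 - 21.7 \approx 8.2 \text{ GBs}, \qquad 8.2 \times \$0.18 \approx \$1.48.
\]
Under these assumptions, eliminating overage fees raises surplus by less than roughly $1.50 per subscriber per billing cycle, while adding about 8 GBs of traffic per subscriber.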
6 Conclusion
How to price content delivered over the internet will become an increasingly important question as more bandwidth-intensive applications are developed and the fraction of computer-savvy individuals in the US population grows. Our results provide strong support for the FCC's decision to support the industry's move towards usage-based pricing as a means to efficiently allocate bandwidth to subscribers. We show that usage-based pricing is an effective means to eliminate extremely low-value traffic from the internet, while minimizing the impact on consumer surplus. This suggests that the broadcasting model of delivering content will continue to be more efficient than unicasting, unless there is a significant technological breakthrough that dramatically lowers the cost of bandwidth.
In addition to the policy implications of our findings, this study also represents a contribution to the literature on estimating demand in a dynamic setting. To our knowledge, we are the first to apply and demonstrate the usefulness of the nonparametric techniques of Fox et al. (2012). While our application is very well suited to these techniques (i.e., low dimensionality of the type-specific parameter vector), it is our belief that both the flexibility and computational ease of the techniques will make them appropriate for a wide variety of settings in empirical microeconomics outside Industrial Organization.
Access to data from a provider operating a pristine and uncongested network allowed us to ignore any interdependence in the decisions of subscribers when estimating preferences for online content. However, this is often not the case on broadband networks. In a recent Wall Street Journal article, the FCC published aggregate statistics on the significant degradation in the performance of broadband networks during peak hours. Measuring the extent of network externalities in communication networks is an important topic for future research. Malone (2012) represents a first step towards better understanding how congestion impacts the utility that subscribers derive from broadband service, yet more work needs to be done.
29
References
[1] Ackerberg, Daniel (2009) "A New Use of Importance Sampling to Reduce Computational Burden in Simulation Estimation", Quantitative Marketing and Economics, 7(4), 343-376.
[2] Aron-Dine, Aviva, Liran Einav, Amy Finkelstein, and Mark Cullen (2012) "Moral Hazard in Health Insurance: How Important Is Forward Looking Behavior?", Working Paper.
[3] Bajari, Patrick, Jeremy Fox, and Stephen Ryan (2007) "Linear Regression Estimation of Discrete Choice Models with Nonparametric Distributions of Random Coefficients", American Economic Review P&P, 97(2), 459-463.
[4] Copeland, A. and Cyril Monnet (2009) "The Welfare Effects of Incentive Schemes", Review of Economic Studies, 76(?), 93-113.
[5] Chung, Doug, Thomas Steenburgh, and K. Sudhir "Do Bonuses Enhance Sales Productivity? A Dynamic Structural Analysis of Bonus-Based Compensation Plans", HBS Working Paper #11-041.
[6] Fox, Jeremy, Kyoo il Kim, Stephen Ryan, and Patrick Bajari "A Simple Estimator for the Distribution of Random Coefficients", forthcoming in Quantitative Economics.
[7] Goolsbee, Austan and Peter J. Klenow (2006) "Valuing Products by the Time Spent Using Them: An Application to the Internet", American Economic Review P&P, 96(2), 108-113.
[8] Hendel, Igal, and Aviv Nevo (2006) "Measuring the Implications of Sales and Consumer Inventory Behavior", Econometrica, 74(6), 1637-1673.
[9] Johari, Ramesh, Gabriel Weintraub, and Benjamin Van Roy (2009) "Investment and Market Structure in Industries with Congestion", Working Paper.
[10] Lahiri, S.N. (2003) "Resampling Methods for Dependent Data", Springer.
[11] Lambrecht, Anja, Katja Seim, and Bernd Skiera (2007) "Does Uncertainty Matter? Consumer Behavior Under Three-Part Tariffs", Marketing Science, 26(5), 698-710.
[12] Malone, Jacob (2012) "Measuring Congestion Externalities in Communication Networks", Working Paper.
[13] Marsh, Christina (2012) "Estimating Demand Elasticities Using Nonlinear Pricing", Working Paper.
[14] Misra, Sanjong and Harikesh Nair (2010) "The Welfare Effects of Incentive Schemes", Working Paper.
[15] Odlyzko, Andrew, Bill St. Arnaud, Erik Stallman, and Michael Weinberg (2012) "Considering the Role of Data Caps and Usage Based Billing in Internet Access Service", Public Knowledge Whitepaper.
[16] Pagan, Adrian and Aman Ullah (1999) "Nonparametric Econometrics", Cambridge University Press.
[17] Weintraub, Gabriel Y., C. Lanier Benkard, and Benjamin Van Roy (2010) "Computational Methods for Oblivious Equilibrium", Operations Research, 58(4), 1247-1265.
[18] Weintraub, Gabriel Y., C. Lanier Benkard, and Benjamin Van Roy (2008) "Markov Perfect Industry Dynamics with many Firms", Econometrica, 76(6), 1375-1411.
[19] Yao, Song, Carl Mela, Jeongwen Chiang, and Yuxin Chen (2011) "Determining Consumers' Discount Rates with Field Studies", Working Paper.
                           Levels-Fraction  Log-Fraction  Level-Dummy  Log-Dummy
C(t-1)                     -0.255***        -0.171***
                           (0.004)          (0.006)
dum_50to75                                                -0.228***    -0.169***
                                                          (0.005)      (0.008)
dum_75to90                                                -0.350***    -0.159***
                                                          (0.009)      (0.014)
dum_90to95                                                -0.483***    -0.289***
                                                          (0.017)      (0.026)
dum_95to100                                               -0.566***    -0.279***
                                                          (0.019)      (0.029)
dum_gt100                                                 -0.773***    -0.272***
                                                          (0.011)      (0.016)
Time Left                  0.026            0.079**       0.026        0.079**
                           (0.027)          (0.040)       (0.027)      (0.040)
Day of Week Dummies        yes              yes           yes          yes
Time Trend                 yes              yes           yes          yes
Subscriber-Month Dummies   yes              yes           yes          yes
R2                         0.242            0.122         0.257        0.152
# of Observations          3,046,570        2,993,767     3,046,570    2,993,767
Table 1a: Consumption Regressions: All Tiers
                           Levels-Fraction  Log-Fraction  Level-Dummy  Log-Dummy
C(t-1)                     -0.032***        -0.051***
                           (0.001)          (0.008)
dum_50to75                                                0.017***     -0.063**
                                                          (0.004)      (0.031)
dum_75to90                                                -0.003       -0.019
                                                          (0.006)      (0.049)
dum_90to95                                                -0.004       -0.085
                                                          (0.011)      (0.082)
dum_95to100                                               -0.000       -0.092
                                                          (0.011)      (0.085)
dum_gt100                                                 0.011**      -0.069*
                                                          (0.005)      (0.039)
Time Left                  -0.001***        0.000***      -0.003***    -0.003***
                           (0.000)          (0.000)       (0.001)      (0.001)
Day of Week Dummies        yes              yes           yes          yes
Time Trend                 yes              yes           yes          yes
Subscriber-Month Dummies   yes              yes           yes          yes
R2                         0.211            0.100         0.227        0.196
# of Observations          96,223           96,223        94,949       94,949
Table 1b: Consumption Regressions: Least Expensive Tier
                           Levels-Fraction  Log-Fraction  Level-Dummy  Log-Dummy
C(t-1)                     -0.818***        -0.522***
                           (0.008)          (0.013)
dum_50to75                                                -0.233***    -0.180***
                                                          (0.006)      (0.009)
dum_75to90                                                -0.394***    -0.174***
                                                          (0.010)      (0.017)
dum_90to95                                                -0.608***    -0.354***
                                                          (0.020)      (0.032)
dum_95to100                                               -0.587***    -0.280***
                                                          (0.023)      (0.036)
dum_gt100                                                 -1.075***    -0.372***
                                                          (0.013)      (0.021)
Time Left                  0.024            0.024         0.089**      0.089**
                           (0.026)          (0.026)       (0.042)      (0.042)
Day of Week Dummies        yes              yes           yes          yes
Time Trend                 yes              yes           yes          yes
Subscriber-Month Dummies   yes              yes           yes          yes
R2                         0.271            0.131         0.258        0.196
# of Observations          2,608,388        2,608,388     2,559,925    2,559,925
Table 1c: Consumption Regressions: Most Popular Tier
                           Levels-Fraction  Log-Fraction  Level-Dummy  Log-Dummy
C(t-1)                     -2.178***        -0.499***
                           (0.182)          (0.091)
dum_50to75                                                -0.510***    -0.130**
                                                          (0.104)      (0.051)
dum_75to90                                                -0.812***    -0.189**
                                                          (0.153)      (0.076)
dum_90to95                                                -0.642**     -0.249*
                                                          (0.278)      (0.137)
dum_95to100                                               -1.909***    -0.528***
                                                          (0.288)      (0.142)
dum_gt100                                                 -1.156***    -0.196**
                                                          (0.183)      (0.090)
Time Left                  -0.036***        -0.015***     -0.007***    -0.002
                           (0.005)          (0.004)       (0.002)      (0.002)
Day of Week Dummies        yes              yes           yes          yes
Time Trend                 yes              yes           yes          yes
Subscriber-Month Dummies   yes              yes           yes          yes
R2                         0.271            0.131         0.266        0.135
# of Observations          17,057           17,057        16,907       16,907
Table 1d: Consumption Regressions: Most Expensive Tier
[Figure 1: Traffic by Time of Day. Horizontal axis: Time of Day (12am through 8pm); vertical axis: Average Traffic per Subscriber (Kb/s); series: downloads (Kb/s) and uploads (Kb/s).]
[Figure 2a: Least Expensive Plan - CDF. Horizontal axis: Cumulative Consumption (% of Allowance); vertical axis: CDF of Cumulative Consumption.]
[Figure 2b: Most Popular Plan - CDF. Horizontal axis: Cumulative Consumption (% of Allowance); vertical axis: CDF of Cumulative Consumption.]
[Figure 2c: Most Expensive Plan - CDF. Horizontal axis: Cumulative Consumption (% of Allowance); vertical axis: CDF of Cumulative Consumption.]
[Figure 3a: Least Expensive Plan - Nearest-Neighbor Mean. Axes: Days into Billing Cycle; Cumulative Consumption (% of Allowance); Mean of GB/day.]
[Figure 3b: Most Popular Plan - Nearest-Neighbor Mean. Axes: Days into Billing Cycle; Cumulative Consumption (% of Allowance); Mean of GB/day.]
[Figure 3c: Most Expensive Plan - Nearest-Neighbor Mean. Axes: Days into Billing Cycle; Cumulative Consumption (% of Allowance); Mean of GB/day.]
[Figure 4a: Least Expensive Plan - Nearest-Neighbor SD. Axes: Days into Billing Cycle; Cumulative Consumption (% of Allowance); vertical axis labeled "Mean of GB/day" in the original.]
[Figure 4b: Most Popular Plan - Nearest-Neighbor SD. Axes: Days into Billing Cycle; Cumulative Consumption (% of Allowance); vertical axis labeled "Mean of GB/day" in the original.]
[Figure 4c: Most Expensive Plan - Nearest-Neighbor SD. Axes: Days into Billing Cycle; Cumulative Consumption (% of Allowance); vertical axis labeled "Mean of GB/day" in the original.]
[Figure 5a: Marginal Distribution of µh. Horizontal axis: µh; vertical axis: Relative Frequency of Types.]
[Figure 5b: Marginal Distribution of κ1h. Horizontal axis: κ1h; vertical axis: Relative Frequency of Types.]
[Figure 5c: Marginal Distribution of κ2h. Horizontal axis: κ2h; vertical axis: Relative Frequency of Types.]
[Figure 5d: Marginal Distribution of σh. Horizontal axis: σh; vertical axis: Relative Frequency of Types.]
[Figure 7a: Model Fit of Least Expensive Plan CDF on Last Day of Billing Cycle. Horizontal axis: End of Month Usage (% of Allowance); vertical axis: CDF; series: Data and Model.]
[Figure 7b: Model Fit of Most Popular Plan CDF on Last Day of Billing Cycle. Horizontal axis: End of Month Usage (% of Allowance); vertical axis: CDF; series: Data and Model.]
[Figure 7c: Model Fit of Most Expensive Plan CDF on Last Day of Billing Cycle. Horizontal axis: End of Month Usage (% of Allowance); vertical axis: CDF; series: Data and Model.]