TECHNOLOGY SUPPORT AND POST-ADOPTION IT SERVICE USE:
EVIDENCE FROM THE CLOUD

German F. Retana
INCAE Business School
Alajuela, Costa Rica
[email protected]

Chris Forman, Sridhar Narasimhan, Marius Florin Niculescu, D. J. Wu
Georgia Institute of Technology, Scheller College of Business
800 West Peachtree NW, Atlanta GA 30308
{chris.forman, sri.narasimhan, marius.niculescu, dj.wu}@scheller.gatech.edu

September 2012
This Version: June 2015

Abstract

Does a provider's technology support strategy influence its buyers' post-adoption IT service use? We study this question in the context of cloud infrastructure services. The provider offers two levels of support, basic and full. Under basic support, the provider handles simple service quality issues. Under full support, the provider also offers education, training, and personalized guidance. Using a unique data set on public cloud infrastructure services use by 22,179 firms from March 2009 to August 2012 and fixed effects dynamic panel data models, we find that buyers who opt for full support use 31.84% more of the service than those who do not. Moreover, buyers continue learning from the provider as they continue having access to full support, and full support has a stronger influence on buyer behavior the longer it has been accessed. The effect of technology support on the effectiveness of buyers' IT use follows a similar pattern: it increases upon switching from basic to full support and its impact grows over time.

Keywords: IT service, organizational learning, IT use, cloud computing, Infrastructure-as-a-Service, technology support, service strategies
TECHNOLOGY SUPPORT AND POST-ADOPTION IT SERVICE USE:
EVIDENCE FROM THE CLOUD
1. Introduction
Businesses are increasingly shifting their information technology (IT) infrastructure from
traditional on-premises deployment to the cloud to take advantage of the commoditization of
some IT resources. In light of these changes, it is important to understand how a cloud
provider's post-adoption technology support strategy affects buyers' IT use. We explore this
question empirically in this research note.
The challenges of new technology adoption are well documented in the information
systems (IS) literature. Significant knowledge barriers cause firms to delay not only the adoption
(Attewell 1992; Chatterjee et al. 2002) but also the actual assimilation of IT (Fichman and
Kemerer 1997; Fichman and Kemerer 1999).1 While extant literature emphasized the importance
of organizational learning in overcoming knowledge barriers (Attewell 1992; Fichman and
Kemerer 1997), much less is known about how providers’ strategies related to knowledge
transfer affect buyers’ consumption level. We aim to fill this gap in understanding by examining
how different levels of a provider’s technology support may influence the manner and extent to
which a buyer uses a new IT service.
In our research setting, the provider’s buyers use its hardware resources and choose
between two levels of technology support, basic or full. A prime goal of full support is to educate
buyers on how to best use the cloud infrastructure service and adapt it to their idiosyncratic
1 In addition to knowledge barriers, researchers have documented several other factors that drive post-adoption variations in usage (Parthasarathy and Bhattacherjee 1998; Zhu and Kraemer 2005; Zhu et al. 2006). These studies focus mostly on buyers and their internal capabilities rather than on their interactions with the provider.
business needs. When receiving full support, buyers receive personalized guidance and training,
and thus have the opportunity to learn directly from the provider’s prior experience in deploying
applications in the cloud. Buyers not willing to pay the price premium for full support will only
receive a default basic level of support.
We evaluate the implications of full support for buyer behavior. We collect unique data
from a major global public cloud provider of infrastructure services who sells computing power
and storage. Our panel data consist of consumption time series for 22,179 firms that used the
provider’s service at some point between March 2009 and August 2012. We use fixed effects
dynamic panel data models to compare buyers’ use of the service before adopting full support
and during their continued access to full support. We find that buyers who adopt and continue
having access to full support use, on average, 31.84% more of the IT service relative to buyers
who have only had access to basic support, suggesting that technology support has important
implications for buyer behavior. To our knowledge, this is the first study to quantitatively
document how support can influence IT service use. Furthermore, we show that the impact of
technology support grows over time, providing suggestive evidence that technology support
facilitates buyer learning.
We also probe how omitted variables may influence our results. A particular worry is
reverse causality, i.e., the support choice decision may follow IT use. To address this concern,
we first run our models employing matched subsamples that are constructed using a coarsened
exact matching (CEM) procedure (Blackwell et al. 2010) based on buyers’ attributes and usage
of the service before they upgrade from basic to full support. Second, we leverage detailed data
on buyers’ support interactions through online live chat sessions and support tickets as the basis
for instruments for buyer decisions to upgrade to full support. Third, we estimate dynamic panel
data models that include lagged dependent variables and use deeper lags of our variables as
instruments for both IT use and the support upgrade decision using a generalized method of
moments (GMM) estimation approach (Arellano and Bover 1995; Blundell and Bond 1998). We
augment this latter approach with our support-based instruments. The estimates across these
various subsamples and models are qualitatively consistent with our main findings.
We also investigate the impact of technology support on IT use by examining alternative
measures of infrastructure use. Specifically, we provide evidence that technology support helps
buyers make better and more efficient use of the service by quantifying the effects that full
support has on buyers’ likelihood of deploying horizontally distributed and scalable
architectures. We find that buyers increase the fraction of servers they run in a parallel and
horizontally scalable architecture by 3.19 percentage points after they switch from basic to full
support. Given that the mean proportion of servers running in parallel in our sample is only 12%,
this is an economically significant change in behavior.
Besides informing the IS literature on post-adoption usage of new technology, our results
have important implications for managers. The adoption rates of cloud infrastructure services
have been significantly below expectations (Emison 2013; Microsoft and Edge Strategies 2011;
SearchDataCenter.com 2011). A potential reason for this pattern is that these services are not
offered as fully outsourced, turnkey and ready-to-use solutions for firms. Surveys during the time
span of our data (e.g., Symantec 2011) suggested that most buyers were not well prepared to use
cloud services and that helping them overcome their knowledge limitations is vital for the
success of the cloud model. Our results provide quantitative evidence of the importance of
overcoming such knowledge barriers to cloud service use.
2. Theory Background
In this section we provide motivation for how interactions with a service provider can increase
service use through knowledge transfer. We will provide specific testable implications of this
motivation in Section 3.
Firms adopting new technologies often face uncertainty over how to adapt these to the
idiosyncratic environments where they will be embedded, as well as broader issues revolving
around the complementary organizational and process changes required for new IT to be
deployed successfully (Fichman 2004; Hitt et al. 2002; Wu et al. 2013). It is well known that
firms’ internal capabilities and technical know-how affect both the timing of new IT adoption
(Attewell 1992; Bresnahan and Greenstein 1996; Forman et al. 2008) as well as post-adoption
usage (Parthasarathy and Bhattacherjee 1998; Zhu and Kraemer 2005; Zhu et al. 2006). In
particular, firms are known to delay not only the adoption (initial purchase) but also the actual
assimilation of a technology because of knowledge barriers (Åstebro 2004; Fichman and
Kemerer 1997).2 Several studies stressed the importance of organizational learning in
overcoming the knowledge barriers of new technology adoption and use (Attewell 1992;
Chatterjee et al. 2002; Fichman and Kemerer 1997).
Third parties such as consultants or other firms can often serve as useful repositories of
knowledge on how to adopt and use new technologies (Bresnahan and Greenstein 1996; Chwelos
et al. 2001). However, an important problem is how to transfer tacit and sticky knowledge on
technology use to new settings (Alavi and Leidner 2001). Such knowledge transfer—where a
2 At the individual level, for many services, buyers frequently play a dual role as both recipients and producers of the service, performing actions that are essential to the value they receive from the service. This phenomenon is known as service co-production (e.g., Xue et al. 2011). Extant research has consistently shown that customers’ capabilities in co-producing the service are a key determinant of their adoption and continued use (e.g., Buell et al. 2010; Frei 2008; Xue and Harker 2002; Xue et al. 2007).
source communicates knowledge so it is learned and applied by a recipient (Ko et al. 2005)—has
been studied within organizations in various contexts within the IS literature (Alavi and Leidner
2001). However, as noted above, knowledge transfer can also occur between firms, as when
consultants share knowledge with their clients. For example, researchers have reported survey-
based evidence that providers and consultants can transfer technical knowledge to the client
through interactions (Ko et al. 2005). Better knowledge of how to use the system can increase
post-adoption use of business IT systems (Åstebro 2004). However, to our knowledge, there is
little quantitative evidence on how a provider’s various specific strategies to transfer knowledge
to buyers affect the realized post-adoption consumption level for the offered service.
In this study, in the context of cloud infrastructure services, we focus on one such
strategy that facilitates interactions between providers and buyers, the offering of personalized
technology support, and seek to measure its impact on service consumption volume and
efficiency. Because many of the nuances of cloud deployment are not the norm in traditional
application architectures, there are several reasons why overcoming knowledge barriers may play
an important role in enabling the demand for cloud services. For example, many of the expected
features of enterprise-grade servers, such as redundant components that ensure high availability
and physical access to servers, are not present in the cloud. The cloud requires users to design for
failure (Reese 2009) and consider how to keep an application running if any given server
randomly disappears. Moreover, the cloud’s scaling capabilities can only be truly exploited if the
applications scale out horizontally (i.e., employ several servers performing functions in parallel).
A 2011 survey found that only 25% of IT staff in global organizations had cloud experience with
public infrastructure or platform-as-a-service, and 50% of the organizations claimed that their
staff was “less than somewhat prepared to handle” these services (Symantec 2011).
Thus, it is non-trivial for some of the buyers of cloud infrastructure services to overcome
these knowledge barriers on their own. A provider can greatly assist its buyers in lowering these
barriers and uncertainties by transferring knowledge and by training buyers how to better use the
service via technology support. In our setting, the provider offers personalized guidance and
training via full support. For example, when offering full support, the provider takes a proactive
approach in helping users configure their software applications so that they effectively scale in
the cloud. This is a common issue for e-commerce applications facing uncertain, customer-driven
capacity demand, as new product introductions and marketing campaigns generate temporary
usage spikes. Thus, full support is different from pure
outsourcing where the provider does everything for the customer and “takes the burden of
learning off the back of a potential user” (Attewell 1992).
Based on the above theoretical arguments, we posit that buyers who adopt and have
continued access to full support use more service compared to similar buyers who only have
access to basic support.
3. Empirical Model
3.1. Effects of Full Support on Service Use
We employ linear fixed effects dynamic panel data models to tease out the effects on cloud use
of the adoption of and continued access to full support. The pay-per-use model provides cloud
infrastructure buyers the freedom to pay only for the computing resources they consume. In our
setting, the provider bundles server capacity in terms of memory (GB of RAM), processing
power (number of virtual CPUs), and storage (GB space of local hard disk). The three attributes
are highly correlated in the offer menu; a server with more of one attribute had more of the other
two. Since the servers are priced based on the amount of memory they have, and memory is the
basis for buyers’ infrastructure sizing decisions, the amount of memory consumed by buyers
over time is a direct measure of their use of cloud services. We compute the average GB of RAM
used by a buyer per month and denote it as Memory_{i,t}. Then, given the strong positive skew in
its distribution, following standard practice we compute lnMemory_{i,t} = ln(Memory_{i,t} + 1) and
use it as our dependent variable. All variables are summarized in Appendix A.
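The construction of the dependent variable can be sketched in a few lines of pandas (the data frame and column names here are hypothetical, not the provider's schema):

```python
import numpy as np
import pandas as pd

# Hypothetical buyer-month panel; memory_gb is the average GB of RAM used that month.
panel = pd.DataFrame({
    "buyer_id": [1, 1, 2, 2],
    "month": pd.PeriodIndex(["2010-01", "2010-02", "2010-01", "2010-02"], freq="M"),
    "memory_gb": [2.0, 4.0, 0.0, 16.0],
})

# lnMemory = ln(Memory + 1): the +1 keeps zero-usage months defined,
# and the log tempers the strong positive skew of the raw series.
panel["ln_memory"] = np.log(panel["memory_gb"] + 1.0)
```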
Our first model tests if current or prior adoption of full support is associated with greater
use of the service:

lnMemory_{i,t} = μ + β FullStatus_{i,t} + Σ_{j=1}^{q} ρ_j lnMemory_{i,t−j} + α_i + γ_t + τ_{i,t} + ε_{i,t},   (1)

where FullStatus_{i,t} indicates whether buyer i has access to full support in month t. The lagged
dependent variables capture persistence in use, but combining them with buyer fixed effects
introduces dynamic panel bias. We address this bias through System GMM estimation (Anderson
and Hsiao 1981; Archak et al. 2011; Arellano and Bond 1991; Arellano and Bover 1995; Blundell
and Bond 1998; Ghose 2009). We will show results using q = 3 lags, yet results are consistent if
we use fewer or more lags (e.g., q = 1, 4). We elaborate on our use of System GMM in the results
section.
Parameter α_i is the buyer fixed effect and γ_t is a vector of calendar month fixed effects.
We also include a vector of dummy variables, τ_{i,t}, indicating in what month of its tenure a buyer
is when month t starts. Finally, ε_{i,t} is our error term, which we assume is correlated
only within individual firms, but not across them.
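The identification logic behind using deeper lags as instruments can be illustrated with a small simulation. The sketch below uses the Anderson-Hsiao special case (instrument the differenced lag with a second-order lag in levels) rather than the full System GMM machinery; the data-generating process and all numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, rho = 2000, 8, 0.5

# Simulate y_it = alpha_i + rho * y_i,t-1 + eps_it: persistent use plus a buyer fixed effect.
alpha = rng.normal(size=N)
y = np.zeros((N, T))
y[:, 0] = alpha + rng.normal(size=N)
for t in range(1, T):
    y[:, t] = alpha + rho * y[:, t - 1] + rng.normal(size=N)

dy = np.diff(y, axis=1)  # first-differencing removes alpha_i

# Pair each Dy_t with Dy_{t-1}; the instrument for Dy_{t-1} is the level y_{t-2},
# which predicts Dy_{t-1} but is uncorrelated with the differenced error Deps_t.
Y = dy[:, 1:].ravel()
X = dy[:, :-1].ravel()
Z = y[:, : T - 2].ravel()

ols_est = (X @ Y) / (X @ X)  # inconsistent: Dy_{t-1} is correlated with Deps_t
iv_est = (Z @ Y) / (Z @ X)   # Anderson-Hsiao IV estimate, consistent for rho
```

With these parameters the naive differenced OLS estimate is badly biased away from rho = 0.5, while the lag-instrumented estimate recovers it; System GMM stacks many such moment conditions for efficiency.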
Our fixed effects model allows us to difference out unobserved time-invariant buyer-level
heterogeneity that may influence both the choice of support type and IT use. We also run our
models using matched subsamples constructed using a coarsened exact matching (CEM)
procedure (Blackwell et al. 2010). CEM reduces the dependence of our estimates on our model
specification and also reduces endogeneity concerns when making causal inferences (Ho et al.
2007). As described in further detail below, we match firms based on their pre-upgrade memory
consumption levels, pre-upgrade frequency of infrastructure resizing (i.e., number of changes in
their total memory use), intended use cases for the cloud service, industry, and size.
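A minimal version of the CEM idea, in which each matching covariate is coarsened into bins and only strata containing both upgraders and non-upgraders are kept, can be sketched as follows (bin edges and column names are hypothetical; the paper's actual procedure follows Blackwell et al. 2010):

```python
import pandas as pd

def cem(df, treat_col, coarsen):
    """Coarsened exact matching: bin each covariate, then keep only the strata
    (unique bin combinations) that contain both treated and control units."""
    df = df.copy()
    binned = pd.DataFrame({
        col: pd.cut(df[col], bins=edges, labels=False)
        for col, edges in coarsen.items()
    })
    df["_stratum"] = binned.astype(str).agg("|".join, axis=1)
    ok = df.groupby("_stratum")[treat_col].transform("nunique") == 2
    return df[ok].drop(columns="_stratum")

# Toy example: pre-upgrade memory use and resizing frequency as matching covariates.
buyers = pd.DataFrame({
    "upgraded":    [1, 0, 0, 1, 0],
    "pre_memory":  [2.0, 2.5, 40.0, 3.0, 2.2],   # GB RAM in pre-period
    "pre_resizes": [1, 2, 9, 2, 1],
})
matched = cem(buyers, "upgraded",
              {"pre_memory": [0, 4, 8, 100], "pre_resizes": [0, 3, 6, 12]})
# The heavy, frequently-resizing control (row 2) has no treated counterpart and is dropped.
```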
Further, we use exogenous failure events experienced by buyers as an instrument for their
support choice decision. When this type of unforeseeable problem occurs, the support
interactions that take place between buyers and the provider can serve as a signal to buyers for
the value of full support. Basic support buyers who, because of the failure, obtain experience in
using the service with a greater involvement from the provider, may be more likely to upgrade to
full support than buyers who do not have such experiences with the provider. However, such
interactions on their own are unlikely to increase use of the provider’s service. Additionally,
since the failures are exogenous (e.g., can occur with equal probability to any server independent
of the support choice), they are also not directly related to any learning or level of technical
sophistication of the buyer. We employ a probit model that has the exogenous failures as
regressors to generate predicted values for FullStatus_{i,t}, which we denote FullStatusHat_{i,t}. We
then use the fitted value, FullStatusHat_{i,t}, as our instrument in a standard two-stage least squares
(2SLS) estimation (Angrist and Pischke 2009, pp. 142-144; Imbens and Wooldridge 2007). We
note that the lagged levels of service usage in our model control for the potential correlation
between the size of the cloud infrastructure deployment and its likelihood of experiencing a
failure in some of its components.
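The two-step procedure can be sketched on simulated data. For a dependency-light illustration the first stage below is a logit rather than the paper's probit (the mechanics are identical), and all coefficients and variable names are invented for the simulation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Exogenous failure events, independent of the unobserved taste u below.
failures = rng.binomial(3, 0.3, size=n)
u = rng.normal(size=n)  # unobserved factor driving BOTH upgrade and usage
upgrade = (-1.0 + 0.8 * failures + 0.8 * u + rng.normal(size=n) > 0).astype(float)
usage = 0.3 * upgrade + u + 0.5 * rng.normal(size=n)  # true causal effect = 0.3

# First stage: binary-choice model of upgrade on failures, fit by Newton-Raphson.
X1 = np.column_stack([np.ones(n), failures])
b = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X1 @ b))
    H = X1.T @ (X1 * (p * (1 - p))[:, None])
    b += np.linalg.solve(H, X1.T @ (upgrade - p))
z_hat = 1.0 / (1.0 + np.exp(-X1 @ b))  # fitted upgrade probability = instrument

# Second step: just-identified 2SLS with instrument z_hat and a constant.
Z = np.column_stack([np.ones(n), z_hat])
X2 = np.column_stack([np.ones(n), upgrade])
beta_iv = np.linalg.solve(Z.T @ X2, Z.T @ usage)
beta_ols = np.linalg.lstsq(X2, usage, rcond=None)[0]
```

The naive OLS coefficient on upgrade is inflated by the unobserved factor u, while instrumenting with the fitted upgrade probability recovers the true effect; the exogenous failure counts play the role the failure events play in the paper.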
3.2. Time-Varying Effects of Full Support
To allow the marginal effect of switching to full support to vary in a flexible way over time, we
employ indicator variables for the lags of the adoption event, AdoptFull_{i,t}. This variable is
set to 1 only in the period when full support is adopted. Thus, the lags of the form AdoptFull_{i,t−l}
indicate if buyer i adopted full support l periods ago (counting from period t). We use this
indicator in the following autoregressive distributed lag (ARDL) model (Greene 2008, pp. 681-
689):
lnMemory_{i,t} = μ + Σ_{l=0}^{m} β_l AdoptFull_{i,t−l} + β_{m+1} FullStatus_{i,t−m−1} + Σ_{j=1}^{q} ρ_j lnMemory_{i,t−j} + α_i + γ_t + τ_{i,t} + ε_{i,t}.   (2)
As with Model (1), we will show results using q = 3 lags of the dependent variable, yet
results are consistent if we use a different number of lags. We include m = 12 lags of
AdoptFull_{i,t} so that our model identifies the effects of adopting full support during the 12
months following the event. Results are robust to changes in the number of lags. We use
FullStatus_{i,t−m−1} to account for the effect of adopting full support beyond m months in the past.
4. Data and Sample Construction
One of the essential characteristics of cloud infrastructure services is that they are offered on-
demand (Mell and Grance 2011). Buyers only pay hourly rates contingent on server capacity and
operating system. However, there are important technical challenges in deploying horizontally
scalable configurations where several cloud servers work in parallel, which may in turn limit
buyers’ ability to use many servers at once. As mentioned before, the provider offers two levels
of support, basic and full. Under full support, the provider charges a fixed price premium per
server-hour used plus an additional fixed monthly fee (which is prorated on a daily basis). There
are no sign-up or termination fees for the full support service. Please see Appendix C for a
detailed description of the provider’s cloud infrastructure services, their pricing, and the
corresponding levels of technology support. In Appendix D we discuss the potential implications
of server operating system heterogeneity.
We have collected a unique data set on cloud infrastructure services and technology
support use from a provider. Our entire data set given to us by the provider includes 79,619
buyers that adopted the provider’s services at some point between March 2009 and August 2012.
To isolate the causal effects of full support, we restrict our baseline sample to buyers who are
likely to have similar usage profiles over time, but for their adoption of full support. We exclude
buyers who use the service very little or who do not change their cloud architecture configuration
(i.e., do not resize their infrastructure).3 These buyers have very different time-varying profiles
relative to full support buyers and, although we exclude them ex ante, they likely would also be
3 We exclude buyers who only accessed basic support and averaged 512 MB RAM/hour or less during their first 6 months (excluding 1st month) or made no adjustments to size of their infrastructure during their first 6 months (excluding 1st month). An infrastructure resizing occurs in any launch, halt, or resizing of a server in the buyers’ cloud infrastructure. We do not consider their behavior during their 1st month in our threshold because most buyers are setting up their infrastructure during this time.
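These exclusion rules translate into a straightforward panel filter. The sketch below applies them to a toy two-buyer panel with hypothetical column names:

```python
import pandas as pd

# Hypothetical buyer-month panel: tenure_month counts months since adoption (1 = first).
panel = pd.DataFrame({
    "buyer_id":     [1, 1, 1, 2, 2, 2],
    "tenure_month": [1, 2, 3, 1, 2, 3],
    "avg_gb_ram":   [0.2, 0.3, 0.25, 1.0, 4.0, 8.0],
    "resizes":      [3, 0, 0, 2, 5, 1],
    "ever_full":    [False] * 3 + [True] * 3,
})

# Look only at tenure months 2-7: the 1st month is setup time and is ignored.
window = panel[(panel["tenure_month"] >= 2) & (panel["tenure_month"] <= 7)]
stats = window.groupby("buyer_id").agg(
    mean_ram=("avg_gb_ram", "mean"),
    total_resizes=("resizes", "sum"),
    ever_full=("ever_full", "any"),
)
# Drop basic-only buyers who averaged <= 0.5 GB (512 MB) RAM or never resized.
drop = (~stats["ever_full"]) & ((stats["mean_ram"] <= 0.5) | (stats["total_resizes"] == 0))
sample = panel[panel["buyer_id"].isin(stats.index[~drop])]
```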
excluded later by our CEM procedures. After these restrictions, our baseline sample includes
22,179 buyers and 368,606 buyer-month observations. Table 1 provides descriptive statistics of
the cloud use time-varying variables in our baseline sample; we will describe our second
dependent variable FractionParallel_{i,t} later in Section 5.2, but include it in the table for
completeness. Table 1 also presents statistics contingent on buyers' support choice,
FullStatus_{i,t}; difference-in-means t-tests for all variables are significant at the 1% level.
Table 3 (excerpt):
Observations: 45,815; 45,847; 45,830; 45,802
Pseudo-R2: 0.123; 0.085; 0.113; 0.135
Notes: Linear regressions in Part A and probit regressions in Part C include monthly calendar (γ_t) and tenure dummies (τ_{i,t}). Most interactions with Trimester1_{i,t} in Part C are dropped due to collinearity. Robust standard errors, clustered on buyers, in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Table 4 (excerpt):
Upgrade change (e^β − 1): 62.40%; 32.37%; 108.60%; 72.03%; 10.20%; 5.69%; 10.87%; 6.51%
Notes: Dependent variable is lnMemory_{i,t}. All regressions include monthly calendar (γ_t) and tenure dummies (τ_{i,t}). Columns (1) through (4) show robust standard errors, clustered on buyers, in parentheses. System GMM models in columns (5) through (8) have robust standard errors that use Windmeijer's (2005) finite-sample correction. * p < 0.10, ** p < 0.05, *** p < 0.01. Hansen J statistic not reported for 2SLS estimations in columns (1) through (4), as the model is exactly identified. System GMM estimations in columns (5) through (8) consider FullStatus_{i,t} as endogenous. Given AR(2) in the errors, they all use the 2nd lag of the first difference of lnMemory_{i,t} and FullStatus_{i,t} as their instruments for the levels equation. Columns (5) and (7) use all available lags of lnMemory_{i,t} and FullStatus_{i,t} as instruments for the first-differences equation, from the 3rd lag until the end of the panel. Columns (6) and (8) use only the 3rd lag of lnMemory_{i,t} and the 3rd to the 11th lags of FullStatus_{i,t} as instruments for the differences equation. Additionally, columns (7) and (8) augment the instruments matrix with the same vector of exogenous failure-related instruments shown in column (4) of Table 3.
model with too many instruments (Roodman 2009b). Finally, we augment our instrument matrix
with the exogenous failure-based instruments used in column (4) of Part C of Table 3. The
specifics of these processes are described in Appendix H. Next, we discuss the results of these
various models.
We show the model with all available instruments in column (5) of Table 4. The
coefficient for FullStatus_{i,t} suggests an increase in memory usage of 10.20% (i.e., e^β − 1).
The results with the minimum number of instruments possible are reported in column (6), and we
continue finding a positive and significant effect for full support, this time representing an
increase in memory usage of 5.69% (i.e., e^β − 1). Finally, we augment our instrument matrix
for these same model specifications with the exogenous failure-based instruments. The new
results are shown in columns (7) and (8) of Table 4 and do not vary much relative to those
already discussed in columns (5) and (6).
5.2. Effects of Technology Support on Efficiency of IT Use
As mentioned in our theory background section, buyers who access full support may learn from
the provider in ways that enable them to make better use of the cloud service. We test if it is true
that buyers make better and more efficient use of the advanced cloud specific infrastructure as a
result of having access to full support. An advantage of cloud infrastructure services is that we
can partially observe certain attributes of buyers’ deployments, some of which are diagnostic in
assessing how proficient a buyer is in making use of the infrastructure. If full support helps
buyers use the service better, one would expect that they employ architectures that can scale
more efficiently, although this comes at the cost of increased complexity. We explain this
assertion and offer a test of it in the discussion below.
Although the on-demand nature of the service along with its rapid elasticity provides
firms the opportunity to reduce idle computing capacity waste and eliminate the necessity of an
up-front capital commitment in overprovisioning resources (Armbrust et al. 2010; Harms and
Yamartino 2010), doing so requires firms to scale their infrastructure in a cost-efficient manner.
There are essentially two ways of growing an IT infrastructure: vertically and horizontally
(Garcia et al. 2008; Michael et al. 2007; Reese 2009, p. 176). Scaling vertically, while easy to
execute since it generally only implies increasing the capacity of the single server performing a
function, does not allow the buyer to truly leverage the cloud’s scalability. For example, growth
is capped by the maximum server capacity available. In contrast, scaling horizontally with
several servers performing functions in parallel is complex. However, it offers virtually
unlimited growth potential plus it allows buyers to have a more resilient architecture. For greater
details on the benefits and complexities of the scaling methods, please see Appendix I.1.
As a result of these increased efficiencies and complexity, we use the fraction of servers
running in parallel as a measure that proxies for a buyer’s skill at using cloud computing. We
emphasize that although launching a single server is a trivial task for any system administrator,
launching several of them in a horizontally scalable manner is non-trivial. Additionally, this
measure varies separately from memory use, our first dependent variable: a buyer can consume a
large volume with none of its servers functioning in parallel, in which case the fraction is zero, or
a small volume with all of its servers functioning in parallel, which makes the fraction equal to 1.
To compute this metric we scan the names of the servers used daily by buyers and count, to the
extent possible, how many of them are performing the same functions; the process is explained
in Appendix I.2. The monthly average fraction of servers running in parallel is captured in our
new dependent variable, FractionParallel_{i,t} (see Table 1 for descriptive statistics).
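One plausible implementation of this name-scanning heuristic is sketched below; the authors' exact procedure is in Appendix I.2, and the numeric-suffix naming convention here is an assumption:

```python
import re
from collections import Counter

def fraction_parallel(server_names):
    """Heuristic: servers whose names differ only by a trailing numeric suffix
    (web-1, web-2, ...) are assumed to perform the same function in parallel."""
    roles = [re.sub(r"[-_]?\d+$", "", name.lower()) for name in server_names]
    counts = Counter(roles)
    parallel = sum(1 for r in roles if counts[r] > 1)
    return parallel / len(roles) if roles else 0.0

fraction_parallel(["web-1", "web-2", "web-3", "db-master"])  # 3 of 4 servers parallel
```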
We estimate the exact same models described in our empirical approach (Section 3) but
substitute FractionParallel_{i,t} for lnMemory_{i,t} as the dependent variable. Overall, our results
are consistent with previous findings when using the IT service use dependent variable,
providing additional evidence that full support enables buyers to use the cloud more efficiently.
Results in Table 5 show that buyers who have adopted and continue having access to full
support have a fraction of servers working in parallel that is between 3.19 and 4.26 percentage
points higher than that of basic support users; this is significant considering that the mean
FractionParallel_{i,t} is 0.12. We show these models using 2 lags of FractionParallel_{i,t} as
covariates so that the results are comparable to those used in the System GMM approach below,
yet the results are consistent if we use different numbers of lags.
Table 5 (excerpt):
Total number of IVs: 798; 482; 809; 493
Hansen J statistic p-value: 0.984; 0.345; 0.985; 0.319
Upgrade change (β × 100): 9.49; −1.35; 6.50; 8.82; 1.35; 0.49; 1.39; 0.51
Notes: Dependent variable is FractionParallel_{i,t}. All regressions include monthly calendar (γ_t) and tenure dummies (τ_{i,t}). Columns (1) through (4) show robust standard errors, clustered on buyers, in parentheses. System GMM models in columns (5) through (8) have robust standard errors that use Windmeijer's (2005) finite-sample correction. * p < 0.10, ** p < 0.05, *** p < 0.01. Hansen J statistic not reported for 2SLS estimations in columns (1) through (4), as the model is exactly identified. System GMM estimations in columns (5) through (8) consider FullStatus_{i,t} as endogenous. Given AR(2) in the errors, they all use the 2nd lag of the first difference of FractionParallel_{i,t} and FullStatus_{i,t} as their instruments for the levels equation. Columns (5) and (7) use all available lags of FractionParallel_{i,t} and FullStatus_{i,t} as instruments for the first-differences equation, from the 3rd lag until the end of the panel. Columns (6) and (8) use only the 3rd to 12th lags of FractionParallel_{i,t} and the 3rd to 8th lags of FullStatus_{i,t} as instruments for the differences equation. Additionally, columns (7) and (8) augment the instruments matrix with the same vector of exogenous failure-related instruments shown in column (4) of Table 3.
5.3. Time-Varying Effects of Full Support
The estimation results for Model (2) employing both the full sample and the CEM1 subsample
are shown in Table 8. For both dependent variables, all coefficients for the JK$F745**',)<L
indicators are positive and statistically significant. This initially suggests that full support’s effect
does not fade, at least not entirely, over time. The coefficients do not change much if we employ
a different number of lags for the support indicators (%) or the dependent variables (F).
Nevertheless, the precise interpretation of these coefficients is not straightforward.
Since the lags of the dependent variables are influenced by the prior access to full
support, each lag of AdoptFull_{i,t} enters the model in highly complicated ways. To show the
time-varying effects of support we plot the impulse response functions of the dependent
variables to the switch in the support type (i.e., a unit change in a binary variable) (Hamilton
1994, pp. 318-323). Specifically, we compute and plot ∂lnMemory_{i,t} / ∂AdoptFull_{i,t−l}
over time l to show how
current memory usage is influenced by the adoption of full support l periods ago. We describe
the estimation procedure in detail in Appendix J.
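The recursive logic behind such impulse response functions can be sketched as follows: for a model in which the outcome depends on its own lags and on current and lagged support indicators, the response to a permanent switch accumulates both the direct lag coefficients and the feedback through the lagged outcome. The model form is a simplification, and all coefficient values below are hypothetical, not the paper's estimates:

```python
# Sketch, assuming a model of the form
#   y_t = sum_j rhos[j] * y_{t-j-1} + sum_l gammas[l] * x_{t-l} + ...
# where x is a binary indicator that switches permanently to 1 at time 0.
def impulse_response(gammas, rhos, horizon):
    """Response of y to a permanent unit switch of x at time 0
    (so x_{t-l} = 1 whenever t - l >= 0)."""
    resp = []
    for m in range(horizon + 1):
        # direct effect: every x-lag coefficient "switched on" by period m
        direct = sum(gammas[:m + 1])
        # feedback propagated through the lagged dependent variables
        feedback = sum(r * resp[m - j - 1]
                       for j, r in enumerate(rhos) if m - j - 1 >= 0)
        resp.append(direct + feedback)
    return resp

# hypothetical coefficients, chosen only to illustrate the shape
irf = impulse_response(gammas=[0.10, 0.05, 0.03], rhos=[0.4], horizon=5)
```

With positive lag coefficients the response keeps growing over the horizon, which is the pattern the positive slopes in Figure 1 reflect.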
We show the impulse response functions of the dependent variables to the adoption of full
support in Figure 1. The figure suggests buyers significantly grow their volume of service
consumption (left panel) as well as the proportion of servers they run in parallel (right panel)
immediately after full support adoption. Moreover, the effects of having access to full support
grow over time, as evidenced by the positive slope of the functions. Buyers continue to benefit
from having access to full support over time, and full support has a stronger influence on buyer
behavior the longer it has been accessed.
| Item | Basic | Full | Units | Calculation |
|---|---|---|---|---|
| Support Costs | | | | |
| Number of Chats | 0.366 | 0.702 | Quantity / month | Mean number of chats / month |
| Cost of Chats | $2.73 | $5.24 | $ / month | Quantity × $7.46 (a) |
| Number of Tickets (b) | 0.117 | 0.650 | Quantity / month | Mean number of tickets / month |
| Cost of Tickets | $4.31 | $23.95 | $ / month | Quantity × $36.83 (a) |
| Cost of Support | $7.04 | $29.19 | $ / month | Cost of Chats + Cost of Tickets |
| Cloud Server Profits | | | | |
| Estimated Usage (c) | 1,440.0 | 1,898.5 | GB RAM / month | For full, median usage × 1.3184 |
| Server Hourly Rate (d) | $0.045 | $0.090 | $ / GB RAM / hour | Based on AWS pricing |
| Estimated ARPU (e) | $64.80 | $170.86 | $ / month | Estimated Usage × Hourly Rate |
| Estimated Profits | $51.84 | $136.69 | $ / month | ARPU × 80% (f) |
| Difference in Profits | | | | |
| Net Profits | $44.80 | $107.51 | $ / month | Server Profits − Support Costs |
| Net Profits Gains (abs.) | | $62.71 | $ / month | $107.51 − $44.80 |
| Net Profits Gains (%) | | 140% | % | $107.51 / $44.80 − 1 |

(a) These are the estimated costs per chat session and ticket given to us by the provider. (b) We only count buyer-initiated (inbound) tickets; we exclude (outbound) announcements by the provider through tickets. (c) Median usage under basic support is 2 GB RAM/hour; we multiply by 720 hours/month to get monthly usage under basic support. For full support we consider a 31.84% increase in usage from the estimate in column (1) of Table 2. (d) During our sample period, Amazon Web Services' (AWS) Elastic Compute Cloud (EC2), the public IaaS with the largest market share and thus with the dominant price-setting position, offered small 1.7 GB RAM servers at $0.08/hour and medium 3.75 GB RAM servers at $0.16/hour (source: aws.amazon.com). Based on these rates, we compute the mid-point price for 1 GB RAM / hour at $0.045. This is the price under basic support. For full support, even though the provider adds $0.12 to the hourly rate, we only add $0.045 to attain a conservative estimate. We also ignore the fixed monthly fee charged by the provider to buyers under full support. See Appendix C for more details on pricing. (e) Average Revenue per User. (f) The provider estimates their server-related variable costs are around 20%. These include server and datacenter depreciation expenses, datacenter rent, power and cooling, and non-infrastructure related items like credit card fees and bad debt expenses.
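The arithmetic in the cost-benefit table above can be replicated with a short script. All inputs come straight from the table and its notes; the 80% margin is the provider's own estimate of one minus its variable costs:

```python
# Back-of-the-envelope replication of the cost-benefit table
# (all figures per buyer, per month, as reported in the table notes).
CHAT_COST, TICKET_COST = 7.46, 36.83   # provider's cost per chat / ticket
MARGIN = 0.80                          # 1 - ~20% server-related variable costs
HOURS_PER_MONTH = 720

def net_profit(chats, tickets, gb_ram, rate_per_gb_hour):
    support_cost = chats * CHAT_COST + tickets * TICKET_COST
    arpu = gb_ram * HOURS_PER_MONTH * rate_per_gb_hour
    return arpu * MARGIN - support_cost

basic = net_profit(0.366, 0.117, 2.0, 0.045)           # median 2 GB RAM server
full = net_profit(0.702, 0.650, 2.0 * 1.3184, 0.090)   # +31.84% usage under full
gain = full - basic   # roughly $62.71/month, a roughly 140% increase
```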
Buyers' continuous access to full support in the form of personalized guidance has significant,
quantifiable, and sustainable business value. In this way, our research adds to other recent
findings about the value of service and support in the cloud setting. For example, Retana et al.
(2015) show that proactively providing customers with information about the value of a service
during the customer onboarding process can decrease customer attrition and reduce the number
of costly support interactions.
Our research has included a range of analyses designed to isolate the
effects of full support on IT service use. However, as in any empirical study, our research has
limitations. In particular, while our analyses of user decisions to use horizontally scalable
architectures provide suggestive evidence of learning, we were unable to directly observe
learning using our current research design.
Such limitations offer exciting opportunities for future research. For example, better data
on the productivity of service use would help researchers to more precisely isolate the effects of
provider interactions on customer learning. More broadly, we believe that future work should use
transactional data such as ours to gauge the impact of other buyer interactions with third parties,
such as traditional outsourcing firms and individuals in online communities of practice, to assess
their impact on the manner and effectiveness with which firms use IT. We hope our findings will
encourage additional work in this important area.
References
Alavi, M., and Leidner, D.E. 2001. "Review: Knowledge management and knowledge management systems: Conceptual foundations and research issues," MIS Quarterly (25:1), pp 107-136.
Anderson, T.W., and Hsiao, C. 1981. "Estimation of Dynamic Models with Error Components," Journal of the American Statistical Association (76:375), pp 598-606.
Angrist, J.D., and Pischke, J.-S. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton, NJ: Princeton University Press.
Archak, N., Ghose, A., and Ipeirotis, P.G. 2011. "Deriving the Pricing Power of Product Features by Mining Consumer Reviews," Management Science (57:8), August 1, 2011, pp 1485-1509.
Arellano, M., and Bond, S. 1991. "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations," The Review of Economic Studies (58:2), pp 277-297.
Arellano, M., and Bover, O. 1995. "Another Look at the Instrumental Variable Estimation of Error-components Models," Journal of Econometrics (68:1), pp 29-51.
Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica, I., and Zaharia, M. 2010. "A View of Cloud Computing," Communications of the ACM (53:4), pp 50-58.
Åstebro, T. 2004. "Sunk Costs and the Depth and Probability of Technology Adoption," The Journal of Industrial Economics (52:3), pp 381-399.
Attewell, P. 1992. "Technology Diffusion and Organizational Learning: The Case of Business Computing," Organization Science (3:1), pp 1-19.
Azoulay, P., Graff Zivin, J.S., and Sampat, B.N. 2011. "The Diffusion of Scientific Knowledge Across Time and Space: Evidence from Professional Transitions for the Superstars of Medicine." National Bureau of Economic Research.
Azoulay, P., Graff Zivin, J.S., and Wang, J. 2010. "Superstar Extinction," Quarterly Journal of Economics (125:2), pp 549-589.
Blackwell, M., Iacus, S., King, G., and Porro, G. 2010. "cem: Coarsened Exact Matching in Stata," Stata Journal (9:4), pp 524-546.
Blundell, R., and Bond, S. 1998. "Initial Conditions and Moment Restrictions in Dynamic Panel Data Models," Journal of Econometrics (87:1), pp 115-143.
Bresnahan, T.F., and Greenstein, S. 1996. "Technical Progress and Co-invention in Computing and in the Uses of Computers," Brookings Papers on Economic Activity: Microeconomics (1996), pp 1-83.
Buell, R.W., Campbell, D., and Frei, F.X. 2010. "Are Self-Service Customers Satisfied or Stuck?," Production and Operations Management (19:6), pp 679-697.
Cameron, A.C., and Trivedi, P.K. 2010. Microeconometrics Using Stata, Revised Edition. College Station, TX: Stata Press.
Casalicchio, E., and Colajanni, M. 2000. "Scalable Web Clusters with Static and Dynamic Contents," IEEE International Conference on Cluster Computing, 2000., pp. 170-177.
Chatterjee, D., Grewal, R., and Sambamurthy, V. 2002. "Shaping Up for E-Commerce: Institutional Enablers of the Organizational Assimilation of Web Technologies," MIS Quarterly (26:2), pp 65-89.
Chen, H., De, P., and Hu, J. 2015. "IT-Enabled Broadcasting in Social Media: An Empirical Study of Artists’ Activities and Music Sales," Information Systems Research (forthcoming).
Cherkasova, L. 2000. "FLEX: Load Balancing and Management Strategy for Scalable Web Hosting Service," Fifth IEEE Symposium on Computers and Communications (ISCC 2000), Antibes, France, pp. 8-8.
Chwelos, P., Benbasat, I., and Dexter, A.S. 2001. "Research Report: Empirical Test of an EDI Adoption Model," Information Systems Research (12:3), pp 304-321.
Emison, J.M. 2013. "2013 State of Cloud Computing."
Fichman, R.G. 2004. "Real Options and IT Platform Adoption: Implications for Theory and Practice," Information Systems Research (15), pp 132-154.
Fichman, R.G., and Kemerer, C.F. 1997. "The Assimilation of Software Process Innovations: An Organizational Learning Perspective," Management Science (43:10), pp 1345-1363.
Fichman, R.G., and Kemerer, C.F. 1999. "The Illusory Diffusion of Innovation: An Examination of Assimilation Gaps," Information Systems Research (10:3), pp 255-275.
Forman, C., Goldfarb, A., and Greenstein, S. 2008. "Understanding the Inputs into Innovation: Do Cities Substitute for Internal Firm Resources?," Journal of Economics & Management Strategy (17:2), pp 295-316.
Frei, F.X. 2008. "The Four Things a Service Business Must Get Right," Harvard Business Review (86:4), pp 70-80.
Furman, J.L., Jensen, K., and Murray, F. 2012. "Governing Knowledge in the Scientific Community: Exploring the Role of Retractions in Biomedicine," Research Policy (41), pp 276-290.
Garcia, D.F., Rodrigo, G., Entrialgo, J., Garcia, J., and Garcia, M. 2008. "Experimental Evaluation of Horizontal and Vertical Scalability of Cluster-based Application Servers for Transactional Workloads," in: 8th International Conference on Applied Informatics and Communications (AIC'08). Rhodes, Greece: World Scientific and Engineering Academy and Society (WSEAS), pp. 29-34.
Ghose, A. 2009. "Internet Exchanges for Used Goods: An Empirical Analysis of Trade Patterns and Adverse Selection," MIS Quarterly (33:2), pp 263-291.
Greene, W.H. 2008. Econometric Analysis, (6th ed.). New Jersey: Pearson Prentice Hall.
Hamilton, J.D. 1994. Time Series Analysis. Princeton: Princeton University Press.
Hansen, L.P. 1982. "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica (50:4), pp 1029-1054.
Harms, R., and Yamartino, M. 2010. "The Economics of the Cloud," Microsoft.
Hitt, L.M., Wu, D.J., and Zhou, X. 2002. "Investment in Enterprise Resource Planning: Business Impact and Productivity Measures," Journal of Management Information Systems (19:1), pp 71-98.
Ho, D., Imai, K., King, G., and Stuart, E. 2007. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference," Political Analysis (15:3), pp 199-236.
Iacus, S.M., King, G., and Porro, G. 2012. "Causal Inference without Balance Checking: Coarsened Exact Matching," Political analysis (20:1), pp 1-24.
Imbens, G., and Wooldridge, J. 2007. "Control Function and Related Methods," in: NBER Summer Institute - What's New in Econometrics? http://www.nber.org/minicourse3.html.
Ko, D.-G., Kirsch, L.J., and King, W.R. 2005. "Antecedents of Knowledge Transfer from Consultants to Clients in Enterprise System Implementations," MIS Quarterly (29:1), pp 59-85.
Levenshtein, V.I. 1966. "Binary Codes Capable of Correcting Deletions, Insertions, and Reversals," Soviet Physics Doklady (10:8), pp 707-710.
Mell, P., and Grance, T. 2011. "The NIST Definition of Cloud Computing," National Institute of Standards and Technology Information Technology Laboratory (ed.). Gaithersburg, MD.
Michael, M., Moreira, J.E., Shiloach, D., and Wisniewski, R.W. 2007. "Scale-up x Scale-out: A Case Study using Nutch/Lucene," Parallel and Distributed Processing Symposium (IPDPS 2007). IEEE International, pp. 1-8.
Microsoft, and Edge Strategies. 2011. "SMB Cloud Adoption Study Dec 2010 - Global Report," http://www.edgestrategies.com.
Nickell, S. 1981. "Biases in Dynamic Models with Fixed Effects," Econometrica (49:6), pp 1417-1426.
Parthasarathy, M., and Bhattacherjee, A. 1998. "Understanding Post-Adoption Behavior in the Context of Online Services," Information Systems Research (9:4), pp 362-379.
Reese, G. 2009. Cloud Application Architectures: Building Applications and Infrastructure in the Cloud. O'Reilly Media.
Retana, G.F., Forman, C., and Wu, D.J. 2015. "Proactive Customer Education, Customer Retention, and Demand for Technology Support: Evidence from a Field Experiment," Manufacturing and Service Operations Management (forthcoming).
Roodman, D. 2009a. "How to Do xtabond2: An Introduction to Difference and System GMM in Stata," Stata Journal (9:1), pp 86-136.
Roodman, D. 2009b. "A Note on the Theme of Too Many Instruments," Oxford Bulletin of Economics & Statistics (71:1), pp 135-158.
SearchDataCenter.com. 2011. "Data Center Decisions 2011 Survey Special Report," TechTarget.
Singh, J., and Agrawal, A. 2011. "Recruiting for Ideas: How Firms Exploit the Prior Inventions of New Hires," Management Science (57:1), pp 129-150.
Symantec. 2011. "State of the Cloud Survey," Symantec.
Windmeijer, F. 2005. "A Finite Sample Correction for the Variance of Linear Efficient Two-step GMM Estimators," Journal of Econometrics (126:1), pp 25-51.
Wu, D.J., Ding, M., and Hitt, L.M. 2013. "IT Implementation Contract Design: Analytical and Experimental Investigation of IT Value, Learning, and Contract Structure," Information Systems Research (24:3), pp 787-801.
Xue, M., and Harker, P.T. 2002. "Customer Efficiency," Journal of Service Research (4:4), pp 253-267.
Xue, M., Hitt, L.M., and Chen, P.-Y. 2011. "Determinants and Outcomes of Internet Banking Adoption," Management Science (57:2), pp 291-307.
Xue, M., Hitt, L.M., and Harker, P.T. 2007. "Customer Efficiency, Channel Usage, and Firm Performance in Retail Banking," Manufacturing & Service Operations Management (9:4), pp 535-558.
Zhu, K., and Kraemer, K.L. 2005. "Post-Adoption Variations in Usage and Value of E-Business by Organizations: Cross-Country Evidence from the Retail Industry," Information Systems Research (16:1), pp 61-84.
Zhu, K., Kraemer, K.L., and Xu, S. 2006. "The Process of Innovation Assimilation by Firms in Different Countries: A Technology Diffusion Perspective on E-Business," Management Science (52:10), pp 1557-1576.
APPENDIX FOR
TECHNOLOGY SUPPORT AND POST-ADOPTION IT SERVICE USE: EVIDENCE FROM THE CLOUD
A. Summary and Description of Variables
The following is a summary and description of all the variables used throughout the analyses, both as covariates in the regressions and as criteria for the matching (CEM) process.
The CEM criteria include indicator variables for: buyers with more than 250 employees; a High Uncertainty Usage use case; a Low Uncertainty Usage use case; a Back Office Applications use case; a Hosting Services use case; and a Test and Development use case. Please see Appendix F for further details on the construction of all CEM-related variables.
B. Results Considering Switching to Basic
The objective of this appendix is to assess how our results change if we control for full support
buyers' downgrades to basic support. Model (1), presented in section 3.1, includes the covariate
FullSupport_{i,t}, which indicates whether full support was adopted by buyer i by time t. In other words, if
buyer i used full support for the first time in period s, then FullSupport_{i,t} = 1{t ≥ s}. Buyers
accessing full support have the option to downgrade to basic support. Let FormerFullSupport_{i,t}
be a binary variable that signals whether buyer i does not have access to full support by the end of the
focal month t but was using full support at the start of the focal month or in some prior month(s).
In other words, if buyer i switched from full support to basic support in period j, then
FormerFullSupport_{i,t} = 1{t ≥ j}. We augment Model (1) with the FormerFullSupport_{i,t} variable
Standard errors: (0.004), (0.011), (0.012), (0.023). Observations: 324,406; 43,355; 33,779; 11,888. Buyers: 21,573; 2,684; 2,029; 687. R²: 0.637; 0.614; 0.611; 0.628. Upgrade change (coefficient × 100): 3.24; 4.02; 4.00; 4.30. Dependent variable is FractionParallel_{i,t}. All regressions include monthly calendar and tenure dummies. Robust standard errors, clustered on buyers, in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
C. Provider Cloud Infrastructure Service and Technology Support Offerings
This appendix offers additional details, beyond those presented in the manuscript, on the
research context and the provider's service characteristics. In our particular
setting, the cloud provider has recognized that the novelty of the service plus the complexities
involved in deploying distributed architectures that best leverage the cloud’s scalability may pose
significant knowledge barriers to buyers attempting to use the service. In response to this, the
provider offers them the option to contract and access full support. We discuss first the pricing
and terms of the cloud infrastructure service offering, and then elaborate on what characterizes
full support.
One of the essential characteristics of cloud infrastructure services is that they are offered
on-demand (Mell and Grance 2011). Buyers only pay for what they use, and nothing else: there
are no sign-up fees, no minimum spending requirements, no periodic subscription fees, and,
since buyers can also simply stop using the service, no contract termination
penalties. Moreover, in the particular case of our provider, the computing resources are
offered to buyers at fixed hourly rates that increase in server size or capacity, generally in a
linear fashion. Servers’ capacity is defined in terms of memory (GB of RAM), processing power
(number of virtual CPUs), and local storage (GB space of local hard disk). During our
observation period, the three capacity metrics tend to vary together as a bundle, meaning that
more of one is generally associated with more of the other two, yet prices are set and buyers
usually make infrastructure sizing decisions in terms of memory. Prices also vary depending on
the operating system chosen for a server (e.g., Windows servers cost more than Linux servers),
yet such heterogeneity does not alter our main findings (see Appendix D for a detailed analysis).
Buyers in our context can launch as many servers and of any size they want, when they
want. However, as is discussed in section 5.2 and in Appendix I, there are important technical
challenges in deploying horizontally scalable configurations where several cloud servers work in
parallel. These challenges may in turn limit buyers’ ability to use many servers at once. Finally,
there are no usage caps, with the only exceptions to this being that the provider may have limited
hardware installed at its data centers or may take security measures to prevent misuse of its
service (e.g., spamming). In other words, for legitimate buyers, there is no pre-defined cap or limit to
how much they can choose to use the service.
The provider complements its infrastructure offering with full support, which is offered
for a fixed price premium per server-hour used plus an additional fixed monthly fee. For
instance, instead of paying $0.10 per hour for a 2GB RAM Linux server under basic support, a
full support buyer would pay $0.12 more, i.e., $0.22 per hour. Similarly, for the 4GB RAM
server priced at $0.20 per hour under basic support, the full support buyer would pay $0.32 per
hour. The monthly fee is a subscription charge set high enough to deter
buyers with very low willingness to pay (e.g., bloggers that use a single very small server). There
are no sign-up or termination fees for the full support service. The only explicit switching cost
from one support level to another is technical rather than monetary: when downgrading from full
support to basic support, because of technical limitations in the service offering (during our
observation period), buyers must redeploy their servers on their own under the new support
regime. The redeployment will involve launching new servers with virgin operating systems (i.e.,
“out of the box”), and then installing and configuring their business applications on them.
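The per-hour pricing described above can be expressed as a small helper. This is a minimal sketch using only the rates quoted in the text; the fixed monthly full-support fee, whose amount is not disclosed, is omitted:

```python
# Sketch of the provider's hourly pricing as described in the text.
# The basic rates for the 2 GB and 4 GB Linux servers and the $0.12
# full-support premium are from the text; other sizes are not shown.
BASIC_HOURLY = {2: 0.10, 4: 0.20}   # $/hour by server size (GB RAM)
FULL_SUPPORT_PREMIUM = 0.12         # added to every server-hour

def hourly_rate(gb_ram, full_support):
    rate = BASIC_HOURLY[gb_ram]
    return rate + FULL_SUPPORT_PREMIUM if full_support else rate

hourly_rate(2, True)   # 0.10 + 0.12 = 0.22
hourly_rate(4, True)   # 0.20 + 0.12 = 0.32
```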
A prime goal of full support is to educate buyers on how to best use the cloud
infrastructure service and adapt it to their idiosyncratic business needs. When receiving full
support, buyers receive personalized guidance and training, and thus have the opportunity to
learn directly from the provider’s prior experience in deploying applications in the cloud. Buyers
not willing to pay the price premiums will only receive a basic level of support that has limited
scope in the sense that it is intended to aid buyers with issues concerning account management or
overall performance of the infrastructure service. For example, while a full support buyer may be
personally guided step by step on how to deploy a web server through phone conversations, live
chat sessions or support tickets, basic support buyers will be referred to a knowledge base.
Similarly, if a server failed, which happens much more frequently than in traditional datacenter
settings given the commodity hardware employed and the multi-tenant architecture (i.e., multiple
organizations’ virtual servers are hosted in the same and shared physical server), the provider
would work together with full support buyers in solving the issues, while basic support users
would only be notified about the failure, if anything. Thus, basic support buyers do not have
fluid access to external knowledge from the provider and have to rely mostly on their internal
capabilities to co-produce the service.
D. Implications of Server Operating System Heterogeneity
Throughout our econometric approach, we capture the average effect of full support on buyer
behavior while acknowledging that there may be heterogeneity in that average effect. The buyer
choice of a server’s operating system (OS) is one source of such heterogeneity. In this appendix
we examine how OS choice influences our results.
Even though we observe buyer OS choice, it is difficult to hypothesize how variations in
OS choice influence buyer behavior. With the OS, there is variation in prices as well as in
difficulty of system administration. However, it is unclear ex-ante how this relates to the effect
of provider support on buyer behavior. Furthermore, the observed OS usage is an endogenous
choice for which we do not have an instrument. Given these nuances, we have not explicitly
studied how heterogeneity in OS choice influences full support’s effects on buyer behavior.
Having said that, and despite the fact that we lack a valid instrument for buyer OS choice,
there is value in exploring buyer OS choice heterogeneity. In section D.1 we describe the buyer
preferences regarding which OS they use. We show that most (e.g., 85%) of them tend to use a
single OS throughout their observed tenure; a consequence of this is that our buyer fixed effects
absorb some of the OS-specific heterogeneity. Then, as a robustness check, in section D.2 we run
some variations of our models considering the OS preferences by interacting our support
indicator with OS choice indicators. The results are consistent with our main findings.
D.1. Buyer OS Preferences
During the time span of our data, the provider offered its servers running four different operating
systems (OSs), and we observe which OS each individual server used:
1. Linux: several distributions, although we do not observe which.
2. Windows: several versions of Windows Server, although we do not observe which.
3. Red Hat Enterprise Linux.
4. SQL Server: this is really a Windows Server running SQL Server, yet it was offered under its own price scheme and hence is treated as a separate OS for this exercise.
Even though there were multiple OS available, as we show in Table D.1, most buyers
either exclusively or at least primarily used a single OS. To determine if a buyer is a user of a
particular OS, we computed the proportion of the total amount of GB RAM-hours consumed by
each buyer over its observed tenure that were consumed with each of the 4 different OS. Then,
using arbitrary yet high thresholds (from 85% up to 100%), we flag a customer as a user of a
certain OS if the proportion of service use with that OS is greater than or equal to the defined
threshold. Using these proportions of workloads under each OS and varying thresholds, we
populated each column of Table D.1 as follows:
• Sample: We show data for two samples, the full baseline sample and the CEM1 subsample, to show that the proportions of buyers using each OS do not vary significantly with the matching process.
• Threshold: Indicates the percentage of total usage under a specific OS used to flag a buyer as a user of that OS.
• Linux, Windows, Red Hat and SQL: Indicate the proportion of buyers who used at least the threshold share of their total usage under the corresponding OS. For example, 57.35% of buyers in the baseline sample used at least 99% of all their GB RAM-hours on Linux servers.
• Only 1: Given a certain threshold, the proportion of buyers that used only a single OS. This column is the sum of the four OS columns to its left.
• Mixed: Given a certain threshold, the proportion of buyers that used a mix of more than a single OS. The "Only 1" column and this column add up to 100%.
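The OS-flagging procedure described above can be sketched as follows. The record format and buyer IDs are illustrative assumptions; only the share-and-threshold logic comes from the text:

```python
# Sketch: for each buyer, compute the share of total GB RAM-hours
# consumed under each OS, and flag the buyer as a user of an OS when
# that share meets or exceeds the threshold.
from collections import defaultdict

def flag_os_users(usage_records, threshold=0.95):
    """usage_records: iterable of (buyer_id, os_name, gb_ram_hours)."""
    totals = defaultdict(float)
    by_os = defaultdict(float)
    for buyer, os_name, hours in usage_records:
        totals[buyer] += hours
        by_os[(buyer, os_name)] += hours
    flags = defaultdict(set)
    for (buyer, os_name), hours in by_os.items():
        if totals[buyer] > 0 and hours / totals[buyer] >= threshold:
            flags[buyer].add(os_name)
    return flags

# illustrative records: buyer b1 is 95% Linux, buyer b2 is mixed
records = [("b1", "Linux", 950), ("b1", "Windows", 50),
           ("b2", "Linux", 400), ("b2", "Windows", 600)]
flags = flag_os_users(records, threshold=0.95)
# b1 is flagged as a Linux user; b2 clears no threshold and gets no flag
```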
The main takeaway from Table D.1, and in particular from the "Only 1" column, is that
most customers primarily use a single OS. For instance, 66.96% of buyers in the baseline sample
ran all their servers using a single OS, and 80.13% ran at least 95% of their workloads using a
single OS.
The results in Table D.2 are generally consistent with our main findings. However, we refrain
from a detailed analysis of the coefficients since OS choice is an endogenous variable for which
we do not have an instrument.
E. Description of Origin of Survey Data
The survey is optional and administered as part of the online signup web form; the response rate
is 43.4%, and we have not found systematic differences between respondents and non-
respondents. The survey was first administered in June 2010, and we have all buyers’ responses
until February 2012. Although there can only be one survey response per account, since buyers
can have multiple accounts, we may also have multiple responses per buyer. In our data we have
6,152 survey responses from 5,565 different buyers in the baseline sample, 431 of which
changed their response to at least one item across their surveys. However, for 42.3% of the
buyers with varying responses, the time gap between survey responses is too short (i.e., less
than three months) to plausibly reflect changes in firms' sizes or goals. Given
this, we do not rely on variance across responses for our analysis and rather only consider the
5,134 buyers that either have a single survey response or that have consistent responses across all
their submissions. Further, we have not considered firm attributes in the survey as controls in our
models since they do not vary over time and thus would be absorbed by the firm fixed effect. We
use 3 of the items in the survey: the firms’ total employment, their intended use case for the
cloud infrastructure service, and their industry. For employment, the survey asks buyers to
indicate their range of employment and we convert the survey’s ranges to numerical values by
taking the mean value of each range (e.g., we convert “From 51 to 100” to 75).
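This range-to-midpoint conversion can be sketched as below. Only the example mapping "From 51 to 100" to 75 is given in the text, so the integer-midpoint rounding used here is an assumption chosen to reproduce that example, and the label format is also assumed:

```python
import re

# Sketch of converting a survey employment range to a single number.
# The integer midpoint (floor division) reproduces the paper's example
# "From 51 to 100" -> 75; the exact rounding rule is our assumption.
def employment_midpoint(label):
    lo, hi = map(int, re.findall(r"\d+", label))
    return (lo + hi) // 2

employment_midpoint("From 51 to 100")  # (51 + 100) // 2 = 75
```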
F. Description of CEM Procedures and Subsamples
F.1. Overview of CEM Procedure
We run our models on subsamples defined using a coarsened exact matching (CEM) procedure
(Blackwell et al. 2010; Iacus et al. 2012). For matching purposes, we consider buyers who
adopted full support at any point in their tenure as treated and those that relied exclusively on
basic support as controls. Matching reduces endogeneity concerns (Ho et al. 2007), and CEM has
been used extensively in recent work to improve the identification of appropriate control groups
in difference-in-differences estimation (e.g., Azoulay et al. 2011; Azoulay et al. 2010; Furman et
al. 2012).
CEM is particularly convenient for our setting because it is a nonparametric procedure
that does not require the estimation of propensity scores. This is useful because, aside from the
exogenous failures, we have limited data that would allow us to directly predict the likelihood of
full support. Each unique vector formed by combinations of the coarsened covariates describes a
stratum. Since the number of treated and control observations in each strata may be different,
observations are weighted according to the size of their strata (Iacus et al. 2012). The differences
in means between the treated and the controls across the various matching variables are almost
all statistically significant. However, the samples are perfectly balanced and any mean
differences are eliminated once we apply the CEM weights (see Table F.8 for descriptive
statistics with weights applied). All our regressions with CEM-based subsamples employ these
weights. When exact matching is possible, such that for every treated observation there is a
control observation identical to the first one across all possible covariates except for the
treatment, a simple difference in means of the dependent variables would provide an estimate of
the causal effect of interest. Nonetheless, since it is nearly impossible to use exact matching in
observational data and thus there is always a concern about the influence of omitted variables,
we continue using our fixed effects panel data model to control for them.
We match buyers based on five attributes: (1) level of IT use (i.e., memory use), (2)
frequency of cloud infrastructure resizing (i.e., how often buyers launch a server, halt a server, or
resize an existing one), (3) employment, (4) intended use case for the cloud infrastructure
service, and (5) industry. The first two attributes are derived directly from firms’ observed usage
of the cloud service. The latter three attributes come from the optional signup survey described in
Appendix E. The precise matching criteria are described below in section F.2.
For the matching process, we only consider treated buyers who started using the cloud
service with basic support and upgraded to full support later on. This allows us to match the
upgraders to controls based on their usage behavior before they adopted full support, had the
controls adopted full support in the same month of their tenure. This approach, which is similar
to the one implemented by Azoulay et al. (2010) and Singh and Agrawal (2011), ensures to the
extent possible that treated firms do not exhibit differential usage behavior before they adopt full
support relative to controls. Among the 5,134 buyers for which we have all this data (i.e., they
answered the signup survey), 1,259 are treated and 3,875 are potential controls. Using the five
criteria described above, we develop three different weighted matched subsamples. Each of the three
subsamples is built with increasingly stringent matching criteria, which in turn produces finer
strata (i.e., fewer buyers satisfy the matching criteria used) and reduces concerns about having
overly coarse strata. The details of the subsample construction are offered in section F.3.
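The CEM procedure described above, in which covariates are coarsened into bins, strata are formed from the bin vectors, unmatched strata are dropped, and controls are weighted, can be sketched as follows. This is a minimal illustration of the weighting scheme in Iacus et al. (2012), not the authors' implementation; the covariate name and cutpoints are placeholders that echo the memory bins in Appendix F:

```python
# Minimal sketch of coarsened exact matching (CEM) with weights.
from collections import defaultdict
import bisect

def coarsen(value, cutpoints):
    # map a continuous covariate to a bin index
    return bisect.bisect_right(cutpoints, value)

def cem_weights(units, cutpoints_by_covariate):
    """units: list of dicts with a 'treated' flag and covariate values."""
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for i, u in enumerate(units):
        key = tuple(coarsen(u[c], cuts)
                    for c, cuts in cutpoints_by_covariate.items())
        strata[key]["treated" if u["treated"] else "control"].append(i)
    # keep only strata that contain both treated and control units
    matched = {k: s for k, s in strata.items() if s["treated"] and s["control"]}
    m_t = sum(len(s["treated"]) for s in matched.values())
    m_c = sum(len(s["control"]) for s in matched.values())
    weights = {}
    for s in matched.values():
        for i in s["treated"]:
            weights[i] = 1.0          # treated units get weight 1
        w = (len(s["treated"]) / len(s["control"])) * (m_c / m_t)
        for i in s["control"]:
            weights[i] = w            # controls reweighted per stratum
    return weights

# illustrative usage: the 'mem' cutpoints echo the paper's memory bins
units = [{"treated": True, "mem": 3.0}, {"treated": False, "mem": 3.5},
         {"treated": False, "mem": 2.5}, {"treated": True, "mem": 10.0}]
weights = cem_weights(units, {"mem": [0.5, 1, 2, 4, 8]})
# the last unit falls in a stratum with no controls and is dropped
```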
F.2. CEM Matching Criteria
Five different attributes of firms were used to match treated and control buyers. In this section we
describe each of them, as well as the binning done within each attribute. Their corresponding
descriptive statistics are shown in Table F.1.
Table F.1. Descriptive Statistics of Variables used for CEM before matching (5,134 buyers). Buyer Role: All buyers, Controls, Treated. Number of Buyers: 5,134; 3,875; 1,259.
| Buyer path | Full | Baseline | Support | CEM1 | CEM2 | CEM3 |
|---|---|---|---|---|---|---|
| Only use basic support | 73,594 | 16,157 | 14,338 | 2,365 | 1,732 | 526 |
| Start with basic, upgrade to full support | 1,409 | 1,408 | 1,132 | 275 | 258 | 136 |
| Start with basic, upgrade to full, and downgrade to basic | 205 | 203 | 159 | 45 | 39 | 25 |
| Start with full, downgrade to basic | 215 | 215 | 215 | Excluded | Excluded | Excluded |
| Only full support | 4,196 | 4,196 | 4,196 | Excluded | Excluded | Excluded |
| Cloud infrastructure usage and support choice data | Yes | Yes | Yes | Yes | Yes | Yes |
| Survey data used for CEM | Incomplete | Incomplete | Incomplete | Yes | Yes | Yes |
| Support interaction data used to construct IVs | Incomplete | Incomplete | Yes | Yes | Yes | Yes |
| CEM procedure applied? | No | No | No | Yes | Yes | Yes |
Table F.3. Description of Matching Criteria used in CEM Procedures
Abbreviation | Description | # of Categories | Categories
Emp | Employment | 5 | 0-10, 11-50, 51-100, 101-250, >250
UC | General use cases (can have more than 1) | 5 | High variance, low variance, back office, hosting, test & dev
Mem | Memory usage in months before upgrade | 9 | <0.5, 0.5-1, 1-2, 2-4, 4-8, 8-16, 16-32, 32-64, >64
Adj | Frequency of infrastructure resizing in months before upgrade | 5 | 0, 1-2, 3-9, 10-43, >43
Ind | Industries | 258 | Popular ones have 11% to 15% of observations
t-upg | Upgrade month for treated, and month in tenure for controls | 40 | One per month; longest delay in upgrading is 40 months
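The coarsening and weighting behind a CEM procedure like the one above can be sketched as follows. The bin edges for Emp and Mem are taken from Table F.3; everything else (the data layout, the `make_strata` and `cem_weights` helpers, the column names) is illustrative rather than the authors' actual code.

```python
# Sketch of a coarsened-exact-matching (CEM) step, assuming one row per
# buyer with raw covariates and a 0/1 `treated` flag. Bin edges mirror
# Table F.3; helper and column names are hypothetical.
import pandas as pd

EMP_BINS = [0, 10, 50, 100, 250, float("inf")]          # Emp bins, Table F.3
MEM_BINS = [0, 0.5, 1, 2, 4, 8, 16, 32, 64, float("inf")]  # Mem bins, Table F.3

def make_strata(df):
    """Assign each buyer to a stratum defined by its coarsened covariates."""
    out = df.copy()
    out["emp_bin"] = pd.cut(out["emp"], EMP_BINS)
    out["mem_bin"] = pd.cut(out["mem"], MEM_BINS)
    # A stratum is an exact-match cell on all coarsened covariates
    # (only Emp, Mem, and the upgrade month t-upg are shown here).
    out["stratum"] = (
        out[["emp_bin", "mem_bin", "t_upg"]].astype(str).agg("|".join, axis=1)
    )
    return out

def cem_weights(df):
    """Drop strata lacking either treated or controls, then weight controls
    so each stratum's control mass equals its treated mass (standard CEM
    weights); treated units keep weight 1."""
    keep = df.groupby("stratum")["treated"].transform(
        lambda s: s.min() == 0 and s.max() == 1
    )
    df = df[keep].copy()
    n_t = df["treated"].sum()          # matched treated, all strata
    n_c = len(df) - n_t                # matched controls, all strata
    g = df.groupby("stratum")["treated"]
    m_t = g.transform("sum")           # treated in own stratum
    m_c = g.transform("size") - m_t    # controls in own stratum
    df["w"] = 1.0
    ctrl = df["treated"] == 0
    df.loc[ctrl, "w"] = (m_t[ctrl] / m_c[ctrl]) * (n_c / n_t)
    return df
```

Tightening the bins (as in CEM2 and CEM3) shrinks the cells, so fewer buyers find an exact coarsened match but the matched treated and controls are more comparable.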
Matched Buyers / Total Buyers: Controls | Treated | Both; Average Buyers per Stratum: Controls | Treated | Both; Variables used for matching: Emp | UC | Mem | Adj | Ind | t-upg
Table F.5. Descriptive Statistics of Time-Varying Variables
Sample | Full | Baseline | Support | CEM1 | CEM2 | CEM3
Buyers | 79,619 | 22,179 | 20,040 | 2,685 | 2,029 | 687
Observations | 1,073,998 | 368,606 | 298,539 | 48,725 | 37,837 | 13,262
(Each cell below reports Mean, S.D., Min, Max for the corresponding sample.)
Memory_{i,t} | 3.4, 19.2, 0, 2,284.5 | 7.9, 31.4, 0.0, 2,284.5 | 7.3, 27.4, 0.0, 2,284.5 | 5.2, 17.2, 0.0, 675.9 | 6.7, 22.1, 0.0, 675.9 | 5.1, 12.1, 0.0, 329.0
lnMemory_{i,t} | 0.746, 0.871, 0, 7.734 | 1.348, 1.040, 0, 7.734 | 1.343, 1.014, 0, 7.734 | 1.218, 0.894, 0, 6.518 | 1.302, 0.986, 0, 6.518 | 1.221, 0.920, 0, 5.799
FullSupport_{i,t} | 0.055, 0.228, 0, 1 | 0.160, 0.367, 0, 1 | 0.183, 0.387, 0, 1 | 0.078, 0.268, 0, 1 | 0.093, 0.291, 0, 1 | 0.155, 0.362, 0, 1
SwitchToBasic_{i,t} | 0.003, 0.052, 0, 1 | 0.008, 0.089, 0, 1 | 0.009, 0.093, 0, 1 | 0.007, 0.082, 0, 1 | 0.007, 0.084, 0, 1 | 0.013, 0.113, 0, 1
FractionParallel_{i,t} | 0.058, 0.198, 0, 1 | 0.121, 0.266, 0, 1 | 0.120, 0.267, 0, 1 | 0.106, 0.251, 0, 1 | 0.116, 0.258, 0, 1 | 0.104, 0.246, 0, 1
Table F.6. Descriptive Statistics of Variables used in Survey Data for CEM before matching (5,134 buyers)
Buyer Role | All Buyers | Controls | Treated | t-test of mean
Table F.7. Descriptive Statistics of Variables in CEM1 Matched Sample without Weights (2,685 buyers)
Buyer Role | All Buyers | Controls | Treated | t-test of mean
Table F.8. Descriptive Statistics of Variables in CEM1 Matched Sample with Weights (2,685 buyers)
Buyer Role | All Buyers | Controls | Treated | t-test of mean
G. Support Interactions and Construction of Instruments
G.1. Support Interactions Coding Process
The content of the support interactions between the provider and its buyers was used to identify
three types of exogenous failures experienced by buyers. The following are the keywords and
phrases used to identify each of these types of interactions. All support interactions that matched
a keyword or phrase were visually examined to rule out false positives.
Table G.1. Keywords and Phrases Searched for Support Interactions Coding
Support Interaction Type | Description of Event | List of keywords or phrases
FailOutage
Provider may suffer from generalized outages in different components of its service (e.g., memory leak in provider’s cloud management system). Such generalized problems are announced in the provider’s status webpage and/or announced to buyers.
Provider's service status URL, cloud status, outage, scheduled maintenance, undergoing maintenance
FailNetwork
Some node in the provider’s infrastructure, generally belonging to some buyer, is suffering from a distributed denial of service attack (DDoS) or some networking hardware device has failed.
Server does not respond to ARP requests, faulty switch, network issue in our data center, lb in error state, load-balancer hardware nodes, DDoS
FailHost
Buyer is suffering degraded performance due to a problem in the physical host in which the buyer’s virtual machine runs. Problems are generally associated with excessive read/write operations on the hard disks, either by the buyer or by another buyer whose virtual machine lives in same physical server. Problems could also be associated with failure of the physical hardware.
Consuming a significant amount of Disk I/O, very high disk I/O usage, iowait, iostat, swapping, swappers, swap space, extreme slowness, slowdown problems, hardware failure, degraded hardware, drive failing, drives failing, server outage, host failure, server is down, server down, site down, host became unresponsive, server unresponsive, server not responding, server is unresponsive, is hosted on has become unresponsive, problem with our server, host server, physical host, physical hardware, physical machine, host machine, failing hardware, hardware failure, imminent hardware issues, migrate your cloud server to another host, queued for move, issue on the migrations, host server of your cloud servers
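The keyword screen described above can be sketched as a simple case-insensitive phrase search. The type labels and the handful of phrases per type are abbreviated from Table G.1 for illustration; since every flagged interaction was subsequently read by hand to remove false positives, the screen only needs to be broad, not precise.

```python
# Minimal sketch of the keyword-based coding of support interactions.
# Type names and keyword snippets are illustrative abbreviations of the
# full lists in Table G.1; this is not the authors' actual code.
import re

KEYWORDS = {
    "FailOutage": ["cloud status", "outage", "scheduled maintenance"],
    "FailNetwork": ["faulty switch", "network issue in our data center", "ddos"],
    "FailHost": ["iowait", "hardware failure", "host became unresponsive"],
}

# One precompiled alternation per type; phrases are escaped so they match
# literally, and matching is case-insensitive.
PATTERNS = {
    t: re.compile("|".join(re.escape(k) for k in kws), re.IGNORECASE)
    for t, kws in KEYWORDS.items()
}

def candidate_types(text):
    """Return the interaction types whose keyword list matches `text`.
    Flagged interactions would then be reviewed manually."""
    return [t for t, pat in PATTERNS.items() if pat.search(text)]
```

An interaction can match more than one type (e.g., a ticket mentioning both an outage and high iowait), which is another reason the manual review step matters.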
G.2. Construction of Support-Based Variables
Let X ∈ {FailOutage, FailNetwork, FailHost} represent a type of support interaction
identified through the coding process. Let NumX_{i,t} be the number of support interactions of type X
counted for buyer i during month t. Further, let AccX_{i,t} be the accumulated number of support
interactions of type X that buyer i has experienced up to month t. Formally, AccX_{i,t} =
Σ_{s≤t} NumX_{i,s}. Finally, we construct indicators that are turned on when the total number of
interactions reaches at least n = 1, 2, as Xn_{i,t} = 1{AccX_{i,t} ≥ n}. Then, for example, variable
FailOutage2_{i,t} will be equal to 1 if buyer i has accumulated at least 2 support interactions that
have been coded as type FailOutage by month t.
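Given a buyer-month panel of coded interaction counts, the accumulated totals and threshold indicators of this kind reduce to a grouped cumulative sum. A sketch under assumed column names (`buyer`, `month`, `Num<type>`); the helper is illustrative, not the authors' code.

```python
# Sketch: build AccX (running total per buyer) and XN = 1{AccX >= N}
# indicators from monthly NumX counts. Column names are hypothetical.
import pandas as pd

def support_variables(df, itype, thresholds=(1, 2)):
    """df has one row per (buyer, month) with a Num<itype> count column.
    Adds Acc<itype> (cumulative count within buyer, months in order) and
    an <itype><N> indicator for each N in `thresholds`."""
    out = df.sort_values(["buyer", "month"]).copy()
    num, acc = f"Num{itype}", f"Acc{itype}"
    # Running total of interactions of this type for each buyer.
    out[acc] = out.groupby("buyer")[num].cumsum()
    # Indicator turns on (and stays on) once the running total reaches N.
    for n in thresholds:
        out[f"{itype}{n}"] = (out[acc] >= n).astype(int)
    return out
```

Because the cumulative sum is non-decreasing, each indicator is an absorbing state: once a buyer has experienced N failures of a given type, the indicator stays at 1 for all later months.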
G.3. Descriptive Statistics of Support Interactions