
    Vol. 32, No. 1, January–February 2013, pp. 153–169
    ISSN 0732-2399 (print) | ISSN 1526-548X (online)
    http://dx.doi.org/10.1287/mksc.1120.0755

    © 2013 INFORMS

    Modeling Consumer Learning from Online Product Reviews

    Yi Zhao
    J. Mack Robinson College of Business, Georgia State University, Atlanta, Georgia 30303, [email protected]

    Sha Yang
    Marshall School of Business, University of Southern California, Los Angeles, California 90089, [email protected]

    Vishal Narayan
    Samuel Curtis Johnson Graduate School of Management, Cornell University, Ithaca, New York 14853, [email protected]

    Ying Zhao
    Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong, [email protected]

    We propose a structural model to study the effect of online product reviews on consumer purchases of experiential products. Such purchases are characterized by limited repeat purchase behavior of the same product item (such as a book title) but significant past usage experience with other products of the same type (such as books of the same genre). To cope with the uncertainty in quality of the product item, we posit that consumers may learn from their experience with the same type of product and others' experiences with the product item. We model the review credibility as the precision with which product reviews reflect the consumer's own product evaluation. The higher the precision, the more credible the information obtained from product reviews for the consumer, and the larger the effect of reviews on the consumer's choice probabilities. We extend the Bayesian learning framework to model consumer learning on both product quality and review credibility. We apply the model to a panel data set of 1,919 book purchases by 243 consumers. We find that consumers learn more from online reviews of book titles than from their own experience with other books of the same genre. In the counterfactual analysis, we illustrate the profit impact of product reviews and how it varies with the number of reviews. We also study the phenomenon of fake reviews. We find that fake reviews increase consumer uncertainty. The effects of more positive reviews and more numerous reviews on consumer choice are smaller on online retailing platforms that have fake product reviews.

    Key words: learning models; choice models; product reviews

    History: Received: May 6, 2011; accepted: September 18, 2012; Peter Fader served as the user-generated content special issue editor and then Preyas Desai served as the editor-in-chief and Scott Neslin served as associate editor for this article. Published online in Articles in Advance December 17, 2012.

    1. Introduction

    The Internet has provided an opportunity for consumers to share product evaluations online, facilitating a new channel for the communication of product information and word of mouth. Online consumer reviews are now widely available on websites of manufacturers, retailers, and market makers, as well as for product categories including both search goods and experiential goods. Before deciding to purchase a specific item, consumers now have online access to numerous product reviews posted by several users.

    Recent research has provided considerable evidence that online product reviews have an impact on product sales. Chevalier and Mayzlin (2006) find a positive relationship between consumer book ratings and book sales. Liu (2006) finds a positive relationship between the volume of reviews of a movie and its box office revenue. Whereas prior research has addressed the link between consumer product reviews and product sales based on aggregate-level analysis, little work has examined the underpinning of such impact in affecting individual consumer purchase decisions. Such microlevel analysis can help marketers gain a deeper understanding of the impact of user-generated product reviews and more effectively respond to this new market phenomenon.

    In this paper, we propose a structural model to study how online reviews affect individual consumer purchases of experiential products. Our basic premise is that consumers are uncertain about the true product


    Zhao et al.: Modeling Consumer Learning from Online Product Reviews. Marketing Science 32(1), pp. 153–169, © 2013 INFORMS

    quality or the extent to which a product matches their preference or usage condition. This gives rise to consumers' need for reading product reviews and learning from other consumers' usage experiences to reduce such uncertainty. This is especially true for experiential products such as books, movies, and music because, unlike other products, these are consumed solely for the pleasure and experience they provide. Consumers have been known to rely on recommendations for experiential products significantly more than other types of products (Senecal and Nantel 2004). For this reason, online reviews are very important when consumers are choosing products they do not have first-hand experience with. By reading the post-consumption evaluations of a product by others, consumers can make a more informed decision about which product(s) to purchase.

    The proposed model is built on the framework of consumer learning of product quality based on past usage experience (Erdem 1998, Narayanan et al. 2005, Zhang 2010). The basic idea is that consumers are imperfectly informed and therefore uncertain about the true quality of a product. This uncertainty is likely to be greater for experiential goods than for search goods. This is because repeated purchase of the same product item (e.g., the same movie, CD title, or book title) is quite uncommon for experiential goods. This is unlike the purchase behavior of other types of products such as groceries (where repeated purchase of the same product item is the norm) or physician adoption of new drugs (where physicians learn from both sampling and repeated prescription). So even though the consumer might have consumed similar products (say, products belonging to the same genre) in the past, there is considerable uncertainty about the quality of the specific product item (henceforth referred to as "item"). This uncertainty is associated with the likelihood that the specific item (say, the movie Couples Retreat) matches a consumer's preference. To cope with such uncertainty, consumers may learn based on their past usage experience with this type of product (i.e., experience with comedy movies they have viewed in the past) and other consumers' usage experiences with the focal item through consumer reviews. They then update their belief of the subjective product quality (likelihood of matching) in a Bayesian fashion. When determining whether to purchase a book title, the consumer may integrate multiple sources of information. The consumer might learn from (a) his own past experience with other titles belonging to the same genre of books as the title under consideration and (b) information about other consumers' postconsumption evaluations of that title. A novel feature of this research is that our model captures both types of learning: from one's own experience with the category and from information about others' experiences with the specific product.

    Although the effect of product reviews on aggregate sales is well studied, it remains unclear if all reviews of a product have the same effect on the choice of a consumer. Conversely, there is little understanding if all consumers are affected by a set of reviews to the same extent. In addition to modeling consumer learning from one's own experience and information about others' experiences, we also model the credibility of information obtained from others. The credibility of the source of communication is directly related to the persuasiveness of communication (Sternthal et al. 1978). As such, more credible reviews for a product are likely to have a greater effect on the consumer's propensity to buy that product. It is well established that the credibility of a source of communication is greater if he or she is more similar to the recipient (Brock 1965). Accordingly, we model the review credibility as the precision with which product reviews reflect the consumer's own product evaluation. The higher the precision, the more credible the information obtained from product reviews for the consumer, and the larger the effect of reviews on the consumer's choice probabilities. We extend the Bayesian learning framework to model consumer learning on both product quality and review credibility. Our model is general enough to allow for the possibility that some consumers might find the same set of reviews to be more credible than other consumers. Finally, we allow the credibility of all reviews of a product to vary with time.

    We apply the proposed model to a panel data set comprising 1,919 book purchases by 243 consumers at a large product reviews website. Our empirical analysis leads to three unique findings. First, we find evidence of stronger learning from product reviews than learning from own experience. Consumers learn less from their own experience with books similar to the one they are considering for purchase (i.e., books of the same genre) than they do from others' experiences with the focal product. Second, we find evidence that, based on the similarity between reviews for a product and their own evaluation of that product, consumers update their beliefs about the credibility of product reviews over time. Third, ignoring the learning process from others lowers the model performance and leads to biases in model estimates. Such insights cannot be obtained from reduced-form aggregate models.

    Next we discuss the managerial implications of this research. Based on the parameter estimates of our structural model, and a set of general assumptions, we estimate the profit impact of online product reviews and how this profit varies with the number of reviews posted for a product. Consider a firm that manages word-of-mouth activity (Godes and Mayzlin 2009) by incentivizing consumers to post online product reviews. Such a firm might be interested in understanding the profit impact of providing incentives for more reviews. In this context, our policy simulations provide three unique insights. First, there are diminishing returns to increasing the number of reviews. So the first review of a product has a greater profit impact than the tenth review. Second, although increasing the number of reviews always leads to greater market share, it might lead to lower profits. This happens when the marketing costs associated with eliciting more reviews are not commensurate with gross margins from product sales. Third, there is an optimum number of product reviews that the firm should spend on to maximize profits. We estimate this optimum number based on our model estimates.

    In a second counterfactual experiment, we study the issue of fake product reviews. Although consumers have been exposed to product reviews for several years now, practitioners are increasingly concerned about the phenomenon wherein firms incentivize people to post fake positive reviews about the products they market. Though academics have studied this phenomenon (Mayzlin 2006, Dellarocas 2006), a deep understanding of this important and topical issue is not possible with a reduced-form aggregate-level approach. Our model provides a tool to better understand this phenomenon. We empirically demonstrate that fake reviews increase consumer uncertainty. The effects of more positive reviews and more numerous reviews on consumer choice are lower on online retailing platforms that have fake product reviews.

    The remainder of this paper is organized as follows. In §2, we briefly review the relevant literature and discuss how our study extends this work. In §3, we present the data. In §4, we develop a structural model that captures consumer learning about the subjective product quality, based on both one's own and other consumers' usage experiences. We also describe the process of how the credibility of reviews evolves over time. In §5, we apply the model to a consumer choice decision on book purchases, and we discuss the empirical results and implications of this application. Section 6 concludes the paper with implications for future research.

    2. Literature Review

    We first briefly review two streams of literature, and then we discuss how this research contributes to those literatures.

    The first stream of literature is related to the effect of online product reviews on sales of experiential products. There are mixed findings on the relationship between how positively a product is reviewed and the sales of that product. Several studies have empirically shown that positive reviews are associated with higher sales, whereas negative reviews tend to hurt sales of experiential products such as books and movies (Dellarocas et al. 2007, Chevalier and Mayzlin 2006). Several other studies did not find any statistically significant relationship (Duan et al. 2008, Liu 2006). Another finding is that buyers seem to find movies and books that have generated numerous reviews more interesting, which in turn drives greater demand, than those movies and books that have not received as many reviews (Liu 2006).¹

    The second stream of literature is related to Bayesian learning (Meyer and Sathi 1985, Roberts and Urban 1988, Luan and Neslin 2009). This process assumes that consumers are uncertain about product quality and update their quality expectation based on past experience and other factors such as marketing communication. In marketing, Erdem and Keane (1996) is a pioneering paper that studies the effect of learning from advertising. Erdem (1998) models consumer cross-category learning; i.e., consumer usage experience of one product can influence consumer quality expectation of another product made by the same company under the same brand. Mehta et al. (2003) apply the notion of consumer learning in studying consideration set formation under price uncertainty and consumer search. Erdem et al. (2004) study consumer learning of store brand quality across countries. Narayanan et al. (2005) and Narayanan and Manchanda (2009) model physician learning of the quality of new drugs from marketing instruments. Iyengar et al. (2007) model consumer learning of both service quality and usage. Zhao et al. (2011) model consumer learning in a product crisis situation.

    We propose a novel modeling approach to study the impact of online product reviews on consumer choice. Our approach makes two contributions. Unlike previous studies that have examined the relationship between product reviews and consumer choice at an aggregate reduced-form level, we propose a structural individual-level choice model. We extend the Bayesian learning framework to incorporate product reviews as a source of information that influences individual-level consumer choice. Although aggregate-level descriptive analysis is convenient for establishing a general phenomenon (i.e., a product review has a significant impact on product sales), it is perhaps not appropriate for studying the microlevel mechanism under which such a phenomenon occurs. This is where the disaggregate-level structural analysis offers several advantages. From a theoretical standpoint, aggregate models are inadequate to understand the relative importance of a consumer's learning from product reviews (i.e., learning from indirect experience) vis-à-vis the consumer's learning from his or her own experience with the product (learning from direct experience). Our modeling approach yields the novel insight that consumers learn more from the indirect experience of product reviews about a book than from their own direct experience of a product category. From a methodological perspective, our model predicts consumer choice better than reduced-form models estimated with the same data. From a managerial perspective, we demonstrate how the model can be used to assess the economic effect of the much-discussed phenomenon of fake product reviews and how they might affect consumer choice. Such analysis is not possible with aggregate-level reduced-form models.

    ¹ Recent work has also studied dynamics in online product reviews (Moe and Trusov 2011, Moe and Schweidel 2011). Li and Hitt (2008) and Godes and Silva (2011) find evidence of a declining trend in posted reviews as the volume of reviews increases with time. However, these papers do not study the effect of product reviews on choice, which is our main focus.

    Our second contribution pertains to the study of the credibility of product-related communication. Although credibility has been extensively studied in the consumer behavior literature, and its importance in determining the persuasiveness of communication is well established (Kelman 1961, Chaiken 1980), it remains unstudied in the choice modeling literature. Modeling credibility is challenging because although product reviews and consumer choices are observed by the researcher, the credibility of reviews is unobserved. As mentioned earlier, we model credibility as the precision with which product reviews reflect the consumer's own product evaluation. The Bayesian learning literature assumes that the precision level of any source of product information (such as advertising) is constant. However, the perceived precision of information from product reviews is likely to vary over time, as the consumer accumulates more information from her own experience to validate product reviews. We contribute to the Bayesian learning literature not just by incorporating product reviews as an additional information source but also by modeling the credibility of product reviews, how it varies over time, and how it affects individual-level consumer choice. With the exception of Zhang (2010), the Bayesian learning literature has focused on learning from consumers' own experiences and marketing instruments, ignoring the process of learning from others.

    Zhang (2010) empirically models observational learning (learning based on actions taken by others) using a Bayesian updating framework. She studies the U.S. kidney market, where patients on a waiting list sequentially decide whether to accept a kidney offer. Patients draw negative quality inferences based on earlier refusals in the queue. Our study is different from this work as follows. We model learning from information sharing (learning from information shared by others). Indeed, observational learning and information sharing are distinct forms of social learning. In addition, we simultaneously model and compare learning from one's own and others' experience. Own experience with a product is not relevant in the specific institutional setting of Zhang (2010), but it is likely to occur for most choice decisions because repeat purchase behavior is quite common in most product categories, including durables. Finally, we model the consumer's perception of the credibility of product reviews and how this credibility evolves over time.

    3. The Data

    The data for this study were collected from a U.S. company. The full sample includes consumers who participate in a marketing program to evaluate each of their purchases in the book category. In other words, the data provide information on which book was bought by a consumer at a purchase occasion and the consumer's reviews on each of her purchases. The company regularly emails participants, urging them to post reviews, and it runs other marketing programs to ensure that all purchases are reviewed. However, like other marketing data (such as scanner panel data), there could be some missing observations, which we are unable to detect or trace.

    A product review typically consists of the following: a review title, a review body, pros and cons of the product, and a rating of the product on a five-point scale. The five-point scale reads as follows: 1 = avoid it, 2 = below average, 3 = average, 4 = above average, and 5 = excellent. In this research, we focus on studying learning behavior in the purchases of books. Books are experience goods and are frequently reviewed by consumers. Also, the prevalence of advertising is expected to be much lower for books than for other frequently reviewed experience goods such as movies, so it is quite unlikely that consumers in our data learn about product quality from advertising. Finally, it has been established that at an aggregate level, book reviews affect book sales (Chevalier and Mayzlin 2006).

    Next we discuss our sampling plan. Out of all books purchased at least once in the 30-month period starting July 1999, we randomly selected 1,000 books. We then classified each book into one of the following categories: romance ("romance"), science fiction/fantasy ("science fiction"), mystery and crime ("mystery"), and horror and thriller ("horror"). For the purpose of model calibration, we followed the practice in the learning literature of sampling heavy buyers (Erdem and Keane 1996) and restricted ourselves to those consumers who made at least four purchases from these 1,000 books in the data period. For consumers with fewer purchases, it is rather difficult to identify the key pattern from noise. We model the purchase of the top 150 books (by the number of purchase observations) and classify the remaining 850 books into four category-specific "other goods." This leads to a data set of 1,919 purchases made by 243 consumers, including 798 purchases of the other goods. For the 1,919 purchase observations, the distribution of consumer ratings is as follows: 969 purchases were rated 5, 524 were rated 4, 257 were rated 3, 111 were rated 2, and 58 were rated 1. As is typical of online product reviews, a majority of product evaluations are quite positive. Next, we sampled all the product reviews posted on the company's website in the data period for all these books and the date when each review was posted. We finally collected data on the month and year in which each book was released, as well as its price. Table 1 reports the summary statistics.
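    As a quick arithmetic check on the rating distribution reported above, the following Python sketch tallies the five rating counts, confirms that they sum to the 1,919 purchase observations, and derives the overall mean rating and the share of purchases rated 4 or 5. The variable names are illustrative and do not come from the original study.

```python
# Rating counts as reported in the text (rating -> number of purchases).
rating_counts = {5: 969, 4: 524, 3: 257, 2: 111, 1: 58}

# Total number of rated purchases; should match the 1,919 observations.
total = sum(rating_counts.values())

# Count-weighted mean rating across all purchases.
mean_rating = sum(r * n for r, n in rating_counts.items()) / total

# Share of purchases rated 4 ("above average") or 5 ("excellent").
share_high = (rating_counts[5] + rating_counts[4]) / total

print(total)                  # 1919
print(round(mean_rating, 2))  # 4.16
print(round(share_high, 3))   # 0.778
```

    Roughly 78% of the rated purchases fall in the top two categories, which is consistent with the statement that a majority of product evaluations are quite positive.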

    Among the four genres of books, science fiction has the largest share, followed by the genres mystery, romance, and horror. The science fiction genre has the highest average consumer rating, followed by the genres romance, mystery, and horror. Book rating data are important in our study for both substantive and methodological reasons. First, book ratings measure consumer satisfaction or experience of a specific book. Following Chintagunta et al. (2009), we employ these data in our learning model as experience signals that are used to update consumers' quality beliefs. Because experience signals are observed in our case, they do not need to be simulated as in typical empirical applications where experience signals are often unobserved. This significantly reduces the computational burden of model estimation. Second, the review information allows us to observe the similarity between the consumer's rating of a product and reviewers' ratings. Such similarity enables us to empirically study consumer learning of the credibility of reviews. In our data, the mean of the absolute difference between the consumers' ratings and the mean of reviewers' ratings on commonly reviewed books is 0.09 (statistically insignificant), and the variance is 1.21.

    Table 1    Summary Statistics of the Data

    Variables                                         Romance   Science fiction   Mystery    Horror
    Choice share                                         0.13              0.53      0.23      0.11
    Buyers' ratings of purchased book
      Mean                                               4.11              4.30      4.02      3.98
      SD                                                 1.08              0.95      1.12      1.16
    Mean of reviewers' ratings of purchased books
      Mean                                               4.18              4.40      4.07      4.07
      SD                                                 0.33              0.42      0.60      0.49
    Number of reviews
      Mean                                              16.41             18.07     17.28     17.35
      SD                                                21.56             21.65     24.91     17.31
    Time since publication (in days)
      Mean                                             732.99          1,248.41    514.52    758.00
      SD                                               686.97          1,154.45    611.64    612.43
    Price (in $)
      Mean                                              12.33             20.85     17.03     15.46
      SD                                                 4.30             15.47      8.33      7.82

    4. The Model

    In this section, we develop a structural model that captures consumer learning about the subjective product quality and reviewer credibility, based on both their own and others' usage experiences. The general modeling context is that a consumer makes choice decisions about which item (such as a book title) to purchase. These items can be grouped into different categories (such as genres). The consumer is facing uncertainty on true product quality and the review credibility. She updates her belief on review credibility using her own experience and the reviews. She then integrates her own experience with reviews to form an expectation on the true product quality. Such expectation will then drive her choice. We describe the model in the following order: (i) modeling consumer updating of the expected quality of a category, (ii) modeling consumer updating of the perceived credibility of reviews, (iii) modeling consumer integration of usage information from the self and others, and finally, (iv) modeling consumer choice decisions.

    4.1. Modeling Consumer Updating of Expected Quality of a Category

    First, we model how consumer $i$ updates her belief of the quality of category $j$, based on items in this category that she buys and consumes. Quality in learning models refers to only that product attribute about which consumers are uncertain (Erdem 1998) and which is not perfectly observable. We assume that consumer $i$ holds prior beliefs about the quality of each category at each time period, which are updated when the consumer consumes an item from that category, to form posterior quality beliefs. Consider a consumer at time $t-1$ who has a prior belief of the true category quality $A_{ij}$ given her information set $I_{i,t-2}$, which is distributed as follows:

    \[ A_{ij} \mid I_{i,t-2} \sim N\big(E_{i,t-2}(A_{ij}),\ \sigma^2_{v_{ij,t-2}}\big), \tag{1} \]

    where $E_{i,t-2}(A_{ij})$ is the mean and $\sigma^2_{v_{ij,t-2}}$ captures the uncertainty of consumer $i$'s belief about category $j$ at time $t-2$. The subscript $v$ is a notation to differentiate the perception variance from other variance parameters in the model. For example, a consumer might have a generally positive quality perception of horror books but might be highly uncertain of this perception. As we discuss later, this uncertainty reduces over time as the consumer consumes items in the category $j$. So the more horror books the consumer reads, the smaller this uncertainty becomes for the horror category. Next we assume that the consumer purchases item $k_j$ at time $t-1$. Her evaluation (or experience signal) of this item after consumption is $A^{E}_{ik_j,t-1}$. Consistent with the literature, we assume that this evaluation is normally distributed around the consumer's belief of the quality of category $j$ ($A_{ij}$) as follows (Erdem and Keane 1996):

    \[ A^{E}_{ik_j,t-1} \sim N(A_{ij},\ \sigma^2). \tag{2a} \]

    This is reasonable, because the consumer's utility for an item (and hence her choice of that item) is affected not just by her quality beliefs but also by her preferences for attributes for which there is no uncertainty (such as price). So consumers could purchase books of lower quality than the category mean. The variance $\sigma^2$ is referred to as experience variability. Unlike typical learning models where consumer experience signals are unobserved, we observe evaluations of books purchased by the consumer (measured by a five-point discrete scale rating). We assume the following relationship between the latent experience signal $A^{E}_{ik_j,t-1}$ and the observed evaluation/rating of the purchased book $R^{E}_{ik_j,t-1}$:

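    \[
    A^{E}_{ik_j,t-1} \in
    \begin{cases}
    (-\infty,\ 1.5] & \text{if } R^{E}_{ik_j,t-1} = 1, \\
    (1.5,\ 2.5] & \text{if } R^{E}_{ik_j,t-1} = 2, \\
    (2.5,\ 3.5] & \text{if } R^{E}_{ik_j,t-1} = 3, \\
    (3.5,\ 4.5] & \text{if } R^{E}_{ik_j,t-1} = 4, \\
    (4.5,\ +\infty) & \text{if } R^{E}_{ik_j,t-1} = 5.
    \end{cases}
    \tag{2b}
    \]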
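    The mapping in Equation (2b) between the observed five-point rating and the latent experience signal can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and treating each interval as open on the left and closed on the right is an assumption of the sketch rather than something the text states.

```python
import math

# Cut points between adjacent rating categories, read from Equation (2b).
CUTS = [1.5, 2.5, 3.5, 4.5]


def signal_interval(rating):
    """Return the (lower, upper) bounds of the latent signal for a 1-5 rating."""
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError("rating must be an integer from 1 to 5")
    bounds = [-math.inf] + CUTS + [math.inf]
    return bounds[rating - 1], bounds[rating]


def rating_from_signal(signal):
    """Map a latent signal to the rating whose interval contains it.

    Intervals are treated as (lower, upper]; this closure is an assumption.
    """
    # Each cut point strictly exceeded moves the rating up by one category.
    return 1 + sum(signal > cut for cut in CUTS)


print(signal_interval(3))       # (2.5, 3.5)
print(rating_from_signal(4.6))  # 5
```

    Because the ratings are observed, estimation only needs the interval that contains the latent signal, which is what `signal_interval` returns.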

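    The consumer uses her experience with item $k_j$ to update her quality belief about category $j$. Specifically, following Bayes' rule, her posterior belief about the category quality after consuming product item $k_j$ in this category at time $t-1$ is as follows:

    \[ A_{ij} \mid I_{i,t-1} \sim N\big(E_{i,t-1}(A_{ij}),\ \sigma^2_{v_{ij,t-1}}\big), \tag{3} \]

    where

    \[ E_{i,t-1}(A_{ij}) = E_{i,t-2}(A_{ij}) + D_{ik_j,t-1}\, B_{ij,t-1} \big[ A^{E}_{ik_j,t-1} - E_{i,t-2}(A_{ij}) \big], \tag{4a} \]

    \[ \frac{1}{\sigma^2_{v_{ij,t-1}}} = \frac{D_{ik_j,t-1}}{\sigma^2} + \frac{1}{\sigma^2_{v_{ij,t-2}}}, \tag{4b} \]

    \[ B_{ij,t-1} = \frac{\sigma^2_{v_{ij,t-2}}}{\sigma^2_{v_{ij,t-2}} + \sigma^2}. \tag{4c} \]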

    $D_{ik_j,t-1}$ is a dummy variable indicating whether the item $k_j$ is consumed by consumer $i$ at time $t-1$. Note that if the experience signal $A^{E}_{ik_j,t-1}$ resulting from the consumption of item $k_j$ is the same as the expected value of category quality in time $t-2$, then this expected value remains unchanged in time $t-1$. Also, $B_{ij,t-1}$, which denotes the relative weight of the experience signal, is greater for greater values of $\sigma^2_{v_{ij,t-2}}$. In other words, the more uncertain a consumer is about her prior quality belief about the category, the more she learns from consuming an item in that category.
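    The updating rule in Equations (4a) through (4c) is the standard normal-normal Bayesian update, and a small numerical sketch may help make it concrete. The Python function below is a minimal illustration under assumed values: the function name, argument names, and the prior and signal values in the usage loop are all hypothetical and are not estimates from the paper.

```python
def update_category_belief(prior_mean, prior_var, signal, experience_var, consumed=True):
    """One step of the category-quality update in Equations (4a)-(4c).

    prior_mean, prior_var: mean and perception variance of the prior belief.
    signal: the experience signal from the consumed item.
    experience_var: the experience variability (variance of the signal).
    consumed: whether an item in the category was consumed (the dummy D).
    """
    d = 1.0 if consumed else 0.0
    # (4c): relative weight of the experience signal.
    weight = prior_var / (prior_var + experience_var)
    # (4a): move the mean toward the signal in proportion to the weight.
    post_mean = prior_mean + d * weight * (signal - prior_mean)
    # (4b): precisions add, so the perception variance can only shrink.
    post_var = 1.0 / (d / experience_var + 1.0 / prior_var)
    return post_mean, post_var


# Hypothetical sequence of three consumption experiences in one category.
mean, var = 4.0, 1.0
for signal in [5.0, 3.0, 4.5]:
    mean, var = update_category_belief(mean, var, signal, experience_var=1.0)
    print(round(mean, 3), round(var, 3))
```

    In this illustration the perception variance falls from 1.0 to 0.5, then 0.333, then 0.25, so each additional experience receives a smaller weight than the one before. When `consumed` is false, the function returns the prior unchanged, matching the role of the dummy variable.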

    We allow the true category qualities to be different across consumers. Specifically, we allow $A_{ij} \sim N(A_j, \sigma_A^2)$. We assume the initial prior belief of the true category quality to be normally distributed; i.e., $A_{ij} \mid I_{i1} \sim N\big(E_{i0}(A_{ij}),\ \sigma^2_{v_{ij,0}}\big)$. By using the assumption of rational expectation, we have $E_{i0}(A_{ij}) = A_j$ and $\sigma^2_{v_{ij,0}} = \sigma_A^2$. In other words, we assume that whereas consumer $i$ does not know the value of her true quality $A_{ij}$ in the first period, she knows that it is normally distributed across the consumer population as $A_{ij} \sim N(A_j, \sigma_A^2)$. So she rationally forms her initial prior beliefs to be the same as the population-level distribution.

    4.2. Modeling Updating of Perceived Credibility of Average Review by Consumers

    In this section, we present a model for the credibility of information that the consumer receives from product reviews. For any product, the consumer is exposed to product reviews that provide information about the quality of the product. The credibility of this information for the consumer is unobserved to the researcher. We model the review credibility as the precision with which product reviews reflect the consumer's own product evaluation. The higher the precision, the more credible is the information obtained from product reviews for the consumer.

    Specifically, let $M_{Rk_j}$ be the population mean of the evaluations of all reviewers for the item $k_j$. $M_{Rk_j}$ deviates nonsystematically from consumer $i$'s evaluation of item $k_j$, with the deviation following a normal distribution:

    \[ M_{Rk_j} - A^{E}_{ik_j,t-1} \sim N(0,\ \sigma_R^2). \tag{5} \]

    The consumer cannot observe the product evaluations of all reviewers but is exposed to reviews posted at or before time $t-1$. Let $R_{i'k_j,t-1}$ be the numerical evaluation of a specific review of product $k_j$ posted by reviewer $i'$ at or prior to time $t-1$. Furthermore, $\bar{R}_{k_j,t-1}$ and $n_{k_j,t-1}$ respectively denote the sample mean and number of reviews of product $k_j$ at or prior to time $t-1$. We assume each review to be an unbiased signal of the population distribution of evaluations:

    \[ R_{i'k_j,t-1} - M_{Rk_j} \sim N(0,\ \tilde{\sigma}_R^2), \tag{6} \]

    where $\tilde{\sigma}_R^2$ is the variance of the deviation between the evaluation of a specific review that consumer $i$ is exposed to and the population mean. When this variance is large, it suggests that the review is not a good representation of the population mean. Because an individual can be both a consumer and a reviewer, we assume $\tilde{\sigma}_R^2 = \sigma_R^2$ to be consistent.²

    Based on Equations (5) and (6), the distribution of the difference between the evaluation of consumer $i$ and the average of the evaluations of all reviews that a consumer is exposed to is the following:

    $$\bar{R}_{k_j,t-1} - AE_{ik_j,t-1} \sim N\left(0, \left(1 + \frac{1}{n_{k_j,t-1}}\right)\sigma_R^2\right) \tag{7}$$

    We earlier defined the credibility of reviews as the precision with which product reviews reflect the consumer's own product evaluation. It follows from Equation (7) that the variance parameter $\sigma_R^2$ measures the degree to which reviews deviate from a consumer's own experience with the same book. Consequently, $1/\sigma_R^2$ measures the precision with which the reviewers' mean evaluation captures consumer $i$'s own product evaluation, reflecting the similarity in taste between the reviewers and the consumer. The more the reviewers' mean evaluation deviates from consumer $i$'s evaluation, the less credible reviews are to consumer $i$. Our modeling framework is general enough to allow for the perceived review credibility to be different across consumers and over time.
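The variance in Equation (7) can be checked numerically. The following is a minimal sketch, assuming the two normal deviations in Equations (5) and (6) are independent and share one standard deviation; the function names and the seed are illustrative and do not come from the paper.

```python
import random
import statistics


def simulate_gap_variance(sigma_r, n_reviews, n_draws=20000, seed=0):
    """Monte Carlo variance of (mean review - consumer's own evaluation)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_draws):
        own_eval = 0.0  # the level of the consumer's evaluation does not affect the gap
        # Eq. (5): population mean of reviewers deviates from the consumer's evaluation
        pop_mean = own_eval + rng.gauss(0.0, sigma_r)
        # Eq. (6): each review is an unbiased signal of the population mean
        reviews = [pop_mean + rng.gauss(0.0, sigma_r) for _ in range(n_reviews)]
        gaps.append(statistics.fmean(reviews) - own_eval)
    return statistics.pvariance(gaps)


def theoretical_gap_variance(sigma_r, n_reviews):
    """Variance implied by Eq. (7): (1 + 1/n) * sigma_R^2."""
    return (1.0 + 1.0 / n_reviews) * sigma_r ** 2
```

With more reviews the second term shrinks, so the gap variance approaches $\sigma_R^2$: even a very large sample of reviews cannot remove the taste mismatch between reviewers and the consumer.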

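    Next we describe how we model the evolution of the credibility of reviews for consumer $i$ over time. We assume that consumer $i$ has a prior belief of $\sigma_R^2$ at time $t-1$, which is gamma distributed:

    $$\frac{1}{\sigma_R^2} \,\Big|\, I_{i,t-2} \sim \Gamma(\alpha_{i,t-2}, \beta_{i,t-2}) \tag{8}$$

    Here $\alpha_{i,t-2}$ is the shape parameter, and $\beta_{i,t-2}$ is the inverse scale parameter. Both take positive values only. After the consumer receives $n_{k_j,t-1}$ evaluations for item $k_j$ at time $t-1$ (with the mean evaluation signal being $\bar{R}_{k_j,t-1}$) and consumes the item $k_j$ to form her own evaluation $AE_{ik_j,t-1}$, her posterior distribution of the credibility of reviews (based on Bayes' rule) is

    $$\frac{1}{\sigma_R^2} \,\Big|\, I_{i,t-1} \sim \Gamma(\alpha_{i,t-1}, \beta_{i,t-1}) \tag{9}$$

    where

    $$\alpha_{i,t-1} = \alpha_{i,t-2} + \frac{D_{ik_j,t-1}}{2} \quad \text{and} \tag{10a}$$

    $$\beta_{i,t-1} = \beta_{i,t-2} + \frac{D_{ik_j,t-1}\left(\bar{R}_{k_j,t-1} - AE_{ik_j,t-1}\right)^2}{2\left(1 + 1/n_{k_j,t-1}\right)} \tag{10b}$$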

    ² We thank an anonymous reviewer for suggesting this specification.

    The extent of updating of the inverse scale parameter of consumer $i$ is proportional to the squared deviation between the reviewers' mean evaluation and consumer $i$'s own evaluation of the item. Also, the greater the number of reviews that consumer $i$ is exposed to, the greater the level of updating will be.
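The updating rule in Equations (10a) and (10b) can be sketched in a few lines. This is an illustrative sketch only: the function name, argument names, and the boolean purchase flag (standing in for the purchase indicator $D$) are assumptions, not the authors' code.

```python
def update_credibility(alpha, beta, mean_review, own_eval, n_reviews, purchased=True):
    """One-step gamma update of the belief about review precision.

    alpha, beta: shape and inverse scale before the update.
    mean_review: mean rating of the reviews the consumer saw.
    own_eval:    the consumer's own rating after consumption.
    n_reviews:   number of reviews behind mean_review (must be >= 1).
    purchased:   purchase indicator; beliefs change only after a purchase.
    """
    d = 1.0 if purchased else 0.0
    squared_gap = (mean_review - own_eval) ** 2
    alpha_new = alpha + d / 2.0                                            # Eq. (10a)
    beta_new = beta + d * squared_gap / (2.0 * (1.0 + 1.0 / n_reviews))    # Eq. (10b)
    return alpha_new, beta_new
```

A larger gap between reviews and own experience raises the inverse scale parameter more, which lowers the implied precision of reviews; a larger number of reviews makes the same gap more informative.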

    4.3. Modeling Integration of the Consumer's Own Experience and Reviews

    So far, we have discussed how consumers update their perception of the quality of a category when they consume a product from that category. We have also presented a general model of the credibility of product reviews for the consumer. Next, we describe how the consumer uses the two sources of information (own experience and product reviews) to choose an item to buy. Suppose the consumer is making a purchase decision on an item $k_j$ at time $t$. One source of relevant information is her belief of the quality of category $j$ at time $t-1$. From Equation (2a), we know that the evaluation of item $k_j$ for consumer $i$ is normally distributed around the category quality belief $A_{ij}$ as follows:

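    $$AE_{ik_jt} \sim N(A_{ij}, \sigma_\delta^2) \tag{11a}$$

    where $\sigma_\delta^2$ denotes the experience variability. We also know from Equation (1) that the prior category quality belief at time $t-1$ is distributed as follows:

    $$A_{ij} \mid I_{i,t-1} \sim N\left(E_{i,t-1}(A_{ij}), \sigma^2_{v_{ij,t-1}}\right) \tag{11b}$$

    Equations (11a) and (11b) together imply that

    $$AE_{ik_jt} \sim N\left(E_{i,t-1}(A_{ij}), \sigma_\delta^2 + \sigma^2_{v_{ij,t-1}}\right) \tag{11c}$$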

    The other source of information is the evaluation of item $k_j$ by reviewers. We assume that consumer $i$ forms an expectation of the credibility of reviews (captured by the variance parameter $\sigma_R^2$). We also know from Equation (8) that $1/\sigma_R^2 \sim \Gamma(\alpha_{i,t-1}, \beta_{i,t-1})$. Based on this, consumer $i$ forms the following expectation of credibility at time $t-1$:

    $$E_{i,t-1}(\sigma_R^2) = \frac{\beta_{i,t-1}}{\alpha_{i,t-1} - 1} \tag{12a}$$

    This expectation combined with Equation (7) clearly implies that the mean evaluation of item $k_j$ by reviewers is distributed as follows:

    $$\bar{R}_{k_jt} \sim N\left(AE_{ik_jt}, \left(1 + \frac{1}{n_{k_j,t-1}}\right)\frac{\beta_{i,t-1}}{\alpha_{i,t-1} - 1}\right) \tag{12b}$$

    Then based on the Bayes rule, Equations (11c) and (12b), the expected quality of item $k_j$ for consumer $i$ at time $t$ is normally distributed with the following mean and variance:

    $$E_{it}(AE_{ik_jt}) = W_{ik_j,t-1}\,E_{i,t-1}(A_{ij}) + \left(1 - W_{ik_j,t-1}\right)\bar{R}_{k_jt} \tag{13a}$$

    $$\mathrm{Var}_{it}(AE_{ik_jt}) = \frac{1}{\dfrac{1}{\sigma_\delta^2 + \sigma^2_{v_{ij,t-1}}} + \dfrac{(\alpha_{i,t-1} - 1)/\beta_{i,t-1}}{1 + 1/n_{k_j,t-1}}} \tag{13b}$$

    $$W_{ik_j,t-1} = \frac{\dfrac{1}{\sigma_\delta^2 + \sigma^2_{v_{ij,t-1}}}}{\dfrac{1}{\sigma_\delta^2 + \sigma^2_{v_{ij,t-1}}} + \dfrac{(\alpha_{i,t-1} - 1)/\beta_{i,t-1}}{1 + 1/n_{k_j,t-1}}} \tag{13c}$$

    The expected quality of the item depends on the consumer's perception of the expected quality of the category to which the item belongs and the reviews posted for that item. The mean and variance of the distribution of expected item quality both affect the consumer's expected utility for the item. She then decides on which item to buy based on her expected utilities across items in various categories.
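The integration step in Equations (12a) and (13a) to (13c) is a precision-weighted average of the two information sources. A minimal sketch follows, with hypothetical argument names; it assumes the shape parameter exceeds 1 so that the expectation in Equation (12a) exists.

```python
def integrate_beliefs(prior_mean, experience_var, belief_var,
                      mean_review, n_reviews, alpha, beta):
    """Combine own-experience beliefs with reviews for one item.

    prior_mean:     expected category quality from own experience.
    experience_var: experience variability (variance of the own-experience signal).
    belief_var:     current variance of the category quality belief.
    mean_review:    mean rating of the item's reviews.
    n_reviews:      number of reviews (>= 1).
    alpha, beta:    gamma parameters of the review-precision belief (alpha > 1).
    Returns (expected item quality, its variance, weight on own experience).
    """
    own_precision = 1.0 / (experience_var + belief_var)
    expected_review_var = beta / (alpha - 1.0)                      # Eq. (12a)
    review_precision = 1.0 / ((1.0 + 1.0 / n_reviews) * expected_review_var)
    weight = own_precision / (own_precision + review_precision)     # Eq. (13c)
    mean = weight * prior_mean + (1.0 - weight) * mean_review       # Eq. (13a)
    variance = 1.0 / (own_precision + review_precision)             # Eq. (13b)
    return mean, variance, weight
```

When reviews are judged more credible (a smaller expected review variance) or more numerous, the weight on own experience falls and the posterior variance shrinks.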

    We now present the specification for initial parameters $(\alpha_{i0}, \beta_{i0})$ related to consumer learning from reviews. Following Equation (12a), we have $\beta_{i0} = (\alpha_{i0} - 1)E_{i0}(\sigma_R^2)$; $\alpha_{i0}$ indicates the magnitude of richness of consumer $i$'s initial experience. Intuitively, an inexperienced consumer should have lower $\alpha_{i0}$ compared with an experienced customer. $E_{i0}(\sigma_R^2)$ represents consumer $i$'s initial uncertainty of reviews. To obtain stable and theoretically meaningful estimates, we estimate $(\alpha_{i0}, E_{i0}(\sigma_R^2))$ instead of $(\alpha_{i0}, \beta_{i0})$. Moreover, since $\alpha_{i0}$ and $E_{i0}(\sigma_R^2)$ are both positive, we assume $\log(\alpha_{i0}) \sim N(M_{\alpha 0}, V_{\alpha 0}^2)$ and $\log(E_{i0}(\sigma_R^2)) \sim N(M_{R0}, V_{R0}^2)$. We allow for heterogeneity across consumers in their initial credibility of reviews. Next we present our model for the consumer's decision-making process of what item to purchase.

    4.4. Modeling Consumer Choice Decisions

    We discuss our utility-maximization-based approach of modeling how a consumer chooses to buy an item in the presence of information from her own experience with the category the item belongs to, and information from product reviews. Following Chintagunta et al. (2009), we assume that consumer $i$'s utility for item $k_j$ can be written as³

    $$V_{ik_jt} = -\exp\left(-r_i\left(\omega_i AE_{ik_jt} + X_{ik_jt}\gamma_i + \varepsilon_{ik_jt}\right)\right) \tag{14}$$

    where $r_i$ is the risk aversion parameter and $\omega_i$ is the quality weight (both are greater than 0), $X_{ik_jt}$ is a vector of covariates such as price, and $\gamma_i$ is the vector of coefficients of $X_{ik_jt}$; $\varepsilon_{ik_jt}$ is the extreme value error.

    ³ We also tested the utility function specification following Erdem and Keane (1996). However, the exponential specification leads to a better model performance, and hence we decided to adopt this specification.

    Because the consumer cannot observe product quality prior to purchase, we assume that she decides based on the following expected utility:

    $$E_{it}(V_{ik_jt}) = -E_{it}\left[\exp\left(-r_i\omega_i AE_{ik_jt}\right)\right]\exp\left(-r_i\left(X_{ik_jt}\gamma_i + \varepsilon_{ik_jt}\right)\right) \tag{15}$$

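    Based on the theory of the moment-generating function for the normal distribution, this expression for the expected utility simplifies to the following:

    $$\begin{aligned} E_{it}(V_{ik_jt}) &= -\exp\left(-r_i\omega_i E_{it}(AE_{ik_jt}) + \tfrac{1}{2}(r_i\omega_i)^2\,\mathrm{Var}_{it}(AE_{ik_jt})\right)\exp\left(-r_i\left(X_{ik_jt}\gamma_i + \varepsilon_{ik_jt}\right)\right) \\ &= -\exp\left(-r_i\left(\omega_i E_{it}(AE_{ik_jt}) - \tfrac{1}{2}r_i\omega_i^2\,\mathrm{Var}_{it}(AE_{ik_jt}) + X_{ik_jt}\gamma_i + \varepsilon_{ik_jt}\right)\right) \end{aligned} \tag{16}$$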

    The details of how $E_{it}(AE_{ik_jt})$ and $\mathrm{Var}_{it}(AE_{ik_jt})$ depend on the consumer's own experience with category $j$, and with reviews posted for the item $k_j$, were presented in Equation (13). The consumer maximizes the expected utility, which is equivalent to maximizing the following:

    $$U_{ik_jt} = \bar{U}_{ik_jt} + \varepsilon_{ik_jt} = \omega_i E_{it}(AE_{ik_jt}) - \tilde{r}_i\,\mathrm{Var}_{it}(AE_{ik_jt}) + X_{ik_jt}\gamma_i + \varepsilon_{ik_jt} \tag{17}$$

    where $\tilde{r}_i = \tfrac{1}{2}r_i\omega_i^2$.
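The step from the exponential utility to the mean-variance form can be verified with the normal moment-generating function. The sketch below is illustrative: the function and argument names are hypothetical, and the covariate term is collapsed into a single number.

```python
import math


def expected_exp_term(r, omega, mean_quality, var_quality):
    """E[exp(-r * omega * AE)] for AE ~ N(mean_quality, var_quality), via the normal MGF."""
    return math.exp(-r * omega * mean_quality + 0.5 * (r * omega) ** 2 * var_quality)


def deterministic_utility(r, omega, mean_quality, var_quality, covariate_effect=0.0):
    """Systematic part of the mean-variance utility.

    Equals omega * E[AE] - (r * omega^2 / 2) * Var[AE] + X * gamma,
    where covariate_effect stands in for X * gamma.
    """
    risk_coefficient = 0.5 * r * omega ** 2
    return omega * mean_quality - risk_coefficient * var_quality + covariate_effect
```

With the covariate term set to zero, `expected_exp_term(...)` equals `exp(-r * deterministic_utility(...))`, so ranking items by expected exponential utility and ranking them by the mean-variance expression give the same choice. Higher uncertainty about item quality lowers utility whenever the risk aversion parameter is positive.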

    It is important to account for unobserved heterogeneity in consumer preferences, especially in models of learning (Shin et al. 2011). Accordingly, we model unobserved heterogeneity of the model parameters across consumers as follows: $\log(\omega_i) \sim N(\bar{\omega}, \sigma_\omega^2)$, $\log(r_i) \sim N(\bar{r}, \sigma_r^2)$, $A_{ij} \sim N(A_j, \sigma_A^2)$, and $\gamma_i \sim MVN(\bar{\gamma}, \Sigma_\gamma)$, where $\Sigma_\gamma$ is a diagonal variance matrix. Finally, the utility of the other good $o_j$ in category $j$ is specified as

    $$U_{io_jt} = O_j + \varepsilon_{io_jt} \tag{18}$$

    where $O_j$ is the category-specific intercept for the other good.

    It is plausible that the utilities of items within a category are correlated because of unobserved variables. To account for this possible correlation within a category, we adopt a nested logit specification. If $\rho$ is a measurement of this correlation (within nests or categories), we can write the probability of the consumer purchasing product $k_j$, conditional on the consumer purchasing category $j$, as follows (Berry 1994, Train 2003):

    $$P(D_{ik_jt} = 1 \mid D_{ijt} = 1) = \frac{\exp\left(\bar{U}_{ik_jt}/(1 - \rho)\right)}{\sum_{k'}\exp\left(\bar{U}_{ik'_jt}/(1 - \rho)\right)} \tag{19}$$

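    Furthermore, the unconditional probability of the consumer purchasing from category $j$ is given by

    $$P(D_{ijt} = 1) = \frac{\left[\sum_{k}\exp\left(\bar{U}_{ik_jt}/(1 - \rho)\right)\right]^{1 - \rho}}{\sum_{j'}\left[\sum_{k}\exp\left(\bar{U}_{ik_{j'}t}/(1 - \rho)\right)\right]^{1 - \rho}} \tag{20}$$

    The unconditional probability of buying item $k_j$ is obtained by simply multiplying the two probabilities above. This completes the description of the model.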
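The two-stage probabilities in Equations (19) and (20) can be sketched as follows. The data layout (a dictionary of categories, each mapping item labels to deterministic utilities) and the function name are assumptions made for illustration; the sketch also ignores numerical safeguards against overflow for very large utilities.

```python
import math


def nested_logit_probabilities(utilities_by_category, rho):
    """Unconditional choice probabilities under a nested logit.

    utilities_by_category: {category: {item: deterministic utility}}.
    rho: within-category correlation parameter, with 0 <= rho < 1.
    Returns {(category, item): probability}.
    """
    scale = 1.0 - rho
    within = {}
    nest_terms = {}
    for category, items in utilities_by_category.items():
        exp_terms = {item: math.exp(u / scale) for item, u in items.items()}
        total = sum(exp_terms.values())
        # Eq. (19): probability of the item given that the category is chosen
        within[category] = {item: e / total for item, e in exp_terms.items()}
        # Numerator of Eq. (20) for this category
        nest_terms[category] = total ** scale
    denominator = sum(nest_terms.values())
    probabilities = {}
    for category, item_probs in within.items():
        p_category = nest_terms[category] / denominator   # Eq. (20)
        for item, p_item in item_probs.items():
            probabilities[(category, item)] = p_category * p_item
    return probabilities
```

Setting `rho` to 0 collapses the expression to a standard multinomial logit over all items, which is a convenient sanity check.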

    We include two covariates in the model: the price of the book and the time elapsed since publication (in days) for each book. The average price of a book in the science fiction genre is the greatest among the four genres, followed by mystery, horror, and romance. We predict that demand may be negatively correlated with the time since publication because publishers often invest more resources in marketing a book when it is launched. Among the four genres, mystery books are the latest releases, followed by books in the horror, romance, and science fiction genres. It is plausible that temporal variations in the reviews of a book might lead the online retailer and/or the publisher to change prices. However, we are unable to find any evidence of this on the website from which we obtained data.

    4.5. Model Identification, Estimation, and Comparison

    We first discuss specific features of our data that enable model identification. Our data set is different from other data used in the learning literature (e.g., scanner data and physician prescription data) in that we observe two separate pieces of information, one being the consumer's own experience signal (unbiased predictor of consumer preference), i.e., her rating of a book after consumption; and the other being reviewers' ratings of the same book. The two pieces of information allow us to separately identify consumer learning from the self and from reviewers. Also, because we observe consumers' postconsumption evaluations (i.e., book ratings), we are able to separately identify $J$ intercepts, i.e., an intercept for each genre. In a standard choice model with $J$ choice options, we can only identify $J - 1$ intercepts without any additional information. Our approach directly follows Chintagunta et al. (2009). In their paper, the authors integrate customer satisfaction data into a learning model and use satisfaction data to infer quality perceptions.

    In our empirical context, we observe the buyer's book rating, which is a good indicator of the perceived quality $AE_{ijt}$. It is well known that the variance of utility weight, experience variability, and variance of risk aversion cannot be jointly identified because of the invariance scale problem in classical learning models (Erdem 1998). The common solution for this problem is to fix the variance of utility weight to 1. However, because we have indirect information on the consumer's perceived quality in our context, we can identify these three variances without any additional constraint.

    We next explain the intuition on the identification of quality weight and risk aversion coefficients from the prior quality belief. Quality weight and risk aversion coefficients do not vary over time, and therefore their effect on choice is persistent. However, the effect of prior quality belief on choice (based on the Bayesian updating theory) is decreasing over time. In other words, the effect of the prior is more significant in the first few observations but minimal in the long term. These different patterns allow us to achieve the identification. Separate identification of these three parameters has been widely adopted in the learning literature. Because we can identify the quality weight parameter and the risk aversion parameter at the individual consumer level given the panel data, we can then identify the unobserved heterogeneity in these parameters across consumers. We ran some simple regressions and found some evidence on the effect of average review ratings and number of reviews on consumer choices.

    The rationale for identifying the risk aversion coefficient from quality weight is as follows. The risk aversion coefficient measures the consumer's sensitivity to her uncertainty on product or quality, and this uncertainty mainly depends on the number of previous purchases in the same category and the number of reviews for the product or book (please refer to Equations (4b) and (13b)). However, quality weight measures the consumer's sensitivity to her evaluation on the product quality, and this evaluation mainly depends on her ratings on her previously purchased books in the same category and reviewers' ratings on the book (please refer to Equations (4a) and (13a)). Namely, the risk aversion coefficient and quality weight measure effects on two completely different variables, so we can separately identify them.

    Finally, quality weights are separately identified from the true mean qualities in standard learning models, through the functional form of the distribution and by fixing the variance of the quality weight across consumers to be 1 (Erdem 1998). In addition, in our case, we observe consumer experience signals, i.e., the product ratings, which help us make inference on true mean quality. Thus, the true mean quality and the observed choices data jointly identify the quality weight.

    We used the simulated maximum likelihood method to estimate the model because the discrete choice probabilities needed to construct the likelihood functions involve high-order integrals over the random variables (e.g., Keane 1993, Hajivassiliou and Ruud 1994). We used the quasi-Newton method to maximize the log-likelihood function. The BHHH algorithm (Berndt et al. 1974) is employed to approximate the Hessian. To demonstrate that the proposed model is completely identified and that the estimation procedure is valid, we conducted a simulation study. We drew a set of covariates from a standard normal distribution and assigned true values for all model parameters. On estimating the model, based on the method described above, we find that we are able to statistically recover the true values for all parameters. Table 2 reports the true values and the estimates of all parameters. The 95% confidence intervals of all estimates contain the corresponding true parameter values, suggesting that our model is identified and the proposed estimation procedure is valid. We repeated the simulation exercise with several different sets of parameter values and find consistently unbiased estimates.

    Table 2 Model Simulation Results

    | Parameter | True value | Mean estimate (SE) |
    |---|---|---|
    | True mean quality | | |
    | $A_{j=1}$ (romance) | 2.0 | 1.893 (0.061) |
    | $A_{j=2}$ (science fiction) | 2.5 | 2.476 (0.044) |
    | $A_{j=3}$ (mystery) | 3.0 | 2.976 (0.034) |
    | $A_{j=4}$ (horror) | 3.5 | 3.528 (0.029) |
    | Intercept (other goods) | | |
    | $O_{j=1}$ (romance) | 3.0 | 3.068 (0.197) |
    | $O_{j=2}$ (science fiction) | 3.0 | 2.991 (0.192) |
    | $O_{j=3}$ (mystery) | 3.0 | 3.121 (0.196) |
    | $O_{j=4}$ (horror) | 3.0 | 2.980 (0.202) |
    | Utility parameters | | |
    | $\bar{\omega}$ (utility weight) | 1.0 | 0.980 (0.038) |
    | $\bar{r}$ (measure of risk aversion) | 0.5 | 0.624 (0.245) |
    | $\bar{\gamma}_1$ (coefficient of price) | 0.5 | 0.467 (0.030) |
    | $\bar{\gamma}_2$ (coefficient of time since publication) | 0.5 | 0.575 (0.047) |
    | Other parameters | | |
    | $\sigma_A$ (SD in true category quality) | 0.5 | 0.502 (0.018) |
    | $\sigma_\omega$ (SD in true utility weights) | 0.5 | 0.489 (0.027) |
    | $\sigma_r$ (SD in true risk aversion) | 0.5 | 0.354 (0.101) |
    | $\sigma_{\gamma_1}$ (SD in price coefficient) | 0.5 | 0.493 (0.030) |
    | $\sigma_{\gamma_2}$ (SD in coefficient of time since publication) | 0.5 | 0.536 (0.034) |
    | $\sigma_\delta$ (SD of experience variability) | 1.0 | 0.871 (0.093) |
    | $\mathrm{logit}(\rho)$ (correlation among category utilities) | 0.0 | 0.039 (0.109) |
    | $\log(\alpha_0)$ (credibility parameter) | 2.3 | 2.186 (0.221) |
    | $\log(\beta_0)$ (credibility parameter) | 0.7 | 0.567 (0.279) |
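The coverage statement for the simulation study can be checked row by row. The sketch below assumes a normal approximation (estimate plus or minus 1.96 standard errors) and reads the Table 2 entries with decimal points restored, e.g. a true value of 2.0 and an estimate of 1.893 (0.061) for romance; the dictionary keys are illustrative labels, not the paper's notation, and only a few rows are included.

```python
def covers_true_value(true_value, estimate, std_error, z=1.96):
    """True if the normal-approximation confidence interval contains the true value."""
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    return lower <= true_value <= upper


# (true value, mean estimate, standard error) for selected rows of Table 2,
# read with decimal points restored (an assumption about the transcript).
TABLE2_ROWS = {
    "mean_quality_romance": (2.0, 1.893, 0.061),
    "risk_aversion": (0.5, 0.624, 0.245),
    "sd_risk_aversion": (0.5, 0.354, 0.101),
    "experience_variability": (1.0, 0.871, 0.093),
}

# Maps each selected row label to whether its interval covers the true value.
coverage = {name: covers_true_value(*row) for name, row in TABLE2_ROWS.items()}
```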

    We estimated three models (the proposed model and two nested versions of the proposed model). Model 1 is the baseline model where we include the two covariates but do not incorporate learning. Model 2 incorporates learning from consumers' own experiences with each genre of books. Model 3 is our proposed model where we account for both learning from the self and learning from others. Table 3 reports the in-sample fit statistics of the three models. Models 2 and 3 both perform better than model 1. So incorporating learning from the self and from others both improves model performance. Furthermore, irrespective of the fit statistic (log likelihood, Akaike information criterion (AIC), or Bayesian information criterion (BIC)), there is a greater improvement in model performance from model 2 to model 3, than from model 1 to model 2. This suggests greater learning from product reviews than learning from own experience, which is to be expected. It may be recalled that we model consumer learning from own experience with respect to each book genre and model consumer learning from others with respect to the focal book title. Compared with the uncertainty about a genre, a consumer is likely to be much more uncertain about the match of his or her own taste with a specific book.

    Table 3 In-Sample Model Fit Comparison

    | | Learn from self | Learn from others | AIC | BIC |
    |---|---|---|---|---|
    | Model 1 | N | N | 29,332.82 | 29,427.33 |
    | Model 2 | Y | N | 29,313.44 | 29,419.07 |
    | Model 3 (proposed) | Y | Y | 29,139.42 | 29,267.29 |

    Next we compare the out-of-sample predictive ability of the three models (see Table 4). For this purpose, we draw a random holdout sample of 368 purchase observations (10% of our data), estimate the model on the remaining data, and compute fit statistics based on the holdout sample. We employ two measures: hit rate (book) is defined as the proportion of book purchases which are correctly predicted by the model, and hit rate (category) is defined as the proportion of purchases for which the model correctly predicts the category from which this book is purchased. Across both metrics, models 2 and 3 perform better than model 1. The improvement in hit rate is greater between models 3 and 2, than between models 2 and 1. So although modeling learning both from oneself and from others is important for the predictive performance of the model, gains in predictive performance are much greater from modeling learning from others. This provides converging evidence that in our data, there is greater learning from others than from the consumer's own past experience.

    Table 4 Out-of-Sample Model Fit Comparison

    | | Hit rate (book) (%) | Relative increase (book) (%) | Hit rate (category) (%) | Relative increase (category) (%) |
    |---|---|---|---|---|
    | Model 1 | 4.40 | NA | 35.04 | NA |
    | Model 2 | 4.75 | 7.95 | 35.20 | 0.46 |
    | Model 3 (proposed) | 4.97 | 12.95 | 37.04 | 5.71 |

    Notes. Note that the hit rate at the book level based on the random choice rule is 0.65% (= 1/154), and the hit rate at the category level based on the random choice rule is 25% (= 1/4). Compared to these benchmarks, all of our presented models perform reasonably well.
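The relative-increase columns and the random-choice benchmarks in Table 4 follow from simple arithmetic. The sketch below reads the hit rates as percentages with two decimals (4.40, 4.75, 4.97 at the book level; 35.04, 35.20, 37.04 at the category level), which matches the 4.97% and 37.04% quoted for the proposed model; the function names are illustrative.

```python
def relative_increase(hit_rate, baseline_hit_rate):
    """Percentage increase of a hit rate over the model 1 baseline."""
    return 100.0 * (hit_rate - baseline_hit_rate) / baseline_hit_rate


def random_choice_hit_rate(n_alternatives):
    """Hit rate (in %) of picking one of n equally likely alternatives."""
    return 100.0 / n_alternatives
```

For instance, `relative_increase(4.97, 4.40)` gives the book-level gain of model 3 over model 1, and `random_choice_hit_rate(154)` gives the book-level benchmark in the table notes.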

    To demonstrate the value of our structural approach, we compare our proposed model to a reduced-form model that incorporates the same set of information. This is an expanded version of model 1 with the following covariates: genre-specific intercept, average rating of the book prior to the purchase, number of reviews posted prior to the purchase, average rating of the books in the category the book belongs to prior to the purchase, genre-specific state dependence effect, price, and time since publication. We find that our proposed model provides higher hit rates (4.97% at the book level and 37.04% at the category level) than the reduced-form model (4.77% at the book level and 35.88% at the category level). This clearly demonstrates the importance of modeling the underlying learning mechanism to gain a deeper understanding about the impact of product reviews.

    5. Results and Managerial Implications

    5.1. Results

    Next we discuss the parameter estimates and associated insights. Table 5 reports the estimates from our proposed model (model 3) and its two nested versions (models 1 and 2). The relative magnitudes of the estimated true mean quality levels are generally consistent with the market shares of the four genres of books. The mean quality of the genre with greatest market share (science fiction) is estimated to be the greatest. Overall, the mean quality estimates for the four genres across the three models are not very different. This is because the genre-specific mean quality is inferred from the consumer book review data, which are the same across three models. One difference is that the inference on genre-specific mean quality in models 2 and 3 also utilizes consumer book choice data through the initial prior specification.

    Table 5 Parameter Estimates (Mean and Standard Error)

    | Parameter | Model 1 | Model 2 | Model 3 |
    |---|---|---|---|
    | True mean quality | | | |
    | $A_{j=1}$ (romance) | 4.248 (0.106) | 4.406 (0.131) | 4.478 (0.111) |
    | $A_{j=2}$ (science fiction) | 4.514 (0.072) | 4.845 (0.087) | 4.746 (0.064) |
    | $A_{j=3}$ (mystery) | 4.198 (0.096) | 4.359 (0.108) | 4.270 (0.093) |
    | $A_{j=4}$ (horror) | 4.207 (0.112) | 4.308 (0.134) | 4.218 (0.111) |
    | Intercept (other goods) | | | |
    | $O_{j=1}$ (romance) | 4.067 (0.363) | 1.716 (0.531) | 13.277 (0.498) |
    | $O_{j=2}$ (science fiction) | 5.009 (0.352) | 2.616 (0.526) | 12.372 (0.495) |
    | $O_{j=3}$ (mystery) | 4.099 (0.377) | 1.837 (0.521) | 13.196 (0.498) |
    | $O_{j=4}$ (horror) | 3.458 (0.369) | 1.150 (0.530) | 13.887 (0.503) |
    | Utility parameters | | | |
    | $\bar{\omega}$ (utility weight) | 0.591 (0.147) | 2.208 (0.621) | 2.199 (0.685) |
    | $\bar{r}$ (log(risk aversion)) | | 0.198 (0.313) | 2.350 (0.062) |
    | $\bar{\gamma}_1$ (coefficient of price) | 0.076 (0.900) | 0.042 (0.528) | 0.014 (0.534) |
    | $\bar{\gamma}_2$ (coefficient of time since publication) | 0.665 (0.075) | 0.626 (0.072) | 0.768 (0.084) |
    | STD and correlation parameters | | | |
    | $\sigma_A$ (SD in true category quality) | 0.752 (0.062) | 0.871 (0.064) | 0.502 (0.026) |
    | $\sigma_\omega$ (SD in true utility weights) | 0.343 (0.081) | 1.526 (0.532) | 0.495 (0.229) |
    | $\sigma_r$ (SD in true risk aversion) | | 0.367 (0.120) | 0.060 (0.025) |
    | $\sigma_{\gamma_1}$ (SD in $\gamma_1$) | 0.031 (1.808) | 0.032 (0.304) | 0.004 (0.367) |
    | $\sigma_{\gamma_2}$ (SD in $\gamma_2$) | 0.811 (0.087) | 0.753 (0.076) | 1.032 (0.091) |
    | $\sigma_\delta$ (experience variability) | 1.260 (0.025) | 1.241 (0.034) | 1.329 (0.033) |
    | $\rho$ (correlation in cross-category utilities) | 0.013 (0.079) | 0.005 (0.060) | 0.012 (0.055) |
    | Parameters related to learning on reviews | | | |
    | $M_{\alpha 0}$ (mean of initial $\log(\alpha_{i0})$) | | | 6.125 (0.929) |
    | $V_{\alpha 0}$ (SD of initial $\log(\alpha_{i0})$) | | | 1.379 (0.596) |
    | $M_{R0}$ (mean of initial $\log(E(\sigma_R^2))$) | | | 1.886 (0.118) |
    | $V_{R0}$ (SD of initial $\log(E(\sigma_R^2))$) | | | 0.398 (0.052) |

    Note. Statistically significant estimates (p < 0.05) appear in bold.

    The intercept estimates for the other goods have some noticeable changes. This is due to the different estimates of the utility weight and risk aversion coefficient across models. For example, the intercept estimates are much smaller in model 3 than in model 1. This is mainly due to the significant risk coefficient we found. In model 1, the risk coefficient is fixed to be 0 since we did not model any learning or uncertainty there, whereas in model 3, the risk coefficient estimate is significant. In model 3, the risk aversion implies that the expected utility of each of the four genres of the books will incur a negative component (i.e., risk coefficient times the expected variance, and there is a negative sign in front of the risk coefficient in our model specification). This negative component will drive down their expected utility, given that the expected mean quality is mainly inferred from the observed review ratings (i.e., they will not offset the negative component). For the simulated choice shares to be consistent with what is observed from the data, the intercept estimates of the four other goods in model 3 are smaller.

    The three models yield very different estimates of the mean utility weight parameter $\bar{\omega}$. Specifically, we find that the parameter estimates from the model that ignores learning from both the self and others (model 1) are biased upwards. The risk aversion parameter, $\bar{r}$, is found to be significant in both model 2 (= exp(0.198)) and model 3 (= exp(2.350)). Also, we find that irrespective of model specification, consumers are heterogeneous in their risk aversion ($\sigma_r$ is significant for both models 2 and 3). Similarly, model 2 overestimates the initial consumer uncertainty of true quality ($\sigma_{v_{ij,t=0}}$). However, the magnitude of such uncertainty is not substantial. This is consistent with our finding of limited consumer learning from their own experience of books of the same genre. This implies that most of the quality uncertainty is with respect to the subjective quality of the specific book title, which is what incentivizes an individual consumer to learn from product reviews. Furthermore, all models produce a significant estimate of experience variability ($\sigma_\delta$), suggesting that consumer experience only provides a noisy signal of the true product quality. Finally, the correlation parameter in the nested-logit specification is insignificant in all three models.

    We now report the effects of the two covariates we incorporate in our application: price and time since publication. To capture the nonlinearity effect of time since publication, we include the log-transformed time since publication. Across all three models, we find that the price coefficient is insignificant, suggesting low sensitivity to prices in book choices. This could perhaps be driven by low variance in prices across books (see Table 1). As predicted, books that have been published more recently are preferred. Both nested models overestimate this coefficient, providing further evidence of the importance of accounting for learning from reviews in choice models.

    A unique feature of our model is that it accounts for the credibility of reviews and allows it to vary across consumers. Credibility is measured by $1/\sigma_R^2$ (i.e., precision), where $\sigma_R^2$ is a measure of uncertainty about reviews posted by reviewers. The initial credibility coefficient is modeled as $1/E_{i0}(\sigma_R^2)$. Our estimates of mean and standard deviation for $\log E_{i0}(\sigma_R^2)$ are 1.886 (SD = 0.118) and 0.398 (SD = 0.052), respectively. The large and significant standard deviation for $\log E_{i0}(\sigma_R^2)$ suggests that review credibility is substantially heterogeneous. The mean of $\log(\alpha_{i0})$, the magnitude of richness of experience, is 6.125 (SD = 0.929), which indicates that most consumers have experience buying books. However, the high and significant standard deviation of $\log(\alpha_{i0})$ suggests our data might include some novices as well.

    In summary, the following are the unique findings of this research: (a) there is a significant amount of consumer learning from online reviews of book titles, much more than that from consumers' own experience with other books in the same genre; (b) based on the similarity between reviews for a book and their own evaluation of that book, consumers update their beliefs about the credibility of product reviews over time; and (c) not accounting for consumer learning from others leads to biased inferences of the effects of some covariates on choice.

    5.2. Counterfactual Simulations

    A key benefit of adopting a structural modeling approach to understand the effect of product reviews is that we are able to estimate the effects of firms' marketing policies and reviewer behavior on consumer choice and market share. To illustrate the managerial relevance of this research, we present two simulations.

    Simulation 1. The objective of this counterfactual experiment is to illustrate how our modeling framework can help firms evaluate the profitability of offering product review incentives. Our managerial context is that of a firm that manages word-of-mouth activity (Godes and Mayzlin 2009) by incentivizing consumers to post online product reviews. In many industries such as retail and lodging, consumers are incentivized to write a review to share their experiences. Firms such as Amazon, Apple, Macy's, and Walmart are known to regularly request their consumers to post product reviews. Consider a book retailer (or publisher) whose gross margin of selling a book is $x$% of the selling price ($P$), and the average cost of incentivizing a consumer to post one product review is $y$% of the selling price. As is typical of word-of-mouth marketing campaigns, we assume that the firm cannot control the content of the review posted by the reviewer but can incentivize consumers to post more reviews. If the market size is $M$ units, it follows that the profit increase as a result of $v$ additional reviews is given by $\Delta M \cdot xP - v \cdot yP$, where $\Delta M$ is the increase in demand as a result of $v$ product reviews. So the former term in the expression for profit increase denotes increase in revenue, and the latter term denotes the marketing expenditure incurred by the publisher.
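The profit expression for Simulation 1 translates directly into code. This is a minimal sketch with hypothetical argument names; margin and review cost are entered as fractions of the selling price, and the example numbers in the usage note are made up for illustration rather than taken from the paper's results.

```python
def profit_increase(extra_units, margin, review_cost, extra_reviews, price=1.0):
    """Change in profit from incentivizing additional reviews.

    extra_units:   increase in demand (units) attributed to the extra reviews.
    margin:        gross margin as a fraction of the selling price (e.g. 0.10).
    review_cost:   incentive cost per review as a fraction of the selling price.
    extra_reviews: number of additional reviews incentivized.
    price:         selling price; 1.0 expresses the result in units of price.
    """
    revenue_gain = extra_units * margin * price
    incentive_cost = extra_reviews * review_cost * price
    return revenue_gain - incentive_cost
```

For example, `profit_increase(13, 0.10, 0.10, 10)` evaluates a hypothetical case in which 10 extra reviews bring 13 extra unit sales. Because the demand gain flattens as reviews accumulate while the incentive cost grows linearly, the function turns negative once the extra reviews stop generating enough additional sales.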

    Based on this setup, we estimate the profit impact of increased customer reviews for a representative book (with all attributes set at mean levels in our data). We assume the marketing expenditure associated with an additional product review, $y = 10\%$, and a conservative market size $M$ of 10,000 units. We then conduct policy simulations by estimating changes in market share, and consequent changes in profits, at varying levels of product margins (we vary $x = 10\%$, 12%, and 14%, based on book industry norms) and varying numbers of additional reviews (we vary $v = 10, 20, \ldots, 100$). The market share of a representative book with the number of reviews equaling the mean number of reviews per book in our data is estimated to be 1.13%.

    Results of policy simulations appear in Table 6, and there are three key findings. First, as expected, there are diminishing returns to increasing the number of reviewers. Increasing the number of reviewers from 0 to 10 has a much greater profit impact than increasing the number of reviewers from 90 to 100. Second, although increasing reviewers always leads to greater market share, it might lead to lower profits. For example, the policy of spending 10% of the book price on incentivizing an additional review leads to lower profits if the number of reviews increases by 70 and if the profit margin is 10%. Third, there is an optimum number of product reviews that the firm should spend on to maximize profits. For example, if the profit margin is 14%, the optimal number of reviews is 40. These results are consistent with the mechanism of Bayesian learning, as uncertainty is greater initially and reduces over time as the consumer gains more experience with the genre and is exposed to more numerous reviews. From a managerial perspective, this suggests that although all product reviews have a positive effect on market share, reviews that are posted earlier have a greater effect than those posted later. Firms spending marketing dollars for incentivizing people to post reviews might wish to consider this dynamic effect of reviews on market share. It would be rational for firms to pay more for eliciting earlier reviews than for later reviews.

    Table 6 Effect of Increasing Product Reviews on Profits

    Increase in the      Market share    Profit increase as % of baseline profit
    number of reviews    (in %)          Margin = 10%    Margin = 12%    Margin = 14%
      0                  1.13               0.00            0.00            0.00
     10                  1.26               7.08            7.82            8.34
     20                  1.36              11.68           13.16           14.21
     30                  1.41              11.50           13.72           15.30
     40                  1.45              10.62           13.57           15.68
     50                  1.45               6.19            9.88           12.52
     60                  1.45               1.77            6.19            9.36
     70                  1.46              -1.77            3.39            7.08
     80                  1.46              -6.19           -0.29            3.92
     90                  1.46             -10.62           -3.98            0.76
    100                  1.47             -14.16           -6.78           -1.52

    Note. Baseline profit is the profit when the number of reviewers is the same as the sample mean.
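The arithmetic behind Table 6 can be sketched as follows. This is a minimal illustration, not the authors' simulation code. It assumes that profit equals margin times units sold minus review spending, with all money measured in units of the book price. The per-review cost `cost_per_review = 0.05` is a hypothetical value chosen because it comes close to the tabulated figures; how it relates to y = 10% is not settled by this section. All function and variable names are illustrative.

```python
# Market shares (in %) by number of additional reviews, taken from Table 6.
SHARES = {0: 1.13, 10: 1.26, 20: 1.36, 30: 1.41, 40: 1.45, 50: 1.45,
          60: 1.45, 70: 1.46, 80: 1.46, 90: 1.46, 100: 1.47}

MARKET_SIZE = 10_000  # M, the market size stated in the text


def profit_change_pct(n_reviews, margin, cost_per_review=0.05):
    """Profit change (% of baseline), with money in units of the book price.

    cost_per_review is an assumed value, not a figure confirmed by the text.
    """
    baseline = margin * MARKET_SIZE * SHARES[0] / 100
    profit = (margin * MARKET_SIZE * SHARES[n_reviews] / 100
              - cost_per_review * n_reviews)
    return 100 * (profit - baseline) / baseline


def best_review_count(margin):
    """Number of additional reviews with the highest profit in this sketch."""
    return max(SHARES, key=lambda v: profit_change_pct(v, margin))


if __name__ == "__main__":
    for margin in (0.10, 0.12, 0.14):
        row = [round(profit_change_pct(v, margin), 2) for v in SHARES]
        print(f"margin={margin:.0%}", row, "best:", best_review_count(margin))
```

With a margin of 0.14 the sketch selects 40 additional reviews, which agrees with the optimum reported in the text. Small gaps relative to the table are to be expected because the market shares are rounded to two decimals.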

    Simulation 2. Fake reviews are receiving an increasing amount of attention from practitioners (Streitfeld 2011, Helft 2010). Some firms hire consumers to write fake reviews to spread more positive marketing communication. Our model provides a tool to understand the impact of this practice. Because fake reviews are more positive than authentic reviews, there should be a greater discrepancy between fake reviews of a product and the consumer's own experience of that product, which will increase consumer uncertainty. Intuitively, such increased consumer uncertainty may reduce the effect of reviews because of the lowered credibility of reviews.

    Consider four online retailing platforms selling the same books. One has only authentic product reviews (termed "authentic"), and the other three have a 25%, 50%, or 75% chance of getting fake reviews (termed "fake"), respectively. For authentic reviews across the four platforms, we set the mean quality of each book as 4. Fake reviews are assumed to have a rating of 5, because they are always very positive. The experience signals of a new consumer who buys from these platforms are drawn from the distribution N(4, 2). For each platform, we consider two subconditions: one involves customers with less prior experience (0 = 5) and the other involves customers with more prior experience (0 = 50). For each platform and under each of the two conditions, we simulate the path of the degree of uncertainty (E_it[σ²_R]) perceived by the consumer as she purchases more books. Several patterns emerge. In the fully authentic platform (the dark lines in Figures 1–3), the uncertainty first decreases (as a result of learning from one's own experience) and then levels off. On the fake platforms, the consumer is exposed to a combination of fake and authentic reviews. We find that her uncertainty increases significantly as she purchases more books. This analysis demonstrates that fake reviews increase consumer uncertainty (see footnote 4). Furthermore, the effect of fake reviews on consumer uncertainty becomes larger as the likelihood of getting a fake review increases and/or the consumer's prior experience decreases.
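The mechanism in this simulation can be illustrated with a deliberately simplified sketch. It is not the paper's learning specification: it assumes the consumer tracks the variance of the gap between a book's review rating and her own experience, treats that gap as having a known mean of zero, and updates a conjugate scaled-inverse-chi-square prior. Reading N(4, 2) as mean 4 and variance 2, mapping the two prior-experience conditions to a `prior_strength` of 5 or 50 pseudo-observations, and fixing authentic ratings at exactly 4 are all assumptions made for illustration.

```python
import random


def uncertainty_path(p_fake, prior_strength, n_purchases=21, prior_var=2.0, rng=None):
    """Posterior-mean variance of the review-minus-experience gap after each purchase."""
    rng = rng or random.Random(0)
    sum_sq = 0.0
    path = []
    for n in range(1, n_purchases + 1):
        experience = rng.gauss(4.0, 2.0 ** 0.5)         # own signal; variance 2 (assumed reading)
        rating = 5.0 if rng.random() < p_fake else 4.0  # fake reviews rate 5, authentic ones 4
        sum_sq += (rating - experience) ** 2
        # Scaled-inverse-chi-square posterior mean under an assumed zero-mean gap.
        path.append((prior_strength * prior_var + sum_sq) / (prior_strength + n - 2))
    return path


def mean_final_uncertainty(p_fake, prior_strength, reps=2000, seed=1):
    """Average uncertainty at the last purchase over many simulated consumers."""
    rng = random.Random(seed)
    total = sum(uncertainty_path(p_fake, prior_strength, rng=rng)[-1]
                for _ in range(reps))
    return total / reps


if __name__ == "__main__":
    for strength in (5, 50):
        for p in (0.0, 0.25, 0.5, 0.75):
            print(strength, p, round(mean_final_uncertainty(p, strength), 2))
```

Because fake ratings sit systematically above the consumer's own experience, the squared gaps are larger on the fake platforms, so the perceived variance ends up higher there. A larger `prior_strength` dampens this effect, in line with the qualitative pattern described above.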

    Next we illustrate how the effect of reviews differs across the four platforms. We first examine the effect of increased review ratings of a book when the consumer makes her 21st purchase on each platform (Figures 1–3). We know that this consumer perceives

    Footnote 4. We set all model parameters at their mean level.


    Figure 1 The Effect of Fake Reviews on Consumer Uncertainty (25% Fake Reviews). Panels: (a) 0 = 5; (b) 0 = 50. Horizontal axis: purchase occasion (1 to 21). Vertical axis: consumer uncertainty (0.0 to 6.0). Series: Authentic, Fake.

    all reviews on the fake platforms to be less credible than those on the authentic platform. We compute the relative change in the choice probability (see footnote 5) of a book when its product ratings are increased by one standard deviation (but everything else is held constant). The consumer is exposed to only authentic reviews for this book on all four platforms. We repeat this exercise for each of the 150 books in our data and then calculate the average. As shown in Table 7, exposure to fake reviews reduces the effect of subsequent reviews, even when the subsequent reviews are authentic. This is because the consumer perceives all reviews on the fake platforms to be less credible, irrespective of whether they are fake or authentic.
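The link between review credibility and the effect of a rating increase can be sketched with a textbook normal-normal Bayesian update feeding a binary logit. This is an illustration under stated assumptions rather than the estimated model: the prior mean and precision, the logit intercept, and the review precisions are hypothetical values, and the relative change follows the definition given in footnote 5.

```python
import math


def posterior_quality(prior_mean, prior_prec, avg_rating, n_reviews, review_prec):
    """Normal-normal update: precision-weighted average of prior belief and reviews."""
    total_prec = prior_prec + n_reviews * review_prec
    return (prior_prec * prior_mean + n_reviews * review_prec * avg_rating) / total_prec


def choice_prob(utility):
    """Binary logit against an outside option whose utility is 0."""
    return 1.0 / (1.0 + math.exp(-utility))


def relative_change(avg_rating, bump, n_reviews, review_prec,
                    prior_mean=4.0, prior_prec=1.0, intercept=-8.0):
    """(P under increased rating - P under current rating) / P under current rating."""
    base = choice_prob(intercept + posterior_quality(
        prior_mean, prior_prec, avg_rating, n_reviews, review_prec))
    new = choice_prob(intercept + posterior_quality(
        prior_mean, prior_prec, avg_rating + bump, n_reviews, review_prec))
    return (new - base) / base


if __name__ == "__main__":
    # Higher review precision stands in for a platform with more credible reviews.
    for prec in (0.5, 0.25, 0.1):
        print(prec, round(relative_change(4.0, 0.5, 10, prec), 3))
```

When reviews are perceived as less precise, they receive less weight in the posterior belief, so the same rating increase moves the choice probability by a smaller relative amount.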

    We also examine the effect of an increased number of reviews of a book on its choice probability on each platform. We perform the same analysis as before,

    Footnote 5. The choice probability change is calculated as (choice probability under the increased rating − choice probability under the current rating)/(choice probability under the current rating).

    Figure 2 The Effect of Fake Reviews on Consumer Uncertainty (50% Fake Reviews). Panels: (a) 0 = 5; (b) 0 = 50. Horizontal axis: purchase occasion (1 to 21). Vertical axis: consumer uncertainty (0.0 to 6.0). Series: Authentic, Fake.

    with the exception that instead of increasing the ratings of the book, we increase the number of its reviews by one standard deviation. Our findings suggest that fake reviews tend to reduce the effect of more numerous reviews, even when the additional reviews are authentic.

    In summary, we find that the marginal effect of more positive ratings on choice probability is lower on platforms where fake reviews are common. The marginal effect of an increased number of reviews is also reduced as a result of prior exposure to fake reviews. This illustrates the detrimental effect of the practice of untruthful user-generated content.

    6. Conclusion

    Marketing researchers have studied how consumers learn about product quality from several stimuli. Structural models of learning have been employed in the context of consumers' usage experience (Roberts and Urban 1988), advertising exposure (Erdem and Keane 1996, Ackerberg 2003, Byzalov and


    Figure 3 The Effect of Fake Reviews on Consumer Uncertainty (75% Fake Reviews). Panels: (a) 0 = 5; (b) 0 = 50. Horizontal axis: purchase occasion (1 to 21). Vertical axis: consumer uncertainty (0.0 to 6.0). Series: Authentic, Fake.

    Shachar 2004), umbrella branding (Erdem 1998), physician detailing in the pharmaceutical industry (Narayanan et al. 2005), and learning from observing choices made by other consumers (Zhang 2010). In this research, we propose a structural model of learning from a source of product-related information that is becoming increasingly ubiquitous: online product reviews.

    The context of our research is consumer choice of experiential products such as books, music, and movies. A unique characteristic of this phenomenon is that it entails numerous sources of

    Table 7 Average Market Share Change as a Result of Increased Ratings and Increased Numbers of Reviews (in %)

                                   Fake review probability
                                   0%        25%       50%       75%
    Increased rating of reviews     8.10      5.75      4.37      3.79
    Increased number of reviews    23.11     22.33     19.94     18.46

    product information. In other words, a consumer is typically exposed to reviews posted by other consumers. It is plausible that the credibility of the reviews evolves over time as the consumer gains more direct experience and has more opportunities to evaluate how well reviews predict her own preference. We propose a very general specification for modeling review credibility, such that the credibility of product reviews is allowed to vary over time for the same consumer (as she gains more direct experience) and across different consumers. Another feature of experiential goods purchases is limited repeated purchase of the same product item but significant experience with other products of the same type. Accordingly, we model consumers' learning from their own experience with other products of the same type.

    Our analysis leads to several unique findings. Consumers learn more from reviews of a given product than they do from their own past experience of similar products. Consumers update their beliefs about the credibility of reviews based on their own experiences and ratings from reviews on the same books, so that learning from reviews varies across consumers and also over time. We demonstrate how our model can be used for decisions pertaining to word-of-mouth marketing. Specifically, we compute the profit impact of increasing the number of reviews when firms need to spend marketing resources in incentivizing consumers to post reviews. We find strong evidence of diminishing effects of product reviews on profits; a firm could even incur losses when investing in a sufficiently large number of product reviews for low-margin products. In another policy simulation, we examine the issue of fake reviews. We consider two types of online retailing platforms, one of which has only authentic product reviews (termed "authentic"), and the other has a possibility of getting fake reviews (termed "fake"). We find that fake reviews increase consumer uncertainty. The effects of more positive reviews and more numerous reviews on consumer choice are lower on online retailing platforms that have fake product reviews.

    This research marks the first attempt to incorporate a novel source of product information into structural models of consumer learning. As such, our findings are neither comprehensive nor free of limitations. Several limitations of this study suggest opportunities for future research. First, we study a single-attribute learning context (based on genre). However, our model is sufficiently general and can be readily extended to capture learning on multiple attributes (based on genre, author, etc.). Second, we model the difference between the consumer experience signal and the average review rating to be 0. This assumption can be relaxed to specifically model consumer-perceived review bias. Third, our data are


    not commonly available because they include consumers who evaluate all of their purchases. In the case when only a fraction of consumer purchases are evaluated, we can treat consumer experience signals on those unevaluated products as latent variables and integrate them out in model inference as in a standard learning model. Fourth, despite the associated computational burdens, it might be interesting to study whether consumers adopt forward-looking behavior in the context of the purchase of experiential products. Finally, although we model learning from consumer reviews posted on a major website that hosts such information, the presence of alternative sources of learning, such as advertising, off-line word of mouth, and product reviews from other websites, cannot be ruled out. This issue of incomplete data sets is perhaps generic to all research on consumer learning. We hope that this study will stimulate further interest in this challenging, interesting, and increasingly important research area.

    Acknowledgments

    This paper was originally intended to appear in the special issue on user-generated content, but the review process was not concluded in time. The authors thank the anonymous company that provided data for this study. They are grateful to the editor, associate editor, and two anonymous referees for extremely helpful comments. The authors also thank participants at the Marketing Science conference as well as workshops given at Duke University and the University of Texas at Dallas for their valuable comments. The second author acknowledges the financial support from the National Natural Science Foundation of China [Grants 71128002 and 71210003].

    References

    Ackerberg DA (2003) Advertising, learning and consumer choice in experience goods markets: An empirical examination. Internat. Econom. Rev. 14(3):1007–1040.

    Berndt E, Hall B, Hall R, Hausman J (1974) Estimation and inference in nonlinear structural models. Ann. Econom. Soc. Measurement 3/4:653–665.

    Berry ST (1994) Estimating discrete choice models of product differentiation. RAND J. Econom. 25(2):242–262.

    Brock T (1965) Communicator-recipient similarity and decision change. J. Personality Soc. Psych. 1(6):650–654.

    Byzalov D, Shachar R (2004) The risk reduction role of advertising. Quant. Marketing Econom. 2(4):283–320.

    Chaiken S (1980) Heuristic versus systematic information processing and the use of source versus message cues in persuasion. J. Personality Soc. Psych. 39(5):752–766.

    Chevalier JA, Mayzlin D (2006) The effect of word of mouth on sales: Online book reviews. J. Marketing Res. 43(3):345–354.

    Chintagunta PK, Jiang R, Jin GZ (2009) Information, learning, and drug diffusion: The case of Cox-2 inhibitors. Quant. Marketing Econom. 7(4):399–443.

    Dellarocas C (2006) Strategic manipulation of Internet opinion forums: Implications for consumers and firms. Management Sci. 52(10):1577–1593.

    Dellarocas C, Zhang M, Awad NF (2007) Exploring the value of online product reviews in forecasting sales: The case of motion pictures. J. Interactive Marketing 21(4):23–45.

    Duan W, Gu B, Whinston AB (2008) Do online reviews matter? An empirical investigation of panel data. Decision Support Systems 45(4):1007–1016.

    Erdem T (1998) An empirical analysis of umbrella branding. J. Marketing Res. 34(3):339–351.

    Erdem T, Keane MP (1996) Decision-making under uncertainty: Capturing dynamic brand choice processes in turbulent consumer goods markets. Marketing Sci. 15(1):1–20.

    Erdem T, Zhao Y, Valenzuela A (2004) Performance of store brands: A cross-country analysis of consumer store brand preferences, perceptions, and risk. J. Marketing Res. 41(1):86–115.

    Godes D, Mayzlin D (2009) Firm-created word-of-mouth communication: Evidence from a field test. Marketing Sci. 28(4):721–739.

    Godes D, Silva J (2012) Sequential and temporal dynamics of online opinion. Marketing Sci. 31(3):448–473.

    Hajivassiliou V, Ruud P (1994) Classical estimation methods for LDV models using simulation. Arrow K, McFadden D, eds. Handbook of Econometrics, Vol. IV (Elsevier Science, New York), 2383–2441.

    Helft M (2010) Charges settled over fake reviews on iTunes. New York Times (August 26) http://www.nytimes.com/2010/08/27/technology/27ftc.html.

    Iyengar R, Ansari A, Gupta S (2007) A model of consumer learning for service quality and usage. J. Marketing Res. 44(4):529–544.

    Keane M (1993) Simulation estimation for panel data models with limited dependent variables. Maddala GS, Rao CR, Vinod HD, eds. Handbook of Statistics (Elsevier Publishers, New York), 545–571.

    Kelman HC (1961) Processes of opinion change. Public Opinion Quart. 25(Spring):57–78.

    Li X, Hitt LM (2008) Self-selection and information role of online product reviews. Inform. Systems Res. 19(4):456–474.

    Liu Y (2006) Word of mouth for movies: Its dynamics and impact on box office revenue. J. Marketing 70(3):74–89.

    Luan JY, Neslin SA (2009) The development and impact of consumer word of mouth in new product diffusion. Working Paper 2009-65, Tuck School of Business at Dartmouth, Hanover, NH. http://ssrn.com/abstract=1462336.

    Mayzlin D (2006) Promotional chat on the Internet. Marketing Sci. 25(2):155–163.

    Mehta N, Rajiv S, Srinivasan K (2003) Price uncertainty and consumer search: A structural model of consideration set formation. Marketing Sci. 22(1):58–84.

    Meyer RJ, Sathi A (1985) A multiattribute model of consumer choice during product learning. Marketing Sci. 4(6):41–61.

    Moe W, Schweidel D (2012) Online product opinions: Incidence, evaluation, and evolution. Marketing Sci. 31(3):372–386.

    Moe W, Trusov M (2011) Measuring the value of social dynamics in online product forums. J. Marketing Res. 48(3):444–456.

    Narayanan S, Manchanda P (2009) Heterogeneous learning and the targeting of marketing communication for new products. Marketing Sci. 28(3):424–441.

    Narayanan S, Manchanda P, Chintagunta PK (2005) Temporal differences in the role of marketing communication in new product categories. J. Marketing Res. 42(3):278–290.

    Roberts JH, Urban G (1988) Modeling multiattribute utility, risk and belief dynamics for new consumer durable brand choice. Management Sci. 34(2):167–185.

    Sénécal S, Nantel J (2004) The influence of online product recommendations on consumers' online choices. J. Retailing 80(2):159–169.


    Shin S, Misra S, Horsky D (2012) Disentangling preferences and learning in brand choice models. Marketing Sci. 31(1):115–137.

    Sternthal B, Phillips LW, Dholakia R (1978) The persuasive effect of source credibility: A situational analysis. Public Opinion Quart. 42(3):285–314.

    Streitfeld D (2011) In a race to out-rave, 5-star Web reviews go for $5. New York Times (August 19) http://www.nytimes.com/2011/08/20/technology/finding-fake-reviews-online.html.

    Train K (2003) Discrete Choice Models with Simulation (Cambridge University Press, Cambridge, UK).

    Zhang J (2010) The sound of silence: Observational learning from the U.S. kidney market. Marketing Sci. 29(2):315–335.

    Zhao Y, Zhao Y, Helsen K (2011) Consumer learning in a turbulent market environment: Modeling consumer choice dynamics after a product harm crisis. J. Marketing Res. 48(2):255–267.