Do reputation feedback systems really improve trust among anonymous traders? An experimental study


David Masclet* Thierry Pénard†

Abstract Feedback systems are claimed to be a crucial component of the success of electronic marketplaces like eBay or Amazon Marketplace. This article examines the effectiveness of various feedback systems in fostering trust between anonymous traders, through a set of experiments based on the trust game. Our results indicate that trust is significantly improved by the introduction of a reputation feedback system. However, such mechanisms are far from perfect and are especially vulnerable to strategic ratings and reciprocation. Our findings indicate that some changes in rating rules may significantly improve the efficiency of feedback systems by avoiding strategic rating or reciprocation, and hence stimulate trust and trustworthiness among traders. In particular, a system in which individuals are not informed of the other trader’s decision before making their own provides better results in terms of both trust and earnings.

JEL classification: C92, C72, L14, L86.

* CREM, CNRS, Université de Rennes 1, Marsouin and CIRANO (Montreal).

† CREM, CNRS, Université de Rennes 1, Marsouin.

We would like to express our gratitude to Jean Robert Tyran and K. Abbink and to conference participants at the JEE and ESA, as well as to E. Priour for scheduling and running the experimental sessions. We were able to carry out these experiments thanks to financial support provided by the Brittany Regional Council, as part of the MARSOUIN program.


1. Introduction

How can trust emerge among strangers? Should we trust people that we have never met

physically? These questions may seem odd given the success of electronic marketplaces like

eBay or Amazon Marketplace, in which millions of anonymous buyers and sellers trust each other

daily. This success challenges the view that the features of online markets, such as geographical

distance and anonymity, make opportunistic behavior much easier than in traditional markets

with “face-to-face” transactions. Online sellers can be opportunistic by cheating on the quality of

the product (for example by exaggerating its quality), or on delivery (e.g. not shipping, shipping

items other than those described, shipping counterfeit merchandise …). Online buyers can also

be dishonest regarding the payment sent to the seller (for example by delaying payment).

Electronic marketplaces can enhance trust and reduce opportunistic behaviors by screening

participants, certifying quality or monitoring transactions. However, such mechanisms can only

be implemented in small online marketplaces (like business-to-business places), where the

anonymity of traders is limited. On C2C (consumer-to-consumer) marketplaces like eBay or

Amazon Marketplace, the huge number of participants and daily transactions makes it difficult to

centralize monitoring. An alternative method is to let traders self-monitor their transactions by

providing them with a decentralized reputation-building mechanism. The eBay Feedback Forum,

which is claimed to be a crucial component of the success of eBay, is a good example of such a

decentralized mechanism. The idea is that the traders themselves are often in a better position to

monitor and evaluate their partners. In the eBay forum, both buyers and sellers have the

opportunity of rating each other. The buyer can send a "positive", "negative", or "neutral" rating

and the seller a "positive" or "neutral" rating. Both can also leave a comment.1 Each eBay user is

1 In addition to leaving a feedback rating (positive, neutral, or negative), buyers can also leave anonymous detailed seller ratings in four areas: item as described, communication, shipping time, and shipping and handling charges. The detailed seller rating system is based on a one- to five-star scale. Five stars is the highest rating, and one star is the lowest rating. Even if these detailed seller ratings do not affect the overall feedback score, they can provide additional information about the seller’s performance.


therefore characterized by his or her feedback profile (i.e. the historical record of all the ratings

they have received), which is available for consultation by all other users. The buyer and seller

thus hold information on the reputation or reliability of their partner at the time of concluding the

transaction. These feedback mechanisms therefore play both a punishment and reward role and

a signaling role: each trader can punish (reward) her partner by leaving negative

(positive) ratings, and each trader can also build up a publicly observable reputation. The

threat of having a bad public reputation may provide the trader with sufficient incentives to be

honest.

Many empirical studies have found that reputation feedback systems exert a deterrent effect on

the opportunistic behavior that the Internet's anonymity may incite buyers and sellers to adopt (Ba

and Pavlou, 2002; Houser and Wooders, 2006; Cabral and Hortacsu, 2009; Resnick et al.,

2006). The empirical results show that a seller with good ratings can expect to sell an item more

quickly and at a better price.2 It is in the interest of both partners for their subsequent

transactions to be as honest as possible in order to generate positive ratings, or at least avoid

negative ratings.

The aim of this paper is threefold. First, we seek to provide further evidence of the effects of

rating on trust between anonymous traders, through a set of laboratory experiments based on trust

games with and without feedback systems. Second, we seek to understand sellers’ and buyers’

motives for rating their partners. (1) Individuals may choose to assign a negative (positive) rating

in order to punish (reward) an unfair (fair) outcome (we refer to this as the outcome-oriented

rating motive); (2) They may want to reciprocate received ratings by assigning a negative

(positive) rating because they have received a negative (positive) rating (we call this a

feedback-oriented rating); (3) Last, individuals may be willing to assign a positive rating because they

2 Resnick et al. (2006) and Dellarocas (2006) provide summaries of this work. For example, Houser and Wooders (2006) find that on eBay a 10% rise in the number of positive ratings recorded for a seller is associated with a 0.17% increase in the price that the seller can command, whereas a 10% rise in neutral or negative ratings lowers the price obtained by 0.24%.


expect that their partner will subsequently reciprocate by sending them a positive rating (we refer

to this as the strategic rating motive).3 While the first motive is directly related to the outcome

of the relation between the traders, this is obviously not the case for the last two motives, which are

directly related to the feedback mechanism and hence might bias the informational content of the

feedback profiles.

The third aim of this paper is to compare different feedback reputation systems and investigate

whether slight modifications of the rating rules may improve the efficiency of the feedback

system. Indeed, feedback mechanisms are far from being perfect and are especially vulnerable to

strategic ratings and reciprocation (feedback-oriented ratings). Does the introduction of feedback

rules that reduce both strategic and feedback-oriented rating incentives improve the informational

content of ratings, and hence stimulate cooperation among traders?

To answer these questions, we ran an experiment inspired by the trust game devised by Berg et

al. (1995), in which Player A (the trustor) selects the share of his allocation he wishes to send to

Player B (the trustee). Player B actually receives three times the amount sent and then must

decide how much to return to the first player.4 In comparison to the standard trust game, we

added in some treatments a second step, once the investment decisions have been made, during

which participants are given the option of evaluating their partners.

Three feedback systems were compared. In a first treatment called simultaneous rating,

players simultaneously submitted ratings without information about the rating decision of their

partner. In a second treatment called exogenous sequential rating, rating decisions were

3 For example, on the eBay Forum, a buyer or seller can submit an unjustified positive rating to encourage the transaction partner to reciprocate with a positive rating.

4 This game yields a good approximation of how a transaction might be conducted in an online market such as eBay, where one of the commercial partners (the buyer) makes payment to the other (the seller) in return for the promise of receiving the purchased item. The buyer is therefore required to trust the seller, who in turn can elect to be honest or, conversely, opportunistic by not delivering the item or by sending an item that does not correspond to that listed in the auction description. The fact that neither the buyer's nor the seller's decision is binary reflects the degree of trust. For example, the seller can be dishonest about the quality of the item sent to the buyer or about the number of days for delivery.


sequential and the order in which players evaluated their partners was randomly imposed by the

computer. Finally in a third treatment (called endogenous sequential rating), the order in which

players evaluated their partners was endogenously chosen by the agents themselves. Precisely,

each player could opt to submit a rating either immediately or later.

The comparison of these three rating systems allows us to clearly disentangle the different

players’ motives for rating their partners and to measure the relative efficiency of each feedback

mechanism. With endogenous sequential ratings, evaluations can be driven by both outcome-

oriented and feedback-oriented motives, but also by strategic motives. In contrast, the

simultaneous feedback system rules out the use of both strategic and feedback-oriented ratings,

and thereby enhances the efficiency of the system. A simultaneous rating system prevents players

from adopting strategic behavior, since they cannot assign a positive rating in order to trigger a

positive response from the partner. It also reduces feedback-oriented rating, since players cannot

punish (reward) a partner for having received a negative (positive) rating. Finally, the feedback

system where the order of the decision is exogenously imposed (exogenous sequential rating)

provides an intermediate system that prevents the first player from using rating for feedback-

oriented motives and the second player from adopting strategic rating.

Our analysis builds on previous experimental work. In particular, our approach is related

to Bolton et al. (2004), who ran an experiment using a sequential buyer-seller game. The authors

compare treatments with and without reputation. In the reputation treatment, players were

informed of each other’s past play. Bolton et al. (2004) find that trust and trustworthiness were

significantly higher under a reputation system.5 Our paper is also related to Keser (2003), the

only previous paper, to our knowledge, to consider the effect of a feedback system in the context of

a trust-game experiment. Keser examined the effects of reputation by comparing three different

5 Bolton et al. (2006) also run experiments investigating how the interaction between market competition and reputation creates trust between sellers and buyers. See also Gazzale and Khopkar (2007), who use the same experimental design but introduce the possibility for sellers to observe buyers’ past feedback provision.


treatments. In the baseline treatment, subjects play a repeated trust game under a stranger

matching protocol. The reputation treatments (short- and long-term reputation) are similar to the

baseline treatment, except that there is an additional stage in which the trustor can rate the trustee

by assigning her a costless positive, negative or neutral rating. At the beginning of each period,

the trustor is therefore informed about the trustee's previous ratings. Keser finds that

introducing this feedback system significantly increases cooperation in the trust game, in

particular when subjects have full information (long-run reputation). Our experiment builds on

Keser (2003), with the notable exception that we allow both the buyer and the seller to rate their

partners and we introduce a cost of evaluation. In addition, our experiment provides an in-depth

analysis of the different motives for rating one's partner, by comparing various feedback

mechanisms.

Our study is also related to several other previous studies that have investigated the limitations of

reputation feedback systems (Dellarocas et al., 2006, Dini and Spagnolo, 2007). Dellarocas et

al. (2006) and Klein et al. (2005) provided empirical evidence of reprisals and reciprocity at work

on eBay, which ultimately serve to artificially increase the number of positive ratings and

reduce the number of negative ratings (see also Resnick and Zeckhauser, 2002). Our study provides further

evidence of the imperfections of these feedback mechanisms and seeks to disentangle the

different motives in the context of a lab experiment. Laboratory experiments have the

advantage of isolating each motive in a controlled environment, and of avoiding any possible role

for contextual effects. Our analysis relies on actual and costly decisions instead of subjective

reported behaviour.

To anticipate our results, we find that trust is significantly improved by the use of a

reputation feedback system. We also show that evaluations are largely driven by

feedback-oriented and strategic motives. Our results also indicate that trust can be improved when

feedback rules are more constrained. In particular, a system in which individuals are not


informed of their partner's decision before taking their own decision (simultaneous rating system)

provides better results both in terms of trust and payoffs compared to sequential rating systems.

The experimental protocol is presented in the next section. We briefly present the

theoretical predictions of our experimental games in Section 3. The experimental results are

discussed in Section 4, and Section 5 concludes.

2. Experimental design

2.1. Overview

Our experimental design consists of four different treatments. The baseline treatment

corresponds to a finitely repeated simultaneous trust game. The game lasts for 20 periods. At the

beginning of each period, participants A and B each receive a 10-unit allocation. Player A (the

buyer) selects an amount between 0 and 10 units to send to B (the seller), while at the same time

B determines the sum to be returned, which is between 0 and the amount received (the latter is

the amount sent by A multiplied by three).6 Player A's gain is then equal to 10 – amount sent +

amount returned, and Player B's gain is 10 + 3*amount sent by A – amount returned.
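The payoff rule just described can be written as a small function. This is only an illustrative sketch: the function name and argument names are hypothetical, while the 10-unit endowment and the multiplier of three are taken from the design above.

```python
def trust_game_payoffs(sent, returned, endowment=10, multiplier=3):
    """Return (payoff_A, payoff_B) for one period of the baseline trust game.

    sent     -- amount Player A sends to Player B (0..endowment)
    returned -- amount Player B sends back (0..multiplier * sent)
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must lie between 0 and the endowment")
    received = multiplier * sent  # B receives three times the amount sent
    if not 0 <= returned <= received:
        raise ValueError("returned must lie between 0 and the amount received")
    payoff_a = endowment - sent + returned
    payoff_b = endowment + received - returned
    return payoff_a, payoff_b
```

For instance, if A sends the full 10 units and B returns half of the 30 units received, A earns 15 and B earns 25, whereas both earn 10 when nothing is sent.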

The second treatment, called “Simultaneous rating”, is identical to the baseline treatment

except that we added a second stage in which the players have the opportunity of rating their

partner. Both players (A and B) can decide to rate their partner by assigning either a negative (-1)

or positive (+1) point. However, leaving a rating costs 1 unit (i.e. 1/10th of the initial allocation).7

The rated player does not incur any direct cost or benefit, although the negative or positive points

received are recorded on the player's feedback profile. This profile contains a historical record of

6 This is a cold procedure, where A and B play simultaneously. Player A chooses the amount to send to B, while B determines the amount to return for each potential amount received from A. The advantage of this cold procedure is that it places the two players in a more symmetrical position than a so-called hot procedure (whereby A chooses first, and subsequently B), which could provide Players A and B with justified (i.e. non-strategic) reasons to evaluate their partner positively or negatively.

7 This cost corresponds to the opportunity cost of rating, measured by the amount of time and effort devoted to this task. Note that introducing a cost of rating gives all treatments the same theoretical prediction (zero rating and zero investment in each treatment). Note also that if we observe a significant number of ratings in our experiments despite this cost, the number should be even higher in the absence of a rating cost.


all ratings, along with a score that represents the cumulative sum of positive and/or negative

points obtained over all of the previous periods. At each new period, the players’ profile is

transmitted to their subsequent partners, so that each player is aware of his partner's ratings and

has an idea of his partner's reputation. In the “simultaneous rating” treatment both players are

required to make their rating decisions simultaneously. This rule could correspond to keeping

buyer and seller ratings secret until the rating period has expired; this system would, in theory,

eliminate both strategic ratings and feedback-oriented ratings.
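The feedback profile described above (a history of +1/-1 points plus their cumulative sum) and the one-unit rating cost can be sketched as follows. The class and function names are hypothetical and chosen for illustration; only the rating values, the cumulative score and the cost of 1 unit come from the text.

```python
from dataclasses import dataclass, field

RATING_COST = 1  # units paid by the rater, as stated in the design


@dataclass
class FeedbackProfile:
    """Historical record of the ratings a player has received."""
    ratings: list = field(default_factory=list)

    def record(self, rating):
        """Store a rating of +1 (positive) or -1 (negative)."""
        if rating not in (-1, 1):
            raise ValueError("rating must be +1 or -1")
        self.ratings.append(rating)

    @property
    def score(self):
        """Cumulative sum of the points received over all previous periods."""
        return sum(self.ratings)


def payoff_after_rating(stage_one_payoff, rated_partner):
    """Deduct the rating cost from the rater; the rated player pays nothing."""
    return stage_one_payoff - (RATING_COST if rated_partner else 0)
```

A player who has received two positive points and one negative point would thus show the history [1, 1, -1] and a score of 1 to her next partner.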

The third treatment, called “Endogenous Sequential rating”, is the same as the

“simultaneous rating” except that each participant is given the choice to either evaluate

immediately or wait, knowing that only one rating will be accepted. This option is reflected in the

experimental protocol by splitting the rating stage into two phases: each player has the possibility

of proceeding with the rating straight away in Phase 1 or waiting until Phase 2. If the participant

waits until Phase 2, she is made aware of her partner's choice (i.e. either an immediate

positive/negative rating, or no rating) before ultimately deciding whether to evaluate her

partner.8 With this system, players can implement various types of strategies. They can opt to

submit a positive rating immediately in order to incite a positive rating in return (strategic

motive), provided the partner has decided to wait or, on the other hand, to wait so as to

punish/reward any partner who gives a negative/positive rating in Phase 1 (feedback-oriented

rating).9

The last treatment, called "Exogenous Sequential rating", is identical to the previous

treatment, except that the order in which players evaluate one another (i.e. rating

in phase 1 or in phase 2) is exogenously predetermined by a computer in each period. This

variant constrains partners' rating freedom along with their possibility of adopting strategic

8 Such rating rules are very similar to those practiced on several online marketplaces such as eBay, which lets buyers and sellers rate each other during the 90 days following the transaction: they can either rate immediately or wait until the rating period has almost elapsed before sending their rating points.

9 The waiting preference on the part of a player wishing to send a negative rating to an opportunistic partner can also be strategic, i.e. in order to avoid reprisals (in the form of a negative rating) from the punished partner.


behavior. Feedback-oriented rating (i.e. reprisal or reward) is ruled out for the player who is

designated to rate first (i.e. in phase 1), while strategic rating is impossible for the player in the

second position.
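As a reading aid, the rating motives that each treatment leaves open can be summarised in a lookup function. This is a sketch of our reading of the design rather than part of the experimental software; the treatment keys, the function name and the "first"/"second" labels are hypothetical. In the endogenous sequential treatment the position is the player's own choice, so both sets of motives are reachable there.

```python
# Illustrative labels only; the names below are hypothetical.
OUTCOME = "outcome-oriented"
FEEDBACK = "feedback-oriented"
STRATEGIC = "strategic"


def available_motives(treatment, position=None):
    """Rating motives a player can act on, as read from the design description.

    position is "first" (Phase 1) or "second" (Phase 2) and only matters
    in the two sequential rating treatments.
    """
    if treatment == "baseline":
        return set()                     # no rating stage at all
    if treatment == "simultaneous":
        return {OUTCOME}                 # partner's rating is not observed in time
    if treatment in ("exogenous_sequential", "endogenous_sequential"):
        if position == "first":
            return {OUTCOME, STRATEGIC}  # can try to trigger a reply, cannot react
        if position == "second":
            return {OUTCOME, FEEDBACK}   # can react, cannot trigger a reply
        raise ValueError("position must be 'first' or 'second'")
    raise ValueError(f"unknown treatment: {treatment}")
```

The sketch deliberately ignores the additional strategic reason for waiting mentioned in the footnote on reprisals, namely delaying a negative rating to avoid retaliation.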

These experimental treatments enable us to compare the performance and level of trust

generated by different rating policies. Does the introduction of feedback rules reducing both

strategic and feedback-oriented ratings improve the informational content of ratings and hence

stimulate cooperation among partners?

2.2. Procedure and parameters

The experiment consists of 25 sessions, with 10 participants in each session.10 There were at

least six sessions, and thus independent observations, for each treatment. Seven sessions were

conducted under the baseline and Simultaneous rating treatments, and six for both the exogenous

sequential and the endogenous sequential rating treatments, giving a total of 252 participants. All

of the sessions were held at the Center for Research in Economics and Management (CREM),

University of Rennes 1, Rennes, France. The experiment was computerized using the z-Tree program

developed at the University of Zurich.11 The subjects were undergraduate students from a variety

of majors. Roughly one-third were students in economics in the first two years of their university

studies, and all but a small number of the remaining two-thirds were students in law,

management, and medicine. None of the subjects had participated in an economic experiment

previously. No individual participated in more than one session. On average, a session lasted 100

minutes, including initial instructions and subject payment.

10 With the exception of two sessions of the baseline treatment and one session of the exogenous sequential treatment, which contained eight players.

11 See Fischbacher (1999) for a description of the z-Tree computer program.


At the beginning of the experiment, the instructions were distributed and read to the

subjects.12 All subjects were then required to answer a number of questions concerning the rules

of the game and how earnings are determined. The experimenter then announced and explained

the correct answers. Subjects could indicate whether they had any questions about the process

and the experimenter would answer them in private.

Each session had twenty interaction periods. Each period within a session proceeded under

identical rules. At the beginning of the experiment, each participant was assigned the role of

player A or player B. They kept this role during the entire session. The computer network then

matched the subjects into pairs of players, with one player A and one player B. A stranger

matching protocol was used in all of the sessions: at the end of each period, the composition of

the groups changed so that individuals were rematched with another partner on a random basis.

Average player remuneration was 18 euros.

Table 1 presents a summary description of the sessions. The first two columns show the

session number and the number of subjects who took part in the session. The third column

indicates the treatment.

[Table 1 about here]

3. Theoretical predictions and conjectures

3.1. Standard theoretical predictions

Before presenting our results, we briefly present the theoretical predictions of our experimental

games. We first consider the baseline treatment. Player A decides how much to send to player B

by choosing an amount α from the interval between 0 and d where d corresponds to player A’s

initial endowment. The invested amount is tripled by the experimenter. Player B then decides

12 Game instructions are available upon request from the authors.


how much to return from the interval between 0 and 3α. Therefore, the earnings of players A and B in a period are given by:

\[ \pi_A = d - \alpha + \beta \tag{1a} \]

and

\[ \pi_B = d + 3\alpha - \beta \tag{1b} \]

where α is the amount sent by player A to player B and β corresponds to the amount received from B. The subgame perfect equilibrium of the trust game predicts that player A sends nothing to player B, anticipating that the latter, being fully rational, would never return anything (α = 0, β = 0). So if the game is played once, there is a dominant strategy for both players to send and return zero. If the game is finitely repeated, the only subgame perfect equilibrium of the game is also for all players to send zero in each period.13

13 The fact that our game is played using the strategy method does not change the theoretical predictions of the game.

The results of most previous experimental studies of the trust game are inconsistent with the standard predictions. Berg, Dickhaut and McCabe (1995) found that almost all of the senders passed a positive amount of money (evidence of trust) and that the trusted players tended to return the amount that the other player invested (evidence of trustworthiness). Similar results were obtained by Croson and Buchan (1999), Chaudhuri and Gangadharan (2003), Burk et al. (2003) and Willinger et al. (2005).

Let us turn now to the treatments with a feedback system. In the first stage, subjects play the trust game as above. In the second stage, each player can evaluate her partner at a cost. Player A's and Player B’s earnings in a period are given by:

\[ \pi_A = d - \alpha + \beta - c_{A,B} \tag{2a} \]

and

\[ \pi_B = d + 3\alpha - \beta - c_{B,A} \tag{2b} \]

where c_{i,j} is the cost incurred by player i of rating player j. It can be proved by backward induction that in the second stage, players should never rate their partners, since rating is costly (regardless of the feedback system). In the absence of rating, the game therefore reduces


to a standard trust game with a trivial subgame perfect equilibrium (zero trust and zero

trustworthiness), and each player’s gain equals the initial endowment. This situation is collectively

suboptimal because by sending a part or all of his endowment, the first player could have

increased the total gains of the two players.
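To make the size of this efficiency loss concrete, the two baseline payoffs can be added up. The short calculation below is a worked illustration using the payoff definitions of the baseline game (d the endowment, α the amount sent, β the amount returned) and the 10-unit endowment of the design; it is not an additional result.

```latex
\begin{align*}
\pi_A + \pi_B &= (d - \alpha + \beta) + (d + 3\alpha - \beta) = 2d + 2\alpha, \\
\alpha = 0,\ d = 10 &\;\Rightarrow\; \pi_A + \pi_B = 20, \\
\alpha = 10,\ d = 10 &\;\Rightarrow\; \pi_A + \pi_B = 40.
\end{align*}
```

The amount returned cancels out of the sum: it only redistributes the surplus between the two players, so joint earnings depend on the amount sent alone. In the rating treatments, each rating that is assigned lowers joint earnings by its cost.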

3.2. Conjectures

One might conjecture that introducing a feedback system influences positively both trust and

trustworthiness. Reputation feedback systems may affect decisions in two different ways. First,

displaying feedback profiles might help players infer their partners’ intentions. Many studies in

different experimental contexts have shown the influence of releasing the past history of

individual players’ decisions. Berg, Dickhaut and McCabe (1995) find that the provision of

social history (i.e. information on the amounts invested and returned in previous experimental

sessions) significantly increases investment and return. In a related study, Keser (2003) shows

that the introduction of a reputation feedback system enhances the overall efficiency by

increasing both the level of trust and of trustworthiness, although rating one’s partner is not

directly costly. In another context, Duffy and Feltovich (2002, 2006) find that the ability to

observe the other player’s decision increases cooperation in the prisoner’s dilemma.14

Second, the attribution of negative (positive) ratings may play a disciplining role as a stick and

carrot mechanism. For example, players who receive negative evaluations may be incited to

increase their investment in the next period. As shown in different contexts and in particular in

VCM (Voluntary Contribution Mechanism) experiments, people do not hesitate to sacrifice a part

of their money to sanction or reward their partners in order to express their disapproval and such

mechanisms have positive effects on cooperation (see Fehr and Gächter (2000, 2002), Carpenter

14 In a Voluntary Contribution Mechanism context, Sell and Wilson (1991) find that displaying the history of each group member’s contribution yields higher contributions than posting only the group’s total contribution or no information at all.


(2007), Carpenter, Matthews and Ong'ong'a (2004), Masclet, Noussair, Tucker and Villeval

(2003), Nikiforakis and Normann (2008), Nikiforakis, Normann and Wallace (2010), Bochet,

Page, and Putterman (2007), Dugar (2008)).15 Based on these findings we conjecture that the

opportunity of evaluating one’s partner may have positive and significant effects on trust and

trustworthiness. This is summarized in H1.

H1: Both trust and trustworthiness should be higher with a feedback system.

Our second conjecture concerns the differences between the feedback systems. In the

simultaneous rating system, evaluations are only outcome-oriented since this treatment rules out

feedback-oriented rating and strategic rating. In contrast, in the endogenous sequential rating

treatment (and to some extent in the exogenous treatment), players can also rate their partners for

strategic reasons or feedback oriented motives (reprisals and reward). The effects on trust and

trustworthiness of these additional motives are not clear cut. One can distinguish two potential

effects on trust. The first effect is a negative direct effect on trust and trustworthiness: available

information about one’s partner (i.e. feedback profile) and about received ratings is less reliable in

the presence of feedback-oriented and strategic ratings, which should translate into lower levels of

trust and trustworthiness. The second effect is indirect, through the rating strategies. Allowing

observability of ratings may incite individuals to assign fewer negative ratings. Indeed, subjects

may refrain from using negative ratings for fear of retaliation, which might dilute the effectiveness

of the system in increasing trust (see Denant-Boemont, Masclet and Noussair, 2007; Nikiforakis,

15 When a game is one-shot or repeated with strangers, two non-strategic motives are generally evoked in the literature to explain why subjects may be willing to assign negative (positive) points. A first non-strategic motive is related to negative (positive) emotions, such as anger and disapproval (approval). It relies on the idea that people react to unfair (fair) intentions by sacrificing a part of their payoffs in order to punish (reward) others, even when there are no reputation gains from doing so (Rabin (1993), Falk and Fischbacher (2006)). A second non-strategic reason to punish (reward) group members relies on distributional concerns such as inequality aversion (Fehr and Schmidt (1999), Falk, Fehr and Fischbacher (2005), Bolton and Ockenfels (1999)).


2008).16 In contrast, however, one should observe higher levels of positive ratings. For instance,

individuals may be willing to assign a positive rating because they expect that their partner will

subsequently reciprocate by sending them a positive rating. The increase in the total number of

positive ratings in players’ feedback profiles should enhance both trust and trustworthiness. To

what extent does such a gain in terms of higher investments induced by additional positive ratings

compensate for the loss in terms of fewer negative evaluations? Based on previous studies

that compared the effectiveness of reward/sanction mechanisms, one may reasonably suspect that

the negative effect should largely offset the positive effects of higher positive ratings (Andreoni

et al., 2003; Walker and Halloran, 2004; Sefton et al., 2007).17 Our conjecture is given in H2.

H2: Trust and trustworthiness should be higher when strategic rating and/or reciprocation are

ruled out.

Our third conjecture concerns the timing of the evaluations in the endogenous sequential

treatment. Based on our conjectures about strategic and feedback-oriented motives, we postulate

that if subjects anticipate the reaction of their counterpart, they should assign a positive rating in the

first phase of evaluation and evaluate negatively in the second phase of evaluation. This is

consistent with previous studies that have investigated the “Last minute feedback” effect (See,

for example, Resnick, Kuwabara, Zeckhauser, and Friedman (2000), Resnick and Zeckhauser

(2002), Klein et al. (2006); Cabral and Hortacsu (2009)). This is stated more precisely in H3.

16 Nikiforakis (2008) conducted a public good experiment with two rounds of sanctions. After the first round, each individual learns the punishment that every other individual assigned to him. He then has the opportunity to sanction those, and only those, who sanctioned him. This creates a second round of sanctions whose sole purpose is to avenge sanctions received, which is termed counterpunishment. Nikiforakis finds that the option to counterpunish almost entirely offsets the increase in contributions that the opportunity to punish creates.

17 Several studies have shown that punishment is more effective than rewards in promoting cooperation in the voluntary contribution mechanism (VCM) (Andreoni et al., 2003; Walker and Halloran, 2004; Sefton et al., 2007).


H3: One should observe more positive ratings in the first phase and more negative ratings in the

second phase of evaluation of the endogenous sequential treatment.

4. Experimental results

This section is organised as follows. Subsection 4.1 discusses the patterns of players’

investments in the trust game and considers the impact of rating on the first-stage decisions.

Subsection 4.2 analyses the determinants of rating in each rating system. In particular, we

examine the extent to which constraining rating rules can stimulate trust and fair trading.

4.1. Trust and trustworthiness with and without rating systems

Table 2 shows the average and standard deviations of the investments of players A and B in each

treatment. Player A's average investment is the highest in the simultaneous treatment (4.36),

followed by the exogenous sequential (4.17), endogenous sequential (3.02) and baseline (2.24)

treatments, but there is considerable heterogeneity between groups in all of the treatments. A

Mann-Whitney pairwise test comparing investments between treatments, under the assumption

that each session is a unit of observation, reveals greater investments in the exogenous sequential

and simultaneous treatments than in the baseline treatment (two-tailed tests: z=-2.714

with p<0.01; and z=-2.747 with p<0.01, respectively). We also observe higher investment levels

in the endogenous sequential treatment than in the baseline, although the difference is not

significant (two-tailed test: z=-1.14 with p>0.1). These results indicate that introducing a rating

system increases trust, and significantly so under the exogenous sequential and simultaneous rules.
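As an illustration of the pairwise comparison described above, the following minimal sketch computes a rank-sum z statistic with each session treated as one observation. The session averages are hypothetical and are not the data of this experiment. The sketch also assumes the plain normal approximation without tie or continuity correction, which may differ from the exact procedure used for the reported statistics.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_z(x, y):
    """Rank-sum z statistic (normal approximation, no tie or continuity correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                       # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0         # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mean_w) / sd_w

# Hypothetical session averages of player A's investment (NOT the paper's data).
baseline = [1.8, 2.0, 2.3, 2.5, 2.6]
simultaneous = [3.9, 4.1, 4.4, 4.6, 4.8]
z = mann_whitney_z(baseline, simultaneous)
```

A negative z here means that the first sample (the baseline sessions) tends to rank below the second, which is the sign convention of the statistics reported in this subsection.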

On average Players A send 0 units in about 45% of the cases in the baseline treatment.

This frequency is significantly lower for the treatments with ratings, in particular in the

exogenous sequential (25.5%) and simultaneous (26%) treatments. In contrast, investing a high

amount is more likely in the treatments with a rating opportunity. For example, the


frequency of choosing 10 units is 14.7% in the simultaneous treatment and only 5% in the

baseline treatment.

Further, average investment is significantly higher in the simultaneous treatment than in the

endogenous sequential rating treatment (two-tailed test: z = 1.715 with p<0.1).18 This finding

indicates that rating rules matter, and that the less constrained rating system (i.e. the endogenous

sequential rating system) is not necessarily the most effective at stimulating trust.

Figure 1 displays average investment by A-players per period for each treatment. It

confirms that for almost all periods the investment level is higher under treatments with rating

than in the baseline. Figure 1 also indicates that in all treatments investment levels decrease over

time. Comparing average investment in periods 1-5 and 16-20, we find that average

investment falls significantly in all treatments but one. The decrease is statistically significant in the

baseline (Wilcoxon test: z=2.366, p<0.05), in

the endogenous sequential treatment (z=1.99, p<0.05) and in the simultaneous treatment (z=2.366,

p<0.05).19
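The early-versus-late comparison can be sketched as a Wilcoxon signed-rank test on paired averages. The sketch below uses hypothetical data and the plain normal approximation (no tie or continuity correction); both are assumptions made for illustration. With seven paired differences that all have the same sign, this approximation gives z of about 2.366, which coincides with the value reported for the baseline and simultaneous treatments. That the reported tests rest on seven same-signed differences is an inference, not something stated in the text.

```python
import math

def wilcoxon_signed_rank_z(before, after):
    """z statistic for the Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped; tied absolute differences get average ranks.
    No tie or continuity correction is applied.
    """
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)   # ranks of declines
    mean_w = n * (n + 1) / 4.0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (w_plus - mean_w) / sd_w

# Hypothetical averages for seven independent units (NOT the paper's data).
early = [3.1, 2.8, 2.5, 3.4, 2.2, 2.9, 3.0]   # periods 1-5
late = [1.9, 2.0, 1.2, 2.1, 1.8, 1.4, 2.3]    # periods 16-20
z = wilcoxon_signed_rank_z(early, late)
```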

[Figure 1 and Table 2: about here]

Table 2 also shows the average investments by player B and the average relative amounts

returned in each treatment. B-players invest significantly more in the rating treatments than in

the baseline treatment. They return an average of 4.21 (simultaneous treatment) and 3.98 units

(exogenous sequential treatment), which is significantly higher than the 1.45 units in the baseline

treatment (z=-2.875 with p<0.01, and z=-2.429 with p<0.05, respectively for the simultaneous

and exogenous sequential treatments). However, the average return in the endogenous sequential treatment

(2.8 units) is not significantly different from that obtained in the baseline treatment (z=-1.571

with p>0.1). Comparing the different rating systems, we note that the average investment level in

18 The differences between endogenous and exogenous sequential treatments, and between exogenous sequential and simultaneous treatments, are not significant.

19 No significant difference is found in the exogenous sequential treatment (z=0.524, p>0.1).


the simultaneous treatment is significantly greater than in the endogenous sequential treatment

(z=1.64 with p<0.1). The other differences between rating systems are not statistically

significant.

Figure 2 shows player B’s investment for each level of player A’s investment. It indicates that for

all treatments, the level of player B’s investment is strongly and positively correlated with player

A's investment. It also shows that for each level of player A’s investment, players B react more

strongly in the treatments with rating than in the baseline.

Finally, Figure 3 shows player B’s average return over time for each treatment. It confirms our

previous findings that player B’s average investment is higher under treatments with a rating

system, and in particular under the simultaneous treatment compared to the baseline treatment.

Figure 3 also shows that average return falls over time in all treatments. Comparing periods 1-5

and 16-20, this fall is significant in all treatments (p=0.0180, p=0.0464, p=0.0747 and p=0.0280

for the baseline, endogenous sequential, exogenous sequential and simultaneous treatments,

respectively).

[Figures 2 and 3 : about here]

To sum up, our experimental results are quite consistent with the hypothesis that introducing a

rating system improves cooperation, in particular when the rating system prevents the adoption of

strategic behavior.

4.2. The determinants of trust and trustworthiness

To measure the relative impact of each of the three rating systems on trust and

trustworthiness, we estimated the influence of rating on investments and return. The results are

shown in Table 3. Table 3 consists of two parts. The left part displays the results of three


regressions in which the dependent variable is the amount sent by players A. The use of Tobit

models is justified by the number of left-censored observations in the sample. In addition, since

each subject is observed a number of times, we appeal to panel data methods, and estimate all of

the regressions with random effects. The right part shows the results of regressions in which the

dependent variable is Player B’s rate of return.20 These regressions are estimated by random-effects

generalized least squares (GLS) with standard errors clustered at the session level.
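To make the role of left-censoring concrete, the sketch below writes out the log-likelihood of a pooled Tobit model censored at zero. It is a simplified illustration only: it omits the individual random effects used in the estimations described above, and the function name, the inputs and the numerical values are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, xb, sigma, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    y     : observed outcomes (e.g. amounts sent)
    xb    : linear predictions x'beta for the latent variable
    sigma : standard deviation of the latent error
    """
    total = 0.0
    for yi, mi in zip(y, xb):
        if yi <= lower:
            # Censored: we only know the latent value is at or below the bound.
            total += math.log(norm_cdf((lower - mi) / sigma))
        else:
            # Uncensored: ordinary normal density of the residual.
            total += math.log(norm_pdf((yi - mi) / sigma) / sigma)
    return total

# Hypothetical values (NOT the paper's data): two zero investments, two positive.
ll = tobit_loglik(y=[0.0, 0.0, 4.0, 7.0], xb=[0.0, 2.0, 4.0, 6.0], sigma=2.0)
```

Observations at zero contribute through the probability of censoring rather than through a density, which is why ordinary least squares would be biased when many investments equal zero.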

The explanatory variables for the amount invested and the rate of return in period t are the amount

received in period t-1 from the previous partner and a set of variables describing the current

partner's profile. The "positive rating in t-1" ("negative rating in t-1", respectively) variable

indicates whether the partner received a positive (negative, respectively) rating in the previous

period. These variables are interpreted in comparison with the omitted variable “No rating in

previous periods”. The cumulative positive rating variable takes the value of the sum of positive

ratings since the beginning of the game. The cumulative negative rating variable is constructed

symmetrically. The impact of feedback systems on player investments is measured by means of a

dummy variable for each rating system (simultaneous, exogenous sequential and endogenous

sequential rating treatment). We also introduced a trend variable (period) and a dummy variable

for the final period (period_20).
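The feedback-profile regressors described above can be sketched as follows. The encoding of ratings as +1, -1 and 0, the variable names, and the convention that cumulative counts run up to period t-1 are assumptions made for this example; the text does not specify them.

```python
def feedback_profile(ratings_received, t):
    """Build the rating regressors for period t from a partner's rating history.

    ratings_received[k] is the rating the partner received in period k + 1:
    +1 (positive), -1 (negative) or 0 (no rating). This encoding is an
    assumption made for the example.
    """
    history = ratings_received[: t - 1]           # periods 1 .. t-1
    last = history[-1] if history else 0
    return {
        "positive_rating_t_minus_1": int(last == 1),
        "negative_rating_t_minus_1": int(last == -1),
        "cumulative_positive": sum(1 for r in history if r == 1),
        "cumulative_negative": sum(1 for r in history if r == -1),
        "period": t,                               # trend variable
        "period_20": int(t == 20),                 # final-period dummy
    }

# Hypothetical history for periods 1-5 (NOT the paper's data).
profile = feedback_profile([1, 0, -1, 1, 1], t=5)
```

When both lagged dummies are zero, the observation falls in the omitted "no rating" category against which the two dummies are interpreted.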


Column (1) of table 3 indicates that the introduction of a rating system has a positive and

significant effect on the amount sent by player A. The marginal effects indicate that playing the

simultaneous treatment is associated with a statistically significant increase in the amount sent by

player A of about 10 percentage points relative to the baseline treatment. The marginal effects

20 Since the amount sent back by Player B is bounded by the amount sent by Player A, it would be misleading to use the amount returned by player B to measure trustworthiness. Only relative returns for positive amounts sent by player A are considered here, since players B cannot react to a zero amount received from players A.


0.095 and 0.055 associated with the variables “exogenous sequential treatment” and “endogenous

sequential treatment” show that individuals who play the exogenous and endogenous sequential treatments increase

their investment by 9.5 and 5.5 percentage points with respect to the baseline treatment,

respectively. Column (2) of Table 3 also shows that the amount sent by player A depends on the

amount received from player B during the previous period. Player A is thus more

strongly inclined to trust player B (i.e. to send him a higher share of his endowment) if he

has received a greater sum from his previous partner. Quantitatively, our results imply that a

10% increase in the amount received from player B during the previous period increases the total

amount sent by 1.19 percentage points, according to column (2). Finally the negative coefficient

associated to the trend variable shows that the investment level falls over time, even with the

introduction of a rating system.

Column (3) of Table 3 shows that the amount sent by Player A depends on the

partner's feedback profile (i.e. her past ratings). More interestingly, players take into account the

full history of past ratings and do not only focus on the most recent rating. Player A will

therefore increase investment if the number of positive ratings received by her counterpart

exceeds the number of negative ratings.

When comparing the treatments with rating, column (3) indicates that players A invest

significantly more in the simultaneous rating and exogenous sequential rating treatments than in

the endogenous sequential treatment. This finding indicates that more

restrictive rating rules (ratings with a predetermined order or simultaneous ratings) lead to

greater trust. More precisely, those who play the simultaneous treatment and the exogenous sequential

treatment increase their investment by 5.9 and 5.4 percentage points, respectively, relative to the endogenous sequential

treatment. This supports the view that rating systems that

reduce strategic rating or reciprocation lead to greater trust.


Turning next to the determinants of players B’s relative returns, Table 3 indicates that the

introduction of a rating system also significantly improves trustworthiness (see columns 4 and 5),

as shown by the positive and significant coefficients associated with the treatment variables.

The coefficients associated to the variables “simultaneous rating” and “exogenous sequential

rating” as well as the variables associated with the partner's feedback profile are not significant in

Column (6).21 A possible reason is that the partner's feedback profile

is less informative for players B than for players A, since players B can condition their decision

on the amount received from player A in the current period.22

To sum up, these findings indicate that the introduction of a rating system has a positive effect on

both trust and trustworthiness. We also find that the simultaneous system provides better results

compared to sequential rating systems. In particular players A are more inclined to trust player B

under the simultaneous rating system.

4.3. Motivations for rating one's counterpart and limitations of the current system

In this section, we consider the motivations for rating one's partner. Table 4 provides information

about the determinants of rating in each treatment. First, consistent with outcome-oriented

motivations, ratings are strongly correlated with investment levels. Table 4 shows that players

assign negative ratings for low investment levels. In contrast, positive ratings are more likely to

be assigned to high investment levels. Second, table 4 indicates that most of the negative ratings

are assigned in phase 2 of the endogenous sequential and exogenous sequential treatments while

21 To compare the relative impact of the partner's feedback profile across the treatments with rating, we also estimated the determinants of trust and trustworthiness in separate regressions including the interaction variables “treatment*cumulative positive rating” and “treatment*positive (negative) rating in t-1” (not reported here but available upon request). The insignificant coefficients associated with these variables indicate that the impact of the partner's feedback profile is not significantly different across treatments.

22 As shown by Figure 2, there is a strong positive correlation between the “amount sent by player A” and the amount returned by player B. Due to endogeneity problems, the variable “amount sent by player A” could not be included in the estimates presented in the right part of Table 3.


positive ratings are generally given in the first phase. Consistent with hypothesis H3, these

findings indicate that players assign negative ratings in the second phase to escape retaliation (a

last minute rating strategy), while evaluating positively in the first phase may induce the counterpart to

reciprocate by evaluating positively.


In order to provide more formal evidence of the determinants of evaluation, table 5

presents results of several regressions on the probability of rating one’s partner as well as on the

probability of assigning a negative (positive) evaluation. As the probability of submitting a

negative (positive) rating actually depends upon the probability of submitting a rating, we

considered two separable decisions using a two-step estimation procedure: first the decision to

evaluate someone and second, conditional on the decision to evaluate, the choice of assigning a

negative evaluation.


Table 5 consists of two panels. The left panel displays the results of regressions in which

the dependent variable is A’s rating of B. The right panel presents the results for the

determinants of B’s rating. Columns (1) and (5) show the results of the random-effects probit

on the decision to evaluate one’s counterpart. This selection probit is used to compute the

inverse Mills ratio (IMR). The independent variables “sent amount” and “received

amount” are included in the estimates on the probability of evaluating someone. We also

included dummies for each treatment with rating as well as some trend variables.
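For readers unfamiliar with the correction term, the inverse Mills ratio for an observation with fitted probit index z is the standard normal density divided by the standard normal distribution function, both evaluated at z. The sketch below computes it for a few hypothetical index values; it illustrates the textbook formula and is not taken from the estimation code behind Table 5.

```python
import math

def inverse_mills_ratio(z):
    """phi(z) / Phi(z) for a probit index z (standard normal pdf over cdf)."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf

# Hypothetical fitted probit indices for three observations in which a rating
# was submitted (NOT values from the paper).
indices = [-1.0, 0.0, 1.5]
imr = [inverse_mills_ratio(z) for z in indices]
```

The ratio is larger for observations that were unlikely to rate at all (low index), so including it as a regressor in the second step absorbs the selection effect.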

We then model the decision to rate negatively, conditional on the decision to evaluate,

correcting for potential selection bias. The negative rating decision is explained by the

following variables: the “amount sent”, the “amount returned by the partner” and


treatment dummies. Regressions (2) and (6) refer to all treatments with rating while regressions

(3) and (7) refer to the sequential endogenous treatment only. The last regression in each panel

refers to the sequential exogenous treatment.

Regressions (3), (4), (7) and (8) include several additional explanatory variables: two

dummy variables “received a negative rating” and “received a positive rating” from the

counterpart and a dummy “Rating during phase 1” that equals 1 if the individual assigned her

rating during Phase 1 and 0 otherwise. The first two variables aim at capturing feedback-oriented

ratings (both retaliation and positive reciprocity) while the last variable enables us to identify the

presence of strategic rating motivations.

Columns (1) and (5) of Table 5 show that the probability of evaluating one’s counterpart

increases with the amount sent. The negative and significant coefficients associated to the trend

variable indicate that the probability of rating someone significantly declines over time. A

possible reason is that, as cooperation emerges, less evaluation is needed over time. Note that

similar findings are observed in VCM experiments with punishment. Finally, the coefficients on the

treatment dummies are not significant (or significant at the 10% level only, for the endogenous sequential

treatment), suggesting that the overall level of evaluation is not significantly

different across treatments. Non-parametric tests report similar results.

Turning next to the determinants of negatively evaluating someone, Table 5 also indicates that, for

both players, the probability of negatively rating one's partner decreases with the amount received. The

coefficients associated with the dummy variables “endogenous sequential treatment” and

“exogenous sequential treatment” are not significant in column (2) but negative and significant

in column (6), indicating that the probability that players B rate their counterpart negatively is

significantly lower under the sequential treatments (both endogenous and exogenous) than

under the simultaneous treatment, the reference category. This finding is to some extent


consistent with our conjecture that people refrain from evaluating

negatively when reprisals are possible.

The variable “Received a negative rating” has a positive and highly significant

coefficient in columns (3) and (7). These findings confirm that in the endogenous sequential treatment players use

negative ratings as a means of reprisal against partners who assign them a negative rating

(feedback-oriented rating). This variable is insignificant in the estimates that refer to the

exogenous sequential treatment (see columns (4) and (8)). This suggests that forcing

individuals to evaluate in a pre-defined order reduces retaliation. In all estimates, the coefficient

associated with the variable “Received a positive rating” is not significant.

Further, the negative and significant coefficients on "Rating during Phase 1" in columns

(3) and (7) show that rating immediately reduces the probability of a negative rating in the

endogenous sequential treatment. These findings support our conjecture about strategic ratings:

those who want to rate their counterpart negatively tend to wait for the second phase (if possible)

to avoid reprisals, whereas those who post a positive rating have a strong incentive to do so as

quickly as possible (i.e. in the first phase of the game). These findings are consistent with

hypothesis H3. Regression (8) also reports a negative coefficient associated with this variable,

although it is significant only at the 10% level, indicating that the exogenous sequential rating treatment

reduces strategic motives to some extent compared to the endogenous sequential rating treatment.

To summarize, our data reveal that both players use negative ratings as a means of reprisal and

tend to wait (if possible) for the last phase to assign these ratings. Restricting the opportunity to

choose the moment of evaluation tends to reduce both retaliation and strategic ratings.

4.4. The effects of receiving evaluations


This section concerns the relationship between the evaluation (positive or negative) a subject

receives and the change in her decision in the next period. Indeed even if our game is played

under a stranger matching protocol, one might suspect that receiving an evaluation in the

current period may influence one’s investment decision in the next period. The variable

“received evaluation in t-1” was deliberately not included in the regressions shown in Table 3, due

to a potential autocorrelation problem. For example, players who received

negative points in t-1 are probably low investors who are likely to continue to underinvest in the

next periods. To circumvent this difficulty, we estimated how receiving evaluation in t-1 may

change the investment of individual i between period t-1 and t. The results are shown in table 6.

The first column displays the estimate of the regressions in which the dependent variable is the

change in the amount sent by player A between period t-1 and t. The second column shows the

results of the regression in which the dependent variable is the change in Player B’s rate of

return. The independent variables include dummies for having received a positive (negative) rating

in the previous period and interaction variables “received positive (negative) rating*treatment”.

[Table 6 about here]

Table 6 shows that receiving a negative evaluation raises both the level of investment and the rate of

return. Players A who received a negative evaluation in period t-1 of the simultaneous treatment

tend to increase their investment in t by 1.869 units with respect to the reference state with no

evaluation received. Player A’s investment increases by 1.02 and 0.70 units for participants

playing the endogenous and the exogenous sequential rating treatments, respectively.23 The

effect of receiving a negative evaluation is lower under the sequential endogenous and exogenous

treatment compared to the simultaneous treatment. This finding confirms our conjecture about

23 Using the results from Table 6, the calculation for the endogenous sequential treatment is as follows: with respect to the reference state, we add the coefficient 1.869 of “Received neg. rating in t-1 (all treatments)” to the coefficient -0.850 of “Received neg. rating in t-1 and endo treat”.


the lower reliability of information about received ratings in the presence of

feedback-oriented and strategic ratings. A negative and significant effect is found for those who received

a positive rating in the previous period. More precisely, those who were positively evaluated in t-1 tend to

reduce their investment or their rate of return in the current period. Both results seem to indicate

that individuals tend to adjust their investment level toward a norm that would correspond to

receiving no rating.
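The treatment-specific effects of a negative rating quoted above combine a common coefficient with a treatment interaction. The short calculation below reproduces the figure for the endogenous sequential treatment from the two coefficients given in footnote 23. The interaction coefficient for the exogenous sequential treatment is not quoted in the text, so the value computed here is only what the reported total of 0.70 would imply.

```python
# Coefficients quoted in the text and in footnote 23 (Table 6).
neg_rating_all = 1.869        # "Received neg. rating in t-1 (all treatments)"
neg_rating_x_endo = -0.850    # "Received neg. rating in t-1 and endo treat"

# Simultaneous treatment is the reference: only the common coefficient applies.
effect_simultaneous = neg_rating_all
# Endogenous sequential treatment: common coefficient plus interaction.
effect_endogenous = neg_rating_all + neg_rating_x_endo

# The exogenous interaction coefficient is not quoted in the text; the value
# below is back-calculated from the reported total effect of 0.70 units.
reported_effect_exogenous = 0.70
implied_x_exo = reported_effect_exogenous - neg_rating_all
```

The endogenous sequential total rounds to the 1.02 units reported in the text.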

Similar results are obtained for the amount returned by player B. Players B who received

a negative evaluation in period t improve their return to A in the next period. In contrast, those

who received a positive evaluation tend to adjust their investment by reducing their return in the

next period.

4.5. Welfare levels

In all treatments, player B’s payoffs are significantly higher than player A's (Wilcoxon tests,

p<0.005). Payoff dispersion is also higher for B-players than for A-players. This result is

confirmed in Figure 4, which displays average payoff over time in each treatment.

[Figure 4: about here]

Player A’s payoff is on average 9.21 in the baseline treatment compared to 9.44, 9.57 and

9.62 in the endogenous sequential, exogenous sequential and simultaneous treatments,

respectively. The difference in total earnings between the baseline treatment and the

simultaneous treatment is significant at the p < 0.1 level, according to a Mann-Whitney rank-sum

test. However, the two other treatments with rating systems do not generate significantly higher

earnings than the baseline treatment.

Player B’s average payoff also increases from 15.27 in the baseline treatment to 16.01,

18.3 and 18.65 in the endogenous sequential, exogenous sequential and simultaneous treatments.


The differences between the baseline and both the simultaneous treatment and the exogenous

sequential treatment are significant (z=-2.492 with p=0.012, and z=-2.429 with p=0.0127,

respectively). However, the difference between the baseline treatment and the endogenous

sequential treatment is not significant (z=-0.429, p=0.668). These results indicate that the

introduction of rating systems is welfare-improving, but only if reciprocation and retaliation are

ruled out.

5. Conclusion

Evidence for the role of reputation feedback mechanisms in influencing behavior remains

elusive. In this paper we have examined the effects of introducing rating systems on trust and

trustworthiness between anonymous traders. We have also compared different rating mechanisms

that vary in the way subjects assign ratings and the information about the opponent’s rating

decision. The aim was to investigate to what extent introducing some slight changes in rating

rules might improve the efficiency of feedback systems and stimulate cooperation among

anonymous individuals. We have four key findings.

First, we provide new evidence of the benefits of rating mechanisms like those introduced

on eBay or AmazonMarketPlace. Our data indicate that individuals do not hesitate to evaluate

their counterpart even when it is costly and that trust is significantly improved by the introduction

of reputation feedback mechanisms.

Second, we find that ratings are strongly correlated with investment levels, which is

consistent with outcome-oriented motivation. Players assign positive (negative) ratings for high

(low) investment levels. However our findings also reveal that some individuals do not hesitate

to use ratings for strategic or reciprocation motives when it is possible. For instance individuals

use negative ratings as a means of reprisal against those who evaluated them negatively.


Third, our data reveal that people choose the timing of the evaluations strategically in the

endogenous sequential treatment: most of them assign negative evaluations in the second phase

to avoid reprisals and send positive evaluations in the first phase of the game.

Last, we find that both trust and trustworthiness are significantly higher in the

simultaneous rating system that reduces both retaliation and strategic ratings compared to the

sequential endogenous and exogenous systems. These findings are consistent with Resnick and

Zeckhauser (2002), Dellarocas et al. (2004) and Dellarocas et al. (2006), who underline the

informational bias in eBay feedback due to rating reciprocity and fear of retaliation.

One implication of our work is that designing reputation feedback systems in which ratings are

kept secret could limit strategic ratings and retaliation, and would thus be more effective in enhancing

trust. Unconstrained rating mechanisms may produce informational bias and can erode the level

of trust. They can be improved by constraining the partners to rate each other simultaneously

(i.e. by reducing feedback-oriented and strategic rating motivations).

Recently eBay has introduced some changes in its rating system, in order to improve the accuracy

of evaluation. The main change in eBay’s system is that sellers are no longer able to leave negative

or neutral ratings about buyers. These modifications in the rating system are likely to reduce

retaliation. However these changes may be insufficient since they do not rule out strategic

ratings and positive reciprocation.24 Moreover, they have triggered strong negative reactions

from many sellers who were frustrated at being deprived of their right to punish opportunistic

buyers. As a consequence, some decided to boycott eBay or to switch to AmazonMarketplace.

This protest highlights how any change in the design of a marketplace should be tested

and discussed with the community of users. A feedback system that enables both

buyers and sellers to send positive or negative ratings simultaneously, by constraining the time

available to evaluate, would probably be more acceptable to eBay’s users and would reduce all non-outcome-oriented ratings.

24 Furthermore, sellers who have been deprived of the use of negative evaluations may also have a strong incentive to substitute “no rating” for a negative rating to signal their disapproval. As a consequence, a non-evaluation from a seller could be interpreted as a negative signal.

This study thus paves the way for examining the appropriate design of virtual

communities and electronic marketplaces. It also confirms the interest electronic marketplace

managers have in performing laboratory experiments prior to implementing or changing the

design of their platforms.

References

Anderson, C. M. and Putterman, L., (2006). "Do Non-Strategic Sanctions Obey the Law of Demand? The Demand for Punishment in the Voluntary Contribution Mechanism." Games and Economic Behavior, 54(1), 1-24.

Andreoni J. Harbaugh W. and Vesterlund L. (2003), “The Carrot or the Stick: Rewards, Punishments and Cooperation”, American Economic Review, Vol. 93, No 3 pp. 893-902.

Ba, S. and Pavlou P., (2002) “Evidence of the Effect of Trust Building Technology in Electronic markets: Price Premiums and Buyer Behavior.” MIS Quarterly 26(3).

Berg J., Dickhaut, J. and McCabe K. (1995), "Trust, Reciprocity and Social History", Games and Economic Behavior, 10, pp. 122-142.

Bochet, O, Page, T. and Putterman, L., (Forthcoming). "Communication and Punishment in Voluntary Contribution Experiments." Journal of Economic Behavior and Organization.

Bolton, G. E. and Ockenfels, A., (2000). "ERC: A Theory of Equity, Reciprocity, and Competition." American Economic Review, 90(1), 166-93.

Bolton G., Loebbecke C. and Ockenfels A. (2006) “How Social Reputation Networks Interact with Competition in Anonymous Online Trading: An Experimental Study”, University of Cologne, Working Paper Series in Economics N°32.

Bolton G. Katok E. and Ockenfels A. (2004) “How effective are electronic reputation mechanisms? An experimental investigation”, Management Science, 50 (11), pp. 1587-1602.

Buchan, Nancy; Croson, Rachel T.A. and Johnson, Eric J. (2003) “Trust and Reciprocity: An International Experiment”, working paper.

Burks S., Carpenter J. and Verhoogen E. (2003) “Playing both roles in the trust game”, Journal of Economic Behavior and Organization, vol. 2, pp. 195-216.

Cabral L., Hortacsu A. (2009), ‘‘The Dynamics of Seller Reputation: Theory and Evidence from eBay’’. Forthcoming in Journal of Industrial Economics.

Carpenter, J.P., (2006). "Punishing Free-Riders: How Group Size Affects Mutual Monitoring and the Provision of Public Goods." Games and Economic Behavior, (Forthcoming).

Carpenter, J.P., (2007). " The demand for punishment" Journal of Economic Behavior & Organization. 62 522–542

Carpenter, J.P.; Matthews, P. and Ong'ong'a, O., (2004). "Why Punish? Social Reciprocity and the Enforcement of Pro-Social Norms." Journal of Evolutionary Economics, 14.

Chaudhuri A. and L. Gangadharan (2003) “Gender Differences in Trust Games and reciprocity” working paper.


Croson, R and N. Buchan, (1999), “Gender and Culture: International Experimental Evidence from Trust Games”, American Economic Review Papers and Proceedings, v. 89(2), 386-91.

Dellarocas C. (2006), "Reputation Mechanisms", in T. Hendershott, (ed.), Handbook on Economics and Information Systems. Elsevier.

Dellarocas C. (2003), "The Digitization of Word-of-Mouth: Promise and Challenges of Online Feedback Mechanisms", Management Science, 49 (10), pp. 1404-1427.

Dellarocas C., Dini F. and Spagnolo G. (2006) “Designing Reputation (Feedback) Mechanisms”, in Handbook of Procurement, Cambridge University Press.

Dellarocas C., Fan M. and Wood C. (2004), “Self-Interest, Reciprocity and Participation in Online Reputation Systems”, MIT Sloan, Working Paper 4500-04.

Denant-Boemont L., Masclet D. and Noussair C. (2007) "Punishment, Counterpunishment and Sanction Enforcement in a Social Dilemma Experiment", Economic Theory, vol. 33, no. 1, 145-167.

Dini F., Spagnolo G. (2007), “Buying Reputation on eBay”, Quaderno Consip VII.

Duffy, J., and N. Feltovich. 2006. Words, Deeds and Lies: Strategic Behavior in Games with Multiple Signals. Review of Economic Studies, 73, 669-688.

Duffy, J., and N. Feltovich. 2002. Do Actions Speak Louder than Words? An Experimental Comparison of Observation and Cheap Talk, Games and Economic Behavior, 39, 1-27.

Falk, A.; Fehr, E. and Fischbacher, U., (2005). "Driving Forces Behind Informal Sanctions." Econometrica, 73 (6), 2017-2030.

Falk, A. and Fischbacher, U. (2006), "A Theory of Reciprocity", Games and Economic Behavior, 54 (2), 293-315.

Fehr, E. and Gächter, S., (2002). "Altruistic Punishment in Humans." Nature, 415(10), 137-40.

____, (2000). "Cooperation and Punishment in Public Goods Experiments." American Economic Review, 90(4), 980-94.

Gazzale, R. and Khopkar, T. (2007) “Remain Silent and Ye Shall Suffer: Seller Exploitation of Reticent Buyers in an Experimental Reputation System”, Working Paper.

Houser, D., Wooders, J. (2006), ‘‘Reputation in Auctions: Theory and Evidence from eBay’’, Journal of Economics and Management Strategy, 15, pp. 353-369.

Keser, C. (2003), "Experimental Games for the Design of Reputation Management Systems", IBM Systems Journal 42(3), pp. 498-506.

Klein T. J., Lambert C., Spagnolo G., Stahl K. O. (2006), “Last Minute Feedback”, Governance and the Efficiency of Economic Systems.

Masclet, David; Noussair, Charles; Tucker, Steve and Villeval, Marie-Claire, (2003). "Monetary and Non-Monetary Punishment in the Voluntary Contributions Mechanism." American Economic Review, 93(1), 366-80.

Neumark, David and Postlewaite, Andrew, (1998). "Relative Income Concerns and the Rise in Married Women's Employment." Journal of Public Economics, 70, pp. 157-83.

Nikiforakis, Nikos, and Normann, Hans-Theo, (2008). "A Comparative Statics Analysis of Punishment in Public-Good Experiments." Experimental Economics, Vol. 11 (4), pp. 358-369.

Nikiforakis, Nikos, Normann, Hans-Theo, Wallace, Brian (2010). "Asymmetric Punishments in Public-Good Experiments." Royal Holloway, forthcoming in Southern Economic Journal.

Nikiforakis, Nikos (2008). "Punishment and Counter-Punishment in Public Good Games: Can We Really Govern Ourselves?" Journal of Public Economics, 92, 91-112.

Ortmann, A., J. Fitzgerald, and C. Boeing, 2000, Trust, reciprocity, and social history: A Re-examination, Experimental Economics 3, 81-100.

Price, Michael E.; Cosmides, Leda and Tooby, John, (2002). "Punitive Sentiment as an Anti-Free Rider Psychological Device." Evolution and Human Behavior, 23, 203-31.

Rabin, Matthew, (1993). "Incorporating Fairness into Game Theory and Economics." American Economic Review, 83, 1281-1302.


Resnick, P., K. Kuwabara, R. Zeckhauser, and E. Friedman (2000): "Reputation systems," Communications of the ACM, 43(12), 45-48.

Resnick P. and Zeckhauser R. (2002), "Trust Among Strangers in Internet Transactions: Empirical Analysis of eBay's Reputation System", in Michael Baye, ed., The Economics of the Internet and e-Commerce. Elsevier Science, pp. 127-157.

Resnick P., Zeckhauser R., Swanson J. and Lockwood K. (2006), "The Value of Reputation on eBay: A Controlled Experiment", Experimental Economics, 9(2), pp. 79-101.

Sefton, M., Shupp, R. and Walker, J. (2007), “The Effect of Rewards and Sanctions in Provision of Public Goods”, Economic Inquiry, Vol. 45, pp. 671-690.

Sell, J., and R. Wilson. 1991. Levels of Information and Contribution to Public Goods. Social Forces 70:107-124.

Walker, J.M and Halloran, M. (2004), “Rewards and Sanctions and the Provision of Public Goods in One Shot Settings”, Experimental Economics, Vol. 7, pp. 235-247.

Willinger Marc, Claudia Keser, Christopher Lohmann, Jean-Claude Usunier (2003) “A comparison of trust and reciprocity between France and Germany : experimental investigation based on the investment game” Journal of Economic Psychology, vol. 24, issue 4, p. 447-466


Table 1: Characteristics of the Experimental Sessions

Session  Subjects  Treatment     Session  Subjects  Treatment
1        10        Baseline      14       10        Exo. seq.
2        10        Baseline      15       10        Exo. seq.
3        8         Baseline      16       10        Exo. seq.
4        8         Baseline      17       8         Exo. seq.
5        10        Baseline      18       10        Exo. seq.
6        8         Baseline      19       10        Exo. seq.
7        10        Baseline      20       10        Simultaneous
8        10        Endo. seq.    21       10        Simultaneous
9        10        Endo. seq.    22       10        Simultaneous
10       10        Endo. seq.    23       10        Simultaneous
11       10        Endo. seq.    24       10        Simultaneous
12       10        Endo. seq.    25       10        Simultaneous
13       10        Endo. seq.    26       10        Simultaneous

Total: 252 subjects


Table 2. Player A's investment and Player B's return (all-period average and standard deviation, by session)

                               Player A's investment    Player B's return
Treatment       Session        Average    STD           Average    STD

Baseline        1              0.95       1.91          0.38       0.93
                2              1.64       2.52          0.63       1.62
                3              3.05       3.18          2.7        4.23
                4              2.1        2.37          1.025      1.84
                5              3.69       3.18          3.24       4.38
                6              2.9        3.57          1.52       3.67
                7              0.8        1.21          0.15       0.505
                Av. invest.    2.24       2.91          1.45       3.16
                Av. relative return: 21.57%

Endo seq eval   8              1.25       1.71          0.75       1.71
                9              1.81       2.02          1.24       2.39
                10             5.28       3.12          4.11       4.42
                11             3.18       2.39          3.26       4.60
                12             4.01       3.40          5.2        6.51
                13             2.6        3.58          2.11       5.03
                Av. invest.    3.02       3.09          2.77       4.67
                Av. relative return: 30.57%

Exo seq eval    14             3.26       3.69          2.48       4.48
                15             5.38       3.73          5.51       7.36
                16             4.425      2.81          5.2125     4.73
                17             4.5        3.49          4.42       5.06
                18             4.41       3.12          3.92       5.24
                19             3.1        3.47          2.64       3.72
                Av. invest.    4.17       3.50          3.98       5.35
                Av. relative return: 31.81%

Sim eval        20             3.11       3.98          3.16       6.89
                21             3.61       3.12          2.8        5.01
                22             3.45       3.01          3.54       6.26
                23             4.76       3.45          3.29       4.42
                24             4.64       3.80          5.18       6.70
                25             5.26       3.82          6.19       7.04
                26             5.69       3.89          5.32       6.67
                Av. invest.    4.36       3.70          4.21       6.32
                Av. relative return: 32.18%
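The average relative returns reported in Table 2 appear consistent with dividing Player B's average return by three times Player A's average investment. The sketch below checks this using only the treatment averages from the table. The multiplier of 3 is an assumption (the standard trust-game tripling) and is not stated in the table itself; the names `TABLE2_AVERAGES`, `MULTIPLIER` and `relative_return` are illustrative only.

```python
# Treatment averages copied from Table 2:
# (Player A's average investment, Player B's average return, reported relative return in %).
TABLE2_AVERAGES = {
    "Baseline": (2.24, 1.45, 21.57),
    "Endo seq eval": (3.02, 2.77, 30.57),
    "Exo seq eval": (4.17, 3.98, 31.81),
    "Sim eval": (4.36, 4.21, 32.18),
}

# ASSUMPTION: the amount sent by Player A is tripled before reaching Player B,
# as in the standard trust game. Table 2 does not state the multiplier.
MULTIPLIER = 3


def relative_return(investment, returned, multiplier=MULTIPLIER):
    """Share (in %) of the multiplied investment that Player B sends back."""
    return 100 * returned / (multiplier * investment)


for treatment, (invest, ret, reported) in TABLE2_AVERAGES.items():
    computed = relative_return(invest, ret)
    print(f"{treatment}: computed {computed:.2f}%, reported {reported:.2f}%")
```

Under this assumption the computed shares differ from the reported ones by only a few hundredths of a percentage point. This is a consistency check on the averages, not necessarily the authors' procedure, which may average per-observation ratios instead.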


Table 3: Determinants of the amounts invested by participants

Models                                RE Tobit (marginal effects)(a)     RE GLS(b)
Dep. variable                         Player A's investment              Relative return from Player B(e)
                                      All tr.    All tr.    Rating tr.  All tr.    All tr.    Rating tr.
                                      (1)        (2)        (3)         (4)        (5)        (6)

Baseline                              Ref.       Ref.                   Ref.       Ref.
Simultaneous rating                   0.106***   0.108***   0.059**     0.102**    0.101**    0.030
                                      (0.022)    (0.022)    (0.024)     (0.042)    (0.044)    (0.028)
Exogenous sequential rating           0.095***   0.098***   0.054**     0.095**    0.092**    0.023
                                      (0.020)    (0.020)    (0.022)     (0.044)    (0.043)    (0.027)
Endogenous sequential rating          0.055*     0.051*     Ref.        0.070*     0.070*     Ref.
                                      (0.028)    (0.030)                (0.041)    (0.044)
Amount received from the partner                 0.119***   0.087***               0.001      0.001
  in t-1                                         (0.021)    (0.020)                (0.002)    (0.002)
Cumulative pos. ratings                                     0.028***                          -0.004
  (partner's profile)                                       (0.006)                           (0.008)
Cumulative neg. ratings                                     -0.021***                         0.005
  (partner's profile)                                       (0.004)                           (0.008)
Pos. rating in t-1                                          0.0009                            0.034*
  (partner's profile)                                       (0.014)                           (0.013)
Neg. rating in t-1                                          -0.001                            0.029
  (partner's profile)                                       (0.010)                           (0.028)
Period                                -0.009***  -0.009***  -0.004***   -0.005***  -0.005***  -0.004*
                                      (0.001)    (0.021)    (0.001)     (0.002)    (0.002)    (0.002)
Period 20                             0.010      0.009      -0.015      -0.123***  -0.125***  -0.140***
                                      (0.017)    (0.018)    (0.021)     (0.019)    (0.020)    (0.037)
Constant                                                                0.251***   0.243***   0.307***
                                                                        (0.041)    (0.044)    (0.040)

Observations                          2394       2394       1786        1687       1574       1247
Left cens.                            820        820        539
Right cens.                           209        209        180

Notes: (a) RE Tobit = Random Effects Tobit; (b) RE GLS = Random Effects Generalized Least Squares; *** significant at the 0.01 level, ** at the 0.05 level, * at the 0.1 level; standard errors in parentheses, clustered at the session level; (e) only relative returns for positive amounts sent by Player A are considered here.


Table 4. Distribution of ratings per treatment and per type of player

                                     Rating of player A                           Rating of player B
                                     All      Seq. rating        Player B's      All      Seq. rating        Player A's
Treatment              Rating                 Phase 1  Phase 2   invest.                  Phase 1  Phase 2   invest.

Simultaneous           No rating     77.71%                      3.64            78.71%                      4.23
                       Negative      16%                         2.8             11.57%                      1.71
                       Positive      6.29%                       14.81           9.71%                       8.53

Exogenous sequential   No rating     75.34%                      3.42            77.76%                      3.69
                       Negative      17.59%   70.27%   72.46%    3.14            7.41%    20.55%   50%       2.25
                       Positive      7.07%    29.23%   27.54%    12.14           14.83%   79.45%   50%       7.60

Endogenous sequential  No rating     72.83%                      2.49            72.67%                      2.89
                       Negative      20.17%   61.54%   90.28%    1.98            10.50%   30.51%   58.70%    1.5
                       Positive      7%       38.46%   9.72%     11.23           16.83%   69.49%   41.30%    6.33


Table 5: The determinants of ratings (Probit with random effects for the probability of negative rating)

                            A's rating of B                               B's rating of A
                            All treat. with rating Sequ.endo. Sequ.exo.   All treat. with rating Sequ.endo. Sequ.exo.
                            Prob.      Prob.       Prob.      Prob.       Prob.      Prob.       Prob.      Prob.
                            eval.      neg. eval.  neg. eval. neg. eval.  eval.      neg. eval.  neg. eval. neg. eval.
                            (1)        (2)         (3)        (4)         (5)        (6)         (7)        (8)

Received amount             -0.032***  -0.322***   -0.356***  -0.282***   0.022      -0.395***   -0.498***  -0.293***
                            (0.007)    (0.036)     (0.079)    (0.062)     (0.014)    (0.046)     (0.090)    (0.109)
Sent amount                 0.180***   0.358***    0.545***   0.317*      0.029***   -0.033      0.074      -0.172*
                            (0.016)    (0.096)     (0.179)    (0.185)     (0.008)    (0.027)     (0.052)    (0.101)
Sim. treat.                 Ref.       Ref.                               Ref.       Ref.
Seq. exo.                   0.179      -0.100                             0.109      -0.725**
                            (0.226)    (0.446)                            (0.202)    (0.293)
Seq. endo.                  0.413*     0.127                              0.248      -1.279***
                            (0.227)    (0.474)                            (0.203)    (0.312)
Received no rating                                 Ref.       Ref.                               Ref.       Ref.
Received a negative rating                         1.798**    1.047                              0.870**    0.588
                                                   (0.787)    (0.904)                            (0.399)    (0.663)
Received a positive rating                         -0.426     -0.617                             -0.547     -0.091
                                                   (0.394)    (0.587)                            (0.555)    (1.045)
Rating during Phase 1                              -1.251***  -0.283                             -0.815***  -0.816*
                                                   (0.411)    (0.439)                            (0.313)    (0.453)
Period                      -0.047***                                     -0.030***
                            (0.007)                                       (0.006)
Period20                    -0.210                                        -0.045
                            (0.231)                                       (0.198)
IMR                                    1.091*      2.057**    -0.542                 -0.568      -0.345     0.992
                                       (0.586)     (1.014)    (2.258)                (0.672)     (1.014)    (2.050)
Constant                    -0.129***  -0.716      -1.777     -0.542      -0.909***  3.238***    2.027      0.665
                            (0.188)    (1.210)     (1.644)    (2.258)     (0.163)    (1.169)     (1.528)    (3.206)

Observations                1880       462         163        143         1880       442         164        129
Log-Likelihood              -784.291   -127.422    -39.175    -41.018     -873.480   -147.26     -59.600    -33.025

Notes: Columns (1) and (5) are selection probits for the probability of giving a rating. Standard errors are shown in parentheses; *** statistically significant at the 1% level; ** at the 5% level; and * at the 10% level.


Table 6. Determinants of change in investment and relative return in all treatments with rating (Random Effects Generalized Least Squares)

                                          Change in player A's       Change in player B's relative
                                          invest. between t-1 and t  return between t-1 and t
Variables                                 (1)                        (2)

Received nothing in t-1                   Ref.                       Ref.
Received pos. rating in t-1               -2.903***                  -0.189***
                                          (0.484)                    (0.050)
Received pos. rating in t-1*exo treat.    1.024                      0.001
                                          (0.583)                    (0.079)
Received pos. rating in t-1*endo treat.   0.788                      0.048
                                          (0.588)                    (0.069)
Received neg. rating in t-1               1.869***                   0.123***
                                          (0.380)                    (0.032)
Received neg. rating in t-1*exo treat.    -1.166*                    0.021
                                          (0.726)                    (0.048)
Received neg. rating in t-1*endo treat.   -0.850*                    -0.077**
                                          (0.493)                    (0.039)
Constant                                  -0.036                     -0.05
                                          (0.380)                    (0.009)

Obs.                                      2394                       1169

Figure 1. Player A’s average investment in each treatment, by period

Figure 2. Player B’s return for each level of Player A’s investment

Figure 3. Player B’s average return in each treatment, by period

Figure 4. Average payoff over time
