Topic: Our chance estimates of uncertain events; e.g.: Hillary Clinton next president of the US. Adapting Proper Scoring Rules for Measuring Subjective Beliefs to the Current State of the Art in Decision Theory October 10, 2007 Fuqua School of Business, Durham NC Peter P. Wakker Theo Offerman Joep Sonnemans Gijs van de Kuilen Application considered first: grading students.

Dec 20, 2015

Page 1:

Topic: Our chance estimates of uncertain events; e.g.: Hillary Clinton next president of the US.

Adapting Proper Scoring Rules for Measuring Subjective Beliefs to the Current State of the Art in Decision Theory

October 10, 2007
Fuqua School of Business, Durham NC

Peter P. Wakker, Theo Offerman, Joep Sonnemans, Gijs van de Kuilen

Application considered first: grading students.

Page 2:

Say, you grade a multiple choice exam in geography to test students' knowledge about Statement H: Capital of North Holland = Amsterdam.

Reward:
Answer    if H true    if not-H true
H         1            0
not-H     0            1

Assume: Two kinds of students.
1. Those who know. They answer correctly.
2. Those who don't know. Answer 50-50.

Problem: Correct answer does not completely identify student's knowledge. Some correct answers, and high grades, are due to luck. There is noise in the data.

Page 3:

One way out: oral exam, or ask details. Too time consuming.

Now comes a perfect way out: Exactly identify state of knowledge of each student. Still multiple choice, taking no more time!

Best of all worlds!

Page 4:

Reward:
Answer        if H true    if not-H true
H             1            0
not-H         0            1
don't know    0.75         0.75

For those of you who do not know answer to H. What would you do?
Best to say "don't know."
System perfectly well discriminates between students!

Page 5:

New Assumption: Students have all kinds of degrees of knowledge. Some are almost sure H is true but are not completely sure; others have a hunch; etc.

Above system does not discriminate perfectly well between such various states of knowledge.

One solution (too time consuming): Oral exams etc.
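
The three-answer scheme of page 4 can be checked numerically. The sketch below assumes an expected-value maximizing student with degree of belief p in H; the function name `best_answer` is illustrative and not from the slides. It suggests that every belief strictly between 0.25 and 0.75 leads to the same answer, which is why the scheme cannot separate those states of knowledge.

```python
def best_answer(p):
    """Return the answer with the highest expected reward for belief p = P(H).

    Rewards follow the table on page 4: answering H pays 1 if H is true,
    answering not-H pays 1 if not-H is true, and "don't know" pays 0.75
    either way. Expected-value maximization is an assumption of this sketch.
    """
    expected = {
        "H": p,              # 1 * p + 0 * (1 - p)
        "not-H": 1 - p,      # 0 * p + 1 * (1 - p)
        "don't know": 0.75,  # sure reward
    }
    return max(expected, key=expected.get)


for p in (0.10, 0.30, 0.50, 0.70, 0.80, 0.95):
    print(p, best_answer(p))
```

Students with p = 0.30 and p = 0.70 both choose "don't know", so their grades are identical although their knowledge differs.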

Page 6:

Second solution: Find r such that student indifferent between:

Answer                        if H true    if not-H true
partly know H, to degree r    r            r
H                             1            0

Then r = P(H). (Assuming expected value maximization …)

How measure r? Can elicit r from risky choices (we will skip details):

Page 7:

Reward (if H true / if not-H true):
Choice    don't know     H
1         0.10 / 0.10    1 / 0
2         0.20 / 0.20    1 / 0
3         0.30 / 0.30    1 / 0
...
9         0.90 / 0.90    1 / 0

Page 8:

Etc. Can get approximation of true subjective probability p. Binary-choice ("preference") solutions are popular in decision theory.

Too time consuming for us. Rewards for students are still somewhat noisy this way (only approximation of p).
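
Why the choice list of page 7 only approximates p can be seen in a small sketch. It assumes an expected-value maximizer who takes H whenever p exceeds the sure reward and takes "don't know" otherwise (ties go to the sure reward, an arbitrary choice made here). The name `choice_list_bracket` is illustrative.

```python
def choice_list_bracket(p, sure_rewards=None):
    """Bracket p from the switching point in the nine-row choice list.

    In each row the student compares the sure reward s of "don't know"
    with answering H, whose expected reward is p. The row where the
    student switches to "don't know" bounds p from above, and the
    previous row bounds it from below.
    """
    if sure_rewards is None:
        sure_rewards = [k / 10 for k in range(1, 10)]  # 0.10, ..., 0.90
    lower, upper = 0.0, 1.0
    for s in sure_rewards:
        if p > s:
            lower = s   # student still prefers answering H
        else:
            upper = s   # first row where the sure reward wins
            break
    return lower, upper


print(choice_list_bracket(0.75))  # interval that contains p
```

With nine rows the observed choices pin p down only to an interval of width 0.10, so a finer list (and more of the student's time) would be needed for a sharper estimate.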

Page 9:

Third solution [introspection]

Simply ask student which r ("reported probability") brings indifference in:

Answer                        if H true    if not-H true
partly know H, to degree r    r            r
H                             1            0

Problems: Why would student give true answer r = p? What at all is the reward (grade?) for the student? Does reward, whatever it is, give an incentive to give true answer?

Page 10:

I now promise a perfect way out: de Finetti's dream-incentives.
Exactly identifies state of knowledge of each student, no matter what it is.
Takes little time; no more than multiple choice.
Rewards students fairly, with little noise.

Best of all worlds. For the full variety of degrees of knowledge.

Student can choose reported probability r for H from the [0,1] continuum, as follows:

Page 11:

Reward:
r: degree of belief in H (?)    if H true        if not-H true
r = 1 (H = sure!?)              1                0
r = 0.5 (don't know!?)          0.75             0.75
r = 0 (not-H is sure!?)         0                1
general r                       $1 - (1-r)^2$    $1 - r^2$

Claim: Under "subjective expected value," optimal reported probability r = true subjective probability p.

Page 12:

To help memory:

Reward (r: degree of belief in H): $1 - (1-r)^2$ if H true; $1 - r^2$ if not-H true.

Proof of claim.
p true probability; r reported probability.
Optimize $EV = p\,(1 - (1-r)^2) + (1-p)\,(1 - r^2)$.
1st order optimality: $2p(1-r) - 2r(1-p) = 0$. r = p!
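
The claim can also be checked by brute force. The sketch below evaluates the expected reward on a grid of reports and picks the best one; the function names and the grid resolution are choices made for this illustration only.

```python
def expected_value(r, p):
    """Expected reward of reporting r when the true probability of H is p."""
    return p * (1 - (1 - r) ** 2) + (1 - p) * (1 - r ** 2)


def best_report(p, steps=1000):
    """Grid search over r in [0, 1] for the report with the highest EV."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: expected_value(r, p))


for p in (0.10, 0.50, 0.75, 0.90):
    print(p, best_report(p))
```

Expanding the expected value gives $EV = 1 - p + p^2 - (r - p)^2$, so any report other than r = p loses exactly $(r - p)^2$ in expectation.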

Page 13:

Easy in algebraic sense.
Conceptually: !!! Wow !!!

Incentive compatible ... Many implications ...
de Finetti (1962) and Brier (1950) were the first neuro-scientists!

Useful in many domains, more than in teaching: in teaching, if scores are used "more" then distorting side payments (such as to pass or fail; Lichtendahl & Winkler 2007).

Page 14:

"Bayesian truth serum" (Prelec, Science, 2005).
Superior to elicitations through preferences.
Superior to elicitations through indifferences ~ (BDM).

Widely used: Hanson (Nature, 2002), Prelec (Science 2005). In accounting (Wright 1988), Bayesian statistics (Savage 1971), business (Stael von Holstein 1972), education (Echternacht 1972), finance (Shiller, Kon-Ya, & Tsutsui 1996), medicine (Spiegelhalter 1986), psychology (Liberman & Tversky 1993; McClelland & Bolger 1994), experimental economics (Nyarko & Schotter 2002).

Remember: based on expected value; in 2007 …!?

We bring
- realism (of prospect theory) to proper scoring rules;
- the beauty of proper scoring rules to prospect theory and studies of ambiguity.

Page 15:

Survey

Part I. Deriving reported prob. r from theories:
- expected value;
- expected utility (Winkler & Murphy 1970);
- nonexpected utility for probabilities;
- nonexpected utility for ambiguity.

Part II. Deriving theories from observed r. In particular: Derive beliefs/ambiguity attitudes. Will be surprisingly easy.

Proper scoring rules <==> risk & ambiguity: Mutual benefits.

Part III. Implementation in an experiment.

Page 16:

Part I. Deriving r from Theories (EV, and then 3 deviations).

Event H: Hillary Clinton next president of the US.
not-H: Someone else next president.

We quantitatively measure your subjective belief about this event (subjective probability?), i.e. how much you believe in Hillary.

Page 17:

Your subj. prob.(H: Hillary next president) = 0.75 (charming husband Bill).

EV: Then your optimal $r_H$ = 0.75.

Page 18:

[Figure: reported probability $R(p) = r_H$ (vertical axis, 0 to 1) as function of true probability p (horizontal axis, 0 to 1), with curves for EV, EU, nonEU, and nonEUA. At p = 0.75 the curves mark $r_{EV}$ = 0.75, $r_{EU}$ = 0.69, and $r_{nonEU}$ = 0.61.]

Reported probability $R(p) = r_H$ as function of true probability p, under:
(a) expected value (EV);
(b) expected utility with $U(x) = x^{0.5}$ (EU);
(c) nonexpected utility for known probabilities, with $U(x) = x^{0.5}$ and with w(p) as common;
$r_{nonEUA}$: nonexpected utility for unknown probabilities ("Ambiguity").

Examples: p. 21 (EU), p. 25 (nonEU), p. 29 (nonEUA).

Reward:
Report                   if H true    if not-H true    EV
$r_{EV}$ = 0.75          0.94         0.44             0.8125
$r_{EU}$ = 0.69          0.91         0.52             0.8094
$r_{nonEU}$ = 0.61       0.85         0.63             0.7920
$r_{nonEUA}$ = 0.52      0.77         0.73             0.7596

Page 19:

So far we assumed EV (as does everyone using proper scoring rules, but as does no-one in modern risk-ambiguity theories ...)

Deviation 1 from EV: EU with U nonlinear

Now optimize
$p\,U(1 - (1-r)^2) + (1-p)\,U(1 - r^2)$

r = p need no more be optimal.

Page 20:

Theorem. Under expected utility with true probability p,

$$r = \frac{p}{p + (1-p)\,\dfrac{U'(1-r^2)}{U'(1-(1-r)^2)}}$$

Reversed (and explicit) expression:

$$p = \frac{r}{r + (1-r)\,\dfrac{U'(1-(1-r)^2)}{U'(1-r^2)}}$$

A corollary to distinguish from EV: r is nonadditive as soon as U is nonlinear.
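
The first expression in the theorem defines r only implicitly, because r also appears on the right-hand side. A minimal sketch of one way to solve it numerically is bisection on the gap between both sides. The function names are illustrative, and the bisection assumes a concave U, for which the right-hand side falls as r rises so that the solution is unique.

```python
import math


def reported_probability(p, marginal_utility, tol=1e-10):
    """Solve r = p / (p + (1 - p) * U'(1 - r^2) / U'(1 - (1 - r)^2)) for r.

    `marginal_utility` is U'. Bisection assumes the gap between the left
    and right side changes sign exactly once on (0, 1), which holds for
    concave U (an assumption of this sketch).
    """
    def gap(r):
        ratio = marginal_utility(1 - r ** 2) / marginal_utility(1 - (1 - r) ** 2)
        return r - p / (p + (1 - p) * ratio)

    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sqrt_marginal(x):
    """U'(x) for U(x) = x ** 0.5."""
    return 0.5 / math.sqrt(x)


def linear_marginal(x):
    """U'(x) for U(x) = x, the expected-value case."""
    return 1.0


print(round(reported_probability(0.75, linear_marginal), 2))
print(round(reported_probability(0.75, sqrt_marginal), 2))
```

With linear utility the solver returns r = p, and with square-root utility and p = 0.75 it returns a report close to the 0.69 quoted in the talk.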

Page 21:

How bet on Hillary? [Expected Utility].
EV: $r_{EV}$ = 0.75.
Expected utility, $U(x) = x^{0.5}$: $r_{EU}$ = 0.69.
You now bet less on Hillary. Closer to safety (Winkler & Murphy 1970).

(go to p. 18, with figure of R(p))

Page 22:

Deviation 2 from EV: nonexpected utility for probabilities (Allais 1953, Machina 1982, Kahneman & Tversky 1979, Quiggin 1982, Gul 1991, Luce & Fishburn 1991, Tversky & Kahneman 1992; Birnbaum 2005; survey: Starmer 2000)

For two-gain prospects, virtually all those theories are as follows:

For $r \geq 0.5$, $nonEU(r) = w(p)\,U(1 - (1-r)^2) + (1 - w(p))\,U(1 - r^2)$.

r < 0.5, symmetry, etc.
Different treatment of highest and lowest outcome: "rank-dependence."

Page 23:

[Figure: w(p) (vertical axis, 0 to 1) against p (horizontal axis, 0 to 1), with the points p = 1/3 and p = 2/3 marked.]

Figure. The common weighting function w. $w(p) = \exp(-(-\ln p)^{\alpha})$ for $\alpha = 0.65$.

w(1/3) ≈ 1/3; w(2/3) ≈ .51

Page 24:

Theorem. Under nonexpected utility with true probability p,

$$r = \frac{w(p)}{w(p) + (1-w(p))\,\dfrac{U'(1-r^2)}{U'(1-(1-r)^2)}}$$

Reversed (explicit) expression:

$$p = w^{-1}\!\left(\frac{r}{r + (1-r)\,\dfrac{U'(1-(1-r)^2)}{U'(1-r^2)}}\right)$$
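
As a rough numerical illustration of this theorem, the sketch below plugs in the weighting function of page 23 with exponent 0.65 and $U(x) = x^{0.5}$, then solves the implicit equation by bisection. The function names are illustrative; the formula applies for reports r ≥ 0.5, which holds for the example p = 0.75.

```python
import math


def weighting(p, alpha=0.65):
    """Probability weighting w(p) = exp(-(-ln p) ** alpha), for 0 < p <= 1."""
    return math.exp(-((-math.log(p)) ** alpha))


def report_for_weight(weight, tol=1e-10):
    """Solve r = W / (W + (1 - W) * U'(1 - r^2) / U'(1 - (1 - r)^2)) for r.

    Uses U(x) = x ** 0.5, so the marginal-utility ratio simplifies to
    sqrt((1 - (1 - r)^2) / (1 - r^2)). The right-hand side falls as r
    rises, so bisection finds the unique solution.
    """
    def gap(r):
        ratio = math.sqrt((1 - (1 - r) ** 2) / (1 - r ** 2))
        return r - weight / (weight + (1 - weight) * ratio)

    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


w = weighting(0.75)
print(round(w, 3), round(report_for_weight(w), 2))
```

Because the decision weight w(0.75) lies below 0.75, the optimal report moves further toward fifty-fifty than under expected utility alone, in line with the 0.61 quoted in the talk.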

Page 25:

How bet on Hillary now? [nonEU with probabilities].
EV: $r_{EV}$ = 0.75.
EU: $r_{EU}$ = 0.69.
Nonexpected utility, $U(x) = x^{0.5}$, $w(p) = \exp(-(-\ln p)^{0.65})$: $r_{nonEU}$ = 0.61.
You bet even less on Hillary. Again closer to fifty-fifty safety.

(go to p. 18, with figure of R(p))

Deviations from EV were at level of behavior so far, not of beliefs. Now for something different; more fundamental for our purposes.

Page 26:

Deviation 3 from EV: Ambiguity (unknown probabilities; concerns belief or decision-attitude? Yet to be settled).

Of different nature than previous two deviations. Not to "correct for if distorting" but the thing to measure.

How deal with unknown probabilities?

Have to give up Bayesian beliefs descriptively. According to some even normatively.

Page 27:

Instead of additive beliefs p = P(H), nonadditive beliefs B(H) (Dempster & Shafer, Tversky & Koehler; etc.)

All currently existing decision models: For $r \geq 0.5$, nonEU(r) =
$W(H)\,U(1 - (1-r)^2) + (1 - W(H))\,U(1 - r^2)$.

Is '92 prospect theory, = Schmeidler (1989).
Includes multiple priors (Wald 1950; Gilboa & Schmeidler 1989);
For binary gambles: Pfanzagl 1959; Luce ('00 Chapter 3); Ghirardato & Marinacci ('01, "biseparable").

Can always write $B(H) = w^{-1}(W(H))$, so $W(H) = w(B(H))$. Then
$w(B(H))\,U(1 - (1-r)^2) + (1 - w(B(H)))\,U(1 - r^2)$.

Page 28:

Theorem. Under nonexpected utility with ambiguity,

$$r_H = \frac{w(B(H))}{w(B(H)) + (1-w(B(H)))\,\dfrac{U'(1-r^2)}{U'(1-(1-r)^2)}}$$

Reversed (explicit) expression:

$$B(H) = w^{-1}\!\left(\frac{r}{r + (1-r)\,\dfrac{U'(1-(1-r)^2)}{U'(1-r^2)}}\right)$$

(here $r = r_H$).

Page 29:

How bet on Hillary now? [Ambiguity, nonEUA].
$r_{EV}$ = 0.75.
$r_{EU}$ = 0.69.
$r_{nonEU}$ = 0.61.
Similarly, $r_{nonEUA}$ = 0.52 (under plausible assumptions).
r's are close to insensitive fifty-fifty.
"Belief" component $B(H) = w^{-1}(W)$ = 0.62.

(go to p. 18, with figure of R(p))

Page 30:

B(H): ambiguity attitude /=/ beliefs??
Before entering that debate, first: How measure B(H)?
Our contribution: through proper scoring rules with "risk correction."

This ends Part I. Before going to Part II (derive theory from r), in preparation, we consider other proposals for measuring B and W considered in the literature (that we will improve upon):

Page 31:

Proposal 1 (common in decision theory): Measure U, W, and w from behavior, and derive $B(H) = w^{-1}(W(H))$ from it.
Problem: Much and difficult work!!!

Proposal 2 (common in decision analysis of the 1960s, and in modern experimental economics): measure canonical probabilities, that is, for H, find event $H_p$ with objective probability p such that (H:100) ~ ($H_p$:100) = (p:100). Then B(H) = p.
Problem: measuring indifferences is difficult.

Proposal 3 (common in proper scoring rules): Calibration …
Problem: Need many repeated observations.

Page 32:

Part II. Deriving Theoretical Things from Empirical Observations of r

We reconsider reversed explicit expressions:

$$p = w^{-1}\!\left(\frac{r}{r + (1-r)\,\dfrac{U'(1-(1-r)^2)}{U'(1-r^2)}}\right)$$

$$B(H) = w^{-1}\!\left(\frac{r}{r + (1-r)\,\dfrac{U'(1-(1-r)^2)}{U'(1-r^2)}}\right)$$

Corollary. p = B(H) if related to the same r!!

Page 33:

Our proposal takes the best of several worlds!

Need not measure U,W, and w.

Get "canonical probability" without measuring indifferences (BDM …; Holt 2006).

Calibration without needing many repeated observations.

Do all that with no more than simple proper-scoring-rule questions.

Page 34:

Example (participant 25)

stock 20, CSM certificates dealing in sugar and bakery-ingredients.
Reported probability: r = 0.75

9191

For objective probability p = 0.70, also reported probability r = 0.75.
Conclusion: B(elief) of ending in bar is 0.70!
We simply measure the R(p) curves, and use their inverses: this is risk correction.
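
The risk correction described here amounts to inverting a measured R(p) curve. The sketch below does this by linear interpolation between measured points. Only the pair p = 0.70, r = 0.75 comes from the example above; all other numbers are hypothetical, as is the function name `risk_correct`, and the interpolation is one simple choice rather than the authors' procedure.

```python
def risk_correct(r, known_p, reported_r):
    """Map a report r for an unknown-probability event to a belief B.

    `known_p` are objective probabilities and `reported_r` the reports the
    same participant gave for them, i.e. points on that participant's R(p)
    curve. The curve is inverted by linear interpolation; reports outside
    the measured range are clipped to the nearest measured point.
    """
    points = sorted(zip(reported_r, known_p))
    if r <= points[0][0]:
        return points[0][1]
    if r >= points[-1][0]:
        return points[-1][1]
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if r0 <= r <= r1:
            if r1 == r0:  # flat stretch of the curve: take the midpoint
                return (p0 + p1) / 2
            return p0 + (p1 - p0) * (r - r0) / (r1 - r0)


# Hypothetical R(p) measurements, except the pair (0.70, 0.75) from the example.
known_p = [0.50, 0.60, 0.70, 0.80, 0.90]
reported_r = [0.50, 0.62, 0.75, 0.84, 0.93]

print(round(risk_correct(0.75, known_p, reported_r), 2))
```

No utility function or weighting function is estimated anywhere in this step, which is the point of the corollary: the same r observed under known and unknown probabilities identifies B(H) directly.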

Page 35:

Directly implementable empirically. We did so in an experiment, and found plausible results.

Page 36:

Part III. Experimental Test of Our Correction Method

Page 37:

Method

Participants. N = 93 students.
Procedure. Computerized in lab. Groups of 15/16 each. 4 practice questions.

Page 38:

Stimuli 1. First we did proper scoring rule for unknown probabilities. 72 in total.

For each stock two small intervals, and, third, their union. Thus, we test for additivity.
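
The additivity test can be sketched as follows. The slides do not spell out the formula, so the definition used here (report for the first interval plus report for the second, minus report for their union) is an assumption of this sketch, and the numbers are hypothetical.

```python
def additivity_bias(r_first, r_second, r_union):
    """Deviation from additivity for two disjoint intervals and their union.

    Additive (Bayesian) beliefs give 0. The sign convention, sum of the
    parts minus the union, is assumed here and not taken from the slides.
    """
    return r_first + r_second - r_union


# Hypothetical reports for one stock: two small intervals and their union.
print(round(additivity_bias(0.30, 0.35, 0.55), 2))
```

The same function can be applied to raw reports and to risk-corrected beliefs, which is how a comparison of corrected and uncorrected biases could be set up.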

Page 39:

Stimuli 2. Known probabilities: Two 10-sided dice thrown. Yield random nr. between 01 and 100.
Event H: nr. ≤ 75 (p = 3/4 = 15/20) (etc.).
Done for all probabilities j/20.

Motivating subjects. Real incentives. Two treatments.
1. All-pay. Points paid for all questions. 6 points = €1. Average earning €15.05.
2. One-pay (random-lottery system). One question, randomly selected afterwards, played for real. 1 point = €20. Average earning: €15.30.

Page 40:

Results (of group average; at individual level more corrections)

Page 41:

[Figure: Average correction curves. Corrected probability (vertical axis, 0 to 1) against reported probability (horizontal axis, 0 to 1), with curves for ONE (rho = 0.70), ALL (rho = 1.14), and the 45° line.]

Page 42:

[Figure: Individual corrections. F(ρ) (vertical axis, 0 to 1) against ρ (horizontal axis, -2.0 to 1.5), with curves for treatment one and treatment all.]

Page 43:

Figure 9.1. Empirical density of additivity bias for the two treatments

[Figure: two panels, Fig. a. Treatment t=ONE and Fig. b. Treatment t=ALL. Each panel shows counts (vertical axis, 0 to 160) against additivity bias (horizontal axis, -0.6 to 0.6), with a corrected and an uncorrected curve.]

For each interval [(j-2.5)/100, (j+2.5)/100] of length 0.05 around j/100, we counted the number of additivity biases in the interval, aggregated over 32 stocks and 89 individuals, for both treatments. With risk-correction, there were 65 additivity biases between 0.375 and 0.425 in the treatment t=ONE, and without risk-correction there were 95 such; etc.

Corrections reduce nonadditivity, but more than half remains: ambiguity generates more deviation from additivity than risk.
Fewer corrections for Treatment t=ALL. Better use that if no correction possible.

Page 44:

Summary and Conclusion
- Modern risk & ambiguity theories: proper scoring rules are heavily biased.
- We correct for those biases. Benefits for proper-scoring rule community and for risk- and ambiguity theories.
- Experiment: correction improves quality; reduces deviations from ("rational"?) Bayesian beliefs.
- Do not remove all deviations from Bayesian beliefs. Beliefs are genuinely nonadditive/nonBayesian/sensitive-to-ambiguity.
- Proper scoring rules: the post-neuroeconomics approach for mind-reading.

Page 45:

The end.