Probability

The Concept and its Rules of Use

Derek Charles Shiller

A Dissertation

Presented to the Faculty

of Princeton University

in Candidacy for the Degree

of Doctor of Philosophy

Recommended for Acceptance

by the Department of

Philosophy

Advisors: Adam Elga and Sarah-Jane Leslie

January 2015

© Copyright by Derek Charles Shiller, 2014.

All rights reserved.

Abstract

Over the past several centuries, progress on the applications of probability has

far surpassed our philosophical understanding of the nature of probability. This

dissertation consists of four chapters that explore the nature of our judgments about

probability. In the first chapter, I present my guiding perspective of judgments about

probability as moves in a kind of practice. Judgments about probability should be

understood in terms of the intellectual practice of assigning probabilities, rather than

in terms of any representative content; it is the dimensions of that practice that make

judgments about probability count as judgments about probability. I propose that

we assign probabilities as a way of distilling how we feel about the bearing of bodies

of evidence on different propositions.

In the second chapter, I develop an account of the cognitive foundations of these

judgments. I suggest that making a judgment about probability involves applying a

special probabilistic concept to a proposition. Because judgments about probability

involve our deployment of certain concepts and gradations of confidence do not, the

former sorts of attitudes are more cognitively sophisticated than the latter.

In the final two chapters, I explore the sources of the most important norms that

have been alleged to govern probability. In the third chapter, I offer an unsubstantive

explanation: those norms are partly constitutive of what it is to be an assignment

of probability. We bind ourselves to the norms by intending to engage in a practice

constitutively governed by those norms. In the final chapter, I suggest that the

norm of conditionalization, the cornerstone of Bayesian epistemology, is applicable

in only a limited range of situations and cannot be made to do all of the work

that it is traditionally expected to do. Conditionalization has normative significance

because conditionalization preserves certain kinds of commitments. While these kinds

of commitments are common, not every assignment of probabilities needs to rely on

those kinds of commitments.

Acknowledgements

I’d like to thank my advisors, Adam Elga and Sarah-Jane Leslie, for their support

and guidance in producing this dissertation, and for their willingness to read draft

after draft as the project evolved. My ideas – at times heretical – were challenged

without being dismissed. I deeply appreciate the freedom and encouragement they

gave me to develop my views as I saw fit.

My work has been greatly influenced by much of the Princeton philosophy

department both directly and indirectly. I was glad to work in an environment that

celebrated philosophical diversity and novelty. Many professors provided me with helpful

advice along the way, and their suggestions shaped the views contained within. I

especially want to thank Tom Kelly, Boris Kment, and Michael Smith for their

suggestions.

Many graduate students took the time to carefully think and converse about

the views contained herein. Jack Spencer, Jordan Delange, Yoaav Isaacs, Daniel

Berntson, and Han Wietmarschen provided helpful comments and probing

discussions. William Melanson also proved tireless in reading my work and provided me

with many constructive proposals for improvement.

Most of all, I am indebted to my wife, Megan Nelson, for allowing me to indulge

in my dreams, for following along, and for keeping me grounded.

To Megan for her unwavering support from the start.

Contents

Abstract

Acknowledgements

1 Introduction: Probability as a Practice

1.1 Judgments about Probability

1.2 Interpretations of Probability

1.2.1 Degrees of Belief

1.2.2 The Logical Interpretation

1.2.3 Cognitivist Subjectivism

1.2.4 Credal Noncognitivism

1.3 Probability as a Practice

1.3.1 Intellectual Practices

1.3.2 The Use of Probability

1.4 Overview of What’s to Come

2 Probability and Confidence: Grounds for Divorce

2.1 Introduction

2.1.1 Probability as a Concept

2.1.2 Content and Vehicles

2.2 Consideration 1: Structural Diversity

2.3 Consideration 2: Novel Quantifiability

2.3.1 Precision

2.3.2 Novelty

2.4 Consideration 3: Non-specific Probabilities

2.4.1 Ambiguity and Non-specificity

2.4.2 Non-specific Attitudes

3 Constitutivism about the Formal Norms

3.1 The Formal Norms

3.2 The Bare Characterization

3.2.1 Alternatives

3.3 The Pragmatic Characterization

3.3.1 The Dutch Book Argument

3.4 The Aim Characterization

3.4.1 Measures and Disagreement

3.4.2 Wedgwood’s Inference to the Best Explanation

3.4.3 Joyce’s Supervaluationist Argument

3.5 The Constitutive Characterization

3.5.1 Additivity as a Constitutive Rule of a Practice

3.5.2 Reasons for Engaging in the Practice

3.5.3 The Status of Other Norms

4 The Authority of Conditionalization

4.1 The Bayesian Procedure

4.1.1 The Details

4.2 The Logic of the Procedure

4.2.1 Dutch Book Arguments

4.2.2 Accuracy-Based Arguments

4.2.3 Constitutivism

4.2.4 Authority through Commitment Preservation

4.2.5 How Commitments are Preserved

4.3 An Illustrative Case

4.3.1 Alice’s Predicament

4.3.2 The Counterfactual Argument

4.4 Limitations of the Bayesian Procedure

4.4.1 Varieties of Commitment

4.4.2 Alice’s Commitments

4.4.3 The Authority of the Bayesian Procedure

4.4.4 Must we have Relative Commitments?

5 Conclusion

Bibliography

Chapter 1

Introduction: Probability as a Practice

1.1 Judgments about Probability

The modern notion of probability dates back to the 17th century, when

mathematicians and gamblers suddenly recognized the value of applying the new theory of

combinatorics to games of chance. Some notion of plausibility or evidential support

surely preceded this achievement, but it was only with the new mathematical

apparatus in view that people really began to conceive of probabilities as we do today. The

significance of the modern notion of probability was grasped at once by those at the

forefront of its development: while its most obvious uses lay in gambling, the formal

structure of probability held promise for helping us navigate uncertainty in all of its

forms.

Probabilities are now ubiquitous: they have taken up an important role in the

sciences, medicine, our legal system, and our daily life. We’ve developed complex

tools for using probabilities in ever more sophisticated ways. Despite our progress in

working with probabilities, the fundamental nature of probability remains

controversial. The twentieth century saw the development of a number of different schools of

thought about the fundamental nature of probability. Through ‘theories of probabil-

ity’, these schools proposed many different applications of the mathematical apparatus

of probability theory for modeling natural phenomena. As distinct applications of a

formal apparatus, these theories are consistent with each other. But they can also

be construed as making inconsistent claims about what it is that judgments about

probability are really about.

We make judgments about probability. In court, we try to assess how likely it is

that the defendant is innocent. Before setting out for a walk, we try to assess how

likely it is to rain. We judge how likely it is that our sports team will win or lose.

We calculate the odds that our airplane will crash. We decide that scientists are very

probably right about global warming. We conclude that the Bohmian interpretation

of quantum mechanics is no more likely to be true than the Everettian interpretation

and the Collatz conjecture is probably unprovable. The fundamental question at the

foundation of probability is what it is about these judgments about probability that

makes them count as judgments about probability.

Not all judgments about probability need to count as judgments about probability

for the same reasons. Collectively, judgments about probability may only share a

family resemblance. It is extremely plausible that there are at least two basic distinct

kinds of probability. One sort of judgment about probability relates to our evidence.

The other reflects something in the world.

A coin may be biased without us knowing which way it is biased. If we have

to judge how likely it is that the coin will land heads, we might judge that the

probability of heads is .5, on account of our ignorance of the way that it is biased.

Alternatively, we might judge that it is not .5 on account of the fact that it is biased.

The first kind of judgment reflects our level of evidence. The second kind reflects

the physical nature of the coin itself. These two sorts of judgments deserve distinct

interpretations. The judgment that the probability of heads is not .5 is a judgment

about objective chance. Particles, coins, and dice have objective chances of behaving

in certain ways. These chances are part of the things themselves, and don’t depend

on our evidence in any way. The judgment that the coin is equally likely to land

heads or tails is a judgment about subjective probability. The probabilities that we

ascribe to past events or mathematical conjectures are good candidates for judgments

about subjective probability, since despite being assigned intermediate probabilities,

we all know that they must be one way or the other.
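
The coin case above can be given a small worked form. The numbers here are hypothetical, and the equal weighting of the two possibilities is an assumption made only for illustration; the text does not commit to any particular rule for arriving at the evidential judgment. Suppose the coin's chance of heads is either .7 or .3, and our evidence does not favor one possibility over the other. Weighting the two candidate chances equally gives:

```latex
\[
P(\text{heads}) \;=\; \tfrac{1}{2}(0.7) + \tfrac{1}{2}(0.3) \;=\; 0.35 + 0.15 \;=\; 0.5
\]
```

On this sketch the evidential judgment comes out at .5 even though, on either hypothesis about the bias, the objective chance of heads is not .5.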

The topic of this dissertation is judgments about subjective probability

(henceforth, just ‘judgments about probability’). I will start in this chapter by offering a

brief account of how I think we ought to understand these judgments. While I don’t

believe that it is wise to develop accounts of both subjective and objective judgments

entirely in isolation from each other, I will focus mostly on subjective judgments.

I will propose that we should understand judgments about probabilities as moves

within a practice, and draw analogies to games of horseshoes and card counting. In

the following chapters, I will discuss a number of related questions and provide

answers that draw inspiration from, and provide support for, the view of judgments

about subjective probability that I present here.

1.2 Interpretations of Probability

Though the details vary considerably, the space of plausible analyses of judgments

about probability closely resembles the space of plausible analyses of ethical

judgments. The problem of supplying an analysis of the judgment that a proposition is

probable is a lot like the problem of supplying an analysis of the judgment that an

action is wrong. Both kinds of judgment have a subjective or perspectival flavor, but

neither kind of judgment is explicitly relativized. The ethical question has received

much more sustained attention over the past century, and the wealth of possible

answers proposed for that question provides us with a range of possible answers to consider for

the probabilistic one. Most plausible metaethical views parallel plausible views about judgments

of probability: non-naturalism, analytic descriptivism, synthetic naturalism,

sensibilism, subjectivism, assessor relativism, expressivism, error theory, and fictionalism all

make for prima facie reasonable theories of judgments of probability.

Philosophers of probability have developed some of these ideas, often somewhat

independently of their metaethical peers. The primary division in theories falls

between those analyses that characterize the judgments in terms of their

representational content and those that don’t. Cognitivist views suggest that what it is to be

a judgment about probability is to be a judgment that has a certain kind of

representational content. Noncognitivist views suggest that judgments about probability are

characterized by something other than their representational content. They are not

judgments that have a particular representational content containing a probabilistic

component.

Judgments about objective probability most likely deserve a cognitivist treatment.

With judgments of subjective probability, it is less clear. There are cognitivist theories

of judgments of subjective probability that come in realist naturalist, realist

non-naturalist, and subjectivist flavors. Realist positions characterize judgments about

probabilities as judgments about the measure of something outside of us. Naturalist

positions take these to be measures of some natural phenomenon, one that we can

understand in terms of physical objects and properties. Non-naturalist positions

take the subject of judgments about probability to be a non-natural phenomenon.

The ‘logical interpretation’ of probability championed by John Maynard Keynes [20],

Rudolf Carnap [5], and Donald Williams [39], might be understood in either of these

ways. Their idea was that judgments about probability concern the relations between

particular propositions and bodies of evidence. Subjectivist cognitivist positions, on

the other hand, characterize judgments about probabilities as judgments about how

propositions relate to each other and to a context-dependent standard. This standard

is typically the judge’s, though it might also be supplied by the judge’s community or

culture. The most plausible version of this proposal is based on John MacFarlane’s [25]

notion of assessment sensitivity, and has been popular as an account of the meaning

of epistemic modals such as ‘might’.

Several distinct non-cognitivist positions have been proposed, but few have been

examined in depth. One non-cognitivist view, defended by Stephen Toulmin [35],

suggests that we can understand probability through its use in modulating

commitments. (It was proposed with the linguistic expression of the concept in mind.) This

view faces deep challenges in providing an appealing account of probabilistic judgments,

as opposed to probabilistic assertions, and has now mostly been abandoned.

The currently dominant noncognitivist view is ‘credal noncognitivism’, for which

important early work was done by Frank Ramsey [28] and Bruno de Finetti [7], and

which has more recently been defended by Huw Price [27], Simon Blackburn [2], and Seth

Yalcin [42]. It holds that judgments about probability are nothing other than beliefs.1

Every belief is simultaneously a judgment about probability.

Credal noncognitivism rests on an implausible psychological theory, but there are

other related views that do not suffer from its defects. In the coming pages, I will

present a second, and what I regard as more plausible, noncognitivist view about

probability. This view interprets judgments of probability as moves within a certain

kind of practice. While I won’t explicitly try to defend this interpretation in great

depth in this dissertation, it will shape much of what I have to say, and the dissertation

should be read with this interpretation in mind. Before taking a closer look at each

of these views, I will describe the doctrine of degrees of belief, which will play a role

in their development.

1 Unfortunately, credal noncognitivism has not been nearly as deeply explored as its metaethical analogues. The philosophers I cite may not explicitly embrace every bit of the doctrine that I am attributing to them.

1.2.1 Degrees of Belief

The notion of degrees of belief is important to many interpretations of probability.

When I talk about degrees of belief, I refer to the posits of a particular psychological

doctrine characterized by three major components. These three components all relate

to the claim that beliefs come in gradations. The fact that beliefs admit of gradations

is unexceptional; every attitude can be regarded as coming in degrees along any

number of different dimensions. What is special about the doctrine of degrees of

belief is what it says about these gradations. The theory makes certain claims about

one kind of gradation that belief enters into. These claims can be broken into three

components of the view.2

According to the first component, the degree of a belief is correlated in some way

with its role in decision making. Beliefs ordinarily combine with desires to produce

intentions. The product of these combinations depends upon the degrees of both

belief and desires. Those beliefs of a higher degree have a special tendency to exert

an influence over what we decide to do: we tend to prefer actions which we have

a high degree of belief will result in outcomes for which we have a high degree of

desire. There are other ways to understand the dimension of gradation and some of

these other ways exert some influence over the way that we typically think of the

notion. Often, we connect degrees of belief up with the feelings of confidence that

they inspire. This allows us to use introspection, rather than inferences from our

behavior, to detect our degrees of belief.
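
The first component above can be sketched in code. This is only an illustration under stated assumptions: the text does not give a formula, so the belief-weighted sum used here is one standard decision-theoretic reading of how degrees of belief and degrees of desire combine. The function names, the umbrella scenario, and all of the numbers are hypothetical.

```python
from typing import Dict


def expected_desirability(belief: Dict[str, float], desire: Dict[str, float]) -> float:
    """Weight the degree of desire for each outcome by the degree of belief in it."""
    return sum(belief[state] * desire[state] for state in belief)


def preferred_action(belief: Dict[str, float],
                     desires_by_action: Dict[str, Dict[str, float]]) -> str:
    """Return the action with the highest belief-weighted desirability."""
    return max(
        desires_by_action,
        key=lambda action: expected_desirability(belief, desires_by_action[action]),
    )


# Hypothetical degrees of desire for each action under each state of the weather.
DESIRES = {
    "take umbrella": {"rain": 8.0, "no rain": 6.0},
    "leave umbrella": {"rain": 0.0, "no rain": 10.0},
}

# Two hypothetical agents who differ only in their degree of belief in rain.
low_belief_in_rain = {"rain": 0.25, "no rain": 0.75}
high_belief_in_rain = {"rain": 0.75, "no rain": 0.25}

print(preferred_action(low_belief_in_rain, DESIRES))
print(preferred_action(high_belief_in_rain, DESIRES))
```

The two agents share the same desires, so any difference in what they choose is traceable to the difference in their degrees of belief. That is the sense in which, on this component of the doctrine, a higher degree of belief exerts more influence over what we decide to do.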

According to the second component, belief, disbelief, and uncertainty involve the

same kind of attitude and differ only in where they fall along the gradation’s scale.

A high degree of belief in a proposition constitutes belief, a low degree of belief

constitutes disbelief, and a middle degree constitutes uncertainty.

2 Behaviorist and instrumental versions of the view might make do without realism about gradations. Though they retain popularity, I will set these views aside.

According to the third component of the doctrine, these degrees of belief can

be quantified with numbers in a way such that they typically obey the axioms of

probability theory. That is, each belief can be assigned a numerical representation of

its degree between 0 and 1, and the representation of the degree of a disjunctive belief

is the sum of the representations of the beliefs in the disjuncts if the disjuncts are inconsistent with each other.
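
The two conditions just described can be written out formally. The symbol c for the numerical representation of a degree of belief is notation introduced here for illustration and is not the text's own:

```latex
\begin{align*}
  & 0 \le c(p) \le 1
    && \text{for every proposition } p, \\
  & c(p \lor q) = c(p) + c(q)
    && \text{whenever } p \text{ and } q \text{ cannot both be true.}
\end{align*}
```

Standard axiomatizations of probability add a normalization condition, on which a proposition that is certain to be true receives the value 1. The passage above does not mention it.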

The doctrine of degrees of belief is an empirical conjecture according to which

beliefs come in a kind of a natural gradation that obeys the probability calculus and

is employed in the manner specified by decision theory. As an empirical conjecture, it

is supported to some extent by our ordinary interactions with believers. Beliefs can

clearly be compared both in terms of their phenomenological properties and in the

effects that they have on our actions. However, the other elements of the doctrine

are not as well supported. There is little independent evidence that there exists a

metric that will allow us to evaluate degrees of belief between 0 and 1 in such a way

that they typically satisfy the axioms of probability.3 Similarly, there is little evidence

that belief, disbelief, and uncertainty really exist along the same cognitive spectrum –

differing only in degree and not in kind. For all that we know, the differences between

uncertainty and belief might be very unlike the differences between low-level belief

and high-level belief. If the doctrine of degrees of belief is false, then many of the

going interpretations of subjective judgments of probability will need to be rethought.

It is an advantage of the view of judgments of probability that I will ultimately favor

that it doesn’t rely on this doctrine.

3 Anyone with coherent preferences may be ascribed degrees of belief that satisfy the axioms. I don’t doubt that this can be done. What I doubt is that these ascriptions match up with any natural gradations in people’s beliefs. That is, I doubt that an arbitrary consistent ascription will match any psychological joints.

1.2.2 The Logical Interpretation

A drop in the pressure is evidence that it will rain. The fact that the GDP has

gone up at least 2% for each of the last five years is evidence that it will do so again

this year. Propositions provide support for each other, and the logical interpretation

sees judgments about probability as estimates of this support. According to the

logical interpretation of judgments about probability, the content of such a judgment

concerns the value of a measure of the amount of support the evidence provides for a

proposition.

The logical interpretation claims that judgments about probability are judgments

about something objective: given any body of evidence and any proposition, there is a

fact about just how much support that body of evidence provides for that proposition.

Moreover, in many cases there is a way of measuring that support with numbers

between 0 and 1. This is what we’re really thinking about when we think about

probabilities. To say that the drop in pressure indicates a 50% probability of rain

is to say that the amount of support that the proposition “the pressure dropped”

provides for “it will rain” is measured at .5.

According to this interpretation, every judgment about probability is implicitly

based on some body of evidence whether or not we’re aware of it. Unless we explicitly

relativize to a body of evidence, such as when we make a conditional judgment about

probability, we’re implicitly relativizing to our total body of evidence.

The chief problem for the logical interpretation is to find a metric that can make

evidential support objective – how do we convert evidential support to a number

between 0 and 1? We could solve this problem by providing a satisfactory metric

of evidential support. Such metrics surely exist, but no single metric warrants the

position that the logical interpretation needs to give it. Historical attempts to do

this have typically made heavy and controversial use of the Principle of Indifference,

according to which we ought to assign equal probabilities to propositions when we

lack evidence that distinguishes them. We might fare better in providing an indirect

account of the metric, where the metric is determined by rational behavior in light

of the evidence. On this proposal, a probability relation holds between a proposition

and a body of evidence when the body of evidence makes it rational to behave in ways

consistent with assigning that proposition that probability and following a standard

decision procedure. That way we could specify what the value of evidential support

is in a way that removes some doubt about the existence of such a measure.

A second problem for the logical interpretation is that it suggests an unreasonable

amount of dogmatism. There are many questions of probability about which

disagreement seems to be rationally permissible, and moreover, in many of these cases

we may think of ourselves as being no more correct than anyone else. Is there a fact

of the matter about how much support our evidence provides to the skeptical hypothesis

that we are brains in vats? Or is there a fact about how likely we should believe it is that

something preceded the big bang? Reasonable people may disagree in their

judgments about these probabilities. In light of the existence of some propositions that

don’t appear to warrant an objective judgment, we should conclude that judgments

about probability are not judgments about any objective quantities. Until we find a

plausible metric, or evidence that one exists, we should look elsewhere

for an interpretation of probability.

1.2.3 Cognitivist Subjectivism

It is implausible that our judgments about probability concern a particular metric of

evidential support. This suggests an analogy with judgments about personal taste.

When someone judges that something is tasty, they seem to be judging that it tastes

good to them. When someone judges that something is funny they seem to be judging

that it is funny to them. When someone judges that a proposition is probable, they

seem to be judging that it seems probable to them in light of their evidence. There is

some etymological support for this hypothesis. Before the term ‘probability’ acquired

its present sense, it seems to have been used to mean ‘believable’.4 If we have to

assign content to the judgment, then we could do so by treating the content as a self-

ascription. The judgment “that is tasty” might be interpreted as a judgment “that

is tasty to me (or to us)” and, given the doctrine of degrees of belief, the judgment

“that is probable” might be interpreted as a judgment “I have a high degree of belief

that this is the case.”5

Subjectivism has a long history in ethics, and it has been, with a few exceptions,

widely regarded as an implausible metaethical position. The problems of

subjectivism about probability are a lot like the problems of subjectivism about ethics.

It is just implausible to think that judgments about probability concern our own

psychological states. We may judge that we occupy a psychological state without

making the corresponding judgment about probability, and we may make a judgment

about probability without making the (allegedly) corresponding judgment about our

psychological state. To judge that a proposition has a certain probability just feels

utterly different than judging that we have any particular attitude towards it.

There are a few tests we can do to tell whether two statements have the same

content, and these tests tell against cognitivist subjectivism. One test involves replacing

an instance of one claim with an instance of the other within a complex context. If

every judgment that a proposition had a probability were identical in content to a

judgment that the judge had a certain psychological state, then we should be able to

substitute one thought for the other in many contexts without a change in meaning.

For instance, there should be a sense of “I believe that it will probably rain

tomorrow” in which we can exchange “it will probably rain tomorrow” with “I

have a high degree of belief that it will rain tomorrow” without

changing the meaning. But there doesn’t seem to be. Seth Yalcin [41] has made much of

the observation that we cannot exchange psychological statements for probabilistic

statements in the context of supposition. While it makes perfect sense to suppose

that one might assign a high degree of probability to a falsehood, it doesn’t make any

sense to suppose that something might be both false and highly probable.

4 See [14].

5 See [31].

Subjectivism faces a number of problems beyond its intrinsic implausibility and

the issues it faces embedding in suppositions and conditionals. It is well known to

have difficulty handling disagreement, assertability, and retraction conditions. While

it may be possible to add epicycles6 onto the interpretation to solve these problems,

subjectivism must bear explanatory fruits before it is worth the effort to fend off its

challenges. The biggest problem with subjectivism is that it simply does not provide

enough of an advantage over noncognitivist views to merit biting the bullet on the

problems it creates or to warrant the complexities required to properly deal with them.

1.2.4 Credal Noncognitivism

Subjectivism characterizes judgments about probability as possessing psychological

contents. One move that has been very popular in metaethics is to shift from a

subjectivist cognitivism to noncognitivism. The noncognitivist agrees that the subjectivist

is right in thinking that there is a close tie between having a particular degree of

belief and making a particular judgment about probability. Judgments about

probability, however, aren’t mere representations of one’s degrees of belief. Instead of being

representations of degrees of belief, credal noncognitivists hold that judgments about

probability just are degrees of belief. Credal expressivism extends the view by adding

that probabilistic language is used to express one’s degrees of belief in the way that

ordinary language is used to express one’s beliefs.

Credal noncognitivism is noncognitive because it does not associate any special

content with judgments about probability. Judgments about probabilities certainly

have representational contents, insofar as they are propositional attitudes, but

probabilities are not parts of those contents. By identifying judgments about probability

with degrees of belief, the credal noncognitivist proposes to characterize judgments

about probability in terms of their cognitive roles, rather than the contents that they

have.7

6 See [8, 36, 24].

Just like the most popular versions of subjectivism, the view rests heavily on the

doctrine of degrees of belief. The way I will understand the proposal here, credal

noncognitivism is a substantive psychological proposal: it holds not just that

judgments about probability have a particular functional role – it holds also that they

involve the same kinds of gradations as degrees of belief. To judge that a proposition

has a probability of .9 is to have a degree of belief of .9 in it. I have expressed my

doubts about the doctrine of degrees of belief, and I think the problems with the view

provide us with good reasons to reject credal noncognitivism. There are, however,

versions of noncognitivism that are more promising. I think that noncognitivism

is the right way to interpret probability. But credal noncognitivism is wedded to

implausible psychological views. For this reason, it should be rejected.

Naive forms of metaethical noncognitivism face similar problems. Metaethical

noncognitivists think that moral judgments express attitudes that are desire-like,

disapproval-like, or intention-like rather than beliefs. Modern noncognitivists seldom

wish to commit to thinking that such judgments are nothing other than ordinary

desires or attitudes of disapproval. This would be ludicrous. Even if moral judg-

ments typically have the functional profile of desires, they are clearly very different

kinds of things from ordinary desires. We should be careful to distinguish the more

sophisticated moral judgments from the less sophisticated desires.

7 Having a certain content may also be a kind of cognitive role, but not every role gives rise to a content. Presumably, noncognitivists attribute roles to our judgments about probability that preclude them from having contents.

Credal noncognitivists have done little to explore the differences between judg-

ments about probability and beliefs. While I am sympathetic with the claim that

judgments about probability have the same functional profile attributed to degrees

of belief, I think that more work needs to be done to distinguish judgments about

probability from beliefs. I also think that once we have gotten clearer on the differ-

ence between beliefs and judgments of probability, it will become more plausible to

understand judgments about probability as constituting a move in a certain practice

rather than as being expressive of a particular kind of attitude.

1.3 Probability as a Practice

While credal noncognitivism itself has its problems, noncognitivism is promising. The

problems with credal noncognitivism arise from its connection to the doctrine of de-

grees of belief. We would do better if we could distinguish degrees of probability in

judgments about probability from degrees of belief. I propose that we understand

judgments about probability as moves within a particular practice. I’ll start by ex-

plaining what I mean when I say that a judgment is a move, and then I will give a

sketch of what the practice of assigning probabilities involves.

1.3.1 Intellectual Practices

We are deeply cultural creatures – most of our daily routines involve practices that

we have learned. We seldom stop to consider why we do things the way that we do.

We simply do them because that is the way it is done. This is true not just of our

unthinking habits, but of many of our intellectual practices as well. In learning

about mathematics, science, history, or philosophy, we are acculturated to go about

things in a certain way, to produce certain kinds of intellectual products, and to use

certain tools or heuristics.

When learning about a new field, we often acquire its basic concepts through

learning how to manipulate them for specific purposes. Most readers should be fa-

miliar with this from their early study of mathematics. Pre-college mathematics is

often taught as a set of tools to use for solving certain kinds of problems. Students

learn techniques for manipulating symbols and concepts in order to solve problems.

Mathematical concepts and techniques work like cognitive slide-rules. While stu-

dents understand how to use the symbols and concepts to solve the problems set

before them, they needn’t have any understanding of why the procedures that they

are taught to use work or what the representative significance of the symbols or con-

cepts or their manipulation might consist in. One doesn’t need to understand the

epsilon-delta definition of a limit in order to apply calculus to find an area. Nor does

one need to have any idea what to make of imaginary numbers in order to use them

to solve equations.

We learn about formal and informal probability theory in the same way that we

learn about calculus and imaginary numbers. Whether instructed by teachers or

through the observations of its use by others, we learn how it is that we are supposed

to use assignments of probability in order to solve problems. Through observation

and education, we acquire the ability to engage in the practice ourselves. There

may be some deep representational meaning that makes sense of the way that we

use probability, as there is with the epsilon-delta definition of derivatives in calculus,

but there also need not be. It may be that the practice we have for calculating

probabilities is useful even if there is no ultimate interpretation of probability that

makes sense of that use as representational.

I propose that judgments about probability are characterized with respect to the

practice of assigning probabilities. The proper subject of analysis isn’t the judgments

themselves, but the practice that those judgments fit into. To characterize the prac-

tice, we don’t need to explain what kinds of mental states are involved in employing

it. There might be many ways of characterizing a practice, but the practice of proba-

bility can be adequately characterized in terms of its rules and its aim. The practice

of playing horseshoes is characterized by rules that specify that the players are to

take turns throwing their horseshoes from a set distance. The winner is the player

whose horseshoe comes closest to a specified post. The aim of the practice is to win.

Similarly, the practice of assigning probabilities has rules and an aim that together

characterize the practice.

Of course, not every rule involved in a practice characterizes that practice – only

those rules that are constitutive. We may regard it as a rule that one ought not throw

one’s horseshoe so as to injure the other player and hamper their ability to throw their

remaining horseshoes. This rule may govern the practice of playing horseshoes, but

it isn’t constitutive of it. It is difficult to say exactly which rules are constitutive and

as such many practices admit of some vagueness. In the final chapter, I will suggest

that a rule is constitutive if it is required for coherent and intentional engagement

in the practice. This doesn’t, however, provide a precise division between constitutive

and non-constitutive rules. Can one use one’s foot to toss a horseshoe? Is the distance

between the post and the horseshoe to be measured in a straight line

along the plane of the surface of the Earth, or are differences in height to be taken

into consideration? For many of these issues, there need be no fact of the matter.

We must also say something about what it is for someone to engage in that

practice. A judgment about probability is a form of engaging in the practice of

assigning probabilities. Not everyone who tosses a horseshoe is playing horseshoes.

They must be intentionally engaged in the game. For horseshoes, engaging in the

practice requires a set of communal intentions to play the game. Anyone who engages

in the practice of horseshoes and, within that practice, tosses a horseshoe has made

a certain kind of move. We might also say that a student who manipulates symbols

learned in calculus class is engaging in the practice of using calculus to solve a problem.

Their engagement in this practice comes from their intention to manipulate symbols

in the way that they were taught. The steps in their manipulations count as moves

within this practice. Similarly, I suggest we regard judgments about probability as

moves within a practice. While they might not be so heavily constrained, they are

nevertheless governed by specific rules for how they should be assigned.

I propose that the practice of assigning probabilities is captured by certain consti-

tutive rules that govern the practice and an aim for that practice and that we count

as engaging in the practice insofar as we intentionally utilize symbols and concepts

designed for this practice. This view differs from credal noncognitivism in that it sees

judgments about probability as distinct from run-of-the-mill beliefs. These judgments

are more akin to the kinds of judgments we make in the course of solving a math

problem than they are like attitudes of confidence.

I will make the case in the third chapter that the rules of the practice are the

formal norms of probability. In the next section of this chapter, I will say something

about the aim of the practice of assigning probabilities.

1.3.2 The Use of Probability

The characteristic use of judgments of probability is to solve problems in practical

and epistemic decision making. We calculate probabilities in order to decide what

to believe, how confident to be, and what to do. The best analogies for the practice

of assigning probabilities are methods of card-counting. Clever blackjack players

utilize various methods to keep track of the cards that have come up in order to

place better bets in future rounds. Blackjack players would be better off if they

could remember every card they have seen and knew the optimal bet to make in

their precise circumstances. Limits of memory and mathematical ability prevent us

from keeping and utilizing this kind of information. So instead, these players use

a heuristic to keep track of what is important about what cards they’ve seen. The

heuristic is useful because it is easy to keep track of and can easily be applied given very

simple rules. We come up with schemes for card counting as a way of keeping track

of what evidence we’ve received in a form that bears well on the decisions that we

have to make.
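
To make the analogy concrete, here is a minimal sketch of one well-known counting scheme, the Hi-Lo count. The text does not name a particular scheme, so the choice of Hi-Lo, the function names, and the betting rule are all assumptions made for illustration. The point is only that a long history of evidence is compressed into a single number that bears directly on the decision at hand.

```python
# Hi-Lo card values (an illustrative scheme; the text names no particular one).
# Low cards leaving the deck favor the player, high cards leaving it do not.
HI_LO = {rank: +1 for rank in ("2", "3", "4", "5", "6")}
HI_LO.update({rank: 0 for rank in ("7", "8", "9")})
HI_LO.update({rank: -1 for rank in ("10", "J", "Q", "K", "A")})


def running_count(cards_seen):
    """Compress the whole history of seen cards into one integer."""
    return sum(HI_LO[rank] for rank in cards_seen)


def suggested_bet(count, base_bet=1):
    """Hypothetical betting rule: raise the bet only when the count is positive."""
    return base_bet * (1 + max(count, 0))


history = ["2", "5", "K", "9", "A", "3"]
count = running_count(history)
print(count, suggested_bet(count))
```

The player never consults the full history again. Only the count survives, and the rule for using it is simple enough to apply at the table.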

If we were ideal agents, we might be able to remember every fact we ever learned

and muster all of our information whenever we had to make a decision. We might

also see, simply upon surveying the evidence and knowing our desires, what decision

we should make. If we were also masters of an ideal language, we might communicate

the full body of information that we possess to each other with just a few words. We

are not ideal agents and we have no ideal language. In our intellectual and practical

pursuits, we encounter far more information than we can remember, analyze, or

easily communicate. It is useful to have a shorthand way of keeping track of how our

evidence bears upon what decisions we want to make. Since we cannot internalize and

analyze our evidence all at once, it is helpful to be able to convert that evidence into

a compressed form whose upshot for decision making is easy to understand.

Probability plays this role. When we make a judgment about probability, we

commit ourselves to a certain manner of representing that evidence for the purpose

of decision making. The particular form that judgments about probability take can be

explained by the particular uses to which we put them. We adopt the scheme of applying

the axioms of probability because it is useful to have a compressed representation

of evidential support that satisfies those axioms for decision making. The particular

structure of probability assignments lends them to decision making because it allows

us to apply utility maximization as a decision procedure.
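
The connection to utility maximization can be illustrated with a small sketch. The states, actions, and utility numbers below are invented for the example, and the function names are hypothetical. The sketch shows only how a probability assignment, once made, can be fed into a simple decision procedure.

```python
def expected_utility(action, probabilities, utilities):
    """Probability-weighted sum of the utilities of an action across states."""
    return sum(p * utilities[action][state] for state, p in probabilities.items())


def best_action(probabilities, utilities):
    """Pick the action whose expected utility is highest."""
    return max(utilities, key=lambda a: expected_utility(a, probabilities, utilities))


# Illustrative numbers only: a compressed representation of the evidence
# about tomorrow's weather, and made-up utilities for two actions.
probabilities = {"rain": 0.75, "dry": 0.25}
utilities = {
    "take umbrella": {"rain": 5, "dry": 3},
    "leave umbrella": {"rain": -10, "dry": 6},
}

print(best_action(probabilities, utilities))
```

Nothing in the procedure looks back at the underlying evidence. The probabilities stand in for it, which is the sense in which they serve as a compressed form.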

The calculation of probabilities is a technique that we’ve learned in order to solve

decision-making problems. Judgments of probability are no more a part of our in-

herent cognitive repertoire than strategies for card counting. Belief is part of our

inherent cognitive repertoire, as are whatever gradations it comes in. Judgments

about probability differ from them in the same kind of way that a judgment about

the current count differs from a degree of belief in the kinds of cards that one will

soon draw. Our judgments about probability, like our judgments about the count,

are more cognitively sophisticated.

In outline, this approach to probability is consistent with the logical interpreta-

tion. Just as the concepts of calculus found representational interpretations in the

epsilon-delta definitions, it is possible that some day we will find representational in-

terpretations for our judgments of probability. Whether this is so depends on whether

we can find an objective metric for measuring probabilistic support. I have already

expressed my doubts that this will ever happen. We might also adopt a fictionalist view

according to which the logical interpretation is right about the content of the judgments

about probability, but wrong about our commitments when we make such judgments.

We aren’t committed to their truth. There is much to like about fictionalism, but

in the absence of a clear explanation of why we need to attribute representational

content to judgments about probability, I am skeptical that the attribution of con-

tent is anything but superfluous. There is no reason why we should think that our

judgments about probability even purport to be anything objective.

I disagree with the logical interpretation when it comes to the assumption of

objectivity of measures of evidence, and this is largely why I prefer the noncognitive

account of probability as a practice. A judgment about probabilities is a kind of

decision we make about how to represent the evidence for the purposes of decision

making. Sometimes, evidence may be conclusive, and the only rational thing to do is

to assign one particular probability. But sometimes the evidence won’t itself warrant

or demand any particular representation. No representation needs to be rationally

forbidden. This doesn’t undermine the practice of assigning probabilities, because

when one assigns a probability, one isn’t committed to regarding that assignment

to the proposition as correct. The view of probability as a practice preserves much

of the appeal of the logical interpretation without committing itself to that view’s

extravagances.

1.4 Overview of What’s to Come

In the chapters that follow, I will explore issues related to the interpretation of proba-

bility that I have just proposed. The second chapter concerns the cognitive grounding

of judgments about probability. I discuss how it is that our judgments of probability

really do differ from our degrees of belief. I argue that we cannot regard judgments

about probability as the credal noncognitivist proposes because that theory encoun-

ters a number of basic explanatory problems. In short, judgments about probability

seem to resemble beliefs about heights, weights, temperatures, duration, and prices

much more than they do gradations of belief. They act like standard estimations

of quantities. The lesson that I draw from this is that we should regard judgments

about probability as conceptualized in the same way that we regard judgments about

other quantities. What we’re doing when we ascribe a probability is, from a cognitive

perspective, more or less what we’re doing when we ascribe a height.

This supports the interpretation of probability as a practice. We often pick up

new concepts that are constitutive of a practice without thinking or caring too much

about their representational fidelity. What matters is that they work. We use the

same cognitive mechanisms to manipulate these concepts that we use to manipulate

representational concepts, including concepts of quantities, because it is useful. Think

about the practice of rating a movie or a restaurant. It is hard to say what, exactly,

the rating represents. Nevertheless, by applying a quantity to the movie we invite

certain kinds of manipulation. We can, for instance, say that one movie has twice as

many stars as another. We can compare averages and talk about relative differences

in quality with precision. This may not be especially useful when it comes to rating

movies, but it is very useful for using measures of evidential support.

The next two chapters concern the norms that govern judgments about probability.

Many different kinds of norms have been taken to apply to our judgments about

probability. Two classes of norms are particularly fundamental. In the third chapter,

I discuss the formal norms of probability. The formal norms of probability concern

what kinds of judgments about probability are formally consistent with each other.

It is irrational to judge that the probability that it will rain tomorrow is .75, while

simultaneously judging that the probability that it will not rain tomorrow is .5. Why

should this be so? Many different explanations have been offered. I think that these

explanations are overly elaborate and their extravagances detract from the quality

of the explanations that they are able to provide. The answer is very simple. If we

understand judgments of probability as moves in a practice, we can understand the

formal norms as constitutive constraints on the practice. There are pragmatic reasons

to adopt a practice with the axioms of probability as formal constraints. Some of the

standard arguments for these formal constraints attest to these pragmatic reasons.

The constitutivity of the constraints for the practice, however, is what ultimately

explains the authority of the formal norms for the practice.
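
The rain example can be checked mechanically. The following is a minimal sketch of one formal constraint, that the probabilities of a proposition and its negation lie between 0 and 1 and sum to 1. The function name and the tolerance parameter are assumptions of the sketch, and the check covers only this one consequence of the axioms.

```python
def complement_coherent(p, p_not, tol=1e-9):
    """Check that P(A) and P(not A) are each in [0, 1] and sum to 1."""
    in_range = 0.0 <= p <= 1.0 and 0.0 <= p_not <= 1.0
    return in_range and abs(p + p_not - 1.0) <= tol


# The pair of judgments from the text: rain at .75, no rain at .5.
print(complement_coherent(0.75, 0.5))   # the two values sum to 1.25
print(complement_coherent(0.75, 0.25))  # the two values sum to 1
```

On the view defended here, the first pair of judgments fails because it is not a permissible move within the practice.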

The norms of the second fundamental class are relational. They tell us how one set of

judgments about probability based on one body of evidence ought to relate to another

set of judgments based on another body of evidence. These norms are often thought

of as diachronic norms for updating one’s probability assignment upon receiving new

evidence; however, I prefer to think of them as synchronic norms about how one’s

judgment of the probabilities relative to different bodies of evidence should relate to

each other. In the fourth chapter, I turn my attention to what I call the ‘Bayesian

procedure’, which is a procedure for settling the probabilities of propositions that

turns on the process of conditionalization. I argue that the Bayesian procedure is

not as authoritative as it is often regarded as being. While we may be required in

many cases to defer to its dictates, there are also cases in which we are free to ignore

them. I argue for this conclusion by providing a rationale for the procedure, and then

showing how the rationale can fail to hold. My analysis of the Bayesian procedure is

dependent on my account of the function of judgments about probability. Judgments

about probability represent our evidence for the purposes of action. It makes sense

to use the Bayesian procedure when one has certain kinds of commitments about

how evidence is to be represented. However, not every judgment about probability

provides us with the kinds of commitments that give the procedure authority.
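
For readers unfamiliar with conditionalization, here is a minimal sketch of the standard rule, on which the probability of a hypothesis given the evidence is proportional to its prior probability times the probability of the evidence given the hypothesis. The hypotheses, the numbers, and the function name are invented for illustration. The sketch is not meant to capture the full Bayesian procedure discussed in the fourth chapter.

```python
def conditionalize(prior, likelihood):
    """Return P(h | e) for each hypothesis h.

    prior: dict mapping each hypothesis to P(h)
    likelihood: dict mapping each hypothesis to P(e | h)
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())  # this is P(e)
    if total == 0:
        raise ValueError("cannot conditionalize on evidence of probability zero")
    return {h: joint[h] / total for h in joint}


# Illustrative case: is a coin fair or biased toward heads, given one head?
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5, "biased": 0.9}  # P(heads | h)
posterior = conditionalize(prior, likelihood)
print(posterior)
```

Read synchronically, the function relates two assignments, one relative to the evidence without the observed head and one relative to the evidence with it.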

Chapter 2

Probability and Confidence:

Grounds for Divorce

2.1 Introduction

Consider the following two theses.

The reducibility of belief thesis holds that familiar categorical be-

liefs are ultimately grounded in gradational attitudes. According to

the reducibility of belief thesis, our beliefs are a product of our levels of

confidence – there is no genuine psychological gap between full categorical

beliefs and attitudes of moderately high confidence.1

Noncognitivism about probability holds that certain judgments

about probabilities do not represent the world as being some particular

way vis-a-vis probability. In particular, judgments about probability

1 There are many different ways to try to carry out this reduction, the simplest of which is the Lockean Thesis [11], which holds that having a belief in a proposition is a matter of having the underlying gradational attitude to a sufficiently high degree.

are typically understood as a special kind of attitude directed at the

proposition whose probability is being assessed. The object of the at-

titude is a proposition, but the degree of probability is not in any way

reflected in the proposition itself. The proposition isn’t a proposition

about probability. Instead, the degree of probability ascribed by the

judgment is an aspect of the attitude itself, just as when we have a

strong desire that the cost of milk is low, the strength of the desire is

not a part of the attitude’s object. Since the degree of probability is

part of the attitude rather than its object, the attitude itself is gradational.

In this chapter, I will use ‘confidence’ to refer to the graduated state underlying

belief2 and ‘judgment about probability’ to refer to the attitude postulated by the

noncognitive interpretation of probability.

We have good reasons to accept both the reducibility of belief thesis and

noncognitivism about probability. While these two theses are independent (one could think

that judgments of probability ought to be understood in terms of some gradational

attitude without thinking that that attitude is the same one that grounds belief),

parsimony pushes us to unite them.

According to the unification thesis, judgments about probability and levels of

confidence are really the same thing. The unification thesis is a bold theory about

psychology. It says that our beliefs are graduated in precisely the same manner as are

our judgments of probability. The unification thesis is a direct consequence of credal

expressivism (according to which we use probabilistic language to express degrees

of confidence)3 and a minimalistic semantics for ‘judgment’ (according to which “P

judges that probably Φ” is true roughly iff P has the mental state that P could

2 I retain ‘degrees of belief’ as defined in the first chapter as posits of the doctrine of degrees of belief. The graduated state underlying belief needn’t satisfy the components of the doctrine.

3 Credal expressivism is the linguistic analogue to the view of credal noncognitivism that was introduced in the first chapter.

appropriately express with the sentence “probably Φ”)4. Though it is rarely spelled

out and explicitly endorsed, doctrines that lead to the unification thesis seem to be

widely accepted.

I will argue in favor of a separation. I will advocate a psychological picture on

which judgments about probability are not attitudes of confidence, but are instead a

separate and more cognitively sophisticated attitude. What gives judgments about

probability their sophistication, and what prevents them from being mere attitudes

of confidence, is the way in which they involve concepts of probability. Attitudes of

confidence do not employ a concept of probability, but judgments of probability do.

Whereas infants and non-human animals are probably capable of the former kind of

attitude, only adult human beings seem capable of the latter.

I will present three considerations that suggest that judgments about probabil-

ity are something more sophisticated than the unification thesis would have it. One

consideration stresses the logical complexity of our judgments about probability. A

second consideration focuses on the fact that judgments about probability come in

numerically precise degrees. The final consideration relies on our capacity to make

judgments that are ambiguous between judgments of objective and subjective prob-

ability.

Before presenting these considerations, I will clarify the claim that judgments of

probability involve concepts of probability.

2.1.1 Probability as a Concept

A subjective interpretation of some judgments of probability is consistent with an

objective interpretation of others. When we judge that a die has a certain chance of

landing on a ‘6’ or that a radium atom has a certain chance of decay, the judgment may

concern some fact independent of the judge’s individual evidence. Judgments about

4 Someone who does seem to hold both of these views is Seth Yalcin. See [41] for his presentation of a minimalist semantics for attitudes (with ‘might’) and [42] for his discussion of credal expressivism.

objective chances cannot plausibly be interpreted as gradational attitudes. Instead,

they appear to be a species of ordinary categorical beliefs with gradations built into

their representational contents. Those judgments about probability that are most apt

for a subjective interpretation are those that are neither true nor false independently

of the judge’s relation to the proposition. The judgment that the Goldbach Conjecture

is probably - but only probably - true is a paradigm case. Either the conjecture is true

or it is not. The judgment of intermediate probability is more a reflection of

the judge’s own uncertainty than a verdict on the conjecture itself.

One reason to maintain that judgments about probability are not attitudes of

confidence is that probabilistic concepts play a role in the former but not the latter.

The gradations in attitudes of confidence are non-conceptual: one has no more need of

special concepts to be confident to different degrees than one has need of any special

concepts to be scared to different degrees.

Proponents of the unification view may draw from this the conclusion that judg-

ments about probability do not require concepts either. It is sometimes alleged

that judgments about probability cannot require any special concepts, because such

intellectually-undeveloped beings as infants and non-human animals seem capable

of making them [12, 9]. Recent experiments have demonstrated that infants have a

surprising capacity to make predictions on the basis of observations of relative pro-

portions.5 However, the conclusion only follows if we already take on board the

unification thesis. It is precisely because judgments of probability are conceptual

that I think the unification thesis must be false.

5 If an infant is shown three red marbles and one blue marble bouncing around inside of a container with a single opening, he will look longer if the first marble to escape is blue [33]. Infants who are first shown several marbles randomly drawn from a container and then shown the colors of marbles remaining inside the container look longer if the remaining marbles do not resemble the marbles drawn [40]. Infants are known to look longer at outcomes that they do not expect, so these experiments strongly suggest that infants’ expectations are guided by observations about relative proportions.

The idea can best be understood through analogies. Beliefs about the weight of

a bowling ball can be gradational in two different ways. We can believe that a ball

is heavier or lighter, and we can be more or less confident in our belief. I take it

that these two forms of gradation are very different. Worries about the price of milk

and desires about the high temperature for the day are gradational along two similar

dimensions. The gradations of variance of weight, price, and temperature in these

attitudes are conceptualized. Our attitudes involve concepts of weight, price, and

temperature in different ways. The different ways that they involve these (and other)

concepts determine the gradations. There are no concepts, the manner of whose

involvement determines the variance of strength of desire6 or worry.

When I say that gradations of probability involve concepts of probability, I mean

to say that probabilistic concepts figure into these attitudes in the way that concepts

of weight, price, and temperature figure into beliefs. I will speculate a bit in the next

section about what this manner of figuring might involve. However we may ultimately

understand the nature and role of concepts, my claim is that the way the dimensions

of gradation enter into our judgments about probability looks more like the way that

gradations of weight, price and temperature enter into beliefs and desires than the

way that gradations of confidence or desire strength do.

2.1.2 Content and Vehicles

When we believe something about weights, prices, or temperatures, the differences in

weight, price, and temperature correspond to differences in the representational con-

tent of our attitudes. There is no similar representational difference between beliefs

qua beliefs and worries qua worries. The former difference is conceptualized. The

latter is not. This could lead to the thought that conceptualized attitudinal grada-

6 If metaethical noncognitivists are right, moral concepts may be an exception to this claim. I think it is most plausible that these desires differ from ordinary desires in the same kind of way that I allege that judgments about probability differ from beliefs.

tions have to be representational, and consequently, they must reflect real gradations

out there in the world. If so, then in arguing that gradations in the assignment of

probability are conceptual, I am undermining noncognitivism about probability.

Here is one way we might make sense of the postulated similarity between judg-

ments of probability and price: we might construe judgments about probability as

ordinary judgments about a specific class of propositions. There are many candi-

dates for this class of propositions. We might adopt the logical interpretation, on

which judgments about probability concern some numerical relation between bodies

of evidence and propositions. Alternatively, we might adopt a cognitivist subjectivist

stance, according to which our judgments are judgments about our actual degrees of

belief.

I do not, however, think that we must infer from the fact that conceptualized

gradations are often representational that they are always so. It seems likely that

there is a distinction between the dimensions of gradation at the level of their cognitive

implementation. If we do have a language of thought, then probabilistic concepts

figure amongst its vocabulary. However, my primary contention is that the way

that attitudes involve gradations of probability is like the way that attitudes involve

gradations of weights, prices, and temperatures.

I don’t hope to settle this issue here, and while I remain deeply attracted to the

noncognitivist approach, the point that I wish to make is far more basic. Whether

we ultimately opt for cognitivism or noncognitivism, I think we need to make the

same claim about the cognitive vehicles of the attitude. That claim is that judgments

about probability involve vehicles with probabilistic components in the way that

judgments about weight involve vehicles with weight-related components. Just as we

talk about probabilities with sentences containing words for probability, so we think

about probabilities with vehicles incorporating concepts of probability.

Not all differences in the apparent content of belief can be traced to differences in

representational content. Beliefs with the same propositional content can differ with

respect to the modes of presentation of the entities involved. There may be some

modes of presentation that, given a certain failure of reference fixing, lack any kind of

referent. Examples of the latter kind might be common in failed theories. A man may

once have thought that he had a particularly high amount of yellow bile. Supposing

that the theory of humors was based on faulty presuppositions, we may plausibly

say that the man had a belief without any particular propositional content. There is

no such thing as yellow bile, and lacking a proper subject, the man’s judgment was

not about anything in particular.7 If the attitude was nonrepresentational, it would

nevertheless be a belief and could be ascribed gradations in precisely the same way

that beliefs about weights, prices, or temperatures can be.

How do we explain this? The explanation may ultimately be that beliefs that have

propositions as their objects (or at least aspire to) gain access to those propositions

with the help of certain cognitive vehicles8. This would make propositional attitudes

parallel propositional assertions: just as we assert a proposition by uttering a sen-

tence, we believe a proposition by tokening a vehicle. I take it that these vehicles

allow us to compose concepts together in some way or other. This view is consis-

tent with many different stories about the structure of these vehicles. They might

be sentence-like, or map-like, or cross-word-puzzle-like, or they might be coded as

recipes like DNA, or they might have a completely different kind of structure. They

7 It is certainly possible to find some propositional content to ascribe to him: perhaps he believes that he has a high amount of what the physicians refer to as ‘yellow bile’. I think that this assumption is unmotivated. Possessing deeply mistaken assumptions is often an adequate reason to think that reference doesn’t occur and that as a result the individual has some mental states that count as beliefs but lack representational content.

8 I take it that the best theories of how vehicles match up with propositions rely on some kind of correspondence between the functional relations existing between these attitudes and their vehicles on the one hand, and the relations between propositions on the other. In a sentence, expressing a proposition is epiphenomenal to the vehicle. The vehicle’s behavior explains why it expresses a proposition, rather than the other way around. This means that there need not be any important distinction between those attitudes which, by virtue of suitable correspondence, get to be propositional and those which do not.


might even be semantically compositional in only the loosest of senses. I don’t think

it matters for the present point, which is that we do not need to find gradations in

propositional contents in order to justify the association of gradations of probability

and gradations of weights, prices, and temperatures. We can instead justify the asso-

ciation by supposing that concepts of probability figure into the vehicles of judgments

of probability in the way that concepts of weights, prices, and temperatures figure

into the vehicles of judgments about weights, prices, and temperatures (however that

may be).

In the remainder of this paper, I will present three considerations that suggest

that gradations of probability share more in common with the prototypical conceptual

gradations than they do with prototypical non-conceptual gradations.

2.2 Consideration 1: Structural Diversity

The first thing to note about judgments of probability is their potential for complexity.

These judgments often have a straightforward form: we judge that some proposition

has some probability of being true and it is these kinds of judgments that make the

unification thesis most plausible. But our judgments about probability are not limited

to straightforward probability ascriptions. We can judge that probabilities are related

to each other in subtle ways.

The simplest of the more complex judgments are comparative. Consider the judg-

ment expressed by the following sentence.

It is more likely that there is intelligent alien life in the Milky Way Galaxy than

in the Sagittarius Dwarf Galaxy.

This judgment does not reflect any particular degree of confidence in a particular

proposition. The judgment doesn’t concern either the existence of alien life in the

Milky Way or in the Sagittarius Dwarf Galaxy all by itself. The judgment somehow


compares the two. The unification thesis cannot make sense of this fact on its straight-

forward reading. The sentence clearly expresses a judgment about probabilities, but

it doesn’t clearly express any particular level of confidence.

The unification thesis might be saved by regarding the sentence as expressing a

relation between a pair of attitudes of confidence. One might be so confident in the

existence of life in the Milky Way and so confident in the existence of life in the

Sagittarius Dwarf Galaxy, and by virtue of the greater degree of confidence in the

former, count as making the comparative judgment.

The problem with this reductive solution is that a person need not make any first-

order judgments regarding the probabilities of the compared propositions in order to

make the comparative judgment. We can think that it is more likely for life to be in

one place than the other while at the same time being completely baffled about the

probability of life in either location.

Our capacity for irreducibly comparative judgments of probability distinguishes

the gradations of probability from the gradations of other attitudes. While it is

possible to have comparative degrees of fear (it is possible to be more afraid of one

thing than another) it is not possible to be so without being afraid to some particular

degree or other of each thing. The same goes for most other gradational attitudes.

Higher-order comparisons of fear are grounded in lower-order degrees of fear, while

higher-order probability assignments need not be grounded in lower-order probability

assignments.

By contrast, the kinds of gradations grounded in conceptual contents have the

potential for being irreducibly comparative. The following is a rather run-of-the-mill

belief.

The price of oil will be greater if Iran has a homegrown revolution than

it will be if Iran is bombed by a foreign power.


One can have this belief while at the same time knowing nothing about the actual

price of oil in either situation. The comparison can be irreducible.

The unification thesis might be saved by admitting that confidence is a bit different

from other gradational attitudes. Sometimes it is suggested that judgments of

probability are essentially comparative. Perhaps degrees of confidence are compar-

ative in the sense that the object of the fundamental psychological attitude is not

a proposition, but a pair of propositions. There is no such thing as a degree of

confidence in isolation, there are only relations of confidence. We can interpret non-

relational judgments as implicitly comparative. Perhaps to think that the probability

that there is life in the Milky Way is high is to be only slightly less confident in it

than in a tautology. Instead of understanding comparative probabilities in terms of

non-comparative probabilities, we must understand non-comparative probabilities as

implicitly comparative. This will allow us to save the unification thesis while at the

same time making sense of irreducibly comparative judgments.

Unfortunately, this strategy is complicated by the existence of yet other apparently

irreducible forms of judgments about probabilities. Alan Hájek [15] has argued

convincingly that conditional probabilities, such as expressed by the following, cannot

be reduced to unconditional probabilities.

It is likely that the economy will rally, given that Greece institutes brutal

austerity measures. (Conditional)

Our only real option, consistent with the unification thesis, is to try to reduce one

to the other.


A reduction of conditionals to comparatives is not implausible; taking inspiration

from the traditional ratio analysis of conditional probabilities9, we might try to

reduce conditionals to primitive comparatives. There are, however, many other forms

of probabilistic judgments that would also require a separate reductive treatment.

For instance:

It is more likely that the economy will collapse, given that Greece de-

faults, than it is that the economy will rally, given that Greece institutes

brutal austerity measures. (Comparative Conditional)

The extent to which it is more likely that there is alien life in the Milky

Way Galaxy than in the Sagittarius Dwarf Galaxy is greater than the

extent to which it is more likely that there is alien life in the Andromeda

Galaxy than in the Milky Way Galaxy. (Comparative Comparative)

On any given night, it is likely that it rains somewhere. (Quantified)

Each king is more likely to have died young than his predecessor.

(Quantified Comparative)

Apart from the problem of giving some kind of general reductive interpretation

that captures the extent of this variety, there is the problem of explaining the dimen-

sions of the variety as well. The degrees of probability appear to be tightly integrated

with the attitude’s content and exhibit systematicity and productivity. The examples

9 According to the ratio analysis, a conditional probability Pr(Q|R) is equal to the ratio of non-conditional probabilities: Pr(Q&R)/Pr(R).


above show how complexity can be produced by combining and embedding complex

forms within other complex forms. The dimensions of variety are determined in a sys-

tematic way that mirrors the variety of judgments that we can make about weights,

prices, and temperatures. The best explanation of this fact is that gradations of

probabilities and gradations of weights, prices, and temperatures figure into these

judgments in the same way.
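
The ratio analysis cited in footnote 9 can be made concrete with a small sketch. The joint distribution below is hypothetical (the numbers and the names `JOINT`, `probability`, and `conditional` are assumptions introduced only for illustration), and the code simply computes Pr(Q|R) as Pr(Q&R)/Pr(R) for the Greek austerity example. It also shows that the ratio is undefined when the condition has probability 0, which is one reason a reduction of conditional to unconditional probabilities is not automatic.

```python
from fractions import Fraction

# Hypothetical joint distribution over (austerity, rally) outcomes; the numbers
# are made up for illustration and are not taken from the text.
JOINT = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def probability(joint, holds):
    """Unconditional probability of the proposition picked out by `holds`."""
    return sum((p for world, p in joint.items() if holds(world)), Fraction(0))

def conditional(joint, q, r):
    """Ratio analysis: Pr(q | r) = Pr(q & r) / Pr(r), or None if Pr(r) = 0."""
    pr_r = probability(joint, r)
    if pr_r == 0:
        return None  # the ratio is undefined for a zero-probability condition
    return probability(joint, lambda w: q(w) and r(w)) / pr_r

def austerity(world):
    return world[0]

def rally(world):
    return world[1]

print(conditional(JOINT, rally, austerity))        # Pr(rally | austerity)
print(conditional(JOINT, rally, lambda w: False))  # condition with probability 0
```

On these assumed numbers the first call yields 3/4 and the second yields `None`.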

2.3 Consideration 2: Novel Quantifiability

There are reasons to think that, among all of the gradational attitudes, judgments of

probability are special. Unlike other gradational attitudes, the degrees of probability

are quantifiable: we measure our probabilities with numbers. We may, for instance,

judge that the probability that Zenyatta will win the race is .75. By contrast, we do

not measure the degrees of our other gradational attitudes, such as fear, with numbers.

It makes no sense to say that one is scared to degree .5, or that one has ten degrees of

fear. We do, however, measure weights, prices, and temperatures with numbers. This

suggests that the way in which judgments of probability are gradational is different

from the ways in which other attitudes are gradational, and that it is more like the

way that judgments about weights, prices, and temperatures are gradational.

The quantifiability of probability stands in need of special explanation. Two fur-

ther facts about the way that we quantify probabilities bolster the difference between

judgments about probability and other gradational attitudes and point to what might

explain this. First, the quantification of degrees of probability exhibits a potential for

unbounded precision. Second, such quantification is historically novel. In the next

two sections, I will explain how these facts point to the involvement of probabilistic

concepts in our judgments of probability.


2.3.1 Precision

Not only do we measure probabilities with numbers, but these measurements admit of

no limit to their possible precision. Precise numerical judgments are rare, but we have

the capacity, in the right epistemic situations, to assign very precise probabilities.10

We seldom make use of this capacity because the epistemic situations in which we

find ourselves seldom call for it, but it is not a cognitive limitation that prevents us.

The problem for the unification thesis is to explain how it is that our attitudes are

capable of precise quantification. If we give up the thesis, we can explain numerical

precision in the same way that we can explain numerical precision in beliefs. We are

able to believe propositions involving precise quantification. It is easy, for instance,

to believe that a particular mountain is 13,563.2 feet above sea level. It is hard to see

how we might do this without employing a recursive capacity to construct concepts

of numbers that can then figure into the vehicles of the attitudes. Someone who has

not been taught a recursive representation of numbers would not be able to have this

kind of belief. The presence of a concept of the number 13,563.2 explains how it is

that the belief gets to be a belief about so precise a height. A similar presence of

numerical concepts in probabilistic attitudes could explain their numerical precision.

The advantage that we get in supposing that concepts figure into the gradation

of probabilities isn’t that it provides a concrete explanation of how probabilities get

to be numerically precise. We do not know exactly how beliefs about prices get to be

numerically precise, either. But we know there has to be such an explanation in order

to explain the precision of other beliefs. If we think that probabilities are conceptual,

we can make use of the same explanation we already must assume exists.

10 Precise assignments like this are more common with conditional probabilities where we can use the antecedent to set up ideal contexts for precision.


2.3.2 Novelty

Undoubtedly probabilistic attitudes, in some form or other, have long played a part

in the cognitive lives of human beings. Ancient Romans probably judged that some

charioteers were more likely to win a given race than others, and the Sumerians surely

noted when the clouds suggested that rain was likely. It is only recently, however,

that numbers have been involved in judgments about probability. The extent to which

people thought about probability before the 17th century is not entirely known, but it

is relatively uncontroversial that nobody assessed probabilities (even comparatively)

with the use of numbers before the innovations of Pascal, Leibniz, and Bernoulli [14].

We have long made judgments about probability, but we have only recently begun

to quantify them. It would be quite surprising if our capacity for probabilistic preci-

sion were special – if it were distinct from our general capacity for precise judgments,

we would not expect it to appear and develop between 1650 and 1850. What seems

far more likely is that epistemic advances provided a reason to have more precise

subjective probabilities. There was no aspect of our cognitive architecture that was

dedicated to quantifying probability assignments; instead old structures (the same

structures as underlie beliefs) were co-opted for a new purpose.

There must have been some time before we began thinking about heights in terms

of numbers. The introduction of numerical measures of length radically changed the

kinds of judgments we were able to make. We went from being able to make vague

and comparative assessments of heights, to making judgments about precise numerical

heights. Throughout our history, there have been lots of quantities that we have come,

over time, to measure with numbers: distance, weight, mass, volume, hardness, and

temperature, among many others. Describing a quantity with numerical precision

requires a reliable measure. Distance measures were easier to come by and appeared

comparatively early. Temperature measures required more sophisticated devices, and

so came about relatively late. We don’t ordinarily measure colors with numbers, but


it is easy to see how we could introduce numerical precision into our color concepts

with a suitable measure. Concepts can transition from pre-quantitative primitive

forms to quantitative forms.

Contrast this with other attitudes. It is hard to imagine that we could suddenly

start having quantification built into our gustatory experiences (so that some foods

literally tasted twice as good as others). Or that we could suddenly start having

precisely-quantified fears or worries. We could learn how to assign numbers to levels

of fear, but the natural metric on numbers would correspond to nothing so natural

for fears. It makes no sense to say that one thing is twice as scary as something else –

though it does make sense to say that something is much scarier than something else.

By admitting that gradations of probability are conceptualized, we can make sense of

the historical novelty of probabilistic precision by attributing the development to a

conceptual revolution. It is therefore not surprising that quantification of probability

should have appeared with the invention of epistemic tools (mathematical combinatorics

in the 17th century and statistical data in the 17th and 18th centuries) that justify

those probabilities. We are not born with the concepts necessary to entertain any

content whatsoever. To entertain numerical contents in general, we need concepts

of numbers. To entertain probabilistic contents, we need concepts of probability.

The modern concept of probability appears to be a concept that we acquire through

exposure to the theory of probability that has been developed since the 17th century.

2.4 Consideration 3: Non-specific Probabilities

Judgments of objective chance are almost universally taken to be a species of be-

lief, and so the unification thesis suggests that beliefs about objective chance and

judgments about probability are distinct in a rather fundamental way. The final con-

sideration concerns the existence of an attitude halfway between belief in objective


chances and judgment about probability. To get a handle on these attitudes, we will

need to take a detour through probabilistic language.

Both judgments about probability and beliefs about objective probability are given

expression with sentences involving probabilistic language. ‘Probability’, ‘likelihood’,

and ‘chance’ can each be used to express either sort of attitude. This means that any

distinction in the attitudes requires an unmarked distinction in the language.11 The

lack of such a distinction allows for instances in which neither an objective nor a subjective

interpretation is clearly licensed.

2.4.1 Ambiguity and Non-specificity

Probabilistic language can be used by a speaker without that speaker signaling either

an intended subjective or objective interpretation. A speaker may assert that an

outcome is probable without having communicated either that it is subjectively or

objectively probable. This might be attributed to ambiguity, but I think this would

get the phenomenon wrong. Probabilistic language is not ambiguous; rather it is

non-specific: like ‘jade’ and unlike ‘bank’, it is open to multiple precisifications, none

of which need to be communicated for the sentence to be meaningful.

The primary reason to favor non-specificity over ambiguity is that precisification

is generally irrelevant to successful communication. A speaker may aim to convey

a probability assignment without having any particular interpretation in mind, and

the audience may come to accept the speaker’s assertion without supplying either

interpretation. Consider the following situation:

11 Ideally, a good semantics would find some way of unifying the semantics of objective and subjective probabilities, just as we have unified the semantics of epistemic, metaphysical, and other modals. On a popular view, the language of subjective probability, unlike the language of objective probability, does not contribute to the truth conditions of an utterance but rather has an effect on the speech act produced [27, 32, 42]. It is hard to see how we could provide a unified semantics if the language must play a radically different role in determining the attitude expressed – contributing to the content in some cases and to the attitude in others.


Will, a weatherman, predicts a .75 probability of rain on Monday. This

assignment was derived from an examination of a forecasting model and it

is controversial whether or not these models provide sufficient evidence to

justify a judgment about objective chance. Will has no opinion about the

correct interpretation of his own expression or the capacity for his model

to justify a judgment about objective chance. In fact he has never really

thought about the matter. He uses his forecasting model and statistical

methods to make probabilistic predictions in the way that he was taught

in graduate school.

Ambiguity infects linguistic contents and not mental contents. Since the purpose

of assertion is typically to convey a particular belief, contextually unresolvable am-

biguous language results in a failure of communication. Suppose, for instance, that

John says to Mary, “I am going down to the bank”. Mary cannot come to incorpo-

rate his testimony into her own beliefs without deciding whether he meant a financial

institution or a river’s edge.

Since ambiguities need to be settled for a sentence to be meaningful, speakers

should have some interpretation in mind when making an utterance. The existence

of people like Will, who competently make assertions about probabilities without

having particular precisifications in mind, suggests that probabilistic language is not

ambiguous.

Probabilistic language does not require a subjective or objective interpretation

in order for successful communication to occur. One can hear the weatherman’s

report and come to adopt a corresponding attitude without ever needing to form an

opinion about whether he was to be interpreted as talking about an objective or

subjective probability. This shows that probabilistic language can succeed without

any need for interpretation on the part of the audience.


2.4.2 Non-specific Attitudes

The existence of a non-specific use of probabilistic language suggests the existence of

a non-specific attitude. The attitude expressed by Will’s assertion is neither a belief

about objective chances nor a straightforward judgment about subjective probability.

The existence of non-specific attitudes suggests that judgments about subjective

probability and beliefs about objective probability are not all that different after all.

Non-specific judgments of probability have gradations that straddle the gradations

of objective and subjective probability, and it is hard to see how this could be if the

gradations were in one case conceptualized and in the other case not.

We could come up with some kind of objective measures of scariness. These mea-

sures might allow us to make specific objective judgments about how scary something

is. One might, for instance, come up with a point-based scheme that assigns a crea-

ture three points if it has fangs, five points if it has patchy fur, etc. It is hard to

imagine, however, that we could ever have an attitude that straddled beliefs about

any objective measure of scariness and genuine fear. The reason for this is that gra-

dations of objective scariness would be conceptualized, while actual fear is not. The

two varieties of attitudes are just too different.

Postulating probabilistic concepts gives us an explanation of the possibility of

non-specificity because it allows us to see similarity between the content of judg-

ments about probability and beliefs about objective probability. Attitudes involving

subjective probabilistic concepts get to be judgments about probability. Attitudes in-

volving concepts of objective probability get to be judgments about objective chances.

Whatever features characterize these different concepts, it is possible to have concepts

with features that lie in between.

It is possible to have attitudes toward in-between contents because it is possible

to have non-specific concepts. We often are also capable of attitudes that are in-

between a belief about a weight and a belief about a mass. This is possible if we


have an unrefined or non-specific concept that counts as neither a concept of weight

nor a concept of mass. As a result, some of our thoughts will concern neither an object’s weight

nor its mass but will be non-specific between them. I claim that what is going on in

cases of probabilistic non-specificity is basically the same. We can have an attitude

that is in-between a judgment about probability and a belief about objective chances

because we can have unrefined concepts of probability that fail to count specifically

as concepts of objective chance or concepts of subjective probability.


Chapter 3

Constitutivism about the Formal

Norms

3.1 The Formal Norms

Some judgments about probability commit us to other judgments about probability.

Judging that the probability that aliens are behind crop circles is .5 commits us to

judging that the probability that aliens are not behind crop circles is also .5.

Judging that the probability that ‘Shakespeare’ was a pseudonym of Christopher

Marlowe is .7 commits us to judging that the probability that ‘Shakespeare’ was either

a pseudonym of Christopher Marlowe or of Francis Bacon is at least .7.

Why is this? These commitments arise from formal normative constraints on

probability. The formal norms require us to conform our probability assignments1 to

the axioms of probability. Obedience to these norms means that assigning one proba-

bility to one proposition commits us to assigning, or not assigning, other probabilities

to other propositions.
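
The two commitments in the opening examples can be derived in a few lines. The sketch below assumes the standard axioms that the chapter goes on to call Normality, Non-negativity, and Additivity; the letters A (aliens are behind crop circles), M (Marlowe), and B (Bacon) are labels introduced here only for illustration.

```latex
\begin{align*}
\Pr(A) + \Pr(\neg A) &= \Pr(A \lor \neg A) = 1
  && \text{(Additivity, Normality)} \\
\Pr(\neg A) &= 1 - 0.5 = 0.5 \\[4pt]
\Pr(M \lor B) &= \Pr(M) + \Pr(B \land \neg M)
  && \text{(Additivity)} \\
  &\ge \Pr(M) = 0.7
  && \text{(Non-negativity)}
\end{align*}
```

In the second derivation, M and B-without-M are inconsistent with one another and their disjunction is equivalent to M or B, which is what licenses the appeal to Additivity.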

1 In this chapter and the next, I will switch from speaking of ‘judgments of probability’ to ‘probability assignments’. This shift in terminology reflects no underlying change in meaning. However, a ‘probability assignment’ is ambiguous between a judgment about probabilities and a set of judgments about probabilities.


Following Kolmogorov’s [21] axiomatization of probability theory, there have been

thought to be three fundamental formal constraints on our assignments of probability.

The first constraint, Normality, requires an assignment of a probability of 1 to

propositions that cannot be false. The second constraint, Non-negativity, requires

that all probabilities be greater than or equal to 0. The third constraint, Additivity, requires

that the probability of a proposition equivalent to a disjunction of inconsistent

propositions be equal to the sum of the assignments of the disjuncts. Collectively,

I will refer to these as the Formal Norms of probability. The first two norms set

the scale and as such are not especially substantive; my focus in this chapter will be

primarily on the more substantive norm of Additivity.
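
The three constraints can be checked mechanically on a small finite case. The following is a minimal sketch, not a definitive formalization: it assumes that propositions can be modeled as sets of possible worlds, and the three worlds, the weights, and the names `propositions` and `violated_norms` are hypothetical choices made for illustration. It reports which Formal Norms an assignment breaks, using the Marlowe and Bacon example.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical space of three mutually exclusive possibilities (an assumption).
WORLDS = frozenset({"marlowe", "bacon", "neither"})

def propositions(worlds):
    """Every subset of `worlds`; each subset models one proposition."""
    items = sorted(worlds)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def violated_norms(pr, worlds):
    """Names of the Formal Norms broken by the assignment `pr` (a dict)."""
    broken = []
    # Normality: the proposition true in every world cannot be false.
    if pr[frozenset(worlds)] != 1:
        broken.append("Normality")
    # Non-negativity: no proposition may receive a probability below 0.
    if any(value < 0 for value in pr.values()):
        broken.append("Non-negativity")
    # Additivity: for inconsistent a and b, Pr(a or b) = Pr(a) + Pr(b).
    inconsistent_pairs = [(a, b) for a in pr for b in pr if not (a & b)]
    if any(pr[a | b] != pr[a] + pr[b] for a, b in inconsistent_pairs):
        broken.append("Additivity")
    return broken

# Illustrative weights only; they are not drawn from the text.
weights = {"marlowe": Fraction(7, 10), "bacon": Fraction(1, 10), "neither": Fraction(2, 10)}
coherent = {p: sum((weights[w] for w in p), Fraction(0)) for p in propositions(WORLDS)}

# Lower the disjunction "Marlowe or Bacon" below Pr(Marlowe) = 7/10.
incoherent = dict(coherent)
incoherent[frozenset({"marlowe", "bacon"})] = Fraction(6, 10)

print(violated_norms(coherent, WORLDS))
print(violated_norms(incoherent, WORLDS))
```

On these assumed numbers the first assignment breaks none of the norms, while the second breaks Additivity alone, which matches the thought that an assignment of .7 to the Marlowe hypothesis commits one to at least .7 for the disjunction.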

The topic of this chapter is the explanation of the normativity of the formal

constraints governing our assignments of probability – where do these normative

constraints come from and why do they hold sway over us?

The norms are not so straightforward as they first appear and to understand their

source we should first try to understand their content. To know what these norms

forbid, we have to understand what probability assignments are. In the last chapter,

I suggested that probability assignments are a kind of sophisticated judgment whose

gradations are conceptualized. I didn’t explore what it is about these judgments that

makes them judgments about probability. In this chapter, I will explore this issue in

tandem with the question of normativity. Whatever a probability assignment is, it

had better be the sort of thing that is subject to these norms, so it is best to approach

the task of discerning the nature of probability assignments and the explanations of

norms simultaneously. I will approach the issue in an independent way and put back

into play some of the views that I rejected previously. The conclusion I come to will

thereby provide independent support for the practical interpretation that I favor.


This paper is structured around four different ways of understanding what it is to

be a probability assignment and corresponding ways of understanding why probability

assignments are subject to the norms that govern them. Each different way of

conceiving of probability assignments gives us a different way of understanding what

the norms forbid.

In the first section, “the Bare Characterization”, I will explore the view that

assignments of probability are simply graduated credal states. Any graduated credal

state is a probability assignment. I will argue that this proposal is deficient because

the most plausible account of the requirements of Additivity on this construal is deeply

implausible. Consequently, there must be something more to being a probability

assignment than being a graduated credal state.

In the second section, “the Pragmatic Characterization”, I explore the view that probability

assignments are graduated credal states used for making decisions in accordance

with standard decision theory. This notion is closely tied with Dutch book

arguments for Additivity. These arguments make use of the connection of proba-

bilities with decision making to produce a pragmatic argument for Additivity. I will

argue that this way of thinking about probabilities is implausible and the Dutch book

arguments are insufficient to account for the generality of the norms.

In the third section, “the Aim Characterization”, I explore the view that proba-

bility assignments are graduated credal states that possess a constitutive aim toward

accuracy. This aim has been fashioned into a clever non-pragmatic argument for Ad-

ditivity. I will discuss several different ways of explaining the norms by means of a

constitutive aim toward accuracy and argue that they all fail.

In the fourth section, “the Constitutive Characterization”, I will present and de-

fend my view that the norms are constitutive of the practice of assigning probabilities.

I will argue that the explanation for the normativity of Additivity, Normality and

Non-negativity is the following: intentional obedience to these norms is required for


engagement in the practice of assigning probabilities. One cannot count as assigning

probabilities unless one intentionally subjects one’s assignments to the norms. The

formal norms governing probability assignments are of a means-end variety: in order

to do something we wish to do, we must act in a certain way. They are, in the terminology

of Evans and Shah [10], weak norms: one can evade these norms by choosing

not to engage in the practice of assigning probabilities. To do so, one must simply

refrain from making use of concepts of probability. But one must accept the rules of

their use if one does opt to use them. Normality, Non-negativity and Additivity are

not forced upon us – we could choose not to keep track of probabilities at all. But

once we embrace the concept of probability and choose to use the concept to keep

track of our evidence, we are subject to its rules of use.

3.2 The Bare Characterization

Beliefs seem to come in degrees. We believe some things more than others. We have

some beliefs which are deep and strong, and others which are shallow or weak. We

have some beliefs that we will hold onto, come what may. We have other beliefs that

we drop when our mood changes. Belief is a graduated credal state because it comes

in degrees that are sensitive to evidence regarding how the world actually is.

According to the view that I will call the ‘Bare Characterization’ (a variant of

credal noncognitivism), probability assignments are nothing other than graduated

credal states. The act of assigning a probability is none other than the act of forming

a belief with a certain degree. The level of probability assigned is just the degree of

the belief. So when we say “John assigns a probability of x to ψ” we really just mean

that “John has a belief of degree-of-strength x in ψ”.


The Bare Characterization: A probability assignment is nothing other

than a graduated credal state. The degree of probability reflects the

strength of the credal state.2

Suppose that the Bare Characterization is correct. What, then, does Additivity

forbid? Here is the obvious choice.

Bare Absence Requirement: Additivity consists in an obligation not

to have any non-additive graduated credal states.

The Bare Absence Requirement isn’t plausible, and unless we can find a better

understanding of the requirements of Additivity consistent with the Bare Charac-

terization, its implausibility will doom the prospects of the Bare Characterization

itself.

3.2.1 Alternatives

I will try to show that the Bare Absence Requirement is false by showing that there

is no normative requirement not to have non-additive credal states. It is difficult to

argue about basic matters of normativity. For the most general claims about what is

and what is not appropriate, we must rely on our intuitions both about the overall

plausibility of a purported norm and about the plausibility of its implications in

specific cases. My argument against the Bare Absence Requirement will rely on the

thought that it can be rational to have non-additive graduated credal states.

Graduated credal states, insofar as they are credal states, must be influenced by

evidence. There are various schemes for how a degree of credence in a proposition

might relate to the evidence. To illustrate this point, I will describe two such alterna-

tive schemes that sanction non-additive assignments, and I will sketch an individual

who has non-additive credal states, but who seems perfectly rational.

2 I assume here that degree is understood somehow internally, and not as any kind of measure dependent on the resulting behavior of the individual.

In order to show that it is sometimes rational to have non-additive credal states,

I’ll need to set some ground rules on what it is to be a credal state. I propose two basic

criteria. First, in order to be a credal state, the state must be representational. This

means that it can be fruitfully understood as an attitude directed at a proposition.

Second, the state must be evidence-sensitive. We must change the states we’re in in

response to changes in our evidence. Categorical beliefs are representational states

that we become more likely to be in as we get more evidence in favor of the way they

represent things as being. Degrees of belief are states whose degree in a proposition

tends to correspond to the amount of evidence we have for that proposition. As we

get more evidence, we raise or lower our degrees of belief accordingly.

Here are two kinds of graduated credal states whose graduations follow their own

non-additive logic.

1. Mass Credences. A mass credence is a graduated credal state whose

gradations relate to evidence in a way that corresponds in spirit with

Dempster-Shafer [29] mass functions. A mass function is an assignment

of numbers to propositions that is intended to reflect the proportion of

the evidence specific to that proposition. Evidence that supports a more

precise variation on a proposition contributes nothing to the mass of the

proposition itself. Evidence for a disjunct, for instance, isn’t evidence

specific to the disjunctions in which it participates, and so it has no effect

on their corresponding mass functions. A mass credence is a graduated

credal state whose gradations react to evidence like mass functions. When

someone with mass credences receives evidence specific to a proposition,

their mass credence in that proposition would increase.

2. Consonance Credences.3 Given a probability function defined over a

space of possible worlds, we can define a consonance function to be that

function that takes a proposition and returns the probability of the most

probable world in which that proposition is true, divided by the probability

of the most probable world. The value of a consonance function on a propo-

sition corresponds (somewhat loosely) with how well the proposition fits

in with one’s best guess about the total state of the world. Evidence which

slightly favors remote contingencies in which the proposition is true has

no influence on the consonance value of that proposition. A consonance

credence is a graduated credal state whose gradations react to evidence in

the manner of consonance functions. The degree of consonance credence

in a proposition increases in response to evidence in favor of the most

likely scenario in which it could occur.
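The non-additive behavior of these two kinds of state can be made concrete with a minimal sketch. It assumes that propositions are modeled as sets of possible worlds; the world names, the numbers, and the function name `consonance` are hypothetical choices made for illustration, and the mass function is only a toy instance in the spirit of Dempster-Shafer theory.

```python
# Propositions are modeled as frozensets of worlds (an illustrative assumption).
RAIN, SNOW, DRY = "rain", "snow", "dry"

# Hypothetical probability function over three possible worlds.
prob = {RAIN: 0.5, SNOW: 0.3, DRY: 0.2}


def consonance(proposition):
    """Probability of the most probable world in which the proposition is
    true, divided by the probability of the most probable world overall."""
    best_overall = max(prob.values())
    best_in_prop = max((prob[w] for w in proposition), default=0.0)
    return best_in_prop / best_overall


# Hypothetical mass function: each number reflects only the evidence specific
# to that exact proposition, so mass on a disjunct does not flow upward to
# the disjunction.
mass = {
    frozenset({RAIN}): 0.45,
    frozenset({SNOW}): 0.45,
    frozenset({RAIN, SNOW}): 0.10,
}

rain, snow = frozenset({RAIN}), frozenset({SNOW})
rain_or_snow = rain | snow

# The consonance of a disjunction equals the larger consonance of its
# disjuncts, not their sum.
print(consonance(rain), consonance(snow), consonance(rain_or_snow))

# High mass on each disjunct, low mass on the disjunction itself.
print(mass[rain], mass[snow], mass[rain_or_snow])
```

In this toy case the value for the disjunction is not the sum of the values for its (mutually exclusive) disjuncts under either scheme, which is all that non-additivity requires.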

Both consonance and mass credences meet my criteria for being non-additive

graduated credal states. It is possible to have a mass function that ascribes high values

to each disjunct of a disjunction and a low value to the disjunction. The consonance of

a disjunction is always equal to the consonance of one of its disjuncts and is generally

less than the sum of their consonances. So for both kinds of credal states, Additivity

definitely does not hold. If it is ever rational to have these states, then the Bare

Absence Requirement must be false. Now, let me introduce a hypothetical individual

who has both mass and consonance credences. This individual is, I claim, rational.

Bert is an epistemic libertine. Some creatures may have categorical beliefs

without graduated credal states. Some creatures may assign probabilities

and have no categorical beliefs. Bert has a surfeit of epistemic attitudes.

He has categorical beliefs. He has graduated credal states whose grada-

3 Consonance credences are loosely inspired by Angelika Kratzer’s [22] proposal for the semantics of probability operators.

tions act like probability functions. He also has distinct mass credences

and consonance credences. Bert doesn’t use these other states to help him

make decisions – he relies on his ordinary judgments about probability to

decide what to do. These other credal states play no role in Bert’s actual

decision making.

I claim two things: that Bert does have non-additive credal states and that Bert

is not irrational for having them. His states are credal insofar as they are represen-

tational and their gradations are sensitive to evidence about their representations.

They are non-additive by stipulation. Bert is not unreasonable simply for having

them. So the Bare Absence Requirement must be false.

There is no obligation to lack non-additive credal states, so what does Additivity

really forbid? It forbids having non-additive probability assignments. Probability

assignments must not be mere graduated credal states. Probability assignments must

have some special nature that accounts for their being subject to Additivity.

3.3 The Pragmatic Characterization

As long as probabilities have been measured quantitatively, they’ve been used for

gambling. In fact, modern probability theory was conceived by gamblers looking

for an edge. Probability clearly has a very special importance for decision making

under uncertainty; it provides us with a method for deciding what decisions to make,

given our interests. The ubiquity of this connection suggests that we might want to

incorporate this use into our account of the nature of probability assignments and

use the connection to account for Additivity.

The connection relies on a utility-maximizing decision procedure – henceforth,

the ‘standard decision procedure’4. One condition required to apply this decision

4 There are many versions of this procedure. Nothing that I have to say depends upon selecting any one in particular.

procedure is that we have quantifiable graduated desire states. The expected utility

of an action is the sum of the strengths of our desires for each of its possible

outcomes, each weighted by the probability

we assign to that outcome’s resulting. Thus, the procedure advocates performing that

action among our available actions whose possible outcomes are the most desirable,

as weighted by their probabilities of occurrence.
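The procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any particular decision theorist's formulation: the function names, the umbrella decision, and all of the probabilities and utilities are hypothetical.

```python
def expected_utility(outcomes):
    """Sum of utilities weighted by the probability assigned to each outcome.

    `outcomes` is a list of (probability, utility) pairs for one action.
    """
    return sum(p * u for p, u in outcomes)


def choose(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical decision with made-up numbers: each action lists
# (probability, utility) for the outcomes "rain" and "no rain".
actions = {
    "umbrella": [(0.3, 5.0), (0.7, 8.0)],
    "no umbrella": [(0.3, -10.0), (0.7, 10.0)],
}

print(choose(actions))
```

The sketch presupposes exactly what the text flags as a condition of the procedure: desires must already be quantified as numbers before any weighting can take place.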

The Pragmatic Characterization: A probability assignment is a grad-

uated credal state that is used for the purpose of selecting actions on the

basis of expected utility.

And the Pragmatic Characterization suggests the following understanding of the

norms of probability.

Pragmatic Requirement: Additivity consists in a requirement not to

use non-additive graduated credal states for the purpose of selecting ac-

tions on the basis of optimizing ‘expected’ utilities.

Given the assumption that probability assignments are states that are used to

select actions on the basis of expected utilities, there are several promising arguments

that we can give for Additivity. Among the most well-known of these arguments is the

Dutch book argument. Ultimately, I don’t think that we can justify the assumptions

necessary to make this argument work, but it is worth rehearsing it to see how a

defender of the Pragmatic Characterization could go about explaining Additivity.

3.3.1 The Dutch Book Argument

If we always select the action that maximizes our expected utility, then in those

situations where our available actions are to either accept or decline a bet and where

there is no opportunity cost to accepting, we will accept those bets with positive

expected utility. The Dutch book argument suggests that in order to avoid accepting

collections of bets that will lead to sure losses, we must calculate expected utilities

with additive credal states.

A Dutch book is a set of bets that is especially advantageous to offer and disad-

vantageous to accept. The bookie will lose on some bets and win on others in a way

that depends on factors beyond the bookie’s control. But no matter how each bet

turns out individually, the bookie is guaranteed to gain a net positive amount – and

the bettor is guaranteed to lose – on the whole collection. Clearly, it is not good to be

disposed to accept Dutch books. Accepting Dutch books will lead one to lose utility.

But precisely how bad this disposition is seems to depend partly on the prevalence of

clever bookies.

This central observation lies at the heart of all Dutch book arguments: if you

accept bets with positive expected utility and your probability assignments are non-

additive, then you will be disposed to accept certain Dutch books. Clearly, this is an

unfortunate way to be disposed. Unfortunateness is not irrationality, so where is the

argument for Additivity? We could try to leverage the unfortunateness of accepting

Dutch books into a conclusion about the irrationality of the states that dispose one to

accept Dutch books. We might accept a principle according to which any disposition

which is unfortunate to have in some situations is irrational.
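A toy case shows how non-additive assignments expose an expected-utility maximizer to a Dutch book. The sketch assumes utility is linear in payoff, and the credences and prices are hypothetical numbers chosen only so that each bet looks favorable on its own.

```python
# Hypothetical non-additive credences: the two values sum to more than 1.
credence = {"A": 0.6, "not A": 0.6}

# Each bet costs `price` and pays 1 unit if its proposition is true.
bets = [("A", 0.55), ("not A", 0.55)]


def expected_gain(proposition, price):
    """Expected net gain of a unit bet by the agent's own credence,
    assuming utility is linear in the payoff."""
    return credence[proposition] * 1.0 - price


def net_result(a_is_true, accepted):
    """Actual net gain from the accepted bets in a given state of the world."""
    truth = {"A": a_is_true, "not A": not a_is_true}
    return sum((1.0 if truth[prop] else 0.0) - price for prop, price in accepted)


# The agent accepts every bet with positive expected gain.
accepted = [(prop, price) for prop, price in bets if expected_gain(prop, price) > 0]

# Whichever way A turns out, exactly one bet pays, so the net is negative.
for a_is_true in (True, False):
    print(a_is_true, round(net_result(a_is_true, accepted), 2))
```

Both bets are accepted because each has positive expected gain by the agent's lights, yet the pair costs 1.10 and pays exactly 1 however things turn out.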

Since it is bad to accept Dutch books, it is unfortunate to have states that would

dispose one to accept Dutch books. If unfortunate dispositions are always irrational,

then non-additive probabilities are irrational. However, this inference is suspect.

There are two problems. First, while a disposition can lead to unfortunate choices

in one context, it can be good in others. (Consider Newcomb’s paradox.) Some

rewards may be earmarked for those who are disposed to make bad bets on some

occasions. Alan Hájek [16] notes this flaw with Dutch book arguments. He suggests

that sometimes having non-additive probabilities will lead one to accept a set of bets

that collectively ensure a net positive that one wouldn’t accept if one had additive

probabilities. This observation is no threat to the Dutch book argument, because

it just shows that if one has additive assignments, one will occasionally refrain from

exchanging an advantageous risky bet for a sure thing. However, it is difficult to

rule out the possibility that we might be better off with some kind of non-additive

assignment given the kinds of bets we are likely to actually receive. Second, we should

also be careful to distinguish between dispositions and the actions that they lead to.

It may be that accepting a Dutch book is irrational, but having the dispositions to

accept those bets individually is not itself irrational. Or it may be that accepting

a Dutch book isn’t even irrational. It isn’t clear that there is any reasonable way

to avoid being disposed in some way or other to unfortunate choices in the right

contexts.5

A more general challenge often offered by critics of Dutch book arguments is that

they rely too much on practical considerations. The alleged problem with having non-

additive states is that it thwarts you from getting what you want. This subordinates

the normativity of probability states to means-end normativity. We want to lead rich

and happy lives, and this is hampered by having non-additive probability assignments,

so we should drop those non-additive assignments. But intuitively, the problem with

non-additive states is internal to the states themselves and not at all a practical issue.

Assigning non-additive probabilities isn’t like forgetting to measure twice before you

cut, or rubbing, rather than dabbing, a spill on a carpet. The general problem

is twofold. First, the reason to assign additive probabilities isn’t pragmatic, but

epistemic. Second, the problem has something to do with the attitudes themselves,

rather than their relation to other attitudes.

Several philosophers have tried to convert the practical issue into an epistemic

one. Christensen [6], for instance, has suggested that Dutch-book arguments show

5For instance, see [1].

that there is a sort of incongruity inherent in non-additive probability assignments.

His reason for thinking this lies in a purported connection with beliefs. Probability

assignments sanction certain ceteris paribus judgments about fair bets. A probability

assignment of 2/3 licenses the judgment that a bet at 2:1 odds is fair, all things

considered. One need not be prepared to make the bet in order to judge that the bet is fair.

The issue is an epistemic one. If each probability assignment requires the judgment

that bets that are sanctioned by the standard decision procedure are fair, and if it is

an a priori truth about fairness that a collection of bets that are individually fair is

also fair, and if anything that guarantees a loss is unfair, then there is a conflict of

beliefs about fairness.6

This may help make the issue non-practical, but it doesn’t quite succeed in inter-

nalizing the problem. It isn’t the probability assignments themselves, but the beliefs

that they commit us to, that lead to the problem. This would indicate that the

problem is dependent on being in a position to form beliefs about fairness. If one

cannot form beliefs about fairness, then probability assignments themselves are not

problematic. But failing to form beliefs about fairness doesn’t seem to exculpate

one’s non-additive probability assignments. The tie to fairness may be indicative of

a problem with probability assignments that fail the formal norms, but it doesn’t

explain what is wrong with them.

The central problem for Dutch book arguments is that they make the issue ex-

ternal to the judgments themselves, which suggests that the argument is only good

insofar as probability assignments have the requisite external connections. Break the

connections and the argument loses its force. This won’t be a problem if probability

assignments really have to be connected to decision making in the standard way in

order to be probability assignments. But this idea seems dubious. Can’t we assign

6 Christensen doesn’t go so far as to allege an actual inconsistency of belief. He does not wish to say that the sanctioned judgments about fairness are straightforwardly logically inconsistent, just that there is something internally defective about them.

genuine probabilities and make use of alternative heuristics in deciding what to do?

Couldn’t we assign probabilities even if we never acted in any way whatsoever? The

answer, according to the Pragmatic Characterization, is ‘no’. So much the worse for

the Pragmatic Characterization: we obviously do assign probabilities and go on to

misuse them. People rarely go through the process of actually calculating expected

utilities, and it is doubtful that they generally succeed in approximating it.7

But even when we don’t use the standard decision procedure, we still assign prob-

abilities and the norms still apply to those assignments.

There are decision procedures we could follow on which non-additive probabilities

are not problematic. Assigning non-additive probabilities would still be problematic.

So the problem that arises in assigning non-additive probabilities cannot come from

the practical effect. This is not to say that one wouldn’t be making a mistake of some

kind in forgoing the standard decision procedure. My point is simply that making that

mistake doesn’t exculpate one for assigning non-additive probability assignments.

3.4 The Aim Characterization

If we cannot explain Additivity by reference to the effects of probability assignments,

perhaps we should focus on just what kind of credal state probability assignments

are. Jim Joyce [18, 19] spearheaded recent interest in this alternative approach by

suggesting that we could derive the norms of probability from the fact that probability

assignments aim at accuracy. This approach is strongly influenced by the view that

the fact that beliefs aim at truth can explain why certain kinds of beliefs are irrational.

7 Further, it is highly dubious that people even have quantifiable utilities. It is easy to quantify money and offer precise definitions for fair bets. But the value of money isn’t linear, and so decision theorists recognize that we shouldn’t always try to maximize our expected financial returns. Utilities act like a substitute for money that justifies our prescription to optimize: we can assume that the value of utility is linear. If our desires are not quantifiable, if we cannot lay a metric over degrees of strength of desires, then we can’t even begin to maximize utilities.

Other philosophers have independently given great attention to the way aims can

help explain norms. In order to provide such an explanation, we must first work

out what aims really are. Wedgwood [37] interprets the aim of truth as consisting

in the fact that it is a necessary condition for an attitude to be a belief that it be

a representational mental state that is correct iff true. Shah and Velleman [30] offer

an alternative view, according to which the concept belief incorporates the idea that

beliefs are correct when and only when true.

However we choose to cash out the sense in which belief aims at truth, it should

secure us the fact that assignments which are formed in ways that are transparently

inclined towards falsity are irrational. Insofar as beliefs must be aimed at truth in

order to be beliefs, beliefs that are formed for reasons unrelated to truth are unlikely

to achieve that aim. This suggests that at least some kinds of epistemic normativity

can be understood by analogy to means-end practical normativity. The end, in this

sense, need not be the end of the believer in general, but rather of the belief (or of the

believer qua believer). One norm that might be explained in this fashion is the norm

against believing in inconsistent propositions. Inconsistent propositions are sure to

be false, and thus belief in them automatically fails at meeting its aims.

Joyce proposed that just as beliefs have a criterion of success that makes them

successful when true, probability assignments have a criterion of success that makes

them successful to the extent that they are accurate. Unlike with beliefs, success isn’t

categorical. It comes in degrees. The more accurate a probability assignment, the

more successful it is. A probability assignment in a true proposition is more accurate

the higher the probability is, and a probability assignment to a false proposition is

more accurate the lower it is.

The Aim Characterization: A probability assignment is a graduated

credal state that aims at accuracy.

Joyce hopes that, just as the criterion of success for belief can explain some of

the norms that govern belief, the criterion of success for probability assignments can

explain Additivity. On Joyce’s proposal, the nature of probability assignments incor-

porates the facts about accuracy. Thus, we can say that Additivity forbids holding

non-additive credal states whose criterion of success is accuracy. The explanation of

this norm is that non-additive credal states fail to meet their aims as well as alterna-

tive additive assignments would. This proposal is a definite improvement over the last.

It allows that we might have probability assignments that don’t have any connection

with a decision procedure, and that those states might be subject to Additivity even

so.

The Aim Requirement: Additivity consists in an obligation not to have

non-additive graduated credal states whose constitutive aim is accuracy.

While I would agree that accuracy is plausibly a constitutive aim of probabil-

ity assignments, understood as a requirement on the applicability of our concept of

judgment-about-probability, I do not agree that this fact can be used to explain Ad-

ditivity. The basic problem is that probability assignments aim at accuracy in a way

that is too vague to explain Additivity. In order to use the accuracy aim to explain

Additivity, we need some way of measuring accuracy. While Joyce recognizes the pos-

sibility that we don’t aim at accuracy as measured in any particular way, he thinks

that we can explain Additivity supervaluationally, by looking at all possible measures.

I will introduce several measures of accuracy and show how they might be em-

ployed in an explanation of Additivity. I will also raise doubts about whether we do

constitutively aim at accuracy as measured in any of these specific ways. Finally, I

will critique Joyce’s supervaluational argument.

3.4.1 Measures and Disagreement

Let the correct value of a probability assignment be 1 for assignments in true propo-

sitions and 0 for assignments in false propositions. The simplest way to measure the

accuracy of a probability assignment is to calculate its distance from the correct

value. Call this the Difference score. A higher value on the Difference score, as on

all of the other measures we’ll examine, indicates greater inaccuracy. On the Difference

score, inaccuracy increases linearly with the distance from the correct value, and

the inaccuracy of a set of assignments is the sum (or, alternatively, the average) of the

inaccuracies of the individual assignments. While the Difference score is very simple

and very natural, it has been widely rejected. It cannot provide an explanation for

Additivity.

Another simple way of measuring accuracy is with the Brier score, inspired by

the work of meteorologist Glenn Brier [3]. This measures a probability assignment’s

inaccuracy as the square of its difference from the correct value. Hence, the inaccuracy

of an assignment doesn’t increase linearly with its distance from the correct value. The

inaccuracy added by moving from .8 to .7 in a true proposition is less than the

inaccuracy added by moving from .3 to .2. The Brier score is both relatively

simple and has a number of nice properties; hence it is the favorite contender among

inaccuracy measures for probabilists.

There are many other measures. One other measure is the Spherical score, which

bears mention as an alternative to the Brier score. According to the Spherical score,

the inaccuracy of a particular assignment is given by $1 - \frac{|(1-c)-p|}{\sqrt{p^2+(1-p)^2}}$, where c is the

correct value and p is the assignment.

Sample Accuracy Scores

Assignment        Contribution to Inaccuracy
(to a Falsity)    Difference    Brier    Spherical

.3                .3            .09      .08
.4                .4            .16      .168
.5                .5            .25      .29
.55               .55           .3       .366
.7                .7            .49      .606
.8                .8            .64      .75

These measures all agree about when one probability assignment in one propo-

sition is more accurate than another probability assignment in another proposition:

they always say that assignments that get closer to the correct value are more accurate.

But they differ on how to quantify the relative accuracy of different assignments,

and this means that they will disagree about how accurate different collections of as-

signments are overall.
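The three measures can be written out directly, which makes it easy to recompute the sample scores tabulated above. This is a minimal sketch: the function names are illustrative, and the Spherical score is implemented in the per-proposition form stated in the text rather than in any other formulation found in the literature. The computed values should agree with the table up to rounding or truncation.

```python
import math


def difference(p, c):
    """Difference score: linear distance from the correct value c (0 or 1)."""
    return abs(c - p)


def brier(p, c):
    """Brier score: squared distance from the correct value."""
    return (c - p) ** 2


def spherical(p, c):
    """Spherical score in the per-proposition form given in the text."""
    return 1 - abs((1 - c) - p) / math.sqrt(p ** 2 + (1 - p) ** 2)


# Assignments to a false proposition (correct value 0), as in the table.
for p in (0.3, 0.4, 0.5, 0.55, 0.7, 0.8):
    print(p, round(difference(p, 0), 3), round(brier(p, 0), 3),
          round(spherical(p, 0), 3))
```

The inaccuracy of a collection of assignments under any one of these measures is then just the sum of the per-proposition values.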

Quantifying inaccuracy allows us to compare the relative accuracy of collections of

assignments. For instance, assignments of .5 to two true propositions will be deemed

collectively more accurate than an assignment of .4 to one proposition and .55 to

another by the Brier score but not by the Spherical score.

How do we select one of these measures as the measure of accuracy that our assignments

constitutively aim at? There are two plausible routes. One way is to investigate

our intuitions about how to regard the relative degrees of success of different single

assignments to a proposition. How much worse is it to assign .7 than .8 in a true

proposition? Is it greater than the difference in assigning .3 and .2? The second way

is to investigate our intuitions about different collections of probability assignments.

Since different measures favor some sets of assignments over others, we can try to

generate intuitions about the relative success of probability assignments in a way

that will distinguish which scores are reasonable.

I submit that our probability assignments aim at accuracy in no precise way.

Neither of the above methods provides much of an intuitive grip on the problem.

Nothing pushes me to favor one of the scoring rules over the others.

Instead, I think that the situation is a bit like this: when you aim a dart at a dart

board, you aim to get closer to the center, but in doing so, there is nothing in your

action that indicates any difference in success beyond the ranking. Your aiming at

the center need not require any implicit comparison between the difference of getting

one inch to the left and two inches to the left and four inches to the left and five

inches to the left. Every point is preferable to any point further away, but there is no

way of quantifying and comparing this preference (unless you explicitly form such a

preference). So, I think, it is with accuracy. We aim at accuracy, but not in a way

that assumes a measure of accuracy.

3.4.2 Wedgwood’s Inference to the Best Explanation

If we could settle on one measure of accuracy, then, if it was the right kind of measure,

we might have a suitable explanation of Additivity. We’ll say that one assignment

dominates another with respect to a scoring rule if it does better no matter how things

turn out. There are a number of measures where non-additive assignments are always

dominated by additive assignments and additive assignments are not dominated by

non-additive assignments. The Brier score and the Spherical score are both such

measures. The Difference score is not. Thus, given a non-additive assignment, there

is always an additive assignment that better reaches the constitutive goals of the

attitude.
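Dominance is easy to check by brute force in small cases. The following is a minimal sketch, assuming strict dominance (less inaccurate in every world) and using hypothetical assignments over a two-cell and a three-cell partition; the helper names are illustrative.

```python
def brier(p, c):
    """Squared distance from the correct value c (0 or 1)."""
    return (c - p) ** 2


def difference(p, c):
    """Linear distance from the correct value c (0 or 1)."""
    return abs(c - p)


def total_inaccuracy(score, assignment, world):
    """Sum a per-proposition score over an assignment, given the truth
    values (1 or 0) that `world` gives to the same propositions."""
    return sum(score(p, c) for p, c in zip(assignment, world))


def dominates(score, better, worse, worlds):
    """True if `better` is strictly less inaccurate than `worse` in every world."""
    return all(
        total_inaccuracy(score, better, w) < total_inaccuracy(score, worse, w)
        for w in worlds
    )


# Two propositions, A and not-A: exactly one is true in each world.
two_cell = [(1, 0), (0, 1)]
non_additive = (0.75, 0.75)  # hypothetical assignment summing to 1.5
additive = (0.5, 0.5)

print(dominates(brier, additive, non_additive, two_cell))       # True
print(dominates(difference, additive, non_additive, two_cell))  # False

# Three-cell partition: on the Difference score, the non-additive all-zero
# assignment dominates the additive uniform assignment.
three_cell = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(dominates(difference, (0, 0, 0), (1 / 3, 1 / 3, 1 / 3), three_cell))  # True
```

The last line illustrates why the Difference score is not among the suitable measures: on it, an additive assignment can itself be dominated by a non-additive one.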

But to employ this argument, we’ve got to settle on one measure. I’ve suggested

that there is no single right measure that captures how we aim at accuracy, and so

I don’t think that this argument will work. While we may have vague ideas about

which assignments are more accurate than others, they don’t serve to distinguish

between the Difference, Brier, and Spherical scores.

We can admit this while recognizing that among these, the Brier score may make

the most sense as a measure of accuracy. If we don’t constitutively aim at accuracy

under any precise measure, then we don’t aim to maximize accuracy according to the

Brier score.

The Brier score has a lot of theoretical advantages. It is simple. It is natural. If

we interpret probability assignments as a space, then the Brier score corresponds to

distance in that space from the truth. It is proper, which means that the expected

accuracy of a probability assignment, relative to that assignment, is higher for that

assignment than for any other, and it can be used to justify the formal norms. For

these reasons, it has found special favor among philosophers as the best bet among

viable contenders for a precise accuracy measure.
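Propriety can be illustrated numerically. The sketch below assumes a single proposition, scores inaccuracy rather than accuracy (so the best report is the one with the lowest expected inaccuracy), and uses a coarse grid search as a stand-in for a proof; the function names and the credence of 0.7 are hypothetical.

```python
def brier(p, c):
    """Squared distance from the correct value c (0 or 1)."""
    return (c - p) ** 2


def difference(p, c):
    """Linear distance from the correct value c (0 or 1)."""
    return abs(c - p)


def expected_inaccuracy(score, credence, report):
    """Expected inaccuracy of `report`, computed with `credence` as the
    probability that the proposition is true."""
    return credence * score(report, 1) + (1 - credence) * score(report, 0)


def best_report(score, credence, steps=100):
    """Grid search for the report with the lowest expected inaccuracy."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda r: expected_inaccuracy(score, credence, r))


print(best_report(brier, 0.7))       # 0.7
print(best_report(difference, 0.7))  # 1.0
```

On the Brier score, an assignment of 0.7 expects itself to do best. On the Difference score it expects the extreme assignment of 1 to do better, so the Difference score is not proper.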

Ralph Wedgwood [38] thinks that we can establish the Brier score as the right

measure by means of an inference to the best explanation.8 The best explanation

of the formal constraints is that some score favors probability assignments that obey

them, and the number of nice properties known to be associated with the Brier score

strongly suggests that it is the correct accuracy measure.

How good this argument is depends on both the plausibility that we really can

treat the Brier score as a constitutive aim and on whether any other better explana-

tion is available. I am inclined to think that this argument will fail on both counts.

First, the manifest vagueness of our intuitions about accuracy seems to me to preclude

any precise measure of accuracy from figuring in the constitutive aim of probability

assignments. I think Shah and Velleman have the right interpretation of constitutive

8 Wedgwood doesn’t think of the Brier score as a measure of accuracy. Rather, he thinks that it is a measure of “correctness”. I am not sure how much of a difference this makes – though he himself doesn’t present it as hugely significant.

aims, and this interpretation suggests that the constitutive aim of an attitude should

be relatively transparent to anyone with a concept of that attitude. The aim of accu-

racy is transparent, but much as I examine my concept of a probability assignment,

I cannot find any particular indication that one measure is right.

Second, there are alternative and equally good explanations available. The argu-

ment works best if we can assume that there is one fundamental norm that explains

all others. Wedgwood thinks that the fact that beliefs are correct iff true is the fundamental

norm that explains all norms of belief. This may have inclined him to think

that the aim of accuracy is fundamental for probability assignments. However, we

can also postulate a variety of different fundamental norms. To put it in Wedgwood’s

terminology: it may be that inaccuracy is one variety of incorrectness among others.

Violation of the formal norms may be its own kind of incorrectness. Before turning

to the explanation that I favor, I will address the argument that Joyce develops that

doesn’t require that we actually aim at accuracy in any particular way.

3.4.3 Joyce’s Supervaluationist Argument

Joyce’s argument does not rely on any single measure of accuracy being correct.9

Instead, he tries to show that there are properties that any reasonable precise measure

should have that together entail that non-additive states are dominated by additive

states according to that measure.

Given this claim we can see that if there were a specific measure of accuracy

that our probability assignments aim at, then even if we can’t figure out which one

it is, the fact that we can recognize that it has properties that entail that additive

probabilities dominate other probabilities means that Additivity is true. But if there

isn’t, the argument gets a bit more complicated.

Here is what Joyce says about the possibility of vagueness:

9 Though he has expressed some willingness to allow that there may be a correct measure of accuracy, which might vary with context.

In developing these ideas, I will speak as if accuracy can be precisely quan-

tified. This may be unrealistic, since the concept of accuracy for [proba-

bility assignments] may simply be too vague to admit of sharp numerical

quantification. Even if this is so, however, it is still useful to pretend that

it can be so characterized because this lets us take a “supervaluationist”

approach to its vagueness. The supervaluationist idea is that one can un-

derstand a vague concept by looking at all the ways in which it could be

made precise, and treating all facts about what properties all of its “pre-

cisifications” share as facts about the concept itself... I am going to be

interested... in the properties that all reasonable “precisified” measures

of gradational accuracy share. [18, 590]

I have a number of worries about the supervaluationist approach that Joyce’s

argument relies upon.

First of all, the supervaluationist inference that allows us to project properties

from the precisifications to the vague concept itself is highly questionable. I am

not sure what would license it. It does not seem to be obviously psychologically or

logically necessary that we should ascribe to a concept all of the properties shared by

all of its available precisifications. We could recognize that every viable precisification

has a property that we refuse to apply to the vague concept.

We might try to justify the inference as follows: the supervaluationist inference is

reasonable because a vague concept places constraints on its possible precisifications.

Those properties shared by all possible precisifications reflect constraints derived from

the vague concept itself. Thus, by looking at what each precisification shares, we can

get some insight into the constraints of the concept itself.

This line of thought is problematic. By virtue of being precise, the precisifications

may introduce aspects that are not implied by the vague concept itself. All precisifi-

cations are precise. A vague concept is not itself precise. The precision is an artifact

of the precisification process and does not result from constraints imposed by the

vague concept. There may be other properties that all precisifications share simply

because they are precisifications: properties that the vague concept does not itself

impose upon them and that do not reflect anything about the vague concept’s nature.

Aaron Bronfman [4] has developed a related argument against Joyce’s view. Bronfman

notes that Joyce’s argument doesn’t show that any given non-additive assignment is

dominated by one particular additive assignment under every reasonable accuracy measure.

It may be that each accuracy measure favors its own distinct additive assignments

over any given non-additive assignment. Here is Joyce’s formulation of the

worry, presented with an analogy:

Suppose ethicists and psychologists somehow decide that there are just

two plausible theories of human flourishing, both of which make geograph-

ical location central to well-being. Suppose also that, on both accounts,

it turns out that for every city in the U.S. there is an Australian city with

the property that a person living in the former would be better off living

in the latter. The first account might say that Bostonians would be better

off living in Sydney, while the second says they would do better living in

Coober Pedy. Does it follow that any individual Bostonian will be better

off living in Australia? It surely would follow if both theories said that

Bostonians will be better off living in Sydney. But, if the first theory ranks

Sydney ≺ Boston ≺ Coober Pedy, and the second ranks Coober Pedy ≺

Boston ≺ Sydney, then we cannot definitively conclude that the person

will be better off in Sydney, nor that she will be better off in Coober Pedy.

So, while both theories say that a Bostonian would be better off living

somewhere or other in Australia, it seems incorrect to conclude that she

will be better off in Australia per se because the theories disagree about

which places in Australia would make her better off. [19, 289]

Joyce responds that it is still problematic to accept a probability assignment that is

dominated under every reasonable precise measure. It makes some sense to say that this kind

of dominance is indicative of a problem – it is a bit hard to believe that rational

probability assignments would be dominated in this way – but it is hard to make

the case that it explains the norm. It is far from clear that any Bostonian in the

scenario that Joyce describes who wants to maximize their well-being has a reason to

move. If they are confident that there is no fact of the matter which theory of human

flourishing is correct, then there is no fact of the matter whether they will be better

off moving, no matter where they move. It is implausible that if there is no fact of the

matter whether they are worse off by staying put, there is anything irrational about

doing so.
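The structure of Bronfman’s worry is a quantifier shift, and it can be checked mechanically. The following is a minimal sketch in Python, assuming two hypothetical theories of flourishing whose numerical scores are invented purely for illustration (only their ordering matters, and the names `theories`, `better_somewhere`, and `cities_better_on_all` are not drawn from Joyce or Bronfman). It shows that each theory can recommend some Australian city over Boston while no single Australian city is recommended by both.

```python
# Hypothetical well-being scores; only the ordering within each theory matters.
theories = {
    "theory_a": {"Sydney": 3, "Boston": 2, "Coober Pedy": 1},
    "theory_b": {"Coober Pedy": 3, "Boston": 2, "Sydney": 1},
}

HOME = "Boston"
AUSTRALIAN_CITIES = ["Sydney", "Coober Pedy"]


def better_somewhere(scores):
    """True if, on this theory, at least one Australian city beats the home city."""
    return any(scores[city] > scores[HOME] for city in AUSTRALIAN_CITIES)


def cities_better_on_all(all_theories):
    """Australian cities that beat the home city on every theory."""
    return [
        city
        for city in AUSTRALIAN_CITIES
        if all(scores[city] > scores[HOME] for scores in all_theories.values())
    ]


# "For every theory there is a better city" holds...
each_theory_recommends_moving = all(
    better_somewhere(scores) for scores in theories.values()
)

# ...but "there is a city that is better on every theory" does not.
shared_destinations = cities_better_on_all(theories)
```

On these assumed rankings, `each_theory_recommends_moving` is true while `shared_destinations` is empty, which is the gap between dominance under each measure and dominance by a single assignment under all of them.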

Second, it is open to question whether the accuracy aim of probability assignments is

actually vague. Here we must be careful to distinguish between genuine vagueness

and a lack of precision. The constitutive aim of probability assignments could be

perfectly determinate and exhausted by the fact that probability assignments that

come closer to the truth are preferable. There need be no way of measuring and

quantifying degrees of accuracy. This doesn’t make the aim of accuracy vague. There

may be no way of measuring and quantifying degrees of tastiness, but that doesn’t

mean that judges in a chili cook-off lack criteria to select the winner. They aim to

choose the tastiest chili. The rules of soccer tell you who won a single game. But

they don’t tell you whether it is better to win 4-2 or 2-1. Nor do they tell you which

team did better in a series of games. The rules of soccer are not vague on this score;

it lies beyond their purview. Similarly, it may be that our aim of accuracy specifically

tells us to prefer single probability assignments that are closer to the correct value

and does nothing to measure degrees of accuracy or how to measure collections of

assignments. This needn’t make the aim a vague aim and so it wouldn’t necessarily

invite the supervaluationist inference, even if that inference were safe

on the assumption of vagueness.

Third, we may worry about the kinds of considerations that Joyce uses in support

of the properties he thinks a scoring rule should have. The worry is that Joyce as-

sumes that accuracy does all of the work in settling what probability assignments are

allowable. Consider the kinds of properties that Joyce requires. The first iteration

of Joyce’s argument relied on the weak convexity axiom. The weak convexity axiom

stated that an adequate accuracy measure has to be conservative in a certain way:

given two sets of assignments that are regarded as equally accurate, any mixture of

those assignments (arrived at by summing the products of the respective assignments

and weights that add up to 1) must be more accurate than those assignments them-

selves. Maher [26] notes that this assumption rules out the Difference score, and for

this reason he rejects it as a constraint on a reasonable probability assignment. I’m

inclined to agree with Joyce that conservativeness is a virtue of a probability assign-

ment, but I’m not tempted to attribute this to considerations of accuracy. The fact

that we judge conservative assignments better may be related to other aspects of the

nature of probability assignments. In essence, though there might be reasons to favor

conservative assignments over non-conservative assignments, they need not arise from

accuracy considerations.
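The contrast between a measure that satisfies weak convexity and one that does not can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it treats the measures as inaccuracy scores (lower is better), uses the Brier score as an example of a measure that satisfies the axiom, considers only the even mixture of two assignments, and assumes that the Difference score is the sum of absolute differences between assignments and truth values. The function names and the particular assignments are hypothetical.

```python
import math


def brier(assignment, world):
    """Sum of squared differences between assigned values and truth values."""
    return sum((x - w) ** 2 for x, w in zip(assignment, world))


def absolute(assignment, world):
    """Assumed form of the Difference score: sum of absolute differences."""
    return sum(abs(x - w) for x, w in zip(assignment, world))


def midpoint(b1, b2):
    """Even mixture of two assignments (weights 1/2 and 1/2)."""
    return [(x + y) / 2 for x, y in zip(b1, b2)]


# Truth values of two propositions at the actual world (1 = true, 0 = false).
world = [1, 0]

# Two distinct assignments that both measures score as equally inaccurate.
b1 = [0.5, 0.3]
b2 = [0.7, 0.5]
mix = midpoint(b1, b2)

equal_on_brier = math.isclose(brier(b1, world), brier(b2, world))
equal_on_absolute = math.isclose(absolute(b1, world), absolute(b2, world))

# Brier: the mixture is strictly less inaccurate than either assignment.
mixture_improves_brier = brier(mix, world) < brier(b1, world) - 1e-9

# Absolute differences: the mixture is no better than the originals.
mixture_improves_absolute = absolute(mix, world) < absolute(b1, world) - 1e-9
```

With these values the mixture strictly improves on the Brier score but only ties on the absolute-difference score, which is how a requirement of strict improvement would exclude the latter.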

In Joyce’s updated [19] version of the argument, he offers a new set of constraints10.

Among those constraints are two (offered for separate proofs): Propriety and Minimal

Coherence. A measure is proper if no probability assignment that obeys the formal

norms recognizes another probability assignment as having a greater expected accu-

racy. The problem with improper assignments, Joyce suggests, is that they “under-

mine their own adoption and use.” As soon as you’ve adopted a modest assignment,

10 It is also worth noting that he moves from discussing accuracy to discussing epistemic utility.

you’ve got reason to change to another assignment which you regard as being more

accurate. This fact is the basis of his support of the criterion of propriety.
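A simplified numerical illustration may help to fix the idea of propriety. The sketch below assumes a single proposition X, an agent whose probability for X is `p`, and two candidate scoring rules treated as inaccuracy scores: the Brier score and an absolute-difference score (assumed here to be the form of the Difference score). It searches a grid of candidate values `q` for the one with the lowest expected inaccuracy by the agent’s own lights. The function names and the grid search are illustrative choices, not part of Joyce’s argument.

```python
def expected_brier(p, q):
    """Expected squared error of holding q in X, judged from probability p for X."""
    return p * (1 - q) ** 2 + (1 - p) * q ** 2


def expected_absolute(p, q):
    """Expected absolute error of holding q in X, judged from probability p for X."""
    return p * abs(1 - q) + (1 - p) * abs(q)


def best_by_own_lights(p, expected_score, steps=100):
    """Grid value of q with the lowest expected inaccuracy according to p."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda q: expected_score(p, q))


p = 0.7
brier_choice = best_by_own_lights(p, expected_brier)        # stays at p
absolute_choice = best_by_own_lights(p, expected_absolute)  # pushed to an extreme
```

Under the Brier score the agent expects her own assignment to do best, so nothing recommends a switch. Under the absolute-difference score she expects the extreme value 1 to do better than 0.7, which is the sense in which an improper measure is said to undermine its own use.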

It is open to question whether you really should switch, even if you aim at accuracy

and you recognize another probability assignment to be more accurate. There may

be more risk involved in switching, for instance. Or else accuracy might not be all

that we aim at. For instance, you might not aim to adopt probability assignments

that are as accurate as they can be, but only as accurate as your evidence licenses

you to be. If so, then it would not be a mark against the reasonableness of the Difference

score that it is improper, since changing the assignment to make it more accurate

might make it more accurate than it is licensed to be by the evidence available.

The issues I’ve taken with both propriety and weak convexity have a similar

source. Joyce assumes that the accuracy measure explains the comparative virtues

of different probability assignments when it is compatible with external constraints

that achieve the same ends. It’s hard to make them out as results derived solely from

considerations of accuracy, so they need not be regarded as requirements for any

genuine precisification of accuracy. In this way, Joyce’s argument shares something

with Wedgwood’s. Joyce seems to be assuming that accuracy must do the work

of deciding what probability assignments are reasonable. So any judgments we have

about reasonableness can be projected back onto accuracy measures. Those measures

in turn can be used to explain the formal norms. This kind of inference to the best

explanation, however, works only to the extent that accuracy measures really can provide

the best explanation. In order to decide whether this is the case, we must consider

how other alternative proposals fare in explaining the reasonableness of the relevant

probability assignments.

Finally, it is worth considering the distinction between the properties of measures

it makes sense for the concept to be vague over and the properties of measures that

it makes sense to adopt as precisifications in practice. It is surely vague when a boy

becomes a man. For expediency, we may say that the reasonable precisifications

include his 18th or his 25th birthday. The 16th hour of the 1122nd Tuesday of his life

is not an expedient point. If we had to settle on a date, the first two options are

better than the third. But a concept which is vague over the latter precisification is

reasonable, because there is a difference between what a concept is vague over and

what precisifications are reasonable. If the supervaluational argument is to have much

force, it must be because of what metrics our aim of accuracy is vague over, and not

what precisifications would be reasonable to adopt. Given Joyce’s interest in things

like propriety, he seems mostly concerned with the latter.

In summary, then, we need more than just a basic aim at accuracy; we need

precisely measurable accuracy measures to make the argument work. But it doesn’t

seem that we aim at accuracy in any particular way, and Joyce’s supervaluationist

gambit is highly questionable.

3.5 The Constitutive Characterization

So far we have considered three proposals. On the Bare Characterization, probabil-

ity assignments are just graduated credal states, and Additivity prohibited us from

having any non-additive graduated credal states. On the Pragmatic Characterization,

probability assignments had an essential tie to decision making, and thus

Additivity prohibited us from having non-additive graduated credal states that were

used to select actions on the basis of expected utility. On the Aim Characterization,

probability assignments aimed to maximize accuracy, and thus Additivity prohibited

us from having non-additive graduated credal states that aimed at accuracy (in the

way that our states do). I think that none of these proposals were able to really

explain Additivity in a plausible manner.

I’m going to offer a distinct account that makes two separate claims. First, that

Additivity is a rule that partly constitutes the practice of assigning probabilities.

Second, that we become subject to the rules governing the practice of assigning prob-

abilities by engaging in that practice through our implicit intentions to participate in

the practice given our application of the concept of probability. As I suggested in the

introduction, we might consider what it is to make a judgment about probability in

terms of engagement in a practice. In the second chapter, I suggested that we need

to postulate a concept of probability. Here, I combine the two ideas: the concept of

probability is a concept whose application constitutes a move within a practice. Since

probability is understood in terms of that practice, anyone who assigns a probability

should regard themselves as making a move in that practice and hence being subject

to its norms. I will call this the ‘Constitutive Characterization’.

The Constitutive Characterization: It is partly constitutive of proba-

bility assignments that they are governed by normative formal constraints

including Additivity, Non-negativity, and Normality.

The Formal Requirement: Additivity consists in an obligation not to

have non-additive credal states that are governed by a normative require-

ment to be additive.

This may sound trivial. I take this to be an advantage of the view, for it explains

why the truth of Additivity is transparent and undeniable. In the next few sections, I

will explain how we should understand the way in which rules constitute our practices

and why the norm of Additivity holds any sway over us. I will conclude by discussing

the status of other norms.

3.5.1 Additivity as a Constitutive Rule of a Practice

When you play a game of chess, there are certain things you’re not supposed to do.

For instance, you’re not supposed to move a pawn back toward your own side. You’re

not supposed to do so, not because it is strategically unsound or unsportsmanlike,

but because it is against the rules. Moving a pawn backwards is forbidden by the

rules of chess and playing chess is an activity characterized in part by the rules you

must follow, or at least attempt to follow, while playing it. If you are to play chess,

you must make some attempt to play by the rules. In your game you may agree with

your opponent that moving a pawn backwards is allowed. There may be no reason

not to come up with such an agreement. Nor need the agreement be verbal; you may

make a backward move, your opponent may recognize this and not object, and the

game may proceed. But insofar as you and your opponent do not try to restrict all

moves to the legal moves of chess, you’re no longer playing chess. The norms that

govern the allowable moves in chess are norms that are constitutive of the practice of

playing chess, in the sense that an implicit intention to obey the rules is necessary to

play chess. There is nothing wrong with playing other games, but if you play chess,

you must intend to play by the rules.

Chess is a game with well-defined rules that we can choose to participate in or

refrain from participating in. The goal of the game is to maneuver one’s pieces into a

checkmate of the opponent’s king. The rules govern how we are allowed to proceed.

Probability assignments may be viewed in an analogous way. The practice of assigning

probabilities can be characterized by its goal or its use and by the constraints

imposed by the practice. The goal of chess is to get a checkmate. The constraints

are the rules governing allowable moves. Just as the constraints of chess help give

structure to the game and make it a worthwhile pastime, the constraints on assigning

probabilities make them useful.

The goal of assigning probabilities could be conceived of in either of two ways. In

the first way, the goal is accuracy. Just as Joyce suggested, it is part of what it is to

be a probability assignment to be aimed at coming as close as possible to the truth.

In the first chapter, I proposed a different goal – I suggested that we use probabilities

to measure the relative strengths of our sources of evidence. Given how closely these

align in practice, I don’t think that there needs to be a fact about which is the real goal.

In each case, we will want to assign higher probabilities to propositions for which we

have lots of evidence, and lower probabilities to propositions for which we have less

evidence.

The constraints on our assignments are just Additivity, Non-negativity, and Nor-

mality. The fact that probability is characterized by the rules that it is distinguishes

it from other practices. There is an infinite variety of other practices governed by

other norms. We could have sought to assign values with the goal of accuracy or

reflecting our evidence in many different ways. These alternative practices need not

have been any worse, but they would have been different. Take Normality and Non-

negativity. Why is it that probability assignments should lie between 0 and 1, rather

than between 0 and 10, 0 and 12, or -5 and 23? There is no deep reason. However, if

we started using a scale between -5 and 23, we would no longer be assigning probabil-

ities. Probabilities are supposed to range between 0 and 1 for no better reason than

that restriction is built into what it is to be a probability. We choose to calculate

probabilities because the scale involved is especially convenient. We are obliged to

assign probabilities between 0 and 1, rather than -5 and 23, because that is what the

practice that we have chosen to adopt requires.

Additivity has a similar explanation. We are obliged to assign additive values

as probabilities because the restriction to assigning additive values is partly

constitutive of the practice of assigning probabilities. Insofar as you’re not restricted

to assigning additive probabilities, you’re not engaging in the practice of assigning

probabilities. There need be nothing wrong with what you are doing. It is perfectly

rational and fine to assign non-additive values. Just as it’s fine to play checkers with

chess pieces, it is fine to have other non-additive kinds of graduated credal states. But

if you’re assigning probabilities, then you’re bound by the norms that govern them.
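To make the three constitutive constraints concrete, here is a minimal sketch of a check that an assignment obeys them. It assumes a toy model that is not part of the account itself: propositions are sets of three hypothetical worlds, Normality is read as the requirement that the necessary proposition receive the value 1, and Additivity is read as finite additivity over mutually exclusive propositions. All names are illustrative.

```python
from itertools import combinations

WORLDS = frozenset({"w1", "w2", "w3"})


def all_propositions(worlds):
    """Every subset of the set of worlds, each treated as a proposition."""
    items = sorted(worlds)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def obeys_formal_constraints(assignment, worlds=WORLDS, tol=1e-9):
    """Check Non-negativity, Normality, and (finite) Additivity."""
    props = all_propositions(worlds)
    non_negative = all(assignment[p] >= 0 for p in props)
    normal = abs(assignment[worlds] - 1) <= tol
    additive = all(
        abs(assignment[a | b] - (assignment[a] + assignment[b])) <= tol
        for a in props
        for b in props
        if not (a & b)  # only mutually exclusive pairs are constrained
    )
    return non_negative and normal and additive


def from_weights(weights):
    """Build an assignment by summing weights over the worlds in each proposition."""
    return {
        p: sum(weights[w] for w in p)
        for p in all_propositions(frozenset(weights))
    }


additive_assignment = from_weights({"w1": 0.5, "w2": 0.3, "w3": 0.2})

# Same values, except that one disjunction is given more than the sum of its parts.
non_additive_assignment = dict(additive_assignment)
non_additive_assignment[frozenset({"w1", "w2"})] = 0.9
```

On the view defended here, the second assignment need not be irrational; it simply fails to count as a move within the practice of assigning probabilities.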

3.5.2 Reasons for Engaging in the Practice

The explanation of why probability assignments are governed by the rule of Additivity

is shallow. The explanation is simply that this is part of what it is to be

a probability assignment. Probability assignments are moves in a certain practice.

That practice is governed by certain rules. Those rules require that we assign additive

values.

Simply saying this, however, doesn’t explain why Additivity holds any sway over

us. Simply saying that moving a pawn backwards is against the rules of chess doesn’t

explain why someone shouldn’t move their piece backwards while playing chess. Such

a move might disqualify the game as chess, but why does that matter at all? In other

words, why play chess, and why, in a game of chess, are we subject to the rules of

chess? Two answers can be given. The immediate answer is that we are subject to

the norms governing chess because we intend to play chess. Insofar as we intend to

play chess, we must try to follow its rules. The less immediate answer is that there

are all sorts of pragmatic reasons for playing chess. It is intellectually stimulating.

It is fun. It is something one can enjoy with another person. These are reasons to

subject oneself to the rules of chess.

Similarly, we can offer two related explanations for why our assignments should

conform to the rules governing the practice. The immediate explanation of why we are

normatively bound by the constraints of this practice is that we intend to engage in

this practice. Participating in a practice that is governed by certain norms requires the

intention to conform one’s activities to those norms. Normative practices confer their

norms on us through our intentions to participate in them. The intention to engage

in a normative practice involves an intention to conform our actions to the norms,

and our intentions are what explains the resulting obligation that we come to have to

conform our actions (or probability assignments) to the norms. The intention need

only be implicit. When we offer an acquaintance a casual remark, we (often) intend

that our utterance conform to culturally standard rules of grammar. This intention

isn’t explicit or deliberately formed. But it is clearly still an intention. Similarly, our

intention to conform our assignments to the norms governing probability is implicit

in the act of assigning a probability. It might be thought that this robs the norms

of something. If norms cannot be intentionally11 flouted, because they are built into

conditions of engaging in the practice, are they still norms? It doesn’t matter too

much what we call them, as long as we understand them. If the formal norms turn

out to be constitutive rules rather than genuine norms, we can still explain the fact

that we should assign probabilities that obey them, insofar as we assign probabilities

at all.

A less immediate reason is that there are good reasons to engage in the practice

of assigning probabilities. Probabilities are immensely useful, both epistemically and

practically. We can use probability assignments to come to hold better justified

beliefs and to select actions that are most likely to provide us with the most of what

we want. Engaging in the practice of assigning probabilities provides one with a

simple way to organize and compare one’s diverse bits of evidence and represent the

relative strengths of that evidence in a way that is useful for making decisions. By

packaging evidence into probabilities, we make it possible to select actions on the

basis of expected utilities. Since maximizing one’s expected utilities is often a good

way to get what one wants (especially in the long run), it is wise to engage in a

practice that makes a useful decision procedure available.

11 While I am sympathetic with the idea that the norms cannot be intentionally flouted, I do think that they can be unintentionally disobeyed.

There is a variety of good reasons we have to engage in the practice. It has proven

extremely useful in a number of the domains in which probabilities have been applied.

Additivity is a fundamental part of the practice and seems necessary to the utility to

which assignments of probability have been put. So we have good reason to engage

in the practice that subjects our assignments to Additivity. This is ultimately where

I think the Dutch book arguments become relevant. These arguments constitute

evidence of the pragmatic benefits of engaging in the practice of assigning probabilities.

They don’t explain the norm of Additivity, but they supply us with reasons to

engage in a practice for which Additivity is a norm. This allows the present account

to accommodate the Dutch book argument without making the explanation of the

norms wholly pragmatic. The reason why we are subject to these norms is

our intention to participate in a practice governed by those norms. This

may be regarded as a kind of pragmatic reason, but by divorcing the justification

from other desires, and internalizing the pragmatic ramifications, it makes it much

more tolerably so. In fact, I think that the ultimate explanation is quite similar to

the explanations sometimes given for the norms of belief in terms of its aims. So this

account allows us to make space for Dutch book arguments without giving them too

central a role.
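The pragmatic benefit that Dutch book arguments point to can be shown with a small worked example. The sketch below rests on the standard simplifying assumptions of such arguments, which are stated here rather than defended: the agent treats her value for a proposition, multiplied by the stake, as a fair price at which she will either buy or sell a bet that pays the stake if the proposition is true. The function name and the figures are hypothetical.

```python
def agent_net_payoffs(value_a, value_not_a, stake=1.0):
    """Agent's net payoff in each outcome after trading bets on A and on not-A.

    If her two prices sum to more than the stake, the bookie sells her both
    bets; otherwise the bookie buys both bets from her.
    """
    total_price = (value_a + value_not_a) * stake
    side = 1 if total_price > stake else -1  # +1: she buys both, -1: she sells both

    payoffs = {}
    for a_is_true in (True, False):
        # Exactly one of the two bets pays out, whichever way A turns out.
        winnings = stake
        payoffs[a_is_true] = side * (winnings - total_price)
    return payoffs


# Non-additive values: A and not-A together receive 1.2.
non_additive = agent_net_payoffs(0.6, 0.6)

# Additive values: no guaranteed loss from this pair of bets.
additive = agent_net_payoffs(0.7, 0.3)
```

With the non-additive values the agent loses 0.2 however A turns out, while the additive values leave her even. This illustrates a reason to engage in a practice governed by Additivity; it is not offered as an explanation of the norm.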

3.5.3 The Status of Other Norms

The practice of assigning probabilities is in part constituted by a rule forbidding the

assignment of non-additive values. What else goes into characterizing the practice?

There are many other norms that have been suggested to govern probability assign-

ments. None of these other norms is built into the practice of probability in the

way that Additivity, Normality, and Non-negativity are. It would take a substantial

amount of space to catalog and discuss these alternative norms. I will return to the

subject briefly in the next chapter, when I discuss the significance of conditionaliza-

tion, but it is worth saying something briefly about what else could explain norms

of probability. There are two major sources for potential norms, one internal to the

practice and one external to the practice. The internal source of norms, apart from

the formal constraints, is the goal of assigning probabilities. There are a variety of

other possible sources of norms.

In addition to the formal constraints required by the practice, the goal of the prac-

tice can help motivate some probabilistic norms. One instance of a norm that might

be derived from the goal of the practice is a norm to obey the Principle of Indifference,

which counsels us to assign equal probabilities to equally plausible propositions

for which we have equal evidence. This principle has been formulated in a variety of

different ways, and many problems with its bolder formulations have been uncovered.

Nevertheless, I think that there is a kernel of truth to it. The principle makes sense

as a requirement to have reasons for different probability assignments. If we take

reasons in favor of alternative propositions to be entirely equally weighty, it doesn’t

make sense to assign different probabilities. If we accept that the goal of our proba-

bility assignments is to represent the relative strengths of our evidence, then we can

see why we should assign equal probabilities to propositions that are backed by equal

reasons. Insofar as there is no reason to favor one proposition over another, we should

represent them as having equal evidential support. So it makes sense to make this

principle, insofar as it is true, a result of the goal. Unlike Normality, Non-negativity,

and Additivity, it isn’t a further constraint on the practice, but it is requisite for

meeting the goal.

One instance of a norm that arises from an external source (for an alternate view,

see [17]) is Lewis’s [23] Principal Principle. Loosely stated, the Principal Principle

says that one should assign probabilities in line with known objective chances. So if

one knows that a biased coin has a 2/3 chance of landing heads, one should assign a

probability of 2/3 that it will land heads on the next toss. I agree with Lewis in thinking

that the Principal Principle is integral to our understanding of chance. I think there-

fore that the source of the Principal Principle is our concept of an objective chance.

Plausibly, our concept of an objective chance resembles in some way the concept of

an objective property whose recognition merits a specific probability assignment in response.

Insofar as one thinks that there is a property that

merits a specific probabilistic response, one has reason to adopt that response. This

is a response required by the concept; it doesn’t arise from the practice of assigning

probabilities itself.
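For reference, the loose statement of the Principal Principle given above can be written schematically. The notation is an assumption introduced for this illustration: Cr stands for the agent’s probability assignment and ch for objective chance, and the qualification about admissible evidence that Lewis attaches to the principle is left out.

```latex
\[
  Cr\bigl(A \mid ch(A) = x\bigr) = x
\]
```

Read this way, the principle constrains the assignment only conditional on a judgment about chance, which fits the suggestion that its source is the concept of objective chance rather than the practice of assigning probabilities.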

The explanation that I have offered for Additivity is both simple and straightforward.

It makes sense of Additivity in a thoroughly non-mysterious fashion, without

giving up on the insights of past Bayesians who were attracted to pragmatic vindi-

cations of the norms. Their pragmatic stories have a place, but the place that they

have is mediated by our intentions to engage in a practice. They explain why those

intentions are not arbitrary, and those intentions explain why we are subject to Addi-

tivity. In the next chapter, I will take up the question of how probability assignments

with respect to different bodies of evidence ought to relate to each other. I will sug-

gest that one customary norm regarding their relation, conditionalization, should be

understood in terms of commitment preservation. This should help to demonstrate

the variety of different sources of the norms governing probability.

Chapter 4

The Authority of Conditionalization

4.1 The Bayesian Procedure

We should only believe those propositions that are, given our evidence, rather likely

to be true. The ‘Bayesian procedure’, inspired by Thomas Bayes, is one way to

figure out how likely a proposition is on the basis of our evidence. At the

core of this procedure is conditionalization, a method for updating a probability

assignment with the inclusion of new evidence. To apply the procedure, we first

commit ourselves to one probability assignment based on some subset of our evidence,

and then conditionalize that assignment on the remainder of our evidence.

For Bayesian epistemologists, who accept the doctrine of degrees of belief, the

Bayesian procedure resembles the normal course of rational belief change. Since I am

skeptical of that doctrine, I will regard the Bayesian procedure as one among many

possible methods that we may employ to decide what to believe. I think that we are

under no rational obligation to go through the steps necessary to apply the method; we

are free to use other methods or heuristics when they prove more useful. If, instead of

using the Bayesian procedure we apply another method, justified by its expediency,

that produces results that differ from the results of the Bayesian procedure, it is

rational to assign probabilities that are in line with that other method.

I will assume all of this and focus on a residual question: other methods for

settling on probabilities may be more expedient, and so one may opt not to apply

the Bayesian procedure or bother to discern its dictates – but is it ever rationally

permissible to ignore those of its dictates of which one is aware? A person may get

away with failing to comply with a superior’s orders when they don’t actually receive

those orders. The superior may still have authority over their subordinate insofar as

the subordinate would be compelled to comply with any orders that they did receive.

Is the Bayesian procedure authoritative in the same way?

I will argue that the Bayesian procedure is not always authoritative. Sometimes

the thing to do is to ignore its dictates. My argument for this fact will depend upon

a particular analysis of the source of the procedure’s (limited) authority. I’ll pro-

pose that conditionalization makes sense, when it does, by virtue of the fact that

conditionalization is a way of being faithful to a special type of commitment. Be-

cause conditionalization is required to be faithful to a special type of commitment –

specifically commitments to the relative probabilities – this justification of condition-

alization undermines the authority of the Bayesian procedure. If conditionalization is

only necessary to be faithful to a certain kind of commitment, then when we haven’t

undertaken that kind of commitment we are not obligated to heed the dictates of the

Bayesian procedure.

The plan for the chapter is as follows. First, I will explain the Bayesian procedure

in more detail. The procedure requires substantive inputs – commitments to proba-

bility assignments given bodies of evidence – which it doesn’t supply any guidance in

selecting. This may seem to make the procedure useless, but I will instead suggest

that it gives us some insight into the real significance of the procedure. In the second

section, I will give an analysis of that significance. Often, the Bayesian procedure de-

livers probability assignments that one should adopt as one’s own in order to respect

one’s present commitments. In the third section, I will present a case that raises

concerns about the authority of the procedure. In the final section, I will discuss

what this case shows about the limits of the procedure’s authority.

4.1.1 The Details

The Bayesian procedure is a way of settling on a probability to assign to a given

proposition (the ‘target’ proposition). The procedure works by updating a probability

assignment based on a subset of our total evidence on the remainder of our evidence.

It requires two inputs. First, it requires a division of our evidence into two parts: the

evidence that we base the input probability assignment on, and the evidence that

we use to update it. We must be able to encapsulate the latter bit of evidence as a

proposition (the ‘evidence’ proposition). The procedure also requires that we start

with an assignment of probabilities to the members of an algebra of propositions that

includes both the target proposition and the evidence proposition.1

The input probability assignment is customarily referred to as the ‘prior proba-

bility assignment’. In order for the procedure to make sense, it is critical that the

prior probability assignment represent something about our commitments to what

the probability should be in light of the relevant restricted body of evidence. This

commitment needn’t involve a belief about any objective measures of evidence: it

may instead simply reflect a personal commitment that we have undertaken. Though

the procedure relies on an input probability assignment in order to provide any guid-

ance on what probabilities to assign, it isn’t useless. Often, the prior probability

assignment is based on a body of evidence whose probabilistic significance we are in

a better position to assess. The value of the Bayesian procedure is that it provides a

¹ Strictly speaking, we don’t need to assign probabilities to every member of the algebra, just to the evidence proposition and the conjunction of the target proposition and the evidence proposition.


precise way in which to accommodate the remainder of our evidence. If we know what

to think about the probability of the members of our algebra given that restricted

body of evidence, we can figure out what to assign given the total body of evidence.

The Bayesian procedure consists of conditionalizing the specified prior probability

assignment on the evidence proposition. The evidence proposition rules out every

member of the prior algebra that it is inconsistent with. A proposition is untouched

by the evidence if it entails the evidence proposition. An untouched proposition

is entirely consistent with the evidence – there is no way in which the proposition

could turn out to be true while the evidence proposition is false. Each conjunction

of any proposition with the evidential proposition is untouched. Each conjunction

of a proposition with the negation of the evidential proposition is ruled out. The

essential feature of conditionalization is that it sets to 0 the probabilities of all ruled

out propositions and preserves the ratios of all of the untouched propositions.
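The two features just described can be stated compactly. What follows is a sketch in standard notation rather than anything taken from the text: Pr is the prior assignment, E is the evidence proposition, and Pr_E is a label introduced here for the result of conditionalizing on E, assuming Pr(E) > 0.

```latex
\[
  \Pr\nolimits_E(X) \;=\; \frac{\Pr(X \wedge E)}{\Pr(E)}
  \qquad \text{for every } X \text{ in the algebra.}
\]
% Ruled-out propositions: if X is inconsistent with E, then Pr(X and E) = 0, so
\[
  \Pr\nolimits_E(X) = 0 .
\]
% Untouched propositions: if X and Y each entail E, and Pr(Y) > 0, then
\[
  \frac{\Pr\nolimits_E(X)}{\Pr\nolimits_E(Y)}
  = \frac{\Pr(X)/\Pr(E)}{\Pr(Y)/\Pr(E)}
  = \frac{\Pr(X)}{\Pr(Y)} .
\]
```

Applied to the target proposition, the first line uses only the prior probabilities of the evidence proposition and of its conjunction with the target.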

To see exactly what conditionalization involves, it can be helpful to think about

probability assignments geometrically. A probability assignment can be geometrically

represented by an association of propositions with regions. The region associated with

a conjunction of two propositions is the intersection of the conjuncts’ associated

regions. The region associated with a disjunction of two propositions is the union

of the disjuncts’ associated regions. The probability of each proposition

corresponds to the relative size of its associated region. The total space, which

corresponds with what we know to be true, is treated as having a size of 1. Every

proposition has a probability equal to the size of its associated region.

On this way of representing a probability assignment, conditionalization is a geo-

metric process. It involves first excising the regions associated with ruled-out propo-

sitions and scaling the remainder back up to a total size of 1 in such a way as to

preserve ratios of untouched propositions.


In the following geometrical representation, we have a probability space consisting

of three propositions: ψ, φ, and γ. The initial assignment grants them all an equal

(1⁄3) probability. The evidence proposition is the negation of ψ, so incorporating the evidence rules

out ψ. φ and γ are untouched propositions. The untouched propositions retain their

ratio as the total space is renormalized. φ and γ both come to have a probability of

1⁄2 after conditionalization.

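[Figure: three panels. First: the space divided into regions ψ, φ, and γ. Second: the same space with ψ excised. Third: φ and γ rescaled to fill the space.]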
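The excise-and-rescale step in this example can be checked mechanically. The following is a minimal sketch, assuming the three propositions are treated as mutually exclusive cells of the space; the function name `conditionalize` and the cell labels are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def conditionalize(prior, ruled_out):
    """Zero out the ruled-out cells and rescale the rest so they sum to 1."""
    remaining = sum(p for cell, p in prior.items() if cell not in ruled_out)
    if remaining == 0:
        raise ValueError("the evidence rules out every cell with positive probability")
    return {
        cell: (Fraction(0) if cell in ruled_out else p / remaining)
        for cell, p in prior.items()
    }

# Three equally weighted cells, as in the example above.
prior = {"psi": Fraction(1, 3), "phi": Fraction(1, 3), "gamma": Fraction(1, 3)}
posterior = conditionalize(prior, ruled_out={"psi"})
print(posterior)
```

Dividing every surviving cell by the same number is what keeps the ratio between φ and γ fixed while the total returns to 1.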

4.2 The Logic of the Procedure

If the Bayesian procedure is to have any kind of authority, its reliance on conditionalization

must be vindicated. Conditionalization is a very popular account of how

to go about updating a probability assignment on new evidence, and so numerous

defenses of it have been given. Most of these defenses assume the doctrine of degrees

of belief and take conditionalization to concern the rational evolution of these beliefs.

By focusing on the Bayesian procedure, I have used conditionalization as a synchronic

process. We often conditionalize a probability assignment relative to a subset of our

present evidence on our total evidence. Nevertheless, with a bit of tweaking, the same

strategies for justifying conditionalization as a diachronic process might be carried

over for its synchronic application.

I will discuss two different attempts at justifying conditionalization that have

played a prominent role in contemporary discussions: Dutch book arguments and

accuracy-based arguments. After discussing and criticizing these two proposals, I

will consider a third proposal which extends the account of the formal norms that

I offered in the previous chapter to conditionalization. On this third view, the


authority of conditionalization is constitutive of the practice of assigning probabilities.

I will ultimately reject this view and offer an alternative that takes the authority of

conditionalization (what authority it has) to arise from our commitments to how

evidence should be represented.

4.2.1 Dutch Book Arguments

Conditionalization is a process by which we derive one probability assignment from

another. If the Bayesian procedure is to have authority, it is because conditionaliza-

tion tells us how probability assignments with respect to different bodies of evidence

ought to relate to each other. The Dutch book argument for conditionalization [34]

rests on the practical problems that arise if we don’t conditionalize.

The Dutch book argument shows that those who fail to conditionalize and at-

tempt to maximize their utility will be willing to accept diachronic Dutch books –

a set of bets offered at different times that appear individually advantageous but

that collectively guarantee a loss. By accepting those bets that individually appear

advantageous, those who fail to conditionalize will be led to be worse off no matter

how things turn out. This is unfortunate.
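To make the guaranteed loss concrete, here is a minimal sketch of one standard way such a book can be built. It is an illustration under stated assumptions, not a reconstruction of the argument in [34]: the agent prices every bet at its expected payoff, has prior probability `e` for the evidence E and prior conditional probability `q` for H given E, and plans to adopt probability `r` for H after learning E, with `r < q`. All names and numbers are hypothetical.

```python
from fractions import Fraction
from itertools import product

def net_outcome(e, q, r, E, H):
    """Agent's net gain from the whole package of bets in the world where
    E and H have the given truth values (assumes r < q)."""
    # Time 0: the agent buys three bets at prices equal to their prior expected payoffs.
    cost = q * e            # bet 1: pays 1 if H and E
    cost += q * (1 - e)     # bet 2: pays q if not E
    cost += (q - r) * e     # bet 3: pays (q - r) if E
    payoff = Fraction(0)
    if E and H:
        payoff += 1         # bet 1 pays out
    if not E:
        payoff += q         # bet 2 pays out
    if E:
        payoff += q - r     # bet 3 pays out
        # Time 1: having learned E, the agent prices H at r and sells a bet paying 1 if H.
        payoff += r
        if H:
            payoff -= 1
    return payoff - cost

e, q, r = Fraction(1, 2), Fraction(3, 5), Fraction(2, 5)
outcomes = {(E, H): net_outcome(e, q, r, E, H)
            for E, H in product([True, False], repeat=2)}
print(outcomes)
```

Each transaction looks fair to the agent when it is made, yet the net result is the same loss of (q - r) * e in every world. If the planned value `r` were above `q`, the same construction would run with the buying and selling roles reversed.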

Dutch book arguments show that there are pragmatic reasons to conditionalize.

However, it is difficult to translate this observation into a satisfying explanation of

why it is that we should conditionalize. I offered one such interpretation of the

argument in the previous chapter according to which we should infer the irrationality

of the beliefs from the unfortunateness of acting on their behalf, and presented several

criticisms of it. Those criticisms continue to have force in the present context: Dutch

books threaten to make conditionalization dependent on one’s decision procedure,

they aren’t sufficiently epistemic, and the unfortunateness of accepting Dutch books

doesn’t obviously entail irrationality about the decisions that lead to them.


There are also a couple of specific reasons to worry about the diachronic versions

of the argument used in defenses of conditionalization. Such arguments establish too

much and too little. First, the same considerations that counsel for conditionalizing

also seem to counsel against changing one’s mind. The Dutch book argument shows

us that one can be led to accept bets that will lead to sure losses unless one conditionalizes.

It doesn’t matter why it is that one fails to conditionalize. If we should always

conditionalize, then it is not permissible to change one’s mind about how to represent

one’s evidence. Such changes needn’t conform to conditionalization. But, unless

there are objective facts about evidence, it is hard to see how it could be irrational

to change one’s mind. If the only considerations we can find against conditionalizing

also apply to changing one’s mind, then those considerations prove too much.

Second, diachronic Dutch book arguments aren’t quite able to justify the syn-

chronic Bayesian procedure. They show something about how it is that the proba-

bility assignments one bases one’s decisions on should change over time. They don’t

show anything about how we are obliged to represent probability with respect to

different bodies of evidence at a single time. One can avoid accepting Dutch books

so long as one always updates one’s probability assignment on one’s total evidence

by conditionalization. This is compatible with wild swings in what one assigns to

propositions relative to non-total bodies of evidence. Such swings go against the

spirit of conditionalization and flout the Bayesian procedure, but they don’t produce

the same pragmatic issues. This second problem is the inverse of the first. By being

focused entirely on the probability assignment relative to the total body of evidence

at different times, the argument is too weak to establish the synchronic Bayesian

procedure.


4.2.2 Accuracy-Based Arguments

As with the formal norms, accuracy-based arguments have recently become popular

as an alternative to Dutch book arguments for conditionalization because they promise

to offer a less pragmatic explanation. Accuracy-based arguments rely on the thought

that our attitudes characteristically aim to maximize accuracy. (Versions of this ar-

gument, such as that of Wallace and Greaves [13] discussed below, focus on ‘epistemic

utility’ which may reflect other virtues of a probability assignment beyond accuracy.)

Ideally, an accuracy-based argument would contain a proof that any updating proce-

dure other than conditionalization is bound to produce less accurate results. That is,

insofar as one aims at accuracy, one is always better off conditionalizing than not con-

ditionalizing. This would make the accuracy-based argument for conditionalization

similar to the accuracy-based argument for the formal norms.

The ideal is unobtainable. The accuracy of the result of updating a probability

assignment in a particular way depends both upon the accuracy of the original as-

signment and how it is updated. If one starts with a highly inaccurate assignment,

one is often better off simply scrapping it. Whether someone is better off condition-

alizing depends upon exactly what kind of assignment they start with. The next best

thing to the ideal involves relativizing the argument: maybe conditionalization won’t

always produce the most accurate result, but it will be such that one should always

think that it will. Given our uncertainty, conditionalization may always seem like the

best option.

David Wallace and Hilary Greaves [13] have proven a result of this kind. They

proved that no matter how accuracy is measured, everyone should expect that the

most accurate probability assignment is whatever assignment would be deemed most

accurate by the probability assignment arrived at by conditionalizing. They don’t

show that one should expect that conditionalizing itself delivers the most accurate

assignment, only that conditionalizing delivers the assignment that one should expect


to provide the best advice about what probability assignment to accept on the basis

of accuracy. Wallace and Greaves point out that if probability assignments regard

themselves as being the most accurate (as they do if accuracy is the sole virtue and

is measured with the Brier score or another proper measure) then conditionalization

is rationally required.
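The parenthetical claim that an assignment regards itself as most accurate under the Brier score can be illustrated for a single proposition. This is a simplified sketch, not the general result: `expected_brier` and `self_recommended` are names introduced here, and the search is over a coarse grid of candidate values.

```python
def expected_brier(p, q):
    """Expected squared error of assigning q to a proposition, by the lights of probability p."""
    return p * (1 - q) ** 2 + (1 - p) * q ** 2

def self_recommended(p, steps=100):
    """Grid-search the candidate value in [0, 1] that p expects to be most accurate."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda q: expected_brier(p, q))

for p in (0.2, 0.5, 0.7):
    print(p, self_recommended(p))
```

In each case the value with the lowest expected inaccuracy is the assignment's own value, which is the sense in which it recommends itself.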

This is a very interesting result, but its limits should be noted. It faces many of

the same problems as the accuracy-based arguments for the formal norms. It is not

clear how the argument should work if accuracy can’t be measured precisely. Though

Wallace and Greaves purport to only assume a more general account of epistemic

utility, it is clear that they have accuracy in mind when suggesting that their result

helps to justify conditionalization. If we allow that epistemic utility includes other

virtues, it is not at all clear that probability assignments will recommend themselves.

If probability assignments don’t recommend themselves, then there is no guarantee

that conditionalization will be optimal. For instance, if we think that probability

assignments should be evaluated in terms of how closely they conform to what is

licensed by the evidence, then there could be situations, like the following, in

which we shouldn’t conditionalize.

Suppose that someone assigns a probability of 1 to the proposition that the bod-

ies of evidence A and B license the following probability assignments to ψ and φ

(respectively):

A      ψ      φ
a      .5     .2
b      .1     .2

B      ψ      φ
a      .6     .4
b      0      0

Then suppose that body of evidence B contains all those propositions in body

of evidence A, along with the proposition a. Suppose that, in possession of body of

evidence A, they assign probabilities in light of the above values. What should they do

once they subsequently learn a? Conditionalization would deliver a final probability

of about .7 to ψ. However, that assignment wouldn’t recommend itself in light of their


commitments to facts about what evidence licenses. Since they are entirely sure that

body of evidence B licenses an assignment of .6 in ψ, it would recommend adopting a

probability of .6 in light of the evidence. In fact, if someone wants to maximize their

epistemic utility, measured in closeness to what the evidence licenses, they should go

with .6.²
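The arithmetic behind the two competing recommendations can be laid out as follows. This is a sketch that reads the table entries as joint probabilities of the column proposition with the row proposition, an interpretation adopted here because the entries for A sum to 1 and because it reproduces the value of about .7; the variable names are illustrative.

```python
from fractions import Fraction

# Assignment the agent takes evidence A to license (first table, read as joint probabilities).
licensed_by_A = {
    ("psi", "a"): Fraction(5, 10), ("phi", "a"): Fraction(2, 10),
    ("psi", "b"): Fraction(1, 10), ("phi", "b"): Fraction(2, 10),
}
# Value the agent is certain evidence B licenses for psi (second table).
licensed_by_B_psi = Fraction(6, 10)

# Conditionalizing the A-assignment on a: divide the (psi, a) cell by the total for row a.
pr_a = sum(p for (_, row), p in licensed_by_A.items() if row == "a")
conditionalized_psi = licensed_by_A[("psi", "a")] / pr_a

print(conditionalized_psi, float(conditionalized_psi))
print(conditionalized_psi - licensed_by_B_psi)
```

The first line of output is the conditionalized value for ψ, and the second is the gap between it and the value the agent takes evidence B to license.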

If we do think that bodies of evidence license probability assignments that don’t

relate by conditionalization, and if we aim in any way to assign probabilities licensed

by our evidence, then conditionalization may not be the way to go. It begs the ques-

tion to think that probability assignments must relate by conditionalization. Hence,

Wallace and Greaves rely on questionable assumptions to justify conditionalization.

4.2.3 Constitutivism

In the third chapter, I argued that the formal norms of probability need no explana-

tion. Being a probability assignment involves being subjected to the formal norms.

The norms get their grip on us through our intention to assign probabilities. Not all

norms of probability are like this, and I suggested that Lewis’s Principal Principle

and the Principle of Indifference were not. Conditionalization may seem much more basic,

and therefore more plausible as a constitutive norm. However, I think that it is not.

My reason for denying constitutivity to conditionalization through its role in the

Bayesian procedure is that it does not appear to be a requirement for engagement in

the practice. (Intentional) obedience to constitutive norms is essential to participation

in a practice. For this reason, a good guide to whether or not a norm is constitutive of

a practice is whether or not it is coherent for someone to intentionally and flagrantly

² Conditionalization only really applies when we gain new evidence. This is known to create problems for its application, since our evidence typically changes by shifts of evidence, not mere accretion. When our body of evidence becomes B, for instance, we not only gain the information that our total body of evidence is B, but we also lose the information that our total body of evidence is A.


violate that norm while attempting to engage in the practice.³ For example, one

cannot understand the game of chess and intend to play chess while moving one’s

pawns backward. Someone who intends to play a game that allows moving pawns

backward intends, at best, to play a variant of chess. This is why the moves available

to a pawn are constitutive. Since that move violates the rules, and intended obedience

to the rules is required for an intention to play chess, it is incoherent to intentionally

move backwards. On the other hand, one can intentionally make a move that is

strategically unsound – say, trading a queen for a pawn for no gain in position –

without incoherence.

Those norms that cannot be intentionally violated without confusion about the

practice or incoherence are constitutive. The Principle of Indifference and the Prin-

cipal Principle can be violated without confusion about the practice of assigning

probabilities, and for this reason, they are not constitutive. On the other hand, the

formal norms cannot be coherently violated. Conditionalization falls on the side of being coherently

violable. While it isn’t coherent to intentionally violate a formal norm, it is coher-

ent to intentionally refrain from conditionalizing. One might opt to follow heuristics

to decide what probabilities to assign, and these heuristics might diverge from the

Bayesian procedure. Since it is coherent to disobey the Bayesian norm while intend-

ing to assign probabilities, obedience to the Bayesian norm can’t be constitutive of

the practice of assigning probabilities, and needs some other explanation.

It is perfectly possible to introduce a new practice that is constitutively governed

by the rule: ‘assign values in accordance with the Bayesian procedure’. This practice

isn’t the practice we currently engage in when we assign probabilities. The Bayesian

procedure could be constitutive of a practice, but it is not actually constitutive of our

³ One may violate a norm governing a practice that one doesn’t recognize. However, insofar as they intend to engage in the practice, if they were to discover that that norm governs that practice, they would have to alter their behavior or give up their intention.


practice. Hence, we can’t rely on constitutivity to account for the authority of the

Bayesian procedure.

4.2.4 Authority through Commitment Preservation

The key to the normative authority of the Bayesian procedure lies in our relation to

the prior probability assignment. Our commitment to the dictates of the procedure

should be no greater than our commitment to what we feed into it. If we take a shot in

the dark in assigning the prior probability, we’ve got no reason to trust in the results

of conditionalizing. As I suggested in the first chapter, we don’t need to think that

any probability assignment is right in order to commit ourselves to one. We may think

of other probability assignments as equally rational responses to the evidence without

being indifferent between them, just as one may have particular desires without seeing

those desires as any more or less rational. Adopting a probability assignment from

among the rational assignments available involves taking a personal stance on what

probability is reasonable on the basis of that evidence.

Part of what it is to assign a probability is to commit oneself to representing a body

of evidence with that probability. I propose that the Bayesian procedure receives what

authority it has from the commitments that lead us to settle on the prior probability

assignment. The normative force of the Bayesian procedure comes from the fact

that conditionalization is the only way to properly respect our commitments when

we assign a posterior. By committing ourselves to the prior, we may have already

committed ourselves to the posterior.

This approach to understanding the authority of the Bayesian procedure has sev-

eral advantages over the previous proposals that I’ve discussed. First, it is clear why

conditionalization is an epistemic norm. The commitments that we have to the prior

probability assignment are commitments about how to represent evidence with prob-

abilities. They are clearly epistemic commitments. While we may lack practical or


moral reasons for conditionalizing, our epistemic commitments provide us with epis-

temic reasons to conditionalize. Second, we needn’t rely on evidentialism to deliver

these epistemic commitments. It is doubtful that single probability assignments are

objectively demanded by many bodies of evidence. There needn’t be a unique rational

response to a body of evidence in order for the present approach to work. The reason is that

the normative force doesn’t come from the evidence, but from ourselves. We don’t

conditionalize because conditionalization is the right way to respond to the evidence;

we conditionalize because it is the only way to respect our own commitments. Third,

there is no problem handling the synchronicity of the Bayesian procedure. The fact

that we assess probabilities with respect to different bodies of evidence at the same

time produces no technical challenges for this account. It is designed to handle them.

Fourth, since the value of conditionalizing doesn’t lie in the consequences of doing

so, this view does not prohibit changes of opinion. We can alter our commitments

whenever we like. What we believed in the past has no sway over what we now be-

lieve. When we do alter our commitments, however, we must do so across the board.

If multiple responses to the evidence are rational, then we can rationally shift from

one position to another, so long as we shift both our prior and posterior assignments.

4.2.5 How Commitments are Preserved

This approach to justifying conditionalization will only be viable if we can explain

how it is that commitments to a prior probability assignment make for commitments

to posterior probability assignments. There are three important ingredients in the

account that I’ll offer. First, I will give an account of the nature of our commitments. I

will suggest that many of our commitments are commitments to relative probabilities.

Second, I will reiterate my analysis of what conditionalization really amounts to: it

preserves the probability ratios of untouched propositions. Third, I will propose that

any proposition that is entailed by two separate propositions provides no evidence that


is relevant to the two propositions’ relative probabilities. I’ll take each component in

turn.

Commitments to a prior assignment may take many forms. Just as we may be

committed to a political cause for moral, religious, or social reasons, we can be com-

mitted to a probability assignment because of other, more basic, commitments. The

precise form of our commitments can explain why it is that we should care about

conditionalization, though the fact that we can be committed to a probability assign-

ment for other reasons is important to the limitations of conditionalization as well,

as I will explore later.

The first ingredient concerns our commitments. One form that our commit-

ments may take is commitment to the relative probabilities of propositions in light of

the evidence. A commitment to relative probability in light of the evidence is a com-

mitment about how to regard two propositions on the basis of that evidence. Such a

commitment only tells us the relationship between the numbers assigned to the two

propositions, but with enough such relative commitments, we may settle on specific

values for a whole algebra of propositions. If we are committed to assigning an equal

probability to a collection of inconsistent propositions which collectively exhaust the

space of possibilities, then we must assign them a value solely dependent on the

number of such propositions. I will reserve the term ‘relative commitment’ for com-

mitments about the ratios that probability assignments should take, although there

are certainly other kinds of relationships we could be committed to having between

the probabilities we assign to propositions.
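The way relative commitments fix specific values can be sketched in a few lines. The example assumes the propositions are mutually exclusive and jointly exhaustive, and that each relative commitment is expressed as a weight; `from_ratios` and the labels are hypothetical names used only for illustration.

```python
from fractions import Fraction

def from_ratios(weights):
    """Turn relative weights over mutually exclusive, jointly exhaustive
    propositions into probabilities that sum to 1."""
    total = sum(weights.values())
    return {name: Fraction(w) / total for name, w in weights.items()}

# Equal relative commitments over n propositions force the value 1/n on each.
equal = from_ratios({"p1": 1, "p2": 1, "p3": 1, "p4": 1})
print(equal)

# A commitment that the first proposition is twice as probable as each of the others.
unequal = from_ratios({"p1": 2, "p2": 1, "p3": 1})
print(unequal)
```

Only the ratios are supplied as input; the requirement that the values sum to 1 does the rest.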

Relative commitments are quite common. We often decide what probabilities to

assign by trying to estimate how two propositions compare with each other. Once

we have figured out how they relate to each other, we can deduce what numbers

they must be assigned in order to maintain probabilistic coherence. It is important

that these relative probabilities are assigned because we adopt a commitment that


the evidence demands them. While all probabilities are assigned as a response to our

evidence, and as a way of representing that evidence, some probabilities may be more

directly demanded by the evidence than others. We may be committed to assigning

ψ and φ equal probabilities in light of the evidence, and committed to assigning φ and

γ equal probabilities in light of the evidence, without being committed to assigning

ψ and γ equal probabilities in light of the evidence.

The second ingredient concerns conditionalization. As I explained earlier, what

is special about conditionalization is that it preserves the ratios of untouched propo-

sitions. This fact about conditionalization exhausts it. Conditionalization should

therefore be seen as that procedure that leaves the ratios of untouched propositions

alone and restores a probability assignment to probabilistic coherence, given that

every proposition that was ruled out must be assigned a probability of 0.
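The claim that ratio preservation exhausts conditionalization can be checked with a short derivation. This is a sketch under assumptions stated in the comments, in notation introduced here: Pr is the prior, E the evidence proposition, and Pr' any coherent assignment that zeroes out the ruled-out propositions and preserves the ratios of untouched ones.

```latex
% Assumptions: Pr(E) > 0; Pr' is probabilistically coherent;
% Pr'(X) = 0 for every X inconsistent with E;
% Pr'(X)/Pr'(Y) = Pr(X)/Pr(Y) for all X, Y that entail E with Pr(Y) > 0.
\begin{align*}
  \Pr'(E) &= 1 - \Pr'(\neg E) = 1
    && \text{($\neg E$ is ruled out)} \\
  \Pr'(X \wedge E) &= \Pr'(E)\,\frac{\Pr(X \wedge E)}{\Pr(E)}
                    = \frac{\Pr(X \wedge E)}{\Pr(E)}
    && \text{($X \wedge E$ and $E$ are untouched)} \\
  \Pr'(X) &= \Pr'(X \wedge E) + \Pr'(X \wedge \neg E)
           = \frac{\Pr(X \wedge E)}{\Pr(E)}
    && \text{($X \wedge \neg E$ is ruled out)}
\end{align*}
```

The last line is the usual ratio formula for conditional probability, so any update meeting the two conditions coincides with conditionalization on E.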

The final ingredient is the claim that any bit of evidence that is entailed by

each of two propositions doesn’t alter the balance of evidence we have for those two

propositions. This claim rests on the thought that evidence entailed by each of two

propositions doesn’t help discern between them. If we expect that the pressure

will drop whether it rains or hails, then we cannot use the fact that the pressure

dropped to reevaluate whether we think that it is more likely to rain or to hail. If we

think that if either Frank or Sal comes in first place, Lee will come in third, then

we can’t use the fact that Lee came in third to reevaluate whether we think Frank or

Sal is more likely to have come in first. If judgments about probability aim to capture

the balance of evidence, then the addition of undiscerning evidence shouldn’t alter

the ratios of assigned probabilities demanded by the evidence.

Now we can assemble the ingredients. If we have a commitment regarding the

relative probability of two propositions on a particular body of evidence, then the

nature of the commitment means that we are also committed to the same relative

probability with the addition of any undiscerning evidence. The same commitment


to a relative probability in light of the old evidence will lead to a commitment to

the same relative probability in light of the additional undiscerning evidence. Thus

if we are committed to regarding any two propositions as having a particular ratio of

probabilities given a certain body of evidence, we are committed to them having the

same ratio with the addition of any evidence entailed by both propositions.

It follows from this that if we are committed to the relative probabilities of propo-

sitions in the prior assignment, we are committed to maintaining the ratios of all

untouched propositions with the addition of undiscerning evidence. Since this is all

that conditionalization does and since conditionalization is the only way of updating

one’s probabilities that does this, we should conditionalize. The Bayesian procedure

is authoritative if and only if we are rationally required to assign probabilities on

the basis of different bodies of evidence that are related by conditionalization. Re-

specting our commitments to relative probabilities in light of the evidence requires

conditionalizing. So the Bayesian procedure is authoritative if we have commitments

to the relative probabilities of all of the members of our algebra in light of the evidence.

This account provides a neat explanation of the authority of the Bayesian proce-

dure, but it is limited. It requires that our prior assignment reflect commitments that

we have about relative probabilities in light of the evidence. Often this is not the case

and when it isn’t we won’t have any reason to conditionalize. In the remainder of the

chapter, I will explore and defend the idea that we may rationally lack commitments

to relative probabilities.

4.3 An Illustrative Case

Before I begin discussing the limitations of the Bayesian procedure, I will present a

case that illustrates them. In this section, I will describe a case in which I think that

the Bayesian procedure is not authoritative, and I will give an argument to this effect.


This case should provide a vivid introduction to the kinds of ways we might fail to

have the relative commitments that I previously suggested were vital for explaining

the authority of the Bayesian procedure.

4.3.1 Alice’s Predicament

Consider Alice’s predicament:

Alice the Astrophysicist. Alice has a new theory of dark matter. Ac-

cording to her theory, dark matter is composed of a hitherto-unobserved

particle – the D-particle – that is part of a natural extension of the Stan-

dard Model. The properties of this particle do a very good job explaining

the observed properties of dark matter.

A gap in her theory concerns the origins of the particle: it is not produced

by any known particle interactions. Mathematical reasoning allows her to

narrow down the candidates for the process that might have produced

the particle to two. The particle could be produced as a result of either

low-energy or high-energy supersymmetry breaking, but not both. Alice

has no a priori reason to prefer either hypothesis; her theory makes no

predictions about how the D-particle is actually produced. But experi-

mentation with a particle accelerator has confirmed that no such particle

is produced by low-energy supersymmetry breaking.


Alice is concerned with four hypotheses:

d-par     Dark matter is composed of D-particles.
high      The D-particle is produced during high-energy supersymmetry breaking.
low       The D-particle is produced during low-energy supersymmetry breaking.
¬d-par    Dark matter is not composed of D-particles.

Alice’s Question: How likely should Alice think it is that d-par is

correct, given the fact that low is not?

If she is to use the Bayesian procedure, Alice must try to assign a probability

to each member of the algebra generated by her four hypotheses. Suppose that

she forms a prior assignment by considering what her non-empirical evidence (that

is, her evidence without the results from the particle accelerator) merits, and then

conditionalizing on the remainder of her evidence. Suppose that the combination of

naturalness and fit with the data lead Alice to assign a 2⁄3 prior probability to the

proposition that her theory is true. Further, in the absence of evidence, she divides

the probability of high and low up equally, so that each proposition receives a

probability of 1⁄3.

In Alice the Astrophysicist, the Bayesian procedure tells Alice that upon incorpo-

rating the evidence that rules out low, she should maintain the relative probabilities

of high and ¬d-par because they are both inconsistent with low. Since she assigned

them both a probability of 1⁄3 before, and they come to exhaust the possibilities, the

Bayesian procedure says that Alice should end up assigning them each a probability

of 1⁄2.
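The Bayesian procedure's verdict for Alice can be reproduced directly. This sketch assumes her algebra is generated by three mutually exclusive, jointly exhaustive cells (high, low, and the denial of her theory), with high and low together making up d-par; the dictionary keys are illustrative labels.

```python
from fractions import Fraction

# Alice's prior over the three cells.
prior = {
    "high": Fraction(1, 3),       # d-par, produced by high-energy breaking
    "low": Fraction(1, 3),        # d-par, produced by low-energy breaking
    "not_d_par": Fraction(1, 3),  # dark matter is not composed of D-particles
}
prior_d_par = prior["high"] + prior["low"]

# Conditionalize on the accelerator result, which rules out low.
remaining = prior["high"] + prior["not_d_par"]
posterior = {
    "high": prior["high"] / remaining,
    "low": Fraction(0),
    "not_d_par": prior["not_d_par"] / remaining,
}
posterior_d_par = posterior["high"] + posterior["low"]

print(prior_d_par, posterior_d_par)
```

On this reconstruction the probability of d-par falls from 2/3 to 1/2 once low is excised, and the discussion that follows questions exactly this drop.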

Intuitively, the addition of ¬low to Alice’s body of evidence does not alter her

epistemic situation with regard to d-par. Her prior confidence in d-par was driven


by the theory’s naturalness and by the fit between its predictions and the established

data. The fact that there were two sub-cases consistent with her theory was not a

relevant factor in her assignment of prior probability to the theory as a whole. The

theory remains as natural after the evidence that rules low out is included, and the

fit between the theory’s predictions and the known properties of dark matter persists.

This same evidence should continue to drive her division of probabilities between d-par

and ¬d-par. So, the probability she comes to assign to d-par should be the same as

her old probability. It should continue to be 2⁄3.

4.3.2 The Counterfactual Argument

Consider the following counterfactual scenario (differences from the original are ital-

icized):

Counterfactual Alice. Alice has a new theory of dark matter. Accord-

ing to her theory, dark matter is composed of a kind of hitherto-unobserved

particle – the D-particle – that is part of a natural extension of the Stan-

dard Model. The properties of this particle do a very good job explaining

the observed properties of dark matter. The combination of naturalness

and fit support d-par just as they do in the original scenario.

However, there is only one conceivable process that might produce the D-

particle: high-energy supersymmetry breaking. Alice has no independent

reason to think that it does so. In this counterfactual scenario, low-energy

supersymmetry breaking is not a viable process, due to minute alterations

in physical laws that don’t otherwise affect the naturalness and fit of d-par.

Consequently, Alice assigns high the full probability of the theory.

Let Pr−(d-par) be Alice’s probability assignment in the original scenario before

she is able to rule out low, Pr+(d-par) be Alice’s probability assignment in the

original scenario after she is able to rule out low and PrC(d-par) be her probability

assignment in the counterfactual scenario in which low is not a possibility.

The argument goes as follows. Since the same evidential factors bear on Pr−(d-

par) and PrC(d-par), they should be equal. And since the same evidential factors

bear on Pr+(d-par) and PrC(d-par), they should also be equal. Therefore, Pr−(d-

par) and Pr+(d-par) should be equal.
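The structure of the argument can be set beside the Bayesian verdict it opposes. What follows is only a restatement under the case's stipulated values (a prior of 2⁄3 for d-par, split evenly between low and high); the superscripts mirror the three assignments defined above.

```latex
\begin{align*}
\mathrm{Pr}^{-}(\text{d-par}) &= \mathrm{Pr}^{C}(\text{d-par}) = \tfrac{2}{3}
  && \text{(first equality)} \\
\mathrm{Pr}^{+}(\text{d-par}) &= \mathrm{Pr}^{C}(\text{d-par})
  && \text{(second equality)} \\
\Rightarrow\quad \mathrm{Pr}^{+}(\text{d-par}) &= \mathrm{Pr}^{-}(\text{d-par}) = \tfrac{2}{3} \\[4pt]
\text{Bayesian procedure:}\quad
\mathrm{Pr}^{-}(\text{d-par} \mid \neg\text{low})
  &= \frac{\mathrm{Pr}^{-}(\text{high})}{1 - \mathrm{Pr}^{-}(\text{low})}
   = \frac{1/3}{2/3} = \tfrac{1}{2}
\end{align*}
```

The two routes therefore disagree about the posterior probability of d-par: 2⁄3 on the counterfactual argument, 1⁄2 on conditionalization.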

In assigning a probability to a proposition, we commit ourselves to representing

that body of evidence with that probability assignment. It follows that we ought

to assign the same probability whenever we have the same evidence. If a particular

probability is the appropriate representation of a body of evidence, then as long as

that evidence does not change, the probability will remain appropriate. Insofar as

the evidence that Alice possesses does not change in a relevant way, her probability

assignment should not change.

Premise 1: Pr−(d-par) = PrC(d-par)

In the counterfactual situation, Alice assigns her probability to d-par in response

to the same evidence as she initially had in the original scenario. She assigns her

probability in response to the naturalness of her theory and the fit between the

theory and established data. Since she had the same reasons to believe d-par before

she rules out low in the original scenario and in the counterfactual scenario, she

should assign the same probability to the proposition in both cases. Thus, in the

counterfactual scenario, high should be assigned a probability of 2⁄3.

In the setup of the original case, Alice’s probability was stipulated to depend

on the naturalness of her theory and its fit with established data. Her probability

was not taken to depend upon her division of probabilities to hypotheses about how

the particle might be produced. The fact that the theory had two sub-cases was

not counted in its favor. This is a plausible assumption: when we go about assigning

probabilities, we do not normally bother to count or closely examine all of the possible

sub-cases.

This stipulation might be questioned – perhaps Alice is being irrational in assigning

her probabilities in this way. If Alice were an ideal epistemic agent, it is possible

that she would not need to rely on the Principle of Indifference to assign probabilities.

However, imperfect agents such as ourselves use a variety of heuristics to assign

probabilities. I believe that if Alice is imperfect in her ability to collect and analyze

evidence, she needn’t be irrational for basing her assessment of probability only on

coarse-grained features of her body of evidence. We have little basis to conclude that

Alice is irrational for using a heuristic that produces the same probability in both the

prior and posterior assignments.

Premise 2: Pr+(d-par) = PrC(d-par)

What difference could it make whether low was genuinely epistemically open

to Alice in the first place? Ruling out low should not amount to evidence not to

believe d-par if its viability did not provide evidence in its favor in the first place.

By stipulation, it did not provide any such evidence that Alice recognized, so Alice

is in the same epistemic situation with respect to d-par upon ruling out low in the

original case as she is in the counterfactual case. In both cases she has the same

evidence for d-par. The theory exhibits the same naturalness and fit with the data, and

even has the very same viable sub-cases. So Alice should assign the same probability

to d-par in the original case after ruling out low as she should assign to d-par in

the counterfactual case.

The counterfactual argument suggests that the change in probabilities should look

more like this:

[Figure: the partition before the update (low | high | ¬d-par) and after it (high | ¬d-par).]

4.4 Limitations of the Bayesian Procedure

With Alice’s predicament in mind, I’ll return to the issues surrounding the authority

of the Bayesian procedure. First, I will explore what kinds of commitments one

may have other than commitments to representing bodies of evidence with relative

probabilities. Then I will suggest that as the case was described, it is most plausible

that the kinds of commitments that Alice has don’t support the authority of the

Bayesian procedure. Third, I will lay out the view that the authority of the Bayesian

procedure is limited to cases where we have commitments to relative probabilities.

Finally, I will take up two objections to this proposed limitation. The first of these

objections holds that we are rationally required to have commitments to relative

probabilities. The second holds that any commitment to an algebra provides us with

commitments to relative probabilities.

4.4.1 Varieties of Commitment

The Bayesian procedure only makes sense if we are committed to the prior probability

assignment. I offered one explanation of the authority of the Bayesian procedure that

relied on a particular kind of commitment. If we are committed to the relative

probabilities that we assign to propositions, those commitments should survive the

addition of undiscerning evidence. There are other kinds of commitments that we

might have, and the Bayesian procedure doesn’t build in any restriction on the kinds

of commitments it requires.

A commitment to a probability assignment to an algebra of propositions is typ-

ically derived from commitments to the particular propositions that make up the

algebra. These commitments to particular propositions may in turn be derived from

a variety of other commitments. Some of our commitments are fundamental (‘basic’)

and all other commitments are derived from those fundamental commitments and

from each other. I will take the notion of a basic commitment to be a psychological

primitive. Derived commitments are those commitments that we have because they

must be satisfied in order to satisfy all of one’s basic commitments.

There is nothing preventing us from having a basic commitment to a set of proba-

bility assignments to an algebra as a whole, but this would be quite unusual. It is more

typical that we have commitments to an algebra derivatively of having commitments

to the particular propositions that make it up. Nor are we often directly

committed to assigning particular values to propositions. In general, our commitments

to whole probability assignments to algebras are derived from our commitments to

the relations between assignments.

Commitments to relative probabilities are one way of having commitments to the

relations between assignments, but there are also others. We also have comparative

commitments, for instance, when we judge that one proposition is more likely than

another, and we have a kind of higher-order commitment, for instance, when we judge

that the difference in probability between two propositions is equal to the difference

in probability between two other propositions.

We also have commitments to methods. The Bayesian procedure can’t settle all of

our questions about what probabilities to assign. We can have commitments to the

prior that are derived from commitments to procedures for assigning probabilities.

One method for assigning probabilities instructs us to assign equal probabilities

to all analogous propositions in the absence of evidence. Another method is to adopt

the probability assignments of known experts. Another method is to meditate, and

then follow your own gut. Another method is to adopt probabilities in accordance

with known frequencies. When the application of these methods leads us to adopt a

commitment to a relation or a specific value, it is a derived commitment.4

4 The fact that a relation can be derived from a method that we are committed to doesn’t mean that it is derivative. We may have commitments that are overdetermined – we may have basic commitments that we can also derive from our other commitments.

These procedures are all incomplete. We will often need to use many of them to

settle on a probability assignment to a larger algebra. We may even have multiple

commitments that are individually basic and allow the same derivative commitments

to be derived.

As an illustration of the ways in which we may come to be committed to a prob-

ability assignment, consider the following.

P(rain tomorrow) = .5

P(no rain tomorrow) = .5

There are many different ways in which one can come to have this assignment.

Exactly how one arrives at it makes a difference to how one should go about

updating it on the receipt of new evidence. One might, for instance, have come to

assign these probabilities just by examining the clouds, the time of year, the recent

weather, etc. One might then just intuit those probabilities, without utilizing

any particular method. In this case, it is plausible that one would have a basic

commitment to the relative probabilities.

Alternatively, one might come to assign this probability on the basis of statistical

inference from a data set. This would involve a basic commitment to the principles

one used to derive the probability, and a derivative commitment to the relative proba-

bilities. Or, one might have a basic commitment to heeding the testimony of experts,

and one might have come to have a derivative commitment to the assignment after

hearing a meteorologist exclaim that it was as likely as not to rain tomorrow. Or, we

may have a basic commitment to the use of the Principle of Indifference, which counsels

assigning equal probabilities to propositions about which we have no evidence,

and we may regard ourselves as having no evidence about whether it will rain. In this

last case, we would have a basic commitment to the Principle of Indifference, and a

derived commitment to the assignments.
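A small sketch may make vivid how one and the same assignment can have different sources. The two functions below, their names, and the record of 50 rainy days in 100 are hypothetical; they stand in for the statistical and indifference-based routes described above and are not offered as the methods anyone actually uses.

```python
from fractions import Fraction


def from_frequencies(rainy_days, total_days):
    """Derive the assignment from an observed frequency (hypothetical data)."""
    p_rain = Fraction(rainy_days, total_days)
    return {"rain tomorrow": p_rain, "no rain tomorrow": 1 - p_rain}


def from_indifference(propositions):
    """Derive the assignment by splitting probability evenly."""
    share = Fraction(1, len(propositions))
    return {p: share for p in propositions}


statistical = from_frequencies(50, 100)
indifferent = from_indifference(["rain tomorrow", "no rain tomorrow"])

# Different basic commitments, identical derived assignment.
print(statistical == indifferent)

# A longer (equally hypothetical) record changes what the first method delivers.
revised = from_frequencies(80, 120)
print(revised)
```

The numbers alone do not record which commitment stands behind them, and so they do not settle how the assignment should respond when the record grows.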

Thus, relative commitments are only one of many different ways in which we can

be committed to a probability assignment. If the authority of the Bayesian procedure

rests on these kinds of commitments, then when we lack those kinds of commitments

we should not regard the Bayesian procedure as authoritative.

4.4.2 Alice’s Commitments

Alice would have an obligation to accept the verdict of the Bayesian procedure if her

commitments to the prior assignment were commitments to the relative probabilities

of high and ¬d-par in light of the evidence. If she had a basic commitment to thinking

that high should have the same probability as ¬d-par, then that commitment would

mean that she should continue to assign d-par the same probability as ¬d-par after

taking the additional evidence into account. In the way that I described the case,

however, it was suggested that her basic commitments did not include a commitment

to this relative probability. Her commitment was derived from two other commitments

that she had: to the relative probability of d-par and ¬d-par, and to the use of the Principle

of Indifference in settling probabilities in the absence of evidence.

Since she had no reason to favor high over low, she assigned them equal probability.

Her commitment to the Principle of Indifference will remain, and so she will

continue to be obligated to assign an equal probability to any propositions for which

she lacks discerning evidence, but no such two propositions exist in the resulting

algebra and so this commitment is irrelevant to her final probability assignment.

Alice had no basic commitment to the equal probability of high and ¬d-par.

The basic rationale for the authority of Bayesian conditionalization doesn’t hold.

Alice’s commitments to her prior assignment don’t obviously entail much about how

she should be committed to the posterior assignment. Insofar as she lacks a basic

commitment to the equal probability of high and ¬d-par before taking her evidence

from the particle accelerator into account, it is plausible that she isn’t committed to

assigning an equal probability after (unless it is required by her other commitments).

This relation was an accident of her other commitments. There is no reason to think

it is a commitment that should be carried over.

The fact that Alice lacks relative commitments to untouched propositions doesn’t

mean that she has no commitments about the posterior assignment. In order to

decide what Alice’s commitments to the prior assignment commit her to about the

posterior assignment, we must know more about what those were. Alice’s commitment

to the relative probabilities of d-par and ¬d-par wasn’t based on the Principle

of Indifference, so it must have a different source, either derived or basic. If her

commitment to this relative probability is basic, then it will supply Alice with no

commitments after the fact. Any basic commitment she had to the relative probability

of d-par and ¬d-par is irrelevant once she eliminates low, as d-par doesn’t

entail low. Plausibly, however, Alice’s basic commitment to the relative probability

of d-par and ¬d-par will be accompanied by basic commitments to comparative

probabilities of the sub-cases of d-par and ¬d-par; in particular, it is plausible that

she will have a basic commitment to always assign a probability to high no more

than twice as great as the probability she assigns to ¬d-par. This means that she will

continue to be committed to thinking that d-par is no more than twice as probable as

¬d-par.
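To see what this comparative commitment amounts to once low is eliminated, a short derivation may help. It assumes that the second proposition in the comparison is the negation of d-par, so that the two probabilities sum to 1; the symbol p is introduced here only for illustration.

```latex
\begin{align*}
p &:= \mathrm{Pr}^{+}(\text{d-par}), \qquad \mathrm{Pr}^{+}(\neg\text{d-par}) = 1 - p \\
p &\le 2\,(1 - p) \\
3p &\le 2 \\
p &\le \tfrac{2}{3}
\end{align*}
```

On this reading, the commitment caps the posterior probability of d-par at 2⁄3 without fixing it at any particular value.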

If her commitments are derived from methods, then we would need to delve more

deeply into the details of the case and the particular methods involved to know where

she should be after incorporating the evidence. It is possible that her commitments

were to methods that led her to ignore the sub-cases into which d-par was divided. In

that case, it is plausible that she should still assign a probability of 2⁄3 to d-par after

taking the evidence into account; she is still committed to those principles and since

they were blind to the sub-cases, they will continue to deliver an equal probability in

d-par and d-par after the additional evidence is incorporated. On the other hand,

she might have been committed to the 2⁄3 assignment by virtue of a commitment to

methods that took the sub-cases into account. If she did, then depending on the

details of those methods, she might either be obligated to assign a probability in

d-par anywhere from 0 to 2⁄3, or she might lack commitments altogether.
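The behavior of a method that is blind to the sub-cases can be sketched in a few lines. Everything here is a hypothetical illustration: the function `subcase_blind_assignment`, the label `not-d-par`, and the choice to spread the theory's probability evenly over whatever sub-cases remain viable (an application of the Principle of Indifference) are assumptions made for the example, not a method stated in the text.

```python
from fractions import Fraction


def subcase_blind_assignment(viable_subcases, theory_probability=Fraction(2, 3)):
    """Fix the theory's probability first, then divide it among sub-cases.

    The theory's probability is set without reference to how many
    sub-cases there are; the sub-cases only share it out evenly.
    """
    share = theory_probability / len(viable_subcases)
    assignment = {s: share for s in viable_subcases}
    assignment["not-d-par"] = 1 - theory_probability  # hypothetical label
    return assignment


before = subcase_blind_assignment(["low", "high"])
after = subcase_blind_assignment(["high"])

print(before)
print(after)
```

Under these assumptions, high rises from 1⁄3 to 2⁄3 while the negation of d-par stays at 1⁄3, so the prior one-to-one ratio between them is not preserved. The Bayesian procedure would instead assign each 1⁄2.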

4.4.3 The Authority of the Bayesian Procedure

The authority of the Bayesian procedure stems from the fact that our commitments

to a prior assignment may commit us to the probability assignment that results from

conditionalizing that prior assignment on our total evidence. The procedure lacks any

kind of authority over those who have no commitments to a prior assignment. The

procedure also lacks authority over those who have the wrong kind of commitment to

the prior assignment. Conditionalizing makes sense when we have commitments to

relative probabilities – specifically to ratios in light of the evidence. But it may not

make sense in other cases. If our commitments to relative probabilities are derivative,

rather than basic, then we must look to the basic commitments to see what we should

think.

If this is right, then there is no magic rule for deciding what probability assignment

one body of evidence merits in terms of the probability assignment that another does.

What one should do will depend upon what commitments one has, with different

commitments leading to different results. In each case, the individual commitments

must be carefully examined to see what those commitments require. How we should

update a probability assignment depends on how we arrived at it.

We could adopt a commitment to utilizing the Bayesian procedure itself. In that

case, we would give the Bayesian procedure unrestricted authority. Perhaps there are

good reasons to adopt such a commitment; there is something very appealing about

conditionalization that is revealed by Dutch book and accuracy arguments. But in

light of the partial vindication of the Bayesian procedure, I don’t think that we should

hold out hope that any more substantial commitment to the Bayesian procedure is

rationally mandatory. Insofar as we lack a commitment to the Bayesian procedure,

it lacks unrestricted authority over us.

4.4.4 Must we have Relative Commitments?

To conclude this chapter, I will briefly look at how a defender of the procedure might

respond. I will consider two responses. First, I will consider the proposal that we

should have commitments to relative probabilities. If we are to be rational, the

thought goes, we must have the commitments that make the procedure authorita-

tive. Second, I’ll consider the suggestion that the derived commitments to relative

probabilities that arise from derived commitments to any assignment to an algebra

are sufficient to provide full authority to the Bayesian procedure. These responses

may leave the Bayesian procedure with some limitations – someone who lacks relative

commitments may not be committing a further blunder in not respecting the author-

ity of the procedure, but if these responses are correct then one cannot get away with

ignoring the dictates of the Bayesian procedure without violating some norm or other.

I will focus on each kind of response in turn.

The first response to the argument against the unrestricted authority of the Bayesian procedure relies

on the thought that we are obligated to have basic commitments to relative probabil-

ities. I am skeptical that there is any rational requirement to have any commitments

whatsoever, so if this response is to have any bite, it is plausible that two things must

be the case: first, we are in the business of adopting commitments to representing ev-

idence with probabilities and second, all bodies of evidence demand that those in the

business of adopting such commitments adopt commitments to relative probabilities

in light of the evidence.

I expressed my skepticism of evidentialism in the first chapter. If evidentialism

were correct – if there were objective facts about what probabilities we should assign

– then it is plausible that we would always be obligated by the evidence to adopt

commitments to relative probabilities. This view may not explicitly require eviden-

tialism, but it is much less plausible without it. If our reactions to evidence are akin

to a matter of personal taste, then whether bodies of evidence always merit

relative probabilities is itself highly subjective.

Even if evidentialism is mostly true, there may be many cases in which (relative)

probabilities are not dictated by the evidence. There are many difficult questions

about which rational disagreement seems permissible, and for which we don’t have

much of a clue about how to figure out an objective way of assigning probabilities.

Is there an objective fact of the matter, given the evidence that we have available

to us, about how likely we should think it is that the human race will survive the next 500

years, that alien life exists within fifty light years, that there is an island of stability

in the elements with more than 200 protons? So long as there are some cases where

we can have commitments that are not commitments to relative probabilities, there

will be some restrictions to the authority of the Bayesian procedure.

Even if the evidence does render one probability assignment uniquely correct,

there is no guarantee that we have any obligation to adopt that assignment. The

evidence available to any child who has passed sixth grade may entail that Goldbach’s

conjecture is false, but that doesn’t mean that anyone who fails to be committed to

assigning the conjecture a probability of 0 is being irrational. Even if evidence does

objectively determine a uniquely correct probability assignment, there is no reason

to think that we should be so constructed as to be rationally required to recognize

the evidence’s import and adopt the right commitments. So it may be rational for

individuals to lack commitments to relative probability assignments even if relative

probability assignments are in some sense uniquely supported by the evidence.

It strikes me as deeply implausible that there are facts about evidential relations

that greatly transcend our ability to recognize them that we are nevertheless rationally

obligated to obey. It is doubtful that every issue regarding evidential support is

closed to rational disagreement. So tentatively, short of an explanation of what the

evidence really does dictate, it seems safer to assume that sometimes the evidence

underdetermines the rational response (at least in terms of relative probabilities). If

evidence underdetermines the rational response, then it is plausible we are under no

obligations to think that the evidence merits any particular relative probabilities.

The second response, which held that derived commitments may be sufficient

to establish the authority of the Bayesian procedure, only makes sense if we are

derivatively committed to thinking that the relative probabilities are demanded by

the evidence. Just as there are multiple ways in which one might be committed to

a probability assignment, there are multiple ways in which one might be committed

to a relative probability, and these ways may make a difference to whether that

commitment survives a change in evidence. Earlier, I suggested that commitments

to relative probabilities should survive the addition of evidence that doesn’t touch

either proposition. This followed from the premise that relative probabilities that

are supported by evidence can’t change in response to new evidence. If we have

commitments about how one bit of evidence supports relative probabilities, that gives

us commitments about how we should react to new evidence. If we have commitments

about relative probabilities that follow not from the evidence but from our methods,

there is no guarantee that they tell us anything about how we should react to new

evidence.

We can have commitments to relative probabilities that are not commitments

about what is demanded by the evidence. Our commitments may be to have those

relative probabilities not because the evidence demands them, but because those

relative probabilities are the proper ones to have in the absence of evidence. This

is precisely what I think may go on when one bases one’s probabilities on the

Principle of Indifference. The Principle of Indifference may be seen as a principle for

allocating probabilities in the absence of evidence – not a principle that states what

low levels of evidence themselves demand. The absence of evidence needn’t support

any particular probabilities at all. It should be clear why such a commitment would

not survive the incorporation of new evidence. The fact that in the absence of evidence

two propositions merit a particular ratio of probabilities doesn’t mean that they will

continue to do so when evidence is added.

Derived commitments need not be commitments to what the evidence supports.

Consequently, derived commitments to relative probabilities need not be the kinds of

things that one is rationally obligated to preserve.

Neither of these responses succeeds in undermining the explanation I have of-

fered for the authority of the Bayesian procedure, or the limitations on its authority

that seem to follow. Unless we find some alternative rationale to obey the Bayesian

procedure, we should conclude that it is indeed of limited authority.

Chapter 5

Conclusion

In the first chapter, I suggested that we might try to understand judgements about

probability as moves in a kind of practice. The following three chapters saw the

development of related ideas that together presented a more robust picture of the

nature of judgements about probability. Now that I have surveyed these different

proposals, I will briefly draw them together and explain how they relate to each

other.

The conclusion of the second chapter was that judgements about probability are

cognitively sophisticated: they involve vehicles which incorporate concepts of

probability in the same way that ordinary beliefs incorporate quantitative concepts

such as price, height, and weight. Though I didn’t directly address the issue of the

representational qualities of these concepts, I suggested that we needn’t see probabilistic

concepts as standing in for anything in the real world. I think that we should

interpret regarding-as-evidence as a noncognitive attitude. To regard something as

evidence isn’t to believe anything about it, but to be prepared to use it a certain way

in deciding what to believe. Probabilities are a way of categorizing propositions on

the basis of these attitudes, in a way that makes them especially useful for decision

making.

The conclusion of the third chapter was that the constitutive account of the formal

norms governing probability held more promise than the alternative explanations.

The relevance of this result to the proposal of the first chapter should be clear. Since

this proposal relies on the idea that our intentional subjugation to the norms explains

why they hold sway over us, it isn’t anything about the world or about the basic

elements of the furniture of the mind that explains the norms. Our intentions are

fruitfully understood as intentions to engage in a practice. Since the interpretation

of probability as a practice helps to make this explanation of the norms available, it

gains plausibility from this explanation’s successes.

The conclusion of the fourth chapter was that the Bayesian procedure has only a

limited authority. This conclusion was a result of a proposal for the explanation of the

authority of the Bayesian procedure. I think that this explanation warrants accep-

tance as the best available explanation. It makes sense of the typical importance of

conditionalization, without relying on pragmatic considerations or the hefty assump-

tions of accuracy-based arguments. Though the account does not rely so directly on

the interpretation of probability as a practice, it does help round out the view. In

part, this explanation is made more plausible if we see judgements about probability

as involving commitments about the representation of evidence. If judgements about

probability were mere states of confidence, then the connection between confidence

and evidential commitments would need to be spelled out. As things stand, the intu-

itiveness of conditionalization provides support for my explanation of the constitutive

aim of assigning probabilities.

The resulting picture is one in which judgements about probability are part of a

heuristic that we use for keeping track of our evidence. The judgements are actions

of categorization. We apply a concept of probability and a number to propositions to reflect the

amount of evidence that we have for them. The amount of evidence that we have

for a proposition isn’t an objective matter, but is a matter akin to personal taste.

The numbers that we apply in these judgements must conform to the formal norms

of probability, for it is part of the practice that our judgements are subject to such

constraints. Probability is a form of epistemic bookkeeping that we learn from our

community. The attitude of regarding-as-evidence is innate. The way in which we

abstract from these attitudes and apply numbers is not.

The interpretation of probability as a move within a practice has many advantages

over credal noncognitivism. Besides not relying on the doctrine of degrees of belief,

it provides space for the development of the normative explanations that I advanced

in the previous two chapters of this paper. These views make the most fundamen-

tal of the normative rules that govern the assignment of probabilities a product of

ourselves. We impose rules upon ourselves, either by undertaking commitments to

relative probabilities, as with conditionalization and the Bayesian procedure, or by

deciding to assign probabilities in the first place. It is because we implicitly choose

to engage in a practice with certain rules and a certain aim that we become subject to

the norms of probability.

There is much more that needs to be said for this account to be complete. I have

only sketched the idea of treating judgements as moves in a practice and I left out

the details of what it is that makes the judgements count as moves in a practice. If

we do not characterize judgements about probability in terms of their content or their

functional roles, we owe some other account.

Further, by giving up on representative meaning, noncognitivists must provide

the explanations that are lost along with representative content. The chief of these

problems is the Frege/Geach challenge, the challenge of explaining the logical relations

between complex judgements. I developed a version of this challenge and marshaled

it against the unification thesis in the second chapter. I think that the response

that I advocated in that chapter of regarding judgements of probabilities as involving

certain vehicles of probabilities is promising. But the problem is deep and a great

deal of work remains to be done.

Finally, I think that the ideas presented in this dissertation show promise in appli-

cation to other domains as well. In the first chapter, I suggested that interpretations

of moral judgements admit analogous interpretations of probabilistic judgements. I

hope the ideas that I have explored in this dissertation may give something back to

the normative domain. Normative judgements may likewise be understood in terms

of a network of communal practices. Perhaps this will not produce a different product

from the views of recent metaethicists, but it will allow us to develop those views in

a slightly different light.

Bibliography

[1] Frank Arntzenius, Adam Elga, and John Hawthorne. Bayesianism, infinite decisions, and binding. Mind, 113(450):251–283, 2004.

[2] Simon Blackburn. Opinions and chances. In D. H. Mellor, editor, Prospects for Pragmatism, pages 175–96. Cambridge University Press, 1980.

[3] Glenn W Brier. Verification of forecasts expressed in terms of probability. Monthly weather review, 78(1):1–3, 1950.

[4] Aaron Bronfman. A gap in Joyce’s argument for probabilism. unpublished manuscript.

[5] Rudolf Carnap. Logical foundations of probability. 1950.

[6] David Christensen. Dutch-book arguments depragmatized: Epistemic consistency for partial believers. Journal of Philosophy, 93(9):450–479, 1996.

[7] Bruno de Finetti. Theory of Probability. New York: John Wiley, 1970.

[8] Andy Egan, John Hawthorne, and Brian Weatherson. Epistemic modals in context. In G. Preyer and G. Peter, editors, Contextualism in Philosophy, pages 131–170. Oxford University Press, 2005.

[9] Lina Erickson and Alan Hajek. What are degrees of belief? Studia Logica, 86(2):185–215, 2007.

[10] Matthew Evans and Nishi Shah. Mental agency and metaethics. Oxford studies in metaethics, 7:80–109, 2012.

[11] Richard Foley. The epistemology of belief and the epistemology of degrees of belief. American Philosophical Quarterly, 29(2):111–124, 1992.

[12] Keith Frankish. Partial belief and flat-out belief. In Huber and Schmidt-Petri, editors, Degrees of Belief. Synthese Library, 2009.

[13] Hilary Greaves and David Wallace. Justifying conditionalization: Conditionalization maximizes expected epistemic utility. Mind, 115(459):607–632, 2006.

[14] Ian Hacking. The emergence of probability. Cambridge University Press, 1975.

[15] Alan Hajek. What conditional probabilities could not be. Synthese, 137:273–323, 2003.

[16] Alan Hajek. Arguments for–or against–probabilism? The British Journal for the Philosophy of Science, 59(4):793–819, 2008.

[17] Alan Hajek. A puzzle about degree of belief, 2010.

[18] James Joyce. A nonpragmatic vindication of probabilism. Philosophy of Science, 65(4):575–603, 1998.

[19] James Joyce. Accuracy and coherence: Prospects for an alethic epistemology of partial belief. In Huber and Schmidt-Petri, editors, Degrees of Belief. Synthese Library, 2009.

[20] John Maynard Keynes. A treatise on probability. 1921.

[21] Andrej Nikolaevic Kolmogorov. Foundations of the theory of probability. 1950.

[22] Angelika Kratzer. Modality. In A. von Stechow and D. Wunderlich, editors, Semantics: An International Handbook of Contemporary Research, pages 639–50, 1991.

[23] David Lewis. A subjectivist’s guide to objective chance. In Ifs, pages 267–297. Springer, 1981.

[24] John MacFarlane. Epistemic modals are assessment sensitive. In Andy Egan and B. Weatherson, editors, Epistemic Modality. Oxford University Press, 2009.

[25] John MacFarlane. Assessment sensitivity: Relative truth and its applications. Oxford University Press, 2014.

[26] Patrick Maher. Joyce’s argument for probabilism. Philosophy of Science, 69(1):73–81, 2002.

[27] Huw Price. Does ‘probably’ modify sense? Australasian Journal of Philosophy, 61(4):396–408, 1983.

[28] Frank Plumpton Ramsey. Truth and probability. The foundations of mathematics and other logical essays, pages 156–198, 1931.

[29] Glenn Shafer. A mathematical theory of evidence, volume 1. Princeton University Press, 1976.

[30] Nishi Shah and J. David Velleman. Doxastic deliberation. The Philosophical Review, pages 497–534, 2005.

[31] Tamina Stephenson. Judge dependence, epistemic modals, and predicates of personal taste. Linguistics and Philosophy, 30(4):487–525, 2007.

[32] Eric Swanson. Interactions with Context. PhD thesis, MIT, 2006.

[33] Erno Teglas, Vittorio Girotto, Michel Gonzalez, and Luca L. Bonatti. Intuitions of probabilities shape expectations about the future at 12 months and beyond. Proceedings of the National Academy of Sciences, 2007.

[34] Paul Teller. Conditionalization and observation. Synthese, 26(2):218–258, 1973.

[35] Stephen Toulmin. The Uses of Argument. Cambridge University Press, 2003.

[36] Kai von Fintel and Anthony S. Gillies. ‘Might’ made right. In Andy Egan and Brian Weatherson, editors, Epistemic Modality. Oxford University Press, 2009.

[37] Ralph Wedgwood. The aim of belief. Nous, 36(s16):267–297, 2002.

[38] Ralph Wedgwood. Doxastic correctness. In Aristotelian Society Supplementary Volume, volume 87, pages 217–234. Wiley Online Library, 2013.

[39] Donald Cary Williams. The ground of induction. 1963.

[40] Fei Xu and Vashti Garcia. Intuitive statistics by 8-month-old infants. Proceedings of the National Academy of Sciences, 2008.

[41] Seth Yalcin. Epistemic modals. Mind, 116(464):983–1026, 2007.

[42] Seth Yalcin. Non-factualism about epistemic modality. In Andy Egan and Brian Weatherson, editors, Epistemic Modality. Oxford University Press, 2011.
