Probability: The Concept and its Rules of Use
Derek Charles Shiller
A Dissertation Presented to the Faculty of Princeton University in Candidacy for the Degree of Doctor of Philosophy
Recommended for Acceptance by the Department of Philosophy
Advisors: Adam Elga and Sarah-Jane Leslie
January 2015
4.4.3 The Authority of the Bayesian Procedure
4.4.4 Must we have Relative Commitments?
5 Conclusion
Bibliography
Chapter 1
Introduction: Probability as a Practice
1.1 Judgments about Probability
The modern notion of probability dates back to the 17th century, when mathemati-
cians and gamblers suddenly recognized the value of applying the new theory of
combinatorics to games of chance. Some notion of plausibility or evidential support
surely preceded this achievement, but it was only with the new mathematical appara-
tus in view that people really began to conceive of probabilities as we do today. The
significance of the modern notion of probability was grasped at once by those at the
forefront of its development: while its most obvious uses lay in gambling, the formal
structure of probability held promise for helping us navigate uncertainty in all of its
forms.
Probabilities are now ubiquitous: they have taken up an important role in the
sciences, medicine, our legal system, and our daily life. We’ve developed complex
tools for using probabilities in ever more sophisticated ways. Despite our progress in
working with probabilities, the fundamental nature of probability remains controver-
sial. The twentieth century saw the development of a number of different schools of
thought about the fundamental nature of probability. Through ‘theories of probabil-
ity’, these schools proposed many different applications of the mathematical apparatus
of probability theory for modeling natural phenomena. As distinct applications of a
formal apparatus, these theories are consistent with each other. But they can also
be construed as making inconsistent claims about what it is that judgments about
probability are really about.
We make judgments about probability. In court, we try to assess how likely it is
that the defendant is innocent. Before setting out for a walk, we try to assess how
likely it is to rain. We judge how likely it is that our sports team will win or lose.
We calculate the odds that our airplane will crash. We decide that scientists are very
probably right about global warming. We conclude that the Bohmian interpretation
of quantum mechanics is no more likely to be true than the Everettian interpretation
and that the Collatz conjecture is probably unprovable. The fundamental question at the
foundation of probability is what it is about these judgments about probability that
makes them count as judgments about probability.
Not all judgments about probability need to count as judgments about probability
for the same reasons. Collectively, judgments about probability may only share a
family resemblance. It is extremely plausible that there are at least two basic distinct
kinds of probability. One sort of judgment about probability relates to our evidence.
The other reflects something in the world.
A coin may be biased without us knowing which way it is biased. If we have
to judge how likely it is that the coin will land heads, we might judge that the
probability of heads is .5, on account of our ignorance of the way that it is biased.
Alternatively, we might judge that it is not .5 on account of the fact that it is biased.
The first kind of judgment reflects our level of evidence. The second kind reflects
the physical nature of the coin itself. These two sorts of judgments deserve distinct
interpretations. The judgment that the probability of heads is not .5 is a judgment
about objective chance. Particles, coins, and dice have objective chances of behaving
in certain ways. These chances are part of the things themselves, and don’t depend
on our evidence in any way. The judgment that the coin is equally likely to land
heads or tails is a judgment about subjective probability. The probabilities that we
ascribe to past events or mathematical conjectures are good candidates for judgments
about subjective probability, since despite being assigned intermediate probabilities,
we all know that they must be one way or the other.
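The coin example can be given a worked form under simplifying assumptions that are not in the text: suppose the coin's objective chance of heads is either $b$ or $1-b$ for some $b \neq \tfrac{1}{2}$, and that our evidence gives us no reason to favor one direction of bias over the other, so that each is weighted equally. The symbol $b$ and the equal weighting are introduced here purely for illustration.

```latex
\[
P(\text{heads}) = \tfrac{1}{2}\, b + \tfrac{1}{2}\,(1 - b) = \tfrac{1}{2}
\]
```

On these assumptions the evidence-relative judgment comes out at .5 whatever the value of $b$, even though the objective chance, being either $b$ or $1-b$, is certainly not .5.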
The topic of this dissertation is judgments about subjective probability (hence-
forth, just ‘judgments about probability’). I will start in this chapter by offering a
brief account of how I think we ought to understand these judgments. While I don’t
believe that it is wise to develop accounts of both subjective and objective judgments
entirely in isolation from each other, I will focus mostly on subjective judgments.
I will propose that we should understand judgments about probabilities as moves
within a practice, and draw analogies to games of horseshoes and card counting. In
the following chapters, I will discuss a number of related questions and provide an-
swers that draw inspiration from, and provide support for, the view of judgments
about subjective probability that I present here.
1.2 Interpretations of Probability
Though the details vary considerably, the space of plausible analyses of judgments
about probability closely resembles the space of plausible analyses of ethical judg-
ments. The problem of supplying an analysis of the judgment that a proposition is
probable is a lot like the problem of supplying an analysis of the judgment that an
action is wrong. Both kinds of judgment have a subjective or perspectival flavor, but
neither kind of judgment is explicitly relativized. The ethical question has received
much more sustained attention over the past century, and the wealth of answers
proposed for it provides us with a range of possible answers to consider for
the question about probability. Most plausible metaethical views parallel plausible views about judgments
of probability: non-naturalism, analytic descriptivism, synthetic naturalism, sensibil-
ism, subjectivism, assessor relativism, expressivism, error theory, and fictionalism all
make for prima facie reasonable theories of judgments of probability.
Philosophers of probability have developed some of these ideas, often somewhat
independently of their metaethical peers. The primary division in theories falls be-
tween those analyses that characterize the judgments in terms of their representa-
tional content and those that don’t. Cognitivist views suggest that what it is to be
a judgment about probability is to be a judgment that has a certain kind of representa-
tional content. Noncognitivist views suggest that judgments about probability are
characterized by something other than their representational content. They are not
judgments that have a particular representational content containing a probabilistic
component.
Judgments about objective probability most likely deserve a cognitivist treatment.
With judgments of subjective probability, it is less clear. There are cognitivist theories
of judgments of subjective probability that come in realist naturalist, realist non-
naturalist, and subjectivist flavors. Realist positions characterize judgments about
probabilities as judgments about the measure of something outside of us. Naturalist
positions take these to be measures of some natural phenomenon, one that we can
understand in terms of physical objects and properties. Non-naturalist positions
take the subject of judgments about probability to be a non-natural phenomenon.
The ‘logical interpretation’ of probability championed by John Maynard Keynes [20],
Rudolf Carnap [5], and Donald Williams [39], might be understood in either of these
ways. Their idea was that judgments about probability concern the relations between
particular propositions and bodies of evidence. Subjectivist cognitivist positions, on
the other hand, characterize judgments about probabilities as judgments about how
propositions relate to each other and to a context-dependent standard. This standard
is typically the judge’s, though it might also be supplied by the judge’s community or
culture. The most plausible version of this proposal is based on John MacFarlane’s [25]
notion of assessment sensitivity, and has been popular as an account of the meaning
of epistemic modals such as ‘might’.
Several distinct non-cognitivist positions have been proposed, but few have been
examined in depth. One non-cognitivist view, defended by Stephen Toulmin [35],
suggests that we can understand probability through its use in modulating commit-
ments. (It was proposed with the linguistic expression of the concept in mind.) This
view faces deep challenges in providing an appealing view of probabilistic judgments,
as opposed to probabilistic assertions, and has now mostly been abandoned.
The currently dominant noncognitivist view, ‘credal noncognitivism’, for which
important early work was done by Frank Ramsey [28] and Bruno de Finetti [7], and
which has more recently been defended by Huw Price [27], Simon Blackburn [2], and Seth
Yalcin [42], holds that judgments about probability are nothing other than beliefs.1
Every belief is simultaneously a judgment about probability.
Credal noncognitivism rests on an implausible psychological theory, but there are
other related views that do not suffer from its defects. In the coming pages, I will
present a second, and what I regard as more plausible, noncognitivist view about
probability. This view interprets judgments of probability as moves within a certain
kind of practice. While I won’t explicitly try to defend this interpretation in great
depth in this dissertation, it will shape much of what I have to say, and the dissertation
should be read with this interpretation in mind. Before taking a closer look at each
1 Unfortunately, credal noncognitivism has not been nearly as deeply explored as its metaethical analogues. The philosophers I cite may not explicitly embrace every bit of the doctrine that I am attributing to them.
of these views, I will describe the doctrine of degrees of belief which will play a role
in their development.
1.2.1 Degrees of Belief
The notion of degrees of belief is important to many interpretations of probability.
When I talk about degrees of belief, I refer to the posits of a particular psychological
doctrine characterized by three major components. These three components all relate
to the claim that beliefs come in gradations. The fact that beliefs admit of gradations
is unexceptional; every attitude can be regarded as coming in degrees along any
number of different dimensions. What is special about the doctrine of degrees of
belief is what it says about these gradations. The theory makes certain claims about
one kind of gradation that belief enters into. These claims can be broken into three
components of the view.2
According to the first component, the degree of a belief is correlated in some way
with its role in decision making. Beliefs ordinarily combine with desires to produce
intentions. The product of these combinations depends upon the degrees of both
belief and desire. Those beliefs of a higher degree have a special tendency to exert
an influence over what we decide to do: we tend to prefer actions which we have
a high degree of belief will result in outcomes for which we have a high degree of
desire. There are other ways to understand the dimension of gradation and some of
these other ways exert some influence over the way that we typically think of the
notion. Often, we connect degrees of belief up with the feelings of confidence that
they inspire. This allows us to use introspection, rather than inferences from our
behavior, to detect our degrees of belief.
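The first component can be illustrated with a small sketch. The text says only that degrees of belief are correlated "in some way" with decision making; the sketch below assumes the standard decision-theoretic reading, on which each action is scored by summing degree of belief times degree of desire over its possible outcomes. The function names, the action names, and all of the numbers are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# Each outcome of an action is a pair: (degree of belief that the outcome
# results, degree of desire for that outcome). Both scales are assumptions.
Outcome = Tuple[float, float]


def expected_desirability(outcomes: List[Outcome]) -> float:
    """Weight the desire for each outcome by the degree of belief in it."""
    return sum(belief * desire for belief, desire in outcomes)


def preferred_action(actions: Dict[str, List[Outcome]]) -> str:
    """Return the action whose outcomes score highest on the weighted sum."""
    return max(actions, key=lambda name: expected_desirability(actions[name]))


# Hypothetical case: a degree of belief of 0.5 that it will rain on a walk.
actions = {
    "take_umbrella": [(0.5, 6.0), (0.5, 8.0)],    # rain, no rain
    "leave_umbrella": [(0.5, 0.0), (0.5, 10.0)],  # rain, no rain
}

choice = preferred_action(actions)
```

With these made-up numbers the umbrella is preferred; lowering the degree of belief in rain far enough would reverse the preference, which is the sense in which degree of belief exerts an influence over what we decide to do.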
According to the second component, belief, disbelief, and uncertainty involve the
same kind of attitude, differing only in where they fall along the gradation’s scale.
2 Behaviorist and instrumental versions of the view might make do without realism about gradations. Though they retain popularity, I will set these views aside.
A high degree of belief in a proposition constitutes belief, a low degree of belief
constitutes disbelief, and a middle degree constitutes uncertainty.
According to the third component of the doctrine, these degrees of belief can
be quantified with numbers in a way such that they typically obey the axioms of
probability theory. That is, each belief can be assigned a numerical representation of
its degree between 0 and 1, and the representation of the degree of a disjunctive belief
is the sum of the representations of the beliefs in the disjuncts if the disjuncts are mutually inconsistent.
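The two conditions just stated can be written compactly. As a sketch, let $c(p)$ stand for the numerical representation of the degree of belief in a proposition $p$; the symbol $c$ is introduced here for illustration and is not notation used elsewhere in the text.

```latex
\begin{align*}
  0 \le c(p) &\le 1 \\
  c(p \lor q) &= c(p) + c(q)
    \quad \text{whenever } p \text{ and } q \text{ are mutually inconsistent}
\end{align*}
```

The standard axiomatization of probability also assigns the value 1 to any logical truth, a condition that the statement above leaves implicit.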
The doctrine of degrees of belief is an empirical conjecture according to which
beliefs come in a kind of a natural gradation that obeys the probability calculus and
is employed in the manner specified by decision theory. As an empirical conjecture, it
is supported to some extent by our ordinary interactions with believers. Beliefs can
clearly be compared both in terms of their phenomenological properties and in the
effects that they have on our actions. However, the other elements of the doctrine
are not as well supported. There is little independent evidence that there exists a
metric that will allow us to evaluate degrees of beliefs between 0 and 1 in such a way
that they typically satisfy the axioms of probability3. Similarly, there is little evidence
that belief, disbelief, and uncertainty really exist along the same cognitive spectrum –
differing only in degree and not in kind. For all that we know, the differences between
uncertainty and belief might be very unlike the differences between low-level belief
and high-level belief. If the doctrine of degrees of belief is false, then many of the
going interpretations of subjective judgments of probability will need to be rethought.
It is an advantage of the view of judgments of probability that I will ultimately favor
that it doesn’t rely on this doctrine.
3 Anyone with coherent preferences may be ascribed degrees of belief that satisfy the axioms. I don’t doubt that this can be done. What I doubt is that these ascriptions match up with any natural gradations in people’s beliefs. That is, I doubt that an arbitrary consistent ascription will match any psychological joints.
1.2.2 The Logical Interpretation
A drop in the pressure is evidence that it will rain. The fact that the GDP has
gone up at least 2% for each of the last five years is evidence that it will do so again
this year. Propositions provide support for each other, and the logical interpretation
sees judgments about probability as estimates of this support. According to the
logical interpretation of judgments about probability, the content of such a judgment
concerns the value of a measure of the amount of support the evidence provides for a
proposition.
The logical interpretation claims that judgments about probability are judgments
about something objective: given any body of evidence and any proposition, there is a
fact about just how much support that body of evidence provides for that proposition.
Moreover, in many cases there is a way of measuring that support with numbers
between 0 and 1. This is what we’re really thinking about when we think about
probabilities. To say that the drop in pressure indicates a 50% probability of rain
is to say that the amount of support that the proposition “the pressure dropped”
provides for “it will rain” is measured at .5.
According to this interpretation, every judgment about probability is implicitly
based on some body of evidence whether or not we’re aware of it. Unless we explicitly
relativize to a body of evidence, such as when we make a conditional judgment about
probability, we’re implicitly relativizing on our total body of evidence.
The chief problem for the logical interpretation is to find a metric that can make
evidential support objective – how do we convert evidential support to a number
between 0 and 1? We could solve this problem by providing a satisfactory metric
of evidential support. Such metrics surely exist, but no single metric warrants the
position that the logical interpretation needs to give it. Historical attempts to do
this have typically made heavy and controversial use of the Principle of Indifference,
according to which we ought to assign equal probabilities to propositions when we
lack evidence that distinguishes them. We might fare better in providing an indirect
account of the metric, where the metric is determined by rational behavior in light
of the evidence. On this proposal, a probability relation holds between a proposition
and a body of evidence when the body of evidence makes it rational to behave in ways
consistent with assigning that proposition that probability and following a standard
decision procedure. That way we could specify what the value of evidential support
is in a way that removes some doubt about the existence of such a measure.
A second problem for the logical interpretation is that it suggests an unreasonable
amount of dogmatism. There are many questions of probability about which dis-
agreement seems to be rationally permissible, and moreover, in many of these cases
we may think of ourselves as being no more correct than anyone else. Is there a fact
of the matter about how much support our evidence provides to the skeptical hypothesis
that we are brains in vats? Or is there a fact about how likely we should believe it is
that something preceded the big bang? Reasonable people may disagree in their
judgments about these probabilities. In light of the existence of some propositions that
don’t appear to warrant an objective judgment, we should conclude that judgments
about probability are not judgments about any objective quantities. Until we find a
plausible metric, or evidence that one exists, we should look elsewhere
for an interpretation of probability.
1.2.3 Cognitivist Subjectivism
It is implausible that our judgments about probability concern a particular metric of
evidential support. This suggests an analogy with judgments about personal taste.
When someone judges that something is tasty, they seem to be judging that it tastes
good to them. When someone judges that something is funny they seem to be judging
that it is funny to them. When someone judges that a proposition is probable, they
seem to be judging that it seems probable to them in light of their evidence. There is
some etymological support for this hypothesis. Before the term ‘probability’ acquired
its present sense, it seems to have been used to mean believable4. If we have to
assign content to the judgment, then we could do so by treating the content as a self-
ascription. The judgment “that is tasty” might be interpreted as a judgment “that
is tasty to me (or to us)” and, given the doctrine of degrees of belief, the judgment
“that is probable” might be interpreted as a judgment “I have a high degree of belief
that this is the case.”5
Subjectivism has a long history in ethics, and it has been, with a few exceptions,
widely regarded as an implausible metaethical position. The problems of subjec-
tivism about probability are a lot like the problems of subjectivism about ethics.
It is just implausible to think that judgments about probability concern our own
psychological states. We may judge that we occupy a psychological state without
making the corresponding judgment about probability, and we may make a judgment
about probability without making the (allegedly) corresponding judgment about our
psychological state. To judge that a proposition has a certain probability just feels
utterly different from judging that we have any particular attitude towards it.
There are a few tests we can do to tell whether two statements have the same con-
tent, and these tests tell against cognitivist subjectivism. One test involves replacing
an instance of one claim with an instance of the other within a complex context. If
every judgment that a proposition had a probability were identical in content to a
judgment that the judge had a certain psychological state, then we should be able to
substitute one thought for the other in many contexts without a change in meaning.
For instance, there should be a sense of “I believe that it will probably rain
tomorrow” in which we can exchange “it will probably rain tomorrow” with “I
have a high degree of belief that it will rain tomorrow” without changing
the meaning. But there doesn’t seem to be. Seth Yalcin [41] has made much of
4 See [14].
5 See [31].
the observation that we cannot exchange psychological statements for probabilistic
statements in the context of supposition. While it makes perfect sense to suppose
that one might assign a high degree of probability to a falsehood, it doesn’t make any
sense to suppose that something might be both false and highly probable.
Subjectivism faces a number of problems beyond its intrinsic implausibility and
the issues it faces embedding in suppositions and conditionals. It is well known to
have difficulty handling disagreement, assertability, and retraction conditions. While
it may be possible to add epicycles6 onto the interpretation to solve these problems,
subjectivism must bear explanatory fruits before it is worth the effort to fend off its
challenges. The biggest problem with subjectivism is that it simply does not provide
much of an advantage over noncognitivist views to merit biting the bullet on the
problems it creates or warrant the complexities required to properly deal with them.
1.2.4 Credal Noncognitivism
Subjectivism characterizes judgments about probability as possessing psychological
contents. One move that has been very popular in metaethics is to shift from a subjec-
tivist cognitivism to noncognitivism. The noncognitivist agrees that the subjectivist
is right in thinking that there is a close tie between having a particular degree of
belief and making a particular judgment about probability. Judgments about proba-
bility, however, aren’t mere representations of one’s degrees of belief. Instead of being
representations of degrees of belief, credal noncognitivists hold that judgments about
probability just are degrees of belief. Credal expressivism extends the view by adding
that probabilistic language is used to express one’s degrees of belief in the way that
ordinary language is used to express one’s beliefs.
Credal noncognitivism is noncognitive because it does not associate any special
content with judgments about probability. Judgments about probabilities certainly
6 See [8, 36, 24].
have representational contents, insofar as they are propositional attitudes, but prob-
abilities are not parts of those contents. By identifying judgments about probability
with degrees of belief, the credal noncognitivist proposes to characterize judgments
about probability in terms of their cognitive roles, rather than the contents that they
have.7
Just like the most popular versions of subjectivism, the view rests heavily on the
doctrine of degrees of belief. The way I will understand the proposal here, credal
noncognitivism is a substantive psychological proposal: it holds not just that judg-
ments about probability have a particular functional role – it holds also that they
involve the same kinds of gradations as degrees of belief. To judge that a proposition
has a probability of .9 is to have a degree of belief of .9 in it. I have expressed my
doubts about the doctrine of degrees of belief, and I think the problems with the view
provide us with good reasons to reject credal noncognitivism. There are, however,
versions of noncognitivism that are more promising. I think that noncognitivism
is the right way to interpret probability. But credal noncognitivism is wedded to
implausible psychological views. For this reason, it should be rejected.
Naive forms of metaethical noncognitivism face similar problems. Metaethical
noncognitivists think that moral judgments express attitudes that are desire-like,
disapproval-like, or intention-like rather than beliefs. Modern noncognitivists seldom
wish to commit to thinking that such judgments are nothing other than ordinary
desires or attitudes of disapproval. This would be ludicrous. Even if moral judg-
ments typically have the functional profile of desires, they are clearly very different
kinds of things from ordinary desires. We should be careful to distinguish the more
sophisticated moral judgments from the less sophisticated desires.
7 Having a certain content may also be a kind of cognitive role, but not every role gives rise to a content. Presumably, noncognitivists attribute roles to our judgments about probability that preclude them from having contents.
Credal noncognitivists have done little to explore the differences between judg-
ments about probability and beliefs. While I am sympathetic with the claim that
judgments about probability have the same functional profile attributed to degrees
of belief, I think that more work needs to be done to distinguish judgments about
probability from beliefs. I also think that once we have gotten clearer on the differ-
ence between beliefs and judgments of probability, it will become more plausible to
understand judgments about probability as constituting a move in a certain practice
rather than as being expressive of a particular kind of attitude.
1.3 Probability as a Practice
While credal noncognitivism itself has its problems, noncognitivism is promising. The
problems with credal noncognitivism arise from its connection to the doctrine of de-
grees of belief. We would do better if we could distinguish degrees of probability in
judgments about probability from degrees of beliefs. I propose that we understand
judgments about probability as moves within a particular practice. I’ll start by ex-
plaining what I mean when I say that a judgment is a move, and then I will give a
sketch of what the practice of assigning probabilities involves.
1.3.1 Intellectual Practices
We are deeply cultural creatures – most of our daily routines involve practices that
we have learned. We seldom stop to consider why we do things the way that we do.
We simply do them because that is the way it is done. This is true not just of our
unthinking habits, but of many of our intellectual practices as well. In learning
about mathematics, science, history, or philosophy, we are acculturated to go about
things in a certain way, to produce certain kinds of intellectual products, and to use
certain tools or heuristics.
When learning about a new field, we often acquire its basic concepts through
learning how to manipulate them for specific purposes. Most readers should be fa-
miliar with this from their early study of mathematics. Pre-college mathematics is
often taught as a set of tools to use for solving certain kinds of problems. Students
learn techniques for manipulating symbols and concepts in order to solve problems.
Mathematical concepts and techniques work like cognitive slide-rules. While stu-
dents understand how to use the symbols and concepts to solve the problems set
before them, they needn’t have any understanding of why the procedures that they
are taught to use work or what the representative significance of the symbols or con-
cepts or their manipulation might consist in. One doesn’t need to understand the
epsilon-delta definition of a limit in order to apply calculus to find an area. Nor does
one need to have any idea what to make of imaginary numbers in order to use them
to solve equations.
We learn about formal and informal probability theory in the same way that we
learn about calculus and imaginary numbers. Whether instructed by teachers or
through the observations of its use by others, we learn how it is that we are supposed
to use assignments of probability in order to solve problems. Through observation
and education, we acquire the ability to engage in the practice ourselves. There
may be some deep representational meaning that makes sense of the way that we
use probability, as there is with the epsilon-delta definition of derivatives in calculus,
but there also need not be. It may be that the practice we have for calculating
probabilities is useful even if there is no ultimate interpretation of probability that
makes sense of that use as representational.
I propose that judgments about probability are characterized with respect to the
practice of assigning probabilities. The proper subject of analysis isn’t the judgments
themselves, but the practice that those judgments fit into. To characterize the prac-
tice, we don’t need to explain what kinds of mental states are involved in employing
it. There might be many ways of characterizing a practice, but the practice of proba-
bility can be adequately characterized in terms of its rules and its aim. The practice
of playing horseshoes is characterized by rules that specify that the players are to
take turns throwing their horseshoes from a set distance. The winner is the player
whose horseshoe comes closest to a specified post. The aim of the practice is to win.
Similarly, the practice of assigning probabilities has rules and an aim that together
characterize the practice.
Of course, not every rule involved in a practice characterizes that practice – only
those rules that are constitutive. We may regard it as a rule that one ought not throw
one’s horseshoe so as to injure the other player and hamper their ability to throw their
remaining horseshoes. This rule may govern the practice of playing horseshoes, but
it isn’t constitutive of it. It is difficult to say exactly which rules are constitutive and
as such many practices admit of some vagueness. In the final chapter, I will suggest
that a rule is constitutive if it is required for coherent and intentional engagement
in the practice. This doesn’t, however, provide a precise division between constitutive
and non-constitutive rules. Can one use one’s foot to toss a horseshoe? Is the distance
between the post and the horseshoe to be measured in a straight line along the plane
of the surface of the Earth, or are differences in height to be taken
into consideration? For many of these issues, there need be no fact of the matter.
We must also say something about what it is for someone to engage in that
practice. A judgment about probability is a form of engaging in the practice of
assigning probabilities. Not everyone who tosses a horseshoe is playing horseshoes.
They must be intentionally engaged in the game. For horseshoes, engaging in the
practice requires a set of communal intentions to play the game. Anyone who engages
in the practice of horseshoes and, within that practice, tosses a horseshoe has made
a certain kind of move. We might also say that a student who manipulates symbols
learned in calculus class is engaging in the practice of using calculus to solve a problem.
Their engagement in this practice comes from their intention to manipulate symbols
in the way that they were taught. The steps in their manipulations count as moves
within this practice. Similarly, I suggest we regard judgments about probability as
moves within a practice. While they might not be so heavily constrained, they are
nevertheless governed by specific rules for how they should be assigned.
I propose that the practice of assigning probabilities is captured by certain constitutive
rules that govern the practice and by an aim for that practice, and that we count
as engaging in the practice insofar as we intentionally utilize symbols and concepts
designed for this practice. This view differs from credal noncognitivism in that it sees
judgments about probability as distinct from run-of-the-mill beliefs. These judgments
are more akin to the kinds of judgments we make in the course of solving a math
problem than they are like attitudes of confidence.
I will make the case in the third chapter that the rules of the practice are the
formal norms of probability. In the next section of this chapter, I will say something
about the aim of the practice of assigning probabilities.
1.3.2 The Use of Probability
The characteristic use of judgments of probability is to solve problems in practical
and epistemic decision making. We calculate probabilities in order to decide what
to believe, how confident to be, and what to do. The best analogies for the practice
of assigning probabilities are methods of card-counting. Clever blackjack players
utilize various methods to keep track of the cards that have come up in order to
place better bets in future rounds. Blackjack players would be better off if they
could remember every card they have seen and knew the optimal bet to make in
their precise circumstances. Limits of memory and mathematical ability prevent us
from keeping and utilizing this kind of information. So instead, these players use
a heuristic to keep track of what is important about what cards they’ve seen. The
heuristic is useful because it is easy to keep track of and can easily be applied given very
simple rules. We come up with schemes for card counting as a way of keeping track
of what evidence we’ve received in a form that bears well on the decisions that we
have to make.
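As a rough illustration of this kind of compressed bookkeeping, consider the following sketch. It assumes the well-known Hi-Lo counting scheme, which the text does not name, and the betting rule is purely hypothetical; the point is only that a long history of observed cards can be collapsed into a single number that bears directly on a decision.

```python
# Illustrative sketch only. The Hi-Lo values below are an assumption:
# the text speaks of card-counting heuristics in general, not of Hi-Lo.
HI_LO_VALUES = {
    "2": 1, "3": 1, "4": 1, "5": 1, "6": 1,   # low cards raise the count
    "7": 0, "8": 0, "9": 0,                   # middle cards are ignored
    "10": -1, "J": -1, "Q": -1, "K": -1, "A": -1,  # high cards lower it
}


def running_count(cards_seen):
    """Compress the full history of cards seen into one integer."""
    return sum(HI_LO_VALUES[card] for card in cards_seen)


def suggested_bet(count, base_bet=10):
    """Hypothetical betting rule: scale the bet up only when the count is positive."""
    return base_bet * (1 + max(count, 0))


seen = ["2", "5", "K", "6", "3", "9", "A", "4"]
count = running_count(seen)
print(count, suggested_bet(count))
```

The player never needs to recall which eight cards were seen; the count alone carries what matters for the next bet.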
If we were ideal agents, we might be able to remember every fact we ever learned
and muster all of our information whenever we had to make a decision. We might
also see, simply upon surveying the evidence and knowing our desires, what decision
we should make. If we were also masters of an ideal language, we might communicate
the full body of information that we possess to each other with just a few words. We
are not ideal agents and we have no ideal language. In our intellectual and practical
pursuits, we encounter far more information than we can remember, analyze, or
easily communicate. It is useful to have a shorthand way of keeping track of how our
evidence bears upon what decisions we want to make. Since we cannot internalize and
analyze our evidence all at once, it is helpful to be able to convert that evidence into
a compressed form whose upshot for decision making is easy to understand.
Probability plays this role. When we make a judgment about probability, we
commit ourselves to a certain manner of representing that evidence for the purpose
of decision making. The particular form that judgments about probability take can be
explained by the particular uses to which we put it. We adopt the scheme of applying
the axioms of probability because it is useful to have a compressed representation
of evidential support that satisfies those axioms for decision making. The particular
structure of probability assignments lends them to decision making because it allows
us to apply utility maximization as a decision procedure.
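A minimal sketch of how probability assignments feed into utility maximization may help here. The states, actions, and utility numbers are invented for illustration and are not drawn from the text; the function names are likewise hypothetical.

```python
# Illustrative sketch only: the states, actions, and utilities are made up.
def expected_utility(action, probabilities, utilities):
    """Weight the utility of each outcome of an action by the probability of its state."""
    return sum(p * utilities[action][state] for state, p in probabilities.items())


def best_action(probabilities, utilities):
    """Pick the action whose expected utility is highest."""
    return max(utilities, key=lambda a: expected_utility(a, probabilities, utilities))


# A compressed representation of the evidence about tomorrow's weather.
probabilities = {"rain": 0.75, "dry": 0.25}

# How much the agent values each action in each state (hypothetical numbers).
utilities = {
    "take umbrella": {"rain": 5, "dry": 3},
    "leave umbrella": {"rain": -10, "dry": 6},
}

print(best_action(probabilities, utilities))
```

Because the probabilities sum to one and attach a single weight to each state, the evidence can be combined with the agent's desires in one simple calculation, which is the sense in which the structure lends itself to decision making.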
The calculation of probabilities is a technique that we’ve learned in order to solve
decision-making problems. Judgments of probability are no more a part of our in-
herent cognitive repertoire than strategies for card counting. Belief is part of our
inherent cognitive repertoire, as are whatever gradations it comes in. Judgments
about probability differ from them in the same kind of way that a judgment about
the current count differs from a degree of belief in the kinds of cards that one will
soon draw. Our judgments about probability, like our judgments about the count,
are more cognitively sophisticated.
In outline, this approach to probability is consistent with the logical interpreta-
tion. Just as the concepts of calculus found representational interpretations in the
epsilon-delta definitions, it is possible that some day we will find representational in-
terpretations for our judgments of probability. Whether this is so depends on whether
we can find an objective metric for measuring probabilistic support. I have already
expressed my doubts that this will ever happen. We might also adopt a fictionalist view
according to which the logical interpretation is right about the content of the judgments
about probability, but wrong about our commitments when we make such judgments.
We aren’t committed to their truth. There is much to like about fictionalism, but
in the absence of a clear explanation of why we need to attribute representational
content to judgments about probability, I am skeptical that the attribution of con-
tent is anything but superfluous. There is no reason why we should think that our
judgments about probability even purport to be anything objective.
I disagree with the logical interpretation when it comes to the assumption of
objectivity of measures of evidence, and this is largely why I prefer the noncognitive
account of probability as a practice. A judgment about probabilities is a kind of
decision we make about how to represent the evidence for the purposes of decision
making. Sometimes, evidence may be conclusive, and the only rational thing to do is
to assign one particular probability. But sometimes the evidence won’t itself warrant
or demand any particular representation. No representation needs to be rationally
forbidden. This doesn’t undermine the practice of assigning probabilities, because
when one assigns a probability, one isn’t committed to regarding that assignment
to the proposition as correct. The view of probability as a practice preserves much
of the appeal of the logical interpretation without committing itself to that view’s
extravagances.
1.4 Overview of What’s to Come
In the chapters that follow, I will explore issues related to the interpretation of proba-
bility that I have just proposed. The second chapter concerns the cognitive grounding
of judgments about probability. I discuss how it is that our judgments of probability
really do differ from our degrees of belief. I argue that we cannot regard judgments
about probability as the credal noncognitivist proposes because that theory encoun-
ters a number of basic explanatory problems. In short, judgments about probability
seem to resemble beliefs about heights, weights, temperatures, duration, and prices
much more than they do gradations of belief. They act like standard estimations
of quantities. The lesson that I draw from this is that we should regard judgments
about probability as conceptualized in the same way that we regard judgments about
other quantities. What we’re doing when we ascribe a probability is, from a cognitive
perspective, more or less what we’re doing when we ascribe a height.
This supports the interpretation of probability as a practice. We often pick up
new concepts that are constitutive of a practice without thinking or caring too much
about their representational fidelity. What matters is that they work. We use the
same cognitive mechanisms to manipulate these concepts that we use to manipulate
representational concepts, including concepts of quantities, because it is useful. Think
about the practice of rating a movie or a restaurant. It is hard to say what, exactly,
the rating represents. Nevertheless, by applying a quantity to the movie we invite
certain kinds of manipulation. We can, for instance, say that one movie has twice as
many stars as another. We can compare averages and talk about relative differences
in quality with precision. This may not be especially useful when it comes to rating
movies, but it is very useful for using measures of evidential support.
The next two chapters concern the norms that govern judgments about probability.
Many different kinds of norms have been taken to apply to our judgments about
probability. Two classes of norms are particularly fundamental. In the third chapter,
I discuss the formal norms of probability. The formal norms of probability concern
what kinds of judgments about probability are formally consistent with each other.
It is irrational to judge that the probability that it will rain tomorrow is .75, while
simultaneously judging that the probability that it will not rain tomorrow is .5. Why
should this be so? Many different explanations have been offered. I think that these
explanations are overly elaborate and their extravagances detract from the quality
of the explanations that they are able to provide. The answer is very simple. If we
understand judgments of probability as moves in a practice, we can understand the
formal norms as constitutive constraints on the practice. There are pragmatic reasons
to adopt a practice with the axioms of probability as formal constraints. Some of the
standard arguments for these formal constraints attest to these pragmatic reasons.
The constitutivity of the constraints for the practice, however, is what ultimately
explains the authority of the formal norms for the practice.
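The rain example above can be made concrete with a small sketch. It checks the complement rule and then computes the guaranteed loss for an agent who treats the two judgments as fair prices for unit bets. Reading the "standard arguments" as including this sort of Dutch book reasoning is an assumption on my part, and the function names are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical, and the
# betting interpretation is an assumption about which standard
# arguments are meant.
def violates_complement_rule(p_a, p_not_a, tol=1e-9):
    """The axioms require P(A) + P(not A) = 1."""
    return abs(p_a + p_not_a - 1.0) > tol


def net_payoff_from_both_bets(p_a, p_not_a):
    """Buy a bet paying 1 if A at price p_a and a bet paying 1 if not-A
    at price p_not_a. Exactly one bet pays out, whatever happens."""
    cost = p_a + p_not_a
    payoff = 1.0
    return payoff - cost


p_rain, p_no_rain = 0.75, 0.5
print(violates_complement_rule(p_rain, p_no_rain))
print(net_payoff_from_both_bets(p_rain, p_no_rain))
```

With the judgments .75 and .5, the agent pays 1.25 for a pair of bets that returns exactly 1 whether or not it rains, which is one way of displaying the pragmatic cost of stepping outside the formal constraints.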
The second fundamental class of norms is relational. They tell us how one set of
judgments about probability based on one body of evidence ought to relate to another
set of judgments based on another body of evidence. These norms are often thought
of as diachronic norms for updating one’s probability assignment upon receiving new
evidence; however, I prefer to think of them as synchronic norms about how one’s
judgment of the probabilities relative to different bodies of evidence should relate to
each other. In the fourth chapter, I turn my attention to what I call the ‘Bayesian
procedure’, which is a procedure for settling the probabilities of propositions that
turns on the process of conditionalization. I argue that the Bayesian procedure is
not as authoritative as it is often regarded as being. While we may be required in
many cases to defer to its dictates, there are also cases in which we are free to ignore
them. I argue for this conclusion by providing a rationale for the procedure, and then
showing how the rationale can fail to hold. My analysis of the Bayesian procedure is
dependent on my account of the function of judgments about probability. Judgments
about probability represent our evidence for the purposes of action. It makes sense
to use the Bayesian procedure when one has certain kinds of commitments about
how evidence is to be represented. However, not every judgment about probability
provides us with the kinds of commitments that give the procedure authority.
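The conditionalization step on which the Bayesian procedure turns can be sketched roughly as follows. The hypotheses and numbers are invented for illustration, and the sketch shows only the standard update rule, not the fuller procedure or the rationale for it discussed in the fourth chapter.

```python
# Illustrative sketch only: the coin hypotheses and numbers are made up.
def conditionalize(prior, likelihood):
    """Return P(H | E) for each hypothesis H, given the prior P(H)
    and the likelihood P(E | H), by Bayes' theorem."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())  # this is P(E)
    if total == 0:
        raise ValueError("Cannot conditionalize on evidence with probability zero.")
    return {h: j / total for h, j in joint.items()}


# Probabilities relative to the initial body of evidence.
prior = {"fair coin": 0.5, "biased coin": 0.5}

# How strongly each hypothesis predicts the new evidence (one toss landing heads).
likelihood_heads = {"fair coin": 0.5, "biased coin": 0.75}

posterior = conditionalize(prior, likelihood_heads)
print(posterior)
```

On the synchronic reading preferred above, the prior and posterior are simply the probabilities relative to two different bodies of evidence, and the function states how the two are supposed to relate.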
Chapter 2
Probability and Confidence:
Grounds for Divorce
2.1 Introduction
Consider the following two theses.
The reducibility of belief thesis holds that familiar categorical be-
liefs are ultimately grounded in gradational attitudes. According to
the reducibility of belief thesis, our beliefs are a product of our levels of
confidence – there is no genuine psychological gap between full categorical
beliefs and attitudes of moderately high confidence.1
Noncognitivism about probability holds that certain judgments
about probabilities do not represent the world as being some particular
way vis-a-vis probability. In particular, judgments about probability
1 There are many different ways to try to carry out this reduction, the simplest of which is the Lockean Thesis [11] that holds that having a belief in a proposition is a matter of having the underlying gradational attitude to a sufficiently high degree.
are typically understood as a special kind of attitude directed at the
proposition whose probability is being assessed. The object of the at-
titude is a proposition, but the degree of probability is not in any way
reflected in the proposition itself. The proposition isn’t a proposition
about probability. Instead, the degree of probability ascribed by the
judgment is an aspect of the attitude itself, just as when we have a
strong desire that the cost of milk is low, the strength of the desire is
not a part of the attitude’s object. Since the degree of probability is
part of the attitude rather than its object, the attitude itself is gradational.
In this chapter, I will use ‘confidence’ to refer to the graduated state underlying
belief2 and ‘judgment about probability’ to refer to the attitude postulated by the
noncognitive interpretation of probability.
We have good reasons to accept both the reducibility of belief thesis and
noncognitivism about probability. While these two theses are independent (one could think
that judgments of probability ought to be understood in terms of some gradational
attitude without thinking that that attitude is the same one that grounds belief),
parsimony pushes us to unite them.
According to the unification thesis, judgments about probability and levels of
confidence are really the same thing. The unification thesis is a bold theory about
psychology. It says that our beliefs are graduated in precisely the same manner as are
our judgments of probability. The unification thesis is a direct consequence of credal
expressivism (according to which we use probabilistic language to express degrees
of confidence)3 and a minimalistic semantics for ‘judgment’ (according to which “P
judges that probably Φ” is true roughly iff P has the mental state that P could
2 I retain ‘degrees of belief’ as defined in the first chapter as posits of the doctrine of degrees of belief. The graduated state underlying belief needn’t satisfy the components of the doctrine.
3 Credal expressivism is the linguistic analogue to the view of credal noncognitivism that was introduced in the first chapter.
appropriately express with the sentence “probably Φ”)4. Though it is rarely spelled
out and explicitly endorsed, doctrines that lead to the unification thesis seem to be
widely accepted.
I will argue in favor of a separation. I will advocate a psychological picture on
which judgments about probability are not attitudes of confidence, but are instead a
separate and more cognitively sophisticated attitude. What gives judgments about
probability their sophistication, and what prevents them from being mere attitudes
of confidence, is the way in which they involve concepts of probability. Attitudes of
confidence do not employ a concept of probability, but judgments of probability do.
Whereas infants and non-human animals are probably capable of the former kind of
attitude, only adult human beings seem capable of the latter.
I will present three considerations that suggest that judgments about probabil-
ity are something more sophisticated than the unification thesis would have it. One
consideration stresses the logical complexity of our judgments about probability. A
second consideration focuses on the fact that judgments about probability come in
numerically precise degrees. The final consideration relies on our capacity to make
judgments that are ambiguous between judgments of objective and subjective prob-
ability.
Before presenting these considerations, I will clarify the claim that judgments of
probability involve concepts of probability.
2.1.1 Probability as a Concept
A subjective interpretation of some judgments of probability is consistent with an
objective interpretation of others. When we judge that a die has a certain chance of
landing on a ‘6’ or that a radium atom has a certain chance of decay, the judgment may
concern some fact independent of the judge’s individual evidence. Judgments about
4 Someone who does seem to hold both of these views is Seth Yalcin. See [41] for his presentation of a minimalist semantics for attitudes (with ‘might’) and [42] for his discussion of credal expressivism.
objective chances cannot plausibly be interpreted as gradational attitudes. Instead,
they appear to be a species of ordinary categorical beliefs with gradations built into
their representational contents. Those judgments about probability that are most apt
for a subjective interpretation are those that are neither true nor false independently
of the judge’s relation to the proposition. The judgment that the Goldbach Conjecture
is probably - but only probably - true is a paradigm case. Either the conjecture is true
or it is not. The judgment of intermediate probability is more a reflection of
the judge’s own uncertainty than a verdict on the conjecture itself.
One reason to maintain that judgments about probability are not attitudes of
confidence is that probabilistic concepts play a role in the former but not the latter.
The gradations in attitudes of confidence are non-conceptual: one has no more need of
special concepts to be confident to different degrees than one has need of any special
concepts to be scared to different degrees.
Proponents of the unification view may draw from this the conclusion that judg-
ments about probability do not require concepts either. It is sometimes alleged
that judgments about probability cannot require any special concepts, because such
intellectually-undeveloped beings as infants and non-human animals seem capable
of making them [12, 9]. Recent experiments have demonstrated that infants have a
surprising capacity to make predictions on the basis of observations of relative pro-
portions.5 However, the conclusion only follows if we already take on board the
unification thesis. It is precisely because judgments of probability are conceptual
that I think the unification thesis must be false.
5 If an infant is shown three red marbles and one blue marble bouncing around inside of a container with a single opening, he will look longer if the first marble to escape is blue [33]. Infants who are first shown several marbles randomly drawn from a container and then shown the colors of marbles remaining inside the container look longer if the remaining marbles do not resemble the marbles drawn [40]. Infants are known to look longer at outcomes that they do not expect, so these experiments strongly suggest that infants’ expectations are guided by observations about relative proportions.
The idea can best be understood through analogies. Beliefs about the weight of
a bowling ball can be gradational in two different ways. We can believe that a ball
is heavier or lighter, and we can be more or less confident in our belief. I take it
that these two forms of gradation are very different. Worries about the price of milk
and desires about the high temperature for the day are gradational along two similar
dimensions. The gradations of variance of weight, price, and temperature in these
attitudes are conceptualized. Our attitudes involve concepts of weight, price, and
temperature in different ways. The different ways that they involve these (and other)
concepts determine the gradations. There are no concepts, the manner of whose
involvement determines the variance of strength of desire6 or worry.
When I say that gradations of probability involve concepts of probability, I mean
to say that probabilistic concepts figure into these attitudes in the way that concepts
of weight, price, and temperature figure into beliefs. I will speculate a bit in the next
section about what this manner of figuring might involve. However we may ultimately
understand the nature and role of concepts, my claim is that the way the dimensions
of gradation enter into our judgments about probability look more like the way that
gradations of weight, price, and temperature enter into beliefs and desires than the
way that gradations of confidence or desire strength do.
2.1.2 Content and Vehicles
When we believe something about weights, prices, or temperatures, the differences in
weight, price, and temperature correspond to differences in the representational con-
tent of our attitudes. There is no similar representational difference between beliefs
qua beliefs and worries qua worries. The former difference is conceptualized. The
latter is not. This could lead to the thought that conceptualized attitudinal grada-
6 If metaethical noncognitivists are right, moral concepts may be an exception to this claim. I think it is most plausible that these desires differ from ordinary desires in the same kind of way that I allege that judgments about probability differ from beliefs.
tions have to be representational, and consequently, they must reflect real gradations
out there in the world. If so, then in arguing that gradations in the assignment of
probability are conceptual, I am undermining noncognitivism about probability.
Here is one way we might make sense of the postulated similarity between judg-
ments of probability and price: we might construe judgments about probability as
ordinary judgments about a specific class of propositions. There are many candi-
dates for this class of propositions. We might adopt the logical interpretation, on
which judgments about probability concern some numerical relation between bodies
of evidence and propositions. Alternatively, we might adopt a cognitivist subjectivist
stance, according to which our judgments are judgments about our actual degrees of
belief.
I do not, however, think that we must infer from the fact that conceptualized
gradations are often representational that they are always so. It seems likely that
there is a distinction between the dimensions of gradation at the level of their cognitive
implementation. If we do have a language of thought, then probabilistic concepts
figure amongst its vocabulary. However, my primary contention is that the way
that attitudes involve gradations of probability is like the way that attitudes involve
gradations of weights, prices, and temperatures.
I don’t hope to settle this issue here, and while I remain deeply attracted to the
noncognitivist approach, the point that I wish to make is far more basic. Whether
we ultimately opt for cognitivism or noncognitivism, I think we need to make the
same claim about the cognitive vehicles of the attitude. That claim is that judgments
about probability involve vehicles with probabilistic components in the way that
judgments about weight involve vehicles with weight-related components. Just as we
talk about probabilities with sentences containing words for probability, so we think
about probabilities with vehicles incorporating concepts of probability.
Not all differences in the apparent content of belief can be traced to differences in
representational content. Beliefs with the same propositional content can differ with
respect to the modes of presentation of the entities involved. There may be some
modes of presentation that, given a certain failure of reference fixing, lack any kind of
referent. Examples of the latter kind might be common in failed theories. A man may
once have thought that he had a particularly high amount of yellow bile. Supposing
that the theory of humors was based on faulty presuppositions, we may plausibly
say that the man had a belief without any particular propositional content. There is
no such thing as yellow bile, and lacking a proper subject, the man’s judgment was
not about anything in particular.7 If the attitude was nonrepresentational, it would
nevertheless be a belief and could be ascribed gradations in precisely the same way
that beliefs about weights, prices, or temperatures can be.
How do we explain this? The explanation may ultimately be that beliefs that have
propositions as their objects (or at least aspire to) gain access to those propositions
with the help of certain cognitive vehicles8. This would make propositional attitudes
parallel propositional assertions: just as we assert a proposition by uttering a sen-
tence, we believe a proposition by tokening a vehicle. I take it that these vehicles
allow us to compose concepts together in some way or other. This view is consis-
tent with many different stories about the structure of these vehicles. They might
be sentence-like, or map-like, or cross-word-puzzle-like, or they might be coded as
recipes like DNA, or they might have a completely different kind of structure. They
7 It is certainly possible to find some propositional content to ascribe to him: perhaps he believes that he has a high amount of what the physicians refer to as ‘yellow bile’. I think that this assumption is unmotivated. Possessing deeply mistaken assumptions is often an adequate reason to think that reference doesn’t occur and that as a result the individual has some mental states that count as beliefs but lack representational content.
8 I take it that the best theories of how vehicles match up with propositions rely on some kind of correspondence between the functional relations existing between these attitudes and their vehicles on the one hand, and those between propositions on the other. In a sentence, expressing a proposition is epiphenomenal to the vehicle. The vehicle’s behavior explains why it expresses a proposition, rather than the other way around. This means that there need not be any important distinction between those attitudes which, by virtue of suitable correspondence, get to be propositional and those which do not.
might even be semantically compositional in only the loosest of senses. I don’t think
it matters for the present point, which is that we do not need to find gradations in
propositional contents in order to justify the association of gradations of probability
and gradations of weights, prices, and temperatures. We can instead justify the asso-
ciation by supposing that concepts of probability figure into the vehicles of judgments
of probability in the way that concepts of weights, prices, and temperatures figure
into the vehicles of judgments about weights, prices, and temperatures (however that
may be).
In the remainder of this chapter, I will present three considerations that suggest
that gradations of probability share more in common with the prototypical conceptual
gradations than they do with prototypical non-conceptual gradations.
2.2 Consideration 1: Structural Diversity
The first thing to note about judgments of probability is their potential for complexity.
These judgments often have a straightforward form: we judge that some proposition
has some probability of being true and it is these kinds of judgments that make the
unification thesis most plausible. But our judgments about probability are not limited
to straightforward probability ascriptions. We can judge that probabilities are related
to each other in subtle ways.
The simplest of the more complex judgments are comparative. Consider the judg-
ment expressed by the following sentence.
• It is more likely that there is intelligent alien life in the Milky Way Galaxy than
in the Sagittarius Dwarf Galaxy.
This judgment does not reflect any particular degree of confidence in a particular
proposition. The judgment doesn’t concern either the existence of alien life in the
Milky Way or in the Sagittarius Dwarf Galaxy all by itself. The judgment somehow
compares the two. The unification thesis cannot make sense of this fact on its straight-
forward reading. The sentence clearly expresses a judgment about probabilities, but
it doesn’t clearly express any particular level of confidence.
The unification thesis might be saved by regarding the sentence as expressing a
relation between a pair of attitudes of confidence. One might be so confident in the
existence of life in the Milky Way and so confident in the existence of life in the
Sagittarius Dwarf Galaxy, and by virtue of the greater degree of confidence in the
former, count as making the comparative judgment.
The problem with this reductive solution is that a person need not make any first-
order judgments regarding the probabilities of the compared propositions in order to
make the comparative judgment. We can think that it is more likely for life to be in
one place than the other while at the same time being completely baffled about the
probability of life in either location.
Our capacity for irreducibly comparative judgments of probability distinguishes
the gradations of probability from the gradations of other attitudes. While it is
possible to have comparative degrees of fear (it is possible to be more afraid of one
thing than another), it is not possible to be so without being afraid to some particular
degree or other of each thing. The same goes for most other gradational attitudes.
Higher-order comparisons of fear are grounded in lower-order degrees of fear, while
higher-order probability assignments need not be grounded in lower-order probability
assignments.
By contrast, the kinds of gradations grounded in conceptual contents have the
potential for being irreducibly comparative. The following is a rather run-of-the-mill
belief.
The price of oil will be greater if Iran has a homegrown revolution than
it will be if Iran is bombed by a foreign power.
One can have this belief while at the same time knowing nothing about the actual
price of oil in either situation. The comparison can be irreducible.
The unification thesis might be saved by admitting that confidence is a bit
different from other gradational attitudes. Sometimes it is suggested that judgments
of probability are essentially comparative. Perhaps degrees of confidence are
comparative in the sense that the object of the fundamental psychological attitude is not
a proposition, but a pair of propositions. There is no such thing as a degree of
confidence in isolation; there are only relations of confidence. We can interpret
non-relational judgments as implicitly comparative. Perhaps to think that the probability
that there is life in the Milky Way is high is to be only slightly less confident in it
than in a tautology. Instead of understanding comparative probabilities in terms of
non-comparative probabilities, we must understand non-comparative probabilities as
implicitly comparative. This will allow us to save the unification thesis while at the
same time making sense of irreducibly comparative judgments.
Unfortunately, this strategy is complicated by the existence of yet other appar-
ently irreducible forms of judgments about probabilities. Alan Hajek [15] has argued
convincingly that conditional probabilities, such as expressed by the following, cannot
be reduced to unconditional probabilities.
It is likely that the economy will rally, given that Greece institutes brutal
austerity measures. (Conditional)
Our only real option, consistent with the unification thesis, is to try to reduce one
to the other.
A reduction of conditionals to comparatives is not implausible; taking inspiration
from the traditional ratio analysis of conditional probabilities9, we might try to
reduce conditionals to primitive comparatives. There are, however, many other forms
of probabilistic judgments that would also require a separate reductive treatment.
For instance:
It is more likely that the economy will collapse, given that Greece
defaults, than it is that the economy will rally, given that Greece institutes
brutal austerity measures.
The extent to which it is more likely that there is alien life in the Milky
Way Galaxy than in the Sagittarius Dwarf Galaxy is greater than the
extent to which it is more likely that there is alien life in the Andromeda
Galaxy than in the Milky Way Galaxy. (Comparative Comparative)
On any given night, it is likely that it rains somewhere. (Quantified)
Each king is more likely to have died young than his predecessor.
(Quantified Comparative)
Apart from the problem of giving some kind of general reductive interpretation
that captures the extent of this variety, there is the problem of explaining the dimen-
sions of the variety as well. The degrees of probability appear to be tightly integrated
with the attitude’s content and exhibit systematicity and productivity. The examples
9 According to the ratio analysis, a conditional probability Pr(Q|R) is equal to the ratio of non-conditional probabilities: Pr(Q|R) = Pr(Q&R) / Pr(R).
above show how complexity can be produced by combining and embedding complex
forms within other complex forms. The dimensions of variety are determined in a sys-
tematic way that mirrors the variety of judgments that we can make about weights,
prices, and temperatures. The best explanation of this fact is that gradations of
probabilities and gradations of weights, prices, and temperatures figure into these
judgments in the same way.
2.3 Consideration 2: Novel Quantifiability
There are reasons to think that, among all of the gradational attitudes, judgments of
probability are special. Unlike other gradational attitudes, the degrees of probability
are quantifiable: we measure our probabilities with numbers. We may, for instance,
judge that the probability that Zenyatta will win the race is .75. By contrast, we do
not measure the degrees of our other gradational attitudes, such as fear, with numbers.
It makes no sense to say that one is scared to degree .5, or that one has ten degrees of
fear. We do, however, measure weights, prices, and temperatures with numbers. This
suggests that the way in which judgments of probability are gradational is different
from the ways in which other attitudes are gradational, and that it is more like the
way that judgments about weights, prices, and temperatures are gradational.
The quantifiability of probability stands in need of special explanation. Two fur-
ther facts about the way that we quantify probabilities bolster the difference between
judgments about probability and other gradational attitudes and point to what might
explain this. First, the quantification of degrees of probability exhibits a potential for
unbounded precision. Second, such quantification is historically novel. In the next
two sections, I will explain how these facts point to the involvement of probabilistic
concepts in our judgments of probability.
2.3.1 Precision
Not only do we measure probabilities with numbers, but these measurements admit of
no limit to their possible precision. Precise numerical judgments are rare, but we have
the capacity, in the right epistemic situations, to assign very precise probabilities.10
We seldom make use of this capacity because the epistemic situations in which we
find ourselves seldom call for it, but it is not a cognitive limitation that prevents us.
The problem for the unification thesis is to explain how it is that our attitudes are
capable of precise quantification. If we give up the thesis, we can explain numerical
precision in the same way that we can explain numerical precision in beliefs. We are
able to believe propositions involving precise quantification. It is easy, for instance,
to believe that a particular mountain is 13,563.2 feet above sea level. It is hard to see
how we might do this without employing a recursive capacity to construct concepts
of numbers that can then figure into the vehicles of the attitudes. Someone who has
not been taught a recursive representation of numbers would not be able to have this
kind of belief. The presence of a concept of the number 13,563.2 explains how it is
that the belief gets to be a belief about so precise a height. A similar presence of
numerical concepts in probabilistic attitudes could explain their numerical precision.
The advantage that we get in supposing that concepts figure into the gradation
of probabilities isn’t that it provides a concrete explanation of how probabilities get
to be numerically precise. We do not know exactly how beliefs about prices get to be
numerically precise, either. But we know there has to be such an explanation in order
to explain the precision of other beliefs. If we think that probabilities are conceptual,
we can make use of the same explanation we already must assume exists.
10 Precise assignments like this are more common with conditional probabilities, where we can use the antecedent to set up ideal contexts for precision.
2.3.2 Novelty
Undoubtedly probabilistic attitudes, in some form or other, have long played a part
in the cognitive lives of human beings. Ancient Romans probably judged that some
charioteers were more likely to win a given race than others, and the Sumerians surely
noted when the clouds suggested that rain was likely. It is only recently, however,
that numbers have been involved in judgments about probability. The extent to which
people thought about probability before the 17th century is not entirely known, but it
is relatively uncontroversial that nobody assessed probabilities (even comparatively)
with the use of numbers before the innovations of Pascal, Leibniz, and Bernoulli[14].
We have long made judgments about probability, but we have only recently begun
to quantify them. It would be quite surprising if our capacity for probabilistic preci-
sion were special – if it were distinct from our general capacity for precise judgments,
we would not expect it to appear and develop between 1650 and 1850. What seems
far more likely is that epistemic advances provided a reason to have more precise
subjective probabilities. There was no aspect of our cognitive architecture that was
dedicated to quantifying probability assignments; instead old structures (the same
structures as underlie beliefs) were co-opted for a new purpose.
There must have been some time before we began thinking about heights in terms
of numbers. The introduction of numerical measures of length radically changed the
kinds of judgments we were able to make. We went from being able to make vague
and comparative assessments of heights, to making judgments about precise numerical
heights. Throughout our history, there have been lots of quantities that we have come,
over time, to measure with numbers: distance, weight, mass, volume, hardness, and
temperature, among many others. Describing a quantity with numerical precision
requires a reliable measure. Distance measures were easier to come by and appeared
comparatively early. Temperature measures required more sophisticated devices, and
so came about relatively late. We don’t ordinarily measure colors with numbers, but
it is easy to see how we could introduce numerical precision into our color concepts
with a suitable measure. Concepts can transition from pre-quantitative primitive
forms to quantitative forms.
Contrast this with other attitudes. It is hard to imagine that we could suddenly
start having quantification built into our gustatory experiences (so that some foods
literally tasted twice as good as others). Or that we could suddenly start having
precisely-quantified fears or worries. We could learn how to assign numbers to levels
of fear, but the natural metric on numbers would correspond to nothing so natural
for fears. It makes no sense to say that one thing is twice as scary as something else –
though it does make sense to say that something is much scarier than something else.
By admitting that gradations of probability are conceptualized, we can make sense of
the historical novelty of probabilistic precision by attributing the development to a
conceptual revolution. It is therefore not surprising that quantification of probability
should have appeared with the invention of epistemic tools (mathematical combinatorics
in the 17th century and statistical data in the 17th and 18th centuries) that justify
those probabilities. We are not born with the concepts necessary to entertain any
content whatsoever. To entertain numerical contents in general, we need concepts
of numbers. To entertain probabilistic contents, we need concepts of probability.
The modern concept of probability appears to be a concept that we acquire through
exposure to the theory of probability that has been developed since the 17th century.
2.4 Consideration 3: Non-specific Probabilities
Judgments of objective chance are almost universally taken to be a species of be-
lief, and so the unification thesis suggests that beliefs about objective chance and
judgments about probability are distinct in a rather fundamental way. The final con-
sideration concerns the existence of an attitude halfway between belief in objective
chances and judgment about probability. To get a handle on these attitudes, we will
need to take a detour through probabilistic language.
Both judgments about probability and beliefs about objective probability are given
expression with sentences involving probabilistic language. ‘Probability’, ‘likelihood’,
and ‘chance’ can each be used to express either sort of attitude. This means that any
distinction in the attitudes requires an unmarked distinction in the language.11 The
lack of such a distinction allows for instances in which neither an objective nor a
subjective interpretation is clearly licensed.
2.4.1 Ambiguity and Non-specificity
Probabilistic language can be used by a speaker without that speaker signaling either
an intended subjective or objective interpretation. A speaker may assert that an
outcome is probable without having communicated either that it is subjectively or
objectively probable. This might be attributed to ambiguity, but I think this would
get the phenomenon wrong. Probabilistic language is not ambiguous; rather it is
non-specific: like ‘jade’ and unlike ‘bank’, it is open to multiple precisifications, none
of which need to be communicated for the sentence to be meaningful.
The primary reason to favor non-specificity over ambiguity is that precisification
is generally irrelevant to successful communication. A speaker may aim to convey
a probability assignment without having any particular interpretation in mind, and
the audience may come to accept the speaker’s assertion without supplying either
interpretation. Consider the following situation:
11 Ideally, a good semantics would find some way of unifying the semantics of objective and subjective probabilities, just as we have unified the semantics of epistemic, metaphysical, and other modals. On a popular view, the language of subjective probability, unlike the language of objective probability, does not contribute to the truth conditions of an utterance but rather has an effect on the speech act produced[27, 32, 42]. It is hard to see how we could provide a unified semantics if the language must play a radically different role in determining the attitude expressed – contributing to the content in some cases and to the attitude in others.
Will, a weatherman, predicts a .75 probability of rain on Monday. This
assignment was derived from an examination of a forecasting model and it
is controversial whether or not these models provide sufficient evidence to
justify a judgment about objective chance. Will has no opinion about the
correct interpretation of his own expression or the capacity for his model
to justify a judgment about objective chance. In fact he has never really
thought about the matter. He uses his forecasting model and statistical
methods to make probabilistic predictions in the way that he was taught
in graduate school.
Ambiguity infects linguistic contents and not mental contents. Since the purpose
of assertion is typically to convey a particular belief, contextually unresolvable am-
biguous language results in a failure of communication. Suppose, for instance, that
John says to Mary, “I am going down to the bank”. Mary cannot come to incorpo-
rate his testimony into her own beliefs without deciding whether he meant a financial
institution or a river’s edge.
Since ambiguities need to be settled for a sentence to be meaningful, speakers
should have some interpretation in mind when making an utterance. The existence
of people like Will, who competently make assertions about probabilities without
having particular precisifications in mind, suggests that probabilistic language is not
ambiguous.
Probabilistic language does not require a subjective or objective interpretation
in order for successful communication to occur. One can hear the weatherman’s
report and come to adopt a corresponding attitude without ever needing to form an
opinion about whether he was to be interpreted as talking about an objective or
subjective probability. This shows that probabilistic language can succeed without
any need for interpretation on the part of the audience.
2.4.2 Non-specific Attitudes
The existence of a non-specific use of probabilistic language suggests the existence of
a non-specific attitude. The attitude expressed by Will’s assertion is neither a belief
about objective chances nor a straight-forward judgment about subjective probabil-
ity. The existence of non-specific attitudes suggests that judgments about subjective
probability and beliefs about objective probability are not all that different after all.
Non-specific judgments of probability have gradations that straddle the gradations
of objective and subjective probability, and it is hard to see how this could be if the
gradations were in one case conceptualized and in the other case not.
We could come up with some kind of objective measures of scariness. These mea-
sures might allow us to make specific objective judgments about how scary something
is. One might, for instance, come up with a point-based scheme that assigns a crea-
ture three points if it has fangs, five points if it has patchy fur, etc. It is hard to
imagine, however, that we could ever have an attitude that straddled beliefs about
any objective measure of scariness and genuine fear. The reason for this is that gra-
dations of objective scariness would be conceptualized, while actual fear is not. The
two varieties of attitudes are just too different.
Postulating probabilistic concepts gives us an explanation of the possibility of
non-specificity because it allows us to see similarity between the content of judg-
ments about probability and beliefs about objective probability. Attitudes involving
subjective probabilistic concepts get to be judgments about probability. Attitudes in-
volving concepts of objective probability get to be judgments about objective chances.
Whatever features characterize these different concepts, it is possible to have concepts
with features that lie in between.
It is possible to have attitudes toward in-between contents because it is possible
to have non-specific concepts. We often are also capable of attitudes that are in-
between a belief about a weight and a belief about a mass. This is possible if we
have an unrefined or non-specific concept that counts as neither a concept of weight
nor a concept of mass. As a result, some of our thoughts will concern neither an object’s weight
nor its mass but will be non-specific between them. I claim that what is going on in
cases of probabilistic non-specificity is basically the same. We can have an attitude
that is in-between a judgment about probability and a belief about objective chances
because we can have unrefined concepts of probability that fail to count specifically
as concepts of objective chance or concepts of subjective probability.
Chapter 3
Constitutivism about the Formal
Norms
3.1 The Formal Norms
Some judgments about probability commit us to other judgments about probability.
Judging that the probability that aliens are behind crop circles is .5 commits us to
judging that the probability that aliens are not behind crop circles is also .5.
Judging that the probability that ‘Shakespeare’ was a pseudonym of Christopher
Marlowe is .7 commits us to judging that the probability that ‘Shakespeare’ was either
a pseudonym of Christopher Marlowe or of Francis Bacon is at least .7.
Why is this? These commitments arise from formal normative constraints on
probability. The formal norms require us to conform our probability assignments1 to
the axioms of probability. Obedience to these norms means that assigning one proba-
bility to one proposition commits us to assigning, or not assigning, other probabilities
to other propositions.
1 In this chapter and the next, I will switch from speaking of ‘judgments of probability’ to ‘probability assignments’. This shift in terminology reflects no underlying change in meaning. However, a ‘probability assignment’ is ambiguous between a judgment about probabilities and a set of judgments about probabilities.
Following Kolmogorov’s [21] axiomatization of probability theory, there have been
thought to be three fundamental formal constraints on our assignments of probability.
The first constraint, Normality, requires an assignment of a probability of 1 to
propositions that cannot be false. The second constraint, Non-negativity, requires
that no probability be less than 0. The third constraint, Additivity, requires
that the probability of a proposition equivalent to a disjunction of mutually inconsistent
propositions be equal to the sum of the probabilities assigned to the disjuncts. Collectively,
I will refer to these as the Formal Norms of probability. The first two norms set
the scale and as such are not especially substantive; my focus in this chapter will be
primarily on the more substantive norm of Additivity.
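To make the content of the three constraints concrete, here is a minimal sketch in Python. It assumes a finite space of possible worlds, models a proposition as the set of worlds in which it is true, and models a probability assignment as a possibly partial mapping from propositions to numbers. These modelling choices, and names such as `violations` and `WORLDS`, are illustrative assumptions rather than anything the argument depends on.

```python
from itertools import combinations

# Hypothetical two-world space for the crop-circle example.
WORLDS = frozenset({"aliens", "no_aliens"})
ALIENS = frozenset({"aliens"})
NOT_ALIENS = frozenset({"no_aliens"})


def violations(assignment, worlds=WORLDS, tol=1e-9):
    """List the formal norms that a (possibly partial) assignment breaks.

    assignment maps propositions (frozensets of worlds) to numbers.
    """
    found = []
    # Normality: the proposition true in every world must receive 1.
    if worlds in assignment and abs(assignment[worlds] - 1.0) > tol:
        found.append("Normality")
    # Non-negativity: no proposition may receive less than 0.
    if any(value < 0 for value in assignment.values()):
        found.append("Non-negativity")
    # Additivity: for inconsistent (disjoint) a and b, the value of their
    # disjunction must equal the sum of their values, whenever all three
    # propositions have been assigned a value.
    for a, b in combinations(assignment, 2):
        if a & b or (a | b) not in assignment:
            continue
        if abs(assignment[a | b] - (assignment[a] + assignment[b])) > tol:
            found.append("Additivity")
            break
    return found


coherent = {ALIENS: 0.5, NOT_ALIENS: 0.5, WORLDS: 1.0}
incoherent = {ALIENS: 0.5, NOT_ALIENS: 0.7, WORLDS: 1.0}

print(violations(coherent))    # no norm is violated
print(violations(incoherent))  # Additivity fails: 0.5 + 0.7 != 1.0
```

On this sketch, someone who assigns .5 to the aliens hypothesis and 1 to the tautology can satisfy Additivity only by assigning .5 to its negation, which is the kind of commitment described at the start of the chapter.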
The topic of this chapter is the explanation of the normativity of the formal
constraints governing our assignments of probability – where do these normative
constraints come from and why do they hold sway over us?
The norms are not so straightforward as they first appear and to understand their
source we should first try to understand their content. To know what these norms
forbid, we have to understand what probability assignments are. In the last chapter,
I suggested that probability assignments are a kind of sophisticated judgment whose
gradations are conceptualized. I didn’t explore what it is about these judgments that
makes them judgments about probability. In this chapter, I will explore this issue in
tandem with the question of normativity. Whatever a probability assignment is, it
had better be the sort of thing that is subject to these norms, so it is best to approach
the task of discerning the nature of probability assignments and the explanations of
norms simultaneously. I will approach the issue in an independent way and put back
into play some of the views that I rejected previously. The conclusion I come to will
thereby provide independent support for the practical interpretation that I favor.
This chapter is structured around four different ways of understanding what it is to
be a probability assignment and corresponding ways of understanding why probability
assignments are subject to the norms that govern them. Each different way of
conceiving of probability assignments gives us a different way of understanding what
the norms forbid.
In the first section, “the Bare Characterization”, I will explore the view that
assignments of probability are simply graduated credal states. Any graduated credal
state is a probability assignment. I will argue that this proposal is deficient because
the most plausible account of the requirements of Additivity on this construal is deeply
implausible. Consequently, there must be something more to being a probability
assignment than being a graduated credal state.
In the second section, “the Pragmatic Characterization”, I explore the view that
probability assignments are graduated credal states used for making decisions in
accordance with standard decision theory. This notion is closely tied with Dutch book
arguments for Additivity. These arguments make use of the connection of proba-
bilities with decision making to produce a pragmatic argument for Additivity. I will
argue that this way of thinking about probabilities is implausible and the Dutch book
arguments are insufficient to account for the generality of the norms.
In the third section, “the Aim Characterization”, I explore the view that proba-
bility assignments are graduated credal states that possess a constitutive aim toward
accuracy. This aim has been fashioned into a clever non-pragmatic argument for Ad-
ditivity. I will discuss several different ways of explaining the norms by means of a
constitutive aim toward accuracy and argue that they all fail.
In the fourth section, “the Constitutive Characterization”, I will present and de-
fend my view that the norms are constitutive of the practice of assigning probabilities.
I will argue that the explanation for the normativity of Additivity, Normality and
Non-negativity is the following: intentional obedience to these norms is required for
engagement in the practice of assigning probabilities. One cannot count as assigning
probabilities unless one intentionally subjects one’s assignments to the norms. The
formal norms governing probability assignments are of a means-end variety: in order
to do something we wish to do, we must act in a certain way. They are, in the termi-
nology of Evans and Shah[10], weak norms: one can evade these norms by choosing
not to engage in the practice of assigning probabilities. To do so, one must simply
refrain from making use of concepts of probability. But one must accept the rules of
their use if one does opt to use them. Normality, Non-negativity and Additivity are
not forced upon us – we could choose not to keep track of probabilities at all. But
once we embrace the concept of probability and choose to use the concept to keep
track of our evidence, we are subject to its rules of use.
3.2 The Bare Characterization
Beliefs seem to come in degrees. We believe some things more than others. We have
some beliefs which are deep and strong, and others which are shallow or weak. We
have some beliefs that we will hold onto, come what may. We have other beliefs that
we drop when our mood changes. Belief is a graduated credal state because it comes
in degrees that are sensitive to evidence regarding how the world actually is.
According to the view that I will call the ‘Bare Characterization’ (a variant of
credal noncognitivism), probability assignments are nothing other than graduated
credal states. The act of assigning a probability is none other than the act of forming
a belief with a certain degree. The level of probability assigned is just the degree of
the belief. So when we say “John assigns a probability of x in ψ” we really just mean
that “John has a belief of degree-of-strength x in ψ”.
The Bare Characterization: A probability assignment is nothing other
than a graduated credal state. The degree of probability reflects the
strength of the credal state.2
Suppose that the Bare Characterization is correct. What, then, does Additivity
forbid? Here is the obvious choice.
Bare Absence Requirement: Additivity consists in an obligation not
to have any non-additive graduated credal states.
The Bare Absence Requirement isn’t plausible, and unless we can find a better
understanding of the requirements of Additivity consistent with the Bare Charac-
terization, its implausibility will doom the prospects of the Bare Characterization
itself.
3.2.1 Alternatives
I will try to show that the Bare Absence Requirement is false by showing that there
is no normative requirement not to have non-additive credal states. It is difficult to
argue about basic matters of normativity. For the most general claims about what is
and what is not appropriate, we must rely on our intuitions both about the overall
plausibility of a purported norm and about the plausibility of its implications in
specific cases. My argument against the Bare Absence Requirement will rely on the
thought that it can be rational to have non-additive graduated credal states.
Graduated credal states, insofar as they are credal states, must be influenced by
evidence. There are various schemes for how a degree of credence in a proposition
might relate to the evidence. To illustrate this point, I will describe two such alterna-
tive schemes that sanction non-additive assignments, and I will sketch an individual
who has non-additive credal states, but who seems perfectly rational.
2 I assume here that degree is understood somehow internally, and not as any kind of measure dependent on the resulting behavior of the individual.
In order to show that it is sometimes rational to have non-additive credal states,
I’ll need to set some ground rules on what it is to be a credal state. I propose two basic
criteria. First, in order to be a credal state, the state must be representational. This
means that it can be fruitfully understood as an attitude directed at a proposition.
Second, the state must be evidence-sensitive. We must change the states we’re in in
response to changes in our evidence. Categorical beliefs are representational states
that we become more likely to be in as we get more evidence in favor of the way they
represent things as being. Degrees of belief are states whose degree in a proposition
tends to correspond to the amount of evidence we have for that proposition. As we
get more evidence, we raise or lower our degrees of belief accordingly.
Here are two kinds of graduated credal states whose graduations follow their own
non-additive logic.
1. Mass Credences. A mass credence is a graduated credal state whose
gradations relate to evidence in a way that corresponds in spirit with
Dempster-Shafer [29] mass functions. A mass function is an assignment
of numbers to propositions that is intended to reflect the proportion of
the evidence specific to that proposition. Evidence that supports a more
precise variation on a proposition contributes nothing to the mass of the
proposition itself. Evidence for a disjunct, for instance, isn’t evidence
specific to the disjunctions in which it participates, and so it has no effect
on their corresponding mass values. A mass credence is a graduated
credal state whose gradations react to evidence like mass functions. When
someone with mass credences receives evidence specific to a proposition,
their mass credence in that proposition would increase.
2. Consonance Credences.3 Given a probability function defined over a
space of possible worlds, we can define a consonance function to be that
function that takes a proposition and returns the probability of the most
probable world in which that proposition is true, divided by the probability
of the most probable world. The value of a consonance function on a propo-
sition corresponds (somewhat loosely) with how well the proposition fits
in with one’s best guess about the total state of the world. Evidence which
slightly favors remote contingencies in which the proposition is true has
no influence on the consonance value of that proposition. A consonance
credence is a graduated credal state whose gradations react to evidence in
the manner of consonance functions. The degree of consonance credence
in a proposition increases in response to evidence in favor of the most
likely scenario in which it could occur.
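The two schemes can be illustrated with a short Python sketch. The consonance function below follows the definition just given. The mass values are simply stipulated in the spirit of a Dempster-Shafer mass assignment, not derived from any body of evidence. The worlds, the numbers, and the names `consonance`, `world_probs`, and `mass` are all hypothetical.

```python
def consonance(world_probs, proposition):
    """Probability of the most probable world in which the proposition is
    true, divided by the probability of the most probable world overall."""
    best_overall = max(world_probs.values())
    best_in_prop = max((world_probs[w] for w in proposition), default=0.0)
    return best_in_prop / best_overall


# Hypothetical probability function over three possible worlds.
world_probs = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
P = {"w2"}      # a proposition true only in w2
Q = {"w3"}      # a proposition true only in w3, so inconsistent with P
P_or_Q = P | Q  # their disjunction

cons_p = consonance(world_probs, P)
cons_q = consonance(world_probs, Q)
cons_disj = consonance(world_probs, P_or_Q)

# The disjunction receives only the larger of the two values, not their sum.
print(cons_p, cons_q, cons_disj)

# Hypothetical mass assignment: most of the evidence is specific to each
# disjunct, and little of it is specific to the disjunction as such.
mass = {
    frozenset(P): 0.45,
    frozenset(Q): 0.45,
    frozenset(P_or_Q): 0.10,
}
mass_gap = mass[frozenset(P)] + mass[frozenset(Q)] - mass[frozenset(P_or_Q)]
print(mass_gap)  # positive, so these masses are not additive
```

In both cases the value attached to the disjunction differs from the sum of the values attached to its inconsistent disjuncts, which is all that non-additivity requires.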
Both consonance and mass credences meet my criteria for being non-additive
graduated credal states. It is possible to have a mass function that ascribes high values
to each member of a disjunction and a low value to the disjunction. The consonance of
a disjunction is always equal to the consonance of one of its disjuncts and is generally
less than the sum of their consonances. So for both kinds of credal states, Additivity
definitely does not hold. If it is ever rational to have these states, then the Bare
Absence Requirement must be false. Now, let me introduce a hypothetical individual
who has both mass and consonance credences. This individual is, I claim, rational.
Bert is an epistemic libertine. Some creatures may have categorical beliefs
without graduated credal states. Some creatures may assign probabilities
and have no categorical beliefs. Bert has a surfeit of epistemic attitudes.
3 Consonance credences are loosely inspired by Angelika Kratzer’s [22] proposal for the semantics of probability operators.
He has categorical beliefs. He has graduated credal states whose gradations
act like probability functions. He also has distinct mass credences
and consonance credences. Bert doesn’t use these other states to help him
make decisions – he relies on his ordinary judgments about probability to
decide what to do. These other credal states play no role in Bert’s actual
decision making.
I claim two things: that Bert does have non-additive credal states and that Bert
is not irrational for having them. His states are credal insofar as they are represen-
tational and their gradations are sensitive to evidence about their representations.
They are non-additive by stipulation. Bert is not unreasonable simply for having
them. So the Bare Absence Requirement must be false.
There is no obligation to lack non-additive credal states, so what does Additivity
really forbid? It forbids having non-additive probability assignments. Probability
assignments must not be mere graduated credal states. Probability assignments must
have some special nature that accounts for their being subject to Additivity.
3.3 The Pragmatic Characterization
As long as probabilities have been measured quantitatively, they’ve been used for
gambling. In fact, modern probability theory was conceived by gamblers looking
for an edge. Probability clearly has a very special importance for decision making
under uncertainty; it provides us with a method for deciding what decisions to make,
given our interests. The ubiquity of this connection suggests that we might want to
incorporate this use into our account of the nature of probability assignments and
use the connection to account for Additivity.
The connection relies on a utility-maximizing decision procedure – henceforth,
the ‘standard decision procedure’4. One condition required to apply this decision
4 There are many versions of this procedure. Nothing that I have to say depends upon selecting any one in particular.
procedure is that we have quantifiable graduated desire states. The expected utility
of an action is the sum of the measures of the strength of our desires for the particular
outcomes that might result from that action, each weighted by the probability
we assign to that outcome resulting. Thus, the procedure advocates performing that
action among our available actions whose possible outcomes are the most desirable,
as weighted by their probabilities of occurrence.
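The calculation just described can be sketched in a few lines of Python. This is only one simple version of the procedure, assuming finitely many outcomes per action and numerical utilities. The umbrella decision, the utility figures, and the function names are made up for illustration.

```python
def expected_utility(outcomes):
    """Sum of outcome utilities, each weighted by its assigned probability.

    outcomes is a list of (probability, utility) pairs for one action.
    """
    return sum(probability * utility for probability, utility in outcomes)


def best_action(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Hypothetical decision: a .75 probability of rain, with made-up utilities.
actions = {
    "take_umbrella": [(0.75, 5.0), (0.25, 3.0)],     # rain, no rain
    "leave_umbrella": [(0.75, -10.0), (0.25, 8.0)],  # rain, no rain
}

for name, outcomes in actions.items():
    print(name, expected_utility(outcomes))
print(best_action(actions))
```

With these figures, taking the umbrella has the higher expected utility, so the procedure recommends it.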
The Pragmatic Characterization: A probability assignment is a grad-
uated credal state that is used for the purpose of selecting actions on the
basis of expected utility.
And the Pragmatic Characterization suggests the following understanding of the
norms of probability.
Pragmatic Requirement: Additivity consists in a requirement not to
use non-additive graduated credal states for the purpose of selecting ac-
tions on the basis of optimizing ‘expected’ utilities.
Given the assumption that probability assignments are states that are used to
select actions on the basis of expected utilities, there are several promising arguments
that we can give for Additivity. Among the most well-known of these arguments is the
Dutch book argument. Ultimately, I don’t think that we can justify the assumptions
necessary to make this argument work, but it is worth rehearsing it to see how a
defender of the Pragmatic Characterization could go about explaining Additivity.
3.3.1 The Dutch Book Argument
If we always select the action that maximizes our expected utility, then in those
situations where our available actions are to either accept or decline a bet and where
there is no opportunity cost to accepting, we will accept those bets with positive
expected utility. The Dutch book argument suggests that in order to avoid accepting
collections of bets that will lead to sure losses, we must calculate expected utilities
with additive credal states.
A Dutch book is a set of bets that is especially advantageous to offer and disad-
vantageous to accept. The bookie will lose on some bets and win on others in a way
that depends on factors beyond the bookie’s control. But no matter how each bet
turns out individually, the bookie is guaranteed to gain a net positive amount – and
the bettor is guaranteed to lose – on the whole collection. Clearly, it is not good to be
disposed to accept Dutch books. Accepting Dutch books will lead one to lose utility.
But precisely how bad this disposition is seems to depend partly on the prevalence of
clever bookies.
This central observation lies at the heart of all Dutch book arguments: if you
accept bets with positive expected utility and your probability assignments are non-
additive, then you will be disposed to accept certain Dutch books. Clearly, this is an
unfortunate way to be disposed. Unfortunateness is not irrationality, so where is the
argument for Additivity? We could try to leverage the unfortunateness of accepting
Dutch books into a conclusion about the irrationality of the states that dispose one to
accept Dutch books. We might accept a principle according to which any disposition
which is unfortunate to have in some situations is irrational.
Since it is bad to accept Dutch books, it is unfortunate to have states that would
dispose one to accept Dutch books. If unfortunate dispositions are always irrational,
then non-additive probabilities are irrational. However, this inference is suspect.
There are two problems. First, while a disposition can lead to unfortunate choices
in one context, it can be good in others. (Consider Newcomb’s paradox.) Some
rewards may be earmarked for those who are disposed to make bad bets on some
occasions. Alan Hajek notes [16] this flaw with Dutch book arguments. He suggests
that sometimes having non-additive probabilities will lead one to accept a set of bets
that collectively ensure a net positive that one wouldn’t accept if one had additive
probabilities. This observation is no threat to the Dutch book argument, because
it just shows that if one has additive assignments, one will occasionally refrain from
exchanging an advantageous risky bet for a sure thing. However, it is difficult to
rule out the possibility that we might be better off with some kind of non-additive
assignment given the kinds of bets we are likely to actually receive. Second, we should
also be careful to distinguish between dispositions and the actions that they lead to.
It may be that accepting a Dutch book is irrational, but having the dispositions to
accept those bets individually is not itself irrational. Or it may be that accepting
a Dutch book isn’t even irrational. It isn’t clear that there is any reasonable way
to avoid being disposed in some way or other to unfortunate choices in the right
contexts.5
A more general challenge often offered by critics of Dutch book arguments is that
they rely too much on practical considerations. The alleged problem with having non-
additive states is that it thwarts you from getting what you want. This subordinates
the normativity of probability states to means-end normativity. We want to lead rich
and happy lives, and this is hampered by having non-additive probability assignments,
so we should drop those non-additive assignments. But intuitively, the problem with
non-additive states is internal to the states themselves and not at all a practical issue.
Assigning non-additive probabilities isn’t like forgetting to measure twice before you
cut, or rubbing, rather than dabbing, a spill on a carpet. The general problem
is twofold. First, the reason to assign additive probabilities isn’t pragmatic, but
epistemic. Second, the problem has something to do with the attitudes themselves,
rather than their relation to other attitudes.
Several philosophers have tried to convert the practical issue into an epistemic
one. Christensen [6], for instance, has suggested that Dutch-book arguments show
5 For instance, see [1].
that there is a sort of incongruity inherent in non-additive probability assignments.
His reason for thinking this lies in a purported connection with beliefs. Probability
assignments sanction certain ceteris paribus judgments about fair bets. A probability
assignment of 2/3 licenses the judgment that a bet at 2:1 odds is fair, all things
considered. One need not be prepared to make the bet in order to judge that the bet is fair.
The issue is an epistemic one. If each probability assignment requires the judgment
that bets that are sanctioned by the standard decision procedure are fair, and if it is
an a priori truth about fairness that a collection of bets that are individually fair is
also fair, and if anything that guarantees a loss is unfair, then there is a conflict of
beliefs about fairness.6
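The link between a probability assignment and a fair bet can be checked with a line of arithmetic. The following is only a worked illustration: it assumes that the assignment in question is 2/3, that a bet "at 2:1 odds" means risking 2 units to win 1 unit on the proposition, and that fairness is understood as a zero expected return.

$$\tfrac{2}{3}\cdot(+1) \;+\; \tfrac{1}{3}\cdot(-2) \;=\; \tfrac{2}{3} - \tfrac{2}{3} \;=\; 0$$

On these assumptions the bet neither favors nor disfavors the bettor, which is the sense in which the assignment sanctions the judgment that it is fair.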
This may help make the issue non-practical, but it doesn’t quite succeed in inter-
nalizing the problem. It isn’t the probability assignments themselves, but the beliefs
that they commit us to that lead to the problem. This would indicate that the
problem is dependent on being in a position to form beliefs about fairness. If one
cannot form beliefs about fairness, then probability assignments themselves are not
problematic. But failing to form beliefs about fairness doesn’t seem to exculpate
one’s non-additive probability assignments. The tie to fairness may be indicative of
a problem with probability assignments that fail the formal norms, but it doesn’t
explain what is wrong with them.
The central problem for Dutch book arguments is that they make the issue ex-
ternal to the judgments themselves, which suggests that the argument is only good
insofar as probability assignments have the requisite external connections. Break the
connections and the argument loses its force. This won’t be a problem if probability
assignments really have to be connected to decision making in the standard way in
order to be probability assignments. But this idea seems dubious. Can’t we assign
6 Christensen doesn’t go so far as to allege an actual inconsistency of belief. He does not wish to say that the sanctioned judgments about fairness are straightforwardly logically inconsistent, just that there is something internally defective about them.
genuine probabilities and make use of alternative heuristics in deciding what to do?
Couldn’t we assign probabilities even if we never acted in any way whatsoever? The
answer, according to the Pragmatic Characterization, is ‘no’. So much the worse for
the Pragmatic Characterization: we obviously do assign probabilities and go on to
misuse them. People rarely go through the process of actually calculating expected
utilities, and it is doubtful that they generally succeed in approximating it.7
But even when we don’t use the standard decision procedure, we still assign prob-
abilities and the norms still apply to those assignments.
There are decision procedures we could follow on which non-additive probabilities
are not problematic. Assigning non-additive probabilities would still be problematic.
So the problem that arises in assigning non-additive probabilities cannot come from
the practical effect. This is not to say that one wouldn’t be making a mistake of some
kind in forgoing the standard decision procedure. My point is simply that making that
mistake doesn’t exculpate one for making non-additive probability assignments.
3.4 The Aim Characterization
If we cannot explain Additivity by reference to the effects of probability assignments,
perhaps we should focus on just what kind of credal state probability assignments
are. Jim Joyce [18, 19] spearheaded recent interest in this alternative approach by
suggesting that we could derive the norms of probability from the fact that probability
assignments aim at accuracy. This approach is strongly influenced by the view that
the fact that beliefs aim at truth can explain why certain kinds of beliefs are irrational.
7 Further, it is highly dubious that people even have quantifiable utilities. It is easy to quantify money and offer precise definitions for fair bets. But the value of money isn’t linear, and so decision theorists recognize that we shouldn’t always try to maximize our expected financial returns. Utilities act like a substitute for money that justifies our prescription to optimize: we can assume that the value of utility is linear. If our desires are not quantifiable, if we cannot lay a metric over degrees of strength of desires, then we can’t even begin to maximize utilities.
Other philosophers have independently given great attention to the way aims can
help explain norms. In order to provide such an explanation, we must first work
out what aims really are. Wedgwood [37] interprets the aim of truth as consisting
in the fact that it is a necessary condition for an attitude to be a belief that it be
a representational mental state that is correct iff true. Shah and Velleman [30] offer
an alternative view, according to which the concept belief incorporates the idea that
beliefs are correct when and only when true.
However we choose to cash out the sense in which belief aims at truth, it should
secure us the fact that assignments which are formed in ways that are transparently
inclined towards falsity are irrational. Insofar as beliefs must be aimed at truth in
order to be beliefs, beliefs that are formed for reasons unrelated to truth are unlikely
to achieve that aim. This suggests that at least some kinds of epistemic normativity
can be understood by analogy to means-end practical normativity. The end, in this
sense, need not be the end of the believer in general, but rather of the belief (or of the
believer qua believer). One norm that might be explained in this fashion is the norm
against believing in inconsistent propositions. Inconsistent propositions are sure to
be false, and thus belief in them automatically fails at meeting its aims.
Joyce proposed that just as beliefs have a criterion of success that makes them
successful when true, probability assignments have a criterion of success that makes
them successful to the extent that they are accurate. Unlike with beliefs, success isn’t
categorical. It comes in degrees. The more accurate a probability assignment, the
more successful it is. A probability assignment in a true proposition is more accurate
the higher the probability is, and a probability assignment to a false proposition is
more accurate the lower it is.
The Aim Characterization: A probability assignment is a graduated
credal state that aims at accuracy.
Joyce hopes that, just as the criterion of success for belief can explain some of
the norms that govern belief, the criterion of success for probability assignments can
explain Additivity. On Joyce’s proposal, the nature of probability assignments incor-
porates the facts about accuracy. Thus, we can say that Additivity forbids holding
non-additive credal states whose criterion of success is accuracy. The explanation of
this norm is that non-additive credal states fail to meet their aims as well as alterna-
tive additive assignments would. This proposal is a definite improvement over the last.
It allows that we might have probability assignments that don’t have any connection
with a decision procedure, and that those states might be subject to Additivity even
so.
The Aim Requirement: Additivity consists in an obligation not to have
non-additive graduated credal states whose constitutive aim is accuracy.
While I would agree that accuracy is plausibly a constitutive aim of probabil-
ity assignments, understood as a requirement on the applicability of our concept of
judgment-about-probability, I do not agree that this fact can be used to explain Ad-
ditivity. The basic problem is that probability assignments aim at accuracy in a way
that is too vague to explain Additivity. In order to use the accuracy aim to explain
Additivity, we need some way of measuring accuracy. While Joyce recognizes the pos-
sibility that we don’t aim at accuracy as measured in any particular way, he thinks
that we can explain Additivity supervaluationally, by looking at all possible measures.
I will introduce several measures of accuracy and show how they might be em-
ployed in an explanation of Additivity. I will also raise doubts about whether we do
constitutively aim at accuracy as measured in any of these specific ways. Finally, I
will critique Joyce’s supervaluational argument.
3.4.1 Measures and Disagreement
Let the correct value of a probability assignment be 1 for assignments in true propo-
sitions and 0 for assignments in false propositions. The simplest way to measure the
accuracy of a probability assignment is to calculate the distance of each assignment from the
correct value. Call this the Difference score. A higher value on the Difference score, as on
all of the other measures we’ll examine, indicates greater inaccuracy. On the Difference
score, inaccuracy increases linearly with the distance from the correct value and
the inaccuracy of a set of assignments is the sum (or, alternatively, the average) of the
inaccuracies of the individual assignments. While the Difference score is very simple
and very natural, it has been widely rejected. It cannot provide an explanation for
Additivity.
Another simple way of measuring accuracy is with the Brier score, inspired by
the work of meteorologist Glenn Brier [3]. This measures the inaccuracy of a probability
assignment as the square of its difference from the correct value. Hence, the inaccuracy
of an assignment doesn’t increase linearly with the distance from the correct value. The
inaccuracy added by moving from .8 to .7 in a true proposition is less than the
inaccuracy added by moving from .3 to .2. The Brier score is relatively
simple and has a number of nice properties; hence it is the favorite contender among
inaccuracy measures for probabilists.
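The comparison between the two moves can be spelled out as a short piece of arithmetic. For a true proposition the correct value is 1, so the Brier inaccuracy of an assignment p is (1 − p)². The inaccuracy added by each move is then:

$$(1-0.7)^2 - (1-0.8)^2 = 0.09 - 0.04 = 0.05$$

$$(1-0.2)^2 - (1-0.3)^2 = 0.64 - 0.49 = 0.15$$

On the Difference score, by contrast, both moves add exactly 0.1, since inaccuracy there grows linearly with distance from the correct value.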
There are many other measures. One other measure is the Spherical score, which
bears mention as an alternative to the Brier score. According to the Spherical score,
the inaccuracy of a particular assignment is given by
$$1 - \frac{|(1 - c) - p|}{\sqrt{p^2 + (1 - p)^2}}$$
where c is the correct value, and p is the assignment.
Sample Accuracy Scores

Assignment        Contribution to Inaccuracy
(to a Falsity)    Difference    Brier    Spherical
.3                .3            .09      .08
.4                .4            .16      .168
.5                .5            .25      .29
.55               .55           .3       .366
.7                .7            .49      .606
.8                .8            .64      .75
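The three measures are simple enough to write down directly. The sketch below is an illustration rather than anything drawn from the literature cited: the function names are hypothetical, and it assumes that the Spherical score has the form 1 − |(1 − c) − p| / √(p² + (1 − p)²), with c the correct value (1 for a truth, 0 for a falsity) and p the assignment.

```python
import math


def difference(p, c):
    """Difference score: linear distance between assignment p and correct value c."""
    return abs(c - p)


def brier(p, c):
    """Brier score: squared distance between assignment p and correct value c."""
    return (c - p) ** 2


def spherical(p, c):
    """Spherical score, in the form assumed in the lead-in above."""
    return 1 - abs((1 - c) - p) / math.sqrt(p ** 2 + (1 - p) ** 2)


def score_table(assignments, c=0):
    """Rows of (assignment, Difference, Brier, Spherical) for a common correct value c."""
    return [(p, difference(p, c), brier(p, c), spherical(p, c)) for p in assignments]


if __name__ == "__main__":
    # Assignments to a falsity (c = 0), as in the sample table.
    for p, d, b, s in score_table([0.3, 0.4, 0.5, 0.55, 0.7, 0.8]):
        print(f"{p:<5} {d:.3f}  {b:.4f}  {s:.3f}")
```

Under that assumption the computed values agree with the sample table up to rounding or truncation in the last digit.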
These measures all agree about when one probability assignment in one propo-
sition is more accurate than another probability assignment in another proposition:
they always say that assignments that get closer to the correct value are more
accurate. But they differ on how to quantify the relative accuracy of different assignments,
and this means that they will disagree about how accurate different collections of as-
signments are overall.
Quantifying inaccuracy allows us to compare the relative accuracy of collections of
assignments. For instance, assignments of .5 to two true propositions will be deemed
collectively more accurate than an assignment of .4 to one proposition and .55 to
another by the Brier score but not by the Spherical score.
How do we select one of these measures as the measure of accuracy that our
assignments constitutively aim at? There are two plausible routes. One way is to investigate
our intuitions about how to regard the relative degrees of success of different single
assignments to a proposition. How much worse is it to assign .7 than .8 in a true
proposition? Is it greater than the difference in assigning .3 and .2? The second way
is to investigate our intuitions about different collections of probability assignments.
Since different measures favor some sets of assignments over others, we can try to
generate intuitions about the relative success of probability assignments in a way
that will distinguish which scores are reasonable.
I submit that our probability assignments aim at accuracy in no precise way.
Neither of the above methods provides much of an intuitive grip on the problem.
Nothing pushes me to favor one of these scoring rules over the others.
Instead, I think that the situation is a bit like this: when you aim a dart at a dart
board, you aim to get closer to the center, but in doing so, there is nothing in your
action that indicates any difference in success beyond the ranking. Your aiming at
the center need not require any implicit comparison between the difference of getting
one inch to the left and two inches to the left and four inches to the left and five
inches to the left. Every point is preferable to any point further away, but there is no
way of quantifying and comparing this preference (unless you explicitly form such a
preference). So, I think, it is with accuracy. We aim at accuracy, but not in a way
that assumes a measure of accuracy.
3.4.2 Wedgwood’s Inference to the Best Explanation
If we could settle on one measure of accuracy, then, if it were the right kind of measure,
we might have a suitable explanation of Additivity. We’ll say that one assignment
dominates another with respect to a scoring rule if it does better no matter how things
turn out. There are a number of measures where non-additive assignments are always
dominated by additive assignments and additive assignments are not dominated by
non-additive assignments. The Brier score and the Spherical score are both such
measures. The Difference score is not. Thus, given a non-additive assignment, there
is always an additive assignment that better reaches the constitutive goals of the
attitude.
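The dominance claim can be illustrated with a small search. The sketch below is only an illustration under simplifying assumptions: it considers a single proposition A and its negation, treats an assignment as a pair (probability of A, probability of not-A), reads "does better no matter how things turn out" as strictly lower total inaccuracy in both possible worlds, and searches additive assignments on a coarse grid. The function names and the example assignment (.6, .6) are hypothetical.

```python
WORLDS = [(1, 0), (0, 1)]  # (correct value for A, correct value for not-A)


def brier(p, c):
    """Squared distance between assignment p and correct value c."""
    return (c - p) ** 2


def difference(p, c):
    """Linear distance between assignment p and correct value c."""
    return abs(c - p)


def total(score, assignment, world):
    """Summed inaccuracy of an assignment to (A, not-A) in a given world."""
    return sum(score(p, c) for p, c in zip(assignment, world))


def dominates(score, x, y, eps=1e-12):
    """True if x is strictly less inaccurate than y in every world."""
    return all(total(score, x, w) < total(score, y, w) - eps for w in WORLDS)


def additive_dominators(score, y, steps=100):
    """Additive assignments (a, 1 - a) on a grid that dominate y under `score`."""
    grid = [i / steps for i in range(steps + 1)]
    return [(a, 1 - a) for a in grid if dominates(score, (a, 1 - a), y)]


if __name__ == "__main__":
    non_additive = (0.6, 0.6)  # sums to 1.2
    print(additive_dominators(brier, non_additive))
    print(additive_dominators(difference, non_additive))
```

On the Brier score the additive assignment (.5, .5) does better than (.6, .6) whichever way A turns out. On the Difference score the search comes back empty, which fits the observation that the Difference score is not among the measures that support this argument.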
But to employ this argument, we’ve got to settle on one measure. I’ve suggested
that there is no single right measure that captures how we aim at accuracy, and so
I don’t think that this argument will work. While we may have vague ideas about
which assignments are more accurate than others, they don’t serve to distinguish
between the Difference, Brier, and Spherical scores.
We can admit this while recognizing that among these, the Brier score may make
the most sense as a measure of accuracy. If we don’t constitutively aim at accuracy
under any precise measure, then we don’t aim to maximize accuracy according to the
Brier score.
The Brier score has a lot of theoretical advantages. It is simple. It is natural. If
we interpret probability assignments as a space, then the Brier score corresponds to
distance in that space from the truth. It is proper, which means that the expected
accuracy of a probability assignment, relative to that assignment, is higher for that
assignment than for any other, and it can be used to justify the formal norms. For
these reasons, it has found special favor among philosophers as the best bet among
viable contenders for a precise accuracy measure.
Ralph Wedgwood [38] thinks that we can establish the Brier score as the right
measure by means of an inference to the best explanation.8 The best explanation
of the formal constraints is that some score favors probability assignments that obey
them, and the number of nice properties known to be associated with the Brier score
strongly suggests that it is the correct accuracy measure.
How good this argument is depends on both the plausibility that we really can
treat the Brier score as a constitutive aim and on whether any other better explana-
tion is available. I am inclined to think that this argument will fail on both counts.
First, the manifest vagueness of our intuitions about accuracy seems to me to preclude
any precise measure of accuracy from figuring in the constitutive aim of probability
assignments. I think Shah and Velleman have the right interpretation of constitutive
8 Wedgwood doesn’t think of the Brier score as a measure of accuracy. Rather, he thinks that it is a measure of “correctness”. I am not sure how much of a difference this makes – though he himself doesn’t present it as hugely significant.
aims, and this interpretation suggests that the constitutive aim of an attitude should
be relatively transparent to anyone with a concept of that attitude. The aim of accu-
racy is transparent, but much as I examine my concept of a probability assignment,
I cannot find any particular indication that one measure is right.
Second, there are alternative and equally good explanations available. The argu-
ment works best if we can assume that there is one fundamental norm that explains
all others. Wedgwood thinks that the fact that beliefs are correct iff true is the
fundamental norm that explains all norms of belief. This may have inclined him to think
that the aim of accuracy is fundamental for probability assignments. However, we
can also postulate a variety of different fundamental norms. To put it in Wedgwood’s
terminology: it may be that inaccuracy is one variety of incorrectness among others.
Violation of the formal norms may be its own kind of incorrectness. Before turning
to the explanation that I favor, I will address the argument that Joyce develops that
doesn’t require that we actually aim at accuracy in any particular way.
3.4.3 Joyce’s Supervaluationist Argument
Joyce’s argument does not rely on any single measure of accuracy being correct.9
Instead, he tries to show that there are properties that any reasonable precise measure
should have that together entail that non-additive states are dominated by additive
states according to that measure.
Given this claim we can see that if there were a specific measure of accuracy
that our probability assignments aim at, then even if we can’t figure out which one
it is, the fact that we can recognize that it has properties that entail that additive
probabilities dominate other probabilities means that Additivity is true. But if there
isn’t, the argument gets a bit more complicated.
Here is what Joyce says about the possibility of vagueness:
9 Though he has expressed some willingness to allow that there may be a correct measure of accuracy, which might vary with context.
In developing these ideas, I will speak as if accuracy can be precisely quan-
tified. This may be unrealistic, since the concept of accuracy for [proba-
bility assignments] may simply be too vague to admit of sharp numerical
quantification. Even if this is so, however, it is still useful to pretend that
it can be so characterized because this lets us take a “supervaluationist”
approach to its vagueness. The supervaluationist idea is that one can un-
derstand a vague concept by looking at all the ways in which it could be
made precise, and treating all facts about what properties all of its “pre-
cisifications” share as facts about the concept itself... I am going to be
interested... in the properties that all reasonable “precisified” measures
of gradational accuracy share. [18, 590]
I have a number of worries about the supervaluationist approach that Joyce’s
argument relies upon.
First of all, the supervaluationist inference that allows us to project properties
from the precisifications to the vague concept itself is highly questionable. I am
not sure what would license it. It does not seem to be obviously psychologically or
logically necessary that we should ascribe to a concept all of the properties shared by
all of its available precisifications. We could recognize that every viable precisification
has a property that we refuse to apply to the vague concept.
We might try to justify the inference as follows: the supervaluationist inference is
reasonable because a vague concept places constraints on its possible precisifications.
Those properties shared by all possible precisifications reflect constraints derived from
the vague concept itself. Thus, by looking at what each precisification shares, we can
get some insight into the constraints of the concept itself.
This line of thought is problematic. By virtue of being precise, the precisifications
may introduce aspects that are not implied by the vague concept itself. All precisifi-
cations are precise. A vague concept is not itself precise. The precision is an artifact
of the precisification process and does not result from constraints imposed by the
vague concept. There may be other properties that all precisifications share because
they are precisifications, properties that the vague concept doesn’t itself impose upon
them and that don’t reflect anything about the vague concept’s nature.
Aaron Bronfman [4] has developed a related argument against Joyce’s view. Bronfman
notes that Joyce’s argument doesn’t show that, for each non-additive assignment, there is
a particular additive assignment that dominates it under every reasonable accuracy
measure. It may be that each accuracy measure favors its own distinct additive
assignments over any given non-additive assignment. Here is Joyce’s formulation of the
worry, presented with an analogy:
Suppose ethicists and psychologists somehow decide that there are just
two plausible theories of human flourishing, both of which make geograph-
ical location central to well-being. Suppose also that, on both accounts,
it turns out that for every city in the U.S. there is an Australian city with
the property that a person living in the former would be better off living
in the latter. The first account might say that Bostonians would be better
off living in Sydney, while the second says they would do better living in
Coober Pedy. Does it follow that any individual Bostonian will be better
off living in Australia? It surely would follow if both theories said that
Bostonians will be better off living in Sydney. But, if the first theory ranks
Sydney ≺ Boston ≺ Coober Pedy, and the second ranks Coober Pedy ≺
Boston ≺ Sydney, then we cannot definitively conclude that the person
will be better off in Sydney, nor that she will be better off in Coober Pedy.
So, while both theories say that a Bostonian would be better off living
somewhere or other in Australia, it seems incorrect to conclude that she
will be better off in Australia per se because the theories disagree about
which places in Australia would make her better off. [19, 289]
Joyce responds that it is still problematic to accept a probability assignment that is
dominated under every reasonable precise measure. It makes some sense to say that this kind
of dominance is indicative of a problem – it is a bit hard to believe that rational
probability assignments would be dominated in this way – but it is hard to make
the case that it explains the norm. It is far from clear that any Bostonian in the
scenario that Joyce describes who wants to maximize their well-being has a reason to
move. If they are confident that there is no fact of the matter which theory of human
flourishing is correct, then there is no fact of the matter whether they will be better
off moving, no matter where they move. It is implausible that if there is no fact of the
matter whether they are worse off by staying put, there is anything irrational about
doing so.
Second, it is open to challenge whether the accuracy aim of probability assignments is
actually vague. Here we must be careful to distinguish between genuine vagueness
and a lack of precision. The constitutive aim of probability assignments could be
perfectly determinate and exhausted by the fact that probability assignments that
come closer to the truth are preferable. There need be no way of measuring and
quantifying degrees of accuracy. This doesn’t make the aim of accuracy vague. There
may be no way of measuring and quantifying degrees of tastiness, but that doesn’t
mean that judges in a chili cook-off lack criteria to select the winner. They aim to
choose the tastiest chili. The rules of soccer tell you who won a single game. But
they don’t tell you whether it is better to win 4-2 or 2-1. Nor do they tell you which
team did better in a series of games. The rules of soccer are not vague on this score;
it lies beyond their purview. Similarly, it may be that our aim of accuracy specifically
tells us to prefer single probability assignments that are closer to the correct value
and does nothing to measure degrees of accuracy or how to measure collections of
assignments. This needn’t make the aim a vague aim and so it wouldn’t necessarily
invite the supervaluationist inference, even if the supervaluational inference was safe
on the assumption of vagueness.
Third, we may worry about the kinds of considerations that Joyce uses in support
of the properties he thinks a scoring rule should have. The worry is that Joyce as-
sumes that accuracy does all of the work in settling what probability assignments are
allowable. Consider the kinds of properties that Joyce requires. The first iteration
of Joyce’s argument relied on the weak convexity axiom. The weak convexity axiom
stated that an adequate accuracy measure has to be conservative in a certain way:
given two sets of assignments that are regarded as equally accurate, any mixture of
those assignments (arrived at by summing the products of the respective assignments
and weights that add up to 1) must be more accurate than those assignments them-
selves. Maher [26] notes that this assumption rules out the Difference score, and for
this reason he rejects it as a constraint on a reasonable probability assignment. I’m
inclined to agree with Joyce that conservativeness is a virtue of a probability assign-
ment, but I’m not tempted to attribute this to considerations of accuracy. The fact
that we judge conservative assignments better may be related to other aspects of the
nature of probability assignments. In essence, though there might be reasons to favor
conservative assignments over non-conservative assignments, they need not arise from
accuracy considerations.
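The way weak convexity separates the Brier score from the Difference score can be seen in a small example. The sketch below is an illustration only: it reads the axiom as requiring that an even mixture of two equally inaccurate assignments be strictly less inaccurate than either, it fixes a world in which A is true and not-A is false, and the function names and the two sample assignments are hypothetical.

```python
import math


def brier_total(assignment, world):
    """Summed squared distance from the correct values in `world`."""
    return sum((c - p) ** 2 for p, c in zip(assignment, world))


def difference_total(assignment, world):
    """Summed linear distance from the correct values in `world`."""
    return sum(abs(c - p) for p, c in zip(assignment, world))


def mixture(b1, b2, weight=0.5):
    """Weighted mixture of two assignments; the weights sum to 1."""
    return tuple(weight * p + (1 - weight) * q for p, q in zip(b1, b2))


def mixture_strictly_better(total, b1, b2, world, weight=0.5, tol=1e-9):
    """For two equally inaccurate assignments, is their mixture strictly less inaccurate?"""
    s1, s2 = total(b1, world), total(b2, world)
    if not math.isclose(s1, s2, abs_tol=tol):
        raise ValueError("the two assignments must be equally inaccurate")
    return total(mixture(b1, b2, weight), world) < s1 - tol


if __name__ == "__main__":
    world = (1, 0)             # A is true, not-A is false
    b1, b2 = (0.7, 0.4), (0.6, 0.3)
    print(mixture_strictly_better(brier_total, b1, b2, world))
    print(mixture_strictly_better(difference_total, b1, b2, world))
```

The two sample assignments are equally inaccurate on both measures. On the Brier score their mixture (.65, .35) comes out strictly better; on the Difference score it comes out exactly as inaccurate as the originals, which is how a strict reading of the axiom excludes the Difference score.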
In Joyce’s updated [19] version of the argument, he offers a new set of constraints10.
Among those constraints are two (offered for separate proofs): Propriety and Minimal
Coherence. A measure is proper if no probability assignment that obeys the formal
norms recognizes another probability assignment as having a greater expected accu-
racy. The problem with improper assignments, Joyce suggests, is that they “under-
mine their own adoption and use.” As soon as you’ve adopted a modest assignment,
10 It is also worth noting that he moves from discussing accuracy to discussing epistemic utility.
you’ve got reason to change to another assignment which you regard as being more
accurate. This fact is the basis of his support of the criterion of propriety.
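The contrast between a proper and an improper measure can be made concrete. The sketch below is a hedged illustration, not Joyce's own formulation: it considers a single proposition, computes the expected inaccuracy of each candidate assignment by the lights of a given assignment, and searches a coarse grid for the candidate with the lowest expected inaccuracy. The function names and the sample assignment of .7 are hypothetical.

```python
def brier(p, c):
    """Squared distance between assignment p and correct value c."""
    return (c - p) ** 2


def difference(p, c):
    """Linear distance between assignment p and correct value c."""
    return abs(c - p)


def expected_inaccuracy(score, credence, candidate):
    """Expected inaccuracy of `candidate`, weighting truth and falsity by `credence`."""
    return credence * score(candidate, 1) + (1 - credence) * score(candidate, 0)


def self_recommended(score, credence, steps=100):
    """Grid candidate that `credence` expects to be least inaccurate under `score`."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda q: expected_inaccuracy(score, credence, q))


if __name__ == "__main__":
    print(self_recommended(brier, 0.7))
    print(self_recommended(difference, 0.7))
```

Under the Brier score an assignment of .7 expects itself to do best, so it gives its holder no reason to move. Under the Difference score the same assignment expects the extreme assignment of 1 to do better, which is the kind of self-undermining that the propriety criterion is meant to exclude.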
It is open to question whether you really should switch, even if you aim at accuracy
and you recognize another probability assignment to be more accurate. There may
be more risk involved in switching, for instance. Or else accuracy might not be all
that we aim at. For instance, you might not aim to adopt probability assignments
that are as accurate as they can be, but only as accurate as your evidence licenses
you to be. If so, then it would not be a mark against the reasonableness of the Difference
score that it is improper, since changing the assignment to make it more accurate
might make it more accurate than it is licensed to be by the evidence available.
The issues I’ve taken with both propriety and weak convexity have a similar
source. Joyce assumes that the accuracy measure explains the comparative virtues
of different probability assignments when it is compatible with external constraints
that achieve the same ends. It’s hard to make them out as results derived solely from
considerations of accuracy, so they need not be regarded as requirements for any
genuine precisification of accuracy. In this way, Joyce’s argument shares something
with Wedgwood’s. Joyce seems to be assuming that accuracy must do the work
of deciding what probability assignments are reasonable. So any judgments we have
about reasonableness can be projected back onto accuracy measures. Those measures
in turn can be used to explain the formal norms. This kind of inference to the best
explanation, however, works only to the extent that accuracy measures really can provide
the best explanation. In order to decide whether this is the case, we must consider
how other alternative proposals fare in explaining the reasonableness of the relevant
probability assignments.
Finally, it is worth considering the distinction between the properties of measures
it makes sense for the concept to be vague over and the properties of measures that
it makes sense to adopt as precisifications in practice. It is surely vague when a boy
becomes a man. For expediency, we may say that the reasonable precisifications
include his 18th or his 25th birthday. The 16th hour of the 1122nd Tuesday of his life
is not an expedient point. If we had to settle on a date, the first two options are
better than the third. But a concept which is vague over the latter precisification is
reasonable, because there is a difference between what a concept is vague over and
what precisifications are reasonable. If the supervaluational argument is to have much
force, it must be because of what metrics our aim of accuracy is vague over, and not
what precisifications would be reasonable to adopt. Given Joyce’s interest in things
like propriety, he seems mostly concerned with the latter.
In summary, then, we need more than just a basic aim at accuracy; we need
precisely specified accuracy measures to make the argument work. But it doesn’t
seem that we aim at accuracy in any particular way, and Joyce’s supervaluationist
gambit is highly questionable.
3.5 The Constitutive Characterization
So far we have considered three proposals. On the Bare Characterization, probability
assignments are just graduated credal states, and Additivity prohibits us from
having any non-additive graduated credal states. On the Pragmatic Characterization,
probability assignments have an essential tie to decision making, and thus
Additivity prohibits us from having non-additive graduated credal states that are
used to select actions on the basis of expected utility. On the Aim Characterization,
probability assignments aim to maximize accuracy, and thus Additivity prohibits
us from having non-additive graduated credal states that aim at accuracy (in the
way that our states do). I think that none of these proposals is able to
explain Additivity in a plausible manner.
I’m going to offer a distinct account that makes two separate claims. The first is that
Additivity is a rule that partly constitutes the practice of assigning probabilities.
The second is that we become subject to the rules governing that practice by engaging
in it, through the implicit intention to participate that comes with
applying the concept of probability. As I suggested in the
introduction, we might consider what it is to make a judgment about probability in
terms of engagement in a practice. In the second chapter, I suggested that we need
to postulate a concept of probability. Here, I combine the two ideas: the concept of
probability is a concept whose application constitutes a move within a practice. Since
probability is understood in terms of that practice, anyone who assigns a probability
should regard themselves as making a move in that practice and hence being subject
to its norms. I will call this the ‘Constitutive Characterization’.
The Constitutive Characterization: It is partly constitutive of proba-
bility assignments that they are governed by normative formal constraints
including Additivity, Non-negativity, and Normality.
The Formal Requirement: Additivity consists in an obligation not to
have non-additive credal states that are governed by a normative require-
ment to be additive.
This may sound trivial. I take this to be an advantage of the view, for it explains
why the truth of Additivity is transparent and undeniable. In the next few sections, I
will explain how we should understand the way in which rules constitute our practices
and why the norm of Additivity holds any sway over us. I will conclude by discussing
the status of other norms.
3.5.1 Additivity as a Constitutive Rule of a Practice
When you play a game of chess, there are certain things you’re not supposed to do.
For instance, you’re not supposed to move a pawn back toward your own side. You’re
not supposed to do so, not because it is strategically unsound or unsportsmanlike,
but because it is against the rules. Moving a pawn backwards is forbidden by the
rules of chess and playing chess is an activity characterized in part by the rules you
must follow, or at least attempt to follow, while playing it. If you are to play chess,
you must make some attempt to play by the rules. In your game you may agree with
your opponent that moving a pawn backwards is allowed. There may be no reason
not to come up with such an agreement. Nor need the agreement be verbal; you may
make a backward move, your opponent may recognize this and not object, and the
game may proceed. But insofar as you and your opponent do not try to restrict all
moves to the legal moves of chess, you’re no longer playing chess. The norms that
govern the allowable moves in chess are norms that are constitutive of the practice of
playing chess, in the sense that an implicit intention to obey the rules is necessary to
play chess. There is nothing wrong with playing other games, but if you play chess,
you must intend to play by the rules.
Chess is a game with well-defined rules that we can choose to participate in or
refrain from participating in. The goal of the game is to maneuver one’s pieces into a
checkmate of the opponent’s king. The rules govern how we are allowed to proceed.
Probability assignments may be viewed in an analogous way. The practice of assigning
probabilities can be characterized by its goal or its use and by the constraints
imposed by the practice. The goal of chess is to get a checkmate. The constraints
are the rules governing allowable moves. Just as the constraints of chess help give
structure to the game and make it a worthwhile pastime, the constraints on assigning
probabilities make them useful.
The goal of assigning probabilities could be conceived of in either of two ways. In
the first way, the goal is accuracy. Just as Joyce suggested, it is part of what it is to
be a probability assignment to be aimed at coming as close as possible to the truth.
In the first chapter, I proposed a different goal – I suggested that we use probabilities
to measure the relative strengths of our sources of evidence. Given how closely these
align in practice, I don’t think that there needs to be a fact about which is the real goal.
In each case, we will want to assign higher probabilities to propositions for which we
have lots of evidence, and lower probabilities to propositions for which we have less
evidence.
The constraints on our assignments are just Additivity, Non-negativity, and Normality.
That the practice of assigning probabilities is characterized by these particular rules
is what distinguishes it from other practices. There is an infinite variety of other practices governed by
other norms. We could have sought to assign values with the goal of accuracy or
reflecting our evidence in many different ways. These alternative practices need not
have been any worse, but they would have been different. Take Normality and Non-
negativity. Why is it that probability assignments should lie between 0 and 1, rather
than between 0 and 10, 0 and 12, or -5 and 23? There is no deep reason. However, if
we started using a scale between -5 and 23, we would no longer be assigning probabil-
ities. Probabilities are supposed to range between 0 and 1 for no better reason than
that restriction is built into what it is to be a probability. We choose to calculate
probabilities because the scale involved is especially convenient. We are obliged to
assign probabilities between 0 and 1, rather than -5 and 23, because that is what the
practice that we have chosen to adopt requires.
Additivity has a similar explanation. We are obliged to assign additive values
as probabilities because the restriction to assigning additive values is partly
constitutive of the practice of assigning probabilities. Insofar as you’re not restricted
to assigning additive probabilities, you’re not engaging in the practice of assigning
probabilities. There need be nothing wrong with what you are doing. It is perfectly
rational and fine to assign non-additive values. Just as it’s fine to play checkers with
chess pieces, it is fine to have other non-additive kinds of graduated credal states. But
if you’re assigning probabilities, then you’re bound by the norms that govern them.
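The three formal constraints can be made concrete with a small sketch. It is only an illustration under assumptions that go beyond the text: propositions are modeled as sets of possible worlds, and the function name `violated_constraints` is hypothetical. On the present account, a non-empty result does not mark an assignment as irrational; it only marks it as something other than a probability assignment.

```python
from itertools import combinations
from math import isclose


def violated_constraints(assignment, worlds):
    """Return the names of the formal constraints that `assignment` violates.

    `assignment` maps propositions (frozensets of worlds) to numbers.
    The world-set representation is an assumption made for this sketch.
    """
    violations = set()

    # Non-negativity: no proposition receives a value below 0.
    if any(value < 0 for value in assignment.values()):
        violations.add("Non-negativity")

    # Normality: the proposition true in every world receives the value 1.
    tautology = frozenset(worlds)
    if tautology in assignment and not isclose(assignment[tautology], 1.0):
        violations.add("Normality")

    # Additivity: the value of a disjunction of incompatible propositions
    # is the sum of the values of its disjuncts.
    for a, b in combinations(assignment, 2):
        union = a | b
        if a.isdisjoint(b) and union in assignment:
            total = assignment[a] + assignment[b]
            if not isclose(assignment[union], total, abs_tol=1e-9):
                violations.add("Additivity")

    return violations


rain, dry = frozenset({"rain"}), frozenset({"dry"})
worlds = {"rain", "dry"}
additive = {rain: 0.3, dry: 0.7, rain | dry: 1.0}
non_additive = {rain: 0.3, dry: 0.3, rain | dry: 1.0}

print(violated_constraints(additive, worlds))      # no violations
print(violated_constraints(non_additive, worlds))  # Additivity is violated
```

An assignment on a scale from -5 to 23 would fail the Normality and Non-negativity checks in the same way, which matches the point that such a scale belongs to a different practice.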
3.5.2 Reasons for Engaging in the Practice
The explanation of why probability assignments are governed by the rule of Additivity
is shallow. The explanation is simply that this is part of what it is to be
a probability assignment. Probability assignments are moves in a certain practice.
That practice is governed by certain rules. Those rules require that we assign additive
values.
Simply saying this, however, doesn’t explain why Additivity holds any sway over
us. Simply saying that moving a pawn backwards is against the rules of chess doesn’t
explain why someone shouldn’t move their piece backwards while playing chess. Such
a move might disqualify the game as chess, but why does that matter at all? In other
words, why play chess, and why, in a game of chess, are we subject to the rules of
chess? Two answers can be given. The immediate answer is that we are subject to
the norms governing chess because we intend to play chess. Insofar as we intend to
play chess, we must try to follow its rules. The less immediate answer is that there
are all sorts of pragmatic reasons for playing chess. It is intellectually stimulating.
It is fun. It is something one can enjoy with another person. These are reasons to
subject oneself to the rules of chess.
Similarly, we can offer two related explanations for why our assignments should
conform to the rules governing the practice. The immediate explanation of why we are
normatively bound by the constraints of this practice is that we intend to engage in
this practice. Participating in a practice that is governed by certain norms requires the
intention to conform one’s activities to those norms. Normative practices confer their
norms on us through our intentions to participate in them. The intention to engage
in a normative practice involves an intention to conform our actions to the norms,
and our intentions are what explain the resulting obligation that we come to have to
conform our actions (or probability assignments) to the norms. The intention need
only be implicit. When we offer an acquaintance a casual remark, we (often) intend
that our utterance conform to culturally standard rules of grammar. This intention
isn’t explicit or deliberately formed. But it is clearly still an intention. Similarly, our
intention to conform our assignments to the norms governing probability is implicit
in the act of assigning a probability. It might be thought that this robs the norms
of something. If norms cannot be intentionally11 flouted, because they are built into
conditions of engaging in the practice, are they still norms? It doesn’t matter too
much what we call them, as long as we understand them. If the formal norms turn
out to be constitutive rules rather than genuine norms, we can still explain the fact
that we should assign probabilities that obey them, insofar as we assign probabilities
at all.
A less immediate reason is that there are good reasons to engage in the practice
of assigning probabilities. Probabilities are immensely useful, both epistemically and
practically. We can use probability assignments to come to hold better justified
beliefs and to select actions that are most likely to provide us with the most of what
we want. Engaging in the practice of assigning probabilities provides one with a
simple way to organize and compare one’s diverse bits of evidence and represent the
relative strengths of that evidence in a way that is useful for making decisions. By
packaging evidence into probabilities, we make it possible to select actions on the
basis of expected utilities. Since maximizing one’s expected utilities is often a good
way to get what one wants (especially in the long run), it is wise to engage in a
practice that makes a useful decision procedure available.
11 While I am sympathetic to the idea that the norms cannot be intentionally flouted, I do think that they can be unintentionally disobeyed.
There is a variety of good reasons we have to engage in the practice. It has proven
extremely useful in a number of the domains in which probabilities have been applied.
Additivity is a fundamental part of the practice and seems necessary to the uses to
which assignments of probability have been put. So we have good reason to engage
in the practice that subjects our assignments to Additivity. This is ultimately where
I think the Dutch book arguments become relevant. These arguments constitute
evidence of the pragmatic benefits of engaging in the practice of assigning probabilities.
They don’t explain the norm of Additivity, but they supply us with reasons to
engage in a practice for which Additivity is a norm. This allows the present account
to accommodate the Dutch book argument without making the explanation of the
norms wholly pragmatic. The reason why we are subject to these norms is
our intention to participate in a practice governed by them. This
may be regarded as a kind of pragmatic reason, but by divorcing the justification
from other desires, and internalizing the pragmatic ramifications, the account makes it a much
more tolerable one. In fact, I think that the ultimate explanation is quite similar to
the explanations sometimes given for the norms of belief in terms of its aims. So this
account allows us to make space for Dutch book arguments without giving them too
central a role.
3.5.3 The Status of Other Norms
The practice of assigning probabilities is in part constituted by a rule forbidding the
assignment of non-additive values. What else goes into characterizing the practice?
There are many other norms that have been suggested to govern probability assign-
ments. None of these other norms is built into the practice of probability in the
way that Additivity, Normality, and Non-negativity are. It would take a substantial
amount of space to catalog and discuss these alternative norms. I will return to the
subject briefly in the next chapter, when I discuss the significance of conditionalization,
but it is worth saying something briefly about what else could explain norms
of probability. There are two major sources for potential norms, one internal to the
practice and one external to the practice. The internal source of norms, apart from
the formal constraints, is the goal of assigning probabilities. There is also a variety of
possible external sources of norms.
In addition to the formal constraints required by the practice, the goal of the prac-
tice can help motivate some probabilistic norms. One instance of a norm that might
be derived from the goal of the practice is a norm to obey the Principle of Indifference,
which counsels us to assign equal probabilities to equally plausible propositions
for which we have equal evidence. This principle has been formulated in a variety of
different ways, and many problems with its bolder formulations have been uncovered.
Nevertheless, I think that there is a kernel of truth to it. The principle makes sense
as a requirement to have reasons for different probability assignments. If we take
reasons in favor of alternative propositions to be entirely equally weighty, it doesn’t
make sense to assign different probabilities. If we accept that the goal of our proba-
bility assignments is to represent the relative strengths of our evidence, then we can
see why we should assign equal probabilities to propositions that are backed by equal
reasons. Insofar as there is no reason to favor one proposition over another, we should
represent them as having equal evidential support. So it makes sense to regard this
principle, insofar as it is true, as a result of the goal. Unlike Normality, Non-negativity,
and Additivity, it isn’t a further constraint on the practice, but it is requisite for
meeting the goal.
One instance of a norm that arises from an external source (for an alternate view,
see [17]) is Lewis’s [23] Principal Principle. Loosely stated, the Principal Principle
says that one should assign probabilities in line with known objective chances. So if
one knows that a biased coin has a 2⁄3 chance of landing heads, one should assign a
probability of 2⁄3 that it will land heads on the next toss. I agree with Lewis in thinking
that the Principal Principle is integral to our understanding of chance. I think, therefore,
that the source of the Principal Principle is our concept of an objective chance.
Plausibly, our concept of an objective chance resembles in some way the concept of
an objective property whose recognition merits a specific probability assignment in response.
Insofar as one thinks that there is a property that
merits a specific probabilistic response, one has reason to adopt that response. This
is a response required by the concept; it doesn’t arise from the practice of assigning
probabilities itself.
The explanation that I have offered for Additivity is both simple and straightforward.
It makes sense of Additivity in a thoroughly non-mysterious fashion, without
giving up on the insights of past Bayesians who were attracted to pragmatic vindi-
cations of the norms. Their pragmatic stories have a place, but the place that they
have is mediated by our intentions to engage in a practice. They explain why those
intentions are not arbitrary, and those intentions explain why we are subject to Addi-
tivity. In the next chapter, I will take up the question of how probability assignments
with respect to different bodies of evidence ought to relate to each other. I will sug-
gest that one customary norm regarding their relation, conditionalization, should be
understood in terms of commitment preservation. This should help to demonstrate
the variety of different sources of the norms governing probability.
Chapter 4
The Authority of
Conditionalization
4.1 The Bayesian Procedure
We should only believe those propositions that are, given our evidence, rather likely
to be true. The ‘Bayesian procedure’, inspired by Thomas Bayes, is one way to
figure out how likely a proposition is on the basis of our evidence. At the
core of this procedure is conditionalization, a method for updating a probability
assignment with the inclusion of new evidence. To apply the procedure, we first
commit ourselves to one probability assignment based on some subset of our evidence,
and then conditionalize that assignment on the remainder of our evidence.
For Bayesian epistemologists, who accept the doctrine of degrees of belief, the
Bayesian procedure resembles the normal course of rational belief change. Since I am
skeptical of that doctrine, I will regard the Bayesian procedure as one among many
possible methods that we may employ to decide what to believe. I think that we are
under no rational obligation to go through the steps necessary to apply the method; we
are free to use other methods or heuristics when they prove more useful. If, instead of
using the Bayesian procedure we apply another method, justified by its expediency,
that produces results that differ from the results of the Bayesian procedure, it is
rational to assign probabilities that are in line with that other method.
I will assume all of this and focus on a residual question: other methods for
settling on probabilities may be more expedient, and so one may opt not to apply
the Bayesian procedure or bother to discern its dictates – but is it ever rationally
permissible to ignore those of its dictates of which one is aware? A person may get
away with failing to comply with a superior’s orders when they don’t actually receive
those orders. The superior may still have authority over their subordinate insofar as
the subordinate would be compelled to comply with any orders that they did receive.
Is the Bayesian procedure authoritative in the same way?
I will argue that the Bayesian procedure is not always authoritative. Sometimes
the thing to do is to ignore its dictates. My argument for this fact will depend upon
a particular analysis of the source of the procedure’s (limited) authority. I’ll pro-
pose that conditionalization makes sense, when it does, by virtue of the fact that
conditionalization is a way of being faithful to a special type of commitment. Be-
cause conditionalization is required to be faithful to a special type of commitment –
specifically commitments to the relative probabilities – this justification of condition-
alization undermines the authority of the Bayesian procedure. If conditionalization is
only necessary to be faithful to a certain kind of commitment, then when we haven’t
undertaken that kind of commitment we are not obligated to heed the dictates of the
Bayesian procedure.
The plan for the chapter is as follows. First, I will explain the Bayesian procedure
in more detail. The procedure requires substantive inputs – commitments to proba-
bility assignments given bodies of evidence – which it doesn’t supply any guidance in
selecting. This may seem to make the procedure useless, but I will instead suggest
that it gives us some insight into the real significance of the procedure. In the second
section, I will give an analysis of that significance. Often, the Bayesian procedure de-
livers probability assignments that one should adopt as one’s own in order to respect
one’s present commitments. In the third section, I will present a case that raises
concerns about the authority of the procedure. In the final section, I will discuss
what this case shows about the limits of the procedure’s authority.
4.1.1 The Details
The Bayesian procedure is a way of settling on a probability to assign to a given
proposition (the ‘target’ proposition). The procedure works by updating a probability
assignment based on a subset of our total evidence on the remainder of our evidence.
It requires two inputs. First, it requires a division of our evidence into two parts: the
evidence that we base the input probability assignment on, and the evidence that
we use to update it. We must be able to encapsulate the latter bit of evidence as a
proposition (the ‘evidence’ proposition). The procedure also requires that we start
with an assignment of probabilities to the members of an algebra of propositions that
includes both the target proposition and the evidence proposition.1
The input probability assignment is customarily referred to as the ‘prior proba-
bility assignment’. In order for the procedure to make sense, it is critical that the
prior probability assignment represent something about our commitments to what
the probability should be in light of the relevant restricted body of evidence. This
commitment needn’t involve a belief about any objective measures of evidence: it
may instead simply reflect a personal commitment that we have undertaken. Though
the procedure relies on an input probability assignment in order to provide any guid-
ance on what probabilities to assign, it isn’t useless. Often, the prior probability
assignment is based on a body of evidence whose probabilistic significance we are in
a better position to assess. The value of the Bayesian procedure is that it provides a
1 Strictly speaking, we don’t need to assign probabilities to every member of the algebra, just to the evidence proposition and the conjunction of the target proposition and the evidence proposition.
precise way in which to accommodate the remainder of our evidence. If we know what
to think about the probability of the members of our algebra given that restricted
body of evidence, we can figure out what to assign given the total body of evidence.
The Bayesian procedure consists of conditionalizing the specified prior probability
assignment on the evidence proposition. The evidence proposition rules out every
member of the prior algebra that it is inconsistent with. A proposition is untouched
by the evidence if it entails the evidence proposition. An untouched proposition
is entirely consistent with the evidence – there is no way in which the proposition
could turn out to be true while the evidence proposition is false. Each conjunction
of any proposition with the evidential proposition is untouched. Each conjunction
of a proposition with the negation of the evidential proposition is ruled out. The
essential feature of conditionalization is that it sets to 0 the probabilities of all ruled
out propositions and preserves the ratios of all of the untouched propositions.
To see exactly what conditionalization involves, it can be helpful to think about
probability assignments geometrically. A probability assignment can be geometrically
represented by an association of propositions with regions. The region associated with
a conjunction of two propositions is the intersection of the conjuncts’ associated
regions. The region associated with a disjunction of two propositions is the union
of the disjuncts’ associated regions. The probability of each proposition
corresponds to the relative size of its associated region. The total space, which
corresponds with what we know to be true, is treated as having a size of 1. Every
proposition has a probability equal to the size of its associated region.
On this way of representing a probability assignment, conditionalization is a geometric
process. It involves first excising the regions associated with ruled-out propositions
and then scaling the remainder back up to a total size of 1 in such a way as to
preserve ratios of untouched propositions.
In the following geometrical representation, we have a probability space consisting
of three propositions: ψ, φ, and γ. The initial assignment grants them all an equal
(1⁄3) probability. ¬ψ is the evidence proposition, so incorporating the evidence rules
out ψ. φ and γ are untouched propositions. The untouched propositions retain their
ratio as the total space is renormalized. φ and γ both come to have a probability of
1⁄2 after conditionalization.
[Figure: three panels showing the regions ψ, φ, and γ; then ψ, φ, and γ again; then φ and γ alone.]
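The same example can be worked through in a few lines of code. This is a minimal sketch, assuming the three propositions form a partition of the space; the function name `conditionalize` and the parameter `ruled_out` are hypothetical labels chosen for the illustration. Exact fractions are used so that the preserved ratio is visible without rounding.

```python
from fractions import Fraction


def conditionalize(prior, ruled_out):
    """Set ruled-out cells to 0 and rescale the rest so they sum to 1.

    `prior` maps the cells of a partition to probabilities; `ruled_out`
    is the set of cells inconsistent with the evidence proposition.
    """
    remaining = sum(p for cell, p in prior.items() if cell not in ruled_out)
    if remaining == 0:
        raise ValueError("the evidence has prior probability 0")
    return {
        cell: (Fraction(0) if cell in ruled_out else p / remaining)
        for cell, p in prior.items()
    }


prior = {"psi": Fraction(1, 3), "phi": Fraction(1, 3), "gamma": Fraction(1, 3)}
posterior = conditionalize(prior, ruled_out={"psi"})

for cell, p in posterior.items():
    print(cell, p)
```

Dividing every surviving cell by the same quantity is what keeps the ratios of the untouched propositions fixed: φ and γ start out equal and remain equal after rescaling.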
4.2 The Logic of the Procedure
If the Bayesian procedure is to have any kind of authority, its reliance on conditionalization
must be vindicated. Conditionalization is a very popular account of how
to go about updating a probability assignment on new evidence, and so numerous
defenses of it have been given. Most of these defenses assume the doctrine of degrees
of belief and take conditionalization to concern the rational evolution of these beliefs.
By focusing on the Bayesian procedure, I have used conditionalization as a synchronic
process. We often conditionalize a probability assignment relative to a subset of our
present evidence on our total evidence. Nevertheless, with a bit of tweaking, the same
strategies for justifying conditionalization as a diachronic process might be carried
over for its synchronic application.
I will discuss two different attempts at justifying conditionalization that have
played a prominent role in contemporary discussions: Dutch book arguments and
accuracy-based arguments. After discussing and criticizing these two proposals, I
will consider a third proposal which extends the account of the formal norms that
I offered in the previous chapter to conditionalization. On this third view, the
authority of conditionalization is constitutive of the practice of assigning probabilities.
I will ultimately reject this view and offer an alternative that takes the authority of
conditionalization (what authority it has) to arise from our commitments to how
evidence should be represented.
4.2.1 Dutch Book Arguments
Conditionalization is a process by which we derive one probability assignment from
another. If the Bayesian procedure is to have authority, it is because conditionaliza-
tion tells us how probability assignments with respect to different bodies of evidence
ought to relate to each other. The Dutch book argument for conditionalization [34]
rests on the practical problems that arise if we don’t conditionalize.
The Dutch book argument shows that those who fail to conditionalize and at-
tempt to maximize their utility will be willing to accept diachronic Dutch books –
a set of bets offered at different times that appear individually advantageous but
that collectively guarantee a loss. By accepting those bets that individually appear
advantageous, those who fail to conditionalize will be led to be worse off no matter
how things turn out. This is unfortunate.
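The guaranteed loss can be checked with a small calculation. The sketch below follows the familiar construction of a diachronic Dutch book, but the particular numbers, the function name `net_payoffs`, and the simplifying assumption that the agent accepts any bet with zero expected value by her current lights are all choices made for the illustration. It also assumes that the probability the agent adopts after learning the evidence is lower than her prior conditional probability; the bets are reversed in the opposite case.

```python
from fractions import Fraction


def net_payoffs(p_e, x, q):
    """Agent's net payoff in each outcome from three individually fair bets.

    p_e: prior probability of the evidence proposition E
    x:   prior conditional probability of H given E
    q:   probability actually assigned to H after learning E (assumes q < x)
    """
    side_stake = x - q
    payoffs = {}
    for e, h in [(True, True), (True, False), (False, True), (False, False)]:
        total = Fraction(0)
        # Bet 1 (before the evidence): pay x; win 1 if H and E; refunded if not E.
        total += -x + (x if not e else (1 if h else 0))
        # Bet 2 (before the evidence): pay side_stake * p_e; win side_stake if E.
        total += -side_stake * p_e + (side_stake if e else 0)
        # Bet 3 (after learning E): sell bet 1 back at the new fair price q.
        if e:
            total += q - (1 if h else 0)
        payoffs[(e, h)] = total
    return payoffs


result = net_payoffs(p_e=Fraction(1, 2), x=Fraction(3, 4), q=Fraction(1, 2))
for outcome, payoff in result.items():
    print(outcome, payoff)
```

In every outcome the agent ends up down by (x - q) times the prior probability of E, which with these numbers is 1/8. Each bet is fair at the time it is offered, so a slight sweetener on each would make them individually attractive while leaving the total negative.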
Dutch book arguments show that there are pragmatic reasons to conditionalize.
However, it is difficult to translate this observation into a satisfying explanation of
why it is that we should conditionalize. I offered one such interpretation of the
argument in the previous chapter according to which we should infer the irrationality
of the beliefs from the unfortunateness of acting on their behalf, and presented several
criticisms of it. Those criticisms continue to have force in the present context: Dutch
books threaten to make conditionalization dependent on one’s decision procedure,
they aren’t sufficiently epistemic, and the unfortunateness of accepting Dutch books
doesn’t obviously entail irrationality about the decisions that lead to them.
There are also a couple of specific reasons to worry about the diachronic versions
of the argument used in defenses of conditionalization. Such arguments establish too
much and too little. First, the same considerations that counsel for conditionalizing
also seem to counsel against changing one’s mind. The Dutch book argument shows
us that one can be led to accept bets that will lead to sure losses unless one conditionalizes.
It doesn’t matter why it is that one fails to conditionalize. If we should always
conditionalize, then it is not permissible to change one’s mind about how to repre-
sent our evidence. Such changes needn’t conform to conditionalization. But, unless
there are objective facts about evidence, it is hard to see how it could be irrational
to change one’s mind. If the only considerations we can find against conditionalizing
also apply to changing one’s mind, then those considerations prove too much.
Second, diachronic Dutch book arguments aren’t quite able to justify the syn-
chronic Bayesian procedure. They show something about how it is that the proba-
bility assignments one bases one’s decisions on should change over time. They don’t
show anything about how we are obliged to represent probability with respect to
different bodies of evidence at a single time. One can avoid accepting Dutch books
so long as one always updates one’s probability assignment on one’s total evidence
by conditionalization. This is compatible with wild swings in what one assigns to
propositions relative to non-total bodies of evidence. Such swings go against the
spirit of conditionalization and flout the Bayesian procedure, but they don’t produce
the same pragmatic issues. This second problem is the inverse of the first. By being
focused entirely on the probability assignment relative to the total body of evidence
at different times, the argument is too weak to establish the synchronic Bayesian
procedure.
4.2.2 Accuracy-Based Arguments
As with the formal norms, accuracy-based arguments have recently become popular
as an alternative to Dutch book arguments for conditionalization because they promise
to offer a less pragmatic explanation. Accuracy-based arguments rely on the thought
that our attitudes characteristically aim to maximize accuracy. (Versions of this ar-
gument, such as that of Wallace and Greaves [13] discussed below, focus on ‘epistemic
utility’ which may reflect other virtues of a probability assignment beyond accuracy.)
Ideally, an accuracy-based argument would contain a proof that any updating proce-
dure other than conditionalization is bound to produce less accurate results. That is,
insofar as one aims at accuracy, one is always better off conditionalizing than not con-
ditionalizing. This would make the accuracy-based argument for conditionalization
similar to the accuracy-based argument for the formal norms.
The ideal is unobtainable. The accuracy of the result of updating a probability
assignment in a particular way depends both upon the accuracy of the original as-
signment and how it is updated. If one starts with a highly inaccurate assignment,
one is often better off simply scrapping it. Whether someone is better off condition-
alizing depends upon exactly what kind of assignment they start with. The next best
thing to the ideal involves relativizing the argument: maybe conditionalization won’t
always produce the most accurate result, but it will be such that one should always
think that it will. Given our uncertainty, conditionalization may always seem like the
best option.
David Wallace and Hilary Greaves [13] have proven a result of this kind. They
proved that no matter how accuracy is measured, everyone should expect that the
most accurate probability assignment is whatever assignment would be deemed most
accurate by the probability assignment arrived at by conditionalizing. They don’t
show that one should expect that conditionalizing itself delivers the most accurate
assignment, only that conditionalizing delivers the assignment that one should expect
to provide the best advice about what probability assignment to accept on the basis
of accuracy. Wallace and Greaves point out that if probability assignments regard
themselves as being the most accurate (as they do if accuracy is the sole virtue and
is measured with the Brier score or another proper measure) then conditionalization
is rationally required.
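As a minimal sketch of what it means for a probability assignment to regard itself as the most accurate under a proper measure, the following checks numerically that, for a single proposition held with credence p, the expected Brier penalty of adopting credence q is lowest when q equals p. The function names and the grid resolution are illustrative assumptions and are not drawn from Wallace and Greaves.

```python
def expected_brier_penalty(p, q):
    """Expected squared error of credence q, computed by the lights of credence p.

    With probability p the proposition is true (truth value 1);
    with probability 1 - p it is false (truth value 0).
    """
    return p * (1 - q) ** 2 + (1 - p) * q ** 2


def self_recommended_credence(p, steps=1000):
    """Grid-search the credence q that credence p expects to be most accurate."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda q: expected_brier_penalty(p, q))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.7):
        # Each credence should pick itself as the expected-best credence.
        print(p, self_recommended_credence(p))
```

The check covers only a single proposition and a single measure. It illustrates the self-recommendation property that the argument relies on, not the general theorem.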
This is a very interesting result, but its limits should be noted. It faces many of
the same problems as the accuracy-based arguments for the formal norms. It is not
clear how the argument should work if accuracy can’t be measured precisely. Though
Wallace and Greaves purport to only assume a more general account of epistemic
utility, it is clear that they have accuracy in mind when suggesting that their result
helps to justify conditionalization. If we allow that epistemic utility includes other
virtues, it is not at all clear that probability assignments will recommend themselves.
If probability assignments don’t recommend themselves, then there is no guarantee
that conditionalization will be optimal. For instance, if we think that probability
assignments should be evaluated in terms of how closely they conform to what is
licensed by the evidence, then there could be situations, like the following, in
which we shouldn’t conditionalize.
Suppose that someone assigns a probability of 1 to the proposition that the bod-
ies of evidence A and B license the following probability assignments to ψ and φ
(respectively):
A      ψ     φ
a     .5    .2
b     .1    .2

B      ψ     φ
a     .6    .4
b      0     0
Then suppose that body of evidence B contains all those propositions in body
of evidence A, along with the proposition a. Suppose that, in possession of body of
evidence A, they assign probabilities in light of the above values. What should they do
once they subsequently learn a? Conditionalization would deliver a final probability
of about .7 to ψ. However, that assignment wouldn’t recommend itself in light of their
commitments to facts about what evidence licenses. Since they are entirely sure that
body of evidence B licenses an assignment of .6 in ψ, it would recommend adopting a
probability of .6 in light of the evidence. In fact, if someone wants to maximize their
epistemic utility, measured in closeness to what the evidence licenses, they should go
with .6.²
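The arithmetic behind this example can be sketched as follows. The sketch reads each cell of the two tables as the value assigned to a conjunction (ψ and a, φ and a, and so on), which is an interpretive assumption that fits the "about .7" figure. The dictionary layout and function name are hypothetical.

```python
from fractions import Fraction as F

# Values licensed by body of evidence A; keys are (column, row) of the table.
licensed_by_A = {
    ("psi", "a"): F(5, 10), ("phi", "a"): F(2, 10),
    ("psi", "b"): F(1, 10), ("phi", "b"): F(2, 10),
}
# Values licensed by body of evidence B (A together with the proposition a).
licensed_by_B = {
    ("psi", "a"): F(6, 10), ("phi", "a"): F(4, 10),
    ("psi", "b"): F(0), ("phi", "b"): F(0),
}


def conditionalize_on(assignment, learned):
    """Zero out cells inconsistent with `learned` and renormalize the rest."""
    total = sum(v for (_, row), v in assignment.items() if row == learned)
    return {
        cell: (v / total if cell[1] == learned else F(0))
        for cell, v in assignment.items()
    }


posterior = conditionalize_on(licensed_by_A, "a")
psi_by_conditionalization = posterior[("psi", "a")]  # 5/7, roughly .714
psi_licensed_by_B = licensed_by_B[("psi", "a")]      # 3/5, that is .6

if __name__ == "__main__":
    print(float(psi_by_conditionalization), float(psi_licensed_by_B))
```

On this reading, conditionalizing yields 5/7 for ψ, while the value the agent is certain that B licenses is 3/5, so the two come apart.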
If we do think that bodies of evidence license probability assignments that don’t
relate by conditionalization, and if we aim in any way to assign probabilities licensed
by our evidence, then conditionalization may not be the way to go. It begs the ques-
tion to think that probability assignments must relate by conditionalization. Hence,
Wallace and Greaves rely on questionable assumptions to justify conditionalization.
4.2.3 Constitutivism
In the third chapter, I argued that the formal norms of probability need no explana-
tion. Being a probability assignment involves being subjected to the formal norms.
The norms get their grip on us through our intention to assign probabilities. Not all
norms of probability are like this, and I suggested that Lewis’s Principal Principle and
the Principle of Indifference were not. Conditionalization may seem much more basic,
and therefore more plausible as a constitutive norm. However, I think that it is not.
My reason for denying constitutivity to conditionalization through its role in the
Bayesian procedure is that it does not appear to be a requirement for engagement in
the practice. (Intentional) obedience to constitutive norms is essential to participation
in a practice. For this reason, a good guide to whether or not a norm is constitutive of
a practice is whether or not it is coherent for someone to intentionally and flagrantly
² Conditionalization only really applies when we gain new evidence. This is known to create problems for its application, since our evidence typically changes by shifts of evidence, not mere accretion. When our body of evidence becomes B, for instance, we not only gain the information that our total body of evidence is B, but we also lose the information that our total body of evidence is A.
violate that norm while attempting to engage in the practice.³ For example, one
cannot understand the game of chess and intend to play chess while moving one’s
pawns backward. Someone who intends to play a game that allows moving pawns
backward intends, at best, to play a variant of chess. This is why the moves available
to a pawn are constitutive. Since that move violates the rules, and intended obedience
to the rules is required for an intention to play chess, it is incoherent to intentionally
move backwards. On the other hand, one can intentionally make a move that is
strategically unsound – say, trading a queen for a pawn for no gain in position –
without incoherence.
Those norms that cannot be intentionally violated without confusion about the
practice or incoherence are constitutive. The Principle of Indifference and the Prin-
cipal Principle can be violated without confusion about the practice of assigning
probabilities, and for this reason, they are not constitutive. On the other hand, the
formal norms cannot be so violated. Conditionalization falls on the side of being coherently
violable. While it isn’t coherent to intentionally violate a formal norm, it is coher-
ent to intentionally refrain from conditionalizing. One might opt to follow heuristics
to decide what probabilities to assign, and these heuristics might diverge from the
Bayesian procedure. Since it is coherent to disobey the Bayesian norm while intend-
ing to assign probabilities, obedience to the Bayesian norm can’t be constitutive of
the practice of assigning probabilities, and needs some other explanation.
It is perfectly possible to introduce a new practice that is constitutively governed
by the rule: ‘assign values in accordance with the Bayesian procedure’. This practice
isn’t the practice we currently engage in when we assign probabilities. The Bayesian
procedure could be constitutive of a practice, but it is not actually constitutive of our
³ One may violate a norm that one doesn’t recognize as governing a practice. However, insofar as one intends to engage in the practice, if one were to discover that the norm governs that practice, one would have to alter one’s behavior or give up one’s intention.
practice. Hence, we can’t rely on constitutivity to account for the authority of the
Bayesian procedure.
4.2.4 Authority through Commitment Preservation
The key to the normative authority of the Bayesian procedure lies in our relation to
the prior probability assignment. Our commitment to the dictates of the procedure
should be no greater than our commitment to what we feed into it. If we take a shot in
the dark in assigning the prior probability, we’ve got no reason to trust in the results
of conditionalizing. As I suggested in the first chapter, we don’t need to think that
any probability assignment is right in order to commit ourselves to one. We may think
of other probability assignments as equally rational responses to the evidence without
being indifferent between them, just as one may have particular desires without seeing
those desires as any more or less rational. Adopting a probability assignment from
among the rational assignments available involves taking a personal stance on what
probability is reasonable on the basis of that evidence.
Part of what it is to assign a probability is to commit oneself to representing a body
of evidence with that probability. I propose that the Bayesian procedure receives what
authority it has from the commitments that lead us to settle on the prior probability
assignment. The normative force of the Bayesian procedure comes from the fact
that conditionalization is the only way to properly respect our commitments when
we assign a posterior. By committing ourselves to the prior, we may have already
committed ourselves to the posterior.
This approach to understanding the authority of the Bayesian procedure has sev-
eral advantages over the previous proposals that I’ve discussed. First, it is clear why
conditionalization is an epistemic norm. The commitments that we have to the prior
probability assignment are commitments about how to represent evidence with prob-
abilities. They are clearly epistemic commitments. While we may lack practical or
moral reasons for conditionalizing, our epistemic commitments provide us with epis-
temic reasons to conditionalize. Second, we needn’t rely on evidentialism to deliver
these epistemic commitments. It is doubtful that single probability assignments are
objectively demanded by many bodies of evidence. There needn’t be a unique rational
response to a body of evidence in order for the present approach to work. The reason is that
the normative force doesn’t come from the evidence, but from ourselves. We don’t
conditionalize because conditionalization is the right way to respond to the evidence;
we conditionalize because it is the only way to respect our own commitments. Third,
there is no problem handling the synchronicity of the Bayesian procedure. The fact
that we assess probabilities with respect to different bodies of evidence at the same
time produces no technical challenges for this account. It is designed to handle them.
Fourth, since the value of conditionalizing doesn’t lie in the consequences of doing
so, this view does not prohibit changes of opinion. We can alter our commitments
whenever we like. What we believed in the past has no sway over what we now be-
lieve. When we do alter our commitments, however, we must do so across the board.
If multiple responses to the evidence are rational, then we can rationally shift from
one position to another, so long as we shift both our prior and posterior assignments.
4.2.5 How Commitments are Preserved
This approach to justifying conditionalization will only be viable if we can explain
how it is that commitments to a prior probability assignment make for commitments
to posterior probability assignments. There are three important ingredients in the
account that I’ll offer. First, I will give an account of the nature of our commitments. I
will suggest that many of our commitments are commitments to relative probabilities.
Second, I will reiterate my analysis of what conditionalization really amounts to: it
preserves the probability ratios of untouched propositions. Third, I will propose that
any proposition that is entailed by two separate propositions provides no evidence that
is relevant to the two propositions’ relative probabilities. I’ll take each component in
turn.
Commitments to a prior assignment may take many forms. Just as we may be
committed to a political cause for moral, religious, or social reasons, we can be com-
mitted to a probability assignment because of other, more basic, commitments. The
precise form of our commitments can explain why it is that we should care about
conditionalization, though the fact that we can be committed to a probability assign-
ment for other reasons is important to the limitations of conditionalization as well,
as I will explore later.
The first ingredient concerns our commitments. One form that our commit-
ments may take is commitment to the relative probabilities of propositions in light of
the evidence. A commitment to relative probability in light of the evidence is a com-
mitment about how to regard two propositions on the basis of that evidence. Such a
commitment only tells us the relationship between the numbers assigned to the two
propositions, but with enough such relative commitments, we may settle on specific
values for a whole algebra of propositions. If we are committed to assigning an equal
probability to a collection of mutually inconsistent propositions which collectively exhaust the
space of possibilities, then we must assign them a value solely dependent on the
number of such propositions: 1/n each, where n is that number. I will reserve the term
‘relative commitment’ for commitments about the ratios that probability assignments should take, although there
are certainly other kinds of relationships we could be committed to having between
the probabilities we assign to propositions.
Relative commitments are quite common. We often decide what probabilities to
assign by trying to estimate how two propositions compare with each other. Once
we have figured out how they relate to each other, we can deduce what numbers
they must be assigned in order to maintain probabilistic coherence. It is important
that these relative probabilities are assigned because we adopt a commitment that
the evidence demands them. While all probabilities are assigned as a response to our
evidence, and as a way of representing that evidence, some probabilities may be more
directly demanded by the evidence than others. We may be committed to assigning
ψ and φ equal probabilities in light of the evidence, and committed to assigning φ and
γ equal probabilities in light of the evidence, without being committed to assigning
ψ and γ equal probabilities in light of the evidence.
The second ingredient concerns conditionalization. As I explained earlier, what
is special about conditionalization is that it preserves the ratios of untouched propo-
sitions. This fact about conditionalization exhausts it. Conditionalization should
therefore be seen as that procedure that leaves the ratios of untouched propositions
alone and restores a probability assignment to probabilistic coherence, given that
every proposition that was ruled out must be assigned a probability of 0.
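This characterization of conditionalization can be sketched in a few lines: assign 0 to whatever is ruled out, rescale the remainder so that it sums to 1, and observe that the ratios among the untouched propositions are unchanged. The atoms p, q, and r and their prior values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F


def conditionalize(prior, ruled_out):
    """Assign 0 to ruled-out atoms and rescale the rest so they sum to 1."""
    remaining = sum(v for atom, v in prior.items() if atom not in ruled_out)
    if remaining == 0:
        raise ValueError("cannot conditionalize on evidence with prior probability 0")
    return {
        atom: (F(0) if atom in ruled_out else v / remaining)
        for atom, v in prior.items()
    }


# Hypothetical prior over three mutually exclusive, jointly exhaustive atoms.
prior = {"p": F(1, 2), "q": F(1, 4), "r": F(1, 4)}
posterior = conditionalize(prior, ruled_out={"r"})

# The ratio between the untouched atoms p and q is the same before and after.
ratio_before = prior["p"] / prior["q"]
ratio_after = posterior["p"] / posterior["q"]

if __name__ == "__main__":
    print(posterior, ratio_before, ratio_after)
```

Exact fractions are used so that the equality of the two ratios can be checked without rounding error.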
The final ingredient is the claim that any bit of evidence that is entailed by
each of two propositions doesn’t alter the balance of evidence we have for those two
propositions. This claim rests on the thought that evidence entailed by each of two
propositions doesn’t help discern between them. If we expect that the pressure
will drop whether it rains or hails, then we cannot use the fact that the pressure
dropped to reevaluate whether we think that it is more likely to rain or to hail. If we
think that Lee will come in third if either Frank or Sal comes in first, then
we can’t use the fact that Lee came in third to reevaluate whether we think Frank or
Sal is more likely to have come in first. If judgments about probability aim to capture
the balance of evidence, then the addition of undiscerning evidence shouldn’t alter
the ratios of assigned probabilities demanded by the evidence.
Now we can assemble the ingredients. If we have a commitment regarding the
relative probability of two propositions on a particular body of evidence, then the
nature of the commitment means that we are also committed to the same relative
probability with the addition of any undiscerning evidence. The same commitment
to a relative probability in light of the old evidence will lead to a commitment to
the same relative probability in light of the additional undiscerning evidence. Thus
if we are committed to regarding any two propositions as having a particular ratio of
probabilities given a certain body of evidence, we are committed to them having the
same ratio with the addition of any evidence entailed by both propositions.
It follows from this that if we are committed to the relative probabilities of propo-
sitions in the prior assignment, we are committed to maintaining the ratios of all
untouched propositions with the addition of undiscerning evidence. Since this is all
that conditionalization does and since conditionalization is the only way of updating
one’s probabilities that does this, we should conditionalize. The Bayesian procedure
is authoritative if and only if we are rationally required to assign probabilities on
the basis of different bodies of evidence that are related by conditionalization. Re-
specting our commitments to relative probabilities in light of the evidence requires
conditionalizing. So the Bayesian procedure is authoritative if we have commitments
to the relative probabilities of all of the members of our algebra in light of the evidence.
This account provides a neat explanation of the authority of the Bayesian proce-
dure, but it is limited. It requires that our prior assignment reflect commitments that
we have about relative probabilities in light of the evidence. Often this is not the case,
and when it isn’t, we won’t have any reason to conditionalize. In the remainder of the
paper, I will explore and defend the idea that we may rationally lack commitments
to relative probabilities.
4.3 An Illustrative Case
Before I begin discussing the limitations of the Bayesian procedure, I will present a
case that illustrates them. In this section, I will describe a case in which I think that
the Bayesian procedure is not authoritative, and I will give an argument to this effect.
This case should provide a vivid introduction to the kinds of ways we might fail to
have the relative commitments that I previously suggested were vital for explaining
the authority of the Bayesian procedure.
4.3.1 Alice’s Predicament
Consider Alice’s predicament:
Alice the Astrophysicist. Alice has a new theory of dark matter. Ac-
cording to her theory, dark matter is composed of a hitherto-unobserved
particle – the D-particle – that is part of a natural extension of the Stan-
dard Model. The properties of this particle do a very good job explaining
the observed properties of dark matter.
A gap in her theory concerns the origins of the particle: it is not produced
by any known particle interactions. Mathematical reasoning allows her to
narrow down the candidates for the process that might have produced
the particle to two. The particle could be produced as a result of either
low-energy or high-energy supersymmetry breaking, but not both. Alice
has no a priori reason to prefer either hypothesis; her theory makes no
predictions about how the D-particle is actually produced. But experi-
mentation with a particle accelerator has confirmed that no such particle
is produced by low-energy supersymmetry breaking.
Alice is concerned with four hypotheses:
d-par     Dark matter is composed of D-particles.
high      The D-particle is produced during high-energy supersymmetry breaking.
low       The D-particle is produced during low-energy supersymmetry breaking.
¬d-par    Dark matter is not composed of D-particles.
Alice’s Question: How likely should Alice think it is that d-par is
correct, given the fact that low is not?
If she is to use the Bayesian procedure, Alice must try to assign a probability
to each member of the algebra generated by her four hypotheses. Suppose that
she forms a prior assignment by considering what her non-empirical evidence (that
is, her evidence without the results from the particle accelerator) merits, and then
conditionalizing on the remainder of her evidence. Suppose that the combination of
naturalness and fit with the data leads Alice to assign a 2⁄3 prior probability to the
proposition that her theory is true. Further, in the absence of evidence, she divides
that probability equally between high and low, so that each proposition receives a
probability of 1⁄3.
In Alice the Astrophysicist, the Bayesian procedure tells Alice that upon incorpo-
rating the evidence that rules out low, she should maintain the relative probabilities
of high and ¬d-par because they are both inconsistent with low. Since she assigned
them both a probability of 1⁄3 before, and they come to exhaust the possibilities, the
Bayesian procedure says that Alice should end up assigning them each a probability
of 1⁄2.
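The Bayesian verdict in Alice's case can be reproduced with a short sketch. The prior values are those stipulated above (1⁄3 each for high, low, and the negation of d-par). The key name not_d_par and the helper rule_out are hypothetical labels introduced for illustration.

```python
from fractions import Fraction as F

# Alice's prior over the three exclusive possibilities, as stipulated in the case.
prior = {"high": F(1, 3), "low": F(1, 3), "not_d_par": F(1, 3)}


def rule_out(assignment, excluded):
    """Drop the excluded hypothesis and rescale the survivors to sum to 1."""
    kept = {h: v for h, v in assignment.items() if h != excluded}
    total = sum(kept.values())
    return {h: v / total for h, v in kept.items()}


posterior = rule_out(prior, "low")

# d-par is true just in case high or low is; after low is ruled out, only high remains.
prior_d_par = prior["high"] + prior["low"]
posterior_d_par = posterior["high"]

if __name__ == "__main__":
    print(prior_d_par, posterior_d_par)
```

On these numbers the procedure lowers the probability of d-par from 2⁄3 to 1⁄2 while keeping high and the negation of d-par in a 1:1 ratio.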
Intuitively, the addition of ¬low to Alice’s body of evidence does not alter her
epistemic situation with regard to d-par. Her prior confidence in d-par was driven
by the theory’s naturalness and by the fit between its predictions and the established
data. The fact that there were two sub-cases consistent with her theory was not a
relevant factor in her assignment of prior probability to the theory as a whole. The
theory remains as natural after the evidence that rules low out is included, and the
fit between the theory’s predictions and the known properties of dark matter persists.
This same evidence should continue to drive her division of probabilities between d-par
and ¬d-par. So, the probability she comes to have in d-par should be the same as
her old probability. It should continue to be 2⁄3.
4.3.2 The Counterfactual Argument
Consider the following counterfactual scenario (differences from the original are ital-
icized):
Counterfactual Alice. Alice has a new theory of dark matter. Accord-
ing to her theory, dark matter is composed of a kind of hitherto-unobserved
particle – the D-particle – that is part of a natural extension of the Stan-
dard Model. The properties of this particle do a very good job explaining
the observed properties of dark matter. The combination of naturalness
and fit support d-par just as they do in the original scenario.
However, there is only one conceivable process that might produce the D-
particle: high-energy supersymmetry breaking. Alice has no independent
reason to think that it does so. In this counterfactual scenario, low-energy
supersymmetry breaking is not a viable process, due to minute alterations
in physical laws that don’t otherwise affect the naturalness and fit of d-par.
Consequently, Alice assigns high the full probability of the theory.
Let Pr−(d-par) be Alice’s probability assignment in the original scenario before
she is able to rule out low, Pr+(d-par) be Alice’s probability assignment in the
original scenario after she is able to rule out low and PrC(d-par) be her probability
assignment in the counterfactual scenario in which low is not a possibility.
The argument goes as follows. Since the same evidential factors bear on Pr−(d-
par) and PrC(d-par), they should be equal. And since the same evidential factors
bear on Pr+(d-par) and PrC(d-par), they should also be equal. Therefore, Pr−(d-
par) and Pr+(d-par) should be equal.
In assigning a probability to a proposition, we commit ourselves to representing
a body of evidence with that probability assignment. It follows that we ought
to assign the same probability whenever we have the same evidence. If a particular
probability is the appropriate representation of a body of evidence, then as long as
that evidence does not change, the probability will remain appropriate. Insofar as
the evidence that Alice possesses does not change in a relevant way, her probability
assignment should not change.
Premise 1: Pr−(d-par) = PrC(d-par)
In the counterfactual situation, Alice assigns her probability to d-par in response
to the same evidence as she initially had in the original scenario. She assigns her
probability in response to the naturalness of her theory and the fit between the
theory and established data. Since she had the same reasons to believe d-par before
she rules out low in the original scenario and in the counterfactual scenario, she
should assign the same probability to the proposition in both cases. Thus, in the
counterfactual scenario, high should be assigned a probability of 2⁄3.
In the setup of the original case, Alice’s probability was stipulated to depend
on the naturalness of her theory and its fit with established data. Her probability
was not taken to depend upon her division of probabilities to hypotheses about how
the particle might be produced. The fact that the theory had two sub-cases was
not counted in its favor. This is a plausible assumption: when we go about assigning
probabilities, we do not normally bother to count or closely examine all of the possible
sub-cases.
This stipulation might be questioned – perhaps Alice is being irrational in assigning
her probabilities in this way. If Alice were an ideal epistemic agent, it is possible
that she would not need to rely on the Principle of Indifference to assign probabilities.
However, imperfect agents such as ourselves use a variety of heuristics to assign
probabilities. I believe that if Alice is imperfect in her ability to collect and analyze
evidence, she needn’t be irrational for basing her assessment of probability only on
coarse-grained features of her body of evidence. We have little basis to conclude that
Alice is irrational for using a heuristic that produces the same probability in both the
prior and posterior assignments.
Premise 2: Pr+(d-par) = PrC(d-par)
What difference could it make whether low was genuinely epistemically open
to Alice in the first place? Ruling out low should not amount to evidence against
d-par if the viability of low did not provide evidence in favor of d-par in the first place.
By stipulation, it did not provide any such evidence that Alice recognized, so Alice
is in the same epistemic situation with respect to d-par upon ruling out low in the
original case as she is in the counterfactual case. In both cases she has the same
evidence for d-par. The theory exhibits the same naturalness, fit with the data, and
even has the very same viable sub-cases. So Alice should assign the same probability
to d-par in the original case after ruling out low as she should assign to d-par in
the counterfactual case.
The counterfactual argument suggests that the change in probabilities should look
more like this:
[Figure: Alice’s probabilities before ruling out low (low, high, ¬d-par) and after (high, ¬d-par).]
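The update that the counterfactual argument recommends can be sketched as follows, on the assumption that the probability formerly given to low passes entirely to high, so that d-par keeps its 2⁄3. The prior values are those stipulated in the case. The key name not_d_par and the helper reassign_within_theory are hypothetical.

```python
from fractions import Fraction as F

# Alice's prior, as stipulated in the original scenario.
prior = {"low": F(1, 3), "high": F(1, 3), "not_d_par": F(1, 3)}


def reassign_within_theory(assignment, excluded, heir):
    """Move the excluded sub-case's probability to the surviving sub-case."""
    updated = dict(assignment)
    updated[heir] += updated.pop(excluded)
    return updated


posterior = reassign_within_theory(prior, excluded="low", heir="high")

# Unlike conditionalization, this update changes the ratio of high to not-d-par.
ratio_before = prior["high"] / prior["not_d_par"]
ratio_after = posterior["high"] / posterior["not_d_par"]

if __name__ == "__main__":
    print(posterior, ratio_before, ratio_after)
```

The ratio of high to the negation of d-par moves from 1:1 to 2:1, which is exactly the kind of change that conditionalization forbids.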
4.4 Limitations of the Bayesian Procedure
With Alice’s predicament in mind, I’ll return to the issues surrounding the authority
of the Bayesian procedure. First, I will explore what kinds of commitments one
may have other than commitments to representing bodies of evidence with relative
probabilities. Second, I will suggest that, as the case was described, it is most plausible
that the kinds of commitments that Alice has don’t support the authority of the
Bayesian procedure. Third, I will lay out the view that the authority of the Bayesian
procedure is limited to cases where we have commitments to relative probabilities.
Finally, I will take up two objections to this proposed limitation. The first of these
objections holds that we are rationally required to have commitments to relative
probabilities. The second holds that any commitment to an algebra provides us with
commitments to relative probabilities.
4.4.1 Varieties of Commitment
The Bayesian procedure only makes sense if we are committed to the prior probability
assignment. I offered one explanation of the authority of the Bayesian procedure that
relied on a particular kind of commitment. If we are committed to the relative
probabilities that we assign to propositions, those commitments should survive the
addition of undiscerning evidence. There are other kinds of commitments that we
might have, and the Bayesian procedure doesn’t build in any restriction on the kinds
of commitments it requires.
A commitment to a probability assignment to an algebra of propositions is typ-
ically derived from commitments to the particular propositions that make up the
algebra. These commitments to particular propositions may in turn be derived from
a variety of other commitments. Some of our commitments are fundamental (‘basic’)
and all other commitments are derived from those fundamental commitments and
from each other. I will take the notion of a basic commitment to be a psychological
primitive. Derived commitments are those commitments that we have because they
must be satisfied in order to satisfy all of one’s basic commitments.
There is nothing preventing us from having a basic commitment to a set of proba-
bility assignments to an algebra as a whole, but this would be quite unusual. It is more
typical that we have commitments to an algebra derivatively of having commitments
to the particular propositions that make it up. Nor are we often directly com-
mitted to assigning particular values to propositions. In general, our commitments
to whole probability assignments to algebras are derived from our commitments to
the relations between assignments.
Commitments to relative probabilities are one way of having commitments to the
relations between assignments, but there are also others. We also have comparative
commitments, for instance, when we judge that one proposition is more likely than
another, and we have a kind of higher-order commitment, for instance, when we judge
that the difference in probability between two propositions is equal to the difference
in probability between two other propositions.
We also have commitments to methods. The Bayesian procedure can’t settle all of
our questions about what probabilities to assign. We can have commitments to the
prior that are derived from commitments to procedures for assigning probabilities.
One method for assigning probabilities instructs us to assign equal probabilities
to all analogous propositions in the absence of evidence. Another method is to adopt
the probability assignments of known experts. Another method is to meditate, and
then follow your own gut. Another method is to adopt probabilities in accordance
with known frequencies. When the application of these methods leads us to adopt a
commitment to a relation or a specific value, it is a derived commitment.⁴
⁴ The fact that a relation can be derived from a method that we are committed to doesn’t mean that it is derivative. We may have commitments that are overdetermined – we may have basic commitments that we can also derive from our other commitments.
These procedures are all incomplete. We will often need to use many of them to
settle on a probability assignment to a larger algebra. We may even have multiple
commitments that are individually basic and allow the same derivative commitments
to be derived.
As an illustration of the ways in which we may come to be committed to a prob-
ability assignment, consider the following.
P(rain tomorrow) = .5
P(no rain tomorrow) = .5
There are many different ways in which one can come to have this assignment.
Exactly how one arrives at it makes a difference to how one should go about
updating it on the receipt of new evidence. One might, for instance, have come to
assign these probabilities just by examining the clouds, the time of year, the recent
weather, etc. One might then just intuit those probabilities, without utilizing
any particular method. In this case, it is plausible that one would have a basic
commitment to the relative probabilities.
Alternatively, one might come to assign this probability on the basis of statistical
inference from a data set. This would involve a basic commitment to the principles
one used to derive the probability, and a derivative commitment to the relative proba-
bilities. Or, one might have a basic commitment to heeding the testimony of experts,
and one might have come to have a derivative commitment to the assignment after
hearing a meteorologist exclaim that it was as likely as not to rain tomorrow. Or, we
may have a basic commitment to the use of the Principle of Indifference, which counsels
assigning equal probabilities to propositions about which we have no evidence,
and we may regard ourselves as having no evidence about whether it will rain. In this
last case, we would have a basic commitment to the Principle of Indifference, and a
derived commitment to the assignments.
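The difference between these routes can be made concrete with a small sketch. It uses a bare relative frequency as a toy stand-in for statistical inference, and an invented record of ten past days; the function names and the data are hypothetical and serve only to illustrate how the same assignment can rest on different basic commitments.

```python
from fractions import Fraction

def by_indifference(outcomes):
    """Principle of Indifference: equal probability to each outcome, absent evidence."""
    n = len(outcomes)
    return {o: Fraction(1, n) for o in outcomes}

def by_relative_frequency(observations, outcomes):
    """Toy statistical method: probability = observed relative frequency."""
    n = len(observations)
    return {o: Fraction(observations.count(o), n) for o in outcomes}

outcomes = ["rain", "no rain"]

# Hypothetical record of ten comparable past days (made-up data, illustration only).
record = ["rain", "no rain", "rain", "rain", "no rain",
          "no rain", "rain", "no rain", "no rain", "rain"]

from_indifference = by_indifference(outcomes)
from_frequency = by_relative_frequency(record, outcomes)

# The frequency method is still applicable after one more rainy day is observed.
updated_frequency = by_relative_frequency(record + ["rain"], outcomes)
```

Both routes yield the same assignment of 1/2 to rain. Only the frequency route says what to do when an eleventh observation arrives; on the reading given above, the indifference route no longer applies once there is evidence.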
Thus, relative commitments are only one of many different ways in which we can
be committed to a probability assignment. If the authority of the Bayesian procedure
rests on these kinds of commitments, then when we lack those kinds of commitments
we should not regard the Bayesian procedure as authoritative.
4.4.2 Alice’s Commitments
Alice would have an obligation to accept the verdict of the Bayesian procedure if her
commitments to the prior assignment were commitments to the relative probabilities
of high and d-par in light of the evidence. If she had a basic commitment to thinking
that high should have the same probability as d-par, then that commitment would
mean that she should continue to assign high the same probability as d-par after
taking the additional evidence into account. In the way that I described the case,
however, it was suggested that her basic commitments did not include a commitment
to this relative probability. Her commitment was derived from two other commitments
that she had: to the relative probability of d-par and d-par, and to the use of the Principle
of Indifference in settling probabilities in the absence of evidence.
Since she had no reason to favor high over low, she assigned them equal probability.
Her commitment to the Principle of Indifference will remain, and so she will
continue to be obligated to assign an equal probability to any propositions for which
she lacks discerning evidence, but no such two propositions exist in the resulting
algebra and so this commitment is irrelevant to her final probability assignment.
Alice had no basic commitment to the equal probability of high and d-par.
The basic rationale for the authority of Bayesian conditionalization doesn’t hold.
Alice’s commitments to her prior assignment don’t obviously entail much about how
she should be committed to the posterior assignment. Insofar as she lacks a basic
commitment to the equal probability of high and d-par before taking her evidence
from the particle accelerator into account, it is plausible that she isn’t committed to
assigning an equal probability after (unless it is required by her other commitments).
This relation was an accident of her other commitments. There is no reason to think
it is a commitment that should be carried over.
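For comparison, here is a minimal sketch of the verdict that the Bayesian procedure itself would deliver. It assumes that high, low and d-par are mutually exclusive and jointly exhaustive, so that the two equalities described above fix a prior of 1/3 each; that partition, and the helper name `conditionalize`, are assumptions made for illustration.

```python
from fractions import Fraction

def conditionalize(prior, eliminated):
    """Drop the eliminated cells and renormalize the survivors.

    This is conditionalization on the disjunction of the surviving cells
    of a partition.
    """
    surviving = {cell: p for cell, p in prior.items() if cell not in eliminated}
    total = sum(surviving.values())
    if total == 0:
        raise ValueError("cannot conditionalize on a probability-zero proposition")
    return {cell: p / total for cell, p in surviving.items()}

# Assumed prior: three exclusive, exhaustive cells with equal probability.
prior = {"high": Fraction(1, 3), "low": Fraction(1, 3), "d-par": Fraction(1, 3)}

# The evidence from the particle accelerator eliminates low.
posterior = conditionalize(prior, {"low"})

prior_ratio = prior["high"] / prior["d-par"]
posterior_ratio = posterior["high"] / posterior["d-par"]
```

Conditionalizing preserves the 1:1 ratio between high and d-par. The question pressed in the text is whether Alice is committed to that ratio in the first place.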
The fact that Alice lacks relative commitments to untouched propositions doesn’t
mean that she has no commitments about the posterior assignment. In order to
decide what Alice’s commitments to the prior assignment commit her to about the
posterior assignment, we must know more about what those were. Alice’s commitment
to the relative probabilities of d-par and d-par wasn’t based on the Principle
of Indifference, so it must have a different source, either derived or basic. If her
commitment to this relative probability is basic, then it will supply Alice with no
commitments after the fact. Any basic commitment she had to the relative proba-
bility of d-par and d-par is irrelevant once she eliminates low, as d-par doesn’t
entail low. Plausibly, however, Alice’s basic commitment to the relative probabil-
ity of d-par and d-par will be accompanied by basic commitments to comparative
probabilities of the sub-cases of d-par and d-par; in particular, it is plausible that
she will have a basic commitment to always assign a probability to high no more
than twice as great as the probability she assigns to d-par. This means that she will
continue to be committed to thinking that the probability of high is no more than
twice as great as that of d-par.
If her commitments are derived from methods, then we would need to delve more
deeply into the details of the case and the particular methods involved to know where
she should be after incorporating the evidence. It is possible that her commitments
were to methods that led her to ignore the sub-cases into which d-par was divided. In
that case, it is plausible that she should still assign a probability of 2⁄3 to d-par after
taking the evidence into account; she is still committed to those principles and since
they were blind to the sub-cases, they will continue to deliver an equal probability to
d-par and d-par after the additional evidence is incorporated. On the other hand,
she might have been committed to the 2⁄3 assignment by virtue of a commitment to
methods that took the sub-cases into account. If she did, then depending on the
details of those methods, she might either be obligated to assign a probability to
d-par anywhere from 0 to 2⁄3, or she might lack commitments altogether.
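The 0 to 2⁄3 range can be reproduced under simple assumptions. Suppose that, once low is eliminated, high and d-par exhaust the algebra, and that the only surviving constraint is that high receive no more than twice the probability of d-par. Both suppositions, and the function name below, are illustrative assumptions rather than details of the case.

```python
from fractions import Fraction

def posterior_high(ratio):
    """Posterior for high when P(high) : P(d-par) = ratio and the two sum to 1."""
    ratio = Fraction(ratio)
    if ratio < 0:
        raise ValueError("a ratio of probabilities cannot be negative")
    return ratio / (1 + ratio)

# Ratios permitted by the assumed constraint P(high) <= 2 * P(d-par).
permitted_ratios = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
posteriors = [posterior_high(r) for r in permitted_ratios]

lowest, highest = min(posteriors), max(posteriors)
```

The bound of 2⁄3 is reached exactly when high receives twice the probability of d-par, and any smaller permitted ratio yields a smaller posterior.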
4.4.3 The Authority of the Bayesian Procedure
The authority of the Bayesian procedure stems from the fact that our commitments
to a prior assignment may commit us to the probability assignment that results from
conditionalizing that prior assignment on our total evidence. The procedure lacks any
kind of authority over those who have no commitments to a prior assignment. The
procedure also lacks authority over those who have the wrong kind of commitment to
the prior assignment. Conditionalizing makes sense when we have commitments to
relative probabilities – specifically to ratios in light of the evidence. But it may not
make sense in other cases. If our commitments to relative probabilities are derivative,
rather than basic, then we must look to the basic commitments to see what we should
think.
If this is right, then there is no magic rule for deciding what probability assignment
one body of evidence merits in terms of the probability assignment that another does.
What one should do will depend upon what commitments one has, with different
commitments leading to different results. In each case, the individual commitments
must be carefully examined to see what those commitments require. How we should
update a probability assignment depends on how we arrived at it.
We could adopt a commitment to utilizing the Bayesian procedure itself. In that
case, we would give the Bayesian procedure unrestricted authority. Perhaps there are
good reasons to adopt such a commitment; there is something very appealing about
conditionalization that is revealed by Dutch book and accuracy arguments. But in
light of the partial vindication of the Bayesian procedure, I don’t think that we should
hold out hope that any more substantial commitment to the Bayesian procedure is
rationally mandatory. Insofar as we lack a commitment to the Bayesian procedure,
it lacks unrestricted authority over us.
4.4.4 Must we have Relative Commitments?
To conclude this chapter, I will briefly look at how a defender of the procedure might
respond. I will consider two responses. First, I will consider the proposal that we
should have commitments to relative probabilities. If we are to be rational, the
thought goes, we must have the commitments that make the procedure authorita-
tive. Second, I’ll consider the suggestion that the derived commitments to relative
probabilities that arise from derived commitments to any assignment to an algebra
are sufficient to provide full authority to the Bayesian procedure. These responses
may leave the Bayesian procedure with some limitations – someone who lacks relative
commitments may not be committing a further blunder in not respecting the author-
ity of the procedure, but if these responses are correct then one cannot get away with
ignoring the dictates of the Bayesian procedure without violating some norm or other.
I will focus on each kind of response in turn.
The first response, which defends the unrestricted authority of the Bayesian procedure, relies
on the thought that we are obligated to have basic commitments to relative probabilities.
I am skeptical that there is any rational requirement to have any commitments
whatsoever, so if this response is to have any bite, it is plausible that two things must
be the case: first, we are in the business of adopting commitments to representing ev-
idence with probabilities and second, all bodies of evidence demand that those in the
business of adopting such commitments adopt commitments to relative probabilities
in light of the evidence.
I expressed my skepticism of evidentialism in the first chapter. If evidentialism
were correct – if there were objective facts about what probabilities we should assign
– then it is plausible that we would always be obligated by the evidence to adopt
commitments to relative probabilities. This view may not explicitly require eviden-
tialism, but it is much less plausible without it. If our reactions to evidence are akin
to a matter of personal taste, then whether bodies of evidence always merit
relative probabilities is itself highly subjective.
Even if evidentialism is mostly true, there may be many cases in which (relative)
probabilities are not dictated by the evidence. There are many difficult questions
about which rational disagreement seems permissible, and for which we don’t have
much of a clue about how to figure out an objective way of assigning probabilities.
Is there an objective fact of the matter, given the evidence that we have available
to us, how likely we should think it is that the human race will survive the next 500
years, that alien life exists within fifty light years, that there is an island of stability
in the elements with more than 200 protons? So long as there are some cases where
we can have commitments that are not commitments to relative probabilities, there
will be some restrictions to the authority of the Bayesian procedure.
Even if the evidence does render one probability assignment uniquely correct,
there is no guarantee that we have any obligation to adopt that assignment. The
evidence available to any child who has passed sixth grade may entail that Goldbach’s
conjecture is false, but that doesn’t mean that anyone who fails to be committed to
assigning the conjecture a probability of 0 is being irrational. Even if evidence does
objectively determine a uniquely correct probability assignment, there is no reason
to think that we should be so constructed as to be rationally required to recognize
the evidence’s import and adopt the right commitments. So it may be rational for
individuals to lack commitments to relative probability assignments even if relative
probability assignments are in some sense uniquely supported by the evidence.
It strikes me as deeply implausible that there are facts about evidential relations
that greatly transcend our ability to recognize them that we are nevertheless rationally
obligated to obey. It is doubtful that every issue regarding evidential support is
closed to rational disagreement. So tentatively, short of an explanation of what the
evidence really does dictate, it seems safer to assume that sometimes the evidence
underdetermines the rational response (at least in terms of relative probabilities). If
evidence underdetermines the rational response, then it is plausible we are under no
obligations to think that the evidence merits any particular relative probabilities.
The second response, which held that derived commitments may be sufficient
to establish the authority of the Bayesian procedure, only makes sense if we are
derivatively committed to thinking that the relative probabilities are demanded by
the evidence. Just as there are multiple ways in which one might be committed to
a probability assignment, there are multiple ways in which one might be committed
to a relative probability, and these ways may make a difference to whether that
commitment survives a change in evidence. Earlier, I suggested that commitments
to relative probabilities should survive the addition of evidence that doesn’t touch
either proposition. This followed from the premise that relative probabilities that
are supported by evidence can’t change in response to new evidence. If we have
commitments about how one bit of evidence supports relative probabilities, that gives
us commitments about how we should react to new evidence. If we have commitments
about relative probabilities that follow not from the evidence but from our methods,
there is no guarantee that they tell us anything about how we should react to new
evidence.
We can have commitments to relative probabilities that are not commitments
about what is demanded by the evidence. Our commitments may be to have those
relative probabilities not because the evidence demands them, but because those
relative probabilities are the proper ones to have in the absence of evidence. This
is precisely what I think may go on when one bases one’s probabilities on the
Principle of Indifference. The Principle of Indifference may be seen as a principle for
allocating probabilities in the absence of evidence – not a principle that states what
low levels of evidence themselves demand. The absence of evidence needn’t support
any particular probabilities at all. It should be clear why such a commitment would
not survive the incorporation of new evidence. The fact that in the absence of evidence
two propositions merit a particular ratio of probabilities doesn’t mean that they will
continue to do so when evidence is added.
Derived commitments need not be commitments to what the evidence supports.
Consequently, derived commitments to relative probabilities need not be the kinds of
things that one is rationally obligated to preserve.
Neither of these responses succeeds in undermining the explanation I have of-
fered for the authority of the Bayesian procedure, or the limitations on its authority
that seem to follow. Unless we find some alternative rationale to obey the Bayesian
procedure, we should conclude that it is indeed of limited authority.
Chapter 5
Conclusion
In the first chapter, I suggested that we might try to understand judgements about
probability as moves in a kind of practice. The following three chapters saw the
development of related ideas that together presented a more robust picture of the
nature of judgements about probability. Now that I have surveyed these different
proposals, I will briefly draw them together and explain how they relate to each
other.
The conclusion of the second chapter was that judgements about probability are
cognitively sophisticated: they involve vehicles which incorporate concepts of
probability in the same way that ordinary beliefs incorporate quantitative concepts
such as price, height, and weight. Though I didn’t directly address the issue of the
representational qualities of these concepts, I suggested that we needn’t see probabilistic
concepts as standing in for anything in the real world. I think that we should
interpret regarding-as-evidence as a noncognitive attitude. To regard something as
evidence isn’t to believe anything about it, but to be prepared to use it a certain way
in deciding what to believe. Probabilities are a way of categorizing propositions on
the basis of these attitudes, in a way that makes them especially useful for decision
making.
The conclusion of the third chapter was that the constitutive account of the formal
norms governing probability held more promise than the alternative explanations.
The relevance of this result to the proposal of the first chapter should be clear. Since
this proposal relies on the idea that our intentional subjugation to the norms explains
why they hold sway over us, it isn’t anything about the world or about the basic
elements of the furniture of the mind that explains the norms. Our intentions are
fruitfully understood as intentions to engage in a practice. Since the interpretation
of probability as a practice helps to make this explanation of the norms available, it
gains plausibility from this explanation’s successes.
The conclusion of the fourth chapter was that the Bayesian procedure has only a
limited authority. This conclusion was a result of a proposal for the explanation of the
authority of the Bayesian procedure. I think that this explanation warrants accep-
tance as the best available explanation. It makes sense of the typical importance of
conditionalization, without relying on pragmatic considerations or the hefty assump-
tions of accuracy-based arguments. Though the account does not rely so directly on
the interpretation of probability as a practice, it does help round out the view. In
part, this explanation is made more plausible if we see judgements about probability
as involving commitments about the representation of evidence. If judgements about
probability were mere states of confidence, then the connection between confidence
and evidential commitments would need to be spelled out. As things stand, the intu-
itiveness of conditionalization provides support for my explanation of the constitutive
aim of assigning probabilities.
The resulting picture is one in which judgements about probability are part of a
heuristic that we use for keeping track of our evidence. The judgements are actions
of categorization. We apply a concept of probability and a number to propositions to reflect the
amount of evidence that we have for them. The amount of evidence that we have
for a proposition isn’t an objective matter, but is a matter akin to personal taste.
The numbers that we apply in these judgements must conform to the formal norms
of probability, for it is part of the practice that our judgements are subject to such
constraints. Probability is a form of epistemic bookkeeping that we learn from our
community. The attitude of regarding-as-evidence is innate. The way in which we
abstract from these attitudes and apply numbers is not.
The interpretation of probability as a move within a practice has many advantages
over credal noncognitivism. Besides not relying on the doctrine of degrees of belief,
it provides space for the development of the normative explanations that I advanced
in the previous two chapters of this dissertation. These views make the most fundamental
of the normative rules that govern the assignment of probabilities a product of
ourselves. We impose rules upon ourselves, either by undertaking commitments to
relative probabilities, as with conditionalization and the Bayesian procedure, or by
deciding to assign probabilities in the first place. It is because we implicitly choose
to engage in a practice with certain rules and a certain aim that we become subject to
the norms of probability.
There is much more that needs to be said for this account to be complete. I have
only sketched the idea of treating judgements as moves in a practice and I left out
the details of what it is that makes the judgements count as moves in a practice. If
we do not characterize judgements about probability in terms of their content or their
functional roles, we owe some other account.
Further, by giving up on representative meaning, noncognitivists must provide
the explanations that are lost along with representative content. The chief of these
problems is the Frege/Geach challenge, the challenge of explaining the logical relations
between complex judgements. I developed a version of this challenge and marshaled
it against the unification thesis in the second chapter. I think that the response
that I advocated in that chapter of regarding judgements of probabilities as involving
certain vehicles of probabilities is promising. But the problem is deep and a great
deal of work remains to be done.
Finally, I think that the ideas presented in this dissertation show promise in appli-
cation to other domains as well. In the first chapter, I suggested that interpretations
of moral judgements admit analogous interpretations of probabilistic judgements. I
hope the ideas that I have explored in this dissertation may give something back to
the normative domain. Normative judgements may likewise be understood in terms
of a network of communal practices. Perhaps this will not produce a different product
from the views of recent metaethicists, but it will allow us to develop those views in
a slightly different light.
Bibliography
[1] Frank Arntzenius, Adam Elga, and John Hawthorne. Bayesianism, infinite decisions, and binding. Mind, 113(450):251–283, 2004.
[2] Simon Blackburn. Opinions and chances. In D. H. Mellor, editor, Prospects for Pragmatism, pages 175–96. Cambridge University Press, 1980.
[3] Glenn W Brier. Verification of forecasts expressed in terms of probability. Monthly weather review, 78(1):1–3, 1950.
[4] Aaron Bronfman. A gap in Joyce’s argument for probabilism. Unpublished manuscript.
[5] Rudolf Carnap. Logical foundations of probability. 1950.
[6] David Christensen. Dutch-book arguments depragmatized: Epistemic consistency for partial believers. Journal of Philosophy, 93(9):450–479, 1996.
[7] Bruno de Finetti. Theory of Probability. New York: John Wiley, 1970.
[8] Andy Egan, John Hawthorne, and Brian Weatherson. Epistemic modals in context. In G. Preyer and G. Peter, editors, Contextualism in Philosophy, pages 131–170. Oxford University Press, 2005.
[9] Lina Erickson and Alan Hajek. What are degrees of belief? Studia Logica, 86(2):185–215, 2007.
[10] Matthew Evans and Nishi Shah. Mental agency and metaethics. Oxford studies in metaethics, 7:80–109, 2012.
[11] Richard Foley. The epistemology of belief and the epistemology of degrees of belief. American Philosophical Quarterly, 29(2):111–124, 1992.
[12] Keith Frankish. Partial belief and flat-out belief. In Huber and Schmidt-Petri, editors, Degrees of Belief. Synthese Library, 2009.
[13] Hilary Greaves and David Wallace. Justifying conditionalization: Conditionalization maximizes expected epistemic utility. Mind, 115(459):607–632, 2006.
[14] Ian Hacking. The emergence of probability. Cambridge University Press, 1975.
[15] Alan Hajek. What conditional probabilities could not be. Synthese, 137:273–323, 2003.
[16] Alan Hajek. Arguments for–or against–probabilism? The British Journal for the Philosophy of Science, 59(4):793–819, 2008.
[17] Alan Hajek. A puzzle about degree of belief, 2010.
[18] James Joyce. A nonpragmatic vindication of probabilism. Philosophy of Science, 65(4):575–603, 1998.
[19] James Joyce. Accuracy and coherence: Prospects for an alethic epistemology of partial belief. In Huber and Schmidt-Petri, editors, Degrees of Belief. Synthese Library, 2009.
[20] John Maynard Keynes. A treatise on probability. 1921.
[21] Andrej Nikolaevic Kolmogorov. Foundations of the theory of probability. 1950.
[22] Angelika Kratzer. Modality. In A. von Stechow and D. Wunderlich, editors, Semantics: An International Handbook of Contemporary Research, pages 639–50, 1991.
[23] David Lewis. A subjectivist’s guide to objective chance. In Ifs, pages 267–297. Springer, 1981.
[24] John MacFarlane. Epistemic modals are assessment sensitive. In Andy Egan and B. Weatherson, editors, Epistemic Modality. Oxford University Press, 2009.
[25] John MacFarlane. Assessment sensitivity: Relative truth and its applications. Oxford University Press, 2014.
[26] Patrick Maher. Joyce’s argument for probabilism. Philosophy of Science, 69(1):73–81, 2002.
[27] Huw Price. Does ‘probably’ modify sense? Australasian Journal of Philosophy, 61(4):396–408, 1983.
[28] Frank Plumpton Ramsey. Truth and probability. The foundations of mathematics and other logical essays, pages 156–198, 1931.
[29] Glenn Shafer. A mathematical theory of evidence, volume 1. Princeton University Press, 1976.
[30] Nishi Shah and J. David Velleman. Doxastic deliberation. The Philosophical Review, pages 497–534, 2005.
[31] Tamina Stephenson. Judge dependence, epistemic modals, and predicates of personal taste. Linguistics and Philosophy, 30(4):487–525, 2007.
[32] Eric Swanson. Interactions with Context. PhD thesis, MIT, 2006.
[33] Erno Teglas, Vittorio Girotto, Michel Gonzalez, and Luca L. Bonatti. Intuitions of probabilities shape expectations about the future at 12 months and beyond. Proceedings of the National Academy of Sciences, 2007.
[34] Paul Teller. Conditionalization and observation. Synthese, 26(2):218–258, 1973.
[35] Stephen Toulmin. The Uses of Argument. Cambridge University Press, 2003.
[36] Kai von Fintel and Anthony S. Gillies. ‘Might’ made right. In Andy Egan and Brian Weatherson, editors, Epistemic Modality. Oxford University Press, 2009.
[37] Ralph Wedgwood. The aim of belief. Nous, 36(s16):267–297, 2002.
[38] Ralph Wedgwood. Doxastic correctness. In Aristotelian Society Supplementary Volume, volume 87, pages 217–234. Wiley Online Library, 2013.
[39] Donald Cary Williams. The ground of induction. 1963.
[40] Fei Xu and Vashti Garcia. Intuitive statistics by 8-month-old infants. Proceedings of the National Academy of Sciences, 2008.
[42] Seth Yalcin. Non-factualism about epistemic modality. In Andy Egan and Brian Weatherson, editors, Epistemic Modality. Oxford University Press, 2011.