Forthcoming in Semantics and Pragmatics. Penultimate
version.
On the semantics and pragmatics of epistemic vocabulary
Sarah Moss
ssmoss@umich.edu
There has been much recent debate over the correct semantics for
epistemic vocabulary, i.e. expressions like the sentential
operators in sentences such as:
(1) John might be in his office.
(2) John must be in his office.
(3) John is probably in his office.
(4) If John is in the building, he is in his office.
This paper explores a rich source of data for theories of this
vocabulary. The debate over the viability of standard truth
conditional theories has called attention to the distinctive
behavior of epistemic vocabulary in eavesdropping judgments,
indicative suppositions, and statements of disagreement and
retraction. But extant accounts are not sufficiently sensitive to
distinctive features of the way in which epistemic vocabulary
interacts with other epistemic vocabulary. If we start by studying
the behavior of simple nested epistemic modals, we may naturally
build a theory that explains the more complicated behavior of
epistemic modals under disjunction and over indicative conditionals,
and even the puzzling effects of embedding epistemic vocabulary in
classically valid arguments. In §1, I make unifying observations
about the suggestive behavior of epistemic vocabulary in each of
these contexts, extracting several desiderata for semantic and
pragmatic theories.
1. Thanks to Fabrizio Cariani, Josh Dever, Cian Dorr, John
Hawthorne, Eric Swanson, Brian Weatherson, and an anonymous referee
for feedback on drafts of this paper. Thanks also to the University
of Chicago Linguistics and Philosophy Workshop, the University of
Michigan Linguistics and Philosophy Workshop, Ohio State
University, and the 24th Semantics and Linguistic Theory
Conference (SALT 24) for helpful discussion.
In §2–3, I develop a semantics for epistemic vocabulary. This
semantics constitutes a rather dramatic alternative to standard
truth conditional theories, as it assigns sets of probability
measures rather than sets of worlds as semantic values. I aim to
demonstrate that what my theory lacks in conservatism is made up
for by its strength. In §4, I argue that combined with a novel
pragmatics, my semantic theory can account for the distinctive
linguistic behavior observed in §1. The theory I defend
thereby addresses several challenges raised in recent literature.
For instance, the theory answers concerns about epistemic modals
under disjunction raised in Schroeder 2012. The theory explains why
epistemic vocabulary produces invalid instances of classically
valid arguments, shedding light on important puzzles raised for
constructive dilemma arguments in Kolodny & MacFarlane 2010 and
modus tollens arguments in Yalcin 2012b.
1. Data for a theory of epistemic vocabulary
A careful examination of the behavior of epistemic modals yields
several desiderata for a theory of epistemic vocabulary. A few of
these desiderata have been discussed elsewhere, usually as puzzles
concerning epistemic modals. A number of the desiderata make
trouble for extant semantic theories. The literature on epistemic
modals is so vast that it would be impractical to argue against
every alternative to my preferred theory here. For considerations of
space, I set aside the possibility of resuscitating the standard
truth conditional semantics for epistemic vocabulary, since
persuasive arguments against that semantics have been discussed at
length elsewhere.2 I point out potential challenges for other
prominent theories in passing, but the main focus of this paper is
the exposition and development of a positive case for my own
theory.
1.1. Nested epistemic vocabulary
Nested epistemic vocabulary occurs in many forms in ordinary
conversation. For example, suppose Alice and Bob are both candidates
for certain job positions. We may naturally talk about Bob using
epistemic adjectives under epistemic operators:
(5) Alice is a likely hire, and Bob might be a likely hire.
(6) Alice is a possible hire, and Bob is probably also a
possible hire.
2. For instance, see the implications of triviality results
discussed in Edgington 1995, the discussion of the subject matter of
indicative conditionals in Bennett 2003, the “speaker inclusion
constraint” in Egan et al. 2005 and Weatherson 2008, the case of the
missing car keys in Swanson 2006 and von Fintel & Gillies 2011,
the eavesdropping cases in Egan 2007, the discussion of embedding
behavior in Yalcin 2007, the discussion of inference patterns in
Yalcin 2010, the discussion of assertability and disagreement in
Yalcin 2011, and the discussion of retraction and disputes in
MacFarlane 2011.
And we could further spell out the above observations as
follows:
(7) It is likely that we will hire Alice, and we might also be
likely to hire Bob.
(8) We might hire Alice, and it is probably the case that we
might hire Bob too.3
Both epistemic modals and epistemic comparative adjectives can
occur in the scope of indicative conditionals, and vice versa:
(9) If they did not hire Alice, they are more likely to have
hired Bob than Carl.4
(10) It is more likely than not that the vase broke if he
dropped it on concrete.
In addition, there are well-known examples of right-nested and
left-nested indicatives:
(11) If a Republican wins the election, then if it’s not Reagan
who wins it will be Anderson. (McGee 1985, 462)
(12) If the cup broke if it was dropped, it was fragile.
(Gibbard 1981, 237)
And finally, there are attested uses of nested epistemic
expressions occurring in short succession:
(13) She could not but think [that] Wentworth was not in love
with either. They were more in love with him; yet there it was not
love. It was a little fever of admiration; but it might, probably
must, end in love with some.5
(14) The time is now near at hand which must probably determine,
whether Americans are to be, Freemen, or Slaves.6
In wordy constructions such as (7) and (8) as well as condensed
constructions such as (13) and (14), we are intuitively using nested
epistemic modals to say something different from what we would use
single modals to say. For example, intuitively (5) says something
different about Bob than it says about Alice:
(5) Alice is a likely hire, and Bob might be a likely hire.
To take another simple example, (15) intuitively says something
different about Bob from either (16) or (17):
(15) It is probably the case that Bob is a possible hire.
3. It cannot be taken for granted that both modals in these
constructions are genuinely epistemic. However, in the next section
of this paper, I give several arguments against the claim that one
can always provide embedded modals with non-epistemic
interpretations.
4. Hacquard & Wellwood 2012 give attested cases of epistemic
vocabulary in indicative antecedents, while arguing that pragmatic
considerations may limit the distribution of epistemic vocabulary
in indicative antecedents and similar linguistic contexts.
5. Austen 1818, p. 55; italics added.
6. George Washington’s address to the Continental Army before the
Battle of Long Island, 27 August 1776; italics added.
(16) It is probably the case that Bob is a hire.
(17) Bob is a possible hire.
In particular, our judgments suggest that (15) is weaker than
either (16) or (17). Believing (16) is intuitively sufficient
reason to bet at even odds that we will hire Bob, whereas merely
believing (15) is not. Evidence for the semantic difference
between (15) and (17) comes from direct intuitions about what we use
these sentences to talk about. In particular, nested epistemic
modals are often used when you do not yet have some settled opinion
on some question. If you say that Bob is a possible hire, it sounds
as if you know that we might hire Bob. By contrast, if you merely
say that it is probably the case that Bob is a possible hire, it
sounds as if you have not yet settled on an opinion about Bob.
Either Bob is a possible hire, or he isn’t, and you are
more inclined to side with the former opinion.
Relatedly, subjects sometimes report that they can easily make
sense of nested epistemic modals by imagining that the speaker has
several sources of information about their prejacent, and she is not
sure which source she should trust. For instance, suppose we survey
several equally informed experts about whether we might hire Bob. If
most say that we might hire Bob and just a couple of experts
disagree, then it is natural to form the opinion that it is probably
the case that we might hire Bob. And analogous generalizations hold
for other uses of nested epistemic modals. To comment on the example
(14) above: if you say that some battle must probably be decisive,
it sounds as if whatever settled opinion you may eventually have
about the importance of the battle, you will settle on an opinion
according to which the battle is probably decisive. It is easy to
make sense of this state by imagining that you have several sources
of information about whether the battle will be decisive, where
each source agrees that the battle is at least more likely than not
to be decisive.
According to naïve orthodoxy, when someone utters a declarative
sentence, you should add its content to your stock of full beliefs.
But as theorists have developed alternatives to full belief models
of mental states, many have argued that what we say reflects what we
think according to these more intricate models. For instance,
some have claimed that epistemic modals are used to communicate
partial beliefs.7 At a first glance, it may appear that sentences
containing nested epistemic modals are used to communicate even more
intricate mental states. In particular, according to
imprecise credence models, you are associated with multiple
probability measures when you are unsettled as to how likely various
propositions are, exactly as you might be when you are unsure what
source of information you should trust. Rothschild 2012 argues
that
7. See §2 for further discussion, and see Swanson 2012 for a
recent catalog of relevant literature.
epistemic modals are used to communicate these sorts of
imprecise credal states. The theory I develop does not model
subjects as having imprecise credences. But whether or not we adopt
the sort of semantics Rothschild defends, it is important that
our theory account for intuitive judgments that naturally lend
themselves to that proposal. In other words, the above discussion
highlights an important goal for any theory of epistemic vocabulary.
This is our first desideratum: our theory should explain why nested
epistemic modals signal that different opinions about some subject
are in play. Relatedly, our theory should explain why we sometimes
easily make sense of embedded modals by imagining that a speaker
bases her opinions on multiple sources of information.
A second desideratum for our theory of epistemic vocabulary is
inspired by Yalcin 2007. Yalcin points out that our theory of
epistemic possibility modals should explain why conjunctions of
pairs of sentences such as (18) and (19) sound bad, and why such
conjunctions continue to sound bad when embedded under indicative
supposition, as in (20) and (21):
Some detectives are discussing the identity of a certain masked
murderer.
(18) It is not John.
(19) It might be John.
(20) #Suppose it is not John and it might be John.
(21) #If it is not John and it might be John. . .
Along the same lines, note that not only is it bad to assert
(18) and (19) together, but it is difficult to imagine a single
circumstance in which you could correctly utter either of these
sentences individually. If you would be correct in uttering (18) in
some circumstance, then it is difficult to imagine how you could
simultaneously be just as correct in uttering (19).
In this last respect, (18) and (19) stand in striking contrast
to a similar pair of sentences, namely sentences that resemble (18)
and (19), but where the embedded sentence is replaced with a
sentence containing epistemic vocabulary:
(22) It is not the case that it is probably John.
(23) It might be the case that it is probably John.
It is possible to imagine a single circumstance in which you
could correctly utter either (22) or (23). For instance, suppose you
simply cannot make up your mind about how likely it is that the
masked murderer is John. A few experts believe it is probably
John, but a majority of experts believe it is probably Mary. In this
case, you might correctly
use (22), insofar as you would side with the majority of experts
if forced to choose one suspect. But you might also correctly use
(23), insofar as you refuse to simply ignore the minority expert
opinion. Here different frames of mind are relevant to your imagined
utterances: (22) reflects your opinion after collating the advice
of your expert advisors, while (23) reflects the fact that you are
still not sure which experts you should trust. And of course,
neither frame of mind vindicates the assertion of both
sentences:
(24) #It is not the case that it is probably John and it might
be the case that it is probably John.8
These judgments yield a second desideratum for our theory of
epistemic vocabulary: our theory should explain why in certain
circumstances, we could correctly utter either (22) or (23), though
we could not correctly utter their conjunction.
A third desideratum comes from a final observation about nested
modals, namely that the strength of the outer modal often reflects
the weight of your evidence and resilience of your opinion about
the prejacent of the inner modal. For example: suppose that Liem
likes wearing green shirts. His dad Eric has observed the color of
his shirt on 800 consecutive days. Liem was wearing green on 500 of
those days. His friend Madeleine has observed the color of his shirt
on 8 consecutive days. Liem was wearing green on 5 of those days.
Suppose that Eric and Madeleine have not yet seen what Liem is
wearing today. Both Eric and Madeleine have .625 credence that
Liem is wearing green, and both might guess that Liem is probably
wearing green. But it seems more appropriate for Madeleine to assert
(25) or (26), whereas Eric is intuitively licensed in asserting
(27):
(25) It might be probable that Liem is wearing green.
(26) In fact, I’m fairly confident that he is probably wearing
green.
(27) Liem is definitely likely to be wearing green.
The assertability of (27) tracks two differences between Eric
and Madeleine. Eric bases his credences about Liem on more evidence.
In addition, his high credence that Liem is wearing green is more
resilient. Joyce 2005 argues that in a number of
evidential situations, “weight of evidence manifests itself in the
resilience of credences in the face of new data” (166). In the above
situation, both evidential weight and credal resilience are
manifested in the strength of the modal that embeds (28):
(28) Liem is probably wearing green.
8. A less stilted but equally infelicitous version of the
sentence: ‘John isn’t a probable killer and might be a probable
killer’.
Suppose you have a relatively uninformed hunch that Liem is
probably wearing green. In other words, suppose that your high
credence that Liem is wearing green is not justified by much
evidence. Then you are intuitively licensed in asserting (25),
but not (27). As you acquire more and more evidence, your high
credence that Liem is wearing green will become more and more
resilient, and you may embed (28) under stronger and stronger
epistemic modals. Hence our third desideratum: our theory
of epistemic vocabulary should explain this intuitive connection
between nested modals, evidential weight, and credal resilience.
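
The contrast in resilience between Eric and Madeleine can be illustrated with a small calculation. The sketch below rests on a simplifying assumption that is not part of the case as described: each observer sets her credence to the observed relative frequency of green days. The function names are hypothetical labels chosen for illustration. On that assumption, three further days on which Liem does not wear green barely move Eric's credence, while they push Madeleine's below one half.

```python
from fractions import Fraction


def frequency_credence(green_days, observed_days):
    """Credence that Liem wears green, set to the observed relative
    frequency (a simplifying assumption, not a claim of the paper)."""
    return Fraction(green_days, observed_days)


def credence_after_non_green(green_days, observed_days, extra_non_green_days):
    """Credence after observing further days on which Liem does not wear green."""
    return frequency_credence(green_days, observed_days + extra_non_green_days)


# Evidence as described in the case: 500 of 800 days versus 5 of 8 days.
eric_before = frequency_credence(500, 800)
madeleine_before = frequency_credence(5, 8)

# Hypothetical new data: three consecutive non-green days.
eric_after = credence_after_non_green(500, 800, 3)
madeleine_after = credence_after_non_green(5, 8, 3)

print(float(eric_before), float(madeleine_before))  # 0.625 0.625
print(float(eric_after), float(madeleine_after))    # roughly 0.623 and 0.455
```

The two observers start with the same credence, but only Eric's survives the new data as a credence above one half, which is one way of picturing why a stronger outer modal seems appropriate for him.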
All three of the above desiderata pose challenges for several
extant theories of epistemic modals. For example, consider the
following standard dynamic semantic entries for epistemic
possibility and necessity modals:9
c[♦φ] = {w ∈ c : c[φ] ≠ ∅}
c[□φ] = c \ {w ∈ c : (c \ c[φ]) ≠ ∅}
From these definitions, we can derive that c[□♦φ] = c[♦φ]. Hence
according to this semantics, any string of possibility and necessity
modals is equivalent to its innermost modal. Some dynamic
semanticists explicitly embrace this result, claiming that
“embedding an epistemic modal under another epistemic modal does
not in general have any interesting semantic effects” (Willer 2013,
12). The same result holds for a prominent competitor of the
dynamic semantic proposal, namely the semantics defended in Yalcin
2007. As Yalcin explains: “iterating epistemic possibility
operators adds no value on this semantics. . . This may explain why
iterating epistemic possibility modals generally does not sound
right, and why, when it does, the truth-conditions of the result
typically seem equivalent to ♦φ. I will generally ignore iterated
epistemic modalities” (994). It is difficult to see how semantic
proposals in this spirit could successfully explain the pervasive
nature of nested modals, much less account for their distinctive
behavior.
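
The collapse of nested modals on the dynamic entries above can be checked mechanically on a small model. The following is a minimal sketch, assuming that contexts are finite sets of worlds and that an atomic sentence updates a context by intersection. The names atom, might, and must are hypothetical labels of my own, not notation from the cited literature.

```python
from itertools import combinations

WORLDS = frozenset({"w1", "w2", "w3"})  # a toy set of worlds (assumption)


def atom(true_worlds):
    """Update for an atomic sentence: keep the worlds where it is true."""
    return lambda c: c & true_worlds


def might(phi):
    """c[might phi] = {w in c : c[phi] is nonempty}: a test on the context."""
    return lambda c: c if phi(c) else frozenset()


def must(phi):
    """c[must phi] = c minus {w in c : c minus c[phi] is nonempty}."""
    return lambda c: c - (c if (c - phi(c)) else frozenset())


def all_contexts(worlds):
    """Every subset of the toy set of worlds, including the empty context."""
    items = sorted(worlds)
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]


phi = atom(frozenset({"w1"}))

# Nesting a modal over 'might phi' never changes the update, in any context.
must_might_collapses = all(
    must(might(phi))(c) == might(phi)(c) for c in all_contexts(WORLDS))
might_might_collapses = all(
    might(might(phi))(c) == might(phi)(c) for c in all_contexts(WORLDS))

print(must_might_collapses, might_might_collapses)  # True True
```

Because both modal updates are tests that return either the whole context or the empty set, stacking them adds nothing, which is the prediction at issue.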
1.2. Against contextualist re-interpretations of nested
epistemic vocabulary
The most substantive recent attempt at a more responsive
semantics for nested epistemic modals appears in Yalcin 2009,
where Yalcin admits that sometimes nested modals do “allow for
coherent interpretations not equivalent to corresponding
expression with the most narrow modal. The latter case is not
provided for by the above semantics. In such cases I would be
inclined to appeal to tacit shifting of the
9. For canonical instances of semantic proposals along these
lines, see Stalnaker 1970, Veltman 1996, Beaver 2001, von Fintel
& Gillies 2008b, and Willer 2013.
information state parameter, akin to free indirect discourse”
(21). For further elaboration, we are directed to the following
passage in Yalcin 2007: “interpretation may involve a tacit shift in
the information parameter. . . to the target state of
information for the context. Aside from Gricean considerations of
charitable interpretation, it is not obvious whether general
principles are involved in the interpretation of such tacit shifts”
(1013). It is difficult to know exactly what is intended by these
brief suggestions, and hence my arguments so far may be understood
as an invitation to develop these suggestions into a theory that
satisfies the desiderata given above.
A natural development of these suggestions might say that in any
sentence where nested modals occur, the prejacent of the outer modal
receives the same boring sort of semantic value as any simple
declarative sentence. For instance, one might assimilate sentences
such as (27) with sentences about particular probability functions,
such as (29) or (30):
(27) It is almost certainly the case that Liem is probably
wearing green.
(29) It is almost certainly the case that the objective chance
that Liem is wearing green is high.
(30) It is almost certainly the case that my epistemic
probability that Liem is wearing green is high.
However, there are many reasons to be skeptical of this
approach. Recall that recent literature has provided a host of
reasons to reject the claim that the prejacent (28) is equivalent to
some simple declarative sentence like (31) or (32):
(28) Liem is probably wearing green.
(31) The objective chance that Liem is wearing green is
high.
(32) My epistemic probability that Liem is wearing green is
high.
The crucial dialectical point to appreciate is that analogous
concerns tell against the equivalence of these same sentences when
they are embedded under epistemic vocabulary. For example, it is
suspiciously difficult to say exactly what salient
probability function (27) is talking about. In the case described
above, Eric can utter (27). But he cannot utter (29), because Eric
knows that the objective chance that Liem is wearing green is either
0 or 1, and Eric is not almost certain of the latter. Madeleine
cannot utter (27). But she can utter (30), because she knows that
her inductive evidence confirms the claim that Liem is wearing
green. Hence neither (29) nor (30) accurately paraphrases (27).
Furthermore, eavesdroppers may explicitly target the prejacent
of (27) and correctly evaluate it relative to their epistemic
situation. For instance, if I have just seen
Liem wearing a red shirt and I overhear Eric utter (27), it
would be pedantic but nevertheless acceptable for me to say:
(33) That isn’t almost certain; it’s just false. It’s not the
case that Liem is probably wearing green: he is wearing red.
A notorious dilemma for truth-conditional accounts replays
itself here: if Eric was using ‘probably’ just to talk about his own
evidential situation, then I am not licensed in saying ‘it’s false’
in judging the prejacent of (27). On the other hand, if Eric
was using ‘probably’ to talk about some evidence that included my
evidence, then he was not licensed in uttering (27) to begin
with.10
In fact, nearly every argument against a uniform truth
conditional theory of all epistemic modals yields an analogous
argument against a uniform truth conditional theory of all embedded
epistemic modals. Bennett 2003 may argue that any
alleged paraphrases of (27) fail to capture its intuitive subject
matter, for instance. Bennett argues that when someone utters an
indicative conditional, “common sense and the Ramsey test both
clamour that [she] is not assuring me that her value for a
certain conditional probability is high, but is assuring me of that
high value. . . She aims to convince me of that probability, not the
proposition that it is her probability” (90). Yalcin 2011 adds that
the reasons that I give in support of my utterance ‘it might
be raining’ concern the first-order proposition that it is raining,
rather than any contextually determined body of evidence. Both
Bennett and Yalcin could complain that (27) intuitively concerns
Liem, rather than any contextually determined body of evidence.
Another challenge comes from Yalcin 2007. If embedded modals are
always interpreted relative to some salient probability function,
then we lack an explanation for the infelicity of sentences such
as:
(34) #Probably, it is raining and might not be raining.
(35) #It is unlikely that it is both raining and might not be
raining.
(36) #It might be that it is both raining and might not be
raining.
These judgments are not accommodated by expressivist,
relativist, or dynamic theories that resort to assigning simple
semantic contents to embedded modal constructions.
In addition, it is worth noting that if we reinterpret the
prejacent of (27) as having straightforward truth conditions, we are
still left with the problem of interpreting (37-b) in the following
dialogue:
10. This is just the first step in an involved dialectic. For
further discussion of eavesdropping arguments against
truth-conditional accounts of epistemic vocabulary, see Egan et al.
2005, Egan 2007, Hawthorne 2007, von Fintel & Gillies 2008a,
Yalcin & Knobe 2010, and MacFarlane 2011.
(37) a. David: Is Liem probably wearing green?
b. Eric: Almost certainly.
Familiar arguments challenge the claim that the unembedded
(37-a) has straightforward truth conditions. Furthermore, it is
difficult to see why Gricean considerations should demand that we
interpret (37-a) as containing free indirect discourse or a tacitly
shifted information parameter. Hence it seems we must find some way
of interpreting (37-b) without appealing to such strategies. One
would expect the resulting understanding of (37-b) to provide some
similar understanding of (27), namely an alternative semantics
that recognizes that ‘Liem is probably wearing green’ need
not express a possible worlds content, whether it is embedded in a
question or under further epistemic vocabulary. To sum up: it is
not obvious that extant semantic theories can explain the behavior
of nested epistemic modals. A natural way of developing potential
explanations on behalf of recent expressivist, relativist, and
dynamic theories meets with several challenges. Hence the behavior
of nested epistemic modals should motivate us to look for
alternative semantic theories.
1.3. Epistemic vocabulary under disjunction
A fourth desideratum for our theory of epistemic vocabulary is
inspired by Schroeder 2012. Schroeder argues that a semantic theory
should not predict that you can assert a disjunction only if you can
assert one of its disjuncts, even in special cases where disjuncts
are stipulated to be governed by wide-scope epistemic modals.
Schroeder points out several reasons why this prediction would be
bad. Here is one example:
Last night Shieva calls me to express frustration with the paper
that she is working on, and tells me that if she hasn’t finished by
this morning, she’s going to consult her magic 8-ball about whether
to give up and follow its advice. Since I know that most of the
answers on her magic 8-ball are positive, when I recall our
conversation from last night, I conclude that either Shieva finished
her paper by this morning, or she probably gave up. (21–2)
In this case, the speaker can correctly assert ‘Shieva finished
or probably gave up’ without being able to assert either disjunct.
Similarly, you can correctly assert (38) about the result of
throwing a fair die, without being able to assert either
disjunct:
(38) It is less than four or probably even.
In this respect, disjunctions embedding epistemic vocabulary are
just like ordinary disjunctions of simple sentences. In fact,
asserting a disjunction usually implicates that you are not in a
position to assert either disjunct. There is something
especially peculiar about disjunctions embedding epistemic
vocabulary, though. Even if you
can deny one disjunct and you cannot assert the other, you may
still be able to assert the entire disjunction. For instance: you
can assert (38) even though you can deny the second disjunct by
itself, and you cannot assert the first. This does not hold
for disjunctions without epistemic vocabulary. If you can deny one
half of a simple disjunction, then disjunctive syllogism
ordinarily proves that the remaining disjunct is equivalent to the
entire disjunction, so one is not assertable without the other.
This brings us to our fourth desideratum: our theory should explain
this surprising difference between simple disjunctions and
disjunctions containing epistemic vocabulary.
A semantics for ‘or’ is missing from Yalcin 2007, 2011, 2012b,
and related papers. Hence the relevant challenge for Yalcin is to
state a semantics that predicts the behavior just described.11
Substantially more progress has been made on disjunction in the
dynamic semantics literature. In fact, a number of dynamic accounts
of disjunction satisfy our fourth desideratum. According to these
accounts, natural language disjunction is not commutative. Roughly
speaking, the second half of a disjunction is not interpreted
relative to a global context, but rather relative to a local
context that has been updated with the negation of the first
disjunct. This sort of account aims to give a uniform explanation of
the local interpretation of ‘probably’ in (38) and
local satisfaction of licensing conditions for pronouns in
disjunctions such as the following famous example from Roberts
1989:
(39) Either there is no bathroom in this house, or it is in a
funny place.
Just as the licensing conditions for ‘it’ in (39) are satisfied
in a local context where the first disjunct is false, values of
contextual parameters in the second disjunct of (38) are provided by
a local context where the first disjunct is false. This explains
why you may assert (38) even when you can deny its second disjunct
uttered in isolation. The disjunction is felicitous because its
second disjunct is acceptable in all contexts where the negation of
the first disjunct is given.
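
The local-context explanation of (38) can be made concrete for the fair die. The sketch below is only an illustration and relies on two assumptions of my own: that ‘probably’ is acceptable in a context just in case its prejacent has probability greater than one half there, and that updating with the negation of the first disjunct amounts to conditioning a uniform distribution on the remaining faces. The function names are hypothetical.

```python
from fractions import Fraction

FACES = frozenset(range(1, 7))
THRESHOLD = Fraction(1, 2)  # assumed threshold for 'probably'


def probability(event, context):
    """Probability of an event under a uniform distribution on the context."""
    return Fraction(len(event & context), len(context))


def probably(event, context):
    """Whether 'probably event' is acceptable in the context (assumed rule)."""
    return probability(event, context) > THRESHOLD


less_than_four = frozenset({1, 2, 3})
even = frozenset({2, 4, 6})

global_context = FACES
# Local context for the second disjunct of (38): the first disjunct is false.
local_context = FACES - less_than_four

print(probably(even, global_context))  # False: the chance of even is 1/2
print(probably(even, local_context))   # True: the chance of even is 2/3
```

On these assumptions, the second disjunct of (38) is deniable in isolation yet acceptable in the local context, as the dynamic account requires.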
This dynamic account predicts that natural language disjunction
is not commutative, and fans of this account often claim this
predicted failure of commutativity as a benefit. For instance, they
claim that a semantics for natural language disjunction should
entail that (40) sounds bad even though (38) sounds fine:
(38) It is less than four or probably even.
(40) It is probably even or less than four.
11. Schroeder extrapolates a semantics for ‘or’ from Yalcin 2007
and criticizes that semantics for validating ‘or’ exportation.
However, it is not clear that we should want our semantics to
predict this difference between (38) and (40).12 For instance, there
are a number of contexts in which (40) seems just as good as (38),
namely contexts in which certain partitions of logical space are
salient. Consider the following case:
Alice just rolled a fair die and hid it under a cup in front of
me. I see a blue cup and a red cup. The die is under the blue cup if
it landed on a four, five, or six. The die is under the red cup if
it landed on a number less than four.
Bob offers me a pair of bets. For one dollar, he will sell me a
bet that pays five dollars if the die landed on an even number. For
another dollar, he will sell me a bet that pays five dollars if the
die landed on a number less than four. I am very risk averse, and I
do not always bet to maximize expected returns. But staring first at
the blue cup and then at the red cup, I judge that I would be
comfortable accepting both bets, since, as I put it, “either it is
probably even, or less than four.”
The circumstances of the above case call attention to a certain
partition of logical space: either the die landed on a number less
than four, or it landed on a higher number. Against this background,
my utterance of (40) seems perfectly correct.13
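
The arithmetic behind the two-cup case can be laid out explicitly. The sketch below assumes, beyond what the case states, that ‘pays five dollars’ means a gross payout of five dollars with the one dollar stake not returned. The function names are hypothetical. It tabulates the net return of buying both bets for each face of the die, and the chance of winning at least one bet within each cell of the salient partition.

```python
from fractions import Fraction

STAKE = 1   # price of each bet, as stated in the case
PAYOUT = 5  # assumed to be the gross payout of a winning bet

RED_CUP = (1, 2, 3)   # the die landed on a number less than four
BLUE_CUP = (4, 5, 6)  # the die landed on a four, five, or six


def bets_won(face):
    """Number of the two bets (even; less than four) that the face wins."""
    return int(face % 2 == 0) + int(face < 4)


def net_return(face):
    """Net dollars from buying both bets, given how the die landed."""
    return bets_won(face) * PAYOUT - 2 * STAKE


def chance_of_winning_something(cell):
    """Chance, within a cell of the partition, of winning at least one bet."""
    winners = sum(1 for face in cell if bets_won(face) > 0)
    return Fraction(winners, len(cell))


returns = {face: net_return(face) for face in range(1, 7)}
expected_return = Fraction(sum(returns.values()), 6)

print(returns)          # {1: 3, 2: 8, 3: 3, 4: 3, 5: -2, 6: 3}
print(expected_return)  # 3
print(chance_of_winning_something(RED_CUP))   # 1
print(chance_of_winning_something(BLUE_CUP))  # 2/3
```

Under the red cup the second bet is a sure winner, and under the blue cup the first bet is more likely than not to win, which matches the partition-relative reading of (40).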
In fact, some disjunctions like (40) sound fine without heavy
contextual cues. For instance, you can assert any of the following
disjunctions, even if you can deny the first disjunct and cannot
assert the second:
(41) It’s either unlikely he was being honest with you, or he
just wanted you to think that he was lying.
(42) The next United States president will either almost
certainly attempt to repeal a lot of Barack Obama’s policies, or
they will be a Democrat with more liberal views than Obama has.
(43) John is probably playing baseball, or it has been raining
all afternoon.
These disjunctions seem to mean the same thing regardless of the
order in which their disjuncts are uttered. In fact, they might just
as well be written with their disjuncts arranged in a circle,
without detriment to our ability to understand or evaluate
them. This yields a fifth desideratum for our theory of epistemic
vocabulary: our theory should explain why disjunctions such as (40)
sound infelicitous in some contexts and felicitous in others. And
our theory should explain why reversing disjunct order does not
affect the interpretation of disjunctions in contexts where they
sound felicitous.
12. The commutativity of disjunction is controversial even among
advocates of dynamic semantic theories. For instance, Schlenker
2009 and Rothschild 2011 both provide theories according to
which disjunction is commutative; their accounts are sympathetic
with my discussion of the fifth desideratum.
13. Some readers may find it difficult to evaluate the
artificial speech described above, especially since the salience of
an objective chance function may introduce noise in our judgments.
The essential point of the present discussion is that contextual
cues may make certain readings of epistemic vocabulary
available. See §4.5 for more natural illustrations and a more
detailed defense of this point.
This fifth desideratum should give us pause before we endorse a
semantic theory that explicitly entails that natural language
disjunction is not commutative. Furthermore, the above dynamic
explanation for why we can assert (38) seems insufficiently general,
since it does not explain why we can sometimes assert (40)–(43).
The dynamic proposal outlined above says that we can sometimes
assert a disjunction like (38) when its second disjunct is deniable
and its first disjunct is unassertable. But (40)–(43) are all
sometimes assertable even when their first disjuncts are deniable
and their second disjuncts are unassertable. According to the
dynamic explanation, (38) is felicitous because its second disjunct
is acceptable in all contexts where the negation of the first
disjunct is given. But for any of (40)–(43), the second disjunct is
not acceptable even in contexts where the negation of the first
disjunct is given. For example, the negation of the first disjunct
of (40) is already given in an ordinary context where a fair die is
rolled, but the second disjunct of (40) is not acceptable in that
context:
(40) It is probably even or less than four.
To sum up: several observations raise challenges for several
extant dynamic semantic accounts of the assertability of
disjunctions. In particular, differences in the assertability of
(38) and (40) seem sensitive to contextual factors, such as the
salience of various alternative sets. This should motivate us to
doubt theories that derive differences in assertability from
context-insensitive semantic rules. Pragmatic theories are better
designed to account for the distinctive behavior of disjunctions
embedding epistemic vocabulary.
1.4. Epistemic vocabulary over indicatives
A sixth desideratum for a theory of epistemic vocabulary is
inspired by an example in chapter 9 of Lycan 2001, which itself
builds on a related discussion of subjunctive conditionals in Slote
1978. Consider the following case:
Jill is standing on the roof of your office building. The local
fire department occasionally hangs a net along the roof to protect
workers doing construction. The net is strong enough to safely catch
anyone who falls off the building. Just a few hours ago, you
happened to notice that there was no net along the roof. As a
result, you do not believe that Jill is going to jump off the roof.
Jill is a thrill-seeker who might jump into a net for fun, but she
definitely does not have a death wish. And without a net, anyone who
jumped off the roof would surely fall to the ground and die.
On the one hand, since you believe that there is no net along
the roof, you are intuitively justified in asserting:
(44) Probably, if Jill jumps off the building, she will die.
On the other hand, you are confident that Jill does not have a
death wish. If you were informed that Jill jumped off the building,
you would immediately conclude that the local fire department must
have installed a net since you last checked the roof. With that
information in the front of your mind, you are intuitively
justified in denying (44) and asserting:
(45) Probably, if Jill jumps off the building, she will live.
To make these observations more vivid, suppose someone asks you
whether there is a net along the roof of the building. They may well
know that you promised the fire department that you wouldn’t go
around telling people whether or not there was a net along the roof,
but they may still persist in pestering you for information. It is
intuitively fine for you to respond:
(46) I cannot answer your questions directly. But I can tell you
this much: it is really likely that if Jill jumps off this building,
she will die.
On the other hand, suppose someone asks you whether you believe
that Jill is suicidal. Again, they may well know that you promised
Jill that you wouldn’t go around telling people about her mental
state, but they may persist in pestering you for information.
Suppose that it is common ground that anyone suicidal would simply
cut away any safety net and jump off the building in question. It is
intuitively fine for you to respond:
(47) I cannot answer your questions directly. But I can tell you
this much: it is really likely that if Jill jumps off this building,
she will live.
Hence the assertability of (44) does not depend only on your
opinions about Jill and the net, which we may stipulate are the same
when you utter (46) and (47). It must also be sensitive to some
factor that varies between these contexts of utterance. As with many
other examples we have considered, you are considering different
questions in these different contexts, and which question you are
considering seems relevant to which utterances are felicitous.
Suppose you are considering the question of whether there is a net
along the roof. Then since you believe that there is probably no
net, you may say that it is probably the case that if Jill jumps
from the roof, she will die. Suppose you are considering the
question of whether Jill is suicidal. Then since you believe that
she is probably not suicidal, you may say that it is probably the
case that if Jill jumps from the roof, she will live. The sixth
desideratum: our theory of epistemic vocabulary should explain this
variation in the assertability conditions of (44).
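The dependence on the question under consideration can be checked on a toy model. The following Python sketch is only an illustration under stated assumptions: the particular credences, the encoding of worlds, and all function names are hypothetical, and the rule it uses for ‘probably, if A then C’ (more than half of your credence falls on cells of the salient partition within which your credence in C given A is 1) is an assumed stand-in rather than a semantic entry stated in this section.

```python
from fractions import Fraction as F

# Hypothetical credences over worlds (net, suicidal, jumps, dies).
# Assumed story: a suicidal Jill cuts away any net, jumps, and dies; a
# non-suicidal Jill jumps only into a net, and then lives.
CREDENCES = {
    (True,  True,  True,  True):  F(1, 200),
    (False, True,  True,  True):  F(9, 200),
    (True,  False, True,  False): F(19, 400),
    (True,  False, False, False): F(19, 400),
    (False, False, False, False): F(171, 200),
}

NET, SUICIDAL, JUMPS, DIES = range(4)


def prob(event, given=lambda w: True):
    """Credence in `event` conditional on `given` (None if undefined)."""
    base = sum(p for w, p in CREDENCES.items() if given(w))
    if base == 0:
        return None
    return sum(p for w, p in CREDENCES.items() if given(w) and event(w)) / base


def probably_if(antecedent, consequent, partition):
    """Assumed rule: over half of the credence lies on cells accepting the conditional."""
    weight = F(0)
    for cell in partition:
        conditional = prob(consequent, lambda w: cell(w) and antecedent(w))
        if conditional == 1:
            weight += prob(cell)
    return weight > F(1, 2)


jumps = lambda w: w[JUMPS]
dies = lambda w: w[DIES]
lives = lambda w: not w[DIES]

# The two questions, modelled as two-cell partitions of the worlds.
net_question = [lambda w: w[NET], lambda w: not w[NET]]
suicide_question = [lambda w: w[SUICIDAL], lambda w: not w[SUICIDAL]]

dies_given_net_question = probably_if(jumps, dies, net_question)
lives_given_net_question = probably_if(jumps, lives, net_question)
lives_given_suicide_question = probably_if(jumps, lives, suicide_question)
dies_given_suicide_question = probably_if(jumps, dies, suicide_question)
```

On these assumed numbers, a single credence distribution supports ‘probably, if Jill jumps, she will die’ relative to the question of whether there is a net, and ‘probably, if Jill jumps, she will live’ relative to the question of whether Jill is suicidal, but not the other way around.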
There is no obvious mechanism for explaining this variation in
many extant theories of epistemic vocabulary. The semantic values
for ‘probably’ and ‘if’ given in Veltman 1996 and Yalcin 2012b do
not depend on contextually determined parameters. An advocate of
these semantic proposals might attribute variation in the
interpretation of (44) to scope ambiguity. At the level of logical
form, ‘probably’ might take scope over the entire indicative
conditional in (44) or just over its consequent. But this does not
seem like a plausible explanation of the behavior of (44), since
context not only affects our interpretation of (44), but also our
interpretation of the unembedded indicative conditional (48):
(44) Probably, if Jill jumps off the building, she will die.
(48) If Jill jumps off the building, she will die.
The unembedded conditional is borderline assertable when we are
focusing on whether there is a net along the roof, but definitely
unassertable when we are focusing on whether Jill is suicidal. These
judgments suggest that the interpretation of the indicative itself
depends on contextually determined parameters.
A related challenge arises when we embed sentences like (44) in
indicative conditionals. If we are talking about whether there is
a net, you can correctly assert:
(49) If it is probably the case that Jill will live if she
jumps, then there is a net.
If we are talking about whether Jill is suicidal, you can
correctly assert:
(50) It is probably the case that Jill will live if she
jumps.
However, you can never correctly assert:
(51) There is a net.
These judgments make trouble for certain semantic theories.
Several dynamic and expressivist theories say something roughly like
the following: you believe a sentence when your credal state accepts
it. And an information state accepts a conditional when the closest
state that accepts its antecedent also accepts its consequent.
Since you believe (50), your actual credal state accepts the
antecedent of (49). Hence your actual credal state is the credal
state closest to yours that accepts that antecedent. Since you
believe the conditional (49), we should conclude that your actual
credal state also accepts its consequent (51). But this conclusion
seems clearly false.14
14. In order to keep my discussion as general as possible, I
will not use this formula to construct objections for particular
theories. The interested reader should combine the discussion of
attitude verbs in §7 of Yalcin 2007 with the semantics for ‘if’ and
‘probably’ in the appendix of Yalcin 2012b. For dynamic theories,
combine the standard dynamic semantics for attitude verbs in Heim
1992 with the dynamic semantics for ‘if’ and ‘probably’ developed in
§4 of Gillies 2004, §10 of Gillies 2010, or the appendix of Yalcin
2012b, replacing “closest credal state to yours that accepts the
antecedent” with “result of updating your credal state on the
antecedent” in my discussion above.
The complex conditional (49) gives rise to our seventh
desideratum: our theory should explain its assertability conditions.
This is not a trivial endeavor. First, our theory must assign
semantic contents to indicatives whose antecedents embed both graded
epistemic vocabulary and other indicatives. Second, our theory must
explain how your beliefs can support asserting (49) in some contexts
and (50) in others, without ever supporting (51). These facts
intuitively depend on the context sensitivity of (49) and (50), and
relevant contextual factors intuitively include facts about what
questions are salient when each is uttered.
1.5. Epistemic vocabulary in classically valid arguments
The seventh desideratum also directs us toward one final
category of useful observations. If you believe both (49) and
(50), it might seem that you could apply modus ponens and infer that
there is a net along the roof. But you are not licensed in
believing that there is a net along the roof. The final three
desiderata concern instances of classically valid argument forms
that seem invalid in virtue of containing epistemic vocabulary.
Suppose Carlos has rolled a fair die without telling us how it
landed. A fair die has three low numbers and three high numbers.
Suppose we are considering the following argument about the number
Carlos rolled:
(52) a. If it is low, it is probably odd.
     b. It is not probably odd.
     c. Hence: it is not low.
This argument seems like an instance of modus tollens. But it
also seems invalid. The first premise seems correct, since 2 out of
3 of the low numbers are odd. The second premise seems correct,
since it is just as likely that an even number was rolled as an odd
number. But these premises do not justify our accepting the
conclusion, since we have no idea whether a low number was rolled.
Several authors have made similar observations about apparent
instances of modus tollens containing epistemic modals.15
This raises a puzzle: should we say that (52) is not an instance
of modus tollens, that (52) is valid, or that some instances of
modus tollens are not valid? This brings us to our eighth
desideratum: our theory of epistemic vocabulary should solve this
puzzle. At a minimum, our theory should come equipped with a notion
of consequence that yields a verdict about whether (52) is valid.
And whether or not it is valid, our theory should predict the
apparent invalidity of instances of modus tollens containing
epistemic vocabulary.
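The arithmetic behind the judgments about (52) can be checked directly. The following sketch assumes a uniform credence over the six faces and reads ‘probably’ as credence greater than .5; the helper names are hypothetical.

```python
from fractions import Fraction

FACES = range(1, 7)          # a fair six-sided die
LOW = {1, 2, 3}              # the three low numbers
ODD = {1, 3, 5}


def credence(event, given=None):
    """Uniform credence in `event`, optionally conditional on `given`."""
    pool = [n for n in FACES if given is None or n in given]
    return Fraction(sum(1 for n in pool if n in event), len(pool))


odd_given_low = credence(ODD, given=LOW)   # relevant to premise (52-a)
odd = credence(ODD)                        # relevant to premise (52-b)
low = credence(LOW)                        # relevant to conclusion (52-c)

premise_a_holds = odd_given_low > Fraction(1, 2)
premise_b_holds = not odd > Fraction(1, 2)
conclusion_supported = low == 0
```

On these assumptions both premises come out acceptable, yet the credence that the number is low is 1/2 rather than 0, which matches the judgment that the premises do not justify the conclusion.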
15. For related discussion, see Carroll 1894, Veltman 1985,
Cantwell 2008, and especially Yalcin 2012b.
Here is another apparently invalid argument about the number
rolled:
(53) a. If it is low, it is probably odd.
     b. If it is high, it is probably even.
     c. It is either low or high.
     d. Hence: either it is probably odd or probably even.
Kolodny & MacFarlane 2010 discuss similar arguments, including the
following:
(54) a. Either the butler did it or the nephew did it.
     b. If the butler did it, the murder must have occurred in the
        morning.
     c. If the nephew did it, the murder must have occurred in the
        evening.
     d. Hence: either the murder must have occurred in the morning
        or it must have occurred in the evening.
These arguments seem like instances of constructive dilemma. But
they also seem invalid. For instance, just as it seems incorrect
to say that the number rolled is probably even, it seems incorrect
to say it is probably odd. So in the absence of any special
contextual cues, it seems incorrect to say that the number rolled
is either probably even or probably odd. It is neither probably even
nor probably odd, but just as likely to be one or the other. This
brings us to our ninth desideratum: our theory should say whether
(53) is valid. And whether or not it is valid, our theory should
predict the apparent invalidity of instances of constructive
dilemma containing epistemic vocabulary.
Similar problems arise not just for modus tollens and
constructive dilemma, but also for disjunctive syllogism:
(55) a. It is low or probably even.
     b. It is not probably even.
     c. Hence: it is not low.
And contraposition of indicative conditionals:
(56) a. If it is low, it is probably even.
     b. Hence: if it is not probably even, it is not low.
Furthermore, it seems entirely appropriate to give similar
explanations for the apparent invalidity of these inferences.
Kolodny & MacFarlane 2010 and Yalcin 2012b, for instance, defend
semantic theories according to which each of the relevant
inference rules is literally invalid. In fact, Kolodny and
MacFarlane go so far as to say that modus ponens itself is an
invalid rule of inference.
Anyone rejecting classically valid inference rules bears the
burden of explaining why we successfully use them in ordinary
reasoning. The easiest way to discharge this burden is by proving
that the rules are indeed valid when restricted to premises of a
certain form. At a minimum, setting aside complications involving
adverbs of quantification, it seems our theory should predict that
arguments are valid when they contain no epistemic vocabulary at
all. This condition raises an important question, namely exactly
which arguments containing epistemic vocabulary are valid.
Kolodny & MacFarlane 2010 defend inferences involving
conditionals whose consequents do not contain any epistemic
vocabulary. However, some inferences involving conditionals whose
consequents contain epistemic vocabulary are intuitively valid as
well. For instance, Yalcin 2012b suggests that the following
inference is valid:
(57) a. If the marble is big, then it might be red.
     b. It is not the case that it might be red.
     c. Hence: it is not big.
In addition, some probabilistic inference rules are intuitively
valid, and some of those rules govern indicatives with consequents
embedding epistemic vocabulary. In fact, we just considered
inferences of this sort in §1.4. The following inference licenses
my saying (58-c) when discussing whether there is a net along the
roof:
(58) a. Probably, there is no net along the roof.
     b. If there is no net along the roof, then if Jill jumps, she
        will die.
     c. Hence: probably, if Jill jumps, she will die.
And the following licenses my saying (59-c) when discussing
whether Jill is suicidal:
(59) a. Probably, Jill is not suicidal.
     b. If Jill is not suicidal, then if Jill jumps, she will live.
     c. Hence: probably, if Jill jumps, she will live.
This brings us to our tenth and final desideratum for a theory
of epistemic vocabulary. Insofar as our theory says that standard
inference rules are generally invalid, it should explain why
substantial classes of restricted rules appear to be genuinely
valid. In particular, our theory should explain why (57), (58), and
(59) are apparently valid, even though these inferences are riddled
with epistemic vocabulary.
2. A basic semantics for epistemic vocabulary
Before stating specific semantic entries, it will be helpful to
outline the basic idea of the semantic theory itself. Recall that in
a certain context, you may correctly describe the outcome of rolling
a fair die by saying:
(40) It is probably even or less than four.
The imagined context of (40) is somewhat contrived. In
particular, the context is contrived to make a certain partition of
logical space especially salient. The partition has two elements:
either the number rolled is low, or it is high. As a result, there
are also two kinds of salient credence distributions when you utter
(40). First, you have conditional credences, conditional on the
partition propositions. For example, you have higher than .5
credence that the number rolled is even, conditional on it being
high. Second, you have a credence distribution over the partition
propositions themselves. For example, you have .5 credence that the
number rolled is high. In other words, there are various opinions
you might have after learning some information from the contextually
salient partition. And on top of that, you have some opinions about
the likelihood of each bit of information that you could learn.
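The two kinds of credence distribution just distinguished can be made concrete. The sketch below assumes a uniform credence over the six faces and takes the low numbers to be one through three; the function and variable names are hypothetical.

```python
from fractions import Fraction

FACES = [1, 2, 3, 4, 5, 6]
EVEN = {2, 4, 6}
PARTITION = {"low": {1, 2, 3}, "high": {4, 5, 6}}   # the salient partition

# Assumed uniform credences for a fair die.
credences = {n: Fraction(1, 6) for n in FACES}


def credence(event):
    """Credence in a set of outcomes."""
    return sum(credences[n] for n in event)


def conditional_credence(event, given):
    """Credence in `event` conditional on the partition cell `given`."""
    return credence(event & given) / credence(given)


# Credences over the partition propositions themselves.
cell_credences = {name: credence(cell) for name, cell in PARTITION.items()}

# Credences in 'even' conditional on each partition proposition.
even_given_cell = {
    name: conditional_credence(EVEN, cell) for name, cell in PARTITION.items()
}
```

Here `cell_credences` corresponds to the opinions about each bit of information you could learn, and `even_given_cell` to the opinions you would have after learning it.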
A first pass at my semantics: the latter opinions are associated
with higher modals, while the former are associated with embedded
modals. For example, it would sound fine for you to say (60) in the
context mentioned above:
(60) It might well be that the number is probably even.
According to my semantics, that is roughly because you could
learn some salient information (namely, that the number rolled is
high) confirming an opinion that gives most of its credence to the
number rolled being even. To take another example, suppose that
you are torn between various ways of evaluating candidates for an
academic position. It is not clear how to weigh teaching experience
against research quality, for instance, and you are open to
information that would decide this question in different ways. In
spite of your indecision, you might say:
(61) It must be the case that Bob is a possible hire.
According to my semantics, that is roughly because as far as
your credences are concerned, any salient information would support
an opinion that gave at least some credence to Bob being hired.
Again, the embedded modal (‘possible’) is associated with your
credences conditional on various propositions (about ways of
evaluating candidates), while the higher modal (‘must’) is
associated with your credences in those propositions themselves.
According to a traditional account of assertion, an assertion is
“something like a proposal” (cf. Stalnaker 2010, 152), namely the
proposal that the content of the assertion be added to the
propositions taken for granted in the conversation. In a
paradigmatic case of assertion, you believe a proposition, you
assert some sentence with that proposition as its content, and as a
result, I come to believe that same proposition. This model of
assertion fits well with a certain model of our mental life,
according to which full beliefs are the opinions we have and the
opinions we want to share with each other. Meanwhile, theorists have
developed alternate models of our mental life in which degreed
beliefs play a central role. It is natural to wonder whether we can
update our account of assertion to fit these more sophisticated
models.
The updated account: an assertion is like a proposal, not about
a proposition that you should believe, but rather about a property
that your credences should have. It is still true that in a
paradigmatic case of assertion, you have an opinion, you assert some
sentence with that opinion as its content, and as a result, I come
to have the same opinion. But the relevant opinions are degreed. In
other words, having an opinion amounts to having credences with a
certain property. The content of a declarative sentence is a
property that credences can have. Formally, contents are sets of
probability measures. In a paradigmatic case of assertion, when
you assert a sentence with a certain content, I come to have a
credence distribution that is contained in that content. For
instance, you may assert a sentence whose content is the set of all
measures that assign probability greater than .5 to the
proposition that it is raining. On hearing your assertion, I will
come to have more than .5 credence that it is raining. Following
Swanson 2006, we may conceive of the content of a sentence as a
constraint on credences, namely the constraint that my credences
generally end up satisfying on hearing your assertion of that
sentence.
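As a minimal sketch of this idea, a content can be represented by a membership test on probability measures, since the set of measures itself is infinite. The finite set of worlds, the particular credences, and the names below are assumptions made for illustration.

```python
from fractions import Fraction

# A measure is modelled as a dict from worlds to probabilities (an assumption
# of this sketch); a proposition is a set of worlds.
RAIN = {"rain_heavy", "rain_light"}


def probability(measure, proposition):
    """Probability that `measure` assigns to `proposition`."""
    return sum(p for world, p in measure.items() if world in proposition)


def content_probably_raining(measure):
    """Membership test for the set of measures giving rain more than .5."""
    return probability(measure, RAIN) > Fraction(1, 2)


# Hypothetical credences of a speaker and of a hearer before the assertion.
speaker = {
    "rain_heavy": Fraction(1, 2),
    "rain_light": Fraction(1, 4),
    "dry": Fraction(1, 4),
}
hearer_before = {
    "rain_heavy": Fraction(1, 10),
    "rain_light": Fraction(1, 10),
    "dry": Fraction(4, 5),
}

speaker_in_content = content_probably_raining(speaker)
hearer_in_content = content_probably_raining(hearer_before)
```

The speaker's credences are contained in the content, while the hearer's prior credences are not; on the updated account, accepting the assertion amounts to the hearer moving to some credence distribution that passes the membership test.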
Sentences containing epistemic vocabulary are context sensitive.
In other words, which set of measures is the content of a sentence
depends on what context you are using the sentence in. In
particular, context contributes partitions of logical space to the
semantic values of such sentences. The contextually determined
partitions make the contents of sentences more interesting. A second
pass at the heart of my semantics: some asserted contents are
straightforward constraints on credences, such as assigning
greater than .5 credence to some particular proposition. But
asserted contents can also constrain your credences to have more
fine-grained properties. In particular, they can constrain the
structure of your credences with respect to propositions in
non-trivial contextually determined partitions. The content of a
sentence containing nested epistemic modals will be a constraint
having to do with your credences in those propositions, and also
with your credences conditional on those propositions. Higher modals
correspond to the former sort of constraint, while embedded modals
correspond to the latter.
2.1. A semantics for logical operators
In addition to formal semantic entries, it will be useful to
have some shorthand for saying what expressions mean. Let us say
that your credences satisfy the constraint that a certain
proposition accepts that it is probably raining just in case it is
probably raining according to your credences conditional on that
proposition, or in other words, just in case your conditional
credences are contained in the content that it is probably raining.
If context determines a partition of logical space, we can quantify
over the members of that partition as if they were each identified
with different people. For instance, given a contextually determined
partition, let us say that your credences satisfy the shorthand
constraint that someone accepts that it is probably raining just in
case some proposition in the partition accepts that it is probably
raining. In general, let us say your credences satisfy the
constraint that someone accepts a particular content just in case
there is some proposition in the partition such that your credences
given that proposition are contained in that particular content.
Your credences satisfy the constraint that everyone accepts a
content just in case every proposition in the partition is such
that your credences given that proposition are contained in that
content. And so on. Rather than always explicitly describing your
credences conditional on propositions in a contextually determined
partition, we have a handy shorthand that captures the sense in
which your credences conditional on different partition elements
often correspond to different states of opinion that you have not
yourself decided between. In a rough sense, one may imagine the
shorthand expressions ‘someone’ and ‘everyone’ as quantifying over
different sides of yourself.16
Now for the semantics. In contrast with a number of extant
theories, it is straightforward to start with a semantics for all
basic logical operators, including natural language disjunction. For
instance: your credences are contained in the content of a
disjunction just in case every proposition in the corresponding
contextually determined partition is such that your credences
conditional on that proposition are contained in the content of
one of the disjuncts. The semantic entries for ‘and’ and ‘not’ are
predictable variants. In shorthand:
‘S or T’ means that everyone accepts that S or accepts that T.
‘S and T’ means that everyone accepts that S and accepts that T.
‘not S’ means that no one accepts that S.17
16. In what follows, I often simplify my discussion by just
talking about whether certain partition elements accept a certain
constraint. It should be understood that strictly speaking, whether
a proposition accepts a constraint is relative to a measure, e.g.
that your credences may satisfy the constraint that someone accepts
that it is probably raining, while my credences fail to satisfy
this same constraint.
In more formal vocabulary:
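$[\![\text{or}_i]\!]^c = [\lambda S \,.\, \lambda T \,.\, \{m : \forall p \in g_c(i),\ m|p \in S \text{ or } m|p \in T\}]$
$[\![\text{and}_i]\!]^c = [\lambda S \,.\, \lambda T \,.\, \{m : \forall p \in g_c(i),\ m|p \in S \text{ and } m|p \in T\}]$
$[\![\text{not}_i]\!]^c = [\lambda S \,.\, \{m : \forall p \in g_c(i),\ m|p \notin S\}]$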
A number of notes about the formal vocabulary are in order. The
variable p ranges over sets of worlds, and m ranges over probability
measures. The measure m|p is the result of conditionalizing the
measure m on the proposition p. Let us stipulate that S is the
semantic type of sets of measures. In the above entries, the
variables S and T range over values of type S. The logical operators
‘and’ and ‘or’ have semantic values of type 〈S, 〈S, S〉〉, whereas
‘not’ has a semantic value of type 〈S, S〉. For example, the content
of a disjunction is a set of measures, as is the content of each
disjunct.
Exactly which set of measures is the content of a disjunction
depends on what partition context contributes to its content.
Following Heim & Kratzer 1998, we say that every context c
determines an assignment function $g_c$ that specifies the values of
all contextually determined variables. The value $g_c(i)$ is the
contextually determined partition relevant to the semantic entry
spelled out above. The shorthand expression ‘everyone’ corresponds
to the formal expression ‘$\forall p \in g_c(i)$’, which quantifies
over propositions in that partition. In what follows, I use both
shorthand and formal vocabulary, as the former allows me to make my
arguments intuitive, while the latter allows me to make them
precise.
In slightly less formal vocabulary, the semantic value of ‘S or
T’ is the set of measures m satisfying the following condition: for
any proposition p in the relevant contextually determined partition,
m|p is either contained in the content of the first disjunct or in
the content of the second disjunct. For example, recall that in
some contexts where you have equal credence in each possible outcome
of rolling a fair die, it sounds okay for you to say:
(40) It is probably even or less than four.
As mentioned earlier, the sort of context that is hospitable for
(40) makes a certain partition salient: either the number rolled is
low, or it is high. According to your credences conditional on it
being low, the number is less than four. According to your credences
conditional on it being high, the number is probably even. Hence
your credences satisfy the content of (40), namely that everyone in
the contextually determined partition either accepts that the number
rolled is probably even or accepts that it is less than four. In a
nutshell: you believe (40), and that is why it sounds okay for you
to say it.
17. I use ‘not’ as shorthand for ‘it is not the case that’ and I
treat this expression as an operator that occurs just before its
argument, though ultimately one should allow many other expressions
of sentential negation at surface structure. The analogous claims
hold for ‘might’, ‘must’, and ‘probably’.
This explanation is incomplete as it stands. For starters, a
complete explanation requires identifying the content of each
disjunct of (40) relative to the sort of context in question, so
that we may prove that your conditional credences are contained in
these contents. Appendix B.1 contains a complete explanation of
why your credence is in the content of (40), and §2.4 contains
further commentary. Another clarificatory note: the above semantic
values are custom-made for logical operators embedding epistemic
vocabulary. The theory I develop assigns more traditional semantic
values to logical operators elsewhere. The careful reader will
observe that according to this theory, logical operators embedding
epistemic vocabulary act essentially like epistemic vocabulary. This
observation is implausible unless restricted to logical operators
embedding epistemic vocabulary, so it is important to bear in mind
that more traditional semantic values for logical operators will be
revived in §3.
2.2. A semantics for epistemic possibility and necessity modals
Here are shorthand semantic entries for epistemic possibility
and necessity modals:
‘might S’ means that someone accepts that S.
‘must S’ means that everyone accepts that S.
In more formal vocabulary:
$[\![\text{might}_i]\!]^c = [\lambda S \,.\, \{m : \exists p \in g_c(i) \text{ such that } m|p \in S\}]$
$[\![\text{must}_i]\!]^c = [\lambda S \,.\, \{m : \forall p \in g_c(i),\ m|p \in S\}]$
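These two entries can be transcribed almost directly. In the sketch below, the die model, the assumed content for ‘probably even’ (credence greater than .5), the choice of the low/high partition as the contextually determined partition, and the decision to skip cells of zero probability are all assumptions made for illustration.

```python
from fractions import Fraction

FACES = (1, 2, 3, 4, 5, 6)
UNIFORM = {n: Fraction(1, 6) for n in FACES}
EVEN = {2, 4, 6}
LOW_HIGH = [{1, 2, 3}, {4, 5, 6}]   # assumed contextually determined partition


def conditionalize(m, p):
    """Return m|p, or None when m gives p no probability."""
    mass = sum(m[w] for w in p)
    if mass == 0:
        return None
    return {w: (m[w] / mass if w in p else Fraction(0)) for w in m}


def cells(m, partition):
    """The measures m|p for the cells p that m does not rule out (an assumption)."""
    return [mp for mp in (conditionalize(m, p) for p in partition) if mp is not None]


def might(partition, S):
    """Entry for 'might': some cell p has m|p in S."""
    return lambda m: any(S(mp) for mp in cells(m, partition))


def must(partition, S):
    """Entry for 'must': every cell p has m|p in S."""
    return lambda m: all(S(mp) for mp in cells(m, partition))


def probably_even(m):
    """Assumed content: measures giving 'even' more than .5."""
    return sum(m[w] for w in EVEN) > Fraction(1, 2)


might_probably_even = might(LOW_HIGH, probably_even)(UNIFORM)
must_probably_even = must(LOW_HIGH, probably_even)(UNIFORM)
```

With uniform credences, someone (the high cell) accepts that the number is probably even, but not everyone does, so the ‘might’ claim is satisfied and the ‘must’ claim is not.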
Having expanded our lexicon, we can outline a semantics for some
nested epistemic modals. For example, (62) and (63) each mean that
everyone accepts that someone accepts that Bob is the best candidate
for the job:
(62) It is definitely the case that Bob might be the best
candidate for the job.
(63) It must be the case that Bob might be the best candidate
for the job.
This shorthand calls attention to an important semantic feature:
higher and lower epistemic modals need not be associated with the
same domain of quantification. Both logical operators and modals
have indices. Assignment functions map expressions with different
indices to potentially different values. Hence unless expressions
are co-indexed, context may contribute different partitions to their
interpretation. For example, an utterance of (62) may contain modals
that are not co-indexed:
(64) It is definitely$_1$ the case that Bob might$_2$ be the best
candidate for the job.
The semantic value of (64) is as follows:
$[\![(64)]\!]^c = \{m : \forall p \in g_c(1),\ m|p \in \{m' : \exists q \in g_c(2) \text{ such that } m'|q \in [\![(65)]\!]^c\}\}$,
where (65) is the prejacent of the inner modal in (64):
(65) Bob is the best candidate for the job.
For instance, in a context where (64) is uttered, it could be
that the partition $g_c(1)$ contains propositions about what sorts
of virtues matter when evaluating candidates, while the partition
$g_c(2)$ contains propositions about which candidates have what
sorts of virtues. In that sort of context, your credences would
satisfy (64) just in case conditional on any proposition about what
virtues matter, your credences satisfy the following condition:
conditional on some proposition about which candidates have which
virtues, Bob is the best candidate for the job.
For those especially attentive to syntactic representation:
strictly speaking, our semantics could identify indexed variables as
arguments of modals and logical operators, rather than indexing
these expressions directly. For example, our formal semantic entry
for ‘must’ could be as follows, where v ranges over partitions:
$[\![\text{must}]\!]^c = [\lambda v \,.\, \lambda S \,.\, \{m : \forall p \in v,\ m|p \in S\}]$
In that case, (62) would contain two covert pronouns:
(66) It is definitely $v_1$ the case that Bob might $v_2$ be the
best candidate for the job.
Here the pronouns $v_1$ and $v_2$ denote partitions relative to
contexts, according to the familiar semantics for referential
pronouns, i.e. $[\![v_i]\!]^c = g_c(i)$. The resulting semantic
value of ‘must $v_i$’ matches the semantic value of ‘must$_i$’ given
above. The reader may replace expressions of the latter sort with
their kosher substitutes throughout.18
18. For simplicity, I will sometimes talk as if the contextually
supplied partition is the value of a covert pro-noun. But strictly
speaking, I am neutral about the best syntactic implementation of
my theory. Partee1989 and Condoravdi & Gawron 1996 have given
reasons to doubt that similar implicit arguments are
24
-
2.3. A small detour: advantages of constraining conditional
credences
Recall from §1.1 that our use of nested epistemic modals fits
naturally with the ideathat sentences constrain imprecise credal
states. This idea should seem even morecompelling given all the
shorthand just introduced. Suppose we model your mentalstate with a
set of probability measures. In other words, suppose we model youas
if you have an imaginary mental committee of subjects with precise
credences.Then following Rothschild 2012, we could say that
sentences constrain your mentalcommittee members, rather than your
conditional credences. If a sentence demandsthat everyone accepts a
content, for instance, that could just amount to demandingthat each
committee member accept that content. In other words, my
shorthandsemantic entries for ‘might’ and ‘must’ seem like apt
translations of the followingalternative formal semantic
entries:
[[might]] = [λS . {I : ∃m ∈ I such that m ∈ S}]
[[must]] = [λS . {I : ∀m ∈ I, m ∈ S}]
Here the variable m ranges over precise credal states, i.e.
probability measures, whileS and I range over imprecise credal
states, i.e. sets of probability measures. Thisproposal may appear
to satisfy many desiderata given in §1. It is worthwhile toreflect
on how my semantics differs from this proposal, and especially to
notice thatthe imprecise credence proposal is deficient in two
respects.
First, on the imprecise credence semantics stated above,
embedding a sentence under ‘might’ or ‘must’ raises its semantic
type. Each modal accepts sets of measures as inputs and delivers
sets of imprecise credal states as outputs. That means that a
sentence with a wide-scope ‘might’ or ‘must’ has the wrong semantic
type to be embedded under another epistemic modal, which is a bad
result given our pervasive use of embedded modals. The most natural
repair strategy requires that we model subjects as having not just
imprecise credences, but more complicated mental states. In fact,
very complicated mental states are required, since subjects
commonly embed epistemic vocabulary under embedded epistemic
vocabulary. For instance, recall that we have no trouble
understanding (49):
(49) If it is probably the case that Jill will live if she
jumps, then there is a net.
And deeper embeddings seem perfectly intelligible, as long as
the context is rich enough to supply the interpretations of relevant
expressions. For instance, (49) sounds fine when you are trying to
figure out whether there is a net along the roof of your
best analyzed as the values of covert pronouns, and I will not
evaluate their arguments in this paper.
office building. Suppose that the local fire department
occasionally puts a trampoline instead of a net along the roof. Then
we are not really licensed in saying that there is a net along the
roof, given just that it is probably the case that Jill will live
if she jumps. Instead, we should say something more hedged:
(67) Probably, if it is probably the case that Jill will live if
she jumps, then there is a net. (But it might be that there is a
trampoline.)
In light of (49) and (67), it is hard to imagine a reason for
ruling that embeddings of epistemic vocabulary beyond a certain
level of complexity are semantically uninterpretable. In the
absence of such a reason, our theory should deliver semantic values
for embeddings of arbitrary complexity. Hence in order to repair
the imprecise credence proposal, we would have to model subjects as
having not just sets of sets of measures as mental states, but sets
of sets of sets of measures, and so on. It is difficult to
independently motivate such an arcane model of our mental life.
Second, semanticists like Rothschild must endorse even more
complicated models of mental states in order to give a semantics
for graded modal vocabulary. It is easy to imagine existential or
universal quantification over members of an imaginary mental
committee. But graded modals call for probability measures over
committee members, and it is difficult to see how one could make
sense of this added structure within the imprecise credence model
without essentially describing subjects as having precise
credences.
The semantics I defend offers a viable alternative in the
neighborhood of the imprecise credence proposal. For starters, the
semantics extends naturally to graded modals, without requiring that
we represent subjects as having mental states more arcane than
ordinary credences. As a result, even though it is fairly
revisionary to say that contents of sentences are sets of measures
instead of sets of worlds, our model of contents can still be
defended on the grounds that it simply reflects an independently
motivated model of our mental life. In addition, according to our
semantics, ‘might’, ‘must’, and ‘probably’ are all type 〈S, S〉, and
‘if’ is type 〈S, 〈S, S〉〉. Hence complicated sentences like (67) have
well-defined semantic values.
Furthermore, our theory even has the resources to say why
complicated sentences like (67) might nevertheless sound bad when
uttered out of the blue. The same goes for many sentences containing
several referential pronouns. For instance, when uttered out of
the blue, (68) sounds questionable at best:
(68) ?That made that do that to that.
In particular, sentences with several referential pronouns sound
bad in isolation when there is a presumption that context will
determine different denotations for different pronouns. For
instance, (68) sounds worse than (69), just as the nested epistemic
vocabulary in (70) sounds worse than the repeated unembedded
vocabulary in (71):
(68) ?That made that do that to that.
(69) It entered; it saw me; it squealed; and it fainted.
(70) ?Probably, it is probable that probably Jill will probably
live.
(71) Jill will probably live; John will probably die; Janet will
probably cry; and Joe will probably celebrate.
Context often determines different denotations for pronouns in
sentences with nested epistemic modals. As a result, a rich context
is required for the simultaneous interpretation of the covert
pronouns in sentences such as (67) and (70). Here again, in contrast
with semantic injunctions against complicated embeddings, pragmatic
accounts better fit the contours of our judgments about epistemic
vocabulary.
2.4. A semantics for ‘probably’, ‘if’, and a covert type-shifting operator
The expression ‘probably’ has a more complicated semantic
function than possibility and necessity modals. The latter modals
constrain your credences conditional on propositions in a
contextually determined partition. But as a graded modal,
‘probably’ constrains your credences in members of the partition
itself:
[[probably_i]]^c = [λS . {m : m(⋃{p ∈ g_c(i) : m|p ∈ S}) > .5}]
In our shorthand: find the union of everyone that accepts that
S. If you give that proposition greater than .5 credence, then your
credences are contained in the content of ‘probably S’.19 For
example, recall that if we are talking about whether Jill is
suicidal, you can correctly assert:
(50) It is probably the case that Jill will live if she
jumps.
The partition relevant to the interpretation of ‘probably’ in
(50) contains two propositions: either Jill is suicidal or she
isn’t. Just one of these propositions accepts that Jill will live if
she jumps, namely the proposition that Jill isn’t suicidal.20 Since
you give
19. This semantics follows Kratzer 1991 in taking ‘probably’ to
indicate that something is more likely than not. It is
straightforward to adjust the definition so that ‘probably’ instead
indicates likelihood above a contextually defined threshold. In a
similar vein, it is straightforward to extend the lexicon of this
paper to include other simple epistemic vocabulary, such as
‘unlikely’, ‘at least .3 likely’, ‘more likely than’, and
comparative epistemic adjectives.
20. A reminder about our shorthand: your credences satisfy the
constraint that a proposition accepts a content just in case your
credences conditional on that proposition are contained in that
content.
more than .5 credence to that proposition, your credences are
contained in the content of (50), and that is roughly why it sounds
okay for you to say it.
At this point, we can also give a more complete explanation of
why the content of (40) contains your credences about the outcome of
rolling a fair die:
(40) It is probably_2 even or_1 less than four.
As mentioned earlier, the sort of context that is hospitable for
(40) makes a certain partition salient: either the number rolled is
low, or it is high. A second partition is also salient, namely the
six possible outcomes of rolling the die. The first partition
determines the content of ‘or’ and the second determines the content
of ‘probably’. If you conditionalize your credences on the
proposition that the number rolled is low, then you accept that the
number is less than four. If you conditionalize your credences on
the proposition that the number rolled is high, then you have equal
credence in each of the three high number outcomes. Hence you give
more than .5 conditional credence to the union of outcomes that
accept that the number rolled is even. That means your credences
conditional on the number being high accept that the number is
probably even. It follows from our semantics for ‘or’ that your
credences are in the content of (40), and that is roughly why it
sounds okay for you to say it.
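The reasoning just given can be checked mechanically. What follows is a minimal Python sketch under several assumptions that are not part of the text: the space of possibilities is restricted to the six die outcomes, a content is represented as a predicate on measures rather than as a set, the entry for ‘or’ is reconstructed from the shorthand "everyone accepts one disjunct or the other", and a partition cell with zero probability is treated as accepting nothing. The function names (prob, cond, accepts, lift, probably, or_) are illustrative only; lift plays the role of the operation that maps a proposition to the measures giving it full credence.

```python
from fractions import Fraction

LOW, HIGH = frozenset({1, 2, 3}), frozenset({4, 5, 6})
EVEN = frozenset({2, 4, 6})
# Hypothetical assignment of partitions to indices, following the context for (40).
G = {1: [LOW, HIGH],                             # partition for 'or'
     2: [frozenset({w}) for w in range(1, 7)]}   # partition for 'probably'

def prob(m, p):
    """Probability that measure m (a dict world -> Fraction) gives to proposition p."""
    return sum((m[w] for w in p), Fraction(0))

def cond(m, p):
    """m conditionalized on p, or None when m(p) = 0 (an assumption of this sketch)."""
    total = prob(m, p)
    if total == 0:
        return None
    return {w: (m[w] / total if w in p else Fraction(0)) for w in m}

def accepts(m, p, S):
    """Shorthand from the text: p 'accepts' S relative to m iff m|p is in S."""
    mp = cond(m, p)
    return mp is not None and S(mp)

def lift(p):
    """Content of a simple sentence: measures assigning probability 1 to p."""
    return lambda m: prob(m, p) == 1

def probably(i):
    def op(S):
        def content(m):
            cells = [p for p in G[i] if accepts(m, p, S)]
            return prob(m, frozenset().union(*cells)) > Fraction(1, 2)
        return content
    return op

def or_(i):
    # Reconstructed entry: every cell accepts S or accepts T.
    return lambda S, T: lambda m: all(
        accepts(m, p, S) or accepts(m, p, T) for p in G[i])

sentence_40 = or_(1)(probably(2)(lift(EVEN)), lift(LOW))

fair = {w: Fraction(1, 6) for w in range(1, 7)}
# A contrasting measure that puts all of its 'high' probability on 5.
loaded = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6),
          4: Fraction(0), 5: Fraction(1, 2), 6: Fraction(0)}

print(sentence_40(fair), sentence_40(loaded))
```

On this sketch the fair-die credences fall inside the content of (40), while the contrasting measure does not, since conditional on the number being high it gives no credence to an even outcome.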
Indicative conditionals are semantically like graded modals,
insofar as they also constrain your credences in propositions in
contextually determined partitions:
[[if_i]]^c = [λS . λT . {m : m(⋃{p ∈ g_c(i) : m|p ∈ T} | ⋃{p ∈ g_c(i) : m|p ∈ S}) = 1}]
In other words, using our shorthand: find the union of everyone
that accepts the antecedent of the conditional, and find the union
of everyone that accepts the consequent. If you have full credence
in the latter proposition conditional on the former, then your
credences are contained in the content of the conditional
itself.21
For example, consider the indicative conditional:
(72) If_1 it is high, it is probably_2 even.
The context of (72) makes a certain partition salient: either
the number rolled is low, or it is high. The former proposition
rejects the antecedent of the conditional, while the latter accepts
it. The former proposition also rejects the consequent of the
conditional, while the latter accepts it. Hence you have full
credence in the union of propositions that accept the consequent of
(72), conditional on the union of propositions that accept
21. A disclaimer: this semantics is sufficient to address the
motivating concerns of the present paper, but it is not my final
word on indicative conditionals. I defend an alternative
probabilistic semantics in Moss 2014, motivated by concerns that I
have bracketed for ease of exposition here.
the antecedent. It follows from our semantics for ‘if’ that your
credences are in the content of (72), and that is roughly why it
sounds okay for you to say it.
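The verdict about (72) can be illustrated in the same way. The following Python sketch makes the same simplifying assumptions as before and none of them come from the text: six die outcomes as the only possibilities, contents as predicates on measures, and zero-probability cells treated as accepting nothing. It further assumes that a conditional is not satisfied when the union of antecedent-accepting cells has probability zero. All names are illustrative.

```python
from fractions import Fraction

LOW, HIGH = frozenset({1, 2, 3}), frozenset({4, 5, 6})
EVEN = frozenset({2, 4, 6})
# Hypothetical assignment of partitions to indices, following the context for (72).
G = {1: [LOW, HIGH],                             # partition for 'if'
     2: [frozenset({w}) for w in range(1, 7)]}   # partition for 'probably'

def prob(m, p):
    return sum((m[w] for w in p), Fraction(0))

def cond(m, p):
    """m conditionalized on p, or None when m(p) = 0 (an assumption of this sketch)."""
    total = prob(m, p)
    if total == 0:
        return None
    return {w: (m[w] / total if w in p else Fraction(0)) for w in m}

def accepts(m, p, S):
    mp = cond(m, p)
    return mp is not None and S(mp)

def union_of_accepting(m, i, S):
    """Union of the cells of partition i that accept S relative to m."""
    return frozenset().union(*[p for p in G[i] if accepts(m, p, S)])

def lift(p):
    """Content of a simple sentence: measures assigning probability 1 to p."""
    return lambda m: prob(m, p) == 1

def probably(i):
    return lambda S: lambda m: prob(m, union_of_accepting(m, i, S)) > Fraction(1, 2)

def if_(i):
    def op(S, T):
        def content(m):
            antecedent = union_of_accepting(m, i, S)
            consequent = union_of_accepting(m, i, T)
            given = cond(m, antecedent)
            # Full credence in the consequent union, conditional on the antecedent union.
            return given is not None and prob(given, consequent) == 1
        return content
    return op

sentence_72 = if_(1)(lift(HIGH), probably(2)(lift(EVEN)))

fair = {w: Fraction(1, 6) for w in range(1, 7)}
loaded = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6),
          4: Fraction(0), 5: Fraction(1, 2), 6: Fraction(0)}

print(sentence_72(fair), sentence_72(loaded))
```

For the fair die, only the "high" cell accepts the antecedent and only the "high" cell accepts the consequent, so the conditional credence in question is 1. For the contrasting measure, no cell accepts the consequent, so the constraint fails.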
There is one important respect in which our theory so far is
incomplete. I have not yet given a semantics for simple sentences
such as:
(65) Bob is a hire.
(73) Jill jumps.
(74) The number rolled is high.
For instance, I have said certain partition propositions “accept
that the number rolled is high” or “accept the antecedent of ‘if it
is high, it is probably even’.” This is shorthand for a constraint
on probability measures, namely that after conditionalizing on the
partition proposition, the resulting measure is contained in the
content of (74). Hence simple sentences like (74) must have sets of
measures as their contents.
There is a natural way of associating simple sentences with sets
of measures. According to standard truth conditional semantic
theories, the content of a simple sentence is a set of worlds.
According to my theory, the content of a simple sentence is the set
of measures that assign probability 1 to that set of worlds.22 This
means that the theory need not start from scratch to deliver
semantic values for referring expressions, predicates, quantifiers,
and so forth. Instead, a covert operator converts traditional
semantic values into alternative semantic values:
[[C]]^c = [λp . {m : m(p) = 1}]
For example, the logical form of the sentence ‘Jill jumps’ is
more accurately represented as ‘C Jill jumps’. The semantic value
of this sentence is a set of measures, namely {m : m({w . Jill jumps
in w}) = 1}. Since simple sentences accompanied by the covert
operator C have sets of measures as semantic values, simple
sentences can be arguments of type 〈S, S〉 operators and type 〈S, 〈S,
S〉〉 operators.
Furthermore, arguments of logical operators can include both
simple sentences and sentences containing epistemic vocabulary. For
example, the logical form of (40) is more accurately represented as
follows:
(40) [ probably_2 [ C [ it is even ] ] ] or_1 C [ it is less than four ].
This detail lets us finally give a complete explanation of why
your credences are contained in the content of (40) in the context
described above. In our most recent
22. This content may seem inappropriate, since giving full
credence to some proposition is a very strong constraint. In short,
I have made some assumptions in order to simplify my discussion,
and refinements of the theory in §3 address this worry. For a more
thorough treatment of these issues, see chapter 2 of Moss 2014.
explanation of this fact, we said that “if you conditionalize
your credences on the proposition that the number rolled is low,
then you accept that the number is less than four.” The more
complete explanation replaces this with the following claim: if you
conditionalize your credences on the proposition that the number
rolled is low, then the resulting credence distribution has full
credence that the number is less than four. Fans of gory detail
should see Appendix B.1 for an explanation in formal
vocabulary.
To sum up so far: I have introduced a semantics for eight
expressions, including basic logical operators and epistemic
vocabulary. According to this theory, there is a sense in which
logical operators are epistemic vocabulary. If they occur in the
midst of epistemic modals, logical operators deliver constraints on
credences that depend on what is accepted by propositions in
contextually determined partitions. Assigning the same sort of
semantic values to logical operators and epistemic vocabulary helps
explain the behavior of the latter. The way that ‘might’ and ‘must’
and ‘probably’ interact with each other has a lot in common with
the way they interact with logical operators. According to my
theory, this is to be expected, as both are interactions between
different sorts of epistemic vocabulary.
3. A number of refinements and explanations
I have made three simplifying assumptions in developing the
semantics in §2. In order to refine the semantics, I will identify
these assumptions and say how they can be removed. The first is
about the standard effect of assertion, namely that when you hear an
assertion with a certain content, you generally come to have
credences contained in that content. This claim abstracts away from
lying, pretense, supposition, and so on. But more importantly, even
in normal cases of assertion, your credences do not really come to
be contained in asserted contents. The contents of sentences are
simply too strong to play that role. The content of a simple
sentence is the set of measures that assign probability 1 to some
proposition. But it is arguably almost never rational to have full
credence in a proposition. Having full credence in a proposition
makes you bet on that proposition at arbitrarily risky odds, and
makes your belief in that proposition rationally unrevisable by
conditionalizing on further evidence. In other words, it makes you
have blind faith in a proposition. Assertions rarely if ever have
such a dramatic effect.
It might be possible to answer this complaint by saying that our
theory governs ideal cases, and that ideal communication really does
make subjects have full credence in asserted contents. But even
this answer should be accompanied by some suggestions about the
effect of assertion in realistic cases. Here is one suggestion: as
far as the conversational record is concerned, an act of
assertion is a proposal that the content of the assertion be
accepted for conversational purposes. For example, suppose you
assert that it is raining. Then it will sound bad for either of us
to say or even suppose that it might not be raining:
(75) a. Alice: Oh no. It is raining.
     b. Bob: #If it might not be raining, we should buy some sunglasses.
If your assertion is not challenged or retracted, it does seem
that we accept its strong content for conversational purposes.
Having accepted that content, Alice and Bob do resemble subjects who
would accept bets at arbitrary odds, conversationally speaking, as
they cannot even raise the possibility that it is not
raining.23
In addition to affecting the conversational record, an assertion
affects conversational participants. An assertion does not exactly
affect your credences, but something more like your credences for
practical purposes. For example, an assertion of (75-a) may have the
effect that for practical purposes, it is just as if your credences
are contained in its content; that is, when it comes to your
preferences and decisions, it is just as if you have full credence
in the proposition that it is raining. This account of assertion is
designed to mimic contemporary accounts of full belief according to
which you believe a proposition when you can treat it as certain for
practical purposes. For instance, according to Weatherson 2005,
you believe a proposition roughly just in case conditionalizing on
that proposition changes none of your preferences over salient
options.24 The analogous account of assertion says you accept an
assertion just in case updating your credences on its content
changes none of your preferences over salient options. For example,
you accept (75-a) just in case updating on the proposition that it
is raining changes none of your preferences over salient options. In
other words, given the analogous account of full belief, you accept
(75-a) just in case you believe that it is raining. This seems like
exactly the right result, as assertions of simple sentences are
traditionally taken to constrain your full beliefs. To sum up: given
the above accounts of full belief and assertion, you accept an
assertion of a simple sentence just in case you believe its content.
Even if our accounts of full belief and assertion must ultimately be
modified, the latter will deliver intuitive results as long as it
mirrors the former.
The second simplifying assumption made in §2 is that logical
operators have just one semantic value each. In fact, my theory
requires a serious and significant revision
23. This effect of assertion on the conversational record is
elegantly explained by models on which the context set itself is
fine-grained. For further discussion, see the context probabilism
introduced in §8 of Yalcin 2007.
24. Cousins of this principle are defended by Williamson 2000,
Ganson 2008, Fantl & McGrath 2010, and Schroeder & Ross 2014.
of this assumption, namely that logical operators have different
types of semantic values, depending on whether they embed
non-epistemic or epistemic vocabulary. For example, the semantic
value of negation given in §2 must differ in type from the semantic
value of negation in simple sentences such as:
(76) John does not smoke.
For suppose (76) has the following logical form:
(77) not_1 [ C John smokes ]
Then according to the semantics for ‘not’ in §2, the content of
(76) contains your credences just in case there is no proposition in
the relevant partition such that you have full credence that John
smokes, given that proposition. This is not at all what (76)
intuitively means. For many partitions, it is very easy for your
credences to satisfy this constraint, even if you have a relatively
high credence that John smokes. It should intuitively be much
harder for your credences to be contained in the content of (76).
In fact, in light of our semantics for other sentences without
epistemic vocabulary, the content of (76) should intuitively be the
set of measures that assign probability 1 to the proposition that
John does not smoke.
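A small numerical case makes the mismatch vivid. The following Python sketch is illustrative only: the four worlds, the two-cell partition, and the particular credences are hypothetical choices, contents are represented as predicates on measures, and the entry for the §2 negation is reconstructed from the paraphrase just given (no cell of the partition accepts the embedded content). Names such as lift and not_section2 are not from the text.

```python
from fractions import Fraction

# Four hypothetical worlds; John smokes in worlds 0 and 1.
WORLDS = frozenset({0, 1, 2, 3})
SMOKES = frozenset({0, 1})
NOT_SMOKES = WORLDS - SMOKES
# A hypothetical salient partition that cuts across whether John smokes.
PARTITION = [frozenset({0, 2}), frozenset({1, 3})]

def prob(m, p):
    return sum((m[w] for w in p), Fraction(0))

def cond(m, p):
    """m conditionalized on p, or None when m(p) = 0 (an assumption of this sketch)."""
    total = prob(m, p)
    if total == 0:
        return None
    return {w: (m[w] / total if w in p else Fraction(0)) for w in m}

def accepts(m, p, S):
    mp = cond(m, p)
    return mp is not None and S(mp)

def lift(p):
    """Measures assigning probability 1 to the proposition p."""
    return lambda m: prob(m, p) == 1

def not_section2(partition):
    # Reconstructed §2-style negation: no cell accepts the embedded content.
    return lambda S: lambda m: not any(accepts(m, p, S) for p in partition)

# Credences that give probability 9/10 to John smoking.
m = {0: Fraction(9, 20), 1: Fraction(9, 20), 2: Fraction(1, 20), 3: Fraction(1, 20)}

form_77 = not_section2(PARTITION)(lift(SMOKES))   # not [ C John smokes ]
intended = lift(NOT_SMOKES)                       # probability 1 that John does not smoke

print(prob(m, SMOKES), form_77(m), intended(m))
```

Here the credences give 9/10 to the proposition that John smokes, yet they satisfy the constraint delivered by (77), because neither cell of the partition makes you certain that he smokes. They are correctly excluded from the intended content.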
The appropriate refinement of our semantics involves
distinguishing between logical operators that embed epistemic
vocabulary and logical operators that embed simple sentences. A
simple sentence actually has a set of worlds as its semantic value,
which can serve as the argument of a covert type-raising operator.
This covert operator need not occur immediately above every simple
sentence. In our refined semantics, logical operators can have sets
of worlds as arguments. In addition to reinstating traditional
semantic values for simple sentences, we reinstate traditional
semantic values for logical operators, adding these values to those
introduced in §2. Hence logical operators have different semantic
values in different linguistic contexts: traditional values when
their arguments are sets of worlds, and our §2 semantic values when
their arguments are sets of measures. The logical form of ‘John does
not smoke’ is (78) rather than (77):
(77) not_1 [ C John smokes ]
(78) C [ not John smokes ]
The sentence under the covert operator has a set of worlds as
its semantic value, namely the proposition that John does not smoke.
Hence the content of (76) is the set of measures that assign
probability 1 to that proposition, as desired.
This refinement of our semantics addresses several other
potential problems as well. For instance, suppose the logical form
of (79) is given by (80):
(79) John smokes or Jill drinks.
(80) [ C John smokes ] or_1 [ C Jill drinks ]
Then if the content of (79) contains your credences, there must
be some contextually determined partition such that conditional on
each proposition in the partition, you either have full credence
that John smokes or full credence that Jill drinks. But intuitively
you can utter a disjunction even if no such propositions would make
you sure of either disjunct. In addition, our semantics should
predict that the following inference is valid:
(81) a. It is not the case that John does not smoke.
     b. Hence: John smokes.
And likewise for the following:
(82) a. It is not the case that both John smokes and Jill drinks.
     b. Hence: either John does not smoke or Jill does not drink.
However, from the premise that no one accepts that no one
accepts that John smokes, we cannot generally infer that John
smokes. From the premise that no one accepts that everyone accepts
both that John smokes and Jill drinks, we cannot generally infer
that everyone either accepts: (a) that no one accepts that John
smokes, or (b) that no one accepts that Jill drinks. In other
words, if the covert type-raising operator ‘C’ occurs immediately
above ‘John smokes’ and ‘Jill drinks’ in (81) and (82), the
resulting inferences are invalid. Hence our §2 semantics does not
automatically validate double negation elimination or applications
of De Morgan’s Laws, even restricted to inferences not containing
any epistemic vocabulary.
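To see concretely how double negation elimination can fail when ‘C’ sits immediately above ‘John smokes’ in (81), here is a minimal Python sketch of one countermodel. Everything specific in it is a hypothetical choice made for illustration: the four worlds, the use of the trivial one-cell partition for the outer negation and a two-cell partition for the inner negation, and the particular credences. As before, contents are modeled as predicates on measures, the §2 negation is reconstructed as "no cell accepts the embedded content", and cells with zero probability are assumed to accept nothing.

```python
from fractions import Fraction

# Four hypothetical worlds; John smokes in worlds 0 and 1.
WORLDS = frozenset({0, 1, 2, 3})
SMOKES = frozenset({0, 1})
OUTER = [WORLDS]                                  # partition for the outer negation
INNER = [frozenset({0, 2}), frozenset({1, 3})]    # partition for the inner negation

def prob(m, p):
    return sum((m[w] for w in p), Fraction(0))

def cond(m, p):
    """m conditionalized on p, or None when m(p) = 0 (an assumption of this sketch)."""
    total = prob(m, p)
    if total == 0:
        return None
    return {w: (m[w] / total if w in p else Fraction(0)) for w in m}

def accepts(m, p, S):
    mp = cond(m, p)
    return mp is not None and S(mp)

def lift(p):
    """Measures assigning probability 1 to the proposition p."""
    return lambda m: prob(m, p) == 1

def not_section2(partition):
    # Reconstructed §2-style negation: no cell accepts the embedded content.
    return lambda S: lambda m: not any(accepts(m, p, S) for p in partition)

# Conditional on the first inner cell you are sure John smokes;
# conditional on the second you are undecided.
m = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(0), 3: Fraction(1, 4)}

premise = not_section2(OUTER)(not_section2(INNER)(lift(SMOKES)))
conclusion = lift(SMOKES)

print(premise(m), conclusion(m), prob(m, SMOKES))
```

In this countermodel someone accepts that John smokes, so it is not the case that no one does, and the premise holds. But your unconditional credence that John smokes is only 3/4, so the conclusion fails.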
The above refinement of our semantics validates instances of
these inferences where appropriate. For instance, the logical form
of ‘John smokes or Jill drinks’ is given by (83):
(83) C [ John smokes or Jill drinks ]
The semantic value of (83) is the set of measures that assign
probability 1 to the proposition that either John smokes or Jill
drinks. This semantic value may contain your credences even if no
salient information would make you sure of either disjunct. The
logical form of the double negation elimination argument is not
(84) but (85):
(84) a. not_1 not_2 C John smokes
     b. Hence: C John smokes
(85) a. C not not John smokes
     b. Hence: C John smokes
The logical form of the De Morgan’s Law argument is not (86) but
(87):
(86) a. not_1 [ C John smokes and_2 C Jill drinks ]
     b. Hence: [ not_3 C John smokes ] or_4 [ not_5 C Jill drinks ]
(87) a. C not [ John smokes and Jill drinks ]
     b. Hence: C [ [ not John smokes ] or [ not Jill drinks ] ]
It is not hard to verify that the latter inferences are valid,
as desired.
I should emphasize that on the semantics developed here, logical
operators are polymorphic. This claim constitutes a loss of
theoretical parsimony, which some readers may count as a cost of my
theory. Some may even judge that this cost is ultimately too great
to be outweighed by the benefits of the theory. However, several
facts may help mitigate this cost. For starters, it is a familiar
observation that logical operators can embed expressions of
various semantic types; indeed, “virtually every major category can
be conjoined with ‘and’ and ‘or’” (Partee &