9. Research Design in Testing Theories of International Conflict

Paul Huth and Todd Allee

Scholars studying the causes of international conflict confront a number of lingering questions. Under what conditions are disputes between states likely to escalate to war? What impact do alliances have on the outbreak of militarized conflict? When will a deterrent threat be credible? How do domestic political institutions affect a state's propensity to settle disputes nonviolently? One of the most important social scientific methods for investigating these ongoing questions is statistical analysis.

In this chapter we detail a number of issues regarding research design and estimation that researchers must consider when using statistical analysis to address important questions within the study of international conflict and security. We do not engage in a straightforward review of past statistical research on international conflict, but rather put forward a series of suggestions for ongoing and future statistical research on this important topic. However, during the course of our discussion we identify and discuss several statistical studies that converge with our suggestions and exemplify some of the most promising current work by political scientists.

Our central argument is that statistical tests of the causes of international conflict can be improved if researchers would incorporate into their research designs for empirical analysis a number of insights that have been emphasized in recent formal and game-theoretic approaches to the study of international conflict. We believe that greater attention to the implications of the formal and deductive theoretical literatures for statistical analyses can improve research designs in four areas.

1. Selecting theoretically appropriate units of analysis for building data sets
2. Understanding how to better address problems of selection effects in the construction of data sets and estimation of models
3. Accounting for nonindependent observations
4. Reducing the amount of measurement error in the construction of variables for testing hypotheses

We focus on these four aspects of research design because
they address a set of important problems that empirical analysts need to address if compelling findings are to be produced by statistical tests. If researchers do not address these issues of research design effectively, weak empirical results can be expected despite the use of sophisticated statistical methods to test rigorously derived theoretical propositions. Even worse, the failure to give careful attention to problems of research design can result in the use of data sets that (1) are actually ill-suited for testing the theories that scholars claim to be evaluating or (2) severely limit our ability to draw accurate conclusions about causal effects based on the empirical findings produced by statistical analyses.

In this chapter we first describe four phases or stages that are associated with international disputes. These stages provide a useful depiction of how international disputes can evolve over time, and they illuminate a number of central research design issues faced by statistical researchers. Second, we discuss four particular research design questions and suggest possible answers. We conclude with a few brief observations about the implications of
our analysis for future quantitative work.

Alternative Paths to Conflict and Cooperation in International Disputes

Broadly conceived, the theoretical study of international conflict involves four different stages.

1. Dispute Initiation
2. Challenge the Status Quo
3. Negotiations
4. Military Escalation

Models, Numbers, and Cases

Existing quantitative tests of international conflict generally focus on one of these stages, although an increasing number of recent studies focus on more than one stage. We believe that statistical researchers need to think carefully about each of the four stages as part of a unified depiction of the evolution of international conflict. An initial theoretical description of the four stages and the linkages between the stages will help to identify the practical challenges facing the quantitative researcher. These different stages are presented in figure 1, along with some of the principal paths leading to various diplomatic and military outcomes.¹

In the Dispute Initiation stage, the analysis centers on whether a dispute or disagreement emerges between countries in which one state (the challenger) seeks to alter the prevailing status quo over some issue(s) in its relations with a target state (see fig. 2). An example would be a decision by the leaders of a state to claim the bordering territory of their neighbor. If the leaders of the target state reject the claim, then a territorial dispute has emerged between the two states (e.g., Huth 1996). Other common reasons for the emergence of disputes include economic conflicts over the tariff and nontariff barriers to trade between countries (e.g., see Lincoln 1999, 1990 on U.S.-Japanese trade disputes), or the intervention by one country into the domestic political affairs of another (e.g., Daalder and O'Hanlon 2000 for an analysis of NATO policy in Kosovo). Theoretical analyses of this stage focus on explaining what issues and broader domestic and international conditions are likely to give rise to disputes and why it is that some state leaders are deterred from raising claims and disputing the prevailing status quo.

Once a state has voiced its disagreement with the existing status quo and a dispute has emerged, in the next stage, the Challenge the Status Quo stage, leaders of the challenger state consider both when to press their claims in a dispute and whether they wish to use diplomatic or military pressure to advance their claims. Statistical analyses of this stage, then, attempt to explain when and how states attempt to press or resolve existing disputes. As shown in figure 3, foreign policy decision makers can choose among options such as not actively pressing claims, reliance on negotiations and diplomatic efforts to change the status quo, or more coercive pressure involving military threats. The outcomes to this stage include: (1) the status quo, if the challenger remains quiescent, (2) the opening or resumption of negotiations due to diplomatic initiatives undertaken by the challenger, or (3) a military confrontation when the challenger resorts to verbal warnings and threatening the deployment of its military forces. The theoretical analysis of this stage would typically focus on explaining what policy choices would be selected by leaders among the various diplomatic and military options available and how domestic and international conditions influence such choices (e.g., Bueno de Mesquita and Lalman 1992; Powell 1999; Huth and Allee 2002).
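
The three options and outcomes of the Challenge the Status Quo stage can be read as a small transition table. The Python sketch below is only one way to write that reading down, not a model taken from the chapter; the names `Stage`, `Option`, `CHALLENGE_TRANSITIONS`, and `next_stage` are hypothetical labels introduced here for illustration.

```python
from enum import Enum


class Stage(Enum):
    DISPUTE_INITIATION = "Dispute Initiation"
    CHALLENGE_STATUS_QUO = "Challenge the Status Quo"
    NEGOTIATIONS = "Negotiations"
    MILITARY_ESCALATION = "Military Escalation"


class Option(Enum):
    # Hypothetical labels for the challenger's three policy options.
    NO_INITIATIVE = "no diplomatic or military initiative"
    SEEK_TALKS = "negotiate"
    THREATEN_FORCE = "direct threat of military force"


# Where each option leads from the Challenge the Status Quo stage.
CHALLENGE_TRANSITIONS = {
    # The status quo persists and the challenger faces the same choice again.
    Option.NO_INITIATIVE: Stage.CHALLENGE_STATUS_QUO,
    Option.SEEK_TALKS: Stage.NEGOTIATIONS,
    Option.THREATEN_FORCE: Stage.MILITARY_ESCALATION,
}


def next_stage(option: Option) -> Stage:
    """Return the stage that follows the challenger's choice."""
    return CHALLENGE_TRANSITIONS[option]


if __name__ == "__main__":
    for option in Option:
        print(f"{option.value} -> {next_stage(option).value}")
```

Writing the stage this way makes one point explicit: a challenger that takes no initiative still generates an observation, because the stage simply repeats.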
In the Negotiations stage the challenger and target have entered into talks, and empirical tests attempt to explain the outcome of such rounds of talks (see fig. 4). In this stage, the focus shifts to questions such as which party has more bargaining leverage and is willing to withhold making concessions, whether the terms of a negotiated agreement would be accepted back home by powerful domestic political actors, and whether problems of monitoring and enforcing compliance with the terms of a potential agreement would prevent a settlement from being reached (e.g., Fearon 1998; Powell 1999; Putnam 1988; Downs and Rocke 1990; Schoppa 1997; Milner 1997). The possible outcomes to the Negotiations stage might include a settlement through mutual concessions or capitulation by one side. Furthermore, a stalemate can ensue if neither party is willing to compromise, while limited progress toward a resolution of the dispute can occur if one or both sides offer partial concessions. In the case of stalemate or partial concessions, the dispute continues, and the leaders of the challenger state reassess their policy options in another iteration of the Challenge the Status Quo stage.

[Fig. 1. The evolution of international disputes. Stages shown over time: Dispute Initiation, Challenge the Status Quo, Negotiations, Military Escalation.]

[Fig. 2. The Dispute Initiation stage. (Note: C = Challenger State; T = Target State)]

[Fig. 3. The Challenge the Status Quo stage. (Note: C = Challenger State; T = Target State)]

In the Military Escalation stage
the challenger state has issued a threat of force (see fig. 1). If the target state responds with a counterthreat, a crisis emerges in which the leaders of both states must decide whether to resort to the large-scale use of force (see fig. 5). Statistical tests of this stage generally investigate whether the military standoff escalates to the large-scale use of force or the outbreak of war, or is resolved through some type of less violent channel. This stage of international conflict has drawn considerable attention from international conflict scholars for obvious reasons, yet it remains the most infrequently observed stage of international conflict. Some of the more interesting theoretical puzzles posed at this stage center around questions of how credible the threats to use force are, what actions by states effectively communicate their resolve, and what the risks of war are as assessed by the leaders of each state (Fearon 1994a; Huth 1988; Schultz 1998; Smith 1998; Wagner 2000). The outcome to the international crisis determines whether the dispute continues, and if so, which foreign policy choices need to be reconsidered. For example, if war breaks out, a decisive victory by one side is likely to bring an end to the dispute; whereas a stalemate on the battlefield will lead to the persistence of the dispute in the postwar period. Conversely, the avoidance of war may bring about the end of the dispute by means of a negotiated agreement, while a standoff in the crisis will result in the continuation of the dispute. In either case where the dispute persists, the focus shifts back to the challenger's options in another iteration of the Challenge the Status Quo stage.

Over the duration of a dispute,
decision makers pass through the various stages numerous times; that is, they make repeated choices regarding the threat or use of force, negotiations, and dispute settlement. These choices of action (or inaction) become the cases that comprise the data sets used in quantitative studies of international conflict. Interestingly, the sequence of policy choices over time produces common diplomatic and military outcomes that may be arrived at through very different pathways (see fig. 1). For example, consider the outcome of a negotiated settlement reached through mutual concessions. In one dispute, this could be achieved by peaceful talks and mutual compromise in a short period of time, whereas, in another dispute, repeated military conflicts and then difficult and protracted negotiations eventually produce a settlement.

[Fig. 4. The Negotiations stage. (Note: C = Challenger State; T = Target State)]

[Fig. 5. The Military Escalation stage. (Note: C = Challenger State; T = Target State)]

Several important implications for quantitative studies of international conflict may be drawn from our discussion of figure 1 and the various stages of international conflict.

1.
The outbreak of war and the use of large-scale military force are almost always preceded by a preexisting international dispute in which diplomatic efforts at negotiations and talks had been attempted. As a result, very few military confrontations take place without a prior history of failed diplomacy and negotiations between states over disputed issues. Furthermore, most international disputes do not evolve into military confrontations. Since the threat and use of military force is a rare and often final option, empirical studies need to investigate the conditions under which disputes will become militarized. In particular, statistical tests need to account for potential selection effects due to the fact that leaders might turn to military force under unique circumstances, such as when they are risk-acceptant or highly resolved.

2. Similarly, state leaders often engage in repeated efforts at negotiations before deciding to make substantial concessions. As a result, international disputes are rarely settled without states probing and seeking to shift the burden of concession making onto their adversary. For this reason it is important for theoretical models of international conflict and the empirical tests of such models to account for the process of dispute resolution in which leaders will often shift from an initial hard-line stance that seeks unilateral concessions by their adversary to a more accommodative bargaining position in which they accept the need to offer at least some concessions themselves.

3. In addition, a state's behavior in one dispute is often affected by its involvement in other disputes. Statistical tests need to consider the larger strategic context of a state's foreign policy in which the state's leaders must manage their country's simultaneous involvement in multiple international disputes. In other words, quantitative studies of international conflict need to consider the fact that a state's behavior in one dispute might be correlated with its behavior in another international dispute.

4.
The impact of history and time may also be important. Empirical tests need to account for the fact that new information might be revealed to states over the course of an international dispute. The past history of diplomatic or military exchanges in the dispute might shape the current diplomatic and military behavior of states. For example, states are likely to shift away from conflict resolution strategies that have proven unsuccessful in previous interactions with an adversary. Not only may previous negotiations, stalemates, or military interactions be important, but short-term actions and more recent changes in prevailing conditions can lead decision makers to update their beliefs and change the policy options during a particular encounter with an adversary. Militarized disputes and crises can unfold over many months, and during that time domestic political conditions can change, other international disputes can arise, third parties can intervene, and the target state's own behavior can signal new information about its resolve and military strength. New information may therefore be revealed in the transition from one stage to the next stage. For example, a decision by a challenger state to issue a threat of force in the Challenge the Status Quo stage (see fig. 3) should not be treated necessarily as reflecting a firm decision to escalate and resort to the large-scale use of force in the subsequent Military Escalation stage (see fig. 5). A threat of force may be designed to probe the intentions and resolve of a target state, to induce the target to resume talks by signaling the dangers of a continued stalemate, or to pressure the target into making concessions in a new round of upcoming talks. As a result, a theoretical distinction should be drawn between the initiation
and escalation of militarized conflicts.

Questions of Research Design

In this section we address a number of issues of research design that have significant implications for statistical tests of theories of international conflict. We use the four stages described previously (see fig. 1) to guide our discussion of these issues. We focus on four particular research design questions.

1. What are the units of analysis for building data sets?
2. How can problems of selection effects be addressed in empirical tests?
3. In what ways can problems of nonindependent observations arise in statistical analyses?
4. What are common problems of measurement error in statistical analyses of international conflict theories?

Before discussing these questions, it is useful to touch upon a few general features of the framework we use when thinking about the evolution of international disputes. Typically, our conceptualization is of a situation in which two states interact over an issue of (potential) disagreement, although in principle there is nothing to preclude one or both of the parties from being a nonstate actor. Also, we maintain that the primary decision makers (who would typically be state leaders) can be influenced by factors from a variety of levels of analysis. They could be subject to various domestic political impulses and constraints, for example. Furthermore, one could also examine individual-level traits of decision makers, if necessary, or aspects of the decision-making process. Therefore, we consider our framework to be quite general and flexible. Our basic structure is of a two-player game, but third parties can affect the calculus of either or both actors. The actors might, for example, consider whether an ally is likely to intervene in a certain scenario or how an international legal body would be likely to rule if referred a disputed issue. Our framework, however, is not designed to explain long-term, dynamic processes that involve multiple independent actors. In such a situation, simulations based on agent-based modeling might be quite useful (e.g., Cederman 1997).
Nevertheless, we feel that our conceptualization of international conflict is one that can accommodate many types of international interactions, as well as a wide variety of explanations for different types of behavior.

Selecting the Units of Analysis

We have argued that the theoretical study of international conflict centers on four generic types of stages in which state leaders select different policy options. Empirical tests will need to be designed for each of these four stages. Our general argument is that in building data sets for statistical analyses of theories about the causes of international conflict the appropriate unit of analysis in most empirical studies should be the individual state in a given international dispute with another state. Put slightly differently, one might say that we advocate looking at the behavior of each individual state in a directed dyad. The reasons for this are twofold.

1. Compelling theoretical explanations for conflict behavior must be grounded in the actions of individual states, which are based on the choices of their political and military leaders.² Rational choice models based on game-theoretic analyses do follow this approach, and we believe any powerful theoretical approach ultimately rests on understanding how the choices of state leaders and their strategic interactions with other states lead to various international conflict outcomes.

2. As previously argued, war and international crises rarely, if ever, occur in the absence of preexisting disputes over issues and prior periods of negotiations and diplomatic interactions. In fact, war and crises threatening war are quite infrequent forms of interstate interactions. It is important to understand why state leaders in some disputes at particular points in time are unable to resolve the issues in contention and why they are willing to escalate the dispute to the highest levels of military conflict. As a result, a full explanation of the cause of international conflict should be based on the recognition that there are multiple stages through which international disputes may evolve.

If we consider the four stages in figure 1, we can see that the units of analysis in the Dispute Initiation stage would be potential challenger and target states. For the challenger state the dependent variable in a statistical test would center on the decision of its foreign policy leadership whether to contest the policies of another government in some selected issue area(s). Examples might include compliance with arms control or cease-fire agreements, disputes over one country's suppression of internal political opposition, or charges that one country is permitting rebel forces to operate on its territory. For this unit of analysis, theoretical models would explain when the potential challenger actually presses its claims and demands that the target state change its policies.

Once the demands and claims are clearly articulated, the target state would then have to decide whether to resist the policy changes called for by the challenger (see fig. 2). Thus, the dependent variable when the target state is the unit of analysis focuses on how firmly the leaders of the target state respond to demands for changing their policies.
In the Challenge the Status Quo stage we know that an international dispute already exists. Thus, the observations in this stage would consist of all challenger states and the potentially repeated opportunities that their leaders had to initiate diplomatic or military policies in an attempt to achieve a favorable change in the prevailing status quo (see fig. 3).

For these first two stages the temporal definition of what we might term a "play" of the dispute needs to be given careful attention by researchers. One might initially define this as an annual observation and code what initiatives, if any, were pursued by the challenger in a given year. Theoretically, however, there is no compelling reason to believe that a single foreign policy decision occurs once every twelve months. For example, in some international disputes leaders of the challenger state might move through several stages in a single year: efforts to rely on negotiations early in the year might end quickly in stalemate, yet by the end of the year the leaders might decide to turn to military pressure and threats of force in an attempt to break the diplomatic deadlock.

In contrast, in a different international dispute the issues at stake for the challenger state might not be that salient, and, as a result, it makes no effort to escalate or settle the dispute in a given year. The lack of attention given to the dispute raises questions about whether any policy options were even considered within a given year and whether that year of observation should be included in the data set. A research design grounded in a game-theoretic approach would shift away from relying on the convenient annual time period for each observation and instead would develop a more flexible set of coding rules to establish the temporal bounds of each iteration of a stage. With these more adaptable rules it would be possible to identify multiple iterations of a stage within a given year and to extend a single iteration of one stage beyond a year when theoretically appropriate.

In both the Negotiations and Military Escalation stages the units of analysis are the challenger and target states involved in a given round of talks or military confrontation. Once again, this is analogous to an examination of the negotiation or escalation behavior of each state in a directed dyad. The duration of the round of talks or military confrontation would determine the time period of each state-level observation. In these two stages the dependent variables would typically focus on outcomes such as the extent of concessions by a state in negotiations, each state's level of military escalation, or how responsive one state's policies were to the short-term actions of the other.

One important implication of our
discussion about the units of analysis in statistical analyses is that we do not generally favor or advocate the use of nondirected dyads (see Bennett and Stam 2000). While the use of directed dyads is desirable because it allows the researcher to capture individual state decisions in a particular strategic environment, much of the existing work utilizes nondirected dyads and focuses only on the joint outcome resulting from the interaction of a pair of states. Dyadic analyses have become increasingly common in statistical studies of international conflict, particularly in the democratic peace research program. For example, a number of statistical studies of the democratic peace have analyzed data sets consisting of pairs of states in which the occurrence of a war or militarized dispute short of war is coded on an annual basis over some specified time period. In some tests the population of dyads consists of all possible pairings of states, while other scholars rely on a smaller set of politically relevant dyads (e.g., Bremer 1992, 1993; Maoz 1997, 1998; Maoz and Russett 1992, 1993; Oneal and Ray 1997; Oneal and Russett 1997, 1999a, 1999b, 1999c; Ray 1995, chap. 1; Gowa 1999; Russett 1993).
Politically relevant dyads are typically composed of states that are contiguous or pairs of states in which at least one party is a great power. These studies have produced many useful and important findings; nevertheless, we think there are reasons to question research designs that rely upon nondirected dyads as the basic unit of analysis. In particular, there are at least three limitations to such dyadic studies that can be nicely illustrated by considering empirical studies of the democratic peace.

First, in dyadic studies of the democratic peace the dependent variable takes the form of conflict involvement for the countries in the dyad, without identifying patterns of military initiation and response, or conflict resolution, by each state. This is an important drawback, since hypotheses about democratic institutions and norms of conflict resolution logically predict which states in a dyad should be most likely to initiate militarized disputes and escalate disputes to the brink of war as well as seek diplomatic settlements of disputes.
Data on initiation and escalation are particularly important in testing the monadic version³ of the democratic peace. A nondirected dyadic democratic peace study, however, would simply note the occurrence of war or the existence of large-scale military action between two states in a mixed dyad for a given time period. This coding of the dependent variable would not distinguish between two very different scenarios in which democratic and nondemocratic states would resort to the large-scale use of force. In the first case, the nondemocratic state initiates the large-scale use of force after rejecting compromise proposals, and the democratic state responds by defending itself against the attack. In the second case, the reverse is true, as the democratic state initiates the large-scale use of force after rejecting compromise proposals, and the nondemocratic state responds by defending itself against the attack.

These two cases represent very different pathways to war and therefore suggest quite different conclusions about the monadic approach to the democratic peace. The second pathway is seemingly quite at odds with a monadic democratic peace argument, whereas the first pathway is not.
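
The coding problem in these two cases can be made concrete with a toy example. The Python sketch below is illustrative only: the record layout, the field names, and the two hypothetical wars between states A and B are assumptions introduced here, not the coding scheme of any actual data set.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class War:
    """A hypothetical war in a mixed dyad (one democracy, one nondemocracy)."""
    democracy: str
    nondemocracy: str
    initiator: str  # the state that first resorted to large-scale force


def nondirected_record(war: War) -> dict:
    """Dyad-level coding: only the joint outcome is recorded."""
    return {"dyad": frozenset({war.democracy, war.nondemocracy}), "war": 1}


def directed_records(war: War) -> list[dict]:
    """State-level coding: one observation per state, with its own behavior."""
    records = []
    for state, regime in ((war.democracy, "democracy"),
                          (war.nondemocracy, "nondemocracy")):
        records.append({
            "state": state,
            "regime": regime,
            "initiated_force": int(state == war.initiator),
        })
    return records


# Two pathways to war between the same hypothetical states:
# A is democratic, B is nondemocratic.
case_1 = War(democracy="A", nondemocracy="B", initiator="B")
case_2 = War(democracy="A", nondemocracy="B", initiator="A")

# The nondirected coding cannot tell the two pathways apart ...
same_nondirected = nondirected_record(case_1) == nondirected_record(case_2)
# ... while the directed coding can.
same_directed = directed_records(case_1) == directed_records(case_2)

if __name__ == "__main__":
    print("nondirected records identical:", same_nondirected)
    print("directed records identical:", same_directed)
```

Under these assumptions the dyad-level records for the two wars are indistinguishable, whereas the state-level records differ in exactly the variable a monadic hypothesis needs: which regime type initiated the use of force.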
The same general point is applicable regarding different pathways to conflict resolution. In one case the dispute is settled by a nondemocratic state initiating concessions or withdrawing claims, while in a second case a democratic state takes the initiative to propose concessions, which are then accepted by a nondemocratic adversary. The first case runs counter to prevailing monadic arguments about democratic norms, while the second seems consistent. The findings of many existing quantitative studies, however, do not provide a solid foundation upon which to draw conclusions about the monadic version of the democratic peace (Rousseau et al. 1996).

It seems very desirable then to disaggregate conflict behavior within a dyad into a more sequential analysis of each state's behavior over the course of a dispute between states. Thus, Huth and Allee (2002), in their study of the democratic peace, examine 348 territorial disputes from 1919 through 1995 in which each state's behavior for cases of the Challenge the Status Quo, Negotiations, and Military Escalation stages is analyzed. The result is that hypotheses about democratic patterns of initiation and response regarding negotiations and military conflicts can be clearly posited and then empirically tested.

A related problem with dyad-based data sets is that hypotheses regarding the impact of important independent variables, such as democratic norms and structures, on conflict outcomes cannot be tested directly. Instead, the researcher is forced to make inferences about the causal process that might have produced patterns of observed dyadic conflict outcomes. Consider the case in which one of the two states in a dispute is led by a minority government. This minority government might be unlikely to offer concessions to its adversary because of the difficulties in securing legislative or parliamentary support for such concessions. However, its adversary, knowing it is bargaining with a highly constrained opponent, might be more likely to offer concessions.
However, the existence of a
minoritygovernmentcouldhavetheoppositeimpactoneachofthestatesinthedyad.
By splitting the dyad into two state-level, directional observations, the researcher is able to more directly test the causal impact of minority government on conflict or bargaining behavior for democratic states (e.g., Huth and Allee 2002). The use of nondirected dyads, however, would obscure the true causal impact of domestic institutional arrangements such as minority government.

The final, related limitation of these dyadic studies, especially those using the popular nondirected dyad-year format, is that they test hypotheses about international conflict without grounding the empirical analysis in the development and progression of international disputes between states (fig. 1). When analyzing whether states become involved in a militarized dispute or war, the causal pathway necessarily includes a first stage of a dispute emerging. We do not think dyad-year arguments, such as those for the democratic peace, explain why disputes arise, but rather, only how disputes will be managed.
The problem with the typical dyad-year-based data set is that the observed behavior of no militarized dispute or no war for certain dyad-years could be explained by two general processes, one of which is distinct from arguments in the democratic peace literature. That is, no military conflict occurs because (1) states were able to prevent a dispute from escalating, which the democratic peace literature addresses; and (2) states were not involved in a dispute, and thus there was no reason for leaders to consider using force. This second pathway suggests that democratic peace explanations are not that relevant. As a result, dyads that do not even get into disputes for reasons that are not related to democratic institutions or norms may appear to be cases in support of the democratic peace. The use of politically relevant dyads helps to reduce this problem of irrelevant nondispute observations, but many relevant dyads are not parties to an international dispute that has the potential to escalate to military conflict.
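
The two pathways can be separated with simple arithmetic. The sketch below is illustrative only: the function name `conflict_rate` and all probabilities are hypothetical values chosen for the example, not estimates from any data set. It writes the dyad-year probability of military conflict as the probability that a dispute exists multiplied by the probability that an existing dispute escalates.

```python
def conflict_rate(p_dispute: float, p_escalate_given_dispute: float) -> float:
    """Dyad-year probability of military conflict, split into two stages."""
    return p_dispute * p_escalate_given_dispute


# Hypothetical numbers, chosen only for illustration.
# Both dyad types escalate existing disputes at the same rate (pathway 1),
# but democratic dyads are assumed to have fewer disputes at all (pathway 2).
democratic = conflict_rate(p_dispute=0.02, p_escalate_given_dispute=0.10)
other = conflict_rate(p_dispute=0.08, p_escalate_given_dispute=0.10)

print(f"democratic dyad-years: {democratic:.4f}")
print(f"other dyad-years:      {other:.4f}")
print(f"ratio of rates:        {other / democratic:.1f}")
```

In this made-up example the two dyad types manage disputes identically, yet a dyad-year analysis would record a fourfold difference in conflict rates, driven entirely by the absence of disputes.
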
If one has the typical data set that contains observations in which states never even considered using force, then potential problems of overstated standard errors and biased estimates of coefficients for the democratic peace variables can arise. For example, the negative coefficient on a democratic dyad variable in a study of military conflict could reflect the ability of democratic leaders to manage disputes in a nonviolent way, but it might also capture the fact that some democratic dyads were not involved in any disputes for many of the dyad-year observations in the data set. As a result, it is difficult to draw strong and clear causal inferences about the impact of joint democracy on conflict behavior (see Braumoeller and Sartori, chap. 6, this vol.). In the first scenario, it would not be worrisome to witness a conflict between two democratic states, since they should be able to manage the dispute without resorting to violence. However, if the second claim is true, then the occurrence of military confrontations is cause for concern, since the democracies are only pacific insofar as they are able to avoid getting into militarized disputes in the first place.

In sum, the nondirected dyad-year as the unit of analysis aggregates multiple stages in the development of an international dispute into a single observation that renders it difficult for researchers in empirical tests to assess the causal processes operating at different stages in the escalation or resolution of international disputes.

Accounting for Selection Effects

Selection effects are a potential problem for any empirical test that fails to understand that states do not enter into negotiations or become involved in a violent military clash randomly, but rather state leaders choose to go down a particular path during the evolution of a dispute. For the relatively few cases making it to either the Negotiations or Military Escalation stages, the story of how and why state leaders selected their countries into these samples is of utmost importance. Similarly, the related idea of strategic interaction tells us that state leaders consider the anticipated response of opponents to various policy options. Even though some factors may affect the decisions of leaders in a potential conflict situation, this impact is not captured by standard statistical techniques because leaders avoid taking these potentially undesirable courses of action. This idea is particularly salient when analyzing the Dispute Initiation and the Challenge the Status Quo stages.

One way to think about selection effects hinges on the idea of sample selection bias (see Achen 1986; Geddes 1990; King, Keohane, and Verba 1994). In the simplest terms, using a nonrandom sample of cases to test causal relationships will often result in biased estimation of coefficients in statistical tests. Not only might the causal relationships suggested by the results of such statistical analyses be inaccurate for the limited sample examined, but they also cannot be used to draw inferences about the generalizable relationship that might exist between the independent variables and dependent variable outside of that sample. The logic is straightforward: cases that advance to some particular phase in the evolution of conflict may not be typical of relations between states (Morrow 1989). There may be some systematic reason or explanation for why these cases reach a certain stage, and the failure to account for this can produce misleading statistical results.

Unobserved factors, such as beliefs, resolve, risk attitudes, and credibility, might exert a selection effect (Morrow 1989; Fearon 1994b; Smith 1995, 1996). States may select themselves into certain stages of conflict or down certain paths of dispute resolution based upon the private information they possess about these unobserved factors. The ideas of alliance reliability and extended deterrence illustrate this idea. Reliable alliances and credible deterrent threats should rarely be challenged, so the large number of cases where alliance ties and general deterrence prevent challenges to the status quo are often excluded from data sets of militarized crises (Fearon 1994b; Smith 1995). During the Challenge the Status Quo stage, only highly resolved challengers would challenge strong alliances and credible deterrent threats. The failure to account for a challenger's resolve to carry out its military threat might lead the empirical researcher to mistakenly conclude that alliances increase the risk of military escalation in crises and to fail to appreciate that alliances might act as powerful deterrents to states initiating military confrontations (Smith 1995). Once again, statistical analyses of a single stage can be biased if they do not consider how state leaders selected themselves into the data set that is being tested for that stage.

Put slightly differently, sample selection bias is likely to exist when the variables that explain the ultimate outcome of the cases also explain why those cases got into the sample in the first place. If the factors explaining the outcome of the Military Escalation stage also help explain the decision to get into the Military Escalation stage (the choice made during the Challenge the Status Quo stage), then the estimated coefficients produced by statistical tests of cases that only appear in the Military Escalation stage are likely to be biased. Variables such as wealth, regime type, and satisfaction with the status quo may affect the decisions made during the Challenge the Status Quo stage and the Military Escalation stage in similar or different ways (see Hart and Reed 1999; Huth 1996; Reed 2000). For example, Huth (1996) reports that the military balance does not systematically influence challenger decisions to initiate territorial claims against neighboring states, but among cases of existing territorial disputes, challengers are much more likely to threaten and use force if they enjoy a military advantage.
In a study of the democratic peace, Reed (2000) explicitly models the decisions to initiate and then escalate military confrontations, and he finds that the impact of democratic dyads is far stronger in preventing the emergence of military confrontations compared to the escalation of such conflicts.

Incorporating strategic interaction in research designs on international conflict is also a desirable goal (see Signorino 1999; Smith 1999). In our framework, accounting for sample selection bias generally requires looking backward to explain where cases come from, whereas the idea of strategic interaction requires looking forward to see where cases would have gone if they had reached later stages in the evolution of conflict. The key idea is that the decisions within and across different stages are interdependent; actors take into account the likely behavior of other states at present, as well as possible future decisions during the escalation of international conflict (Signorino 1999). Strategic interaction may also be thought of as the explicit study of counterfactuals (Smith 1999, 1256). Actors anticipate how potential adversaries will behave under certain circumstances, such as at any of the decision-making nodes in our four stages, and avoid making decisions that may ultimately lead to undesirable outcomes. In our multiphase model of disputes (fig. 1), a challenger state may refrain from choosing the path leading to the military escalation stage because they anticipate a swift, strong military reaction from the defender in the event of a military threat. Or they may shun the decision to enter into negotiations because they anticipate no concessions being made by the leader on the other side. Once again, factors that truly affect the calculus of state leaders to make certain decisions or enter into certain phases or stages, such as the credibility of a defender's swift response or the domestic constraints placed on a foreign leader, are not captured by standard statistical techniques because they are unobserved.

The most widely used statistical estimators fail to capture the concerns we raise about selection issues, and such techniques produce biased results when a nonrandom sample is used (Achen 1986).
In many cases, the effects of some independent variables may become weakened or rendered insignificant. In fact, some claim that selection bias may produce coefficients with reversed signs (Achen 1986). For example, Huth (1988) finds in his statistical tests that alliance ties between defender and protégé are surprisingly associated with an increased risk of extended-immediate deterrence failure. Fearon (1994b), however, argues that the reason for this finding is due to selection effects in which an unmeasured variable (the challenger's resolve to initiate a threat against the protégé) is correlated with the observed variable of alliance ties. The result is that the estimated coefficient for the alliance variable is actually picking up the impact of the unmeasured challenger-resolve variable, and this helps to explain the unexpected negative sign on the alliance variable. In general, scholars incorporating the ideas of strategic interaction and selection bias into their models have discovered significant differences between coefficients produced by these corrected models and those produced by biased models (see Signorino 1999; Smith 1999; Reed 2000). These changed estimates even affect some of our most important propositions in world politics, such as the impact of joint democracy on conflict escalation as noted earlier (Hart and Reed 1999; Reed 2000).

Our general conclusion is that quantitative studies of military conflict should incorporate some type of correction for selection effects. In our opinion, the best suggestion is to model the multiple stages in the escalation of international conflict simultaneously. This is generally done by estimating both a selection equation (to explain which cases get into a particular sample) as well as an outcome equation (to explain how the cases in this sample are played out). A good example of this is Huth and Allee's (2002) analysis of dispute resolution efforts by democratic states that are involved in territorial disputes. In estimating the probability that a democratic challenger will offer concessions in a round of talks (the outcome equation), they include a selection equation that accounts for the initial decision of the democratic challenger to propose talks. Given the prevalence of categorical variables in studies of international conflict, probit and logit selection models seem most promising, although other models may be appropriate for different types of dependent variables. When thinking about how particular cases get where they are, one should compile data on those cases that had some legitimate probability of making it to some stage, but did not. In other words, if analyzing the Negotiation stage, one should also have some information about cases that went to the Military Escalation stage, or in which the status quo was accepted.

It is often cumbersome and difficult to acquire data on relevant nonevents, such as instances in which leaders considered threatening force but did not do so, or where a state had the ability to press a claim concerning the treatment of ethnic minorities abroad and decided to accept the status quo. Yet we feel that acquiring and incorporating this information into quantitative analyses should be a high priority for scholars. In other words, we advocate greater attention to the Dispute Initiation and Challenge the Status Quo stages, that is, to the identification of those situations that could plausibly become international disputes and then tracking which disputes might proceed through various Negotiations and Military Escalation stages. When this is not possible, and therefore no selection equation is specified, a different approach would be to include in the outcome equation those independent variables that would have been in a selection equation. In other words, researchers studying the outcomes of crises or militarized disputes should try to include independent variables that explain why those disputes and crises might have arisen in the first place.

In sum, the problems of selection bias and strategic choice are illustrated nicely by game-theoretic models of military conflict, which capture the real-life choices faced by state leaders. Our primary point is that quantitative analyses of international conflict need to account for the variety of choices that states have at different stages in the evolution of a dispute. Focusing narrowly on one phase of an interstate dispute without accounting for past and potential future choices can lead to biased statistical results and therefore limit our ability to draw accurate conclusions concerning the factors that contribute to military conflict.

Problems of Nonindependent Observations

Researchers conducting statistical analyses of data sets on international conflict need to consider potential problems of nonindependent observations.
There are a number of ways in which the dependence of observations can occur in international conflict data sets. We focus on two types that are likely to be present in many data sets in which the basic units of analysis are states that are involved in international disputes. In the first case, the dependence of observations is due to the time-series nature of the data in which the same state appears multiple times in the data set since the international dispute spans many years. With this data set the analyst is testing models that seek to explain variation in a state's dispute behavior over time. In the second case, cross-sectional or spatial dependence is present because in a given time period (e.g., a year) the same state is a party to several different international disputes or is influenced by the behavior of neighboring states. The empirical analysis in this second study centers on testing models that might account for variation in a state's behavior across the different international disputes in which it is involved.

In the time-series example, the statistical problem is that values on the dependent variable for a state-dispute observation in time period t are systematically related to the behavior and actions of that same state in preceding time periods. Put differently, the prior history of the dispute is important in understanding the current behavior of the disputants. In the cross-sectional example, the problem is a bit different in that the actions of a single state in one dispute are influenced by the behavior of that same state or other states in a second dispute. In either of these two cases of dependent observations, the statistical implications are that the assembled data sets do not contain as much independent information as is assumed by the standard statistical models utilized by researchers. As a result, the standard errors associated with the estimated coefficients are likely to be inaccurate. In particular, they are likely to be underestimated and, as a result, researchers run the risk of overstating the statistical significance of coefficients and the findings they report (see Greene 1997, chap. 13).

If we refer back to figure 1, problems of both time-series and cross-sectional dependence of observations are likely to be present in data sets that are used to test models for the Dispute Initiation and Challenge the Status Quo stages. The reason is that a common research design for each of these stages is to assemble what are termed pooled cross-sectional time-series data sets. Researchers might build a data set that includes many different states that are involved in many different disputes (or potential disputes) over some extended period of time.

One such illustrative example comes from Huth's (1996) study of territorial disputes, in which he conducted a two-stage analysis in which the first stage was very similar to what we have termed the Dispute Initiation stage. In this initial analysis he included all states from 1950 through 1990 that issued territorial claims against another state as well as a random sample of states that did not dispute their borders. He then tested models that sought to explain which challenger states did in fact dispute territory. In the second stage of analysis, he focused on all of the territorial dispute cases from 1950 through 1990, and he analyzed the varying levels of diplomatic and military conflict initiated by challenger states. In this two-stage analysis Huth found evidence of both temporal and cross-sectional relationships between cases. For example, challenger states that had signed formal agreements settling border disputes with a particular country prior to 1950 were very unlikely to repudiate those agreements and initiate a new territorial dispute in the post-1950 period. Challenger states in a territorial dispute were also less likely to resort to military threats in an attempt to change the status quo if they were involved simultaneously in multiple territorial disputes (chaps. 4–5).

In the Military Escalation and Negotiations stages in figure 1 the data sets that would be relied upon for statistical tests would not be standard pooled cross-sectional time-series in nature, but rather would be pooled cross-sectional designs. For example, a data set for testing the Military Escalation stage would typically consist of all military confrontations initiated by a challenger state over some disputed issue. Similarly, in the Negotiations stage the data set would include all rounds of talks held by states over disputed issues. For each type of data set, cross-sectional dependence of observations could be a problem, as could temporal dependence of observations due to the potential for repeated rounds of talks or military confrontations. For example, in the military escalation data set the decision by a state's leadership to resort to the large-scale use of force in a particular case could be influenced by whether their adversary was already engaged in a military confrontation with another state (see Huth, Gelpi, and Bennett 1993) or whether they had suffered a military defeat at the hands of their current adversary in a prior military confrontation (see Huth 1996).

A common problem for many quantitative researchers who are working with probit and logit models is that standard corrections for time-series or spatial dependence in data are not well-developed in the statistical literature. Political methodologists, however, have devised a number of potentially useful corrections that can be employed to deal with nonindependent observations due to time series effects (e.g., Beck, Katz, and Tucker 1998), and such corrections are often desirable in estimating equations.
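
A minimal sketch of one such correction follows. As we understand it, the approach associated with Beck, Katz, and Tucker (1998) adds to a logit model a set of dummy variables (or a smooth function) for the time elapsed since the last conflict. The helper names and the toy series below are hypothetical, and conventions for where the counter starts vary across applications.

```python
def peace_years(conflict):
    """Periods elapsed since the last conflict (or since observation began).

    `conflict` is a sequence of 0/1 outcomes for one dyad, ordered in time.
    """
    counter = 0
    spells = []
    for outcome in conflict:
        spells.append(counter)
        # Reset the clock after a conflict year, otherwise keep counting.
        counter = 0 if outcome else counter + 1
    return spells


def duration_dummies(spells):
    """One indicator column per observed spell length.

    In an actual logit, one column (or the intercept) would be dropped
    to avoid perfect collinearity.
    """
    levels = sorted(set(spells))
    return levels, [[1 if s == k else 0 for k in levels] for s in spells]


# Hypothetical dyad observed for nine years; 1 marks a militarized conflict.
series = [0, 0, 1, 0, 0, 0, 1, 1, 0]
spells = peace_years(series)
levels, dummies = duration_dummies(spells)
print(spells)
print(levels)
```

The resulting columns would enter the model alongside the substantive explanatory variables, so that the estimated effect of, say, joint democracy is conditional on how long the dyad has been at peace.
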
Nevertheless, we want to express a note of caution because researchers may too readily turn to these statistical corrections and only rely upon them to deal with the important problem of dependent observations. We strongly recommend that researchers also devote considerable effort to accounting for problems of nonindependent observations through better specification of the theoretical models that are empirically tested. This would entail researchers developing hypotheses that capture the influences of time-series and cross-sectional factors and then including such factors as explanatory variables in the equations that are tested. An excellent example of this approach is the work of Michael Ward and Kristian Gleditsch that includes explanatory variables in their models that reflect spatial clustering of conflict, trade, and democratization among states (see Gleditsch and Ward 2000; Gleditsch 2002). The primary advantage of this is that any estimated coefficients that are intended to pick up the effects of dependent observations can be interpreted in a more direct manner given that a theoretically grounded and more specific causal argument has already been provided.

Another recommendation is to switch from standard logit and probit models to event history or duration models that do explicitly account for time-series effects (for a general discussion of such models see Zorn 2001). Event history models focus on explaining the transition from an initial condition (or status quo) to a new one as a function of time. For example, drawing on the democratic peace literature, researchers might hypothesize that given a territorial dispute between two states, the time to settlement of the dispute by means of a negotiated agreement would be shorter if both states were democratic. Good examples of IR scholars using event history models include Werner's (1999) study of the durability of peace agreements in the aftermath of wars and Bennett and Stam's work (1998) on the duration of interstate wars.

The Measurement of Variables

Measurement error is a ubiquitous concern in all sciences, especially the social sciences. Imprecise measurement of explanatory variables, especially if systematic, casts doubt on our ability to draw accurate causal inferences. We feel concerns about measurement should be strongly emphasized in research designs of international military conflict. Our four-stage model of international conflict illustrates some specific issues faced by quantitative researchers of international conflict, such as the need to incorporate variables and measures that may be uniquely relevant to certain stages in the evolution of international conflict. In addition, all studies of military conflict are saddled with certain unique data and measurement concerns, such as the use of large data sets with large numbers of variables, the ambiguity of many key concepts, a lack of creativity in measurement, and disincentives to devote resources to better measurement.

Since the actions taken by leaders over the course of an international dispute may provide additional useful information, researchers may need to modify preexisting measures at later phases of conflict. Some important underlying concepts, such as the military balance between two states, could be measured differently depending on which stage in figure 1 is being analyzed. For example, a general indicator of standing military capabilities might be used to measure the military balance in a test of the Challenge the Status Quo stage. However, once both sides have made threats to use military force or have mobilized troops, adding information on the local balance of forces in this dispute would improve the measurement of the military balance in the Military Escalation stage. For example, measures of the local balance of forces have been reported to have strong effects on the success or failure of extended-immediate deterrence or whether territorial disputes escalate to war (e.g., Huth 1988; Huth and Allee 2002).

In addition, as mentioned earlier, the decisions made by political and military leaders during the evolution of a dispute may convey new information. This information should then be incorporated into empirical tests of later phases of a dispute. For example, leaders may generate audience costs or use costly signals at the beginning of the Military Escalation stage to make their threat of military force appear credible to an adversary (Fearon 1994a). Therefore, this new information about the added credibility of a state's threat of force should be used to modify preexisting measures of credibility or added to any test of the Military Escalation stage. An example of this is Huth and Allee's (2002) study of state behavior in military crises in which they code a variable for whether democratic leaders send a strong public signal of the resolve to use force at the outset of the crisis. In their statistical analyses they find that such democratic signals of resolve are strongly associated with deterring escalation by the adversary state. Another interesting example is the finding reported by Schultz (2001) that the deterrent threats of democratic states are more likely to succeed if the leaders of opposition parties signal their support for the government's deterrent policy during the confrontation with a potential attacker. The overriding idea is that variables reflecting additional information can be added to analyses of the Military Escalation stage or the Negotiations stage. One should not always rely on the same measure of credibility, resolve, or military balance in empirical tests of the different stages of an international dispute.

A more general measurement concern for quantitatively minded scholars of military conflict is that the quality of data is often poor. The recent turn to dyad-years as the unit of analysis in many studies of military conflict typically results in tens of thousands, if not hundreds of thousands, of cases in data sets. Trying to find data on all variables for so many dyads is a daunting task. With limited time and limited resources, there is a trade-off between the quantity of data collected and the quality of this data.
So researchers are forced to settle for imprecise or suspect data, or to drop observations with missing data.[4] In addition, the increasing acceptance of the idea that domestic politics variables should be included in studies of international conflict adds to the data collection burden.

One promising solution to the cumbersome task of collecting quality data lies with sampling. The strategy of what can be termed retrospective random sampling has rarely been used in large-n studies of international military conflict, yet the use of retrospective sampling designs would allow scholars to devote more energy toward the collection of better data. In such sampling designs the researcher combines the population of observed military conflicts (crises or wars, for example) with a random sample of cases in which no military conflict occurred.[5] Logit models can then be used to estimate equations in which the coefficients are unbiased and the degree of inefficiency associated with standard errors is quite small. Taking a random sample from the large population of noncases of conflict could be a valuable tool for addressing concerns about selection bias (Achen 1999; King and Zeng 2001).

Furthermore, studies of international conflict and crisis behavior often employ concepts that are difficult to measure. Game-theoretic models often generate hypotheses about the beliefs of actors, yet it is nearly impossible to get inside the minds of decision makers to understand how they interpret a situation. As a result, researchers have to develop imperfect operational measures for key concepts such as the credibility of a threat, the political constraints on leaders, or the resolve of state leaders. In addition, scholars have reached little consensus on how to measure such central concepts, and there has been too little critical debate on how to measure certain difficult or important concepts.[6] The pursuit of ways to creatively measure theoretical concepts should be a high priority.
Hard-to-measure concepts are typically measured by single proxy variables intended to capture the concept of interest. Yet these concepts could also be measured by employing techniques, such as confirmatory factor analysis, that allow one to combine related, observable variables into a single underlying factor that captures this hard-to-measure concept in a theoretically informed manner. Substituting alternative measures for purposes of robustness checks could also be done more often.

It is important that more of an effort be made to collect data and assemble new data sets. Unfortunately, the cost and time required to collect new data can be substantial, and, as a result, the incentives to rely upon existing data sets are quite strong. Yet the key principle of measurement in the social sciences is that an empirical researcher should make every attempt to use, collect, or obtain data that best fits the theoretical propositions. Widely used measures for concepts like military capabilities or democracy may be appropriate for testing certain hypotheses, yet less desirable for testing other propositions. Scholars should be as careful as possible to capture the precise logic of their hypotheses. For example, the hypothesis that democratic institutions restrict the use of force should be tested with data on institutional arrangements, not with a general measure of democracy such as the widely used net democracy measure from the Polity data set. Once again, while existing data sets often provide a valuable function, more of an effort should be made to put together new data sets and compile new measures whenever such measures do not exist, or when available variables are insufficient for the task at hand.

Conclusion

We have argued that the theoretical and empirical analysis of international conflict should be broken down into four generic stages. By thinking about the causes of international conflict in terms of these stages, we believe researchers are more likely to develop research designs for statistical tests that

1. focus on state leaders and their choices in international disputes as the unit of analysis for building data sets,
2. recognize that selection effects and strategic behavior are central concepts for understanding how international disputes evolve into stages where higher levels of conflict occur,
3. better account for how policy choices in international disputes are linked across time and space, and
4. include explanatory variables that better capture and measure the impact of domestic and international conditions during periods of more intense diplomatic and military interactions.

In our judgment, such research designs will greatly improve statistical tests of theories of international conflict by better addressing problems of selection bias, nonindependent observations, and measurement error. One of the central implications of our analysis is that there should be a tighter connection between the formal game-theoretic literature and the design of statistical analyses and tests. Another implication is that empirical researchers will need to devote more time, effort, and resources to developing more microlevel data sets of international disputes across different issue areas as well as developing data on dispute behavior that does not involve military threats and the use of force.

Recommended Readings

Achen, C. 1986. The Statistical Analysis of Quasi-Experiments. Berkeley: University of California Press.
Fearon, J. 1994. Domestic Political Audiences and the Escalation of International Disputes. American Political Science Review 88 (3): 577–92.
Hart, R., and W. Reed. 1999. Selection Effects and Dispute Escalation. International Interactions 25 (3): 243–64.
Huth, P. 1996. Standing Your Ground. Ann Arbor: University of Michigan Press.
Huth, P., and T. Allee. 2002. The Democratic Peace and Territorial Conflict in the Twentieth Century. New York: Cambridge University Press.
Signorino, C. 1999. Strategic Interaction and the Statistical Analysis of International Conflict. American Political Science Review 93 (2): 279–98.
Smith, A. 1999. Testing Theories of Strategic Choice: The Example of Crisis Escalation. American Journal of Political Science 43 (4): 1254–83.

Notes

1. We present each stage in its most simplified form to highlight only a few basic points. For example, we focus on only two actors but certainly third parties could be included as actors. In addition, we make no effort to model these stages rigorously. We simply map the choices available to states in a dispute and the outcomes of the various paths to illustrate the questions and concerns statistical researchers need to address.
2. Of course, if the researcher's focus is on explaining nonstate conflict behavior then we would argue that the unit of analysis is the individual political actor or the leader of some organization that adopts and carries out particular policies.
3. By monadic we mean that democratic states are less likely to initiate military threats and the use of force against all other states, not just other democratic states.
4. The idea of dropping cases from statistical analyses of international conflict is especially problematic, since the cases dropped often exhibit systematic similarities. Data on military expenditures, military capabilities, and GNP are often hardest to obtain for certain types of countries, such as developing countries or countries with closed political systems. Dropping such cases eliminates certain types of meaningful cases and results in truncated values of some independent variables.
5. This idea is logically similar to the use of control group
designs in quasi-experimental research (see Cook and Campbell
1979).6.
Recentdebatesonhowtomeasurejointdemocracyandthesimilarityofsecurity
interests constitute a welcome advance (see Thompson and Tucker
1997;Signorino and Ritter 1999).ReferencesAchen, C. 1986. The
Statistical Analysis of Quasi-Experiments. Berkeley: University of California Press.
Achen, C. 1999. Retrospective Sampling in International Relations. Annual Meeting of the Midwest Political Science Association, Chicago.
Beck, N., J. Katz, and R. Tucker. 1998. Taking Time Seriously. American Journal of Political Science 42 (4): 1260-88.
Bennett, D. S., and A. C. Stam. 1998. The Declining Advantages of Democracy. Journal of Conflict Resolution 42 (3): 344-66.
Bennett, D. S., and A. C. Stam. 2000. Research Design and Estimator Choices in the Analysis of Interstate Dyads: When Decisions Matter. Journal of Conflict Resolution 44 (5): 653-85.
Bremer, S. 1992. Dangerous Dyads. Journal of Conflict Resolution 36 (2): 309-41.
Bremer, S. 1993. Democracy and Militarized Interstate Conflict, 1816-1965. International Interactions 18 (3): 231-49.
Bueno de Mesquita, B., and D. Lalman. 1992. War and Reason: Domestic and International Imperatives. New Haven: Yale University Press.
Cederman, L. 1997. Emergent Actors in World Politics. Princeton: Princeton University Press.
Cook, T., and D. Campbell. 1979. Quasi-Experimentation. Boston: Houghton Mifflin.
Daalder, I., and M. O'Hanlon. 2000. Winning Ugly. Washington, DC: Brookings Institution.
Downs, G. W., and D. M. Rocke. 1990. Tacit Bargaining, Arms Races, and Arms Control. Ann Arbor: University of Michigan Press.
Fearon, J. 1994a. Domestic Political Audiences and the
Escalation of International Disputes. American Political Science Review 88 (3): 577-92.
Fearon, J. 1994b. Signaling versus the Balance of Power and Interests. Journal of Conflict Resolution 38 (2): 236-69.
Fearon, J. 1998. Bargaining, Enforcement, and International Cooperation. International Organization 52 (2): 269-306.
Geddes, B. 1990. How the Cases You Choose Affect the Answers You Get: Selection Bias in Comparative Politics. Political Analysis 2:31-50.
Gleditsch, K. 2002. All International Politics Is Local: The Diffusion of Conflict, Integration, and Democratization. Ann Arbor: University of Michigan Press.
Gleditsch, K., and M. Ward. 2000. War and Peace in Space and Time. International Studies Quarterly 44 (1): 1-30.
Gowa, J. 1999. Ballots and Bullets. Princeton: Princeton University Press.
Greene, W. 1997. Econometric Analysis. Upper Saddle River, NJ: Prentice-Hall.
Hart, R., and W. Reed. 1999. Selection Effects and Dispute Escalation. International Interactions 25 (3): 243-64.
Huth, P. 1988. Extended Deterrence and the Prevention of War. New Haven: Yale University Press.
Huth, P. 1996. Standing Your Ground. Ann Arbor: University of Michigan Press.
Huth, P., and T. Allee. 2002. The Democratic Peace and Territorial Conflict in the Twentieth Century. New York: Cambridge University Press.
Huth, P., C. Gelpi,
and D. S. Bennett. 1993. The Escalation of Great Power Militarized Disputes: Testing Rational Deterrence Theory and Structural Realism. American Political Science Review 87 (3): 609-23.
King, G., R. O. Keohane, and S. Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
King, G., and L. Zeng. 2001. Explaining Rare Events in IR. International Organization 55 (3): 693-716.
Lincoln, E. 1990. Japan's Unequal Trade. Washington, DC: Brookings Institution.
Lincoln, E. 1999. Troubled Times. Washington, DC: Brookings Institution.
Maoz, Z. 1997. The Controversy over the Democratic Peace. International Security 22 (1): 162-98.
Maoz, Z. 1998. Realist and Cultural Critiques of the Democratic Peace. International Interactions 24 (1): 3-89.
Maoz, Z., and B. Russett. 1992. Alliance, Contiguity, Wealth, and Political Equality. International Interactions 17 (3): 245-67.
Maoz, Z., and B. Russett. 1993. Normative and Structural Causes of Democratic Peace, 1946-86. American Political Science Review 87 (3): 624-38.
Milner, H. V. 1997. Interests, Institutions, and Information: Domestic Politics and International Relations. Princeton: Princeton University Press.
Morrow, J. D. 1989. Capabilities, Uncertainty,
and Resolve: A Limited Information Model of Crisis Bargaining. American Journal of Political Science 33 (4): 941-72.
Oneal, J., and J. L. Ray. 1997. New Tests of the Democratic Peace Controlling for Economic Interdependence, 1950-1985. Political Research Quarterly 50 (3): 751-75.
Oneal, J., and B. Russett. 1997. The Classical Liberals Were Right. International Studies Quarterly 41 (2): 267-94.
Oneal, J., and B. Russett. 1999a. Assessing the Liberal Peace with Alternative Specifications. Journal of Peace Research 36 (4): 423-42.
Oneal, J., and B. Russett. 1999b. Is the Liberal Peace Just an Artifact of the Cold War? International Interactions 25 (3): 213-41.
Oneal, J., and B. Russett. 1999c. The Kantian Peace. World Politics 52 (1): 1-37.
Powell, R. 1999. In the Shadow of Power. Princeton: Princeton University Press.
Putnam, R. D. 1988. Diplomacy and Domestic Politics. International Organization 42 (3): 427-60.
Ray, J. L. 1995. Democracy and International Conflict: An Evaluation of the Democratic Peace Proposition. Columbia: University of South Carolina Press.
Reed, W. 2000. A Unified Statistical Model of Conflict Onset and Escalation. American Journal of Political Science 44 (1): 84-93.
Rousseau, D., C. Gelpi, D. Reiter, and P. Huth. 1996. Assessing the Dyadic Nature of the Democratic Peace. American Political Science Review 90 (3): 512-33.
Russett, B. 1993. Grasping the Democratic Peace. Princeton: Princeton University Press.
Schoppa, L. 1997. Bargaining
with Japan. New York: Columbia University Press.
Schultz, K. A. 1998. Domestic Opposition and Signaling in International Crises. American Political Science Review 92:829-44.
Schultz, K. A. 2001. Democracy and Coercive Diplomacy. New York: Cambridge University Press.
Signorino, C. S. 1999. Strategic Interaction and the Statistical Analysis of International Conflict. American Political Science Review 93 (2): 279-98.
Signorino, C. S., and J. Ritter. 1999. Tau-B or Not Tau-B: Measuring the Similarity of Foreign Policy Positions. International Studies Quarterly 43 (1): 115-44.
Smith, A. 1995. Alliance Formation and War. International Studies Quarterly 39 (4): 405-26.
Smith, A. 1996. To Intervene or Not to Intervene: A Biased Decision. Journal of Conflict Resolution 40 (1): 16-40.
Smith, A. 1998. International Crises and Domestic Politics. American Political Science Review 92 (3): 623-38.
Smith, A. 1999. Testing Theories of Strategic Choice:
The Example of Crisis Escalation. American Journal of Political Science 43 (4): 1254-83.
Thompson, W., and R. Tucker. 1997. A Tale of Two Democratic Peace Critiques. Journal of Conflict Resolution 41 (3): 428-54.
Wagner, R. H. 2000. Bargaining and War. American Journal of Political Science 44 (3): 469-84.
Werner, S. 1999. The Precarious Nature of Peace. American Journal of Political Science 43 (3): 912-34.
Zorn, C. 2001. Generalized Estimating Equation Models for Correlated Data. American Journal of Political Science 45 (2): 470-90.