
9. Research Design in Testing Theories of International Conflict

Paul Huth and Todd Allee

Scholars studying the causes of international conflict confront a number of lingering questions. Under what conditions are disputes between states likely to escalate to war? What impact do alliances have on the outbreak of militarized conflict? When will a deterrent threat be credible? How do domestic political institutions affect a state's propensity to settle disputes nonviolently? One of the most important social scientific methods for investigating these ongoing questions is statistical analysis.

In this chapter we detail a number of issues regarding research design and estimation that researchers must consider when using statistical analysis to address important questions within the study of international conflict and security. We do not engage in a straightforward review of past statistical research on international conflict, but rather put forward a series of suggestions for ongoing and future statistical research on this important topic. However, during the course of our discussion we identify and discuss several statistical studies that converge with our suggestions and exemplify some of the most promising current work by political scientists.

Our central argument is that statistical tests of the causes of international conflict can be improved if researchers would incorporate into their research designs for empirical analysis a number of insights that have been emphasized in recent formal and game-theoretic approaches to the study of international conflict. We believe that greater attention to the implications of the formal and deductive theoretical literatures for statistical analyses can improve research designs in four areas.

1. Selecting theoretically appropriate units of analysis for building data sets
2. Understanding how to better address problems of selection effects in the construction of data sets and estimation of models
3. Accounting for nonindependent observations
4. Reducing the amount of measurement error in the construction of variables for testing hypotheses

We focus on these four aspects of research design because they address a set of important problems that empirical analysts need to address if compelling findings are to be produced by statistical tests. If researchers do not address these issues of research design effectively, weak empirical results can be expected despite the use of sophisticated statistical methods to test rigorously derived theoretical propositions. Even worse, the failure to give careful attention to problems of research design can result in the use of data sets that (1) are actually ill-suited for testing the theories that scholars claim to be evaluating or (2) severely limit our ability to draw accurate conclusions about causal effects based on the empirical findings produced by statistical analyses.

In this chapter we first describe four phases or stages that are associated with international disputes. These stages provide a useful depiction of how international disputes can evolve over time, and they illuminate a number of central research design issues faced by statistical researchers. Second, we discuss four particular research design questions and suggest possible answers. We conclude with a few brief observations about the implications of our analysis for future quantitative work.

Alternative Paths to Conflict and Cooperation in International Disputes

Broadly conceived, the theoretical study of international conflict involves four different stages.

1. Dispute Initiation
2. Challenge the Status Quo
3. Negotiations
4. Military Escalation

Existing quantitative tests of international conflict generally focus on one of these stages, although an increasing number of recent studies focus on more than one stage. We believe that statistical researchers need to think carefully about each of the four stages as part of a unified depiction of the evolution of international conflict.
An initial theoretical description of the four stages and the linkages between the stages will help to identify the practical challenges facing the quantitative researcher. These different stages are presented in figure 1, along with some of the principal paths leading to various diplomatic and military outcomes.¹

In the Dispute Initiation stage, the analysis centers on whether a dispute or disagreement emerges between countries in which one state (the challenger) seeks to alter the prevailing status quo over some issue(s) in its relations with a target state (see fig. 2). An example would be a decision by the leaders of a state to claim the bordering territory of their neighbor. If the leaders of the target state reject the claim, then a territorial dispute has emerged between the two states (e.g., Huth 1996). Other common reasons for the emergence of disputes include economic conflicts over the tariff and nontariff barriers to trade between countries (e.g., see Lincoln 1999, 1990 on U.S.-Japanese trade disputes), or the intervention by one country into the domestic political affairs of another (e.g., Daalder and O'Hanlon 2000 for an analysis of NATO policy in Kosovo). Theoretical analyses of this stage focus on explaining what issues and broader domestic and international conditions are likely to give rise to disputes and why it is that some state leaders are deterred from raising claims and disputing the prevailing status quo.

Once a state has voiced its disagreement with the existing status quo and a dispute has emerged, in the next stage, the Challenge the Status Quo stage, leaders of the challenger state consider both when to press their claims in a dispute and whether they wish to use diplomatic or military pressure to advance their claims. Statistical analyses of this stage, then, attempt to explain when and how states attempt to press or resolve existing disputes. As shown in figure 3, foreign policy decision makers can choose among options such as not actively pressing claims, reliance on negotiations and diplomatic efforts to change the status quo, or more coercive pressure involving military threats. The outcomes to this stage include: (1) the status quo, if the challenger remains quiescent, (2) the opening or resumption of negotiations due to diplomatic initiatives undertaken by the challenger, or (3) a military confrontation when the challenger resorts to verbal warnings and threatening the deployment of its military forces. The theoretical analysis of this stage would typically focus on explaining what policy choices would be selected by leaders among the various diplomatic and military options available and how domestic and international conditions influence such choices (e.g., Bueno de Mesquita and Lalman 1992; Powell 1999; Huth and Allee 2002).

In the Negotiations stage the challenger and target have entered into talks, and empirical tests attempt to explain the outcome of such rounds of talks (see fig. 4). In this stage, the focus shifts to questions such as which party has more bargaining leverage and is willing to withhold making concessions, whether the terms of a negotiated agreement would be accepted back home by powerful domestic political actors, and whether problems of monitoring and enforcing compliance with the terms of a potential agreement would prevent a settlement from being reached (e.g., Fearon 1998; Powell 1999; Putnam 1988; Downs and Rocke 1990; Schoppa 1997; Milner 1997). The possible outcomes to the Negotiations stage might include a settlement through mutual concessions or capitulation by one side. Furthermore, a stalemate can ensue if neither party is willing to compromise, while limited progress toward a resolution of the dispute can occur if one or both sides offer partial concessions. In the case of stalemate or partial concessions, the dispute continues, and the leaders of the challenger state reassess their policy options in another iteration of the Challenge the Status Quo stage.

Fig. 1. The evolution of international disputes

Fig. 2. The Dispute Initiation stage. (Note: C = Challenger State; T = Target State)

Fig. 3. The Challenge the Status Quo stage. (Note: C = Challenger State; T = Target State)

In the Military Escalation stage the challenger state has issued a threat of force (see fig. 1). If the target state responds with a counterthreat, a crisis emerges in which the leaders of both states must decide whether to resort to the large-scale use of force (see fig. 5).
Statistical tests of this stage generally investigate whether the military standoff escalates to the large-scale use of force or the outbreak of war, or is resolved through some type of less violent channel. This stage of international conflict has drawn considerable attention from international conflict scholars for obvious reasons, yet it remains the most infrequently observed stage of international conflict. Some of the more interesting theoretical puzzles posed at this stage center around questions of how credible the threats to use force are, what actions by states effectively communicate their resolve, and what the risks of war are as assessed by the leaders of each state (Fearon 1994a; Huth 1988; Schultz 1998; Smith 1998; Wagner 2000). The outcome to the international crisis determines whether the dispute continues, and if so, which foreign policy choices need to be reconsidered. For example, if war breaks out, a decisive victory by one side is likely to bring an end to the dispute; whereas a stalemate on the battlefield will lead to the persistence of the dispute in the postwar period.
Conversely, the avoidance of war may bring about the end of the dispute by means of a negotiated agreement, while a standoff in the crisis will result in the continuation of the dispute. In either case where the dispute persists, the focus shifts back to the challenger's options in another iteration of the Challenge the Status Quo stage.

Over the duration of a dispute, decision makers pass through the various stages numerous times; that is, they make repeated choices regarding the threat or use of force, negotiations, and dispute settlement. These choices of action (or inaction) become the cases that comprise the data sets used in quantitative studies of international conflict. Interestingly, the sequence of policy choices over time produces common diplomatic and military outcomes that may be arrived at through very different pathways (see fig. 1). For example, consider the outcome of a negotiated settlement reached through mutual concessions. In one dispute, this could be achieved by peaceful talks and mutual compromise in a short period of time, whereas, in another dispute, repeated military conflicts and then difficult and protracted negotiations eventually produce a settlement.

Fig. 4. The Negotiations stage. (Note: C = Challenger State; T = Target State)

Fig. 5. The Military Escalation stage. (Note: C = Challenger State; T = Target State)

Several important implications for quantitative studies of international conflict may be drawn from our discussion of figure 1 and the various stages of international conflict.

1. The outbreak of war and the use of large-scale military force are almost always preceded by a preexisting international dispute in which diplomatic efforts at negotiations and talks had been attempted. As a result, very few military confrontations take place without a prior history of failed diplomacy and negotiations between states over disputed issues. Furthermore, most international disputes do not evolve into military confrontations. Since the threat and use of military force is a rare and often final option, empirical studies need to investigate the conditions under which disputes will become militarized. In particular, statistical tests need to account for potential selection effects due to the fact that leaders might turn to military force under unique circumstances, such as when they are risk-acceptant or highly resolved.

2. Similarly, state leaders often engage in repeated efforts at negotiations before deciding to make substantial concessions. As a result, international disputes are rarely settled without states probing and seeking to shift the burden of concession making onto their adversary. For this reason it is important for theoretical models of international conflict and the empirical tests of such models to account for the process of dispute resolution in which leaders will often shift from an initial hard-line stance that seeks unilateral concessions by their adversary to a more accommodative bargaining position in which they accept the need to offer at least some concessions themselves.

3. In addition, a state's behavior in one dispute is often affected by its involvement in other disputes. Statistical tests need to consider the larger strategic context of a state's foreign policy in which the state's leaders must manage their country's simultaneous involvement in multiple international disputes. In other words, quantitative studies of international conflict need to consider the fact that a state's behavior in one dispute might be correlated with its behavior in another international dispute.

4. The impact of history and time may also be important. Empirical tests need to account for the fact that new information might be revealed to states over the course of an international dispute. The past history of diplomatic or military exchanges in the dispute might shape the current diplomatic and military behavior of states. For example, states are likely to shift away from conflict resolution strategies that have proven unsuccessful in previous interactions with an adversary. Not only may previous negotiations, stalemates, or military interactions be important, but short-term actions and more recent changes in prevailing conditions can lead decision makers to update their beliefs and change the policy options during a particular encounter with an adversary. Militarized disputes and crises can unfold over many months, and during that time domestic political conditions can change, other international disputes can arise, third parties can intervene, and the target state's own behavior can signal new information about its resolve and military strength. New information may therefore be revealed in the transition from one stage to the next stage. For example, a decision by a challenger state to issue a threat of force in the Challenge the Status Quo stage (see fig. 3) should not be treated necessarily as reflecting a firm decision to escalate and resort to the large-scale use of force in the subsequent Military Escalation stage (see fig. 5). A threat of force may be designed to probe the intentions and resolve of a target state, to induce the target to resume talks by signaling the dangers of a continued stalemate, or to pressure the target into making concessions in a new round of upcoming talks. As a result, a theoretical distinction should be drawn between the initiation and escalation of militarized conflicts.

Questions of Research Design

In this section we address a number of issues of research design that have significant implications for statistical tests of theories of international conflict. We use the four stages described previously (see fig. 1) to guide our discussion of these issues. We focus on four particular research design questions.

1. What are the units of analysis for building data sets?
2. How can problems of selection effects be addressed in empirical tests?
3. In what ways can problems of nonindependent observations arise in statistical analyses?
4. What are common problems of measurement error in statistical analyses of international conflict theories?

Before discussing these questions, it is useful to touch upon a few general features of the framework we use when thinking about the evolution of international disputes. Typically, our conceptualization is of a situation in which two states interact over an issue of (potential) disagreement, although in principle there is nothing to preclude one or both of the parties from being a nonstate actor. Also, we maintain that the primary decision makers (who would typically be state leaders) can be influenced by factors from a variety of levels of analysis. They could be subject to various domestic political impulses and constraints, for example. Furthermore, one could also examine individual-level traits of decision makers, if necessary, or aspects of the decision-making process. Therefore, we consider our framework to be quite general and flexible. Our basic structure is of a two-player game, but third parties can affect the calculus of either or both actors. The actors might, for example, consider whether an ally is likely to intervene in a certain scenario or how an international legal body would be likely to rule if referred a disputed issue. Our framework, however, is not designed to explain long-term, dynamic processes that involve multiple independent actors. In such a situation, simulations based on agent-based modeling might be quite useful (e.g., Cederman 1997).
Nevertheless, we feel that our conceptualization of international conflict is one that can accommodate many types of international interactions, as well as a wide variety of explanations for different types of behavior.

Selecting the Units of Analysis

We have argued that the theoretical study of international conflict centers on four generic types of stages in which state leaders select different policy options. Empirical tests will need to be designed for each of these four stages. Our general argument is that in building data sets for statistical analyses of theories about the causes of international conflict the appropriate unit of analysis in most empirical studies should be the individual state in a given international dispute with another state. Put slightly differently, one might say that we advocate looking at the behavior of each individual state in a directed dyad. The reasons for this are twofold.

1. Compelling theoretical explanations for conflict behavior must be grounded in the actions of individual states, which are based on the choices of their political and military leaders.² Rational choice models based on game-theoretic analyses do follow this approach, and we believe any powerful theoretical approach ultimately rests on understanding how the choices of state leaders and their strategic interactions with other states lead to various international conflict outcomes.

2. As previously argued, war and international crises rarely, if ever, occur in the absence of preexisting disputes over issues and prior periods of negotiations and diplomatic interactions. In fact, war and crises threatening war are quite infrequent forms of interstate interactions. It is important to understand why state leaders in some disputes at particular points in time are unable to resolve the issues in contention and why they are willing to escalate the dispute to the highest levels of military conflict.
As a result, a full explanation of the cause of international conflict should be based on the recognition that there are multiple stages through which international disputes may evolve.

If we consider the four stages in figure 1, we can see that the units of analysis in the Dispute Initiation stage would be potential challenger and target states. For the challenger state the dependent variable in a statistical test would center on the decision of its foreign policy leadership whether to contest the policies of another government in some selected issue area(s). Examples might include compliance with arms control or cease-fire agreements, disputes over one country's suppression of internal political opposition, or charges that one country is permitting rebel forces to operate on its territory. For this unit of analysis, theoretical models would explain when the potential challenger actually presses its claims and demands that the target state change its policies.

Once the demands and claims are clearly articulated, the target state would then have to decide whether to resist the policy changes called for by the challenger (see fig. 2). Thus, the dependent variable when the target state is the unit of analysis focuses on how firmly the leaders of the target state respond to demands for changing their policies. In the Challenge the Status Quo stage we know that an international dispute already exists. Thus, the observations in this stage would consist of all challenger states and the potentially repeated opportunities that their leaders had to initiate diplomatic or military policies in an attempt to achieve a favorable change in the prevailing status quo (see fig. 3).

For these first two stages the temporal definition of what we might term a "play" of the dispute needs to be given careful attention by researchers. One might initially define this as an annual observation and code what initiatives, if any, were pursued by the challenger in a given year.
Theoretically, however, there is no compelling reason to believe that a single foreign policy decision occurs once every twelve months. In some international disputes, for example, leaders of the challenger state might move through several stages in a single year. Efforts to rely on negotiations early in the year might end quickly in stalemate, yet by the end of the year the leaders might decide to turn to military pressure and threats of force in an attempt to break the diplomatic deadlock.

In contrast, in a different international dispute the issues at stake for the challenger state might not be that salient, and, as a result, it makes no effort to escalate or settle the dispute in a given year. The lack of attention given to the dispute raises questions about whether any policy options were even considered within a given year and whether that year of observation should be included in the data set. A research design that was grounded in a game-theoretic approach would shift away from relying on the convenient annual time period for each observation and instead would develop a more flexible set of coding rules to establish the temporal bounds of each iteration of a stage. With these more adaptable rules it would be possible to identify multiple iterations of a stage within a given year and to extend a single iteration of one stage beyond a year when theoretically appropriate.

In both the Negotiations and Military Escalation stages the units of analysis are the challenger and target states involved in a given round of talks or military confrontation.
Once again, this is analogous to an examination of the negotiation or escalation behavior of each state in a directed dyad. The duration of the round of talks or military confrontation would determine the time period of each state-level observation. In these two stages the dependent variables would typically focus on outcomes such as the extent of concessions by a state in negotiations, each state's level of military escalation, or how responsive one state's policies were to the short-term actions of the other.

One important implication of our discussion about the units of analysis in statistical analyses is that we do not generally favor or advocate the use of nondirected dyads (see Bennett and Stam 2000). While the use of directed dyads is desirable because it allows the researcher to capture individual state decisions in a particular strategic environment, much of the existing work utilizes nondirected dyads and focuses only on the joint outcome resulting from the interaction of a pair of states. Dyadic analyses have become increasingly common in statistical studies of international conflict, particularly in the democratic peace research program. For example, a number of statistical studies of the democratic peace have analyzed data sets consisting of pairs of states in which the occurrence of a war or militarized dispute short of war is coded on an annual basis over some specified time period. In some tests the population of dyads consists of all possible pairings of states, while other scholars rely on a smaller set of politically relevant dyads (e.g., Bremer 1992, 1993; Maoz 1997, 1998; Maoz and Russett 1992, 1993; Oneal and Ray 1997; Oneal and Russett 1997, 1999a, 1999b, 1999c; Ray 1995, chap. 1; Gowa 1999; Russett 1993). Politically relevant dyads are typically composed of states that are contiguous or pairs of states in which at least one party is a great power. These studies have produced many useful and important findings; nevertheless, we think there are reasons to question research designs that rely upon nondirected dyads as the basic unit of analysis.
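
To make the contrast between the two units of analysis concrete, the following minimal Python sketch shows how a single record for a round of talks or a military confrontation might be split into two state-level, directed observations. The schema is purely illustrative: the class and field names (`DisputeRound`, `DirectedObservation`, `initiated_force`, and so on) and the placeholder states "A" and "B" are assumptions made for this example and are not drawn from any existing data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisputeRound:
    """One round of a dispute between two states (hypothetical schema)."""
    challenger: str
    target: str
    challenger_democratic: bool
    target_democratic: bool
    challenger_initiated_force: bool
    target_initiated_force: bool


@dataclass(frozen=True)
class DirectedObservation:
    """The behavior of one state toward one opponent in a given round."""
    actor: str
    opponent: str
    role: str
    actor_democratic: bool
    opponent_democratic: bool
    actor_initiated_force: bool


def nondirected_outcome(r: DisputeRound) -> bool:
    """Joint coding: records only that force was initiated within the dyad."""
    return r.challenger_initiated_force or r.target_initiated_force


def to_directed(r: DisputeRound) -> list:
    """Split one round into two state-level observations, one per state."""
    return [
        DirectedObservation(r.challenger, r.target, "challenger",
                            r.challenger_democratic, r.target_democratic,
                            r.challenger_initiated_force),
        DirectedObservation(r.target, r.challenger, "target",
                            r.target_democratic, r.challenger_democratic,
                            r.target_initiated_force),
    ]


# Two hypothetical rounds in a mixed dyad that differ only in who initiates force.
round_1 = DisputeRound("A", "B", True, False, False, True)
round_2 = DisputeRound("A", "B", True, False, True, False)

for r in (round_1, round_2):
    print(nondirected_outcome(r), to_directed(r))
```

In this sketch the nondirected coding assigns the same value to both rounds, whereas the directed observations record which state initiated the use of force and what type of regime it had.
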
In particular, there are at least three limitations to such dyadic studies that can be nicely illustrated by considering empirical studies of the democratic peace.

First, in dyadic studies of the democratic peace the dependent variable takes the form of conflict involvement for the countries in the dyad, without identifying patterns of military initiation and response, or conflict resolution, by each state. This is an important drawback, since hypotheses about democratic institutions and norms of conflict resolution logically predict which states in a dyad should be most likely to initiate militarized disputes and escalate disputes to the brink of war as well as seek diplomatic settlements of disputes. Data on initiation and escalation are particularly important in testing the monadic version³ of the democratic peace. A nondirected dyadic democratic peace study, however, would simply note the occurrence of war or the existence of large-scale military action between two states in a mixed dyad for a given time period. This coding of the dependent variable would not distinguish between two very different scenarios in which democratic and nondemocratic states would resort to the large-scale use of force. In the first case, the nondemocratic state initiates the large-scale use of force after rejecting compromise proposals, and the democratic state responds by defending itself against the attack. In the second case, the reverse is true, as the democratic state initiates the large-scale use of force after rejecting compromise proposals, and the nondemocratic state responds by defending itself against the attack.

These two cases represent very different pathways to war and therefore suggest quite different conclusions about the monadic approach to the democratic peace. The second pathway is seemingly quite at odds with a monadic democratic peace argument, whereas the first pathway is not. The same general point is applicable regarding different pathways to conflict resolution.
In one case the dispute is settled by a nondemocratic state initiating concessions or withdrawing claims, while in a second case a democratic state takes the initiative to propose concessions, which are then accepted by a nondemocratic adversary. The first case runs counter to prevailing monadic arguments about democratic norms, while the second seems consistent. The findings of many existing quantitative studies, however, do not provide a solid foundation upon which to draw conclusions about the monadic version of the democratic peace (Rousseau et al. 1996). It seems very desirable then to disaggregate conflict behavior within a dyad into a more sequential analysis of each state's behavior over the course of a dispute between states. Thus, Huth and Allee (2002), in their study of the democratic peace, examine 348 territorial disputes from 1919 through 1995 in which each state's behavior for cases of the Challenge the Status Quo, Negotiation, and Military Escalation stages is analyzed. The result is that hypotheses about democratic patterns of initiation and response regarding negotiations and military conflicts can be clearly posited and then empirically tested.

A related problem with dyad-based data sets is that hypotheses regarding the impact of important independent variables, such as democratic norms and structures, on conflict outcomes cannot be tested directly. Instead, the researcher is forced to make inferences about the causal process that might have produced patterns of observed dyadic conflict outcomes. Consider the case in which one of the two states in a dispute is led by a minority government. This minority government might be unlikely to offer concessions to its adversary because of the difficulties in securing legislative or parliamentary support for such concessions. However, its adversary, knowing it is bargaining with a highly constrained opponent, might be more likely to offer concessions. Thus, the existence of a minority government could have the opposite impact on each of the states in the dyad.
By splitting the dyad into two state-level, directional observations, the researcher is able to more directly test the causal impact of minority government on conflict or bargaining behavior for democratic states (e.g., Huth and Allee 2002). The use of nondirected dyads, however, would obscure the true causal impact of domestic institutional arrangements such as minority government.

The final, related limitation of these dyadic studies, especially those using the popular nondirected dyad-year format, is that they test hypotheses about international conflict without grounding the empirical analysis in the development and progression of international disputes between states (fig. 1). When analyzing whether states become involved in a militarized dispute or war, the causal pathway necessarily includes a first stage of a dispute emerging. We do not think dyad-year arguments, such as those for the democratic peace, explain why disputes arise, but rather, only how disputes will be managed. The problem with the typical dyad-year-based data set is that the observed behavior of no militarized dispute or no war for certain dyad-years could be explained by two general processes, one of which is distinct from arguments in the democratic peace literature. That is, no military conflict occurs because (1) states were able to prevent a dispute from escalating, which the democratic peace literature addresses; or (2) states were not involved in a dispute, and thus there was no reason for leaders to consider using force. This second pathway suggests that democratic peace explanations are not that relevant. As a result, dyads that do not even get into disputes for reasons that are not related to democratic institutions or norms may appear to be cases in support of the democratic peace.

The use of politically relevant dyads helps to reduce this problem of irrelevant nondispute observations, but many relevant dyads are not parties to an international dispute that has the potential to escalate to military conflict.
If one has the typical data set that contains observations in which states never even considered using force, then potential problems of overstated standard errors and biased estimates of coefficients for the democratic peace variables can arise. For example, the negative coefficient on a democratic dyad variable in a study of military conflict could reflect the ability of democratic leaders to manage disputes in a nonviolent way, but it might also capture the fact that some democratic dyads were not involved in any disputes for many of the dyad-year observations in the data set. As a result, it is difficult to draw strong and clear causal inferences about the impact of joint democracy on conflict behavior (see Braumoeller and Sartori, chap. 6, this vol.). In the first scenario, it would not be worrisome to witness a conflict between two democratic states, since they should be able to manage the dispute without resorting to violence. However, if the second claim is true, then the occurrence of military confrontations is cause for concern, since the democracies are only pacific insofar as they are able to avoid getting into militarized disputes in the first place.

In sum, the nondirected dyad-year as the unit of analysis aggregates multiple stages in the development of an international dispute into a single observation, which makes it difficult for researchers in empirical tests to assess the causal processes operating at different stages in the escalation or resolution of international disputes.

Accounting for Selection Effects

Selection effects are a potential problem for any empirical test that fails to take into account that states do not enter into negotiations or become involved in a violent military clash randomly, but rather state leaders choose to go down a particular path during the evolution of a dispute. For the relatively few cases making it to either the Negotiations or Military Escalation stages, the story of how and why state leaders selected their countries into these samples is of utmost importance.
Similarly, the related idea of strategic interaction tells us that state leaders consider the anticipated response of opponents to various policy options. Even though some factors may affect the decisions of leaders in a potential conflict situation, this impact is not captured by standard statistical techniques because leaders avoid taking these potentially undesirable courses of action. This idea is particularly salient when analyzing the Dispute Initiation and the Challenge the Status Quo stages.

One way to think about selection effects hinges on the idea of sample selection bias (see Achen 1986; Geddes 1990; King, Keohane, and Verba 1994). In the simplest terms, using a nonrandom sample of cases to test causal relationships will often result in biased estimation of coefficients in statistical tests. Not only might the causal relationships suggested by the results of such statistical analyses be inaccurate for the limited sample examined, but they also cannot be used to draw inferences about the generalizable relationship that might exist between the independent variables and dependent variable outside of that sample. The logic is straightforward: cases that advance to some particular phase in the evolution of conflict may not be typical of relations between states (Morrow 1989). There may be some systematic reason or explanation for why these cases reach a certain stage, and the failure to account for this can produce misleading statistical results.

Unobserved factors, such as beliefs, resolve, risk attitudes, and credibility, might exert a selection effect (Morrow 1989; Fearon 1994b; Smith 1995, 1996). States may select themselves into certain stages of conflict or down certain paths of dispute resolution based upon the private information they possess about these unobserved factors. Alliance reliability and extended deterrence illustrate this point.
Reliable alliances and credible deterrent threats should rarely be challenged, so the large number of cases where alliance ties and general deterrence prevent challenges to the status quo are often excluded from data sets of militarized crises (Fearon 1994b; Smith 1995). During the Challenge the Status Quo stage, only highly resolved challengers would challenge strong alliances and credible deterrent threats. The failure to account for a challenger's resolve to carry out its military threat might lead the empirical researcher to mistakenly conclude that alliances increase the risk of military escalation in crises and to fail to appreciate that alliances might act as powerful deterrents to states initiating military confrontations (Smith 1995). Once again, statistical analyses of a single stage can be biased if they do not consider how state leaders selected themselves into the data set that is being tested for that stage.

Put slightly differently, sample selection bias is likely to exist when the variables that explain the ultimate outcome of the cases also explain why those cases got into the sample in the first place. If the factors explaining the outcome of the Military Escalation stage also help explain the decision to get into the Military Escalation stage (the choice made during the Challenge the Status Quo stage), then the estimated coefficients produced by statistical tests of cases that only appear in the Military Escalation stage are likely to be biased. Variables such as wealth, regime type, and satisfaction with the status quo may affect the decisions made during the Challenge the Status Quo stage and the Military Escalation stage in similar or different ways (see Hart and Reed 1999; Huth 1996; Reed 2000). For example, Huth (1996) reports that the military balance does not systematically influence challenger decisions to initiate territorial claims against neighboring states, but among cases of existing territorial disputes, challengers are much more likely to threaten and use force if they enjoy a military advantage.
In a study of the democratic peace, Reed (2000) explicitly models the decisions to initiate and then escalate military confrontations, and he finds that the impact of democratic dyads is far stronger in preventing the emergence of military confrontations than in preventing the escalation of such conflicts.

Incorporating strategic interaction in research designs on international conflict is also a desirable goal (see Signorino 1999; Smith 1999). In our framework, accounting for sample selection bias generally requires looking backward to explain where cases come from, whereas the idea of strategic interaction requires looking forward to see where cases would have gone if they had reached later stages in the evolution of conflict. The key idea is that the decisions within and across different stages are interdependent; actors take into account the likely behavior of other states at present, as well as possible future decisions during the escalation of international conflict (Signorino 1999). Strategic interaction may also be thought of as the explicit study of counterfactuals (Smith 1999, 1256). Actors anticipate how potential adversaries will behave under certain circumstances, such as at any of the decision-making nodes in our four stages, and avoid making decisions that may ultimately lead to undesirable outcomes. In our multiphase model of disputes (fig. 1), a challenger state may refrain from choosing the path leading to the Military Escalation stage because it anticipates a swift, strong military reaction from the defender in the event of a military threat. Or it may shun the decision to enter into negotiations because it anticipates no concessions being made by the leader on the other side. Once again, factors that truly affect the calculus of state leaders to make certain decisions or enter into certain phases or stages (such as the credibility of a defender's swift response, or the domestic constraints placed on a foreign leader) are not captured by standard statistical techniques because they are unobserved.
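The alliance example discussed earlier in this section can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reconstruction of any published analysis: the deterrence thresholds (0.7 and 0.3), the escalation probabilities, and the number of potential challengers are all invented for illustration. Resolve is drawn at random and treated as unobserved by the analyst, and by construction an alliance lowers the chance of escalation at any given level of resolve.

```python
import random

random.seed(1)

N = 100_000  # hypothetical potential challengers

crises = {True: 0, False: 0}       # threats issued, by defender alliance status
escalations = {True: 0, False: 0}  # crises that escalated
potential = {True: 0, False: 0}    # all potential challengers

for _ in range(N):
    alliance = random.random() < 0.5
    resolve = random.random()  # private information, unobserved by the analyst
    potential[alliance] += 1

    # Stage 1 (Challenge the Status Quo): an alliance deters all but the
    # most resolved challengers. The thresholds are assumed values.
    threshold = 0.7 if alliance else 0.3
    if resolve <= threshold:
        continue
    crises[alliance] += 1

    # Stage 2 (Military Escalation): for any given level of resolve the
    # alliance LOWERS the chance of escalation (assumed true effect).
    p_escalate = resolve * (0.9 if alliance else 1.0)
    if random.random() < p_escalate:
        escalations[alliance] += 1

# What a crisis-only data set shows: escalation rate among observed crises.
rate_in_crises = {a: escalations[a] / crises[a] for a in (True, False)}
# What the full population shows: escalation rate among all potential challengers.
rate_overall = {a: escalations[a] / potential[a] for a in (True, False)}

print("escalation | crisis, alliance:   ", round(rate_in_crises[True], 3))
print("escalation | crisis, no alliance:", round(rate_in_crises[False], 3))
print("escalation overall, alliance:    ", round(rate_overall[True], 3))
print("escalation overall, no alliance: ", round(rate_overall[False], 3))
```

Under these assumed parameters the expected escalation rate among observed crises is roughly 0.77 with an alliance and 0.65 without, while among all potential challengers it is roughly 0.23 with an alliance and 0.46 without. A data set limited to crises would therefore associate alliances with more escalation even though alliances reduce conflict at both stages.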
The most widely used statistical estimators fail to capture the concerns we raise about selection issues, and such techniques produce biased results when a nonrandom sample is used (Achen 1986). In many cases, the effects of some independent variables may be weakened or rendered insignificant. In fact, some claim that selection bias may produce coefficients with reversed signs (Achen 1986). For example, Huth (1988) finds in his statistical tests that alliance ties between defender and protégé are surprisingly associated with an increased risk of extended-immediate deterrence failure. Fearon (1994b), however, argues that this finding is due to selection effects in which an unmeasured variable (the challenger's resolve to initiate a threat against the protégé) is correlated with the observed variable of alliance ties. The result is that the estimated coefficient for the alliance variable is actually picking up the impact of the unmeasured challenger-resolve variable, and this helps to explain the unexpected negative sign on the alliance variable. In general, scholars incorporating the ideas of strategic interaction and selection bias into their models have discovered significant differences between coefficients produced by these corrected models and those produced by biased models (see Signorino 1999; Smith 1999; Reed 2000). These changed estimates even affect some of our most important propositions in world politics, such as the impact of joint democracy on conflict escalation as noted earlier (Hart and Reed 1999; Reed 2000).

Our general conclusion is that quantitative studies of military conflict should incorporate some type of correction for selection effects. In our opinion, the best suggestion is to model the multiple stages in the escalation of international conflict simultaneously.
This is generally done by estimating both a selection equation (to explain which cases get into a particular sample) and an outcome equation (to explain how the cases in this sample are played out). A good example of this is Huth and Allee's (2002) analysis of dispute resolution efforts by democratic states that are involved in territorial disputes. In estimating the probability that a democratic challenger will offer concessions in a round of talks (the outcome equation), they include a selection equation that accounts for the initial decision of the democratic challenger to propose talks. Given the prevalence of categorical variables in studies of international conflict, probit and logit selection models seem most promising, although other models may be appropriate for different types of dependent variables. When thinking about how particular cases get where they are, one should compile data on those cases that had some legitimate probability of making it to some stage, but did not. In other words, if analyzing the Negotiations stage, one should also have some information about cases that went to the Military Escalation stage, or in which the status quo was accepted.

It is often cumbersome and difficult to acquire data on relevant nonevents, such as instances in which leaders considered threatening force but did not do so, or where a state had the ability to press a claim concerning the treatment of ethnic minorities abroad and decided to accept the status quo. Yet we feel that acquiring and incorporating this information into quantitative analyses should be a high priority for scholars.
In other words, we advocate greater attention to the Dispute Initiation and Challenge the Status Quo stages, that is, to the identification of those situations that could plausibly become international disputes and then tracking which disputes might proceed through various Negotiations and Military Escalation stages. When this is not possible, and therefore no selection equation is specified, a different approach would be to include in the outcome equation those independent variables that would have been in a selection equation. In other words, researchers studying the outcomes of crises or militarized disputes should try to include independent variables that explain why those disputes and crises might have arisen in the first place.

In sum, the problems of selection bias and strategic choice are illustrated nicely by game-theoretic models of military conflict, which capture the real-life choices faced by state leaders. Our primary point is that quantitative analyses of international conflict need to account for the variety of choices that states have at different stages in the evolution of a dispute. Focusing narrowly on one phase of an interstate dispute without accounting for past and potential future choices can lead to biased statistical results and therefore limit our ability to draw accurate conclusions concerning the factors that contribute to military conflict.

Problems of Nonindependent Observations

Researchers conducting statistical analyses of data sets on international conflict need to consider potential problems of nonindependent observations. There are a number of ways in which the dependence of observations can occur in international conflict data sets. We focus on two types that are likely to be present in many data sets in which the basic units of analysis are states that are involved in international disputes. In the first case, the dependence of observations is due to the time-series nature of the data in which the same state appears multiple times in the data set since the international dispute spans many years.
With this data set the analyst is testing models that seek to explain variation in a state's dispute behavior over time. In the second case, cross-sectional or spatial dependence is present because in a given time period (e.g., a year) the same state is a party to several different international disputes or is influenced by the behavior of neighboring states. The empirical analysis in this second case centers on testing models that might account for variation in a state's behavior across the different international disputes in which it is involved.

In the time-series example, the statistical problem is that values on the dependent variable for a state-dispute observation in time period t are systematically related to the behavior and actions of that same state in preceding time periods. Put differently, the prior history of the dispute is important in understanding the current behavior of the disputants. In the cross-sectional example, the problem is a bit different in that the actions of a single state in one dispute are influenced by the behavior of that same state or other states in a second dispute. In either of these two cases of dependent observations, the statistical implications are that the assembled data sets do not contain as much independent information as is assumed by the standard statistical models utilized by researchers. As a result, the standard errors associated with the estimated coefficients are likely to be inaccurate. In particular, they are likely to be underestimated and, as a result, researchers run the risk of overstating the statistical significance of coefficients and the findings they report (see Greene 1997, chap. 13).

If we refer back to figure 1, problems of both time-series and cross-sectional dependence of observations are likely to be present in data sets that are used to test models for the Dispute Initiation and Challenge the Status Quo stages. The reason is that a common research design for each of these stages is to assemble what are termed pooled cross-sectional time-series data sets.
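The tendency of standard errors to be underestimated when observations are dependent can be seen in a very simple setting. The sketch below is an assumption-laden illustration rather than a model of any conflict data set: it generates a hypothetical pooled data set of 60 states observed for 40 years each, gives every state a persistent state-specific component, and compares the standard error of the overall mean computed as if all rows were independent with one computed from the state means.

```python
import math
import random
import statistics

random.seed(7)

N_STATES = 60   # hypothetical number of states
N_YEARS = 40    # observations per state

observations = []
state_means = []
for _ in range(N_STATES):
    state_effect = random.gauss(0.0, 1.0)  # persistent, state-specific component
    ys = [state_effect + random.gauss(0.0, 1.0) for _ in range(N_YEARS)]
    observations.extend(ys)
    state_means.append(statistics.fmean(ys))

n = len(observations)

# Naive standard error of the overall mean: treats all n rows as independent.
naive_se = statistics.stdev(observations) / math.sqrt(n)

# Cluster-based standard error: with equal-sized clusters the overall mean is
# the mean of the state means, so only N_STATES independent pieces exist.
cluster_se = statistics.stdev(state_means) / math.sqrt(N_STATES)

print(f"naive SE:   {naive_se:.3f}")
print(f"cluster SE: {cluster_se:.3f}")
```

With these assumed variances the naive standard error should come out several times smaller than the cluster-based one, which is the sense in which a pooled data set contains less independent information than its row count suggests.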
Researchers might build a data set that includes many different states that are involved in many different disputes (or potential disputes) over some extended period of time.

One such illustrative example comes from Huth's (1996) study of territorial disputes, in which he conducted a two-stage analysis in which the first stage was very similar to what we have termed the Dispute Initiation stage. In this initial analysis he included all states from 1950 through 1990 that issued territorial claims against another state as well as a random sample of states that did not dispute their borders. He then tested models that sought to explain which challenger states did in fact dispute territory. In the second stage of analysis, he focused on all of the territorial dispute cases from 1950 through 1990, and he analyzed the varying levels of diplomatic and military conflict initiated by challenger states. In this two-stage analysis Huth found evidence of both temporal and cross-sectional relationships between cases. For example, challenger states that had signed formal agreements settling border disputes with a particular country prior to 1950 were very unlikely to repudiate those agreements and initiate a new territorial dispute in the post-1950 period. Challenger states in a territorial dispute were also less likely to resort to military threats in an attempt to change the status quo if they were involved simultaneously in multiple territorial disputes (chaps. 4–5).

In the Military Escalation and Negotiations stages in figure 1 the data sets that would be relied upon for statistical tests would not be standard pooled cross-sectional time-series in nature, but rather would be pooled cross-sectional designs. For example, a data set for testing the Military Escalation stage would typically consist of all military confrontations initiated by a challenger state over some disputed issue. Similarly, in the Negotiations stage the data set would include all rounds of talks held by states over disputed issues.
For each type of data set, cross-sectional dependence of observations could be a problem, as could temporal dependence of observations due to the potential for repeated rounds of talks or military confrontations. For example, in the military escalation data set the decision by a state's leadership to resort to the large-scale use of force in a particular case could be influenced by whether their adversary was already engaged in a military confrontation with another state (see Huth, Gelpi, and Bennett 1993) or whether they had suffered a military defeat at the hands of their current adversary in a prior military confrontation (see Huth 1996).

A common problem for many quantitative researchers who are working with probit and logit models is that standard corrections for time-series or spatial dependence in data are not well developed in the statistical literature. Political methodologists, however, have devised a number of potentially useful corrections that can be employed to deal with nonindependent observations due to time-series effects (e.g., Beck, Katz, and Tucker 1998), and such corrections are often desirable in estimating equations. Nevertheless, we want to express a note of caution because researchers may too readily turn to these statistical corrections and rely only upon them to deal with the important problem of dependent observations. We strongly recommend that researchers also devote considerable effort to accounting for problems of nonindependent observations through better specification of the theoretical models that are empirically tested. This would entail researchers developing hypotheses that capture the influences of time-series and cross-sectional factors and then including such factors as explanatory variables in the equations that are tested.
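As a rough sketch of what such explanatory variables might look like in practice, the code below builds one temporal and one cross-sectional variable from toy inputs. The function names, the row layout `(state, dispute, year)`, and the data are hypothetical and are not the coding rules of any study cited here; the choice to start the year counter at zero is also an assumption, since the earlier history is unknown.

```python
from collections import Counter


def years_since_last_confrontation(confrontation_by_year):
    """For each year, count the years elapsed since the last confrontation.

    The counter starts at 0 in the first observed year (an assumption) and
    resets in the year after each confrontation.
    """
    elapsed = 0
    out = []
    for had_confrontation in confrontation_by_year:
        out.append(elapsed)
        elapsed = 0 if had_confrontation else elapsed + 1
    return out


def concurrent_disputes(rows):
    """For each (state, dispute, year) row, count the state's OTHER disputes that year."""
    per_state_year = Counter((state, year) for state, _dispute, year in rows)
    return [per_state_year[(state, year)] - 1 for state, _dispute, year in rows]


# Hypothetical toy inputs.
history = [0, 0, 1, 0, 0, 0, 1, 1, 0]
rows = [("A", "d1", 1960), ("A", "d2", 1960), ("B", "d1", 1960), ("A", "d1", 1961)]

print(years_since_last_confrontation(history))  # [0, 1, 2, 0, 1, 2, 3, 0, 0]
print(concurrent_disputes(rows))                # [1, 1, 0, 0]
```

Variables of this kind enter the estimated equation as ordinary regressors, so their coefficients can be read against a stated hypothesis about prior history or simultaneous commitments.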
An excellent example of specifying such factors directly is the work of Michael Ward and Kristian Gleditsch, who include explanatory variables in their models that reflect spatial clustering of conflict, trade, and democratization among states (see Gleditsch and Ward 2000; Gleditsch 2002). The primary advantage of this is that any estimated coefficients that are intended to pick up the effects of dependent observations can be interpreted in a more direct manner given that a theoretically grounded and more specific causal argument has already been provided.

Another recommendation is to switch from standard logit and probit models to event history or duration models that do explicitly account for time-series effects (for a general discussion of such models see Zorn 2001). Event history models focus on explaining the transition from an initial condition (or status quo) to a new one as a function of time. For example, drawing on the democratic peace literature, researchers might hypothesize that given a territorial dispute between two states, the time to settlement of the dispute by means of a negotiated agreement would be shorter if both states were democratic. Good examples of IR scholars using event history models include Werner's (1999) study of the durability of peace agreements in the aftermath of wars and Bennett and Stam's work (1998) on the duration of interstate wars.

The Measurement of Variables

Measurement error is a ubiquitous concern in all sciences, especially the social sciences. Imprecise measurement of explanatory variables, especially if systematic, casts doubt on our ability to draw accurate causal inferences. We feel concerns about measurement should be strongly emphasized in research designs of international military conflict.
Our four-stage model of international conflict illustrates some specific issues faced by quantitative researchers of international conflict, such as the need to incorporate variables and measures that may be uniquely relevant to certain stages in the evolution of international conflict. In addition, all studies of military conflict are saddled with certain unique data and measurement concerns, such as the use of large data sets with large numbers of variables, the ambiguity of many key concepts, a lack of creativity in measurement, and disincentives to devote resources to better measurement.

Since the actions taken by leaders over the course of an international dispute may provide additional useful information, researchers may need to modify preexisting measures at later phases of conflict. Some important underlying concepts, such as the military balance between two states, could be measured differently depending on which stage in figure 1 is being analyzed. For example, a general indicator of standing military capabilities might be used to measure the military balance in a test of the Challenge the Status Quo stage. However, once both sides have made threats to use military force or have mobilized troops, adding information on the local balance of forces in this dispute would improve the measurement of the military balance in the Military Escalation stage. Indeed, measures of the local balance of forces have been reported to have strong effects on the success or failure of extended-immediate deterrence and on whether territorial disputes escalate to war (e.g., Huth 1988; Huth and Allee 2002).

In addition, as mentioned earlier, the decisions made by political and military leaders during the evolution of a dispute may convey new information. This information should then be incorporated into empirical tests of later phases of a dispute. For example, leaders may generate audience costs or use costly signals at the beginning of the Military Escalation stage to make their threat of military force appear credible to an adversary (Fearon 1994a).
Therefore, this new information about the added credibility of a state's threat of force should be used to modify preexisting measures of credibility or be added to any test of the Military Escalation stage. An example of this is Huth and Allee's (2002) study of state behavior in military crises in which they code a variable for whether democratic leaders send a strong public signal of the resolve to use force at the outset of the crisis. In their statistical analyses they find that such democratic signals of resolve are strongly associated with deterring escalation by the adversary state. Another interesting example is the finding reported by Schultz (2001) that the deterrent threats of democratic states are more likely to succeed if the leaders of opposition parties signal their support for the government's deterrent policy during the confrontation with a potential attacker. The overriding idea is that variables reflecting additional information can be added to analyses of the Military Escalation stage or the Negotiations stage. One should not always rely on the same measure of credibility, resolve, or military balance in empirical tests of the different stages of an international dispute.

A more general measurement concern for quantitatively minded scholars of military conflict is that the quality of data is often poor. The recent turn to dyad-years as the unit of analysis in many studies of military conflict typically results in tens of thousands, if not hundreds of thousands, of cases in data sets. Trying to find data on all variables for so many dyads is a daunting task. With limited time and limited resources, there is a trade-off between the quantity of data collected and the quality of this data. So researchers are forced to settle for imprecise or suspect data, or to drop observations with missing data.⁴ In addition, the increasing acceptance of the idea that domestic politics variables should be included in studies of international conflict adds to the data collection burden.

One promising solution to the cumbersome task of collecting quality data lies with sampling.
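A minimal sketch of one such sampling design follows, assuming a hypothetical population of 200,000 dyad-years, a single binary regressor, and invented logit parameters. It keeps every conflict case, draws five non-conflict cases per conflict case at random, and compares the logit estimates from the sample with those from the full population. Because the regressor is binary the logit fit has a closed form, so no estimation library is needed; the intercept adjustment shown is the standard one usually called prior correction, and it presumes the population event rate is known.

```python
import math
import random

random.seed(3)


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical population of dyad-years with one binary regressor x
# and a rare outcome y generated from a logit with assumed parameters.
TRUE_INTERCEPT, TRUE_SLOPE = -6.0, 1.5
population = []
for _ in range(200_000):
    x = 1 if random.random() < 0.2 else 0
    y = 1 if random.random() < logistic(TRUE_INTERCEPT + TRUE_SLOPE * x) else 0
    population.append((x, y))


def fit_logit_binary_x(rows):
    """Exact logit MLE for a single binary regressor (the model is saturated)."""
    n = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    for x, y in rows:
        n[(x, y)] += 1
    intercept = math.log(n[(0, 1)] / n[(0, 0)])
    slope = math.log(n[(1, 1)] / n[(1, 0)]) - intercept
    return intercept, slope


cases = [row for row in population if row[1] == 1]
noncases = [row for row in population if row[1] == 0]
sample = cases + random.sample(noncases, 5 * len(cases))  # all events, 5 controls each

full_intercept, full_slope = fit_logit_binary_x(population)
raw_intercept, sample_slope = fit_logit_binary_x(sample)

# Prior correction of the intercept, using the population event rate tau
# and the sample event rate ybar.
tau = len(cases) / len(population)
ybar = len(cases) / len(sample)
corrected_intercept = raw_intercept - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))

print(f"slope:     full={full_slope:.2f}  sample={sample_slope:.2f}")
print(f"intercept: full={full_intercept:.2f}  raw={raw_intercept:.2f}  "
      f"corrected={corrected_intercept:.2f}")
```

In this setup the sample contains only a few thousand rows, yet its slope estimate should track the full-population estimate closely; only the uncorrected intercept is far off, because events are deliberately overrepresented.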
The strategy of what can be termed retrospective random sampling has rarely been used in large-n studies of international military conflict, yet the use of retrospective sampling designs would allow scholars to devote more energy toward the collection of better data. In such sampling designs the researcher combines the population of observed military conflicts (crises or wars, for example) with a random sample of cases in which no military conflict occurred.⁵ Logit models can then be used to estimate equations in which the slope coefficients are consistently estimated (the constant term requires a correction for the sampling rate) and the degree of inefficiency associated with standard errors is quite small. Taking a random sample from the large population of noncases of conflict could be a valuable tool for addressing concerns about selection bias (Achen 1999; King and Zeng 2001).

Furthermore, studies of international conflict and crisis behavior often employ concepts that are difficult to measure. Game-theoretic models often generate hypotheses about the beliefs of actors, yet it is nearly impossible to get inside the minds of decision makers to understand how they interpret a situation. As a result, researchers have to develop imperfect operational measures for key concepts such as the credibility of a threat, the political constraints on leaders, or the resolve of state leaders. In addition, scholars have reached little consensus on how to measure such central concepts, and there has been too little critical debate on how to measure certain difficult or important concepts.⁶ The pursuit of ways to creatively measure theoretical concepts should be a high priority. Hard-to-measure concepts are typically measured by single proxy variables intended to capture the concept of interest.
Yet these concepts could also be measured by employing techniques, such as confirmatory factor analysis, that allow one to combine related, observable variables into a single underlying factor that captures this hard-to-measure concept in a theoretically informed manner. Substituting alternative measures for purposes of robustness checks could also be done more often.

It is important that more of an effort be made to collect data and assemble new data sets. Unfortunately, the cost and time required to collect new data can be substantial, and, as a result, the incentives to rely upon existing data sets are quite strong. Yet the key principle of measurement in the social sciences is that an empirical researcher should make every attempt to use, collect, or obtain data that best fits the theoretical propositions. Widely used measures for concepts like military capabilities or democracy may be appropriate for testing certain hypotheses, yet less desirable for testing other propositions. Scholars should be as careful as possible to capture the precise logic of their hypotheses. For example, the hypothesis that democratic institutions restrict the use of force should be tested with data on institutional arrangements, not with a general measure of democracy such as the widely used net democracy measure from the Polity data set. Once again, while existing data sets often provide a valuable function, more of an effort should be made to put together new data sets and compile new measures whenever such measures do not exist, or when available variables are insufficient for the task at hand.

Conclusion

We have argued that the theoretical and empirical analysis of international conflict should be broken down into four generic stages. By thinking about the causes of international conflict in terms of these stages, we believe researchers are more likely to develop research designs for statistical tests that

1. focus on state leaders and their choices in international disputes as the unit of analysis for building data sets,
2. recognize that selection effects and strategic behavior are central concepts for understanding how international disputes evolve into stages where higher levels of conflict occur,
3. better account for how policy choices in international disputes are linked across time and space, and
4. include explanatory variables that better capture and measure the impact of domestic and international conditions during periods of more intense diplomatic and military interactions.

In our judgment, such research designs will greatly improve statistical tests of theories of international conflict by better addressing problems of selection bias, nonindependent observations, and measurement error. One of the central implications of our analysis is that there should be a tighter connection between the formal game-theoretic literature and the design of statistical analyses and tests. Another implication is that empirical researchers will need to devote more time, effort, and resources to developing more microlevel data sets of international disputes across different issue areas as well as developing data on dispute behavior that does not involve military threats and the use of force.

Recommended Readings

Achen, C. 1986. The Statistical Analysis of Quasi-Experiments. Berkeley: University of California Press.
Fearon, J. 1994. Domestic Political Audiences and the Escalation of International Disputes. American Political Science Review 88 (3): 577–92.
Hart, R., and W. Reed. 1999. Selection Effects and Dispute Escalation. International Interactions 25 (3): 243–64.
Huth, P. 1996. Standing Your Ground. Ann Arbor: University of Michigan Press.
Huth, P., and T. Allee. 2002. The Democratic Peace and Territorial Conflict in the Twentieth Century. New York: Cambridge University Press.
Signorino, C. 1999. Strategic Interaction and the Statistical Analysis of International Conflict. American Political Science Review 93 (2): 279–98.
Smith, A. 1999. Testing Theories of Strategic Choice: The Example of Crisis Escalation.
American Journal of Political Science 43 (4): 1254–83.

Notes

1. We present each stage in its most simplified form to highlight only a few basic points. For example, we focus on only two actors but certainly third parties could be included as actors. In addition, we make no effort to model these stages rigorously. We simply map the choices available to states in a dispute and the outcomes of the various paths to illustrate the questions and concerns statistical researchers need to address.
2. Of course, if the researcher's focus is on explaining nonstate conflict behavior then we would argue that the unit of analysis is the individual political actor or the leader of some organization that adopts and carries out particular policies.
3. By monadic we mean that democratic states are less likely to initiate military threats and the use of force against all other states, not just other democratic states.
4. The idea of dropping cases from statistical analyses of international conflict is especially problematic, since the cases dropped often exhibit systematic similarities. Data on military expenditures, military capabilities, and GNP are often hardest to obtain for certain types of countries, such as developing countries or countries with closed political systems. Dropping such cases eliminates certain types of meaningful cases and results in truncated values of some independent variables.
5. This idea is logically similar to the use of control group designs in quasi-experimental research (see Cook and Campbell 1979).
6. Recent debates on how to measure joint democracy and the similarity of security interests constitute a welcome advance (see Thompson and Tucker 1997; Signorino and Ritter 1999).

References

Achen, C. 1986. The Statistical Analysis of Quasi-Experiments. Berkeley: University of California Press.
Achen, C. 1999. Retrospective Sampling in International Relations. Annual Meeting of the Midwest Political Science Association, Chicago.
Beck, N., J. Katz, and R. Tucker. 1998. Taking Time Seriously.
American Journal of Political Science 42 (4): 1260–88.
Bennett, D. S., and A. C. Stam. 1998. The Declining Advantages of Democracy. Journal of Conflict Resolution 42 (3): 344–66.
Bennett, D. S., and A. C. Stam. 2000. Research Design and Estimator Choices in the Analysis of Interstate Dyads: When Decisions Matter. Journal of Conflict Resolution 44 (5): 653–85.
Bremer, S. 1992. Dangerous Dyads. Journal of Conflict Resolution 36 (2): 309–41.
Bremer, S. 1993. Democracy and Militarized Interstate Conflict, 1816–1965. International Interactions 18 (3): 231–49.
Bueno de Mesquita, B., and D. Lalman. 1992. War and Reason: Domestic and International Imperatives. New Haven: Yale University Press.
Cederman, L. 1997. Emergent Actors in World Politics. Princeton: Princeton University Press.
Cook, T., and D. Campbell. 1979. Quasi-Experimentation. Boston: Houghton Mifflin.
Daalder, I., and M. O'Hanlon. 2000. Winning Ugly. Washington, DC: Brookings Institution.
Downs, G. W., and D. M. Rocke. 1990. Tacit Bargaining, Arms Races, and Arms Control. Ann Arbor: University of Michigan Press.
Fearon, J. 1994a. Domestic Political Audiences and the Escalation of International Disputes. American Political Science Review 88 (3): 577–92.
Fearon, J. 1994b. Signaling versus the Balance of Power and Interests. Journal of Conflict Resolution 38 (2): 236–69.
Fearon, J. 1998. Bargaining, Enforcement, and International Cooperation. International Organization 52 (2): 269–306.
Geddes, B. 1990. How the Cases You Choose Affect the Answers You Get: Selection Bias in Comparative Politics. Political Analysis 2:31–50.
Gleditsch, K. 2002. All International Politics Is Local: The Diffusion of Conflict, Integration, and Democratization. Ann Arbor: University of Michigan Press.
Gleditsch, K., and M. Ward. 2000. War and Peace in Space and Time. International Studies Quarterly 44 (1): 1–30.
Gowa, J. 1999. Ballots and Bullets. Princeton: Princeton University Press.
Greene, W. 1997. Econometric Analysis. Upper Saddle River, NJ: Prentice-Hall.
Hart, R., and W. Reed. 1999. Selection Effects and Dispute Escalation.
International Interactions 25 (3): 243–64.
Huth, P. 1988. Extended Deterrence and the Prevention of War. New Haven: Yale University Press.
Huth, P. 1996. Standing Your Ground. Ann Arbor: University of Michigan Press.
Huth, P., and T. Allee. 2002. The Democratic Peace and Territorial Conflict in the Twentieth Century. New York: Cambridge University Press.
Huth, P., C. Gelpi, and D. S. Bennett. 1993. The Escalation of Great Power Militarized Disputes: Testing Rational Deterrence Theory and Structural Realism. American Political Science Review 87 (3): 609–23.
King, G., R. O. Keohane, and S. Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.
King, G., and L. Zeng. 2001. Explaining Rare Events in IR. International Organization 55 (3): 693–716.
Lincoln, E. 1990. Japan's Unequal Trade. Washington, DC: Brookings Institution.
Lincoln, E. 1999. Troubled Times. Washington, DC: Brookings Institution.
Maoz, Z. 1997. The Controversy over the Democratic Peace. International Security 22 (1): 162–98.
Maoz, Z. 1998. Realist and Cultural Critiques of the Democratic Peace. International Interactions 24 (1): 3–89.
Maoz, Z., and B. Russett. 1992. Alliance, Contiguity, Wealth, and Political Equality. International Interactions 17 (3): 245–67.
Maoz, Z., and B. Russett. 1993. Normative and Structural Causes of Democratic Peace, 1946–86. American Political Science Review 87 (3): 624–38.
Milner, H. V. 1997. Interests, Institutions, and Information: Domestic Politics and International Relations. Princeton: Princeton University Press.
Morrow, J. D. 1989. Capabilities, Uncertainty, and Resolve: A Limited Information Model of Crisis Bargaining. American Journal of Political Science 33 (4): 941–72.
Oneal, J., and J. L. Ray. 1997. New Tests of the Democratic Peace Controlling for Economic Interdependence, 1950–1985. Political Research Quarterly 50 (3): 751–75.
Oneal, J., and B. Russett. 1997. The Classical Liberals Were Right. International Studies Quarterly 41 (2): 267–94.
Oneal, J., and B. Russett. 1999a.
Assessing the Liberal Peace with Alternative Specifications. Journal of Peace Research 36 (4): 423–42.
Oneal, J., and B. Russett. 1999b. Is the Liberal Peace Just an Artifact of the Cold War? International Interactions 25 (3): 213–41.
Oneal, J., and B. Russett. 1999c. The Kantian Peace. World Politics 52 (1): 1–37.
Powell, R. 1999. In the Shadow of Power. Princeton: Princeton University Press.
Putnam, R. D. 1988. Diplomacy and Domestic Politics. International Organization 42 (3): 427–60.
Ray, J. L. 1995. Democracy and International Conflict: An Evaluation of the Democratic Peace Proposition. Columbia: University of South Carolina Press.
Reed, W. 2000. A Unified Statistical Model of Conflict Onset and Escalation. American Journal of Political Science 44 (1): 84–93.
Rousseau, D., C. Gelpi, D. Reiter, and P. Huth. 1996. Assessing the Dyadic Nature of the Democratic Peace. American Political Science Review 90 (3): 512–33.
Russett, B. 1993. Grasping the Democratic Peace. Princeton: Princeton University Press.
Schoppa, L. 1997. Bargaining with Japan. New York: Columbia University Press.
Schultz, K. A. 1998. Domestic Opposition and Signaling in International Crises. American Political Science Review 92:829–44.
Schultz, K. A. 2001. Democracy and Coercive Diplomacy. New York: Cambridge University Press.
Signorino, C. S. 1999. Strategic Interaction and the Statistical Analysis of International Conflict. American Political Science Review 93 (2): 279–98.
Signorino, C. S., and J. Ritter. 1999. Tau-B or Not Tau-B: Measuring the Similarity of Foreign Policy Positions. International Studies Quarterly 43 (1): 115–44.
Smith, A. 1995. Alliance Formation and War. International Studies Quarterly 39 (4): 405–26.
Smith, A. 1996. To Intervene or Not to Intervene: A Biased Decision. Journal of Conflict Resolution 40 (1): 16–40.
Smith, A. 1998. International Crises and Domestic Politics. American Political Science Review 92 (3): 623–38.
Smith, A. 1999. Testing Theories of Strategic Choice: The Example of Crisis Escalation. American Journal of Political Science 43 (4): 1254–83.
Thompson, W., and R. Tucker. 1997.
A Tale of Two Democratic Peace Critiques. Journal of Conflict Resolution 41 (3): 428–54.
Wagner, R. H. 2000. Bargaining and War. American Journal of Political Science 44 (3): 469–84.
Werner, S. 1999. The Precarious Nature of Peace. American Journal of Political Science 43 (3): 912–34.
Zorn, C. 2001. Generalized Estimating Equation Models for Correlated Data. American Journal of Political Science 45 (2): 470–90.