Mass Purges: Top-down Accountability in Autocracy*

B. Pablo Montagnes (Emory University)
Stephane Wolton (London School of Economics)

This version: July 16, 2016

Abstract

This paper contends that mass purges are a salient method of top-down accountability used by totalitarian regimes to increase party performance and shape party membership. In our theoretical framework, party members work on independent projects. Their fate, however, is linked through the purge, and a member’s effort depends on the activism of all others via what we call the pool size effect. In turn, the autocrat’s incentive to purge depends on the informativeness of different performance indicators, a function of all members’ effort via what we term the pool makeup effect. These novel pool effects emerge from the many (party members) to one (autocrat) accountability problem faced by the principal. Our approach also highlights how violence affects top-down accountability in autocracy. Greater intensity of violence increases effort, but can impede selection. The autocrat thus cannot escape a trade-off between love (less unity) and fear (more activism).

* We thank Scott Ashworth, Dan Bernhardt, Alessandra Casella, Ali Cirone, Torun Dewan, Tiberiu Dragu, Scott Gehlbach, Thomas Groll, Haifeng Huang, Navin Kartik, Roger Myerson, Salvatore Nunnari, Carlo Prato, Arturas Rozenas, Milan Svolik, Scott Tyson, conference and seminar participants at the Sixth LSE-NYU Conference, Priorat Workshop in Theoretical Political Science, the Third Formal Theory & Comparative Politics Conference at Chicago, 3rd PECO Conference in Washington, D.C., Bocconi University, Columbia University for helpful comments and suggestions. All remaining errors are the authors’ responsibility.
By removing a proportion of party members, mass purges also provide for the influx of new members (Brzezinski, 1956, 9, 131, 168; Teiwes, 1993, 42-43) drawn from the pool of candidates to the party (Rigby, 1968, 52-53). Consequently, while elite purges target specific individuals, the target of mass purges is more diffuse: it is the mass of party members, composed of potentially millions of individuals (Teiwes, 1993, 5).[6] Further, while elite purges depend on circumstances, communist leaders sought to regulate the periodicity of mass purges (Getty, 1987, 38, 41), with Mao prescribing rectification campaigns twice every five years (Teiwes, 1993, 224). On this dimension, mass purges thus share more similarities with elections than elite purges do.

[3] There obviously exist non-formal theories of totalitarian terror, in particular Arendt (1973), to which we cannot do justice here.
[4] Rigby (1968), on the other hand, argues that the purges of the CPSU in the 1930s paved the way for the Great Terror and the show trials of 1936-38. However, he still admits that the 1933-34, 1951-53, and especially 1921-23 purges were not caused by conflict between leaders (see, e.g., 282-83). Far from generating purges, the contest for power in the USSR between Stalin, Trotsky, Zinoviev, and Bukharin led to an increase in party membership as Stalin tried to recruit allies (Rigby, 1968, 131).
[5] Chinese rectification movements were also meant to educate the rank and file, a concern generally absent from Soviet purges. Interestingly, the few elite purges of the Chinese leadership did not trigger mass purges even when the purged leaders, like Gao Gang and Rao Shushi, were accused of establishing independent kingdoms (Teiwes, 1993, chapter 5, especially 142 and 162). The dismissal of Peng Dehuai in 1959, however, was concomitant with the 1959-60 Rectification Campaign. Nonetheless, Teiwes argues that the two events were uncorrelated (ibid., 339 and 341).
3 Set-up
We study a two-period (t ∈ {1, 2}) model with an autocrat (A) and a [0, 1] continuum of party members, indexed by the superscript m. Each party member is characterized by a type τ ∈ {i, o}, where τ = i corresponds to an ideologue and τ = o to an opportunist.[7] A party member’s type is his private information, but it is common knowledge that a proportion λ of party members are ideologues.
Each period, party member m exerts effort $e^m \in [0, 1]$ at cost $(e^m)^2/2$ on an individual project. The probability that member m’s project is successful is equal to $e^m$.[8] While a member’s effort is not observed by the autocrat, the outcome of his project is (e.g., whether he has fulfilled his quota). This assumption corresponds to historical evidence that officials in charge of the purges had little information about local circumstances and could only judge according to how successfully problem cases or certain projects were handled (Teiwes, 1993, 28, 42; Rigby, 1968, 96). At the end of period 1, the autocrat decides to purge a proportion κ of party members, which we refer to as the ‘purge breadth.’
Mass purges entail a loss in terms of human capital and organizational knowledge, as well as the cost of potentially deporting party members or delays in finding suitable replacements for purged party members. This cost is captured by the cost function c(κ) with c(0) = 0 and (for ease of exposition) marginal cost $c'(\kappa) = c_0 + c_1 \kappa$, $c_0 \geq 0$, $c_1 > 0$. When a party member is purged, a new member replaces him (e.g., from the pool of candidates). The proportion of ideologues among the replacement pool is $r_i$.

[6] As described by Weinberg (1993, 23) in his case study of mass purges in Birobidzhan, leaders do not have “detailed list of individuals to be purged.”
[7] We use the term opportunist for simplicity to encompass members attracted to the party for the benefits attached to it (Getty, 1987, 32-33), members who lacked “a wholehearted commitment to the Party’s cause” (Teiwes, 1993, 114-115), and members not in line with the current Party’s policy (see the quote from Stalin in Gregory et al., 2011, 36).
[8] Alternatively, we could assume that effort is translated into project success via some concave function.
Being purged has two distinct consequences for a party member. First, the party member is expelled from the party and cannot enjoy the benefits associated with party membership in the second period. Second, he suffers a direct loss L, which corresponds to the “intensity of violence” of the purge. The loss L can be relatively low if a party member is only fined, or very large if a party member is killed, his or her spouse deported, and their children sent to an orphanage, as was commonplace in Stalin’s USSR (Brzezinski, 1956, 110).[9] The autocrat determines the intensity of violence at the beginning of the game (e.g., investment in the security apparatus) at a cost ζ(L) with ζ(0) = 0 and marginal cost $\zeta'(L) = \zeta_0 + \zeta_1 L$, $\zeta_0 \geq 0$ and $\zeta_1 > 0$.
In period 1, a party member enjoys a benefit R ≥ 0, which captures all the special privileges accorded to party members. In addition, if he is not purged from the party ($k^m = 0$), an ideologue obtains b > 0 when his project is successful, whereas an opportunist gets 0 regardless of the outcome of his project. When purged ($k^m = 1$), a party member suffers the loss L > 0.[10] Party member m’s first-period payoff thus assumes the following form:

$$u_1^m(e; \tau) = R + (1 - k^m) \begin{cases} \mathbb{I}_{\{\tau = i\}}\, b & \text{if project succeeds} \\ 0 & \text{otherwise} \end{cases} + k^m(-L) - \frac{e^2}{2} \tag{1}$$
In period 2, if m survives the purge, given that there is no subsequent purge, his payoff can be expressed as the sum of the party membership benefit (R) and the net gain from a successful project:

$$u_2^m(e; \tau) = R + \begin{cases} \mathbb{I}_{\{\tau = i\}}\, b & \text{if project succeeds} \\ 0 & \text{otherwise} \end{cases} - \frac{e^2}{2} \tag{2}$$
To simplify the exposition, we assume throughout that a party member does not discount the
future.
The autocrat gets a positive payoff, normalized to 1, when a party member’s project is successful, and 0 otherwise. The autocrat thus wants to maximize the proportion of successful projects, which is equal to party members’ average effort in each period. In the first period, the autocrat also bears the cost of investing in the intensity of violence and the cost of purging. Denoting by $\bar{e}_t$ the average effort in period t ∈ {1, 2}, we can express the autocrat’s first-period and second-period payoffs as, respectively:

$$u_1^A(\kappa, L) = \bar{e}_1 - c(\kappa) - \zeta(L) \tag{3}$$

$$u_2^A = \bar{e}_2 \tag{4}$$

The autocrat has a discount factor of β ∈ (0, 1), which captures, among other things, the risk (perceived or real) of losing power between the two periods.

[9] For example, the law of June 1934 in the USSR established that all family members are legally responsible for the illegal acts of one of them (Wolton, 2015, 271).
[10] All our results hold if an ideologue enjoys the payoff from a successful project even after being purged.
To summarize, the timing of the game is:
Period 1:
1. Autocrat chooses the intensity of violence L ≥ 0;
2. Member m chooses effort $e_1^m$;
3. Project outcome (success/failure) is determined and observed by autocrat. Autocrat chooses
the purge breadth κ;
4. Purged members are replaced by new party members, and first-period payoffs are realized;
Period 2:
1. (Surviving and new) member m chooses effort $e_2^m$;
2. Project outcome is determined;
3. Game ends and second-period payoffs are realized.
Note that the assumption that the autocrat commits to an intensity of violence at the beginning of the game is not innocuous. Without commitment, at the moment of the purge, the autocrat would either choose no violence (if violence is costly) or the highest feasible intensity (if it is costless). This is because once effort choices are made, violence has no effect on selection (and, clearly, none on effort). This assumption, however, has some historical grounding. Funds for the Great Terror were earmarked before its launch (Wolton, 2015, 317; unfortunately, there is no similar historical evidence for mass purges). Observe further that the autocrat prefers to commit whenever her preferred intensity of violence is positive.
The equilibrium concept is Perfect Bayesian Equilibrium (PBE), which implies that each party member correctly anticipates the autocrat’s purging decision and other members’ effort when choosing his own effort, and, in turn, the autocrat correctly anticipates the level of effort by each type when determining her investment in violence and purging strategy. For simplicity, we impose that agents are anonymous, so all agents with a successful (failed) project face the same probability of being purged. Finally, to deal with measurability issues, we assume that when the autocrat observes an out-of-equilibrium event, she treats the deviation as a mistake and does not distinguish between the party member who deviated and the rest of the party, which followed the prescribed strategy.[11] If, after these restrictions, multiple PBE arise, we select the one that maximizes the autocrat’s ex-ante expected welfare (henceforth autocrat welfare). In what follows, “equilibrium” refers to this class of equilibria.
Throughout, we use the following notation. v(τ) corresponds to a party member’s flow payoff if his project is successful; i.e., v(i) = b and v(o) = 0. $V_2(\tau)$ denotes party member m’s expected payoff from being in the party in period 2 as a function of his type. Simple algebra yields $V_2(i) = R + b^2/2$ and $V_2(o) = R$. The (ex-ante) average payoffs are denoted by $\bar{v} = \lambda v(i) + (1 - \lambda) v(o)$ and $\bar{V}_2 = \lambda V_2(i) + (1 - \lambda) V_2(o)$. For the autocrat, denote by $W_2(\tau)$ her second-period expected payoff induced by a type τ ∈ {i, o}. It can be checked that $W_2(i) = b$ and $W_2(o) = 0$. The gain from replacing an opportunist by an ideologue is $D_{i,o} := W_2(i) - W_2(o)$.
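As a quick numerical check of this notation, the sketch below recomputes the second-period values from the primitives. It assumes, as the payoff in (2) implies, that a surviving member of type τ maximizes e·v(τ) − e²/2 and therefore chooses e = v(τ). The function names are hypothetical, and the values of b, R, and λ are illustrative choices borrowed from the Figure 1 parameterization.

```python
from fractions import Fraction

def second_period_effort(v):
    """Effort maximizing e*v - e**2/2 on [0, 1]; the first-order condition gives e = v."""
    return min(max(v, 0), 1)

def second_period_values(b, R, lam):
    """Return flow payoffs, member values V2, autocrat values W2, and the averages."""
    v = {"i": b, "o": Fraction(0)}              # flow payoff from success, by type
    e2 = {t: second_period_effort(v[t]) for t in v}
    # Member's value: R + (success probability) * v - effort cost.
    V2 = {t: R + e2[t] * v[t] - e2[t] ** 2 / 2 for t in v}
    # The autocrat earns 1 per successful project, so W2 equals the success probability.
    W2 = {t: e2[t] for t in v}
    v_bar = lam * v["i"] + (1 - lam) * v["o"]
    V2_bar = lam * V2["i"] + (1 - lam) * V2["o"]
    D_io = W2["i"] - W2["o"]
    return v, V2, W2, v_bar, V2_bar, D_io

# Illustrative values (assumption): b = 1/4, R = 0, lambda = 1/3.
b, R, lam = Fraction(1, 4), Fraction(0), Fraction(1, 3)
v, V2, W2, v_bar, V2_bar, D_io = second_period_values(b, R, lam)
print(V2["i"], V2["o"], W2["i"], D_io, v_bar, V2_bar)
```

Exact fractions are used only so that the identities V2(i) = R + b²/2 and W2(i) = b can be compared without rounding error.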
We also impose the following restrictions on parameter values:
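$$\beta r_i D_{i,o} < c_0 + c_1 \leq \beta r_i D_{i,o} + c_1 \frac{1 + \bar{v}}{2} - \beta \lambda \, \frac{\frac{1 - v(i)}{2} - \frac{v(i) - \bar{v}}{2} - \left(V_2(i) - \bar{V}_2\right)}{\frac{1 - \bar{v}}{2}} \, D_{i,o} \tag{5}$$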
The left-hand side states that it is never optimal for the autocrat to purge the whole party even if there is no ideologue among current party members (e.g., due to the risk of popular rebellion if the party work is too disrupted). The right-hand side is a technical condition meant to limit the number of cases to be considered. All results continue to hold in substance when this inequality is relaxed. Finally, we assume that the highest feasible intensity of violence, denoted $\bar{L}$, satisfies $\bar{L} := 1 - v(i) - V_2(i)$.[12]
The analysis proceeds in three steps. First, we consider the optimal purge breadth for exogenous
levels of violence and uncover the ‘pool size’ and ‘pool make-up’ effects. Then, we examine the
conflicting effects of violence on effort and selection, which we term the love-fear trade-off. Finally,
we characterize the optimal solution to the love-fear dilemma and how it is affected by underlying
fundamentals.
[11] Alternatively, we could define two subsets of party members $\xi_0$ and $\xi_1$ who always exert (respectively) effort 0 and 1, implying that there is no out-of-equilibrium event of measure 0.
[12] This assumption guarantees that $v(i) + V_2(i) + L \leq 1$, so there is no corner solution in effort (which only complicates the analysis). A party member cannot work all the time for a long period of time, so his effort is naturally bounded above by the number of hours available in a day.
4 Purge Breadth
Due to our equilibrium refinement, there is no equilibrium in which agents exert zero effort.[13] When they exert effort, party members endogenously sort into pools of failure and success, which constitute, with the proportion of ideologues among existing party members (λ) and potential replacements ($r_i$), the only information available to the autocrat at the time of her purging decision. However, given that ideologues receive an intrinsic benefit from a successful project, they always have more incentive to exert effort than opportunists and are more likely to belong to the success pool (i.e., the single-crossing condition holds). Consequently, the autocrat always first targets unsuccessful party members.
When choosing their effort, party members take into account both the intrinsic value of success and the risk of being purged after failure or success. As success on a project provides full or partial inoculation from a purge, all party members have an incentive to exert effort in order to survive the purge. The benefit from a successful project, however, depends on the relative probability of being purged when in the success and failure pools, which we refer to as the (success/failure) pool incidences. These pool incidences can take three qualitatively distinct forms, which determine the nature of the purge. When only a portion of the failure pool is purged, we say that the purge is “partially discriminate.” When the entire failure pool is purged, we label the purge “fully discriminate.” Finally, when even some successful members are purged, we use the qualifier “semi-indiscriminate.” In turn, the purge incidence faced by a member m is a function of two factors: the breadth of the purge and the efforts of other party members.
Suppose that the purge breadth and all other efforts were exogenous. How would a party
member respond to either quantity? A party member’s effort depends critically on the nature of
the purge. In a partially or fully discriminate purge, the payoff from belonging to the success
pool is large: a successful party member obtains his flow payoff and is inoculated against the
purge. As such, a party member exerts high effort when anticipating a discriminate purge. In a
semi-indiscriminate purge, the benefit from belonging to the success pool is relatively low. Even if he is successful, a party member risks having his effort wasted, as he may be purged anyway.

[13] Absent our restrictions, there would exist an equilibrium in which all party members exert 0 effort and the autocrat would purge a successful party member with probability 1. This equilibrium would only be sustained by the arguably unreasonable out-of-equilibrium belief that a successful member is likely to be an opportunist even though only ideologues are intrinsically motivated to exert effort.
In turn, holding the purge breadth constant, a change in other party members’ level of activity affects a party member’s incentive to exert effort by altering the relative size of the success and failure pools. This is the pool size effect, which we say is positive when a member’s effort increases with other members’ level of activity and negative otherwise. In a discriminate purge, higher activism by other members reduces the size of the failure pool, which increases the failure pool incidence and therefore encourages effort. Thus, in a discriminate purge, efforts by party members are strategic complements and the pool size effect is positive. In contrast, in a semi-indiscriminate purge, efforts are strategic substitutes and the pool size effect is negative. An increased level of activity by other members depresses a member’s effort because the benefit of being in the success pool is reduced as the failure pool becomes thinner and the success pool incidence increases. Finally, notice that the nature of the purge depends on all party members’ effort. As the failure pool becomes too thin relative to the breadth of the purge (i.e., $1 - \bar{e}_1 < \kappa$), a discriminate purge becomes semi-indiscriminate.
Lemma 1 summarizes the reasoning above, noting that in a discriminate purge, a party member considers the flow payoff from a successful project ($v(\tau)$) as well as the expected loss from being purged ($\frac{\kappa}{1 - \bar{e}_1}(V_2(\tau) + L)$), whereas in a semi-indiscriminate purge, he takes into account the benefit from success ($v(\tau) + V_2(\tau) + L$) weighted by the probability of surviving the purge ($1 - \frac{\kappa - (1 - \bar{e}_1)}{\bar{e}_1}$).
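Lemma 1. A type τ ∈ {i, o} party member m chooses effort:

$$e_1^m(\tau) = \begin{cases} v(\tau) + \dfrac{\kappa}{1 - \bar{e}_1}\left(V_2(\tau) + L\right) & \text{if } 1 - \bar{e}_1 \geq \kappa \\[1ex] \left(1 - \dfrac{\kappa - (1 - \bar{e}_1)}{\bar{e}_1}\right)\left(v(\tau) + V_2(\tau) + L\right) & \text{if } 1 - \bar{e}_1 < \kappa \end{cases} \tag{6}$$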
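To make the two branches of (6) concrete, here is a minimal sketch of a single member's best response, taking the anticipated purge breadth and the anticipated average effort as given. The function name, argument names, and the numbers in the usage lines are hypothetical. The sketch evaluates the lemma as stated and does not solve for the equilibrium average effort.

```python
def lemma1_effort(v_tau, V2_tau, L, kappa, e_bar):
    """First-period effort of a type with flow payoff v_tau and continuation value V2_tau.

    kappa is the anticipated purge breadth and e_bar the anticipated average effort,
    both taken as given by an individual member (assumes 0 < e_bar < 1).
    """
    failure_pool = 1 - e_bar
    if failure_pool >= kappa:
        # Discriminate purge: only failures are purged, each with
        # probability kappa / (1 - e_bar).
        failure_incidence = kappa / failure_pool
        return v_tau + failure_incidence * (V2_tau + L)
    # Semi-indiscriminate purge: every failure is purged, plus a share of successes.
    success_incidence = (kappa - failure_pool) / e_bar
    return (1 - success_incidence) * (v_tau + V2_tau + L)

# Hypothetical inputs: an ideologue with b = 0.25 and R = 0, so V2(i) = b**2 / 2.
b, L, e_bar = 0.25, 0.2, 0.5
print(lemma1_effort(b, b**2 / 2, L, kappa=0.1, e_bar=e_bar))  # discriminate branch
print(lemma1_effort(b, b**2 / 2, L, kappa=0.6, e_bar=e_bar))  # semi-indiscriminate branch
```

The two branches coincide when the purge breadth exactly equals the size of the failure pool, which is a useful sanity check on any implementation.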
As explained above, in a discriminate purge, party members’ efforts are strategic complements, whereas they are strategic substitutes in a semi-indiscriminate purge. These effects, however, are secondary compared to the direct impact of the purge breadth. Consequently, the nature of the purge (discriminate or semi-indiscriminate) is determined solely by the purge breadth and the intensity of violence, and is thus fully in the autocrat’s hands.
Lemma 2. A purge is semi-indiscriminate if and only if $\kappa > 1 - \bar{v} - \bar{V}_2 - L =: \bar{\kappa}(L)$.
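One way to see where this threshold comes from: averaging the discriminate branch of (6) over types gives an average effort equal to the average flow payoff plus κ/(1 − average effort) times (average continuation value + L). At the boundary, where the failure pool has exactly size κ, average effort is therefore the sum of the average flow payoff, the average continuation value, and L. The sketch below checks this numerically. The helper names and the value L = 0.2 are assumptions made for illustration.

```python
def kappa_bar(v_bar, V2_bar, L):
    """Lemma 2 threshold: the purge is semi-indiscriminate iff kappa exceeds this value."""
    return 1 - v_bar - V2_bar - L

def is_semi_indiscriminate(kappa, v_bar, V2_bar, L):
    """True when the purge breadth kappa lies strictly above the threshold."""
    return kappa > kappa_bar(v_bar, V2_bar, L)

# Illustrative values (assumption): lambda = 1/3, b = 1/4, R = 0, and L = 0.2.
lam, b, R, L = 1 / 3, 0.25, 0.0, 0.2
v_bar = lam * b
V2_bar = lam * (R + b**2 / 2) + (1 - lam) * R
k = kappa_bar(v_bar, V2_bar, L)

# At kappa = kappa_bar the whole failure pool is purged, so average effort is
# v_bar + V2_bar + L and the failure pool 1 - e_bar has exactly size kappa_bar.
e_bar = v_bar + V2_bar + L
print(k, 1 - e_bar)
```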
When deciding whom to purge, the autocrat observes only the outcome of a party member’s project, and forms a posterior that a party member is an ideologue based on success (denoted $\mu_S(e_1)$) or failure ($\mu_F(e_1)$). The autocrat’s posteriors incorporate her conjectures (correct in equilibrium) of the different levels of effort exerted by ideologues and opportunists. Due to ideologues’ intrinsic motivation, a successful project is a positive signal of ideological alignment, so $\mu_F(e_1) < \lambda < \mu_S(e_1)$. However, this signal is never perfect: ideologues sometimes fail and opportunists sometimes succeed. Consequently, in any purge, some ideologues are purged and some opportunists survive.
Consider the autocrat’s incentives to purge a member after observing that his project is unsuccessful. Recall that $W_2(\tau)$ is the autocrat’s second-period expected payoff induced by a type τ ∈ {i, o} party member. If the autocrat retains the party member after failure, her expected payoff is $\mu_F(e_1) W_2(i) + (1 - \mu_F(e_1)) W_2(o)$. Since the proportion of ideologues in the replacement pool is $r_i$, the autocrat’s payoff from purging an unsuccessful party member is $r_i W_2(i) + (1 - r_i) W_2(o)$. The autocrat’s expected benefit from purging a member after failure is thus:
$$W_F = [r_i - \mu_F(e_1)]\, D_{i,o} \tag{7}$$

By a similar reasoning, the expected benefit from purging a successful party member is:

$$W_S = [r_i - \mu_S(e_1)]\, D_{i,o} \tag{8}$$
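The posteriors and purge benefits can be illustrated with a short sketch. It applies Bayes' rule using the fact that a member's success probability equals his effort, and computes each benefit as the replacement payoff minus the retention payoff described above. The function names and the effort levels 0.6 and 0.3 are hypothetical, chosen only so that ideologues work harder than opportunists. The remaining values follow the Figure 1 parameterization.

```python
def posteriors(lam, e_i, e_o):
    """Posteriors that a member is an ideologue after success and after failure.

    Success probability equals effort, so P(success | type) is the type's effort.
    Assumes average effort lies strictly between 0 and 1.
    """
    e_bar = lam * e_i + (1 - lam) * e_o
    mu_S = lam * e_i / e_bar
    mu_F = lam * (1 - e_i) / (1 - e_bar)
    return mu_S, mu_F

def purge_benefits(r_i, mu_S, mu_F, D_io):
    """Expected gain from replacing a failed (W_F) or a successful (W_S) member."""
    W_F = (r_i - mu_F) * D_io
    W_S = (r_i - mu_S) * D_io
    return W_F, W_S

# lambda = 1/3, r_i = 2/3, D_io = b = 1/4; effort levels are hypothetical.
lam, r_i, D_io = 1 / 3, 2 / 3, 0.25
mu_S, mu_F = posteriors(lam, e_i=0.6, e_o=0.3)
W_F, W_S = purge_benefits(r_i, mu_S, mu_F, D_io)
print(mu_F, lam, mu_S)
print(W_F, W_S)
```

With these inputs the failure posterior falls below the prior and the success posterior rises above it, so purging a failed member is worth more to the autocrat than purging a successful one.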
The autocrat’s incentive to purge thus depends critically on the informativeness of her posteriors,
which are a function of party members’ endogenous sorting into the success and failure pools. Thus,
in addition to the pool size effect, resulting from the interdependence between party members’
efforts, there exists a pool makeup effect which captures how changes in party members’ efforts
affect the autocrat’s learning. We say that the pool makeup effect is positive when the target pool
becomes more tainted and the autocrat’s incentive to purge increases. As party members’ effort
depends on the purge breadth, the pool makeup effect is a function of κ and the nature of the purge.
To make sense of it, we first examine how an exogenous increase in the purge breadth affects the
autocrat’s relevant posteriors.
Anticipating a discriminate purge (relatively low κ), both ideologues and opportunists increase
their effort and exit the failure pool in response to higher purge threat. Ideologues, however, have
more to lose from being purged and so increase their effort relatively more. Since ideologues are
also less likely to fail to start with, they exit the failure pool at a higher rate, and the target pool
becomes more tainted. The pool makeup effect is thus positive.
In a semi-indiscriminate purge (relatively high κ), the autocrat’s benefit from purging a party member is determined at the margin by the informativeness of success (see (8)). The sign of the pool makeup effect is thus a function of the differential exit rate between ideologues and opportunists from the success pool (since an increase in κ decreases effort). Ideologues are more responsive to a change in purge breadth, but (relatively) more likely to belong to the success pool to begin with. In our set-up, the two effects offset each other because party members’ cost of effort exhibits constant elasticity. Both types exit the success pool at the same rate, resulting in a null pool makeup effect.
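The null effect can be checked directly from the second branch of Lemma 1: both types scale their effort by the same survival probability, so the ratio of ideologue effort to average effort, and hence the posterior after success, does not move with the purge breadth. The sketch below treats the survival probability as a free parameter (in equilibrium it is pinned down by κ and average effort). The function name and the value L = 0.5 are assumptions for illustration.

```python
def mu_success_semi(lam, b, R, L, survival):
    """Posterior that a successful member is an ideologue in a semi-indiscriminate purge.

    Uses the second branch of Lemma 1: effort = survival * (v + V2 + L), where
    `survival` is the common probability that a successful member escapes the purge.
    """
    e_i = survival * (b + (R + b**2 / 2) + L)   # ideologue: v(i) = b, V2(i) = R + b^2/2
    e_o = survival * (0 + R + L)                # opportunist: v(o) = 0, V2(o) = R
    e_bar = lam * e_i + (1 - lam) * e_o
    return lam * e_i / e_bar

# Illustrative values (assumption): lambda = 1/3, b = 1/4, R = 0, L = 0.5.
for survival in (0.9, 0.7, 0.5):
    print(survival, mu_success_semi(1 / 3, 0.25, 0.0, 0.5, survival))
```

The printed posterior is the same for every survival probability, which is the sense in which both types exit the success pool at the same rate.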
In equilibrium, the purge breadth, pool size and pool makeup effects are jointly determined. However, because all players correctly anticipate each other’s strategy, we can simply compare the marginal cost of purging an additional member, given by $c'(\kappa) = c_0 + c_1 \kappa$, with the marginal benefit (determined by (7) and (8)), which incorporates the pool size and makeup effects. Recalling that $\bar{\kappa}(L) = 1 - \bar{v} - \bar{V}_2 - L$, we obtain the following lemma.
Lemma 3. There exists a unique equilibrium purge breadth $\kappa^*(L)$. Further, if at $\kappa = \bar{\kappa}(L)$,
(i) $c_0 + c_1 \kappa > W_F$, the purge is partially discriminate;
(ii) $W_S \leq c_0 + c_1 \kappa \leq W_F$, the purge is fully discriminate;
(iii) $c_0 + c_1 \kappa < W_S$, the purge is semi-indiscriminate.
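The three cases of the lemma amount to comparing the marginal cost of purging at the threshold breadth with the benefits of purging a failed and a successful member. A minimal sketch of that comparison follows. The function name is hypothetical, the benefit values passed in are made-up numbers rather than equilibrium quantities, and the sketch assumes the benefit for successes does not exceed the benefit for failures.

```python
def purge_nature(c0, c1, kappa_bar, W_F, W_S):
    """Classify the purge by comparing marginal cost at kappa_bar with W_F and W_S.

    W_F and W_S are the benefits of purging a failed and a successful member,
    evaluated at kappa = kappa_bar (assumes W_S <= W_F).
    """
    marginal_cost = c0 + c1 * kappa_bar
    if marginal_cost > W_F:
        return "partially discriminate"
    if marginal_cost < W_S:
        return "semi-indiscriminate"
    return "fully discriminate"

# Hypothetical benefit values with c0 = 0, c1 = 0.17, and kappa_bar = 0.7.
print(purge_nature(0.0, 0.17, 0.7, W_F=0.11, W_S=0.04))
print(purge_nature(0.0, 0.17, 0.7, W_F=0.15, W_S=0.05))
print(purge_nature(0.0, 0.17, 0.7, W_F=0.16, W_S=0.13))
```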
Lemma 3 indicates that, somewhat unsurprisingly, the autocrat chooses a partially discriminate purge if the cost of carrying out a fully discriminate purge is too high (point (i)). In turn, she prefers a semi-indiscriminate purge whenever the cost is sufficiently low (point (iii)). In all other cases, the purge is fully discriminate (point (ii)).[14] Notice that since the success pool has a greater proportion of ideologues (i.e., $\mu_S(e_1) > \mu_F(e_1)$), the benefit of purging a successful party member is strictly lower than the benefit of purging a failed one ($W_S < W_F$), and a fully discriminate purge is not a knife-edge result. Further, since a party member’s success is always an (imperfect) signal of ideological congruence with the autocrat, Lemma 3 implies that semi-indiscriminate purges can occur only if the replacement pool is better on average than the current party members.
Corollary 1. If $r_i > \lambda$, there exists a set of parameter values of positive measure such that the equilibrium purge is semi-indiscriminate.
[14] Due to the strategic complementarity, in a partially discriminate purge, a member’s effort is convex in the (anticipated) purge breadth. The pool makeup effect is thus increasing, and so is the marginal benefit of purging. The marginal benefit $W_F$ may thus intersect the marginal cost more than once in the range $[0, \bar{\kappa}(L)]$. Despite this complication, points (i) and (ii) hold because the autocrat always prefers the highest feasible purge breadth to take advantage of the higher marginal benefit from purging.
5 The love-fear tradeoff
In the previous analysis, we fixed the intensity of violence. Here, we examine how changes in L
affect the autocrat’s and party members’ strategies and reserve the analysis of the optimal intensity
of violence to the next section.
We first consider the effect of greater intensity of violence on the performance of the party in the first period. We term this the fear effect, which is positive whenever an increase in L increases average effort. The next proposition establishes that the fear effect is always positive. As the cost of being purged increases, all party members have greater incentives to exert effort.

Fear, however, motivates ideologues and opportunists differently, since any change in efforts triggers the pool size and makeup effects identified above and thus affects the purge breadth. While the cost of being purged is similar for all types, the indirect pool effects on the probability that a member survives are weighted by a member’s second-period payoff, which is greater for ideologues. Consequently, the strength of the fear effect depends on the nature of the purge.
In a discriminate purge, the greater level of activity caused by higher L generates a positive
pool size effect. As ideologues are more responsive to the pool size effect, they exit the failure
pool faster than opportunists producing a positive make-up effect. The autocrat then purges more
failures. Since greater purge breadth is also conducive to more effort, all effects go in the same
direction. Equilibrium effort by all party members increases at a (relatively) high rate with the
intensity of violence.
When the purge is semi-indiscriminate, the pool size effect is negative (i.e., party members’ efforts are strategic substitutes), so greater activism depresses the incentives to exert effort, especially for ideologues. Consequently, ideologues increase their effort less than opportunists, and the success pool becomes more tainted. The pool makeup effect is thus positive, and the purge breadth increases, further reducing the benefit of effort. The equilibrium effort thus increases at a (relatively) low rate with the intensity of violence.[15]

The fear effect is summarized in Proposition 1 and illustrated in Figure 2a below.

[15] Importantly, the indirect pool effects only reduce the positive impact of greater intensity of violence on effort in a semi-indiscriminate purge. To see this, suppose to the contrary that the fear effect is negative, so average effort decreases with L. Since the pool size effect is negative, members have greater incentive to exert effort, especially ideologues. This would lead to a negative pool makeup effect (the autocrat has weaker incentives to purge the success pool), reducing the proportion of successful members being purged and increasing the value of success. Consequently, all members would increase their effort, contradicting the assumption of decreased average effort.
Proposition 1. The first-period equilibrium average level of effort increases with the intensity of violence. Average effort increases at a faster rate in a fully discriminate purge than in a semi-indiscriminate purge.
Our next result establishes that the nature of the purge, but not its breadth, is determined by the intensity of violence. In a partially discriminate purge, as L increases, the failure pool becomes more tainted and this positive pool makeup effect leads to an increase in purge breadth. But greater breadth increases effort and reduces the failure pool. As L continues to increase, the failure pool becomes so thin that all failures are purged: the purge becomes fully discriminate for intensities of violence above some $L_{full}$. In a fully discriminate purge, although the failure pool becomes more tainted, the pool makeup effect has no effect on the breadth, as the entire pool is already being purged. The only effect remaining is the fear effect, which decreases the size of the failure pool and thus the purge breadth.
As violence increases further, two effects lead a purge to move from fully discriminate to semi-indiscriminate. First, as relatively more opportunists join the success pool, the latter becomes more tainted and purging successful party members is more attractive for the autocrat. Second, as the failure pool shrinks and the breadth of the purge declines, the (marginal) cost of purging successful members decreases. Once the purge becomes semi-indiscriminate (above some intensity $L_{ind}$), the positive pool makeup effect induced by greater intensity of violence again implies an increase in purge breadth. Overall, the purge breadth is non-monotonic in violence because of the coarseness of the information available to the autocrat.[16]

Proposition 2 summarizes these findings and Figure 1 illustrates them.
Proposition 2. There exist unique $L_{full} < \bar{L}$ and $L_{ind} \in (L_{full}, \bar{L}]$ such that:
(i) For $L < L_{full}$, the purge is partially discriminate and breadth strictly increases with violence;
(ii) For $L \in [L_{full}, L_{ind}]$, the purge is fully discriminate and breadth strictly decreases with violence;
(iii) For $L > L_{ind}$, the purge is semi-indiscriminate and breadth strictly increases with violence.
Our theory thus predicts that violent purges ($L > L_{ind}$) are semi-indiscriminate, whereas (relatively) mild purges are discriminate. This prediction seems in line with historical evidence. As described by Teiwes (1993, 25-27), Chinese rectification campaigns were characterized by low intensity of violence and by a high level of predictability, and so resemble discriminate purges. In contrast, during the purges of the thirties in the USSR, the intensity of violence was high and the target of the purges less delimited, as “flouting commands court danger, but even enthusiastic compliance is no guarantee of safety” (ibid., 25). Stalinist purges are thus examples of semi-indiscriminate purges (we provide some reasons why it may have been so below).

Figure 1: Equilibrium purge breadth and intensity of violence
Parameter values: $\lambda = 1/3$, $r_i = 2/3$, $R = 0$, $b = 1/4$, $\beta = 0.9$, $c_0 = 0$, $c_1 = 0.17$.

[16] This non-monotonicity result would hold in any setting in which the autocrat’s information is discrete (but not necessarily binary).
The intensity of violence, however, does not determine the purge breadth, due to the non-monotonic relationship uncovered in Proposition 2. This again appears to correspond to historical facts. Despite the differences in intensity of violence and nature, Soviet and Chinese purges had similar breadth. During the 1930s Stalinist purges, the proportion of purged members varied from 5% in 1930, 1931, and 1937 to 22% in 1933-34 (see Table 1 in Appendix A for more details). In turn, the expulsion rate in rectification campaigns in China fluctuated between 9% in 1957-58 and 23% in 1947-48 (see Table 2).[17]
Having examined the effect of violence on effort and the nature and breadth of the purge, we now consider its impact on selection. Observe that since a new party member is always better on average (from the autocrat’s perspective) than a purged party member, the second-period ideological unity (defined by the proportion of ideologues) of the party is always greater following the purge. An increase in the intensity of violence, however, affects the selection benefit for the autocrat. This is what we refer to as the love effect, which is positive (resp. negative) if greater violence improves (worsens) selection. The next proposition establishes conditions under which the love effect is negative for the survivors of the purges and for all party members. Figure 2b illustrates this result.

[17] Given the differences in size and population, our preferred measure is the proportion of party members purged. The number of expelled members was always much larger in China.
Proposition 3.
(i) The proportion of ideologues among surviving members of the purge strictly increases with L if
and only if L < Lfull, and strictly decreases otherwise.
(ii) The proportion of ideologues in the party in the second period weakly increases with L for all
L if and only if λ ≥ ri.
(iii) If ri ∈ (λ, 2λ], the proportion of ideologues in the party in the second period strictly increases
with L for L < Lfull and strictly decreases otherwise.
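The case distinction in parts (ii) and (iii) can be sketched as a small lookup. This is only an illustration of the statement above: the function name, the returned labels, and the values of L and Lfull used in the calls are hypothetical, and the case ri > 2λ is reported as uncharacterized because the proposition is silent on it.

```python
from fractions import Fraction

def love_effect_on_membership(lam, r_i, L, L_full):
    """Direction in which the second-period share of ideologues moves with L.

    Encodes Proposition 3(ii)-(iii) only; names and labels are illustrative.
    lam: share of ideologues among existing members (lambda)
    r_i: share of ideologues in the replacement pool
    """
    if lam >= r_i:
        # Part (ii): the replacement pool is no better than existing members.
        return "weakly increasing"
    if r_i <= 2 * lam:
        # Part (iii): r_i lies in (lam, 2*lam].
        return "strictly increasing" if L < L_full else "strictly decreasing"
    # r_i > 2*lam: not covered by the proposition.
    return "not characterized"

# Parameter values reported for Figure 2: lambda = 1/3, r_i = 2/3 (so r_i = 2*lambda).
# The values of L and L_full below are hypothetical.
lam, r_i = Fraction(1, 3), Fraction(2, 3)
print(love_effect_on_membership(lam, r_i, L=0.5, L_full=1.0))
print(love_effect_on_membership(lam, r_i, L=1.5, L_full=1.0))
```

With the parameter values reported for Figure 2, the replacement pool sits exactly at the upper bound ri = 2λ of part (iii).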
The first part of the proposition highlights that the autocrat’s ability to screen party members
decreases with L unless the purge is partially discriminate. For L < Lfull, as the intensity of
violence increases, more party members exit the failure pool and more failures are purged. Among
surviving members, a greater proportion of party members thus belongs to the success pool, which
is of higher quality than the failure pool. The love effect is then positive. When the purge is fully
discriminate or semi-indiscriminate, surviving members all belong to the success pool. Screening
then worsens because the success pool becomes more tainted as more opportunists enter the pool
relative to the stocks of both types. The love effect is then necessarily negative.
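The tainting of the success pool can be illustrated with Bayes' rule. The sketch below assumes that each member succeeds with probability equal to his effort, so that the average success rate is λ e1(i) + (1 − λ) e1(o). The failure-pool expression follows the posterior reported in footnote 19; the success-pool expression is the analogous assumption, and the effort values are hypothetical rather than equilibrium quantities.

```python
def pool_posteriors(lam, e_i, e_o):
    """Share of ideologues in the success and failure pools (Bayes' rule).

    Assumes success probability equals effort: e_i for ideologues,
    e_o for opportunists; lam is the prior share of ideologues.
    """
    e_bar = lam * e_i + (1 - lam) * e_o          # average success rate
    mu_success = lam * e_i / e_bar               # P(ideologue | success)
    mu_failure = lam * (1 - e_i) / (1 - e_bar)   # P(ideologue | failure)
    return mu_success, mu_failure

lam = 1 / 3
low = pool_posteriors(lam, e_i=0.8, e_o=0.4)   # opportunists exert little effort
high = pool_posteriors(lam, e_i=0.8, e_o=0.6)  # more opportunists succeed
print(low, high)
# The success pool holds a smaller share of ideologues once opportunists work harder.
assert high[0] < low[0]
```

Holding ideologues' effort fixed, raising opportunists' effort lowers the share of ideologues among successes, which is the sense in which the success pool becomes more tainted.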
Even though purged party members are replaced by (in expectation) more ideological members,
the love effect can still be negative when it comes to (second-period) party membership. Perhaps
surprisingly, the love effect is always negative for high enough intensity of violence (L ≥ Lfull) when
the replacement pool is better than the existing pool of party members (Proposition 3(ii)). This
occurs because in a fully discriminate purge, a lower proportion of party members is replaced
by better new party members (Proposition 2(ii)). A decreasing purge breadth, however, is not
necessary to produce a negative love effect for second-period party membership. Indeed, whenever
the difference between the replacement pool and existing party members is not too large (ri ≤ 2λ, a
sufficient condition), the deterioration of the pool of survivors discussed above dominates increased
replacement, and the love effect is again negative.
This section therefore establishes that under certain circumstances (such as a better replace-
ment pool), the autocrat faces a trade-off between fear (better first-period performance) and love
(lower second-period ideological unity), not too dissimilar from the dilemma identified long ago
by Machiavelli (2005, Chapter 17). While there is little historical evidence on the effect of purges
on selection, the trade-off we identify might explain why Stalin, who understood that “saboteurs
disguise themselves by over-fulfilling the plan” (cited in Dallin and Breslauer, 1970, 57), allegedly
asserted that it is better to induce loyalty by fear than by conviction.18
Figure 2: (a) Fear effect; (b) Love effect.
In Figure 2b, the solid line corresponds to the proportion of ideologues among all party members in period 2, the dashed
line to the proportion of ideologues among survivors of the purge. Parameter values: λ = 1/3, ri = 2/3, R = 0,
b = 1/4, β = 0.9, c0 = 0, c1 = 0.17.
6 Intensity of violence
In this section, we consider how the autocrat’s choice of violence balances the (positive) fear and
(potentially negative) love effects. When L is low (L < Lfull), the love effect is positive. Hence,
the autocrat has a strong incentive to increase the intensity of violence. A purge, therefore, will be
partially discriminate only if the cost of investing in the security apparatus is very large. When L is
large (L > Lind), the love effect is negative. Further, the fear effect is relatively small (Proposition
1). A purge is thus semi-indiscriminate only if the cost of investing in the security apparatus is
low.19
18 Dallin and Breslauer (1970, 42, footnote 37) reproduce an anecdote circulating among Moscow party members
in 1931. “Yagoda was alleged to have asked Stalin: ‘Which would you prefer Comrade Stalin: that party members
should be loyal to you from conviction or from fear?’ Stalin is alleged to have replied: ‘From fear.’ Whereupon
Yagoda asked, ‘Why?’ To which Stalin replied: ‘Because convictions can change: fear remains.’”
19 At L = Lfull, by definition $\kappa(L) = 1 - e_1$. Effort thus satisfies $e_1^m(\tau) = v(\tau) + V_2(\tau) + L$ (see Lemma 1). The
posterior $\mu^F = \lambda \frac{1 - e_1^m(i)}{1 - e_1}$ thus depends only on model parameters and L. For L > Lfull, the posterior $\mu^S$ does not
depend on κ (see the discussion prior to Lemma 3). Hence, the quantities used in the text of Proposition 4 only
depend on the intensity of violence and model parameters.
Proposition 4. There exists a (almost always) unique equilibrium intensity of violence L∗. Further,
(i) If at L = Lfull, $\zeta_0 + \zeta_1 L \geq 1 + \beta D_{i,o}(\lambda - \mu^F(e_1))$, then L∗ ≤ Lfull;
(ii) If at L = Lind, $\zeta_0 + \zeta_1 L < \frac{1}{2}\left(1 + \frac{\beta D_{i,o}(\lambda - \mu^S(e_1))}{c_1(v + V_2 + L)}\right) + \beta D_{i,o}(\lambda - \mu^S(e_1))$, then L∗ > Lind.
For low L, point (i) establishes formally that the fear effect (equal to 1) and the love effect
(equal to $\beta D_{i,o}(\lambda - \mu^F)$) are both positive. In contrast, for high intensity of violence (point (ii)),
the fear effect, while still positive, is relatively low ($\frac{1}{2}\left(1 + \frac{\beta D_{i,o}(\lambda - \mu^S)}{c_1(v + V_2 + L)}\right)$) and the love effect is
negative ($\beta D_{i,o}(\lambda - \mu^S) < 0$).
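The two sufficient conditions can be checked numerically once the equilibrium objects are known. The sketch below simply evaluates both inequalities of Proposition 4. It treats the posteriors µF and µS, the term Di,o, and the payoffs v and V2 as given numbers, which is an assumption (in the model they are equilibrium quantities), and all numerical inputs are hypothetical.

```python
def marginal_cost(zeta0, zeta1, L):
    # Left-hand side of both conditions: zeta0 + zeta1 * L.
    return zeta0 + zeta1 * L

def condition_i(zeta0, zeta1, L_full, beta, D, lam, mu_F):
    """Proposition 4(i), evaluated at L = L_full: implies L* <= L_full."""
    fear = 1.0
    love = beta * D * (lam - mu_F)
    return marginal_cost(zeta0, zeta1, L_full) >= fear + love

def condition_ii(zeta0, zeta1, L_ind, beta, D, lam, mu_S, c1, v, V2):
    """Proposition 4(ii), evaluated at L = L_ind: implies L* > L_ind."""
    love = beta * D * (lam - mu_S)
    fear = 0.5 * (1.0 + love / (c1 * (v + V2 + L_ind)))
    return marginal_cost(zeta0, zeta1, L_ind) < fear + love

# Hypothetical inputs, chosen only to exercise both conditions.
print(condition_i(zeta0=0.5, zeta1=1.0, L_full=1.0, beta=0.9, D=1.0,
                  lam=1/3, mu_F=0.2))
print(condition_ii(zeta0=0.05, zeta1=0.05, L_ind=2.0, beta=0.9, D=1.0,
                   lam=1/3, mu_S=0.5, c1=0.17, v=0.5, V2=0.5))
```

In both functions the right-hand side is the sum of the fear and love terms discussed in the text, so each condition compares the marginal cost of violence with its marginal benefit at the relevant threshold.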
Given the trade-off between love and fear for high intensity of violence, it remains to determine
under which conditions a semi-indiscriminate purge occurs. The next corollary lists three necessary
and sufficient conditions. As explained above, the replacement pool must be of sufficiently good
quality, and in particular better on average than existing party members (condition 1). Further,
the cost of purging (c0, c1) must be sufficiently low to compensate for the relatively low marginal
benefit of purging a successful party member (condition 2). Finally, in line with Proposition 4,
the cost of investing in the security apparatus must also be relatively small (condition 3).20
Corollary 2. A purge is semi-indiscriminate if and only if:
1. The proportion of ideologues in the replacement pool ri is strictly higher than some threshold $\bar{r}_i \geq \lambda$;
2. The cost parameters c0 and c1 are respectively strictly below some thresholds $\bar{c}_0(r_i)$ and $\bar{c}_1(r_i, c_0)$;
3. The cost parameters ζ0 and ζ1 are respectively strictly below some thresholds $\bar{\zeta}_0(r_i)$ and $\bar{\zeta}_1(r_i, \zeta_0)$.
As noted above, the Stalinist purges of the thirties resembled semi-indiscriminate purges. Historical
evidence does not permit us to evaluate whether the conditions of the corollary were satisfied.
Interestingly, however, it strongly suggests that the pool of candidates to the CPSU was markedly
different from existing party members. While the 1920s were marked by a divide between ideologically
committed and technically proficient officials, starting in 1928, Stalin took great interest
in “training a new generation of cadres that would be both Red and experts” (Fitzpatrick, 1979,
382). Around 170,000 of these new cadres graduated between 1928 and 1932 and 370,000 between 1933
and 1938 (ibid., 398).21 These cadres were among the main beneficiaries of the purges of the 1930s.
While their relative productivity compared to old cadres is still debated (e.g., Dallin and Breslauer,
1970, 37), there is clear evidence that the new cadres were more loyal to Stalin (Wolton, 2015,
267-68).22
20 The thresholds described in Corollary 2 depend on all (other) parameter values. We have reduced notation to
facilitate the exposition.
21 Brzezinski (1956, 90-91) puts the total number of engineers emerging from the Stalinist state schools between
1933 and 1938 at 1 million.
Observe that Proposition 4 does not characterize the intensity of violence when conditions (i)
and (ii) do not hold. Due to the positive fear and love effects as well as the complementarity in
party members’ effort in a partially discriminate purge, the marginal benefit of violence is strictly
convex for L ≤ Lfull. This means that the marginal benefit may intersect the marginal cost more
than once, and the autocrat must choose between the lowest intersection and L = Lfull.23 This
implies that predicting the optimal intensity of violence is difficult even though it is generically
unique for the autocrat. Further, small changes in the underlying fundamentals can be associated
with a large increase in the intensity of violence.
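The possibility of a jump can be seen in a toy example. The sketch below is not the model of this paper: it assumes Lfull = 1, a convex marginal benefit a + L², and a linear marginal cost ζ0 + ζ1L, all hypothetical, and finds the payoff-maximizing L on a grid. It only illustrates that, with a convex marginal benefit, the optimum can switch between a low interior intersection and the corner L = Lfull after a small parameter change; it is not meant to reproduce the comparative statics of the formal results.

```python
def net_payoff(L, a, zeta0, zeta1):
    # Integral from 0 to L of (a + x**2) - (zeta0 + zeta1 * x).
    return (a - zeta0) * L + L**3 / 3 - zeta1 * L**2 / 2

def optimal_violence(a, zeta0, zeta1, L_full=1.0, steps=10_000):
    """Grid search for the L in [0, L_full] that maximizes the net payoff."""
    grid = [L_full * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda L: net_payoff(L, a, zeta0, zeta1))

# Hypothetical parameters: marginal benefit starts slightly above marginal cost.
a, zeta0 = 0.14, 0.10
for zeta1 in (0.70, 0.74, 0.75, 0.80):
    print(zeta1, round(optimal_violence(a, zeta0, zeta1), 3))
```

In this toy example the maximizer sits at the corner L = 1 for ζ1 = 0.74 and at a small interior value for ζ1 = 0.75, so a change of 0.01 in one cost parameter moves the chosen intensity across almost the whole admissible range.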
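Proposition 5. There exists a non-measure zero set of parameter values $P^d$ such that if $(\lambda, r_i, b, c_0, \zeta_0) \in P^d$,
there exist $c_1^d$ and $\zeta_1^d$ satisfying $\lim_{c_1 \uparrow c_1^d} L^* < \lim_{c_1 \downarrow c_1^d} L^*$ and $\lim_{\zeta_1 \uparrow \zeta_1^d} L^* < \lim_{\zeta_1 \downarrow \zeta_1^d} L^*$.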
The unpredictability in violence outlined in Proposition 5 implies that the purge breadth
may also seem random (by Proposition 2). While there are few ways to test this prediction, it
should be noted that there was great variation in the number of party members affected by mass
purges in both the USSR and Communist China (see Tables 1 and 2 in Appendix A).
7 Mass Purges and Top-Down Accountability in Autocracies
In our theoretical framework, mass purges are a salient method of top-down accountability in
autocracy. In totalitarian regimes, the political and professional responsibilities of party members
are fused and “[f]ailure in the latter thus automatically becomes a case of political accountability”
(Brzezinski, 1956, 86). The autocrat’s top-down accountability problem thus bears similarities
with bottom-up accountability, where voters evaluate their representative based on economic
indicators (for example, Besley, 2007, chapter 4).
22 Notice that all our results hold as long as the replacement pool is expected to be more active (on average) than
existing party members. It is of little consequence whether the difference is due to greater loyalty to the leader,
greater productivity, or both.
23 The second intersection is a local minimum.
There exists, however, a fundamental difference between the autocrat’s and voters’ accountability
problems. While many voters hold a single representative accountable so one is accountable to
many, the autocrat faces a mass of party members so many are accountable to one. Our analysis
highlights forces specific to many-to-one accountability. In our setting, there is no team production
problem and no yardstick competition as in a tournament; party members work on independent
projects. Nonetheless, their fate is linked through the purge. This generates strategic interdependence
in party members’ activism (the pool size effect). In turn, the pool size effect changes the
autocrat’s inference problem as her ability to identify an agent’s type depends on the behavior
of all party members (the pool makeup effect). Consequently, top-down accountability cannot be
understood by studying a single or few agents in isolation; researchers must take a comprehensive
perspective. Top-down accountability is, almost by definition, a large N problem.
While the many-to-one feature we identify is not unique to top-down accountability in autoc-
racy, the critical role of violence due to the autocrat’s monopoly in the political and judicial areas
is. We show that the intensity of violence determines the nature of the purge, but not necessarily
its breadth (Proposition 2). Further, even though the autocrat faces little constraint on her ac-
tions, due to party members’ strategic response, she cannot escape a trade-off between fear (higher
performance) and love (less unity). This trade-off implies that a rational autocrat does not use
violence bluntly. We should observe greater violence when the need to mobilize the autocrat’s
agents is high,24 and more restraint when the autocrat is interested in selection.
This result may also explain why the Fascist and Nazi regimes used purges parsimoniously