Misspecified Politics and the Recurrence of Populism
Gilat Levy, Ronny Razin & Alwyn Young*
London School of Economics, March 2020
We develop a model of political competition between types that differ in their specification of the data generating process for a common outcome. We show that misspecified beliefs converge on a simpler view of the world which suffers from omitted variable bias. Periods in which those with a correctly specified and endogenously more complex model govern increase the specification error of the simpler world view, leading the latter to underrate the effectiveness of complex policies and overestimate the positive impact of a few extreme policy actions. Periods in which endogenously simple types implement their narrow world view result in subpar outcomes and a weakening of their omitted variable bias. Policy cycles arise, where each type's tenure in power sows the seeds of its eventual electoral defeat.
"Populism is Simple, Democracy is Complex." (R. Dahrendorf 2007)
________________ *We thank participants in conferences and seminars at the University of Pennsylvania, University of Chicago, Edinburgh University, LSE and Bocconi for helpful comments. This project received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 681579.
I. Introduction
Individuals differ not merely in their economic interests and preferences, but also
in their fundamental understanding of the data generating process that underlies observed
outcomes. Consequently, because they consider the same historical data through the
prism of different models, fully rational and otherwise similar actors can have persistent
differences of opinion, as witnessed by the endurance of academic debates in areas as
diverse as macroeconomics and physics. In politics, such differences in model
specification translate into differences in realized policy decisions when different groups
are in power. The consequent interplay between beliefs and policy can generate
systematic correlations between observed data that sustain differing beliefs and biases.
This paper considers political competition between types that share the same
interests and preferences over common outcomes but differ in their specification of the
causes of these outcomes. Because of the infinite number of potential regressors and
finite number of observations, all actors must start with some restriction on the set of
policies they consider relevant, i.e. may have non-zero effects on the common outcome.
With a minimum of policy variation, actors learn over time which policies in their initial
set are actually irrelevant and asymptotically beliefs coalesce around relevant policies
with non-zero effects. Consequently, if the beliefs of one type are misspecified, i.e.
exclude relevant determinants of policy outcomes, over time they become "simple"
relative to the "complex" views of those with an initially correctly specified model. For
example, while complex types may consider crime, income stagnation, inequality, and
housing costs as interrelated but largely separate, and best treated with a range of
policies, simple types come to view them as stemming from a single cause, e.g.
immigration.
Our principal finding is that in electoral competition complex types with correctly
specified beliefs are unable to permanently defeat and remove from power those with
misspecified simple beliefs. Periods in which complex types govern and implement their
broad policy agenda increase the omitted variable bias of the simple, as they attribute the
successful outcomes of the full range of complex policies to moderate actions taken on a
few dimensions. This increases the simple's assessment of the likely effectiveness of a
more decisive narrow policy and mobilizes them in support of political candidates who
will implement it. However, periods when the simple govern produce systematically
inferior results, as extreme actions are revealed to be less effective than anticipated. This
reduces the intensity of both their desired policy and political activism, thereby allowing
complex types to regain power. Thus, we find that the economy suffers from endless
political cycles characterized by a rising and falling intensity of beliefs, alternating
periods of policy moderation and extremism, and systematically better and worse
outcomes.
Our model may shed some light on the recurrence of political populism. The
amorphous concept of "populism" has perhaps as many definitions as authors.1 A
frequent theme, however, is that the policies of populist politicians are extreme,
misguided and harmful to the very groups that support them (e.g., Dornbusch and
Edwards 1991). Our framework provides a motivation for the recurrence of large policy
deviations with subpar outcomes that are supported by rational voters. Periods of
complex establishment rule increase the omitted variable bias of misspecified simple
beliefs, inducing a fully rational discounting of past failures and ensuring never-ending
1 For reviews see Gidron and Bonikowski (2013) and Mudde and Rovira Kaltwasser (2017).
political cycles as the actions taken by each type when in power assure their eventual
replacement by their opposition.
Our paper builds on a literature of political-economy models of sub-optimal
populist policies. Acemoglu et al (2013) model left-wing populist policies that are both
harmful to elites and not in the interests of the majority poor as arising from the need for
politicians to signal that they are not influenced by rich right-wing interests. Di Tella and
Rotemberg (2016) analyze populism in a behavioural model in which voters are betrayal
averse and may prefer incompetent leaders so as to minimize the chance of suffering
from betrayal. Guiso et al (2017) define a populist party as one that champions short-
term redistributive policies while discounting claims regarding long-term costs as
representing elite interests. Bernhardt et al (2019) show how office-seeking demagogues
who cater to voters' short-term desires compete successfully with far-sighted
representatives who guard the long-run interests of voters. Morelli et al (2020) show how
in a world with information costs incompetent politicians who simplistically commit to
fixed policies can be successful. Our framework expands this literature by linking the
pursuit of sub-optimal policy to the bias created by a misspecified interpretation of the
outcomes of periods of optimal rule.
Interest in learning with misspecified models dates back at least to Arrow &
Green (1973), with examples including Bray (1982), Nyarko (1991), Esponda (2008) and,
most recently, Esponda and Pouzo (2016) and Molavi (2019). Several recent papers
feature interactions between competing belief structures that share features of our
framework. Mailath and Samuelson (2019), with market analysts' evaluations of stocks
as an example, also argue that agents impose different a priori simplifications on the data
generating process and examine the convergence of beliefs with information exchange
based upon a given set of data and recursively announced beliefs. Eliaz and Spiegler
(2019) present a static model of political competition based upon competing narratives
that draw voters' attention to different causal variables and mechanisms, with victory
going to the narrative that can present the most positive anticipatory outcomes, while
remaining consistent with steady state observations. Montiel Olea et al (2017), with
auctions as a motivation, consider competition between agents that use simple or complex
models to explain a given set of exogenous data and find that simpler agents have greater
confidence in their estimates in smaller data sets and less confidence asymptotically. In
our framework the endogenous data produced by actors with different specifications
generates persistent biases and differences in beliefs that asymptotically keep both types
politically competitive. More technically, convergence in misspecified models is not
guaranteed, and is especially problematic with multidimensional state spaces (Heidhues,
Notes: p = p-value on the test of the significance of the debate dummies; share of all words = share of words with 2 or more characters.
[Figure I: Ln Unique Words/Total Words as a Function of Total Words. Two panels, "Words with 2 or more characters" and "Words with 6 or more characters". Horizontal axis: ln total words. Vertical axis: ln (unique words/total words). Series: Others, Trump, Sanders, Clinton.]
III. A Modelling Framework
We model a polity in which citizens receive utility from a set of common policy-
influenced outcomes, but have fundamental disagreements over the causal determinants
of some of those outcomes. Specifically, we consider the case where there is a common
outcome y whose realization at time t is governed by the data generating process
(III.1)  $y_t = (\mathbf{x}_t + \mathbf{n}_t)'\boldsymbol{\beta} + \varepsilon_t$,
where xt and nt are vectors of k desired policy actions and policy noise, β the vector of
policy effects parameters, some of which may be zero, and εt a mean zero iid normally
distributed random shock.3 Although y is described as a single outcome, one can equally
think of it as a preference weighted average of multiple outcomes that are influenced by
x.4 The components of noise n are iid with zero mean and diagonal covariance matrix
$\sigma_n^2\mathbf{I}_k$, and are independent of both desired policy x and the shock to outcomes ε. These
can be thought of as minuscule bureaucratic errors or shorthand for the experimentation
that might arise with forward looking politicians, and their main function is to eliminate
fragile multiple equilibria, as discussed further below.
Outside of y, we assume there are other common outcomes over which there is no
fundamental disagreement about causal mechanisms and the role of policy, and the utility
citizens derive from all common outcomes is given by
(III.2)  $U_t = y_t + V(R_t)$,
where Rt = xtʹxt represents the resources used in implementing policy x for y and is
expressed in a fashion that allows us not to worry about the signs of the elements of β or
x, while V represents the utility derived from policy outcomes over which there is no
3We follow standard matrix algebra notation throughout, using regular typeface to denote scalars,
and lower and upper case letters in bold typeface to denote column vectors and matrices, respectively.
4Thus, if utility is a weighted average of i components each with yit = xtʹβi + εit, then the outcome, parameters and error term in III.1 are simply the weighted average of those components.
disagreement regarding causal mechanisms. V is a reduced form, representing the utility
that can be achieved in other policy areas given the allocation of resources to y, and the
assumptions $\partial V/\partial R < 0$ and $\partial^2 V/\partial R^2 < 0$ are natural. To derive analytical results below
we will work with a second-order approximation of V as a quadratic function of Rt, and in
the appendix we assume, quite reasonably, that there is a finite upper bound R on the
resources available for policy as this simplifies many aspects of the proofs.
Citizens in our polity are divided into two "types" based upon their prior beliefs
about the unknown policy effects parameters β. Faced with a multiplicity of possible
regressors, each type excludes some policies on a priori grounds as irrelevant, i.e. having
zero policy effects. While one type has a correctly specified model, in that the policies
they think are relevant include all non-zero elements of β, the beliefs of the other type
are misspecified, in that the policies they think are relevant exclude some of the non-zero
elements of β. We use the subscript i to distinguish between the full k x 1 vectors of
desired policies, noise and parameters (x, n and β) and the ki sub-elements of these type i
believes are relevant (xi, ni and βi). Similarly, while Ht = Xt + Nt denotes the full t x k
history of desired actions and noise, Hit = Xit + Nit is made up of the ki columns of that
history deemed relevant by type i. The union of the sets of ki policies deemed relevant by
each type equals k, the total set of systematically implemented policies.
Prior beliefs for each type across the policies they believe are relevant are
normally distributed with mean $\boldsymbol{\beta}_{i0}$ and joint covariance matrix $\sigma_{i0}^2\mathbf{V}_{i0}^{-1}$, while the prior
probability density function on $\sigma_{i0}$ is inverted gamma. Following the observation of the
t x 1 history of outcomes yt, such beliefs give rise to mean posterior beliefs5
5This is a standard OLS Bayesian result (Zellner 1971). Our model is somewhat different than the
standard framework in that the regressors are determined by past realizations of the error term. However, since the current disturbance εt is independent of the current regressors hit, provided some initial prior exists the period by period recursive application of the updating formula (III.3) aggregates across t periods to the result given above.
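(III.3)  $\boldsymbol{\beta}_{it} = (\mathbf{V}_{i0} + \mathbf{H}_{it}'\mathbf{H}_{it})^{-1}(\mathbf{V}_{i0}\boldsymbol{\beta}_{i0} + \mathbf{H}_{it}'\mathbf{y}_t).$
However, since one can easily define a finite "pre-history" of policy Hi0 and outcomes y0,
such that $\mathbf{V}_{i0} = \mathbf{H}_{i0}'\mathbf{H}_{i0}$ and $\boldsymbol{\beta}_{i0} = (\mathbf{H}_{i0}'\mathbf{H}_{i0})^{-1}\mathbf{H}_{i0}'\mathbf{y}_0$, and our results will be asymptotic, we
simplify our algebra by including these pre-histories in Hit and yt and simply writing
beliefs as
(III.4)  $\boldsymbol{\beta}_{it} = (\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{H}_{it}'\mathbf{y}_t.$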
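The equivalence between the updating formula (III.3) with a data-based prior and the pooled least-squares form (III.4) can be checked numerically. The sketch below uses a single regressor and made-up numbers; the function names and all data values are illustrative assumptions, not taken from the paper.

```python
# Illustrative check that (III.3) with a pre-history prior equals (III.4).
# Single regressor only; every number below is made up for the example.

def posterior_mean(v0, b0, h, y):
    """Scalar version of (III.3): (V0 + h'h)^(-1) (V0*b0 + h'y)."""
    hh = sum(a * a for a in h)
    hy = sum(a * b for a, b in zip(h, y))
    return (v0 * b0 + hy) / (v0 + hh)

def ols(h, y):
    """Scalar version of (III.4): (h'h)^(-1) h'y."""
    return sum(a * b for a, b in zip(h, y)) / sum(a * a for a in h)

# Hypothetical pre-history (h0, y0) and observed history (h, y).
h0, y0 = [1.0, 2.0], [1.5, 2.5]
h, y = [0.5, 1.5, 1.0], [0.4, 2.0, 1.1]

v0 = sum(a * a for a in h0)   # V_i0 = H_i0' H_i0
b0 = ols(h0, y0)              # prior mean implied by the pre-history

recursive = posterior_mean(v0, b0, h, y)
pooled = ols(h0 + h, y0 + y)  # pre-history stacked on top of the history
```

Under these assumptions `recursive` and `pooled` agree up to floating-point error, which is why the prior can be folded into the history without changing the algebra.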
Without noise, the model described above intrinsically allows for multiple
equilibria. As beliefs converge, the variation in actions declines and the regressors
become collinear and "asymptotically uncooperative" (Schmidt 1976). This loss of
information through collinearity undermines proofs of convergence to a point. While
beliefs asymptotically typically satisfy some conditions, such as that mean expected
effects equal the mean impact of policy, as in $\mathbf{x}_i'\boldsymbol{\beta}_i = \mathbf{x}'\boldsymbol{\beta}$, there is still a continuum of
equilibria that meet these minimal requirements and it is not possible to prove that the
economy does not forever move along that continuum, as no point has preference over
another. The smallest amount of random noise, however, easily eliminates this
indeterminacy.6 Not wishing to carry the reader through a host of technical and yet
fragile results, we resort to the simple assumption that implemented policy is composed
of desired policy x and the vector of mean-zero random noise n.
Political competition in our polity appears in the form of citizen candidates, one
for each type, who if successfully elected implement policies which are myopically
optimal given their belief type. Voting is costly, but citizens vote because they believe
that with some probability p their vote will be pivotal. Consequently, voters of type i will
6Technically speaking, the problem is that absent noise Hi′Hi/t is not guaranteed to be asymptotically
positive definite, a baseline assumption used in typical proofs of convergence. While the covariance matrix is positive definite for all t and the eigenvalues are all weakly increasing through time, when the regressors become collinear some of the eigenvalues become o(t), which is why beliefs end up on a line. With noise, all of the eigenvalues are assured to be Ω(t) (bounded above 0 when divided by t).
be motivated to vote for the citizen candidate of their type if the expected gain from type i
policies relative to those of belief type j exceeds the cost of voting, i.e.:
(III.5)  $pI_i > c$, where $I_i = E_i[U(\mathbf{x}_t^i) - U(\mathbf{x}_t^j)]$,
where Ei denotes the expectation based upon the beliefs of i and $\mathbf{x}_t^i$ the preferred policies
of type i. Ii is the intensity of the voting preferences of type i and does not necessarily
equal -Ij as beliefs differ across the two groups. With a distribution across citizens of
costs c divided by pivotality probability p, the vote share the candidate of each type garners
will be an increasing function of the intensity of their type. We assume this distribution
is the same for both types, and that both types are equally numerous. Consequently, the
election is won by the candidate representing the type with the greatest voting preference
intensity. The results below can be generalized to allow for unequal group sizes by
noting that this simply implies the smaller group will require a certain margin of voting
preference intensity to motivate its base enough to win an election.
IV. Optimal Policies in a Single Period
In this section we examine ideal policies and the determinants of intensity within
a single period. To simplify notation, we drop subscripted references to time. Our
principal result is:
(R1) The voting intensity of type i, i.e. their expected gain from implementing their ideal policies instead of those of the opposing type j, is increasing in $\boldsymbol{\beta}_i'\boldsymbol{\beta}_i$. Consequently, the party with the highest $\boldsymbol{\beta}_i'\boldsymbol{\beta}_i$ wins the election and implements their ideal policies.
To build intuition we first consider the optimal choice of x given the resources R
allocated to y, then determine the optimal allocation of resources between areas of policy
agreement and disagreement, and finally show that voting intensity is increasing in $\boldsymbol{\beta}_i'\boldsymbol{\beta}_i$.
Given a resource allocation R to outcome y, the optimal policy for any belief type
is determined by:
(IV.1)  $\max_{\mathbf{x}}\; E[y] + \lambda[R - \mathbf{x}'\mathbf{x}] = \mathbf{x}'\boldsymbol{\beta} + \lambda[R - \mathbf{x}'\mathbf{x}],$
where, as noted before, the expectation is taken across the prior and we use β to denote
the mean values of beliefs. Substituting the first order condition
(IV.2)  $\mathbf{x} = \dfrac{\boldsymbol{\beta}}{2\lambda}$
into the resource constraint
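(IV.3)  $R = \mathbf{x}'\mathbf{x} = \dfrac{\boldsymbol{\beta}'\boldsymbol{\beta}}{(2\lambda)^2} \;\rightarrow\; \lambda = \dfrac{1}{2}\sqrt{\dfrac{\boldsymbol{\beta}'\boldsymbol{\beta}}{R}}$ or (for later use) $\lambda\sqrt{R} = \dfrac{1}{2}\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}}$,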
which allows us to solve for the optimal x as a function of R and β
(IV.4)  $\mathbf{x}(R,\boldsymbol{\beta}) = \boldsymbol{\beta}\sqrt{\dfrac{R}{\boldsymbol{\beta}'\boldsymbol{\beta}}},$
while the expected outcome given such policies x and beliefs β is given by:
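(IV.5)  $y[\mathbf{x}(R,\boldsymbol{\beta}),\boldsymbol{\beta}] = \mathbf{x}'\boldsymbol{\beta} = \sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}\,R}.$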
For a given level of resource use R, types which have more extreme parameter estimates,
as measured by $\boldsymbol{\beta}'\boldsymbol{\beta}$, believe they know how to pursue more effective policies, as
measured by $\mathbf{x}'\boldsymbol{\beta}$ in (IV.5), and consequently feel more constrained by the resource
limitation R, as measured by λ in (IV.3).
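As a numerical illustration of (IV.3) to (IV.5), the sketch below computes the optimal policy vector, the multiplier λ and the expected outcome for two hypothetical belief vectors at the same resource level. The function name and the belief vectors are assumptions chosen for the example.

```python
import math

def optimal_policy(beta, R):
    """Return (x, lam, expected_y) for mean beliefs beta and resources R.

    x follows (IV.4), lam follows (IV.3) and expected_y follows (IV.5).
    """
    bb = sum(b * b for b in beta)            # beta'beta
    scale = math.sqrt(R / bb)
    x = [b * scale for b in beta]            # x = beta * sqrt(R / beta'beta)
    lam = 0.5 * math.sqrt(bb / R)            # shadow value of resources
    expected_y = math.sqrt(bb * R)           # x'beta at the optimum
    return x, lam, expected_y

# Two hypothetical belief vectors with the same budget R = 2.
moderate = optimal_policy([1.0, 1.0], R=2.0)   # beta'beta = 2
extreme = optimal_policy([2.0, 0.0], R=2.0)    # beta'beta = 4
```

In this example the type with the larger β′β expects a higher outcome from the same budget and has a larger λ, i.e. it feels the resource constraint more keenly.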
The gain in expected utility from pursuing an optimal policy x versus an
alternative policy x + δ that satisfies the same resource constraint is given by:
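(IV.6)  $y[\mathbf{x},\boldsymbol{\beta}] - y[\mathbf{x}+\boldsymbol{\delta},\boldsymbol{\beta}] = \mathbf{x}'\boldsymbol{\beta} - (\mathbf{x}+\boldsymbol{\delta})'\boldsymbol{\beta} = -\boldsymbol{\delta}'\boldsymbol{\beta}$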
Substituting using (IV.4) and the fact that - δʹx = ½δʹδ , as both xʹx and (x+δ)ʹ(x+δ) equal
R, we see that:
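(IV.7)  $y[\mathbf{x},\boldsymbol{\beta}] - y[\mathbf{x}+\boldsymbol{\delta},\boldsymbol{\beta}] = -\boldsymbol{\delta}'\mathbf{x}\sqrt{\dfrac{\boldsymbol{\beta}'\boldsymbol{\beta}}{R}} = \dfrac{\boldsymbol{\delta}'\boldsymbol{\delta}}{2}\sqrt{\dfrac{\boldsymbol{\beta}'\boldsymbol{\beta}}{R}}$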
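The substitution behind (IV.7) uses only the fact that both policies exhaust the same budget R. Written out as a short sketch from the definitions above:

```latex
(\mathbf{x}+\boldsymbol{\delta})'(\mathbf{x}+\boldsymbol{\delta})
  = \mathbf{x}'\mathbf{x} + 2\boldsymbol{\delta}'\mathbf{x} + \boldsymbol{\delta}'\boldsymbol{\delta} = R
  \quad\text{and}\quad \mathbf{x}'\mathbf{x} = R
  \;\Longrightarrow\; -\boldsymbol{\delta}'\mathbf{x} = \tfrac{1}{2}\boldsymbol{\delta}'\boldsymbol{\delta},
\qquad
\boldsymbol{\beta} = \mathbf{x}\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}/R}
  \;\Longrightarrow\; -\boldsymbol{\delta}'\boldsymbol{\beta}
  = \tfrac{1}{2}\boldsymbol{\delta}'\boldsymbol{\delta}\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}/R}.
```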
Individuals with more extreme parameter estimates feel the resource constraint more
keenly and hence lose more from a sub-optimal movement δ away from their constrained
choice. Their gain from moving from an expenditure level R1 to a higher level R2 (while
pursuing optimal policies in both instances) is also greater, as
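(IV.8)  $y[\mathbf{x}(R_2,\boldsymbol{\beta}),\boldsymbol{\beta}] - y[\mathbf{x}(R_1,\boldsymbol{\beta}),\boldsymbol{\beta}] = \sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}}\,(\sqrt{R_2} - \sqrt{R_1})$.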
Having chosen x optimally given R and β , the next step in the citizen's decision
involves the optimal allocation of resources between the areas of policy agreement and
disagreement:
(IV.9)  $\max_{R}\; y[\mathbf{x}(R,\boldsymbol{\beta}),\boldsymbol{\beta}] + V(R),$
the first order condition for which is
(IV.10)  $V'(R) = -\lambda(R) = -\dfrac{1}{2}\sqrt{\dfrac{\boldsymbol{\beta}'\boldsymbol{\beta}}{R}}.$
An increase in $\boldsymbol{\beta}'\boldsymbol{\beta}$ raises the marginal utility of expenditure on y, raising the equilibrium
allocation of resources R to that sector. However, the optimal ratio $\boldsymbol{\beta}'\boldsymbol{\beta}/R$ rises with
$\boldsymbol{\beta}'\boldsymbol{\beta}$, as the rising marginal utility of the diminished resources allocated to the area of
policy agreement requires an equal rise in the marginal utility of expenditure in the area
of policy disagreement.
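The first order condition (IV.10) has no closed form in general, but it is easy to solve numerically once a functional form for V is fixed. The sketch below assumes a quadratic V(R) = -A·R - ½·B·R² with hypothetical coefficients A and B (so that V′ < 0 and V″ < 0) and solves for R by bisection; the coefficients and function names are assumptions made only for illustration.

```python
import math

A, B = 0.25, 0.25   # hypothetical coefficients of V(R) = -A*R - 0.5*B*R**2

def marginal_V(R):
    """V'(R) for the assumed quadratic V; negative and decreasing in R."""
    return -A - B * R

def optimal_R(bb, R_max=100.0):
    """Solve V'(R) = -0.5*sqrt(bb/R), i.e. (IV.10), by bisection on (0, R_max)."""
    lo, hi = 1e-12, R_max
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.sqrt(bb / mid) + marginal_V(mid) > 0:
            lo = mid   # marginal gain in y still exceeds the marginal cost
        else:
            hi = mid
    return 0.5 * (lo + hi)

intensities = [1.0, 2.0, 4.0]                      # values of beta'beta
allocations = [optimal_R(bb) for bb in intensities]
ratios = [bb / R for bb, R in zip(intensities, allocations)]
```

With these assumed coefficients both the allocation R and the ratio of β′β to R increase with β′β, which is the ordering used when the two types are compared.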
We now determine the relative intensity of two types, i and j, when faced with a
choice between candidates proposing their respectively optimal policies. For the sake of
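concrete exposition, we assume that $\boldsymbol{\beta}_i'\boldsymbol{\beta}_i > \boldsymbol{\beta}_j'\boldsymbol{\beta}_j$ and consequently Ri > Rj and
$\boldsymbol{\beta}_i'\boldsymbol{\beta}_i/R_i > \boldsymbol{\beta}_j'\boldsymbol{\beta}_j/R_j$. The voting intensity of type i is given by: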
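(IV.11)  $I_i = y[\mathbf{x}(R_i,\boldsymbol{\beta}_i),\boldsymbol{\beta}_i] + V(R_i) - y[\mathbf{x}(R_j,\boldsymbol{\beta}_j),\boldsymbol{\beta}_i] - V(R_j)$
$\quad = \underbrace{y[\mathbf{x}(R_j,\boldsymbol{\beta}_i),\boldsymbol{\beta}_i] - y[\mathbf{x}(R_j,\boldsymbol{\beta}_j),\boldsymbol{\beta}_i]}_{A_i > 0} + \underbrace{y[\mathbf{x}(R_i,\boldsymbol{\beta}_i),\boldsymbol{\beta}_i] - y[\mathbf{x}(R_j,\boldsymbol{\beta}_i),\boldsymbol{\beta}_i]}_{B_i > 0} + \underbrace{V(R_i) - V(R_j)}_{C_i < 0}.$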
The positive first term, Ai, is the gain in expected y at expenditure levels Rj enjoyed by
moving from the sub-optimal policies of type j to the optimal policies of type i; the
positive second term, Bi, is the gain in expected y enjoyed by moving from the optimal
expenditure Rj of type j to the greater desired expenditure Ri of type i; and the negative
third term, Ci, is the reduction in utility in areas of policy agreement brought about by the
increased allocation of resources to y by type i. In a similar vein, the voting preference
intensity of type j is given by:
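(IV.12)  $I_j = y[\mathbf{x}(R_j,\boldsymbol{\beta}_j),\boldsymbol{\beta}_j] + V(R_j) - y[\mathbf{x}(R_i,\boldsymbol{\beta}_i),\boldsymbol{\beta}_j] - V(R_i)$
$\quad = \underbrace{y[\mathbf{x}(R_i,\boldsymbol{\beta}_j),\boldsymbol{\beta}_j] - y[\mathbf{x}(R_i,\boldsymbol{\beta}_i),\boldsymbol{\beta}_j]}_{A_j > 0} + \underbrace{y[\mathbf{x}(R_j,\boldsymbol{\beta}_j),\boldsymbol{\beta}_j] - y[\mathbf{x}(R_i,\boldsymbol{\beta}_j),\boldsymbol{\beta}_j]}_{B_j < 0} + \underbrace{V(R_j) - V(R_i)}_{C_j > 0},$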
with each term having an interpretation similar to the case for i except that while Aj is
similarly positive, Bj is negative and Cj positive as Ri > Rj.
Let δ denote the difference in the optimal x policies of the two types. Using
(IV.7) earlier above, we see that:
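(IV.13)  $A_i = \dfrac{\boldsymbol{\delta}'\boldsymbol{\delta}}{2}\sqrt{\dfrac{\boldsymbol{\beta}_i'\boldsymbol{\beta}_i}{R_j}},\quad A_j = \dfrac{\boldsymbol{\delta}'\boldsymbol{\delta}}{2}\sqrt{\dfrac{\boldsymbol{\beta}_j'\boldsymbol{\beta}_j}{R_i}} \;\rightarrow\; A_i > A_j$ (as $\boldsymbol{\beta}_i'\boldsymbol{\beta}_i > \boldsymbol{\beta}_j'\boldsymbol{\beta}_j$ & $R_i > R_j$).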
As type i believes it can implement more effective policies and is more strictly
constrained in its allocation of resources to y, the gain it expects in moving from
sub-optimal to optimal policies at the low expenditure levels of type j exceeds the gain
type j envisions from a similar movement at the generous expenditure levels of type i.
Turning to the comparison of Bi + Ci and Bj + Cj, these terms represent the net gain to
each type from moving to their optimal resource allocation in the area of policy
disagreement, given an optimal choice of x at each expenditure level. Substituting using
(IV.8) and (IV.10):
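(IV.14)  $B_i = \sqrt{\boldsymbol{\beta}_i'\boldsymbol{\beta}_i}\,(\sqrt{R_i} - \sqrt{R_j}) = -2V'(R_i)\sqrt{R_i}\,(\sqrt{R_i} - \sqrt{R_j}) > 0,$
$\quad B_j = \sqrt{\boldsymbol{\beta}_j'\boldsymbol{\beta}_j}\,(\sqrt{R_j} - \sqrt{R_i}) = -2V'(R_j)\sqrt{R_j}\,(\sqrt{R_j} - \sqrt{R_i}) < 0,$
$\quad$ and $C_i = -C_j = V(R_i) - V(R_j) = (V'(R_i) + V'(R_j))(R_i - R_j)/2,$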
where the last line follows from the second order (quadratic) approximation of V as a
function of R. Consequently, Bi + Ci > Bj + Cj as:
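Substituting (IV.14), and using $R_i - R_j = (\sqrt{R_i} - \sqrt{R_j})(\sqrt{R_i} + \sqrt{R_j})$, the difference can be written out as follows. This is a reconstruction under the quadratic approximation of V and is not necessarily the authors' exact display:

```latex
\begin{aligned}
(B_i + C_i) - (B_j + C_j)
&= -2\left(\sqrt{R_i}-\sqrt{R_j}\right)\left[V'(R_i)\sqrt{R_i} + V'(R_j)\sqrt{R_j}\right] \\
&\quad + \left[V'(R_i)+V'(R_j)\right]\left(\sqrt{R_i}-\sqrt{R_j}\right)\left(\sqrt{R_i}+\sqrt{R_j}\right) \\
&= \left(\sqrt{R_i}-\sqrt{R_j}\right)^{2}\left[V'(R_j)-V'(R_i)\right] > 0 .
\end{aligned}
```

The final expression is positive because $R_i > R_j$ and $\partial^2 V/\partial R^2 < 0$ together imply $V'(R_i) < V'(R_j)$.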
To summarize, types with more extreme beliefs, as measured by $\boldsymbol{\beta}'\boldsymbol{\beta}$, believe they
can pursue more effective policies, feel more constrained by the limitations of resources,
and experience a greater gain in the flow of utility in the area of policy disagreement y by
moving a given distance δ toward their optimal policy on x and a greater net utility gain
from transferring a given quantity of resources from areas of policy agreement into the y
sector. Consequently, whichever type has more extreme beliefs views the election as
more consequential and is mobilized in greater numbers to vote for their candidate. With
equal demographic shares, they win the election and their citizen candidate implements
their preferred policies.
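Result (R1) can be illustrated by computing the two voting intensities directly from their definition in (III.5). The sketch below again assumes a quadratic V with hypothetical coefficients A and B and two made-up belief vectors; none of these numbers come from the paper.

```python
import math

A, B = 0.25, 0.25   # hypothetical coefficients of V(R) = -A*R - 0.5*B*R**2

def V(R):
    return -A * R - 0.5 * B * R * R

def optimal_R(bb, R_max=100.0):
    """Solve the first order condition 0.5*sqrt(bb/R) = A + B*R by bisection."""
    lo, hi = 1e-12, R_max
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.sqrt(bb / mid) > A + B * mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ideal_policy(beta):
    """Ideal (x, R) of a type with mean beliefs beta, using (IV.4) and (IV.10)."""
    bb = sum(b * b for b in beta)
    R = optimal_R(bb)
    scale = math.sqrt(R / bb)
    return [b * scale for b in beta], R

def intensity(own_beta, other_beta):
    """Expected utility gain, under own beliefs, of own ideal policy over the rival's."""
    x_own, R_own = ideal_policy(own_beta)
    x_other, R_other = ideal_policy(other_beta)
    u_own = sum(x * b for x, b in zip(x_own, own_beta)) + V(R_own)
    u_other = sum(x * b for x, b in zip(x_other, own_beta)) + V(R_other)
    return u_own - u_other

beta_i = [2.0, 0.0]   # hypothetical type with the more extreme beliefs (beta'beta = 4)
beta_j = [1.0, 1.0]   # hypothetical type with beta'beta = 2
I_i = intensity(beta_i, beta_j)
I_j = intensity(beta_j, beta_i)
```

In this example both intensities are positive, since each type prefers its own policies, and the type with the larger β′β has the larger intensity, so under equal group sizes its candidate would win.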
V. Asymptotic Equilibrium with Two Potential Policies
In this section we solve the asymptotic dynamic equilibrium for a simple example,
a world in which there are two potential policies, 1 and 2, both of which are relevant, i.e.
β1 ≠ 0 and β2 ≠ 0. Those with misspecified beliefs, the "simple", only believe in the
efficacy of policy 1, while those with correctly specified beliefs, the "complex",
recognize that both policies might have effects. This example illustrates the mechanisms
of the model and the convergence to the steady state in an intuitive fashion. The
following section derives results in greater generality. Our central results are
(R2) For sufficiently small noise, the asymptotic equilibrium involves policy cycles, as, with the exception of equilibrium paths of probability measure zero, asymptotically autarchic rule by a single type is not possible. The larger the ratio $|\beta_2/\beta_1|$, the smaller is the asymptotic share of time the simple are in power, but the more biased are their asymptotic beliefs and the more ineffective their policies.
So as to not interrupt the flow of the argument, all proofs of probability limits stated
below are given in the on-line appendix. We use hk = xk + nk to denote the t x 1 history
of desired policy and noise for policy k. We will find it useful to separate out histories
that only include the periods when each type is in power, with, for example, hki denoting
the ti rows of hk associated with the periods when type i is in power, with ts + tc = t.
The complex have a correctly specified model of the world and as long as the
regressors asymptotically have any independent variation at all will converge on the true
parameters with
(V.1)  $\boldsymbol{\beta}_c \overset{p}{=} \boldsymbol{\beta}$,
where we use the notation $\overset{p}{=}$ to signify "converges in probability". A negligible amount
of random noise is enough to ensure this result. In the probability limit, the complex then
implement steady state policies
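(V.2)  $x_1^c \overset{p}{=} \beta_1\sqrt{\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}},\quad x_2^c \overset{p}{=} \beta_2\sqrt{\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}},$
where $R_{\boldsymbol{\beta}'\boldsymbol{\beta}}$ is the optimal allocation of resources to y given intensity $\boldsymbol{\beta}'\boldsymbol{\beta}$. If simple
beliefs converge on the true value for policy 1, they will have lower voting intensity as
$\beta_1^2 < \beta_1^2 + \beta_2^2$. However, there exists a level of bias
(V.3)  $\tau^* = \sqrt{\dfrac{\beta_1^2 + \beta_2^2}{\beta_1^2}}$
such that $(\tau^*\beta_1)^2 = \beta_1^2 + \beta_2^2$ and the simple and the complex share the same voting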
intensity. We shall show that simple beliefs converge on this level of bias in an
equilibrium with policy cycles where both types alternate in power.
The simple's mean beliefs concerning the effects of policy 1 are given by the
coefficient estimate in the misspecified regression that only includes policy 1 as a
regressor
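(V.4)  $\beta_s = \dfrac{\mathbf{h}_1'\mathbf{y}}{\mathbf{h}_1'\mathbf{h}_1} = \dfrac{\mathbf{h}_1'(\mathbf{h}_1\beta_1 + \mathbf{h}_2\beta_2 + \boldsymbol{\varepsilon})}{\mathbf{h}_1'\mathbf{h}_1} = \beta_1 + \beta_2\dfrac{\mathbf{h}_1'\mathbf{h}_2}{\mathbf{h}_1'\mathbf{h}_1} + \dfrac{\mathbf{h}_1'\boldsymbol{\varepsilon}}{\mathbf{h}_1'\mathbf{h}_1}$.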
As the mean zero shocks ε are independent of policy, it is easy to see that
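(V.5)  $\dfrac{\mathbf{h}_1'\boldsymbol{\varepsilon}}{\mathbf{h}_1'\mathbf{h}_1} = \dfrac{\mathbf{h}_1'\boldsymbol{\varepsilon}/t}{\mathbf{h}_1'\mathbf{h}_1/t} \overset{p}{=} 0,$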
as the numerator averages to zero while the denominator averages to a number strictly
greater than 0. Let θs = ts/t denote the share of time up to time t the simple have been in
power. If the limit (not probability limit) of θs = 1, i.e. asymptotically the simple are
always in power, then
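(V.6)  $\dfrac{\mathbf{h}_1'\mathbf{h}_2}{\mathbf{h}_1'\mathbf{h}_1} = \dfrac{(\mathbf{x}_1+\mathbf{n}_1)'(\mathbf{x}_2+\mathbf{n}_2)/t}{\mathbf{h}_1'\mathbf{h}_1/t} \overset{p}{=} 0$ (if $\lim_{t\to\infty}\theta_s = 1$),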
as the simple do not implement any policy 2, while the noise shocks for policies 1 and 2
are independent and their product averages to zero. Conversely, if asymptotically the
complex are always in power the cross product of policies 1 and 2 is non-zero and hence
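(V.7)  $\dfrac{\mathbf{h}_1'\mathbf{h}_2}{\mathbf{h}_1'\mathbf{h}_1} = \dfrac{\mathbf{h}_1'\mathbf{h}_2/t}{\mathbf{h}_1'\mathbf{h}_1/t} = \dfrac{(\mathbf{x}_1+\mathbf{n}_1)'(\mathbf{x}_2+\mathbf{n}_2)/t}{(\mathbf{x}_1+\mathbf{n}_1)'(\mathbf{x}_1+\mathbf{n}_1)/t} \overset{p}{=} \dfrac{\beta_1\beta_2\,(R_{\boldsymbol{\beta}'\boldsymbol{\beta}}/\boldsymbol{\beta}'\boldsymbol{\beta})}{\beta_1^2\,(R_{\boldsymbol{\beta}'\boldsymbol{\beta}}/\boldsymbol{\beta}'\boldsymbol{\beta}) + \sigma_n^2}$ (if $\lim_{t\to\infty}\theta_s = 0$).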
The results given above allow us to establish two important contradictions.
Combining (V.4) - (V.7), we see that the probability limit of simple beliefs in the two
extreme cases can be expressed as
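(V.8)  $\beta_s \overset{p}{=} \tau_{\theta_s}\beta_1$, with $\tau_1 = 1$ (if $\lim_{t\to\infty}\theta_s = 1$) and
$\quad \tau_0 = 1 + \dfrac{\beta_2^2\,(R_{\boldsymbol{\beta}'\boldsymbol{\beta}}/\boldsymbol{\beta}'\boldsymbol{\beta})}{\beta_1^2\,(R_{\boldsymbol{\beta}'\boldsymbol{\beta}}/\boldsymbol{\beta}'\boldsymbol{\beta}) + \sigma_n^2} = \dfrac{\tau^{*2}\,(1 + \sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}})}{1 + \tau^{*2}\,\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}$ (if $\lim_{t\to\infty}\theta_s = 0$),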
where $\tau_{\theta_s}$ denotes the degree of bias and $\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}$ is the ratio of information revealed by
noise relative to that revealed by policy. If asymptotically the share of time the simple
are in power goes to 1, their beliefs converge on the true parameter value as their
estimating equation is asymptotically no longer misspecified. But in this case, with a
probability approaching one their intensity must be strictly less than that of the complex,
thereby ensuring that, outside of equilibrium paths whose probability is measure zero,
asymptotically they cannot always be in power. 7 Conversely, if asymptotically the share
of time the complex are in power goes to 1, then the beliefs of the simple suffer an
omitted variable bias that loads up all of the positive effects of complex actions on policy
2 onto the simple's coefficient estimate for the effects of 1. For small $\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}} < 1/\tau^*$, bias
τ0 is necessarily greater than the level τ* needed for equal intensity. So, in this case, again
outside of equilibrium paths whose likelihood is measure zero, the complex cannot
asymptotically always be in power, establishing another contradiction. It follows that for
small enough noise, outside of paths of probability measure zero, we can rule out the
possibility that the limit of ts/t is 0 or 1.
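The two bias levels in this argument can be computed directly, and the limit under permanent complex rule can be checked by simulation. The sketch below holds the complex allocation R fixed at a hypothetical value and uses made-up parameters; the function names, the value of R and the noise levels are all assumptions for illustration.

```python
import math
import random

def bias_levels(beta1, beta2, sigma_n, R):
    """Return (tau_star, tau_0) as defined in (V.3) and (V.8)."""
    bb = beta1 ** 2 + beta2 ** 2
    tau_star = math.sqrt(bb / beta1 ** 2)
    k = R / bb                                   # R / beta'beta
    tau_0 = 1.0 + beta2 ** 2 * k / (beta1 ** 2 * k + sigma_n ** 2)
    return tau_star, tau_0

def simple_estimate_under_complex_rule(beta1, beta2, sigma_n, sigma_e, R, T, seed=0):
    """Regress y on policy 1 alone, as in (V.4), when the complex always govern."""
    rng = random.Random(seed)
    scale = math.sqrt(R / (beta1 ** 2 + beta2 ** 2))
    s11 = s1y = 0.0
    for _ in range(T):
        h1 = beta1 * scale + rng.gauss(0.0, sigma_n)   # desired policy plus noise
        h2 = beta2 * scale + rng.gauss(0.0, sigma_n)
        y = beta1 * h1 + beta2 * h2 + rng.gauss(0.0, sigma_e)
        s11 += h1 * h1
        s1y += h1 * y
    return s1y / s11

# Hypothetical parameter values.
beta1, beta2, sigma_n, R = 1.0, 1.0, 0.05, 1.4
tau_star, tau_0 = bias_levels(beta1, beta2, sigma_n, R)
estimate = simple_estimate_under_complex_rule(beta1, beta2, sigma_n, 0.2, R, T=20000)
```

With these values the noise ratio is well below 1/τ*, so τ0 exceeds τ*, and the misspecified estimate settles close to τ0·β1 rather than β1.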
7For the limit of θs to equal 1, the share of any fixed time interval that simple intensity is greater than
or equal to that of the complex must asymptotically go to 1. However, since the plim of simple intensity is less than that of the complex, the probability that simple intensity is greater than that of the complex in any period must go to zero. Consequently, the probability measure of paths with the characteristic that "the share of any fixed time interval simple intensity is greater than or equal to complex intensity goes to 1" must be zero. These equilibrium paths, and other such mentioned below, are those along which an increasingly unlikely sequence of shocks ε and n keep beliefs from converging on the level implied by the parameters and (in the case of omitted variables) regressors.
We now divide the history of the regressors into that observed during periods
when each type is in power and re-express simple beliefs (V.4) as:
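(V.9)  $\beta_s = \dfrac{\mathbf{h}_{1s}'\mathbf{h}_{1s}}{\mathbf{h}_1'\mathbf{h}_1}\left[\beta_1 + \beta_2\dfrac{\mathbf{h}_{1s}'\mathbf{h}_{2s}}{\mathbf{h}_{1s}'\mathbf{h}_{1s}}\right] + \left(1 - \dfrac{\mathbf{h}_{1s}'\mathbf{h}_{1s}}{\mathbf{h}_1'\mathbf{h}_1}\right)\left[\beta_1 + \beta_2\dfrac{\mathbf{h}_{1c}'\mathbf{h}_{2c}}{\mathbf{h}_{1c}'\mathbf{h}_{1c}}\right] + \dfrac{\mathbf{h}_1'\boldsymbol{\varepsilon}}{\mathbf{h}_1'\mathbf{h}_1}$.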
The probability limit of the last term is, as noted before, zero. Since the limit of ts/t is
neither 0 nor 1, asymptotically each type must be in power an infinite number of times, so
we can state
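(V.10)  $\beta_1 + \beta_2\dfrac{\mathbf{h}_{1s}'\mathbf{h}_{2s}/t_s}{\mathbf{h}_{1s}'\mathbf{h}_{1s}/t_s} \overset{p}{=} \tau_1\beta_1 = \beta_1$  &  $\beta_1 + \beta_2\dfrac{\mathbf{h}_{1c}'\mathbf{h}_{2c}/t_c}{\mathbf{h}_{1c}'\mathbf{h}_{1c}/t_c} \overset{p}{=} \tau_0\beta_1,$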
that is, in the limit the terms in brackets [] in (V.9) are the beliefs that would arise if
asymptotically the simple were always or never in power. It follows that
(V.11)  $\beta_s - \eta\beta_1 - (1-\eta)\tau_0\beta_1 \overset{p}{=} 0$, where $\eta = \dfrac{\mathbf{h}_{1s}'\mathbf{h}_{1s}}{\mathbf{h}_1'\mathbf{h}_1}$.
The limit of the probability that $\beta_s$ deviates by more than a negligible epsilon from
β1(η+(1-η)τ0) is zero. η is the fraction of the squared history of policy 1 that occurred under the
simple's watch, or equivalently the fraction of the information regarding the effectiveness
of that policy revealed when only policy 1 is actively pursued.
There exists a scalar η* such that
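(V.12)  $\eta^* + (1-\eta^*)\tau_0 = \tau^* \;\rightarrow\; \eta^* = \dfrac{\tau_0 - \tau^*}{\tau_0 - 1} = \dfrac{\tau^*\,[1 - \tau^*(\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}})]}{1 + \tau^*} < 1$.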
Asymptotically, with a probability approaching 1, if along a path η > η* simple intensity
will be less than that of the complex, while if η < η* simple intensity will be greater than
that of the complex. As can be seen in (V.11), η is monotonically decreasing when the
complex are in power, since the denominator increases period by period while the
numerator remains constant, and is monotonically increasing when the simple are in
power, as the numerator and denominator are increased by the same amount in each
period. Moreover, as proven in the on-line appendix, the changes in η from one period to
another, Δη, get smaller and smaller as the numerator and denominator grow larger

$$\Delta\eta \overset{p}{=} 0. \tag{V.13}$$

[Figure II: Asymptotic Phase Diagram. A line for η running from 0 to 1 with η* marked. For η < η*: bias is high, the simple are in power, η rises and bias falls. For η > η*: bias is low, the complex are in power, η falls and bias rises.]
The preceding results imply that η converges to η*, and the bias of simple beliefs
converges to the bias τ* consistent with their having the same voting intensity as the
complex. The convergence to this steady state is illustrated in Figure II, which is the
asymptotic phase diagram associated with (V.11) above. Asymptotically, when η, the
fraction of the information concerning policy 1 revealed when the simple are in power, is
less than η*, in all but a probability measure zero of equilibria the bias of the simple is
greater than τ*, the simple are in power and η grows and their bias falls as they find that
their policies are less effective than they thought. When η is greater than η*, the bias of
the simple is less than τ*, the complex are in power, η falls and the bias and voting
intensity of the simple grows, as they load up the successful effects of policy 2 onto
policy 1. Over time the movements in η get smaller and smaller, but cannot converge to
any point other than η*, for were they to do so, the limit of the share of time the simple
are in power would be 0 or 1, which we have already established is not possible.
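The convergence argument can be illustrated with a stylised, deterministic iteration of the phase diagram. This is a sketch under strong assumptions rather than the paper's model: beliefs are replaced by their asymptotic value η + (1-η)τ0, the type with the greater intensity is assumed to rule, squared policy is normalised so that R = 1, and the expression used for τ0 is the bias implied by perpetual complex rule under that normalisation. The function names and the parameter `s` (standing for σn²/R) are hypothetical.

```python
def steady_state(tau_star, s):
    """Return (tau0, eta_star) for steady-state bias tau_star and noise ratio s."""
    # Assumed bias of the simple under perpetual complex rule.
    tau0 = tau_star**2 * (1 + s) / (1 + tau_star**2 * s)
    # eta_star solves eta + (1 - eta) * tau0 = tau_star, as in (V.12).
    eta_star = (tau0 - tau_star) / (tau0 - 1)
    return tau0, eta_star

def simulate(tau_star, s, periods=200_000):
    """Iterate the phase diagram; return (final eta, share of periods under simple rule)."""
    tau0, _ = steady_state(tau_star, s)
    info_simple, info_complex = 1.0, 1.0   # arbitrary positive starting histories
    simple_periods = 0
    for _ in range(periods):
        eta = info_simple / (info_simple + info_complex)
        bias = eta + (1 - eta) * tau0
        if bias > tau_star:
            # Simple rule: squared policy 1 plus noise, R + sigma_n^2 with R = 1.
            info_simple += 1 + s
            simple_periods += 1
        else:
            # Complex rule: smaller squared policy 1, R / tau_star^2 + sigma_n^2.
            info_complex += 1 / tau_star**2 + s
    eta = info_simple / (info_simple + info_complex)
    return eta, simple_periods / periods
```

Calling `simulate(1.5, 0.05)` and comparing the returned η with `steady_state(1.5, 0.05)[1]` shows the oscillations around η* shrinking as the accumulated history grows, in line with the argument that η cannot settle anywhere else.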
One endogenous variable of interest is the share of time the simple asymptotically
are in power. Manipulating the definition of η in (V.11) we have:
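$$\eta^* = \frac{\dfrac{t_s}{t}\dfrac{\mathbf{h}_{1s}'\mathbf{h}_{1s}}{t_s}}{\dfrac{t_s}{t}\dfrac{\mathbf{h}_{1s}'\mathbf{h}_{1s}}{t_s} + \left(1-\dfrac{t_s}{t}\right)\dfrac{\mathbf{h}_{1c}'\mathbf{h}_{1c}}{t_c}} \;\rightarrow\; \frac{t_s}{t} = \frac{\eta^*\dfrac{\mathbf{h}_{1c}'\mathbf{h}_{1c}}{t_c}}{(1-\eta^*)\dfrac{\mathbf{h}_{1s}'\mathbf{h}_{1s}}{t_s} + \eta^*\dfrac{\mathbf{h}_{1c}'\mathbf{h}_{1c}}{t_c}}. \tag{V.14}$$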
As simple beliefs converge on τ*β1, their policies converge on
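$$x_{1s} \overset{p}{=} \tau^*\beta_1\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}, \tag{V.15}$$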
while asymptotically complex policies are given by (V.2), so we see that
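$$\frac{t_s}{t} \overset{p}{=} \frac{\eta^*\left[\beta_1^2\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}} + \sigma_n^2\right]}{(1-\eta^*)\left[\tau^{*2}\beta_1^2\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}} + \sigma_n^2\right] + \eta^*\left[\beta_1^2\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}} + \sigma_n^2\right]} \;\rightarrow\; \theta_s \overset{p}{=} \frac{1 - \tau^*(\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}})}{\tau^* + 1}, \tag{V.16}$$

where we substitute for η* using (V.12) and note that the odds ratio $\theta_c/\theta_s \rightarrow \tau^*$ as $\sigma_n^2 \rightarrow 0$.
(V.16) shows that the asymptotic fraction of the time the simple are in power is a
decreasing function of τ* and $\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}$. τ* is increasing in |β2/β1|, the relative efficacy of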
policy 2. The more effective are the policies that the simple believe irrelevant, the higher
is the asymptotic bias needed for the simple to compete electorally with the complex. To
sustain this bias, the simple spend less time in power, loading up more bias from the
beneficial actions of complex policy in area 2. When there is more background noise
relative to the implementation of policy, the simple also spend less time in power, as the
information given by random variation systematically drives their beliefs towards the
truth, requiring a longer period of complex rule to arrive at the τ* level of bias.
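The comparative statics of (V.16) can be checked numerically. The sketch below is an illustration, not the paper's code: it normalises R to 1, writes `s` for σn²/R, takes τ* = (1 + (β2/β1)²)^½ from the two-policy setting, and compares the long form obtained by substituting η* into (V.14) with the closed form. All function names are hypothetical.

```python
import math

def tau_star(beta1, beta2):
    """Steady-state bias in the two-policy example: sqrt(beta'beta / beta1^2)."""
    return math.sqrt(1 + (beta2 / beta1) ** 2)

def eta_star(tau, s):
    """Closed form for eta* in (V.12), with s = sigma_n^2 / R."""
    return tau * (1 - tau * s) / (tau + 1)

def share_simple_long(tau, s):
    """Share of time under simple rule from (V.14), using limiting squared policies."""
    e = eta_star(tau, s)
    info_simple = 1 + s               # squared policy 1 under simple rule, R = 1
    info_complex = 1 / tau**2 + s     # squared policy 1 under complex rule
    return e * info_complex / ((1 - e) * info_simple + e * info_complex)

def share_simple(tau, s):
    """Closed form for theta_s in (V.16)."""
    return (1 - tau * s) / (1 + tau)
```

For example, `share_simple(tau_star(1.0, 1.0), 0.05)` is smaller than `share_simple(tau_star(1.0, 0.5), 0.05)`: the more effective the omitted policy, the less time the simple spend in power. Raising `s` with τ* fixed lowers the share as well.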
Asymptotically, when the complex are in power the average efficacy of their
policies is
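$$\bar{y}_c = \beta_1 x_{1c} + \beta_2 x_{2c} = \beta_1^2\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} + \beta_2^2\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} = \sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}, \tag{V.17}$$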
while periods when the simple are in power yield the inferior average outcomes
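$$\bar{y}_s = \beta_1 x_{1s} = \tau^*\beta_1^2\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} = \frac{1}{\tau^*}\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}. \tag{V.18}$$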
However, the outcomes simple voters expect under complex and simple rule are
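$$\begin{aligned} E(\bar{y}_c \mid \bar{\beta}_s) &= \bar{\beta}_s x_{1c} = \tau^*\beta_1^2\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} = \frac{1}{\tau^*}\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}} < \bar{y}_c, \\ E(\bar{y}_s \mid \bar{\beta}_s) &= \bar{\beta}_s x_{1s} = \tau^{*2}\beta_1^2\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} = \sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}} > \bar{y}_s. \end{aligned} \tag{V.19}$$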
Simple voters are systematically disappointed by the outcomes of the extreme policies
implemented when their populist politicians are in power. This leads to a gradual
diminution of beliefs and consequent moderation of policy, until those with more
complex views once again take power. However, the surprising success of policy under
the complex gradually convinces simple voters of the value of implementing more
extreme and focused policies, increasing their probability of voting in favour of populist
politicians who advocate narrow and extreme solutions to complex problems.
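A worked numerical example may help fix the gap between realised and expected outcomes in (V.17) to (V.19). The sketch assumes that each type's policy vector is its belief vector scaled to have squared length R, which is how the asymptotic policies are read here; the function name, argument names and parameter values are hypothetical.

```python
import math

def two_policy_outcomes(beta1, beta2, R):
    """Asymptotic outcomes and simple expectations in the 1 vs 2 policy example."""
    bb = beta1**2 + beta2**2            # beta'beta, also the common steady-state intensity
    tau = math.sqrt(bb / beta1**2)      # steady-state bias tau*
    scale = math.sqrt(R / bb)           # assumed policy scaling, so that x'x = R
    x_complex = (beta1 * scale, beta2 * scale)
    x_simple = (tau * beta1 * scale, 0.0)
    belief = tau * beta1                # simple mean belief about policy 1
    return {
        "y_complex": beta1 * x_complex[0] + beta2 * x_complex[1],
        "y_simple": beta1 * x_simple[0],
        "expected_under_complex": belief * x_complex[0],
        "expected_under_simple": belief * x_simple[0],
    }

result = two_policy_outcomes(beta1=1.0, beta2=1.0, R=1.0)
```

With these illustrative values the simple expect from their own rule exactly what the complex actually deliver, and expect from complex rule only what their own policies actually deliver, so they are disappointed in power and pleasantly surprised in opposition.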
VI. Political Competition between Correctly & Incorrectly Specified Models
In this section we consider the equilibrium of generalized political competition
between correctly and incorrectly specified models within the framework described
above. Specifically, we consider an environment in which there are k potential
policies, some of which are relevant and have non-zero effects, and some of which are
irrelevant and have zero effects. While the beliefs of "complex" types are correctly
specified, in that they include all relevant policies, "simple" types erroneously exclude a
subset of these. The prior beliefs of both types may include some irrelevant policies that
have zero effects, and we impose no a priori restriction on the relative number of policies,
ks and kc, each type believes may be relevant, other than that their union covers the set of
k policies that are systematically implemented. The monikers "complex" and "simple"
derive from the fact that the endogenous asymptotic equilibrium looks much like that of
the simple 1 versus 2 policy example given above. Our results can be summarized as:
(R3a) The beliefs of both types regarding policies that are actually irrelevant converge on 0. Consequently, the non-zero beliefs of those with the misspecified model become "simple" relative to the "complex" views of those with the correctly specified model. While the beliefs of the complex converge on true parameter values, the beliefs of the simple converge on a multiple of the true parameter values. Asymptotically the simple implement a narrowed, exaggerated and less effective version of complex policies.
(R3b) All other results mirror the 1 vs 2 policy example given earlier. For sufficiently small noise, the asymptotic equilibrium involves policy cycles, as, with the exception of equilibrium paths of probability measure zero, asymptotically autarchic rule by a single type is not possible. The larger the effects of policies the simple mistakenly exclude relative to those they include, the smaller is the asymptotic share of time the simple are in power, but the more biased are their asymptotic beliefs and the more ineffective, compared to the complex, their policies when in power.
To simplify the presentation and focus on intuition, we assume a steady state
exists and then derive the restrictions on the equilibrium this imposes. Proof that the
polity actually converges to that steady state, while a contribution to the literature on
models with misspecified beliefs in its own right, is technically involved and hence
relegated to the on-line appendix. To review our notation, we use $\mathbf{H} = \mathbf{X} + \mathbf{N}$ to denote
the t x k history of desired policy and iid noise that may affect the common outcome $\mathbf{y}$
through the parameters $\boldsymbol{\beta}$. Each type believes that only a subset $k_i$ of these policies is
relevant and hence uses only the associated t x $k_i$ columns $\mathbf{H}_i$ of $\mathbf{H}$ in the regression model
which determines its mean beliefs $\bar{\boldsymbol{\beta}}_i$. The true parameters associated with these
policies are denoted by $\boldsymbol{\beta}_i$. When necessary to differentiate which type is in power, we
add a second subscript. Thus, steady state policies deemed potentially relevant by type i
when j is in power are indicated by $\mathbf{x}_{ij}$. We use $\mathbf{x}_{\bullet j}$ to denote the full vector of k desired
policy values when type j is in power, including 0s in policies j believes are irrelevant. $\theta_i = t_i/t$
is the share of time type i has been in power up to time t, $\mathbf{I}_m$ and $\mathbf{0}_{m \times n}$ are the square
identity matrix and rectangular matrix of zeros of the subscripted dimensions, and $\overset{p}{=}$
denotes the probability limit. We assume the share of time each type is in power $\theta_i$,
beliefs $\bar{\boldsymbol{\beta}}_i$, and policies $\mathbf{x}_{\bullet i}$ all have well defined probability limits, proof of convergence
to these limits, and all other plims noted below, being given in the on-line appendix.
We begin by establishing some restrictions on the mean beliefs of type i
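$$\begin{aligned} \bar{\boldsymbol{\beta}}_i = (\mathbf{H}_i'\mathbf{H}_i)^{-1}\mathbf{H}_i'\mathbf{y} \;&\rightarrow\; \mathbf{H}_i'\mathbf{H}_i\bar{\boldsymbol{\beta}}_i = \mathbf{H}_i'\mathbf{H}\boldsymbol{\beta} + \mathbf{H}_i'\boldsymbol{\varepsilon} \;\rightarrow\; \frac{\mathbf{H}_i'\mathbf{H}_i}{t}\bar{\boldsymbol{\beta}}_i \overset{p}{=} \frac{\mathbf{H}_i'\mathbf{H}}{t}\boldsymbol{\beta} \\ &\rightarrow\; \left[\frac{\mathbf{X}_i'\mathbf{X}_i}{t} + \frac{\mathbf{X}_i'\mathbf{N}_i}{t} + \frac{\mathbf{N}_i'\mathbf{X}_i}{t} + \frac{\mathbf{N}_i'\mathbf{N}_i}{t}\right]\bar{\boldsymbol{\beta}}_i \overset{p}{=} \left[\frac{\mathbf{X}_i'\mathbf{X}}{t} + \frac{\mathbf{X}_i'\mathbf{N}}{t} + \frac{\mathbf{N}_i'\mathbf{X}}{t} + \frac{\mathbf{N}_i'\mathbf{N}}{t}\right]\boldsymbol{\beta} \\ &\rightarrow\; \left[\frac{\mathbf{X}_i'\mathbf{X}_i}{t} + \sigma_n^2\mathbf{I}_{k_i}\right]\bar{\boldsymbol{\beta}}_i \overset{p}{=} \left[\frac{\mathbf{X}_i'\mathbf{X}}{t} + \sigma_n^2[\mathbf{I}_{k_i}, \mathbf{0}_{k_i \times (k-k_i)}]\right]\boldsymbol{\beta}, \end{aligned} \tag{VI.1}$$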
where the probability limit in the first line follows from the fact that the average of the
product of the components of policy Hi and the random shock to y, ε, is zero, while the
third line follows from the fact that the noise elements of policy N are iid and
independent from the desired elements of policy X. Assuming convergence to steady
state policies and shares of time in power, we can re-state (VI.1) as
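$$\begin{aligned} &\left[\theta_i\mathbf{x}_{ii}\mathbf{x}_{ii}' + \theta_j\mathbf{x}_{ij}\mathbf{x}_{ij}' + \sigma_n^2\mathbf{I}_{k_i}\right]\bar{\boldsymbol{\beta}}_i \overset{p}{=} \left[\theta_i\mathbf{x}_{ii}\mathbf{x}_{\bullet i}' + \theta_j\mathbf{x}_{ij}\mathbf{x}_{\bullet j}' + \sigma_n^2[\mathbf{I}_{k_i}, \mathbf{0}_{k_i \times (k-k_i)}]\right]\boldsymbol{\beta} \\ &\rightarrow\; \theta_i\mathbf{x}_{ii}\underbrace{[\mathbf{x}_{ii}'\boldsymbol{\beta}_i - \mathbf{x}_{ii}'\bar{\boldsymbol{\beta}}_i]}_{PEPE_{ii}} + \theta_j\mathbf{x}_{ij}\underbrace{[\mathbf{x}_{jj}'\boldsymbol{\beta}_j - \mathbf{x}_{ij}'\bar{\boldsymbol{\beta}}_i]}_{PEPE_{ij}} \overset{p}{=} \sigma_n^2[\bar{\boldsymbol{\beta}}_i - \boldsymbol{\beta}_i], \end{aligned} \tag{VI.2}$$

$$\theta_s\mathbf{x}_{ss}\underbrace{[\mathbf{x}_{ss}'\boldsymbol{\beta}_s - \mathbf{x}_{ss}'\bar{\boldsymbol{\beta}}_s]}_{PEPE_{ss}} + \theta_c\mathbf{x}_{sc}\underbrace{[\mathbf{x}_{cc}'\boldsymbol{\beta}_c - \mathbf{x}_{sc}'\bar{\boldsymbol{\beta}}_s]}_{PEPE_{sc}} \overset{p}{=} \sigma_n^2[\bar{\boldsymbol{\beta}}_s - \boldsymbol{\beta}_s], \tag{VI.2s}$$

$$\theta_c\mathbf{x}_{cc}\underbrace{[\mathbf{x}_{cc}'\boldsymbol{\beta}_c - \mathbf{x}_{cc}'\bar{\boldsymbol{\beta}}_c]}_{PEPE_{cc}} + \theta_s\mathbf{x}_{cs}\underbrace{[\mathbf{x}_{cs}'\boldsymbol{\beta}_c - \mathbf{x}_{cs}'\bar{\boldsymbol{\beta}}_c]}_{PEPE_{cs}} \overset{p}{=} \sigma_n^2[\bar{\boldsymbol{\beta}}_c - \boldsymbol{\beta}_c], \tag{VI.2c}$$

where we use the notation $PEPE_{ij}$ to denote the "policy effects prediction error" of type i
when type j is in power, and the fact that $\mathbf{x}_{\bullet i}'\boldsymbol{\beta}$ equals $\mathbf{x}_{ii}'\boldsymbol{\beta}_i$ and $\mathbf{x}_{ss}'\boldsymbol{\beta}_s = \mathbf{x}_{cs}'\boldsymbol{\beta}_c$. In the case
of the last, while $\mathbf{x}_{ss}$ and $\mathbf{x}_{cs}$ are $k_s$ x 1 and $k_c$ x 1 vectors of policy actions deemed relevant
by each type when the simple are in power, the equality follows from the fact that the
policies the simple act on that are not included in $k_c$ have zero effects, while the elements
included in $\boldsymbol{\beta}_c$ that are non-zero but not included in $k_s$ have 0 simple policy actions.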
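We begin by focusing on the beliefs of the complex, alternately pre-multiplying
(VI.2c) by $\mathbf{x}_{cs}'$ and $\mathbf{x}_{cc}'$

$$\begin{aligned} (\theta_s\mathbf{x}_{cs}'\mathbf{x}_{cs} + \sigma_n^2)[\mathbf{x}_{cs}'\boldsymbol{\beta}_c - \mathbf{x}_{cs}'\bar{\boldsymbol{\beta}}_c] + (\theta_c\mathbf{x}_{cs}'\mathbf{x}_{cc})[\mathbf{x}_{cc}'\boldsymbol{\beta}_c - \mathbf{x}_{cc}'\bar{\boldsymbol{\beta}}_c] &\overset{p}{=} 0 \\ (\theta_s\mathbf{x}_{cc}'\mathbf{x}_{cs})[\mathbf{x}_{cs}'\boldsymbol{\beta}_c - \mathbf{x}_{cs}'\bar{\boldsymbol{\beta}}_c] + (\theta_c\mathbf{x}_{cc}'\mathbf{x}_{cc} + \sigma_n^2)[\mathbf{x}_{cc}'\boldsymbol{\beta}_c - \mathbf{x}_{cc}'\bar{\boldsymbol{\beta}}_c] &\overset{p}{=} 0, \end{aligned} \tag{VI.3}$$

and then combining equations to derive the restrictions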
and then combining equations to derive the restrictions
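$$\begin{aligned} \underbrace{[\mathbf{x}_{cs}'\boldsymbol{\beta}_c - \mathbf{x}_{cs}'\bar{\boldsymbol{\beta}}_c]}_{PEPE_{cs}}\,\frac{(\theta_s\mathbf{x}_{cs}'\mathbf{x}_{cs} + \sigma_n^2)(\theta_c\mathbf{x}_{cc}'\mathbf{x}_{cc} + \sigma_n^2) - \theta_s\theta_c(\mathbf{x}_{cs}'\mathbf{x}_{cc})(\mathbf{x}_{cc}'\mathbf{x}_{cs})}{\theta_c\mathbf{x}_{cc}'\mathbf{x}_{cc} + \sigma_n^2} &\overset{p}{=} 0 \\ \underbrace{[\mathbf{x}_{cc}'\boldsymbol{\beta}_c - \mathbf{x}_{cc}'\bar{\boldsymbol{\beta}}_c]}_{PEPE_{cc}}\,\frac{(\theta_s\mathbf{x}_{cs}'\mathbf{x}_{cs} + \sigma_n^2)(\theta_c\mathbf{x}_{cc}'\mathbf{x}_{cc} + \sigma_n^2) - \theta_s\theta_c(\mathbf{x}_{cs}'\mathbf{x}_{cc})(\mathbf{x}_{cc}'\mathbf{x}_{cs})}{\theta_s\mathbf{x}_{cs}'\mathbf{x}_{cs} + \sigma_n^2} &\overset{p}{=} 0, \end{aligned} \tag{VI.4}$$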
where we make use of the fact that each PEPE is multiplied by a strictly positive number
in at least one equation. By the Cauchy-Schwarz inequality, the numerator in the fraction
in both equations is strictly positive, so these imply that both policy effects prediction
errors are asymptotically zero, which in turn, from (VI.2c), implies that $\bar{\boldsymbol{\beta}}_c \overset{p}{=} \boldsymbol{\beta}_c$.⁸ Since
the complex causal model is correctly specified, with minimal noise their parameter
estimates are consistent and converge on the true parameter values.
We now focus on simple beliefs, substituting into (VI.2s) using the dependence of
policies on beliefs
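$$\theta_s\frac{R_{\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s}}{\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s}\bar{\boldsymbol{\beta}}_s[\bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s - \bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s] + \theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_s[\boldsymbol{\beta}'\boldsymbol{\beta} - \boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s] \overset{p}{=} \sigma_n^2[\bar{\boldsymbol{\beta}}_s - \boldsymbol{\beta}_s], \tag{VI.5}$$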
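where we have used the fact that complex beliefs converge on true parameter values and,
to simplify and clarify matters, that $\boldsymbol{\beta}_c'\boldsymbol{\beta}_c = \boldsymbol{\beta}'\boldsymbol{\beta}$, as the policies the complex consider
irrelevant do indeed have zero effects. If $\theta_s \overset{p}{=} 1$, we pre-multiply by $\bar{\boldsymbol{\beta}}_s'$ to see

$$\frac{R_{\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s}}{\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s}\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s[\bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s - \bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s] \overset{p}{=} \sigma_n^2[\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s - \bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s] \;\rightarrow\; \bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s \overset{p}{=} \bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s \;\xrightarrow{\text{by (VI.5)}}\; \bar{\boldsymbol{\beta}}_s \overset{p}{=} \boldsymbol{\beta}_s. \tag{VI.6}$$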
If with a probability approaching one asymptotically the simple are always in power,
their model is not misspecified (as only simple policies are systematically implemented)
and given negligible amounts of noise their beliefs converge on true parameter values. In
this case, however, we have $\bar{\boldsymbol{\beta}}_c'\bar{\boldsymbol{\beta}}_c \overset{p}{=} \boldsymbol{\beta}'\boldsymbol{\beta} > \boldsymbol{\beta}_s'\boldsymbol{\beta}_s \overset{p}{=} \bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s$, i.e. complex intensity must be
strictly greater than that of the simple as simple beliefs exclude some relevant policies
that have non-zero effects. If so, we have a contradiction, as asymptotically, with a
probability approaching 1, the simple will not be in power, thereby contradicting the
initial assumption that $\theta_s \overset{p}{=} 1$.
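⁸ As noted earlier, absent noise there are multiple equilibria based on the linear restrictions imposed by the restrictions on the PEPEs.

We now consider the case where $\theta_c \overset{p}{=} 1$, pre-multiplying (VI.5) by $\boldsymbol{\beta}_s'$ and defining
$\tau^* = (\boldsymbol{\beta}'\boldsymbol{\beta}/\boldsymbol{\beta}_s'\boldsymbol{\beta}_s)^{1/2}$ we find

$$\begin{aligned} &\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_s'\boldsymbol{\beta}_s[\boldsymbol{\beta}'\boldsymbol{\beta} - \boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s] \overset{p}{=} \sigma_n^2[\boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s - \boldsymbol{\beta}_s'\boldsymbol{\beta}_s] \;\rightarrow\; \boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s\left[\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\tau^{*2}} + \sigma_n^2\right] \overset{p}{=} \boldsymbol{\beta}_s'\boldsymbol{\beta}_s\left[R_{\boldsymbol{\beta}'\boldsymbol{\beta}} + \sigma_n^2\right] \\ &\xrightarrow{\text{by (VI.5)}}\; \bar{\boldsymbol{\beta}}_s \overset{p}{=} \frac{\tau^{*2}(\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}} + 1)}{1 + \tau^{*2}(\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}})}\boldsymbol{\beta}_s, \end{aligned} \tag{VI.7}$$

and see that simple beliefs are proportional to $\boldsymbol{\beta}_s$. As the information provided by noise
relative to that provided by complex policy, $\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}$, goes to zero, these results imply
that $\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s \overset{p}{=} \tau^{*4}\boldsymbol{\beta}_s'\boldsymbol{\beta}_s = \tau^{*2}\boldsymbol{\beta}'\boldsymbol{\beta} > \boldsymbol{\beta}'\boldsymbol{\beta} \overset{p}{=} \bar{\boldsymbol{\beta}}_c'\bar{\boldsymbol{\beta}}_c$, so the simple have strictly greater intensity
than the complex. We conclude that if $\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}$ is sufficiently small, asymptotically with
a probability approaching 1 the complex are not in power, so the initial assumption that
$\theta_c \overset{p}{=} 1$ is invalid. Combined with the previous paragraph, this establishes that
asymptotically autarchic rule by a single type is not possible.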
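The case of perpetual complex rule can be illustrated numerically. The sketch below is a hypothetical check, not the authors' code: it takes the proportionality factor from (VI.7), verifies that the implied beliefs satisfy (VI.5) with θc = 1, and reports simple intensity relative to complex intensity. It assumes there are no policies that only the simple deem relevant; the function and argument names are invented for the example.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def perpetual_complex_rule(beta_included, beta_omitted, R, sigma2):
    """Simple beliefs if the complex always govern.

    beta_included: true effects of the policies in the simple model (beta_s)
    beta_omitted:  true effects of the policies the simple wrongly exclude
    Returns (proportionality factor, simple/complex intensity ratio, residual of (VI.5)).
    """
    bsbs = dot(beta_included, beta_included)
    bb = bsbs + dot(beta_omitted, beta_omitted)          # beta'beta
    tau_star_sq = bb / bsbs
    s = sigma2 / R
    factor = tau_star_sq * (s + 1) / (1 + tau_star_sq * s)   # from (VI.7)
    belief = [factor * b for b in beta_included]
    # Residual of (VI.5) with theta_c = 1; it should vanish if the factor is right.
    gap = bb - dot(beta_included, belief)
    residual = [(R / bb) * b * gap - sigma2 * (bh - b)
                for b, bh in zip(beta_included, belief)]
    return factor, dot(belief, belief) / bb, residual
```

With little noise, for instance `perpetual_complex_rule([1.0, 0.5], [1.0], 1.0, 0.01)`, the intensity ratio exceeds 1, so the simple would out-vote the complex and perpetual complex rule cannot persist. With `sigma2=1.0` the ratio in this example falls below 1, which is why the result is stated for sufficiently small noise.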
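For $\theta_c \overset{p}{>} 0$ and $\theta_s \overset{p}{>} 0$ both to be true, asymptotically the intensity of the simple
and the complex must be the same, i.e. $\bar{\boldsymbol{\beta}}_s'\bar{\boldsymbol{\beta}}_s \overset{p}{=} \bar{\boldsymbol{\beta}}_c'\bar{\boldsymbol{\beta}}_c \overset{p}{=} \boldsymbol{\beta}'\boldsymbol{\beta}$. Using this, we substitute for
equilibrium policies in (VI.5)

$$\begin{aligned} &\theta_s\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\bar{\boldsymbol{\beta}}_s[\bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s - \boldsymbol{\beta}'\boldsymbol{\beta}] + \theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_s[\boldsymbol{\beta}'\boldsymbol{\beta} - \boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s] \overset{p}{=} \sigma_n^2[\bar{\boldsymbol{\beta}}_s - \boldsymbol{\beta}_s] \\ &\rightarrow\; \bar{\boldsymbol{\beta}}_s\left[\theta_s\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}(\boldsymbol{\beta}'\boldsymbol{\beta} - \bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s) + \sigma_n^2\right] \overset{p}{=} \boldsymbol{\beta}_s\left[\theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}(\boldsymbol{\beta}'\boldsymbol{\beta} - \boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s) + \sigma_n^2\right] \\ &\rightarrow\; \bar{\boldsymbol{\beta}}_s \overset{p}{=} \tau^*\boldsymbol{\beta}_s, \quad \text{where } \tau^* = \left(\frac{\boldsymbol{\beta}'\boldsymbol{\beta}}{\boldsymbol{\beta}_s'\boldsymbol{\beta}_s}\right)^{1/2} \overset{p}{=} \frac{\theta_c\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}[\boldsymbol{\beta}'\boldsymbol{\beta} - \boldsymbol{\beta}_s'\bar{\boldsymbol{\beta}}_s] + \sigma_n^2}{\theta_s\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}[\boldsymbol{\beta}'\boldsymbol{\beta} - \bar{\boldsymbol{\beta}}_s'\boldsymbol{\beta}_s] + \sigma_n^2}. \end{aligned} \tag{VI.8}$$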
Simple beliefs asymptotically are proportional to true parameter values and, given that
simple intensity equals that of the complex, the factor of proportionality must be τ*.
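Using this, we can solve for $\theta_s$ by substituting $\bar{\boldsymbol{\beta}}_s \overset{p}{=} \tau^*\boldsymbol{\beta}_s$ in the expression for τ* in (VI.8)

$$\tau^* \overset{p}{=} \frac{\theta_c\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}[\boldsymbol{\beta}'\boldsymbol{\beta} - \tau^*\boldsymbol{\beta}_s'\boldsymbol{\beta}_s] + \sigma_n^2}{\theta_s\dfrac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}[\boldsymbol{\beta}'\boldsymbol{\beta} - \tau^*\boldsymbol{\beta}_s'\boldsymbol{\beta}_s] + \sigma_n^2} \;\rightarrow\; \theta_s \overset{p}{=} \frac{1 - \tau^*(\sigma_n^2/R_{\boldsymbol{\beta}'\boldsymbol{\beta}})}{1 + \tau^*}, \tag{VI.9}$$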
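which is the exact same result as in the 1 vs 2 policy model of the last section. Finally,
we note that complex policies yield an average outcome of $\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}$, simple policies an
average outcome of $(1/\tau^*)\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}$, and the simple over and underestimate the
effectiveness of their own and complex policies by

$$\underbrace{\mathbf{x}_{ss}'\boldsymbol{\beta}_s - \mathbf{x}_{ss}'\bar{\boldsymbol{\beta}}_s}_{PEPE_{ss}} \overset{p}{=} \sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}\left[\frac{1}{\tau^*} - 1\right] \quad \& \quad \underbrace{\mathbf{x}_{cc}'\boldsymbol{\beta}_c - \mathbf{x}_{sc}'\bar{\boldsymbol{\beta}}_s}_{PEPE_{sc}} \overset{p}{=} \sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}\left[1 - \frac{1}{\tau^*}\right], \tag{VI.10}$$

which also follow the results of the previous section.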
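The step from the fixed-point condition to the closed form for θs can be checked by solving the condition numerically. The sketch below is illustrative only: it normalises R to 1, writes `s` for σn²/R, uses (β'β - τ*βs'βs)/β'β = 1 - 1/τ* (which follows from the definition of τ*), and finds the root by bisection. Function names are hypothetical.

```python
def fixed_point_gap(theta_s, tau, s):
    """Right-hand side of the tau* condition minus tau*, with theta_c = 1 - theta_s."""
    g = 1 - 1 / tau                      # (beta'beta - tau* beta_s'beta_s) / beta'beta
    return ((1 - theta_s) * g + s) / (theta_s * g + s) - tau

def solve_share_simple(tau, s, tol=1e-12):
    """Bisection on [0, 1]; the gap is decreasing in theta_s and changes sign when s < 1/tau."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fixed_point_gap(mid, tau, s) > 0:
            lo = mid                     # ratio still too large: raise theta_s
        else:
            hi = mid
    return (lo + hi) / 2

def closed_form_share_simple(tau, s):
    """The closed form stated for theta_s."""
    return (1 - tau * s) / (1 + tau)
```

For τ* = 1.5 and s = 0.05 both routes give a share of time under simple rule of 0.37, and as s goes to zero the odds ratio θc/θs approaches τ*.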
To summarize, generalized political competition between types which correctly
and incorrectly specify the causal model for y produces an equilibrium which matches in
all relevant respects that of the 1 vs 2 policy example of the previous section. The beliefs
of the type with the correctly specified causal model converge on the true parameter
values and asymptotically they implement a broad and "complex" set of policies across all
relevant policy instruments, while setting irrelevant policy instruments to zero. The
beliefs of the type with an incorrectly specified model converge on a multiple of the true
parameter values for the policy instruments they believe are relevant. They correctly set
policy to zero in areas where policy is actually irrelevant, but implement an exaggerated
policy agenda in a narrow and "simple" set of relevant policies, systematically
overestimating the effectiveness of their preferred policies and underestimating the
effectiveness of the complex policy agenda. Neither perpetual rule by the simple nor
(with sufficiently small noise) the complex is possible, and the asymptotic equilibrium
involves policy cycles where periods of complex rule increase simple bias, recurrently
returning the simple to power, where they implement systematically less effective
versions of complex policies, thereby disappointing their intensely motivated voters and
ensuring their eventual electoral defeat.
VII. Local Dynamics: Random Outcomes and the Political Cycle
A peculiar characteristic of political life seems to be that random outcomes
benefit or harm incumbent parties. In this section we show that this feature arises in our
model through the fully rational Bayesian updating of beliefs. Random shocks change
estimates of the effectiveness of policy, but these effects are stronger for the incumbent
party which is implementing its desired policy combination. Specifically, we show that
(R4) Although the long run equilibrium involves cycles with types alternating in power, a random negative shock to y lowers the voting intensity of incumbent groups relative to their opposition, hastening regime change, while random positive shocks to y strengthen the political position of incumbents, lengthening their stay in power in the current political cycle.
To allow an examination of period by period beliefs, we restore notation with respect to
time, with the t x k matrix $\mathbf{H}_t$ denoting the history of policy up to time t, the vector $\mathbf{h}_t'$ the
t-th row thereof, and $\mathbf{H}_{it}$ and $\mathbf{h}_{it}'$ the corresponding histories and t-th period policies that type
i deems relevant. We focus on outcomes in the vicinity of the steady state and, to
simplify the analysis, with negligible amounts of policy noise. As elsewhere, formal
proofs of probability limits stated below are given in the on-line appendix.
The formula for mean Bayesian beliefs, based as it is upon regression coefficients,
allows a simple representation of the updating of beliefs from period t to t +1
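$$\begin{aligned} \bar{\boldsymbol{\beta}}_{it+1} &= (\mathbf{H}_{it+1}'\mathbf{H}_{it+1})^{-1}\mathbf{H}_{it+1}'\mathbf{y}_{t+1} \\ &= \left[(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1} - \frac{(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}\mathbf{h}_{it+1}'(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}}{1 + \mathbf{h}_{it+1}'(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}}\right](\mathbf{H}_{it}'\mathbf{y}_t + \mathbf{h}_{it+1}y_{t+1}) \\ &= \bar{\boldsymbol{\beta}}_{it} - \frac{(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}\mathbf{h}_{it+1}'\bar{\boldsymbol{\beta}}_{it}}{1 + \mathbf{h}_{it+1}'(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}} + \frac{(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}y_{t+1}}{1 + \mathbf{h}_{it+1}'(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}} \\ &= \bar{\boldsymbol{\beta}}_{it} + \frac{(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}}{1 + \mathbf{h}_{it+1}'(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1}}[y_{t+1} - \mathbf{h}_{it+1}'\bar{\boldsymbol{\beta}}_{it}], \end{aligned} \tag{VII.1}$$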
where in the second line we make use of the Sherman-Morrison formula for the rank one
update of a matrix inverse.9 The term in brackets [] in the last line is the period t + 1
prediction error based upon beliefs at the end of period t. As the beliefs of the complex
encompass all non-zero policy effects and converge on the true parameter values, their
prediction error converges on the random component of y, $y_{t+1} - \mathbf{h}_{ct+1}'\bar{\boldsymbol{\beta}}_{ct} = \mathbf{h}_{ct+1}'\boldsymbol{\beta}_c + \varepsilon_{t+1} - \mathbf{h}_{ct+1}'\bar{\boldsymbol{\beta}}_{ct} \overset{p}{=} \varepsilon_{t+1}$, while asymptotically the prediction error of the simple contains both that
random component and the systematic under and overprediction of outcomes under each
regime discussed earlier, $y_{t+1} - \mathbf{h}_{st+1}'\bar{\boldsymbol{\beta}}_{st} \overset{p}{=} \mathbf{h}_{t+1}'\boldsymbol{\beta} - \tau^*\mathbf{h}_{st+1}'\boldsymbol{\beta}_s + \varepsilon_{t+1}$. Our analysis in this
section focuses on the impact of $\varepsilon_{t+1}$.
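The recursive form in the last line of (VII.1) can be verified against a full re-estimation. The sketch below is a generic illustration of the Sherman-Morrison update with two regressors and pure-Python matrix algebra; the data and function names are hypothetical and carry no economic meaning.

```python
def gram(rows):
    """H'H for a list of 2-element rows."""
    return [[sum(r[i] * r[j] for r in rows) for j in range(2)] for i in range(2)]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def ols(rows, y):
    """Regression coefficients (H'H)^{-1} H'y."""
    g_inv = inv2(gram(rows))
    hy = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(2)]
    return [g_inv[i][0] * hy[0] + g_inv[i][1] * hy[1] for i in range(2)]

def recursive_update(beta, g_inv, h_new, y_new):
    """Last line of (VII.1): old beliefs plus a gain times the prediction error."""
    gh = [g_inv[i][0] * h_new[0] + g_inv[i][1] * h_new[1] for i in range(2)]
    denom = 1 + h_new[0] * gh[0] + h_new[1] * gh[1]
    error = y_new - (h_new[0] * beta[0] + h_new[1] * beta[1])
    return [beta[i] + gh[i] * error / denom for i in range(2)]

# Hypothetical policy history and outcomes.
H = [[1.0, 0.5], [0.8, 1.2], [1.1, 0.9], [0.9, 1.0], [1.3, 0.4]]
y = [1.6, 2.1, 1.9, 2.0, 1.5]

beta_t = ols(H[:4], y[:4])
beta_recursive = recursive_update(beta_t, inv2(gram(H[:4])), H[4], y[4])
beta_batch = ols(H, y)
```

The recursively updated coefficients match those from re-running the regression on all five periods, so beliefs move only through the period t + 1 prediction error, as the text states.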
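As $\mathbf{H}_{it}'\mathbf{H}_{it} = \mathbf{X}_{it}'\mathbf{X}_{it} + \mathbf{X}_{it}'\mathbf{N}_{it} + \mathbf{N}_{it}'\mathbf{X}_{it} + \mathbf{N}_{it}'\mathbf{N}_{it}$, we use the fact that policies and the
share of time each type spends in power converge to steady state values, that noise shocks
are mutually independent, and that the averaged cross-products of desired policy and
policy noise converge on 0 to calculate the following probability limits

$$\begin{aligned} &\frac{\mathbf{H}_{ct}'\mathbf{H}_{ct}}{t} \overset{p}{=} \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}' & \mathbf{C} \end{bmatrix} \text{ and } \frac{\mathbf{H}_{st}'\mathbf{H}_{st}}{t} \overset{p}{=} \begin{bmatrix} \mathbf{A} & \mathbf{0}_{k_b \times k_{\sim c}} \\ \mathbf{0}_{k_{\sim c} \times k_b} & \sigma_n^2\mathbf{I}_{k_{\sim c}} \end{bmatrix}, \\ &\text{with } \mathbf{A} = (\theta_s\tau^{*2} + \theta_c)\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_b\boldsymbol{\beta}_b' + \sigma_n^2\mathbf{I}_{k_b}, \quad \mathbf{B} = \theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_b\boldsymbol{\beta}_{\sim s}', \quad \mathbf{C} = \theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_{\sim s}\boldsymbol{\beta}_{\sim s}' + \sigma_n^2\mathbf{I}_{k_{\sim s}}, \\ &\text{while } \mathbf{h}_{it+1}'(\mathbf{H}_{it}'\mathbf{H}_{it})^{-1}\mathbf{h}_{it+1} = \frac{1}{t}\mathbf{h}_{it+1}'\left(\frac{\mathbf{H}_{it}'\mathbf{H}_{it}}{t}\right)^{-1}\mathbf{h}_{it+1} \overset{p}{=} 0, \end{aligned} \tag{VII.2}$$

and where we use the subscript ~i to denote policies that each type deems irrelevant,
subscript b to denote the policies both deem relevant, $k_{\sim i}$ and $k_b$ the number of such
policies, and make use of the fact that simple beliefs and policies in areas the complex
deem irrelevant (~c) converge to the true parameter values of 0. These results allow us to
express the asymptotic change in beliefs as

⁹ Specifically, $(\mathbf{V} + \mathbf{x}\mathbf{x}')^{-1} = \mathbf{V}^{-1} - \mathbf{V}^{-1}\mathbf{x}\mathbf{x}'\mathbf{V}^{-1}/(1 + \mathbf{x}'\mathbf{V}^{-1}\mathbf{x})$.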
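$$\begin{aligned} \bar{\boldsymbol{\beta}}_{st+1} - \bar{\boldsymbol{\beta}}_{st} &\overset{p}{=} \frac{1}{t}\begin{bmatrix} \mathbf{A} & \mathbf{0}_{k_b \times k_{\sim c}} \\ \mathbf{0}_{k_{\sim c} \times k_b} & \sigma_n^2\mathbf{I}_{k_{\sim c}} \end{bmatrix}^{-1}\mathbf{h}_{st+1}[\mathbf{h}_{t+1}'\boldsymbol{\beta} - \tau^*\mathbf{h}_{st+1}'\boldsymbol{\beta}_s + \varepsilon_{t+1}], \\ \bar{\boldsymbol{\beta}}_{ct+1} - \bar{\boldsymbol{\beta}}_{ct} &\overset{p}{=} \frac{1}{t}\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}' & \mathbf{C} \end{bmatrix}^{-1}\mathbf{h}_{ct+1}\varepsilon_{t+1}. \end{aligned} \tag{VII.3}$$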
(VII.3) shows that in the limit changes in beliefs are of order O(1/t) times a random
variable with a finite variance, so we can asymptotically approximate the change in the
intensity of each type as:
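$$\bar{\boldsymbol{\beta}}_{it+1}'\bar{\boldsymbol{\beta}}_{it+1} - \bar{\boldsymbol{\beta}}_{it}'\bar{\boldsymbol{\beta}}_{it} = (\bar{\boldsymbol{\beta}}_{it+1} + \bar{\boldsymbol{\beta}}_{it})'(\bar{\boldsymbol{\beta}}_{it+1} - \bar{\boldsymbol{\beta}}_{it}) \overset{p}{=} 2\bar{\boldsymbol{\beta}}_{it}'(\bar{\boldsymbol{\beta}}_{it+1} - \bar{\boldsymbol{\beta}}_{it}). \tag{VII.4}$$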
Finally, we note the formula for a block matrix inverse and calculate the limits of some
useful quadratic forms as policy noise goes to zero:
- 33 -
[ ] .0/0lim/limlim
,)/(
lim
/)/(1
/)/(1limlim
,)()/()(
lim
/)/()(1
/)/()(1limlim
,)()(
)()()5.VII(
2
0
2~~
0~
12~
0
~~2
~~
0
~2~~
4~~
2~0
~1
~0
2*2*20
22*
42*
20
1
0
11111111
111111
22~2
2
~22
2
22
==′=′
′=
′′+′
=
′′+′′
−′=′
+′
=′′++
′=
′′++′′+
−′=′
′−′+′−′−′−−′−
=
′
→→
−
→
′′→
′
′
→
−
→
′′→
′
′
→
−
→
−−−−−−−−
−−−−−−
nncccknc
csscn
ss
s
nssc
nssc
k
n
sss
csbbcsn
bb
b
nbbcs
nbbcs
k
n
bbb
nnc
n
n
snn
n
bnn
RR
R
R
RR
R
R
σσσ
θθσ
σθσθ
σ
θτθθτθσ
σθτθσθτθ
σ
σσσ
σ
σσ
σ
σσ
βββIβ
ββ
ββββ
ββ
βββββ
ββββIββCβ
ββ
ββββ
ββ
βββββ
ββββIββAβ
BCBBCABCCBBCABC
BCBBCABBCA
CB
BA
ββββ
ββ
ββ
ββββ
ββ
ββ
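Since we are considering the limit as the variance of policy noise goes to zero, we also
take $\mathbf{h}_{t+1}$ in (VII.3) as equal to $\mathbf{x}_{t+1}$, the intentional policy vector of that time period.
Asymptotically the simple implement policies $\tau^*\boldsymbol{\beta}_s\sqrt{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}/\boldsymbol{\beta}'\boldsymbol{\beta}}$ for the policies they
believe are relevant and $\mathbf{0}_{k_{\sim s} \times 1}$ for those they believe are irrelevant, so using the preceding
results the change in the intensity of both types when the simple are in power is

$$\begin{aligned} \lim_{\sigma_n^2 \to 0}\operatorname{plim}\; t \cdot 2\bar{\boldsymbol{\beta}}_{st}'(\bar{\boldsymbol{\beta}}_{st+1} - \bar{\boldsymbol{\beta}}_{st}) &= \lim_{\sigma_n^2 \to 0} 2\tau^{*2}\left[\boldsymbol{\beta}_b'\mathbf{A}^{-1}\boldsymbol{\beta}_b + \boldsymbol{\beta}_{\sim c}'\boldsymbol{\beta}_{\sim c}/\sigma_n^2\right]\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}\left[\tau^*\boldsymbol{\beta}_s'\boldsymbol{\beta}_s\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} - \tau^{*2}\boldsymbol{\beta}_s'\boldsymbol{\beta}_s\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} + \varepsilon_{t+1}\right] \\ &= \frac{2\tau^*[1 - \tau^*]\boldsymbol{\beta}'\boldsymbol{\beta}}{\theta_s\tau^{*2} + \theta_c} + \frac{2\tau^{*2}\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}}{\theta_s\tau^{*2} + \theta_c}\varepsilon_{t+1}, \\ \lim_{\sigma_n^2 \to 0}\operatorname{plim}\; t \cdot 2\bar{\boldsymbol{\beta}}_{ct}'(\bar{\boldsymbol{\beta}}_{ct+1} - \bar{\boldsymbol{\beta}}_{ct}) &= \lim_{\sigma_n^2 \to 0} 2\,[\boldsymbol{\beta}_b', \boldsymbol{\beta}_{\sim s}']\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}' & \mathbf{C} \end{bmatrix}^{-1}\begin{bmatrix} \boldsymbol{\beta}_b \\ \mathbf{0}_{k_{\sim s} \times 1} \end{bmatrix}\tau^*\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}\varepsilon_{t+1} \\ &= \lim_{\sigma_n^2 \to 0} 2\Big(1 - \underbrace{\theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_{\sim s}'\mathbf{C}^{-1}\boldsymbol{\beta}_{\sim s}}_{=1 \text{ by (VII.5)}}\Big)\boldsymbol{\beta}_b'(\mathbf{A} - \mathbf{B}\mathbf{C}^{-1}\mathbf{B}')^{-1}\boldsymbol{\beta}_b\,\tau^*\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}\varepsilon_{t+1} = 0. \end{aligned} \tag{VII.6}$$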
The first term for the simple represents the systematic tendency for their intensity to
decline when in power, as they respond to the overprediction of average outcomes. The
εt+1 term, for the simple and the complex, represents the effect of random shocks to y.
Here, a negative shock reduces the intensity of the simple, as their belief in the
effectiveness of the policies they deem relevant falls. Complex beliefs in these same
policies also fall, but the complex belief in the efficacy of policies the simple deem
irrelevant, and hence do not implement, rises, as the poor outcome under simple rule
convinces the complex that these neglected policies are more effective than previously
thought. These two effects offset each other, and complex intensity remains constant. In
sum, a negative shock lowers the relative political intensity of the simple, hastening the
transfer of power, with positive shocks having the opposite effect.
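When the complex are in power asymptotically they implement policies
$\boldsymbol{\beta}_c\sqrt{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}/\boldsymbol{\beta}'\boldsymbol{\beta}}$ for the policies they believe are relevant and $\mathbf{0}_{k_{\sim c} \times 1}$ for those they believe
are irrelevant, so the changes in intensity are seen to be

$$\begin{aligned} \lim_{\sigma_n^2 \to 0}\operatorname{plim}\; t \cdot 2\bar{\boldsymbol{\beta}}_{st}'(\bar{\boldsymbol{\beta}}_{st+1} - \bar{\boldsymbol{\beta}}_{st}) &= \lim_{\sigma_n^2 \to 0} 2\tau^*\left[\boldsymbol{\beta}_b'\mathbf{A}^{-1}\boldsymbol{\beta}_b + \boldsymbol{\beta}_{\sim c}'\mathbf{0}_{k_{\sim c} \times 1}/\sigma_n^2\right]\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}\left[\boldsymbol{\beta}'\boldsymbol{\beta}\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} - \tau^*\boldsymbol{\beta}_s'\boldsymbol{\beta}_s\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}} + \varepsilon_{t+1}\right] \\ &= \frac{2[\tau^* - 1]\boldsymbol{\beta}'\boldsymbol{\beta}}{\theta_s\tau^{*2} + \theta_c} + \frac{2\tau^*\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}}{\theta_s\tau^{*2} + \theta_c}\varepsilon_{t+1}, \\ \lim_{\sigma_n^2 \to 0}\operatorname{plim}\; t \cdot 2\bar{\boldsymbol{\beta}}_{ct}'(\bar{\boldsymbol{\beta}}_{ct+1} - \bar{\boldsymbol{\beta}}_{ct}) &= \lim_{\sigma_n^2 \to 0} 2\,[\boldsymbol{\beta}_b', \boldsymbol{\beta}_{\sim s}']\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}' & \mathbf{C} \end{bmatrix}^{-1}\begin{bmatrix} \boldsymbol{\beta}_b \\ \boldsymbol{\beta}_{\sim s} \end{bmatrix}\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}\varepsilon_{t+1} \\ &= \lim_{\sigma_n^2 \to 0} 2\Big[\Big(1 - \underbrace{\theta_c\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}\boldsymbol{\beta}_{\sim s}'\mathbf{C}^{-1}\boldsymbol{\beta}_{\sim s}}_{=1 \text{ by (VII.5)}}\Big)^2\boldsymbol{\beta}_b'(\mathbf{A} - \mathbf{B}\mathbf{C}^{-1}\mathbf{B}')^{-1}\boldsymbol{\beta}_b + \boldsymbol{\beta}_{\sim s}'\mathbf{C}^{-1}\boldsymbol{\beta}_{\sim s}\Big]\sqrt{\frac{R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}{\boldsymbol{\beta}'\boldsymbol{\beta}}}\varepsilon_{t+1} \\ &= \frac{2\sqrt{\boldsymbol{\beta}'\boldsymbol{\beta}/R_{\boldsymbol{\beta}'\boldsymbol{\beta}}}}{\theta_c}\varepsilon_{t+1}. \end{aligned} \tag{VII.7}$$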
Once again, the change in simple beliefs contains a systematic component, this time
consisting of the gradual increase in bias and intensity as outcomes under the complex are
consistently better than expected. Both simple and complex respond to the realization of
the output shock ε, but the impact on the intensity of the complex is greater as, given that
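$\theta_c/\theta_s \rightarrow \tau^*$ as $\sigma_n^2 \rightarrow 0$, we have

$$\lim_{\sigma_n^2 \to 0}\frac{\tau^*}{\theta_s\tau^{*2} + \theta_c} = \frac{\tau^*}{\theta_c(\tau^* + 1)} < \frac{1}{\theta_c}. \tag{VII.8}$$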
A negative shock reduces the belief in the effectiveness of policies of both types, but the
effects on intensity are greater for the complex, for whom intensity depends upon a wider
range of policies, all of which are seen to be failing. Consequently, negative shocks
accelerate regime change, ushering in further negative outcomes as the simple implement
misguidedly narrow and intense policies, while positive shocks lengthen the time the
complex hold onto power and the polity continues to benefit from a full range of
moderate policy actions.
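The relative sensitivity of each type's intensity to an outcome shock can be tabulated from the coefficients on the shock in the limiting expressions above. The sketch below is illustrative: it takes the zero-noise shares θs = 1/(1 + τ*) and θc = τ*/(1 + τ*), drops the common scale factor (β'β/R)^½, and uses hypothetical function and key names.

```python
def shock_coefficients(tau):
    """Coefficients on the outcome shock in the scaled change of intensity.

    All values are in units of sqrt(beta'beta / R) and apply in the zero-noise limit.
    """
    theta_s = 1 / (1 + tau)
    theta_c = tau / (1 + tau)
    denom = theta_s * tau**2 + theta_c       # equals tau at these shares
    return {
        ("simple in power", "simple"): 2 * tau**2 / denom,
        ("simple in power", "complex"): 0.0,
        ("complex in power", "simple"): 2 * tau / denom,
        ("complex in power", "complex"): 2 / theta_c,
    }

coefficients = shock_coefficients(1.5)
```

In both regimes the incumbent's intensity responds more strongly to the shock than the opposition's: for τ* = 1.5 the values are 3 against 0 under simple rule and roughly 3.33 against 2 under complex rule, so a negative shock always tilts relative intensity against whoever is in power.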
VIII. Conclusion
Our analysis has shown how simplistic beliefs can persist in political competition
against a more accurate and complex view of the world, delivering sub-par outcomes on
each outing in power and yet returning to dominate the political landscape over and over
again. In the framework presented above simplistic beliefs arise as a consequence of a
primitive assumption of misspecification, but we recognize that there are deeper
questions to explore. A recent examination of European Social Survey data by Guiso et
al (2017) finds that the responsiveness of the electorate to populist ideas and the supply of
populist politicians increases in periods of economic insecurity. Social and economic
transformation, and the insecurity and inequality it can engender, may create
environments in which opportunistic politicians are able to plant erroneously simplistic
world views into the electorate. Linking belief formation, at its most fundamental level,
to ongoing economic and political events allows a richer characterization of political
cycles, and is something we intend to explore in future work.
Bibliography
Acemoglu, Daron, Georgy Egorov and Konstantin Sonin (2013). "A Political Theory of Populism". Quarterly Journal of Economics: 771-805.
Anderson, T.W. (2003). An Introduction to Multivariate Statistical Analysis. Third Edition. New Jersey: John Wiley & Sons.
Arrow, Kenneth J. and Jerry R. Green (1973). "Notes on Expectations Equilibria in Bayesian Settings." Working paper #33, Institute for Mathematical Studies in the Social Sciences, Stanford University.
Bernhardt, Dan, Stefan Krasa, and Mehdi Shadmehr (2019). "Demagogues and the Fragility of Democracy." Manuscript, University of Illinois, Urbana.
Bohren, J. Aislinn and Daniel N. Hauser (2019). "Social Learning with Model Misspecification: A Framework and a Characterization." Manuscript, 2019.
Bray, Margaret (1982). "Learning, Estimation, and the Stability of Rational Expectations." Journal of Economic Theory 26: 318-339.
Dahrendorf, Ralf (2007). "Acht Anmerkungen zum Populismus". Eurozine 2007 (first published Transit 25, 2003).
Di Tella, Rafael and Julio J. Rotemberg (2016). "Populism and the Return of the `Paranoid Style': Some Evidence and a Simple Model of Demand for Incompetence as Insurance against Elite Betrayal". Manuscript, Harvard Business School.
Dornbusch, Rudiger and Sebastian Edwards (1991). "The Macroeconomics of Populism." In Dornbusch, Rudiger and Sebastian Edwards, eds, The Macroeconomics of Populism in Latin America. Chicago, University of Chicago Press.
Eliaz, Kfir and Ran Spiegler (2019). "A Model of Competing Narratives." Manuscript, 2019.
Esponda, Ignacio (2008). "Behavioral Equilibrium in Economies with Adverse Selection." American Economic Review 98 (4): 1269-1291.
Esponda, Ignacio and Demian Pouzo (2016). "Berk-Nash Equilibrium: A Framework for Modeling Agents with Misspecified Models." Econometrica 84 (3): 1093-1130.
Esponda, Ignacio, Demian Pouzo and Yuichi Yamamoto (2019). "Asymptotic Behavior of Bayesian Learners with Misspecified Models." Manuscript, 2019.
Frick, Mira, Ryota Iijima and Yuhta Ishii (2019). "Stability and Robustness in Misspecified Learning Models." Manuscript, 2019.
Gidron, Noam and Bart Bonikowski (2013). "Varieties of Populism: Literature and Research Agenda." Manuscript, Weatherhead Center for International Affairs, Harvard University.
Guiso, Luigi, Helios Herrera, Massimo Morelli, and Tommaso Sonno (2017). "Demand and Supply of Populism." Manuscript, Einaudi Institute for Economics and Finance.
Heidhues, Paul, Botond Kőszegi and Philipp Strack (2018). "Convergence in Misspecified Learning Models with Endogenous Actions." Manuscript, 2018.
Mailath, George J. and Larry Samuelson. "Learning under Diverse World Views: Model-Based Inference." PIER working paper 19-018, University of Pennsylvania.
Molavi, Pooya (2019). "Macroeconomics with Learning and Misspecification: A General Theory and Applications." Manuscript, 2019.
Montiel Olea, José Luis, Pietro Ortoleva, Mallesh Pai and Andrea Prat (2019). "Competing Models". Manuscript, 2019.
Morelli, Massimo, Antonio Nicoló and Paolo Roberti (2020). "Populism as Simplistic Commitment." Manuscript.
Mudde, Cas and Cristóbal Rovira Kaltwasser (2017). Populism: A Very Short Introduction. Oxford: Oxford University Press.
Nyarko, Yaw. "Learning in Mis-specified Models and the Possibility of Cycles." Journal of Economic Theory 55 (2): 416-427.
Schmidt, Peter (1976). Econometrics. New York: Marcel Dekker, 1976.
Zellner, Arnold (1971). An Introduction to Bayesian Inference in Econometrics. New