Personalized Information Aggregation and
Polarization
Lin Hu∗ Anqi Li† Ilya Segal‡
This Draft: April 2022
Abstract
We study how personalized information aggregation for rationally inatten-
tive voters (IARI) affects the polarization of policies and public opinion. In
a two-candidate electoral competition model, an attention-maximizing info-
mediary aggregates source data about candidates’ valences into easy-to-digest
information. Voters decide whether to consume information, trading off the ex-
pected gain from improved expressive voting against the attention cost. IARI
generates policy polarization even if candidates are office-motivated. Person-
alized information aggregation makes extreme voters the disciplining entity of
policy polarization, and the skewness of their signals is crucial for sustaining a
high degree of policy polarization in equilibrium. Analysis of disciplining voters
yields insights into the polarization effects of regulating infomediaries. Apply-
ing our theory to the study of a Hotelling duopoly model shows that IARI
∗ Research School of Finance, Actuarial Studies and Statistics, Australian National University, [email protected].
† Olin Business School, Washington University in St. Louis, [email protected].
‡ Department of Economics, Stanford University, [email protected].
arXiv:1910.11405v13 [econ.GN] 10 Apr 2022
renders the principle of minimum differentiation invalid even in the absence
of price competition, and that endogenizing firms’ locations makes the welfare
consequences of regulating infomediaries less clear-cut.
Keywords: ..., infomediary regulation, Hotelling duopoly, principle of minimum differentiation
JEL codes: D72, D80, L10
1 Introduction
Recently, the idea that tech-enabled information personalization could affect polarization
has been put forward in academia and the popular press (Sunstein (2009);
Pariser (2011); Gentzkow (2016)). This paper studies how personalized information
aggregation for rationally inattentive voters affects the polarization of policies and
public opinion in an electoral competition model.
Our premise is that rational demand for information aggregation in the digital era
is driven by information processing costs. As the Internet and social media become
important sources of information, and the amount of available information there (2.5
quintillion bytes) is vastly greater than what any individual can process in a lifetime,
decision-makers must turn to infomediaries for information aggregation, personalized
based on their individual data such as demographic and psychographic attributes,
digital footprints, and social network positions.1 In this paper, we abstract from
the issue of data generation (e.g., original reporting), focusing instead on the role of
infomediaries in aggregating source data into information that is easy to process and
useful for the target audience.
We develop a model of information aggregation for rationally inattentive decision-
makers (IARI), in which an infomediary can flexibly aggregate source data into in-
formation using algorithm-driven systems. While flexibility is also assumed in the
1 An infomediary is an internet company that gathers and links information on particular subjects on behalf of its customers. Prominent examples of infomediaries include news aggregators (e.g., aggregator sites, social media feeds, mobile news apps), which operate by sifting through a myriad of online sources and discovering stories that readers might find interesting. The snippets (i.e., headlines and excerpts) displayed on their platforms contain coarse information and do not always result in the click-through of original contents (Dellarocas et al. (2016)). A major revenue source for them comes from displaying ads to users while the latter are browsing through snippets (an exception is Google News, which directs readers to the main Google search engine where product ads are displayed).
News aggregators have recently gained prominence as more people get information online, from social media, and through mobile devices (Matsa and Lu (2016)). The top three popular news websites in 2019 (Yahoo! News, Google News, and Huffington Post) are all aggregators. See Athey, Mobius, and Pal (2021) for background reviews.
Rational Inattention (RI) model pioneered by Sims (1998) and Sims (2003), there
decision-makers can aggregate information optimally themselves and so have no need
for external aggregators. To model the demand for infomediaries, we assume that
decision-makers can only choose whether to absorb the information offered to them
but cannot digest information partially or selectively, let alone aggregate information
optimally themselves. While this assumption is certainly stylized, it is the simplest
one that creates a role for infomediaries.2
If choosing to consume information, a decision-maker incurs an attention cost
that is posterior separable (Caplin and Dean (2013)) while deriving utilities from
improved decision-making. Consuming information is optimal if the expected utility
gain exceeds the attention cost. As for the infomediary, we assume that its goal is
to maximize the total amount of attention paid by decision-makers, interpreted as
the advertising revenue generated from consumer eyeballs. This stylized assumption
captures the key trade-off faced by the infomediary, who uses useful and easy-to-
process information to attract decision-makers’ attention while preventing them from
tuning out. While we focus on the case of a monopolistic infomediary in order to
capture the market power wielded by tech giants, we also investigate an extension
to perfectly competitive infomediaries which, together with personalization, becomes
equivalent to decision-makers optimally aggregating information themselves as in the
standard RI model.
We embed the IARI model into an electoral competition game in which two office-
motivated candidates choose policies on a left-right spectrum. Voters vote expres-
sively based on policies, as well as an uncertain valence state about which candidate
2 As pointed out by Stromberg (2015), the last assumption is implicitly made by the media literature because without it the role of information provider would be much more limited. It isn’t at odds with reality, since analyses of page activities (e.g., scrolling, viewport time) have established significant user attention in the reading of online news, in particular the snippets thereof (Dellarocas et al. (2016); Lagun and Lalmas (2016)).
is more fit for office. Information about candidate valence is provided by an infome-
diary. We study how IARI affects the polarization of equilibrium policies and voter
opinions in this game.
A consequence of IARI is that signal realizations prescribe recommendations as
to which candidate one should vote for. Indeed, any information beyond voting rec-
ommendations would only raise the attention cost without any corresponding benefit
for voters and would thus turn away voters whose participation constraints bind at
the optimum. Furthermore, voters must strictly prefer to obey the recommendations
given to them, a property we refer to as strict obedience. Indeed, if a voter has
a (weakly) preferred candidate that is independent of his voting recommendations,
then he could always vote for that candidate without paying attention, which saves
the attention cost.
An important implication of strict obedience is that local deviations from a sym-
metric policy profile wouldn’t change voters’ voting decisions regardless of the recom-
mendations they receive, which suggests that a positive degree of policy polarization
could arise in equilibrium even if candidates are office-motivated. We define pol-
icy polarization as the maximal distance between candidates’ positions among all
symmetric perfect Bayesian equilibria. In the baseline model featuring left-leaning,
centrist, and right-leaning voters, our main theorem shows that policy polarization is
strictly positive and equals the disciplining voter’s policy latitude.
A voter’s policy latitude is an index that captures his resistance to candidates’
policy deviations. It decreases with the voter’s horizontal preference for the deviating
candidate’s policies and increases with his pessimism about the deviating candidate’s
valence following unfavorable information. A voter is said to be disciplining if his
policy latitude determines policy polarization. To illustrate how personalized infor-
mation aggregation affects the identity of the disciplining voter, we compare two
cases: (i) broadcast information aggregation, in which the infomediary must offer a
single signal structure to all voters, and (ii) personalized information aggregation, in
which the infomediary can design different signal structures for different voters. In the
broadcast case, all voters receive the same voting recommendation, so a candidate’s
deviation is profitable, i.e., strictly increases his winning probability, if and only if
it attracts a majority coalition. Under the usual assumptions, this is equivalent to
attracting centrist voters, who are therefore disciplining. In the personalized case, the
infomediary can provide conditionally independent signals to different voters (this as-
sumption will be relaxed), so each type of voter is pivotal with a positive probability
when voters’ population distribution is sufficiently dispersed. In that case, a policy
deviation is shown to be profitable if and only if it attracts any type of voter, and
voters with the smallest policy latitude are disciplining because they are the easiest
to attract.
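The pivotality argument in this paragraph can be checked by brute force. The following Python sketch enumerates the recommendation profiles that can arise under each technology and reports which voter types are pivotal, in the sense that changing that type's vote alone changes the election outcome. It is only an illustration under simplifying assumptions that are ours: all voters of a given type vote as a bloc, and the population shares and helper names (`winner`, `pivotal_types`) are hypothetical.

```python
from itertools import product

TYPES = (-1, 0, 1)  # left-leaning, centrist, right-leaning


def winner(votes, q):
    """Return 'L', 'R', or 'tie' under majority rule, given each type's vote."""
    share_r = sum(q[k] for k in TYPES if votes[k] == "R")
    share_l = sum(q[k] for k in TYPES if votes[k] == "L")
    if abs(share_r - share_l) < 1e-12:
        return "tie"
    return "R" if share_r > share_l else "L"


def pivotal_types(q, profiles):
    """Types whose unilateral vote change alters the outcome in some profile."""
    found = set()
    for profile in profiles:
        votes = dict(zip(TYPES, profile))
        for k in TYPES:
            flipped = dict(votes)
            flipped[k] = "L" if votes[k] == "R" else "R"
            if winner(flipped, q) != winner(votes, q):
                found.add(k)
    return found


# Hypothetical shares: symmetric, and no single type holds a majority.
q = {-1: 0.3, 0: 0.4, 1: 0.3}

# Broadcast: every type receives the same recommendation.
broadcast_profiles = [("L", "L", "L"), ("R", "R", "R")]
# Personalized, conditionally independent signals: every profile can occur.
personalized_profiles = list(product("LR", repeat=3))

print("broadcast:", pivotal_types(q, broadcast_profiles))
print("personalized:", pivotal_types(q, personalized_profiles))
```

With these shares, no single type is pivotal in the broadcast case (a deviation must attract a majority coalition), whereas every type is pivotal for some profile in the personalized case. If centrists instead held a majority (e.g., shares 0.2, 0.6, 0.2), only the centrist type would be pivotal, which is one way to read the requirement that the population distribution be sufficiently dispersed.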
The skewness of extreme voters’ personalized signals is crucial for sustaining a
greater degree of policy polarization as information aggregation becomes personalized.
To maximize the usefulness of information consumption for an extreme voter, the
recommendation to vote across party lines must be very strong and, in order to
prevent the voter from tuning out, must also be very rare (hereafter an occasional
big surprise). Most of the time, the recommendation is to vote along party lines
(hereafter a predisposition reinforcement), which together with the occasional big
surprise has been documented in the empirical literature.3 When base voters are
disciplining, the occasional big surprise of their signal makes them difficult to attract
in the rare event where information is unfavorable to their own-party candidate.
3 Recently, Flaxman, Goel, and Rao (2016) find that the use of news aggregators mostly reinforces people’s predispositions, but it also strengthens their opinion intensities when supporting the opposite-party candidates (i.e., occasional big surprise). Evidence for predisposition reinforcement is discussed in Fiorina and Abrams (2008) and Gentzkow (2016). Evidence for occasional big surprise and, more generally, Bayesian voters is surveyed by DellaVigna and Gentzkow (2010).
When base voters are so pessimistic about their own-party candidate’s valence that
even the most attractive deviation to them is still not attractive enough, opposition
voters become disciplining, even though they are also difficult to attract due to their
preferences against the deviating candidate’s policies. If, in the end, all voters end
up having bigger policy latitudes than the centrist voters in the broadcast case, then
the personalization of information aggregation increases policy polarization. We find
this to be the case when the attention cost is Shannon entropy-based, the attention
cost parameter is large, and extreme voters have strong policy preferences.
Analyses of the disciplining voter yield insights into the policy polarization effect
of recent regulatory proposals to tame the tech giants. In addition to the personaliza-
tion of information aggregation—the reversal of which is a plausible consequence of
limiting tech companies’ access to users’ personal data (The General Data Protection
Regulation (2016); Warren (2019))—we study the consequences of introducing perfect
competition to infomediaries. This regulatory proposal is advocated by the British
government as a preferable way of regulating tech giants (The Digital Competition
Expert Panel (2019)), and it is mathematically equivalent to increasing voters’ at-
tention cost parameter in the monopolistic personalized case. Its policy polarization
effect is negative, because increasing the attention cost parameter tempers voters’
beliefs about candidate valence and so reduces their policy latitudes.
Our analysis suggests that factors carrying negative connotations in our everyday
discourse could have unintended consequences for policy polarization. An example is
increasing mass polarization, which we model as a mean-preserving spread of voters’
policy preferences (Gentzkow (2016)). With personalized information aggregation,
increasing mass polarization can reduce policy polarization rather than increasing it:
as we keep redistributing voters’ population from the center to the margin, policy
polarization would eventually decrease from the centrist voters’ policy latitude to the
minimal policy latitude among all voters.
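To make the notion of a mean-preserving spread concrete, the Python sketch below moves population mass from the center to the two margins in equal amounts and confirms that the mean policy preference is unchanged while its variance rises. The positions, the initial shares, and the function names are hypothetical values chosen only for illustration.

```python
def mean_and_variance(positions, shares):
    """Mean and variance of voters' preferred policies under the given shares."""
    mean = sum(t * q for t, q in zip(positions, shares))
    var = sum(q * (t - mean) ** 2 for t, q in zip(positions, shares))
    return mean, var


def spread(shares, eps):
    """Move mass eps away from the center, half of it to each extreme."""
    left, center, right = shares
    if not 0 <= eps <= center:
        raise ValueError("eps must lie between 0 and the centrist share")
    return (left + eps / 2, center - eps, right + eps / 2)


positions = (-1.0, 0.0, 1.0)   # hypothetical t(-1), t(0), t(1)
before = (0.2, 0.6, 0.2)       # hypothetical q(-1), q(0), q(1)
after = spread(before, 0.2)    # mass moved from the center to the margins

print("before:", mean_and_variance(positions, before))
print("after: ", mean_and_variance(positions, after))
```

In this example the centrist share falls below one half after the spread, so centrists alone no longer form a majority.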
In Online Appendix O.1, we extend the baseline model to encompass general voters
and arbitrary correlation structures between their personalized signals. We develop a
methodology for analyzing this general model and discover, among other things, that
(i) correlation can only increase polarization, and (ii) polarization is minimized under
the uniform population distribution and conditionally independent signal distribution.
Thus our baseline result prescribes the exact lower bound for the polarization effect
of information personalization, while factors that preserve this lower bound (e.g.,
enriching voters’ types, dividing the same type of voters into multiple subgroups)
wouldn’t render polarization trivial.
We apply our theory to the study of horizontal product differentiation with person-
alized information aggregation in a Hotelling duopoly model. Among other things,
we find that IARI renders the principle of minimum differentiation (as posited by
Hotelling (1929)) invalid even in the absence of price competition, and that the wel-
fare consequences of regulating infomediaries become less clear-cut once firms’ loca-
tion choices become endogenous.
1.1 Related literature
Rational inattention The literature on rational inattention pioneered by Sims
(1998) and Sims (2003) assumes that decision-makers can optimally aggregate source
data into signals themselves. To create a role for infomediaries, we assume that the
aggregator is designed and operated by an attention-maximizing infomediary, whereas
voters must fully absorb the information given to them. Apart from this departure
from the RI paradigm, we otherwise follow the standard model of posterior-separable
attention cost that nests Shannon entropy as a special case. Posterior separability
(Caplin and Dean (2013)) has recently received attention from economists because of
its axiomatic and revealed-preference foundations (Zhong (2019); Caplin and Dean
(2015)), connections to sequential sampling (Morris and Strack (2017); Hebert and
Woodford (2018)), and validations by lab experiments (Dean and Neligh (2019)).
Filtering bias The idea of filtering bias (i.e., that even rational consumers can exhibit
a preference for biased information when constrained by information processing
capacities) dates back to Calvert (1985a) and was later expanded on by Suen (2004),
Burke (2008), and Che and Mierendorff (2019), among others. While these models
predict a predisposition reinforcement, they work with non-RI information aggre-
gation technologies and do not examine the consequences of information bias for
electoral competition. Even if they did (see, e.g., Chan and Suen (2008)), their pre-
dictions could still differ from ours, as we will soon explain.
Probabilistic voting models In most existing probabilistic voting models, voters’
signals are assumed to be continuously distributed, so even small changes in candi-
dates’ positions could affect voters’ voting decisions (see Duggan (2017) for a survey).
Under this assumption, Calvert (1985b) first establishes policy convergence between
office-seeking candidates and then pioneers the use of policy preference for gener-
ating policy polarization between candidates (hereafter the Calvert-Wittman logic).
Strict obedience stands in sharp contrast to this assumption, although it is a natural
consequence of IARI.
There is a growing literature on platform competition with personalized informa-
tion. The current work differs from the existing studies in two main aspects. First,
the information structures of our interest and their properties are new to the liter-
ature. Second, we embed the analysis in a plain probabilistic voting model (akin
to the first model of Calvert (1985b) and many others surveyed by Duggan (2017)),
where candidates are office-motivated and the only source of uncertainty is their va-
lence shock. Together, these modeling choices generate new predictions that even the
closest works to ours ignore.4
Chan and Suen (2008) study a model of personalized media in which voters care
about whether the realization of a random state variable is above or below their
personal thresholds, and information is provided by media outlets that partition the
state space using threshold rules. A consequence of working with this information
aggregation technology rather than IARI is that signal realizations are monotone in
voters’ thresholds (i.e., if a left-leaning voter is recommended to vote for candidate
R, then a right-leaning voter must receive the same recommendation), hence centrist
voters are always disciplining despite a pluralism of media.5 We instead predict that
the disciplining voter can vary with model primitives and will discuss the implications
of this prediction in Section 6.
More recently, Matejka and Tabellini (2020) and Yuksel (2022) study electoral
competition models with personalized information acquisition.6 In the model studied
by Matejka and Tabellini (2020), voters face normal uncertainties about candidates’
policies that do not directly enter their utility functions. Information acquisition
takes the form of variance reduction, generating signals that violate strict obedience
and sustain policy polarization only if the cost of information acquisition differs across
candidates. The current work differs from that of Matejka and Tabellini (2020) in the
4 Loosely related works include those assuming exogenous signal structures and those in which signals are disclosed strategically by campaigning candidates (see, e.g., Glaeser, Ponzetto, and Shapiro (2005) and Herrera, Levine, and Martinelli (2008)).
5 Starting from there, the analysis of Chan and Suen (2008) differs completely from ours. In particular, it exploits the Calvert-Wittman logic, which we do not rely on.
6 The flexibility in choosing among a large variety of signal structures is crucial for these studies and ours but is absent from existing political models with rigid information acquisition, e.g., Martinelli (2006).
source of uncertainty, the attention technology, and the driving force behind policy
polarization. Yuksel (2022) studies a variant of the Calvert-Wittman model where
voter learning takes the form of partitioning a multi-dimensional issue space. Aside
from these modeling differences that set our reasonings apart,7 none of our predictions
(as previewed in the introduction) can be made in Yuksel (2022).
Competition for consumers with limited attention A growing literature in IO
and marketing (surveyed by Spiegler (2016) and Iyer, Soberman, and Villas-Boas (2005))
studies the competition between firms for consumers with limited attention (LA).
Most papers we are aware of model LA as limited consideration sets, and they focus
on price and quality competitions rather than spatial competition. Notable exceptions
include Matejka and McKay (2012) studying the price competition between firms for
RI consumers; Sauer, Schlatterer, and Schmitt (2019) studying a Hotelling duopoly
model with LA consumers (who can only observe positions in subintervals of the real
line); and Perego and Yuksel (2022) studying the spatial competition between media
companies for RI news consumers located on a Salop circle.
The remainder of the paper proceeds as follows: Section 2 introduces the baseline
model; Section 3 conducts equilibrium analyses; Section 4 reports extensions of the
baseline model; Section 5 gives a further application of our theory; Section 6 con-
cludes. See Appendices A-C and the online appendices for additional materials and
mathematical proofs.
7 In particular, Yuksel (2022)’s reasoning exploits the multi-dimensionality of the issue space and the Calvert-Wittman logic. Our results hold regardless of the dimensionality of the underlying state (see Footnote 21), and they do not exploit the Calvert-Wittman logic.
2 Baseline model
In this section, we first lay out the model setup and then discuss the main assumptions.
2.1 Setup
Two office-motivated candidates named L and R can adopt policies on the real
line. They face a unit mass of infinitesimal voters who are left-leaning (k = −1),
centrist (k = 0), or right-leaning (k = 1). Each type k ∈ K = {−1, 0, 1} of voter
has a population q(k) > 0 and values a policy a ∈ ℝ by u(a, k) = −|t(k) − a|. The
environment is symmetric, in that q(1) = q(−1) and t(1) > t(0) = 0 > t(−1) = −t(1).
Thus a centrist voter is also a median voter.
At the end of the game, the society holds an election, in which the majority winner
wins the election, and ties are broken evenly between the two candidates. During the
election, each voter must vote expressively for one of the two candidates. For any
given profile a = 〈a_L, a_R〉 ∈ ℝ² of positions, a type k voter earns the following utility
difference from voting for candidate R rather than L:
\[ v(a, k) + \omega. \]
In the above expression,
\[ v(a, k) = u(a_R, k) - u(a_L, k) \]
captures the voter’s differential valuation of the candidates’ policies, whereas ω is an
uncertain valence state about which candidate is more fit for office. In the baseline
model, ω takes values in Ω = {−1, 1} with equal probability,8 so its prior mean
8E.g., in the ongoing debate about how to battle terrorism, ω = −1 if the state favors the use of
equals zero. Online Appendix O.3 examines the case of a continuum of states.
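The preference specification above can be illustrated numerically. The Python sketch below evaluates u(a, k), the differential valuation v(a, k), and the realized utility difference v(a, k) + ω at a symmetric policy profile. The bliss points t(k) and the value of a are hypothetical numbers chosen for illustration, and the function names simply mirror the notation in the text.

```python
def u(a, t_k):
    """Policy utility u(a, k) = -|t(k) - a| of a voter with bliss point t(k)."""
    return -abs(t_k - a)


def v(a_L, a_R, t_k):
    """Differential valuation v(a, k) = u(a_R, k) - u(a_L, k)."""
    return u(a_R, t_k) - u(a_L, t_k)


# Hypothetical bliss points satisfying t(1) > t(0) = 0 > t(-1) = -t(1).
T = {-1: -1.0, 0: 0.0, 1: 1.0}
a = 0.25  # symmetric profile <-a, a>

for k in (-1, 0, 1):
    v_k = v(-a, a, T[k])
    # Utility difference from voting R rather than L in each valence state.
    diffs = {omega: v_k + omega for omega in (-1, 1)}
    print(f"k={k:+d}: v={v_k:+.2f}, v+omega={diffs}")
```

At this profile, a left-leaning voter without further information strictly prefers candidate L (v < 0 and the prior mean of ω is zero), a centrist voter is indifferent, and a right-leaning voter strictly prefers candidate R.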
When casting votes, voters observe candidates’ policies but not directly the re-
alization of the valence state. Information about the valence state is modeled as a
finite signal structure (or simply signal) Π : Ω→ ∆ (Z), where each Π (· | ω) specifies
a probability distribution over a finite set Z of signal realizations conditional on the
state realization being ω ∈ Ω. Information is provided by a monopolistic infomedi-
ary who is equipped with a segmentation technology S. S is a partition of voters’
types, and each cell of it is called a market segment. The infomediary can distinguish
between voters from different market segments but not those within the same mar-
ket segment. Our focus is on the coarsest and finest partitions named the broadcast
technology b = {K} and personalized technology p = {{k} : k ∈ K}, respectively: the
former cannot distinguish between the various types of the voters at all, whereas the
latter can do so perfectly.
Under segmentation technology S ∈ {b, p}, the infomediary designs |S| signals,
one for each market segment. Within each market segment, voters decide whether
to consume the signal that is offered to them. Consuming a signal Π means fully
absorbing its information content. Doing so incurs an attention cost λ · I (Π), where
λ > 0 is called the attention cost parameter, and I (Π) is the needed amount of
attention for absorbing the information content of Π.9 After that, voters observe
signal realizations, update their beliefs about the valence state, and cast votes. The
infomediary’s profit equals the total amount of attention paid by voters.
The game sequence is summarized as follows.
1. The infomediary designs signal structures; voters observe the signal structures
soft power (e.g., diplomatic tactics), and ω = 1 if the state favors the use of hard power (e.g., military preemption). Candidates L and R are experienced with using soft and hard power, respectively, and whoever is more experienced with handling the circumstances has an advantage over his opponent.
9 According to Prat and Stromberg (2013), instrumental voting is an important motive for consuming political information.
offered to them and make consumption decisions accordingly.
2. Candidates choose policies without observing the moves in Stage 1.
3. The valence state is realized.
4. Voters observe policies and signal realizations before casting votes.
We adopt perfect Bayesian equilibrium (PBE) as the solution concept. Our goal is
to characterize all PBEs in which candidates propose symmetric policy profiles of the form
〈−a, a〉 with a ≥ 0 in Stage 2 of the game.
2.2 Discussion of Assumptions
Attention cost We state our assumption about the attention cost function. Recall
that a signal structure Π : Ω → ∆ (Z) specifies how source data about the valence
state are (randomly) aggregated into the content indexed by the signal realizations
in Z. For each z ∈ Z, let
\[ \pi_z = \sum_{\omega \in \Omega} \Pi(z \mid \omega) / 2 \]
denote the probability that the signal realization is z, and assume without loss of
generality (w.l.o.g.) that π_z > 0. Then
\[ \mu_z = \sum_{\omega \in \Omega} \omega \cdot \Pi(z \mid \omega) / (2\pi_z) \]
is the posterior mean of the valence state conditional on the signal realization being
z, and it fully captures one’s posterior belief after observing z. The next assumption
is standard in the RI literature.
Assumption 1. The needed amount of attention for consuming Π : Ω→ ∆ (Z) is
\[ I(\Pi) = \sum_{z \in Z} \pi_z \cdot h(\mu_z), \tag{1} \]
where h : [−1, 1]→ R+ (i) is strictly convex and satisfies h (0) = 0, (ii) is continuous
on [−1, 1] and twice differentiable on (−1, 1), and (iii) is symmetric around zero.
Equation (1) coupled with Assumption 1(i) is equivalent to weak posterior separa-
bility (WPS), a notion proposed by Caplin and Dean (2013) to generalize Shannon’s
entropy as a measure of attention cost. In the current setting, WPS stipulates that
consuming null signals requires no attention, and that more attention is needed for
moving the posterior belief closer to the true state and as the signal becomes more
Blackwell-informative (hence attention is a scarce resource that reduces uncertainties
about the valence state). Together with the regularities imposed by Assumption 1(ii)
and (iii), WPS is satisfied by many standard attention cost functions,10 and it will
be relaxed in Section 4.
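As a numerical illustration of Equation (1), the Python sketch below computes the realization probabilities π_z, the posterior means µ_z, and the attention I(Π) for a given signal structure. The two cost functions follow the variance and Shannon entropy examples of Footnote 10. The entropy-based version is written here as the entropy reduction log 2 − H((1 + µ)/2), measured in nats, so that h(0) = 0. That normalization, the dictionary-based representation of Π, and the example signals are our own assumptions.

```python
import math

STATES = (-1, 1)  # valence states, each with prior probability 1/2


def marginals_and_posteriors(signal):
    """signal[w][z] = Pi(z | w). Returns {z: (pi_z, mu_z)} for pi_z > 0."""
    out = {}
    for z in signal[STATES[0]]:
        pi_z = sum(signal[w][z] for w in STATES) / 2
        if pi_z > 0:
            mu_z = sum(w * signal[w][z] for w in STATES) / (2 * pi_z)
            out[z] = (pi_z, mu_z)
    return out


def h_variance(mu):
    """Reduction in the variance of the state: h(mu) = mu^2."""
    return mu ** 2


def h_entropy(mu):
    """Reduction in Shannon entropy (nats): log 2 - H((1 + mu) / 2)."""
    p = (1 + mu) / 2
    entropy = -sum(x * math.log(x) for x in (p, 1 - p) if x > 0)
    return math.log(2) - entropy


def attention(signal, h):
    """I(Pi) = sum_z pi_z * h(mu_z), as in Equation (1)."""
    return sum(pi * h(mu) for pi, mu in marginals_and_posteriors(signal).values())


# Hypothetical signals: one informative, one null (same distribution in both states).
informative = {1: {"L": 0.2, "R": 0.8}, -1: {"L": 0.8, "R": 0.2}}
null = {1: {"L": 0.5, "R": 0.5}, -1: {"L": 0.5, "R": 0.5}}

print(marginals_and_posteriors(informative))
print(attention(informative, h_variance), attention(informative, h_entropy))
print(attention(null, h_variance), attention(null, h_entropy))
```

Consistent with the discussion of WPS, the null signal requires no attention under either cost function, while the informative signal requires a strictly positive amount.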
Modeling assumptions We discuss the main modeling assumptions. See Section
4 for minor assumptions and how they can be relaxed.
Consider first the infomediary, which for concreteness’ sake can be thought of as
a news aggregator. In Footnote 1, we already detailed the business model of news
aggregators. Here we repeat four noteworthy facts.
1. The content provided by news aggregators is usually very coarse, e.g., snippets
that include a title and a few summary sentences.
10 Examples include the reductions in the variance and Shannon entropy of the valence state before and after information consumption, in which cases h(µ) = µ² and H((1 + µ)/2) (H denotes the binary entropy function), respectively.
2. A major source of news aggregators’ revenue comes from displaying ads to users
while the latter are paying attention to the content.11
3. Modern news aggregators are operated by tech giants that wield significant
market power.
4. The algorithms behind their operations represent trade secrets that cannot be
easily reverse-engineered by nonusers (Eslami et al. (2015)).
We analyze the game between candidates, voters, and an infomediary. While our
game is certainly stylized, it captures some facets of reality and gives us tractability.
Motivated by Facts 2 and 3, we assume that a monopolistic infomediary maximizes
the total amount of attention paid by voters while preventing them from tuning out.
Our results remain qualitatively valid as long as the infomediary’s profit is a strictly
increasing function of voters’ attention.12 Online Appendix O.2 examines the case of
competitive infomediaries.
Motivated by Fact 4, we assume that candidates do not observe signal structures
when crafting policies.13 We do allow voters to observe the signal structures offered
to them, because according to computer scientists working on algorithm audit, the
most effective way to recover the algorithms used by tech giants is to survey users
(Eslami et al. (2015)).
Finally, we assume that policies are announced to voters at the voting stage.14 We
11 Examples include Facebook’s tactics such as playing multiple small mid-roll ads when users are already in the “lean-back” watching mode and so will absorb the ads together with the content.
12 To see why, suppose the profit generated by a voter consuming Π equals J(I(Π)) for some strictly increasing function J : [0, 1] → R+. Then for any given set of voters whose participation constraints we wish to satisfy, the infomediary solves max_Π J(I(Π)) · (# of participating voters) or, equivalently, max_Π I(Π) · (# of participating voters), subject to voters’ participation constraints.
13 We also prevent candidates from observing (signals of) the valence state when crafting policies. Relaxing this assumption wouldn’t affect the analysis, because any symmetric PBE of the current game remains a PBE of the augmented game, and any symmetric PBE of the augmented game in which candidates adopt fixed policy platforms must be a PBE of the current game.
14Since policies are certain objects, they can be observed at no attention cost once announced.
do not explicitly model the activities that make this happen (e.g., political advertising,
canvassing) but note that they typically take place right before the election day
(Gerber et al. (2011)). Given this, it is reasonable to assume that the design and
consumption of signals concerning an evolving state of the world (as in the case of
countering terrorism discussed in Footnote 8) are made without observing policies.
3 Analysis
3.1 Optimal signals
In this section, we fix any symmetric policy profile a = 〈−a, a〉 with a ≥ 0 and solve
for the signals that maximize the infomediary’s profit (hereafter optimal signals). To
facilitate discussions, we say that candidate L (resp. R) is the own-party candidate
of left-leaning (resp. right-leaning) voters.
Infomediary’s problem Under segmentation technology S ∈ {b, p}, any optimal
signal for market segment s ∈ S solves
\[ \max_{\Pi} \; I(\Pi) \cdot D(\Pi; a, s), \tag{s} \]
where D (Π; a, s) denotes the demand for signal Π in market segment s under policy
profile a. To figure out D(·), note that since a voter could always vote for his own-
party candidate without consuming information, information consumption is useful
only if it sometimes convinces him to vote across party lines. After consuming Π,
a voter strictly prefers candidate R to L if v (a, k) + µz > 0, and he strictly prefers
candidate L to R if v (a, k) + µz < 0. Ex ante, the expected utility gain from consuming
Π is
\[ V(\Pi; a, k) = \begin{cases} \sum_{z \in Z} \pi_z \, [v(a, k) + \mu_z]^{+} & \text{if } k \le 0, \\ \sum_{z \in Z} -\pi_z \, [v(a, k) + \mu_z]^{-} & \text{if } k > 0, \end{cases} \]
and the voter prefers to consume Π rather than to abstain (hereafter, his participation
constraint is satisfied) if
V (Π; a, k) ≥ λ · I(Π).
Therefore,
\[ D(\Pi; a, s) = \sum_{k \in K:\; V(\Pi; a, k) \ge \lambda \cdot I(\Pi)} \text{population of type } k \text{ voters in segment } s.^{15} \]
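The participation constraint and the resulting demand can be illustrated numerically. The sketch below is not the paper's procedure. It assumes a posterior-separable cost I(Π) = Σz πz h(µz) with an entropy-based h(µ) = 1 − H2((1 + µ)/2) measured in bits, reads [x]+ as max{x, 0} and [x]− as min{x, 0}, and uses hypothetical values for v(a, k), λ, and the type populations.

```python
import math

def h(mu):
    """Assumed entropy cost (in bits) of a posterior mean mu in [-1, 1]."""
    p = (1.0 + mu) / 2.0  # posterior probability that the state is +1
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return 1.0 - entropy

def attention(signal):
    """I(Pi) for a signal given as a list of (pi_z, mu_z) pairs."""
    return sum(pi * h(mu) for pi, mu in signal)

def gain(signal, v, k):
    """V(Pi; a, k), where v stands for v(a, k)."""
    if k <= 0:
        return sum(pi * max(v + mu, 0.0) for pi, mu in signal)
    return sum(-pi * min(v + mu, 0.0) for pi, mu in signal)

def demand(signal, lam, voters):
    """Total population of types whose participation constraint holds.

    voters maps each type k to a pair (v(a, k), population of type k).
    """
    cost = lam * attention(signal)
    return sum(pop for k, (v, pop) in voters.items()
               if gain(signal, v, k) >= cost)

# Hypothetical segment: a symmetric binary signal and three voter types.
signal = [(0.5, -0.6), (0.5, 0.6)]  # satisfies (BP): the mean posterior is 0
voters = {-1: (-0.3, 0.25), 0: (0.0, 0.5), 1: (0.3, 0.25)}
print(demand(signal, 0.5, voters))  # lower attention cost parameter
print(demand(signal, 0.8, voters))  # higher attention cost parameter
```

In this hypothetical example, raising λ drives the extreme types out of the market first, since their gain from consuming the same signal is smaller than that of the centrist type.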
Binary signal and strict obedience We demonstrate that any optimal signal
has at most two realizations and, if binary, prescribes voting recommendations that
its consumers strictly prefer to obey. To facilitate analysis, we say that a signal
realization z endorses candidate R and disapproves of candidate L if µz > 0, and
that it endorses candidate L and disapproves of candidate R if µz < 0. For binary
signals, we write Z = {L, R}. From Bayes’ plausibility, which mandates that the
expected posterior mean must equal the prior mean zero:
\[ \sum_{z \in Z} \pi_z \cdot \mu_z = 0, \tag{BP} \]
it follows that we can assume µL < 0 < µR w.l.o.g. In this way, we can interpret each
signal realization z ∈ {L, R} as an endorsement for candidate z and a disapproval of
candidate −z. In addition, we can define strict obedience as follows.
15 If a solution to Problem (s) has zero demand, then it will be regarded the same as a degenerate signal. This convention rules out uninteresting situations in which the infomediary deters information consumption using nondegenerate signals.
Definition 1. A binary signal induces strict obedience from its consumers if the
latter strictly prefer the endorsed candidate to the disapproved one under both signal
realizations, i.e.,
v (a, k) + µL < 0 < v (a, k) + µR. (SOB)
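As a small illustration of (BP) and (SOB), the sketch below recovers the realization probabilities of a binary signal from its posterior means by solving (BP) together with πL + πR = 1, and then checks strict obedience. The function names and the numerical values are hypothetical.

```python
def binary_signal(mu_L, mu_R):
    """Return (pi_L, pi_R) implied by (BP) and pi_L + pi_R = 1."""
    if not mu_L < 0.0 < mu_R:
        raise ValueError("expected mu_L < 0 < mu_R")
    pi_L = mu_R / (mu_R - mu_L)
    pi_R = -mu_L / (mu_R - mu_L)
    return pi_L, pi_R

def strictly_obedient(v, mu_L, mu_R):
    """Check (SOB): v(a, k) + mu_L < 0 < v(a, k) + mu_R."""
    return v + mu_L < 0.0 < v + mu_R

pi_L, pi_R = binary_signal(-0.5, 0.25)
print(pi_L, pi_R)
print(strictly_obedient(0.2, -0.5, 0.25))  # both recommendations are obeyed
print(strictly_obedient(0.6, -0.5, 0.25))  # R is preferred after either realization
```

A skewed signal such as this one sends the less extreme posterior more often, which is what (BP) requires when µL and µR differ in magnitude.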
Lemma 1. Under Assumption 1, the following hold for any symmetric policy profile
〈−a, a〉 with a ≥ 0.
(i) Any optimal personalized signal for any voter has at most two realizations.
(ii) Any optimal broadcast signal has at most two realizations.
(iii) Any optimal signal, if binary, induces strict obedience from its consumers.
Proof. Omitted proofs from the main text are gathered in Appendix B.
Lemma 1 is proven differently for the cases of personalized and broadcast infor-
mation aggregation. In the personalized case, our result follows from the fact that
individual voters make binary decisions and the attention cost function is strictly
Blackwell-monotone. Given this, any information beyond decision recommendations
would only raise the attention cost without any corresponding benefit for voters and
so would turn away voters whose participation constraints bind at the optimum. For
these voters, maximizing attention is equivalent to maximizing the usefulness of in-
formation consumption at the maximal attention level. Neither the assumption of
binary states nor that of posterior separability matters for this argument.
The broadcast case is proven by aggregating voters with binding participation
constraints into a representative voter. Under the assumption that voters’ policy
preferences exhibit increasing differences between policies and types, only extreme
voters’ participation constraints bind at the optimum, whereas centrist voters’ par-
ticipation constraint is slack. The resulting representative voter makes at most three
decisions: LL, LR, and RR (the first and second letters stand for the voting decisions
of the left-leaning voter and right-leaning voter, respectively), so the optimal signal
for him has at most three signal realizations. Then using the concavification method
developed by Aumann and Maschler (1995) and Kamenica and Gentzkow (2011), we
demonstrate that the optimal signal has at most two realizations LL and RR. The
proof exploits three assumptions: (i) binary states, (ii) posterior separability, and (iii)
the infomediary maximizes voters’ attention, which will be relaxed in Section 4.
Strict obedience (SOB) is an essential feature of optimal binary signals. Indeed,
if a consumer of a binary signal has a (weakly) preferred candidate that is indepen-
dent of his voting recommendations, then he would prefer to vote for that candidate
unconditionally without consuming the signal, because doing so saves the attention
cost without affecting the expected voting utility.
The next assumption imposes regularities on our problem. It makes the upcoming
analysis elegant and will be relaxed in Section 4.
Assumption 2. The following hold for any symmetric policy profile 〈−a, a〉 with
a ≥ 0, segmentation technology S ∈ {b, p}, and market segment s ∈ S.
(i) Any optimal signal for market segment s is nondegenerate and is consumed by
all voters therein.
(ii) The posterior means of the state induced by the signal in Part (i) lie in (−1, 1).
tion distribution could have surprising effects on product differentiation.
Fourth, the welfare consequences of the above changes are in general ambiguous,
a topic we now turn to.
6 Concluding remarks
Tech-enabled personalization is now ubiquitous and seems to maximize the social sur-
plus by best serving individuals’ needs. To us, this argument ignores a vital role of
modern infomediaries, namely their abilities in shaping information consumers’ beliefs
and, in turn, the location choices of politicians, companies, etc. After formalizing this
role of infomediaries, the welfare consequences of many regulatory proposals to tame
the tech giants become less clear-cut. For example, while disabling personalization
clearly reduces the surplus generated from information aggregation, holding candi-
dates’ positions fixed, it could affect the social welfare either way once these positions
become endogenous.17 The same thing can be said about introducing competition
between infomediaries, which alone would make voters better off and infomediaries
worse off. For this reason, we suggest that caution must be exercised and our equilib-
rium characterization be considered when evaluating the overall impacts of the above
proposals.
An important takeaway from our analysis is the indeterminacy of the disciplining
voter in the case of personalized information aggregation. This prediction, while
delicate at first sight, suggests that a useful first step towards testing our theory is to
17 For example, while increasing polarization certainly makes centrist voters worse-off, it could affect extreme voters’ utilities either way, depending on the exact location choices of the candidates.
identify shocks to infomediaries, which in practice could stem from the experiments
conducted by tech companies or the regulatory uncertainties they face (e.g., Spain’s
unexpected shutdown of Google News). It also suggests the usefulness of surveying
consultants about the disciplining voter, an approach advocated by Hersh (2015) in
the context of personalized campaigns. We hope someone, maybe us, will put these
ideas into practice in the future.
A Numerical examples
This appendix solves the baseline model numerically for the case of entropy attention
cost. We first reduce Assumption 2 to model primitives. Results depicted in Figure
3 confirm the intuition discussed in Section 3.1.
[Figure 3: parameter regions in terms of t(1) and λ. Legend: Condition (*) and Assumption 2 hold; Condition (*) fails and Assumption 2 holds.]
The upcoming analysis exploits the following properties of the distance utility func-
tion.
Observation 2. u(a, k) = −|t(k)−a| satisfies the following properties, provided that
t : K → R is strictly increasing and is symmetric around zero.
Continuity and weak concavity u (·, k) is continuous and weakly concave for any
k ∈ K.
Symmetry u (a, k) = u (−a,−k) for any a ∈ R and k ∈ K.
Inverted V-shape u (·, k) is strictly increasing on (−∞, t (k)] and is strictly de-
creasing on [t (k) ,+∞) for any k ∈ K.
Increasing differences v(−a, a′, k) := u (a, k) − u (a′, k) is increasing in k for any
a > a′. For any a > 0, v(−a, a, k) := u (a, k) − u (−a, k) is strictly positive if
k = 1, equals zero if k = 0, and is strictly negative if k = −1.
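The symmetry property and the sign pattern of v(−a, a, k) can be checked directly for the distance utility. The sketch below assumes hypothetical bliss points t(−1) = −0.5, t(0) = 0, and t(1) = 0.5; any strictly increasing t that is symmetric around zero would serve equally well.

```python
def u(a, k, t):
    """Distance utility u(a, k) = -|t(k) - a|."""
    return -abs(t[k] - a)

def v(a, k, t):
    """v(-a, a, k) := u(a, k) - u(-a, k) for the symmetric profile <-a, a>."""
    return u(a, k, t) - u(-a, k, t)

# Hypothetical bliss points: strictly increasing and symmetric around zero.
t = {-1: -0.5, 0: 0.0, 1: 0.5}

for a in (0.25, 0.5, 1.0):
    # Symmetry: u(a, k) = u(-a, -k).
    assert all(u(a, k, t) == u(-a, -k, t) for k in t)
    # Sign pattern of v(-a, a, k) across types.
    assert v(a, 1, t) > 0 and v(a, 0, t) == 0 and v(a, -1, t) < 0
    print(a, [v(a, k, t) for k in (-1, 0, 1)])
```

With this utility, v(−a, a, 1) stops growing once a exceeds t(1), which is the kind of kink that the proofs below treat by distinguishing a < t(1) from a ≥ t(1).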
B.1 Proofs for Sections 3.1 and 3.3
The proofs presented in this appendix take any symmetric policy profile a = 〈−a, a〉
with a > 0 as given (the proof for the case a = 0 is trivial). Under the assumption of
binary states, any signal structure can be represented by the tuple 〈πz, µz〉z∈Z , where
πz denotes the probability that the signal realization is z ∈ Z, and µz denotes the
posterior mean of the state conditional on the signal realization being z. Any binary
signal structure must satisfy
\[ \pi_L = \frac{\mu_R}{\mu_R - \mu_L} \quad \text{and} \quad \pi_R = \frac{-\mu_L}{\mu_R - \mu_L} \]
and so can be represented by the profile 〈µL, µR〉 of posterior means. Type k voters’
utility gain from consuming 〈µL, µR〉 is simply
\[ V(\langle \mu_L, \mu_R \rangle; a, k) = \begin{cases} \pi_R \, [v(a, k) + \mu_R]^{+} & \text{if } k \le 0, \\ -\pi_L \, [v(a, k) + \mu_L]^{-} & \text{if } k > 0, \end{cases} \]
where v(a, 1) > 0 = v(a, 0) > v(a,−1) = −v(a, 1) by the symmetry and increasing-differences
properties in Observation 2. For ease of notation, we shall hereafter write
v(a, 1) = v and v(a,−1) = −v.
Proof of Lemmas 1 and 2 We prove Lemmas 1 and 2 together in four steps.
Step 1. Show that the optimal personalized signal for any voter is unique and has
at most two signal realizations. We prove the result only for left-leaning voters of
type k = −1. Any optimal personalized signal for them solves
\[ \max_{Z,\; \Pi: \Omega \to \Delta(Z)} \; I(\Pi) \quad \text{subject to} \quad V(\Pi; a, -1) \ge \lambda I(\Pi). \tag{2} \]
Let γ ≥ 0 denote a Lagrange multiplier associated with voters’ participation con-
straint. Write the complementary slackness constraints as
γ ≥ 0, V (Π; a,−1) ≥ λI (Π) , and γ [V (Π; a,−1)− λI (Π)] = 0. (3)
If γ = 0, then the solution to (2) is the true state and so is unique and binary. If
γ > 0, then reformulate (2) as
\[ \max_{Z,\; \langle \pi_z, \mu_z \rangle_{z \in Z},\; \gamma \ge 0} \; V(\Pi; a, -1) - \lambda(\gamma) I(\Pi) \quad \text{subject to (BP) and (3)}, \tag{4} \]
where λ (γ) := λ − 1/γ. If λ(γ) ≤ 0, then the solution to (4) and, hence, (2), is again
the true state. If λ(γ) > 0, then the maximand of (4) becomes
\[ \sum_{z \in Z} \pi_z \underbrace{\left[ [-v + \mu_z]^{+} - \lambda(\gamma) h(\mu_z) \right]}_{f(\mu_z)}, \]
where f is the maximum of two strictly concave functions of µ: (i) −λ (γ)h (µ), and
(ii) −v + µ − λ (γ)h (µ) (as depicted in Figure 6). Since (i) and (ii) single-cross at
µ = v, their maximum is M-shaped, so applying the concavification method developed
by Kamenica and Gentzkow (2011) yields a unique solution with at most two signal
realizations. Given this, we can restrict Z to {Z : |Z| ≤ 2} in the original problem (2)
and therefore guarantee the existence of a solution.
[Figure 6: f(µ) and its concave closure f+(µ), with v, µ1, and µ2 marked on the horizontal axis.]
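The concavification step can be approximated numerically. The sketch below is a coarse grid search rather than the analytical argument: it evaluates f(µ) = [−v + µ]+ − λ(γ)h(µ) and looks for the two-point split of the prior mean 0 that maximizes the expected value of f, which suffices because the concave closure of a function of one variable is attained by at most two points. The entropy-based h (in bits) and the values v = 0.3 and λ(γ) = 0.5 are assumptions chosen for illustration.

```python
import math

def h(mu):
    """Assumed entropy cost (in bits) of a posterior mean mu in [-1, 1]."""
    p = (1.0 + mu) / 2.0
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return 1.0 - entropy

def f(mu, v, lam):
    """f(mu) = [-v + mu]^+ - lam * h(mu): the maximum of branches (i) and (ii)."""
    return max(mu - v, 0.0) - lam * h(mu)

def best_binary_split(v, lam, n=100):
    """Grid search for the split of the prior mean 0 that maximizes E[f]."""
    best = (f(0.0, v, lam), 0.0, 0.0)  # the degenerate signal as a benchmark
    for i in range(1, n + 1):
        mu1 = -i / n
        for j in range(1, n + 1):
            mu2 = j / n
            pi2 = -mu1 / (mu2 - mu1)  # weight on mu2 implied by (BP)
            value = (1.0 - pi2) * f(mu1, v, lam) + pi2 * f(mu2, v, lam)
            if value > best[0]:
                best = (value, mu1, mu2)
    return best

value, mu1, mu2 = best_binary_split(v=0.3, lam=0.5)
print(value, mu1, mu2)
```

In this example the search returns a nondegenerate split with µ2 above v, so the voter strictly prefers candidate R after the realization that endorses R.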
For each i = 1, 2, write Πi for the unique solution to (4) given λ(γi). From strict
optimality, i.e., voters strictly prefer Πi to Π−i given λ(γi), we deduce that
\[ \lambda(\gamma_1) \left( I(\Pi_1) - I(\Pi_2) \right) > V(\Pi_1; a, -1) - V(\Pi_2; a, -1) > \lambda(\gamma_2) \left( I(\Pi_1) - I(\Pi_2) \right). \]
Simplifying the above expression using λ(γ1) = λ − 1/γ1 > λ − 1/γ2 = λ(γ2) yields
I (Π1) > I (Π2), so Π1 and Π2 cannot both be the solutions to the original problem
(2), a contradiction.
Step 2. Show that an optimal broadcast signal exists and has at most two realiza-
tions. We focus on the case where all voters participate in information consumption.
The proofs for the remaining cases are analogous and hence are omitted for brevity.
Since the following must hold for any nondegenerate signal structure Π:
\[ V(\Pi; a, -1) = \sum_{z \in Z} \pi_z [-v + \mu_z]^{+} < \sum_{z \in Z} \pi_z [\mu_z]^{+} = V(\Pi; a, 0) \]
and
\[ V(\Pi; a, 1) = \sum_{z \in Z} -\pi_z [v + \mu_z]^{-} < \sum_{z \in Z} -\pi_z [\mu_z]^{-} = V(\Pi; a, 0), \]
it follows that for any nondegenerate solution to the infomediary’s problem, only a
subset B ⊆ {−1, 1} of the extreme voters can have binding participation constraints,
whereas centrist voters’ participation constraint must be slack. For each k ∈ {−1, 1},
let γk ≥ 0 denote the Lagrange multiplier associated with type k voters’ participation
constraint, and write their complementary slackness constraints as
γk ≥ 0, V (Π; a, k) ≥ λI (Π) , and γk [V (Π; a, k)− λI (Π)] = 0. (5)
Formulate the infomediary’s problem as
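\[ \max_{Z,\; \Pi: \Omega \to \Delta(Z),\; \gamma_{-1}, \gamma_1 \ge 0} \; I(\Pi) + \sum_{k \in \{-1, 1\}} \gamma_k \left[ V(\Pi; a, k) - \lambda I(\Pi) \right] \quad \text{subject to (5)}, \tag{6} \]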
and consider three cases. First, if B = ∅, then the solution to (6) is the true state.
Second, if |B| = 1, then the solution to (6) is the optimal personalized signal for
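the voter in B. Finally, if |B| = 2, then write $\tilde{\gamma}_k = \gamma_k / (\gamma_{-1} + \gamma_1)$ for k ∈ {−1, 1} and
$\tilde{\lambda} = \lambda - 1/(\gamma_{-1} + \gamma_1)$, and simplify (6) to
\[ \max_{Z,\; \langle \pi_z, \mu_z \rangle_{z \in Z},\; \gamma_{-1}, \gamma_1 \ge 0} \; \sum_{z \in Z} \pi_z \underbrace{\left[ \tilde{\gamma}_{-1} [-v + \mu_z]^{+} - \tilde{\gamma}_1 [v + \mu_z]^{-} - \tilde{\lambda} h(\mu_z) \right]}_{f(\mu_z)} \quad \text{subject to (BP) and (5)}, \tag{7} \]
where f (µz) is the maximum of three strictly concave functions of µ: (i) $\tilde{\gamma}_{-1}(-v + \mu) - \tilde{\lambda} h(\mu)$,
(ii) $-\tilde{\lambda} h(\mu)$, and (iii) $-\tilde{\gamma}_1 (v + \mu) - \tilde{\lambda} h(\mu)$ (see Figure 7 for a graphical illustration).
Fix any $\tilde{\gamma}_{-1}$, $\tilde{\gamma}_1$, and $\tilde{\lambda}$, and consider the relaxed problem (7). Let f+ denote
the concave closure of f , and note that µ1 := inf {µ : f+ (µ) > f (µ)} and µ2 :=
sup {µ : f+ (µ) > f (µ)} exist and satisfy µ1 < 0 < µ2. There are three cases to
consider.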
(a) If f+ (0) > (1−α)f+(µ1) +αf+(µ2) for all α ∈ [0, 1], then the unique solution to
the relaxed problem is the degenerate signal (as depicted on Panel (a) of Figure
7).
(b) If f+ (0) = (1− α)f+(µ1) + αf+(µ2) > f(0) for some α ∈ [0, 1], then the unique
solution to the relaxed problem is the binary signal 〈µ1, µ2〉 (as depicted on Panel
(b) of Figure 7).
(c) If f+ (0) = (1− α)f+(µ1) + αf+(µ2) = f(0) for some α ∈ [0, 1], then the relaxed
problem has multiple solutions, each of which entails at most three signal real-
izations (as depicted on Panel (c) of Figure 7). Among all these solutions, the
binary signal 〈µ1, µ2〉 is the most Blackwell-informative and therefore constitutes
the unique solution to the original attention-maximization problem.
Taken together, we can always restrict Z to {Z : |Z| ≤ 2} in the original problem (6)
and therefore guarantee the existence of a solution.
[Figure 7: f(µ) and its concave closure f+(µ) in Panels (a), (b), and (c), with −v, 0, v, µ1, and µ2 marked on the horizontal axis.]
Part (i): Recall that µbL (a) is the unique solution to (8), i.e.,
\[ \max_{\mu \in [-1, 0]} \; h(\mu) \quad \text{s.t.} \quad \tfrac{1}{2} \left[ v(-a, a, -1) - \mu \right]^{+} \ge \lambda h(\mu). \]
Solving this problem using (i) h is strictly convex and strictly decreasing on [−1, 0]
and (ii) v(−a, a,−1) is nondecreasing in a shows that µbL(a) is nondecreasing in a.
As a result, φb (−a, 0, 0) = a + µbL (a) is strictly increasing in a, which together with
φb (−a, 0, 0)|a=0 = µbL (0) < 0 implies that the unique root of φb(−a, 0, 0) is strictly
positive, and Ξb(0) := {a ≥ 0 : φb(−a, 0, 0) ≤ 0} = [0, unique root of φb(−a, 0, 0)]. In
case the above root exceeds t(1), solving it using the fact that φb (−a, 0, 0) = a − υbL ∀a ≥ t(1) yields υbL (:= ξb(0)).
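The monotonicity of µbL(a) in Part (i) can be illustrated with a grid search over the constrained problem stated above. This is a rough sketch under assumptions that go beyond the text: the entropy-based h (in bits), the distance utility of Observation 2 with hypothetical bliss points t(k) = k · t(1), and the values t(1) = 0.5 and λ = 0.35.

```python
import math

def h(mu):
    """Assumed entropy cost (in bits) of a posterior mean mu in [-1, 1]."""
    p = (1.0 + mu) / 2.0
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return 1.0 - entropy

def u(a, k, t1):
    """Distance utility with bliss points t(k) = k * t1 for k in {-1, 0, 1}."""
    return -abs(k * t1 - a)

def mu_b_L(a, lam, t1, n=1000):
    """Grid search for max h(mu) on [-1, 0] s.t. 0.5 * [v - mu]^+ >= lam * h(mu)."""
    v = u(a, -1, t1) - u(-a, -1, t1)  # v(-a, a, -1), which is nonpositive
    grid = [(i - n) / n for i in range(n + 1)]  # from -1.0 up to 0.0
    # mu = 0 is always feasible, so the list below is never empty.
    feasible = [mu for mu in grid if 0.5 * max(v - mu, 0.0) >= lam * h(mu)]
    return max(feasible, key=h)

policies = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
solutions = [mu_b_L(a, lam=0.35, t1=0.5) for a in policies]
print(list(zip(policies, solutions)))
```

As a grows, v(−a, a, −1) falls, the constraint tightens, and the most informative feasible posterior mean moves toward zero, in line with the monotonicity claim.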
Part (ii): For k = 1, notice that φp (−a, t (1) , 1) = a + t (1) + µpL (a, 1) = a + t (1) − υpL (1)
∀a ≥ t (1) and that φp (−a, t (1) , 1)|a=t(1) = v (−t (1) , t (1) , 1) + µpL (t(1), 1) < 0
by (SOB). Therefore, the maximum root of φp(−a, t(1), 1) equals −t(1) + υpL(1) (:=
ξp(1)) and exceeds t(1), and Ξp(1) := {a ≥ 0 : φp(−a, t(1), 1) ≤ 0} satisfies Ξp(1) ∩
[t(1),+∞) = [t(1), ξp(1)]. The proofs for k = 0,−1 are analogous and are therefore
omitted.
We now close the proof of Theorem 2. In the broadcast case, combining Lemmas
3 and 5 yields Eb,q = Ξb(0) = [0, ξb(0)]. The proof for the personalized case is the
same as that for the broadcast case if q(0) > 1/2. If, instead, q(0) ≤ 1/2, then
\[ E^{p,q} = \underbrace{\left( [0, t(1)) \cap \Xi^{p}(0) \right)}_{A} \cup \underbrace{\left( [t(1), +\infty) \cap \bigcap_{k \in K} \Xi^{p}(k) \right)}_{B} \]
by Lemma 4. Consider two cases. First, if ξp(0) < t(1)(< ξp(±1)), then A =
[0, ξp(0)] by Lemma 5 and B = ∅. Second, if ξp(0) ≥ t(1), then A = [0, t(1)) and B
= [t(1), mink∈K ξp(k)]. In both cases, Ep,q = A ∪ B = [0, mink∈K ξp(k)].
Proof of Proposition 4
Proof. Replicating the proofs in this appendix for market-share maximizing compa-
nies gives the desired result.
C Minor extensions
Relaxing Assumption 2 Assumption 2, or uniform strict obedience, mandates
that all voters’ signals must satisfy (SOB) for any given pair of segmentation tech-
nology and feasible policy profile. Three assumptions together guarantee that this is
the case: (i) the underlying state is binary, (ii) optimal signals are nondegenerate,
and (iii) voters face binary decision problems. Online Appendix O.3 investigates an
extension to a continuum of states. Here we discuss the consequences of relaxing (ii)
and (iii).
The next example shows that policy polarization could still be positive even if
extreme voters are excluded from information consumption.
Example 1. Let everything be as in the baseline model except that extreme voters are
excluded from information consumption. Take any symmetric policy profile 〈−a, a〉
with a ∈ [0, min{t (1) , ξS (0)}], and suppose extreme voters vote along party lines
when they are indifferent between the two candidates. By construction, any unilateral
deviation of candidate R from 〈−a, a〉 doesn’t attract centrist voters, and it doesn’t
increase the total number of votes that extreme voters cast to him. Combining these
observations shows that 〈−a, a〉 can be sustained as an equilibrium outcome. ♦
The next example shows that policy polarization could still be positive, even if
extreme voters observe more than two signal realizations and do not always have
strict preferences between the two candidates.18
Example 2. Let everything be as in the baseline model except that extreme voters’
personalized signals have three realizations L, M and R, and they strictly prefer
candidate L to R given signal realization L, are indifferent between the two candidates
given signal realization M , and strictly prefer candidate R to L given signal realization
R. Below we demonstrate that the policy profile 〈−t (1) , t (1)〉 can be sustained as
an equilibrium outcome.
For each k ∈ {−1, 1}, write µz (k) for the posterior mean of the state conditional
on type k voters’ signal realization being z ∈ {L, M, R}, and note that
v (−t (1) , t (1) , k) + µL (k) < v (−t (1) , t (1) , k) + µM (k) = 0 < v (−t (1) , t (1) , k) +
µR (k) by assumption. Bayes’ plausibility implies that µL (k) < 0 < µR (k). Con-
sider any unilateral deviation of candidate R from 〈−t (1) , t (1)〉 to a′. Clearly, no
a′ /∈ [−t (1) , t (1)] constitutes a profitable deviation, and no a′ ∈ [−t (1) , t (1)] at-
tracts centrist voters whose policy latitude is assumed to be greater than t (1). It
remains to show that no a′ ∈ [−t (1) , t (1)) affects extreme voters’ voting decisions.
18 We do not explicitly model the decision problem and signal generation process here. It is well known that the signal acquired by an RI decision-maker facing a finite decision problem (e.g., categorical thinking) has finitely many realizations (Matejka and McKay (2015)). Indeed, the same conclusion can sometimes be drawn for infinite decision problems (Jung et al. (2019)).
For k = −1, note that
v (−t (1) , a′,−1) + µL (−1)
≤ v (−t (1) , t (−1) ,−1) + µL (−1) (inverted V-shape)
= v (−t (1) ,−t (1) ,−1) + µL (−1) (symmetry)
= 0 + µL (−1)
< 0,
and the following must hold for z = M,R:
v (−t (1) , a′,−1) + µz (−1)
> v (−t (1) , t (1) ,−1) + µz (−1) (inverted V-shape)
≥ 0.
If, in addition, type −1 voters break the tie in favor of candidate R, then no a′ ∈
[−t (1) , t (1)) could affect their voting decisions. The proof for k = 1 is analogous
and hence is omitted. ♦
Heterogeneous attention cost parameter By allowing the attention cost pa-
rameter to differ across voters, we might (but not necessarily) end up in a situation
in which centrist voters’ participation constraint is binding whereas extreme voters’
participation constraints are slack in the broadcast case. But then the broadcast
signal would be the same as centrist voters’ personalized signal, so information per-
sonalization could only decrease policy polarization.
Alternative segmentation technologies For general segmentation technologies,
we can first aggregate—for each market segment—voters with binding participation
constraints into a representative voter, and then solve for the optimal personalized
signals for representative voters.
References
Athey, S., M. Mobius, and J. Pal. (2021): “The Impact of aggregators on
Internet news consumption,” NBER working paper.
Aumann, R., and M. Maschler. (1995): Repeated Games with Incomplete Infor-
mation, Cambridge, MA: MIT Press.
Burke, J. (2008): “Primetime spin: Media bias and belief confirming information,”
Journal of Economics and Management Strategy, 17(3), 633-665.
Calvert, R. L. (1985a): “The value of biased information: A rational choice model
of political advice,” Journal of Politics, 47(2), 530-555.
——— (1985b): “Robustness of the multidimensional voting model: Candidate mo-
tivations, uncertainty, and convergence,” American Journal of Political Science,
29(1), 69-95.
Caplin, A., and M. Dean. (2013): “Behavioral implications of rational inattention
with Shannon entropy,” NBER working paper.
——— (2015): “Revealed preference, rational inattention, and costly information
acquisition,” American Economic Review, 105(7), 2183-2203.
Chan, J., and W. Suen. (2008): “A spatial theory of news consumption and
electoral competition,” Review of Economic Studies, 75(3), 699-728.
Che, Y-K., and K. Mierendorff. (2019): “Optimal dynamic allocation of atten-
tion,” American Economic Review, 109(8), 2993-3029.
Cover, T. M., and J. A. Thomas. (2006): Elements of Information Theory,
Hoboken, NJ: John Wiley & Sons, 2nd ed.
Dean, M., and N. Neligh. (2019): “Experimental tests of rational inattention,”
Working Paper.
Dellarocas, C., J. Sutanto, M. Calin, and E. Palme. (2016): “Atten-
tion allocation in information-rich environments: The case of news aggregators,”
Management Science, 62(9), 2457-2764.
DellaVigna, S., and M. Gentzkow. (2010): “Persuasion: Empirical evidence,”
Annual Review of Economics, 2, 643-669.
Duggan, J. (2017): “A survey of equilibrium analysis in spatial model of elections,”
Working Paper.
Eslami, M., A. Aleyasen, K. G. Karahalios, K. Hamilton, and C. Sand-
vig. (2015): “FeedVis: A path for exploring news feed curation algorithms,”
CSCW’15 Companion: Proceedings of the 18th ACM Conference Companion on
Computer Supported Cooperative Work & Social Computing, 65-68.
Flaxman, S., S. Goel, and J. M. Rao. (2016): “Filter bubbles, echo chambers
and online news consumption,” Public Opinion Quarterly, 80(S1), 298-320.
Gentzkow, M. (2016): “Polarization in 2016,” Toulouse Network for Information
Technology whitepaper.
Gerber, A. S., J. G. Gimpel, D. P. Green, and D. R. Shaw. (2011): “How
large and long-lasting are the persuasive effects of televised campaign ads? Results
from a randomized field experiment,” American Political Science Review, 105(1),
135-150.
Glaeser, E. L., G. A. M. Ponzetto, and J. M. Shapiro. (2005): “Strategic
extremism: Why Republicans and Democrats divide on religious values,” Quarterly
Journal of Economics, 120(4), 1283-1330.
Hebert, B., and M. D. Woodford. (2018): “Rational inattention in continuous
time,” Working Paper.
Herrera, H., D. K. Levine, and C. Martinelli. (2008): “Policy platforms,
campaign spending and voter participation,” Journal of Public Economics, 92(3-
4), 501-513.
Hersh, E. D. (2015): Hacking the Electorate: How Campaigns Perceive Voters,
Cambridge, U.K.: Cambridge University Press.
Hotelling, H. (1929): “Stability in competition,” The Economic Journal, 39(153),
41-57.
Iyer, G., D. Soberman, and M. Villas-Boas. (2005): “The targeting of adver-
tising,” Marketing Science, 24(3), 461-476.
Jung, J., J. Kim, F. Matejka, and C. A. Sims. (2019): “Discrete actions in
information-constrained problems,” Review of Economic Studies, 86(6), 2643-2667.
Kamenica, E., and M. Gentzkow. (2011): “Bayesian persuasion,” American
Economic Review, 101(6), 2590-2615.
Lagun, D., and M. Lalmas. (2016): “Understanding user attention and engage-
ment in online news reading,” Proceedings of the Ninth ACM International Con-
ference on Web Search and Data Mining, 113-122.
Martinelli, C. (2006): “Would Rational Voters Acquire Costly Information?,”
Journal of Economic Theory, 129(1), 225-251.
Matejka, F., and A. McKay. (2012): “Simple market equilibria with ratio-
nally inattentive consumers,” American Economic Review: Papers and Proceedings,
102(3), 24-29.
——— (2015): “Rational inattention to discrete choices: A new foundation for the
multinomial logit model,” American Economic Review, 105(1), 272-298.
Matejka, F., and G. Tabellini. (2020): “Electoral competition with rationally
inattentive voters,” Journal of European Economic Association, jvaa042.
Matsa, K. E., and K. Lu. (2016): “10 Facts about the changing digital news
landscape,” Pew Research Center, September 14.
Morris, S., and P. Strack. (2017): “The Wald problem and the equivalence of
sequential sampling and static information costs,” Working Paper.
Pariser, E. (2011): The Filter Bubble: How the New Personalized Web Is Changing
What We Read and How We Think, New York, NY: Penguin Press.
Perego, J., and S. Yuksel. (2022): “Media competition and social disagreement,”
Econometrica, 90(1), 223-265.
Prat, A., and D. Stromberg. (2013): “The Political Economy of Mass Media,” in
Advances in Economics and Econometrics: Theory and Applications, Tenth World
Congress, ed. by D. Acemoglu, M. Arellano, and E. Dekel. Cambridge University
Press.
Sauer, M. P., M. G. Schlatterer, and S. Y. Schmitt. (2019): “Horizontal
product differentiation with limited attentive consumers,” Working paper.
Sims, C. A. (1998): “Stickiness,” Carnegie-Rochester Conference Series on Public
Policy, 49(1), 317-356.
——— (2003): “Implications of rational inattention,” Journal of Monetary Eco-
nomics, 50(3), 665-690.
Spiegler, R. (2016): “Choice complexity and market competition,” Annual Review
of Economics, 8, 1-25.
Stromberg, D. (2015): “Media and politics,” Annual Review of Economics, 7,
173-205.
Suen, W. (2004): “The self-perpetuation of biased beliefs,” The Economic Journal,
114, 377-396.
Sunstein, C. R. (2009): Republic.com 2.0, Princeton, NJ: Princeton University
Press.
The Digital Competition Expert Panel. (2019): Unlocking Digital Competi-
tion, U.K.
Warren, E. (2019): “Here’s how we can break up Big Tech,” Medium, March 8.
Yuksel, S. (2022): “Specialized learning and political polarization,” International
Economic Review, 63(1), 457-474.
Zhong, W. (2019): “Optimal dynamic information acquisition,” Working Paper.
Online Appendices
(For Online Publication Only)
O.1 General model
This appendix has two purposes. The first purpose is to extend the baseline model to
general voters. Throughout, suppose candidates can adopt the policies in a compact
interval A = [−a, a] with a > 0. Voters’ type space is an arbitrary finite set K =
{−K, · · · , 0, · · · , K}. Their population function q : K → R++ has support K and is
symmetric around zero. Their utility function u : A×K → R satisfies the properties
listed in Observation 2, i.e.,
Assumption O1. Continuity and weak concavity u (·, k) is continuous and weakly
concave for any k ∈ K.
Symmetry u (a, k) = u (−a,−k) for any a ∈ R and k ∈ K.
Inverted V-shape u (·, k) is strictly increasing on [−a, t (k)] and is strictly decreas-
ing on [t (k) , a] for any k ∈ K, where t : K → A is strictly increasing and
symmetric around zero.
Increasing differences v(−a, a′, k) := u (a, k) − u (a′, k) is increasing in k for any
a > a′. For any a > 0, v(−a, a, k) := u (a, k) − u (−a, k) is strictly positive if
k > 0, equals zero if k = 0, and is strictly negative if k < 0.
Observation O1. All results of Section 3.1 remain valid under Assumptions 1, 2,
and O1.
Proof. The proof for the personalized case is exactly the same as before. As for the
broadcast case, notice that under the current assumptions, only voters of the most
extreme types can have binding participation constraints, whereas those of interim
types must have slack participation constraints. Replacing k = ±1 with k = ±K in
the proofs of Lemmas 1 and 2 and Theorem 1 gives the desired result.
The second purpose of this appendix is to relax the assumption that signals are
conditionally independent across market segments. In what follows, we’ll first develop
new concepts in Appendix O.1.1 and then conduct equilibrium analyses in Appendix
O.1.2.
O.1.1 Key concepts
Joint signal distribution A joint signal distribution is a tuple 〈χ,b+,b−〉 of a
configuration matrix χ and probability vectors b+ and b−. The configuration matrix
χ has |K| rows. Each column of it constitutes a profile of the voting recommendations
that is prescribed to type −K, · · · , K voters with a strictly positive probability. Each
entry of χ is either 0 or 1, where 0 means that candidate R is disapproved of and 1
means that he is endorsed. For example, the configuration matrix is
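\[ \chi^{*} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{pmatrix} \]
if S = b, and it is
\[ \chi^{**} = \underbrace{\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 1 & \cdots & 0 & \cdots & 1 \\
0 & 0 & 1 & \cdots & 0 & 1 & \cdots & 0 & \cdots & 1 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \cdots & \vdots & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 1 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 1 & 0 & \cdots & 1 & \cdots & 1
\end{pmatrix}}_{2^{|K|} \text{ columns}} \]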
if S = p and signals are conditionally independent across voters. The vectors b+
and b− compile the probabilities that each column of χ occurs in states ω = 1 and
ω = −1, respectively. By definition, all elements of b+ or b− are strictly positive and
add up to one.
We consider symmetric joint signal distributions that are consistent with the
marginal signal distributions solved in Section 3.1. To formally define symmetry, let
x be a generic voting recommendation profile to type −K, · · · , K voters, 1 be the
|K|-vector of ones, and
\[ P = \begin{pmatrix} & & 1 \\ & \iddots & \\ 1 & & \end{pmatrix} \]
be a |K| × |K| permutation matrix. Define the symmetry operator Σ as
Σ x = P (1− x) ,
so that x recommends candidate z ∈ {L, R} to type k voters if and only if Σ x
recommends candidate −z to type −k voters. A joint signal distribution is symmetric
if the probability that a voting recommendation profile x occurs in state ω = 1 equals
the probability that Σ x occurs in state ω = −1. Formally,
Definition O1. A configuration matrix χ is symmetric if for any
m ∈ {1, · · · , #columns (χ)}, there exists n ∈ {1, · · · , #columns (χ)} such that
Σ [χ]m = [χ]n. A joint signal distribution 〈χ,b+,b−〉 is symmetric if χ is symmetric
and [b+]m = [b−]n for any m, n as above.¹⁹
¹⁹ With a slight abuse of notation, we use [·]m to denote both the mth entry of a column vector
and the mth column of a matrix. #columns (χ) denotes the number of columns of χ.
We next define consistency. In Footnote 16, we solved for the marginal probabilities
that the signal consumed by type k voters endorses candidate R in states ω = 1
and ω = −1, respectively, holding any segmentation technology S and symmetric
policy profile 〈−a, a〉 fixed. Compiling these probabilities across type −K, · · · , K
voters yields two |K|-vectors πS,+ (a) and πS,− (a) of marginal probabilities.
Definition O2. A joint signal distribution 〈χ,b+,b−〉 is 〈S, a〉-consistent for some
S ∈ {b, p} and a ∈ [0, ā] if
χb+ = πS,+ (a) and χb− = πS,− (a) .
A configuration matrix χ is 〈S, a〉-consistent if there exist probability vectors b+ and
b− such that the joint signal distribution 〈χ,b+,b−〉 is 〈S, a〉-consistent. χ is S-
consistent if it is 〈S, a〉-consistent for all a ∈ [0, ā].
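Symmetry and consistency can be checked numerically for the conditionally independent case. The sketch below is illustrative only: the marginal endorsement probabilities `pi_plus` are made-up numbers, not the solution from Section 3.1, and the helper names are hypothetical. It builds b+ and b− as products of marginals, verifies χb = π, and verifies that the probability of x in state ω = 1 equals the probability of Σx in state ω = −1.

```python
from itertools import product
from math import prod, isclose

def independent_joint(pi):
    """Columns of chi** and their probabilities when each type's recommendation
    is drawn independently with endorsement probability pi[i]."""
    columns = list(product([0, 1], repeat=len(pi)))
    probs = [prod(p if x == 1 else 1 - p for x, p in zip(col, pi)) for col in columns]
    return columns, probs

def marginals(columns, probs):
    """The vector chi @ b: each type's probability of receiving an endorsement."""
    n = len(columns[0])
    return [sum(b * col[i] for col, b in zip(columns, probs)) for i in range(n)]

def sigma(x):
    """Symmetry operator: flip every recommendation, then reverse the type order."""
    return tuple(1 - xi for xi in reversed(x))

# Hypothetical marginals for types -1, 0, 1 in state omega = 1 (assumption).
pi_plus = [0.3, 0.6, 0.9]
# Symmetric marginals in state omega = -1: type k mirrors type -k.
pi_minus = [1 - p for p in reversed(pi_plus)]

columns, b_plus = independent_joint(pi_plus)
_, b_minus = independent_joint(pi_minus)
b_minus_by_column = dict(zip(columns, b_minus))

consistent = all(isclose(m, p) for m, p in zip(marginals(columns, b_plus), pi_plus))
symmetric = all(isclose(bp, b_minus_by_column[sigma(col)])
                for col, bp in zip(columns, b_plus))
print(consistent, symmetric)
```

For a general configuration, checking consistency means solving a linear feasibility problem in b+ and b−; the sketch only verifies one candidate pair.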
By definition, χ∗ is b-consistent and, indeed, the only 〈b, a〉-consistent configura-
tion matrix for any given a ∈ [0, ā]. χ∗∗ is p-consistent, but it is not, in general, the
unique p-consistent configuration (numerical examples are available upon request).
Attraction-proof set and policy latitude In the proof of Theorem 2, we defined
a voter’s susceptibility to a unilateral deviation of candidate R from 〈−a, a〉
to a′ as φS(−a, a′, k) := v(−a, a′, k) + µSL(a, k), and noted that a′ attracts the voter
if and only if φS(−a, a′, k) > 0. We also defined the attraction-proof set and policy
latitude for every individual voter. The next definition generalizes these concepts
to sets of voters.
Definition O3. Under segmentation technology S ∈ {b, p}, a deviation of candidate
R from a symmetric policy profile 〈−a, a〉 with a ∈ [0, ā] to a′ attracts a set D ⊆ K of
voters if it attracts all its members, i.e., φS (−a, a′, k) > 0 ∀k ∈ D. This is equivalent
to
\[
\phi^{S}(-a, a', D) := \min_{k \in D} \phi^{S}(-a, a', k) > 0,
\]
where φS (−a, a′,D) is the D-susceptibility to a′ following unfavorable information
about candidate R’s valence. The D-proof set ΞS (D) gathers all nonnegative policies a
such that no deviation of candidate R from 〈−a, a〉 attracts D, i.e.,
\[
\Xi^{S}(D) := \left\{ a \in [0, \bar{a}] : \max_{a' \in A} \phi^{S}(-a, a', D) \le 0 \right\}.
\]
The maximum of the D-proof set
\[
\xi^{S}(D) := \max \Xi^{S}(D)
\]
is D’s policy latitude.
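Definition O3 translates directly into a grid computation. The sketch below is a rough illustration under strong assumptions: `toy_phi` is a made-up susceptibility function (it is not the paper's φS), voters' bliss points and the grids are invented, and the maximization over a′ ∈ A is replaced by a maximum over a finite grid.

```python
def coalition_susceptibility(phi, a, a_dev, coalition):
    """D-susceptibility: the least susceptible member decides whether D is attracted."""
    return min(phi(a, a_dev, k) for k in coalition)

def proof_set(phi, coalition, policy_grid, deviation_grid):
    """Grid version of the D-proof set: policies a at which no deviation attracts D."""
    return [a for a in policy_grid
            if max(coalition_susceptibility(phi, a, d, coalition)
                   for d in deviation_grid) <= 0]

def policy_latitude(phi, coalition, policy_grid, deviation_grid):
    """Maximum of the D-proof set, or None if the set is empty on the grid."""
    candidates = proof_set(phi, coalition, policy_grid, deviation_grid)
    return max(candidates) if candidates else None

def toy_phi(a, a_dev, k):
    """Hypothetical susceptibility: increasing in a, largest when the deviation
    lands on the voter's bliss point (types -1 and 1 with bliss points -1 and 1)."""
    bliss = {-1: -1.0, 1: 1.0}
    return a - 0.5 - abs(a_dev - bliss[k])

policy_grid = [i / 10 for i in range(21)]          # a in [0, 2]
deviation_grid = [i / 10 for i in range(-20, 21)]  # a' in [-2, 2]

print(policy_latitude(toy_phi, [-1], policy_grid, deviation_grid))
print(policy_latitude(toy_phi, [-1, 1], policy_grid, deviation_grid))
```

In this toy example the two-member set has a larger latitude than either member alone, because a single deviation must attract both voters at once.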
Influential coalition The next concept is integral to the upcoming analysis.
Definition O4. Fix any segmentation technology S ∈ {b, p}, symmetric policy profile
〈−a, a〉 with a ∈ [0, ā], and population function q, and let the default be the strictly
obedient outcome induced by any joint signal distribution 〈χ,b+,b−〉 that is 〈S, a〉-
consistent. A set C ⊆ K of voters constitutes an influential coalition if attracting C
while holding other things constant strictly increases candidate R’s winning probability
compared to the default.
Notice that majority coalitions are influential, and supersets of influential coali-
tions are influential. In the broadcast case, all voters consume the same signal, so
a coalition of voters is influential if and only if it is a majority coalition. In the
personalized case, non-majority coalitions can be influential due to the imperfect
correlation between different voters’ signals (see Table 1 for an illustration). In
principle, influential coalitions can depend on the entire joint signal distribution (and
certainly voters’ population distribution). The next lemma limits such dependence
to the configuration matrix only.
Lemma O1. Let everything be as in Definition O4. Then influential coalitions depend
on the joint signal distribution 〈χ,b+,b−〉 and voters’ population distribution q only
through the pair 〈χ, q〉, and they are independent of the policy profile 〈−a, a〉 if χ is
S-consistent.
Omitted proofs from Appendices O.1-O.3 are gathered in Appendix O.4.
O.1.2 Main results
The next lemma gives a full characterization of the symmetric policy profiles that
can arise in equilibrium, thus extending Lemma 4 to general voters and joint signal
distributions.
Lemma O2. Fix any pair of segmentation technology S ∈ {b, p} and population
function q, and assume Assumptions 1, 2, and O1. Then the following are equivalent.
(i) A symmetric policy profile 〈−a, a〉 with a ∈ [0, ā] can arise in an equilibrium
with a joint signal distribution 〈χ,b+,b−〉 that is 〈S, a〉-consistent.
(ii) No deviation of candidate R from 〈−a, a〉 to a′ ∈ [−a, a) attracts any influential
coalition formed under 〈χ, q〉 whose members have ideological bliss points in
[−a, a].
In what follows, we’ll use ES,χ,q to denote the set of nonnegative policies a such
that 〈−a, a〉 can arise in equilibrium under segmentation technology S, configuration
matrix χ, and population function q. As before, we are interested in the degree of
policy polarization aS,χ,q, defined as the maximum of ES,χ,q, and whether all policies
between zero and aS,χ,q can arise in equilibrium. We focus on χs that are S-consistent,
so that ES,χ,q can be computed in two simple steps.²⁰
1. Compute the influential coalitions formed under 〈χ, q〉.
2. For each a ∈ [0, ā], check if any deviation as in Lemma O2(ii) is profitable to
candidate R. If the answer is negative, then add a to the output set.
In addition, we impose the following regularities on the susceptibility function (see
Appendix O.4 for sufficient conditions).
Assumption O2. φS (−a, a′, k) is increasing in a on [|t (k) |, ā] for any S ∈ {b, p},
k ∈ K, and a′ ∈ A.
Theorem O1. Fix any segmentation technology S ∈ {b, p}, S-consistent configura-
tion matrix χ, and population function q. Let C denote a typical influential coalition
formed under 〈χ, q〉. Under Assumptions 1, 2, O1, and O2, ES,χ,q = [0, aS,χ,q], where
\[
a^{S,\chi,q} = \min_{C\text{s formed under } \langle \chi, q \rangle} \xi^{S}(C) > 0.
\]
The messages are twofold. First, policy polarization is in general disciplined by
the influential coalition with the smallest policy latitude and is strictly positive. Sec-
ond, marginal signal distributions affect policy polarization through policy latitudes,
whereas the joint signal distribution does so through the configuration matrix, holding
marginal news distributions constant.
The remainder of this appendix investigates the comparative statics of policy po-
larization regarding influential coalitions, holding marginal signal distributions (and
hence voters’ policy latitudes) fixed. Our starting observation is that enriching the
²⁰ For an arbitrary χ, one needs to check after Step 2 whether the output policy a is 〈S, a〉-consistent with χ.
Consider next the transition from broadcast information to personalized infor-
mation, which enriches the configuration matrix and hence has a negative policy
polarization effect, holding other things constant.
Proposition O2. {Cs formed under 〈χ∗, q〉} ⊆ {Cs formed under 〈χ, q〉} for any p-
consistent configuration matrix χ and any population function q.
Proof. ∀χ and q as above, {Cs formed under 〈χ, q〉} ⊇ {majority coalitions}
= {Cs formed under 〈χ∗, q〉}.
Given Proposition O2, the reader can safely attribute the increasing policy po-
larization as shown in Proposition 1 of the main text to changes in marginal signal
distributions.
We finally investigate the policy polarization effect of increasing mass polarization.
As in the main text, we define increasing mass polarization as a mean-preserving
spread of voters’ policy preferences.
Definition O6. The mass is more polarized under q′ than q if q has second-order
stochastic dominance over q′ (write q SOSD q′), i.e.,
\[
\sum_{k=m}^{K} q(k) \le \sum_{k=m}^{K} q'(k) \quad \forall m = 1, \cdots, K.
\]
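The tail-sum condition in Definition O6 is easy to test for given population functions. The sketch below is illustrative: the function name `more_polarized`, the dictionary representation of q, the tolerance, and the example populations are all assumptions made for this example.

```python
def more_polarized(q, q_prime, K, tol=1e-12):
    """True if the upper tail sums of q never exceed those of q_prime,
    i.e. sum_{k=m}^{K} q(k) <= sum_{k=m}^{K} q_prime(k) for m = 1, ..., K.
    Populations are dicts mapping type k in {-K, ..., K} to its mass."""
    for m in range(1, K + 1):
        tail_q = sum(q[k] for k in range(m, K + 1))
        tail_q_prime = sum(q_prime[k] for k in range(m, K + 1))
        if tail_q > tail_q_prime + tol:  # tol guards against rounding noise
            return False
    return True

# Hypothetical symmetric populations over types -2, ..., 2.
q = {-2: 0.10, -1: 0.20, 0: 0.40, 1: 0.20, 2: 0.10}
q_prime = {-2: 0.20, -1: 0.15, 0: 0.30, 1: 0.15, 2: 0.20}

print(more_polarized(q, q_prime, K=2))   # q' shifts mass toward the extremes
print(more_polarized(q_prime, q, K=2))
```

Because q and q′ are symmetric, checking the upper tails for m = 1, ..., K covers the lower tails as well.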
The analysis assumes quadratic attention cost.
Assumption O3. h (µ) = µ².
The next proposition proves a similar result to Proposition 3 for general voters
and p-consistent configurations.
Proposition O3. Under Assumptions 1, 2, O1, and O3, ap,χ,q ≥ ap,χ,q′ for any
p-consistent configuration χ and any two population functions q and q′ such that
q SOSD q′.
O.2 Competitive infomediaries
This appendix investigates an extension to competitive infomediaries. In the
environment laid out in Appendix O.1, suppose each type k ∈ K voter is served by
m (k) ≥ 2 infomediaries. A market segment is a pair (k, i), where k ∈ K represents
the type of the voters being served, and i ∈ {1, · · · ,m (k)} represents the serving
infomediary. The population of the voters in market segment (k, i) is ρ (k, i), where
ρ (k, i) > 0 and
\[
\sum_{i=1}^{m(k)} \rho(k, i) = q(k).
\]
The functions m and ρ are symmetric (i.e., m (k) = m (−k)
and ρ (k, i) = ρ (−k, i) for any k ∈ K and i = 1, · · · ,m (k)), and they are taken as
given throughout this appendix.
For any given symmetric policy profile 𝐚 = 〈−a, a〉 with a ∈ [0, ā], the signal for
market segment (k, i) maximizes the net expected utilities of the voters therein (as
in the standard RI model):
\[
\max_{\Pi} \; V(\Pi; \mathbf{a}, k) - \lambda \cdot I(\Pi).
\]
Across market segments, we consider all joint signal distributions that are symmetric
and consistent with the marginal signal distributions that solve the above problem
(hereafter c-consistency). As in Appendix O.1, we can represent a joint signal dis-
tribution by its matrix form and define the c-consistency of the configuration. This
exercise is omitted for brevity.
We examine the policy polarization effect of introducing perfect competition
between infomediaries. To facilitate comparison with the monopolistic personalized
case, we redefine p-consistency by first forming market segments using functions m
and ρ and then restricting voters of the same type to receiving the same voting
recommendation. By Lemma O1, equilibrium policies are fully determined by (1)
S ∈ {c, p}, which pins down marginal signal distributions, (2) the configuration χ,
and (3) population functions m and ρ. Hereafter we shall use ES,χ,m,ρ to denote the
equilibrium policy set and aS,χ,m,ρ to denote policy polarization.
The next proposition prescribes sufficient conditions for competition to reduce
policy polarization.
Proposition O4. Fix any functions m and ρ as above, and assume Assumptions 1, 2,
O1, and O2 for S ∈ {c, p}. Then Ec,χ,m,ρ = [0, ac,χ,m,ρ] ⊊ Ep,χ′,m,ρ = [0, ap,χ′,m,ρ] for
any c-consistent configuration χ and p-consistent configuration χ′ such that χ χ′.
Two forces are acting in the same direction. First, competitive signals maximize
voters’ expected utilities rather than their attention and hence are less Blackwell-
informative than monopolistic personalized signals. As infomediaries stop overfeeding
voters with information about the valence state, voters become more susceptible to
policy deviations, so their policy latitudes fall. Second, dividing voters of the same
type into multiple subgroups reduces the correlation between their signals. Such a
change enriches the configuration without affecting marginal signal distributions, so
its policy polarization effect is negative.
O.3 General state distribution
This appendix extends the analysis so far to general state distributions. In the
environment laid out in Appendix O.1, suppose the valence state is distributed on ℝ
according to a c.d.f. G that is absolutely continuous and symmetric around zero.²¹ A
signal structure is a mapping Π : ℝ → ∆ (Z), where each Π (· | ω) specifies a proba-
bility distribution over a finite set Z of signal realizations when the state realization
is ω ∈ ℝ.
²¹ Results below hold for discrete Gs, too. Assuming that ω ∈ ℝ is w.l.o.g. because RI voters
who care ultimately about the differential quality between the two candidates would only acquire
information about this single-dimensional random variable (Matejka and McKay (2015)).
Under signal structure Π,
\[
\pi_z = \int_{\omega \in \mathbb{R}} \Pi(z \mid \omega) \, dG(\omega)
\]
is the probability that the signal realization is z ∈ Z, and it is assumed w.l.o.g. to
be strictly positive. Then
\[
\mu_z = \int_{\omega \in \mathbb{R}} \omega \, \Pi(z \mid \omega) \, dG(\omega) \Big/ \pi_z
\]
is the posterior mean of the state conditional on the signal realization being z ∈ Z.
The next assumption is adapted from Matejka and McKay (2015).
Assumption O4. The needed amount of attention for consuming Π : ℝ → ∆ (Z) is
\[
I(\Pi) = H(G) - \mathbb{E}_{\Pi}\left[ H(G(\cdot \mid z)) \right],
\]
where H (G) is the entropy of the valence state, and H (G (· | z)) is the conditional
entropy of the valence state given signal realization z.
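The quantities πz, µz and I (Π) can be computed explicitly when the state is discrete, a case the footnote above says is covered. The sketch below is a minimal illustration, assuming a finite state space with Shannon entropy in nats; the function names, the two-state prior, and the signal likelihoods are hypothetical choices for this example.

```python
from math import log

def signal_statistics(states, prior, signal):
    """For each realization z, return (pi_z, mu_z): the probability of z and the
    posterior mean of the state given z. signal[z][i] is Pi(z | states[i])."""
    stats = {}
    for z, likelihood in signal.items():
        pi_z = sum(l * g for l, g in zip(likelihood, prior))
        mu_z = sum(w * l * g for w, l, g in zip(states, likelihood, prior)) / pi_z
        stats[z] = (pi_z, mu_z)
    return stats

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * log(p) for p in dist if p > 0)

def attention(prior, signal):
    """I(Pi) = H(G) - E[H(G(. | z))]: the expected reduction in entropy."""
    total = entropy(prior)
    for likelihood in signal.values():
        pi_z = sum(l * g for l, g in zip(likelihood, prior))
        posterior = [l * g / pi_z for l, g in zip(likelihood, prior)]
        total -= pi_z * entropy(posterior)
    return total

# Hypothetical binary state and a symmetric two-realization signal.
states = [-1, 1]
prior = [0.5, 0.5]
signal = {"L": [0.8, 0.2], "R": [0.2, 0.8]}

print(signal_statistics(states, prior, signal))
print(attention(prior, signal))
```

An uninformative signal, whose likelihoods do not vary with the state, leaves the posterior equal to the prior and therefore requires zero attention under this formula.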
In what follows, we’ll first give characterizations of optimal signals and then
examine their implications for policy polarization. To achieve the first goal, we fix, as
in Section 3.1, any symmetric policy profile 〈−a, a〉 with a ∈ [0, ā] and use ΠS (a, k)
to denote any optimal signal consumed by type k voters under segmentation
technology S ∈ {b, p}. When S = b, we drop the notation k and simply write Πb (a).
For each ΠS (a, k), we use ZS (a, k) to denote its support and µ^S_z (a, k) to denote the
posterior mean of the state conditional on the signal realization being z ∈ ZS (a, k).
The next proposition gives characterizations of optimal broadcast and personalized
signals, thus extending Lemma 1 and Theorem 1 to general state distributions.
Proposition O5. Fix any symmetric policy profile 〈−a, a〉 with a > 0, and assume
Assumptions O1 and O4. Then,
(i) any optimal personalized signal Πp (a, k) that is nondegenerate and makes type
k voters’ participation constraint binding must satisfy |Zp (a, k) | = 2, (SOB)
and the skewness properties stated in Theorem 1(ii);
(ii) any optimal broadcast signal Πb (a) that is nondegenerate, induces consumption
from all voters and makes some voter’s participation constraint binding must
satisfy |Zb (a) | ∈ {2, 3}:
(a) if |Zb (a) | = 2, then Πb (a) satisfies (SOB) and the skewness properties
stated in Theorem 1(i);
(b) if |Zb (a) | = 3, then we can write Zb (a) = {LL, LR, RR}, where
µ^b_{LL} (a) < 0, µ^b_{LR} (a) = 0, and µ^b_{RR} (a) = |µ^b_{LL} (a) | > 0. For any
k ∈ K, we must have v (a, k) + µ^b_{LL} (a) < 0,
sgn (v (a, k) + µ^b_{LR} (a)) = sgn (k), and v (a, k) + µ^b_{RR} (a) > 0.
With a continuum of states, the optimal broadcast signal can have three rather
than two signal realizations. Recall that when solving for the broadcast case, we
aggregate voters with binding participation constraints into a representative voter.
Under the current assumptions, only voters of the most extreme types can have
binding participation constraints, and the representative voter acting on their behalf
takes at most three final actions: LL, LR, RR (the first and second letters stand for
the voting decisions of the left-leaning and right-leaning voters, respectively). This
observation, together with the assumption that the attention cost function is strictly
Blackwell-monotone, implies that the optimal personalized signal for the
representative voter has at most three signal realizations.
The analysis of policy polarization is the same as before in the case of two signal
realizations. In the new case of three signal realizations, it can be shown that all
voters strictly obey the recommendations LL and RR, and that the posterior mean
of the state given LR must equal zero. As argued in the footnote below, this implies
that the only symmetric policy profile that can arise in equilibrium is 〈0, 0〉, hence
the transition from broadcast to personalized information aggregation must increase
policy polarization.²²
²² For any symmetric policy profile 〈−a, a〉 with a > 0, the deviation to a′ = 0 weakly increases
candidate R’s winning probability when the recommendation profile is either LL or RR (Lemma
O2), and it strictly increases candidate R’s winning probability when the recommendation profile
is LR (obviously). In contrast, no deviation from 〈0, 0〉 increases candidate R’s winning probability
when the recommendation profile is LL or RR (Lemma O2) or LR (obviously).
O.4 Proofs
Proof of Lemma O1 Fix any segmentation technology S, symmetric policy profile
〈−a, a〉 with a ≥ 0, 〈S, a〉-consistent signal distribution 〈χ,b+,b−〉, and population
function q. Let 𝐪 denote the |K|-column vector that compiles the populations of
voters −K, · · · , K. Let the default be the strictly obedient outcome induced by the
joint signal distribution.
Define two matrix operations. First, for any C ⊆ K, let χC be the resulting
matrix from replacing every row k ∈ C of χ with a row of all ones. Second, for
any matrix A, let ⌊A⌉ be the resulting matrix from rounding the entries of A, i.e.,
replacing those entries above 1/2 with 1 and those below 1/2 with zero. By definition,
the row vector ⌊𝐪⊤χ⌉ compiles candidate R’s default winning probabilities across the
voting recommendation profiles that occur with strictly positive probabilities, and
(⌊𝐪⊤χ⌉b+ + ⌊𝐪⊤χ⌉b−)/2 is candidate R’s default winning probability in expectation.
After candidate R commits a unilateral deviation from 〈−a, a〉 that attracts a set C ⊆
K of voters without affecting anything else, his winning probability vector becomes
⌊𝐪⊤χC⌉, and his expected winning probability becomes (⌊𝐪⊤χC⌉b+ + ⌊𝐪⊤χC⌉b−)/2.
Since ⌊𝐪⊤χC⌉ ≥ ⌊𝐪⊤χ⌉ and all elements of b+ and b− are strictly positive, the
deviation strictly increases candidate R’s winning probability in expectation if and
only if it does so under some voting recommendation profile, i.e.,
(⌊𝐪⊤χC⌉b+ + ⌊𝐪⊤χC⌉b−)/2 > (⌊𝐪⊤χ⌉b+ + ⌊𝐪⊤χ⌉b−)/2 if and only if ⌊𝐪⊤χC⌉ ≠ ⌊𝐪⊤χ⌉. The
last condition is equivalent to C being an influential coalition, and it depends on S,
〈−a, a〉, 〈χ,b+,b−〉, and q only through 〈χ, q〉.
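The criterion at the end of the proof lends itself to a direct computation of influential coalitions from 〈χ, q〉. The sketch below is a hedged illustration: rows are indexed 0, ..., |K|−1 rather than by type, populations are normalized to sum to one, a vote share of exactly 1/2 is treated as a winning probability of 1/2 (an assumption about ties), and all function names and the three-type example are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def winning_probabilities(chi, q):
    """Rounded vote shares: candidate R's winning probability in each column."""
    half = Fraction(1, 2)
    shares = [sum(q[i] * chi[i][j] for i in range(len(q)))
              for j in range(len(chi[0]))]
    return [1 if s > half else 0 if s < half else half for s in shares]

def is_influential(chi, q, coalition):
    """Replace the coalition's rows with ones and compare the rounded vectors."""
    chi_c = [[1] * len(row) if i in coalition else list(row)
             for i, row in enumerate(chi)]
    return winning_probabilities(chi_c, q) != winning_probabilities(chi, q)

def influential_coalitions(chi, q):
    """Enumerate all nonempty sets of rows that form an influential coalition."""
    rows = range(len(q))
    return [set(c) for size in range(1, len(q) + 1)
            for c in combinations(rows, size) if is_influential(chi, q, c)]

q = [Fraction(1, 3)] * 3  # three equally sized voter types (assumption)
chi_broadcast = [[0, 1], [0, 1], [0, 1]]
chi_personalized = [[0, 1, 0, 0, 1, 1, 0, 1],
                    [0, 0, 1, 0, 1, 0, 1, 1],
                    [0, 0, 0, 1, 0, 1, 1, 1]]

print(influential_coalitions(chi_broadcast, q))
print(len(influential_coalitions(chi_personalized, q)))
```

In this example only majority coalitions are influential under the broadcast configuration, whereas every nonempty coalition is influential under the conditionally independent one. Exact fractions are used so that comparisons with 1/2 are not distorted by floating-point error.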
Proof of Lemma O2 Replacing left-leaning voters (of type k = −1) with any type
k < 0 voter and right-leaning voters (of type k = 1) with any type k > 0 voter in the
proof of Lemma 4 gives the desired result.
Lemma O3. Let everything be as in Theorem O1. Then for any k ∈ {0, · · · , K} and
any D ⊆ {−k, · · · , k} such that D ∩ {−k, k} ≠ ∅, we must have ξS (D) > t (k) and
[t (k) , ā] ∩ ΞS (D) = [t (k) , ξS (D)].
Proof. Fix any k and D as above. Recall that
\[
\Xi^{S}(D) := \left\{ a \ge 0 : \max_{a' \in A} \phi^{S}(-a, a', D) \le 0 \right\},
\]
where φS (−a, a′,D) := min_{k′∈D} φS (−a, a′, k′). Let t(D) denote the image of D under
the mapping t, and write D̂ for [min t (D) ,max t (D)]. By Assumption O1 (inverted
V-shape), we can restrict attention to deviations to a′ ∈ D̂, i.e.,
\[
\Xi^{S}(D) = \left\{ a \ge 0 : \max_{a' \in \hat{D}} \phi^{S}(-a, a', D) \le 0 \right\}.
\]
Fix the policy profile to be 〈−t (k) , t (k)〉, and take any a′ ∈ D. From Assumption
O1 and (SOB), it follows that a′ doesn’t attract type k voters: