The Politics of Personalized News Aggregation

Personalized Information Aggregation and Polarization

Lin Hu∗ Anqi Li† Ilya Segal‡

This Draft: April 2022

Abstract

We study how personalized information aggregation for rationally inatten-

tive voters (IARI) affects the polarization of policies and public opinion. In

a two-candidate electoral competition model, an attention-maximizing info-

mediary aggregates source data about candidates’ valences into easy-to-digest

information. Voters decide whether to consume information, trading off the ex-

pected gain from improved expressive voting against the attention cost. IARI

generates policy polarization even if candidates are office-motivated. Person-

alized information aggregation makes extreme voters the disciplining entity of

policy polarization, and the skewness of their signals is crucial for sustaining a

high degree of policy polarization in equilibrium. Analysis of disciplining voters

yields insights into the polarization effects of regulating infomediaries. Applying our theory to the study of a Hotelling duopoly model shows that IARI renders the principle of minimum differentiation invalid even in the absence of price competition, and that endogenizing firms’ locations makes the welfare consequences of regulating infomediaries less clear-cut.

Keywords: rational inattention, electoral competition, polarization, infomediary regulation, Hotelling duopoly, principle of minimum differentiation

JEL codes: D72, D80, L10

∗Research School of Finance, Actuarial Studies and Statistics, Australian National University, [email protected].

†Olin Business School, Washington University in St. Louis, [email protected].

‡Department of Economics, Stanford University, [email protected].

arXiv:1910.11405v13 [econ.GN] 10 Apr 2022


1 Introduction

Recently, the idea that tech-enabled information personalization could affect polarization has been put forward in academia and the popular press (Sunstein (2009);

Pariser (2011); Gentzkow (2016)). This paper studies how personalized information

aggregation for rationally inattentive voters affects the polarization of policies and

public opinion in an electoral competition model.

Our premise is that rational demand for information aggregation in the digital era

is driven by information processing costs. As the Internet and social media become

important sources of information, and the amount of available information there (2.5

quintillion bytes) is vastly greater than what any individual can process in a lifetime,

decision-makers must turn to infomediaries for information aggregation, personalized

based on their individual data such as demographic and psychographic attributes,

digital footprints, and social network positions.1 In this paper, we abstract from

the issue of data generation (e.g., original reporting), focusing instead on the role of

infomediaries in aggregating source data into information that is easy to process and

useful for the target audience.

We develop a model of information aggregation for rationally inattentive decision-

makers (IARI), in which an infomediary can flexibly aggregate source data into in-

formation using algorithm-driven systems. While flexibility is also assumed in the

1An infomediary is an internet company that gathers and links information on particular subjects on behalf of its customers. Prominent examples of infomediaries include news aggregators (e.g., aggregator sites, social media feeds, mobile news apps), which operate by sifting through a myriad of online sources and discovering stories that readers might find interesting. The snippets (i.e., headlines and excerpts) displayed on their platforms contain coarse information and do not always result in the click-through of original contents (Dellarocas et al. (2016)). A major revenue source for them comes from displaying ads to users while the latter are browsing through snippets (an exception is Google News, which directs readers to the main Google search engine where product ads are displayed).

News aggregators have recently gained prominence as more people get information online, from social media, and through mobile devices (Matsa and Lu (2016)). The top three popular news websites in 2019 (Yahoo! News, Google News, and Huffington Post) are all aggregators. See Athey, Mobius, and Pal (2021) for background reviews.


Rational Inattention (RI) model pioneered by Sims (1998) and Sims (2003), there

decision-makers can aggregate information optimally themselves and so have no need

for external aggregators. To model the demand for infomediaries, we assume that

decision-makers can only choose whether to absorb the information offered to them

but cannot digest information partially or selectively, let alone aggregate information

optimally themselves. While this assumption is certainly stylized, it is the simplest

one that creates a role for infomediaries.2

If choosing to consume information, a decision-maker incurs an attention cost

that is posterior separable (Caplin and Dean (2013)) while deriving utilities from

improved decision-making. Consuming information is optimal if the expected utility

gain exceeds the attention cost. As for the infomediary, we assume that its goal is

to maximize the total amount of attention paid by decision-makers, interpreted as

the advertising revenue generated from consumer eyeballs. This stylized assumption

captures the key trade-off faced by the infomediary, who uses useful and easy-to-

process information to attract decision-makers’ attention while preventing them from

tuning out. While we focus on the case of a monopolistic infomediary in order to

capture the market power wielded by tech giants, we also investigate an extension

to perfectly competitive infomediaries which, together with personalization, becomes

equivalent to decision-makers optimally aggregating information themselves as in the

standard RI model.

We embed the IARI model into an electoral competition game in which two office-

motivated candidates choose policies on a left-right spectrum. Voters vote expres-

sively based on policies, as well as an uncertain valence state about which candidate

2As pointed out by Stromberg (2015), the last assumption is implicitly made by the media literature because without it the role of information provider would be much more limited. It isn’t at odds with reality, since analyses of page activities (e.g., scrolling, viewport time) have established significant user attention in the reading of online news, in particular the snippets thereof (Dellarocas et al. (2016); Lagun and Lalmas (2016)).


is more fit for office. Information about candidate valence is provided by an infome-

diary. We study how IARI affects the polarization of equilibrium policies and voter

opinions in this game.

A consequence of IARI is that signal realizations prescribe recommendations as

to which candidate one should vote for. Indeed, any information beyond voting rec-

ommendations would only raise the attention cost without any corresponding benefit

for voters and would thus turn away voters whose participation constraints bind at

the optimum. Furthermore, voters must strictly prefer to obey the recommendations given to them, a property we refer to as strict obedience. Indeed, if a voter has

a (weakly) preferred candidate that is independent of his voting recommendations,

then he could always vote for that candidate without paying attention, which saves

the attention cost.

An important implication of strict obedience is that local deviations from a sym-

metric policy profile wouldn’t change voters’ voting decisions regardless of the recom-

mendations they receive, which suggests that a positive degree of policy polarization

could arise in equilibrium even if candidates are office-motivated. We define pol-

icy polarization as the maximal distance between candidates’ positions among all

symmetric perfect Bayesian equilibria. In the baseline model featuring left-leaning,

centrist, and right-leaning voters, our main theorem shows that policy polarization is

strictly positive and equals the disciplining voter’s policy latitude.

A voter’s policy latitude is an index that captures his resistance to candidates’

policy deviations. It decreases with the voter’s horizontal preference for the deviating

candidate’s policies and increases with his pessimism about the deviating candidate’s

valence following unfavorable information. A voter is said to be disciplining if his

policy latitude determines policy polarization. To illustrate how personalized infor-

mation aggregation affects the identity of the disciplining voter, we compare two


cases: (i) broadcast information aggregation, in which the infomediary must offer a

single signal structure to all voters, and (ii) personalized information aggregation, in

which the infomediary can design different signal structures for different voters. In the

broadcast case, all voters receive the same voting recommendation, so a candidate’s

deviation is profitable, i.e., strictly increases his winning probability, if and only if

it attracts a majority coalition. Under the usual assumptions, this is equivalent to

attracting centrist voters, who are therefore disciplining. In the personalized case, the

infomediary can provide conditionally independent signals to different voters (this as-

sumption will be relaxed), so each type of voter is pivotal with a positive probability

when voters’ population distribution is sufficiently dispersed. In that case, a policy

deviation is shown to be profitable if and only if it attracts any type of voter, and

voters with the smallest policy latitude are disciplining because they are the easiest

to attract.

The skewness of extreme voters’ personalized signals is crucial for sustaining a

greater degree of policy polarization as information aggregation becomes personalized.

To maximize the usefulness of information consumption for an extreme voter, the

recommendation to vote across party lines must be very strong and, in order to

prevent the voter from tuning out, must also be very rare (hereafter an occasional

big surprise). Most of the time, the recommendation is to vote along party lines

(hereafter a predisposition reinforcement), which together with the occasional big

surprise has been documented in the empirical literature.3 When base voters are

disciplining, the occasional big surprise of their signal makes them difficult to attract

in the rare event where information is unfavorable to their own-party candidate.

3Recently, Flaxman, Goel, and Rao (2016) find that the use of news aggregators mostly reinforces people’s predispositions, but it also strengthens their opinion intensities when supporting the opposite-party candidates (i.e., occasional big surprise). Evidence for predisposition reinforcement is discussed in Fiorina and Abrams (2008) and Gentzkow (2016). Evidence for occasional big surprise and, more generally, Bayesian voters is surveyed by DellaVigna and Gentzkow (2010).


When base voters are so pessimistic about their own-party candidate’s valence that

even the most attractive deviation to them is still not attractive enough, opposition

voters become disciplining, even though they are also difficult to attract due to their

preferences against the deviating candidate’s policies. If, in the end, all voters end

up having bigger policy latitudes than the centrist voters in the broadcast case, then

the personalization of information aggregation increases policy polarization. We find

this to be the case when the attention cost is Shannon entropy-based, the attention

cost parameter is large, and extreme voters have strong policy preferences.

Analyses of the disciplining voter yield insights into the policy polarization effect

of recent regulatory proposals to tame the tech giants. In addition to the personaliza-

tion of information aggregation—the reversal of which is a plausible consequence of

limiting tech companies’ access to users’ personal data (The General Data Protection

Regulation (2016); Warren (2019))—we study the consequences of introducing perfect

competition to infomediaries. This regulatory proposal is advocated by the British

government as a preferable way of regulating tech giants (The Digital Competition

Expert Panel (2019)), and it is mathematically equivalent to increasing voters’ at-

tention cost parameter in the monopolistic personalized case. Its policy polarization

effect is negative, because increasing the attention cost parameter tempers voters’

beliefs about candidate valence and so reduces their policy latitudes.

Our analysis suggests that factors carrying negative connotations in our everyday

discourse could have unintended consequences for policy polarization. An example is

increasing mass polarization, which we model as a mean-preserving spread of voters’

policy preferences (Gentzkow (2016)). With personalized information aggregation,

increasing mass polarization can reduce policy polarization rather than increasing it:

as we keep redistributing voters’ population from the center to the margin, policy

polarization would eventually decrease from the centrist voters’ policy latitude to the


minimal policy latitude among all voters.

In Online Appendix O.1, we extend the baseline model to encompass general voters

and arbitrary correlation structures between their personalized signals. We develop a

methodology for analyzing this general model and discover, among other things, that

(i) correlation can only increase polarization, and (ii) polarization is minimized under

the uniform population distribution and conditionally independent signal distribution.

Thus our baseline result prescribes the exact lower bound for the polarization effect

of information personalization, while factors that preserve this lower bound (e.g.,

enriching voters’ types, dividing the same type of voters into multiple subgroups)

wouldn’t render polarization trivial.

We apply our theory to the study of horizontal product differentiation with person-

alized information aggregation in a Hotelling duopoly model. Among other things,

we find that IARI renders the principle of minimum differentiation (as posited by

Hotelling (1929)) invalid even in the absence of price competition, and that the wel-

fare consequences of regulating infomediaries become less clear-cut once firms’ loca-

tion choices become endogenous.

1.1 Related literature

Rational inattention The literature on rational inattention pioneered by Sims

(1998) and Sims (2003) assumes that decision-makers can optimally aggregate source

data into signals themselves. To create a role for infomediaries, we assume that the

aggregator is designed and operated by an attention-maximizing infomediary, whereas

voters must fully absorb the information given to them. Apart from this departure

from the RI paradigm, we otherwise follow the standard model of posterior-separable

attention cost that nests Shannon entropy as a special case. Posterior separability


(Caplin and Dean (2013)) has recently received attention from economists because of

its axiomatic and revealed-preference foundations (Zhong (2019); Caplin and Dean

(2015)), connections to sequential sampling (Morris and Strack (2017); Hebert and

Woodford (2018)), and validations by lab experiments (Dean and Neligh (2019)).

Filtering bias The idea of filtering bias—i.e., even rational consumers can ex-

hibit a preference for biased information when constrained by information processing

capacities—dates back to Calvert (1985a) and is later expanded on by Suen (2004),

Burke (2008), and Che and Mierendorff (2019), among others. While these models

predict a predisposition reinforcement, they work with non-RI information aggre-

gation technologies and do not examine the consequences of information bias for

electoral competition. Even if they did (see, e.g., Chan and Suen (2008)), their pre-

dictions could still differ from ours, as we will soon explain.

Probabilistic voting models In most existing probabilistic voting models, voters’

signals are assumed to be continuously distributed, so even small changes in candi-

dates’ positions could affect voters’ voting decisions (see Duggan (2017) for a survey).

Under this assumption, Calvert (1985b) first establishes policy convergence between

office-seeking candidates and then pioneers the use of policy preference for gener-

ating policy polarization between candidates (hereafter the Calvert-Wittman logic).

Strict obedience stands in sharp contrast to this assumption, although it is a natural

consequence of IARI.

There is a growing literature on platform competition with personalized informa-

tion. The current work differs from the existing studies in two main aspects. First,

the information structures of our interest and their properties are new to the liter-

ature. Second, we embed the analysis in a plain probabilistic voting model (akin


to the first model of Calvert (1985b) and many others surveyed by Duggan (2017)),

where candidates are office-motivated and the only source of uncertainty is their va-

lence shock. Together, these modeling choices generate new predictions that even the

closest works to ours ignore.4

Chan and Suen (2008) study a model of personalized media in which voters care

about whether the realization of a random state variable is above or below their

personal thresholds, and information is provided by media outlets that partition the

state space using threshold rules. A consequence of working with this information

aggregation technology rather than IARI is that signal realizations are monotone in

voters’ thresholds (i.e., if a left-leaning voter is recommended to vote for candidate

R, then a right-leaning voter must receive the same recommendation), hence centrist

voters are always disciplining despite a pluralism of media.5 We instead predict that

the disciplining voter can vary with model primitives and will discuss the implications

of this prediction in Section 6.

More recently, Matejka and Tabellini (2020) and Yuksel (2022) study electoral

competition models with personalized information acquisition.6 In the model studied

by Matejka and Tabellini (2020), voters face normal uncertainties about candidates’

policies that do not directly enter their utility functions. Information acquisition

takes the form of variance reduction, generating signals that violate strict obedience

and sustain policy polarization only if the cost of information acquisition differs across

candidates. The current work differs from that of Matejka and Tabellini (2020) in the

4Loosely related works include those assuming exogenous signal structures and those in which signals are disclosed strategically by campaigning candidates (see, e.g., Glaeser, Ponzetto, and Shapiro (2005) and Herrera, Levine, and Martinelli (2008)).

5Starting from there, the analysis of Chan and Suen (2008) differs completely from ours. In particular, it exploits the Calvert-Wittman logic, which we do not rely on.

6The flexibility in choosing among a large variety of signal structures is crucial for these studies and ours but is absent from existing political models with rigid information acquisition, e.g., Martinelli (2006).


source of uncertainty, the attention technology, and the driving force behind policy

polarization. Yuksel (2022) studies a variant of the Calvert-Wittman model where

voter learning takes the form of partitioning a multi-dimensional issue space. Aside

from these modeling differences that set our reasoning apart,7 none of our predictions (as previewed in the introduction) can be made in the model of Yuksel (2022).

Competition for consumers with limited attention A growing literature in IO

and marketing (surveyed by Spiegler (2016) and Iyer, Soberman, and Villas-Boas (2005))

studies the competition between firms for consumers with limited attention (LA).

Most papers we are aware of model LA as limited consideration sets, and they focus

on price and quality competitions rather than spatial competition. Notable exceptions

include Matejka and McKay (2012) studying the price competition between firms for

RI consumers; Sauer, Schlatterer, and Schmitt (2019) studying a Hotelling duopoly

model with LA consumers (who can only observe positions in subintervals of the real

line); and Perego and Yuksel (2022) studying the spatial competition between media

companies for RI news consumers located on a Salop circle.

The remainder of the paper proceeds as follows: Section 2 introduces the baseline

model; Section 3 conducts equilibrium analyses; Section 4 reports extensions of the

baseline model; Section 5 gives a further application of our theory; Section 6 con-

cludes. See Appendices A-C and the online appendices for additional materials and

mathematical proofs.

7In particular, Yuksel (2022)’s reasoning exploits the multi-dimensionality of the issue space and the Calvert-Wittman logic. Our results hold regardless of the dimensionality of the underlying state (see Footnote 21), and they do not exploit the Calvert-Wittman logic.


2 Baseline model

In this section, we first lay out the model setup and then discuss the main assumptions.

2.1 Setup

Two office-motivated candidates, L and R, choose policies on the real

line. They face a unit mass of infinitesimal voters who are left-leaning (k = −1),

centrist (k = 0), or right-leaning (k = 1). Each type k ∈ K = {−1, 0, 1} of voter has a population q(k) > 0 and values a policy a ∈ ℝ by u(a, k) = −|t(k) − a|. The environment is symmetric, in that q(1) = q(−1) and t(1) > t(0) = 0 > t(−1) = −t(1).

Thus a centrist voter is also a median voter.

At the end of the game, society holds an election decided by simple majority rule, with ties broken evenly between the two candidates. During the

election, each voter must vote expressively for one of the two candidates. For any

given profile a = 〈a_L, a_R〉 ∈ ℝ² of positions, a type k voter earns the following utility difference from voting for candidate R rather than L:

\[ v(a, k) + \omega. \]

In the above expression,

\[ v(a, k) = u(a_R, k) - u(a_L, k) \]

captures the voter’s differential valuation of the candidates’ policies, whereas ω is an

uncertain valence state about which candidate is more fit for office. In the baseline

model, ω takes values in Ω = {−1, 1} with equal probability,8 so its prior mean equals zero. Online Appendix O.3 examines the case of a continuum of states.

When casting votes, voters observe candidates’ policies but do not directly observe the realization of the valence state. Information about the valence state is modeled as a finite signal structure (or simply signal) Π : Ω → ∆(Z), where each Π(· | ω) specifies a probability distribution over a finite set Z of signal realizations conditional on the state realization being ω ∈ Ω. Information is provided by a monopolistic infomediary who is equipped with a segmentation technology S. S is a partition of voters’ types, and each cell of it is called a market segment. The infomediary can distinguish between voters from different market segments but not those within the same market segment. Our focus is on the coarsest and finest partitions, named the broadcast technology b = {K} and the personalized technology p = {{k} : k ∈ K}, respectively: the former cannot distinguish between the various types of voters at all, whereas the latter can do so perfectly.

Under segmentation technology S ∈ {b, p}, the infomediary designs |S| signals, one for each market segment. Within each market segment, voters decide whether to consume the signal that is offered to them. Consuming a signal Π means fully absorbing its information content. Doing so incurs an attention cost λ · I(Π), where λ > 0 is called the attention cost parameter, and I(Π) is the amount of attention needed for absorbing the information content of Π.9 After that, voters observe signal realizations, update their beliefs about the valence state, and cast votes. The infomediary’s profit equals the total amount of attention paid by voters.

8E.g., in the ongoing debate about how to battle terrorism, ω = −1 if the state favors the use of soft power (e.g., diplomatic tactics), and ω = 1 if the state favors the use of hard power (e.g., military preemption). Candidates L and R are experienced with using soft and hard power, respectively, and whoever is more experienced with handling the circumstances has an advantage over his opponent.

9According to Prat and Stromberg (2013), instrumental voting is an important motive for consuming political information.

The game sequence is summarized as follows.

1. The infomediary designs signal structures; voters observe the signal structures offered to them and make consumption decisions accordingly.

2. Candidates choose policies without observing the moves in Stage 1.

3. The valence state is realized.

4. Voters observe policies and signal realizations before casting votes.

We adopt perfect Bayesian equilibrium (PBE) as the solution concept. Our goal is

to characterize all PBEs where candidates propose symmetric policy profiles of the form 〈−a, a〉 with a ≥ 0 in Stage 2 of the game.
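The primitives of the setup can be sketched in a few lines of Python. This is only an illustration: the ideal points and population shares below are hypothetical numbers chosen to satisfy the symmetry conditions, and the names `policy_utility`, `policy_difference`, `BROADCAST`, and `PERSONALIZED` are ours rather than the paper's.

```python
TYPES = (-1, 0, 1)  # left-leaning, centrist, right-leaning

# Hypothetical numbers chosen only to satisfy the symmetry conditions in the text.
IDEAL_POINT = {-1: -2.0, 0: 0.0, 1: 2.0}  # t(k), with t(-1) = -t(1) and t(0) = 0
POPULATION = {-1: 0.3, 0: 0.4, 1: 0.3}    # q(k) > 0, with q(-1) = q(1)


def policy_utility(a, k):
    """u(a, k) = -|t(k) - a|."""
    return -abs(IDEAL_POINT[k] - a)


def policy_difference(a_left, a_right, k):
    """v(a, k) = u(a_R, k) - u(a_L, k): type k's policy gain from R over L."""
    return policy_utility(a_right, k) - policy_utility(a_left, k)


# Segmentation technologies as partitions of the type set.
BROADCAST = [set(TYPES)]             # b = {K}: one segment containing every type
PERSONALIZED = [{k} for k in TYPES]  # p = {{k} : k in K}: one segment per type

# Under a symmetric profile <-a, a>, the centrist is indifferent on policy grounds,
# while extreme voters favor their own-party candidate.
for k in TYPES:
    print(k, policy_difference(-0.5, 0.5, k))
```

Under these illustrative numbers, the infomediary designs `len(BROADCAST)` signals with the broadcast technology and `len(PERSONALIZED)` signals with the personalized one, matching the |S| signals described above.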

2.2 Discussion of Assumptions

Attention cost We state our assumption about the attention cost function. Recall

that a signal structure Π : Ω → ∆ (Z) specifies how source data about the valence

state are (randomly) aggregated into the content indexed by the signal realizations

in Z. For each z ∈ Z, let

\[ \pi_z = \sum_{\omega \in \Omega} \Pi(z \mid \omega) / 2 \]

denote the probability that the signal realization is z, and assume without loss of

generality (w.l.o.g.) that πz > 0. Then

\[ \mu_z = \sum_{\omega \in \Omega} \omega \cdot \Pi(z \mid \omega) / (2\pi_z) \]

is the posterior mean of the valence state conditional on the signal realization being

z, and it fully captures one’s posterior belief after observing z. The next assumption

is standard in the RI literature.


Assumption 1. The needed amount of attention for consuming Π : Ω→ ∆ (Z) is

\[ I(\Pi) = \sum_{z \in Z} \pi_z \cdot h(\mu_z), \tag{1} \]

where h : [−1, 1]→ R+ (i) is strictly convex and satisfies h (0) = 0, (ii) is continuous

on [−1, 1] and twice differentiable on (−1, 1), and (iii) is symmetric around zero.

Equation (1) coupled with Assumption 1(i) is equivalent to weak posterior separa-

bility (WPS), a notion proposed by Caplin and Dean (2013) to generalize Shannon’s

entropy as a measure of attention cost. In the current setting, WPS stipulates that consuming null signals requires no attention, and that more attention is needed as the posterior belief moves closer to the true state and as the signal becomes more Blackwell-informative (hence attention is a scarce resource that reduces uncertainty about the valence state). Together with the regularities imposed by Assumption 1(ii)

and (iii), WPS is satisfied by many standard attention cost functions,10 and it will

be relaxed in Section 4.
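As a numerical illustration of Equation (1), the following Python sketch computes the probabilities π_z, the posterior means µ_z, and the attention I(Π) for a signal over the two equally likely states. The dictionary representation of a signal, the function names, and the example signals are assumptions made for this sketch. The entropy-based h is written as log 2 minus the binary entropy (in nats) so that h(0) = 0, as Assumption 1(i) requires.

```python
import math

STATES = (-1, 1)  # valence states, each with prior probability 1/2


def signal_moments(signal):
    """Return {z: (pi_z, mu_z)} for a signal given as {omega: {z: Pi(z | omega)}}."""
    realizations = set().union(*(signal[w].keys() for w in STATES))
    moments = {}
    for z in realizations:
        pi_z = sum(signal[w].get(z, 0.0) for w in STATES) / 2
        if pi_z > 0:  # drop realizations that never occur (w.l.o.g. in the text)
            mu_z = sum(w * signal[w].get(z, 0.0) for w in STATES) / (2 * pi_z)
            moments[z] = (pi_z, mu_z)
    return moments


def h_variance(mu):
    """Variance-reduction example: h(mu) = mu^2."""
    return mu ** 2


def h_entropy(mu):
    """Entropy-reduction example: log 2 minus the binary entropy at (1 + mu) / 2."""
    p = (1 + mu) / 2
    entropy = -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)
    return math.log(2) - entropy


def attention(signal, h):
    """I(Pi) = sum_z pi_z * h(mu_z), as in Equation (1)."""
    return sum(pi_z * h(mu_z) for pi_z, mu_z in signal_moments(signal).values())


# Hypothetical symmetric binary signal that reports the true state with probability 0.8.
informative = {-1: {"L": 0.8, "R": 0.2}, 1: {"L": 0.2, "R": 0.8}}
# Null signal: the same realization in both states, so the posterior mean stays at zero.
null = {-1: {"n": 1.0}, 1: {"n": 1.0}}

print(signal_moments(informative))
print(attention(informative, h_variance), attention(null, h_variance))
```

The null signal costs no attention under either h, while the informative signal moves the posterior means away from zero and therefore requires strictly positive attention.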

Modeling assumptions We discuss the main modeling assumptions. See Section

4 for minor assumptions and how they can be relaxed.

Consider first the infomediary, which for concreteness’ sake can be thought of as

a news aggregator. In Footnote 1, we already detailed the business model of news

aggregators. Here we repeat four noteworthy facts.

1. The content provided by news aggregators is usually very coarse, e.g., snippets

that include a title and a few summary sentences.

10Examples include the reductions in the variance and Shannon entropy of the valence state before and after information consumption, in which cases h(µ) = µ² and h(µ) = H(1/2) − H((1 + µ)/2) (H denotes the binary entropy function), respectively.


2. A major source of news aggregators’ revenue comes from displaying ads to users

while the latter are paying attention to the content.11

3. Modern news aggregators are operated by tech giants that wield significant

market power.

4. The algorithms behind their operations represent trade secrets that cannot be

easily reverse-engineered by nonusers (Eslami et al. (2015)).

We analyze the game between candidates, voters, and an infomediary. While our

game is certainly stylized, it captures some facets of reality and gives us tractability.

Motivated by Facts 2 and 3, we assume that a monopolistic infomediary maximizes

the total amount of attention paid by voters while preventing them from tuning out.

Our results remain qualitatively valid as long as the infomediary’s profit is a strictly

increasing function of voters’ attention.12 Online Appendix O.2 examines the case of

competitive infomediaries.

Motivated by Fact 4, we assume that candidates do not observe signal structures

when crafting policies.13 We do allow voters to observe the signal structures offered

to them, because according to computer scientists working on algorithm audit, the

most effective way to recover the algorithms used by tech giants is to survey users

(Eslami et al. (2015)).

Finally, we assume that policies are announced to voters at the voting stage.14 We

11See https://instapage.com/blog/facebook-in-stream-video-ads for Facebook’s tactics, such as playing multiple small mid-roll ads when users are already in the “lean-back” watching mode and so will absorb the ads together with the content.

12To see why, suppose the profit generated by a voter consuming Π equals J(I(Π)) for some strictly increasing function J : [0, 1] → R+. Then for any given set of voters whose participation constraints we wish to satisfy, the infomediary solves max_Π J(I(Π)) · (# of participating voters) or, equivalently, max_Π I(Π) · (# of participating voters), subject to voters’ participation constraints.

13We also prevent candidates from observing (signals of) the valence state when crafting policies. Relaxing this assumption wouldn’t affect the analysis, because any symmetric PBE of the current game remains a PBE of the augmented game, and any symmetric PBE of the augmented game in which candidates adopt fixed policy platforms must be a PBE of the current game.

14Since policies are certain objects, they can be observed at no attention cost once announced.

do not explicitly model the activities that make this happen (e.g., political advertising,

canvassing) but note that they typically take place right before the election day

(Gerber et al. (2011)). Given this, it is reasonable to assume that the design and

consumption of signals concerning an evolving state of the world (as in the case of

countering terrorism discussed in Footnote 8) are made without observing policies.

3 Analysis

3.1 Optimal signals

In this section, we fix any symmetric policy profile a = 〈−a, a〉 with a ≥ 0 and solve

for the signals that maximize the infomediary’s profit (hereafter optimal signals). To

facilitate discussions, we say that candidate L (resp. R) is the own-party candidate

of left-leaning (resp. right-leaning) voters.

Infomediary’s problem Under segmentation technology S ∈ {b, p}, any optimal signal for market segment s ∈ S solves

\[ \max_{\Pi} \; I(\Pi) \cdot D(\Pi; a, s) \tag{s} \]

where D (Π; a, s) denotes the demand for signal Π in market segment s under policy

profile a. To figure out D(·), note that since a voter could always vote for his own-

party candidate without consuming information, information consumption is useful

only if it sometimes convinces him to vote across party lines. After consuming Π,

a voter strictly prefers candidate R to L if v (a, k) + µz > 0, and he strictly prefers

candidate L to R if v(a, k) + µ_z < 0. Ex ante, the expected utility gain from consuming Π is

\[ V(\Pi; a, k) = \begin{cases} \sum_{z \in Z} \pi_z \, [v(a, k) + \mu_z]^{+} & \text{if } k \le 0, \\ \sum_{z \in Z} -\pi_z \, [v(a, k) + \mu_z]^{-} & \text{if } k > 0, \end{cases} \]

and the voter prefers to consume Π rather than to abstain (hereafter, his participation

constraint is satisfied) if

V (Π; a, k) ≥ λ · I(Π).

Therefore,

\[ D(\Pi; a, s) = \sum_{k \in K:\, V(\Pi; a, k) \ge \lambda \cdot I(\Pi)} \text{population of type } k \text{ voters in segment } s.{}^{15} \]
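To make the participation constraint and the demand function concrete, the following sketch evaluates V(Π; a, k) and D(Π; a, s) for a signal represented as a list of (πz, µz) pairs. It is an illustration under stated assumptions rather than part of the model: the function names, the numerical inputs, the convention [x]− = min{x, 0}, and the quadratic cost I(Π) = Σz πz µz² are all hypothetical choices made for this example (the text only requires a posterior-separable cost).

```python
def positive_part(x):
    return max(x, 0.0)


def negative_part(x):
    # Assumed convention: [x]^- = min(x, 0), so -pi * [x]^- is nonnegative.
    return min(x, 0.0)


def expected_gain(signal, v, k):
    """V(Pi; a, k) for a signal [(pi_z, mu_z), ...], where v stands for v(a, k)."""
    if k <= 0:
        return sum(pi * positive_part(v + mu) for pi, mu in signal)
    return sum(-pi * negative_part(v + mu) for pi, mu in signal)


def attention_cost(signal):
    """Assumed posterior-separable cost: sum_z pi_z * mu_z**2 (quadratic)."""
    return sum(pi * mu ** 2 for pi, mu in signal)


def demand(signal, segment, lam):
    """D(Pi; a, s): mass of types in the segment with V >= lam * I.

    `segment` maps type k -> (population mass, v(a, k)); hypothetical inputs.
    """
    cost = attention_cost(signal)
    return sum(
        mass
        for k, (mass, v) in segment.items()
        if expected_gain(signal, v, k) >= lam * cost
    )


# Symmetric binary signal: pi_L = pi_R = 1/2, mu_L = -0.6, mu_R = 0.6 (satisfies BP).
signal = [(0.5, -0.6), (0.5, 0.6)]
# Hypothetical segment containing all three voter types.
segment = {-1: (0.3, -0.4), 0: (0.4, 0.0), 1: (0.3, 0.4)}

print(demand(signal, segment, lam=0.5))  # only centrist voters consume: 0.4
```

With these numbers the extreme voters’ gain (about 0.1) falls short of λ · I(Π) = 0.18, so only the centrist mass is counted; lowering λ to 0.2 brings every type back in.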

Binary signal and strict obedience We demonstrate that any optimal signal

has at most two realizations and, if binary, prescribes voting recommendations that

its consumers strictly prefer to obey. To facilitate analysis, we say that a signal

realization z endorses candidate R and disapproves of candidate L if µz > 0, and

that it endorses candidate L and disapproves of candidate R if µz < 0. For binary

signals, we write Z = {L, R}. From Bayes’ plausibility, which mandates that the

expected posterior mean must equal the prior mean zero:

\[ \sum_{z \in Z} \pi_z \cdot \mu_z = 0, \tag{BP} \]

it follows that we can assume µL < 0 < µR w.l.o.g. In this way, we can interpret each

signal realization z ∈ {L, R} as an endorsement for candidate z and a disapproval of

candidate −z. In addition, we can define strict obedience as follows.

15 If a solution to Problem (s) has zero demand, then it will be treated as a degenerate signal. This convention rules out uninteresting situations in which the infomediary deters information consumption using nondegenerate signals.


Definition 1. A binary signal induces strict obedience from its consumers if the

latter strictly prefer the endorsed candidate to the disapproved one under both signal

realizations, i.e.,

v (a, k) + µL < 0 < v (a, k) + µR. (SOB)

Lemma 1. Under Assumption 1, the following hold for any symmetric policy profile

〈−a, a〉 with a ≥ 0.

(i) Any optimal personalized signal for any voter has at most two realizations.

(ii) Any optimal broadcast signal has at most two realizations.

(iii) Any optimal signal, if binary, induces strict obedience from its consumers.

Proof. Omitted proofs from the main text are gathered in Appendix B.

Lemma 1 is proven differently for the cases of personalized and broadcast infor-

mation aggregation. In the personalized case, our result follows from the fact that

individual voters make binary decisions and the attention cost function is strictly

Blackwell-monotone. Given this, any information beyond decision recommendations

would only raise the attention cost without any corresponding benefit for voters and

so would turn away voters whose participation constraints bind at the optimum. For

these voters, maximizing attention is equivalent to maximizing the usefulness of in-

formation consumption at the maximal attention level. Neither the assumption of

binary states nor that of posterior separability matters for this argument.

The broadcast case is proven by aggregating voters with binding participation

constraints into a representative voter. Under the assumption that voters’ policy

preferences exhibit increasing differences between policies and types, only extreme

voters’ participation constraints bind at the optimum, whereas centrist voters’ par-

ticipation constraint is slack. The resulting representative voter makes at most three


decisions: LL, LR, and RR (the first and second letters stand for the voting decisions

of the left-leaning voter and right-leaning voter, respectively), so the optimal signal

for him has at most three signal realizations. Then using the concavification method

developed by Aumann and Maschler (1995) and Kamenica and Gentzkow (2011), we

demonstrate that the optimal signal has at most two realizations LL and RR. The

proof exploits three assumptions: (i) binary states, (ii) posterior separability, and (iii)

the infomediary maximizes voters’ attention, which will be relaxed in Section 4.

Strict obedience (SOB) is an essential feature of optimal binary signals. Indeed,

if a consumer of a binary signal has a (weakly) preferred candidate that is indepen-

dent of his voting recommendations, then he would prefer to vote for that candidate

unconditionally without consuming the signal, because doing so saves the attention

cost without affecting the expected voting utility.

The next assumption imposes regularities on our problem. It makes the upcoming

analysis elegant and will be relaxed in Section 4.

Assumption 2. The following hold for any symmetric policy profile 〈−a, a〉 with

a ≥ 0, segmentation technology S ∈ {b, p}, and market segment s ∈ S.

(i) Any optimal signal for market segment s is nondegenerate and is consumed by

all voters therein.

(ii) The posterior means of the state induced by the signal in Part (i) lie in (−1, 1).

Roughly speaking, Assumption 2(i) holds if voters’ attention cost parameter λ

isn’t too high and their policy preferences aren’t too extreme, whereas Assumption

2(ii) holds if λ isn’t too low (see Appendix A for numerical examples). Together, these

assumptions imply that voters consume nondegenerate yet garbled signals about the

underlying state, which by Lemma 1 must be binary and hence satisfy (SOB). Indeed,


we require this to be true for any segmentation technology and policy profile, and

therefore refer to Assumption 2 as uniform strict obedience.

Assumptions 1 and 2(i) guarantee the uniqueness of optimal signals.

Lemma 2. For any symmetric policy profile 〈−a, a〉 with a ≥ 0, the optimal person-

alized signal for any voter is unique under Assumption 1, and the optimal broadcast

signal is unique under Assumptions 1 and 2(i).

The upcoming analysis maintains Assumptions 1 and 2 unless otherwise specified.

Skewness We examine the skewness of optimal signals. Since the underlying state

is binary, it is w.l.o.g. to identify any binary signal with the profile 〈µL, µR〉 of

posterior means.16 Two observations are immediate.

Observation 1. (i) 〈µL, µR〉 is more Blackwell-informative than 〈µ′L, µ′R〉 if |µz| ≥

|µ′z| ∀z ∈ {L, R} and at least one inequality is strict.

(ii) 〈µL, µR〉 endorses candidate z ∈ {L, R} more often than candidate −z, i.e.,

πz > π−z, if and only if |µz| < |µ−z|.
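As a small illustration of Observation 1(ii), the sketch below recovers the endorsement probabilities and the conditional likelihoods of a binary signal from its posterior means, using Bayes’ plausibility and Bayes’ rule. It assumes a binary state ω ∈ {−1, +1} with prior mean zero, so that each state has prior probability 1/2; the function name and the numbers are hypothetical.

```python
def signal_from_posterior_means(mu_L, mu_R):
    """Return (pi_L, pi_R, P(z=R | omega=+1), P(z=R | omega=-1)).

    Assumes omega in {-1, +1} with prior probability 1/2 each, and
    -1 <= mu_L < 0 < mu_R <= 1.
    """
    if not (-1.0 <= mu_L < 0.0 < mu_R <= 1.0):
        raise ValueError("need -1 <= mu_L < 0 < mu_R <= 1")
    # Bayes' plausibility: pi_L * mu_L + pi_R * mu_R = 0 with pi_L + pi_R = 1.
    pi_R = -mu_L / (mu_R - mu_L)
    pi_L = 1.0 - pi_R
    # Bayes' rule with P(omega=+1 | z=R) = (1 + mu_R) / 2 and P(omega=+1) = 1/2.
    p_R_given_high = pi_R * (1.0 + mu_R)
    p_R_given_low = pi_R * (1.0 - mu_R)
    return pi_L, pi_R, p_R_given_high, p_R_given_low


# A skewed signal: a weak endorsement of L and a strong endorsement of R.
pi_L, pi_R, p_high, p_low = signal_from_posterior_means(mu_L=-0.2, mu_R=0.8)
print(pi_L > pi_R)  # True: the weaker endorsement (of L) is issued more often
```

Because |µL| < |µR| in this example, the signal endorses candidate L more often, as Observation 1(ii) states, and averaging the two likelihoods over the states returns πR.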

For any given policy profile 〈−a, a〉, we use 〈µbL(a), µbR(a)〉 to denote the optimal

broadcast signal and 〈µpL(a, k), µpR(a, k)〉 to denote the optimal personalized signal for

type k voters.

Theorem 1. The following hold for any symmetric policy profile 〈−a, a〉 with a > 0.

(i) The optimal broadcast signal is symmetric, in that it endorses each candidate

with equal probability, and the endorsements shift voters’ beliefs by the same

magnitude, i.e., |µbL (a) | = µbR (a).

16 Indeed, we can back out the signal structure from 〈µL, µR〉 as follows: Π(z = R | ω = 1) = −µL(1 + µR)/(µR − µL) and Π(z = R | ω = −1) = −µL(1 − µR)/(µR − µL).


(ii) The following happen in the personalized case.

(a) The optimal signal for centrist voters is symmetric, i.e., |µpL (a, 0) | = µpR (a, 0).

(b) The optimal signal for any extreme voter is skewed, in that it endorses

the voter’s own-party candidate more often than his opposite-party candi-

date, although the endorsement for the opposite-party candidate is stronger

than that of the own-party candidate, i.e., |µpL(a,−1)| < µpR(a,−1) and

|µpL(a, 1)| > µpR(a, 1). Moreover, optimal signals are symmetric between left-

leaning and right-leaning voters, |µpL (a,−1) | = µpR (a, 1) and µpR(a,−1) =

|µpL(a, 1)|.

(iii) The optimal broadcast signal is less Blackwell-informative than the optimal per-

sonalized signal for centrist voters, i.e., |µbL(a)| < |µpL(a, 0)| and µbR(a) < µpR(a, 0).

Theorem 1(i) holds because the broadcast signal is designed for a representative

voter with a symmetric policy preference. To develop intuition for Theorem 1(ii),

recall that information consumption is useful for an extreme voter if and only if it

sometimes convinces him to vote across party lines. Since the corresponding signal

realization must move the posterior mean of the state far away from the prior mean,

it must occur with a small probability in order to prevent the voter from tuning

out (hereafter an occasional big surprise). Most of the time, the recommendation

is to vote for the own-party candidate, which by Bayes’ plausibility can only shift

his belief moderately (hereafter a predisposition reinforcement). Evidence for occa-

sional big surprise and predisposition reinforcement after the use of personalized news

aggregators has already been discussed in Footnote 3.

We finally turn to Theorem 1(iii). As demonstrated earlier, the optimal broadcast

signal is designed for a representative voter with a symmetric policy preference, and

yet the decision on whether to consume the signal is made by extreme voters who


prefer skewed signals to symmetric ones. Such a mismatch of preferences limits the

amount of attention the optimal broadcast signal can attract from any voter com-

pared to his personalized signal. This observation, together with symmetry, implies

that the optimal broadcast signal is less Blackwell-informative than centrist voters’

personalized signal.

3.2 Equilibrium policies

This section endogenizes candidates’ policy positions. Under segmentation technology

S ∈ {b, p}, a profile of signals and policies 〈−a, a〉 with a ≥ 0 can arise in a PBE if it

satisfies the following properties.

• The profile of signals is an |S|-dimensional random variable, where the marginal

probability distribution of each dimension s ∈ S solves Problem (s), taking

〈−a, a〉 as given.

• The policy position a maximizes candidate R’s winning probability, taking can-

didate L’s position −a, the profile of signals, voters’ consumption decisions, and

their voting strategies (as functions of actual policies and signal realizations) as

given.

We characterize all PBEs of the above form. Before proceeding, note that the

analysis so far has pinned down the marginal signal distribution for each market

segment but has left the joint signal distribution across market segments unspecified,

even though the latter clearly affects candidates’ strategic reasoning. In what follows,

we will first assume that signals are conditionally independent across market segments.

Later in Section 4, we will consider all joint signal distributions that are consistent

with the marginal signal distributions solved in Section 3.1.


Key concepts We develop concepts that are key to the upcoming analysis, holding any

segmentation technology S ∈ {b, p} and voter population distribution q fixed. We first

describe how a unilateral deviation from a symmetric policy profile can affect voters’

voting decisions. Due to symmetry, it suffices to consider candidate R’s deviation.

Definition 2. A unilateral deviation of candidate R from 〈−a, a〉 with a ≥ 0 to a′

attracts type k voters if it wins type k voters’ support even when their signal realization

disapproves of candidate R, i.e.,

v (−a, a′, k) + µSL (a, k) > 0.

It repels type k voters if it loses type k voters’ support even when their signal realiza-

tion endorses candidate R, i.e.,

v (−a, a′, k) + µSR (a, k) < 0.

Note that if a′ attracts (resp. repels) a voter, then it makes the voter vote for

(resp. against) candidate R unconditionally. If it neither attracts nor repels a voter,

then it has no effect on his voting decisions.
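Definition 2 and the remark that follows it amount to a three-way classification of a deviation for each voter type, which can be sketched as follows. The function name and the numerical inputs are hypothetical; `v_dev` stands for v(−a, a′, k), and `mu_L` and `mu_R` stand for the posterior means of the signal designed for the original profile 〈−a, a〉.

```python
def classify_deviation(v_dev, mu_L, mu_R):
    """Classify candidate R's deviation for one voter type (Definition 2)."""
    if v_dev + mu_L > 0:
        return "attracts"  # votes R even when the signal disapproves of R
    if v_dev + mu_R < 0:
        return "repels"    # votes L even when the signal endorses R
    return "neutral"       # keeps following the voting recommendation


# Hypothetical voter with posterior means mu_L = -0.3 and mu_R = 0.6.
print(classify_deviation(0.5, -0.3, 0.6))   # attracts
print(classify_deviation(-0.7, -0.3, 0.6))  # repels
print(classify_deviation(0.1, -0.3, 0.6))   # neutral
```

A deviation can never both attract and repel the same voter, since µL < 0 < µR makes the two strict inequalities mutually exclusive.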

We next construct an index called policy latitude that captures a voter’s resistance to candidate R’s deviations. For the broadcast case, we write υbz for the magnitude |µbz(a)|, evaluated at a = t(1), of voters’ belief given signal realization z ∈ {L, R} under policy profile 〈−t(1), t(1)〉. For the personalized case, we write υpz(k) for the magnitude |µpz(a, k)|, evaluated at a = |t(k)|, of type k voters’ belief given signal realization z ∈ {L, R} under policy profile 〈−|t(k)|, |t(k)|〉. Intuitively, υbL and υpL(k) capture voters’ pessimism about candidate R’s valence given unfavorable information. Given them, we can define policy latitudes as follows.


Definition 3. Let ξS(k) denote type k voters’ policy latitude under segmentation

technology S ∈ {b, p}. Define ξb(k) = −t(k) + υbL and ξp(k) = −t(k) + υpL(k).

By definition, a voter’s policy latitude decreases with his horizontal preference

for candidate R’s policies, and it increases with his pessimism about candidate R’s

valence given unfavorable information. Thus an increase in policy latitude makes the

voter more resistant to candidate R’s deviations.

We finally describe equilibrium outcomes. Let ES,q denote the set of nonnegative

policy a’s such that the symmetric policy profile 〈−a, a〉 can arise in equilibrium.

We are interested in the policy polarization aS,q = max ES,q, defined as the maxi-

mal symmetric equilibrium policy, and whether all policies between zero and policy

polarization can arise in equilibrium.

Definition 4. Type k voters are disciplining if their policy latitude determines policy

polarization, i.e., aS,q = ξS (k).

Equilibrium characterization The next theorem gives a full characterization of

the equilibrium policy set.

Theorem 2. For any segmentation technology S ∈ {b, p} and population distribution q, policy polarization is strictly positive, and all policies between zero and policy polarization can arise in equilibrium, i.e., aS,q > 0 and ES,q = [0, aS,q]. Moreover,

disciplining voters always exist, and their identities are as follows.

(i) In the broadcast case, centrist voters are always disciplining, i.e., ab,q = ξb (0)

∀q.

(ii) In the personalized case, centrist voters are disciplining if they constitute a majority of the population. Otherwise voters with the smallest policy latitude are disciplining, i.e.,

\[ a^{p,q} = \begin{cases} \xi^{p}(0) & \text{if } q(0) > 1/2, \\ \min_{k \in K} \xi^{p}(k) & \text{if } q(0) \le 1/2. \end{cases} \]
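The characterization in Theorem 2, combined with Definition 3, can be turned into a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the bliss points and belief magnitudes are made-up inputs rather than solutions of the infomediary’s problem.

```python
def policy_latitude(t_k, upsilon_L):
    """xi(k) = -t(k) + upsilon_L, as in Definition 3."""
    return -t_k + upsilon_L


def polarization(technology, q0, t, upsilon_b_L, upsilon_p_L):
    """Policy polarization a^{S,q} as characterized in Theorem 2.

    technology  -- "b" (broadcast) or "p" (personalized)
    q0          -- population share of centrist voters, q(0)
    t           -- dict: type k -> bliss point t(k)
    upsilon_b_L -- broadcast belief magnitude after an L-endorsement
    upsilon_p_L -- dict: type k -> personalized belief magnitude after an L-endorsement
    """
    if technology == "b":
        return policy_latitude(t[0], upsilon_b_L)
    if technology != "p":
        raise ValueError("technology must be 'b' or 'p'")
    latitudes = {k: policy_latitude(t[k], upsilon_p_L[k]) for k in t}
    if q0 > 0.5:
        return latitudes[0]
    return min(latitudes.values())


# Hypothetical inputs for three voter types k in {-1, 0, 1}.
t = {-1: -0.2, 0: 0.0, 1: 0.2}
upsilon_p_L = {-1: 0.35, 0: 0.6, 1: 0.9}
upsilon_b_L = 0.5

broadcast = polarization("b", 0.4, t, upsilon_b_L, upsilon_p_L)
centrist_majority = polarization("p", 0.6, t, upsilon_b_L, upsilon_p_L)
dispersed = polarization("p", 0.4, t, upsilon_b_L, upsilon_p_L)
print(broadcast, centrist_majority, dispersed)
```

With these inputs the latitudes are roughly 0.55, 0.6, and 0.7 for types −1, 0, and 1, so left-leaning voters are disciplining once centrist voters stop being a majority.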

The main intuition behind Theorem 2 is as follows. When voters’ population dis-

tribution is sufficiently dispersed, personalized information aggregation allows can-

didates to benefit from attracting extreme voters in addition to attracting centrist

voters. Since voters with the smallest policy latitude are most susceptible to policy

deviations and hence constitute the easiest target of a deviating candidate, their pol-

icy latitude—which captures their resistance to policy deviations—determines equi-

librium policy polarization. Regardless of whether information aggregation is per-

sonalized or not, policy polarization is strictly positive mainly due to uniform strict

obedience: under that assumption, local deviations from a symmetric policy profile

wouldn’t change voters’ voting decisions, which suggests that a positive degree of

policy polarization could arise in equilibrium.

Proof sketch We proceed in three steps.

Broadcast case. In this case, all voters consume the same signal and so form the

same belief about candidates’ valences. Thus the standard median voter theorem

logic holds, namely a deviation of candidate R is profitable, i.e., strictly increases his

winning probability, if and only if it attracts centrist voters. Formally (no further proof is required),

Lemma 3. In the broadcast case, a policy profile 〈−a, a〉 with a ≥ 0 can arise in

equilibrium if and only if no deviation of candidate R to any a′ ∈ ℝ attracts centrist

voters.


Personalized case. In this case, Lemma 3 remains valid if centrist voters constitute a

majority of the population. Otherwise no type of voter alone forms a majority coali-

tion, and a deviation is profitable if it attracts any type of voter, holding other things

constant. The reason is pivotality: since the infomediary can now offer conditionally

independent signals to different types of voters, the above deviation strictly increases

candidate R’s winning probability when the remaining voters disagree about which

candidate to vote for.
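The pivotality argument can be checked by brute-force enumeration in a small example. The sketch below is a stylized illustration, not the model itself: it assumes three voter types, one signal realization per type, obedient voting, equally likely states, and a fair coin toss in case of a tie. The population shares and likelihoods are hypothetical numbers chosen to be symmetric between left-leaning and right-leaning voters.

```python
from itertools import product


def win_probability(q, p_R_given_state, attracted=()):
    """Candidate R's winning probability under conditionally independent signals.

    q               -- dict: type k -> population mass (masses sum to 1)
    p_R_given_state -- dict: type k -> (P(z=R | omega=+1), P(z=R | omega=-1))
    attracted       -- types that vote for R regardless of their signal
    """
    types = list(q)
    total = 0.0
    for state_index in (0, 1):  # 0: omega = +1, 1: omega = -1
        for endorsements in product((True, False), repeat=len(types)):
            prob = 0.5  # prior probability of the state
            mass_R = 0.0
            for k, endorsed_R in zip(types, endorsements):
                p = p_R_given_state[k][state_index]
                prob *= p if endorsed_R else 1.0 - p
                if endorsed_R or k in attracted:
                    mass_R += q[k]
            if mass_R > 0.5 + 1e-12:
                total += prob
            elif abs(mass_R - 0.5) <= 1e-12:
                total += 0.5 * prob
    return total


# No type is a majority on its own; any two types together are.
q = {-1: 0.3, 0: 0.4, 1: 0.3}
likelihoods = {-1: (0.532, 0.028), 0: (0.8, 0.2), 1: (0.972, 0.468)}

base = win_probability(q, likelihoods)
gains = {k: win_probability(q, likelihoods, attracted=(k,)) - base for k in q}
print(base, gains)
```

In this example attracting any single type, even a non-majority one, raises the winning probability, because with positive probability the other two types receive opposite recommendations and the attracted type is pivotal.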

The above argument leaves open the question of whether attracting some voters

would cause the repulsion of others. Fortunately, this concern is ruled out by the

next lemma.

Lemma 4. In the personalized case with q (0) ≤ 1/2, a symmetric policy profile

〈−a, a〉 with a ≥ 0 can arise in equilibrium if and only if no deviation of candidate R

to any a′ ∈ [−a, a) attracts any voter whose bliss point lies inside [−a, a].

The proof of Lemma 4 examines two kinds of (global) deviations: (1) a′ ∉ [−a, a]

and (2) a′ ∈ [−a, a). The case where 0 ≤ a < t(1) is the most challenging (see

Appendix B.2 for the other case where a ≥ t(1)). By committing the first kind

of deviation to a′ > a in that case (as depicted in Figure 1), candidate R may

indeed attract right-leaning voters. But such a success must cause the repulsion of

left-leaning voters, due to the symmetry and the weak concavity of voters’ utility

functions (see Appendix B.2 for technical details). In addition, the deviation moves

candidate R away from centrist voters and hence runs the risk of repelling them, so

it cannot benefit the candidate overall. The argument for a′ < −a is analogous and

hence is omitted.

Figure 1: Consequences of candidate R deviating from 〈−a, a〉 to a′ > a when 0 ≤ a < t(1).

Consider next a deviation a′ ∈ [−a, a) of the second kind (as depicted in Figure 2). By committing such a deviation, candidate R moves closer to centrist and left-leaning voters and hence might attract them. Yet such a deviation cannot attract left-leaning voters (and hence doesn’t affect their voting decision) because it is still further away from these voters’ bliss point t(−1) than candidate L’s position −a is. As for right-leaning voters, notice that a′ is further away from their bliss point t(1) than candidate R’s original position a is and hence cannot attract them. Meanwhile, it is closer to these voters’ bliss point than candidate L’s position −a is and so cannot repel them. Taken together, we conclude that the deviation to a′ is weakly attractive to centrist voters and is neutral to extreme voters. If it doesn’t attract centrist voters, then the original position a can be sustained in an equilibrium.

Figure 2: Consequences of candidate R deviating from 〈−a, a〉 to a′ ∈ [−a, a) when 0 ≤ a < t(1).

Equilibrium policy set. Lemmas 3 and 4 demonstrate that a policy position can arise

in equilibrium if and only if it deters deviations that aim at attracting certain types of

voters. In Appendix B.2, we characterize the set of positions that is attraction-proof

for each voter, i.e., no deviation from such positions attracts him. Among other nice

geometric properties, we demonstrate that the maximum attraction-proof position for

a voter is simply his policy latitude. In the most interesting case of personalized

information and a diverse population distribution, we must deter candidates from

attracting any voter. The maximal policy position that is attraction-proof for every

voter is intuitively mink∈K ξp(k), suggesting that voters with the smallest policy lati-


tude are disciplining. The proof presented in Appendix B.2 formalizes this intuition.

3.3 Comparative statics

This section examines the comparative statics of equilibrium policy sets. Since all

policies between zero and policy polarization can arise in equilibrium, it is without

loss to focus on the comparative statics of policy polarization: as policy polarization

increases, the equilibrium policy set increases in the strong set order.

Segmentation technology The next proposition concerns the policy polarization

effect of (disabling) personalized information aggregation (The General Data Protec-

tion Regulation (2016); Warren (2019)).

Proposition 1. Policy polarization is strictly higher in the personalized case than in

the broadcast case if and only if one of the following situations arises in the person-

alized case:

(i) centrist voters are disciplining;

(ii) right-leaning voters are disciplining, and the belief induced by the occasional big

surprise of their personalized signal is sufficiently strong: υpL (1) > υbL + t (1);

(iii) left-leaning voters are disciplining and have a sufficiently strong policy prefer-

ence: |t (−1) | > υbL − υpL (−1).

Proposition 1 follows immediately from Theorem 2. Part (i) exploits the

fact that centrist voters’ personalized signal is more Blackwell-informative than the

broadcast signal, so their policy latitude increases as information aggregation becomes

personalized.
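For completeness, the thresholds in Parts (ii) and (iii) can be read off Theorem 2 and Definition 3 in one line each. The derivation below assumes that the centrist bliss point is t(0) = 0 and that t(−1) < 0 < t(1); it compares the disciplining voters’ personalized latitude with the broadcast polarization.

```latex
\begin{align*}
a^{b,q} &= \xi^{b}(0) = -t(0) + \upsilon^{b}_{L} = \upsilon^{b}_{L}, \\
\text{(ii)}\quad \xi^{p}(1) > \xi^{b}(0)
  &\iff -t(1) + \upsilon^{p}_{L}(1) > \upsilon^{b}_{L}
   \iff \upsilon^{p}_{L}(1) > \upsilon^{b}_{L} + t(1), \\
\text{(iii)}\quad \xi^{p}(-1) > \xi^{b}(0)
  &\iff -t(-1) + \upsilon^{p}_{L}(-1) > \upsilon^{b}_{L}
   \iff |t(-1)| > \upsilon^{b}_{L} - \upsilon^{p}_{L}(-1).
\end{align*}
```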


Parts (ii) and (iii) of Proposition 1 show that if extreme voters are disciplining

in the personalized case, then the skewness of their personalized signals is crucial for

sustaining a greater degree of policy polarization than in the broadcast case. The

role of skewness differs according to which type of extreme voter is disciplining. In

case base voters (i.e., right-leaning voters) are disciplining, the only explanation for

why they could have a big policy latitude must be the occasional big surprise of their

personalized signal. Indeed, we require that base voters must be significantly more

pessimistic about candidate R’s valence following unfavorable information than the

centrist voters in the broadcast case, i.e., υpL (1) > υbL + t (1).

In case opposition voters (i.e., left-leaning voters) are disciplining, a presumption

is that they have a smaller policy latitude than base voters, i.e., −t (−1) + υpL (−1) <

−t (1) + υpL (1). The last condition, while delicate at first sight, stems naturally from

the trade-off between voters’ policy preferences and their beliefs about candidate

valence. Since base voters most prefer candidate R’s policies, they seek the biggest

occasional surprise and so are most pessimistic about candidate R’s valence following

unfavorable information. In contrast, opposition voters least prefer candidate R’s

policies but are nonetheless most optimistic about his valence following unfavorable

information. For the above condition to hold, the difference in base and opposition

voters’ beliefs must exceed the difference in their policy preferences, i.e., υpL (1) −

υpL (−1) > t(1)− t(−1). Simplifying the last condition using symmetry yields

υpL (1)− υpR (1) > 2t (1) , (∗)

which stipulates that extreme voters’ personalized signals be sufficiently skewed that

the beliefs induced by occasional big surprise and own-party bias differ by a significant

amount. In that case, candidate R wouldn’t target his base when contemplating a


deviation. Instead, he appeals to his opposition, which itself could be challenging

due to the latter’s preference against his policies. When such an anti-preference is

sufficiently strong, i.e., |t (−1) | > υbL − υpL (−1), policy polarization increases as a

result of personalized information aggregation. Notice the role of skewness in the

above argument, which is crucial yet indirect.
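The step from the comparison of policy latitudes to condition (∗) can be spelled out as follows. The derivation assumes symmetric bliss points, t(−1) = −t(1), and uses the symmetry in Theorem 1(ii)(b) at a = t(1) = |t(−1)|, which gives υpL(−1) = υpR(1).

```latex
\begin{align*}
-t(-1) + \upsilon^{p}_{L}(-1) &< -t(1) + \upsilon^{p}_{L}(1) \\
\iff\quad \upsilon^{p}_{L}(1) - \upsilon^{p}_{L}(-1) &> t(1) - t(-1) \\
\iff\quad \upsilon^{p}_{L}(1) - \upsilon^{p}_{R}(1) &> 2\,t(1).
\end{align*}
```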

In Appendix A, we solve the baseline model numerically for the case of entropy

attention cost. We find that the conditions prescribed by Proposition 1 are most

likely to hold when the attention cost parameter λ is high and when extreme voters’

policy preference parameter t(1) is large. See the material there for discussions of

intuition.

Attention cost parameter The next proposition shows that policy polarization

decreases with voters’ attention cost parameter.

Proposition 2. Let λ′ > λ > 0 be two attention cost parameters such that the

corresponding environments satisfy Assumptions 1 and 2. As we increase the attention

cost parameter from λ to λ′, policy polarization strictly decreases in both the broadcast

case and the personalized case.

The proof of Proposition 2 exploits the assumption of posterior separability, under

which optimal signals become less Blackwell-informative as voters’ attention cost

parameter increases. As voters’ beliefs about candidate valence get attenuated, they

become more susceptible to policy deviations, so their policy latitudes fall.

Proposition 2 sheds light on the policy polarization effect of introducing perfect

competition between infomediaries, which is advocated by the British government as

a preferable way of regulating tech giants (The Digital Competition Expert Panel

(2019)). In Online Appendix O.2, we investigate an extension where each type k ∈ K

of voter is served by competitive infomediaries solving

\[ \max_{\Pi} \; V(\Pi; a, k) - \lambda \cdot I(\Pi). \]

The solution to this problem, hereafter called the competitive signal for type k vot-

ers, coincides with their monopolistic personalized signal for some attention cost

parameter λ′ > λ. Intuitively, monopolistic personalized signals overfeed voters with

information about candidate valence through reducing the attention cost parameters

that voters effectively face. Introducing competition between infomediaries corrects

this overfeeding problem. Its policy polarization effect is negative by Proposition 2.
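As a hedged illustration of this equivalence, consider a centrist voter and restrict attention to symmetric binary signals with posterior means ±m. Assume, for this example only, that v(a, 0) = 0 under a symmetric policy profile and that the attention cost is quadratic, so that I(Π) = Σz πz µz² = m². Neither the functional form nor the resulting factor of two is taken from the text; both are specific to this example.

```latex
\begin{align*}
V(\Pi; a, 0) &= \tfrac{1}{2}\, m, \qquad I(\Pi) = m^{2}, \\
\text{monopoly at } \lambda':\quad
  & \max_{m} \; m^{2} \ \text{ s.t. } \ \tfrac{1}{2}\, m \ge \lambda' m^{2}
  \;\Longrightarrow\; m^{\mathrm{mon}}(\lambda') = \frac{1}{2\lambda'}, \\
\text{competition at } \lambda:\quad
  & \max_{m} \; \tfrac{1}{2}\, m - \lambda m^{2}
  \;\Longrightarrow\; m^{\mathrm{comp}}(\lambda) = \frac{1}{4\lambda}, \\
& m^{\mathrm{comp}}(\lambda) = m^{\mathrm{mon}}(\lambda')
  \iff \lambda' = 2\lambda > \lambda .
\end{align*}
```

In this example both belief magnitudes fall as the attention cost parameter rises, provided they stay inside (0, 1).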

Population distribution Recently, a growing body of the literature has been de-

voted to the understanding of voter polarization, also termed mass polarization. No-

tably, Fiorina and Abrams (2008) define mass polarization as a bimodal distribution

of voters’ policy preferences on a liberal-conservative scale, and Gentzkow (2016)

develops a related concept that measures the average ideological distance between

Democrats and Republicans. Inspired by these authors, we define increasing mass

polarization as a mean-preserving spread of voters’ policy preferences. The next

proposition shows that with personalized information aggregation, increasing mass

polarization may surprisingly reduce policy polarization rather than increasing it.

Proposition 3. Let q and q′ be two population distributions such that the mass is

more polarized under q′ than under q, i.e., q(0) > q′(0). As we change the population

distribution from q to q′ in the personalized case, policy polarization weakly decreases,

and it strictly decreases if q (0) > 1/2 ≥ q′ (0) and min{ξp (−1), ξp (1)} < ξp (0).

Proposition 3 follows immediately from Theorem 2. As we keep redistributing

voters’ population from the center to the margin, candidates would eventually ben-


efit from attracting extreme voters in addition to attracting centrist voters. If any

extreme voter has a smaller policy latitude than that of centrist voters, then a re-

duction in the policy polarization would ensue. Expanding the last condition yields

min{−t(−1) + υpL(−1), −t(1) + υpL(1)} < υpL(0), which holds if centrist voters’ signal

is very informative about the underlying state and is always the case in the numerical

example presented in Appendix A. In Online Appendix O.1, we prove a similar result

to Proposition 3 for general voters and quadratic attention cost.

4 Extensions

This section reports extensions of the baseline model. See Appendix C and online

appendices for details.

General voters and joint signal distribution In Online Appendix O.1, we ex-

tend the baseline model to arbitrary finite types of voters holding general policy pref-

erences and relax the assumption that signals are conditionally independent across

market segments. Instead, we consider all joint signal distributions that are consistent

with the marginal signal distributions as solved in Section 3.1.

Our analysis leverages a new concept called influential coalition. Loosely speak-

ing, a coalition of voters is influential if attracting all its members, holding other

things constant, strictly increases the deviating candidate’s winning probability. In

the broadcast case, signals are perfectly correlated among voters, so a coalition of

voters is influential if and only if it is a majority coalition. In the personalized case,

non-majority coalitions can be influential, due to the imperfect correlation between

different voters’ signals. The next table compiles the influential coalitions in the

baseline model.


|            | S = b               | S = p               |
|------------|---------------------|---------------------|
| q(0) > 1/2 | majority coalitions | majority coalitions |
| q(0) < 1/2 | majority coalitions | nonempty coalitions |

Table 1: Influential coalitions in the baseline model.

Personalized information aggregation affects policy polarization through marginal

signal distributions and influential coalitions. So far we’ve focused on the first ef-

fect, under the restriction that personalized signals are conditionally independent

across voters. As demonstrated in Online Appendix O.1, lifting the last restriction

while holding marginal signal distributions fixed can only increase policy polariza-

tion. Among all joint signal distributions and voter population distributions, the

exact lower bound mink∈K ξp(k) for policy polarization is attained when signals are

conditionally independent across voters and the population distribution is uniform

across voters’ types. Both results follow from a characterization of policy polarization

as the minimal policy latitude among all influential coalitions, as well as the compar-

ative statics of influential coalitions as we vary the joint signal distribution, holding

marginal signal distributions fixed.

Two takeaways are immediate. First, results so far prescribe the exact lower bound

for the policy polarization effect of personalized information aggregation. Second,

as long as mink∈K ξp(k) stays positive, changes in the environment (e.g., enriching

voters’ types, dividing the same type of voters into multiple subgroups) wouldn’t

render policy polarization trivial.

Beyond binary states and posterior separability Most results so far carry

over to attention cost functions that are strictly Blackwell-monotone and satisfy mild

regularity conditions. An important exception is Lemma 1(ii), which shows that the

optimal broadcast signal has at most two realizations if the underlying state is bi-

nary and the attention cost is posterior separable. For general state distributions and

attention cost functions, the optimal broadcast signal has at most three realizations

LL, LR, RR, where the first and second letters stand for the voting recommendations

to the most left-leaning voter and the most right-leaning voter, respectively. Inter-

estingly, this result holds for arbitrary finite types of voters, because at most two

types of voters’ participation constraints are binding in generic environments, so the

representative voter acting on their behalf has three decisions to make.

In Online Appendix O.3, we show that if the attention cost I(Π) is convex in the

signal structure Π—a property that is satisfied by, e.g., mutual information (Cover

and Thomas (2006))—then the optimal broadcast signal must be symmetric. Based

on this property, we then argue that in the case of three signal realizations, policy

polarization must equal zero, hence the personalization of information aggregation

must increase policy polarization. The case of two signal realizations can be analyzed

in the same way as before.

Other extensions In Appendix C, we demonstrate the robustness of our quali-

tative predictions to: (i) heterogeneous attention cost parameters; (ii) general seg-

mentation technologies; (iii) various relaxations of Assumption 2, i.e., uniform strict

obedience, such as (a) excluding some voters from information consumption, and (b)

exposing voters to more complex decision problems with more than two actions, etc.

5 Further application

In this section, we apply our theory to the study of horizontal product differentia-

tion with personalized information aggregation. Suppose consumers’ preferences for

green-energy cars are located on a Hotelling line. Two car companies compete for

market shares by choosing locations on the Hotelling line and by investing in product

qualities. Investments generate a random quality state that captures the success or

failure of R&D, the likability of new product features, etc. When making purchas-

ing decisions, consumers observe car companies’ locations, i.e., how “green” the cars

are, but they do not directly observe the realization of the quality state. Information about the quality state is provided by, e.g., YouTube, which directs consumers to product review videos using personalized algorithms. When paying attention to video content, consumers are exposed to mid-roll ads, which generate revenue for YouTube and video makers.

We analyze the game between car companies, consumers, and the infomediary.

The next proposition extends the results so far to companies that maximize market

shares.

Proposition 4. Results so far remain valid if companies maximize market shares.

The takeaways from Proposition 4 are as follows. First, with information aggre-

gation for rationally inattentive consumers, the principle of minimum differentiation

posited by Hotelling (1929) could fail even in the absence of price competition.

Second, the personalization of information aggregation fundamentally changes

the disciplining entity of equilibrium product differentiation. In the case of broad-

cast information aggregation, median consumers are disciplining. With personalized

information aggregation, consumers who are the most susceptible to companies’ de-

viations are disciplining, and their identities can range from the loyal customers of

the deviating company to potential switchers.

Third, we develop a systematic way of investigating the comparative statics of

equilibrium product differentiation. Among other things, the skewness of extreme

consumers’ personalized signals is crucial for increasing product differentiation as in-

formation aggregation becomes personalized. Meanwhile, introducing competition to

infomediaries reduces product differentiation, whereas changing consumers’ popula-

tion distribution could have surprising effects on product differentiation.

Fourth, the welfare consequences of the above changes are in general ambiguous,

a topic we now turn to.

6 Concluding remarks

Tech-enabled personalization is now ubiquitous and seems to maximize the social sur-

plus by best serving individuals’ needs. To us, this argument ignores a vital role of

modern infomediaries, namely their ability to shape information consumers’ beliefs

and, in turn, the location choices of politicians, companies, etc. After formalizing this

role of infomediaries, the welfare consequences of many regulatory proposals to tame

the tech giants become less clear-cut. For example, while disabling personalization

clearly reduces the surplus generated from information aggregation, holding candi-

dates’ positions fixed, it could affect the social welfare either way once these positions

become endogenous.17 The same thing can be said about introducing competition

between infomediaries, which alone would make voters better off and infomediaries

worse off. For this reason, we suggest that caution must be exercised and our equilib-

rium characterization be considered when evaluating the overall impacts of the above

proposals.

An important takeaway from our analysis is the indeterminacy of the disciplining voter in the case of personalized information aggregation. This prediction, while delicate at first sight, suggests that a useful first step towards testing our theory is to identify shocks to infomediaries, which in practice could stem from the experiments conducted by tech companies or the regulatory uncertainties they face (e.g., Spain’s unexpected shutdown of Google News). It also suggests the usefulness of surveying consultants about the disciplining voter, an approach advocated by Hersh (2015) in the context of personalized campaigns. We hope someone, maybe us, will put these ideas into practice in the future.

17 For example, while increasing polarization certainly makes centrist voters worse off, it could affect extreme voters’ utilities either way, depending on the exact location choices of the candidates.

A Numerical examples

This appendix solves the baseline model numerically for the case of entropy attention

cost. We first reduce Assumption 2 to model primitives. Results depicted in Figure

3 confirm the intuition discussed in Section 3.1.

Figure 3: Assumption 2 and Condition (∗): entropy attention cost, uniform population distribution. [Plot over the parameters t(1) and λ; regions: “Condition (∗) and Assumption 2 hold” and “Condition (∗) fails and Assumption 2 holds”.]
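As a rough illustration of the kind of computation this appendix describes, the sketch below grid-searches the personalized-signal problem for left-leaning voters (maximize I subject to V ≥ λI over binary signals). It is not the authors' code. It assumes a state in {−1, 1} with prior mean zero, takes h(µ) to be the entropy reduction of a binary belief with mean µ, and uses hypothetical values v = 0.3 and λ = 1; the function names are ours.

```python
import numpy as np

def h(mu):
    """Entropy reduction of a binary belief with mean mu in (-1, 1) (assumed cost)."""
    p = (1.0 + mu) / 2.0
    return np.log(2.0) + p * np.log(p) + (1.0 - p) * np.log(1.0 - p)

def optimal_personalized_signal(v, lam, n=401):
    """Grid-search max I s.t. V >= lam * I over binary signals <mu_L, mu_R>."""
    grid = np.linspace(0.0, 1.0, n)[1:-1]          # interior points only
    mu_L, mu_R = np.meshgrid(-grid, grid, indexing="ij")
    pi_R = -mu_L / (mu_R - mu_L)                   # Bayes plausibility, prior mean 0
    pi_L = 1.0 - pi_R
    info = pi_L * h(mu_L) + pi_R * h(mu_R)         # posterior-separable cost I
    gain = pi_R * np.maximum(mu_R - v, 0.0)        # V for type k = -1
    feasible = gain >= lam * info                  # participation constraint
    if not feasible.any():
        return None
    idx = np.argmax(np.where(feasible, info, -np.inf))
    i, j = np.unravel_index(idx, info.shape)
    return mu_L[i, j], mu_R[i, j], info[i, j], gain[i, j]

# Hypothetical parameter values, for illustration only.
result = optimal_personalized_signal(v=0.3, lam=1.0)
```

For these illustrative values no symmetric signal satisfies the participation constraint, so any signal the search returns is skewed toward the occasional big surprise, i.e., µL + µR > 0.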

We next solve for the model primitives under which the base voters of candidate

R (i.e., right-leaning voters) are disciplining. As demonstrated in Section 3.3, the last

situation happens if and only if extreme voters’ personalized signals are sufficiently

skewed that the beliefs induced by the occasional big surprise and predisposition

reinforcement differ by a significant amount:

$$\upsilon^p_L(1) - \upsilon^p_R(1) > 2t(1). \tag{$*$}$$

As depicted in Figure 3, Condition (∗) is most likely to hold when the attention cost parameter λ is high and extreme voters’ policy preference parameter t(1) is large. The finding concerning the policy preference parameter t(1) is quite intuitive. As for the attention cost parameter λ, note that as paying attention becomes more costly, the infomediary makes right-leaning voters’ signal less Blackwell-informative in order to prevent them from tuning out. During that process, she is reluctant to cut back $\upsilon^p_L(1)$, i.e., the occasional big surprise that makes information consumption valuable to these voters. Instead, she reduces $\upsilon^p_R(1)$ significantly, which causes the left-hand side of Condition (∗) to increase.

We finally solve for the primitives under which information personalization in-

creases policy polarization. As demonstrated in Section 3.3, the last situation happens

if and only if

$$\xi^b(0) < \min_{k\in K}\xi^p(k). \tag{$**$}$$

For all parameter values we’ve tried, only extreme voters can be disciplining in the case

of personalized information. The remainder of this appendix distinguishes between

two sub-cases.

Base voters are disciplining. In this case, Condition (∗) fails, so Condition (∗∗) becomes $\upsilon^p_L(1) - \upsilon^b_L > t(1)$. As depicted in Figure 4, the last condition is most likely to hold when the attention cost parameter λ and extreme voters’ policy preference parameter t(1) are both large. As λ increases (hence paying attention becomes more costly), the infomediary makes signals less Blackwell-informative in order to prevent voters from tuning out. In the personalized case, she can reduce $\upsilon^p_R(1)$ significantly while keeping $\upsilon^p_L(1)$ almost unchanged in order to make information consumption still useful for right-leaning voters. Such flexibility is absent in the broadcast case, where the two posterior beliefs $\upsilon^b_L$ and $\upsilon^b_R$ must be reduced by the same magnitude. As a result, $\upsilon^p_L(1) - \upsilon^b_L$ increases, which makes Condition (∗∗) easier to satisfy.

Figure 4: Condition (∗∗): entropy attention cost, uniform population distribution, Condition (∗) fails. [Plot over the parameters t(1) and λ; regions: “Condition (∗) fails and Assumption 2 holds” and “Condition (∗∗) holds”.]

As for the effects of strengthening voters’ policy preferences, note that as t(1) increases, extreme voters find information consumption less useful, so the broadcast signal must become less Blackwell-informative in order to prevent them from tuning out, i.e., $\upsilon^b_L$ must decrease. In the meantime, $\upsilon^p_L(1)$ must increase, because convincing right-leaning voters to vote for candidate L requires a bigger occasional surprise than before. Thus $\upsilon^p_L(1) - \upsilon^b_L$ increases, which relaxes Condition (∗∗) when t(1) is sufficiently large.

Opposition voters are disciplining. In this case, Condition (∗) holds, so Condition (∗∗) becomes $|t(-1)| > \upsilon^b_L - \upsilon^p_L(-1)$. As depicted in Figure 5, the last condition is most likely to hold when extreme voters have strong policy preferences, i.e., when t(1) is large. As t(1) increases (hence t(−1) becomes more negative), left-leaning voters seek a bigger occasional surprise from information consumption than before, so $\upsilon^p_R(-1)$ must increase. Also, since they find information consumption less useful, $\upsilon^p_L(-1)$ must decrease significantly to prevent them from tuning out. Meanwhile, in the broadcast case, $\upsilon^b_L$ ($= \upsilon^b_R$) must decrease to prevent extreme voters from tuning out. Thus, for sufficiently large t(1), the right-hand side of Condition (∗∗) is close to zero whereas the left-hand side is big, which explains the pattern depicted in Figure 5.

Figure 5: Condition (∗∗): entropy attention cost, uniform population distribution, Condition (∗) holds. [Plot over the parameters t(1) and λ; regions: “Condition (∗) and Assumption 2 hold” and “Condition (∗∗) holds”.]

B Proofs

The upcoming analysis exploits the following properties of the distance utility func-

tion.

Observation 2. u(a, k) = −|t(k)−a| satisfies the following properties, provided that

t : K → R is strictly increasing and is symmetric around zero.

Continuity and weak concavity u (·, k) is continuous and weakly concave for any

k ∈ K.

Symmetry u (a, k) = u (−a,−k) for any a ∈ R and k ∈ K.

Inverted V-shape u (·, k) is strictly increasing on (−∞, t (k)] and is strictly de-

creasing on [t (k) ,+∞) for any k ∈ K.

Increasing differences v(−a, a′, k) := u (a, k) − u (a′, k) is increasing in k for any

a > a′. For any a > 0, v(−a, a, k) := u (a, k) − u (−a, k) is strictly positive if

k = 1, equals zero if k = 0, and is strictly negative if k = −1.
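The properties listed in Observation 2 can be spot-checked numerically. The sketch below does so on a grid of policy positions; the ideal points t(k) = ±0.5, 0 are hypothetical values chosen only to satisfy the stated requirements (strictly increasing, symmetric around zero), and a grid check is of course no substitute for the analytical argument.

```python
import numpy as np

# Hypothetical ideal points: strictly increasing in k and symmetric around zero.
t = {-1: -0.5, 0: 0.0, 1: 0.5}

def u(a, k):
    """Distance utility u(a, k) = -|t(k) - a|."""
    return -abs(t[k] - a)

positions = np.arange(-40, 41) / 20.0   # grid on [-2, 2] that contains 0 exactly

# Symmetry: u(a, k) = u(-a, -k).
symmetry_ok = all(np.isclose(u(a, k), u(-a, -k)) for a in positions for k in t)

# Weak concavity: the midpoint value is at least the average of the endpoint values.
concavity_ok = all(
    u((a + b) / 2, k) >= (u(a, k) + u(b, k)) / 2 - 1e-12
    for a in positions for b in positions for k in t
)

# Increasing differences: u(a, k) - u(a2, k) is nondecreasing in k whenever a > a2.
incr_diff_ok = all(
    u(a, -1) - u(a2, -1) <= u(a, 0) - u(a2, 0) + 1e-12
    and u(a, 0) - u(a2, 0) <= u(a, 1) - u(a2, 1) + 1e-12
    for a in positions for a2 in positions if a > a2
)

# Sign pattern of u(a, k) - u(-a, k) for a > 0: positive, zero, negative for k = 1, 0, -1.
sign_ok = all(
    u(a, 1) - u(-a, 1) > 0
    and np.isclose(u(a, 0) - u(-a, 0), 0.0)
    and u(a, -1) - u(-a, -1) < 0
    for a in positions if a > 0
)
```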

B.1 Proofs for Sections 3.1 and 3.3

The proofs presented in this appendix take any symmetric policy profile a = 〈−a, a〉

with a > 0 as given (the proof for the case a = 0 is trivial). Under the assumption of

binary states, any signal structure can be represented by the tuple 〈πz, µz〉z∈Z , where

πz denotes the probability that the signal realization is z ∈ Z, and µz denotes the

posterior mean of the state conditional on the signal realization being z. Any binary

signal structure must satisfy

$$\pi_L = \frac{\mu_R}{\mu_R-\mu_L} \quad\text{and}\quad \pi_R = \frac{-\mu_L}{\mu_R-\mu_L}$$

and so can be represented by the profile 〈µL, µR〉 of posterior means. Type k voters’

utility gain from consuming 〈µL, µR〉 is simply

$$V(\langle\mu_L,\mu_R\rangle;a,k) = \begin{cases}\pi_R\,[v(a,k)+\mu_R]^+ & \text{if } k\le 0,\\ -\pi_L\,[v(a,k)+\mu_L]^- & \text{if } k>0,\end{cases}$$

where v(a, 1) > 0 = v(a, 0) > v(a,−1) = −v(a, 1) according to Observation 2 sym-

metry and increasing differences. For ease of notation, we shall hereafter write

v(a, 1) = v and v(a,−1) = −v.
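The representation of a binary signal by its posterior means, and the resulting utility gain, can be sketched as follows. This is a minimal illustration, assuming that [x]^+ means max(x, 0) and [x]^− means min(x, 0); the function names and the numerical values are hypothetical.

```python
def binary_signal(mu_L, mu_R):
    """Return (pi_L, pi_R) implied by Bayes plausibility with prior mean zero."""
    assert mu_L < 0.0 < mu_R
    pi_L = mu_R / (mu_R - mu_L)
    pi_R = -mu_L / (mu_R - mu_L)
    return pi_L, pi_R

def utility_gain(mu_L, mu_R, v_k, k):
    """V(<mu_L, mu_R>; a, k), where v_k stands for v(a, k)."""
    pi_L, pi_R = binary_signal(mu_L, mu_R)
    if k <= 0:
        return pi_R * max(v_k + mu_R, 0.0)      # [x]^+ taken as max(x, 0)
    return -pi_L * min(v_k + mu_L, 0.0)         # [x]^- taken as min(x, 0)

# Illustrative values: a signal skewed to the right, and its mirror image.
pi_L, pi_R = binary_signal(-0.2, 0.6)
gain_left = utility_gain(-0.2, 0.6, v_k=-0.3, k=-1)
gain_right = utility_gain(-0.6, 0.2, v_k=0.3, k=1)
```

The two gains coincide, reflecting the symmetry between a left-leaning voter facing 〈µL, µR〉 and a right-leaning voter facing the mirrored signal 〈−µR, −µL〉.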

Proof of Lemmas 1 and 2 We prove Lemmas 1 and 2 together in four steps.

Step 1. Show that the optimal personalized signal for any voter is unique and has

at most two signal realizations. We prove the result only for left-leaning voters of

type k = −1. Any optimal personalized signal for them solves

$$\max_{Z,\ \Pi:\Omega\to\Delta(Z)}\ I(\Pi) \quad\text{subject to}\quad V(\Pi;a,-1) \ge \lambda I(\Pi). \tag{2}$$

Let γ ≥ 0 denote a Lagrange multiplier associated with voters’ participation con-

straint. Write the complementary slackness constraints as

γ ≥ 0, V (Π; a,−1) ≥ λI (Π) , and γ [V (Π; a,−1)− λI (Π)] = 0. (3)

If γ = 0, then the solution to (2) is the true state and so is unique and binary. If

γ > 0, then reformulate (2) as

$$\max_{Z,\ \langle\pi_z,\mu_z\rangle_{z\in Z},\ \gamma\ge 0}\ V(\Pi;a,-1) - \lambda(\gamma)\,I(\Pi) \quad\text{subject to (BP) and (3)}, \tag{4}$$

where λ (γ) := λ− 1/γ. If λ(γ) ≤ 0, then the solution to (4) and, hence, (2), is again

the true state. If λ(γ) > 0, then the maximand of (4) becomes

$$\sum_{z\in Z}\pi_z\underbrace{\big[[-v+\mu_z]^+ - \lambda(\gamma)\,h(\mu_z)\big]}_{f(\mu_z)},$$

where f is the maximum of two strictly concave functions of µ: (i) −λ (γ)h (µ), and

(ii) −v + µ − λ (γ)h (µ) (as depicted in Figure 6). Since (i) and (ii) single-cross at

µ = v, their maximum is M-shaped, so applying the concavification method developed

by Kamenica and Gentzkow (2011) yields a unique solution with at most two signal

realizations. Given this, we can restrict Z to {Z : |Z| ≤ 2} in the original problem (2)

and therefore guarantee the existence of a solution.

Figure 6: f and its concave closure f+ in the personalized case. [Horizontal axis: µ, with v, µ1, and µ2 marked; curves: f(µ) and f+(µ).]
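The concavification step can be illustrated numerically by computing the upper convex hull of f on a grid and reading off the hull edge that spans the prior mean. The sketch below is only an illustration: it assumes a prior mean of zero and an entropy-based h, writes beta for λ(γ), and uses hypothetical values v = 0.3 and beta = 1.

```python
import numpy as np

def h(mu):
    """Assumed entropy-based cost of a posterior mean mu in (-1, 1)."""
    p = (1.0 + mu) / 2.0
    return np.log(2.0) + p * np.log(p) + (1.0 - p) * np.log(1.0 - p)

def upper_hull(xs, ys):
    """Vertices of the upper convex hull of points sorted by x (monotone chain)."""
    hull = []
    for x, y in zip(xs, ys):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle vertex if it lies on or below the chord to (x, y).
            if (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull

def concavify_at_prior(v, beta, n=1999):
    """Posterior means supporting the concave closure of f at the prior mean 0."""
    mu = np.linspace(-0.999, 0.999, n)
    f = np.maximum(mu - v, 0.0) - beta * h(mu)   # max of branches (i) and (ii)
    hull = upper_hull(mu, f)
    for (x1, _), (x2, _) in zip(hull, hull[1:]):
        if x1 <= 0.0 <= x2:
            return x1, x2
    raise ValueError("prior mean not covered by the grid")

# Hypothetical parameter values, for illustration only.
mu_1, mu_2 = concavify_at_prior(v=0.3, beta=1.0)
```

With these values the hull edge through the prior connects one point on each branch, so the returned pair satisfies µ1 < 0 < v < µ2 and corresponds to a binary signal.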

To prove that the solution to (2) is unique, it suffices to show that γ is unique in

the case where γ > 0 and λ (γ) > 0. If the contrary were true, then take two distinct

Lagrange multipliers γ1 > γ2 > 0 associated with voters’ participation constraint.

For each i = 1, 2, write Πi for the unique solution to (4) given λ(γi). From strict

optimality, i.e., voters strictly prefer Πi to Π−i given λ(γi), we deduce that

$$\lambda(\gamma_2)\,\big(I(\Pi_1)-I(\Pi_2)\big) > V(\Pi_1;a,-1) - V(\Pi_2;a,-1) > \lambda(\gamma_1)\,\big(I(\Pi_1)-I(\Pi_2)\big).$$

Simplifying the above expression using λ(γ1) = λ − 1/γ1 > λ − 1/γ2 = λ(γ2) yields I (Π1) < I (Π2), so Π1 and Π2 cannot both be the solutions to the original problem (2), a contradiction.

Step 2. Show that an optimal broadcast signal exists and has at most two realiza-

tions. We focus on the case where all voters participate in information consumption.

The proofs for the remaining cases are analogous and hence are omitted for brevity.

Since the following must hold for any nondegenerate signal structure Π:

$$V(\Pi;a,-1) = \sum_{z\in Z}\pi_z[-v+\mu_z]^+ < \sum_{z\in Z}\pi_z[\mu_z]^+ = V(\Pi;a,0)$$

$$\text{and}\quad V(\Pi;a,1) = \sum_{z\in Z}-\pi_z[v+\mu_z]^- < \sum_{z\in Z}-\pi_z[\mu_z]^- = V(\Pi;a,0),$$

it follows that for any nondegenerate solution to the infomediary’s problem, only a

subset B ⊆ {−1, 1} of the extreme voters can have binding participation constraints,

whereas centrist voters’ participation constraint must be slack. For each k ∈ {−1, 1},

let γk ≥ 0 denote the Lagrange multiplier associated with type k voters’ participation

constraint, and write their complementary slackness constraints as

γk ≥ 0, V (Π; a, k) ≥ λI (Π) , and γk [V (Π; a, k)− λI (Π)] = 0. (5)

Formulate the infomediary’s problem as

$$\max_{Z,\ \Pi:\Omega\to\Delta(Z),\ \gamma_{-1},\gamma_1\ge 0}\ I(\Pi) + \sum_{k\in\{-1,1\}}\gamma_k\,[V(\Pi;a,k)-\lambda I(\Pi)] \quad\text{subject to (5)}, \tag{6}$$

and consider three cases. First, if B = ∅, then the solution to (6) is the true state.

Second, if |B| = 1, then the solution to (6) is the optimal personalized signal for

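the voter in B. Finally, if |B| = 2, then write $\hat\gamma_k = \frac{\gamma_k}{\gamma_{-1}+\gamma_1}$ for k ∈ {−1, 1} and $\hat\lambda = \lambda - \frac{1}{\gamma_{-1}+\gamma_1}$. Then (6) simplifies to

$$\max_{Z,\ \langle\pi_z,\mu_z\rangle_{z\in Z},\ \gamma_{-1},\gamma_1\ge 0}\ \sum_{z\in Z}\pi_z\underbrace{\Big[\hat\gamma_{-1}[-v+\mu_z]^+ - \hat\gamma_1[v+\mu_z]^- - \hat\lambda\,h(\mu_z)\Big]}_{f(\mu_z)} \tag{7}$$

subject to (BP) and (5), where f(µz) is the maximum of three strictly concave functions of µ: (i) $\hat\gamma_{-1}(-v+\mu) - \hat\lambda h(\mu)$, (ii) $-\hat\lambda h(\mu)$, and (iii) $-\hat\gamma_1(v+\mu) - \hat\lambda h(\mu)$ (see Figure 7 for a graphical illustration).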

Fix any γ−1, γ1 and λ, and consider the relaxed problem (7). Let f+ denote

the concave closure of f , and note that µ1 := inf{µ : f+ (µ) > f (µ)} and µ2 :=

sup{µ : f+ (µ) > f (µ)} exist and satisfy µ1 < 0 < µ2. There are three cases to

consider.

(a) If f+ (0) > (1−α)f+(µ1) +αf+(µ2) for all α ∈ [0, 1], then the unique solution to

the relaxed problem is the degenerate signal (as depicted on Panel (a) of Figure

7).

(b) If f+ (0) = (1− α)f+(µ1) + αf+(µ2) > f(0) for some α ∈ [0, 1], then the unique

solution to the relaxed problem is the binary signal 〈µ1, µ2〉 (as depicted on Panel

(b) of Figure 7).

(c) If f+ (0) = (1− α)f+(µ1) + αf+(µ2) = f(0) for some α ∈ [0, 1], then the relaxed

problem has multiple solutions, each of which entails at most three signal real-

izations (as depicted on Panel (c) of Figure 7). Among all these solutions, the

binary signal 〈µ1, µ2〉 is the most Blackwell-informative and therefore constitutes

the unique solution to the original attention-maximization problem.

Taken together, we can always restrict Z to {Z : |Z| ≤ 2} in the original problem (6)

and therefore guarantee the existence of a solution.

Figure 7: f and its concave closure f+ in the broadcast case. [Panels (a), (b), and (c); horizontal axis: µ, with −v, 0, v, µ1, and µ2 marked; curves: f(µ) and f+(µ).]

Step 3. Show that if every optimal broadcast signal is binary and is consumed by

all voters, then the optimal broadcast signal is unique and symmetric.

In Step 2, we argued that only a subset B ⊆ {−1, 1} of the extreme voters can

have binding participation constraints. If B = ∅, then the solution to (6) is the true

state and so is unique and symmetric. The case where |B| = 1 is impossible, because

in that case, the solution to (6) is the optimal personalized signal for the voter in

B, which violates the participation constraint of the voter in {−1, 1} − B. Finally, if B = {−1, 1}, then any optimal signal 〈µL, µR〉 must satisfy

$$V(\langle\mu_L,\mu_R\rangle;a,-1) = -\frac{\mu_L}{\mu_R-\mu_L}[-v+\mu_R]^+ = \lambda I(\langle\mu_L,\mu_R\rangle) = V(\langle\mu_L,\mu_R\rangle;a,1) = -\frac{\mu_R}{\mu_R-\mu_L}[v+\mu_L]^- > 0.$$

Simplifying the above expression yields µL = −µR, so 〈µL, µR〉 is symmetric, and µL

is a solution to

$$\max_{\mu\in[-1,0]}\ h(\mu) \quad\text{s.t.}\quad \tfrac{1}{2}[-v-\mu]^+ \ge \lambda h(\mu). \tag{8}$$

Since h is strictly convex and strictly increasing on [0, 1], (8) admits either no solution

(in which case the optimal signal isn’t binary to begin with) or a unique solution.
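Problem (8) is one-dimensional and easy to explore numerically. The sketch below, which assumes an entropy-based h and uses hypothetical parameter values, scans a grid on (−1, 0) and returns the feasible µ with the largest h, or None when no grid point satisfies the constraint. The endpoint µ = −1 (the true state) is left off the grid.

```python
import numpy as np

def h(mu):
    """Assumed entropy-based cost of a posterior mean mu in (-1, 1)."""
    p = (1.0 + mu) / 2.0
    return np.log(2.0) + p * np.log(p) + (1.0 - p) * np.log(1.0 - p)

def symmetric_broadcast_mu_L(v, lam, n=1001):
    """Grid solution of (8): the most informative mu in (-1, 0) meeting the constraint."""
    mu = -np.linspace(0.0, 1.0, n)[1:-1]                    # interior grid on (-1, 0)
    slack = 0.5 * np.maximum(-v - mu, 0.0) - lam * h(mu)    # constraint slack
    feasible = mu[slack >= 0.0]
    if feasible.size == 0:
        return None                                         # no solution on the grid
    return feasible.min()                                   # h is decreasing on [-1, 0]

# Hypothetical parameter values, for illustration only.
mu_L_cheap = symmetric_broadcast_mu_L(v=0.3, lam=0.6)
mu_L_costly = symmetric_broadcast_mu_L(v=0.3, lam=1.0)
```

In this illustration the lower attention cost admits a solution close to −0.9, whereas the higher one admits none, matching the dichotomy described above.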

Step 4. Observe that any optimal (broadcast or personalized) signal, if binary,

must satisfy (SOB) among its consumers. No more proof is required beyond the

verbal argument made in the main text.

Proof of Theorem 1 We focus on the proof of Part (ii) concerning the skewness of

optimal personalized signals. Part (i) concerning the symmetry of optimal broadcast

signal has already been demonstrated in the proof of Lemmas 1 and 2. Part (iii)

requires no more proof than the verbal argument made in the main text.

We only prove Part (ii) for left-leaning voters of type k = −1. Let 〈µL, µR〉 denote

their optimal personalized signal, which by Assumption 2 must satisfy

$$V(\langle\mu_L,\mu_R\rangle;a,-1) = \frac{-\mu_L}{\mu_R-\mu_L}[-v+\mu_R]^+ \ge \lambda I(\langle\mu_L,\mu_R\rangle) > 0$$

and, hence, µR > v. We wish to show that µL+µR > 0. Notice first that µL+µR ≥ 0,

because if the contrary were true, i.e., µL+µR < 0, then consuming 〈−µR,−µL〉 incurs

the same attention cost as consuming 〈µL, µR〉 by Assumption 1, but the first option

generates a higher voting utility:

$$V(\langle-\mu_R,-\mu_L\rangle;a,-1) = \frac{\mu_R}{\mu_R-\mu_L}(-v-\mu_L) > \frac{-\mu_L}{\mu_R-\mu_L}(-v+\mu_R) = V(\langle\mu_L,\mu_R\rangle;a,-1).$$

To show that µL + µR ≠ 0, take the Lagrange multiplier γ > 0 associated with voters’ participation constraint as given, and consider the relaxed problem of (4):

$$\max_{\langle\mu_L,\mu_R\rangle\in[-1,0]\times[v,1]}\ \frac{-\mu_L}{\mu_R-\mu_L}(-v+\mu_R) - (\lambda-1/\gamma)\left[\frac{\mu_R}{\mu_R-\mu_L}h(\mu_L) - \frac{\mu_L}{\mu_R-\mu_L}h(\mu_R)\right]. \tag{9}$$

Note that λ − 1/γ > 0 must hold in order for this problem to admit an interior

solution (as required by Assumption 2), and that any interior solution must satisfy

the following first-order conditions:

−v + µR = (λ− 1/γ) [∆h− h′ (µL) ∆µ] (10)

and v − µL = (λ− 1/γ) [h′ (µR) ∆µ−∆h] (11)

where ∆h := h (µR) − h (µL) and ∆µ := µR − µL. If µL + µR = 0, then ∆h = 0

and h′ (µR) = −h′ (µL) by Assumption 1, so the right-hand sides of the first-order

conditions are the same. Meanwhile, the left-hand sides differ, which leads to a

contradiction.
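Spelling out this last step: under the hypothesis µL + µR = 0 (so that ∆h = 0 and h′(µR) = −h′(µL) by Assumption 1), the two first-order conditions read as follows.

```latex
\begin{align*}
(10):\quad -v + \mu_R &= (\lambda - 1/\gamma)\,\bigl[0 - h'(\mu_L)\,\Delta\mu\bigr]
                       = (\lambda - 1/\gamma)\,h'(\mu_R)\,\Delta\mu, \\
(11):\quad \phantom{-}v - \mu_L &= (\lambda - 1/\gamma)\,\bigl[h'(\mu_R)\,\Delta\mu - 0\bigr]
                       = (\lambda - 1/\gamma)\,h'(\mu_R)\,\Delta\mu .
\end{align*}
% Equal right-hand sides force -v + \mu_R = v - \mu_L = v + \mu_R, i.e. v = 0.
```

Equating the left-hand sides and using −µL = µR gives −v + µR = v + µR, i.e., v = 0, which is impossible because v = v(a, 1) > 0 for a > 0.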

Proof of Proposition 2 It suffices to show that optimal signals become less

Blackwell-informative as we raise the attention cost parameter from λ′ to λ′′ such

that the corresponding environments satisfy Assumption 2. For the broadcast case,

the result follows from the fact that the optimal signal is symmetric. For the per-

sonalized case, we prove the result for left-leaning voters of type k = −1 in two

steps.

First, recall the relaxed problem (9), whose first-order conditions are given by (10)

and (11). Summing up (10) and (11) yields

h′ (µR)− h′ (µL) = 1/β, (12)

where β stands for λ− 1/γ and is treated as given by the relaxed problem. Based on

(12), we can simplify the total derivative of (11) w.r.t. β as follows:

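$$\frac{d\mu_L}{d\beta} = \Delta h - h'(\mu_R)\Delta\mu + \beta\left[h'(\mu_R)\frac{d\mu_R}{d\beta} - h'(\mu_L)\frac{d\mu_L}{d\beta} - h''(\mu_R)\frac{d\mu_R}{d\beta}\Delta\mu - h'(\mu_R)\frac{d\mu_R}{d\beta} + h'(\mu_R)\frac{d\mu_L}{d\beta}\right]$$

$$= \Delta h - h'(\mu_R)\Delta\mu - \beta h''(\mu_R)\frac{d\mu_R}{d\beta}\Delta\mu + \frac{d\mu_L}{d\beta},$$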

where ∆µ := µR − µL and ∆h := h(µR)− h(µL). Therefore,

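$$\frac{d\mu_R}{d\beta} = \frac{\Delta h - h'(\mu_R)\Delta\mu}{\beta h''(\mu_R)\Delta\mu} = \frac{h(\mu_R) - h(|\mu_L|) - h'(\mu_R)\Delta\mu}{\beta h''(\mu_R)\Delta\mu} < 0, \tag{13}$$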

where the last inequality holds because h is symmetric around zero, h′ > 0 on

[0, 1], h′′ > 0, and ∆µ > 0, hence the numerator of (13) is bounded above by

h′ (µR) (µR − |µL| −∆µ) = −2h′ (µR) |µL| < 0 if µR > |µL| and by −h′ (µR) ∆µ < 0

if µR ≤ |µL|. Meanwhile, differentiating (12) with respect to β yields

$$h''(\mu_L)\frac{d\mu_L}{d\beta} = h''(\mu_R)\frac{d\mu_R}{d\beta} + \frac{1}{\beta^2},$$

and simplifying the above expression using (13) yields

$$\frac{d\mu_L}{d\beta} = \frac{\Delta h - h'(\mu_L)\Delta\mu}{\beta h''(\mu_L)\Delta\mu} > 0. \tag{14}$$

Together, (13) and (14) imply that the solution to the relaxed problem (9) becomes

less Blackwell-informative as β increases.
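The direction of these comparative statics can be checked in a special case. Under the additional assumption that h is the entropy reduction of a binary belief, h(µ) = µ·artanh(µ) + ½·ln(1 − µ²), conditions (10) and (12) can be solved in closed form: µL = (e^{v/β} − cosh(1/β)) / sinh(1/β) and µR = tanh(artanh(µL) + 1/β). This closed form is not stated in the text; it is a derivation specific to the assumed cost function, and the parameter values below are hypothetical.

```python
import math

def relaxed_solution(v, beta):
    """Interior solution <mu_L, mu_R> of (9) under the assumed entropy cost."""
    mu_L = (math.exp(v / beta) - math.cosh(1.0 / beta)) / math.sinh(1.0 / beta)
    if not -1.0 < mu_L < 0.0:
        raise ValueError("no interior solution for these parameters")
    # From (12): h'(mu_R) - h'(mu_L) = 1/beta, with h'(mu) = artanh(mu).
    mu_R = math.tanh(math.atanh(mu_L) + 1.0 / beta)
    return mu_L, mu_R

# Hypothetical parameter values, for illustration only.
betas = [0.8, 1.0, 1.2]
solutions = [relaxed_solution(0.3, b) for b in betas]
```

As β rises across these values, µL moves up toward zero and µR moves down, in line with (13) and (14).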

We next endogenize β and show that it strictly increases with λ. Formally, write

the Lagrange multiplier associated with left-leaning voters’ participation constraint

as γ(λ) to signify its dependence on λ, and write β(λ) for λ − 1/γ(λ). We wish

to demonstrate that β(λ′′) > β(λ′) for any λ′′ > λ′ such that the corresponding

environments satisfy Assumption 2. Suppose to the contrary that β (λ′′) ≤ β (λ′)

for some λ′′ > λ′, and let Π′ and Π′′ denote the corresponding optimal personalized

signals for left-leaning voters, respectively. From the previous step, we know that

if β(λ′′) < β(λ′), then Π′′ is more Blackwell-informative than Π′, so in particular

I (Π′′) > I (Π′) > 0. Then from V (Π′; a,−1) = λ′I (Π′) and V (Π′′; a,−1) = λ′′I (Π′′),

it follows that V (Π′′; a,−1)−λ′I (Π′′) > 0 = V (Π′; a,−1)−λ′I (Π′) , which together

with I (Π′′) > I (Π′) implies that Π′ is not optimal when λ = λ′, a contradiction.

Meanwhile, if β (λ′) = β (λ′′), then Π′ = Π′′ as demonstrated in the previous step.

But then V (Π′; a,−1) = λ′I (Π′) < λ′′I (Π′′) = V (Π′′; a,−1), a contradiction.

B.2 Proofs for Sections 3.2 and 5

Proof of Lemma 4 Fix S = p, and consider any unilateral deviation of candidate

R from 〈−a, a〉 with a ≥ 0 to a′. Below we demonstrate that a′ is unprofitable if and

only if (i) a′ /∈ [−a, a], or (ii) a′ ∈ [−a, a) and it doesn’t attract any type k voter with

t(k) ∈ [−a, a].

Step 1. Show that no a′ > a strictly increases candidate R’s winning probability.

Fix any a′ > a. From Observation 2 inverted V-shape and (SOB), it follows that

neither left-leaning voters nor centrist voters would find a′ attractive, i.e., ∀k ≤ 0,

$$v(-a,a',k) + \mu^p_L(a,k) < v(-a,a,k) + \mu^p_L(a,k) < 0.$$

Given this, as well as the symmetry of the joint signal distribution, it suffices to show

that if a′ attracts right-leaning voters, then it must repel left-leaning voters, i.e.,

$$v(-a,a',1) + \mu^p_L(a,1) > 0 \implies v(-a,a',-1) + \mu^p_R(a,-1) < 0.$$

The argument below exploits the symmetry of marginal signal distributions, i.e., $\mu^p_R(a,-1) = -\mu^p_L(a,1)$, which together with Observation 2 symmetry implies

$$v(-a,a',-1) + \mu^p_R(a,-1) := u(a',-1) - u(-a,-1) + \mu^p_R(a,-1) = u(-a',1) - u(a,1) - \mu^p_L(a,1).$$

Thus, if $v(-a,a',1) + \mu^p_L(a,1) := u(a',1) - u(-a,1) + \mu^p_L(a,1) > 0$, then

$$v(-a,a',-1) + \mu^p_R(a,-1) = u(-a',1) - u(a,1) - \mu^p_L(a,1) < u(a',1) + u(-a',1) - [u(a,1) + u(-a,1)] \le 0,$$

where the last inequality follows from Observation 2 concavity.

Step 2. Show that no a′ < −a strictly increases candidate R’s winning probability.

The proof closely parallels that in Step 1. First, note that no a′ < −a attracts centrist

or right-leaning voters by Observation 2 inverted V-shape, i.e., ∀k ≥ 0,

$$v(-a,a',k) + \mu^p_L(a,k) < v(-a,-a,k) + \mu^p_L(a,k) = 0 + \mu^p_L(a,k) < 0.$$

Second, if any a′ as above attracts left-leaning voters, then it must repel right-leaning

voters for the reason given in Step 1. Combining these observations gives the desired

result.

Step 3. Show that no a′ ∈ [−a, a) repels any voter. Fix any a′ as such. From

Observation 2 inverted V-shape and (SOB), it follows that if t (k) ≤ a′(< a), then

$$v(-a,a',k) + \mu^p_R(a,k) > v(-a,a,k) + \mu^p_R(a,k) > 0,$$

and if t (k) > a′(≥ −a), then

$$v(-a,a',k) + \mu^p_R(a,k) \ge v(-a,-a,k) + \mu^p_R(a,k) = 0 + \mu^p_R(a,k) > 0.$$

Combining these observations yields $v(-a,a',k) + \mu^p_R(a,k) > 0$ for any k.

Step 4. Show that no a′ ∈ [−a, a) attracts any type k voter with t(k) /∈ [−a, a].

Fix any a′ as such. From Observation 2 inverted V-shape and (SOB), it follows

that if t (k) < −a(≤ a′), then

$$v(-a,a',k) + \mu^p_L(a,k) \le v(-a,-a,k) + \mu^p_L(a,k) = 0 + \mu^p_L(a,k) < 0,$$

and if t (k) > a(> a′), then

$$v(-a,a',k) + \mu^p_L(a,k) < v(-a,a,k) + \mu^p_L(a,k) < 0.$$

Combining these observations yields $v(-a,a',k) + \mu^p_L(a,k) < 0$ for any k.

Combining Steps 1-4 shows a deviation a′ from 〈−a, a〉 strictly increases candidate

R’s winning probability if and only if it belongs to [−a, a) and attracts any type k

voter with t(k) ∈ [−a, a]. Ruling out such deviations leads us to sustain 〈−a, a〉 in

an equilibrium.

Proof of Theorem 2 We first define a concept called attraction-proof set.

Definition 5. For any segmentation technology S ∈ {b, p} and symmetric policy profile 〈−a, a〉 with a ≥ 0, define

$$\phi^S(-a,a',k) = v(-a,a',k) + \mu^S_L(a,k)$$

as type k voter’s susceptibility to a deviation of candidate R from 〈−a, a〉 to a′, and note that a′ attracts type k voters if and only if $\phi^S(-a,a',k) > 0$. Let the attraction-proof set $\Xi^S(k)$ for type k voters gather all policy positions a such that no deviation of candidate R from 〈−a, a〉 attracts these voters. Since a′ = t(k) is the most attractive deviation to type k voters, we must have

$$\Xi^S(k) = \{a \ge 0 : \phi^S(-a,t(k),k) \le 0\}.$$

The next lemma gives characterizations of attraction-proof sets.

Lemma 5. (i) $\xi^b(0) > 0$ and $\Xi^b(0) = [0, \xi^b(0)]$; (ii) ∀k ∈ K, $\xi^p(k) > |t(k)|$ and $\Xi^p(k) \cap [|t(k)|,+\infty) = [|t(k)|, \xi^p(k)]$.

Proof. Exploiting the functional form of u (a, k) = −|t (k)− a| yields

$$v(-a,a,k) = \begin{cases} 2a & \text{if } 0 \le a < |t(k)|,\\ 2|t(k)| & \text{if } a \ge |t(k)|.\end{cases}$$

Substituting this observation into the proofs of Lemmas 1 and 2 yields $\mu^b_L(a) \equiv \mu^b_L(t(1)) := -\upsilon^b_L$ ∀a ≥ t(1); and $\mu^p_L(a,k) \equiv \mu^p_L(|t(k)|,k) := -\upsilon^p_L(k)$ ∀a ≥ |t(k)|.

Part (i): Recall that $\mu^b_L(a)$ is the unique solution to (8), i.e.,

$$\max_{\mu\in[-1,0]}\ h(\mu) \quad\text{s.t.}\quad \tfrac{1}{2}[v(-a,a,-1)-\mu]^+ \ge \lambda h(\mu).$$

Solving this problem using (i) h is strictly convex and strictly decreasing on [−1, 0] and (ii) v(−a, a,−1) is nondecreasing in a shows that $\mu^b_L(a)$ is nondecreasing in a. As a result, $\phi^b(-a,0,0) = a + \mu^b_L(a)$ is strictly increasing in a, which together with $\phi^b(-a,0,0)|_{a=0} = \mu^b_L(0) < 0$ implies that the unique root of $\phi^b(-a,0,0)$ is strictly positive, and $\Xi^b(0) := \{a \ge 0 : \phi^b(-a,0,0) \le 0\} = [0, \text{unique root of } \phi^b(-a,0,0)]$. In case the above root exceeds t(1), solving it using the fact that $\phi^b(-a,0,0) = a - \upsilon^b_L$ ∀a ≥ t(1) yields $\upsilon^b_L$ ($:= \xi^b(0)$).

Part (ii): For $k = 1$, notice that $\phi^p(-a, t(1), 1) = a + t(1) + \mu^p_L(a, 1) = a + t(1) - \upsilon^p_L(1)$ $\forall a \ge t(1)$ and that $\phi^p(-a, t(1), 1)\big|_{a=t(1)} = v(-t(1), t(1), 1) + \mu^p_L(t(1), 1) < 0$ by (SOB). Therefore, the maximum root of $\phi^p(-a, t(1), 1)$ equals $-t(1) + \upsilon^p_L(1)$ $(:= \xi^p(1))$ and exceeds $t(1)$, and $\Xi^p(1) := \{a \ge 0 : \phi^p(-a, t(1), 1) \le 0\}$ satisfies $\Xi^p(1) \cap [t(1), +\infty) = [t(1), \xi^p(1)]$. The proofs for $k = 0, -1$ are analogous and are therefore omitted.
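As a rough numerical companion to Part (ii), the sketch below recovers the policy latitude of type 1 voters on a grid and compares it with the closed form −t(1) + υ. The values of t(1) and υ are hypothetical and chosen only so that (SOB) holds at a = t(1); they are not taken from the paper.

```python
def phi_p(a, t1, upsilon):
    """phi^p(-a, t(1), 1) = a + t(1) - upsilon, valid for a >= t(1) as in the proof."""
    return a + t1 - upsilon


T1 = 0.5       # hypothetical bliss point t(1)
UPSILON = 1.6  # hypothetical upsilon^p_L(1); (SOB) at a = t(1) needs 2 * T1 < UPSILON

STEP = 0.001
# Grid of candidate policies a in [t(1), 3].
grid = [T1 + i * STEP for i in range(int((3.0 - T1) / STEP) + 1)]

# Attraction-proof policies on [t(1), +inf): the best deviation a' = t(1) does not attract.
proof_set = [a for a in grid if phi_p(a, T1, UPSILON) <= 0]

xi_grid = max(proof_set)   # numerical policy latitude
xi_closed = UPSILON - T1   # closed form from the proof
```

On this toy input the grid maximum agrees with the closed form up to the grid step, and the attraction-proof policies above t(1) form an interval, as the lemma states.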

We now close the proof of Theorem 2. In the broadcast case, combining Lemmas 3 and 5 yields $E^{b,q} = \Xi^b(0) = [0, \xi^b(0)]$. The proof for the personalized case is the same as that for the broadcast case if $q(0) > 1/2$. If, instead, $q(0) \le 1/2$, then

$$E^{p,q} = \underbrace{\left([0, t(1)) \cap \Xi^p(0)\right)}_{A} \cup \underbrace{\left([t(1), +\infty) \cap \bigcap_{k \in \mathcal{K}} \Xi^p(k)\right)}_{B}$$

by Lemma 4. Consider two cases. First, if $\xi^p(0) < t(1)$ $(< \xi^p(\pm 1))$, then $A = [0, \xi^p(0)]$ by Lemma 5 and $B = \emptyset$. Second, if $\xi^p(0) \ge t(1)$, then $A = [0, t(1))$ and $B = [t(1), \min_{k \in \mathcal{K}} \xi^p(k)]$. In both cases, $E^{p,q} = A \cup B = [0, \min_{k \in \mathcal{K}} \xi^p(k)]$.

Proof of Proposition 4

Proof. Replicating the proofs in this appendix for market-share maximizing compa-

nies gives the desired result.

C Minor extensions

Relaxing Assumption 2 Assumption 2, or uniform strict obedience, mandates

that all voters’ signals must satisfy (SOB) for any given pair of segmentation tech-

nology and feasible policy profile. Three assumptions together guarantee that this is

the case: (i) the underlying state is binary, (ii) optimal signals are nondegenerate,

and (iii) voters face binary decision problems. Online Appendix O.3 investigates an

extension to a continuum of states. Here we discuss the consequences of relaxing (ii)

and (iii).

The next example shows that policy polarization could still be positive even if

extreme voters are excluded from information consumption.

Example 1. Let everything be as in the baseline model except that extreme voters are

excluded from information consumption. Take any symmetric policy profile $\langle -a, a \rangle$ with $a \in [0, \min\{t(1), \xi^S(0)\}]$, and suppose extreme voters vote along party lines

when they are indifferent between the two candidates. By construction, any unilateral

deviation of candidate R from 〈−a, a〉 doesn’t attract centrist voters, and it doesn’t

increase the total number of votes that extreme voters cast to him. Combining these

observations shows that 〈−a, a〉 can be sustained as an equilibrium outcome. ♦

The next example shows that policy polarization could still be positive, even if

extreme voters observe more than two signal realizations and do not always have

strict preferences between the two candidates.18

Example 2. Let everything be as in the baseline model except that extreme voters’

personalized signals have three realizations L, M and R, and they strictly prefer

candidate L to R given signal realization L, are indifferent between the two candidates

given signal realization M, and strictly prefer candidate R to L given signal realization

R. Below we demonstrate that the policy profile 〈−t (1) , t (1)〉 can be sustained as

an equilibrium outcome.

For each $k \in \{-1, 1\}$, write $\mu_z(k)$ for the posterior mean of the state conditional on type $k$ voters' signal realization being $z \in \{L, M, R\}$, and note that $v(-t(1), t(1), k) + \mu_L(k) < v(-t(1), t(1), k) + \mu_M(k) = 0 < v(-t(1), t(1), k) + \mu_R(k)$ by assumption. Bayes' plausibility implies that $\mu_L(k) < 0 < \mu_R(k)$. Consider any unilateral deviation of candidate R from $\langle -t(1), t(1) \rangle$ to $a'$. Clearly, no

a′ /∈ [−t (1) , t (1)] constitutes a profitable deviation, and no a′ ∈ [−t (1) , t (1)] at-

tracts centrist voters whose policy latitude is assumed to be greater than t (1). It

remains to show that no a′ ∈ [−t (1) , t (1)) affects extreme voters’ voting decisions.

18 We do not explicitly model the decision problem and signal generation process here. It is well known that the signal acquired by an RI decision-maker facing a finite decision problem (e.g., categorical thinking) has finitely many realizations (Matejka and McKay (2015)). Indeed, the same conclusion can sometimes be drawn for infinite decision problems (Jung et al. (2019)).

For k = −1, note that

v (−t (1) , a′,−1) + µL (−1)

≤ v (−t (1) , t (−1) ,−1) + µL (−1) (inverted V-shape)

= v (−t (1) ,−t (1) ,−1) + µL (−1) (symmetry)

= 0 + µL (−1)

< 0,

and the following must hold for z = M,R:

v (−t (1) , a′,−1) + µz (−1)

> v (−t (1) , t (1) ,−1) + µz (−1) (inverted V-shape)

≥ 0.

If, in addition, type −1 voters break the tie in favor of candidate R, then no a′ ∈

[−t (1) , t (1)) could affect their voting decisions. The proof for k = 1 is analogous

and hence is omitted. ♦

Heterogeneous attention cost parameter By allowing the attention cost pa-

rameter to differ across voters, we might (but not necessarily) end up in a situation

in which centrist voters’ participation constraint is binding whereas extreme voters’

participation constraints are slack in the broadcast case. But then the broadcast

signal would be the same as centrist voters’ personalized signal, so information per-

sonalization could only decrease policy polarization.


Alternative segmentation technologies For general segmentation technologies,

we can first aggregate—for each market segment—voters with binding participation

constraints into a representative voter, and then solve for the optimal personalized

signals for representative voters.

References

Athey, S., M. Mobius, and J. Pal. (2021): “The Impact of aggregators on

Internet news consumption,” NBER working paper.

Aumann, R., and M. Maschler. (1995): Repeated Games with Incomplete Infor-

mation, Cambridge, MA: MIT Press.

Burke, J. (2008): “Primetime spin: Media bias and belief confirming information,”

Journal of Economics and Management Strategy, 17(3), 633-665.

Calvert, R. L. (1985a): “The value of biased information: A rational choice model

of political advice,” Journal of Politics, 47(2), 530-555.

——— (1985b): “Robustness of the multidimensional voting model: Candidate mo-

tivations, uncertainty, and convergence,” American Journal of Political Science,

29(1), 69-95.

Caplin, A., and M. Dean. (2013): “Behavioral implications of rational inattention

with Shannon entropy,” NBER working paper.

——— (2015): “Revealed preference, rational inattention, and costly information

acquisition,” American Economic Review, 105(7), 2183-2203.

Chan, J., and W. Suen. (2008): “A spatial theory of news consumption and

electoral competition,” Review of Economic Studies, 75(3), 699-728.


Che, Y-K., and K. Mierendorff. (2019): “Optimal dynamic allocation of atten-

tion,” American Economic Review, 109(8), 2993-3029.

Cover, T. M., and J. A. Thomas. (2006): Elements of Information Theory,

Hoboken, NJ: John Wiley & Sons, 2nd ed.

Dean, M., and N. Neligh. (2019): “Experimental tests of rational inattention,”

Working Paper.

Dellarocas, C., J. Sutanto, M. Calin, and E. Palme. (2016): “Atten-

tion allocation in information-rich environments: The case of news aggregators,”

Management Science, 62(9), 2457-2764.

DellaVigna, S., and M. Gentzkow. (2010): “Persuasion: Empirical evidence,”

Annual Review of Economics, 2, 643-669.

Duggan, J. (2017): “A survey of equilibrium analysis in spatial model of elections,”

Working Paper.

Eslami, M., A. Aleyasen, K. G. Karahalios, K. Hamilton, and C. Sand-

vig. (2015): “FeedVis: A path for exploring news feed curation algorithms,”

CSCW’15 Companion: Proceedings of the 18th ACM Conference Companion on

Computer Supported Cooperative Work & Social Computing, 65-68.

European Parliament and Council of European Union (2016): Regulation (EU) 2016/679, https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=EN, Accessed 08/08/2020.

Fiorina, M. P., and S. J. Abrams. (2008): “Political polarization in the American

public,” Annual Review of Political Science, 11, 563-588.


Flaxman, S., S. Goel, and J. M. Rao. (2016): “Filter bubbles, echo chambers

and online news consumption,” Public Opinion Quarterly, 80(S1), 298-320.

Gentzkow, M. (2016): “Polarization in 2016,” Toulouse Network for Information

Technology whitepaper.

Gerber, A. S., J. G. Gimpel, D. P. Green, and D. R. Shaw. (2011): “How

large and long-lasting are the persuasive effects of televised campaign ads? Results

from a randomized field experiment,” American Political Science Review, 105(1),

135-150.

Glaeser, E. L., G. A. M. Ponzetto, and J. M. Shapiro. (2005): “Strategic

extremism: Why Republicans and Democrats divide on religious values,” Quarterly

Journal of Economics, 120(4), 1283-1330.

Hebert, B., and M. D. Woodford. (2018): “Rational inattention in continuous

time,” Working Paper.

Herrera, H., D. K. Levine, and C. Martinelli. (2008): “Policy platforms,

campaign spending and voter participation,” Journal of Public Economics, 92(3-

4), 501-513.

Hersh, E. D. (2015): Hacking the Electorate: How Campaigns Perceive Voters,

Cambridge, U.K.: Cambridge University Press.

Hotelling, H. (1929): “Stability in competition,” The Economic Journal, 39(153),

41-57.

Iyer, G., D. Soberman, and M. Villas-Boas. (2005): “The targeting of adver-

tising,” Marketing Science, 24(3), 461-476.


Jung, J., J. Kim, F. Matejka, and C. A. Sims. (2019): “Discrete actions in

information-constrained problems,” Review of Economic Studies, 86(6), 2643-2667.

Kamenica, E., and M. Gentzkow. (2011): “Bayesian persuasion,” American

Economic Review, 101(6), 2590-2615.

Lagun, D., and M. Lalmas. (2016): “Understanding user attention and engage-

ment in online news reading,” Proceedings of the Ninth ACM International Con-

ference on Web Search and Data Mining, 113-122.

Martinelli, C. (2006): “Would Rational Voters Acquire Costly Information?,”

Journal of Economic Theory, 129(1), 225-251.

Matejka, F., and A. McKay. (2012): “Simple market equilibria with ratio-

nally inattentive consumers,” American Economic Review: Papers and Proceedings,

102(3), 24-29.

——— (2015): “Rational inattention to discrete choices: A new foundation for the

multinomial logit model,” American Economic Review, 105(1), 272-298.

Matejka, F., and G. Tabellini. (2020): “Electoral competition with rationally

inattentive voters,” Journal of European Economic Association, jvaa042.

Matsa, K. E., and K. Lu. (2016): “10 Facts about the changing digital news

landscape,” Pew Research Center, September 14.

Morris, S., and P. Strack. (2017): “The Wald problem and the equivalence of

sequential sampling and static information costs,” Working Paper.

Pariser, E. (2011): The Filter Bubble: How the New Personalized Web Is Changing

What We Read and How We Think, New York, NY: Penguin Press.


Perego, J., and S. Yuksel. (2022): “Media competition and social disagreement,”

Econometrica, 90(1), 223-265.

Prat, A., and D. Stromberg. (2013): “The Political Economy of Mass Media,” in

Advances in Economics and Econometrics: Theory and Applications, Tenth World

Congress, ed. by D. Acemoglu, M. Arellano, and E. Dekel. Cambridge University

Press.

Sauer, M. P., M. G. Schlatterer, and S. Y. Schmitt. (2019): “Horizontal

product differentiation with limited attentive consumers,” Working paper.

Sims, C. A. (1998): “Stickiness,” Carnegie-Rochester Conference Series on Public

Policy, 49(1), 317-356.

——— (2003): “Implications of rational inattention,” Journal of Monetary Eco-

nomics, 50(3), 665-690.

Spiegler, R. (2016): “Choice complexity and market competition,” Annual Review

of Economics, 8, 1-25.

Stromberg, D. (2015): “Media and politics,” Annual Review of Economics, 7,

173-205.

Suen, W. (2004): “The self-perpetuation of biased beliefs,” The Economic Journal,

114, 377-396.

Sunstein, C. R. (2009): Republic.com 2.0, Princeton, NJ: Princeton University

Press.

The Digital Competition Expert Panel. (2019): Unlocking Digital Competi-

tion, U.K.


Warren, E. (2019): “Here’s how we can break up Big Tech,” Medium, March 8.

Yuksel, S. (2022): “Specialized learning and political polarization,” International

Economic Review, 63(1), 457-474.

Zhong, W. (2019): “Optimal dynamic information acquisition,” Working Paper.


Online Appendices

(For Online Publication Only)

O.1 General model

This appendix has two purposes. The first purpose is to extend the baseline model to

general voters. Throughout, suppose candidates can adopt the policies in a compact interval $A = [-\bar{a}, \bar{a}]$ with $\bar{a} > 0$. Voters' type space is an arbitrary finite set $\mathcal{K} = \{-K, \cdots, 0, \cdots, K\}$. Their population function $q : \mathcal{K} \to \mathbb{R}_{++}$ has support $\mathcal{K}$ and is symmetric around zero. Their utility function $u : A \times \mathcal{K} \to \mathbb{R}$ satisfies the properties listed in Observation 2, i.e.,

Assumption O1.

Continuity and weak concavity: $u(\cdot, k)$ is continuous and weakly concave for any $k \in \mathcal{K}$.

Symmetry: $u(a, k) = u(-a, -k)$ for any $a \in \mathbb{R}$ and $k \in \mathcal{K}$.

Inverted V-shape: $u(\cdot, k)$ is strictly increasing on $[-\bar{a}, t(k)]$ and is strictly decreasing on $[t(k), \bar{a}]$ for any $k \in \mathcal{K}$, where $t : \mathcal{K} \to A$ is strictly increasing and symmetric around zero.

Increasing differences: $v(-a, a', k) := u(a, k) - u(a', k)$ is increasing in $k$ for any $a > a'$. For any $a > 0$, $v(-a, a, k) := u(a, k) - u(-a, k)$ is strictly positive if $k > 0$, equals zero if $k = 0$, and is strictly negative if $k < 0$.

Observation O1. All results of Section 3.1 remain valid under Assumptions 1, 2,

and O1.

Proof. The proof for the personalized case is the exact same as before. As for the

broadcast case, notice that under the current assumptions, only voters of the most

extreme types can have binding participation constraints, whereas those of interim

types must have slack participation constraints. Replacing k = ±1 with k = ±K in

the proofs of Lemmas 1 and 2 and Theorem 1 gives the desired result.


The second purpose of this appendix is to relax the assumption that signals are

conditionally independent across market segments. In what follows, we’ll first develop

new concepts in Appendix O.1.1 and then conduct equilibrium analyses in Appendix

O.1.2.

O.1.1 Key concepts

Joint signal distribution A joint signal distribution is a tuple 〈χ,b+,b−〉 of a

configuration matrix χ and probability vectors b+ and b−. The configuration matrix

χ has |K| rows. Each column of it constitutes a profile of the voting recommendations

that is prescribed to type −K, · · · , K voters with a strictly positive probability. Each

entry of χ is either 0 or 1, where 0 means that candidate R is disapproved of and 1

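means that he is endorsed. For example, the configuration matrix is

$$\chi^{*} = \begin{bmatrix} 0 & 1 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{bmatrix}$$

if $S = b$, and it is

$$\chi^{**} = \underbrace{\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 1 & \cdots & 0 & \cdots & 1 \\
0 & 0 & 1 & \cdots & 0 & 1 & \cdots & 0 & \cdots & 1 \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \cdots & \vdots & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 1 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 1 & 0 & \cdots & 1 & \cdots & 1
\end{bmatrix}}_{2^{|\mathcal{K}|} \text{ columns}}$$

if $S = p$ and signals are conditionally independent across voters. The vectors $\mathbf{b}^+$ and $\mathbf{b}^-$ compile the probabilities that each column of $\chi$ occurs in states $\omega = 1$ and $\omega = -1$, respectively. By definition, all elements of $\mathbf{b}^+$ or $\mathbf{b}^-$ are strictly positive and add up to one.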
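The two configuration matrices can be generated mechanically. The following is a minimal sketch, assuming matrices are stored as lists of rows (one row per voter type, ordered from −K to K); the function names are illustrative and not part of the paper. The columns of χ∗∗ come out in a different order than in the display above, which is immaterial because a configuration is characterized by its set of columns.

```python
from itertools import product


def chi_broadcast(n_types):
    """chi*: two columns, everyone is told 0 (R disapproved) or everyone is told 1 (R endorsed)."""
    return [[0, 1] for _ in range(n_types)]


def chi_independent(n_types):
    """chi**: one column per 0/1 recommendation profile, hence 2**n_types columns."""
    columns = list(product([0, 1], repeat=n_types))
    # Transpose so that rows index voter types and columns index profiles.
    return [[col[k] for col in columns] for k in range(n_types)]


def columns_of(chi):
    """Return the columns of a row-major matrix as tuples."""
    return [tuple(row[j] for row in chi) for j in range(len(chi[0]))]


chi_star = chi_broadcast(3)          # three types: -1, 0, 1
chi_star_star = chi_independent(3)
```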

We consider symmetric joint signal distributions that are consistent with the marginal signal distributions solved in Section 3.1. To formally define symmetry, let $\mathbf{x}$ be a generic voting recommendation profile to type $-K, \cdots, K$ voters, $\mathbf{1}$ be the $|\mathcal{K}|$-vector of ones, and

$$\mathbf{P} = \begin{bmatrix} & & 1 \\ & \iddots & \\ 1 & & \end{bmatrix}$$

be a $|\mathcal{K}| \times |\mathcal{K}|$ permutation matrix. Define the symmetry operator $\Sigma$ as

$$\Sigma\, \mathbf{x} = \mathbf{P}(\mathbf{1} - \mathbf{x}),$$

so that $\mathbf{x}$ recommends candidate $z \in \{L, R\}$ to type $k$ voters if and only if $\Sigma\, \mathbf{x}$ recommends candidate $-z$ to type $-k$ voters. A joint signal distribution is symmetric if the probability that a voting recommendation profile $\mathbf{x}$ occurs in state $\omega = 1$ equals the probability that $\Sigma\, \mathbf{x}$ occurs in state $\omega = -1$. Formally,

Definition O1. A configuration matrix $\chi$ is symmetric if for any $m \in \{1, \cdots, \#\text{columns}(\chi)\}$, there exists $n \in \{1, \cdots, \#\text{columns}(\chi)\}$ such that $\Sigma\, [\chi]_m = [\chi]_n$. A joint signal distribution $\langle \chi, \mathbf{b}^+, \mathbf{b}^- \rangle$ is symmetric if $\chi$ is symmetric and $[\mathbf{b}^+]_m = [\mathbf{b}^-]_n$ for any $m, n$ as above.19
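A small sketch of the symmetry check follows. It assumes that recommendation profiles are stored as tuples ordered from type −K to type K, so that the permutation P simply reverses the order of the entries; distributions are stored as dictionaries from columns to probabilities. All names and numbers are illustrative.

```python
def sigma(x):
    """Symmetry operator: Sigma x = P(1 - x), with P reversing the order of the entries."""
    return tuple(1 - entry for entry in reversed(x))


def is_symmetric_configuration(columns):
    """A configuration is symmetric if its set of columns is closed under sigma."""
    cols = set(columns)
    return all(sigma(c) in cols for c in cols)


def is_symmetric_distribution(b_plus, b_minus, tol=1e-12):
    """Check [b+]_m = [b-]_n whenever sigma maps column m to column n."""
    if not is_symmetric_configuration(b_plus.keys()):
        return False
    return all(abs(p - b_minus.get(sigma(c), 0.0)) <= tol for c, p in b_plus.items())


# Hypothetical broadcast example with three voter types.
b_plus = {(0, 0, 0): 0.2, (1, 1, 1): 0.8}
b_minus = {(0, 0, 0): 0.8, (1, 1, 1): 0.2}
```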

We next define consistency. In Footnote 16, we solved for the marginal probabilities that the signal consumed by type $k$ voters endorses candidate R in states $\omega = 1$ and $\omega = -1$, respectively, holding any segmentation technology $S$ and symmetric policy profile $\langle -a, a \rangle$ fixed. Compiling these probabilities across type $-K, \cdots, K$ voters yields two $|\mathcal{K}|$-vectors $\boldsymbol{\pi}^{S,+}(a)$ and $\boldsymbol{\pi}^{S,-}(a)$ of marginal probabilities.

19 With a slight abuse of notation, we use $[\cdot]_m$ to denote both the $m$th entry of a column vector and the $m$th column of a matrix. $\#\text{columns}(\chi)$ denotes the number of columns of $\chi$.

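Definition O2. A joint signal distribution $\langle \chi, \mathbf{b}^+, \mathbf{b}^- \rangle$ is $\langle S, a \rangle$-consistent for some $S \in \{b, p\}$ and $a \in [0, \bar{a}]$ if

$$\chi \mathbf{b}^+ = \boldsymbol{\pi}^{S,+}(a) \quad \text{and} \quad \chi \mathbf{b}^- = \boldsymbol{\pi}^{S,-}(a).$$

A configuration matrix $\chi$ is $\langle S, a \rangle$-consistent if there exist probability vectors $\mathbf{b}^+$ and $\mathbf{b}^-$ such that the joint signal distribution $\langle \chi, \mathbf{b}^+, \mathbf{b}^- \rangle$ is $\langle S, a \rangle$-consistent. $\chi$ is $S$-consistent if it is $\langle S, a \rangle$-consistent for all $a \in [0, \bar{a}]$.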
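To illustrate the consistency condition, the sketch below builds the joint distribution implied by conditionally independent recommendations and verifies that its row sums (the entries of χb) reproduce the marginals. The marginal probabilities are hypothetical numbers chosen for illustration; only the state ω = 1 is shown, since the check for ω = −1 is identical.

```python
from itertools import product


def product_distribution(marginals):
    """Probability of each 0/1 profile when recommendations are independent across types."""
    dist = {}
    for col in product([0, 1], repeat=len(marginals)):
        prob = 1.0
        for rec, pi_k in zip(col, marginals):
            prob *= pi_k if rec == 1 else 1.0 - pi_k
        dist[col] = prob
    return dist


def implied_marginals(dist, n_types):
    """Entry k of chi b: total probability of the columns that endorse R for type k."""
    return [sum(p for col, p in dist.items() if col[k] == 1) for k in range(n_types)]


def is_consistent(dist, target, tol=1e-9):
    """Check chi b = pi entrywise, up to a numerical tolerance."""
    implied = implied_marginals(dist, len(target))
    return all(abs(x - y) <= tol for x, y in zip(implied, target))


PI_PLUS = [0.6, 0.7, 0.9]  # hypothetical marginals pi^{p,+}(a) for types -1, 0, 1
b_plus_independent = product_distribution(PI_PLUS)
b_plus_broadcast = {(0, 0, 0): 0.3, (1, 1, 1): 0.7}  # hypothetical broadcast distribution
```

The broadcast distribution can only match marginals that are equal across types, which is one way to see why χ∗ is not consistent with heterogeneous personalized marginals.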

By definition, χ∗ is b-consistent and, indeed, the only 〈b, a〉-consistent configuration matrix for any given a ∈ [0, ā]. χ∗∗ is p-consistent, but it is not the unique p-consistent configuration in general (numerical examples are available upon request).

Attraction-proof set and policy latitude In the proof of Theorem 2, we defined

a voter’s susceptibility to a unilateral deviation of candidate R from 〈−a, a〉

to a′ as φS(−a, a′, k) := v(−a, a′, k) + µSL(a, k), and noted that a′ attracts the voter

if and only if φS(−a, a′, k) > 0. We also defined the attraction-proof set and policy

latitude for every individual voter. The next definition generalizes the above concepts

to sets of voters.

Definition O3. Under segmentation technology $S \in \{b, p\}$, a deviation of candidate R from a symmetric policy profile $\langle -a, a \rangle$ with $a \in [0, \bar{a}]$ to $a'$ attracts a set $D \subseteq \mathcal{K}$ of voters if it attracts all its members, i.e., $\phi^S(-a, a', k) > 0$ $\forall k \in D$. This is equivalent to

$$\phi^S(-a, a', D) := \min_{k \in D} \phi^S(-a, a', k) > 0,$$

where $\phi^S(-a, a', D)$ is the $D$-susceptibility to $a'$ following unfavorable information to candidate R's valence. The $D$-proof set $\Xi^S(D)$ gathers all nonnegative policy $a$'s such that no deviation of candidate R from $\langle -a, a \rangle$ attracts $D$, i.e.,

$$\Xi^S(D) := \left\{ a \in [0, \bar{a}] : \max_{a' \in A} \phi^S(-a, a', D) \le 0 \right\}.$$

The maximum of the $D$-proof set

$$\xi^S(D) := \max \Xi^S(D)$$

is $D$'s policy latitude.

Influential coalition The next concept is integral to the upcoming analysis.

Definition O4. Fix any segmentation technology S ∈ {b, p}, symmetric policy profile

〈−a, a〉 with a ∈ [0, ā], and population function q, and let the default be the strictly

obedient outcome induced by any joint signal distribution 〈χ,b+,b−〉 that is 〈S, a〉-

consistent. A set C ⊆ K of voters constitutes an influential coalition if attracting C

while holding other things constant strictly increases candidate R’s winning probability

compared to the default.

Notice that majority coalitions are influential, and supersets of influential coali-

tions are influential. In the broadcast case, all voters consume the same signal, so

a coalition of voters is influential if and only if it is a majority coalition. In the

personalized case, non-majority coalitions can be influential due to the imperfect

correlation between different voters’ signals (see Table 1 for an illustration). In principle, influential coalitions can depend on the entire joint signal distribution (and

certainly voters’ population distribution). The next lemma limits such dependence

to the configuration matrix only.

Lemma O1. Let everything be as in Definition O4. Then influential coalitions depend

on the joint signal distribution 〈χ,b+,b−〉 and voters’ population distribution q only

through the pair 〈χ, q〉, and they are independent of the policy profile 〈−a, a〉 if χ is

S-consistent.
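The dependence on 〈χ, q〉 alone can be made concrete with a short enumeration. The sketch below rests on our reading of the argument in Appendix O.4, and on two assumptions that should be treated as such: attracting a coalition C turns the recommendations of its members into votes for candidate R in every column, and a tie in vote shares gives candidate R a winning probability of 1/2. Because every column occurs with strictly positive probability in each state, a coalition is then influential exactly when it strictly improves candidate R's outcome in at least one column. Types are indexed 0, …, n−1 in the order −K, …, K.

```python
from fractions import Fraction
from itertools import combinations, product


def win_value(share):
    """Candidate R's winning probability given his vote share (tie = 1/2 is an assumption)."""
    half = Fraction(1, 2)
    if share > half:
        return Fraction(1)
    if share == half:
        return half
    return Fraction(0)


def influential_coalitions(columns, q):
    """Coalitions whose attraction strictly raises R's outcome in at least one column."""
    n = len(q)
    found = []
    for size in range(1, n + 1):
        for coalition in combinations(range(n), size):
            for col in columns:
                base = sum(q[k] for k in range(n) if col[k] == 1)
                attracted = sum(q[k] for k in range(n) if col[k] == 1 or k in coalition)
                if win_value(attracted) > win_value(base):
                    found.append(coalition)
                    break
    return found


q_uniform = [Fraction(1, 3)] * 3                      # three types, uniform population
broadcast_cols = [(0, 0, 0), (1, 1, 1)]               # columns of chi*
independent_cols = list(product([0, 1], repeat=3))    # columns of chi**

under_broadcast = set(influential_coalitions(broadcast_cols, q_uniform))
under_independent = set(influential_coalitions(independent_cols, q_uniform))
```

In this toy example only majority coalitions are influential under χ∗, whereas every nonempty coalition, including each single type, is influential under χ∗∗.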

Omitted proofs from Appendices O.1-O.3 are gathered in Appendix O.4.

O.1.2 Main results

The next lemma gives a full characterization of the symmetric policy profiles that

can arise in equilibrium, thus extending Lemma 4 to general voters and joint signal

distributions.

Lemma O2. Fix any pair of segmentation technology S ∈ {b, p} and population

function q, and assume Assumptions 1, 2, and O1. Then the following are equivalent.

(i) A symmetric policy profile 〈−a, a〉 with a ∈ [0, ā] can arise in an equilibrium

with a joint signal distribution 〈χ,b+,b−〉 that is 〈S, a〉-consistent.

(ii) No deviation of candidate R from 〈−a, a〉 to a′ ∈ [−a, a) attracts any influential

coalition formed under 〈χ, q〉 whose members have ideological bliss points in

[−a, a].

In what follows, we’ll use ES,χ,q to denote the set of the nonnegative policy a’s such

that 〈−a, a〉 can arise in equilibrium under segmentation technology S, configuration

matrix χ, and population function q. As before, we are interested in the degree of


policy polarization aS,χ,q, defined as the maximum of ES,χ,q, and whether all policies

between zero and aS,χ,q can arise in equilibrium. We focus on χs that are S-consistent,

so that ES,χ,q can be computed in two simple steps.20

1. Compute the influential coalitions formed under 〈χ, q〉.

2. For each a ∈ [0, ā], check if any deviation as in Lemma O2(ii) is profitable to

candidate R. If the answer is negative, then add a to the output set.
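The second step can be sketched as a grid search. Everything numerical below is hypothetical: the list of influential coalitions is taken as given (the output of Step 1), utilities follow the baseline form u(a, k) = −|t(k) − a|, and the posterior mean µ_L is held constant in a for each type, which is a simplification made only to keep the example short (in the paper it depends on a). Policies are enumerated in hundredths to avoid floating-point drift on the grid.

```python
def u(a, t_k):
    """Baseline utility u(a, k) = -|t(k) - a|."""
    return -abs(t_k - a)


def phi(a, a_dev, t_k, mu_l):
    """Susceptibility phi(-a, a', k) = v(-a, a', k) + mu_L, with v = u(a', k) - u(-a, k)."""
    return u(a_dev, t_k) - u(-a, t_k) + mu_l


def equilibrium_grid(coalitions, bliss, mu_l, a_bar_hundredths=200):
    """Grid version of Step 2: keep a if no deviation a' in [-a, a) attracts a relevant coalition."""
    output = []
    for i in range(a_bar_hundredths + 1):
        a = i / 100
        # Only coalitions whose members have bliss points in [-a, a] matter (Lemma O2(ii)).
        relevant = [c for c in coalitions if all(abs(bliss[k]) <= a for k in c)]
        profitable = any(
            min(phi(a, j / 100, bliss[k], mu_l[k]) for k in c) > 1e-12
            for c in relevant
            for j in range(-i, i)  # deviations a' = j / 100 in [-a, a)
        )
        if not profitable:
            output.append(a)
    return output


BLISS = [-0.5, 0.0, 0.5]         # hypothetical bliss points t(-1), t(0), t(1)
MU_L = [-1.6, -0.8, -1.6]        # hypothetical posterior means after unfavorable news
COALITIONS = [(0,), (1,), (2,)]  # hypothetical Step 1 output: every single type is influential

E = equilibrium_grid(COALITIONS, BLISS, MU_L)
```

With these inputs the output set is the grid version of the interval [0, 0.8]: the centrist type has the smallest policy latitude and therefore pins down the maximum sustainable policy.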

In addition, we impose the following regularities on the susceptibility function (see

Appendix O.4 for sufficient conditions).

Assumption O2. $\phi^S(-a, a', k)$ is increasing in $a$ on $[|t(k)|, \bar{a}]$ for any $S \in \{b, p\}$,

k ∈ K, and a′ ∈ A.

Theorem O1. Fix any segmentation technology $S \in \{b, p\}$, $S$-consistent configuration matrix $\chi$, and population function $q$. Let $C$ denote a typical influential coalition formed under $\langle \chi, q \rangle$. Under Assumptions 1, 2, O1, and O2, $E^{S,\chi,q} = [0, a^{S,\chi,q}]$, where

$$a^{S,\chi,q} = \min_{C\text{s formed under } \langle \chi, q \rangle} \xi^S(C) > 0.$$

The messages are twofold. First, policy polarization is in general disciplined by

the influential coalition with the smallest policy latitude and is strictly positive. Sec-

ond, marginal signal distributions affect policy polarization through policy latitudes,

whereas the joint signal distribution does so through the configuration matrix, holding

marginal signal distributions constant.

The remainder of this appendix investigates the comparative statics of policy po-

larization regarding influential coalitions, holding marginal signal distributions (and

hence voters’ policy latitudes) fixed. Our starting observation is that enriching the configuration matrix enriches influential coalitions and, by Theorem O1, reduces policy polarization. Formally, we say that χ is richer than χ′ and write χ ⪰ χ′ if χ prescribes more voting recommendation profiles than χ′.

20 For an arbitrary χ, one needs to check after Step 2 whether the output policy a is 〈S, a〉-consistent with χ.

Definition O5. χ ⪰ χ′ if every column of χ′ is a column of χ.

Observation O2. {Cs formed under 〈χ′, q〉} ⊆ {Cs formed under 〈χ, q〉} for any segmentation technology S ∈ {b, p}, any S-consistent configuration matrices χ and χ′ such that χ ⪰ χ′, and any population function q.

We examine two implications of Observation O2. In the case of personalized infor-

mation, we show that policy polarization is minimized when signals are conditionally

independent across voters for any given population distribution, and that policy polarization attains its global minimum $\min_{k \in \mathcal{K}} \xi^p(k)$ if, in addition, voters' population distribution is uniform across types.

Proposition O1. Under Assumptions 1, 2, O1, and O2, $\min_{k \in \mathcal{K}} \xi^p(k) = a^{p,\chi^{**},\text{uniform}} \le a^{p,\chi^{**},q} \le a^{p,\chi,q}$ for any $p$-consistent configuration matrix $\chi$ and any population function $q$.

Proof. The second inequality holds because $\chi^{**} \succeq \chi$ for any $p$-consistent $\chi$. To establish the first equality and inequality, note that under $\chi^{**}$ and uniform population distribution, each type of voter is influential, and the resulting collection of influential coalitions, $2^{\mathcal{K}} - \{\emptyset\}$, is the richest across all scenarios.

The implications of Proposition O1 are twofold. First, Theorem 2 of the main

text prescribes the exact lower bound for the policy polarization in the case of per-

sonalized information. Second, as long as this lower bound stays positive, changes in

the environment (e.g., enriching voters’ types, dividing the same type of voters into

multiple subgroups) wouldn’t render policy polarization trivial.


Consider next the transition from broadcast information to personalized infor-

mation, which enriches the configuration matrix and hence has a negative policy

polarization effect, holding other things constant.

Proposition O2. {Cs formed under 〈χ∗, q〉} ⊆ {Cs formed under 〈χ, q〉} for any p-consistent configuration matrix χ and any population function q.

Proof. ∀χ and q as above, {Cs formed under 〈χ, q〉} ⊇ {majority coalitions} = {Cs formed under 〈χ∗, q〉}.

Given Proposition O2, the reader can safely attribute the increasing policy po-

larization as shown in Proposition 1 of the main text to changes in marginal signal

distributions.

We finally investigate the policy polarization effect of increasing mass polarization.

As in the main text, we define increasing mass polarization as a mean-preserving

spread of voters’ policy preferences.

Definition O6. The mass is more polarized under $q'$ than $q$ if $q$ has second-order stochastic dominance over $q'$ (write $q$ SOSD $q'$), i.e., $\sum_{k=m}^{K} q(k) \le \sum_{k=m}^{K} q'(k)$ $\forall m = 1, \cdots, K$.
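The tail-sum condition in Definition O6 is straightforward to check for a given pair of population functions. A minimal sketch, with hypothetical symmetric populations stored as dictionaries from type to mass:

```python
def tail_sums(q, K):
    """Tail masses sum_{k=m}^{K} q(k) for m = 1, ..., K."""
    return [sum(q[k] for k in range(m, K + 1)) for m in range(1, K + 1)]


def more_polarized(q_prime, q, K, tol=1e-12):
    """True if the mass is more polarized under q_prime than under q (Definition O6)."""
    return all(s <= sp + tol for s, sp in zip(tail_sums(q, K), tail_sums(q_prime, K)))


# Hypothetical populations with K = 1: q_prime moves mass from the center to the extremes.
q = {-1: 0.2, 0: 0.6, 1: 0.2}
q_prime = {-1: 0.3, 0: 0.4, 1: 0.3}
```

Because both populations are symmetric around zero, checking the right tails m = 1, …, K suffices.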

The analysis assumes quadratic attention cost.

Assumption O3. $h(\mu) = \mu^2$.

The next proposition proves a similar result to Proposition 3 for general voters

and p-consistent configurations.

Proposition O3. Under Assumptions 1, 2, O1, and O3, $a^{p,\chi,q} \ge a^{p,\chi,q'}$ for any $p$-consistent configuration $\chi$ and any two population functions $q$ and $q'$ such that $q$ SOSD $q'$.


O.2 Competitive infomediaries

This appendix investigates an extension to competitive infomediaries. In the environ-

ment laid out in Appendix O.1, suppose each type k ∈ K voter is served by m (k) ≥ 2

infomediaries. A market segment is a pair $(k, i)$, where $k \in \mathcal{K}$ represents the type of the voters being served, and $i \in \{1, \cdots, m(k)\}$ represents the serving infomediary. The population of the voters in market segment $(k, i)$ is $\rho(k, i)$, where $\rho(k, i) > 0$ and $\sum_{i=1}^{m(k)} \rho(k, i) = q(k)$. The functions $m$ and $\rho$ are symmetric (i.e., $m(k) = m(-k)$ and $\rho(k, i) = \rho(-k, i)$ for any $k \in \mathcal{K}$ and $i = 1, \cdots, m(k)$), and they are taken as given throughout this appendix.

For any given symmetric policy profile $\mathbf{a} = \langle -a, a \rangle$ with $a \in [0, \bar{a}]$, the signal for market segment $(k, i)$ maximizes the net expected utilities of the voters therein (as in the standard RI model):

$$\max_{\Pi} \; V(\Pi; \mathbf{a}, k) - \lambda \cdot I(\Pi).$$

Across market segments, we consider all joint signal distributions that are symmetric

and consistent with the marginal signal distributions that solve the above problem

(hereafter c-consistency). As in Appendix O.1, we can represent a joint signal dis-

tribution by its matrix form and define the c-consistency of the configuration. This

exercise is omitted for brevity.

We examine the policy polarization effect of introducing perfect competition between infomediaries. To facilitate comparison with the monopolistic personalized

case, we redefine p-consistency by first forming market segments using functions m

and ρ and then restricting voters of the same type to receiving the same voting

recommendation. By Lemma O1, equilibrium policies are fully determined by (1) $S \in \{c, p\}$, which pins down marginal signal distributions, (2) the configuration $\chi$, and (3) population functions $m$ and $\rho$. Hereafter we shall use $E^{S,\chi,m,\rho}$ to denote the equilibrium policy set and $a^{S,\chi,m,\rho}$ to denote policy polarization.

The next proposition prescribes sufficient conditions for competition to reduce

policy polarization.

Proposition O4. Fix any functions $m$ and $\rho$ as above, and assume Assumptions 1, 2, O1, and O2 for $S \in \{c, p\}$. Then $E^{c,\chi,m,\rho} = [0, a^{c,\chi,m,\rho}] \subsetneq E^{p,\chi',m,\rho} = [0, a^{p,\chi',m,\rho}]$ for any $c$-consistent configuration $\chi$ and $p$-consistent configuration $\chi'$ such that $\chi \succeq \chi'$.

Two forces are acting in the same direction. First, competitive signals maximize

voters’ expected utilities rather than their attention and hence are less Blackwell-

informative than monopolistic personalized signals. As infomediaries stop overfeeding

voters with information about the valence state, voters become more susceptible to

policy deviations, so their policy latitudes fall. Second, dividing voters of the same

type into multiple subgroups reduces the correlation between their signals. Such a

change enriches the configuration without affecting marginal signal distributions, so

its policy polarization effect is negative.

O.3 General state distribution

This appendix extends the analysis so far to general state distributions. In the en-

vironment laid out in Appendix O.1, suppose the valence state is distributed on R

according to a c.d.f. $G$ that is absolutely continuous and symmetric around zero.21 A signal structure is a mapping $\Pi : \mathbb{R} \to \Delta(Z)$, where each $\Pi(\cdot \mid \omega)$ specifies a probability distribution over a finite set $Z$ of signal realizations when the state realization is $\omega \in \mathbb{R}$. Under signal structure $\Pi$,

$$\pi_z = \int_{\omega \in \mathbb{R}} \Pi(z \mid \omega)\, dG(\omega)$$

is the probability that the signal realization is $z \in Z$, and it is assumed w.l.o.g. to be strictly positive. Then

$$\mu_z = \int_{\omega \in \mathbb{R}} \omega\, \Pi(z \mid \omega)\, dG(\omega) \Big/ \pi_z$$

is the posterior mean of the state conditional on the signal realization being $z \in Z$.

21 Results below hold for discrete $G$s, too. Assuming that $\omega \in \mathbb{R}$ is w.l.o.g. because RI voters who care ultimately about the differential quality between the two candidates would only acquire information about this single-dimensional random variable (Matejka and McKay (2015)).

The next assumption is adapted from Matejka and McKay (2015).

Assumption O4. The needed amount of attention for consuming Π : R→ ∆ (Z) is

I (Π) = H (G)− EΠ [H (G (· | z))]

where H (G) is the entropy of the valence state, and H (G (· | z)) is the conditional

entropy of the valence state given signal realization z.
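For a state with finitely many values (the discrete case mentioned in the footnote), the objects π_z, µ_z and I(Π) can be computed directly. The sketch below is illustrative only: the prior, the state values and the signal structure are hypothetical, and entropy is measured in nats.

```python
from math import log


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * log(x) for x in p if x > 0)


def signal_statistics(prior, states, Pi):
    """prior[i] = P(omega_i); Pi[i][z] = Pi(z | omega_i). Returns (pi_z, mu_z, I(Pi))."""
    n_states, n_signals = len(states), len(Pi[0])
    # Marginal probability of each signal realization.
    pi = [sum(prior[i] * Pi[i][z] for i in range(n_states)) for z in range(n_signals)]
    # Posterior mean of the state given each signal realization.
    mu = [
        sum(states[i] * prior[i] * Pi[i][z] for i in range(n_states)) / pi[z]
        for z in range(n_signals)
    ]
    # Posterior distribution over states given each signal realization.
    posterior = [
        [prior[i] * Pi[i][z] / pi[z] for i in range(n_states)] for z in range(n_signals)
    ]
    # Attention = prior entropy minus expected posterior entropy.
    info = entropy(prior) - sum(pi[z] * entropy(posterior[z]) for z in range(n_signals))
    return pi, mu, info


STATES = [-1, 1]                       # binary valence state, as in the baseline model
PRIOR = [0.5, 0.5]
PI_SYM = [[0.8, 0.2], [0.2, 0.8]]      # hypothetical symmetric binary signal
PI_NULL = [[0.5, 0.5], [0.5, 0.5]]     # uninformative signal

pi_z, mu_z, attention = signal_statistics(PRIOR, STATES, PI_SYM)
_, _, attention_null = signal_statistics(PRIOR, STATES, PI_NULL)
```

As expected, the uninformative signal requires no attention, while the informative one requires a strictly positive amount.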

In what follows, we’ll first give characterizations of optimal signals and then ex-

amine their implications for policy polarization. To achieve the first goal, we fix, as

in Section 3.1, any symmetric policy profile $\langle -a, a \rangle$ with $a \in [0, \bar{a}]$ and use $\Pi^S(a, k)$ to denote any optimal signal consumed by type $k$ voters under segmentation technology $S \in \{b, p\}$. When $S = b$, we drop the notation $k$ and simply write $\Pi^b(a)$.

For each ΠS (a, k), we use ZS (a, k) to denote its support and µSz (a, k) to denote the

posterior mean of the state conditional on the signal realization being z ∈ ZS (a, k).

The next proposition gives characterizations of optimal broadcast and personalized

signals, thus extending Lemma 1 and Theorem 1 to general state distributions.


Proposition O5. Fix any symmetric policy profile 〈−a, a〉 with a > 0, and assume

Assumptions O1 and O4. Then,

(i) any optimal personalized signal Πp (a, k) that is nondegenerate and makes type

k voters’ participation constraint binding must satisfy |Zp (a, k) | = 2, (SOB)

and the skewness properties stated in Theorem 1(ii);

(ii) any optimal broadcast signal $\Pi^b(a)$ that is nondegenerate, induces consumption from all voters and makes some voter's participation constraint binding must satisfy $|Z^b(a)| \in \{2, 3\}$:

(a) if $|Z^b(a)| = 2$, then $\Pi^b(a)$ satisfies (SOB) and the skewness properties stated in Theorem 1(i);

(b) if $|Z^b(a)| = 3$, then we can write $Z^b(a) = \{LL, LR, RR\}$, where $\mu^b_{LL}(a) < 0$, $\mu^b_{LR}(a) = 0$, and $\mu^b_{RR}(a) = |\mu^b_{LL}(a)| > 0$. For any $k \in \mathcal{K}$, we must have $v(a, k) + \mu^b_{LL}(a) < 0$, $\operatorname{sgn}(v(a, k) + \mu^b_{LR}(a)) = \operatorname{sgn}(k)$, and $v(a, k) + \mu^b_{RR}(a) > 0$.

With a continuum of states, the optimal broadcast signal can have three rather

than two signal realizations. Recall that when solving for the broadcast case, we

aggregate voters with binding participation constraints into a representative voter.

Under the current assumptions, only voters of the most extreme types can have binding participation constraints, and the representative voter acting on their behalf

takes at most three final actions: LL, LR, RR (the first and second letters stand for

the voting decisions of the left-leaning and right-leaning voters, respectively). This

observation, together with the assumption that the attention cost function is strictly

Blackwell-monotone, implies that the optimal personalized signal for the representa-

tive voter has at most three signal realizations.


The analysis of policy polarization is the same as before in the case of two signal

realizations. In the new case of three signal realizations, it can be shown that all

voters strictly obey the recommendations LL and RR, and that the posterior mean

of the state given LR must equal zero. As argued in the footnote below, this implies

that the only symmetric policy profile that can arise in equilibrium is 〈0, 0〉, hence

the transition from broadcast to personalized information aggregation must increase

policy polarization.22

O.4 Proofs

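Proof of Lemma O1 Fix any segmentation technology $S$, symmetric policy profile $\langle -a, a\rangle$ with $a \geq 0$, $\langle S, a\rangle$-consistent signal distribution $\langle \chi, \mathbf{b}^+, \mathbf{b}^-\rangle$, and population function $q$. Let $\mathbf{q}$ denote the $|\mathcal{K}|$-column vector that compiles the populations of voters $-K, \cdots, K$. Let the default be the strictly obedient outcome induced by the joint signal distribution.

Define two matrix operations. First, for any $\mathcal{C} \subseteq \mathcal{K}$, let $\chi_{\mathcal{C}}$ be the matrix obtained by replacing every row $k \in \mathcal{C}$ of $\chi$ with a row of all ones. Second, for any matrix $A$, let $\widehat{A}$ be the matrix obtained by rounding the entries of $A$, i.e., replacing those entries above $1/2$ with one and those below $1/2$ with zero. By definition, the row vector $\widehat{\mathbf{q}^\top \chi}$ compiles candidate $R$'s default winning probabilities across the voting recommendation profiles that occur with strictly positive probabilities, and $\left(\widehat{\mathbf{q}^\top \chi}\,\mathbf{b}^+ + \widehat{\mathbf{q}^\top \chi}\,\mathbf{b}^-\right)/2$ is candidate $R$'s default winning probability in expectation. After candidate $R$ commits a unilateral deviation from $\langle -a, a\rangle$ that attracts a set $\mathcal{C} \subseteq \mathcal{K}$ of voters without affecting anything else, his winning probability vector becomes $\widehat{\mathbf{q}^\top \chi_{\mathcal{C}}}$, and his expected winning probability becomes $\left(\widehat{\mathbf{q}^\top \chi_{\mathcal{C}}}\,\mathbf{b}^+ + \widehat{\mathbf{q}^\top \chi_{\mathcal{C}}}\,\mathbf{b}^-\right)/2$. Since $\widehat{\mathbf{q}^\top \chi_{\mathcal{C}}} \geq \widehat{\mathbf{q}^\top \chi}$, the deviation strictly increases candidate $R$'s winning probability in expectation if and only if it does so under some voting recommendation profile, i.e.,
\[
\left(\widehat{\mathbf{q}^\top \chi_{\mathcal{C}}}\,\mathbf{b}^+ + \widehat{\mathbf{q}^\top \chi_{\mathcal{C}}}\,\mathbf{b}^-\right)/2 > \left(\widehat{\mathbf{q}^\top \chi}\,\mathbf{b}^+ + \widehat{\mathbf{q}^\top \chi}\,\mathbf{b}^-\right)/2
\quad \text{if and only if} \quad
\widehat{\mathbf{q}^\top \chi_{\mathcal{C}}} \neq \widehat{\mathbf{q}^\top \chi}.
\]
The last condition is equivalent to $\mathcal{C}$ being an influential coalition, and it depends on $S$, $\langle -a, a\rangle$, $\langle \chi, \mathbf{b}^+, \mathbf{b}^-\rangle$, and $q$ only through $\langle \chi, q\rangle$.

22 For any symmetric policy profile $\langle -a, a\rangle$ with $a > 0$, the deviation to $a' = 0$ weakly increases candidate $R$'s winning probability when the recommendation profile is either $LL$ or $RR$ (Lemma O2), and it strictly increases candidate $R$'s winning probability when the recommendation profile is $LR$ (obviously). In contrast, no deviation from $\langle 0, 0\rangle$ increases candidate $R$'s winning probability when the recommendation profile is $LL$ or $RR$ (Lemma O2) or $LR$ (obviously).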
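The rounding argument in the proof of Lemma O1 can be illustrated numerically. The sketch below is only an illustration under assumed inputs: the population vector `q`, the 0/1 configuration matrix `chi` (rows are voter types, columns are recommendation profiles, and an entry of one means the voter is recommended to vote for candidate R), and all function names are hypothetical and do not come from the paper.

```python
def round_half(x):
    """Round a vote share to a winning probability: above 1/2 -> 1, below 1/2 -> 0, ties stay at 1/2."""
    if x > 0.5:
        return 1.0
    if x < 0.5:
        return 0.0
    return 0.5


def winning_vector(q, chi):
    """Rounded q^T chi: candidate R's winning probability under each recommendation profile."""
    n_profiles = len(chi[0])
    return [
        round_half(sum(q[i] * chi[i][j] for i in range(len(q))))
        for j in range(n_profiles)
    ]


def attract(chi, coalition):
    """chi_C: replace every row whose index is in `coalition` with a row of all ones."""
    return [[1] * len(row) if i in coalition else list(row) for i, row in enumerate(chi)]


def is_influential(q, chi, coalition):
    """A coalition is influential iff attracting it changes the rounded winning vector."""
    return winning_vector(q, attract(chi, coalition)) != winning_vector(q, chi)


# Hypothetical example: rows are types (-1, 0, 1); columns are four recommendation profiles.
q = [0.3, 0.4, 0.3]
chi = [
    [0, 0, 0, 1],  # type -1
    [0, 0, 1, 1],  # type 0
    [0, 1, 1, 1],  # type 1
]

for rows in ({0}, {1}, {2}):
    print(sorted(rows), is_influential(q, chi, rows))
```

In this assumed example, attracting the type 1 voter alone never changes the winner, whereas attracting the type 0 voter alone flips the outcome under the second profile.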

Proof of Lemma O2 Replacing left-leaning voters (of type $k = -1$) with any type $k < 0$ voter and right-leaning voters (of type $k = 1$) with any type $k > 0$ voter in the proof of Lemma 4 gives the desired result.

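Lemma O3. Let everything be as in Theorem O1. Then for any $k \in \{0, \cdots, K\}$ and any $\mathcal{D} \subseteq \{-k, \cdots, k\}$ such that $\mathcal{D} \cap \{-k, k\} \neq \emptyset$, we must have $\xi^S(\mathcal{D}) > t(k)$ and $[t(k), \bar{a}] \cap \Xi^S(\mathcal{D}) = [t(k), \xi^S(\mathcal{D})]$.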

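Proof. Fix any $k$ and $\mathcal{D}$ as above. Recall that
\[
\Xi^S(\mathcal{D}) := \left\{ a \geq 0 : \max_{a' \in A} \phi^S(-a, a', \mathcal{D}) \leq 0 \right\},
\]
where $\phi^S(-a, a', \mathcal{D}) := \min_{k' \in \mathcal{D}} \phi^S(-a, a', k')$. Let $t(\mathcal{D})$ denote the image of $\mathcal{D}$ under the mapping $t$, and write $\overline{\mathcal{D}}$ for $[\min t(\mathcal{D}), \max t(\mathcal{D})]$. By Assumption O1 inverted V-shape, we can restrict attention to deviations to $a' \in \overline{\mathcal{D}}$, i.e.,
\[
\Xi^S(\mathcal{D}) = \left\{ a \geq 0 : \max_{a' \in \overline{\mathcal{D}}} \phi^S(-a, a', \mathcal{D}) \leq 0 \right\}.
\]

Fix the policy profile to be $\langle -t(k), t(k)\rangle$, and take any $a' \in \overline{\mathcal{D}}$. From Assumption O1 and (SOB), it follows that $a'$ doesn't attract type $k$ voters:
\begin{align*}
\phi^S(-t(k), a', k) &:= v(-t(k), a', k) + \mu^S_L(t(k), k) \\
&\leq v(-t(k), t(k), k) + \mu^S_L(t(k), k) && \text{(inverted V-shape)} \\
&< 0, && \text{(SOB)}
\end{align*}
and it doesn't attract type $-k$ voters, either:
\begin{align*}
\phi^S(-t(k), a', -k) &:= v(-t(k), a', -k) + \mu^S_L(t(k), -k) \\
&\leq v(-t(k), t(-k), -k) + \mu^S_L(t(k), -k) && \text{(inverted V-shape)} \\
&= 0 + \mu^S_L(t(k), -k) && \text{(symmetry)} \\
&< 0.
\end{align*}
Thus $\phi^S(-t(k), a', \mathcal{D}) := \min_{k' \in \mathcal{D}} \phi^S(-t(k), a', k') < 0$, and taking the maximum over $a'$ yields $\max_{a' \in \overline{\mathcal{D}}} \phi^S(-t(k), a', \mathcal{D}) < 0$. Meanwhile, Assumption O2 implies that $\phi^S(-a, a', \mathcal{D})$ is increasing in $a$ on $[t(k), \bar{a}]$ for any $a'$. Taking the maximum over $a'$ yields
\begin{align*}
\max_{a' \in \overline{\mathcal{D}}} \phi^S(-a_1, a', \mathcal{D})
&= \phi^S\Big(-a_1, \arg\max_{a' \in \overline{\mathcal{D}}} \phi^S(-a_1, a', \mathcal{D}), \mathcal{D}\Big) \\
&< \phi^S\Big(-a_2, \arg\max_{a' \in \overline{\mathcal{D}}} \phi^S(-a_1, a', \mathcal{D}), \mathcal{D}\Big) \\
&\leq \max_{a' \in \overline{\mathcal{D}}} \phi^S(-a_2, a', \mathcal{D}) \qquad \forall a_2 > a_1 \geq t(k),
\end{align*}
so $\max_{a' \in \overline{\mathcal{D}}} \phi^S(-a, a', \mathcal{D})$ is increasing in $a$ on $[t(k), \bar{a}]$. Taken together, we obtain that $\mathcal{D}$'s policy latitude exceeds $t(k)$:
\[
\xi^S(\mathcal{D}) := \max \Xi^S(\mathcal{D}) = \max\left\{ a \geq 0 : \max_{a' \in \overline{\mathcal{D}}} \phi^S(-a, a', \mathcal{D}) \leq 0 \right\} > t(k),
\]
and that all policies in $[t(k), \xi^S(\mathcal{D})]$ belong to the $\mathcal{D}$-proof set:
\[
[t(k), \bar{a}] \cap \Xi^S(\mathcal{D}) = \left\{ a \geq t(k) : \max_{a' \in \overline{\mathcal{D}}} \phi^S(-a, a', \mathcal{D}) \leq 0 \right\} = [t(k), \xi^S(\mathcal{D})].
\]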

Lemma O4. Under Assumptions 1, 2(i), O1, and O3, the following must hold for any $a \geq 0$ and $a' \in [-a, a]$.

(i) $\phi^p(-a, a', k)$ is decreasing in $k$ on $\{k \in \mathcal{K} : k \leq 0\}$ and is increasing in $k$ on $\{k \in \mathcal{K} : k \geq 0\}$.

(ii) $\phi^p(-a, a', k) \leq \phi^p(-a, a', -k)$ for any $k > 0$.

Proof. Fix any $a$ and $a'$ as above. Under Assumption O3, i.e., $h(\mu) = \mu^2$, solving the personalized case yields Assumption 2(i) $\Longleftrightarrow$ $2\lambda > 1$ and $4\lambda v(-a, a, K) < 1$, and
\[
\mu^p_L(a, k) =
\begin{cases}
-2v(-a, a, k) - 1/(2\lambda) & \text{if } k \leq 0, \\
-1/(2\lambda) & \text{if } k > 0.
\end{cases}
\tag{15}
\]
Also recall that $\phi^p(-a, a', k) := v(-a, a', k) + \mu^p_L(a, k)$.

Part (i): If $k \leq 0$, then
\begin{align*}
\phi^p(-a, a', k) &= v(-a, a', k) - 2v(-a, a, k) - \frac{1}{2\lambda} \\
&= u(a', k) - u(-a, k) - 2\left[u(a, k) - u(-a, k)\right] - \frac{1}{2\lambda} \\
&= u(a', k) + u(-a, k) - 2u(a, k) - \frac{1}{2\lambda} \\
&= -\left[v(a', a, k) + v(-a, a, k)\right] - \frac{1}{2\lambda},
\end{align*}
where the last line is decreasing in $k$ by Assumption O1 increasing differences. If $k > 0$, then $\phi^p(-a, a', k) = v(-a, a', k) - 1/(2\lambda)$, which is increasing in $k$, again by Assumption O1 increasing differences.

Part (ii): Under Assumption O1, the following must hold for any $k > 0$:
\begin{align*}
\phi^p(-a, a', k) - \phi^p(-a, a', -k)
&= v(-a, a', k) - \frac{1}{2\lambda} - \left[v(-a, a', -k) - 2v(-a, a, -k) - \frac{1}{2\lambda}\right] \\
&= v(-a, a', k) - v(a, -a', k) - 2v(-a, a, k) && \text{(symmetry)} \\
&= \left[u(-a, k) - u(-a', k)\right] - \left[u(a, k) - u(a', k)\right] \\
&= v(a', a, -k) - v(a', a, k) && \text{(symmetry)} \\
&\leq 0. && \text{(increasing differences)}
\end{align*}
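As a sanity check on Lemma O4, the following sketch evaluates $\phi^p$ from equation (15) on a grid and tests both monotonicity claims. It is a minimal illustration under assumptions that are not taken from the paper: the distance utility $u(a,k) = -|t(k) - a|$, hypothetical bliss points $t(k) = 0.05k$, five voter types, and $\lambda = 1$ (so that $2\lambda > 1$ and $4\lambda v(-a,a,K) < 1$ hold).

```python
def t(k, step=0.05):
    """Hypothetical bliss point of type k, symmetric around zero."""
    return step * k


def u(a, k):
    """Assumed distance utility u(a, k) = -|t(k) - a|."""
    return -abs(t(k) - a)


def v(a_left, a_right, k):
    """Utility difference between candidate R's policy and candidate L's policy."""
    return u(a_right, k) - u(a_left, k)


def mu_L(a, k, lam):
    """Posterior mean mu^p_L(a, k) as in equation (15), for h(mu) = mu^2."""
    if k <= 0:
        return -2.0 * v(-a, a, k) - 1.0 / (2.0 * lam)
    return -1.0 / (2.0 * lam)


def phi(a, a_dev, k, lam):
    """phi^p(-a, a', k) := v(-a, a', k) + mu^p_L(a, k)."""
    return v(-a, a_dev, k) + mu_L(a, k, lam)


def check_lemma_o4(K=2, lam=1.0, grid=21, tol=1e-12):
    """Return True if parts (i) and (ii) hold at every grid point (a, a')."""
    for i in range(grid):
        a = i / (grid - 1)
        for j in range(grid):
            a_dev = -a + 2.0 * a * j / (grid - 1)  # a' ranges over [-a, a]
            vals = {k: phi(a, a_dev, k, lam) for k in range(-K, K + 1)}
            for k in range(-K, 0):   # part (i): decreasing in k on k <= 0
                if vals[k] < vals[k + 1] - tol:
                    return False
            for k in range(0, K):    # part (i): increasing in k on k >= 0
                if vals[k] > vals[k + 1] + tol:
                    return False
            for k in range(1, K + 1):  # part (ii): phi(k) <= phi(-k) for k > 0
                if vals[k] > vals[-k] + tol:
                    return False
    return True


print(check_lemma_o4())
```

The monotonicity is weak in this example because the distance utility satisfies increasing differences only weakly.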

Lemma O5. Under Assumptions 1, 2, and O1, Assumption O2 holds if $S = b$ or if $S = p$ and either $u(a, k) = -|t(k) - a|$ or $h(\mu) = \mu^2$.

Proof. We wish to verify that $\phi^S(-a, a', k) := v(-a, a', k) + \mu^S_L(a, k)$ is increasing in $a$ on $[|t(k)|, \bar{a}]$ for any $k \in \mathcal{K}$ and $a' \in A$. Since $v(-a, a', k)$ is strictly increasing in $a$ on $[|t(k)|, \bar{a}]$ by Assumption O1 inverted V-shape, it suffices to show that $\mu^S_L(a, k)$ is nondecreasing in $a$ on $[|t(k)|, \bar{a}]$.

$S = b$. Recall that $\mu^b_L(a)$ is the unique solution to Problem (8), i.e.,
\[
\max_{\mu \in [-1, 0]} h(\mu) \quad \text{s.t.} \quad \frac{1}{2}\left[v(-a, a, -K) - \mu\right]^+ \geq \lambda h(\mu),
\]
where $h$ is strictly convex and strictly decreasing on $[-1, 0]$. Also note that $v(-a, a, -K)$ is decreasing in $a$, because the following holds for any $a' > a \geq 0$ under Assumption O1:
\begin{align*}
v(-a', a', -K) - v(-a, a, -K)
&= u(a', -K) - u(-a', -K) - u(a, -K) + u(-a, -K) \\
&= u(a', -K) - u(a, -K) - \left[u(a', K) - u(a, K)\right] && \text{(symmetry)} \\
&= v(a, a', -K) - v(a, a', K) \\
&\leq 0. && \text{(increasing differences)}
\end{align*}
Combining these observations gives the desired result.

$S = p$ and $u(a, k) = -|t(k) - a|$. In this case, $v(-a, a, k)$ is invariant with $a$ (indeed, $\equiv 2t(k)$) on $[|t(k)|, \bar{a}]$, so $\mu^p_L(a, k) \equiv \mu^p_L(|t(k)|, k)$ on $[|t(k)|, \bar{a}]$.

$S = p$ and $h(\mu) = \mu^2$. In this case, a careful inspection of the expression for $\mu^p_L(a, k)$ in (15) gives the desired result.
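The invariance claim for the distance utility case follows from $-|t(k) - a| + |t(k) + a| = -(a - t(k)) + (a + t(k)) = 2t(k)$ whenever $a \geq |t(k)|$. A short numerical sketch makes this concrete; the bliss-point values and the grid below are hypothetical choices made only for illustration.

```python
def v_symmetric(a, tk):
    """v(-a, a, k) under the distance utility u(a, k) = -|t(k) - a|, where tk stands for t(k)."""
    return -abs(tk - a) + abs(tk + a)


def is_invariant(tk, a_max=1.0, steps=50, tol=1e-12):
    """Check v(-a, a, k) == 2 t(k) for every grid point a in [|t(k)|, a_max]."""
    lo = abs(tk)
    for i in range(steps + 1):
        a = lo + (a_max - lo) * i / steps
        if abs(v_symmetric(a, tk) - 2.0 * tk) > tol:
            return False
    return True


# Hypothetical bliss points for a left-leaning, a centrist, and a right-leaning type.
for tk in (-0.3, 0.0, 0.2):
    print(tk, is_invariant(tk))
```

Below $|t(k)|$ the difference is no longer constant: for example, `v_symmetric(0.1, 0.2)` equals $2a = 0.2$ rather than $2t(k) = 0.4$, which is why the lemma restricts attention to $a \geq |t(k)|$.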

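Proof of Theorem O1 Fix any segmentation technology $S \in \{b, p\}$, $S$-consistent configuration $\chi$, and population function $q$. Let $\mathcal{C}$ denote a typical influential coalition formed under $\langle \chi, q\rangle$. For each $k = 0, \cdots, K - 1$, define
\[
A(k) =
\begin{cases}
[t(k), t(k+1)) \cap \bigcap_{\mathcal{C} \subseteq \{-k, \cdots, k\}} \Xi^S(\mathcal{C}) & \text{if } \exists\, \mathcal{C} \subseteq \{-k, \cdots, k\}, \\
[t(k), t(k+1)) & \text{else.}
\end{cases}
\]
For $k = K$, define
\[
A(K) = [t(K), \bar{a}] \cap \bigcap_{\mathcal{C}} \Xi^S(\mathcal{C}).
\]
Lemma O2 shows that
\[
\mathcal{E}^{S, \chi, q} = \bigcup_{k=0}^{K} A(k).
\]
Below we prove by induction that $\bigcup_{k=0}^{K} A(k) = \left[0, \min_{\mathcal{C}} \xi^S(\mathcal{C})\right]$.

Step 0. Letting $k = 0$ in Lemma O3 yields $\Xi^S(\{0\}) = \left[0, \xi^S(\{0\})\right]$, so
\[
A(0) =
\begin{cases}
\left[0, \xi^S(\{0\})\right] & \text{if } \{0\} \text{ is influential and } \xi^S(\{0\}) < t(1), \\
[0, t(1)) & \text{else.}
\end{cases}
\]
For the first case, note that $A(k) \subseteq [t(k), t(k+1)] \cap \Xi^S(\{0\}) = \emptyset$ for any $k \geq 1$, and that $\min_{\mathcal{C}} \xi^S(\mathcal{C}) = \xi^S(\{0\})$ because $\xi^S(\mathcal{C}) > t(1)$ for any $\mathcal{C} \neq \{0\}$ by Lemma O3. Taken together, we obtain $\bigcup_{k=0}^{K} A(k) = \left[0, \min_{\mathcal{C}} \xi^S(\mathcal{C})\right]$ and terminate the procedure. In the second case, we proceed to the next step.

Step $m$. The output of Step $m - 1$ is $\bigcup_{k=0}^{m-1} A(k) = [0, t(m))$. Then from Lemma O3, which shows that $[t(m), \bar{a}] \cap \Xi^S(\mathcal{C}) = \left[t(m), \xi^S(\mathcal{C})\right]$ for any $\mathcal{C} \subseteq \{-m, \cdots, m\}$ such that $\mathcal{C} \cap \{-m, m\} \neq \emptyset$, it follows that
\[
\bigcup_{k=0}^{m} A(k) =
\begin{cases}
\left[0, \min_{\mathcal{C} \subseteq \{-m, \cdots, m\}} \xi^S(\mathcal{C})\right] & \text{if } \min_{\mathcal{C} \subseteq \{-m, \cdots, m\}} \xi^S(\mathcal{C}) < t(m+1), \\
[0, t(m+1)) & \text{else.}
\end{cases}
\]
For the first case, note that $A(k) \subseteq [t(k), t(k+1)] \cap \bigcap_{\mathcal{C} \subseteq \{-m, \cdots, m\}} \Xi^S(\mathcal{C}) = \emptyset$ for any $k \geq m + 1$, and that $\min_{\mathcal{C}} \xi^S(\mathcal{C}) = \min_{\mathcal{C} \subseteq \{-m, \cdots, m\}} \xi^S(\mathcal{C})$ because $\xi^S(\mathcal{C}') > t(m+1)$ for any $\mathcal{C}' \nsubseteq \{-m, \cdots, m\}$ by Lemma O3. Taken together, we obtain $\bigcup_{k=0}^{K} A(k) = \left[0, \min_{\mathcal{C}} \xi^S(\mathcal{C})\right]$ and terminate the procedure. In the second case, we proceed to the next step.

The above procedure terminates in at most $K + 1$ steps, and the output is always $\bigcup_{k=0}^{K} A(k) = \left[0, \min_{\mathcal{C}} \xi^S(\mathcal{C})\right]$.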

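Proof of Proposition O3 We wish to demonstrate that
\[
\min_{\mathcal{C}\text{s formed under } \langle \chi, q\rangle} \xi^p(\mathcal{C}) \geq \min_{\mathcal{C}\text{s formed under } \langle \chi, q'\rangle} \xi^p(\mathcal{C})
\]
holds for any $p$-consistent $\chi$ and any two population functions $q$ and $q'$ such that $q$ SOSD $q'$. The proof below exploits the following consequences of Lemma O4: for any $a \geq 0$ and $a' \in [-a, a]$, (i) $\phi^p(-a, a', -K) = \max_{k \in \mathcal{K}} \phi^p(-a, a', k)$, and (ii) $\phi^p(-a, a', k)$ is decreasing in $k$ on $\{k : k \leq 0\}$ and is increasing in $k$ on $\{k : k \geq 0\}$.

Step 1. Show that $\xi^p(\mathcal{D}) > t(K)$ for any $\mathcal{D} \subseteq \mathcal{K}$. Fix any $a' \in [t(-K), t(K)]$ and any $\mathcal{D} \subseteq \mathcal{K}$, and notice two things. First,
\begin{align*}
\phi^p(-t(K), a', \mathcal{D}) &:= \min_{k \in \mathcal{D}} \phi^p(-t(K), a', k) \\
&\leq \max_{k \in \mathcal{D}} \phi^p(-t(K), a', k) \\
&\leq \phi^p(-t(K), a', -K) && \text{(Lemma O4)} \\
&\leq \phi^p(-t(K), t(-K), -K) && \text{(Assumption O1 inverted V-shape)} \\
&:= v(-t(K), t(-K), -K) + \mu^p_L(t(K), -K) \\
&= 0 + \mu^p_L(t(K), -K) && \text{(Assumption O1 symmetry)} \\
&< 0.
\end{align*}
Second, since $\phi^p(-a, a', k)$ is increasing in $a$ on $[t(K), \bar{a}]$ for any $k \in \mathcal{D}$ by Lemma O5, $\phi^p(-a, a', \mathcal{D}) := \min_{k \in \mathcal{D}} \phi^p(-a, a', k)$ is increasing in $a$ on $[t(K), \bar{a}]$, too. Taken together, we obtain
\begin{align*}
\xi^p(\mathcal{D}) &:= \max\left\{ a \geq 0 : \max_{a' \in [t(-K), t(K)]} \phi^p(-a, a', \mathcal{D}) \leq 0 \right\} \\
&= \max\left\{ a \geq t(K) : \max_{a' \in [t(-K), t(K)]} \phi^p(-a, a', \mathcal{D}) \leq 0 \right\}.
\end{align*}

Step 2. There are three kinds of influential coalitions: (a) $\max \mathcal{C} \leq 0$, (b) $\min \mathcal{C} \geq 0$, and (c) $\min \mathcal{C} < 0 < \max \mathcal{C}$. Consider case (a), and notice two things. First, the following are equivalent for any $a \geq t(K)$ and any $a' \in [-a, a]$ by Lemma O4: (i) $\phi^p(-a, a', \mathcal{C}) \leq 0$, (ii) $\phi^p(-a, a', \max \mathcal{C}) \leq 0$, and (iii) $\phi^p(-a, a', \{k : k \leq \max \mathcal{C}\}) \leq 0$. Second, since $\mathcal{C}$ is influential and $\mathcal{C} \subseteq \{k : k \leq \max \mathcal{C}\}$, $\{k : k \leq \max \mathcal{C}\}$ is influential, too. Combining these observations yields
\[
\min_{\substack{\mathcal{C}\text{s formed under } \langle \chi, q\rangle \\ \text{s.t. } \max \mathcal{C} \leq 0}} \xi^p(\mathcal{C})
= \min_{\substack{\mathcal{C}\text{s formed under } \langle \chi, q\rangle \\ \text{s.t. } \mathcal{C} = \{k : k \leq \alpha\},\ \alpha \leq 0}} \xi^p(\mathcal{C}).
\tag{16}
\]
A close inspection of (16) reveals two things. First,
\[
\xi^p(\{k : k \leq \alpha\}) = \max\left\{ a \geq t(K) : \max_{a' \in [-t(K), t(K)]} \phi^p(-a, a', \{k : k \leq \alpha\}) \leq 0 \right\}
\]
is increasing in $\alpha$ on $\{\alpha : \alpha \leq 0\}$ by Lemma O4. Second, every set $\{k : k \leq \alpha\}$ with $\alpha < 0$ is more likely to be influential under $q'$ than under $q$ because $q$ SOSD $q'$. Thus
\[
\min_{\substack{\mathcal{C}\text{s formed under } \langle \chi, q\rangle \\ \text{s.t. } \mathcal{C} = \{k : k \leq \alpha\},\ \alpha \leq 0}} \xi^p(\mathcal{C})
\geq \min_{\substack{\mathcal{C}\text{s formed under } \langle \chi, q'\rangle \\ \text{s.t. } \mathcal{C} = \{k : k \leq \alpha\},\ \alpha \leq 0}} \xi^p(\mathcal{C}),
\]
which proves the desired result for case (a). The proofs for cases (b) and (c) are similar and hence are omitted for brevity.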

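Proof of Proposition O4 Recall that for any given $\mathbf{a} = \langle -a, a\rangle$ with $a \in [0, \bar{a}]$, the monopolistic personalized signal for any type $k$ voters is the competitive signal for type $k$ voters when the attention cost parameter equals $\lambda - 1/\gamma$ for some $\gamma > 0$. Then from Proposition 2, it follows that $\mu^p_L(a, k) < \mu^c_L(a, k)$ and, hence, the following must hold for any $a'$ and $\mathcal{D} \subseteq \mathcal{K}$:
\begin{align*}
\phi^c(-a, a', \mathcal{D}) := \min_{k \in \mathcal{D}} \phi^c(-a, a', k) &:= \min_{k \in \mathcal{D}} \left[ v(-a, a', k) + \mu^c_L(a, k) \right] \\
&> \min_{k \in \mathcal{D}} \left[ v(-a, a', k) + \mu^p_L(a, k) \right] := \phi^p(-a, a', \mathcal{D}).
\end{align*}
Substituting this result into the proof of Lemma O3 yields $\xi^c(\mathcal{D}) < \xi^p(\mathcal{D})$, where $\xi^c(\mathcal{D})$ denotes $\mathcal{D}$'s policy latitude in the competitive case. Thus for any $c$-consistent $\chi$ and $p$-consistent $\chi'$ such that $\chi \succeq \chi'$,
\begin{align*}
\mathcal{E}^{c, \chi, \rho} &= \left[0, \min_{\mathcal{C}\text{s formed under } \langle \chi, \rho\rangle} \xi^c(\mathcal{C})\right] && \text{(Theorem O1; $\chi$ is $c$-consistent)} \\
&\subseteq \left[0, \min_{\mathcal{C}\text{s formed under } \langle \chi', \rho\rangle} \xi^c(\mathcal{C})\right] && \text{(Proposition O1; $\chi \succeq \chi'$)} \\
&\subsetneq \left[0, \min_{\mathcal{C}\text{s formed under } \langle \chi', \rho\rangle} \xi^p(\mathcal{C})\right] && \text{($\xi^c(\mathcal{C}) < \xi^p(\mathcal{C})$)} \\
&= \mathcal{E}^{p, \chi', \rho}, && \text{(Theorem O1; $\chi'$ is $p$-consistent)}
\end{align*}
which completes the proof.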

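Proof of Proposition O5 Fix any $\mathbf{a} = \langle -a, a\rangle$ with $a > 0$. In what follows, we will strengthen Assumption O1 increasing differences to strict increasing differences, i.e., $v(\mathbf{a}, k)$ is strictly increasing in $k$. Doing so is w.l.o.g., because if $v(\mathbf{a}, k) = v(\mathbf{a}, k+1)$ for some $k$, then we can treat type $k$ and $k + 1$ voters as a single entity.

Part (i): Any optimal personalized signal that is nondegenerate and makes its consumers' participation constraint binding (let $\gamma(k) > 0$ denote the corresponding Lagrange multiplier) solves
\[
\max_{Z,\ \Pi: \mathbb{R} \to \Delta(Z)} V(\Pi; \mathbf{a}, k) - \left(\lambda - 1/\gamma(k)\right) I(\Pi),
\tag{17}
\]
where $\lambda(k) := \lambda - 1/\gamma(k) > 0$ must hold in order to satisfy $\gamma(k) > 0$. By Matejka and McKay (2015), any nondegenerate solution to (17) must be binary and, hence, satisfy (SOB). Take any such solution, and let $\mathcal{L}(k)$ denote the likelihood that voter $k$ votes for candidate $R$ rather than $L$. Below we demonstrate that $\mathcal{L}(k) < 1$ if $k < 0$, $\mathcal{L}(k) = 1$ if $k = 0$, and $\mathcal{L}(k) > 1$ if $k > 0$, which together with Bayes' plausibility implies the skewness properties stated in Theorem 1(ii).

Write $\tilde{v}$ for $v(\mathbf{a}, k)/\lambda(k)$, $\tilde{\omega}$ for $\omega/\lambda(k)$, $G$ for the c.d.f. of $\tilde{\omega}$, and $\mathcal{L}$ for $\mathcal{L}(k)$. By Matejka and McKay (2015), the probability that voter $k$ votes for candidate $R$ in state $\tilde{\omega}$ equals $\frac{\mathcal{L}\exp(\tilde{v} + \tilde{\omega})}{\mathcal{L}\exp(\tilde{v} + \tilde{\omega}) + 1}$, and the average probability that voter $k$ votes for candidate $R$ equals $\frac{\mathcal{L}}{\mathcal{L} + 1}$. Thus
\[
\int_{\tilde{\omega} \in \mathbb{R}} \frac{\mathcal{L}\exp(\tilde{v} + \tilde{\omega})}{\mathcal{L}\exp(\tilde{v} + \tilde{\omega}) + 1}\, dG(\tilde{\omega}) = \frac{\mathcal{L}}{\mathcal{L} + 1},
\]
whose left-hand side and right-hand side are hereafter denoted by $\mathrm{LHS}(\mathcal{L}, \tilde{v})$ and $\mathrm{RHS}(\mathcal{L})$, respectively. When $\tilde{v} = 0$ (equivalently, $k = 0$), simplifying $\mathrm{LHS}(\mathcal{L}, 0)$ yields
\begin{align*}
\mathrm{LHS}(\mathcal{L}, 0) &= \int_0^\infty \left[ \frac{\mathcal{L}\exp(\tilde{\omega})}{\mathcal{L}\exp(\tilde{\omega}) + 1} + \frac{\mathcal{L}\exp(-\tilde{\omega})}{\mathcal{L}\exp(-\tilde{\omega}) + 1} \right] dG(\tilde{\omega}) && \text{($G$ is symmetric)} \\
&= \int_0^\infty \frac{\mathcal{L}\left(2\mathcal{L} + X(\tilde{\omega})\right)}{\mathcal{L}^2 + \mathcal{L}X(\tilde{\omega}) + 1}\, dG(\tilde{\omega}),
\end{align*}
where $X(\tilde{\omega}) := \exp(\tilde{\omega}) + \exp(-\tilde{\omega})$ in the second equality satisfies (1) $X(\tilde{\omega}) \geq 2$ and (2) the inequality is strict if and only if $\tilde{\omega} \neq 0$. Subtracting $\mathrm{RHS}(\mathcal{L})$ from the last expression yields
\[
\mathrm{LHS}(\mathcal{L}, 0) - \mathrm{RHS}(\mathcal{L}) = \int_0^\infty \frac{\left(X(\tilde{\omega}) - 2\right)(1 - \mathcal{L})\mathcal{L}}{\left(\mathcal{L}^2 + \mathcal{L}X(\tilde{\omega}) + 1\right)(\mathcal{L} + 1)}\, dG(\tilde{\omega}),
\]
which, upon a close inspection, reveals that $\mathrm{RHS}(\mathcal{L})$ single-crosses $\mathrm{LHS}(\mathcal{L}, 0)$ from below at $\mathcal{L} = 1$. To close the proof, note that $\mathrm{LHS}(\mathcal{L}, \tilde{v})$ is increasing in $\tilde{v}$ for any $\mathcal{L}$. Thus, any root of $\mathrm{LHS}(\mathcal{L}, \tilde{v}) = \mathrm{RHS}(\mathcal{L})$ must be larger than one if $\tilde{v} > 0$ (equivalently, $k > 0$), and it must be smaller than one if $\tilde{v} < 0$ (equivalently, $k < 0$).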
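The single-crossing argument in Part (i) can be checked numerically by solving the fixed-point condition for the likelihood ratio directly. The sketch below is a minimal illustration under assumptions not taken from the paper: a three-point symmetric distribution for the rescaled state, rescaled utility differences of $\pm 0.1$, and bisection on the logarithm of the likelihood ratio as the root finder.

```python
import math

# Hypothetical symmetric distribution of the rescaled state: (value, probability).
STATES = [(-1.0, 0.3), (0.0, 0.4), (1.0, 0.3)]


def lhs(ell, v):
    """Average probability of voting R implied by the logit rule with likelihood ratio ell."""
    return sum(
        p * ell * math.exp(v + w) / (ell * math.exp(v + w) + 1.0) for w, p in STATES
    )


def rhs(ell):
    """Average probability of voting R written in terms of the likelihood ratio."""
    return ell / (ell + 1.0)


def solve_ell(v, lo=-10.0, hi=10.0, iters=200):
    """Bisection on x = log(ell) for an interior root of lhs(ell, v) = rhs(ell)."""

    def gap(x):
        return lhs(math.exp(x), v) - rhs(math.exp(x))

    # An interior root requires a sign change; otherwise the signal is degenerate.
    assert gap(lo) > 0 > gap(hi), "no interior root bracketed"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


for v in (-0.1, 0.0, 0.1):
    print(v, solve_ell(v))
```

For larger values of $|\tilde{v}|$ the bracket check fails under this assumed distribution, which corresponds to the case where the voter's optimal signal is degenerate and the proposition does not apply.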

Part (ii): Take any optimal broadcast signal that includes all voters in information consumption, and let $\mathcal{B} \neq \emptyset$ denote the set of the voters whose participation constraints are binding. Since $v(\mathbf{a}, k)$ is strictly increasing in $k$, we must have $\mathcal{B} \subseteq \{-K, K\}$ and, indeed, $\mathcal{B} = \{-K, K\}$, because if $\mathcal{B} \subsetneq \{-K, K\}$, then the signal we begin with is the optimal personalized signal for the voter in $\mathcal{B}$ and so violates the participation constraint of the voter in $\{-K, K\} \setminus \mathcal{B}$.

For each $k \in \mathcal{B}$, let $\gamma(k) > 0$ denote the Lagrange multiplier associated with voter $k$'s participation constraint. As in the proof of Lemma 1, we can reformulate the infomediary's problem as
\[
\max_{Z,\ \Pi: \mathbb{R} \to \Delta(Z)} \sum_{k \in \mathcal{B}} \frac{\gamma(k)}{\sum_{k' \in \mathcal{B}} \gamma(k')} V(\Pi; \mathbf{a}, k) - \left(\lambda - \frac{1}{\sum_{k \in \mathcal{B}} \gamma(k)}\right) I(\Pi),
\tag{18}
\]
where $\lambda - \frac{1}{\sum_{k \in \mathcal{B}} \gamma(k)} > 0$ must hold in order to satisfy $\gamma(k) > 0$ $\forall k \in \mathcal{B}$. A careful inspection of (18) reduces it to the same kind of optimal information acquisition problem studied by Matejka and McKay (2015), whereby a representative voter makes three decisions $LL$, $LR$, and $RR$ on behalf of the voters in $\mathcal{B}$ (the first and second letters stand for the voting decisions of type $-K$ and $K$ voters, respectively) and pays an information acquisition cost that is proportional to the mutual information of the valence state and the voting decision profile. By Matejka and McKay (2015), any solution $\Pi: \Omega \to \Delta(Z)$ to (18) must satisfy $Z \subseteq \{LL, LR, RR\}$ and make the voters in $\mathcal{B}$ obey the voting recommendations given to them.

$|Z| = 2$. In this case, $Z$ must equal $\{LL, RR\}$, and $\Pi$ must induce strict obedience from its consumers, i.e., $v(\mathbf{a}, k) + \mu_{LL} < 0 < v(\mathbf{a}, k) + \mu_{RR}$ for all $k \in \mathcal{B}$. To show that $\Pi$ is symmetric, i.e., $\Pi(LL \mid \omega) = \Pi(RR \mid -\omega)$ a.e., suppose the contrary is true, and consider a new signal structure $\Pi': \Omega \to \Delta(Z)$ where $\Pi'(LL \mid \omega) = \Pi(RR \mid -\omega)$ for all $\omega$. For each $z \in Z$, write $\pi'_z$ for $\int \Pi'(z \mid \omega)\, dG(\omega)$ and $\mu'_z$ for $\int \omega \Pi'(z \mid \omega)\, dG(\omega) / \pi'_z$. By construction, we have $\pi'_{LL} = \pi_{RR}$, $\pi'_{RR} = \pi_{LL}$, $\mu'_{LL} = -\mu_{RR}$, $\mu'_{RR} = -\mu_{LL}$, and $I(\Pi) = I(\Pi')$. Thus
\begin{align*}
V(\Pi'; \mathbf{a}, -K) &= \pi'_{RR}\left[v(\mathbf{a}, -K) + \mu'_{RR}\right] \\
&= \pi_{LL}\left[-v(\mathbf{a}, K) - \mu_{LL}\right] \\
&= V(\Pi; \mathbf{a}, K) \\
&= \lambda I(\Pi) && (K \in \mathcal{B}) \\
&= V(\Pi; \mathbf{a}, -K), && (-K \in \mathcal{B})
\end{align*}
and $V(\Pi'; \mathbf{a}, K) = V(\Pi; \mathbf{a}, K)$ can be shown analogously. Compared to $\Pi$ (or $\Pi'$), the signal structure $\frac{1}{2}\Pi + \frac{1}{2}\Pi'$ generates the same consumption utility to the representative voter in (18) but incurs a strictly lower attention cost because $I(\Pi)$ is strictly convex in its argument (see Theorem 2.7.4 of Cover and Thomas (2006)). But then $\Pi$ isn't a solution to (18), a contradiction.
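The mixing step can be illustrated on a small discrete example. The sketch below covers only the attention-cost side of the argument under assumptions not taken from the paper: a three-point symmetric prior over states and an arbitrary asymmetric binary signal. It builds the mirrored signal, confirms that the two signals carry the same mutual information and have mirrored posterior means, and checks that their equal-weight mixture carries strictly less.

```python
import math

# Hypothetical symmetric prior over states and an asymmetric binary signal.
PRIOR = {-1: 0.3, 0: 0.4, 1: 0.3}
PI = {-1: 0.1, 0: 0.6, 1: 0.8}  # Pi(RR | omega); Pi(LL | omega) = 1 - Pi(RR | omega)


def mirror(pi):
    """Pi'(LL | w) = Pi(RR | -w), i.e. Pi'(RR | w) = 1 - Pi(RR | -w)."""
    return {w: 1.0 - pi[-w] for w in pi}


def mix(pi1, pi2):
    """Equal-weight mixture of two signal structures."""
    return {w: 0.5 * pi1[w] + 0.5 * pi2[w] for w in pi1}


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def mutual_information(pi):
    """I(state; signal) = H(signal) - E[H(signal | state)] for a binary signal."""
    p_rr = sum(PRIOR[w] * pi[w] for w in PRIOR)
    return binary_entropy(p_rr) - sum(PRIOR[w] * binary_entropy(pi[w]) for w in PRIOR)


def posterior_mean(pi, signal):
    """Posterior mean of the state after signal realization 'RR' or 'LL'."""
    def lik(w):
        return pi[w] if signal == "RR" else 1.0 - pi[w]

    mass = sum(PRIOR[w] * lik(w) for w in PRIOR)
    return sum(w * PRIOR[w] * lik(w) for w in PRIOR) / mass


PI_MIRROR = mirror(PI)
PI_MIX = mix(PI, PI_MIRROR)
print(mutual_information(PI), mutual_information(PI_MIRROR), mutual_information(PI_MIX))
```

The mixture is itself symmetric in the sense used in the proof, which is what rules out an asymmetric optimum.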

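$Z = \{LL, LR, RR\}$. In this case, the voters in $\mathcal{B}$ must strictly prefer to obey the voting recommendations prescribed by $LL$ and $RR$, and they must weakly prefer to obey the voting recommendations prescribed by $LR$, i.e., $v(\mathbf{a}, k) + \mu_{LL} < 0 < v(\mathbf{a}, k) + \mu_{RR}$ for all $k \in \mathcal{B}$, and $v(\mathbf{a}, -K) + \mu_{LR} \leq 0 \leq v(\mathbf{a}, K) + \mu_{LR}$. To show that $\Pi$ is symmetric, i.e., $\Pi(LL \mid \omega) = \Pi(RR \mid -\omega)$, $\Pi(LR \mid \omega) = \Pi(LR \mid -\omega)$, and $\Pi(RR \mid \omega) = \Pi(LL \mid -\omega)$ a.e., suppose the contrary is true, and consider a new signal structure $\Pi': \Omega \to \Delta(Z)$ where $\Pi'(LL \mid \omega) = \Pi(RR \mid -\omega)$, $\Pi'(LR \mid \omega) = \Pi(LR \mid -\omega)$, and $\Pi'(RR \mid \omega) = \Pi(LL \mid -\omega)$. By construction, we have $\pi'_{LL} = \pi_{RR}$, $\pi'_{LR} = \pi_{LR}$, $\pi'_{RR} = \pi_{LL}$, $\mu'_{LL} = -\mu_{RR}$, $\mu'_{LR} = -\mu_{LR}$, and $\mu'_{RR} = -\mu_{LL}$. Combining this observation with obedience yields
\begin{align*}
v(\mathbf{a}, -K) + \mu'_{LL} &= -v(\mathbf{a}, K) - \mu_{RR} < 0, \\
v(\mathbf{a}, -K) + \mu'_{LR} &= -v(\mathbf{a}, K) - \mu_{LR} \leq 0, \\
v(\mathbf{a}, -K) + \mu'_{RR} &= -v(\mathbf{a}, K) - \mu_{LL} > 0, \\
v(\mathbf{a}, K) + \mu'_{LL} &= -v(\mathbf{a}, -K) - \mu_{RR} < 0, \\
v(\mathbf{a}, K) + \mu'_{LR} &= -v(\mathbf{a}, -K) - \mu_{LR} \geq 0, \\
\text{and } v(\mathbf{a}, K) + \mu'_{RR} &= -v(\mathbf{a}, -K) - \mu_{LL} > 0,
\end{align*}
so
\begin{align*}
V(\Pi'; \mathbf{a}, -K) &= \pi'_{RR}\left[v(\mathbf{a}, -K) + \mu'_{RR}\right] \\
&= \pi_{LL}\left[-v(\mathbf{a}, K) - \mu_{LL}\right] = V(\Pi; \mathbf{a}, K) = \lambda I(\Pi) = V(\Pi; \mathbf{a}, -K),
\end{align*}
and $V(\Pi'; \mathbf{a}, K) = V(\Pi; \mathbf{a}, K)$ can be shown analogously. The remainder of the proof parallels that for the case $|Z| = 2$ and is therefore omitted for brevity.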

References

Cover, T. M., and J. A. Thomas. (2006): Elements of Information Theory,

Hoboken, NJ: John Wiley & Sons, 2nd ed.

Matejka, F., and A. McKay. (2015): “Rational inattention to discrete choices:

A new foundation for the multinomial logit model,” American Economic Review,

105(1), 272-298.
