Stackelberg Plans Thomas J. Sargent and John Stachurski February 7, 2021 1 Contents โข Overview 2 โข Duopoly 3 โข The Stackelberg Problem 4 โข Stackelberg Plan 5 โข Recursive Representation of Stackelberg Plan 6 โข Computing the Stackelberg Plan 7 โข Exhibiting Time Inconsistency of Stackelberg Plan 8 โข Recursive Formulation of the Followerโs Problem 9 โข Markov Perfect Equilibrium 10 โข MPE vs. Stackelberg 11 In addition to whatโs in Anaconda, this lecture will need the following libraries: In [1]: !pip install --upgrade quantecon 2 Overview This notebook formulates and computes a plan that a Stackelberg leader uses to manip- ulate forward-looking decisions of a Stackelberg follower that depend on continuation se- quences of decisions made once and for all by the Stackelberg leader at time 0. To facilitate computation and interpretation, we formulate things in a context that allows us to apply dynamic programming for linear-quadratic models. From the beginning, we carry along a linear-quadratic model of duopoly in which ๏ฌrms face adjustment costs that make them want to forecast actions of other ๏ฌrms that in๏ฌuence future prices. Letโs start with some standard imports: In [2]: import numpy as np import numpy.linalg as la import quantecon as qe from quantecon import LQ import matplotlib.pyplot as plt %matplotlib inline 1
26
Embed
Stackelberg Plansย ยท In our duopoly example ๐ก= 1๐ก, the time decision of the Stackelberg follower. Let ๐กbe a vector of decisions chosen by the Stackelberg leader at . The
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Stackelberg Plans
Thomas J. Sargent and John Stachurski
February 7, 2021
1 Contents
โข Overview 2โข Duopoly 3โข The Stackelberg Problem 4โข Stackelberg Plan 5โข Recursive Representation of Stackelberg Plan 6โข Computing the Stackelberg Plan 7โข Exhibiting Time Inconsistency of Stackelberg Plan 8โข Recursive Formulation of the Followerโs Problem 9โข Markov Perfect Equilibrium 10โข MPE vs. Stackelberg 11
In addition to whatโs in Anaconda, this lecture will need the following libraries:
In [1]: !pip install --upgrade quantecon
2 Overview
This notebook formulates and computes a plan that a Stackelberg leader uses to manip-ulate forward-looking decisions of a Stackelberg follower that depend on continuation se-quences of decisions made once and for all by the Stackelberg leader at time 0.
To facilitate computation and interpretation, we formulate things in a context that allows usto apply dynamic programming for linear-quadratic models.
From the beginning, we carry along a linear-quadratic model of duopoly in which firms faceadjustment costs that make them want to forecast actions of other firms that influence futureprices.
Letโs start with some standard imports:
In [2]: import numpy as npimport numpy.linalg as laimport quantecon as qefrom quantecon import LQimport matplotlib.pyplot as plt%matplotlib inline
NumbaWarning: The TBB threading layer requires TBB version 2019.5 or later i.e.,
TBB_INTERFACE_VERSION >= 11005. Found TBB_INTERFACE_VERSION = 11004. The TBBthreading
layer is disabled.warnings.warn(problem)
3 Duopoly
Time is discrete and is indexed by ๐ก = 0, 1, โฆ.Two firms produce a single good whose demand is governed by the linear inverse demandcurve
๐๐ก = ๐0 โ ๐1(๐1๐ก + ๐2๐ก)
where ๐๐๐ก is output of firm ๐ at time ๐ก and ๐0 and ๐1 are both positive.๐10, ๐20 are given numbers that serve as initial conditions at time 0.By incurring a cost of change
๐พ๐ฃ2๐๐ก
where ๐พ > 0, firm ๐ can change its output according to
Firm ๐ wants to maximize the present value of its profits
โโ๐ก=0
๐ฝ๐ก๐๐๐ก
where ๐ฝ โ (0, 1) is a time discount factor.
3.1 Stackelberg Leader and Follower
Each firm ๐ = 1, 2 chooses a sequence ๐๐ โก ๐๐๐ก+1โ๐ก=0 once and for all at time 0.
We let firm 2 be a Stackelberg leader and firm 1 be a Stackelberg follower.The leader firm 2 goes first and chooses ๐2๐ก+1โ
๐ก=0 once and for all at time 0.Knowing that firm 2 has chosen ๐2๐ก+1โ
๐ก=0, the follower firm 1 goes second and chooses๐1๐ก+1โ
๐ก=0 once and for all at time 0.In choosing ๐2, firm 2 takes into account that firm 1 will base its choice of ๐1 on firm 2โschoice of ๐2.
2
3.2 Abstract Statement of the Leaderโs and Followerโs Problems
We can express firm 1โs problem as
max๐1
ฮ 1( ๐1; ๐2)
where the appearance behind the semi-colon indicates that ๐2 is given.
Firm 1โs problem induces the best response mapping
๐1 = ๐ต( ๐2)
(Here ๐ต maps a sequence into a sequence)
The Stackelberg leaderโs problem is
max๐2
ฮ 2(๐ต( ๐2), ๐2)
whose maximizer is a sequence ๐2 that depends on the initial conditions ๐10, ๐20 and the pa-rameters of the model ๐0, ๐1, ๐พ.
This formulation captures key features of the model
โข Both firms make once-and-for-all choices at time 0.โข This is true even though both firms are choosing sequences of quantities that are in-
dexed by time.โข The Stackelberg leader chooses first within time 0, knowing that the Stackelberg fol-
lower will choose second within time 0.
While our abstract formulation reveals the timing protocol and equilibrium concept well, itobscures details that must be addressed when we want to compute and interpret a Stackel-berg plan and the followerโs best response to it.
To gain insights about these things, we study them in more detail.
3.3 Firmsโ Problems
Firm 1 acts as if firm 2โs sequence ๐2๐ก+1โ๐ก=0 is given and beyond its control.
Firm 2 knows that firm 1 chooses second and takes this into account in choosing ๐2๐ก+1โ๐ก=0.
In the spirit of working backward, we study firm 1โs problem first, taking ๐2๐ก+1โ๐ก=0 as given.
We can formulate firm 1โs optimum problem in terms of the Lagrangian
Firm 1 seeks a maximum with respect to ๐1๐ก+1, ๐ฃ1๐กโ๐ก=0 and a minimum with respect to
๐๐กโ๐ก=0.
We approach this problem using methods described in Ljungqvist and Sargent RMT5 chapter2, appendix A and Macroeconomic Theory, 2nd edition, chapter IX.
Because ๐ฟ2 > 1โ๐ฝ the operator (1 โ ๐ฟ2๐ฟ) contributes an unstable component if solved back-wards but a stable component if solved forwards.
Operating on both sides of equation (2) with ๐ฝโ1 times this inverse operator gives the fol-lowerโs decision rule for setting ๐1๐ก+1 in the feedback-feedforward form.
subject to initial conditions for ๐1๐ก, ๐2๐ก at ๐ก = 0.
Comments: We have formulated the Stackelberg problem in a space of sequences.
The max-min problem associated with Lagrangian (4) is unpleasant because the time ๐ก com-ponent of firm 1โs payoff function depends on the entire future of its choices of ๐1๐ก+๐โ
๐=0.
This renders a direct attack on the problem cumbersome.
Therefore, below, we will formulate the Stackelberg leaderโs problem recursively.
Weโll put our little duopoly model into a broader class of models with the same conceptualstructure.
4 The Stackelberg Problem
We formulate a class of linear-quadratic Stackelberg leader-follower problems of which ourduopoly model is an instance.
We use the optimal linear regulator (a.k.a. the linear-quadratic dynamic programming prob-lem described in LQ Dynamic Programming problems) to represent a Stackelberg leaderโsproblem recursively.
Let ๐ง๐ก be an ๐๐ง ร 1 vector of natural state variables.
Let ๐ฅ๐ก be an ๐๐ฅ ร 1 vector of endogenous forward-looking variables that are physically free tojump at ๐ก.
In our duopoly example ๐ฅ๐ก = ๐ฃ1๐ก, the time ๐ก decision of the Stackelberg follower.
Let ๐ข๐ก be a vector of decisions chosen by the Stackelberg leader at ๐ก.The ๐ง๐ก vector is inherited physically from the past.
But ๐ฅ๐ก is a decision made by the Stackelberg follower at time ๐ก that is the followerโs best re-sponse to the choice of an entire sequence of decisions made by the Stackelberg leader at time๐ก = 0.
Let
๐ฆ๐ก = [๐ง๐ก๐ฅ๐ก
]
Represent the Stackelberg leaderโs one-period loss function as
Subject to an initial condition for ๐ง0, but not for ๐ฅ0, the Stackelberg leader wants to maxi-mize
โโ
โ๐ก=0
๐ฝ๐ก๐(๐ฆ๐ก, ๐ข๐ก) (5)
The Stackelberg leader faces the model
[ ๐ผ 0๐บ21 ๐บ22
] [๐ง๐ก+1๐ฅ๐ก+1
] = [๐ด11 ๐ด12๐ด21 ๐ด22
] [๐ง๐ก๐ฅ๐ก
] + ๐ข๐ก (6)
We assume that the matrix [ ๐ผ 0๐บ21 ๐บ22
] on the left side of equation (6) is invertible, so that
we can multiply both sides by its inverse to obtain
[๐ง๐ก+1๐ฅ๐ก+1
] = [๐ด11 ๐ด12๐ด21 ๐ด22
] [๐ง๐ก๐ฅ๐ก
] + ๐ต๐ข๐ก (7)
or
๐ฆ๐ก+1 = ๐ด๐ฆ๐ก + ๐ต๐ข๐ก (8)
4.1 Interpretation of the Second Block of Equations
The Stackelberg followerโs best response mapping is summarized by the second block of equa-tions of (7).
In particular, these equations are the first-order conditions of the Stackelberg followerโs opti-mization problem (i.e., its Euler equations).
These Euler equations summarize the forward-looking aspect of the followerโs behavior andexpress how its time ๐ก decision depends on the leaderโs actions at times ๐ โฅ ๐ก.
6
When combined with a stability condition to be imposed below, the Euler equations summa-rize the followerโs best response to the sequence of actions by the leader.
The Stackelberg leader maximizes (5) by choosing sequences ๐ข๐ก, ๐ฅ๐ก, ๐ง๐ก+1โ๐ก=0 subject to (8)
and an initial condition for ๐ง0.
Note that we have an initial condition for ๐ง0 but not for ๐ฅ0.
๐ฅ0 is among the variables to be chosen at time 0 by the Stackelberg leader.
The Stackelberg leader uses its understanding of the responses restricted by (8) to manipulatethe followerโs decisions.
4.2 More Mechanical Details
For any vector ๐๐ก, define ๐๐ก = [๐๐ก, ๐๐ก+1 โฆ].Define a feasible set of ( ๐ฆ1, 0) sequences
Please remember that the followerโs Euler equation is embedded in the system of dynamicequations ๐ฆ๐ก+1 = ๐ด๐ฆ๐ก + ๐ต๐ข๐ก.
Note that in the definition of ฮฉ(๐ฆ0), ๐ฆ0 is taken as given.
Although it is taken as given in ฮฉ(๐ฆ0), eventually, the ๐ฅ0 component of ๐ฆ0 will be chosen bythe Stackelberg leader.
4.3 Two Subproblems
Once again we use backward induction.
We express the Stackelberg problem in terms of two subproblems.
Subproblem 1 is solved by a continuation Stackelberg leader at each date ๐ก โฅ 0.
Subproblem 2 is solved by the Stackelberg leader at ๐ก = 0.
The two subproblems are designed
โข to respect the protocol in which the follower chooses ๐1 after seeing ๐2 chosen by theleader
โข to make the leader choose ๐2 while respecting that ๐1 will be the followerโs best responseto ๐2
โข to represent the leaderโs problem recursively by artfully choosing the state variablesconfronting and the control variables available to the leader
4.3.1 Subproblem 1
๐ฃ(๐ฆ0) = max( ๐ฆ1,0)โฮฉ(๐ฆ0)
โโ
โ๐ก=0
๐ฝ๐ก๐(๐ฆ๐ก, ๐ข๐ก)
7
4.3.2 Subproblem 2
๐ค(๐ง0) = max๐ฅ0
๐ฃ(๐ฆ0)
Subproblem 1 takes the vector of forward-looking variables ๐ฅ0 as given.
Subproblem 2 optimizes over ๐ฅ0.
The value function ๐ค(๐ง0) tells the value of the Stackelberg plan as a function of the vector ofnatural state variables at time 0, ๐ง0.
4.4 Two Bellman Equations
We now describe Bellman equations for ๐ฃ(๐ฆ) and ๐ค(๐ง0).
4.4.1 Subproblem 1
The value function ๐ฃ(๐ฆ) in subproblem 1 satisfies the Bellman equation
๐ฃ(๐ฆ) = max๐ข,๐ฆโ
โ๐(๐ฆ, ๐ข) + ๐ฝ๐ฃ(๐ฆโ) (9)
where the maximization is subject to
๐ฆโ = ๐ด๐ฆ + ๐ต๐ข
and ๐ฆโ denotes next periodโs value.
Substituting ๐ฃ(๐ฆ) = โ๐ฆโฒ๐๐ฆ into Bellman equation (9) gives
We use these two equations as components of the following linear system that confronts aStackelberg continuation leader at time ๐ก
โกโขโขโฃ
1 0 0 00 1 0 00 0 1 0
๐ฝ๐02๐พ โ๐ฝ๐1
2๐พ โ๐ฝ๐1๐พ ๐ฝ
โคโฅโฅโฆ
โกโขโขโฃ
1๐2๐ก+1๐1๐ก+1๐ฃ1๐ก+1
โคโฅโฅโฆ
=โกโขโขโฃ
1 0 0 00 1 0 00 0 1 10 0 0 1
โคโฅโฅโฆ
โกโขโขโฃ
1๐2๐ก๐1๐ก๐ฃ1๐ก
โคโฅโฅโฆ
+โกโขโขโฃ
0100
โคโฅโฅโฆ
๐ฃ2๐ก
Time ๐ก revenues of firm 2 are ๐2๐ก = ๐0๐2๐ก โ ๐1๐22๐ก โ ๐1๐1๐ก๐2๐ก which evidently equal
๐งโฒ๐ก๐ 1๐ง๐ก โก โกโข
โฃ
1๐2๐ก๐1๐ก
โคโฅโฆ
โฒ
โกโขโฃ
0 ๐02 0
๐02 โ๐1 โ๐1
20 โ๐1
2 0โคโฅโฆ
โกโขโฃ
1๐2๐ก๐1๐ก
โคโฅโฆ
If we set ๐ = ๐พ, then firm 2โs period ๐ก profits can then be written
๐ฆโฒ๐ก๐ ๐ฆ๐ก โ ๐๐ฃ2
2๐ก
where
๐ฆ๐ก = [๐ง๐ก๐ฅ๐ก
]
with ๐ฅ๐ก = ๐ฃ1๐ก and
๐ = [๐ 1 00 0]
Weโll report results of implementing this code soon.
But first, we want to represent the Stackelberg leaderโs optimal choices recursively.
It is important to do this for several reasons:
โข properly to interpret a representation of the Stackelberg leaderโs choice as a sequence ofhistory-dependent functions
โข to formulate a recursive version of the followerโs choice problem
First, letโs get a recursive representation of the Stackelberg leaderโs choice of ๐2 for ourduopoly model.
10
6 Recursive Representation of Stackelberg Plan
In order to attain an appropriate representation of the Stackelberg leaderโs history-dependentplan, we will employ what amounts to a version of the Big K, little k device often used inmacroeconomics by distinguishing ๐ง๐ก, which depends partly on decisions ๐ฅ๐ก of the followers,from another vector ๐ง๐ก, which does not.
We will use ๐ง๐ก and its history ๐ง๐ก = [ ๐ง๐ก, ๐ง๐กโ1, โฆ , ๐ง0] to describe the sequence of the Stackelbergleaderโs decisions that the Stackelberg follower takes as given.
Thus, we let ๐ฆโฒ๐ก = [ ๐งโฒ
๐ก ๐ฅโฒ๐ก] with initial condition ๐ง0 = ๐ง0 given.
That we distinguish ๐ง๐ก from ๐ง๐ก is part and parcel of the Big K, little k device in this in-stance.
We have demonstrated that a Stackelberg plan for ๐ข๐กโ๐ก=0 has a recursive representation
Representation (10) confirms that whenever ๐น๐ฅ โ 0, the typical situation, the time ๐ก compo-nent ๐๐ก of a Stackelberg plan is history-dependent, meaning that the Stackelberg leaderโschoice ๐ข๐ก depends not just on ๐ง๐ก but on components of ๐ง๐กโ1.
6.1 Comments and Interpretations
After all, at the end of the day, it will turn out that because we set ๐ง0 = ๐ง0, it will be truethat ๐ง๐ก = ๐ง๐ก for all ๐ก โฅ 0.
Then why did we distinguish ๐ง๐ก from ๐ง๐ก?
The answer is that if we want to present to the Stackelberg follower a history-dependentrepresentation of the Stackelberg leaderโs sequence ๐2, we must use representation (10) castin terms of the history ๐ง๐ก and not a corresponding representation cast in terms of ๐ง๐ก.
6.2 Dynamic Programming and Time Consistency of followerโs Problem
Given the sequence ๐2 chosen by the Stackelberg leader in our duopoly model, it turns outthat the Stackelberg followerโs problem is recursive in the natural state variables that con-front a follower at any time ๐ก โฅ 0.
This means that the followerโs plan is time consistent.
To verify these claims, weโll formulate a recursive version of a followerโs problem that buildson our recursive representation of the Stackelberg leaderโs plan and our use of the Big K,little k idea.
6.3 Recursive Formulation of a Followerโs Problem
We now use what amounts to another โBig ๐พ, little ๐โ trick (see rational expectations equi-librium) to formulate a recursive version of a followerโs problem cast in terms of an ordinaryBellman equation.
Firm 1, the follower, faces ๐2๐กโ๐ก=0 as a given quantity sequence chosen by the leader and be-
In [7]: # Manually checks whether P is approximately a fixed pointP_next = (R + F.T @ Q @ F + ฮฒ * (A - B @ F).T @ P @ (A - B @ F))(P - P_next < tol0).all()
Out[7]: True
In [8]: # Manually checks whether two different ways of computing the# value function give approximately the same answerv_expanded = -((y0.T @ R @ y0 + ut[:, 0].T @ Q @ ut[:, 0] +
ฮฒ * (y0.T @ (A - B @ F).T @ P @ (A - B @ F) @ y0)))(v_leader_direct - v_expanded < tol0)[0, 0]
Out[8]: True
8 Exhibiting Time Inconsistency of Stackelberg Plan
In the code below we compare two values
โข the continuation value โ๐ฆ๐ก๐๐ฆ๐ก earned by a continuation Stackelberg leader who inheritsstate ๐ฆ๐ก at ๐ก
โข the value of a reborn Stackelberg leader who inherits state ๐ง๐ก at ๐ก and sets ๐ฅ๐ก =โ๐ โ1
22 ๐21
The difference between these two values is a tell-tale sign of the time inconsistency of theStackelberg plan
In [9]: # Compute value function over time with a reset at time tvt_leader = np.zeros(n)vt_reset_leader = np.empty_like(vt_leader)
axes[2].plot(range(n), vt_leader, 'bo', ms=2)axes[2].plot(range(n), vt_reset_leader, 'ro', ms=2)axes[2].set(title=r'Leader value function $v(y_t)$', xlabel='t')
plt.tight_layout()plt.show()
9 Recursive Formulation of the Followerโs Problem
We now formulate and compute the recursive version of the followerโs problem.
We check that the recursive Big ๐พ , little ๐ formulation of the followerโs problem producesthe same output path ๐1 that we computed when we solved the Stackelberg problem
In [11]: A_tilde = np.eye(5)A_tilde[:4, :4] = A - B @ F
In [12]: # Checks that the recursive formulation of the follower's problem gives# the same solution as the original Stackelberg problemfig, ax = plt.subplots()ax.plot(yt_tilde[4], 'r', label="q_tilde")ax.plot(yt_tilde[2], 'b', label="q")ax.legend()plt.show()
Note: Variables with _tilde are obtained from solving the followerโs problem โ those with-out are from the Stackelberg problem
In [13]: # Maximum absolute difference in quantities over time between# the first and second solution methodsnp.max(np.abs(yt_tilde[4] - yt_tilde[2]))
If we inspect the coefficients in the decision rule โ ๐น , we can spot the reason that the followerchooses to set ๐ฅ๐ก = ๐ฅ๐ก when it sets ๐ฅ๐ก = โ ๐น๐๐ก in the recursive formulation of the followerproblem.
Can you spot what features of ๐น imply this?
Hint: remember the components of ๐๐ก
In [15]: # Policy function in the follower's problemF_tilde.round(4)