Handbook of Pricing Management, Oxford University Press
Game Theory Models of Pricing
September 2010
Praveen Kopalle and Robert A. Shumsky
Tuck School of Business at Dartmouth
1. Introduction
In 1991, packaged-goods behemoth Procter & Gamble (P&G) initiated a “value pricing” scheme
for sales to retailers. Value pricing was P&G’s label for everyday low pricing (EDLP), a pricing
strategy under which retailers are charged a consistent price rather than a high baseline price
punctuated by sporadic, deep discounts. P&G had many reasons for converting to EDLP. Its
sales force had grown to rely on discounting to drive sales and the use of deep discounts had
spiraled out of control, cutting into earnings (Saporito, 1994). The discounts then created demand
shocks, as both retailers and customers loaded up on products whenever the price dropped. This
exacerbated the so-called bullwhip effect, as demand variability was amplified up the supply
chain, reducing the utilization of distribution resources and factories and increasing costs (Lee,
Padmanabhan, and Whang, 1997).
Despite the costs of its traditional discounting strategy, the move to value pricing was
risky for P&G. How would retailers, consumers, and competing manufacturers respond to
P&G’s EDLP program? For example, would competitors deepen their own discounts to pull
market share away from P&G? Or would they also move to a low, consistent price? To
accurately predict competitors’ behavior, P&G would have to consider each competitor’s beliefs
about P&G itself. For example, it was possible that Unilever, a competing manufacturer, might
increase its discounting if it believed that P&G was not fully committed to EDLP but otherwise
might itself adopt EDLP. Therefore, P&G’s decision on whether and how to adopt EDLP
depended on the responses of its competitors, which, in turn, depended upon their beliefs about
what P&G was going to do.
P&G’s actions depended on Unilever’s, which depended on P&G’s: a chicken-and-egg
problem that can make your head spin. There is, however, an approach to problems like this that
can cut through the complexity and help managers make better pricing decisions. That approach,
game theory, was first developed as a distinct body of knowledge by mathematicians and
economists in the 1940s and ’50s (see the pioneering work by von Neumann and Morgenstern,
1944, and Nash, 1950). Since then, it has been applied to a variety of fields, including computer
science, political science, biology, operations management, and marketing science.
In this chapter, we first introduce the basic concepts of game theory by using simple
pricing examples. We then link those examples with both the research literature and industry
practice, including examples of how game theory has been used to understand P&G’s value
pricing initiative and the competitive response. In Section 2 we define the basic elements of a
game and describe the fundamental assumptions that underlie game theory. In the remaining
sections we examine models that provide insight into how games work (e.g., understanding
various types of equilibria) as well as how competition affects pricing. The models may be
categorized according to two attributes: the timing of actions and the number of periods (Table
1). In many of the models, competitors make simultaneous pricing decisions, so that neither
player has more information about its competitor’s actions than the other. In other games with
sequential timing, one competitor sets its price first. Some games have a single period (we
define a ‘period’ as a time unit in which each player takes a single action), while other games
will have multiple time periods, and actions taken in one period may affect actions and rewards
in later periods. Along the way, we will make a few side trips: a brief look at behavioral game
theory in Section 3.2 and some thoughts on implications for the practice of pricing in Section 4.
                       Timing of actions
Number of periods      Simultaneous               Sequential
One                    Sections 2.1-2.4, 2.6      Section 2.5
Multiple               Sections 3.1-3.4           Sections 3.5-3.6
Table 1: Game types in this chapter
This chapter does not contain a rigorous description of the mathematics behind game
theory, nor does it cover all important areas of the field. For example, we will not discuss
cooperative game theory, in which participants form coalitions to explicitly coordinate their
prices (in many countries, such coalitions would be illegal). We will, however, say a bit more
about cooperative games at the end of Section 2.6.
For more complete and rigorous treatments of game theory, we recommend Fudenberg
and Tirole (1991), Myerson (1991), and Gibbons (1992). For more examples of how
practitioners can make use of game theory for many decisions in addition to pricing, we
recommend Dixit and Nalebuff (1991). Moorthy (1985) focuses on marketing applications such
as market-entry decisions and advertising programs, as well as pricing. Cachon and Netessine
(2006) discuss applications of game theory to supply chain management. Nisan et al. (2007)
describe computational approaches to game theory for applications such as information security,
and the flow of information through social networks. The popular press and trade journals also
report on how game theory can influence corporate strategy. For example, Rappeport (2008)
describes how game theory has been applied at Microsoft, Chevron, and other firms. Finally,
information intermediaries such as Yahoo and Google employ research groups that develop
game theory models to understand how consumers and firms use the web, and to develop new
business models for the internet (see labs.yahoo.com/ and research.google.com/).
2. Introduction to Game Theory: Single-Period Pricing Games
A game is defined by three elements: players, strategies that may be used by each player, and
payoffs.
A game consists of players (participants in the game), strategies (plans by each player that
describe what action will be taken in any situation), and payoffs (rewards for each player for all
combinations of strategies).
In the value-pricing example above, the players might be P&G and Unilever, their strategies are
the prices they set over time under certain conditions, and the payoffs are the profits they make
under various combinations of pricing schemes. Given that each player has chosen a strategy,
the collection of players’ strategies is called a strategy profile. If we know the strategy profile,
then we know how every player will act for any situation in the game.
When describing a game, it is also important to specify the information that is available
to each player. For example, we will sometimes consider situations in which each player must
decide upon a strategy to use without knowing the strategy chosen by the other player. At other
times, we will let one player observe the other player’s choice before making a decision.
When analyzing a game, there are two basic steps. The first is to make clear the
implications of the rules—to fully understand the relationships between the strategies and the
payoffs. Once we understand how the game works, the second step is to determine which
strategy each player will play under various circumstances. This analysis may begin with a
description of each player’s optimal strategy given any strategy profile for the other players. The
analysis usually ends with a description of equilibrium strategies: a strategy profile that the
players may settle into and will not want to change. One fascinating aspect of game theory is that
these equilibrium strategies can often be quite different from the best overall strategy that might
be achieved if the players were part of a single firm that coordinated their actions. We will see
that competition often can make both players worse off.
A typical assumption in game theory, and an assumption we hold in much of this chapter,
is that all players know the identities, available strategies, and payoffs of all other players, as in a
game of tic-tac-toe. When no player has private information about the game that is unavailable
to the other players, and when all players know that the information is available to all players, we
say that the facts of the game are common knowledge (see Aumann, 1976, and Milgrom, 1981,
for more formal descriptions of the common knowledge assumption). In some instances,
however, a player may have private information. For example, one firm may have more accurate
information than its competitor about how its own customers respond to price changes by the
competitor. Such games of incomplete information are sometimes called Bayesian games, for
uncertainty and probability are important elements of the game, and Bayes’ rule is used to update
the players’ beliefs as the game is played. See Gibbons (1992) and Harsanyi (1967) for more
background on Bayesian games.
We also assume that the players are rational, i.e., that firms make pricing decisions to
maximize their profits. Finally, we assume that each player knows that the other players are
rational. Therefore, each player can put itself in its competitors’ shoes and anticipate its
competitors’ choices. Pushing further, each player knows that its competitors are anticipating its
own choices, and making choices accordingly. In general, we assume that the players have an
infinite hierarchy of beliefs about strategies in the game. This assumption is vital for
understanding why competitors may achieve the Nash equilibrium described in the next Section.
The assumption also highlights a key insight that can be developed by studying game theory –
the importance of anticipating your competitors’ actions.
2.1 A First Example: Pricing the UltraPhone and the Prisoner’s Dilemma
Consider two players, firms A and B, that have simultaneously developed virtually identical
versions of what we will call the UltraPhone, a hand-held device that includes technology for
cheap, efficient 3-dimensional video-conferencing. Both firms will release their UltraPhones at
the same time and, therefore, each has to set a price for the phone without knowing the price set
by its competitor. For the purposes of this example, we assume both firms will charge either a
High price or a Low price for all units (see Table 2). Therefore, the firms’ possible actions are to
price ‘High’ or ‘Low,’ and each firm’s strategy is a plan to charge one of these two prices.
Price    Unit contribution margin at that price    Number in segment
High     $800                                      2,000,000
Low      $300                                      2,000,000
Table 2: Prices, margins, and segments for the UltraPhone
To complete the description of the game, we describe the payoffs under each strategy
profile. At the High price, the firms’ unit contribution margin is $800/phone; at the Low price,
the firms’ unit contribution margin is $300/phone. There are two segments of customers for the
UltraPhone, and the maximum price each segment is willing to pay is very different. The two
million people in the “High” segment are all willing to pay the High price (but would, of course,
prefer the Low price), while the two million people in the “Low” segment are only willing to pay
the Low price. We also assume that if both firms charge the same price, then sales are split
equally between the two firms. And we assume both firms can always manufacture a sufficient
number of phones to satisfy any number of customers. Note that in this example, there is no
reliable way to discriminate between segments; if the phone is offered for the Low price by
either firm, then both the two million Low customers and the two million High customers will
buy it for that Low price.
Given the information in Table 2, both firms ask, “What should we charge for our
phone?” Before looking at the competitive analysis, consider what the firms should do if they
coordinate—e.g., if they were not competitors but instead were divisions of one firm that is a
monopoly producer of this product. The total contribution margin given the High price is 2
million*$800 = $1.6 billion. The margin given the Low price is (2 million + 2 million)*$300 =
$1.2 billion. Therefore, the optimal solution for a monopoly is to set prices to exclusively target
the high-end market.
                          B chooses...
                          High            Low
A chooses...   High       B: 8            B: 12
                          A: 8            A: 0
               Low        B: 0            B: 6
                          A: 12           A: 6
Table 3: The UltraPhone game. Entries in each cell are payoffs to A (lower number in each
cell) and to B (upper number in each cell) in $100 millions.
Now we turn to the competitive analysis—the game. The two competing firms, A and B,
face the situation shown in Table 3, which represents a normal form of a game. The rows
correspond to the two different strategies available to firm A: price High or Low. The columns
correspond to the same strategies for B. The entries in the tables show the payoffs, the total
contribution margins for firm A (the lower number in each cell of the table) and firm B (the
upper number in each cell) in units of $100 million. If both price High (the upper left cell), then
they split the $1.6 billion equally between them. If both price Low (bottom right cell), then they
split the $1.2 billion. For the cell in the upper right, A prices High but B prices Low, so that B
captures all of the demand at the Low price and gains $1.2 billion. The lower left cell shows the
reverse.
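The payoffs in Table 3 follow mechanically from the assumptions in Table 2, and the calculation can be sketched in a few lines of Python. The function and variable names below are our own illustrative choices, not part of the original model description.

```python
# Illustrative sketch: rebuild the Table 3 payoffs from the Table 2 assumptions.
MARGIN = {"High": 800, "Low": 300}                 # unit contribution margin ($)
SEGMENT = {"High": 2_000_000, "Low": 2_000_000}    # customers per segment

def buyers_at(price):
    """Number of customers willing to buy at a given price level."""
    # High-segment customers buy at either price; Low-segment customers only at Low.
    return SEGMENT["High"] + (SEGMENT["Low"] if price == "Low" else 0)

def payoffs(price_a, price_b):
    """Return (margin_A, margin_B) in dollars for one strategy profile."""
    if price_a == price_b:
        # Same price: sales are split equally between the two firms.
        total = buyers_at(price_a) * MARGIN[price_a]
        return total / 2, total / 2
    # Different prices: the Low-price firm captures all demand.
    low_total = buyers_at("Low") * MARGIN["Low"]
    return (low_total, 0.0) if price_a == "Low" else (0.0, low_total)

table3 = {(a, b): payoffs(a, b) for a in MARGIN for b in MARGIN}
for profile, (pay_a, pay_b) in table3.items():
    print(profile, pay_a / 1e8, pay_b / 1e8)       # units of $100 million
```

Dividing by 1e8 expresses each payoff in the $100 million units used in Table 3.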
Given this payoff structure, what should the firms do? First, consider firm A’s decision,
given each strategy chosen by B. If B chooses High (the first column), then it is better for A to
choose Low. Therefore, we say that Low is player A’s best response to B choosing High.
A player’s best response is the strategy, or set of strategies, that maximizes the player’s payoff,
given the strategies of the other players.
If B chooses Low (the second column), then firm A’s best response is to choose Low as well.
The fact that it is best for A to price Low, no matter what B does, makes pricing Low a particular
type of strategy, a dominant one.
A strategy is a dominant strategy for a firm if it is optimal, no matter what strategy is used by
the other players.
Likewise, dominated strategies are never optimal, no matter what the competitors do.
Pricing Low is also a dominant strategy for B: no matter what A chooses, Low is better.
In fact, the strategy profile [A Low, B Low] is the unique Nash Equilibrium of this game; if both
firms choose Low, then neither has a reason to unilaterally change its mind.
The firms are in a Nash Equilibrium if the strategy of each firm is the best response to the
strategies of the other firms. Equivalently, in a Nash equilibrium, no firm has any
incentive to unilaterally deviate from its strategy.
This is the only equilibrium in the game. For example, if A chooses High and B chooses Low,
then A would wish it had chosen Low and earned $600 million rather than $0. Note that
for the Nash equilibrium of this game, both firms choose their dominant strategies, but that is not
always the case (see, for example, the Nash equilibria in the Game of Chicken, Section 2.4).
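The definitions of best response, dominant strategy, and Nash equilibrium can be checked by brute force on a two-by-two game. The following is a minimal sketch using the Table 3 payoffs; the helper names are hypothetical, and the enumeration approach is only practical for small games.

```python
from itertools import product

STRATEGIES = ("High", "Low")
# Table 3 payoffs as (payoff_A, payoff_B), in $100 millions.
ULTRAPHONE = {
    ("High", "High"): (8, 8), ("High", "Low"): (0, 12),
    ("Low", "High"): (12, 0), ("Low", "Low"): (6, 6),
}

def payoff(game, player, own, other):
    """Payoff to player (0 = A, 1 = B) when it plays `own` and its rival plays `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return game[profile][player]

def best_responses(game, player, other):
    """Set of strategies that maximize the player's payoff against `other`."""
    best = max(payoff(game, player, s, other) for s in STRATEGIES)
    return {s for s in STRATEGIES if payoff(game, player, s, other) == best}

def dominant_strategies(game, player):
    """Strategies that are a best response to every rival strategy."""
    return set.intersection(*(best_responses(game, player, o) for o in STRATEGIES))

def pure_nash(game):
    """Profiles in which each player's strategy is a best response to the other's."""
    return [(a, b) for a, b in product(STRATEGIES, repeat=2)
            if a in best_responses(game, 0, b) and b in best_responses(game, 1, a)]

print(dominant_strategies(ULTRAPHONE, 0), dominant_strategies(ULTRAPHONE, 1))
print(pure_nash(ULTRAPHONE))
```

For this game the enumeration reports Low as the dominant strategy for both firms and [A Low, B Low] as the only pure-strategy equilibrium, in line with the discussion above.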
Because we assume that both firms are rational and fully anticipate the logic used by the
other player (the infinite belief hierarchy), we predict that both will charge Low, each will earn
$600 million, and the total industry contribution margin will be $1.2 billion. This margin is less
than the optimal result of $1.6 billion. This is a well-known result: under competition, both
prices and industry profits decline. We will see in Section 3.1, however, that it may be possible
to achieve higher profits when the game is played repeatedly.
This particular game is an illustration of a Prisoner’s Dilemma. In the classic example of
the Prisoner’s Dilemma, two criminal suspects are apprehended and held in separate rooms. If
neither confesses to a serious crime, then both are jailed for a short time (say, one year) under
relatively minor charges. If both confess, they both receive a lengthy sentence (five years). But if
one confesses and the other doesn’t, then the snitch is freed and the silent prisoner is given a long
prison term (10 years). The equilibrium in the Prisoner’s Dilemma is for both prisoners to
confess, and the resulting five-year term is substantially worse than the cooperative outcome
they would have achieved if they had both remained silent. In the UltraPhone example, the
Low-Low strategy profile is equivalent to mutual confession.
The strategy profile [A Low, B Low] is a Nash equilibrium in pure strategies. It is a pure
strategy equilibrium because each player chooses a single price. The players might also consider
a mixed strategy, in which each player chooses a probability distribution over a set of pure
strategies (see Dixit and Skeath, 1999, chapter 5, for an accessible and thorough discussion of
mixed strategies). The equilibrium concept is named after John Nash, a mathematician who
showed that if each player has a finite number of pure strategies, then there must always exist at
least one mixed strategy or pure strategy equilibrium (Nash, 1950). In certain games, however,
there may not exist a pure strategy equilibrium and as we will see in Section 2.4, it is also
possible to have multiple pure-strategy equilibria.
2.2 Bertrand and Cournot Competition
The UltraPhone game is similar to a Bertrand competition, in which two firms produce identical
products and compete only on price, with no product differentiation. It can be shown that if
demand is a continuous linear function of price and if both firms have the same constant
marginal cost, then the only Nash equilibrium is for both firms to set a price equal to their
marginal cost. As in the example above, competition drives price down, and in a Bertrand
competition, prices are driven down to their lowest limit.
Both the UltraPhone example and the Bertrand model assume that it is always possible
for the firms to produce a sufficient quantity to supply the market, no matter what price is
charged. Therefore, the Bertrand model does not fit industries with capacity constraints that are
expensive to adjust. An alternative model is Cournot competition, in which the firms’ strategies
are quantities rather than prices. The higher the quantities the firms choose, the lower the price.
Under a Cournot competition between two firms, the resulting Nash equilibrium price is higher
than the marginal cost (the Bertrand solution) but lower than the monopoly price. As the number
of firms competing in the market increases, however, the Cournot equilibrium price falls. In the
theoretical limit (an infinite number of competitors), the price drops down to the marginal cost.
See Weber [this volume] for additional discussion of Bertrand and Cournot competition.
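To make the Cournot comparison concrete, the sketch below assumes a linear inverse demand curve P = alpha - beta*Q and a constant marginal cost m. These functional forms and the parameter values are our illustrative assumptions; the text above does not specify them. Under these assumptions the symmetric n-firm equilibrium quantity per firm is (alpha - m) / (beta*(n + 1)).

```python
# Illustrative assumptions (not from the chapter): inverse demand P = alpha - beta*Q,
# constant marginal cost m, and n identical firms choosing quantities.

def cournot_price(n, alpha=1000.0, beta=1.0, m=200.0):
    """Equilibrium price in a symmetric n-firm Cournot game with linear demand."""
    q = (alpha - m) / (beta * (n + 1))      # equilibrium quantity per firm
    return alpha - beta * n * q

def best_reply_quantity(rivals_total, alpha=1000.0, beta=1.0, m=200.0):
    """Profit-maximizing quantity for one firm, given the rivals' total output."""
    return max(0.0, (alpha - m - beta * rivals_total) / (2 * beta))

for n in (1, 2, 5, 100):
    print(n, round(cournot_price(n), 2))
```

With one firm the function returns the monopoly price; with two firms the price lies between marginal cost and the monopoly price; and as n grows the price approaches the marginal cost m, which is the Bertrand outcome.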
Besides the assumption that there are no capacity constraints, another important
assumption of Bertrand competition is that the products are identical, so that the firm offering the
lowest price attracts all demand. We will relax this assumption in the next section.
2.3 Continuous Prices, Product Differentiation, and Best Response Functions
In Section 2.1 we restricted our competitors to two choices: High and Low prices. In practice,
firms may choose from a variety of prices, so now assume that each firm chooses a price over a
continuous range. Let firm i's price be \(p_i\) (i=A or B). In this game, a player’s strategy is the
choice of a price, and we now must define the payoffs, given the strategies. First we describe
how each competitor’s price affects demand for the product. In the last section, all customers
chose the lowest price. Here we assume that the products are differentiated, so that some
customers receive different utilities from the products and are therefore loyal; they will purchase
from firm A or B, even if the other firm offers a lower price. Specifically, let \(d_i(p_i, p_j)\) be the
demand for product i, given that the competitors charge \(p_i\) and \(p_j\). We define the linear demand
function,
\[ d_i(p_i, p_j) = a - b p_i + c p_j . \]
If c > 0, we say that the products are substitutes: a lower price charged by firm A leads to more
demand for A’s product and less demand for B’s (although if \(p_B < p_A\), B does not steal all the
demand, as in Section 2.1). If c < 0, then the products are complements: a lower price charged
by A raises demand for both products. In the UltraPhone example, the products are substitutes;
product A and a specialized accessory (e.g., a docking station) would be complements. If c=0
then the two products are independent, there is no interaction between the firms, and both firms
choose their optimal monopoly price.
Figure 1: Best response functions in the UltraPhone pricing game
with product differentiation
Now let m be the constant marginal cost to produce either product. Firm i's margin is
therefore \((p_i - m)\, d_i(p_i, p_j)\). This function is concave with respect to \(p_i\), and therefore the optimal
price for firm i given that firm j charges \(p_j\) can be found by taking the derivative of the function
with respect to \(p_i\), setting that first derivative equal to 0, and solving algebraically for \(p_i\). Let
\(p_i^*(p_j)\) be the resulting optimal price, and we find,
\[ p_i^*(p_j) = \frac{1}{2b}\,(a + b m + c p_j) . \qquad (1) \]
This optimal price is a function of the competitor’s price, and therefore we call \(p_i^*(p_j)\) the best
response function of i to j; this is the continuous equivalent of the best response defined in
Section 2.1. Figure 1 shows the two best response functions on the same graph, for a given set of
parameters a=3000, b=4, c=2, and m=$200/unit. For example, if Firm B sets a price of $300,
Firm A’s best response is a price of $550. From the function, we see that if the products are
substitutes and c > 0 (as in this case), the best response of one firm rises with the price of the
other firm. If the products are complements, the lines would slope downward, so that if one firm
raises its prices then the best response of the competitor is to lower its price.
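The steps behind the best response function can be written out explicitly. The derivation below is a sketch that assumes the linear demand form d_i = a - b p_i + c p_j with b > 0, which reproduces the $550 figure quoted above; the symbol \pi_i for firm i's total margin is notation we introduce here.

```latex
\begin{align*}
\pi_i(p_i, p_j) &= (p_i - m)\,(a - b p_i + c p_j) \\
\frac{\partial \pi_i}{\partial p_i} &= a - 2 b p_i + c p_j + b m = 0 \\
p_i^*(p_j) &= \frac{a + b m + c p_j}{2b} \\
p_A^*(300) &= \frac{3000 + 4(200) + 2(300)}{2(4)} = \frac{4400}{8} = 550
\end{align*}
```

The second derivative with respect to p_i is -2b, which is negative, so the first-order condition identifies a maximum. The slope of the best response line is c/(2b), which is positive for substitutes and negative for complements.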
The plot also shows that if Firm A chooses a price of $633, then Firm B’s best response
is also $633. Likewise, if Firm B chooses $633, then so does Firm A. Therefore, neither firm has
any reason to deviate from choosing $633, and [$633, $633] is the unique Nash equilibrium. In
general, the equilibrium prices satisfy equation (1) when i is Firm A as well as when i is Firm B.
By solving these two equations we find,
\[ p_A^* = p_B^* = \frac{a + b m}{2b - c} . \]
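Note that as c rises, and the products move from being complements to substitutes, the
equilibrium price rises when a, b, and m are held fixed: with the parameters above, c = 0 gives a
price of $475, compared with $633 at c = 2.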
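One way to see how the two best response functions pin down the equilibrium is to let the firms respond to each other repeatedly until prices stop changing. The sketch below assumes the linear demand form d_i = a - b p_i + c p_j with the Figure 1 parameters; the function names and the starting prices are our own, and the iteration is an illustration, not a description of how firms actually adjust prices.

```python
# Parameters a, b, c, m from the Figure 1 example.
A, B, C, M = 3000.0, 4.0, 2.0, 200.0

def best_response(p_rival, a=A, b=B, c=C, m=M):
    """Profit-maximizing price for one firm, given the rival's price."""
    return (a + b * m + c * p_rival) / (2 * b)

def equilibrium_price(a=A, b=B, c=C, m=M):
    """Closed-form symmetric equilibrium: the fixed point of best_response."""
    return (a + b * m) / (2 * b - c)

def iterate_to_equilibrium(p_a=300.0, p_b=300.0, rounds=60):
    """Both firms repeatedly best-respond to the rival's previous price."""
    for _ in range(rounds):
        p_a, p_b = best_response(p_b), best_response(p_a)
    return p_a, p_b

print(best_response(300.0))
print(round(equilibrium_price(), 2))
print(tuple(round(p, 2) for p in iterate_to_equilibrium()))
```

Because each best response moves only c/(2b) = 0.25 dollars for every dollar change in the rival's price, the iteration converges quickly to the fixed point of roughly $633.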
In this example, the firms are identical and therefore the equilibrium is symmetric (both
choose the same price). With non-identical firms (e.g., if they have different parameter values in
the demand function and/or different marginal costs), then there can be a unique equilibrium in
which the firms choose different prices (Singh and Vives, 1984). We will discuss additional
extensions of this model, including the role of reference prices, in Section 3.3.
2.4 Multiple Equilibria and a Game of Chicken
Now return to the UltraPhone pricing game with two price points, as in Section 2.1, although we
also incorporate product differentiation, as in Section 2.3. Assume that 40 percent of the High-
segment customers are loyal to firm A and will choose A’s product even if A chooses a High
price while B chooses a Low price. Likewise, 40 percent of the High-segment customers are
loyal to firm B. The payoffs from this new game are shown in Table 4. If both firms price High
or both price Low, then the payoffs do not change. If A prices High and B prices Low (the upper
right cell), then A’s margin is 40%*(2 million)*$800 = $640 million while B’s margin is
(60%*(2 million) + 2 million)*$300 = $960 million. The lower left cell shows the reverse.
                          B chooses...
                          High            Low
A chooses...   High       B: 8            B: 9.6
                          A: 8            A: 6.4
               Low        B: 6.4          B: 6
                          A: 9.6          A: 6
Table 4: Payoffs for the UltraPhone game with loyal customers, in $100 millions
The presence of loyal customers significantly changes the behavior of the players. If B
chooses High, then A’s best response is to choose Low, while if B chooses Low, A’s best
response is to choose High. In this game, neither firm has a dominant strategy, and [A Low, B
Low] is no longer an equilibrium; if the firms find themselves in that cell, both firms would have
an incentive to move to High. But [A High, B High] is not an equilibrium either; both would then
want to move to Low.
In fact, this game has two pure strategy Nash equilibria: [A High, B Low] and [A Low, B
High]. Given either combination, neither firm has any incentive to unilaterally move to another
strategy. Each of these two equilibria represents a split in the market between one firm that
focuses on its own High-end loyal customers and the other that captures everyone else.
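The loyal-customer payoffs and the two equilibria can be reproduced with a short calculation. This is a minimal sketch under the assumptions stated above (40 percent of the High segment loyal to each firm); the names are hypothetical, and integer arithmetic is used so that the dollar figures are exact.

```python
from itertools import product

STRATEGIES = ("High", "Low")
MARGIN = {"High": 800, "Low": 300}       # unit contribution margin ($)
HIGH_SEGMENT = LOW_SEGMENT = 2_000_000   # customers per segment
LOYAL_PERCENT = 40                       # share of High segment loyal to each firm

def payoffs(price_a, price_b):
    """Return (margin_A, margin_B) in dollars for the loyal-customer game."""
    if price_a == price_b:
        # Same price: payoffs are unchanged from the original game.
        buyers = HIGH_SEGMENT + (LOW_SEGMENT if price_a == "Low" else 0)
        half = buyers * MARGIN[price_a] // 2
        return half, half
    loyal = HIGH_SEGMENT * LOYAL_PERCENT // 100
    high_firm = loyal * MARGIN["High"]                               # keeps only its loyal buyers
    low_firm = (HIGH_SEGMENT - loyal + LOW_SEGMENT) * MARGIN["Low"]  # serves everyone else
    return (high_firm, low_firm) if price_a == "High" else (low_firm, high_firm)

def pure_nash(game):
    """Profiles from which neither firm gains by deviating unilaterally."""
    found = []
    for a, b in product(STRATEGIES, repeat=2):
        a_ok = all(game[(a, b)][0] >= game[(alt, b)][0] for alt in STRATEGIES)
        b_ok = all(game[(a, b)][1] >= game[(a, alt)][1] for alt in STRATEGIES)
        if a_ok and b_ok:
            found.append((a, b))
    return found

table4 = {(a, b): payoffs(a, b) for a, b in product(STRATEGIES, repeat=2)}
print(table4)
print(pure_nash(table4))
```

The enumeration finds the two asymmetric profiles, [A High, B Low] and [A Low, B High], and rejects both symmetric ones.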
This example is analogous to a game of “chicken” between two automobile drivers. They
race towards one another and if neither swerves, they crash, producing the worst outcome—in
our case [A Low, B Low]. If both swerve, the payoff is, obviously, higher. If one swerves and
the other doesn’t, the one who did not swerve receives the highest reward, while the one who
“chickened out” has a payoff that is lower than if they had both swerved but higher than if the
driver had crashed. As in Table 4, there are two pure strategy equilibria in the game of chicken,
each with one driver swerving and the other driving straight. (There is also a mixed-strategy
equilibrium for the game in Table 4: if each player randomly selects a High price with probability
0.2 and a Low price with probability 0.8, then neither player has any incentive to choose a
different strategy.)
The existence of multiple equilibria poses a challenge for both theorists and practitioners
who use game theory. Given the many possible equilibria, will players actually choose one? If
so, which one? Behavioral explanations have been proposed to answer these questions (see
Section 3.2). In addition, game theorists have invented variations of the Nash equilibrium to
address these questions. These variations are called refinements, and they lay out criteria that
exclude certain types of strategies and equilibria. Many of these innovations depend upon the
theory of dynamic games, in which the game is played in multiple stages over time (see Section
3). The subgame perfect equilibrium described in the next section is one such refinement.
Another is the trembling-hand perfect equilibrium in which players are assumed to make small
mistakes (or “trembles”) and all players take these mistakes into account when choosing
strategies (Selten, 1975; Fudenberg and Tirole, 1991, Section 8.4). For descriptions of additional
refinements, see Govindan and Wilson (2005).
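The mixed-strategy equilibrium of the Table 4 game, with a High price chosen with probability 0.2, can be verified with an indifference calculation: each firm must randomize so that its rival earns the same expected payoff from either price. The sketch below uses exact fractions; the helper names are our own.

```python
from fractions import Fraction

# Table 4 payoffs to firm A, keyed by (A's price, B's price), in $100 millions.
# The game is symmetric, so firm B faces the same numbers with the roles reversed.
A_PAYOFF = {
    ("High", "High"): Fraction(8), ("High", "Low"): Fraction("6.4"),
    ("Low", "High"): Fraction("9.6"), ("Low", "Low"): Fraction(6),
}

def indifference_probability(u):
    """Probability q of the rival pricing High that makes the player indifferent.

    Solves q*u[H,H] + (1-q)*u[H,L] = q*u[L,H] + (1-q)*u[L,L] for q.
    """
    numerator = u[("Low", "Low")] - u[("High", "Low")]
    denominator = ((u[("High", "High")] - u[("High", "Low")])
                   - (u[("Low", "High")] - u[("Low", "Low")]))
    return numerator / denominator

def expected_payoff(u, own, q):
    """Expected payoff from pricing `own` when the rival prices High with probability q."""
    return q * u[(own, "High")] + (1 - q) * u[(own, "Low")]

q = indifference_probability(A_PAYOFF)
print(q, expected_payoff(A_PAYOFF, "High", q), expected_payoff(A_PAYOFF, "Low", q))
```

At q = 1/5 the expected payoffs from High and Low coincide, so neither pure price is strictly better and any mixture is a best response.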
2.5 Sequential Action and a Stackelberg Game
In the previous Section, we assumed that both firms set prices simultaneously without knowledge
of the other’s decision. We saw how this game has two equilibria and that it is difficult to predict
which one will be played. In this Section we will see how this problem is resolved if one player
moves first and chooses its preferred equilibrium; in the game of chicken, imagine that one
driver rips out her steering wheel and throws it out her window, while the other driver watches.
For the UltraPhone competition, assume that firm A has a slight lead in its product
development process and will release the product and announce its price first. The release and
price announcement dates are set in stone—e.g., firm A has committed to announcing its price
and sending its products to retailers on the first Monday of the next month, while firm B will
announce its price and send its products to retailers a week later. It is important to note that these
are the actual pricing dates. Either firm may publicize potential prices before those dates, but
retailers will only accept the prices given by firm A on the first Monday and firm B on the
second Monday. Also, assume that the one-week “lag” does not significantly affect the sales
assumptions introduced in Sections 2.1 and 2.4.
This sequential game has a special name,
In a Stackelberg Game, one player (the leader) moves first, while the other player (the follower)
moves second after observing the action of the leader.
The left side of Figure 2 shows the extensive-form representation of the game, with A’s choice
as the first split at the top of the diagram and B’s choice at the bottom.
Figure 2: Analysis of the UltraPhone game with loyal customers and A as a first mover
One can identify an equilibrium in a Stackelberg Game (a Stackelberg equilibrium) by
backward induction: work up from the bottom of the diagram. First, we see what firm B would
choose, given either action by firm A. As the second tree in Figure 2 shows, B would choose
Low if A chooses High and would choose High if A chooses Low. Given the payoffs from these
choices, it is optimal for A to choose Low, and the Stackelberg equilibrium is [A Low, B High],
with the highest payoff—$960 million—going to firm A. By moving first, firm A has guided the
game towards its preferred equilibrium.
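The backward-induction argument can be expressed as a short procedure: first solve the follower's choice in each subgame, then let the leader choose while anticipating those replies. The following is a minimal sketch using the Table 4 payoffs; the function names are hypothetical.

```python
# Terminal payoffs (A, B) in $100 millions, from Table 4.
PAYOFFS = {
    ("High", "High"): (8.0, 8.0), ("High", "Low"): (6.4, 9.6),
    ("Low", "High"): (9.6, 6.4), ("Low", "Low"): (6.0, 6.0),
}
ACTIONS = ("High", "Low")

def follower_reply(a_action):
    """Firm B's best action after observing firm A's action."""
    return max(ACTIONS, key=lambda b: PAYOFFS[(a_action, b)][1])

def stackelberg():
    """Backward induction: solve B's choice in each subgame, then A's choice."""
    reply = {a: follower_reply(a) for a in ACTIONS}
    a_star = max(ACTIONS, key=lambda a: PAYOFFS[(a, reply[a])][0])
    return a_star, reply[a_star], PAYOFFS[(a_star, reply[a_star])]

print({a: follower_reply(a) for a in ACTIONS})
print(stackelberg())
```

The first line printed corresponds to the pruned tree (B's reply to each action by A), and the second to the final outcome of the leader's choice.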
It is important to note, however, that [A High, B Low] is still a Nash equilibrium for the
full game. Suppose that firm B states before the crucial month arrives that no matter what firm A
decides, it will choose to price Low (again, this does not mean that firm B actually sets the price
to Low before the second Monday of the month, just that it claims it will). If firm A believes this
threat, then it will choose High in order to avoid the worst-case [A Low, B Low] outcome.
Therefore, another equilibrium may be [A High, B Low], as in the simultaneous game of Section
2.4.
But can firm A believe B’s threat? Is the threat credible? Remember that A moves first
and that B observes what A has done. Once A has chosen Low, it is not optimal for B to choose
Low as well; it would be better for B to abandon its threat and choose High. We say that B’s
choices (the second level in Figure 2) are subgames of the full game. When defining an
equilibrium strategy, it seems natural to require that the strategies each player chooses also lead
to an equilibrium in each subgame.
In a subgame perfect equilibrium, the strategies chosen for each subgame form a Nash
equilibrium of that subgame.
In this case, B’s strategy “price Low no matter what A does” does not seem to be part of a
reasonable equilibrium because B will not price Low if A prices Low first; an equilibrium that
includes B Low after A Low is not Nash. Therefore, we say that [A Low, B High] is a subgame
perfect equilibrium but that [A High, B Low] is not.
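The distinction between the two equilibria can be checked mechanically. In the sketch below, B's strategy is represented as a complete plan (a reply to each possible action by A); that representation, and the function names, are assumptions made for illustration. The payoffs are those of the sequential game above.

```python
# Payoffs (A, B) for the sequential game; A moves first, B observes and replies.
PAYOFFS = {
    ("High", "High"): (8.0, 8.0),
    ("High", "Low"): (6.4, 9.6),
    ("Low", "High"): (9.6, 6.4),
    ("Low", "Low"): (6.0, 6.0),
}
ACTIONS = ("High", "Low")


def outcome(a, b_plan):
    """Payoffs when A plays `a` and B follows the plan {A's action: B's reply}."""
    return PAYOFFS[(a, b_plan[a])]


def b_reply_is_optimal(node, b_plan):
    """True if B's planned reply at `node` maximizes B's payoff in that subgame."""
    planned = PAYOFFS[(node, b_plan[node])][1]
    return all(planned >= PAYOFFS[(node, alt)][1] for alt in ACTIONS)


def is_nash(a, b_plan):
    """Neither player gains by deviating, given the other's strategy."""
    a_ok = all(outcome(a, b_plan)[0] >= outcome(alt, b_plan)[0] for alt in ACTIONS)
    # B's payoff depends only on its reply at the node that is actually reached.
    return a_ok and b_reply_is_optimal(a, b_plan)


def is_subgame_perfect(a, b_plan):
    """Nash in the full game and optimal for B in every subgame, reached or not."""
    return is_nash(a, b_plan) and all(b_reply_is_optimal(n, b_plan) for n in ACTIONS)


threat = {"High": "Low", "Low": "Low"}       # "price Low no matter what A does"
best_reply = {"High": "Low", "Low": "High"}  # B's reply found by backward induction

print(is_nash("High", threat), is_subgame_perfect("High", threat))
print(is_nash("Low", best_reply), is_subgame_perfect("Low", best_reply))
```

The threat profile passes the Nash test only because the node after A Low is never reached; once every subgame is checked, it fails.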
2.6 An Airline Revenue Management Game
In the UltraPhone game, the two firms set prices, and we assume the firms can easily adjust
production quantities to accommodate demand. The dynamics of airline pricing often are quite
different. The capacity of seats on each flight is fixed over the short term, and prices for various
types of tickets are set in advance. As the departure of a flight approaches, the number of seats
available at each price is adjusted over time. This dynamic adjustment of seat inventory is
sometimes called revenue management; see Talluri [this volume] for additional information on
the fundamentals of revenue management as well as Barnes [this volume], Kimes et al. [this
volume], and Lieberman [this volume] for descriptions of applications.
Here, we will describe a game in which two airlines simultaneously practice revenue
management. As was true for the initial UltraPhone example, the game will be a simplified
version of reality. For example, we will assume that the single flight operated by each airline has
only one seat left for sale. As we will discuss below, however, the insights generated by the
model hold for realistic scenarios.
Suppose two airlines, X and Y, offer direct flights between the same origin and
destination, with departures and arrivals at similar times. We assume that each flight has just one
seat available. Both airlines sell two types of tickets: a low-fare ticket with an advance-purchase
requirement for $200 and a high-fare ticket for $500 that can be purchased at any time. A ticket
purchased at either fare is for the same physical product: a coach-class seat on one flight leg. The
advance-purchase restriction, however, segments the market and allows for price discrimination
between customers with different valuations of purchase-time flexibility. Given the advance-
purchase requirement, we assume that demand for low-fare tickets occurs before demand for
high-fare tickets. Customers who prefer a low fare and are willing to accept the purchase
restrictions will be called “low-fare customers.” Customers who prefer to purchase later, at the
higher price, are called “high-fare customers.”
Assume marginal costs are negligible, so that the airlines seek to maximize expected
revenue. To increase revenue, both airlines may establish a booking limit for low-fare tickets: a
restriction on the number of low-fare tickets that may be sold. (In our case, the booking limit is
either 0 or 1.) Once this booking limit is reached by ticket sales, the low fare is closed and only
high-fare sales are allowed. If a customer is denied a ticket by one airline, we assume that
customer will attempt to purchase a ticket from the other airline (we call these “overflow
passengers”). Therefore, both airlines are faced with a random initial demand as well as demand
from customers who are denied tickets by the other airline. Passengers denied a reservation by
both airlines are lost. We assume that for each airline, there is a 100 percent chance that a low-
fare customer will attempt to purchase a ticket followed by a 50 percent chance that a high-fare
customer will attempt to purchase a ticket. This information is summarized in Table 5.
Fare Class    Ticket Price    Demand at Airline X            Demand at Airline Y
Low           $200            1 (guaranteed)                 1 (guaranteed)
High          $500            Prob{0}=0.5, Prob{1}=0.5       Prob{0}=0.5, Prob{1}=0.5
Table 5: Customer segments for the revenue management game
The following is the order of events in the game:
1. Airlines establish booking limits. Each airline chooses to either close the low-fare class
(“Close”) and not sell a low-fare ticket, or keep it open (“Open”). In this revenue management
game, the decision whether to Close or remain Open is the airline’s strategy.
2. A low-fare passenger arrives at each airline and is accommodated (and $200 collected) if
the low-fare class is Open.
3. Each airline either receives a high-fare passenger or does not see any more demand, where
each outcome has probability 0.5. If a passenger arrives, and if the low-fare class has been
closed, the passenger is accommodated (and $500 collected).
4. A high-fare passenger who is not accommodated on the first-choice airline spills to the
alternate airline and is accommodated there if the alternate airline has closed its low-fare
class and has not sold its seat to its own high-fare customer.
                        Y chooses...
                        Open          Close
X chooses...   Open     200, 200      200, 375
               Close    375, 200      250, 250
Table 6: Expected payoffs for the airline revenue management game.
Entries in each cell are payoffs to X, Y.
Table 6 shows the payoffs for each combination of strategies. If both airlines choose to
keep the low-fare class Open, both book customers for $200. If both choose to Close, each has a
50 percent chance to book a high fare, so that each has an expected value of (0.5)*$500 = $250.
In the upper right cell, if X chooses to Open and Y chooses to Close, then X books a low-fare
customer for $200 while Y has a chance to book a high-fare customer. That high-fare customer
may either be a customer who originally desired a flight on Y (who arrives with probability 0.5)
or a spillover from X (who also arrives, independently, with probability 0.5). Therefore, Y’s
expected revenue is [1- (0.5)(0.5)]*$500 = $375. The lower left cell shows the reverse situation.
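By examining Table 6, we see that the only Nash equilibrium is [X Close, Y Close]. Close is
the better choice whatever the rival does: against a rival that is Open it earns $375 rather than
$200, and against a rival that has Closed it earns $250 rather than $200. An airline that chooses
Open forgoes the opportunity to book a high-fare customer.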
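The entries of Table 6 and the equilibrium can be reproduced by enumerating the four possible high-fare demand outcomes and following the order of events listed earlier. The sketch below is illustrative only; the function and variable names are hypothetical, and it hard-codes the one-seat, two-fare assumptions of this example.

```python
from itertools import product

LOW_FARE, HIGH_FARE = 200, 500
P_HIGH = 0.5  # probability that a high-fare customer arrives at each airline
STRATEGIES = ("Open", "Close")


def expected_payoffs(x_strategy, y_strategy):
    """Expected revenue (X, Y) for one seat per airline, following steps 1-4."""
    strategies = (x_strategy, y_strategy)
    expected = [0.0, 0.0]
    # Enumerate whether a high-fare customer arrives at X and at Y.
    for arrivals in product((False, True), repeat=2):
        prob = 1.0
        for arrived in arrivals:
            prob *= P_HIGH if arrived else 1 - P_HIGH
        revenue = [0, 0]
        seat_free = [True, True]
        # Step 2: the guaranteed low-fare customer books if the class is Open.
        for i in range(2):
            if strategies[i] == "Open":
                revenue[i] += LOW_FARE
                seat_free[i] = False
        # Step 3: each airline's own high-fare customer books if a seat remains.
        spills = [False, False]
        for i in range(2):
            if arrivals[i]:
                if seat_free[i]:
                    revenue[i] += HIGH_FARE
                    seat_free[i] = False
                else:
                    spills[i] = True
        # Step 4: a rejected high-fare customer tries the other airline.
        for i in range(2):
            other = 1 - i
            if spills[i] and seat_free[other]:
                revenue[other] += HIGH_FARE
                seat_free[other] = False
        for i in range(2):
            expected[i] += prob * revenue[i]
    return tuple(expected)


def pure_nash_equilibria(table):
    """Strategy pairs from which neither airline gains by deviating alone."""
    equilibria = []
    for x, y in product(STRATEGIES, repeat=2):
        x_ok = all(table[(alt, y)][0] <= table[(x, y)][0] for alt in STRATEGIES)
        y_ok = all(table[(x, alt)][1] <= table[(x, y)][1] for alt in STRATEGIES)
        if x_ok and y_ok:
            equilibria.append((x, y))
    return equilibria


table = {(x, y): expected_payoffs(x, y) for x, y in product(STRATEGIES, repeat=2)}
for cell, payoff in table.items():
    print(cell, payoff)
print("Pure-strategy Nash equilibria:", pure_nash_equilibria(table))
```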
Note that this game is not precisely the same as the UltraPhone game/Prisoner’s Dilemma
discussed in Section 2.1; the airlines could have done even worse with [X Open, Y Open]. The
competitive equilibrium, however, is inferior in terms of aggregate revenue to the cooperative
solution, just as it is in the Prisoner’s Dilemma. In Table 6, the highest total revenue across both
airlines—$575—is obtained with strategies [X Open, Y Close] or [X Close, Y Open]. The Nash
equilibrium has an expected total revenue of $500.
There is another interesting comparison to make with the UltraPhone game. In that game,
competition drove prices down. In this game, competition leads the airlines from an optimal
coordinated solution [X Open, Y Close] or [X Close, Y Open] to the Nash equilibrium [X Close,
Y Close]. Therefore, under the coordinated solution, one low-fare ticket is sold and one high-fare
ticket may be sold. Under the Nash equilibrium, only high-fare tickets may be sold. Netessine
and Shumsky (2005) analyze a similar game with two customer classes and any number of
available seats, and they confirm that this general insight continues to hold; under competition,
more seats are reserved for high-fare customers, while the number of tickets sold (the load
factor) declines, as compared to the monopoly solution. The use of seat inventory control by the
airlines and the low-to-high pattern of customer valuations inverts the Bertrand and Cournot
results; competition in the revenue management game raises the average price paid for a ticket as
compared to the average price charged by a monopolist.
In practice, however, airlines may compete on multiple routes and offer more than two fare
classes, and they do not make a single revenue management decision simultaneously. Jiang and Pang
(2007) extend this model to airline competition over a network. Dudey (1992) and Martinez-de-
Albeniz and Talluri (2010) analyze the dynamic revenue management problem, in which the
airlines adjust booking limits dynamically over time. d'Huart (2010) uses a computer simulation
to compare monopoly and oligopoly airline revenue management behavior. He simulates up to 4
airlines with 3 competing flights offered by each airline and six fare classes on each flight. The
simulation also incorporates many of the forecasting and dynamic revenue management
heuristics used by the airlines. The results of the simulation correspond to our models’
predictions: fewer passengers are booked under competition, but with a higher average fare.
Of course, the airlines may prefer to collaborate, agree to one of the monopoly solutions,
and then split the larger pot so that both airlines would come out ahead of the Nash outcome. In
general, such collusion can increase overall profits, while reducing consumer surplus. Such
price-fixing is illegal in the United States, Canada, the European Union, and many other
countries. Occasionally airlines do attempt to collude and get caught, e.g., British Airways and
Virgin Atlantic conspired to simultaneously raise their prices on competing routes by adding fuel
surcharges to their tickets. Virgin Atlantic reported on the plot to the U.S. Department of Justice,
and British Airways paid a large fine (Associated Press, 2007).
In the absence of such legal risks, a fundamental question is whether such collusion is
stable when airlines do not have enforceable agreements to cooperate as well as explicit methods
for distributing the gains from cooperation. As we have seen from the examples above, each
player has an incentive to deviate from the monopoly solution to increase its own profits. In
Section 3 we will discuss how repeated play can produce outcomes other than the single-play
Nash equilibrium described above.
In certain cases in the United States, airlines have received permission from the
Department of Justice to coordinate pricing and split the profits. Northwest and KLM have had
such an agreement since 1993, while United, Lufthansa, Continental and Air Canada began
coordinating pricing and revenue management on their transatlantic flights in 2009
(BTNonline.com, 2008; DOT, 2009). The ability of such joint ventures to sustain themselves
depends upon whether each participant receives a net benefit from joining the coalition.
Cooperative game theory focuses on how the value created by the coalition may be distributed
among the participants, and how this distribution affects coalition stability. Such joint ventures,
however, are the exception rather than the rule in the airline business and elsewhere, and
therefore we will continue to focus on non-cooperative game theory.
3. Dynamic Pricing Games
Pricing competition can be viewed from either a static or a dynamic perspective. Dynamic
models incorporate the dimension of time and recognize that competitive decisions do not
necessarily remain fixed. Viewing competitive pricing strategies from a dynamic perspective
can provide richer insights, and for some questions more accurate answers, than when using
static models. In this Section we will first explain how dynamic models differ from the models
presented in the last Section. We will then describe a variety of dynamic models, and we will
see how the price equilibria predicted by the models can help firms to understand markets and set
prices. Finally, and perhaps most significantly, we will show how pricing data collected from
the field has been used to test the predictions made by game theory models.
First, we make clear what we mean by a dynamic game:
If a game is played exactly once, it is a single-period, or a one-shot, or a static game. If a
pricing game is played more than once, it is a dynamic or a multi-period game.
In a dynamic game, decision variables in each period are sometimes called control variables.
Therefore, a firm’s strategy is a set of values for the control variables.
Dynamic competitive pricing situations can be analyzed in terms of either continuous or discrete
time. Generally speaking, pricing data from the market is discrete in nature (e.g., weekly price
changes at stores or the availability of weekly, store-level scanner data), and even continuous-
time models, which are an abstraction, are discretized for estimation of model parameters from
market data. Typically, dynamic models are treated as multi-period games because critical
variables—sales, market share, and so forth—are assumed to change over time based on the
dynamics in the marketplace. Any dynamic model of pricing competition must consider the
impact of competitors’ pricing strategies on the dynamic process governing changes in sales or
market-share variables.
In this section we will describe a variety of dynamic pricing policies for competing
firms. See Aviv and Vulcano [this volume] for more details on dynamic policies that do not take
competitors’ strategies directly into account. That chapter also describes models that examine, in
more detail, the effects of strategic customers. Other related chapters include Ramakrishnan
[this volume], Blattberg and Briesch [this volume], and Kaya and Ozer [this volume]. This last
chapter is closely related to the material in Sections 2.5 and 3.5, for it also describes Stackelberg
games, while examining specific contracts between wholesalers and retailers (e.g., buyback,
revenue sharing, etc.), the effects of asymmetric forecast information, and the design of contracts
to elicit accurate information from supply chain partners. For additional discussion of issues
related to behavioral game theory (discussed in Section 3.2, below), see Ozer and Zheng [this
volume].
3.1 Effective Competitive Strategies in Repeated Games
We have seen in Section 2.1 how a Prisoner’s Dilemma can emerge as the non-cooperative Nash
equilibrium strategy in a static, one-period game. In the real world, however, rarely do we see
one-period interactions. For example, firms rarely consider competition for one week and set
prices simply for that week, keeping them at the same level during all weeks into the future.
Firms have a continuous interaction with each other for many weeks in a row, and this
continuous interaction gives rise to repeated games.
A repeated game is a dynamic game in which past actions do not influence the payoffs or set of
feasible actions in the current period, i.e., there is no explicit link between the periods.
In a repeated game setting, firms would, of course, be more profitable if they could avoid the
Prisoner’s Dilemma and move to a more cooperative equilibrium. Therefore, formulating the
pricing game as a multi-period problem may identify strategies under which firms achieve a
cooperative equilibrium, where competing firms may not lower prices as much as they would
have otherwise in a one-shot Prisoner’s Dilemma. As another example, in an information sharing
context, Özer, Zheng, and Chen (2009) discuss the importance of trust and determine that
repeated games enhance trust and cooperation among players. Such cooperative equilibria may
also occur if the game is not guaranteed to be repeated, but may be repeated with some
probability.
A robust approach in such contexts is the “tit-for-tat” strategy, where one firm begins
with a cooperative pricing strategy and subsequently copies the competing firm’s previous move.
This strategy is probably the best known and most often discussed rule for playing the Prisoner’s
Dilemma in a repeated setting, as illustrated by Axelrod and Hamilton (1981), Axelrod (1984),
and also discussed in Dixit and Nalebuff (1991). In his experiment, Axelrod set up a computer
tournament in which pairs of contestants repeated the two-person prisoner’s dilemma game 200
times. He invited contestants to submit strategies, and paired each entry with each other entry.
For each move, players received three points for mutual cooperation and one point for mutual
defection. In cases where one player defected while the other cooperated, the former received
five points while the latter received nothing. Fourteen of the entries were from people who had
published research articles on game theory or the Prisoner’s Dilemma. The tit-for-tat strategy
(submitted by Anatol Rapoport, a mathematics professor at the University of Toronto), the
simplest of all submitted programs, won the tournament. Axelrod repeated the tournament with
more participants and once again, Rapoport’s tit-for-tat was the winning strategy (Dixit and
Nalebuff 1991).
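The mechanics of such a tournament match can be sketched as follows, using the point values and the 200-round length described above. The code is an illustration rather than a reconstruction of Axelrod's software: the function names are hypothetical, and "always defect" is included only as an assumed opponent for comparison.

```python
# Points per move: (player 1, player 2). "C" = cooperate, "D" = defect.
POINTS = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("D", "D"): (1, 1),  # mutual defection
    ("D", "C"): (5, 0),  # defector gets five, cooperator gets nothing
    ("C", "D"): (0, 5),
}


def tit_for_tat(own_history, rival_history):
    """Cooperate first, then copy the rival's previous move."""
    return "C" if not rival_history else rival_history[-1]


def always_defect(own_history, rival_history):
    """Hypothetical comparison strategy: defect in every round."""
    return "D"


def play(strategy_1, strategy_2, rounds=200):
    """Play a repeated Prisoner's Dilemma and return the two total scores."""
    history_1, history_2 = [], []
    score_1 = score_2 = 0
    for _ in range(rounds):
        move_1 = strategy_1(history_1, history_2)
        move_2 = strategy_2(history_2, history_1)
        points_1, points_2 = POINTS[(move_1, move_2)]
        score_1 += points_1
        score_2 += points_2
        history_1.append(move_1)
        history_2.append(move_2)
    return score_1, score_2


print("tit-for-tat vs tit-for-tat:    ", play(tit_for_tat, tit_for_tat))
print("tit-for-tat vs always defect:  ", play(tit_for_tat, always_defect))
print("always defect vs always defect:", play(always_defect, always_defect))
```

In this small comparison tit-for-tat loses narrowly to a pure defector in a single match, giving up only the first round, while two tit-for-tat players earn far more than two defectors. The tournament ranking depends on total points across all pairings.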
Axelrod’s research emphasized the importance of minimizing echo effects in an
environment of mutual power. Although a single defection may be successful when analyzed for
its direct effect, it also can set off a long string of recriminations and counter-recriminations, and,
when it does, both sides suffer. In such an analysis, tit-for-tat may be seen as a punishment
strategy (sometimes called a trigger strategy) that can produce a supportive collusive pricing
strategy in a non-cooperative game. Lal’s (1990a) analysis of price promotions over time in a
competitive environment suggests that price promotions between national firms (say Coke and
Pepsi) can be interpreted as a long run strategy by these firms to defend market shares from
possible encroachments by store brands. Some anecdotal evidence from the beverage industry in
support of such collusive strategies is available. For example, a Consumer Reports article (Lal,
1990b) noted that in the soft-drink market, where Coca Cola and PepsiCo account for two-thirds
of all sales, “[s]urveys have shown that many soda drinkers tend to buy whichever brand is on
special. Under the Coke and Pepsi bottlers’ calendar marketing arrangements, a store agrees to
feature only one brand at any given time. Coke bottlers happened to have signed such
agreements for 26 weeks and Pepsi bottlers the other 26 weeks!” In other markets, such as
ketchup, cereals, and cookies (Lal, 1990b), similar arrangements can be found. Rao (1991)
provides further support for the above finding and shows that when the competition is
asymmetric, the national brand promotes to ensure that the private label does not try to attract
consumers away from the national brand.
3.2 Behavioral Considerations in Games
We now step back and ask if our game theory models provide us with a sufficiently accurate
representation of actual behavior. Behavioral considerations can be important for both static
games and for repeated games, where relationships develop between players, leading to subtle
and important consequences.
The assumptions we laid out at the beginning of Section 2 – rational players and the
infinite belief hierarchy – can sometimes seem far-fetched when we consider real human
interactions. Given that the underlying assumptions may not be satisfied, are the predicted
results of the games reasonable? When describing research on human behavior in games,
Camerer (2003) states,
“…game theory students often ask: ‘This theory is interesting…but do people
actually play this way?’ The answer, not surprisingly, is mixed. There are no
interesting games in which subjects reach a predicted equilibrium immediately. And
there are no games so complicated that subjects do not converge in the direction of
equilibrium (perhaps quite close to it) with enough experience in the lab.”
Camerer and his colleagues have found that to accurately predict human behavior in games, the
mathematics of game theory must often be merged with theories of behavior based on models of
social utility (e.g., the concept of fairness), limitations on reasoning, and learning over time. The
result of this merger is the fast-growing field of behavioral economics and, more specifically,
behavioral game theory.
Behavioral game theory can shed light on a problem we saw in Section 2.4: when there is
more than one equilibrium in a game, it is difficult to predict player behavior. Psychologists and
behavioral game theorists have tested how player behavior evolves over many repetitions of such
games. According to Binmore (2007), over repeated trials of games with multiple equilibria
played in a laboratory, the players’ behavior evolves towards particular equilibria because of a
combination of chance and “historical events.” These events include the particular roles ascribed
to each player in the game (e.g., an employer and a worker, equal partners, etc.), the social norms
that the subjects bring into the laboratory, and, most importantly, the competitive strategies that
each player has experienced during previous trials of the game. Binmore also describes how, by
manipulating these historical events, players can be directed to non-equilibrium “focal points,”
although when left to their own devices players inevitably move back to a Nash equilibrium.
Behavioral game theorists have also conducted laboratory tests of the specific pricing
games that we describe in this chapter. For example, Dufwenberg and Gneezy (2000) simulated
the Bertrand game of Section 2.1 by asking students in groups of 2, 3 or 4 to bid a number from
2 to 100. The student with the lowest bid in the group won a prize that was proportional to the
size of that student’s bid. Therefore, the experiment is equivalent to the one-period Bertrand
pricing game of Section 2.1, in that the lowest price (or ‘bid’ in the experiment) captures the
entire market. This bid-reward process was repeated 10 times, with the participants in each
group randomized between each of the 10 rounds to prevent collusive behavior - as described in
the previous Section of this Chapter - that may be generated in repeated games with the same
players.
No matter how many competitors, the Nash equilibrium in this game is for all players to
bid 2, which is equivalent to pricing at marginal cost in the Bertrand game. Dufwenberg and
Gneezy found that when their experiments involved two players, however, the bids were
consistently higher than 2 (over two rounds of ten trials each, the average bid was 38, and the
average winning bid was 26). For larger groups of 3 and 4, the initial rounds also saw bids
substantially higher than the Nash equilibrium value of 2, but by the tenth round the bids did
converge towards 2.
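The claim that bidding 2 is the equilibrium can be checked by brute force. The sketch below rests on two assumptions that are not stated in the text: the prize equals the winning bid (a proportionality constant of 1), and tied lowest bidders split the prize equally. It tests only symmetric profiles, in which every player submits the same bid, and all names are hypothetical.

```python
BIDS = range(2, 101)  # allowed bids: integers from 2 to 100


def payoff(my_bid, other_bids):
    """Prize for one player; assumes ties at the lowest bid split the prize."""
    lowest = min([my_bid, *other_bids])
    if my_bid > lowest:
        return 0.0
    winners = 1 + sum(1 for bid in other_bids if bid == lowest)
    return my_bid / winners


def is_symmetric_equilibrium(bid, n_players):
    """True if no player gains by deviating when everyone else bids `bid`."""
    others = [bid] * (n_players - 1)
    current = payoff(bid, others)
    return all(payoff(deviation, others) <= current for deviation in BIDS)


for n_players in (2, 3, 4):
    equilibria = [b for b in BIDS if is_symmetric_equilibrium(b, n_players)]
    print(n_players, "players:", equilibria)
```

Under these assumptions, any common bid above 2 can be undercut profitably by one unit, which is the same logic that drives Bertrand prices down to marginal cost.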
Camerer (2003) describes hundreds of experiments that demonstrate how actual player
behavior can deviate from the equilibrium behavior predicted by classical, mathematical game
theory. He also shows how behavioral economists have gone a step further, to explicitly test the
accuracy of behavioral explanations for these deviations, such as bounded rationality or a
preference for fairness.
A great majority of experiments conducted by behavioral game theorists involve
experimental subjects under controlled conditions. Of course, actual pricing decisions are not
made by individuals in laboratories, but by individuals or groups of people in firms, who are
subject to a variety of (sometimes conflicting) financial and social incentives. Armstrong and
Huck (2010) discuss the extent to which these experiments may be used to extrapolate to firm
behavior. An alternative to laboratory experiments is an empirical approach, to observe actual
pricing behavior by firms and compare those prices with predictions made by game theory
models. We will return to the question of game theory’s applicability, and see an example of this
empirical approach, in Section 3.6. In addition, Ozer and Zheng [this volume] discuss many
issues related to behavioral game theory.
3.3 State-Dependent Dynamic Pricing Games
State-dependent dynamic pricing games are a generalization of repeated games, in which the
current period payoff may be a function of the history of the game.
In a dynamic game, a state variable is a quantity that (i) defines the state of nature, (ii) depends
on the decision or control variables, and (iii) impacts future payoffs (e.g., sales or profitability).
In a state-dependent dynamic game, there is an explicit link between the periods, and past
actions influence the payoffs in the current period.
In other words, pricing actions today will have an impact on future sales as well as sales today.
We are taught early in life about the necessity to plan ahead. Present decisions affect future
events by making certain opportunities available, precluding others, and altering the costs of still
others. If present decisions do not affect future opportunities, the planning problem is trivial; one
need only make the best decision for the present. The rest of this section deals with solving
dynamic pricing problems where the periods are physically linked while considering the impact
of competition.
In their pioneering work, Rao and Bass (1985) consider dynamic pricing under
competition where the dynamics relate primarily to cost reductions through experience curve
effects. This implies that firms realize lower product costs with increases in cumulative
production. It is shown that with competition, dynamic pricing strategies dominate myopic
pricing strategies.
Table 7 illustrates the impact of dynamic pricing under competition relative to a
static, two-period competitive game.
                              Static    Dynamic
Period 1 Equilibrium Price    $2.00     $1.85
Period 2 Equilibrium Price    $1.90     $1.80
Table 7: Static vs. Dynamic Pricing Example
In other words, while a static game may suggest an equilibrium price of, say, $2.00 per
unit for a widget in the first period and $1.90 in the second period, a dynamic game would
recommend a price that is less than $2.00 (say, $1.85) in period 1 and even lower than $1.90
(say, $1.80) in period 2. This is because a lower price in the first period would increase sales
in that initial period, thus increasing cumulative production, which, in turn, would reduce the
marginal cost in the second period by more than the amount realized under a static game. This
allows for an even lower price in the second period. In such a setting, prices in each period are
the control (or decision) variables, and cumulative sales at the beginning of each period would be
considered a state variable that affects future marginal cost and, hence, future profitability.
Another pricing challenge involving state dependence is the management of reference
prices. A reference price is an anchoring level formed by customers, based on the pricing
environment. Consider P&G’s value pricing strategy discussed earlier in the chapter. In such a
setting, given the corresponding stable cost to a retailer from the manufacturer, should competing
retailers also employ an everyday low pricing strategy or should they follow a High-Low pricing
pattern? In addition, under what conditions would a High-Low pricing policy be optimal? The
answer may depend upon the impact reference prices have on customer purchase behavior. One
process for reference-price formation is the exponentially smoothed averaging of past observed
prices, where more recent prices are weighted more than less recent prices (Winer 1986):
$r_t = \alpha r_{t-1} + (1 - \alpha) p_{t-1}, \quad 0 \le \alpha < 1,$
where $r_t$ is the reference price in period t, $r_{t-1}$ is the reference price in the previous
period, and $p_{t-1}$ is the retail price in the previous period.
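A short sketch shows how this smoothing rule evolves over several periods. The function name, the example prices, and the need to supply a starting reference price are assumptions made for illustration; the update itself is the weighted average of the previous reference price and the previous retail price, with memory parameter alpha.

```python
def reference_prices(prices, alpha, initial_reference):
    """Return reference prices for successive periods under exponential smoothing.

    prices: observed retail prices p_1, ..., p_T
    alpha: memory parameter, 0 <= alpha < 1
    initial_reference: assumed reference price in period 1
    The result has T + 1 entries: r_1, r_2, ..., r_{T+1}.
    """
    if not 0 <= alpha < 1:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    references = [initial_reference]
    for previous_price in prices:
        # Weighted average of last period's reference price and retail price.
        previous_reference = references[-1]
        references.append(alpha * previous_reference + (1 - alpha) * previous_price)
    return references


# Hypothetical High-Low price path: a regular price of 2.00 with one deep discount.
prices = [2.0, 2.0, 1.0, 2.0]
print(reference_prices(prices, alpha=0.0, initial_reference=2.0))
print(reference_prices(prices, alpha=0.5, initial_reference=2.0))
```

With alpha set to 0 the reference price simply tracks last period's price; with a larger alpha a one-period discount lowers the reference price by less, and the effect lingers for more periods.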
If $\alpha$ = 0, the reference price in period t equals the observed price in period t-1. As $\alpha$
increases, $r_t$ becomes increasingly dependent on past prices. Thus, $\alpha$ can be regarded as a
memory parameter, with $\alpha$ = 0 corresponding to a one-period memory. Consider a group of
frequently purchased consumer brands that are partial substitutes. Research (Winer 1986;