
Hindawi Publishing Corporation, Journal of Complex Systems, Volume 2013, Article ID 390454, 6 pages, http://dx.doi.org/10.1155/2013/390454

Research Article
Bluffing as a Rational Strategy in a Simple Poker-Like Game Model

Andrea Guazzini1 and Daniele Vilone2

1 Department of Education and Psychology, University of Florence, Via San Salvi 12, Building 26, 50135 Florence, Italy
2 Statistical Materials Modeling Laboratory (CNR-IENI), Via R. Cozzi 53, 20125 Milano, Italy

Correspondence should be addressed to Andrea Guazzini andreaguazzinigmailcom

Received 28 January 2013; Accepted 8 May 2013

Academic Editor: Fuwen Yang

Copyright © 2013 A. Guazzini and D. Vilone. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We present a simple adaptive learning model of a poker-like game, by means of which we show how a bluffing strategy emerges very naturally and can also be rational and evolutionarily stable. Despite their very simple learning algorithms, agents learn to bluff, and the player who bluffs most is usually the winner.

1. Introduction

Among the concepts formulated and elaborated by cognitive psychology in the last century, the study of how problem-solving strategies develop has meaningfully improved our comprehension of mental activity and its relations with the cerebral circuits.

At this level of description, the concept of cognitive process is defined as the interconnected performance of some elementary cognitive activities that operate on and affect mental contents, representing the fundamental issue in bridging, at a theoretical level, the superior cognitive functions and human behaviour [1, 2]. Such a concept is used in a wider sense to mean the act of knowing, and may be interpreted in a social or cultural sense to describe the emergent development of knowledge, concepts, or strategies.

Cognitive psychologists argue that the mind can be understood in terms of information processing, especially when processes such as abstraction, categorization, knowledge, expertise, or learning are involved [3–5].

The concept of cognitive process is defined both as the result of the parallel elaboration of several well-defined and functionally independent neural modules, and as a sort of “software” able to optimize the integration among different cognitive functions by adapting to the different environmental/informational circumstances [6–8].

The target of the present paper is to formulate a cognitive model of a poker-like game in which the players are able to develop strategies by learning from experience. Such a task, better known as problem solving, allows us to investigate effectively the relation and coupling between the dynamics of the cognitive process and the environment.

Moreover, the study of poker is of great and general interest in complex systems because it is closely related to sociophysics, decision theory, and behaviour evolution. Indeed, the study of human behaviour and, more generally, of social phenomena has been approached in recent years with the tools of complex systems physics [9]. Poker-like games provide a very good instance of a strategic dilemma, where agents must optimize their income under conditions of imperfect information (see Section 2), or where it is not clear which is the true optimal strategy, a case which is very common in real human interactions [10, 11]. Understanding the mechanisms which underlie poker is then very useful for comprehending human psychology and for getting hints on how to approach more complicated models of human society.

2. Poker-Like Games

Poker is an interesting test bed for artificial intelligence research [12–15]. It is a game of imperfect information, where multiple competing agents must deal with probabilistic knowledge, risk assessment, and possible deception, not unlike decisions made in the real world. So-called “opponent modelling” is another difficult problem in decision-making applications, and it is essential to achieving high performance in poker. Moreover, poker has a rich history of study in several academic fields. Economists and mathematicians have applied a variety of analytical techniques to poker-related problems [16–18]. For example, the earliest investigations in game theory, by luminaries such as John von Neumann and John Nash, used simplified poker to illustrate the fundamental principles.

There is an important difference between board games and popular card games like bridge and poker. In board games, players have complete knowledge of the entire game state, since everything is visible to all participants. In contrast, bridge and poker involve imperfect information, since the other players’ cards are not known.

From a computational point of view, it is important to distinguish the lack of information from the possibility of chance moves. The former involves uncertainty about the current state of the world, in particular situations where different players have access to different information. The latter involves only uncertainty about the future, uncertainty which is resolved as soon as the future materializes. Both perfect and imperfect information games may involve an element of chance; examples of games from all four categories are shown in Table 1.

The presence of chance elements does not require major changes to the computational techniques used to solve a game. In fact, the cost of solving a perfect information game with chance moves is not substantially greater than that of solving a game with no chance moves. By contrast, the introduction of imperfect information increases the complexity of the problem.

Due to the complexity (both conceptual and algorithmic) of dealing with imperfect information games, this problem was largely ignored at the computational level until the introduction of the concept of randomized strategies.

Once randomized strategies are allowed, the existence of “optimal strategies” in imperfect information games can be proved. In particular, this means that there exists an optimal randomized strategy for poker, in the same way as there exists an optimal deterministic strategy for chess. Indeed, Kuhn showed for a simplified poker game that the optimal strategy does use randomization [19].

The optimal strategy has several advantages: the player cannot do better than this strategy when playing against a good opponent; furthermore, the player does not do worse even if his strategy is revealed to his opponent, that is, the opponent gains no advantage from figuring out the first player’s strategy.

Another interesting result of such research is the existence of an optimal strategy for the gambler in the poker game. As first observed in a simple poker-like game by Kuhn [19], behaviors such as bluffing, which seem to arise from the psychological makeup of human players, are actually game-theoretically optimal.

One of the earliest and most thorough investigations of poker appears in the classical treatise on game theory, “Theory of Games and Economic Behavior” by von Neumann and Morgenstern [20], where a large section was devoted to the formal analysis of “bluffing” in several simplified variants of a two-person poker game with either symmetric or asymmetric information.

Table 1

             Perfect information    Imperfect information
No chance    Chess                  Inspection game
Chance       Monopoly               Poker

Indeed, the general considerations concerning poker and the mathematical discussions of the different versions of the game were carried out by von Neumann as early as 1926. Recognizing that “bluffing” in poker “is unquestionably practiced by all experienced players,” von Neumann and Morgenstern identified two reasons for bluffing: “The first is the desire to give a ‘false’ impression of strength in ‘real’ weakness; the second is the desire to give a ‘false’ impression of weakness in ‘real’ strength” [20].

Solutions to these simplified poker-like games, as well as to a large class of both zero-sum and nonzero-sum games, were unified by the concept of mixed strategy: a probability distribution over the player’s set of actions. The importance of mixed strategies to the theory of games and its applications to the social and behavioral sciences stems from the fact that, for many interactive decision processes, there can be no Nash equilibria in pure strategies.

Using randomization and adaptive learning as key concepts to model the cognitive processes within a computational scaffolding, we believe that this area of research is more likely to produce insights about superior cognitive strategies because of their intrinsic structure. Finally, by comparing them with real human strategies, it is possible both to investigate the role of environmental factors in the development of cognitive strategies and to validate some theoretical psychological assumptions.

2.1. Summary. The target of this paper is to show how bluffing strategies can arise naturally as a mathematical property of a very simple model, without any reference to psychological assessments. For this reason, we present a simple model of a poker-like game with only two players and only two possible strategies, folding and calling, which each agent chooses simultaneously. Such an oversimplified game, as we will see, allows us to capture the fundamental mechanisms underlying the phenomenon of bluffing. The fact that bluffing emerges naturally even in a very simple version of the game suggests that such a strategy is perfectly rational and can be mathematically characterized.

3. The Model

The model we are going to define and analyse is probably one of the greatest possible simplifications of a game of chance with imperfect information.

Here we have two players. At the beginning of each hand of the game, they each put one coin in as the entry pot. Then they each pick a “card” from a pack: each card has an integer value between 0 and $N-1$ (i.e., there are $N$ cards overall). At this point, according to the value of their card, the players decide whether to call or to fold. If both players call, they each put another coin in the pot, and whoever holds the highest card wins: the winner gets the entire pot (four coins). If one of the players folds, the “caller” wins and gets the entry pot (two coins). Finally, if nobody calls, both players take back the coin they had put in as the entry pot. Mathematically, when player $i$ holds the card of value $n$, he decides to call according to the probability distribution $P_i(n)$, with of course $i = 1, 2$ and $n \in \{0, 1, \ldots, N-1\}$. After every hand, both players update their strategy. More precisely, if a player folds, his calling probability is left unchanged; if agent $i$, holding the card $n$, calls and wins (because he has the highest card or because the opponent folds), the probability that he calls holding the card $n$ will change in this way:

$$P_i(n) \longrightarrow P_i(n) + \mu_i \left[1 - P_i(n)\right]. \tag{1}$$

Analogously, if agent $i$ loses (i.e., if he calls but the opponent has a higher card than his), the probability $P_i(n)$ will instead change in the following way:

$$P_i(n) \longrightarrow \mu_i P_i(n). \tag{2}$$

In (1) and (2) the coefficient $\mu_i$ is the learning factor (LF), which can also be seen as a sort of risk propensity of player $i$. The LFs of the players are set at the beginning of the game and never change. Moreover, $\mu_i$ can assume a value between 0 (no risk propensity at all) and 1 (maximum possible risk propensity): indeed, a player with $\mu_i = 0$ and the card $n$ does not increase $P_i(n)$ even when he wins, and sets $P_i(n) = 0$ as soon as he loses; instead, with $\mu_i = 1$ he sets $P_i(n) = 1$ when he wins but does not decrease $P_i(n)$ if he loses. Finally, it is easy to notice that (1) and (2) ensure that $P_i(n)$ will always stay in the interval $[0, 1]$.
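As an illustration of the rules just described, the payoff of a single hand and the update rules (1) and (2) can be sketched in a few lines of Python. This is only a minimal sketch, not the code used for the simulations in this paper: the function names `hand_payoff` and `update_call_probability` are hypothetical, and the two cards are assumed to be distinct (both drawn from the same pack).

```python
def hand_payoff(call_1, call_2, card_1, card_2):
    """Net coins won by player 1 in one hand (player 2 gets the opposite).

    Assumes card_1 != card_2, i.e. both cards come from the same pack.
    """
    if call_1 and call_2:
        # Showdown: each player has put in two coins, the higher card takes all four.
        return 2 if card_1 > card_2 else -2
    if call_1:
        return 1   # player 2 folded: player 1 takes the two-coin entry pot
    if call_2:
        return -1  # player 1 folded: player 2 takes the two-coin entry pot
    return 0       # nobody called: both take their coin back


def update_call_probability(p, mu, called, won):
    """Apply rule (1) after a winning call and rule (2) after a losing call."""
    if not called:
        return p                    # a folding player does not update
    if won:
        return p + mu * (1.0 - p)   # rule (1)
    return mu * p                   # rule (2)
```

Since rule (1) can be rewritten as $(1 - \mu_i) P_i(n) + \mu_i$, both rules map a probability in $[0, 1]$ to another value in $[0, 1]$ whenever $0 \le \mu_i \le 1$, in agreement with the remark above.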

3.1. Numerical Results. In this section we present the most remarkable results of the simulations of the simple model defined above.

First of all, for simplicity we set the LF $\mu_1$ of “player 1” equal to 0.5, and then we checked the dynamics by varying $\mu_2$: indeed, it is the difference between the LFs which essentially determines the main features of the dynamics, as we saw in several simulations. In particular, we can distinguish three cases: $\mu_2 < \mu_1$, $\mu_2 = \mu_1$, and $\mu_2 > \mu_1$.

3.1.1. Case $\mu_2 < \mu_1$. In Figures 1 and 2 the behaviour of the money of both players is shown as a function of time, where the time unit is a single hand of the game: we set $N = 25$, with $\mu_2 = 0.3$ and $\mu_2 = 0.48$, respectively, and the results are averaged over $10^4$ and $10^5$ iterations, respectively; in both cases we let the agents play $10^4$ hands. For simplicity, we considered players with an infinite amount of money available, and we set the initial amount to zero. Additionally, the initial calling distributions $P_i(n)$ are picked randomly for each $n$.
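A single match with the setup described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `simulate_match` and its parameters are hypothetical, the two cards are assumed to be drawn without replacement from the same pack, and the initial calling probabilities are assumed to be uniform in $[0, 1)$.

```python
import random


def simulate_match(mu_1, mu_2, n_cards=25, n_hands=10_000, seed=0):
    """Play one match and return (money, probs) for the two players."""
    rng = random.Random(seed)
    mus = (mu_1, mu_2)
    # Random initial calling distributions P_i(n) (assumed uniform in [0, 1)).
    probs = [[rng.random() for _ in range(n_cards)] for _ in range(2)]
    money = [0, 0]  # initial amount set to zero; negative values are allowed

    for _ in range(n_hands):
        cards = rng.sample(range(n_cards), 2)  # two distinct cards
        calls = [rng.random() < probs[i][cards[i]] for i in range(2)]

        if calls[0] and calls[1]:
            winner = 0 if cards[0] > cards[1] else 1
            stake = 2  # showdown: the loser loses both coins
        elif calls[0] or calls[1]:
            winner = 0 if calls[0] else 1
            stake = 1  # the caller takes the entry pot
        else:
            continue   # both fold: coins are returned, nothing is updated

        money[winner] += stake
        money[1 - winner] -= stake

        for i in range(2):
            if not calls[i]:
                continue  # a folding player does not update
            p = probs[i][cards[i]]
            if i == winner:
                probs[i][cards[i]] = p + mus[i] * (1.0 - p)  # rule (1)
            else:
                probs[i][cards[i]] = mus[i] * p              # rule (2)

    return money, probs
```

Calling `simulate_match(0.5, 0.3)` plays one match of $10^4$ hands with $N = 25$; curves comparable to the averaged ones discussed in this section would require repeating the match with many different seeds and averaging the results.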

As can be seen, the first player, with the higher LF, wins over his opponent with the smaller LF, and the money gained by player 1 increases with time. Moreover, the smaller $\mu_2$, the faster and bigger the winnings of player 1. Figures 1 and 2 show that, on average, the player with the bigger risk propensity finally overwhelms the other one; this means that in a single match the most risk-inclined player has a higher probability of winning, and such probability increases as the difference $\mu_1 - \mu_2$ increases. The fact that taking risks pays off is confirmed in the next figures, where the final calling distributions for both players are depicted.

Figure 1: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^4$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.3$.

[Figure 2: money versus time (hands played) for $\mu_1 = 0.5$ and $\mu_2 = 0.48$.]

As we can see, in both cases the winner is characterized by higher calling probabilities than his opponent’s for every value of $n$. Moreover, we have $P_i(n = 0) > 0$, which is the most explicit evidence of the emergence of bluffing.

3.1.2. Case $\mu_2 = \mu_1$. In this case both players have the same winning probability, as is well shown in Figure 5: indeed, having the same LF, they also have exactly the same behaviour, so that in a single match nobody is able to definitively overwhelm the opponent.


Figure 3: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.3$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$.

[Figure 4: calling probabilities $P_i(n)$ versus card value $n$ for $\mu_1 = 0.5$ and $\mu_2 = 0.48$.]





Naturally, it is also easy to forecast the behaviour of the final calling probabilities of the agents: they will be equal, with $P_i(n = 0) > 0$ for both $i = 1$ and $i = 2$.

3.1.3. Case $\mu_2 > \mu_1$. In this case the results are qualitatively the same as those of the case $\mu_2 < \mu_1$, only now it is player 2 who defeats player 1, as shown in Figures 7 and 8.

3.2. Discussion. The first conclusion we can draw just from the numerical results is that in this game bluffing necessarily emerges as a rational strategy. Moreover, the player who bluffs more finally wins. This can be easily understood by looking at Figures 3, 4, 6, and 8. Indeed, while in general the calling probabilities of both players tend to have the same value for $n \to N-1$, for small $n$ the player with the higher LF, that is, with the higher risk propensity, bluffs much more than his opponent: this means that even holding a poor card the “risk-lover” will call and, unless his opponent has a very good hand, will get the entry pot.

Figure 5: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, throughout a single match. The LFs of the players are here $\mu_1 = \mu_2 = 0.5$.

[Figure 6: calling probabilities $P_i(n)$ versus card value $n$ for both players ($\mu_1 = \mu_2 = 0.5$). Data taken after $10^4$ hands of the game and averaged over $2.5 \cdot 10^4$ iterations. Random initial distribution for every $P_i(n)$.]

It is possible to formalize such considerations by writing the equations of the dynamics for the model at stake. Neglecting fluctuations, the “mean-field” equation ruling the money $M_1$ won (or lost) by the first player is

$$M_1(t+1) = M_1(t) + \sum_{n_1=0}^{N-1} \frac{1}{N} \left[ \left(1 - \Pi^1_2\right) P_1(n_1, t) - \Pi^1_2 \left(1 - P_1(n_1, t)\right) + 2 P_1(n_1, t)\, \Pi^1_2 \left(\mathcal{P}^1_2 - \mathcal{P}^2_1\right) \right], \tag{3}$$

where $\mathcal{P}^1_2 = \Pr(n_1 > n_2)$ is the probability that the card $n_1$ held by player 1 is higher than the card $n_2$ held by player 2; analogously, $\mathcal{P}^2_1 = \Pr(n_2 > n_1)$. On the other hand, $\Pi^1_2$ is an operator defined as follows:

$$\Pi^1_2 \cdot X = \sum_{n_2=0}^{N-1} \left[ \frac{P_2(n_2, t)}{N-1} \left(1 - \delta_{n_2 n_1}\right) X \right], \tag{4}$$

and represents the probability that player 2 calls from the point of view of player 1. Equation (3) can be rewritten as a differential equation in time, which can assume the form

$$\dot{M}_1(t) = \left[1 - 2\gamma_2(t)\right] \omega_1(t) - \left[1 - 2\gamma_1(t)\right] \omega_2(t), \tag{5}$$

with

$$\omega_i(t) = \frac{1}{N} \sum_{n=0}^{N-1} P_i(n, t), \qquad \gamma_i(t) = \frac{1}{(N-1)^2} \sum_{n=0}^{N-1} n\, P_i(n, t), \qquad i = 1, 2. \tag{6}$$
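To make the quantities in (5) and (6) concrete, they can be evaluated directly from two calling distributions given as lists of length $N$. The following is only a sketch under the normalizations printed in (6); the function names `omega`, `gamma`, and `money_rate` are hypothetical and do not come from the paper.

```python
def omega(p):
    """Average calling probability, first relation in (6)."""
    return sum(p) / len(p)


def gamma(p):
    """Card-weighted calling probability, second relation in (6)."""
    n_cards = len(p)
    return sum(n * p_n for n, p_n in enumerate(p)) / (n_cards - 1) ** 2


def money_rate(p_1, p_2):
    """Right-hand side of (5): mean-field rate of change of player 1's money."""
    return (1 - 2 * gamma(p_2)) * omega(p_1) - (1 - 2 * gamma(p_1)) * omega(p_2)
```

For example, `money_rate(p, p)` returns exactly zero for any distribution `p`, since the two terms of (5) then coincide.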

Since this is a zero-sum game, the second player’s money is obtained from the relation $M_2(t) = -M_1(t)$.

Finally, the relation giving the temporal behaviour of the calling distribution $P_1(n, t)$ of player 1 (that of the opponent being of analogous form) is

$$\dot{P}_1(n_1, t) = \frac{1}{N} P_1(n_1, t) \left[ \left(1 - \Pi^1_2 + \Pi^1_2 \mathcal{P}^1_2\right) \times \left[ P_1(n_1, t) + \mu_1 \left(1 - P_1(n_1, t)\right) \right] + \mu_1 P_1(n_1, t)\, \Pi^1_2 \mathcal{P}^2_1 \right] + \frac{1 - P_1(n_1, t)}{N} P_1(n_1, t) - \frac{P_1(n_1, t)}{N}. \tag{7}$$

Now, (5), (6), and (7) are rather complicated, but some of their features can be determined without an explicit solution. Indeed, it is straightforward to see that we have

$$P_1(n, t) = P_2(n, t) \;\Longrightarrow\; \omega_1(t) = \omega_2(t),\ \gamma_1(t) = \gamma_2(t) \;\Longrightarrow\; \dot{M}_1 = \dot{M}_2 = 0. \tag{8}$$
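The step behind (8) can be written out explicitly. The short derivation below is only a worked check that uses (5), (6), and the zero-sum relation $M_2(t) = -M_1(t)$; the shorthands $\omega$ and $\gamma$ for the common values are introduced here for convenience.

```latex
% If P_1(n,t) = P_2(n,t) for every n, then (6) gives
%   \omega_1 = \omega_2 \equiv \omega ,  \gamma_1 = \gamma_2 \equiv \gamma ,
% and substituting into (5):
\[
  \dot{M}_1(t) = (1 - 2\gamma)\,\omega - (1 - 2\gamma)\,\omega = 0 ,
  \qquad
  \dot{M}_2(t) = -\dot{M}_1(t) = 0 .
\]
```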

Now, since in our simulations we always started from the same initial $P_i(n)$ for all $i$, $n$, and from $M_1(0) = M_2(0) = 0$, this implies that for $\mu_1 = \mu_2$ both players must have on average the same calling distributions, and then they should neither gain nor lose money, apart from fluctuations: this is exactly what we found in Figures 5 and 6. It can also be shown that if $\mu_1 > \mu_2$, then we will soon have $P_1(n) \ge P_2(n)$ for all $n$ (with the equality holding only for $n = N-1$), so that equations (6) allow us to obtain $\dot{M}_1 > 0$, that is, the victory of player 1, as shown in Figures 1 to 4. Obviously, the opposite situation takes place for $\mu_1 < \mu_2$ (as shown in Figures 7 and 8).

[Figure 7: money versus time (hands played) for $\mu_1 = 0.5$ and $\mu_2 = 0.7$.]

[Figure 8: calling probabilities $P_i(n)$ versus card value $n$ for $\mu_1 = 0.5$ and $\mu_2 = 0.7$.]





4. Conclusions and Perspectives

In this work we have developed a model to describe, very roughly, the dynamics of a cognitive process. The model is among the simplest ones, but it captures the main behaviours of the real dynamics, demonstrating how fundamental bluffing is as a rational strategy.

First of all, this model confirms the prominent role of the LF ($\mu$). A direct connection between LF and bluffing tendency is straightforwardly detectable here. In fact, even though both agents tend to develop bluffing, the one with the greatest LF bluffs more than the other, ending up as the winner. Finally, this last evidence suggests that bluffing must be an evolutionarily rational (stable) strategy.

Other more realistic models have been developed [21], and they can surely give us deeper information about the inner mechanisms leading to bluffing behaviours in poker-like games, but the emergence of bluffing as a rational strategy already in such a simple toy model is without doubt a very remarkable result. Indeed, more in-depth analyses of the real game can be found in the literature [13], which systematically survey every facet of poker (blind, flop, raise, etc.), but this is the first time that an attempt has been made to capture its main features by means of a very “reductionist” approach, using the simplest version of the game. In short, while it has been thought until now that capturing an apparently complex behaviour such as bluffing, with effective outcomes, requires taking into account (almost) all the rules of poker, we have shown here that such an attitude is much more fundamental and emerges naturally when a few simple ingredients are present.

Moreover, for practical purposes, a simpler model which can be more easily understood turns out to be very useful, because it also makes it easier to carry out and analyse experimental tests with real human agents, and hence it will be possible to plan other kinds of experimental tools more accurately. In particular, it will be straightforward to use a tool such as the one in [22] for a test with real players, and to exploit data from online poker web sites for a statistical analysis of the outcomes of real poker games.

References

[1] L. A. Real, “Animal choice behavior and the evolution of cognitive architecture,” Science, vol. 253, no. 5023, pp. 980–986, 1991.

[2] L. A. Real, “Information processing and the evolutionary ecology of cognitive architecture,” American Naturalist, vol. 140, pp. S108–S145, 1992.

[3] E. Brandstatter, G. Gigerenzer, and R. Hertwig, “The priority heuristic: making choices without trade-offs,” Psychological Review, vol. 113, no. 2, pp. 409–432, 2006.

[4] J. R. Busemeyer and J. T. Townsend, “Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment,” Psychological Review, vol. 100, no. 3, pp. 432–459, 1993.

[5] I. Erev and G. Barron, “On adaptation, maximization, and reinforcement learning among cognitive strategies,” Psychological Review, vol. 112, no. 4, pp. 912–931, 2005.

[6] G. T. Fong, D. H. Krantz, and R. E. Nisbett, “The effects of statistical training on thinking about everyday problems,” Cognitive Psychology, vol. 18, no. 3, pp. 253–292, 1986.

[7] C. R. Fox and L. Hadar, “Decisions from experience = sampling error + prospect theory: reconsidering Hertwig, Barron, Weber & Erev,” Judgment and Decision Making, vol. 1, no. 2, pp. 159–161, 2006.

[8] R. Hau, T. J. Pleskac, and R. Hertwig, “Decisions from experience and statistical probabilities: why they trigger different choices than a priori probabilities,” Journal of Behavioral Decision Making, vol. 23, no. 1, pp. 48–68, 2010.

[9] C. Castellano, S. Fortunato, and V. Loreto, “Statistical physics of social dynamics,” Reviews of Modern Physics, vol. 81, no. 2, pp. 591–646, 2009.

[10] C. Camerer, Behavioral Game Theory: Experiments on Strategic Interaction, Princeton University Press, Princeton, NJ, USA, 2003.

[11] H. Gintis, The Bounds of Reason, Princeton University Press, Princeton, NJ, USA, 2009.

[12] I. Barany and Z. Furedi, “Mental poker with three or more players,” Information and Control, vol. 59, no. 1–3, pp. 84–93, 1983.

[13] D. Billings, A. Davidson, J. Schaeffer, and D. Szafron, “The challenge of poker,” Artificial Intelligence, vol. 134, no. 1-2, pp. 201–240, 2002.

[14] H. Johansen-Berg and V. Walsh, “Cognitive neuroscience: who to play at poker,” Current Biology, vol. 11, no. 7, pp. R261–R263, 2001.

[15] D. Koller and A. Pfeffer, “Representations and solutions for game-theoretic problems,” Artificial Intelligence, vol. 94, no. 1-2, pp. 167–215, 1997.

[16] A. Rapoport, I. Erev, E. V. Abraham, and D. E. Olson, “Randomization and adaptive learning in a simplified poker game,” Organizational Behavior and Human Decision Processes, vol. 69, no. 1, pp. 31–49, 1997.

[17] Z. Shen, “A study of a generalization of a card problem,” Applied Mathematics and Computation, vol. 166, no. 2, pp. 385–410, 2005.

[18] J. Miekisz, “Evolutionary game theory and population dynamics,” in Multiscale Problems in the Life Sciences, vol. 1940 of Lecture Notes in Mathematics, pp. 269–316, 2008.

[19] H. W. Kuhn, “A simplified two-person poker,” in Annals of Mathematics Studies, vol. 24, pp. 97–103, 1950.

[20] J. Von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ, USA, 1947.

[21] A. Guazzini and D. Vilone, The emergence of bluff in poker-like games [Ph.D. thesis], University of Florence, 2009.

[22] A. Guazzini, D. Vilone, F. Bagnoli, T. Carletti, and R. Lauro Grotto, “Cognitive network structure: an experimental study,” Advances in Complex Systems, vol. 15, no. 6, Article ID 1250084, 2012.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of


Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of


Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of


Mathematical PhysicsAdvances in

Complex AnalysisJournal of


OptimizationJournal of


CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of


Operations ResearchAdvances in

Journal of


Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences


The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014


Algebra

Discrete Dynamics in Nature and Society



Decision SciencesAdvances in

Discrete MathematicsJournal of


Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of


high performances in poker Moreover poker has a richhistory of study in several academic fields Economists andmathematicians have applied a variety of analytical tech-niques to poker-related problems [16ndash18] For example theearliest investigations in game theory by luminaries such asJohn von Neumann and John Nash used simplified poker toillustrate the fundamental principles

There is an important difference between board gamesand popular card games like bridge and poker In boardgames players have complete knowledge of the entire gamestate since everything is visible to all participants In contrastbridge and poker involve imperfect information since theother playersrsquo cards are not known

From a computational point of view it is important todistinguish the lack of information from the possibility ofchance moves The former involves uncertainty about thecurrent state of the world in particular situations wheredifferent players have access to different information Thelatter involves only uncertainty about the future uncertaintywhich is resolved as soon as the future materializes Perfectand imperfect information games may involve an element ofchance examples of games from all four categories are shownin Table 1

The presence of chance elements does not need majorchanges to the computational techniques used to solve agame In fact the cost of solving a perfect information gamewith chance moves is not substantially greater than solving agame with no chance moves By contrast the introductionof imperfect information increases the complexity of theproblem

Due to the complexity (both conceptual and algorithmic)of dealing with imperfect information games this problemhas been largely ignored at the computational level until theintroduction of randomized strategies concept

Once randomized strategies are allowed the existence ofldquooptimal strategiesrdquo in imperfect information games can beproved In particular this means that there exists an optimalrandomized strategy for poker in the same way as there existsan optimal deterministic strategy for chess Indeed Kuhnshowed for a simplified poker game that the optimal strategydoes use randomization [19]

The optimal strategy has several advantages the playercannot do better than this strategy if playing against a goodopponent furthermore the player does not do worse even ifhis strategy is revealed to his opponent that is the opponentgains no advantage fromfiguring out the first playerrsquos strategy

Another interesting result of such researches is theexistence of an optimal strategy for the gambler in pokergame As first observed in a simple poker-like game by Kuhn[19] behaviors such as bluffing that seem to arise from thepsychological makeup of human players are actually gametheoretically optimal

One of the earliest and most thorough investigations ofpoker appears in the classical treatise on game theory ldquoGamesand Economic Behaviorrdquo by von Neumann andMorgenstern[20] where a large section was devoted to the formal analysisof ldquobluffingrdquo in several simplified variants of a two-personpoker game with either symmetric or asymmetric informa-tion

Table 1

Perfect information Imperfect informationNo chance Chess Inspection gameChance Monopoly Poker

Indeed the general considerations concerning poker andthe mathematical discussions of the different versions of thegame were carried out by vonNeumann as early as 1926 Rec-ognizing that ldquobluffingrdquo in poker ldquois unquestionably practicedby all experienced playersrdquo von Neumann and Morgensternidentified two reasons for bluffing ldquoThe first is the desire togive a ldquofalserdquo impression of strength in ldquorealrdquo weakness thesecond is the desire to give a ldquofalserdquo impression of weaknessin ldquorealrdquo strengthrdquo [20]

Solutions to these simplified poker-like games as wellas a large class of both zero-sum and nonzero-sum gameswere unified by the concept of mixed strategy a probabilitydistribution over the playerrsquos set of actions The importanceof mixed strategies to the theory of games and its applicationsto the social and behavioral sciences stems from the fact thatfor many interactive decision processes there can be no Nashequilibria in pure strategies

Using randomization and adaptive learning as key con-cepts tomodelize into a computational scaffolding of the cog-nitive processes we believe that this area of research is morelikely to produce insights about superior cognitive strategiesbecause of their intrinsically structures Finally comparingthemwith the real human strategy it is possible both to inves-tigate the role of environmental factors on cognitive strategiesdevelopment and to validate some theoretical psychologicalassumptions

21 Summary The target of this paper is to show how bluffingstrategies can arise naturally as a mathematical property of avery simple model without any references to psychologicalassessments For this reason we present a simple model ofpoker-like game with only two players and only two possiblestrategies folding and calling which each agent assumessimultaneously Such oversimplified game as we will seeallows to catch the fundamental mechanisms underlyingthe phenomenon of bluffing The fact that bluffing naturallyemerges already in a very simple version of the game seemsto suggest that such a strategy is perfectly rational and can bemathematically characterized

3 The Model

The model we are going to define and analyse is probably one of the greatest possible simplifications of a game of chance with imperfect information.

Here we have two players. At the beginning of each hand of the game, each of them puts one coin into the entry pot. Then each picks a "card" from a pack; each card has an integer value between 0 and $N - 1$ (i.e., there are $N$ cards overall). At this point, according to the value of their card, the players decide whether to call or to fold. If both players call, each puts another coin in the pot, and whoever holds the higher card wins the pot.
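The rules of a single hand can be sketched as follows. This is a minimal illustration, not the authors' code: the payoffs for the cases in which one or both players fold are assumptions read off the terms of Eq. (3), the two cards are assumed to be distinct (as the factor $1 - \delta_{n_2 n_1}$ in Eq. (4) suggests), and the names `deal` and `hand_payoff` are hypothetical.

```python
import random


def deal(n_cards, rng):
    """Deal two distinct cards with values in 0..n_cards-1 (assumed: no ties)."""
    card1, card2 = rng.sample(range(n_cards), 2)
    return card1, card2


def hand_payoff(card1, card2, calls1, calls2):
    """Net gain of player 1 in coins for one hand; player 2 gains the negative.

    Assumed payoffs, following the terms of Eq. (3):
      both call      -> the higher card wins 2 coins (entry coin + call coin)
      only one calls -> the caller wins the folder's entry coin
      both fold      -> nobody gains or loses
    """
    if calls1 and calls2:
        return 2 if card1 > card2 else -2
    if calls1:
        return 1
    if calls2:
        return -1
    return 0
```

For example, `hand_payoff(*deal(25, random.Random(0)), True, True)` evaluates one showdown between two calling players; the deck size of 25 is only an illustrative choice.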





$$P_i(n) \to P_i(n) + \mu_i \left[ 1 - P_i(n) \right] \tag{1}$$

$$P_i(n) \to \mu_i P_i(n) \tag{2}$$

[…] $\mu_2 < \mu_1$, $\mu_2 = \mu_1$, and $\mu_2 > \mu_1$ […]



[Figure: money versus time (0 to 10000) for $\mu_1 = 0.5$ and $\mu_2 = 0.3$.]

[Figure: money versus time (0 to 10000) for $\mu_1 = 0.5$ and $\mu_2 = 0.48$.]

[…] $P_i(n = 0) > 0$, which is […]

[Figure: $P_i(n)$ versus $n$ (0 to 24) for $\mu_1 = 0.5$ and $\mu_2 = 0.3$.]

[Figure: $P_i(n)$ versus $n$ (0 to 24) for $\mu_1 = 0.5$ and $\mu_2 = 0.48$.]

[Figure: money versus time (0 to 10000) for $\mu_1 = \mu_2 = 0.5$.]

[Figure: $P_i(n)$ versus $n$ (0 to 24) for $\mu_1 = \mu_2 = 0.5$.]

[…] ($\mu_2 = 0.5 = \mu_1$ […]





$$M_1(t+1) = M_1(t) + \sum_{n_1=0}^{N-1} \frac{1}{N} \left[ \left(1 - \Pi^1_2\right) P_1(n_1,t) - \Pi^1_2 \left(1 - P_1(n_1,t)\right) + 2 P_1(n_1,t)\, \Pi^1_2 \left( \mathcal{P}^1_2 - \mathcal{P}^2_1 \right) \right] \tag{3}$$

$$\Pi^1_2 \cdot X = \sum_{n_2=0}^{N-1} \left[ \frac{P_2(n_2,t)}{N-1} \left(1 - \delta_{n_2 n_1}\right) X \right] \tag{4}$$
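Equations (3) and (4) can be evaluated numerically as in the sketch below. It rests on an interpretation that the surviving text does not confirm: $\mathcal{P}^1_2$ is taken to be the indicator that player 1's card beats player 2's ($n_1 > n_2$), and $\mathcal{P}^2_1$ the indicator of the opposite event. The function names are hypothetical.

```python
def call_weight(P2, n1, n2):
    """Weight appearing in Eq. (4): P2(n2)/(N-1) for n2 != n1, zero otherwise."""
    N = len(P2)
    return 0.0 if n2 == n1 else P2[n2] / (N - 1)


def expected_gain(P1, P2):
    """Expected one-hand change of M1, i.e. the sum in Eq. (3).

    P1 and P2 are lists of calling probabilities indexed by card value.
    """
    N = len(P1)
    gain = 0.0
    for n1 in range(N):
        # Pi applied to X = 1: probability that player 2 calls.
        pi_call = sum(call_weight(P2, n1, n2) for n2 in range(N))
        # Pi applied to X = (win indicator - loss indicator); this reading
        # of the calligraphic P symbols is an assumption.
        pi_showdown = sum(
            call_weight(P2, n1, n2) * (1 if n1 > n2 else -1)
            for n2 in range(N)
            if n2 != n1
        )
        gain += (
            (1 - pi_call) * P1[n1]        # player 1 calls, player 2 folds
            - pi_call * (1 - P1[n1])      # player 1 folds, player 2 calls
            + 2 * P1[n1] * pi_showdown    # both call: showdown for 2 coins
        ) / N
    return gain
```

Under this reading, two players with identical calling probabilities have zero expected gain, which agrees with the symmetric case stated in Eq. (8).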



with

$$\begin{aligned}
\omega_i(t) &= \frac{1}{N} \sum_{n=0}^{N-1} P_i(n,t), \quad i = 1, 2, \\
\gamma_i(t) &= \frac{1}{(N-1)^2} \sum_{n=0}^{N-1} n\, P_i(n,t), \quad i = 1, 2.
\end{aligned} \tag{6}$$
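The two aggregate quantities of Eq. (6) are straightforward to compute from a list of calling probabilities. The sketch below is only an illustration with hypothetical function names; it assumes the list `P` is indexed by card value $n = 0, \ldots, N-1$ at a fixed time $t$.

```python
def omega(P):
    """Eq. (6): average calling probability over all N cards."""
    return sum(P) / len(P)


def gamma(P):
    """Eq. (6): card-value-weighted sum of calling probabilities,
    normalized by (N - 1)**2."""
    N = len(P)
    return sum(n * p for n, p in enumerate(P)) / (N - 1) ** 2
```

For a player who always calls with five cards, `omega([1.0] * 5)` is 1 and `gamma([1.0] * 5)` is (0 + 1 + 2 + 3 + 4) / 16 = 0.625.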


[…] $M_2(t) = -M_1(t)$ […]

$$\begin{aligned}
\dot{P}_1(n_1,t) = {} & \frac{1}{N} P_1(n_1,t) \Big[ \left(1 - \Pi^1_2 + \Pi^1_2 \mathcal{P}^1_2\right) \times \left[ P_1(n_1,t) + \mu_1 \left(1 - P_1(n_1,t)\right) \right] \\
& + \mu_1 P_1(n_1,t)\, \Pi^1_2 \mathcal{P}^2_1 \Big] + \frac{1 - P_1(n_1,t)}{N} P_1(n_1,t) - \frac{P_1(n_1,t)}{N}
\end{aligned} \tag{7}$$

$$P_1(n,t) = P_2(n,t) \Longrightarrow \omega_1(t) = \omega_2(t),\ \gamma_1(t) = \gamma_2(t) \Longrightarrow \dot{M}_1 = \dot{M}_2 = 0 \tag{8}$$
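The learning dynamics can be explored with a small Monte Carlo sketch. The conditions under which the updates (1) and (2) are applied are inferred here from the structure of Eq. (7) rather than quoted from the text: a player who calls and wins (including when the opponent folds) is assumed to apply Eq. (1), a player who calls and loses is assumed to apply Eq. (2), and a player who folds leaves $P_i(n)$ unchanged. The initial condition $P_i(n) = 1$, the default deck size, and all function names are assumptions made for illustration.

```python
import random


def play_hand(P, mu, rng):
    """Play one hand, update P in place, and return player 1's gain in coins."""
    N = len(P[0])
    cards = rng.sample(range(N), 2)  # two distinct cards (assumed)
    calls = [rng.random() < P[i][cards[i]] for i in range(2)]
    if calls[0] and calls[1]:
        winner = 0 if cards[0] > cards[1] else 1
        gain1 = 2 if winner == 0 else -2
    elif calls[0]:
        winner, gain1 = 0, 1
    elif calls[1]:
        winner, gain1 = 1, -1
    else:
        return 0  # both fold: nothing changes
    for i in range(2):
        if not calls[i]:
            continue  # folding leaves P_i(n) unchanged (assumed)
        p = P[i][cards[i]]
        if i == winner:
            P[i][cards[i]] = p + mu[i] * (1 - p)  # Eq. (1)
        else:
            P[i][cards[i]] = mu[i] * p  # Eq. (2)
    return gain1


def simulate(N=25, mu=(0.5, 0.3), hands=10000, seed=0, p0=1.0):
    """Run a match; return player 1's final money and both strategy tables."""
    rng = random.Random(seed)
    P = [[p0] * N, [p0] * N]  # initial condition p0 is an assumption
    money1 = 0  # M_1(0) = 0; M_2(t) = -M_1(t) throughout
    for _ in range(hands):
        money1 += play_hand(P, mu, rng)
    return money1, P
```

Calling `simulate(mu=(0.5, 0.3))`, `simulate(mu=(0.5, 0.5))`, or `simulate(mu=(0.5, 0.7))` mirrors the parameter pairs shown in the figures; whether the sketch reproduces the curves depends on the assumptions listed above.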


[…] $P_i(n)$ for all $i$, $n$ and from $M_1(0) = M_2(0) = 0$ […]

[…] $P_1(n) \ge P_2(n)$ for all […]

[Figure: money versus time (0 to 10000) for $\mu_1 = 0.5$ and $\mu_2 = 0.7$.]

[Figure: $P_i(n)$ versus $n$ (0 to 24) for $\mu_1 = 0.5$ and $\mu_2 = 0.7$.]












References





























