Hindawi Publishing Corporation
Journal of Complex Systems
Volume 2013, Article ID 390454, 6 pages
http://dx.doi.org/10.1155/2013/390454

Research Article
Bluffing as a Rational Strategy in a Simple Poker-Like Game Model
Andrea Guazzini1 and Daniele Vilone2
1 Department of Education and Psychology, University of Florence, Via San Salvi 12, Building 26, 50135 Florence, Italy
2 Statistical Materials Modeling Laboratory (CNR-IENI), Via R. Cozzi 53, 20125 Milano, Italy
Correspondence should be addressed to Andrea Guazzini andreaguazzinigmailcom
Received 28 January 2013; Accepted 8 May 2013
Academic Editor: Fuwen Yang
Copyright © 2013 A. Guazzini and D. Vilone. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
We present a simple adaptive learning model of a poker-like game, by means of which we show how a bluffing strategy emerges very naturally and can also be rational and evolutionarily stable. Despite their very simple learning algorithms, agents learn to bluff, and the player who bluffs most is usually the winner.
1. Introduction
Among the concepts formulated and elaborated by cognitive psychology in the last century, the study of how problem-solving strategies develop has enabled a meaningful improvement in the comprehension of mental activity and its relations with the cerebral circuits.
At this level of description, the concept of cognitive process is defined as the interconnected performance of some elementary cognitive activities that operate on and affect mental contents, representing the fundamental issue in bridging, at a theoretical level, the superior cognitive functions and human behaviour [1, 2]. Such a concept is used in a wider sense to mean the act of knowing and may be interpreted in a social or cultural sense to describe the emergent development of knowledge, concepts, or strategies.
Cognitive psychologists argue that the mind can be understood in terms of information processing, especially when processes such as abstraction, categorization, knowledge, expertise, or learning are involved [3–5].
The concept of cognitive process is defined both as the result of the parallel elaboration of several well-defined and functionally independent neural modules and as a sort of “software” able to optimize the integration among different cognitive functions by adapting to different environmental/informational circumstances [6–8].
The target of the present paper is to formulate a cognitive model of a poker-like game in which the players are able to develop strategies by learning from experience. Such a task, better known as problem solving, makes it possible to investigate effectively the relation and coupling between the dynamics of cognitive processes and the environment.
Moreover, the study of poker is of great and general interest in complex systems because it is strictly related to sociophysics, decision theory, and behaviour evolution. Indeed, the study of human behaviour and, more generally, of social phenomena has been approached in recent years with the tools of complex systems physics [9]. Poker-like games provide a very good instance of a strategic dilemma, where agents must optimize their income in conditions of imperfect information (see Section 2) or where it is not clear which is the true optimal strategy, a case which is very common in real human interactions [10, 11]. Understanding the mechanisms which underlie poker is therefore very useful for comprehending human psychology and for getting hints about how to approach more complicated models of human society.
2. Poker-Like Games
Poker is an interesting test bed for artificial intelligence research [12–15]. It is a game of imperfect information, where multiple competing agents must deal with probabilistic knowledge, risk assessment, and possible deception, not unlike decisions made in the real world. So-called “opponent modelling” is another difficult problem in decision-making applications, and it is essential to achieving high performance in poker. Moreover, poker has a rich history of study in several academic fields. Economists and mathematicians have applied a variety of analytical techniques to poker-related problems [16–18]. For example, the earliest investigations in game theory, by luminaries such as John von Neumann and John Nash, used simplified poker to illustrate the fundamental principles.
There is an important difference between board games and popular card games like bridge and poker. In board games, players have complete knowledge of the entire game state, since everything is visible to all participants. In contrast, bridge and poker involve imperfect information, since the other players' cards are not known.
From a computational point of view, it is important to distinguish the lack of information from the possibility of chance moves. The former involves uncertainty about the current state of the world, in particular situations where different players have access to different information. The latter involves only uncertainty about the future, uncertainty which is resolved as soon as the future materializes. Both perfect and imperfect information games may involve an element of chance; examples of games from all four categories are shown in Table 1.
The presence of chance elements does not require major changes to the computational techniques used to solve a game. In fact, the cost of solving a perfect information game with chance moves is not substantially greater than that of solving a game with no chance moves. By contrast, the introduction of imperfect information does increase the complexity of the problem.
Due to the complexity (both conceptual and algorithmic) of dealing with imperfect information games, this problem was largely ignored at the computational level until the introduction of the concept of randomized strategies.
Once randomized strategies are allowed, the existence of “optimal strategies” in imperfect information games can be proved. In particular, this means that there exists an optimal randomized strategy for poker in the same way as there exists an optimal deterministic strategy for chess. Indeed, Kuhn showed for a simplified poker game that the optimal strategy does use randomization [19].
The optimal strategy has several advantages: the player cannot do better than this strategy when playing against a good opponent; furthermore, the player does not do worse even if his strategy is revealed to his opponent, that is, the opponent gains no advantage from figuring out the first player's strategy.
Another interesting result of such research is the existence of an optimal strategy for the gambler in the game of poker. As first observed in a simple poker-like game by Kuhn [19], behaviours such as bluffing, which seem to arise from the psychological makeup of human players, are actually game-theoretically optimal.
One of the earliest and most thorough investigations of poker appears in the classical treatise on game theory, “Theory of Games and Economic Behavior” by von Neumann and Morgenstern [20], where a large section is devoted to the formal analysis of “bluffing” in several simplified variants of a two-person poker game with either symmetric or asymmetric information.
Table 1

             Perfect information    Imperfect information
No chance    Chess                  Inspection game
Chance       Monopoly               Poker
Indeed, the general considerations concerning poker and the mathematical discussions of the different versions of the game were carried out by von Neumann as early as 1926. Recognizing that “bluffing” in poker “is unquestionably practiced by all experienced players,” von Neumann and Morgenstern identified two reasons for bluffing: “The first is the desire to give a ‘false’ impression of strength in ‘real’ weakness; the second is the desire to give a ‘false’ impression of weakness in ‘real’ strength” [20].
Solutions to these simplified poker-like games, as well as to a large class of both zero-sum and nonzero-sum games, were unified by the concept of mixed strategy: a probability distribution over the player's set of actions. The importance of mixed strategies to the theory of games and its applications to the social and behavioural sciences stems from the fact that for many interactive decision processes there can be no Nash equilibria in pure strategies.
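The last point can be checked on a textbook example that is not part of this paper: matching pennies, a two-action zero-sum game. The sketch below is only illustrative; the payoff matrix and the helper names `pure_nash_equilibria` and `expected_payoff` are assumptions introduced here. It enumerates all pure strategy pairs, finds that none is an equilibrium, and shows that a uniform mixed strategy guarantees the row player an expected payoff of zero against either reply.

```python
from itertools import product

# Payoffs to the row player in matching pennies (zero-sum game):
# +1 when the two choices match, -1 otherwise. Illustrative example only.
PAYOFF = [[1, -1],
          [-1, 1]]

def pure_nash_equilibria(payoff):
    """Return all pure-strategy pairs from which neither player gains by deviating."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # The row player wants the largest entry in column c.
        row_best = all(payoff[r][c] >= payoff[r2][c] for r2 in range(n_rows))
        # The column player receives -payoff, so she wants the smallest entry in row r.
        col_best = all(payoff[r][c] <= payoff[r][c2] for c2 in range(n_cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

def expected_payoff(payoff, p_row, p_col):
    """Expected row-player payoff when both sides mix with the given probabilities."""
    return sum(p_row[r] * p_col[c] * payoff[r][c]
               for r in range(len(payoff)) for c in range(len(payoff[0])))

print(pure_nash_equilibria(PAYOFF))  # empty list: no pure equilibrium
for reply in ([1.0, 0.0], [0.0, 1.0]):
    print(reply, expected_payoff(PAYOFF, [0.5, 0.5], reply))
```

The same enumeration applied to a game that does have a pure equilibrium would return a non-empty list, which is what distinguishes the two situations.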
Using randomization and adaptive learning as key concepts to model the cognitive processes within a computational scaffolding, we believe that this area of research is more likely to produce insights about superior cognitive strategies because of their intrinsic structure. Finally, by comparing them with real human strategies, it is possible both to investigate the role of environmental factors in the development of cognitive strategies and to validate some theoretical psychological assumptions.
2.1. Summary. The target of this paper is to show how bluffing strategies can arise naturally as a mathematical property of a very simple model, without any reference to psychological assessments. For this reason, we present a simple model of a poker-like game with only two players and only two possible strategies, folding and calling, which the agents choose simultaneously. Such an oversimplified game, as we will see, captures the fundamental mechanisms underlying the phenomenon of bluffing. The fact that bluffing emerges naturally even in a very simple version of the game suggests that such a strategy is perfectly rational and can be mathematically characterized.
3. The Model
The model we are going to define and analyse is probably one of the greatest possible simplifications of a game of chance with imperfect information.
Here we have two players. At the beginning of each hand of the game, they each put one coin into the entry pot. Then they each pick a “card” from a pack; each card has an integer value between 0 and $N-1$ (i.e., there are $N$ cards overall). At this point, according to the value of their card, the players decide to call or instead to fold. If both players call, they each put another coin in the pot, and whoever holds the higher card wins: the winner gets the entire pot (four coins). If one of the players folds, the “caller” wins and gets the entry pot (two coins). Finally, if nobody calls, both players take back the coin they had put in the entry pot. Mathematically, when player $i$ holds the card of value $n$, he decides to call according to the probability distribution $P_i(n)$, with of course $i = 1, 2$ and $n \in \{0, 1, \ldots, N-1\}$. After every hand, both players update their strategy. More precisely, if a player folds, nothing happens to his strategy; if agent $i$, holding card $n$, calls and wins (because he has the higher card or because the opponent folds), the probability that he calls when holding card $n$ changes in this way:
$$P_i(n) \longrightarrow P_i(n) + \mu_i \left[1 - P_i(n)\right]. \tag{1}$$
Analogously, if agent $i$ loses (i.e., if he calls but the opponent, who also calls, holds a higher card), the probability $P_i(n)$ changes instead in the following way:
$$P_i(n) \longrightarrow \mu_i P_i(n). \tag{2}$$
In (1) and (2), the coefficient $\mu_i$ is the learning factor (LF), which can also be seen as a sort of risk propensity of player $i$. The LFs of the players are set at the beginning of the game and never change. Moreover, each LF can assume a value between 0 (no risk propensity at all) and 1 (maximum possible risk propensity): a player with $\mu_i = 0$ and card $n$ does not increase $P_i(n)$ even when he wins and sets $P_i(n) = 0$ as soon as he loses; instead, with $\mu_i = 1$ he sets $P_i(n) = 1$ when he wins but does not decrease $P_i(n)$ if he loses. Finally, it is easy to see that (1) and (2) ensure that $P_i(n)$ always stays in the interval $[0, 1]$.
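As a quick illustration of update rules (1) and (2), the following minimal sketch applies them to a single calling probability. The function names `update_after_win` and `update_after_loss` are hypothetical labels chosen here, not taken from the paper; the loop at the end reproduces the two limiting cases of the learning factor described above.

```python
def update_after_win(p, mu):
    """Rule (1): move the calling probability toward 1 by a fraction mu."""
    return p + mu * (1.0 - p)

def update_after_loss(p, mu):
    """Rule (2): shrink the calling probability by the factor mu."""
    return mu * p

p = 0.4  # an arbitrary starting value of P_i(n), chosen for illustration
print(update_after_win(p, 0.5), update_after_loss(p, 0.5))

# Limiting cases: mu = 0 never rewards a win and zeroes P after a loss;
# mu = 1 jumps to P = 1 after a win and ignores a loss.
for mu in (0.0, 1.0):
    print(mu, update_after_win(p, mu), update_after_loss(p, mu))
```

Both rules are convex combinations of the old value with 1 or with 0, which is why the result can never leave the interval [0, 1].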
3.1. Numerical Results. In this section, we present the most remarkable results of the simulations of the simple model defined previously.
First of all, for simplicity, we set the LF $\mu_1$ of “player 1” equal to 0.5, and then we checked the dynamics by varying $\mu_2$: it is the difference between the LFs which essentially determines the main features of the dynamics, as we saw in several simulations. In particular, we can distinguish three cases: $\mu_2 < \mu_1$, $\mu_2 = \mu_1$, and $\mu_2 > \mu_1$.
3.1.1. Case $\mu_2 < \mu_1$. In Figures 1 and 2, the money of both players is shown as a function of time, where the time unit is a single hand of the game; we set $N = 25$, with $\mu_2 = 0.3$ and $\mu_2 = 0.48$, respectively, and the results are averaged over $10^4$ and $10^5$ iterations, respectively; in both cases, we let the agents play $10^4$ hands. For simplicity, we considered players with an infinite amount of money available, and we gauged the initial amount to zero. Additionally, the initial calling distributions $P_i(n)$ are picked randomly for each $n$.
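A single match of the kind described here can be sketched as follows. This is not the authors' code: the function name `play_match`, the use of net winnings (+2 or -2 when both call, +1 or -1 when exactly one player calls, 0 when both fold), and the drawing of two distinct cards per hand are assumptions made for this illustration, chosen to agree with the rules of Section 3.

```python
import random

def play_match(mu1, mu2, n_cards=25, n_hands=10_000, rng=None):
    """Simulate one match; return player 1's running net winnings and both call tables."""
    rng = rng or random.Random()
    mus = (mu1, mu2)
    # Random initial calling probabilities P_i(n), one per card value.
    probs = [[rng.random() for _ in range(n_cards)] for _ in range(2)]
    money1, history = 0, []
    for _ in range(n_hands):
        cards = rng.sample(range(n_cards), 2)  # two distinct cards (assumption)
        calls = [rng.random() < probs[i][cards[i]] for i in range(2)]
        if calls[0] and calls[1]:
            winner, gain = (0 if cards[0] > cards[1] else 1), 2  # showdown
        elif calls[0] or calls[1]:
            winner, gain = (0 if calls[0] else 1), 1  # caller takes the entry pot
        else:
            winner, gain = None, 0  # both fold, coins are returned
        for i in range(2):
            if not calls[i]:
                continue  # a player who folds does not update
            p = probs[i][cards[i]]
            if i == winner:
                probs[i][cards[i]] = p + mus[i] * (1.0 - p)  # rule (1)
            else:
                probs[i][cards[i]] = mus[i] * p  # rule (2)
        if winner is not None:
            money1 += gain if winner == 0 else -gain
        history.append(money1)
    return history, probs

history, probs = play_match(0.5, 0.3, rng=random.Random(0))
print(history[-1], probs[0][0], probs[1][0])
```

Averaging `history` over many independent matches, as done for the figures, would smooth out the fluctuations of a single run; player 2's money is simply the negative of player 1's.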
As can be seen, the first player, with the higher LF, wins over his opponent with the smaller LF, and the money gained by player 1 increases with time. Moreover, the smaller $\mu_2$, the faster and bigger the winnings of player 1. Figures 1 and 2 show that, on average, the player with the bigger risk propensity finally overwhelms the other one; this means that in a single match the most risk-inclined player has a bigger probability of winning, and such probability increases in turn as the difference $\mu_1 - \mu_2$ increases. The fact that taking risks pays off is confirmed in the next figures (Figures 3 and 4), where the final calling distributions for both players are depicted.

Figure 1: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^4$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.3$. (Axes: Time, 0 to 10000 hands; Money, −3000 to 3000.)

Figure 2: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^5$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.48$. (Axes: Time, 0 to 10000 hands; Money, −300 to 300.)
As we can see, in both cases the winner is characterized by higher calling probabilities than his opponent's for every value of $n$. Moreover, we have $P_i(n = 0) > 0$, which is the most explicit evidence of the emergence of bluffing.
3.1.2. Case $\mu_2 = \mu_1$. In this case, both players have the same winning probability, as shown in Figure 5: having the same LF, they also have exactly the same behaviour, so that in a single match neither is able to overwhelm the opponent definitively.
Figure 3: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.3$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$. (Axes: $n$, 0 to 24; $P_i(n)$, 0 to 1.)

Figure 4: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.48$). Data taken after $10^4$ hands of the game and averaged over $10^5$ iterations. Random initial distribution for every $P_i(n)$. (Axes: $n$, 0 to 24; $P_i(n)$, 0 to 1.)
Naturally, it is also easy to forecast the behaviour of the final calling probabilities of the agents: they will be equal, with $P_i(n = 0) > 0$ for both $i = 1$ and $i = 2$.
3.1.3. Case $\mu_2 > \mu_1$. In this case, the results are qualitatively the same as in the case $\mu_2 < \mu_1$, only now it is player 2 who defeats player 1, as shown in Figures 7 and 8.
3.2. Discussion. The first conclusion we can draw from the numerical results alone is that in this game bluffing emerges necessarily as a rational strategy. Moreover, the player who bluffs more finally wins. This can be easily understood by looking at Figures 3, 4, 6, and 8. Indeed, while in general the calling probabilities of both players tend to have the same value for $n \to N-1$, for small $n$ the player with the higher LF, that is, with the higher risk propensity, bluffs much more than his opponent: this means that, even when holding a poor card, the “risk-lover” will call and, unless his opponent has a very good hand, will get the entry pot.

Figure 5: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, throughout a single match. The LFs of the players are here $\mu_1 = \mu_2 = 0.5$. (Axes: Time, 0 to 10000 hands; Money, −200 to 200.)

Figure 6: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.5 = \mu_1$). Data taken after $10^4$ hands of the game and averaged over $2.5 \cdot 10^4$ iterations. Random initial distribution for every $P_i(n)$. (Axes: $n$, 0 to 24; $P_i(n)$, 0 to 1.)
It is possible to formalize such considerations by writing the equations of the dynamics for the model at stake. Neglecting fluctuations, the “mean-field” equation ruling the money $M_1$ won (or lost) by the first player is
$$M_1(t+1) = M_1(t) + \sum_{n_1=0}^{N-1} \frac{1}{N}\left[\left(1 - \Pi_2^1\right) P_1(n_1, t) - \Pi_2^1 \left(1 - P_1(n_1, t)\right) + 2 P_1(n_1, t)\, \Pi_2^1 \left(\mathcal{P}_2^1 - \mathcal{P}_1^2\right)\right], \tag{3}$$
where $\mathcal{P}_2^1 = \Pr(n_1 > n_2)$ is the probability that the card $n_1$ held by player 1 is higher than the card $n_2$ held by player 2; analogously, $\mathcal{P}_1^2 = \Pr(n_2 > n_1)$. On the other hand, $\Pi_2^1$ is an operator defined as follows:
$$\Pi_2^1 \cdot X = \sum_{n_2=0}^{N-1} \left[\frac{P_2(n_2, t)}{N-1} \left(1 - \delta_{n_2 n_1}\right) X\right], \tag{4}$$
and represents the probability that player 2 calls, from the point of view of player 1. Equation (3) can be rewritten as a differential equation in time, which can assume the form
$$\dot{M}_1(t) = \left[1 - 2\gamma_2(t)\right] \omega_1(t) - \left[1 - 2\gamma_1(t)\right] \omega_2(t), \tag{5}$$
with
$$\omega_i(t) = \frac{1}{N} \sum_{n=0}^{N-1} P_i(n, t), \qquad \gamma_i(t) = \frac{1}{(N-1)^2} \sum_{n=0}^{N-1} n\, P_i(n, t), \qquad i = 1, 2. \tag{6}$$
Since this is a zero-sum game, the second player's money is obtained from the relation $M_2(t) = -M_1(t)$.
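The right-hand side of (5) is easy to evaluate numerically for any pair of calling distributions. The sketch below is only an illustration: the function names `omega`, `gamma`, and `money_rate` are hypothetical, the prefactor of $\gamma_i$ is taken as $1/(N-1)^2$ exactly as printed in (6), and the two three-card distributions are made-up inputs (one player who always calls, one who calls only with the best card).

```python
def omega(p):
    """Average calling probability over the N cards, first relation in (6)."""
    return sum(p) / len(p)

def gamma(p):
    """Card-weighted calling probability with the 1/(N-1)^2 prefactor printed in (6)."""
    n_cards = len(p)
    return sum(n * p_n for n, p_n in enumerate(p)) / (n_cards - 1) ** 2

def money_rate(p1, p2):
    """Right-hand side of (5): mean-field drift of player 1's money."""
    return (1 - 2 * gamma(p2)) * omega(p1) - (1 - 2 * gamma(p1)) * omega(p2)

always_call = [1.0, 1.0, 1.0]     # made-up distribution: calls with every card
only_best_card = [0.0, 0.0, 1.0]  # made-up distribution: calls only with the top card

print(money_rate(always_call, only_best_card))  # drift for the player who always calls
print(money_rate(only_best_card, always_call))  # same magnitude, opposite sign
print(money_rate(always_call, always_call))     # identical strategies: zero drift
```

The antisymmetry under exchange of the two players is the mean-field counterpart of the zero-sum relation $M_2(t) = -M_1(t)$.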
Finally, the relation giving the temporal behaviour of the calling distribution $P_1(n, t)$ of player 1 (the one for the opponent being of analogous form) is
$$\dot{P}_1(n_1, t) = \frac{1}{N} P_1(n_1, t) \Big[\left(1 - \Pi_2^1 + \Pi_2^1 \mathcal{P}_2^1\right) \left[P_1(n_1, t) + \mu_1 \left(1 - P_1(n_1, t)\right)\right] + \mu_1 P_1(n_1, t)\, \Pi_2^1 \mathcal{P}_1^2\Big] + \frac{1 - P_1(n_1, t)}{N} P_1(n_1, t) - \frac{P_1(n_1, t)}{N}. \tag{7}$$
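One way to read (7), offered here as an interpretation rather than as the authors' own derivation, is as expected-value bookkeeping over a single hand. Write $P_1$ for $P_1(n_1, t)$, and let $w$ and $\ell$ (shorthand introduced only for this note) denote the probabilities that a call by player 1 wins or loses. With probability $1/N$ player 1 draws card $n_1$; he then calls with probability $P_1$ and updates by rule (1) or (2), or folds with probability $1 - P_1$ and keeps $P_1$. With probability $1 - 1/N$ a different card is drawn and $P_1(n_1)$ is untouched.

```latex
\begin{align*}
w &= 1 - \Pi^{1}_{2} + \Pi^{1}_{2}\,\mathcal{P}^{1}_{2},
\qquad
\ell = \Pi^{1}_{2}\,\mathcal{P}^{2}_{1},\\[4pt]
\mathbb{E}\big[P_1(n_1, t+1)\big]
  &= \frac{1}{N}\,P_1\Big[\,w\,\big(P_1 + \mu_1 (1 - P_1)\big) + \ell\,\mu_1 P_1\Big]
   + \frac{1}{N}\,(1 - P_1)\,P_1
   + \Big(1 - \frac{1}{N}\Big) P_1,\\[4pt]
\mathbb{E}\big[P_1(n_1, t+1)\big] - P_1
  &= \frac{1}{N}\,P_1\Big[\,w\,\big(P_1 + \mu_1 (1 - P_1)\big) + \ell\,\mu_1 P_1\Big]
   + \frac{1 - P_1}{N}\,P_1 - \frac{P_1}{N}.
\end{align*}
```

Under this reading, the last line is the right-hand side of (7): the bracket collects the outcomes of a call, the term proportional to $(1 - P_1) P_1$ accounts for a fold, and $-P_1/N$ is what remains after subtracting the old value.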
Now, (5), (6), and (7) are rather complicated, but some of their features can be determined without an explicit solution. Actually, it is straightforward to understand that we have
Now, since in our simulations we always started from the same initial $P_i(n)$ for all $i$, $n$ and from $M_1(0) = M_2(0) = 0$, this implies that for $\mu_1 = \mu_2$ both players must have on average the same calling distributions, and then they should neither gain nor lose money, apart from fluctuations: this is exactly what we found in Figures 5 and 6. It can also be shown that if $\mu_1 > \mu_2$, then we will soon have $P_1(n) \ge P_2(n)$ for all $n$ (with the equality holding only for $n = N-1$), so that (6) allows us to get $\dot{M}_1 > 0$, that is, the victory of player 1, as shown in Figures 1 to 4. Obviously, the opposite situation takes place for $\mu_1 < \mu_2$ (as shown in Figures 7 and 8).
Figure 7: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^4$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.7$. (Axes: Time, 0 to 10000 hands; Money, −2000 to 2000.)

Figure 8: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.7$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$. (Axes: $n$, 0 to 24; $P_i(n)$, 0 to 1.)
4. Conclusions and Perspectives
In this work, we have developed a model to describe, very roughly, the dynamics of a cognitive process. The model is among the simplest ones, but it captures the main behaviours of the real dynamics, demonstrating how fundamental bluffing is as a rational strategy.
First of all, this model confirms the prominent role of the LF ($\mu$). A direct connection between LF and bluffing tendency is straightforwardly detectable here. In fact, even though both agents tend to develop bluffing, the one with the greater LF bluffs more than the other, ending up as the winner. Finally, this last evidence suggests that bluffing must be an evolutionarily rational (stable) strategy.
Other, more realistic models have been developed [21], and they can surely give us deeper information about the inner mechanisms leading to bluffing behaviours in poker-like games, but the emergence of bluffing as a rational strategy already in such a simple toy model is without doubt a very remarkable result. Indeed, more in-depth analyses of the real game can be found in the literature [13], which systematically survey every facet of poker (blind, flop, raise, etc.), but this is the first time that an attempt has been made to capture its main features by means of a very “reductionist” approach, using the simplest version of the game. In short, if it has been thought until now that capturing an apparently complex behaviour such as bluffing, with effective outcomes, requires taking into account (almost) all the rules of poker, we have shown here that such an attitude is much more fundamental and emerges naturally when a few simple ingredients are present.
Moreover, for practical purposes, a simpler model, which can be more easily understood, turns out to be very useful because it also makes it easier to carry out and analyse experimental tests with real human agents, and hence it will be possible to plan other kinds of experimental tools more accurately. In particular, it will be straightforward to utilize a tool such as the one in [22] for a test with real players and to exploit data from online poker web sites for a statistical analysis of the outcomes of real poker games.
References
[1] L. A. Real, “Animal choice behavior and the evolution of cognitive architecture,” Science, vol. 253, no. 5023, pp. 980–986, 1991.
[2] L. A. Real, “Information processing and the evolutionary ecology of cognitive architecture,” American Naturalist, vol. 140, pp. S108–S145, 1992.
[3] E. Brandstatter, G. Gigerenzer, and R. Hertwig, “The priority heuristic: making choices without trade-offs,” Psychological Review, vol. 113, no. 2, pp. 409–432, 2006.
[4] J. R. Busemeyer and J. T. Townsend, “Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment,” Psychological Review, vol. 100, no. 3, pp. 432–459, 1993.
[5] I. Erev and G. Barron, “On adaptation, maximization, and reinforcement learning among cognitive strategies,” Psychological Review, vol. 112, no. 4, pp. 912–931, 2005.
[6] G. T. Fong, D. H. Krantz, and R. E. Nisbett, “The effects of statistical training on thinking about everyday problems,” Cognitive Psychology, vol. 18, no. 3, pp. 253–292, 1986.
[7] C. R. Fox and L. Hadar, “Decisions from experience = sampling error + prospect theory: reconsidering Hertwig, Barron, Weber & Erev,” Judgment and Decision Making, vol. 1, no. 2, pp. 159–161, 2006.
[8] R. Hau, T. J. Pleskac, and R. Hertwig, “Decisions from experience and statistical probabilities: why they trigger different choices than a priori probabilities,” Journal of Behavioral Decision Making, vol. 23, no. 1, pp. 48–68, 2010.
[9] C. Castellano, S. Fortunato, and V. Loreto, “Statistical physics of social dynamics,” Reviews of Modern Physics, vol. 81, no. 2, pp. 591–646, 2009.
[10] C. Camerer, Behavioral Game Theory: Experiments on Strategic Interaction, Princeton University Press, Princeton, NJ, USA, 2003.
[11] H. Gintis, The Bounds of Reason, Princeton University Press, Princeton, NJ, USA, 2009.
[12] I. Barany and Z. Furedi, “Mental poker with three or more players,” Information and Control, vol. 59, no. 1–3, pp. 84–93, 1983.
[13] D. Billings, A. Davidson, J. Schaeffer, and D. Szafron, “The challenge of poker,” Artificial Intelligence, vol. 134, no. 1-2, pp. 201–240, 2002.
[14] H. Johansen-Berg and V. Walsh, “Cognitive neuroscience: who to play at poker,” Current Biology, vol. 11, no. 7, pp. R261–R263, 2001.
[15] D. Koller and A. Pfeffer, “Representations and solutions for game-theoretic problems,” Artificial Intelligence, vol. 94, no. 1-2, pp. 167–215, 1997.
[16] A. Rapoport, I. Erev, E. V. Abraham, and D. E. Olson, “Randomization and adaptive learning in a simplified poker game,” Organizational Behavior and Human Decision Processes, vol. 69, no. 1, pp. 31–49, 1997.
[17] Z. Shen, “A study of a generalization of a card problem,” Applied Mathematics and Computation, vol. 166, no. 2, pp. 385–410, 2005.
[18] J. Miekisz, “Evolutionary game theory and population dynamics,” in Multiscale Problems in the Life Sciences, vol. 1940 of Lecture Notes in Mathematics, pp. 269–316, 2008.
[19] H. W. Kuhn, “A simplified two-person poker,” in Annals of Mathematics Studies, vol. 24, pp. 97–103, 1950.
[20] J. Von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ, USA, 1947.
[21] A. Guazzini and D. Vilone, The emergence of bluff in poker-like games [Ph.D. thesis], University of Florence, 2009.
[22] A. Guazzini, D. Vilone, F. Bagnoli, T. Carletti, and R. Lauro Grotto, “Cognitive network structure: an experimental study,” Advances in Complex Systems, vol. 15, no. 6, Article ID 1250084, 2012.
high performances in poker Moreover poker has a richhistory of study in several academic fields Economists andmathematicians have applied a variety of analytical tech-niques to poker-related problems [16ndash18] For example theearliest investigations in game theory by luminaries such asJohn von Neumann and John Nash used simplified poker toillustrate the fundamental principles
There is an important difference between board gamesand popular card games like bridge and poker In boardgames players have complete knowledge of the entire gamestate since everything is visible to all participants In contrastbridge and poker involve imperfect information since theother playersrsquo cards are not known
From a computational point of view it is important todistinguish the lack of information from the possibility ofchance moves The former involves uncertainty about thecurrent state of the world in particular situations wheredifferent players have access to different information Thelatter involves only uncertainty about the future uncertaintywhich is resolved as soon as the future materializes Perfectand imperfect information games may involve an element ofchance examples of games from all four categories are shownin Table 1
The presence of chance elements does not need majorchanges to the computational techniques used to solve agame In fact the cost of solving a perfect information gamewith chance moves is not substantially greater than solving agame with no chance moves By contrast the introductionof imperfect information increases the complexity of theproblem
Due to the complexity (both conceptual and algorithmic)of dealing with imperfect information games this problemhas been largely ignored at the computational level until theintroduction of randomized strategies concept
Once randomized strategies are allowed the existence ofldquooptimal strategiesrdquo in imperfect information games can beproved In particular this means that there exists an optimalrandomized strategy for poker in the same way as there existsan optimal deterministic strategy for chess Indeed Kuhnshowed for a simplified poker game that the optimal strategydoes use randomization [19]
The optimal strategy has several advantages the playercannot do better than this strategy if playing against a goodopponent furthermore the player does not do worse even ifhis strategy is revealed to his opponent that is the opponentgains no advantage fromfiguring out the first playerrsquos strategy
Another interesting result of such researches is theexistence of an optimal strategy for the gambler in pokergame As first observed in a simple poker-like game by Kuhn[19] behaviors such as bluffing that seem to arise from thepsychological makeup of human players are actually gametheoretically optimal
One of the earliest and most thorough investigations ofpoker appears in the classical treatise on game theory ldquoGamesand Economic Behaviorrdquo by von Neumann andMorgenstern[20] where a large section was devoted to the formal analysisof ldquobluffingrdquo in several simplified variants of a two-personpoker game with either symmetric or asymmetric informa-tion
Table 1
Perfect information Imperfect informationNo chance Chess Inspection gameChance Monopoly Poker
Indeed the general considerations concerning poker andthe mathematical discussions of the different versions of thegame were carried out by vonNeumann as early as 1926 Rec-ognizing that ldquobluffingrdquo in poker ldquois unquestionably practicedby all experienced playersrdquo von Neumann and Morgensternidentified two reasons for bluffing ldquoThe first is the desire togive a ldquofalserdquo impression of strength in ldquorealrdquo weakness thesecond is the desire to give a ldquofalserdquo impression of weaknessin ldquorealrdquo strengthrdquo [20]
Solutions to these simplified poker-like games as wellas a large class of both zero-sum and nonzero-sum gameswere unified by the concept of mixed strategy a probabilitydistribution over the playerrsquos set of actions The importanceof mixed strategies to the theory of games and its applicationsto the social and behavioral sciences stems from the fact thatfor many interactive decision processes there can be no Nashequilibria in pure strategies
Using randomization and adaptive learning as key con-cepts tomodelize into a computational scaffolding of the cog-nitive processes we believe that this area of research is morelikely to produce insights about superior cognitive strategiesbecause of their intrinsically structures Finally comparingthemwith the real human strategy it is possible both to inves-tigate the role of environmental factors on cognitive strategiesdevelopment and to validate some theoretical psychologicalassumptions
21 Summary The target of this paper is to show how bluffingstrategies can arise naturally as a mathematical property of avery simple model without any references to psychologicalassessments For this reason we present a simple model ofpoker-like game with only two players and only two possiblestrategies folding and calling which each agent assumessimultaneously Such oversimplified game as we will seeallows to catch the fundamental mechanisms underlyingthe phenomenon of bluffing The fact that bluffing naturallyemerges already in a very simple version of the game seemsto suggest that such a strategy is perfectly rational and can bemathematically characterized
3 The Model
The model we are going to define and analyse is probably one of the greatest simplifications possible of a game of chance with imperfect information.
Here we have two players. At the beginning of each hand of the game, they put one coin each as the entry pot. Then they each pick a "card" from a pack: each card has an integer value between 0 and $N-1$ (i.e., there are $N$ cards overall). At this point, according to the value of their card, the players decide to call or instead to fold. If both players call, they each put another coin in the pot, and whoever holds the higher card wins: the winner gets the entire pot (four coins). If one of the players folds, the "caller" wins and gets the entry pot (two coins). Finally, if nobody calls, both players take back the coin they had put in as entry pot. Mathematically, when player $i$ holds the card of value $n$, he decides to call according to the probability distribution $P_i(n)$, with of course $i = 1, 2$ and $n \in \{0, 1, \ldots, N-1\}$. After every hand, both players update their strategy. More precisely, if one folds, nothing happens; if agent $i$, holding the card $n$, calls and wins (because he has the higher card or because the opponent folds), the probability that he calls holding the card $n$ changes in this way:
$$P_i(n) \longrightarrow P_i(n) + \mu_i \left[1 - P_i(n)\right] \qquad (1)$$
Analogously, if agent $i$ loses (i.e., if he calls but the opponent has a higher card than his), the probability $P_i(n)$ changes instead in the following way:
$$P_i(n) \longrightarrow \mu_i P_i(n) \qquad (2)$$
In (1) and (2), the coefficient $\mu_i$ is the learning factor (LF), which can also be seen as a sort of risk propensity of player $i$. The LFs of the players are set at the beginning of the game and never change. Moreover, each LF can assume a value between 0 (no risk propensity at all) and 1 (maximum risk propensity possible): a player with $\mu_i = 0$ and the card $n$ does not increase $P_i(n)$ even though he wins, and sets $P_i(n) = 0$ as soon as he loses; instead, with $\mu_i = 1$, he sets $P_i(n) = 1$ when he wins but does not decrease $P_i(n)$ if he loses. Finally, it is easy to notice that (1) and (2) ensure that $P_i(n)$ always stays in the interval $[0, 1]$, since (1) returns a weighted average of $P_i(n)$ and 1, while (2) multiplies $P_i(n)$ by a factor not larger than 1.
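The rules of a single hand and the updates (1) and (2) can be sketched in Python as follows. This is a minimal illustration, not the code used for the simulations reported in this paper. It assumes that the two cards are drawn without replacement from the same pack and that gains are counted net of the coins each player has staked (so a showdown is worth two coins and an uncontested call one coin). The function names are hypothetical.

```python
import random


def update_after_win(p, mu):
    """Rule (1): move the calling probability toward 1 by a fraction mu."""
    return p + mu * (1.0 - p)


def update_after_loss(p, mu):
    """Rule (2): shrink the calling probability by the factor mu."""
    return mu * p


def play_hand(P1, P2, mu1, mu2, rng):
    """Play one hand, update P1 and P2 in place, return player 1's net gain."""
    N = len(P1)
    n1, n2 = rng.sample(range(N), 2)      # assumption: two distinct cards
    call1 = rng.random() < P1[n1]         # player 1 calls with probability P1[n1]
    call2 = rng.random() < P2[n2]         # player 2 calls with probability P2[n2]
    if call1 and call2:                   # showdown: the pot holds four coins
        if n1 > n2:
            P1[n1] = update_after_win(P1[n1], mu1)
            P2[n2] = update_after_loss(P2[n2], mu2)
            return 2
        P1[n1] = update_after_loss(P1[n1], mu1)
        P2[n2] = update_after_win(P2[n2], mu2)
        return -2
    if call1:                             # only player 1 calls: he takes the entry pot
        P1[n1] = update_after_win(P1[n1], mu1)
        return 1
    if call2:                             # only player 2 calls
        P2[n2] = update_after_win(P2[n2], mu2)
        return -1
    return 0                              # both fold: the coins are returned
```

A player who folds leaves his calling probability untouched, as prescribed by the model, so only the cards that are actually played are ever updated.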
3.1 Numerical Results. In this section we present the most remarkable results of the simulations of the simple model defined previously.
First of all, for simplicity, we set the LF $\mu_1$ of "player 1" equal to 0.5, and then we checked the dynamics by varying $\mu_2$: it is the difference between the LFs which essentially determines the main features of the dynamics, as we saw in several simulations. In particular, we can distinguish three cases: $\mu_2 < \mu_1$, $\mu_2 = \mu_1$, and $\mu_2 > \mu_1$.
3.1.1 Case $\mu_2 < \mu_1$. In Figures 1 and 2, the behaviour of the money of both players is shown as a function of time, where the time unit is a single hand of the game. We set $N = 25$, with $\mu_2 = 0.3$ and $\mu_2 = 0.48$, respectively, and the results are averaged over $10^4$ and $10^5$ iterations, respectively; in both cases we let the agents play $10^4$ hands. For simplicity, we considered players with an infinite amount of money available, and we set the initial amount to zero. Additionally, the initial calling distributions $P_i(n)$ are picked randomly for each $n$.
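A setup of this kind can be sketched in Python as below. It is an illustrative reimplementation under stated assumptions, not the original simulation code: cards are drawn without replacement, initial calling probabilities are uniform random numbers in $[0, 1)$, money is counted as net gain, and the number of matches is meant to be kept far smaller than the $10^4$ or $10^5$ iterations quoted above so that the sketch runs quickly. The names `simulate_match` and `average_final_money` are hypothetical.

```python
import random


def simulate_match(N, mu1, mu2, hands, rng):
    """Return player 1's money after every hand and the final distributions."""
    # Random initial calling probabilities P[i][n] for players i = 0, 1.
    P = [[rng.random() for _ in range(N)] for _ in range(2)]
    mu = (mu1, mu2)
    money1, history = 0, []
    for _ in range(hands):
        cards = rng.sample(range(N), 2)           # assumption: distinct cards
        calls = [rng.random() < P[i][cards[i]] for i in range(2)]
        if calls[0] and calls[1]:                 # showdown
            winner = 0 if cards[0] > cards[1] else 1
            loser = 1 - winner
            w, l = cards[winner], cards[loser]
            P[winner][w] += mu[winner] * (1 - P[winner][w])   # rule (1)
            P[loser][l] *= mu[loser]                          # rule (2)
            money1 += 2 if winner == 0 else -2
        elif calls[0] or calls[1]:                # exactly one player calls
            winner = 0 if calls[0] else 1
            w = cards[winner]
            P[winner][w] += mu[winner] * (1 - P[winner][w])   # rule (1)
            money1 += 1 if winner == 0 else -1
        history.append(money1)                    # both fold: nothing changes
    return history, P


def average_final_money(N, mu1, mu2, hands, matches, seed=0):
    """Average of player 1's final money over independent matches."""
    rng = random.Random(seed)
    finals = [simulate_match(N, mu1, mu2, hands, rng)[0][-1]
              for _ in range(matches)]
    return sum(finals) / matches
```

For instance, `average_final_money(25, 0.5, 0.3, hands=10_000, matches=20)` averages the final money of player 1 over twenty matches of $10^4$ hands each; since the game is zero-sum, the money of player 2 is the opposite.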
As can be seen, the first player, with the higher LF, wins over his opponent with the smaller LF, and the money gained by player 1 increases with time. Moreover, the smaller $\mu_2$, the faster and bigger the winnings of player 1. Figures 1 and 2 show that, on average, the player with the bigger risk propensity finally overwhelms the other one; this means that in a single match the most risk-inclined player has a bigger probability to win, and such probability increases as the difference $\mu_1 - \mu_2$ increases. The fact that taking risks is convenient is confirmed by Figures 3 and 4, where the final calling distributions for both players are depicted.
Figure 1: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^4$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.3$.
Figure 2: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^5$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.48$.
As we can see, in both cases the winner is characterized by higher calling probabilities than his opponent's for every value of $n$. Moreover, we have $P_i(n = 0) > 0$, which is the most explicit evidence of the emergence of bluffing.
3.1.2 Case $\mu_2 = \mu_1$. In this case, both players have the same winning probability, as shown in Figure 5: indeed, having the same LF, they also have exactly the same behaviour, so that in a single match nobody is able to definitively overwhelm the opponent.
Figure 3: Calling probabilities $P_i(n)$ as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.3$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$.
Figure 4: Calling probabilities $P_i(n)$ as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.48$). Data taken after $10^4$ hands of the game and averaged over $10^5$ iterations. Random initial distribution for every $P_i(n)$.
Naturally, it is also easy to forecast the behaviour of the final calling probabilities of the agents: they will be equal, with $P_i(n = 0) > 0$ for both $i = 1$ and $i = 2$.
3.1.3 Case $\mu_2 > \mu_1$. In this case, the results are qualitatively equal to those of the case $\mu_2 < \mu_1$, only now it is player 2 who defeats player 1, as shown in Figures 7 and 8.
3.2 Discussion. The first conclusion we can draw just from the numerical results is that in this game bluffing necessarily emerges as a rational strategy. Moreover, the player who bluffs more finally wins. This can be easily understood by looking at Figures 3, 4, 6, and 8. Indeed, while in general the calling probabilities of both players tend to have the same value for $n \to N-1$, for small $n$ the player with the higher LF, that is, with the higher risk propensity, bluffs much more than his opponent: this means that, even holding a poor card, the "risk-lover" will call and, unless his opponent has a very good card, will get the entry pot.
Figure 5: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, throughout a single match. The LFs of the players are here $\mu_1 = \mu_2 = 0.5$.
Figure 6: Calling probabilities $P_i(n)$ as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.5 = \mu_1$). Data taken after $10^4$ hands of the game and averaged over $2.5 \cdot 10^4$ iterations. Random initial distribution for every $P_i(n)$.
It is possible to formalize such considerations by writing the equations of the dynamics for the model at hand. Neglecting fluctuations, the "mean-field" equation ruling the money $M_1$ won (or lost) by the first player is
$$M_1(t+1) = M_1(t) + \sum_{n_1=0}^{N-1} \frac{1}{N} \left[ \left(1 - \Pi^1_2\right) P_1(n_1, t) - \Pi^1_2 \left(1 - P_1(n_1, t)\right) + 2 P_1(n_1, t)\, \Pi^1_2 \left(\mathcal{P}^1_2 - \mathcal{P}^2_1\right) \right] \qquad (3)$$
where $\mathcal{P}^1_2 = \Pr(n_1 > n_2)$ is the probability that the card $n_1$ held by player 1 is higher than the card $n_2$ held by player 2; analogously, $\mathcal{P}^2_1 = \Pr(n_2 > n_1)$. On the other hand, $\Pi^1_2$ is an operator defined as follows:
$$\Pi^1_2 \cdot X = \sum_{n_2=0}^{N-1} \left[ \frac{P_2(n_2, t)}{N-1} \left(1 - \delta_{n_2 n_1}\right) X \right] \qquad (4)$$
and represents the probability that player 2 calls from the point of view of player 1. Equation (3) can be rewritten as a differential equation in time, which can assume the form
$$\dot{M}_1(t) = \left[1 - 2\gamma_2(t)\right] \omega_1(t) - \left[1 - 2\gamma_1(t)\right] \omega_2(t) \qquad (5)$$
with
$$\omega_i(t) = \frac{1}{N} \sum_{n=0}^{N-1} P_i(n, t), \qquad \gamma_i(t) = \frac{1}{(N-1)^2} \sum_{n=0}^{N-1} n\, P_i(n, t), \qquad i = 1, 2 \qquad (6)$$
Since this is a zero-sum game, the second player's money is obtained from the relation $M_2(t) = -M_1(t)$.
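The right-hand sides of (3) and (5) can be evaluated numerically for given calling distributions, as in the following Python sketch. It is only an illustration: the operator in (4) is implemented as an average over the $N - 1$ cards the opponent can hold, weighted by his calling probability, and $\mathcal{P}^1_2 - \mathcal{P}^2_1$ is read as $+1$ or $-1$ according to which card is higher once both cards are fixed. The function names are hypothetical, and no claim is made here on how closely the two expressions agree in general.

```python
def expected_gain(P1, P2):
    """Bracketed sum in (3): player 1's expected net gain in one hand."""
    N = len(P1)
    total = 0.0
    for n1 in range(N):
        others = [n2 for n2 in range(N) if n2 != n1]   # Kronecker delta in (4)
        # Operator (4) applied to 1: probability that player 2 calls.
        call2 = sum(P2[n2] for n2 in others) / (N - 1)
        # Operator (4) applied to the difference of the two card-order terms.
        showdown = sum(P2[n2] * (1 if n1 > n2 else -1) for n2 in others) / (N - 1)
        total += ((1 - call2) * P1[n1]
                  - call2 * (1 - P1[n1])
                  + 2 * P1[n1] * showdown) / N
    return total


def omega(P):
    """First quantity in (6): average calling probability."""
    return sum(P) / len(P)


def gamma(P):
    """Second quantity in (6), with the normalisation as printed there."""
    N = len(P)
    return sum(n * p for n, p in enumerate(P)) / (N - 1) ** 2


def money_rate(P1, P2):
    """Right-hand side of (5)."""
    return (1 - 2 * gamma(P2)) * omega(P1) - (1 - 2 * gamma(P1)) * omega(P2)
```

Both expressions vanish when the two players have identical calling distributions, and both give one coin per hand in the limiting case in which player 1 always calls and player 2 always folds.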
Finally, the relation giving the temporal behaviour of the calling distribution $P_1(n, t)$ of player 1 (the one for the opponent being of analogous form) is
$$\dot{P}_1(n_1, t) = \frac{1}{N} P_1(n_1, t) \Big[ \left(1 - \Pi^1_2 + \Pi^1_2 \mathcal{P}^1_2\right) \left[P_1(n_1, t) + \mu_1 \left(1 - P_1(n_1, t)\right)\right] + \mu_1 P_1(n_1, t)\, \Pi^1_2 \mathcal{P}^2_1 \Big] + \frac{1 - P_1(n_1, t)}{N} P_1(n_1, t) - \frac{P_1(n_1, t)}{N} \qquad (7)$$
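As an illustration, the right-hand side of (7) can be evaluated for a single card with the short Python sketch below. The operator terms are read as in (4): $1 - \Pi^1_2 + \Pi^1_2 \mathcal{P}^1_2$ is taken as the probability that player 1 wins when he calls (the opponent folds, or calls with a lower card), and $\Pi^1_2 \mathcal{P}^2_1$ as the probability that he loses. The function name `p1_drift` is hypothetical.

```python
def p1_drift(P1, P2, mu1, n1):
    """Right-hand side of (7) for the card n1 held by player 1."""
    N = len(P1)
    others = [n2 for n2 in range(N) if n2 != n1]
    # Probability that player 2 calls, given that he cannot hold card n1.
    call2 = sum(P2[n2] for n2 in others) / (N - 1)
    # Probability that player 2 calls with a card lower / higher than n1.
    call2_lower = sum(P2[n2] for n2 in others if n2 < n1) / (N - 1)
    call2_higher = sum(P2[n2] for n2 in others if n2 > n1) / (N - 1)
    p = P1[n1]
    win = 1 - call2 + call2_lower          # player 1 calls and wins
    lose = call2_higher                    # player 1 calls and loses
    return (p * (win * (p + mu1 * (1 - p)) + mu1 * p * lose)
            + (1 - p) * p - p) / N
```

With this reading, the bracket reduces to $P_1 \left[\mu_1 w (1 - P_1) - (1 - \mu_1)(1 - w) P_1\right]$, where $w$ is the winning probability, so that for $\mu_1 = 0.5$ the drift vanishes at $P_1(n_1) = w$. For the lowest card, $w$ is just the probability that the opponent folds, which suggests why $P_1(n = 0)$ can settle at a nonzero value whenever the opponent sometimes folds.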
Now, (5), (6), and (7) are rather complicated, but some features of them can be determined without an explicit solution. Actually, it is straightforward to understand that we have
Now, since in our simulations we always started from the same initial $P_i(n)$ for all $i$, $n$ and from $M_1(0) = M_2(0) = 0$, this implies that for $\mu_1 = \mu_2$ both players must have on average the same calling distributions, and then they should neither gain nor lose money, apart from fluctuations: this is exactly what we found in Figures 5 and 6. It can also be shown that if $\mu_1 > \mu_2$, then we will soon have $P_1(n) \ge P_2(n)$ for all $n$ (with the equality holding only for $n = N - 1$), so that (6) allows us to get $\dot{M}_1 > 0$, that is, the victory of player 1, as shown in Figures 1 to 4. Obviously, the opposite situation takes place for $\mu_1 < \mu_2$ (as shown in Figures 7 and 8).
Figure 7: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^4$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.7$.
Figure 8: Calling probabilities $P_i(n)$ as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.7$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$.
4 Conclusions and Perspectives
In this work we have developed a model to describe, very roughly, the dynamics of a cognitive process. The model is among the simplest ones, but it allows us to capture the main behaviours of the real dynamics, demonstrating how fundamental bluffing is as a rational strategy.
First of all, this model confirms the prominent role of the LF ($\mu$). A direct connection between LF and bluffing tendency is straightforwardly detectable here. In fact, even though both agents tend to develop bluffing, the one with the greater LF bluffs more than the other, ending up as the winner. Finally, this last evidence suggests that bluffing must be an evolutionarily rational (stable) strategy.
Other, more realistic models have been developed [21], and they can surely give us deeper information about the inner mechanisms leading to bluffing behaviours in poker-like games, but the emergence of bluffing as a rational strategy already in such a simple toy model is undoubtedly a very remarkable result. Indeed, more in-depth analyses of the real game can be found in the literature [13], which systematically survey every facet of poker (blind, flop, raise, etc.), but this is the first time that an attempt has been made to capture its main features by means of a very "reductionist" approach, that is, by means of the simplest version of the game. In short, if it has been thought until now that, to capture an apparently complex behaviour such as bluffing with effective outcomes, it is necessary to take into account (almost) all the rules of poker, we have shown here that such an attitude is much more fundamental and emerges naturally when a few simple ingredients are present.
Moreover, for practical purposes, a simpler model which can be more easily understood turns out to be very useful, because it also makes the realization and the analysis of experimental tests with real human agents easier, and hence it will be possible to plan other kinds of experimental tools more accurately. In particular, it will be straightforward to utilize a tool such as the one in [22] for a test with real players and to exploit data from online poker web sites for a statistical analysis of the outcomes of real poker games.
References
[1] L. A. Real, "Animal choice behavior and the evolution of cognitive architecture," Science, vol. 253, no. 5023, pp. 980–986, 1991.
[2] L. A. Real, "Information processing and the evolutionary ecology of cognitive architecture," American Naturalist, vol. 140, pp. S108–S145, 1992.
[3] E. Brandstatter, G. Gigerenzer, and R. Hertwig, "The priority heuristic: making choices without trade-offs," Psychological Review, vol. 113, no. 2, pp. 409–432, 2006.
[4] J. R. Busemeyer and J. T. Townsend, "Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment," Psychological Review, vol. 100, no. 3, pp. 432–459, 1993.
[5] I. Erev and G. Barron, "On adaptation, maximization, and reinforcement learning among cognitive strategies," Psychological Review, vol. 112, no. 4, pp. 912–931, 2005.
[6] G. T. Fong, D. H. Krantz, and R. E. Nisbett, "The effects of statistical training on thinking about everyday problems," Cognitive Psychology, vol. 18, no. 3, pp. 253–292, 1986.
[7] C. R. Fox and L. Hadar, "Decisions from experience = sampling error + prospect theory: reconsidering Hertwig, Barron, Weber & Erev," Judgment and Decision Making, vol. 1, no. 2, pp. 159–161, 2006.
[8] R. Hau, T. J. Pleskac, and R. Hertwig, "Decisions from experience and statistical probabilities: why they trigger different choices than a priori probabilities," Journal of Behavioral Decision Making, vol. 23, no. 1, pp. 48–68, 2010.
[9] C. Castellano, S. Fortunato, and V. Loreto, "Statistical physics of social dynamics," Reviews of Modern Physics, vol. 81, no. 2, pp. 591–646, 2009.
[10] C. Camerer, Behavioral Game Theory: Experiments on Strategic Interaction, Princeton University Press, Princeton, NJ, USA, 2003.
[11] H. Gintis, The Bounds of Reason, Princeton University Press, Princeton, NJ, USA, 2009.
[12] I. Barany and Z. Furedi, "Mental poker with three or more players," Information and Control, vol. 59, no. 1–3, pp. 84–93, 1983.
[13] D. Billings, A. Davidson, J. Schaeffer, and D. Szafron, "The challenge of poker," Artificial Intelligence, vol. 134, no. 1-2, pp. 201–240, 2002.
[14] H. Johansen-Berg and V. Walsh, "Cognitive neuroscience: who to play at poker," Current Biology, vol. 11, no. 7, pp. R261–R263, 2001.
[15] D. Koller and A. Pfeffer, "Representations and solutions for game-theoretic problems," Artificial Intelligence, vol. 94, no. 1-2, pp. 167–215, 1997.
[16] A. Rapoport, I. Erev, E. V. Abraham, and D. E. Olson, "Randomization and adaptive learning in a simplified poker game," Organizational Behavior and Human Decision Processes, vol. 69, no. 1, pp. 31–49, 1997.
[17] Z. Shen, "A study of a generalization of a card problem," Applied Mathematics and Computation, vol. 166, no. 2, pp. 385–410, 2005.
[18] J. Miekisz, "Evolutionary game theory and population dynamics," in Multiscale Problems in the Life Sciences, vol. 1940 of Lecture Notes in Mathematics, pp. 269–316, 2008.
[19] H. W. Kuhn, "A simplified two-person poker," in Annals of Mathematics Studies, vol. 24, pp. 97–103, 1950.
[20] J. Von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ, USA, 1947.
[21] A. Guazzini and D. Vilone, The emergence of bluff in poker-like games [Ph.D. thesis], University of Florence, 2009.
[22] A. Guazzini, D. Vilone, F. Bagnoli, T. Carletti, and R. Lauro Grotto, "Cognitive network structure: an experimental study," Advances in Complex Systems, vol. 15, no. 6, Article ID 1250084, 2012.
wins the winner gets the entire pot (four coins) If one ofthe players folds the ldquocallerrdquo wins and gets the entry pot (twocoins) Finally if nobody calls both players take back the cointhey had put as entry pot Mathematically when the player 119894holds the card of value 119899 he decides to call according to theprobability distribution 119875
119894(119899) with of course 119894 = 1 2 and 119899 isin
0 1 119873 minus 1 After every hand both players update theirstrategy More precisely if one folds nothing happens if theagent 119894 thanks to the card 119899 calls andwins (because he has thehighest card or because the opponent folds) the probabilitythat he calls holding the card 119899 will change in this way
119875119894 (119899) 997888rarr 119875
119894 (119899) + 120583119894 [1 minus 119875119894 (119899)] (1)
Analogously if the agent 119894 loses (ie if he calls but theopponent has a higher card than him) the probability 119875
119894(119899)
will change instead in the following way
119875119894 (119899) 997888rarr 120583
119894119875119894 (119899) (2)
In (1) and (2) the coefficient 120583119894is the learning factor (LF)
which can also be seen as a sort of risk propensity of theplayer 119894 The LFs of the players are set at the beginning ofthe game and will never change Moreover it can assume avalue between 0 (no risk propensity at all) and 1 (maximumrisk propensity possible) actually a player with 120583
119894= 0 and
the card 119899 does not increase 119875119894(119899) even though he wins and
sets 119875119894(119899) = 0 as soon as he loses instead with 120583
119894= 1 he
sets 119875119894(119899) = 1 when he wins but does not decrease 119875
119894(119899) if he
loses Finally it is easy to notice that (1) and (2) ensure that119875119894(119899) will always stay in the interval [0 1]
31 Numerical Results In this section we will present themost remarkable results of the simulations of the simplemodel defined previously
First of all for simplicity we set the LF 1205831of the ldquoplayer 1rdquo
equal to 05 and thenwe checked the dynamics by varying1205832
actually it is the difference between the LFs which essentiallydetermines the main features of the dynamics as we saw inseveral simulations In particular we can distinguish threecases 120583
2lt 1205831 1205832= 1205831 and 120583
2gt 1205831
311 Case 1205832lt 1205831 In Figures 1 and 2 the behaviour of the
money of both players is shown as a function of time wherethe time unit is a single hand of the game we set119873 = 25 120583
2=
03 and 1205832= 048 respectively and the results are averaged
over 104 and 105 iterations respectively in both cases welet the agents play 104 hands For simplicity we consideredplayers with an infinite amount of money available and wegauged to zero the initial amount Additionally the initialcalling distributions 119875
119894(119899) are picked randomly for each 119899
As it can be seen the first player with higher LF winsover his opponent with smaller LF and the money gainedby player 1 increases with time Moreover the smaller 120583
2 the
faster and bigger the winnings of player 1 Previous figuresshow that on average the player with bigger risk propensityfinally overwhelms the other one and this means that ina single match the most risk-inclined player has a biggerprobability to win and such probability increases as thedifference 120583
1minus1205832increases in its turnThe fact that risking is
Time0 2000 4000 6000 8000 10000
0
1000
2000
3000
Mon
ey
minus3000
minus2000
minus1000
1205831 = 05
1205832 = 03
Figure 1 Behaviour of the money won (or lost) by the two playersversus time that is versus the hands played averaged over 104iterated matches The LFs of the players are here 120583
1= 05 and
1205832= 03
Time0 2000 4000 6000 8000 10000
1205831 = 05
1205832 = 048
0
100
200
300Mon
ey
minus200
minus300
minus100
Figure 2 Behaviour of the money won (or lost) by the two playersversus time that is versus the hands played averaged over 105iterated matches The LFs of the players are here 120583
1= 05 and
1205832= 048
convenient gets confirmed in the next figures where the finalcalling distributions for both players are depicted
As we can see in both cases the winner is characterizedby higher calling probabilities than his opponentrsquos ones forevery value of 119899 Moreover we have 119875
119894(119899 = 0) gt 0 which is
the most explicit evidence of the emergence of bluffing
312 Case 1205832= 1205831 In this case both players have the
same winning probability as it is well shown in Figure 5indeed having the same LF they have also exactly the samebehaviours so that in a single match nobody is able to over-whelm definitively the opponent
4 Journal of Complex Systems
0 4 8 12 16 20 24
1
08
06
04
02
0
119899
119875119894(119899)
1205831 = 05
1205832 = 03
Figure 3 Calling probabilities as functions of the card value 119899 forboth players Black symbols player 1 (120583
1= 05) red symbols player
2 (1205832= 03) Data took after 104 hands of the game and averaged
over 104 iterations Random initial distribution for every 119875119894(119899)
0 4 8 12 16 20 24
1
08
06
04
02
0
119899
119875119894(119899)
1205831 = 05
1205832 = 048
Figure 4 Calling probabilities as functions of the card value 119899 forboth players Black symbols player 1 (120583
1= 05) red symbols player
2 (1205832= 048) Data took after 104 hands of the game and averaged
over 105 iterations Random initial distribution for every 119875119894(119899)
Naturally it is also easy to forecast the behaviour of thefinal calling probabilities of the agents theywill be equal with119875119894(119899 = 0) gt 0 for both 119894 = 1 and 119894 = 2
313 Case 1205832gt 1205831 In this case the results are qualitatively
equal to the ones of case 1205832gt 1205831 only now it is player 2 which
defeats player 1 as shown in Figures 7 and 8
32 Discussion The first conclusion we can obtain just fromthe numerical results is that in this game bluffing emergesnecessarily as rational strategy Moreover the player whobluffsmore finally winsThis can be easily understood watch-ing Figures 3 4 6 and 8 Indeed while in general the calling
Time0 2000 4000 6000 8000 10000
0
100
200
Mon
ey
minus100
minus200
1205831 = 05
1205832 = 05
Figure 5 Behaviour of the money won (or lost) by the two playersversus time that is versus the hands played throughout a singlematch The LFs of the players are here 120583
1= 1205832= 05
0 4 8 12 16 20 24
1
08
06
04
02
0
119899
119875119894(119899)
1205831 = 05
1205832 = 05
Figure 6 Calling probabilities as functions of the card value 119899 forboth players Black symbols player 1 (120583
1= 05) red symbols player
2 (1205832= 05 = 120583
1) Data took after 104 hands of the game and averaged
over 25 sdot 104 iterations Random initial distribution for every 119875119894(119899)
probabilities of both players tend to have the same value for119899 rarr 119873minus1 for small 119899 the player with higher LF that is withhigher risk propensity bluffs much more than his opponentthis means that even holding a poor card the ldquorisk-loverrdquo willcall and unless his opponent has a very good point will get theentry pot
It is possible to formalize such considerations by writingthe equations of the dynamics for themodel at stake Neglect-ing fluctuations the ldquomean-fieldrdquo equation ruling the money1198721won (or lost) by the first player is
1198721 (119905 + 1) = 1198721 (119905) +
119873minus1
sum
1198991=0
1
119873[(1 minus Π
1
2) 1198751(1198991 119905)
Journal of Complex Systems 5
minus Π1
2(1 minus 119875
1(1198991 119905))
+21198751(1198991 119905) Π1
2(P1
2minusP2
1)]
(3)
where P12= Pr(119899
1gt 1198992) is the probability that the card 119899
1
held by player 1 is higher than the card 1198992held by player 2
analogously it isP21= Pr(119899
2gt 1198991) On the other hand Π1
2is
an operator defined as follows
Π1
2sdot 119883 =
119873minus1
sum
1198992=0
[1198752(1198992 119905)
119873 minus 1(1 minus 120575
11989921198991
)119883] (4)
and represents the probability that player 2 calls from thepoint of view of player 1 Equation (3) can be rewritten as adifferential equation in time which can assume the form
1 (119905) = [1 minus 21205742 (119905)] 1205961 (119905) minus [1 minus 21205741 (119905)] 1205962 (119905) (5)
with
120596119894 (119905) =
1
119873
119873minus1
sum
119899=0
119875119894 (119899 119905) 119894 = 1 2
120574119894 (119905) =
1
(119873 minus 1)2
119873minus1
sum
119899=0
119899119875119894 (119899 119905) 119894 = 1 2
(6)
Since this is a zero-sum game second playerrsquos money willbe obtained by the relation119872
2(119905) = minus119872
1(119905)
Finally the relation giving the temporal behaviour of thecalling distributions 119875
1(119899 119905) of player 1 (being the one of the
opponent of analogous form) is
1(1198991 119905) =
1
1198731198751(1198991 119905) [(1 minus Π
1
2+ Π1
2P1
2)
times [1198751(1198991 119905) + 120583
1(1 minus 119875
1(1198991 119905))]
+12058311198751(1198981 119905) Π1
2P2
1]
+1 minus 1198751(1198991 119905)
1198731198751(1198991 119905) minus
1198751(1198991 119905)
119873
(7)
Now (5) (6) and (7) are rather complicated but somefeatures of them can be determined without an explicit solu-tion Actually it is straightforward to understand that we have
Now since in our simulations we always started from thesame initial 119875
119894(119899) for all 119894 119899 and from119872
1(0) = 119872
2(0) = 0
this implies that for 1205831= 1205832both players must have on
average the same calling distributions and then they shouldnot gain nor lose money apart from fluctuations this isexactly what we found in Figures 5 and 6 It can also be shownthat if 120583
1gt 1205832 then we will have soon 119875
1(119899) ge 119875
2(119899) for all
119899 (with the equality holding only for 119899 = 119873 minus 1) so that (6)allow us to get
1gt 0 that is the victory of player 1 as shown
in Figures 1 to 4 Obviously the opposite situation takes placefor 1205831lt 1205832(as shown in Figures 7 and 8)
0
1000
2000
Mon
ey
minus2000
minus1000
Time0 2000 4000 6000 8000 10000
1205831 = 05
1205832 = 07
Figure 7 Behaviour of the money won (or lost) by the two playersversus time that is versus the hands played averaged over 104iterated matches The LFs of the players are here 120583
1= 05 and
1205832= 07
0 4 8 12 16 20 24
1
08
06
04
02
0
119899
119875119894(119899)
1205831 = 05
1205832 = 07
Figure 8 Calling probabilities as functions of the card value 119899 forboth players Black symbols player 1 (120583
1= 05) red symbols player
2 (1205832= 07) Data took after 104 hands of the game and averaged
over 104 iterations Random initial distribution for every 119875119894(119899)
4 Conclusions and Perspectives
In this work we have developed a model to describe veryroughly a cognitive processes dynamicsThemodel is amongthe most simple ones but allows to capture the main behav-iours of real dynamics demonstrating how fundamental isbluffing as a rational strategy
First of all this model confirms the role of LF (120583) asprominent In a straightforward way a direct connectionbetween LF and bluffing tendency is here detectable In facteven though both agents tend to develop bluff the one withthe greatest LF bluffs more than the other ending up as
6 Journal of Complex Systems
the winner Finally this last evidence suggests that bluffingmust be an evolutionary rational (stable) strategy
Other more realistic models have been developed [21]and for sure they can give us deeper information about theinner mechanisms driving to bluffing behaviours in poker-like games but the emergence of bluff as a rational strategyalready in a so simple toy model is with no doubt a veryremarkable result Indeed more in-depth analyses of the realgame can be found in the literature [13] which systematicallysurvey every facet of poker (blind flop raise etc) but thisis the first time that it is tried to catch the main features of itby means of a very ldquoreductionistrdquo approach by means of thesimplest version of the game Shortly if it has been thoughtuntil now that to catch an apparently complex behaviourbluffing with effective outcomes it is necessary to take intoaccount (almost) all the rules of poker we have shown herethat such attitude is much more fundamental and emergesnaturally when few simple ingredients are present
Moreover for practical purposes a simpler model whichcan be more easily understood turns out to be very usefulbecause it will result easier also in the realization and theanalysis of experimental tests with real human agents andhence it will be possible to plan more accurately other kindsof experimental tools In particular it will be straightforwardto utilize a tool as in reference [22] for a test with real playersand to exploit data frompoker-onlineweb sites for a statisticalanalysis of the outcomes of real poker games
References
[1] L A Real ldquoAnimal choice behavior and the evolution of cogni-tive architecture rdquo Science vol 253 no 5023 pp 980ndash986 1991
[2] L. A. Real, "Information processing and the evolutionary ecology of cognitive architecture," American Naturalist, vol. 140, pp. S108–S145, 1992.
[3] E. Brandstatter, G. Gigerenzer, and R. Hertwig, "The priority heuristic: making choices without trade-offs," Psychological Review, vol. 113, no. 2, pp. 409–432, 2006.
[4] J. R. Busemeyer and J. T. Townsend, "Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment," Psychological Review, vol. 100, no. 3, pp. 432–459, 1993.
[5] I. Erev and G. Barron, "On adaptation, maximization, and reinforcement learning among cognitive strategies," Psychological Review, vol. 112, no. 4, pp. 912–931, 2005.
[6] G. T. Fong, D. H. Krantz, and R. E. Nisbett, "The effects of statistical training on thinking about everyday problems," Cognitive Psychology, vol. 18, no. 3, pp. 253–292, 1986.
[7] C. R. Fox and L. Hadar, "Decisions from experience = sampling error + prospect theory: reconsidering Hertwig, Barron, Weber & Erev," Judgment and Decision Making, vol. 1, no. 2, pp. 159–161, 2006.
[8] R. Hau, T. J. Pleskac, and R. Hertwig, "Decisions from experience and statistical probabilities: why they trigger different choices than a priori probabilities," Journal of Behavioral Decision Making, vol. 23, no. 1, pp. 48–68, 2010.
[9] C. Castellano, S. Fortunato, and V. Loreto, "Statistical physics of social dynamics," Reviews of Modern Physics, vol. 81, no. 2, pp. 591–646, 2009.
[10] C. Camerer, Behavioral Game Theory: Experiments on Strategic Interaction, Princeton University Press, Princeton, NJ, USA, 2003.
[11] H. Gintis, The Bounds of Reason, Princeton University Press, Princeton, NJ, USA, 2009.
[12] I. Barany and Z. Furedi, "Mental poker with three or more players," Information and Control, vol. 59, no. 1–3, pp. 84–93, 1983.
[13] D. Billings, A. Davidson, J. Schaeffer, and D. Szafron, "The challenge of poker," Artificial Intelligence, vol. 134, no. 1-2, pp. 201–240, 2002.
[14] H. Johansen-Berg and V. Walsh, "Cognitive neuroscience: who to play at poker," Current Biology, vol. 11, no. 7, pp. R261–R263, 2001.
[15] D. Koller and A. Pfeffer, "Representations and solutions for game-theoretic problems," Artificial Intelligence, vol. 94, no. 1-2, pp. 167–215, 1997.
[16] A. Rapoport, I. Erev, E. V. Abraham, and D. E. Olson, "Randomization and adaptive learning in a simplified poker game," Organizational Behavior and Human Decision Processes, vol. 69, no. 1, pp. 31–49, 1997.
[17] Z. Shen, "A study of a generalization of a card problem," Applied Mathematics and Computation, vol. 166, no. 2, pp. 385–410, 2005.
[18] J. Miekisz, "Evolutionary game theory and population dynamics," in Multiscale Problems in the Life Sciences, vol. 1940 of Lecture Notes in Mathematics, pp. 269–316, 2008.
[19] H. W. Kuhn, "A simplified two-person poker," in Annals of Mathematics Studies, vol. 24, pp. 97–103, 1950.
[20] J. Von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ, USA, 1947.
[21] A. Guazzini and D. Vilone, The emergence of bluff in poker-like games [Ph.D. thesis], University of Florence, 2009.
[22] A. Guazzini, D. Vilone, F. Bagnoli, T. Carletti, and R. Lauro Grotto, "Cognitive network structure: an experimental study," Advances in Complex Systems, vol. 15, no. 6, Article ID 1250084, 2012.
Figure 3: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.3$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$.

[Figure 4 plot: $P_i(n)$ (vertical axis, 0 to 1) versus $n$ (horizontal axis, 0 to 24); legend: $\mu_1 = 0.5$, $\mu_2 = 0.48$.]

Figure 4: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.48$). Data taken after $10^4$ hands of the game and averaged over $10^5$ iterations. Random initial distribution for every $P_i(n)$.
Naturally, it is also easy to forecast the behaviour of the final calling probabilities of the agents: they will be equal, with $P_i(n = 0) > 0$ for both $i = 1$ and $i = 2$.
3.1.3 Case $\mu_2 > \mu_1$. In this case the results are qualitatively the same as those of the case $\mu_1 > \mu_2$, only now it is player 2 who defeats player 1, as shown in Figures 7 and 8.
3.2 Discussion. The first conclusion we can draw just from the numerical results is that in this game bluffing necessarily emerges as a rational strategy. Moreover, the player who bluffs more eventually wins. This can be easily understood by looking at Figures 3, 4, 6, and 8. Indeed, while in general the calling probabilities of both players tend to have the same value for $n \to N - 1$, for small $n$ the player with the higher LF, that is, with the higher risk propensity, bluffs much more than his opponent: this means that, even holding a poor card, the "risk-lover" will call and, unless his opponent has a very good hand, will get the entry pot.

[Figure 5 plot: money (vertical axis, −200 to 200) versus time (horizontal axis, 0 to 10000 hands); legend: $\mu_1 = 0.5$, $\mu_2 = 0.5$.]

Figure 5: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, throughout a single match. The LFs of the players are here $\mu_1 = \mu_2 = 0.5$.

[Figure 6 plot: $P_i(n)$ (vertical axis, 0 to 1) versus $n$ (horizontal axis, 0 to 24); legend: $\mu_1 = 0.5$, $\mu_2 = 0.5$.]

Figure 6: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.5 = \mu_1$). Data taken after $10^4$ hands of the game and averaged over $2.5 \cdot 10^4$ iterations. Random initial distribution for every $P_i(n)$.
It is possible to formalize such considerations by writing the equations of the dynamics for the model at stake. Neglecting fluctuations, the "mean-field" equation ruling the money $M_1$ won (or lost) by the first player is

$$
M_1(t+1) = M_1(t) + \sum_{n_1=0}^{N-1} \frac{1}{N} \Big[ \big(1 - \Pi^1_2\big) P_1(n_1, t) - \Pi^1_2 \big(1 - P_1(n_1, t)\big) + 2 P_1(n_1, t)\, \Pi^1_2 \big(\mathcal{P}^1_2 - \mathcal{P}^2_1\big) \Big], \tag{3}
$$
where $\mathcal{P}^1_2 = \Pr(n_1 > n_2)$ is the probability that the card $n_1$ held by player 1 is higher than the card $n_2$ held by player 2; analogously, $\mathcal{P}^2_1 = \Pr(n_2 > n_1)$. On the other hand, $\Pi^1_2$ is an operator defined as follows:

$$
\Pi^1_2 \cdot X = \sum_{n_2=0}^{N-1} \left[ \frac{P_2(n_2, t)}{N - 1} \big(1 - \delta_{n_2 n_1}\big) X \right] \tag{4}
$$
and represents the probability that player 2 calls from the point of view of player 1. Equation (3) can be rewritten as a differential equation in time, which can assume the form

$$
\dot{M}_1(t) = [1 - 2\gamma_2(t)]\, \omega_1(t) - [1 - 2\gamma_1(t)]\, \omega_2(t) \tag{5}
$$

with

$$
\omega_i(t) = \frac{1}{N} \sum_{n=0}^{N-1} P_i(n, t), \qquad
\gamma_i(t) = \frac{1}{(N-1)^2} \sum_{n=0}^{N-1} n\, P_i(n, t), \qquad i = 1, 2. \tag{6}
$$
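As a rough numerical illustration of (5) and (6), the following Python sketch evaluates $\omega_i$, $\gamma_i$, and the right-hand side of (5) for two given calling distributions. The function names, the deck size $N = 25$, and the two example distributions are assumptions made only for illustration; they are not taken from the simulations.

```python
from typing import Sequence


def omega(p: Sequence[float]) -> float:
    """Mean calling probability: omega_i in (6)."""
    n_cards = len(p)
    return sum(p) / n_cards


def gamma(p: Sequence[float]) -> float:
    """Card-weighted calling probability: gamma_i in (6)."""
    n_cards = len(p)
    return sum(n * p_n for n, p_n in enumerate(p)) / (n_cards - 1) ** 2


def money_rate(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Right-hand side of (5): mean-field rate of change of M_1."""
    return (1 - 2 * gamma(p2)) * omega(p1) - (1 - 2 * gamma(p1)) * omega(p2)


# Hypothetical calling distributions for N = 25 cards (n = 0, ..., 24).
N_CARDS = 25
# Player who often calls even with poor cards (illustrative "bluffer").
p_bluffer = [0.6 + 0.4 * n / (N_CARDS - 1) for n in range(N_CARDS)]
# Player whose calling probability grows linearly from 0 (illustrative).
p_cautious = [n / (N_CARDS - 1) for n in range(N_CARDS)]

print("rate, bluffer as player 1:", money_rate(p_bluffer, p_cautious))
print("rate, identical players:  ", money_rate(p_cautious, p_cautious))
```

With these illustrative inputs the rate is positive for the player who calls more often with poor cards, it vanishes for identical distributions, and swapping the two distributions simply flips its sign.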
Since this is a zero-sum game, the second player's money is obtained from the relation $M_2(t) = -M_1(t)$.
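The zero-sum property can be checked directly on the one-hand term of (3). The Python sketch below enumerates all pairs of distinct cards and averages player 1's gain for fixed calling distributions. The payoffs used here (one unit when only one player calls, two units at a showdown, nothing when both fold) are an assumption read off from the coefficients of (3), and the function name and example inputs are hypothetical.

```python
from typing import Sequence


def expected_gain(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Expected money won by player 1 in one hand, by exact enumeration.

    p1[n] and p2[n] are the probabilities of calling with card n.
    Assumed payoffs for player 1: +1 if only player 1 calls, -1 if only
    player 2 calls, +2 or -2 if both call (higher card wins), 0 otherwise.
    """
    n_cards = len(p1)
    total = 0.0
    for n1 in range(n_cards):
        for n2 in range(n_cards):
            if n1 == n2:
                # The two players never hold the same card (delta in (4)).
                continue
            call1, call2 = p1[n1], p2[n2]
            showdown = 1 if n1 > n2 else -1
            total += (
                call1 * (1 - call2)             # only player 1 calls
                - (1 - call1) * call2           # only player 2 calls
                + 2 * call1 * call2 * showdown  # both call
            )
    # Each ordered pair of distinct cards is equally likely.
    return total / (n_cards * (n_cards - 1))


# Hypothetical distributions on a 25-card deck, for illustration only.
p_a = [0.6 + 0.4 * n / 24 for n in range(25)]
p_b = [n / 24 for n in range(25)]

gain_a = expected_gain(p_a, p_b)  # p_a seated as player 1
gain_b = expected_gain(p_b, p_a)  # roles exchanged
print("sum of the two gains:", gain_a + gain_b)
```

Exchanging the roles of the two distributions changes the sign of the result, so the two gains cancel up to rounding, which is the relation $M_2(t) = -M_1(t)$ at the level of a single hand.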
Finally, the relation giving the temporal behaviour of the calling distribution $P_1(n, t)$ of player 1 (that of the opponent being of analogous form) is

$$
\begin{aligned}
\dot{P}_1(n_1, t) = {} & \frac{1}{N} P_1(n_1, t) \Big[ \big(1 - \Pi^1_2 + \Pi^1_2 \mathcal{P}^1_2\big) \times \big[ P_1(n_1, t) + \mu_1 \big(1 - P_1(n_1, t)\big) \big] \\
& + \mu_1 P_1(n_1, t)\, \Pi^1_2 \mathcal{P}^2_1 \Big] + \frac{1 - P_1(n_1, t)}{N} P_1(n_1, t) - \frac{P_1(n_1, t)}{N}.
\end{aligned} \tag{7}
$$
Now (5), (6), and (7) are rather complicated, but some of their features can be determined without an explicit solution. Actually, it is straightforward to understand that we have
Now, since in our simulations we always started from the same initial $P_i(n)$ for all $i$, $n$ and from $M_1(0) = M_2(0) = 0$, this implies that for $\mu_1 = \mu_2$ both players must have, on average, the same calling distributions, and then they should neither gain nor lose money, apart from fluctuations: this is exactly what we found in Figures 5 and 6. It can also be shown that if $\mu_1 > \mu_2$, then we will soon have $P_1(n) \ge P_2(n)$ for all $n$ (with the equality holding only for $n = N - 1$), so that equations (6) allow us to get $\dot{M}_1 > 0$, that is, the victory of player 1, as shown in Figures 1 to 4. Obviously, the opposite situation takes place for $\mu_1 < \mu_2$ (as shown in Figures 7 and 8).
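The statement for equal calling distributions can be verified in one line from (5) and (6). As a short worked step, suppose $P_1(n, t) = P_2(n, t) = P(n, t)$ for all $n$; the shorthand $\omega$ and $\gamma$ for the common values is introduced here only for this check:

$$
\begin{aligned}
\omega_1(t) = \omega_2(t) &= \omega(t), \qquad \gamma_1(t) = \gamma_2(t) = \gamma(t), \\
\dot{M}_1(t) &= [1 - 2\gamma(t)]\, \omega(t) - [1 - 2\gamma(t)]\, \omega(t) = 0 .
\end{aligned}
$$

Together with $M_1(0) = 0$, this gives $M_1(t) = 0$ on average for as long as the two distributions coincide, so any money won or lost in this case comes from fluctuations only.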
[Figure 7 plot: money (vertical axis, −2000 to 2000) versus time (horizontal axis, 0 to 10000 hands); legend: $\mu_1 = 0.5$, $\mu_2 = 0.7$.]

Figure 7: Behaviour of the money won (or lost) by the two players versus time, that is, versus the hands played, averaged over $10^4$ iterated matches. The LFs of the players are here $\mu_1 = 0.5$ and $\mu_2 = 0.7$.

[Figure 8 plot: $P_i(n)$ (vertical axis, 0 to 1) versus $n$ (horizontal axis, 0 to 24); legend: $\mu_1 = 0.5$, $\mu_2 = 0.7$.]

Figure 8: Calling probabilities as functions of the card value $n$ for both players. Black symbols: player 1 ($\mu_1 = 0.5$); red symbols: player 2 ($\mu_2 = 0.7$). Data taken after $10^4$ hands of the game and averaged over $10^4$ iterations. Random initial distribution for every $P_i(n)$.
4 Conclusions and Perspectives
In this work we have developed a model that describes, very roughly, the dynamics of a cognitive process. The model is among the simplest possible, yet it captures the main behaviours of the real dynamics, demonstrating how fundamental bluffing is as a rational strategy.

First of all, this model confirms the prominent role of the LF ($\mu$). A direct connection between the LF and the tendency to bluff is straightforwardly detectable here. In fact, even though both agents tend to develop bluffing, the one with the greater LF bluffs more than the other, ending up as the winner. Finally, this last evidence suggests that bluffing must be an evolutionarily rational (stable) strategy.
Other, more realistic models have been developed [21], and they can surely give us deeper information about the inner mechanisms leading to bluffing behaviour in poker-like games, but the emergence of bluffing as a rational strategy already in such a simple toy model is undoubtedly a very remarkable result. Indeed, more in-depth analyses of the real game can be found in the literature [13], which systematically survey every facet of poker (blind, flop, raise, etc.), but this is the first time that an attempt has been made to capture its main features by means of a very "reductionist" approach, that is, by means of the simplest version of the game. In short, if it has been thought until now that capturing an apparently complex behaviour such as bluffing, with effective outcomes, requires taking into account (almost) all the rules of poker, we have shown here that such an attitude is much more fundamental and emerges naturally when a few simple ingredients are present.
Moreover, for practical purposes, a simpler model, which can be more easily understood, turns out to be very useful, because it also makes it easier to realize and analyze experimental tests with real human agents, and hence it will be possible to plan other kinds of experimental tools more accurately. In particular, it will be straightforward to utilize a tool such as that in [22] for a test with real players, and to exploit data from online poker web sites for a statistical analysis of the outcomes of real poker games.