## Abstract

This article studies correlated two-person games constructed from games with independent players as proposed in Iqbal *et al.* (2016 *R. Soc. open sci.* **3**, 150477. (doi:10.1098/rsos.150477)). The games are played in a collective manner, both in a two-dimensional lattice where the players interact with their neighbours, and with players interacting at random. Four game types are scrutinized in iterated games where the players are allowed to change their strategies, adopting that of their best paid mate neighbour. Particular attention is paid in the study to the effect of a variable degree of correlation on Nash equilibrium strategy pairs.

## 1. Introduction

This paper considers the four two-person (*A* and *B*), 2×2 non-zero-sum game types defined by the pay-off matrices given in table 1, namely the Prisoner’s Dilemma (PD), the Hawk–Dove (HD), the Samaritan’s Dilemma (SD) and the Battle of the Sexes (BOS), whose interpretation is described below.

In the PD game, both players may choose either to cooperate (*C*) or to defect (*D*). Mutual cooperators each score the *reward* *R*; mutual defectors score the *punishment* *P*; and *D* scores the *temptation* *T* against *C*, who scores *S* (*sucker*’s pay-off) in such an encounter. In the PD, it is *T*>*R*>*P*>*S*. In this study, the PD pay-off values will be *T*=5, *R*=3, *P*=2 and *S*=1. The PD with these pay-offs will be referred to as PD(5,3,2,1).

In the HD game, the structure of the pay-off matrices is similar to that in the PD, but in the HD it is *P*<*S* instead of *P*>*S* as in the PD. In this study, the HD pay-off values will be *T*=3, *R*=2, *S*=0 and *P*=−1. The HD with these pay-offs will be referred to as HD(3,2,0,−1).

In the SD game, the charity player A may choose Aid/No Aid, whereas the beneficiary player B may choose Work/Loaf. The Samaritan’s dilemma arises in the act of charity. The charity player wants to help (Aid) people in need. However, the beneficiary may simply rely on the handout (Loaf) rather than try to improve their situation (Work). This is not anticipated by the charity player. Many people may have experienced this dilemma when confronted with people in need. Although there is a desire to help them, there is the recognition that a handout may be harmful to the long-run interests of the recipient [1–5]. Following the references [6–8], we adopt here the pay-off matrices given in its corresponding panel in table 1, and the SD with these pay-offs will be referred to as SD(3,2,1,−1).

In the so-called BOS game, the rewards *R*>*r*>0 quantify the preferences in a conventional couple fitting the traditional stereotypes: the male player A prefers to attend a *F*ootball match, whereas the female player B prefers to attend a *B*allet performance. Both players hope to *coordinate* their choices, but the *conflict* is also present because their preferred activities differ [9,10]. In this study, the BOS pay-off values will be *R*=5, *r*=1. The BOS with these pay-offs will be referred to as BOS(5,1).

The PD and HD games are symmetric, i.e. the pay-off matrices of both players coincide after transposition, whereas the SD and the BOS games are not symmetric. In symmetric games, the roles of both players are interchangeable, whereas in asymmetric games every player has to be studied separately. This issue is to be taken into account all across this study, but particularly in §5.

This paper studies the game types under scrutiny interacting in a collective manner, either with players connected in a spatially structured manner (§3) or with players randomly connected (§4). Collective games on networks have long been studied [11,12]. The novelty of this study lies in the consideration of the mechanism for correlating independent strategies given in [13], and contextualized here in §2.

## 2. Independent players and correlated games

In the canonical approach to game theory, both players choose their strategies independently of each other. In an alternative approach, an external (probabilistic) mechanism sends a signal to each player, so that, in principle, the players do not have any active role. Both approaches, as well as a mechanism for combining them, are featured in this section.

### 2.1 Games with independent players

In conventional games, both players decide independently their probabilistic strategies **x**=(*x*,1−*x*)′ and **y**=(*y*,1−*y*)′, which give rise to the joint probability distribution **Π**=**xy**′. As a result, in a game with **P**_{A} and **P**_{B} pay-off matrices, the expected pay-offs (*p*) of both players are (⊙ indicates element-by-element matrix multiplication, 1′=(1,1)):

*p*_{A}=**x**′**P**_{A}**y**=1′(**P**_{A}⊙**Π**)1, *p*_{B}=**x**′**P**_{B}**y**=1′(**P**_{B}⊙**Π**)1. (2.1)

The strategy pair (**x**, **y**), referred to here as (*x*, *y*), is in Nash equilibrium (NE) if *x* is the best response to *y* and *y* is the best response to *x*. In the PD game, mutual defection, i.e. *x**=*y**=0, is the only pair of strategies in NE. The HD game has three strategy pairs in NE: two of them are given by the pure strategies (*x**=1,*y**=0≡(*D*,*H*)) and (*x**=0,*y**=1≡(*H*,*D*)), whereas the third NE is achieved with mixed strategies, which in the particular case of the HD(3,2,0,−1) considered here are *x**=*y**=1/2, leading to *p*_{A,B}=1. Note that (*x**=*y**=0≡(*H*,*H*)) is not in NE in the HD game. The SD game has only one NE, achieved with mixed strategies, which in the particular case of the SD(3,2,1,−1) considered here is (*x**=1/2, *y**=1/5), leading to *p*_{A}=−0.2, *p*_{B}=1.5. The BOS game has three strategy pairs in NE: two of them are given by the pure strategies (*x**=*y**=1≡(*F*,*F*)) and (*x**=*y**=0≡(*B*,*B*)), whereas the third NE is achieved with the mixed strategies (*x**=*R*/(*R*+*r*),*y**=*r*/(*R*+*r*)), leading to *p*_{A,B}=*Rr*/(*R*+*r*)<*r*.
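The mixed-strategy equilibria quoted above can be checked numerically with a minimal sketch. Table 1 is not reproduced in the text, so the pay-off matrices below are assumptions chosen to agree with the pay-off values given in §1 (rows are the choices of player A, columns those of player B, and the first row and column carry the probabilities *x* and *y*); the names `GAMES`, `payoffs` and `mixed_ne` are hypothetical helpers.

```python
import numpy as np

# ASSUMED pay-off matrices (table 1 is not reproduced in the text).
# Each entry is (P_A, P_B); row = choice of A, column = choice of B.
GAMES = {
    "HD":  (np.array([[2, 0], [3, -1]]),  np.array([[2, 3], [0, -1]])),
    "SD":  (np.array([[3, -1], [-1, 0]]), np.array([[2, 3], [1, 0]])),
    "BOS": (np.array([[5, 0], [0, 1]]),   np.array([[1, 0], [0, 5]])),
}

def payoffs(PA, PB, x, y):
    """Expected pay-offs 1'(P ⊙ Π)1 with the factorizable Π = x y'."""
    joint = np.outer([x, 1 - x], [y, 1 - y])
    return float((PA * joint).sum()), float((PB * joint).sum())

def mixed_ne(PA, PB):
    """Interior NE of a 2x2 game from the two indifference conditions."""
    # y* makes player A indifferent between his two rows
    y = (PA[1, 1] - PA[0, 1]) / (PA[0, 0] - PA[0, 1] - PA[1, 0] + PA[1, 1])
    # x* makes player B indifferent between her two columns
    x = (PB[1, 1] - PB[1, 0]) / (PB[0, 0] - PB[0, 1] - PB[1, 0] + PB[1, 1])
    return float(x), float(y)

for name, (PA, PB) in GAMES.items():
    x, y = mixed_ne(PA, PB)
    print(name, (x, y), payoffs(PA, PB, x, y))
```

With these assumed matrices the sketch should return *x**=*y**=1/2 for the HD, (*x**=1/2, *y**=1/5) for the SD and (*R*/(*R*+*r*), *r*/(*R*+*r*)) for the BOS.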

Social welfare (SW) functions may be envisaged as summarizing some particular conception of the *common good* [14]. In its simplest form, SW solutions maximize the sum of the pay-offs of both players. In the games studied here, only (1,1) is the SW solution in the HD(3,2,0,−1) and the SD(3,2,1,−1); in the PD(5,3,2,1), (1,1), (1,0), (0,1) are SW solutions, although only (1,1) is pay-offs balanced; in the BOS(5,1), both (1,1) and (0,0) are SW solutions.
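The SW solutions listed above can be enumerated with a short sketch that simply maximizes the sum of both pay-offs over the four pure strategy pairs. The matrices are again assumptions consistent with the pay-off values of §1 (table 1 is not reproduced here), and `sw_solutions` is a hypothetical helper name.

```python
import numpy as np

# ASSUMED pay-off matrices (P_A, P_B); row = choice of A, column = choice of B,
# index 0 is the strategy played with probability x (or y) equal to 1.
GAMES = {
    "PD":  (np.array([[3, 1], [5, 2]]),   np.array([[3, 5], [1, 2]])),
    "HD":  (np.array([[2, 0], [3, -1]]),  np.array([[2, 3], [0, -1]])),
    "SD":  (np.array([[3, -1], [-1, 0]]), np.array([[2, 3], [1, 0]])),
    "BOS": (np.array([[5, 0], [0, 1]]),   np.array([[1, 0], [0, 5]])),
}

def sw_solutions(PA, PB):
    """Pure strategy pairs (x, y) that maximize the sum of both pay-offs."""
    total = PA + PB
    best = total.max()
    return sorted((1 - i, 1 - j)
                  for i in range(2) for j in range(2)
                  if total[i, j] == best)

for name, (PA, PB) in GAMES.items():
    print(name, sw_solutions(PA, PB))
```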

### 2.2 Correlated games

In a different game scenario, that of correlated games, an *external* probability distribution **Π** gives the expected pay-offs *p*_{A}(**Π**)=1′(**P**_{A}⊙**Π**)1, *p*_{B}(**Π**)=1′(**P**_{B}⊙**Π**)1.

Non-factorizable **Π** may be generated from independent strategies (*x*, *y*) as with the ad hoc method based on an external parameter *k*∈[0,1] given in [13] (equations (2.2)).

Equations (2.3) give the values of the elements of **Π** from equations (2.2) for three relevant values of *k*. Note that *k*=1 interchanges the *k*=0 values of *π*_{12} and *π*_{21}, whereas those of *π*_{11} and *π*_{22} remain unaltered. Also relevant is that if *x*=*y*=1/2, **Π** is uniform (all its elements equal to 1/4) for *k*=0 and *k*=1, but not for *k*=1/2. Equations (2.4) give the values of the elements of **Π** from equations (2.2) for relevant values of *x* and *y*.
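The properties just listed can be explored with a minimal sketch of a *k*-correlated joint distribution. The functional form in `joint_k` is an assumption: it is reconstructed so as to reproduce the pay-off expressions quoted later in this article (for instance the KSD pay-offs and the thresholds *k*^{★}=0.25 and *k*^{•}=0.707 of the KPD), and it is not copied from equations (2.2) or from [13].

```python
import numpy as np

def joint_k(x, y, k):
    """Joint distribution for strategies (x, y) and correlation k in [0, 1].

    ASSUMED form, reconstructed to match the pay-offs quoted in the article.
    """
    pi11 = (2 * k - 1) ** 2 * x * y
    pi12 = (1 - k) * x * (1 - y) + k * (1 - x) * y
    pi21 = k * x * (1 - y) + (1 - k) * (1 - x) * y
    pi22 = (1 - x) * (1 - y) + 4 * k * (1 - k) * x * y
    return np.array([[pi11, pi12], [pi21, pi22]])

def payoffs(PA, PB, x, y, k):
    """Expected pay-offs 1'(P ⊙ Π)1 of players A and B."""
    joint = joint_k(x, y, k)
    return float((PA * joint).sum()), float((PB * joint).sum())

# k = 0 recovers independent players; k = 1 swaps the off-diagonal elements.
print(joint_k(0.3, 0.8, 0.0))
print(joint_k(0.3, 0.8, 1.0))
# x = y = 1/2: uniform for k = 0 and k = 1, but not for k = 1/2.
print(joint_k(0.5, 0.5, 0.5))
```

Under this assumed form every element stays in [0,1] and the four elements sum to 1 for any *x*, *y* and *k* in [0,1].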

From the joint probabilities given in equations (2.4), the pay-offs in a KPD(5,3,2,1) with pure strategies and *x*=*y*=0.5 are given in the following equations, and plotted in figure 3*a*:

Figure 1 shows the best responses to pure strategies in the KPD(5,3,2,1). Figure 1*a*,*b* prove, respectively, that the strategy pairs (0,1) and (1,0) are in NE in the (*k*^{★},*k*^{•}) interval. Both the *k*^{★}-threshold and the *k*^{•}-threshold are achieved at intersections of pure-strategy pay-off curves.
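As an illustration of how such thresholds arise, the following sketch locates the two crossings numerically. It relies on two assumptions that are not taken from the text: the pay-off matrix layout for player A, and the form of the correlated joint distribution in `joint_k`, which was reconstructed to be consistent with the pay-off expressions quoted in this article. The helper `crossing` is a plain bisection.

```python
import numpy as np

PA = np.array([[3, 1], [5, 2]])  # PD(5,3,2,1) for player A (ASSUMED layout: rows C, D)

def joint_k(x, y, k):
    """ASSUMED k-correlated joint distribution (not copied from equations (2.2))."""
    return np.array([
        [(2 * k - 1) ** 2 * x * y, (1 - k) * x * (1 - y) + k * (1 - x) * y],
        [k * x * (1 - y) + (1 - k) * (1 - x) * y,
         (1 - x) * (1 - y) + 4 * k * (1 - k) * x * y],
    ])

def p_a(x, y, k):
    """Expected pay-off of player A."""
    return float((PA * joint_k(x, y, k)).sum())

def crossing(f, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for the single sign change of f in [lo, hi]."""
    sign_lo = f(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# x = 0 stops being the best response of player A to y = 0 at k_star ...
k_star = crossing(lambda k: p_a(1, 0, k) - p_a(0, 0, k))
# ... and x = 1 becomes his best response to y = 1 at k_dot.
k_dot = crossing(lambda k: p_a(1, 1, k) - p_a(0, 1, k))
print(round(k_star, 3), round(k_dot, 3))  # 0.25 0.707
```

Because the PD is symmetric, the same two crossings apply to player B with the roles of *x* and *y* interchanged.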

From the joint probabilities given in equations (2.4), the pay-offs in a KHD(3,2,−1,0) with pure strategies and *x*=*y*=0.5 are given in the following equations, and plotted in figure 4*a*.

From the joint probabilities given in equations (2.4), the pay-offs in a KSD(3,2,−1,0) with pure strategies and *x*=*y*=0.5 are given in the following equations, and plotted in figure 6*a*.

In the KSD(3,2,−1,0) it is, *p*_{A}=((3(2*k*−1)^{2}+2)*y*−1)*x*−*y*, *p*_{B}=((2(2*k*−1)^{2}−4)*x*+1+2*k*)*y*+(3−2*k*)*x*. Consequently, the strategy pairs in NE in the KSD(3,2,−1,0) are given in (2.8), where the threshold *k**=0.89 emerges from the *x*≤1 restraint applied to *x*. Before *k**, it is

Figure 2*a* shows the strategies and pay-offs in NE in a KSD(3,2,−1,0) for variable *k*. Figure 2*b* shows pay-offs in a KSD(3,2,−1,0) with (*x*=0.5,*y*=0.2). It is remarkable that the pay-offs in the latter scenario do not differ very much from those in NE, particularly in the case of *p*_{A}.
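The NE of the KSD for variable *k* can be sketched directly from the two pay-off expressions quoted above, which are linear in *x* and in *y* respectively. The indifference conditions and the clipping at *x*=1 are worked out here as an illustration; the function names are hypothetical and the block is not the code used to produce figure 2.

```python
import math

def p_a(x, y, k):
    """Pay-off of the charity player A, as quoted in the text."""
    return ((3 * (2 * k - 1) ** 2 + 2) * y - 1) * x - y

def p_b(x, y, k):
    """Pay-off of the beneficiary player B, as quoted in the text."""
    return ((2 * (2 * k - 1) ** 2 - 4) * x + 1 + 2 * k) * y + (3 - 2 * k) * x

def ne(k):
    """NE from the indifference conditions, clipped at x = 1."""
    x = (1 + 2 * k) / (4 - 2 * (2 * k - 1) ** 2)  # makes B indifferent in y
    y = 1 / (3 * (2 * k - 1) ** 2 + 2)            # makes A indifferent in x
    if x >= 1:
        # beyond the threshold x = 1 and y = 1 are mutual best responses
        return 1.0, 1.0
    return x, y

# x* reaches 1 when 8k^2 - 6k - 1 = 0, i.e. at the root lying in [0, 1]
K_THRESHOLD = (3 + math.sqrt(17)) / 8

for k in (0.0, 0.5, K_THRESHOLD, 1.0):
    x, y = ne(k)
    print(round(k, 3), round(x, 3), round(y, 3),
          round(p_a(x, y, k), 3), round(p_b(x, y, k), 3))
```

At *k*=0 the sketch recovers the NE of the uncorrelated SD, and the value of `K_THRESHOLD` agrees with the threshold *k**=0.89 stated above.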

From the joint probabilities given in equations (2.4), the pay-offs in a KBOS(5,1) with pure strategies and *x*=*y*=0.5 are given in the following equations, and plotted in figure 7*a*.

Iqbal *et al.* [13] give a second method of constructing non-factorizable **Π** from independent strategies (*x*, *y*). It departs from the fact that in factorizable **Π** it is *π*_{11}=*xy*, *π*_{12}=*x*−*π*_{11}, *π*_{21}=*y*−*π*_{11}, *π*_{22}=1+*π*_{11}−(*x*+*y*). Then, it is proposed just to alter the form of *π*_{11}=*xy*, maintaining those of the other three elements of **Π** as functions of *π*_{11}. It is concluded in [13] that *π*_{11}(*x*,*y*)<*xy* is the only restriction to be imposed on *π*_{11}(*x*,*y*) in order to make sure that all the elements of **Π** are in the [0,1] interval and sum to 1.0. The authors propose *π*_{11}=(*xy*)^{2} and *π*_{11}=*x*^{2}*y*^{3} as examples. But *π*_{11}(*x*,*y*)<*xy* does not suffice to make sure that *π*_{22}=1+*π*_{11}−(*x*+*y*) is non-negative. To prove this, let us consider the particular case of *x*=*y*, i.e. *π*_{22}=1+*π*_{11}−2*x*: with *π*_{11}=*x*^{4}, *π*_{22} is negative if 0.544<*x*<1.0, and with *π*_{11}=*x*^{5}, *π*_{22} is negative if 0.519<*x*<1.0.
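The onset of negative *π*_{22} along the diagonal *x*=*y* can be located with a short bisection. This is only an illustrative check of the argument above; `pi22` and `negativity_onset` are hypothetical helper names.

```python
def pi22(x, power):
    """pi_22 = 1 + pi_11 - 2x on the diagonal x = y, with pi_11 = x**power."""
    return 1 + x ** power - 2 * x

def negativity_onset(power, lo=0.0, hi=0.9, tol=1e-12):
    """Smallest x at which pi_22 turns negative (pi22 > 0 at lo, < 0 at hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pi22(mid, power) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# pi_11 = (xy)^2 gives x^4 on the diagonal; pi_11 = x^2 y^3 gives x^5.
for power in (4, 5):
    print(power, round(negativity_onset(power), 3))
```

Since *π*_{22} vanishes at *x*=1 and is negative just below it for both exponents, the negative range extends from the computed onset up to *x*=1.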

## 3. Spatial games

In the spatial version of the two-person games we deal with, each player occupies a site (*i*,*j*) in a two-dimensional *N*×*N* lattice. The *A* and *B* players alternate in the site occupation in a chessboard form, so that every player is surrounded by four partners (*A*-*B*, *B*-*A*) and four mates (*A*-*A*, *B*-*B*). The game is played in the cellular automata (CA) manner, i.e. with uniform, local and synchronous interactions [15]. In this way, every player (*i*,*j*) plays with his four adjacent partners, so that his pay-off at time step *T* is the sum of the pay-offs of these four encounters. Then, every player (*i*,*j*) adopts the probabilities of the mate player (*k*,*l*) with the highest pay-off among his mate neighbours. Table 2 shows a simple example with the PD(5,3,2,1) game where initially every player cooperates (*x*=*y*=1), except the defector (*x*=0) player A located in the (3,4) cell. Thus at *T*=1, the defector player A gets the *p*=20 pay-off instead of the common *p*=12 pay-off. The imitation mechanism spreads the *x*_{A}=0 defection across the player A cells, whereas player B cooperation remains unaltered as no player B defects.
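One synchronous round of this scheme may be sketched as follows for uncorrelated players (*k*=0). Several details are assumptions made for the illustration and are not stated in the text: the pay-off matrix layout, the parity that places player A in cell (3,4), the inclusion of the player himself among the candidates for imitation, and the tie-breaking rule. The sketch is not the Fortran code of the article.

```python
import numpy as np

PA = np.array([[3, 1], [5, 2]])  # PD(5,3,2,1) for player A (ASSUMED layout)
PB = PA.T                        # symmetric game: B's matrix is the transpose

def expected(P, x, y):
    """Pay-off of independent strategies (k = 0), vectorised over the lattice."""
    return (P[0, 0] * x * y + P[0, 1] * x * (1 - y)
            + P[1, 0] * (1 - x) * y + P[1, 1] * (1 - x) * (1 - y))

def step(q, is_a):
    """Play the four adjacent partners, then imitate the best paid among
    the four diagonal mates and oneself (periodic boundaries)."""
    pay = np.zeros_like(q, dtype=float)
    for shift, axis in ((1, 0), (-1, 0), (1, 1), (-1, 1)):
        nb = np.roll(q, shift, axis)  # probability of the partner on that side
        pay += np.where(is_a, expected(PA, q, nb), expected(PB, nb, q))
    best_q, best_pay = q.copy(), pay.copy()
    for di in (1, -1):
        for dj in (1, -1):
            mate_q = np.roll(q, (di, dj), (0, 1))
            mate_pay = np.roll(pay, (di, dj), (0, 1))
            better = mate_pay > best_pay      # ties keep the current choice
            best_q = np.where(better, mate_q, best_q)
            best_pay = np.where(better, mate_pay, best_pay)
    return best_q, pay

# Example in the spirit of table 2: all cooperate except one defector A.
N = 8                                  # even, so the chessboard wraps around
i, j = np.indices((N, N))
is_a = (i + j) % 2 == 1                # ASSUMED parity: cell (3, 4) holds an A
q = np.ones((N, N))
q[3, 4] = 0.0
q_next, pay = step(q, is_a)
print(pay[3, 4], pay[0, 1])            # defector versus ordinary cooperator
print(q_next[2:5, 3:6])                # defection copied by the diagonal mates
```

In this example the defector collects four temptation pay-offs, and after one round only player A cells have switched to *x*=0.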

All the simulations in this section are run in an *N*=200 lattice with periodic boundary conditions and initial random assignment of the probability values sampled from a uniform distribution in the [0,1] interval. Thus, initially, the mean values of *x* and *y* are both close to 0.5.

Figure 3 deals with spatial simulations of the PD(5,3,2,1) with joint probabilities generated according to (2.2). Figure 3*b* shows the mean pay-offs and the mean values of *x* and *y* at *T*=200 starting from five different random assignments of *x* and *y*. Mutual defection (*x*=*y*=0) arises below the lower *k*^{★}=0.25 threshold and mutual cooperation (*x*=*y*=1) beyond the higher *k*^{•}=0.707 threshold. In the (*k*^{★},*k*^{•}) transition interval, where both (1,0) and (0,1) are in NE, *k* increases from *k*^{★} up to *k*^{•}; the mean pay-offs of both players in turn are fairly similar, reaching values not far from *R*=3. With the more sophisticated method of correlating independent probability distributions presented in [16], referred to here as EWL, the transition interval from mutual defection up to mutual cooperation in the PD is shorter and a strategy pair in NE providing the pay-off of mutual cooperation appears with a lower degree of correlation (entanglement in the quantum approach implemented by the EWL method). In the PD(5,3,2,1) studied here, the thresholds of the correlation parameter applying the EWL method (referred to here as *k*_{q}) in a 0.0 up to 1.0 normalized scale are

Figure 3*b* shows also the mean-field pay-offs (*p**) achieved in a single hypothetical two-person game with players adopting the mean probabilities appearing in the spatial dynamic simulation, namely with joint probability matrix

The mean-field pay-offs (coloured brown for player A, green for player B) fully coincide with the actual mean pay-offs out of the transition interval, but underestimate them in the transition interval. The lack of coincidence of both mean-field and actual mean pay-offs is due to spatial effects that will be illustrated here when addressing the BOS game (figures 8 and 9).
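Why clustering makes the mean-field estimate fall short can be seen in a stylized sketch. It assumes an uncorrelated BOS(5,1) (*k*=0), counts pay-offs per encounter, and neglects the borders between clusters; none of these simplifications comes from the simulations of the article.

```python
import numpy as np

R, r = 5, 1  # BOS(5,1)

def p_a(x, y):
    """Pay-off of player A per encounter with independent strategies (k = 0)."""
    return R * x * y + r * (1 - x) * (1 - y)

# Stylized clustered population: half of the encounters take place inside an
# x = y = 1 cluster and half inside an x = y = 0 cluster.
x = np.array([1.0] * 50 + [0.0] * 50)
y = x.copy()

actual_mean = p_a(x, y).mean()        # average of the individual pay-offs
mean_field = p_a(x.mean(), y.mean())  # pay-off of the average strategies
print(actual_mean, mean_field)
```

The pay-off is bilinear in (*x*, *y*), so averaging the strategies before computing the pay-off discards the local coordination inside each cluster; here the mean-field value is half the actual mean.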

Figure 4*b* shows the results in five spatial simulations of the HD(3,2,0,−1) at *T*=200. Spatial effects arise before *k*^{★} so that the mean-field approaches underestimate the actual mean pay-offs as in the spatial simulations of the PD. After *k*^{★}, the spatial simulations detect (1,1) as the unique NE, so that both pay-offs increase their values according to *p*=12*k*^{2}−12*k*+2 up to *p*=2.0 at *k*=1.0. The *k*^{★} threshold appears from the intersection of *p*^{(1.0,1.0)} and *p*^{(1.0,0.0)}_{B}, given in equations (2.6a) and (2.6b), thus *k*^{★}=0.848.
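The value of the threshold can be reproduced with a short calculation. The pay-off at (1,1) is the expression quoted above; the pay-off of player B at (1,0) is not reproduced in the text, so `p_b_10` below is an assumption (the temptation *T*=3 weighted by an assumed *π*_{12}=1−*k*).

```python
import math

def p_11(k):
    """Pay-off of both players at (1,1), as quoted in the text."""
    return 12 * k ** 2 - 12 * k + 2

def p_b_10(k):
    """ASSUMED pay-off of player B at (1,0): T = 3 weighted by pi_12 = 1 - k."""
    return 3 * (1 - k)

# p_11(k) = p_b_10(k)  <=>  12k^2 - 9k - 1 = 0; keep the root lying in [0, 1]
k_star = (9 + math.sqrt(81 + 48)) / 24
print(round(k_star, 3))
```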

No results on the spatial simulations of the HD using the EWL correlation method have been reported elsewhere, so figure 5 is included in this article. Again, as stressed above regarding the PD, the outcome of mutual cooperation (Dove in the HD) emerges before with the EWL method: *a* that spatial effects also arise in spatial simulations using the EWL method before *p*^{★}) also underestimate the actual mean pay-offs (

Figure 6*b*,*c* show the results in five spatial simulations of a KSD(3,2,−1,0) at *T*=200. As the SD has only one NE regardless of *k*, (i) the results shown in the spatial simulation mimic those corresponding to NE in two-person games shown in figure 2*a*, and (ii) no spatial effects arise so that both mean-field and actual pay-offs coincide for every *k*. In spatial simulations of the SD using the EWL correlation method [6] it is

Figure 7*b*,*c* show the results in five spatial simulations of the KBOS(5,1) at *T*=200. Owing to the particular structure of the BOS game, where both *π*_{12} and *π*_{21} are irrelevant, the graphs in these panels are symmetric around *k*=0.5. The general form of the pay-offs (figure 7*b*) correspond to that of *k*=0.5 (figure 7*c*) where notable spatial effects arise, and particularly close to the extreme values of *k*, both 0.0 and 1.0. The output of the spatial simulations of the BOS using the EWL correlation method notably differ from that shown in figure 7 [18]. Let us say that the BOS game proves to be a highly challenging game.

Figures 8 and 9 deal with simulations of the KBOS(5,1), the former with *k*=0.0, the latter with *k*=0.5. Panel *a* of both figures shows the dynamics up to *T*=200; panels *b* and *c*, the patterns of pay-offs and probabilities at *T*=200; and panels *d* and *e*, zooms of the 20×20 central region of the full patterns. In both scenarios, the dynamics induced by the imitation of the best paid neighbour implemented here acts in a straightforward manner, so that the permanent regime is achieved very soon. This applies not only to the BOS game but in a general manner, regardless of the game under scrutiny.

The patterns of the pay-offs and probabilities shown in figure 8*b,c* are enhanced by the zooms of a small central region in figure 8*d,e*. The general patterns are featured by regions of black-marked clusters where *x*=*y*=1.0 and white-marked clusters where *x*=*y*=0.0. The emergence of these well-defined spatial structures explains why the mean-field pay-off fails to estimate the actual mean pay-off, as shown in figure 8*a*. The pattern of probabilities at *T*=200 shown in figure 9 for *k*=0.5 turns out to be particularly surprising, as two horizontal compact bands with (*x*=*y*=0.0) (figure 9*a*, upper and lower panels) and one with (*x*=*y*=1.0) emerge. This dramatic spatial structure lies at the origin of the discrepancy between the mean-field and the actual mean pay-offs shown in figure 9*a*. In figures 8 and 9, the (*x*=*y*=0.0) and (*x*=*y*=1.0) clusters are separated by borders where either (*x*=0.0,*y*=1.0) or (*x*=1.0,*y*=0.0) and consequently the pay-offs of both players are zero, which causes white cell lines in the pay-off patterns. These (almost white) border lines are clearly noticeable in figure 8*b–e*, whereas in figure 9 they are only two less apparent horizontal lines, one of them enhanced in the zoom, which has been located in the upper transition from *x*=*y*=0.0 to *x*=*y*=1.0.

## 4. Games on random networks

In the simulations of this section, every player is connected at random with four partners and four mates, so that any spatial structure is absent in such random networks. To compare the simulations presented in this section to those based on spatially structured lattices in §3, 200×200 players also interact in the games on networks studied in this section, half of them of type A, the other half of type B.
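A random wiring of this kind might be generated as in the sketch below. The text does not say whether links are reciprocal or whether repeated neighbours are allowed, so this version draws directed links with replacement and only excludes a player from his own mate list; `random_network` and its parameters are hypothetical.

```python
import numpy as np

def random_network(n_per_type, degree=4, seed=0):
    """Random partners (opposite type) and mates (same type) for every player.

    Returns two integer arrays of shape (2, n_per_type, degree); the first
    axis is the player type (0 = A, 1 = B). ASSUMPTIONS: links are directed
    and drawn with replacement; a player is never his own mate.
    """
    rng = np.random.default_rng(seed)
    # partners[t, i]: indices of opposite-type players met by player i of type t
    partners = rng.integers(0, n_per_type, size=(2, n_per_type, degree))
    # a non-zero offset modulo n_per_type guarantees a mate different from oneself
    offsets = rng.integers(1, n_per_type, size=(2, n_per_type, degree))
    own = np.arange(n_per_type)[None, :, None]
    mates = (own + offsets) % n_per_type
    return partners, mates

# 200 x 200 players, half of each type, as in the simulations of this section
partners, mates = random_network(200 * 200 // 2)
print(partners.shape, mates.shape)
```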

Figure 10 deals with the KPD(5,3,2,1) game with variable *k* in network simulations. Figure 10*a* shows the mean pay-offs of both players and their mean values of *x* and *y* at *T*=200 in five simulations. Figure 10*b* shows the dynamics in one of such simulations up to *T*=20 for *k*=0.0 (i), *k*=0.4 (ii) and *k*=1.0 (iii).

The overall structure of the graphs in figure 10*a* coincides with that in figure 3*b*. The *k*^{★} and *k*^{•} thresholds remain unaltered, with *x*=*y*=0 before *k*^{★} and *x*=*y*=1 after *k*^{•} in both scenarios. At variance with this, the behaviour of the system in the (*k*^{★},*k*^{•}) interval varies significantly in figure 10 compared to that in figure 3, as in the network simulation the (1,0) and (0,1) NE emerge with no spatial effects masking them. Panel *b* shows that in network simulations the dynamics induced by the imitation of the best paid neighbour implemented here also acts in a straightforward manner, so that the permanent regime is achieved almost immediately for *k*=0.0 and *k*=1.0, and shortly after *T*=10 for *k*=0.4.

Figures 11–13 show the results with the KHD(3,2,0,−1), KSD(3,2,1,−1) and KBOS(5,1) games with variable *k* in five network simulations at *T*=200. Panel *a* of these figures shows the mean pay-offs of both players, and panel *b*, the mean values of *x* and *y*.

In figure 11, the *k*^{•} threshold and the permanent *x*=*y*=1 regime after *k*^{•} remain unaltered compared to those in figure 4. But before *k*^{•}, the KHD system behaves much as the KPD in its transition interval in network simulations: the (1,0) and (0,1) NE emerge with no spatial effects masking them.

In figure 12, the *k*^{•} threshold and the permanent *x*=*y*=1 regime after *k*^{•} remain unaltered compared to those in figure 6. But before *k*^{•}, the KSD system shows a kind of helter-skelter oscillations particularly pronounced around *k*=0.5.

The overall structure of the graphs in figure 13 coincides with that in figure 7, so that *k*, both 0.0 and 1.0. The absence of spatial structure in the network simulations of figure 13 produces crisp pay-offs (and probability) graphs, with no relevant alterations around *k*=0.5 , although in one of the simulations it is *k*=0.5, coincident with *a* player B overrates player A in the wide interval *k*^{★} and *k*^{•} defined at the intersection of the pay-offs given in equations (2.9a)). This indicates a kind of bias of the proposed correlation mechanism that favours player B (already pointed out when commenting on equations (2.3) in §2.2), a characteristic that is also found in the EWL model regarding the BOS game [18]. It is relevant to point out that

## 5. Partial strategy updating

In this section, it is assumed that only one player type updates his strategies in the manner indicated in §3. Thus, in figures 14 and 15 only player A updates strategies in the symmetric games of PD and HD. The asymmetric games of SD and BOS are studied in figures 16–19, where both players are treated separately.

In all the figures of this section, panels *a* and *b* deal with spatial simulations and games on networks, respectively, with the initial strategy probabilities assigned at random. In panel *c*, the probability of the player that does not update his probability strategies is fixed at 0.5, instead of being assigned at random as is done with the player that updates probability strategies. Thus, panel *c* provides a kind of theoretical reference of what is to be expected in the collective behaviour, both in spatial simulations and in games on networks.

In the mean-field analysis with partial updating, the player that does not update his probabilities will have his mean probability equal to the middle level

In the KPD context of figure 14, it is *k*^{★}, and *p*^{(x=1,y=1/2)}_{A}=2*k*^{2}+2 after *k*^{★}. As a result, the general form of the pay-off of player B, *k*^{★}, and *p*^{(x=1,y=1/2)}_{B}=2*k*^{2}−4*k*+4 after *k*^{★}. At *x*=0 and *x*=1 before and after *k*=0, *p*_{A} slightly exceeds its theoretical value 1.5 and *p*_{B} slightly undervalues its theoretical value 3.5. Only small spatial effects emerge in the spatial simulations of player A (panel *a*) close to *k*^{★}.
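The two closed-form pay-offs quoted above for *x*=1, *y*=1/2 can be checked against a numerical evaluation. The sketch assumes a pay-off matrix layout and a form of the correlated joint distribution (`joint_k`) that were reconstructed to be consistent with the expressions quoted in this article; neither is copied from table 1 or from equations (2.2).

```python
import numpy as np

PA = np.array([[3, 1], [5, 2]])  # PD(5,3,2,1) for player A (ASSUMED layout)
PB = PA.T

def joint_k(x, y, k):
    """ASSUMED k-correlated joint distribution (not copied from equations (2.2))."""
    return np.array([
        [(2 * k - 1) ** 2 * x * y, (1 - k) * x * (1 - y) + k * (1 - x) * y],
        [k * x * (1 - y) + (1 - k) * (1 - x) * y,
         (1 - x) * (1 - y) + 4 * k * (1 - k) * x * y],
    ])

def payoffs(x, y, k):
    """Expected pay-offs of players A and B."""
    joint = joint_k(x, y, k)
    return float((PA * joint).sum()), float((PB * joint).sum())

ks = np.linspace(0.0, 1.0, 101)
pa = np.array([payoffs(1.0, 0.5, k)[0] for k in ks])
pb = np.array([payoffs(1.0, 0.5, k)[1] for k in ks])
print(np.allclose(pa, 2 * ks ** 2 + 2), np.allclose(pb, 2 * ks ** 2 - 4 * ks + 4))

# Gain of player A from switching x = 0 -> x = 1 against the fixed y = 1/2
gain = pa - np.array([payoffs(0.0, 0.5, k)[0] for k in ks])
print(ks[gain > 0].min())  # first grid value of k at which x = 1 pays more
```

Under this assumed form the gain of player A changes sign once as *k* grows, which is the kind of switch from *x*=0 to *x*=1 described above.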

In the KHD context of figure 15, it is *k*=1/2 it is, *k*^{★} in figure 15*a*.

The strong effect that the absence of updating capacities from one of the players exerts on the collective dynamics studied here is remarkable. Thus, figure 14*a,b* are to be compared to figures 3 and 10 regarding the PD, respectively, and figure 15*a,b* are to be compared to figures 4 and 11 regarding the HD, respectively. In any case, the intrinsic symmetry of both the PD and HD games ceases to be operative in this section, favouring player A, i.e. the player allowed to find a best response to the fixed strategies of the other player, player B, so far.

In the KSD context of figure 16, it is *p*^{(x=1,y=1/2)}_{A}=6*k*^{2}−6*k*+1 and *k*^{★}, although their outputs are qualitatively similar.

In the KSD context of figure 17, it is *p*_{A}(*x*=1/2,*y*)=(6*k*^{2}−6*k*+1. In figure 17, the beneficiary player B overrates the charity player A, greater compared to figure 16, albeit not to a very large extent.

In the KBOS context of figure 18, it is *k*^{2}−16*k*+4≥0, and consequently the best response of player A is *x*=1, which leads to *x*=1 renders *k* not in its extreme values so that *x*=*y*=1, i.e. those given in equation (2.9a). In figure 18*c*, it is *k*, also for *a*,*b*) it is *k*^{2}−8*k*+2=−8*k*^{2}−8*k*−20, so that it is *x*. As a result, there is no repercussion of being *x*=1 at *b*) or in the mean-field pay-off approaches in the CA simulations (figure 18*a*). Nevertheless, in the CA simulations, spatial effects induce the increase of the actual mean pay-off of player A up to nearly *k* of figure 13, i.e. the studied correlation mechanism favours player B in the KBOS, even if the latter is unable to update his strategies.

In the KBOS context of figure 19, it is *k*^{2}+8*k*−2≤0 and consequently the best response of player B is *y*=0, which leads to *k*^{2}−8*k*−2=0, so that now it is *y* so that there is no repercussion for *k* in the KBOS simulations of figure 19. This is highly expected, when in addition to the structural bias favouring player B in the KBOS, only player B is allowed to search for best responses.
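The sign conditions behind the best responses *x*=1 (figure 18) and *y*=0 (figure 19) can be illustrated numerically. As before, the BOS(5,1) matrices and the form of `joint_k` are assumptions reconstructed to agree with the pay-off expressions quoted in this article. Since the pay-offs are linear in the updating player's own probability, the sign of the difference between its two extreme values fixes the best response.

```python
import numpy as np

PA = np.array([[5, 0], [0, 1]])  # BOS(5,1), player A (ASSUMED layout: rows F, B)
PB = np.array([[1, 0], [0, 5]])  # BOS(5,1), player B

def joint_k(x, y, k):
    """ASSUMED k-correlated joint distribution (not copied from equations (2.2))."""
    return np.array([
        [(2 * k - 1) ** 2 * x * y, (1 - k) * x * (1 - y) + k * (1 - x) * y],
        [k * x * (1 - y) + (1 - k) * (1 - x) * y,
         (1 - x) * (1 - y) + 4 * k * (1 - k) * x * y],
    ])

def payoffs(x, y, k):
    """Expected pay-offs of players A and B."""
    joint = joint_k(x, y, k)
    return float((PA * joint).sum()), float((PB * joint).sum())

def gain_a(k):
    """p_A(x=1) - p_A(x=0) when player B is fixed at y = 1/2."""
    return payoffs(1.0, 0.5, k)[0] - payoffs(0.0, 0.5, k)[0]

def gain_b(k):
    """p_B(y=1) - p_B(y=0) when player A is fixed at x = 1/2."""
    return payoffs(0.5, 1.0, k)[1] - payoffs(0.5, 0.0, k)[1]

ks = np.linspace(0.0, 1.0, 11)
print([round(gain_a(k), 3) for k in ks])  # never negative: x = 1 is a best response
print([round(gain_b(k), 3) for k in ks])  # never positive: y = 0 is a best response
```

Under these assumptions both gains vanish only at *k*=1/2, where the updating player is indifferent.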

## 6. Conclusion

This article studies correlated two-person games constructed from games with independent players. The games are studied in a collective manner, both in a spatially structured two-dimensional lattice and with players connected at random. Iterated games are analysed where the players interact with their nearest neighbours, and after every round each player adopts the strategy of his best paid mate neighbour for the next round. The implementation of such imitation of the best evolving rule proves to be a very useful tool to analyse the collective behaviour of two-person games via simulation.

How high correlation enables the emergence of new Nash equilibria is described. In three of the game types studied here (Prisoner’s Dilemma, Hawk and Dove, Samaritan’s Dilemma), the new Nash equilibria achieved with highly correlated games maximize the sum of the pay-offs of both players, i.e. they provide its (unique) so-called SW solution. The case of the fourth game type studied here, the Battle of the Sexes, appears to be the most challenging one in this respect because it has two SW solutions and the correlation mechanism adopted in this study tends to favour one of the players.

## Data accessibility

The Fortran code (qgames.f) used to produce all the simulations in this article as well as the makefile to run it in a Unix environment are available from the Dryad Digital Repository http://dx.doi.org/10.5061/dryad.722sg [19]. The current configuration of qgames.f produces figure 3*b*.

## Competing interests

I declare I have no competing interests.

## Funding

This contribution has been funded by the Spanish grant no. MTM2015-63914-P.

## Acknowledgements

Part of the computations of this work were performed in FISWULF, an HPC machine of the International Campus of Excellence of Moncloa, funded by the UCM and Feder Funds.

- Received September 12, 2017.
- Accepted October 17, 2017.

- © 2017 The Authors.

Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.