Factors of influence in prisoner’s dilemma task: a review of medical literature

Vasileios Mantas; Artemios Pehlivanidis; Vasileia Kotoula; Katerina Papanikolaou; Georgia Vassiliou; Anthoula Papaiakovou; Charalambos Papageorgiou

doi:10.7717/peerj.12829

Factors of influence in prisoner’s dilemma task: a review of medical literature

Vasileios Mantas ¹, Artemios Pehlivanidis¹, Vasileia Kotoula², Katerina Papanikolaou³, Georgia Vassiliou¹, Anthoula Papaiakovou¹, Charalambos Papageorgiou¹

11st Department of Psychiatry, Eginition Hospital, National and Kapodistrian University of Athens, Athens, Greece

2Experimental Therapeutics and Pathophysiology Branch, National Institute of Mental Health, Bathesda, MD, USA

3Department of Child Psychiatry, Agia Sophia Children’s Hospital, National and Kapodistrian University of Athens, Athens, Greece

DOI: 10.7717/peerj.12829

Published: 2022-01-28
Accepted: 2022-01-04
Received: 2021-09-15

Academic Editor: Neil Dagnall

Subject Areas: Psychiatry and Psychology, Human-Computer Interaction
Keywords: Prisoner’s dilemma, Medical research, PD

Copyright: © 2022 Mantas et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Mantas V, Pehlivanidis A, Kotoula V, Papanikolaou K, Vassiliou G, Papaiakovou A, Papageorgiou C. 2022. Factors of influence in prisoner’s dilemma task: a review of medical literature. PeerJ 10:e12829 https://doi.org/10.7717/peerj.12829

Abstract

The Prisoner’s Dilemma (PD) is one of the most popular concepts amongst the scientific literature. The task is used in order to study different types of social interactions by giving participants the choice to defect or cooperate in a specific social setting/dilemma. This review focuses on the technical characteristics of the PD task as it is used in medical literature and describes how the different PD settings could influence the players’ behaviour. We identify all the studies that have used the PD task in medical research with human participants and distinguish, following a heuristic approach, seven parameters that can differentiate a PD task, namely (a) the opponent parties’ composition; (b) the type of the opponent as perceived by the players; (c) the interaction flow of the game; (d) the number of rounds; (e) the instructions narrative and options that are given to players; (f) the strategy and (g) the reward matrix and payoffs of the game. We describe how each parameter could influence the final outcome of the PD task and highlight the great variability concerning the settings of these parameters in medical research. Our aim is to point out the heterogeneity of such methods in the past literature and to assist future researchers with their methodology design.

Introduction

Two criminals are arrested without sufficient evidence. With no means of communicating with one another, the police offers each one a bargain for diminished sentence time. Each prisoner has two options, to ‘betray’ the other by testifying he committed the crime, or to ‘cooperate’ by remaining silent. If one prisoner chooses to betray the other, while the other cooperates, the first will be set free (Temptation payoff—T) while the other serves the maximum sentence (Sucker’s payoff—S). If both prisoners choose to betray each other, they each serve less time (Punishment payoff—P) in prison. If both prisoners cooperate, they will serve the minimum sentence (Reward payoff—R).

The Prisoner’s Dilemma (PD), as described above, was originally devised by Merrill Flood and Melvin Dresher but received its name and final form by Albert W. Tucker. It belongs to the Game Theory family of games and its uniqueness derives from the fact that it represents the conflict between the optimum distribution of rewards (Pareto optimal) and the optimum choice for an individual (Nash equilibrium) (Apt & Grädel, 2011). From the player’s perspective, this corresponds to the conflict of interest between the options of cooperation and defection. The largest payoff for a player occurs with defection while their partner cooperates, mutual cooperation brings a modest payoff to both players and mutual defection yields a lesser amount to each of them. The uniqueness of this dilemma is considered analogous to real life conditions and has helped establish the PD game as a standard and valuable tool for the study of social decision making (Poundstone, 1993). As such, the task has gained great popularity amongst the totality of scientific research. Concerning the medical literature and according to the PubMed database, more than one thousand articles that use the PD task have been published dating as early as 1963.

The present review article focuses on the technical aspects of the PD as this is used in the medical research, where human subjects are involved. In medical literature the PD game has been used to study social behaviour and interaction in healthy populations as well as in cases of pathology including psychiatric disorders. It has also been used in conjunction with neuroimaging techniques such as fMRI (Eimontaite et al., 2019; Thompson et al., 2021) and EEG (Babiloni et al., 2007; Back et al., 2009) to help identify the brain areas and networks that are activated when social decisions are made and to examine whether the activations of these brain areas differ in various populations. Depending on the aims of the study, several parameters need to be considered when setting up a PD experiment. Each of these parameters could impact the outcome of the experiment and thus their selection is a critical step in PD research.

The aim of this review, is to focus on the technical characteristics of the PD as this has been used in medical research involving human participants, in order to identify the parameters that could influence the outcome of the PD task and present the different settings that these parameters could take. Based primarily on the literature but also our own experience in setting up a study that utilised the PD task, we have identified seven parameters that should be considered during the experimental design of the PD. These parameters include: (a) the opponent parties’ composition; (b) the type of the opponent as perceived by the players; (c) the interaction flow of the game; (d) the number of rounds; (e) the instructions, narrative and options that are given to players; (f) the strategy and (g) the reward matrix and payoffs of the game. For each parameter, we present the different settings that are available and when possible which aspects of social behaviour and interaction they would promote. In addition to highlighting the heterogeneity in PD medical research, this review could help researchers to optimise their experimental design in order to best target specific populations and/or research questions.

Methods

The PRISMA guidelines (Fig. 1) were followed in order to identify medical research studies that have used the PD paradigm in human participants. We conducted a search in the PubMed database using the terms ‘(“prisoner’s dilemma”) OR (“prisoner dilemma”) OR (“prisoners dilemma”)’ as keywords. Our research returned 1,058 articles, dated until January 2021. Since our focus was on medical research that has used human participants in the PD experiments, we excluded all the studies that have used computers to simulate human behaviour as well as all studies involving animal research. According to this, 826 articles were excluded. From the 232 remaining publications, four articles were excluded as their methodology involved use of the PD as a paradigm for the subjects to observe or evaluate but not participate. A total of 228 articles were included in this review and were independently evaluated by the first author and three of the co-authors mainly focusing on the PD methodology. The relevant literature references of the 228 articles were also checked by the authors in order to ensure that all studies fulfilling our criteria were included in the review and were not missed during our PubMed research. No new studies were identified by checking the references of the 228 articles.

Figure 1: The PRISMA flow diagram.
For the purposes of this review we focused on medical literature studies with actual participants. A total of 228 articles are included. We have identified seven parameters, that based on our experience as well as the current literature could significantly differentiate the outcome of a PD experiment. Each article was examined and classified based on the choice that the experimenters have made concerning these seven PD parameters.

Download full-size image

DOI: 10.7717/peerj.12829/fig-1

Based on the literature as well as our personal experience concerning the setup requirements of a PD task, we identified the seven parameters that could significantly differentiate the outcome of a PD experiment. These parameters include: (1) the composition of the opponent parties of the PD game—Opponent parties composition; (2) the type of opponent as perceived by the PD players—Opponent perceived type; (3) the sequence of events during the PD game—Interaction Flow; (4) the number of rounds in the game—Number of Rounds; (5) the Instructions, narrative and options; (6) the Strategy and (7) the Reward matrix and payoffs. Each article was examined and the settings for each of the seven parameters were extracted and can be found in the Supplemental Material. In the ‘Results’ section that follows we define each of these parameters and based on the literature that we reviewed we present the different choices that are available for each parameter and how they could influence the experiment’s outcome. Some indicative studies that have utilized some of the available parameter settings are also presented and discussed.

Results

PD parameters

Opponent parties’ composition

Opponent parties’ composition refers to the synthesis of the competing PD parties. We use the term “parties” to include PD experiments where the opponents are individuals as well as groups of individuals. Most commonly the PD game is used as a research analogue of social interactions. Such interactions mainly occur between individuals or groups, which constitute the opponent parties. The main difference between individual and group interactions is that in the between-groups condition, members of each group have to settle to an agreement about their representative PD choice.

The vast majority of the PD literature concerns interactions between individuals. There are however, three additional types of PD experiments where interactions cannot be strictly categorized as group or individual interactions. The first of these additional PD types, places the individual in a network of others, where their every PD choice simultaneously determines the outcome of multiple PD interactions between them and their neighbours. This type of PD is mainly used to investigate the dynamics of social networks, for example how the members’ psychopathic traits could influence the group or how dynamic partner updating would influence cooperation levels (Wang, Suri & Watts, 2012). In the second type (IPD-MD: Intergroup Prisoner’s Dilemma-Maximum Difference) (Halevy, Bornstein & Sagiv, 2008), the individual is a member of a group and their PD decision affects both members of their and other conflicting groups. This type of PD has been used in research to investigate specific characteristics of the players behaviour such as the motives behind altruism, including the desire to help one’s group or harm other groups (De Dreu, Dussel & Velden, 2015; Weisel & Böhm, 2015). The third type is that of a multiplayer PD where both the individuals’ and the groups’ rewards depend on the combination of the members’ decisions without involving any direct interaction between individuals. This PD type is a tool for investigating altruistic behaviour and cooperation as it highlights the dilemma between personal and social benefits (Sheldon, Sheldon & Osbaldiston, 2000; Locey, Safin & Rachlin, 2012).

From the research articles included in this review, the majority used interactions between individuals (207 experiments) with fewer studies looking at interactions between groups (10 experiments).

Opponent perceived type

Opponent perceived type refers to the players’ beliefs about the nature of their opponents. The subject’s perception of their opponent’s nature is of great importance for the PD game methodology. This is mainly due to the contrasting natures of the PD and that of modern methodology. The PD paradigm is mainly used to investigate the characteristics of social behaviour which per definition requires human interactions, modern methodological environments however, are machine-based and aim to minimize human involvement.

In order to compromise the two, players have to either (a) truly believe that machines could accurately mimic or simulate all aspects of human behaviour and assign to them innate human characteristics like compassion or (b) believe that their opponent is an actual human. This becomes even more complicated when taking into account the fact that machines are often used as a proxy for individual interactions. As a result, when someone confronts a soulless screen, they can never be sure about what it reflects/represents. For that purpose, a common practice is to ask participants, at the end of the experiment, what they believed about their opponent’s nature.

We have identified two categories for the opponent perceived type parameter: human agents playing as experimental subjects and agents with predefined strategy or answers. The first opponent type, human vs. human, is a more convenient and realistic model when we intend to study the interaction itself while the second opponent type, human vs. agent with predefined strategy, is a more controlled setting and allows the study of individual subject characteristics as well as the comparison between subjects. The latter category could be further separated in order to distinguish whether or not the subject was falsely informed about the human nature of their opponent or manipulated into thinking so. This manipulation is usually achieved by the ‘supposed’ presentation (i.e., introduction or photo) or actual presence of a human opponent, signs of communications (i.e., messages) and signs of human behaviour (i.e., delays). Whether or not this is truly important, it remains controversial. On one hand, Miwa & Terai (2012) have shown that participant’s behaviour during the PD game is influenced by the instructions given about the partner’s nature while Knight & Kagan (1977) stated that playing against an imaginary opponent strongly correlates with decision-making when an opponent is actually present. On the other hand, it has been shown that humans activate brain areas related to mentalizing when they knowingly face computers as co-players (Rilling et al., 2004a; Krach et al., 2008; Kircher et al., 2009). Finally, interaction (cooperation) significantly changes when comparing human and computer opponents, only when communication between the opponents is avoided (Kiesler, Sproull & Waters, 1996).

Another factor to consider when the ‘opponent type’ parameter is concerned is the sex balance of the players or opposing parties which could influence the game decisions (Rapoport & Chammah, 1965). Male opponents usually appear more cooperative than women in same-sex interactions while women cooperate more than men in mixed-sex settings (Rapoport & Chammah, 1965; Balliet et al., 2011; Colman, Pulford & Krockow, 2018). Social proximity is another important factor. Rewards to socially closer persons are valued more than equivalent and or identical rewards to socially distant ones, which in turn alters subjects’ behaviour as the cost of cooperation is compared to its socially discounted benefit (Locey & Rachlin, 2011; Locey, Safin & Rachlin, 2012; Safin, Locey & Rachlin, 2013; Safin, Arfer & Rachlin, 2015). A possible explanation for this, is perceived similarity which signals some degree of kinship (Park & Schaller, 2005), trustworthiness (Koch et al., 2020), likability (Alves, Koch & Unkelbach, 2017, 2018), and opportunity to form a group (Hehman, Flake & Freeman, 2018) and could thus explain cooperation in social dilemmas (Rand & Nowak, 2013).

Only 40% of the PD experiments, included in this review involved interactions between real human subjects. From these, only a few involved a vis-a-vis setting (mostly EEG experiments). The majority of the studies have used computers to conduct interactions (117 experiments). From the 123 PD experiments that have used algorithms, in at least 97 participants were manipulated to believe they were interacting with real humans.

Interaction flow

Interaction flow describes the order in which the players take turns throughout the PD game. Based on the interaction flow parameter, PD experiments can be distinguished into ‘simultaneous’ and ‘sequential’. In simultaneous PD, which is the most common type of PD, both parties choose their actions without knowing each others current decision. In the sequential PD, one party makes the first move (Player A) and the other the subsequent move (Player B). In this case, Player B has the benefit of already knowing Player’s A choice before deciding upon their move and thus Player B has a strategic advantage over Player A who is aware of this. Kiyonari, Tanida & Yamagishi (2000) has shown that the cooperation levels in the sequential PD are higher when compared to the simultaneous PD. Locey & Rachlin (2011) implemented a modified version of the sequential type where each participant has the role of both Player A and Player B, as each decision is the answer to the opponent’s last choice and at the same time the first move in the new round.

Using the interaction flow parameter, PD experiments can also be distinguished based on the total number (sum) of rounds. When only one round is involved the game is called ‘one-shot’, else, the game is called ‘iterated’. There is a great difference between the two types, as in ‘one-shot’ PD, the player does not have any information/knowledge concerning their opponent’s strategy that would be based on any previous behaviour—nor the opportunity to be strategic. In this case, defection seems to be the most rational choice. Experiments involving human subjects however, have proven that this is far from true since individuals often choose cooperation in ‘one shot’ trials (Shafir & Tversky, 1992; Wedekind & Milinski, 1996; Wong & Hong, 2005). This justifies the use of ‘one-shot’ PD as a means to evaluate the altruistic nature of the agents. In contrast to ‘one-shot’ PD, ‘iterated’ PD, encompasses the essential time frame for strategy to be assessed and implemented and thus it is more suitable for studying the specifics of the evolution of cooperation (i.e., reputation, hierarchy, strategy). Another form of ‘iterated’ PD is that involving multiple ‘one-shot’ games (Yang & Yue, 2019). In this case, the possibility of a future rematch with the same opponent can induce higher cooperation levels (Bó, 2005; Duffy & Ochs, 2009; Arechar, Kouchaki & Rand, 2018; Rand & Nowak, 2013).

From the articles included in our review, only 29 papers involve a sequential type of PD while 186 involve a simultaneous one.

Number of rounds

The ‘number of rounds’ parameter determines the length of the PD game and it is relevant only for the ‘iterated’ type of PD as ‘one-shot’ PD consists of only one round. Our literature search revealed a great variability in the number of rounds used in research, with most studies using 10 to 30 rounds. The total number of rounds in a PD experiment, is considered important for achieving cooperation. The fewer the rounds, the bigger the risk of unsuccessful negotiation and loss. Lave (1965) suggests a minimum number of 3 * (P − S)/(R − P ) rounds, where R stands for reward, P for punishment and S is for sucker payoff (see introduction and ‘reward matrix and payoffs’ section) to achieve mutual cooperation based on the reward matrix and Chang et al. (2010) demonstrate that 12 rounds are sufficient for individuals to achieve stable levels of cooperation.

As a PD game approaches to the end, the number of remaining rounds decreases and this can influence the participants’ behaviour; end-game effect. As the margin for retaliation is getting smaller, temptation for opportunistic behaviour increases along with the uncertainty concerning the opponent’s attitude towards this temptation. Although mitigating the end-game effect is not a common practice in the PD literature, there are two prominent strategies to address the issue. Instead of explicitly defining the exact number of rounds to participants (finite round type) either this information is withheld, or the number of rounds is stochastically determined. The first can be achieved by simply not revealing to participants the total number of rounds of the game (Wedekind & Milinski, 1996; Melamed, Simpson & Harrell, 2017; Grujić & Lenaerts, 2020) while the second by pre-defining a specific probability for the game to be terminated after each round (Shafir & Tversky, 1992; Wedekind & Milinski, 1996; Kiyonari, Tanida & Yamagishi, 2000; Hehman, Flake & Freeman, 2018). Keeping the termination of the game vague, could help prevent the common human practice of defecting in the final rounds and thus avoid the ‘Nash equilibrium’ (Colman, Pulford & Krockow, 2018).

In the literature, the range of the total number of rounds used in experiments is vast, spanning from 2 to 500 with a median of 20 and an average value of around 45. The most commonly selected numbers of rounds which were used in more than half of the experiments, range between 10 and 30. In order to avoid end game effects, in 28 experiments, researchers have either chosen not to reveal the total number of rounds to participants or to assign, in each round, a specific probability for the game to be terminated (stochastic determination of the end of the game).

Instructions, narrative and options

The task’s instructions, narrative and options refer to the way and the context under which the PD game is presented to participants and are provided to participants before the beginning of the task or throughout the paradigm. The options that are available for the players and the representation of the payoffs matrix, the social context of the experiment’s implementation, the definition of the game’s purpose, the opponent type perception and knowledge of outcome’s inter-dependency (Martin et al., 2013) are aspects of the task that can influence the player’s decisions.

The typical PD narrative (criminal’s story), implements ‘cooperation’, ‘defect’ or ‘betray’ as available choices. This, however, is not the case for the majority of PD studies where letters (i.e., A/B, X/Y), numbers and other combinations are commonly used as analogues of cooperation and defection. These options are neutral enough to prevent priming competitive behaviour and also spare participants from the social moral weight of ‘defection’ or ‘betrayal’ which seem to promote cooperative decisions (Deutchman & Sullivan, 2018; Capraro et al., 2019), implement social bias (Capraro & Cococcioni, 2015) and are too diverse to represent common social interactions. Moreover, extra options (i.e., punishment, reward, withdrawal, 1–7) may be available to participants, either explicitly, as by payoff matrix definition, or more dynamically (i.e., ‘decide how many units to give to the other party’) depending on the PD design.

Researchers, when first introducing participants to the PD game, in addition to simply explaining the rational of the game and the matrix payoffs, which is the common practice, could use different imaginary scenarios (Table 1) in accordance with methods specific to the PD design. These scenarios usually lack the social bias of the original narrative but maintain their social context which contributes to the task’s understanding. It has been shown that socially-framed tasks are easier to grasp (Cosmides, 1989; Alekseev, Charness & Gneezy, 2017), and an in-depth comprehension of the task’s design and rules is crucial for the reliability of research. There is, however, the possibility of unexpected framing effects. For example, unpredictable associations between the imaginary scenarios presented in the PD experiment and past experiences of the players as well as intuitive or familiar preferences (Capraro, Jordan & Rand, 2014; Capraro & Cococcioni, 2015; Capraro et al., 2019). Additionally, Capraro & Rand (2018) and Capraro et al. (2019) investigating the role of morality as an influence in PD choices have shown that pro-social people tend to choose the option that is presented as being morally right in the given situation.

Table 1:

This table summarized the different PD scenarios that are used in the literature.

Scenario	Description
Investment story	Participants are investors who have to choose between two investment projects, project X and Y.
Waitress tip story	Players must decide individually whether to contribute to a tip for the waiter.
Water shortage story	In a region suffering from drought, water is distributed by authorities in equal shares for each citizen. Players represent two different citizens who in order to obtain more supplies could pretend to represent the other citizen too. This succeeds only if one of the players complies else, if they try to cheat as well, both parties are punished by the authorities.
Boss-colleagues story	Players are colleagues who collaborated for a work project. Their boss is not satisfied with the quality and calls them individually into their office for explanations. There, the players have the choice to remain silent or blame each other.
Students’ copy story	Players take the role of students who copied at a test. Their professor notices that the tests are identical and convenes separately with each one telling them that if one of the two confesses (defection) the player will pass the test else they will have to wait for a number of sessions before taking it again.
Product offer story	Players take the role of owners of competing shops. They have to decide what price to assign to individual products. Either standard or sale (defection) price.
Sharing secrets story	Players, individually hide a secret in one of several boxes and have the opportunity to mutually plan their strategy for the trial. They have to decide, in private with the interviewer, whether or not to reveal (defection) to the interviewer the location of the secret. Even in the latter scenario, it is possible for the secret to be revealed by lack, as the interviewer can make one guess.
Arms race	Players play the role of a country leader deciding for their country’s resources. They can choose between military (missiles) or economic (factories) development. Houston et al. (2000) attached the dilemma to a 7 × 7 reward matrix so that missile building is awarded the most points if the other country built mostly factories but awarded no points if both parties built only missiles. In each round, players have to decide how many missiles, from zero to six, to invest with the rest to be invested in factories.
Doors-keys or Chests-keys (Brown & Rachlin, 1999)	The apparatus consists of two doors (or chests) of color X and two of color Y. Each one opens to a room with a reward and one key of color X or Y, both known to players. Opening a door, player A gets the reward, discards his current key and passes the revealed key to player B in order to make his door choice. The round finishes after both players get their reward with player A having the key which was inside the room opened by player B in the previous round. The above setting indicates a sequential PD. Each player defines the possible rewards of the other. The color of keys and the rewards are put in such a way that they form a classical PD reward matrix.
Give doubled (or tripled)	Each player, in each round is handed an amount of money and has to decide how much to give to the other. The amount given is doubled (or tripled). The formed matrix is dynamic and depends on decisions combination. Reward for player A equals: n − nA + 2 * nB (or n − nA + 3 * nB). with n the amount handled to each player at the beginning and nA, nB (n ≥ nA, n ≥ nB). The amounts given by players A and B respectively. If the available options are to give all or none, then the formed matrix is: T:3 * n, R:3 * n, P:1 * n, S:0 (or T:4 * n, R:3 * n, P:1 * n, S:0) where n is the amount handed to each player.
Leave doubled	Each player is handed with an amount each round and has to decide whether to leave or take other’s. The amount left is doubled for the opponent. The logic and rewards are the same as ‘give doubled’.
Give doubled—keep tripled	Each player is handed with an amount each round and has to decide how much to give to the other. The amount given is doubled for each one and the amount kept is tripled individually. The formed matrix is dynamic and depends on decisions combination. Reward for player A equals: 3 * (n − nA) + 2 * (nA + nB). with n the amount handed to each player at the beginning and nA, nB (n ≥ nA, n ≥ nB). the amounts given by players A and B respectively. If available options is to give all or none then the formed matrix is T:5 * n, R:4 * n, T:3 * n, S:2 * n.
Balls in baskets	Simple representation of ‘Give double—keep triple’. Putting a ball in basket A gives two points to all while putting it in basket B three to oneself.
Trays and beans	Apparatus consists of two similar, but independent, parallel et al., tilting trays with differently colored sides. Each player has control over one tray and has to decide between pulling his tray and gaining a reward that is stored in his side or pushing it thus giving their opponent the reward stored in the opposite side.
Joint invest	Each player is hande an amount for each round and has to decide how much to contribute, all or part of the amount, to an investment with the other player. The gathered amount is multiplied by a factor and distributed evenly between them. The formed matrix from decision combination is T:n + f * n/2, R:f * n, P:(n − nX) + f * nX/2, S:f * n/2 where n, nX(n > nX). Are the available investing options of the amount handed. In asymmetric PD, the factor (f) is different for each player leading to different reward matrices for each player.
Split or steal	Players mutually gather or are handed an amount. They have to decide whether to split it in half or steal the total. If they both steal, they get nothing. The formed matrix from decisions combination is T:2 * n, R: n, P:0, S:0 where 2 * n is the total amount.

DOI: 10.7717/peerj.12829/table-1

Note:

A brief summary of the key elements of each PD scenario is offered in this table. The way each possible scenario could influence the outcome of the PD task is presented in the main text.

PD instructions commonly state that the purpose of the experiment is to achieve the maximum amount of points alternatively, they set the monetary reward as a function of the participants’ performance. This is done in order for the players to be motivated, the experiment to be an accurate representative of both real life and the human nature, and also to avoid the ‘always defect’ strategy which is the optimum for just winning but would not yield the maximum monetary rewards. Finally, to establish the adequate understanding of the task’s design and rules, most researchers require from participants to answer comprehension questions prior to PD trials.

Experimenters mainly prefer socially neutral (letters) or positive (cooperate vs. no-cooperate) vocabulary in their narrative as only 47 experiments use terms such as ‘defection’, ‘betray’, ‘punish’, ‘steal’ or ‘tell-on’ as options. A total of 19 experiments have implemented more than two options and 35 experiments utilised a different game design from the typical straightforward PD, like joint-invest, chest-keys etc.

Strategy

Strategy refers to the decision making process that unfolds during the PD game. The implementation of a strategy is necessary when the subject competes against an agent whose answers are controlled by the experimenter (algorithms). In ‘one-shot’ PD (see ‘number of rounds’ section), strategy implementation is limited to one decision since there is only one round. There are no priors of interaction and no planning involved as no future interaction will occur. This, however, is not the case for multiple ‘one-shot’ PDs which are played with the same opponent.

Strategy implementation mainly concerns the ‘iterated’ PD and by far the most common strategy is ‘tit-for-tat’ (Axelrod, 1984) and its variations. Other frequently used strategies include ‘always-cooperate’, ‘always-defect’ and ‘random’ (for a more detailed description see Table 2). ‘Tft’ starts with cooperation and then the strategy entails replicating the opponent’s previous move. In the famous Axelord’s competitions, this strategy has proven to be most successful while also promoting cooperation (Kuhlman & Marshello, 1975; Axelrod & Dion, 1988). ‘Tft’ has however, two main weaknesses, it cannot reverse an established mutual defection, meaning that any mistake or environmental noise would be of great influence during the course of the game and secondly, it cannot maximally exploit an unconditional cooperator. In order to eliminate the first weakness, two main subtypes of ‘tft’ have been proposed. The first subtype is called ‘forgiving-tft’ or ‘generous-tft’ (Nowak & Sigmund, 1992) and it involves a percentage of cooperation after the opponent’s defection. The second subtype is called ‘tit-for-two (or more)-tats’ (Boyd & Lorberbaum, 1987) where defection is chosen following continuous defections by the opponent.

Table 2:

This table summarises all the available PD strategies used in the literature.

Strategy name	Description
Always Cooperate	Cooperates in each round.
Always Defect	Defects in each round.
Random	50% of cooperation or defection.
Tit for tat	Starts with cooperation and then the design replicate the opponents’ previous move.
N% Forgiving or Generous tit for tat	‘Tft’ with N percentage of cooperating after opponent’s defection.
Tit for N tats	‘Tft’ with cooperating until N number of consecutive opponent’s defections is reached.
Pavlov or Win-stay-Lose-swift	Switch choice whenever receives S or P payoffs.
Suspicious tit for tat	Like ‘tft’ but as ‘wsls’ does not swift on T payoff.
Rilling et al. (2002)	Cooperation in first round, defection in last three, else, decision based on last two players’ choices.
Rilling et al. (2004b)	50% possibility of reciprocate cooperation and always reciprocate defection.
Rilling et al. (2012)	For player A: 90% reciprocates defection and 67% reciprocates cooperation. For player B, ‘tft’ with 33% of cooperation after mutual defection, 10% of cooperation after unilateral (own) defection and always defection after two consecutive rounds of unilateral (own) defection.
McClure et al. (2007) (based on Rilling et al., 2002)	Cooperation in the first round, defection in the last two, 50% possibility of defection after four consecutive mutual cooperation rounds. Else, the decision is based on the player’s last two choices.
Zero Determinant (ZD) strategies (Press & Dyson, 2012)	Each decision is determined by the probabilities of cooperation based on the four possible choice combinations of the previous round.

DOI: 10.7717/peerj.12829/table-2

Note:

A brief description of the key elements of each strategy is presented in this table. The impact that each strategy could have on the PD outcome is presented in the ’Strategy’ section of our review.

Another strategy which is underrepresented in the literature (Martin et al., 2013), is ‘win-stay-lose-shift’ (wsls) or ‘pavlov’ (Nowak & Sigmund, 1993) where the players switch their choice whenever they receive a bad payoff, S or P. ‘Wsls’ can be exploited by unconditional defectors but overcomes the ‘tft’ weaknesses. In order to promote cooperation, the reward matrix (see following section) in this strategy must fulfill the condition 2 * R > T + P where R stands for reward, T for temptation and P for punishment. Otherwise the player has no motive to cooperate (C) since continuous defection (D) (DD-DC loop) would gather at least the same amount of reward as cooperation (CC-CC or CD-CD).

Other interesting strategies mentioned in the literature are Zero Determinant (ZD) (Press & Dyson, 2012; Pansini, Shi & Wang, 2016), ‘pavlov-tft’ (Juvina et al., 2012) and ‘Raise the stakes’ (Roberts & Sherratt, 1998). ‘Pavlov-tft’ is considered to better fit human behaviour compared to its consisting strategies while ‘Raise the stakes’ can be applied to more dynamic invest-like PD designs. These designs, include ‘Give doubled—keep tripled’ and escalates the players’ investment when partners match or better their last move (for details on the different PD designs refer to Table 1). Other more straightforward, commonly implemented strategies include predefined choices (Papageorgiou et al., 2013), defecting in a number of rounds (Bitsch et al., 2018a) or decisions based on probabilities (i.e., 40% defection) (Haruno & Kawato, 2009). When PD is used for fMRI experiments, most strategies are based on Rilling et al. (2004b) work on collected human data and Rilling’s (Rilling et al., 2004a, 2004b, 2012) and McClure’s (McClure et al., 2007) strategies are the most frequently used, see Table 2.

As expected, the lion’s share in the medical literature (40 experiments) belongs to ‘tft’ based strategies while other ‘famous’ strategies where non-human participants are involved, like ‘wsls’ appear to be underrepresented (1 experiment). A total of 58 experiments involve non-typical and mixed strategies.

Table 2 summarizes the strategies used in literature.

Reward matrix and payoffs

The reward matrix and payoffs determine the gains and loses that different PD choices could bring to the players. The reward matrix could be considered the most important part of the PD as it determines the motive as well as the pros and cons of the agent’s decisions since different strategic choices are reflected upon the expected reward. There is no point in cooperating or defecting if there is no profit from it, or no punishment for doing otherwise. As a result, the amount of payoffs for each possible outcome and the degree of difference between these payoffs strongly influence the course of the game.

In a PD game, the reward matrix typically consists of four values, ‘T’ (temptation) is the reward of someone’s defection over cooperation, ‘R’ (reward) is the reward of mutual cooperation, ‘P’ (punishment) the payoff of mutual defection and ‘S’ (sucker) the payoff of cooperation over defection. In order for a reward matrix to accurately represent PD’s dilemma, the following condition must, at least, be met, T > R > P > S (Axelrod & Hamilton, 1981). The above is not sufficient for the ‘iterated’ version of the game, where 2 * R > T + S is another mandatory condition which gives mutual cooperation a greater reward value thus preventing individuals from alternating between cooperation and defection strategies (Axelrod & Hamilton, 1981). Moreover, as stated in the ‘strategy’ section, for the ‘wsls’ strategy to be meaningful, the 2 * R > T + P condition must also be met. Although the above conditions are standard for a typical PD game (Axelrod & Hamilton, 1981) a lot of papers deviate from these mainly by implementing the P < S condition, in order to promote cooperation (Bitsch et al., 2018b; Acosta, Straube & Kircher, 2019). The P < S condition, however, transforms such paradigms into a snowdrift or hawk-dove game which deviate from the typical PD task since mutual defection has the minimum payoff.

But how do the different combinations of those four matrix values, influence participants’ decisions? In order to answer that question, one factor that needs to be considered is the cooperation index ratio ((R − P )/(T − S)) used by Rapoport & Chammah (1965) and Rapoport (1967). Greater ratio values indicate that defection is less tempting and cooperation less risky, thus defining a more cooperation-friendly environment. In addition to the cooperation index ratio, Sherman (1967) proposes two other ratios derived from the values of the reward matrix parameters. These ratios represent attractiveness of cooperation ((R − P )/(P − S)) and attractiveness of defecting from a cooperative solution ((T − R)/(R − P )). A higher value for the first ratio represents relatively higher gain from cooperation while lower values for the second ratio relatively low gain from defecting, resulting, in both situations, in greater cooperation levels. The index of riskiness ((R − S)/((R − S) + (T − P ))), as proposed by Au et al. (2011) is a measure of the risk of cooperation, with values ranging from 0 to 1. Finally, Baas et al. (2019) and Huang et al. (2019) propose that (T − R) and (P − S) values could be representative of greed (gain the maximum from exploiting opponents cooperativeness) and threat (fear of being exploited) respectively, linking them both (mainly greed in older adults) to non-cooperative behavior.

Matrix payoffs are not just numbers, they represent units of reward. These rewards could either be real or hypothetical and whether that would influence the participants’ decisions remains unclear. PD experiments that tried to test this (Locey, Jones & Rachlin, 2011) produced ambivalent results and in general, the literature of decision making research examining hypothetical over real rewards has produced inconclusive results (Hertwig & Ortmann, 2001; Madden et al., 2003; Mieth et al., 2021). It is generally believed however, that real monetary rewards offer greater incentives compared to equivalent in value but hypothetical rewards (Hertwig & Ortmann, 2001; Bruner, Goodnow & Austin, 2017). However, how these would alter decisions in a PD setting, remains unknown. Nonetheless, Furlong and Offer suggest that what influences cooperation, is not the real value of payoffs but their numeric representation, meaning for example, that a reward of 300c would induce greater cooperational behaviour compared to that of 3$ (Furlong & Opfer, 2009).

In practice, in the literature, the most commonly encountered matrix values include quadruples T:3 R:2 P:1 S:0, T:5 R:3 P:1 S:0 and their multiples (i.e., T:30 R:20 P:10 S:0, T:15 R:9 P:3 S:0) as well as adding or subtracting a value from each of these payoff numbers (i.e., T:10 R:5 P:0 S:–5) (Supplemental Material).

Many experiments in the literature use negative values for P and S. Locey & Rachlin (2011) have shown that actual losses as punishment, lead to higher cooperation levels than gain reductions a finding that is consistent with previous literature (Gray & Tallman, 1987). Additionally, positive payoffs compared to losses (i.e., negative values for T, R, P and S), also favour cooperation (Sun et al., 2016; Deutchman & Sullivan, 2018).

Another aspect of the matrix that can influence the task’s outcome is the way it is presented (Evans & Crumbaugh, 1966) or in other words the context in which the conflict situation is presented to the participants. This could be either a grid representation of outcomes of every decision combination, i.e., T:3 R:2 P:1 S:0, or a non-grid representation, i.e., ‘either take 1 or give opponent 2’ which results to the same aforementioned grid. In Evans & Crumbaugh (1966), the non-grid representation which appeared to be better understood from participants, favoured cooperation.

Playing an ‘iterated’ PD of many rounds or multiple ‘one-shot’ games all with an unchanged reward matrix, can lead to boredom, loss of focus or strategic decisions that are made without adequate consideration. This is an important issue especially in psychophysiological experiments. Our literature search, revealed two ways to circumvent those issues and keep the participants interested in the task. Both ways involve the use of a dynamic matrix that entails changing payoffs in every round either by keeping the T − R difference the same (Emonds et al., 2013; Lambert et al., 2017) or by slightly altering the T, R and P values keeping however, the mean value of each parameter constant (Zhang et al., 2019).

Another reason for different amounts of reward associated with each round, is that the available amount might not be fixed for each round. This could be the case in more dynamic PD designs where the player decides how many units to give to the opponent (i.e., ‘Give doubled’, ‘Joint invest’) or in a network or multiplayer PD experiment where the final payoff of a round depends on the decision of other parties. Finally, in some cases, the payoffs can change as a result of decisions that occur between rounds such as the decision to punish the opponent after each round. (Fehl et al., 2012; Bone et al., 2015, 2016; Mieth et al., 2021). Payoffs can also change as the result of environmental variability, for example, a change in the reward values (Taheri, Rotshtein & Beierholm, 2018) or random decision switch (Rodebaugh et al., 2011, 2013; Grujić et al., 2012) can simulate uncertainty, which facilitates forgiveness and cooperation.

Based on payoff asymmetries, PD is distinguished in ‘symmetric’ and ‘asymmetric’. Most experiments implement a symmetric PD, there are however, a few experiments with asymmetric PD games (Gerbasi & Prentice, 2013; Aksoy & Weesie, 2014; Antonioni et al., 2018) where the reward asymmetry is used as a means of implementing inequality and hierarchy like in most social interactions.

Some PD literature uses matrices with more than four outcomes or in other words with a bigger payoff grid than 2 × 2. Wang et al. (2018) include for example, a reward option which plays the role of a more exploitable and beneficial for the opponent choice that is not associated with a greater payoff compared to cooperation. Pansini, Shi & Wang (2016) add the punishment option which is similar to cooperation in respect to one’s benefit but is most costly for the opponent. Finally, Insko et al. (2005) use the withdrawal option which results to the same payoff for both sides, irrespective of the opponents’ decision. Inserting one extra option as in the examples above leads to 9 (3 × 3 grid) possible rewards. Others (Houston et al., 2000; Knez & Camerer, 2000; Bone et al., 2016) use five or seven possible choices resulting to 25 or 49 possible payoffs. The rational behind these multichoice matrices is to create a reward gradient that could allow different cooperation and defection levels to be expressed without deviating too much from the typical PD game since taking into account only the first (defect) and last (cooperation) option of these multichoice matrices, still results in a valid typical PD.

Conclusions

The PD game is one of the most popular concepts in medical literature as demonstrated by the vast number of related research. This review article focuses on the technical characteristics of the PD task and more specifically on the settings that are implemented in medical studies involving human subjects and how these could influence the task’s outcome.

The use of the PD in medical literature commonly evolves around the study of social decision making including cooperation (Testori, Hoyle & Eisenbarth, 2019), conflict (McClure et al., 2007) and moral judgement (Li, Zhu & Gummerum, 2014) as well as group dynamics and how these are affected in clinical populations (Testori, Hoyle & Eisenbarth, 2019) and/or after pharmacological intervention (Lane & Gowin, 2009; Rilling et al., 2012; Gabay et al., 2018). The PD game has also been used for the study of decision making and social characteristics in almost all psychiatric disorders, including psychotic and mood disorders, substance abuse and anxiety disorders (for relevant studies see McClure et al. (2007), Sorgi & Wout (2016), Bitsch et al. (2018b), Viola et al. (2019)). The majority of these studies however, focus on Autism Spectrum Disorder and personality traits where atypical social behaviour is a core characteristic (for studies using the PD on these disorders see Sheldon, Sheldon & Osbaldiston (2000), Halevy, Bornstein & Sagiv (2008), Locey, Safin & Rachlin (2012), Wang, Suri & Watts (2012), Li, Zhu & Gummerum (2014); De Dreu, Dussel & Velden, 2015, Weisel & Böhm (2015)).

Despite its widespread use in medical research, the PD game is not a diagnostic tool and no standardised form of the task exists. On the contrary, the parameters and settings of the task which comprise the paradigm’s technical characteristics and could significantly influence its outcome, vary depending on the focus of the research studies as well as the research population that is under investigation. On one hand, this “flexibility” in the PD settings allows investigators to adjust the paradigm in order to best address their research questions. On the other hand, however, it leads to a great heterogeneity that limits the reproducibility of the task’s outcomes and makes the experimental design of a research study involving the PD game a very arduous task.

In our review, we have focused on the typical one-vs.-one type of PD, as does the majority of the literature (207 out of 228 experiments) and identified, the seven parameters that need to be considered when setting up a PD experiment. For each parameter, we discuss the different settings that were used in the literature and how each one could influence the outcome of the experiment.

The vast majority of the medical research experiments utilise the PD task to examine cooperation between individuals with fewer studies looking at interaction between groups. More complicated designs of the opponent parties allow the examination of more complex aspects of human behaviour including altruism and social network dynamics.

Given the flexibility and reliability of having a computational experimental model, it is not surprising that the majority of the studies have used computers to conduct interactions and consequently, experimenters always have to consider whether or not to explicitly inform or even manipulate the subjects into believing they are competing against a real human opponent. Although when studying human behaviour, the influence of this factor remains controversial, most studies include such a manipulation in their experiments. However, when real human interactions are chosen, especially in a vis-a-vis setting, the role of the players’ emotions should also be taken into account since it has been shown that along with the players’ reputation and past behaviour they could greatly influence cooperation during the PD (de Melo & Terada, 2020).

As far as the interaction flow parameter is concerned, simultaneous PD which is the standard form and more accurately encompasses the nature of the dilemma is the most common type of PD. In the simultaneous PD the players are equals in terms of the game, and they just need to respond to the opponent’s move. Such a design is thus closer to the original PD design. In the sequential PD, however, the role of the players (player A or player B) defines their hierarchy and who would be controlling the game as in each round the player who responded last (Player B), has the advantage of knowing their opponent’s decision. This type of PD has been shown to promote cooperation. Cooperation is also favoured with ‘iterated’ PD which in its turn can be used to study more refined aspects of cooperation including the influence of hierarchy and reputation in decision making. In the case of an ‘iterated’ PD that consists of multiple, continuous ‘one-shot’ games the way in which the partners are matched is crucial. If partner matching is adopted, there is a high probability or even the certainty (fixed matching) of encountering the same opponent in a future match (Bó, 2005; Duffy & Ochs, 2009). In random matching however, this possibility is practically zero. Partner matching has a bold effect in the expected cooperation levels of ‘one shot’ PD (Rand & Nowak, 2013).

The total number of rounds is important only in ‘iterated’ PD since in order for strategies and cooperation to be established a sufficient number of rounds is required. Additionally, fewer rounds are associated with a higher risk of unsuccessful negotiation and loss and both factors should be considered when designing an ‘iterated’ PD experiment. Additionally, knowledge of the total number of rounds (finite vs. infinite round number) can alter someone’s strategy especially during the final rounds where the chance of retaliating decreases, this is referred to as the end game effect and is well documented (Lave, 1965; Selten & Stoecker, 1986; Normann & Wallace, 2011). To avoid end game effects, the final round of the game could be stochastically determined however, the efficiency of this approach in mitigating end-game effects is controversial. Dal Bo et al. have shown that the degree of cooperation between participants increases with increased continuation probability (Dal Bó & Fréchette, 2019), while Norman and Wallace concluded that any level of continuation probability does not significantly increase cooperation (Normann & Wallace, 2011).

In the literature, the instructions given to PD players are not well documented and, in the majority of the studies, they could only be inferred by looking at the description of the reward matrix. However, the instructions of the PD experiment are important since they determine the focus of the task and could influence the players’ behaviour. The context of the PD instructions, as stated in the relevant section, usually is deterring for defecting behaviours while at the same time, the traditional narrative of the PD instructions entails a high moral and social context. Such context is more comprehensive to participants and could be necessary depending on the experiment’s goal, for example when focus is given on measuring cooperation or studying other socially derived factors (Krach et al., 2008; Deutchman & Sullivan, 2018; Eimontaite et al., 2019).

Strategies only concern the ‘iterated’ version of PD. Algorithms are used to imitate human behaviour and to create a controlled environment for promoting and examining certain behaviours (i.e., cooperation) as well as for intra-individual correlations. The most common strategy appears to be ‘tft’, with the majority of research following non-typical and mixed strategies. The use of mixed and non-typical strategies not only contributes to more unwanted variability in research but also creates an extra difficulty concerning the categorization of these strategies, an issue that is not at all discussed in the literature. When classifying the studies of this review based on their strategy implementation, we were faced with the challenge of defining the margin over which a typical strategy fails to be representative of its family. For example, is a ‘tft’ strategy with standard defection in 10% of rounds still a ‘tft’ strategy? Keeping in mind that ‘suspicious-tft’ strategy, where there is a defection in the first round, is still considered a ‘tft’ strategy we have decided to set this cut off to two rounds.

The true essence of a PD game lays on its reward matrix which has to comply with certain rules (T > R > P > S, 2 * R > T + S). Most experiments use static matrices and positive values (T:3 R:2 P:1 S:0 rather that T:0 R:–1 P:–2 S:–3) with the most frequently chosen combinations these of T:3 R:2 P:1 S:0 and T:5 R:3 P:1 S:0 (Supplemental Material). It should be noted that oftentimes the literature associates the term PD with experiments such as snowdrift where the reward matrix does not adhere to the typical rules of a PD experiment, despite that, we have decided to include those experiments in our review. A lot of matrices use as rewards multiples of T:3 R:2 P:1 S:0 (and others) however, as mentioned in the relevant section, they have differential effects on the game, compared to the original matrices. Many PD experiments involved the use of multiple matrices (38 studies), that may even differ from round to round, or matrices with more than four outcomes (i.e., multiplayer PD). Beyond allowing the study of rewards and their effects, multiple matrices also serve motivational purposes or were a part of the game design chosen by the researchers, like in the case of asymmetrical PD.

Computers, as already mentioned, are the main medium, between the players and the researchers in a PD experiment. Computer applications implement all the parameters discussed in this review, along with the user interface, the time specifics (i.e., delays) and the interaction, delivering the final environment of each PD. ‘Ztree’, ‘EMOTICOM’, ‘E-Prime’, ‘MATLAB’, ‘AmazonTurk’ and ‘FlexibleIPD’ are some of the frameworks used for delivering PD tasks. In addition to the fact that there is no shared application between experiments, most of researchers do not include their application specifics. Given its crucial role, such information (i.e., algorithm for strategies simulation, images of the user interface, application’s source code) should at least be included in the Supplemental Material.

The seven parameters that we have identified and explained in our review could significantly influence the outcome of the PD task. The different settings that these parameters can take, make it very difficult to conclude to a unique set of parameters that would be optimal for PD medical research. Additionally, when PD research is published a lot of the crucial information around the task’s settings is not properly described making the replication of the PD findings and a consensus around the PD settings even more difficult to achieve. A shared common tool used for PD experiments in medical research would increase methodologies’ credibility, mitigate cross-validation discrepancies and relieve researchers from the difficulties of developing and testing yet another computer application.

Supplemental Information

PRISMA checklist.

DOI: 10.7717/peerj.12829/supp-1

Download

Summary of all the studies selected for this review.

The 228 publications that were reviewed for our article are classified based on the choices that the experimenters have made for each of the seven parameters described in our review.

DOI: 10.7717/peerj.12829/supp-2

Download

[1] Acosta H, Straube B, Kircher T. 2019. Schizotypy and mentalizing: an fMRI study. Neuropsychologia 124(11):299-310

[2] Aksoy O, Weesie J. 2014. Hierarchical Bayesian analysis of outcome- and process-based social preferences and beliefs in dictator games and sequential prisoner’s dilemmas. Social Science Research 45(4):98-116

[3] Alekseev A, Charness G, Gneezy U. 2017. Experimental methods: when and why contextual instructions are important. Journal of Economic Behavior & Organization 134(2):48-59

[4] Alves H, Koch A, Unkelbach C. 2017. The common good phenomenon: why similarities are positive and differences are negative. Journal of Experimental Psychology: General 146(4):512-528

[5] Alves H, Koch A, Unkelbach C. 2018. A cognitive-ecological explanation of intergroup biases. Psychological Science 29(7):1126-1133

[6] Antonioni A, Pereda M, Cronin KA, Tomassini M, Sánchez A. 2018. Collaborative hierarchy maintains cooperation in asymmetric games. Scientific Reports 8(1):5375

[7] Apt KR, Grädel E. 2011. Lectures in game theory for computer scientists. Cambridge: Cambridge University Press.

[8] Arechar A, Kouchaki M, Rand D. 2018. Examining spillovers between long and short repeated prisoner’s dilemma games played in the laboratory. Games 9(1):5

[9] Au WT, Lu S, Leung H, Yam P, Fung JMY. 2011. Risk and prisoner’s dilemma: a reinterpretation of coombs’ re-parameterization. Journal of Behavioral Decision Making 25(5):476-490

[10] Axelrod R. 1984. The evolution of cooperation. New York: Basic Books.

[11] Axelrod R, Dion D. 1988. The further evolution of cooperation. Science 242(4884):1385-1390

[12] Axelrod R, Hamilton WD. 1981. The evolution of cooperation. Science 211(4489):1390-1396

[13] Baas M, Roskes M, Koch S, Cheng Y, Dreu CKWD. 2019. Why social threat motivates malevolent creativity. Personality and Social Psychology Bulletin 45(11):1590-1602

[14] Babiloni F, Astolfi L, Cincotti F, Mattia D, Tocci A, Tarantino A, Marciani M, Salinari S, Gao S, Colosimo A, De Vico Fallani F. 2007. Cortical activity and connectivity of human brain during the prisoner’s dilemma: an EEG hyperscanning study. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference 2007:4953-4956

[15] Back F, Carra G, Chiaramonte MS, Oliveira A. 2009. Dictator’s game and prisoner’s dilemma in an EEG study on money donation. Psychology & Neuroscience 2(2):205-210

[16] Balliet D, Li NP, Macfarlan SJ, Van Vugt M. 2011. Sex differences in cooperation: a meta-analytic review of social dilemmas. Psychological Bulletin 137(6):881-909

[17] Bitsch F, Berger P, Nagels A, Falkenberg I, Straube B. 2018a. The role of the right temporo-parietal junction in social decision-making. Human Brain Mapping 39(7):3072-3085

[18] Bitsch F, Berger P, Nagels A, Falkenberg I, Straube B. 2018b. Impaired right temporoparietal junction-hippocampus connectivity in schizophrenia and its relevance for generating representations of other minds. Schizophrenia Bulletin 45(4):934-945

[19] Bone JE, Wallace B, Bshary R, Raihani NJ. 2015. The effect of power asymmetries on cooperation and punishment in a prisoner’s dilemma game. PLOS ONE 10(1):e0117183

[20] Bone JE, Wallace B, Bshary R, Raihani NJ. 2016. Power asymmetries and punishment in a prisoner’s dilemma with variable cooperative investment. PLOS ONE 11(5):e0155773

[21] Boyd R, Lorberbaum JP. 1987. No pure strategy is evolutionarily stable in the repeated prisoner’s dilemma game. Nature 327(6117):58-59

[22] Brown J, Rachlin H. 1999. Self-control and social cooperation. Behavioural Processes 47:65-72

[23] Bruner JS, Goodnow JJ, Austin GA. 2017. A study of thinking. New York: Routledge.

[24] Bó PD. 2005. Cooperation under the shadow of the future: experimental evidence from infinitely repeated games. American Economic Review 95(5):1591-1604

[25] Capraro V, Cococcioni G. 2015. Social setting, intuition and experience in laboratory experiments interact to shape cooperative decision-making. Proceedings of the Royal Society B: Biological Sciences 282(1811):20150237

[26] Capraro V, Jagfeld G, Klein R, Mul M, de Pol I. 2019. Increasing altruistic and cooperative behaviour with simple moral nudges. Scientific Reports 9(1):11880

[27] Capraro V, Jordan JJ, Rand DG. 2014. Heuristics guide the implementation of social preferences in one-shot prisoner’s dilemma experiments. Scientific Reports 4(1):6790

[28] Capraro V, Rand DG. 2018. Do the right thing: experimental evidence that preferences for moral behavior, rather than equity or efficiency per se, drive human prosociality. Rochester, NY: Social Science Research Network.

[29] Chang LJ, Doll BB, van’t Wout M, Frank MJ, Sanfey AG. 2010. Seeing is believing: trustworthiness as a dynamic belief. Cognitive Psychology 61(2):87-105

[30] Colman AM, Pulford BD, Krockow EM. 2018. Persistent cooperation and gender differences in repeated prisoner’ dilemma games: some things never change. Acta Psychologica 187(7–8):1-8

[31] Cosmides L. 1989. The logic of social exchange: has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition 31(3):187-276

[32] Dal Bó P, Fréchette GR. 2019. Strategy choice in the infinitely repeated prisoner’s dilemma. American Economic Review 109(11):3929-3952

[33] de Melo CM, Terada K. 2020. The interplay of emotion expressions and strategy in promoting cooperation in the iterated prisoner’s dilemma. Scientific Reports 10(1):14959

[34] Deutchman P, Sullivan J. 2018. The Dark Triad and framing effects predict selfish behavior in a one-shot prisoner’s dilemma. PLOS ONE 13(9):e0203891

[35] De Dreu CK, Dussel DB, Velden FST. 2015. In intergroup conflict, self-sacrifice is stronger among pro-social individuals, and parochial altruism emerges especially among cognitively taxed individuals. Frontiers in Psychology 6:572

[36] Duffy J, Ochs J. 2009. Cooperative behavior and the frequency of social interaction. Games and Economic Behavior 66(2):785-812

[37] Eimontaite I, Schindler I, De Marco M, Duzzi D, Venneri A, Goel V. 2019. Left amygdala and putamen activation modulate emotion driven decisions in the iterated prisoner’s dilemma game. Frontiers in Neuroscience 13:903

[38] Emonds G, Declerck CH, Boone C, Seurinck R, Achten R. 2013. Establishing cooperation in a mixed-motive social dilemma. An fMRI study investigating the role of social value orientation and dispositional trust. Social Neuroscience 9(1):10-22

[39] Evans GW, Crumbaugh CM. 1966. Effects of prisoner’s dilemma format on cooperative behavior. Journal of Personality and Social Psychology 3(4):486-488

[40] Fehl K, Sommerfeld RD, Semmann D, Krambeck HJ, Milinski M. 2012. I dare you to punish me—vendettas in games of cooperation. PLOS ONE 7(9):e45093

[41] Furlong EE, Opfer JE. 2009. Cognitive constraints on how economic rewards affect cooperation. Psychological Science 20(1):11-16

[42] Gabay AS, Kempton MJ, Gilleen J, Mehta MA. 2018. MDMA increases cooperation and recruitment of social brain areas when playing trustworthy players in an iterated prisoner’s dilemma. The Journal of Neuroscience 39(2):307-320

[43] Gerbasi ME, Prentice DA. 2013. The self- and other-interest inventory. Journal of Personality and Social Psychology 105(3):495-514

[44] Gray LN, Tallman I. 1987. Theories of choice: contingent reward and punishment applications. Social Psychology Quarterly 50(1):16

[45] Grujić J, Eke B, Cabrales A, Cuesta JA, Sánchez A. 2012. Three is a crowd in iterated prisoner’ dilemmas: experimental evidence on reciprocal behavior. Scientific Reports 2(1):1560

[46] Grujić J, Lenaerts T. 2020. Do people imitate when making decisions? Evidence from a spatial prisoner’s dilemma experiment. Royal Society Open Science 7(7):200618

[47] Halevy N, Bornstein G, Sagiv L. 2008. In-group love and out-group hate as motives for individual participation in intergroup conflict. Psychological Science 19(4):405-411

[48] Haruno M, Kawato M. 2009. Activity in the superior temporal sulcus highlights learning competence in an interaction game. Journal of Neuroscience 29(14):4542-4547

[49] Hehman E, Flake JK, Freeman JB. 2018. The faces of group members share physical resemblance. Personality and Social Psychology Bulletin 44(1):3-15

[50] Hertwig R, Ortmann A. 2001. Experimental practices in economics: a methodological challenge for psychologists? The Behavioral and Brain Sciences 24(3):383-403; discussion 403–451

[51] Houston JM, Kinnie J, Lupo B, Terry C, Ho SS. 2000. Competitiveness and conflict behavior in simulation of a social dilemma. Psychological Reports 86:1219-1225

[52] Huang Y, Chai J, Feng L, Yu R. 2019. Older adults show diminished sensitivity to potential losses in social bargaining. The Journals of Gerontology: Series B 75(8):1699-1704

[53] Insko CA, Kirchner JL, Pinter B, Efaw J, Wildschut T. 2005. Interindividual-intergroup discontinuity as a function of trust and categorization: the paradox of expected cooperation. Journal of Personality and Social Psychology 88(2):365-385

[54] Juvina I, Lebiere C, Gonzalez C, Saleem M. 2012. Intergroup prisoner’s dilemma with intragroup power dynamics and individual power drive. In: Social computing, behavioral - cultural modeling and prediction. Berlin Heidelberg: Springer. 290-297

[55] Kiesler S, Sproull L, Waters K. 1996. A prisoner’s dilemma experiment on cooperation with people and human-like computers. Journal of Personality and Social Psychology 70(1):47-65

[56] Kircher T, Blümel I, Marjoram D, Lataster T, Krabbendam L, Weber J, van Os J, Krach S. 2009. Online mentalising investigated with functional MRI. Neuroscience Letters 454(3):176-181

[57] Kiyonari T, Tanida S, Yamagishi T. 2000. Social exchange and reciprocity: confusion or a heuristic? Evolution and Human Behavior 21(6):411-427

[58] Knez M, Camerer C. 2000. Increasing cooperation in prisoner’s dilemmas by establishing a precedent of efficiency in coordination games. Organizational Behavior and Human Decision Processes 82(2):194-216

[59] Knight GP, Kagan S. 1977. Development of prosocial and competitive behaviors in Anglo-American and Mexican-American children. Child Development 48(4):1385

[60] Koch A, Dorrough A, Glöckner A, Imhoff R. 2020. The ABC of society: perceived similarity in agency/socioeconomic success and conservative-progressive beliefs increases intergroup cooperation. Journal of Experimental Social Psychology 90:103996

[61] Krach S, Hegel F, Wrede B, Sagerer G, Binkofski F, Kircher T. 2008. Can machines think? Interaction and perspective taking with robots investigated via fMRI. PLOS ONE 3(7):e2597

[62] Kuhlman DM, Marshello AF. 1975. Individual differences in game motivation as moderators of preprogrammed strategy effects in prisoner’s dilemma. Journal of Personality and Social Psychology 32(5):922-931

[63] Lambert B, Declerck CH, Emonds G, Boone C. 2017. Trust as commodity: social value orientation affects the neural substrates of learning to cooperate. Social Cognitive and Affective Neuroscience 12(4):609-617

[64] Lane SD, Gowin JL. 2009. GABAergic modulation of human social interaction in a prisoner’s dilemma model by acute administration of alprazolam. Behavioural Pharmacology 20(7):657-661

[65] Lave LB. 1965. Factors affecting co-operation in the prisoner’s dilemma. Behavioral Science 10:26-38

[66] Li J, Zhu L, Gummerum M. 2014. The relationship between moral judgment and cooperation in children with high-functioning autism. Scientific Reports 4(1):163

[67] Locey ML, Jones BA, Rachlin H. 2011. Real and hypothetical rewards. Judgment and Decision Making 6:552-564

[68] Locey ML, Rachlin H. 2011. Temporal dynamics of cooperation. Journal of Behavioral Decision Making 25(3):257-263

[69] Locey ML, Safin V, Rachlin H. 2012. Social discounting and the prisoner’s dilemma game. Journal of the Experimental Analysis of Behavior 99(1):85-97

[70] Madden GJ, Begotka AM, Raiff BR, Kastern LL. 2003. Delay discounting of real and hypothetical rewards. Experimental and Clinical Psychopharmacology 11(2):139-145

[71] Martin JM, Juvina I, Lebiere C, Gonzalez C. 2013. The effects of individual and context on aggression in repeated social interaction. Applied Ergonomics 44(5):710-718

[72] McClure EB, Parrish JM, Nelson EE, Easter J, Thorne JF, Rilling JK, Ernst M, Pine DS. 2007. Responses to conflict and cooperation in adolescents with anxiety and mood disorders. Journal of Abnormal Child Psychology 35(4):567-577

[73] Melamed D, Simpson B, Harrell A. 2017. Prosocial orientation alters network dynamics and fosters cooperation. Scientific Reports 7(1):1-560

[74] Mieth L, Schaper ML, Kuhlmann BG, Bell R. 2021. Memory and metamemory for social interactions: evidence for a metamemory expectancy illusion. Memory & Cognition 49(1):14-31

[75] Miwa K, Terai H. 2012. Impact of two types of partner, perceived or actual, in human–human and human–agent interaction. Computers in Human Behavior 28(4):1286-1297

[76] Normann HT, Wallace B. 2011. The impact of the termination rule on cooperation in a prisoner’s dilemma experiment. Rochester, NY: Social Science Research Network.

[77] Nowak MA, Sigmund K. 1992. Tit for tat in heterogeneous populations. Nature 355(6357):250-253

[78] Nowak M, Sigmund K. 1993. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner’s dilemma game. Nature 364(6432):56-58

[79] Pansini R, Shi L, Wang R-W. 2016. Women tend to defect in a social dilemma game in Southwest China. PLOS ONE 11(11):e0166101

[80] Papageorgiou C, Karanasiou IS, Tsianaka EI, Kyprianou M, Papadimitriou GN, Uzunoglu NK. 2013. Motive related positivity: decision-making during a prisoner’s dilemma task. Journal of Integrative Neuroscience 12(02):183-199

[81] Park JH, Schaller M. 2005. Does attitude similarity serve as a heuristic cue for kinship? Evidence of an implicit cognitive association. Evolution and Human Behavior 26(2):158-170

[82] Poundstone W. 1993. Prisoner’s dilemma/John Von Neumann, game theory and the puzzle of the bomb. In: Anchor Books. New York: Random House, Inc.

[83] Press WH, Dyson FJ. 2012. Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent. Proceedings of The National Academy of Sciences of The United States of America 109(26):10409-10413

[84] Rand DG, Nowak MA. 2013. Human cooperation. Trends in Cognitive Sciences 17(8):413-425

[85] Rapoport A. 1967. A note on the index of cooperation for prisoner’s dilemma. Journal of Conflict Resolution 11(1):100-103

[86] Rapoport A, Chammah A. 1965. Individual attitude toward risk and choice between prisoner’s dilemma games. Ann Arbor: University of Michigan Press.

[87] Rilling JK, Gutman DA, Zeh TR, Pagnoni G, Berns GS, Kilts CD. 2002. A Neural Basis for Social Cooperation. Neuron 35:395-405

[88] Rilling JK, DeMarco AC, Hackett PD, Thompson R, Ditzen B, Patel R, Pagnoni G. 2012. Effects of intranasal oxytocin and vasopressin on cooperative behavior and associated brain activity in men. Psychoneuroendocrinology 37(4):447-461

[89] Rilling JK, Sanfey AG, Aronson JA, Nystrom LE, Cohen JD. 2004a. The neural correlates of theory of mind within interpersonal interactions. NeuroImage 22(4):1694-1703

[90] Rilling JK, Sanfey AG, Aronson JA, Nystrom LE, Cohen JD. 2004b. Opposing BOLD responses to reciprocated and unreciprocated altruism in putative reward pathways. NeuroReport 15(16):2539-2543

[91] Roberts G, Sherratt TN. 1998. Development of cooperative relationships through increasing investment. Nature 394(6689):175-179

[92] Rodebaugh TL, Klein SR, Yarkoni T, Langer JK. 2011. Measuring social anxiety related interpersonal constraint with the flexible iterated prisoner’s dilemma. Journal of Anxiety Disorders 25(3):427-436

[93] Rodebaugh TL, Shumaker EA, Levinson CA, Fernandez KC, Langer JK, Lim MH, Yarkoni T. 2013. Interpersonal constraint conferred by generalized social anxiety disorder is evident on a behavioral economics task. Journal of Abnormal Psychology 122(1):39-44

[94] Safin V, Arfer KB, Rachlin H. 2015. Reciprocation and altruism in social cooperation. Behavioural Processes 116(2):12-16

[95] Safin V, Locey ML, Rachlin H. 2013. Valuing rewards to others in a prisoner’s dilemma game. Behavioural Processes 99:145-149

[96] Selten R, Stoecker R. 1986. End behavior in sequences of finite prisoner’s dilemma supergames a learning theory approach. Journal of Economic Behavior & Organization 7(1):47-70

[97] Shafir E, Tversky A. 1992. Thinking through uncertainty: nonconsequential reasoning and choice. Cognitive Psychology 24(4):449-474

[98] Sheldon KM, Sheldon MS, Osbaldiston R. 2000. Prosocial values and group assortation. Human Nature 11(4):387-404

[99] Sherman R. 1967. Prisoner’s dilemma: a study in conflict and co-operation. The Journal of Psychology 66(2):291-298

[100] Sorgi KM, Wout M. 2016. The influence of cooperation and defection on social decision making in depression: A study of the iterated prisoner’s dilemma game. Psychiatry Research 246:512-519

[101] Sun P, Zheng L, Li L, Guo X, Zhang W, Zheng Y. 2016. The neural responses to social cooperation in gain and loss context. PLOS ONE 11(8):e0160503

[102] Taheri M, Rotshtein P, Beierholm U. 2018. The effect of attachment and environmental manipulations on cooperative behavior in the prisoner’s dilemma game. PLOS ONE 13(11):e0205730

[103] Testori M, Hoyle RB, Eisenbarth H. 2019. How group composition affects cooperation in fixed networks: can psychopathic traits influence group dynamics? Royal Society Open Science 6(3):181329

[104] Thompson K, Nahmias E, Fani N, Kvaran T, Turner J, Tone E. 2021. The prisoner’s dilemma paradigm provides a neurobiological framework for the social decision cascade. PLOS ONE 16(3):e0248006

[105] Viola TW, Niederauer JPO, Kluwe-Schiavon B, Sanvicente-Vieira B, Grassi-Oliveira R. 2019. Cocaine use disorder in females is associated with altered social decision-making: a study with the prisoner’s dilemma and the ultimatum game. BMC Psychiatry 19(1):211

[106] Wang Z, Jusup M, Shi L, Lee JH, Iwasa Y, Boccaletti S. 2018. Exploiting a cognitive bias promotes cooperation in social dilemma experiments. Nature Communications 9(1)

[107] Wang J, Suri S, Watts DJ. 2012. Cooperation and assortativity with dynamic partner updating. Proceedings of The National Academy of Sciences of The United States of America 109(36):14363-14368

[108] Wedekind C, Milinski M. 1996. Human cooperation in the simultaneous and the alternating prisoner’s dilemma: pavlov versus generous tit-for-tat. Proceedings of The National Academy of Sciences of The United States of America 93(7):2686-2689