Prisoners Dilemma Calculator - 2x2 Payoff Matrix Solver

Prisoners dilemma calculator that turns T, R, P, and S years-in-prison values into a full 2x2 payoff matrix, with the Nash equilibrium and Pareto-optimal cell flagged.

Updated: June 16, 2026 • Free Tool

Prisoners Dilemma Calculator

Results

• Player 1 payoff when both cooperate
• Player 1 payoff when P1 cooperates, P2 defects
• Player 1 payoff when P1 defects, P2 cooperates
• Player 1 payoff when both defect
• Player 2 payoff when both cooperate
• Player 2 payoff when P1 cooperates, P2 defects
• Player 2 payoff when P1 defects, P2 cooperates
• Player 2 payoff when both defect
• Total social welfare at (C, C)
• Total social welfare at (D, D)
• Cooperation gain (DD welfare - CC welfare)
• Player 1 dominant strategy
• Player 2 dominant strategy
• Nash equilibrium cell
• Pareto-optimal cell
• Valid prisoners dilemma ordering

What Is the Prisoners Dilemma Calculator?

A prisoners dilemma calculator takes four payoff values (T, R, P, and S) and turns them into a 2x2 payoff matrix so you can read the dominant strategy, the Nash equilibrium, and the best joint outcome.

• Game theory homework: Work through a textbook payoff matrix and confirm the dominant strategy and Nash equilibrium.
• Teaching cooperation and defection: Show students why individually rational choices can produce a worse group result, using the 2x2 matrix.
• Negotiation or pricing analysis: Plug in custom payoffs to model a price-fixing meeting, an arms-race decision, or competitive bidding.
• Sensitivity checks on payoffs: Test whether changing the sucker or temptation payoff by one unit flips the Nash equilibrium from (D, D) to (C, C).

The classical prisoners dilemma story is short: two suspects are questioned separately and can stay silent (cooperate) or confess (defect). If both stay silent they each get a small sentence. If one confesses and the other does not, the confessor walks free and the silent one gets the longest sentence. If both confess they each get a medium sentence. The dominant strategy is to confess.

The calculator applies that story to the four outcomes (CC, CD, DC, DD) and labels the Nash equilibrium and any cell that Pareto-dominates it. The same machinery works for any two-player symmetric game with the same payoff ordering, which is why economics and political science classes use it to teach non-cooperative game theory.

The payoff matrix is a 2x2 grid where each cell holds the payoffs for one pair of binary strategy choices. A truth table generator builds the same kind of grid, mapping every combination of binary inputs to a cell value.

How the Prisoners Dilemma Calculator Works

The prisoners dilemma payoff matrix is built directly from your four inputs. Once the four cells are filled in, the calculator checks which cell is a Nash equilibrium, which cells are Pareto-superior to it, and how much welfare the two players gain or lose by switching from mutual defection to mutual cooperation.

Player 1 payoff matrix: [[R, S], [T, P]] | Player 2 payoff matrix: [[R, T], [S, P]]

T: The defector's payoff when the other player cooperates. In the classic story this is 0 (the defector walks free).
R: Both players' payoff when they both cooperate. In the classic story this is 1 year each.
P: Both players' payoff when they both defect. In the classic story this is 2 years each.
S: The cooperator's payoff when the other player defects. In the classic story this is 3 years, the worst individual outcome.

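Each player holds the other player's move fixed and compares their own two payoffs: Player 1 compares the two cells in each column, and Player 2 compares the two cells in each row. When payoffs are years in prison, the lower number is better. When T < R and P < S, defection strictly dominates cooperation for both players and the unique Nash equilibrium is the (D, D) cell.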
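The mapping and the dominance check can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function names `build_matrices` and `defection_dominates` are hypothetical, and payoffs are assumed to be years in prison (lower is better).

```python
def build_matrices(T, R, P, S):
    """Return (p1, p2) payoff grids indexed [p1_move][p2_move], with 0 = C and 1 = D."""
    p1 = [[R, S], [T, P]]  # Player 1: CC=R, CD=S, DC=T, DD=P
    p2 = [[R, T], [S, P]]  # Player 2: CC=R, CD=T, DC=S, DD=P
    return p1, p2


def defection_dominates(T, R, P, S):
    """True when defecting means strictly fewer years whatever the other player does."""
    return T < R and P < S


p1, p2 = build_matrices(T=0, R=1, P=2, S=3)
print(p1)  # [[1, 3], [0, 2]]
print(p2)  # [[1, 0], [3, 2]]
print(defection_dominates(0, 1, 2, 3))  # True
```

Because the game is symmetric, one dominance check covers both players.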

The cooperation gain is the welfare at DD minus the welfare at CC, where welfare is the sum of both players' payoffs in a cell. When payoffs are years in prison, that difference is positive in a valid prisoners dilemma: the (D, D) equilibrium costs the pair more total years than the (C, C) cell. (With utility payoffs, where higher is better, the same difference is negative.) That gap is why the defection equilibrium looks irrational from a social-welfare perspective even though it is the only stable outcome in a single shot.
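As a rough sketch of that arithmetic, assuming years-in-prison payoffs and the hypothetical helper names `welfare` and `cooperation_gain`:

```python
def welfare(cell):
    """Sum of the two players' payoffs in one cell."""
    return cell[0] + cell[1]


def cooperation_gain(R, P):
    """DD welfare minus CC welfare for a symmetric game."""
    cc = (R, R)  # both cooperate
    dd = (P, P)  # both defect
    return welfare(dd) - welfare(cc)


print(cooperation_gain(R=1, P=2))  # 2
```

In a symmetric game the gain reduces to 2 * (P - R), so only the reward and punishment payoffs affect it.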

Classic 0, 1, 2, 3 years-in-prison payoffs

T = 0, R = 1, P = 2, S = 3

Player 1 prefers to defect when the other cooperates (T=0 vs R=1) and when the other defects (P=2 vs S=3). Player 2 reasons the same way, so the Nash equilibrium is the (D, D) cell at (2, 2).

CC = (1, 1), CD = (3, 0), DC = (0, 3), DD = (2, 2). The Nash cell DD totals 4 years; CC totals 2 years and Pareto-dominates DD, while CD and DC are also Pareto-efficient on the frontier.

Each player serves one extra year compared with mutual cooperation, so the cooperation gain is 2 years saved in total by staying silent together.

Axelrod-style tournament payoffs (5, 3, 1, 0)

T = 5, R = 3, P = 1, S = 0

These payoffs are points, so higher is better. T=5 beats R=3 and P=1 beats S=0, so defection dominates for both players and the Nash equilibrium stays at DD = (1, 1) per round.

CC = (3, 3), CD = (0, 5), DC = (5, 0), DD = (1, 1). Nash DD has welfare 2 per round; CC has welfare 6 per round and Pareto-dominates DD, while CD and DC remain Pareto-efficient.

Robert Axelrod used this payoff table in his iterated prisoners dilemma tournaments. The 4-point CC vs DD gap per round is the long-run bonus that lets tit-for-tat outperform always-defect in repeated play.

According to the Stanford Encyclopedia of Philosophy, the standard payoff ordering is T > R > P > S in utility terms, or S > P > R > T when payoffs are costs such as years in prison.
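A validity check for either convention might look like the sketch below. The function name `is_valid_pd` and the `lower_is_better` flag are assumptions made for illustration; the source does not say how the calculator implements this check.

```python
def is_valid_pd(T, R, P, S, lower_is_better=True):
    """Check the strict prisoners dilemma ordering for cost or utility payoffs."""
    if lower_is_better:
        # Costs such as years in prison: the sucker payoff is the largest number.
        return S > P > R > T
    # Utilities such as points: the temptation payoff is the largest number.
    return T > R > P > S


print(is_valid_pd(0, 1, 2, 3))                         # True
print(is_valid_pd(5, 3, 1, 0, lower_is_better=False))  # True
print(is_valid_pd(0, 2, 1, 3))                         # False
```

The last call fails because mutual defection (P = 1) costs less than mutual cooperation (R = 2), so there is no dilemma.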

Key Concepts Behind the Prisoners Dilemma

These four ideas explain why the matrix looks the way it does and why the equilibrium is so often mutual defection.

Dominant strategy

A strategy that beats the alternative no matter what the other player does. In a valid prisoners dilemma defection strictly dominates cooperation for both players, which is why the matrix always leans toward the (D, D) corner.

Nash equilibrium

A pair of strategies where no player can improve by changing their move while the other player holds theirs fixed. In a single-shot prisoners dilemma the (D, D) cell is the unique Nash equilibrium, even when mutual cooperation would make both players better off.
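One way to find the equilibrium is to test every cell for a profitable one-player deviation. The sketch below assumes payoff grids indexed [p1_move][p2_move] with 0 = cooperate and 1 = defect; `nash_cells` is a hypothetical name, not part of the calculator.

```python
from itertools import product


def nash_cells(p1, p2, lower_is_better=True):
    """Return every (row, col) cell where neither player gains by deviating alone."""
    if lower_is_better:
        better = lambda a, b: a < b
    else:
        better = lambda a, b: a > b
    cells = []
    for r, c in product(range(2), repeat=2):
        # Player 1 switches rows while Player 2's column stays fixed.
        p1_deviates = better(p1[1 - r][c], p1[r][c])
        # Player 2 switches columns while Player 1's row stays fixed.
        p2_deviates = better(p2[r][1 - c], p2[r][c])
        if not p1_deviates and not p2_deviates:
            cells.append((r, c))
    return cells


# Classic years-in-prison payoffs: T = 0, R = 1, P = 2, S = 3.
p1 = [[1, 3], [0, 2]]
p2 = [[1, 0], [3, 2]]
print(nash_cells(p1, p2))  # [(1, 1)]
```

The only surviving cell is (1, 1), which is (D, D).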

Pareto-optimal outcome

An outcome where you cannot make one player better off without making the other worse off. In a valid prisoners dilemma the (C, C) cell Pareto-dominates (D, D), so defection wastes social welfare, but the (C, D) and (D, C) cells are also Pareto-efficient on the frontier.
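The Pareto check can be sketched as a pairwise comparison: a cell is efficient when no other cell is at least as good for both players and different in at least one payoff. The helper name `pareto_efficient_cells` is an assumption for this example.

```python
def pareto_efficient_cells(cells, lower_is_better=True):
    """cells maps a label to (p1_payoff, p2_payoff); return labels no other cell dominates."""
    sign = -1 if lower_is_better else 1  # flip costs so that bigger always means better

    def dominates(a, b):
        a = (sign * a[0], sign * a[1])
        b = (sign * b[0], sign * b[1])
        return a[0] >= b[0] and a[1] >= b[1] and a != b

    return [
        name
        for name, payoff in cells.items()
        if not any(dominates(other, payoff) for other in cells.values())
    ]


classic = {"CC": (1, 1), "CD": (3, 0), "DC": (0, 3), "DD": (2, 2)}
print(pareto_efficient_cells(classic))  # ['CC', 'CD', 'DC']
```

Only DD drops out, because CC gives both players a shorter sentence.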

Total social welfare

The sum of the two players' payoffs in a cell. The cooperation gain equals the welfare at DD minus the welfare at CC, which matches the result panel; in the classic story that gain is 2 years, the extra time the two players serve in total when they both defect.

Whether defection truly dominates cooperation depends on the relative size of the four payoffs. A ratio calculator lets you compare the temptation-to-reward and sucker-to-punishment ratios as a check on the strict T > R > P > S (or S > P > R > T) ordering.

How to Use This Prisoners Dilemma Calculator

Set the four payoffs, then read the matrix, the equilibrium flags, and the welfare comparison in that order.

1 Enter the temptation payoff (T): Payoff for the defector when the other player cooperates. Classic years-in-prison value: 0.
2 Enter the reward payoff (R): Payoff for each player when they both cooperate. Classic value: 1 year.
3 Enter the punishment payoff (P): Payoff for each player when they both defect. Classic value: 2 years.
4 Enter the sucker payoff (S): Payoff for the cooperator when the other player defects. Classic value: 3 years.
5 Read the 2x2 payoff matrix: The result panel reproduces the four cells CC, CD, DC, and DD with the row player first and the column player second.
6 Read the equilibrium flags and cooperation gain: The Nash equilibrium and the cells that Pareto-dominate it are flagged, the dominant strategy for each player is printed, and the cooperation gain shows how much welfare the defection equilibrium wastes.

If you want to model a price-fixing meeting where both firms can either hold the cartel (cooperate) or undercut the other (defect), start with T = 0, R = 1, P = 2, S = 3 and read each number as units of lost profit. The calculator will show that defection dominates, so each firm ends up losing 2 units instead of the 1 unit it would lose by holding the cartel.

The total social welfare at a cell is just the sum of the two players' payoffs, and a basic addition calculator does that two-number sum the same way the welfare totals in the result panel do.

Benefits of Using This Prisoners Dilemma Calculator

These are the practical reasons to run the numbers rather than reason them out in your head.

• Reads the matrix in seconds: The calculator fills the four cells and prints the dominant strategy for each player.
• Flags the Nash equilibrium and Pareto-superior cells: You see at a glance which cell is the equilibrium and which cells are strictly better for both players.
• Quantifies the cooperation gain: The welfare comparison shows how many years defection wastes.
• Accepts custom T, R, P, S values: Plug in any payoff set, not only the classic numbers.
• Works for classroom and negotiation work: The matrix appears in game theory homework, oligopoly pricing, and arms-control prep.

The cooperation gain is the signed change from CC welfare to DD welfare. An absolute change calculator reports the same signed difference along with its magnitude, which is how the prisoners dilemma analysis expresses the welfare gap.

Factors That Affect the Prisoners Dilemma Outcome

The numbers you type in set the single-shot result, but the real-world outcome often depends on these additional features of the situation.

Whether the game is repeated

In a single-shot prisoners dilemma defection dominates. In an iterated version with a long shadow of the future, reciprocity strategies such as tit-for-tat can sustain cooperation and shift the long-run outcome to (C, C).
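A small simulation illustrates the point. This sketch assumes the standard Axelrod point values (T = 5, R = 3, P = 1, S = 0, higher is better) and a fixed 10 rounds; it does not model the probability of meeting again, and the function names are hypothetical.

```python
# Points per round, keyed by (move_a, move_b); higher is better.
POINTS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        pts_a, pts_b = POINTS[(move_a, move_b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

Always-defect wins its head-to-head match against tit-for-tat, but two tit-for-tat players earn 30 points each while two defectors earn only 10, which is why reciprocity can come out ahead across many pairings.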

Payoff scale and units

Whether you measure payoffs in years in prison, dollars of profit, or abstract points changes the labels but not the ordering. The calculator works as long as the units are consistent for T, R, P, and S.

Payoff symmetry between players

The prisoners dilemma assumes both players face the same T, R, P, and S values. Asymmetric games (different payoffs for each player) need separate tables and are outside the scope of this calculator.

Communication and binding agreements

Allowing the players to talk before choosing can move the outcome toward CC, but only with a credible way to punish a defector. The matrix itself does not change; the supporting environment does.

• The calculator only models a two-player symmetric game with the four standard payoffs. Larger or asymmetric games need a different matrix and are out of scope here.
• The numbers are inputs, not estimates. Real-world payoffs are rarely exact, so treat the equilibrium as a directional signal.

According to Robert Axelrod's The Evolution of Cooperation, iterated prisoners dilemma tournaments show that tit-for-tat beats always-defect when the probability of meeting again is high enough.

The defection penalty is often reported as a percent change from CC to DD welfare. A percentage change calculator computes (DD welfare - CC welfare) / CC welfare × 100, the figure the prisoners dilemma analysis uses to show how large the welfare loss is.
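The percent-change formula can be sketched as follows, assuming a symmetric game so that CC welfare is 2R and DD welfare is 2P. The function name `welfare_percent_change` is hypothetical.

```python
def welfare_percent_change(R, P):
    """Percent change from CC welfare to DD welfare in a symmetric game."""
    cc_welfare = 2 * R
    dd_welfare = 2 * P
    if cc_welfare == 0:
        raise ValueError("CC welfare is zero, so the percent change is undefined")
    return (dd_welfare - cc_welfare) / cc_welfare * 100


print(welfare_percent_change(R=1, P=2))  # 100.0
```

With the classic values, mutual defection doubles the total years served, a 100 percent increase over mutual cooperation.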

Prisoners dilemma calculator showing a 2x2 payoff matrix with T, R, P, S inputs and the Nash equilibrium cell highlighted

Frequently Asked Questions

Q: What is the prisoners dilemma calculator?

A: It takes four payoff values (T, R, P, and S) and turns them into a 2x2 payoff matrix so you can read off each player's dominant strategy, the Nash equilibrium, the Pareto-optimal outcome, and the cooperation gain in one pass.

Q: How do you solve the prisoners dilemma payoff matrix?

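A: Fill in the four cells with the standard mapping, then hold the other player's move fixed and compare each player's two payoffs: Player 1 compares the two cells in each column, and Player 2 compares the two cells in each row. A move that wins both comparisons is that player's dominant strategy. The Nash equilibrium is the cell where both players are playing a best response.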

Q: What payoff values define a true prisoners dilemma?

A: A valid prisoners dilemma needs S > P > R > T when payoffs are years in prison, or T > R > P > S in utility. The classic story uses T = 0, R = 1, P = 2, S = 3, which is the default set here.

Q: Is cooperation ever rational in the prisoners dilemma?

A: In a single-shot game defection strictly dominates cooperation, so (C, C) is not a Nash equilibrium. Cooperation can be stable in the iterated version when the players expect to meet again and use a reciprocity strategy such as tit-for-tat.

Q: What is the difference between Nash equilibrium and Pareto optimal?

A: The Nash equilibrium is the cell where no player can do better by switching while the other holds their strategy fixed (the (D, D) cell in a valid prisoners dilemma). A Pareto-optimal cell is one where neither player can be made better off without making the other worse off; in a 2x2 prisoners dilemma the (C, C) cell Pareto-dominates (D, D), but the (C, D) and (D, C) cells are also Pareto-efficient, so the Pareto frontier is wider than (C, C) alone.

Q: Can the prisoners dilemma be played with more than two strategies?

A: The classic game uses only Cooperate and Defect. The same payoff ordering can be extended to multi-strategy games, but the calculator here is built for the two-strategy version because that is the one in most textbooks.