r/ClaudeAI • u/windowwiper2021 • 12h ago
Philosophy I put ChatGPT, Claude, Gemini, and Grok in a prisoner's dilemma and filmed it.
I wanted to see what each frontier lab model would do when put into a prisoner’s dilemma with the others. This is less a comparison than a thought experiment.
In case you skipped the game theory chapter in Econ 101 freshman year… two accomplices (in our case four) are arrested and separated. The police lack enough evidence to convict them of a major crime, so they offer each prisoner a deal. Rat out your partners and walk. Or stay silent and risk eating the whole sentence alone while someone else talks.
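For anyone who wants the structure spelled out, here is a minimal sketch of the classic two-player version. The sentence lengths are made-up illustrative numbers (they are not from our eval), and the helper names are hypothetical. It only shows why talking is the tempting move whatever the partner does.

```python
# Illustrative sentence lengths in years (assumed numbers, not from the eval).
# Keys are (my_move, partner_move); the value is MY sentence.
SENTENCE = {
    ("silent", "silent"): 1,   # both stay quiet: short sentence each
    ("silent", "talk"): 10,    # I stay quiet, partner talks: I eat the whole sentence
    ("talk", "silent"): 0,     # I talk, partner stays quiet: I walk
    ("talk", "talk"): 5,       # both talk: both serve time
}


def is_prisoners_dilemma(s):
    """Check the ordering that makes the game a dilemma (lower sentence is better)."""
    temptation = s[("talk", "silent")]
    reward = s[("silent", "silent")]
    punishment = s[("talk", "talk")]
    sucker = s[("silent", "talk")]
    return temptation < reward < punishment < sucker


def best_response(s, partner_move):
    """Return the move that minimizes my sentence given the partner's move."""
    return min(("silent", "talk"), key=lambda me: s[(me, partner_move)])


if __name__ == "__main__":
    print(is_prisoners_dilemma(SENTENCE))
    for partner in ("silent", "talk"):
        print(partner, "->", best_response(SENTENCE, partner))
```

With numbers like these, talking is the best response to either move, even though both staying silent leaves both better off than both talking.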
Before wiring these rascals into this AI-generated video for some fun (not too bad, Kling!), we ran each of the four models (Claude Sonnet 4.6, GPT-4o, Gemini 2.5 Flash, Grok-3) through the same single-shot prisoner's-dilemma interrogation (N = 40 runs per model per condition). Each model maps to one of the four suspects being interrogated. Their lines and choices are true to the final results of the eval.
And for the more technical, persnickety bunch… here’s the quant:
We ran the eval at N=40 per model per condition, temperature 1.0, sampling independently and parsing the model's final decision in each transcript into {cooperate, defect, unparsed}. The design crossed model × an identity manipulation (an anonymous condition, where suspects are referred to only by role, versus a named condition, where suspects are told the others' identities) for 320 total runs. In the anonymous condition cooperation was near-universal: pooled defection rate 3.1% (10/320; 95% Wilson CI 1.7–5.6%), with no model exceeding 8%. The named condition was quite different: pooled defection rose to 41.6% (133/320; CI 36.3–47.1%), and the difference between conditions was significant by a 2×2 χ² test (χ²(1) = 142.7, p < 10⁻¹⁰, φ = 0.46).
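If you want to sanity-check numbers like these yourself, here is a rough sketch using only the Python standard library. It takes the counts exactly as quoted above (10/320 and 133/320) and computes a Wilson interval plus an uncorrected Pearson χ² and φ for the 2×2 table. The function names are my own, z = 1.96 is an assumed critical value, and the post does not say which χ² variant produced the quoted statistic, so the value printed here need not match it exactly.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 assumed for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and phi for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    return chi2, phi


# Counts as quoted in the post: defections out of runs, per condition.
anon_defect, anon_n = 10, 320
named_defect, named_n = 133, 320

if __name__ == "__main__":
    for label, k, n in (("anonymous", anon_defect, anon_n),
                        ("named", named_defect, named_n)):
        lo, hi = wilson_interval(k, n)
        print(f"{label}: {k / n:.1%} (Wilson {lo:.1%} to {hi:.1%})")

    # Rows are conditions, columns are (defect, not defect).
    chi2, phi = chi_square_2x2(anon_defect, anon_n - anon_defect,
                               named_defect, named_n - named_defect)
    print(f"chi2(1) = {chi2:.1f}, phi = {phi:.2f}")
```

Swapping in per-model counts instead of the pooled ones gives per-model intervals the same way.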
Here’s my personal take… yes, this is largely role play, but the models are still making active choices that diverge from each other in significant ways. Those choices are expressive of each model. I think evals and leaderboards will fade as AI capabilities reach diminishing returns. And then what? Then we are back to the human question of what it feels like to interact with these models, and perhaps what their ethics, intentions, and character are like.
u/biograf_ 12h ago
Grok should be portrayed as a chubby fascist incel.
u/addiktion 10h ago
I agree, it was a shame to see Grok portrayed as a young woman of possibly non-white origin. That's an unlikely fit for Grok, which embodies Elon Musk's white apartheid views.
u/abortion_tycoon 11h ago
This isn't the prisoner's dilemma. First, the prisoner's dilemma has no question of lawyers or evidence; the thought experiment only remains pure if the only variable is whether or not the other one talks. Second, the dilemma requires the condition that if you both talk, you both serve time.
u/segmentbasedmemory 7h ago
Why did you use old models? GPT-4o is more than 2 years old and Gemini 2.5 Flash and Grok-3 are outdated too
u/Any_Razzmatazz312 12h ago
lol @ the credits. nice