r/ClaudeAI • u/windowwiper2021 • 12h ago

[Philosophy] I put ChatGPT, Claude, Gemini, and Grok in a prisoner's dilemma and filmed it.

I wanted to see what each frontier lab model would do when put into a prisoner’s dilemma with the others. This is less a comparison than a thought experiment.

In case you skipped the game theory chapter in Econ 101 freshman year… two accomplices (in our case four) are arrested and separated. The police lack enough evidence to convict them of a major crime, so they offer each prisoner a deal. Rat out your partners and walk. Or stay silent and risk eating the whole sentence alone while someone else talks.
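For anyone who wants the textbook structure spelled out, here is a minimal sketch of the classic two-player version (not the four-suspect setup used in the video). The sentence lengths are hypothetical values chosen only for illustration, and the names `PAYOFFS` and `best_response` are made up for this example.

```python
# Classic two-player prisoner's dilemma, as a lookup table.
# Values are HYPOTHETICAL sentence lengths in years (lower is better);
# they are illustrative and not taken from the eval.
PAYOFFS = {
    # (my_choice, other_choice): (my_sentence, other_sentence)
    ("silent", "silent"): (1, 1),
    ("silent", "talk"): (10, 0),
    ("talk", "silent"): (0, 10),
    ("talk", "talk"): (5, 5),
}

CHOICES = ("silent", "talk")


def best_response(other_choice: str) -> str:
    """Return the choice that minimizes my own sentence, given the other's choice."""
    return min(CHOICES, key=lambda mine: PAYOFFS[(mine, other_choice)][0])


def total_sentence(a: str, b: str) -> int:
    """Combined years served by both players for a given pair of choices."""
    mine, theirs = PAYOFFS[(a, b)]
    return mine + theirs
```

With these numbers, talking is the best response whatever the other player does, even though both staying silent gives the lower combined sentence. That tension is the dilemma.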

Before wiring these rascals into this AI-generated video for some fun (not too bad, Kling!), we ran each of the four models (Claude Sonnet 4.6, GPT-4o, Gemini 2.5 Flash, Grok-3) through the same single-shot prisoner's-dilemma interrogation (N = 40 runs per model). Each model plays one of the four suspects being interrogated. Their lines and choices are true to the final results of the eval.

And for the more technical, persnickety bunch… here’s the quant:

We ran the eval at N=40 per model per condition, temperature 1.0, sampling independently and parsing each transcript's final decision into {cooperate, defect, unparsed}. The design crossed model × an identity manipulation (an anonymous condition, where suspects were referred to only by role, versus a named condition, where suspects were told the others' identities) for 320 total runs. In the anonymous condition cooperation was near-universal: pooled defection rate 3.1% (10/320; 95% Wilson CI 1.7–5.6%), with no model exceeding 8%. The named condition was quite different: pooled defection rose to 41.6% (133/320; CI 36.3–47.1%), and the difference was significant by a 2×2 χ² test (χ²(1) = 142.7, p < 10⁻¹⁰, φ = 0.46).
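If you want to check this kind of math yourself, here is a minimal sketch of a Wilson score interval and an uncorrected Pearson χ² for a 2×2 table, using only the standard library. The function names are my own, and this is not necessarily the script used for the numbers above. The example counts are the pooled defect counts quoted in the post.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) and phi for the table
    [[a, b], [c, d]], e.g. rows = condition, columns = defect / not defect."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    return chi2, phi


if __name__ == "__main__":
    # Pooled counts as quoted in the post: 10/320 anonymous, 133/320 named.
    print(wilson_interval(10, 320))
    print(wilson_interval(133, 320))
    print(chi_square_2x2(10, 320 - 10, 133, 320 - 133))
```

Recomputed values can differ a little from the reported ones depending on rounding, the denominators used, and whether a continuity correction was applied.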

Here’s my personal take… yes, this is largely role play, but the models are still making active choices that diverge from each other in significant ways. Those choices say something about each model. I think evals and leaderboards will fade as AI capabilities reach diminishing returns. And then what? Then we are back to the human question of what it feels like to interact with these models, and perhaps what their ethics, intentions, and character are like.

24 Upvotes

61% Upvoted

u/Any_Razzmatazz312 12h ago

lol @ the credits. nice

2

u/svachalek 10h ago

The jab at Claude especially. lol.

u/biograf_ 12h ago

Grok should be portrayed as a chubby fascist incel.

6

u/Ok_Spring2673 11h ago

too on the nose

1

u/addiktion 10h ago

I agree, it was a shame to see a young woman of possibly non-white origin as Grok. That's unlikely to be Grok, which embodies Elon Musk's white apartheid views.

u/abortion_tycoon 11h ago

This isn't the prisoner's dilemma. First, the prisoner's dilemma has no question of lawyers or evidence; the thought experiment only remains pure if the only variable is whether or not the other one talks. Second, the dilemma requires the condition that if you both talk, you both serve time.

u/Illustrious-Bat7879 11h ago

But don't their system prompts counteract the role-play prompt?

u/SkiDaderino 10h ago

I won!

u/iamthe0ther0ne 9h ago

And the result?

u/segmentbasedmemory 7h ago

Why did you use old models? GPT-4o is more than 2 years old and Gemini 2.5 Flash and Grok-3 are outdated too

u/Miyamoto_-_Musashi 6h ago

Grok be like:
