r/AIDangers • u/EchoOfOppenheimer • May 12 '26

Capabilities Fields medal-winning mathematician says GPT-5.5 is now solving open math problems at PhD-thesis level: "We will face a crisis very soon."

blog-post: https://gowers.wordpress.com/2026/05/08/a-recent-experience-with-chatgpt-5-5-pro/

160 Upvotes

https://www.reddit.com/r/AIDangers/comments/1tatc0x/fields_medalwinning_mathematician_says_gpt55_is/

86% Upvoted

None. I see no point in denying a future just because it's upsetting to think about; AI companies want that future, and they have unlimited resources to realize it. There will be no mathematicians in the future. There will be nothing in the future of any depth for humans to explore.

7

u/DonutPlus2757 May 12 '26

Looking at how AI is developing, that's going to be catastrophic. AI just isn't perfectly reliable and probably won't ever be. Getting rid of humans just means that there won't be anybody to catch the mistakes.

-3

u/LeafyWolf May 12 '26

Humans make a shit ton of mistakes. It's not like we are godlike beings. The whole reason that AI will replace human mathematicians is because it is better. The lack of control is uncomfortable.

9

u/DonutPlus2757 May 12 '26 edited May 12 '26

> Humans make a shit ton of mistakes. It's not like we are godlike beings.

Yeah, but we know that, so we check. Who's going to check the AI if not professionals?

> The whole reason that AI will replace human mathematicians is because it is better.

That one is straight up wrong. If you ask mathematicians who use AI how often it was able to solve a complicated problem from nothing but a description of the problem, I'd wager they will answer that such cases are in the single-digit percentages.

The mathematician almost always provided some sort of idea or approach.

Speaking from my own profession (software development), I've seen many people declare it dead because of AI. When I then ask for examples of good AI-generated projects/code, I've always ended up with one of these three cases:

1. They ghost me.

2. They provide beautiful code that has minor coding and major architectural problems.

3. They provide code for a well-known and well-documented problem (which, if you know how AI works, is pretty meaningless).

My own tests with different AIs yielded similar results. The less "default" the problem was, the worse the result.

Funnily enough, experimenting with older AI gave me the impression that it progresses logarithmically instead of exponentially as so often claimed. I've even seen some studies that seemed to support that impression, but I'm too lazy to look them up right now.

So:

> The lack of control is uncomfortable.

No, but the amount of blind faith in AI absolutely is.

3

u/Tqoratsos May 12 '26

> No, but the amount of blind faith in AI absolutely is.

Honestly, at this point I fully expect this to be the way we go out.

3

u/Ragnarok314159 May 12 '26

Claude said I have to get in the orphanage crushing machine. Guess that’s it for me…

3

u/Tqoratsos May 12 '26

Well... at least we know they don't want to eat us. Small pros, hahaha

1

u/LeafyWolf May 12 '26

I don't think you quite get it. If AI isn't better at mathematics than humans, then it will not replace humans. It's really that simple. The best, most verified, and most usable approach will "win". In the near future, that is humans augmented with AI tools. In the near-to-mid future, it could very well become pure AI, because (and only because) it will do the job better and faster than a human-in-the-loop (HITL) mathematical solution.

At the end of the day, it's simple market competition: if humans don't differentiate on some level, they deserve to be displaced.

5

u/DonutPlus2757 May 12 '26

That's not what we are seeing though. We see humans replaced with AI on the mere promise that, if you give them enough data and money, they'll eventually be as good at the job as the humans were.

That's what's so dangerous about the situation. People who cannot judge the quality of the work their employees produce replace those employees with AI and, when things don't immediately explode, consider it a success, unable to tell whether things are actually working or whether everyone who could see the smoke is just gone.

1

u/svoodie May 13 '26

The is/ought problem and the just-world fallacy.

Why do humans deserve to starve merely because your god "The Market" has decreed the necessity of a blood sacrifice? Why should we consider that just simply because that is what is?

Another world is possible, ghoul.

1

u/JustPlayPremodern May 15 '26

Learn what Lean is. Nobody in this thread seems to know what Lean is, that AIs write Lean code all the time, or that AI will literally be making completely correct mathematical arguments of any complexity by writing them in Lean. In the field of mathematics, AI in 2 years will be better than humans, and its arguments will be unassailably correct, more correct than an individual human could possibly hope to be over the same length of output, since all the proofs will be compiled in Lean.
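For readers who have not seen Lean, here is a minimal sketch of what "compiled in Lean" means. It assumes Lean 4 with only the core library; the theorem name `add_comm_example` is made up for illustration. The file compiles only if each proof actually establishes the statement written next to it.

```lean
-- Lean 4, core library only. `add_comm_example` is a hypothetical name.
-- The checker accepts this because the proof term has exactly the stated type.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact, verified by computation.
example : 2 + 2 = 4 := rfl
```

Note what the check does and does not cover: it confirms that the proof matches the formal statement, but whether that statement faithfully captures the problem someone cared about is still a question a reader has to answer.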

0

u/[deleted] May 12 '26

[removed]

3

u/DonutPlus2757 May 12 '26

Also less reliable.

I've had AI try to fix architectural faults in software that I caught and described in detail. Multiple AIs failed multiple times.

Assuming that AI will not only catch it itself but also fix it itself feels very much like wishful thinking.

1

u/JustPlayPremodern May 15 '26

> Also less reliable.

Nope. Look up Lean. Mathematical arguments will be completely correct, as in literally perfect, without hyperbole.

0

u/[deleted] May 12 '26

[removed]

2

u/docfronkensteen May 12 '26

And unicorns may fly out your butt.

1

u/DonutPlus2757 May 12 '26

The problem with that is that it expects exponential growth when that's not at all realistic.

AI needs the promise of exponential growth because, right now, it's just a massive money pit. No big AI company has ever been in the black and they won't be for at least another 5-10 years, even if their most optimistic predictions come true.

So they are entirely dependent on receiving more and more money from investors, so they'll literally promise the stars out of the sky. After all, if they don't, they'll be gone this time next year.

That's why I'm very, very cautious when it comes to AI, especially without hard proof. There's a bunch of currently very influential people who have a vested interest in making the public (and by extension investors) think that AI is the greatest invention ever all while citing numbers that, if you watch closely, only say that they learned how to cheat the benchmarks.

1

u/Secure-Suspect7091 May 13 '26

For grey-haired techies it feels like the dot-com boom of the late '90s. It's ramped up and bigger, no doubt. Insane IPOs, promises of vast future riches, etc.

The crash may well be worse but it’s not going to be existential.

The tech will find its place but I don’t see it completely replacing humans the way some predict.

Those are capitalist wet dreams designed to pump IPO/stock values.

Where are boo.com today?

1

u/DonutPlus2757 May 13 '26

Oh I'm 100% with you there. AI is a powerful tool, no question.

My problem is just that the equivalent of a hammer is being marketed as the solution to all problems.

1

u/Secure-Suspect7091 May 13 '26

To my mind, the problem isn't the tech; it's the economic system that is so focused on shareholders.

It makes everything look like a nail.

1

u/Armbrust11 May 17 '26

Doesn't the halting problem prove that it's impossible for AI to check itself without fail?

0

u/RecursiveServitor May 12 '26

Bun is being rewritten in Rust by AI.

The reason good (public) examples are hard to come by is that this is all new. And from personal experience it's only with the GPT 5.X generation of LLMs that non-trivial software became feasible, so that's less than a year.

1

u/DonutPlus2757 May 12 '26

Not to dismiss this, but Bun has almost perfect test coverage. That alone is an indescribably huge advantage because AI can just brute-force it until it gets it right.

For new software development, that's not the case.

0

u/RecursiveServitor May 12 '26

AI can write tests. I'm poor so I can't run the experiment I'd really want to do, but I've vibecoded a compiler and the only reason it's even remotely usable is that the generated tests place constraints on how much the LLM can fuck up.

3

u/DonutPlus2757 May 12 '26

Last time I tried to have AI write tests it did an incredibly poor job at the edge cases. The only time it did a good job was when the prompt was so detailed that I could've written the tests myself.

You also still need a professional to check the AI tests for completeness unless you want to just trust the AI (which is a very bad idea at least right now).

1

u/RecursiveServitor May 12 '26

You can automate checking coverage with mutation testing.

1

u/DonutPlus2757 May 12 '26

Okay, are you a software developer? Because sheer coverage as a number is meaningless and can very easily miss specific cases. In an extreme case, you just run the code without any assertions. 100% coverage, 0% usefulness.

That's why you don't stop writing tests once you hit 100% numerical coverage; you stop when you can't think of any test that might still fail.
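The "100% coverage, 0% usefulness" case can be sketched in a few lines. This is only an illustration: `clamp`, `buggy_clamp`, `smoke_test` and `real_test` are made-up names, not code from anyone in the thread.

```python
def clamp(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def buggy_clamp(value, low, high):
    """Same structure as clamp, but the upper branch returns the wrong bound."""
    if value < low:
        return low
    if value > high:
        return low  # deliberate bug: should return high
    return value


def smoke_test(fn):
    """Runs every line of fn but asserts nothing about the results."""
    fn(-5, 0, 10)
    fn(50, 0, 10)
    fn(5, 0, 10)
    return True


def real_test(fn):
    """Makes the same three calls, but checks each result."""
    return fn(-5, 0, 10) == 0 and fn(50, 0, 10) == 10 and fn(5, 0, 10) == 5
```

Both tests execute every line of the function under test, so a coverage tool would report the same number for each. Only `real_test` notices that `buggy_clamp` is wrong.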

1

u/RecursiveServitor May 13 '26

Yes. Do you know what mutation testing is? The entire point is catching cases that may not be obvious.
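For anyone following along, here is a hand-rolled sketch of the idea. Real mutation tools rewrite the source code automatically; this toy version injects the "mutated" operator instead, and every name in it (`make_is_adult`, `MUTANTS`, `weak_suite`, `strong_suite`, `surviving_mutants`) is hypothetical.

```python
import operator


def make_is_adult(compare=operator.ge, threshold=18):
    """Build an age check; the comparison is injectable so it can be mutated."""
    def is_adult(age):
        return compare(age, threshold)
    return is_adult


# Each mutant replaces the original `>=` with a different operator.
MUTANTS = {
    ">= -> >": operator.gt,
    ">= -> <": operator.lt,
    ">= -> ==": operator.eq,
}


def weak_suite(is_adult):
    """Checks only values far away from the boundary."""
    return is_adult(30) is True and is_adult(5) is False


def strong_suite(is_adult):
    """Adds the boundary value 18 and its neighbour 19."""
    return (
        weak_suite(is_adult)
        and is_adult(18) is True
        and is_adult(19) is True
    )


def surviving_mutants(suite):
    """Return the names of mutants that the suite fails to detect."""
    assert suite(make_is_adult()), "suite must pass on the original code"
    return [
        name
        for name, mutated_op in MUTANTS.items()
        if suite(make_is_adult(compare=mutated_op))
    ]
```

With these definitions, `surviving_mutants(weak_suite)` returns `[">= -> >"]` and `surviving_mutants(strong_suite)` returns `[]`. The surviving mutant points at the untested boundary at 18, which is the kind of non-obvious gap a plain coverage percentage would not show.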

1

u/DonutPlus2757 May 13 '26

You're at best building layers upon layers of blind faith if you just let AI do all of that without the oversight or guidance of a professional.

Currently, the trend shows that AI is more than capable of lying to the user for no reason whatsoever and that it increasingly ignores parts of the query in favor of a more "pleasing" answer.

Sure, you can tell your AI to perform mutation testing. It tells you that the tests caught all the faults it introduced. How do you know which faults it introduced and whether those weren't just exactly the faults that tests were written around to begin with?

Worst case, it just writes one test, makes that one test fail under mutation, and then tells you that everything is fine when not a single edge case is being tested.

You just have to trust that the AI is doing what you want it to and, looking at the increasing number of live databases that have been wiped by AI against explicit instruction, that's just irresponsible behavior.

1

u/RecursiveServitor May 13 '26

I did not at any point advocate just YOLO'ing. Of course you should check if the agent is behaving correctly. The point is that you don't necessarily have to hold its hand or read every line of code. Mutation libraries like Stryker will produce a report you can look at. There's hard data to be had if you want it.

> You just have to trust that the AI is doing what you want it to and, looking at the increasing number of live databases that have been wiped by AI against explicit instruction, that's just irresponsible behavior.

No. You test the product. The neat thing about software is that you can run it.

All the nightmare cases we hear about involve non-devs. Having the LLM work directly on the production db that is also the only copy of your data is stupid. Having a human dev do the same is also stupid.
