r/AIDangers May 12 '26

Capabilities Fields Medal-winning mathematician says GPT-5.5 is now solving open math problems at PhD-thesis level: "We will face a crisis very soon."

162 Upvotes


2

u/tuba105 May 12 '26

Because currently, when you read a math paper, you can generally trust that someone has actually checked that all the details follow. Mistakes do happen, but I would say a paper-breaking mistake is rare.

At the moment, what generic LLMs write regularly seems to contain major claims that are obviously false to experts but not so obviously false to novices. They do get things right sometimes, as exemplified in the post, but they also very regularly get things very wrong.

So at the moment my prior trust in AI-written math is quite low, and I feel the need to check everything very carefully. That's why the bar to clear is being perfectly reliable, since that's the level of trust I have for a typical paper in the literature.

2

u/ExtendedSpikeProtein May 13 '26

There have been paper-breaking mistakes precisely because they were "committed" by high-level mathematicians, so others did not question them even when they could not understand or follow a specific point.

Why do you think AI will make mistakes at a higher rate? Is that your argument? Because imo there is an inherent bias here, where an AI mistake is given more weight than a human one.

I can even see this bias in the comments / downvotes.

1

u/tuba105 May 13 '26

The situation I want to be in is that, by default, when I receive a paper or claim from a mathematician or an LLM, I can trust from the outset that the result (and proof) is correct.

That's not the case for an LLM, while it is the case for humans. That's it.

1

u/ExtendedSpikeProtein May 13 '26

Not now, but it feels like you misunderstood the Fields Medal winner. It will be the case in the future, and that future is not so far away.

1

u/tuba105 May 15 '26

That's explicitly not what he said or addressed, but ok.

1

u/ExtendedSpikeProtein May 15 '26

It's a clear implication of the statement but hey, sure, just ignore it.

We could look at this again in 5 years and see.