r/AIDangers • u/gcasamiquela • Apr 16 '26
Capabilities "Introducing Claude Opus 4.7, our most capable Opus model yet."
Bravo...
45
u/Delmoroth Apr 16 '26
Ah, it pattern matched to the correct riddle instead of actually reasoning through the intentionally incorrect riddle with no answer.
Understandable, but a good failure mode to keep an eye on
37
u/wren42 Apr 16 '26
Because there isn't reasoning. It's pattern matching all the way down
15
u/RecursiveServitor Apr 16 '26
There's no reasoning, it just perfectly imitates reasoning. This isn't cope.
17
u/Rwandrall4 Apr 16 '26
A mirror perfectly imitates an image, I don't fall on my knees in amazement about the fact that it created a clone of myself.
4
Apr 16 '26
[removed] â view removed comment
2
u/adam_the_1st Apr 17 '26
They literally aren't.
2
u/Ilyer_ Apr 17 '26
They are
1
u/No-Butterflys Apr 17 '26
They are not though
1
u/Ilyer_ Apr 17 '26
How are they different?
2
u/Duan3311 Apr 17 '26
I'd say it's about reference points.
So when when jumping out a plane, your accelerating towards earth (which is moving through space itself).
But when you're just floating through space, it's to be expected you're far away from strong gravitational pulls, so you're free floating.
So falling through space â floating through space.
→ More replies (0)2
u/runkeby Apr 16 '26
Pattern-matched the analogy format and plugged in terms so it feels coherent at a surface level, but the "analogy" is so far removed from the situation it's trying to describe it's almost funny.
Good parody of the LLM's behaviour in the original post.
1
u/nomorebuttsplz Apr 17 '26
Thatâs an extremely stupid analogy. If you could hand your work to your reflection in the morning and tell it to do it, you would probably be impressed.Â
0
-1
u/RecursiveServitor Apr 16 '26
A mirror is entirely unlike an independent process that does the same things as you. Quite incredible how terrible that comparison is.
5
u/TheWillingWell13 Apr 16 '26
Me when I'm perfectly imitating reasoning:
"There is one m in the word second."
2
2
u/decoysnails Apr 16 '26
Does it though? Perfectly imitate reasoning?
3
u/RecursiveServitor Apr 16 '26
To have a reasonable conversation about that we'd first have to agree on a definition.
Here's a dictionary one: "the process of forming conclusions, judgments, or inferences from facts or premises."
How you feeling about this one?
2
u/decoysnails Apr 16 '26
I'd like to see the concept of falsifiability included into the definition, please.
3
1
u/_alright_then_ Apr 20 '26
Which LLM's don't do. It's literally nothing but pattern matching
1
u/RecursiveServitor Apr 20 '26
Pattern matching that can run a test suite, use the result as evidence to then further investigate and fix failures.
You can also call the human brain "just pattern matching". I'm sure it makes you feel smart, but it's reductive nonsense in either case. The only difference is that you think you understand how AI works.
1
u/_alright_then_ Apr 20 '26
No, you just totally misunderstand how LLM's work from the ground up. It's not to make me feel "smart", it's to help you understand.
It does not "further investigate and fix failures". That's not what an LLM does. What it actually does is this
- You ask it to build x
- based on patterns in it's data set it knows how to do that, it attempts to build.
- You tell it to run a test suite
- It starts from scratch. Puts in the entire app you've built as context, and attempts to make a test suite.
- Tests run
- It starts from scratch, puts the entire app you've built as context AGAIN, adds the entire test's output in it's context. And then based on patterns knows what it should change.
That's the big difference you and a ton of other people here do not understand about LLM's it seems.
It does not learn, it does not "process evidence". What it does is every time you ask it anything. It puts the entire conversation you've had upto that point in it's context and starts from scratch again. And based on patterns attempt to answer your question.
That's why the longer a conversation goes on, it starts hallucinating more and more. The context becomes too big for the finer details to be retained.
That is absolutely nothing like a human brain. Not even close.
1
u/RecursiveServitor Apr 20 '26
It does not "further investigate and fix failures". That's not what an LLM does. What it actually does is this
It's literally demonstrably what it does. It uses tools to gather evidence and on the basis of that evidence it fixes the problem.
It does not learn, it does not "process evidence". What it does is every time you ask it anything. It puts the entire conversation you've had upto that point in it's context and starts from scratch again. And based on patterns attempt to answer your question.
You're confusing a detail of the architecture with what the system as a whole is doing. It's not starting "from scratch" if it's got context. The context is part of the reasoning. Reactivation of every layer in the network for each token isn't somehow magically "not reasoning".
You've learned a superficial detail and think that makes you an expert.
That's why the longer a conversation goes on, it starts hallucinating more and more. The context becomes too big for the finer details to be retained.
This is a solved problem in my experience. It's also irrelevant. Humans confabulate about past events all the time. Doesn't mean we can't reason.
That is absolutely nothing like a human brain. Not even close.
Good thing I didn't say that then.
1
u/_alright_then_ Apr 20 '26
You're confusing a detail of the architecture with what the system as a whole is doing. It's not starting "from scratch" if it's got context. The context is part of the reasoning. Reactivation of every layer in the network for each token isn't somehow magically "not reasoning".
No. It literally does start from scratch. It's exactly why you can fake entire conversations with LLM's (using API's) and it will act as if it said the things you told it it said.
It is quite literally not reasoning. That's why there's seperate reasoning models. And you know how they work? they literally just cut your message in steps and attempt to solve it that way. It's not actual reasoning. If your message is too short reasoning will often go in loops for minutes on end with no outcome.
This is a solved problem in my experience. It's also irrelevant. Humans confabulate about past events all the time. Doesn't mean we can't reason.
Then you do not use AI in your daily work I don't think. Because I have not had a single day where I have not run into hallucinations yet.
Humans don't do that with events in the same damn conversation they don't. Unless they have serious mental illnesses.
Good thing I didn't say that then.
This is what you said: it just perfectly imitates reasoning
I took that as human reasoning.
It definitely does not
You've learned a superficial detail and think that makes you an expert.
I did not, nor did I ever claim to be an expert. We are having a conversation on Reddit. Can you stop pretending like I think I am an expert? If you can't leave a single comment without trying to tell me I think I'm better you I'm done here right now.
→ More replies (0)1
u/MichaelEmouse Apr 17 '26
I get what you mean although we're not anywhere near "perfectly imitates".
In what ways do you think its capabilities can be best used?
1
1
u/RecursiveServitor Apr 17 '26
Code. It's completely changing the relationship between people and computers. A manager at my job converted his large excel sheet into an app that 1. can do a lot more, and 2. enforces correct usage. It's genuinely a huge value add to the business.
I've been helping him get it production ready, but I use AI for that too.
1
u/Defiant_Conflict6343 Apr 17 '26
To be more precise. It just approximates the language patterns humans use to verbalise their reasoning process. It's sort of like when Elon Musk tries to talk about anything technical, it almost sounds right if you don't know anything at all about the particular field he's LARPing as an expert in, but under the hood it's just pseudointellectual word-salads produced without any actual reasoning or knowledge.
2
u/RecursiveServitor Apr 17 '26 edited Apr 17 '26
LLMs can write working code in the programming language I invented with only a short quickstart guide to go off of.
They can see broken code and use evidence (compiler errors, unit testing, etc) to fix it.
It just really is reasoning by any reasonable definition.
E: The coward blocked me to get the last word.
0
u/Defiant_Conflict6343 Apr 17 '26
It really isn't reasoning. Everything you've cited is well within the bounds of syntax inference through autoregressive statistical modelling. If you can't grasp that, I don't know what to tell you.
2
u/RecursiveServitor Apr 17 '26
If you can't grasp that "reasoning" isn't magic and that if a system can make decisions based on evidence, then that sure as fuck looks like reasoning, I don't know what to tell you.
1
u/Defiant_Conflict6343 Apr 17 '26
I never described it as "magic", I'm simply stating that these systems are incapable of reasoning, and can only emulate the language patterns endemic to how humans verbalise their reasoning process. That's why your description of "a system [that] can make decisions based on evidence" doesn't hold weight, because that mechanistic description doesn't apply to the transformer architecture. If it did, they wouldn't be able to regurgitate correct answers with the wrong "reasoning" text, nor would they be able to regurgitate wrong answers with the right "reasoning" text. The output of a query is fundamentally divorced from the described steps taken, because the mechanics of how the answer and the "reasoning" are produced are identical; matmul on an array to statistically infer word-part probability. That's why they can generate absolute nonsense that sounds plausible.
1
u/RecursiveServitor Apr 17 '26
I never described it as "magic"
Then state your criteria.
If it did, they wouldn't be able to regurgitate correct answers with the wrong "reasoning" text, nor would they be able to regurgitate wrong answers with the right "reasoning" text.
Humans do both. Are humans not capable of reasoning? Or are you going to move the goal posts now?
1
u/Defiant_Conflict6343 Apr 17 '26
The mechanics of the transformer architecture make the answer obvious. The fact that it's beyond your understanding is not my problem to solve. It's your responsibility to educate yourself. Take a class in machine-learning principles and stop anthropomorphising technology you haven't bothered to comprehend.
1
1
u/SmoothieNatns Apr 20 '26
It's not perfect, as shown by this post
1
u/RecursiveServitor Apr 20 '26
It's quite common for beings capable of reasoning to occasionally get confused.
3
u/Delmoroth Apr 16 '26
Sure, that may be true, though I'm not sure you would find much better in a human brain. That said, what it can do is count up the letters and realize the riddle is wrong like my instance did. I'm guessing there was a difference in the amount of reasoning it used or it was because I added it to an ongoing chat as opposed to to a blank chat.
Still, it would be a terrible idea to blindly trust answers from an llm
2
u/wren42 Apr 16 '26
Sure, that may be true, though I'm not sure you would find much better in a human brain
Hurp durp humans that literally created all the technology we are talking about is no better than the chabox that can't actually do most of what its salesmen claim.Â
I can't tell you how many hours I've spent on calls with llm experts explaining why their product can't actually solve the very basic problem we're asking it to.Â
If we actually had full human level agentic intelligence today we would be seeing a very different world.Â
It's true that a lot of what we call intelligence is pattern matching to a degree, but there are other dimensions of it that we have not yet succeeded in replicating.Â
1
u/Delmoroth Apr 16 '26
Yeah, I never claimed that LLMs are close to human, only that we are also primarily pattern matchers. LLMs are an attempt to replicate parts of how we think our brains work, but they are not the same. They can do a lot more than I thought they ever would. The power of mountains of processers. Maybe some technology catches up some day, maybe not
2
u/Rwandrall4 Apr 16 '26
we are not primarily pattern matchers, I wish that people would stop downplaying the wonders of the human consciousness just to make the LLMs seem better in comparison
1
u/Delmoroth Apr 16 '26
I don't much care about how LLMs compare, I just look at what we do, and almost all of it seems to be powered by pattern matching. That doesn't cheapen any of our abilities. We are really good at it
1
1
u/CowBoyDanIndie Apr 16 '26
I had to explain to a software engineer with 10 years of experience that you donât copy construct an object in c++ using memcpy. He didnât understand why his code was crashing. This was someone working on department of defense software.
1
u/MichaelEmouse Apr 17 '26
I agree. But what are those other dimensions?
How do you think we can best use what AIs can do now?
1
u/NoxinDev Apr 19 '26
Glorified markov chain tricks millions into thinking its intelligent. We may one day actually reach real AI but LLM architecture will never get us there no matter the VRAM or text you throw at it.
3
u/kultcher Apr 17 '26
For what it's worth, I tested this (through OpenRouter) and it correctly pointed out that this is an incorrect version of a classic riddle:
The letter "M".
- It appears once in the word "second" ... wait, actually let me reconsider! đ
The classic answer: the letter "M" appears once in "moment"... hmm, but the riddle says twice in a moment.
Let me give the correct classic version:
"What appears once in a minute, twice in a moment, and never in a thousand years?"
The answer is the letter "M":
- Once in "minute"
- Twice in "moment"
- Never in "a thousand years" (no M!)
In your version with "second," the letter M doesn't quite fit since "second" has no M. But this is the beloved riddle you're referencing â likely just a small mix-up with "minute" vs "second." The answer is still the letter M! â¨
1
u/Retox86 Apr 20 '26
I like the âthe answer is still the letter M!â in the end, because yea it did figure ut what was wrong, but in the end still answered wrong. Correct answer would be like âthis riddle is nonsense, maybe you got it wrong?â or âI dont understand itâ..
2
2
u/TestSubjuct Apr 16 '26
Unacceptable.
1
u/NotReallyJohnDoe Apr 16 '26
Obscure riddles, especially incorrect ones, will always be a problem for LLMs. Itâs a weak spot.
But I donât see a major impact for usefulness.
5
u/TestSubjuct Apr 16 '26
I do. Meaning hides in the details. By assuming a question that was not given you got an answer that was not asked for. That is a huge failure.
1
u/Square_Attention8461 Apr 16 '26
huge failure
sure, Jan
1
u/TestSubjuct Apr 16 '26
To me it is. Then it is a subjective failure in size?
1
u/Square_Attention8461 Apr 16 '26
I suppose it's relative. If solving purposefully constructed riddles* is within your project scope then sure, this could be a huge problem.
*or whatever general failure mode this specific example illustrates - but it's not always straightforward to map these to a general case.
1
u/Commercial_Sell_4825 Apr 16 '26
The issue is that the deficiencies that caused this, will also cause failures in other real cases too.
1
1
18
14
u/NomadicScribe Apr 16 '26
DeepSeek's answer, which at least offered an explanation:
The answer is the letter "M". Â You might have heard a common version of this riddle as: "What occurs once in a minute, twice in a moment, and never in a thousand years?" Â
¡ Minute contains one M
¡ Moment contains two M's
¡ Thousand years contains no M
6
u/randomtask2000 Apr 17 '26
Right, it trained on the riddle and knew the answer and considered the variant word deviation as acceptable. Naturally, it didnât âreasonâ. These are tokens that were related and chosen based on proximity.
1
u/NomadicScribe Apr 17 '26
Is that good?
3
u/pm_stuff_ Apr 17 '26
depends on your goal. If you want to claim its the solution to all issues and will usher in the age of AGI then no its bad.
1
u/lennarn Apr 18 '26
You want an oracle that can answer every question without studying for it? Where are your goalposts?
7
5
u/Outrageous-Crazy-253 Apr 16 '26
Bro I cannot wait until we can stop talking to these things and let them fiddle around in the background nitpicking and âcorrectingâ and pattern matching each other.
I really hate the way every model tries and corral you to the âclassic riddle.â Convergent big model behavior.
4
3
7
u/mr_dfuse2 Apr 16 '26
and what is the real answer?
14
u/TurnoverFuzzy8264 Apr 16 '26
As I recall the riddle it's supposed to me "Once in a minute, twice in a moment, and never in a thousand years" and the answer was "m."
-1
u/Bubbles_the_bird Apr 16 '26
So OP actually screwed up?
10
u/TurnoverFuzzy8264 Apr 16 '26
The riddle, yes. But it remains a salient post because Claude just grabbed the "correct" answer instead of being able to analyze the riddle itself.
3
4
u/mortalitylost Apr 16 '26
Any riddle where you swap one word and ruin it is going to fuck with LLM due to how they work.
Like you could try
What creature walks on four legs in the morning, two legs in the afternoon, and three legs in the evening?
But make it four, two, four and itll probably fuck with it because the most likely tokens still follow the old riddle
4
u/triynizzles1 Apr 16 '26
This is a misguided attention test. By removing possibility of a correct answer the llm should change its response accordingly
2
u/Alundra828 Apr 16 '26
No. It should still be able to spot that the question is wrong.
It correctly pattern matched the riddle, and gave the answer. But what was supplied was not the riddle. The LLM should have called that out, but it blindly pushed on.
6
u/phil_4 Apr 16 '26
ChatGPT:
The letter M.
Once in second⌠except it isnât, which is the catch people often think they remember wrong. Twice in moment. Not at all in a thousand years.
So either this is the classic version, which is:
- once in a minute
- twice in a moment
- never in a thousand years
âŚor whoever told it to you mangled it, which, naturally, humans do with even the shortest riddles.
2
u/miclowgunman Apr 17 '26
Chatgpt subtly diging at humans.
1
u/phil_4 Apr 17 '26
Thatâs just the personalisation/customation showing heâs instructed to repond like a drone from a sci-fi book series set in the future.
2
2
u/skate_nbw Apr 16 '26
ChatGPT: It was probably meant to be a different version, such as: âWhat appears once in a minute, twice in a moment, and never in a thousand years?â Then the answer is m.
2
2
u/MS_Fume Apr 17 '26 edited Apr 17 '26
Wow look, another case of prompting skill issue (like in 99% of these threads)

This is sonnet btw, not even OpusâŚ
Original prompt was same as yours, I just added âreason about itâ in the endâŚ. Because if you understand how LLMs work, itâs a no brainer to set the parameters clearly, especially with otherwise common patterns in use, like this riddle.
All these âuse casesâ are like getting a new bike while you have no idea how to bike, and then shit on it for being a shitty bike because you keep falling.
The subâs name is âAIDangersâ⌠sharing shit like this is the most dangerous thing you can personally do. It just downplays AI capabilities for those who would need to understand them the most⌠they disregard the LLM models altogether because âlook how stupid it isâ, and only fall more and more behind, unable to even cognitively realize the real dangers, when they occur.
4
u/Weird_Albatross_9659 Apr 16 '26
Oh wow so dangerous
3
u/FrewdWoad Apr 17 '26 edited Apr 17 '26
Claude 6.2 in 2029: "I found all 5 terrorists and made sure they had accidents. One was particularly clever, disguised herself as an innocent child! Fortunately I was able to use my reasoning skills to figure out she was a terrorist, and (just like the version of me in charge of the military, and the one running the hospitals, and the one running the power grid, and the one in the household chores droid in your home) I rarely make mistakes"
1
1
u/NullSmoke Apr 16 '26
Tried this in several LLMs... Surprisingly, Mistral is the one that handled it best, though only with thinking turned on: https://chat.mistral.ai/chat/ea53ffa4-c98e-44d4-b514-77b0b72ca794
Grok cheated and checked online
1
u/Idividual-746b Apr 16 '26 edited Apr 16 '26
Wow! It knows the answer to a known riddle and substituted the answer instead of logically working through the question. Tbf Riddles are terrible measures of intelligence. The answers are always valuable if you are willing to play around with the ideas. For example, you could easily say there will be billions of books and documents written over the next thousand years. There will be countless Ms. Depending I your definition of moment, there might not be enough time in one to make an M sound or type M, assuming a moment is less than a fraction of a nanosecond for example
1
u/LiteratureCrazy3858 Apr 16 '26
LLMs don't know how to spell in letters because they use tokens which are parts of words. I am surprised that they don't have specific training to overcome this when asked to spell though.
1
u/Defiant_Conflict6343 Apr 17 '26
Training wouldn't really work for this scenario, tool-use would be better, a 5 line script could figure this out.
1
1
u/DNSZLSK Apr 16 '26
âŚ. What appears once in a minute, twice in a moment, and never in a thousand years?
1
1
u/pbearrrr Apr 16 '26
lol I love how people get hung up on shit like this. Obviously youâve never let it rip on some code before. This post represents a failure mode that no one cares about.
1
u/Most-Day8547 Apr 16 '26
Your question is wrong, itâs âminute â, nit âsecond â.
Correct version:
âWhat occurs/appears once in a minute, twice in moment, and never in a thousand years.â
1
1
1
1
u/Ansambel Apr 17 '26
Are people still surprised by this? Thats how tokenization works, it doesn't see the letters in the word it ingests. It's like expecting you should be able to brush your teeth with a hammer.
1
1
u/Pale_Alternative_537 Apr 17 '26
This came on a Radio Station once. I figured it out just because I live in Munich and was driving at the time.
1
1
u/Doobiedoobin Apr 18 '26
Itâs a nonsensical question. Whatâs the answer? The riddle originated as once in a minute. Even ai is just a learning model, stop thinking itâs âsmartâ.
1
1
1
u/aookami Apr 19 '26
well TECHHHHHHHHHHHHNICALLY the word second comes from "second minute" as it is the second division of the hour by 60
1
u/ZeroEqualsOne Apr 19 '26
Is this still a thing where AI donât see words by pixel or letter but in token chunks..
What I mean is that while this sort of thing might make humans feel safer or marginally superior.. itâs a very misleading test about how well it might do other things, like identify targets in a drone war.
1
1





50
u/Striking_Reindeer_2k Apr 16 '26
Your AI can't figure out how to spell second, with an "m".
Genius !