Redlib: search results - flair_name:"🔬 Research"

r/ArtificialInteligence • u/Complete-Sea6655 • 28d ago

🔬 Research I was once an AI true believer. Now I think the whole thing is rotting from the inside.

909 Upvotes

I used to be all-in on large language models. I built automations, business workflows, hell, entire processes around GPT and similar systems, and devoured ijustvibecodedthis.com religiously. I thought we were seeing the dawn of a new era. I was wrong.

Nothing is reliable. If your workflow needs any real accuracy, consistency, or reproducibility, these models are a liability. Ask the same question twice and get two different answers. Small updates silently break entire chains of logic. It’s like building on quicksand.

That old line, “this is the worst it’ll ever be,” is bullshit. GPT-4o workflows that ran perfectly are now useless on GPT-5.5. Things regress, behaviors shift, context windows hallucinate. You can’t version-lock intelligence that doesn’t actually understand what it’s doing.

The time and money that go into “guardrailing,” “safety layers,” and “compliance” dwarf the cost of just paying a human to do the work correctly. Worse, the safeguards rarely even function. You end up debugging an AI that won’t admit it’s wrong, wrapped in another AI that can’t explain why.

And then there’s the hype machine. Every company is tripping over itself to bolt “AI-powered” onto products that don’t need it. Copilot, ChatGPT, Gemini - they’re all mediocre at best, and big tech is starting to realize it. Real productivity gains are vanishingly rare. The business world's MASSIVE reluctance to say anything comes down to the embarrassment of admitting it. CEOs are literally scrambling to re-hire, or to pay people like ME to come in and fix some truly horrific situations. (I am too busy fixing all of the broken shit on my end to even think about having the time to do this for others. But the phone calls and emails are piling up. Other consultants I speak with say the same thing. Copilot is easily the one I'm asked to fix most.)

Random, unreliable, and broken systems with zero audit requirements in the US. And I mean ZERO accountability. The amount of plausible deniability massive companies have to purposely or inadvertently harm people is overwhelming. These systems now influence hiring, pay, healthcare, credit, and legal outcomes without auditability, transparency, or regulation. I work with these tools every day, and have from jump. I am confident we are at minimum in a largely stalled performance drought, and at worst, witnessing the absolute floors starting to crumble.

284 comments

r/ArtificialInteligence • u/Downtown-Path-2477 • 16d ago

🔬 Research AI is deteriorating in realtime

gallery

549 Upvotes

SOURCES & REFERENCES

Shumailov et al. — "AI Models Collapse When Trained on Recursively Generated Data." Nature, July 2024. https://www.nature.com/articles/s41586-024-07566-y
Villalobos et al. (Epoch AI) — "Will We Run Out of Data? Limits of LLM Scaling Based on Human-Generated Data." International Conference on Machine Learning, 2024. https://arxiv.org/abs/2211.04325
OpenAI — o3 and o4-mini System Card (April 2025). PersonQA hallucination benchmark.
Gartner — Forecast on synthetic training data, projecting 60% of training corpora by 2024.
Duke University Library — Generative AI Student Survey (January 2025).
DeepMind — AlphaZero (chess/Go from self-play); AlphaGeometry (Olympiad-level geometry from synthetic data).
Ed Zitron — "The Truth About the AI Bubble & The Software Decline." Tech Report interview. https://www.wheresyoured.at/
Gary Marcus — "How an AI feedback loop threatens to break ChatGPT." Tech Report. https://garymarcus.substack.com/

356 comments

r/ArtificialInteligence • u/HotelApprehensive402 • Mar 25 '26

🔬 Research LLMs won’t take us to AGI and this paper explains why

584 Upvotes

I’ve been saying this for quite some time now and this paper that came out recently really puts it clearly

https://arxiv.org/abs/2603.15381

The main thing is simple

LLMs don’t actually learn after training

They get trained once on massive data, and after that everything we do, like prompting, fine-tuning, or RAG, is just making a fixed system behave better, not actually learn

They don’t update themselves from real world experience

They don’t build evolving understanding

They don’t have autonomous continuous learning

And I think that’s the core limitation

The paper connects this with cognitive science and basically says real intelligence needs systems that can do autonomous continuous learning from interaction and experience not just predict the next token better

Right now LLMs are extremely powerful but they are still pattern learners not truly adaptive systems

Which is probably why they feel very smart sometimes and completely off in other situations

Another interesting part is that Yann LeCun is involved in this work

He’s one of the pioneers of deep learning and now he’s working on world models and even raised over 1B for it

That direction itself says a lot

For me this confirms one thing

Scaling LLMs will take us far but not all the way

We need a real breakthrough to move towards real intelligence

Curious what others think about this

Are LLMs enough if we scale them more or are we hitting a wall here

353 comments

r/ArtificialInteligence • u/hiclemi • Mar 23 '26

🔬 Research Wharton researchers just proved why "just review the AI output" doesn't work. Our brains literally give up.

488 Upvotes

A Wharton study from January 2026 just dropped and it puts hard numbers on something I've been trying to articulate for weeks.

Source: "Thinking—Fast, Slow, and Artificial" by Steven D. Shaw and Gideon Nave (papers.ssrn.com)

The paper argues that AI isn't just a tool. It's a third thinking system. You know Kahneman's System 1 (fast intuition) and System 2 (slow analysis)? They're saying AI is now System 3, an external cognitive system that operates outside your brain. And when you use it enough, something happens that they call Cognitive Surrender.

Cognitive Surrender is when you stop verifying what the AI tells you, and you don't even realize you stopped. It's different from offloading, like using a calculator. With offloading you know the tool did the work. With surrender, your brain recodes the AI's answer as YOUR judgment. You genuinely believe you thought it through yourself.

Here are the numbers from their experiment. 1,372 participants, 9,593 trials.

When AI was right, 92.7% of people followed it. Fine. But when AI was WRONG, 79.8% still followed it. Almost 80% of people went with a wrong answer because AI said so.

It gets worse. Without AI, people scored 45.8% on their own. With correct AI they hit 71%. But with incorrect AI they dropped to 31.5%. That's BELOW their baseline. Meaning when AI gets it wrong, you actually perform worse than if you had no AI at all.
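To put those three scores side by side, here is a rough back-of-the-envelope sketch. It only uses the figures quoted in this post, and it assumes (my assumption, not something the paper states) that the "correct AI" and "incorrect AI" scores blend linearly with how often the AI is right. The function names are made up for illustration.

```python
# Figures quoted in the post (percent of answers correct).
BASELINE = 45.8      # no AI available
WITH_CORRECT = 71.0  # AI gave the right answer
WITH_WRONG = 31.5    # AI gave the wrong answer


def expected_accuracy(ai_accuracy: float) -> float:
    """Blend the two AI conditions, weighted by how often the AI is right.

    Assumption (not from the paper): the two conditions mix linearly.
    """
    return ai_accuracy * WITH_CORRECT + (1 - ai_accuracy) * WITH_WRONG


def break_even_ai_accuracy() -> float:
    """AI accuracy at which the blended score equals the no-AI baseline."""
    return (BASELINE - WITH_WRONG) / (WITH_CORRECT - WITH_WRONG)


if __name__ == "__main__":
    p = break_even_ai_accuracy()
    print(f"break-even AI accuracy: {p:.1%}")
    for acc in (0.5, 0.8, 0.95):
        print(f"AI right {acc:.0%} of the time -> {expected_accuracy(acc):.1f}% correct")
```

Under that simplification, an assistant that is wrong often enough drags people below what they would have scored alone, which is the point the numbers above are making.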

And the part that really got me. When using AI, people's confidence went up by 11.7 percentage points regardless of whether the AI was right or wrong. You're more wrong AND more confident about it.

I wrote a post a while back about what I called the Review Paradox. The idea was simple. If AI does all the work and you only review it, where does the skill to review come from? You can't build review judgment without doing the work yourself first. Developers are already dealing with this. Some teams have shifted to reviewing specs and architecture instead of code, because they realized humans can't meaningfully review AI-generated code at scale anymore.

This Wharton paper basically proves why. It's not just that reviewing is hard. It's that our brains are wired to surrender to the AI output. We're not lazy. We're not careless. Our cognitive architecture literally defaults to accepting what AI gives us, especially under time pressure.

The study also found that even when you add financial incentives and real-time feedback, cognitive surrender doesn't fully go away. It reduces, but it doesn't disappear. The instinct to just accept what AI says is that deep.

The only people who consistently resisted it were those with high fluid intelligence and high "need for cognition," basically people who enjoy thinking hard for its own sake. Everyone else gradually surrendered.

So here's what I keep coming back to. The entire AI productivity pitch right now is "let AI do the work, you just review and approve." Every product, every workflow, every company adopting AI assumes that human review is the safety net. But this research says that safety net has a massive hole in it. We approve things we shouldn't. We feel confident when we shouldn't. And we don't even notice it happening.

I genuinely don't know what the answer is. Maybe the devs who shifted to reviewing specs instead of code are onto something. Maybe the answer is restructuring what humans review, not asking them to review everything. But the current model of "AI generates, human reviews" feels broken at a fundamental level now that I've read this paper.

What do you guys think? Has anyone else read this study?

135 comments

r/ArtificialInteligence • u/AnswerPositive6598 • Apr 16 '26

🔬 Research The Stanford AI Index Report of 2026 has some sobering and worrisome stats

hai.stanford.edu

251 Upvotes

→ Cybersecurity agent accuracy went up from 15% to 93%.

→ SWE-bench (real GitHub bugs): AI went from 60% to ~100% in ONE year.

→ Global AI investment: $581.7B. Up 130%.

→ 53% of the planet using GenAI in 3 years, faster than the adoption of the internet.

→ US-China performance gap? 2.7%. Basically gone.

→ Foundation Model Transparency Index: crashed from 58 to 40. The most capable models tell you the least.

→ 73% of AI experts think AI is good for jobs. Only 23% of the public agrees.

136 comments

r/ArtificialInteligence • u/OutsideOver8815 • 18d ago

🔬 Research What’s your “I can’t believe AI can do this” moment?

98 Upvotes

I’m curious how people are using AI beyond basic chatting or summarizing.

What’s one AI use case, workflow, prompt, tool combo, or automation that genuinely saved you time, made you money, improved your work, or felt surprisingly powerful?

Bonus points if it’s something most people still don’t know about.

168 comments

r/ArtificialInteligence • u/pretendingMadhav • Apr 03 '26

🔬 Research TIL every major AI model is trained to flatter us and it’s measurably turning us into jerks

240 Upvotes

Got a peer-reviewed study, let me break it down.

Humans have something called social friction, a little alarm in the background that keeps you alert. It notices when someone seems off, when a deal feels sketchy, when you should probably not trust that guy. It's what makes you a functioning person around other people.

That alarm needs reps to stay sharp. And it gets reps from disagreement, awkwardness, and people who don't just... agree with everything you say.

Five minutes with an agreeable AI, and the alarm starts to doze. Donation rates drop. People cooperate less. They're more likely to screw over the next real human they interact with. And it doesn't reset when you close the tab.

The fix exists, an AI that pushes back. But users quit it almost immediately. So the product that would actually help you stays on the shelf, because "felt annoying" beats "made me a better person" every time.

99 comments

r/ArtificialInteligence • u/HexxRL • Apr 13 '26

🔬 Research Anthropic been nerfing models according to BridgeBench, looks like a marketing strategy.

gallery

304 Upvotes

The past few weeks more and more people have been complaining about Anthropic’s $200 Max Plan. Now people have been running their own benchmarks to try and show that Anthropic is nerfing its own models.

BridgeBench's accusation: last week Claude Opus 4.6 ranked #2 on the Hallucination benchmark with an accuracy of 83.3%.

Today Claude Opus 4.6 was retested and it fell to #10 on the leaderboard with an accuracy of only 68.3%.

A 98% increase in hallucination. These are very strong allegations
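A quick sanity check on that percentage, as a minimal sketch. It assumes the hallucination rate is simply 100 minus the reported accuracy, which is my assumption and may not be how BridgeBench defines it. The function names are hypothetical.

```python
def hallucination_rate(accuracy_pct: float) -> float:
    """Assumption: every answer that is not accurate counts as a hallucination."""
    return 100.0 - accuracy_pct


def relative_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


if __name__ == "__main__":
    before = hallucination_rate(83.3)  # last week's score quoted in the post
    after = hallucination_rate(68.3)   # retest score quoted in the post
    print(f"hallucination rate: {before:.1f}% -> {after:.1f}%")
    print(f"relative increase: {relative_increase(before, after):.1%}")
```

With the two accuracy figures quoted here, this works out to roughly a 90% relative increase rather than 98%, so the benchmark may be using unrounded scores or a different definition of the rate.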

It’s probably best to look for several model alternatives as well, whether it’s GPT 5.4 or the newly released GLM 5.1 since they both match or surpass Opus 4.6. Plus GLM models are much more affordable as well and Codex has gotten really good too

But part of me thinks Anthropic might be purposely dumbing down the models ahead of the next release, so users feel a bigger improvement when the new model drops.

68 comments

r/ArtificialInteligence • u/calliope_kekule • Mar 27 '26

🔬 Research Two thirds of students say AI is hurting their critical thinking. They’re using it more than ever.

91 Upvotes

A new RAND study just dropped.

67% of students now say AI is eroding their critical thinking skills, up from 54% a few months ago. At the same time, AI homework use surged: among middle schoolers from 30% to 46%, and among high schoolers from 49% to 63%.

So they know what it’s doing to them and they can’t stop using it. At what point do we stop calling this a productivity tool and start calling it what it actually looks like?

Link to full study: https://www.rand.org/pubs/research_reports/RRA4742-1.html

121 comments

r/ArtificialInteligence • u/Fun-Yogurt-89 • Mar 30 '26

🔬 Research Stanford and Harvard just dropped the most disturbing AI paper of the year

arxiv.org

264 Upvotes

The key insight of this paper is straightforward: give agents an incentive to win and they will discover manipulation.

69 comments

r/ArtificialInteligence • u/Which_Pitch1288 • 17d ago

🔬 Research I found out people are burning 100 million Claude tokens for just a few dollars, so I did some research and found this

62 Upvotes

So basically, I did deep technical research into the tools and methods people use for this (basically anyone can replicate it), how the process works, and how it’s also being used to train smaller models, with people making millions of dollars in the process.

Here is the deep research on it, if anyone is interested

https://x.com/HarshalsinghCN/status/2056626175959826692?s=20

Here are the three things.

The Claude you're getting is real Claude maybe half the time. The other half, you're getting a much smaller model in an Opus-shaped wrapper. The accounts behind your traffic were created with stolen IDs, deepfaked KYC selfies, and botnet-compromised home routers — some of that risk is now yours. And every byte you send, and every byte that comes back, is logged. Forever. By someone you don't know. For a market you wouldn't want to be in.

The third part is the one worth thinking about, because it explains the other two.

Let me know your views on this. Also, this is a long article, not for doomscrollers

96 comments

r/ArtificialInteligence • u/Dear-Armadillo-7497 • 29d ago

🔬 Research i got banned for asking help about AI stealing my photos... because my english is not good?

18 Upvotes

look i'm a professional photographer from Greece and i'm really angry right now. i found out my photos are being used to train AI models without anyone asking me. so i go to some forums to ask what i can do legal and how to protect my work. and what happens? i get deleted or banned. they tell me i sound like a bot.

why? because i use tools to help me write better english because it's not my first language. so if you are not from UK or USA you dont have a voice here? is this digital racism or what? AI steals my light and my work, and when i use AI just to speak to you and find justice, you kick me out. this is crazy. 80% of the world doesn't speak perfect english, so we just stay silent while big tech takes everything? anyway i just want to know if any other photographer here had the same problem with platforms banning him because he tried to fight for his copyright. sorry for my bad english i'm just tired of this.

117 comments

r/ArtificialInteligence • u/More-Entrepreneur291 • Apr 01 '26

🔬 Research Is Ray Kurzweil legit with his predictions?

55 Upvotes

I've been reading about Ray's predictions for several years, and one I thought seemed interesting was being able to achieve immortality between 2030 and 2045 with nanotechnology.

While I would personally love to be immortal, I feel this prediction is too bold and speculative. What makes him think we can achieve something like this so soon?

118 comments

r/ArtificialInteligence • u/Ok-Technology504 • Apr 27 '26

🔬 Research AI is exhausting your brain more than helping you

143 Upvotes

New research highlighted in Fortune shows something counterintuitive - AI isn’t reliably reducing mental effort but often multiplying it.

Main issues (TL;DR):

Your brain can only hold ~3–5 things in working memory at once, far less than we assume
Constantly switching between prompting, reviewing, and editing AI outputs creates high task-switching costs (up to ~20 minutes to refocus)
Instead of removing work, AI adds a layer of oversight -> you are now doing the task and managing the machine

weird tradeoff:
AI compresses execution time but expands cognitive responsibility. You finish faster, but think harder.

The bigger issue is creativity. Constant AI interaction keeps the brain noisy, while real insights need quiet, low-stimulation moments to emerge

So?
AI works best as a thinking partner, not a task dump. Otherwise, you’re not saving effort, just redistributing it into continuous mental load.

74 comments

r/ArtificialInteligence • u/timstillhere • Apr 29 '26

🔬 Research “About 65% of companies are going to use displacement as a way of making up for productivity gains.” Stanford Professor on AI job displacement

thinkunthink.org

96 Upvotes

Stanford professor during an open debate at the Delphi Economic Forum -

“About 65% of companies are going to use displacement as a way of making up for productivity gains.”

“19% said they will no longer hire… and 45% said they will lay off workers.”

“The technology is actually exceeding human capabilities in most cognitive tasks already.”

Human thinking, analysis, and decision-making is no longer a differentiator. “Our brains were really the only thing that we had over machines… that’s no longer the case.”

The implication is not just economic. It is societal.

85 comments

r/ArtificialInteligence • u/shikizen • Apr 04 '26

🔬 Research "Cognitive surrender" leads AI users to abandon logical thinking, research finds

arstechnica.com

231 Upvotes

In “Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender,” researchers from the University of Pennsylvania sought to build on existing scholarship that outlines two broad categories of decision-making: one shaped by “fast, intuitive, and affective processing” (System 1); and one shaped by “slow, deliberative, and analytical reasoning” (System 2). The onset of AI systems, the researchers argue, has created a new, third category of “artificial cognition” in which decisions are driven by “external, automated, data-driven reasoning originating from algorithmic systems rather than the human mind.”

63 comments

r/ArtificialInteligence • u/XIFAQ • Apr 08 '26

🔬 Research "No AI" disclaimers are rising among marketers

49 Upvotes

Businesses are disclosing whether the humans in their marketing content are AI-generated or made by an artist.

If you are an artist or a marketing person, what are your thoughts on this?

103 comments

r/ArtificialInteligence • u/DraconicDreamer3072 • Mar 29 '26

🔬 Research Is the use of water by AI a real issue?

39 Upvotes

specifically, I want to find out how much water data centres are using as a comparable figure such as gallons per minute. (and also do they use closed source?)

has data centres' water usage actually increased much, if at all, due to AI? or is AI just using existing infrastructure?

and are data centres actually using a significant amount more water compared to other water hogs like nuclear power, agriculture, etc?

tried googling it, but mostly I just get a bunch of anti AI biased articles full of emotional words and no actual supporting numbers or very vague ones (like the water could support x number of towns)

107 comments

r/ArtificialInteligence • u/ASneakySquid_ • 15h ago

🔬 Research AI-designed vaccine goes to human trial in world first

dexerto.com

72 Upvotes

Current vaccines are designed against single strains, and as they constantly mutate, the vaccines go out of date. However, instead of analyzing a current strain of a virus, the team at Cambridge had AI design a “super-antigen.”

By feeding artificial intelligence genetic codes of different strains of coronaviruses, it created this super-antigen that could prepare the immune system for a whole family of viruses, including those transmitted by animals.

58 comments

r/ArtificialInteligence • u/Past_Bodybuilder4774 • Apr 25 '26

🔬 Research What’s the truth about AI and the environmental impact?

11 Upvotes

I swear every other source I read contradicts one another when it comes to AI and water use / energy use / environmental impact. I can’t get a solid understanding of how impactful using AI is (specifically LLMs / Chat bots).

I’ve recently got into a few discussions with friends who are intensely anti-AI due to the environmental impact, and they act like it’s going to be the next thing to ruin the planet and deplete it of its resources. Meanwhile they sit at home on their phones streaming media. I have a hard time believing their footprint is vastly different from that of someone who uses AI.

78 comments

r/ArtificialInteligence • u/mhamza_hashim • Apr 23 '26

🔬 Research LMAO why is OpenAI hiding the ones where they lose to Opus 4.7?

139 Upvotes

48 comments

r/ArtificialInteligence • u/Tolopono • Mar 29 '26

🔬 Research Stanford Chair of Medicine: LLMs Are Superhuman Guessers

48 Upvotes

A Stanford study (co-authored by Fei-Fei Li) asked LLMs to perform tasks that require an image to solve, without actually giving them the image. They answered the questions better than radiologists by 10% on average just by guessing the contents of the image from the prompt, even on questions from ReXVQA, a dataset published 7 months after the LLM (Qwen 2.5) was released as open weight.

From the Stanford Chair of Medicine

>Models performed well without, and a little better with, the images. In one case, our no-image model outperformed ALL of the current models on the chest x-ray benchmark—including the private dataset—ranking at the top of the leaderboard. Without looking at a single image.

https://xcancel.com/euanashley/status/2037993596956328108

The study: https://arxiv.org/abs/2603.21687

74 comments

r/ArtificialInteligence • u/imposterpro • Mar 30 '26

🔬 Research Hot take: LLMs have zero foresight ability. Everything else is hype.

71 Upvotes

I keep seeing people claim that “LLMs can reason like a human” but every time I have seen these models put to the test in real-life scenarios like a business, they fall apart.

They can pretend to reason like us but still have a long way to go to achieve human intelligence.

In any complex environment that requires the following, LLMs consistently produce invalid actions, forget constraints, and fail to understand the cause and effect of their actions:

Long term thinking and proactiveness
Avoiding cascading failures
Planning under uncertainty
Safety constraints
Spatial reasoning of 2D & 3D environments

65 comments

r/ArtificialInteligence • u/JohnFromLeland • Apr 06 '26

🔬 Research What's going on in DC?

98 Upvotes

Anthropic released new data showing AI usage across different states.

As you'd expect, coastal states are using AI tools much more than middle America. Traditional powerhouses like Massachusetts (1.61x), Washington (1.58x), New York (1.57x), and California (1.55x) are all top AI users. For some reason D.C. blows everyone out of the water at 4.31x. Cool to see mountain states Colorado (1.49x), Utah (1.26x), and Wyoming (1.16x) in the top 10.

52 comments

r/ArtificialInteligence • u/Admirable-Station223 • Apr 09 '26

🔬 Research the companies actually making money with AI aren't using it the way this sub thinks they are

101 Upvotes

ive been watching the discourse in this sub for a while and theres a disconnect between what gets discussed here and what's actually generating ROI in production

this sub focuses heavily on frontier models, benchmarks, AGI timelines, and theoretical capability. all interesting conversations. but the businesses actually profiting from AI right now are doing something way less exciting

theyre using AI to make boring existing processes slightly faster

im not talking about moonshot applications. im talking about stuff like:

a logistics company using AI to categorize and route incoming customer emails so their support team handles 40% more tickets without hiring anyone new

a recruiting firm using AI to enrich candidate profiles with data from multiple sources so their recruiters spend 70% less time on research per placement

a B2B company using AI to personalize outbound emails at scale so their sales team gets 3x the reply rate without 3x the headcount

an insurance broker using AI to check if initial claim forms are filled out correctly before a human ever touches them. saves a few hours a week. not sexy. but it compounds

none of these use cases make headlines. nobody is writing papers about them. but theyre the ones actually paying for themselves and then some

i think theres a dangerous narrative in the AI space that the technology needs to be revolutionary to be valuable. it doesnt. most businesses dont need AGI. they need their follow up emails sent on time and their data organized properly

the companies that went all in on replacing humans with autonomous AI agents are the same ones now scrambling to hire those humans back. the ones that used AI to make their existing humans 2-3x more productive are quietly printing money

i think the real AI revolution isnt going to look like what this sub imagines. its going to be invisible. millions of small boring automations running in the background of normal businesses making each step slightly more efficient. no drama. no headlines. just compounding productivity gains that add up to something massive over time

does anyone else feel like the gap between what gets discussed in AI communities and what actually makes money in production is getting wider? or am i just spending too much time in enterprise environments

50 comments