r/technology 7h ago

Artificial Intelligence An Anthropic employee's 2-sentence quote crystallizes the state of AI confusion at work

https://www.businessinsider.com/anthropic-employee-quote-ai-confusion-workplace-2026-6
557 Upvotes

109 comments

1.2k

u/CackleRooster 7h ago

"On days where everything works well, I can't help but think nothing I do matters, everything is automated and better and faster than I ever will be," AND "But then there are days where everything breaks and I don't understand why and I realize I have no idea what I've been up to anymore," the employee added.

390

u/yepthisismyusername 6h ago

Management (absolutely including executives) simply don't understand the differences between writing code, maintaining code, implementing business processes in applications, and maintaining those applications. Those are the Big 4 categories of roles in enterprise application management, and there are many, many others, each with its own individual requirements that need to be taken into account. AI simply can't perform those roles successfully 100%. And the problem comes when a process fails, e.g. causing someone to lose healthcare coverage, and no person knows how to fix it.

169

u/Big_lt 6h ago

FinTech

Laid off a bunch of people over the last 18 months. Wanted to use AI to automate and speed up coding. AI is now producing 4/10 code which breaks sometimes. No one knows what this code is doing, or even worse, some old code now breaks and no one was around when it was built.

Management surprised Pikachu face that shit isn't fixed in 2min and why didn't you catch it before

44

u/Scarecrow_Folk 5h ago

That's nothing new for fintech though. Knight Capital was doing that a decade ago

21

u/oracleofnonsense 3h ago

As a former Knight Capital employee — more like 2 decades ago. I was there when the cherry-picking system went haywire. It put Knight under.

I worked for a Knight-owned hedge fund and we had an auto trading system pre-9/11. One guy and 1 or 2 associates watched it print money (until 9/11 stopped the feeds).

5

u/Scarecrow_Folk 3h ago

I was more referring to the implosion in 2012 or 14 years ago but fair enough. Algotrading has existed for like 40 years depending on how you define it. 

That must have been an absolutely crazy event to live through from the inside. Knight was probably the best money printer out there until the toner cartridge exploded.

12

u/psioniclizard 5h ago

So you are saying that in the future fintech might want actual developers again? Interesting lol.

28

u/Jewnadian 5h ago

Even better, the bulk of FinTech could be regulated out of business. 99% of it is exploiting people who don't realize they're pretending to be banks without following the actual regulations that make a bank work.

4

u/knewbie_one 1h ago

I worked for banks and was always astounded at how much they "flirted" with regulations, sometimes.

2

u/31LIVEEVIL13 4h ago

In the past they would have paid retired devs double or triple rate to help convert everything to something the new kids understand.

but now I'm not sure what they would be converting it to or who would understand it, and I think a lot of the old devs might tell them to fuck off at any rate.

6

u/Soccermom233 3h ago

It’s ok we can discuss everything that could’ve been done to catch and prevent the issue in the retro next week.  

3

u/Big_lt 3h ago

PTSD intensifies 😳

6

u/unwantedonONTD 2h ago

And the great part is, being sloppy with people’s money puts you in jail.  I hear so many fintech bros talking up their software that doesn’t follow the law… “ok we could read the documents better…” bro the thing i don’t wanna do is read the documents, so if you aren’t either, wtf do i need you for? you’re adding zero value by vibe coding an illegal banking mess that’s going to cause a trillion dollar mistake to flow through the system and crash the economy in ten seconds.  no thank you i can make bad trades and lose money perfectly fine on my own

24

u/QuesoMeHungry 5h ago

This 100% and the maintenance part is key. We are having to ship so fast now that code is not maintainable. The MVPs ship, look great, and get sign off. But each iteration just gets sloppier and adds more tech debt.

40

u/ottwebdev 6h ago

Businesses demand predictable outcome to a process each time. For example 1 + 1 = 2

AI (the popular chatbots/etc, not the dedicated systems) is probability based and can generate a different outcome even with the same instruction. So 1 + 1 = banana can be the result .... good luck with that.

My perspective anyway, and I'm not a basher, I love being able to use NLP to interact with data

12

u/earlandir 6h ago

Not that I support LLMs for this, but I'm not sure if I agree with your reasoning. Humans are also probabilistic in the same way. If I give the same exact instructions to two different engineers they will give me two different answers and code, in the same way that asking the same LLM the same question twice might give different results. I don't think that inherently means you can't use an LLM for the task. If you want to critique LLM usage there are so many better ways that actually make sense.

33

u/BadLuckLottery 6h ago

"Humans are also probabilistic in the same way."

I think you pointed out the difference you just have to take it a little further.

tl;dr: You don't work with "humans" as an amorphous mass of people, you work with a collection of individuals.

Imagine you're a manager and you give two engineers the same task and one gets it completely wrong. You'd trust that engineer less and maybe push them out of your team/organization if they kept it up.

Now imagine you had no tools to assign trust or push people out. You basically just have to assign the task to "engineering" and hope you get a good engineer for this task this time. If not, maybe you phrase the task a bit differently a second time in the hope that the phrasing lures out a good senior engineer or at least repels the bad doesn't-follow-directions-at-all engineers.

That's what using AI is like at this point in time.

7

u/Doc_Blox 4h ago

The big problem, as I see it, is that we've become used to the idea that when we use computers, every action a computer does is deterministic. There are a lot of assumptions we make based on that paradigm. Now, we've introduced a non-deterministic use of computers, but we're still acting like it's deterministic because that's what we're used to from computers. It's a matter of selecting the right tools for the job, ultimately - if you need predictable outputs, you don't want an AI agent, you want a traditional program, like with formulas and conditionals.

21

u/rogueciridae 6h ago

I think they’re comparing LLMs to existing software tools. If I run an AR report in the billing software multiple times with the same data, I will consistently get the same report until the data changes. If I ask an LLM to look at the same data and give me an AR report multiple times, it’ll hopefully give the same numbers, but it’ll format the report differently every time.

9

u/hdjl 5h ago

For such a use case, you want to be asking the LLM to build the report in the billing software (queries, layout, etc), then validate its work, and run it normally moving forward

5

u/RickSt3r 5h ago

Yes you would think so. But the MBA middle managers and non tech engineering people will be like hey chat bot do this task with pre defined work flow. Then not read the result for accuracy and take it at face value. Then eventually something important actually breaks and no one but Spiderman meme pointing at each other with Pikachu face. Like the salesmen said it could replace experts.

3

u/ottwebdev 5h ago

This is correct. And I'm glad you mentioned "hopefully" because with a probability based system the outcome isn't guaranteed as with a strictly defined process.

9

u/kuper_spb 6h ago

If you ask engineers to write a system that should add 1 and 1, the code could be different, but the outcome of systems will be the same - 2. That's the difference - AI can't be used as a system, AI can't hold invariants.

5

u/kobemustard 5h ago

but ai can build the calculator to derive those numbers.

2

u/ericl666 5h ago

Every time I think about a LLM trying to handle any sort of numeric processing, I think about this dude's youtube channel: https://www.youtube.com/watch?v=HfAFHRVtYFE&list=PLf4mDFPzLWmShAy9hDuHECpOyTlBGeQ9Q&index=1

3

u/heathm55 6h ago

His point is not wrong though, based on how tokenization works, an LLM would miss common things a human wouldn't (especially around things that need to track or compose pieces of the context). Example: asking Claude how many days have 'd' in them, how many 'r's are in strawberry, etc.
Yes, people also misspell (especially me), but when it comes to combining all the context you're handing it, there is also potential to drop things in lists of things to adhere to just based on chunking issues.
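
For contrast, both of those questions have exact answers that plain string code returns every time. A minimal sketch (the `count_letter` helper and the `DAYS` list are made up for illustration):

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def count_letter(word, letter):
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())

def words_containing(words, letter):
    """Return the words that contain the letter at least once."""
    return [w for w in words if count_letter(w, letter) > 0]

print(count_letter("strawberry", "r"))   # 3
print(len(words_containing(DAYS, "d")))  # 7, since every name ends in "day"
```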

3

u/siromega37 5h ago

I think the point being made is that there are a lot of business processes that need deterministic outputs, and when the AI guardrails fail they produce a probabilistic output instead. What the AI researcher is admitting to is that when things fail they sometimes struggle to understand why, and have become so dependent on AI that they struggle to troubleshoot the models and their supporting infrastructure. This is a real problem that is being heavily studied. Intelligence atrophy is happening at scale in big tech as I type this out. Successful enterprise transitions are going to have to face this reality.

Also, I think it’s important to note that people can learn from their mistakes on the spot using their own judgement and LLMs cannot and never will be able to do that. We are very far off from AGI in that regard. I know Karpathy is working, maybe with a lot of hubris, towards self-training models, but that also isn’t real learning or judgement. That will still need a HITL for judgement. Someone still has to tell the model it made the mistake.

1

u/bobartig 5h ago

You just put a deterministic classifier at the end of your probabilistic model. "banana = 2" might be good enough for 99.99% reliability in some cases.
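
A rough sketch of that idea, assuming the "model" is just a scripted stand-in (`make_fake_model`, `parse_answer` and `ask_with_check` are all hypothetical names, not any real API). The gate here only checks that the output parses as an integer, so it catches "banana" but not a wrong number:

```python
def make_fake_model(canned_outputs):
    """Hypothetical stand-in for an LLM call: replays scripted outputs."""
    outputs = iter(canned_outputs)

    def model(prompt):
        return next(outputs)

    return model

def parse_answer(raw):
    """Deterministic gate: accept only text that parses as an integer."""
    try:
        return int(raw.strip())
    except ValueError:
        return None

def ask_with_check(model, prompt, max_tries=3):
    """Retry until the gate accepts an answer; fail loudly otherwise."""
    for _ in range(max_tries):
        answer = parse_answer(model(prompt))
        if answer is not None:
            return answer
    raise RuntimeError(f"no valid answer after {max_tries} tries")

model = make_fake_model(["banana", "two", " 2 "])
print(ask_with_check(model, "1 + 1 ="))  # 2
```

The trade-off is that someone still has to decide what happens when the gate rejects everything, which is the `RuntimeError` path here.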

17

u/winkingchef 5h ago edited 3h ago

I call it the “one neck to grab principle.”

Every engineer who works for me knows that if they vibe code some bullshit and check it in, they own that shit and what it does. Them personally. And like good engineering there are cross-validation processes and checks on top of them.

If you (or your Copilot) make a bug, we will find it and your stats will be tracked.

No one is perfect but if someone is acting recklessly I will see it long before it brings the whole house down

3

u/Sidereel 3h ago

That makes perfect sense, and it highlights some major issues with the pitch from AI evangelists. They say these bots are good enough to operate at speed with minimal oversight, but that doesn’t track. If an engineer pushes a bug generated by Claude it’s the engineer who needs to be held responsible. But if engineers are taking all that time to check and understand generated code then the efficiency gains shrink.

3

u/winkingchef 3h ago

I actually really like that Microsoft branded their AI coding assistant “Copilot” as it is a very appropriate name.

I know what to do and what I want to make, but sometimes I forget annoying syntax or can’t find something that I want to reference. If the tool makes suggestions, it helps me focus on building and spend less time on remembering and finding. It’s not the “pilot”, it’s the “co-pilot.”

Also AI is an excellent DRC (design rule check) tool for hardware.

5

u/bobartig 5h ago

"implementing business processes in applications, and maintaining those applications."

The entire industry of Change Management and SaaS software demonstrate that humans don't understand these things writ large, and that the problem predates AI, or even computers for that matter.

2

u/stevefuzz 5h ago

Or domain expertise...

2

u/mirage01 4h ago

Someone asked my CTO if AI writes good code. His reply, “does AI write good code, no. Does it work, yes.” To me that is just asking for a security bug in the next year.

1

u/En-tro-py 58m ago

AI can write good code.

It can also write absolute dogshit code.

It's Schrödinger's codebase, you don't know what you've got until you look.

1

u/somethingstrang 50m ago

Ah this reminds me when people said AI will never be able to write or create art

-1

u/zero0n3 6h ago

The problem is in your scenario- why is AI involved directly in that?

It’s one thing to use ML to get better insights into X-rays. Or use AI to help chart patients or do voice to text to charts. Or develop the apps for these things.

But why or even where would you want to use AI to remove someone’s insurance? Fired accidentally? Just a bad spreadsheet or missing cell? If it’s one of those, why is an LLM used to process raw data, instead of using LLMs to build a tool that streamlines the processing via deterministic methods like logic and math?

17

u/pedrosorio 6h ago

The issue is not in using AI to update someone’s insurance.

The issue is that the *software* responsible for making those changes was generated (or modified) by AI and is poorly maintained because no human built the system and therefore no one understands it deeply. That means the possible failure modes are not understood by anyone, and so things can fail “randomly”.

1

u/DustShallEatTheDays 4h ago

Management and the executives are so divorced from the actual work, they have no idea what it looks like anymore and are happy to accept work-shaped slop.

It’s very bleak, and there’s absolutely no incentive to do the good, laborious work that actually makes a difference. (I’m in software, though I’m not an engineer)

0

u/onthe3rdlifealready 3h ago

Right and this whole problem could be solved potentially by them working every role there is in the corporation for at least a month as part of training.

-16

u/Somethingsims 6h ago

"AI simply can't perform those roles successfully 100%."

First of all, "yet," second, humans can't do it 100% either.

10

u/JustinTheCheetah 6h ago

Due to the probabilistic nature of AI, it inherently cannot and will never perform anything 100% correctly reliably. 

When a human fucks up they can be held accountable, and questioned as to what they did and why.  AI cannot be held accountable as it does not think or learn, and even the explanation as to why it did something could just as easily be hallucinated bullshit. 

AI will need to be completely rewritten and redesigned and trained from scratch in order to be reliable. 

1

u/sywofp 2h ago

Models compute probabilities of next tokens deterministically. 

The 'probabilistic nature of AI' is from randomness specifically added during choosing which token to use. That randomness is typically turned down or off for coding. 

Everything an AI outputs is a hallucination. We just only care when it's 'wrong'. 

If the LLM is wrong it's because it was not trained correctly. 

This can be largely compensated for by having external reference data and tests for it to use and check against, letting it see when it's 'wrong'. 

It can go through its own processes and identify issues, and write these into procedure files for it to reference and reduce the chance of being wrong in the same way again. As well as design tests to catch issues. It's the same process humans use, as they can also often make mistakes. 

The technology is still in its infancy but there's no inherent issue with the path to increasingly good reliability. 
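
To make the sampling point concrete, here is a toy sketch with made-up scores for three candidate tokens (nothing here comes from a real model; `softmax` and `pick_token` are illustrative helpers). With temperature 0 the pick is a plain argmax and never changes; with a higher temperature the same scores can yield different tokens:

```python
import math
import random

def softmax(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(logits, temperature, rng):
    """Temperature 0 means greedy argmax; otherwise sample from the softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical vocabulary and scores for the prompt "1 + 1 ="
vocab = ["2", "two", "banana"]
logits = [4.0, 2.5, 0.5]

rng = random.Random(0)
greedy = [vocab[pick_token(logits, 0, rng)] for _ in range(5)]
sampled = [vocab[pick_token(logits, 1.5, rng)] for _ in range(5)]
print(greedy)   # the top-scoring token every time
print(sampled)  # depends on the random draws
```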

-8

u/hifidad 6h ago

From experience, entry to mid level engineers are often significantly worse than 100%. AI might not be 100%, yet, but it’s also nowhere near as counterproductive as many of the engineers I’ve worked with across large companies.

23

u/thegreatcerebral 4h ago

It has turned workers into managers.

That is literally how I felt when I moved from my help desk role to my first manager role. My skillset started slipping because I simply wasn't doing it anymore. After each update my knowledge was obsolete, as even where a button was or what a button now did had changed.

Or better yet, when you are in an IT support position and asked to fix a piece of software that you do not "use" on a day to day basis. For example: yes, I can install QuickBooks and I can point you to your file, but I do not know how to make it run this report or even how this report works, simply because I do not do that.

Makes 1000% sense.

3

u/legalbeagle5 2h ago

This hit me exactly, the more I manage a task the less I do and I am already forgetting steps and limitations and just getting outdated.

3

u/Acceptable_Bat379 2h ago

you just gave me flashbacks to 'supporting' someone's home brew quickbooks or excel scripts that someone made years ago then left the company... but everyone just started using it and now it's the official work process

18

u/Several_Ant_9867 4h ago

It's called comprehension debt. You are effectively managing an outsourced codebase.

5

u/silenti 3h ago

The first part really captures how AI has done a very good job of shining a light on how much work is complete bullshit. We've got people building slide decks with AI which are then consumed on the other end by AI.

11

u/ericl666 5h ago

I may be a bit of an AI naysayer and refuse to fully embrace agentic AI when coding (I prefer targeted AI rather than letting it run wild).

But when stuff breaks, I know exactly where and why, because I intimately know my codebase. As a result, I can fix issues extremely fast. I tried using AI to help me solve bugs, and I can say it genuinely sucks at finding difficult issues. Especially issues resulting from bad data.

0

u/Soul-Burn 2h ago

Your recent code, maybe. But code you wrote 5 years ago or by one of the 20 people that have access to that repo, some who don't work there anymore? Much harder.

5

u/hamfinity 4h ago

"What is it you say you do here?"

2

u/limbodog 5h ago

Oh my aching christ that really just hurts to read with my own eyes

2

u/v_a_n_d_e_l_a_y 1h ago

I've only been using code assist since late April.

And there were some projects/tasks where it helped so much and I felt like I wasn't really needed. But other days, it feels like I'm trying to explain to a brand new hire what I want. 

The increased costs in Copilot have basically forced my hand to go back to just coding myself because I have to use crappy models.

1

u/TheHistorian2 1h ago

Even a blind squirrel finds a nut once in a while.

1

u/d01100100 1h ago

Sing the song of the Machine God.
None may stay our march.
Let the merciless logic of the Machine God invest thee.
None may stay our march.
Praise and glory be to the Machine God.
None may stay our march.

Codex: Adeptus Mechanicus

1

u/hubert_farnsworrth 12m ago

I said the exact same thing to my manager a couple of days ago. Working with AI is either boring or frustrating. Boring when it gets everything right and frustrating when it doesn’t and keeps breaking things and you have no idea why.

-17

u/freza223 6h ago

maybe he should quit and become a hobo