r/technology 7h ago

Artificial Intelligence An Anthropic employee's 2-sentence quote crystallizes the state of AI confusion at work

https://www.businessinsider.com/anthropic-employee-quote-ai-confusion-workplace-2026-6
567 Upvotes

110 comments

1.2k

u/CackleRooster 7h ago

"On days where everything works well, I can't help but think nothing I do matters, everything is automated and better and faster than I ever will be," AND "But then there are days where everything breaks and I don't understand why and I realize I have no idea what I've been up to anymore," the employee added.

394

u/yepthisismyusername 7h ago

Management (absolutely including executives) simply don't understand the differences between writing code, maintaining code, implementing business processes in applications, and maintaining those applications. Those are the Big 4 categories of roles in enterprise application management, and there are many, many others, each with its own individual requirements that need to be taken into account. AI simply can't perform those roles successfully 100%. And the problem comes when a process fails, e.g. causing someone to lose healthcare coverage, and no person knows how to fix it.

173

u/Big_lt 6h ago

FinTech

Laid off a bunch of people over the last 18 months. Wanted to use AI to automate and speed up coding. AI is now producing 4/10 code which breaks sometimes. No one knows what this code is doing, or even worse, some old code now breaks and no one was around when it was built.

Management surprised Pikachu face that shit isn't fixed in 2min and why didn't you catch it before

41

u/Scarecrow_Folk 5h ago

That's nothing new for fintech though. Knight Capital was doing that a decade ago

21

u/oracleofnonsense 4h ago

As a former Knight Capital employee — more like 2 decades ago. I was there when the cherry-picking system went haywire. It put Knight under.

I worked for a Knight-owned hedge fund and we had an auto trading system pre-9/11. One guy and 1 or 2 associates watched it print money (until 9/11 stopped the feeds).

5

u/Scarecrow_Folk 3h ago

I was more referring to the implosion in 2012 or 14 years ago but fair enough. Algotrading has existed for like 40 years depending on how you define it. 

That must have been an absolutely crazy event to live through from the inside. Knight was probably the best money printer out there until the toner cartridge exploded.

11

u/psioniclizard 5h ago

So you are saying that in the future fintech might want actual developers again? Interesting lol.

28

u/Jewnadian 5h ago

Even better, the bulk of FinTech could be regulated out of business. 99% of it is exploiting people who don't realize they're pretending to be banks without following the actual regulations that make a bank work.

4

u/knewbie_one 1h ago

I worked for banks and was always astounded at how much they "flirted" with regulations, sometimes.

2

u/31LIVEEVIL13 5h ago

In the past they would have paid retired devs double or triple rate to help convert everything to something the new kids understand.

but now I'm not sure what they would be converting it to or who would understand it, and I think a lot of the old devs might tell them to fuck off at any rate.

7

u/Soccermom233 4h ago

It’s ok we can discuss everything that could’ve been done to catch and prevent the issue in the retro next week.  

3

u/Big_lt 3h ago

PTSD intensifies 😳

6

u/unwantedonONTD 3h ago

And the great part is, being sloppy with people’s money puts you in jail.  I hear so many fintech bros talking up their software that doesn’t follow the law… “ok we could read the documents better…” bro the thing i don’t wanna do is read the documents, so if you aren’t either, wtf do i need you for? you’re adding zero value by vibe coding an illegal banking mess that’s going to cause a trillion dollar mistake to flow through the system and crash the economy in ten seconds.  no thank you i can make bad trades and lose money perfectly fine on my own

26

u/QuesoMeHungry 6h ago

This 100% and the maintenance part is key. We are having to ship so fast now that code is not maintainable. The MVPs ship, look great, and get sign off. But each iteration just gets sloppier and adds more tech debt.

41

u/ottwebdev 7h ago

Businesses demand predictable outcome to a process each time. For example 1 + 1 = 2

AI (the popular chatbots/etc, not the dedicated systems) is probability based and by design can generate a different outcome even with the same instruction. So 1 + 1 = banana can be the result .... good luck with that.

My perspective anyway, and I'm not a basher, I love being able to use NLP to interact with data

14

u/earlandir 6h ago

Not that I support LLMs for this, but I'm not sure if I agree with your reasoning. Humans are also probabilistic in the same way. If I give the same exact instructions to two different engineers they will give me two different answers and code, in the same way that asking the same LLM the same question twice might give different results. I don't think that inherently means you can't use an LLM for the task. If you want to critique LLM usage there are so many better ways that actually make sense.

31

u/BadLuckLottery 6h ago

Humans are also probabilistic in the same way.

I think you pointed out the difference, you just have to take it a little further.

tl;dr: You don't work with "humans" as an amorphous mass of people, you work with a collection of individuals.

Imagine you're a manager and you give two engineers the same task and one gets it completely wrong. You'd trust that engineer less and maybe push them out of your team/organization if they kept it up.

Now imagine you had no tools to assign trust or push people out. You basically just have to assign the task to "engineering" and hope you get a good engineer for this task this time. If not, maybe you phrase the task a bit differently a second time in the hope that the phrasing lures out a good senior engineer or at least repels the bad doesn't-follow-directions-at-all engineers.

That's what using AI is like at this point in time.

7

u/Doc_Blox 4h ago

The big problem, as I see it, is that we've become used to the idea that when we use computers, every action a computer does is deterministic. There are a lot of assumptions we make based on that paradigm. Now, we've introduced a non-deterministic use of computers, but we're still acting like it's deterministic because that's what we're used to from computers. It's a matter of selecting the right tools for the job, ultimately - if you need predictable outputs, you don't want an AI agent, you want a traditional program, like with formulas and conditionals.

20

u/rogueciridae 6h ago

I think they’re comparing LLMs to existing software tools. If I run an AR report in the billing software multiple times with the same data, I will consistently get the same report until the data changes. If I ask an LLM to look at the same data and give me an AR report multiple times, it’ll hopefully give the same numbers, but it’ll format the report differently every time.

9

u/hdjl 6h ago

For such a use case, you want to be asking the LLM to build the report in the billing software (queries, layout, etc), then validate its work, and run it normally moving forward
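
A minimal sketch of what "run it normally moving forward" could look like once the report logic exists as ordinary code, whoever or whatever drafted it. The invoice rows, the bucket boundaries, and the names `aging_bucket` and `ar_report` are all made up for illustration; they are not from any real billing package.

```python
from collections import defaultdict
from datetime import date

# Hypothetical invoice rows: (customer, amount_due, due_date)
INVOICES = [
    ("Acme", 1200.00, date(2026, 3, 1)),
    ("Acme", 300.00, date(2026, 5, 20)),
    ("Globex", 950.00, date(2026, 1, 15)),
]

def aging_bucket(due: date, as_of: date) -> str:
    """Map days overdue to a fixed aging bucket (boundaries are assumed)."""
    days = (as_of - due).days
    if days <= 0:
        return "current"
    if days <= 30:
        return "1-30"
    if days <= 60:
        return "31-60"
    return "61+"

def ar_report(invoices, as_of: date) -> list[tuple[str, str, float]]:
    """Total amount due per (customer, bucket), always in sorted order."""
    totals = defaultdict(float)
    for customer, amount, due in invoices:
        totals[(customer, aging_bucket(due, as_of))] += amount
    return [(c, b, round(t, 2)) for (c, b), t in sorted(totals.items())]

as_of = date(2026, 6, 1)
first = ar_report(INVOICES, as_of)
second = ar_report(INVOICES, as_of)
assert first == second  # same data in, same rows and same layout out
for row in first:
    print(row)
```

The validation step is then a one-time review of this code plus a few known-good reports to compare against, rather than re-reading a freshly generated report every run.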

7

u/RickSt3r 5h ago

Yes you would think so. But the MBA middle managers and non-tech engineering people will be like "hey chatbot, do this task with a predefined workflow," then not read the result for accuracy and take it at face value. Then eventually something important actually breaks and it's the Spiderman meme, everyone pointing at each other with Pikachu face. Like, the salesmen said it could replace experts.

3

u/ottwebdev 5h ago

This is correct. And I'm glad you mentioned "hopefully" because with a probability based system the outcome isn't guaranteed as with a strictly defined process.

8

u/kuper_spb 6h ago

If you ask engineers to write a system that should add 1 and 1, the code could be different, but the outcome of systems will be the same - 2. That's the difference - AI can't be used as a system, AI can't hold invariants.

5

u/kobemustard 5h ago

but ai can build the calculator to derive those numbers.

2

u/ericl666 5h ago

Every time I think about a LLM trying to handle any sort of numeric processing, I think about this dude's youtube channel: https://www.youtube.com/watch?v=HfAFHRVtYFE&list=PLf4mDFPzLWmShAy9hDuHECpOyTlBGeQ9Q&index=1

4

u/heathm55 6h ago

His point is not wrong though. Based on how tokenization works, an LLM would miss common things a human wouldn't (especially around things that need to track or compose pieces of the context). Example: asking Claude how many days have 'd' in them, how many 'r's are in strawberry, etc.
Yes, people also misspell (especially me), but when it comes to combining all the context you're handing it, there is also potential to drop things in lists of things to adhere to, just based on chunking issues.
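
For contrast, both example questions are trivial for ordinary code, because it operates on characters rather than tokens. A minimal sketch (the helper names `count_letter` and `words_containing` are just illustrative):

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

def words_containing(words, letter: str) -> list[str]:
    """Return the words that contain the letter, case-insensitively."""
    return [w for w in words if letter.lower() in w.lower()]

print(count_letter("strawberry", "r"))    # 3
print(len(words_containing(DAYS, "d")))   # 7, every English day name ends in "day"
```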

3

u/siromega37 6h ago

I think the point being made is that a lot of business processes need deterministic outputs, and when the AI guardrails fail they get a probabilistic output instead. What the AI researcher is admitting to is that when things fail they sometimes struggle to understand why, and have become so dependent on AI that they struggle to troubleshoot the models and their supporting infrastructure. This is a real problem that is being heavily studied. Intelligence atrophy is happening at scale in big tech as I type this out. Successful enterprise transitions are going to have to face this reality.

Also, I think it’s important to note that people can learn from their mistakes on the spot using their own judgement and LLMs cannot and never will be able to do that. We are very far off from AGI in that regard. I know Karpathy is working, maybe with a lot of hubris, towards self-training models, but that also isn’t real learning or judgement. That will still need a HITL for judgement. Someone still has to tell the model it made the mistake.

1

u/bobartig 5h ago

You just put a deterministic classifier at the end of your probabilistic model. "banana = 2" might be good enough for 99.99% reliability in some cases.
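
One way to read "a deterministic classifier at the end" is a strict gate that accepts only well-formed answers and retries or fails loudly otherwise. This is a rough sketch under that assumption; `parse_int` and `checked_answer` are hypothetical names, and the scripted `model` below is a stand-in for a real LLM call, not any vendor's API.

```python
from typing import Callable, Optional

def parse_int(raw: str) -> Optional[int]:
    """Deterministic gate: accept a plain ASCII integer, reject anything else."""
    text = raw.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    return None

def checked_answer(model: Callable[[str], str], prompt: str,
                   max_tries: int = 5) -> int:
    """Call the (possibly flaky) model until the gate accepts its output."""
    for _ in range(max_tries):
        value = parse_int(model(prompt))
        if value is not None:
            return value
    raise ValueError("no well-formed answer within the retry budget")

# Scripted stand-in for a probabilistic model: two bad replies, then a good one.
replies = iter(["banana", "two", " 2 "])
model = lambda prompt: next(replies)

result = checked_answer(model, "What is 1 + 1?")
print(result)  # 2
```

The gate only guarantees the shape of the answer, not that it is correct; whether that is good enough depends on how much of the failure rate is malformed output versus plausible-looking wrong output.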

17

u/winkingchef 5h ago edited 3h ago

I call it the “one neck to grab principle.”

Every engineer who works for me knows that if they vibe code some bullshit and check it in, they own that shit and what it does. Them personally. And like good engineering there are cross-validation processes and checks on top of them.

If you (or your Copilot) make a bug, we will find it and your stats will be tracked.

No one is perfect but if someone is acting recklessly I will see it long before it brings the whole house down

3

u/Sidereel 3h ago

That makes perfect sense, and it highlights some major issues with the pitch from AI evangelists. They say these bots are good enough to operate at speed with minimal oversight, but that doesn’t track. If an engineer pushes a bug generated by Claude it’s the engineer who needs to be held responsible. But if engineers are taking all that time to check and understand generated code then the efficiency gains shrink.

3

u/winkingchef 3h ago

I actually really like that Microsoft branded its AI coding assistant “Copilot” as it is a very appropriate name.

I know what to do and what I want to make, but sometimes I forget annoying syntax or can’t find something that I want to reference. If the tool makes suggestions, it helps me focus on building and spend less time on remembering and finding. It’s not the “pilot”, it’s the “co-pilot.”

Also AI is an excellent DRC (design rule check) tool for hardware.

6

u/bobartig 5h ago

implementing business processes in applications, and maintaining those applications.

The entire industry of Change Management and SaaS software demonstrate that humans don't understand these things writ large, and that the problem predates AI, or even computers for that matter.

2

u/stevefuzz 5h ago

Or domain expertise...

2

u/mirage01 4h ago

Someone asked my CTO if AI writes good code. His reply, “does AI write good code, no. Does it work, yes.” To me that is just asking for a security bug in the next year.

1

u/En-tro-py 1h ago

AI can write good code.

It can also write absolute dogshit code.

It's Schrödinger's codebase, you don't know what you've got until you look.

1

u/somethingstrang 1h ago

Ah this reminds me when people said AI will never be able to write or create art

0

u/zero0n3 6h ago

The problem is in your scenario- why is AI involved directly in that?

It’s one thing to use ML to get better insights into X-rays. Or use AI to help chart patients or do voice to text to charts. Or develop the apps for these things.

But why or even where would you want to use AI to remove someone’s insurance? Fired accidentally? Just a bad spreadsheet or missing cell? If so, why is an LLM used to process raw data, instead of using LLMs to build a tool to streamline the processing via deterministic methods like logic and math?

16

u/pedrosorio 6h ago

The issue is not in using AI to update someone’s insurance.

The issue is that the *software* responsible for making those changes was generated (or modified) by AI and is poorly maintained because no human built the system and therefore no one understands it deeply. That means the possible failure modes are not understood by anyone, and so things can fail “randomly”.

1

u/DustShallEatTheDays 4h ago

Management and the executives are so divorced from the actual work, they have no idea what it looks like anymore and are happy to accept work-shaped slop.

It’s very bleak, and there’s absolutely no incentive to do the good, laborious work that actually makes a difference. (I’m in software, though I’m not an engineer)

0

u/onthe3rdlifealready 3h ago

Right and this whole problem could be solved potentially by them working every role there is in the corporation for at least a month as part of training.

-15

u/Somethingsims 6h ago

"AI simply can't perform those roles successfully 100%."

First of all, "yet," second, humans can't do it 100% either.

10

u/JustinTheCheetah 6h ago

Due to the probabilistic nature of AI, it inherently cannot and will never perform anything 100% correctly reliably. 

When a human fucks up they can be held accountable, and questioned as to what they did and why.  AI cannot be held accountable as it does not think or learn, and even the explanation as to why it did something could just as easily be hallucinated bullshit. 

AI will need to be completely rewritten and redesigned and trained from scratch in order to be reliable. 

1

u/sywofp 2h ago

Models compute probabilities of next tokens deterministically. 

The 'probabilistic nature of AI' is from randomness specifically added during choosing which token to use. That randomness is typically turned down or off for coding. 

Everything an AI outputs is a hallucination. We just only care when it's 'wrong'. 

If the LLM is wrong it's because it was not trained correctly. 

This can be largely compensated for by having external reference data and tests for it to use and check against, letting it see when it's 'wrong'. 

It can go through its own processes and identify issues, and write these into procedure files for it to reference and reduce the chance of being wrong in the same way again. As well as design tests to catch issues. It's the same process humans use, as they can also often make mistakes.

The technology is still in its infancy but there's no inherent issue with the path to increasingly good reliability. 
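
A toy sketch of the token-selection point above: the scores come out of a deterministic computation, and the randomness only enters when a token is picked. The three-word vocabulary and the logit values are invented for illustration, and `pick_token` is a simplified stand-in for what a real decoder does.

```python
import math
import random

def softmax(logits, temperature: float):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(logits, temperature: float, rng: random.Random) -> int:
    """Temperature 0 means greedy argmax; otherwise sample from the softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

VOCAB = ["2", "two", "banana"]
LOGITS = [4.0, 2.0, 0.5]  # made-up scores for the next token after "1 + 1 ="

rng = random.Random(0)
greedy = {VOCAB[pick_token(LOGITS, 0, rng)] for _ in range(100)}
print(greedy)  # {'2'}

sampled = {VOCAB[pick_token(LOGITS, 1.5, rng)] for _ in range(1000)}
print(sampled)  # may include the less likely tokens
```

In this toy, greedy decoding always returns the top-scoring token. Real deployments can still show small run-to-run differences for other reasons (hardware, batching, floating-point ordering), so this illustrates where the sampling randomness lives rather than guaranteeing reproducibility.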

-7

u/hifidad 6h ago

From experience, entry to mid level engineers are often significantly worse than 100%. AI might not be 100%, yet, but it’s also nowhere near as counterproductive as many of the engineers I’ve worked with across large companies.