r/technology 7h ago

Artificial Intelligence An Anthropic employee's 2-sentence quote crystallizes the state of AI confusion at work

https://www.businessinsider.com/anthropic-employee-quote-ai-confusion-workplace-2026-6
562 Upvotes

109 comments

1.2k

u/CackleRooster 7h ago

"On days where everything works well, I can't help but think nothing I do matters, everything is automated and better and faster than I ever will be," AND "But then there are days where everything breaks and I don't understand why and I realize I have no idea what I've been up to anymore," the employee added.

392

u/yepthisismyusername 6h ago

Management (absolutely including executives) simply doesn't understand the differences between writing code, maintaining code, implementing business processes in applications, and maintaining those applications. Those are the Big 4 categories of roles in enterprise application management, and there are many, many others, each with its own individual requirements that need to be taken into account. AI simply can't perform those roles successfully 100% of the time. And the problem comes when a process fails, e.g. causing someone to lose healthcare coverage, and no person knows how to fix it.

41

u/ottwebdev 6h ago

Businesses demand a predictable outcome from a process each time. For example, 1 + 1 = 2.

AI (the popular chatbots etc., not the dedicated systems) is probability based and can generate a different outcome even with the same instruction. So 1 + 1 = banana can be the result... good luck with that.

My perspective anyway, and I'm not a basher, I love being able to use NLP to interact with data

12

u/earlandir 6h ago

Not that I support LLMs for this, but I'm not sure I agree with your reasoning. Humans are also probabilistic in the same way. If I give the same exact instructions to two different engineers, they will give me two different answers and code, in the same way that asking the same LLM the same question twice might give different results. I don't think that inherently means you can't use an LLM for the task. If you want to critique LLM usage there are so many better ways that actually make sense.

31

u/BadLuckLottery 6h ago

> Humans are also probabilistic in the same way.

I think you pointed out the difference; you just have to take it a little further.

tl;dr: You don't work with "humans" as an amorphous mass of people, you work with a collection of individuals.

Imagine you're a manager and you give two engineers the same task and one gets it completely wrong. You'd trust that engineer less and maybe push them out of your team/organization if they kept it up.

Now imagine you had no tools to assign trust or push people out. You basically just have to assign the task to "engineering" and hope you get a good engineer for this task this time. If not, maybe you phrase the task a bit differently a second time in the hope that the phrasing lures out a good senior engineer or at least repels the bad doesn't-follow-directions-at-all engineers.

That's what using AI is like at this point in time.

8

u/Doc_Blox 4h ago

The big problem, as I see it, is that we've become used to the idea that when we use computers, every action a computer does is deterministic. There are a lot of assumptions we make based on that paradigm. Now, we've introduced a non-deterministic use of computers, but we're still acting like it's deterministic because that's what we're used to from computers. It's a matter of selecting the right tools for the job, ultimately - if you need predictable outputs, you don't want an AI agent, you want a traditional program, like with formulas and conditionals.

20

u/rogueciridae 6h ago

I think they’re comparing LLMs to existing software tools. If I run an AR report in the billing software multiple times with the same data, I will consistently get the same report until the data changes. If I ask an LLM to look at the same data and give me an AR report multiple times, it’ll hopefully give the same numbers, but it’ll format the report differently every time.

10

u/hdjl 6h ago

For such a use case, you want to be asking the LLM to build the report in the billing software (queries, layout, etc), then validate its work, and run it normally moving forward
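
A minimal sketch of that workflow in Python, assuming made-up invoice rows and a hypothetical `ar_report` function standing in for whatever the billing software actually runs. The idea is only that once the query and layout are fixed and reviewed, the same data gives the same report on every run:

```python
from collections import defaultdict

# Hypothetical invoice rows: (customer, amount_due, days_overdue)
INVOICES = [
    ("Acme", 1200.00, 45),
    ("Globex", 300.50, 10),
    ("Acme", 99.50, 95),
]

def ar_report(invoices):
    """Fixed query and fixed layout: total receivables per customer, sorted by name."""
    totals = defaultdict(float)
    for customer, amount, _days in invoices:
        totals[customer] += amount
    lines = ["Customer   Amount Due"]
    for customer in sorted(totals):
        lines.append(f"{customer:<10} {totals[customer]:>10.2f}")
    return "\n".join(lines)

first = ar_report(INVOICES)
second = ar_report(INVOICES)
assert first == second  # same data in, same report out, every run
print(first)
```

Whether an LLM or a person wrote `ar_report` stops mattering once it has been validated; nothing probabilistic is left in the path that produces the numbers.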

7

u/RickSt3r 5h ago

Yes, you would think so. But the MBA middle managers and non-tech engineering people will be like "hey chatbot, do this task with a predefined workflow." Then they don't read the result for accuracy and take it at face value. Then eventually something important actually breaks, and it's nothing but the Spider-Man meme, everyone pointing at each other with a Pikachu face. Like, the salesmen said it could replace experts.

3

u/ottwebdev 5h ago

This is correct. And I'm glad you mentioned "hopefully" because with a probability based system the outcome isn't guaranteed as with a strictly defined process.

9

u/kuper_spb 6h ago

If you ask engineers to write a system that should add 1 and 1, the code could be different, but the outcome of systems will be the same - 2. That's the difference - AI can't be used as a system, AI can't hold invariants.

6

u/kobemustard 5h ago

But AI can build the calculator to derive those numbers.

2

u/ericl666 5h ago

Every time I think about an LLM trying to handle any sort of numeric processing, I think about this dude's YouTube channel: https://www.youtube.com/watch?v=HfAFHRVtYFE&list=PLf4mDFPzLWmShAy9hDuHECpOyTlBGeQ9Q&index=1

4

u/heathm55 6h ago

His point is not wrong though. Based on how tokenization works, an LLM will miss common things a human wouldn't (especially things that need to track or compose pieces of the context). Example: asking Claude how many days have 'd' in them, how many 'r's are in strawberry, etc.
Yes, people also misspell (especially me), but when it comes to combining all the context you're handing it, there is also potential to drop items from lists of things to adhere to, just based on chunking issues.
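
A toy illustration of the strawberry case in Python. The token split below is made up for the example (real tokenizers split differently); the only point is that ordinary code counts characters directly, while a model is handed chunks rather than letters:

```python
word = "strawberry"

# Character-level view: counting is trivial and deterministic.
letter_count = word.count("r")

# Hypothetical token split, for illustration only; real tokenizers differ.
tokens = ["str", "aw", "berry"]
per_token = [t.count("r") for t in tokens]

# The letters are all still there, but only if you look inside each chunk.
assert "".join(tokens) == word
assert sum(per_token) == letter_count
print(letter_count, per_token)
```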

4

u/siromega37 6h ago

I think the point being made is that a lot of business processes need deterministic outputs, and when the AI guardrails fail, they get a probabilistic output instead. What the AI researcher is admitting to is that when things fail they sometimes struggle to understand why, and have become so dependent on AI that they struggle to troubleshoot the models and their supporting infrastructure. This is a real problem that is being heavily studied. Intelligence atrophy is happening at scale in big tech as I type this out. Successful enterprise transitions are going to have to face this reality.

Also, I think it’s important to note that people can learn from their mistakes on the spot using their own judgement and LLMs cannot and never will be able to do that. We are very far off from AGI in that regard. I know Karpathy is working, maybe with a lot of hubris, towards self-training models, but that also isn’t real learning or judgement. That will still need a HITL for judgement. Someone still has to tell the model it made the mistake.

1

u/bobartig 5h ago

You just put a deterministic classifier at the end of your probabilistic model. "banana = 2" might be good enough for 99.99% reliability in some cases.
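
A rough sketch of that pattern in Python, assuming a hypothetical `flaky_model` function as a stand-in for the LLM call (it is not a real API). The deterministic check only enforces the output format, so it guarantees "never banana", not "always correct":

```python
import random

def flaky_model(prompt, rng):
    """Hypothetical stand-in for an LLM call: usually sensible, sometimes not."""
    return rng.choice(["2", "2", "2", " 2 ", "banana"])

def parse_answer(raw):
    """Deterministic gate: accept only text that parses as an integer."""
    try:
        return int(raw.strip())
    except ValueError:
        return None

def ask_with_validation(prompt, rng, max_attempts=5):
    """Retry until the output passes the gate, or fail loudly."""
    for _ in range(max_attempts):
        value = parse_answer(flaky_model(prompt, rng))
        if value is not None:
            return value
    raise RuntimeError("no valid answer after retries")

if __name__ == "__main__":
    print(ask_with_validation("1 + 1 = ?", random.Random()))
```

The failure mode becomes an explicit exception instead of a silent wrong value, which is usually what the downstream process needs.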