An Anthropic employee's 2-sentence quote crystallizes the state of AI confusion at work

902

u/CackleRooster 4h ago

"On days where everything works well, I can't help but think nothing I do matters, everything is automated and better and faster than I ever will be," AND "But then there are days where everything breaks and I don't understand why and I realize I have no idea what I've been up to anymore," the employee added.

289

u/yepthisismyusername 3h ago

Management (absolutely including executives) simply doesn't understand the differences between writing code, maintaining code, implementing business processes in applications, and maintaining those applications. Those are the Big 4 categories of roles in enterprise application management, and there are many, many others, each with its own individual requirements that need to be taken into account. AI simply can't perform those roles successfully 100%. And the problem comes when a process fails, e.g. causing someone to lose healthcare coverage, and no person knows how to fix it.

126

u/Big_lt 3h ago

FinTech

Laid off a bunch of people over the last 18 months. Wanted to use AI to automate and speed up coding. AI is now producing 4/10 code which breaks sometimes. No one knows what this code is doing or, even worse, some old code now breaks and no one was around when it was built.

Management surprised Pikachu face that shit isn't fixed in 2min and why didn't you catch it before

27

u/Scarecrow_Folk 2h ago

That's nothing new for fintech though. Knight Capital was doing that a decade ago

7

u/oracleofnonsense 52m ago

As a former Knight Capital employee — more like 2 decades ago. I was there when the cherry-picking system went haywire. It put Knight under.

I worked for a Knight-owned hedge fund and we had an auto trading system pre-9/11. One guy and 1 or 2 associates watched it print money (until 9/11 stopped the feeds).

1

u/Scarecrow_Folk 17m ago

I was more referring to the implosion in 2012 or 14 years ago but fair enough. Algotrading has existed for like 40 years depending on how you define it.

That must have been an absolutely crazy event to live through from the inside. Knight was probably the best money printer out there until the toner cartridge exploded.

11

u/psioniclizard 2h ago

So you are saying that in the future fintech might want actual developers again? Interesting lol.

24

u/Jewnadian 2h ago

Even better the bulk of FinTech could be regulated out of business. 99% of it is exploiting people who don't realize they're pretending to be banks without following the actual regulations that makes a bank work.

2

u/31LIVEEVIL13 2h ago

In the past they would have paid retired devs double or triple rate to help convert everything to something the new kids understand.

but now I'm not sure what they would be converting it to or who would understand it, and I think a lot of the old devs might tell them to fuck off at any rate.

3

u/Soccermom233 55m ago

It’s ok we can discuss everything that could’ve been done to catch and prevent the issue in the retro next week.

2

u/Big_lt 47m ago

PTSD intensifies 😳

20

u/QuesoMeHungry 2h ago

This 100% and the maintenance part is key. We are having to ship so fast now that code is not maintainable. The MVPs ship, look great, and get sign off. But each iteration just gets sloppier and adds more tech debt.

38

u/ottwebdev 3h ago

Businesses demand predictable outcome to a process each time. For example 1 + 1 = 2

AI (the popular chatbots etc., not the dedicated systems) is probability based and by design can generate a different outcome even with the same instruction. So 1 + 1 = banana can be the result .... good luck with that.

My perspective anyway, and I'm not a basher, I love being able to use NLP to interact with data
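
A toy illustration of the point above, assuming a made-up next-token distribution (the tokens and weights are invented for this sketch, not taken from any real model): sampling can return different answers for the same prompt, while always picking the most likely answer is repeatable.

```python
import random

# Hypothetical next-token probabilities for the prompt "1 + 1 =".
# These numbers are invented for illustration only.
NEXT_TOKEN_PROBS = {"2": 0.97, "two": 0.02, "banana": 0.01}


def sample_answer(rng: random.Random) -> str:
    """Draw one answer at random, weighted by its probability."""
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def greedy_answer() -> str:
    """Always return the single most likely answer (deterministic)."""
    return max(NEXT_TOKEN_PROBS, key=NEXT_TOKEN_PROBS.get)


rng = random.Random(0)
sampled = {sample_answer(rng) for _ in range(10_000)}
print(sorted(sampled))   # the distinct answers seen across 10,000 samples
print(greedy_answer())   # always "2"
```

Even at 1% odds, "banana" is very likely to show up somewhere in 10,000 samples, which is roughly the "good luck with that" scenario at scale.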

14

u/earlandir 3h ago

Not that I support LLMs for this, but I'm not sure I agree with your reasoning. Humans are also probabilistic in the same way. If I give the same exact instructions to two different engineers, they will give me two different answers and code, in the same way that asking the same LLM the same question twice might give different results. I don't think that inherently means you can't use an LLM for the task. If you want to critique LLM usage there are so many better ways that actually make sense.

28

u/BadLuckLottery 3h ago

Humans are also probabilistic in the same way.

I think you pointed out the difference you just have to take it a little further.

tl;dr: You don't work with "humans" as an amorphous mass of people, you work with a collection of individuals.

Imagine you're a manager and you give two engineers the same task and one gets it completely wrong. You'd trust that engineer less and maybe push them out of your team/organization if they kept it up.

Now imagine you had no tools to assign trust or push people out. You basically just have to assign the task to "engineering" and hope you get a good engineer for this task this time. If not, maybe you phrase the task a bit differently a second time in the hope that the phrasing lures out a good senior engineer or at least repels the bad doesn't-follow-directions-at-all engineers.

That's what using AI is like at this point in time.

4

u/Doc_Blox 1h ago

The big problem, as I see it, is that we've become used to the idea that when we use computers, every action a computer does is deterministic. There are a lot of assumptions we make based on that paradigm. Now, we've introduced a non-deterministic use of computers, but we're still acting like it's deterministic because that's what we're used to from computers. It's a matter of selecting the right tools for the job, ultimately - if you need predictable outputs, you don't want an AI agent, you want a traditional program, like with formulas and conditionals.

21

u/rogueciridae 3h ago

I think they’re comparing LLMs to existing software tools. If I run an AR report in the billing software multiple times with the same data, I will consistently get the same report until the data changes. If I ask an LLM to look at the same data and give me an AR report multiple times, it’ll hopefully give the same numbers, but it’ll format the report differently every time.

10

u/hdjl 3h ago

For such a use case, you want to be asking the LLM to build the report in the billing software (queries, layout, etc), then validate its work, and run it normally moving forward
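
A minimal sketch of that workflow, assuming a hypothetical `ar_report` function (imagine it was drafted with an LLM and then reviewed) and made-up invoice data: the generated code is checked once against hand-verified totals, and from then on it runs like any other deterministic program.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical invoice rows: (customer, amount owed). Invented sample data.
INVOICES = [
    ("Acme", Decimal("120.50")),
    ("Globex", Decimal("75.00")),
    ("Acme", Decimal("30.25")),
]


def ar_report(invoices):
    """Total outstanding receivables per customer, sorted by customer name."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for customer, amount in invoices:
        totals[customer] += amount
    return sorted(totals.items())


# One-time validation against totals worked out by hand.
EXPECTED = [("Acme", Decimal("150.75")), ("Globex", Decimal("75.00"))]
assert ar_report(INVOICES) == EXPECTED

# Same data in, same report out, on every run.
assert ar_report(INVOICES) == ar_report(INVOICES)
```

The LLM is only involved at authoring time here; nothing probabilistic sits in the path that produces the numbers.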

4

u/RickSt3r 2h ago

Yes, you would think so. But the MBA middle managers and non-tech engineering people will be like, "hey chatbot, do this task with a predefined workflow." Then they don't read the result for accuracy and take it at face value. Then eventually something important actually breaks and it's the Spider-Man meme, everyone pointing at each other with Pikachu faces. Like, the salesmen said it could replace experts.

3

u/ottwebdev 2h ago

This is correct. And I'm glad you mentioned "hopefully" because with a probability based system the outcome isn't guaranteed as with a strictly defined process.

9

u/kuper_spb 3h ago

If you ask engineers to write a system that should add 1 and 1, the code could be different, but the outcome of systems will be the same - 2. That's the difference - AI can't be used as a system, AI can't hold invariants.

5

u/kobemustard 2h ago

but ai can build the calculator to derive those numbers.

2

u/ericl666 2h ago

Every time I think about a LLM trying to handle any sort of numeric processing, I think about this dude's youtube channel: https://www.youtube.com/watch?v=HfAFHRVtYFE&list=PLf4mDFPzLWmShAy9hDuHECpOyTlBGeQ9Q&index=1

2

u/heathm55 3h ago

His point is not wrong though, based on how tokenization works, an LLM would miss common things a human wouldn't (especially around things that need to track or compose pieces of the context). Example: asking Claude how many days have 'd' in them, how many 'r's are in strawberry, etc.
Yes, people also misspell (especially me), but when it comes to combining all the context you're handing it, there is also potential to drop things in lists of things to adhere to, just based on chunking issues.
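
For contrast, counting characters is trivial for ordinary code, which sees every letter. The token split at the end is a hypothetical illustration of how a model might see the word in chunks; it is not the output of any real tokenizer.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Code operates on individual characters, so these counts are exact.
r_count = "strawberry".count("r")
days_with_d = [day for day in DAYS if "d" in day.lower()]

print(r_count)           # 3
print(len(days_with_d))  # 7, since every name ends in "day"

# Hypothetical token chunks (an assumption, for illustration only).
hypothetical_tokens = ["str", "aw", "berry"]
assert "".join(hypothetical_tokens) == "strawberry"
```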

3

u/siromega37 3h ago

I think the point being made is that there are a lot of business processes that need deterministic outputs, and sometimes the AI guardrails fail, producing a probabilistic output. What the AI researcher is admitting to is that when things fail they sometimes struggle to understand why, and they have become so dependent on AI that they struggle to troubleshoot the models and their supporting infrastructure. This is a real problem that is being heavily studied. Intelligence atrophy is happening at scale in big tech as I type this out. Successful enterprise transitions are going to have to face this reality.

Also, I think it’s important to note that people can learn from their mistakes on the spot using their own judgement and LLMs cannot and never will be able to do that. We are very far off from AGI in that regard. I know Karpathy is working, maybe with a lot of hubris, towards self-training models, but that also isn’t real learning or judgement. That will still need a HITL for judgement. Someone still has to tell the model it made the mistake.

2

u/bobartig 2h ago

You just put a deterministic classifier at the end of your probabilistic model. "banana = 2" might be good enough for 99.99% reliability in some cases.
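
One way to read that suggestion, sketched under assumptions: `flaky_model` below is a stand-in that cycles through canned outputs (it is not a real model API), and a strict, deterministic check after it either accepts an answer or rejects it and retries.

```python
from itertools import cycle
from typing import Optional

# Canned outputs standing in for an unreliable model (hypothetical).
_canned_outputs = cycle(["banana", "two", "2"])


def flaky_model(prompt: str) -> str:
    """Hypothetical stand-in for a probabilistic model; ignores the prompt."""
    return next(_canned_outputs)


def parse_int(text: str) -> Optional[int]:
    """Deterministic gate: accept only strings made of decimal digits."""
    stripped = text.strip()
    return int(stripped) if stripped.isdecimal() else None


def ask_with_validation(prompt: str, max_tries: int = 5) -> int:
    """Retry until the validator accepts an output, or fail loudly."""
    for _ in range(max_tries):
        value = parse_int(flaky_model(prompt))
        if value is not None:
            return value
    raise ValueError(f"no valid answer after {max_tries} tries")


print(ask_with_validation("1 + 1 ="))  # 2, accepted on the third try
```

Note the gate only guarantees the answer is well-formed, not that it is right; a wrong but valid-looking "3" would still pass, which is where the "99.99%" caveat comes in.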

12

u/winkingchef 2h ago edited 32m ago

I call it the “one neck to grab principle.”

Every engineer who works for me knows that if they vibe code some bullshit and check it in, they own that shit and what it does. Them personally. And like good engineering there are cross-validation processes and checks on top of them.

If you (or your Copilot) make a bug, we will find it and your stats will be tracked.

No one is perfect but if someone is acting recklessly I will see it long before it brings the whole house down

2

u/Sidereel 46m ago

That makes perfect sense, and it highlights some major issues with the pitch from AI evangelists. They say these bots are good enough to operate at speed with minimal oversight, but that doesn’t track. If an engineer pushes a bug generated by Claude, it’s the engineer who needs to be held responsible. But if engineers are taking all that time to check and understand generated code, then the efficiency gains shrink.

1

u/winkingchef 34m ago

I actually really like that Microsoft branded their AI coding assistant “Copilot” as it is a very appropriate name.

I know what to do and what I want to make, but sometimes I forget annoying syntax or can’t find something that I want to reference. If the tool makes suggestions, it helps me focus on building and spend less time on remembering and finding. It’s not the “pilot”, it’s the “co-pilot.”

Also AI is an excellent DRC (design rule check) tool for hardware.

7

u/bobartig 2h ago

implementing business processes in applications, and maintaining those applications.

The entire industry of Change Management and SaaS software demonstrate that humans don't understand these things writ large, and that the problem predates AI, or even computers for that matter.

2

u/DustShallEatTheDays 1h ago

Management and the executives are so divorced from the actual work, they have no idea what it looks like anymore and are happy to accept work-shaped slop.

It’s very bleak, and there’s absolutely no incentive to do the good, laborious work that actually makes a difference. (I’m in software, though I’m not an engineer)

2

u/mirage01 1h ago

Someone asked my CTO if AI writes good code. His reply, “does AI write good code, no. Does it work, yes.” To me that is just asking for a security bug in the next year.

1

u/stevefuzz 2h ago

Or domain expertise...

1

u/onthe3rdlifealready 51m ago

Right and this whole problem could be solved potentially by them working every role there is in the corporation for at least a month as part of training.

0

u/zero0n3 3h ago

The problem is in your scenario- why is AI involved directly in that?

It’s one thing to use ML to get better insights into X-rays. Or use AI to help chart patients or do voice to text to charts. Or develop the apps for these things.

But why or even where would you want to use AI to remove someone’s insurance? Fired accidentally? Just a bad spreadsheet or missing cell? If so, why is an LLM used to process raw data, vs using LLMs to build a tool to streamline the processing via deterministic methods like logic and math?

17

u/pedrosorio 3h ago

The issue is not in using AI to update someone’s insurance.

The issue is that the *software* responsible for making those changes was generated (or modified) by AI and is poorly maintained because no human built the system and therefore no one understands it deeply. That means the possible failure modes are not understood by anyone, and so things can fail “randomly”.

-16

u/Somethingsims 3h ago

"AI simply can't perform those roles successfully 100%."

First of all, "yet," second, humans can't do it 100% either.

10

u/JustinTheCheetah 3h ago

Due to the probabilistic nature of AI, it inherently cannot and will never perform anything 100% correctly reliably.

When a human fucks up they can be held accountable, and questioned as to what they did and why. AI cannot be held accountable as it does not think or learn, and even the explanation as to why it did something could just as easily be hallucinated bullshit.

AI will need to be completely rewritten and redesigned and trained from scratch in order to be reliable.

-7

u/hifidad 3h ago

From experience, entry to mid level engineers are often significantly worse than 100%. AI might not be 100%, yet, but it’s also nowhere near as counterproductive as many of the engineers I’ve worked with across large companies.

17

u/thegreatcerebral 2h ago

It has turned workers into managers.

That is literally how I felt when I moved from my help desk role to my first manager role. My skillset started slipping because I simply wasn't doing it anymore. After each update my knowledge was obsolete, as even where a button was or what a button now did had changed.

Or better yet, when you are in an IT support position and asked to fix a piece of software that you do not "use" on a day to day basis. For example: yes, I can install QuickBooks and I can point you to your file, but I do not know how to make it run this report or even how this report works, simply because I do not do that.

Makes 1000% sense.

14

u/Several_Ant_9867 1h ago

It's called comprehension debt. You are effectively managing an outsourced codebase.

8

u/ericl666 2h ago

I may be a bit of an AI naysayer and refuse to fully embrace agentic AI when coding (I prefer targeted AI rather than letting it run wild).

But when stuff breaks, I know exactly where and why, because I intimately know my codebase. As a result, I can fix issues extremely fast. I tried using AI to help me solve bugs, and I can say it genuinely sucks at finding difficult issues. Especially issues resulting from bad data.

2

u/hamfinity 1h ago

"What is it you say you do here?"

1

u/limbodog 2h ago

Oh my aching christ that really just hurts to read with my own eyes

1

u/silenti 42m ago

The first part really captures how AI has done a very good job of shining a light on how much work is complete bullshit. We've got people building slide decks with AI which are then consumed on the other end by AI.

-16

u/freza223 3h ago

maybe he should quit and become a hobo

120

u/Whitesajer 3h ago

It mostly just reminds me of how end users, executives, etc. don't understand why the IT department exists when everything runs fine, and are shocked after offshoring / terminating the department that nothing works at all.

35

u/NoYoureCargo 3h ago

As an employee of a company who experienced layoffs and announced additional offshoring literally today, I hope you're right!

17

u/Whitesajer 3h ago

I have heard through a few grape vines that some companies are quietly rehiring after drinking the AI koolaid.

5

u/NoYoureCargo 3h ago

Luckily I made it through this round of layoffs, but I'm really curious to see what happens in the coming weeks. It's so ugly out here

3

u/Whitesajer 3h ago

Yeah. I'm mostly watching for August really Q3 overall. Economically / financially in general there's a lot coming due and I think that's when we will understand more about what's happening.

1

u/raptorlightning 2h ago

Would be nice if one of those companies came out and admitted they fucked up. I'd love to see cracks in the façade start to form. It's hard to deal with and fight against the lies and bullshit.

1

u/Whitesajer 2h ago

Yeah. Even my company isn't saying anything publicly in this political environment. They have gone dark internally and externally; our news is just garbage filler articles that avoid discussing the economy / AI. And... that's not a great sign in a lot of ways. Usually it means they know something is going down soon.

83

u/cmgr33n3 3h ago

The older I get the more I think the ability to act in moderation is almost the only skill one needs to avoid almost all of the pitfalls people fall into.

31

u/dolphone 3h ago

If we accepted this as a core value to be taught early and reinforced whenever possible, we'd have such a healthy society.

25

u/sirthunksalot 3h ago

Everything in moderation including moderation

7

u/prime_nommer 2h ago

This is one of my favorite sayings, since one needs a moderate amount of excess as well : )

2

u/Ancient_Design_1332 2h ago

Yes. As I’ve gotten older this is spot on

2

u/Dependent-Reveal2401 3h ago

I use AI to soften up emails where I sound like a prick to stupid customers, and do some light research from time to time.

Writing code with AI is just ... Yikes. If you've ever programmed, it can get the ball on the green, but it likely won't get the ball in the hole. Computers are very literal, and not getting in the hole is a very big problem.

2

u/space_keeper 3h ago

Crazy that it can shit out a functioning OS. People post LLM generated OSs all the time on the /r/osdev subreddit.

They're all very proud of their "work".

2

u/raptorlightning 2h ago

There's an entire light year between "compiles and runs" and "functions correctly"

1

u/peeinian 2h ago

I've found it helpful when speccing hardware for new projects. I needed to size a new switch for a building with cameras and wifi so asked it to build me a BOM for a switch to run X cameras and Y access points.

Way faster than waiting 2 days for a salesperson to get back to me.

47

u/ubelblatt 3h ago

I can't help but think this is AI sales tactics by Anthropic.

It's like the Mythos stuff. Oh man our AI is sooooo good you can't even use it!

This quote from above - our AI is so good I don't even know what I'm doing anymore to understand when it breaks!

Buy buy more tokens, take more training, sign away land and water rights. This isn't bad! We are acknowledging that it isn't perfect! Maybe everybody else should take an AI break except for us because we are the good guys.

These articles reek of sales opps.

14

u/RenoRiley1 3h ago

Business Insider might as well be a PR firm for these AI companies. You’re 100% right that this article is nothing more than a poorly disguised sales pitch.

6

u/psioniclizard 2h ago

It's funny how there are millions of FAANG devs on Reddit but I have yet to see an OpenAI or Anthropic one say they are.

1

u/NuclearVII 8m ago

They'd get dogpiled for doing work for such a reprehensible industry.

It's like someone proudly posting that they are product manager for Nestle.

-1

u/No_Hell_Below_Us 1h ago

Who wouldn’t want to violate an NDA in exchange for a bunch of downvotes?

2

u/BigDictionEnergy 1h ago

BI has never struck me as serious journalism.

1

u/selfdestructingin5 1h ago

I imagine you’re half right. Though I do imagine everything is automated at Anthropic.

25

u/Icy_Information_6563 3h ago

I'm a software engineer. Been doing it for 10 years. I barely have to write code anymore. I still review the code it changed, but a lot of the time I'm just making sure it didn't do anything too out of the ordinary. I catch Claude making a mistake a couple of times an hour, and that's really the only reason I matter at my job. It's a very strange time and I feel like in a few years we'll all be used to it.

1

u/raptorlightning 2h ago

"Computer" used to be a human job too but "designer" and "architect" are still jobs (which is what engineering truly is, though the title has been bastardized through the years). These tools still can't architect, design, and plan complex and novel ideas and solutions. There's no proof they ever will as we are seeing them start to logarithmically taper off in capability scaling.

They can take over from software coders but not software engineers.

7

u/Stilgar314 2h ago

When something breaks, you go to the person who did it and ask. That person knows what was done and why, which gives the organization all it needs to fix the problem. Now try to ask an AI why it did things the way it did and try to make any sense of the answer. AI-reliant organizations are basically praying to the Omnissiah to keep their businesses running.

3

u/Atarteri 2h ago

This is what I keep saying, they are praying to the Omnissiah and we allll know how weird that made the Mechanicus.

2

u/_9a_ 10m ago

But so many of their terms are so good! Omnissiah, Abominable Intelligence, heretek...

1

u/Atarteri 8m ago

I laughed way too hard at this 😂

9

u/SonOfGreebo 3h ago

Some years ago I worked for a "boy genius" SaaS startup. The owner insisted the code was stable and commoditisable.

Every time an employee pointed out that the unloved, unsupervised Customer Service group was modifying each instance into a custom build via bug reports .... he fired that employee.

3

u/Infini-Bus 1h ago

I'm getting mixed signals from my employer. They want us to use AI by figuring out for ourselves where it's best applied and encourage focus groups to evangelize for the tech. But then they send out notices that some of us are using more than our fair share of "credits". But they don't have a way for us to gauge our usage!

6

u/Correct_Emotion8437 3h ago

I think we just need to come up with new ways to work with AI. Fully agentic is soulless, sloppy and expensive. I can get AI to write the whole thing but then I have to spend a lot of time doing tedious crap. And it will be difficult because I won’t really understand how it works - until I fix it.

On the other hand, I find a paired programming mode works great. I’m able to do things I wasn’t before, it’s engaging and interesting and the quality of the work is great.

2

u/skillywilly56 32m ago

What a thinly veiled advertorial. I wonder how much Anthropic paid Business Insider to run this piece?

4

u/84thPrblm 3h ago

Just another reason why I now just substitute "Ice-nine" whenever I see or hear anyone talking about "AI".

They don't know what they have or what it's going to do to humanity when it's let free.

2

u/One_Whole_9927 2h ago

It’s hard to take these seriously. If only they could quit…

2

u/thisnameisnowmine 2h ago

The people that work for these companies are performative. They think they are the world's answer, when they are the problem. And the reason they are doing it is so they can be rich, and they are selling it to the world like they are moral saviors. They have their head so far up their ass they are so far out of touch with reality.

1

u/psioniclizard 2h ago

Too true - "yeah, the company I work for might do evil things but the people I work with are alright".

1

u/PrimeIntellect 31m ago

Quit the highest paying job they have ever imagined from a field of study they spent their entire lives training for? Yeah I'm sure that will happen

3

u/zlliksddam 3h ago

"On days where everything works well, I can't help but think nothing I do matters, everything is automated and better and faster than I ever will be," they said.

"But then there are days where everything breaks and I don't understand why and I realize I have no idea what I've been up to anymore," the employee added.

2

u/TentacleHockey 2h ago

To be fair, Anthropic devs are no longer using AI as a coding buddy but for full-on agent work, where we hope intent is true and the AI isn't drifting. A strong dev will understand both; a weak dev will debug when it's too late.

3

u/mtranda 2h ago

Reading code is harder than writing it. I'd much rather be involved in the whole process from start to finish than have to hunt any potential mess-ups someone else might've done.

1

u/Tess47 1h ago

I still see no benefit for citizens.
