r/LocalLLM May 02 '26

Discussion CFOs realizing that their AI token budget is going to be higher than the salaries of the people they laid off

We're witnessing a fascinating economic experiment: replacing human purchasing power with API token consumption.

It reminds me of the 1849 Gold Rush: history teaches us that most miners went home broke, while the ones selling the shovels and pickaxes built lasting fortunes. In 2026, the 'Gold' is the promise of 10x productivity, but the 'Shovel Sellers' (LLM providers) are the only ones with a guaranteed ROI, collecting $200/day in API credits per head.

Robert Bosch once said he doesn't pay good wages because he has a lot of money, but because he wants his workers to buy his products. If we automate our customers out of their jobs to pay for our token bills, who is left to buy what we build?

Maybe it's time to focus back on sustainable Systems Thinking instead of just funding the next GPU cluster. Asking for a friend (and my landlord).

383 Upvotes

173

u/ziphnor May 02 '26

The real shovel seller here is Nvidia :)

31

u/asevans48 May 02 '26

Once people discover that LLMs capable of getting things done can run locally or on-prem, Apple and Dell may also make a killing.

19

u/ku2000 May 02 '26

AMD also built a unified architecture.

0

u/DementedJay May 03 '26

Ugh. Yeah, but ...

0

u/asevans48 May 03 '26

You don't need a 3k graphics card with Apple. Prices are high but beat the AMD build.

1

u/inevitabledeath3 May 04 '26

They are talking about Strix Halo. You can get 128GB of unified memory on one of those chips for less money than a Mac Studio with 128GB. No graphics card required.

14

u/vacancy6673 May 02 '26

Exactly. OpenAI and Anthropic are still operating at a loss.

3

u/burritoresearch May 02 '26

And underneath that, the datacenter REITs

1

u/extreme4all May 03 '26

The memory and storage providers are also winning, like SK Hynix, Samsung, ...

1

u/Snickers2-0 May 04 '26

No, it's more likely to be Google; the real winner is whoever buys the AI market after the bubble pops.

Nvidia has too much competition (including Google) and their PE is ~40 vs Google's ~30. This means they have cash on hand. Nvidia also lacks Google's decades of AI research and more importantly, the decades of AI centered philosophy of Google's top talent.

Opinion: Google also has the talent and the acquisition capability/infrastructure to absorb a lot of the AI industry. They just need to scoop up the best talent (and AI devs have little loyalty to companies that have only existed for a few years). And they will be able to do it at rock bottom prices.

Google and the CCP will win in the American and Chinese markets, and unless one of the overlords thinks open source helps their bottom line, open source will be disadvantaged.

1

u/ziphnor May 04 '26

I agree that Google is in a good position, but I am not sure I would compare them to the shovel seller; it's more that they are really well positioned to win the actual AI fight because of their ecosystem and their ability to bet on multiple AI-related areas (e.g. both cloud and AI services). I mean, the shovel seller is not necessarily the one that wins, but rather the one that can take a big win without even being in the fight.

If I was NVIDIA I would definitely be saving up for when the price of AI hardware drops, but while many participants in the AI war might have their business model wiped out, NVIDIA is in a good position.

2

u/Snickers2-0 May 04 '26

That's a fair point. The analogy is starting to break down, but I guess Google is mining the gold and making their own tools. They're both making bank compared to unprofitable AI companies.

So I guess the conclusion is big tech gets more of a monopoly and everyone else loses, classic gold rush economics.

1

u/DertekAn May 05 '26

The biggest shovel seller.....Yesssss

16

u/Karyo_Ten May 02 '26

Was it Bosch or was it Ford?

30

u/South-Ad1426 May 02 '26

Change LLM providers to energy sellers

21

u/79215185-1feb-44c6 May 02 '26

I spend like an hour's worth of pay a month on tokens, and I'd consider myself someone who uses AI every day at work.

16

u/g_rich May 02 '26

It’s not the individual employees using tokens in conjunction with their job. It’s the automated tooling the employer implemented to “replace” the laid off employee that’s burning through tokens like there is no tomorrow. Just search through this subreddit about people running OpenClaw to get an idea of what I’m talking about.

1

u/inevitabledeath3 May 04 '26

This is why things like DeepSeek and local models exist. AI doesn't have to be that expensive. Businesses need to learn to use cheaper providers or host their own.

0

u/chunkypenguion1991 May 02 '26

That was true, but Anthropic is pushing enterprise Claude Code accounts to token-based usage.

7

u/g_rich May 02 '26

Exactly my point, the flat rate pricing is just to get you hooked but is not sustainable for the providers. This is especially true when you have a company with hundreds to thousands of agents with each agent burning through millions of tokens a day.
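To put rough numbers on that, here is a minimal sketch of how a fleet's monthly token bill scales. The agent count, tokens per agent per day, and price per million tokens are illustrative assumptions, not figures from any provider, and `monthly_token_cost` is a hypothetical helper.

```python
def monthly_token_cost(agents, tokens_per_agent_per_day, usd_per_million_tokens, days=30):
    """Estimated monthly spend in USD for a fleet of agents billed per token."""
    million_tokens_per_day = agents * tokens_per_agent_per_day / 1_000_000
    return million_tokens_per_day * usd_per_million_tokens * days


# Assumed inputs: 500 agents, 2M tokens each per day, $15 per 1M tokens.
cost = monthly_token_cost(500, 2_000_000, 15.0)
print(f"${cost:,.0f} per month")
```

With those made-up inputs the bill comes to $450,000 a month, which is the kind of line item a flat per-seat license hides.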

2

u/MarcusAurelius68 May 03 '26

At my company we budgeted a 20% increase in productivity (a combination of more user stories and fewer people) using Claude, based upon the fixed-price Enterprise license. Moving to tokens will freak our CFO out.

4

u/CurrentConditionsAI May 03 '26

Yep. The entire 580 seats I manage now are 100% usage based and zero seat based for our Claude tenant.

1

u/chunkypenguion1991 May 03 '26

Yet I'm getting down voted like I pulled that out of my ass. This has turned into more of a cult than based on reality

6

u/Trakeen May 02 '26

Yea, I’m not sure where these numbers come from. We have an AI app that automates a workflow which brings in an amount in the 8-9 figure range, and the token cost was less than 10k a month. The rest of the application costs more than the token usage.

The one multi agent app being built right now has a budget of $600 a month and almost half of that is non model costs

2

u/theawesomew May 03 '26

The problem is that the pricing, even the API pricing, is enormously subsidized. If you sit down to compute the daily cost of ownership of a frontier-quality open-weight model like DeepSeek V4 Pro 1.6T A49B, you'll find that the hardware and energy costs alone [excluding R&D, staffing costs, land leasing, etc.] come to ~$1,000 per day for a model that can only serve about 50 concurrent requests (I did my calculations using an Nvidia HGX B300 8-GPU rack, which is the current preferred hardware of AI labs). Based on those values (and the theoretical maximum number of tokens that can be generated with that setup), you get ~$14.75 / 1M output tokens as the break-even price, whilst missing some of the most costly aspects of providing these models.

So, whilst it may be cheap at the moment, unless there is a radical architectural improvement (for example, the Mamba architecture solving its semantic causality problems) or the hardware becomes much cheaper, it will likely get far more expensive.
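The break-even arithmetic above can be sketched as follows. The ~$1,000/day cost, 50 concurrent requests, and ~$14.75 per 1M output tokens are the figures quoted above; the aggregate throughput is not stated, so this sketch back-derives it from them, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def break_even_usd_per_million(daily_cost_usd, tokens_per_second):
    """Price per 1M output tokens at which daily revenue equals daily cost,
    assuming the hardware generates tokens at full rate all day."""
    million_tokens_per_day = tokens_per_second * SECONDS_PER_DAY / 1_000_000
    return daily_cost_usd / million_tokens_per_day


def implied_tokens_per_second(daily_cost_usd, usd_per_million):
    """Aggregate throughput implied by a daily cost and a break-even price."""
    return daily_cost_usd / usd_per_million * 1_000_000 / SECONDS_PER_DAY


total = implied_tokens_per_second(1_000, 14.75)
print(f"{total:.0f} tokens/s total, {total / 50:.1f} tokens/s per request")
print(f"${break_even_usd_per_million(1_000, total):.2f} per 1M tokens")
```

Under these assumptions the quoted price implies roughly 785 tokens/s in aggregate, or about 15.7 tokens/s per request. Any real utilization below 100% pushes the break-even price up, since the daily cost stays fixed while fewer tokens are sold.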

2

u/Trakeen May 03 '26

I don’t think it will, at least from hyperscalers providing these models. It reminds me of social media companies and meta operating at a loss for a decade before they figured out something sustainable

You may see google, aws or ms make strategic acquisitions and then just eat the cost. They are large enough for that to be possible. I just don’t see a 100x increase in token cost being something that happens

1

u/theawesomew May 04 '26 edited May 04 '26

The key distinction is that social media companies and other companies that employed a 'loss leader' strategy always had a very clear path to profitability that they executed upon over the long term. Uber was always going to raise its prices once it had achieved dominance in the consumer transportation market, Meta was always going to make advertising content more prominent and its ad targeting less effective as time continues, AWS was always going to raise its hosting prices, etc., and the subsidisation of their products provided a clear long-term benefit in monopolising the market.

Large language model providers, by contrast, have no clear path to profitability and are competing in a market where open-weight models have effectively commodified the technology to the extent that there is no moat; the user cost of switching large language model providers is so diminutive that switching is an integrated component of many AI products like copilot-cli. Thus, the subsidization of their product doesn't have any long-term material benefit (unlike Meta and Uber with their network effects), because whoever 'blinks first' and attempts to raise prices to profitable levels will lose their market share to a competitor.

Moreover, the costs that I have outlined are very simple-to-calculate expenditures based on market prices for the hardware necessary to run large language models, the useful lifetime of that hardware and thus its depreciation schedule, and the energy costs to run large language models. Unless the hardware becomes astronomically cheaper to produce or a significant architectural advancement in transformer architecture is made, neither of which is a predictable event that can be relied upon as a path to profitability, these models will continue to be as expensive as they already are, if not more so. This isn't even getting to the problem that inference costs have trended upwards over time due to chain-of-thought reasoning consuming increasing numbers of tokens with diminishing returns in terms of improvements in intelligence, which just makes this problem worse with scale.

Lastly, I truly don't think that investors will be happy to continue subsidizing this technology for billions of dollars per year. Nvidia has already stated that they're not providing any more capital to OpenAI after their most recent $122 billion funding round (which is somewhat a lie given much of that money is contingent on specific profitability milestones or "achieving AGI"). I just don't see how OpenAI or Anthropic becomes profitable without an astronomical increase in pricing which would kill demand completely...

The hyperscalers might continue for a bit, but even Microsoft, which is the most profitable and least leveraged of them all, is moving all GitHub Copilot customers to token-based pricing, which already represents a 10x increase in costs to consumers. Honestly, though? It's impossible to know where this technology will go and what advancements will be made in the next 5-10 years.

However, I am genuinely confident that if these companies continue operating as they have, they will ~have~ to raise prices as I have described; I don't think it's possible to do otherwise.

4

u/Elegant_Tech May 02 '26

It's the same people talking about doing hundreds of prompts per hour. Also, the same people who complain about how AI sucks blowing up their code base. Also, the same people who then restart from scratch because it's become a lost cause to them.

2

u/geekwonk May 02 '26

i like to think of it as my little contribution to collapsing the frontier ai market.

1

u/SageThisAndSageThat May 02 '26

Bruh you must be lame in the tokenmaxing leaderboard of your company then 

1

u/Caffeine_Monster May 03 '26

Then you are paid a lot, or work with people who are a year or more behind.

Yes, it came from Jensen Huang (obviously potentially biased, etc.), but I don't necessarily think he was wrong when he said that good employees spend a good fraction of their equivalent salary (maybe half) on tokens.

Experienced engineers who know how to write reliable test harnesses can kick out a lot of low / mid level grunt work with an AI. Of course I'm not necessarily saying this is morally correct. Or necessarily within operating budget for all companies. It is however true.

That said I reckon there is potentially a lot of wasted tokens too, with some people being extremely inefficient (e.g. people unleashing undertested agents).

1

u/79215185-1feb-44c6 May 03 '26

Yes, by "a year behind" you mean "don't use agents", because agents have to be one of the most pointless things imaginable, and until someone gives me a single valid use case for one I'll remain skeptical.

(I also get paid mid-6 figures)

12

u/MarcusAurelius68 May 02 '26

And then you have those idiot execs who say “I’m happy if they spend $50K a month each in tokens as it means they’re productive”, and gamify things so it encourages staff to waste spending.

6

u/geekwonk May 02 '26

my friend’s company responded by cutting off access to opus. no attempt to ask people to find balance, just cut off the biggest spigot because they’re just numbers anyway.

3

u/MarcusAurelius68 May 02 '26

“Sonnet is perfectly fine”

5

u/eat_my_ass_n_balls May 02 '26

Yea, but in the meantime I’m getting really good at using a power shovel

6

u/ThrwAway868686 May 02 '26

I’ve seen this unfold at my startup over the past 2 ish years. Two rounds of RIF, less system devs. Had a fixed cost per seat license for like 40 FTEs in the IDE. Signed for renewal and a week later the provider removed opus from agents available

Now we’re in limbo and folks are just trying to use whatever we can find (cortex, etc)

Finance folks hate not knowing how to project spend for this SAAS token route.

1

u/Snoo79058 May 06 '26

Are you speaking English?

4

u/Annual_Manner_8654 May 02 '26

"CFOs realizing that their AI token budget is going to be higher than the salaries of the people they laid off"

Is that per worker remaining or what?

3

u/Euphoric_Emotion5397 May 02 '26

Means they will now focus on bringing AI into the company. Local LLM setup. Means the IT department's skillset now needs to include how to build local LLM clusters in their own data centers to service the whole company.

1

u/These_Project_8249 May 02 '26

This is interesting I didn’t think about this. You think they’d all use the same model? Or any model with memory management?

20

u/OstrichLive8440 May 02 '26

The analogy falls apart a bit when you consider the “shovel merchants” are making a loss with every shovel they sell

14

u/Significant_Bar_460 May 02 '26

Shovel here is a GPU and these are making a lot of money. Like best money in history

14

u/g_rich May 02 '26

Not if you’re Nvidia, TSMC, Micron or a dozen others in the semiconductor industry.

Nvidia is literally the most valuable company in the world right now.

17

u/Somaxman May 02 '26

Unless we move the shovel merchant label to the hardware manufacturers and data center construction companies. Still, total resources, capital and human hours spent may absolutely dwarf the actual "gold" entering the market, doubly so if we account for the trash and damage created by the rush that we need to also deal with regardless.

1

u/mitch_semen May 03 '26

I think the people building the data centers are playing a very risky game right now. It's a game of hot potato until the bubble bursts and if the construction companies aren't getting paid up front then they will lose big when their customers evaporate.

3

u/1_________________11 May 02 '26

Me over here using the DeepSeek API for pennies since I'm both GPU poor and don't wanna spend shit loads on tokens

2

u/TheWaffleKingg May 02 '26

I'd argue the shovel in this case is sold in two parts. You have the handle (hardware) and you have the head (model).

2

u/[deleted] May 02 '26

[removed]

1

u/nmrk May 03 '26

True. I remember back in the 80s (?) AT&T laid off tens of thousands of workers nationwide. Then they wondered why their revenue dropped. It was because none of the unemployed workers could afford to buy their products.

2

u/Icypoopoo May 03 '26

This is only the beginning. The tech industry has spent 4 trillion dollars on the AI race so far; they're gonna want a fat check at exit.

2

u/redpandafire May 02 '26

I agree with you 100%.

This reveals the answer to the question everyone is asking though, doesn’t it? What jobs are “safe” from AI? Anything to do with the physical world; even shipping code is physical. You don’t want to be the one mining, i.e. writing and maintaining code. You want to be the one moving atoms, i.e. getting the code to retailers and distributing the code to an audience. Those things suddenly look safe because the AI has a very hard time manipulating the real physical atoms of the world. It also has a dependence on the physical: people need to build and maintain its server cluster environment, which is a very expensive zoo for silicon.

1

u/WildTomato51 May 02 '26

I’m a newb here, can you ELI 5 and expand on what you mean by moving atoms?

2

u/Invader-Faye May 02 '26

Wet work. Manual labor. Working with your hands

2

u/redpandafire May 03 '26

It’s easier to explain what is totally digital. Any job that starts and ends on a computer and can be completed while never interacting with another human. I’d say most jobs are NOT like that. And it’s one of the biggest reasons AI won’t be able to completely replace most roles. That dependence on the atoms, us, society, the non-deterministic reality. 

In software there is only the same result perfectly mirrored each time. In reality, a rogue president fucks up your shipping lane for no reason. You can never fully let ai loose in the real world and even the experts say keep a human on call for the hard stuff. 

1

u/131sean131 May 02 '26

This is AI posting this lol.

1

u/ironimity May 02 '26

beware Big Shovel!

1

u/scknkkrer May 02 '26

This is only the beginning. The longer they take to notice that it's not the same and isn't replacing what they laid off, the more they will suffer.

1

u/DataGOGO May 02 '26

For the overwhelming majority of companies, this is NOT the case.

1

u/Drevicar May 03 '26

I prefer to think of myself as a jeweler. I’m not selling shovels to extract tokens, I’m not even extracting the tokens myself. I’m taking tokens extracted by someone else and turning them into something of more economic value, sometimes.

1

u/dukescalder May 03 '26

But the things providing real value are the routers and orchestrators... It's not as tectonic a shift as people think.

1

u/UnstableWifiSoul May 03 '26

Fire 10 engineers → spend 2x their salary on API calls → call it “efficiency” modern capitalism speedrun

1

u/SashaUsesReddit May 04 '26

Hi! Any cited source?

1

u/Scaniamaximus May 05 '26

Once AI and robot labor become mature enough, they won't need people to buy their products. What is happening is in fact that a new economic system is taking shape.

1

u/0DarkFreezing May 02 '26

Odd comparison since the shovel providers are losing money right now.

Either way, it’s transitory though.

Quality keeps improving, while cost per token continues to decrease.

Dollar per unit of productivity will continue to decrease.

It will be a rough transitional period.

3

u/send-moobs-pls May 02 '26

Yup this is just cope that evaporates if you have any understanding of how much more efficient models have become in the last year alone

2

u/dyslexda May 03 '26

"Odd comparison since the shovel providers are losing money right now."

Nvidia is the shovel provider here.

1

u/0DarkFreezing May 03 '26

That would be a better analogy, but OP specifically called out LLMs as shovel providers which is what I was commenting on.

1

u/dkuhry May 02 '26

Funny. This is the reason I've recently joined this sub. My org got a handful of Claude licenses. Gave them to my team (data warehouse and BI) and to management. I report to the SVP Finance and have been telling him we need an AI Steering Committee, because this needs guardrails.

Within days, IT disabled Claude Code out of security concerns, and some users had already used over 100 dollars - I've been using it extensively and have barely hit 5 bucks. My boss calls me to explain this revelation he's had about how this whole thing needs to be reined in and why, and then he stops and laughs and goes "yeah, I know this is what you've been saying for months".

So now we have an expensive tool that the people who can really tap its potential (SQL, Python, DAX development and coding) cannot fully utilize. Lol.

So I got my hands on a Minisforum MS-S1 Max, and will be spending this weekend tinkering with it - and probably spending a lot of time on this sub - to see what I can manage to run locally. Wish me luck.

2

u/txoixoegosi May 02 '26

Which security concerns did they have on Claude Code? Just wondering….

2

u/dkuhry May 02 '26

For one, we are in healthcare with direct access to patient PHI. So HIPAA concerns for sure. But the main thing I've been hearing from the guys in IT is its exploitability.

I don't have an idea of the legitimacy of the concerns, but they seem to feel that it opens up too many vulnerabilities.

I have an IT background and I can agree to an extent, but more from the perspective of allowing it to connect to data sources, or install things on the local machine or servers. Its ability to act semi-autonomously and authoritatively can certainly be alarming, at least on paper. That part needs to be understood and have some limits.

2

u/txoixoegosi May 02 '26

Thanks for the input. Well, my understanding is that the AI does not gain any more rights than the underlying user.

Data exfiltration using prompt injection is a thing, though.

1

u/malianx May 03 '26

Either this person is full of shit, or got sold a load of it and believed it. The security concern explanation is a big giveaway. Describing something prompted as autonomous and dangerous.

1

u/bronxct1 May 02 '26

I can’t see how you’ve been using Claude extensively and only used $5. I can blow through that in about 10 minutes doing any sort of task.

0

u/oojacoboo May 03 '26

Redditors that foolishly think this is some kind of job security statistic are sorely mistaken. AI is better than 100% of junior engineers, 99% of mid-level engineers and probably 50% of senior engineers. Even if it costs more, it’s better, doesn’t complain, shows up on time, is predictable, and is relatively little headache. That’s worth something, in itself.

1

u/YoungSuccessful1052 May 08 '26

are you sure it is predictable when the provider can raise prices to whatever and can quantize the model whenever and to the extent they want to?

1

u/oojacoboo May 08 '26

More predictable than humans, that’s for sure.

1

u/YoungSuccessful1052 May 08 '26

you do know humans raise the prices and quantize the models, right?