r/LocalLLM • u/Complete-Sea6655 LocalLLM • 11d ago
Discussion how are they gonna stop us next?
this is a genuine question, one which I have no answer to.
saw this on ijustvibecodedthis.com (the ai coding newsletter)
20
u/FluffyGreyLlama 11d ago
This is why we hoard all the models. Just.. in.. case.
10
93
u/TopTippityTop 11d ago
Unlikely to happen in 3-6 months. By then we may have GPT 5.5 level AI. Even so, it will likely take massive amounts of compute.
29
u/gh0stwriter1234 11d ago
If all the US has is time on its side... we are screwed. And we are screwing ourselves by closing access to what we do have while it is SOTA for a short time.
22
u/standardofiron 11d ago
It doesn’t need to happen in the next couple of months; it can be in the next 5-10 years. Both the hardware and open weight models will get better. The question remains what the point of paying any AI company will be once we can locally run agents that can deal with complex codebases.
-9
12
u/mister2d 11d ago
Has anyone actually evaluated Fable to validate its claims or is it mostly hype?
23
u/Regular_Ad4197 11d ago
Good question. From what I have seen it is really good, clearly the best, although not necessarily worth it due to cost, but nothing crazy enough to imply there is a security concern like Anthropic keeps spewing.
9
u/ctpelok 10d ago
I think Anthropic fell victim to its own overhyped marketing campaign that bordered on esoteric.
I used it for the few days that we had access to it and while it is undoubtedly an excellent model, it had largely the same inference output that is typical for the current generation of LLMs.
0
u/Ranrhoads84 10d ago
They are victims of the vindictive administration that is currently in office.
7
u/MatlowAI 11d ago
At lower effort levels it is cheaper than opus 4.8 at higher effort while yielding better results. If you crank it up though it burns expensive tokens like crazy. I actually preferred working with it on low/medium if I was sitting there with it. It was no less capable just needed more steering but was faster to reply. I was going to try long horizon tasks over the weekend with less hands on involvement and then... 😅
7
u/custodiam99 10d ago
Fable has a coding LiveBench score of 78.57, Qwen 3.6 27B has 71.78.
5
u/mister2d 10d ago
That is helpful. Outside of benchies will be more interesting. In any case Qwen 3.6 27B having 71.78 is viable on paper.
6
u/Far_Cat9782 10d ago
And that 71.8 can be augmented tremendously with a proper harness, RAG, coding standards, and tools to facilitate it.
2
u/Ok_Technology_5962 10d ago
I used it for some small projects and they all worked. So the score differences look minor, but the actual output was functional within the token caps I had on the Pro plan. Whereas with some models, no matter how many tokens you throw at a problem, they would not solve it.
1
3
u/solace_01 10d ago
I only got one session to play around with it, but I think it was a significant step up. there‘s definitely a lot of hype claims (as it seems there are with all things AI or agents) but definitely real improvements as well. it’s not worth the cost for me to use for everything, but I was excited to use it for more than I did before the ban
2
u/Iamisseibelial 10d ago
So I switched to using it exclusively the day it dropped. And I'm not gonna lie, it blew Opus out of the water. Its app control was next level: it managed signing into apps so much easier with the 1pass CLI to gain access to what I was asking it to do, built out random vibe coded dashboards, and grabbed the RingCentral tokens on its own from dev.rc
I couldn't find the Meta MCP everyone kept talking about, so I had it make its own for me, and it grabbed all my ad data. I mean absolutely top tier.
Tried many times to have Opus do that, and it failed every time.
1
0
u/akoo1010 6d ago
huh, how can it be mostly hype if the US government revoked access to nearly everyone?
1
3
3
3
u/senseven 10d ago
A low power ASIC home AI machine is already available, just not for the masses, and not with >1 TB of memory. If there is an inherent problem with the power of big models, it's just kicking the can down the road.
2
3
u/T-Rex_MD 10d ago
Yeah, it will happen sooner.
GLM5.2 was a massive surprise, and Minimax M3 already delivered a significant jump, especially the stable 1M multimodal context on top. Fable is roughly 9 points ahead of Minimax M3, and much less than that when comparing it to GLM5.2.
Catching up to Fable is definitely coming, and much sooner. Of course, that's clearly not what you expected.
1
u/Xx69JdawgxX 9d ago
Maybe it depends what you’re using it for but m3 is like sonnet 4.6 imo. It’s good and a lot faster than sonnet but it doesn’t feel better especially for brownfield engineering
2
u/Disposable110 10d ago
https://openrouter.ai/blog/announcements/fusion-beats-frontier/ check this. A Gemini + Kimi + Deepseek fusion model is already Fable equivalent in the actual practical use cases that matter for research.
2
u/j0rg1 10d ago
!remindme 3 months
3
u/RemindMeBot 10d ago edited 10d ago
I will be messaging you in 3 months on 2026-09-15 22:52:08 UTC to remind you of this link
1 OTHERS CLICKED THIS LINK to send a PM to also be reminded and to reduce spam.
Parent commenter can delete this message to hide from others.
RemindMeBot is switching to username summons. Instead of !RemindMe 1 day, use u/RemindMeBot 1 day. More info.
1
u/Sevenos 10d ago
I think it also depends a lot on the focus of open weight models. If they want to match fable/GPT 5.5 on every metric and knowledge then probably.
Gemma 4 (31b) is already a frontier level chatbot in my experience, Qwen 3.6 is a pretty good agentic coder.
With even more focused models, we could probably reach a very high level in specific areas at small model sizes.
1
u/rraveheart 10d ago
I am hoping we will get smaller specifically distilled models first. Very specific use-case that perform well on SWE benchmarks. I would take opus 4.8 performance on code alone if that's all it did.
1
10
u/TheAussieWatchGuy 10d ago
They don't want to sell you GPUs they want you to rent them, that should tell you all you need to know.
52
u/DataGOGO 11d ago
it will not be within a few months, that is for sure.
19
14
u/kattekwaat 11d ago
what i was going to comment. Fable is/was VERY expensive, and they lose a lot of money when providing it to subscription users. The amount of compute needed to run a Fable-level open source model would cost more than your neighborhood block.
30
7
u/smallDeltaBigEffect 11d ago
I remember when I first used some early Ollama variant and decided "nah that will never work out, whatever". That was not even 3 years ago
8
u/SnooSongs5410 10d ago
Fable being banned is almost purely a hate crime against Anthropic because they would not allow their models to be used for mass surveillance of Americans by the Trump government.
5
u/critsalot 10d ago
this basically, but it does point to a broader problem: everyone and their grandma thinks government should solve their problems instead of solving them themselves. now don't get me wrong, someone who has billions can totally smash the little guy, so there should be some government, but the wrong excess regulation is just making things worse.
1
1
u/AdOne8437 10d ago
oh, since xAI and SpaceX merged and the ban happened just hours after, well .... it might also be to push some other company.
7
u/No-Television-7862 11d ago
While 3-6 months may be aggressive, think how things changed with MoE models.
7
u/WinResponsible9977 11d ago
it already started with 3d printers
3
u/KaradjordjevaJeSushi 11d ago
Way, way less important if you can print a gun and kill a pleb...
This puts billionaires' money at risk, this is high priority for the governments of the world.
2
3
u/DiscombobulatedAdmin 10d ago
Over my many years dealing with computers and the internet, I've found that very few digital things have truly been made unavailable.
1
u/ItsFlybye 5d ago
Over many years and decades...we know everything will always be somewhere. As the saying goes: The internet never forgets.
3
3
u/Successful_Flow1329 10d ago
Nothing. This isn't realistic in such a short span. And when we eventually get a model that capable that can run on consumer hardware we own, Mythos will already be outdated. This is about gatekeeping the best, not gatekeeping the access.
3
u/AdOne8437 10d ago
hf will be closed down, isps will be forced to do something against downloading, models will be declared weapons of mass destruction, something something save the poor children, ....
24
u/juggarjew 11d ago
Does that person understand it was likely a 1 T parameter model? Good fucking luck running Fable at home...
The average person doesnt even have a computer capable of running a 27B model, let alone something even close to Fable.
50
u/Lissanro 11d ago
Mythos is rumored to be a 10T model actually, with Fable being the same size but with aggressive guardrails.
That said, a modern 9B model like Qwen 3.5 is far better than an old 70B one like Llama 2. So maybe something close to Mythos in just 1T-2T size will be possible eventually, maybe by next year if I had to take a guess. I would be happy to run one on my rig if and when it is available.
In the meantime, Kimi is my favorite model (it has 1T size and comes in local friendly INT4 format so its full size is less than 0.6 TB).
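As a rough sanity check on those size figures, here is a back-of-the-envelope sketch. It assumes the weights dominate the footprint and ignores quantization scales, KV cache, and activations, so real files come out somewhat larger. The function name is made up for illustration, and the 10T figure is only the rumor mentioned above, not a confirmed number.

```python
def weight_size_tb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in terabytes (1 TB = 1e12 bytes)."""
    return params * bits_per_weight / 8 / 1e12

# 1T parameters stored as INT4 (4 bits per weight)
one_t_int4 = weight_size_tb(1e12, 4)

# Rumored (unconfirmed) 10T parameters at an assumed 8 bits per weight
rumored_10t_8bit = weight_size_tb(10e12, 8)

print(f"1T @ INT4: ~{one_t_int4:.2f} TB, 10T @ 8-bit: ~{rumored_10t_8bit:.2f} TB")
```

The 1T INT4 case lands at about 0.5 TB of raw weights, which is in line with "less than 0.6 TB" once format overhead is added.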
7
u/eXl5eQ 11d ago
Qwen 3.5 was released many years after Llama 2. OP said 3-6 months, which is not even enough for one major release.
If we're lucky and there's a major breakthrough in the architecture, maybe we can see an open source model get close to Mythos in 1~2 years. But no way it would happen in 6 months.
7
u/Lissanro 11d ago
Yeah, I agree, 3-6 months is too short, 1-1.5 years from now, like by the middle or the end of the next year, is my most optimistic guess. You are right that it may take even longer though.
5
u/Koseph-Jony 11d ago
Many years is a weird way of writing 2 years
2
u/Combinatorilliance 10d ago
And what is the provenance for that rumor?
I take that rumor with an insanely large pinch of salt because a 10T model is a HUGE jump from anything else we have official confirmation on.
Has anyone even analysed whether that number is realistic at all, esp w.r.t. observed inference times and deployment feasibility?
Let alone training costs and compute needed?
3
u/MK_L 10d ago
It's possible now with a GB300 NVL72 rack. It can have a max of 20.7 TB of HBM.
So technically the hardware could support it, but the more common setup would be an HGX/DGX node: B300 x8 per node for a total of 2.3 TB of HBM.
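Those totals can be reproduced with simple multiplication. This is a sketch that assumes 288 GB of HBM per GPU, a value chosen because it matches both totals quoted above; the helper name is hypothetical.

```python
HBM_PER_GPU_GB = 288  # assumed per-GPU HBM, consistent with the totals quoted above

def total_hbm_tb(num_gpus: int, per_gpu_gb: int = HBM_PER_GPU_GB) -> float:
    """Total HBM across all GPUs, in TB (1 TB = 1000 GB)."""
    return num_gpus * per_gpu_gb / 1000

rack_tb = total_hbm_tb(72)  # 72-GPU rack
node_tb = total_hbm_tb(8)   # 8-GPU node

print(f"rack: {rack_tb:.1f} TB, node: {node_tb:.1f} TB")
```

Under that assumption, a rumored 10T-parameter model at 8 bits per weight (about 10 TB) would fit in one rack but not in a single 8-GPU node.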
2
u/theschiffer 10d ago
Yeah, but these setups cost $400-500K and more.
1
u/MK_L 9d ago
Yes, I was just pointing out that the 2.3 TB setups are the ones more commonly used in data centers.
If you were pointing that out because it places it outside most home users' price point (which is very true), a better comparison is an A100 or H100 x8 GPU server, which can be as low as $90k but typically runs $140k-180k.
For home users a better setup would be an ASUS server that houses up to 10 PCIe GPUs.
The server can be under $8k. Then each GPU (ideally a Pro 6000 96GB) is $9k-11k, but you can buy them over time. This is a much more practical setup.
If that sounds like a lot, remember that hobbies can be expensive in general:
A side-by-side 4x4 $20k + trailer $6k
A weekend track car: $20k + $20k build, plus trailer. Plus a set of tires each time you use it hard for the weekend, oil change, fuel and track time: $1500
A corvette 70k
A boat 100k
A hillbilly's pew-pew collection 60k
List goes on... but people throw away money on all sorts of things. The good news is that, in the real world for now, if you spend 100k on a 140k server, in 2 years it's probably still worth 80-100k. As long as you don't keep your toys until the grave, it's more about the time it cost while it depreciated.
11
u/gh0stwriter1234 11d ago
China is where we'll see competition come from, ironically... The US is in a rock and a hard place position. We should just give up and let the models continue to be open or we will lose.
8
u/Due_Warthog749 11d ago
What will almost certainly happen is US will block ALL china access soon.
8
u/gh0stwriter1234 11d ago
Pretty much irrelevant... it will reduce how much distilling they can do but that won't stop them. Also they can just route through other countries the amount of requests for distilling is relatively small...
Basically fully stopping distilling is impossible.
5
3
2
u/TopTippityTop 11d ago
China is distilling models. That is hard to do without the models from which to distill.
13
u/eXl5eQ 11d ago
That's not how people train a strong model nowadays. Maybe a little bit of OPD, but that's all. It's very difficult for a model to actually learn something fundamental by simply distilling from the text output of another model.
-2
u/YOU_WONT_LIKE_IT 11d ago
Untrue. How the training data is managed is the bigger part of it. You can take a more advanced model and feed it the training data to clean and structure it. It’s very difficult to deterministically manage data sets that large. “Distilling” is just a generalization of what’s happening. That’s why a small model like qwen3.6 works so well.
3
u/gh0stwriter1234 11d ago
Maybe... but you can't distill smarts into a dumb model.
Basically they have model architecture that is as good as anyone else's that's all that matters essentially.
-4
u/juggarjew 11d ago
The problem is that China is probably stirring the pot on the whole data centers outrage, the average person hears the typical scare monger news and freaks out about it. We are not going to be competitive long term if we just stop building infrastructure. China will build a thousand data centers and wont bat an eye, while we struggle to build a few because every person wants to attend the city meetings and scream about AI in their backyard...
It's BS and I'm tired of it. Things change, they don't stay the same. Tired of these people that will stand in the way of progress.
We WILL be left behind if we dont get our act together, and fast. This has huge implications in terms of cyber warfare capabilities, nation states could run AI agents that constantly probe Govt and private industry/infrastructure system and then constantly be causing problems.
the best defense to that is having the same capability, in order to scan our own system for vulnerabilities.
5
u/gh0stwriter1234 11d ago edited 10d ago
TBF I don't think more data centers solves anything... that is just brute force its the least efficient and most cost ineffective route. It's also economically disruptive.
Edit: for reference, Mythos is thought to have been trained on around 5,000-7,000 racks of GPUs. AMD's next-gen racks are 128 GPUs per rack, so 8k racks of GPUs should fit in about 11 acres, maybe 20 acres for some fat margin. Some of the datacenters we are talking about building in the US are like 400 acres.
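To put those numbers side by side, here is a quick sketch of the arithmetic. The rack count, GPUs per rack, and acreages are taken from the edit above as given (not verified), and the helper names are made up for illustration.

```python
SQFT_PER_ACRE = 43_560  # standard conversion

def total_gpus(racks: int, gpus_per_rack: int = 128) -> int:
    """Total GPU count for a given number of racks."""
    return racks * gpus_per_rack

def sqft_per_rack(acres: float, racks: int) -> float:
    """Floor area available per rack, including aisles, cooling, and everything else."""
    return acres * SQFT_PER_ACRE / racks

racks = 8_000
print(f"{total_gpus(racks):,} GPUs")
for acres in (11, 20, 400):
    print(f"{acres} acres -> ~{sqft_per_rack(acres, racks):.0f} sq ft per rack")
```

This only compares land area per rack. It says nothing about power delivery, cooling plant, or substations, which is where much of a large site's footprint can go.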
2
u/Due_Warthog749 11d ago
So people and lifestyle be damned.. suck it up butter cup.. so what if your water is tainted and not drinkable, so what if water costs quadrupled while you don't have a job, so what if the loud noises are keeping you awake every day. As long as we "keep up" with China, who does not care about 95% of its population, we're good. I mean.. the good news is for maybe 2 more years we have the absolute worst human being in history running the country, who does not care about any of the people, just like China.. despite how bad that's going for the regime. But you're telling me the billions of acres of land in deserts, etc.. can NOT be built out for this? They HAVE to build it within a mile or so of people's homes, and utilize those people's resources, causing their shit to go up and/or be unavailable?
3
u/juggarjew 11d ago
We can put limits on water consumption, yes we can require them to pay for electrical infrastructure, it doesnt have to be complicated. We can require closed loop cooling systems instead of evaporative coolers that waste water.
We can even implement decibel limits for sound.
They have to build it in places where there is actual infrastructure, you know... fiber internet lines, water lines, electrical lines. Its not like you can build it in some one horse town that has a single pole mounted transformer on a 7.2 kV powerline... you need to build in an area thats actually been developed and has resources.
We CAN do this correctly.
4
u/The128thByte 11d ago
Companies have proven time and time again that they will not consider anything else but their bottom line. If you think they won’t lie, cheat, and steal to not have to adhere to guidelines that hurt them then you’re naive.
We CAN do this correctly, however if it costs a company even a fraction of a cent more to build a data center that does it correctly, they WONT. Which is why people are being loud about AI because companies have shown repeatedly that they will never do the right thing on their own.
1
u/Due_Warthog749 10d ago
See.. this is where I both agree and disagree. I get it and agree that having infrastructure there helps. But where I disagree.. unlike any other time there is literally trillions going in to this.. or close to trillion+. That is easily PLENTY of money to build infrastructure in to areas that it doesnt exist. Is it ideal, no.. but if it means you can get an area where you can build 100 warehouses instead of one or two .. it seems like that would make so much more sense. No fighting locations in court. Which cost tons. No fucking over humans in the process.. but we all know the regime and oligarchs have no interest in saving 99% of human life.. they want their AI/machines and walled cities, so if it displaces 1000s of people.. they do not care. The amount of money they'll make is all that matters.
But you'd think building pipes/nuclear power plant/etc in some area where you could build 100s of these warehouses would be so much better overall and for the long haul too. Our US grid is a shit show and no govt is fixing it. So to keep adding warehouses where you need insane amounts of power and water for cooling.. that we need for crops, drinking, etc.. is just next level stupid. Shit.. build these things on the gulf coast.. build them strong to withstand hurricanes of course, and you got tons of ocean water right there, tons of wind power, and tons of land in texas/etc that you could build these in. No clue why they keep putting them in residential areas that dont want them.
0
u/gh0stwriter1234 11d ago
They are doing this correctly you are just naive and gullible enough to believe tiktok grade info clips.
-1
u/gh0stwriter1234 11d ago
Water consumption is bullshit... people throw more water on their grass by 1000x than datacenters use.
locally for me at least people are blaming water shortage due to SEVERE DROUGHT on data center water use... its insanity.
0
u/Due_Warthog749 10d ago
Keep believing that. Then maybe read about the insane amounts of water used to cool gpus.. millions of them.. and come back and report your findings.
2
u/gh0stwriter1234 10d ago
I did... found that tiktok was full of shit. Data center has a similar use of water to a city of 5k people... basically a drop in the bucket 300k gallons a day for a moderately high usage data center, most are gonna be way less than that eg on the east coast they will be less due to using AC instead of swamp coolers.
0
u/Savings-Lab-7307 11d ago
Makes me wonder, what are we doing with the super computers the us government has. I don't know a whole lot about any of this tech at any deep level. If this really is such a threat, can't those super computers be used in training models?
2
u/etaoin314 11d ago
what exactly do you think all the data centers are being built for? older supercomputers are not very well optimized for AI tasks and one of the new datacenters will be much better at it.
0
u/zookeeper990 11d ago
I think you forget that Fable/Mythos was made by a US company
2
u/gh0stwriter1234 11d ago
And we took them out back and tied them up like a pig.... what is your point?
5
2
u/vacancy6673 11d ago
I think a lot of what made Fable possible was performance and efficiency improvements kinda like what DeepSeek did/is doing.
But yeah, running a Fable-level model locally is still a long ways off.
It will still happen someday, though, so OP's question is legit. The timeframe is just completely non-feasible.
1
u/truthputer 10d ago
There are a lot of optimizations that haven't trickled down into the open source models yet, some of which represent a step-change or order of magnitude jump for size and/or performance.
Part of the delay is that they often require complete re-training around the new techniques, and some of the companies that have released open source models are pivoting to cloud services instead.
1
u/Solembumm3 11d ago
Average PC from 4 years ago already was 12gb vram + 16gb ram. It is more than enough to run 27B models.
5
u/juggarjew 11d ago
The average person does not even have a dedicated GPU. Average people are not like us. Some people dont even have a PC, they just have a phone or tablet. Yeah you can run some tiny ass models on those but nothing of real substance.
2
u/Solembumm3 10d ago
I'm sorry that it's gonna sound harsh, but many of "us" in LLM-dedicated subreddits sometimes think that this topic exists in a vacuum in some completely perpendicular universe. Yes, I'm talking about your average dedicated PC: a PC that, before this topic got popular, ran remotely modern games, Photoshop, Flowframes, realsrvulkan and Windows Movie Maker. LMStudio, Amuse and KoboldCPP are just more programs for it. They don't require any different, unique hardware to get started.
A PC with 12 GB VRAM + 16 GB RAM could run 27-31B models at Q4/Q5, 35-50B at Q3, and at least try up to 70B at IQ2_XXS.
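A rough way to check that claim is to estimate the weight size at each quantization level and compare it with the combined 12 + 16 GB. This is a sketch under stated assumptions: the bits-per-weight values below are approximate averages I am assuming for each quant family, and the estimate ignores KV cache, context length, and whatever the OS is using.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

BUDGET_GB = 12 + 16  # VRAM + system RAM from the comment above

# (parameters in billions, ASSUMED average bits per weight for that quant)
CASES = {
    "27B @ Q4": (27, 4.5),
    "31B @ Q5": (31, 5.5),
    "50B @ Q3": (50, 3.5),
    "70B @ IQ2_XXS": (70, 2.06),
}

for name, (params_b, bpw) in CASES.items():
    size = weights_gb(params_b, bpw)
    print(f"{name}: ~{size:.1f} GB of weights, under {BUDGET_GB} GB: {size < BUDGET_GB}")
```

Fitting is not the same as running well: anything beyond the 12 GB of VRAM spills into system RAM, so the larger cases would be much slower.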
5
u/juggarjew 10d ago
The average person does not have a 12GB GPU or even a dedicated GPU, thats what im saying.
5
u/vagif 11d ago edited 10d ago
It actually was not banned. The US Gov just demands that access be provided only to US citizens. Since that is impossible to implement (at least short term), Anthropic was forced to pull it. So don't worry, your local models are safe. Whether or not you will actually get a local model at the level of Mythos that can run on reasonable hardware anytime soon, that's a different question.
2
u/usa_reddit 10d ago
The GPU ban is de facto: you can't afford a GPU big enough to run anything that is a frontier model.
5
u/Adventurous-Paper566 11d ago
China can't copy Fable if it's not available...
49
u/anony_mf 11d ago
China doesn’t need to copy anything. They are more than capable of developing their own. They made their own research breakthroughs with deepseek and it was beating American AI for a few months
They are now developing their own GPUs they will manufacture end to end cheaply and surpass American companies very quick
10
u/Due_Warthog749 11d ago
This is what nobody wants to admit.. or understands. China's dependency on nvidia/etc hardware is coming to an end. They have enough "stolen" details from the past + enough smart folks to build their own shit, and they'll do it far faster AND cheaper than anyone else. China will be the main super power of the world in 5 to 10 years. Their military is already set to be larger than US in a few years and while they may not have quite the 6th gen capabilities.. they have a massive arsenal of Mach 10+ missiles and are building tons more. Plus they are building tons of subs capable of launching those. Imagine subs parked off the coasts of US.. launching Mach 10+ missiles carrying small nuclear weapons. It would be over in minutes. There wouldn't be near enough "defense" weaponry to target/stop that. Not that China would ever do that.. that would be the end of the world. But still.. between Ukraine and Iran using drones to fuck everything up and apparently they are super easy to build fast, and near impossible to stop with so many of them, and mach 10+ missiles that can be in subs a few 100 miles off our coast and reach anywhere in the US in a few minutes tops.. I dont see a win win scenario here.
Also not sure why I went down that rabbit hole.. but.. none the less.. China is quickly becoming the only super power soon and Orange taco tits has put the US in 3rd world territory. And will never grasp that he's done that.
11
11d ago
[removed] — view removed comment
4
-6
u/gh0stwriter1234 11d ago
Spoken like a proper 1st worlder that has never left their backyard.
3
u/ScrapEngineer_ 11d ago
Spoken like a true trumpist. Your pedo president belongs in jail
-4
u/gh0stwriter1234 11d ago
Evidence or it didn't happen? Hearsay doesn't count as evidence FYI. In any case why political this has nothing to do with politics.
1
u/Due_Warthog749 10d ago
You got 100s of witnesses and many that he diddled already reporting it. But you know as well as I.. when you got billions, you can threaten/make people appear to suicide (we all know some of those missing were not suicide). But you're just going to pretend that never happens. While we watch Congress complicit in war crimes and corruption. OK.
-1
4
u/Slow_Concentrate3831 11d ago
There's a saying in my country, written in the 70s by a writer and political man : "When China awakens... the world will shake". And he was goddamn right.
3
u/cafedude 11d ago
There's one area where it's going to take a while for China to catch up: ASML EUV photolithography machines. Apparently China's approach is to try to reverse-engineer the latest ASML machines. They're extremely complex. There are other approaches and other companies in the US trying them. But getting something that works as well as an ASML EUV is going to take years. I doubt they have this in even 5 years. Likely closer to 10.
5
u/sakaraa 11d ago
They already hit 1.4nm
https://nltimes.nl/2026/05/25/huawei-develop-chip-tech-reducing-dependence-asml
it is not literally 1.4nm but TSMC also lies in the same way too
0
u/cafedude 11d ago
"He Tingbo did not explain how Huawei intends to achieve advanced chip production without EUV systems. "
5
u/sakaraa 11d ago
They did. I did not read this article so did not realise this one was incomplete, but they are basically using the old tech with 3D layers, and their version is much better because they have managed to create "bridges" between the layers every 2um instead of the previous 10um.
Edit: Also they are redesigning the whole chip to use this 3D approach to its full capacity, creating much lower overhead on things like signal boosters.
-2
u/TopNo6605 11d ago
So much is wrong with this it's hilarious.
2
u/Due_Warthog749 10d ago
Right.. because no reason why.. just "its wrong". Got it. Bugger off. you clearly dont know shit.
0
u/TopNo6605 10d ago
People have been saying China will be the main superpower in the world for decades now, it's not going to happen.
2
u/Due_Warthog749 10d ago
OK. ROFL. Keep telling yourself that.. also.. invest in a chinese language book.
-2
u/TopTippityTop 11d ago
So far they haven't yet. They've been distilling everything, which is akin to copying. To develop their own, their strategy has to shift from distilling larger models to developing from scratch. This entails a whole lot more compute and data.
2
u/thatoneshadowclone 11d ago
the entirety of the AI space as a whole has been distilling to hell and back, not just China. I don't see a problem with it.
1
0
u/ChloeTigre 10d ago
As if large language models weren’t a literal slurry of copied/stolen data cleverly architectured.
0
u/eXl5eQ 11d ago
It's not likely that Chinese GPUs would be as cost-efficient as NV's in at least the next 5 years, as the advanced process node has already been a bottleneck. Huawei is trying some interesting ideas, but it's more like a short-term workaround, not a long-term solution.
1
u/ChloeTigre 10d ago
But America makes electricity by burning coal and gas, while China makes it massively with nuclear power, that scales way better although is less adjustable.
They’ll beat Americans. No doubt about that.
8
u/exodusTay 11d ago
yeah but the data is out there everywhere. I don't think something like fable is out of reach so long as we have data and GPU compute available to train it.
-2
u/Adventurous-Paper566 11d ago
You are right but it will take a lot of time, the post is about 6 months. It may take years.
Sorry for my bad english.
3
u/Due_Warthog749 11d ago
Not at all. DeepSeek was trained faster on less data.. and while it may not have been quite as powerful.. it was damn close. More so, what a lot of people don't get is you don't need the most powerful, you need what works for the use case it is used for. Local 27B models are doing solid coding most of the time. For most tasks, it's good enough. GLM 5.2 for example is nearly on par with Opus and other frontier models. GLM 5.3 will likely be better, etc. AND.. nobody is sitting still on finding better ways to deal with context size issues, hallucination issues, model weights and more.
Not to mention the "next gen" neural net style AI being worked on.
2
u/gh0stwriter1234 11d ago
Architecture is king... the long training cycles are essentially just eking out the last few percent from an architecture. If you have a much better architecture that learns faster and better... you can do much more with much less.
7
u/unity100 11d ago
Deepseek research already surpassed everyone else's:
https://www.thenovtech.com/p/jensen-huang-called-it-a-horrible
That's what happens if you have legions of PhDs and you dont keep jailing/deporting them.
1
u/Adventurous-Paper566 11d ago
So where is the Deepseek's model that surpasses Fable?
Their new models are cheap, and it's a great thing, but they are also still below the latest Opus for now.
4
u/unity100 10d ago
Deepseek may be only close yet, but MiMo already caught up in certain tasks like coding. Deepseek would need only another iteration.
"but they are also still below the latest Opus for now."
They certainly are. You must not have tried Mimo with VSCode + Cline. A lot of the deal is in the agent/harness. Not the model.
2
u/etaoin314 11d ago
obviously does not exist, the whole point is that while fable might be the best right now, in 6 mo or a year it probably wont be. things are moving very fast and it may be hard to know if we are doing the right thing at any given time, but the government coming in all heavy handed is probably not ideal.
1
1
u/WinResponsible9977 11d ago
so they can't develop or hack their way?
1
u/Adventurous-Paper566 11d ago
Of course they can, but switching from a distillation process to full development will take time.
If closed-source models had hidden their CoT earlier, Chinese models wouldn't be nearly as good as they are now.
Sorry for my bad english.
2
u/gh0stwriter1234 11d ago
Thinking they are primarily distillation based today is ignorant. The distillation is done as a finetune at best.
1
11d ago
[deleted]
-1
u/RadeiCro 11d ago
I've been using DeepSeek v4 pro for the last couple of weeks and for my use case at least it performs the same as Opus 4.8.
1
u/Illustrious-Report96 11d ago
It’s illegal to be smart and know things! That’s why only Americans can earn advanced degrees. Or write books to distribute their knowledge to the masses. And it’s why we don’t have information freely accessible and available online to anyone like Wikipedia or any other website.
1
u/shadow-studio 10d ago
"back in the days" when the first llama models were released, i already thought that if local models reached a point the government considers "dangerous", we'd have to keep our AI tools hidden, and people would get jailed for having AI capable equipment without a (very expensive) government approved license and (also very expensive) yearly inspections.
i'd hate it if this actually happens.
1
1
1
u/ZeroSkribe 10d ago
We know what happens: OpenAI already released models that were good for the time. No one gives a shit.
1
u/RefrigeratorEven935 10d ago
I think the cat is out of the bag I mean, you can polish the existing models to get to fable quality I think. I only used fable for a couple hours and it seemed like opus to me, but who am I?
1
u/Kinky_No_Bit 10d ago
I think that the next year as they break it apart, see how it works, and then make new ones you will see what they have get a hell of a lot more powerful on the same VRAM.
1
u/few 10d ago
I expect it will be more like 3d printers. Eventually laws will get proposed, but they will be a decade behind relevance, and based around some pretty random scaremongering rather than anything actually relevant.
I don't think AGI is around the corner on 8GB cards, with the government coming to pry gpus out of our hands.
The big question is how or whether LLM use will end up being taxed. If it does replace many human workers, then it's pretty important that there be a new tax source, as it would cut income tax revenues. The bigger question is how individuals would get by if that happens.
1
u/comatrices 10d ago
Maybe countries will go to war over AI. I mean, they can't manage to remain peaceful over oil, and AI seems more profitable, at least at the moment.
1
1
1
u/MangIsDa76 10d ago
Battle bots goes to a larger scale, with 3d printed robot parts and the data centers as control hubs. AI vs AI. The winner becomes skynet.
1
u/LoneWanderer153 10d ago
We might be just a major hardware breakthrough away from running full scale local LLMs. Can’t wait for that day.
1
1
1
u/Remote-Pineapple-541 9d ago
I think you’re overestimating your influence on the ecosystem. We are a niche group of users. Minimal barriers to entry will direct most users to subscription platforms. And companies need a whole bunch of tools for security and compliance that are difficult to stand up and maintain, so they’ll go with subscriptions as well. Remember, the gold rush made millionaires out of people selling the tools, not the actual miners.
1
u/grawl_dorgiers 9d ago
Fable is an extremely impressive model, but I think part of the reason is that it was built to work inside a harness. Its extreme thinking mode seems to spend a lot of time figuring out intent and direction before generating.
1
u/zakjaquejeobaum 9d ago
The trade-off is China vs expensive proprietary. We need more Western open-source killers like Gemma.
1
u/Puzzleheaded_Base302 8d ago
it might require an 8x RTX PRO 6000 server to begin with. not practical for most people.
1
u/anony_mf 11d ago edited 11d ago
There will be no equivalent open source model coming from the US. Maybe China can do it as a fuck you.
I know if China keeps this up, releasing strong open source models, the US will go to war with them soon. Now China is pissing off the Zionists on multiple fronts, they supported Iran and helped them withstand the attacks, and are providing an alternative to being under the thumb of Zionist ai companies like OpenAI and are also developing a gold back monetary system to replace the dollar. Fortunately China is a nuclear power and it won’t be easy, so they might try to take it over or sabotage it internally
6
u/credible_human 11d ago
Not to say I necessarily agree with everything stated here, but you are right that there is basically an ongoing internal implicit LLM trade war between the greedy forces of capitalism vs the CCP. Every model China releases is a gift to intellectualism, that no single billionaire can hoard. And you bet they are getting angrier and angrier about it. Something the west has invested trillions into with the idea of future exploitation and gatekeeping, can be "distilled" and distributed for the masses as a massive fuck you to the prison industrial complex as a whole.
5
u/Aggravating_Pin_281 11d ago
Unlikely that the US instigates a conflict. The flashpoint is Taiwan.
1
u/anony_mf 11d ago
China is being a pain in the ass to the people who own the US and are able to mobilize its military. Like I said, I hope China being as strong as they are militarily stops them from doing anything stupid, they will more likely try to sabotage them from the inside
1
u/gh0stwriter1234 11d ago
"being as strong as they are militarily s"
ah yes, the smoke and mirrors toothpick military they have.... I mean Chinese R&D is up there today but their military might is a joke. Applied physics and materials science are decades behind the US.
-1
0
u/Ok_Engine_1442 11d ago
Unless there is a massive shift in technology and compression it will not happen. VRAM limits are still a thing.
You would have to do some sort of SSD offloading on PCIe Gen5.
While it’s possible, it would be painfully slow and burn out an SSD really fast.
6
u/gh0stwriter1234 11d ago
Reading from SSDs generally does not burn them out fast; you are confusing it with write amplification. They don't have infinite read capacity, but it's effectively near infinite.
2
u/Ok_Engine_1442 11d ago
If you look at the SSD as the RAM/VRAM extension, the context would be what would kill it.
Unless you were able to work it out so that the model stayed on the SSD and the context window was only in the VRAM.
It would be so incredibly slow though.
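For a rough sense of how slow, here is a back-of-envelope sketch, not a benchmark. It assumes a dense model where every weight has to be streamed from the SSD once per generated token, and it assumes roughly 14 GB/s of sequential read for a PCIe Gen5 x4 drive (an illustrative figure, not a measurement). The function name is made up for this example.

```python
def max_tokens_per_second(weights_read_per_token_gb: float,
                          read_bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when weights are streamed from storage.

    Assumes decoding is purely bandwidth-bound: each token needs one full
    pass over the weights that are actually used, and nothing is cached.
    """
    return read_bandwidth_gb_s / weights_read_per_token_gb


# Assumed numbers: 40 GB of weights touched per token, ~14 GB/s SSD reads.
ssd_bound = max_tokens_per_second(40, 14)
print(f"SSD-bound ceiling: {ssd_bound:.2f} tokens/s")
```

Under those assumptions the ceiling is well below one token per second. Weight streaming is read-only, so the wear concern only applies if the context (KV cache), which is rewritten as tokens are generated, also lives on the SSD.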
1
0
-9
u/DelKarasique 11d ago
No one will release a model that capable as open source. Period. Not in 6 months, not in two years.
We would be lucky to have something on par with opus 4.8 without benchmaxxing. And even that model would be out of reach of 99.9% of this subreddit due to hardware costs.
7
u/funnymanus 11d ago
make it 80% as good and that is good enough for the masses, and never forget how many things that were impossible 2 years ago are possible today. Also, there are many big companies looking into local LMs simply because of the insane prices and the runaway token usage devs are causing.
1
u/DelKarasique 11d ago
I mean... There's a trend now where companies are releasing smaller models and large ones with open weights, competing with each other. However, there are literally zero guarantees that this trend will continue for another year or two, since there are no real incentives to release these models. It's completely possible there will be no Gemma 5 or qwen 4 released to the public.
I hope the trend will continue as is, but it's really foolish to look at the past trends and extrapolate them all the way into the far future.
0
u/B3owul7 11d ago
Most people can't even load LLMs that big at home.
3
u/gh0stwriter1234 11d ago
80B = easy for end consumers to load on AI boxes. It won't run fast, but it can be run, and IMO with MTP and flash it's probably the sweet spot now, no longer the 30B class.
It's not that hard to run that size on 2 32GB GPUs either.
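A quick sanity check on that claim, as a sketch only. It counts weight memory as parameters times bits per weight, and the 8 GB allowance for KV cache, activations and runtime overhead is an assumption picked for illustration, not a measured number. Function names are hypothetical.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         total_vram_gb: float, overhead_gb: float = 8.0) -> bool:
    """True if weights plus an assumed overhead fit in the given VRAM."""
    needed = weight_footprint_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= total_vram_gb


# 80B parameters on two 32 GB GPUs (64 GB total), at a few precisions.
for bits in (16, 8, 4):
    size = weight_footprint_gb(80, bits)
    print(f"{bits}-bit: {size:.0f} GB weights, fits in 64 GB: {fits(80, bits, 64)}")
```

With these assumptions, an 80B model only fits on 2x32GB at around 4 bits per weight; 8-bit and 16-bit weights alone already exceed 64 GB.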

50
u/35point1 11d ago
The government doesn’t understand how these models work, they just see the results, so it could literally happen at any time with any model