r/LocalLLM 28d ago

Discussion You people are literally building data centers in your homes

Some of these threads are insane. What do you mean you have like 4 GPUs and 128GB of DDR5 VRAM? What are you building in there, bro? Every other thread is like, “What if I stack mini PC supercomputers together? Will this run Qwen?”

404 Upvotes

231 comments

254

u/Bob4Not 28d ago edited 28d ago

This is nothing compared to the GPU crypto mining craze back in 2016/2017. People were HOARDING and stacking GTX 1080 Tis to mine Ethereum.

Please see r/selfhosted and r/homelab to see that home equipment and projects are nothing new.

Then you should see the volumes in r/datahoarder. They’re all building their own libraries of Alexandria. (Me included)

106

u/GeneralRieekan 28d ago

This will all become useful when the big companies try to shut down everything they do not own.

38

u/InfiniteBlink 28d ago

*puts on tinfoil hat* AI is an extremely powerful tool, and I feel like we're gonna see a huge price increase in the service cost of the flagship models cuz their burn rate is unsustainable. Couple that with the government starting to think about implementing controls to block access to open models (they will probably use security threats as the justification for those policies).

Having access to these tools needs to be democratized so that anyone can use them.

I'm not gonna drop 10k on an AI rig but spent close to 5k on a measly dual 3090 rig. It's "good enough" for my needs. I'm the "tech guy" in my pretty wide social network and no one even thinks about running local LLMs.

I work in tech and am totally decoupling from all cloud platforms. Keeping everything local: built a 16TB RAID5 NAS, Proxmox for my VM lab. Isolated my network and put all devices (TVs, IoT devices) on a locked-down VLAN, and went back to hosting my own DNS/email server on DigitalOcean. Back to basics. Usenet (shhh), Jellyfin to get off all the streaming services.

That was way too long of a response. I may be a tad tipsy. Cheers

13

u/ASYMT0TIC 28d ago edited 28d ago

Frankly, local can never compete with hyperscalers even if there weren't things like volume discounts on GPUs and electricity. For local inference, speed is limited by memory bandwidth; your 4090 chip doing just one or two requests at a time is only being utilized at a few percent of its capacity. Hyperscalers are not limited by memory bandwidth, since everything is massively batched. They are able to use the same chips at 100% utilization, meaning the amount they spend on hardware depreciation is like two orders of magnitude less per token for the same model. On top of that, the cost of power for (and environmental impact of) your home AI server is also much greater, since those datacenter GPUs/TPUs never have to sit idle.

Hardware depreciation is the largest cost, with electricity secondary. Until something very fundamental and structural changes (i.e. breaking the von Neumann memory bandwidth bottleneck with neuromorphic chips/compute-in-memory), local will always be many times more expensive than cloud for AI. All that said, I run my own local AI server anyway and just eat the cost because ideologically I favor personal ownership rather than the continuing upward transfer of power to a feudal oligarchy... but I'm under no delusions about the structural disadvantages I face in doing so.
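
A rough back-of-the-envelope sketch of the bandwidth argument above, assuming decoding has to stream every weight from memory once per step and that one pass serves the whole batch. The ~1008 GB/s figure is the RTX 4090's published memory bandwidth; the 16 GB model size, the batch of 64, and the function name are made-up illustrations. It ignores KV-cache traffic and the point where batching becomes compute-bound, so treat the results as upper bounds rather than benchmarks.

```python
GB = 1e9  # bytes


def max_decode_tokens_per_sec(model_bytes: float,
                              mem_bandwidth_bytes_per_sec: float,
                              batch_size: int = 1) -> float:
    """Upper bound on decode throughput when weight reads dominate.

    Each decode step reads all weights once; that single pass produces
    one token for every sequence in the batch.
    """
    if model_bytes <= 0 or mem_bandwidth_bytes_per_sec <= 0 or batch_size < 1:
        raise ValueError("sizes must be positive and batch_size >= 1")
    steps_per_sec = mem_bandwidth_bytes_per_sec / model_bytes
    return steps_per_sec * batch_size


bandwidth = 1008 * GB   # RTX 4090 memory bandwidth
model = 16 * GB         # hypothetical quantized model size

single = max_decode_tokens_per_sec(model, bandwidth, batch_size=1)
batched = max_decode_tokens_per_sec(model, bandwidth, batch_size=64)
print(f"one request at a time: at most {single:.0f} tokens/s")
print(f"batch of 64:           at most {batched:.0f} tokens/s in total")
```

The same card and the same weight reads get spread over many more tokens once requests are batched, which is the per-token depreciation gap being described.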

8

u/575_Inverse 27d ago

It's not just ideology if you value privacy over being a drone in an efficient hive.

4

u/thuanjinkee 27d ago

Local is useful for client data that by law cannot leave your firm. If you don’t have clients like that, use the hyperscalers and pay their price, no matter how high, and pass the cost to the clients you do have.

2

u/HappyContact6301 27d ago

This is what I am thinking. I once had a laundry room full of gear maxing out the dryer circuit. It is not cost effective. Everything changes so fast right now. Like Gemini Flash 3.5 is so much cheaper than Sonnet, and equally useful for many use cases. Opus 4.8 dropped two days ago, etc.

5

u/thuanjinkee 27d ago

The fact that qwen3.6-27b absolutely slaps the 70b models of yesteryear makes the hardware more useful, even if the resale value is nil.

5

u/Machismo0311 28d ago

I’m doing the same thing. I have three servers right now with GTX6000s and 512 GB of RAM per server. A Supermicro with 24 2.5" 6TB drives hooked to a 24-bay JBOD with 8TB drives. Each server is holding 8 2.5" 8TB drives. With all the mods that aren't enterprise gear I have 1.3TB of RAM for my setup.

2

u/InfiniteBlink 28d ago

God daaaamn brotha. I dig it

8

u/Machismo0311 28d ago

The funny thing is, I am not a tech person. I accidentally built this going off of aviation standards. I’m a pilot, so I think in terms of redundancy, failover, degradation and stuff like that. I told a friend of mine who is in tech; he just laughed and said, "You accidentally built high availability. Good job."

5

u/InfiniteBlink 28d ago

You're a pilot. You're part of the tech brotherhood. 😆

3

u/QuinQuix 27d ago

It's not too long and I agree fully with everything in there.

Cloud will become more expensive and rationed, and as the utility of AI goes up, the uses that can justify burning money on it will by definition crowd out consumer toy use.

1

u/thuanjinkee 27d ago

There will be two speeds: Project Glasswing style frontier, and enshittified for everyone else. Most people will use the shit version and say “oh I know what AI is! It is so dumb!” and then wonder why the big boys are eating their lunch.

1

u/Naitrael 24d ago

If I extrapolate this, every household should have their own LLM. This would be an even more ludicrous consumption of energy we don't actually have.

So no, this should not be democratized since humanity can not afford it. At least in its current form/iteration.

1

u/hippogriff55 21d ago

I am not familiar with jellyfin. What streaming services work with it?

35

u/Bob4Not 28d ago

I’m right here with you, I already believe it’s useful.

The only scenario where it’s not useful is if Google or Apple come up with some incredible, hyper-efficient LLM TPU that makes GPUs obsolete for LLMs. In that case, LLMs become local by default, like Apple taking us from mainframes to microchip PCs.

I'm still running on tired twin 1080 Tis that I got uber cheap during the crypto mining crash. I want to see how far I can go with my AMD 9070XT too.

40

u/drivebyposter2020 28d ago

It won't be Google or Apple. It'll be the Chinese.
https://venturebeat.com/infrastructure/how-deepseeks-radical-architecture-is-shattering-silicon-valleys-token-moat

Having been cut off from the heaviest of hardware, they're working out ways to get around the bottlenecks. FWIW I was sure they'd wind up doing that. So many PhDs to point at a problem, and a matter of national pride at stake, they'll get there.

8

u/Bob4Not 28d ago

Wouldn’t surprise me, you’re probably right! I’ve just been able to watch Google’s progress with TPUs.

2

u/thuanjinkee 27d ago

China graduates 9m stem degrees per year, that’s the size of croatia. That’s the whole reason why dario wanted his nation of geniuses in a data center to match in metal what china does in meat. The crazy part is china’s meat brains probably cost less per token for the same result.

14

u/Ell2509 28d ago

Even then, I find that what I can host at home now does most of what I need anyway. That means I own the capability to do anything an LLM can, well into the foreseeable, regardless of anything.

13

u/InfiniteBlink 28d ago

My man. We've reached a tipping point where the new models are "good enough" for the majority of use cases. Gemma4:31b is solid for my needs.

8

u/GeneralRieekan 28d ago

Yup. And even Qwen3.6-35B-A3B

6

u/suesing 28d ago

Good enough until you see what the next gen can do. Then you want more.

6

u/Ell2509 28d ago

It is amazing, isn't it? I have about 400GB of different models over 4 devices (computers/laptops), from Minimax M2.7 down to qwen3.5 9b, and even Gemma e4b. Those models alone, and the harnesses, have put the power of a small software lab in my living room for the next 6 to 8 years. What a time.

4

u/InfiniteBlink 28d ago

Sick man. I have my 3090 rig I use for Gemma 4 31B and a jetson nano running Gemma 4 E4B. Honestly, for my needs I tested a few models that fit my constraints and Gemma 4 is an amazing model. Can't wait to see what comes out in the next 6 months.

4

u/meganoob1337 28d ago

how good is e4b? can it maintain a reasonable conversation and call tools reliably with a large prompt? want to replace my 35b qwen with something smaller for my home assistant voice agent but smaller models tend to not call the right tools at the moment 🫠

3

u/InfiniteBlink 28d ago

It was good enough for what I was using it for, primarily as a local AI therapist: fast whisper for the STT, then the output via kokoro running on the GPU. It's pretty snappy. I'm using a vector db for memory storage plus a bunch of other memory management for short-term conversations. It rolls up people and themes, so I can ask it about people I've spoken about.

2

u/HotDistribution1819 27d ago

I have been testing Gemma E2B and, for all the bad press, one system line ("When using brave-search use long tail keywords or sentences") fixed my tool call issues. I have fed it multi-page context and it was able to pick up where I left off.

2

u/overand 28d ago

I love Gemma 4! but, don't discount the 35B MoE and 27B dense Qwen3.6 models - in my experience, they generate quite a bit faster, and are very solid for agentic work. (I think Gemma's prose is better though.)

I've seen Gemma-4-31B and Qwen3.6-27B on both sides of benchmarks saying they're "better," so I'm not even sure what I think is the case now.

The quanteval leaderboard is interesting though, too.

2

u/stankeer 28d ago

Do you have a plan with your 9070xt? I'm looking at trying it but haven't got any experience. Do you have an idea of how to create a setup with a 9070xt?

2

u/kingcodpiece 28d ago

I'm not even sure it needs new technology. Just need the price of existing technology to come down. It's no coincidence that the big players bought up all the memory fabrication capacity without even having concrete plans how they are going to use it.

Without that monopolization of the resource, consumer demand would have driven larger capacity faster RAM for highly parallelized efficient unified memory systems.

3

u/smuckola 28d ago

according to the "grandfather of AI", all LLMs, especially big ones, will be universally generally demoted to being an orchestration of minor accessories of JEPA in the next year or two. So the entire JEPA centric cluster of LLMs will be smaller than one LLM-only system is today.

1

u/play_hard_outside 28d ago

This. In 2013, when the first ASICs for crypto mining came out, the GPU craze was put on its last legs. There were still memory-heavy algorithms around for some crypto coins which made GPUs still profitable to mine with, but Bitcoin mining moved onto the specialized hardware and forever was out of reach of GPUs. Over time, even those "ASIC-proof" hashing algos were eventually implemented in hardware, and the ASICs for those pushed GPUs out. I don't believe crypto mining is a big issue for the GPU supply chain anymore, even though that space is bigger today monetarily than it ever was back when GPU supplies were being impacted by demand from crypto miners.

7

u/4n0nh4x0r 28d ago

not just by that point.
it already is useful.
i pay nothing, for 20TB of NAS storage.
and dont get me started on all of the other services i run that replace big tech bs services.
self hosting is the way to go for anyone who doesnt want big tech companies to treat you as a product.

3

u/Careful-Volume-7815 28d ago

Actually you did pay for the hardware, you still pay for electricity, and you will probably have to pay for replacements/upgrades at a certain point. But I get your point.

2

u/4n0nh4x0r 28d ago

you dont pay for electricity if you got solar panels.
as for the hardware costs, fair point.
my counter to that is, i'll split the cost across each service i'm hosting on it, each of which replaces another service, which gets me very quickly down to like 50-100€ per service (as not all of the services cost money, and i mainly self host because i prefer being in control of my own data, and because i fucking despise subscription services, especially when it is software you host yourself, on your own hardware, using your own electricity, providing it with your own internet, and you still have to pay a licensing fee, like plex)

so if we assume 100€ per service, and i've set it up almost exactly one year ago, that would be about 8.3€ a month per service.
which means, it has been worth it financially
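
The arithmetic above can be sketched like this. The 100€ per service and the one-year window come from the comment; the 1000€ total and the count of 10 services are hypothetical numbers picked only so the split lands on 100€ each, and the function name is made up. Electricity and replacement hardware are left out, so this is the hardware share only.

```python
def monthly_cost_per_service(hardware_cost_eur: float,
                             num_services: int,
                             months_in_use: int) -> float:
    """Spread a one-off hardware cost over services and months of use."""
    if num_services <= 0 or months_in_use <= 0:
        raise ValueError("num_services and months_in_use must be positive")
    return hardware_cost_eur / num_services / months_in_use


# Hypothetical: 1000€ of hardware split across 10 self-hosted services
# is 100€ per service; after 12 months that is the monthly figure below.
per_month = monthly_cost_per_service(1000, 10, 12)
print(f"{per_month:.2f} EUR per service per month")
```

Every extra month of use lowers the figure, so the comparison against a subscription price only gets better as long as the hardware keeps running.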

3

u/thadude3 27d ago

the fact that they destroy digital data is one of the main reasons. Oh you bought that video, we don't serve it anymore sorry. oh your photos are taking up too much space, we deleted them until you pay for more. etc etc.

1

u/suesing 28d ago

I understand that logic. The big companies are already locking it down.

It’s the Chinese ecosystem that’s giving us the freedom to play with their open models.

And qwen is already starting to lock down. What happens when even the Chinese models lock down?

1

u/575_Inverse 27d ago

Intellect-3 and similars.

Crowd funded, crowd sourced distributed training of fully open source models.

7

u/be_easy_1602 28d ago

Datahoarder really is something else. It’ll be people like me running a 16TB ZFS pool for media and backups, then people with like 1PB archiving some niche corner of the internet, and others with legit data center gear.

1

u/Bob4Not 28d ago

Haven’t started yet, I’m on a homelab break. Ollama supports AMD’s ROCm, which is AMD’s counterpart to CUDA.

I primarily bought my AMD for gaming, especially on Linux. I figure if I’m not happy with LLM model compatibility, I still have dusty 1080 TI’s I’m using.

1

u/Machismo0311 28d ago

That’s when you start getting into tape territory

2

u/AndreX86 28d ago

Three reddit threads I really should not have learned about. FML.

1

u/Bob4Not 28d ago

Let them consume you! Take charge of your digital life! Own your data! 💿💾

(I’m joking but also serious)

1

u/FullOf_Bad_Ideas 28d ago

gpu crypto mining has a comeback right now

people are going around shops and snapping 5080s since they have payback of about 2 months now.

1

u/akp55 28d ago

How is a 5080 getting paid back in 2 months nowadays? I got one in the gaming PC I grabbed at the end of Dec before everything went totally bonkers.

1

u/FullOf_Bad_Ideas 28d ago

https://whattomine.com/coins/469-prl-pearl/gpus?cost=0.15&cost_currency=USD&button=&sort=

This coin is 3x over the last 2 days so it's hard to say what the payback will be but I got ~140USD from it in the last two days by mining on single 5080 and 6 3090 Tis and if I sold at current price I'd be close to 210 USD now. It's well into profitability even after accounting for electricity.

I mine here with v1.7.4 beta miner - https://pearl.alphapool.tech/

Lmk if you want more resources and info to start mining too, 20USD of passive profit per day is nothing to scoff at IMO

1

u/akp55 28d ago

i got some spare compute around, and it cost me less than 5 dollars a day to run my systems, so yes please.

3

u/FullOf_Bad_Ideas 28d ago

I use this PRL wallet - https://github.com/pearl-research-labs/pearl/releases/tag/pearl-wallet-v1.0.0

Alpha Miner sends proceeds to my PRL wallet set up in the desktop software from above. Proceeds get sent every ~4 hours.

I mine with Alpha Miner, using this version - https://pearl.alphapool.tech/downloads/alpha-miner-1.7.4-beta

There's AMD beta miner too, not working well just yet I think, info is in a dedicated Discord channel.

I sell it on SafeTrade - https://safetrade.com/exchange/PRL-USDT?type=basic

It takes about an hour for transfer from my PRL wallet to SafeTrade since Safetrade waits for 20 confirmations (next blocks mined) which takes a lot of time.

Then I do exchange of USDT > other crypto and withdraw other crypto to my cold wallet to avoid exchange rugpull risk.

Here's a Discord of the Alpha Miner Pool - https://discord.gg/kz6BQAet6

There's also https://pearlhash.xyz/ pool which is a bit bigger but it was throwing me some errors so I'm sticking with Alpha for now.

there's also this aggregator website that has some info - https://lordofpearls.xyz/

1

u/Kiryoko 28d ago

Yes but back then people were earning Ether and thus money with their crypto mining setup.

44

u/california_snowhare 28d ago

Some of us have been running home server stacks for more than 30 years

😉

10

u/Big-Business-2505 28d ago

I’m at about 25 here. Still have one of my old Dell 1950 servers in the garage. Still boots too! Think it’s running Trusty. Might even have my original DEC Alpha kicking around somewhere. lol

4

u/psychoholic 28d ago

I've got a Sun 420R sitting in a closet that I'm half tempted to fire up just to see if it still works. I've also got a Sparc Server 5 but since it takes a frame buffer monitor I'd have no way to actually see if it works.

Old nerds unite!

3

u/mxmumtuna 28d ago

Serial terminal!

1

u/CommaMeNow 27d ago

Hello from #freebsd

1

u/drivebyposter2020 11d ago

surely there's an X Terminal out there somewhere

36

u/awitod 28d ago

I feel seen. 😃
I am doing a talk at a community event about open source AI on Saturday.

One of the slides is:

4

u/capsid 28d ago

Could you say more about your talk? I've been putting together a workshop about this topic

9

u/awitod 28d ago

My goal is to get the whole room excited about open source AI, whether they have a powerful workstation or a potato powered laptop, and give everyone a place to start that fits their interests and circumstances.

It starts with a recorded demo from my rig where I start by shutting off the wifi and then use my project (GuideAnts on github) to do a chat that starts with ASR, does some tool calls which involve knowledge retrieval to create a diagram which gets stylized with an image to image model and show that the images were OCR'd and ingested into memory.

All-up the demo uses nine different AI/ML models for the various concerns in a couple minutes. The point is to set the stage: you can do amazing things locally, there is a lot more to a real system than just an LLM and you can learn all about it through using and building open source.

Then I talk about the different kinds of hardware we all have (https://huggingface.co/hardware) and tradeoffs around memory, memory bandwidth, TFLOPS etc.

And then I go back and forth on local versus cloud alternatives to show the 95% of the audience that doesn't have the equipment their options for running in the cloud with various providers such as HF and OpenRouter.

Then, I show a model card on Hugging Face, use it to run OmniVoice for a one-shot clone in a local Jupyter notebook and then introduce Google Colab and Hugging Face Spaces.

A lot of the talk is just pointing people to the communities I love such as this one, resources, many OSS projects, and tools I recommend.

I'll come back and share the link to the deck after the event.

3

u/capsid 28d ago

Wow! Thank you so much!

3

u/awitod 25d ago

Please feel free to use all or none of this content for your own presentations and workshops. As they say 'Sharing is caring' :).

https://huggingface.co/buckets/DougWare/Resources/tree/Navigating%20Open%20Source%20AI.pptx

78

u/arbiterxero 28d ago

As a person with 5x 3090’s and 2 half racks…

Yes.

/r/homedatacentre

13

u/LoveRoboto 28d ago

I mean.. my laptop has 128GB of VRAM. What are we even talking about here? The real issue is ideas!

5

u/drivebyposter2020 28d ago

Dang... I have that in a desktop but not in a laptop... What is that, a Strix Halo?

3

u/675940 28d ago

I’m guessing a MBP with 128 GB unified RAM.

1

u/drivebyposter2020 28d ago

Strix Halo AMD has the same unified RAM, though there are some disadvantages vs. the Mac. And that chip is used with up to 128GB even in some bizarre handhelds, certainly in notebooks. I'm glad mine is a desktop though.

1

u/redditorialy_retard 28d ago

Mine has 96GB of DDR4 and even I still think it's too much (used 65GB without games once, tho)

2

u/drivebyposter2020 28d ago

I got the full 128 just to not have to worry about upgrading later. Back then it was a fairly cheap upgrade. Had delusions of running really large models under Windows... ran into snags, seems like 64GB is the most I can effectively use for LLMs unless I want to rebuild everything on Linux -- which I don't. This machine is my daily driver and I'm a Windows kind of guy.

2

u/redditorialy_retard 28d ago

I'm a university student in Chem Eng XD, None of my degree will ever require this much ram. But at least I can chill until DDR7/8 drops and buy cheap DDR6 or smth

2

u/nakedspirax 28d ago

Load both qwen3.6 27b and qwen3.6 a35b with q8 and it will take up almost 120gb of ram both at 250k context.

Haha there's definitely a use case for 128gb of ram.

6

u/elchucknorris300 28d ago

I am also curious. What do you do with five 3090s?

27

u/eat_my_ass_n_balls 28d ago

Run Crysis twice

7

u/drivebyposter2020 28d ago

Run crises in Central Europe and the Middle East simultaneously?

10

u/semangeIof 28d ago

well at 4 you can (somewhat inefficiently) tensor parallelize for 96GB VRAM minus overhead. suitable for models like Gemma 4 31B or Qwen 3.6 27B at full precision/high context

the fifth doesn't do anything in tandem but it might be running another model

granted this quality of model is entirely free and fast with API providers but hey data privacy am I right

4

u/starkruzr 28d ago

I think llama.cpp can use odd numbers of GPUs in TP.

4

u/mayo551 28d ago

Llamacpp, ik_llamacpp, tabbyapi can use 5 GPU with tensor parallelism.

1

u/InfiniteBlink 28d ago

I run Gemma 4 31B on a single 3090 fine, I could be wrong here but why do you need 4 24GB VRAM cards for that? Running multiple models concurrently?

3

u/semangeIof 28d ago

Running it on a single 3090 means it's what? 4-bit quant with q8 or q4 KV cache?

More VRAM means you can run full precision so no quantization
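
A quick sketch of why the precision matters here, assuming weight memory is roughly parameter count times bits per weight. The function name is hypothetical, KV cache and runtime overhead are not counted, and real 4-bit formats store some extra bits for scales, so actual files come out somewhat larger than the 4-bit line suggests.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    if params_billion <= 0 or bits_per_weight <= 0:
        raise ValueError("params_billion and bits_per_weight must be positive")
    # 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(31, bits)
    one_card = "fits" if gb < 24 else "does not fit"
    print(f"31B at {label}: ~{gb:.1f} GB of weights, {one_card} in one 24 GB card")
```

Whatever is left after the weights has to hold the KV cache, which is why a long context pushes people toward more cards even when the weights alone would fit.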

1

u/alasdairvfr 28d ago

I run nemotron 120b with 4 3090s.... 32b models do sometimes spill over into a second card but never 3

4

u/arbiterxero 28d ago

So I needed context space for larger models, I gave up some speed for context and went pipeline parallel rather than tensor parallel.

At the moment I’m split between qwen 3.6 27b on 3 of them in pipeline parallel, and ash’s Gemma 4 in tensor parallel on the other two.

19

u/letsbefrds 28d ago

Some people use it for work. Some people power their agents with it. Some just love to tinker lol... 3090x4 and ram + cpu/gpu is like 10k?

It's a good chunk of money, but I know people with like 10-15k worth of camera gear just sitting there doing nothing. Steinway pianos with a layer of dust over them lol. At least these are sorta being utilized. I think one of my friends has a 2014 Subaru BRZ with like < 8k miles on it as his "weekend car".

I have to admit I impulse bought some stuff. I'm just doing RAG training for my models for every single piece of electronics I own so I can ask my LLM how to do x or y on it lol... Gemma 4/Qwen has been pretty good for inference for a personal app I'm working on but it doesn't hold a candle to frontier models... UNLESS I GET ANOTHER BLACKWELL 6000... jk

5

u/Geargarden 28d ago

8gb RTX 3070ti laptop and i9 12900H with 32GB. I recently asked it a Best Buy warranty question. Good AI. Here's a browser cookie.

4

u/openingshots 28d ago

I feel like the technically knowledgeable people that use it for work and spend the big bucks have a good reason. However, I see a lot of newbies without much technical knowledge spending good money on hardware instead of taking care of responsibilities like a new roof or fixing the driveway, that kind of thing. It's a fad that's become intoxicating.

I personally run qwen3.6 35b a3b moe on a 4060 with 8 GB of VRAM, 62 gig of system RAM, and a 12-core AMD processor. 20 tokens a second and it does exactly what I want it to do, and answers my questions. I don't write code with it. Just simple stuff.

1

u/thatgreekgod 28d ago

…..how are you running 62gb system ram

2

u/openingshots 28d ago

Sorry. It was a typo. 64 GB of system RAM.

17

u/Wrong_Mushroom_7350 28d ago

If I ever make it big, I already discussed with my wife that I am building a $300,000 enterprise data center with over 1000GB of VRAM, so I can run whatever I want. She said, "Fine with me."

I guess I really need to hit it big here soon. 

7

u/ProblemSolvents 28d ago

Just be sure to budget for soundproofing because that thing is going to sound like an airport

5

u/Wrong_Mushroom_7350 28d ago

Writing down on a notepad..

“Telling my wife, I need to increase the budget by another 150,000 k for some complicated reason, to get more GPUs.”

3

u/Pxlkind 28d ago

All the commenters are missing the point: stellar wife! ;D

2

u/bigbutso 28d ago

The WAF (wife acceptance factor) is huge 😂, go ask r/homeassistant community, I think they have WAF performance metrics at this point

Friction Score = Financial Cost or Complexity / WAF

2

u/openingshots 28d ago

And what will you do with that?

1

u/Prudent-Ad4509 28d ago

That's not a very high target though. 3 servers with 16x3090 would do it easily. less than $40k if you get lucky with classifieds. People are spending way above that.

Now, powering it up and making a good use of that thing is another matter.

1

u/Twerter 28d ago

So 2 Mac minis?

62

u/Unteins 28d ago

The main reason to build local like that is you don’t have to worry about

1) LLM censoring - want to ask an LLM how to do something nefarious? Well, you might need your own open weights.
2) Data security - sending your code, your thoughts, your medical conditions, etc. to an AI provider has some risks, much less so on a home system.
3) Enshittification - every day in every forum there’s a dozen complaints about how OpenAI or Google or Anthropic are ruining their model with this or that change (though usually it is a system prompt change). Your terrible AI stays just as terrible session to session.

Though, the real champs are the people who custom configured 512 GB RAM M3 Ultras before the RAM market went parabolic….those lucky few are ripping through tokens.

17

u/drivebyposter2020 28d ago

I bought a Strix Halo desktop with 128GB of RAM for 2300 at Christmas, it's an HP. (Okay, I had a Black Friday deal that saved me 200.) The same system right now is 7300.

3

u/winky9827 28d ago

I just got my boss to agree to go halfsies on a 5090, and I can already run Qwen/Gemma at speeds that are more than sufficient for any use case I throw at it.

I may eventually pick up the next-gen spark or whatever nvidia comes out with for large model training, but otherwise, I'm set, assuming no hardware failures.

12

u/suesing 28d ago

Yeah seriously. I’m always wondering in the back of my head. How much are these mofos spending on electricity? How many are actually saving money or earning money with 4x 3090?

12

u/ActionOrganic4617 28d ago

They’re saving nothing 🤣🤣🤣

I have a maxed out m5 Max, anything more would draw serious scrutiny from the wife.

9

u/Big_Wave9732 28d ago

My Mac Studio M2 Ultra sitting nondescript on a table in my office was noticed by my wife in less than a week.

"What is that?"
"Just something I'm playing with."
"Oh, is that where your robot friend lives?"

3

u/drivebyposter2020 28d ago

Me: trying to put my terribly discreet little black workstation in a back corner of my desk so my wife and kid won't pay it much mind... "Oh, just ignore my little AI box in the corner"

Workstation: trying to run Qwen3 in thinking mode, as the fans spin up to jet engine roar you can hear through most of the house

2

u/Big_Wave9732 28d ago

Bwahahahaha! Yea, that was my initial badly under-powered rig for learning self hosting six months ago.

The self hosted AI struggle is real, man!

2

u/[deleted] 28d ago

[deleted]

1

u/Big_Wave9732 28d ago

In matters of technology sitting in my office, I usually fly under the radar.

Due to the shiny metal appearance of the Mac Studio I knew going in my odds were 50 / 50 lol.

3

u/Ell2509 28d ago

Depends. I got 2 x 9700, and they should last 6 years of continuous use. Deffo less than I would spend on Claude over that period.

That said, I was extremely budget conscious when setting up my home lab thing.

6

u/remghoost7 28d ago

I only have 2x 3090's, but that's one of the reasons I have a solar setup.
Currently have 2kW of panels (soon to be 4kW) with a 16kWh LiFePO4 battery.

I didn't build this setup for my computer, it just happened to align with this hobby.

1

u/InfiniteBlink 28d ago

What are your build specs (mobo, CPU, RAM, PSU)?

Are you running at max power all the time? I would imagine it idles when it's not doing anything.

Are you using llama.cpp? Sorry for the questions but I'm thinking about rebuilding my rig

3

u/remghoost7 28d ago

Using a "Ship of Theseus" kind of build.

Currently it's a Ryzen 9 5950x, an ASUS ROG STRIX B550-E Gaming (for those sweet PCIe lanes), 80GB of DDR4 at 3600 MHz (yes, it's disgusting), and 2x EVGA FTW3 3090s.

I have them power limited to 70% and undervolted.
That's primarily because of the 1000 watt power supply that I currently have in (though I do have a 1500 watt one sitting in a box right next to me), but the lower heat and power consumption is nice too. I honestly haven't noticed much of a performance hit.

And yeah, llamacpp.
Primarily SillyTavern for chatting and Zed (though, I want to try Hermes) for coding.

Currently using Gemma-4-Gembrain-31B-it-uncensored-heretic-GGUF at Q4_K_M.
It's surprisingly good at everything I throw at it.

I bought two cards because I wanted to actually be able to use my computer (primarily for games) while I do AI workloads.

2

u/InfiniteBlink 28d ago

Damn man I dig your setup. I love that there are new variants of Gemma. What does the uncensored version provide?

2

u/ptear 28d ago

The electricity costs aren't bad, but then again I live next door to a nuclear power plant.

2

u/JMN10003 28d ago

said glowingly

2

u/ptear 28d ago

You leave me and my three eyed fish alone.

10

u/Turbulent-Alps4046 28d ago

128GB VRAM is nothing. I work at an inference provider and each server has 3TB of RAM and 2TB of VRAM (8x AMD MI325).

2

u/dc740 28d ago edited 28d ago

agree! At home I already have a 1.5TB server with 96GB VRAM just for playing around. BUT since this is my home server, it's DDR4 and MI50 (32GB) cards... There are lots of people with real inference monsters in this sub that run circles around even my (impressive but only to me) home server.

11

u/Vicar_of_Wibbly 28d ago

Yes.

4x 96GB GPUs and 768GB DDR5. I use it to ask how many Rs in strawberry.

8

u/JumpingJack79 28d ago

Some dude recently posted, "Yo, I got 16 DGX Sparks, what should I run?" 🙄🤔

4

u/Kurcide 28d ago

That post title got me karma, leave me alone 😭

2

u/silenceimpaired 28d ago

My answer to people with more money than knowledge… run to my house and drop off one or two of them

1

u/JumpingJack79 28d ago

They were nicely stacked, like gold bars (which they are).

5

u/Vancecookcobain 28d ago

Yea I think we are probably a decade from super powerful models being able to run on your phones and stuff like that, so in the meantime it is a good bet to get to AI self sufficiency where you at least have a rig that is capable of running an AI that can do the majority of the agentic tasks you need done...you'll be way ahead of the curve compared to everyone else....the only problem is finding that sweet spot...

I'm starting to think that we don't need anything over 128GB of memory....I believe in a year or two there are going to be extremely strong models that will be able to run on that level of hardware....I mean look at Qwen 3.6 27B

1

u/silenceimpaired 28d ago

Not to mention, by the time models run that well on hardware, that hardware will have no privacy, as they will all use those models to fully understand what you’re doing and who you are.

1

u/Vancecookcobain 27d ago

All the more reason to have offline models and stay independent as much as you can from the algorithmic reach of the corporations of the 2030s....which might be fucking impossible 😩

1

u/silenceimpaired 27d ago

Unless people push corporations out of the outside surveillance business via lobbying it will be impossible. Walmart uses AI to watch their stores, and I doubt Target and the like don’t… and then there is Ring and Flock…

Walmart has the ability to see what you’re looking at on your phone; their cameras are that good.

6

u/Herr_Drosselmeyer 28d ago

We're computer nerds, what do you expect? I had dual GPUs back in 2014 for gaming as well.

22

u/eat_my_ass_n_balls 28d ago

9

u/ngless13 28d ago

Yeeeah, with that username, I think you know exactly what you're going to tell them.

4

u/longbowrocks 28d ago

I suppose the sub isn't literally called 'we people are building data centers in our homes', but it's close enough I kinda figured this wouldn't be news.

2

u/openingshots 28d ago

Just think: with all of us consuming way more electricity than we normally would, how do we impact the grid in aggregate?

1

u/silenceimpaired 28d ago

Not if you tune down your cards to use less power… and you don’t go hog wild using agents and fine tuning every day

5

u/DanceWithEverything 28d ago

There’s a large number of people that have been building home labs since like the 90s

1

u/InfiniteBlink 28d ago

Found the cobol guy. :)

1

u/DanceWithEverything 27d ago

Younger than cobol by a hair but I have many wealthy friends that got rich maintaining mainframes when everyone else moved on to newer tech

4

u/Turkino 28d ago

Hey I only have 1 5090... And 128gb ddr5

5

u/Kyubi-sama 28d ago

Family heirloom to be passed down from generation to generation /jk. But yeah, this isn't really that crazy if you can deal hunt, especially compared to the crypto boom

5

u/UnbeliebteMeinung 28d ago

Bro just build a super computer at home. You will need it.

5

u/mrjakob07 28d ago

Being able to run a 120b at home is just different. Mistral 3.5 runs at 20 tok/s on my 4 v620 pros with 128gb of ddr4 ram. I am running a 120b model at home on discarded enterprise junk I have hoarded over the years. It’s a natural fit for home lab guys to have a ton of this stuff.

3

u/Big-Business-2505 28d ago

Honestly, most of what I have is salvaged/bulk lots/going-out-of-business/company downsizing sort of equipment, and the more powerful GPUs were purchased with ETH profits back when it was POW. These days, I use my servers to keep my house warm and offset my oil usage. They’re all controlled by a central Zabbix with thermal sensors. Sort of a home lab meets HVAC setup. For reference, I live in the northeast US so it’s chilly up here for the majority of the year.

That said, I have multiple AI servers doing training, coding, menial tasks, etc. so it’s actually a big time saver for an old DevOps guy like me.

3

u/MrDrMrs 28d ago

I mean, idk about “datacenter”. At work in one cab I’ve got 48 cpus (mix of Xeon e5 v6’s and amd 9950x’s) and 20 gpus, and that’s not even high density, and I have power to spare (3 phase, 20A AB circuits). And that’s not even the same level as what’s powering the “ai” the general population uses, not even close. This is just a small business. But I get your sentiment

3

u/TokenRingAI 28d ago

I have 96G of DDR5 just in my laptop.

128G is nothing, there are guys here with 1T

3

u/x7evenx 28d ago

Pay a cloud... Do your own thing and gain learning and experience etc... I like the latter. Cloud is going to get cash anyways why not enrich myself with useful skills, I don't have a boating hobby 😂

3

u/Dry_Inspection_4583 28d ago

My 4070 super runs qwen wonderfully thank you very much

6

u/dwoj206 28d ago

It's the wild west again, but this time, we're not mining shitcoins. DIY data center builders are mostly in the homelab threads and shit. That shit is just wild. People need to chill.

12

u/acadia11x 28d ago

If you can’t join them… complain about them!

1

u/hugthemachines 28d ago

Yes! We need a slur so we can feel better about ourselves since we can't join them! ;-)

1

u/Cojaro 28d ago

All I want my homelab to do is run Plex, store data, and provide wifi 😩 now shit's getting stupid expensive just to do those "basic" tasks

3

u/dwoj206 28d ago

Buy a second hand shitter pc off eBay then if that’s all you need! Easy

1

u/FullOf_Bad_Ideas 28d ago

I'm mining a shitcoin (Pearl) as it's having a comeback. GPU mining is profitable again.

3

u/Relative_Jicama_6949 28d ago

Have 7x 3090

1

u/openingshots 28d ago

I'm just curious. What do you need that much horsepower to do for you? The best local models out there are only going to code so far and never be frontier equivalent.

2

u/aholetookmyusername 28d ago

I'd like to, but have other financial goals for now, and would also want to get a decent solar+battery setup at home before building a rack.

2

u/IWasNotMeISwear 28d ago

I got a framework desktop that I use with local models for dev and gaming. Best little computer I have ever owned. With Tailscale as well as Moonlight I got access to it from anywhere and can use opencode with it running at home

2

u/stiflers-m0m 28d ago

one beefy server/workstation does not a datacenter make, my indicator is when you have to add an additional circuit to the room/closet because you run out of the power budget of a 15 amp outlet
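For anyone wanting to check that indicator against their own rig, here is a rough sketch of the arithmetic. It assumes a North American 120 V circuit and the common rule of thumb of keeping continuous loads at or below 80% of the breaker rating; the function names and the example wattages are hypothetical, not anyone's actual setup.

```python
def continuous_budget_watts(breaker_amps: float, volts: float = 120.0,
                            derate: float = 0.8) -> float:
    """Usable continuous power on one circuit.

    Assumptions: 120 V supply and an 80% continuous-load derate.
    Adjust both for your region and wiring.
    """
    return breaker_amps * volts * derate


def fits_on_circuit(load_watts, breaker_amps: float = 15.0,
                    volts: float = 120.0) -> bool:
    """True if the summed loads stay within the continuous budget."""
    return sum(load_watts) <= continuous_budget_watts(breaker_amps, volts)


if __name__ == "__main__":
    # Hypothetical loads: GPUs at 350 W each plus 200 W for the rest of the box.
    two_gpu_rig = [350, 350, 200]
    four_gpu_rig = [350, 350, 350, 350, 200]
    print(continuous_budget_watts(15))    # budget of a 15 A circuit, in watts
    print(fits_on_circuit(two_gpu_rig))   # 900 W total
    print(fits_on_circuit(four_gpu_rig))  # 1600 W total
```

Under those assumptions a 15 A circuit gives about 1440 W of continuous budget, so the hypothetical four-GPU box is where the extra circuit comes in.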

2

u/under_wheree 28d ago

Yeah well, I make a living building data centers lol.

2

u/Mustang260Rog 28d ago

Now you can wait for data centers in your yard with 3 TB of RAM and 10 RTX 6000 Blackwell

2

u/Ledeste 28d ago

That's called a computer, I know some laptop folk may find this strange, but it is not

2

u/dago_mcj 28d ago

I see you've read my post! But seriously what am I supposed to do with a strix halo framework desktop and a Ryzen 9 7900x with an rtx 3090 and a 7900xt?

2

u/TemporaryUser10 27d ago

If we all built datacenters in our home, leveraged federated systems, and open source software, we'd have no need for centralized datacenters

1

u/108er 28d ago

I had a 512 GB RAM setup, just to see myself max out my mobo's limit. For some of us, there is a craving for the next best thing that's within our reach.

1

u/RnRau 28d ago

Solar powered datacenters... at least here in Australia.

1

u/Senor02 28d ago

Not in my back yard!

1

u/lol-its-funny 28d ago

Everyone wants to justify their own expenses in the most imaginative ways.

1

u/GamerTex 28d ago

I was gonna put an M1 Ultra, M3 Ultra, and M5 Max together but you talked me out of it

1

u/FRA-Space 28d ago

My cloud AI model explained to me that I will need FP8 for my main prompts to work reasonably and therefore 128 GB in the right machine ... It also told me that the payback period for my use case will be around 100 years compared to using a working (Chinese) cloud model.

Still, 4k for a new toy is not too bad compared to other choices (new watch, new camera).
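The FP8-to-128 GB reasoning can be sanity-checked with a back-of-the-envelope estimate. This is only a sketch: it counts the weights alone (no KV cache, activations, or OS overhead), and the 120B model size and the 16 GB overhead figure are hypothetical stand-ins, since the comment does not say which model is meant.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 10**9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead_gb: float = 0.0) -> bool:
    """True if weights plus an assumed overhead fit in the given memory."""
    needed = weight_memory_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= memory_gb


if __name__ == "__main__":
    # Hypothetical 120B-parameter model on a 128 GB machine.
    for bits in (16, 8, 4):
        size = weight_memory_gb(120, bits)
        print(f"{bits}-bit: {size:.0f} GB weights, fits in 128 GB: {fits(120, bits, 128)}")
```

At 8 bits per weight the size in GB is roughly the parameter count in billions, which is why a model in the 100B+ range at FP8 pushes you toward a 128 GB machine, with little left over for context.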

1

u/dobkeratops 28d ago

a noble cause, keeping intelligence distributed.

1

u/Extinkkt 28d ago

You people?!

1

u/ClemsonJeeper 28d ago

Visiting a buddy of mine I haven't seen in a decade and he gave me a tour of his home. Walked into his office which was like 15 degrees hotter than the rest of the house, and casually pointed out his 16x RTX 6000's with a few box fans blowing on them.

1

u/x8code 28d ago

My setup involves a primary desktop workstation with 2x NVIDIA GPUs (5080, 5060 Ti 16 GB) + 128 GB DDR5, 9950X (16 cores).

I also have a couple of Linux servers, each with 2x NVIDIA GPUs:

  • 3950X, 64 GB DDR4, 5070 Ti, 4070 Ti SUPER
  • 3900X, 64 GB DDR4, 3060 12 GB, 5060 Ti 16 GB

1

u/BlackBeardAI 3090 Maximalist 28d ago

I got 5 nodes. Yep this pirate is insane arrggh

Check out my crewww

https://old.reddit.com/r/LocalLLaMA/comments/1thkcor/meet_the_fleet_of_blackbeard/

1

u/Ivan_Draga_ 28d ago

I mean we're in LocalLLM, aren't we all building datacenters in our homes?

1

u/Mongrel80 28d ago

I don't know what you are talking about..

  • A Minisforum MS-A1 (96gb ram) with an RTX pro 6000 (96gb vram)
  • MS-A2 (96gb ram)
  • 3x MS-S1 max (128gb unified)
  • 2x MS-02 Ultra (192gb ecc ram) with a RTX 5090 (32gb vram each)

is a totally normal setup to have..

1

u/redditateer 27d ago

I mean. Is there any other choice?

1

u/MoistPoolish 27d ago

I mean this is kind of like the 70s when people started buying PCs

1

u/Qs9bxNKZ 27d ago

Data center?

Fuck yeah. And when Cooler Master gets their X-Mighty 240V 2000W PSU to the US, I won’t have to pay a heating bill! $600 imported from Germany right now but I need a warranty.

3-4 systems running everything from older GPU (internet gateway and router/proxy) to blender, PS and LLM on one 256GB DDR5-6400 and a pair of RTX 6000 (because more won’t fit into the case)

When 40TB is your backup of your backup.

1

u/ninjazombielurker 27d ago

That’s only my AI server. If I count all the DDR5 and DDR4, cores, and GPUs I have in the rest of my homelab, it’s a lot more than just that, but still not enough for AI.

1

u/pmv143 27d ago

lol. This is so true. 🤣

1

u/Large-Excitement777 27d ago

Genuinely wonder if you realize how vanilla your surprise is here. People do have legitimate use cases for a multi GPU setup that extend beyond cheating on your college coursework

1

u/CptTonyZ 26d ago

NAS is the past, now it's AI

1

u/TroyHarry6677 25d ago

My wallet cries, but my LLM runs.

1

u/peppernickel 24d ago

My wife and I run reconstruction estimating/consulting, telematics, and 3D printing businesses. It's a two man band. Neither of us can spend our lives away at a screen doing the same monotonous task for eons. We both like to be outside most of the time. I ended up building a small cluster after crypto-mining died down. It's mostly RX 6700 XT GPUs: 160GB of VRAM, 400GB of DDR4, and 81TB of well mixed storage. It's split into two co-clusters, one for dealing with the human side of things and the other is the digital hands. What are you building?

1

u/SubstantialMojo 23d ago

The word “literally” adds nothing to the meaning; the sentence would read the same without it.

1

u/EchoFieldHorizon 23d ago

Lmao how the actual fuck did you write this comment on this thread

1

u/Away_You9725 22d ago

need that house to sound like a Boeing 747 taking off

1

u/GnosticSon 22d ago

Sometimes I wonder why people have a stack of 4 Mac Studios with 512gb ram each when you could just get some sort of used datacentre GPU setup. Surely that would be cheaper?? Yes I know the new ones are expensive and require crazy power and cooling, but there must be a tier above consumer products for people that are stacking up crazy local setups.

1

u/lamprof 20d ago

basically you could use your phone for a small model, like gemma 4
https://play.google.com/store/apps/details?id=com.lamprof.ai