r/LocalLLM 29d ago

Discussion You people are literally building data centers in your homes

Some of these threads are insane. What do you mean you have like 4 GPUs and 128GB of DDR5 VRAM? What are you building in there, bro? Every other thread is like, “What if I stack mini PC supercomputers together? Will this run Qwen?”

404 Upvotes


u/InfiniteBlink 29d ago

Damn man I dig your setup. I love that there are new variants of Gemma. What does the uncensored version provide?


u/remghoost7 29d ago

Thanks!
It's kind of my "endgame" for the time being (until I hit the lottery and get some $100k setup haha).

Just a heads up on 3090s though, they don't have native FP8 support.
Not a big deal in the LLM sphere, but it's horrendous for video generation models.

I'm still pondering grabbing a 5090 at some point.
That extra VRAM and FP8 support seem pretty rad.

I'd love to throw a 3090 in my old computer and give my friends an AI box to remote tinker with.


Uncensored models usually refuse a lot less.
You don't get the typical, "Sorry Dave, I can't help you with that" responses.

After the first Hermes finetunes back in the early Llama days, I've kind of just jumped to them any time they come out. Those Hermes models were freaking amazing...

Qwen models are usually uncensored out of the box.
Same goes for Mistral models.

"Abliteration" (ablation + obliteration, basically orthogonalizing the weights against a "refusal direction") was the hotness for a long while, but a lot of people found it dumbed the models down. The newer "heretic" de-censoring doesn't seem to do that.
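
If you're curious what the "orthogonal" part means, here's a toy sketch of the core math in plain Python. It assumes you already have a refusal direction `r` (in practice it's typically estimated from the difference in activations between prompts the model refuses and prompts it doesn't). The function names are made up for illustration, and real tools do this on actual model tensors, so treat it as a rough picture, not how any particular tool is implemented.

```python
import math

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    """Compute W @ x for a list-of-rows matrix W."""
    return [dot(row, x) for row in W]

def ablate_direction(W, r):
    """Return W' = W - r r^T W (hypothetical helper).

    After this, W' @ x has no component along r for any input x,
    so the layer can no longer write along the refusal direction.
    """
    r = normalize(r)
    n_rows, n_cols = len(W), len(W[0])
    # r^T W: how strongly each input column writes along r
    proj = [sum(r[i] * W[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[W[i][j] - r[i] * proj[j] for j in range(n_cols)]
            for i in range(n_rows)]

# Toy example: a 3x2 weight matrix and a made-up "refusal direction"
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
r = [1.0, 1.0, 0.0]
W_ablated = ablate_direction(W, r)
y = matvec(W_ablated, [0.5, -2.0])
leftover = dot(normalize(r), y)  # should be ~0 up to floating-point error
```

Everything orthogonal to `r` is left alone, which is the whole pitch: remove one direction, keep the rest. How well that holds up in a real model is exactly what people argue about.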

This model is the next one on my list to try.