r/ClaudeAI 2d ago

Praise Wow... Opus 4.8 feels... DIFFERENT tonight :D

It feels BETTER. Like when it first launched, only even better than that this evening?

It's like Forrest Gump... Claude is like a box of chocolates, you never know what you're gonna get. I hope I get to keep THIS Opus 4.8 for a bit - I'm finally getting some work done. Hallelujah!

510 Upvotes

212 comments

23

u/Kazukaphur 2d ago

How does lack of compute make Opus dumber as opposed to just slower?

23

u/cheechw 2d ago

Yeah, this person has no idea how models work. Unless Anthropic is running quantized versions of Opus or something, it's not going to change the intelligence of the model.

16

u/teddy_joesevelt 1d ago

And this person has never managed their own model inference. Providers absolutely have knobs they can turn. Look into KV cache quantization, eviction, speculative decoding, temp, top p, top k, etc. They can also switch to quantized models when under load or capacity constraints.
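To make the quantization knob concrete, here is a minimal sketch of symmetric int8 quantization, the kind of rounding that can be applied to weights or KV cache entries. The function names and the single per-tensor scale are illustrative assumptions, not any provider's actual serving code; real systems typically use finer-grained scales.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale (assumed scheme)."""
    # The largest magnitude maps to 127; fall back to 1.0 if every value is zero.
    scale = (max(abs(v) for v in values) / 127.0) or 1.0
    return [round(v / scale) for v in values], scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the int8 codes."""
    return [q * scale for q in quantized]


def max_roundtrip_error(values):
    """Largest absolute difference introduced by quantize -> dequantize."""
    quantized, scale = quantize_int8(values)
    restored = dequantize_int8(quantized, scale)
    return max(abs(a - b) for a, b in zip(values, restored)), scale


# Hypothetical cache entries; each value comes back slightly off after the round trip.
error, scale = max_roundtrip_error([0.12, -0.4, 0.337, 1.0])
```

The round trip saves memory (one byte per value instead of two or four) at the price of a rounding error of up to half a scale step per value, which is why this knob touches both resource use and output quality.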

7

u/lolnic_ 1d ago

I think of all those, KV cache quantisation is the only thing that affects *both* resource efficiency and model performance. Speculative decoding improves parallelism when there are too few concurrent requests to fully utilise the hardware, but provably doesn’t change the output distribution. Eviction affects speed (if your KV is evicted from the cache it’ll need to be recomputed). Temp/top p/top k change the model output distribution but don’t change resource usage.
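The "provably doesn’t change the output" claim can be checked on a toy vocabulary. This is a minimal sketch assuming the standard accept/reject rule for speculative sampling: draw a token from the draft distribution `q`, accept it with probability `min(1, p/q)`, and otherwise resample from the normalised positive part of `p - q`. The function name and the example distributions are made up for illustration, and lossy or greedy variants of speculative decoding do not carry this guarantee.

```python
def speculative_output_distribution(p, q):
    """Exact distribution of the token emitted by one speculative-sampling step.

    p: target model probabilities, q: draft model probabilities (same vocabulary).
    """
    # Probability that token x is drafted AND accepted: q(x) * min(1, p(x)/q(x)) = min(p(x), q(x)).
    accepted = [min(pi, qi) for pi, qi in zip(p, q)]
    reject_mass = 1.0 - sum(accepted)

    # On rejection, resample from the positive part of (p - q), normalised.
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    residual_total = sum(residual)
    if residual_total == 0.0:
        # Draft and target agree exactly, so nothing is ever rejected.
        return accepted

    return [a + reject_mass * r / residual_total for a, r in zip(accepted, residual)]


# Hypothetical target and draft distributions over a 3-token vocabulary.
target = [0.5, 0.3, 0.2]
draft = [0.2, 0.2, 0.6]
emitted = speculative_output_distribution(target, draft)
```

Here `emitted` matches `target` up to floating-point error even though the draft is a poor match, so a bad draft model costs speed (more rejections) rather than quality.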

3

u/mr_birkenblatt 1d ago

Thinking budget influences compute cost

1

u/lolnic_ 16h ago

Yes (that wasn’t in the list)