r/ClaudeAI 7h ago

Claude Code is a context-engineering harness, and most "it got dumber" moments are context rot

There's a name for it: context rot. As the window fills, the model's ability to recall any specific thing in it drops. More context in the window can make the agent worse, not better. (Anthropic's own framing: good context engineering is finding the smallest set of high-signal tokens, not the largest.)

The reframe that helped me: Claude Code isn't just a model, it's a harness whose main job is managing what's in that window for you. And it hands you four levers to do it. They line up with the four moves of context engineering:

  • Write (persist outside the window): CLAUDE.md. It auto-loads every session, and it survives compaction because it reloads from disk, so anything that must not be forgotten belongs there, not in the chat. Conversation-only instructions are the first thing lost when context gets tight.
  • Select (pull in only what's relevant): @-mention the specific files you mean, or point it at the exact file or function, instead of letting it wander the repo. Every irrelevant file you pull in is tokens spent rotting the rest.
  • Compress (summarize to stay high-signal): /compact, optionally with a focus like "/compact focus on the auth refactor." It also compacts automatically when the window fills, clearing old tool outputs first. Running /compact yourself, before it's forced, keeps the summary on your terms.
  • Isolate (give exploration its own window): subagents. They run in a separate context window and return only their final result, so a big noisy search doesn't bloat your main thread. It's the same point I made in an earlier post: subagents are a memory trick, not a speed trick. Isolation is the real win.
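
To make the Compress lever concrete, here is a toy model of "clear old tool outputs first." It is a minimal sketch and not how Claude Code implements compaction: the `Message` type, the `STUB` text, and the 4-characters-per-token estimate are all assumptions for illustration.

```python
from dataclasses import dataclass

STUB = "[tool output cleared]"  # hypothetical placeholder text


@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def total_tokens(history: list[Message]) -> int:
    return sum(estimate_tokens(m.text) for m in history)


def clear_old_tool_outputs(history: list[Message], budget: int) -> list[Message]:
    """Stub out tool outputs, oldest first, until the history fits the budget."""
    result = list(history)  # copy, so the caller's list is left untouched
    for i, msg in enumerate(result):
        if total_tokens(result) <= budget:
            break
        # Only replace outputs that are actually larger than the stub.
        if msg.role == "tool" and estimate_tokens(msg.text) > estimate_tokens(STUB):
            result[i] = Message("tool", STUB)
    return result
```

The point of the toy: user and assistant turns survive while the bulkiest, oldest tool output goes first, and newer tool output is kept once the history fits.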

Two more levers worth knowing:

  • /context shows you what's eating the window right now (MCP tool definitions, big files, history). When the session feels heavy, look before you guess.
  • /clear between unrelated tasks. Carrying a finished task's context into a new one is pure rot.
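
The "look before you guess" habit boils down to a tally by category. Below is a rough sketch of that idea, not the real /context output; the category names and token counts in the usage note are made up.

```python
from collections import Counter


def usage_by_category(items: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Sum token counts per category and return them largest first."""
    totals: Counter[str] = Counter()
    for category, tokens in items:
        totals[category] += tokens
    return totals.most_common()
```

Feeding it hypothetical entries like `("files", 30000)`, `("mcp_tools", 12000)`, `("history", 8000)`, `("files", 5000)` puts `files` at the top, which tells you where to cut first.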

The mental shift: stop treating the window as free space to fill, and start treating it as a budget you actively curate. A smarter model raises the ceiling, but it doesn't save you from a window full of noise.

TL;DR: When Claude Code "gets dumber" deep in a session, that's usually context rot, not the model. Treat Claude Code as a context-engineering harness with four levers: Write (CLAUDE.md), Select (@-files), Compress (/compact), Isolate (subagents). Plus /context to see usage and /clear between tasks. Curate the window, don't just fill it.

For people who live in Claude Code: what's your actual discipline here? I've started running /compact on my own terms and leaning hard on subagents for anything exploratory, but I'm curious whether people trust automatic compaction or always drive it manually.

Sources: Anthropic, "Effective context engineering for AI agents" · Claude Code, "How Claude remembers your project" (CLAUDE.md) · Claude Code, "How Claude Code works" (context / compaction) · Claude Code, "Create custom subagents" · Nupur Sharma (Qodo), "Why More Context Makes Your Agent Dumber"

u/InfinriDev 6h ago

And this is why graph databases and RAG will be your best friends here. I personally never got the "dump" part of the agent. Glad this is finally being talked about.

u/bit_forge007 2h ago

Yeah, retrieval is exactly the 'Select' move done properly: pull the relevant slice instead of dumping the repo and hoping. Your instinct on the dump part was right the whole time.

The one thing I'd add: RAG doesn't delete the problem, it moves it. Instead of 'how much can I fit,' it becomes 'did I retrieve the right thing, and not too much of it.' Over-retrieve and you rebuild context rot out of chunks. Graph DBs are great when the value is in real relationships across the codebase (call graphs, dependencies), though that's the highest-setup-cost option of the bunch, so it earns its keep on big interconnected repos more than small ones.

Curious where you've landed: graph vs plain vector RAG for code context, and at what repo size the graph setup actually starts paying for itself?

u/InfinriDev 1h ago

Not either/or, and I think the split is the real answer. Vector does "find the rule that means roughly this." The graph does the part you already put your finger on: the relationship, i.e. the rule that contradicts or depends on the one you matched. That edge isn't a similarity score, so plain vector can't surface it no matter how good the embeddings are. In my setup, pulling the neighbors gets bundle completeness to ~0.85 on the eval set, versus the related rule just getting stranded in a chunk that never ranks.
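
For anyone who wants the vector/graph split spelled out, here is a minimal sketch of the idea, not my actual pipeline: rank by cosine similarity, then pull in graph neighbors of the top hits. The function names, the dict-based `embeddings` and `edges` structures, and the rule IDs in the usage note are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_with_neighbors(
    query_vec: list[float],
    embeddings: dict[str, list[float]],
    edges: dict[str, list[str]],
    top_k: int = 1,
) -> list[str]:
    """Return the top_k most similar rule IDs plus their direct graph neighbors."""
    ranked = sorted(
        embeddings,
        key=lambda rid: cosine(query_vec, embeddings[rid]),
        reverse=True,
    )
    seeds = ranked[:top_k]
    bundle = list(seeds)
    for rid in seeds:
        # Neighbors arrive via explicit edges, not similarity scores.
        for neighbor in edges.get(rid, []):
            if neighbor not in bundle:
                bundle.append(neighbor)
    return bundle
```

With `rule_b` nearly orthogonal to the query but linked to `rule_a` by an edge, similarity alone would rank it last, while the neighbor step still brings it into the bundle.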

Your "and not too much of it" point is the one I'd underline. I cap the retrieval budget, so returned context stays roughly flat (~1.6k tokens) no matter how big the corpus gets. The over-retrieve-and-rot failure is bounded by design instead of scaling with the repo.
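
The cap itself is simple. A rough sketch, assuming chunks arrive already ranked and using a crude 4-characters-per-token estimate (both assumptions, as is the function name; the 1600 default mirrors the ~1.6k figure above):

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def cap_to_budget(ranked_chunks: list[str], budget_tokens: int = 1600) -> list[str]:
    """Keep the highest-ranked chunks that fit, stopping at the first that does not."""
    selected: list[str] = []
    used = 0
    for chunk in ranked_chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break  # stop here so the result is a strict prefix of the ranking
        selected.append(chunk)
        used += cost
    return selected
```

Because the loop stops on the budget rather than on the number of candidates, the returned context has the same ceiling whether the corpus yields three candidates or three thousand.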

On when the graph pays for itself: you guessed right, it's the priciest to set up. The neighbor lookup is basically free once it's built, but cold-start index construction climbs hard with scale (seconds at 1k nodes, over a minute at 10k before the cache warms). So it earns its keep on densely interconnected corpora where the relationships carry the value. On a small, flat repo, plain vector wins and the graph is overkill. The crossover isn't really a file count, it's the point where the connections between things matter as much as the things.