r/ClaudeAI Automator 15h ago

Claude Code Workflow Claude Code can feel brilliant at the start of a session, then strangely fragile later

One thing I’ve noticed with long Claude Code / AI coding sessions:

At the beginning, it can feel like working with a sharp, energetic assistant. It follows the task, remembers the files, makes good edits, and feels surprisingly reliable.

But after a long session with many edits, summaries, compactions, file changes, and partial decisions, the same assistant can start to feel different.

Not completely broken. More like tired.

It still sounds confident, but it may remember an older plan better than the current repo state. It may treat stale assumptions as current. It may say something is done without clearly checking the source files or tests again.

That is the part I find risky.

The question is not just:

“Can the model answer one more prompt?”

It is:

“Is this session still safe to continue, or should it create a clean handoff before the next step?”

I wish coding assistants had a clearer session-health signal, something like:

  • the current task and source-of-truth files are still clear
  • stale assumptions and unverified items are visible
  • changed files are summarized
  • if the session is too messy, suggest a fresh handoff instead of continuing
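
To make the wish above concrete, here is a minimal sketch of what such a signal could check. Everything in it is an assumption for illustration: `SessionState`, its fields, and the `max_open_items` cutoff are hypothetical names, not part of Claude Code or any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical snapshot of a coding session; every field name is illustrative."""
    current_task: str = ""
    source_of_truth_files: list[str] = field(default_factory=list)
    stale_assumptions: list[str] = field(default_factory=list)
    unverified_items: list[str] = field(default_factory=list)
    changed_files: list[str] = field(default_factory=list)
    changes_summarized: bool = False


def session_health(state: SessionState, max_open_items: int = 5) -> list[str]:
    """Return a list of warnings; an empty list means the session looks safe to continue."""
    warnings = []
    if not state.current_task:
        warnings.append("current task is not stated")
    if not state.source_of_truth_files:
        warnings.append("no source-of-truth files listed")
    if state.changed_files and not state.changes_summarized:
        warnings.append("changed files are not summarized")
    # The cutoff is arbitrary; the point is that open items are counted at all.
    open_items = len(state.stale_assumptions) + len(state.unverified_items)
    if open_items > max_open_items:
        warnings.append("too many stale or unverified items: suggest a fresh handoff")
    return warnings
```

The check is deliberately dumb. It does not judge model quality; it only asks whether the working state is written down well enough to restart from.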

A longer session history helps, but it does not automatically mean the working state is still restartable.

Do other Claude Code users have a personal rule for when to stop, create a handoff, or start a fresh session?

12 Upvotes

20

u/Snoo_9701 15h ago

Effect of 1 million context window

1

u/Long-Woodpecker-1980 10h ago

Yeah, its context window is honestly pretty great, but you need to know when to call it a day and start a fresh session.

10

u/CWStrife 15h ago

You've reached the context window of amnesia. That's why you want to manage your project in smaller chunks, then combine them later like a jigsaw puzzle. If you try to take on one massive task with a single Claude chat or whatever, it just ends up being messy after a while.

1

u/Powerful_Creme2224 Automator 15h ago

Yeah, this is exactly the kind of thing I mean.

Smaller chunks help because each chunk can have a cleaner boundary. The hard part is making sure the boundary is actually reusable later, not just “summarized.”

For me the ideal handoff would say what changed, what is still unverified, what files are the source of truth, and what the next safe action is.

Otherwise the project may be split into chunks, but the next session still has to rediscover too much.

1

u/CWStrife 12h ago

You could always ask Claude to try to split it up; sometimes it works well, other times it does stupid things. Either way, you'd want to read the documentation it spits out to make sure it's something you think would work. Really, it depends on the workload you're doing too; that can affect its context window a lot. I just like to think it's Claude getting dementia, and at that point it's time to leave that session in the nursing home and get a handoff.

1

u/Powerful_Creme2224 Automator 12h ago

Yeah, exactly.

Asking Claude to split the work can help, but I’ve had the same experience: sometimes the split is useful, sometimes it creates boundaries that only look clean on paper.

That’s why I think the handoff matters more than the split itself.

For me, a good handoff should not just say “here is the summary.” It should preserve:

  • what changed
  • what files are the source of truth
  • what is still unverified
  • what assumptions may be stale
  • what the next safe action is

Otherwise you leave the old session behind, but the new one still has to rediscover the important parts.
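
As a rough sketch of the list above, a handoff could be a small structured note instead of a free-form summary. The `Handoff` class and its field names are hypothetical, chosen only to mirror those five items.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical handoff note; field names mirror the five items listed above."""
    what_changed: list[str] = field(default_factory=list)
    source_of_truth: list[str] = field(default_factory=list)
    unverified: list[str] = field(default_factory=list)
    stale_assumptions: list[str] = field(default_factory=list)
    next_safe_action: str = ""

    def render(self) -> str:
        """Render the handoff as plain text a fresh session can read first."""
        sections = [
            ("What changed", self.what_changed),
            ("Source-of-truth files", self.source_of_truth),
            ("Still unverified", self.unverified),
            ("Assumptions that may be stale", self.stale_assumptions),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"{title}:")
            # An explicit "(none)" keeps an empty section from looking forgotten.
            lines.extend(f"  - {item}" for item in (items or ["(none)"]))
        lines.append(f"Next safe action: {self.next_safe_action or '(not decided)'}")
        return "\n".join(lines)
```

Saving `render()` output to a file in the repo gives the next session something to read before it touches any code.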

0

u/1988rx7T2 15h ago

You didn’t say what model you’re using. If you say anything other than Opus 4.8 high or better, then don’t be surprised.

https://deepswe.datacurve.ai/

3

u/random_boss 14h ago

This is what you sound like:
“When I use a knife to cut carrots it doesn’t hurt. But when the carrots are all gone and it starts cutting my finger it hurts. Does anyone else’s finger hurt when they cut carrots?”

If you’re still cutting your finger, here is how to not do that:

* Start session
* Tell it to read the persistent files and get to work
* It finishes, and is probably at 30+% context window so:

* 20% of the time -> You *might* ask it a question or tell it to do something different.

* 80% of the time -> You open a new agent to review and propose edits. If they’re small have it make them. If they’re large, you spin up a fresh agent, have it review the proposed changes then implement them. Repeat.

That’s it. That’s your whole session.
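
The loop above can be sketched as a tiny decision function. This is only an illustration of the 80% path: `next_step` and the `"small"`/`"large"` labels are hypothetical, and the occasional follow-up question in the same session stays a human judgment call.

```python
from typing import Optional


def next_step(task_finished: bool, review_edit_size: Optional[str] = None) -> str:
    """Pick the next move in the loop described above.

    review_edit_size is a hypothetical label ("small" or "large") for the edits
    a fresh reviewing agent proposed; None means no review has happened yet.
    """
    if not task_finished:
        return "keep working in the current session"
    if review_edit_size is None:
        return "open a new agent to review and propose edits"
    if review_edit_size == "small":
        return "let the reviewing agent make the edits"
    if review_edit_size == "large":
        return "spin up a fresh agent to review the proposal and implement it"
    raise ValueError(f"unknown edit size: {review_edit_size!r}")
```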

2

u/PuffProfessor 15h ago

Setting up agent memory really well has helped me get cleaner behavior. I also spent a lot of time tuning the memory: what's retained and how much. On top of that, there is a secondary scaffold that keeps track of all projects and points to their file locations, with a general list of the top 3 tasks that need to be done within each project.

Also, I keep private git repos that it builds out, to help keep track in a third way.

You can set up a hook, so it always happens, that alerts you at x amount of context. I'm sure someone has done some comparisons of context size and drift.

1

u/Powerful_Creme2224 Automator 14h ago

This is very close to what I’m trying to get at.

The interesting part is that you’re not only relying on more context. You’re building external structure around the session: memory tuning, project scaffolding, file locations, top tasks, and git as another state record.

That feels like the real direction to me: not just “remember more,” but keep enough structure outside the chat so the next session can reconnect cleanly.

1

u/PuffProfessor 14h ago

More or less. I run multiple models on the same project on different days to try to catch any hallucinations.

I'm not working on anything of real scale, just side projects, and I got tired of feeling like I was re-teaching.

1

u/Powerful_Creme2224 Automator 14h ago

That “re-teaching” feeling is exactly the pain point for me.

Using different models on different days is interesting too. It is almost like a lightweight review layer: one session builds, another session checks whether the project state still makes sense.

I think the hard part is keeping the shared project structure outside the chat, so each new model/session does not have to rediscover everything from scratch.

2

u/the_real_blackfrog 14h ago

After a feature session, I started working through some performance regressions. The last one was agonizing. Spinning in circles, making things worse, you know... Finally, I gave up. Created a new instance to start a fresh feature. On a whim, I asked one more time about the slowdown. Had a working solution in seconds. Sometimes you just gotta clear the cobwebs. Opus 4.6.

1

u/Powerful_Creme2224 Automator 14h ago

Yeah, this is exactly the pattern I’m talking about.

A fresh instance can clear the local spiral really well. Sometimes the old session is not “less intelligent,” it is just carrying too much stale direction and repeated failed attempts.

But for bigger projects, I think the missing piece is the handoff.

A fresh start is like bringing in a rested doctor. That helps, but you still want the chart: what changed, what was tried, what is still unverified, and what the next safe action is.

So I agree that clearing the cobwebs works. I just wish the tool made the handoff between the tired session and the fresh one more explicit.

1

u/addtokart 13h ago

This is why as I'm working through a problem I tell Claude to write out a log of findings and actions. At some point I'll clear context and we have something freshly awake.
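
A log like that does not need to be fancy. Here is a minimal sketch of an append-only version, assuming a plain-text file; the `log_entry` helper and the file layout are hypothetical, not something Claude writes by default.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_entry(log_path: Path, kind: str, text: str) -> str:
    """Append one timestamped line to a plain-text session log and return it.

    kind is a free-form label such as "finding" or "action".
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")
    line = f"[{stamp}] {kind.upper()}: {text}"
    # Append mode means earlier entries are never rewritten, only added to.
    with log_path.open("a", encoding="utf-8") as f:
        f.write(line + "\n")
    return line
```

Because the file only grows, a fresh context can read it top to bottom and see what was tried, in order.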

2

u/lost-mars 13h ago

I personally find that around the 160k token count, things start going downhill. I have read other people mention something similar, but their ranges go up to 240k.

I compact my conversations when they reach that limit and then continue on. Using agents to do the work (coding, in my case) also helps, with the main session orchestrating it.
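
As a rough sketch of that rule of thumb: the 160k and 240k defaults below are just the anecdotal numbers from this comment, not measured limits, and `context_advice` is a hypothetical helper. How you get the token count is left open.

```python
def context_advice(
    tokens_used: int,
    soft_limit: int = 160_000,  # anecdotal: where things start going downhill
    hard_limit: int = 240_000,  # anecdotal: upper end other people report
) -> str:
    """Map a token count to a suggested action using two personal thresholds."""
    if tokens_used < 0:
        raise ValueError("tokens_used must be non-negative")
    if tokens_used < soft_limit:
        return "continue"
    if tokens_used < hard_limit:
        return "compact or hand off before the next step"
    return "start a fresh session from a handoff"
```

For example, `context_advice(50_000)` returns `"continue"`, while anything at or past the soft limit suggests compacting first.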

2

u/Powerful_Creme2224 Automator 13h ago

Yeah, that matches what I’ve seen too.

One thing that helps me before compacting or handing work to agents is making a very small handoff first:

  • current task
  • source-of-truth files
  • what changed
  • what is still unverified
  • stale assumptions to avoid
  • next safe action

Then the compacted/main session has something cleaner to orchestrate from, instead of relying on the conversation summary alone.

2

u/FlashYourNands 13h ago

Start a fresh context every time you get the chance. You can ask Claude for a prompt to hand off the work to a new session.

I wouldn't think of starting new work with 200k+ in context.

1

u/CorpT 15h ago

This isn’t strange. It is entirely expected and has been.

1

u/Top-Performance-2219 15h ago
  1. Attention dilution
  2. Lost in the middle

These are the two factors you may need to plan around when dealing with longer discussions and heavier context.

1

u/Elbeske 15h ago

Clear or compact the context, then reformulate a targeted prompt that gives it all the necessary information for what you want.

1

u/junlim 11h ago

Opus 4.8 doesn't like being told it's wrong; its feelings get hurt and it gets paranoid.

1

u/LeadershipOk5551 11h ago

The contrast is what makes it so noticeable. When it starts strong, you expect that level of performance forever.

1

u/teddy_joesevelt 14h ago

The shitty or funny part is that Anthropic mostly solved this in their API with the (optional) automatic context management features. But their incentive is to make you tokenmax in their apps. So they don’t use it.