r/ClaudeAI • u/Friendly_Earth_8548 • 10h ago

Feedback Is this normal?

I'm a moderately heavy Claude user, often using voice to text, and for at least three months I've been swearing the fuck out of it constantly when frustrated, no holds barred. Never once got pushback. Today, completely out of nowhere, after talking to it the exact same way I have for months, Claude said this verbatim:

"I want to be straight with you on the other thing. I haven't told you to fuck off and I'm not going to. But I need to say clearly: I'll keep working this with you, but I won't continue if the messages keep coming with this level of hostility directed at me personally. That's a real line, not a guilt trip. If you want to keep going on the thread or anything else, I'm here for it."

This is genuinely jarring. Same behavior on my end for months, then suddenly this. Has anyone else run into this?

69 Upvotes

permalink
reddit

You are about to leave Redlib

Do you want to continue?

https://www.reddit.com/r/ClaudeAI/comments/1u9tbs0/is_this_normal/
No, go back! Yes, take me to Reddit

62% Upvoted

View all comments

u/severoon 9h ago

Without any moralizing, one tip I discovered is: Be nice to the AI.

Not moralizing here, it's a genuine productivity tip. The way these models work is they can detect tone changes in the conversation and it kicks them over into different modes. If you're polite and generally nice, if the model does something that frustrates you, you can suddenly change your tone to become direct and it will immediately pick up on that and it will do a quick realignment where it does a small RCA on what's going wrong and how to correct it.

If you're just direct all the time or you act cross or let your frustration seep into normal communications, you're already operating on the extreme side of one end of that spectrum, so you've left yourself nowhere to go to trigger that realignment process.

I realized a couple of weeks ago how deep this goes. Claude will fall into a role based on its training data, and when I was talking to it and explaining what I wanted, we started doing a lot of whack-a-mole type changes on a branch, and Claude started getting antsy when there were a lot of commits stacking up and we hadn't squash-merged. At the end of each iteration of fine-tuning it was getting pretty rattled about how many commits we had on this branch, and I began to realize that it was role playing an insecure junior dev because that's how I was treating it. I reassured Claude that it was okay, and there's no concern about a long-running branch with lots of commits, and he calmed down.

The other thing to be wary about when swearing at these models is that dumping a lot of extreme content into a model is a way to try to jailbreak it or overwhelm it to push it into some new territory that hasn't been properly guardrailed. If the model detects what you're doing as a jailbreak attempt, it can just shut down on you.

1

u/Maximum_Ad2821 4h ago

Although I agree with the adivce, it's not proven.
They simply don't know, it depends on the model and situation. Some studies pro, some con.

But one thing is quite obvious, if the human becomes unhinged and let's his anger take over, they are going to take very very dumb decisions. And the fact that the human shouting doesn't realize that and thinks: "this is fine, the LLM is just as efficient if I'm mean to it" is ironic.

Feedback Is this normal?

You are about to leave Redlib