"Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.
When the tool's configuration pointed at a local working directory, it would hard-reset that directory every poll cycle to reflect the remote — destroying all uncommitted changes to tracked files, exactly as described in the issue."
kibwen 15 hours ago [-]
Let's focus on the real issue here, which is that HN has apparently normalized the double hyphen in the title to an en dash--yes, an en dash, not even an em dash.
dragonwriter 13 hours ago [-]
That's LaTeX convention, double hyphen is an en-dash, triple hyphen is an em-dash.
byronsharman 15 hours ago [-]
I agree that it should be left as a double hyphen, but an en dash is far more appropriate considering the decades-long precedent set by LaTeX (and continued by Typst).
ajross 14 hours ago [-]
It's a command line argument. The undeniably correct way to render it is with two minus signs[1] and absolutely not something non-ascii.
[1] Not strictly a hyphen, which has its own unicode point (0x2010) outside of ascii. Unicode embraced the ambiguity by calling this point (0x2d) "HYPHEN-MINUS" formally, but really its only unique typographic usage is to represent subtraction.
minitech 14 hours ago [-]
They meant “more appropriate [than an em dash]”. And that minus sign usage of hyphen-minus isn’t unique in Unicode either – see U+2212 MINUS SIGN.
ajross 13 hours ago [-]
But... it's not more appropriate than an em dash for representing command line arguments? I don't see how either is any more incorrect than the other. There's a uniquely correct answer here and the em-dash is not it. Period.
minitech 13 hours ago [-]
It’s about the top-level comment’s horror that “--” was substituted with “an en dash, not even an em dash”. If you’re picking a substitution for “--”, en dash makes more sense. The comment you originally replied to had already agreed “that it should be left as a double hyphen”.
ajross 13 hours ago [-]
> If you’re picking a substitution for “--”, en dash makes more sense.
No, it doesn't? This seems like crazy talk to me, like "If you're picking a substitute for saffron, blood plasma makes more sense than monocrystalline silicon". Like, what?
It makes zero sense to substitute this at all. It's exactly what it says it is, the "--hard" command line option to "git reset", and you write it in exactly one way.
minitech 13 hours ago [-]
Nobody is confused or disagrees about the `--hard` part. It was a minor tangent about contexts where these ASCII substitutions are established, like LaTeX (`` -> “, '' -> ”, -- -> –, --- -> —, etc.)
bluedel 3 hours ago [-]
It's not a command line argument, it's part of the title of a hackernews post.
dragonwriter 12 hours ago [-]
> The undeniably correct way to render it is with two minus signs[1] and absolutely not something non-ascii.
> [1] Not strictly a hyphen, which has its own unicode point (0x2010) outside of ascii. Unicode embraced the ambiguity by calling this point (0x2d) "HYPHEN-MINUS" formally, but really its only unique typographic usage is to represent subtraction.
Strictly, it's, as you note, the hyphen-minus, and Unicode has separate, disambiguated code points for both hyphen (0x2010) and minus (0x2212); hyphen-minus has no "unique typographic usage".
ajross 2 hours ago [-]
I said that badly. What I meant was that ASCII 0x2d is, in fact, used as the only minus sign in basically all markup and presentation layers. (Mostly because math layout tends to go through its own interpreter -- what lives in "the unicode text" is always "markup" of some kind). The unicode value is ignored AFAIK, nothing emits it or interprets it specially. That is not true of the hyphen, which does get special treatment at the presentation layer in fonts and whatnot.
lynx97 11 hours ago [-]
The "sed" expressions that power the title "cleanup" here do overshoot quite often. It ruins --long-command-arguments and it definitely also ruins cpp::namespaces. Quite curious why these obvious shortcomings are not being fixed.
tom_ 14 hours ago [-]
Pro tip: pros don't copy and paste from HN titles straight into the command line.
(Or... do they?? Hmm, ok, maybe I need to let this roll around in my mind.)
johnisgood 15 hours ago [-]
And it should be "--" to begin with, i.e. "--hard".
SoftTalker 14 hours ago [-]
Two hyphens for an en-dash, three for an em-dash.
rtpg 15 hours ago [-]
iOS keyboard autocomplete
smallerize 15 hours ago [-]
Surely it's copy and paste though?
chatmasta 15 hours ago [-]
You underestimate just how annoying iOS autocomplete can be.
christoph 11 hours ago [-]
I really don’t! I switched it all off months ago - autocomplete, autocaps, all of it. I reached a point where the constant frustration had to be worse than any productivity gain it was hoping to offer.
A few months on… I like it! Frustration is all gone, any errors are just on me now, and it forces me to slow down a bit and use the brain a bit more!
SilverElfin 14 hours ago [-]
Not just iOS but macOS too. And it seems to only get worse. And with no notice to users. And no response in their forums.
alwillis 9 hours ago [-]
I’ve been using Cotypist on macOS [1].
Sometimes it feels like it’s reading my mind when I’m typing.
oh, no, I notice, but typically not until after hitting send
chatmasta 12 hours ago [-]
I swear sometimes it doesn’t apply the corrections until I submit the form. It’s infuriating.
jonahx 15 hours ago [-]
desktop test --
0xbadcafebee 14 hours ago [-]
Article: "Major issue with most popular AI coding tool"
comments: "ThE tItLe iS aI cOded !!!1"
minitech 14 hours ago [-]
No, the comment was pointing out that the HN platform automatically replaces `--` in titles with `–`. (I don’t know if that’s true, but that was the intent. Nothing to do with AI.)
yunwal 4 hours ago [-]
Where did they say the title is ai coded?
butterlesstoast 14 hours ago [-]
The best community
layer8 9 hours ago [-]
The article is wrong and the issue is closed.
CarVac 15 hours ago [-]
double hyphens –
triple hyphens —
AnonC 9 hours ago [-]
For me on iOS:
Double hyphens —
Triple hyphens —-
Actual em dash (typed with more effort, but HN changes it) —
The triple hyphens has a gap in it separating the autocorrected en dash and the hyphen.
alcor-z 14 hours ago [-]
[dead]
mrcwinn 14 hours ago [-]
Apple actually had the nerve to make it a point to say they’d made their keyboard intelligence better. What a joke. Can’t keyboard, my ass!
Jarred 14 hours ago [-]
I spent some time investigating this, and the issue is not accurate - Claude Code itself does not have code that spawns `git reset --hard origin/main`
Most likely, the developer ran `/loop 10m <prompt>` or asked claude to create a cron task that runs every 10 minutes and refreshes & resets git.
tylerchilds 10 hours ago [-]
Probably something innocuous like
“Sync with the server periodically to get the latest”
Tracks for what we can infer
kccqzy 15 hours ago [-]
> Process monitoring at 0.1-second intervals found zero git processes around reset times.
I don’t think this is a valid way of checking for spawned processes. Git commands are fast. 0.1-second intervals are not enough. I would replace the git on the $PATH by a wrapper that logs all operations and then execs the real git.
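A minimal sketch of such a logging wrapper, in Python rather than shell. The path of the real binary (`REAL_GIT`) and the log location (`LOG_PATH`) are assumptions for illustration, and the script is assumed to be installed under the name `git` in a directory that comes earlier on $PATH than the real one:

```python
#!/usr/bin/env python3
"""Logging shim for git: record every invocation, then exec the real binary."""
import os
import shlex
import sys
import time

REAL_GIT = "/usr/bin/git"  # assumed location of the real git; must not point at this shim
LOG_PATH = os.path.expanduser("~/git-invocations.log")  # hypothetical log file


def format_entry(argv, cwd, ppid, now):
    """Build one log line: timestamp, parent PID, working directory, full command."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return f"{stamp} ppid={ppid} cwd={cwd} git {shlex.join(argv)}\n"


def main():
    args = sys.argv[1:]
    with open(LOG_PATH, "a") as log:
        log.write(format_entry(args, os.getcwd(), os.getppid(), time.time()))
    # Replace this process with the real git so exit codes and I/O pass through.
    os.execv(REAL_GIT, [REAL_GIT, *args])


# Only act when the file is actually installed under the name `git`.
if os.path.basename(sys.argv[0]) == "git":
    main()
```

The parent PID in each line is what tells you who spawned the command. A caller that invokes git by absolute path, or uses a git library in-process, would bypass this shim entirely.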
wswope 15 hours ago [-]
Sure looks to me like this whole case is Claude Code chasing its own tail, failing to debug, and offering to instead generate a bug report for the user when it can't figure out a better way forward.
Maybe even submitting the bug report "agentically" without user input, if it's running on host without guardrails (pure speculation).
This HN account is also by the same user as github, this submission may be AI created. I wonder if they've let **claw run loose over their whole online presence and this is the result.
(No need to use bpftrace, just an easy example :-) )
repiret 14 hours ago [-]
Or just `strace`.
raddan 13 hours ago [-]
Seconded. Way simpler than BPF, especially when all you want to see is syscalls.
simianwords 16 hours ago [-]
I think this post potentially mischaracterises what may be a one off issue for a certain person as if it were a broader problem. I'm guessing some context has been corrupted?
jeswin 15 hours ago [-]
It's not a one off issue - it has happened to me a few times. It has once even force pushed to github, which doesn't allow branch protection for private personal projects. Here's an example.
1) claude will stash (despite clear instructions never to do so).
2) claude will use sed to bulk replace (despite clear instructions never to do so). sed replacements make a mess and touch far too many files.
3) claude restores the stash. Finds a lot of conflicts. Nothing runs.
4) claude decides it can't fix the problem and does a reset hard.
I have this right at the top of my CLAUDE.md and it makes things better, but unlike codex, claude doesn't follow it to the letter. However, it has become a lot better now.
NEVER USE sed TO BULK REPLACE.
*NEVER USE FORCE PUSH OR DESTRUCTIVE GIT OPERATIONS*: `git push --force`, `git push --force-with-lease`, `git reset --hard`, `git clean -fd`, or any other destructive git operations are ABSOLUTELY FORBIDDEN. Use `git revert` to undo changes instead.
bschwindHN 15 hours ago [-]
When will you all learn that merely "telling" an LLM not to do something won't deterministically prevent it from doing that thing? If you truly want it to never use those commands, you better be prepared to sandbox it to the point where it is completely unable to do the things you're trying to stop.
Twirrim 14 hours ago [-]
Even worse, explicitly telling it not to do something makes it more likely to do it. It's not intelligent. It's a probability machine writ large. If you say "don't git push --force", that command is now part of the context window dramatically raising the probability of it being "thought" about, and likely to appear in the output.
Like you say, the only way to stop it from doing something is to make it impossible for it to do so. Shove it in a container. Build LLM safe wrappers around the tools you want it to be able to run so that when it runs e.g. `git`, it can only do operations you've already decided are fine.
LuxBennu 13 hours ago [-]
This is true for prohibitions but claude.md works really well as positive documentation. I run custom mcp servers and documenting what each tool does and when to use it made claude pick the right ones way more reliably. Totally different outcome than a list of NEVER DO THIS rules though, for that you definitely need hooks or sandboxing.
trenchgun 9 hours ago [-]
Yes but this is probabilistic. Skill, documentation etc help by giving it the information it needs. You are then in the more correct probability distribution. Fine for docs, tips etc, but not good enough for mandatory things.
dolmen 9 hours ago [-]
"more reliably" is still not "reliably".
juped 10 hours ago [-]
Even even worse, angry all-caps shouting will make it more stupid, because it pushes you into a significantly stupider vector subspace full of angry all-caps shouting. The only thing that can possibly save you then is if you land in the even tinier Film Crit Hulk sub-subspace.
I touch on this a bit in the piece I wrote for normies, it helped a lot of people I know understand the tech a bit better.
svnt 7 hours ago [-]
Is this true for anything beyond the simplest LLM architectures? It seems like as soon as you introduce something like CoT this is no longer the case, at least in terms of mechanism, if not outcome.
heyethan 13 hours ago [-]
Feels like a lot of people are still treating these tools like “smart scripts” instead of systems with failure modes.
Telling it not to do something is basically just nudging probabilities. If the action is available, it’s always somewhere in the distribution.
Which is why the boundary has to be outside the model, not inside the prompt.
nottorp 9 hours ago [-]
> sandbox it to the point where it is completely unable to do the things you're trying to stop
Why are permissions for these "agents" on a default allow model anyway?
mr_mitm 8 hours ago [-]
What do you mean? By default, Claude asks for permission for every file read, every edit, every command. It gets exhausting, so many people run it with `--dangerously-skip-permissions`.
dwb 8 hours ago [-]
It does not ask for permission for every file read, only those outside the project and not explicitly allowed. You can bypass project edit permission requests with “allow edits”, no need for “dangerously skip permissions”. Bash commands are harder, but you can allow-list them up to a point.
nottorp 8 hours ago [-]
> so many people run it with `--dangerously-skip-permissions`
It's on the people then, not the "agent". But why doesn't Claude come with a decent allow list, or at least remember what the user allows, so the spam is reduced?
mr_mitm 8 hours ago [-]
You have the option to "always allow command `x.*`", but even then. The more control you hand over to these things, the more powerful and useful (and dangerous) they become. It's a real dilemma and yet to be solved.
jeswin 14 hours ago [-]
My point is exactly that you need safeguards. (I have VMs per project, reduced command availability etc). But those details are orthogonal to this discussion.
However "Telling" has made it better, and generally the model itself has become better. Also, I've never faced a similar issue in Codex.
DrewADesign 15 hours ago [-]
That’s right, because we’re not developers anymore— we orchestrate writhing piles of insane noobs that generally know how to code, but have absolutely no instinct or common sense. This is because it’s cheaper per pile of excreted code while this is all being heavily subsidized. This is the future and anyone not enthusiastically onboard is utterly foolish.
biglost 15 hours ago [-]
I use a script wrapper for git in my PATH for Claude, but as you correctly said, I'm not sure Claude will never use a new zsh with a different PATH....
lambda 14 hours ago [-]
Why do you expect that a weighted random text generator will ever behave in predictable way?
How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?
This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?
I can't believe how far people have fallen for this "AI" mania. You are giving a stochastic model that is easily misdirected the keys to all of your productive work.
I can understand the appeal to a degree, that it can seem to do useful work sometimes.
But even so, you can't trust it with anything, not running it in a locked down container that has no access to anything but a Git repo which has all important history stored elsewhere seems crazy.
Shouting harder and harder at the statistical model might give you a higher probability of avoiding the bad behavior, but no guarantee; actually lock down your random text generator properly if you want to avoid it causing you problems.
And of course, given that you've seen how hard it is to get it to follow these instructions properly, you are reviewing every line of output code thoroughly, right? Because you can't trust that either.
rimunroe 13 hours ago [-]
> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?
> This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?
I don’t understand why people are so chill about doing this. I have AI running on a dedicated machine which has absolutely no access to any of my own accounts/data. I want that stuff hardware isolated. The AI pushes up work to a self-hosted Gitea instance using a low-permission account. This setup is also nice because I can determine provenance of changes easily.
ex-aws-dude 12 hours ago [-]
The answer is that for these people most of the time it looks predictable so they start to trust it
The tool is so good at mimicking that even smart people start to believe it
alwillis 9 hours ago [-]
Claude Code hooks are deterministic; the agent can’t bypass them [1].
For example, you can force a linter or tests to run.
Claude Code defaults to running in a sandbox on macOS and Linux. Claude Cowork runs in a Linux VM.
If you can't trust yourself, you will never be able to trust anyone else.
If you believe the AI is out to get you, that's certainly the reality you will manifest.
matkoniecz 11 hours ago [-]
> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?
Because it is much easier to do and failure rate is quite low.
(not saying that it is a good idea)
cruffle_duffle 13 hours ago [-]
> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?
Because it’s insanely useful when you give it access, that’s why. They can do way more tasks than just write code. They can make changes to the system, setup and configure routers and network gear, probe all the iot devices in the network, set up dns, you name it—anything that is text or has a cli is fair game.
The models absolutely make catastrophic fuckups though and that is why we’ll have to both better train the models and put non-annoying safeguards in front of them.
Running them in isolated computers that are fully air gapped, require approval for all reads and writes, and can only operate inside directories named after colors of the rainbow is not a useful suggestion. I want my cake and I want to eat it too. It’s far too useful to give these tools some real access.
It doesn’t make me naive or stupid to hand the keys over to the robot. I know full well what I’m getting myself into and the possible consequences of my actions. And I have been burned but I keep coming back because these tools keep getting better and they keep doing more and more useful things for me. I’m an early adopter for sure…
mtndew4brkfst 15 hours ago [-]
> It has once even force pushed to github, which doesn't allow branch protection for private personal projects.
This is only restricted for *fully free* accounts, but this feature only requires a minimum of a paid Pro account. That starts around $4 USD/month, which sounds worth it to prevent lost work from a runaway tool.
jeswin 14 hours ago [-]
I was on one till recently, maybe I still am. But does it work for orgs? I put some projects under orgs when they become more than a few projects.
namibj 15 hours ago [-]
That's a fee for not running a local git proxy with permissions enforcement that holds onto the GitHub credentials in place of Claude.
mikaraento 11 hours ago [-]
Do you know of a good ready-made implementation of such a proxy? I’ve been looking for one.
GitHub is also a worry in terms of exfiltration. You can’t block pushes to public repos unless you are using GitHub Enterprise Managed Users afaict.
verdverm 15 hours ago [-]
Or putting the code and .git in a sandbox without the credentials
jatora 15 hours ago [-]
Reinforcing an avoidance tactic is nowhere near as effective as doing that PLUS enforcing a positive tactic. People with loads of 'DONT', 'STOP', etc. in their instructions have no clue what they're doing.
In your own example you have all this huge emphasis on the negatives, and then the positive is a tiny un-emphasized afterthought.
refulgentis 15 hours ago [-]
I think you're generally correct, but certainly not definitively, and I worry the advice and tone isn't helpful in this instance with an outcome of this magnitude.
(more loosely: I'm a big proponent of this too, but it's a helluva hot take, how one positively frames "don't blow away the effing repo" isn't intuitive at all)
Eisenstein 8 hours ago [-]
The trick is to explain why something is important, not just to emphasize it. For instance:
"As an LLM, when Claude uses 'sed', it can quickly and easily break files that are difficult for the user to fix. Claude must be aware that an LLM's actions seem effortless to it but to the user they represent hours of work getting things back in order."
unchar1 14 hours ago [-]
Claude tends to disregard "NEVER do X" quite often, but funnily enough, if you tell it "Always ask me to confirm before doing X", it never fails to ask you. And you can deny it every time
SoftTalker 14 hours ago [-]
If it disregards "NEVER do" instructions, why would it honor your denial when it asks?
There is never a guarantee with GenAI. If you need to be sure, sandbox it.
Zetaphor 12 hours ago [-]
There are plenty of examples in the RL training showing it how and when to prompt the human for help or additional information. This is even a common tool in the "plan" mode of many harnesses.
Conversely, it's much harder to represent a lack of doing something
$ yoloai apply bugfix
Target: /home/ks/tmp/b64
Commits to apply (1):
9db260b33bcd Fix bit mask in base64 encoding
Apply to /home/ks/tmp/b64? [y/N] y
1 commit(s) applied to /home/ks/tmp/b64
Now the commit claude made inside the sandbox has been applied to my workdir:
$ git log
commit 5b0fc3a237efe8bbc9a9e1a05f9ce45d37d38bfa (HEAD -> main)
Author: Karl Stenerud <kstenerud@gmail.com>
Date: Mon Mar 30 05:28:21 2026 +0000
Fix bit mask in base64 encoding
Corrected the bit mask for the first character extraction from 0x3E to 0x3F to properly extract all 6 bits.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
commit 31e12b62b0c3179f3399521d7c4326a8f6130721 (tag: init)
The important thing here is that Claude was not able to reach anything on the network except its own API, and nothing it did ever touched my work dir until I was happy with the changes and applied them.
It also doesn't get access to my credentials, so it couldn't push even if it did have network access.
huijzer 10 hours ago [-]
> which doesn't allow branch protection for private personal projects.
Time for a personal Forgejo instance? Mine has been running great for more than a year. Faster than GitHub even.
emperorxanu 8 hours ago [-]
I don't understand how people in this day and age have not learned what the pink elephant problem is.
If you tell AI not to do something, you make it incomprehensibly more likely it will happen.
Use affirming language. Why do you think negative prompts don't exist in diffusion anymore?
DangitBobby 12 hours ago [-]
I've recently implemented hooks that make it impossible for Claude to use tools that I don't want it to use. You could consider setting up a tool that errors on any unsafe use of sed (or any use of sed if there are safer tools).
anshumankmr 12 hours ago [-]
Even just last week I auto approved a plan and it even wrote the commit message for me (with @ClaudeCode signed off) which I am grateful my manager did not see.
dolmen 9 hours ago [-]
Like for humans, teaching the good way to do things works better than forbidding a few bad behaviours.
narrator 13 hours ago [-]
Claude does not know my github ssh key. I'll do the push myself, thank you. Always good to keep around one or two really important things it can't do.
Jcampuzano2 15 hours ago [-]
Maybe stop using the CLAUDE.md to prevent it from running tools you don't want it to and just set up a hook for pretooluse that blocks any command you don't want.
It's trivial to set up and you could literally ask claude to do it for you and never have any of these issues ever again.
Any and all "I don't want it to ever run this command" issues are just skill issues.
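A rough sketch of what such a hook script could look like, in Python. It assumes the hook receives the pending tool call as JSON on stdin with `tool_name` and `tool_input.command` fields, and that exit status 2 blocks the call and passes stderr back to the model; that is how the Claude Code hooks documentation describes PreToolUse at the time of writing, but verify against the current docs. The patterns are illustrative, not a complete deny list:

```python
#!/usr/bin/env python3
"""PreToolUse hook sketch: refuse Bash tool calls that look like destructive git."""
import json
import re
import sys

# Illustrative patterns only. A regex cannot catch every spelling
# (aliases, wrapper scripts, `python -c ...`), and it can over-match,
# e.g. a commit message that merely mentions "reset --hard".
BLOCKED = [
    re.compile(r"\bgit\b.*\breset\b.*--hard\b"),
    re.compile(r"\bgit\b.*\bpush\b.*(--force|\s-f\b)"),
    re.compile(r"\bgit\b.*\bclean\b.*(\s-[a-zA-Z]*f|--force)"),
]


def blocked_reason(payload):
    """Return a message if the tool call should be refused, else None."""
    if payload.get("tool_name") != "Bash":
        return None
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED:
        if pattern.search(command):
            return f"blocked by hook: {command!r} matches {pattern.pattern!r}"
    return None


if __name__ == "__main__" and not sys.stdin.isatty():
    raw = sys.stdin.read()
    reason = blocked_reason(json.loads(raw)) if raw.strip() else None
    if reason:
        print(reason, file=sys.stderr)
        sys.exit(2)  # assumed: status 2 tells Claude Code to block the call
```

The decision is made by ordinary code outside the model, which is the point. It still only covers the spellings the patterns anticipate.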
matkoniecz 11 hours ago [-]
How does that stop Claude from removing the hook and then running the command anyway?
wzdd 11 hours ago [-]
"DO NOT, EVER, UNDER ANY CIRCUMSTANCES, think of an elephant"
nsonha 10 hours ago [-]
That's nothing like the issue of the main topic
alwillis 9 hours ago [-]
[dead]
throwaw12 16 hours ago [-]
you might be right, but consider the implications: if context can be corrupted in 0.1% of cases and it starts showing other destructive behaviour, then after creating 1000 tickets for the agent, your data might be accidentally wiped
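To put a rough number on that: if each ticket independently has a 0.1% chance of going wrong (independence is an assumption here; real failures may be correlated), the chance of at least one failure across 1000 tickets is 1 - 0.999^1000, or about 63%. A quick check in Python:

```python
def p_at_least_one_failure(p_single, n_tasks):
    """Chance that at least one of n independent tasks goes wrong."""
    return 1.0 - (1.0 - p_single) ** n_tasks


# 0.1% per-ticket failure rate over 1000 tickets
print(f"{p_at_least_one_failure(0.001, 1000):.3f}")  # prints 0.632
```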
ramses0 15 hours ago [-]
I'd been using cursor at work for a year or two now, figured I'd try it on a personal project. I got to the point where I needed to support env-vars, and my general pattern is `source ./source-me-local-auth` => `export SOME_TOKEN="$( passman read some-token.com/password )"` ...so I wrote up the little dummy script and it literally just says: "Hrm... I think I'll delete these untracked files from the working directory before committing!" ...and goes skipping merrily along its way.
Never had that experience in the whole time using cursor at work so I had to "take the agent to task" and ask it "WTF-mate? you'd better be able to repro that!" and then circle around the drain for a while getting an AGENTS.md written up. Not really a big deal, as the whole project was like 1k lines in and it's not like the code I'd hand-written there was "irreplaceable" but it led to some interesting discussion w/ the AI like "Why should I have to tell you this? Shouldn't your baseline training data presume not to delete files that you didn't author? How do you think this affects my trust not just of this agent session, but all agent interactions in the future?"
Overall, this is turning out to be quite interesting technology times we're living in.
Izkata 14 hours ago [-]
Like a decade or more ago I remember a joke system that would do something random with the data you gave it, and you'd have to use commands like "praise" and "punish" to train it to do what you wanted. I can't at all remember what it was called or even if it was actually implemented or just a concept...
joombaga 14 hours ago [-]
I would not have expected the model's baseline training data to presume not to delete files it didn't author. If the project existed before you started using the model then it would not have created any of the files, and denying the ability to delete files at all is quite restrictive. You may consider putting such files in .gitignore, which Cursor ignores by default.
eudamoniac 11 hours ago [-]
> but it led to some interesting discussion w/ the AI like...
Huh? What do you think this is accomplishing? It doesn't know any of those things and if it did it wouldn't affect its propensity to do it again.
ramses0 4 hours ago [-]
"Please summarize the essentials of this discussion in a way that future agents will understand and put it into AGENTS.md"
...and replying to a sibling; yes, I did add it to `.gitignore` (but that's not a guarantee of it going crazy again), and was super surprised that it truly deleted it rather than "safely" doing `mv ... .trash/*` or something.
The reason to dig into the agent reasoning is that I have to treat myself as if I were the one in error (which as you pointed out, I was!), and determine the cause of it along with prevention.
Again; interesting times!
throw5 15 hours ago [-]
Yes, exactly. People often overlook that, even with guardrails, it is still probabilities all the way down.
You can reduce the risk, but not drive it to zero, and at scale even very small failure rates will surface.
simianwords 15 hours ago [-]
I'm not sure what the argument is here.
1. if the problem the post is suggesting is common enough, it is a bug and the extent needs to reduce (as you said)
2. if it is not common and it happens only for this user, it is not a bug and should be mostly ignored
Point is: the system is not something that is inherently a certain way that makes it unusable.
zx8080 15 hours ago [-]
> and it happens only for this user, it is not a bug and should be mostly ignored
What if it happens for two users? (Still "not common").
Jcampuzano2 15 hours ago [-]
I mean it's a skill issue in the sense that Claude Code gives you the tools to 100% deterministically prevent this from ever happening without ever relying on the model's unpredictability.
Just set up a hook that prevents any git commands you don't ever want it to run and you will never have this happen again.
Whenever I see stuff like this I just wonder if any of these people were ever engineers before AI, because the entire point of software engineering for decades was to make processes as deterministic and repeatable as possible.
colechristensen 15 hours ago [-]
LLMs do really dumb things sometimes, that's just it.
zar1048576 15 hours ago [-]
[dead]
napierzaza 15 hours ago [-]
[dead]
thunfischtoast 8 hours ago [-]
From the issue author:
> Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.
"I built" is probably doing a lot of work here. Odds are it was some vibe-coded tool.
thunfischtoast 6 hours ago [-]
The issue and update comment are also clearly generated. I'm not condemning this in general, I prefer a well written generated issue over a badly written manual one. But in this case it has just led us off track.
pllbnk 8 hours ago [-]
The entire ticket was most likely created by Claude Code's analysis, i.e. hallucinated. Absurd.
luxurytent 16 hours ago [-]
Not sure I understand, wouldn't permissions prevent this? The user runs with `--dangerously-skip-permissions` so they can expect wild behaviour. They should run with permissions and a ruleset.
Jcampuzano2 15 hours ago [-]
You could prevent this even with --dangerously-skip-permissions with a simple pretooluse hook.
SpicyLemonZest 16 hours ago [-]
Who knows whether permissions would prevent this? Anthropic's documentation on permissions (https://code.claude.com/docs/en/permissions) does not describe how permissions are enforced; a slightly uncharitable reading of "How permissions interact with sandboxing" suggests that they are not really enforced and any prompt injection can circumvent them.
jatora 15 hours ago [-]
With hooks you can enforce permissions much more concretely.
SpicyLemonZest 13 hours ago [-]
Perhaps they're more functional. Hooks are configured in the same settings file, which makes me pretty skeptical in the absence of explicit confirmation that they represent a stronger security boundary. (But of course, this is a fundamental challenge with LLM agent security - if you're using a well-aligned model that doesn't want to be prompt injected, how do you go about auditing something like this?)
jatora 13 hours ago [-]
ya they definitely cant stop everything. nothing can be stopped if you allow python honestly, but hooks are guaranteed to fire on every tool use so you can bake in explicit rejections for different patterns based on regex which can catch a lot of nonsense
hrmtst93837 10 hours ago [-]
Running without permissions on a live repo is asking for a wipeout.
Permissions do not save you once the tool can reset the repo on a timer and the only guardrail is a prompt, because the setup already permits the dumbest failure mode. A ruleset that cannot block a hard reset is theater.
addandsubtract 16 hours ago [-]
The rules and permissions are no longer program flags, but plain text for the agent to "obey".
petcat 14 hours ago [-]
That's not what tool use permissions are. The LLM doesn't just magically spawn processes or run code. The Claude Code program itself does those things when the LLM indicates that it wants to. The program has checks and permissions whether those things will be done or not.
SpicyLemonZest 13 hours ago [-]
Claude Code has a sandboxing functionality that works the way you're describing when you opt into it, but my understanding is that the Claude Code program in the default configuration does not second-guess the LLM's decisions on what it'd like to run. Has Anthropic said something to the contrary?
lambda 14 hours ago [-]
Who would have guessed that running a binary blob dev tool, that is tied to a SaaS product, which was mostly vibe-coded, could lead to mysterious, hard to debug problems?
mememememememo 15 hours ago [-]
As a side note. Always configure remote to reject any kind of trunk push. And ideally any forced push on branches.
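On a self-hosted remote, one way to do this is a server-side `pre-receive` hook, which git feeds one `<old> <new> <ref>` line per updated ref on stdin. A minimal sketch of the decision logic follows; the protected branch names and function names are assumptions, and hosted services usually expose the same thing as branch protection settings instead.

```python
ZERO_SHA = "0" * 40  # all-zero id marks ref creation/deletion (SHA-1 repos)

PROTECTED = {"refs/heads/main", "refs/heads/master"}  # hypothetical trunk names


def parse_update_line(line: str):
    """Split one pre-receive stdin line of the form '<old> <new> <ref>'."""
    old, new, ref = line.strip().split(" ", 2)
    return old, new, ref


def should_reject(old: str, new: str, ref: str, is_ancestor) -> bool:
    """Decide whether to reject one ref update.

    `is_ancestor(a, b)` is supplied by the caller and returns True when
    commit `a` is an ancestor of `b`, for example by running
    `git merge-base --is-ancestor a b` and checking for exit status 0.
    """
    if ref in PROTECTED:
        return True  # no direct pushes to trunk at all
    if not ref.startswith("refs/heads/"):
        return False  # tags and other refs are out of scope for this sketch
    if old == ZERO_SHA or new == ZERO_SHA:
        return False  # creating or deleting a branch is not a forced update
    return not is_ancestor(old, new)  # non-fast-forward means a forced push
```

The hook itself would loop over stdin, call `should_reject` for each line, and exit non-zero if any update is rejected.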
throw5 15 hours ago [-]
This! The safeguards need to be outside LLM and they need to be deterministic.
Now I wish I could reject `git reset --hard` on my local system somehow.
0xbadcafebee 14 hours ago [-]
You could use a wrapper that parses all the command-line options. Basically you loop over "$@", look for strings starting with '-' and '--', skip those; then look for a non-option argument, store that as a subcommand; then look for more '-' and '--' options. Once that's all done you have enough to find subcommand "reset", subcommand option "--hard". About 50 lines of shell script.
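The parsing loop described above could look like this. It is a sketch in Python rather than the shell script the comment proposes, and the set of value-taking global options is an assumption that is not exhaustive.

```python
# Global git options that consume the next argument as their value.
# ASSUMPTION: illustrative subset, not a complete list.
GLOBAL_OPTS_WITH_VALUE = {"-C", "-c", "--git-dir", "--work-tree", "--namespace"}


def find_subcommand(args):
    """Return (subcommand, remaining_args) for a git argument list (without 'git')."""
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in GLOBAL_OPTS_WITH_VALUE:
            i += 2  # skip the option and its value
        elif arg.startswith("-"):
            i += 1  # skip a flag such as --no-pager or --git-dir=<path>
        else:
            return arg, args[i + 1:]
    return None, []


def is_hard_reset(args):
    """True when the arguments amount to `git reset --hard ...`."""
    subcommand, rest = find_subcommand(args)
    if subcommand != "reset":
        return False
    for arg in rest:
        if arg == "--":
            break  # everything after `--` is a pathspec, not an option
        if arg == "--hard":
            return True
    return False
```

A wrapper placed ahead of the real git on `$PATH` would refuse to continue when `is_hard_reset(sys.argv[1:])` is true and otherwise exec the real binary. This sketch does not handle user-defined aliases or abbreviated long options, so it is a speed bump rather than a guarantee.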
mememememememo 13 hours ago [-]
Sounds like you care about data stored on your filesystem! Take one step back and solve that problem. Use a proper isolated sandbox, e.g. Github workspace on an account that is working with a fork.
Care about the data in that workspace? Push it first.
Otherwise it is a cat-and-mouse game of whack-a-mole.
throw5 13 hours ago [-]
Does any of this help me if Claude runs `git reset --hard`?
If I am working in a sandbox, I have uncommitted changes in a sandbox and if Claude runs `git reset --hard` on those uncommitted changes in the sandbox, I've got the same problem?
> Care about the data in that workspace? Push it first.
But you're changing the problem. If I push everything, then yeah I've got no problem. But between pushing one change and the next, you're gonna have uncommitted changes, won't you? and if Claude runs `git reset --hard` at that time, same problem, isn't it?
mememememememo 11 hours ago [-]
Ok I contest. If you are worried about it resetting its own work then yes. Although just chuck the same prompt and you should get a similar result amirite? Maybe a better one lol!
Also you can instruct it to commit and push at every step too.
niek_pas 8 hours ago [-]
Can’t you just run Claude in a copy of the directory without the .git folder?
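A minimal sketch of that copy step using only the Python standard library. The function name and the paths in the usage note are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_without_git(src: str, dst: str) -> Path:
    """Copy a working tree to `dst`, leaving out any entry named `.git`.

    `dst` must not exist yet; shutil.copytree creates it.
    """
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(".git"))
    return Path(dst)
```

For example, `copy_without_git("myrepo", "/tmp/myrepo-scratch")`. The trade-off is that the agent's changes then have to be carried back into the real repository by hand, for instance by diffing the two trees.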
namibj 15 hours ago [-]
Just fork git and patch that out?
Can't be that hard just ask the agent for that patch.
Don't need to update often either, so it's ok to rebase like twice a year.
nstj 14 hours ago [-]
As an FYI you can recover from force pushes to GitHub using their UI[0] or their API[1].
And if you force push to one of your own machines you can use the reflog[2].
Regardless of whether this is common, it's getting popular because it's objectively hilarious and we can all see it being possible.
oelmgren 16 hours ago [-]
I'm curious how common this is or if this just affects this one user.
pattilupone 15 hours ago [-]
I opened up Hacker News and I saw this right at the top, and I assumed it had started happening to everyone. I thought, good thing I'm not running Claude Code right now.
treesknees 14 hours ago [-]
I thought, good thing I've already hit my 5-hour session limit.
1123581321 12 hours ago [-]
This looks similar to a bug report Claude Code offered to file for me after it became confused about my shell environment. The author is probably running something (maybe /loop as suggested in the comment.) In my case, a restart fixed the envs.
jrvarela56 16 hours ago [-]
It’s a feature not a bug!
ghelmer 16 hours ago [-]
That is not my experience.
phyzome 15 hours ago [-]
It's an issue title. It means "this is what is happening for me".
gerdesj 16 hours ago [-]
Which is what?
Traubenfuchs 15 hours ago [-]
For him, Claude Code does NOT run git reset --hard origin/main against project repo every 10 minutes.
I just checked, mine also doesn‘t.
whateveracct 16 hours ago [-]
that must be a very powerful claude.md
mmaunder 10 hours ago [-]
Can we immunize HN against being yet another AI drama site? Obviously this isn’t a fundamental issue with agents or AI or Anthropic but a misconfiguration edge case.
nerolawa 14 hours ago [-]
Highly recommend denying commands like `git reset` in your user settings.json.
agent_anuj 11 hours ago [-]
I'll give you my personal experience. I use it for everything: design, coding, testing, deploying to a Kubernetes cluster, fixing issues on the cluster. I use it to fix not only dev env issues but also production issues. Confidently. Have things gone wrong? Sure. But mistakes have been rare (and catastrophic, non-recoverable mistakes even rarer).
Every time a mistake has happened, on digging in I was always able to trace it back to something I did wrong - either being careless in reading what it told me, or careless in telling it what I want. I have had git code corruption issues; it overwrote uncommitted working code with non-working code. But it was my mistake not to tell it to commit the code before making changes. It deleted the QA cluster database, but only because I told it to delete it, thinking it was my dev setup DB. Net net: its mistakes are more a reflection of me as its supervisor than anything else.
chaos_emergent 15 hours ago [-]
Have you considered that Claude set up a crontab that does that programmatically? Every 10 mins seems awfully, idk, regular.
smallerize 15 hours ago [-]
But different projects are being reset at different times.
PufPufPuf 11 hours ago [-]
That's consistent with /loop command.
jxcole 15 hours ago [-]
The obvious solution is to just copy paste it into Claude itself and ask it to fix. Works for almost any Claude problem
rkrbaccord94f 13 hours ago [-]
95+ entries that are logged at 10 min intervals
`*/10 * * * * /usr/` schedules script execution
simonw 13 hours ago [-]
Has anyone been able to replicate the behavior described in this issue yet?
newfriend 8 hours ago [-]
>Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.
meander_water 16 hours ago [-]
Probably does it to reduce context for regex/git history searches
Some people are upset at my brave new world characterization, but yeah even as someone deriving value from Claude Code we've jumped the shark on AI in development.
Either the industry will face that reality and recalibrate, or in 20 years we're going to look back on these days like the golden age of software reliability and just accept that software is significantly more broken than it was (we've been priming ourselves for that after all)
mhitza 16 hours ago [-]
People aren't upset about your characterization. Catchphrases, memes, and other low-quality comments (with no context, elaboration, or personal angle) are contrary to the community ethos and get downvoted.
BoorishBears 16 hours ago [-]
This would be a more substantive comment if you also addressed the topic at hand as I did, rather than regurgitating the rules of the site.
bonoboTP 16 hours ago [-]
I agree that it's worrying that we're moving more and more towards implicit and opaque state. Hiding what exactly is getting edited, very limited tooling to check what the subagents are doing exactly, setting up scheduled and recurring tasks without it being obvious etc.
It's tending more and more towards pushing the user to treat the whole thing as a pure chat interface magic black box, instead of a rich dashboard that allows you to keep precise track of what's going on and giving you affordances to intervene. So less a tool view and more magic agent, where the user is not supposed to even think about what the thing is even doing. Just trust the process. If you want to know what it did, just ask it. If you want to know if it deleted all the files, just ask it in the chat. Or don't. Caring about files is old school. Just care about the chat messages it sends you.
3eb7988a1663 15 hours ago [-]
It does make WH40k seem more plausible. Tech priests praying to the capricious machine spirit to just please do the thing.
BoorishBears 14 hours ago [-]
Here in SF I talk to people all day who see this as a feature, not a bug, and that's the persona Claude Code and Codex are selling to.
It started being proposed as a thought experiment "why should we care about the files if AI is going to do the edits", then as Opus got better and the hype built up, the rhetorical part of that dropped and now there are plenty of people who swear they don't write code at all anymore and don't see why anyone would.
I think we're in a feedback loop caused by the fact you can totally get away with not writing code anymore for some reasonably complex topics. But that doesn't account for the long term maintainability of the result, and it doesn't account for people who think they're not writing code, but are relying heavily on the fact we haven't fully magicked away the actual code. They're watching the agents like a hawk, doing small bits and pieces at a time, hitting stop when it starts thinking about the wrong thing, etc.
My worry is the market taking the wrong lesson out of the trends and prematurely trying to force the agent-first future well before the tools or the people are ready.
viccis 16 hours ago [-]
Feels like just yesterday that everyone agreed that critical code is read orders of magnitude more than written, so optimizing for quick writing is wrong.
californical 16 hours ago [-]
Genuinely I think that perspective is still shared by many/most engineers.
I think we’ve seen a wave of bad actors - either employees of LLM companies, or bots - pushing the idea hard of code quality not mattering and “the models will improve so fast that your code quality degrading doesn’t matter”.
I think the humans pushing that idea may even believe it, but I don’t think they’re usually employed as software engineers at regular non-AI companies, rather they have some incentive to believe it and convince others as well
lqstuart 14 hours ago [-]
if an idea can't be vibecoded in under 10 minutes, it's not worth pursuing. Checks out
Ryand1234 13 hours ago [-]
This is exactly why guardrails need to be deterministic and outside the model.
gverrilla 14 hours ago [-]
obviously a user mistake, not a claude code bug
dboreham 14 hours ago [-]
But it doesn't.
TZubiri 16 hours ago [-]
tbf, that's claude's workspace
do not share a workspace with the llm, or with anybody for that matter.
How would the LLM even distinguish what was written by it from what was written by you?
irishcoffee 15 hours ago [-]
I’m having this weird vision of a “the matrix 3” type machine crawling around inside Microsoft’s GitHub servers central repository and just wreaking havoc.
This whole LLM thing is a blast, huh?
nickphx 16 hours ago [-]
cool. if you choose to use a non-deterministic black box of bullshit, should you really be surprised when it shits all over your floor?
gpm 15 hours ago [-]
The weird part is that it's "shitting over the floor" in quite a deterministic manner. Every 600 seconds (+- less than 0.5 seconds) doing the exact same thing.
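That regularity is easy to check mechanically. Here is a small sketch that tests whether a list of event timestamps (in seconds) is spaced at a fixed period within a tolerance; the function name is hypothetical, and the defaults simply mirror the 600-second and 0.5-second figures quoted above.

```python
def is_periodic(timestamps, period=600.0, tolerance=0.5):
    """True when every gap between consecutive timestamps is within
    `tolerance` seconds of `period`. Needs at least two timestamps."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)
```

Gaps that tight point at a timer (cron, a polling loop, a scheduled task) rather than at sampling randomness in the model.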
morganastra 16 hours ago [-]
the purpose of a system is what it does!
gerdesj 15 hours ago [-]
non sequitur.
coffeeboy27 15 hours ago [-]
The person who posted this bug doesn't seem like the pinnacle of software engineering. To me, this looks like either a user error or some corrupt file or context you should be able to clean up pretty quickly.
You reap what you sow, finance bro.
draw_down 16 hours ago [-]
Hope they don’t auto-close this one in two weeks
claudiug 16 hours ago [-]
no more developers, all code is written alone /s
Tomis02 8 hours ago [-]
All code is deleted alone
jerukmangga 15 hours ago [-]
yes sir
fragmede 16 hours ago [-]
While that's obviously a bug which should be fixed, having stuff just sitting around uncommitted for days (which is much longer than 10 mins) is an anti-pattern (that I used to fall into).
BoorishBears 16 hours ago [-]
Truly is a brave new world we're in
-
I guess some people are upset at my brave new world characterization, but even as someone deriving value from Claude Code we've jumped the shark on AI in development.
The idea a natural request can get Claude to invoke potentially destructive actions on a timer is silly
What would it cost if the /loop command was required instead of optional?
throw5 16 hours ago [-]
Isn't this a natural consequence of how these systems work?
The model is probabilistic and sequences like `git reset --hard` are very common in training data, so they have some probability to appear in outputs.
Whether such a command is appropriate depends on context that is not fully observable to the system, like whether a repository or changes are disposable or not. Because of that, the system cannot rely purely on fixed rules and has to figure intent from incomplete information, which is also probabilistic.
With so many layers of probabilities, it seems expected that sometimes commands like this will be produced even if they are not appropriate in that specific situation.
Even a 0.01% failure rate due to context corruption, misinterpretation of intent, or guardrail errors would show up regularly at scale; that is 1 in 10,000 queries.
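To put a number on "regularly at scale": assuming each query fails independently with the same probability (a simplifying assumption, not something the thread establishes), the chance of seeing at least one failure in n queries is 1 - (1 - p)^n.

```python
def p_at_least_one(p_failure: float, n_queries: int) -> float:
    """Probability of one or more failures in n independent queries."""
    return 1.0 - (1.0 - p_failure) ** n_queries
```

With p = 0.0001, `p_at_least_one(0.0001, 10_000)` comes out at roughly 0.63, so even a single batch of ten thousand queries is more likely than not to contain a failure.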
simianwords 16 hours ago [-]
That's not how the systems work. Just by a thing being common in training data doesn't mean it will be produced.
> I guess, what I'm trying to say ... is this even a bug? Sounds like the model is doing exactly what it is designed to do.
False, it goes against the RL/HF and other post training goals.
throw5 16 hours ago [-]
> Just by a thing being common in training data doesn't mean it will be produced.
That's not what I said at all. I never said it will be produced. I said there is some probability of it being produced.
> False, it goes against the RL/HF and other post training goals.
It is correct that frequency in training data alone does not determine outputs, and that post-training (RLHF, policies, etc.) is meant to steer the model away from undesirable behavior.
But those mechanisms do not make such outputs impossible. They just make them less likely. The underlying system is still probabilistic and operating with incomplete context.
I am not sure how you can be so confident that a probabilistic model would never produce `git reset --hard`. There is nothing inherent in how LLMs work that makes that sequence impossible to generate.
simianwords 16 hours ago [-]
It is meaningless to say that because the author was able to reproduce it multiple times.
throw5 16 hours ago [-]
> It is meaningless to say that because the author was able to reproduce it multiple times.
I don't know how that refutes what I'm saying.
The behaviour was reproduced multiple times, so it is clearly an observable outcome, not a one-off. It just shows that the probability of `git reset --hard` is > 0 even with RLHF and post-training.
simianwords 15 hours ago [-]
If it reliably reproduces something undesirable with statistical significance, then it is a bug. It can be fixed with RLHF.
throw5 15 hours ago [-]
Yes, if something is reproducible and undesirable, it is a bug and RLHF can reduce it. I'm not disputing that. "Reduce" is the keyword here. You can't eliminate them entirely.
My point is that fixing one bug does not eliminate the class of bugs. Heck, it does not even fix that one bug deterministically. You only reduce its probability like you rightly said.
With git commands, there is no system like Lean that can formally reject invalid proofs. Really, I think the mathematicians have it easier with LLMs because a proof is either valid or invalid. It's not so clear-cut with git commands. Almost any command can be valid in some narrow context, which makes it much harder to reject undesirable outputs entirely.
Until the underlying probabilities of undesirable output become negligible so much that they become practically impossible, these kinds of issues will keep surfacing even if you address individual bugs. Will the probabilities become so low someday that these issues are practically impossible? Maybe. But we are not there yet. Until then, we should recalibrate our expectations and rely on deterministic safeguards outside the LLM.
jgammell 14 hours ago [-]
When sampling from an LLM people normally truncate the token probability distribution so that low-probability tokens are never sampled. So the model shouldn't produce really weird outputs even if they technically have nonzero probability in the pre/post training data.
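The truncation being described is usually top-k or top-p (nucleus) sampling. Below is a toy sketch of the top-p variant over a made-up four-entry vocabulary. The function name and probabilities are invented for illustration, and real samplers work on token ids and logits rather than whole command strings.

```python
def top_p_filter(probs: dict, top_p: float = 0.95) -> dict:
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches `top_p`, then renormalize what is left."""
    kept = {}
    total = 0.0
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    for token, p in ranked:
        kept[token] = p
        total += p
        if total >= top_p:
            break  # everything less likely than this is never sampled
    return {token: p / total for token, p in kept.items()}
```

With `{"status": 0.6, "diff": 0.3, "log": 0.09, "reset --hard": 0.01}` and `top_p=0.95`, the last entry is dropped. The filter only helps when the unwanted continuation really is in the low-probability tail for that context; if the context makes it likely, truncation keeps it.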
boutell 16 hours ago [-]
That's interesting man, that's pretty f***' interesting. I don't think I've seen it though. I've let it run for hours making changes overnight and I only do git operations manually.
Oh, but maybe allowing it to do remote git operations is a necessary trigger.
No, it doesn't? This seems like crazy talk to me, like "If you're picking a substitute for saffron, blood plasma makes more sense than monocrystalline silicon". Like, what?
It makes zero sense to substitute this at all. It's exactly what it says it is, the "--hard" command line option to "git reset", and you write it in exactly one way.
> [1] Not strictly a hyphen, which has its own unicode point (0x2010) outside of ascii. Unicode embraced the ambiguity by calling this point (0x2d) "HYPHEN-MINUS" formally, but really its only unique typographic usage is to represent subtraction.
Strictly, its as you note, the hyphen-minus, and Unicode has separate, disambiguated code points for both hyphen (0x2010) and minus (0x2212); hyphen-minus has no "unique typographic usage".
(Or... do they?? Hmm, ok, maybe I need to let this roll around in my mind.)
A few months on… I like it! Frustration is all gone, any errors are just on me now, and it forces me to slow down a bit and use the brain a bit more!
Sometimes it feels like it’s reading my mind when I’m typing.
[1]: https://cotypist.app/
comments: "ThE tItLe iS aI cOded !!!1"
triple hyphens —
Double hyphens —
Triple hyphens —-
Actual em dash (typed with more effort, but HN changes it) —
The triple hyphens has a gap in it separating the autocorrected en dash and the hyphen.
Most likely, the developer ran `/loop 10m <prompt>` or asked claude to create a cron task that runs every 10 minutes and refreshes & resets git.
“Sync with the server periodically to get the latest”
Tracks for what we can infer
I don’t think this is a valid way of checking for spawned processes. Git commands are fast. 0.1-second intervals are not enough. I would replace the git on the $PATH by a wrapper that logs all operations and then execs the real git.
Maybe even submitting the bug report "agentically" without user input, if it's running on host without guardrails (pure speculation).
E: It's a runaway bot lol https://github.com/anthropics/claude-code/issues/40701#issue...
(No need to use bpftrace, just an easy example :-) )
1) claude will stash (despite clear instructions never to do so).
2) claude will use sed to bulk replace (despite clear instructions never to do so). sed replacements make a mess and change far too many files.
3) claude restores the stash. Finds a lot of conflicts. Nothing runs.
4) claude decides it can't fix the problem and does a reset hard.
I have this right at the top of my CLAUDE.md and it makes things better, but unlike codex, claude doesn't follow it to the letter. However, it has become a lot better now.
NEVER USE sed TO BULK REPLACE.
*NEVER USE FORCE PUSH OR DESTRUCTIVE GIT OPERATIONS*: `git push --force`, `git push --force-with-lease`, `git reset --hard`, `git clean -fd`, or any other destructive git operations are ABSOLUTELY FORBIDDEN. Use `git revert` to undo changes instead.
Like you say, the only way to stop it from doing something is to make it impossible for it to do so. Shove it in a container. Build LLM safe wrappers around the tools you want it to be able to run so that when it runs e.g. `git`, it can only do operations you've already decided are fine.
I touch on this a bit in the piece I wrote for normies, it helped a lot of people I know understand the tech a bit better.
Telling it not to do something is basically just nudging probabilities. If the action is available, it’s always somewhere in the distribution.
Which is why the boundary has to be outside the model, not inside the prompt.
Why are permissions for these "agents" on a default allow model anyway?
It's on the people then, not the "agent". But why doesn't Claude come with a decent allow list, or at least remember what the user allows, so the spam is reduced?
However "Telling" has made it better, and generally the model itself has become better. Also, I've never faced a similar issue in Codex.
How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?
This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?
I can't believe how far people have fallen for this "AI" mania. You are giving a stochastic model that is easily misdirected the keys to all of your productive work.
I can understand the appeal to a degree, that it can seem to do useful work sometimes.
But even so, you can't trust it with anything, not running it in a locked down container that has no access to anything but a Git repo which has all important history stored elsewhere seems crazy.
Shouting harder and harder at the statistical model might give you a higher probability of avoiding the bad behavior, but no guarantee; actually lock down your random text generator properly if you want to avoid it causing you problems.
And of course, given that you've seen how hard it is to get it follow these instructions properly, you are reviewing every line of output code thoroughly, right? Because you can't trust that either.
> This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?
I don’t understand why people are so chill about doing this. I have AI running on a dedicated machine which has absolutely no access to any of my own accounts/data. I want that stuff hardware isolated. The AI pushes up work to a self-hosted Gitea instance using a low-permission account. This setup is also nice because I can determine provenance of changes easily.
The tool is so good at mimicking that even smart people start to believe it
For example you force a linter to run or for tests to run.
Claude Code defaults to running in a sandbox on macOS and Linux. Claude Cowork runs in a Linux VM.
[1]: https://code.claude.com/docs/en/hooks-guide
If you can't trust yourself, you will never be able to trust anyone else.
If you believe the AI is out to get you, that's certainly the reality you will manifest.
Because it is much easier to do and failure rate is quite low.
(not saying that it is a good idea)
Because it’s insanely useful when you give it access, that’s why. They can do way more tasks than just write code. They can make changes to the system, setup and configure routers and network gear, probe all the iot devices in the network, set up dns, you name it—anything that is text or has a cli is fair game.
The models absolutely make catastrophic fuckups though and that is why we’ll have to both better train the models and put non-annoying safeguards in front of them.
Running them in isolated computers that are fully air gapped, require approval for all reads and writes, and can only operate inside directories named after colors of the rainbow is not a useful suggestion. I want my cake and I want to eat it too. It’s far too useful to give these tools some real access.
It doesn’t make me naive or stupid to hand the keys over to the robot. I know full well what I’m getting myself into and the possible consequences of my actions. And I have been burned but I keep coming back because these tools keep getting better and they keep doing more and more useful things for me. I’m an early adopter for sure…
This is only restricted for *fully free* accounts, but this feature only requires a minimum of a paid Pro account. That starts around $4 USD/month, which sounds worth it to prevent lost work from a runaway tool.
GitHub is also a worry in terms of exfiltration. You can’t block pushes to public repos unless you are using GitHub Enterprise Managed Users afaict.
In your own example you have all this huge emphasis on the negatives, and then the positive is a tiny un-emphasized afterthought.
(more loosely: I'm a big proponent of this too, but it's a helluva hot take, how one positively frames "don't blow away the effing repro" isn't intuitive at all)
"As an LLM, when Claude used 'sed', it can quickly and easily break files that are difficult for the user to fix. Claude must be aware that an LLM's actions seem effortless to it but to the user it represents hours of work getting things back in order."
There is never a guarantee with GenAI. If you need to be sure, sandbox it.
Conversely, it's much harder to represent a lack of doing something
Now I interact with the agent, and when it's done:
Looks good, let's apply it: Now the commit claude made inside the sandbox has been applied to my workdir: The important thing here is that Claude was not able to reach anything on the network except its own API, and nothing it did ever touched my work dir until I was happy with the changes and applied them. It also doesn't get access to my credentials, so it couldn't push even if it did have network access.
Time for a personal Forgejo instance? Mine has been running great for more than a year. Faster than GitHub even.
If you tell AI not to do something, you make it incomprehensibly more likely it will happen.
Use affirming language. Why do you think negative prompts don't exist in diffusion anymore?
It's trivial to set up, and you could literally ask Claude to do it for you and never have any of these issues again.
Any and all "I don't want it to ever run this command" issues are just skill issues.
Never had that experience in the whole time using Cursor at work so I had to "take the agent to task" and ask it "WTF-mate? you'd better be able to repro that!" and then circle around the drain for a while getting an AGENTS.md written up. Not really a big deal, as the whole project was like 1k lines in and it's not like the code I'd hand-written there was "irreplaceable" but it led to some interesting discussion w/ the AI like "Why should I have to tell you this? Shouldn't your baseline training data presume not to delete files that you didn't author? How do you think this affects my trust not just of this agent session, but all agent interactions in the future?"
Overall, this is turning out to be quite interesting technology times we're living in.
Huh? What do you think this is accomplishing? It doesn't know any of those things and if it did it wouldn't affect its propensity to do it again.
...and replying to a sibling; yes, I did add it to `.gitignore` (but that's not a guarantee of it going crazy again), and was super surprised that it truly deleted it rather than "safely" doing `mv ... .trash/*` or something.
The reason to dig into the agent reasoning is that I have to treat myself as if I were the one in error (which as you pointed out, I was!), and determine the cause of it along with prevention.
Again; interesting times!
You can reduce the risk, but not drive it to zero, and at scale even very small failure rates will surface.
1. if the problem the post is suggesting is common enough, it is a bug and the extent needs to reduce (as you said)
2. if it is not common and it happens only for this user, it is not a bug and should be mostly ignored
Point is: the system is not something that is inherently a certain way that makes it unusable.
What if it happens for two users? (Still "not common").
Just set up a hook that prevents any git commands you don't ever want it to run and you will never have this happen again.
Whenever I see stuff like this I just wonder if any of these people were ever engineers before AI, because the entire point of software engineering for decades was to make processes as deterministic and repeatable as possible.
> Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.
https://github.com/anthropics/claude-code/issues/40710#issue...
[0]: https://stackoverflow.com/a/78872853 [1]: https://stackoverflow.com/a/48110879 [2]: https://stackoverflow.com/a/24236065
I just checked, mine also doesn‘t.
Everytime a mistake has happened,on diggin in I was always trace it back to something which I did wrong - either being careless in reading what it told me , or careless in telling what I want. I have had git code corruption issues, it overwrote uncommited working code with non working code. But it was my mistake to not tell it to commit the code before makign changes. It deleted QA cluster database but becuase I told it to delete it thinking it was my dev setup db. Net net. It;s mistakes are more a reflection of me as its supervisor than anything else.
/10 * * * /usr/ schedules script execution
Some people are upset at my brave new world characterization, but yeah even as someone deriving value from Claude Code we've jumped the shark on AI in development.
Either the industry will face that reality and recalibrate, or in 20 years we're going to look back on these days like the golden age of software reliability and just accept that software is significantly more broken than it was (we've been priming ourselves for that after all)
It's tending more and more towards pushing the user to treat the whole thing as a pure chat interface magic black box, instead of a rich dashboard that allows you to keep precise track of what's going on and giving you affordances to intervene. So less a tool view and more magic agent, where the user is not supposed to even think about what the thing is even doing. Just trust the process. If you want to know what it did, just ask it. If you want to know if it deleted all the files, just ask it in the chat. Or don't. Caring about files is old school. Just care about the chat messages it sends you.
It started being proposed as a thought experiment "why should we care about the files if AI is going to do the edits", then as Opus got better and the hype built up, the rhetorical part of that dropped and now there are plenty of people who swear they don't write code at all anymore and don't see why anyone would.
I think we're in a feedback loop caused by the fact you can totally get away with not writing code anymore for some reasonably complex topics. But that doesn't account for the long term maintainability of the result, and it doesn't account for people who think they're not writing code, but are relying heavily on the fact we haven't fully magicked away the actual code. They're watching the agents like a hawk, doing small bits and pieces at a time, hitting stop when it starts thinking about the wrong thing, etc.
My worry is the market taking the wrong lesson out of the trends and prematurely trying to force the agent-first future well before the tools or the people are ready.
I think we’ve seen a wave of bad actors - either employees of LLM companies, or bots - pushing the idea hard of code quality not mattering and “the models will improve so fast that your code quality degrading doesn’t matter”.
I think the humans pushing that idea may even believe it, but I don’t think they’re usually employed as software engineers at regular non-AI companies, rather they have some incentive to believe it and convince others as well
do not share a workspace with the llm, or with anybody for that matter.
How would the llm even distinguish what was written by it and what was written by you?
This whole LLM thing is a blast, huh?
You reap what you sow, finance bro.
-
I guess some people are upset at my brave new world characterization, but even as someone deriving value from Claude Code we've jumped the shark on AI in development.
The idea that a natural request can get Claude to invoke potentially destructive actions on a timer is silly.
https://code.claude.com/docs/en/scheduled-tasks#set-a-one-ti...
What would it cost if the /loop command was required instead of optional?
The model is probabilistic and sequences like `git reset --hard` are very common in training data, so they have some probability to appear in outputs.
Whether such a command is appropriate depends on context that is not fully observable to the system, like whether a repository or changes are disposable or not. Because of that, the system cannot rely purely on fixed rules and has to figure intent from incomplete information, which is also probabilistic.
With so many layers of probabilities, it seems expected that sometimes commands like this will be produced even if they are not appropriate in that specific situation.
Even a 0.01% failure rate due to context corruption, misinterpretation of intent, or guardrail errors would show up regularly at scale; that is 1 in 10,000 queries.
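To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes each query fails independently with the same probability, which is a simplification, and the query volumes are hypothetical, not measured figures. Under that assumption the chance of at least one failure in n queries is 1 - (1 - p)^n.

```python
def prob_at_least_one_failure(p: float, n: int) -> float:
    """Chance that at least one of n independent queries fails, each with probability p."""
    return 1.0 - (1.0 - p) ** n


def expected_failures(p: float, n: int) -> float:
    """Average number of failures across n independent queries."""
    return p * n


if __name__ == "__main__":
    p = 0.0001  # the 0.01% rate from the comment above
    for n in (1_000, 10_000, 1_000_000):  # hypothetical query volumes
        print(n, round(prob_at_least_one_failure(p, n), 4), expected_failures(p, n))
```

At 10,000 queries the chance of at least one failure is already around 63%, and at a million queries you would expect roughly 100 of them.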
> I guess, what I'm trying to say ... is this even a bug? Sounds like the model is doing exactly what it is designed to do.
False, it goes against the RL/HF and other post training goals.
That's not what I said at all. I never said it will be produced. I said there is some probability of it being produced.
> False, it goes against the RL/HF and other post training goals.
It is correct that frequency in training data alone does not determine outputs, and that post-training (RLHF, policies, etc.) is meant to steer the model away from undesirable behavior.
But those mechanisms do not make such outputs impossible. They just make them less likely. The underlying system is still probabilistic and operating with incomplete context.
I am not sure how you can be so confident that a probabilistic model would never produce `git reset --hard`. There is nothing inherent in how LLMs work that makes that sequence impossible to generate.
I don't know how that refutes what I'm saying.
The behaviour was reproduced multiple times, so it is clearly an observable outcome, not a one-off. It just shows that the probability of `git reset --hard` is > 0 even with RLHF and post-training.
My point is that fixing one bug does not eliminate the class of bugs. Heck, it does not even fix that one bug deterministically. You only reduce its probability like you rightly said.
With git commands, there is nothing like Lean that can formally reject an invalid output. Really, I think the mathematicians have got it easier with LLMs because a proof is either valid or invalid. It's not so clear cut with git commands. Almost any command can be valid in some narrow context, which makes it much harder to reject undesirable outputs entirely.
Until the underlying probabilities of undesirable output become so negligible that such output is practically impossible, these kinds of issues will keep surfacing even if you address individual bugs. Will the probabilities become so low someday that these issues are practically impossible? Maybe. But we are not there yet. Until then, we should recalibrate our expectations and rely on deterministic safeguards outside the LLM.
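As a rough illustration of what a deterministic safeguard outside the LLM could look like, here is a sketch of a deny-list check that a wrapper might run before executing any shell command the model proposes. Everything here is hypothetical: the `DESTRUCTIVE` table and the `is_blocked` function are made up for this example and are not any real tool's hook API. It only matches exact flags, so combined short flags like `-fd` slip through, and it fails closed on input it cannot parse.

```python
import shlex

# Hypothetical deny-list: git subcommand -> flags that make it destructive.
DESTRUCTIVE = {
    "reset": {"--hard"},
    "clean": {"-f", "--force"},
    "checkout": {"-f", "--force"},
    "push": {"-f", "--force"},
}


def is_blocked(command: str) -> bool:
    """Return True if the command line looks like a destructive git call."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparsable input (e.g. unbalanced quotes): fail closed
    for i, tok in enumerate(tokens):
        flags = DESTRUCTIVE.get(tok)
        # Block when a listed subcommand follows "git" and carries a listed flag.
        if flags and "git" in tokens[:i] and flags.intersection(tokens[i + 1:]):
            return True
    return False


if __name__ == "__main__":
    for cmd in ("git status", "git reset --hard origin/main"):
        print(cmd, "->", "blocked" if is_blocked(cmd) else "allowed")
```

The point is that the check gives the same answer every time for the same input, no matter what the model thought the intent was. A blocked command can still go to a human for confirmation, for the narrow contexts where it is actually valid.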
Oh, but maybe allowing it to do remote git operations is a necessary trigger.