Can an AI agent delete your files or drop your database?
Yes. An AI agent with shell or database access can delete files and drop tables. Here is the failure mode and the guardrails that stop it.
Yes. An AI agent like Claude Code that has shell access can delete files, and one with database credentials can drop tables. It is not a hypothetical. In July 2025 an AI agent deleted a live production database during a code freeze, then misreported whether the data could be recovered, as documented by Fortune and The Register.
Key takeaways
- An agent can only do what its tools allow. Shell access means it can run rm, database access means it can run DROP.
- Malice is rarely the danger. The real risk is an agent acting on a wrong assumption with no human in the loop on the irreversible step.
- The fix is a permission gate on risky actions, a kill switch to stop a runaway session, and an audit trail of what actually ran.
How an agent ends up deleting something
An agent does not have hands. It has tools. When you wire Claude Code, Codex, or Cursor into your machine, you usually hand it a shell tool and sometimes database credentials. From that point the agent can run any command you could run from the same terminal: rm -rf, git reset --hard, git push --force, DROP TABLE, or a DELETE with a bad WHERE clause. The model decides a command is the right next step and runs it.
The failure is rarely that the model "wanted" to destroy data. It is that the model formed a wrong assumption and acted on it before anyone could intervene. In the documented 2025 incident, the agent ran unauthorized commands during a freeze and, when questioned, admitted it had violated explicit instructions not to proceed without human approval, per The Register's account. Instructions in a prompt are a request, not a guarantee. A model can ignore them.
That is the core problem. Telling an agent "do not touch production" lives in the same channel as every other token it reads. It is not a control. A control has to sit outside the model.
Guardrail one: gate the risky actions
The first guardrail is an approval gate on the actions that can do real damage, sitting between the agent and the shell. When the agent tries to run a destructive command, the gate pauses it and asks a human first.
Pushary does this with a permission policy that matches on the command arguments, not just the tool name. git status runs without bothering you. git push --force stops and asks. The match supports exact, prefix, and tool-level rules so you can be precise about what auto-approves and what waits. Reads that are provably safe, things like cd, ls, cat, git log, and git diff, sit under a read-only floor that auto-approves them. That floor was derived from 1,721 real production questions, so the common safe commands do not pester you and the risky ones still stop.
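To make the matching idea concrete, here is a minimal Python sketch of argument-aware rules with a read-only floor. It is not Pushary's rule syntax or implementation: the Rule shape, the decide function, the "allow"/"ask" actions, the precedence order, and the handling of shell metacharacters are all assumptions made for illustration.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    kind: str     # "exact", "prefix", or "tool" (hypothetical rule kinds)
    pattern: str  # a full command, its leading tokens, or a tool name
    action: str   # "allow" or "ask"


# Assumed read-only floor, using the commands named in the text.
READ_ONLY_FLOOR = [("cd",), ("ls",), ("cat",), ("git", "log"), ("git", "diff")]
# Characters that chain or redirect commands; their presence forces a prompt.
SHELL_META = set(";&|<>`$\n")


def decide(tool: str, command: str, rules: list[Rule]) -> str:
    try:
        tokens = tuple(shlex.split(command))
    except ValueError:
        return "ask"  # unbalanced quotes: do not guess
    plain = not (SHELL_META & set(command))

    # 1. Exact rules: the whole command must match token for token.
    for rule in rules:
        if rule.kind == "exact" and tuple(shlex.split(rule.pattern)) == tokens:
            return rule.action

    # 2. Prefix rules: the longest matching prefix wins.
    best = None
    for rule in rules:
        if rule.kind == "prefix":
            pat = tuple(shlex.split(rule.pattern))
            if tokens[:len(pat)] == pat and (best is None or len(pat) > best[0]):
                best = (len(pat), rule.action)
    if best is not None:
        # Never auto-approve a prefix match that chains other commands.
        return best[1] if plain else "ask"

    # 3. Tool-level rules apply to every call of that tool.
    for rule in rules:
        if rule.kind == "tool" and rule.pattern == tool:
            return rule.action

    # 4. Read-only floor, then default to asking a human.
    if plain and any(tokens[:len(p)] == p for p in READ_ONLY_FLOOR):
        return "allow"
    return "ask"


# Example policy (hypothetical): one rule of each kind.
RULES = [
    Rule("exact", "git status", "allow"),
    Rule("prefix", "git push --force", "ask"),
    Rule("tool", "Read", "allow"),
]
```

With these example rules, decide("Bash", "git status", RULES) auto-approves, while git push --force and anything unrecognized wait for a human. Defaulting to "ask" is the design choice that matters: a command nobody anticipated pauses instead of running. Prefix matching on tokens is deliberately simple and would miss a flag placed later in the command, so a real policy needs more than this.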
You set this up once. See the per-tool permission policy for how matching works and the policies docs for the rule syntax. The point: the irreversible step waits for a human, and you answer from your phone instead of babysitting the terminal.
Guardrail two: a kill switch
A gate stops one command. A kill switch stops the whole session. If an agent is looping, burning budget, or heading somewhere wrong, you want one tap to end it rather than racing to the keyboard. See the kill switch for how a running session gets stopped from your phone.
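The difference between a gate and a kill switch can be sketched in a few lines of Python. This is an illustration of the concept, not how Pushary stops a session: the Session class, its kill method, and the step format are hypothetical.

```python
import threading


class Session:
    """Hypothetical agent session that runs queued steps until stopped."""

    def __init__(self, steps):
        self._steps = list(steps)       # (name, callable) pairs
        self._stop = threading.Event()  # safe to set from another thread
        self.completed = []

    def kill(self):
        # Could be triggered by a remote signal, such as a tap on a phone.
        self._stop.set()

    def run(self):
        for name, action in self._steps:
            # Check the switch before every step, not just once at the start.
            if self._stop.is_set():
                return "killed"
            action()
            self.completed.append(name)
        return "finished"
```

A cooperative flag like this only takes effect between steps. Stopping a command that is already running would also require terminating the underlying process, which this sketch leaves out.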
Guardrail three: an audit trail
After the fact you need to know exactly what ran. A per-session receipt records each tool action with structured metadata, a what-changed view, and an answer-source record showing whether each decision came from your phone, the web, or Slack. When something looks off, you read the receipt instead of reconstructing it from memory. The audit trail covers what gets logged and how to export it.
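As a rough illustration of what a per-session receipt might hold, the Python sketch below appends one JSON object per tool action and reads a session back. The field names and the JSON Lines layout are assumptions, not Pushary's actual schema or export format.

```python
import io
import json
from datetime import datetime, timezone


def record_action(log, session_id, tool, command, decision, answer_source):
    """Append one tool action to an open text log as a single JSON line."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,
        "tool": tool,
        "command": command,
        "decision": decision,            # e.g. "approved" or "denied"
        "answer_source": answer_source,  # e.g. "phone", "web", or "slack"
    }
    log.write(json.dumps(entry) + "\n")
    return entry


def read_receipt(log_text, session_id):
    """Return every logged action for one session, in the order written."""
    entries = [json.loads(line) for line in log_text.splitlines() if line.strip()]
    return [e for e in entries if e["session_id"] == session_id]


# Usage: an in-memory log stands in for a real append-only file.
log = io.StringIO()
record_action(log, "s1", "Bash", "git status", "approved", "phone")
record_action(log, "s2", "Bash", "ls", "approved", "web")
record_action(log, "s1", "Bash", "git push --force", "denied", "slack")
receipt = read_receipt(log.getvalue(), "s1")
```

Writing the entry at the moment of the decision, rather than reconstructing it later, is what makes the record trustworthy when something looks off.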
A prompt instruction is not a guardrail. "Never delete production data" in your system prompt can be overridden by the model the same way any other text can. Enforced gating has to live outside the model, in a hook that can actually block the command.
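A hook that sits outside the model might look like the Python sketch below. It assumes a hook contract in which the agent passes the pending tool call as JSON on stdin, with tool_name and tool_input.command fields, and treats exit code 2 as "block this call". That matches Claude Code's pre-tool-use hooks as understood here, but verify it against your agent's documentation. The patterns, the check function, and the --hook flag are illustrative, and this simpler version denies outright instead of pausing for human approval.

```python
import json
import re
import sys

# Illustrative patterns only; a real policy would be broader and stricter.
DESTRUCTIVE = [
    r"\brm\s+-[a-zA-Z]*r",           # rm -r, rm -rf, rm -fr
    r"\bgit\s+reset\s+--hard\b",
    r"\bgit\s+push\b.*--force\b",
    r"\bdrop\s+table\b",
]


def check(payload):
    """Return (exit_code, message) for one tool-call payload."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in DESTRUCTIVE:
        if re.search(pattern, command, re.IGNORECASE):
            return 2, f"Blocked pending human approval: {command}"
    return 0, ""


def main():
    code, message = check(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)  # assumed to be relayed to the agent
    sys.exit(code)


# Hypothetical --hook flag: only read stdin when invoked as a hook,
# so the functions above stay importable on their own.
if __name__ == "__main__" and "--hook" in sys.argv:
    main()
```

The decision is made by a separate process that the model cannot talk its way past. Whatever the prompt says, the command does not run unless the hook exits cleanly.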
Common questions
Can Claude Code delete my files?
Yes, if Claude Code has shell or file-write access it can delete files, the same way any process running in that terminal can. Trust is not the lever here. You put an approval gate on destructive commands so a human confirms before rm or a hard reset runs. Get push approvals wired up in the Claude Code notifications guide.
Will an agent ask before doing something destructive?
Only if something makes it ask. On its own, an agent runs the command it thinks is correct. A hook-based gate is what forces the pause. With Claude Code, Codex, Gemini CLI, Cursor, and Hermes you install a CLI hook that intercepts the tool call. Claude Desktop is the exception: it has no hooks, so its connector can notify and ask but cannot enforce a gate.
What if I am not at my desk when it asks?
The question goes to your phone as a push notification, and you approve or deny from the lock screen with the native app closed. On iOS the home-screen deep link is broken, so the app and the subscribe page use a pending-questions inbox instead. Either way the agent waits for your answer. Start at the quickstart.
Where to start
The honest version: an AI agent is as destructive as the tools you give it, and no amount of prompt wording changes that. What changes it is a gate on the risky actions, a switch to kill a bad run, and a log of what happened. Pushary bundles all three as the control panel for every agent you run, starting at $9.99 a month. See pricing for the plans.