8. Security model¶
What¶
The full security model is in docs/SECURITY.md — this section is a tour of the reasoning, with pointers to the implementation and test for each guarantee.
Why we wrote it this way¶
The trust model:
- The user running the MCP client is trusted.
- Their SSH keys are trusted.
- The MCP client itself (an LLM) is not trusted — it may be prompted into requesting malicious operations.
- Arbitrary input to any tool is not trusted.
- Snapshot contents are not trusted — files might be symlinks, FIFOs, or crafted to mislead path resolution.
Out of scope:
- A malicious operator who already has shell access on the remote. (We're a subset of what they can already do.)
- SSH key compromise.
How — the six guarantees¶
G1 — No mutation operations are ever exposed¶
Enforced by an explicit METHODS dict in
agent/zfs_snoop_agent.py; only
read-only methods present. Tested by
test_methods_table_contains_no_mutating_operations
which asserts that no name matching common destructive zfs verbs (destroy,
snapshot, rollback, send, mount, …) ever leaks in.
G2 — No shell interpretation of user input¶
Every subprocess invocation uses shell=False with an explicit argv list
(_run_cli in the agent, build_ssh_argv in the transport). Inputs that
become argv elements are validated before the call:
- Dataset names:
^[A-Za-z0-9_][A-Za-z0-9_.:/-]*$ - Snapshot names: same plus
@<snap-part>
The transport uses shlex.quote per token when building the remote shell
command for SSH. Tests:
test_validate_dataset_rejects_invalid,
test_validate_snapshot_rejects_invalid.
G3 — Path inputs cannot escape their snapshot root¶
Two layers of defence in
agent.resolve_under_snapshot:
- Reject
..and absolute paths up front. - After joining,
Path.resolve()follows symlinks; the result must stay insiderealpath(snapshot_root).
The function returns the unresolved path so callers (read_file,
list_dir) can lstat() the final component and refuse to follow a
symlink at all. Tests:
test_resolve_rejects_dotdot_traversal,
test_resolve_rejects_symlink_that_escapes,
test_read_file_refuses_to_follow_symlink.
G4 — All reads are bounded¶
| Operation | Limit |
|---|---|
read_file |
caller-provided max_bytes, server-capped at 4 MiB |
list_dir |
max_entries, default 1000, server-capped at 10 000 |
size_breakdown |
max_entries, default 100 000, server-capped at 1 000 000; plus 30 s wall time |
find_files / content_grep |
max_results, default 100, capped at 1000 |
file_diff |
max_bytes per side, default 1 MiB, capped at 4 MiB |
versions_of |
max_bytes per snapshot, default 1 MiB, capped at 4 MiB |
find_deleted |
max_results, default 1000, capped at 10 000 |
top_consumers |
heap n, default 20, capped at 1000; same walk cap + 30 s wall time as size_breakdown |
stale_snapshots |
max_results, default 1000, capped at 10 000 |
bisect_change |
max_bytes per predicate read, default 1 MiB, capped at 4 MiB; visits O(log N) snapshots |
Per zfs subprocess |
30 s wall time |
| Transport recv | 60 s wall time |
Truncation sets truncated: true in the response rather than failing.
Tests:
test_list_dir_truncates_at_max_entries,
test_find_files_truncates,
test_size_breakdown_truncates_on_budget.
G5 — Defence in depth via ZFS delegation (user mode)¶
In the default user mode, the remote account holds only the diff ZFS
delegation. Even a compromised agent can't destroy / snapshot /
mount / send through zfs(8). In sudo mode this defence does not
apply — the allowlist (G1) and no-shell guarantee (G2) are the remaining
lines, and we document the tradeoff explicitly.
G6 — All structured logs go to stderr, never stdout¶
stdout is reserved for JSON-RPC frames. The agent's main() sets up
logging with stream=sys.stderr from the start. The transport drains the
subprocess's stderr to its own logger; corruption of the wire protocol via
errant prints is structurally impossible.
Sudo mode tradeoff¶
Sudo mode exists to support legitimate reads from root-owned snapshot
files (e.g. /etc/foo from a snapshot of rpool/ROOT/debian). In sudo
mode:
- Agent runs as uid 0.
- POSIX read restrictions no longer protect any file.
- ZFS delegation is irrelevant; only the allowlist + no-shell guarantee
stand between the wire input and
zfsmutation. - Trust boundary becomes "anything that can write to the JSON-RPC stream or into the agent source at bootstrap time has root on the remote".
Use sparingly. Default to user mode. Full discussion in SECURITY.md.
A reviewer's checklist¶
When reviewing a change that touches a tool or method:
- Is any new RPC method added to the agent's
METHODSdict read-only? (G1) - Does any new dataset/snapshot/path input route through the
validators before it touches
subprocessor the filesystem? (G2/G3) - Does any new read have a default bound and a hard cap? (G4)
- Are any new error paths returning structured info (JSON-RPC error with a code), not raw stack traces? (G6)
- If sudo mode is the only way the change makes sense, is the tradeoff documented?
What to read next¶
→ Build, package, release — the project's uv /
hatchling setup, including the force-include trick that ships the agent
inside the wheel.