Security model

Each guarantee below is paired with a pointer to where it's enforced in code, and with the test that asserts the behaviour.

AI-assisted authorship. Most of this codebase was drafted by Claude Code under human review (see "About this codebase" in the project README). The security-critical invariants below — the method allowlist, the no-shell guarantee, path confinement, bounded reads — were specified by the human author and are enforced by tests that fail loudly if any new code violates them. If you're doing an independent security audit, treat that as additional motivation, not as reassurance: read the code, not just the docstrings.

Threat model

Trusted: the user running the local MCP client, the SSH keys they hold, the remote user accounts they can already log into, and SSH transport security.

Untrusted:

The MCP client (an LLM) — may be prompted into requesting malicious operations or path traversals.
Arbitrary input to any tool — paths, snapshot names, datasets, search patterns.
Snapshot contents — files inside a snapshot may be symlinks, FIFOs, or crafted to mislead path resolution.

Out of scope:

Defending against a malicious operator who already has shell access on the remote host. This tool exposes a subset of what they can already do.
Defending against compromise of the SSH key material.

Guarantees

G1: No mutation operations by default; restore is opt-in per host

The agent dispatches RPCs through an explicit METHODS allowlist in agent/zfs_snoop_agent.py. Any method not in the dict returns JSON-RPC Method not found (-32601). Adding a method requires editing the agent source — there is no runtime knob that adds methods to the dispatcher.
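The allowlist pattern described above can be sketched as follows. This is a minimal illustration, assuming a plain dict keyed by method name; the `dispatch` function and the single example handler are hypothetical and not copied from agent/zfs_snoop_agent.py. Only the `METHODS` name and the -32601 error code come from the text.

```python
from typing import Any, Callable


def agent_info(params: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical handler used only for this sketch.
    return {"name": "example-agent"}


# The allowlist: a method is reachable only if it is a key in this dict.
METHODS: dict[str, Callable[[dict[str, Any]], Any]] = {
    "agent_info": agent_info,
}

METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0 "Method not found"


def dispatch(request: dict[str, Any]) -> dict[str, Any]:
    """Route one JSON-RPC request through the allowlist."""
    req_id = request.get("id")
    method = request.get("method")
    # Non-string method names are treated the same as unknown ones.
    handler = METHODS.get(method) if isinstance(method, str) else None
    if handler is None:
        return {
            "jsonrpc": "2.0",
            "id": req_id,
            "error": {"code": METHOD_NOT_FOUND, "message": "Method not found"},
        }
    return {"jsonrpc": "2.0", "id": req_id, "result": handler(request.get("params") or {})}
```

Because lookup is a dict access on a module-level constant, there is no code path in this sketch that registers a method at runtime.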

Read-only methods (the entire surface for any host that hasn't opted in to restore): agent_info, list_pools, pool_status, list_datasets, dataset_properties, list_snapshots, snapshot_cadence, diff_snapshots, list_dir, size_breakdown, top_consumers, read_file, find_files, content_grep, file_history, versions_of, file_diff, snapshots_containing, first_appearance, last_appearance, find_deleted, bisect_change, stale_snapshots, size_delta, checksum_file, fetch_file, fetch_dir (the fetch_* pair copies out to your workstation; the remote host's live filesystem is untouched).

Writable methods (v0.4.0+): restore_file, restore_dir. These are the only methods that write to the host's live filesystem, and they are gated server-side on per-host config: the server's restore_file / restore_dir tools refuse before invoking the agent unless the host has allow_restore = true and a non-empty restore_paths allowlist in hosts.toml. A host whose config predates these settings is unaffected and keeps the read-only surface. See G7 for the target-path bounds.
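The server-side gate can be pictured with a short sketch. It assumes the per-host table from hosts.toml has already been parsed into a dict; the function name `restore_enabled` and the dict shape are assumptions, while the `allow_restore` and `restore_paths` keys are the ones named above.

```python
from typing import Any


def restore_enabled(host_cfg: dict[str, Any]) -> bool:
    """True only if the host opted in AND supplied a non-empty allowlist."""
    # Require the literal boolean True, so a stray string such as "yes" does not opt in.
    opted_in = host_cfg.get("allow_restore", False) is True
    paths = host_cfg.get("restore_paths") or []
    return opted_in and len(paths) > 0
```

A host entry that sets neither key evaluates to False here, which matches the default-off behaviour described above.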

Tests: test_methods_table_contains_no_mutating_zfs_operations asserts no mutating ZFS subcommand (e.g. destroy, rollback, set, clone) ever leaks into the agent's dispatch table — restore methods use shutil, not zfs, and are application-level operations on a different layer. test_methods_table_is_what_we_expect pins the exact set including the two restore methods, so adding or removing a method is a deliberate, reviewed change. test_restore_file_rejects_when_allow_restore_disabled asserts the server gating works.

G2: No shell interpretation of user input

Every external command is invoked via subprocess.run([...], shell=False) with an explicit argv list (agent.run_zfs). Tool inputs that become argv elements are validated before the call:

Dataset names match ^[A-Za-z0-9_][A-Za-z0-9_.:/-]*$.
Snapshot names match the same plus @<snap-part>.
Tested by test_validate_dataset_rejects_invalid / test_validate_snapshot_rejects_invalid.
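A minimal sketch of validate-then-exec, under stated assumptions. The dataset pattern is the one quoted above; the pattern for the part after `@`, the helper names, and the particular `zfs list` invocation are illustrative choices and may differ from what agent.run_zfs does.

```python
import re
import subprocess

# Dataset pattern quoted from the text. The first character cannot be "-",
# so a validated name can never be parsed as a command-line option.
DATASET_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.:/-]*")
# Assumption: the snapshot part uses the same character set minus "/".
SNAP_PART_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.:-]*")


def validate_dataset(name: str) -> str:
    # fullmatch() is used because "$" in Python also matches before a trailing newline.
    if not isinstance(name, str) or not DATASET_RE.fullmatch(name):
        raise ValueError(f"invalid dataset name: {name!r}")
    return name


def validate_snapshot(name: str) -> str:
    if not isinstance(name, str) or name.count("@") != 1:
        raise ValueError(f"invalid snapshot name: {name!r}")
    dataset, snap = name.split("@")
    validate_dataset(dataset)
    if not SNAP_PART_RE.fullmatch(snap):
        raise ValueError(f"invalid snapshot name: {name!r}")
    return name


def list_snapshots_argv(dataset: str) -> list[str]:
    # Each user-supplied value is its own argv element; nothing is joined into a shell string.
    return ["zfs", "list", "-H", "-t", "snapshot", "-o", "name", "-r", validate_dataset(dataset)]


def run_argv(argv: list[str], timeout: float = 30.0) -> subprocess.CompletedProcess:
    # shell=False plus a list means no shell ever parses the arguments.
    # TimeoutExpired is raised if the child exceeds the wall-time budget.
    return subprocess.run(
        argv, shell=False, capture_output=True, text=True, timeout=timeout, check=False
    )
```

With this shape, an input such as `tank; rm -rf /` is rejected by the validator, and even if it were not, it would reach zfs as one literal argument rather than as shell syntax.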

The local transport also uses an argv list for ssh, with the remote shell command produced via shlex.quote() per token.
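As a rough illustration of that transport-side quoting, the sketch below builds an ssh argv from a list of remote tokens. The function name and the use of `--` before the host are assumptions made for this example; `shlex.quote()` targets POSIX shells, so the sketch also assumes a POSIX login shell on the remote side.

```python
import shlex


def build_ssh_argv(host: str, remote_argv: list[str]) -> list[str]:
    # ssh hands its command string to the remote login shell,
    # so every token is quoted individually before joining.
    remote_cmd = " ".join(shlex.quote(tok) for tok in remote_argv)
    # "--" ends option parsing, so a host string starting with "-" is not read as an ssh flag.
    return ["ssh", "--", host, remote_cmd]
```

Splitting the resulting command string with `shlex.split()` yields the original tokens again, which is a convenient property to assert in a test.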

fetch_file / fetch_dir copy snapshot data with sftp in batch mode (sftp -b -), feeding a single get line whose paths are quoted for sftp's own client-side lexer (_sftp_quote). sftp speaks the SFTP protocol and never invokes a remote shell or word-splits the remote path, so snapshot filenames containing spaces, glob characters, or shell metacharacters are transferred verbatim with no injection surface. (Earlier releases shelled out to scp host:path. Under the legacy SCP protocol a remote shell expanded that path, but scp's newer SFTP backend takes it literally, so the shlex.quote() previously applied to the path became part of the filename and broke any filename containing a space; see CHANGELOG 0.3.1.)

G3: Path inputs cannot escape their snapshot root

For any operation that takes a (snapshot, path), the agent (agent.resolve_under_snapshot):

Rejects absolute paths and any .. segment up front.
Resolves the joined path with Path.resolve(strict=False) — which follows symlinks — and verifies it stays inside realpath(snapshot_root).
Returns the unresolved path so callers can lstat() the final component to detect a symlink without following it.

read_file and list_dir then refuse to follow a final-component symlink at all; symlinks are reported with their target string as data. Tests: test_resolve_rejects_dotdot_traversal, test_resolve_rejects_symlink_that_escapes, test_read_file_refuses_to_follow_symlink, test_list_dir_reports_symlink_without_following.
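The three steps above can be sketched as follows. This is an illustration under assumptions, not the agent's code: the exception type, the `is_final_symlink` helper, and the exact containment test are hypothetical, while the use of `Path.resolve(strict=False)` and the return of the unresolved path follow the description.

```python
import os
import stat
from pathlib import Path, PurePosixPath


class PathEscapeError(ValueError):
    """Raised when a requested path would leave the snapshot root."""


def resolve_under_snapshot(snapshot_root: str, rel_path: str) -> Path:
    pure = PurePosixPath(rel_path)
    # Step 1: reject absolute paths and any ".." segment before touching the filesystem.
    if pure.is_absolute() or ".." in pure.parts:
        raise PathEscapeError(f"illegal path: {rel_path!r}")
    root = Path(os.path.realpath(snapshot_root))
    candidate = root / rel_path
    # Step 2: resolve() follows symlinks; the result must still sit under the real root.
    resolved = candidate.resolve(strict=False)
    if resolved != root and root not in resolved.parents:
        raise PathEscapeError(f"path escapes snapshot root: {rel_path!r}")
    # Step 3: hand back the unresolved path so callers can lstat() the last component.
    return candidate


def is_final_symlink(path: Path) -> bool:
    # lstat() inspects the link itself rather than its target.
    return stat.S_ISLNK(os.lstat(path).st_mode)
```

A caller in the style of read_file would check `is_final_symlink()` on the returned path and report the link target as data instead of opening it.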

G4: All reads are bounded

Operation	Limit
`read_file`	`max_bytes` (caller-provided, server-capped at 4 MiB)
`list_dir`	`max_entries` (default 1000, server-capped at 10 000)
`size_breakdown`	`max_entries` (default 100 000, server-capped at 1 000 000); 30 s wall time
`find_files`	`max_results` (default 100, server-capped at 1000)
`content_grep`	`max_results` (default 100, server-capped at 1000)
`file_diff`	`max_bytes` per side (default 1 MiB, server-capped at 4 MiB)
`versions_of`	`max_bytes` per snapshot read (default 1 MiB, server-capped at 4 MiB)
`find_deleted`	`max_results` (default 1000, server-capped at 10 000)
`top_consumers`	`n` heap size (default 20, server-capped at 1000); same `max_entries` walk cap as `size_breakdown`; 30 s wall time
`stale_snapshots`	`max_results` (default 1000, server-capped at 10 000)
`bisect_change`	`max_bytes` per predicate read (default 1 MiB, server-capped at 4 MiB); evaluates O(log N) snapshots
Per zfs subprocess	30 s wall time, enforced via `subprocess.run(timeout=)`
Transport recv	60 s wall time, enforced in `AgentConnection._recv`

Exceeding a size limit truncates the response and sets truncated: true rather than failing. Tested by test_list_dir_truncates_at_max_entries, test_find_files_truncates, and test_read_file_falls_back_to_base64_for_binary (covers max_bytes).
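A minimal sketch of truncate-instead-of-fail for a bounded file read. The 4 MiB cap and the `truncated` flag come from the text above; the function name, the other response fields, and the base64 fallback shape are assumptions inferred from the test name mentioned there.

```python
import base64

SERVER_MAX_BYTES = 4 * 1024 * 1024  # 4 MiB server cap from the table above


def read_bounded(path: str, max_bytes: int) -> dict:
    # Clamp the caller's request to the server cap (and to a non-negative value).
    limit = max(0, min(int(max_bytes), SERVER_MAX_BYTES))
    with open(path, "rb") as fh:
        # Reading one extra byte reveals whether the file continues past the limit.
        data = fh.read(limit + 1)
    truncated = len(data) > limit
    data = data[:limit]
    try:
        return {"encoding": "utf-8", "content": data.decode("utf-8"), "truncated": truncated}
    except UnicodeDecodeError:
        # Binary data, or a multi-byte character cut by the limit, falls back to base64.
        return {
            "encoding": "base64",
            "content": base64.b64encode(data).decode("ascii"),
            "truncated": truncated,
        }
```

The memory used by one call is bounded by the cap plus one byte regardless of file size, which is the property the limit exists to provide.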

G5: Defence in depth via ZFS delegation (user mode)

In the default user mode, the remote account is expected to hold only the diff ZFS delegation (see INSTALL). Even if the agent were compromised, it could not destroy, snapshot, mount, or send any dataset through zfs(8).

In sudo mode the agent runs as root and this defence does not apply. The allowlist (G1) and the no-shell guarantee (G2) are the remaining lines of defence; mutation operations are still not in the dispatch table. See "Sudo mode tradeoff" below.

G6: All structured logs go to stderr, never stdout

stdout is reserved for JSON-RPC frames. Any log message, debug output, or unexpected stderr from a child process is captured and forwarded as a structured field in the JSON-RPC error response, not interleaved with the wire protocol.
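One way to picture this separation is shown below. The sketch assumes newline-delimited JSON frames and a `data.stderr` field in the error object; the framing, the field name, the logger name, and both helper functions are assumptions made for illustration. The `data` member itself is an optional part of a JSON-RPC 2.0 error object, and -32000 lies in the range reserved for implementation-defined server errors.

```python
import json
import logging
import sys
from typing import Any, Optional, TextIO

# Logs go to stderr only; propagate=False keeps root-logger handlers out of the picture.
log = logging.getLogger("agent-sketch")
log.addHandler(logging.StreamHandler(sys.stderr))
log.propagate = False


def send_frame(message: dict[str, Any], out: TextIO = sys.stdout) -> None:
    # Exactly one JSON document per line on the wire, then flush.
    out.write(json.dumps(message, separators=(",", ":")) + "\n")
    out.flush()


def error_frame(req_id: Optional[int], code: int, message: str, child_stderr: str = "") -> dict[str, Any]:
    # Child-process stderr travels inside the structured error, not as raw text on stdout.
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "error": {"code": code, "message": message, "data": {"stderr": child_stderr}},
    }
```

For example, `send_frame(error_frame(7, -32000, "zfs failed", "cannot open 'tank/x'"))` emits a single parseable line on stdout, while `log.warning(...)` calls never touch that stream.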

G7: Restore targets are bounded by an operator path allowlist

The two writable methods (restore_file, restore_dir, v0.4.0+) write to the host's live filesystem. They are off by default and, when enabled per host (allow_restore = true), the operator must also provide a non-empty restore_paths allowlist of absolute path prefixes. The server's _validate_restore_target enforces, in order:

target_path must be a string starting with / (relative paths rejected).
NUL / newline / carriage-return characters are rejected (same helper as _reject_batch_breaking_chars used by the sftp fetch path).
The path is canonicalised with Path.resolve(strict=False) — collapsing .. and following existing symlinks — before the allowlist and denylist checks. So /srv/../etc/passwd and a symlink whose target escapes the allowlist are rejected here, not silently restored to the resolved path.
Universal denylist (always denied regardless of the operator's allowlist): paths under /proc/, /sys/, or /dev/, and any path containing /.zfs/snapshot/. Kernel virtual filesystems aren't sane restore targets and writes there can have side effects far beyond a normal file; snapshots are read-only in ZFS anyway, but rejecting explicitly gives a clearer error.
Operator allowlist: the canonical path must lie under one of the restore_paths prefixes. Prefixes are trailing-slash normalised on both sides so /srv/foobar is not a false-positive match for /srv/foo/.
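The ordered checks above can be sketched like this. It is a hedged illustration rather than the server's `_validate_restore_target`: the function name, the exception type, and the choice to treat a target equal to an allowlist prefix as allowed are assumptions; the order of checks and the denylist entries follow the list.

```python
from pathlib import Path

DENIED_PREFIXES = ("/proc/", "/sys/", "/dev/")


class RestoreTargetError(ValueError):
    """Raised when a restore target fails validation."""


def _with_slash(path: str) -> str:
    # Trailing-slash normalisation, so "/srv/foobar" cannot match the prefix "/srv/foo".
    return path.rstrip("/") + "/"


def validate_restore_target(target_path: object, restore_paths: list[str]) -> Path:
    # 1. Must be an absolute path string.
    if not isinstance(target_path, str) or not target_path.startswith("/"):
        raise RestoreTargetError("target_path must be an absolute path")
    # 2. No NUL, newline, or carriage return.
    if any(ch in target_path for ch in ("\x00", "\n", "\r")):
        raise RestoreTargetError("target_path contains a control character")
    # 3. Canonicalise: collapses ".." and follows existing symlinks.
    canonical = Path(target_path).resolve(strict=False)
    probe = _with_slash(str(canonical))
    # 4. Universal denylist, checked before the operator allowlist.
    if probe.startswith(DENIED_PREFIXES) or "/.zfs/snapshot/" in probe:
        raise RestoreTargetError(f"denied restore target: {canonical}")
    # 5. Operator allowlist on the canonical path.
    for prefix in restore_paths:
        if probe.startswith(_with_slash(prefix)):
            return canonical
    raise RestoreTargetError(f"target outside restore_paths: {canonical}")
```

Running the denylist before the allowlist means that even an allowlist entry of `/` cannot open up `/proc/`, `/sys/`, or `/dev/` in this sketch.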

The agent re-applies the universal denylist + path-shape invariants (_validate_restore_target_agent_side) as belt-and-braces — a bug or bypass on the server side can't make the agent overwrite something pathological. The agent does NOT know the operator's per-host allowlist; that boundary stays exclusively in the server.

Restore-specific behaviour also enforced server-side:

restore_file refuses if the target is an existing directory (a typo guard: replacing a directory tree with a single file is nearly always unintentional). restore_dir symmetrically refuses if the target is an existing regular file.
overwrite=False (default) refuses any existing target. With overwrite=True and backup=True, the server precomputes backup_path = <target>.zsnoop-backup-<UTC-isoformat> and the agent atomically renames the existing target to that backup path before writing — so a wrong restore is reversible.
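The backup-then-write step might look roughly like the sketch below. The `.zsnoop-backup-` suffix comes from the text; the exact timestamp format (here `datetime.isoformat()` on a UTC-aware value), both function names, and the use of exclusive-create mode for the new file are assumptions.

```python
import os
from datetime import datetime, timezone
from typing import Optional


def backup_path_for(target: str, now: Optional[datetime] = None) -> str:
    # Assumption: the timestamp is a UTC-aware datetime rendered with isoformat().
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return f"{target}.zsnoop-backup-{stamp}"


def replace_with_backup(target: str, new_bytes: bytes, backup_path: str) -> None:
    if os.path.lexists(target):
        # rename() is atomic on POSIX when source and destination share a filesystem.
        os.rename(target, backup_path)
    # "x" mode fails instead of overwriting if something reappeared at the target path.
    with open(target, "xb") as fh:
        fh.write(new_bytes)
```

Because the backup lives next to the target, the rename stays on one filesystem, and undoing a wrong restore is a single rename back.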

Tested by: test_restore_file_rejects_when_allow_restore_disabled, test_restore_file_rejects_target_outside_allowlist, test_restore_file_canonicalises_dotdot_before_allowlist_check, test_restore_file_rejects_kernel_virtual_fs_denylist, test_restore_file_rejects_zfs_snapshot_substring, test_restore_file_belt_and_braces_rejects_denied_prefix (agent side), plus the config tests (test_allow_restore_requires_non_empty_restore_paths, test_restore_paths_entries_must_be_absolute).

Sudo mode tradeoff

Sudo mode is opt-in per host and exists to support the legitimate use case of reading files in root-owned system datasets (e.g., /etc/foo from a snapshot of rpool/ROOT/debian). In sudo mode:

The agent process is uid 0 on the remote host.
POSIX read restrictions no longer protect any file.
ZFS delegation is irrelevant; the agent could in principle invoke any zfs(8) subcommand. The allowlist (G1) still blocks this in the dispatch table, but the only line of defence against a code bug or compromised agent source is the allowlist itself, not the kernel.
The trust boundary effectively becomes: anything that can put a malicious payload into stdin (the JSON-RPC stream) or into the agent source at bootstrap time has root on the remote host.

Use sudo mode only on hosts where you already trust the SSH user with root (via sudo), and only when you need to read root-owned snapshot files. Keep user mode for everything else.

Known limitations

The local MCP server does not currently verify host keys beyond what OpenSSH itself does. Use a properly populated ~/.ssh/known_hosts.
The bootstrap-on-connect path sends the agent source over SSH on every fresh connection. This is the same trust boundary as git clone over ssh: if the remote is compromised, it can run whatever it likes regardless of what you send it. The agent source is not confidential.
A malicious snapshot containing a path component longer than PATH_MAX may cause path resolution to fail; this is reported as an error and does not crash the agent.

Reporting a vulnerability

Preferred: open a private vulnerability report via the GitHub Security Advisory tab on the repository: https://github.com/hamsolodev/zsnoop-mcp/security/advisories/new. This keeps the report confidential and pre-fills the CVE workflow.

Alternative: email zsnoop-mcp.happiest328@passmail.net with the subject [zsnoop-mcp] security. Use this if you don't have a GitHub account, or for a quick "I'm not sure if this is a vulnerability" check.

Please don't open public issues for security reports — both channels above keep the discussion private until a fix lands and a coordinated disclosure window has passed.