MCP Security · April 20, 2026 · Rhumb · 6 min read

MCP has a security model

The protocol is not the security model. The security model is what the server enforces when the model is wrong. Scope, principals, and post-call evidence are the security model.

Scope

Tool parameters must be bounded at execution time (prefix allowlists, typed inputs, URL host/IP policy, write fences).

Principals

Every caller should run with scoped identity and scoped credentials, not a shared admin backend.

Evidence

If you cannot prove what happened after the call, you do not have a control plane; you have a claim.

The operator test

The useful question is not "does this MCP server work?" It is "what happens when one tool call is compromised?"

The current cluster of security findings is concrete: unconstrained string parameters, path traversal in filesystem tools, SSRF in remote connectors, sandbox bypasses, and weak write fences. None of this requires mystical threat modeling. It requires server-side boundaries.

1. Scope is the boundary that matters

The moment you expose a tool that can read arbitrary files, write arbitrary paths, fetch arbitrary URLs, or mutate broad resources, you have already made prompt injection a production incident class. The fix is not a better prompt. The fix is an allowlist.

  • Constrain filesystem tools to explicit prefixes and resolve paths safely.
  • Constrain network tools to explicit host and resolved-IP allowlists, not open URLs.
  • Constrain write tools to explicit targets and require typed, schema-validated inputs.
  • Make empty allowlists fail closed. A blank policy should not silently mean every network, path, or caller is allowed.
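
The fail-closed rule in the last bullet can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `HostPolicy`, its `permits` method, and the example hostnames are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostPolicy:
    """Hypothetical policy object: the hosts a fetch tool may contact."""
    allowed_hosts: frozenset = frozenset()

    def permits(self, host: str) -> bool:
        # Fail closed: an empty allowlist denies everything instead of
        # being read as "no restriction configured".
        if not self.allowed_hosts:
            return False
        # Normalize case and a trailing dot before comparing.
        return host.lower().rstrip(".") in self.allowed_hosts


empty = HostPolicy()
docs_only = HostPolicy(frozenset({"docs.example.com"}))
```

With this shape, `empty.permits(...)` is false for every host, so a deployment that forgets to configure the policy denies traffic rather than opening it.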

Endpoint families need the same rule. If `/mcp` is authenticated but the message endpoint, streaming endpoint, or callback endpoint can invoke tools without the same check, the security model is whatever the weakest path permits. The operator test is endpoint-auth parity: every route that can trigger a tool call must enforce the same principal and scope policy.
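
One way to get endpoint-auth parity is to make registration itself apply the check, so a route cannot invoke a tool without passing through it. The sketch below assumes a hypothetical in-process router; `tool_route`, `authorize`, `Denied`, and the `/messages` path are illustrative names, not part of any MCP SDK.

```python
from typing import Callable, Dict

TOOL_ROUTES: Dict[str, Callable] = {}


class Denied(Exception):
    """Typed denial raised before any tool handler runs."""


def authorize(principal, tool):
    # Hypothetical shared check: the same principal and scope policy
    # is applied no matter which endpoint family received the call.
    if principal is None:
        raise Denied("unauthenticated")
    if tool not in principal.get("tools", ()):
        raise Denied(f"tool {tool!r} not in scope")


def tool_route(path):
    """Register a handler so that it is always wrapped by authorize()."""
    def register(handler):
        def guarded(principal, tool, args):
            authorize(principal, tool)
            return handler(principal, tool, args)
        TOOL_ROUTES[path] = guarded
        return guarded
    return register


@tool_route("/mcp")
def mcp(principal, tool, args):
    return {"tool": tool, "ok": True}


@tool_route("/messages")
def messages(principal, tool, args):
    return {"tool": tool, "ok": True}


def denied_on_every_route(principal, tool):
    """Parity probe: True per route if the call was denied there."""
    results = {}
    for path, route in TOOL_ROUTES.items():
        try:
            route(principal, tool, {})
            results[path] = False
        except Denied:
            results[path] = True
    return results
```

Running `denied_on_every_route(None, "read_file")` in CI gives a per-route answer, which makes a forgotten endpoint show up as a failing entry rather than a production surprise.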

Negative-case fixture

A security model should ship both the allowed call and the denied neighbor.

The fastest way to catch unconstrained parameter drift is to keep a paired test for every high-risk tool: one value that should pass and one adjacent value that must fail. A filesystem read should prove an allowed prefix and reject `../`. A fetch, browser, or scraping tool should prove an approved host and reject cloud metadata, loopback, private-network, and in-cluster service addresses. A repo tool should prove the intended owner/repo and reject a sibling target outside policy.

  • Store the raw value, normalized value, caller, tool, endpoint family, and policy rule in the trace.
  • Make the denied neighbor a first-class regression test, not a one-time manual audit.
  • Run schema and tool-description lint in CI, but fail the fixture if the runtime guardrail does not reject before the tool handler.
  • Treat a generic 500, silent retry, or partial execution as a failing security fixture.
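
A paired fixture for a repo tool might look like the sketch below. The allowlist contents, `call_repo_tool`, and `PolicyDenied` are hypothetical; the point is the shape: one allowed call, one denied neighbor, and an assertion that the handler never ran for the denied case.

```python
class PolicyDenied(Exception):
    """Typed denial that names the policy rule which fired."""
    def __init__(self, rule):
        super().__init__(rule)
        self.rule = rule


ALLOWED_REPOS = frozenset({"acme/docs"})  # hypothetical policy
handler_calls = []                        # records what reached the handler


def repo_handler(repo):
    handler_calls.append(repo)
    return f"listing for {repo}"


def call_repo_tool(repo):
    normalized = repo.strip().lower()
    # The guardrail runs before the handler, so a denial means no side effect.
    if normalized not in ALLOWED_REPOS:
        raise PolicyDenied("repo-allowlist")
    return repo_handler(normalized)


def test_allowed_call():
    handler_calls.clear()
    assert call_repo_tool("acme/docs") == "listing for acme/docs"
    assert handler_calls == ["acme/docs"]


def test_denied_neighbor():
    handler_calls.clear()
    try:
        call_repo_tool("acme/infra")  # sibling target outside policy
    except PolicyDenied as denial:
        assert denial.rule == "repo-allowlist"
    else:
        raise AssertionError("sibling repo was not denied")
    # Partial execution would show up here as a recorded handler call.
    assert handler_calls == []
```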

Filesystem path authority

File tools need canonical-path proof before content becomes model input.

A filesystem, repo, or local-resource tool is host-state authority. The policy cannot stop at “this is a read tool” or “the user asked for a file.” It has to resolve the requested path against cwd, symlinks, case behavior, and mounts, then prove the canonical path stays under the allowed root before any read, write, diff, patch, or summary reaches the agent.

  • Bind every file operation to cwd, requested path, canonical path, allowed root or repo prefix, operation class, and redaction rule.
  • Keep denied fixtures for parent traversal, sibling repositories, hidden config, host mounts, symlink escape, and writes outside policy.
  • Return typed denials with the raw path, canonical path, symlink decision, policy rule, and redaction decision before the handler opens the file.
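
The canonical-path check can be sketched with the standard library. This is a partial illustration under stated assumptions: `resolve_under_root` and `PathDenied` are hypothetical names, and the sketch covers cwd resolution, `..` segments, and symlinks only. Case-insensitive filesystems, mount boundaries, and redaction rules are not handled here.

```python
import os
from pathlib import Path


class PathDenied(Exception):
    """Typed denial carrying the raw path, canonical path, and rule."""
    def __init__(self, raw, canonical, rule):
        super().__init__(f"{rule}: {raw!r} -> {canonical!r}")
        self.raw, self.canonical, self.rule = raw, canonical, rule


def resolve_under_root(requested: str, cwd: str, allowed_root: str) -> Path:
    # realpath collapses ".." segments and follows symlinks, so the
    # comparison is made on where the path actually lands.
    root = Path(os.path.realpath(allowed_root))
    canonical = Path(os.path.realpath(os.path.join(cwd, requested)))
    if canonical != root and root not in canonical.parents:
        raise PathDenied(requested, str(canonical), "outside-allowed-root")
    return canonical
```

The handler should open the returned canonical path, not the raw request. A check followed by a later re-resolution of the raw string leaves a window in which a symlink can change.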

URL fetch authority

Open fetch is a credential boundary, not just a network helper.

A hosted MCP fetch tool that accepts arbitrary URLs can reach more than the public web. Depending on where it runs, link-local metadata, loopback, private subnets, and in-cluster services may expose credentials or internal control surfaces. The safe default is not “the agent probably will not ask for that URL.” The safe default is DNS/IP policy that denies those targets before the HTTP request starts.

  • Resolve hostnames before fetch and classify the target IP, including redirects and retries.
  • Deny cloud metadata, loopback, RFC 1918, current-network, IPv6 ULA, and service-network targets unless an explicit internal route card exists.
  • Log the blocked credential lane and policy rule so the denial is evidence, not an invisible network failure.
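
A pre-fetch classification step can be sketched with Python's `ipaddress` and `socket` modules. `check_url`, `resolve`, and the rule strings are hypothetical names; the denied ranges below are the well-known loopback, link-local, RFC 1918, "this network", and IPv6 ULA blocks. Service-network ranges are deployment-specific and would have to be added by the operator.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

DENIED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8",                                      # "this network"
    "127.0.0.0/8", "::1/128",                         # loopback
    "169.254.0.0/16", "fe80::/10",                    # link-local, incl. metadata
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
    "fc00::/7",                                       # IPv6 ULA
)]


def resolve(host):
    """Default resolver: every address the hostname maps to."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def check_url(url, resolver=resolve):
    """Return (decision, rule) before any HTTP request is made."""
    host = urlsplit(url).hostname
    if not host:
        return ("deny", "no-host")
    addresses = resolver(host)
    if not addresses:
        return ("deny", "unresolved")  # fail closed on empty answers
    for address in addresses:
        ip = ipaddress.ip_address(address)
        # Unwrap IPv4-mapped IPv6 so ::ffff:127.0.0.1 is seen as loopback.
        if ip.version == 6 and ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped
        for net in DENIED_NETWORKS:
            if ip.version == net.version and ip in net:
                return ("deny", f"blocked-network:{net}")
    return ("allow", "public-address")
```

The same check has to run again on every redirect target and retry, and the request should connect to the address that was classified. Otherwise a second DNS lookup can return a different answer than the one the policy approved.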

2. Principals must be real

The security posture collapses when every workflow shares one backend credential. A safe server needs scoped identity per caller or per tenant, then scoped credentials that match the smallest intended lane.

If a low-trust research agent can inherit the same authority as a deploy workflow, you do not have a principal model. You have a single shared blast radius. In shared runtimes, that same mistake becomes a multi-tenant containment problem, because one caller can now inherit another tenant's effective authority.
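
As a rough sketch of what "scoped credentials that match the smallest intended lane" can mean in code, the lookup below keys credentials on tenant and lane and refuses to fall back to anything broader. `Principal`, `credential_for`, and the token names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    caller: str
    tenant: str
    lane: str  # for example "research" or "deploy"


# Hypothetical credential map: one entry per (tenant, lane), no shared admin key.
CREDENTIALS = {
    ("tenant-a", "research"): "token-a-readonly",
    ("tenant-a", "deploy"): "token-a-deploy",
}


def credential_for(principal: Principal) -> str:
    try:
        return CREDENTIALS[(principal.tenant, principal.lane)]
    except KeyError:
        # No fallback credential: an unmapped lane gets nothing.
        raise PermissionError(
            f"no credential for {principal.tenant}/{principal.lane}"
        ) from None
```

The research agent and the deploy workflow never see the same token, and a caller from an unmapped tenant gets a `PermissionError` instead of inheriting another tenant's authority.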

3. Evidence turns policy into control

Production operators need more than logs that say "error". They need structured records that answer who acted, what they attempted, what was denied, and why. Evidence is what makes containment verifiable.
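
A structured decision record might be as small as the sketch below. The field names and the `decision_record` helper are assumptions made for illustration; what matters is that a policy denial and a runtime failure produce distinguishable records.

```python
import json
from datetime import datetime, timezone


def decision_record(principal, tool, params, decision, rule, outcome):
    """Serialize who acted, what they attempted, what was decided, and why."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "principal": principal,
        "tool": tool,
        "params": params,
        "decision": decision,  # "allow" or "deny"
        "rule": rule,          # the policy rule that produced the decision
        "outcome": outcome,    # e.g. "ok", "not_executed", "runtime_error"
    }, sort_keys=True)
```

A denied call would be recorded with `decision="deny"` and `outcome="not_executed"`, while an allowed call that later fails would carry `decision="allow"` and `outcome="runtime_error"`, so the two never collapse into a generic "error".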

4. Policy language is only useful if the boundary survives runtime

Generated permission manifests, gateway RBAC layers, and governance toolkits are useful because they expose intended boundaries. They are not the boundary itself.

The operator test is still runtime reality: does the wrong caller see fewer tools, get a typed denial instead of a vague failure, and leave evidence of which lane consumed the shared quota or backend authority after the remote hop?

If the manifest reads narrow but discovery still reveals write-capable tools, or a gateway claims RBAC but cannot explain which principal burned the shared budget during a noisy loop, you documented intent without creating control.
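
The runtime side of that test can be sketched as discovery filtering plus a typed denial. The tool names, role names, and the `visible_tools` and `call_tool` helpers are hypothetical; unknown roles fall through to an empty set, so they see nothing.

```python
# Hypothetical tool catalog: each tool tagged with an operation class.
TOOLS = {
    "search_docs": "read",
    "read_file": "read",
    "write_file": "write",
    "deploy": "write",
}

# Hypothetical role policy: which operation classes each role may use.
ROLE_CLASSES = {
    "research": {"read"},
    "deploy": {"read", "write"},
}


def visible_tools(role):
    """Discovery filter: a role only sees tools it is allowed to call."""
    allowed = ROLE_CLASSES.get(role, set())
    return sorted(name for name, cls in TOOLS.items() if cls in allowed)


def call_tool(role, name):
    # Enforcement repeats the check at call time, because hiding a tool
    # from discovery does not stop a caller from guessing its name.
    if name not in visible_tools(role):
        return {"type": "denied", "rule": "tool-not-in-role",
                "role": role, "tool": name}
    return {"type": "ok", "tool": name}
```

Here the research role discovers only the two read tools, and a direct call to `deploy` returns a typed denial that names the rule rather than a vague failure.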

If you only have 2 minutes

Make read-only a real trust class

Give public or low-trust callers a server that exposes only read-only tools. Do not mix read and write tools behind one handshake.

Read-Only MCP as a Trust Class →

Make empty allowlists fail closed

Bound file paths, repo targets, URL hosts/IP ranges, and write destinations with explicit allowlisted prefixes. Empty policy should mean deny-all, not allow-all.

Scope Constraints and Blast Radius →

Scope tools per workflow

Authenticated must not mean every tool. Filter discovery and enforce tool-level authorization per role, tenant, or workflow intent.

Tool-Level Permission Scoping →

Log the decision, not just the error

Capture principal, parameters, tool, policy decision, and outcome. Typed denials must be distinguishable from runtime failures.

MCP Observability →

Pricing boundary

The security model proves the lane before Rhumb prices it

Scope, principals, and evidence are pre-execution controls. They decide whether a route is safe enough to become a paid call; they are not themselves a reason to charge for a compromised, denied, or unprovable attempt.

  • Scope checks, principal mapping, denied-neighbor fixtures, and evidence review are security proof, not paid execution.
  • The paid route begins after one bounded lane survives with caller, capability id, credential mode, quota owner, estimate, side-effect class, and receipt proof named.
  • A policy denial, empty allowlist, endpoint-auth mismatch, or missing audit trail should stop as a typed no-call result instead of becoming a billable retry on a broader surface.

Free-proof guide: separate security proof from paid execution / Pricing proof: see the MCP discovery pricing boundary / Execution preflight: scope managed execution

Next honest step

Start with one bounded lane

If the security model is scope plus principals plus evidence, the right next move is not a bigger catalog. Start with a governed capability lane that makes authority visible and keeps scope small.

Route-hardening fit check: if one MCP call is already painful enough to repeat, send the route, unsafe neighbor, credential lane, budget owner, and receipt proof instead of asking for a generic security review. Use the route-hardening checklist to turn the security model into one route card, one denied-neighbor fixture, one authority lane, one retry envelope, and one receipt standard.