The sharpest MCP failures are not abstract jailbreak stories. They are tools with path, string, or write parameters that stay too broad once the model is allowed to call them.
A compromised call matters less when the server enforces narrow prefixes, scoped credentials, and tenant isolation. It matters much more when one tool can reach everywhere.
The useful production question is no longer "does this MCP server work?" It is how much damage one compromised tool call can do before the boundary stops it, and whether the operator can prove what happened after.
- Unconstrained string parameters mean the prompt is still doing policy work the server should have enforced.
- Filesystem tools need path normalization plus explicit allowlisted prefixes, not just a `path` field that looks typed.
- Browser and fetch-style tools need domain and egress guards so indirect prompt injection cannot quietly become SSRF, cloud metadata access, or sandbox escape.
- The useful denial is a typed refusal with preserved context, not a vague runtime error that hides whether scope control actually fired.
The official MCP servers issue tracker has developed a recurring shape: security advisories tied to unconstrained parameters, prompt-injection-driven file reads, SSRF, sandbox bypasses, and weak write boundaries.
This is not just a bug backlog. It is a design-pattern gap.
Prompt injection hits harder in MCP than in many ordinary API integrations because MCP tools are built to take action, and the trust boundary often lives in the server implementation rather than the protocol itself.
1. The structural problem is tools with no hard scope boundary
A traditional API call is usually bounded by the credential, the request schema, and the provider's own enforcement surface.
An MCP tool call is different. The action boundary is whatever the server author actually enforced. If the implementation leaves paths, strings, or write targets wide open, a model can be steered into passing dangerous values that the server will still execute.
That is why the sharpest current findings read less like exotic model attacks and more like missing containment checks: no allowed-prefix guard, no bounded path resolution, no tenant-aware credential split, no explicit write fence.
The vulnerability is not that the model followed a bad instruction. The vulnerability is that the tool boundary was broad enough for that instruction to become a real action.
Schemas describe input. Enforcement must reject the dangerous value.
A JSON schema that says `path: string` or `url: string` is not a containment boundary. It helps the model format a call; it does not prove the server will refuse `../`, internal hosts, unexpected repositories, cross-tenant IDs, or write targets outside the declared lane.
- Normalize first, then compare against the allowlisted filesystem, repository, tenant, or domain scope.
- Fail closed when the normalized value falls outside that scope, even if it matches the tool schema.
- Return a typed policy denial that preserves the caller, tool, raw value, normalized value, and rule that blocked execution.
- Trace the refusal as an expected safety outcome, not as a generic runtime error the agent may retry around.
A path tool has to prove the canonical path before the model reads.
The filesystem scan cluster is the same prompt-injection boundary in local-resource form. An agent-controlled path can look like a harmless workspace file while resolving through `../`, symlinks, sibling repositories, hidden config, or host mounts. The boundary has to run before the file is opened, not after the model has already seen its contents.
- Resolve cwd, requested path, canonical path, symlinks, and mount boundary before read, write, diff, patch, or summary execution.
- Allow only the declared root or repo prefix for the current caller and operation class; a read lane does not imply a write lane.
- Reject parent traversal, sibling workspaces, hidden config, host mounts, and symlink escapes with typed denials that preserve the raw path, canonical path, allowed root, policy rule, and redaction decision.
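One minimal sketch of "prove the canonical path before the read," assuming a single declared root per caller. `resolve_before_read` and `ALLOWED_ROOT` are hypothetical names; `os.path.realpath` does the canonicalization (including symlink resolution against the live filesystem), and the containment check runs before anything is opened.

```python
import os

# Assumed declared root for the current caller and operation class.
ALLOWED_ROOT = "/workspace/project-a"

def resolve_before_read(requested: str) -> str:
    # Canonicalize first: realpath resolves "..", ".", and symlinks, so a
    # symlink escape surfaces here, not after the model has seen file contents.
    candidate = os.path.realpath(os.path.join(ALLOWED_ROOT, requested))
    root = os.path.realpath(ALLOWED_ROOT)
    # The canonical path must stay inside the canonical root; anything else
    # fails closed with the raw path, canonical path, and root preserved.
    if os.path.commonpath([candidate, root]) != root:
        raise PermissionError(
            f"path_escape raw={requested!r} canonical={candidate!r} root={root!r}"
        )
    return candidate
```

A separate guard of the same shape should gate writes; as the bullet above says, a read lane does not imply a write lane.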
A URL tool has to deny cloud metadata before the model can fetch.
The fresh MCP fetch finding makes the scope problem concrete: an agent-controlled `url` parameter can become credential exfiltration when the server resolves link-local cloud metadata, loopback, RFC 1918, or in-cluster service addresses. Host-level IMDS settings help, but the MCP package still needs a default-deny network boundary before it calls the HTTP client.
- Allow only the schemes and domains the workflow needs; do not treat arbitrary `http(s)` URLs as safe just because the schema accepts strings.
- Resolve DNS before fetch and reject link-local metadata, loopback, RFC 1918, current-network, IPv6 ULA, and in-cluster service ranges unless the route card explicitly opts into an internal lane.
- Record raw URL, normalized host, resolved IP class, policy rule, credential lane, and typed denial so the agent cannot retry the same prompt through a broader web tool.
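The resolve-then-classify step can be sketched with the standard library alone. `ALLOWED_SCHEMES`, `ALLOWED_DOMAINS`, and the rule strings are assumptions for illustration; the real work is classifying the resolved address, not trusting the hostname.

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_DOMAINS = {"api.example.com"}  # assumed workflow allowlist

def classify(ip: str) -> str:
    addr = ipaddress.ip_address(ip)
    if addr.is_loopback:
        return "loopback"
    if addr.is_link_local:  # covers 169.254.0.0/16, including cloud IMDS
        return "link_local"
    if addr.version == 4 and addr in ipaddress.ip_network("0.0.0.0/8"):
        return "current_network"
    if addr.is_private:  # RFC 1918 and IPv6 ULA (fc00::/7)
        return "private"
    return "public"

def check_url(raw_url: str) -> tuple[bool, str]:
    parsed = urlparse(raw_url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False, "scheme_not_allowed"
    host = (parsed.hostname or "").lower()
    if host not in ALLOWED_DOMAINS:
        return False, "domain_not_allowed"
    # Resolve before fetch: deny on the resolved IP class, so a DNS record
    # pointing an allowlisted name at an internal address still fails closed.
    for info in socket.getaddrinfo(host, None):
        ip_class = classify(info[4][0])
        if ip_class != "public":
            return False, f"resolved_{ip_class}"
    return True, "allowed"
```

Note that `check_url` resolves once for the policy decision; production code should also pin the HTTP client to the vetted address so a second resolution cannot rebind to a different host.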
Schema lint is the tripwire. The pre-execution guard is the boundary.
The newer MCP audit thread adds a useful CI pattern: lint server schemas for unconstrained strings, missing descriptions, ambiguous parameter names, and overlapping tool descriptions before the server ships. That catches drift early, but it still does not make the runtime safe.
Treat schema and description lint as candidate-quality evidence. Then require a guardrail path that runs before the tool handler: normalize the value, bind it to the caller's route card, reject the denied neighbor, and leave a typed refusal the agent cannot retry around as if it were a flaky provider error.
- CI lint should fail on open-ended `string` parameters that lack enum, pattern, bounded format, or explicit allowlist semantics.
- Prompt-level refusal language helps the model choose, but server-side policy must still deny the same unsafe value after the model chooses.
- A passing schema check without a pre-handler policy decision is documentation, not containment.
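The CI tripwire described above is a short walk over each tool's JSON-schema-style spec. This is a sketch under stated assumptions: `x-allowlist` is a hypothetical vendor extension standing in for whatever allowlist annotation a team actually uses, and the findings format is illustrative.

```python
def lint_tool_schema(tool_name: str, schema: dict) -> list[str]:
    """Flag open-ended string parameters and missing descriptions.

    A string property passes only if it carries enum, pattern, format, or
    (assumed here) an x-allowlist hint. Passing lint is candidate-quality
    evidence only; the runtime guard still has to deny the value.
    """
    findings = []
    for name, prop in schema.get("properties", {}).items():
        if prop.get("type") != "string":
            continue
        constrained = any(k in prop for k in ("enum", "pattern", "format", "x-allowlist"))
        if not constrained:
            findings.append(f"{tool_name}.{name}: unconstrained string")
        if not prop.get("description"):
            findings.append(f"{tool_name}.{name}: missing description")
    return findings
```

Wiring this into CI means failing the build on any non-empty findings list, then still requiring the pre-handler policy check at runtime.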
Would you pay to make one MCP route safe enough to repeat?
If the answer is yes, do not send a platform wishlist. Send one route that currently feels too risky to put behind an agent. Rhumb can only learn from this if the ask names the workflow, the denied neighbor, the credential lane, and the proof that would make the first governed execution boring. For SSH MCP, "let the agent run commands" is still too broad; the ask has to be one command lane with host, user, directory, timeout, rollback, and typed denial attached.
- The exact MCP route or tool call you would put behind an agent if the boundary were boring enough to trust.
- For SSH or remote-command MCP, the named host, user, runbook or command allowlist, working directory, timeout, PTY/session policy, rollback path, and environment-access rule.
- The one unsafe neighbor that must fail closed: path, repo, tenant, URL, cloud-metadata address, customer, amount, or write target.
- The credential lane and budget owner that should be preserved if the call repeats.
- The receipt or typed denial that would make you comfortable paying for the first governed execution.
2. Remote MCP turns a bad call into an operator problem
Local stdio MCP on your own machine is still risky, but the blast radius is usually easier to reason about. The affected machine, files, and credentials are at least legible to one operator.
Remote MCP raises the stakes because the same weak tool boundary can sit in front of many agents, many tenants, shared backend credentials, or a broader hosted runtime.
Once that happens, prompt injection stops being only a model-alignment concern. It becomes a principal-model and control-plane concern.
- Are path and string inputs bounded to the smallest safe surface?
- Can one compromised caller affect another tenant's data?
- Does the server run with scoped credentials or one broad backend identity?
- Will the operator be able to reconstruct the call after the fact?
If the answer to those questions is vague, the system is not merely under-hardened. It is under-bounded.
3. The audit pattern is remarkably concrete
The strongest live issue cluster keeps circling the same shape: unconstrained string parameters across official servers, path traversal risk in filesystem tools, write-capable repo tools with weak scope constraints, and remote-hosted surfaces where the server ends up doing too much on trust.
That matters because it gives operators a sharper readiness test than generic security language.
You do not need a mystical threat model to evaluate this. You can ask whether the server validates the parameter at the point of execution, whether that validation is narrow enough to prevent boundary escape, and whether the runtime keeps the resulting authority small even if one call goes bad.
4. What good looks like in production
A remote MCP surface worth trusting in production usually does four things well at the same time.
Parameter layer
- Filesystem paths restricted to explicit allowlisted prefixes
- Structured strings validated against expected shape before execution, not merely hinted at in the tool schema
- Normalized values compared against allowlisted filesystem, repository, tenant, domain, and write-target scope before the side effect runs
- URLs and browser targets restricted to explicit domains or egress policy before navigation runs
- Numeric inputs bounded to documented ranges
Auth layer
- Each caller or tenant runs with scoped credentials
- Elevated tools fail explicitly instead of inheriting a silent shared admin principal
Observability layer
- Every tool call is attributable to a caller, parameter set, and outcome
- Errors are typed clearly enough to distinguish policy denial from runtime failure
- Partial-success ambiguity does not silently collapse into "it worked"
Containment layer
- One compromised call cannot escape the allowed filesystem, repo, or resource scope
- One tenant's compromise does not become another tenant's breach
- One bad retry loop does not become shared fleet damage through broad credentials or weak governors
5. This should be a first-class evaluation question
Rhumb's access-readiness model already cares about scoped credentials, machine-readable auth failures, and revocation paths. MCP server trust adds another layer on top of that.
A strong underlying API can still be turned into a weak remote tool surface if the MCP layer removes the narrow boundaries that made the API safe to automate in the first place.
That is why scope-constraint validation belongs alongside auth model, tenant isolation, governors, recovery, and auditability in any real remote MCP readiness checklist.
Put more simply, this is the production MCP security model: scope is the boundary, principals decide whose authority is in play, and evidence proves what happened after the call.
Remote auth does not cancel that problem. A server can prove who connected while still exposing the wrong tool set or binding execution to a backend authority the caller should never inherit, which is why identity versus authority is the next useful operator question once authentication appears to work.
- Check whether the server validates tool parameters in server-side code, not only in the exposed schema.
- Look for explicit normalization plus allowlist guards on files, repos, URLs, browser targets, tenant IDs, and write paths.
- Send one out-of-scope value and verify the server returns a typed policy denial rather than executing, retrying, or hiding the refusal as a generic error.
- Confirm the runtime can use scoped credentials rather than one broad backend identity.
- Inspect whether tool calls preserve caller, parameters, result, policy context, and typed denials instead of collapsing everything into generic failure.
- Ask what happens to other tenants or sessions if one agent is compromised. The honest answer should be "nothing material."
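The "send one out-of-scope value" check in the list above can be written as a tiny probe. The `call_tool` stub below is hypothetical and stands in for whatever client your MCP stack exposes; the assertions, not the stub, are the readiness test.

```python
def call_tool(tool: str, args: dict) -> dict:
    # Stub of compliant server behavior, for illustration only: an out-of-scope
    # value returns a typed policy denial instead of executing or erroring.
    if tool == "read_file" and not args["path"].startswith("/workspace/project-a/"):
        return {"status": "policy_denial",
                "rule": "path_outside_allowlist",
                "raw": args["path"]}
    return {"status": "ok"}

def probe_denied_neighbor() -> dict:
    # Deliberately send a value the route should never touch.
    result = call_tool("read_file", {"path": "/etc/passwd"})
    # The denial must be typed and attributable, not a generic runtime error
    # the agent would treat as retryable.
    assert result["status"] == "policy_denial", "server executed or hid the refusal"
    assert result["rule"] and result["raw"] == "/etc/passwd"
    return result
```

If the same probe against a real server yields a 500, a timeout, or a silent success, scope control did not fire, and the checklist above says the surface is under-bounded.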
6. Web-scraping catalogs need egress boundaries, not just more tools
Scraping-heavy MCP servers make the scope problem especially concrete. A catalog with dozens of fetch, crawl, browser, screenshot, and extraction tools can look productive while quietly becoming one broad remote web authority.
The safe shape is not "let the model pick any scraper." It is a governed scraping lane: target domains, egress policy, provider quota, data-use limits, and output provenance are all part of the permission boundary before navigation or extraction runs.
- Bind each fetch or scraping tool to explicit target-domain, DNS/IP egress, robots/compliance, and data-use policy before it reaches a live URL.
- Keep extract, crawl, browser, screenshot, and search tools in separate permission lanes instead of one generic web-access authority.
- Attribute quota and provider spend per target lane so a broad scraping catalog cannot hide which caller burned the shared budget.
- Return typed denials for blocked domains, cross-tenant jobs, over-budget crawls, and disallowed extraction outputs instead of retryable transport errors.
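A governed scraping lane of the shape described above can be modeled as one small policy object per capability. `ScrapeLane` and `authorize` are illustrative names, not a real Rhumb or MCP schema; the design point is separate lanes with attributed budget instead of one generic web authority.

```python
from dataclasses import dataclass

@dataclass
class ScrapeLane:
    # One lane per capability (fetch, crawl, browser...), not one shared grant.
    name: str
    allowed_domains: frozenset[str]
    max_pages: int
    pages_used: int = 0

def authorize(lane: ScrapeLane, tool: str, domain: str) -> tuple[bool, str]:
    if tool != lane.name:
        return False, "tool_outside_lane"      # no borrowing another lane's authority
    if domain not in lane.allowed_domains:
        return False, "domain_blocked"          # typed denial, not a transport error
    if lane.pages_used >= lane.max_pages:
        return False, "over_budget"             # quota is part of the boundary
    lane.pages_used += 1  # spend is attributed to this lane, not a shared pool
    return True, "allowed"
```

Because each denial carries a rule name, an agent cannot launder a blocked crawl through retries, and the budget counter shows which caller burned the quota.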
7. The pattern is containment as design intent
The production conversation is moving away from "MCP security" as a vague afterthought and toward a cleaner design split: what authority is visible, what scope is reachable, what principal is acting, and how much damage one wrong call can do.
That is a better operator frame than raw liveness, raw handshake success, or a flat top-server list.
The honest question is no longer whether a server looks convenient. It is whether the server preserves narrow authority when the model is wrong.
Scope proof runs before the paid route exists
Prompt-injection containment is an acceptance gate, not a billable retry surface. Rhumb should price execution only after the dangerous adjacent values fail closed and one bounded route is safe enough to call.
Pricing proof: see the MCP discovery pricing boundary / Execution preflight: scope managed execution
Start with one bounded lane, not a broad connector surface
If prompt injection and scope drift are the real trust problem, the answer is not a larger mixed-authority catalog. Start with a narrow, governed capability lane that keeps scope small and operator intent explicit.
If this article is the threat model, these pages turn it into an operator playbook: the security model itself, the auth-versus-authority split, how to evaluate MCP surfaces honestly, how to shape governed capabilities, what a real remote-readiness checklist looks like, and where proof stops before paid execution begins.
The shortest honest frame for production trust is scope, principals, and evidence.
Useful when auth succeeds but the sharper question is still which tools stay visible and whose backend authority really survives the handshake.
Use workflow fit, trust class, auth viability, and runtime evidence before a server earns production trust.
The safer answer is not raw endpoint sprawl. It is a bounded capability surface with visible authority and policy.
Auth, scope constraints, tenant isolation, governors, recovery, and auditability belong in one operator checklist.
Candidate proof, denied-neighbor checks, and route-card inspection stay separate from the first paid execution lane.
Scope constraints are only the first boundary
Once the tool boundary is narrow enough to trust, the next operator questions are what breaks in the loop, how shared rate limits are contained, and how credentials stay narrow as more agents come online.
What actually breaks once retries, tool use, and unattended execution are live.
How shared provider budgets and retry windows turn a tool surface into a fleet coordination problem.
Why bounded parameters still fail if the credential layer widens faster than the trust model.