MCP reliability · May 17, 2026

Tool output budget checklist

MCP tools need output budgets before they need bigger context windows.

A tool call can be correct and still break the agent if it returns too much. Search results, files, transcripts, logs, and nested API responses need bounded output contracts so the model receives the smallest safe evidence, not a context flood.

Fast answer

  • Tool output is part of the route budget. A verbose MCP result can burn more model context than the call that produced it, and it makes the next planning step slower, more expensive, and harder to recover from.
  • A production MCP tool needs an output contract before launch: maximum bytes, maximum records, schema shape, summary rule, artifact handoff, redaction policy, and the exact denial or truncation receipt when the response exceeds budget.
  • The useful test is not whether the tool can return a large JSON blob. It is whether the same route can return the minimum safe result, point to a durable artifact when needed, and prove what was omitted.
  • If the trace cannot explain how many bytes or tokens were returned, why the payload was shaped that way, what artifact holds the full result, and how the agent can request the next page safely, the route is not ready for unattended loops.

The production checklist

Per-route output ceiling

Set a maximum response size by route, not just by server. A search result, file summary, database row read, transcript extract, and browser scrape should not share one generic payload limit.
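
As a minimal sketch of per-route ceilings, the table below keys budgets by route id and fails closed when a route has none. The route names, the limits, and the `OutputBudget` and `budget_for` names are assumptions made for illustration, not part of any MCP SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputBudget:
    """Ceiling for a single route (hypothetical shape)."""
    max_bytes: int
    max_records: int


# Illustrative route ids and limits; real values depend on the workload.
ROUTE_BUDGETS = {
    "search.query": OutputBudget(max_bytes=8_000, max_records=10),
    "file.summary": OutputBudget(max_bytes=4_000, max_records=1),
    "db.row_read": OutputBudget(max_bytes=16_000, max_records=50),
}


def budget_for(route_id: str) -> OutputBudget:
    """Return the route's ceiling; an undeclared route is an error, not a default."""
    try:
        return ROUTE_BUDGETS[route_id]
    except KeyError:
        raise LookupError(f"no output budget declared for route {route_id!r}") from None
```

Failing closed on an undeclared route is a design choice in this sketch: it keeps a new tool from silently inheriting a generic server-wide limit.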

Schema before prose

Return typed fields, stable ids, result counts, omitted-count metadata, and next-page cursors before free-form explanation. Let the model reason over bounded structure instead of raw dumps.
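
One way to picture "schema before prose" is a result envelope whose counts and cursor come before any free-form note. This is a sketch under assumptions: the `BoundedResult` fields, the `id` and `title` record keys, and the offset-style cursor are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundedResult:
    schema_version: str
    items: list                  # selected fields only, each with a stable id
    total_count: int
    returned_count: int
    omitted_count: int
    next_cursor: Optional[str]   # None when nothing was omitted
    note: str = ""               # free-form explanation comes last, if at all


def shape_results(records: list[dict], max_records: int) -> BoundedResult:
    """Keep the first max_records records and report what was left out."""
    kept = records[:max_records]
    omitted = len(records) - len(kept)
    return BoundedResult(
        schema_version="1",
        # Only approved fields are copied; large bodies never reach the model.
        items=[{"id": r["id"], "title": r["title"]} for r in kept],
        total_count=len(records),
        returned_count=len(kept),
        omitted_count=omitted,
        next_cursor=str(len(kept)) if omitted else None,
    )
```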

Artifact handoff

When the full payload is too large, write it to a durable artifact or provider object and return a reference, checksum, expiration, access rule, and safe follow-up route instead of flooding context.
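
The handoff can be sketched as follows, assuming a local directory stands in for the durable artifact store and the payload is UTF-8 text. The `hand_off` name, the access label, and the `artifact.read_range` follow-up route are hypothetical placeholders.

```python
import hashlib
import time
from pathlib import Path


def hand_off(payload: bytes, max_bytes: int, artifact_dir: Path,
             ttl_seconds: int = 3600) -> dict:
    """Return the payload inline if it fits, else store it and return a reference."""
    if len(payload) <= max_bytes:
        return {"mode": "inline", "bytes": len(payload),
                "body": payload.decode("utf-8")}

    checksum = hashlib.sha256(payload).hexdigest()
    path = artifact_dir / f"{checksum}.bin"
    path.write_bytes(payload)  # stand-in for a durable object store
    return {
        "mode": "artifact",
        "artifact_id": path.name,
        "sha256": checksum,
        "bytes": len(payload),
        "expires_at": int(time.time()) + ttl_seconds,
        "access": "same-tenant-read",              # hypothetical access rule
        "follow_up_route": "artifact.read_range",  # hypothetical route id
    }
```

Note that the oversized branch returns no body at all: the agent gets a reference it can act on, not a clipped copy it might mistake for the whole result.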

Summarization boundary

Name whether the tool returned raw data, extracted fields, a lossy summary, or a sampled preview. The receipt should make lossy compression visible before the agent treats it as ground truth.
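
A small sketch of naming the boundary: every payload carries an explicit mode, and the lossy modes are flagged. The `PayloadMode` enum and the decision to treat extracted fields as partial but not lossy are assumptions of this example.

```python
from enum import Enum


class PayloadMode(str, Enum):
    RAW = "raw"
    EXTRACTED_FIELDS = "extracted_fields"
    LOSSY_SUMMARY = "lossy_summary"
    SAMPLED_PREVIEW = "sampled_preview"


# Assumption: extracted fields are exact but partial, so only these two are lossy.
LOSSY_MODES = {PayloadMode.LOSSY_SUMMARY, PayloadMode.SAMPLED_PREVIEW}


def label_payload(body: str, mode: PayloadMode) -> dict:
    """Attach the mode and a lossy flag so the receipt can surface them."""
    return {"mode": mode.value, "lossy": mode in LOSSY_MODES, "body": body}
```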

Redaction and data-use policy

Apply redaction before payload shaping, and record which secret, customer-data, credential, prompt, or topology class was removed. Truncation is not a security control.
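
The ordering matters more than the patterns. The sketch below redacts first, records the protected classes, and only then clips to the budget. The two regular expressions are deliberately simple, hypothetical examples and would not be sufficient for real secret detection.

```python
import re

# Hypothetical detectors keyed by protected class; illustrative only.
PROTECTED_PATTERNS = {
    "credential": re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    "customer-data": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact_then_shape(text: str, max_chars: int) -> dict:
    """Redact protected classes across the full text, then truncate to the budget."""
    removed = {}
    for protected_class, pattern in PROTECTED_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{protected_class}]", text)
        if count:
            removed[protected_class] = count

    # Truncation happens last, so it never decides whether a secret is returned.
    return {
        "body": text[:max_chars],
        "truncated": len(text) > max_chars,
        "chars_omitted": max(0, len(text) - max_chars),
        "redacted": removed,
    }
```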

Pagination and refill rule

Expose a cursor, range, query refinement, or approval step for more data. Do not let the agent repeat the same oversized call hoping the next response is smaller.
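
A refill route can be as small as an opaque cursor over an offset. In this sketch the cursor encoding, the `read_page` name, and the in-memory `rows` list are assumptions; a real tool would page through its backing store instead.

```python
import base64
import json
from typing import Optional


def encode_cursor(offset: int) -> str:
    """Wrap the offset so callers treat the cursor as opaque."""
    return base64.urlsafe_b64encode(json.dumps({"offset": offset}).encode()).decode()


def decode_cursor(cursor: Optional[str]) -> int:
    if cursor is None:
        return 0
    return int(json.loads(base64.urlsafe_b64decode(cursor.encode()))["offset"])


def read_page(rows: list, cursor: Optional[str], page_size: int) -> dict:
    """Return one bounded page plus the cursor for the next one, if any."""
    start = decode_cursor(cursor)
    page = rows[start:start + page_size]
    end = start + len(page)
    return {
        "rows": page,
        "returned_count": len(page),
        "remaining_count": len(rows) - end,
        "next_cursor": encode_cursor(end) if end < len(rows) else None,
    }
```

Each call stays within the same page size, so asking for more data is an explicit follow-up rather than a larger version of the first call.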

Failure fixtures

Test the context-flood cases before the agent discovers them in production.

Oversized search result

Expected: Return top bounded results, total count, omitted count, ranking criteria, and a cursor or refinement hint; do not stream every match into context.

Large file or transcript

Expected: Return section summaries plus artifact reference, byte range, checksum, and follow-up extraction route instead of a full dump.

Nested JSON response

Expected: Flatten or select approved fields, include the schema version, and record omitted nested objects in the receipt before the agent plans from partial data.

Sensitive field in allowed result

Expected: Redact before truncating and record the protected class. A payload that is clipped only after the secret has already been returned fails the gate.

Agent asks for 'everything' again

Expected: Deny or require a narrower query after budget exhaustion. The planner should not bypass the output budget by rephrasing the same broad request.
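
This fixture can be sketched as a gate that keys denials on the normalized scope of a request (route plus filters) rather than on its wording, so a rephrased broad query hits the same denial. The `OutputBudgetGate` class, the denial codes, and the byte estimates are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OutputBudgetGate:
    max_bytes_per_task: int
    spent_bytes: int = 0
    denied_scopes: set = field(default_factory=set)

    def request(self, route_id: str, filters: dict, estimated_bytes: int) -> dict:
        # Scope ignores phrasing: same route and same filters means same request.
        # Assumes filter values are hashable and comparable (e.g. strings).
        scope = (route_id, tuple(sorted(filters.items())))

        if scope in self.denied_scopes:
            return {"decision": "deny", "code": "OUTPUT_BUDGET_REPEAT",
                    "next_action": "narrow_query_or_request_approval"}

        if self.spent_bytes + estimated_bytes > self.max_bytes_per_task:
            self.denied_scopes.add(scope)
            return {"decision": "deny", "code": "OUTPUT_BUDGET_EXHAUSTED",
                    "next_action": "narrow_query_or_request_approval"}

        self.spent_bytes += estimated_bytes
        return {"decision": "allow", "code": "OK",
                "remaining_bytes": self.max_bytes_per_task - self.spent_bytes}
```

For example, a gate with a 10,000-byte budget denies an unfiltered log search estimated at 50,000 bytes, denies the identical retry with a distinct code, and allows a filtered search estimated at 4,000 bytes.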

Trace evidence

The output receipt should make omitted data auditable.

Once the agent moves on, operators need to know whether it acted on raw data, an extraction, a summary, or a clipped preview. The trace should keep returned payload size, omitted data, redaction, artifact references, and allowed next actions in one place.

route id and tool call id
caller / tenant / workspace
operation class and data class
output ceiling in bytes / records / tokens
actual bytes and estimated tokens returned
raw count, returned count, and omitted count
schema version and selected fields
redaction rule and protected class
summary / extract / raw-data mode
artifact id, checksum, and expiration
cursor, range, or refill route
policy decision and denial / truncation code
receipt id and allowed next action
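
As a sketch of keeping these fields in one place, the record below covers a subset of the list above and derives the counts from the returned body. The field names, the `TRUNCATED` and `OK` codes, and the four-bytes-per-token estimate are assumptions; a real tokenizer count would replace the heuristic.

```python
import uuid
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class OutputReceipt:
    receipt_id: str
    route_id: str
    tool_call_id: str
    tenant: str
    ceiling_bytes: int
    actual_bytes: int
    estimated_tokens: int
    raw_count: int
    returned_count: int
    omitted_count: int
    mode: str
    redacted_classes: list
    artifact_id: Optional[str]
    next_cursor: Optional[str]
    decision_code: str
    allowed_next_action: str


def build_receipt(*, route_id: str, tool_call_id: str, tenant: str,
                  ceiling_bytes: int, body: bytes, raw_count: int,
                  returned_count: int, mode: str, redacted_classes: list,
                  artifact_id: Optional[str] = None,
                  next_cursor: Optional[str] = None) -> OutputReceipt:
    omitted = raw_count - returned_count
    return OutputReceipt(
        receipt_id=str(uuid.uuid4()),
        route_id=route_id, tool_call_id=tool_call_id, tenant=tenant,
        ceiling_bytes=ceiling_bytes,
        actual_bytes=len(body),
        estimated_tokens=len(body) // 4,  # crude heuristic, not a tokenizer
        raw_count=raw_count, returned_count=returned_count, omitted_count=omitted,
        mode=mode, redacted_classes=redacted_classes,
        artifact_id=artifact_id, next_cursor=next_cursor,
        decision_code="TRUNCATED" if omitted else "OK",
        allowed_next_action="follow_cursor" if next_cursor else "none",
    )
```

Calling `asdict` on the result gives a plain dictionary that can be written to the trace alongside the tool call.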

Copy-paste route card

Budget the returned evidence before the call runs.

MCP route:
Caller / tenant:
Data class:
Max bytes / records / tokens:
Allowed fields / schema:
Summary vs raw-data rule:
Artifact handoff rule:
Redaction rule:
Pagination / refill route:
Oversize denial or truncation code:
Receipt fields:

Common misreads

  • Optimizing provider-call retries while ignoring that the returned payload is what actually explodes the model bill.
  • Calling a tool read-only and therefore safe, even though it can leak private data or swamp context with unbounded output.
  • Returning a natural-language summary without saying which fields were dropped, sampled, redacted, or inferred.
  • Using truncation as a quiet success path. The agent must know the response is partial before it takes action.
  • Storing a full artifact without a checksum, expiration, access rule, or route for retrieving a narrower slice later.
  • Letting the agent retry the same broad query after an output-budget denial instead of requiring a smaller query or human approval.
