Tool output budget checklist
A tool call can be correct and still break the agent if it returns too much. Search results, files, transcripts, logs, and nested API responses need bounded output contracts so the model receives the smallest safe evidence, not a context flood.
Fast answer
The production checklist
Set a maximum response size by route, not just by server. A search result, file summary, database row read, transcript extract, and browser scrape should not share one generic payload limit.
Return typed fields, stable ids, result counts, omitted-count metadata, and next-page cursors before free-form explanation. Let the model reason over bounded structure instead of raw dumps.
When the full payload is too large, write it to a durable artifact or provider object and return a reference, checksum, expiration, access rule, and safe follow-up route instead of flooding context.
Name whether the tool returned raw data, extracted fields, a lossy summary, or a sampled preview. The receipt should make lossy compression visible before the agent treats it as ground truth.
Apply redaction before payload shaping, and record which secret, customer-data, credential, prompt, or topology class was removed. Truncation is not a security control.
Expose a cursor, range, query refinement, or approval step for more data. Do not let the agent repeat the same oversized call hoping the next response is smaller.
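The checklist above can be sketched as a small response shaper. This is a minimal illustration, not a reference implementation: the route names, the limits in `ROUTE_LIMITS`, the `BoundedResponse` fields, and the offset-style cursor are all assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-route ceilings; names and numbers are illustrative only.
ROUTE_LIMITS = {
    "search": {"max_records": 3, "max_bytes": 2_000},
    "file_summary": {"max_records": 1, "max_bytes": 4_000},
}


@dataclass
class BoundedResponse:
    route: str
    representation: str         # "raw", "extracted", "summary", or "preview"
    records: list
    total_count: int
    omitted_count: int          # records not included in this response
    next_cursor: Optional[str]  # None when nothing is left to page through


def shape_response(route, matches, cursor=None):
    """Return one bounded page of matches under the route's ceiling."""
    limits = ROUTE_LIMITS[route]
    start = int(cursor) if cursor else 0
    page = []
    for record in matches[start:start + limits["max_records"]]:
        candidate = page + [record]
        # Stop adding records once the serialized page would exceed the byte ceiling.
        if len(json.dumps(candidate).encode("utf-8")) > limits["max_bytes"]:
            break
        page = candidate
    if not page and start < len(matches):
        # A single record over the ceiling cannot be paged; it needs another route.
        raise ValueError("record exceeds route ceiling; hand it off as an artifact")
    end = start + len(page)
    return BoundedResponse(
        route=route,
        representation="extracted",
        records=page,
        total_count=len(matches),
        omitted_count=len(matches) - len(page),
        next_cursor=str(end) if end < len(matches) else None,
    )
```

Calling `shape_response("search", matches)` and then passing the returned `next_cursor` back in gives the agent a way to ask for more data without repeating the same oversized call.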
Failure fixtures
Expected: Return top bounded results, total count, omitted count, ranking criteria, and a cursor or refinement hint; do not stream every match into context.
Expected: Return section summaries plus artifact reference, byte range, checksum, and follow-up extraction route instead of a full dump.
Expected: Flatten or select approved fields, include the schema version, and list omitted nested objects in the receipt before the agent plans from partial data.
Expected: Redact before truncation and record the protected class. A payload that is clipped only after the secret has already been returned fails the gate.
Expected: Deny or require a narrower query after budget exhaustion. The planner should not bypass the output budget by rephrasing the same broad request.
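The redact-before-truncate expectation can be sketched as follows. The single regex in `SECRET_PATTERNS` and the field names in the returned receipt are hypothetical; a real deployment would need a much broader set of detectors per protected class.

```python
import re

# Hypothetical detector for one protected class; illustrative only.
SECRET_PATTERNS = {
    "credential": re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+"),
}


def redact_then_truncate(text, max_chars):
    """Redact protected classes first, then clip to the size ceiling."""
    redactions = []
    for data_class, pattern in SECRET_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{data_class}]", text)
        if count:
            # Record which class was removed so the receipt can report it.
            redactions.append({"class": data_class, "count": count})
    return {
        "payload": text[:max_chars],
        "truncated": len(text) > max_chars,
        "omitted_chars": max(0, len(text) - max_chars),
        "redactions": redactions,
    }
```

Ordering matters here: clipping first could leave part of a secret in the payload or hide the fact that one was present, which is why the size limit is applied only to already-redacted text.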
Trace evidence
Once the agent moves on, operators need to know whether it acted on raw data, an extraction, a summary, or a clipped preview. The trace should keep returned payload size, omitted data, redaction, artifact references, and allowed next actions in one place.
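One way to keep those fields together is a single trace record per tool call. The sketch below assumes a hypothetical `OutputTrace` shape; the field names are illustrative and not taken from any particular tracing system.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class OutputTrace:
    route: str
    representation: str   # "raw", "extracted", "summary", or "preview"
    returned_bytes: int
    omitted_count: int
    redacted_classes: List[str] = field(default_factory=list)
    artifact_ref: Optional[str] = None
    allowed_next_actions: List[str] = field(default_factory=list)


def to_trace_line(trace):
    """Serialize one trace record as a single JSON line with stable key order."""
    return json.dumps(asdict(trace), sort_keys=True)


trace = OutputTrace(
    route="search",
    representation="preview",
    returned_bytes=1800,
    omitted_count=42,
    redacted_classes=["credential"],
    allowed_next_actions=["next_page", "refine_query"],
)
line = to_trace_line(trace)
```

An operator reading `line` later can tell that the agent saw a clipped preview, how much was left out, and which follow-up actions were permitted.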
Copy-paste route card
MCP route:
Caller / tenant:
Data class:
Max bytes / records / tokens:
Allowed fields / schema:
Summary vs raw-data rule:
Artifact handoff rule:
Redaction rule:
Pagination / refill route:
Oversize denial or truncation code:
Receipt fields:
Common misreads
Related reading
Fleet-level budgets once tool calls, retries, and returned payloads all burn scarce capacity.
Pair output ceilings with retry budgets so one route cannot overspend through repeated large responses.
Trace fields for proving what was returned, omitted, redacted, or handed off as an artifact.