Tool output budget checklist
A tool call can be correct and still break the agent if it returns too much. Search results, files, transcripts, logs, and nested API responses need bounded output contracts so the model receives the smallest safe evidence, not a context flood.
The production checklist
Set a maximum response size by route, not just by server. A search result, file summary, database row read, transcript extract, and browser scrape should not share one generic payload limit.
Return typed fields, stable ids, result counts, omitted-count metadata, and next-page cursors before free-form explanation. Let the model reason over bounded structure instead of raw dumps.
Preserve the original sequence of text, image, file, table, and citation blocks or return an explicit order map. Reordering mixed media can make the agent attach evidence to the wrong step even when every block is present.
When the full payload is too large, write it to a durable artifact or provider object and return a reference, checksum, expiration, access rule, and safe follow-up route instead of flooding context.
Name whether the tool returned raw data, extracted fields, a lossy summary, or a sampled preview. The receipt should make lossy compression visible before the agent treats it as ground truth.
Apply redaction before payload shaping, and record which secret, customer-data, credential, prompt, or topology class was removed. Truncation is not a security control.
Expose a cursor, range, query refinement, or approval step for more data. Do not let the agent repeat the same oversized call hoping the next response is smaller.
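The ceiling, structure, and refill items above can be sketched together as a small response shaper. This is a minimal Python sketch under stated assumptions, not a reference implementation: the route names, the limits in `ROUTE_LIMITS`, the `BoundedResult` fields, the `shape_results` helper, and the offset-based cursor format are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-route ceilings; real values depend on the model and the route.
ROUTE_LIMITS = {
    "search": {"max_records": 5, "max_bytes": 4_000},
    "db_row_read": {"max_records": 50, "max_bytes": 16_000},
}


@dataclass
class BoundedResult:
    route: str
    items: list
    total_count: int
    omitted_count: int           # matches not included in this response
    next_cursor: Optional[str]   # None when nothing is left to fetch
    payload_mode: str = "raw"    # raw, extracted, summary, or preview


def shape_results(route: str, matches: list, offset: int = 0) -> BoundedResult:
    """Keep items until the route's record or byte ceiling is reached."""
    limits = ROUTE_LIMITS[route]
    kept = []
    used_bytes = 0
    for item in matches[offset:offset + limits["max_records"]]:
        size = len(json.dumps(item).encode("utf-8"))
        if used_bytes + size > limits["max_bytes"]:
            # A single item larger than max_bytes would yield an empty page;
            # a real route would hand it off as an artifact instead.
            break
        kept.append(item)
        used_bytes += size
    next_offset = offset + len(kept)
    remaining = len(matches) - next_offset
    return BoundedResult(
        route=route,
        items=kept,
        total_count=len(matches),
        omitted_count=len(matches) - len(kept),
        next_cursor=f"offset:{next_offset}" if remaining > 0 else None,
    )


matches = [{"id": i, "title": f"doc {i}"} for i in range(12)]
page = shape_results("search", matches)
print(page.total_count, page.omitted_count, page.next_cursor)  # 12 7 offset:5
```

The counts and cursor travel with the items, so the agent can see that seven matches were withheld and how to ask for them rather than repeating the same broad call.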
Failure fixtures
Expected: Return top bounded results, total count, omitted count, ranking criteria, and a cursor or refinement hint; do not stream every match into context.
Expected: Return section summaries plus artifact reference, byte range, checksum, and follow-up extraction route instead of a full dump.
Expected: Flatten or select approved fields, include the schema version, and record omitted nested objects in the receipt before the agent plans from partial data.
Expected: Return text, images, files, tables, and citations in the route's declared order with block ids and parent ids; do not move all images or artifacts to the end of the response.
Expected: Redact before truncation and record the protected class. A payload that is clipped only after the secret has already been returned fails the gate.
Expected: Deny or require a narrower query after budget exhaustion. The planner should not bypass the output budget by rephrasing the same broad request.
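The redaction fixture above depends on ordering, which a short sketch can make concrete. The following Python example assumes two illustrative regex patterns and a hypothetical `redact_then_truncate` helper; a real deployment would rely on a vetted secret scanner rather than these patterns.

```python
import re

# Hypothetical patterns keyed by protected class (illustrative only).
PROTECTED_PATTERNS = {
    "credential": re.compile(r"\b(?:api[_-]?key|token)\s*[=:]\s*\S+", re.IGNORECASE),
    "customer_email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def redact_then_truncate(text: str, max_chars: int) -> dict:
    """Redact first, clip second, and report both steps in the receipt."""
    redacted_classes = []
    for protected_class, pattern in PROTECTED_PATTERNS.items():
        text, hits = pattern.subn(f"[REDACTED:{protected_class}]", text)
        if hits:
            redacted_classes.append(protected_class)
    # Clipping happens only after every pattern has run, so a secret that
    # sits beyond the cut is still removed and still recorded.
    return {
        "text": text[:max_chars],
        "redacted_classes": redacted_classes,
        "truncated": len(text) > max_chars,
        "omitted_chars": max(0, len(text) - max_chars),
    }


log = "user alice@example.com set api_key=sk-12345 then " + "x" * 200
receipt = redact_then_truncate(log, max_chars=80)
print(receipt["redacted_classes"], receipt["truncated"])
```

Because the clip may cut a redaction marker in half, the receipt lists the protected classes separately instead of expecting readers to infer them from the text.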
Trace evidence
Once the agent moves on, operators need to know whether it acted on raw data, an extraction, a summary, or a clipped preview. The trace should keep returned payload size, omitted data, redaction, artifact references, and allowed next actions in one place.
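One way to keep those facts in one place is a single trace record per tool response. The sketch below is an assumption about shape, not a standard schema: the `OutputTrace` class, its field names, and the four payload modes are hypothetical labels taken from the wording above.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional

PAYLOAD_MODES = {"raw", "extracted", "summary", "preview"}


@dataclass
class OutputTrace:
    route: str
    payload_mode: str                     # what kind of evidence the agent saw
    returned_bytes: int
    omitted_count: int
    redacted_classes: list = field(default_factory=list)
    artifact_ref: Optional[str] = None    # set when the full payload was handed off
    allowed_next_actions: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject unknown modes so a lossy payload cannot be logged ambiguously.
        if self.payload_mode not in PAYLOAD_MODES:
            raise ValueError(f"unknown payload_mode: {self.payload_mode!r}")

    def to_log_line(self) -> str:
        """Serialize with sorted keys so log lines diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)


trace = OutputTrace(
    route="transcript_extract",
    payload_mode="summary",
    returned_bytes=1800,
    omitted_count=42,
    allowed_next_actions=["fetch_range"],
)
print(trace.to_log_line())
```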
Database-backed tools
Database-backed MCP tools need a result-authority check after the query-authority check. Returning raw customer identifiers, free-text notes, nested JSON, or twenty thousand allowed rows can still overexpose context even when the underlying SQL was read-only.
The receipt should prove why this exact slice was safe for the agent to see, not merely that the agent had permission to run a read.
Record a query or filter fingerprint before execution so an audit can tell whether the agent asked for the bounded slice it was allowed to inspect or a broad 'all customers' style read.
Treat returned fields, row scope, tenant scope, redaction class, and sample mode as a second permission check. A legal query can still return evidence the agent should not reason over.
Name the table allowlist, column allowlist, row-level predicate, workspace or tenant rule, and redaction policy that shaped the payload before truncation or summarization.
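The fingerprint and the second permission check can be sketched in a few lines. This Python example is illustrative only: the `POLICY` values, the `tenant_id` column name, and the helpers `query_fingerprint` and `check_result_authority` are hypothetical, and the tenant filter here stands in for whatever row-level predicate the real database enforces.

```python
import hashlib
import json

# Hypothetical policy for one route.
POLICY = {
    "table_allowlist": {"orders"},
    "column_allowlist": {"order_id", "status", "created_at"},
    "max_rows": 100,
}


def query_fingerprint(table: str, filters: dict, tenant_id: str) -> str:
    """Hash a canonical form of the request before it runs."""
    canonical = json.dumps(
        {"table": table, "filters": filters, "tenant": tenant_id}, sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_result_authority(table: str, rows: list, tenant_id: str) -> dict:
    """Shape returned rows by policy and report what was withheld."""
    if table not in POLICY["table_allowlist"]:
        raise PermissionError(f"table {table!r} is not on the allowlist")
    allowed = POLICY["column_allowlist"]
    in_scope = [row for row in rows if row.get("tenant_id") == tenant_id]
    kept = in_scope[: POLICY["max_rows"]]
    seen_columns = {col for row in in_scope for col in row}
    return {
        "rows": [{c: v for c, v in row.items() if c in allowed} for row in kept],
        "dropped_columns": sorted(seen_columns - allowed),
        "out_of_tenant_rows": len(rows) - len(in_scope),
        "omitted_rows": len(in_scope) - len(kept),
    }


rows = [
    {"order_id": 1, "status": "paid", "created_at": "2024-01-01",
     "tenant_id": "t1", "notes": "call back"},
    {"order_id": 2, "status": "open", "created_at": "2024-01-02",
     "tenant_id": "t2", "notes": "x"},
]
print(query_fingerprint("orders", {"status": "paid"}, "t1")[:12])
print(check_result_authority("orders", rows, "t1"))
```

Storing the fingerprint next to the dropped columns and out-of-tenant row count lets an audit compare what was asked for with what the agent was actually allowed to see.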
Copy-paste route card
MCP route:
Caller / tenant:
Data class:
Query / filter fingerprint:
Policy decision source:
Max bytes / records / tokens:
Allowed fields / schema:
Content block order rule:
Summary vs raw-data rule:
Artifact handoff rule:
Redaction rule:
Pagination / refill route:
Oversize denial or truncation code:
Receipt fields:
Common misreads
Related reading
Fleet-level budgets once tool calls, retries, and returned payloads all burn scarce capacity.
Pair output ceilings with retry budgets so one route cannot overspend through repeated large responses.
Trace fields for proving what was returned, omitted, redacted, or handed off as an artifact.