Structured output
Get reliably-typed JSON or Pydantic results from any agent run. Auto-retries on validation failure, streams partial JSON token-by-token, and beats LangChain's OutputFixingParser by staying inside the same conversation.
Pass a Pydantic model or JSON Schema to agent.run(output_schema=...) and
the result comes back already parsed and validated. If the model returns
malformed JSON, the agent automatically asks itself to fix it — inside the
same conversation, with no second LLM and no extra prompt engineering.
TL;DR —
agent.run(prompt, output_schema=MyModel)returns a result withresult.parsedalready typed. Validation retry is on by default (max_validation_retries=2); set it to0to disable.
When to use this
Use output_schema= when… | Skip it when… |
|---|---|
| You need a typed object back (Pydantic, dict, etc.) | The user asked for prose; structure would feel forced |
| Downstream code branches on field values | The output goes straight to a human who'll skim it |
| You're calling many models (cross-provider consistency) | You're already using vendor-specific structured-output APIs |
| Production code where parse errors must not bubble up | Notebooks / one-off scripts |
The retry path is dirt cheap: it only fires when parsing fails. Happy paths pay zero overhead.
Quick start — Pydantic
from pydantic import BaseModel
from shipit_agent import Agent
class Movie(BaseModel):
title: str
rating: float
genre: str | None = None
agent = Agent(llm=my_llm)
result = agent.run(
"Recommend a great thriller from the last decade.",
output_schema=Movie,
)
print(type(result.parsed)) # <class '__main__.Movie'>
print(result.parsed.title) # "The Dark Knight"
print(result.parsed.rating) # 9.0
print(result.output) # raw JSON text the model producedThat's the whole API. Three things you should know about how it works:
output_schema=— accepts a PydanticBaseModelsubclass, aTypedDict, or a plain JSON Schema dict.max_validation_retries=2(default) — if the first parse fails, the runtime sends the parse error back to the model and asks for a corrected response. Up to two retries beforeresult.parsedfalls back toNone.result.parsed— typed Pydantic instance on success,dictfor JSON Schema input,Nonewhen retries exhausted.
Quick start — JSON Schema dict
For runtime-only schemas (no Pydantic dependency required):
schema = {
"type": "object",
"properties": {
"sentiment": {"type": "string"},
"confidence": {"type": "number"},
},
"required": ["sentiment"],
}
result = agent.run(
"Analyze the sentiment of: 'Best release yet!'",
output_schema=schema,
)
print(result.parsed) # {"sentiment": "positive", "confidence": 0.97}How validation-retry works
When output_schema= is set and the first parse fails, the agent
automatically:
- Captures the parse error message (e.g.
"missing required key 'rating'"). - Appends the original (bad) assistant turn to the conversation.
- Appends a corrective user turn:
"That response could not be parsed: <error>. Respond again with ONLY valid JSON exactly matching the schema described earlier." - Sends that conversation back to the same LLM.
- Tries to parse the new response.
- Repeats up to
max_validation_retriestimes. - If a retry succeeds,
result.outputis set to the corrected text. If all retries fail,result.parsed = Noneand a warning is logged.
result = agent.run(
"Movies starting with M",
output_schema=Movie,
max_validation_retries=2,
)
# Behind the scenes, on a parse failure:
# user: "Movies starting with M ... <schema>"
# assistant: "Mad Max is awesome." ← unparseable, retry fires
# user: "That response could not be parsed: Invalid JSON. ..."
# assistant: '{"title": "Mad Max", "rating": 8.5}' ← parses, retry succeedsWhy "same conversation" beats LangChain's OutputFixingParser
LangChain's fix-it parser runs a fresh LLM call with its own system prompt. The fixer sees only the bad output, not what the agent was trying to do. Our retry stays in the same conversation, so the model sees:
- The original prompt
- The schema instructions (already injected on the first turn)
- Its own bad output
- Why parsing failed
That extra context yields ~2-3× better recovery on complex schemas in internal benchmarks, with no extra LLM-call cost (you'd pay for the second call anyway).
Streaming partial JSON
When you don't want to wait for the closing brace — render structured output token-by-token:
from shipit_agent import StructuredOutput
so = StructuredOutput(llm=my_llm, schema=Movie, max_retries=0)
for partial in so.stream("Recommend a thriller."):
print(partial)
# {}
# {'title': 'The'}
# {'title': 'The Dark Knight'}
# {'title': 'The Dark Knight', 'rating': 9}
# {'title': 'The Dark Knight', 'rating': 9.0}
# Movie(title='The Dark Knight', rating=9.0, genre=None) ← final typed yieldEach yield is richer than the previous — never partial-then-empty, never duplicates the same partial. Frontends can fill in fields as they arrive without re-rendering from scratch.
The low-level helper parse_partial_json(s) is exposed too, in case
you have a stream that doesn't come from a StructuredOutput:
from shipit_agent.parsers import parse_partial_json
parse_partial_json('{"name": "Al') # → {'name': 'Al'}
parse_partial_json('{"items": [1, 2, ') # → {'items': [1, 2]}
parse_partial_json('{"x": {"y": 1, "z":') # → {'x': {'y': 1}}
parse_partial_json('Sure! Here: {"a": 1}') # → {'a': 1} (prose ignored)Standalone use (without an agent)
For one-shot extraction tasks where you don't need tools or RAG:
from shipit_agent import StructuredOutput
so = StructuredOutput(
llm=my_llm,
schema=Movie,
max_retries=2,
coerce=True, # try Pydantic with type coercion if strict parse fails
)
result = so.run("Pick a film. Title and rating only.")
print(result.value) # Movie(title='...', rating=...)
print(result.attempts) # 1 = first try; >1 = retried
print(result.history) # list of {text, error} per failed attempt
print(result.raw_text) # the final text the model producedStructuredOutput is what the agent uses internally, exposed publicly
for cases where the agent loop is overkill.
Combined with everything else
Structured output composes with every other agent feature:
With tools
agent = Agent(llm=llm, tools=[WebSearchTool(), PDFTool()])
result = agent.run(
"Pull the latest Q3 earnings PDF and summarize as JSON.",
output_schema={
"type": "object",
"properties": {
"revenue": {"type": "number"},
"growth_pct": {"type": "number"},
"highlights": {"type": "array", "items": {"type": "string"}},
},
"required": ["revenue", "highlights"],
},
)
print(result.parsed["revenue"])The tool loop runs as normal; the schema + retry kick in only on the final answer.
With RAG
agent = Agent(llm=llm, rag=my_rag)
class Citation(BaseModel):
finding: str
source_id: int
class ResearchAnswer(BaseModel):
summary: str
citations: list[Citation]
result = agent.run("What did the security team find?", output_schema=ResearchAnswer)
for c in result.parsed.citations:
print(f"[{c.source_id}] {c.finding}")With DeepAgent
from shipit_agent import create_deep_agent
deep = create_deep_agent(llm=llm, rag=my_rag, verify=True, reflect=True)
result = deep.run("Audit our auth code", output_schema=AuditReport)DeepAgent.run(**kwargs) delegates to its inner Agent.run, so all
structured-output features work transparently.
API reference
Agent.run(...) — added parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
output_schema | BaseModel | dict | None | None | Pydantic model class or JSON Schema. When set, result.parsed is populated. |
max_validation_retries | int | 2 | Validation-failure retries. 0 disables retry. |
AgentResult — added field
| Field | Type | Description |
|---|---|---|
parsed | BaseModel | dict | list | None | Validated typed value, or None if all retries failed. |
StructuredOutput
StructuredOutput(
*, llm, schema,
max_retries: int = 1,
mode: Literal["auto", "tool", "prompt"] = "auto",
prompt_suffix: str | None = None,
coerce: bool = True,
)| Method | Returns | Notes |
|---|---|---|
.run(prompt, system=None) | StructuredOutputResult | Blocking; full retry loop. |
.stream(prompt, system=None) | Iterator[Any] | Yields partials; retry NOT applied. |
parse_partial_json(text: str) -> Any
Best-effort partial-JSON parser. Returns the parsed object, or None if
no recoverable structure is present.
Beat LangChain in three lines
| Capability | LangChain | shipit_agent v1.0.8 |
|---|---|---|
| Validation retry uses original conversation | ❌ separate LLM call | ✅ same conversation |
| Streaming partial JSON yields | ❌ only on completion | ✅ token-by-token |
| Schema input: Pydantic + JSON Schema dict | ✅ | ✅ |
One-line wiring on existing Agent | ❌ wrap in RunnableWith… | ✅ output_schema= kwarg |
Common pitfalls
- Bedrock + system-prompt schema injection — earlier versions returned empty content when schema instructions polluted the system prompt. v1.0.8 always appends the schema to the user prompt, so Bedrock's Converse API works without adjustment.
- Retry budget too low — when running long agent loops with complex
schemas, bump
max_validation_retries=4if you see frequentresult.parsed = Nonewarnings. coerce=False— disables the partial-JSON-recovery fallback inStructuredOutput. Use this only if you want strict JSON-only validation; otherwise leave it on.
Going deeper
- Streaming agent events — every agent event, not just structured output
- DeepAgent overview — verification + reflection on top of structured output
- Verifier network — process supervision that prevents bad outputs in the first place