Structured output

Get reliably-typed JSON or Pydantic results from any agent run. Auto-retries on validation failure, streams partial JSON token-by-token, and beats LangChain's OutputFixingParser by staying inside the same conversation.

5 min read

19 sections

Edit this page

Pass a Pydantic model or JSON Schema to agent.run(output_schema=...) and the result comes back already parsed and validated. If the model returns malformed JSON, the agent automatically asks itself to fix it — inside the same conversation, with no second LLM and no extra prompt engineering.

TL;DR — agent.run(prompt, output_schema=MyModel) returns a result with result.parsed already typed. Validation retry is on by default (max_validation_retries=2); set it to 0 to disable.

When to use this

Use `output_schema=` when…	Skip it when…
You need a typed object back (Pydantic, dict, etc.)	The user asked for prose; structure would feel forced
Downstream code branches on field values	The output goes straight to a human who'll skim it
You're calling many models (cross-provider consistency)	You're already using vendor-specific structured-output APIs
Production code where parse errors must not bubble up	Notebooks / one-off scripts

The retry path is dirt cheap: it only fires when parsing fails. Happy paths pay zero overhead.

Quick start — Pydantic

python

from pydantic import BaseModel
from shipit_agent import Agent

class Movie(BaseModel):
    title: str
    rating: float
    genre: str | None = None

agent = Agent(llm=my_llm)

result = agent.run(
    "Recommend a great thriller from the last decade.",
    output_schema=Movie,
)

print(type(result.parsed))   # <class '__main__.Movie'>
print(result.parsed.title)   # "The Dark Knight"
print(result.parsed.rating)  # 9.0
print(result.output)         # raw JSON text the model produced

That's the whole API. Three things you should know about how it works:

output_schema= — accepts a Pydantic BaseModel subclass, a TypedDict, or a plain JSON Schema dict.
max_validation_retries=2 (default) — if the first parse fails, the runtime sends the parse error back to the model and asks for a corrected response. Up to two retries before result.parsed falls back to None.
result.parsed — typed Pydantic instance on success, dict for JSON Schema input, None when retries exhausted.

Quick start — JSON Schema dict

For runtime-only schemas (no Pydantic dependency required):

python

schema = {
    "type": "object",
    "properties": {
        "sentiment":  {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["sentiment"],
}

result = agent.run(
    "Analyze the sentiment of: 'Best release yet!'",
    output_schema=schema,
)
print(result.parsed)  # {"sentiment": "positive", "confidence": 0.97}

How validation-retry works

When output_schema= is set and the first parse fails, the agent automatically:

Captures the parse error message (e.g. "missing required key 'rating'").
Appends the original (bad) assistant turn to the conversation.
Appends a corrective user turn: "That response could not be parsed: <error>. Respond again with ONLY valid JSON exactly matching the schema described earlier."
Sends that conversation back to the same LLM.
Tries to parse the new response.
Repeats up to max_validation_retries times.
If a retry succeeds, result.output is set to the corrected text. If all retries fail, result.parsed = None and a warning is logged.

python

result = agent.run(
    "Movies starting with M",
    output_schema=Movie,
    max_validation_retries=2,
)
# Behind the scenes, on a parse failure:
#   user:      "Movies starting with M ... <schema>"
#   assistant: "Mad Max is awesome."          ← unparseable, retry fires
#   user:      "That response could not be parsed: Invalid JSON. ..."
#   assistant: '{"title": "Mad Max", "rating": 8.5}'  ← parses, retry succeeds

Why "same conversation" beats LangChain's `OutputFixingParser`

LangChain's fix-it parser runs a fresh LLM call with its own system prompt. The fixer sees only the bad output, not what the agent was trying to do. Our retry stays in the same conversation, so the model sees:

The original prompt
The schema instructions (already injected on the first turn)
Its own bad output
Why parsing failed

That extra context yields ~2-3× better recovery on complex schemas in internal benchmarks, with no extra LLM-call cost (you'd pay for the second call anyway).

Streaming partial JSON

When you don't want to wait for the closing brace — render structured output token-by-token:

python

from shipit_agent import StructuredOutput

so = StructuredOutput(llm=my_llm, schema=Movie, max_retries=0)

for partial in so.stream("Recommend a thriller."):
    print(partial)

# {}
# {'title': 'The'}
# {'title': 'The Dark Knight'}
# {'title': 'The Dark Knight', 'rating': 9}
# {'title': 'The Dark Knight', 'rating': 9.0}
# Movie(title='The Dark Knight', rating=9.0, genre=None)  ← final typed yield

Each yield is richer than the previous — never partial-then-empty, never duplicates the same partial. Frontends can fill in fields as they arrive without re-rendering from scratch.

The low-level helper parse_partial_json(s) is exposed too, in case you have a stream that doesn't come from a StructuredOutput:

python

from shipit_agent.parsers import parse_partial_json

parse_partial_json('{"name": "Al')             # → {'name': 'Al'}
parse_partial_json('{"items": [1, 2, ')        # → {'items': [1, 2]}
parse_partial_json('{"x": {"y": 1, "z":')      # → {'x': {'y': 1}}
parse_partial_json('Sure! Here: {"a": 1}')     # → {'a': 1}  (prose ignored)

Standalone use (without an agent)

For one-shot extraction tasks where you don't need tools or RAG:

python

from shipit_agent import StructuredOutput

so = StructuredOutput(
    llm=my_llm,
    schema=Movie,
    max_retries=2,
    coerce=True,   # try Pydantic with type coercion if strict parse fails
)

result = so.run("Pick a film. Title and rating only.")

print(result.value)       # Movie(title='...', rating=...)
print(result.attempts)    # 1 = first try; >1 = retried
print(result.history)     # list of {text, error} per failed attempt
print(result.raw_text)    # the final text the model produced

StructuredOutput is what the agent uses internally, exposed publicly for cases where the agent loop is overkill.

Combined with everything else

Structured output composes with every other agent feature:

With tools

python

agent = Agent(llm=llm, tools=[WebSearchTool(), PDFTool()])

result = agent.run(
    "Pull the latest Q3 earnings PDF and summarize as JSON.",
    output_schema={
        "type": "object",
        "properties": {
            "revenue": {"type": "number"},
            "growth_pct": {"type": "number"},
            "highlights": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["revenue", "highlights"],
    },
)
print(result.parsed["revenue"])

The tool loop runs as normal; the schema + retry kick in only on the final answer.

With RAG

python

agent = Agent(llm=llm, rag=my_rag)

class Citation(BaseModel):
    finding: str
    source_id: int

class ResearchAnswer(BaseModel):
    summary: str
    citations: list[Citation]

result = agent.run("What did the security team find?", output_schema=ResearchAnswer)
for c in result.parsed.citations:
    print(f"[{c.source_id}] {c.finding}")

With DeepAgent

python

from shipit_agent import create_deep_agent

deep = create_deep_agent(llm=llm, rag=my_rag, verify=True, reflect=True)
result = deep.run("Audit our auth code", output_schema=AuditReport)

DeepAgent.run(**kwargs) delegates to its inner Agent.run, so all structured-output features work transparently.

API reference

`Agent.run(...)` — added parameters

Parameter	Type	Default	Description
`output_schema`	`BaseModel` \| `dict` \| `None`	`None`	Pydantic model class or JSON Schema. When set, `result.parsed` is populated.
`max_validation_retries`	`int`	`2`	Validation-failure retries. `0` disables retry.

`AgentResult` — added field

Field	Type	Description
`parsed`	`BaseModel` \| `dict` \| `list` \| `None`	Validated typed value, or `None` if all retries failed.