Type-Safe Hybrid Workflows with Pydantic AI
In a previous post I argued that business automation should use LLMs only where they add irreplaceable value — and that everything else belongs in regular code. This post is about a practical question that follows from that: if a single pipeline has deterministic steps, LLM-assisted steps, and full agentic steps, how do you keep the seams between them clean?
Pydantic AI gives you one primitive — Agent — and lets you scale it across the spectrum:
- Deterministic: no agent at all, just code.
- LLM-assisted: an Agent with no tools. One LLM call, one validated output.
- Agentic: an Agent with tools, timeouts, and a typed fallback.
The contract is the same in all three cases. Each step's output is declared as a Pydantic model (an agent's output_type, or the return type of a plain function), and the caller gets back a validated instance of it. Whether that instance came from ordinary deterministic code, a single LLM call, or an agent that looped through five tool calls is an implementation detail of that one step.
LLM-assisted: a typed contract for a single call
Take a simple extraction task — pulling structured contact info from a free-text message like “Hi, I’m Anna Schmidt at anna@example.com, my customer ID is 12345.”
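from pydantic import BaseModel
from pydantic_ai import Agent

class ContactInfo(BaseModel):
    # Every field is required but may be None when the message omits it
    name: str | None
    email: str | None
    customer_id: str | None

extractor = Agent(
    model=model,
    system_prompt=EXTRACTOR_PROMPT,
    output_type=ContactInfo,
    output_retries=3,  # re-prompt up to 3 times when output validation fails
)

result = await extractor.run(message_text)
# result.output is already a ContactInfo instance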
If the model returns malformed JSON or a value that violates the schema, Pydantic validation fails and Pydantic AI retries, feeding the validation error back to the model so it can correct itself. If the output is still invalid once the retries configured via output_retries are used up, the run raises instead of returning. Downstream code doesn’t care whether the model nailed it first try or needed several attempts: it only ever sees a validated ContactInfo.
Agentic: same contract, more machinery
When a step needs to try a few strategies — for example, looking up a record when the extraction was ambiguous — the agent gets tools, but the output type doesn’t change.
import asyncio

fetch_agent = Agent(
    model=model,
    system_prompt=FETCH_AGENT_PROMPT,
    output_type=FetchResult,
    tools=[search_by_email, search_by_id, list_recent_contacts],
)

async def fetch_agent_run(info: ContactInfo) -> FetchResult:
    try:
        # asyncio.timeout needs Python 3.11+; caps the whole agent run at 10 seconds
        async with asyncio.timeout(10):
            result = await fetch_agent.run(info.model_dump_json())
        return result.output
    except Exception as e:  # also covers the TimeoutError raised by asyncio.timeout
        # Keyword-style logging as in the original; assumes a structlog-like logger
        log.warning("fetch agent failed", error=e)
        return FetchResult(record=None, error_code="AGENT_FAILED")
Two things worth calling out. First, the output_type is still FetchResult on the error path — we construct one with record=None so downstream code doesn’t need to special-case “did the agent work?” because the shape is the same either way. Second, every LLM call gets an explicit timeout. An agent without a timeout is a production incident waiting to happen.
Deterministic: no agent at all, same shape
When a step doesn’t need an LLM, you don’t use one. But the function still returns the same type:
async def fetch_deterministic(info: ContactInfo) -> FetchResult:
    if info.customer_id:
        if record := await api.get_by_id(info.customer_id):
            return FetchResult(record=record)
    return FetchResult(record=None)
From the caller’s perspective, fetch_deterministic and fetch_agent_run are interchangeable. That’s the property that matters: swapping a step between deterministic and agentic is a local change. The code around it doesn’t notice.
What this actually buys you
Failure modes become typed values. Every way an LLM call can go wrong — malformed output, timeout, refusal, repeated validation failure — gets caught by the runner and turned into an instance of the step’s output type with an error field set. The blast radius of “the LLM did something weird” stops at the runner for that step.
The agency-level decision becomes local. When a step you thought needed an agent turns out to be solvable with deterministic code, you delete the agent and keep the same output_type. When a step you thought was deterministic starts failing on edge cases, you swap in an agent without touching anything around it. The unit of change is one function.
If you’re curious what this looks like in a non-toy pipeline — one with intent classification, structured extraction, deterministic lookups, and an LLM-fallback planner all wired together — see Architecture of a First-Level Support Automation.