Pydantic AI is a library that helps you structure inputs and outputs of AI models using Pydantic models. It enforces types, validations, and structured outputs, which makes building reliable AI systems easier.
Introduction to Pydantic AI
What is Pydantic AI?
A Python agent framework from the Pydantic team that wraps LLMs (like GPT-4o)
Enforces structured outputs using Pydantic models
Ensures that your AI returns predictable fields (strings, numbers, lists, nested objects, etc.)
Useful for multi-agent systems, AI workflows, and API integrations
Why use Pydantic AI?
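Catches malformed outputs: a response with missing fields or wrong types fails validation instead of reaching your code (it cannot guarantee that the content itself is factually correct)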
Makes downstream processing easier (you always know the structure of data)
Works well for structured tasks like forms, summaries, classification, or multi-step reasoning
A. Simple Example
Goal: Convert a user question into a structured response
from pydantic import BaseModel
from pydantic_ai import Agent

# Step 1: Define your structured output
class AnswerModel(BaseModel):
    answer: str
    source: str

# Step 2: Create an agent backed by GPT-4o mini
# (the parameter is output_type in current releases; older releases called it result_type)
agent = Agent("openai:gpt-4o-mini", output_type=AnswerModel)

# Step 3: Ask a question
user_input = "What is the capital of France?"

# Step 4: Get structured output (needs OPENAI_API_KEY set in the environment)
result = agent.run_sync(user_input)
print(result.output.model_dump())

Output (the exact values depend on the model's answer):
{'answer': 'Paris', 'source': 'Wikipedia'}

AnswerModel defines the fields we expect
Agent validates the LLM output against the model
No manual parsing of the raw response is required
B. Nested Models
Sometimes outputs are more complex, e.g., a summary with multiple points:
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent

class Point(BaseModel):
    title: str
    description: str

class SummaryModel(BaseModel):
    topic: str
    points: List[Point]

agent = Agent("openai:gpt-4o-mini", output_type=SummaryModel)
user_input = "Summarize Python OOP concepts"
result = agent.run_sync(user_input)
print(result.output.model_dump())

Why nested models?
You can structure lists, dictionaries, or hierarchical data
Easier for downstream processing or integration with apps
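To see what the nesting buys you without calling an LLM, here is a minimal sketch that validates plain dictionaries against the same kind of model. It assumes Pydantic v2 (model_validate); the sample data is made up for illustration. When a nested item is incomplete, the error location points at the exact item and field.

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Point(BaseModel):
    title: str
    description: str

class SummaryModel(BaseModel):
    topic: str
    points: List[Point]

# Illustrative data shaped like a model response
raw = {
    "topic": "Python OOP",
    "points": [
        {"title": "Classes", "description": "Blueprints for objects"},
        {"title": "Inheritance", "description": "Reuse behavior from a parent class"},
    ],
}
summary = SummaryModel.model_validate(raw)
print(summary.points[1].title)

# The nested point is missing its description
broken = {"topic": "Python OOP", "points": [{"title": "Classes"}]}
try:
    SummaryModel.model_validate(broken)
    error_locations = []
except ValidationError as exc:
    # Each location is a path: field name, list index, nested field name
    error_locations = [e["loc"] for e in exc.errors()]
print(error_locations)
```

The path in error_locations is what makes nested failures easy to debug: you know which list item broke and which field was missing.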
Suppose your AI agent searches for web pages, and each page has multiple authors with additional information.
Each author has:
name and email
Each search result has:
title, url, and a list of authors
All results are returned in a list.
This calls for nested models.
from pydantic import BaseModel, Field
from typing import List, Dict

# Nested model for authors
class Author(BaseModel):
    name: str
    email: str

# Main search result model
class SearchResult(BaseModel):
    title: str
    url: str
    authors: List[Author]  # List of Author models
    metadata: Dict[str, str] = Field(default_factory=dict)  # Extra info in a dictionary

authors: List[Author] → a list of nested Author objects
metadata: Dict[str, str] → a dictionary of additional information like published_date or category
Field(default_factory=dict) → ensures metadata defaults to an empty dict if not provided
Example:
{
    "title": "Python Official Site",
    "url": "https://python.org",
    "authors": [
        {"name": "Guido van Rossum", "email": "[email protected]"},
        {"name": "Python Core Team", "email": "[email protected]"}
    ],
    "metadata": {"published_date": "2023-01-01", "category": "Programming"}
}
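Since the results come back as a list, one way to validate the whole response in a single call is Pydantic v2's TypeAdapter. This is a sketch, not the only approach: the payload is made up, and the example.com address is a placeholder rather than a real contact.

```python
from typing import Dict, List
from pydantic import BaseModel, Field, TypeAdapter

class Author(BaseModel):
    name: str
    email: str

class SearchResult(BaseModel):
    title: str
    url: str
    authors: List[Author]
    metadata: Dict[str, str] = Field(default_factory=dict)

# Illustrative payload; the second entry omits metadata on purpose
payload = [
    {
        "title": "Python Official Site",
        "url": "https://python.org",
        "authors": [{"name": "Guido van Rossum", "email": "guido@example.com"}],
        "metadata": {"published_date": "2023-01-01", "category": "Programming"},
    },
    {
        "title": "Pydantic Docs",
        "url": "https://docs.pydantic.dev",
        "authors": [],
    },
]

# TypeAdapter validates a bare list of models, no wrapper model needed
results = TypeAdapter(List[SearchResult]).validate_python(payload)
for r in results:
    print(r.title, [a.name for a in r.authors], r.metadata)
```

The second result gets an empty metadata dict from default_factory, so downstream code can read r.metadata without checking for None.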
C. Multi-Agent Workflow with Pydantic AI
Imagine a workflow with multiple AI roles:
Worker AI → performs task
Validator AI → checks if task is complete
Formatter AI → converts to structured format
Define Models for Each Agent
from pydantic import BaseModel
from pydantic_ai import Agent

class WorkerOutput(BaseModel):
    content: str
    task_done: bool

class ValidatorOutput(BaseModel):
    is_valid: bool
    feedback: str

class FinalOutput(BaseModel):
    content: str
    approved: bool

Create AI Agents
worker_ai = Agent("openai:gpt-4o-mini", output_type=WorkerOutput)
validator_ai = Agent("openai:gpt-4o-mini", output_type=ValidatorOutput)
formatter_ai = Agent("openai:gpt-4o-mini", output_type=FinalOutput)

Workflow
# Step 1: Worker generates content
worker_result = worker_ai.run_sync("Write a summary about AI agents")
# Prompts are text, so serialize the worker's model before passing it on
worker_json = worker_result.output.model_dump_json()

# Step 2: Validator checks the result
validator_result = validator_ai.run_sync(f"Check whether this task output is complete: {worker_json}")

# Step 3: If valid, format the output
if validator_result.output.is_valid:
    final_result = formatter_ai.run_sync(f"Format this approved output: {worker_json}")
    print(final_result.output.model_dump())
else:
    print("Validation failed:", validator_result.output.feedback)

Explanation:
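Each agent returns a structured output, so the next step knows exactly which fields to expect
Validating at every hand-off keeps a malformed response from propagating through the pipeline
You can branch based on validation or conditions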
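Branching on validation often means retrying with the validator's feedback. Here is a minimal offline sketch of that loop. fake_worker and fake_validator are hypothetical stand-ins for the agent calls (they are not part of Pydantic AI), so the control flow can run without an API key.

```python
from pydantic import BaseModel

class WorkerOutput(BaseModel):
    content: str
    task_done: bool

class ValidatorOutput(BaseModel):
    is_valid: bool
    feedback: str

def fake_worker(prompt: str, attempt: int) -> WorkerOutput:
    # Hypothetical stand-in for an LLM call: the first attempt comes back empty
    content = "" if attempt == 0 else "AI agents plan, act, and report results."
    return WorkerOutput(content=content, task_done=bool(content))

def fake_validator(output: WorkerOutput) -> ValidatorOutput:
    # Hypothetical stand-in for the validator agent
    if output.content.strip():
        return ValidatorOutput(is_valid=True, feedback="ok")
    return ValidatorOutput(is_valid=False, feedback="content is empty")

def run_pipeline(prompt: str, max_attempts: int = 3) -> WorkerOutput:
    for attempt in range(max_attempts):
        output = fake_worker(prompt, attempt)
        verdict = fake_validator(output)
        if verdict.is_valid:
            return output
        # Feed the validator's feedback into the next attempt
        prompt = f"{prompt}\nPrevious attempt rejected: {verdict.feedback}"
    raise RuntimeError(f"no valid output after {max_attempts} attempts")

final = run_pipeline("Write a summary about AI agents")
print(final.content)
```

Capping the attempts matters in a real pipeline: without max_attempts, a worker that never satisfies the validator would loop forever and keep spending tokens.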
D. Advanced Features
1. Default Values
class TaskOutput(BaseModel):
    content: str
    task_done: bool = False  # default value
Useful for optional fields or initial states
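A quick sketch of how defaults behave, assuming Pydantic v2. The notes field is a hypothetical addition to show an optional field; model_fields_set tells you which fields the caller (or the LLM) actually supplied.

```python
from typing import Optional
from pydantic import BaseModel

class TaskOutput(BaseModel):
    content: str
    task_done: bool = False       # initial state
    notes: Optional[str] = None   # hypothetical optional field

task = TaskOutput(content="Draft the report")
print(task.task_done, task.notes)

# Only fields passed explicitly show up here; defaults do not
print(task.model_fields_set)
```

This distinction is handy when a default and an explicit value look the same: task_done=False from the model and task_done=False from the default are not the same signal.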
2. Validators
A validator in Pydantic is a function that automatically checks, and optionally transforms, a field's value while the model instance is being created. The examples here use the Pydantic v2 decorator @field_validator, since Pydantic AI depends on Pydantic v2; in v1 the equivalent decorator was @validator.
Key points:
Defined using the @field_validator("<field_name>") decorator.
Receives the value of the field as input.
Can raise a ValueError (or AssertionError) to reject invalid values.
Must return the value (possibly transformed) if valid.

from pydantic import BaseModel, field_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    @field_validator("content")
    @classmethod
    def must_not_be_empty(cls, v):
        if not v.strip():
            raise ValueError("Content cannot be empty")
        return v

Ensures content is valid before saving or using it
Example:
# Valid data
task1 = TaskOutput(content="Finish the report", task_done=False)
print(task1)
# Output: content='Finish the report' task_done=False

# Invalid data
try:
    task2 = TaskOutput(content=" ", task_done=True)
except ValueError as e:  # pydantic.ValidationError is a subclass of ValueError
    print(e)
# Output (Pydantic v2, abridged):
# 1 validation error for TaskOutput
# content
#   Value error, Content cannot be empty [type=value_error, input_value=' ', input_type=str]

The validator ensures that empty content is never allowed, protecting downstream code from invalid data.
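Validators can also clean a value, not only reject it. Below is a small sketch of a transforming validator, written with Pydantic v2's @field_validator (the v2 counterpart of @validator). The validator name and the default on task_done are choices made for this illustration.

```python
from pydantic import BaseModel, field_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool = False

    @field_validator("content")
    @classmethod
    def strip_and_require_text(cls, v: str) -> str:
        cleaned = v.strip()
        if not cleaned:
            raise ValueError("Content cannot be empty")
        # Whatever the validator returns is what the model stores
        return cleaned

task = TaskOutput(content="  Finish the report \n")
print(repr(task.content))
```

LLM output often carries stray whitespace or newlines, so normalizing inside the model keeps that cleanup in one place.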
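Multiple field validators
You can apply one validator to several fields (shown with Pydantic v2's @field_validator, where the field name comes from the info argument):
from pydantic import BaseModel, ValidationInfo, field_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    # mode="before" runs ahead of type checking, so the function sees a raw None
    @field_validator("content", "task_done", mode="before")
    @classmethod
    def not_none(cls, v, info: ValidationInfo):
        if v is None:
            raise ValueError(f"{info.field_name} cannot be None")
        return v

Now both content and task_done reject None with a custom message. (The str and bool annotations already reject None on their own; the validator only customizes the error.)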
Cross-field validation using @model_validator
Sometimes you want to validate fields together. Pydantic v2 uses @model_validator for this (it replaces v1's @root_validator):
from pydantic import BaseModel, model_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    # mode="after" runs once every field has passed its own validation
    @model_validator(mode="after")
    def check_content_done(self):
        if self.task_done and not self.content.strip():
            raise ValueError("Cannot mark task done if content is empty")
        return self

Ensures that you cannot mark a task as done if the content is empty.
@field_validator (the Pydantic v2 successor to @validator) validates and cleans individual fields.
It runs every time a model is created, so invalid values are rejected at the boundary.
Raise ValueError to prevent invalid data creation.
@model_validator (the v2 successor to @root_validator) is used for validations involving multiple fields.
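Here is the cross-field rule in action, as a sketch using Pydantic v2's @model_validator (the v2 counterpart of @root_validator). An unfinished task may have empty content, but a finished one may not.

```python
from pydantic import BaseModel, ValidationError, model_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    @model_validator(mode="after")
    def check_content_done(self):
        if self.task_done and not self.content.strip():
            raise ValueError("Cannot mark task done if content is empty")
        return self

# Empty content is fine while the task is still open
draft = TaskOutput(content="", task_done=False)

try:
    TaskOutput(content="   ", task_done=True)
    message = None
except ValidationError as exc:
    # errors() gives structured details; "msg" carries the ValueError text
    message = exc.errors()[0]["msg"]
print(message)
```

Reading exc.errors() instead of parsing the printed message makes it easier to send the failure back to an agent as feedback.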
E. List / Dict Validations
class Team(BaseModel):
    members: List[str]

team = Team(members=["Alice", "Bob"])

Automatically validates the element types and the overall structure; length limits need an explicit constraint
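A type hint like List[str] checks the items but not how many there are. Here is a sketch of adding length limits with Field, assuming Pydantic v2 (where the list constraints are min_length and max_length). The scores field and the error_types helper are hypothetical, added only for the demonstration.

```python
from typing import Dict, List
from pydantic import BaseModel, Field, ValidationError

class Team(BaseModel):
    members: List[str] = Field(min_length=1, max_length=5)
    scores: Dict[str, int] = Field(default_factory=dict)  # hypothetical extra field

# Numeric strings are coerced to int in Pydantic's default (lax) mode
team = Team(members=["Alice", "Bob"], scores={"Alice": "3"})
print(team.scores)

def error_types(**data):
    """Return (location, error type) pairs, or an empty list if data is valid."""
    try:
        Team(**data)
    except ValidationError as exc:
        return [(e["loc"], e["type"]) for e in exc.errors()]
    return []

print(error_types(members=[]))             # too few members
print(error_types(members=["Alice", 42]))  # second item is not a string
```

The error location for the bad item includes its index, so you can tell which element of the list failed.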
F. Combining with Async Agents
import asyncio

async def main():
    worker_result = await worker_ai.run("Write an async task")
    print(worker_result.output.model_dump())

asyncio.run(main())

Supports asynchronous workflows for speed and scalability
async def main(): defines an asynchronous function.
Async functions allow you to run tasks concurrently, without blocking other operations.
await worker_ai.run("Write an async task"): run is the asynchronous counterpart of run_sync. await pauses this coroutine until the AI finishes, without blocking other tasks in your event loop. This is crucial for speed and scalability when you have multiple AI calls or I/O operations.
worker_result.output.model_dump(): worker_result.output is a Pydantic model (here WorkerOutput, from the worker agent defined earlier), and .model_dump() converts the model to a dictionary.
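The payoff of async shows up when several calls run at once. Here is a minimal sketch with asyncio.gather; fake_agent_call is a hypothetical stand-in for an awaited agent call, with asyncio.sleep simulating network latency, so it runs without an API key.

```python
import asyncio
from pydantic import BaseModel

class WorkerOutput(BaseModel):
    content: str
    task_done: bool

async def fake_agent_call(prompt: str) -> WorkerOutput:
    # Hypothetical stand-in for an awaited agent call
    await asyncio.sleep(0.1)
    return WorkerOutput(content=f"Result for: {prompt}", task_done=True)

async def run_all(prompts):
    # gather runs the calls concurrently and returns results in input order
    return await asyncio.gather(*(fake_agent_call(p) for p in prompts))

prompts = ["Summarize topic A", "Summarize topic B", "Summarize topic C"]
results = asyncio.run(run_all(prompts))
for r in results:
    print(r.content)
```

Because the three calls wait at the same time, the total time is close to one call's latency rather than the sum of all three.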
G. Integration with Tools
Suppose your AI agent needs to call an external search API and return results. Using Pydantic, you can define a structured output model:
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    url: str

Here:
title → the title of the web page
url → the link to the page
This ensures any output the AI generates conforms to this structure, making it easier to process, display, or store.
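Real responses are not always clean, so it helps to decide what happens to rows that do not conform. Here is one possible sketch that keeps the valid rows and sets the rejected ones aside. parse_results and the sample rows are hypothetical, and the behavior shown assumes Pydantic v2, which does not coerce an int into a str field.

```python
from typing import List, Tuple
from pydantic import BaseModel, ValidationError

class SearchResult(BaseModel):
    title: str
    url: str

def parse_results(rows: List[dict]) -> Tuple[List[SearchResult], List[dict]]:
    """Split raw rows into validated models and rejected raw rows."""
    valid, rejected = [], []
    for row in rows:
        try:
            valid.append(SearchResult(**row))
        except ValidationError:
            rejected.append(row)
    return valid, rejected

rows = [
    {"title": "Python Official Site", "url": "https://python.org"},
    {"title": "Missing link"},                           # no url
    {"title": 123, "url": "https://docs.pydantic.dev"},  # title is not a string
]
valid, rejected = parse_results(rows)
print(len(valid), "valid,", len(rejected), "rejected")
```

Keeping the rejected rows (instead of dropping them silently) lets you log them or hand them back to the agent for another attempt.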
Mock “external search API”
We’ll simulate an API that returns a list of results:
def mock_search_api(query):
    # Pretend this is an external API returning data
    return [
        {"title": "Python Official Site", "url": "https://python.org"},
        {"title": "Pydantic Docs", "url": "https://docs.pydantic.dev"},
    ]

Convert API results to structured Pydantic models
results = mock_search_api("Python tutorials")

# Use Pydantic to structure the output
structured_results = [SearchResult(**r) for r in results]

# Print each structured result
for r in structured_results:
    print(r)

Output:
title='Python Official Site' url='https://python.org'
title='Pydantic Docs' url='https://docs.pydantic.dev'

This covers the basics of Pydantic AI, which matters a great deal if you are working with AI agents.
Pydantic AI brings structure and reliability to AI workflows. By defining models for your AI outputs, you can:
Validate outputs automatically – Catch missing or invalid fields before they reach the rest of your code.
Structure nested data – Handle lists, dictionaries, and complex nested objects with ease.
Integrate with external APIs and tools – Convert messy API responses into clean, predictable Pydantic models.
Support asynchronous workflows – Combine async AI calls with validation for fast, scalable pipelines.
Whether it’s a simple task output, a search result, or a nested multi-author dataset, Pydantic ensures your AI outputs are safe, structured, and ready to use in production.
With Pydantic AI, you don't just get data from your models; you get data that has been validated against the structure you defined.

