Pydantic AI is a library that helps you structure inputs and outputs of AI models using Pydantic models. It enforces types, validations, and structured outputs, which makes building reliable AI systems easier.

Introduction to Pydantic AI

What is Pydantic AI?

  • An agent framework that wraps LLMs (such as GPT-4o)

  • Enforces structured outputs using Pydantic models

  • Ensures that your AI returns predictable fields (strings, numbers, lists, nested objects, etc.)

  • Useful for multi-agent systems, AI workflows, and API integrations

Why use Pydantic AI?

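  • Catches malformed outputs: a reply that does not match the schema fails validation instead of passing through silently (this checks structure and types, not factual accuracy)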

  • Makes downstream processing easier (you always know the structure of data)

  • Works well for structured tasks like forms, summaries, classification, or multi-step reasoning

A. Simple Example

Goal: Convert a user question into a structured response

from pydantic import BaseModel
from pydantic_ai import Agent

# Step 1: Define your structured output
class AnswerModel(BaseModel):
    answer: str
    source: str

# Step 2: Create an agent backed by GPT-4o mini
# (needs the OPENAI_API_KEY environment variable; older pydantic-ai
# releases used result_type= and result.data instead)
agent = Agent("openai:gpt-4o-mini", output_type=AnswerModel)

# Step 3: Ask a question
user_input = "What is the capital of France?"

# Step 4: Run the agent and read the validated output
result = agent.run_sync(user_input)
print(result.output.model_dump())

Example output (the exact values depend on the model's reply):

{'answer': 'Paris', 'source': 'Wikipedia'}

  • AnswerModel defines the fields we expect

  • The agent validates the LLM output against the model before returning it

  • No manual parsing of the reply is required
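To make the validation step concrete, here is a minimal sketch that skips the LLM entirely and validates hand-written JSON strings against the same model with plain Pydantic (v2 assumed). The two reply strings are invented stand-ins for model replies, not real GPT output.

```python
from pydantic import BaseModel, ValidationError

class AnswerModel(BaseModel):
    answer: str
    source: str

# Invented stand-ins for raw LLM replies (assumption: the model replies with JSON text)
good_reply = '{"answer": "Paris", "source": "Wikipedia"}'
bad_reply = '{"answer": "Paris"}'  # the required "source" field is missing

# A well-formed reply parses into a typed object
parsed = AnswerModel.model_validate_json(good_reply)
print(parsed.answer)

# A malformed reply raises ValidationError instead of slipping through
try:
    AnswerModel.model_validate_json(bad_reply)
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc.error_count(), "validation error(s)")
```

The point of the sketch is only that a missing field is an exception, not a silent None.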

B. Nested Models

Sometimes outputs are more complex, e.g., a summary with multiple points:

from pydantic import BaseModel
from pydantic_ai import Agent

class Point(BaseModel):
    title: str
    description: str

class SummaryModel(BaseModel):
    topic: str
    points: list[Point]  # a list of nested Point models

agent = Agent("openai:gpt-4o-mini", output_type=SummaryModel)

user_input = "Summarize Python OOP concepts"

result = agent.run_sync(user_input)
print(result.output.model_dump())

Why nested models?

  • You can structure lists, dictionaries, or hierarchical data

  • Easier for downstream processing or integration with apps
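One way to see what a nested model describes is to print its JSON Schema. The sketch below uses plain Pydantic (v2 assumed) and redefines the two models so it runs on its own. How a particular LLM library passes such a schema to the model is not shown here.

```python
from pydantic import BaseModel

class Point(BaseModel):
    title: str
    description: str

class SummaryModel(BaseModel):
    topic: str
    points: list[Point]

# model_json_schema() returns a plain dict describing the expected structure
schema = SummaryModel.model_json_schema()

print(schema["required"])              # top-level fields that must be present
print(schema["properties"]["points"])  # an array whose items reference Point
print(sorted(schema["$defs"]))         # nested models are listed under "$defs"
```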

Suppose your AI agent searches for web pages and each page has multiple authors with additional info.

  • Each author has: name and email

  • Each search result has: title, url, and a list of authors

  • All results are returned in a list

This will involve nested models.

from pydantic import BaseModel, Field
from typing import List, Dict

# Nested model for authors
class Author(BaseModel):
    name: str
    email: str

# Main search result model
class SearchResult(BaseModel):
    title: str
    url: str
    authors: List[Author]  # List of Author models
    metadata: Dict[str, str] = Field(default_factory=dict)  # Extra info in a dictionary
  • authors: List[Author] → a list of nested Author objects

  • metadata: Dict[str, str] → a dictionary of additional information like published_date or category

  • Field(default_factory=dict) → ensures metadata defaults to an empty dict if not provided

    Example:

    {
      "title": "Python Official Site",
      "url": "https://python.org",
      "authors": [
        {"name": "Guido van Rossum", "email": "[email protected]"},
        {"name": "Python Core Team", "email": "[email protected]"}
      ],
      "metadata": {"published_date": "2023-01-01", "category": "Programming"}
    }

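As a rough illustration of how such a payload becomes typed objects, the sketch below validates a dictionary against the nested models with plain Pydantic (v2 assumed). The e-mail address is a placeholder, and the metadata key is left out on purpose to show the default.

```python
from pydantic import BaseModel, Field

class Author(BaseModel):
    name: str
    email: str

class SearchResult(BaseModel):
    title: str
    url: str
    authors: list[Author]
    metadata: dict[str, str] = Field(default_factory=dict)

raw = {
    "title": "Python Official Site",
    "url": "https://python.org",
    "authors": [
        {"name": "Guido van Rossum", "email": "guido@example.com"},  # placeholder address
    ],
    # "metadata" deliberately omitted to exercise the default
}

# Nested dicts are converted into Author instances during validation
result = SearchResult.model_validate(raw)
print(result.authors[0].name)
print(result.metadata)
```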
C. Multi-Agent Workflow with Pydantic AI

Imagine a workflow with multiple AI roles:

  1. Worker AI → performs task

  2. Validator AI → checks if task is complete

  3. Formatter AI → converts to structured format

Define Models for Each Agent

class WorkerOutput(BaseModel):
    content: str
    task_done: bool

class ValidatorOutput(BaseModel):
    is_valid: bool
    feedback: str

class FinalOutput(BaseModel):
    content: str
    approved: bool

Create AI Agents

from pydantic_ai import Agent

worker_ai = Agent("openai:gpt-4o-mini", output_type=WorkerOutput)

validator_ai = Agent("openai:gpt-4o-mini", output_type=ValidatorOutput)

formatter_ai = Agent("openai:gpt-4o-mini", output_type=FinalOutput)

Workflow

# Step 1: Worker generates content
worker_output = worker_ai.run_sync("Write a summary about AI agents").output

# Step 2: Validator checks the result
# (agents take a text prompt, so the model is serialized to JSON first)
validation = validator_ai.run_sync(worker_output.model_dump_json()).output

# Step 3: If valid, format the output
if validation.is_valid:
    final_result = formatter_ai.run_sync(worker_output.model_dump_json())
    print(final_result.output.model_dump())
else:
    print("Validation failed:", validation.feedback)

Explanation:

  • Each agent returns a structured, validated output that the next step can rely on

  • Workflow ensures reliable AI pipelines

  • You can branch based on validation or conditions
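The branching idea can be tried offline. The sketch below replaces the LLM-backed agents with two hypothetical stub functions (fake_worker and fake_validator, invented for this illustration) and adds a simple retry loop, which is one possible way to act on validator feedback.

```python
from typing import Optional

from pydantic import BaseModel

class WorkerOutput(BaseModel):
    content: str
    task_done: bool

class ValidatorOutput(BaseModel):
    is_valid: bool
    feedback: str

# Hypothetical stand-ins for LLM-backed agents, so the control flow runs offline
def fake_worker(prompt: str, attempt: int) -> WorkerOutput:
    # The first attempt returns an empty draft; later attempts return content
    content = "" if attempt == 0 else f"Summary for: {prompt}"
    return WorkerOutput(content=content, task_done=bool(content))

def fake_validator(output: WorkerOutput) -> ValidatorOutput:
    if output.task_done and output.content.strip():
        return ValidatorOutput(is_valid=True, feedback="ok")
    return ValidatorOutput(is_valid=False, feedback="content is empty")

def run_pipeline(prompt: str, max_attempts: int = 3) -> Optional[WorkerOutput]:
    # Retry the worker until the validator approves or attempts run out
    for attempt in range(max_attempts):
        draft = fake_worker(prompt, attempt)
        verdict = fake_validator(draft)
        if verdict.is_valid:
            return draft
        print(f"attempt {attempt} rejected: {verdict.feedback}")
    return None

final = run_pipeline("AI agents")
print(final)
```

Because every hand-off is a typed model, the branch condition is a plain attribute check rather than string parsing.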

D. Advanced Features

1. Default Values
class TaskOutput(BaseModel):
    content: str
    task_done: bool = False  # default value
  • Useful for optional fields or initial states
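A short sketch of the default in action, using plain Pydantic (v2 assumed for model_dump):

```python
from pydantic import BaseModel

class TaskOutput(BaseModel):
    content: str
    task_done: bool = False  # used when the field is omitted

omitted = TaskOutput(content="Draft the report")
explicit = TaskOutput(content="Draft the report", task_done=True)

print(omitted.task_done)     # default applied
print(explicit.task_done)    # explicit value wins
print(omitted.model_dump())  # the default also appears in the serialized form
```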

2. Validators

A validator in Pydantic is a function that automatically checks and transforms fields in your model before the model instance is created.

Key points:

  • Defined using the @field_validator("<field_name>") decorator (named @validator in Pydantic v1, and deprecated under that name in v2).

  • Receives the value of the field as input.

  • Can raise a ValueError (or TypeError) to reject invalid values.

  • Must return the value (possibly transformed) if valid.

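# Pydantic v2: field_validator replaces the deprecated validator decorator
from pydantic import BaseModel, field_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    @field_validator("content")
    @classmethod
    def must_not_be_empty(cls, v: str) -> str:
        # Reject strings that are empty or whitespace-only
        if not v.strip():
            raise ValueError("Content cannot be empty")
        return v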
  • Ensures content is valid before saving or using it

Example:

# Valid data
task1 = TaskOutput(content="Finish the report", task_done=False)
print(task1)
# Output: content='Finish the report' task_done=False

# Invalid data
try:
    task2 = TaskOutput(content="   ", task_done=True)
except ValueError as e:
    print(e)
# Output (Pydantic v2, abbreviated):
# 1 validation error for TaskOutput
# content
#   Value error, Content cannot be empty [type=value_error, input_value='   ', input_type=str]

The validator ensures that empty content is never allowed, protecting downstream code from invalid data.
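Validators can also transform a value, as the key points above mention. The following sketch (Pydantic v2 assumed; the validator name is made up for the example) strips surrounding whitespace and stores the cleaned string.

```python
from pydantic import BaseModel, field_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    @field_validator("content")
    @classmethod
    def strip_and_require(cls, v: str) -> str:
        cleaned = v.strip()
        if not cleaned:
            raise ValueError("Content cannot be empty")
        return cleaned  # the returned value is what the model stores

task = TaskOutput(content="  Finish the report  ", task_done=False)
print(repr(task.content))
```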

Multiple field validators

You can validate multiple fields:

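from pydantic import BaseModel, ValidationInfo, field_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    # mode="before" runs the check ahead of type validation,
    # so a None value gets this message rather than a type error
    @field_validator("content", "task_done", mode="before")
    @classmethod
    def not_none(cls, v, info: ValidationInfo):
        if v is None:
            raise ValueError(f"{info.field_name} cannot be None")
        return v
  • Now both content and task_done reject None with a field-specific message.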

Cross-field validation using @model_validator

Sometimes you want to validate fields together. In Pydantic v2 this is done with @model_validator, which replaces v1's @root_validator:

from pydantic import BaseModel, model_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    # mode="after" runs once all fields have been validated individually
    @model_validator(mode="after")
    def check_content_done(self):
        if self.task_done and not self.content.strip():
            raise ValueError("Cannot mark task done if content is empty")
        return self
  • Ensures that you cannot mark a task as done if the content is empty.

  • @field_validator (@validator in Pydantic v1) validates and cleans individual fields.

  • Any model instance you hold has already passed these checks, so downstream code can rely on it.

  • Raising ValueError inside a validator blocks the creation of an invalid instance.

  • @model_validator (@root_validator in Pydantic v1) handles checks that involve multiple fields.
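Here is a small usage sketch of a cross-field check, assuming Pydantic v2. It redefines the model so it runs on its own and collects the error messages rather than printing the full exception.

```python
from pydantic import BaseModel, ValidationError, model_validator

class TaskOutput(BaseModel):
    content: str
    task_done: bool

    @model_validator(mode="after")
    def check_content_done(self):
        if self.task_done and not self.content.strip():
            raise ValueError("Cannot mark task done if content is empty")
        return self

# Empty content is fine as long as the task is not marked done
ok = TaskOutput(content="", task_done=False)

# Empty content plus task_done=True violates the cross-field rule
try:
    TaskOutput(content="", task_done=True)
    messages = []
except ValidationError as exc:
    messages = [err["msg"] for err in exc.errors()]

print(messages)
```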

E. List / Dict Validations

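from pydantic import BaseModel

class Team(BaseModel):
    members: list[str]

team = Team(members=["Alice", "Bob"])

Pydantic automatically validates the container type and the type of each element. Length is only checked if you add a constraint such as Field(min_length=1).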
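As a sketch of what is and is not checked, the example below adds an explicit length constraint and a dictionary field (the roles field and the is_valid helper are invented for illustration; Pydantic v2 assumed).

```python
from pydantic import BaseModel, Field, ValidationError

class Team(BaseModel):
    members: list[str] = Field(min_length=1)  # at least one member required
    roles: dict[str, str] = Field(default_factory=dict)

team = Team(members=["Alice", "Bob"], roles={"Alice": "lead"})

def is_valid(**data) -> bool:
    """Return True if the data builds a Team, False on a validation error."""
    try:
        Team(**data)
        return True
    except ValidationError:
        return False

print(is_valid(members=["Alice"]))        # well-formed list
print(is_valid(members=[]))               # violates min_length
print(is_valid(members=["Alice", None]))  # element has the wrong type
print(is_valid(members="Alice"))          # not a list at all
```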

F. Combining with Async Agents

import asyncio

async def main():
    # Agent.run() is the awaitable counterpart of run_sync()
    worker_result = await worker_ai.run("Write an async task")
    print(worker_result.output.model_dump())

asyncio.run(main())
  • Supports asynchronous workflows for speed and scalability

  • async def main():

    • Defines an asynchronous function (a coroutine).

    • Coroutines let the event loop run other tasks while this one waits, instead of blocking.

  • await worker_ai.run("Write an async task")

    • run is the asynchronous call to the agent; run_sync is its blocking counterpart.

    • await pauses main() until the AI finishes, without blocking other tasks in your event loop.

    • This matters for speed and scalability when you have multiple AI calls or I/O operations.

  • worker_result.output.model_dump()

    • worker_result.output is a Pydantic model instance (here, WorkerOutput).

    • .model_dump() converts the model into a plain Python dictionary.
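The benefit of async shows up when several calls run at once. The sketch below uses a hypothetical fake_agent_call coroutine that sleeps instead of contacting a model, so it runs offline. With a real agent you would await its asynchronous call in the same place.

```python
import asyncio
import time

# Hypothetical stand-in for an awaitable LLM call; sleeps instead of using the network
async def fake_agent_call(prompt: str, delay: float = 0.2) -> str:
    await asyncio.sleep(delay)
    return f"done: {prompt}"

async def main() -> list:
    prompts = ["task A", "task B", "task C"]
    # gather() runs the three coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_agent_call(p) for p in prompts))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Since the three waits overlap, the total time should be close to one delay rather than the sum of all three.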

G. Integration with Tools

Suppose your AI agent needs to call an external search API and return results. Using Pydantic, you can define a structured output model:

from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    url: str

Here:

  • title → the title of the web page

  • url → the link to the page

This ensures any output the AI generates conforms to this structure, making it easier to process, display, or store.

Mock “external search API”

We’ll simulate an API that returns a list of results:

def mock_search_api(query):
    # Pretend this is an external API returning data
    return [
        {"title": "Python Official Site", "url": "https://python.org"},
        {"title": "Pydantic Docs", "url": "https://docs.pydantic.dev"},
    ]

Convert API results to structured Pydantic models

results = mock_search_api("Python tutorials")

# Use Pydantic to structure the output
structured_results = [SearchResult(**r) for r in results]

# Print each structured result
for r in structured_results:
    print(r)

Output:

title='Python Official Site' url='https://python.org'
title='Pydantic Docs' url='https://docs.pydantic.dev'
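Real API responses are not always this clean. One possible way to handle that, sketched below with an invented payload, is to validate record by record and set aside the ones that fail (Pydantic v2 assumed for error_count).

```python
from pydantic import BaseModel, ValidationError

class SearchResult(BaseModel):
    title: str
    url: str

# Invented API payload: the second record is missing its "url"
raw_results = [
    {"title": "Python Official Site", "url": "https://python.org"},
    {"title": "Broken record"},
    {"title": "Pydantic Docs", "url": "https://docs.pydantic.dev"},
]

valid, rejected = [], []
for record in raw_results:
    try:
        valid.append(SearchResult(**record))
    except ValidationError as exc:
        # Keep the offending record and how many fields failed
        rejected.append((record, exc.error_count()))

print([r.title for r in valid])
print(rejected)
```

Whether to skip, log, or fail on a bad record is a design choice that depends on the pipeline.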

This covers the basics of Pydantic AI, which is especially useful if you are working with AI agents.

Pydantic AI brings structure and reliability to AI workflows. By defining models for your AI outputs, you can:

  • Validate outputs automatically – Catch missing or invalid fields before they reach downstream code.

  • Structure nested data – Handle lists, dictionaries, and complex nested objects with ease.

  • Integrate with external APIs and tools – Convert messy API responses into clean, predictable Pydantic models.

  • Support asynchronous workflows – Combine async AI calls with validation for fast, scalable pipelines.

Whether it’s a simple task output, a search result, or a nested multi-author dataset, Pydantic ensures your AI outputs are safe, structured, and ready to use in production.

With Pydantic AI, you don’t just get raw text from your models; you get data that has been validated against a schema you define.
