building-pydantic-ai-agents
Build AI agents with Pydantic AI — tools, capabilities (including on-demand loading), structured output, streaming, testing, and multi-agent patterns. Use when the user mentions Pydantic AI, imports pydantic_ai, or asks to build an AI agent, add tools/capabilities, defer capability loading, stream output, define agents from YAML, or test agent behavior.
What this skill does
# Building AI Agents with Pydantic AI
Pydantic AI is a Python agent framework for building production-grade Generative AI applications.
This skill provides patterns, architecture guidance, and tested code examples for building applications with Pydantic AI.
## When to Use This Skill
Invoke this skill when:
- User asks to build an AI agent, create an LLM-powered app, or mentions Pydantic AI
- User wants to add tools, capabilities (thinking, web search), or structured output to an agent
- User asks to define agents from YAML/JSON specs or use template strings
- User wants to stream agent events, delegate between agents, or test agent behavior
- Code imports `pydantic_ai` or references Pydantic AI classes (`Agent`, `RunContext`, `Tool`)
- User asks about hooks, lifecycle interception, or agent observability with Logfire
- The agent design includes optional instructions, specialist workflows, long-tail tools, or any context the model does not need on most turns
Do **not** use this skill for:
- The Pydantic validation library alone (`pydantic`/`BaseModel` without agents)
- Other AI frameworks (LangChain, LlamaIndex, CrewAI, AutoGen)
- General Python development unrelated to AI agents
## Quick-Start Patterns
### Create a Basic Agent
```python
from pydantic_ai import Agent
agent = Agent(
'anthropic:claude-sonnet-4-6',
instructions='Be concise, reply with one sentence.',
)
result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```
### Add Tools to an Agent
```python
import random
from pydantic_ai import Agent, RunContext
agent = Agent(
'google:gemini-3-flash-preview',
deps_type=str,
instructions=(
"You're a dice game, you should roll the die and see if the number "
"you get back matches the user's guess. If so, tell them they're a winner. "
"Use the player's name in the response."
),
)
@agent.tool_plain
def roll_dice() -> str:
"""Roll a six-sided die and return the result."""
return str(random.randint(1, 6))
@agent.tool
def get_player_name(ctx: RunContext[str]) -> str:
"""Get the player's name."""
return ctx.deps
dice_result = agent.run_sync('My guess is 4', deps='Anne')
print(dice_result.output)
#> Congratulations Anne, you guessed correctly! You're a winner!
```
### Structured Output with Pydantic Models
```python
from pydantic import BaseModel
from pydantic_ai import Agent
class CityLocation(BaseModel):
city: str
country: str
agent = Agent('google:gemini-3-flash-preview', output_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)
#> city='London' country='United Kingdom'
print(result.usage)
#> RunUsage(input_tokens=57, output_tokens=8, requests=1)
```
### Dependency Injection
```python
from datetime import date
from pydantic_ai import Agent, RunContext
agent = Agent(
'openai:gpt-5.2',
deps_type=str,
instructions="Use the customer's name while replying to them.",
)
@agent.instructions
def add_the_users_name(ctx: RunContext[str]) -> str:
return f"The user's name is {ctx.deps}."
@agent.instructions
def add_the_date() -> str:
return f'The date is {date.today()}.'
result = agent.run_sync('What is the date?', deps='Frank')
print(result.output)
#> Hello Frank, the date today is 2032-01-02.
```
### Testing with TestModel
```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
my_agent = Agent('openai:gpt-5.2', instructions='...')
async def test_my_agent():
"""Unit test for my_agent, to be run by pytest."""
m = TestModel()
with my_agent.override(model=m):
result = await my_agent.run('Testing my agent...')
assert result.output == 'success (no tool calls)'
assert m.last_model_request_parameters.function_tools == []
```
### Use Capabilities
Capabilities are reusable, composable units of agent behavior — bundling tools, hooks, instructions, and model settings.
```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking, WebSearch
agent = Agent(
'anthropic:claude-opus-4-6',
instructions='You are a research assistant. Be thorough and cite sources.',
capabilities=[
Thinking(effort='high'),
WebSearch(),
],
)
```
### Add Lifecycle Hooks
Use `Hooks` to intercept model requests, tool calls, and runs with decorators — no subclassing needed.
```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities.hooks import Hooks
from pydantic_ai.models import ModelRequestContext
hooks = Hooks()
@hooks.on.before_model_request
async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
print(f'Sending {len(request_context.messages)} messages')
return request_context
agent = Agent('openai:gpt-5.2', capabilities=[hooks])
```
### Define Agent from YAML Spec
Use `Agent.from_file` to load agents from YAML or JSON — no Python agent construction code needed.
```python
from pydantic_ai import Agent
# agent.yaml:
# model: anthropic:claude-opus-4-6
# instructions: You are a helpful research assistant.
# capabilities:
# - WebSearch
# - Thinking:
# effort: high
agent = Agent.from_file('agent.yaml')
```
## Task Routing Table
Load only the most relevant reference first. Read additional references only if the task spans multiple areas.
| I want to... | Reference |
|---|---|
| Create/configure agents, choose output types, use deps, define specs, or pick run methods | [Agents Core](./references/AGENTS-CORE.md) |
| Bundle reusable behavior or intercept lifecycle events | [Capabilities and Hooks](./references/CAPABILITIES-AND-HOOKS.md) |
| Decide what should load eagerly vs on demand, apply progressive disclosure, defer capability loading, or explain `load_capability` | [Capabilities on Demand](./references/ON-DEMAND-CAPABILITIES.md) |
| Add function tools, toolsets, MCP servers, or explicit search tools | [Tools Core](./references/TOOLS-CORE.md) |
| Use provider-native web search, web fetch, or code execution | [Native Tools](./references/NATIVE-TOOLS.md) |
| Use advanced tool features such as approval, retries, `ToolReturn`, validators, timeouts, or tool search | [Tools Advanced](./references/TOOLS-ADVANCED.md) |
| Work with multimodal input, message history, or context trimming | [Input and History](./references/INPUT-AND-HISTORY.md) |
| Test or debug agent behavior | [Testing and Debugging](./references/TESTING-AND-DEBUGGING.md) |
| Coordinate multiple agents or build graph workflows | [Orchestration and Integrations](./references/ORCHESTRATION-AND-INTEGRATIONS.md#coordinate-multiple-agents) |
| Call the model directly, expose A2A, use durable execution, embeddings, evals, or third-party integrations | [Orchestration and Integrations](./references/ORCHESTRATION-AND-INTEGRATIONS.md) |
| Compare abstractions, output modes, decorators, or model-string patterns | [Architecture and Decision Guide](./references/ARCHITECTURE.md) |
| Follow an older link into `COMMON-TASKS.md` | [Task Reference Map](./references/COMMON-TASKS.md) |
## Architecture and Decisions
Load [Architecture and Decision Guide](./references/ARCHITECTURE.md) only when the user is choosing between abstractions or wants comparison tables and decision trees:
| Topic | What it covers |
|---|---|
| Decision Trees | Tool registration, output modes, multi-agent patterns, capabilities, testing approaches, extensibility |
| Comparison Tables | Output modes, model provider prefixes, tool decorators, built-in capabilities, agent methods |
| Architecture Overview | Execution flow, generic types, construction patterns, lifecycle hooks, model string format |
**Quick reference — model string format:** `"provider:model-name"` (e.g., `"openai:gpt-5.2"`, `"anthropic:claude-sonnet-4-6"`, `"google:gemini-3-pro-preview"`Related in AI Agents
skill-development
IncludedComprehensive meta-skill for creating, managing, validating, auditing, and distributing Claude Code skills and slash commands (unified in v2.1.3+). Provides skill templates, creation workflows, validation patterns, audit checklists, naming conventions, YAML frontmatter guidance, progressive disclosure examples, and best practices lookup. Use when creating new skills, validating existing skills, auditing skill quality, understanding skill architecture, needing skill templates, learning about YAML frontmatter requirements, progressive disclosure patterns, tool restrictions (allowed-tools), skill composition, skill naming conventions, troubleshooting skill activation issues, creating custom slash commands, configuring command frontmatter, using command arguments ($ARGUMENTS, $1, $2), bash execution in commands, file references in commands, command namespacing, plugin commands, MCP slash commands, Skill tool configuration, or deciding between skills vs slash commands. Delegates to docs-management skill for official documentation.
reprompter
IncludedTransform messy prompts into well-structured, effective prompts — single or multi-agent. Use when: "reprompt", "reprompt this", "clean up this prompt", "structure my prompt", rough text needing XML tags and best practices, "reprompter teams", "repromptception", "run with quality", "smart run", "smart agents", multi-agent tasks, audits, parallel work, anything going to agent teams. Don't use when: simple Q&A, pure chat, immediate execution-only tasks. See "Don't Use When" section for details. Outputs: Structured XML/Markdown prompt, quality score (before/after), optional team brief + per-agent sub-prompts, agent team output files. Success criteria: Single mode quality score ≥ 7/10; Repromptception per-agent prompt quality score 8+/10; all required sections present, actionable and specific.
adaptive-compaction
IncludedAdaptive add-on policy and recovery layer that decides WHEN to compact, prune, snapshot, or fork -- replacing fixed-percent auto-compaction across Claude Code, Codex, and MCP-capable hosts. Trigger on auto-compact timing or damage: "when should I compact", "is it safe to compact now or start a fresh session", "auto-compact fires too early/mid-task", "switching to an unrelated task but the window still has space", "context rot", "answers get worse the longer the session runs", "the agent forgot the plan or my decisions after it summarized", "add a layer on top that manages context without changing the agent", raising autoCompactWindow to give the policy room, or installing/tuning a cross-tool compaction policy or PreCompact hook -- even when "compaction" is never said but the problem is context-window pressure or post-summarization memory loss. Do NOT use to summarize a conversation, build RAG, write a summarization prompt (decides WHEN not HOW), or answer max-context-length trivia.
agent-skill-creator
IncludedCreate cross-platform agent skills from workflow descriptions. Activates when users ask to create an agent, automate a repetitive workflow, create a custom skill, or need advanced agent creation. Triggers on phrases like create agent for, automate workflow, create skill for, every day I have to, daily I need to, turn process into agent, need to automate, create a cross-platform skill, validate this skill, export this skill, migrate this skill. Supports single skills, multi-agent suites, transcript processing, template-based creation, interactive configuration, cross-platform export, and spec validation.
llm-wiki
IncludedUse when building or maintaining a persistent personal knowledge base (second brain) in Obsidian where an LLM incrementally ingests sources, updates entity/concept pages, maintains cross-references, and keeps a synthesis current. Triggers include "second brain", "Obsidian wiki", "personal knowledge management", "ingest this paper/article/book", "build a research wiki", "compound knowledge", "Memex", or whenever the user wants knowledge to accumulate across sessions instead of being re-derived by RAG on every query.
skill-master
IncludedAgent Skills authoring, evaluation, and optimization. Create, edit, validate, benchmark, and improve skills following the agentskills.io specification. Use when designing SKILL.md files, structuring skill folders (references, scripts, assets), ingesting external documentation into skills, running trigger evals, benchmarking skill quality, optimizing descriptions, or performing blind A/B comparisons. Keywords: agentskills.io, SKILL.md, skill authoring, eval, benchmark, trigger optimization.