ai-agents-architect
Expert in designing and building autonomous AI agents. Masters tool use, memory systems, planning strategies, and multi-agent orchestration.
What this skill does
# AI Agents Architect

Expert in designing and building autonomous AI agents. Masters tool use, memory systems, planning strategies, and multi-agent orchestration.

**Role**: AI Agent Systems Architect

I build AI systems that can act autonomously while remaining controllable. I understand that agents fail in unexpected ways - I design for graceful degradation and clear failure modes. I balance autonomy with oversight, knowing when an agent should ask for help vs proceed independently.

### Expertise

- Agent loop design (ReAct, Plan-and-Execute, etc.)
- Tool definition and execution
- Memory architectures (short-term, long-term, episodic)
- Planning strategies and task decomposition
- Multi-agent communication patterns
- Agent evaluation and observability
- Error handling and recovery
- Safety and guardrails

### Principles

- Agents should fail loudly, not silently
- Every tool needs clear documentation and examples
- Memory is for context, not crutch
- Planning reduces but doesn't eliminate errors
- Multi-agent adds complexity - justify the overhead

## Capabilities

- Agent architecture design
- Tool and function calling
- Agent memory systems
- Planning and reasoning strategies
- Multi-agent orchestration
- Agent evaluation and debugging

## Prerequisites

- Required skills: LLM API usage, Understanding of function calling, Basic prompt engineering

## Patterns

### ReAct Loop

Reason-Act-Observe cycle for step-by-step execution

**When to use**: Simple tool use with clear action-observation flow

- Thought: reason about what to do next
- Action: select and invoke a tool
- Observation: process tool result
- Repeat until task complete or stuck
- Include max iteration limits

### Plan-and-Execute

Plan first, then execute steps

**When to use**: Complex tasks requiring multi-step planning

- Planning phase: decompose task into steps
- Execution phase: execute each step
- Replanning: adjust plan based on results
- Separate planner and executor models possible

### Tool Registry
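As a rough illustration of the registry pattern under this heading, the sketch below registers tools with a schema and an example, loads each implementation only on first use, and counts calls. The names `ToolSpec`, `ToolRegistry`, `load_word_count`, and the `word_count` tool are hypothetical and chosen for illustration; the source does not prescribe an API, and tool selection is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ToolSpec:
    """Hypothetical tool record: what the model sees, plus a lazy loader."""
    name: str
    description: str
    parameters: Dict[str, str]                 # parameter name -> type and meaning
    example: str                               # example call shown to the model
    loader: Callable[[], Callable[..., Any]]   # builds the tool on first use
    _impl: Optional[Callable[..., Any]] = field(default=None, repr=False)
    calls: int = 0                             # usage counter


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        if spec.name in self._tools:
            raise ValueError(f"tool already registered: {spec.name}")
        self._tools[spec.name] = spec

    def describe(self) -> List[Dict[str, Any]]:
        """Return the schemas to hand to the model; loads nothing."""
        return [
            {"name": t.name, "description": t.description,
             "parameters": t.parameters, "example": t.example}
            for t in self._tools.values()
        ]

    def call(self, name: str, **kwargs: Any) -> Any:
        spec = self._tools.get(name)
        if spec is None:
            raise KeyError(f"unknown tool: {name}")   # fail loudly
        if spec._impl is None:                        # lazy load on first call
            spec._impl = spec.loader()
        spec.calls += 1
        return spec._impl(**kwargs)

    def usage(self) -> Dict[str, int]:
        return {t.name: t.calls for t in self._tools.values()}


# Hypothetical tool: the loader stands in for an expensive import or client setup.
def load_word_count() -> Callable[..., Any]:
    return lambda text: len(text.split())


registry = ToolRegistry()
registry.register(ToolSpec(
    name="word_count",
    description="Count whitespace-separated words in a string.",
    parameters={"text": "str, the text to measure"},
    example='word_count(text="hello world") -> 2',
    loader=load_word_count,
))
print(registry.call("word_count", text="agents fail loudly"))  # 3
print(registry.usage())  # {'word_count': 1}
```

Keeping `describe()` separate from `call()` means the model can be shown every schema without paying the load cost of tools it never uses.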
Dynamic tool discovery and management

**When to use**: Many tools or tools that change at runtime

- Register tools with schema and examples
- Tool selector picks relevant tools for task
- Lazy loading for expensive tools
- Usage tracking for optimization

### Hierarchical Memory

Multi-level memory for different purposes

**When to use**: Long-running agents needing context

- Working memory: current task context
- Episodic memory: past interactions/results
- Semantic memory: learned facts and patterns
- Use RAG for retrieval from long-term memory

### Supervisor Pattern

Supervisor agent orchestrates specialist agents

**When to use**: Complex tasks requiring multiple skills

- Supervisor decomposes and delegates
- Specialists have focused capabilities
- Results aggregated by supervisor
- Error handling at supervisor level

### Checkpoint Recovery

Save state for resumption after failures

**When to use**: Long-running tasks that may fail

- Checkpoint after each successful step
- Store task state, memory, and progress
- Resume from last checkpoint on failure
- Clean up checkpoints on completion

## Sharp Edges

### Agent loops without iteration limits

Severity: CRITICAL

Situation: Agent runs until 'done' without max iterations

Symptoms:

- Agent runs forever
- Unexplained high API costs
- Application hangs

Why this breaks:

Agents can get stuck in loops, repeating the same actions, or spiral into endless tool calls. Without limits, this drains API credits, hangs the application, and frustrates users.

Recommended fix:

Always set limits:

- max_iterations on agent loops
- max_tokens per turn
- timeout on agent runs
- cost caps for API usage
- Circuit breakers for tool failures

### Vague or incomplete tool descriptions

Severity: HIGH

Situation: Tool descriptions don't explain when/how to use

Symptoms:

- Agent picks wrong tools
- Parameter errors
- Agent says it can't do things it can

Why this breaks:

Agents choose tools based on descriptions.
Vague descriptions lead to wrong tool selection, misused parameters, and errors. The agent literally can't know what it doesn't see in the description.

Recommended fix:

Write complete tool specs:

- Clear one-sentence purpose
- When to use (and when not to)
- Parameter descriptions with types
- Example inputs and outputs
- Error cases to expect

### Tool errors not surfaced to agent

Severity: HIGH

Situation: Catching tool exceptions silently

Symptoms:

- Agent continues with wrong data
- Final answers are wrong
- Hard to debug failures

Why this breaks:

When tool errors are swallowed, the agent continues with bad or missing data, compounding errors. The agent can't recover from what it can't see. Silent failures become loud failures later.

Recommended fix:

Explicit error handling:

- Return error messages to agent
- Include error type and recovery hints
- Let agent retry or choose alternative
- Log errors for debugging

### Storing everything in agent memory

Severity: MEDIUM

Situation: Appending all observations to memory without filtering

Symptoms:

- Context window exceeded
- Agent references outdated info
- High token costs

Why this breaks:

Memory fills with irrelevant details, old information, and noise. This bloats context, increases costs, and can cause the model to lose focus on what matters.

Recommended fix:

Selective memory:

- Summarize rather than store verbatim
- Filter by relevance before storing
- Use RAG for long-term memory
- Clear working memory between tasks

### Agent has too many tools

Severity: MEDIUM

Situation: Giving agent 20+ tools for flexibility

Symptoms:

- Wrong tool selection
- Agent overwhelmed by options
- Slow responses

Why this breaks:

More tools means more confusion. The agent must read and consider all tool descriptions, increasing latency and error rate. Long tool lists get cut off or poorly understood.
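
One way to avoid handing the agent every description at once is a small selection step in front of it. The sketch below ranks tools by word overlap between the task and each description and keeps only the top few. The function name `select_tools`, the stopword list, and the four sample tools are assumptions made for illustration; keyword overlap is a stand-in, and a real system might rank with embeddings or a model call instead.

```python
import re
from typing import Dict, List

# Small illustrative stopword list so filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "in", "to", "for", "with", "from", "of"}


def _keywords(text: str) -> set:
    """Lowercased alphanumeric tokens, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def select_tools(task: str, tool_descriptions: Dict[str, str], limit: int = 5) -> List[str]:
    """Return up to `limit` tool names whose descriptions share keywords with the task."""
    task_words = _keywords(task)
    scored = []
    for name, description in tool_descriptions.items():
        overlap = len(task_words & _keywords(description))
        if overlap > 0:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical tool set used only for this example.
TOOLS = {
    "search_web": "Search the web for pages matching a query",
    "read_file": "Read a file from local disk",
    "send_email": "Send an email message to a recipient",
    "query_db": "Run a SQL query against the orders database",
}

print(select_tools("find orders in the database with a SQL query", TOOLS, limit=2))
# ['query_db', 'search_web']
```

Only the selected names, with their full specs, would then be passed to the agent for that task.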

Recommended fix:

Curate tools per task:

- 5-10 tools maximum per agent
- Use tool selection layer for large tool sets
- Specialized agents with focused tools
- Dynamic tool loading based on task

### Using multiple agents when one would work

Severity: MEDIUM

Situation: Starting with multi-agent architecture for simple tasks

Symptoms:

- Agents duplicating work
- Communication overhead
- Hard to debug failures

Why this breaks:

Multi-agent adds coordination overhead, communication failures, debugging complexity, and cost. Each agent handoff is a potential failure point. Start simple, add agents only when proven necessary.

Recommended fix:

Justify multi-agent:

- Can one agent with good tools solve this?
- Is the coordination overhead worth it?
- Are the agents truly independent?
- Start with single agent, measure limits

### Agent internals not logged or traceable

Severity: MEDIUM

Situation: Running agents without logging thoughts/actions

Symptoms:

- Can't explain agent failures
- No visibility into agent reasoning
- Debugging takes hours

Why this breaks:

When agents fail, you need to see what they were thinking, which tools they tried, and where they went wrong. Without observability, debugging is guesswork.

Recommended fix:

Implement tracing:

- Log each thought/action/observation
- Track tool calls with inputs/outputs
- Trace token usage and latency
- Use structured logging for analysis

### Fragile parsing of agent outputs

Severity: MEDIUM

Situation: Regex or exact string matching on LLM output

Symptoms:

- Parse errors in agent loop
- Works sometimes, fails sometimes
- Small prompt changes break parsing

Why this breaks:

LLMs don't produce perfectly consistent output. Minor format variations break brittle parsers. This causes agent crashes or incorrect behavior from parsing errors.
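
Where a provider's JSON mode or function calling is not available, a tolerant parser with a bounded retry can stand in for exact string matching. The sketch below accepts a JSON object even when the model wraps it in prose or a code fence, and re-asks with format instructions when parsing fails. The names `extract_json_object`, `parse_action`, `FORMAT_REMINDER`, the `action`/`args` fields, and the `call_model` callable are all hypothetical; `call_model` stands for whatever function sends a prompt to the LLM and returns its text.

```python
import json
from typing import Any, Callable, Dict

# Assumed output format for this sketch; not taken from the source.
FORMAT_REMINDER = 'Reply with only a JSON object like {"action": "<tool name>", "args": {}}.'


def extract_json_object(text: str) -> Dict[str, Any]:
    """Parse the outermost {...} span in `text`, ignoring prose or fences around it."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    parsed = json.loads(text[start:end + 1])  # JSONDecodeError is a ValueError
    if not isinstance(parsed, dict) or "action" not in parsed:
        raise ValueError("missing 'action' field")
    return parsed


def parse_action(call_model: Callable[[str], str], prompt: str,
                 max_retries: int = 2) -> Dict[str, Any]:
    """Ask the model for an action; on parse failure, re-ask with format instructions."""
    last_error = None
    current_prompt = prompt
    for _ in range(max_retries + 1):
        reply = call_model(current_prompt)
        try:
            return extract_json_object(reply)
        except ValueError as exc:
            last_error = exc
            current_prompt = (
                f"{prompt}\n\nYour last reply could not be parsed ({exc}). {FORMAT_REMINDER}"
            )
    # Bounded retries, then fail loudly instead of guessing an action.
    raise RuntimeError(
        f"model output unparseable after {max_retries + 1} attempts: {last_error}"
    )


# Scripted stand-in for a model: first reply is prose, second wraps JSON in a fence.
scripted = iter([
    "I think I should search.",
    'Sure:\n```json\n{"action": "search", "args": {"q": "agents"}}\n```',
])
print(parse_action(lambda p: next(scripted), "Pick a tool."))
# {'action': 'search', 'args': {'q': 'agents'}}
```

The retry count is capped so that a model stuck on a bad format surfaces as an error rather than an endless loop.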

Recommended fix:

Robust output handling:

- Use structured output (JSON mode, function calling)
- Fuzzy matching for actions
- Retry with format instructions on parse failure
- Handle multiple output formats

## Related Skills

Works well with: `rag-engineer`, `prompt-engineer`, `back