Claude
Skills
Sign in
Back

autonomous-agents

Included with Lifetime
$97 forever

Autonomous agents are AI systems that can independently decompose goals, plan actions, execute tools, and self-correct without constant human guidance. The challenge isn't making them capable - it's making them reliable. Every extra decision multiplies failure probability.

General

What this skill does


# Autonomous Agents

Autonomous agents are AI systems that can independently decompose goals,
plan actions, execute tools, and self-correct without constant human guidance.
The challenge isn't making them capable - it's making them reliable. Every
extra decision multiplies failure probability.

This skill covers agent loops (ReAct, Plan-Execute), goal decomposition,
reflection patterns, and production reliability. Key insight: compounding
error rates kill autonomous agents. A 95% success rate per step drops to
60% by step 10. Build for reliability first, autonomy second.

2025 lesson: The winners are constrained, domain-specific agents with clear
boundaries, not "autonomous everything." Treat AI outputs as proposals,
not truth.

## Principles

- Reliability over autonomy - every step compounds error probability
- Constrain scope - domain-specific beats general-purpose
- Treat outputs as proposals, not truth
- Build guardrails before expanding capabilities
- Human-in-the-loop for critical decisions is non-negotiable
- Log everything - every action must be auditable
- Fail safely with rollback, not silently with corruption

## Capabilities

- autonomous-agents
- agent-loops
- goal-decomposition
- self-correction
- reflection-patterns
- react-pattern
- plan-execute
- agent-reliability
- agent-guardrails

## Scope

- multi-agent-systems → multi-agent-orchestration
- tool-building → agent-tool-builder
- memory-systems → agent-memory-systems
- workflow-orchestration → workflow-automation

## Tooling

### Frameworks

- LangGraph - When: Production agents with state management Note: 1.0 released Oct 2025, checkpointing, human-in-loop
- AutoGPT - When: Research/experimentation, open-ended exploration Note: Needs external guardrails for production
- CrewAI - When: Role-based agent teams Note: Good for specialized agent collaboration
- Claude Agent SDK - When: Anthropic ecosystem agents Note: Computer use, tool execution

### Patterns

- ReAct - When: Reasoning + Acting in alternating steps Note: Foundation for most modern agents
- Plan-Execute - When: Separate planning from execution Note: Better for complex multi-step tasks
- Reflection - When: Self-evaluation and correction Note: Evaluator-optimizer loop

## Patterns

### ReAct Agent Loop

Alternating reasoning and action steps

**When to use**: Interactive problem-solving, tool use, exploration

# REACT PATTERN:

"""
The ReAct loop:
1. Thought: Reason about what to do next
2. Action: Choose and execute a tool
3. Observation: Receive result
4. Repeat until goal achieved

Key: Explicit reasoning traces make debugging possible
"""

## Basic ReAct Implementation
"""
from langchain.agents import create_react_agent
from langchain_openai import ChatOpenAI

# Define the ReAct prompt template
react_prompt = '''
Answer the question using the following format:

Question: the input question
Thought: reason about what to do
Action: tool_name
Action Input: input to the tool
Observation: result of the action
... (repeat Thought/Action/Observation as needed)
Thought: I now know the final answer
Final Answer: the answer
'''

# Create the agent
agent = create_react_agent(
    llm=ChatOpenAI(model="gpt-4o"),
    tools=tools,
    prompt=react_prompt,
)

# Execute with step limit
result = agent.invoke(
    {"input": query},
    config={"max_iterations": 10}  # Prevent runaway loops
)
"""

## LangGraph ReAct (Production)
"""
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.postgres import PostgresSaver

# Production checkpointer
checkpointer = PostgresSaver.from_conn_string(
    os.environ["POSTGRES_URL"]
)

agent = create_react_agent(
    model=llm,
    tools=tools,
    checkpointer=checkpointer,  # Durable state
)

# Invoke with thread for state persistence
config = {"configurable": {"thread_id": "user-123"}}
result = agent.invoke({"messages": [query]}, config)
"""

### Plan-Execute Pattern

Separate planning phase from execution

**When to use**: Complex multi-step tasks, when full plan visibility matters

# PLAN-EXECUTE PATTERN:

"""
Two-phase approach:
1. Planning: Decompose goal into subtasks
2. Execution: Execute subtasks, potentially re-plan

Advantages:
- Full visibility into plan before execution
- Can validate/modify plan with human
- Cleaner separation of concerns

Disadvantages:
- Less adaptive to mid-task discoveries
- Plan may become stale
"""

## LangGraph Plan-Execute
"""
from langgraph.prebuilt import create_plan_and_execute_agent

# Planner creates the task list
planner_prompt = '''
For the given objective, create a step-by-step plan.
Each step should be atomic and actionable.
Format: numbered list of steps.
'''

# Executor handles individual steps
executor_prompt = '''
You are executing step {step_number} of the plan.
Previous results: {previous_results}
Current step: {current_step}
Execute this step using available tools.
'''

agent = create_plan_and_execute_agent(
    planner=planner_llm,
    executor=executor_llm,
    tools=tools,
    replan_on_error=True,  # Re-plan if step fails
)

# Human approval of plan
config = {
    "configurable": {
        "thread_id": "task-456",
    },
    "interrupt_before": ["execute"],  # Pause before execution
}

# First call creates plan
plan = agent.invoke({"objective": goal}, config)

# Review plan, then continue
if human_approves(plan):
    result = agent.invoke(None, config)  # Continue from checkpoint
"""

## Decomposition Strategies
"""
# Decomposition-First: Plan everything, then execute
# Best for: Stable tasks, need full plan approval

# Interleaved: Plan one step, execute, repeat
# Best for: Dynamic tasks, learning as you go

def interleaved_execute(goal, max_steps=10):
    state = {"goal": goal, "completed": [], "remaining": [goal]}

    for step in range(max_steps):
        # Plan next action based on current state
        next_action = planner.plan_next(state)

        if next_action == "DONE":
            break

        # Execute and update state
        result = executor.execute(next_action)
        state["completed"].append((next_action, result))

        # Re-evaluate remaining work
        state["remaining"] = planner.reassess(state)

    return state
"""

### Reflection Pattern

Self-evaluation and iterative improvement

**When to use**: Quality matters, complex outputs, creative tasks

# REFLECTION PATTERN:

"""
Self-correction loop:
1. Generate initial output
2. Evaluate against criteria
3. Critique and identify issues
4. Refine based on critique
5. Repeat until satisfactory

Also called: Evaluator-Optimizer, Self-Critique
"""

## Basic Reflection
"""
def reflect_and_improve(task, max_iterations=3):
    # Initial generation
    output = generator.generate(task)

    for i in range(max_iterations):
        # Evaluate output
        critique = evaluator.critique(
            task=task,
            output=output,
            criteria=[
                "Correctness",
                "Completeness",
                "Clarity",
            ]
        )

        if critique["passes_all"]:
            return output

        # Refine based on critique
        output = generator.refine(
            task=task,
            previous_output=output,
            critique=critique["feedback"],
        )

    return output  # Best effort after max iterations
"""

## LangGraph Reflection
"""
from langgraph.graph import StateGraph

def build_reflection_graph():
    graph = StateGraph(ReflectionState)

    # Nodes
    graph.add_node("generate", generate_node)
    graph.add_node("reflect", reflect_node)
    graph.add_node("output", output_node)

    # Edges
    graph.add_edge("generate", "reflect")
    graph.add_conditional_edges(
        "reflect",
        should_continue,
        {
            "continue": "generate",  # Loop back
            "end": "output",
        }
    )

    return graph.compile()

def should_continue(state):
    if state["iteration"] >= 3:
        return "end"
    if state["score"] >= 0.9:
        return "end"
    return "c

Related in General