pocketflow

PocketFlow framework for building LLM applications with graph-based abstractions, design patterns, and agentic coding workflows

# PocketFlow Skill

A comprehensive guide to building LLM applications using PocketFlow - a 100-line minimalist framework for Agents, Task Decomposition, RAG, and more.

## When to Use This Skill

Activate this skill when working with:
- **Graph-based LLM workflows** - Building complex AI systems with nodes and flows
- **Agentic applications** - Creating autonomous agents with dynamic action selection
- **Task decomposition** - Breaking down complex LLM tasks into manageable steps
- **RAG systems** - Implementing Retrieval Augmented Generation pipelines
- **Batch processing** - Handling large inputs or multiple files with LLMs
- **Multi-agent systems** - Coordinating multiple AI agents
- **Async workflows** - Building I/O-bound LLM applications with concurrency

## Core Concepts

### Architecture Overview

PocketFlow models LLM workflows as **Graph + Shared Store**:

```python
# Shared Store: Central data storage
shared = {
    "data": {},
    "summary": {},
    "config": {...}
}

# Graph: Nodes connected by transitions
node_a >> node_b >> node_c
flow = Flow(start=node_a)
flow.run(shared)
```

### The Node: Building Block

Every Node has 3 steps: `prep()` → `exec()` → `post()`

```python
class SummarizeFile(Node):
    def prep(self, shared):
        # Get data from shared store
        return shared["data"]

    def exec(self, prep_res):
        # Process with LLM (retries built-in)
        prompt = f"Summarize this text in 10 words: {prep_res}"
        summary = call_llm(prompt)
        return summary

    def post(self, shared, prep_res, exec_res):
        # Write results back to shared store
        shared["summary"] = exec_res
        return "default"  # Action for flow control
```

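**Why 3 steps?** Separation of concerns - `prep()` reads from the shared store, `exec()` does the computation (typically the LLM call) without touching `shared`, and `post()` writes results back and returns the action that selects the next node.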
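
To make the separation concrete, here is a minimal sketch of the three-step lifecycle in plain Python. `ToyNode` and the stubbed `call_llm` are stand-ins invented for this example, not PocketFlow's own classes; the only point is the call order and the fact that `exec()` never sees `shared`.

```python
# Toy illustration of prep -> exec -> post. ToyNode is NOT PocketFlow's Node;
# it only mimics the order in which the three steps are called.

def call_llm(prompt):
    # Hypothetical stub standing in for a real LLM call
    return f"[summary of {len(prompt)} chars]"

class ToyNode:
    def prep(self, shared):
        return None

    def exec(self, prep_res):
        return None

    def post(self, shared, prep_res, exec_res):
        return "default"

    def run(self, shared):
        prep_res = self.prep(shared)    # read from the shared store
        exec_res = self.exec(prep_res)  # compute; no access to shared
        return self.post(shared, prep_res, exec_res)  # write back, pick an action

class Summarize(ToyNode):
    def prep(self, shared):
        return shared["data"]

    def exec(self, prep_res):
        return call_llm(f"Summarize: {prep_res}")

    def post(self, shared, prep_res, exec_res):
        shared["summary"] = exec_res
        return "default"

shared = {"data": "PocketFlow models workflows as a graph."}
action = Summarize().run(shared)
```

Because `exec()` depends only on its argument, it can be retried or unit-tested in isolation without setting up a shared store.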

### The Flow: Orchestration

```python
# Simple sequence
load_data >> summarize >> save_result
flow = Flow(start=load_data)
flow.run(shared)

# Branching with actions
review - "approved" >> payment
review - "needs_revision" >> revise
review - "rejected" >> finish
revise >> review  # Loop back

flow = Flow(start=review)
```
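
The routing rule behind these transitions can be sketched without the library: the action string a node returns is looked up in a table of successors, and the flow ends when nothing matches. The `run_flow` helper, the `successors` table, and the hard-coded decision policy below are assumptions made for illustration, not PocketFlow's implementation.

```python
# Toy action router: successors[node][action] -> next node.
# It mirrors the review/revise graph above; it is not PocketFlow code.

successors = {
    "review": {"approved": "payment", "needs_revision": "revise", "rejected": "finish"},
    "revise": {"default": "review"},  # loop back
}

def run_flow(start, decide_action, max_steps=10):
    """Follow actions from `start` until a node has no matching successor."""
    path, current = [], start
    for _ in range(max_steps):  # guard against endless loops
        path.append(current)
        action = decide_action(current, path)
        current = successors.get(current, {}).get(action)
        if current is None:
            break
    return path

def decide_action(node, path):
    # Hypothetical policy: the first review asks for a revision, the second approves
    if node == "review":
        return "needs_revision" if path.count("review") == 1 else "approved"
    return "default"

path = run_flow("review", decide_action)
```

The `max_steps` guard is a design choice of this sketch: any graph with a loop-back edge needs some stopping condition, whether a step limit or an action that leads out of the loop.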

## Quick Reference

### 1. Basic Node Pattern

```python
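from pocketflow import Node, Flow

# call_llm is a user-supplied helper that wraps your LLM provider

class LoadData(Node):
    def post(self, shared, prep_res, exec_res):
        shared["data"] = "Some text content"
        return None  # None is treated as the "default" action

class Summarize(Node):
    def prep(self, shared):
        return shared["data"]

    def exec(self, prep_res):
        return call_llm(f"Summarize: {prep_res}")

    def post(self, shared, prep_res, exec_res):
        shared["summary"] = exec_res
        return "default"

# Instantiate, connect, and run
load_data = LoadData()
summarize = Summarize()
load_data >> summarize

shared = {}
flow = Flow(start=load_data)
flow.run(shared)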
```

### 2. Batch Processing

**BatchNode** - Process large inputs in chunks:

```python
class MapSummaries(BatchNode):
    def prep(self, shared):
        # Chunk big file
        content = shared["data"]
        chunk_size = 10000
        return [content[i:i+chunk_size]
                for i in range(0, len(content), chunk_size)]

    def exec(self, chunk):
        # Process each chunk
        return call_llm(f"Summarize: {chunk}")

    def post(self, shared, prep_res, exec_res_list):
        # Combine all results
        shared["summary"] = "\n".join(exec_res_list)
        return "default"
```
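
The chunk-then-combine shape of `MapSummaries` can be tried in plain Python before wiring it to an LLM. In this sketch `fake_summarize` is a hypothetical stand-in for `call_llm`, and the list comprehension stands in for the batch node calling `exec()` once per item.

```python
def chunk_text(content, chunk_size):
    """Split content into consecutive slices of at most chunk_size characters."""
    return [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]

def fake_summarize(chunk):
    # Hypothetical stand-in for call_llm: keep the first word of the chunk
    words = chunk.split()
    return words[0] if words else ""

content = "alpha beta gamma delta epsilon"
chunks = chunk_text(content, chunk_size=12)          # what prep() would return
exec_res_list = [fake_summarize(c) for c in chunks]  # one exec() call per chunk
summary = "\n".join(exec_res_list)                   # what post() would store
```

Fixed-size slicing can cut a word or sentence in half at a chunk boundary. If that matters for summary quality, splitting on paragraph or sentence boundaries is a common alternative.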

**BatchFlow** - Run flow multiple times with different parameters:

```python
class SummarizeAllFiles(BatchFlow):
    def prep(self, shared):
        filenames = list(shared["data"].keys())
        # Return list of parameter dicts
        return [{"filename": fn} for fn in filenames]

class LoadFile(Node):
    def prep(self, shared):
        # Access filename from params
        filename = self.params["filename"]
        return filename
```

### 3. Agent Pattern

```python
import yaml  # PyYAML

class DecideAction(Node):
    def exec(self, inputs):
        query, context = inputs
        prompt = f"""
Given input: {query}
Previous search results: {context}
Should I: 1) Search web for more info 2) Answer with current knowledge

Output in yaml:
```yaml
action: search/answer
reason: why this action
search_term: search phrase if action is search
```"""
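        resp = call_llm(prompt)
        yaml_str = resp.split("```yaml")[1].split("```")[0].strip()
        action_data = yaml.safe_load(yaml_str)
        return action_data

    def post(self, shared, prep_res, exec_res):
        # Return the chosen action ("search" or "answer") so the flow can branch on it
        return exec_res["action"]

# Build agent graph
decide - "search" >> search_web
decide - "answer" >> provide_answer
search_web >> decide  # Loop back for more searches

agent_flow = Flow(start=decide)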
```

### 4. RAG (Retrieval Augmented Generation)

**Stage 1: Offline Indexing**

```python
class ChunkDocs(BatchNode):
    def prep(self, shared):
        return shared["files"]

    def exec(self, filepath):
        with open(filepath, "r", encoding="utf-8") as f:
            text = f.read()
        # Chunk by 100 chars
        size = 100
        return [text[i:i+size] for i in range(0, len(text), size)]

    def post(self, shared, prep_res, exec_res_list):
        shared["all_chunks"] = [c for chunks in exec_res_list
                                for c in chunks]

chunk_docs >> embed_docs >> build_index
offline_flow = Flow(start=chunk_docs)
```

**Stage 2: Online Query**

```python
class RetrieveDocs(Node):
    def exec(self, inputs):
        q_emb, index, chunks = inputs
        I, D = search_index(index, q_emb, top_k=1)
        return chunks[I[0][0]]

embed_query >> retrieve_docs >> generate_answer
online_flow = Flow(start=embed_query)
```
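
`search_index` is not defined in the snippet above. The following is a brute-force stand-in, assuming the index is simply a list of embedding vectors and that the function returns `(I, D)` with one row per query, which is the shape the `I[0][0]` access implies. The 2-D embeddings are made up; a real pipeline would use an embedding model and a vector index.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search_index(index, q_emb, top_k=1):
    """Brute-force search; returns (I, D) as nested lists, one row per query."""
    scored = sorted(
        ((cosine_similarity(vec, q_emb), i) for i, vec in enumerate(index)),
        reverse=True,
    )[:top_k]
    I = [[i for _, i in scored]]                # indices of the best matches
    D = [[1.0 - score for score, _ in scored]]  # cosine distances
    return I, D

chunks = ["cats purr", "dogs bark", "birds sing"]
index = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # made-up 2-D embeddings
q_emb = [0.1, 0.9]

I, D = search_index(index, q_emb, top_k=1)
best_chunk = chunks[I[0][0]]
```

A linear scan like this is adequate for a few thousand chunks; beyond that, an approximate nearest-neighbour index is the usual choice.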

### 5. Async & Parallel

**AsyncNode** for I/O-bound operations:

```python
class SummarizeThenVerify(AsyncNode):
    async def prep_async(self, shared):
        doc_text = await read_file_async(shared["doc_path"])
        return doc_text

    async def exec_async(self, prep_res):
        summary = await call_llm_async(f"Summarize: {prep_res}")
        return summary

    async def post_async(self, shared, prep_res, exec_res):
        decision = await gather_user_feedback(exec_res)
        if decision == "approve":
            shared["summary"] = exec_res
        return "default"

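# An AsyncNode must run inside an AsyncFlow, and `await` needs a coroutine
import asyncio

async def main():
    node = SummarizeThenVerify()
    flow = AsyncFlow(start=node)
    await flow.run_async(shared)

asyncio.run(main())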
```

**AsyncParallelBatchNode** - Process multiple items concurrently:

```python
class ParallelSummaries(AsyncParallelBatchNode):
    async def prep_async(self, shared):
        return shared["texts"]  # List of texts

    async def exec_async(self, text):
        # Runs in parallel for each text
        return await call_llm_async(f"Summarize: {text}")

    async def post_async(self, shared, prep_res, exec_res_list):
        shared["summary"] = "\n\n".join(exec_res_list)
        return "default"
```
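
What "runs in parallel" means here can be sketched with the standard library alone, assuming the node behaves like `asyncio.gather` over its items. `call_llm_async` below is a hypothetical stub that sleeps instead of calling a model.

```python
import asyncio

async def call_llm_async(prompt):
    # Hypothetical stub: simulate network latency instead of calling a model
    await asyncio.sleep(0.05)
    return prompt.upper()

async def summarize_all(texts):
    # gather() runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(call_llm_async(t) for t in texts))

results = asyncio.run(summarize_all(["one", "two", "three"]))
```

This is concurrency on a single thread, so it helps with I/O-bound calls such as LLM requests but not with CPU-bound work. Provider rate limits still apply, so large batches may need throttling.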

### 6. Workflow (Task Decomposition)

```python
class GenerateOutline(Node):
    def prep(self, shared):
        return shared["topic"]

    def exec(self, topic):
        return call_llm(f"Create outline for: {topic}")

    def post(self, shared, prep_res, exec_res):
        shared["outline"] = exec_res

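class WriteSection(Node):
    def prep(self, shared):
        return shared["outline"]

    def exec(self, outline):
        return call_llm(f"Write content: {outline}")

    def post(self, shared, prep_res, exec_res):
        shared["draft"] = exec_res

class ReviewAndRefine(Node):
    def prep(self, shared):
        return shared["draft"]

    def exec(self, draft):
        return call_llm(f"Review and improve: {draft}")

    def post(self, shared, prep_res, exec_res):
        # Key name chosen for illustration
        shared["final_article"] = exec_res

# Instantiate and chain the workflow
outline = GenerateOutline()
write = WriteSection()
review = ReviewAndRefine()
outline >> write >> review
workflow = Flow(start=outline)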
```

### 7. Structured Output

```python
class SummarizeNode(Node):
    def exec(self, prep_res):
        prompt = f"""
Summarize the following text as YAML, with exactly 3 bullet points

{prep_res}

Output:
```yaml
summary:
  - bullet 1
  - bullet 2
  - bullet 3
```"""
        response = call_llm(prompt)
        yaml_str = response.split("```yaml")[1].split("```")[0].strip()

        import yaml
        structured_result = yaml.safe_load(yaml_str)

        # Validate
        assert "summary" in structured_result
        assert isinstance(structured_result["summary"], list)

        return structured_result
```

**Why YAML?** Modern LLMs tend to handle YAML more reliably than JSON: strings rarely need quotes, so there are fewer escaping issues.
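
The `split()` chain used to pull out the YAML block raises a bare `IndexError` when the model omits the fence. A slightly more defensive extraction step might look like the sketch below; `extract_fenced_block` is a hypothetical helper name, and its result would then be passed to `yaml.safe_load` as before.

```python
def extract_fenced_block(response, lang="yaml"):
    """Return the text inside the first fenced `lang` block, or raise ValueError."""
    fence = "`" * 3  # built this way to avoid a literal fence inside this example
    opener = fence + lang
    start = response.find(opener)
    if start == -1:
        raise ValueError(f"no {lang} block found in response")
    start += len(opener)
    end = response.find(fence, start)
    if end == -1:
        raise ValueError(f"unterminated {lang} block in response")
    return response[start:end].strip()

# Simulated LLM response containing one fenced YAML block
response = "Sure!\n" + "`" * 3 + "yaml\nsummary:\n  - a\n  - b\n" + "`" * 3 + "\nDone."
yaml_str = extract_fenced_block(response)
```

Raising a descriptive error from `exec()` lets the node's retry mechanism ask the model again instead of failing on an opaque index error.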