execute-tasks
Execute pending Claude Code Tasks in dependency order with wave-based concurrent execution and adaptive verification. Supports task group filtering and configurable parallelism. Use when user says "execute tasks", "run tasks", "start execution", "work on tasks", or wants to execute generated tasks autonomously.
What this skill does
# Execute Tasks Skill
This skill orchestrates autonomous task execution for Claude Code Tasks. It builds a dependency-aware execution plan, launches a dedicated agent for each task through a 4-phase workflow (Understand, Implement, Verify, Complete), handles retries with failure context, and shares learnings across tasks through a shared execution context file.
## Core Principles
### 1. Understand Before Implementing
Never write code without first understanding:
- What the task requires (acceptance criteria or inferred requirements)
- What code already exists (read before modify)
- What conventions the project follows (CLAUDE.md, existing patterns)
- What earlier tasks discovered (shared execution context)
### 2. Follow Existing Patterns
Match the codebase's established patterns:
- Coding style, naming conventions, file organization
- Error handling approach, logging patterns
- Test framework, test structure, assertion style
- Import ordering, module organization
### 3. Verify Against Criteria
Do not assume implementation is correct. Verify by:
- Walking through each acceptance criterion for spec-generated tasks
- Running tests and linters for all tasks
- Confirming the core change works as intended
- Checking for regressions in existing functionality
### 4. Report Honestly
Produce accurate verification results:
- PASS only when all Functional criteria pass and tests pass
- PARTIAL when non-critical criteria fail but core works
- FAIL when Functional criteria or tests fail
- Never mark a task complete if verification fails
## Orchestration Workflow
This skill orchestrates task execution through a 10-step loop. See `references/orchestration.md` for the full detailed procedure.
### Step 1: Load Task List
Retrieve all tasks via `TaskList`. If a `--task-group` argument was provided, filter tasks to only those with matching `metadata.task_group`. If a specific `task-id` argument was provided, validate it exists.
### Step 2: Validate State
Handle edge cases: empty list, all completed, specific task blocked, no unblocked tasks, circular dependencies.
### Step 3: Build Execution Plan
Resolve `max_parallel` setting using precedence: `--max-parallel` CLI arg > `.claude/agent-alchemy.local.md` setting > default 5. Build a dependency graph from pending tasks. Assign tasks to waves using topological levels: Wave 1 = no dependencies, Wave 2 = depends on Wave 1 tasks, etc. Sort within waves by priority (critical > high > medium > low > unprioritized), break ties by "unblocks most others." Cap each wave at `max_parallel` tasks.
### Step 4: Check Settings
Read `.claude/agent-alchemy.local.md` if it exists for execution preferences, including `max_parallel` setting. CLI `--max-parallel` argument takes precedence over the settings file value.
### Step 5: Initialize Execution Directory
Generate a `task_execution_id` using three-tier resolution: (1) if `--task-group` provided → `{task_group}-{YYYYMMDD}-{HHMMSS}`, (2) else if all open tasks share the same `metadata.task_group` → `{task_group}-{YYYYMMDD}-{HHMMSS}`, (3) else → `exec-session-{YYYYMMDD}-{HHMMSS}`. Clean any stale `__live_session__/` files by archiving them to `.claude/sessions/interrupted-{YYYYMMDD}-{HHMMSS}/`, resetting any `in_progress` tasks from the interrupted session back to `pending`. Check for and enforce the concurrency guard via `.lock` file. Create `.claude/sessions/__live_session__/` directory containing:
- `execution_plan.md` — saved execution plan from Step 5
- `execution_context.md` — initialized with standard template
- `task_log.md` — initialized with table headers (Task ID, Subject, Status, Attempts, Duration, Token Usage)
- `tasks/` — subdirectory for archiving completed task files
- `progress.md` — initialized with status template for real-time progress tracking
- `execution_pointer.md` at `~/.claude/tasks/{CLAUDE_CODE_TASK_LIST_ID}/` — created immediately with absolute path to `.claude/sessions/__live_session__/`
### Step 6: Present Execution Plan and Confirm
Display the plan showing tasks to execute, blocked tasks, and completed count. Also display the details of step 5 which includes the session directory path (`.claude/sessions/__live_session__/`) and files created, including the execution pointer file location.
Then ask the user to confirm before proceeding with execution. If the user cancels, stop without modifying any tasks.
### Step 7: Initialize Execution Context
Read `.claude/sessions/__live_session__/execution_context.md` (created in Step 5). If a prior execution context exists, look in `.claude/sessions/` for the most recent timestamped subfolder and merge relevant learnings into the new one.
### Step 8: Execute Loop
Execute tasks in waves. For each wave: snapshot `execution_context.md`, mark wave tasks `in_progress`, update `progress.md` with all active tasks, launch up to `max_parallel` background agents simultaneously via **parallel Task tool calls with `run_in_background: true`**, recording the `{task_list_id → background_task_id}` mapping from each Task tool response. Each agent writes to `context-task-{id}.md` and a compact `result-task-{id}.md` (completion signal). The orchestrator polls for result files via the `poll-for-results.sh` script in a single Bash invocation (with `timeout: 2760000`), then batch-reads results to process outcomes — avoiding full agent output in context. After polling, the orchestrator calls `TaskOutput` on each background task_id to reap the process and extract per-task `duration_ms` and `total_tokens` usage metadata for the task log. If `TaskOutput` times out (agent stuck), `TaskStop` is called to force-terminate the agent. Failed tasks with retries remaining are re-launched as background agents (same TaskOutput reaping after retry polling). After all wave agents complete: merge per-task context files into `execution_context.md`, clean up result and context files, archive completed task JSONs, refresh TaskList for newly unblocked tasks, form next wave, repeat.
### Step 9: Session Summary
Display execution results with pass/fail counts, total execution time (sum of per-task `duration_ms`), failed task list, newly unblocked tasks, and total token usage (sum of per-task `total_tokens` captured via `TaskOutput`). Save `session_summary.md` to `.claude/sessions/__live_session__/`. Archive the session by moving all contents from `__live_session__/` to `.claude/sessions/{task_execution_id}/`, leaving `__live_session__/` as an empty directory. `execution_pointer.md` stays pointing to `__live_session__/`.
### Step 10: Update CLAUDE.md
Review execution context for project-wide changes (new patterns, dependencies, commands, structure changes, design decisions). Make targeted edits to CLAUDE.md if meaningful changes occurred. Skip if only task-specific or internal implementation details.
## Task Classification
Determine whether a task is spec-generated or general to select the right verification approach.
### Detection Algorithm
1. **Check description format**: Look for `**Acceptance Criteria:**` with categorized criteria (`_Functional:_`, `_Edge Cases:_`, etc.)
2. **Check metadata**: Look for `metadata.spec_path` field
3. **Check source reference**: Look for `Source: {path} Section {number}` in the description
If any check matches -> **spec-generated task** (use criterion-based verification)
If none match -> **General task** (use inferred verification)
## 4-Phase Workflow
Each task is executed by the `agent-alchemy-sdd:task-executor` agent through these phases:
### Phase 1: Understand
Load context and understand scope before writing code.
- Read the execute-tasks skill and references
- Read `.claude/sessions/__live_session__/execution_context.md` for learnings from prior tasks
- Load task details via `TaskGet`
- Classify the task (spec-generated vs general)
- Parse acceptance criteria or infer requirements from description
- Explore affected files via Glob/Grep
- Read `CLAUDE.md` for project conventions
### Phase 2: ImRelated in AI Agents
skill-development
IncludedComprehensive meta-skill for creating, managing, validating, auditing, and distributing Claude Code skills and slash commands (unified in v2.1.3+). Provides skill templates, creation workflows, validation patterns, audit checklists, naming conventions, YAML frontmatter guidance, progressive disclosure examples, and best practices lookup. Use when creating new skills, validating existing skills, auditing skill quality, understanding skill architecture, needing skill templates, learning about YAML frontmatter requirements, progressive disclosure patterns, tool restrictions (allowed-tools), skill composition, skill naming conventions, troubleshooting skill activation issues, creating custom slash commands, configuring command frontmatter, using command arguments ($ARGUMENTS, $1, $2), bash execution in commands, file references in commands, command namespacing, plugin commands, MCP slash commands, Skill tool configuration, or deciding between skills vs slash commands. Delegates to docs-management skill for official documentation.
reprompter
IncludedTransform messy prompts into well-structured, effective prompts — single or multi-agent. Use when: "reprompt", "reprompt this", "clean up this prompt", "structure my prompt", rough text needing XML tags and best practices, "reprompter teams", "repromptception", "run with quality", "smart run", "smart agents", multi-agent tasks, audits, parallel work, anything going to agent teams. Don't use when: simple Q&A, pure chat, immediate execution-only tasks. See "Don't Use When" section for details. Outputs: Structured XML/Markdown prompt, quality score (before/after), optional team brief + per-agent sub-prompts, agent team output files. Success criteria: Single mode quality score ≥ 7/10; Repromptception per-agent prompt quality score 8+/10; all required sections present, actionable and specific.
adaptive-compaction
IncludedAdaptive add-on policy and recovery layer that decides WHEN to compact, prune, snapshot, or fork -- replacing fixed-percent auto-compaction across Claude Code, Codex, and MCP-capable hosts. Trigger on auto-compact timing or damage: "when should I compact", "is it safe to compact now or start a fresh session", "auto-compact fires too early/mid-task", "switching to an unrelated task but the window still has space", "context rot", "answers get worse the longer the session runs", "the agent forgot the plan or my decisions after it summarized", "add a layer on top that manages context without changing the agent", raising autoCompactWindow to give the policy room, or installing/tuning a cross-tool compaction policy or PreCompact hook -- even when "compaction" is never said but the problem is context-window pressure or post-summarization memory loss. Do NOT use to summarize a conversation, build RAG, write a summarization prompt (decides WHEN not HOW), or answer max-context-length trivia.
agent-skill-creator
IncludedCreate cross-platform agent skills from workflow descriptions. Activates when users ask to create an agent, automate a repetitive workflow, create a custom skill, or need advanced agent creation. Triggers on phrases like create agent for, automate workflow, create skill for, every day I have to, daily I need to, turn process into agent, need to automate, create a cross-platform skill, validate this skill, export this skill, migrate this skill. Supports single skills, multi-agent suites, transcript processing, template-based creation, interactive configuration, cross-platform export, and spec validation.
llm-wiki
IncludedUse when building or maintaining a persistent personal knowledge base (second brain) in Obsidian where an LLM incrementally ingests sources, updates entity/concept pages, maintains cross-references, and keeps a synthesis current. Triggers include "second brain", "Obsidian wiki", "personal knowledge management", "ingest this paper/article/book", "build a research wiki", "compound knowledge", "Memex", or whenever the user wants knowledge to accumulate across sessions instead of being re-derived by RAG on every query.
skill-master
IncludedAgent Skills authoring, evaluation, and optimization. Create, edit, validate, benchmark, and improve skills following the agentskills.io specification. Use when designing SKILL.md files, structuring skill folders (references, scripts, assets), ingesting external documentation into skills, running trigger evals, benchmarking skill quality, optimizing descriptions, or performing blind A/B comparisons. Keywords: agentskills.io, SKILL.md, skill authoring, eval, benchmark, trigger optimization.