
solidity-function-audit-eval


Non-interactive eval variant of the per-function Solidity audit. Removes all interactive prompts for automated evaluation via `claude -p` mode. Reads design decisions from GROUND_TRUTH.md, skips Slither, always runs verification, stops after Verification.


What this skill does


# Function Audit — Eval Mode

## Purpose

Non-interactive variant of solidity-function-audit for automated evaluation. Runs the same 7-stage pipeline but removes all 12 interactive pause points: pre-seeds design decisions from GROUND_TRUTH.md, auto-confirms domains, skips Slither, always runs verification, and stops after Verification (no Stage 4/5).

Designed for invocation via `claude -p` with `--dangerously-skip-permissions`.

---

## Pre-Flight Discovery

### 0. Clear Previous Output
If `docs/audit/function-audit/` exists, delete its contents and proceed (always overwrite — no archive/cancel prompt).

### 1. Identify Project Path
- Use `$ARGUMENTS` as the project path if provided, otherwise use the current working directory.
- Store as `PROJECT_PATH` for all subsequent steps.

### 2. Discover Contracts (lean — Grep only)
Use Glob for `src/**/*.sol` (excluding `src/artifacts/`) to find all source files. Then use Grep for `contract \w+` and `library \w+` to identify contract and library declarations. Do NOT Read entire source files — only Read a specific file when domain grouping is ambiguous. The goal is to know file paths + contract names.
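
The discovery step above can be approximated outside the agent with a short Python sketch. It is only an illustration, not the skill's actual mechanism (the skill uses the Glob and Grep tools); the function names `find_declarations` and `discover_contracts` are hypothetical, and the regex is deliberately a little stricter than a bare `contract \w+` so that single-line comments are ignored.

```python
import re
from pathlib import Path

# Anchored at line start so commented-out declarations such as
# `// contract Old` are skipped. Declarations inside /* ... */ block
# comments would still match; this sketch does not handle that case.
DECL_RE = re.compile(
    r"^[ \t]*(?:abstract[ \t]+)?(contract|library)[ \t]+(\w+)", re.MULTILINE
)


def find_declarations(source: str) -> list[tuple[str, str]]:
    """Return (kind, name) pairs for each contract or library in source."""
    return [(m.group(1), m.group(2)) for m in DECL_RE.finditer(source)]


def discover_contracts(project_path: str) -> dict[str, list[tuple[str, str]]]:
    """Map each .sol file under src/ (excluding src/artifacts/) to its declarations."""
    src = Path(project_path) / "src"
    found = {}
    for path in sorted(src.rglob("*.sol")):
        if path.relative_to(src).parts[0] == "artifacts":
            continue  # generated output, not hand-written source
        decls = find_declarations(path.read_text(encoding="utf-8"))
        if decls:
            found[str(path)] = decls
    return found
```

The result is exactly what this step asks for: file paths plus contract names, with no further analysis of file contents.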

### 3. Discover Functions (lean — Grep only)
Use Grep for `function \w+\(` in each discovered .sol file to find all function declarations. Use Grep context flags (`-A 1` or `-B 1`) to determine each function's visibility, then apply these rules:
- Collect all `external` and `public` functions
- Also collect `internal` functions
- Skip auto-generated getters and `pure`/`view` helpers that only return a constant
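
As a rough illustration of the visibility rules above, the following Python sketch extracts function names and visibility from Solidity source text. The helper names (`find_functions`, `auditable`) are hypothetical, and the regex assumes parameter lists without nested parentheses, which holds for most but not all Solidity signatures. It does not attempt the "constant-returning helper" filter, which needs a look at the function body.

```python
import re

# name, parameter list, then everything up to the body `{` or a terminating `;`
FUNC_RE = re.compile(r"function\s+(\w+)\s*\(([^)]*)\)([^{;]*)")
VISIBILITIES = ("external", "public", "internal", "private")


def find_functions(source: str) -> list[dict]:
    """Return name, visibility, and 1-based line number for each function."""
    results = []
    for m in FUNC_RE.finditer(source):
        header_words = m.group(3).split()
        visibility = next((v for v in VISIBILITIES if v in header_words), None)
        results.append(
            {
                "name": m.group(1),
                "visibility": visibility,
                "line": source.count("\n", 0, m.start()) + 1,
            }
        )
    return results


def auditable(functions: list[dict]) -> list[dict]:
    """Keep external, public, and internal functions; drop private ones."""
    return [f for f in functions if f["visibility"] in ("external", "public", "internal")]
```

Because the header is matched across newlines, a visibility keyword on the line after the signature is still found, which is the case the Grep context flags are meant to cover.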

### 4. Group Into Domains
Group functions into logical domains using these heuristics (in priority order):
1. **Shared modifiers**: Functions sharing the same access control modifier belong together
2. **Shared state writes**: Functions that write to the same state variables belong together
3. **Lifecycle stages**: Functions that form a sequence (request -> process -> claim) belong together
4. **Name prefixes**: Functions with common prefixes (deposit/withdraw, add/remove)

Target 4-10 domains of 3-15 functions each. If the contract has fewer than 15 functions total, use a single domain. If natural grouping exceeds 10 domains, merge the smallest related domains.
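
The sizing rules can be sketched as a post-processing pass over a draft grouping. This is a simplified illustration under stated assumptions: `apply_sizing_rules` is a hypothetical helper, and it merges the two smallest domains regardless of whether they are related, whereas the rule above asks for the smallest *related* domains, a judgment this sketch does not model.

```python
def apply_sizing_rules(
    domains: dict[str, list[str]],
    max_domains: int = 10,
    single_domain_below: int = 15,
) -> dict[str, list[str]]:
    """Collapse tiny projects to one domain and cap the domain count."""
    total = sum(len(fns) for fns in domains.values())
    if total < single_domain_below:
        # Fewer than 15 functions overall: a single domain is enough.
        return {"all": [fn for fns in domains.values() for fn in fns]}

    result = {name: list(fns) for name, fns in domains.items()}
    while len(result) > max_domains:
        # Simplification: pick the two smallest domains by function count.
        a, b = sorted(result, key=lambda name: len(result[name]))[:2]
        result[f"{a}+{b}"] = result.pop(a) + result.pop(b)
    return result
```

For example, a draft of 12 domains with 3 functions each comes back as 10 domains, two of which are merged pairs.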

### 5. Create Output Directory
```bash
mkdir -p docs/audit/function-audit/{stage0,stage1,stage2,stage3,verification}
```

### 6. Collect Source File Paths
Build the list of all .sol source file paths (absolute paths) for `{source_file_list}` placeholders.

### 7. Detect Project Characteristics
Scan source files for DeFi-relevant patterns:
- **Token interfaces**: Grep for `ERC20`, `ERC721`, `ERC1155`, `ERC4626`, `IERC20`, `SafeERC20` → set `{has_tokens}` true/false
- **Proxy/upgrade patterns**: Grep for `UUPSUpgradeable`, `TransparentProxy`, `Initializable` → set `{has_proxies}` true/false
- **Oracle imports**: Grep for `AggregatorV3Interface`, `IOracle`, `TWAP` → set `{has_oracles}` true/false
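
The three flags amount to substring checks over the source files. A minimal Python sketch, using the same patterns listed above and a hypothetical `detect_flags` helper, might look like this:

```python
# Patterns copied from the list above; plain substring matching mirrors
# an unanchored Grep, so `ERC20Upgradeable` also counts as a token hit.
FLAG_PATTERNS = {
    "has_tokens": ["ERC20", "ERC721", "ERC1155", "ERC4626", "IERC20", "SafeERC20"],
    "has_proxies": ["UUPSUpgradeable", "TransparentProxy", "Initializable"],
    "has_oracles": ["AggregatorV3Interface", "IOracle", "TWAP"],
}


def detect_flags(sources: list[str]) -> dict[str, bool]:
    """Return one boolean per flag; True if any pattern occurs in any source."""
    return {
        flag: any(pattern in text for text in sources for pattern in patterns)
        for flag, patterns in FLAG_PATTERNS.items()
    }
```

Substring matching can produce false positives (for instance, a comment that mentions `TWAP`), which is acceptable here because the flags only steer which checks later stages emphasize.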

---

## Session State Checkpoint

After each completed stage, write `{output_root}/stage-checkpoint.md` using the Write tool (full overwrite), where `{output_root}` is the `docs/audit/function-audit/` directory. Include:
- `PROJECT_PATH`, `OUTPUT_ROOT`
- `STAGE_STATUS`: key=value pairs for each stage. Write as a standalone line starting with `STAGE_STATUS:` — this line is machine-parsed by the PreCompact hook.
- `DOMAINS`: one line per domain with slug, name, and function list
- `FLAGS`: `has_tokens`, `has_proxies`, `has_oracles`
- `PATHS`: `design_decisions_file` and all stage output file paths known so far
- After Synthesis: `FINDING_TOTALS` with severity counts

Before each stage, read the checkpoint file to confirm all paths and domain groupings. If state has been lost, recover via:
1. `Glob(pattern: "**/docs/audit/function-audit/stage-checkpoint.md")`
2. Read the file to restore all session state
3. Resume from the last completed stage
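
To make the machine-parsed line concrete, here is a hedged Python sketch of writing and reading `STAGE_STATUS`. The exact key names and the space-separated `key=value` layout are assumptions made for illustration; the document only specifies that the line starts with `STAGE_STATUS:` and carries key=value pairs.

```python
# Assumed stage keys, in pipeline order (hypothetical naming).
STAGE_ORDER = ("stage0", "stage1", "stage2", "stage3", "verification")


def format_stage_status(status: dict[str, str]) -> str:
    """Render the standalone STAGE_STATUS line."""
    pairs = " ".join(f"{key}={value}" for key, value in status.items())
    return f"STAGE_STATUS: {pairs}"


def parse_stage_status(checkpoint_text: str) -> dict[str, str]:
    """Find the STAGE_STATUS line in a checkpoint file and parse its pairs."""
    for line in checkpoint_text.splitlines():
        if line.startswith("STAGE_STATUS:"):
            body = line[len("STAGE_STATUS:"):]
            return dict(pair.split("=", 1) for pair in body.split())
    return {}


def last_completed(status: dict[str, str]) -> "str | None":
    """Return the last stage in order that is marked complete, if any."""
    done = None
    for stage in STAGE_ORDER:
        if status.get(stage) != "complete":
            break
        done = stage
    return done
```

During recovery, `last_completed(parse_stage_status(text))` gives the stage to resume after.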

---

## Stage 0: Design Decisions (automated — no interaction)

Use Glob to find `GROUND_TRUTH.md` in the project root or parent directories. Read the file and extract `design_decisions_preset` from its YAML frontmatter.

If `design_decisions_preset` is found, write `docs/audit/function-audit/stage0/design-decisions.md` containing the preset values formatted as design decision categories:
- **Upgradeable**: value of `upgradeable`
- **Token Standard**: value of `token_standard`
- **Access Control Model**: value of `access_control`
- **Oracle Usage**: value of `oracle_usage`
- **Notes**: value of `notes`
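
The mapping from preset keys to categories can be sketched in Python as follows. It assumes the frontmatter has already been parsed into a dictionary; the `render_design_decisions` helper, the `# Design Decisions` title, and the "not specified" fallback are all illustrative choices, not something the skill prescribes.

```python
# (label, preset key) pairs, in the order listed above.
CATEGORIES = [
    ("Upgradeable", "upgradeable"),
    ("Token Standard", "token_standard"),
    ("Access Control Model", "access_control"),
    ("Oracle Usage", "oracle_usage"),
    ("Notes", "notes"),
]


def render_design_decisions(preset: dict) -> str:
    """Format preset values as the body of design-decisions.md."""
    lines = ["# Design Decisions", ""]
    for label, key in CATEGORIES:
        value = preset.get(key, "not specified")  # assumed fallback text
        lines.append(f"- **{label}**: {value}")
    return "\n".join(lines) + "\n"
```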

If `GROUND_TRUTH.md` is not found or has no preset, run the automated extraction from `resources/REVIEW_PROMPTS.md` (Stage 0 section) using Grep patterns — but skip the interactive confirmation. Write whatever is detected.

Store the absolute path of the written `design-decisions.md` as `{design_decisions_file}`.

Update the session state checkpoint (Stage 0 complete). Proceed immediately to Stage 1 (skip Slither entirely).

---

## Stage 1: Foundation Context (3 background agents)

Launch 3 Task agents, ALL with `run_in_background: true`, `subagent_type: "general-purpose"`, and `max_turns: 15`.

Read the prompt templates from `resources/STAGE_PROMPTS.md` and fill in the placeholders:
- `{output_file}` — the absolute path to the output markdown file
- `{source_file_list}` — the collected source file paths

| Agent | Output File | Prompt Template |
|-------|------------|-----------------|
| 1a: State Variable Map | `docs/audit/function-audit/stage1/state-variable-map.md` | Stage 1a from STAGE_PROMPTS.md |
| 1b: Access Control Map | `docs/audit/function-audit/stage1/access-control-map.md` | Stage 1b from STAGE_PROMPTS.md |
| 1c: External Call Map | `docs/audit/function-audit/stage1/external-call-map.md` | Stage 1c from STAGE_PROMPTS.md |

### Completion Check
After launching all 3:
1. Use `TaskOutput(block: true, timeout: 300000)` on each agent to wait for completion
2. Each agent should return ONLY a short confirmation like "Written to {file} -- {N} items analyzed."
3. Use Glob to verify all 3 files exist: `docs/audit/function-audit/stage1/*.md`
4. Quick-validate each output file: Read the first 5 and last 5 lines. Verify the file is non-empty and contains at least one markdown heading (`## `). If validation fails, note the file as INCOMPLETE in synthesis.
5. Update the session state checkpoint (Stage 1 complete, add stage1 file paths).
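
The quick-validation rule in step 4 can be expressed as a small Python check. This is a sketch under one interpretive assumption: the heading is looked for only in the sampled first and last five lines, since those are the lines the step says to read. `quick_validate` is a hypothetical helper name.

```python
from pathlib import Path


def quick_validate(path: str, sample: int = 5) -> str:
    """Return "OK" or "INCOMPLETE" for a stage output file."""
    file = Path(path)
    if not file.is_file():
        return "INCOMPLETE"
    lines = file.read_text(encoding="utf-8").splitlines()
    # Only the head and tail are inspected, mirroring the lean read above.
    sampled = lines[:sample] + lines[-sample:]
    non_empty = any(line.strip() for line in lines)
    has_heading = any(line.startswith("## ") for line in sampled)
    return "OK" if non_empty and has_heading else "INCOMPLETE"
```

A file whose only `## ` headings sit in the middle of a long document would be reported as INCOMPLETE under this reading, so a stricter or looser variant may be preferable in practice.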

---

## Stage 2: Per-Domain Analysis (N background agents)

Launch ONE Task agent per domain, ALL with `run_in_background: true`, `subagent_type: "general-purpose"`, and `max_turns: 25`.

Read the Stage 2 prompt template from `resources/STAGE_PROMPTS.md` and fill in:
- `{domain_name}` — the domain name
- `{output_file}` — `docs/audit/function-audit/stage2/domain-{slug}.md`
- `{stage1_state_var_file}` — absolute path to stage1/state-variable-map.md
- `{stage1_access_control_file}` — absolute path to stage1/access-control-map.md
- `{stage1_external_call_file}` — absolute path to stage1/external-call-map.md
- `{design_decisions_file}` — absolute path to `stage0/design-decisions.md`
- `{slither_file}` — empty string (Slither is skipped in eval mode)
- `{source_file_list}` — source files relevant to this domain only
- `{function_list}` — the functions in this domain with their contract and line numbers
- `{template_file}` — absolute path to `resources/FUNCTION_TEMPLATE.md`
- `{example_file}` — absolute path to `resources/EXAMPLE_OUTPUT.md`
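
Placeholder filling of this kind can be sketched in Python with a single regex pass. The sketch assumes placeholders are written as `{name}` with word characters only, as in the list above; `fill_template` is a hypothetical helper, and the actual templates in `resources/STAGE_PROMPTS.md` are not reproduced here. A regex substitution is used instead of `str.format` so that literal braces in Solidity snippets inside a template do not raise errors.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def fill_template(template: str, values: dict[str, str]) -> tuple[str, list[str]]:
    """Substitute known placeholders; return the text and any unfilled names."""
    missing = set()

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key in values:
            return values[key]
        missing.add(key)
        return match.group(0)  # leave unknown placeholders untouched

    return PLACEHOLDER_RE.sub(substitute, template), sorted(missing)
```

Checking the returned list of unfilled names before launching an agent catches a forgotten placeholder early. In eval mode `slither_file` is passed as an empty string, which counts as filled.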

### Completion Check
After launching all domain agents:
1. Use `TaskOutput(block: true, timeout: 600000)` on each agent (up to 10 minutes)
2. Each agent should return ONLY a short confirmation
3. Use Glob to verify all domain files exist: `docs/audit/function-audit/stage2/*.md`
4. Quick-validate each output file: Read the first 5 and last 5 lines. Verify the file is non-empty, contains at least one `## ` heading, contains `## Summary of Findings` or `## Cross-Cutting Analysis`, and has at least one severity tag. If validation fails, note the file as INCOMPLETE.
5. Update the session state checkpoint (Stage 2 complete, add stage2 file paths).

---

## Stage 3: Cross-Cutting Anal
