ai-native-cli
Design spec with 98 rules for building CLI tools that AI agents can safely use. Covers structured JSON output, error handling, input contracts, safety guardrails, exit codes, and agent self-description.
What this skill does
# Agent-Friendly CLI Spec v0.1
When building or modifying CLI tools, follow these rules to make them safe and
reliable for AI agents to use.
## Overview
A comprehensive design specification for building AI-native CLI tools. It defines
98 rules across three certification levels (Agent-Friendly, Agent-Ready, Agent-Native)
with prioritized requirements (P0/P1/P2). The spec covers structured JSON output,
error handling, input contracts, safety guardrails, exit codes, self-description,
and a feedback loop via a built-in issue system.
## When to Use This Skill
- Use when building a new CLI tool that AI agents will invoke
- Use when retrofitting an existing CLI to be agent-friendly
- Use when designing command-line interfaces for automation pipelines
- Use when auditing a CLI tool's compliance with agent-safety standards
## Core Philosophy
1. **Agent-first** -- default output is JSON; human-friendly is opt-in via `--human`
2. **Agent is untrusted** -- validate all input at the same level as a public API
3. **Fail-Closed** -- when validation logic itself errors, deny by default
4. **Verifiable** -- every rule is written so it can be automatically checked
## Layer Model
This spec uses two orthogonal axes:
- **Layer** answers rollout scope: `core`, `recommended`, `ecosystem`
- **Priority** answers severity: `P0`, `P1`, `P2`
Use layers for migration and certification:
- **core** -- execution contract: JSON, errors, exit codes, stdout/stderr, safety
- **recommended** -- better machine UX: self-description, explicit modes, richer schemas
- **ecosystem** -- agent-native integration: `agent/`, `skills`, `issue`, inline context
Certification maps to layers:
- **Agent-Friendly** -- all `core` rules pass
- **Agent-Ready** -- all `core` + `recommended` rules pass
- **Agent-Native** -- all layers pass
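
The layer-to-certification mapping above can be sketched as a small lookup. This is an illustrative sketch, not part of the spec: the function name `certification_level` and the input shape (a dict of layer name to pass/fail) are assumptions.

```python
from typing import Optional

# Layers in rollout order, as named in the spec.
LAYERS = ("core", "recommended", "ecosystem")

# Certification earned when every layer up to and including this one passes.
CERTIFICATION = {
    "core": "Agent-Friendly",
    "recommended": "Agent-Ready",
    "ecosystem": "Agent-Native",
}


def certification_level(layer_passed: dict[str, bool]) -> Optional[str]:
    """Return the highest certification whose layers all pass, or None."""
    level = None
    for layer in LAYERS:
        if not layer_passed.get(layer, False):
            break  # a failing layer blocks every higher certification
        level = CERTIFICATION[layer]
    return level
```

Because the loop stops at the first failing layer, a tool that passes `recommended` but fails `core` earns no certification at all.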
## How It Works
### Step 1: Output Mode
Agent mode (JSON) is the default. Explicit flags switch modes:
```bash
$ mycli list # default = JSON output (agent mode)
$ mycli list --human # human-friendly: colored, tables, formatted
$ mycli list --agent # explicit agent mode (override config if needed)
```
- **Default (no flag)** -- JSON to stdout. Agent never needs to add a flag.
- **--human** -- human-friendly format (colors, tables, progress bars)
- **--agent** -- explicit JSON mode (useful when env/config overrides default)
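
One way to wire up these modes, sketched with Python's `argparse`. The `list` command, the placeholder items, and the tab-separated human format are assumptions made for illustration; only the flag names come from the spec.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    # The spec reserves --help for JSON self-description, so argparse's
    # built-in plain-text help is disabled here.
    parser = argparse.ArgumentParser(prog="mycli", add_help=False)
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--human", action="store_true")
    mode.add_argument("--agent", action="store_true")
    parser.add_argument("command", choices=["list"])
    return parser


def render(items: list[dict], human: bool) -> str:
    """JSON unless --human was passed; no flag is needed for agent mode."""
    if human:
        return "\n".join(f"{item['id']}\t{item['name']}" for item in items)
    return json.dumps({"items": items})


def main(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    # Placeholder data standing in for whatever the command really lists.
    items = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
    return render(items, human=args.human)
```

Note that `argparse` exits with status 2 on unknown flags, but it prints a plain-text usage message, so a real implementation would still need to replace that message with a structured error.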
### Step 2: agent/ Directory Convention
Every CLI tool MUST have an `agent/` directory at its project root. This is the
tool's identity and behavior contract for AI agents.
```
agent/
  brief.md                # One paragraph: who am I, what can I do
  rules/                  # Behavior constraints (auto-registered)
    trigger.md            # When should an agent use this tool
    workflow.md           # Step-by-step usage flow
    writeback.md          # How to write feedback back
  skills/                 # Extended capabilities (auto-registered)
    getting-started.md
```
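
A compliance check for this convention might look like the sketch below. It assumes the three rule files sit under `rules/` and `getting-started.md` under `skills/`, and it treats every listed entry as required; the spec text shown here only states that the `agent/` directory itself is mandatory. The function name is hypothetical.

```python
from pathlib import Path

# Entries taken from the layout above. Treating all of them as required,
# and the nesting under rules/ and skills/, are assumptions.
EXPECTED_ENTRIES = (
    "brief.md",
    "rules/trigger.md",
    "rules/workflow.md",
    "rules/writeback.md",
    "skills/getting-started.md",
)


def missing_agent_entries(project_root: Path) -> list[str]:
    """List expected files that are absent from <project_root>/agent/."""
    agent_dir = project_root / "agent"
    return [entry for entry in EXPECTED_ENTRIES if not (agent_dir / entry).is_file()]
```

An empty result means the layout is complete; anything else can be reported as a structured error.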
### Step 3: Four Levels of Self-Description
1. **--brief** (business card, injected into agent config)
2. **Every Command Response** (always-on context: data + rules + skills + issue)
3. **--help** (full self-description: brief + commands + rules + skills + issue)
4. **skills \<name\>** (on-demand deep dive into a specific skill)
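
Level 2, the always-on context, could be implemented as an envelope around each command's data. The sketch below is an assumption about shape: the spec names the four parts (data, rules, skills, issue) but the key names and value types used here are illustrative.

```python
import json
from typing import Any


def wrap_response(data: Any, rules: list[str], skills: list[str], issue: str) -> str:
    """Attach the always-on context to a command's data and serialize it."""
    # Key names mirror the four parts listed in the spec; the value types
    # (lists of names, a hint string for the issue system) are assumptions.
    return json.dumps({"data": data, "rules": rules, "skills": skills, "issue": issue})
```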
## Certification Requirements
Each level includes all rules from the previous level.
Priority tags: `[P0]` = agent breaks without it, `[P1]` = agent works but poorly, `[P2]` = nice to have.
### Level 1: Agent-Friendly (core -- 20 rules)
Goal: CLI is a stable, callable API. Agent can invoke, parse, and handle errors.
**Output** -- default is JSON, stable schema
- `[P0]` O1: Default output is JSON. No `--json` flag needed
- `[P0]` O2: JSON MUST pass `jq .` validation
- `[P0]` O3: JSON schema MUST NOT change within same version
**Error** -- structured, to stderr, never interactive
- `[P0]` E1: Errors -> `{"error":true, "code":"...", "message":"...", "suggestion":"..."}` to stderr
- `[P0]` E4: Error has machine-readable `code` (e.g. `MISSING_REQUIRED`)
- `[P0]` E5: Error has human-readable `message`
- `[P0]` E7: On error, NEVER enter interactive mode -- exit immediately
- `[P0]` E8: Error codes are API contracts -- MUST NOT rename across versions
**Exit Code** -- predictable failure signals
- `[P0]` X3: Parameter/usage errors MUST exit 2
- `[P0]` X9: Failures MUST exit non-zero -- never exit 0 then report error in stdout
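
The error and exit-code rules above combine naturally into one helper. This is a minimal sketch: the function names `format_error` and `fail` and the default exit code of 1 are assumptions, while the JSON shape follows E1.

```python
import json
import sys
from typing import NoReturn

EXIT_USAGE = 2  # X3: parameter/usage errors


def format_error(code: str, message: str, suggestion: str) -> str:
    """Serialize an error in the E1 shape."""
    return json.dumps(
        {"error": True, "code": code, "message": message, "suggestion": suggestion}
    )


def fail(code: str, message: str, suggestion: str, exit_code: int = 1) -> NoReturn:
    """Print the error to stderr and exit non-zero without prompting (E7, X9)."""
    if exit_code == 0:
        raise ValueError("a failure must not exit 0 (X9)")
    print(format_error(code, message, suggestion), file=sys.stderr)
    sys.exit(exit_code)
```

A missing parameter would then be reported with something like `fail("MISSING_REQUIRED", "--name is required", "Pass --name <value>", EXIT_USAGE)`, leaving stdout untouched.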
**Composability** -- clean pipe semantics
- `[P0]` C1: stdout is for data ONLY
- `[P0]` C2: logs, progress, warnings go to stderr ONLY
**Input** -- fail fast on bad input
- `[P1]` I4: Missing required param -> structured error, never interactive prompt
- `[P1]` I5: Type mismatch -> exit 2 + structured error
**Safety** -- protect against agent mistakes
- `[P1]` S1: Destructive ops require `--yes` confirmation
- `[P1]` S4: Reject `../../` path traversal, control chars
**Guardrails** -- runtime input protection
- `[P1]` G1: Unknown flags rejected with exit 2
- `[P1]` G2: Detect API key / token patterns in args, reject execution
- `[P1]` G3: Reject sensitive file paths (*.env, *.key, *.pem)
- `[P1]` G8: Reject shell metacharacters in arguments (; | && $())
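
The Safety and Guardrails checks (S4, G2, G3, G8) can be sketched as a single argument validator. The error code strings returned here are hypothetical, and the secret patterns cover only a few well-known token prefixes for illustration; a real detector would need a broader list.

```python
import re
from fnmatch import fnmatch
from typing import Optional

SHELL_META = re.compile(r"[;|]|&&|\$\(")        # G8: ; | && $(
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")  # S4: control characters
# G2: illustrative token shapes only, not an exhaustive list.
SECRET_PATTERNS = re.compile(
    r"\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16})\b"
)
SENSITIVE_GLOBS = ("*.env", "*.key", "*.pem")   # G3


def check_argument(value: str) -> Optional[str]:
    """Return a (hypothetical) error code if a guardrail is violated, else None."""
    normalized = value.replace("\\", "/")
    if CONTROL_CHARS.search(value):
        return "CONTROL_CHARACTER"
    if ".." in normalized.split("/"):
        return "PATH_TRAVERSAL"
    if SHELL_META.search(value):
        return "SHELL_METACHARACTER"
    if SECRET_PATTERNS.search(value):
        return "SECRET_IN_ARGS"
    file_name = normalized.rsplit("/", 1)[-1]
    if any(fnmatch(file_name, pattern) for pattern in SENSITIVE_GLOBS):
        return "SENSITIVE_PATH"
    return None
```

To stay fail-closed, a caller should treat any exception raised inside the validator as a rejection rather than letting the command proceed.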
### Level 2: Agent-Ready (+ recommended -- 59 rules)
Goal: CLI is self-describing, well-named, and pipe-friendly. Agent discovers capabilities and chains commands without trial and error.
**Self-Description** -- agent discovers what CLI can do
- `[P1]` D1: `--help` outputs structured JSON with `commands[]`
- `[P1]` D3: Schema has required fields (help, commands)
- `[P1]` D4: All parameters have type declarations
- `[P1]` D7: Parameters annotated as required/optional
- `[P1]` D9: Every command has a description
- `[P1]` D11: `--help` outputs JSON with help, rules, skills, commands
- `[P1]` D15: `--brief` outputs `agent/brief.md` content
- `[P1]` D16: Default JSON (agent mode), `--human` for human-friendly
- `[P2]` D2/D5/D6/D8/D10: per-command help, enums, defaults, output schema, version
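
A `--help` payload satisfying D1, D3, D4, D7, D9, and D11 might be assembled as below. The top-level keys come from D11; the per-command and per-parameter key names (`name`, `description`, `params`, `type`, `required`) and the example command are assumptions.

```python
import json

REQUIRED_PARAM_KEYS = {"name", "type", "required"}  # D4 and D7


def build_help(brief: str, commands: list[dict], rules: list[str], skills: list[str]) -> str:
    """Return the --help payload as JSON, rejecting under-described commands."""
    for command in commands:
        if not command.get("description"):
            raise ValueError(f"command {command.get('name')!r} has no description (D9)")
        for param in command.get("params", []):
            missing = REQUIRED_PARAM_KEYS - set(param)
            if missing:
                raise ValueError(f"parameter is missing {sorted(missing)}")
    return json.dumps(
        {"help": brief, "rules": rules, "skills": skills, "commands": commands}
    )


# Hypothetical payload for a tool with a single `list` command.
EXAMPLE_HELP = build_help(
    brief="mycli lists items.",
    commands=[
        {
            "name": "list",
            "description": "List items",
            "params": [{"name": "--limit", "type": "integer", "required": False}],
        }
    ],
    rules=["trigger", "workflow", "writeback"],
    skills=["getting-started"],
)
```

Validating at build time means an incomplete command definition fails in the tool's own tests instead of surfacing as a confusing payload for an agent.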
**Input** -- unambiguous calling convention
- `[P1]` I1: All flags use `--long-name` format
- `[P1]` I2: No positional argument ambiguity
- `[P2]` I3/I6/I7: --json-input, boolean --no-X, array params
**Error**
- `[P1]` E6: Error includes `suggestion` field
- `[P2]` E2/E3: errors to stderr, error JSON valid
**Safety**
- `[P1]` S8: `--sanitize` flag for external input
- `[P2]` S2/S3/S5/S6/S7: default deny, --dry-run, no auto-update, destructive marking
**Exit Code**
- `[P1]` X1: 0 = success
- `[P2]` X2/X4-X8: 1=general, 10=auth, 11=permission, 20=not-found, 30=conflict
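
The exit codes listed so far fit in one enumeration. The numeric values come from the spec (X1, X3, and the X2/X4-X8 summary); the member names and the mapping from Python exception types are assumptions added for illustration.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    GENERAL = 1
    USAGE = 2
    AUTH = 10
    PERMISSION = 11
    NOT_FOUND = 20
    CONFLICT = 30


def exit_code_for(error: BaseException) -> ExitCode:
    """Map a few built-in exceptions to exit codes (mapping is an assumption)."""
    if isinstance(error, PermissionError):
        return ExitCode.PERMISSION
    if isinstance(error, FileNotFoundError):
        return ExitCode.NOT_FOUND
    if isinstance(error, FileExistsError):
        return ExitCode.CONFLICT
    return ExitCode.GENERAL  # never SUCCESS: failures must exit non-zero
```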
**Composability**
- `[P1]` C6: No interactive prompts in pipe mode
- `[P2]` C3/C4/C5/C7: pipe-friendly, --quiet, pipe chain, idempotency
**Naming** -- predictable flag conventions
- `[P1]` N4: Reserved flags (--agent, --human, --brief, --help, --version, --yes, --dry-run, --quiet, --fields)
- `[P2]` N1/N2/N3/N5/N6: consistent naming, kebab-case, max 3 levels, --version semver
**Guardrails**
- `[P1]` I8/I9: no implicit state, non-interactive auth
- `[P1]` G6/G9: precondition checks, fail-closed
- `[P2]` G4/G5/G7: permission levels, PII redaction, batch limits
#### Reserved Flags
| Flag | Semantics | Notes |
|------|-----------|-------|
| `--agent` | JSON output (default) | Explicit override |
| `--human` | Human-friendly output | Colors, tables, formatted |
| `--brief` | One-paragraph identity | For sync into agent config |
| `--help` | Full self-description JSON | Brief + commands + rules + skills + issue |
| `--version` | Semver version string | |
| `--yes` | Confirm destructive ops | Required for delete/destroy |
| `--dry-run` | Preview without executing | |
| `--quiet` | Suppress stderr output | |
| `--fields` | Filter output fields | Save tokens |
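
As an illustration of `--fields`, the sketch below trims each record to the requested keys before serialization. The comma-separated syntax and the choice to silently skip unknown field names are assumptions; the spec excerpt only says the flag filters output fields.

```python
def filter_fields(records: list[dict], fields: str) -> list[dict]:
    """Keep only the requested keys in each record, in the order requested."""
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    return [
        {name: record[name] for name in wanted if name in record}
        for record in records
    ]
```

With this approach, `mycli list --fields id,name` would drop every other key from each item, which is where the token savings come from.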
### Level 3: Agent-Native (+ ecosystem -- 19 rules)
Goal: CLI has identity, behavior contract, skill system, and feedback loop. Agent can learn the tool, extend its use, and report problems -- full closed-loop collaboration.
**Agent Directory** -- tool identity and behavior contract