discovery-to-determinism
Put the bulk of acceptance coverage below the UI through a fast, deterministic headless client driving an operator seam, and reserve a surgical UI state-graph tier for defects that only manifest through the real GUI. Use when designing test/QA or acceptance-testing strategy, automating acceptance, end-to-end (E2E), or QA testing of a running app, deciding what to cover with fast headless tests vs slow UI/E2E, building agent-driven exploration or automation of a running app, building a below-UI operator seam (interaction layer) or headless client, or crystallising agent-discovered knowledge into reusable deterministic artifacts (maps, graphs, scripts, tests). Covers the Discovery⇄Determinism flywheel, the operator-seam architecture (one seam serving both a headless test client and AI-agent tools), and layered headless-first acceptance testing with a surgical UI state-graph tier for GUI-only defects.
What this skill does
<objective>
There is an asymmetry at the heart of agent-driven engineering. **Discovery** — perceiving an unknown system, exercising judgement, handling the open-ended — is what *agents* are uniquely good at; it is also expensive and non-deterministic. **Exact repetition** — doing the same thing the same way, fast, every time — is what *code* is uniquely good at; it is cheap and deterministic but discovers nothing new.
This skill teaches how to run that asymmetry as a **flywheel** (discovery hardens into determinism; determinism makes the next discovery cheaper and sharper), and how to cash it out architecturally: by carving a clean **operator seam** below the UI so most behaviour is driven by fast deterministic code instead of the slow, flaky GUI — a seam that, built well, serves both a headless test client and AI-agent tools, because tests and agents are both *non-human operators* needing scriptable access to the real logic.
The principle leads; testing is its first worked embodiment. Keep it general; the worked example illustrates, it does not define.
</objective>
<quick_start>
1. **Run the applicability gate first** (`<when-this-applies>`) and STOP if it fails, that is, if any of these holds: the UI *is* the logic, a programmatic seam already exists, the app is too small, or no failing test needs the seam yet.
2. **Default acceptance coverage below the UI** — drive the real logic, persistence, and backend through an **operator seam** with a fast, deterministic headless client; that tier carries the bulk.
3. **Extract the seam test-first** — RED before any seam code; reuse an existing API / service-layer / CLI / SDK / MCP in preference to inventing one.
4. **Reserve a small, surgical UI tier** (`references/ui-state-graph-edt.md`) only for defects that *only* surface through the real GUI; never let it re-test what headless covers.
5. **Crystallise every worthwhile discovery** into self-checking code as part of the same effort — an un-crystallised discovery is a missing feedback loop.
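As a rough illustration of steps 2 and 3, a seam and a headless client might look like the sketch below. Everything named here (`TodoSeam`, `HeadlessClient`, the to-do domain itself) is a hypothetical stand-in rather than part of this skill; in a real app the seam would wrap the existing service layer, API, or CLI instead of an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class TodoSeam:
    """Hypothetical operator seam: the real logic with the GUI removed."""

    _items: dict[int, str] = field(default_factory=dict)
    _next_id: int = 1

    def add_item(self, title: str) -> int:
        # Same validation the GUI would trigger, reachable without the GUI.
        if not title.strip():
            raise ValueError("title must not be empty")
        item_id = self._next_id
        self._items[item_id] = title.strip()
        self._next_id += 1
        return item_id

    def complete_item(self, item_id: int) -> None:
        if item_id not in self._items:
            raise KeyError(f"unknown item id {item_id}")
        del self._items[item_id]

    def open_titles(self) -> list[str]:
        # Sorted by id so the observation is stable from run to run.
        return [self._items[key] for key in sorted(self._items)]


class HeadlessClient:
    """Drives the seam the way a test or an agent tool would (assumed shape)."""

    def __init__(self, seam: TodoSeam) -> None:
        self.seam = seam

    def add_then_complete(self, title: str) -> list[str]:
        item_id = self.seam.add_item(title)
        self.seam.complete_item(item_id)
        return self.seam.open_titles()
```

An acceptance check written against `HeadlessClient` exercises the same validation and state changes a user would, without rendering a screen. Under the test-first rule in step 3, that check would be written and seen to fail before `TodoSeam` exists.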
</quick_start>
<the-flywheel>
Run discovery and determinism as a loop with **two arrows**, not a one-way pipeline.
**Arrow 1 — Discovery → Determinism (crystallise).** An agent explores the unknown; the finding is hardened *immediately* into deterministic, self-checking code — a map, a script, a graph, a test, a recognizer. The artifact is a *cache of discovered structure*. A discovery left as prose or a transcript is a finding you will pay full price to discover again.
**Arrow 2 — Determinism → Discovery (bootstrap & target).** ← the half people miss. The crystallised artifacts make the *next* discovery cheaper, safer, and self-targeting:
1. **Launch from the frontier.** To explore new territory, deterministically *traverse the known map to its edge*, then explore only the delta — "arrive in seconds, explore the one new thing," not "redo the whole expensive prefix every run."
2. **A reliable harness, not flailing.** Deterministic *arrange* + *reset* gives exploration a repeatable scaffold: reach a precondition exactly, poke around, reset and re-arrive if lost. Discovery becomes a controlled experiment.
3. **Drift becomes a targeted discovery prompt.** When the system changes, a *precise, localised* failure in the deterministic layer ("expected marker X missing at step A→B") **is the instruction for where to re-discover.** Coarse failures cannot do this.
4. **Orientation keeps findings integrated.** A deterministic "where am I?" check lets an exploring agent locate itself against the known model, so new findings *slot into* the existing structure instead of forking a duplicate.
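The four points above can be sketched against a toy state graph. This is a minimal illustration under stated assumptions, not a prescribed implementation: `FakeApp`, `Edge`, `DriftError`, `traverse_to_frontier`, `where_am_i`, and every marker name are hypothetical, and a real version would drive the running app rather than a set of strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class DriftError(Exception):
    """An expected marker was missing after a known transition."""


class FakeApp:
    """Stand-in for a running app; it only tracks which markers are visible."""

    def __init__(self) -> None:
        self.visible: set[str] = {"login-form"}


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    action: Callable[[FakeApp], None]  # deterministic step against the app
    marker: str                        # evidence that the target was reached


def traverse_to_frontier(app: FakeApp, path: list[Edge]) -> str:
    """Replay known edges; fail at the exact edge that no longer holds."""
    location = path[0].source if path else "start"
    for edge in path:
        edge.action(app)
        if edge.marker not in app.visible:
            raise DriftError(
                f"expected marker {edge.marker!r} missing at step "
                f"{edge.source}->{edge.target}"
            )
        location = edge.target
    return location


def where_am_i(app: FakeApp, state_markers: dict[str, str]) -> Optional[str]:
    """Return the known state whose marker is visible, or None if unmapped."""
    for state, marker in state_markers.items():
        if marker in app.visible:
            return state
    return None


def open_dashboard(app: FakeApp) -> None:
    app.visible = {"dashboard-header"}


def open_settings(app: FakeApp) -> None:
    app.visible = {"settings-panel"}


KNOWN_PATH = [
    Edge("login", "dashboard", open_dashboard, "dashboard-header"),
    Edge("dashboard", "settings", open_settings, "settings-panel"),
]
STATE_MARKERS = {
    "login": "login-form",
    "dashboard": "dashboard-header",
    "settings": "settings-panel",
}
```

Calling `traverse_to_frontier(FakeApp(), KNOWN_PATH)` arrives at `"settings"`, the edge of the map, where exploration of the delta would begin. If the settings marker were renamed, the raised `DriftError` would name the `dashboard->settings` step, which is the localised prompt for where to re-discover.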
**The ratchet:** discovery produces determinism; determinism lowers the cost and raises the precision of the next discovery; repeat. This is a *model and a discipline*, not a measured law — there is no promised "cost falls by N%," and a **stale** crystallised map inverts the benefit until re-crystallised.
**Promode already runs this flywheel once.** The `CLAUDE.md`-rooted agent-knowledge graph *is* its first instance: a knowledge node is a crystallised discovery about the *repo*; "orient before you act" is the where-am-I check. Everything below aims the same loop at a *running app* instead of a codebase.
</the-flywheel>
<closing-the-loop>
**Why this is a loop, not two phases — and why it's the whole point.** The naive reading is "discover once, crystallise, then replay forever" — a one-way pipeline. The leverage is in wiring determinism's *output* back to inference's *input*: build the deterministic layer as an **instrument whose failures are designed to summon inference**, because a deterministic check that fails is asking a question only judgement can answer. Use each side for what only it can do:
- **Inference (agents) for discovery and judgement** — perceiving an unknown system, and deciding what a failure *means*. Expensive and non-deterministic; spend it where judgement is unavoidable.
- **Determinism (code) for efficient repetition** — replaying the known, fast and identically, for free. It discovers nothing, but it is the only thing that can *watch continuously at no cost*.
**When a crystallised artifact fails, it has asked a question. Triage it — only inference can, because code cannot know intent:**
1. **Flake** — the check itself is non-deterministic. Response: *eliminate the non-determinism* (pin time, seed RNG, isolate state, fix the unstable selector). A flaky deterministic check is worse than none — it trains everyone to ignore red; hardening it feeds more determinism back into the loop.
2. **Legitimate change** — the system moved on purpose and the artifact is now stale. Response: *re-discover the delta and re-crystallise* — update the map/recognizer/expected value so the deterministic layer tracks reality again. This is the flywheel turning: launch from the frontier, explore only what changed.
3. **Regression** — the system broke by accident. Response: *raise it* — the deterministic layer just did its job as a regression alarm.
This triage **is** the feedback channel, and it is why coverage compounds instead of rotting: every failure either **hardens** the suite (flake → more determinism), **advances** it (change → re-crystallise), or **protects** the system (regression → alarm). The fail-fast, localised-failure requirement exists to make the triage cheap — a precise break tells inference *where* to look and often *which* of the three it is; a vague "it went red" forces a fresh investigation every time, and a suite that fails imprecisely cannot drive its own repair.
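For the flake case, "pin time, seed RNG" usually means injecting the clock and the random source instead of reading them globally. The sketch below assumes a hypothetical `make_invoice_id` function as the logic under test; the names, the date, and the seed are illustrative only.

```python
import random
from datetime import datetime, timezone
from typing import Callable


def make_invoice_id(rng: random.Random, now: Callable[[], datetime]) -> str:
    """Hypothetical logic under test: depends on randomness and the clock."""
    stamp = now().strftime("%Y%m%d")
    suffix = rng.randrange(10_000)
    return f"INV-{stamp}-{suffix:04d}"


def fixed_clock() -> datetime:
    # Pinned time: the check no longer changes behaviour at midnight.
    return datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)


def run_check() -> str:
    # A fresh, seeded generator per run isolates state between runs.
    return make_invoice_id(random.Random(1234), fixed_clock)
```

With both inputs pinned, two calls to `run_check()` return the same string, so a red result can no longer be a flake from these two sources and must be a legitimate change or a regression.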
</closing-the-loop>
<disciplines>
The methodology is not "use a graph." It is the set of disciplines that keep the loop turning:
- **Always crystallise.** Harden every worthwhile discovery into deterministic, version-controlled code *as part of the same effort*. An un-crystallised discovery is a missing feedback loop.
- **Explore from the frontier.** Forbid re-discovering already-mapped territory; new exploration begins by deterministically traversing the existing map to its edge. Applies identically to a repo (knowledge graph) and a running app (state graph).
- **Make determinism break precisely.** Localised, fail-fast errors are a *first-class build requirement*, not a test-quality nicety — the precise break is the re-discovery signal, and the thing that lets inference triage a failure cheaply (flake vs legitimate change vs regression — see `<closing-the-loop>`). Verify the property by perturbation (deliberately break one check; confirm it halts exactly there and reports precisely).
- **Keep the map orientable.** Maintain a cheap "where am I?" check and stable identifiers, so discoveries integrate rather than fork.
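Verification by perturbation can itself be automated. The following is a minimal sketch, assuming checks are named zero-argument callables that all pass in the unperturbed baseline; `run_checks` and `verify_by_perturbation` are hypothetical helpers, not part of any existing tool.

```python
from typing import Callable, Optional

Check = tuple[str, Callable[[], bool]]


def run_checks(checks: list[Check]) -> Optional[str]:
    """Run checks in order; return the name of the first failure, else None."""
    for name, check in checks:
        if not check():
            return name  # fail fast: later checks are not run
    return None


def verify_by_perturbation(checks: list[Check]) -> list[str]:
    """Break each check in turn; list those whose break is not reported there."""
    imprecise = []
    for index, (name, _) in enumerate(checks):
        perturbed = list(checks)
        perturbed[index] = (name, lambda: False)  # deliberately broken check
        if run_checks(perturbed) != name:
            imprecise.append(name)
    return imprecise
```

An empty list from `verify_by_perturbation` means every deliberate break halted at, and was reported as, the check that was broken. A non-empty list names the checks whose failures would be misattributed, which are the ones to fix before trusting the suite as a re-discovery signal.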
</disciplines>
<the-operator-seam>
**The architectural move that cashes the flywheel out.** When real logic sits behind a UI, most behaviour lives *below* the UI. Carve a clean **operator seam** there (an observable, scriptable interface to the real logic, persistence, and backend, with the GUI removed) and drive