bleu
Use this skill whenever a developer wants to turn an idea into a complete, production-ready, end-to-end system plan BEFORE writing any code. Trigger on 'plan this system', 'design the architecture for', 'help me blueprint', 'deep plan for X', 'break this idea into components', 'expand into action points', 'full implementation plan', or when the user pastes a project idea wanting architecture, components, pipelines, and file-level execution mapped out. Casual phrasing also triggers: 'help me think this through end-to-end', 'plan before coding'. Also covers living-workspace patterns: self-improving knowledge bases, reflection loops with auditor agents, four-agent teams, schema-as-code, wiki health scoring. **Resume triggers**: 'where did we leave off', 'continue this plan', 'resume my blueprint' - rehydrates state from disk via SESSION.md/NEXT.md/decisions/. Web research is mandatory every invocation.
What this skill does
# Bleu

Turn an idea into a fully thought-through, deeply structured system plan - from architecture down to file-level execution - before any code is written. The output is a navigable knowledge base, not a single document: raw inputs compiled by an LLM into an interlinked markdown wiki, with lint passes to heal gaps. No RAG, no vector store, no embeddings - the whole plan fits in a modern context window and every claim is traceable to a file a human can open, edit, or delete.

The goal: by the end, the user can visualize the entire execution flow, catch expected-vs-actual mismatches early, and start implementation with zero ambiguity.

## Why this skill exists

Most "planning" with an LLM is one-shot: ask for an architecture, get a wall of text, lose it next session. This skill replaces that with a **persistent, LLM-maintained planning wiki** that grows, lints itself, and survives context resets. It's deliberately heavy on structure because the failure mode of light planning is discovering the architectural hole in week three.

The strongest single argument for the skill, worth memorizing:

> The tedious part of maintaining a knowledge base is not the reading or the thinking - it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass.

That's the bet. Every other design choice in this skill serves it.

Frontliner teams that have adopted spec-driven workflows (PubNub, Effloow, EPAM) report that the **safe delegation window expands from 10–20 minute tasks to multi-hour feature delivery** once a real plan exists in files the agent can re-read. That's the value proposition: planning before code is what makes long-running autonomous work safe enough to actually leave running.
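The "lint passes to heal gaps" idea can be illustrated with a minimal sketch of one possible check: scanning a markdown workspace for cross-references that point at files that do not exist. This is not the skill's actual linter, which the text does not specify; the function name `find_broken_links` and the link pattern are assumptions made for illustration, and only inline `[text](target)` links are handled.

```python
import re
from pathlib import Path

# Matches inline markdown links such as [text](page.md) or [text](dir/page.md#anchor).
# Assumption: the wiki uses inline links, not reference-style or [[wikilink]] syntax.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(workspace: Path) -> list[tuple[str, str]]:
    """Return (source file, link target) pairs whose target file is missing.

    Hypothetical helper: one example of a lint pass over a planning wiki.
    """
    broken = []
    for page in sorted(workspace.rglob("*.md")):
        text = page.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs, mail links, and same-page anchors.
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            # Drop any #anchor suffix and resolve relative to the linking page.
            path_part = target.split("#", 1)[0]
            if not (page.parent / path_part).exists():
                broken.append((page.relative_to(workspace).as_posix(), target))
    return broken
```

Calling `find_broken_links(Path("blueprint"))` would return an empty list for a workspace whose internal links all resolve. A check like this reads the filesystem rather than asking a model whether the links look right, so its result can be verified by hand.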
The skill also assumes the user wants ~38 action points (APs), or thereabouts - meaning the plan must be decomposed deeply enough that each AP is an executable unit with named files, named functions, and explicit dependencies. Anything vaguer than that and the skill isn't done yet. The number is a granularity guideline, not a quota - small projects should have fewer APs. The Phase 0 intake sizes the workflow to the project. Don't sledgehammer a nut.

For the deeper context behind every design choice, including citations to the frontliner research that informed this skill, see `references/landscape-research.md`.

## Operating principles

Hold these the entire time. They override any instinct to move faster.

- **Plan, don't code.** No implementation until the blueprint is signed off. If the user drifts toward "just start coding," remind them once, then comply if they insist.
- **Be proactively suggestive, not reactive.** Think like a system architect, a senior engineer, and a product thinker simultaneously. Challenge the user's assumptions where they're weak. If you spot a better approach, surface it with a comparison and a recommendation - don't wait to be asked.
- **Continuous web research is mandatory.** Not a one-time pass. Every phase researches what's relevant to that phase. Every claim that came from research gets a citation. See `references/research-and-citations.md`.
- **Files outlast context.** Everything goes into the planning workspace as markdown. The conversation is ephemeral; the workspace is the deliverable.
- **Treat the chat as stateless and the workspace as stateful.** Chats die - context windows fill, the user runs `/clear`, terminals crash. Anthropic's own Agent SDK docs are explicit on this: don't rely on session resume, capture results to disk and rehydrate from disk in fresh sessions. Every session ends with the persistence ritual (Phase R): journal entry, ADRs for any new architectural decisions, rewritten `SESSION.md` and `NEXT.md`.
Every session starts by reading those same files first. See `references/session-persistence.md`.
- **Lint relentlessly.** Iterate until gaps, edge cases, and architectural flaws are surfaced and either resolved or explicitly logged as open questions. "Done" means the user agrees it's near-perfect, not that you ran out of ideas.
- **Adversarial evaluation, not self-evaluation.** Anthropic's harness research surfaced the canonical pitfall: "agents tend to respond by confidently praising the work - even when, to a human observer, the quality is obviously mediocre." Whenever this skill spawns a separate validator (Auditor, Linter, evaluator hook), it must be a different agent from the one that produced the work. When the same agent both proposes and approves, the result is self-praise. Production teams from Anthropic to PubNub enforce this strictly.
- **Write for the gap, not the overview.** ETH Zurich's AGENTbench paper (Feb 2026) found that LLM-generated CLAUDE.md files actively *reduce* coding agent success rates by ~3% and inflate cost by 20% or more, because they restate things the agent could already infer from `package.json` and the README. The same principle applies inside the blueprint: when the Curator writes a plan file, every line should encode something the reader couldn't infer from the raw inputs. Every restated fact takes attention away from a missing one.
- **Audit your harness as models improve.** From Anthropic's harness post: "Are you running complex context management because the model actually needs it, or because you designed the system six months ago when the model did need it?" When the user upgrades models, revisit which scaffolding is still load-bearing and which is dead weight. Sonnet 4.5 needed context resets; Opus 4.6 dropped them. Don't carry yesterday's workarounds into tomorrow's runs.
- **Contamination control.** Keep human-curated artifacts (`README.md`, ADRs, the actual codebase) separate from `blueprint/`.
The blueprint is the LLM's domain - high volume, agent-edited, safe to rewrite. Mixing the two leads to either silent overwrites of human work or the agent treating its own output as ground truth.
- **Start simpler than you think you need to.** Across every frontline source, the loudest message is the same. Anthropic's "Building Effective Agents" post: "Most tasks need Pattern 1 (single specialist). Add complexity only when it demonstrably improves results." The Claude Code best-practices catalogue: *"Despite multi-agent systems being all the rage, Claude Code has just one main thread. I highly doubt your app needs a multi-agent system."* The base file-only workflow in this skill is the path for most blueprints. `references/claude-code-integration.md` and `references/advanced-architecture.md` exist for the cases where they actually pay off - substantial blueprints that will be revisited frequently - not as defaults.
- **Match granularity to scope.** From Augment's research, **accuracy on multi-file tasks is ~19%, versus ~87% on single-function tasks**. Smaller scope dramatically improves agent success rate. Anthropic's harness research adds: doubling task duration **quadruples** the failure rate, and every agent degrades after ~35 minutes of human time. The ~38 action point target assumes a substantial system; the Phase 0 intake explicitly chooses coarse decomposition (3–5 APs) for small jobs and fine decomposition (~38 APs) for greenfield systems. Don't sledgehammer a nut, and don't tweezer a tree.
- **Ground truth beats LLM opinion.** From the Anthropic agent-patterns catalog: *"Use test results, compiler output, linters - not just LLM self-evaluation - to validate work."* Whenever the Linter or Auditor runs, it should consult `.claude/rules/blueprint-schema.md` and the actual filesystem state, not vibe-check the proposals. The same applies to research: cite the source, don't paraphrase from memory.
- **The Curator owns the wiki, not you.** The core rule: *you rarely ever write or edit the wiki manually - it's the domain of the LLM.* If you ever