autogoal
Create, verify, repair, and close durable Codex goals with measurable outcomes, evidence gates, plan templates, blocker handling, completion audits, and goal-backed workflow repair.
What this skill does
# Autogoal

Use this when the user asks for a durable objective, long-running autonomous work, goal setup, or when a governing repo skill requires goal setup before work starts.

This skill turns a vague "keep going" instruction into a thread-scoped completion contract: what should be true, how it is verified, what must not change, and when Codex should stop.

## Core Take

A normal prompt says: do the next thing.

A goal says: keep working until this outcome is true, or until the evidence shows a real blocker.

Goals are for work where the next move depends on what Codex learns along the way: debugging, migrations, flaky tests, benchmark tuning, deep research, large refactors, prototypes, browser-proof loops, and pass-gated plans.

Goals are not a permission slip to wander. They are a scoped, evidence-checked contract.

No measurable outcome, no goal. A goal must have a verification surface and a completion threshold before `create_goal` is called. Prefer numbers: score, count, latency, coverage, pass count, failing-to-passing repro count, issue rows, or explicit command success. When a numeric target does not fit, use a binary artifact checklist that can be audited from files, commands, screenshots, browser proof, or source-backed citations.

## Universal Boundary

`autogoal` is the goal lifecycle kernel. It owns:

- objective shape
- measurable completion thresholds
- evidence standards
- active goal conflict handling
- durable plan state
- blocker and completion rules
- repair routing when a goal-backed workflow misses expectations

It does not own project policy. Keep repo commands, package managers, browser tools, release rules, PR policy, scorecards, issue ledgers, and lane-specific pass schedules in derived skills or project-owned `docs/plans/templates/<template>.md`.

Derived skills may be stricter than `autogoal`; they should not duplicate the goal lifecycle. `autogoal` says how work remains honest. The derived skill says what the lane actually requires.
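The "no measurable outcome, no goal" rule from Core Take can be sketched as a small pre-flight check. This is an illustrative sketch only: `GoalContract`, its field names, and `missing_requirements` are hypothetical and are not part of `autogoal` or the `create_goal` tool. It assumes a goal is ready when it has an objective, a verification surface, and either a numeric threshold or a binary artifact checklist.

```python
from dataclasses import dataclass, field


@dataclass
class GoalContract:
    """Hypothetical goal shape; field names are illustrative, not autogoal's schema."""
    objective: str
    verification_surface: str = ""  # e.g. a command, file, or browser proof
    numeric_threshold: str = ""     # e.g. "20 of 20 runs pass"
    artifact_checklist: list[str] = field(default_factory=list)  # binary fallback


def missing_requirements(goal: GoalContract) -> list[str]:
    """Return the reasons a goal is not yet measurable (empty list means ready)."""
    problems = []
    if not goal.objective.strip():
        problems.append("objective is empty")
    if not goal.verification_surface.strip():
        problems.append("no verification surface")
    # A numeric threshold is preferred; an auditable checklist is the fallback.
    if not goal.numeric_threshold.strip() and not goal.artifact_checklist:
        problems.append("no completion threshold or artifact checklist")
    return problems


vague = GoalContract(objective="keep going on the flaky tests")
ready = GoalContract(
    objective="Fix the flaky checkout tests",
    verification_surface="test suite run, repeated 20 times",
    numeric_threshold="20 of 20 runs pass",
)
print(missing_requirements(vague))
print(missing_requirements(ready))
```

Under these assumptions, a goal would only be created once the returned list is empty; the vague "keep going" instruction fails on both the verification surface and the threshold.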
## Template Composition

Goal plans are composable, but only through static materialization.

The model is:

1. one active goal
2. one concrete `docs/plans` plan file
3. one primary template
4. optional materialized packs

The primary template is chosen by dominant risk: `task` for normal execution, `docs` for docs-dominant work, `major-task` for heavyweight architecture or proposal work, and repo-specific templates for domain lanes.

Packs are chosen by touched surface. They add recurring gates without becoming parents:

- `docs`: docs are touched but not the dominant deliverable
- `agent-native`: agent instructions, skills, hooks, commands, prompts, or user-action tooling changed
- `browser`: real browser, route, UI, console, network, or interaction proof is required
- `package-api`: package exports, public API, release artifacts, package boundaries, or package-level checks changed

Core execution and review gates belong in the primary template. Every primary template must include `Autoreview` as the last human-readable gate before `Goal plan complete`. Packs are only for optional touched surfaces that would otherwise be absent from that template.

Do not create runtime inheritance between templates. The helper copies pack rows into the generated plan's `Start Gates`, `Work Checklist`, and `Completion Gates`. After creation, the generated plan is the truth; the checker validates that materialized plan only.

The generated plan is the dedicated plan shell. Fill that exact file immediately after generation: replace placeholders, resolve every gate row, and mark non-applicable generated rows as `N/A: <reason>` with evidence. Do not delete, wholesale replace, or hand-narrow the generated plan into an ad hoc smaller plan after durable work has started. If the selected template is plainly wrong and no substantive work has started, regenerate once with the right template and record why. If work has already started, keep the generated plan and close it honestly.
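Static materialization, as opposed to runtime inheritance, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the row text, the `PRIMARY_TEMPLATES` and `PACKS` dictionaries, and the `materialize` function are hypothetical and do not reflect how the real helper script is implemented. The sketch also assumes pack completion rows are placed ahead of `Autoreview` so that it stays the last gate before `Goal plan complete`.

```python
import copy

SECTIONS = ("Start Gates", "Work Checklist", "Completion Gates")

# Hypothetical rows; real rows live in project-owned templates.
PRIMARY_TEMPLATES = {
    "task": {
        "Start Gates": ["Requirements extracted"],
        "Work Checklist": ["Implement change"],
        "Completion Gates": ["Verification passes", "Autoreview", "Goal plan complete"],
    },
}
PACKS = {
    "docs": {
        "Work Checklist": ["Update affected docs"],
        "Completion Gates": ["Docs build passes"],
    },
    "browser": {
        "Start Gates": ["Browser target identified"],
        "Completion Gates": ["Browser proof captured"],
    },
}


def materialize(template: str, packs: list[str]) -> dict[str, list[str]]:
    """Copy primary-template rows, then copy each pack's rows into the same plan."""
    plan = copy.deepcopy(PRIMARY_TEMPLATES[template])  # copy, never a live reference
    for pack in packs:
        for section in SECTIONS:
            for row in PACKS[pack].get(section, []):
                if section == "Completion Gates":
                    # Assumption: pack gates go before `Autoreview`, keeping it last.
                    plan[section].insert(plan[section].index("Autoreview"), row)
                else:
                    plan[section].append(row)
    return plan


plan = materialize("task", ["docs", "browser"])
for section in SECTIONS:
    print(section, plan[section])
```

Because the result is a plain copy, later edits to a template or pack would not change an already generated plan, which matches the rule that the generated plan is the truth after creation.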
The first plan checkpoint is requirement extraction. Codex output can compact and lose prompt constraints, so before implementation or broad exploration, copy every explicit user requirement into the plan as checkable rows: scope, non-goals, timing/duration, stop conditions, deliverables, final handoff sections, verification surface, and success criteria. Do not continue into implementation until this is complete or explicitly marked N/A with reason.

Use packs like this:

```bash
node .agents/skills/autogoal/scripts/create-goal-scratchpad.mjs \
  --template task \
  --with docs \
  --with agent-native \
  --title "<short task title>"
```

Examples:

- docs-only work: `--template docs`
- normal code task that also changes docs: `--template task --with docs`
- agent workflow task: `--template task --with agent-native`
- browser behavior task: `--template task --with browser`
- public app/API or package-boundary task: `--template task --with package-api`
- major architecture task: `--template major-task`
- major architecture task that also changes docs and package API: `--template major-task --with docs --with package-api`

If two packs add related gates, keep both when they protect different failure modes. If they duplicate exactly the same proof, keep the more specific pack and record the other as N/A in the plan.

## Proportionality Dial

Classify goal-backed work before creating or updating a plan:

- `micro`: one narrow, auditable outcome; no cross-file state; no meaningful continuation loop. Use a tiny plan only when a repo rule requires it, or record the audit surface directly in the final response.
- `normal`: multi-step work with concrete evidence and likely continuation. Use the appropriate `docs/plans` template and close all relevant gates.
- `major`: architecture, migrations, benchmarks, framework comparisons, broad refactors, pass-gated lanes, or public API/runtime risk. Use a derived skill or project template with phases, risk rows, review gates, and explicit closure criteria.

Do not inflate a micro work item into a ceremony pile. Do not shrink a major work item into a checklist that cannot catch real risk.

## Goal Flow Modes

Every goal-backed workflow chooses exactly one flow mode before durable work starts. The mode controls the human review boundary; it does not weaken the evidence or completion rules.

### 1. One-Shot Execution

Use this for issue-like or work-item-like work where the agent is expected to read the source, derive the local plan, implement, verify, and hand off the result without stopping for plan approval.

Rules:

- Create or continue a goal when the work is non-trivial and auditable.
- Create a plan when durable state is useful or required by the caller.
- The plan is an execution ledger, not a proposal waiting for acceptance.
- Human review happens at the final handoff or explicit user interruption.
- Do not pause merely because the plan has not been reviewed. Pause only for a real blocker, unsafe ambiguity, or a user decision that changes scope.

### 2. Agent-Led Plan Hardening

Use this when the requested output is a plan and the user wants the agent to drive toward the best plan with minimal human interruption.

Rules:

- The agent owns the review loop: research, compare options, pressure-test, revise, and improve the plan until the confidence threshold is met.
- Ask the user only for decisions that materially change intent, boundaries, risk tolerance, or acceptance criteria.
- Record each self-review pass and plan delta as evidence.
- Stop for one major user review when the plan reaches the stated readiness threshold.
- Do not execute implementation under the planning goal unless the caller's governing workflow explicitly says planning and execution are the same goal.

### 3. Collaborative Planning

Use this when the user and agent are intentionally shaping the plan together before execution.

Rules:

- The goal outcome is an accepted plan, not implementation.
-