evomath-tao
Use this skill whenever the user submits a non-trivial mathematical claim that needs a rigorous proof or audit. Trigger on IMO/Putnam/USAMO/Olympiad-style problems, ML/AI theoretical statements, research conjectures, suspected-false claims, multi-step proofs the user already failed on, proof drafts with possible hidden assumptions, or any request containing 'prove rigorously', 'verify this', 'is this true', 'find the gap', 'audit my proof', 'find a counterexample', or 'use EvoMath' that targets a mathematical claim. Activate also when the problem requires more than three reasoning steps. Do NOT use for single-step calculations, definition lookups, textbook exercises with a known recipe, code analysis tasks, literature survey questions, pure symbolic manipulation, or non-mathematical applications of those trigger phrases (e.g., 'is it true that GPT-4 can solve math?', 'verify this LaTeX syntax'); hand those back instead.
What this skill does
# EvoMath (Tao-style) EvoMath is a lightweight proof workflow for contest-style mathematical reasoning. Its job is to produce a rigorous proof, a verified counterexample, a useful partial result, or a clear handoff. Keep the process small; do not run a heavy audit pipeline by default. ## Methodology Anchor — Terence Tao's Research-Math Practice This skill operationalizes the way Terence Tao approaches research mathematics: 1. **Compute small cases first** (Kepler before Newton) — build intuition from data before reaching for theory. 2. **Try the standard toolbox broadly before going deep** — most hard problems crack to a standard technique; the few that don't only reveal which after several have failed. 3. **Hold rigor and intuition together** (post-rigorous mathematics) — trust intuition, but verify every step. "It feels right" is a hypothesis, not a proof. 4. **Atomize when stuck** — decompose into independently checkable sub-claims. A clean map of proved / conjectured / open beats a polished but shaky narrative. 5. **Stay honest about what isn't proved** — distinguish PROVED / VERIFIED_NUMERICALLY / CONJECTURED / HANDED_OFF. When blocked, name the precise gap. 6. **Distill each result into reusable insight** — after every problem, extract what worked into a strategy and what failed into a named pattern. Mathematical maturity is accumulated meta-insight. Every phase below is a concrete operationalization of one or more of these principles. ## Operating Rules - Use Markdown notes for handoff between steps. Do not require JSON/YAML unless a script explicitly asks for it. - Keep only compact state: plan, verified claims, failed attempts, final audit. Do not pass long failed derivations into later prompts. - Prefer a few independent proof attempts over one long derivation. - **Numerical verification is NOT a proof step** (math-olympiad rule). Checking a claim on n=1..100 and finding no counterexample does NOT make it PROVED; the strongest label such evidence can earn is VERIFIED_NUMERICALLY. - Exact arithmetic can refute; approximate numerics only suggest. - A proof is final only after an adversarial check of the clean proof. - **Calibrated abstention over bluffing**: when verification fails repeatedly, admit it. Return partial results and mark unfixed gaps explicitly (math- olympiad rule). Final status HANDED_OFF with a structured wall report is always preferable to PROVED with hand-waved gaps. - Every final answer must include a visible `final-status: ...` line. - Use TodoWrite to drive the workflow. Each step is one todo; you cannot mark a todo completed unless the corresponding `.md` file passes its validator. If filesystem access is available, create a Markdown workspace with: ```bash python skills/evomath-tao/scripts/evomath_workspace.py init --dir .evomath/current ``` If filesystem access is not available, keep the same Markdown sections inline in the conversation. In that case run the validators by mentally checking the same required fields the script checks — the discipline is the same. ## Fast Exit Do not use EvoMath for single calculations, definition lookups, symbolic manipulation, or answer-only requests with no proof obligation. Give the direct answer instead. No TodoWrite list is needed for a Fast Exit. If the statement has a blocking ambiguity that changes truth value, ask one specific clarification question before solving. ## Execution Protocol (TodoWrite + Validation) For any problem that passes the Fast Exit Gate, follow this protocol. ### 1. Create the 5-step todo list Before doing any solving work, call TodoWrite with these five items in this order. Each item names its primary reference file: 1. **Plan Briefly** — read `references/intake-checklist.md` for type classification, ambiguity handling, goal types. 2. **Try Candidates** — read `references/angles-by-type.md` for technique ideas if you are out of angles for this problem type. 3. **Assemble** — read `references/output-formats.md` if you need formatting conventions or LaTeX templates. 4. **Audit** — read `references/grading-taxonomy.md` for issue classes and severity rules. Read `references/phase-4-audit.md` only if the user requests strict multi-reviewer audit. 5. **Reflect** — read `references/claim-memory.md` only when deep reflection is triggered (see "Deep Reflection Triggers" below). ### 2. Per-step discipline For each step in order: 1. Mark the todo `in_progress` before reading the reference or writing output. 2. Read the referenced file(s) if and only if you need them for this step. 3. Produce the corresponding `.md` output (plan.md, candidates.md, audit.md, final.md sections, etc.). 4. Run the validator before marking the todo completed: ```bash python skills/evomath-tao/scripts/evomath_workspace.py validate-phase <N> --dir .evomath/current ``` 5. If validation FAILS, the todo stays `in_progress`. Revise the `.md` file based on the printed failure messages and re-run the validator. Do not mark completed until the validator exits 0. ### 3. PROVED gate If your final-status is `PROVED`, you MUST additionally run: ```bash python skills/evomath-tao/scripts/evomath_workspace.py validate-proved --dir .evomath/current ``` This verifies that the 10-item PROVED Self-Check Checklist (see `references/output-formats.md`) is present in final.md with all boxes ticked. If this fails, downgrade final-status to CONJECTURED or HANDED_OFF and revise the answer. ### 4. Deep Reflection Triggers Step 5 (Reflect) has two modes: - **Light reflection** (default): three lines — successful pattern, failed pattern to avoid, whether memory was written. - **Deep reflection** (triggered when any of the following hold): - Step 2 required 3+ revision rounds for any candidate - Step 4 identified a fatal flaw before the final repair - A new winning technique appeared that is not yet in any L2 strategy entry - The user explicitly asks for self-evolution or cross-problem learning - final-status is HANDED_OFF with a recurring failure-mode In deep reflection mode, run the full ESE/IVE protocols described in `references/claim-memory.md` and update L2/L3 memory in `.evomath/session-memory.md`. ### 5. Fallback when filesystem is unavailable If you cannot run scripts, keep the same TodoWrite discipline: - Still create the 5-item list and march through it. - Still keep the same `.md` sections inline in the conversation. - Substitute mental validation for the script call — check the same required fields the validator would check. ## Workflow ### 1. Plan Briefly Write a short Markdown plan: - Problem type: algebra, geometry, number theory, combinatorics, analysis, or other. - Goal: prove, refute, find example, or audit proof. - Strategy: one sentence. - Subgoals: at most five bullets. For "determine all" problems, include both: - existence/construction - impossibility/exclusion For a simple problem, one root subgoal is enough. ### 2. Try Candidates **Mode dispatch** (decide before generating): - If Phase 0 `goal = find-numeric-answer` (AIME-style: answer is a single number, no proof required) → **AIME mode**: generate 5–7 short candidate answers using varied approaches (small-case enumeration, modular invariants, algebraic manipulation, generating functions, brute-force code). Take majority vote across candidates. Verify the top two by substitution into the original problem. Skip the rest of the proof workflow; output the numeric answer with `final-status: PROVED` only when both top candidates agree AND substitution checks pass. - Otherwise → **Proof mode**: continue below. **Proof mode — per-candidate 5-round internal loop**: For each active subgoal, try up to four genuinely different candidate routes. Each candidate is itself the product of a 5-round internal mini-process — not a one-shot generation: 1. **Solve** — produce a proof attempt using reasoning only. **No tool use
Related in Security
mac-ops
IncludedComprehensive macOS workstation operations — diagnose kernel panics, identify failing drives, audit launchd startup items, decode wake reasons, triage TCC permission denials, manage APFS snapshots, recover from no-boot. Use for: Mac is slow, slow bootup, won't boot, kernel panic, kernel_task hot, mds_stores CPU, photoanalysisd, cloudd, login loop, gray screen, sleep wake failure, drive failing, IO errors, APFS snapshots eating space, Time Machine local snapshots, Spotlight indexing, launchd, LaunchAgent, LaunchDaemon, login items, TCC permissions, Full Disk Access, Screen Recording denied, Gatekeeper, quarantine, com.apple.quarantine, app is damaged, helper tool, /Library/PrivilegedHelperTools, pmset, wake reasons, dark wake, sysdiagnose, panic.ips, DiagnosticReports, configuration profile, MDM profile, remote diagnostics over SSH.
a11y-audit
IncludedRun accessibility audits on web projects combining automated scanning (axe-core, Lighthouse) with WCAG 2.1 AA compliance mapping, manual check guidance, and structured reporting. Output is configurable: markdown report only, markdown plus machine-readable JSON, or markdown plus issue tracker integration. Use this skill whenever the user mentions "accessibility audit", "a11y audit", "WCAG audit", "accessibility check", "compliance scan", or asks to check a web project for accessibility issues. Also trigger when the user wants to verify WCAG conformance or map findings to a specific standard (CAN-ASC-6.2, EN 301 549, ADA/AODA).
erpclaw
IncludedAI-native ERP system with self-extending OS. Full accounting, invoicing, inventory, purchasing, tax, billing, HR, payroll, advanced accounting (ASC 606/842, intercompany, consolidation), and financial reporting. 413 actions across 14 domains, 43 expansion modules. Constitutional guardrails, adversarial audit, schema migration. Double-entry GL, immutable audit trail, US GAAP.
assess
IncludedAssesses and rates quality 0-10 across multiple dimensions (correctness, maintainability, security, performance, testability, simplicity) with pros/cons analysis. Compares against project conventions and prior decisions from memory. Produces structured evaluation reports with actionable improvement suggestions. Use when evaluating code, designs, architectures, or comparing alternative approaches.
spring-boot-security-jwt
IncludedProvides JWT authentication and authorization patterns for Spring Boot 3.5.x covering token generation with JJWT, Bearer/cookie authentication, database/OAuth2 integration, and RBAC/permission-based access control using Spring Security 6.x. Use when implementing authentication or authorization in Spring Boot applications.
code-hardcode-audit
IncludedDetect hardcoded values, magic numbers, and leaked secrets. TRIGGERS - hardcode audit, magic numbers, PLR2004, secret scanning.