codex-ppt
Generate visually unified image-based PPT/PPTX decks from articles, reports, papers, notes, or outlines.
# Codex PPT
## Overview
This skill creates image-based PowerPoint decks from source material. Each slide is a complete 16:9 generated image. Final images are assembled into `.pptx` with `scripts/assemble_ppt.py`.
Use this when the user wants a visually unified presentation and accepts full-slide image pages. Do not use it when every textbox, chart, or shape must remain separately editable.
Prefer the built-in image generation/editing tool. Use `scripts/image_gen.py` only when the built-in backend is unavailable, lacks a required capability, or the user explicitly asks for API/CLI mode.
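The backend preference above can be sketched as a small decision function. This is a minimal illustration, not part of the bundled scripts: the function name, its parameters, and the `"builtin"` label are assumptions made for this example, and the result is only a proposal until the user confirms it.

```python
from typing import Literal

# Labels are illustrative; only `scripts/image_gen.py` is named in this skill.
Backend = Literal["builtin", "scripts/image_gen.py"]


def propose_backend(
    builtin_callable: bool,
    builtin_has_required_capability: bool,
    user_requested_api_or_cli: bool,
) -> Backend:
    """Return the backend to propose; it is not final until the user confirms."""
    # An explicit user request for API/CLI mode overrides the default preference.
    if user_requested_api_or_cli:
        return "scripts/image_gen.py"
    # Fall back only when the built-in tool is unavailable or lacks a capability.
    if not builtin_callable or not builtin_has_required_capability:
        return "scripts/image_gen.py"
    return "builtin"
```

Convenience is deliberately not an input: the only routes to the fallback are the three conditions listed above.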
## Hard Constraints
- Read the relevant `Reference Map` files before each phase. This file is the orchestration contract; detailed rules live in `docs/` and worker prompts in `prompts/`.
- Respect approval gates. Do not create final `deck_spec.json`, `speech.md`, prompt jobs, slide images, or `.pptx` before the approvals in `docs/workflow-gates-and-progress.md`.
- After the user approves the sample slide and authorizes full-deck generation, every remaining slide image job must be dispatched to a slide subagent whenever subagents are available.
- The main agent owns orchestration, prompt jobs, state recording, QA, speaker notes, and assembly. Do not silently replace available slide subagents with sequential production.
- Every final `origin_image/slide_XX.png` must be generated by the selected image backend: built-in image generation/editing tool or `scripts/image_gen.py`.
- Local drawing, Pillow, SVG, HTML/CSS/canvas screenshots, python-pptx/PptxGenJS layouts, and manual overlays are failure modes, not fallbacks.
- The selected image backend must stay fixed after backend confirmation. Do not let subagents switch backend for convenience.
- After sample approval, record how the approved sample was generated and pass that exact method to every slide subagent.
- Slide dispatch and result state must be recorded with the bundled scripts. Chat messages alone do not make a slide dispatched or complete.
- If a required subagent, image backend, or required-image path is unavailable, stop and report a blocker with the slide id and evidence. Do not create a lower-quality replacement.
## Visible Progress
For non-trivial decks, keep a user-visible checklist with one active step. Canonical completion evidence is in `docs/workflow-gates-and-progress.md`.
Default visible steps:
1. Prepare source, outline, style, and backend decisions.
2. Generate and approve one sample slide.
3. Prepare slide jobs and slide state.
4. Dispatch slide subagents.
5. Record generated slide results.
6. QA, repair, notes, and PPT assembly.
Do not mark a step complete from chat alone; use real files or script-recorded state.
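One way to keep the checklist honest is to derive each marker from an evidence check instead of from conversation history. The sketch below is hypothetical: `render_checklist` and the marker strings are invented for illustration, and the real completion evidence for each step is defined in `docs/workflow-gates-and-progress.md`, not here.

```python
from typing import Callable, List, Sequence

DEFAULT_STEPS = [
    "Prepare source, outline, style, and backend decisions.",
    "Generate and approve one sample slide.",
    "Prepare slide jobs and slide state.",
    "Dispatch slide subagents.",
    "Record generated slide results.",
    "QA, repair, notes, and PPT assembly.",
]


def render_checklist(
    steps: Sequence[str], evidence: Sequence[Callable[[], bool]]
) -> List[str]:
    """Render the checklist with at most one active step.

    Each entry in `evidence` is a caller-supplied check (for example, a file
    existence test). The first step whose check fails becomes the active
    step; every later step is shown as not started, even if its own check
    would pass.
    """
    lines = []
    active_found = False
    for index, (label, has_evidence) in enumerate(zip(steps, evidence), start=1):
        if not active_found and has_evidence():
            marker = "[x]"  # complete, backed by evidence
        elif not active_found:
            marker = "[>]"  # the single active step
            active_found = True
        else:
            marker = "[ ]"  # not started
        lines.append(f"{marker} {index}. {label}")
    return lines
```

Because steps are evaluated in order, a later step can never appear complete while an earlier one still lacks evidence.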
## Default Workflow
1. Understand the source content.
    - Identify topic, audience, goal, page count, style/brand constraints, and sections to include or exclude.
    - If no page count is specified, choose a practical count. Typical decks are 8-12 slides.
2. Plan the deck outline.
    - Before writing or updating `outline.md`, read `docs/workflow-gates-and-progress.md` and `docs/outline-style-and-sample.md`.
    - Draft slide roles and required source images. Ask for confirmation, then stop before style, backend, sample, or downstream artifacts until approved.
3. Confirm a unified visual style.
    - Before offering style options or using files from `references/`, read `docs/outline-style-and-sample.md`.
    - Offer 2-3 concrete style directions, recommend one, wait for confirmation, then keep one visual identity while varying layouts by page role.
4. Confirm the image backend.
    - Before generating any slide image, read `docs/backend-selection.md`.
    - Check whether a built-in image tool is callable, state what you checked, name the backend, explain fallback status, and wait for confirmation.
    - If CLI/API fallback is selected, read `docs/cli-api-fallback.md`. Read `docs/image-model-configuration.md` only after config errors or explicit API-setting requests.
5. Generate one sample slide for approval.
    - Before generating or approving the sample slide, read `docs/outline-style-and-sample.md`.
    - Generate exactly one representative sample after outline, style, and backend are confirmed. Do not generate the full deck until approved.
    - After approval, record `sample_generation_method` in `deck_spec.json` so jobs and subagents inherit the same path.
6. Create the project directory.
    - Before initializing folders or assembling files, read `docs/project-assembly-and-reporting.md`.
    - If no destination is specified, use the current working directory or the source file directory.
7. Prepare user-supplied assets.
    - Before using paper figures, charts, screenshots, logos, or other required assets, read `docs/user-supplied-assets.md`.
    - Treat required assets as strict inputs and confirm slide-to-asset mapping before generation.
8. Generate all slide images.
    - Before full-deck image generation, read `docs/slide-generation-and-subagents.md`.
    - Create per-slide jobs with `scripts/prepare_slide_prompts.py` or saved `prompts/slide_XX.json` files.
    - Every final image must come from the selected backend and be recorded with bundled state scripts.
9. Dispatch slide subagents.
    - Before dispatching or replacing slide workers, read `docs/slide-generation-and-subagents.md` and `prompts/slide-worker.md`.
    - Use one subagent per remaining slide job whenever possible. If required subagents cannot be spawned, stop and report a blocker unless the user changes the workflow.
10. Quality check and repair.
    - Before QA or assembly, read `docs/project-assembly-and-reporting.md`.
    - Inspect every slide before assembly: text, outline match, truncation, style, unwanted page numbers, overlaps, and required assets.
    - Regenerate severe failures with a tighter prompt. Use backend editing for localized issues when available.
    - For CLI/API fallback edit commands, read `docs/cli-api-fallback.md`. Replace the final slide only after validating the edited output.
11. Write speaker notes and assemble the PPT.
    - Before writing `speech.md` or running assembly, read `docs/project-assembly-and-reporting.md`.
    - Make sure `outline.md` reflects the final confirmed deck outline. Use `speech.md` headings that map to `Slide N`.
    - Before assembly, ensure `slide_jobs.json` shows generated slides as `recorded` and approved samples as `accepted`. If any slide is `pending`, `dispatched`, or `blocked`, stop.
12. Report the result.
    - Use the final report checklist in `docs/project-assembly-and-reporting.md`.
    - Include paths, slide count, backend used, recorded-result status, and any limitations or blockers.
13. Save reusable styles when requested.
    - If asked to save the current deck style or a supplied image/PDF/PPT/PPTX style, read `docs/style-library.md`.
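The pre-assembly gate in step 11 can be sketched as a status check. This is an illustration under stated assumptions: the real layout of `slide_jobs.json` is not documented in this file, so the example takes a plain mapping from slide id to status, and the function name is hypothetical. It also simplifies by accepting either ready status for any slide, without checking which slide is the approved sample.

```python
from typing import Dict, List

# Statuses named in step 11 as ready for assembly.
READY_STATUSES = {"recorded", "accepted"}


def slides_blocking_assembly(status_by_slide: Dict[str, str]) -> List[str]:
    """Return the sorted slide ids that should stop assembly.

    Any status outside READY_STATUSES blocks assembly. That covers
    `pending`, `dispatched`, and `blocked`, and, as a conservative choice
    made for this sketch, any status the gate does not recognize.
    """
    return sorted(
        slide_id
        for slide_id, status in status_by_slide.items()
        if status not in READY_STATUSES
    )
```

An empty result means the gate passes. A non-empty result names the slides to report before stopping.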
## Subagent Dispatch
Slide subagents are mandatory after sample approval whenever the runtime can spawn them. The main agent prepares jobs and records state; each worker handles exactly one `prompts/slide_XX.json` job and returns only the selected image path, the backend used, and a QA note.
Use `docs/slide-generation-and-subagents.md` for dispatch, commands, result recording, blockers, and backend provenance. Use `prompts/slide-worker.md` as the handoff template.
Subagents must not edit `outline.md`, `deck_spec.json`, other slide jobs, `origin_image/`, `speech.md`, or the final `.pptx`. The parent records outputs and assembles.
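The worker handoff described above can be modeled as a small record that the parent validates before recording anything. Everything here is a hypothetical sketch: the `SlideWorkerResult` type, its field names, and `handoff_problems` are invented for illustration, and the actual handoff format is defined by `prompts/slide-worker.md`. The check that a worker's image sits outside `origin_image/` is an inference from the rule that subagents must not edit that directory.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class SlideWorkerResult:
    """The three things a worker returns, plus the slide it worked on."""

    slide_id: str
    image_path: str
    backend: str
    qa_note: str


def handoff_problems(result: SlideWorkerResult, confirmed_backend: str) -> List[str]:
    """Return reasons the parent should refuse to record this result."""
    problems = []
    # The backend is fixed after confirmation; workers may not switch it.
    if result.backend != confirmed_backend:
        problems.append(
            f"{result.slide_id}: backend {result.backend!r} differs from "
            f"confirmed backend {confirmed_backend!r}"
        )
    # Only the parent writes final images into origin_image/.
    if "origin_image" in PurePosixPath(result.image_path).parts:
        problems.append(f"{result.slide_id}: worker output is inside origin_image/")
    return problems
```

A result with no problems still has to be recorded with the bundled scripts; passing this check alone does not make a slide complete.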
## Acceptance Criteria
- Output is a valid `.pptx`.
- Each expected final slide image exists under `origin_image/slide_XX.png`.
- Every final slide image was generated by the confirmed backend and recorded through `record_slide_result.py`, except for an approved sample that the run state marks as `accepted`.
- `outline.md` reflects the approved deck outline.
- `speech.md` exists when speaker notes are expected, and assembly writes those notes into the PPT.
- `slide_jobs.json` and `slide_run_state.json`
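The image-existence criterion can be checked mechanically. The sketch below assumes that `XX` in `origin_image/slide_XX.png` is a two-digit, zero-padded slide number starting at 1; that numbering is an assumption, and the function name is hypothetical.

```python
from pathlib import Path
from typing import List


def missing_slide_images(project_dir: Path, slide_count: int) -> List[str]:
    """Return relative paths of expected final slide images that do not exist."""
    missing = []
    for number in range(1, slide_count + 1):
        # Assumed naming: slide_01.png, slide_02.png, ...
        relative = f"origin_image/slide_{number:02d}.png"
        if not (project_dir / relative).is_file():
            missing.append(relative)
    return missing
```

This only confirms that files exist. It says nothing about which backend produced them, so it complements the recorded-state criteria and does not replace them.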