voice-extractor
Extract and document someone's authentic writing voice from samples. Use when someone needs a "voice guide," wants to capture their writing DNA, or needs to train AI to write in their style. Also useful for ghostwriting, brand voice documentation, or onboarding writers.
# Voice Extractor
AI-generated content all sounds the same. The fix isn't better prompts — it's teaching the AI how you actually communicate.
This skill extracts your communication DNA from writing samples and produces a Voice Guide: documented, tested, and ready to use.
---
## Mode
Detect from context or ask: *"Quick voice snapshot, full Voice Guide, or full guide with examples?"*
| Mode | What you get | Best for |
|------|-------------|----------|
| `quick` | Top 5 voice characteristics + 3 do/don't rules | Fast style reference, single piece |
| `standard` | Full Voice Guide: tone, vocabulary, rhythm, structure | AI training, ghostwriting, brand documentation |
| `deep` | Full Voice Guide + 10 sample rewrites + writing rules checklist + AI training examples | Onboarding writers, building a brand voice system |
**Default: `standard`** — use `quick` if they just need a fast reference. Use `deep` if they're onboarding a ghostwriter or building a content team.
---
## Context Loading Gates
**Before extracting, collect:**
- [ ] **Writing samples** — minimum 3 samples OR 500 total words (see priority list below)
- [ ] **Purpose of voice guide** — AI training? Ghostwriter onboarding? Team alignment?
- [ ] **Confidence zones** — Any topics where they want to sound more/less authoritative?
- [ ] **Known anti-patterns** — Any words or phrases they already know they want to avoid?
**Sample priority (most → least authentic):**
1. Casual Slack or email (raw, unedited voice)
2. Podcast or call transcript
3. LinkedIn posts or articles
4. Website copy (often edited, less authentic)
**Minimum sample gate:** If samples total under 500 words, stop:
> "These samples are too short to extract reliable patterns. Please add 2-3 more — emails, Slack messages, or transcripts work best. The messier and more casual, the better."
Do not attempt a full extraction (`standard` or `deep`) from under 500 words. If they cannot provide more samples, offer `quick` mode instead.
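The gate above can be sketched as a simple word count. This is a minimal illustration, not part of the skill itself: `choose_mode` is a hypothetical helper, the 500-word and 300-word thresholds come from this guide, and treating anything under 300 words as "ask for more samples" is an assumption, since the guide does not say what to do there.

```python
def choose_mode(samples, requested="standard"):
    """Apply the word-count gate before extraction.

    500+ words: the requested mode is allowed.
    300-499 words: fall back to quick mode.
    Under 300 words: ask for more samples (assumption, not stated in the guide).
    """
    total_words = sum(len(sample.split()) for sample in samples)
    if total_words >= 500:
        return requested, total_words
    if total_words >= 300:
        return "quick", total_words
    return "need_more_samples", total_words


# Two samples of 200 and 150 words: too thin for a full guide.
samples = ["word " * 200, "word " * 150]
print(choose_mode(samples))  # ('quick', 350)
```

Counting whitespace-separated tokens is a rough measure, but it is enough to decide whether to proceed or ask for more material.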
---
## Phase 1: Sample Quality Assessment
Before extracting, reason through:
1. **Sample authenticity:** Are these samples from edited/polished contexts (website, press) or raw contexts (Slack, email)? More polish = less authentic voice.
2. **Sample variety:** Do the samples cover different contexts (professional, casual, educational)? Single-context samples produce single-dimension voice guides.
3. **Exclusion check:** Identify and flag patterns that are NOT the authentic voice:
- Platform formatting tics (LinkedIn line breaks, Twitter brevity forcing)
- Typos and autocorrect errors
- Phrases borrowed from others (quotes, retweets)
- Unusually formal writing (legal docs, press releases)
4. **Sample size adequacy:** Is there enough material for a full guide (`standard` or `deep`), or should I fall back to `quick` mode?
Output a sample assessment:
> "I have [X samples / Y words] to work with. Quality: [high/medium — why]. I'll use [full/quick] mode. Excluding: [any patterns and why]."
---
## Phase 2: Core Energy Extraction
Identify the fundamental communication mode:
**Role:**
- Teacher (breaks things down systematically)
- Challenger (pushes back on assumptions)
- Cheerleader (builds confidence and momentum)
- Straight-shooter (cuts through BS efficiently)
**Default energy:**
- Calm authority ("Here's what works.")
- High enthusiasm ("This is exciting — let me show you.")
- Understated confidence ("I've seen this a hundred times.")
**Recurring themes:** What topics appear unprompted across samples? These are the things they actually care about.
---
## Phase 3: Phrase Extraction (Systematic)
Scan all samples and extract:
**Transition phrases** (how they shift topics):
- Quote exact examples from samples
- Pattern: "Here's the thing...", "What I've learned...", "Let me put it differently..."
**Emphasis phrases** (how they land a point):
- Quote exact examples
- Pattern: "The reality is...", "This is the part people miss...", "Here's the actual problem..."
**Closers** (how they wrap up):
- Quote exact examples
- Pattern: "That's the move.", "Start there.", "You've got this."
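One way to surface candidate signature phrases before quoting them is to look for word sequences that recur across samples. The sketch below is only an aid to the manual scan described above: `recurring_phrases` is a hypothetical helper, and the three-word window and two-sample threshold are arbitrary assumptions.

```python
import re
from collections import Counter


def recurring_phrases(samples, n=3, min_samples=2):
    """Return n-word phrases that appear in at least `min_samples` samples."""
    seen_in = Counter()
    for text in samples:
        # Lowercase and keep letters and straight apostrophes only;
        # curly apostrophes are not handled in this sketch.
        words = re.findall(r"[a-z']+", text.lower())
        ngrams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        seen_in.update(ngrams)  # a set, so each phrase counts once per sample
    return sorted(p for p, count in seen_in.items() if count >= min_samples)


samples = [
    "Here's the thing: ship it.",
    "Look, here's the thing. Start there.",
    "The reality is simple.",
]
print(recurring_phrases(samples))  # ["here's the thing"]
```

The output is a list of candidates, not a finished phrase list. Each candidate still needs to be checked in context and classified as a transition, emphasis phrase, or closer.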
---
## Phase 4: Confidence Zone Mapping
| Zone | Description | Language Markers |
|---|---|---|
| Full authority | Topics they're an expert in | No hedging, definitive statements, "here's what works" |
| Earned perspective | Topics with experience but not mastery | "In my experience...", "What I've found..." |
| Active exploration | Topics they're learning now | "I'm testing this...", "What I'm seeing..." |
Map their stated expertise areas to each zone. This calibration is what makes the voice feel real rather than one-dimensional.
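As a rough first pass, the language markers in the table can be used to tag sample sentences by zone. This sketch assumes that the example markers from the table are representative and that a sentence with none of them counts as full authority. Both are simplifications, and `guess_zone` and `ZONE_MARKERS` are hypothetical names.

```python
# Markers copied from the table above; extend with phrases found in the samples.
ZONE_MARKERS = {
    "earned perspective": ["in my experience", "what i've found"],
    "active exploration": ["i'm testing this", "what i'm seeing"],
}


def guess_zone(sentence):
    """Tag a sentence with a confidence zone based on hedge markers."""
    lowered = sentence.lower()
    for zone, markers in ZONE_MARKERS.items():
        if any(marker in lowered for marker in markers):
            return zone
    # Assumption: no listed hedge marker means a definitive statement.
    return "full authority"


print(guess_zone("In my experience, shorter emails win."))  # earned perspective
print(guess_zone("Here's what works."))                     # full authority
```

Grouping tagged sentences by topic then shows which topics the writer hedges on and which they state flatly.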
---
## Phase 5: Anti-Pattern Documentation
Extract what they'd NEVER say:
- Words that would feel wrong in their voice
- Phrases that make them cringe
- Tones they naturally avoid
- Industry jargon they hate
Source these from sample evidence where possible: "You never used [word] across [X samples] — it doesn't fit your voice."
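The evidence check can be sketched as a search for each candidate word or phrase across all samples. `unused_candidates` is a hypothetical helper, and the candidate list is something you or the user supply; it is not derived automatically.

```python
import re


def unused_candidates(candidates, samples):
    """Return candidate words/phrases that appear in none of the samples."""
    lowered = [sample.lower() for sample in samples]
    unused = []
    for phrase in candidates:
        # Whole-word, case-insensitive match so "leverage" does not match "leveraged".
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        if not any(re.search(pattern, text) for text in lowered):
            unused.append(phrase)
    return unused


samples = ["We leverage nothing. Just ship.", "Start there."]
print(unused_candidates(["synergy", "leverage", "circle back"], samples))
# ['synergy', 'circle back']
```

Absence from a small sample set is weak evidence on its own, so treat the result as a shortlist to confirm with the user.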
---
## Phase 6: Validation Test (REQUIRED)
After extracting the full profile, generate 2 test sentences on the same topic:
**Version A** (using the extracted voice profile):
> "[Sample sentence in their voice]"
**Version B** (wrong voice — contrasting example):
> "[Same content, different voice — shows what to avoid]"
Ask the user: "Does Version A actually sound like you when you're not overthinking it? What feels off?"
This validation catches extraction errors before the guide is put into production.
---
## Quick Mode (`--quick`)
When samples are thin (300–500 words) or time is short:
1. Read 3 samples fast
2. Pull 10 signature phrases
3. Note 3 things they'd never say
4. Write 1 sentence describing their energy
**Output:** Minimum viable voice guide.
**Difference from full mode:**
- Quick: ~10 phrases, 3 anti-patterns, 1-sentence energy descriptor
- Full: Complete profile with confidence calibration, validated test sentences, and source-cited examples
---
## Phase 7: Self-Critique Pass (REQUIRED)
After generating the Voice Guide:
- [ ] Are the extracted phrases actually from the samples, or am I inferring them?
- [ ] Does the anti-pattern list include specific words/phrases, or just vague categories?
- [ ] Do the validation test sentences demonstrate a real difference between in-voice and out-of-voice?
- [ ] Is the confidence zone mapping specific to named topics, or just generic?
- [ ] Would a ghostwriter be able to use this guide without asking follow-up questions?
Flag any issues: "The anti-pattern section only has 2 entries — not enough for a usable guide. I need more samples or direct input from the user."
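The first checklist item, whether phrases are quoted or inferred, can be checked mechanically. The sketch below assumes the extracted phrases and the raw samples are available as plain strings. `unsourced_phrases` is a hypothetical helper, and matching is case-insensitive with whitespace collapsed.

```python
def unsourced_phrases(phrases, samples):
    """Return extracted phrases that do not appear verbatim in any sample."""

    def normalize(text):
        # Lowercase and collapse runs of whitespace.
        return " ".join(text.lower().split())

    normalized_samples = [normalize(sample) for sample in samples]
    return [
        phrase
        for phrase in phrases
        if not any(normalize(phrase) in sample for sample in normalized_samples)
    ]


samples = ["Here's the thing:  start small.", "Start there."]
print(unsourced_phrases(["Here's the thing", "That's the move."], samples))
# ["That's the move."]
```

Any phrase this returns was inferred rather than quoted, and should be removed or confirmed with the user. Pass phrases without a trailing "..." so they can match the sample text.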
---
## Output Structure
```markdown
## Voice Guide: [Name] — [Date]
### Sample Assessment
- Samples: [count, types]
- Total words: [count]
- Quality: [high/medium — reason]
- Mode: [quick/full]
- Excluded: [patterns excluded + why]
---
### Core Energy
- Role: [teacher/challenger/cheerleader/straight-shooter]
- Default energy: [description]
- Recurring themes: [list]
### Signature Phrases
**Transitions:**
- "[Phrase]" (source: [email/post])
- "[Phrase]"
**Emphasis:**
- "[Phrase]" (source: [email/post])
**Closers:**
- "[Phrase]"
### Confidence Calibration
**Full authority (no hedging):**
Topics: [list]
Sounds like: "[example sentence]"
**Earned perspective:**
Topics: [list]
Sounds like: "[example sentence]"
**Active exploration:**
Topics: [list]
Sounds like: "[example sentence]"
### Anti-Patterns (Never Use)
- [Word/phrase] — why: [evidence from samples]
- [Word/phrase] — why: [evidence]
### Validation Test
**This sounds like you:**
"[Version A]"
**This doesn't:**
"[Version B — contrast]"
### Self-Critique Notes
[Any gaps, things to validate with user]
### Usage Instructions
- For AI: Paste this guide into your system prompt
- For ghostwriter: Share on day 1 to cut down on revision cycles
- For team: This is the benchmark for "on brand"
```
---
*Skill by Brian Wagner | AI Marketing Architect | brianrwagner.com*