slop-detector
Detects AI-generated writing patterns in prose. Use when reviewing docs for slop, vague language, or identity leaks before publishing.
# AI Slop Detection
**Slop is a density problem, not a word problem.**
A single "delve" is fine. Five "delves" near a "tapestry"
and an "embark" is generated text. This skill scores
density per 100 words, marker clustering, and whether
the overall register fits the document type. It does not
ban words; it flags concentrations.
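As a rough illustration of density scoring, the sketch below counts marker hits per 100 words. The `MARKERS` set and the `marker_density` name are hypothetical and chosen only for this example; the skill's real pattern lists live in its modules.

```python
import re

# Hypothetical marker list for illustration only; the real patterns
# are loaded from the skill's modules.
MARKERS = {"delve", "tapestry", "embark"}


def marker_density(text: str, markers: set[str] = MARKERS) -> float:
    """Return marker hits per 100 words (0.0 for empty text)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    # Exact word forms only: "delves" would need its own entry.
    hits = sum(1 for word in words if word in markers)
    return 100.0 * hits / len(words)
```

A single "delve" in a 100-word passage gives a density of 1.0, which is the kind of isolated hit the paragraph above treats as acceptable.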
## Execution Workflow
Identify target files and classify them as technical docs,
narrative prose, or code comments. Classification feeds
context-aware scoring: tier-1 markers in marketing copy
score lower than the same markers in API reference.
### Language Detection
- Auto-detect language from text content using function word frequency
- Override with explicit `--lang` parameter (en, de, fr, es)
- Load language-specific patterns from `data/languages/{lang}.yaml`
- Fall back to English if detection confidence is low
- See `modules/language-handling.md` for cultural calibration and concrete pattern sets
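A minimal sketch of function-word detection with an English fallback might look like the following. The word lists, the `min_confidence` threshold of 0.15, and the `detect_language` name are assumptions made for this example; the skill's actual patterns come from `data/languages/{lang}.yaml`.

```python
import re
from collections import Counter

# Tiny illustrative function-word lists (assumed, not the skill's data).
FUNCTION_WORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "les", "et", "est", "des", "un", "une"},
    "es": {"el", "la", "los", "y", "es", "de", "que", "un"},
}


def detect_language(text, min_confidence=0.15, default="en"):
    """Pick the language whose function words cover the largest share of
    the text; fall back to `default` when that share is below the threshold."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return default
    counts = Counter(words)
    scores = {
        lang: sum(counts[w] for w in vocab) / len(words)
        for lang, vocab in FUNCTION_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_confidence else default
```

An explicit `--lang` value would simply bypass this function.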
### Spelling Normalization (British to American)
Convert British spellings to American by default, in any scanned
document, unless the document opts out via `.slop-config.yaml`
(`spelling: british`) or a per-word allowlist. This is a consistency
pass, separate from slop scoring. Use the tested `scribe.spelling`
functions (`find_british_spellings`, `to_american`); both preserve
case and skip code, inline code, and URLs.
Load: `@modules/spelling-normalization.md`
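The signatures of `find_british_spellings` and `to_american` are not shown here, so the sketch below is a standalone illustration of the same idea, not the `scribe.spelling` implementation. The mapping and the `to_american_sketch` name are hypothetical, and fenced code blocks, the config opt-out, and the allowlist are left out for brevity.

```python
import re

# Small illustrative mapping; the skill's real word list is not shown.
BRITISH_TO_AMERICAN = {
    "colour": "color",
    "organise": "organize",
    "behaviour": "behavior",
}

# Spans to leave untouched: inline code and bare URLs.
PROTECTED = re.compile(r"(`[^`]*`|https?://\S+)")
WORD = re.compile(r"[A-Za-z]+")


def _match_case(source, replacement):
    """Carry the capitalization of the source word over to the replacement."""
    if source.isupper():
        return replacement.upper()
    if source[0].isupper():
        return replacement.capitalize()
    return replacement


def _convert_word(match):
    word = match.group(0)
    american = BRITISH_TO_AMERICAN.get(word.lower())
    return _match_case(word, american) if american else word


def to_american_sketch(text):
    # With one capturing group, re.split puts protected spans at odd indices.
    parts = PROTECTED.split(text)
    return "".join(
        part if index % 2 else WORD.sub(_convert_word, part)
        for index, part in enumerate(parts)
    )
```

Splitting on the protected spans first keeps the replacement logic simple: the word-level substitution never sees code or URLs at all.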
### Vocabulary and Phrase Detection
Load: `@modules/vocabulary-patterns.md`
Markers fall into three confidence tiers. Tier 1 words
("delve", "multifaceted", "leverage") appear far more often
in AI text than human text. Tier 2 covers context-dependent
transitions ("moreover", "subsequently"). Tier 3 covers
vapid phrases ("In today's fast-paced world", "cannot be
overstated").
### Tier 1: High-Confidence Markers
| Word | Context | Human Alternative |
|------|---------|-------------------|
| delve | "delve into" | explore, examine, look at |
| tapestry | "rich tapestry" | mix, combination, variety |
| realm | "in the realm of" | in, within, regarding |
| embark | "embark on a journey" | start, begin |
| beacon | "a beacon of" | example, model |
| spearheaded | formal attribution | led, started |
| multifaceted | describing complexity | complex, varied |
| comprehensive | describing scope | thorough, complete |
| pivotal | importance marker | key, important |
| nuanced | sophistication signal | subtle, detailed |
| meticulous/meticulously | care marker | careful, detailed |
| intricate | complexity marker | detailed, complex |
| showcasing | display verb | showing, displaying |
| leveraging | business jargon | using |
| streamline | optimization verb | simplify, improve |
### Tier 2: Medium-Confidence Markers (Score: 2 each)
Common but context-dependent:
| Category | Words |
|----------|-------|
| Transition overuse | moreover, furthermore, indeed, notably, subsequently |
| Intensity clustering | significantly, substantially, fundamentally, profoundly |
| Hedging stacks | potentially, typically, often, might, perhaps |
| Action inflation | revolutionize, transform, unlock, unleash, elevate |
| Empty emphasis | crucial, vital, essential, paramount |
### Tier 3: Phrase Patterns (Score: 2-4 each)
| Phrase | Score | Issue |
|--------|-------|-------|
| "In today's fast-paced world" | 4 | Vapid opener |
| "It's worth noting that" | 3 | Filler |
| "At its core" | 2 | Positional crutch |
| "Cannot be overstated" | 3 | Empty emphasis |
| "A testament to" | 3 | Attribution cliche |
| "Navigate the complexities" | 4 | Business speak |
| "Unlock the potential" | 4 | Marketing speak |
| "Treasure trove of" | 3 | Overused metaphor |
| "Game changer" | 3 | Buzzword |
| "Look no further" | 4 | Sales pitch |
| "Nestled in the heart of" | 4 | Travel writing cliche |
| "Embark on a journey" | 4 | Melodrama |
| "Ever-evolving landscape" | 4 | Tech cliche |
| "Hustle and bustle" | 3 | Filler |
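One way the tier weights could combine into a single density figure is sketched below. The tier 2 score (2) and the tier 3 phrase scores come from the tables above; the tier 1 score of 3 is an assumption, since this section does not state it. The word and phrase sets are abbreviated, and `slop_score` is a hypothetical name.

```python
import re

TIER1_SCORE = 3  # assumption: the tier 1 score is not stated in this section
TIER2_SCORE = 2  # from the tier 2 heading

# Abbreviated lists taken from the tables above.
TIER1 = {"delve", "tapestry", "multifaceted", "pivotal"}
TIER2 = {"moreover", "furthermore", "crucial", "vital"}
TIER3 = {
    "in today's fast-paced world": 4,
    "at its core": 2,
    "game changer": 3,
}


def slop_score(text):
    """Return the weighted marker score per 100 words."""
    # Normalize curly apostrophes so phrase matching is consistent.
    lowered = text.lower().replace("\u2019", "'")
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    raw = sum(TIER1_SCORE for w in words if w in TIER1)
    raw += sum(TIER2_SCORE for w in words if w in TIER2)
    raw += sum(score * lowered.count(phrase) for phrase, score in TIER3.items())
    return 100.0 * raw / len(words)
```

Normalizing by word count is what keeps a single marker in a long document from dominating the result.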
## Step 3: Structural Pattern Detection
Load: `@modules/structural-patterns.md`
### Em Dash Overuse
Em dash overuse is the single most-cited 2026 AI tell across
Wikipedia, the Field Guide, and the Algorithmic Bridge. Detection
runs in two modes:
**Audit mode** (forensic, applied to unknown prose):
- **0-1 per 1000 words**: Normal human range
- **2-4 per 1000 words**: Elevated, review usage
- **5+ per 1000 words**: Strong AI signal
**Prevention mode** (applied to docs the agent just generated):
- **Target zero**. Every em-dash is a finding.
- Replace with commas (asides), parentheses (tangents), colons
(definitions), or periods (separate thoughts). See
`modules/structural-patterns.md` § Em Dash Analysis for the
full replacement table.
```bash
# Count em dashes in file
grep -o '—' file.md | wc -l
```
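The raw count from `grep` still has to be normalized per 1000 words before it can be compared with the audit bands. The sketch below does that and also lists offending lines for prevention mode. Function names are hypothetical, and rates that fall between the stated bands (for example 1.5) are assigned to the lower band as an assumption.

```python
import re

EM_DASH = "\u2014"


def em_dash_band(text):
    """Return (rate per 1000 words, audit band) for a piece of prose."""
    words = len(re.findall(r"\w+", text))
    if words == 0:
        return 0.0, "normal"
    rate = 1000.0 * text.count(EM_DASH) / words
    if rate >= 5:
        band = "strong"
    elif rate >= 2:
        band = "elevated"
    else:
        band = "normal"
    return rate, band


def em_dash_lines(text):
    """Prevention mode: every line containing an em dash is a finding."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if EM_DASH in line
    ]
```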
### Tricolon Detection
AI loves groups of three with alliteration:
- "fast, efficient, and reliable"
- "clear, concise, and compelling"
- "robust, reliable, and resilient"
Pattern: `adjective, adjective, and adjective` with similar sounds.
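A regular expression can approximate this pattern, with the caveat that it cannot tell adjectives from other words without a part-of-speech tagger. The sketch below matches any three-word series and uses shared initial letters as a stand-in for "similar sounds"; both simplifications, and the `find_tricolons` name, are assumptions for this example.

```python
import re

# word, word, and word (the comma before "and" is optional)
TRICOLON = re.compile(r"\b([A-Za-z]+), ([A-Za-z]+),? and ([A-Za-z]+)\b")


def find_tricolons(text):
    """Return (matched text, alliterative) pairs for each three-item series."""
    results = []
    for match in TRICOLON.finditer(text):
        initials = {word[0].lower() for word in match.groups()}
        # Alliterative when at least two of the three words share an initial.
        results.append((match.group(0), len(initials) < 3))
    return results
```

Matches would still need a human look, since a plain list of three nouns trips the same regex.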
### List-to-Prose Ratio
Count bullet points vs paragraph sentences:
- **>60% bullets**: AI tendency
- **Emoji-led bullets**: Strong AI signal in technical docs
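The bullet share can be estimated as bullets divided by bullets plus prose sentences. The sketch below is one possible reading of that ratio; the sentence splitter is naive, headings are skipped, and emoji-led bullets are not handled. `bullet_ratio` is a hypothetical name.

```python
import re

# Markdown bullets ("-", "*", "+") and numbered items ("1.").
BULLET = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+")


def bullet_ratio(markdown):
    """Return bullets / (bullets + prose sentences), or 0.0 for empty input."""
    bullets = 0
    prose_lines = []
    for line in markdown.splitlines():
        if BULLET.match(line):
            bullets += 1
        elif line.strip() and not line.lstrip().startswith("#"):
            prose_lines.append(line.strip())
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", " ".join(prose_lines)) if s]
    total = bullets + len(sentences)
    return bullets / total if total else 0.0
```

A result above 0.6 would fall in the range the list above flags as an AI tendency.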
### Sentence Length Uniformity
Measure standard deviation of sentence lengths:
- **Low variance** (SD < 5 words): AI monotony
- **High variance** (SD > 10 words): Human variation
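The measurement can be sketched with the standard library. The section does not say whether sample or population standard deviation is meant, so the sample form is assumed here; the "moderate" label for the 5 to 10 range and both function names are also assumptions.

```python
import re
import statistics


def sentence_length_sd(text):
    """Sample standard deviation of sentence lengths in words.

    Returns None when there are fewer than two sentences.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return None
    return statistics.stdev(lengths)


def classify_variance(sd):
    """Map a standard deviation onto the bands listed above."""
    if sd is None:
        return "insufficient"
    if sd < 5:
        return "low"
    if sd > 10:
        return "high"
    return "moderate"
```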
### Paragraph Symmetry
AI produces "blocky" text with uniform paragraph lengths.
Check whether paragraphs cluster around the same word count.
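One way to quantify clustering is the coefficient of variation of paragraph word counts: values near zero mean the paragraphs are nearly the same length. This metric and the `paragraph_cv` name are assumptions for this example; the section does not name a specific measure or threshold.

```python
import statistics


def paragraph_cv(text):
    """Coefficient of variation (population SD / mean) of paragraph word counts.

    Paragraphs are separated by blank lines. Returns None when there are
    fewer than two paragraphs.
    """
    counts = [len(p.split()) for p in text.split("\n\n") if p.strip()]
    if len(counts) < 2:
        return None
    return statistics.pstdev(counts) / statistics.mean(counts)
```

Dividing by the mean makes the figure comparable across documents with short and long paragraphs.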
## Step 4: Identity & Voice Leak Sweep (P0)
Load: `@modules/identity-and-voice-leaks.md`
**Some patterns are not slop: they are direct evidence
that AI-generated text leaked into a published artifact.**
A single match in this class fails review independently
of any other score.
Scan for:
1. **Identity leaks** ("As a large language model",
"as of my training cutoff", "I cannot provide").
Severity: critical, no exceptions.
2. **Conversational voice leaks** ("Hope this helps!",
"Great question!", "Sure!") outside transcript blocks.
3. **Self-narration of structure** ("In this section, we
will cover...", "Let's dive into...", "By the end of
this guide...").
4. **Hedging seesaw** ("While X has its merits, it's not
without its challenges").
5. **Contrastive constructions**, opening a clause or trailing
it: both contrastive negation ("not just X, but Y", "It's not
X, it's Y", and the trailing "It's X, not Y" / "Y, not X")
and affirmative antithesis ("Less X, more Y", "Where others
X, we Y"). Avoid in all but the most necessary cases; keep
only when the contrast carries information that would be lost
if the construction were removed. The trailing copula form ("It's a tool, not a toy")
is the easiest to miss because the opener reads as a plain
definition.
See the module for the full pattern catalogue and false-
positive guidance.
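The first three classes above are fixed phrases, so a plain substring scan is enough to sketch the sweep. The phrase lists below are abbreviated from the examples in this section; the class keys and function names are hypothetical. The transcript-block exception and the hedging and contrastive classes are not handled here.

```python
# Abbreviated phrase lists from the examples above, stored lowercase.
LEAK_PHRASES = {
    "identity": [
        "as a large language model",
        "as of my training cutoff",
        "i cannot provide",
    ],
    "conversational": ["hope this helps!", "great question!"],
    "self_narration": [
        "in this section, we will cover",
        "let's dive into",
        "by the end of this guide",
    ],
}


def find_leaks(text):
    """Return (line number, class, phrase) for every leak phrase found."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Normalize curly apostrophes before matching.
        lowered = line.lower().replace("\u2019", "'")
        for leak_class, phrases in LEAK_PHRASES.items():
            for phrase in phrases:
                if phrase in lowered:
                    findings.append((number, leak_class, phrase))
    return findings


def passes_p0(text):
    """A single match fails review, independent of any other score."""
    return not find_leaks(text)
```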
## Step 4.5: Sycophantic Pattern Detection
Especially relevant for conversational or instructional content
(complements Class 2 of the identity-and-voice-leaks module):
| Phrase | Issue |
|--------|-------|
| "I'd be happy to" | Servile opener |
| "Great question!" | Empty validation |
| "Absolutely!" | Over-agreement |
| "That's a wonderful point" | Flattery |
| "I'm glad you asked" | Filler |
| "You're absolutely right" | Sycophancy |
These phrases add no information and signal generated content.
## Step 4.6: Tier 5 / 2026 Patterns (Prevention-Strict)
The 2026 cross-source consensus (Wikipedia *Signs of AI
writing*, Algorithmic Bridge *10 Signs*, Ignorance.ai *Field
Guide*, Stop-Slop Claude skill, George Kao, ContentBeta,
OliviaCal) identifies a handful of shapes that dominate
post-GPT-5 / post-Claude-4.5 prose. Each is detailed in
`@modules/vocabulary-patterns.md` (lexical form) and
`@modules/structural-patterns.md` (structural form).
| Pattern | Form | Why it matters |
|---------|------|----------------|
| Em-dash overuse | — used as rhetorical pause | Most-cited single tell of 2026 |
| Plus-sign for "and" | "hooks + skills" in prose | Strong: humans have "and" |
| Spatial copula | "lives in", "sits … | … |