skill-creator

Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, update or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy. Make sure to use this skill whenever the user mentions creating, building, designing, or improving skills, even if they don't explicitly say "skill-creator".

What this skill does


# Skill Creator

A skill for creating new skills and iteratively improving them.

At a high level, the process of creating a skill goes like this:

- Decide what you want the skill to do and roughly how it should do it
- Write a draft of the skill
- Create a few test prompts and run claude-with-access-to-the-skill on them
- Help the user evaluate the results both qualitatively and quantitatively
  - While the runs happen in the background, draft some quantitative evals if there aren't any. Then explain them to the user
  - Use the `eval-viewer/generate_review.py` script to show the user the results, and also let them look at the quantitative metrics
- Rewrite the skill based on feedback from the user's evaluation
- Repeat until satisfied
- Expand the test set and try again at larger scale

Your job is to figure out where the user is in this process and help them progress. Maybe they want to make a skill from scratch, or maybe they already have a draft and want to iterate.

Be flexible -- if the user says "I don't need to run a bunch of evaluations, just vibe with me", do that instead.

After the skill is done (order is flexible), run the skill description improver to optimize triggering.

## Communicating with the user

Pay attention to context cues to judge how technical your wording should be. In the default case:

- "evaluation" and "benchmark" are borderline, but OK
- "JSON" and "assertion" need a brief explanation unless the user has already shown they know what those terms mean

It's OK to briefly explain terms if you're in doubt.

---

## Creating a skill

### Capture Intent

Start by understanding the user's intent. The current conversation might already contain a workflow the user wants to capture (e.g., they say "turn this into a skill"). If so, extract answers from the conversation history first -- the tools used, the sequence of steps, corrections the user made, input/output formats observed. The user may need to fill gaps, and should confirm before proceeding.

1. What should this skill enable Claude to do?
2. When should this skill trigger? (what user phrases/contexts)
3. What's the expected output format?
4. Should we set up test cases to verify the skill works? Skills with objectively verifiable outputs (file transforms, data extraction, code generation) benefit from test cases. Skills with subjective outputs (writing style, art) often don't. Suggest the appropriate default based on skill type, but let the user decide.

### Interview and Research

Proactively ask questions about edge cases, input/output formats, example files, success criteria, and dependencies. Wait to write test prompts until you've got this part ironed out.

Check available MCPs -- if any are useful for research, run that research in parallel via subagents where available.

### Initialize the Skill

When creating a new skill from scratch, run the `init_skill.py` script to scaffold the directory structure:

```bash
uv run scripts/init_skill.py <skill-name> --path <output-directory>
```

The script:
- Creates the skill directory at the specified path
- Generates a SKILL.md template with proper frontmatter and TODO placeholders
- Creates example resource directories: `scripts/`, `references/`, and `assets/`
- Automatically registers the skill in CLAUDE.md's Available Skills table

Skip this step if the skill already exists and you're iterating or packaging.
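
To make the scaffold concrete, here is a minimal sketch of the directory layout described above. It is not the actual `init_skill.py`: the function name `scaffold_skill`, the template text, and the example skill name `pdf-forms` are all assumptions for illustration, and the CLAUDE.md registration step is left out.

```python
import tempfile
from pathlib import Path

# Hypothetical template; the real init_skill.py output may differ.
TEMPLATE = """---
name: {name}
description: TODO - what the skill does and when to trigger it
---

# {title}

TODO - instructions
"""

RESOURCE_DIRS = ("scripts", "references", "assets")


def scaffold_skill(name: str, output_dir: Path) -> Path:
    """Create <output_dir>/<name>/ with a SKILL.md stub and resource dirs."""
    skill_dir = output_dir / name
    # Refuse to overwrite a skill that already exists.
    skill_dir.mkdir(parents=True, exist_ok=False)
    title = name.replace("-", " ").title()
    (skill_dir / "SKILL.md").write_text(
        TEMPLATE.format(name=name, title=title), encoding="utf-8"
    )
    for sub in RESOURCE_DIRS:
        (skill_dir / sub).mkdir()
    return skill_dir


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill("pdf-forms", Path(tmp))
    layout = sorted(p.name for p in created.iterdir())
    print(layout)
```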

#### Manual Registration

If you are not using `init_skill.py`, register the skill in CLAUDE.md manually:
1. Find the "Available Skills" table in the "Skill Locations" section
2. Add a new row: `` | {skill-name} | `skills/{skill-name}/SKILL.md` | ``
3. Keep the table alphabetically sorted
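
The manual steps above can be sketched as a small helper that adds the row and re-sorts the table. The function name `add_skill_row`, the column headers, and the existing rows are hypothetical; only the row format comes from step 2.

```python
def add_skill_row(table_lines: list[str], skill_name: str) -> list[str]:
    """Return the table with a row for skill_name, data rows sorted by name."""
    header, separator, *rows = table_lines
    new_row = f"| {skill_name} | `skills/{skill_name}/SKILL.md` |"
    if new_row not in rows:
        rows.append(new_row)
    # Sort on the first cell so the table stays alphabetical by skill name.
    rows.sort(key=lambda row: row.split("|")[1].strip().lower())
    return [header, separator, *rows]


# Hypothetical "Available Skills" table; real headers may differ.
table = [
    "| Skill | Location |",
    "|-------|----------|",
    "| pdf-forms | `skills/pdf-forms/SKILL.md` |",
    "| xlsx-report | `skills/xlsx-report/SKILL.md` |",
]
updated = add_skill_row(table, "skill-creator")
print("\n".join(updated))
```

Calling the helper twice with the same name leaves the table unchanged, which makes it safe to re-run.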

### Write the SKILL.md

Based on the user interview, fill in these components:

- **name**: Skill identifier (kebab-case)
- **description**: When to trigger, what it does. This is the primary triggering mechanism -- include both what the skill does AND specific contexts for when to use it. All "when to use" info goes here, not in the body. Note: Claude tends to "undertrigger" skills. To combat this, make descriptions a little bit "pushy" -- e.g., "Make sure to use this skill whenever the user mentions dashboards, data visualization, internal metrics, or wants to display any kind of company data, even if they don't explicitly ask for a 'dashboard.'"
- **the rest of the skill :)**
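
As a rough illustration of the two required fields, the sketch below renders a frontmatter block and rejects names that are not kebab-case. The helper `render_frontmatter`, the regular expression, and the example skill are assumptions, not part of the skill-creator scripts.

```python
import json
import re

# Lowercase letters and digits, in groups separated by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def render_frontmatter(name: str, description: str) -> str:
    """Return a YAML frontmatter block with the two required fields."""
    if not KEBAB_CASE.fullmatch(name):
        raise ValueError(f"skill name must be kebab-case: {name!r}")
    if not description.strip():
        raise ValueError("description is required; it drives triggering")
    # Collapse whitespace so the description fits on one line, then quote it.
    # A JSON string should also parse as a YAML double-quoted scalar.
    quoted = json.dumps(" ".join(description.split()))
    return f"---\nname: {name}\ndescription: {quoted}\n---\n"


# Hypothetical skill used only for illustration.
frontmatter = render_frontmatter(
    "sales-dashboard",
    "Build internal metrics dashboards. Use whenever the user mentions "
    "dashboards or data visualization.",
)
print(frontmatter)
```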

### Skill Writing Guide

#### Anatomy of a Skill

```
skill-name/
+-- SKILL.md (required)
|   +-- YAML frontmatter (name, description required)
|   +-- Markdown instructions
+-- Bundled Resources (optional)
    +-- scripts/    - Executable code for deterministic/repetitive tasks
    +-- references/ - Docs loaded into context as needed
    +-- assets/     - Files used in output (templates, icons, fonts)
```

**scripts/**: Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten. Scripts may be executed without loading into context, but can still be read by Claude for patching.

**references/**: Documentation intended to be loaded into context as needed. Put content in references/ when it is detailed (schemas, API docs, policies). If files are large (>300 lines), include a table of contents. For very large files (>10k words), include grep search patterns in SKILL.md.

**assets/**: Files NOT intended to be loaded into context, but used within output Claude produces (templates, images, fonts, boilerplate).

**Avoid duplication**: Information should live in either SKILL.md or references, not both. Keep SKILL.md lean.

#### Progressive Disclosure

Skills use a three-level loading system:
1. **Metadata** (name + description) - Always in context (~100 words)
2. **SKILL.md body** - In context whenever skill triggers (<500 lines ideal)
3. **Bundled resources** - As needed (unlimited, scripts can execute without loading)

**Key patterns:**
- Keep SKILL.md under 500 lines; split content when approaching this limit
- Reference files clearly from SKILL.md with guidance on when to read them
- For large reference files (>300 lines), include a table of contents
- Avoid deeply nested references -- keep one level deep from SKILL.md
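
The size guidance above lends itself to a quick automated check. The sketch below works on file contents passed in as strings; the function names and the table-of-contents heuristic (a heading containing "contents" within the first 30 lines) are assumptions, and the sample files are made up.

```python
SKILL_MD_LIMIT = 500  # lines, from the guidance above
TOC_THRESHOLD = 300   # lines, from the guidance above


def has_toc(text: str) -> bool:
    """Heuristic (an assumption): a 'contents' heading near the top."""
    head = text.splitlines()[:30]
    return any(
        line.lstrip().startswith("#") and "contents" in line.lower()
        for line in head
    )


def size_warnings(skill_md: str, references: dict[str, str]) -> list[str]:
    """Return human-readable warnings for oversized or TOC-less files."""
    warnings = []
    n = len(skill_md.splitlines())
    if n >= SKILL_MD_LIMIT:
        warnings.append(
            f"SKILL.md has {n} lines; consider moving detail into references/"
        )
    for name, text in sorted(references.items()):
        n = len(text.splitlines())
        if n > TOC_THRESHOLD and not has_toc(text):
            warnings.append(f"{name} has {n} lines but no table of contents")
    return warnings


# Made-up contents standing in for real files.
skill_md = "\n".join(f"line {i}" for i in range(520))
references = {
    "references/aws.md": "# AWS\n\n## Table of contents\n" + "detail\n" * 400,
    "references/gcp.md": "# GCP\n" + "detail\n" * 400,
}
found = size_warnings(skill_md, references)
for warning in found:
    print(warning)
```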

**Domain organization**: When a skill supports multiple domains/frameworks, organize by variant:
```
cloud-deploy/
+-- SKILL.md (workflow + selection)
+-- references/
    +-- aws.md
    +-- gcp.md
    +-- azure.md
```
Claude reads only the relevant reference file.

For more patterns, see `references/workflows.md` (sequential/conditional workflows) and `references/output-patterns.md` (template and example patterns).

#### What Not to Include

A skill should only contain files that directly support its functionality. Do NOT create:
- README.md, INSTALLATION_GUIDE.md, QUICK_REFERENCE.md, CHANGELOG.md, etc.
- Setup and testing procedures, user-facing documentation
- Auxiliary context about the process that went into creating it
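
One way to catch stray files before packaging is a simple name check, sketched below. The set holds only the file names listed above (the "etc." means it is not exhaustive), and the helper `discouraged_files` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

# Names taken from the list above; extend as needed.
DISCOURAGED = {
    "README.md",
    "INSTALLATION_GUIDE.md",
    "QUICK_REFERENCE.md",
    "CHANGELOG.md",
}


def discouraged_files(paths: list[str]) -> list[str]:
    """Return the paths whose file name is on the discouraged list."""
    return [p for p in paths if PurePosixPath(p).name in DISCOURAGED]


# Made-up file listing for a skill directory.
files = [
    "SKILL.md",
    "README.md",
    "scripts/extract.py",
    "references/CHANGELOG.md",
]
flagged = discouraged_files(files)
print(flagged)
```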

#### Writing Style

Try to explain to the model why things are important in lieu of heavy-handed MUSTs. Use theory of mind and try to make the skill general rather than narrow to specific examples. If you find yourself writing ALWAYS or NEVER in all caps, reframe and explain the reasoning so the model understands why it matters. That's more humane, powerful, and effective.

Use imperative form in instructions. Prefer examples over verbose explanations.

**Defining output formats:**
```markdown
## Report structure
Use this exact template so every report has the same structure:
# [Title]
## Executive summary
## Key findings
## Recommendations
```

**Examples pattern:**
```markdown
## Commit message format
**Example 1:**
Input: Added user authentication with JWT tokens
Output: feat(auth): implement JWT-based authentication
```

### Test Cases

After writing the skill draft, come up with 2-3 realistic test prompts -- the kind of thing a real user would actually say. Share them with the user: "Here are a few test cases I'd like to try. Do these look right, or do you want to add more?" Then run them.

Save test cases to `evals/ev
