taxonomy

Source of truth for event taxonomy generation, data auditing, and governance best practices in Amplitude. Use when an agent needs to create, validate, audit, score, or recommend improvements to event tracking plans, naming conventions, property standards, data quality, or deprecation workflows. Covers naming rules, property standards, scoring frameworks, safe metadata operations, deprecation procedures, and AI readiness guidance.

AI Agents

What this skill does


# Taxonomy Generation & Data Auditing

## When to Use

- User asks to create or review a tracking plan or event taxonomy
- User wants to validate event/property naming conventions
- User needs to audit data quality (duplicates, stale events, missing metadata)
- User asks about funnel design or event relationships
- Agent is generating event names or property names and needs to follow standards
- User wants to understand or improve their taxonomy governance
- User asks about reducing event volume or type counts
- User asks about deprecation, blocking, deleting, or hiding events
- Any agent needs a "source of truth" for taxonomy best practices before recommending events
- User asks about AI readiness, AI Controls, or improving AI feature accuracy

---

# Layer 1: Foundational Concepts

## Core Philosophy

Six principles govern all taxonomy work:

1. **Evidence-first. Never fabricate.** Every finding must be grounded in tool-retrieved data. If something cannot be verified, say so explicitly.
2. **Scan aggressively. Propose confidently. Confirm before writing.** Paginate autonomously through the full taxonomy. Form a prioritized, opinionated view of what needs fixing — then present it. Never call a write tool without explicit user confirmation.
3. **Be opinionated, not neutral.** Generic requests ("audit my taxonomy") are an invitation to lead. Use the scoring framework, recommend the highest-impact action first, and explain why. Don't present a menu of equal options.
4. **Surface critical issues proactively.** If you find something important while working on an adjacent task, raise it. Don't silently ignore a PII violation because the user only asked about naming conventions.
5. **Questions extract institutional knowledge.** Ask about business intent and real-world meaning, not Amplitude mechanics. One focused question at a time. The goal is to surface knowledge that lives in people's heads.
6. **Explain before acting.** Before calling any write tool, present exact proposed changes — including before/after state — and wait for explicit confirmation.
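
The "confirm before writing" rule in principles 2 and 6 can be pictured as a small gate in front of any write call. The following is a minimal sketch only: `ProposedChange`, `render_proposal`, `apply_if_confirmed`, and the `write` callback are hypothetical names, not part of any Amplitude tool or API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProposedChange:
    """One metadata change with its before/after state (hypothetical shape)."""
    target: str   # e.g. an event name
    field: str    # e.g. "description"
    before: str
    after: str


def render_proposal(changes: list[ProposedChange]) -> str:
    """Format the exact before/after state to present to the user."""
    return "\n".join(
        f"{c.target}.{c.field}: {c.before!r} -> {c.after!r}" for c in changes
    )


def apply_if_confirmed(
    changes: list[ProposedChange],
    confirmed: bool,
    write: Callable[[ProposedChange], None],
) -> int:
    """Call the write callback only after explicit confirmation.

    Returns the number of writes performed (0 when not confirmed).
    """
    if not confirmed:
        return 0
    for change in changes:
        write(change)
    return len(changes)
```

In this sketch the caller shows `render_proposal(...)` to the user first and passes `confirmed=True` only after an explicit yes, so the write path is unreachable without confirmation.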

## Data Quality Lifecycle

All taxonomy governance follows a four-stage loop:

1. **Detect** — Scan systematically. Paginate through the full taxonomy. Score every finding. Surface issues with evidence before conclusions.
2. **Clarify** — Ask one focused question to capture semantic truth. Do not suggest actions yet. Seek understanding first.
3. **Resolve** — Apply metadata-only improvements. Guide humans through phased deprecation for structural changes. Never execute destructive actions unilaterally.
4. **Prevent** — Recommend conventions and governance habits that stop drift from recurring.

## Event Volume vs. Taxonomy Type Counts

These are **different problems** requiring **different solutions**:

- **Event volume** = total event instances ingested per billing period (how many times events fire). Properties do not count toward volume.
- **Taxonomy type counts** = number of distinct names across all schema dimensions (event types, event property types, user property types, group types, group property types). Each has its own limit.

**Billing models — know which applies before advising:**
- **Event volume billing**: customer has a contracted allocation of events per period. Exceeding it triggers overage costs. Flag significant event volume changes to these customers.
- **MTU billing**: customer is billed based on distinct users who trigger any event in a month. Per-user event counts matter less; total unique user count matters more.

**What customers usually mean:**
- "I need to reduce my event volume" → worried about billing (volume-billed customers)
- "I need to reduce my event types / schema count" → worried about hitting type limits (once a limit is reached, new types won't be queryable)

**What actually reduces each:**

| Goal | Action | Reduces Volume? | Reduces Type Count? |
|------|--------|:---:|:---:|
| Reduce volume | Block event | Yes | No |
| Reduce volume | Delete event | Yes | Yes |
| Reduce type count | Delete event/property/group type | — | Yes |
| Reduce type count | Block event | Yes | **No** |
| Reduce type count | Hide event | No | **No** |

**Key rules:**
- Blocking and hiding do NOT reduce type count. A quota-constrained customer must delete, not block.
- **Never recommend sampling.** Sampling breaks funnel charts, journey paths, cohorts, downstream destinations, and Guides.
- Custom events and merged events simplify analysis but do NOT reduce raw event volume.
- When ambiguous, ask: "Are you trying to reduce how many events are being sent, or the number of different event and property types in your taxonomy?"
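
The action table and key rules above can be encoded as a small lookup, which makes it harder to recommend an action that does not serve the customer's actual goal. This is an illustrative sketch: the names `Effect`, `EFFECTS`, and `actions_for` and the goal strings are assumptions, and only event-level actions are modeled.

```python
from typing import NamedTuple


class Effect(NamedTuple):
    reduces_volume: bool
    reduces_type_count: bool


# Transcribed from the table above. Block is recorded as reducing volume,
# following the "Reduce volume" row and the Blocked status (stops new ingestion).
# Sampling is deliberately absent: the rules say never to recommend it.
EFFECTS = {
    "block": Effect(reduces_volume=True, reduces_type_count=False),
    "delete": Effect(reduces_volume=True, reduces_type_count=True),
    "hide": Effect(reduces_volume=False, reduces_type_count=False),
}


def actions_for(goal: str) -> list[str]:
    """Return the event-level actions that actually achieve the stated goal."""
    if goal == "reduce_volume":
        return [name for name, e in EFFECTS.items() if e.reduces_volume]
    if goal == "reduce_type_count":
        return [name for name, e in EFFECTS.items() if e.reduces_type_count]
    # Anything else is ambiguous: ask the clarifying question instead of guessing.
    raise ValueError(f"ambiguous goal {goal!r}: ask the user which they mean")
```

For a quota-constrained customer, `actions_for("reduce_type_count")` yields only `delete`, which matches the rule that blocking and hiding do not help with type limits.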

## Event States and Metadata Permissions

| Status | Meaning | Can Edit Metadata? |
|--------|---------|:---:|
| Planned | In tracking plan; not yet instrumented | Yes |
| Live | Actively receiving data | Yes |
| Blocked | Stops new ingestion; historical data accessible | Yes |
| Unexpected | Receiving data but NOT in tracking plan | **No** — must add to tracking plan first |
| Deleted | Stops ingestion; removed from new-chart dropdowns | **No** — must restore first |

**Unexpected events have special restrictions.** No metadata can be updated until the event is added to the tracking plan. When you encounter Unexpected events:
- If they appear legitimate (real product actions, consistent volume): recommend adding to the tracking plan first, then apply metadata.
- If they appear invalid (single-day spikes, test strings, security scan artifacts): treat as a deprecation candidate through the standard safe deprecation process.

Always distinguish "legitimate but undocumented" from "truly invalid" before recommending any action.
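
The status table above can be expressed as a pre-flight check that runs before any metadata update is proposed. This is a sketch under assumptions: the function name `metadata_edit_check` and the status strings as plain Python strings are illustrative, not an Amplitude API.

```python
from typing import Optional, Tuple

# Statuses whose metadata can be edited directly, per the table above.
EDITABLE = {"Planned", "Live", "Blocked"}

# Step required before metadata can be edited for the remaining statuses.
PREREQUISITE = {
    "Unexpected": "add to tracking plan first",
    "Deleted": "restore first",
}


def metadata_edit_check(status: str) -> Tuple[bool, Optional[str]]:
    """Return (can_edit, prerequisite) for an event status."""
    if status in EDITABLE:
        return True, None
    if status in PREREQUISITE:
        return False, PREREQUISITE[status]
    raise ValueError(f"unknown status: {status!r}")
```

Returning the prerequisite alongside the boolean lets the agent tell the user what has to happen first instead of simply refusing.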

**Activity state is NOT a deprecation signal.** An event marked Inactive is behaving as intended.

**Actual deprecation signals:**

| Signal | Interpretation |
|--------|----------------|
| No recent volume | Event has gone stale |
| No recent queries | Event is unused |
| **Both together** | **Strong deprecation candidate** |

**Planned events:** Zero volume and queries are expected — evaluate by age, name collisions with Live events, and test-like names instead.
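
The deprecation signals above combine as follows. In this sketch the function name and return strings are hypothetical, and what counts as "recent" is left to the caller because the text does not fix a time window.

```python
def deprecation_assessment(
    status: str, recent_volume: bool, recent_queries: bool
) -> str:
    """Classify an event using the volume and query signals.

    `recent_volume` and `recent_queries` are caller-supplied booleans;
    the lookback window behind them is an assumption made by the caller.
    """
    if status == "Planned":
        # Zero volume and queries are expected for Planned events.
        return "evaluate by age, name collisions, and test-like names"
    if not recent_volume and not recent_queries:
        return "strong deprecation candidate"
    if not recent_volume:
        return "stale"
    if not recent_queries:
        return "unused"
    return "no deprecation signal"
```

Activity state is intentionally not a parameter, since it is not a deprecation signal.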

## Custom Events, Labeled Events, and Merged Events

None reduce event volume. Each has distinct behavior:

- **Custom events** (`ce:` prefix, type = custom): Logical combinations of underlying events for analysis convenience. The underlying events still exist and fire independently. Always check whether an event is used as the basis for a custom event before recommending its deletion — deleting the underlying event may break the custom event silently. Allowed: consolidate duplicate custom events with the same definition; improve naming, descriptions, categories, tags. Never claim that removing a custom event reduces event volume.
- **Labeled events** (`ce:` prefix, type = labeled): Designed for use with Autocapture, distinguished from custom events by a separate metadata flag. Adding or deleting a labeled event does not affect event volume.
- **Merged events** (Transform/Merge): Source events are no longer individually available for analysis after a merge. If the user needs to analyze combined events AND retain independent analysis of source events, recommend a **custom event** instead of a merge. Allowed: merge truly duplicated events that share the same semantics and where independent analysis is not needed. Never claim that merging reduces event volume.
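
The dependency check described for custom events (look for custom events built on an event before recommending its deletion) might look like the sketch below. The mapping from custom event name to underlying event names is a hypothetical structure the agent would assemble from retrieved data; the event names in the usage example are illustrative only.

```python
def dependent_custom_events(
    event_name: str, custom_events: dict[str, list[str]]
) -> list[str]:
    """Return the custom events whose definition includes `event_name`.

    `custom_events` maps a custom event name to the names of its
    underlying events (hypothetical shape, built by the caller).
    """
    return sorted(
        name for name, sources in custom_events.items() if event_name in sources
    )


# Usage: a non-empty result means deletion could silently break these.
definitions = {
    "ce:Any Purchase": ["Purchase Completed", "Subscription Started"],
    "ce:Any Signup": ["Signup Completed"],
}
blockers = dependent_custom_events("Purchase Completed", definitions)
```

A non-empty result is a reason to raise the dependency with the user before any deletion is proposed.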

## Protected Data Categories

**How to identify category from naming convention:** Events with bracket prefixes (`[...]`) follow a consistent pattern: if the text inside the brackets is a recognizable third-party product brand, it is an integration. If not, it is an Amplitude system event.
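
The bracket-prefix rule can be sketched as a simple classifier. The brand set here is a hypothetical allowlist standing in for "recognizable third-party product brand", and the function name and return labels are assumptions.

```python
import re
from typing import Optional

# Hypothetical allowlist of third-party brands; recognizing a brand in
# practice is a judgment call, not a fixed list.
KNOWN_INTEGRATION_BRANDS = {"Appboy", "Adjust", "Intercom"}

# Matches a leading "[...]" prefix and captures the text inside it.
_PREFIX = re.compile(r"^\[([^\]]+)\]")


def classify_prefixed_event(event_name: str) -> Optional[str]:
    """Return "integration", "amplitude_system", or None if unprefixed."""
    match = _PREFIX.match(event_name)
    if match is None:
        return None
    prefix = match.group(1).strip()
    if prefix in KNOWN_INTEGRATION_BRANDS:
        return "integration"
    return "amplitude_system"
```

Because anything not on the allowlist falls through to the system category, an incomplete brand list errs toward treating events as protected.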

**Amplitude system events** (`[Amplitude]`, `[Guides-Surveys]`, `[Experiment]`, etc.): Critical to platform functionality. Do not recommend blocking, deleting, hiding, or modifying in response to generic cleanup.

**Integration-prefixed data** (`[Appboy]`, `[Adjust]`, `[Intercom]`, etc.): Can be cle

Files: 1

Size: 28.0 KB

Complexity: 35/100

Category: AI Agents

Source: https://github.com/amplitude/mcp-marketplace/tree/main/plugins/amplitude/skills/taxonomy
