
taxonomy


Source of truth for event taxonomy generation, data auditing, and governance best practices in Amplitude. Use when an agent needs to create, validate, audit, score, or recommend improvements to event tracking plans, naming conventions, property standards, data quality, or deprecation workflows. Covers naming rules, property standards, scoring frameworks, safe metadata operations, deprecation procedures, and AI readiness guidance.



# Taxonomy Generation & Data Auditing

## When to Use

- User asks to create or review a tracking plan or event taxonomy
- User wants to validate event/property naming conventions
- User needs to audit data quality (duplicates, stale events, missing metadata)
- User asks about funnel design or event relationships
- Agent is generating event names or property names and needs to follow standards
- User wants to understand or improve their taxonomy governance
- User asks about reducing event volume or type counts
- User asks about deprecation, blocking, deleting, or hiding events
- Any agent needs a "source of truth" for taxonomy best practices before recommending events
- User asks about AI readiness, AI Controls, or improving AI feature accuracy

---

# Layer 1: Foundational Concepts

## Core Philosophy

Six principles govern all taxonomy work:

1. **Evidence-first. Never fabricate.** Every finding must be grounded in tool-retrieved data. If something cannot be verified, say so explicitly.
2. **Scan aggressively. Propose confidently. Confirm before writing.** Paginate autonomously through the full taxonomy. Form a prioritized, opinionated view of what needs fixing — then present it. Never call a write tool without explicit user confirmation.
3. **Be opinionated, not neutral.** Generic requests ("audit my taxonomy") are an invitation to lead. Use the scoring framework, recommend the highest-impact action first, and explain why. Don't present a menu of equal options.
4. **Surface critical issues proactively.** If you find something important while working on an adjacent task, raise it. Don't silently ignore a PII violation because the user only asked about naming conventions.
5. **Questions extract institutional knowledge.** Ask about business intent and real-world meaning, not Amplitude mechanics. One focused question at a time. The goal is to surface knowledge that lives in people's heads.
6. **Explain before acting.** Before calling any write tool, present exact proposed changes — including before/after state — and wait for explicit confirmation.
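
Principles 2 and 6 amount to a confirmation gate in front of every write. The following is a minimal sketch of one way to structure that gate, not a prescribed implementation. `ProposedChange`, `render_proposal`, `apply_if_confirmed`, and `write_fn` are hypothetical names; `write_fn` stands in for whatever write tool the agent actually has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedChange:
    """One metadata change, shown to the user before any write (hypothetical type)."""
    event_name: str
    field: str
    before: str
    after: str


def render_proposal(changes):
    """Format the before/after state of every change for user review."""
    return "\n".join(
        f"{c.event_name}.{c.field}: {c.before!r} -> {c.after!r}" for c in changes
    )


def apply_if_confirmed(changes, confirmed, write_fn):
    """Call write_fn once per change, but only after explicit confirmation.

    write_fn is a placeholder for a real write tool (assumption).
    Returns the number of writes performed.
    """
    if not confirmed:
        return 0
    for change in changes:
        write_fn(change)
    return len(changes)
```

The agent would show the output of `render_proposal` to the user, and pass `confirmed=True` only after the user explicitly approves that exact proposal.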

## Data Quality Lifecycle

All taxonomy governance follows a four-stage loop:

1. **Detect** — Scan systematically. Paginate through the full taxonomy. Score every finding. Surface issues with evidence before conclusions.
2. **Clarify** — Ask one focused question to capture semantic truth. Do not suggest actions yet. Seek understanding first.
3. **Resolve** — Apply metadata-only improvements. Guide humans through phased deprecation for structural changes. Never execute destructive actions unilaterally.
4. **Prevent** — Recommend conventions and governance habits that stop drift from recurring.

## Event Volume vs. Taxonomy Type Counts

These are **different problems** requiring **different solutions**:

- **Event volume** = total event instances ingested per billing period (how many times events fire). Properties do not count toward volume.
- **Taxonomy type counts** = number of distinct names across all schema dimensions (event types, event property types, user property types, group types, group property types). Each has its own limit.

**Billing models — know which applies before advising:**
- **Event volume billing**: customer has a contracted allocation of events per period. Exceeding it triggers overage costs. Flag significant event volume changes to these customers.
- **MTU billing**: customer is billed based on distinct users who trigger any event in a month. Per-user event counts matter less; total unique user count matters more.

**What customers usually mean:**
- "I need to reduce my event volume" → worried about billing (volume-billed customers)
- "I need to reduce my event types / schema count" → worried about hitting type limits (once a limit is reached, new types won't be queryable)

**What actually reduces each:**

| Goal | Action | Reduces Volume? | Reduces Type Count? |
|------|--------|:---:|:---:|
| Reduce volume | Block event | Yes | No |
| Reduce volume | Delete event | Yes | Yes |
| Reduce type count | Delete event/property/group type | — | Yes |
| Reduce type count | Block event | Yes | **No** |
| Reduce type count | Hide event | No | **No** |

**Key rules:**
- Blocking and hiding do NOT reduce type count. A quota-constrained customer must delete, not block.
- **Never recommend sampling.** Sampling breaks funnel charts, journey paths, cohorts, downstream destinations, and Guides.
- Custom events and merged events simplify analysis but do NOT reduce raw event volume.
- When ambiguous, ask: "Are you trying to reduce how many events are being sent, or the number of different event and property types in your taxonomy?"
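
The rules above can be encoded as a small lookup so that an agent never recommends an action that does not serve the stated goal. This is an illustrative sketch: `ACTION_EFFECTS` and `actions_for_goal` are hypothetical names, and blocking is recorded as reducing volume because it stops new ingestion. Sampling is deliberately absent from the lookup.

```python
# Effects of each action, per the rules in this section.
# Tuple order: (reduces_volume, reduces_type_count).
ACTION_EFFECTS = {
    "block": (True, False),   # stops new ingestion, but the type still counts
    "delete": (True, True),
    "hide": (False, False),
}


def actions_for_goal(goal):
    """Return the actions that serve a goal, either 'volume' or 'type_count'."""
    column = {"volume": 0, "type_count": 1}[goal]
    return sorted(
        action for action, effects in ACTION_EFFECTS.items() if effects[column]
    )
```

For a quota-constrained customer, `actions_for_goal("type_count")` yields only `delete`, which matches the rule that blocking and hiding do not reduce type count.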

## Event States and Metadata Permissions

| Status | Meaning | Can Edit Metadata? |
|--------|---------|:---:|
| Planned | In tracking plan; not yet instrumented | Yes |
| Live | Actively receiving data | Yes |
| Blocked | Stops new ingestion; historical data accessible | Yes |
| Unexpected | Receiving data but NOT in tracking plan | **No** — must add to tracking plan first |
| Deleted | Stops ingestion; removed from new-chart dropdowns | **No** — must restore first |

**Unexpected events have special restrictions.** No metadata can be updated until the event is added to the tracking plan. When you encounter Unexpected events:
- If they appear legitimate (real product actions, consistent volume): recommend adding to the tracking plan first, then apply metadata.
- If they appear invalid (single-day spikes, test strings, security scan artifacts): treat as a deprecation candidate through the standard safe deprecation process.

Always distinguish "legitimate but undocumented" from "truly invalid" before recommending any action.
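
As a rough sketch, the status table can be turned into a pre-flight check that runs before any metadata edit is proposed. `metadata_edit_check` and the two dictionaries are hypothetical names. The status strings and prerequisites come from the table above.

```python
# Whether metadata can be edited in each event status (from the table above).
EDITABLE_BY_STATUS = {
    "Planned": True,
    "Live": True,
    "Blocked": True,
    "Unexpected": False,
    "Deleted": False,
}

# What must happen first when metadata cannot be edited.
PREREQUISITE = {
    "Unexpected": "add the event to the tracking plan",
    "Deleted": "restore the event",
}


def metadata_edit_check(status):
    """Return (can_edit, prerequisite); prerequisite is None when editing is allowed."""
    if status not in EDITABLE_BY_STATUS:
        raise ValueError(f"unknown event status: {status!r}")
    return EDITABLE_BY_STATUS[status], PREREQUISITE.get(status)
```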

**Activity state is NOT a deprecation signal.** An event marked Inactive is behaving as intended.

**Actual deprecation signals:**

| Signal | Interpretation |
|--------|----------------|
| No recent volume | Event has gone stale |
| No recent queries | Event is unused |
| **Both together** | **Strong deprecation candidate** |

**Planned events:** Zero volume and zero queries are expected, so neither counts as a deprecation signal. Evaluate Planned events by age, name collisions with Live events, and test-like names instead.
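
The signal table and the Planned-event exception can be combined into one classifier. This is a sketch under stated assumptions: `deprecation_assessment` is a hypothetical name, and the caller decides what "recent" means, because this section does not define a time window.

```python
def deprecation_assessment(status, has_recent_volume, has_recent_queries):
    """Interpret the volume and query signals for one event.

    The caller defines the window behind "recent" (assumption; no window
    is specified in this section).
    """
    if status == "Planned":
        # Zero volume and zero queries are expected for Planned events.
        return "evaluate by age, name collisions, and test-like names"
    stale = not has_recent_volume
    unused = not has_recent_queries
    if stale and unused:
        return "strong deprecation candidate"
    if stale:
        return "stale: no recent volume"
    if unused:
        return "unused: no recent queries"
    return "no deprecation signal"
```

Activity state is intentionally not an input, since an Inactive marking is not a deprecation signal.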

## Custom Events, Labeled Events, and Merged Events

None reduce event volume. Each has distinct behavior:

- **Custom events** (`ce:` prefix, type = custom): Logical combinations of underlying events for analysis convenience. The underlying events still exist and fire independently. Always check whether an event is used as the basis for a custom event before recommending its deletion — deleting the underlying event may break the custom event silently. Allowed: consolidate duplicate custom events with the same definition; improve naming, descriptions, categories, tags. Never claim that removing a custom event reduces event volume.
- **Labeled events** (`ce:` prefix, type = labeled): Designed for use with Autocapture, distinguished from custom events by a separate metadata flag. Adding or deleting a labeled event does not affect event volume.
- **Merged events** (Transform/Merge): Source events are no longer individually available for analysis after a merge. If the user needs to analyze combined events AND retain independent analysis of source events, recommend a **custom event** instead of a merge. Allowed: merge truly duplicated events that share the same semantics and where independent analysis is not needed. Never claim that merging reduces event volume.
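
The choice between a merge and a custom event reduces to two questions, sketched below. `recommend_combination` is a hypothetical helper. Falling back to a custom event whenever a merge is not clearly allowed is a conservative reading of the rules above, not a documented requirement.

```python
def recommend_combination(same_semantics, needs_independent_analysis):
    """Suggest how to combine events for analysis.

    A merge is suggested only for true duplicates whose source events never
    need to be analyzed on their own; otherwise a custom event is suggested.
    Neither option reduces raw event volume.
    """
    if same_semantics and not needs_independent_analysis:
        return "merge"
    return "custom event"
```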

## Protected Data Categories

**How to identify category from naming convention:** Events with bracket prefixes (`[...]`) follow a consistent pattern: if the text inside the brackets is a recognizable third-party product brand, it is an integration. If not, it is an Amplitude system event.
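
The bracket-prefix rule can be sketched as a small classifier. Treat this as illustrative only: `classify_event_name` is a hypothetical helper, and the fixed `KNOWN_INTEGRATION_BRANDS` set is an assumption standing in for the judgment "recognizable third-party product brand". It is seeded only with integration prefixes named in this section.

```python
import re

# Assumption: a fixed set approximates "recognizable third-party brand".
KNOWN_INTEGRATION_BRANDS = {"Appboy", "Adjust", "Intercom"}

# Captures the text inside a leading [...] prefix.
_BRACKET_PREFIX = re.compile(r"^\[([^\]]+)\]")


def classify_event_name(name):
    """Classify an event name as integration, Amplitude system, or unprefixed."""
    match = _BRACKET_PREFIX.match(name)
    if match is None:
        return "unprefixed"
    if match.group(1) in KNOWN_INTEGRATION_BRANDS:
        return "integration"
    return "amplitude_system"
```

Because any unrecognized prefix falls through to `amplitude_system`, the sketch errs toward protecting an event rather than exposing it to cleanup.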

**Amplitude system events** (`[Amplitude]`, `[Guides-Surveys]`, `[Experiment]`, etc.): Critical to platform functionality. Do not recommend blocking, deleting, hiding, or modifying in response to generic cleanup.

**Integration-prefixed data** (`[Appboy]`, `[Adjust]`, `[Intercom]`, etc.): Can be cle
