enrich
Use when the agent needs access to information beyond its training data — knowledge sources, RAG pipelines, or grounding data.
What this skill does
## MANDATORY PREPARATION

Invoke /agent-workflow: it contains workflow principles, anti-patterns, and the **Context Gathering Protocol**. Follow the protocol before proceeding; if no workflow context exists yet, you MUST run /teach-maestro first. Consult the knowledge-systems reference in the agent-workflow skill for RAG architecture, chunking strategies, and retrieval patterns.

---

Add knowledge sources to ground the workflow in facts. Without grounding, agents hallucinate. With grounding, they cite sources.

### Knowledge Source Assessment

Identify what knowledge the workflow needs:

| Knowledge Type | Source | Update Frequency | Access Pattern |
|----------------|--------|------------------|----------------|
| Domain docs | Internal docs, specs | Monthly | Semantic search |
| Code context | Codebase | Real-time | Code search |
| User data | Database, CRM | Real-time | Structured query |
| External data | APIs, web | Real-time | API call |
| Historical | Logs, past interactions | Daily | Time-range query |

### Add RAG Pipeline

For document-based knowledge (consult the knowledge-systems reference in the agent-workflow skill):

1. **Select documents**: Identify the authoritative source documents
2. **Chunk strategy**: Choose chunking based on document type (semantic > token-based)
3. **Embed**: Use appropriate embedding model for the domain
4. **Index**: Store in vector database with metadata
5. **Retrieve**: Implement hybrid search (semantic + keyword)
6. **Inject**: Add retrieved context to the prompt with source attribution

### Add Structured Data

For database-backed knowledge:

1. **Define the query interface**: Natural language → structured query
2. **Add guardrails**: Read-only access, query complexity limits
3. **Format results**: Transform raw data into context the model can use
4. **Attribute**: Include data source and freshness in the context

### Add Real-Time Data

For live information:

1. **Identify APIs**: What external services provide the needed data
2. **Cache strategy**: How often does the data change? Cache accordingly
3. **Fallback**: What happens when the API is down?
4. **Attribution**: Include data timestamp and source

### Enrichment Checklist

- [ ] Every knowledge source has attribution (source, date, confidence)
- [ ] Retrieval quality tested independently of generation quality
- [ ] Chunk sizes tested and optimized for the document types
- [ ] Fallbacks exist for all external knowledge sources
- [ ] Knowledge base has a refresh/update strategy
- [ ] PII is handled appropriately in knowledge sources

### Recommended Next Step

After enrichment, run `/evaluate` to test retrieval quality, or `/iterate` to set up continuous monitoring of knowledge freshness.

**NEVER**:

- Index everything without curation (garbage in = garbage out)
- Skip source attribution (hallucination without attribution is undetectable)
- Build RAG without testing retrieval quality first
- Use fixed chunk sizes for all document types
- Assume embedding similarity equals relevance
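
As one illustration of the Real-Time Data steps above (cache strategy, fallback, attribution), the following is a minimal Python sketch, not a prescribed implementation. The names `CachedSource`, `Attributed`, `to_context`, and the `weather-api` source are hypothetical and chosen only for this example. The stale-cache fallback is one possible policy among several.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class Attributed:
    """A fetched value plus the attribution fields the checklist asks for."""
    value: str
    source: str
    fetched_at: float    # Unix timestamp of the successful fetch
    stale: bool = False  # True when served from cache after a failed refresh


class CachedSource:
    """Hypothetical wrapper: TTL cache with a stale-cache fallback."""

    def __init__(self, name: str, fetch: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.time):
        self.name = name
        self.fetch = fetch              # any zero-argument callable returning text
        self.ttl_seconds = ttl_seconds  # pick this from how often the data changes
        self.clock = clock              # injectable so freshness can be tested
        self._cached: Optional[Attributed] = None

    def get(self) -> Optional[Attributed]:
        now = self.clock()
        cached = self._cached
        # Cache strategy: reuse the value while it is younger than the TTL.
        if cached is not None and now - cached.fetched_at < self.ttl_seconds:
            return cached
        try:
            value = self.fetch()
        except Exception:
            # Fallback: serve the last known value, flagged as stale.
            # With nothing cached, report the source as unavailable (None).
            if cached is None:
                return None
            return Attributed(cached.value, cached.source,
                              cached.fetched_at, stale=True)
        self._cached = Attributed(value, self.name, now)
        return self._cached


def to_context(item: Optional[Attributed], source_name: str) -> str:
    """Attribution: render a prompt snippet with source and timestamp."""
    if item is None:
        return f"[source: {source_name} | unavailable]"
    fetched = datetime.fromtimestamp(item.fetched_at, tz=timezone.utc).isoformat()
    flag = " | STALE" if item.stale else ""
    return f"[source: {item.source} | fetched: {fetched}{flag}]\n{item.value}"


if __name__ == "__main__":
    # Stand-in for a real API call; "weather-api" is a made-up source name.
    source = CachedSource("weather-api", lambda: "12 C, light rain", ttl_seconds=300)
    print(to_context(source.get(), "weather-api"))
```

Marking stale data explicitly, rather than silently reusing it, lets the model (and the reader of its output) weigh the freshness of what it cites.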
Related in enhancement
amplify
Use when the workflow works but needs to handle more complex cases or produce higher-quality output through better tools, context, prompts, or models.
guard
Use when deploying to production, handling sensitive data, or the workflow needs safety constraints, input validation, and security boundaries.
iterate
Use when the workflow needs to self-correct, improve over time, or establish feedback loops and evaluation cycles.
temper
Use when the workflow feels over-engineered, has premature optimizations, unnecessary abstraction layers, or complexity beyond actual requirements.
turbocharge
Use when the user wants to push past conventional workflow limits with advanced performance techniques like parallel orchestration, streaming pipelines, or adaptive routing.
accelerate
Use when the workflow is too slow, too expensive, or both and needs latency, cost, or token usage optimization.