knowledge-intake
Processes external resources into stored knowledge with quality scoring and routing. Use when ingesting articles, papers, or docs into a memory palace.
What this skill does
## Table of Contents

- [What It Is](#what-it-is)
- [The Intake Signal](#the-intake-signal)
- [Quick Start](#quick-start)
- [Evaluation Framework](#evaluation-framework)
- [Importance Criteria](#importance-criteria)
- [Scoring Guide](#scoring-guide)
- [Application Routing](#application-routing)
- [Local Codebase Application](#local-codebase-application)
- [Meta-Infrastructure Application](#meta-infrastructure-application)
- [Routing Decision Tree](#routing-decision-tree)
- [Storage Locations](#storage-locations)
- [The Tidying Imperative (KonMari-Inspired)](#the-tidying-imperative-konmari-inspired)
- [The Master Curator](#the-master-curator)
- [The Two Questions](#the-two-questions)
- [Tidying Actions](#tidying-actions)
- [Marginal Value Filtering (Anti-Pollution)](#marginal-value-filtering-anti-pollution)
- [The Three-Step Filter](#the-three-step-filter)
- [Using the Filter](#using-the-filter)
- [Filter Output Example](#filter-output-example)
- [Progressive Autonomy Integration](#progressive-autonomy-integration)
- [RL-Based Quality Scoring](#rl-based-quality-scoring)
- [Anchor-Question Clarity Gate](#anchor-question-clarity-gate)
- [Usage Signals](#usage-signals)
- [Quality Decay Model](#quality-decay-model)
- [Source Lineage Tracking](#source-lineage-tracking)
- [Knowledge Orchestrator](#knowledge-orchestrator)
- [RL Integration with Marginal Value Filter](#rl-integration-with-marginal-value-filter)
- [Workflow Example](#workflow-example)
- [Queue Processing](#queue-processing)
- [Processing Queue Entries](#processing-queue-entries)
- [Queue Integration](#queue-integration)
- [Queue Status Workflow](#queue-status-workflow)
- [Automation](#automation)
- [Detailed Resources](#detailed-resources)
- [Hook Integration](#hook-integration)
- [Automatic Triggers](#automatic-triggers)
- [Hook Signals](#hook-signals)
- [Deduplication](#deduplication)
- [Safety Checks](#safety-checks)
- [Index Schema Alignment](#index-schema-alignment)
- [Integration](#integration)
- [Exit Criteria](#exit-criteria)

# Knowledge Intake

Process external resources into the knowledge store. When a user links an article, blog post, or paper, this skill guides evaluation, storage decisions, and application routing.

## When To Use

- Capturing and organizing knowledge from sessions
- Ingesting information into structured memory palaces

## When NOT To Use

- Temporary notes that do not need long-term storage
- Code-only changes without knowledge capture needs

## What It Is

A knowledge governance framework that answers three questions for every external resource:

1. **Is it worth storing?** - Evaluate signal-to-noise and relevance
2. **Where does it apply?** - Route to local codebase or meta-infrastructure
3. **What does it displace?** - Identify outdated knowledge to prune

## The Intake Signal

> When a user links an external resource, it is a signal of importance.

The act of sharing indicates the resource passed the user's own filter. Our job is to:

- Extract the essential patterns and insights
- Determine appropriate storage location and format
- Connect to existing knowledge structures
- Identify application opportunities

## Quick Start

When a user shares a link:

```
1. FETCH    → Detect format, retrieve and convert content
2. EVALUATE → Apply importance criteria
3. DECIDE   → Storage location and application type
4. STORE    → Create structured knowledge entry
5. VALIDATE → Scribe verification (slop scan + doc verify)
6. CONNECT  → Link to existing palace structures
7. PROMOTE  → Offer Discussion promotion (score 80+)
8. APPLY    → Route to codebase or infrastructure updates
9. PRUNE    → Identify displaced/outdated knowledge
```

### Step 1: FETCH with Format Detection

Before retrieving content, detect the source format from the URL or file path to choose the right retrieval method.

**Web articles and blog posts** (default path): Use WebFetch to retrieve HTML content directly. No conversion needed.

**Document URLs** (PDF, DOCX, PPTX, XLSX): Apply the `leyline:document-conversion` protocol. This tries the markitdown MCP tool first for high-quality markdown, then falls back to native Claude Code tools (Read for PDFs, etc.), then informs the user if the format is unsupported without markitdown.

**Local files** (user shares a file path): Construct a `file://` URI from the absolute path and apply the `leyline:document-conversion` protocol.

**Format detection heuristics:**

| URL Pattern | Format | Retrieval |
|-------------|--------|-----------|
| `*.pdf`, `arxiv.org/pdf/*` | PDF | document-conversion |
| `*.docx`, `*.doc` | Word | document-conversion |
| `*.pptx`, `*.ppt` | PowerPoint | document-conversion |
| `*.xlsx`, `*.xls` | Excel | document-conversion |
| `*.epub` | E-book | document-conversion |
| `drive.google.com/*` | Various | document-conversion |
| Everything else | HTML/web | WebFetch (existing) |

After retrieval (regardless of method), wrap the content in external content boundary markers per `leyline:content-sanitization` before proceeding to Step 2 (EVALUATE).

### Step 5: Scribe Validation (Required)

**All knowledge corpus entries MUST pass scribe validation before finalizing.**

Run `Skill(scribe:slop-detector)` on the new entry:

- Score must be < 2.5 (Clean to Light)
- No Tier 1 markers (delve, tapestry, comprehensive, leveraging, etc.)
- Hedge word density < 15 per 1000 words

Use `Agent(scribe:doc-verifier)` to validate:

- All file paths and URLs exist
- All cross-references are valid
- Source attributions are accurate

```bash
# Quick validation for knowledge corpus entry
/slop-scan docs/knowledge-corpus/[entry-name].md

# Doc verification is now agent-only:
Agent(scribe:doc-verifier) "Verify docs/knowledge-corpus/[entry-name].md"
```

**DO NOT finalize entries with a slop score of 2.5 or higher** - rewrite with concrete specifics.

**Verification:** Run the command with the `--help` flag to verify availability.
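
The three Step 5 thresholds can be combined into a single pass/fail gate. The sketch below is illustrative only: `ScanResult`, its field names, and `gate_failures` are hypothetical and do not reflect the actual output format of `scribe:slop-detector`. It assumes the scan has already produced a slop score, a list of Tier 1 markers found, and hedge-word and total-word counts.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the Step 5 rules.
MAX_SLOP_SCORE = 2.5        # score must be strictly below this
MAX_HEDGES_PER_1000 = 15.0  # hedge-word density must be strictly below this


@dataclass
class ScanResult:
    """Hypothetical container for scan output; field names are assumptions."""
    slop_score: float
    tier1_markers: List[str]
    hedge_words: int
    total_words: int


def gate_failures(result: ScanResult) -> List[str]:
    """Return the reasons an entry fails the gate; an empty list means it passes."""
    failures = []
    if result.slop_score >= MAX_SLOP_SCORE:
        failures.append(f"slop score {result.slop_score} is not below {MAX_SLOP_SCORE}")
    if result.tier1_markers:
        failures.append("Tier 1 markers present: " + ", ".join(result.tier1_markers))
    # Normalize the hedge count to a per-1000-words density; empty text counts as 0.
    density = result.hedge_words * 1000 / result.total_words if result.total_words else 0.0
    if density >= MAX_HEDGES_PER_1000:
        failures.append(f"hedge density {density:.1f} per 1000 words is not below {MAX_HEDGES_PER_1000}")
    return failures
```

Returning the list of failed checks, rather than a bare boolean, gives the rewrite pass something concrete to act on when an entry is rejected.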

### Step 7: Discussion Promotion (Score 80+ Only)

When the evaluation score is 80-100 (evergreen), you MUST execute the Discussion promotion workflow. If the score is below 80, skip this step entirely.

**Execute these steps in order:**

1. Read `modules/discussion-promotion.md` for the full GraphQL workflow
2. Tell the user: "This entry has reached evergreen maturity. Publishing to GitHub Discussions. [Y/n]"
3. If the user says "n", skip to Step 8 (APPLY)
4. Run the `gh api graphql` commands from the module to create or update a Discussion in the "Knowledge" category
5. Update the local corpus entry with `discussion_url`

- If the entry already has a `discussion_url` field, update the existing Discussion instead of creating a new one
- If `gh` is unavailable or promotion fails, warn the user and continue to Step 8 (APPLY)

Publishing is the default for qualifying entries. It never blocks the intake workflow.

## Evaluation Framework

### Importance Criteria

| Criterion | Weight | Questions |
|-----------|--------|-----------|
| **Novelty** | 25% | Does this introduce new patterns or concepts? |
| **Applicability** | 30% | Can we apply this to current work? |
| **Durability** | 20% | Will this remain relevant in 6+ months? |
| **Connectivity** | 15% | Does it connect to multiple existing concepts? |
| **Authority** | 10% | Is the source credible and well-reasoned? |

### Scoring Guide

- **80-100**: Evergreen knowledge, store prominently, apply immediately
- **60-79**: Valuable insight, store in corpus, schedule application
- **40-59**: Useful reference, store as seedling, revisit later
- **Below 40**: Low priority, capture key quote only or skip

## Application Routing

### Local Codebase Application

Apply when knowledge directly improves current project:

- Bug fix patterns
- Performance optimizations
- Architecture decisions for this codebase
- Tool/library recommendations

**Action**: Update code, add comments, create ADR

### Meta-Infrastructure Application

Apply when knowledge improves our plugin ecosystem:

- Skill design patterns
-