distill

Synthesize wiki pages from related memories. One endpoint, one flow: the daemon clusters and synthesizes what it can; the agent finishes whatever the daemon could not (because it has no LLM or the cluster is too big). Invoked as `/distill [target]`.

What this skill does


# /distill

Force a distillation pass now. The daemon's background distill cycles run
on the daemon's own clock; `/distill` is the explicit user-triggered pass.

## Mental model

Distillation is four operations bundled into one flow:

- **emerge** — cluster new memories into new pages
- **absorb** — assign orphan memories to existing pages, propose new pages
  from topics that 2+ existing pages link to but no page is named for
- **refresh** — regenerate stale pages from their source memories (only when
  the user has not edited the page; pages you have touched stay locked)
- **merge** — combine duplicate pages flagged by the daemon's global review

The default flow runs all four. The `rebuild` verb is a destructive opt-in
that overrides the lock on a single page.

## Single flow

One POST to the daemon. Response splits into:

- `pages_created` / `created_ids`: pages the daemon synthesized itself
  (only when daemon has an LLM).
- `pending`: clusters the daemon couldn't finish. The agent
  synthesizes each in this session and POSTs them back to `/api/pages`.

Trigger timing is the only thing that differs between background distill
cycles and this skill. The code path is the same: the daemon hands back
any clusters it can't synthesize, and whoever called fills in the rest.

## Flow

### 1. Pick the scope

For bare `/distill`, infer a target from cwd:

```
Bash: top=$(git -C "$PWD" rev-parse --show-toplevel 2>/dev/null); \
      common=$(git -C "$PWD" rev-parse --git-common-dir 2>/dev/null); \
      if [ -n "$common" ]; then \
        case "$common" in /*) root=$(dirname "$common");; *) root=$(cd "$top" && cd "$(dirname "$common")" && pwd);; esac; \
        basename "$root"; \
      fi
```

- Output → use it (e.g. `origin`).
- Not a git repo → fall back to `basename "$PWD"`.
- Reserved keyword `deep` → no scope (global pass).
- Reserved keyword sequence `rebuild <page-id>` → confirm with the user
  first ("Rebuild page <id>? Your edits will be wiped, page regenerates
  from sources."), then call `distill(target=<page-id>, force=true)`.
  Skip the rest of this skill: a single-page rebuild does not produce
  `pending` clusters, and the daemon's response shape is
  `{"status": "ok", "force": true, "page_id": ..., "updated": true}`.
  Report it verbatim.

For `/distill <arg>` → forward `<arg>` to `target`.
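
The scope rules above can be sketched as a small dispatcher. This is only an illustration: `DistillRequest` and `resolve_request` are hypothetical names, `inferred_scope` stands for whatever the cwd inference produced, and treating `None` as "global pass" is an assumption about how the missing scope is represented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DistillRequest:
    target: Optional[str]  # None is assumed to mean "no scope" (global pass)
    force: bool = False    # True only for `rebuild <page-id>`


def resolve_request(args: List[str], inferred_scope: str) -> DistillRequest:
    """Map the words after `/distill` to a request (hypothetical helper)."""
    if not args:
        # Bare `/distill`: use the repo name or cwd basename.
        return DistillRequest(target=inferred_scope)
    if args[0] == "deep":
        return DistillRequest(target=None)
    if args[0] == "rebuild":
        if len(args) < 2:
            raise ValueError("rebuild requires a page id")
        # Destructive: the caller must confirm with the user first.
        return DistillRequest(target=args[1], force=True)
    # Anything else is forwarded to `target` unchanged.
    return DistillRequest(target=args[0])
```

Keeping the parsing separate from the tool call makes the destructive `rebuild` branch easy to gate behind a confirmation prompt.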

### 2. Call the MCP tool

```
distill(target="<scope>")
```

The tool returns the daemon's full JSON payload as text. Parse it as
JSON. The usual shape:

```
{
  "pages_created": 0,
  "scoped": true,
  "created_ids": [],
  "pending": [
    { "source_ids": [...], "contents": [...], "entity_id": ...,
      "entity_name": ..., "space": ..., "estimated_tokens": ... },
    ...
  ],
  "stale_pages": [
    { "page_id": ..., "title": ..., "summary": ...,
      "source_memory_ids": [...], "stale_reason": "source_updated",
      "user_edited": false, "sources_updated_count": 3 },
    ...
  ],
  "stale_truncated": false,
  "orphan_topics": [
    { "label": "Topic Z", "count": 3 },
    ...
  ]
}
```

The route never invokes the daemon LLM. When called from this skill,
`created_ids` is always empty and `pending` carries every cluster the
daemon found. The agent synthesizes them in this session, so the model
doing the synthesis is the one the user invoked the skill with.

If the payload instead carries `unresolved` and `hint`, relay both to
the user verbatim and stop.
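
A minimal sketch of handling the tool's text output, assuming `unresolved` and `hint` appear as top-level keys (the source does not show that payload) and that absent lists can be treated as empty. `parse_distill_response` is a hypothetical helper name.

```python
import json
from typing import Any, Dict


def parse_distill_response(raw: str) -> Dict[str, Any]:
    """Split the daemon payload into the parts the later steps consume."""
    payload = json.loads(raw)
    if "unresolved" in payload:
        # Assumed shape: relay both fields to the user and stop.
        return {
            "action": "relay",
            "unresolved": payload["unresolved"],
            "hint": payload.get("hint"),
        }
    return {
        "action": "synthesize",
        "pending": payload.get("pending", []),
        "stale_pages": payload.get("stale_pages", []),
        "stale_truncated": payload.get("stale_truncated", False),
        "orphan_topics": payload.get("orphan_topics", []),
    }
```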

### 3. Synthesize each `pending` cluster

The daemon route filters out clusters fully covered by an existing
page (subset or Jaccard ≥ 0.8). What remains is either:

- A **brand-new cluster** (no existing page) → create a new page.
- A **refresh candidate** (`existing_page_id` is set) → the cluster
  has new memories beyond what's in the matched page. The agent has
  LLM access, so the right move is to refresh the existing page in
  the same pass.
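
For intuition, the coverage filter can be sketched as below. The source does not say which sets the daemon compares; this sketch assumes it is the cluster's source memory ids against the page's source memory ids, and `is_fully_covered` is a hypothetical name.

```python
from typing import Set


def is_fully_covered(cluster_ids: Set[str], page_ids: Set[str],
                     threshold: float = 0.8) -> bool:
    """True when the cluster is a subset of the page's sources, or the
    Jaccard similarity of the two id sets meets the threshold."""
    if cluster_ids <= page_ids:
        return True
    # The cluster is non-empty here, so the union is non-empty too.
    overlap = len(cluster_ids & page_ids)
    union = len(cluster_ids | page_ids)
    return overlap / union >= threshold
```

Under these assumptions, a five-memory cluster that shares four ids with a four-memory page scores 4/5 = 0.8 and is filtered out, while a cluster sharing one of three ids is handed to the agent.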

Cluster shape:

```
pending: [
  {
    source_ids, contents, entity_id, entity_name, space,
    estimated_tokens,
    existing_page_id?, existing_page_title?, new_memory_count?
  },
  ...
]
```

For each cluster, first run a **coherence check** before synthesizing:

- Skim every memory in `cluster.contents`.
- If the cluster has ≥ ~4 memories and the topics scatter (entity
  shared but the memories cover unrelated sub-topics — e.g. all tagged
  `Origin` but spanning RwLock bugs, schema choices, onboarding UI,
  migrations, and CSS), the cluster is **incoherent**. Skip
  synthesizing it. Record it for the report under "Skipped (low
  coherence)" with the existing page title (if refresh) or a short
  topic hint (if new).
- Coherent cluster (memories share an actual topic, not just an entity
  tag) → proceed to synthesis.

The coherence judgement is something only the agent can do, because it
needs to read the prose. Daemon clustering is heuristic; the agent is
the final filter against producing a grab-bag page.

For each coherent cluster:

- Title: short noun phrase. Use `existing_page_title` when refreshing
  unless the new memories materially change the topic. For new
  clusters: `cluster.entity_name` if specific, otherwise derive from
  the first memory's content.
- Summary: one sentence — the durable claim.
- Body: 3-7 paragraphs of wiki prose. Use `[[wikilinks]]`. Cite source
  ids inline with `(source: mem_XXX)`.

**New cluster** (no `existing_page_id`) — call the MCP tool:

```
create_page(title="...", summary="...", content="...",
            entity_id="<cluster.entity_id or omit>",
            space="<cluster.space>",
            source_memory_ids=[...])
```

**Refresh candidate** (`existing_page_id` is set) — replace the body
in place via the `update_page` MCP tool. This is a single atomic
call: replaces content + source list + optional summary, clears the
daemon's `stale_reason`, bumps version, preserves page_id +
created_at so external `[[wikilinks]]` keep working.

```
update_page(page_id=cluster.existing_page_id,
            content="...",
            source_memory_ids=cluster.source_ids,
            summary="...")
```
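
The choice between the two tools can be sketched as a function that returns the tool name and its arguments. `plan_page_call` is a hypothetical helper; the argument names mirror the calls shown above, and omitting `entity_id` when the cluster has none follows the "or omit" note.

```python
from typing import Any, Dict, Tuple


def plan_page_call(cluster: Dict[str, Any], title: str, summary: str,
                   content: str) -> Tuple[str, Dict[str, Any]]:
    """Pick `update_page` for refresh candidates, `create_page` otherwise."""
    existing = cluster.get("existing_page_id")
    if existing:
        # Refresh in place: page_id is preserved, so wikilinks keep working.
        return "update_page", {
            "page_id": existing,
            "content": content,
            "source_memory_ids": cluster["source_ids"],
            "summary": summary,
        }
    args: Dict[str, Any] = {
        "title": title,
        "summary": summary,
        "content": content,
        "space": cluster["space"],
        "source_memory_ids": cluster["source_ids"],
    }
    if cluster.get("entity_id"):
        args["entity_id"] = cluster["entity_id"]
    return "create_page", args
```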

### 3.5 Refresh stale pages

The `stale_pages` block in the response lists pages whose sources
changed since last compile. Shape:

```
stale_pages: [
  { page_id, title, summary, source_memory_ids,
    sources_updated_count, stale_reason, user_edited },
  ...
]
stale_truncated: <bool>   # true when 10+ stale pages exist
```

For each stale page:

- **`user_edited == true`** → never auto-rewrite. The user touched
  the page; the upstream memories also changed. Surface in the
  "Conflict" report block and stop. The user resolves it by hand, or
  runs `/distill rebuild <page-id>` to wipe their edits and
  regenerate from sources.
- **`user_edited == false`** → fetch source memories via
  `get_page_sources(page_id="<id>")`, run the same coherence check
  used for clusters, then call `update_page` with the existing
  `source_memory_ids` and freshly synthesized prose.

```
update_page(page_id=stale.page_id,
            content="<refreshed prose>",
            source_memory_ids=stale.source_memory_ids,
            summary="<optional refreshed claim>")
```

When `stale_truncated == true`, tell the user "more stale pages
remain — re-run `/distill` after this pass to continue."
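
The triage above can be sketched as a partition over `stale_pages`. `triage_stale_pages` is a hypothetical helper and the returned keys are illustrative; only the field names `user_edited`, `page_id`, and `stale_truncated` come from the payload.

```python
from typing import Any, Dict, List


def triage_stale_pages(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Split stale pages into conflicts (user-edited) and refreshable ones."""
    conflicts: List[str] = []
    refresh: List[str] = []
    for page in payload.get("stale_pages", []):
        if page.get("user_edited"):
            conflicts.append(page["page_id"])  # never auto-rewrite
        else:
            refresh.append(page["page_id"])    # safe to call update_page
    return {
        "conflicts": conflicts,
        "refresh": refresh,
        # When True, tell the user to re-run /distill after this pass.
        "more_remaining": bool(payload.get("stale_truncated", False)),
    }
```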

### 3.6 Surface orphan-topic suggestions

`orphan_topics` lists wikilink labels that 2+ existing pages reach
for but no page is named for. Each entry is a topic-discovery
signal — other pages are asking for this page.

Do **not** auto-create pages from this list — the agent doesn't have
the source memories at hand, and an empty-stub page is worse than no
page. Surface them in the report so the user can choose to run
`/distill <label>` intentionally:

```
Topic suggestions (other pages link here, no page yet):
  - "Topic Z"  (3 pages reference it)
  - "Other"    (2 pages reference it)
```

Skip the section when `orphan_topics` is empty.
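
One way to render that block, as a sketch: `format_orphan_topics` is a hypothetical helper that pads the quoted labels to a common width and returns an empty string when there is nothing to report.

```python
from typing import Any, Dict, List


def format_orphan_topics(orphan_topics: List[Dict[str, Any]]) -> str:
    """Render the topic-suggestion block; empty input yields no section."""
    if not orphan_topics:
        return ""
    quoted = ['"{}"'.format(t["label"]) for t in orphan_topics]
    width = max(len(q) for q in quoted)
    lines = ["Topic suggestions (other pages link here, no page yet):"]
    for q, topic in zip(quoted, orphan_topics):
        lines.append("  - {}  ({} pages reference it)".format(
            q.ljust(width), topic["count"]))
    return "\n".join(lines)
```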

### 4. Report tersely

Three output shapes. Pick the one that matches what happened.

**If `pending` is empty (every cluster already fully covered):**

```
Scope `<scope>` is up to date — no new memories to distill.
```

**If 
