hindsight
Persistent agent memory via self-hosted Hindsight. Retain knowledge, recall context, reflect on patterns. Includes multi-bank routing architecture for agent orgs.
# Hindsight Memory (Self-Hosted)
Persistent, structured memory via the official Hindsight CLI (`v0.4.14`). Store knowledge during tasks, recall context before starting new ones, and reflect to synthesize patterns.
## Bank Detection
Auto-detect bank from git repo name:
```bash
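# Fall back to "general" outside a git repo (basename of an empty string succeeds, so the fallback must sit inside the inner substitution)
BANK=$(basename "$(git rev-parse --show-toplevel 2>/dev/null || echo general)")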
```
Or use the helper: `BANK=$(./scripts/hs-bank-id.sh)`
## CLI Config
- **Binary**: `hindsight` at `~/.local/bin/hindsight` (official v0.4.14)
- **Config**: `~/.hindsight/config` (TOML: `api_url`, `api_key`)
- **Endpoint**: `https://api.hs.delo.sh` (resolves to localhost via `/etc/hosts`)
- **Reconfigure**: `hindsight configure --api-url <url> --api-key <key>`
## Core Operations
### Retain (store knowledge)
```bash
hindsight memory retain $BANK "npm test requires --experimental-vm-modules" \
--context "debugging"
```
Context categories: `architecture`, `conventions`, `debugging`, `deployment`, `dependencies`, `preferences`, `session-summary`, `code-edit`
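A small guard can catch typos in the context tag before a fact is stored. The sketch below assumes the eight categories listed above are the only ones in use; `is_valid_context` and `retain_checked` are hypothetical helper names, not part of the Hindsight CLI.

```bash
# Hypothetical helper: succeed only for the context categories listed above.
is_valid_context() {
  case "$1" in
    architecture|conventions|debugging|deployment|dependencies|preferences|session-summary|code-edit)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical wrapper: refuse to retain when the context tag is unknown.
# Usage: retain_checked <bank> <content> <context>
retain_checked() {
  if ! is_valid_context "$3"; then
    echo "unknown context category: $3" >&2
    return 1
  fi
  hindsight memory retain "$1" "$2" --context "$3"
}
```

With this in place, `retain_checked "$BANK" "npm test requires --experimental-vm-modules" debugging` behaves like the plain command, while a misspelled tag such as `debuging` is rejected before anything is written.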
With document tracking (same `doc-id` = upsert, replacing old facts):
```bash
hindsight memory retain $BANK "Project deadline extended to April 15" \
--context "conventions" --doc-id "sprint-notes-2026-03"
```
### Recall (retrieve context)
```bash
hindsight memory recall $BANK "What testing patterns does this project use?"
```
With options:
```bash
hindsight memory recall $BANK "How are auth and session management connected?" \
--budget high --max-tokens 8192 --fact-type world,observation
```
Budget levels: `low` (fast, shallow), `mid` (balanced, default), `high` (deep graph traversal)
JSON output for programmatic use:
```bash
hindsight memory recall $BANK "query" -o json | jq '.results[].text'
```
### Reflect (synthesize with agentic reasoning)
Reflect runs an agentic loop: it autonomously searches memories, applies the bank's disposition, and generates a grounded response with citations.
```bash
hindsight memory reflect $BANK "What architectural decisions have shaped this project?"
```
With context and higher budget:
```bash
hindsight memory reflect $BANK "Should we migrate to event sourcing?" \
--context "architecture review" --budget high
```
Response includes `based_on.memories`, `based_on.mental_models`, `based_on.directives` for citation traceability.
## Mental Models (pre-computed reflect responses)
Mental models are curated summaries that reflect checks first, which gives faster, more consistent answers for recurring topics. They sit at the top of the retrieval hierarchy.
```bash
hindsight mental-model create $BANK \
--name "Project Architecture" \
--source-query "What is the overall system architecture?"
hindsight mental-model list $BANK
hindsight mental-model refresh $BANK <mental_model_id>
hindsight mental-model delete $BANK <mental_model_id>
```
## Directives (hard rules for reflect)
Directives are rules that are always enforced during reflect. Unlike disposition (a soft personality influence), directives are strict behavioral constraints.
```bash
hindsight directive create $BANK \
--name "Code Style" \
--content "Always recommend Python type hints and strict typing"
hindsight directive list $BANK
hindsight directive update $BANK <directive_id> --active false
hindsight directive delete $BANK <directive_id>
```
## Documents (source tracking)
Documents track where memories came from. Re-retaining with the same `doc-id` replaces old facts (upsert). Deleting a document removes all its extracted memories.
```bash
hindsight document list $BANK
hindsight document get $BANK <document_id>
hindsight document delete $BANK <document_id>
```
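Because re-retaining under the same `doc-id` replaces earlier facts, the upsert only works if the ID is stable and predictable. A minimal sketch, assuming a `<prefix>-<YYYY-MM>` naming scheme like the `sprint-notes-2026-03` example earlier; `monthly_doc_id` is a hypothetical helper, not a CLI feature.

```bash
# Hypothetical helper: build a per-month document ID such as "sprint-notes-2026-03".
# Usage: monthly_doc_id <prefix> [YYYY-MM-DD]   (defaults to today's date)
monthly_doc_id() {
  day="${2:-$(date +%Y-%m-%d)}"
  # Drop the trailing "-DD" so every retain in the same month shares one ID.
  printf '%s-%s\n' "$1" "${day%-*}"
}
```

For example, passing `--doc-id "$(monthly_doc_id sprint-notes)"` to `hindsight memory retain` would reuse one ID for every retain in the current month.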
## Bank Management
```bash
hindsight bank list
hindsight bank stats $BANK
hindsight bank disposition $BANK
hindsight bank disposition $BANK --skepticism 4 --literalism 3 --empathy 2
hindsight bank mission $BANK "Extract technical facts, conventions, and decisions."
```
## Disposition (Personality Traits)
Three traits (1-5 scale) that influence reflect behavior:
| Trait | Low (1) | High (5) |
|-------|---------|----------|
| **Skepticism** | Trusting, accepts claims | Questions and doubts claims |
| **Literalism** | Flexible interpretation | Exact, literal interpretation |
| **Empathy** | Detached, fact-focused | Considers emotional context |
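
Since each trait sits on a 1-5 scale, it can help to validate values before calling `hindsight bank disposition`. The following is a sketch under that assumption; `valid_trait` and `set_disposition` are hypothetical helpers wrapping the command shown under Bank Management.

```bash
# Hypothetical helper: succeed only for a single digit from 1 to 5.
valid_trait() {
  case "$1" in
    [1-5]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: validate all three traits, then update the bank.
# Usage: set_disposition <bank> <skepticism> <literalism> <empathy>
set_disposition() {
  for value in "$2" "$3" "$4"; do
    if ! valid_trait "$value"; then
      echo "trait values must be integers from 1 to 5, got: $value" >&2
      return 1
    fi
  done
  hindsight bank disposition "$1" --skepticism "$2" --literalism "$3" --empathy "$4"
}
```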
## Retrieval Hierarchy (during reflect)
1. **Mental Models** - User-curated summaries (highest priority)
2. **Observations** - Consolidated knowledge (auto-generated from retained facts)
3. **Raw Facts** - Ground truth memories (world, experience, observation types)
## Fact Types
- **world** - Objective facts ("Alice works at Google")
- **experience** - Conversational events ("User asked about deployment")
- **observation** - Consolidated patterns (auto-synthesized from multiple facts)
## Multi-Bank Routing Architecture
For multi-agent or multi-project setups, use domain-first routing to prevent cross-project pollution and recall noise.
### Strategy
- **Primary bank = domain/product** (source of truth). Example: `wean`, `chorescore`, `33god-core`
- **Secondary bank(s) = role/hierarchy overlay**. Example: `exec-office` for leadership decisions
- **Global fallback bank**. Example: `33GOD` for org-wide context
Avoid agent-only banks as canonical memory. They drift when agents switch projects.
### Routing Pattern
For each agent/session:
1. Resolve **writeBank** (where new memories are retained)
2. Resolve **recallBanks[]** (ordered primary -> secondary -> fallback)
3. On prompt build, recall from each bank and merge results
4. On run end/reset/tool-error, retain high-signal facts into writeBank
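
Steps 2 and 3 of this pattern can be sketched as a pair of shell helpers. This is an illustration only: `resolve_recall_banks` and `recall_all` are hypothetical names, the caller supplies the banks, and "merge" is shown as plain concatenation of each bank's output, which may differ from what a real integration does. Bank names are assumed to contain no spaces.

```bash
# Hypothetical helper: print recall banks in priority order
# (primary -> secondary -> fallback), one per line, skipping blanks and duplicates.
# Usage: resolve_recall_banks <primary> [secondary] [fallback]
resolve_recall_banks() {
  seen=""
  for bank in "$@"; do
    [ -n "$bank" ] || continue
    case " $seen " in
      *" $bank "*) continue ;;   # already listed
    esac
    seen="$seen $bank"
    printf '%s\n' "$bank"
  done
}

# Hypothetical helper: recall from every bank and concatenate the results,
# labelling each section with the bank it came from.
# Usage: recall_all <query> <bank>...
recall_all() {
  query="$1"
  shift
  for bank in "$@"; do
    printf '## %s\n' "$bank"
    hindsight memory recall "$bank" "$query"
  done
}

# Example (bank names taken from the Strategy list above):
#   recall_all "What testing patterns does this project use?" \
#     $(resolve_recall_banks wean exec-office 33GOD)
```

Keeping the primary bank first means that, if the merged context has to be trimmed, the domain bank's results are the last to be cut.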
### Capture Policy (high signal only)
**Retain automatically for:**
- Explicit memory intent ("remember", "don't forget", preferences)
- Post-run user facts/decisions
- High-level architectural patterns
- Pre-reset session summaries
- Non-standard system paths/configs
- Tool errors (debugging context)
**Do NOT retain:**
- Cron/noise/system spam
- Tiny one-word messages
- Slash commands
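
One way to apply this policy is a cheap pre-filter that runs before anything is retained. The sketch below covers only the rules that can be checked from the message text itself (slash commands, one-word messages, explicit memory intent); the trigger words are an assumption drawn from the examples above, and `should_retain` is a hypothetical helper.

```bash
# Hypothetical pre-filter: exit 0 if a message shows explicit memory intent
# and is not excluded by the "do NOT retain" rules, 1 otherwise.
should_retain() {
  msg="$1"
  # Skip slash commands.
  case "$msg" in
    /*) return 1 ;;
  esac
  # Skip empty and one-word messages.
  words=$(( $(printf '%s' "$msg" | wc -w) ))
  [ "$words" -ge 2 ] || return 1
  # Retain on explicit memory intent (case-insensitive substring match).
  lower=$(printf '%s' "$msg" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *remember*|*"don't forget"*|*prefer*) return 0 ;;
  esac
  return 1
}
```

Post-run facts, session summaries, and tool errors are not message-level signals, so they would need their own triggers at run end or reset.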
### Failure Modes
| Symptom | Cause | Fix |
|---------|-------|-----|
| Cross-project pollution | writeBank too broad | Tighten routing to domain bank |
| Recall noise | Too many recallBanks or topK too high | Cap at 3-4 banks and lower topK |
| Missed intent | Memory-intent regex too strict | Expand capture triggers |
| Latency spike | Recalling too many banks per prompt | Reduce recallBanks count |
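
The "cap at 3-4 banks" fix can be enforced mechanically. A sketch, with `cap_banks` as a hypothetical helper that keeps only the first N banks from an already prioritized list:

```bash
# Hypothetical helper: print at most <max> banks, preserving priority order.
# Usage: cap_banks <max> <bank>...
cap_banks() {
  max="$1"
  shift
  count=0
  for bank in "$@"; do
    [ "$count" -lt "$max" ] || break
    printf '%s\n' "$bank"
    count=$((count + 1))
  done
}
```

Because the list is ordered primary, secondary, fallback, truncating from the end drops the least specific banks first.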
## When to Retain
- Discovered a bug fix or workaround
- Found a project convention or pattern
- Learned a user preference
- Completed a significant task (summarize what was done)
- Found something that didn't work (negative knowledge is valuable)
## When to Recall
- Before starting any non-trivial task
- When working in an unfamiliar area of the codebase
- When making architectural or tooling decisions
- When the user asks about past work or patterns
## Best Practices
1. **Be specific**: "npm test requires --experimental-vm-modules" not "tests need a flag"
2. **Include outcomes**: Store what worked AND what didn't
3. **Use context categories**: Tag with the right context for better retrieval
4. **Recall first**: Check for relevant context before starting work
5. **Don't duplicate**: Check if knowledge already exists before retaining
6. **Use `--doc-id`**: Group related session facts under one document so they compound instead of duplicating
7. **Create mental models**: For topics you reflect on repeatedly
8. **Use directives**: For hard rules that must always be enforced during reflect
## OpenClaw Plugin Integration
The local OpenClaw plugin (`workspace/.openclaw/extensions/hindsight-memory/`) automates capture and recall so agents don't need to manually call hindsight. Config lives in `openclaw.json` under `plugins.entries.hindsight-memory`.
### What the plugin automates
| Hook | Behavior |
|------|----------|
| `before_prompt_build` | Auto-recall from resolved banks, inject as context |
| `message_received` | Capture explicit memory intent ("remember", "prefer", "always", "never") |
| `agent_end` | Capture high-signal … |