kb-builder

Build an Obsidian-compatible knowledge base from public web sources using the TinyFish CLI. Use this skill when a user wants a builder-grade markdown knowledge base on a technical topic, asks for a structured research vault, or wants a topic compiled from live public sources into interlinked markdown files. Supports two input modes: topic only, or topic plus starter URLs. Supports both first-build and update workflows. Always generates index.md, sources.md, audit.md, and manifest.json. Creates additional files only when the evidence supports them. The output must synthesize the topic into a usable mental model, not just summarize pages. Uses explicit tinyfish agent run commands and public web sources only. Optional `--trace` mode saves raw TinyFish outputs under `_trace/` for debugging.

# KB Builder

Build a topic-specific markdown knowledge base by using TinyFish to browse public web sources and extract structured evidence.

This skill is for **builder knowledge bases**, not personal journals and not direct code generation.

The output is a folder you can drop into Obsidian immediately, and update later without starting over.

## Core principle

Do not produce a pile of source summaries.

The KB should help the reader understand:

- the core mental model
- the main approaches or schools of thought
- what is foundational vs derivative
- what actually matters
- what is unresolved
- what to read first if they want genuine understanding

If the output only says what each source said, the skill has failed.

## Pre-flight check

Run both checks before any TinyFish call:

```bash
which tinyfish && tinyfish --version || echo "TINYFISH_CLI_NOT_INSTALLED"
tinyfish auth status
```

If TinyFish is not installed, stop and tell the user:

```bash
npm install -g @tiny-fish/cli
```

If TinyFish is not authenticated, stop and tell the user:

```bash
tinyfish auth login
```

Do not continue until both checks pass.

## Scope

- **Allowed:** public web pages, public GitHub repos, public papers, public docs, public datasets, public blog posts
- **Not allowed:** private sources, local private files, authenticated dashboards, chat logs, email, Slack, or anything the user cannot access publicly
- **Primitive:** use explicit `tinyfish agent run` commands
- **Output shape:** always `index.md`, `sources.md`, `audit.md`, and `manifest.json`; everything else is dynamic

## Input modes

You support two input modes, plus an update workflow and an optional trace flag:

1. **Topic only**
   - Example: `Build me a knowledge base on web agent frameworks`
2. **Topic + starter URLs**
   - Example: `Build me a knowledge base on web agent frameworks and start from these URLs: ...`
3. **Update an existing KB**
   - Example: `Update my knowledge base on Kolmogorov-Arnold Networks with these new URLs: ...`
4. **Trace mode**
   - Example: `Build me a knowledge base on browser agents --trace`

If the topic is missing, ask for it before proceeding.

If starter URLs are present:
- use them first
- deduplicate them
- keep only public URLs
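
The starter-URL rules above can be sketched as a small filter. This is a minimal illustration, not part of the TinyFish CLI: `filter_starter_urls` is a hypothetical helper name, and the "public" check is only a rough screen (http/https scheme, no localhost), since a URL string alone cannot prove a page is publicly reachable.

```bash
# Sketch: normalize starter URLs read from stdin, one per line.
# - trims surrounding whitespace
# - keeps only http(s) URLs (drops file://, bare paths, and so on)
# - drops obvious local addresses (assumption: a rough stand-in for "public")
# - removes exact duplicates while preserving first-seen order
filter_starter_urls() {
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -E '^https?://' \
    | grep -Ev '^https?://(localhost|127\.0\.0\.1)([:/]|$)' \
    | awk '!seen[$0]++'
}

# Usage: filter_starter_urls < starter-urls.txt
```

Order is preserved on purpose so that the URLs the user listed first are still read first.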

If the user explicitly says `update`, `refresh`, `add these sources`, or clearly wants to add to an existing KB, switch into update mode.

If the user includes `--trace`, `trace`, `debug`, or explicitly asks for raw outputs:
- enable trace mode
- save raw TinyFish outputs under `_trace/`
- keep `_trace/` out of the main page navigation unless the user asks for it

## Output directory

Create a folder named:

```text
kb-{topic-slug}/
```

Examples:
- `kb-web-agent-frameworks/`
- `kb-kolmogorov-arnold-networks/`
- `kb-landing-page-design-patterns/`

When trace mode is enabled, also create:

```text
kb-{topic-slug}/_trace/
```
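
One way to derive the folder name from the topic is sketched below. The helper names (`slugify`, `kb_dir_for`, `make_kb_dir`) are hypothetical, and the slug rule (lowercase, runs of non-alphanumeric characters collapsed to a single hyphen) is an assumption that happens to reproduce the examples above; it handles ASCII topics only.

```bash
# Sketch: turn a human-readable topic into a folder slug.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# Print the KB folder name for a topic, e.g. kb-web-agent-frameworks.
kb_dir_for() {
  printf 'kb-%s' "$(slugify "$1")"
}

# Create the KB folder; create _trace/ only when trace mode is "true".
# Arguments: topic, trace flag (defaults to false).
make_kb_dir() {
  dir=$(kb_dir_for "$1")
  mkdir -p "$dir"
  if [ "${2:-false}" = "true" ]; then
    mkdir -p "$dir/_trace"
  fi
  printf '%s\n' "$dir"
}

# Usage: make_kb_dir "Web agent frameworks" true
```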

## Always-generated files

### `index.md`

This file is always required. It should contain:
- a short topic overview
- what the knowledge base covers
- a list of generated pages using `[[wikilinks]]`
- 3-7 key takeaways
- open questions or evidence gaps
- a **mental model** section
- a **what matters** section
- a **reading order** section for the strongest sources or pages

### `sources.md`

This file is always required. It should log **every URL visited** with:
- stable source ID
- timestamp
- URL
- source label
- reason it was opened
- result status: useful, partial, irrelevant, blocked, or conflicting

Use ISO 8601 timestamps.

Each source entry must use a stable source ID such as `S001`, `S002`, `S003`.

Example:

```markdown
## [S001] 2026-04-06T08:49:24.014Z | useful

- URL: https://example.com
- Label: Official docs
- Reason opened: discovery pass for {TOPIC}
- Notes: yielded 4 good follow-up links
```
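
A minimal sketch of how entries in this format could be appended with stable, never-reused IDs. The helper names `next_source_id` and `append_source` are hypothetical; the sketch assumes every entry heading starts with `## [S###]` as in the example, and it writes second-precision UTC timestamps, which are still valid ISO 8601.

```bash
# Print the next unused source ID (S001, S002, ...) for a sources.md file.
next_source_id() {
  file=$1
  last=$(grep -oE '^## \[S[0-9]+\]' "$file" 2>/dev/null \
    | grep -oE '[0-9]+' | sort -n | tail -n 1)
  # Strip leading zeros so the shell does not parse the number as octal.
  last=$(printf '%s' "${last:-0}" | sed 's/^0*//')
  printf 'S%03d\n' $(( ${last:-0} + 1 ))
}

# Append one entry. Arguments: file, result status, URL, label, reason.
append_source() {
  file=$1 result=$2 url=$3 label=$4 reason=$5
  id=$(next_source_id "$file")
  ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  printf '## [%s] %s | %s\n\n- URL: %s\n- Label: %s\n- Reason opened: %s\n\n' \
    "$id" "$ts" "$result" "$url" "$label" "$reason" >> "$file"
}

# Usage: append_source sources.md useful https://example.com "Official docs" "discovery pass"
```

Because the next ID is computed from the highest existing one, appending in update mode never renumbers earlier entries.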

### `audit.md`

This file is always required. It is the trust layer for the KB.

It must contain four sections:

- `FOUND`
- `INFERRED`
- `CONFLICTING`
- `MISSING`

Example:

```markdown
# Audit

## FOUND
- [FOUND | S003] Pikachu is an Electric-type Mouse Pokemon.

## INFERRED
- [INFERRED | S003,S004] Pikachu's mascot role is reinforced across both official canon and encyclopedia framing.

## CONFLICTING
- [CONFLICTING | S004,S009] Source A says X while source B frames Y.

## MISSING
- [MISSING] No dedicated benchmark source was read in this run.
```

Rules:

- `FOUND` requires at least one direct source ID
- `INFERRED` should usually reference at least two source IDs
- `CONFLICTING` must name the disagreement explicitly
- `MISSING` should be used whenever the KB lacks evidence rather than hand-waving

### `manifest.json`

This file is always required. It stores:

- topic
- topic slug
- build or update mode
- created timestamp
- last updated timestamp
- page list
- run history
- simple run bookkeeping like URLs visited and pages generated
- whether trace mode was enabled
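
The list above does not fix a schema, so the following is only one possible shape for a first build. Every key name here (`topic_slug`, `created_at`, `runs`, and so on) is an assumption chosen for illustration, and `write_manifest` is a hypothetical helper. The sketch does not escape quotes in the topic, so it suits simple topic strings only.

```bash
# Sketch: write an initial manifest.json for a fresh build.
# Arguments: KB dir, topic, topic slug, mode (build|update), trace (true|false).
write_manifest() {
  dir=$1 topic=$2 slug=$3 mode=$4 trace=$5
  now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  cat > "$dir/manifest.json" <<EOF
{
  "topic": "$topic",
  "topic_slug": "$slug",
  "mode": "$mode",
  "created_at": "$now",
  "updated_at": "$now",
  "trace": $trace,
  "pages": ["index.md", "sources.md", "audit.md"],
  "runs": [
    { "at": "$now", "mode": "$mode", "urls_visited": 0, "pages_generated": 3 }
  ]
}
EOF
}

# Usage: write_manifest kb-web-agent-frameworks "Web agent frameworks" web-agent-frameworks build false
```

In update mode, a later run would append to `runs` and bump `updated_at` while leaving `created_at` untouched.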

## Dynamic files

Do **not** hardcode a fixed set like `papers.md` or `repos.md` for every topic.

Create additional files only when the topic actually supports them.

Common examples:
- `papers.md`
- `repos.md`
- `docs.md`
- `articles.md`
- `datasets.md`
- `benchmarks.md`
- `people.md`
- `glossary.md`
- `timeline.md`
- `landscape.md`
- `reading-order.md`
- `disagreements.md`
- `what-matters.md`

Rules:
- if a category has meaningful evidence, create its file
- if it does not, skip it
- do not create empty placeholder files
- if a category only has 1-2 minor findings, fold it into `index.md` instead
- create `updates.md` when the KB is refreshed in update mode
- if the topic is broad enough to have multiple camps, phases, or implementation styles, create `landscape.md`
- if the sources disagree in meaningful ways, create `disagreements.md`
- if the reader would benefit from a guided path, create `reading-order.md`

All generated markdown files should use `[[wikilinks]]` when linking to other local pages.

Trace mode exception:
- files under `_trace/` are debugging artifacts, not user-facing KB pages
- do not clutter `index.md` with `_trace/` links unless the user explicitly asks

## Operating model

Use a **two-pass workflow**:

1. **Discovery pass**
   - find high-value URLs
2. **Reading pass**
   - extract structured information from the selected URLs

Use **one TinyFish run per URL**.

Do not ask one TinyFish agent to cover multiple independent sites in a single command.

Run independent URLs in parallel where possible using background jobs and `wait`.
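
The background-jobs-and-`wait` pattern looks roughly like this. `read_one_url` is a deliberate placeholder: this sketch does not guess at `tinyfish agent run` arguments, so replace the function body with the invocation your installed CLI documents. The output file naming is also an assumption.

```bash
# Placeholder for the real per-URL call. Replace the body with the actual
# `tinyfish agent run` command; this stub only records the URL it was given.
read_one_url() {
  url=$1 out=$2
  printf 'would read %s\n' "$url" > "$out"
}

# Start one background job per URL, then block until all of them finish.
# Arguments: output dir, then one or more URLs.
read_urls_in_parallel() {
  outdir=$1
  shift
  n=0
  for url in "$@"; do
    n=$((n + 1))
    read_one_url "$url" "$outdir/$(printf 'run-%03d.txt' "$n")" &
  done
  wait
}

# Usage: read_urls_in_parallel _trace https://a.example https://b.example
```

Each job writes to its own file, so parallel runs never interleave output, and one URL per job keeps the one-run-per-URL rule intact.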

## Step 0 — Decide build mode

Determine whether this run is:

- `build` — creating a KB from scratch
- `update` — adding or refreshing sources in an existing KB

Also determine:

- `TRACE` = `true` or `false`

Use `update` mode when:

- the user explicitly says update or refresh
- the target KB folder already exists and the user's intent is additive

In update mode:

- read the existing `index.md`
- read the existing `sources.md`
- read the existing `audit.md`
- read the existing `manifest.json`
- do not renumber old source IDs
- only rewrite the pages whose evidence changed
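
The "do not renumber" rule can be guarded with a before-and-after comparison. This is a sketch under the assumption that `sources.md` follows the entry format used in this document (`## [S###] ...` heading followed by a `- URL:` line); `source_pairs` and `ids_preserved` are hypothetical helper names.

```bash
# Print "ID URL" pairs, one per entry, from a sources.md file.
source_pairs() {
  awk '/^## \[S[0-9]+\]/ { id = substr($2, 2, length($2) - 2) }
       /^- URL: /        { print id, $3 }' "$1"
}

# Succeed only if every ID/URL pair in the old file still exists,
# unchanged, in the new file. New entries in the new file are allowed.
ids_preserved() {
  old=$1 new=$2
  source_pairs "$old" | while read -r pair; do
    if ! source_pairs "$new" | grep -qxF "$pair"; then
      printf 'changed or missing: %s\n' "$pair" >&2
      exit 1
    fi
  done
}

# Usage: cp sources.md sources.md.bak; ...run update...; ids_preserved sources.md.bak sources.md
```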

## Step 1 — Normalize the task

Write down:
- `TOPIC`
- `TOPIC_SLUG`
- `STARTER_URLS` if provided
- `MODE` = `build` or `update`
- `TRACE` = `true` or `false`

Keep the topic human-readable in the markdown output.

## Step 2 — Build the starting URL set

If the user gave starter URLs:
- start with those

If a starter URL is a direct arXiv paper page such as `/abs/...`, `/pdf/...`, or an arXiv HTML render:
- treat it as a **reading target**
- do not send it through the discovery-search workflow first

Then expand with a small set of public discovery URLs relevant to the topic. Choose from these patterns when relevant:

- GitHub repo search:
  ```text
  https://github.com/search?q={TOPIC}&type=repositories
  ```
- arXiv search:
  ```text
  https://arxiv.org/search/?query={TOPIC}&searchtype=all
  ```
- Hugging Face models search:
  ```text
  https://huggingface.co/models?search={TOPIC}
  ```
- Hugging Face datasets search:
  ```text
  https://huggingface.co/datasets?search={TOPIC
  ```

Files: 2

Size: 25.6 KB

Complexity: 37/100

Category: AI Agents

Source: https://github.com/tinyfish-io/tinyfish-cookbook/tree/main/skills/kb-builder
