token-optimizer
Reduce OpenClaw token usage and API costs through smart model routing, heartbeat optimization, budget tracking, and native 2026.2.15 features (session pruning, bootstrap size limits, cache TTL alignment). Use when token costs are high, API rate limits are being hit, or hosting multiple agents at scale. The 4 executable scripts (context_optimizer, model_router, heartbeat_optimizer, token_tracker) are local-only — no network requests, no subprocess calls, no system modifications. Reference files (PROVIDERS.md, config-patches.json) document optional multi-provider strategies that require external API keys and network access if you choose to use them. See SECURITY.md for full breakdown.
What this skill does
# Token Optimizer
Comprehensive toolkit for reducing token usage and API costs in OpenClaw deployments. Combines smart model routing, optimized heartbeat intervals, usage tracking, and multi-provider strategies.
## Quick Start
**Immediate actions** (no config changes needed):
1. **Generate optimized AGENTS.md (BIGGEST WIN!):**
```bash
python3 scripts/context_optimizer.py generate-agents
# Creates AGENTS.md.optimized — review and replace your current AGENTS.md
```
2. **Check what context you ACTUALLY need:**
```bash
python3 scripts/context_optimizer.py recommend "hi, how are you?"
# Shows: Only 2 files needed (not 50+!)
```
3. **Install optimized heartbeat:**
```bash
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
```
4. **Enforce cheaper models for casual chat:**
```bash
python3 scripts/model_router.py "thanks!"
# Single-provider Anthropic setup: Use Sonnet, not Opus
# Multi-provider setup (OpenRouter/Together): Use Haiku for max savings
```
5. **Check current token budget:**
```bash
python3 scripts/token_tracker.py check
```
**Expected savings:** 50-80% reduction in token costs for typical workloads (context optimization is the biggest factor!).
## Core Capabilities
### 0. Lazy Skill Loading (NEW in v3.0 — BIGGEST WIN!)
**The single highest-impact optimization available.** Most agents burn 3,000–15,000 tokens per session loading skill files they never use. Stop that first.
**The pattern:**
1. Create a lightweight `SKILLS.md` catalog in your workspace (~300 tokens — list of skills + when to load them)
2. Only load individual SKILL.md files when a task actually needs them
3. Apply the same logic to memory files — load MEMORY.md at startup, daily logs only on demand
**Token savings:**
| Library size | Before (eager) | After (lazy) | Savings |
|---|---|---|---|
| 5 skills | ~3,000 tokens | ~600 tokens | **80%** |
| 10 skills | ~6,500 tokens | ~750 tokens | **88%** |
| 20 skills | ~13,000 tokens | ~900 tokens | **93%** |
**Quick implementation in AGENTS.md:**
```markdown
## Skills
At session start: Read SKILLS.md (the index only — ~300 tokens).
Load individual skill files ONLY when a task requires them.
Never load all skills upfront.
```
**Full implementation (with catalog template + optimizer script):**
```bash
clawhub install openclaw-skill-lazy-loader
```
The companion skill `openclaw-skill-lazy-loader` includes a `SKILLS.md.template`, an `AGENTS.md.template` lazy-loading section, and a `context_optimizer.py` CLI that recommends exactly which skills to load for any given task.
**Lazy loading handles context loading costs. The remaining capabilities below handle runtime costs.** Together they cover the full token lifecycle.
---
### 1. Context Optimization (NEW!)
**Biggest token saver** — Only load files you actually need, not everything upfront.
**Problem:** Default OpenClaw loads ALL context files every session:
- SOUL.md, AGENTS.md, USER.md, TOOLS.md, MEMORY.md
- docs/**/*.md (hundreds of files)
- memory/2026-*.md (daily logs)
- Total: Often 50K+ tokens before user even speaks!
**Solution:** Lazy loading based on prompt complexity.
**Usage:**
```bash
python3 scripts/context_optimizer.py recommend "<user prompt>"
```
**Examples:**
```bash
# Simple greeting → minimal context (2 files only!)
context_optimizer.py recommend "hi"
→ Load: SOUL.md, IDENTITY.md
→ Skip: Everything else
→ Savings: ~80% of context
# Standard work → selective loading
context_optimizer.py recommend "write a function"
→ Load: SOUL.md, IDENTITY.md, memory/TODAY.md
→ Skip: docs, old memory, knowledge base
→ Savings: ~50% of context
# Complex task → full context
context_optimizer.py recommend "analyze our entire architecture"
→ Load: SOUL.md, IDENTITY.md, MEMORY.md, memory/TODAY+YESTERDAY.md
→ Conditionally load: Relevant docs only
→ Savings: ~30% of context
```
**Output format:**
```json
{
"complexity": "simple",
"context_level": "minimal",
"recommended_files": ["SOUL.md", "IDENTITY.md"],
"file_count": 2,
"savings_percent": 80,
"skip_patterns": ["docs/**/*.md", "memory/20*.md"]
}
```
**Integration pattern:**
Before loading context for a new session:
```python
from context_optimizer import recommend_context_bundle
user_prompt = "thanks for your help"
recommendation = recommend_context_bundle(user_prompt)
if recommendation["context_level"] == "minimal":
# Load only SOUL.md + IDENTITY.md
# Skip everything else
# Save ~80% tokens!
```
**Generate optimized AGENTS.md:**
```bash
context_optimizer.py generate-agents
# Creates AGENTS.md.optimized with lazy loading instructions
# Review and replace your current AGENTS.md
```
**Expected savings:** 50-80% reduction in context tokens.
### 2. Smart Model Routing (ENHANCED!)
Automatically classify tasks and route to appropriate model tiers.
**NEW: Communication pattern enforcement** — Never waste Opus tokens on "hi" or "thanks"!
**Usage:**
```bash
python3 scripts/model_router.py "<user prompt>" [current_model] [force_tier]
```
**Examples:**
```bash
# Communication (NEW!) → ALWAYS Haiku
python3 scripts/model_router.py "thanks!"
python3 scripts/model_router.py "hi"
python3 scripts/model_router.py "ok got it"
→ Enforced: Haiku (NEVER Sonnet/Opus for casual chat)
# Simple task → suggests Haiku
python3 scripts/model_router.py "read the log file"
# Medium task → suggests Sonnet
python3 scripts/model_router.py "write a function to parse JSON"
# Complex task → suggests Opus
python3 scripts/model_router.py "design a microservices architecture"
```
**Patterns enforced to Haiku (NEVER Sonnet/Opus):**
*Communication:*
- Greetings: hi, hey, hello, yo
- Thanks: thanks, thank you, thx
- Acknowledgments: ok, sure, got it, understood
- Short responses: yes, no, yep, nope
- Single words or very short phrases
*Background tasks:*
- Heartbeat checks: "check email", "monitor servers"
- Cronjobs: "scheduled task", "periodic check", "reminder"
- Document parsing: "parse CSV", "extract data from log", "read JSON"
- Log scanning: "scan error logs", "process logs"
**Integration pattern:**
```python
from model_router import route_task
user_prompt = "show me the config"
routing = route_task(user_prompt)
if routing["should_switch"]:
# Use routing["recommended_model"]
# Save routing["cost_savings_percent"]
```
**Customization:**
Edit `ROUTING_RULES` or `COMMUNICATION_PATTERNS` in `scripts/model_router.py` to adjust patterns and keywords.
### 3. Heartbeat Optimization
Reduce API calls from heartbeat polling with smart interval tracking:
**Setup:**
```bash
# Copy template to workspace
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
# Plan which checks should run
python3 scripts/heartbeat_optimizer.py plan
```
**Commands:**
```bash
# Check if specific type should run now
heartbeat_optimizer.py check email
heartbeat_optimizer.py check calendar
# Record that a check was performed
heartbeat_optimizer.py record email
# Update check interval (seconds)
heartbeat_optimizer.py interval email 7200 # 2 hours
# Reset state
heartbeat_optimizer.py reset
```
**How it works:**
- Tracks last check time for each type (email, calendar, weather, etc.)
- Enforces minimum intervals before re-checking
- Respects quiet hours (23:00-08:00) — skips all checks
- Returns `HEARTBEAT_OK` when nothing needs attention (saves tokens)
**Default intervals:**
- Email: 60 minutes
- Calendar: 2 hours
- Weather: 4 hours
- Social: 2 hours
- Monitoring: 30 minutes
**Integration in HEARTBEAT.md:**
```markdown
## Email Check
Run only if: `heartbeat_optimizer.py check email` → `should_check: true`
After checking: `heartbeat_optimizer.py record email`
```
**Expected savings:** 50% reduction in heartbeat API calls.
**Model enforcement:** Heartbeat should ALWAYS use Haiku — see updated `HEARTBEAT.template.md` for model override instructions.
### 4. Cronjob Optimization (NEW!)
**Problem:** Cronjobs often default to expensive models Related in Backend & APIs
jfrog
IncludedInteract with the JFrog Platform via the JFrog CLI and REST/GraphQL APIs. Use this skill when the user wants to manage Artifactory repositories, upload or download artifacts, manage builds, configure permissions, manage users and groups, work with access tokens, configure JFrog CLI servers, search artifacts, manage properties, set up replication, manage JFrog Projects, run security audits or scans, look up CVE details, query exposures scan results from JFrog Advanced Security, manage release bundles and lifecycle operations, aggregate or export platform data, or perform any JFrog Platform administration task. Also use when the user mentions jf, jfrog, artifactory, xray, distribution, evidence, apptrust, onemodel, graphql, workers, mission control, curation, advanced security, exposures, or any JFrog product name.
cupynumeric-migration-readiness
IncludedPre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy patterns transfer cleanly, what must be refactored before porting, or mentions pre-port assessment, scaling analysis, or refactor planning. Inspect the user's source code, look up NumPy usage, cross-reference the cuPyNumeric API support manifest, and distinguish distributed-scaling-friendly patterns from blockers such as unsupported APIs, scalar synchronization, host round-trips, Python/object-heavy control flow, shape/data-dependent branching, and in-place mutation hazards. Produce a verdict of READY, LIGHT REFACTOR, SIGNIFICANT REFACTOR, or NOT RECOMMENDED, with concrete refactor pointers.
alibabacloud-data-agent-skill
IncludedInvoke Alibaba Cloud Apsara Data Agent for Analytics via CLI to perform natural language-driven data analysis on enterprise databases. Data Agent for Analytics is an intelligent data analysis agent developed by Alibaba Cloud Database team for enterprise users. It automatically completes requirement analysis, data understanding, analysis insights, and report generation based on natural language descriptions. This tool supports: discovering data resources (instances/databases/tables) managed in DMS, initiating query or deep analysis sessions, real-time progress tracking, and retrieving analysis conclusions and generated reports. Use this Skill when users need to query databases, analyze data trends, generate data reports, ask questions in natural language, or mention "Data Agent", "data analysis", "database query", "SQL analysis", "data insights".
token-optimizer
IncludedReduce OpenClaw token usage and API costs through smart model routing, heartbeat optimization, budget tracking, and native 2026.2.15 features (session pruning, bootstrap size limits, cache TTL alignment). Use when token costs are high, API rate limits are being hit, or hosting multiple agents at scale. The 4 executable scripts (context_optimizer, model_router, heartbeat_optimizer, token_tracker) are local-only — no network requests, no subprocess calls, no system modifications. Reference files (PROVIDERS.md, config-patches.json) document optional multi-provider strategies that require external API keys and network access if you choose to use them. See SECURITY.md for full breakdown.
resend-cli
IncludedUse this skill when the task is specifically about operating Resend from an AI agent, terminal session, or CI job via the official resend CLI: installing/authenticating the CLI, sending/listing/updating/cancelling emails, batch sends, domains and DNS, webhooks and local listeners, inbound receiving, contacts, topics, segments, broadcasts, templates, API keys, profiles, or debugging Resend CLI/API failures. Trigger on mentions of Resend CLI, `resend`, `resend doctor`, `resend emails send`, `resend domains`, `resend webhooks listen`, `resend emails receiving`, or agent-friendly terminal automation.
alibabacloud-odps-maxframe-coding
IncludedUse this skill for MaxFrame SDK development and documentation navigation on Alibaba Cloud MaxCompute (ODPS). Helps answer MaxFrame API, concept, official example, and supported pandas API questions; create data processing programs; read/write MaxCompute tables; debug jobs (remote or local); and build custom DPE runtime images. Trigger when users mention MaxFrame, MaxCompute with MaxFrame, ODPS table processing, DPE runtime, MaxFrame docs/examples, DataFrame/Tensor operations, or GPU runtime setup. Works for both English and Chinese queries about Alibaba Cloud data processing with MaxFrame.