accelerate


Use when the workflow is too slow, too expensive, or both, and needs latency, cost, or token-usage optimization.

enhancement

What this skill does


## MANDATORY PREPARATION

Invoke /agent-workflow. It contains workflow principles, anti-patterns, and the **Context Gathering Protocol**. Follow the protocol before proceeding; if no workflow context exists yet, you MUST run /teach-maestro first.
Consult the context-management reference in the agent-workflow skill for window optimization and budget strategies.


---

Make the workflow faster and cheaper without sacrificing quality. Measure before and after.

### Performance Audit

Measure current performance:

```text
Current metrics:
  Latency (p50): ___ms
  Latency (p95): ___ms
  Cost per request: $___
  Token usage (avg): ___ input / ___ output
  Error rate: ___%
```
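
The template above can be filled in from logged request data. The following is a minimal sketch, assuming each request is logged with the hypothetical fields of `RequestRecord` (these names are illustrative, not part of the skill); percentiles use the nearest-rank method:

```python
import math
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One logged request. Field names are hypothetical."""
    latency_ms: float
    cost_usd: float
    input_tokens: int
    output_tokens: int
    failed: bool


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n) in sorted order."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Compute the baseline metrics listed in the audit template."""
    n = len(records)
    latencies = [r.latency_ms for r in records]
    return {
        "latency_p50_ms": percentile(latencies, 50),
        "latency_p95_ms": percentile(latencies, 95),
        "cost_per_request_usd": sum(r.cost_usd for r in records) / n,
        "avg_input_tokens": sum(r.input_tokens for r in records) / n,
        "avg_output_tokens": sum(r.output_tokens for r in records) / n,
        "error_rate_pct": 100 * sum(r.failed for r in records) / n,
    }
```

Running `summarize` on the same sample of requests before and after a change gives the baseline and the comparison point.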

### Acceleration Strategies

**Reduce Token Usage**

- Shorten system prompts (remove redundant instructions)
- Compress few-shot examples to minimum viable length
- Use structured output schemas instead of verbose text
- Summarize context instead of passing raw documents
- Reduce output length requirements

**Model Cascading**

- Route simple tasks to cheaper/faster models
- Escalate only complex tasks to capable models
- Use classification to determine complexity
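
A cascade of this kind can be sketched as a classifier followed by a router. This is only an illustration: the model identifiers, the keyword heuristic, and the `call_model` callback are all assumptions, and a real classifier would likely be a small model or a tuned rule set:

```python
from typing import Callable

# Hypothetical model identifiers; substitute whatever your provider offers.
CHEAP_MODEL = "small-fast-model"
CAPABLE_MODEL = "large-capable-model"


def classify_complexity(task: str) -> str:
    """Toy heuristic: long or analysis-style requests count as complex."""
    markers = ("step by step", "compare", "analyze", "refactor")
    lowered = task.lower()
    if len(task.split()) > 50 or any(m in lowered for m in markers):
        return "complex"
    return "simple"


def route(task: str, call_model: Callable[[str, str], str]) -> str:
    """Send simple tasks to the cheap model and escalate only complex ones."""
    if classify_complexity(task) == "complex":
        model = CAPABLE_MODEL
    else:
        model = CHEAP_MODEL
    return call_model(model, task)
```

`call_model(model, task)` stands in for whatever client function actually sends the request; passing it in keeps the routing logic testable without network access.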

**Caching**

- Cache responses for identical or near-identical inputs
- Cache tool results with appropriate TTL
- Cache embeddings for frequently-queried documents
- Use semantic caching for similar (not identical) queries
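
As a rough sketch of response caching with a TTL, the class below keys entries on a normalized prompt, so inputs that differ only in case or whitespace share an entry. The class and method names are hypothetical. It does not implement semantic caching, which would need embedding similarity rather than a hash:

```python
import hashlib
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    @staticmethod
    def make_key(prompt):
        # Collapse case and whitespace so trivially different inputs match.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt, compute):
        """Return a fresh cached value, or call compute(prompt) and store it."""
        key = self.make_key(prompt)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute(prompt)
        self._store[key] = (now + self._ttl, value)
        return value
```

Choose the TTL per data source: results that depend on real-time data should get a very short TTL or bypass the cache entirely.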

**Parallelization**

- Run independent tool calls in parallel
- Run independent agent steps in parallel
- Use streaming to start processing before the full response arrives
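
Independent tool calls can be overlapped with `asyncio.gather`. In this sketch `fake_tool` is a hypothetical stand-in whose sleep simulates network latency; three 0.2 s calls finish in roughly 0.2 s total rather than 0.6 s:

```python
import asyncio
import time


async def fake_tool(name: str, delay_s: float) -> str:
    """Stand-in for a real tool call; the sleep simulates I/O latency."""
    await asyncio.sleep(delay_s)
    return f"{name}: done"


async def run_parallel() -> list:
    # gather() runs the coroutines concurrently and returns results
    # in the same order as the arguments.
    return await asyncio.gather(
        fake_tool("search", 0.2),
        fake_tool("fetch_profile", 0.2),
        fake_tool("lookup_price", 0.2),
    )


def timed_run():
    """Run the three calls and report the wall-clock time taken."""
    start = time.perf_counter()
    results = asyncio.run(run_parallel())
    elapsed = time.perf_counter() - start
    return results, elapsed
```

This only helps when the calls do not depend on each other's output; dependent steps still have to run in sequence.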

**Context Optimization**

- Retrieve less, retrieve better (improve retrieval precision)
- Use context compression techniques
- Implement a sliding window for long conversations
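
One way to sketch a sliding window is to keep the system message and then add turns from newest to oldest until a token budget is exhausted. The message shape (`role`/`content` dicts) and the word-count `estimate_tokens` are assumptions; a real implementation would use the model's tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def sliding_window(messages, max_tokens):
    """Keep system messages plus the newest turns that fit in max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    for message in reversed(turns):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break  # older turns are dropped once the budget is spent
        kept.append(message)
        budget -= cost
    return system + kept[::-1]  # restore chronological order
```

Dropped turns are lost to the model, so this pairs naturally with summarizing older context when earlier details still matter.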

### Acceleration Report

For each optimization:

1. **What changed**: Specific modification
2. **Before**: Latency/cost/tokens before
3. **After**: Latency/cost/tokens after
4. **Quality impact**: Any quality change (verify with golden tests)
5. **Trade-off**: What was sacrificed for the improvement
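
The five report fields map naturally onto a small record type. The sketch below is illustrative only; the class and field names are hypothetical, and `improvement_pct` simply expresses the before/after difference as a percentage reduction:

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    latency_ms: float
    cost_usd: float
    tokens: int


@dataclass
class ReportEntry:
    what_changed: str
    before: Measurement
    after: Measurement
    quality_impact: str
    trade_off: str

    def improvement_pct(self, metric: str) -> float:
        """Percentage reduction in a metric; positive means the change helped."""
        old = getattr(self.before, metric)
        new = getattr(self.after, metric)
        return 100 * (old - new) / old
```

For example, an entry whose latency went from 2000 ms to 1500 ms reports a 25% improvement for `latency_ms`.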

### Acceleration Checklist

- [ ] Baseline metrics recorded before any changes
- [ ] Each optimization measured with before/after comparison
- [ ] Quality impact verified (golden tests still pass)
- [ ] Trade-offs documented for each change
- [ ] Cost/latency improvements quantified

### Recommended Next Step

After optimization, run `/evaluate` to verify quality didn't degrade, or `/iterate` to set up continuous monitoring.

**NEVER**:

- Optimize without measuring first (you need a baseline)
- Sacrifice quality for speed without explicit user approval
- Cache outputs that depend on real-time data
- Skip the quality check after optimization
- Optimize prematurely (make it correct first, then make it fast)

Files: 1

Size: 2.9 KB

Complexity: 10/100

Category: enhancement

Source: https://github.com/sharpdeveye/maestro/tree/main/source/skills/accelerate

Related in enhancement

- **amplify**: Use when the workflow works but needs to handle more complex cases or produce higher-quality output through better tools, context, prompts, or models.
- **enrich**: Use when the agent needs access to information beyond its training data, such as knowledge sources, RAG pipelines, or grounding data.
- **guard**: Use when deploying to production, handling sensitive data, or the workflow needs safety constraints, input validation, and security boundaries.
- **iterate**: Use when the workflow needs to self-correct, improve over time, or establish feedback loops and evaluation cycles.
- **temper**: Use when the workflow feels over-engineered, has premature optimizations, unnecessary abstraction layers, or complexity beyond actual requirements.
- **turbocharge**: Use when the user wants to push past conventional workflow limits with advanced performance techniques like parallel orchestration, streaming pipelines, or adaptive routing.
