iterate

Included with Lifetime

$97 forever

Use when the workflow needs to self-correct, improve over time, or establish feedback loops and evaluation cycles.

enhancement

What this skill does


## MANDATORY PREPARATION

Invoke /agent-workflow — it contains workflow principles, anti-patterns, and the **Context Gathering Protocol**. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run /teach-maestro first.

Consult the feedback-loops reference in the agent-workflow skill for evaluation patterns and self-correction strategies.

---

Set up feedback loops that make workflows self-correcting and continuously improving. Iteration transforms one-shot gambles into convergent, reliable systems.

### Feedback Loop Design

### Step 1: Define Quality Criteria

What does "good output" look like? Score dimensions:

| Dimension | Weight | Threshold | Measurement |
|-----------|--------|-----------|-------------|
| Accuracy | 0.4 | ≥ 0.8 | Factual correctness check |
| Completeness | 0.3 | ≥ 0.7 | Required fields present |
| Format | 0.2 | ≥ 0.9 | Schema compliance |
| Tone | 0.1 | ≥ 0.6 | Appropriate for audience |

### Step 2: Choose Evaluator Type

Match evaluator to requirements:

- **Rule-based**: Schema validation, field presence, value ranges (fast, free)
- **Self-check**: Same model evaluates own output (fast, cheap, less reliable)
- **Cross-model**: Different model evaluates (slower, more reliable)
- **Human-in-the-loop**: Human review (slowest, most reliable, doesn't scale)
- **Hybrid**: Rules first, then model check for what rules can't catch

### Step 3: Design the Correction Loop

```text
generate(input) → evaluate(output) → score
  if score ≥ threshold → return output
  if score < threshold AND attempts < max →
    enrich input with evaluator feedback
    generate again (with feedback)
  if attempts ≥ max → fallback or escalate
```

**Critical**: The retry input MUST be different from the original. Include:

- The evaluator's specific feedback
- What was wrong and why
- A suggestion for how to fix it

### Step 4: Set Up Regression Detection

When changing prompts, models, or tools:

1. Run golden test set with OLD config → baseline scores
2. Run golden test set with NEW config → new scores
3. Compare: improvement ≥ 5% → accept; regression ≥ 5% → reject

### Step 5: Continuous Monitoring

For production workflows:

- Sample 1-5% of outputs for automated evaluation
- Track quality scores over time
- Alert on downward trends
- A/B test changes before full rollout

### Iteration Checklist

- [ ] Quality criteria defined with weights and thresholds
- [ ] Evaluator selected and configured
- [ ] Correction loop has max attempts limit
- [ ] Feedback is injected into retries (not identical retry)
- [ ] Golden test set exists with ≥ 10 cases
- [ ] Regression detection configured for changes
- [ ] Production monitoring in place

### Recommended Next Step

After setting up feedback loops, run `/evaluate` to validate the loop with real scenarios, then `/refine` for final polish.

**NEVER**:

- Retry with the exact same input (definition of insanity)
- Use the same weak model to both generate and evaluate
- Skip the max attempts limit (infinite loops are real)
- Deploy changes without regression testing against golden set
- Monitor only errors — track quality scores over time

Files: 1

Size: 3.4 KB

Complexity: 11/100

Category: enhancement

Source: https://github.com/sharpdeveye/maestro/tree/main/source/skills/iterate

Related in enhancement

amplify

Included

Use when the workflow works but needs to handle more complex cases or produce higher-quality output through better tools, context, prompts, or models.

enhancement

enrich

Included

Use when the agent needs access to information beyond its training data — knowledge sources, RAG pipelines, or grounding data.

enhancement

guard

Included

Use when deploying to production, handling sensitive data, or the workflow needs safety constraints, input validation, and security boundaries.

enhancement

temper

Included

Use when the workflow feels over-engineered, has premature optimizations, unnecessary abstraction layers, or complexity beyond actual requirements.

enhancement

turbocharge

Included

Use when the user wants to push past conventional workflow limits with advanced performance techniques like parallel orchestration, streaming pipelines, or adaptive routing.

enhancement

accelerate

Included

Use when the workflow is too slow, too expensive, or both and needs latency, cost, or token usage optimization.

enhancement

amplify

Included

Use when the workflow works but needs to handle more complex cases or produce higher-quality output through better tools, context, prompts, or models.

enhancement

enrich

Included

Use when the agent needs access to information beyond its training data — knowledge sources, RAG pipelines, or grounding data.

enhancement

guard

Included

Use when deploying to production, handling sensitive data, or the workflow needs safety constraints, input validation, and security boundaries.

enhancement

temper

Included

Use when the workflow feels over-engineered, has premature optimizations, unnecessary abstraction layers, or complexity beyond actual requirements.

enhancement

turbocharge

Included

Use when the user wants to push past conventional workflow limits with advanced performance techniques like parallel orchestration, streaming pipelines, or adaptive routing.

enhancement

accelerate

Included

Use when the workflow is too slow, too expensive, or both and needs latency, cost, or token usage optimization.

enhancement