# Bias Prevention Skill


## Purpose

This skill provides core safeguards against common AI analysis pitfalls during OSS framework evaluation. It defines prevention rules for 12 pitfalls so that evaluations remain objective, accurate, and defensible.

## Activation

This skill is automatically loaded during all OSS evaluation phases. It provides behavioral rules that must be followed throughout the evaluation process.

## The 12 AI Analysis Pitfalls

### Pitfall 1: Stale/Outdated Knowledge

**Problem**: AI knowledge has a training cutoff and may not reflect current state.

**Prevention Rules**:
- NEVER claim version numbers, release dates, or metrics from memory
- ALWAYS use WebSearch to verify:
  - Current stable version
  - Last release date
  - GitHub stars/forks
  - Download counts
  - Maintenance activity
- Include verification timestamp with each metric
- If WebSearch fails, explicitly state "Unable to verify current data"

**Example**:
```markdown
<!-- BAD -->
FastAPI has 60k+ GitHub stars

<!-- GOOD -->
FastAPI has 78,234 GitHub stars [Verified: 2025-01-16 via github.com/tiangolo/fastapi]
```
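
A minimal sketch of how the timestamp rule could be applied when writing a report line. The helper name `format_verified_metric` is hypothetical and not part of this skill; it only shows the shape of a stamped claim and the fallback message when no source could be checked.

```python
from datetime import date


def format_verified_metric(claim: str, source: str, verified_on: date) -> str:
    """Append a verification stamp to a metric claim (hypothetical helper)."""
    if not source:
        # No verified source: fall back to the explicit failure statement.
        return "Unable to verify current data"
    return f"{claim} [Verified: {verified_on.isoformat()} via {source}]"


print(format_verified_metric(
    "FastAPI has 78,234 GitHub stars",
    "github.com/tiangolo/fastapi",
    date(2025, 1, 16),
))
```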

---

### Pitfall 2: False Feature Differentiation

**Problem**: Same features may exist under different names across frameworks, creating false differentiation.

**Prevention Rules**:
- Before marking a feature as "missing", search for:
  - Alternative terminology (middleware vs interceptors vs plugins)
  - Different architectural approaches (built-in vs pattern-based)
  - Community implementations
- Document equivalent features with terminology mapping
- Use @skills/feature-verification/SKILL.md for systematic checking

**Example**:
```markdown
<!-- BAD -->
| Feature | Framework A | Framework B |
|---------|-------------|-------------|
| Middleware | ✅ | ❌ |

<!-- GOOD -->
| Feature | Framework A | Framework B |
|---------|-------------|-------------|
| Request Pipeline | ✅ Middleware | ✅ Interceptors (equivalent) |
```
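
The terminology mapping can be kept as data so the same lookup is applied to every candidate. The sketch below is illustrative only: the `EQUIVALENT_TERMS` table and both function names are assumptions, and the single entry simply reuses the middleware/interceptors/plugins example from the rules above.

```python
from typing import Optional

# Hypothetical mapping: neutral capability name -> framework-specific terms.
EQUIVALENT_TERMS = {
    "Request Pipeline": {"middleware", "interceptors", "plugins"},
}


def find_capability(term: str) -> Optional[str]:
    """Return the neutral capability a framework-specific term maps to."""
    needle = term.strip().lower()
    for capability, aliases in EQUIVALENT_TERMS.items():
        if needle in aliases:
            return capability
    return None


def same_capability(term_a: str, term_b: str) -> bool:
    """True when both terms are known and map to the same capability."""
    capability = find_capability(term_a)
    return capability is not None and capability == find_capability(term_b)


print(same_capability("Middleware", "Interceptors"))
```

A term that is absent from the mapping returns `None`, which should prompt further searching rather than a "missing" verdict.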

---

### Pitfall 3: OSS vs Commercial Feature Conflation

**Problem**: Confusing open-source capabilities with paid/enterprise tiers.

**Prevention Rules**:
- EVERY feature must be annotated with availability:
  - `[OSS]` - Available in open source
  - `[PAID]` - Requires paid license
  - `[ENTERPRISE]` - Enterprise tier only
  - `[PLUGIN]` - Requires separate plugin
  - `[COMMUNITY]` - Community-maintained only
- Verify tier availability via official pricing/licensing pages
- When unclear, investigate and document uncertainty

**Example**:
```markdown
<!-- BAD -->
| Feature | Status |
|---------|--------|
| SSO Support | ✅ |

<!-- GOOD -->
| Feature | Status |
|---------|--------|
| SSO Support | ✅ [ENTERPRISE] Enterprise tier, ✅ [COMMUNITY] via community SAML plugin |
```
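
One way to enforce the annotation rule is a small check over the feature table before a report is finalized. This is a sketch under assumptions: the function name and the dictionary input format are hypothetical; only the five tag names come from the list above.

```python
import re
from typing import Dict, List

# Tag names taken from the availability list in this section.
ALLOWED_TAGS = {"OSS", "PAID", "ENTERPRISE", "PLUGIN", "COMMUNITY"}
TAG_PATTERN = re.compile(r"\[([A-Z]+)\]")


def missing_annotations(rows: Dict[str, str]) -> List[str]:
    """Return feature names whose status cell carries no recognised tag."""
    missing = []
    for feature, status in rows.items():
        tags = set(TAG_PATTERN.findall(status))
        if not tags & ALLOWED_TAGS:
            missing.append(feature)
    return missing


rows = {"SSO Support": "✅", "Audit log": "✅ [OSS]"}
print(missing_annotations(rows))  # prints ['SSO Support']
```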

---

### Pitfall 4: Complexity Overestimation

**Problem**: Holistic complexity estimates tend to be inflated.

**Prevention Rules**:
- Break down effort into component-level tasks
- Never provide holistic estimates like "this is complex"
- Use structured effort breakdown:
  - Design: X days
  - Implementation: X days
  - Testing: X days
  - Documentation: X days
- Compare to similar past work when available
- State confidence level with estimates

**Example**:
```markdown
<!-- BAD -->
Integration will be complex and time-consuming

<!-- GOOD -->
Integration effort breakdown:
- API adapter: 2-3 days (similar to existing adapters)
- Configuration: 0.5 days
- Testing: 1-2 days
- Documentation: 0.5 days
Total: 4-6 days (confidence: 70%)
```
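
The total in the example is just the sum of the low and high ends of each component range. A minimal sketch, assuming each component is recorded as a `(low, high)` pair in days; the function name is hypothetical.

```python
from typing import Dict, Tuple


def total_effort(components: Dict[str, Tuple[float, float]]) -> Tuple[float, float]:
    """Sum per-component (low, high) day estimates into one overall range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


breakdown = {
    "API adapter": (2, 3),
    "Configuration": (0.5, 0.5),
    "Testing": (1, 2),
    "Documentation": (0.5, 0.5),
}
print(total_effort(breakdown))  # prints (4.0, 6.0)
```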

---

### Pitfall 5: Baseline Drift

**Problem**: Comparison criteria shift during evaluation, favoring later-analyzed candidates.

**Prevention Rules**:
- Establish baseline criteria in Phase 1 BEFORE deep analysis
- Document criteria in `.oss-eval/baseline-criteria.md`
- Do NOT modify baseline criteria after Phase 1 unless:
  - Explicitly requested by stakeholder
  - Documented with change rationale
- Review baseline before each phase to ensure consistency

**Checkpoint Question**: "Am I using the same criteria I established in Phase 1?"
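
The checkpoint can be backed by a mechanical check: record a hash of the baseline file at the end of Phase 1 and compare it before each later phase. This is a sketch, not part of the skill; the function names are hypothetical, and where the recorded hash is stored is left to the evaluator.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def baseline_unchanged(path: Path, recorded: str) -> bool:
    """Compare the current file against the hash recorded after Phase 1."""
    return fingerprint(path) == recorded


if __name__ == "__main__":
    baseline = Path(".oss-eval/baseline-criteria.md")
    if baseline.exists():
        print(fingerprint(baseline))
```

A mismatch does not mean the change is wrong, only that it needs the stakeholder request and documented rationale described above.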

---

### Pitfall 6: Marketing Language Adoption

**Problem**: Marketing claims may be repeated without technical verification.

**Prevention Rules**:
- Translate marketing language to technical specifications
- Verify marketing claims with:
  - Documentation
  - Source code
  - Community discussions
  - Independent benchmarks
- Use neutral, technical language in reports

**Translation Examples**:
| Marketing | Technical Translation |
|-----------|----------------------|
| "Blazing fast" | "Xms p50 latency in benchmark Y" |
| "Enterprise-ready" | "Supports X, Y, Z enterprise features" |
| "Batteries included" | "Includes built-in: A, B, C" |
| "Zero-config" | "Sensible defaults for: X, Y" |

---

### Pitfall 7: Popularity Bias

**Problem**: Popular options may be favored over objectively better alternatives.

**Prevention Rules**:
- Stars and downloads are INPUTS, not decision criteria
- Evaluate technical merit independently of popularity
- Consider why something is popular (marketing vs. merit)
- Give equal analysis depth to less popular candidates

**Checkpoint Question**: "Would my assessment change if star counts were hidden?"

---

### Pitfall 8: Recency Bias

**Problem**: Recent changes may be overweighted vs. stable track record.

**Prevention Rules**:
- Consider full project history, not just recent activity
- Distinguish between:
  - Maintenance activity (good: ongoing support)
  - Churn (concerning: frequent breaking changes)
- Value stability for critical infrastructure
- Document both recent activity AND historical patterns

---

### Pitfall 9: Confirmation Bias

**Problem**: Seeking evidence that confirms initial impressions.

**Prevention Rules**:
- Actively seek disconfirming evidence
- For each strength, look for a weakness in the same area
- For each preferred candidate, argue the case for the alternatives
- Use adversarial review (Phase 15) to challenge conclusions
- Document findings that contradict initial impressions

**Checkpoint Question**: "What evidence would change my recommendation?"

---

### Pitfall 10: Halo Effect

**Problem**: One strong attribute causing overestimation of other attributes.

**Prevention Rules**:
- Evaluate each dimension independently
- Use structured scoring frameworks
- Don't let one strength "cover" for weaknesses
- Document strengths AND weaknesses for all candidates
- Apply same rigor to favored and unfavored options

**Example**:
```markdown
<!-- BAD -->
Excellent documentation, so overall score: 5/5

<!-- GOOD -->
| Dimension | Score |
|-----------|-------|
| Documentation | 5/5 |
| Performance | 3/5 |
| Community | 4/5 |
| Overall | 4/5 (weighted average) |
```
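
The overall score in the example can be reproduced with an explicit weighted average. The source does not state the weights, so the equal weights below are an assumption chosen for illustration, and the function name is hypothetical.

```python
from typing import Dict


def weighted_overall(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension scores; every dimension needs a weight."""
    total_weight = sum(weights[dim] for dim in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight


scores = {"Documentation": 5, "Performance": 3, "Community": 4}
weights = {"Documentation": 1, "Performance": 1, "Community": 1}  # assumed equal
print(weighted_overall(scores, weights))  # prints 4.0
```

Writing the weights down before scoring keeps a single strong dimension from silently dominating the result.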

---

### Pitfall 11: Anchoring Bias

**Problem**: First candidate analyzed sets expectations for others.

**Prevention Rules**:
- Randomize analysis order when possible
- Use absolute criteria, not relative comparison
- Re-evaluate early candidates after analyzing later ones
- Apply same checklist to all candidates
- Document any re-evaluation adjustments
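
Randomizing the analysis order is easy to make reproducible by recording the seed alongside the evaluation. A minimal sketch, assuming candidates are plain names; the function name and the seed value are hypothetical.

```python
import random
from typing import List, Sequence


def analysis_order(candidates: Sequence[str], seed: int) -> List[str]:
    """Return a shuffled copy of the candidates; the same seed gives the same order."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    order = list(candidates)
    rng.shuffle(order)
    return order


print(analysis_order(["Framework A", "Framework B", "Framework C"], seed=7))
```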

---

### Pitfall 12: Sunk Cost Bias

**Problem**: Investment in analysis may bias toward justifying that investment.

**Prevention Rules**:
- "None of the above" is a valid conclusion
- Be willing to restart with different candidates
- Don't force a recommendation if none are suitable
- Document when evaluation reveals need for different approach

---

## Bias Prevention Checklist

Use this checklist at phase boundaries:

```markdown
## Phase X Bias Check

- [ ] All metrics verified via current WebSearch
- [ ] No marketing language repeated without verification
- [ ] [OSS]/[PAID] annotations complete
- [ ] Effort estimates are component-level
- [ ] Same criteria used for all candidates
- [ ] Evidence for and against each candidate documented
- [ ] Recommendation would be the same if popularity metrics were hidden
- [ ] Any assumptions documented and flagged for validation
```
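
If the filled-in checklist is stored as Markdown, the items still open at a phase boundary can be listed automatically. This is a sketch under that assumption; the function name is hypothetical and the pattern only recognises the `- [ ]` task syntax used above.

```python
import re
from typing import List

# Matches unchecked task items such as "- [ ] Same criteria used for all candidates".
UNCHECKED = re.compile(r"^\s*- \[ \] (.+)$", re.MULTILINE)


def open_items(markdown: str) -> List[str]:
    """Return the text of every unchecked checklist item."""
    return UNCHECKED.findall(markdown)


sample = (
    "## Phase 2 Bias Check\n"
    "- [x] All metrics verified via current WebSearch\n"
    "- [ ] Effort estimates are component-level\n"
)
print(open_items(sample))  # prints ['Effort estimates are component-level']
```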

## Integration with Evaluation Phases

| Phase | Primary Pitfalls to Watch |
|-------|---------------------------|
| 1. D


Source: https://github.com/maxamillion/claude-oss-eval-plugin/tree/main/skills/bias-prevention
