Claude
Skills
Sign in
Back

testability-scoring

Included with Lifetime
$97 forever

AI-powered testability assessment using 10 principles of intrinsic testability with Playwright and optional Vibium integration. Evaluates web applications against Observability, Controllability, Algorithmic Simplicity, Transparency, Stability, Explainability, Unbugginess, Smallness, Decomposability, and Similarity. Use when assessing software testability, evaluating test readiness, identifying testability improvements, or generating testability reports.

testing-methodologiestestabilityscoringplaywrightvibiumassessment10-principlesintrinsic-testabilityjames-bachscripts

What this skill does


# Testability Scoring

## Browser engine

Uses the **qe-browser** fleet skill (`.claude/skills/qe-browser/`) as the primary browser engine. Vibium is installed by `aqe init`. The legacy `scripts/run-assessment.sh` + Playwright path remains as a fallback when a team already has a Playwright test suite configured, but new runs should prefer:

```bash
vibium go "$TARGET_URL"
vibium wait load
vibium a11y-tree --json > /tmp/testability/tree.json
vibium eval --stdin --json <<'EOF' > /tmp/testability/signals.json
JSON.stringify({
  headings: document.querySelectorAll('h1,h2,h3,h4,h5,h6').length,
  testIds: document.querySelectorAll('[data-testid]').length,
  forms: document.querySelectorAll('form').length,
  ariaLabels: document.querySelectorAll('[aria-label]').length,
  links: document.querySelectorAll('a[href]').length,
});
EOF
node .claude/skills/qe-browser/scripts/assert.js --checks '[
  {"kind": "no_console_errors"},
  {"kind": "no_failed_requests"}
]'
```

<default_to_action>
When assessing testability:
1. RUN assessment against target URL
2. ANALYZE all 10 principles automatically
3. GENERATE HTML report with radar chart
4. PRIORITIZE improvements by impact/effort
5. INTEGRATE with QX Partner for holistic view

**Quick Assessment:**
```bash
# Run assessment on any URL
TEST_URL='https://example.com/' npx playwright test tests/testability-scoring/testability-scoring.spec.js --project=chromium --workers=1

# Or use shell script wrapper
.claude/skills/testability-scoring/scripts/run-assessment.sh https://example.com/
```

**The 10 Principles at a Glance:**
| Principle | Weight | Key Question |
|-----------|--------|--------------|
| **Observability** | 15% | Can we see what's happening? |
| **Controllability** | 15% | Can we control the application? |
| **Algorithmic Simplicity** | 10% | Are behaviors predictable? |
| **Algorithmic Transparency** | 10% | Can we understand what it does? |
| **Algorithmic Stability** | 10% | Does behavior remain consistent? |
| **Explainability** | 10% | Is the interface understandable? |
| **Unbugginess** | 10% | How error-free is it? |
| **Smallness** | 10% | Are components appropriately sized? |
| **Decomposability** | 5% | Can we test parts in isolation? |
| **Similarity** | 5% | Is the tech stack familiar? |

**Grade Scale:**
- **A (90-100)**: Excellent testability
- **B (80-89)**: Good testability
- **C (70-79)**: Adequate testability
- **D (60-69)**: Below average
- **F (0-59)**: Poor testability
</default_to_action>

## Quick Reference Card

### Running Assessments

| Method | Command | When to Use |
|--------|---------|-------------|
| Shell Script | `./scripts/run-assessment.sh URL` | One-time assessment |
| ENV Override | `TEST_URL='URL' npx playwright test...` | CI/CD integration |
| Config File | Update `tests/testability-scoring/config.js` | Repeated runs |

### Principle Details

#### High Weight (15% each)
| Principle | Measures | Indicators |
|-----------|----------|------------|
| **Observability** | State visibility, logging, monitoring | Console output, network tracking, error visibility |
| **Controllability** | Input control, state manipulation | API access, test data injection, determinism |

#### Medium Weight (10% each)
| Principle | Measures | Indicators |
|-----------|----------|------------|
| **Simplicity** | Predictable behavior | Clear I/O relationships, low complexity |
| **Transparency** | Understanding what system does | Visible processes, readable code |
| **Stability** | Consistent behavior | Change resilience, maintainability |
| **Explainability** | Interface understanding | Good docs, semantic structure, help text |
| **Unbugginess** | Error-free operation | Console errors, warnings, runtime issues |
| **Smallness** | Component size | Element count, script bloat, page complexity |

#### Low Weight (5% each)
| Principle | Measures | Indicators |
|-----------|----------|------------|
| **Decomposability** | Isolation testing | Component separation, modular design |
| **Similarity** | Technology familiarity | Standard frameworks, known patterns |

---

## Assessment Workflow

```
1. Navigate to URL → 2. Collect Metrics → 3. Score Principles
                                              ↓
4. Generate JSON ← 5. Calculate Grades ← 6. Apply Weights
         ↓
7. Generate HTML Report with Radar Chart
         ↓
8. Open in Browser (auto-opens)
```

### Output Files

```
tests/reports/
├── testability-results-<timestamp>.json  # Raw data
├── testability-report-<timestamp>.html   # Visual report
└── latest.json                           # Symlink
```

---

## Integration Examples

### CI/CD Integration
```yaml
# GitHub Actions
- name: Testability Assessment
  run: |
    timeout 180 .claude/skills/testability-scoring/scripts/run-assessment.sh ${{ env.APP_URL }}

- name: Upload Reports
  uses: actions/upload-artifact@v3
  with:
    name: testability-reports
    path: tests/reports/testability-*.html
```

### QX Partner Integration
```typescript
// Combine testability with QX analysis
const qxAnalysis = await Task("QX Analysis", {
  target: 'https://example.com',
  integrateTestability: true
}, "qx-partner");

// Returns combined insights:
// - QX Score: 78/100
// - Testability Integration: Observability 72/100
// - Combined Insight: Low observability may mask UX issues
```

### Programmatic Usage
```typescript
import { runTestabilityAssessment } from './testability';

const results = await runTestabilityAssessment('https://example.com');
console.log(`Overall: ${results.overallScore}/100 (${results.grade})`);
console.log('Recommendations:', results.recommendations);
```

---

## Agent Integration

```typescript
// Run testability assessment
const assessment = await Task("Testability Assessment", {
  url: 'https://example.com',
  generateReport: true,
  openBrowser: true
}, "qe-quality-analyzer");

// Use with QX Partner for holistic analysis
const qxReport = await Task("Full QX Analysis", {
  target: 'https://example.com',
  integrateTestability: true,
  detectOracleProblems: true
}, "qx-partner");
```

---

## Vibium Integration (Optional)

### Overview
Vibium browser automation can be used alongside Playwright for enhanced testability assessment. While **Playwright remains the primary engine**, Vibium offers complementary capabilities for certain metrics.

**Installation:**
```bash
claude mcp add vibium -- npx -y vibium
```

### Vibium-Enhanced Metrics

| Principle | Vibium Enhancement | Benefit |
|-----------|-------------------|---------|
| **Observability** | Auto-wait duration tracking | Measures DOM stability (30s timeout, 100ms polling) |
| **Controllability** | Element interaction success rate | Validates automation readiness via MCP |
| **Stability** | Screenshot consistency | Visual regression detection for layout stability |
| **Explainability** | Element attribute extraction | ARIA labels, semantic HTML validation |

### When to Use Vibium

✅ **USE Vibium for:**
- Element stability metrics (auto-wait duration analysis)
- Visual consistency checks (screenshot comparison)
- MCP-native AI agent integration
- Lightweight Docker images (400MB vs 1.2GB)

❌ **USE Playwright for:**
- Console error detection (Vibium V1 lacks console API)
- Network performance metrics (BiDi network APIs coming in V2)
- Comprehensive browser coverage (Firefox, Safari)
- Production-proven stability (Vibium V1 released Dec 2024)

### Hybrid Assessment Example

```typescript
// Testability assessment using both engines
const assessment = {
  // Playwright: Comprehensive metrics
  playwright: await runPlaywrightAssessment(url),

  // Vibium: Stability metrics
  vibium: {
    elementStability: await measureAutoWaitDuration(url),
    visualConsistency: await compareScreenshots(url),
    accessibilityAttributes: await extractARIALabels(url)
  }
};

// Enhanced Observability Score
const observability =
  (assessment.playwright.consoleErrors * 0.6) +
  (assessment.vibium.elementStability * 0.4);
```

### Vibium MCP Too

Related in testing-methodologies