testability-scoring
AI-powered testability assessment using 10 principles of intrinsic testability with Playwright and optional Vibium integration. Evaluates web applications against Observability, Controllability, Algorithmic Simplicity, Transparency, Stability, Explainability, Unbugginess, Smallness, Decomposability, and Similarity. Use when assessing software testability, evaluating test readiness, identifying testability improvements, or generating testability reports.
What this skill does
# Testability Scoring
## Browser engine
Uses the **qe-browser** fleet skill (`.claude/skills/qe-browser/`) as the primary browser engine. Vibium is installed by `aqe init`. The legacy `scripts/run-assessment.sh` + Playwright path remains as a fallback when a team already has a Playwright test suite configured, but new runs should prefer:
```bash
vibium go "$TARGET_URL"
vibium wait load
vibium a11y-tree --json > /tmp/testability/tree.json
vibium eval --stdin --json <<'EOF' > /tmp/testability/signals.json
JSON.stringify({
headings: document.querySelectorAll('h1,h2,h3,h4,h5,h6').length,
testIds: document.querySelectorAll('[data-testid]').length,
forms: document.querySelectorAll('form').length,
ariaLabels: document.querySelectorAll('[aria-label]').length,
links: document.querySelectorAll('a[href]').length,
});
EOF
node .claude/skills/qe-browser/scripts/assert.js --checks '[
{"kind": "no_console_errors"},
{"kind": "no_failed_requests"}
]'
```
<default_to_action>
When assessing testability:
1. RUN assessment against target URL
2. ANALYZE all 10 principles automatically
3. GENERATE HTML report with radar chart
4. PRIORITIZE improvements by impact/effort
5. INTEGRATE with QX Partner for holistic view
**Quick Assessment:**
```bash
# Run assessment on any URL
TEST_URL='https://example.com/' npx playwright test tests/testability-scoring/testability-scoring.spec.js --project=chromium --workers=1
# Or use shell script wrapper
.claude/skills/testability-scoring/scripts/run-assessment.sh https://example.com/
```
**The 10 Principles at a Glance:**
| Principle | Weight | Key Question |
|-----------|--------|--------------|
| **Observability** | 15% | Can we see what's happening? |
| **Controllability** | 15% | Can we control the application? |
| **Algorithmic Simplicity** | 10% | Are behaviors predictable? |
| **Algorithmic Transparency** | 10% | Can we understand what it does? |
| **Algorithmic Stability** | 10% | Does behavior remain consistent? |
| **Explainability** | 10% | Is the interface understandable? |
| **Unbugginess** | 10% | How error-free is it? |
| **Smallness** | 10% | Are components appropriately sized? |
| **Decomposability** | 5% | Can we test parts in isolation? |
| **Similarity** | 5% | Is the tech stack familiar? |
**Grade Scale:**
- **A (90-100)**: Excellent testability
- **B (80-89)**: Good testability
- **C (70-79)**: Adequate testability
- **D (60-69)**: Below average
- **F (0-59)**: Poor testability
</default_to_action>
## Quick Reference Card
### Running Assessments
| Method | Command | When to Use |
|--------|---------|-------------|
| Shell Script | `./scripts/run-assessment.sh URL` | One-time assessment |
| ENV Override | `TEST_URL='URL' npx playwright test...` | CI/CD integration |
| Config File | Update `tests/testability-scoring/config.js` | Repeated runs |
### Principle Details
#### High Weight (15% each)
| Principle | Measures | Indicators |
|-----------|----------|------------|
| **Observability** | State visibility, logging, monitoring | Console output, network tracking, error visibility |
| **Controllability** | Input control, state manipulation | API access, test data injection, determinism |
#### Medium Weight (10% each)
| Principle | Measures | Indicators |
|-----------|----------|------------|
| **Simplicity** | Predictable behavior | Clear I/O relationships, low complexity |
| **Transparency** | Understanding what system does | Visible processes, readable code |
| **Stability** | Consistent behavior | Change resilience, maintainability |
| **Explainability** | Interface understanding | Good docs, semantic structure, help text |
| **Unbugginess** | Error-free operation | Console errors, warnings, runtime issues |
| **Smallness** | Component size | Element count, script bloat, page complexity |
#### Low Weight (5% each)
| Principle | Measures | Indicators |
|-----------|----------|------------|
| **Decomposability** | Isolation testing | Component separation, modular design |
| **Similarity** | Technology familiarity | Standard frameworks, known patterns |
---
## Assessment Workflow
```
1. Navigate to URL → 2. Collect Metrics → 3. Score Principles
↓
4. Generate JSON ← 5. Calculate Grades ← 6. Apply Weights
↓
7. Generate HTML Report with Radar Chart
↓
8. Open in Browser (auto-opens)
```
### Output Files
```
tests/reports/
├── testability-results-<timestamp>.json # Raw data
├── testability-report-<timestamp>.html # Visual report
└── latest.json # Symlink
```
---
## Integration Examples
### CI/CD Integration
```yaml
# GitHub Actions
- name: Testability Assessment
run: |
timeout 180 .claude/skills/testability-scoring/scripts/run-assessment.sh ${{ env.APP_URL }}
- name: Upload Reports
uses: actions/upload-artifact@v3
with:
name: testability-reports
path: tests/reports/testability-*.html
```
### QX Partner Integration
```typescript
// Combine testability with QX analysis
const qxAnalysis = await Task("QX Analysis", {
target: 'https://example.com',
integrateTestability: true
}, "qx-partner");
// Returns combined insights:
// - QX Score: 78/100
// - Testability Integration: Observability 72/100
// - Combined Insight: Low observability may mask UX issues
```
### Programmatic Usage
```typescript
import { runTestabilityAssessment } from './testability';
const results = await runTestabilityAssessment('https://example.com');
console.log(`Overall: ${results.overallScore}/100 (${results.grade})`);
console.log('Recommendations:', results.recommendations);
```
---
## Agent Integration
```typescript
// Run testability assessment
const assessment = await Task("Testability Assessment", {
url: 'https://example.com',
generateReport: true,
openBrowser: true
}, "qe-quality-analyzer");
// Use with QX Partner for holistic analysis
const qxReport = await Task("Full QX Analysis", {
target: 'https://example.com',
integrateTestability: true,
detectOracleProblems: true
}, "qx-partner");
```
---
## Vibium Integration (Optional)
### Overview
Vibium browser automation can be used alongside Playwright for enhanced testability assessment. While **Playwright remains the primary engine**, Vibium offers complementary capabilities for certain metrics.
**Installation:**
```bash
claude mcp add vibium -- npx -y vibium
```
### Vibium-Enhanced Metrics
| Principle | Vibium Enhancement | Benefit |
|-----------|-------------------|---------|
| **Observability** | Auto-wait duration tracking | Measures DOM stability (30s timeout, 100ms polling) |
| **Controllability** | Element interaction success rate | Validates automation readiness via MCP |
| **Stability** | Screenshot consistency | Visual regression detection for layout stability |
| **Explainability** | Element attribute extraction | ARIA labels, semantic HTML validation |
### When to Use Vibium
✅ **USE Vibium for:**
- Element stability metrics (auto-wait duration analysis)
- Visual consistency checks (screenshot comparison)
- MCP-native AI agent integration
- Lightweight Docker images (400MB vs 1.2GB)
❌ **USE Playwright for:**
- Console error detection (Vibium V1 lacks console API)
- Network performance metrics (BiDi network APIs coming in V2)
- Comprehensive browser coverage (Firefox, Safari)
- Production-proven stability (Vibium V1 released Dec 2024)
### Hybrid Assessment Example
```typescript
// Testability assessment using both engines
const assessment = {
// Playwright: Comprehensive metrics
playwright: await runPlaywrightAssessment(url),
// Vibium: Stability metrics
vibium: {
elementStability: await measureAutoWaitDuration(url),
visualConsistency: await compareScreenshots(url),
accessibilityAttributes: await extractARIALabels(url)
}
};
// Enhanced Observability Score
const observability =
(assessment.playwright.consoleErrors * 0.6) +
(assessment.vibium.elementStability * 0.4);
```
### Vibium MCP TooRelated in testing-methodologies
contract-testing
IncludedConsumer-driven contract testing for microservices using Pact, schema validation, API versioning, and backward compatibility testing. Use when testing API contracts or coordinating distributed teams.
api-testing-patterns
IncludedComprehensive API testing patterns including contract testing, REST/GraphQL testing, and integration testing. Use when testing APIs or designing API test strategies.
quality-metrics
IncludedTracks quality metrics including defect density, test effectiveness ratio, DORA metrics, and mean time to detection. Use when establishing quality dashboards, defining KPIs, evaluating test suite effectiveness, or reporting quality trends to stakeholders.
shift-right-testing
IncludedTesting in production with feature flags, canary deployments, synthetic monitoring, and chaos engineering. Use when implementing production observability or progressive delivery.
test-automation-strategy
IncludedDesign and implement effective test automation with proper pyramid, patterns, and CI/CD integration. Use when building automation frameworks or improving test efficiency.
risk-based-testing
IncludedFocus testing effort on highest-risk areas using risk assessment and prioritization. Use when planning test strategy, allocating testing resources, or making coverage decisions.