
pentest-validation


Use when validating security findings from SAST/DAST scans, proving exploitability of reported vulnerabilities, eliminating false positives, or running the 4-phase pentest pipeline (recon, analysis, validation, report).

Tags: specialized-testing, pentest, exploitation, security-validation, shannon, no-exploit-no-report, graduated-exploitation, scripts



# Pentest Validation

<default_to_action>
When validating security findings:
1. REQUIRE explicit authorization for target URL
2. SCAN with qe-security-scanner (SAST + dependency + secrets)
3. ANALYZE with qe-security-reviewer + qe-security-auditor (parallel)
4. VALIDATE with qe-pentest-validator (graduated exploitation, parallel per vuln type)
5. REPORT only confirmed findings with PoC evidence ("No Exploit, No Report")
6. UPDATE exploit playbook with new patterns

**Quality Gates:**
- Authorization confirmed before ANY exploitation
- Target URL is staging/dev (NOT production)
- Budget cap enforced ($15 default)
- Time cap enforced (30 min default)
- All exploitation attempts logged
</default_to_action>
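
The budget, time, and logging gates above can be sketched as a small guard object. This is a minimal illustration, not the skill's actual implementation: `RunGuard`, `tryAttempt`, and the per-attempt cost argument are hypothetical names introduced here. Only the $15 and 30 minute defaults come from the quality gates.

```typescript
// Hypothetical guard for the budget cap, time cap, and attempt log.
// All names here are illustrative; only the default values come from the quality gates.
class RunGuard {
  private spentUsd = 0;
  private readonly log: string[] = [];
  private readonly maxCostUsd: number;
  private readonly timeoutMinutes: number;
  private readonly startedAtMs: number;

  constructor(maxCostUsd = 15, timeoutMinutes = 30, startedAtMs = Date.now()) {
    this.maxCostUsd = maxCostUsd;
    this.timeoutMinutes = timeoutMinutes;
    this.startedAtMs = startedAtMs;
  }

  // Returns true and records the spend only if the attempt fits within both caps.
  tryAttempt(label: string, costUsd: number, nowMs: number = Date.now()): boolean {
    const elapsedMinutes = (nowMs - this.startedAtMs) / 60_000;
    const allowed =
      this.spentUsd + costUsd <= this.maxCostUsd &&
      elapsedMinutes < this.timeoutMinutes;
    if (allowed) this.spentUsd += costUsd;
    // Every attempt is logged, whether it was allowed or blocked.
    this.log.push(`${label}: ${allowed ? "allowed" : "blocked"}`);
    return allowed;
  }

  get attempts(): readonly string[] {
    return this.log;
  }

  get totalSpentUsd(): number {
    return this.spentUsd;
  }
}
```

Checking the caps before each attempt, rather than after, means a run is stopped before it goes over budget instead of being reported as over budget afterwards.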

## Quick Reference Card

### The 4-Phase Pipeline

| Phase | Agent(s) | Purpose | Parallelism |
|-------|----------|---------|-------------|
| **1. Recon** | qe-security-scanner | SAST, DAST, dependency scan, secrets | Internal parallel |
| **2. Analysis** | qe-security-reviewer + qe-security-auditor | Code review + compliance check | Both in parallel |
| **3. Validation** | qe-pentest-validator | Graduated exploit validation | Per-vuln-type parallel |
| **4. Report** | qe-quality-gate | "No Exploit, No Report" filter | Sequential |

### Graduated Exploitation Tiers

| Tier | Handler | Cost | Latency | Use When |
|------|---------|------|---------|----------|
| **1** | Agent Booster (WASM) | $0 | <1ms | Code pattern is conclusive (eval, innerHTML, hardcoded creds) |
| **2** | Haiku | $0.0002 | ~500ms | Need payload test against live target |
| **3** | Sonnet/Opus | $0.003-$0.015 | 2-5s | Full exploit chain with data proof |
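
One way to read the "Use When" column is as a choice of the cheapest tier that can answer the question, capped by the configured `exploitation_tier`. The sketch below is an assumption about how that choice could be made. `selectTier` and its input fields are hypothetical names, and the precedence (data proof first, then conclusive pattern) is a guess, not documented behavior.

```typescript
type Tier = 1 | 2 | 3;

// Hypothetical inputs; the field names are illustrative only.
interface TierInput {
  patternConclusive: boolean; // e.g. eval, innerHTML, hardcoded creds found in source
  needsDataProof: boolean;    // a full exploit chain with data proof is required
  maxTier: Tier;              // the configured exploitation_tier cap
}

// Picks the lowest tier that satisfies the request, never exceeding the cap.
function selectTier(input: TierInput): Tier {
  const wanted: Tier = input.needsDataProof ? 3 : input.patternConclusive ? 1 : 2;
  return Math.min(wanted, input.maxTier) as Tier;
}
```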

### When to Use This Skill

| Scenario | Tier | Estimated Cost |
|----------|------|----------------|
| PR security review (source only) | 1 | $0 |
| Pre-release validation (staging) | 1-2 | $1-5 |
| Full pentest validation | 1-3 | $5-15 |
| Compliance audit evidence | 1-3 | $5-15 |

---

## Configuration

```yaml
pentest:
  target_url: https://staging.app.com    # REQUIRED for Tier 2-3
  source_repo: ./src                      # REQUIRED for Tier 1+
  exploitation_tier: 2                    # 1=pattern-only, 2=payload-test, 3=full-exploit
  vuln_types:                             # Which pipelines to run
    - injection                           # SQL, NoSQL, command injection
    - xss                                 # Reflected, stored, DOM XSS
    - auth                                # Auth bypass, session, JWT
    - ssrf                                # URL scheme abuse, metadata
  max_cost_usd: 15                        # Budget cap per run
  timeout_minutes: 30                     # Time cap per run
  require_authorization: true             # MUST confirm target ownership
  no_production: true                     # Block production URLs
  production_patterns:                    # URL patterns to block
    - "*.prod.*"
    - "api.*"
    - "www.*"
```
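
The `production_patterns` entries read like simple globs. A minimal sketch of how they might be checked follows, assuming that `*` matches any run of characters and that patterns are compared against the URL's hostname only. The source specifies neither, and `globToRegExp` and `isProductionUrl` are hypothetical helpers.

```typescript
// Converts a simple glob ("*" = any run of characters) into an anchored, case-insensitive RegExp.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters other than "*", then expand "*" to ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`, "i");
}

// Assumption: patterns are matched against the hostname, not the full URL.
function isProductionUrl(targetUrl: string, patterns: string[]): boolean {
  const host = new URL(targetUrl).hostname;
  return patterns.some((pattern) => globToRegExp(pattern).test(host));
}

const productionPatterns = ["*.prod.*", "api.*", "www.*"];
const stagingBlocked = isProductionUrl("https://staging.app.com", productionPatterns);
const apiBlocked = isProductionUrl("https://api.app.com", productionPatterns);
```

Under these assumptions a host such as `api.staging.app.com` would also be blocked by `api.*`, so staging hosts may need names that avoid the blocked prefixes.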

---

## Safeguards (Mandatory)

### Authorization Gate
Every pentest validation run MUST:
1. Display target URL and exploitation tier to user
2. Require explicit confirmation: "I own this target or am authorized to test it"
3. Log authorization with timestamp
4. Block if target URL matches production patterns
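
The four steps above could be wired together roughly as follows. This is a sketch under stated assumptions: `confirmationPrompt`, `authorizeRun`, and `AuthorizationRecord` are hypothetical names, `confirmed` stands in for the user's explicit confirmation, and `isProduction` stands in for the result of the production-pattern check.

```typescript
// Hypothetical record written to the authorization log.
interface AuthorizationRecord {
  targetUrl: string;
  exploitationTier: 1 | 2 | 3;
  confirmedAt: string; // ISO 8601 timestamp
}

// Step 1: the text shown to the user before anything runs.
function confirmationPrompt(targetUrl: string, exploitationTier: 1 | 2 | 3): string {
  return `Target: ${targetUrl} | Exploitation tier: ${exploitationTier}. Confirm ownership or authorization to proceed.`;
}

// Steps 2-4: block production targets, require confirmation, log with a timestamp.
function authorizeRun(
  targetUrl: string,
  exploitationTier: 1 | 2 | 3,
  confirmed: boolean,
  isProduction: boolean,
  auditLog: AuthorizationRecord[],
  now: Date = new Date(),
): AuthorizationRecord {
  if (isProduction) {
    throw new Error(`Blocked: ${targetUrl} matches a production pattern`);
  }
  if (!confirmed) {
    throw new Error(`Blocked: no explicit authorization for ${targetUrl}`);
  }
  const record: AuthorizationRecord = {
    targetUrl,
    exploitationTier,
    confirmedAt: now.toISOString(),
  };
  auditLog.push(record);
  return record;
}
```

Throwing on failure, instead of returning `false`, makes it hard for a caller to ignore the gate by accident.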

### What This Skill Does NOT Do
- Full autonomous reconnaissance (Nmap, Subfinder)
- Zero-day exploit development
- Attack targets without explicit authorization
- Test production systems
- Store actual exfiltrated data (only proof of access)
- Social engineering or phishing simulation
- Port scanning or service discovery

---

## Validation Pipelines

### Injection Pipeline
| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|-------------------|-------------------|----------------|
| SQL injection | String concat in query | `' OR '1'='1` response diff | UNION SELECT data extraction |
| NoSQL injection | `$where`, `$gt` in query | Operator injection test | Collection enumeration |
| Command injection | `exec()`, `system()` calls | Command delimiter test | Reverse shell proof |
| LDAP injection | String concat in filter | Wildcard injection | Directory enumeration |

### XSS Pipeline
| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|-------------------|-------------------|----------------|
| Reflected XSS | No output encoding | `<img onerror>` reflection | Browser JS execution via Playwright |
| Stored XSS | `innerHTML` assignment | Payload stored + retrieved | Cookie theft PoC |
| DOM XSS | `document.write(location)` | Fragment injection | DOM manipulation proof |

### Auth Pipeline
| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|-------------------|-------------------|----------------|
| JWT none | No algorithm validation | Modified JWT accepted | Admin access with forged token |
| Session fixation | No session rotation | Pre-set session reused | Cross-user session hijack |
| Credential stuffing | No rate limiting | 100 attempts unblocked | Valid credential discovery |
| IDOR | No authorization check | Access other user data | Full CRUD on foreign resources |

### SSRF Pipeline
| Attack | Tier 1 (Pattern) | Tier 2 (Payload) | Tier 3 (Full) |
|--------|-------------------|-------------------|----------------|
| Internal URL | User-controlled URL fetch | `http://169.254.169.254` | Cloud metadata extraction |
| DNS rebinding | URL validation bypass | Rebind to internal IP | Internal service access |
| Protocol smuggling | URL scheme not restricted | `file:///etc/passwd` | File content in response |

---

## Agent Coordination

### Orchestration Pattern
```typescript
// Assumption: Task is provided by the agent runtime. Its signature and result
// shapes are not defined in this document; they are declared here only for typing.
declare function Task(name: string, params: Record<string, unknown>, agent: string): Promise<any>;

// Phase 1: Recon (parallel scans)
const phase1Results = await Task("Security Scan", {
  target: "./src",
  layers: { sast: true, dast: true, dependencies: true, secrets: true }
}, "qe-security-scanner");

// Phase 2: Analysis (parallel review)
// Assumes each agent resolves to an array of findings, so the two results are merged.
const phase2Results = (await Promise.all([
  Task("Code Security Review", {
    findings: phase1Results,
    depth: "comprehensive"
  }, "qe-security-reviewer"),

  Task("Compliance Audit", {
    findings: phase1Results,
    frameworks: ["owasp-top-10"]
  }, "qe-security-auditor")
])).flat();

// Phase 3: Validation (graduated exploitation)
const phase3Results = await Task("Exploit Validation", {
  findings: [...phase1Results, ...phase2Results],
  target_url: "https://staging.app.com",
  exploitation_tier: 2,
  vuln_types: ["injection", "xss", "auth", "ssrf"],
  max_cost_usd: 15,
  timeout_minutes: 30
}, "qe-pentest-validator");

// Phase 4: Report ("No Exploit, No Report" gate)
const report = await Task("Security Quality Gate", {
  findings: phase3Results.confirmedFindings,
  gate: "no-exploit-no-report",
  require_poc: true
}, "qe-quality-gate");
```

### Finding Classification
| Status | Meaning | Action |
|--------|---------|--------|
| `confirmed-exploitable` | Exploitation succeeded with PoC | Report with evidence |
| `likely-exploitable` | Partial exploitation, defenses detected | Report with caveats |
| `not-exploitable` | All exploitation attempts failed | Filter from report |
| `inconclusive` | WAF/defense blocked, unclear if vulnerable | Report for manual review |
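
The classification table maps directly onto a lookup plus a filter. A minimal sketch follows. The `Finding` shape, the `ReportAction` labels, and `buildReport` are hypothetical; only the four status strings and their actions come from the table.

```typescript
type FindingStatus =
  | "confirmed-exploitable"
  | "likely-exploitable"
  | "not-exploitable"
  | "inconclusive";

// Hypothetical action labels mirroring the "Action" column.
type ReportAction = "report-with-evidence" | "report-with-caveats" | "filter" | "manual-review";

// Hypothetical finding shape; real findings presumably carry more fields.
interface Finding {
  id: string;
  status: FindingStatus;
}

const ACTION_BY_STATUS: Record<FindingStatus, ReportAction> = {
  "confirmed-exploitable": "report-with-evidence",
  "likely-exploitable": "report-with-caveats",
  "not-exploitable": "filter",
  "inconclusive": "manual-review",
};

// Drops findings whose exploitation attempts all failed and tags the rest with their action.
function buildReport(findings: Finding[]): { finding: Finding; action: ReportAction }[] {
  return findings
    .map((finding) => ({ finding, action: ACTION_BY_STATUS[finding.status] }))
    .filter((entry) => entry.action !== "filter");
}
```

Typing the lookup as `Record<FindingStatus, ReportAction>` makes the compiler flag any status added later without a matching action.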

---

## Exploit Playbook Memory

### Namespace Structure
```
aqe/pentest/
  playbook/
    exploit/{vuln_type}/{tech_stack}/{technique}
    bypass/{defense_type}/{technique}
    payload/{vuln_type}/{variant}
  results/
    validation-{timestamp}
  poc/
    {finding_id}-poc
```
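
Keys in this namespace can be built with a few small helpers so that every agent writes to the same paths. The sketch below is illustrative only. `slug` and `playbookKeys` are hypothetical, the lowercase hyphenated normalization is an assumption, and the ISO 8601 form of `{timestamp}` is also an assumption since the source does not specify a format.

```typescript
// Assumption: path segments are normalized to lowercase, hyphen-separated slugs.
function slug(part: string): string {
  return part
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

const ROOT = "aqe/pentest";

// Hypothetical key builders, one per branch of the namespace tree.
const playbookKeys = {
  exploit: (vulnType: string, techStack: string, technique: string): string =>
    `${ROOT}/playbook/exploit/${slug(vulnType)}/${slug(techStack)}/${slug(technique)}`,
  bypass: (defenseType: string, technique: string): string =>
    `${ROOT}/playbook/bypass/${slug(defenseType)}/${slug(technique)}`,
  payload: (vulnType: string, variant: string): string =>
    `${ROOT}/playbook/payload/${slug(vulnType)}/${slug(variant)}`,
  // Assumption: {timestamp} is rendered as an ISO 8601 string.
  result: (timestamp: Date): string =>
    `${ROOT}/results/validation-${timestamp.toISOString()}`,
  poc: (findingId: string): string => `${ROOT}/poc/${findingId}-poc`,
};
```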

### Learning Loop
1. **Before validation**: Query playbook for known patterns matching findings
2. **During validation**: Try known payloads first (higher success rate)
3. **After validation**: Store new successful patterns with confidence scores
4. **Over time**: Agent converges on most effective payloads per tech stack
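
The loop above can be sketched with an in-memory store. This is a simplified illustration, not the skill's memory backend. The `Playbook` class is hypothetical, and the confidence score is assumed to be the plain success ratio (successes divided by attempts), since the source does not say how it is computed.

```typescript
// Hypothetical playbook entry tracking outcomes for one pattern key.
interface PlaybookEntry {
  attempts: number;
  successes: number;
}

class Playbook {
  private readonly entries = new Map<string, PlaybookEntry>();

  // After validation: record the outcome of one attempt for a pattern key.
  record(key: string, succeeded: boolean): void {
    const entry = this.entries.get(key) ?? { attempts: 0, successes: 0 };
    entry.attempts += 1;
    if (succeeded) entry.successes += 1;
    this.entries.set(key, entry);
  }

  // Assumption: confidence is the observed success ratio, 0 for unknown keys.
  confidence(key: string): number {
    const entry = this.entries.get(key);
    return entry && entry.attempts > 0 ? entry.successes / entry.attempts : 0;
  }

  // Before and during validation: known patterns under a prefix, highest confidence first.
  query(prefix: string): string[] {
    return Array.from(this.entries.keys())
      .filter((key) => key.startsWith(prefix))
      .sort((a, b) => this.confidence(b) - this.confidence(a));
  }
}
```

A plain ratio overweights patterns with very few attempts. A smoothed estimate would be a natural refinement, but that goes beyond what the source describes.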

---

## Cost Optimization

### Estimated Cost by Scenario
| Scenario | Tier Mix | Findings | Est. Cost | Est. Time |
|----------|----------|----------|-----------|-----------|
| PR check (source only) | 100% Tier 1 | 5 | $0 | <5s |
| Sprint validation | 70% T1, 30% T2 | 15 | $2-5 | 5-10 min |
| Release validation | 40% T1, 40% T2, 20% T3 
