qa-team

QA team for outside-in validation, side-by-side parity loops, and A/B behavioral comparison. Use when you need behavior-driven tests, legacy-vs-new comparison, or rollout shadow validation. Creates executable scenarios and parity workflows that agents can observe, compare, and iterate on. Supports local, observable tmux, remote SSH, and shadow-mode divergence logging patterns.

# QA Team Skill

## Purpose [LEVEL 1]

This skill helps you create **agentic outside-in tests** that verify application behavior from an external user's perspective without any knowledge of internal implementation. Using the gadugi-agentic-test framework, you write declarative YAML scenarios that AI agents execute, observe, and validate.

**Key Principle**: Tests describe WHAT should happen, not HOW it's implemented. Agents figure out the execution details.

## When to Use This Skill [LEVEL 1]

### Perfect For

- **Smoke Tests**: Quick validation that critical user flows work
- **Behavior-Driven Testing**: Verify features from user perspective
- **Cross-Platform Testing**: Same test logic for CLI, TUI, Web, Electron
- **Refactoring Safety**: Tests remain valid when implementation changes
- **AI-Powered Testing**: Let agents handle complex interactions
- **Documentation as Tests**: YAML scenarios double as executable specs

### Use This Skill When

- You are starting a new project and defining expected behaviors
- You are refactoring code and need tests that won't break with internal changes
- You are testing user-facing applications (CLI tools, TUIs, web apps, desktop apps)
- You are writing acceptance criteria that can be automatically verified
- You need tests that non-developers can read and understand
- You want to catch regressions in critical user workflows
- You are testing complex multi-step interactions

### Don't Use This Skill When

- You need unit tests for internal functions (use test-gap-analyzer instead)
- You are testing performance or load characteristics
- You need precise timing or concurrency control
- You are testing non-interactive batch processes
- Implementation details matter more than behavior

## Core Concepts [LEVEL 1]

### Outside-In Testing Philosophy

**Traditional Inside-Out Testing**:

```python
# Tightly coupled to implementation
from calculator import Calculator  # hypothetical module under test


def test_calculator_add():
    calc = Calculator()
    result = calc.add(2, 3)
    assert result == 5
    assert calc.history == [(2, 3, 5)]  # Knows internal state
```

**Agentic Outside-In Testing**:

```yaml
# Implementation-agnostic behavior verification
scenario:
  name: "Calculator Addition"
  steps:
    - action: launch
      target: "./calculator"
    - action: send_input
      value: "add 2 3"
    - action: verify_output
      contains: "Result: 5"
```

**Benefits**:

- Tests survive refactoring (internal changes don't break tests)
- Readable by non-developers (YAML is declarative)
- Platform-agnostic (same structure for CLI/TUI/Web/Electron)
- AI agents handle complexity (navigation, timing, screenshots)
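
To make the contrast concrete, here is a minimal Python sketch of what an outside-in check boils down to: launch the program, send input, and inspect only what a user could see. It is an illustration, not the framework's implementation. The inline `CALCULATOR_SOURCE` program is a hypothetical stand-in for `./calculator`, and `run_outside_in` is a name invented for this example.

```python
import subprocess
import sys

# Hypothetical stand-in for the "./calculator" binary: a tiny inline program
# that reads "add A B" from stdin and prints the sum.
CALCULATOR_SOURCE = (
    "import sys\n"
    "op, a, b = sys.stdin.readline().split()\n"
    "print(f'Result: {int(a) + int(b)}')\n"
)


def run_outside_in(command, user_input):
    """Launch the program, send input, and return (stdout, exit code)."""
    proc = subprocess.run(
        command,
        input=user_input,
        capture_output=True,
        text=True,
        timeout=10,
    )
    return proc.stdout, proc.returncode


stdout, exit_code = run_outside_in(
    [sys.executable, "-c", CALCULATOR_SOURCE], "add 2 3\n"
)
# Only externally visible behavior is checked; no internal state is inspected.
print("Result: 5" in stdout, exit_code)
```

Because the check never touches `Calculator` internals, swapping the stand-in for a rewritten implementation would leave the check unchanged.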

### The Gadugi Agentic Test Framework [LEVEL 2]

Gadugi-agentic-test is a Node.js framework (it is built with `npm` and run with `node`) that:

1. **Parses YAML test scenarios** with declarative steps
2. **Dispatches to specialized agents** (CLI, TUI, Web, Electron agents)
3. **Executes actions** (launch, input, click, wait, verify)
4. **Collects evidence** (screenshots, logs, output captures)
5. **Validates outcomes** against expected results
6. **Generates reports** with evidence trails

**Architecture**:

```
YAML Scenario → Scenario Loader → Agent Dispatcher → Execution Engine
                                          ↓
                     [CLI Agent, TUI Agent, Web Agent, Electron Agent]
                                          ↓
                           Observers → Comprehension Agent
                                          ↓
                                   Evidence Report
```
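
The dispatch step in the diagram can be pictured as a lookup from scenario type to agent. The Python sketch below shows that idea only. `Scenario`, `cli_agent`, `web_agent`, `AGENTS`, and `dispatch` are hypothetical names for this illustration and are not taken from the framework's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Scenario:
    name: str
    type: str  # one of: cli, tui, web, electron
    steps: List[dict] = field(default_factory=list)


# Hypothetical agent handlers; each returns a list of evidence strings.
def cli_agent(scenario: Scenario) -> List[str]:
    return [f"cli:{step['action']}" for step in scenario.steps]


def web_agent(scenario: Scenario) -> List[str]:
    return [f"web:{step['action']}" for step in scenario.steps]


# Registry mapping a scenario type to the agent that handles it.
AGENTS: Dict[str, Callable[[Scenario], List[str]]] = {
    "cli": cli_agent,
    "web": web_agent,
}


def dispatch(scenario: Scenario) -> List[str]:
    """Route a scenario to the agent registered for its type."""
    try:
        agent = AGENTS[scenario.type]
    except KeyError:
        raise ValueError(f"no agent registered for type {scenario.type!r}") from None
    return agent(scenario)


evidence = dispatch(
    Scenario(
        "Calculator Addition",
        "cli",
        [{"action": "launch"}, {"action": "verify_output"}],
    )
)
print(evidence)
```

A registry like this keeps scenario files independent of the agent that runs them, which is what lets the same YAML structure serve several interface types.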

### Progressive Disclosure Levels [LEVEL 1]

This skill teaches testing in four levels:

- **Level 1: Fundamentals** - Basic single-action tests, simple verification
- **Level 2: Intermediate** - Multi-step flows, conditional logic, error handling
- **Level 3: Advanced** - Custom agents, visual regression, performance validation
- **Level 4: Parity & Shadowing** - Side-by-side A/B comparison, remote observable runs, rollout divergence logging

Each example is marked with its level. Start at Level 1 and progress as needed.

## Side-by-Side Parity and A/B Validation [LEVEL 4]

QA Team is the new name for the skill formerly called `outside-in-testing`. Use it for standard outside-in scenarios **and** for parity loops where you must compare a legacy implementation to a replacement, or compare approach A to approach B, as an external user would observe them.

### Use QA Team for parity work when

- migrating Python to Rust, old CLI to new CLI, or v1 to v2 behavior
- validating a rewrite before switching defaults
- comparing branch A vs branch B using the same user scenarios
- running observable side-by-side sessions in paired virtual TTYs
- logging rollout divergences in shadow mode without failing the run

### Recommended parity loop

1. Define shared user-facing scenarios first.
2. Run both implementations in isolated sandboxes.
3. Compare stdout, stderr, exit code, JSON outputs, and filesystem side effects.
4. Re-run in `--observable` mode when you need paired tmux panes for debugging.
5. Use `--ssh-target <host>` when parity must be validated on a remote host such as `azlin`.
6. Use `--shadow-mode --shadow-log <file>` during rollout to log divergences without blocking execution.
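
Steps 2 and 3 of the loop can be sketched in a few lines of Python: run each implementation in its own temporary directory, capture what a user could observe, and list the fields that differ. This is a minimal sketch under stated assumptions, not the parity harness itself. `observe`, `diff_observations`, and the two inline stand-in programs are hypothetical, and the file snapshot assumes small text files.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def observe(command, stdin_text=""):
    """Run a command in a fresh temp directory and capture what a user could see."""
    with tempfile.TemporaryDirectory() as sandbox:
        proc = subprocess.run(
            command, input=stdin_text, capture_output=True, text=True,
            cwd=sandbox, timeout=30,
        )
        # Filesystem side effects: relative path -> file contents.
        files = {
            str(p.relative_to(sandbox)): p.read_text()
            for p in sorted(Path(sandbox).rglob("*")) if p.is_file()
        }
    return {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "exit_code": proc.returncode,
        "files": files,
    }


def diff_observations(legacy, candidate):
    """Return the names of the fields where the two runs diverge."""
    return [key for key in legacy if legacy[key] != candidate[key]]


# Hypothetical stand-ins for the legacy and new implementations.
LEGACY = [sys.executable, "-c",
          "open('out.txt', 'w').write('5'); print('Result: 5')"]
CANDIDATE = [sys.executable, "-c",
             "open('out.txt', 'w').write('5'); print('Result: 5.0')"]

divergences = diff_observations(observe(LEGACY), observe(CANDIDATE))
print(divergences)
```

Here the two stand-ins write the same file and exit with the same code, so only the `stdout` field is reported as divergent.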

### Command pattern to reuse

If the repo already has a parity harness, extend it instead of inventing a second one. A good baseline is:

```bash
python tests/parity/validate_cli_parity.py \
  --scenario tests/parity/scenarios/feature.yaml \
  --python-repo /path/to/legacy-repo \
  --rust-binary /path/to/new-binary \
  --observable
```

For remote parity:

```bash
python tests/parity/validate_cli_parity.py \
  --ssh-target azlin \
  --scenario tests/parity/scenarios/feature.yaml \
  --python-repo /remote/path/to/legacy-repo \
  --rust-binary /remote/path/to/new-binary
```

For rollout shadow logging:

```bash
python tests/parity/validate_cli_parity.py \
  --scenario tests/parity/scenarios/feature.yaml \
  --python-repo /path/to/legacy-repo \
  --rust-binary /path/to/new-binary \
  --shadow-mode \
  --shadow-log /tmp/feature-shadow.jsonl
```
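
The source does not specify the layout of the shadow log. As a rough illustration of "log divergences without blocking execution", the Python sketch below appends one JSON object per divergence to a `.jsonl` file and always returns the legacy result. The record fields (`timestamp`, `scenario`, `field`, `legacy`, `candidate`) and the function names are assumptions made for this example, not the harness's actual format.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_divergence(shadow_log, scenario, field, legacy_value, candidate_value):
    """Append one divergence record as a JSON line (assumed record layout)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "scenario": scenario,
        "field": field,
        "legacy": legacy_value,
        "candidate": candidate_value,
    }
    with open(shadow_log, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def shadow_compare(shadow_log, scenario, legacy, candidate):
    """Log each differing field and return the legacy result unchanged."""
    for field in legacy:
        if legacy[field] != candidate.get(field):
            log_divergence(
                shadow_log, scenario, field, legacy[field], candidate.get(field)
            )
    return legacy  # the legacy behavior stays authoritative during rollout


log_path = Path(tempfile.mkdtemp()) / "feature-shadow.jsonl"
result = shadow_compare(
    log_path,
    "feature",
    {"stdout": "Result: 5\n", "exit_code": 0},
    {"stdout": "Result: 5.0\n", "exit_code": 0},
)
records = [
    json.loads(line)
    for line in log_path.read_text(encoding="utf-8").splitlines()
]
print(len(records), records[0]["field"])
```

An append-only line-per-record file is convenient here because each run adds entries without rewriting earlier ones, and the log can be filtered line by line afterwards.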

## Quick Start [LEVEL 1]

### Installation

**Prerequisites (for native module compilation):**

```bash
# macOS
xcode-select --install

# Ubuntu/Debian
sudo apt-get install -y build-essential python3

# Windows: Install Visual Studio Build Tools with "Desktop development with C++"
```

**Install the framework:**

The gadugi-agentic-test framework is not published to npm. Install it from GitHub:

```bash
# Clone the repository
git clone https://github.com/rysweet/gadugi-agentic-test.git
cd gadugi-agentic-test

# Install dependencies and build
npm install
npm run build

# Verify the build succeeded
node dist/cli.js --version
```

> **Tip**: If you want CLI-style access from anywhere, you can add an alias:
>
> ```bash
> alias gadugi-test="node /path/to/gadugi-agentic-test/dist/cli.js"
> ```

### Your First Test (CLI Example)

Create `test-hello.yaml`:

```yaml
scenario:
  name: "Hello World CLI Test"
  description: "Verify CLI prints greeting"
  type: cli

  prerequisites:
    - "./hello-world executable exists"

  steps:
    - action: launch
      target: "./hello-world"

    - action: verify_output
      contains: "Hello, World!"

    - action: verify_exit_code
      expected: 0
```

Run the test from the `gadugi-agentic-test` directory, where `dist/cli.js` was built:

```bash
node dist/cli.js run -s test-hello -d ./
```

Output:

```
✓ Scenario: Hello World CLI Test
  ✓ Step 1: Launched ./hello-world
  ✓ Step 2: Output contains "Hello, World!"
  ✓ Step 3: Exit code is 0

PASSED (3/3 steps successful)
Evidence saved to: ./evidence/test-hello-20250116-093045/
```
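
To see how the three steps relate to the pass/fail lines in that output, here is a deliberately simplified step interpreter in Python. It is a sketch, not how the framework executes scenarios: the scenario is written as a plain dict to avoid a YAML dependency, the `launch` target is a hypothetical stand-in for `./hello-world`, and `launch` simply runs the program to completion.

```python
import subprocess
import sys

# The scenario from test-hello.yaml, written as a plain dict so the sketch
# needs no YAML parser. The target is a hypothetical stand-in for ./hello-world.
SCENARIO = {
    "name": "Hello World CLI Test",
    "steps": [
        {"action": "launch",
         "target": [sys.executable, "-c", "print('Hello, World!')"]},
        {"action": "verify_output", "contains": "Hello, World!"},
        {"action": "verify_exit_code", "expected": 0},
    ],
}


def run_scenario(scenario):
    """Execute steps in order and return a list of (action, passed) pairs."""
    results = []
    proc = None
    for step in scenario["steps"]:
        action = step["action"]
        if action == "launch":
            proc = subprocess.run(
                step["target"], capture_output=True, text=True, timeout=30
            )
            passed = True
        elif action == "verify_output":
            passed = proc is not None and step["contains"] in proc.stdout
        elif action == "verify_exit_code":
            passed = proc is not None and proc.returncode == step["expected"]
        else:
            passed = False  # unknown actions fail rather than pass silently
        results.append((action, passed))
    return results


results = run_scenario(SCENARIO)
passed_count = sum(passed for _, passed in results)
print(f"{passed_count}/{len(results)} steps successful")
```

Each step only reads the launched process's output and exit code, which is why the same scenario would keep passing after an internal rewrite of the program.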

### Understanding the YAML Structure [LEVEL 1]

Every test scenario has this structure:

```yaml
scenario:
  name: "Descriptive test name"
  description: "What this test verifies"
  type: cli | tui | web | electron

  # Optional metadata
  tags: [smoke, critical, auth]
  timeout: 30s

  # What must be true before test runs
  prerequisites:
    - "Condition 1"
    - "Condition 2"

  # The test steps (executed sequentially)
  steps:
    - action: action_name
      parameter1: value1
      parameter2: val
```

Source: https://github.com/rysweet/amplihack/tree/main/amplifier-bundle/skills/qa-team
