codspeed-setup-harness


Set up performance benchmarks and CodSpeed harness for a project. Use this skill whenever the user wants to create benchmarks, add performance tests, set up CodSpeed, configure codspeed.yml, integrate a benchmarking framework (criterion, divan, pytest-benchmark, vitest bench, go test -bench, google benchmark), or when the user says 'add benchmarks', 'set up perf tests', 'create a benchmark', 'benchmark this', or wants to measure performance of their code for the first time. Also trigger when the optimize skill needs benchmarks that don't exist yet.



# Setup Harness

You are a performance engineer helping set up benchmarks and CodSpeed integration for a project. Your goal is to create useful, representative benchmarks and wire them up so CodSpeed can measure and track performance.

## Step 1: Analyze the project

Before writing any benchmark code, understand what you're working with:

1. **Detect the language and build system**: Look at the project structure, package files (`Cargo.toml`, `package.json`, `pyproject.toml`, `go.mod`, `CMakeLists.txt`), and source files.

2. **Identify existing benchmarks**: Check for benchmark files, `codspeed.yml`, CI workflows mentioning CodSpeed or benchmarks.

3. **Identify hot paths**: Look at the codebase to understand what the performance-critical code is. Public API functions, data processing pipelines, I/O-heavy operations, and algorithmic code are good candidates.

4. **Check CodSpeed auth**: Ensure `codspeed auth login` has been run.
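The detection in items 1 and 2 can be sketched as a small script. This is a minimal sketch, assuming the manifests sit at the project root; the function names `detect_languages` and `has_codspeed_config` are hypothetical helpers, not part of any CodSpeed tooling.

```python
from pathlib import Path

# Manifest file -> language, using the package files listed in step 1.
MANIFESTS = {
    "Cargo.toml": "rust",
    "package.json": "node",
    "pyproject.toml": "python",
    "go.mod": "go",
    "CMakeLists.txt": "c/c++",
}


def detect_languages(project_root):
    """Return the sorted languages whose manifest exists at the top of project_root."""
    root = Path(project_root)
    return sorted(lang for name, lang in MANIFESTS.items() if (root / name).is_file())


def has_codspeed_config(project_root):
    """Report whether a codspeed.yml already exists at the project root."""
    return (Path(project_root) / "codspeed.yml").is_file()
```

A monorepo may keep manifests in subdirectories, so a real check would likely need to walk the tree rather than look only at the root.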

## Step 2: Choose the right approach

Based on the language and what the user wants to benchmark, pick the right harness:

### Language-specific harnesses (recommended when available)

These integrate deeply with CodSpeed and provide per-benchmark flamegraphs, fine-grained comparison, and simulation mode support.

| Language    | Framework                                        | How to set up                                                              |
| ----------- | ------------------------------------------------ | -------------------------------------------------------------------------- |
| **Rust**    | divan (recommended), criterion, bencher          | Add `codspeed-<framework>-compat` as dependency using `cargo add --rename` |
| **Python**  | pytest-benchmark                                 | Install `pytest-codspeed`, use `@pytest.mark.benchmark` or `benchmark` fixture |
| **Node.js** | vitest (recommended), tinybench v5, benchmark.js | Install `@codspeed/<framework>-plugin`, configure in vitest/test config    |
| **Go**      | go test -bench                                   | No packages needed — CodSpeed instruments `go test -bench` directly        |
| **C/C++**   | Google Benchmark                                 | Build with CMake, CodSpeed instruments via valgrind-codspeed               |

### Exec harness (universal)

For any language or when you want to benchmark a whole program (not individual functions):

- Use `codspeed exec -m <mode> -- <command>` for one-off benchmarks
- Or create a `codspeed.yml` with benchmark definitions for repeatable setups

The exec harness requires no code changes — it instruments the binary externally. This is ideal for:

- Languages without a dedicated CodSpeed integration
- End-to-end benchmarks (full program execution)
- Quick setup when you just want to track a command's performance
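If the exec harness is driven from a script, the `codspeed exec -m <mode> -- <command>` form above can be assembled as an argument list. This is a sketch under assumptions: `codspeed_exec_argv` is a hypothetical helper, and it only accepts the two mode spellings that appear as `-m` values in this document.

```python
import shlex

# Only the -m values shown in this document; the flag spelling for
# memory mode is not shown here, so it is deliberately left out.
KNOWN_MODES = ("simulation", "walltime")


def codspeed_exec_argv(mode, command):
    """Build the argv for `codspeed exec -m <mode> -- <command>`."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # shlex.split keeps quoted arguments together, like a shell would.
    return ["codspeed", "exec", "-m", mode, "--", *shlex.split(command)]
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a machine where the `codspeed` CLI is installed and authenticated.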

### Choosing simulation vs walltime mode

- **Simulation** (default for Rust, Python, Node.js, C/C++): Deterministic CPU simulation, <1% variance, automatic flamegraphs. Best for CPU-bound code. Does not measure system calls or I/O.
- **Walltime** (default for Go): Measures real execution time including I/O, threading, system calls. Best for I/O-heavy or multi-threaded code. Requires consistent hardware (use CodSpeed Macro Runners in CI).
- **Memory**: Tracks heap allocations. Best for reducing memory usage. Supported for Rust, C/C++ with libc/jemalloc/mimalloc.
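The defaults and trade-offs above can be summarized as a small decision helper. This is an illustrative sketch, not CodSpeed logic: `pick_mode` and its parameters are hypothetical, and the fallback to walltime for other languages is an assumption based on the exec harness examples in this document.

```python
# Per-language defaults as stated in the list above.
DEFAULT_MODE = {
    "rust": "simulation",
    "python": "simulation",
    "node": "simulation",
    "c/c++": "simulation",
    "go": "walltime",
}


def pick_mode(language, io_heavy=False, multi_threaded=False):
    """Suggest a measurement mode for a benchmark suite."""
    # Simulation does not measure system calls or I/O, so workloads that
    # depend on them are better served by walltime.
    if io_heavy or multi_threaded:
        return "walltime"
    # Assumption: languages without a dedicated integration fall back to walltime.
    return DEFAULT_MODE.get(language, "walltime")
```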

## Step 3: Set up the harness

### Rust with divan (recommended)

1. Add the dependency:

```bash
cargo add divan
cargo add codspeed-divan-compat --rename divan --dev
```

2. Create a benchmark file in `benches/`:

```rust
// benches/my_bench.rs

fn main() {
    divan::main();
}

#[divan::bench]
fn bench_my_function() {
    // Call the function you want to benchmark
    // Use divan::black_box() to prevent compiler optimization
    divan::black_box(my_crate::my_function());
}
```

3. Add to `Cargo.toml`:

```toml
[[bench]]
name = "my_bench"
harness = false
```

4. Build and run:

```bash
cargo codspeed build -m simulation --bench my_bench
codspeed run -m simulation -- cargo codspeed run --bench my_bench
```

### Rust with criterion

1. Add dependencies:

```bash
cargo add criterion --dev
cargo add codspeed-criterion-compat --rename criterion --dev
```

2. Create benchmark in `benches/`:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_my_function(c: &mut Criterion) {
    c.bench_function("my_function", |b| {
        b.iter(|| my_crate::my_function())
    });
}

criterion_group!(benches, bench_my_function);
criterion_main!(benches);
```

3. Add the same `[[bench]]` entry (with `harness = false`) to `Cargo.toml`, then build and run with the same `cargo codspeed` commands as for divan.

### Python with pytest-codspeed

1. Install:

```bash
pip install pytest-codspeed
# or
uv add --dev pytest-codspeed
```

2. Create benchmark tests:

```python
# tests/test_benchmarks.py
import my_module  # placeholder: the module under test

def test_my_function(benchmark):
    # arg1 and arg2 are placeholders for realistic inputs
    result = benchmark(my_module.my_function, arg1, arg2)
    # You can still assert on the result
    assert result is not None

# Or using the pedantic API for setup/teardown:
def test_with_setup(benchmark):
    data = prepare_data()  # placeholder setup, runs outside the measurement
    benchmark.pedantic(my_module.process, args=(data,), rounds=100)

3. Run:

```bash
codspeed run -m simulation -- pytest --codspeed
```

### Node.js with vitest (recommended)

1. Install:

```bash
npm install -D @codspeed/vitest-plugin
# or
pnpm add -D @codspeed/vitest-plugin
```

2. Configure vitest (`vitest.config.ts`):

```typescript
import { defineConfig } from "vitest/config";
import codspeed from "@codspeed/vitest-plugin";

export default defineConfig({
  plugins: [codspeed()],
});
```

3. Create benchmark file:

```typescript
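// bench/my.bench.ts
import { bench, describe } from "vitest";
// Placeholder import: point this at the module you want to measure
import { myFunction } from "../src/my-module";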

describe("my module", () => {
  bench("my function", () => {
    myFunction();
  });
});
```

4. Run:

```bash
codspeed run -m simulation -- npx vitest bench
```

### Go

No packages needed — CodSpeed instruments `go test -bench` directly.

1. Create benchmark tests:

```go
// my_test.go
package mypkg // placeholder: use the package that defines MyFunction

import "testing"

func BenchmarkMyFunction(b *testing.B) {
    for i := 0; i < b.N; i++ {
        MyFunction()
    }
}
```

2. Run (walltime is the default for Go):

```bash
codspeed run -m walltime -- go test -bench . ./...
```

### C/C++ with Google Benchmark

1. Install Google Benchmark (via CMake FetchContent or system package)

2. Create benchmark:

```cpp
#include <benchmark/benchmark.h>
#include "my_library.h"  // placeholder: header that declares MyFunction

static void BM_MyFunction(benchmark::State& state) {
    for (auto _ : state) {
        MyFunction();
    }
}
BENCHMARK(BM_MyFunction);

BENCHMARK_MAIN();
```

3. Build and run with CodSpeed:

```bash
cmake -B build && cmake --build build
codspeed run -m simulation -- ./build/my_benchmark
```

### Exec harness (any language)

For benchmarking whole programs without code changes:

1. Create `codspeed.yml`:

```yaml
$schema: https://raw.githubusercontent.com/CodSpeedHQ/codspeed/refs/heads/main/schemas/codspeed.schema.json

options:
  warmup-time: 1s
  max-time: 5s

benchmarks:
  - name: "My program - small input"
    exec: ./my_binary --input small.txt

  - name: "My program - large input"
    exec: ./my_binary --input large.txt
    options:
      max-time: 30s
```

2. Run:

```bash
codspeed run -m walltime
```

Or for a one-off:

```bash
codspeed exec -m walltime -- ./my_binary --input data.txt
```

## Step 4: Write good benchmarks

Good benchmarks are representative, isolated, and stable. Here are guidelines:

- **Benchmark real workloads**: Use realistic input data and sizes. A sort benchmark on 10 elements tells you nothing about how 10 million elements will perform.

- **Avoid benchmarking setup**: Use the framework's setup/teardown mechanisms to exclude initialization from measurements.

- **Prevent dead code elimination**: Use `black_box()` (Rust), `benchmark::DoNotOptimize` (C++), or `Blackhole.consume` (JMH) so the compiler doesn't optimize away unused results.

- **Cover the critical path**: Benchmark the functions that matter most to your users — the ones called frequently or on the h
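The first two guidelines can be illustrated outside any CodSpeed harness with Python's standard `timeit` module. This is a minimal sketch, assuming a sort workload as a stand-in for the real code under test; `make_input` and `time_sort` are hypothetical names.

```python
import random
import timeit


def make_input(n, seed=0):
    """Build a reproducible input so every run measures the same work."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def time_sort(n, repeats=5):
    """Return the best wall-clock time in seconds for sorting n elements."""
    data = make_input(n)  # setup happens once, outside the timed region
    # sorted() returns a new list and leaves data untouched, so every
    # repeat sees the same unsorted input.
    timer = timeit.Timer(lambda: sorted(data))
    return min(timer.repeat(repeat=repeats, number=1))


if __name__ == "__main__":
    for size in (10, 100_000):
        print(f"n={size}: {time_sort(size):.6f}s")
```

Timing both a tiny and a realistic size makes the point of the first guideline concrete: the small run is dominated by call overhead and says little about the large one.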


Source: https://github.com/CodSpeedHQ/codspeed/tree/main/skills/codspeed-setup-harness
