Claude
Skills
Sign in
Back

parallel-batch-executor

Included with Lifetime
$97 forever

Parallel task execution patterns for 300% performance gains

bashbashparallelxargsperformancebatchoptimization

What this skill does


# Parallel Batch Executor

Patterns for executing tasks in parallel using bash, delivering up to 300% performance improvements. Extracted from workspace-hub's batchtools and orchestration scripts.

## When to Use This Skill

✅ **Use when:**
- Processing multiple independent files or items
- Running the same operation across multiple repositories
- Batch operations that don't depend on each other
- Need significant performance improvements
- Operations are I/O bound rather than CPU bound

❌ **Avoid when:**
- Tasks have dependencies on each other
- Order of execution matters
- Shared resources require synchronization
- Single task that can't be parallelized

## Core Capabilities

### 1. Basic Parallel Execution with xargs

The fundamental pattern for parallel execution:

```bash
#!/bin/bash
# ABOUTME: Basic parallel execution using xargs
# ABOUTME: Process multiple items concurrently with controlled parallelism

PARALLEL="${PARALLEL:-5}"  # Default to 5 parallel workers

# Process items from stdin in parallel
cat items.txt | xargs -I {} -P "$PARALLEL" bash -c 'echo "Processing: {}"'

# Process with error handling
cat items.txt | xargs -I {} -P "$PARALLEL" bash -c '
    item="{}"
    if process_item "$item"; then
        echo "✓ $item"
    else
        echo "✗ $item" >&2
    fi
'
```

### 2. JSON Array Processing

Process JSON arrays in parallel (from batch_runner.sh):

```bash
#!/bin/bash
# ABOUTME: Process JSON array items in parallel
# ABOUTME: Uses jq for parsing and xargs for parallel execution

set -e

PARALLEL="${1:-5}"
ORCHESTRATOR="./scripts/routing/orchestrate.sh"

# Check dependencies
if ! command -v jq &> /dev/null; then
    echo "Error: jq is not installed."
    exit 1
fi

echo "Starting batch execution with $PARALLEL parallel workers..."

# Read JSON array from stdin, extract items, process in parallel
jq -r '.[]' | xargs -I {} -P "$PARALLEL" bash -c "$ORCHESTRATOR \"{}\" > /dev/null"

echo "Batch execution complete."
```

### 3. Repository Batch Operations

Execute commands across multiple repositories:

```bash
#!/bin/bash
# ABOUTME: Execute operations across multiple repositories in parallel
# ABOUTME: Pattern from workspace-hub repository_sync

PARALLEL="${PARALLEL:-5}"
REPOS_DIR="/mnt/github"

# Get list of repositories
get_repos() {
    find "$REPOS_DIR" -maxdepth 1 -type d -name "[!.]*" | sort
}

# Execute command in each repository in parallel
batch_repo_command() {
    local command="$1"
    local repos
    repos=$(get_repos)

    echo "$repos" | xargs -I {} -P "$PARALLEL" bash -c "
        repo=\"{}\"
        repo_name=\$(basename \"\$repo\")

        if cd \"\$repo\" 2>/dev/null; then
            result=\$($command 2>&1)
            exit_code=\$?

            if [[ \$exit_code -eq 0 ]]; then
                echo \"✓ \$repo_name: \$result\"
            else
                echo \"✗ \$repo_name: \$result\" >&2
            fi
        else
            echo \"⊘ \$repo_name: Directory not accessible\" >&2
        fi
    "
}

# Usage examples
batch_repo_command "git status --porcelain | head -1"
batch_repo_command "git pull --rebase"
batch_repo_command "git push"
```

### 4. Progress Tracking

Track progress during parallel execution:

```bash
#!/bin/bash
# ABOUTME: Parallel execution with progress tracking
# ABOUTME: Shows real-time progress and summary statistics

PARALLEL="${PARALLEL:-5}"
PROGRESS_FILE=$(mktemp)
TOTAL=0
SUCCESS=0
FAILED=0

# Initialize progress file
echo "0" > "$PROGRESS_FILE"

# Track progress atomically
track_progress() {
    local status="$1"
    local item="$2"

    # Use flock for atomic updates
    (
        flock -x 200
        local current=$(cat "$PROGRESS_FILE")
        echo $((current + 1)) > "$PROGRESS_FILE"

        if [[ "$status" == "success" ]]; then
            echo "S" >> "${PROGRESS_FILE}.status"
        else
            echo "F" >> "${PROGRESS_FILE}.status"
        fi
    ) 200>"${PROGRESS_FILE}.lock"
}

process_with_tracking() {
    local items=("$@")
    local total=${#items[@]}

    echo "Processing $total items with $PARALLEL workers..."
    echo ""

    printf '%s\n' "${items[@]}" | xargs -I {} -P "$PARALLEL" bash -c "
        item=\"{}\"

        # Process item
        if process_item \"\$item\" 2>/dev/null; then
            echo \"success \$item\"
        else
            echo \"failed \$item\"
        fi
    " | while read status item; do
        track_progress "$status" "$item"

        local current=$(cat "$PROGRESS_FILE")
        local pct=$((current * 100 / total))

        # Update progress line
        printf \"\\r[%3d%%] Processed %d/%d items\" \$pct \$current \$total
    done

    echo ""
    echo "Complete!"

    # Show summary
    local success=$(grep -c "S" "${PROGRESS_FILE}.status" 2>/dev/null || echo 0)
    local failed=$(grep -c "F" "${PROGRESS_FILE}.status" 2>/dev/null || echo 0)
    echo "Success: $success, Failed: $failed"

    # Cleanup
    rm -f "$PROGRESS_FILE" "${PROGRESS_FILE}.lock" "${PROGRESS_FILE}.status"
}
```

### 5. Error Collection

Collect and report errors from parallel execution:

```bash
#!/bin/bash
# ABOUTME: Parallel execution with centralized error collection
# ABOUTME: Aggregates errors for reporting after batch completion

ERROR_LOG=$(mktemp)
SUCCESS_LOG=$(mktemp)

cleanup() {
    rm -f "$ERROR_LOG" "$SUCCESS_LOG"
}
trap cleanup EXIT

parallel_with_errors() {
    local command="$1"
    shift
    local items=("$@")

    printf '%s\n' "${items[@]}" | xargs -I {} -P "$PARALLEL" bash -c "
        item=\"{}\"
        output=\$($command \"\$item\" 2>&1)
        exit_code=\$?

        if [[ \$exit_code -eq 0 ]]; then
            echo \"\$item\" >> \"$SUCCESS_LOG\"
        else
            echo \"\$item: \$output\" >> \"$ERROR_LOG\"
        fi
    "

    # Report results
    local success_count=$(wc -l < "$SUCCESS_LOG" 2>/dev/null || echo 0)
    local error_count=$(wc -l < "$ERROR_LOG" 2>/dev/null || echo 0)

    echo ""
    echo "Results: $success_count succeeded, $error_count failed"

    if [[ $error_count -gt 0 ]]; then
        echo ""
        echo "Errors:"
        cat "$ERROR_LOG" | while read line; do
            echo "  ✗ $line"
        done
        return 1
    fi

    return 0
}
```

### 6. GNU Parallel Alternative

For more advanced parallelism:

```bash
#!/bin/bash
# ABOUTME: Advanced parallel execution using GNU Parallel
# ABOUTME: Provides job control, logging, and resume capabilities

# Check for GNU Parallel
if command -v parallel &> /dev/null; then
    USE_GNU_PARALLEL=true
else
    USE_GNU_PARALLEL=false
fi

parallel_execute() {
    local command="$1"
    local jobs="${2:-5}"

    if [[ "$USE_GNU_PARALLEL" == true ]]; then
        # GNU Parallel with job log
        parallel --jobs "$jobs" \
                 --joblog /tmp/parallel_joblog.txt \
                 --bar \
                 "$command" {}
    else
        # Fallback to xargs
        xargs -I {} -P "$jobs" bash -c "$command \"{}\""
    fi
}

# Usage
cat files.txt | parallel_execute "process_file" 10
```

## Complete Example: Batch Task Runner

Full implementation from workspace-hub:

```bash
#!/bin/bash
# ABOUTME: Batch Task Executor
# ABOUTME: Executes tasks in parallel using the Multi-Provider Orchestrator

set -e

# ─────────────────────────────────────────────────────────────────
# Configuration
# ─────────────────────────────────────────────────────────────────

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ORCHESTRATOR="$SCRIPT_DIR/../routing/orchestrate.sh"
PARALLEL=1
LOG_DIR="$SCRIPT_DIR/logs"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m'

# ─────────────────────────────────────────────────────────────────
# Functions
# ─────────────────────────────────────────────────────────────────

log_info()  { echo -e "${GREEN}[INFO]${NC} $*"; }
log_error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }
die()       { log_error "$1"; exit 1; }

show_usage() {
    cat << EOF
Usage: $0 [OPTIONS] < tasks.json

Executes tasks in p

Related in bash