Claude
Skills
Sign in
Back

agenta

Included with Lifetime
$97 forever

LLM prompt management and evaluation platform. Version prompts, run A/B tests, evaluate with metrics, and deploy with confidence using Agenta's self-hosted solution.

ai-promptingagentallmprompt-managementevaluationab-testingmlopsself-hostedversioning

What this skill does


# Agenta Skill

> Manage, evaluate, and deploy LLM prompts with confidence. Version control your prompts, run A/B tests, and measure quality with automated evaluation.

## Quick Start

```bash
# Install Agenta SDK
pip install agenta

# Start Agenta locally with Docker
docker run -d -p 3000:3000 -p 8000:8000 ghcr.io/agenta-ai/agenta

# Or use pip for just the SDK
pip install agenta

# Initialize project
agenta init --app-name my-llm-app
```

## When to Use This Skill

**USE when:**
- Managing multiple versions of prompts in production
- Need systematic A/B testing of prompt variations
- Evaluating prompt quality with automated metrics
- Collaborating on prompt development across teams
- Requiring audit trails for prompt changes
- Building LLM applications that need to iterate quickly
- Need to compare different models with same prompts
- Want a playground for rapid prompt experimentation
- Self-hosting is required for security/compliance

**DON'T USE when:**
- Simple single-prompt applications
- No need for prompt versioning or testing
- Already using another prompt management system
- Rapid prototyping without evaluation needs
- Cost-sensitive projects (evaluation adds API calls)

## Prerequisites

```bash
# SDK installation
pip install agenta>=0.10.0

# For self-hosted deployment
docker pull ghcr.io/agenta-ai/agenta

# Or with docker-compose
git clone https://github.com/Agenta-AI/agenta
cd agenta
docker-compose up -d

# Environment setup
export AGENTA_HOST="http://localhost:3000"
export AGENTA_API_KEY="your-api-key"  # If using cloud version

# For LLM providers
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
```

### Verify Installation

```python
import agenta as ag
from agenta import Agenta

# Initialize client
client = Agenta()

# Check connection
print(f"Agenta SDK version: {ag.__version__}")
print("Connection successful!")
```

## Core Capabilities

### 1. Prompt Versioning and Management

**Creating Versioned Prompts:**
```python
"""
Create and manage versioned prompts with Agenta.
"""
import agenta as ag
from agenta import Agenta
from typing import Optional, Dict, Any

# Initialize Agenta
ag.init()

@ag.entrypoint
def generate_summary(
    text: str,
    max_length: int = 100,
    style: str = "professional"
) -> str:
    """
    Generate a summary with versioned prompt.

    Args:
        text: Text to summarize
        max_length: Maximum summary length
        style: Writing style (professional, casual, technical)

    Returns:
        Generated summary
    """
    # Define prompt template (this becomes versioned)
    prompt = f"""Summarize the following text in a {style} tone.
Keep the summary under {max_length} words.

Text: {text}

Summary:"""

    # Call LLM (Agenta tracks this)
    response = ag.llm.complete(
        prompt=prompt,
        model="gpt-4",
        temperature=0.3,
        max_tokens=max_length * 2
    )

    return response.text


# Example usage
text = """
The company reported strong Q3 results with revenue up 25% year-over-year.
Operating margins improved to 18% from 15% in the prior year.
The CEO highlighted expansion into new markets and product launches.
"""

summary = generate_summary(text, max_length=50, style="professional")
print(summary)
```

**Managing Prompt Versions:**
```python
"""
Manage multiple prompt versions programmatically.
"""
import agenta as ag
from agenta import Agenta
from dataclasses import dataclass
from typing import List, Dict, Optional
from datetime import datetime

@dataclass
class PromptVersion:
    """Represents a prompt version."""
    version_id: str
    name: str
    template: str
    parameters: Dict[str, Any]
    created_at: datetime
    is_active: bool = False


class PromptManager:
    """
    Manage prompt versions with Agenta.
    """

    def __init__(self, app_name: str):
        self.app_name = app_name
        self.client = Agenta()

    def create_version(
        self,
        name: str,
        template: str,
        parameters: Dict[str, Any] = None
    ) -> PromptVersion:
        """
        Create a new prompt version.

        Args:
            name: Version name
            template: Prompt template
            parameters: Default parameters

        Returns:
            Created PromptVersion
        """
        # Create variant in Agenta
        variant = self.client.create_variant(
            app_name=self.app_name,
            variant_name=name,
            config={
                "template": template,
                "parameters": parameters or {}
            }
        )

        return PromptVersion(
            version_id=variant.id,
            name=name,
            template=template,
            parameters=parameters or {},
            created_at=datetime.now(),
            is_active=False
        )

    def list_versions(self) -> List[PromptVersion]:
        """List all prompt versions."""
        variants = self.client.list_variants(app_name=self.app_name)

        versions = []
        for v in variants:
            versions.append(PromptVersion(
                version_id=v.id,
                name=v.name,
                template=v.config.get("template", ""),
                parameters=v.config.get("parameters", {}),
                created_at=v.created_at,
                is_active=v.is_default
            ))

        return versions

    def set_active_version(self, version_id: str) -> None:
        """Set a version as the active/default version."""
        self.client.set_default_variant(
            app_name=self.app_name,
            variant_id=version_id
        )

    def get_version(self, version_id: str) -> PromptVersion:
        """Get a specific version."""
        variant = self.client.get_variant(variant_id=version_id)

        return PromptVersion(
            version_id=variant.id,
            name=variant.name,
            template=variant.config.get("template", ""),
            parameters=variant.config.get("parameters", {}),
            created_at=variant.created_at,
            is_active=variant.is_default
        )

    def compare_versions(
        self,
        version_ids: List[str],
        test_input: str
    ) -> Dict[str, str]:
        """
        Compare outputs from multiple versions.

        Args:
            version_ids: List of version IDs to compare
            test_input: Input to test with

        Returns:
            Dictionary mapping version_id to output
        """
        results = {}

        for vid in version_ids:
            version = self.get_version(vid)

            # Format prompt with test input
            prompt = version.template.format(input=test_input)

            # Generate output
            response = ag.llm.complete(prompt=prompt)
            results[vid] = response.text

        return results


# Usage
manager = PromptManager("summarizer-app")

# Create versions
v1 = manager.create_version(
    name="concise-v1",
    template="Summarize briefly: {input}",
    parameters={"max_tokens": 100}
)

v2 = manager.create_version(
    name="detailed-v2",
    template="Provide a comprehensive summary with key points: {input}",
    parameters={"max_tokens": 300}
)

# List all versions
versions = manager.list_versions()
for v in versions:
    print(f"{v.name}: {v.version_id} (active: {v.is_active})")

# Set active version
manager.set_active_version(v1.version_id)
```

### 2. A/B Testing Prompts

**Setting Up A/B Tests:**
```python
"""
Configure and run A/B tests on prompt variations.
"""
import agenta as ag
from agenta import Agenta
from typing import Dict, List, Optional
from dataclasses import dataclass
import random

@dataclass
class ABTestConfig:
    """Configuration for A/B test."""
    name: str
    variants: Dict[str, float]  # variant_id: traffic_percentage
    metrics: List[str]
    min_samples: int = 100


class ABTestRunner:
    """
    Run A/B tests on prompt variants.
    """

    def __init__(self, app_name: str):
        self.app_name = app_name
        self.cli

Related in ai-prompting