agenta
LLM prompt management and evaluation platform. Version prompts, run A/B tests, evaluate with metrics, and deploy with confidence using Agenta's self-hosted solution.
Tags: ai-prompting, agenta, llm, prompt-management, evaluation, ab-testing, mlops, self-hosted, versioning
# Agenta Skill
> Manage, evaluate, and deploy LLM prompts with confidence. Version control your prompts, run A/B tests, and measure quality with automated evaluation.
## Quick Start
```bash
# Install Agenta SDK
pip install agenta
# Start Agenta locally with Docker (optional if you only need the SDK)
docker run -d -p 3000:3000 -p 8000:8000 ghcr.io/agenta-ai/agenta
# Initialize project
agenta init --app-name my-llm-app
```
## When to Use This Skill
**USE when:**
- Managing multiple versions of prompts in production
- Running systematic A/B tests on prompt variations
- Evaluating prompt quality with automated metrics
- Collaborating on prompt development across teams
- Requiring audit trails for prompt changes
- Building LLM applications that require fast iteration
- Comparing different models on the same prompts
- Using a playground for rapid prompt experimentation
- Self-hosting for security or compliance reasons
**DON'T USE when:**
- Simple single-prompt applications
- No need for prompt versioning or testing
- Already using another prompt management system
- Rapid prototyping without evaluation needs
- Cost-sensitive projects (evaluation adds API calls)
## Prerequisites
```bash
# SDK installation
# Quote the requirement so the shell does not treat ">" as an output redirect
pip install "agenta>=0.10.0"
# For self-hosted deployment
docker pull ghcr.io/agenta-ai/agenta
# Or with docker-compose
git clone https://github.com/Agenta-AI/agenta
cd agenta
docker-compose up -d
# Environment setup
export AGENTA_HOST="http://localhost:3000"
export AGENTA_API_KEY="your-api-key" # If using cloud version
# For LLM providers
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
```
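Before running the SDK examples, it can help to confirm that the variables exported above are actually visible to Python. The following is a minimal sketch using only the standard library. The helper name `missing_env_vars` is hypothetical and not part of the Agenta SDK, and the choice of which variables count as required is an assumption to adjust for your deployment.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Names taken from the export lines above. Which of them are strictly
# required depends on your setup, so this tuple is an assumption.
REQUIRED_VARS = ("AGENTA_HOST", "OPENAI_API_KEY")


def missing_env_vars(
    names: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or blank in the given environment."""
    source = os.environ if env is None else env
    return [name for name in names if not source.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.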
### Verify Installation
```python
import agenta as ag
from agenta import Agenta
# Initialize client
client = Agenta()
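# Report the SDK version. Reaching this point confirms the import and client
# construction succeeded; this snippet does not issue an explicit request to the server.
print(f"Agenta SDK version: {ag.__version__}")
print("SDK import and client construction succeeded.")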
```
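The snippet above does not issue an explicit request to the server, so a separate reachability check can be useful for self-hosted setups. Below is a minimal sketch using only the standard library. It assumes the address in `AGENTA_HOST` answers plain HTTP requests; `is_reachable` is a hypothetical helper, not an Agenta SDK function, and it says nothing about whether the API key is valid.

```python
import os
import urllib.error
import urllib.request
from typing import Callable


def is_reachable(
    url: str,
    timeout: float = 3.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True if an HTTP request to `url` receives any HTTP response."""
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or malformed URL.
        return False


if __name__ == "__main__":
    # Default mirrors the AGENTA_HOST value used in the Prerequisites section.
    host = os.environ.get("AGENTA_HOST", "http://localhost:3000")
    print(f"{host} reachable: {is_reachable(host)}")
```

The `opener` parameter exists only so the function can be exercised with a stub instead of a live server.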
## Core Capabilities
### 1. Prompt Versioning and Management
**Creating Versioned Prompts:**
```python
"""
Create and manage versioned prompts with Agenta.
"""
import agenta as ag
from agenta import Agenta
from typing import Optional, Dict, Any
# Initialize Agenta
ag.init()
@ag.entrypoint
def generate_summary(
    text: str,
    max_length: int = 100,
    style: str = "professional"
) -> str:
    """
    Generate a summary with versioned prompt.
    Args:
        text: Text to summarize
        max_length: Maximum summary length
        style: Writing style (professional, casual, technical)
    Returns:
        Generated summary
    """
    # Define prompt template (this becomes versioned)
    prompt = f"""Summarize the following text in a {style} tone.
Keep the summary under {max_length} words.
Text: {text}
Summary:"""
    # Call LLM (Agenta tracks this)
    response = ag.llm.complete(
        prompt=prompt,
        model="gpt-4",
        temperature=0.3,
        max_tokens=max_length * 2
    )
    return response.text
# Example usage
text = """
The company reported strong Q3 results with revenue up 25% year-over-year.
Operating margins improved to 18% from 15% in the prior year.
The CEO highlighted expansion into new markets and product launches.
"""
summary = generate_summary(text, max_length=50, style="professional")
print(summary)
```
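The comment "this becomes versioned" implies that each distinct template text corresponds to a distinct version. As a purely local illustration of that idea, and not a description of how Agenta computes or stores versions, the sketch below derives a short fingerprint from the template text so that template changes can be spotted in logs or tests. The names `template_fingerprint` and `SUMMARY_TEMPLATE` are hypothetical.

```python
import hashlib


def template_fingerprint(template: str, length: int = 12) -> str:
    """Return a short, stable hex fingerprint of a prompt template."""
    # Normalize line endings and outer whitespace so cosmetic differences
    # between editors do not change the fingerprint.
    normalized = template.replace("\r\n", "\n").strip()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest[:length]


# Same wording as the prompt above, kept as a plain template string so it
# can be fingerprinted independently of any particular input.
SUMMARY_TEMPLATE = (
    "Summarize the following text in a {style} tone.\n"
    "Keep the summary under {max_length} words.\n"
    "Text: {text}\n"
    "Summary:"
)

if __name__ == "__main__":
    print("template fingerprint:", template_fingerprint(SUMMARY_TEMPLATE))
    print(SUMMARY_TEMPLATE.format(style="casual", max_length=50, text="..."))
```

Keeping the template separate from the values that fill it means the fingerprint changes only when the prompt wording changes, not on every call.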
**Managing Prompt Versions:**
```python
"""
Manage multiple prompt versions programmatically.
"""
import agenta as ag
from agenta import Agenta
from dataclasses import dataclass
from typing import Any, Dict, List, Optional  # Any is used in the annotations below
from datetime import datetime
@dataclass
class PromptVersion:
    """Represents a prompt version."""
    version_id: str
    name: str
    template: str
    parameters: Dict[str, Any]
    created_at: datetime
    is_active: bool = False
class PromptManager:
    """
    Manage prompt versions with Agenta.
    """
    def __init__(self, app_name: str):
        self.app_name = app_name
        self.client = Agenta()

    def create_version(
        self,
        name: str,
        template: str,
        parameters: Optional[Dict[str, Any]] = None
    ) -> PromptVersion:
        """
        Create a new prompt version.
        Args:
            name: Version name
            template: Prompt template
            parameters: Default parameters
        Returns:
            Created PromptVersion
        """
        # Create variant in Agenta
        variant = self.client.create_variant(
            app_name=self.app_name,
            variant_name=name,
            config={
                "template": template,
                "parameters": parameters or {}
            }
        )
        return PromptVersion(
            version_id=variant.id,
            name=name,
            template=template,
            parameters=parameters or {},
            created_at=datetime.now(),
            is_active=False
        )

    def list_versions(self) -> List[PromptVersion]:
        """List all prompt versions."""
        variants = self.client.list_variants(app_name=self.app_name)
        versions = []
        for v in variants:
            versions.append(PromptVersion(
                version_id=v.id,
                name=v.name,
                template=v.config.get("template", ""),
                parameters=v.config.get("parameters", {}),
                created_at=v.created_at,
                is_active=v.is_default
            ))
        return versions

    def set_active_version(self, version_id: str) -> None:
        """Set a version as the active/default version."""
        self.client.set_default_variant(
            app_name=self.app_name,
            variant_id=version_id
        )

    def get_version(self, version_id: str) -> PromptVersion:
        """Get a specific version."""
        variant = self.client.get_variant(variant_id=version_id)
        return PromptVersion(
            version_id=variant.id,
            name=variant.name,
            template=variant.config.get("template", ""),
            parameters=variant.config.get("parameters", {}),
            created_at=variant.created_at,
            is_active=variant.is_default
        )

    def compare_versions(
        self,
        version_ids: List[str],
        test_input: str
    ) -> Dict[str, str]:
        """
        Compare outputs from multiple versions.
        Args:
            version_ids: List of version IDs to compare
            test_input: Input to test with
        Returns:
            Dictionary mapping version_id to output
        """
        results = {}
        for vid in version_ids:
            version = self.get_version(vid)
            # Format prompt with test input
            prompt = version.template.format(input=test_input)
            # Generate output
            response = ag.llm.complete(prompt=prompt)
            results[vid] = response.text
        return results
# Usage
manager = PromptManager("summarizer-app")
# Create versions
v1 = manager.create_version(
    name="concise-v1",
    template="Summarize briefly: {input}",
    parameters={"max_tokens": 100}
)
v2 = manager.create_version(
    name="detailed-v2",
    template="Provide a comprehensive summary with key points: {input}",
    parameters={"max_tokens": 300}
)
# List all versions
versions = manager.list_versions()
for v in versions:
    print(f"{v.name}: {v.version_id} (active: {v.is_active})")
# Set active version
manager.set_active_version(v1.version_id)
```
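One detail in `compare_versions` is worth a closer look: it fills the template with `str.format`, which treats every pair of braces as a placeholder. A template that contains literal braces, such as an inline JSON example, would therefore fail to render. The sketch below shows one possible workaround that substitutes only the `{input}` placeholder. `render_prompt` is a hypothetical helper, not part of the Agenta SDK, and it assumes `{input}` is the only placeholder the templates use.

```python
def render_prompt(template: str, test_input: str) -> str:
    """Substitute only the literal `{input}` placeholder in `template`."""
    # str.replace leaves all other braces untouched, so templates may
    # contain JSON snippets or other brace-delimited text.
    return template.replace("{input}", test_input)


if __name__ == "__main__":
    template = 'Return JSON like {"summary": "..."} for: {input}'

    # str.format tries to interpret the JSON braces as a replacement field.
    try:
        template.format(input="some text")
    except (KeyError, ValueError, IndexError) as exc:
        print("str.format failed:", type(exc).__name__)

    print(render_prompt(template, "some text"))
```

An alternative that keeps `str.format` is to escape literal braces as `{{` and `}}` when authoring templates, at the cost of making stored templates harder to read.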
### 2. A/B Testing Prompts
**Setting Up A/B Tests:**
```python
"""
Configure and run A/B tests on prompt variations.
"""
import agenta as ag
from agenta import Agenta
from typing import Dict, List, Optional
from dataclasses import dataclass
import random
@dataclass
class ABTestConfig:
    """Configuration for A/B test."""
    name: str
    variants: Dict[str, float]  # variant_id: traffic_percentage
    metrics: List[str]
    min_samples: int = 100
class ABTestRunner:
    """
    Run A/B tests on prompt variants.
    """
    def __init__(self, app_name: str):
        self.app_name = app_name
        self.cli