Research Agent SDK

The Research Agent SDK provides a typed Python interface for running automated prompt optimization jobs. The agent spins up a sandboxed environment, analyzes your code, and applies optimization algorithms like MIPRO or GEPA.

Installation

pip install synth-ai
# or run the CLI without installing:
uvx synth-ai --help

Quick Start

from synth_ai.sdk.api.research_agent import (
    ResearchAgentJob,
    ResearchConfig,
    DatasetSource,
    OptimizationTool,
    MIPROConfig,
    ModelProvider,
)

# Configure the optimization task
research_config = ResearchConfig(
    task_description="""Optimize the prompt for Iris flower classification.

The goal is to classify flowers into setosa, versicolor, or virginica
based on sepal/petal measurements.""",
    tools=[OptimizationTool.MIPRO],
    datasets=[
        DatasetSource(
            source_type="huggingface",
            hf_repo_id="scikit-learn/iris",
            hf_split="train",
        ),
    ],
    primary_metric="accuracy",
    num_iterations=10,
    mipro_config=MIPROConfig(
        meta_model="llama-3.3-70b-versatile",
        meta_provider=ModelProvider.GROQ,
        num_trials=15,
    ),
)

# Create and submit job
job = ResearchAgentJob.from_research_config(
    research=research_config,
    repo_url="https://github.com/your-org/your-pipeline",
    model="gpt-5.1-codex-mini",
    max_agent_spend_usd=25.0,
    backend_url="https://api.usesynth.ai",
    api_key="your-synth-api-key",  # or set SYNTH_API_KEY env var
)

# Submit and wait for completion
job_id = job.submit()
print(f"Job submitted: {job_id}")

result = job.poll_until_complete(timeout=1800.0, poll_interval=30.0)
print(f"Status: {result['status']}")

Configuration Options

ResearchConfig

The main configuration for what to optimize:
| Parameter | Type | Description |
| --- | --- | --- |
| task_description | str | What to optimize (detailed instructions for the agent) |
| tools | List[OptimizationTool] | Optimization algorithms: MIPRO (GEPA coming soon) |
| datasets | List[DatasetSource] | Training/evaluation datasets |
| primary_metric | str | Main metric to optimize (default: "accuracy") |
| num_iterations | int | Number of optimization iterations (default: 10) |
| mipro_config | MIPROConfig | MIPRO-specific settings |
| gepa_config | GEPAConfig | GEPA-specific settings |
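The defaults in the table above can be pictured as a plain dataclass. The sketch below is illustrative only; the field names and defaults follow the table, not the SDK's actual class definitions:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical mirror of the ResearchConfig shape described in the table.
# Only task_description is required; everything else has a documented default.
@dataclass
class ResearchConfigSketch:
    task_description: str
    tools: List[str] = field(default_factory=list)
    datasets: List[dict] = field(default_factory=list)
    primary_metric: str = "accuracy"
    num_iterations: int = 10
    mipro_config: Optional[dict] = None
    gepa_config: Optional[dict] = None

cfg = ResearchConfigSketch(
    task_description="Optimize Iris classification",
    tools=["mipro"],
)
print(cfg.primary_metric)  # defaults to "accuracy" per the table
```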

DatasetSource

Configure where to load datasets from:
# HuggingFace dataset
DatasetSource(
    source_type="huggingface",
    hf_repo_id="PolyAI/banking77",
    hf_split="train",
    hf_subset="default",  # optional
)

# Inline data (for small datasets)
DatasetSource(
    source_type="inline",
    inline_data={"train.jsonl": '{"input": "test", "output": "result"}'},
)

# Uploaded files
DatasetSource(
    source_type="upload",
    file_ids=["file_abc123"],
)

MIPROConfig

MIPRO uses a meta-model to generate and evaluate instruction proposals:
MIPROConfig(
    meta_model="llama-3.3-70b-versatile",
    meta_provider=ModelProvider.GROQ,  # or OPENAI, GOOGLE
    num_trials=15,
    num_candidates=20,
    proposer_effort="MEDIUM",  # LOW_CONTEXT, LOW, MEDIUM, HIGH
)

GEPAConfig (Coming Soon)

GEPA optimization is not yet fully supported in the Research Agent SDK. Please use MIPRO for now.
GEPA uses genetic evolution for prompt optimization:
GEPAConfig(
    mutation_model="openai/gpt-oss-120b",
    population_size=20,
    num_generations=10,
    elite_fraction=0.2,
    proposer_type="dspy",  # or "spec"
)

Job Configuration

ResearchAgentJobConfig

Configure the job execution environment:
| Parameter | Type | Description |
| --- | --- | --- |
| research | ResearchConfig | The optimization configuration |
| repo_url | str | GitHub repo URL (or use inline_files) |
| repo_branch | str | Branch to use (default: "main") |
| inline_files | Dict[str, str] | Files to inject (alternative to repo) |
| model | str | Agent model (default: "gpt-5.1-codex-mini") |
| max_agent_spend_usd | float | Max spend for agent LLM calls |
| max_synth_spend_usd | float | Max spend for optimization |
| reasoning_effort | str | "low", "medium", or "high" |

Using Inline Files

For quick experiments without a repo:
job = ResearchAgentJob.from_research_config(
    research=research_config,
    inline_files={
        "pipeline.py": """
import dspy

class ClassifyIntent(dspy.Signature):
    query: str = dspy.InputField()
    intent: str = dspy.OutputField()
""",
        "README.md": "# My Pipeline\nIntent classification with DSPy.",
    },
    model="gpt-5.1-codex-mini",
    backend_url="https://api.usesynth.ai",
    api_key="your-key",
)

TOML Configuration

You can also configure jobs via TOML files:
[research_agent]
repo_url = "https://github.com/your-org/your-pipeline"
repo_branch = "main"
model = "gpt-5.1-codex-mini"
max_agent_spend_usd = 25.0
max_synth_spend_usd = 150.0
reasoning_effort = "medium"

[research_agent.research]
task_description = "Optimize banking intent classification"
tools = ["mipro"]
primary_metric = "accuracy"
num_iterations = 10

[[research_agent.research.datasets]]
source_type = "huggingface"
hf_repo_id = "PolyAI/banking77"
hf_split = "train"

[research_agent.research.mipro_config]
meta_model = "llama-3.3-70b-versatile"
meta_provider = "groq"
num_trials = 15
Load and run:
config = ResearchAgentJobConfig.from_toml("research_config.toml")
job = ResearchAgentJob(config=config)
job_id = job.submit()

Job Lifecycle

Submit

job = ResearchAgentJob.from_research_config(...)
job_id = job.submit()  # Returns job ID like "ra_abc123..."

Check Status

status = job.get_status()
print(status["status"])  # "queued", "running", "succeeded", "failed"

Poll Until Complete

result = job.poll_until_complete(
    timeout=1800.0,      # 30 minutes max
    poll_interval=30.0,  # Check every 30 seconds
    on_event=lambda e: print(f"Progress: {e}"),  # Optional callback
)
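Conceptually, poll_until_complete is a timeout-bounded loop over get_status. The generic sketch below shows that pattern with a simulated backend; the names here are illustrative, not the SDK's internals:

```python
import time

def poll(get_status, timeout=1800.0, poll_interval=30.0, on_event=None):
    """Call get_status() until a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if on_event:
            on_event(status)
        if status["status"] in ("succeeded", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("job did not finish in time")

# Simulated backend that advances queued -> running -> succeeded.
states = iter([{"status": "queued"}, {"status": "running"}, {"status": "succeeded"}])
result = poll(lambda: next(states), timeout=5.0, poll_interval=0.01)
print(result["status"])  # succeeded
```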

Attach to Existing Job

# Resume monitoring an existing job
job = ResearchAgentJob.from_id(
    job_id="ra_abc123...",
    backend_url="https://api.usesynth.ai",
    api_key="your-key",
)
status = job.get_status()

Environment Variables

| Variable | Description |
| --- | --- |
| SYNTH_API_KEY | Your Synth API key (required) |
| SYNTH_BACKEND_URL | Backend URL (default: https://api.usesynth.ai) |
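The resolution order (explicit argument, then environment variable, then default) can be sketched with os.environ. This is illustrative only; the SDK performs an equivalent lookup internally:

```python
import os

def resolve_backend(api_key=None, backend_url=None):
    """Prefer explicit arguments, fall back to env vars, then to the default URL."""
    key = api_key or os.environ.get("SYNTH_API_KEY")
    if not key:
        raise RuntimeError("Set SYNTH_API_KEY or pass api_key explicitly")
    url = backend_url or os.environ.get("SYNTH_BACKEND_URL", "https://api.usesynth.ai")
    return key, url

os.environ["SYNTH_API_KEY"] = "sk-test"  # stand-in value for the example
key, url = resolve_backend()
print(url)  # https://api.usesynth.ai
```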

CLI Alternative

You can also run research agents via CLI:
# Run with TOML config
uvx synth-ai agent run --config research_config.toml --poll

# Check job status
uvx synth-ai agent status ra_abc123...

Example: Banking77 Classification

Complete example optimizing a 77-class intent classifier:
from synth_ai.sdk.api.research_agent import (
    ResearchAgentJob,
    ResearchConfig,
    DatasetSource,
    OptimizationTool,
    MIPROConfig,
    ModelProvider,
)

research_config = ResearchConfig(
    task_description="""Optimize the prompt for Banking77 intent classification.

The Banking77 dataset contains customer banking queries that need to be
classified into 77 intent categories (e.g., "card_arrival", "lost_or_stolen_card").

Your goal:
1. Load the Banking77 dataset from /app/data/
2. Create a task app that evaluates classification prompts
3. Use MIPRO to optimize the system prompt for better accuracy
4. Save the best prompt and results to /app/artifacts/

Output must be EXACTLY one of the 77 intent labels.""",
    tools=[OptimizationTool.MIPRO],
    datasets=[
        DatasetSource(
            source_type="huggingface",
            hf_repo_id="PolyAI/banking77",
            hf_split="train",
            description="Banking intent classification dataset",
        ),
    ],
    primary_metric="accuracy",
    num_iterations=10,
    mipro_config=MIPROConfig(
        meta_model="llama-3.3-70b-versatile",
        meta_provider=ModelProvider.GROQ,
        num_trials=15,
    ),
)

job = ResearchAgentJob.from_research_config(
    research=research_config,
    repo_url="https://github.com/synth-labs/banking77-pipeline",
    model="gpt-5.1-codex-mini",
    max_agent_spend_usd=25.0,
    max_synth_spend_usd=150.0,
    reasoning_effort="medium",
)

job_id = job.submit()
print(f"Started optimization job: {job_id}")

# Wait for completion (typically 20-30 minutes for Banking77)
result = job.poll_until_complete(timeout=2400.0)
print(f"Final status: {result['status']}")

Next Steps