A complete walkthrough for optimizing Banking77 intent-classification prompts with GEPA. Prompt optimization currently supports only Modal-hosted task apps: deploy your task app to Modal and reuse the resulting .modal.run URL everywhere (configs, health checks, evals).

Prerequisites

  1. Provision Synth credentials (creates .env with SYNTH_API_KEY + ENVIRONMENT_API_KEY):
    uvx synth-ai setup
    
  2. Export model-provider keys used by your policy/mutation/meta models (e.g. export GROQ_API_KEY=...).
  3. Make sure the uvx synth-ai command is available on your PATH (installed with the Synth tooling).
No additional installs (uv pip, editable checkouts, etc.) are required.

Step 1: Deploy the Task App to Modal

Point the CLI at both the TaskAppConfig module and the Modal entrypoint (they are the same file here):
uvx synth-ai deploy \
  --task-app examples/task_apps/banking77/banking77_task_app.py \
  --modal-app examples/task_apps/banking77/banking77_task_app.py \
  --runtime modal \
  --name banking77-gepa \
  --env .env
  • --task-app loads the TaskAppConfig (seeds, rubrics, rollout handlers).
  • --modal-app exposes the modal.App(...) wrapper so the CLI can package it.
  • --env must include ENVIRONMENT_API_KEY; the helper injects it as a Modal secret.
Deployment prints a .modal.run URL and writes it to TASK_APP_URL inside the .env you passed. Verify immediately:
TASK_APP_URL=$(grep '^TASK_APP_URL=' .env | cut -d= -f2-)
curl -H "X-API-Key: $ENVIRONMENT_API_KEY" "$TASK_APP_URL/health"
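The same health check can be scripted in Python, for example as a CI step. This is a minimal sketch, assuming the .env file uses plain KEY=VALUE lines and that /health accepts the X-API-Key header exactly as in the curl command above. The helpers read_env, build_health_request, and check_health are hypothetical names, not part of the Synth SDK.

```python
import urllib.request
from pathlib import Path


def read_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values containing '=' survive intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_health_request(env: dict[str, str]) -> urllib.request.Request:
    """Build the GET request for <TASK_APP_URL>/health with the API key header."""
    url = env["TASK_APP_URL"].rstrip("/") + "/health"
    return urllib.request.Request(url, headers={"X-API-Key": env["ENVIRONMENT_API_KEY"]})


def check_health(env_path: str = ".env", timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code (raises on HTTP errors)."""
    env = read_env(Path(env_path).read_text())
    with urllib.request.urlopen(build_health_request(env), timeout=timeout) as resp:
        return resp.status
```

Calling check_health() from the directory that holds .env returns the HTTP status code on success. urllib raises HTTPError for 4xx/5xx responses, which is usually the signal to redeploy.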

Step 2: Author the Prompt Optimization Config

Start from the sample config and update only the task-app URL/IDs:
cp examples/blog_posts/gepa/configs/banking77_gepa_local.toml configs/banking77_gepa.toml
Key fields inside configs/banking77_gepa.toml:
[prompt_learning]
algorithm = "gepa"
task_app_url = "https://your-task-app.modal.run"
task_app_id = "banking77"

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a banking intent classification assistant." },
  { role = "user", pattern = "Customer Query: {query}\n\nClassify this query into one of 77 banking intents." }
]

[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
mutation_rate = 0.3
crossover_rate = 0.5
rollout_budget = 1000
max_concurrent_rollouts = 20
pareto_set_size = 20
The sample already defines policy settings, train/validation seed pools, mutation models, and archive configuration, so in most cases only task_app_url needs to change.

Step 3: Launch GEPA

uvx synth-ai train \
  --config configs/banking77_gepa.toml \
  --poll
The CLI validates the TOML, pings the Modal task app using ENVIRONMENT_API_KEY, and submits the job to Synth’s managed backend (no custom --backend flags required).

Step 4: Monitor the Job

Expect streamed events such as:
🧬 Running GEPA on Banking77
=============================
✅ Task app: https://your-task-app.modal.run
✅ Seeds: train=30 validation=50

Generation 1/15: best_train_accuracy=0.74 len=118
Generation 2/15: best_train_accuracy=0.82
...
✅ prompt.learning.gepa.complete best_train_accuracy=0.88 best_validation_accuracy=0.85
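If you capture the CLI output to a file, the per-generation accuracy can be pulled out for plotting or regression checks. This is a rough sketch that assumes the log lines look exactly like the sample above; the format is not a documented interface and may change. parse_generations is a hypothetical helper.

```python
import re

# Matches lines such as "Generation 2/15: best_train_accuracy=0.82"
GENERATION_LINE = re.compile(r"Generation (\d+)/\d+: best_train_accuracy=([0-9.]+)")


def parse_generations(log_text: str) -> list[tuple[int, float]]:
    """Return (generation, best_train_accuracy) pairs in the order they appear."""
    return [
        (int(match.group(1)), float(match.group(2)))
        for match in GENERATION_LINE.finditer(log_text)
    ]
```

The final prompt.learning.gepa.complete line is deliberately not matched, since it reports the overall best rather than a single generation.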

Step 5: Download Optimized Prompts + Scores

Use the SDK helpers against the Synth production backend (https://agent-learning.onrender.com/api). Reuse the SYNTH_API_KEY stored by uvx synth-ai setup:
import os
from synth_ai.learning import get_prompt_text, get_scoring_summary

BASE_URL = os.environ.get("BACKEND_BASE_URL", "https://agent-learning.onrender.com/api").rstrip("/")
API_KEY = os.environ["SYNTH_API_KEY"]
JOB_ID = "pl_xxxx"  # Replace with the job id printed by the CLI

best_prompt = get_prompt_text(job_id=JOB_ID, base_url=BASE_URL, api_key=API_KEY, rank=1)
summary = get_scoring_summary(job_id=JOB_ID, base_url=BASE_URL, api_key=API_KEY)

print(best_prompt)
print(
    f"Train={summary['best_train_accuracy']:.3f} "
    f"Validation={summary.get('best_validation_accuracy', 0.0):.3f}"
)

Troubleshooting

  • Task app health fails → Redeploy to Modal (Step 1) and re-run curl -H "X-API-Key: ..." "$TASK_APP_URL/health".
  • Missing provider key → Export the required Groq/OpenAI/Google key before running train.
  • Pattern validation errors → Ensure {query} (and any other task wildcards) remain inside initial_prompt.messages.
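Pattern validation errors can be caught locally before submitting a job. The sketch below assumes wildcards use str.format-style braces such as {query} and that each message is a dict with an optional pattern key, as in the Step 2 config. pattern_fields and missing_placeholders are hypothetical helpers, not Synth APIs.

```python
from string import Formatter


def pattern_fields(pattern: str) -> set[str]:
    """Return the placeholder names found in a brace-style pattern."""
    return {name for _, name, _, _ in Formatter().parse(pattern) if name}


def missing_placeholders(messages: list[dict], required=("query",)) -> list[str]:
    """List required placeholders that appear in none of the message patterns."""
    found: set[str] = set()
    for message in messages:
        # Messages with only "content" (no "pattern") contribute no placeholders.
        found |= pattern_fields(message.get("pattern", ""))
    return sorted(set(required) - found)
```

An empty list means every required wildcard is present. Formatter().parse raises ValueError on unbalanced braces, which is also worth fixing before launch.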

Next Steps