Banking77

Complete walkthrough for optimizing Banking77 intent-classification prompts with GEPA. Prompt optimization currently supports Modal-only task-app deployments—run your task app on Modal and reuse the resulting .modal.run URL everywhere (configs, health checks, evals).

Prerequisites

Provision Synth credentials (creates .env with SYNTH_API_KEY + ENVIRONMENT_API_KEY):
```
uvx synth-ai setup
```
Export model-provider keys used by your policy/mutation/meta models (e.g. export GROQ_API_KEY=...).
Keep uvx synth-ai on your PATH (installed with the Synth tooling).

No additional installs (uv pip, editable checkouts, etc.) are required. Point the CLI at both the TaskAppConfig module and the Modal entrypoint (they are the same file here):

uvx synth-ai deploy \
  --task-app examples/task_apps/banking77/banking77_task_app.py \
  --modal-app examples/task_apps/banking77/banking77_task_app.py \
  --runtime modal \
  --name banking77-gepa \
  --env .env

--task-app loads the TaskAppConfig (seeds, rubrics, rollout handlers).
--modal-app exposes the modal.App(...) wrapper so the CLI can package it.
--env must include ENVIRONMENT_API_KEY; the helper injects it as a Modal secret.

Deployment prints a .modal.run URL and writes it to TASK_APP_URL inside the .env you passed. Verify immediately:

TASK_APP_URL=$(rg '^TASK_APP_URL=' .env -N | cut -d= -f2)
curl -H "X-API-Key: $ENVIRONMENT_API_KEY" "$TASK_APP_URL/health"

Step 2: Author the Prompt Optimization Config

Start from the sample config and update only the task-app URL/IDs:

cp examples/blog_posts/gepa/configs/banking77_gepa_local.toml configs/banking77_gepa.toml

Key fields inside configs/banking77_gepa.toml:

[prompt_learning]
algorithm = "gepa"
task_app_url = "https://your-task-app.modal.run"
task_app_id = "banking77"

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a banking intent classification assistant." },
  { role = "user", pattern = "Customer Query: {query}\n\nClassify this query into one of 77 banking intents." }
]

[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
mutation_rate = 0.3
crossover_rate = 0.5
rollout_budget = 1000
max_concurrent_rollouts = 20
pareto_set_size = 20

The sample already defines policy settings, train/validation seed pools, mutation models, and archive configuration—only the URL needs to change.

Step 3: Launch GEPA

uvx synth-ai train \
  --config configs/banking77_gepa.toml \
  --poll

The CLI validates the TOML, pings the Modal task app using ENVIRONMENT_API_KEY, and submits the job to Synth’s managed backend (no custom --backend flags required).

Step 4: Monitor the Job

Expect streamed events such as:

🧬 Running GEPA on Banking77
=============================
✅ Task app: https://your-task-app.modal.run
✅ Seeds: train=30 validation=50

Generation 1/15: best_train_accuracy=0.74 len=118
Generation 2/15: best_train_accuracy=0.82
...
✅ prompt.learning.gepa.complete best_train_accuracy=0.88 best_validation_accuracy=0.85

Step 5: Download Optimized Prompts + Scores

Use the SDK helpers against the Synth production backend (https://agent-learning.onrender.com/api). Reuse the SYNTH_API_KEY stored by uvx synth-ai setup:

import os
from synth_ai.learning import get_prompt_text, get_scoring_summary

BASE_URL = os.environ.get("BACKEND_BASE_URL", "https://agent-learning.onrender.com/api").rstrip("/")
API_KEY = os.environ["SYNTH_API_KEY"]
JOB_ID = "pl_xxxx"  # Replace with the job id printed by the CLI

best_prompt = get_prompt_text(job_id=JOB_ID, base_url=BASE_URL, api_key=API_KEY, rank=1)
summary = get_scoring_summary(job_id=JOB_ID, base_url=BASE_URL, api_key=API_KEY)

print(best_prompt)
print(
    f"Train={summary['best_train_accuracy']:.3f} "
    f"Validation={summary.get('best_validation_accuracy', 0.0):.3f}"
)

Troubleshooting

Task app health fails → Redeploy to Modal (Step 1) and re-run curl -H "X-API-Key: ...\" "$TASK_APP_URL/health".
Missing provider key → Export the required Groq/OpenAI/Google key before running train.
Pattern validation errors → Ensure {query} (and any other task wildcards) remain inside initial_prompt.messages.

Next Steps

Configuration Reference – Complete TOML schema
Evaluate + Query Results – SDK + REST guide
Other Examples – Banking77 pipeline, HotpotQA, IFBench, HoVer, PUPA

Start Training

Prompt Optimization

Supervised Fine-Tuning

Reinforcement Learning

Prerequisites

Step 2: Author the Prompt Optimization Config

Step 3: Launch GEPA

Step 4: Monitor the Job

Step 5: Download Optimized Prompts + Scores

Troubleshooting

Next Steps

Start Training

Prompt Optimization

Supervised Fine-Tuning

Reinforcement Learning

​Prerequisites

​Step 1: Deploy the Task App to Modal

​Step 2: Author the Prompt Optimization Config

​Step 3: Launch GEPA

​Step 4: Monitor the Job

​Step 5: Download Optimized Prompts + Scores

​Troubleshooting

​Next Steps

Prerequisites

Step 1: Deploy the Task App to Modal

Step 2: Author the Prompt Optimization Config

Step 3: Launch GEPA

Step 4: Monitor the Job

Step 5: Download Optimized Prompts + Scores

Troubleshooting

Next Steps