/health and /rollout).
Results summary (Banking77, GEPA)
All jobs optimize prompts for the same Banking77 task app; only the implementation language changes.
Prerequisites
- Synth CLI installed (synth on your PATH)
- Task app contract implemented in one of:
  - Rust
  - TypeScript (Node.js / Bun / Deno)
  - Python
  - Go
- ENVIRONMENT_API_KEY set in your shell
- A Cloudflare tunnel or other public URL exposed for http://localhost:8001
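The local prerequisites above can be checked before submitting a job. The following is a minimal sketch, not part of the Synth tooling: the function name missing_prerequisites and its injectable env and which parameters are hypothetical, added only so the check is easy to test. It does not verify the tunnel or the task app itself.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of problems; an empty list means the local checks pass.

    `env` and `which` are injectable for testing (an assumption of this
    sketch, not something the Synth CLI requires).
    """
    env = os.environ if env is None else env
    problems = []
    # The Synth CLI should resolve as `synth` on PATH.
    if which("synth") is None:
        problems.append("synth CLI not found on PATH")
    # The key must be present and non-empty in the current shell.
    if not env.get("ENVIRONMENT_API_KEY"):
        problems.append("ENVIRONMENT_API_KEY is not set")
    return problems
```

Calling missing_prerequisites() with no arguments inspects the real environment and returns the list of problems found, if any.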
Rust task app
main.rs · gepa_config.toml · walkthrough.md
Sample result (Rust)
Job pl_4f69a1b099a14e4b: 100% accuracy on Banking77.
TypeScript task app
index.ts · walkthrough.md
Sample result (TypeScript)
Job pl_787c47998cfe4745: 85.7% accuracy on Banking77.
Go task app
main.go · gepa_config.toml · walkthrough.md
Sample result (Go)
Job pl_1dd94dfdc8c6479d: 60.0% accuracy on Banking77.
Python task app
app.py · walkthrough.md
Sample result (Python)
Job pl_7e0227cc41454ec5: 66.7% accuracy on Banking77.
The task app contract
All of these task apps implement the same HTTP contract:
- GET /health: basic health check used by Synth to discover and monitor the app
- POST /rollout: evaluate a batch of prompt candidates and return rewards/metrics
The contract is specified in task_app.yaml:
- GitHub: task_app.yaml
- Raw: https://raw.githubusercontent.com/synth-laboratories/synth-ai/main/synth_ai/contracts/task_app.yaml
- ENVIRONMENT_API_KEY must match the X-API-Key header on /rollout requests.
- Your /rollout handler:
  - Reads the prompt candidate from the request payload.
  - Runs it against your task (e.g., Banking77 classifier).
  - Returns per-sample rewards and aggregate metrics in the response.
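To make the shape of the contract concrete, here is a rough sketch of the two endpoints using only the Python standard library. It is not the reference implementation from the walkthroughs: the JSON field names (prompt, rewards, metrics), the two stand-in samples, and the keyword-based classify function are all assumptions made for illustration, and it scores a single candidate per request. The authoritative request and response schemas are in task_app.yaml.

```python
import hmac
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in data; a real app would load Banking77 examples.
SAMPLES = [
    {"text": "I lost my card", "label": "lost_or_stolen_card"},
    {"text": "What is the exchange rate?", "label": "exchange_rate"},
]


def classify(prompt: str, text: str) -> str:
    """Placeholder for the model call that applies `prompt` to `text`."""
    return "lost_or_stolen_card" if "card" in text.lower() else "exchange_rate"


def is_authorized(header_value, expected_key) -> bool:
    """Compare the X-API-Key header with ENVIRONMENT_API_KEY in constant time."""
    if not header_value or not expected_key:
        return False
    return hmac.compare_digest(header_value.encode(), expected_key.encode())


def run_rollout(payload: dict) -> dict:
    """Score one prompt candidate; the field names here are assumptions."""
    prompt = payload.get("prompt", "")
    rewards = [
        1.0 if classify(prompt, s["text"]) == s["label"] else 0.0
        for s in SAMPLES
    ]
    return {"rewards": rewards, "metrics": {"accuracy": sum(rewards) / len(rewards)}}


class TaskAppHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        if self.path == "/health":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/rollout":
            self._send_json(404, {"error": "not found"})
            return
        expected = os.environ.get("ENVIRONMENT_API_KEY", "")
        if not is_authorized(self.headers.get("X-API-Key"), expected):
            self._send_json(401, {"error": "invalid API key"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self._send_json(400, {"error": "invalid JSON"})
            return
        self._send_json(200, run_rollout(payload))


def serve(port: int = 8001) -> None:
    """Listen on localhost:8001, the address the tunnel is expected to expose."""
    HTTPServer(("127.0.0.1", port), TaskAppHandler).serve_forever()
```

Calling serve() starts the app locally. Keeping is_authorized and run_rollout as plain functions separate from the HTTP handler makes the same logic straightforward to port to the Rust, TypeScript, or Go variants.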