Prerequisites
- Python 3.11+
uvpackage managerSYNTH_API_KEYset in your environment- Access to a Synth backend (default is production)
Run the demo locally as a script
From thesynth-ai repo:
What the flags do
--rollouts: Number of online rollouts to run--train-size: Number of training seeds (0..train-size-1)--val-size: Number of validation seeds (train-size..train-size+val-size-1)--min-proposal-rollouts: Minimum rollouts before generating new proposals
What happens
- The script starts a local task app and health-checks it.
- A MIPRO online job is created on the backend.
- The backend returns a proxy URL for prompt candidate selection.
- The script runs rollouts locally, calling the proxy URL for each LLM call.
- Rewards are reported back and proposals evolve in real time.
Tips
- To use a different backend, set
SYNTH_URL(preferred).SYNTH_BACKEND_URLandRUST_BACKEND_URLare also supported for compatibility. - You can change the policy model with
--model gpt-4.1-nano(or another supported model). - The script auto-generates
ENVIRONMENT_API_KEYif it is not set.
Production usage
When you move this flow to production, the loop is the same. You just swap the backend URL, send rewards back to the online MIPRO system, and rely on the proxy URL to perform prompt substitution.1) Set the backend URL
Point to the production backend:SYNTH_URL (preferred), then SYNTH_BACKEND_URL, then RUST_BACKEND_URL.
2) Send reward updates
After each rollout, report the reward to the backend system. The demo uses:"done" status for the rollout:
push_status() does in the demo.
3) Prompt substitution (what happens behind the scenes)
The backend returns a proxy URL (e.g.mipro_proxy_url) for each online job. You call it like:
- The proxy selects the current best candidate prompt for the rollout.
- It substitutes that candidate into the prompt template (system/user message patterns).
- The proxy forwards the request to the model provider with the substituted prompt.
- You receive the model response and compute a reward locally.
- Reward updates drive the next round of proposals.