Prerequisites
- `GROQ_API_KEY` in `.env` (for policy model inference)
- `SYNTH_API_KEY` in `.env` (for backend authentication)
- `ENVIRONMENT_API_KEY` in `.env` (optional - will be auto-generated if not set)
- `uv` installed (for running Python commands)
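The key requirements above can be sketched as a small resolver. This is only an illustration, not the script's actual code: `resolve_keys` is a hypothetical helper, it assumes the values from `.env` have already been loaded into a mapping, and the random hex token merely stands in for however the script generates `ENVIRONMENT_API_KEY` (registration with the backend is not shown).

```python
import secrets

REQUIRED_KEYS = ("GROQ_API_KEY", "SYNTH_API_KEY")


def resolve_keys(env):
    """Return the three keys; raise if a required one is missing or empty."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError(f"{', '.join(missing)} required")
    resolved = {name: env[name] for name in REQUIRED_KEYS}
    # Assumption: a random 64-character hex token stands in for the
    # auto-generated key; the real generation scheme may differ.
    resolved["ENVIRONMENT_API_KEY"] = (
        env.get("ENVIRONMENT_API_KEY") or secrets.token_hex(32)
    )
    return resolved


# Placeholder values, not real credentials.
demo = resolve_keys({"GROQ_API_KEY": "gsk-example", "SYNTH_API_KEY": "sk-example"})
print(sorted(demo))
```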
Quick Start
Run the script from anywhere:
What Happens Automatically
The script performs these steps without any manual intervention:
- Auto-generates ENVIRONMENT_API_KEY if not set (and registers it with backend)
- Starts the Banking77 task app in-process (no separate terminal needed)
- Automatically creates a Cloudflare tunnel if backend is remote (or uses localhost if backend is local)
- Submits a GEPA optimization job to the backend
- Polls for completion and displays results
- Cleans up everything automatically when done
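The choice between a Cloudflare tunnel and localhost in the steps above could look roughly like the following. How the script actually detects a remote backend is not documented here, so treat the hostname check, the function names, and the example URLs as assumptions.

```python
from urllib.parse import urlparse

# Assumption: these hostnames count as a "local" backend.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def backend_is_local(backend_url: str) -> bool:
    """True when the backend URL points at the local machine."""
    return urlparse(backend_url).hostname in LOCAL_HOSTS


def task_app_exposure(backend_url: str) -> str:
    """Pick how the in-process task app is made reachable by the backend."""
    return "localhost" if backend_is_local(backend_url) else "cloudflare_tunnel"


# Hypothetical URLs for illustration only.
print(task_app_exposure("http://localhost:8000"))
print(task_app_exposure("https://backend.example.com/api"))
```

A remote backend cannot reach a server bound to your machine's loopback address, which is why a tunnel is only needed in the remote case.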
Step-by-Step Execution
Initialization
What you’ll see:
Task App Startup
What you’ll see:
Job Submission
What you’ll see:
Training Progress
What you’ll see - Task app processing rollouts, plus progress updates showing:
- Timestamp
- Elapsed time
- Current job status
- Best score achieved so far
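A polling loop that prints those four fields might be structured like this. It is a sketch under assumptions: `fetch_status` is a hypothetical callable standing in for the backend request, the status names in `TERMINAL` are guesses, and the snapshots in the demo are made up.

```python
import time
from datetime import datetime

# Assumption: status names that end the polling loop.
TERMINAL = {"succeeded", "failed", "cancelled"}


def poll_job(fetch_status, interval=5.0, timeout=600.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status, printing one line per poll."""
    start = clock()
    while True:
        # Hypothetical shape: {"status": str, "best_score": float or None}
        snapshot = fetch_status()
        elapsed = clock() - start
        stamp = datetime.now().strftime("%H:%M:%S")
        best = snapshot.get("best_score")
        best_text = "n/a" if best is None else f"{best:.2%}"
        print(f"[{stamp}] {elapsed:6.1f}s  status={snapshot['status']}  best={best_text}")
        if snapshot["status"] in TERMINAL:
            return snapshot
        if elapsed >= timeout:
            raise TimeoutError(f"job still {snapshot['status']} after {timeout}s")
        sleep(interval)


# Scripted snapshots so the loop can be exercised without a backend.
scripted = iter([
    {"status": "running", "best_score": None},
    {"status": "running", "best_score": 0.80},
    {"status": "succeeded", "best_score": 0.875},
])
final = poll_job(lambda: next(scripted), sleep=lambda _seconds: None)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.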
Results Display
What you’ll see:
- Best score achieved
- Total number of candidates evaluated
- Accuracy statistics (min, max, average)
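If you want to recompute the same summary from a list of per-candidate accuracies, a minimal version could be written as below. The `summarize` helper and the sample scores are illustrative only and do not come from a real run.

```python
from statistics import mean


def summarize(scores):
    """Summarize candidate accuracies: best, count, min, max, average."""
    if not scores:
        raise ValueError("no candidate scores to summarize")
    return {
        "best": max(scores),
        "candidates": len(scores),
        "min": min(scores),
        "max": max(scores),
        "average": mean(scores),
    }


# Made-up accuracies for illustration.
summary = summarize([0.5, 0.75, 1.0])
print(summary)
```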
Cleanup
What you’ll see:
During cleanup, the script:
- Stops the task app
- Closes the Cloudflare tunnel
- Cleans up temporary files
- Exits cleanly
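One common way to get teardown that runs even when the job fails is `contextlib.ExitStack`. The sketch below is not the script's implementation: the log messages are stand-ins for stopping the task app and closing the tunnel, and `run_with_cleanup` is a hypothetical name.

```python
import os
import tempfile
from contextlib import ExitStack


def run_with_cleanup(work, log):
    """Run work(workdir) and tear everything down afterwards, even on error."""
    with ExitStack() as stack:
        # ExitStack unwinds in reverse registration order, so resources are
        # registered in the reverse of the teardown order listed above.
        workdir = stack.enter_context(tempfile.TemporaryDirectory())  # removed last
        stack.callback(log.append, "tunnel closed")     # stand-in for closing the tunnel
        stack.callback(log.append, "task app stopped")  # stand-in; runs first on exit
        return work(workdir)


events = []
used_dir = run_with_cleanup(lambda path: path, events)
print(events, os.path.exists(used_dir))
```

Because the callbacks are attached to the stack rather than placed after the work, an exception inside `work` still triggers all three cleanup steps before it propagates.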
Configuration
The script uses `banking77_gepa.toml` from walkthroughs/gepa/. You can modify:
- Rollout budget: `prompt_learning.gepa.rollout.budget` (default: 200)
- Number of generations: `prompt_learning.gepa.population.num_generations` (default: 5)
- Children per generation: `prompt_learning.gepa.population.children_per_generation` (default: 4)
Example Output Summary
Here’s what a successful run looks like:
- Best score: 87.50% accuracy (significant improvement over baseline)
- Total candidates: 21 prompt variations evaluated
- Time: ~3 minutes for complete optimization
- Cost: Typically 0.20 depending on rollout budget
Troubleshooting
- “GROQ_API_KEY required”: Make sure `.env` file exists at repo root with `GROQ_API_KEY` set
- “SYNTH_API_KEY required”: Make sure `.env` file exists at repo root with `SYNTH_API_KEY` set
- Task app not found: Ensure `walkthroughs/gepa/task_app/banking77_task_app.py` exists
- Config file not found: Ensure `walkthroughs/gepa/banking77_gepa.toml` exists
- ENVIRONMENT_API_KEY registration failed: The script will continue with the generated key, but backend may not be able to authenticate task app requests. Check that `SYNTH_API_KEY` is valid.
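The file-related items above can be checked up front with a short preflight. This is a convenience sketch rather than part of the walkthrough's script: `missing_files` is a hypothetical helper, and it only confirms that the paths named above exist relative to a repo root you pass in.

```python
from pathlib import Path

# Paths taken from the troubleshooting list, relative to the repo root.
REQUIRED_FILES = (
    ".env",
    "walkthroughs/gepa/task_app/banking77_task_app.py",
    "walkthroughs/gepa/banking77_gepa.toml",
)


def missing_files(repo_root):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


# An empty list means all three files were found.
print(missing_files("."))
```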
Advantages Over Deployed Approach
- Single command: Everything runs from one script
- No manual process management: Task app and tunnel are managed automatically
- Automatic cleanup: Everything stops cleanly when done
- Better for automation: Perfect for CI/CD or batch processing
- Easier debugging: All logs in one place
Related Files
- run.py - In-process GEPA optimization script
- banking77_task_app.py - Banking77 task app implementation
- banking77_gepa.toml - GEPA configuration file
- README.md - Additional documentation
Next Steps
- Review the optimized prompts in the job results
- Adjust configuration parameters for different optimization runs
- Try the deployed walkthrough for more manual control
- Integrate into your own scripts using the `InProcessTaskApp` class