Quickstart: Graph GEPA
This guide walks you through training a multi-node LLM graph using Graph GEPA. By the end, you’ll have an optimized graph that outperforms a single prompt.
For most use cases, we recommend using ADAS/Workflows, which provides a simpler interface. Use Graph GEPA directly when you need fine-grained control over evolution parameters.
Prerequisites
- Synth API key (get one here)
- Python 3.11+
- synth-ai package installed
Step 1: Prepare Your Dataset
Create a JSON file named dataset.json with your tasks and expected outputs.
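As a rough illustration, the snippet below writes a small dataset.json from Python and checks that every task has an expected output. The field names (tasks, gold_outputs, task_id, input, output) are assumptions made for this sketch, not the confirmed Synth schema; consult the Graph GEPA Reference for the exact format.

```python
import json
from pathlib import Path

# NOTE: all field names here are hypothetical placeholders, not the confirmed schema.


def build_dataset(examples):
    """Pair each (question, answer) example with a task and its expected output."""
    tasks, gold_outputs = [], []
    for i, (question, answer) in enumerate(examples):
        task_id = f"task_{i}"
        tasks.append({"task_id": task_id, "input": {"question": question}})
        gold_outputs.append({"task_id": task_id, "output": {"answer": answer}})
    return {"tasks": tasks, "gold_outputs": gold_outputs}


def check_dataset(dataset):
    """Raise ValueError unless every task has exactly one expected output."""
    task_ids = sorted(t["task_id"] for t in dataset["tasks"])
    gold_ids = sorted(g["task_id"] for g in dataset["gold_outputs"])
    if task_ids != gold_ids:
        raise ValueError("every task needs exactly one expected output")


dataset = build_dataset([
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
])
check_dataset(dataset)
Path("dataset.json").write_text(json.dumps(dataset, indent=2), encoding="utf-8")
```

Running a check like this before training is cheap and catches a missing or duplicated expected output early.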
Step 2: Create Configuration
Create a TOML configuration file.
Step 3: Run Training
Option A: Python SDK
Option B: Using ADAS (Simpler)
If you don’t need fine-grained control, use ADAS.
Step 4: Use Your Graph
Production Inference
Download for Local Use
What Happens During Training
- Initialization: Graph GEPA creates an initial population of graph candidates
- Evaluation: Each candidate is run on training seeds and scored
- Selection: Best candidates are selected for the next generation
- Mutation: LLM proposes modifications to prompts and structure
- Repeat: Process continues for num_generations
- Validation: Top candidates are evaluated on held-out validation seeds
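The loop described above can be sketched with a toy stand-in. This is not Graph GEPA's implementation: candidates here are plain integers, the score is a made-up distance to a target, and mutation is a random nudge rather than an LLM proposal. Only num_generations and children_per_generation are names taken from this guide; every other name (evolve, keep, score, mutate) is hypothetical.

```python
import random


def evolve(score, mutate, initial, train_seeds, val_seeds,
           num_generations=5, children_per_generation=4, keep=2):
    """Toy evolutionary search: evaluate, select, mutate, repeat, then validate."""

    def mean_score(candidate, seeds):
        return sum(score(candidate, s) for s in seeds) / len(seeds)

    def top(candidates):
        # Evaluation + selection: rank by mean training score, keep the best few.
        ranked = sorted(candidates, key=lambda c: mean_score(c, train_seeds), reverse=True)
        return ranked[:keep]

    population = list(initial)                       # Initialization
    for _ in range(num_generations):                 # Repeat
        parents = top(population)                    # Evaluation and selection
        children = [mutate(random.choice(parents))   # Mutation
                    for _ in range(children_per_generation)]
        population = parents + children              # Parents survive, so the best never regresses

    # Validation: compare the top training candidates on held-out seeds.
    return max(top(population), key=lambda c: mean_score(c, val_seeds))


random.seed(0)
TARGET = 42  # made-up optimum for the toy score function
best = evolve(
    score=lambda candidate, seed: -abs(candidate - TARGET),  # seed is unused in this toy
    mutate=lambda candidate: candidate + random.randint(-5, 5),
    initial=[0, 10],
    train_seeds=range(3),
    val_seeds=range(3, 5),
    num_generations=30,
    children_per_generation=6,
)
print(best)
```

In the real system, the expensive steps are evaluation (one graph run per candidate per seed) and mutation (an LLM call), which is why the generation and child counts drive both runtime and cost.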
Tips for Better Results
1. More Training Data
More examples generally lead to better optimization.
2. Topology Guidance
Help the proposer understand your task.
3. Appropriate Structure
Match structure to task complexity:

| Task | Recommended Structure |
|---|---|
| Simple classification | single_prompt |
| Multi-step reasoning | dag |
| Routing/branching logic | conditional |
4. Budget Allocation
Running more generations with fewer children per generation often beats running few generations with many children.
Troubleshooting
Low Scores
- Add more diverse training examples
- Increase num_generations
- Try different topology_guidance
- Check that gold outputs are correct
Slow Training
- Reduce children_per_generation
- Use a faster policy model (e.g., gpt-4o-mini)
- Reduce the training seed count
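To see why these knobs matter, it helps to estimate how many graph runs a training job performs. The sketch below assumes every child is evaluated once on every training seed, which this guide does not confirm; treat the formula as a back-of-the-envelope model. The function name, the num_train_seeds parameter, and the numbers are made up for illustration.

```python
def estimated_graph_runs(num_generations, children_per_generation, num_train_seeds):
    """Rough run count, ASSUMING each child is scored once per training seed."""
    return num_generations * children_per_generation * num_train_seeds


baseline = estimated_graph_runs(
    num_generations=10, children_per_generation=8, num_train_seeds=50)
fewer_children = estimated_graph_runs(
    num_generations=10, children_per_generation=4, num_train_seeds=50)
fewer_seeds = estimated_graph_runs(
    num_generations=10, children_per_generation=8, num_train_seeds=25)

print(baseline, fewer_children, fewer_seeds)  # 4000 2000 2000
```

Under this model, halving either children_per_generation or the seed count halves the work. The same arithmetic shows that a fixed budget can be spent as more generations with fewer children each.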
High Costs
- Set a max_spend_usd limit
- Use max_llm_calls_per_run to limit graph complexity
- Use cheaper models in allowed_policy_models
Next Steps
- Graph GEPA Reference - Full configuration options
- Graphs Overview - Understanding graph abstractions
- ADAS/Workflows - Simpler high-level API
- Multi-objective Optimization - Optimize for cost/latency too