
How can you use GEPA in-process for prod applications?

GEPA In-Process allows you to run prompt optimization entirely from a single Python script. You provide your task app and dataset, and the optimizer handles everything:
  • Automatic task app startup: Your FastAPI task app runs in a background thread
  • Cloudflare tunnel: Automatically exposes your local task app to the optimizer backend
  • Dataset-driven optimization: GEPA tests candidate prompts against your dataset via message passing
  • Clean shutdown: The task app and tunnel are cleaned up automatically when the job finishes, leaving you with the optimized prompts and their eval scores
This allows you to run GEPA programmatically over arbitrarily many datasets and tasks in production.

In-Process Task App Architecture

The in-process approach eliminates manual process management by running everything in a single Python script:

Components

Task App: Heart Disease classification using buio/heart-disease dataset
  • Binary classification: 0 (no disease) or 1 (heart disease)
  • Tool-based: Model calls heart_disease_classify function
  • Patient features provided as text input
Runner Script: run_fully_in_process.py
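import asyncio

from synth_ai.task import InProcessTaskApp
# PromptLearningJob must be imported as well; its module path depends on
# your installed synth_ai version.


async def main():
    # task_app_path, task_app_api_key, and config_path are supplied by you.
    async with InProcessTaskApp(
        task_app_path=task_app_path,
        port=8114,
        api_key=task_app_api_key,
    ) as task_app:
        # task_app.url contains the Cloudflare tunnel URL; use it for GEPA jobs.
        job = PromptLearningJob.from_config(
            config_path=config_path,
            task_app_url=task_app.url,
        )
        results = await job.poll_until_complete()
    # Everything is cleaned up automatically once the block exits.
    return results


asyncio.run(main())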
What Happens:
  1. Task app starts in background thread (uvicorn)
  2. Cloudflare tunnel opens automatically
  3. Backend receives public tunnel URL
  4. GEPA job runs rollouts against tunnel
  5. Cleanup happens automatically on exit
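
The start-in-background, use, then clean-up pattern in steps 1 and 5 can be illustrated with the standard library alone. This is a minimal sketch, not how InProcessTaskApp is implemented: it assumes a plain `http.server` in place of uvicorn and FastAPI, leaves out the Cloudflare tunnel entirely, and the names `HealthHandler`, `background_server`, and `check_once` are hypothetical.

```python
import threading
import urllib.request
from contextlib import contextmanager
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Stand-in for the task app: answers every GET with 'ok'."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


@contextmanager
def background_server(port=0):
    """Start the server in a daemon thread, yield its URL, stop it on exit."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_address[1]}"
    finally:
        server.shutdown()      # stop serve_forever()
        server.server_close()  # release the socket
        thread.join(timeout=5)


def check_once():
    """Call the server while it is up; cleanup runs when the block exits."""
    with background_server() as url:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.read().decode()
```

Because teardown sits in the `finally` branch, the server stops even if the code inside the `with` block raises, which is the same guarantee the `async with` form above relies on.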

Seed Pools

GEPA uses different seed pools for different phases:
[prompt_learning.gepa.evaluation]
train_seeds = [0, 1, 2, ..., 29]      # 30 seeds for training
val_seeds = [30, 31, 32, ..., 79]     # 50 seeds for validation
validation_pool = "train"
validation_top_k = 2
  • Train seeds: Used during evolutionary process to evaluate fitness
  • Val seeds: Held-out validation set for final top-K selection
  • validation_pool: Which pool the validation pass draws its seeds from (“train” or “val”); the example above uses “train”
  • validation_top_k: Number of top candidates to validate
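
How these four settings interact can be sketched in a few lines. This is an illustration under stated assumptions, not GEPA's actual selection code: `select_top_k`, `validate`, and `toy_evaluate` are hypothetical names, the train scores are made up, and `toy_evaluate` is a deterministic stand-in for a real rollout.

```python
def select_top_k(train_scores, k):
    """Return the k candidate names with the highest train-seed score."""
    ranked = sorted(train_scores, key=lambda name: train_scores[name], reverse=True)
    return ranked[:k]


def validate(candidates, evaluate, pools, validation_pool):
    """Re-score each candidate on every seed of the chosen pool."""
    seeds = pools[validation_pool]
    return {
        name: sum(evaluate(name, seed) for seed in seeds) / len(seeds)
        for name in candidates
    }


# Seed pools matching the config above: 30 train seeds, 50 val seeds.
pools = {"train": list(range(0, 30)), "val": list(range(30, 80))}

# Hypothetical train-seed accuracy for three candidate prompts.
train_scores = {"baseline": 0.50, "mutant_a": 0.75, "mutant_b": 0.25}

# Hypothetical skill levels that drive the toy rollout below.
skill = {"baseline": 2, "mutant_a": 3, "mutant_b": 1}


def toy_evaluate(name, seed):
    """Stand-in for one rollout: 1.0 if 'correct' on this seed, else 0.0."""
    return 1.0 if seed % 4 < skill[name] else 0.0


finalists = select_top_k(train_scores, k=2)  # validation_top_k = 2
val_scores = validate(finalists, toy_evaluate, pools, validation_pool="val")
```

Switching `validation_pool` to `"train"` re-scores the finalists on the same seeds used during evolution, so a held-out estimate is only obtained with `"val"`.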

Rollout Configuration

[prompt_learning.gepa.rollout]
budget = 300              # Total rollouts across all generations
max_concurrent = 5        # Parallel rollout limit
Budget is distributed across generations:
  • Initial population: initial_size × len(train_seeds) rollouts
  • Each generation: children_per_generation × len(train_seeds) rollouts
  • Archive candidates are re-evaluated periodically
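
As a rough worked example of the arithmetic above, the sketch below counts how many full generations fit in a budget. It is an estimate under assumptions: the values chosen for `initial_size` and `children_per_generation` are hypothetical, the function name is made up, and the cost of archive re-evaluation is ignored.

```python
def generations_within_budget(budget, n_train_seeds, initial_size, children_per_generation):
    """Return (initial cost, cost per generation, full generations that fit)."""
    initial_cost = initial_size * n_train_seeds
    per_generation = children_per_generation * n_train_seeds
    remaining = budget - initial_cost
    if remaining < 0:
        # The initial population alone already exceeds the budget.
        return initial_cost, per_generation, 0
    return initial_cost, per_generation, remaining // per_generation


# budget = 300 and 30 train seeds come from the config above;
# initial_size = 4 and children_per_generation = 2 are hypothetical.
estimate = generations_within_budget(
    budget=300, n_train_seeds=30, initial_size=4, children_per_generation=2
)
print(estimate)  # (120, 60, 3)
```

With these numbers the initial population costs 120 rollouts, each generation costs 60, and the remaining 180 rollouts cover three generations.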

Meta-Model vs Policy Model

  • Policy Model: The model being optimized (e.g., llama-3.1-8b-instant)
    • Runs your actual task (heart disease classification)
    • Needs to be fast and cost-effective
    • Defined in [prompt_learning.policy]
  • Meta-Model: The mutation generator (e.g., llama-3.3-70b-versatile)
    • Analyzes successful/failing prompts and proposes mutations
    • Should be more capable than the policy model
    • Defined in [prompt_learning.gepa.mutation]

Termination Conditions

[prompt_learning.termination_config]
max_cost_usd = 3.0       # Budget limit
max_trials = 600         # Maximum rollouts
GEPA stops when either condition is met:
  • Cost exceeds max_cost_usd
  • Total rollouts exceed max_trials
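
The either-condition rule can be written as a small check. This is a sketch, not the optimizer's code: `TerminationConfig` and `should_stop` are hypothetical names, and the strict `>` comparisons are an assumption taken from the wording "exceeds" above.

```python
from dataclasses import dataclass


@dataclass
class TerminationConfig:
    """Mirrors the two keys in [prompt_learning.termination_config]."""
    max_cost_usd: float = 3.0
    max_trials: int = 600


def should_stop(config, cost_usd, rollouts):
    """Return the name of the limit that was exceeded, or None to keep going."""
    if cost_usd > config.max_cost_usd:
        return "max_cost_usd"
    if rollouts > config.max_trials:
        return "max_trials"
    return None


config = TerminationConfig()
```

For example, `should_stop(config, cost_usd=1.2, rollouts=300)` returns `None`, while `should_stop(config, cost_usd=3.01, rollouts=10)` returns `"max_cost_usd"`.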

Example: Heart Disease Classification

The in-process demo uses medical classification:
  • Task: Predict heart disease from patient features
  • Dataset: buio/heart-disease (270 samples, train split only)
  • Metric: Binary classification accuracy
  • Budget: 50 rollouts (reduced from 300 for a faster demo)
Seed Prompt (baseline):
You are a medical classification assistant. Based on the patient's
features, classify whether they have heart disease. Respond with
'1' for heart disease or '0' for no heart disease.
GEPA’s Optimized Prompt (GPT-4.1 Mini):
You are a medical classification assistant. Your task is to analyze patient features and determine the presence of heart disease.

Input: You will receive patient features including age, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiographic results, maximum heart rate achieved, exercise-induced angina, ST depression induced by exercise, slope of peak exercise ST segment, number of major vessels colored by fluoroscopy, and thalassemia type.

Classification Process:
• Carefully examine all provided patient features. Pay particular attention to combinations of risk factors rather than isolated values.
• Consider the relationships between features: high cholesterol combined with high blood pressure and chest pain indicates higher risk than any single factor alone.
• Exercise-related features (maximum heart rate, exercise-induced angina, ST depression) are strong indicators when present alongside other cardiovascular risk factors.
• Age and sex are baseline factors that modify risk interpretation but should not be the sole basis for classification.

Key Risk Indicators:
• Chest pain types associated with cardiovascular issues (especially when combined with other symptoms)
• Elevated resting blood pressure (>140 mmHg) or serum cholesterol (>240 mg/dL)
• Abnormal resting ECG results
• Exercise-induced angina or significant ST depression during exercise
• Multiple major vessels affected (visible via fluoroscopy)
• Thalassemia types associated with cardiovascular complications

Output Format:
• Analyze the feature combination holistically
• Respond with exactly '1' if heart disease is present based on the feature analysis
• Respond with exactly '0' if heart disease is not present
• Base your decision on the overall pattern of risk factors, not individual feature values in isolation
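
Separately from the prompt text, the demo's metric (binary classification accuracy) can be sketched as follows. This assumes the model's answer arrives as a plain string containing '0' or '1'; the actual task app is tool-based, so how the prediction is extracted there may differ. `parse_label` and `accuracy` are hypothetical helper names, and the sample responses are made up.

```python
def parse_label(response):
    """Map a model response to 0 or 1; return None if it is neither."""
    cleaned = response.strip().strip("'\"")
    return int(cleaned) if cleaned in ("0", "1") else None


def accuracy(responses, labels):
    """Fraction of responses whose parsed label matches the ground truth."""
    if not labels:
        raise ValueError("labels must not be empty")
    correct = sum(parse_label(r) == y for r, y in zip(responses, labels))
    return correct / len(labels)


# Made-up responses: two correct, one wrong, one unparseable.
responses = ["1", " 0\n", "'1'", "maybe"]
labels = [1, 0, 0, 1]
print(accuracy(responses, labels))  # 0.5
```

Unparseable responses count as wrong here, which is why the prompt's insistence on answering with exactly '1' or '0' matters for the score.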