> ## Documentation Index
> Fetch the complete documentation index at: https://docs.usesynth.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Overview

> Optimize prompts using GEPA with verifier-guided evaluation. Works with any language via OpenAPI contracts.

Prompt optimization uses evolutionary algorithms to automatically improve prompts for classification, reasoning, and instruction-following tasks. **Works with any language** – build LocalAPI in Rust, Go, TypeScript, Zig, Python, or any language that can serve HTTP. See [Polyglot LocalAPI](/sdk/localapi/in-process) for examples and the OpenAPI contract.

Synth AI uses **GEPA**: Agrawal et al. (2025). "GEPA: Reflective Prompt Evolution." [arXiv:2507.19457](https://arxiv.org/abs/2507.19457)

### 1. Build a prompt evaluation LocalAPI

Use the TaskAppConfig interface to describe dataset splits, rubrics, and rollout handlers. **Build in any language** – implement the OpenAPI contract in your preferred language.
→ [Create a prompt evaluation LocalAPI](/sdk/localapi/in-process) | [Polyglot examples](/sdk/localapi/in-process)

### 2. Author the prompt optimization config

Capture the GEPA algorithm choice, initial prompt template, training/validation seeds, and optimization parameters in TOML.
→ Read: [Prompt optimization configs](/prompt-optimization-gepa)

### 3. Query and evaluate results

Use the Python API or REST endpoints to retrieve optimized prompts and evaluate them on held-out validation sets.\
→ Read: [Querying results](/sdk/hosted-optimizers)

## Algorithm Overview

### GEPA (Genetic Evolution of Prompt Architectures)

**Best for:** Broad exploration, diverse prompt variants, classification tasks\
**Reference:** [Agrawal et al. (2025)](https://arxiv.org/abs/2507.19457)

GEPA uses evolutionary principles to explore the prompt space:

* **Population-based search** with multiple prompt variants
* **LLM-guided mutations** for intelligent prompt modifications
* **Pareto optimization** balancing performance and prompt length
* **Multi-stage support** for pipeline optimization

**Typical results:** Improves accuracy from 60-75% (baseline) to 85-90%+ over 15 generations

**Key features:**

* Maintains a Pareto front of non-dominated solutions
* Supports both template mode and pattern-based transformations
* Module-aware evolution for multi-stage pipelines
* Reflective feedback from execution traces
* **Hosted verifier integration** for quality-aware optimization

## Architecture: Inference Interception

GEPA **does** call your task app's `/rollout` endpoint — but optimized prompts never appear in the rollout payload. Instead, the backend registers each candidate with an **inference interceptor** and passes your task app a `policy_config.inference_url`. When your task app makes LLM calls through that URL, the interceptor substitutes the candidate prompt before forwarding to the model.

```
GEPA evaluation flow:

Backend ──proposes candidate──▶ Interceptor (registers prompt)
Backend ──/rollout──▶ Task App
Task App ──LLM call via inference_url──▶ Interceptor ──substitutes prompt──▶ LLM
Task App ◀──response──────────────────── LLM
Backend  ◀──metrics/reward────────────── Task App
```

This separation ensures:

* **No prompt leakage**: your task app never sees the optimized prompt text
* **Task apps remain unchanged**: just route LLM calls through `policy_config.inference_url`
* **Traces captured**: the interceptor records execution traces for reflective feedback
* **Stored artifacts**: traces and artifacts can be reused for reflection across generations

## Production-Ready: Works with Your Code

GEPA works with your production code via HTTP-based serverless endpoints. Build LocalAPI in any language (Rust, Go, TypeScript, Zig, Python, or any language that can serve HTTP). See [Polyglot LocalAPI](/sdk/localapi/in-process) for examples and the OpenAPI contract.

## Supported Models

See [Supported Models for Prompt Optimization](/sdk/hosted-optimizers) for the full list of policy models.

## Multi-Stage Pipeline Support

GEPA supports optimizing prompts for multi-stage pipelines (e.g., Banking77 classifier → calibrator):

* **LCS-based stage detection** automatically identifies which stage is being called
* **Per-stage optimization** evolves separate instructions for each pipeline module
* **Unified evaluation** tracks end-to-end performance across all stages

## Next Steps

* [GEPA Algorithm Details](/prompt-optimization-gepa) – How GEPA works under the hood
* [System Specifications](/prompt-optimization-gepa) – How specs guide optimization
* [Configuration Reference](/prompt-optimization-gepa) – Complete parameter documentation
* [Training Guide](/prompt-optimization-gepa) – Step-by-step training instructions
* [Prompt Optimization Cookbook](/sdk/localapi/in-process) – Complete walkthrough
