
synth_ai.sdk.optimization.policy.job

Policy optimization job implementation. This module provides the canonical PolicyOptimizationJob class for running policy optimization (prompt/instruction optimization) jobs.

Replaces: PromptLearningJob (deprecated)
Backend endpoint: /api/policy-optimization/online/jobs

Algorithms:
  • gepa: Genetic Evolutionary Prompt Algorithm (default)
    • Evolutionary algorithm for optimizing prompts through population-based search
    • Uses mutation, crossover, and selection to evolve prompt candidates
    • Supports both online and offline optimization modes
  • mipro: Multi-prompt Instruction Proposal Optimizer
    • Systematic instruction proposal and evaluation algorithm
    • Generates new prompt instructions based on reward feedback
    • Supports online mode where you drive rollouts locally
    • Backend provides proxy URL for prompt candidate selection
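A minimal end-to-end sketch (illustrative only; the config file name is a placeholder and the API key is read from the environment):
>>> import os
>>> from synth_ai.sdk.optimization.policy import PolicyOptimizationJob
>>>
>>> job = PolicyOptimizationJob.from_config(
...     config_path="policy_config.toml",  # placeholder TOML config
...     api_key=os.environ["SYNTH_API_KEY"],
...     algorithm="gepa",
... )
>>> job_id = job.submit()
>>> result = job.stream_until_complete()
>>> print(f"Best score: {result.best_score}")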

Classes

Algorithm

Supported policy optimization algorithms. Attributes:
  • GEPA: Genetic Evolutionary Prompt Algorithm - Evolutionary population-based search
  • MIPRO: Multi-prompt Instruction Proposal Optimizer - Systematic instruction proposal
Methods:

from_string

from_string(cls, value: str) -> Algorithm
Convert string to Algorithm enum. Args:
  • value: Algorithm name (case-insensitive)
Returns:
  • Algorithm enum value; falls back to GEPA if the name is not recognized
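A short usage sketch (behavior follows the description above):
>>> algo = Algorithm.from_string("Gepa")  # case-insensitive
>>> algo is Algorithm.GEPA
True
>>> Algorithm.from_string("unknown") is Algorithm.GEPA  # unrecognized names fall back to GEPA
True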

PolicyOptimizationJobConfig

Configuration for a policy optimization job. This dataclass holds all the configuration needed to submit and run a policy optimization job (GEPA or MIPRO). Supports two modes:
  1. File-based: Provide config_path pointing to a TOML file
  2. Programmatic: Provide config_dict with the configuration directly
Attributes:
  • config_path: Path to the TOML configuration file. Mutually exclusive with config_dict.
  • config_dict: Dictionary with policy optimization configuration.
  • backend_url: Base URL of the Synth API backend.
  • api_key: Synth API key for authentication.
  • localapi_api_key: API key for authenticating with the LocalAPI.
  • algorithm: Optimization algorithm to use (gepa, mipro).
  • allow_experimental: If True, allows use of experimental models.
  • overrides: Dictionary of config overrides.
Example (file-based):
>>> config = PolicyOptimizationJobConfig(
...     config_path=Path("my_config.toml"),
...     backend_url="https://api.usesynth.ai",
...     api_key="sk_live_...",
... )
Example (programmatic with GEPA):
>>> config = PolicyOptimizationJobConfig(
...     config_dict={
...         "policy_optimization": {
...             "algorithm": "gepa",
...             "localapi_url": "https://tunnel.example.com",
...             "policy": {"model": "gpt-4o-mini", "provider": "openai"},
...             "gepa": {...},
...         }
...     },
...     backend_url="https://api.usesynth.ai",
...     api_key="sk_live_...",
... )
Example (programmatic with MIPRO):
>>> config = PolicyOptimizationJobConfig(
...     config_dict={
...         "policy_optimization": {
...             "algorithm": "mipro",
...             "task_app_url": "https://your-task-app.example.com",
...             "policy": {"model": "gpt-4o-mini", "provider": "openai"},
...             "mipro": {
...                 "mode": "online",
...                 "bootstrap_train_seeds": [0, 1, 2, 3, 4],
...                 "val_seeds": [100, 101, 102],
...                 "proposer": {"model": "gpt-4o-mini", "provider": "openai"},
...             },
...         }
...     },
...     backend_url="https://api.usesynth.ai",
...     api_key="sk_live_...",
... )
Methods:

to_prompt_learning_config

to_prompt_learning_config(self) -> Dict[str, Any]
Convert to prompt_learning config format for backward compatibility. The backend currently uses ‘prompt_learning’ section names. This method converts our config to that format until the backend is updated.
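A hedged sketch of the conversion (the exact legacy layout is backend-defined; the top-level "prompt_learning" key is an assumption based on the description above):
>>> config = PolicyOptimizationJobConfig(
...     config_dict={"policy_optimization": {"algorithm": "gepa"}},
...     api_key="sk_live_...",
... )
>>> legacy = config.to_prompt_learning_config()
>>> "prompt_learning" in legacy  # assumed: section renamed to the legacy name
True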

PolicyOptimizationJob

High-level SDK class for running policy optimization jobs. This is the canonical class for policy optimization, replacing PromptLearningJob. It supports both GEPA and MIPRO algorithms.

GEPA (Genetic Evolutionary Prompt Algorithm):
  • Evolutionary algorithm using population-based search
  • Optimizes prompts through mutation, crossover, and selection
  • Supports both online and offline optimization modes
  • Best for: Comprehensive search across prompt space
MIPRO (Multi-prompt Instruction Proposal Optimizer):
  • Systematic instruction proposal and evaluation
  • Generates new prompt instructions based on reward feedback
  • Online mode: You drive rollouts, backend provides prompt candidates
  • Best for: Iterative refinement with real-time prompt evolution
Example (GEPA):
>>> import os
>>> from synth_ai.sdk.optimization.policy import PolicyOptimizationJob
>>>
>>> # Create job from config
>>> job = PolicyOptimizationJob.from_config(
...     config_path="gepa_config.toml",
...     api_key=os.environ["SYNTH_API_KEY"],
...     algorithm="gepa"
... )
>>>
>>> # Submit job
>>> job_id = job.submit()
>>> print(f"Job submitted: {job_id}")
>>>
>>> # Stream until complete (recommended)
>>> result = job.stream_until_complete()
>>> print(f"Best score: {result.best_score}")
Example (MIPRO):
>>> import os
>>> from synth_ai.sdk.optimization.policy import PolicyOptimizationJob
>>>
>>> # Create MIPRO job from config
>>> job = PolicyOptimizationJob.from_config(
...     config_path="mipro_config.toml",
...     api_key=os.environ["SYNTH_API_KEY"],
...     algorithm="mipro"
... )
>>>
>>> # Submit job
>>> job_id = job.submit()
>>>
>>> # Poll until complete
>>> result = job.poll_until_complete(timeout=3600.0)
>>> print(f"Best score: {result.best_score}")
Attributes:
  • job_id: The job ID (None until submitted)
  • algorithm: The optimization algorithm being used (GEPA or MIPRO)
Methods:

from_config

from_config(cls, config_path: str | Path, backend_url: Optional[str] = None, api_key: Optional[str] = None, localapi_api_key: Optional[str] = None, algorithm: str | Algorithm = Algorithm.GEPA, allow_experimental: Optional[bool] = None, overrides: Optional[Dict[str, Any]] = None) -> PolicyOptimizationJob
Create a job from a TOML config file. Args:
  • config_path: Path to TOML config file
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
  • localapi_api_key: LocalAPI key (defaults to ENVIRONMENT_API_KEY env var)
  • algorithm: Optimization algorithm (gepa or mipro)
  • allow_experimental: Allow experimental models
  • overrides: Config overrides
Returns:
  • PolicyOptimizationJob instance
Raises:
  • ValueError: If required config is missing
  • FileNotFoundError: If config file doesn’t exist

from_dict

from_dict(cls, config_dict: Dict[str, Any], backend_url: Optional[str] = None, api_key: Optional[str] = None, localapi_api_key: Optional[str] = None, algorithm: str | Algorithm = Algorithm.GEPA, allow_experimental: Optional[bool] = None, overrides: Optional[Dict[str, Any]] = None, skip_health_check: bool = False) -> PolicyOptimizationJob
Create a job from a configuration dictionary. The config_dict can use either the new ‘policy_optimization’ section or the legacy ‘prompt_learning’ section for backward compatibility. Args:
  • config_dict: Configuration dictionary
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
  • localapi_api_key: LocalAPI key (defaults to ENVIRONMENT_API_KEY env var)
  • algorithm: Optimization algorithm (gepa or mipro)
  • allow_experimental: Allow experimental models
  • overrides: Config overrides
  • skip_health_check: If True, skip LocalAPI health check
Returns:
  • PolicyOptimizationJob instance
Example (GEPA):
>>> job = PolicyOptimizationJob.from_dict(
...     config_dict={
...         "policy_optimization": {
...             "algorithm": "gepa",
...             "localapi_url": "https://tunnel.example.com",
...             "policy": {"model": "gpt-4o-mini", "provider": "openai"},
...             "gepa": {...},
...         }
...     },
...     api_key="sk_live_...",
... )
Example (MIPRO):
>>> job = PolicyOptimizationJob.from_dict(
...     config_dict={
...         "policy_optimization": {
...             "algorithm": "mipro",
...             "task_app_url": "https://your-task-app.example.com",
...             "policy": {"model": "gpt-4o-mini", "provider": "openai"},
...             "mipro": {
...                 "mode": "online",
...                 "bootstrap_train_seeds": [0, 1, 2, 3, 4],
...                 "val_seeds": [100, 101, 102],
...                 "proposer": {"model": "gpt-4o-mini", "provider": "openai"},
...             },
...         }
...     },
...     api_key="sk_live_...",
... )

from_job_id

from_job_id(cls, job_id: str, backend_url: Optional[str] = None, api_key: Optional[str] = None) -> PolicyOptimizationJob
Resume an existing job by ID. Args:
  • job_id: Existing job ID
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
Returns:
  • PolicyOptimizationJob instance for the existing job
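Example (resuming and waiting on a previously submitted job; the job ID is a placeholder):
>>> import os
>>> job = PolicyOptimizationJob.from_job_id(
...     job_id="job_...",  # ID returned by an earlier submit()
...     api_key=os.environ["SYNTH_API_KEY"],
... )
>>> result = job.poll_until_complete(timeout=3600.0)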

job_id

job_id(self) -> Optional[str]
Get the job ID (None if not yet submitted).

algorithm

algorithm(self) -> Algorithm
Get the optimization algorithm.

submit

submit(self) -> str
Submit the job to the backend. Returns:
  • Job ID
Raises:
  • RuntimeError: If job submission fails
  • ValueError: If LocalAPI health check fails

get_status

get_status(self) -> Dict[str, Any]
Get current job status. Returns:
  • Job status dictionary
Raises:
  • RuntimeError: If job hasn’t been submitted yet
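Example (illustrative; the keys of the status dictionary are not enumerated by this reference):
>>> status = job.get_status()
>>> print(status.get("status"))  # "status" key is an assumption; inspect the full dict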

poll_until_complete

poll_until_complete(self, timeout: float = ..., interval: float = ..., progress: bool = ..., on_status: Optional[Callable] = None, request_timeout: float = ...) -> PolicyOptimizationResult
Poll job until it reaches a terminal state. Args:
  • timeout: Maximum seconds to wait for job completion
  • interval: Seconds between poll attempts
  • progress: If True, print status updates during polling
  • on_status: Optional callback called on each status update
  • request_timeout: HTTP timeout for each status request
Returns:
  • PolicyOptimizationResult with typed status, best_score, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted yet
  • TimeoutError: If timeout is exceeded
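Example using the documented keyword arguments (the values shown are illustrative choices, not library defaults):
>>> result = job.poll_until_complete(
...     timeout=3600.0,
...     interval=10.0,
...     progress=True,
...     on_status=lambda status: print(status),
... )
>>> print(f"Final status: {result.status}")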

stream_until_complete

stream_until_complete(self, timeout: float = 3600.0, interval: float = ..., handlers: Optional[List[StreamHandler]] = None, on_event: Optional[Callable] = None) -> PolicyOptimizationResult
Stream job events until completion using SSE. This provides real-time event streaming instead of polling, reducing server load and providing faster updates. Args:
  • timeout: Maximum seconds to wait (default: 3600 = 1 hour)
  • interval: Seconds between status checks (for SSE reconnects)
  • handlers: Optional StreamHandler instances for custom event handling
  • on_event: Optional callback called on each event
Returns:
  • PolicyOptimizationResult with typed status, best_score, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted yet
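Example with a per-event callback (the event payload shape is not specified here, so the callback simply prints it):
>>> result = job.stream_until_complete(
...     timeout=3600.0,
...     on_event=lambda event: print(event),
... )
>>> print(f"Best score: {result.best_score}")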

get_results

get_results(self) -> Dict[str, Any]
Get job results (prompts, scores, etc.). Returns:
  • Results dictionary with best_prompt, best_score, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted yet

get_best_prompt_text

get_best_prompt_text(self, rank: int = 1) -> Optional[str]
Get the text of the best prompt by rank. Args:
  • rank: Prompt rank (1 = best, 2 = second best, etc.)
Returns:
  • Prompt text or None if not found
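Example (best_score and ranked prompts follow the descriptions above; treat other result keys as unspecified):
>>> results = job.get_results()
>>> print(results.get("best_score"))
>>> best = job.get_best_prompt_text(rank=1)
>>> runner_up = job.get_best_prompt_text(rank=2)  # None if fewer than two prompts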

cancel

cancel(self, reason: Optional[str] = None) -> Dict[str, Any]
Cancel a running job. Args:
  • reason: Optional reason for cancellation
Returns:
  • Dict with cancellation status
Raises:
  • RuntimeError: If job hasn’t been submitted yet
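Example (assumes the job has already been submitted):
>>> outcome = job.cancel(reason="superseded by a newer config")
>>> print(outcome)  # cancellation status dictionary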

query_workflow_state

query_workflow_state(self) -> Dict[str, Any]
Query the Temporal workflow state for instant polling. Returns:
  • Dict with workflow state
Raises:
  • RuntimeError: If job hasn’t been submitted yet
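Example (the workflow-state dictionary is backend-defined and not enumerated here):
>>> state = job.query_workflow_state()
>>> print(state)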