Synth is a post-training platform for agents. You ship a task app (a small FastAPI service that exposes your environment), collect traces through structured rollouts, and then run either supervised fine-tuning (SFT) or reinforcement learning (RL) to improve the policy. Every flow shares the same core pieces:
  1. Task app – describes actions, observations, rubrics, and tracing hooks (see the sketch after this list).
  2. Rollouts / eval – run the agent to gather data, judge results, and benchmark policies.
  3. Training job – submit an SFT or RL job with the Synth CLI, monitor progress, and retrieve the updated model.
Agents (humans or other services) can read these pages top to bottom and reproduce the entire workflow.
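
Below is a minimal sketch of what a task app can look like, assuming FastAPI and Pydantic are available; the endpoint names, payload shapes, and scoring rule are illustrative assumptions rather than Synth's actual task app contract (the task app docs define the real schema).

```python
# Hypothetical task app sketch: a tiny FastAPI service exposing an
# environment step and a health check. Endpoint names and payload
# shapes are assumptions for illustration, not Synth's real contract.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Action(BaseModel):
    answer: str  # the policy's proposed answer for this task


class StepResult(BaseModel):
    reward: float  # rubric score for the action
    done: bool     # whether the episode is finished


@app.get("/health")
def health() -> dict:
    return {"status": "ok"}


@app.post("/step", response_model=StepResult)
def step(action: Action) -> StepResult:
    # Toy rubric: full reward if the agent answers the (fixed) math
    # question correctly, zero otherwise.
    correct = action.answer.strip() == "42"
    return StepResult(reward=1.0 if correct else 0.0, done=True)
```

Served with a standard ASGI runner such as uvicorn, a service along these lines gives rollouts a scriptable surface: the agent posts actions, and the app returns rubric-scored results that become traces.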

How the docs are organized

  • Quickstart – install the CLI, pair it with your dashboard session, and walk through the math demo.
  • SFT docs – gather or filter JSONL datasets (see the filtering sketch after this list), configure hyperparameters, and launch fine-tuning jobs.
  • RL docs – configure trainer topologies, deploy task apps to Modal, and run online RL.
  • Rollouts – the uvx synth-ai eval command covers both evaluation and data collection.
  • Reference material, such as pricing and models, lives in dedicated sections.
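
As a sketch of the data-prep step the SFT docs describe, the snippet below filters rollout traces down to a training set; the file names, the record schema (a messages list plus a reward field), and the 0.5 threshold are all assumptions about how traces are exported, not a documented format.

```python
# Hypothetical JSONL filter: keep only high-reward traces for SFT.
# File names, the record schema ("messages", "reward"), and the
# threshold are assumptions for illustration.
import json

MIN_REWARD = 0.5

with open("traces.jsonl") as src, open("sft_train.jsonl", "w") as dst:
    for line in src:
        record = json.loads(line)
        # Keep chat-format records whose episode reward cleared the bar.
        if record.get("reward", 0.0) >= MIN_REWARD and record.get("messages"):
            dst.write(json.dumps({"messages": record["messages"]}) + "\n")
```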

Model families (Synth hosted)

Model             | Sizes                                    | Notes
Qwen 3            | 0.6B, 1.7B, 4B, 8B, 14B, 32B             | Best starting point; dense thinking models that support tool-calling
Qwen 3 (Advanced) | 4B-2507, 30B-A3B, 235B-A22B*, 480B-A35B* | Unique Instruct & Thinking variants with MoE support
Qwen 3 VL         | 2B, 4B, 8B, 30B-A3B, 32B, 235B-A22B*     | Multimodal (vision + language) family with Instruct & Thinking variants
Qwen 3 Coder      | 30B-A3B, 480B-A35B*                      | Specialized for code generation (Instruct only)
* 235B and 480B models must be sharded across multiple GPUs for inference and training.