Under the hood, the train command creates a job via the `/api/learning/jobs` endpoint, uploads your dataset, and monitors the run until completion. The CLI handles all of this for you.
For CLI flag descriptions, head to Launch Training Jobs.
1. Create the config TOML for your task app
Create a TOML file that follows the schema documented in the SFT config reference.
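For orientation only, a skeleton might look like the sketch below. Of the keys shown, only `job.model` and `model.max_images_per_message` are named in this guide; treat everything else as a placeholder and take the authoritative schema from the SFT config reference.

```toml
# Sketch only; see the SFT config reference for the real schema.
[job]
model = "Qwen/Qwen3-VL-<size>-Instruct"  # any Qwen/Qwen3-VL-* checkpoint enables vision mode

[model]
max_images_per_message = 1  # registry default; raising it increases GPU memory pressure
```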
Vision-language (Qwen3-VL)

Synth treats Qwen3-VL checkpoints as multimodal models and flips the SFT pipeline into “vision mode” automatically when `job.model` points at `Qwen/Qwen3-VL-*`. Keep the following in mind:
- Config tweaks: you do not need to add extra knobs; `supports_vision`, `max_images_per_message`, and BF16 precision are pulled from the model registry. The trainer will clamp `per_device_batch`/`per_device_eval_batch` to 1 and raise `gradient_accumulation_steps` to keep memory in check. If you truly need more than one image per turn, override the registry default with `model.max_images_per_message`, but expect higher GPU memory pressure.
- Dataset shape: every JSONL record must contain a `messages[]` array using the OpenAI multimodal schema. Each message's `content` can mix text segments (`{"type": "text", ...}`) and image segments (`{"type": "image_url", ...}`); see the sample record after this list. The trainer also understands legacy payloads with top-level `images`/`image_url` fields, but everything is converted into the `messages[]` format.
- Image references: each `image_url.url` must be resolvable from the training container. HTTPS URLs, public object-store links, and `data:image/...;base64,<payload>` blobs are supported. Local filesystem paths only work if that path exists inside the uploaded artifact, so prefer URLs or data URIs.
- Image limits: Qwen3-VL defaults to `max_images_per_message = 1`. Additional images in a single turn are trimmed and a debug log is emitted. Plan your prompts accordingly, or bump the limit explicitly if your GPU topology can handle it.
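For illustration, a minimal vision SFT record might look like the following (pretty-printed here; each record occupies a single line in the JSONL file). The `role`/`content`/`type`/`image_url` field names follow the standard OpenAI multimodal schema described above; the URL and texts are placeholders, and a `data:image/...;base64,` URI can stand in for the URL.

```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What is shown in this screenshot?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}}
      ]
    },
    {"role": "assistant", "content": "The screenshot shows a dashboard with ..."}
  ]
}
```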
2. Launch the job
Running `uvx synth-ai train --type sft …` performs the following steps:

- The CLI validates the dataset (it must contain `messages[]` with ≥2 turns).
- The JSONL is uploaded to `/api/learning/files`; the CLI waits until the backend marks it `ready`.
- A job is created and started with the payload generated from your TOML.
- The CLI polls status and prints progress events until the job reaches a terminal state.
Useful flags:

- `--backend` – override the Synth API base URL (defaults to production).
- `--model` – override the model in the TOML without editing the file.
- `--examples N` – upload only the first `N` JSONL records (smoke testing).
- `--no-poll` – submit the job and exit immediately (useful when an agent wants to poll separately).
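For example, a smoke-test launch might combine these flags (the `…` stands in for your remaining arguments, which are documented in Launch Training Jobs):

```bash
# Upload only the first 10 records and return immediately (illustrative).
uvx synth-ai train --type sft … --examples 10 --no-poll
```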
By default, secrets are read from the `.env` produced by `uvx synth-ai setup`. Use `--env-file` when you need to target a different secrets file (you can pass the flag multiple times to layer values).
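For instance (file names are illustrative, and which file wins on conflicting keys is not specified here, so verify the layering order):

```bash
# Layer a shared secrets file with a job-specific one (illustrative names).
uvx synth-ai train --type sft … --env-file .env.shared --env-file .env.job
```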
3. Monitor and retrieve outputs
- Copy the `job_id` printed in the CLI output.
- Re-run the command later with `--no-poll` to check status without re-uploading.
- Query job details directly (see the request sketch after this list).
- When the job succeeds, note the `fine_tuned_model` identifier in the response. You will use this value when deploying the updated policy.
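A direct status query might look like the sketch below. The endpoint path comes from this guide, but the bearer-auth header, the `SYNTH_API_KEY` and `BACKEND_URL` variable names, and the response shape are assumptions, so adjust them to your setup:

```bash
# Fetch job details (sketch; auth scheme assumed).
curl -s \
  -H "Authorization: Bearer $SYNTH_API_KEY" \
  "$BACKEND_URL/api/learning/jobs/<job_id>"
```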
4. Automate with agents
- Run `uvx synth-ai eval` to generate traces.
- Run `uvx synth-ai filter` to create JSONL.
- Run `uvx synth-ai train --type sft … --no-poll`.
- Poll `/api/learning/jobs/<job_id>` until status is `succeeded` (a minimal polling loop is sketched below).
- Fetch the new `fine_tuned_model` and move on to deployment.
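As a sketch of the polling step, a loop like the following could work. Apart from the endpoint path and the `succeeded` status, everything here is an assumption (the other terminal status names, the `.status` and `.fine_tuned_model` response fields, and the auth scheme), so verify against the API before relying on it:

```bash
# Poll the job until it reaches a terminal state (sketch; field names assumed).
JOB_ID="<job_id>"   # the job_id printed by the CLI
while true; do
  BODY=$(curl -s -H "Authorization: Bearer $SYNTH_API_KEY" \
    "$BACKEND_URL/api/learning/jobs/$JOB_ID")
  STATUS=$(echo "$BODY" | jq -r '.status')
  echo "status=$STATUS"
  case "$STATUS" in
    succeeded)
      echo "$BODY" | jq -r '.fine_tuned_model'   # identifier used for deployment
      break ;;
    failed|cancelled)                            # assumed terminal states
      break ;;
  esac
  sleep 30
done
```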