1. Install the Crafter demo to your current working directory
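   The install command itself isn't shown in this excerpt; a plausible invocation, assuming the CLI exposes a demo scaffolding subcommand (the `demo` name is hypothetical; check `uvx synth-ai --help` for the exact name):

   ```bash
   # Hypothetical subcommand: scaffold the Crafter demo into the current
   # working directory. Verify the exact name with `uvx synth-ai --help`.
   uvx synth-ai demo
   ```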
2. Save and load your Synth credentials
   `synth-ai setup` automatically does the following:
   - Fetches `SYNTH_API_KEY` and `ENVIRONMENT_API_KEY` from https://usesynth.ai via your web browser
   - Saves `SYNTH_API_KEY` and `ENVIRONMENT_API_KEY` to `.env` in the current working directory
   - Saves `SYNTH_API_KEY` and `ENVIRONMENT_API_KEY` to `~/.synth-ai/config.json`
   - Loads `SYNTH_API_KEY` and `ENVIRONMENT_API_KEY` into the process environment
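   A minimal sketch of this step; the `synth-ai setup` subcommand comes from the text above, and `uvx` runs it without a permanent install:

   ```bash
   # Opens https://usesynth.ai in your browser to fetch credentials, then
   # writes SYNTH_API_KEY and ENVIRONMENT_API_KEY to ./.env and
   # ~/.synth-ai/config.json, and loads them into the process environment.
   uvx synth-ai setup
   ```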
3. Deploy the pre-built Crafter task app locally to start collecting rollout data
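   The deploy command isn't shown in this excerpt; a sketch assuming a `serve`-style subcommand (hypothetical name) that hosts the task app on a local port:

   ```bash
   # Hypothetical subcommand; verify the exact name with `uvx synth-ai --help`.
   # Hosts the pre-built Crafter task app locally so eval rollouts can reach it.
   uvx synth-ai serve
   ```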
4. Collect rollouts for supervision
   - In a second terminal, request a batch of traced evaluations with `uvx synth-ai eval` (a hedged sketch follows this step).
   - This command drives `/rollout` for each seed, writes structured traces to `traces/v3/eval.sqlite`, and stores per-turn JSONL shards under `ft_data/raw_sft`.
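   A hedged sketch of the eval invocation; only `uvx synth-ai eval` comes from the text, while the `--config` flag, config path, and seed range are assumptions:

   ```bash
   # Run from a second terminal while the task app is up. The flags and
   # config path are assumptions, not the documented interface.
   uvx synth-ai eval --config configs/eval_crafter.toml --seeds 0-9
   # Drives /rollout once per seed, writing traces to traces/v3/eval.sqlite
   # and per-turn JSONL shards to ft_data/raw_sft/.
   ```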
5. Build a filtered SFT dataset
   - Create `configs/filter_crafter.toml` with the minimal filter config (a hedged example follows this step).
   - Export the curated JSONL using `uvx synth-ai filter` (also sketched below).
   - The resulting `ft_data/crafter_sft.jsonl` is ready for supervised fine-tuning.
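   A hedged sketch of both pieces. In the TOML, only `min_official_score` (mentioned under Next steps) and the two paths come from the text; the table and key names around them are assumptions about the filter schema, as is the `--config` flag on the export command:

   ```bash
   # Write a minimal filter config. The schema is assumed, not documented here.
   cat > configs/filter_crafter.toml <<'EOF'
   [filter]
   input = "traces/v3/eval.sqlite"       # traces from `uvx synth-ai eval`
   output = "ft_data/crafter_sft.jsonl"  # curated SFT dataset
   min_official_score = 0.5              # drop low-scoring episodes
   EOF

   # Export the curated JSONL; the --config flag is an assumption.
   uvx synth-ai filter --config configs/filter_crafter.toml
   ```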
6. Launch the Crafter FFT baseline
   - Submit the bundled full-finetune job with `uvx synth-ai train` (a hedged sketch follows this step).
   - The `--poll` flag streams status updates until the job reaches a terminal state.
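   A sketch of the submission, assuming the bundled job config is `configs/crafter_fft_4b.toml` (named under Next steps) and is passed via a `--config` flag (an assumption); `--poll` comes from the text:

   ```bash
   # Submit the full-finetune job and stream status updates until the job
   # reaches a terminal state. The --config flag is an assumption.
   uvx synth-ai train --config configs/crafter_fft_4b.toml --poll
   ```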
7. Evaluate your fine-tuned model
   - After the job finishes, re-run `uvx synth-ai eval` with the returned fine-tuned model id (for example `ft:CRAFT-1234`); a hedged sketch follows this step.
   - Compare the new outcome and event scores against the baseline to confirm the supervised improvement.
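   A hedged sketch of the re-run; the `ft:CRAFT-1234` id is the example from the text, while the `--model` flag and the remaining flags are assumptions:

   ```bash
   # Evaluate the fine-tuned checkpoint with the same seeds as the baseline
   # so outcome and event scores are directly comparable. Flags are assumptions.
   uvx synth-ai eval --model "ft:CRAFT-1234" --config configs/eval_crafter.toml --seeds 0-9
   ```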
Next steps
- Tighten the filter thresholds (for example, raise `min_official_score` or add metadata selectors) and rerun steps 5–7 to study data quality trade-offs.
- Clone `configs/crafter_fft_4b.toml`, set `training.use_qlora = true`, and explore LoRA versus full-finetune results on the same dataset (a hedged sketch follows).
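A sketch of the LoRA variant; only `configs/crafter_fft_4b.toml` and `training.use_qlora = true` come from the text, and the cloned filename is hypothetical:

```bash
# Clone the bundled FFT config (the new filename is hypothetical).
cp configs/crafter_fft_4b.toml configs/crafter_qlora_4b.toml

# In the clone, set the key below (from the text), then rerun the train and
# eval steps against it to compare LoRA with the full-finetune baseline:
#
#   [training]
#   use_qlora = true
```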