
synth-ai/examples/finetuning/synth_qwen/
.
Requirements
- Have
uv
installed and useuvx
/uv run
SYNTH_API_KEY
exported in your shell- Local tracing and environment service deployed with
uvx synth-ai serve
- End-to-end flow in four steps: Generate traces → Filter to SFT JSONL → Kick off SFT → Run fine-tuned model
- Uses Qwen/Qwen3-4B-Instruct-2507 with tool-calling in a Crafter environment
- Central configuration via
examples/finetuning/synth_qwen/config.toml
Overview: ReAct agent + tool-calling in Crafter
- Agent loop: A ReAct-style LLM agent runs inside the Crafter environment. Each turn the model thinks in text and issues a structured tool call (OpenAI functions) to act in the world.
- Tool-calling: We send OpenAI-compatible messages plus function tools (e.g., step/look). For Qwen3 we use its native chat template and support
tool_choice
andstop_after_tool_calls
to ensure a clean, single action per turn. - API usage:
- Initial rollouts use a dev-only instance of
Qwen/Qwen3-4B-Instruct-2507
via the Synth inference API to generate traces. - We filter those traces into an OpenAI-format SFT JSONL and kick off fine-tuning through the same Synth API.
- Fine-tuning returns a model id like
ft:Qwen/Qwen3-4B-Instruct-2507:ftjob-<full-uuid>
, which we then use for inference in Crafter.
- Initial rollouts use a dev-only instance of
- Observability: Full tracing (SQLite/Turso) captures sessions, tool calls, rewards, and tokens for analysis and dataset creation.
- Generate traces (Qwen 4B)
- Filter traces → SFT JSONL
- Finetune (SFT)
- Evaluate the fine-tuned adapter