2025-10-09 – LoRA, MoE & Large Model Support
🚀 New Features
- Expanded Qwen catalog: Simple Training now ships SFT and inference presets for every Qwen release outside the existing `qwen3-{0.6B–32B}` range, giving full coverage for the remaining Qwen 1.x/2.x/2.5 checkpoints.
- Large-model inference & training topologies: Added 2×, 4×, and 8× layouts across B200, H200, and H100 fleets, all MoE-ready for advanced Qwen variants in both SFT and inference workflows.
- Turnkey rollout: API and UI selectors automatically surface the new Qwen SKUs so jobs can be scheduled without manual topology overrides.
- LoRA-first SFT: Low-Rank Adaptation is now a first-class training mode across every new Qwen topology, providing parameter-efficient finetuning defaults out of the box.
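LoRA keeps the base weights frozen and learns only a low-rank update, which is why it is parameter-efficient. A minimal NumPy sketch of the idea (shapes and the `alpha/r` scaling convention are illustrative, not Synth's actual implementation):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Forward pass with a LoRA adapter: y = x @ (W + (alpha/r) * A @ B).

    W is the frozen base weight (d_in x d_out); A (d_in x r) and B (r x d_out)
    are the trainable low-rank factors, so only r*(d_in + d_out) parameters
    are updated instead of d_in*d_out.
    """
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 8
W = rng.normal(size=(d_in, d_out))
A = rng.normal(scale=0.01, size=(d_in, r))
B = np.zeros((r, d_out))  # B starts at zero, so the adapter is a no-op at init
x = rng.normal(size=(2, d_in))

assert np.allclose(lora_forward(x, W, A, B), x @ W)  # matches base model at init
```

Because only `A` and `B` receive gradients, a rank-8 adapter on a 64x64 layer trains 1,024 parameters instead of 4,096.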
- Rollout Viewer: Enhanced visualization and monitoring interface for training rollouts with real-time metrics and progress tracking
- B200 & H200 GPU Support: Added support for NVIDIA’s latest flagship GPUs (B200, H200) for both training and inference workloads
- Faster Inference: Optimized inference pipeline with improved throughput and reduced latency across all model sizes
- GSPO Support: Integrated Group Sequence Policy Optimization (GSPO) algorithm for advanced reinforcement learning training
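GSPO's distinguishing move is computing the importance ratio at the sequence level, as the length-normalized product of token ratios, inside a PPO-style clipped surrogate with group-normalized advantages. A minimal sketch of that objective under the published formulation (the function shape and group-normalization details here are illustrative, not Synth's trainer code):

```python
import math

def gspo_objective(logp_new, logp_old, rewards, eps=0.2):
    """Sketch of the GSPO clipped surrogate.

    logp_new/logp_old: per-sequence lists of per-token log-probs under the
    current and behavior policies. rewards: one scalar per sequence in the
    group. Advantages are the group-normalized rewards.
    """
    mean_r = sum(rewards) / len(rewards)
    std_r = (sum((r - mean_r) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    advantages = [(r - mean_r) / std_r for r in rewards]

    total = 0.0
    for lp_new, lp_old, a in zip(logp_new, logp_old, advantages):
        # sequence-level ratio: geometric mean of the per-token ratios
        s = math.exp(sum(n - o for n, o in zip(lp_new, lp_old)) / len(lp_new))
        s_clip = min(max(s, 1.0 - eps), 1.0 + eps)
        total += min(s * a, s_clip * a)
    return total / len(advantages)
```

With identical policies the ratio is exactly 1 for every sequence, so the objective reduces to the mean advantage (zero after normalization); large ratios are clipped to `1 + eps` before multiplying the advantage.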
2025-09-17 – Online RL (customer‑visible features)
- Organization‑scoped environment credentials
  - Upload your environment API key once (sealed‑box encrypted). The platform decrypts and injects it at run time; plaintext is never transmitted or stored.
- First‑party Task App integration
  - Run environments behind a managed Task App with authenticated rollouts. Online RL calls your Task App endpoints directly during training.
- Single‑node, multi‑GPU Online RL
  - Out‑of‑the‑box split between vLLM inference GPUs and training GPUs on a single node (e.g., 6 inference / 2 training on H100). Multi-node training is finished in dev; reach out if interested.
  - Supports a reference model (for KL) stacked on the inference GPUs or placed on its own GPU, plus configurable tensor parallelism for inference.
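A layout like the 6/2 H100 split above might be expressed in a job config along these lines; the field names are hypothetical, shown only to illustrate the inference/training/reference-model dimensions being configured:

```toml
# Illustrative single-node topology (not the actual Synth config schema)
[topology]
gpus_inference  = 6           # vLLM rollout workers
gpus_training   = 2           # policy-optimization workers
tensor_parallel = 2           # TP degree for the inference shard
reference_model = "stacked"   # share inference GPUs; "dedicated" for its own GPU
```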
- Production run flow
  - Start an Online RL job against your deployed Task App, monitor progress and events, and run inference with the produced checkpoint when training completes.
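The monitor step of that flow reduces to polling job status until a terminal state. A generic sketch in which the status source is any callable (standing in for an HTTP call); the state names and function signature are illustrative, not the actual Synth API:

```python
import time

def poll_until_terminal(get_status, interval_s=5.0, timeout_s=3600.0,
                        sleep=time.sleep):
    """Poll get_status() until the job reaches a terminal state or times out.

    get_status: zero-argument callable returning the job's current state
    string; injected so the loop is independent of any particular client.
    """
    terminal = {"succeeded", "failed", "canceled"}
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in terminal:
            return status
        sleep(interval_s)
    raise TimeoutError("job did not reach a terminal state")

# Usage with a fake status source in place of a real API client:
states = iter(["queued", "running", "running", "succeeded"])
assert poll_until_terminal(lambda: next(states), sleep=lambda _: None) == "succeeded"
```

Injecting `sleep` keeps the loop testable without real delays; a production client would also surface the job's event stream between polls.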
0.2.2.dev2 — Aug 8, 2025
- Fine-tuning (SFT) endpoints available and documented end-to-end
- Interactive demo launcher (`uvx synth-ai demo`) with a finetuning flow for Qwen 4B
- Live polling output during training with real-time status updates
- CLI Reference for `uvx synth-ai serve`, `uvx synth-ai traces`, and the demo launcher
0.2.2.dev1 — Aug 7, 2025
- New backend balance APIs and CLI for account visibility
- CLI utilities: `balance`, `traces`, and `man` commands
- Traces inventory view with per-DB counts and storage footprint
- Standardized one-off usage: `uvx synth-ai <command>` (removed interactive `watch`)
- Improved `.env` loading and API key resolution
0.2.2.dev0 — Jul 30, 2025
- Environment Registration API for custom environments
- Turso/sqld daemon support with local-first replicas
- Environment Service Daemon via `uvx synth-ai serve`
0.2.1.dev1 — Jul 29, 2025
- Initial development release
Feb 3, 2025
- Cuvier Error Search (deprecated)
Jan 2025
- LangSmith integration for Enterprise partners
- Python SDK v0.3 (simplified API, Anthropic support)