2025-10-09 – LoRA, MoE & Large Model Support
🚀 New Features
- Expanded Qwen catalog: Simple Training now ships SFT and inference presets for every Qwen release outside the existing `qwen3-{0.6B–32B}` range, giving full coverage for the remaining Qwen 1.x/2.x/2.5 checkpoints.
- Large-model inference & training topologies: Added 2×, 4×, and 8× layouts across B200, H200, and H100 fleets, all MoE-ready for advanced Qwen variants in both SFT and inference workflows.
- Turnkey rollout: API and UI selectors automatically surface the new Qwen SKUs so jobs can be scheduled without manual topology overrides.
- LoRA-first SFT: Low-Rank Adaptation is now a first-class training mode across every new Qwen topology, providing parameter-efficient finetuning defaults out of the box.
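LoRA keeps the base weights frozen and learns only a low-rank update, which is why it is parameter-efficient. A minimal NumPy sketch of the idea (shapes and the `alpha/r` scaling convention are illustrative, not Synth's actual implementation):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Forward pass with a LoRA adapter: y = x @ (W + (alpha/r) * A @ B).

    W is the frozen base weight (d_in x d_out); A (d_in x r) and B (r x d_out)
    are the trainable low-rank factors, so only r*(d_in + d_out) parameters
    are updated instead of d_in*d_out.
    """
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 8
W = rng.normal(size=(d_in, d_out))
A = rng.normal(scale=0.01, size=(d_in, r))
B = np.zeros((r, d_out))  # B starts at zero, so the adapter is a no-op at init
x = rng.normal(size=(2, d_in))

assert np.allclose(lora_forward(x, W, A, B), x @ W)  # matches base model at init
```

Because only `A` and `B` receive gradients, a rank-8 adapter on a 64x64 layer trains 1,024 parameters instead of 4,096.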
- Rollout Viewer: Enhanced visualization and monitoring interface for training rollouts with real-time metrics and progress tracking
- B200 & H200 GPU Support: Added support for NVIDIA’s latest flagship GPUs (B200, H200) for both training and inference workloads
- Faster Inference: Optimized inference pipeline with improved throughput and reduced latency across all model sizes
- GSPO Support: Integrated Group Sequence Policy Optimization (GSPO) algorithm for advanced reinforcement learning training
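GSPO's distinguishing move is computing the importance ratio at the sequence level, as the length-normalized product of token ratios, inside a PPO-style clipped surrogate with group-normalized advantages. A minimal sketch of that objective under the published formulation (the function shape and group-normalization details here are illustrative, not Synth's trainer code):

```python
import math

def gspo_objective(logp_new, logp_old, rewards, eps=0.2):
    """Sketch of the GSPO clipped surrogate.

    logp_new/logp_old: per-sequence lists of per-token log-probs under the
    current and behavior policies. rewards: one scalar per sequence in the
    group. Advantages are the group-normalized rewards.
    """
    mean_r = sum(rewards) / len(rewards)
    std_r = (sum((r - mean_r) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    advantages = [(r - mean_r) / std_r for r in rewards]

    total = 0.0
    for lp_new, lp_old, a in zip(logp_new, logp_old, advantages):
        # sequence-level ratio: geometric mean of the per-token ratios
        s = math.exp(sum(n - o for n, o in zip(lp_new, lp_old)) / len(lp_new))
        s_clip = min(max(s, 1.0 - eps), 1.0 + eps)
        total += min(s * a, s_clip * a)
    return total / len(advantages)
```

With identical policies the ratio is exactly 1 for every sequence, so the objective reduces to the mean advantage (zero after normalization); large ratios are clipped to `1 + eps` before multiplying the advantage.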
2025-09-17 – Online RL (customer‑visible features)
- Organization‑scoped environment credentials
  - Upload your environment API key once (sealed‑box encrypted). The platform decrypts and injects it at run time; plaintext is never transmitted or stored.
- First‑party Task App integration
  - Run environments behind a managed Task App with authenticated rollouts. Online RL calls your Task App endpoints directly during training.
- Single‑node, multi‑GPU Online RL
  - Out‑of‑the‑box split between vLLM inference GPUs and training GPUs on a single node (e.g., 6 inference / 2 training on H100). Multi-node training is finished in dev; reach out if interested.
  - Supports a reference model (for KL) stacked on the inference GPUs or placed on its own GPU, plus configurable tensor parallelism for inference.
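A layout like the 6/2 H100 split above might be expressed in a job config along these lines; the field names are hypothetical, shown only to illustrate the inference/training/reference-model dimensions being configured:

```toml
# Illustrative single-node topology (not the actual Synth config schema)
[topology]
gpus_inference  = 6           # vLLM rollout workers
gpus_training   = 2           # policy-optimization workers
tensor_parallel = 2           # TP degree for the inference shard
reference_model = "stacked"   # share inference GPUs; "dedicated" for its own GPU
```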
- Production run flow
  - Start an Online RL job against your deployed Task App, monitor progress and events, and run inference with the produced checkpoint when training completes.
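The monitor step of that flow reduces to polling job status until a terminal state. A generic sketch in which the status source is any callable (standing in for an HTTP call); the state names and function signature are illustrative, not the actual Synth API:

```python
import time

def poll_until_terminal(get_status, interval_s=5.0, timeout_s=3600.0,
                        sleep=time.sleep):
    """Poll get_status() until the job reaches a terminal state or times out.

    get_status: zero-argument callable returning the job's current state
    string; injected so the loop is independent of any particular client.
    """
    terminal = {"succeeded", "failed", "canceled"}
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in terminal:
            return status
        sleep(interval_s)
    raise TimeoutError("job did not reach a terminal state")

# Usage with a fake status source in place of a real API client:
states = iter(["queued", "running", "running", "succeeded"])
assert poll_until_terminal(lambda: next(states), sleep=lambda _: None) == "succeeded"
```

Injecting `sleep` keeps the loop testable without real delays; a production client would also surface the job's event stream between polls.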
0.2.2.dev2 — Aug 8, 2025
- Fine-tuning (SFT) endpoints available and documented end-to-end
- Interactive demo launcher (`uvx synth-ai demo`) with a finetuning flow for Qwen 4B
- Live polling output during training with real-time status updates
- CLI Reference for `uvx synth-ai serve`, `uvx synth-ai traces`, and the demo launcher
0.2.2.dev1 — Aug 7, 2025
- New backend balance APIs and CLI for account visibility
- CLI utilities: `balance`, `traces`, and `man` commands
- Traces inventory view with per-DB counts and storage footprint
- Standardized one-off usage: `uvx synth-ai <command>` (removed interactive `watch`)
- Improved `.env` loading and API key resolution
0.2.2.dev0 — Jul 30, 2025
- Environment Registration API for custom environments
- Turso/sqld daemon support with local-first replicas
- Environment Service Daemon via `uvx synth-ai serve`
0.2.1.dev1 — Jul 29, 2025
- Initial development release
Feb 3, 2025
- Cuvier Error Search (deprecated)
Jan 2025
- LangSmith integration for Enterprise partners
- Python SDK v0.3 (simplified API, Anthropic support)