TL;DR
- Expanded Qwen Catalog: SFT and inference presets for all Qwen releases
- Large-Model Topologies: 2×, 4×, and 8× layouts across B200, H200, and H100 fleets
- MoE-Ready: All topologies support Mixture-of-Experts (MoE) Qwen variants
- LoRA-First SFT: Low-Rank Adaptation as first-class training mode
- Turnkey Rollout: Automatic SKU surfacing in API and UI
Expanded Qwen Catalog
Simple Training now ships SFT and inference presets for every Qwen release beyond the existing qwen3-{0.6B–32B} range. An illustrative preset entry follows the list below.
- Full Coverage: Complete coverage for Qwen 1.x/2.x/2.5 checkpoints
- Presets: Ready-to-use presets for SFT and inference
- Consistency: Consistent configuration across all Qwen variants
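For orientation, here is a minimal sketch of what a preset entry might contain. The schema, field names, and values are illustrative assumptions, not Simple Training's documented configuration format.

```python
# Illustrative preset entry (assumed schema, not the documented one).
QWEN25_14B_SFT_PRESET = {
    "model": "Qwen/Qwen2.5-14B-Instruct",  # any Qwen 1.x/2.x/2.5 checkpoint
    "mode": "sft",                          # or "inference"
    "topology": "4xH200",                   # one of the layouts in the next section
    "adapter": "lora",                      # LoRA is the default training mode (see below)
}
```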
Large-Model Inference & Training Topologies
Added 2×, 4×, and 8× layouts across B200, H200, and H100 fleets. A sizing sketch follows the list below.
- Multiple Layouts: Choose from 2×, 4×, or 8× GPU configurations
- Fleet Support: Works across B200, H200, and H100 GPU fleets
- MoE-Ready: All topologies support Mixture-of-Experts (MoE) Qwen variants
- SFT & Inference: Support for both training and inference workflows
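As a rough illustration of how a job might pick a layout, the sketch below maps model parameter counts to GPU configurations. The helper name, capacity thresholds, and layout labels are assumptions made for illustration, not an official sizing table.

```python
# Hypothetical sizing guide: which layout a job might request for a given
# Qwen checkpoint. Thresholds (in billions of parameters) are illustrative.
TOPOLOGIES = {
    "2xH100": {"gpus": 2, "fits_up_to_b": 14},   # smaller dense models
    "4xH200": {"gpus": 4, "fits_up_to_b": 32},   # mid-size dense or small MoE
    "8xB200": {"gpus": 8, "fits_up_to_b": 72},   # largest dense and MoE variants
}

def pick_topology(param_count_b: float) -> str:
    """Return the smallest layout whose illustrative capacity covers the model."""
    for name, spec in TOPOLOGIES.items():
        if param_count_b <= spec["fits_up_to_b"]:
            return name
    raise ValueError("Model too large for the listed layouts")

print(pick_topology(32))  # -> "4xH200"
```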
LoRA-First SFT
Low-Rank Adaptation is now a first-class training mode across every new Qwen topology. A generic LoRA configuration sketch follows the list below.
- Default Mode: LoRA is the default training mode for new topologies
- Parameter Efficiency: Efficient fine-tuning with minimal parameter overhead
- Universal Support: Available across all Qwen topologies
- Easy Configuration: Simple configuration for LoRA training
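These notes do not show Simple Training's own LoRA interface, so the sketch below uses the open-source peft library to illustrate a typical low-rank adapter setup on a Qwen checkpoint. The rank, alpha, dropout, and target modules are common starting points, not prescribed values.

```python
# Generic LoRA sketch with peft; Simple Training's interface may differ.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Small-rank adapters on the attention projections keep the trainable
# parameter count to a tiny fraction of the base model.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

Training only the adapter weights is what makes LoRA a sensible default: checkpoints stay small and the base model's memory footprint dominates, so the same multi-GPU topologies used for inference can host fine-tuning runs.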
Turnkey Rollout
API and UI selectors automatically surface the new Qwen SKUs, so jobs can be scheduled without manual topology overrides. A discovery sketch follows the list below.
- Automatic Discovery: New SKUs automatically appear in selectors
- No Manual Overrides: No need to manually configure topologies
- Seamless Integration: Works seamlessly with existing workflows
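A minimal sketch of programmatic SKU discovery is shown below. It assumes a REST endpoint and response shape that are not documented in these notes; the URL, query parameter, and field names are placeholders.

```python
# Hypothetical SKU discovery call -- endpoint and fields are placeholders.
import requests

resp = requests.get("https://api.example.com/v1/skus", params={"family": "qwen"})
resp.raise_for_status()

for sku in resp.json().get("skus", []):
    # Each SKU already carries its topology, so no manual override is needed
    # when scheduling a job against it.
    print(sku["name"], sku["topology"])
```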
Use Cases
- Complete Qwen Coverage: Use any Qwen model variant with presets
- Large Model Training: Train large models with multi-GPU topologies
- Efficient Fine-Tuning: Use LoRA-first approach for parameter-efficient training
- Simplified Workflows: Automatic SKU discovery simplifies job scheduling