When to Use
- Cloning successful AI generations (ReST-EM style self-training)
- Distilling from a larger model to a smaller one
- Training on domain-specific data (code, medical, legal, etc.)
- Teaching specific output formats or styles
- Vision fine-tuning with image-text pairs
Full Config Reference
Parameters
[algorithm] (Required)
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | ✓ | Must be "offline" for SFT |
| method | string | ✓ | "sft" or "supervised_finetune" |
| variety | string | ✓ | "fft" (full fine-tune), "lora", or "qlora" |
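
The allowed values in the table above can be checked before a job is submitted. The following is a minimal sketch, not part of any SDK: `validate_algorithm` is a hypothetical helper, and it assumes the `[algorithm]` section has already been parsed into a plain dict.

```python
# Hypothetical pre-submission check; the allowed values are copied from the [algorithm] table.
ALLOWED_ALGORITHM_VALUES = {
    "type": {"offline"},
    "method": {"sft", "supervised_finetune"},
    "variety": {"fft", "lora", "qlora"},
}


def validate_algorithm(section: dict) -> list[str]:
    """Return a list of problems; an empty list means the section looks valid."""
    problems = []
    for key, allowed in ALLOWED_ALGORITHM_VALUES.items():
        if key not in section:
            problems.append(f"missing required key: {key}")
        elif section[key] not in allowed:
            problems.append(f"{key}={section[key]!r} is not one of {sorted(allowed)}")
    return problems


if __name__ == "__main__":
    print(validate_algorithm({"type": "offline", "method": "sft", "variety": "lora"}))
    print(validate_algorithm({"type": "online", "method": "sft"}))
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.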
[job] (Required)
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✓ | HuggingFace model identifier (e.g., "Qwen/Qwen3-4B") |
| data | string | ✓ | Path to training data (JSONL format). Alternative: data_path |
| data_path | string |  | Alternative to data |
| poll_seconds | int |  | Polling interval for status updates (default: 30) |
[compute] (Required)
| Parameter | Type | Required | Description |
|---|---|---|---|
| gpu_type | string | ✓ | GPU type: "H100", "H200", "A100", etc. |
| gpu_count | int | ✓ | Number of GPUs |
| nodes | int |  | Number of nodes (default: 1) |
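
The three required sections can be checked for missing keys in one pass. This is a sketch under stated assumptions: `missing_required` is a hypothetical helper, the config is assumed to be parsed into nested dicts already, and the file name `train.jsonl` is a placeholder.

```python
# Required keys per section, as marked with a check mark in the tables above.
# Each tuple is a group of alternatives: `data_path` is documented as an
# alternative to `data`, so either one satisfies that requirement.
REQUIRED_KEYS = {
    "algorithm": [("type",), ("method",), ("variety",)],
    "job": [("model",), ("data", "data_path")],
    "compute": [("gpu_type",), ("gpu_count",)],
}


def missing_required(config: dict) -> list[str]:
    """List every required key (or group of alternatives) absent from the config."""
    missing = []
    for section, groups in REQUIRED_KEYS.items():
        values = config.get(section, {})
        for alternatives in groups:
            if not any(name in values for name in alternatives):
                missing.append(f"[{section}] " + " or ".join(alternatives))
    return missing


if __name__ == "__main__":
    example = {
        "algorithm": {"type": "offline", "method": "sft", "variety": "fft"},
        "job": {"model": "Qwen/Qwen3-4B", "data": "train.jsonl"},  # placeholder path
        "compute": {"gpu_type": "H100", "gpu_count": 1},
    }
    print(missing_required(example))  # no missing keys for this example
```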
[compute.topology]
| Parameter | Type | Description |
|---|---|---|
| type | string | Topology type (e.g., "single_node_split") |
| gpus_for_vllm | int | GPUs for inference |
| gpus_for_training | int | GPUs for training |
| gpus_for_ref | int | GPUs for reference model |
| tensor_parallel | int | Tensor parallelism degree |
| reference_placement | string | Reference model placement: "none", "shared", "dedicated" |
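
A split topology divides the GPUs from `[compute]` between roles. The sketch below assumes, without confirmation from this reference, that the three `gpus_for_*` values must not add up to more than `gpu_count`; `check_gpu_split` is a hypothetical helper, not a documented function.

```python
def check_gpu_split(gpu_count: int, topology: dict) -> int:
    """Return the number of unassigned GPUs.

    Assumption (not stated in the reference): the per-role GPU counts
    may not exceed the total configured in [compute].
    """
    requested = sum(
        topology.get(key, 0)
        for key in ("gpus_for_vllm", "gpus_for_training", "gpus_for_ref")
    )
    if requested > gpu_count:
        raise ValueError(
            f"topology requests {requested} GPUs but only {gpu_count} are configured"
        )
    return gpu_count - requested


if __name__ == "__main__":
    topology = {"type": "single_node_split", "gpus_for_vllm": 2, "gpus_for_training": 6}
    print(check_gpu_split(8, topology))
```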
[policy]
| Parameter | Type | Required | Description |
|---|---|---|---|
| model_name | string | ✓* | Model name (exactly one of model_name or source required) |
| source | string | ✓* | Checkpoint source (e.g., "ft:abc123") |
| max_tokens | int |  | Max generation tokens (default: 512) |
| temperature | float |  | Sampling temperature (default: 0.7) |
| top_p | float |  | Top-p sampling (default: 0.95) |
| top_k | int |  | Top-k sampling |
| repetition_penalty | float |  | Repetition penalty (default: 1.0) |
| stop_sequences | list |  | Stop sequences |
| trainer_mode | string | ✓ | Training mode: "full", "lora", or "qlora" |
| label | string | ✓ | Model identifier/name |
| inference_url | string |  | URL for distributed inference |
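
The "exactly one of model_name or source" rule is easy to get wrong when both keys are left in a config. A minimal sketch of the check, using a hypothetical helper `resolve_policy_model` on an already-parsed `[policy]` dict:

```python
def resolve_policy_model(policy: dict) -> str:
    """Return the model reference, enforcing that exactly one of
    `model_name` or `source` is set (the rule stated in the table)."""
    has_name = policy.get("model_name") is not None
    has_source = policy.get("source") is not None
    if has_name == has_source:  # both set, or neither set
        raise ValueError("[policy] needs exactly one of model_name or source")
    return policy["model_name"] if has_name else policy["source"]


if __name__ == "__main__":
    print(resolve_policy_model({"model_name": "Qwen/Qwen3-4B"}))
    print(resolve_policy_model({"source": "ft:abc123"}))
```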
[data]
| Parameter | Type | Description |
|---|---|---|
| validation_path | string | Path to validation dataset (JSONL) |
[data.topology]
| Parameter | Type | Description |
|---|---|---|
| container_count | int | Number of data containers |
| gpus_per_node | int | GPUs per node |
| total_gpus | int | Total GPUs |
| nodes | int | Number of nodes |
[training]
| Parameter | Type | Description |
|---|---|---|
| mode | string | Training mode: "full_finetune", "lora", "sft_offline" |
| use_qlora | bool | Enable QLoRA (4-bit quantization) |
[training.validation]
| Parameter | Type | Description |
|---|---|---|
| enabled | bool | Enable validation during training |
| evaluation_strategy | string | "steps" or "epoch" |
| eval_steps | int | Evaluate every N steps |
| save_best_model_at_end | bool | Save best checkpoint |
| metric_for_best_model | string | Metric to optimize (e.g., "val.loss") |
| greater_is_better | bool | Whether higher metric is better |
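
`metric_for_best_model` and `greater_is_better` together define what "best" means. The sketch below only illustrates those semantics; how the service actually selects a checkpoint is not documented here, and both `pick_best_checkpoint` and the evaluation records are made up for the example.

```python
def pick_best_checkpoint(evals: list[dict], metric: str, greater_is_better: bool) -> dict:
    """Pick the evaluation record with the best value of `metric`.

    For a loss-like metric such as "val.loss", greater_is_better is False,
    so the record with the smallest value wins.
    """
    if not evals:
        raise ValueError("no evaluation records to choose from")
    chooser = max if greater_is_better else min
    return chooser(evals, key=lambda record: record[metric])


if __name__ == "__main__":
    # Illustrative records, one per evaluation (every `eval_steps` steps).
    history = [
        {"step": 100, "val.loss": 1.42},
        {"step": 200, "val.loss": 1.17},
        {"step": 300, "val.loss": 1.25},
    ]
    print(pick_best_checkpoint(history, "val.loss", greater_is_better=False))
```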
[training.lora]
| Parameter | Type | Description |
|---|---|---|
| r | int | LoRA rank |
| alpha | int | LoRA alpha scaling factor |
| dropout | float | LoRA dropout rate |
| target_modules | list | Modules to apply LoRA (e.g., ["q_proj", "v_proj"]) |
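
To get a feel for how `r` and `alpha` interact, the sketch below uses the standard LoRA formulation: each adapted weight matrix of shape `d_out × d_in` gains two low-rank factors with `r * (d_in + d_out)` trainable parameters, and the update is scaled by `alpha / r`. That this service uses exactly this scaling is an assumption; the function names and the 4096 dimensions are illustrative only.

```python
def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one adapted matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scale(alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update in the standard LoRA formulation."""
    return alpha / r


if __name__ == "__main__":
    # Illustrative projection size, not tied to any specific model.
    d_in = d_out = 4096
    per_matrix = lora_extra_params(d_in, d_out, r=16)
    print(per_matrix)                # parameters added per adapted matrix
    print(per_matrix / (d_in * d_out))  # fraction of the full matrix size
    print(lora_scale(alpha=32, r=16))
```

Doubling `r` doubles the adapter size, while keeping `alpha / r` fixed keeps the update scale unchanged.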
[hyperparameters]
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_epochs | int | 1 | Number of training epochs |
| batch_size | int |  | Total batch size (deprecated) |
| global_batch | int |  | Global batch size across all GPUs |
| per_device_batch | int |  | Batch size per GPU |
| gradient_accumulation_steps | int |  | Gradient accumulation steps |
| sequence_length | int |  | Max sequence length |
| learning_rate | float |  | Learning rate |
| warmup_ratio | float |  | Warmup ratio (fraction of total steps) |
| weight_decay | float |  | Weight decay for regularization |
| train_kind | string |  | "fft" (full) or "peft" (LoRA/QLoRA) |
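
The batch-size and warmup parameters are related by simple arithmetic. The sketch below assumes the usual data-parallel convention (global batch = per-device batch × accumulation steps × number of GPUs) and rounds warmup steps down; neither the convention nor the rounding is confirmed by this reference, and the helper names are hypothetical.

```python
import math


def effective_global_batch(per_device_batch: int, gradient_accumulation_steps: int, gpus: int) -> int:
    """Examples consumed per optimizer step under plain data parallelism (assumed convention)."""
    return per_device_batch * gradient_accumulation_steps * gpus


def total_optimizer_steps(num_examples: int, global_batch: int, n_epochs: int = 1) -> int:
    """Optimizer steps for the whole run; the last partial batch of each epoch counts as a step."""
    return math.ceil(num_examples / global_batch) * n_epochs


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Warmup length as a fraction of total steps, rounded down (rounding is an assumption)."""
    return int(total_steps * warmup_ratio)


if __name__ == "__main__":
    global_batch = effective_global_batch(per_device_batch=4, gradient_accumulation_steps=8, gpus=4)
    steps = total_optimizer_steps(num_examples=10_000, global_batch=global_batch, n_epochs=1)
    print(global_batch, steps, warmup_steps(steps, warmup_ratio=0.1))
```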
[hyperparameters.parallelism]
| Parameter | Type | Description |
|---|---|---|
| use_deepspeed | bool | Use DeepSpeed for training |
| deepspeed_stage | int | DeepSpeed ZeRO stage (1, 2, or 3) |
| fsdp | bool | Use FSDP (Fully Sharded Data Parallel) |
| bf16 | bool | Use bfloat16 precision |
| fp16 | bool | Use float16 precision |
| activation_checkpointing | bool | Enable gradient checkpointing |
| tensor_parallel_size | int | Tensor parallelism degree |
| pipeline_parallel_size | int | Pipeline parallelism degree |
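
Some combinations of these flags are unlikely to make sense together. The checks below are a sketch: only the ZeRO stage range (1, 2, or 3) comes from the table, while the other two rules (one precision flag at a time, and model-parallel degree dividing the GPU count) are common conventions assumed here rather than documented behavior. `check_parallelism` is a hypothetical helper.

```python
def check_parallelism(parallelism: dict, total_gpus: int) -> list[str]:
    """Return a list of likely misconfigurations in a parsed [hyperparameters.parallelism] dict."""
    problems = []
    # Assumed rule: bf16 and fp16 are alternative precisions.
    if parallelism.get("bf16") and parallelism.get("fp16"):
        problems.append("bf16 and fp16 are both enabled; pick one precision")
    # From the table: the ZeRO stage is 1, 2, or 3.
    if parallelism.get("use_deepspeed") and parallelism.get("deepspeed_stage") not in (1, 2, 3):
        problems.append("deepspeed_stage must be 1, 2, or 3 when use_deepspeed is true")
    # Assumed rule: tensor x pipeline degree must divide the GPU count evenly.
    model_parallel = (
        parallelism.get("tensor_parallel_size", 1)
        * parallelism.get("pipeline_parallel_size", 1)
    )
    if total_gpus % model_parallel != 0:
        problems.append(
            f"tensor x pipeline parallel degree ({model_parallel}) does not divide {total_gpus} GPUs"
        )
    return problems


if __name__ == "__main__":
    print(check_parallelism({"use_deepspeed": True, "deepspeed_stage": 2, "bf16": True}, total_gpus=8))
    print(check_parallelism({"bf16": True, "fp16": True, "tensor_parallel_size": 3}, total_gpus=8))
```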
[model_config]
| Parameter | Type | Description |
|---|---|---|
| supports_vision | bool | Enable vision model support |
| max_images_per_message | int | Max images per input message |
[tags]
Arbitrary key-value pairs for metadata and tracking.
Returns
Model ID Format
- ft:Qwen/Qwen3-0.6B:job_658ba4f3a93845aa
- peft:Qwen/Qwen3-4B:job_abc123def456 (LoRA)
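
Both example IDs have the shape `prefix:base_model:job_id`, with an `ft` or `peft` prefix. The parser below is a sketch inferred from those two examples only; the full ID grammar is not specified here, and `parse_model_id` and `FineTunedModelId` are hypothetical names.

```python
from typing import NamedTuple


class FineTunedModelId(NamedTuple):
    kind: str        # "ft" or "peft", as seen in the examples
    base_model: str  # e.g. "Qwen/Qwen3-4B"
    job_id: str      # e.g. "job_abc123def456"


def parse_model_id(model_id: str) -> FineTunedModelId:
    """Split a model ID of the form prefix:base_model:job_id (format inferred from examples)."""
    parts = model_id.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected prefix:base_model:job_id, got {model_id!r}")
    kind, base_model, job_id = parts
    if kind not in {"ft", "peft"}:
        raise ValueError(f"unknown prefix {kind!r}")
    if not job_id.startswith("job_"):
        raise ValueError(f"unexpected job id {job_id!r}")
    return FineTunedModelId(kind, base_model, job_id)


if __name__ == "__main__":
    print(parse_model_id("ft:Qwen/Qwen3-0.6B:job_658ba4f3a93845aa"))
    print(parse_model_id("peft:Qwen/Qwen3-4B:job_abc123def456"))
```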
Using Your Model
Dev Inference (testing):
List Your Models
Related
- Production API — Call your fine-tuned models
- Artifacts CLI — Export models to HuggingFace
- SFT Jobs — SDK reference