- Works via the standard SFT CLI: `uvx synth-ai train --type sft --config <path>`
- You do not need QLoRA to use LoRA. Use either:
  - LoRA with standard-precision memory optimizations: set `training.mode = "lora"` and omit `use_qlora`
  - QLoRA (4-bit quantization adapters): set `training.mode = "lora"` and `training.use_qlora = true`
- Uses the same hyperparameter keys as FFT; the backend interprets LoRA-appropriate settings
Quickstart
Minimal TOML (LoRA enabled)
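A minimal sketch of such a config, assembled from the keys documented in the sections below (the model identifier, dataset path, and hyperparameter values are illustrative placeholders, not recommendations):

```toml
# Minimal LoRA SFT config (illustrative values only)
[job]
model = "Qwen/Qwen2.5-7B-Instruct"   # placeholder base model identifier
data = "data/train.jsonl"            # resolved relative to the CWD

[compute]
gpu_type = "H100"                    # required by the backend
gpu_count = 1

[training]
mode = "lora"                        # plain LoRA; add use_qlora = true for 4-bit QLoRA

[hyperparameters]
n_epochs = 1
learning_rate = 2e-4
lora_rank = 16
lora_alpha = 32
lora_dropout = 0.05
```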
What the client validates and sends
- Validates dataset path existence (from `[job].data` or `--dataset`) and JSONL shape
- Uploads training (and optional validation) files to `/api/learning/files`
- Builds payload with:
  - `model` from `[job].model`
  - `training_type = "sft_offline"`
  - `hyperparameters` from `[hyperparameters]` (plus `[training.validation]` knobs)
  - `metadata.effective_config.compute` from `[compute]`
  - `metadata.effective_config.data.topology` from `[data.topology]`
  - `metadata.effective_config.training.{mode,use_qlora}` from `[training]`
Multi‑GPU guidance
- Set `[compute].gpu_type`, `gpu_count`, and optionally `nodes`
- Use `[hyperparameters.parallelism]` for DeepSpeed/FSDP/precision/TP/PP knobs; forwarded verbatim (see the sketch below)
- Optionally add `[data.topology]` (e.g., `container_count`) for visibility; the backend validates resource consistency
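A hedged sketch of what a multi-GPU block could look like (values are illustrative; the `[hyperparameters.parallelism]` keys shown are among those listed in the reference section below and are passed through verbatim):

```toml
[compute]
gpu_type = "H100"
gpu_count = 4
nodes = 1

[hyperparameters.parallelism]
use_deepspeed = true
deepspeed_stage = 2
bf16 = true
tensor_parallel_size = 1
pipeline_parallel_size = 1

# Optional, for visibility; the backend checks resource consistency
[data.topology]
container_count = 1
```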
LoRA ranks and targets
- Default rank is typically 16 in production. To sweep ranks, set `lora_rank` under `[hyperparameters]` (see the sketch below)
- Some examples (e.g., Qwen Coder) include a `[lora]` block for target selection (also shown below)
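A sketch combining both, assuming the `lora_*` hyperparameter keys listed in the reference section below; the `target_modules` key inside `[lora]` follows common PEFT naming and is an assumption here, so check the `[lora]` block of the example you are copying:

```toml
[hyperparameters]
lora_rank = 16        # default is typically 16; sweep by varying this per run
lora_alpha = 32
lora_dropout = 0.05

# Target selection, as seen in some examples (e.g., Qwen Coder).
# Key name below is assumed from common PEFT usage, not confirmed for every example.
[lora]
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"]
```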
Common issues
- HTTP 400 `missing_gpu_type`: set `[compute].gpu_type` (and typically `gpu_count`) so it appears under `metadata.effective_config.compute`
- Dataset not found: provide an absolute path or use `--dataset`; the client resolves relative paths from the current working directory
Helpful CLI flags
- `--dataset` to override `[job].data`
- `--examples N` to subset the first N rows for quick smoke tests
- `--dry-run` to preview the payload without submitting
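For a quick smoke test these compose with the training command shown above, e.g. `uvx synth-ai train --type sft --config path/to/lora.toml --examples 50 --dry-run` (the config path is a placeholder, and combining the flags in one invocation is assumed to work as described).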
All sections and parameters (LoRA SFT)
The client recognizes and/or forwards the following sections:

- `[job]` (client reads)
  - `model` (string, required): base model identifier
  - `data` or `data_path` (string): local path to training JSONL. Required unless overridden by `--dataset`
  - Notes:
    - Paths are resolved relative to the current working directory (CWD), not the TOML location
    - Legacy scripts may also mention `poll_seconds` (legacy-only; the new CLI uses `--poll-*` flags)
- `[compute]` (forwarded into `metadata.effective_config.compute`)
  - `gpu_type` (string): required by the backend (e.g., "H100", "A10G"). Missing this often causes HTTP 400
  - `gpu_count` (int): number of GPUs
  - `nodes` (int, optional)
- `[data]` (partially read)
  - `topology` (table/dict): forwarded as-is to `metadata.effective_config.data.topology`
  - `validation_path` (string, optional): if present and the file exists, the client uploads it and wires up validation
  - Path resolution: relative to the current working directory (CWD)
- `[training]` (partially read)
  - `mode` (string, optional): copied into `metadata.effective_config.training.mode` (documentation hint)
  - `use_qlora` (bool): set to `true` to enable QLoRA (LoRA works without this)
  - `[training.validation]` (optional; some keys are promoted into hyperparameters; see the validation sketch after this list)
    - `enabled` (bool, default true): surfaced in `metadata.effective_config.training.validation.enabled`
    - `evaluation_strategy` (string, default "steps"): forwarded into hyperparameters
    - `eval_steps` (int, default 0): forwarded
    - `save_best_model_at_end` (bool, default true): forwarded
    - `metric_for_best_model` (string, default "val.loss"): forwarded
    - `greater_is_better` (bool, default false): forwarded
- `[hyperparameters]` (client reads selective keys)
  - Required/defaulted: `n_epochs` (int, default 1)
  - Optional (forwarded if present): `batch_size`, `global_batch`, `per_device_batch`, `gradient_accumulation_steps`, `sequence_length`, `learning_rate`, `warmup_ratio`, `train_kind`
  - LoRA-specific (forwarded if present): `lora_rank`, `lora_alpha`, `lora_dropout`
  - Note: some legacy examples include `world_size`. The client does not forward `world_size`; prefer specifying `per_device_batch` and `gradient_accumulation_steps` explicitly.
  - `[hyperparameters.parallelism]` (dict): forwarded verbatim (e.g., `use_deepspeed`, `deepspeed_stage`, `fsdp`, `bf16`, `fp16`, `tensor_parallel_size`, `pipeline_parallel_size`)
- `[algorithm]` (ignored by client): present in some examples for documentation; no effect on payload
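To make the validation-related keys above concrete, a sketch of how validation might be wired in (the path and step count are placeholders; key names and defaults are the ones listed above):

```toml
[data]
validation_path = "data/val.jsonl"   # uploaded only if the file exists; resolved relative to the CWD

[training.validation]
enabled = true
evaluation_strategy = "steps"
eval_steps = 100                     # default is 0 (illustrative value here)
save_best_model_at_end = true
metric_for_best_model = "val.loss"
greater_is_better = false
```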
Failure modes

- Missing dataset path -> prompt or error; the dataset must exist and be valid JSONL
- Missing `gpu_type` (backend rule) -> HTTP 400 at job creation
- Validation path missing -> warning; continues without validation