## When to Use SFT

Best for:

- Domain adaptation (legal, medical, code)
- Custom response style or tone
- Teaching specific formats or structures
- When you have high-quality example data

Consider alternatives (such as prompting or retrieval) when:

- You don’t have labeled training data
- You want to iterate quickly without retraining
- Your task is classification or QA with clear metrics
## Prerequisites
## Step 1: Prepare Your Training Data
Create a JSONL file with conversation examples. Each line is a JSON object with a `messages` array; the examples below show the format.
### Data Format Requirements

Each example must have:

- At least one `user` message
- At least one `assistant` message (this is what the model learns to generate)
### Supported Message Roles

| Role | Description |
|---|---|
| `system` | Optional system prompt (first message only) |
| `user` | User input |
| `assistant` | Model response (training target) |
| `tool` | Tool/function response |
### Example: Basic Conversation
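A minimal sketch of one training example, pretty-printed for readability (in the JSONL file, each example occupies a single line); the domain content is illustrative:

```json
{"messages": [
  {"role": "system", "content": "You are a concise legal assistant."},
  {"role": "user", "content": "What is a force majeure clause?"},
  {"role": "assistant", "content": "A force majeure clause excuses a party from its contractual obligations when extraordinary events beyond its control, such as natural disasters or war, prevent performance."}
]}
```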
### Example: With Tool Calls
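A sketch assuming an OpenAI-style tool-call schema; field names such as `tool_calls` and `tool_call_id` are assumptions and may differ on your platform:

```json
{"messages": [
  {"role": "user", "content": "What's the weather in Paris right now?"},
  {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"}}]},
  {"role": "tool", "tool_call_id": "call_1", "content": "{\"temp_c\": 18, \"condition\": \"cloudy\"}"},
  {"role": "assistant", "content": "It is currently 18 °C and cloudy in Paris."}
]}
```

Note that the final `assistant` message, which synthesizes the tool result into a user-facing answer, is the training target.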
### Example: Vision/Multimodal
For vision models (e.g., Qwen3-VL), include images in user messages:
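A sketch assuming an OpenAI-style content-parts array with `image_url` entries; the exact field names are assumptions:

```json
{"messages": [
  {"role": "user", "content": [
    {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
    {"type": "text", "text": "Summarize the trend in this chart."}
  ]},
  {"role": "assistant", "content": "Revenue rises steadily through Q3 and then flattens in Q4."}
]}
```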
### Data Quality Tips

- **Diverse examples:** Cover the range of inputs your model will see
- **Consistent format:** Use the same response style across examples
- **Quality over quantity:** 100 excellent examples beat 10,000 mediocre ones
- **Validation set:** Hold out 10-20% for evaluation
## Step 2: Create the Configuration

Create a TOML file for your training job:
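A sketch of such a file; the section layout and file name are assumptions, but the key names and defaults mirror the reference tables below:

```toml
# sft-config.toml (hypothetical layout; key names mirror the tables below)

[model]
name = "Qwen/Qwen2.5-7B-Instruct"

[data]
train_file = "train.jsonl"
eval_file = "eval.jsonl"

[training]
num_train_epochs = 3
learning_rate = 2e-4
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
max_seq_length = 2048
warmup_ratio = 0.1

[lora]
enabled = true
rank = 16
alpha = 32
dropout = 0.1

[evaluation]
eval_steps = 500
early_stopping_patience = 3
save_best_model = true
```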
### Configuration Reference

#### Model Selection

| Model | Use Case | Notes |
|---|---|---|
| `Qwen/Qwen2.5-7B-Instruct` | General purpose | Good balance of speed/quality |
| `Qwen/Qwen2.5-14B-Instruct` | Higher quality | Slower, more GPU memory |
| `Qwen/Qwen3-VL-7B` | Vision tasks | Supports image inputs |
| `meta-llama/Llama-3.1-8B-Instruct` | General purpose | Strong reasoning |
#### Hyperparameters

| Parameter | Default | Description |
|---|---|---|
| `num_train_epochs` | 3 | Training passes over your data |
| `learning_rate` | 2e-4 | How fast the model updates (lower = more stable) |
| `per_device_train_batch_size` | 4 | Examples per GPU per step |
| `gradient_accumulation_steps` | 4 | Accumulate gradients before update |
| `max_seq_length` | 2048 | Maximum tokens per example |
| `warmup_ratio` | 0.1 | Fraction of steps for learning rate warmup |

With the defaults, the effective batch size is `per_device_train_batch_size` × `gradient_accumulation_steps` = 4 × 4 = 16 examples per weight update.
#### LoRA Settings

LoRA (Low-Rank Adaptation) fine-tunes efficiently by updating a small number of parameters:

| Parameter | Default | Description |
|---|---|---|
| `enabled` | true | Use LoRA (recommended) |
| `rank` | 16 | Rank of adaptation matrices (higher = more capacity) |
| `alpha` | 32 | Scaling factor (typically 2× rank) |
| `dropout` | 0.1 | Regularization |
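If the model underfits, a common adjustment is to raise the rank and scale `alpha` with it, keeping the 2× relationship noted above; the section layout follows the hypothetical TOML sketched in Step 2:

```toml
[lora]
enabled = true
rank = 32     # more capacity than the default of 16
alpha = 64    # keep alpha at roughly 2x rank
dropout = 0.1
```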
#### Evaluation Settings

| Parameter | Default | Description |
|---|---|---|
| `eval_steps` | 500 | Evaluate every N steps |
| `early_stopping_patience` | 3 | Stop if no improvement for N evals |
| `save_best_model` | true | Keep the best checkpoint |
## Step 3: Launch the Training Job

### Using the CLI

Submit your configuration with the CLI; the `--poll` flag shows progress until completion:
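A hypothetical invocation; the command name and subcommand are assumptions (only the `--poll` flag is referenced in this guide):

```bash
# Hypothetical CLI; only the --poll flag is documented in this guide.
sft jobs create --config sft-config.toml --poll
```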
### Using Python
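A sketch assuming a hypothetical Python SDK; the module, client, and method names are assumptions rather than a documented API:

```python
# Hypothetical SDK: module, client, and method names are assumptions.
from my_platform import Client  # placeholder import

client = Client()

# Submit the TOML config created in Step 2.
job = client.fine_tuning.create(config="sft-config.toml")
print(job.id)  # save this ID to check on the job later

# Block until training finishes, then report the outcome.
job = client.fine_tuning.wait(job.id)
print(job.status, job.fine_tuned_model)
```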
### Resume a Job

If you need to check on a job later:
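Continuing the hypothetical SDK above, retrieve the job by the ID returned at submission:

```python
# Hypothetical SDK: method names are assumptions.
from my_platform import Client  # placeholder import

client = Client()
job = client.fine_tuning.retrieve("ftjob-abc123")  # ID from submission
print(job.status)  # e.g. "running", "succeeded", "failed"
```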
## Step 4: Use Your Fine-Tuned Model

After training completes, you’ll receive a model ID like `ft:qwen2.5-7b:my-org:abc123`.
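A sketch of calling the fine-tuned model by that ID, again using the hypothetical SDK from Step 3; the chat API shape is an assumption:

```python
# Hypothetical SDK: the chat API shape is an assumption; the ft:... model ID
# format comes from this guide.
from my_platform import Client  # placeholder import

client = Client()
response = client.chat.create(
    model="ft:qwen2.5-7b:my-org:abc123",
    messages=[{"role": "user", "content": "What is a force majeure clause?"}],
)
print(response.content)
```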