GSPO

Group Sequence Policy Optimization

GSPO (Group Sequence Policy Optimization) is a reinforcement learning algorithm for training language models.
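
To make the name concrete, here is a minimal sketch of the sequence-level clipped objective as described in the published GSPO formulation. It is not Synth AI's implementation. The function names, the use of population standard deviation for group normalization, and the `eps` value in the usage note are illustrative assumptions. The key idea shown is that the importance ratio is computed once per response (length-normalized) rather than once per token, and clipping is applied to that sequence-level ratio.

```python
import math
from statistics import mean, pstdev


def group_advantages(rewards):
    """Normalize rewards within one group of responses to the same prompt.

    Assumption: population standard deviation; a zero-variance group
    yields zero advantages instead of dividing by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


def sequence_ratio(new_logprobs, old_logprobs):
    """Length-normalized sequence-level importance ratio.

    Equals (pi_new(y|x) / pi_old(y|x)) ** (1 / len(y)), computed in log
    space from per-token log-probabilities of one response.
    """
    log_ratio = sum(n - o for n, o in zip(new_logprobs, old_logprobs))
    return math.exp(log_ratio / len(new_logprobs))


def gspo_objective(new_logprobs, old_logprobs, rewards, eps):
    """Average clipped surrogate over a group of sampled responses.

    new_logprobs / old_logprobs: one list of token log-probs per response.
    rewards: one scalar reward per response.
    eps: clipping range (hypothetical value chosen by the caller).
    """
    advantages = group_advantages(rewards)
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        s = sequence_ratio(new_lp, old_lp)
        s_clipped = min(max(s, 1.0 - eps), 1.0 + eps)
        # Pessimistic choice between the unclipped and clipped terms.
        total += min(s * adv, s_clipped * adv)
    return total / len(rewards)
```

For example, `gspo_objective([[-0.5, -0.5], [-1.0, -1.0]], [[-1.0, -1.0], [-1.0, -1.0]], [2.0, 0.0], eps=0.2)` scores a group of two responses. The first response's ratio exceeds `1 + eps`, so its contribution is clipped. In real training this quantity would be maximized by gradient ascent on the policy parameters, which this plain-Python sketch does not attempt.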
