SDK Overview
The Synth SDK provides a tracing and reward system for reinforcement learning (RL) and supervised fine-tuning (SFT) training. It captures fine-grained execution details, supports step-level and episode-level rewards, and lets you filter and analyze recorded sessions.
Core Concepts
Sessions and Traces
A session represents a complete execution (e.g., a conversation, RL episode, or batch job). Each session is captured as a v3 trace containing:
- Structured event history
- Message exchanges between subsystems
- Token usage and cost tracking
- Timing and performance metrics
- Custom metadata
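To make the trace contents above concrete, here is a minimal sketch of a container holding the same five kinds of information. `SessionTrace` and all of its field and key names (`input_tokens`, `cost_usd`, `time`, and so on) are assumptions made for illustration; they are not the SDK's actual v3 schema.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SessionTrace:
    """Hypothetical stand-in for a v3 trace; not the SDK's actual schema."""

    session_id: str
    events: list[dict[str, Any]] = field(default_factory=list)    # structured event history
    messages: list[dict[str, Any]] = field(default_factory=list)  # exchanges between subsystems
    metadata: dict[str, Any] = field(default_factory=dict)        # custom metadata

    def total_tokens(self) -> int:
        # Token usage: sum over events that report token counts.
        return sum(e.get("input_tokens", 0) + e.get("output_tokens", 0) for e in self.events)

    def total_cost_usd(self) -> float:
        # Cost tracking: sum over events that report a cost.
        return sum(e.get("cost_usd", 0.0) for e in self.events)

    def duration_s(self) -> float:
        # Timing: span between the earliest and latest event timestamps.
        times = [e["time"] for e in self.events if "time" in e]
        return max(times) - min(times) if times else 0.0


trace = SessionTrace(
    session_id="episode-0001",
    events=[
        {"type": "lm_call", "time": 0.0, "input_tokens": 120, "output_tokens": 30, "cost_usd": 0.002},
        {"type": "environment", "time": 1.5, "reward": 1.0},
    ],
    metadata={"experiment": "demo"},
)
print(trace.total_tokens(), trace.total_cost_usd(), trace.duration_s())
```

Deriving totals from the event list, rather than storing them separately, keeps the aggregate numbers consistent with the recorded history in this sketch.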
Events
Events are intra-system facts that capture something that happened:
- LMCAISEvent: Language model API calls with token/cost tracking
- EnvironmentEvent: Feedback from environments (rewards, observations)
- RuntimeEvent: System decisions and actions
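The three event types can be pictured as distinct record shapes that downstream code dispatches on. The classes below are stand-ins that reuse the event names from the list above; their fields and the `summarize` helper are assumptions for illustration, not the SDK's real definitions.

```python
from dataclasses import dataclass
from typing import Union


# Stand-in classes that reuse the event names above; the fields are assumptions.
@dataclass
class LMCAISEvent:
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float


@dataclass
class EnvironmentEvent:
    reward: float
    observation: str


@dataclass
class RuntimeEvent:
    action: str


Event = Union[LMCAISEvent, EnvironmentEvent, RuntimeEvent]


def summarize(events: list[Event]) -> dict[str, float]:
    """Aggregate token usage, cost, environment reward, and action count."""
    summary = {"tokens": 0, "cost_usd": 0.0, "env_reward": 0.0, "actions": 0}
    for event in events:
        if isinstance(event, LMCAISEvent):
            summary["tokens"] += event.input_tokens + event.output_tokens
            summary["cost_usd"] += event.cost_usd
        elif isinstance(event, EnvironmentEvent):
            summary["env_reward"] += event.reward
        elif isinstance(event, RuntimeEvent):
            summary["actions"] += 1
    return summary


history: list[Event] = [
    LMCAISEvent(model="example-model", input_tokens=100, output_tokens=20, cost_usd=0.5),
    RuntimeEvent(action="call_tool"),
    EnvironmentEvent(reward=1.0, observation="tool returned ok"),
]
print(summarize(history))
```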
Messages
Messages represent information transmitted between subsystems:
- User → Agent (instructions)
- Agent → Runtime (decisions)
- Runtime → Environment (tool executions)
- Environment → Runtime (results)
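The four message routes above can be sketched as a lookup from (source, target) pairs to the kind of payload they carry. `Subsystem`, `Message`, `LISTED_ROUTES`, and `route_kind` are hypothetical names, and the real SDK may permit routes beyond the four listed here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Subsystem(Enum):
    USER = "user"
    AGENT = "agent"
    RUNTIME = "runtime"
    ENVIRONMENT = "environment"


# Only the four routes listed above; the real SDK may allow others.
LISTED_ROUTES = {
    (Subsystem.USER, Subsystem.AGENT): "instructions",
    (Subsystem.AGENT, Subsystem.RUNTIME): "decisions",
    (Subsystem.RUNTIME, Subsystem.ENVIRONMENT): "tool executions",
    (Subsystem.ENVIRONMENT, Subsystem.RUNTIME): "results",
}


@dataclass(frozen=True)
class Message:
    source: Subsystem
    target: Subsystem
    content: str


def route_kind(message: Message) -> Optional[str]:
    """Return the payload kind for a listed route, or None if the route is not listed."""
    return LISTED_ROUTES.get((message.source, message.target))


step = Message(Subsystem.AGENT, Subsystem.RUNTIME, "run the search tool")
print(route_kind(step))
```

Unlike events, which describe something that happened inside one subsystem, each message here always names both a sender and a receiver.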
Rewards
The system supports two types of rewards:
- Event Rewards: Attached to specific events within a session (step-level)
- Outcome Rewards: Attached to the entire session (episode-level)
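The difference between the two reward types is mainly what each one is keyed on: an event within a session, or the session as a whole. The sketch below uses hypothetical `EventReward` and `OutcomeReward` records and shows one common way step-level rewards are consumed in RL, a discounted return-to-go per event. That computation is a standard technique chosen for illustration, not something the SDK is documented to perform.

```python
from dataclasses import dataclass


@dataclass
class EventReward:
    event_index: int  # position of the rewarded event within the session (assumed key)
    value: float


@dataclass
class OutcomeReward:
    session_id: str   # one value for the entire session
    value: float


def returns_to_go(rewards: list[EventReward], num_events: int, gamma: float = 1.0) -> list[float]:
    """Discounted sum of current and future step rewards for each event."""
    per_step = [0.0] * num_events
    for reward in rewards:
        per_step[reward.event_index] += reward.value

    returns = [0.0] * num_events
    running = 0.0
    for i in reversed(range(num_events)):
        running = per_step[i] + gamma * running
        returns[i] = running
    return returns


step_rewards = [EventReward(event_index=1, value=1.0), EventReward(event_index=3, value=2.0)]
episode_reward = OutcomeReward(session_id="episode-0001", value=1.0)

print(returns_to_go(step_rewards, num_events=4, gamma=0.5))
```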
Key Features
- v3 Trace Format: Complete session traces with events, messages, and metadata
- Event Rewards: Step-level rewards for fine-grained credit assignment
- Outcome Rewards: Episode-level rewards for filtering and evaluation
- Judge Integration: Automated rubric-based evaluation of traces
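As a rough illustration of how rubric-based evaluation and outcome-reward filtering could fit together, the sketch below combines per-criterion scores into a single number and then keeps only sessions above a threshold. The weighted-average scheme, the criterion names, and both function names are assumptions; the SDK's actual judge interface is not described in this overview.

```python
def rubric_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-criterion scores (assumed to lie in [0, 1])."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def filter_sessions(outcome_rewards: dict[str, float], threshold: float) -> list[str]:
    """Keep the IDs of sessions whose outcome reward meets the threshold."""
    return [sid for sid, reward in outcome_rewards.items() if reward >= threshold]


# Hypothetical rubric: correctness counts three times as much as style.
weights = {"correctness": 3.0, "style": 1.0}
outcome_rewards = {
    "episode-a": rubric_score({"correctness": 1.0, "style": 0.0}, weights),
    "episode-b": rubric_score({"correctness": 0.0, "style": 1.0}, weights),
}

kept = filter_sessions(outcome_rewards, threshold=0.5)
print(outcome_rewards, kept)
```

A filter like this is one way episode-level rewards could select traces for SFT data, with the threshold chosen per dataset.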