V3 Trace Format

The v3 trace format is the single source of truth for session data in the Synth SDK. It captures complete execution details including events, messages, timing, token usage, and metadata.

Overview

A SessionTrace represents one complete execution, such as a conversation, an RL episode, or a batch job. It contains:
  • Ordered timesteps with events and messages
  • Complete event/message history (flat chronological lists)
  • Session-level metadata
  • Timing and performance data
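
To make the relationship between timesteps and the flat histories concrete, here is a minimal sketch. It uses stand-in dataclasses rather than the SDK types: `MiniEvent`, `MiniStep`, and `flatten_events` are hypothetical names, and the sketch assumes the flat history is the per-step events concatenated in chronological order.

```python
from dataclasses import dataclass, field


@dataclass
class MiniEvent:
    """Stand-in for an SDK event (hypothetical, not the SDK class)."""
    name: str
    event_time: float


@dataclass
class MiniStep:
    """Stand-in for SessionTimeStep: an ordered group of events."""
    step_index: int
    events: list[MiniEvent] = field(default_factory=list)


def flatten_events(steps: list[MiniStep]) -> list[MiniEvent]:
    """Concatenate per-step events into one chronological list."""
    ordered_steps = sorted(steps, key=lambda s: s.step_index)
    flat = [event for step in ordered_steps for event in step.events]
    # sorted() is stable, so events with equal timestamps keep step order.
    return sorted(flat, key=lambda e: e.event_time)


steps = [
    MiniStep(1, [MiniEvent("lm_call", 3.0)]),
    MiniStep(0, [MiniEvent("lm_call", 1.0), MiniEvent("env_step", 2.0)]),
]
history = flatten_events(steps)
```

The same event objects appear both inside their step and in the flat list, which is why a trace can be read either turn by turn or as a single timeline.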

Why V3?

The v3 format supersedes legacy trajectory formats by providing:
  • Richer data: Token IDs, logprobs, timing, multimodal content
  • Better structure: Clear separation between events and messages
  • Type safety: Frozen dataclasses with validation
  • Future-proof: Extensible metadata and event types

Core Data Structures

SessionTrace

Top-level container for a complete session.
@dataclass
class SessionTrace:
    session_id: str = ""
    created_at: datetime = field(default_factory=lambda: datetime.now(UTC))
    session_time_steps: list[SessionTimeStep] = field(default_factory=list)
    event_history: list[BaseEvent] = field(default_factory=list)
    markov_blanket_message_history: list[SessionEventMarkovBlanketMessage] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    session_metadata: list[dict[str, Any]] | None = None
Key Fields:
  • session_id: Unique identifier (e.g., UUID, run ID)
  • session_time_steps: Ordered list of logical steps/turns
  • event_history: Complete flat list of all events (chronological)
  • markov_blanket_message_history: Complete flat list of all messages
  • metadata: Session-level context (user_id, experiment_id, env_name, etc.)
Session Metadata Keys (recommended):
metadata = {
    "user_id": "user-123",
    "experiment_id": "exp-abc",
    "environment_name": "crafter-v1",
    "model_config": {"model": "gpt-4", "temperature": 0.7},
    "policy_name": "my-policy-v1",
    "policy_iteration": 42,
}
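
Since these keys are only recommended, it can help to check for them before a run starts. The helper below is a sketch, not an SDK function: `RECOMMENDED_KEYS` and `missing_metadata_keys` are hypothetical names that mirror the list above.

```python
# Hypothetical helper: the key list mirrors the recommended keys above.
RECOMMENDED_KEYS = (
    "user_id",
    "experiment_id",
    "environment_name",
    "model_config",
    "policy_name",
    "policy_iteration",
)


def missing_metadata_keys(metadata: dict) -> list[str]:
    """Return the recommended keys that are absent from metadata."""
    return [key for key in RECOMMENDED_KEYS if key not in metadata]


metadata = {"user_id": "user-123", "environment_name": "crafter-v1"}
missing = missing_metadata_keys(metadata)
```

Reporting missing keys as a list, rather than raising, keeps the check advisory, which matches the "recommended" status of these keys.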

SessionTimeStep

A logical step within the session (e.g., one conversation turn, one RL step).
@dataclass
class SessionTimeStep:
    step_id: str = ""
    step_index: int = 0
    timestamp: datetime = field(default_factory=lambda: datetime.now(UTC))
    turn_number: int | None = None
    events: list[BaseEvent] = field(default_factory=list)
    markov_blanket_messages: list[SessionEventMarkovBlanketMessage] = field(default_factory=list)
    step_metadata: dict[str, Any] = field(default_factory=dict)
    completed_at: datetime | None = None
Key Fields:
  • step_id: Unique identifier for this step
  • step_index: Sequential position (0-based)
  • turn_number: 1-based turn count for conversational contexts
  • events: Events that occurred during this step
  • markov_blanket_messages: Messages exchanged during this step
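
The difference between the 0-based `step_index` and the 1-based `turn_number` is easy to get wrong, so the sketch below spells it out. `MiniStep` and `build_steps` are hypothetical stand-ins that carry only the indexing fields; the `turn_N` naming for `step_id` is an illustrative choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniStep:
    """Stand-in for SessionTimeStep with only the indexing fields."""
    step_id: str
    step_index: int                    # 0-based position in the session
    turn_number: Optional[int] = None  # 1-based turn, if conversational


def build_steps(n_turns: int) -> list[MiniStep]:
    """Create one step per turn with consistent indices."""
    return [
        MiniStep(step_id=f"turn_{i + 1}", step_index=i, turn_number=i + 1)
        for i in range(n_turns)
    ]


steps = build_steps(3)
first, last = steps[0], steps[-1]
```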

Event Types

BaseEvent

Common fields for all event types.
@dataclass
class BaseEvent:
    system_instance_id: str
    time_record: TimeRecord
    metadata: dict[str, Any] = field(default_factory=dict)
    event_metadata: list[Any] | None = None

LMCAISEvent

Language model API calls with token/cost tracking.
@dataclass
class LMCAISEvent(BaseEvent):
    model_name: str = ""
    provider: str | None = None
    input_tokens: int | None = None
    output_tokens: int | None = None
    total_tokens: int | None = None
    cost_usd: float | None = None
    latency_ms: int | None = None
    span_id: str | None = None
    trace_id: str | None = None
    system_state_before: dict[str, Any] | None = None
    system_state_after: dict[str, Any] | None = None
    call_records: list[LLMCallRecord] = field(default_factory=list)
Example:
import time

lm_event = LMCAISEvent(
    system_instance_id="agent",
    time_record=TimeRecord(event_time=time.time()),
    model_name="gpt-4",
    provider="openai",
    input_tokens=150,
    output_tokens=80,
    total_tokens=230,
    cost_usd=0.0069,
    latency_ms=1250,
    call_records=[
        LLMCallRecord(
            request_messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "What is 2+2?"},
            ],
            response_message={"role": "assistant", "content": "4"},
            tool_calls=[...],
        )
    ],
)
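
Because the token and cost fields are optional, any aggregation over LM events has to tolerate `None`. The following sketch shows one way to do that with a stand-in class. `MiniLMEvent` and `summarize_usage` are hypothetical names, and treating a missing value as zero is an assumption you may not want for billing-grade numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniLMEvent:
    """Stand-in carrying only the usage fields of an LM event."""
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    cost_usd: Optional[float] = None


def summarize_usage(events: list[MiniLMEvent]) -> dict:
    """Sum usage across events, counting missing values as zero."""
    return {
        "input_tokens": sum(e.input_tokens or 0 for e in events),
        "output_tokens": sum(e.output_tokens or 0 for e in events),
        "cost_usd": sum(e.cost_usd or 0.0 for e in events),
    }


events = [
    MiniLMEvent(input_tokens=150, output_tokens=80, cost_usd=0.0069),
    MiniLMEvent(input_tokens=200, output_tokens=50),  # cost not reported
]
usage = summarize_usage(events)
```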

EnvironmentEvent

Feedback from environments (rewards, observations, termination).
@dataclass
class EnvironmentEvent(BaseEvent):
    reward: float = 0.0
    terminated: bool = False
    truncated: bool = False
    system_state_before: dict[str, Any] | None = None
    system_state_after: dict[str, Any] | None = None
Example:
import time

env_event = EnvironmentEvent(
    system_instance_id="crafter",
    time_record=TimeRecord(event_time=time.time()),
    reward=0.5,
    terminated=False,
    truncated=False,
    system_state_after={
        "inventory": {"wood": 5, "stone": 2},
        "health": 9,
        "position": [12, 34],
    },
)
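
A common use of environment events is computing an episode's return. The sketch below does this with a stand-in class; `MiniEnvEvent` and `episode_summary` are hypothetical names. It assumes `terminated` and `truncated` follow the usual RL convention, where either flag marks the end of the episode.

```python
from dataclasses import dataclass


@dataclass
class MiniEnvEvent:
    """Stand-in carrying only the feedback fields of an environment event."""
    reward: float = 0.0
    terminated: bool = False
    truncated: bool = False


def episode_summary(events: list[MiniEnvEvent]) -> dict:
    """Accumulate reward until the first terminated or truncated event."""
    total = 0.0
    for count, event in enumerate(events, start=1):
        total += event.reward
        if event.terminated or event.truncated:
            return {"return": total, "steps": count, "done": True}
    return {"return": total, "steps": len(events), "done": False}


events = [
    MiniEnvEvent(reward=0.5),
    MiniEnvEvent(reward=1.0),
    MiniEnvEvent(reward=0.0, terminated=True),
]
summary = episode_summary(events)
```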

RuntimeEvent

System decisions and actions.
@dataclass
class RuntimeEvent(BaseEvent):
    actions: list[int] = field(default_factory=list)
Example:
import time

runtime_event = RuntimeEvent(
    system_instance_id="agent_runtime",
    time_record=TimeRecord(event_time=time.time()),
    actions=[3, 7, 1],  # Tool/action indices
    metadata={
        "tool_name": "collect_wood",
        "tool_args": {"quantity": 5},
    },
)

Messages

Messages represent information crossing subsystem boundaries.
@dataclass
class SessionEventMarkovBlanketMessage:
    content: SessionMessageContent
    message_type: str
    time_record: TimeRecord
    metadata: dict[str, Any] = field(default_factory=dict)
Message Types:
  • observation: Environment → Agent
  • action: Agent → Environment
  • result: Environment → Agent
  • user_input: User → Agent
  • agent_response: Agent → User
Example:
import json
import time

message = SessionEventMarkovBlanketMessage(
    content=SessionMessageContent(
        text="Collect wood from tree",
        json_payload=json.dumps({"action": "collect", "target": "tree"}),
    ),
    message_type="action",
    time_record=TimeRecord(event_time=time.time()),
    metadata={
        "from_system_role": "agent",
        "to_system_role": "environment",
        "call_id": "call-123",
    },
)
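
The message types listed above each imply a sender and a receiver, so the routing metadata can be derived from `message_type` instead of being typed by hand. This is a sketch only: `MESSAGE_ROUTES` and `route_metadata` are hypothetical names, and the lowercase role strings are an assumption modeled on the example above.

```python
# Hypothetical mapping from message_type to (sender, receiver),
# taken from the Message Types list.
MESSAGE_ROUTES = {
    "observation": ("environment", "agent"),
    "action": ("agent", "environment"),
    "result": ("environment", "agent"),
    "user_input": ("user", "agent"),
    "agent_response": ("agent", "user"),
}


def route_metadata(message_type: str) -> dict[str, str]:
    """Build from/to metadata for a known message type."""
    try:
        sender, receiver = MESSAGE_ROUTES[message_type]
    except KeyError:
        raise ValueError(f"unknown message_type: {message_type!r}") from None
    return {"from_system_role": sender, "to_system_role": receiver}


action_meta = route_metadata("action")
```

Rejecting unknown types early keeps misspelled `message_type` values out of stored traces.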

Time Records

Timing information for events and messages.
@dataclass
class TimeRecord:
    event_time: float  # Unix timestamp in seconds (microsecond precision)
    message_time: int | None = None  # Optional sequence number
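
When two records share the same `event_time`, the optional sequence number can break the tie. The sketch below shows one possible ordering with a stand-in class. `MiniTimeRecord` and `sort_key` are hypothetical names, and placing records without a sequence number first is an arbitrary choice, not documented SDK behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniTimeRecord:
    """Stand-in for TimeRecord with the same two fields."""
    event_time: float
    message_time: Optional[int] = None


def sort_key(record: MiniTimeRecord) -> tuple:
    """Order by timestamp, then by sequence number (missing sorts first)."""
    sequence = record.message_time if record.message_time is not None else -1
    return (record.event_time, sequence)


records = [
    MiniTimeRecord(10.0, 2),
    MiniTimeRecord(9.5),
    MiniTimeRecord(10.0, 1),
]
ordered = sorted(records, key=sort_key)
```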

Complete Example

import time
from datetime import UTC, datetime

from synth_ai.tracing_v3 import (
    EnvironmentEvent,
    LMCAISEvent,
    SessionTrace,
    SessionTracer,
    TimeRecord,  # used below; assumed to be exported alongside the event classes
)

# Create tracer
tracer = SessionTracer(db_path="traces.db", session_id="episode_001")

# Start session
session = SessionTrace(
    session_id="episode_001",
    created_at=datetime.now(UTC),
    metadata={
        "environment_name": "crafter-v1",
        "policy_name": "baseline-v1",
        "user_id": "user-123",
    },
)

# Record LM event for turn 1
lm_event = LMCAISEvent(
    system_instance_id="agent",
    time_record=TimeRecord(event_time=time.time()),
    model_name="gpt-4",
    provider="openai",
    input_tokens=200,
    output_tokens=50,
    call_records=[...],
)
lm_event_id = tracer.record_event(lm_event)

# Record environment response
env_event = EnvironmentEvent(
    system_instance_id="crafter",
    time_record=TimeRecord(event_time=time.time()),
    reward=1.0,
    system_state_after={"inventory": {"wood": 1}},
)
tracer.record_event(env_event)

# Get complete trace
complete_trace = tracer.get_session_trace()
print(f"Session: {complete_trace.session_id}")
print(f"Total events: {len(complete_trace.event_history)}")
print(f"Total messages: {len(complete_trace.markov_blanket_message_history)}")

Schema Validation

The v3 format includes automatic validation:
  • Type checking: All fields are type-checked at creation
  • Immutability: Events and messages are frozen dataclasses
  • JSON serialization: SessionTrace.to_dict() for storage
import json

# Convert to dict for JSON serialization
trace_dict = session.to_dict()

# Store in database or file
with open("trace.json", "w") as f:
    json.dump(trace_dict, f, indent=2, default=str)
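
To illustrate what immutability and `default=str` serialization mean in practice, here is a sketch using only the standard library. `MiniEvent` is a hypothetical stand-in, and `dataclasses.asdict` is used in place of the SDK's `to_dict()`, whose exact output is not shown here.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass, field, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class MiniEvent:
    """Stand-in frozen event (hypothetical, not the SDK class)."""
    system_instance_id: str
    created_at: datetime
    metadata: dict = field(default_factory=dict)


event = MiniEvent("crafter_env", datetime(2025, 1, 1, tzinfo=timezone.utc))

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    event.system_instance_id = "other"
    mutated = True
except FrozenInstanceError:
    mutated = False

# To change a field, build a modified copy instead.
updated = replace(event, system_instance_id="crafter_env_v2")

# default=str turns the datetime into a string, so the dump succeeds,
# but loading the JSON gives back a string, not a datetime.
payload = json.dumps(asdict(event), default=str)
restored = json.loads(payload)
```

Note that `default=str` is lossy: anything that is not natively JSON-serializable comes back as a plain string and has to be parsed again by the reader.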

Best Practices

1. Use Meaningful System IDs

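# Good: the ID names the component and its role
system_instance_id="llm_agent_v1"
system_instance_id="crafter_env"
system_instance_id="tool_executor"

# Bad: too generic to tell subsystems apart in a multi-component trace
system_instance_id="system1"
system_instance_id="agent"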

2. Populate Metadata

# Session metadata
metadata = {
    "environment_name": "crafter-v1",
    "policy_name": "ppo-v3",
    "policy_iteration": 42,
    "user_id": "user-123",
    "experiment_id": "exp-20250101",
}

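# Event metadata: pass as metadata=... when constructing the event.
# Schema Validation describes events and messages as frozen, so assigning
# to event.metadata after construction may be rejected.
event_metadata_dict = {
    "step_id": "turn_1",
    "duration_ms": 1250,
    "error": None,
}

# Message metadata: likewise passed as metadata=... at construction
message_metadata_dict = {
    "from_system_role": "agent",
    "to_system_role": "environment",
    "call_id": "call-abc123",
}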

3. Record Both Events and Messages

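Events capture what happened inside a subsystem; messages capture the communication between subsystems. Record both so the trace shows each decision and how it was passed on.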
# Record LLM decision (event)
lm_event = LMCAISEvent(...)
tracer.record_event(lm_event)

# Record action sent to environment (message)
action_message = SessionEventMarkovBlanketMessage(
    content=SessionMessageContent(text="collect wood"),
    message_type="action",
    ...
)
tracer.record_message(action_message)

# Record environment result (event)
env_event = EnvironmentEvent(reward=0.5, ...)
tracer.record_event(env_event)

# Record result returned to agent (message)
result_message = SessionEventMarkovBlanketMessage(
    content=SessionMessageContent(text="Success"),
    message_type="result",
    ...
)
tracer.record_message(result_message)

4. Use Consistent Timestamps

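import time
from datetime import UTC, datetime  # datetime.UTC requires Python 3.11+

# Use a Unix timestamp (float seconds) for event_time
time_record = TimeRecord(event_time=time.time())

# Use a timezone-aware datetime for session timestamps
created_at = datetime.now(UTC)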
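
Since events use float timestamps and sessions use datetimes, you will sometimes need to convert between the two. The helpers below are a sketch using only the standard library. `to_datetime` and `to_event_time` are hypothetical names, and `timezone.utc` is used so the code also runs on Python versions before 3.11.

```python
from datetime import datetime, timezone


def to_datetime(event_time: float) -> datetime:
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(event_time, tz=timezone.utc)


def to_event_time(moment: datetime) -> float:
    """Convert a timezone-aware datetime to a Unix timestamp (seconds)."""
    if moment.tzinfo is None:
        # Naive datetimes would be interpreted in local time; reject them.
        raise ValueError("expected a timezone-aware datetime")
    return moment.timestamp()


moment = to_datetime(1735689600.0)   # 2025-01-01T00:00:00+00:00
round_trip = to_event_time(moment)   # 1735689600.0
```

Rejecting naive datetimes avoids silent offsets when traces are produced on machines in different time zones.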

Migration from Legacy Format

If you have legacy trajectory data, use the conversion utilities:
from synth_ai.tracing_v3.conversion import convert_legacy_trajectory

# Convert old format to v3
v3_trace = convert_legacy_trajectory(legacy_trajectory)
