Required TOML headers:

[algorithm]
[policy]
[compute]
[training]
[evaluation]
[services]

Algorithm header required keys:

  • type = "online"
  • method
    • Value must be one of:
      • "policy_gradient"
      • "ppo"
      • "gspo"
  • variety

Policy header required keys:

  • model_name or source
  • trainer_mode
  • label

Compute header required keys:

  • gpu_type
  • gpu_count
  • rollout.env_name
  • rollout.policy_name
  • topology.reference_placement

Training header required keys:

  • num_epochs
  • iterations_per_epoch
  • max_turns
  • batch_size
  • group_size
  • learning_rate

Evaluation header required keys:

  • instances
  • every_n_iters
  • seeds

Services header required key:

  • task_url