TL;DR
- Organization-Scoped Credentials: Sealed-box encrypted environment API keys
- First-Party Task App Integration: Managed Task Apps with authenticated rollouts
- Single-Node Multi-GPU Online RL: Out-of-the-box GPU split for inference and training
- Production Run Flow: Complete workflow from job start to checkpoint inference
Organization-Scoped Environment Credentials
Upload your environment API key once (sealed-box encrypted). The platform decrypts and injects it at runtime; plaintext is never transmitted or stored.
- Secure Storage: Sealed-box encryption for API keys
- Runtime Injection: Keys injected at runtime, never stored in plaintext
- Organization Scope: Keys scoped to organizations for better security
- One-Time Setup: Upload once, use across all jobs
First-Party Task App Integration
Run environments behind a managed Task App with authenticated rollouts. Online RL calls your Task App endpoints directly during training.
- Managed Task Apps: Task Apps hosted and managed by the platform
- Authenticated Rollouts: Secure authentication for rollout requests
- Direct Integration: Online RL calls Task App endpoints directly
- Seamless Workflow: No manual configuration required
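The authenticated-rollout shape can be sketched as a bearer-token request to a Task App endpoint. The `/rollout` path and payload fields below are assumptions for illustration, not the platform's actual API; the point is that every rollout call carries authentication.

```python
import json
import urllib.request

def build_rollout_request(task_app_url: str, api_key: str, rollout: dict):
    """Construct an authenticated rollout request (hypothetical endpoint shape)."""
    body = json.dumps(rollout).encode("utf-8")
    return urllib.request.Request(
        url=f"{task_app_url}/rollout",  # assumed path, for illustration only
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_rollout_request("https://example-task-app.dev", "key-123", {"seed": 7})
```

During training, the Online RL loop issues requests of this shape against your deployed Task App; no manual wiring is needed on your side.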
Single-Node, Multi-GPU Online RL
Out-of-the-box split between vLLM inference GPUs and training GPUs on a single node (e.g., 6 inference / 2 training on H100).
- Automatic Split: Automatic GPU allocation for inference and training
- Single Node: Works on a single node with multiple GPUs
- Flexible Configuration: Configurable tensor parallelism for inference
- Reference Model Support: The reference model (for KL) can be stacked on the inference GPUs or run on its own GPU
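The split above amounts to partitioning the node's device IDs between the two roles. A minimal sketch, assuming contiguous device IDs and the 6/2 example from the text (function and field names are illustrative, not the platform's config schema):

```python
def split_gpus(total: int = 8, inference: int = 6, tp_size: int = 2) -> dict:
    """Partition a single node's GPUs into vLLM inference and training sets.

    Assumes contiguous device IDs; tensor parallelism must evenly divide
    the inference GPU count.
    """
    training = total - inference
    if training <= 0:
        raise ValueError("at least one training GPU is required")
    if inference % tp_size != 0:
        raise ValueError("tensor_parallel_size must divide the inference GPU count")
    return {
        "inference_devices": list(range(inference)),        # e.g. GPUs 0-5 for vLLM
        "training_devices": list(range(inference, total)),  # e.g. GPUs 6-7 for training
        "tensor_parallel_size": tp_size,
    }

cfg = split_gpus(total=8, inference=6, tp_size=2)
```

A reference model for KL can then either share the inference devices or be given one of the training-side GPUs, depending on memory headroom.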
Multi-Node Training
Multi-node training is complete in development; reach out if you're interested.
Production Run Flow
Start an Online RL job against your deployed Task App, monitor progress/events, and run inference using the produced checkpoint when training completes.
- Complete Workflow: End-to-end workflow from job start to inference
- Progress Monitoring: Real-time progress and event monitoring
- Checkpoint Inference: Use produced checkpoints for inference
- Production Ready: Full production workflow support
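The end-to-end flow reduces to: start the job, poll status/events until it finishes, then fetch the checkpoint for inference. The sketch below uses a stand-in client class, since the real SDK's method names are not shown in this section; only the control flow is meant to mirror the workflow described above.

```python
import time

class FakeJobClient:
    """Stand-in for the platform SDK client (all names are hypothetical)."""
    def __init__(self):
        self._polls = 0
    def start_job(self, config: dict) -> str:
        return "job-123"
    def get_status(self, job_id: str) -> str:
        self._polls += 1
        return "succeeded" if self._polls >= 3 else "running"
    def get_checkpoint(self, job_id: str) -> str:
        return "ckpt-final"

def run_online_rl(client, config: dict, poll_interval: float = 0.0) -> str:
    """Start an Online RL job, poll until it completes, return the checkpoint."""
    job_id = client.start_job(config)
    while (status := client.get_status(job_id)) == "running":
        time.sleep(poll_interval)  # in practice, also surface events here
    if status != "succeeded":
        raise RuntimeError(f"job {job_id} ended with status {status!r}")
    return client.get_checkpoint(job_id)

checkpoint = run_online_rl(FakeJobClient(), {"task_app": "https://example-task-app.dev"})
```

The returned checkpoint identifier is what you would then pass to inference.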
Use Cases
- Secure Credentials: Store environment API keys securely with encryption
- Managed Environments: Use managed Task Apps for easier deployment
- Efficient RL: Optimize GPU usage with automatic split between inference and training
- Production RL: Run Online RL jobs in production with complete monitoring