HTTP API Reference

Base URL: https://agent-learning.onrender.com/api/v1
Auth: Authorization: Bearer <SYNTH_API_KEY> (organization‑scoped)
Content types: JSON unless noted; file upload uses multipart/form‑data
Notes: Some endpoints accept optional GPU routing via X-GPU-Preference header or gpu query.

POST /api/v1/chat/completions
- Body: OpenAI Chat Completions schema (model, messages, optional temperature, max_tokens, tools, tool_choice, stream)
- Headers: Authorization, optional X-GPU-Preference: A100|L40S|A10G|...
- Response: OpenAI‑compatible chat completion object. Supports streaming when stream=true.
POST /api/v1/responses
- Body: OpenAI Responses API schema (model, input or messages, n, best_of, etc.)
- Headers: Authorization, optional X-GPU-Preference
- Response: OpenAI‑compatible responses object. Supports streaming when stream=true.

POST /api/v1/warmup/{model_id}
- Path: model_id (e.g., Qwen/Qwen3-4B-Instruct-2507)
- Query: gpu (optional; e.g., A100, L40S)
- Headers: Authorization, optional X-GPU-Preference
- Response: Warmup submission status for the model/GPU.
GET /api/v1/warmup/status/{model_id}
- Path: model_id
- Headers: Authorization
- Response: Current warmup status for the model.

POST /api/v1/files
- Form: file (binary), purpose (e.g., fine-tune)
- Headers: Authorization
- Response: Uploaded file metadata (id, filename, bytes, created_at).
GET /api/v1/files
- Query: purpose (optional)
- Headers: Authorization
- Response: List of files.
GET /api/v1/files/{file_id}
- Headers: Authorization
- Response: File metadata.
GET /api/v1/files/{file_id}/content
- Headers: Authorization
- Response: File content (streamed).
DELETE /api/v1/files/{file_id}
- Headers: Authorization
- Response: Deletion result.

POST /api/v1/fine_tuning/jobs
- Body: { model: string, training_file: string, training_type?: "sft", hyperparameters?: object, suffix?: string, validation_file?: string, seed?: number }
- Only supervised fine‑tuning (SFT) is supported at this time.
- Headers: Authorization
- Response: Fine‑tuning job object (id, status, model, fine_tuned_model when available).
GET /api/v1/fine_tuning/jobs
- Query: after (optional cursor), limit (1–100, default 20)
- Headers: Authorization
- Response: Paginated list of jobs.
GET /api/v1/fine_tuning/jobs/{job_id}
- Headers: Authorization
- Response: Job details and current status.
GET /api/v1/fine_tuning/jobs/{job_id}/events
- Query: after, limit (1–100), stream (bool)
- Headers: Authorization
- Response: List of job events or SSE stream when stream=true.
POST /api/v1/fine_tuning/jobs/{job_id}/cancel
- Headers: Authorization
- Response: Cancellation acknowledgement.

GET /api/v1/models
- Headers: Authorization
- Response: OpenAI‑compatible data plus extras:
  - gpus: supported GPU configurations
  - fine_tuned_models: org fine‑tunes (id, base_model, created_at, job_id, status)
  - available_models: quick local registry list
  - organization_id, customer_id

GET /api/v1/balance/current
- Headers: Authorization
- Response: { organization_id, balance_cents, balance_dollars, last_updated }
GET /api/v1/balance/usage
- Headers: Authorization
- Response: { current_month: { token_spend_cents, gpu_spend_cents, total_spend_cents, token_count, gpu_hours }, last_30_days: { ... } }

Standard HTTP status codes; 401 for unauthorized, 4xx for validation, 5xx for upstream errors.
Error bodies include detail string or structured troubleshooting info for some routes.