Overview

  • Base URL: https://agent-learning.onrender.com/api/v1 (endpoint paths below are written in full, including the /api/v1 prefix)
  • Auth: Authorization: Bearer <SYNTH_API_KEY> (organization‑scoped)
  • Content types: JSON unless noted; file upload uses multipart/form‑data
  • Notes: Some endpoints accept an optional GPU routing preference via the X-GPU-Preference header or the gpu query parameter.
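
The conventions above can be sketched as a small request builder using only the Python standard library. This is a minimal sketch, not an official client: the helper names build_request and send are hypothetical, and joining the listed paths to the bare host (rather than to the Base URL) is an assumption based on the paths already carrying the /api/v1 prefix.

```python
import json
import urllib.request

# Host portion of the documented Base URL. The endpoint paths in this
# document already start with /api/v1, so they are appended to the host
# (an assumption, see the lead-in).
HOST = "https://agent-learning.onrender.com"


def build_request(method, path, api_key, body=None, gpu=None):
    """Return an unsent urllib Request carrying the documented headers."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON is the default content type for request bodies.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if gpu is not None:
        # Optional GPU routing hint, e.g. "A100" or "L40S".
        headers["X-GPU-Preference"] = gpu
    return urllib.request.Request(HOST + path, data=data, headers=headers, method=method)


def send(request, timeout=60):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as send(build_request("GET", "/api/v1/models", api_key)) would then perform the request; nothing is sent until send is called.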

Inference

  • POST /api/v1/chat/completions
    • Body: OpenAI Chat Completions schema (model, messages, optional temperature, max_tokens, tools, tool_choice, stream)
    • Headers: Authorization, optional X-GPU-Preference: A100|L40S|A10G|...
    • Response: OpenAI‑compatible chat completion object. Supports streaming when stream=true.
  • POST /api/v1/responses
    • Body: OpenAI Responses API schema (model, input or messages, n, best_of, etc.)
    • Headers: Authorization, optional X-GPU-Preference
    • Response: OpenAI‑compatible responses object. Supports streaming when stream=true.
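
As an illustration of the chat completions body and of consuming a streamed response, the sketch below builds a payload from the documented fields and extracts text from stream lines. The streaming framing is an assumption carried over from OpenAI's format ("data: " lines holding JSON chunks with choices[].delta.content, terminated by "data: [DONE]"); the source only states that the response is OpenAI-compatible. The function names are hypothetical.

```python
import json


def chat_payload(model, messages, temperature=None, max_tokens=None, stream=False):
    """Build a chat completions body, omitting optional fields left unset."""
    payload = {"model": model, "messages": messages}
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if stream:
        payload["stream"] = True
    return payload


def iter_stream_text(lines):
    """Yield text fragments from an iterable of stream lines (bytes or str).

    Assumes OpenAI-style server-sent events; adjust if the service differs.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

An HTTP response object from urllib can be passed directly as lines, since iterating over it yields one line of bytes at a time.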

Warmup (model loading/cache)

  • POST /api/v1/warmup/{model_id}
    • Path: model_id (e.g., Qwen/Qwen3-4B-Instruct-2507)
    • Query: gpu (optional; e.g., A100, L40S)
    • Headers: Authorization, optional X-GPU-Preference
    • Response: Warmup submission status for the model/GPU.
  • GET /api/v1/warmup/status/{model_id}
    • Path: model_id
    • Headers: Authorization
    • Response: Current warmup status for the model.
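
A warmup is typically submitted once and then polled. The sketch below builds the two warmup URLs and provides a generic polling loop. Several points are assumptions: the slash in a model_id such as Qwen/Qwen3-4B-Instruct-2507 is left unescaped in the path, and the shape of the status response is not documented here, so the caller supplies the is_done predicate. All function names are hypothetical.

```python
import time
import urllib.parse

HOST = "https://agent-learning.onrender.com"


def warmup_url(model_id, gpu=None):
    """URL for POST /api/v1/warmup/{model_id}, with the optional gpu query."""
    # Assumption: the server accepts the "/" inside model_id unescaped.
    url = HOST + "/api/v1/warmup/" + urllib.parse.quote(model_id, safe="/")
    if gpu:
        url += "?" + urllib.parse.urlencode({"gpu": gpu})
    return url


def warmup_status_url(model_id):
    """URL for GET /api/v1/warmup/status/{model_id}."""
    return HOST + "/api/v1/warmup/status/" + urllib.parse.quote(model_id, safe="/")


def poll(fetch, is_done, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch() until is_done(result) is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("warmup did not finish before the timeout")
        sleep(interval)
```

Passing sleep as a parameter keeps the loop easy to exercise without real delays.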

Files (for fine‑tuning)

  • POST /api/v1/files
    • Form: file (binary), purpose (e.g., fine-tune)
    • Headers: Authorization
    • Response: Uploaded file metadata (id, filename, bytes, created_at).
  • GET /api/v1/files
    • Query: purpose (optional)
    • Headers: Authorization
    • Response: List of files.
  • GET /api/v1/files/{file_id}
    • Headers: Authorization
    • Response: File metadata.
  • GET /api/v1/files/{file_id}/content
    • Headers: Authorization
    • Response: File content (streamed).
  • DELETE /api/v1/files/{file_id}
    • Headers: Authorization
    • Response: Deletion result.
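
The upload endpoint is the one route that takes multipart/form-data rather than JSON. The following sketch encodes the two documented form fields (file and purpose) by hand with the standard library. The function name encode_upload is hypothetical, and labelling the file part application/octet-stream is an assumption; the source does not say which part content type the server expects.

```python
import uuid


def encode_upload(filename, file_bytes, purpose="fine-tune", boundary=None):
    """Return (content_type, body) for a POST /api/v1/files upload."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    file_header = 'Content-Disposition: form-data; name="file"; filename="%s"' % filename
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="purpose"',
        b"",
        purpose.encode("utf-8"),
        delimiter,
        file_header.encode("utf-8"),
        b"Content-Type: application/octet-stream",  # assumed part type
        b"",
        file_bytes,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    # Multipart bodies use CRLF line endings between all of these pieces.
    body = b"\r\n".join(parts)
    return "multipart/form-data; boundary=" + boundary, body
```

The returned content type goes into the Content-Type header alongside Authorization, and the body is sent as the raw request data.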

Fine‑tuning (SFT)

  • POST /api/v1/fine_tuning/jobs
    • Body: { model: string, training_file: string, training_type?: "sft", hyperparameters?: object, suffix?: string, validation_file?: string, seed?: number }
    • Only supervised fine‑tuning (SFT) is supported at this time.
    • Headers: Authorization
    • Response: Fine‑tuning job object (id, status, model, fine_tuned_model when available).
  • GET /api/v1/fine_tuning/jobs
    • Query: after (optional cursor), limit (1–100, default 20)
    • Headers: Authorization
    • Response: Paginated list of jobs.
  • GET /api/v1/fine_tuning/jobs/{job_id}
    • Headers: Authorization
    • Response: Job details and current status.
  • GET /api/v1/fine_tuning/jobs/{job_id}/events
    • Query: after (optional cursor), limit (1–100), stream (boolean)
    • Headers: Authorization
    • Response: List of job events or SSE stream when stream=true.
  • POST /api/v1/fine_tuning/jobs/{job_id}/cancel
    • Headers: Authorization
    • Response: Cancellation acknowledgement.
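
To make the job creation body and cursor pagination concrete, here is a hedged sketch. job_body validates against the fields listed for POST /api/v1/fine_tuning/jobs; iter_jobs walks pages using after and limit. The page shape (a "data" list plus a "has_more" flag, with the last job id used as the next cursor) is an assumption borrowed from OpenAI's list format and is not confirmed by this document. Both function names are hypothetical.

```python
ALLOWED_OPTIONAL = {"training_type", "hyperparameters", "suffix", "validation_file", "seed"}


def job_body(model, training_file, **optional):
    """Build the JSON body for creating a fine-tuning job."""
    unknown = set(optional) - ALLOWED_OPTIONAL
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    # Only supervised fine-tuning is supported, so reject anything else early.
    if optional.get("training_type", "sft") != "sft":
        raise ValueError("only training_type='sft' is supported")
    body = {"model": model, "training_file": training_file}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


def iter_jobs(fetch_page, limit=20):
    """Yield jobs across pages; fetch_page(after, limit) returns one page dict."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    after = None
    while True:
        page = fetch_page(after, limit)
        jobs = page.get("data", [])  # assumed OpenAI-style page shape
        yield from jobs
        if not jobs or not page.get("has_more"):
            return
        after = jobs[-1]["id"]  # assumed cursor: id of the last job seen
```

Here fetch_page is whatever callable performs GET /api/v1/fine_tuning/jobs with the given query values and returns the decoded JSON.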

Models

  • GET /api/v1/models
    • Headers: Authorization
    • Response: OpenAI‑compatible data plus extras:
      • gpus: supported GPU configurations
      • fine_tuned_models: org fine‑tunes (id, base_model, created_at, job_id, status)
      • available_models: quick local registry list
      • organization_id, customer_id
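
One way to use the extra fields is to group an organization's fine-tunes by status. The sketch below relies only on the documented fine_tuned_models entry fields (id and status); the helper name is hypothetical, and the actual status values are not listed in this document.

```python
from collections import defaultdict


def fine_tunes_by_status(models_response):
    """Map each status to the ids of fine-tuned models in that status."""
    grouped = defaultdict(list)
    # Tolerate a missing or null fine_tuned_models field.
    for entry in models_response.get("fine_tuned_models") or []:
        grouped[entry.get("status", "unknown")].append(entry["id"])
    return dict(grouped)
```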

Balance

  • GET /api/v1/balance/current
    • Headers: Authorization
    • Response: { organization_id, balance_cents, balance_dollars, last_updated }
  • GET /api/v1/balance/usage
    • Headers: Authorization
    • Response: { current_month: { token_spend_cents, gpu_spend_cents, total_spend_cents, token_count, gpu_hours }, last_30_days: { ... } }
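
The cent-denominated fields can be rendered for display without floating-point rounding surprises. This is a small sketch that assumes the *_cents values are integers; the function names are hypothetical.

```python
from decimal import Decimal


def format_cents(cents):
    """Render an integer cent amount as a dollar string."""
    return f"${Decimal(cents) / 100:.2f}"


def summarize_month(usage):
    """Format the current_month spend fields from GET /api/v1/balance/usage."""
    month = usage["current_month"]
    return {
        "tokens": format_cents(month["token_spend_cents"]),
        "gpu": format_cents(month["gpu_spend_cents"]),
        "total": format_cents(month["total_spend_cents"]),
    }
```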

Error semantics

  • Standard HTTP status codes: 401 for unauthorized requests, other 4xx codes for validation errors, and 5xx for upstream errors.
  • Error bodies include a detail string or, for some routes, structured troubleshooting information.
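
A client can fold these rules into one helper that classifies a failed response and extracts a readable message. The sketch below is one possible approach: the category labels and the function name are hypothetical, and it assumes that structured troubleshooting information, when present, also appears under the detail key.

```python
import json


def describe_error(status, body):
    """Return (kind, message) for an error response's status and raw body."""
    try:
        parsed = json.loads(body)
    except (ValueError, TypeError):
        parsed = None  # body was not JSON, e.g. a proxy's plain-text page
    detail = parsed.get("detail") if isinstance(parsed, dict) else None
    if isinstance(detail, str):
        message = detail
    elif detail is not None:
        # Structured detail: keep it readable and stable for logs.
        message = json.dumps(detail, sort_keys=True)
    elif isinstance(body, bytes):
        message = body.decode("utf-8", "replace")
    else:
        message = str(body)
    if status == 401:
        kind = "unauthorized"
    elif 400 <= status < 500:
        kind = "client_error"
    elif 500 <= status < 600:
        kind = "upstream_error"
    else:
        kind = "unexpected"
    return kind, message
```

With urllib, the status and body come from the raised urllib.error.HTTPError (its code attribute and read() method).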