synth_ai.sdk.inference.client
Client for making model inference requests via Synth AI.
This module provides a client for making inference requests through Synth AI’s
inference proxy, which routes requests to appropriate model providers (OpenAI,
Groq, etc.) based on the model identifier.
Example:
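A minimal usage sketch. Only create_chat_completion and its parameters are documented in this reference; the constructor keyword names (api_key, base_url) and the placeholder values below are assumptions for illustration:

```python
from synth_ai.sdk.inference.client import InferenceClient

# Construct the client. The api_key/base_url keyword names are assumptions;
# check the InferenceClient constructor for the actual signature.
client = InferenceClient(api_key="sk-...", base_url="https://your-synth-backend.example")

# Route a chat completion through the Synth AI inference proxy; the proxy
# picks the provider (OpenAI, Groq, etc.) from the model identifier.
response = client.create_chat_completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=128,
)

print(response["choices"][0]["message"]["content"])
```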
Classes
InferenceClient
Client for making inference requests through Synth AI’s inference proxy.
This client provides a unified interface for calling LLMs through Synth AI’s
backend, which handles routing to appropriate providers (OpenAI, Groq, etc.)
based on the model identifier.
Methods:
create_chat_completion
Args:
model: Model identifier (e.g., “gpt-4o-mini”, “Qwen/Qwen3-4B”)
messages: List of message dicts with “role” and “content” keys
**kwargs: Additional OpenAI-compatible parameters:
- temperature: Sampling temperature (0.0-2.0)
- max_tokens: Maximum tokens to generate
- thinking_budget: Budget for thinking tokens (default: 256)
- top_p: Nucleus sampling parameter
- frequency_penalty: Frequency penalty (-2.0 to 2.0)
- presence_penalty: Presence penalty (-2.0 to 2.0)
- stop: Stop sequences
- tools: Function calling tools
- tool_choice: Tool choice strategy
- stream: Whether to stream responses
- … (other OpenAI API parameters)
Returns:
Completion response dict with:
- id: Request ID
- choices: List of completion choices
- usage: Token usage statistics
- … (other OpenAI-compatible fields)
Raises:
- ValueError: If model is not supported or request is invalid
- HTTPError: If the API request fails
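A hedged sketch of calling create_chat_completion with tool-calling kwargs and handling the documented errors. The constructor arguments, the synchronous call style, and the assumption that HTTPError comes from requests are illustrative, not confirmed by this reference:

```python
from synth_ai.sdk.inference.client import InferenceClient
import requests  # HTTPError type is an assumption; the client may raise its own error class

client = InferenceClient(api_key="sk-...")  # constructor arguments are an assumption

try:
    response = client.create_chat_completion(
        model="Qwen/Qwen3-4B",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the weather in Paris?"},
        ],
        temperature=0.2,
        max_tokens=256,
        # OpenAI-compatible function-calling parameters, as listed above.
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        tool_choice="auto",
    )
except ValueError as exc:
    # Raised when the model is not supported or the request is invalid.
    print(f"Bad request: {exc}")
except requests.HTTPError as exc:
    # Raised when the API request itself fails.
    print(f"API request failed: {exc}")
else:
    print(response["usage"])
```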