llm_client ¶
LLM client abstractions and implementations.
Provides a unified interface for multiple LLM providers, following the Adapter pattern and the Dependency Inversion principle.
LLMClient ¶
Bases: ABC
Abstract base class for LLM clients.
Defines the contract that all LLM provider implementations must follow, enabling easy swapping of providers (Strategy pattern).
Initialize LLM client.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| spec | LLMSpec | LLM specification | required |
invoke abstractmethod ¶

Invoke LLM with a single prompt.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prompt | str | Text prompt | required |
| **kwargs | Any | Additional model parameters | {} |

Returns:

| Type | Description |
|---|---|
| LLMResponse | LLMResponse with result and metadata |
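A minimal usage sketch of the invoke contract. The create_llm_client call and the LLMSpec/LLMProvider names come from the factory example at the end of this page; the spec import path is an assumption:

```python
from ondine.adapters.llm_client import create_llm_client
# Assumed import path for the spec types; adjust to where LLMSpec actually lives.
from ondine.core.specifications import LLMSpec, LLMProvider

client = create_llm_client(LLMSpec(provider=LLMProvider.OPENAI, model="gpt-4o-mini"))
# **kwargs pass through as additional model parameters (e.g. temperature).
response = client.invoke("Reply with a single word: OK", temperature=0.0)
print(response)  # an LLMResponse carrying the result plus metadata
```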
estimate_tokens abstractmethod ¶

Estimate token count for text.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Input text | required |

Returns:

| Type | Description |
|---|---|
| int | Estimated token count |
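Continuing the snippet above, estimate_tokens enables pre-flight budgeting before any API call is made; a sketch:

```python
prompts = ["Classify: great product!", "Classify: terrible service."]
# Sum the estimates to budget a run before spending anything.
total_in = sum(client.estimate_tokens(p) for p in prompts)
print(f"~{total_in} input tokens across {len(prompts)} prompts")
```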
batch_invoke ¶
Invoke LLM with multiple prompts.
Default implementation: sequential invocation. Subclasses can override for provider-optimized batch processing.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prompts | list[str] | List of text prompts | required |
| **kwargs | Any | Additional model parameters | {} |

Returns:

| Type | Description |
|---|---|
| list[LLMResponse] | List of LLMResponse objects |
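The documented default (sequential invocation) amounts to one invoke call per prompt. A sketch of that behavior, not the literal method body in ondine:

```python
from ondine.adapters.llm_client import LLMClient

def sequential_batch(client: LLMClient, prompts: list[str], **kwargs) -> list:
    # One invoke() per prompt, preserving input order; a provider-optimized
    # override could instead submit the whole batch in a single API call.
    return [client.invoke(p, **kwargs) for p in prompts]
```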
calculate_cost ¶
Calculate cost for token usage.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| tokens_in | int | Input tokens | required |
| tokens_out | int | Output tokens | required |

Returns:

| Type | Description |
|---|---|
| Decimal | Total cost in USD |
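The usual shape of such a cost computation, with hypothetical per-1K-token prices (the real rates presumably come from the provider pricing configured on the spec, which this page does not show):

```python
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens; not ondine's actual rates.
PRICE_IN_PER_1K = Decimal("0.00015")
PRICE_OUT_PER_1K = Decimal("0.0006")

def calculate_cost(tokens_in: int, tokens_out: int) -> Decimal:
    # Decimal keeps the USD arithmetic exact, matching the documented return type.
    return (Decimal(tokens_in) * PRICE_IN_PER_1K
            + Decimal(tokens_out) * PRICE_OUT_PER_1K) / 1000
```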
OpenAIClient ¶
Bases: LLMClient
OpenAI LLM client implementation.
Initialize OpenAI client.
invoke ¶
Invoke OpenAI API.
AzureOpenAIClient ¶
Bases: LLMClient
Azure OpenAI LLM client implementation.
Initialize Azure OpenAI client.
invoke ¶
Invoke Azure OpenAI API.
AnthropicClient ¶
Bases: LLMClient
Anthropic Claude LLM client implementation.
Initialize Anthropic client.
invoke ¶
Invoke Anthropic API.
GroqClient ¶
Bases: LLMClient
Groq LLM client implementation.
Initialize Groq client.
invoke ¶
Invoke Groq API.
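Because every client above satisfies the same LLMClient contract, swapping providers is a spec change rather than a code change (the Strategy pattern noted earlier). A sketch; the ANTHROPIC and GROQ enum member names are assumptions inferred from the client classes above, and model ids are illustrative:

```python
from ondine.adapters.llm_client import create_llm_client
from ondine.core.specifications import LLMSpec, LLMProvider  # assumed import path

for provider, model in [
    (LLMProvider.OPENAI, "gpt-4o-mini"),
    (LLMProvider.ANTHROPIC, "claude-3-5-haiku-latest"),
    (LLMProvider.GROQ, "llama-3.1-8b-instant"),
]:
    client = create_llm_client(LLMSpec(provider=provider, model=model))
    print(type(client).__name__, client.invoke("Say hi."))
```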
OpenAICompatibleClient ¶
Bases: LLMClient
Client for OpenAI-compatible API endpoints.
Supports custom providers like Ollama, vLLM, Together.ai, Anyscale, and any other API that implements the OpenAI chat completions format.
Initialize OpenAI-compatible client.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| spec | LLMSpec | LLM specification with base_url required | required |

Raises:

| Type | Description |
|---|---|
| ValueError | If base_url not provided |
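A sketch of pointing the client at a local Ollama server; the base_url is Ollama's standard OpenAI-compatible endpoint, while the provider string and the LLMSpec import path are assumptions:

```python
from ondine.adapters.llm_client import OpenAICompatibleClient
from ondine.core.specifications import LLMSpec  # assumed import path

spec = LLMSpec(
    provider="openai_compatible",          # assumed provider identifier
    model="llama3.1",
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
)
client = OpenAICompatibleClient(spec)  # would raise ValueError without base_url
```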
invoke ¶
Invoke OpenAI-compatible API.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prompt | str | Text prompt | required |
| **kwargs | Any | Additional model parameters | {} |

Returns:

| Type | Description |
|---|---|
| LLMResponse | LLMResponse with result and metadata |
estimate_tokens ¶
Estimate tokens using tiktoken.
Note: This is approximate for custom providers.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Input text | required |

Returns:

| Type | Description |
|---|---|
| int | Estimated token count |
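What tiktoken-based estimation typically looks like; whether ondine uses exactly this encoding for custom providers is an assumption, which is why the note above calls the count approximate:

```python
import tiktoken

def estimate_tokens(text: str) -> int:
    # cl100k_base is a common default encoding; a custom provider's real
    # tokenizer may differ, so treat this count as an approximation.
    return len(tiktoken.get_encoding("cl100k_base").encode(text))
```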
MLXClient ¶
Bases: LLMClient
MLX client for Apple Silicon local inference.
MLX is Apple's optimized ML framework for M-series chips. This client enables fast, local LLM inference without API costs.
Requires: pip install ondine[mlx]
Platform: macOS with Apple Silicon only
Initialize MLX client and load model.
Model is loaded once and cached for fast subsequent calls.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| spec | LLMSpec | LLM specification with model name | required |
| _mlx_lm_module |  | MLX module (internal/testing only) | None |

Raises:

| Type | Description |
|---|---|
| ImportError | If MLX not installed |
| Exception | If model loading fails |
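A sketch of on-device MLX inference; the model id is an illustrative mlx-community checkpoint, and the LLMProvider.MLX member name and spec import path are assumptions:

```python
from ondine.adapters.llm_client import create_llm_client
from ondine.core.specifications import LLMSpec, LLMProvider  # assumed import path

spec = LLMSpec(provider=LLMProvider.MLX,
               model="mlx-community/Mistral-7B-Instruct-v0.3-4bit")  # illustrative
client = create_llm_client(spec)  # loads and caches the model once
print(client.invoke("Write one short sentence."))  # no API cost: runs on-device
```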
invoke ¶
Invoke MLX model for inference.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prompt | str | Text prompt | required |
| **kwargs | Any | Additional generation parameters | {} |

Returns:

| Type | Description |
|---|---|
| LLMResponse | LLMResponse with result and metadata |
estimate_tokens ¶
Estimate token count using MLX tokenizer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Input text | required |

Returns:

| Type | Description |
|---|---|
| int | Estimated token count |
create_llm_client ¶
Factory function to create the appropriate LLM client using the ProviderRegistry.
Supports both built-in providers (via LLMProvider enum) and custom providers (registered via ProviderRegistry).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| spec | LLMSpec | LLM specification | required |

Returns:

| Type | Description |
|---|---|
| LLMClient | Configured LLM client |

Raises:

| Type | Description |
|---|---|
| ValueError | If provider not supported |
Example

Built-in provider ¶

    spec = LLMSpec(provider=LLMProvider.OPENAI, model="gpt-4o-mini")
    client = create_llm_client(spec)

Custom provider (registered via @provider decorator) ¶

    spec = LLMSpec(provider="my_custom_llm", model="my-model")
    client = create_llm_client(spec)