specifications ¶
Core specification models for pipeline configuration.
These Pydantic models define the configuration contracts for all pipeline components, keeping configuration (what to do) separate from execution (how to do it).
DataSourceType ¶
Bases: str, Enum
Supported data source types.
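Because these enums subclass `str`, members behave as plain strings, which lets specs accept and serialize bare string values. A minimal sketch of the pattern (the member shown is hypothetical, not necessarily one of DataSourceType's actual values):

```python
from enum import Enum

class DataSourceType(str, Enum):  # standalone sketch of the `str, Enum` pattern
    CSV = "csv"  # hypothetical member for illustration

# str subclassing means members compare and serialize as plain strings,
# so Pydantic can coerce "csv" from a config into DataSourceType.CSV.
assert DataSourceType.CSV == "csv"
assert DataSourceType("csv") is DataSourceType.CSV
```

The same pattern applies to LLMProvider, ErrorPolicy, and MergeStrategy below.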
LLMProvider ¶
Bases: str, Enum
Supported LLM providers.
ErrorPolicy ¶
Bases: str, Enum
Error handling policies for processing failures.
MergeStrategy ¶
Bases: str, Enum
Output merge strategies.
DatasetSpec ¶
Bases: BaseModel
Specification for data source configuration.
validate_source_path classmethod ¶
Convert string paths to Path objects.
validate_no_overlap classmethod ¶
Ensure output columns don't overlap with input columns.
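A hedged sketch of both validators' observable behavior; the field names `source`, `input_columns`, and `output_columns` are assumptions based on the builder API shown later, not confirmed signatures:

```python
from pathlib import Path
from pydantic import ValidationError
from ondine.core.specifications import DatasetSpec

spec = DatasetSpec(
    source="data.csv",            # assumed field name; str coerced to Path
    input_columns=["text"],
    output_columns=["result"],
)
assert isinstance(spec.source, Path)  # validate_source_path converted it

try:
    DatasetSpec(
        source="data.csv",
        input_columns=["text"],
        output_columns=["text"],  # overlaps input_columns
    )
except ValidationError as exc:
    print(exc)                    # validate_no_overlap should reject this
```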
PromptSpec ¶
Bases: BaseModel
Specification for prompt template configuration.
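A minimal sketch, assuming the spec exposes a `template` field whose placeholders name the dataset's input columns (as `"Process: {text}"` does in the preset example below):

```python
from ondine.core.specifications import PromptSpec

# `template` is an assumed field name; the {text} placeholder must match
# an input column of the dataset.
prompt = PromptSpec(template="Summarize: {text}")
```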
LLMSpec ¶
Bases: BaseModel
Specification for LLM provider configuration.
validate_base_url_format classmethod ¶
Validate base_url is a valid HTTP(S) URL with a host.
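An illustrative, standalone check equivalent to the rule this validator enforces (a sketch of the rule, not the library's code):

```python
from urllib.parse import urlparse

def is_valid_base_url(base_url: str) -> bool:
    parsed = urlparse(base_url)
    # Requires an explicit http/https scheme and a non-empty host.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

assert is_valid_base_url("http://localhost:8000/v1")
assert not is_valid_base_url("localhost:8000/v1")  # missing scheme
assert not is_valid_base_url("https://")           # missing host
```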
validate_azure_config classmethod ¶
Validate Azure-specific configuration.
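The Azure check is a cross-field rule. Here is a self-contained Pydantic v2 sketch of that pattern; the field names and error message are invented for illustration, not ondine's actual ones:

```python
from pydantic import BaseModel, model_validator

class AzureLikeSpec(BaseModel):
    provider: str
    azure_endpoint: str | None = None  # invented field name

    @model_validator(mode="after")
    def check_azure(self) -> "AzureLikeSpec":
        # Provider-specific fields are only required for that provider.
        if self.provider == "azure" and not self.azure_endpoint:
            raise ValueError("azure_endpoint is required when provider='azure'")
        return self
```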
validate_provider_requirements ¶
Validate provider-specific requirements.
ProcessingSpec ¶
Bases: BaseModel
Specification for processing behavior configuration.
OutputSpec ¶
Bases: BaseModel
Specification for output configuration.
validate_destination_path classmethod ¶
Convert string paths to Path objects.
PipelineSpecifications ¶
Bases: BaseModel
Container for all pipeline specifications.
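A hedged assembly sketch; every field name below (`dataset`, `prompt`, `llm`, `processing`, `output`, and the per-spec fields) is an assumption about the container's layout rather than confirmed API:

```python
from ondine.core.specifications import (
    DatasetSpec,
    LLMSpec,
    OutputSpec,
    PipelineSpecifications,
    ProcessingSpec,
    PromptSpec,
)

specs = PipelineSpecifications(
    dataset=DatasetSpec(source="data.csv", input_columns=["text"],
                        output_columns=["result"]),
    prompt=PromptSpec(template="Process: {text}"),
    llm=LLMSpec(provider="openai", model="gpt-4o-mini"),
    processing=ProcessingSpec(),
    output=OutputSpec(destination="out.csv"),
)
```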
LLMProviderPresets ¶
Pre-configured LLM provider specifications for common use cases.
These presets provide convenient access to popular LLM providers with correct base URLs, pricing, and configuration. API keys must be provided at runtime via environment variables or explicit overrides.
Example

```python
# Use preset with env var API key
from ondine.core.specifications import LLMProviderPresets

pipeline = (
    PipelineBuilder.create()
    .from_csv("data.csv", input_columns=["text"], output_columns=["result"])
    .with_prompt("Process: {text}")
    .with_llm_spec(LLMProviderPresets.TOGETHER_AI_LLAMA_70B)
    .build()
)

# Override API key
spec = LLMProviderPresets.TOGETHER_AI_LLAMA_70B.model_copy(
    update={"api_key": "your-key"}  # pragma: allowlist secret
)
pipeline.with_llm_spec(spec)
```
Security Note
All presets have api_key=None by default. You must provide API keys at runtime via environment variables or explicit overrides.
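If you prefer wiring the variable explicitly rather than relying on provider-side auto-detection, a minimal sketch (the variable name `TOGETHER_API_KEY` is an assumption):

```python
import os

from ondine.core.specifications import LLMProviderPresets

# Read the key from the environment at runtime and override the preset's
# api_key=None default via model_copy, as shown above.
spec = LLMProviderPresets.TOGETHER_AI_LLAMA_70B.model_copy(
    update={"api_key": os.environ["TOGETHER_API_KEY"]}
)
```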
create_custom_openai_compatible classmethod ¶

```python
create_custom_openai_compatible(
    provider_name: str,
    model: str,
    base_url: str,
    input_cost_per_1k: float = 0.0,
    output_cost_per_1k: float = 0.0,
    **kwargs,
) -> LLMSpec
```
Factory method for custom OpenAI-compatible providers.
Use this for providers like vLLM, LocalAI, Anyscale, or any custom OpenAI-compatible API endpoint.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `provider_name` | `str` | Display name for the provider (for logging/metrics) | *required* |
| `model` | `str` | Model identifier | *required* |
| `base_url` | `str` | API endpoint URL (e.g., `http://localhost:8000/v1`) | *required* |
| `input_cost_per_1k` | `float` | Input token cost per 1K tokens | `0.0` |
| `output_cost_per_1k` | `float` | Output token cost per 1K tokens | `0.0` |
| `**kwargs` | | Additional `LLMSpec` parameters (`temperature`, `max_tokens`, etc.) | `{}` |
Returns:

| Type | Description |
|---|---|
| `LLMSpec` | Configured `LLMSpec` for the custom provider |
Example

```python
spec = LLMProviderPresets.create_custom_openai_compatible(
    provider_name="My vLLM Server",
    model="mistral-7b-instruct",
    base_url="http://my-server:8000/v1",
    temperature=0.7,
)
```
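The returned spec drops into the builder the same way a preset does, e.g. via `.with_llm_spec(spec)` as in the pipeline example under LLMProviderPresets above.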