# Connecting to OSS Models
LangDB AI Gateway supports connecting to open-source models through providers like Ollama and vLLM. This allows you to use locally hosted models while maintaining the same OpenAI-compatible API interface.
## Configuration
To use Ollama or vLLM, you need to provide a list of models with their endpoints. By default, `ai-gateway` loads models from `~/.langdb/models.yaml`. You can define your models there in the following format:
```yaml
- model: gpt-oss
  model_provider: ollama
  inference_provider:
    provider: ollama
    model_name: gpt-oss
    endpoint: https://my-ollama-server.localhost
  price:
    per_input_token: 0.0
    per_output_token: 0.0
  input_formats:
    - text
  output_formats:
    - text
  limits:
    max_context_size: 128000
  capabilities: ['tools']
  type: completions
  description: OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.
```
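The same schema applies to other providers. As a rough sketch, a vLLM-served model could be registered as follows; the model name, endpoint, and context size here are placeholders that you should replace with the values of your own vLLM deployment:

```yaml
- model: llama-3-8b-instruct
  model_provider: vllm
  inference_provider:
    provider: vllm
    model_name: meta-llama/Meta-Llama-3-8B-Instruct  # the model ID as served by your vLLM instance
    endpoint: http://my-vllm-server:8000              # placeholder endpoint
  price:
    per_input_token: 0.0
    per_output_token: 0.0
  input_formats:
    - text
  output_formats:
    - text
  limits:
    max_context_size: 8192
  capabilities: ['tools']  # include 'tools' only if your deployment supports function calling
  type: completions
  description: Llama 3 8B Instruct served locally via vLLM.
```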
### Configuration Fields
| Field | Description | Required |
|---|---|---|
| `model` | The model identifier used in API requests | Yes |
| `model_provider` | The provider type (e.g., `ollama`, `vllm`) | Yes |
| `inference_provider` | Provider-specific configuration (provider, model name, endpoint) | Yes |
| `price` | Token pricing (set to `0.0` for local models) | Yes |
| `input_formats` | Supported input formats | Yes |
| `output_formats` | Supported output formats | Yes |
| `limits` | Model limitations (context size, etc.) | Yes |
| `capabilities` | Model capabilities array (e.g., `['tools']` for function calling) | Yes |
| `type` | Model type (e.g., `completions`) | Yes |
| `description` | Human-readable model description | Yes |
## Example Usage
Once configured, you can use your OSS models through the standard OpenAI-compatible API:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss",
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'
```
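Because the interface is OpenAI-compatible, standard request fields are passed in the usual way. As an illustrative sketch, if the model entry declares the `tools` capability, a function-calling request could look like this (`get_weather` is a made-up function used only for illustration):

```bash
# Function-calling request against an OSS model with capabilities: ['tools']
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```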
## Supported Providers
### Ollama

- Provider: `ollama`
- Endpoint: URL to your Ollama server
- Model Name: The model name as configured in Ollama (see the quick check below)
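The `model_name` you configure should match a model that is actually available in your Ollama instance. Assuming the standard Ollama CLI, you can check and pull models like so (the `gpt-oss` tag mirrors the configuration example above):

```bash
# Show models available in the local Ollama instance
ollama list

# Pull the model if it is not present yet
ollama pull gpt-oss
```

A local Ollama instance listens on `http://localhost:11434` by default, which is typically what the `endpoint` field points to for local setups.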
### vLLM

- Provider: `vllm`
- Endpoint: URL to your vLLM server
- Model Name: The model name as configured in vLLM (see the launch sketch below)
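vLLM exposes an OpenAI-compatible API when started with its built-in server. A minimal sketch, assuming a recent vLLM release (the model ID and port are placeholders):

```bash
# Start vLLM's OpenAI-compatible server (older releases use
# `python -m vllm.entrypoints.openai.api_server --model ...` instead)
vllm serve meta-llama/Meta-Llama-3-8B-Instruct --port 8000
```

The `endpoint` field in `models.yaml` would then point at this server, e.g. `http://localhost:8000` for a local instance.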
## Best Practices

- Local Development: Use `localhost` or `127.0.0.1` for local Ollama/vLLM instances
- Production: Use proper domain names or IP addresses for remote instances
- Security: Ensure your OSS model endpoints are properly secured
- Performance: Consider the network latency between `ai-gateway` and your model servers
- Monitoring: Use the observability features to monitor OSS model performance