Routing Engine (Enterprise Only)

LangDB's Routing Engine enables organizations to control how user requests are handled by AI models, optimizing for cost, performance, compliance, and user experience. By defining routing rules in JSON, businesses can automate decision-making, ensure reliability, and maximize value from their AI investments.

What is Routing and What are its Components?

Routing is the process of directing an incoming request to the most appropriate AI model based on a set of rules. The LangDB router is composed of several key components that work together to execute this logic:

Routes: These are the core building blocks of your router. A router is essentially a list of routes that are evaluated in order. The first route whose conditions are met is executed.
Conditions: The logic that determines whether a route should be triggered. Conditions can evaluate request data, user metadata, or results from pre-request hooks.
Targets: The destination for a request if a route's conditions are met. This is typically one or more AI models.
Interceptors (Guardrails & Rate Limiters): These are pre-request hooks that can inspect, modify, or enrich a request before routing rules are evaluated. Their results can be used in conditions.
Message Mapper: A component used to block a request or modify the final response, often for handling errors like rate limits.

Example Use Cases

Enterprise Use Case	Business Goal	Key Variables & Metrics	Routing Logic Summary
SLA-Driven Tiering	Guarantee premium performance for high-value customers.	`extra.user.tier`, `ttft`	Route `extra.user.tier: "premium"` to models with the lowest `ttft`.
Geographic Compliance	Ensure data sovereignty and meet regulatory requirements (e.g., GDPR).	`metadata.region`, `extra.user.tags`	If `metadata.region: "EU"` and the user has `tags: ["GDPR"]`, route to GDPR-compliant models.
Intelligent Cost Management	Reduce operational expenses for internal or low-priority tasks.	`metadata.group_name`, `price`	If `metadata.group_name: "internal"`, sort available models by price (`"sort_by": "price"`).
Content-Aware Routing	Improve accuracy by using specialized models for specific topics.	`pre_request.semantic_guardrail.result.topic`	If `topic: "finance"`, route to a finance-tuned model.
Brand Safety Enforcement	Prevent brand damage by blocking or redirecting inappropriate content.	`pre_request.toxicity_guardrail.result.passed`	If `passed: false`, block the request or route to a safe-reply model.

For more detailed examples, see the pages below:

- Quick, focused routing patterns you can copy and adapt.
- End-to-end example of a multi-layer enterprise routing setup with tiering, cost and fallbacks.
- Example showing rate limiting, semantic guardrails, GDPR routing, and error handling.

Anatomy of a Routing Request

A routing request is a standard chat completion request with two key additions:

The model must be set to "router/dynamic".
A router object containing your routing logic must be included in the request body.

Here’s how the various components fit into a complete request. The example below shows a simple configuration with two routes: one for premium users and a fallback for everyone else.

{
  "model": "router/dynamic",
  "messages": [
    {
      "role": "user",
      "content": "Our production API is down, I need help now!"
    }
  ],
  "extra": {
    "user": { "tier": "premium" }
  },
  "router": {
    "type": "conditional",
    "routes": [
      {
        "name": "premium_support_fast_track",
        "conditions": {
          "all": [
            { "extra.user.tier": { "$eq": "premium" } }
          ]
        },
        "targets": {
          "$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
          "sort_by": "ttft",
          "sort_order": "min"
        }
      },
      {
        "name": "default_fallback",
        "conditions": { "all": [] },
        "targets": "openai/gpt-4o-mini"
      }
    ]
  }
}
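The request above can also be assembled programmatically. The following is a minimal sketch, not an official client: `build_routing_request` is a hypothetical helper name, and the sketch only builds and serializes the body, leaving the HTTP call and endpoint to your own setup.

```python
import json


def build_routing_request(messages, routes, extra=None):
    """Assemble a chat completion body that uses the dynamic router.

    Hypothetical helper for illustration; it is not part of any LangDB SDK.
    """
    if not routes:
        raise ValueError("a router needs at least one route")
    body = {
        "model": "router/dynamic",  # required for routing requests
        "messages": messages,
        "router": {"type": "conditional", "routes": routes},
    }
    if extra is not None:
        body["extra"] = extra  # caller-supplied context, e.g. the user's tier
    return body


routes = [
    {
        "name": "premium_support_fast_track",
        "conditions": {"all": [{"extra.user.tier": {"$eq": "premium"}}]},
        "targets": {
            "$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
            "sort_by": "ttft",
            "sort_order": "min",
        },
    },
    {
        "name": "default_fallback",
        "conditions": {"all": []},
        "targets": "openai/gpt-4o-mini",
    },
]

body = build_routing_request(
    messages=[{"role": "user", "content": "Our production API is down, I need help now!"}],
    routes=routes,
    extra={"user": {"tier": "premium"}},
)
payload = json.dumps(body, indent=2)  # JSON string to send as the request body
```

Keeping the route list as plain data makes it easy to store routing policies in version control and reuse them across requests.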

Routing Components Explained

This section breaks down each of the major components of the router object.

Routes

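The routes property contains an array of route objects. These are evaluated sequentially from top to bottom, and the first route whose conditions are met is executed; later routes are not considered. Every route has a name and conditions, and normally targets. The Message Mapper example later on this page shows a blocking route that carries a message_mapper instead of targets.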
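To make the first-match behavior concrete, here is a small illustrative sketch. It is not LangDB's implementation: the function names are hypothetical, and the toy matcher understands only `all` lists of `$eq` checks.

```python
def get_path(data, dotted_path):
    """Walk a nested dict using a dotted path such as 'extra.user.tier'."""
    for key in dotted_path.split("."):
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def matches_all_eq(conditions, request):
    """Toy matcher: supports only {"all": [{path: {"$eq": value}}, ...]}."""
    for clause in conditions.get("all", []):
        for path, check in clause.items():
            if get_path(request, path) != check["$eq"]:
                return False
    return True  # an empty "all" list matches every request


def first_matching_route(routes, request):
    """Return the first route whose conditions match, or None."""
    for route in routes:
        if matches_all_eq(route["conditions"], request):
            return route  # stop here: later routes are never examined
    return None


routes = [
    {
        "name": "premium_support_fast_track",
        "conditions": {"all": [{"extra.user.tier": {"$eq": "premium"}}]},
        "targets": "anthropic/claude-4-opus",
    },
    {
        "name": "default_fallback",
        "conditions": {"all": []},
        "targets": "openai/gpt-4o-mini",
    },
]

premium = first_matching_route(routes, {"extra": {"user": {"tier": "premium"}}})
free = first_matching_route(routes, {"extra": {"user": {"tier": "free"}}})
```

Because order decides the outcome, specific routes belong at the top and a catch-all route with empty conditions belongs last.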

Conditions

The conditions block defines when a route should be activated. It uses a flexible JSON-based syntax.

Logical Operators: You can combine multiple conditions using all (AND) or any (OR).
Comparison Operators: Conditions use operators like $eq (equal), $neq (not equal), $in (in array), $lt (less than), $gt (greater than) to evaluate variables.
Lazy Evaluation of Guardrails: Guardrails are evaluated lazily. A guardrail interceptor runs only if the router encounters a condition that needs its result (pre_request.{guardrail_name}.*), which avoids unnecessary latency for requests that never reach such a condition.
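The operators above can be read as a small expression language. The sketch below shows one way such conditions could be evaluated; it is an assumption-laden illustration rather than LangDB's evaluator. In particular, it assumes a missing variable makes a check fail and that several operators on one variable must all hold.

```python
import operator

# Operator names come from the text above; the dispatch table itself is illustrative.
OPERATORS = {
    "$eq": operator.eq,
    "$neq": operator.ne,
    "$in": lambda value, options: value in options,
    "$lt": operator.lt,
    "$gt": operator.gt,
}

_MISSING = object()  # sentinel so that a stored None is not confused with "absent"


def lookup(context, dotted_path):
    """Resolve a dotted variable name such as 'metadata.region'."""
    current = context
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def evaluate(conditions, context):
    """Evaluate an all/any tree whose leaves look like {variable: {operator: expected}}."""
    if "all" in conditions:
        return all(evaluate(c, context) for c in conditions["all"])
    if "any" in conditions:
        return any(evaluate(c, context) for c in conditions["any"])
    for path, checks in conditions.items():
        value = lookup(context, path)
        if value is _MISSING:  # assumption: unknown variables never match
            return False
        for op_name, expected in checks.items():
            if not OPERATORS[op_name](value, expected):
                return False
    return True


context = {"extra": {"user": {"tier": "premium"}}, "metadata": {"region": "EU"}}
eu_premium = evaluate(
    {"all": [
        {"extra.user.tier": {"$eq": "premium"}},
        {"metadata.region": {"$in": ["EU", "UK"]}},
    ]},
    context,
)
```

With this reading, an empty `all` list is always true, which is why it works as a catch-all fallback condition.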

Targets

The targets block defines what happens when a route is matched. It specifies one or more models to which the request can be sent.

Specifying Models

You can specify models in your targets list in several ways, giving you flexibility in how you define your candidate pool.

Exact Name with Provider: openai/gpt-4o
- This is the most specific and recommended method. It uniquely identifies a single model from a single provider.
Provider Wildcard: openai/*
- This selects all available models from a specific provider (e.g., all models from OpenAI). This is useful for creating provider-level routing rules or fallbacks.
Model Name Only: claude-3-opus
- This selects all models with that name from any available provider. For example, if both Anthropic and another provider offered claude-3-opus, both would be added to the candidate pool. This is particularly useful when you want to use sorting to find the best provider for a specific model based on real-time metrics like price or ttft. Use with caution, as it can be ambiguous if providers have different capabilities for the same model name.
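As an illustration of how the three notations expand into a candidate pool, consider the sketch below. The catalog is made up for the example (`example-cloud` is a hypothetical provider), and `expand_target` is not a LangDB function.

```python
def expand_target(pattern, catalog):
    """Expand one target pattern against a list of 'provider/model' identifiers."""
    if pattern.endswith("/*"):
        # Provider wildcard: every model offered by that provider.
        provider = pattern[:-2]
        return [m for m in catalog if m.split("/", 1)[0] == provider]
    if "/" in pattern:
        # Exact name with provider: at most one match.
        return [m for m in catalog if m == pattern]
    # Model name only: the same model name from any provider.
    return [m for m in catalog if m.split("/", 1)[1] == pattern]


# Illustrative catalog; "example-cloud" is a hypothetical second provider.
catalog = [
    "openai/gpt-4o",
    "openai/gpt-4o-mini",
    "anthropic/claude-3-opus",
    "example-cloud/claude-3-opus",
]

exact = expand_target("openai/gpt-4o", catalog)
by_provider = expand_target("openai/*", catalog)
by_name = expand_target("claude-3-opus", catalog)
```

Here `by_name` contains two entries, which is the situation where sorting by price or ttft becomes useful for choosing between providers.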

Filtering Models

Before selecting a model, you can filter the list of potential targets in $any using the filter property. This is useful for ensuring models meet certain real-time performance criteria.

"targets": {
    "$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
    "filter": {
      "error_rate": { "$lt": 0.02 }
    }
}

This example ensures that only models with an error rate below 2% are considered.
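Conceptually, the filter step compares each candidate's live metrics against the thresholds. A minimal sketch follows, with invented metric values and the assumption that a model with no data for a metric is dropped; the real router's handling of missing metrics is not documented here.

```python
import operator

COMPARATORS = {"$lt": operator.lt, "$gt": operator.gt, "$eq": operator.eq}


def filter_candidates(candidates, metrics, filter_spec):
    """Keep only the candidates whose metrics satisfy every check in filter_spec."""
    kept = []
    for model in candidates:
        stats = metrics.get(model, {})
        passes = all(
            # Assumption: a model without data for a metric does not pass.
            metric in stats and COMPARATORS[op](stats[metric], threshold)
            for metric, checks in filter_spec.items()
            for op, threshold in checks.items()
        )
        if passes:
            kept.append(model)
    return kept


# Invented numbers, purely for illustration.
metrics = {
    "anthropic/claude-4-opus": {"error_rate": 0.01},
    "openai/gpt-o3": {"error_rate": 0.05},
}

healthy = filter_candidates(
    ["anthropic/claude-4-opus", "openai/gpt-o3"],
    metrics,
    {"error_rate": {"$lt": 0.02}},
)
```

With these sample values only `anthropic/claude-4-opus` stays in the pool, because its error rate is below the 2% threshold.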

Sorting Models

After filtering, the router can sort the remaining candidate models to find the best one based on a specific metric.

sort_by: The metric to use for sorting. Common values are price, ttft (time to first token), and error_rate.
sort_order: The direction to sort: min picks the candidate with the lowest value of the metric (for example, lowest cost or latency), and max picks the highest.

"targets": {
    "$any": ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"],
    "sort_by": "price",
    "sort_order": "min"
}

This example selects the cheapest model from the pool.
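The selection step can be pictured as taking the minimum or maximum of the pool by the chosen metric. The sketch below uses invented prices and a hypothetical `pick_by_metric` helper; it is not the router's actual code.

```python
def pick_by_metric(candidates, metrics, sort_by, sort_order="min"):
    """Return the candidate with the lowest ('min') or highest ('max') metric value."""
    if sort_order not in ("min", "max"):
        raise ValueError("sort_order must be 'min' or 'max'")
    if not candidates:
        return None  # nothing left to choose from, e.g. after filtering
    chooser = min if sort_order == "min" else max
    return chooser(candidates, key=lambda model: metrics[model][sort_by])


# Invented prices, purely for illustration.
metrics = {
    "mistral/mistral-large-latest": {"price": 2.0},
    "anthropic/claude-4-sonnet": {"price": 3.0},
}
pool = ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"]

cheapest = pick_by_metric(pool, metrics, sort_by="price", sort_order="min")
priciest = pick_by_metric(pool, metrics, sort_by="price", sort_order="max")
```

The same helper works for ttft or error_rate by changing `sort_by`, as long as the metrics mapping carries that key for every candidate.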

Interceptors (Guardrails & Rate Limiters)

Interceptors are hooks that run before the main routing logic is evaluated. They are defined in the pre_request array.

Guardrails enforce content and safety policies (e.g., checking for toxicity or classifying topics).
Rate Limiters enforce usage quotas to prevent abuse.

The results of these interceptors are made available in the pre_request variable space for use in your conditions. For detailed configuration, see Interceptors & Guardrails.
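The Conditions section notes that guardrails run lazily, only when a condition needs their result. One way to picture that behavior is a cache that runs an interceptor on first access. The class and the stand-in interceptors below are hypothetical and only sketch the idea.

```python
class LazyPreRequest:
    """Runs each interceptor at most once, and only when its result is requested."""

    def __init__(self, interceptors):
        self._interceptors = interceptors  # name -> callable(request) -> result dict
        self._results = {}
        self.calls = []  # records which interceptors actually ran, in order

    def result(self, name, request):
        if name not in self._results:
            self.calls.append(name)
            self._results[name] = self._interceptors[name](request)
        return self._results[name]


def toxicity_guardrail(request):
    # Stand-in for an expensive check such as an LLM-as-a-judge call.
    return {"passed": "badword" not in request["content"]}


def rate_limiter(request):
    # Stand-in for a fast quota lookup.
    return {"passed": True}


request = {"content": "hello"}
pre_request = LazyPreRequest(
    {"toxicity_guardrail": toxicity_guardrail, "rate_limiter": rate_limiter}
)

# This condition only refers to the rate limiter, so the toxicity check never runs.
blocked = pre_request.result("rate_limiter", request)["passed"] is False
pre_request.result("rate_limiter", request)  # served from the cache, not re-run
```

After this code runs, `pre_request.calls` lists only `rate_limiter`, showing that the unused guardrail added no latency.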

Message Mapper

The Message Mapper is used to take direct control of the response. Its most common use case is to block a request that has failed an interceptor check and return a custom error message.

{
  "name": "rate_limit_exceeded_block",
  "conditions": {
    "pre_request.rate_limiter.passed": { "$eq": false }
  },
  "message_mapper": {
    "modifier": "block",
    "content": "You have exceeded your daily quota."
  }
}
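A rough way to think about a matched route is that it resolves either to a blocked response or to a set of targets. The sketch below is an illustration under that assumption; the returned dictionary shape and the `resolve_route` name are invented and do not describe LangDB's response format.

```python
def resolve_route(route):
    """Decide what a matched route does: block with a message, or forward to targets."""
    mapper = route.get("message_mapper")
    if mapper and mapper.get("modifier") == "block":
        # No model is called; the caller receives the configured content instead.
        return {"action": "block", "message": mapper["content"]}
    return {"action": "forward", "targets": route["targets"]}


block_route = {
    "name": "rate_limit_exceeded_block",
    "conditions": {"pre_request.rate_limiter.passed": {"$eq": False}},
    "message_mapper": {
        "modifier": "block",
        "content": "You have exceeded your daily quota.",
    },
}
fallback_route = {
    "name": "default_fallback",
    "conditions": {"all": []},
    "targets": "openai/gpt-4o-mini",
}

blocked = resolve_route(block_route)
forwarded = resolve_route(fallback_route)
```

Placing the blocking route ahead of the model routes means a rate-limited request is rejected before any model is selected.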

Metadata and Variables

Effective routing relies on rich contextual information. LangDB provides two main sources of data for your conditions: extra.user.* data, which you pass in the request, and metadata.* data, which is automatically populated by LangDB. For a complete list, see Variables & Functions.

Components Summary

Component	Purpose	Key Configuration
Routes	A list of rules evaluated in order.	`name`, `conditions`, `targets`
Conditions	The "if" statement for a route.	`all`, `any`, operators (`$eq`, `$lt`), variables
Targets	The destination model(s) for a route.	`$any`, `filter`, `sort_by`, `sort_order`
Interceptors	Pre-request hooks for validation or enrichment.	`pre_request` array, `type` (`guardrail`, `interceptor`)
Message Mapper	Blocks or modifies the final response.	`modifier: "block"`, `content`

Performance Impact

Different routing components have different performance characteristics.

Guardrails & Interceptors:
- Simple checks like a Rate Limiter are very fast, typically involving a quick lookup in Redis.
- More complex guardrails, especially "LLM-as-a-judge" guardrails that make an LLM call to evaluate the prompt, add latency roughly equal to the duration of that LLM call.
Model Sorting:
- Sorting models based on performance metrics (ttft, error_rate) or price requires fetching this data from a metrics store (like Redis). This is a very fast operation but does add a small, fixed overhead to each routed request.

Tracing & Observability

Routing decisions are fully transparent and traceable within the LangDB ecosystem.

Routing Performance: The performance of the routing logic itself is tracked, so you can monitor for any overhead.
Candidate & Picked Models: For every request, the trace records:
- The initial pool of candidate models for a matched route.
- The list of models remaining after filtering.
- The final picked model after sorting.
OpenTelemetry: All router metrics (decisions, latencies, error rates) are exported via OpenTelemetry, allowing you to integrate with your existing observability stack (e.g., DataDog, New Relic) for real-time analytics and alerting.
