Router Modular
Table of Contents
- Overview
- Architecture
- Quick Start
- Enumerations
- Configuration System
- Core Data Objects
- BaseHandler — Writing Your Own Handler
- HandlerRequest Methods
- HandlerResponse Methods
- Route
- RouteGroup
- Built-in Group Templates
- HierarchicalRouter
- Constructor
- register_group()
- update_group()
- register()
- unregister()
- unregister_group()
- set_fallback()
- route_query()
- route_async()
- stream()
- stream_async()
- chain()
- feedback()
- re_encode_route()
- re_encode_group()
- metrics()
- cache_stats()
- feedback_summary()
- list_groups()
- list_routes()
- get_route()
- get_group()
- summary()
- RoutingResult Properties
- RouteCandidate Properties
- RoutingTrace Methods
- CacheManager
- FeedbackEngine
- Integration Examples
- Error Reference
Overview
The router module is a production-grade, hierarchical LLM query routing engine. It intelligently directs incoming natural-language queries to the most appropriate handler using a hybrid scoring approach: sentence embeddings for semantic similarity, keyword signals for fast veto/boost, and an optional LLM classifier for ambiguous cases.
Key Capabilities
| Capability | Description |
|---|---|
| Hierarchical routing | Two-level: query → Group → Route. Groups represent intent domains (RAG, tools, chat); routes are specific capabilities within a domain |
| Hybrid scoring | Combines semantic embedding similarity, keyword signals, and optional LLM scoring with configurable weights |
| 4 execution modes | Single best, sequential top-k (stop on success), parallel top-k (concurrent), and tool chaining |
| Adaptive feedback | EMA-based score bias that automatically improves routing accuracy over time from observed outcomes |
| LRU+TTL cache | Skip the embedding model and scorer entirely for repeated queries |
| Streaming | Sync and async token streaming from matched handlers |
| Full observability | Structured JSON logging, per-request routing traces, rolling metrics with P95 latency |
| Context manager | with router: — auto-saves feedback state on exit |
Architecture
HierarchicalRouter ← primary public entry point
│
├── RoutingPipeline ← orchestrates the decision flow
│ ├── EmbeddingProvider ← sentence-transformer wrapper + cache
│ ├── HybridScorer ← weighted combination of:
│ │ ├── SemanticScorer (embedding cosine similarity)
│ │ ├── KeywordScorer (keyword veto/boost)
│ │ └── LLMScorer (optional, for low-confidence cases)
│ └── ConfidenceEvaluator ← maps score → HIGH / MEDIUM / LOW / NONE
│
├── ExecutionEngine ← runs matched handler(s)
│ ├── Single mode
│ ├── Sequential mode (first success wins)
│ ├── Parallel mode (concurrent + aggregation)
│ ├── Streaming
│ └── Tool chaining
│
├── CacheManager ← LRU+TTL route decision cache
├── FeedbackEngine ← EMA-based adaptive score biases
└── MetricsCollector ← rolling-window latency & hit-rate metrics
RouteGroup ← named domain namespace (e.g. "rag", "tools", "chat")
└── Route[] ← individual routing destination + handler
Request Flow:
query
→ cache lookup (skip pipeline on hit)
→ embed query
→ score groups (pick best domain)
→ score routes (pick top-k within domain)
→ evaluate confidence (HIGH / MEDIUM / LOW / NONE)
→ execute handler(s)
→ record feedback & metrics
→ RoutingResult
Quick Start
from fennec_community.router import (
    HierarchicalRouter, RouteGroup, Route,
    BaseHandler, HandlerRequest, HandlerResponse,
    make_rag_group, make_tools_group,
)

# 1. Define handlers
class DocsQAHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        answer = my_rag_pipeline(request.query)
        return HandlerResponse.ok(answer)

class WeatherHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        city = extract_city(request.query)
        data = fetch_weather(city)
        return HandlerResponse.ok(data)

# 2. Build route groups
rag_group = make_rag_group(priority=10)
rag_group.add_route(Route(
    name = "docs_qa",
    description = "Answer questions from the knowledge base",
    handler = DocsQAHandler(),
    examples = ["What is the refund policy?", "How do I reset my password?"],
))

tools_group = make_tools_group(priority=20)
tools_group.add_route(Route(
    name = "weather",
    description = "Get current weather for a city",
    handler = WeatherHandler(),
    examples = ["What's the weather in Cairo?", "Is it raining in London?"],
))

# 3. Create router and register groups
router = HierarchicalRouter()
router.register_group(rag_group)
router.register_group(tools_group)

# 4. Route a query
result = router.route_query("What are the return conditions?")
if result.success:
    print(result.content)
    print(f"Routed to: {result.route_name} ({result.confidence})")

Enumerations
All enumerations inherit from str, Enum — their values can be passed as plain strings wherever an enum is expected.
SimilarityMetric
Controls the vector similarity function used by the semantic scorer.
| Value | String | Description |
|---|---|---|
| SimilarityMetric.COSINE | "cosine" | Cosine similarity (default; works with non-normalized vectors) |
| SimilarityMetric.DOT_PRODUCT | "dot_product" | Dot product (assumes L2-normalized vectors) |
| SimilarityMetric.EUCLIDEAN | "euclidean" | Euclidean distance converted to similarity via 1/(1+dist) |
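The three metrics above can be illustrated with a small pure-Python sketch. The function names here are hypothetical and not part of the router's API; the only detail taken from the table is the 1/(1+dist) conversion for Euclidean distance.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product; equals cosine similarity when both vectors are L2-normalized."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; normalizes internally, so raw vectors are fine."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance mapped into (0, 1] via 1 / (1 + dist)."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def similarity(metric: str, a: Sequence[float], b: Sequence[float]) -> float:
    """Dispatch on the string values listed in the table."""
    functions = {
        "cosine": cosine,
        "dot_product": dot_product,
        "euclidean": euclidean_similarity,
    }
    return functions[metric](a, b)
```

Because identical vectors have distance 0, the Euclidean variant returns 1.0 for a perfect match and decays toward 0 as vectors move apart, which keeps all three metrics on a "higher is better" scale.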
ExecutionMode
Controls how matched route candidates are executed.
| Value | String | Behaviour |
|---|---|---|
| ExecutionMode.SINGLE | "single" | Execute only the top-scoring route |
| ExecutionMode.SEQUENTIAL | "sequential" | Try candidates in ranked order; stop on first success |
| ExecutionMode.PARALLEL | "parallel" | Execute all top-k candidates concurrently; aggregate results |
AggregationStrategy
Controls how results are combined when ExecutionMode.PARALLEL is used.
| Value | String | Behaviour |
|---|---|---|
| AggregationStrategy.FIRST_WINS | "first_wins" | Return the first successful response |
| AggregationStrategy.VOTING | "voting" | Majority vote (for structured responses) |
| AggregationStrategy.MERGE | "merge" | Merge all successful responses into a list |
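A rough sketch of how the three strategies could combine a list of successful payloads follows. The `aggregate` helper is hypothetical, and comparing payloads by `repr()` for voting is an assumption made here so that unhashable values such as dicts can be counted; the library may compare responses differently.

```python
from collections import Counter
from typing import Any, List, Optional


def aggregate(strategy: str, successes: List[Any]) -> Optional[Any]:
    """Combine payloads from successful handlers (illustrative only)."""
    if not successes:
        return None
    if strategy == "first_wins":
        return successes[0]
    if strategy == "voting":
        # Count identical payloads by their repr; ties go to the earliest payload.
        counts = Counter(repr(payload) for payload in successes)
        winner_key, _ = counts.most_common(1)[0]
        return next(p for p in successes if repr(p) == winner_key)
    if strategy == "merge":
        return list(successes)
    raise ValueError(f"unknown aggregation strategy: {strategy!r}")
```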
ConfidenceLevel
The confidence tier assigned to a routing decision based on the combined score vs configured thresholds.
| Value | String | Score Range | Router Action |
|---|---|---|---|
| ConfidenceLevel.HIGH | "high" | ≥ high_confidence_threshold (default 0.85) | Route directly |
| ConfidenceLevel.MEDIUM | "medium" | ≥ medium_confidence_threshold (default 0.60) | Route with warning |
| ConfidenceLevel.LOW | "low" | ≥ fallback_confidence_threshold (default 0.40) | Route, trigger LLM fallback if configured |
| ConfidenceLevel.NONE | "none" | Below all thresholds | Invoke global fallback handler or raise |
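The tiering above amounts to checking the thresholds from highest to lowest. A minimal sketch, assuming the default thresholds from the table and a hypothetical `evaluate_confidence` function that returns the enum's string value:

```python
def evaluate_confidence(
    score: float,
    high: float = 0.85,
    medium: float = 0.60,
    fallback: float = 0.40,
) -> str:
    """Map a combined score onto the string value of a confidence tier."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    if score >= fallback:
        return "low"
    return "none"
```

Checking from the top down means each tier implicitly covers the range between its own threshold and the next one up.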
Configuration System
RouterConfig
Module: router.config
Import: from fennec_community.router import RouterConfig
The master configuration object for the entire routing system. All sub-configs are nested inside it.
@dataclass
class RouterConfig:
    embedding: EmbeddingConfig = EmbeddingConfig()
    scoring: ScoringConfig = ScoringConfig()
    execution: ExecutionConfig = ExecutionConfig()
    cache: CacheConfig = CacheConfig()
    feedback: FeedbackConfig = FeedbackConfig()
    observability: ObservabilityConfig = ObservabilityConfig()
    llm_model: Optional[str] = None
    llm_api_key: Optional[str] = None
    raise_on_no_match: bool = False

| Field | Type | Default | Description |
|---|---|---|---|
| embedding | EmbeddingConfig | defaults | Embedding model settings |
| scoring | ScoringConfig | defaults | Score weights and confidence thresholds |
| execution | ExecutionConfig | defaults | Execution mode and retry policy |
| cache | CacheConfig | defaults | Cache TTL, size, and enable/disable |
| feedback | FeedbackConfig | defaults | Adaptive feedback loop settings |
| observability | ObservabilityConfig | defaults | Logging, tracing, and metrics |
| llm_model | Optional[str] | None | OpenAI model name for LLM scoring (e.g., "gpt-4o-mini") |
| llm_api_key | Optional[str] | None | API key for the LLM scorer |
| raise_on_no_match | bool | False | If True, raise LookupError when no route matches; otherwise return a failed RoutingResult |
EmbeddingConfig
Controls the sentence-transformer model used for semantic scoring.
| Field | Type | Default | Description |
|---|---|---|---|
| model_name | str | "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" | HuggingFace model name |
| batch_size | int | 32 | Batch size for encoding multiple texts at once |
| cache_embeddings | bool | True | Cache computed embeddings in memory |
| normalize_embeddings | bool | True | L2-normalize embeddings before similarity computation |
ScoringConfig
Controls hybrid score weights and confidence thresholds.
| Field | Type | Default | Description |
|---|---|---|---|
| semantic_weight | float | 0.70 | Weight for embedding similarity score |
| keyword_weight | float | 0.20 | Weight for keyword signal score |
| llm_weight | float | 0.10 | Weight for LLM classifier score (only when LLM is configured) |
| similarity_metric | SimilarityMetric | COSINE | Vector similarity function |
| high_confidence_threshold | float | 0.85 | Score ≥ this → HIGH confidence |
| medium_confidence_threshold | float | 0.60 | Score ≥ this → MEDIUM confidence |
| fallback_confidence_threshold | float | 0.40 | Score ≥ this → LOW confidence (still routes) |
| top_k | int | 5 | Maximum number of candidate routes to surface |
Score Formula:
combined = (w_sem × semantic) + (w_kw × keyword) + (w_llm × llm) + route.score_bias
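As a worked instance of the formula, the sketch below plugs in the default weights from the table. The `combined_score` function is a hypothetical helper written for illustration; treating a missing LLM score as 0.0 is an assumption.

```python
def combined_score(
    semantic: float,
    keyword: float,
    llm: float = 0.0,          # assumed to be 0.0 when no LLM scorer is configured
    score_bias: float = 0.0,
    w_sem: float = 0.70,
    w_kw: float = 0.20,
    w_llm: float = 0.10,
) -> float:
    """Weighted sum of the three component scores plus the route's bias."""
    return (w_sem * semantic) + (w_kw * keyword) + (w_llm * llm) + score_bias


# 0.70 * 0.90 + 0.20 * 0.50 + 0.10 * 0.0 + 0.05 = 0.63 + 0.10 + 0.00 + 0.05
example = combined_score(semantic=0.90, keyword=0.50, score_bias=0.05)
```

With these inputs the result is approximately 0.78, which falls in the MEDIUM tier under the default thresholds.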
ExecutionConfig
Controls how matched routes are executed and retried.
| Field | Type | Default | Description |
|---|---|---|---|
| mode | ExecutionMode | SINGLE | Execution mode |
| aggregation | AggregationStrategy | FIRST_WINS | Aggregation strategy for parallel mode |
| parallel_timeout | float | 10.0 | Seconds before parallel execution is cancelled |
| max_retries | int | 2 | Number of retry attempts per handler on failure |
| retry_delay | float | 0.5 | Seconds to wait between retry attempts |
| retry_on_exceptions | List[str] | ["RuntimeError", "TimeoutError"] | Only retry when the error message contains one of these strings |
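The retry fields can be read as a simple loop: call the handler, and on failure retry up to max_retries times with retry_delay between attempts, but only for matching errors. The sketch below is one plausible interpretation, not the execution engine's actual code. `call_with_retries` is a hypothetical name, and matching against the text "ExceptionType: message" is an assumption about how the string comparison is done.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 2,
    retry_delay: float = 0.5,
    retry_on: Sequence[str] = ("RuntimeError", "TimeoutError"),
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Call fn, retrying only when the error text contains a configured marker."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception as exc:
            # Assumption: the match runs against "TypeName: message".
            text = f"{type(exc).__name__}: {exc}"
            retryable = any(marker in text for marker in retry_on)
            if attempt >= max_retries or not retryable:
                raise
            attempt += 1
            sleep(retry_delay)
```

With the defaults, a handler is invoked at most three times in total: the initial call plus two retries.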
CacheConfig
Controls the two-tier LRU+TTL routing cache.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | True | Master switch for caching |
| ttl_seconds | int | 300 | Cache entry time-to-live (5 minutes) |
| max_size | int | 1000 | Maximum number of cached routing decisions |
| embedding_cache | bool | True | Also cache query embeddings |
| embedding_ttl | int | 3600 | Embedding cache TTL (1 hour; embeddings are more stable) |
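To make the LRU+TTL combination concrete, here is a minimal sketch built on `collections.OrderedDict`. The `LruTtlCache` class is hypothetical and is not the library's CacheManager; it only shows how max_size eviction and ttl_seconds expiry can interact. The injectable `clock` argument is an addition for testability.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class LruTtlCache:
    """Tiny LRU cache whose entries also expire after a fixed TTL."""

    def __init__(
        self,
        max_size: int = 1000,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._data: OrderedDict = OrderedDict()
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]          # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict the least recently used entry
```

A routing cache built this way would typically key on a normalized form of the query string, though how the real cache derives its keys is not stated here.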
FeedbackConfig
Controls the adaptive feedback loop.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | True | Enable automatic bias adjustment from routing outcomes |
| score_decay | float | 0.05 | EMA alpha: how quickly biases adapt to new observations |
| min_samples_to_adapt | int | 10 | Ignore routes with fewer than this many observations |
| persist_path | Optional[str] | None | JSON file path to save/load learned biases across restarts |
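The sketch below shows one way an EMA-driven bias could behave under these settings. It is an illustration under stated assumptions, not the FeedbackEngine's algorithm: `BiasState` and `record_outcome` are hypothetical names, the EMA is assumed to track a success rate starting at 0.5, and the bias is assumed to be the EMA's distance from 0.5. The ±0.3 limit follows the clamp range this document gives for Route.score_bias.

```python
from dataclasses import dataclass


@dataclass
class BiasState:
    ema: float = 0.5   # assumed neutral starting estimate of the success rate
    samples: int = 0
    bias: float = 0.0


def record_outcome(
    state: BiasState,
    success: bool,
    alpha: float = 0.05,     # score_decay
    min_samples: int = 10,   # min_samples_to_adapt
    max_bias: float = 0.3,
) -> float:
    """Fold one routing outcome into the EMA and return the updated bias."""
    outcome = 1.0 if success else 0.0
    state.ema = (1.0 - alpha) * state.ema + alpha * outcome
    state.samples += 1
    if state.samples >= min_samples:
        # Assumption: bias is the EMA's offset from neutral, clamped to +/- max_bias.
        state.bias = max(-max_bias, min(max_bias, state.ema - 0.5))
    return state.bias
```

A small alpha such as 0.05 makes the bias move slowly, so a single failure does little to a route with a long record of successes.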
ObservabilityConfig
Controls logging, tracing, and metrics.
| Field | Type | Default | Description |
|---|---|---|---|
| log_level | LogLevel | INFO | Python logging level (DEBUG, INFO, WARNING, ERROR) |
| structured_logging | bool | True | Emit JSON log lines (set False for human-readable format) |
| enable_tracing | bool | True | Record full routing decision trace in every RoutingResult |
| metrics_window | int | 1000 | Rolling window size for latency and hit-rate metrics |
| slow_route_ms | float | 500.0 | Latency threshold; warnings are emitted when exceeded |
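A rolling window with a P95 readout can be sketched with a bounded `deque`. The `RollingLatency` class is hypothetical, and the nearest-rank percentile method is a choice made for this example; the library's MetricsCollector may compute percentiles differently.

```python
from collections import deque


class RollingLatency:
    """Keeps the most recent latencies and reports a nearest-rank P95."""

    def __init__(self, window: int = 1000, slow_ms: float = 500.0) -> None:
        self._samples = deque(maxlen=window)  # old samples fall off automatically
        self._slow_ms = slow_ms

    def record(self, latency_ms: float) -> bool:
        """Store a sample; return True when it exceeds the slow threshold."""
        self._samples.append(latency_ms)
        return latency_ms > self._slow_ms

    def p95(self) -> float:
        if not self._samples:
            return 0.0
        ordered = sorted(self._samples)
        # Nearest-rank: ceil(0.95 * n), computed with integers to avoid float error.
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]
```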
RouterConfig.fast()
RouterConfig.fast() -> RouterConfig
Purpose: Class method that returns a RouterConfig preset optimized for low latency: single-route execution, top_k=1, cache enabled.
Returns: RouterConfig
Example:
router = HierarchicalRouter(config=RouterConfig.fast())
RouterConfig.accurate()
RouterConfig.accurate() -> RouterConfig
Purpose: Class method that returns a RouterConfig preset optimized for accuracy: parallel top-3 execution, increased LLM scorer weight (0.20), reduced semantic weight (0.60).
Returns: RouterConfig
Example:
router = HierarchicalRouter(config=RouterConfig.accurate())
Core Data Objects
HandlerRequest
Module: router.core.base
Import: from fennec_community.router import HandlerRequest
The unified input object passed to every handler. Created automatically by the router on each route_query() call.
| Field | Type | Default | Description |
|---|---|---|---|
| query | str | required | The raw natural-language query from the user |
| context | Dict[str, Any] | {} | Arbitrary key-value context (conversation history, user profile, session data) |
| request_id | str | auto UUID | UUID for correlation across logs and traces |
| timestamp | float | time.time() | Unix epoch of when the request was created |
| metadata | Dict[str, Any] | {} | Router-internal metadata (route name, confidence score). Handlers may read but should not depend on this in business logic |
HandlerResponse
Module: router.core.base
Import: from fennec_community.router import HandlerResponse
The unified output object returned by every handler.
| Field | Type | Default | Description |
|---|---|---|---|
| content | Any | - | The response payload (string, dict, list, model output, etc.) |
| success | bool | True | False if the handler experienced a handled error |
| error | Optional[str] | None | Human-readable error message (only set when success=False) |
| metadata | Dict[str, Any] | {} | Handler-supplied annotations (sources, token count, etc.) |
| processing_time | float | 0.0 | Seconds the handler took (set by the execution engine) |
RouteKeywords
Module: router.core.route
Import: from fennec_community.router import RouteKeywords
Keyword signals that improve routing accuracy without an embedding call. Applied before semantic scoring.
| Field | Type | Default | Description |
|---|---|---|---|
| required | List[str] | [] | AND logic: all of these must appear in the query; if any is missing, the score is 0.0 (veto) |
| any_of | List[str] | [] | OR logic: each match adds boost to the combined score |
| excluded | List[str] | [] | If any of these appear in the query, the score is 0.0 (veto) |
| boost | float | 0.05 | Score bonus added per matched any_of keyword |
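The veto and boost rules in the table can be sketched as a single function. `score_keywords` and `KeywordResult` are hypothetical names, and case-insensitive substring matching is an assumption; the real scorer may tokenize or match whole words instead.

```python
from typing import NamedTuple, Sequence


class KeywordResult(NamedTuple):
    vetoed: bool   # True means the route's score is forced to 0.0
    bonus: float   # total boost earned from any_of matches


def score_keywords(
    query: str,
    required: Sequence[str] = (),
    any_of: Sequence[str] = (),
    excluded: Sequence[str] = (),
    boost: float = 0.05,
) -> KeywordResult:
    """Apply the required / excluded vetoes, then add boost per any_of match."""
    text = query.lower()
    if any(term.lower() not in text for term in required):
        return KeywordResult(True, 0.0)   # a required keyword is missing
    if any(term.lower() in text for term in excluded):
        return KeywordResult(True, 0.0)   # an excluded keyword is present
    matches = sum(1 for term in any_of if term.lower() in text)
    return KeywordResult(False, matches * boost)
```

Returning the veto flag separately from the bonus keeps "vetoed" distinguishable from "no keywords matched", which a bare 0.0 would not.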
Example:
from fennec_community.router import RouteKeywords

weather_keywords = RouteKeywords(
    any_of = ["weather", "temperature", "rain", "forecast", "climate"],
    excluded = ["historical", "average", "record"],
    boost = 0.10,
)

RouteCandidate
Module: router.core.result
Import: from fennec_community.router import RouteCandidate
A scored route candidate surfaced by the routing pipeline. Accessible via RoutingResult.trace.candidates.
| Field | Type | Description |
|---|---|---|
| route_name | str | Name of the candidate route |
| group_name | Optional[str] | Name of the group the route belongs to |
| combined_score | float | Raw weighted combined score (before bias) |
| semantic_score | float | Embedding similarity component |
| keyword_score | float | Keyword signal component |
| llm_score | float | LLM classifier component |
| confidence | ConfidenceLevel | Confidence tier assigned to this candidate |
| score_bias | float | Feedback-learned bias offset for this route |
RoutingTrace
Module: router.core.result
Import: from fennec_community.router import RoutingTrace
The full audit trail of a single routing decision. Always present in RoutingResult.trace.
| Field | Type | Description |
|---|---|---|
| request_id | str | Correlation ID for this request |
| query | str | The original query |
| candidates | List[RouteCandidate] | All scored candidates (sorted best-first, up to top_k) |
| selected_group | Optional[str] | The group that was selected |
| selected_route | Optional[str] | The route that was selected |
| confidence | Optional[ConfidenceLevel] | Confidence of the routing decision |
| used_llm_fallback | bool | Whether the LLM scorer was invoked |
| used_cache | bool | Whether this result came from the cache |
| embedding_time_ms | float | Time to embed the query |
| scoring_time_ms | float | Time to score all candidates |
| execution_time_ms | float | Time to execute the handler |
| total_time_ms | float | Total end-to-end latency |
| notes | List[str] | Human-readable notes about the routing decision |
RoutingResult
Module: router.core.result
Import: from fennec_community.router import RoutingResult
The final object returned by HierarchicalRouter.route_query(). Contains both the handler response and the full routing trace.
| Field | Type | Description |
|---|---|---|
| response | Optional[HandlerResponse] | The handler's response (or fallback response) |
| trace | RoutingTrace | Full audit trail of the routing decision |
| matched | bool | True if a route was matched (success may still be False if the handler failed) |
Boolean evaluation: bool(result) is equivalent to result.success.
MultiRoutingResult
Module: router.core.result
Import: from fennec_community.router import MultiRoutingResult
Wraps results from multi-route parallel or sequential execution.
| Field | Type | Description |
|---|---|---|
| results | List[RoutingResult] | Individual result per executed candidate |
| aggregated | Optional[HandlerResponse] | The aggregated response from all successful results |
| trace | Optional[RoutingTrace] | The shared routing trace |
BaseHandler — Writing Your Own Handler
Module: router.core.base
Import: from fennec_community.router import BaseHandler, HandlerRequest, HandlerResponse
The abstract base class for all route handlers. You must implement handle(). All other methods have default implementations that can be overridden.
from fennec_community.router import BaseHandler, HandlerRequest, HandlerResponse

class MyHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        # Your logic here
        result = do_something(request.query)
        return HandlerResponse.ok(result)

handle()
handler.handle(request: HandlerRequest) -> HandlerResponse
Purpose: Required. Synchronously process a request and return a response. This is the core method every handler must implement.
| Parameter | Type | Description |
|---|---|---|
| request | HandlerRequest | The routing request containing the query, context, and metadata |
Returns: HandlerResponse
Example:
class RAGHandler(BaseHandler):
    def __init__(self, retriever, llm):
        self.retriever = retriever
        self.llm = llm

    def handle(self, request: HandlerRequest) -> HandlerResponse:
        docs = self.retriever.search(request.query, top_k=5)
        answer = self.llm.generate(query=request.query, context=docs)
        return HandlerResponse.ok(
            content = answer,
            sources = [d.source for d in docs],
            doc_count = len(docs),
        )

handle_async()
async handler.handle_async(request: HandlerRequest) -> HandlerResponse
Purpose: Asynchronously process a request. The default implementation calls self.handle() synchronously. Override this for true async I/O (e.g., async LLM API calls, async DB queries).
| Parameter | Type | Description |
|---|---|---|
| request | HandlerRequest | The routing request |
Returns: HandlerResponse
Example:
class AsyncLLMHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        # Sync fallback
        return HandlerResponse.ok(sync_call(request.query))

    async def handle_async(self, request: HandlerRequest) -> HandlerResponse:
        result = await async_llm_call(request.query)
        return HandlerResponse.ok(result)

validate()
handler.validate(request: HandlerRequest) -> bool
Purpose: Pre-execution validation hook. The router calls this before handle(). If it returns False, the handler is skipped and the router falls through to the next candidate. Use this to enforce preconditions (e.g., required context keys, query length limits).
| Parameter | Type | Description |
|---|---|---|
| request | HandlerRequest | The routing request |
Returns: bool — True to proceed, False to skip this handler.
Default implementation: Always returns True.
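The skip-and-fall-through behaviour can be pictured with plain callables. This is only a sketch of the idea: `run_first_valid` and the `Candidate` tuple are hypothetical stand-ins, and the real router works with BaseHandler instances and HandlerRequest objects rather than dicts.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for a ranked candidate: (name, validate, handle).
Candidate = Tuple[str, Callable[[Dict], bool], Callable[[Dict], str]]


def run_first_valid(
    candidates: List[Candidate], context: Dict
) -> Optional[Tuple[str, str]]:
    """Walk candidates in ranked order, skipping any whose validate() fails."""
    for name, validate, handle in candidates:
        if not validate(context):
            continue  # precondition not met: fall through to the next candidate
        return name, handle(context)
    return None  # nothing validated; a real router would use its fallback here


ranked: List[Candidate] = [
    ("secure", lambda ctx: "user_id" in ctx, lambda ctx: f"data for {ctx['user_id']}"),
    ("public", lambda ctx: True, lambda ctx: "public data"),
]
```

Calling `run_first_valid(ranked, {})` skips the first candidate because the context has no user_id, while a context containing one is served by the higher-ranked handler.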
Example:
class SecureHandler(BaseHandler):
    def validate(self, request: HandlerRequest) -> bool:
        return "user_id" in request.context  # Require authentication

    def handle(self, request: HandlerRequest) -> HandlerResponse:
        user_id = request.context["user_id"]
        return HandlerResponse.ok(fetch_user_data(user_id))

stream()
handler.stream(request: HandlerRequest) -> Iterator[str]
Purpose: Yield response chunks for synchronous streaming (e.g., SSE or chunked HTTP). The default implementation calls handle() and yields the entire content as a single chunk. Override for token-by-token streaming from LLMs.
| Parameter | Type | Description |
|---|---|---|
| request | HandlerRequest | The routing request |
Returns: Iterator[str] — yields string chunks
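The relationship between the default and an overridden stream() can be sketched without the library. `SketchHandler` below is a hypothetical minimal base class that takes a plain string instead of a HandlerRequest; it only mirrors the described default of yielding the full handle() result as one chunk.

```python
from typing import Iterator


class SketchHandler:
    """Hypothetical minimal base class; not the library's BaseHandler."""

    def handle(self, query: str) -> str:
        raise NotImplementedError

    def stream(self, query: str) -> Iterator[str]:
        # Default behaviour: a single chunk holding the whole response.
        yield self.handle(query)


class UpperHandler(SketchHandler):
    """Relies on the default stream(): one chunk per request."""

    def handle(self, query: str) -> str:
        return query.upper()


class WordStreamHandler(SketchHandler):
    """Overrides stream() and builds handle() on top of it."""

    def handle(self, query: str) -> str:
        return "".join(self.stream(query))

    def stream(self, query: str) -> Iterator[str]:
        for word in query.split():
            yield word + " "
```

Defining handle() in terms of stream(), as the second class does, keeps the two code paths from drifting apart.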
Example:
from typing import Iterator

class StreamingLLMHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        return HandlerResponse.ok("".join(self.stream(request)))

    def stream(self, request: HandlerRequest) -> Iterator[str]:
        for chunk in openai_stream(request.query):
            yield chunk.delta.content or ""

stream_async()
async handler.stream_async(request: HandlerRequest) -> AsyncIterator[str]
Purpose: Yield response chunks for asynchronous streaming. The default implementation calls handle_async() and yields the full content as one chunk.
| Parameter | Type | Description |
|---|---|---|
| request | HandlerRequest | The routing request |
Returns: AsyncIterator[str] — async yields string chunks
HandlerRequest Methods
with_metadata()
request.with_metadata(**kwargs) -> HandlerRequest
Purpose: Return a new HandlerRequest with the given keyword arguments merged into the metadata dict. Follows an immutable-style update pattern: the original request is not modified.
| Parameter | Type | Description |
|---|---|---|
| **kwargs | Any | Key-value pairs to merge into metadata |
Returns: HandlerRequest — new instance with updated metadata.
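The immutable-style update can be sketched with `dataclasses.replace`. The `Request` class below is a hypothetical stand-in with only two fields; it is not the library's HandlerRequest, and whether the real class is a frozen dataclass is not stated here.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for HandlerRequest with just two fields."""

    query: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def with_metadata(self, **kwargs: Any) -> "Request":
        # Build a fresh dict so the original request's metadata is never mutated.
        return replace(self, metadata={**self.metadata, **kwargs})


original = Request("What's the weather?", {"source": "web"})
enriched = original.with_metadata(route_name="weather", confidence=0.91)
```

Later keys win in the merged dict, so passing a key that already exists in metadata overrides it in the copy only.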
Example:
enriched = request.with_metadata(route_name="weather", confidence=0.91)
# enriched.metadata == {"route_name": "weather", "confidence": 0.91}
# request.metadata is unchanged

HandlerResponse Methods
HandlerResponse.ok()
HandlerResponse.ok(content: Any, **meta) -> HandlerResponse
Purpose: Class method convenience constructor for successful responses. Any keyword arguments are stored in metadata.
| Parameter | Type | Description |
|---|---|---|
| content | Any | The response payload |
| **meta | Any | Arbitrary metadata key-value pairs |
Returns: HandlerResponse with success=True
Example:
return HandlerResponse.ok(
    answer,
    sources = ["doc_1", "doc_2"],
    confidence = 0.93,
    tokens = 142,
)

HandlerResponse.fail()
HandlerResponse.fail(error: str, **meta) -> HandlerResponse
Purpose: Class method convenience constructor for failed responses.
| Parameter | Type | Description |
|---|---|---|
| error | str | Human-readable error description |
| **meta | Any | Arbitrary metadata |
Returns: HandlerResponse with success=False, content=None
Example:
return HandlerResponse.fail(
    "City not found in weather database",
    query = request.query,
    code = 404,
)

Route
Module: router.core.route
Import: from fennec_community.router import Route, RouteKeywords
A named routing destination with semantic examples, keyword signals, a handler, and runtime metrics.
Route Constructor
Route(
    name: str,
    description: str,
    handler: Union[BaseHandler, Callable],
    examples: Optional[List[str]] = None,
    keywords: Optional[RouteKeywords] = None,
    group: Optional[str] = None,
    tags: Optional[Set[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    enabled: bool = True,
    priority: int = 0,
    tools: Optional[List[str]] = None,
    score_bias: float = 0.0,
)

Purpose: Define a routing destination. The handler receives matched queries and produces responses.
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Unique identifier within its group. Cannot be empty |
| description | str | required | Human-readable description; also used for semantic scoring |
| handler | Union[BaseHandler, Callable] | required | Handler instance or plain callable fn(query: str) -> Any |
| examples | Optional[List[str]] | None | Sample queries that should route here. Duplicates are removed automatically |
| keywords | Optional[RouteKeywords] | None | Keyword veto/boost configuration |
| group | Optional[str] | None | Parent group name (set automatically by RouteGroup.add_route()) |
| tags | Optional[Set[str]] | None | Free-form labels for filtering (get_by_tag()) |
| metadata | Optional[Dict[str, Any]] | None | Arbitrary key-value store |
| enabled | bool | True | If False, the route is invisible to the scoring pipeline |
| priority | int | 0 | Tie-breaker (higher wins after scoring) |
| tools | Optional[List[str]] | None | Names of tools this route may chain to (for agent use cases) |
| score_bias | float | 0.0 | Persistent score offset managed by the feedback engine (clamped to [-0.3, +0.3]) |
Raises: ValueError if name or description is empty.
Example:
from fennec_community.router import Route, RouteKeywords

route = Route(
    name = "invoice_lookup",
    description = "Look up invoice details by invoice ID or customer",
    handler = InvoiceHandler(),
    examples = [
        "Show me invoice #INV-2024-001",
        "Find all invoices for Acme Corp",
        "What's the status of my last invoice?",
    ],
    keywords = RouteKeywords(
        any_of = ["invoice", "billing", "receipt", "payment"],
        excluded = ["create", "new invoice"],
        boost = 0.08,
    ),
    tags = {"finance", "read-only"},
    priority = 5,
)

add_example()
route.add_example(example: str) -> bool
Purpose: Add a new example utterance to the route. Invalidates the cached embeddings so the route will be re-encoded on the next routing call. Duplicates and empty strings are silently ignored.
| Parameter | Type | Description |
|---|---|---|
| example | str | The example query to add |
Returns: bool — True if the example was added, False if it was a duplicate or empty.
Example:
added = route.add_example("Can you pull up invoice number 5082?")
if added:
    router.re_encode_route(route.name)

remove_example()
route.remove_example(example: str) -> bool
Purpose: Remove an example utterance from the route. Invalidates cached embeddings.
| Parameter | Type | Description |
|---|---|---|
| example | str | The exact example string to remove |
Returns: bool — True if removed, False if not found.
enable() / disable() (Route)
route.enable() -> None
route.disable() -> None
Purpose: Toggle the route's participation in routing. A disabled route is completely invisible to the scoring pipeline and will never be selected. Useful for A/B testing, maintenance windows, or feature flagging without unregistering.
Returns: None
Example:
# Disable during maintenance
route.disable()
# Re-enable
route.enable()

Route.to_dict()
route.to_dict() -> Dict[str, Any]
Purpose: Serialize the route's configuration (excluding the handler) to a plain dictionary. Useful for persisting route definitions to JSON.
Returns: Dict[str, Any] containing name, description, examples, keywords, group, tags, metadata, enabled, priority, tools, score_bias.
Example:
import json

with open("routes.json", "w") as f:
    json.dump([r.to_dict() for r in my_routes], f)

Route.from_dict()
Route.from_dict(data: Dict[str, Any], handler: Union[BaseHandler, Callable]) -> Route
Purpose: Class method that reconstructs a Route from a dict (produced by to_dict()). The handler must be supplied separately since it cannot be serialized.
| Parameter | Type | Description |
|---|---|---|
| data | Dict[str, Any] | Dictionary from to_dict() |
| handler | Union[BaseHandler, Callable] | The handler to attach to the reconstructed route |
Returns: Route
Example:
import json

with open("routes.json") as f:
    data = json.load(f)
route = Route.from_dict(data[0], handler=MyHandler())

RouteGroup
Module: router.core.route_group
Import: from fennec_community.router import RouteGroup
A named collection of routes sharing a common intent domain. The router first selects the best group, then finds the best route within it.
RouteGroup Constructor
RouteGroup(
    name: str,
    description: str,
    intent_examples: Optional[List[str]] = None,
    intent_keywords: Optional[RouteKeywords] = None,
    priority: int = 0,
    enabled: bool = True,
    metadata: Optional[Dict[str, Any]] = None,
)

Purpose: Define a routing domain namespace.
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Unique group identifier (e.g., "rag", "tools", "chat") |
| description | str | required | Intent description used for top-level group selection |
| intent_examples | Optional[List[str]] | None | Sample queries that signal this group's intent. Used for group-level embedding scoring |
| intent_keywords | Optional[RouteKeywords] | None | Keyword signals for group-level selection |
| priority | int | 0 | Group-level tie-breaker (higher wins). Recommended: tools=20, rag=10, chat=0 |
| enabled | bool | True | If False, the entire group is excluded from routing |
| metadata | Optional[Dict[str, Any]] | None | Arbitrary labels or configuration |
Raises: ValueError if name is empty.
add_route()
group.add_route(route: Route) -> RouteGroup
Purpose: Register a route inside this group. Automatically sets route.group to this group's name. Returns self for fluent chaining.
| Parameter | Type | Description |
|---|---|---|
| route | Route | The route to register |
Returns: RouteGroup (self) — for chaining.
Raises: ValueError if a route with the same name already exists. Use update_route() to replace.
Example:
group = RouteGroup("rag", "Knowledge base Q&A")
group \
    .add_route(Route(name="docs_qa", ...)) \
    .add_route(Route(name="faq_lookup", ...))

update_route()
group.update_route(route: Route) -> RouteGroup
Purpose: Add a route, or replace an existing route with the same name. Unlike add_route(), this does not raise if the name already exists. Returns self for chaining.
| Parameter | Type | Description |
|---|---|---|
| route | Route | The route to add or replace |
Returns: RouteGroup (self)
remove_route()
group.remove_route(name: str) -> bool
Purpose: Remove a route by name from this group.
| Parameter | Type | Description |
|---|---|---|
| name | str | The route name to remove |
Returns: bool — True if removed, False if not found.
get_route()
group.get_route(name: str) -> Optional[Route]
Purpose: Retrieve a single route by name.
| Parameter | Type | Description |
|---|---|---|
| name | str | The route name |
Returns: Optional[Route] — None if not found.
get_routes()
group.get_routes(enabled_only: bool = True) -> List[Route]
Purpose: Return all routes in this group, sorted by priority descending.
| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled_only | bool | True | If True, exclude disabled routes |
Returns: List[Route] sorted by priority descending.
get_by_tag()
group.get_by_tag(tag: str) -> List[Route]
Purpose: Return all routes that have the specified tag.
| Parameter | Type | Description |
|---|---|---|
| tag | str | The tag to filter by |
Returns: List[Route]
Example:
read_only_routes = group.get_by_tag("read-only")

route() (decorator)
group.route(
    name: str,
    description: str,
    examples: Optional[List[str]] = None,
    keywords: Optional[RouteKeywords] = None,
    priority: int = 0,
    tags: Optional[Set[str]] = None,
    tools: Optional[List[str]] = None,
) -> Callable

Purpose: Decorator that registers a function, BaseHandler subclass, or BaseHandler instance as a route handler. The decorated object is returned unchanged.
| Parameter | Type | Description |
|---|---|---|
| name | str | Route name |
| description | str | Route description |
| examples | Optional[List[str]] | Sample utterances |
| keywords | Optional[RouteKeywords] | Keyword signals |
| priority | int | Priority tie-breaker |
| tags | Optional[Set[str]] | Tags for filtering |
| tools | Optional[List[str]] | Tool chain declarations |
Returns: The decorated function/class unchanged.
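The "register and return unchanged" pattern is a standard decorator factory. The sketch below uses a hypothetical `MiniGroup` class with a reduced parameter list to show the mechanics; it is not the library's RouteGroup, and raising on a duplicate name is an assumption borrowed from the add_route() behaviour described earlier.

```python
from typing import Any, Callable, Dict, List, Optional


class MiniGroup:
    """Hypothetical, stripped-down registry illustrating a route() decorator."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.routes: Dict[str, Dict[str, Any]] = {}

    def route(
        self, name: str, description: str, examples: Optional[List[str]] = None
    ) -> Callable[[Any], Any]:
        def decorator(target: Any) -> Any:
            if name in self.routes:
                raise ValueError(f"route {name!r} already registered")
            self.routes[name] = {
                "description": description,
                "examples": list(examples or []),
                "handler": target,
            }
            return target  # hand the original object back untouched

        return decorator


chat = MiniGroup("chat")


@chat.route(name="greeting", description="Greet the user", examples=["hello", "hi"])
def greet(query: str) -> str:
    return f"Hello! You said: {query}"
```

Because the decorator returns its target unchanged, `greet` remains directly callable and testable outside the router.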
Example:
chat_group = RouteGroup("chat", "General conversation")

@chat_group.route(
    name = "greeting",
    description = "Greet the user",
    examples = ["hello", "hi", "good morning"],
    priority = 10,
)
def greet(query: str):
    return f"Hello! You said: {query}"

@chat_group.route(
    name = "joke",
    description = "Tell a joke",
    examples = ["tell me a joke", "say something funny"],
)
class JokeHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        return HandlerResponse.ok("Why did the robot go on a diet? Too many bytes!")

enable() / disable() (RouteGroup)
group.enable() -> None
group.disable() -> None
Purpose: Toggle the entire group's participation in routing. A disabled group and all its routes are completely excluded from the scoring pipeline.
RouteGroup.invalidate_embeddings()
group.invalidate_embeddings() -> None
Purpose: Mark the embeddings for the group's intent examples and all routes within the group as stale so that they are re-computed. Call this after bulk-adding examples or changing the description. This call only invalidates; use router.re_encode_group(group.name) afterwards to trigger the actual re-encoding.
RouteGroup.summary()
group.summary() -> Dict[str, Any]
Purpose: Return a human-readable summary of the group's configuration and route inventory.
Returns: Dict[str, Any] with keys: name, description, enabled, priority, route_count, routes (list of route names), metadata.
Built-in Group Templates
Pre-built RouteGroup objects for the most common LLM application patterns.
make_rag_group()
make_rag_group(priority: int = 10) -> RouteGroup
Purpose: Creates a pre-configured RouteGroup for Retrieval-Augmented Generation patterns. Includes intent examples and keyword signals tuned for document search and knowledge base queries.
| Parameter | Type | Default | Description |
|---|---|---|---|
| priority | int | 10 | Group priority (higher = preferred over lower-priority groups when scores are equal) |
Returns: RouteGroup with name="rag".
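How "preferred when scores are equal" plays out can be sketched by ranking on a (score, priority) pair. The group scores below are invented, and the comparison key is an assumption: the text above only says that priority decides between equally scored groups.

```python
from typing import List, Tuple

# (group name, score, priority); the scores are made-up illustration values.
scored: List[Tuple[str, float, int]] = [
    ("chat", 0.72, 0),
    ("rag", 0.72, 10),
    ("tools", 0.60, 20),
]

# Rank by score first; priority only breaks exact ties.
best = max(scored, key=lambda g: (g[1], g[2]))
print(best[0])
```

Here "rag" and "chat" tie on score, so the higher-priority "rag" wins, while "tools" loses despite its higher priority because its score is lower.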
Example:
from fennec_community.router import Route, make_rag_group
rag = make_rag_group(priority=10)
rag.add_route(Route(
name = "policy_qa",
description = "Answer questions from company policy documents",
handler = PolicyRAGHandler(),
examples = ["What is the vacation policy?", "How many sick days do I get?"],
))
router.register_group(rag)
make_tools_group()
make_tools_group(priority: int = 20) -> RouteGroup
Purpose: Creates a pre-configured RouteGroup for tool/function-calling patterns. Includes intent examples for API calls, code execution, email, calendar, and database queries. Highest default priority since tool calls are usually explicit.
| Parameter | Type | Default | Description |
|---|---|---|---|
| priority | int | 20 | Group priority |
Returns: RouteGroup with name="tools".
Example:
from fennec_community.router import Route, make_tools_group
tools = make_tools_group(priority=20)
tools.add_route(Route(
name = "send_email",
description = "Send an email to a recipient",
handler = EmailHandler(),
examples = ["Send an email to John", "Email the report to sales@company.com"],
))
router.register_group(tools)
make_chat_group()
make_chat_group(priority: int = 0) -> RouteGroup
Purpose: Creates a pre-configured RouteGroup for general conversation patterns. Low default priority: it acts as a catch-all for queries that don't match more specific groups.
| Parameter | Type | Default | Description |
|---|---|---|---|
| priority | int | 0 | Group priority (lowest by default) |
Returns: RouteGroup with name="chat".
HierarchicalRouter
Module: router.routing.hierarchical
Import: from fennec_community.router import HierarchicalRouter
The primary public entry point for the entire routing system.
HierarchicalRouter Constructor
HierarchicalRouter(config: Optional[RouterConfig] = None)
Purpose: Instantiates the router and all subsystems. Loads the embedding model on startup.
| Parameter | Type | Default | Description |
|---|---|---|---|
| config | Optional[RouterConfig] | None | Full configuration object. Uses RouterConfig() defaults if not provided |
Example:
# Default configuration
router = HierarchicalRouter()
# Custom configuration
from fennec_community.router import RouterConfig, ScoringConfig, ExecutionConfig, ExecutionMode
config = RouterConfig()
config.scoring.high_confidence_threshold = 0.90
config.execution.mode = ExecutionMode.SEQUENTIAL
config.cache.ttl_seconds = 600
router = HierarchicalRouter(config=config)
# Preset configurations
router = HierarchicalRouter(config=RouterConfig.fast())
router = HierarchicalRouter(config=RouterConfig.accurate())
register_group()
router.register_group(group: RouteGroup) -> HierarchicalRouter
Purpose: Register a RouteGroup with the router. All routes within the group are immediately encoded (if they have examples). Returns self for fluent chaining.
| Parameter | Type | Description |
|---|---|---|
| group | RouteGroup | The route group to register |
Returns: HierarchicalRouter (self) — for chaining.
Raises: ValueError if a group with the same name is already registered. Use update_group() to replace.
Example:
router \
.register_group(make_tools_group()) \
.register_group(make_rag_group()) \
.register_group(make_chat_group())
update_group()
router.update_group(group: RouteGroup) -> HierarchicalRouter
Purpose: Replace an existing group (or add it if not present). Removes old routes from the flat route map, re-encodes the new group, and clears the decision cache.
| Parameter | Type | Description |
|---|---|---|
| group | RouteGroup | The replacement group |
Returns: HierarchicalRouter (self)
register()
router.register(route: Route, group_name: Optional[str] = None) -> None
Purpose: Register a single route directly. If group_name is specified, the route is added to that group; otherwise it is placed in an auto-created "default" group. Clears the decision cache.
| Parameter | Type | Default | Description |
|---|---|---|---|
| route | Route | required | The route to register |
| group_name | Optional[str] | None | Target group name. Creates a "default" group if omitted |
Returns: None
Example:
router.register(
Route(name="quick_answer", description="...", handler=QuickHandler(), examples=[...]),
group_name="chat"
)
unregister()
router.unregister(route_name: str) -> bool
Purpose: Remove a route by name from both the flat route map and its parent group. Clears the decision cache.
| Parameter | Type | Description |
|---|---|---|
| route_name | str | The name of the route to remove |
Returns: bool — True if the route was found and removed, False if not found.
unregister_group()
router.unregister_group(group_name: str) -> bool
Purpose: Remove an entire group and all its routes from the router. Clears the decision cache.
| Parameter | Type | Description |
|---|---|---|
| group_name | str | The name of the group to remove |
Returns: bool — True if the group was found and removed, False if not found.
set_fallback()
router.set_fallback(handler: Union[BaseHandler, Callable]) -> None
Purpose: Set a global fallback handler invoked when no route meets the minimum confidence threshold. Accepts a BaseHandler instance or a plain callable.
| Parameter | Type | Description |
|---|---|---|
| handler | Union[BaseHandler, Callable] | The fallback handler |
Returns: None
Example:
def fallback_handler(query: str):
    return f"I couldn't find a specific answer for: {query}"
router.set_fallback(fallback_handler)
route_query()
router.route_query(
query: str,
context: Optional[Dict[str, Any]] = None,
return_result: bool = True,
**handler_kwargs,
) -> RoutingResult
Purpose: The primary routing method. Route a query synchronously through the full pipeline and return a RoutingResult.
Internal pipeline order:
- Validate query is non-empty
- Build HandlerRequest from query + context
- Cache lookup: return immediately on hit (skips embedding + scoring)
- Embed query via EmbeddingProvider
- Score all groups → select best group
- Score all routes in selected group → top-k candidates
- Evaluate confidence level
- Cache the routing decision
- Execute handler(s) via ExecutionEngine
- Record metrics + auto-feedback
- Return RoutingResult
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | required | The user's query string |
| context | Optional[Dict[str, Any]] | None | Optional context dict passed to the handler via HandlerRequest.context |
| return_result | bool | True | Kept for backward compatibility. Always True |
| **handler_kwargs | Any | — | Additional key-value pairs merged into request.context |
Returns: RoutingResult
Raises: ValueError if query is empty. LookupError if raise_on_no_match=True and no route matched.
Example — basic:
result = router.route_query("What is the refund policy?")
if result.success:
    print(result.content)
Example — with context:
result = router.route_query(
"Show my last 5 invoices",
context = {
"user_id": "usr_abc123",
"session_id": "sess_xyz789",
"locale": "en-US",
},
)
print(f"Route: {result.route_name}, Confidence: {result.confidence}")
print(f"Latency: {result.total_time_ms:.1f}ms")
Example — inspect the trace:
result = router.route_query("Compare Q1 vs Q2 revenue")
trace = result.trace
print(f"Group selected: {trace.selected_group}")
print(f"Route selected: {trace.selected_route}")
print(f"Confidence: {trace.confidence}")
print(f"Embedding time: {trace.embedding_time_ms:.1f}ms")
print(f"Scoring time: {trace.scoring_time_ms:.1f}ms")
print(f"Execution time: {trace.execution_time_ms:.1f}ms")
for i, candidate in enumerate(trace.candidates, 1):
    print(f"  #{i} {candidate.route_name}: score={candidate.effective_score:.3f} ({candidate.confidence.value})")
route_async()
async router.route_async(request: HandlerRequest) -> RoutingResult
Purpose: Route a pre-built HandlerRequest asynchronously (non-blocking I/O via async handlers). It follows the same pipeline as route_query() but runs the handler execution step through execute_async().
| Parameter | Type | Description |
|---|---|---|
| request | HandlerRequest | A pre-built request object |
Returns: RoutingResult
Example:
import asyncio
from fennec_community.router import HandlerRequest
async def handle_request(query: str):
    request = HandlerRequest(query=query, context={"user_id": "usr_001"})
    result = await router.route_async(request)
    return result.content
answer = asyncio.run(handle_request("What's the weather in Cairo?"))
stream() (Router)
router.stream(
query: str,
context: Optional[Dict[str, Any]] = None,
) -> Iterator[str]
Purpose: Route the query and stream tokens from the best-matching handler synchronously. Bypasses the execution engine's aggregation and calls the handler's stream() method directly.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | required | The user's query |
| context | Optional[Dict[str, Any]] | None | Context dict passed to the handler |
Returns: Iterator[str] — yields string chunks
Example:
for chunk in router.stream("Explain quantum computing simply"):
    print(chunk, end="", flush=True)
stream_async() (Router)
async router.stream_async(
query: str,
context: Optional[Dict[str, Any]] = None,
) -> AsyncIterator[str]
Purpose: Route the query and stream tokens asynchronously.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | required | The user's query |
| context | Optional[Dict[str, Any]] | None | Context dict |
Returns: AsyncIterator[str]
Example:
async def stream_response(query: str):
    async for chunk in router.stream_async(query):
        print(chunk, end="", flush=True)
chain()
router.chain(
query: str,
route_names: List[str],
context: Optional[Dict[str, Any]] = None,
) -> HandlerResponse
Purpose: Execute a fixed sequence of routes as a pipeline. Each route's response.content is passed into the next route's context as {"last_result": ..., "step_<name>": ...}. If any step fails, the chain stops and returns a failed response.
| Parameter | Type | Description |
|---|---|---|
| query | str | The original user query (shared across all steps) |
| route_names | List[str] | Ordered list of route names to execute in sequence |
| context | Optional[Dict[str, Any]] | Initial context dict passed to the first step |
Returns: HandlerResponse — the last step's response (or a failure response if the chain broke).
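The context hand-off described for chain() can be sketched with plain functions. This is an illustrative approximation only: run_chain, the Step alias and the (ok, content) tuples are hypothetical and stand in for real routes and HandlerResponse objects.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# A step takes (query, context) and returns (ok, content).
Step = Callable[[str, Dict[str, Any]], Tuple[bool, Any]]


def run_chain(
    query: str,
    steps: List[Tuple[str, Step]],
    context: Optional[Dict[str, Any]] = None,
) -> Tuple[bool, Any]:
    ctx: Dict[str, Any] = dict(context or {})
    result: Tuple[bool, Any] = (False, None)
    for name, step in steps:
        ok, content = step(query, ctx)
        if not ok:
            # Stop at the first failing step and report a failure.
            return False, f"step '{name}' failed"
        # Expose this step's output to every later step.
        ctx["last_result"] = content
        ctx[f"step_{name}"] = content
        result = (True, content)
    return result


steps: List[Tuple[str, Step]] = [
    ("upper", lambda q, c: (True, q.upper())),
    ("exclaim", lambda q, c: (True, c["last_result"] + "!")),
]
print(run_chain("hi", steps))
```

Every step sees the same original query, and only the context grows as the pipeline advances.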
Example:
# Pipeline: extract intent → retrieve docs → generate answer
response = router.chain(
query = "What happened to the quarterly results?",
route_names = ["intent_extractor", "doc_retriever", "answer_generator"],
context = {"user_id": "usr_001"},
)
print(response.content)
feedback()
router.feedback(
route_name: str,
success: bool,
score: float = 0.0,
query: str = "",
) -> None
Purpose: Record a quality signal for a routing outcome. Call this after the application has determined whether the routed response was useful (e.g., user clicked thumbs up/down). The feedback engine uses these signals to adjust the route's score_bias via EMA, improving future routing accuracy.
| Parameter | Type | Default | Description |
|---|---|---|---|
| route_name | str | required | The name of the route being rated |
| success | bool | required | True if the response was useful, False if not |
| score | float | 0.0 | The original routing score at the time of routing (for context) |
| query | str | "" | The original query (stored as a hash only for privacy) |
Returns: None
Score Bias Model:
ema_new = alpha × outcome + (1 - alpha) × ema_prev
bias = clip((ema - 0.5) × 0.6, -0.3, +0.3)
After ≥ min_samples_to_adapt observations:
- Many successes → bias → +0.3 (route preferred)
- Many failures → bias → -0.3 (route penalized)
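The two formulas above translate directly into a few lines of Python. This is a sketch of the arithmetic only: the function names, the neutral starting EMA of 0.5 and alpha = 0.2 are assumptions made for illustration, and outcome is taken to be 1.0 for a success and 0.0 for a failure.

```python
def update_ema(ema_prev: float, success: bool, alpha: float) -> float:
    """One EMA step: ema_new = alpha * outcome + (1 - alpha) * ema_prev."""
    outcome = 1.0 if success else 0.0
    return alpha * outcome + (1.0 - alpha) * ema_prev


def bias_from_ema(ema: float) -> float:
    """bias = clip((ema - 0.5) * 0.6, -0.3, +0.3)."""
    return max(-0.3, min(0.3, (ema - 0.5) * 0.6))


ema = 0.5  # assumed neutral starting point
for _ in range(50):
    ema = update_ema(ema, success=True, alpha=0.2)

print(f"ema={ema:.4f} bias={bias_from_ema(ema):+.4f}")
```

A neutral EMA of 0.5 yields a bias of zero, and a long run of successes pushes the bias towards the +0.3 ceiling, as the bullets above describe.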
Example:
result = router.route_query("What is our refund policy?")
# After the user rates the response
user_rating = get_user_rating() # e.g., thumbs up/down
router.feedback(
route_name = result.route_name,
success = (user_rating == "up"),
score = result.trace.candidates[0].effective_score if result.trace.candidates else 0.0,
query = "What is our refund policy?",
)
re_encode_route()
router.re_encode_route(route_name: str) -> bool
Purpose: Force re-computation of a route's embeddings after manually adding examples via route.add_example(). Also clears the decision cache.
| Parameter | Type | Description |
|---|---|---|
| route_name | str | The name of the route to re-encode |
Returns: bool — True if the route was found and re-encoded, False if not found.
Example:
route = router.get_route("docs_qa")
route.add_example("Where can I find the terms of service?")
router.re_encode_route("docs_qa")
re_encode_group()
router.re_encode_group(group_name: str) -> bool
Purpose: Re-encode all routes in a group and the group's intent embeddings. Use after bulk-updating routes or changing group-level examples.
| Parameter | Type | Description |
|---|---|---|
| group_name | str | The group to re-encode |
Returns: bool — True if the group was found and re-encoded, False if not found.
metrics()
router.metrics() -> Dict[str, Any]
Purpose: Return a point-in-time snapshot of all router performance metrics. Thread-safe.
Returns: Dict[str, Any] with keys:
| Key | Type | Description |
|---|---|---|
| uptime_seconds | float | Router uptime since initialization |
| total_requests | int | Cumulative total routing requests |
| window_size | int | Rolling window size for rate metrics |
| match_rate_pct | float | % of requests that matched a route (rolling window) |
| cache_hit_rate_pct | float | % of requests served from cache (cumulative) |
| llm_invocations | int | Total times the LLM scorer was used (cumulative) |
| latency_ms.avg | float | Average end-to-end latency (rolling window) |
| latency_ms.p95 | float | 95th percentile latency (rolling window) |
| latency_ms.min | float | Minimum latency (rolling window) |
| latency_ms.max | float | Maximum latency (rolling window) |
| confidence_distribution | Dict[str, int] | Count of HIGH/MEDIUM/LOW/NONE decisions |
| routes | Dict[str, Dict] | Per-route stats: calls, success_rate, avg_ms, p95_ms |
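To make the "rolling window" qualifier concrete, here is one way such latency statistics could be computed. It is a sketch under stated assumptions: RollingLatency is a hypothetical class, the window is a fixed-size deque, and P95 uses the nearest-rank method, which may differ from what the router does internally.

```python
from collections import deque
from typing import Deque, Dict


class RollingLatency:
    """Keeps only the most recent window_size samples (illustrative)."""

    def __init__(self, window_size: int = 100) -> None:
        self._samples: Deque[float] = deque(maxlen=window_size)

    def record(self, latency_ms: float) -> None:
        # Once the deque is full, the oldest sample is dropped automatically.
        self._samples.append(latency_ms)

    def snapshot(self) -> Dict[str, float]:
        if not self._samples:
            return {"avg": 0.0, "p95": 0.0, "min": 0.0, "max": 0.0}
        ordered = sorted(self._samples)
        n = len(ordered)
        # Nearest-rank P95: ceil(0.95 * n), done in integer arithmetic.
        rank = max(1, (95 * n + 99) // 100)
        return {
            "avg": sum(ordered) / n,
            "p95": ordered[rank - 1],
            "min": ordered[0],
            "max": ordered[-1],
        }


window = RollingLatency(window_size=100)
for value in range(1, 151):
    window.record(float(value))
print(window.snapshot())
```

Only the last 100 of the 150 recorded samples contribute to the snapshot, which is why rolling-window figures can move while cumulative counters such as total_requests only grow.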
Example:
import json
m = router.metrics()
print(json.dumps(m, indent=2))
print(f"Match rate: {m['match_rate_pct']}%")
print(f"P95 latency: {m['latency_ms']['p95']}ms")
cache_stats()
router.cache_stats() -> Dict[str, Any]
Purpose: Return current statistics from the route decision cache.
Returns: Dict[str, Any] with keys: enabled, ttl_seconds, and route_cache (contains size, max_size, hits, misses, hit_rate).
Example:
stats = router.cache_stats()
print(f"Cache hit rate: {stats['route_cache']['hit_rate']}%")
print(f"Cache size: {stats['route_cache']['size']}/{stats['route_cache']['max_size']}")
feedback_summary()
router.feedback_summary() -> Dict[str, Any]
Purpose: Return the current state of the adaptive feedback engine, including per-route EMA values and success rates.
Returns: Dict[str, Any] with keys: enabled, alpha, min_samples, tracked_routes, total_records, and states (per-route feedback state).
Example:
fb = router.feedback_summary()
for route_name, state in fb["states"].items():
    print(f"{route_name}: success_rate={state['success_rate']:.1%}, bias={state['bias']:+.3f}")
list_groups()
router.list_groups() -> List[str]
Purpose: Return a list of all registered group names.
Returns: List[str]
list_routes()
router.list_routes(group_name: Optional[str] = None) -> List[str]
Purpose: Return route names, optionally filtered by group.
| Parameter | Type | Default | Description |
|---|---|---|---|
| group_name | Optional[str] | None | If provided, return only routes in that group |
Returns: List[str]
Example:
all_routes = router.list_routes()
tools_routes = router.list_routes("tools")
get_route() (Router)
router.get_route(name: str) -> Optional[Route]
Purpose: Retrieve a Route object by name from the flat route registry.
| Parameter | Type | Description |
|---|---|---|
| name | str | The route name |
Returns: Optional[Route] — None if not found.
get_group()
router.get_group(name: str) -> Optional[RouteGroup]
Purpose: Retrieve a RouteGroup object by name.
| Parameter | Type | Description |
|---|---|---|
| name | str | The group name |
Returns: Optional[RouteGroup] — None if not found.
summary() (Router)
router.summary() -> Dict[str, Any]
Purpose: Return a comprehensive snapshot of the router's current configuration, all groups, all routes, and live metrics. Useful for health checks and admin dashboards.
Returns: Dict[str, Any] with keys: groups (per-group summary), total_routes, config (key thresholds and settings), metrics (live metrics snapshot).
Example:
import json
print(json.dumps(router.summary(), indent=2))
RoutingResult Properties
Convenience properties on the RoutingResult object, so that callers rarely need to reach into result.response or result.trace directly:
| Property | Type | Description |
|---|---|---|
| result.content | Any | Shortcut to result.response.content (the most common access) |
| result.success | bool | True if matched and handler returned success |
| result.route_name | Optional[str] | Name of the selected route |
| result.group_name | Optional[str] | Name of the selected group |
| result.confidence | Optional[ConfidenceLevel] | Confidence level of the routing decision |
| result.total_time_ms | float | Total end-to-end latency in milliseconds |
| bool(result) | bool | Equivalent to result.success |
Serialization:
result.to_dict() # → Dict with matched, success, route, group, confidence, content, error, time_ms, trace
RouteCandidate Properties
| Property | Type | Description |
|---|---|---|
| candidate.effective_score | float | min(1.0, combined_score + score_bias), the score actually used for ranking |
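The effective_score formula and its effect on ranking can be shown with a short sketch. The candidate tuples and their numbers are invented for illustration; only the min(1.0, combined_score + score_bias) expression comes from the table above.

```python
from typing import List, Tuple


def effective_score(combined_score: float, score_bias: float) -> float:
    """Cap the biased score at 1.0, as stated in the table above."""
    return min(1.0, combined_score + score_bias)


# (route name, combined_score, score_bias); made-up illustration values.
candidates: List[Tuple[str, float, float]] = [
    ("docs_qa", 0.75, 0.0),
    ("weather", 0.5, 0.5),
    ("general_chat", 0.875, -0.25),
]

ranked = sorted(
    candidates,
    key=lambda c: effective_score(c[1], c[2]),
    reverse=True,
)
print([name for name, _, _ in ranked])
```

A learned bias can reorder candidates: "general_chat" has the highest raw score here but drops to last place once its negative bias is applied.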
Serialization:
candidate.to_dict() # → Dict with route, group, score, semantic, keyword, llm, confidence
RoutingTrace Methods
add_note()
trace.add_note(note: str) -> None
Purpose: Append a human-readable note to the trace's notes list. Used internally by the pipeline to explain routing decisions. Can also be used in custom handlers or middleware.
Serialization:
trace.to_dict() # → Full audit trail dict
CacheManager
Module: router.cache.manager
Import: from fennec_community.router import CacheManager
Two-tier LRU+TTL cache. Primarily used internally by HierarchicalRouter, but available for advanced use (e.g., custom cache warming or invalidation).
get_decision()
cache.get_decision(query: str) -> Optional[Tuple[str, Optional[str]]]
Purpose: Look up a cached routing decision. Queries are normalized (lowercased, whitespace-collapsed) and hashed with SHA-256 before lookup. Returns None on miss or expired entry.
| Parameter | Type | Description |
|---|---|---|
| query | str | The raw query string |
Returns: Optional[Tuple[str, Optional[str]]] — (route_name, group_name) or None.
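The normalize-then-hash step can be approximated with the standard library. This is a sketch, assuming that normalization also strips leading and trailing whitespace and that the key is the hex digest of the UTF-8 bytes; cache_key is a hypothetical helper, not part of CacheManager.

```python
import hashlib


def cache_key(query: str) -> str:
    """Lowercase, collapse whitespace runs, then hash with SHA-256."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = cache_key("  What is   the Refund policy? ")
b = cache_key("what is the refund policy?")
print(a == b)
```

Queries that differ only in case or spacing therefore share one cache entry, while any other difference in wording produces a different key.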
set_decision()
cache.set_decision(
query: str,
route_name: str,
group_name: Optional[str] = None,
) -> None
Purpose: Store a routing decision in the cache. Evicts expired or LRU entries when max_size is reached.
| Parameter | Type | Description |
|---|---|---|
| query | str | The raw query string |
| route_name | str | The route that was selected |
| group_name | Optional[str] | The group of the selected route |
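The combination of TTL expiry and LRU eviction can be illustrated with an OrderedDict. This is a minimal sketch, not CacheManager's implementation: TinyDecisionCache, its method names and the injectable clock are assumptions added to keep the example small and testable.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class TinyDecisionCache:
    """Illustrative LRU+TTL cache mapping a key to a route name."""

    def __init__(self, max_size: int = 2, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: "OrderedDict[str, Tuple[float, str]]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        item = self._entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]          # expired: treat as a miss
            return None
        self._entries.move_to_end(key)      # mark as most recently used
        return value

    def set(self, key: str, value: str) -> None:
        now = self._clock()
        expired = [k for k, (t, _) in self._entries.items() if now - t > self._ttl]
        for k in expired:                   # drop expired entries first
            del self._entries[k]
        self._entries.pop(key, None)
        while len(self._entries) >= self._max_size:
            self._entries.popitem(last=False)   # evict least recently used
        self._entries[key] = (now, value)


now = [0.0]  # fake clock so the behaviour is deterministic
cache = TinyDecisionCache(max_size=2, ttl_seconds=10.0, clock=lambda: now[0])
cache.set("a", "route_1")
cache.set("b", "route_2")
cache.get("a")             # "a" becomes the most recently used entry
cache.set("c", "route_3")  # cache is full, so "b" is evicted
```

Reading an entry refreshes its position but not its timestamp, so a frequently used entry still expires once its TTL has elapsed.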
invalidate()
cache.invalidate(query: str) -> bool
Purpose: Remove a specific query from the cache. Useful after updating a route that was previously cached for a specific query.
| Parameter | Type | Description |
|---|---|---|
| query | str | The raw query to evict |
Returns: bool — True if the entry existed and was removed.
clear() (Cache)
cache.clear() -> None
Purpose: Flush all cached routing decisions. Called automatically by the router after register(), unregister(), update_group(), unregister_group(), and re_encode_*().
stats() (Cache Method)
cache.stats() -> Dict[str, Any]
Purpose: Return cache statistics including size, hit rate, and configuration.
Returns: Dict[str, Any] with keys: enabled, ttl_seconds, route_cache (size, max_size, hits, misses, hit_rate).
FeedbackEngine
Module: router.feedback.engine
Import: from fennec_community.router import FeedbackEngine
Manages the adaptive feedback loop. Used internally by HierarchicalRouter. Available for advanced scenarios such as batch feedback import or persisting/loading state manually.
FeedbackEngine.record()
feedback_engine.record(
route_name: str,
group_name: Optional[str],
success: bool,
score: float,
query: str = "",
) -> None
Purpose: Record the outcome of a single routing decision. Updates the route's EMA-based feedback state. Thread-safe.
| Parameter | Type | Description |
|---|---|---|
| route_name | str | The route that was executed |
| group_name | Optional[str] | The group of the route |
| success | bool | True if the handler's response was successful |
| score | float | The combined score at routing time |
| query | str | The raw query (stored as hash only for privacy) |
apply_biases()
feedback_engine.apply_biases(route_map: Dict[str, Route]) -> None
Purpose: Push learned score_bias values from feedback states to the actual Route objects. Only applies to routes with at least min_samples_to_adapt observations. Thread-safe. Called automatically after every router.feedback() call.
| Parameter | Type | Description |
|---|---|---|
| route_map | Dict[str, Route] | The router's internal {name: Route} map |
bulk_record()
feedback_engine.bulk_record(records: List[Dict[str, Any]]) -> None
Purpose: Import a batch of feedback events at once, e.g., from a human labelling pipeline or offline evaluation. Each record dict must have route_name and success; score, group_name, and query are optional.
| Parameter | Type | Description |
|---|---|---|
| records | List[Dict[str, Any]] | List of dicts with keys: route_name (str), success (bool), score (float), group_name (str), query (str) |
Example:
feedback_engine = router._feedback # Access internal engine if needed
feedback_engine.bulk_record([
{"route_name": "docs_qa", "success": True, "score": 0.91},
{"route_name": "weather", "success": False, "score": 0.78, "group_name": "tools"},
{"route_name": "docs_qa", "success": True, "score": 0.88},
])
feedback_engine.apply_biases(router._route_map)
save()
feedback_engine.save(path: Optional[str] = None) -> None
Purpose: Persist the learned feedback states (EMA values, sample counts, success/failure counts) to a JSON file. Allows state to survive process restarts. Called automatically when using the with router: context manager.
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | Optional[str] | None | Target file path. Falls back to FeedbackConfig.persist_path if None |
Example:
feedback_engine.save("feedback_state.json")
get_state()
feedback_engine.get_state(route_name: str) -> Optional[RouteFeedbackState]
Purpose: Retrieve the current feedback state for a specific route.
| Parameter | Type | Description |
|---|---|---|
| route_name | str | The route name |
Returns: Optional[RouteFeedbackState] with fields: ema, n_samples, n_success, n_failure, success_rate (property), computed_bias (property).
FeedbackEngine.summary()
feedback_engine.summary() -> Dict[str, Any]
Purpose: Return the full feedback engine state as a dictionary.
Returns: Dict[str, Any] with keys: enabled, alpha, min_samples, tracked_routes, total_records, states (per-route state dicts).
FeedbackEngine.reset()
feedback_engine.reset(route_name: Optional[str] = None) -> None
Purpose: Reset feedback state for one specific route or for all routes.
| Parameter | Type | Default | Description |
|---|---|---|---|
| route_name | Optional[str] | None | If provided, reset only that route. If None, reset all states and history |
Integration Examples
Full Production Setup
from fennec_community.router import (
HierarchicalRouter, RouterConfig, ExecutionConfig, ExecutionMode,
RouteGroup, Route, RouteKeywords,
BaseHandler, HandlerRequest, HandlerResponse,
make_rag_group, make_tools_group, make_chat_group,
)
# ── Configuration ───────────────────────────────────────────────────────
config = RouterConfig()
config.scoring.high_confidence_threshold = 0.88
config.scoring.medium_confidence_threshold = 0.65
config.execution.mode = ExecutionMode.SEQUENTIAL
config.feedback.persist_path = "router_feedback.json"
config.observability.slow_route_ms = 300.0
# ── Handlers ────────────────────────────────────────────────────────────
class PolicyHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        result = rag_search(request.query, index="policy_docs")
        return HandlerResponse.ok(result["answer"], sources=result["sources"])
class WeatherHandler(BaseHandler):
    async def handle_async(self, request: HandlerRequest) -> HandlerResponse:
        city = extract_city(request.query)
        data = await async_weather_api(city)
        return HandlerResponse.ok(data)
class ChatHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        reply = llm_chat(request.query, history=request.context.get("history", []))
        return HandlerResponse.ok(reply)
# ── Route Groups ─────────────────────────────────────────────────────────
rag = make_rag_group(priority=10)
rag.add_route(Route(
name = "policy_qa",
description = "Answer questions from company policy documents",
handler = PolicyHandler(),
examples = ["What is the leave policy?", "How do I apply for remote work?"],
keywords = RouteKeywords(any_of=["policy", "procedure", "rule", "regulation"], boost=0.08),
))
tools = make_tools_group(priority=20)
tools.add_route(Route(
name = "weather",
description = "Get current weather for a city",
handler = WeatherHandler(),
examples = ["What's the weather in Cairo?", "Is it cold in London today?"],
keywords = RouteKeywords(any_of=["weather", "temperature", "forecast", "rain"], boost=0.10),
))
chat = make_chat_group(priority=0)
chat.add_route(Route(
name = "general_chat",
description = "General conversation and open-ended questions",
handler = ChatHandler(),
examples = ["Hello!", "Tell me a joke", "How are you?"],
))
# ── Router ────────────────────────────────────────────────────────────────
with HierarchicalRouter(config=config) as router:
    router \
        .register_group(tools) \
        .register_group(rag) \
        .register_group(chat)
    router.set_fallback(lambda q: f"Sorry, I couldn't find an answer for: {q}")
    # Route a query
    result = router.route_query(
        "What's the remote work policy?",
        context={"user_id": "usr_001", "locale": "en-US"},
    )
    if result.success:
        print(result.content)
        print(f"→ {result.route_name} [{result.confidence.value}] {result.total_time_ms:.1f}ms")
    # Record user feedback
    router.feedback(result.route_name, success=True)
Async Routing
import asyncio
from fennec_community.router import HierarchicalRouter, HandlerRequest
router = HierarchicalRouter()
# ... register groups ...
async def handle_query(query: str, user_id: str) -> str:
    request = HandlerRequest(
        query = query,
        context = {"user_id": user_id},
    )
    result = await router.route_async(request)
    return result.content if result.success else f"Error: {result.response.error}"
async def main():
    answers = await asyncio.gather(
        handle_query("What's the weather?", "usr_001"),
        handle_query("Find the Q3 report", "usr_002"),
        handle_query("Hello!", "usr_003"),
    )
    for a in answers:
        print(a)
asyncio.run(main())
Streaming Response
import sys
router = HierarchicalRouter()
# ... register groups with streaming handlers ...
print("Response: ", end="")
for chunk in router.stream("Explain the company's AI strategy"):
    print(chunk, end="", flush=True)
print()
Tool Chaining Pipeline
router = HierarchicalRouter()
# Register pipeline stages as individual routes
router.register(Route("intent_classifier", "Classify the user intent", handler=IntentHandler()), group_name="pipeline")
router.register(Route("doc_retriever", "Retrieve relevant documents", handler=RetrieverHandler()), group_name="pipeline")
router.register(Route("answer_generator", "Generate grounded answer", handler=GeneratorHandler()), group_name="pipeline")
response = router.chain(
query = "What caused the 2008 financial crisis?",
route_names = ["intent_classifier", "doc_retriever", "answer_generator"],
context = {"user_id": "usr_001"},
)
print(response.content)
# Each step receives the previous step's output in context["last_result"]
Decorator-based Registration
from fennec_community.router import HierarchicalRouter, RouteGroup, RouteKeywords, BaseHandler, HandlerRequest, HandlerResponse
router = HierarchicalRouter()
support = RouteGroup("support", "Customer support queries", priority=15)
@support.route(
name = "order_status",
description = "Check the status of a customer order",
examples = ["Where is my order?", "Track order #12345", "When will my package arrive?"],
keywords = RouteKeywords(any_of=["order", "package", "delivery", "track"], boost=0.09),
tags = {"read", "orders"},
)
class OrderStatusHandler(BaseHandler):
    def handle(self, request: HandlerRequest) -> HandlerResponse:
        order_id = extract_order_id(request.query)
        status = db.get_order_status(order_id)
        return HandlerResponse.ok(status)
@support.route(
name = "refund_request",
description = "Process or check the status of a refund",
examples = ["I want a refund", "My refund hasn't arrived", "Cancel and refund my order"],
keywords = RouteKeywords(any_of=["refund", "money back", "return", "cancel"], boost=0.10),
tags = {"write", "refunds"},
)
def handle_refund(query: str):
    return process_refund(extract_order_id(query))
router.register_group(support)
Monitoring & Observability
import time, json
router = HierarchicalRouter()
# Periodic health check
def health_check():
    m = router.metrics()
    return {
        "status": "healthy" if m["match_rate_pct"] > 80 else "degraded",
        "match_rate": m["match_rate_pct"],
        "p95_ms": m["latency_ms"]["p95"],
        "cache_hits": m["cache_hit_rate_pct"],
        "top_routes": sorted(
            m["routes"].items(),
            key=lambda x: x[1]["calls"],
            reverse=True,
        )[:5],
    }
# Inspect trace after a failure
result = router.route_query("some ambiguous query")
if not result.success:
    trace = result.trace
    print("No match. Top candidates:")
    for c in trace.candidates:
        print(f"  {c.route_name}: {c.effective_score:.3f} ({c.confidence.value})")
    print(f"Notes: {trace.notes}")
Error Reference
| Error | When | Resolution |
|---|---|---|
| ValueError: Query cannot be empty | route_query("") called | Validate query before routing |
| ValueError: Group 'X' already registered | register_group() called with duplicate name | Use update_group() to replace |
| ValueError: Route 'X' already exists | add_route() called with duplicate name | Use update_route() to replace |
| ValueError: Route name cannot be empty | Route/Group constructed with empty name | Provide a non-empty name |
| LookupError: No route matched | No route meets threshold AND raise_on_no_match=True | Lower fallback_confidence_threshold or set a fallback handler |
| ImportError: sentence-transformers required | Library not installed | pip install sentence-transformers |
| ImportError: openai required | LLM scorer configured but openai not installed | pip install openai or remove llm_model from config |
| Route returns success=False | Handler raised an exception | Check result.response.error for details; configure max_retries |
| Low match rate | Examples are too sparse or too generic | Add more diverse examples; tune high_confidence_threshold |
| Cache eviction | max_size reached | Increase CacheConfig.max_size or reduce ttl_seconds |