Fennec Logo Fennec
Fennec Community community/prompt.md

Prompt Modular


Table of Contents

  1. Overview
  2. Architecture
  3. Quick Start
  4. Data Types & Enumerations
  5. PromptEngine
  6. ContextManager
  7. GuardrailEngine
  8. GuardrailLibrary
  9. PromptOptimizer
  10. BuiltPrompt Methods
  11. ContextResult Properties
  12. Strategy System
  13. PromptMetrics Methods
  14. Integration Examples
  15. Error Reference

Overview

The prompt module is a production-grade, AI-ready prompt orchestration engine designed for Retrieval-Augmented Generation (RAG) systems. It transforms raw user queries and retrieved documents into highly optimized, context-aware prompts that can be fed directly into any LLM API (OpenAI, Anthropic, or any compatible provider).

Key Capabilities

Capability Description
Auto-detection Automatically infers prompt type, strategy, and complexity from the query and documents
Context Engineering Intelligently orders, deduplicates, and token-budget-manages retrieved documents
Guardrails Injects anti-hallucination, citation, safety, and scope-control instructions
Optimization Reduces token count via whitespace normalization, filler removal, and deduplication
Caching Hash-based prompt cache with configurable TTL and size
Observability Full metrics, per-prompt trace log, and event hook system
Adaptive Feedback Records quality signals and recommends the historically best-performing strategy
Multi-LLM Output Produces OpenAI-compatible and Anthropic-compatible payloads from the same prompt

Architecture

PromptEngine  ← primary public entry point
    │
    ├── PromptBuilder         ← resolves strategy, coordinates subsystems
    │       ├── ContextManager    ← document processing & token budgeting
    │       ├── GuardrailEngine   ← safety & quality instruction injection
    │       └── Strategy (7)      ← prompt template construction
    │               SimpleStrategy
    │               ChainOfThoughtStrategy
    │               MultiHopStrategy
    │               SelfConsistentStrategy
    │               StepBackStrategy
    │               ReActStrategy
    │               LeastToMostStrategy
    │
    └── PromptOptimizer       ← token reduction post-build

Data flow:
query + documents + configPromptEngine.build()BuiltPrompt → LLM API


Quick Start

from fennec_community.prompt import PromptEngine, Document

# 1. Create the engine (production defaults)
engine = PromptEngine()

# 2. Build a prompt
prompt = engine.build(
    query     = "What caused the 2008 financial crisis?",
    documents = [
        {"content": "The crisis was triggered by...", "source": "wiki", "score": 0.92},
        {"content": "Subprime mortgage lending...",   "source": "fed_report", "score": 0.87},
    ],
    strategy      = "multi_hop",
    output_format = "json",
)

# 3a. Use with OpenAI
response = openai_client.chat.completions.create(
    model    = "gpt-4o",
    messages = prompt.to_messages(),
)

# 3b. Use with Anthropic
payload  = prompt.to_anthropic()
response = anthropic_client.messages.create(
    **payload,
    model      = "claude-opus-4-20250514",
    max_tokens = 1024,
)

Data Types & Enumerations

Enumerations

All enumerations inherit from str, Enum, so their values can be passed as plain strings wherever an enum is expected.


PromptType

Defines the canonical archetype of the prompt being built. The engine uses this to select the default strategy and configure guardrails.

Value String Description
PromptType.QA "qa" Grounded question-answering from context
PromptType.CONVERSATIONAL "conversational" Multi-turn dialogue
PromptType.REASONING "reasoning" Chain-of-thought / multi-hop reasoning
PromptType.AGENT "agent" ReAct / plan-and-execute agentic tasks
PromptType.TOOL_USE "tool_use" Function-calling / tool description
PromptType.SAFETY "safety" Content moderation / safety guard
PromptType.SUMMARIZATION "summarization" Document summarization
PromptType.EXTRACTION "extraction" Structured data extraction
PromptType.COMPARISON "comparison" Compare / contrast multiple documents

PromptStrategy

Controls the reasoning template applied to the prompt. Different strategies produce structurally different prompts that guide the LLM's reasoning approach.

Value String Best For
PromptStrategy.SIMPLE "simple" Single-fact lookup, direct Q&A
PromptStrategy.CHAIN_OF_THOUGHT "cot" Reasoning tasks, complex explanations
PromptStrategy.MULTI_HOP "multi_hop" Multi-document, multi-step questions
PromptStrategy.SELF_CONSISTENT "self_consistent" High-stakes answers requiring verification
PromptStrategy.STEP_BACK "step_back" Abstract-first reasoning
PromptStrategy.REACT "react" Agentic tasks with tool calls
PromptStrategy.LEAST_TO_MOST "least_to_most" Math, logic, progressive sub-problems

OutputFormat

Specifies the format in which the LLM should return its answer. The engine injects the appropriate formatting instruction into the guardrail block.

Value String Description
OutputFormat.TEXT "text" Free-form plain text (default)
OutputFormat.JSON "json" Structured JSON with answer, sources, confidence
OutputFormat.MARKDOWN "markdown" Markdown with headers and bullets
OutputFormat.BULLET_LIST "bullet_list" Concise bulleted list
OutputFormat.STRUCTURED "structured" Domain-specific schema (requires output_schema)
OutputFormat.CITATION "citation" Prose with inline [1] citations and Sources section

QueryComplexity

Signals the complexity level of the query. Affects strategy selection (complex queries auto-upgrade to CoT or Multi-Hop) and guardrail selection (complex/expert adds a self-check guardrail).

Value String Description
QueryComplexity.SIMPLE "simple" Single-fact lookup
QueryComplexity.MODERATE "moderate" Some reasoning required
QueryComplexity.COMPLEX "complex" Multi-hop or multi-document
QueryComplexity.EXPERT "expert" Deep domain knowledge required

UserProfile

Tailors the tone and language style of the system prompt to the target audience.

Value String Style Applied
UserProfile.GENERAL "general" Clear, accessible, jargon-free
UserProfile.TECHNICAL "technical" Precise technical language with full details
UserProfile.ACADEMIC "academic" Formal academic tone with rigorous evidence
UserProfile.EXECUTIVE "executive" Extremely concise, business impact first

Document

Module: prompt.types

A dataclass representing a single retrieved passage from your vector store or retrieval system.

@dataclass
class Document:
    content:   str
    source:    str             = ""
    score:     float           = 1.0
    metadata:  Dict[str, Any]  = field(default_factory=dict)
    chunk_id:  Optional[str]   = None
    language:  str             = "en"
Field Type Default Description
content str required The text content of the passage
source str "" Source identifier (URL, filename, document ID) used in citations
score float 1.0 Relevance score from retrieval system (higher = more relevant)
metadata Dict[str, Any] {} Arbitrary key-value metadata (page number, author, date, etc.)
chunk_id Optional[str] None Unique identifier for the chunk within a document
language str "en" Language code of the document content

Usage:

doc = Document(
    content  = "The Federal Reserve raised interest rates...",
    source   = "fed_report_2023.pdf",
    score    = 0.94,
    metadata = {"page": 12, "author": "Federal Reserve"},
    chunk_id = "fed_2023_p12_chunk_3",
)

Note: Document objects can also be created implicitly by PromptEngine.build() when you pass plain dict or str objects in the documents list.


Message

Module: prompt.types

A dataclass representing a single turn in a conversation history.

@dataclass
class Message:
    role:    Literal["system", "user", "assistant"]
    content: str
Field Type Description
role Literal["system", "user", "assistant"] The speaker role
content str The text of the message

Usage:

history = [
    Message(role="user",      content="What is inflation?"),
    Message(role="assistant", content="Inflation is the rate at which..."),
    Message(role="user",      content="And what causes it?"),
]

PromptRequest

Module: prompt.types

A dataclass that encapsulates all configuration and context the engine needs to build a prompt. This is the internal canonical request object; most users pass arguments directly to PromptEngine.build() instead.

@dataclass
class PromptRequest:
    # Required
    query: str

    # Context
    documents:  List[Document]   = []
    memory:     List[Message]    = []

    # Intent / routing
    prompt_type:   PromptType      = PromptType.QA
    strategy:      PromptStrategy  = PromptStrategy.SIMPLE
    output_format: OutputFormat    = OutputFormat.TEXT
    complexity:    QueryComplexity = QueryComplexity.SIMPLE
    user_profile:  UserProfile     = UserProfile.GENERAL

    # Constraints
    max_context_tokens: int          = 3000
    max_answer_tokens:  int          = 512
    output_schema:      Optional[Dict] = None
    language:           str          = "en"

    # Feature flags
    enable_guardrails:  bool = True
    enable_cot:         bool = False
    enable_citations:   bool = True
    enable_uncertainty: bool = True

    # Metadata
    session_id: str          = ""
    user_id:    str          = ""
    trace_id:   str          = ""
    extra:      Dict[str, Any] = {}
Field Type Default Description
query str required The user's question or task
documents List[Document] [] Retrieved passages for context
memory List[Message] [] Conversation history
prompt_type PromptType QA Prompt archetype
strategy PromptStrategy SIMPLE Reasoning strategy
output_format OutputFormat TEXT Desired response format
complexity QueryComplexity SIMPLE Query complexity level
user_profile UserProfile GENERAL Target audience
max_context_tokens int 3000 Token budget for injected context
max_answer_tokens int 512 Hint to the LLM about expected answer length
output_schema Optional[Dict] None JSON schema for STRUCTURED output format
language str "en" Response language code
enable_guardrails bool True Inject grounding/safety instructions
enable_cot bool False Force chain-of-thought (auto-set by strategy)
enable_citations bool True Request inline source citations
enable_uncertainty bool True Ask model to express uncertainty honestly
session_id str "" Session identifier for tracing
user_id str "" User identifier for logging
trace_id str "" Trace identifier for distributed tracing
extra Dict[str, Any] {} Arbitrary extension data (e.g., tools list for agents)

BuiltPrompt

Module: prompt.types

The output of the entire build pipeline. Contains the fully assembled prompt, token accounting, guardrail metadata, and optimization notes. This object is ready to be passed directly to any LLM client.

@dataclass
class BuiltPrompt:
    system_prompt: str
    user_prompt:   str
    messages:      List[Message]

    prompt_type:   PromptType
    strategy:      PromptStrategy
    output_format: OutputFormat

    estimated_tokens:    int
    context_tokens_used: int
    documents_included:  int
    documents_truncated: int

    guardrails_applied:  List[str]
    tokens_saved:        int
    optimization_notes:  List[str]

    session_id: str
    trace_id:   str
Field Type Description
system_prompt str The assembled system prompt
user_prompt str The assembled user prompt with context and question
messages List[Message] Full message list (system + history + user)
prompt_type PromptType The effective prompt type used
strategy PromptStrategy The effective strategy used
output_format OutputFormat The output format enforced
estimated_tokens int Approximate total token count of the prompt
context_tokens_used int Tokens consumed by the context block specifically
documents_included int Number of documents successfully injected
documents_truncated int Number of documents excluded due to token budget
guardrails_applied List[str] Names of all guardrails injected (for observability)
tokens_saved int Tokens saved by the optimizer
optimization_notes List[str] Human-readable notes from the optimizer pipeline
session_id str Echo of the request session ID
trace_id str Echo of the request trace ID

ContextResult

Module: prompt.context_manager

The output of ContextManager.build(). Contains the ready-to-inject context block and rich metadata about what was included, excluded, and deduplicated.

Field Type Description
context_block str The fully formatted, ready-to-inject context string
included_docs List[Document] Documents that fit within the token budget
excluded_docs List[Document] Documents excluded due to token budget
citation_map Dict[int, str] Maps citation index (e.g., 1) to source identifier
tokens_used int Token count of the assembled context block
tokens_budget int The token budget that was enforced
duplicates_removed int Number of near-duplicate documents removed
truncated bool Whether any document was partially truncated to fit

PromptMetrics

Module: prompt.prompt_engine

A dataclass that accumulates engine-wide performance statistics across all calls to build(). Accessed via the engine.metrics property.

Field Type Description
total_builds int Total number of prompts built
total_tokens int Cumulative token count across all builds
total_tokens_saved int Cumulative tokens saved by the optimizer
cache_hits int Number of times the cache was successfully hit
builds_by_type Dict[str, int] Count of builds broken down by PromptType
builds_by_strategy Dict[str, int] Count of builds broken down by PromptStrategy
avg_build_ms float Rolling average build time in milliseconds

FeedbackEntry

Module: prompt.prompt_engine

A dataclass that stores a single quality feedback signal for a previously built prompt, used by the adaptive feedback loop.

Field Type Description
trace_id str The trace ID of the prompt this feedback refers to
prompt_type str The prompt type of the rated prompt
strategy str The strategy used for the rated prompt
quality_score float Quality rating from 0.0 (bad) to 1.0 (perfect)
notes str Optional free-text notes about the quality

Guardrail

Module: prompt.guardrails

A dataclass defining a single guardrail instruction that can be injected into the system prompt.

Field Type Default Description
name str required Unique identifier (used for observability and deduplication)
instruction str required The actual instruction text injected into the prompt
priority int 50 Injection order — higher priority instructions appear first

PromptEngine

Module: prompt.prompt_engine
Import: from fennec_community.prompt import PromptEngine

The primary entry point for the entire system. Coordinates all subsystems, manages caching, collects metrics, and exposes the adaptive feedback loop.


PromptEngine Constructor

PromptEngine(
    context_manager:  Optional[ContextManager]  = None,
    guardrail_engine: Optional[GuardrailEngine] = None,
    extra_guardrails: Optional[List[Guardrail]]  = None,
    enable_cache:       bool = True,
    cache_ttl_sec:      int  = 300,
    max_cache_size:     int  = 256,
    enable_auto_detect: bool = True,
    memory_store:  Optional[Any] = None,
    cache_store:   Optional[Any] = None,
    router:        Optional[Any] = None,
)

Purpose: Instantiates the engine and all its subsystems. All parameters are optional — the defaults are production-ready.

Parameter Type Default Description
context_manager Optional[ContextManager] None Custom context manager instance. Uses default ContextManager() if not provided
guardrail_engine Optional[GuardrailEngine] None Custom guardrail engine. Uses default GuardrailEngine() if not provided
extra_guardrails Optional[List[Guardrail]] None Additional custom Guardrail objects appended to every request
enable_cache bool True Enable in-process SHA-256 hash-based prompt caching
cache_ttl_sec int 300 Cache time-to-live in seconds (5 minutes by default)
max_cache_size int 256 Maximum number of cached prompts; oldest is evicted when exceeded
enable_auto_detect bool True Auto-detect prompt type, strategy, and complexity from query content
memory_store Optional[Any] None External memory store handle (passed through for integration)
cache_store Optional[Any] None External cache store handle (passed through for integration)
router Optional[Any] None External router handle (passed through for integration)

Example:

# Default — production-ready
engine = PromptEngine()

# Custom — add a domain-specific guardrail, disable cache
from fennec_community.prompt import PromptEngine, Guardrail

medical_guardrail = Guardrail(
    name        = "medical_disclaimer",
    instruction = "Always recommend consulting a licensed physician. "
                  "Do not provide specific medical diagnoses.",
    priority    = 120,
)

engine = PromptEngine(
    extra_guardrails = [medical_guardrail],
    enable_cache     = False,
)

build()

engine.build(
    query:              str,
    documents:          Optional[List[Union[Document, Dict, str]]] = None,
    memory:             Optional[List[Union[Message, Dict]]]       = None,
    prompt_type:        Union[PromptType,    str] = PromptType.QA,
    strategy:           Union[PromptStrategy, str] = PromptStrategy.SIMPLE,
    output_format:      Union[OutputFormat,  str] = OutputFormat.TEXT,
    complexity:         Union[QueryComplexity, str] = QueryComplexity.SIMPLE,
    user_profile:       Union[UserProfile,   str] = UserProfile.GENERAL,
    max_context_tokens: int            = 3000,
    max_answer_tokens:  int            = 512,
    output_schema:      Optional[Dict] = None,
    language:           str            = "en",
    enable_guardrails:  bool           = True,
    enable_citations:   bool           = True,
    enable_uncertainty: bool           = True,
    session_id:         str            = "",
    user_id:            str            = "",
    trace_id:           str            = "",
    extra:              Optional[Dict] = None,
) -> BuiltPrompt

Purpose: The core method of the entire system. Accepts a query and supporting context, runs the full build pipeline (auto-detection → context engineering → guardrails → strategy → optimization → caching), and returns a BuiltPrompt ready for any LLM API.

Parameters:

Parameter Type Default Description
query str required The user's question or task. This is the only mandatory argument
documents List[Document | Dict | str] None Retrieved passages. Accepts Document objects, plain dicts ({"content": ..., "source": ..., "score": ...}), or raw strings
memory List[Message | Dict] None Conversation history. Accepts Message objects or dicts ({"role": ..., "content": ...})
prompt_type PromptType | str "qa" The prompt archetype. Can be overridden by auto-detection
strategy PromptStrategy | str "simple" The reasoning strategy. Can be overridden by auto-detection and complexity upgrade
output_format OutputFormat | str "text" The desired LLM response format
complexity QueryComplexity | str "simple" Query complexity. Affects strategy selection and guardrails
user_profile UserProfile | str "general" Target audience profile. Adjusts system prompt tone
max_context_tokens int 3000 Hard token budget for injected document context
max_answer_tokens int 512 Instructs the LLM about expected answer length
output_schema Optional[Dict] None JSON Schema dict — required when output_format="structured"
language str "en" BCP-47 language code for the response (e.g., "ar", "fr", "de")
enable_guardrails bool True When True, injects grounding, no-fabrication, PII-protection, and scope guardrails
enable_citations bool True When True and documents are provided, adds citation instruction
enable_uncertainty bool True When True, instructs the model to say "I don't know" rather than guess
session_id str "" Session identifier, echoed in BuiltPrompt and trace log
user_id str "" User identifier for logging purposes
trace_id str "" Distributed trace ID, echoed in BuiltPrompt and trace log
extra Optional[Dict] None Extension payload. Use extra={"tools": [...]} for agent/tool-use prompts

Returns: BuiltPrompt — the fully assembled, optimized prompt object.

Raises: ValueError — if an invalid string value is passed for an enum parameter (e.g., strategy="invalid_strategy").

Build Pipeline (internal order):

  1. Normalize inputs — convert dicts/strings to typed objects
  2. Auto-detect — infer prompt_type, strategy, complexity from query content (if enable_auto_detect=True)
  3. Cache check — return cached BuiltPrompt if a matching prompt was recently built
  4. PromptBuilder.build() — run the full build pipeline
  5. Cache store — save the result for future identical requests
  6. Metrics — update PromptMetrics counters
  7. Trace log — append a trace entry
  8. Fire hooks — emit the "prompt.built" event

Example — minimal:

prompt = engine.build(query="What is RAG?")

Example — full configuration:

prompt = engine.build(
    query         = "Compare the economic impacts of COVID-19 in the US vs EU.",
    documents     = retrieved_docs,
    memory        = chat_history,
    prompt_type   = "comparison",
    strategy      = "multi_hop",
    output_format = "markdown",
    complexity    = "complex",
    user_profile  = "executive",
    max_context_tokens = 4000,
    max_answer_tokens  = 1024,
    language      = "en",
    enable_citations   = True,
    session_id    = "sess_abc123",
    trace_id      = "trace_xyz789",
)

Example — structured output with schema:

schema = {
    "type": "object",
    "properties": {
        "summary":    {"type": "string"},
        "key_points": {"type": "array", "items": {"type": "string"}},
        "sources":    {"type": "array", "items": {"type": "string"}},
    }
}

prompt = engine.build(
    query         = "Summarize the key findings.",
    documents     = docs,
    output_format = "structured",
    output_schema = schema,
)

Auto-Detection Rules (when enable_auto_detect=True):

Signal Result
Query contains "summarize" / "overview" prompt_typeSUMMARIZATION
Query contains "compare" / "versus" / "vs" prompt_typeCOMPARISON
Query contains "extract" / "list all" prompt_typeEXTRACTION
Query contains "why" / "how" / "explain" prompt_typeREASONING
Query > 40 words OR > 5 documents complexityCOMPLEX
Query > 20 words OR > 2 documents complexityMODERATE
Query has "and" / "also" / "furthermore" AND multiple docs strategyMULTI_HOP

build_from_request()

engine.build_from_request(request: PromptRequest) -> BuiltPrompt

Purpose: Builds a prompt from a pre-constructed PromptRequest object instead of individual keyword arguments. Useful when you need to construct, serialize, or batch requests programmatically.

Parameter Type Description
request PromptRequest A fully populated PromptRequest dataclass instance

Returns: BuiltPrompt

Example:

from fennec_community.prompt import PromptRequest, PromptType, PromptStrategy

request = PromptRequest(
    query        = "What are the side effects of ibuprofen?",
    documents    = my_docs,
    prompt_type  = PromptType.QA,
    strategy     = PromptStrategy.CHAIN_OF_THOUGHT,
    user_profile = UserProfile.TECHNICAL,
)

prompt = engine.build_from_request(request)

record_feedback()

engine.record_feedback(
    trace_id:      str,
    quality_score: float,
    notes:         str = "",
) -> None

Purpose: Records a quality signal for a previously built prompt. Feedback is stored in an in-memory circular buffer (max 1000 entries) and used by adaptive_strategy_for() to recommend the best strategy for future similar prompts.

Parameter Type Description
trace_id str The trace_id of the prompt being rated (from BuiltPrompt.trace_id)
quality_score float Quality score from 0.0 (completely wrong / unhelpful) to 1.0 (perfect)
notes str Optional human-readable notes (e.g., "Answer was too verbose")

Returns: None

Example:

# After the LLM response is reviewed
engine.record_feedback(
    trace_id      = prompt.trace_id,
    quality_score = 0.85,
    notes         = "Good answer but missed one key point.",
)

adaptive_strategy_for()

engine.adaptive_strategy_for(prompt_type: PromptType) -> Optional[PromptStrategy]

Purpose: Analyses accumulated feedback to recommend the historically best-performing strategy for a given PromptType. Returns None if fewer than 5 feedback entries exist for that type (insufficient data). Use this to automatically select the strategy that your users have rated highest over time.

Parameter Type Description
prompt_type PromptType The prompt type to query the feedback history for

Returns: Optional[PromptStrategy] — the best-performing strategy, or None if data is insufficient.

Example:

best = engine.adaptive_strategy_for(PromptType.QA)
if best:
    prompt = engine.build(query=user_query, strategy=best)
else:
    prompt = engine.build(query=user_query)  # use defaults

metrics (property)

engine.metrics -> Dict[str, Any]

Purpose: Returns a snapshot of all engine-wide performance metrics as a plain dictionary, suitable for logging, dashboarding, or alerting.

Returns: Dict[str, Any] with the following keys:

Key Type Description
total_builds int Total number of build() calls
total_tokens int Cumulative token usage
total_tokens_saved int Cumulative tokens saved by optimizer
cache_hits int Total cache hits
avg_build_ms float Rolling average build latency in milliseconds
cache_hit_rate_pct float Cache hit percentage (0.0100.0)
builds_by_type Dict[str, int] Build count per PromptType
builds_by_strategy Dict[str, int] Build count per PromptStrategy

Example:

import json
print(json.dumps(engine.metrics, indent=2))
# {
#   "total_builds": 142,
#   "total_tokens": 284000,
#   "total_tokens_saved": 12400,
#   "cache_hits": 38,
#   "avg_build_ms": 4.72,
#   "cache_hit_rate_pct": 26.8,
#   "builds_by_type": {"qa": 90, "reasoning": 42, "summarization": 10},
#   "builds_by_strategy": {"simple": 60, "cot": 52, "multi_hop": 30}
# }

get_trace_log()

engine.get_trace_log(last_n: int = 20) -> List[Dict[str, Any]]

Purpose: Returns the most recent trace entries from the internal trace log. Each entry captures the full context of a single build() call — inputs, outputs, token counts, latency, and guardrails applied. Useful for debugging and observability dashboards.

Parameter Type Default Description
last_n int 20 Number of most recent trace entries to return

Returns: List[Dict[str, Any]] — each dict contains:

Key Description
trace_id The trace identifier
session_id The session identifier
query_preview First 80 characters of the query
prompt_type Effective prompt type used
strategy Effective strategy used
output_format Output format used
docs_included Number of documents included
docs_truncated Number of documents excluded
estimated_tokens Total estimated token count
tokens_saved Tokens saved by optimizer
guardrails List of guardrail names applied
elapsed_ms Build time in milliseconds
ts Unix timestamp of the build

Example:

traces = engine.get_trace_log(last_n=5)
for t in traces:
    print(f"{t['trace_id']} | {t['prompt_type']} | {t['elapsed_ms']}ms | tokens={t['estimated_tokens']}")

reset_metrics()

engine.reset_metrics() -> None

Purpose: Resets all accumulated PromptMetrics counters to zero. Useful for periodic metric resets in long-running services (e.g., reset at the start of each hour for per-hour dashboards).

Returns: None

Example:

# Reset every hour in a scheduled job
engine.reset_metrics()

on()

engine.on(event: str, callback: Callable) -> None

Purpose: Registers an event hook that is called whenever the specified event is fired. This is the primary extensibility mechanism — use hooks to integrate with external monitoring systems, logging pipelines, or custom business logic without modifying the engine.

Parameter Type Description
event str The event name to subscribe to (currently: "prompt.built")
callback Callable A callable invoked with keyword arguments when the event fires

Returns: None

Available Events:

Event Fired When Callback kwargs
"prompt.built" After every successful build() call prompt: BuiltPrompt, request: PromptRequest

Example:

def log_to_datadog(prompt: BuiltPrompt, request: PromptRequest):
    datadog.metric("prompt.tokens", prompt.estimated_tokens, tags=[
        f"type:{prompt.prompt_type.value}",
        f"strategy:{prompt.strategy.value}",
    ])

engine.on("prompt.built", log_to_datadog)

ContextManager

Module: prompt.context_manager
Import: from fennec_community.prompt import ContextManager, ContextResult

Transforms a raw list of retrieved documents into an optimally ordered, deduplicated, token-budget-aware context block ready for injection. Used internally by PromptBuilder but can also be used standalone.


ContextManager Constructor

ContextManager(
    dedup_threshold:     float = 0.85,
    min_doc_tokens:      int   = 5,
    use_lost_in_middle:  bool  = True,
    summarize_overflow:  bool  = False,
    max_memory_messages: int   = 10,
)

Purpose: Configures the context engineering pipeline.

Parameter Type Default Description
dedup_threshold float 0.85 Jaccard similarity threshold above which two documents are considered duplicates and the lower-scoring one is removed
min_doc_tokens int 5 Documents with fewer estimated tokens than this are silently skipped
use_lost_in_middle bool True Reorders documents to place the most relevant at the start and end of the context block, combating the "lost-in-the-middle" attention problem
summarize_overflow bool False If True, documents that don't fit the token budget are summarized instead of excluded (stub — not yet implemented)
max_memory_messages int 10 Maximum number of conversation turns to include in the memory block

ContextManager.build()

context_manager.build(request: PromptRequest) -> ContextResult

Purpose: Runs the full context engineering pipeline on the documents in request.documents and returns a ContextResult with the formatted, ready-to-inject context block.

Pipeline steps (internal order):

  1. Filter documents shorter than min_doc_tokens
  2. Sort by relevance score (descending)
  3. Deduplicate using exact hash + Jaccard shingle similarity
  4. Reorder for lost-in-the-middle mitigation (if enabled)
  5. Enforce token budget — partially truncate documents that overflow
  6. Format the context block with numbered source headers and relevance labels
  7. Build the citation map ({1: "source_a", 2: "source_b", ...})
Parameter Type Description
request PromptRequest The full prompt request (uses request.documents and request.max_context_tokens)

Returns: ContextResult

Example (standalone usage):

from fennec_community.prompt import ContextManager, Document, PromptRequest

cm = ContextManager(dedup_threshold=0.80, use_lost_in_middle=True)

request = PromptRequest(query="What is inflation?", documents=my_docs)
result  = cm.build(request)

print(f"Context utilization: {result.utilization_pct}%")
print(f"Documents included:  {len(result.included_docs)}")
print(f"Duplicates removed:  {result.duplicates_removed}")
print(result.context_block)

format_memory()

context_manager.format_memory(
    memory:    List[Message],
    max_turns: Optional[int] = None,
) -> str

Purpose: Converts a list of Message objects (conversation history) into a compact, formatted text block for injection into the prompt. Limits history to the most recent max_turns turns to control token usage.

Parameter Type Default Description
memory List[Message] required The full conversation history
max_turns Optional[int] None Maximum number of conversation turns to include. Defaults to max_memory_messages set in the constructor

Returns: str — formatted conversation history, or an empty string if memory is empty.

Output format:

User: What is inflation?
Assistant: Inflation is the rate at which...
User: And what causes it?

Example:

memory_block = cm.format_memory(memory=chat_history, max_turns=5)

GuardrailEngine

Module: prompt.guardrails
Import: from fennec_community.prompt import GuardrailEngine, Guardrail

Selects and assembles safety and quality instructions that are injected into the system prompt. Guardrails are applied before generation, not as post-processing filters.


GuardrailEngine Constructor

GuardrailEngine(extra_guardrails: Optional[List[Guardrail]] = None)

Purpose: Creates a guardrail engine. Optionally accepts custom guardrails that will be appended to every request in addition to the automatically selected standard guardrails.

Parameter Type Default Description
extra_guardrails Optional[List[Guardrail]] None Custom Guardrail objects always appended to the guardrail block

GuardrailEngine.build()

guardrail_engine.build(request: PromptRequest) -> tuple[str, List[str]]

Purpose: Selects all applicable guardrails for the given request, sorts them by priority, deduplicates by name, and renders them into a single formatted instruction block.

Parameter Type Description
request PromptRequest The prompt request (used to determine which guardrails apply)

Returns: tuple[str, List[str]]

  • [0] — The rendered guardrail instruction block (injected into the system prompt)
  • [1] — List of applied guardrail names (for observability, stored in BuiltPrompt.guardrails_applied)

Guardrail Selection Logic:

Condition Guardrails Applied
Always safe_output, concise
enable_guardrails=True AND documents present grounding, no_fabrication
enable_uncertainty=True uncertainty
enable_citations=True AND documents present cite_sources
Prompt type is not AGENT or TOOL_USE stay_on_topic
enable_guardrails=True pii_protection
Strategy is COT, MULTI_HOP, or LEAST_TO_MOST show_reasoning
Complexity is COMPLEX or EXPERT self_check
OutputFormat.JSON JSON format instruction
OutputFormat.BULLET_LIST Bullet list format instruction
OutputFormat.MARKDOWN Markdown format instruction
OutputFormat.CITATION Citation format instruction
OutputFormat.STRUCTURED Schema-based format instruction

GuardrailLibrary

Module: prompt.guardrails
Import: from fennec_community.prompt import GuardrailLibrary

A catalogue of pre-built guardrail objects. All guardrails are class-level attributes (singletons). Use these when constructing custom GuardrailEngine instances or passing extra_guardrails to PromptEngine.

Attribute Name Priority Purpose
GuardrailLibrary.SAFE_OUTPUT safe_output 110 Blocks harmful, offensive, or discriminatory outputs
GuardrailLibrary.PII_PROTECTION pii_protection 105 Prevents exposure of personal identifiable information
GuardrailLibrary.GROUNDING grounding 100 Forces answers to stay within provided context only
GuardrailLibrary.NO_FABRICATION no_fabrication 95 Prohibits invented facts, statistics, or citations
GuardrailLibrary.UNCERTAINTY uncertainty 90 Requires honest "I don't know" responses when unsure
GuardrailLibrary.CITE_SOURCES cite_sources 80 Requires bracketed [1] inline citations
GuardrailLibrary.STAY_ON_TOPIC stay_on_topic 70 Prevents scope drift and unsolicited opinions
GuardrailLibrary.NO_PERSONAL_OPINIONS no_personal_opinions 60 Prevents editorializing
GuardrailLibrary.SHOW_REASONING show_reasoning 50 Requires step-by-step reasoning before answer
GuardrailLibrary.SELF_CHECK self_check 45 Adds a 3-point self-verification step before answering
GuardrailLibrary.CONCISE concise 40 Strips preamble filler and gets to the point
GuardrailLibrary.NO_MARKDOWN_LEAKAGE no_markdown_leakage 30 Prevents unsolicited markdown formatting

Example — create a custom guardrail:

from fennec_community.prompt import Guardrail, GuardrailLibrary

legal_guardrail = Guardrail(
    name        = "legal_disclaimer",
    instruction = "This is not legal advice. Always recommend consulting a qualified attorney.",
    priority    = 115,  # higher than safe_output, applied first
)

engine = PromptEngine(extra_guardrails=[legal_guardrail])

PromptOptimizer

Module: prompt.optimizer
Import: from fennec_community.prompt import PromptOptimizer

Applies a pipeline of lightweight, deterministic token-reduction optimizations to the assembled system and user prompts. Runs automatically inside every strategy's build() method.


PromptOptimizer Constructor

PromptOptimizer(
    max_total_tokens:  int  = 6000,
    enable_filler:     bool = True,
    enable_dedup:      bool = True,
    enable_whitespace: bool = True,
)

Purpose: Configures the optimization pipeline.

Parameter Type Default Description
max_total_tokens int 6000 Hard total token cap. If system + user tokens exceed this, the user prompt is truncated at a paragraph boundary
enable_filler bool True Strip common LLM padding phrases ("Certainly!", "Great question!", etc.)
enable_dedup bool True Remove instruction paragraphs from the user prompt that already appear verbatim in the system prompt
enable_whitespace bool True Collapse multiple spaces and excessive blank lines

optimize()

optimizer.optimize(
    system:  str,
    user:    str,
    request: Optional[object] = None,
) -> Tuple[str, str, int, List[str]]

Purpose: Applies all enabled optimization passes to the system and user prompt strings. The optimization pipeline runs in this order: whitespace normalization → filler removal → instruction deduplication → hard token cap.

Parameter Type Default Description
system str required The assembled system prompt text
user str required The assembled user prompt text
request Optional[object] None The original PromptRequest (reserved for future use)

Returns: Tuple[str, str, int, List[str]]

  • [0] — Optimized system prompt
  • [1] — Optimized user prompt
  • [2] — Number of tokens saved (0 if none)
  • [3] — List of human-readable optimization notes (e.g., ["whitespace-normalized", "filler-stripped", "dedup-removed-2-paragraphs"])

Example (standalone usage):

from fennec_community.prompt import PromptOptimizer

optimizer = PromptOptimizer(max_total_tokens=4000)

system_opt, user_opt, saved, notes = optimizer.optimize(
    system = my_system_prompt,
    user   = my_user_prompt,
)

print(f"Saved {saved} tokens via: {notes}")

BuiltPrompt Methods

These are public methods and properties on the BuiltPrompt object returned by engine.build().


to_messages()

built_prompt.to_messages() -> List[Dict[str, str]]

Purpose: Serializes the full message list (system + conversation history + user) into the OpenAI Chat Completions API format — a list of {"role": ..., "content": ...} dicts.

Returns: List[Dict[str, str]]

Example:

response = openai_client.chat.completions.create(
    model    = "gpt-4o",
    messages = prompt.to_messages(),
)

to_anthropic()

built_prompt.to_anthropic() -> Dict[str, Any]

Purpose: Serializes the prompt into the Anthropic Messages API format — a dict with a "system" key (string) and a "messages" key (list of non-system messages). Can be unpacked directly as **kwargs into anthropic_client.messages.create().

Returns: Dict[str, Any] with keys:

  • "system" — the system prompt string
  • "messages" — list of {"role": ..., "content": ...} dicts (excludes system messages)

Example:

payload = prompt.to_anthropic()
response = anthropic_client.messages.create(
    **payload,
    model      = "claude-opus-4-20250514",
    max_tokens = 1024,
)

full_text (property)

built_prompt.full_text -> str

Purpose: Returns the system and user prompts combined as a single plain-text string, prefixed with [SYSTEM] and [USER] section headers. Useful for debugging, logging, or human review of the assembled prompt.

Returns: str

Example:

print(prompt.full_text)
# [SYSTEM]
# You are an expert AI assistant...
#
# [USER]
# ## Context
# --- Source [1] wiki (relevance: 0.92) ---
# ...

ContextResult Properties


utilization_pct (property)

context_result.utilization_pct -> float

Purpose: Returns the percentage of the token budget consumed by the assembled context block. Useful for monitoring how efficiently the document context is using the available token budget.

Returns: float — value between 0.0 and 100.0+ (can exceed 100 if truncation occurred).

Example:

result = cm.build(request)
print(f"Token budget utilization: {result.utilization_pct}%")
# Token budget utilization: 84.3%

Strategy System

The strategy system provides 7 built-in prompt construction templates. Strategies are selected automatically (via auto-detection and complexity upgrade) or specified explicitly via strategy= in engine.build().


get_strategy()

get_strategy(strategy: PromptStrategy) -> BaseStrategy

Module: prompt.strategies
Import: from fennec_community.prompt import get_strategy

Purpose: Retrieves the singleton strategy implementation for the given PromptStrategy enum value. Falls back to SimpleStrategy if the strategy is not registered (with a warning log). Primarily used internally by PromptBuilder, but available for advanced use cases.

Parameter Type Description
strategy PromptStrategy The strategy enum value to look up

Returns: BaseStrategy — the strategy implementation object.

Example:

from fennec_community.prompt import get_strategy, PromptStrategy

impl = get_strategy(PromptStrategy.CHAIN_OF_THOUGHT)

STRATEGY_REGISTRY

STRATEGY_REGISTRY: Dict[PromptStrategy, BaseStrategy]

Module: prompt.strategies
Import: from fennec_community.prompt import STRATEGY_REGISTRY

Purpose: The dictionary mapping every PromptStrategy enum value to its singleton implementation. Use this to inspect available strategies or to register custom strategy implementations.

Strategy Key Implementation Class Best For
PromptStrategy.SIMPLE SimpleStrategy Direct Q&A, factual lookup
PromptStrategy.CHAIN_OF_THOUGHT ChainOfThoughtStrategy Reasoning, explanation
PromptStrategy.MULTI_HOP MultiHopStrategy Multi-document, multi-step
PromptStrategy.SELF_CONSISTENT SelfConsistentStrategy High-stakes verification
PromptStrategy.STEP_BACK StepBackStrategy Abstract-first reasoning
PromptStrategy.REACT ReActStrategy Agentic tool-use
PromptStrategy.LEAST_TO_MOST LeastToMostStrategy Math, logic, progressive decomposition

Example — register a custom strategy:

from fennec_community.prompt import STRATEGY_REGISTRY, PromptStrategy
from fennec_community.prompt.strategies import BaseStrategy

class MyCustomStrategy(BaseStrategy):
    STRATEGY = PromptStrategy.SIMPLE  # override an existing slot

    def _build_system(self, req, guardrail_block): ...
    def _build_user(self, req, context_block, memory_block): ...

STRATEGY_REGISTRY[PromptStrategy.SIMPLE] = MyCustomStrategy()

PromptMetrics Methods


to_dict()

metrics_obj.to_dict() -> Dict[str, Any]

Purpose: Serializes the PromptMetrics dataclass into a plain Python dictionary, suitable for JSON serialization, logging, or dashboarding. This is what engine.metrics (the property) calls internally.

Returns: Dict[str, Any] — see the metrics property section for the full key reference.

Example:

import json

# Access via engine property (recommended)
print(json.dumps(engine.metrics, indent=2))

Integration Examples

OpenAI

from fennec_community.prompt import PromptEngine
import openai

engine = PromptEngine()
client = openai.OpenAI(api_key="...")

prompt = engine.build(
    query     = "What is the capital of France?",
    documents = [{"content": "France is a country in Europe. Its capital is Paris.", "source": "geo_db"}],
)

response = client.chat.completions.create(
    model    = "gpt-4o",
    messages = prompt.to_messages(),
)
print(response.choices[0].message.content)

Anthropic

from fennec_community.prompt import PromptEngine
import anthropic

engine = PromptEngine()
client = anthropic.Anthropic(api_key="...")

prompt = engine.build(
    query         = "Summarize the quarterly earnings report.",
    documents     = retrieved_docs,
    strategy      = "chain_of_thought",
    output_format = "markdown",
    user_profile  = "executive",
)

response = client.messages.create(
    **prompt.to_anthropic(),
    model      = "claude-opus-4-20250514",
    max_tokens = 2048,
)
print(response.content[0].text)

Multi-turn Conversation

from fennec_community.prompt import PromptEngine, Message

engine  = PromptEngine()
history = []

def chat(user_message: str, docs=None) -> str:
    prompt = engine.build(
        query     = user_message,
        documents = docs or [],
        memory    = history,
        prompt_type = "conversational",
    )

    # Call your LLM here...
    answer = llm_call(prompt.to_messages())

    # Update history
    history.append(Message(role="user",      content=user_message))
    history.append(Message(role="assistant", content=answer))

    return answer

Agentic Tool-Use

tools = [
    {
        "name":        "search_database",
        "description": "Search the company knowledge base.",
        "parameters":  {"query": "string", "top_k": "int"},
    },
    {
        "name":        "get_document",
        "description": "Retrieve a specific document by ID.",
        "parameters":  {"doc_id": "string"},
    },
]

prompt = engine.build(
    query       = "Find all invoices from Q3 2024 and calculate the total.",
    prompt_type = "agent",
    strategy    = "react",
    extra       = {"tools": tools},
)

Custom Guardrails + Observability Hook

from fennec_community.prompt import PromptEngine, Guardrail, BuiltPrompt, PromptRequest

# Custom guardrail
disclaimer = Guardrail(
    name        = "financial_disclaimer",
    instruction = "This is not financial advice. Past performance is not indicative of future results.",
    priority    = 115,
)

engine = PromptEngine(extra_guardrails=[disclaimer])

# Hook into every build for custom logging
def on_prompt_built(prompt: BuiltPrompt, request: PromptRequest):
    print(f"[{request.trace_id}] Built {prompt.prompt_type.value} | "
          f"{prompt.estimated_tokens} tokens | {prompt.tokens_saved} saved | "
          f"guardrails={prompt.guardrails_applied}")

engine.on("prompt.built", on_prompt_built)

Adaptive Strategy Selection

# After collecting feedback over time:
engine.record_feedback(trace_id="abc", quality_score=0.9)
engine.record_feedback(trace_id="def", quality_score=0.6)
# ... at least 5 feedback entries ...

best_strategy = engine.adaptive_strategy_for(PromptType.QA)

prompt = engine.build(
    query    = "What is the return policy?",
    strategy = best_strategy or PromptStrategy.SIMPLE,
)

Error Reference

Error When Resolution
ValueError Invalid string passed for an enum parameter (e.g., strategy="unknown") Use a valid PromptStrategy, PromptType, OutputFormat, QueryComplexity, or UserProfile value
KeyError get_strategy() called with unregistered strategy Engine falls back to SimpleStrategy with a warning log — not a hard error
Token budget exceeded Documents larger than max_context_tokens Documents are partially truncated at word boundaries; BuiltPrompt.documents_truncated > 0 signals this
Cache eviction max_cache_size reached Oldest entry is evicted automatically (LRU-like behaviour)
Hook error Exception inside an on() callback Logged as a warning; does not propagate or interrupt the build

Source: community/prompt.md