Fennec Community community/prompt.md

Prompt Modular

Overview
Architecture
Quick Start
Data Types & Enumerations
PromptEngine
ContextManager
GuardrailEngine
- Constructor
- build()
GuardrailLibrary
PromptOptimizer
- Constructor
- optimize()
BuiltPrompt Methods
ContextResult Properties
- utilization_pct (property)
Strategy System
- get_strategy()
- STRATEGY_REGISTRY
PromptMetrics Methods
- to_dict()
Integration Examples
Error Reference

Overview

The prompt module is a production-grade, AI-ready prompt orchestration engine designed for Retrieval-Augmented Generation (RAG) systems. It transforms raw user queries and retrieved documents into highly optimized, context-aware prompts that can be fed directly into any LLM API (OpenAI, Anthropic, or any compatible provider).

Key Capabilities

Capability	Description
Auto-detection	Automatically infers prompt type, strategy, and complexity from the query and documents
Context Engineering	Intelligently orders, deduplicates, and token-budget-manages retrieved documents
Guardrails	Injects anti-hallucination, citation, safety, and scope-control instructions
Optimization	Reduces token count via whitespace normalization, filler removal, and deduplication
Caching	Hash-based prompt cache with configurable TTL and size
Observability	Full metrics, per-prompt trace log, and event hook system
Adaptive Feedback	Records quality signals and recommends the historically best-performing strategy
Multi-LLM Output	Produces OpenAI-compatible and Anthropic-compatible payloads from the same prompt

Architecture

PromptEngine  ← primary public entry point
    │
    ├── PromptBuilder         ← resolves strategy, coordinates subsystems
    │       ├── ContextManager    ← document processing & token budgeting
    │       ├── GuardrailEngine   ← safety & quality instruction injection
    │       └── Strategy (7)      ← prompt template construction
    │               SimpleStrategy
    │               ChainOfThoughtStrategy
    │               MultiHopStrategy
    │               SelfConsistentStrategy
    │               StepBackStrategy
    │               ReActStrategy
    │               LeastToMostStrategy
    │
    └── PromptOptimizer       ← token reduction post-build

Data flow:
query + documents + config → PromptEngine.build() → BuiltPrompt → LLM API

Quick Start

from fennec_community.prompt import PromptEngine, Document

# 1. Create the engine (production defaults)
engine = PromptEngine()

# 2. Build a prompt
prompt = engine.build(
    query     = "What caused the 2008 financial crisis?",
    documents = [
        {"content": "The crisis was triggered by...", "source": "wiki", "score": 0.92},
        {"content": "Subprime mortgage lending...",   "source": "fed_report", "score": 0.87},
    ],
    strategy      = "multi_hop",
    output_format = "json",
)

# 3a. Use with OpenAI
response = openai_client.chat.completions.create(
    model    = "gpt-4o",
    messages = prompt.to_messages(),
)

# 3b. Use with Anthropic
payload  = prompt.to_anthropic()
response = anthropic_client.messages.create(
    **payload,
    model      = "claude-opus-4-20250514",
    max_tokens = 1024,
)

Data Types & Enumerations

Enumerations

All enumerations inherit from str, Enum, so their values can be passed as plain strings wherever an enum is expected.

`PromptType`

Defines the canonical archetype of the prompt being built. The engine uses this to select the default strategy and configure guardrails.

Value	String	Description
`PromptType.QA`	`"qa"`	Grounded question-answering from context
`PromptType.CONVERSATIONAL`	`"conversational"`	Multi-turn dialogue
`PromptType.REASONING`	`"reasoning"`	Chain-of-thought / multi-hop reasoning
`PromptType.AGENT`	`"agent"`	ReAct / plan-and-execute agentic tasks
`PromptType.TOOL_USE`	`"tool_use"`	Function-calling / tool description
`PromptType.SAFETY`	`"safety"`	Content moderation / safety guard
`PromptType.SUMMARIZATION`	`"summarization"`	Document summarization
`PromptType.EXTRACTION`	`"extraction"`	Structured data extraction
`PromptType.COMPARISON`	`"comparison"`	Compare / contrast multiple documents

`PromptStrategy`

Controls the reasoning template applied to the prompt. Different strategies produce structurally different prompts that guide the LLM's reasoning approach.

Value	String	Best For
`PromptStrategy.SIMPLE`	`"simple"`	Single-fact lookup, direct Q&A
`PromptStrategy.CHAIN_OF_THOUGHT`	`"cot"`	Reasoning tasks, complex explanations
`PromptStrategy.MULTI_HOP`	`"multi_hop"`	Multi-document, multi-step questions
`PromptStrategy.SELF_CONSISTENT`	`"self_consistent"`	High-stakes answers requiring verification
`PromptStrategy.STEP_BACK`	`"step_back"`	Abstract-first reasoning
`PromptStrategy.REACT`	`"react"`	Agentic tasks with tool calls
`PromptStrategy.LEAST_TO_MOST`	`"least_to_most"`	Math, logic, progressive sub-problems

`OutputFormat`

Specifies the format in which the LLM should return its answer. The engine injects the appropriate formatting instruction into the guardrail block.

Value	String	Description
`OutputFormat.TEXT`	`"text"`	Free-form plain text (default)
`OutputFormat.JSON`	`"json"`	Structured JSON with answer, sources, confidence
`OutputFormat.MARKDOWN`	`"markdown"`	Markdown with headers and bullets
`OutputFormat.BULLET_LIST`	`"bullet_list"`	Concise bulleted list
`OutputFormat.STRUCTURED`	`"structured"`	Domain-specific schema (requires `output_schema`)
`OutputFormat.CITATION`	`"citation"`	Prose with inline `[1]` citations and Sources section

`QueryComplexity`

Signals the complexity level of the query. Affects strategy selection (complex queries auto-upgrade to CoT or Multi-Hop) and guardrail selection (complex/expert adds a self-check guardrail).

Value	String	Description
`QueryComplexity.SIMPLE`	`"simple"`	Single-fact lookup
`QueryComplexity.MODERATE`	`"moderate"`	Some reasoning required
`QueryComplexity.COMPLEX`	`"complex"`	Multi-hop or multi-document
`QueryComplexity.EXPERT`	`"expert"`	Deep domain knowledge required

`UserProfile`

Tailors the tone and language style of the system prompt to the target audience.

Value	String	Style Applied
`UserProfile.GENERAL`	`"general"`	Clear, accessible, jargon-free
`UserProfile.TECHNICAL`	`"technical"`	Precise technical language with full details
`UserProfile.ACADEMIC`	`"academic"`	Formal academic tone with rigorous evidence
`UserProfile.EXECUTIVE`	`"executive"`	Extremely concise, business impact first

`Document`

Module: prompt.types

A dataclass representing a single retrieved passage from your vector store or retrieval system.

@dataclass
class Document:
    content:   str
    source:    str             = ""
    score:     float           = 1.0
    metadata:  Dict[str, Any]  = field(default_factory=dict)
    chunk_id:  Optional[str]   = None
    language:  str             = "en"

Field	Type	Default	Description
`content`	`str`	required	The text content of the passage
`source`	`str`	`""`	Source identifier (URL, filename, document ID) used in citations
`score`	`float`	`1.0`	Relevance score from retrieval system (higher = more relevant)
`metadata`	`Dict[str, Any]`	`{}`	Arbitrary key-value metadata (page number, author, date, etc.)
`chunk_id`	`Optional[str]`	`None`	Unique identifier for the chunk within a document
`language`	`str`	`"en"`	Language code of the document content

Usage:

doc = Document(
    content  = "The Federal Reserve raised interest rates...",
    source   = "fed_report_2023.pdf",
    score    = 0.94,
    metadata = {"page": 12, "author": "Federal Reserve"},
    chunk_id = "fed_2023_p12_chunk_3",
)

Note: Document objects can also be created implicitly by PromptEngine.build() when you pass plain dict or str objects in the documents list.

`Message`

Module: prompt.types

A dataclass representing a single turn in a conversation history.

@dataclass
class Message:
    role:    Literal["system", "user", "assistant"]
    content: str

Field	Type	Description
`role`	`Literal["system", "user", "assistant"]`	The speaker role
`content`	`str`	The text of the message

Usage:

history = [
    Message(role="user",      content="What is inflation?"),
    Message(role="assistant", content="Inflation is the rate at which..."),
    Message(role="user",      content="And what causes it?"),
]

`PromptRequest`

Module: prompt.types

A dataclass that encapsulates all configuration and context the engine needs to build a prompt. This is the internal canonical request object; most users pass arguments directly to PromptEngine.build() instead.

@dataclass
class PromptRequest:
    # Required
    query: str

    # Context
    documents:  List[Document]   = []
    memory:     List[Message]    = []

    # Intent / routing
    prompt_type:   PromptType      = PromptType.QA
    strategy:      PromptStrategy  = PromptStrategy.SIMPLE
    output_format: OutputFormat    = OutputFormat.TEXT
    complexity:    QueryComplexity = QueryComplexity.SIMPLE
    user_profile:  UserProfile     = UserProfile.GENERAL

    # Constraints
    max_context_tokens: int          = 3000
    max_answer_tokens:  int          = 512
    output_schema:      Optional[Dict] = None
    language:           str          = "en"

    # Feature flags
    enable_guardrails:  bool = True
    enable_cot:         bool = False
    enable_citations:   bool = True
    enable_uncertainty: bool = True

    # Metadata
    session_id: str          = ""
    user_id:    str          = ""
    trace_id:   str          = ""
    extra:      Dict[str, Any] = {}

Field	Type	Default	Description
`query`	`str`	required	The user's question or task
`documents`	`List[Document]`	`[]`	Retrieved passages for context
`memory`	`List[Message]`	`[]`	Conversation history
`prompt_type`	`PromptType`	`QA`	Prompt archetype
`strategy`	`PromptStrategy`	`SIMPLE`	Reasoning strategy
`output_format`	`OutputFormat`	`TEXT`	Desired response format
`complexity`	`QueryComplexity`	`SIMPLE`	Query complexity level
`user_profile`	`UserProfile`	`GENERAL`	Target audience
`max_context_tokens`	`int`	`3000`	Token budget for injected context
`max_answer_tokens`	`int`	`512`	Hint to the LLM about expected answer length
`output_schema`	`Optional[Dict]`	`None`	JSON schema for `STRUCTURED` output format
`language`	`str`	`"en"`	Response language code
`enable_guardrails`	`bool`	`True`	Inject grounding/safety instructions
`enable_cot`	`bool`	`False`	Force chain-of-thought (auto-set by strategy)
`enable_citations`	`bool`	`True`	Request inline source citations
`enable_uncertainty`	`bool`	`True`	Ask model to express uncertainty honestly
`session_id`	`str`	`""`	Session identifier for tracing
`user_id`	`str`	`""`	User identifier for logging
`trace_id`	`str`	`""`	Trace identifier for distributed tracing
`extra`	`Dict[str, Any]`	`{}`	Arbitrary extension data (e.g., `tools` list for agents)

`BuiltPrompt`

Module: prompt.types

The output of the entire build pipeline. Contains the fully assembled prompt, token accounting, guardrail metadata, and optimization notes. This object is ready to be passed directly to any LLM client.

@dataclass
class BuiltPrompt:
    system_prompt: str
    user_prompt:   str
    messages:      List[Message]

    prompt_type:   PromptType
    strategy:      PromptStrategy
    output_format: OutputFormat

    estimated_tokens:    int
    context_tokens_used: int
    documents_included:  int
    documents_truncated: int

    guardrails_applied:  List[str]
    tokens_saved:        int
    optimization_notes:  List[str]

    session_id: str
    trace_id:   str

Field	Type	Description
`system_prompt`	`str`	The assembled system prompt
`user_prompt`	`str`	The assembled user prompt with context and question
`messages`	`List[Message]`	Full message list (system + history + user)
`prompt_type`	`PromptType`	The effective prompt type used
`strategy`	`PromptStrategy`	The effective strategy used
`output_format`	`OutputFormat`	The output format enforced
`estimated_tokens`	`int`	Approximate total token count of the prompt
`context_tokens_used`	`int`	Tokens consumed by the context block specifically
`documents_included`	`int`	Number of documents successfully injected
`documents_truncated`	`int`	Number of documents excluded due to token budget
`guardrails_applied`	`List[str]`	Names of all guardrails injected (for observability)
`tokens_saved`	`int`	Tokens saved by the optimizer
`optimization_notes`	`List[str]`	Human-readable notes from the optimizer pipeline
`session_id`	`str`	Echo of the request session ID
`trace_id`	`str`	Echo of the request trace ID

`ContextResult`

Module: prompt.context_manager

The output of ContextManager.build(). Contains the ready-to-inject context block and rich metadata about what was included, excluded, and deduplicated.

Field	Type	Description
`context_block`	`str`	The fully formatted, ready-to-inject context string
`included_docs`	`List[Document]`	Documents that fit within the token budget
`excluded_docs`	`List[Document]`	Documents excluded due to token budget
`citation_map`	`Dict[int, str]`	Maps citation index (e.g., `1`) to source identifier
`tokens_used`	`int`	Token count of the assembled context block
`tokens_budget`	`int`	The token budget that was enforced
`duplicates_removed`	`int`	Number of near-duplicate documents removed
`truncated`	`bool`	Whether any document was partially truncated to fit

`PromptMetrics`

Module: prompt.prompt_engine

A dataclass that accumulates engine-wide performance statistics across all calls to build(). Accessed via the engine.metrics property.

Field	Type	Description
`total_builds`	`int`	Total number of prompts built
`total_tokens`	`int`	Cumulative token count across all builds
`total_tokens_saved`	`int`	Cumulative tokens saved by the optimizer
`cache_hits`	`int`	Number of times the cache was successfully hit
`builds_by_type`	`Dict[str, int]`	Count of builds broken down by `PromptType`
`builds_by_strategy`	`Dict[str, int]`	Count of builds broken down by `PromptStrategy`
`avg_build_ms`	`float`	Rolling average build time in milliseconds

`FeedbackEntry`

Module: prompt.prompt_engine

A dataclass that stores a single quality feedback signal for a previously built prompt, used by the adaptive feedback loop.

Field	Type	Description
`trace_id`	`str`	The trace ID of the prompt this feedback refers to
`prompt_type`	`str`	The prompt type of the rated prompt
`strategy`	`str`	The strategy used for the rated prompt
`quality_score`	`float`	Quality rating from `0.0` (bad) to `1.0` (perfect)
`notes`	`str`	Optional free-text notes about the quality

`Guardrail`

Module: prompt.guardrails

A dataclass defining a single guardrail instruction that can be injected into the system prompt.

Field	Type	Default	Description
`name`	`str`	required	Unique identifier (used for observability and deduplication)
`instruction`	`str`	required	The actual instruction text injected into the prompt
`priority`	`int`	`50`	Injection order — higher priority instructions appear first

`PromptEngine`

Module: prompt.prompt_engine
Import: from fennec_community.prompt import PromptEngine

The primary entry point for the entire system. Coordinates all subsystems, manages caching, collects metrics, and exposes the adaptive feedback loop.

`PromptEngine` Constructor

PromptEngine(
    context_manager:  Optional[ContextManager]  = None,
    guardrail_engine: Optional[GuardrailEngine] = None,
    extra_guardrails: Optional[List[Guardrail]]  = None,
    enable_cache:       bool = True,
    cache_ttl_sec:      int  = 300,
    max_cache_size:     int  = 256,
    enable_auto_detect: bool = True,
    memory_store:  Optional[Any] = None,
    cache_store:   Optional[Any] = None,
    router:        Optional[Any] = None,
)

Purpose: Instantiates the engine and all its subsystems. All parameters are optional — the defaults are production-ready.

Parameter	Type	Default	Description
`context_manager`	`Optional[ContextManager]`	`None`	Custom context manager instance. Uses default `ContextManager()` if not provided
`guardrail_engine`	`Optional[GuardrailEngine]`	`None`	Custom guardrail engine. Uses default `GuardrailEngine()` if not provided
`extra_guardrails`	`Optional[List[Guardrail]]`	`None`	Additional custom `Guardrail` objects appended to every request
`enable_cache`	`bool`	`True`	Enable in-process SHA-256 hash-based prompt caching
`cache_ttl_sec`	`int`	`300`	Cache time-to-live in seconds (5 minutes by default)
`max_cache_size`	`int`	`256`	Maximum number of cached prompts; oldest is evicted when exceeded
`enable_auto_detect`	`bool`	`True`	Auto-detect prompt type, strategy, and complexity from query content
`memory_store`	`Optional[Any]`	`None`	External memory store handle (passed through for integration)
`cache_store`	`Optional[Any]`	`None`	External cache store handle (passed through for integration)
`router`	`Optional[Any]`	`None`	External router handle (passed through for integration)

Example:

# Default — production-ready
engine = PromptEngine()

# Custom — add a domain-specific guardrail, disable cache
from fennec_community.prompt import PromptEngine, Guardrail

medical_guardrail = Guardrail(
    name        = "medical_disclaimer",
    instruction = "Always recommend consulting a licensed physician. "
                  "Do not provide specific medical diagnoses.",
    priority    = 120,
)

engine = PromptEngine(
    extra_guardrails = [medical_guardrail],
    enable_cache     = False,
)

`build()`

engine.build(
    query:              str,
    documents:          Optional[List[Union[Document, Dict, str]]] = None,
    memory:             Optional[List[Union[Message, Dict]]]       = None,
    prompt_type:        Union[PromptType,    str] = PromptType.QA,
    strategy:           Union[PromptStrategy, str] = PromptStrategy.SIMPLE,
    output_format:      Union[OutputFormat,  str] = OutputFormat.TEXT,
    complexity:         Union[QueryComplexity, str] = QueryComplexity.SIMPLE,
    user_profile:       Union[UserProfile,   str] = UserProfile.GENERAL,
    max_context_tokens: int            = 3000,
    max_answer_tokens:  int            = 512,
    output_schema:      Optional[Dict] = None,
    language:           str            = "en",
    enable_guardrails:  bool           = True,
    enable_citations:   bool           = True,
    enable_uncertainty: bool           = True,
    session_id:         str            = "",
    user_id:            str            = "",
    trace_id:           str            = "",
    extra:              Optional[Dict] = None,
) -> BuiltPrompt

Purpose: The core method of the entire system. Accepts a query and supporting context, runs the full build pipeline (auto-detection → context engineering → guardrails → strategy → optimization → caching), and returns a BuiltPrompt ready for any LLM API.

Parameters:

Parameter	Type	Default	Description
`query`	`str`	required	The user's question or task. This is the only mandatory argument
`documents`	`List[Document \| Dict \| str]`	`None`	Retrieved passages. Accepts `Document` objects, plain dicts (`{"content": ..., "source": ..., "score": ...}`), or raw strings
`memory`	`List[Message \| Dict]`	`None`	Conversation history. Accepts `Message` objects or dicts (`{"role": ..., "content": ...}`)
`prompt_type`	`PromptType \| str`	`"qa"`	The prompt archetype. Can be overridden by auto-detection
`strategy`	`PromptStrategy \| str`	`"simple"`	The reasoning strategy. Can be overridden by auto-detection and complexity upgrade
`output_format`	`OutputFormat \| str`	`"text"`	The desired LLM response format
`complexity`	`QueryComplexity \| str`	`"simple"`	Query complexity. Affects strategy selection and guardrails
`user_profile`	`UserProfile \| str`	`"general"`	Target audience profile. Adjusts system prompt tone
`max_context_tokens`	`int`	`3000`	Hard token budget for injected document context
`max_answer_tokens`	`int`	`512`	Instructs the LLM about expected answer length
`output_schema`	`Optional[Dict]`	`None`	JSON Schema dict — required when `output_format="structured"`
`language`	`str`	`"en"`	BCP-47 language code for the response (e.g., `"ar"`, `"fr"`, `"de"`)
`enable_guardrails`	`bool`	`True`	When `True`, injects grounding, no-fabrication, PII-protection, and scope guardrails
`enable_citations`	`bool`	`True`	When `True` and documents are provided, adds citation instruction
`enable_uncertainty`	`bool`	`True`	When `True`, instructs the model to say "I don't know" rather than guess
`session_id`	`str`	`""`	Session identifier, echoed in `BuiltPrompt` and trace log
`user_id`	`str`	`""`	User identifier for logging purposes
`trace_id`	`str`	`""`	Distributed trace ID, echoed in `BuiltPrompt` and trace log
`extra`	`Optional[Dict]`	`None`	Extension payload. Use `extra={"tools": [...]}` for agent/tool-use prompts

Returns: BuiltPrompt — the fully assembled, optimized prompt object.

Raises: ValueError — if an invalid string value is passed for an enum parameter (e.g., strategy="invalid_strategy").

Build Pipeline (internal order):

Normalize inputs — convert dicts/strings to typed objects
Auto-detect — infer prompt_type, strategy, complexity from query content (if enable_auto_detect=True)
Cache check — return cached BuiltPrompt if a matching prompt was recently built
PromptBuilder.build() — run the full build pipeline
Cache store — save the result for future identical requests
Metrics — update PromptMetrics counters
Trace log — append a trace entry
Fire hooks — emit the "prompt.built" event

Example — minimal:

prompt = engine.build(query="What is RAG?")

Example — full configuration:

prompt = engine.build(
    query         = "Compare the economic impacts of COVID-19 in the US vs EU.",
    documents     = retrieved_docs,
    memory        = chat_history,
    prompt_type   = "comparison",
    strategy      = "multi_hop",
    output_format = "markdown",
    complexity    = "complex",
    user_profile  = "executive",
    max_context_tokens = 4000,
    max_answer_tokens  = 1024,
    language      = "en",
    enable_citations   = True,
    session_id    = "sess_abc123",
    trace_id      = "trace_xyz789",
)

Example — structured output with schema:

schema = {
    "type": "object",
    "properties": {
        "summary":    {"type": "string"},
        "key_points": {"type": "array", "items": {"type": "string"}},
        "sources":    {"type": "array", "items": {"type": "string"}},
    }
}

prompt = engine.build(
    query         = "Summarize the key findings.",
    documents     = docs,
    output_format = "structured",
    output_schema = schema,
)

Auto-Detection Rules (when enable_auto_detect=True):

Signal	Result
Query contains "summarize" / "overview"	`prompt_type` → `SUMMARIZATION`
Query contains "compare" / "versus" / "vs"	`prompt_type` → `COMPARISON`
Query contains "extract" / "list all"	`prompt_type` → `EXTRACTION`
Query contains "why" / "how" / "explain"	`prompt_type` → `REASONING`
Query > 40 words OR > 5 documents	`complexity` → `COMPLEX`
Query > 20 words OR > 2 documents	`complexity` → `MODERATE`
Query has "and" / "also" / "furthermore" AND multiple docs	`strategy` → `MULTI_HOP`

`build_from_request()`

engine.build_from_request(request: PromptRequest) -> BuiltPrompt

Purpose: Builds a prompt from a pre-constructed PromptRequest object instead of individual keyword arguments. Useful when you need to construct, serialize, or batch requests programmatically.

Parameter	Type	Description
`request`	`PromptRequest`	A fully populated `PromptRequest` dataclass instance

Returns: BuiltPrompt

Example:

from fennec_community.prompt import PromptRequest, PromptType, PromptStrategy

request = PromptRequest(
    query        = "What are the side effects of ibuprofen?",
    documents    = my_docs,
    prompt_type  = PromptType.QA,
    strategy     = PromptStrategy.CHAIN_OF_THOUGHT,
    user_profile = UserProfile.TECHNICAL,
)

prompt = engine.build_from_request(request)

`record_feedback()`

engine.record_feedback(
    trace_id:      str,
    quality_score: float,
    notes:         str = "",
) -> None

Purpose: Records a quality signal for a previously built prompt. Feedback is stored in an in-memory circular buffer (max 1000 entries) and used by adaptive_strategy_for() to recommend the best strategy for future similar prompts.

Parameter	Type	Description
`trace_id`	`str`	The `trace_id` of the prompt being rated (from `BuiltPrompt.trace_id`)
`quality_score`	`float`	Quality score from `0.0` (completely wrong / unhelpful) to `1.0` (perfect)
`notes`	`str`	Optional human-readable notes (e.g., `"Answer was too verbose"`)

Returns: None

Example:

# After the LLM response is reviewed
engine.record_feedback(
    trace_id      = prompt.trace_id,
    quality_score = 0.85,
    notes         = "Good answer but missed one key point.",
)

`adaptive_strategy_for()`

engine.adaptive_strategy_for(prompt_type: PromptType) -> Optional[PromptStrategy]

Purpose: Analyses accumulated feedback to recommend the historically best-performing strategy for a given PromptType. Returns None if fewer than 5 feedback entries exist for that type (insufficient data). Use this to automatically select the strategy that your users have rated highest over time.

Parameter	Type	Description
`prompt_type`	`PromptType`	The prompt type to query the feedback history for

Returns: Optional[PromptStrategy] — the best-performing strategy, or None if data is insufficient.

Example:

best = engine.adaptive_strategy_for(PromptType.QA)
if best:
    prompt = engine.build(query=user_query, strategy=best)
else:
    prompt = engine.build(query=user_query)  # use defaults

`metrics` (property)

engine.metrics -> Dict[str, Any]

Purpose: Returns a snapshot of all engine-wide performance metrics as a plain dictionary, suitable for logging, dashboarding, or alerting.

Returns: Dict[str, Any] with the following keys:

Key	Type	Description
`total_builds`	`int`	Total number of `build()` calls
`total_tokens`	`int`	Cumulative token usage
`total_tokens_saved`	`int`	Cumulative tokens saved by optimizer
`cache_hits`	`int`	Total cache hits
`avg_build_ms`	`float`	Rolling average build latency in milliseconds
`cache_hit_rate_pct`	`float`	Cache hit percentage (`0.0` – `100.0`)
`builds_by_type`	`Dict[str, int]`	Build count per `PromptType`
`builds_by_strategy`	`Dict[str, int]`	Build count per `PromptStrategy`

Example:

import json
print(json.dumps(engine.metrics, indent=2))
# {
#   "total_builds": 142,
#   "total_tokens": 284000,
#   "total_tokens_saved": 12400,
#   "cache_hits": 38,
#   "avg_build_ms": 4.72,
#   "cache_hit_rate_pct": 26.8,
#   "builds_by_type": {"qa": 90, "reasoning": 42, "summarization": 10},
#   "builds_by_strategy": {"simple": 60, "cot": 52, "multi_hop": 30}
# }

`get_trace_log()`

engine.get_trace_log(last_n: int = 20) -> List[Dict[str, Any]]

Purpose: Returns the most recent trace entries from the internal trace log. Each entry captures the full context of a single build() call — inputs, outputs, token counts, latency, and guardrails applied. Useful for debugging and observability dashboards.

Parameter	Type	Default	Description
`last_n`	`int`	`20`	Number of most recent trace entries to return

Returns: List[Dict[str, Any]] — each dict contains:

Key	Description
`trace_id`	The trace identifier
`session_id`	The session identifier
`query_preview`	First 80 characters of the query
`prompt_type`	Effective prompt type used
`strategy`	Effective strategy used
`output_format`	Output format used
`docs_included`	Number of documents included
`docs_truncated`	Number of documents excluded
`estimated_tokens`	Total estimated token count
`tokens_saved`	Tokens saved by optimizer
`guardrails`	List of guardrail names applied
`elapsed_ms`	Build time in milliseconds
`ts`	Unix timestamp of the build

Example:

traces = engine.get_trace_log(last_n=5)
for t in traces:
    print(f"{t['trace_id']} | {t['prompt_type']} | {t['elapsed_ms']}ms | tokens={t['estimated_tokens']}")

`reset_metrics()`

engine.reset_metrics() -> None

Purpose: Resets all accumulated PromptMetrics counters to zero. Useful for periodic metric resets in long-running services (e.g., reset at the start of each hour for per-hour dashboards).

Returns: None

Example:

# Reset every hour in a scheduled job
engine.reset_metrics()

`on()`

engine.on(event: str, callback: Callable) -> None

Purpose: Registers an event hook that is called whenever the specified event is fired. This is the primary extensibility mechanism — use hooks to integrate with external monitoring systems, logging pipelines, or custom business logic without modifying the engine.

Parameter	Type	Description
`event`	`str`	The event name to subscribe to (currently: `"prompt.built"`)
`callback`	`Callable`	A callable invoked with keyword arguments when the event fires

Returns: None

Available Events:

Event	Fired When	Callback kwargs
`"prompt.built"`	After every successful `build()` call	`prompt: BuiltPrompt`, `request: PromptRequest`

Example:

def log_to_datadog(prompt: BuiltPrompt, request: PromptRequest):
    datadog.metric("prompt.tokens", prompt.estimated_tokens, tags=[
        f"type:{prompt.prompt_type.value}",
        f"strategy:{prompt.strategy.value}",
    ])

engine.on("prompt.built", log_to_datadog)

`ContextManager`

Module: prompt.context_manager
Import: from fennec_community.prompt import ContextManager, ContextResult

Transforms a raw list of retrieved documents into an optimally ordered, deduplicated, token-budget-aware context block ready for injection. Used internally by PromptBuilder but can also be used standalone.

`ContextManager` Constructor

ContextManager(
    dedup_threshold:     float = 0.85,
    min_doc_tokens:      int   = 5,
    use_lost_in_middle:  bool  = True,
    summarize_overflow:  bool  = False,
    max_memory_messages: int   = 10,
)

Purpose: Configures the context engineering pipeline.

Parameter	Type	Default	Description
`dedup_threshold`	`float`	`0.85`	Jaccard similarity threshold above which two documents are considered duplicates and the lower-scoring one is removed
`min_doc_tokens`	`int`	`5`	Documents with fewer estimated tokens than this are silently skipped
`use_lost_in_middle`	`bool`	`True`	Reorders documents to place the most relevant at the start and end of the context block, combating the "lost-in-the-middle" attention problem
`summarize_overflow`	`bool`	`False`	If `True`, documents that don't fit the token budget are summarized instead of excluded (stub — not yet implemented)
`max_memory_messages`	`int`	`10`	Maximum number of conversation turns to include in the memory block

`ContextManager.build()`

context_manager.build(request: PromptRequest) -> ContextResult

Purpose: Runs the full context engineering pipeline on the documents in request.documents and returns a ContextResult with the formatted, ready-to-inject context block.

Pipeline steps (internal order):

Filter documents shorter than min_doc_tokens
Sort by relevance score (descending)
Deduplicate using exact hash + Jaccard shingle similarity
Reorder for lost-in-the-middle mitigation (if enabled)
Enforce token budget — partially truncate documents that overflow
Format the context block with numbered source headers and relevance labels
Build the citation map ({1: "source_a", 2: "source_b", ...})

Parameter	Type	Description
`request`	`PromptRequest`	The full prompt request (uses `request.documents` and `request.max_context_tokens`)

Returns: ContextResult

Example (standalone usage):

from fennec_community.prompt import ContextManager, Document, PromptRequest

cm = ContextManager(dedup_threshold=0.80, use_lost_in_middle=True)

request = PromptRequest(query="What is inflation?", documents=my_docs)
result  = cm.build(request)

print(f"Context utilization: {result.utilization_pct}%")
print(f"Documents included:  {len(result.included_docs)}")
print(f"Duplicates removed:  {result.duplicates_removed}")
print(result.context_block)

`format_memory()`

context_manager.format_memory(
    memory:    List[Message],
    max_turns: Optional[int] = None,
) -> str

Purpose: Converts a list of Message objects (conversation history) into a compact, formatted text block for injection into the prompt. Limits history to the most recent max_turns turns to control token usage.

Parameter	Type	Default	Description
`memory`	`List[Message]`	required	The full conversation history
`max_turns`	`Optional[int]`	`None`	Maximum number of conversation turns to include. Defaults to `max_memory_messages` set in the constructor

Returns: str — formatted conversation history, or an empty string if memory is empty.

Output format:

User: What is inflation?
Assistant: Inflation is the rate at which...
User: And what causes it?

Example:

memory_block = cm.format_memory(memory=chat_history, max_turns=5)

`GuardrailEngine`

Module: prompt.guardrails
Import: from fennec_community.prompt import GuardrailEngine, Guardrail

Selects and assembles safety and quality instructions that are injected into the system prompt. Guardrails are applied before generation, not as post-processing filters.

`GuardrailEngine` Constructor

GuardrailEngine(extra_guardrails: Optional[List[Guardrail]] = None)

Purpose: Creates a guardrail engine. Optionally accepts custom guardrails that will be appended to every request in addition to the automatically selected standard guardrails.

Parameter	Type	Default	Description
`extra_guardrails`	`Optional[List[Guardrail]]`	`None`	Custom `Guardrail` objects always appended to the guardrail block

`GuardrailEngine.build()`

guardrail_engine.build(request: PromptRequest) -> tuple[str, List[str]]

Purpose: Selects all applicable guardrails for the given request, sorts them by priority, deduplicates by name, and renders them into a single formatted instruction block.

Parameter	Type	Description
`request`	`PromptRequest`	The prompt request (used to determine which guardrails apply)

Returns: tuple[str, List[str]]

[0] — The rendered guardrail instruction block (injected into the system prompt)
[1] — List of applied guardrail names (for observability, stored in BuiltPrompt.guardrails_applied)

Guardrail Selection Logic:

Condition	Guardrails Applied
Always	`safe_output`, `concise`
`enable_guardrails=True` AND documents present	`grounding`, `no_fabrication`
`enable_uncertainty=True`	`uncertainty`
`enable_citations=True` AND documents present	`cite_sources`
Prompt type is not `AGENT` or `TOOL_USE`	`stay_on_topic`
`enable_guardrails=True`	`pii_protection`
Strategy is `COT`, `MULTI_HOP`, or `LEAST_TO_MOST`	`show_reasoning`
Complexity is `COMPLEX` or `EXPERT`	`self_check`
`OutputFormat.JSON`	JSON format instruction
`OutputFormat.BULLET_LIST`	Bullet list format instruction
`OutputFormat.MARKDOWN`	Markdown format instruction
`OutputFormat.CITATION`	Citation format instruction
`OutputFormat.STRUCTURED`	Schema-based format instruction

`GuardrailLibrary`

Module: prompt.guardrails
Import: from fennec_community.prompt import GuardrailLibrary

A catalogue of pre-built guardrail objects. All guardrails are class-level attributes (singletons). Use these when constructing custom GuardrailEngine instances or passing extra_guardrails to PromptEngine.

Attribute	Name	Priority	Purpose
`GuardrailLibrary.SAFE_OUTPUT`	`safe_output`	110	Blocks harmful, offensive, or discriminatory outputs
`GuardrailLibrary.PII_PROTECTION`	`pii_protection`	105	Prevents exposure of personal identifiable information
`GuardrailLibrary.GROUNDING`	`grounding`	100	Forces answers to stay within provided context only
`GuardrailLibrary.NO_FABRICATION`	`no_fabrication`	95	Prohibits invented facts, statistics, or citations
`GuardrailLibrary.UNCERTAINTY`	`uncertainty`	90	Requires honest "I don't know" responses when unsure
`GuardrailLibrary.CITE_SOURCES`	`cite_sources`	80	Requires bracketed `[1]` inline citations
`GuardrailLibrary.STAY_ON_TOPIC`	`stay_on_topic`	70	Prevents scope drift and unsolicited opinions
`GuardrailLibrary.NO_PERSONAL_OPINIONS`	`no_personal_opinions`	60	Prevents editorializing
`GuardrailLibrary.SHOW_REASONING`	`show_reasoning`	50	Requires step-by-step reasoning before answer
`GuardrailLibrary.SELF_CHECK`	`self_check`	45	Adds a 3-point self-verification step before answering
`GuardrailLibrary.CONCISE`	`concise`	40	Strips preamble filler and gets to the point
`GuardrailLibrary.NO_MARKDOWN_LEAKAGE`	`no_markdown_leakage`	30	Prevents unsolicited markdown formatting

Example — create a custom guardrail:

from fennec_community.prompt import Guardrail, GuardrailLibrary

legal_guardrail = Guardrail(
    name        = "legal_disclaimer",
    instruction = "This is not legal advice. Always recommend consulting a qualified attorney.",
    priority    = 115,  # higher than safe_output, applied first
)

engine = PromptEngine(extra_guardrails=[legal_guardrail])

`PromptOptimizer`

Module: prompt.optimizer
Import: from fennec_community.prompt import PromptOptimizer

Applies a pipeline of lightweight, deterministic token-reduction optimizations to the assembled system and user prompts. Runs automatically inside every strategy's build() method.

`PromptOptimizer` Constructor

PromptOptimizer(
    max_total_tokens:  int  = 6000,
    enable_filler:     bool = True,
    enable_dedup:      bool = True,
    enable_whitespace: bool = True,
)

Purpose: Configures the optimization pipeline.

Parameter	Type	Default	Description
`max_total_tokens`	`int`	`6000`	Hard total token cap. If system + user tokens exceed this, the user prompt is truncated at a paragraph boundary
`enable_filler`	`bool`	`True`	Strip common LLM padding phrases (`"Certainly!"`, `"Great question!"`, etc.)
`enable_dedup`	`bool`	`True`	Remove instruction paragraphs from the user prompt that already appear verbatim in the system prompt
`enable_whitespace`	`bool`	`True`	Collapse multiple spaces and excessive blank lines

`optimize()`

optimizer.optimize(
    system:  str,
    user:    str,
    request: Optional[object] = None,
) -> Tuple[str, str, int, List[str]]

Purpose: Applies all enabled optimization passes to the system and user prompt strings. The optimization pipeline runs in this order: whitespace normalization → filler removal → instruction deduplication → hard token cap.

Parameter	Type	Default	Description
`system`	`str`	required	The assembled system prompt text
`user`	`str`	required	The assembled user prompt text
`request`	`Optional[object]`	`None`	The original `PromptRequest` (reserved for future use)

Returns: Tuple[str, str, int, List[str]]

[0] — Optimized system prompt
[1] — Optimized user prompt
[2] — Number of tokens saved (0 if none)
[3] — List of human-readable optimization notes (e.g., ["whitespace-normalized", "filler-stripped", "dedup-removed-2-paragraphs"])

Example (standalone usage):

from fennec_community.prompt import PromptOptimizer

optimizer = PromptOptimizer(max_total_tokens=4000)

system_opt, user_opt, saved, notes = optimizer.optimize(
    system = my_system_prompt,
    user   = my_user_prompt,
)

print(f"Saved {saved} tokens via: {notes}")

`BuiltPrompt` Methods

These are public methods and properties on the BuiltPrompt object returned by engine.build().

`to_messages()`

built_prompt.to_messages() -> List[Dict[str, str]]

Purpose: Serializes the full message list (system + conversation history + user) into the OpenAI Chat Completions API format — a list of {"role": ..., "content": ...} dicts.

Returns: List[Dict[str, str]]

Example:

response = openai_client.chat.completions.create(
    model    = "gpt-4o",
    messages = prompt.to_messages(),
)

`to_anthropic()`

built_prompt.to_anthropic() -> Dict[str, Any]

Purpose: Serializes the prompt into the Anthropic Messages API format — a dict with a "system" key (string) and a "messages" key (list of non-system messages). Can be unpacked directly as **kwargs into anthropic_client.messages.create().

Returns: Dict[str, Any] with keys:

"system" — the system prompt string
"messages" — list of {"role": ..., "content": ...} dicts (excludes system messages)

Example:

payload = prompt.to_anthropic()
response = anthropic_client.messages.create(
    **payload,
    model      = "claude-opus-4-20250514",
    max_tokens = 1024,
)

`full_text` (property)

built_prompt.full_text -> str

Purpose: Returns the system and user prompts combined as a single plain-text string, prefixed with [SYSTEM] and [USER] section headers. Useful for debugging, logging, or human review of the assembled prompt.

Returns: str

Example:

print(prompt.full_text)
# [SYSTEM]
# You are an expert AI assistant...
#
# [USER]
# ## Context
# --- Source [1] wiki (relevance: 0.92) ---
# ...

`ContextResult` Properties

`utilization_pct` (property)

context_result.utilization_pct -> float

Purpose: Returns the percentage of the token budget consumed by the assembled context block. Useful for monitoring how efficiently the document context is using the available token budget.

Returns: float — value between 0.0 and 100.0+ (can exceed 100 if truncation occurred).

Example:

result = cm.build(request)
print(f"Token budget utilization: {result.utilization_pct}%")
# Token budget utilization: 84.3%

Strategy System

The strategy system provides 7 built-in prompt construction templates. Strategies are selected automatically (via auto-detection and complexity upgrade) or specified explicitly via strategy= in engine.build().

`get_strategy()`

get_strategy(strategy: PromptStrategy) -> BaseStrategy

Module: prompt.strategies
Import: from fennec_community.prompt import get_strategy

Purpose: Retrieves the singleton strategy implementation for the given PromptStrategy enum value. Falls back to SimpleStrategy if the strategy is not registered (with a warning log). Primarily used internally by PromptBuilder, but available for advanced use cases.

Parameter	Type	Description
`strategy`	`PromptStrategy`	The strategy enum value to look up

Returns: BaseStrategy — the strategy implementation object.

Example:

from fennec_community.prompt import get_strategy, PromptStrategy

impl = get_strategy(PromptStrategy.CHAIN_OF_THOUGHT)

`STRATEGY_REGISTRY`

STRATEGY_REGISTRY: Dict[PromptStrategy, BaseStrategy]

Module: prompt.strategies
Import: from fennec_community.prompt import STRATEGY_REGISTRY

Purpose: The dictionary mapping every PromptStrategy enum value to its singleton implementation. Use this to inspect available strategies or to register custom strategy implementations.

Strategy Key	Implementation Class	Best For
`PromptStrategy.SIMPLE`	`SimpleStrategy`	Direct Q&A, factual lookup
`PromptStrategy.CHAIN_OF_THOUGHT`	`ChainOfThoughtStrategy`	Reasoning, explanation
`PromptStrategy.MULTI_HOP`	`MultiHopStrategy`	Multi-document, multi-step
`PromptStrategy.SELF_CONSISTENT`	`SelfConsistentStrategy`	High-stakes verification
`PromptStrategy.STEP_BACK`	`StepBackStrategy`	Abstract-first reasoning
`PromptStrategy.REACT`	`ReActStrategy`	Agentic tool-use
`PromptStrategy.LEAST_TO_MOST`	`LeastToMostStrategy`	Math, logic, progressive decomposition

Example — register a custom strategy:

from fennec_community.prompt import STRATEGY_REGISTRY, PromptStrategy
from fennec_community.prompt.strategies import BaseStrategy

class MyCustomStrategy(BaseStrategy):
    STRATEGY = PromptStrategy.SIMPLE  # override an existing slot

    def _build_system(self, req, guardrail_block): ...
    def _build_user(self, req, context_block, memory_block): ...

STRATEGY_REGISTRY[PromptStrategy.SIMPLE] = MyCustomStrategy()

`PromptMetrics` Methods

`to_dict()`

metrics_obj.to_dict() -> Dict[str, Any]

Purpose: Serializes the PromptMetrics dataclass into a plain Python dictionary, suitable for JSON serialization, logging, or dashboarding. This is what engine.metrics (the property) calls internally.

Returns: Dict[str, Any] — see the metrics property section for the full key reference.

Example:

import json

# Access via engine property (recommended)
print(json.dumps(engine.metrics, indent=2))

Integration Examples

OpenAI

from fennec_community.prompt import PromptEngine
import openai

engine = PromptEngine()
client = openai.OpenAI(api_key="...")

prompt = engine.build(
    query     = "What is the capital of France?",
    documents = [{"content": "France is a country in Europe. Its capital is Paris.", "source": "geo_db"}],
)

response = client.chat.completions.create(
    model    = "gpt-4o",
    messages = prompt.to_messages(),
)
print(response.choices[0].message.content)

Anthropic

from fennec_community.prompt import PromptEngine
import anthropic

engine = PromptEngine()
client = anthropic.Anthropic(api_key="...")

prompt = engine.build(
    query         = "Summarize the quarterly earnings report.",
    documents     = retrieved_docs,
    strategy      = "chain_of_thought",
    output_format = "markdown",
    user_profile  = "executive",
)

response = client.messages.create(
    **prompt.to_anthropic(),
    model      = "claude-opus-4-20250514",
    max_tokens = 2048,
)
print(response.content[0].text)

Multi-turn Conversation

from fennec_community.prompt import PromptEngine, Message

engine  = PromptEngine()
history = []

def chat(user_message: str, docs=None) -> str:
    prompt = engine.build(
        query     = user_message,
        documents = docs or [],
        memory    = history,
        prompt_type = "conversational",
    )

    # Call your LLM here...
    answer = llm_call(prompt.to_messages())

    # Update history
    history.append(Message(role="user",      content=user_message))
    history.append(Message(role="assistant", content=answer))

    return answer

Agentic Tool-Use

tools = [
    {
        "name":        "search_database",
        "description": "Search the company knowledge base.",
        "parameters":  {"query": "string", "top_k": "int"},
    },
    {
        "name":        "get_document",
        "description": "Retrieve a specific document by ID.",
        "parameters":  {"doc_id": "string"},
    },
]

prompt = engine.build(
    query       = "Find all invoices from Q3 2024 and calculate the total.",
    prompt_type = "agent",
    strategy    = "react",
    extra       = {"tools": tools},
)

Custom Guardrails + Observability Hook

from fennec_community.prompt import PromptEngine, Guardrail, BuiltPrompt, PromptRequest

# Custom guardrail
disclaimer = Guardrail(
    name        = "financial_disclaimer",
    instruction = "This is not financial advice. Past performance is not indicative of future results.",
    priority    = 115,
)

engine = PromptEngine(extra_guardrails=[disclaimer])

# Hook into every build for custom logging
def on_prompt_built(prompt: BuiltPrompt, request: PromptRequest):
    print(f"[{request.trace_id}] Built {prompt.prompt_type.value} | "
          f"{prompt.estimated_tokens} tokens | {prompt.tokens_saved} saved | "
          f"guardrails={prompt.guardrails_applied}")

engine.on("prompt.built", on_prompt_built)

Adaptive Strategy Selection

# After collecting feedback over time:
engine.record_feedback(trace_id="abc", quality_score=0.9)
engine.record_feedback(trace_id="def", quality_score=0.6)
# ... at least 5 feedback entries ...

best_strategy = engine.adaptive_strategy_for(PromptType.QA)

prompt = engine.build(
    query    = "What is the return policy?",
    strategy = best_strategy or PromptStrategy.SIMPLE,
)

Error Reference

Error	When	Resolution
`ValueError`	Invalid string passed for an enum parameter (e.g., `strategy="unknown"`)	Use a valid `PromptStrategy`, `PromptType`, `OutputFormat`, `QueryComplexity`, or `UserProfile` value
`KeyError`	`get_strategy()` called with unregistered strategy	Engine falls back to `SimpleStrategy` with a warning log — not a hard error
Token budget exceeded	Documents larger than `max_context_tokens`	Documents are partially truncated at word boundaries; `BuiltPrompt.documents_truncated > 0` signals this
Cache eviction	`max_cache_size` reached	Oldest entry is evicted automatically (LRU-like behaviour)
Hook error	Exception inside an `on()` callback	Logged as a warning; does not propagate or interrupt the build

Source: community/prompt.md

Table of Contents

Overview

Key Capabilities

Architecture

Quick Start

Data Types & Enumerations

Enumerations

PromptType

PromptStrategy

OutputFormat

QueryComplexity

UserProfile

Document

Message

PromptRequest

BuiltPrompt

ContextResult

PromptMetrics

FeedbackEntry

Guardrail

PromptEngine

PromptEngine Constructor

build()

build_from_request()

record_feedback()

adaptive_strategy_for()

metrics (property)

get_trace_log()

reset_metrics()

on()

ContextManager

ContextManager Constructor

ContextManager.build()

format_memory()

GuardrailEngine

GuardrailEngine Constructor

GuardrailEngine.build()

GuardrailLibrary

PromptOptimizer

PromptOptimizer Constructor

optimize()

BuiltPrompt Methods

to_messages()

to_anthropic()

full_text (property)

ContextResult Properties

utilization_pct (property)

Strategy System

get_strategy()

STRATEGY_REGISTRY

PromptMetrics Methods

to_dict()

Integration Examples

OpenAI

Anthropic

Multi-turn Conversation

Agentic Tool-Use

Custom Guardrails + Observability Hook

Adaptive Strategy Selection

Error Reference

`PromptType`

`PromptStrategy`

`OutputFormat`

`QueryComplexity`

`UserProfile`

`Document`

`Message`

`PromptRequest`

`BuiltPrompt`

`ContextResult`

`PromptMetrics`

`FeedbackEntry`

`Guardrail`

`PromptEngine`

`PromptEngine` Constructor

`build()`

`build_from_request()`

`record_feedback()`

`adaptive_strategy_for()`

`metrics` (property)

`get_trace_log()`

`reset_metrics()`

`on()`

`ContextManager`

`ContextManager` Constructor

`ContextManager.build()`

`format_memory()`

`GuardrailEngine`

`GuardrailEngine` Constructor

`GuardrailEngine.build()`

`GuardrailLibrary`

`PromptOptimizer`

`PromptOptimizer` Constructor

`optimize()`

`BuiltPrompt` Methods

`to_messages()`

`to_anthropic()`

`full_text` (property)

`ContextResult` Properties

`utilization_pct` (property)

`get_strategy()`

`STRATEGY_REGISTRY`

`PromptMetrics` Methods

`to_dict()`