
Output Parser Module

Purpose: A production-grade engine for parsing, validating, and fault-tolerantly handling raw LLM outputs.


Architecture Overview

The module operates as a multi-stage pipeline:

Raw LLM Text
     │
     ▼
FormatDetector     ← Auto-detects format (JSON / YAML / CSV / ...)
     │
     ▼
Format Parsers     ← Converts text into structured Python objects
     │
     ▼
OutputFixer        ← Repairs broken or incomplete outputs
     │
     ▼
OutputValidator    ← Validates correctness, types, and safety
     │
     ▼
RetryHandler       ← Re-prompts LLM with escalating instructions on failure
     │
     ▼
ParseResult        ← Final typed, audited result
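
The stage order above can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: `try_parse`, `quick_fix`, `validate`, and `run_pipeline` are hypothetical names, only JSON is handled, and the retry stage is left out.

```python
import json
import re
from typing import Any, Dict, List, Optional


def try_parse(text: str) -> Optional[Any]:
    """Parser stage: return structured data, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def quick_fix(text: str) -> str:
    """Fixer stage (illustrative): strip Markdown fences and trailing commas."""
    text = re.sub(r"^```[a-zA-Z]*\s*|\s*```$", "", text.strip())
    return re.sub(r",\s*([}\]])", r"\1", text)


def validate(data: Any, required: List[str]) -> List[str]:
    """Validator stage: list problems found in the parsed data."""
    if not isinstance(data, dict):
        return ["output is not an object"]
    return [f"missing field: {name}" for name in required if name not in data]


def run_pipeline(text: str, required: List[str]) -> Dict[str, Any]:
    """Chain the stages and record which ones ran, loosely mirroring a trace."""
    stages = ["parse"]
    data = try_parse(text)
    if data is None:
        # Only fall through to the fixer when the first parse fails.
        stages.append("fix")
        data = try_parse(quick_fix(text))
    stages.append("validate")
    errors = validate(data, required)
    return {"data": data, "ok": data is not None and not errors,
            "stages": stages, "errors": errors}


result = run_pipeline('```json\n{"answer": "Paris",}\n```', required=["answer"])
print(result["ok"], result["stages"])
```

Each stage runs only when the previous one could not finish the job, which keeps the common case (already valid output) cheap.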

Imports

from fennec_community.output_parser import (
    # Core
    OutputParser, ParseError,
    create_answer_parser, create_json_parser, create_tool_call_parser,

    # Schemas
    AnswerSchema, ToolCallSchema, RetrievalResultSchema,
    RankedAnswersSchema, FieldSchema,

    # Enums
    OutputFormat, ParseMode, FixStrategy, ValidationStatus,

    # Results & Tracing
    ParseResult, ParseTrace, ValidationResult,

    # Format Detection
    FormatDetector, FormatCandidate,

    # Validation
    OutputValidator, ValidationRule, build_answer_validator,

    # Fixing
    OutputFixer, build_answer_fixer,

    # Retry
    RetryHandler, RetryResult, RetryStrategy, graceful_fallback,
)

1. Core Class: OutputParser

Purpose: The central orchestrator — combines all pipeline stages into a single, unified interface.


OutputParser.__init__()

OutputParser(
    schema=None,
    fields=None,
    mode=ParseMode.LENIENT,
    expected_format=None,
    llm_fn=None,
    max_retries=2,
    enable_safety=True,
    enable_cache=True,
    original_prompt="",
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| schema | Type[BaseModel] \| None | None | Pydantic model class. When provided, parsed dicts are automatically cast to this type. |
| fields | List[FieldSchema] \| None | None | Explicit field definitions as an alternative to a Pydantic schema. |
| mode | ParseMode | LENIENT | Strictness level: STRICT / LENIENT / SEMANTIC / TOOL_CALL |
| expected_format | OutputFormat \| None | None | Force a specific format, skipping auto-detection. |
| llm_fn | Callable[[str], str] \| None | None | An LLM callable used during fix and retry operations. |
| max_retries | int | 2 | Maximum number of LLM regeneration attempts on parse failure. |
| enable_safety | bool | True | Enables safety checks: hallucination markers, data leakage, prompt injection. |
| enable_cache | bool | True | Caches successful parse results in memory (keyed by MD5 hash of input). |
| original_prompt | str | "" | The original user prompt, embedded in retry prompts for context. |

OutputParser.parse()

def parse(
    text: str,
    expected_format: Optional[OutputFormat] = None,
) -> ParseResult

Purpose: Parses raw LLM text and returns a fully validated, optionally typed ParseResult with a complete audit trail.

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | Raw LLM output string to parse. |
| expected_format | OutputFormat \| None | Override the detected format for this call only. |

Returns: ParseResult

| Property | Type | Description |
| --- | --- | --- |
| .data | Any | The parsed, validated, typed output (dict / Pydantic instance / list / ...). |
| .ok | bool | True if parsing succeeded and data is not None. |
| .trace | ParseTrace | Full audit trail of every pipeline stage. |
| .raw | str | The original, unmodified LLM text. |

Raises: ParseError in ParseMode.STRICT if parsing fails after all recovery attempts.

Example:

parser = OutputParser()
result = parser.parse('{"answer": "Paris", "confidence": 0.98}')

if result.ok:
    print(result.data)                      # {'answer': 'Paris', 'confidence': 0.98}
    print(result.trace.detected_format)     # OutputFormat.JSON
    print(result.trace.duration_ms)         # e.g. 1.23

OutputParser.parse_typed()

def parse_typed(
    text: str,
    schema: Type[T],
    expected_format: Optional[OutputFormat] = None,
) -> T

Purpose: Parses text and returns a typed instance of schema directly. This is a convenience shorthand for parser.parse(text).as_typed(schema).

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | Raw LLM output string. |
| schema | Type[T] | The class to cast the parsed data into. |
| expected_format | OutputFormat \| None | Optional format override. |

Returns: An instance of type T (Pydantic model or dataclass).

Raises: ParseError if parsing fails; TypeError if the cast to schema fails.

Example:

from fennec_community.output_parser import AnswerSchema, OutputParser

raw_text = '{"answer": "Paris", "confidence": 0.98}'  # sample LLM output

parser = OutputParser(schema=AnswerSchema)
answer: AnswerSchema = parser.parse_typed(raw_text, AnswerSchema)

print(answer.answer)      # "Paris"
print(answer.confidence)  # 0.98

OutputParser.get_format_instructions()

def get_format_instructions() -> str

Purpose: Generates a format instruction string ready to be embedded directly into a System Prompt or User Prompt, guiding the LLM to produce output in the expected structure.

Takes no parameters.

Returns: str — a prompt-ready instruction block.

Behavior:

  • If schema is set → generates a full JSON Schema with an example
  • If fields are set → generates a key: <type> list with descriptions
  • If neither → returns a generic JSON example
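
The fields branch of this behavior could look roughly like the sketch below. It is an assumption-laden illustration: `FieldSpec` is a hypothetical stand-in for FieldSchema, `build_instructions` is not a library function, and the exact wording the module emits is not documented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldSpec:
    """Hypothetical stand-in for FieldSchema (name, description, dtype only)."""
    name: str
    description: str
    dtype: str = "str"


def build_instructions(fields: List[FieldSpec]) -> str:
    """Render a prompt-ready instruction block from a list of field specs."""
    if not fields:
        # No schema information: fall back to a generic JSON example.
        return 'Return ONLY valid JSON, for example: {"answer": "..."}'
    lines = ["Return ONLY valid JSON with these keys:"]
    for spec in fields:
        lines.append(f'  "{spec.name}": <{spec.dtype}>  ({spec.description})')
    lines.append("Do not include any explanation outside the JSON.")
    return "\n".join(lines)


instructions = build_instructions([
    FieldSpec("answer", "The answer text"),
    FieldSpec("confidence", "Score between 0 and 1", dtype="float"),
])
print(instructions)
```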

Example:

parser = OutputParser(schema=AnswerSchema)
instructions = parser.get_format_instructions()
# "Return ONLY valid JSON matching this schema:\n```json\n{...}\n```\nDo not include any explanation..."

prompt = f"Answer the following question.\n{instructions}\nQuestion: What is the capital of France?"

OutputParser.clear_cache()

def clear_cache() -> None

Purpose: Clears all in-memory cached parse results. Useful when the schema or mode changes at runtime, or after long test sessions.

Takes no parameters. Returns nothing.

Example:

parser.clear_cache()

2. Factory Functions

Convenience functions that return a pre-configured OutputParser for the most common use cases.


create_answer_parser()

def create_answer_parser(
    llm_fn: Optional[Callable[[str], str]] = None,
    max_retries: int = 2,
    strict: bool = False,
) -> OutputParser

Purpose: Creates a parser pre-configured for standard RAG AnswerSchema outputs — includes full safety checks and schema validation with zero additional setup.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| llm_fn | Callable[[str], str] \| None | None | LLM callable for retry on failure. |
| max_retries | int | 2 | Number of regeneration attempts. |
| strict | bool | False | True = raise ParseError on failure; False = use graceful fallback. |

Returns: A ready-to-use OutputParser targeting AnswerSchema.

Example:

parser = create_answer_parser(llm_fn=my_llm, strict=True)
result = parser.parse(raw_llm_output)
answer: AnswerSchema = result.data
print(answer.answer, answer.sources, answer.confidence)

create_json_parser()

def create_json_parser(
    schema: Optional[Type] = None,
    llm_fn: Optional[Callable[[str], str]] = None,
    strict: bool = False,
) -> OutputParser

Purpose: Creates a parser dedicated to JSON outputs with optional Pydantic schema enforcement. Ideal when the LLM is expected to return pure JSON.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| schema | Type \| None | None | Pydantic model to cast the parsed result into automatically. |
| llm_fn | Callable[[str], str] \| None | None | LLM callable for retry on failure. |
| strict | bool | False | Strictness level on failure. |

Returns: An OutputParser locked to OutputFormat.JSON.

Example:

from pydantic import BaseModel

class ProductSchema(BaseModel):
    name: str
    price: float
    in_stock: bool

parser = create_json_parser(schema=ProductSchema, strict=True)
product: ProductSchema = parser.parse_typed(raw_text, ProductSchema)

create_tool_call_parser()

def create_tool_call_parser(
    llm_fn: Optional[Callable[[str], str]] = None,
) -> OutputParser

Purpose: Creates a parser specialized for Tool Call outputs. Supports both OpenAI function-call JSON format and ReAct-style Action: ... / Action Input: ... format. Safety checks are disabled by default as tool environments are trusted.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| llm_fn | Callable[[str], str] \| None | None | LLM callable for a single retry attempt. |

Returns: An OutputParser configured with ParseMode.TOOL_CALL and OutputFormat.TOOL_CALL.

Example:

parser = create_tool_call_parser()
result = parser.parse('Action: search\nAction Input: {"query": "weather in Cairo"}')
# result.data → {"tool_name": "search", "arguments": {"query": "weather in Cairo"}, "thought": None}
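
To make the ReAct-style branch concrete, here is a minimal regex-based sketch that produces a dict of the same shape as the result above. It is an assumption, not the module's parser: `parse_react`, `ACTION_RE`, and `THOUGHT_RE` are hypothetical names, and the handling of non-JSON input is a design choice of this sketch.

```python
import json
import re
from typing import Any, Dict, Optional

# "Action: <tool>" followed on the next line by "Action Input: <arguments>".
ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<args>.+)", re.DOTALL
)
THOUGHT_RE = re.compile(r"Thought:\s*(.+?)\s*(?:\n|$)")


def parse_react(text: str) -> Optional[Dict[str, Any]]:
    """Extract tool name, arguments, and optional thought from ReAct-style text."""
    match = ACTION_RE.search(text)
    if match is None:
        return None
    raw_args = match.group("args").strip()
    try:
        arguments = json.loads(raw_args)
    except json.JSONDecodeError:
        arguments = raw_args
    if not isinstance(arguments, dict):
        # Keep non-object input under a single key so the shape stays uniform.
        arguments = {"input": arguments}
    thought = THOUGHT_RE.search(text)
    return {
        "tool_name": match.group("tool"),
        "arguments": arguments,
        "thought": thought.group(1) if thought else None,
    }


call = parse_react('Action: search\nAction Input: {"query": "weather in Cairo"}')
print(call)
```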

3. Format Detection: FormatDetector

Purpose: Analyses raw LLM text using multi-signal heuristics to determine its format before parsing begins.


FormatDetector.detect()

def detect(text: str) -> OutputFormat

Purpose: Returns the single most likely OutputFormat for the given text.

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | The raw LLM output to analyse. |

Returns: OutputFormat enum value.

Example:

detector = FormatDetector()
detector.detect('{"key": "value"}')        # → OutputFormat.JSON
detector.detect("1. First\n2. Second")     # → OutputFormat.NUMBERED_LIST
detector.detect("| A | B |\n|---|---|\n")  # → OutputFormat.MARKDOWN_TABLE

FormatDetector.rank()

def rank(text: str) -> List[FormatCandidate]

Purpose: Returns a ranked list of all plausible formats with confidence scores for each — invaluable for debugging ambiguous outputs.

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | The raw text to analyse. |

Returns: List[FormatCandidate] — sorted descending by confidence. Each FormatCandidate contains:

| Property | Type | Description |
| --- | --- | --- |
| .format | OutputFormat | The detected format. |
| .confidence | float | Confidence score from 0.0 to 1.0. |
| .evidence | str | Human-readable reason for this score. |

Example:

detector = FormatDetector()
candidates = detector.rank("name: John\nage: 30\ncity: Cairo")
for c in candidates:
    print(f"{c.format.value}: {c.confidence:.2f} - {c.evidence}")
# key_value: 0.60 - 3 key: value pairs
# yaml: 0.50 - 3 key: value lines
# plain_text: 0.25 - default text fallback
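
The idea of scoring several candidate formats and sorting them can be sketched as follows. The signals and weights are invented for illustration and will not reproduce the library's scores; `rank_formats` is a hypothetical function that returns plain tuples rather than FormatCandidate objects.

```python
import json
import re
from typing import List, Tuple


def rank_formats(text: str) -> List[Tuple[str, float, str]]:
    """Return (format, confidence, evidence) tuples, best candidate first."""
    stripped = text.strip()
    candidates = [("plain_text", 0.25, "default text fallback")]

    # Signal 1: the whole text parses as JSON.
    try:
        json.loads(stripped)
        candidates.append(("json", 0.95, "parses as JSON"))
    except json.JSONDecodeError:
        pass

    lines = [ln for ln in stripped.splitlines() if ln.strip()]

    # Signal 2: lines shaped like "key: value".
    kv = sum(1 for ln in lines if re.match(r"^\s*[\w ]+:\s+\S", ln))
    if kv:
        candidates.append(("key_value", min(0.3 + 0.1 * kv, 0.9),
                           f"{kv} key: value pairs"))

    # Signal 3: lines starting with "1." or "1)".
    numbered = sum(1 for ln in lines if re.match(r"^\s*\d+[.)]\s+", ln))
    if numbered:
        candidates.append(("numbered_list", min(0.4 + 0.1 * numbered, 0.9),
                           f"{numbered} numbered items"))

    return sorted(candidates, key=lambda c: c[1], reverse=True)


for name, score, evidence in rank_formats("name: John\nage: 30\ncity: Cairo"):
    print(f"{name}: {score:.2f} - {evidence}")
```

Keeping the evidence string next to each score is what makes a ranking like this useful for debugging ambiguous outputs.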

FormatDetector.detect_with_confidence()

def detect_with_confidence(text: str) -> Tuple[OutputFormat, float]

Purpose: Returns the best format alongside its confidence score as a single tuple — a practical shorthand when you need both values together.

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | The raw text to analyse. |

Returns: Tuple[OutputFormat, float] as (best_format, confidence_score)

Example:

detector = FormatDetector()
fmt, confidence = detector.detect_with_confidence(text)

if confidence < 0.5:
    print("Warning: format is ambiguous, consider using LENIENT mode")

4. Validation: OutputValidator

Purpose: Validates parsed outputs across four sequential layers to ensure correctness, type safety, business rules, and security.


OutputValidator.__init__()

OutputValidator(
    fields=None,
    rules=None,
    pydantic_model=None,
    enable_safety=True,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| fields | List[FieldSchema] \| None | None | Fields to validate for presence and type. |
| rules | List[ValidationRule] \| None | None | Custom business-rule predicates. |
| pydantic_model | Type \| None | None | Pydantic model for structural validation (Layer 4). |
| enable_safety | bool | True | Enables hallucination, data leakage, and prompt injection checks. |

The four validation layers:

| Layer | Name | What It Checks |
| --- | --- | --- |
| 1 | Schema Completeness | Required fields present with correct types |
| 2 | Business Rules | Custom ValidationRule predicates |
| 3 | Safety Checks | Hallucination markers, PII/credential leakage, prompt injection |
| 4 | Pydantic Validation | Full structural validation against a Pydantic model |

OutputValidator.validate()

def validate(
    data: Any,
    raw_text: str = "",
    trace: Optional[ParseTrace] = None,
) -> List[ValidationResult]

Purpose: Runs all validation layers and returns detailed per-check results. If a trace is passed, results are automatically appended to it.

| Parameter | Type | Description |
| --- | --- | --- |
| data | Any | The parsed data to validate (typically a dict). |
| raw_text | str | The original LLM text (used for safety checks). |
| trace | ParseTrace \| None | If provided, validation results are added to the trace. |

Returns: List[ValidationResult]. Each result contains:

| Property | Type | Description |
| --- | --- | --- |
| .status | ValidationStatus | PASSED / FAILED / WARNING / SKIPPED |
| .field | str \| None | The associated field name (if applicable). |
| .message | str | Error or warning message. |
| .value | Any | The value that caused the issue. |
| .passed | bool | True if status is PASSED. |

Example:

validator = OutputValidator(
    fields=[FieldSchema("answer", "The answer text", dtype="str", required=True)],
    enable_safety=True,
)
results = validator.validate({"answer": "Paris"}, raw_text=raw_output)
failures = [r for r in results if not r.passed]

OutputValidator.is_valid()

def is_valid(
    data: Any,
    raw_text: str = "",
) -> bool

Purpose: A quick pass/fail check — returns True only if all validation layers pass (warnings and skipped checks are not treated as failures).

| Parameter | Type | Description |
| --- | --- | --- |
| data | Any | The parsed data. |
| raw_text | str | Original LLM text for safety checks. |

Returns: bool

Example:

if not validator.is_valid(parsed_data, raw_text):
    raise ValueError("Output did not pass validation")

OutputValidator.get_failures()

def get_failures(
    data: Any,
    raw_text: str = "",
) -> List[ValidationResult]

Purpose: Returns only the failed validation results — ideal for structured logging and error reporting.

| Parameter | Type | Description |
| --- | --- | --- |
| data | Any | The parsed data. |
| raw_text | str | Original LLM text. |

Returns: List[ValidationResult] — empty list if all checks pass.

Example:

failures = validator.get_failures(data, raw_text)
if failures:
    for f in failures:
        logger.error("[%s] %s", f.field, f.message)

build_answer_validator()

def build_answer_validator(enable_safety: bool = True) -> OutputValidator

Purpose: Factory that returns a pre-configured OutputValidator for AnswerSchema outputs — validates answer presence, confidence range [0.0, 1.0], and runs all safety checks.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| enable_safety | bool | True | Enable or disable safety checks. |

Returns: A ready-to-use OutputValidator.

Example:

validator = build_answer_validator(enable_safety=True)
is_ok = validator.is_valid(parsed_data, raw_text)

5. Custom Rules: ValidationRule

Purpose: Defines a named, callable validation rule with configurable severity.


ValidationRule.__init__()

ValidationRule(
    name: str,
    predicate: Callable[[Any], bool],
    message: str,
    field: Optional[str] = None,
    severity: ValidationStatus = ValidationStatus.FAILED,
)

| Parameter | Type | Description |
| --- | --- | --- |
| name | str | Rule identifier (used in logs). |
| predicate | Callable[[Any], bool] | A function that receives the data and returns True if the check passes. |
| message | str | Error message shown on failure. |
| field | str \| None | The associated field name (optional, for context). |
| severity | ValidationStatus | Failure severity: FAILED (default) or WARNING. |

ValidationRule.check()

def check(data: Any) -> ValidationResult

Purpose: Applies the predicate to the given data and returns a ValidationResult.

| Parameter | Type | Description |
| --- | --- | --- |
| data | Any | The data to validate. |

Returns: ValidationResult

Example:

rule = ValidationRule(
    name="answer_min_length",
    predicate=lambda d: len(d.get("answer", "")) >= 10,
    message="Answer is too short (less than 10 characters)",
    field="answer",
    severity=ValidationStatus.WARNING,
)
result = rule.check({"answer": "Yes"})
# result.status → ValidationStatus.WARNING
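
A predicate-plus-severity rule, and the way a pass/fail check can ignore warnings, might be modelled as in this sketch. `Status`, `Rule`, and `is_valid` are hypothetical stand-ins, and treating a crashing predicate as a failed check is an assumption of the sketch rather than documented library behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, List


class Status(Enum):
    """Hypothetical stand-in for ValidationStatus."""
    PASSED = "passed"
    FAILED = "failed"
    WARNING = "warning"


@dataclass
class Rule:
    name: str
    predicate: Callable[[Any], bool]
    message: str
    severity: Status = Status.FAILED

    def check(self, data: Any) -> Status:
        try:
            ok = bool(self.predicate(data))
        except Exception:
            ok = False  # assumption: a crashing predicate counts as not passing
        # A rule that does not pass reports its configured severity.
        return Status.PASSED if ok else self.severity


def is_valid(data: Any, rules: List[Rule]) -> bool:
    """Only FAILED results block validity; WARNING results do not."""
    return all(rule.check(data) is not Status.FAILED for rule in rules)


rules = [
    Rule("answer_present", lambda d: "answer" in d, "Answer is missing"),
    Rule("answer_min_length", lambda d: len(d.get("answer", "")) >= 10,
         "Answer is too short", severity=Status.WARNING),
]
print(is_valid({"answer": "Yes"}, rules))  # True: only a warning was raised
print(is_valid({}, rules))                 # False: a FAILED rule fired
```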

6. Fault Tolerance: OutputFixer

Purpose: Repairs malformed, incomplete, or broken LLM outputs using a hierarchy of four progressively deeper strategies.


OutputFixer.__init__()

OutputFixer(
    required_fields=None,
    field_defaults=None,
    llm_fn=None,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| required_fields | List[str] \| None | None | Field names that must be present in the output. |
| field_defaults | Dict[str, Any] \| None | None | Default values injected when a required field is missing. |
| llm_fn | Callable[[str], str] \| None | None | LLM callable, required for the LLM_REFORMAT strategy. |

OutputFixer.fix()

def fix(
    text: str,
    expected_format: OutputFormat = OutputFormat.JSON,
) -> Tuple[str, FixStrategy]

Purpose: Attempts to repair a broken text string using four escalating strategies, returning the repaired text and the strategy that succeeded. The caller should re-parse the returned text.

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | The malformed or unparseable text. |
| expected_format | OutputFormat | The expected format, which influences which repair logic is applied. |

Returns: Tuple[str, FixStrategy]

  • str — the repaired text (must be re-parsed by the caller)
  • FixStrategy — the strategy used, or FixStrategy.NONE if all strategies failed

Repair strategies applied in order:

| Strategy | Description |
| --- | --- |
| REGEX_REPAIR | Strips markdown fences, fixes trailing commas, converts single quotes, quotes bare keys |
| FIELD_INJECTION | Injects missing required fields with their default values |
| FALLBACK_PARSE | Parses as key-value pairs and re-serializes as JSON |
| LLM_REFORMAT | Sends a reformat request to the LLM (requires llm_fn) |
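
The first strategy is the easiest to picture in code. The sketch below applies the four regex repairs named for REGEX_REPAIR, in an order chosen for this example. `regex_repair` is a hypothetical function, and the patterns are deliberately naive: they would misfire on apostrophes or `key:`-like text inside string values, which a production fixer has to guard against.

```python
import json
import re


def regex_repair(text: str) -> str:
    """Apply cheap textual repairs that often turn near-JSON into valid JSON."""
    text = text.strip()
    # 1. Strip a surrounding Markdown code fence such as ```json ... ```
    text = re.sub(r"^```[a-zA-Z]*\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    # 2. Quote bare keys: {answer: ...} -> {"answer": ...}
    text = re.sub(r"([{,]\s*)([A-Za-z_]\w*)(\s*:)", r'\1"\2"\3', text)
    # 3. Convert single-quoted strings to double-quoted strings
    text = re.sub(r"'([^'\\]*)'", r'"\1"', text)
    # 4. Drop trailing commas before a closing brace or bracket
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return text


broken = "```json\n{answer: 'Paris', confidence: 0.9,}\n```"
repaired = regex_repair(broken)
print(repaired)
print(json.loads(repaired))
```

Running the cheap textual repairs first means the LLM is only called when everything else has failed.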

Example:

fixer = OutputFixer(
    required_fields=["answer"],
    field_defaults={"answer": "", "confidence": 0.5},
)
fixed_text, strategy = fixer.fix("```json\n{answer: 'Paris'}\n```", OutputFormat.JSON)
# → ('{"answer": "Paris"}', FixStrategy.REGEX_REPAIR)

OutputFixer.fix_dict()

def fix_dict(
    data: Dict[str, Any],
) -> Tuple[Dict[str, Any], FixStrategy]

Purpose: Repairs a partially parsed dict by injecting missing required fields directly — faster than fix() when structured data is already available.

| Parameter | Type | Description |
| --- | --- | --- |
| data | Dict[str, Any] | The incomplete parsed dict. |

Returns: Tuple[Dict[str, Any], FixStrategy]

  • A completed dict with injected fields
  • FixStrategy.FIELD_INJECTION if fields were injected; FixStrategy.NONE if nothing was missing

Example:

fixer = OutputFixer(required_fields=["answer", "sources"])
fixed, strategy = fixer.fix_dict({"answer": "Paris"})
# → ({"answer": "Paris", "sources": None}, FixStrategy.FIELD_INJECTION)
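
Field injection itself is a small operation. A minimal sketch, assuming missing fields without a configured default are filled with None as in the example above; `inject_missing` is a hypothetical helper that returns a boolean flag instead of a FixStrategy.

```python
from typing import Any, Dict, List, Optional, Tuple


def inject_missing(
    data: Dict[str, Any],
    required: List[str],
    defaults: Optional[Dict[str, Any]] = None,
) -> Tuple[Dict[str, Any], bool]:
    """Return a completed copy of data and whether anything was injected."""
    defaults = defaults or {}
    fixed = dict(data)  # copy so the caller's dict is left untouched
    injected = False
    for name in required:
        if name not in fixed:
            fixed[name] = defaults.get(name)  # None when no default is configured
            injected = True
    return fixed, injected


original = {"answer": "Paris"}
fixed, injected = inject_missing(original, ["answer", "sources"], {"sources": []})
print(fixed, injected)
```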

build_answer_fixer()

def build_answer_fixer(llm_fn: Optional[Callable] = None) -> OutputFixer

Purpose: Factory that returns a pre-configured OutputFixer for AnswerSchema outputs — automatically injects answer: "", sources: [], and confidence: 0.5 for missing fields.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| llm_fn | Callable \| None | None | LLM callable for the LLM_REFORMAT strategy. |

Returns: A ready-to-use OutputFixer.


7. Retry & Regeneration: RetryHandler

Purpose: Manages LLM regeneration when parsing fails, re-prompting with progressively stricter instructions until a valid output is obtained or the retry budget is exhausted.


RetryHandler.__init__()

RetryHandler(
    llm_fn: LLMCallable,
    max_retries: int = 3,
    backoff_seconds: float = 0.5,
    pydantic_schema: Optional[Type] = None,
    required_fields: Optional[List[str]] = None,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| llm_fn | Callable[[str], str] | required | The LLM callable to invoke on each retry. |
| max_retries | int | 3 | Maximum number of retry attempts. |
| backoff_seconds | float | 0.5 | Wait time between retries (multiplied by attempt index for linear backoff). |
| pydantic_schema | Type \| None | None | Used to generate a JSON schema example in retry prompts. |
| required_fields | List[str] \| None | None | Field names embedded in retry prompt examples. |

RetryHandler.run()

def run(
    original_prompt: str,
    parse_fn: Callable[[str], Any],
    format_instructions: str = "",
    last_error: str = "",
    last_response: str = "",
) -> RetryResult

Purpose: Executes a full retry cycle — builds an improved prompt → calls LLM → tests parseability → repeats until success or budget exhaustion.

| Parameter | Type | Description |
| --- | --- | --- |
| original_prompt | str | The original user question or request. |
| parse_fn | Callable[[str], Any] | Parse function that raises an exception on failure. |
| format_instructions | str | Format hint string from get_format_instructions(). |
| last_error | str | Error message from the most recent failed parse. |
| last_response | str | The last raw LLM response that failed parsing. |

Retry strategies applied in order:

| Strategy | Description |
| --- | --- |
| STRICT_FORMAT | Adds explicit format instructions and the failed response to the prompt |
| JSON_STRICT | Forces JSON-only output with a full schema example |
| SIMPLIFIED | Strips the prompt down to a minimal question with a basic JSON example |
| GRACEFUL_FAIL | Returns a structured error payload |
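
The escalation and linear backoff can be sketched as a plain loop. Everything here is illustrative: the prompt builders, `BUILDERS`, and `retry` are hypothetical, the prompt wording is invented, and the tuple return value stands in for RetryResult.

```python
import json
import time
from typing import Any, Callable, List, Tuple


def strict_format(prompt: str, instructions: str) -> str:
    return f"{prompt}\n\n{instructions}\nFollow the format exactly."


def json_strict(prompt: str, instructions: str) -> str:
    return f'{prompt}\n\nRespond with JSON only, e.g. {{"answer": "..."}}'


def simplified(prompt: str, instructions: str) -> str:
    return f'Answer briefly as JSON {{"answer": "..."}}: {prompt}'


# Escalation order; attempts beyond the list reuse the last builder.
BUILDERS = [strict_format, json_strict, simplified]


def retry(
    llm_fn: Callable[[str], str],
    parse_fn: Callable[[str], Any],
    prompt: str,
    instructions: str = "",
    max_retries: int = 3,
    backoff_seconds: float = 0.0,
) -> Tuple[bool, str, List[str]]:
    """Re-prompt with stricter instructions until parse_fn stops raising."""
    errors: List[str] = []
    for attempt in range(max_retries):
        builder = BUILDERS[min(attempt, len(BUILDERS) - 1)]
        response = llm_fn(builder(prompt, instructions))
        try:
            parse_fn(response)
            return True, response, errors
        except Exception as exc:
            errors.append(f"Attempt {attempt + 1}: {exc}")
            time.sleep(backoff_seconds * (attempt + 1))  # linear backoff
    return False, "", errors


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in LLM: answers in prose first, then in JSON."""
    calls.append(prompt)
    return "Paris" if len(calls) == 1 else '{"answer": "Paris"}'


ok, response, errors = retry(fake_llm, json.loads, "What is the capital of France?")
print(ok, len(calls), len(errors))  # True 2 1
```

When the budget runs out, the caller is left with the collected error messages, which is the information a structured error payload would carry.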

Returns: RetryResult

| Property | Type | Description |
| --- | --- | --- |
| .success | bool | True if any attempt succeeded. |
| .response | str | The raw LLM response from the winning attempt. |
| .attempts | int | Total number of actual LLM calls made. |
| .strategy_used | RetryStrategy \| None | The strategy that succeeded. |
| .errors | List[str] | Error messages from each failed attempt. |
| .total_duration_ms | float | Total wall-clock time in milliseconds. |

Example:

import json

handler = RetryHandler(llm_fn=my_llm, max_retries=3)
result = handler.run(
    original_prompt="What is the capital of France?",
    parse_fn=lambda t: json.loads(t),
    format_instructions='Return JSON: {"answer": "..."}',
    last_error="No JSON found",
    last_response="The capital of France is Paris.",
)
if result.success:
    parsed = json.loads(result.response)

RetryResult.as_error_payload()

def as_error_payload(original_query: str = "") -> Dict[str, Any]

Purpose: Converts a failed RetryResult into a structured error dictionary — useful for consistent error handling at the application level.

| Parameter | Type | Description |
| --- | --- | --- |
| original_query | str | The original user query (for context in the error payload). |

Returns: Dict[str, Any]

{
    "error": True,
    "message": "Failed to obtain a valid response after all retries",
    "attempts": 3,
    "original_query": "...",
    "errors": ["Attempt 1 ...", "Attempt 2 ...", ...],
}

graceful_fallback()

def graceful_fallback(
    raw_text: str,
    query: str = "",
) -> Dict[str, Any]

Purpose: The final safety net — when all parsing, fixing, and retry attempts fail, this function returns a structured, always-safe dict that guarantees the pipeline never returns None or raises an unhandled exception.

| Parameter | Type | Description |
| --- | --- | --- |
| raw_text | str | The original text that could not be parsed. |
| query | str | The original user query (for debugging). |

Returns: Dict[str, Any] — always contains:

{
    "answer": "<raw_text or error message>",
    "sources": [],
    "confidence": 0.0,
    "_parse_error": True,   # Flag to distinguish from real answers
    "_raw": "<first 500 chars of raw_text>",
    "_query": "<first 200 chars of query>",
}

Example:

fallback = graceful_fallback(broken_text, query="What is the capital of France?")
# Always safe — never returns None, never raises
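
A function with this "never None, never raises" contract can be very small. The sketch below builds a payload with the keys listed above; `fallback_payload` is a hypothetical name, and the placeholder message and the handling of non-string input are assumptions of this sketch.

```python
from typing import Any, Dict


def fallback_payload(raw_text: Any, query: Any = "") -> Dict[str, Any]:
    """Build an always-valid answer dict from text that could not be parsed."""
    # Tolerate None or non-string input so the function itself cannot fail.
    text = raw_text if isinstance(raw_text, str) else ""
    question = query if isinstance(query, str) else ""
    return {
        "answer": text.strip() or "Unable to parse the model output.",
        "sources": [],
        "confidence": 0.0,
        "_parse_error": True,    # lets callers tell fallbacks from real answers
        "_raw": text[:500],      # truncated copy for debugging
        "_query": question[:200],
    }


payload = fallback_payload("The capital of France is Paris.", query="Capital of France?")
print(payload["answer"], payload["_parse_error"])
```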

8. Schemas & Data Types

AnswerSchema

Standard schema for RAG pipeline answers.

class AnswerSchema(BaseModel):
    answer: str                   # The answer text (required)
    sources: List[str]            # Source references (default: [])
    confidence: float             # Confidence score 0.0–1.0 (default: 1.0, auto-clamped)
    reasoning: Optional[str]      # Optional chain-of-thought (default: None)

ToolCallSchema

Schema for LLM tool/function call outputs.

class ToolCallSchema(BaseModel):
    tool_name: str                # Name of the tool to invoke (required)
    arguments: Dict[str, Any]     # Tool arguments (default: {})
    thought: Optional[str]        # Optional reasoning before the call (default: None)

RetrievalResultSchema

Schema for a single retrieved document result.

class RetrievalResultSchema(BaseModel):
    content: str                  # Document content (required)
    source: Optional[str]         # Source URL or ID (default: None)
    score: Optional[float]        # Relevance score 0.0–1.0 (default: None)
    metadata: Dict[str, Any]      # Additional metadata (default: {})

RankedAnswersSchema

Schema for multiple candidate answers with a designated best answer.

class RankedAnswersSchema(BaseModel):
    answers: List[AnswerSchema]   # List of candidates (required, min length: 1)
    best_index: int               # Index of the best answer (default: 0)

    @property
    def best(self) -> AnswerSchema:
        # Direct access to the best answer
        return self.answers[self.best_index]

FieldSchema

Defines a single expected field in an LLM output for schema-less validation.

from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class FieldSchema:
    name: str                         # Field name
    description: str                  # Human-readable description
    dtype: str = "str"                # Type: str | int | float | bool | list | dict
    required: bool = True             # Whether the field is required
    aliases: List[str] = field(default_factory=list)  # Alternative field names to match
    default: Any = None               # Default value if missing
    choices: Optional[List] = None    # Restricts value to an allowed set

9. Enums Reference

OutputFormat

| Value | Description |
| --- | --- |
| JSON | JSON object or array |
| YAML | YAML data |
| CSV | Tabular CSV data |
| MARKDOWN_TABLE | Pipe-delimited Markdown table |
| NUMBERED_LIST | Numbered list (1. / a.) |
| BULLETED_LIST | Bulleted list (* / - / •) |
| KEY_VALUE | Key: value pairs |
| XML | XML tag pairs |
| TOOL_CALL | Tool or function call invocation |
| PLAIN_TEXT | Plain unstructured text |
| MIXED | Mix of multiple formats |
| UNKNOWN | Could not be determined |

ParseMode

| Value | Description |
| --- | --- |
| STRICT | Raises ParseError on any failure; no fallback |
| LENIENT | Auto-fixes minor issues, uses graceful fallback on failure |
| SEMANTIC | Uses LLM to extract meaning from freeform text |
| TOOL_CALL | Specialized mode for parsing tool/function call syntax |

FixStrategy

| Value | Description |
| --- | --- |
| REGEX_REPAIR | Regex-based fixes (quotes, commas, fences, bare keys) |
| FIELD_INJECTION | Injects missing fields with default values |
| FALLBACK_PARSE | Parses as key-value and re-serializes as JSON |
| LLM_REFORMAT | Asks the LLM to reformat its own output |
| NONE | No fix was applied or possible |

RetryStrategy

| Value | Description |
| --- | --- |
| STRICT_FORMAT | Adds explicit format instructions to the retry prompt |
| JSON_STRICT | Forces JSON-only output with a schema example |
| SIMPLIFIED | Strips the prompt down to a minimal question |
| GRACEFUL_FAIL | Returns a structured error payload |

ValidationStatus

| Value | Description |
| --- | --- |
| PASSED | Check passed |
| FAILED | Check failed; treated as a hard error |
| WARNING | Check flagged a concern but did not fail |
| SKIPPED | Check was not applicable (e.g. field is optional and absent) |

10. Complete Usage Examples

Basic Usage

from fennec_community.output_parser import OutputParser, AnswerSchema, ParseMode

parser = OutputParser(
    schema=AnswerSchema,
    mode=ParseMode.LENIENT,
    enable_safety=True,
)

result = parser.parse('{"answer": "Paris", "confidence": 0.95, "sources": ["Wikipedia"]}')

if result.ok:
    answer: AnswerSchema = result.data
    print(answer.answer)        # Paris
    print(answer.confidence)    # 0.95
    print(result.trace.summary())

With LLM and Retry

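from fennec_community.output_parser import create_answer_parser, ParseError
from openai import OpenAI

openai_client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment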

def my_llm(prompt: str) -> str:
    return openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content

parser = create_answer_parser(
    llm_fn=my_llm,
    max_retries=3,
    strict=True,
)

try:
    result = parser.parse(raw_llm_output)
    answer = result.data
except ParseError as e:
    print("Parse failed:", e)
    print("Trace:", e.trace.summary() if e.trace else "N/A")

Custom Validation Rules

from fennec_community.output_parser import OutputValidator, ValidationRule, FieldSchema, ValidationStatus

validator = OutputValidator(
    fields=[
        FieldSchema("answer", "The answer text", dtype="str", required=True),
        FieldSchema("confidence", "Confidence score", dtype="float", required=True),
    ],
    rules=[
        ValidationRule(
            name="min_confidence",
            predicate=lambda d: d.get("confidence", 0) >= 0.3,
            message="Confidence is too low (below 30%)",
            field="confidence",
            severity=ValidationStatus.WARNING,
        ),
        ValidationRule(
            name="answer_not_empty",
            predicate=lambda d: bool(d.get("answer", "").strip()),
            message="Answer field must not be empty",
            field="answer",
        ),
    ],
)

results = validator.validate(parsed_data, raw_text=raw_text)
failures = [r for r in results if not r.passed]

Format Detection Standalone

from fennec_community.output_parser import FormatDetector, OutputFormat

detector = FormatDetector()

# Single best format
fmt = detector.detect("| Name | Age |\n|------|-----|\n| Ali | 30 |")
# → OutputFormat.MARKDOWN_TABLE

# With confidence score
fmt, confidence = detector.detect_with_confidence(text)
if confidence < 0.5:
    print("Ambiguous format — consider forcing expected_format in the parser")

# Full ranking for debugging
for candidate in detector.rank(text):
    print(f"{candidate.format.value}: {candidate.confidence:.2f} - {candidate.evidence}")

Manual Fix and Re-Parse

from fennec_community.output_parser import OutputFixer, OutputFormat
import json

fixer = OutputFixer(
    required_fields=["answer", "confidence"],
    field_defaults={"answer": "", "confidence": 0.5},
)

broken = "```json\n{answer: 'Paris', confidence: '0.9'}\n```"
fixed_text, strategy = fixer.fix(broken, OutputFormat.JSON)

print(strategy)               # FixStrategy.REGEX_REPAIR
print(json.loads(fixed_text)) # {'answer': 'Paris', 'confidence': '0.9'}

11. Production Notes

Safety Checks: When enable_safety=True, the validator automatically detects hallucination admission phrases, sensitive data leakage (credit card numbers, SSNs, email addresses, API keys, Bearer tokens), and prompt injection attempts. Any match produces a FAILED or WARNING validation result.
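
Leakage detection of this kind is typically pattern-based. The sketch below covers three of the categories named above with deliberately simple regular expressions. `LEAK_PATTERNS` and `find_leaks` are hypothetical, and the library's actual patterns are not documented here.

```python
import re
from typing import Dict, List, Pattern

# Simplified patterns for illustration; real detectors need stricter rules.
LEAK_PATTERNS: Dict[str, Pattern[str]] = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*"),
}


def find_leaks(text: str) -> List[str]:
    """Return the names of all leak categories detected in the text."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(text)]


print(find_leaks("Contact a.b@example.com, SSN 123-45-6789"))
print(find_leaks("The capital of France is Paris."))
```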

Caching: Parse results are stored in an in-memory dict keyed by MD5 hash of the input text. This benefits pipelines that process repeated outputs. Call clear_cache() when the schema or mode changes at runtime.
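
A cache keyed this way can be sketched in a few lines. `ParseCache` is a hypothetical class, not the parser's internal cache; it assumes, as described above, that only successful parses are stored and that the key is the MD5 hex digest of the UTF-8 encoded input.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


class ParseCache:
    """Hypothetical in-memory cache keyed by the MD5 hex digest of the input."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    @staticmethod
    def key_for(text: str) -> str:
        # MD5 serves only as a fast fingerprint here, not as a security measure.
        return hashlib.md5(text.encode("utf-8")).hexdigest()

    def get_or_parse(self, text: str, parse_fn: Callable[[str], Any]) -> Any:
        key = self.key_for(text)
        if key not in self._store:
            # If parse_fn raises, nothing is stored and the error propagates.
            self._store[key] = parse_fn(text)
        return self._store[key]

    def clear(self) -> None:
        self._store.clear()


calls: List[str] = []


def counting_parse(text: str) -> Any:
    calls.append(text)
    return json.loads(text)


cache = ParseCache()
first = cache.get_or_parse('{"answer": "Paris"}', counting_parse)
second = cache.get_or_parse('{"answer": "Paris"}', counting_parse)
print(len(calls))  # 1: the second call was served from the cache
```

Because the key depends only on the input text, a cached result can go stale when the schema or mode changes, which is why clearing the cache at that point matters.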

Without Pydantic: The entire module functions without Pydantic installed. It transparently falls back to dataclasses for all schema types.

Without PyYAML: YAML parsing is disabled if pyyaml is not installed, but all other formats work normally.

ParseTrace: Every ParseResult includes a ParseTrace with a complete audit record: detected format, fix strategy applied, retry count, all validation results, errors, warnings, and total duration in milliseconds. Use .trace.summary() for a compact dict suitable for structured logging and observability pipelines.

Source: community/output_parser.md