Pydantic feedback loop

The Pydantic feedback loop is the runtime quality guarantee added in SDK v4.1.0. When the LLM emits a tool_use with arguments that fail Pydantic validation, the SDK gives the LLM a structured second chance — instead of failing the call outright with a VALIDATION_MISSING_FIELD.

This closes roughly 75% of arg-quality hallucinations observed in production traffic — the largest single hallucination class in the 2026-04-30 audit.

What it closes

Class	Severity	After v4.1.0
LLM omits required Pydantic fields (`title`, `project_id`)	🔴 high	✅ closed — bounded retry with prose feedback
LLM passes wrong-shape args (plural list vs singular string)	🟠 medium	✅ closed — Pydantic type errors translate to "expected X, got Y"
LLM passes ISO-incompatible date strings ("tomorrow")	🟠 medium	✅ closed — prose includes ISO format example
LLM emits unknown extra fields	🟡 low	✅ closed — `extra_forbidden` translates to "unknown field — remove it"
LLM hallucinates ID slugs in retry round	🔴 critical	✅ closed — I-AH-1 re-runs on every retry input

Architecture

LLM emits tool_use(create_task, {description: "..."})
    ↓
outer for-loop in handle_message
    ↓
_execute_function(retry_ctx={...})
    ↓
(after pre-guards UNKNOWN_SUB_FUNCTION + I-AH-1):
  retry_count = 0; current_tu = tu
  while True:
    try:
      _model_instance = _func_def._pydantic_model(**current_tu.input)
      raw_result = await _func_def.func(ctx, **{...})
      return content                          # SUCCESS
    except PydanticValidationError as e:
      if not _retry_eligible or retry_count >= _RETRY_BUDGET:
        return validation_missing_field(...)  # exhausted
      prose = format_pydantic_for_llm(e)
      retry_resp = await client.create_message(...)
      new_tu = first_tool_use_with_same_name(retry_resp, current_tu.name)
      if check_id_shape_fabrication(new_tu.input):  # I-AH-1 on retry
        return validation_missing_field(...)
      current_tu = new_tu
      retry_count += 1
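
The control flow above can be tried out in isolation. The following is a minimal sketch, not the SDK's code: `ToolUse`, `ArgsInvalid`, `validate_create_task`, `run_with_retry`, and `fake_llm` are hypothetical stand-ins (for the tool_use block, `pydantic.ValidationError`, the Pydantic model, the loop, and the retry call to the LLM), and it is synchronous for brevity. It only illustrates how the budget bounds the number of attempts.

```python
from dataclasses import dataclass

RETRY_BUDGET = 2  # mirrors _RETRY_BUDGET; everything else here is hypothetical


class ArgsInvalid(Exception):
    """Stand-in for pydantic.ValidationError in this sketch."""


@dataclass
class ToolUse:
    name: str
    input: dict


def validate_create_task(args: dict) -> dict:
    # Stand-in for _pydantic_model(**args): require two fields.
    missing = [k for k in ("title", "project_id") if k not in args]
    if missing:
        raise ArgsInvalid(", ".join(missing))
    return args


def run_with_retry(tu, validate, ask_llm):
    """Return (outcome, retry_count, validated_args_or_None)."""
    retry_count = 0
    current = tu
    while True:
        try:
            args = validate(current.input)
            outcome = "no_retry" if retry_count == 0 else "success"
            return outcome, retry_count, args
        except ArgsInvalid as exc:
            if retry_count >= RETRY_BUDGET:
                return "exhausted", retry_count, None
            new_tu = ask_llm(current, f"missing: {exc}")
            if new_tu is None:  # the LLM stopped emitting tool_use
                return "llm_gave_up", retry_count, None
            current = new_tu
            retry_count += 1


def fake_llm(tu, feedback):
    # Toy model: repairs exactly one missing field per retry round.
    fixed = dict(tu.input)
    first_missing = feedback.removeprefix("missing: ").split(", ")[0]
    fixed[first_missing] = "filled-in"
    return ToolUse(tu.name, fixed)
```

With a budget of 2, the validator runs at most three times per logical tool_use: the original attempt plus two retries.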

How feedback is formatted

format_pydantic_for_llm(e) translates each Pydantic error into one human-readable line:

Pydantic error type	Output line
`missing`	`- '{loc}': required field is missing — provide a value`
`string_*`	`- '{loc}': expected string, got {input_type}`
`int_*`	`- '{loc}': expected integer, got {input!r}`
`datetime_*`	`- '{loc}': expected ISO datetime (e.g. '2026-05-03T00:00:00'), got {input!r}`
`list_type`	`- '{loc}': expected list/array, got {input_type}`
`extra_forbidden`	`- '{loc}': unknown field — remove it`
(other)	`- '{loc}': {msg}` (Pydantic's own message verbatim)

The full prose includes a header and a retry instruction so the LLM understands what to fix.
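
As a rough illustration of the mapping in the table, here is a sketch that works on plain dicts shaped like the entries of Pydantic v2's `e.errors()` (keys `type`, `loc`, `msg`, `input`), so it runs without Pydantic installed. The function name `format_errors_for_llm`, the header line, and the closing retry instruction are assumptions; the SDK's actual wording is not shown on this page, and the sketch uses an ASCII hyphen in the message text.

```python
def format_errors_for_llm(errors: list[dict]) -> str:
    """Render Pydantic-style error dicts as one prose line per error."""
    lines = ["The tool call arguments failed validation:"]  # hypothetical header
    for err in errors:
        loc = ".".join(str(part) for part in err["loc"])
        kind = err["type"]
        value = err.get("input")
        got_type = type(value).__name__
        if kind == "missing":
            detail = "required field is missing - provide a value"
        elif kind.startswith("string_"):
            detail = f"expected string, got {got_type}"
        elif kind.startswith("int_"):
            detail = f"expected integer, got {value!r}"
        elif kind.startswith("datetime_"):
            detail = (
                "expected ISO datetime (e.g. '2026-05-03T00:00:00'), "
                f"got {value!r}"
            )
        elif kind == "list_type":
            detail = f"expected list/array, got {got_type}"
        elif kind == "extra_forbidden":
            detail = "unknown field - remove it"
        else:
            detail = err["msg"]  # fall back to Pydantic's own message
        lines.append(f"- '{loc}': {detail}")
    # Hypothetical retry instruction closing the prose.
    lines.append("Call the tool again with corrected arguments.")
    return "\n".join(lines)
```

In real use the input would be `e.errors()` from a caught `pydantic.ValidationError`.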

Federal invariants (5 new)

Federal invariants are runtime contracts that block PR merge if weakened.

I-PYDANTIC-RETRY-BUDGET — at most _RETRY_BUDGET = 2 retries per tool_use. Beyond that → existing failure path with VALIDATION_MISSING_FIELD.
I-PYDANTIC-RETRY-SCOPE — retry triggers ONLY on pydantic.ValidationError. MUST NOT retry on FABRICATED_ID_SHAPE, UNKNOWN_SUB_FUNCTION, generic Exception, or TaskCancelled.
I-PYDANTIC-FEEDBACK-STRUCTURED — feedback is structured prose generated from e.errors(), not raw JSON or freeform text.
I-PYDANTIC-FC-SINGLE-APPEND — each logical tool_use produces exactly ONE entry in _functions_called, regardless of retry count.
I-PYDANTIC-WIRE-FROZEN — retry feature does NOT add new fields to FunctionCall, FunctionCallModel, or ChatResult.to_dict(). Observability lives in SigNoz log-derived metrics only.

Observability

Each retry emits a structured log line that SigNoz turns into a metric:

validation_retry_outcome tool=<name> ext=<name> outcome=<value> retry_count=<N>

Outcome	Meaning	Log level
`no_retry`	First attempt succeeded	DEBUG
`success`	Retry produced valid args	INFO
`redundant`	LLM repeated the same wrong args	WARNING
`exhausted`	Hit retry budget without success	WARNING
`llm_gave_up`	LLM stopped emitting tool_use	INFO
`fabricated_id_on_retry`	I-AH-1 caught fabrication on retry input	WARNING (security alert at >0)
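
One way to keep the outcome-to-level mapping in a single place is a lookup table, sketched below with the standard `logging` module. The logger name and the helper names `format_outcome` and `log_outcome` are assumptions for illustration; only the log line layout and the levels come from the table above.

```python
import logging

# Outcome -> log level, as listed in the table above.
OUTCOME_LEVELS = {
    "no_retry": logging.DEBUG,
    "success": logging.INFO,
    "redundant": logging.WARNING,
    "exhausted": logging.WARNING,
    "llm_gave_up": logging.INFO,
    "fabricated_id_on_retry": logging.WARNING,
}

logger = logging.getLogger("validation_retry")  # hypothetical logger name


def format_outcome(tool: str, ext: str, outcome: str, retry_count: int) -> str:
    """Build the structured line that the metrics pipeline parses."""
    return (
        f"validation_retry_outcome tool={tool} ext={ext} "
        f"outcome={outcome} retry_count={retry_count}"
    )


def log_outcome(tool: str, ext: str, outcome: str, retry_count: int) -> int:
    """Log the line at the level for this outcome and return that level."""
    level = OUTCOME_LEVELS[outcome]  # KeyError on an unknown outcome
    logger.log(level, format_outcome(tool, ext, outcome, retry_count))
    return level
```

Raising on an unknown outcome value keeps the set of metric labels closed, which may be preferable to silently logging a new label.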

What you need to do as an extension author

For Pydantic-typed @chat.function handlers, nothing. The retry loop activates automatically.

For legacy **kwargs handlers, the retry layer is a no-op. Migrate to typed parameters to opt in:

from pydantic import BaseModel, Field

class CreateTaskParams(BaseModel):
    title: str = Field(description="Task title")
    project_id: str = Field(description="Project UUID")
    due_date: str | None = Field(None, description="ISO datetime, e.g. 2026-06-15T09:00:00")

@chat.function(
    description="Create a task in a project.",
)
async def create_task(ctx, params: CreateTaskParams):
    ...  # handler body goes here

Pydantic models for @chat.function must be defined at module scope. Function-local models silently disable the retry loop because auto-detection runs via func.__globals__. See the federal feedback memo on this for context.
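
The module-scope requirement follows from how Python resolves names. The sketch below assumes the detector resolves the `params` annotation by name against `func.__globals__`, which is one plausible reading of the note above and not the SDK's confirmed implementation. `detect_params_model` is a hypothetical helper, and dataclasses stand in for Pydantic models so the example needs only the standard library. The annotations are written as strings, which is what they become under `from __future__ import annotations`.

```python
import typing
from dataclasses import dataclass


@dataclass
class ModuleParams:  # defined at module scope: visible in __globals__
    title: str


def module_handler(ctx, params: "ModuleParams"):
    ...


def make_local_handler():
    @dataclass
    class LocalParams:  # function-local: NOT in the module's globals
        title: str

    def local_handler(ctx, params: "LocalParams"):
        ...

    return local_handler


def detect_params_model(func):
    """Resolve the 'params' annotation via func.__globals__, or return None."""
    try:
        hints = typing.get_type_hints(func, globalns=func.__globals__)
    except NameError:
        # The annotation names a class that module globals cannot see.
        return None
    return hints.get("params")
```

Under this assumption the module-level handler resolves to `ModuleParams`, while the function-local one yields `None`, so a caller that treats `None` as "untyped handler" would skip the retry loop without raising.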

Cost

A Sonnet 4.6 retry call uses ≈ 700 input + 150 output tokens, about $0.0044 per retry. At the 7-day baseline rate (12 rejected/24h × max 2 retries × $0.0044 × 30 days), worst-case additional spend is about **$3/month** per fleet. This is negligible compared to the baseline chain LLM cost.
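
The estimate can be reproduced with a few lines of arithmetic. The per-token prices below ($3 per million input tokens, $15 per million output tokens) are an assumption chosen because they reproduce the $0.0044 figure; check them against current pricing before relying on the result. The 30-day month is also an assumption.

```python
INPUT_TOKENS = 700
OUTPUT_TOKENS = 150
USD_PER_M_INPUT = 3.00    # assumed price per million input tokens
USD_PER_M_OUTPUT = 15.00  # assumed price per million output tokens

REJECTED_PER_DAY = 12     # 7-day baseline rate from the text
MAX_RETRIES = 2           # _RETRY_BUDGET
DAYS_PER_MONTH = 30       # assumed month length

# Cost of a single retry call in USD.
cost_per_retry = (
    INPUT_TOKENS * USD_PER_M_INPUT + OUTPUT_TOKENS * USD_PER_M_OUTPUT
) / 1_000_000

# Worst case: every rejected call uses its full retry budget.
monthly_worst_case = (
    REJECTED_PER_DAY * MAX_RETRIES * cost_per_retry * DAYS_PER_MONTH
)

print(f"per retry: ${cost_per_retry:.5f}")
print(f"worst case per month: ${monthly_worst_case:.2f}")
```

Under these assumptions a retry costs roughly $0.0044 and the monthly worst case comes to a little over $3, in line with the figure quoted above.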

Cross-references

SDK v4.1 overview

Install, hello world, what every extension must satisfy
