Fact-ledger — verbatim cross-turn recall
How Webbee remembers exact tool results across the last 5 turns without any extension API — the anti-fabrication surface under every classifier prompt.
The essence
The fact-ledger is the web-kernel's verbatim cross-turn recall surface. It stores the exact JSON-serialised ActionResult.data returned by every successful tool call and replays the most recent five turns of those facts into the intent classifier's prompt on every subsequent chat turn. Your extension writes nothing to the fact-ledger directly — the kernel's session-memory producer populates it automatically after each successful dispatch. There is no @ext.fact_ledger decorator, no ctx.fact_ledger accessor. Your only contract is to return clean, structured data inside ActionResult.data, and the kernel takes care of the rest.
Do not use the fact-ledger as persistent state — it is bounded to the last five turns; use ctx.store for longer-lived data. Do not rely on ActionResult.summary as the recall surface — the producer reads .data (structured payload), not the human-readable prose summary.
Why it exists
On 2026-05-15 a production incident surfaced a recurring failure mode: the chat narrator confidently fabricated factual claims that contradicted the most recent tool result. A list_tasks call returned 36 tasks; the narrator's follow-up turn announced "you have 50 tasks." The root cause sat in the chain dispatch path: the session-memory producer in session_workflow._step_to_dispatch_dict was reading step.data only to extract action_type and duration_ms, then returning a dispatch dict with data_summary (a 1500-character prose preview) but no data key at all. The classifier rehydrating this turn next round saw only paraphrased prose, not structured facts, and the narrator filled in plausibly-wrong numbers from training distribution.
The fix landed the same day (BUG-5 ROOT P3, commit adf54e6): the producer now extracts step.data minus the _action_meta envelope and plumbs it into the dispatch dict, and the classifier renders verbatim FACTS: lines under each turn's prose preview in _render_history. The fact-ledger is the surface that closure created.
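The shape of that fix can be pictured with a small sketch. It is not the kernel's code: the function name, the `summary` parameter, and the idea that `action_type` and `duration_ms` live inside the `_action_meta` envelope are assumptions made for illustration. Only the key names `data`, `data_summary`, and `_action_meta` come from the description above.

```python
from typing import Any

PREVIEW_CHARS = 1500  # length of the prose preview mentioned above


def step_to_dispatch_dict(step_data: dict[str, Any], summary: str) -> dict[str, Any]:
    """Hypothetical stand-in for the producer step, not the kernel's implementation."""
    # Assumption: action_type and duration_ms are carried inside the envelope.
    meta = step_data.get("_action_meta", {})
    # The structured facts are everything except the envelope.
    facts = {key: value for key, value in step_data.items() if key != "_action_meta"}
    return {
        "action_type": meta.get("action_type"),
        "duration_ms": meta.get("duration_ms"),
        "data_summary": summary[:PREVIEW_CHARS],
        "data": facts,  # the key that was missing before the fix
    }


dispatch = step_to_dispatch_dict(
    {"_action_meta": {"action_type": "read", "duration_ms": 42}, "total": 36},
    "Listed your tasks.",
)
```

With the `data` key present, the next turn can read `total` as 36 instead of guessing from the prose preview.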
What it stores
After every successful tool call, the kernel writes a ToolCallDigest entry into the session-memory record for that turn. The digest contains:
| Field | Source | Purpose |
|---|---|---|
| app_id | The extension owning the tool | Lets the LLM attribute a fact to its source extension |
| fn_name | The @chat.function name that ran | Identifies the specific tool that produced the data |
| data_facts_json | json.dumps(ActionResult.data) from the producer | The verbatim payload the LLM later sees |
On every chat turn, the classifier's _build_history_items populates TurnHistoryItem.data_facts for the most recent five turns, and _render_history emits one FACTS: app=<id> fn=<name> data=<json> line per tool call directly under the turn's prose preview.
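A minimal sketch of the digest and its rendered line, assuming the three fields from the table are plain strings. The dataclass layout and the helper names `make_digest` and `render_facts_line` are illustrative, not the kernel's actual definitions.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCallDigest:
    """Illustrative mirror of the three fields in the table above."""

    app_id: str
    fn_name: str
    data_facts_json: str


def make_digest(app_id: str, fn_name: str, data: Any) -> ToolCallDigest:
    # json.dumps keeps nested structures at full depth.
    return ToolCallDigest(app_id, fn_name, json.dumps(data))


def render_facts_line(digest: ToolCallDigest) -> str:
    # Follows the FACTS: app=<id> fn=<name> data=<json> layout described above.
    return f"FACTS: app={digest.app_id} fn={digest.fn_name} data={digest.data_facts_json}"


digest = make_digest("tasks", "list_tasks", {"total": 36, "open": [{"id": 1}]})
line = render_facts_line(digest)
```

Here `line` holds a single string that the classifier prompt would place under the turn's prose preview.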
Caps and limits
The fact-ledger is bounded so it stays affordable even when classifiers run on every chat turn.
| Constraint | Value | Federal invariant |
|---|---|---|
| Aggregate fact-ledger size per turn | ≤ 3000 chars | I-FACT-LEDGER-PER-TURN-AGG-CAP |
| Number of turns retained in classifier view | Last 5 | Hardcoded in _build_history_items |
| PII mask gate | Env var IMPERAL_FACT_LEDGER_EXPOSE_PII — defaults to false in source; set to true in production .env | I-FACT-LEDGER-PII-MASK-DEFAULT-OFF |
| Verbatim contract | Producer must plumb ActionResult.data, not data_summary prose | I-DISPATCH-RESULT-DATA-PLUMBED |
| Deep serialisation | data_facts_json is json.dumps(.data) — nested structures preserved at full depth | I-CROSS-TURN-FACT-LEDGER-DEEP-SERIALIZED |
When a turn's facts exceed the aggregate cap, the producer truncates from the tail (oldest tool calls in the turn) and surfaces a warning marker. Truncation is rare in practice — typical tool calls produce 100-500 character payloads.
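One way to picture the aggregate cap is the sketch below. It works on whole FACTS lines and keeps the leading ones that fit. Which end of the turn the kernel actually drops, the marker text, and the function name are assumptions here, so treat this as an illustration of the budget rather than the producer's real algorithm.

```python
AGG_CAP_CHARS = 3000  # per-turn aggregate cap from the table above
TRUNCATION_MARKER = "FACTS: [truncated]"  # hypothetical marker text


def cap_turn_facts(lines: list[str], cap: int = AGG_CAP_CHARS) -> list[str]:
    """Keep leading FACTS lines while their combined length fits the cap."""
    kept: list[str] = []
    used = 0
    for line in lines:
        if used + len(line) > cap:
            # Drop this line and everything after it, and flag the cut.
            kept.append(TRUNCATION_MARKER)
            break
        kept.append(line)
        used += len(line)
    return kept


small = cap_turn_facts(["a" * 400, "b" * 400])
large = cap_turn_facts(["a" * 2000, "b" * 2000, "c" * 10])
```

Two 400-character payloads pass through untouched, while the second 2000-character payload pushes the turn over the cap and is replaced by the marker.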
PII gate
When a tool returns sensitive content (an email body, a customer name, a phone number), the fact-ledger does not surface it to the classifier unless IMPERAL_FACT_LEDGER_EXPOSE_PII=true is set in the runtime environment.
Production sets this to true because the classifier already runs inside the same security perimeter as the data itself and benefits from precise recall. Self-hosted deployments where the classifier provider differs from the data tenant should keep it false; in that mode the producer masks email-shaped, phone-shaped, and named-entity strings before serialisation.
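The gate can be sketched as follows. Only the environment variable name comes from this page; the regular expressions, the `[email]` and `[phone]` placeholder tokens, and the function names are assumptions, and named-entity masking is left out because it needs more than a regular expression.

```python
import os
import re

# Deliberately loose, illustrative patterns; real masking rules may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def expose_pii(env=None) -> bool:
    """Read the gate; anything other than 'true' keeps masking on."""
    env = os.environ if env is None else env
    return env.get("IMPERAL_FACT_LEDGER_EXPOSE_PII", "false").strip().lower() == "true"


def mask_pii(value, expose: bool):
    """Recursively mask email-shaped and phone-shaped strings unless exposed."""
    if expose:
        return value
    if isinstance(value, str):
        return PHONE_RE.sub("[phone]", EMAIL_RE.sub("[email]", value))
    if isinstance(value, dict):
        return {key: mask_pii(item, expose) for key, item in value.items()}
    if isinstance(value, list):
        return [mask_pii(item, expose) for item in value]
    return value


payload = {"from": "sarah@example.com", "phone": "+1 415 555 0100", "unread": 8}
masked = mask_pii(payload, expose_pii({}))
exposed = mask_pii(payload, expose_pii({"IMPERAL_FACT_LEDGER_EXPOSE_PII": "true"}))
```

Masking runs before serialisation in this sketch, so the classifier would never see the raw strings when the gate is off.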
What the classifier sees
In the classifier's [HISTORY] block, each turn renders as a one-line prose preview followed by one FACTS: line per successful tool call:
What this shows: a two-turn excerpt where the second turn's anaphoric reference ("send it to the same address") is resolvable because the first turn's structured email address is right there in the fact line.
[2026-05-15T17:21:04Z user ok apps=[mail]] show today's emails → sent a list with 8 unread
FACTS: app=mail fn=list_inbox data={"unread": 8, "messages": [{"id": "abc", "from": "sarah@example.com", "subject": "Q3 plan"}]}
[2026-05-15T17:21:14Z user] send "project status" to the same address
In the second turn, the classifier resolves "the same address" by reading the previous turn's FACTS: line and extracting sarah@example.com verbatim from the structured data.messages[0].from field, never from the prose summary, which might have paraphrased the address away.
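The classifier is a language model, so it reads the line rather than parsing it. The sketch below only demonstrates that the payload stays machine-addressable. The `parse_facts_line` helper is hypothetical and assumes the `FACTS: app=<id> fn=<name> data=<json>` layout shown above.

```python
import json

FACTS_LINE = (
    'FACTS: app=mail fn=list_inbox data={"unread": 8, "messages": '
    '[{"id": "abc", "from": "sarah@example.com", "subject": "Q3 plan"}]}'
)


def parse_facts_line(line: str) -> dict:
    """Split a FACTS line into app, fn, and the decoded JSON payload."""
    head, _, payload = line.partition(" data=")
    # head looks like "FACTS: app=mail fn=list_inbox"; skip the leading tag.
    fields = dict(part.split("=", 1) for part in head.split()[1:])
    return {"app": fields["app"], "fn": fields["fn"], "data": json.loads(payload)}


facts = parse_facts_line(FACTS_LINE)
same_address = facts["data"]["messages"][0]["from"]
```

The address comes back exactly as the tool returned it, which is the property a prose preview cannot guarantee.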
What you DO NOT do
The fact-ledger has no extension API. Specifically:
- There is no @ext.fact_ledger decorator. Attempts to import one fail at module load.
- There is no ctx.fact_ledger accessor. getattr(ctx, "fact_ledger", None) returns None.
- There is no ctx.skeleton.set() either: both the skeleton and the fact-ledger are read-only from extension code, and the kernel is the sole writer to both.
Your only contract: return clean structured payloads in ActionResult.data.
- If you put 30 KB of nested JSON into .data, the producer truncates per the aggregate cap (3000 chars per turn) and the classifier sees only the leading portion.
- If you put PII in plaintext, the PII mask gate redacts it at serialisation time unless IMPERAL_FACT_LEDGER_EXPOSE_PII=true is set.
- If you put unstructured prose in .data (a string instead of a dict), the JSON serialisation works but the classifier cannot index into fields, so anaphora resolution will fail.
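The difference between a structured payload and prose can be seen with the standard json module alone. The two payloads below are made-up examples, not output from a real extension.

```python
import json

structured = {"unread": 8, "messages": [{"from": "sarah@example.com"}]}
prose = "You have 8 unread messages, the latest one is from Sarah."

# Both payloads serialise without error.
structured_roundtrip = json.loads(json.dumps(structured))
prose_roundtrip = json.loads(json.dumps(prose))

# The dict keeps addressable fields after serialisation.
sender = structured_roundtrip["messages"][0]["from"]

# The string survives serialisation but exposes no fields to index into.
has_fields = isinstance(prose_roundtrip, dict)
```

Only the structured payload lets a later turn pick out the sender's address by field; the prose version carries the name but not the address at all.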
Federal invariants
The fact-ledger is defended by five named contracts enforced at the kernel boundary:
I-DISPATCH-RESULT-DATA-PLUMBED
Producer must read ActionResult.data, never data_summary, when populating ToolCallDigest.
I-CROSS-TURN-FACT-LEDGER-DEEP-SERIALIZED
data_facts_json preserves nested structures at full depth; no shallow projections.
I-FACT-LEDGER-PER-TURN-AGG-CAP
≤3000 chars aggregate per turn; truncation from tail when exceeded.
I-FACT-LEDGER-PII-MASK-DEFAULT-OFF
PII exposure is opt-in via IMPERAL_FACT_LEDGER_EXPOSE_PII; default off in source.
I-CLASSIFIER-PROMPT-RENDERS-FACT-LEDGER
The classifier prompt must include FACTS: lines under each turn's preview; render gap is a federal regression.
See the federal invariants page for the full inventory.
What's next
Context channels
How the fact-ledger fits alongside skeleton and ctx.cache.
Skeletons
The other classifier-visible channel — TTL-refreshed awareness probe.
Federal invariants
Every contract enforced at the boundary, including the fact-ledger ones.
Cache vs Store
The two data surfaces extensions own directly.