Skeletons
Small data probes the web-kernel injects into the intent classifier on every chat turn — your extension's awareness surface for the LLM.
The essence
A skeleton is a small async function decorated with @ext.skeleton that the web-kernel queries on a per-user TTL schedule. The web-kernel stores each snapshot and injects it into the intent classifier's prompt on every chat turn, so the LLM always knows the current counts, flags, and recent items in your extension's domain without making a tool call first. You write one function returning {"response": {...}} with flat scalar values; the kernel handles registration, scheduling, refresh, per-user isolation, and watchdog respawn. Aim for ≤2 KB per section: the kernel compresses everything it injects, so anything beyond six top-level fields of short scalars and at most five inline list items is silently truncated, collapsed, or dropped.
Skeletons are for ambient awareness, not for rich content. Email bodies, full task lists, large nested objects belong in ctx.cache (panel-bound, 90s TTL) or arrive verbatim via the fact-ledger (last 5 turns of ActionResult.data). Putting a 30-message inbox into a skeleton wastes refresh cycles and never reaches the LLM anyway — the kernel collapses lists longer than five to a list[N] shape hint before classifier injection.
Skeletons are NOT for building UI
A skeleton is a data probe consumed by the LLM — it has no rendering side, no React component, no panel slot. Never use @ext.skeleton to build a sidebar, dashboard, editor, or any visible surface. For UI use @ext.panel instead. Inside a panel handler, fetch user state via ctx.cache (short-lived) or ctx.store (persistent) — NOT via ctx.skeleton.get(), which is restricted to @ext.skeleton handlers by federal invariant I-SKELETON-LLM-ONLY and raises SkeletonAccessForbidden from any other context.
What you write
Every skeleton is one Python async function. The decorator registers it as a synthetic tool named skeleton_refresh_{section_name}; the web-kernel schedules and calls it.
What this does: declares a notes skeleton section that the kernel refreshes every 300 seconds for each installed user, returning three scalar counts plus a short list of recent titles.
```python
from imperal_sdk import Extension

ext = Extension(
    "notes",
    display_name="Notes",
    description="Notes extension — create, organize, and search your notes with AI.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton(
    "notes",
    alert=False,
    ttl=300,
    description="Note statistics: total count, pinned, trash, recent titles.",
)
async def skeleton_refresh_notes(ctx) -> dict:
    """Refresh note statistics. Pure read — idempotent."""
    total = await ctx.store.count("notes")
    pinned = await ctx.store.count("notes", where={"is_pinned": True})
    trash = await ctx.store.count("notes", where={"is_trashed": True})
    recent = await ctx.store.query("notes", order_by="-updated_at", limit=5)
    return {"response": {
        "total_notes": total,
        "pinned_notes": pinned,
        "trash_count": trash,
        "recent_notes": [
            {"note_id": d.id, "title": d.data.get("title", "")}
            for d in recent.data
        ],
    }}
```
The decorator's signature is fixed:
```python
@ext.skeleton(
    section_name: str,       # required — positional, no spaces or special chars
    *,
    alert: bool = False,     # if True, paired skeleton_alert_{section_name} fires on change
    ttl: int = 300,          # hint to operators; authoritative TTL lives in the kernel Registry
    description: str = "",   # manifest tool description; defaults to "Skeleton refresh: {section_name}"
)
```
@ext.skeleton is a method on Extension, never on ChatExtension. The web-kernel key under which the snapshot is stored is imperal:skeleton:{app_id}:{user_id}:{section_name} (federal invariant I-SKELETON-KEY-FORMAT).
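The two naming rules above (the synthetic tool name and the storage key) are plain string templates. The sketch below only restates them in code; the helper names skeleton_tool_name and skeleton_storage_key are hypothetical and are not part of the SDK.

```python
def skeleton_tool_name(section_name: str) -> str:
    """Name of the synthetic tool the decorator registers (per the text above)."""
    return f"skeleton_refresh_{section_name}"


def skeleton_storage_key(app_id: str, user_id: str, section_name: str) -> str:
    """Key format stated by I-SKELETON-KEY-FORMAT; this helper is illustrative only."""
    return f"imperal:skeleton:{app_id}:{user_id}:{section_name}"


# Example values are made up for illustration.
tool = skeleton_tool_name("notes")
key = skeleton_storage_key("notes", "user-1", "notes")
```

Because the user ID is part of the key, two users of the same extension never share a snapshot.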
What the web-kernel does with your skeleton
This is the part most extension authors miss, and it explains the common complaint "I put 30 emails in my skeleton and the LLM doesn't see them."
TTL refresh loop
When a user installs an extension, the web-kernel bootstraps one per-user SkeletonRefreshWorkflow per registered section. Every ttl seconds it:
- Calls skeleton_refresh_{section_name}(ctx) with a full Context object (ctx.user, ctx.store, ctx.http, ctx.cache, ctx.notify, ctx.ai are all available, same as in any tool).
- Unwraps the {"response": ...} envelope.
- Stores the inner dict under imperal:skeleton:{app_id}:{user_id}:{section_name} via the skeleton_save_section activity. The save adds _refreshed_at (epoch seconds) and _freshness metadata for staleness tracking.
Federal invariants I-SKEL-AUTO-DERIVE-1 (no manual section listing required), I-SKELETON-WATCHDOG (terminated workflows respawn within seconds), and I-SKELETON-CAN-ROTATE (continue-as-new rotation prevents unbounded history) keep this loop running indefinitely without operator intervention.
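One refresh cycle (call, unwrap, store) can be sketched as a single coroutine. This is a minimal illustration and not the kernel's code: refresh_once, the in-memory storage dict, and the injectable now argument are hypothetical, and the _freshness metadata is omitted because its shape is not described here.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, Optional


async def refresh_once(
    handler: Callable[[Any], Awaitable[dict]],
    ctx: Any,
    storage: Dict[str, dict],
    key: str,
    now: Optional[float] = None,
) -> dict:
    """Run one refresh cycle: call the handler, unwrap, stamp, store."""
    result = await handler(ctx)               # step 1: call the refresh handler
    inner = dict(result["response"])          # step 2: unwrap the envelope
    inner["_refreshed_at"] = int(time.time() if now is None else now)
    storage[key] = inner                      # step 3: store under the section key
    return inner


async def skeleton_refresh_demo(ctx) -> dict:
    # Stand-in handler; a real one would read from ctx.store.
    return {"response": {"total_notes": 3}}


storage: Dict[str, dict] = {}
key = "imperal:skeleton:notes:user-1:notes"
snapshot = asyncio.run(
    refresh_once(skeleton_refresh_demo, None, storage, key, now=1000)
)
```

In the real platform this cycle repeats every ttl seconds inside a workflow; the sketch shows a single pass so the data flow is easy to follow.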
Compression at classifier-render time
On every chat turn, the web-kernel reads every stored skeleton section for the active user, compresses each one, and injects the compressed result into the intent classifier's prompt. The compression rules are not configurable — they apply to every skeleton, every turn:
| Compression rule | Behaviour | Symbol |
|---|---|---|
| Strings longer than 60 chars | Truncated to 60 chars + "..."; newlines flattened to spaces | _render_skeleton_value |
| Lists of dicts, ≤5 items | Expanded inline as [label (#id), label (#id), ...] — extracts title/name/label/subject + id/project_id/task_id keys | I-SKELETON-SMALL-LIST-INLINE |
| Lists of dicts, >5 items | Collapsed to list[N] shape hint — content is NOT visible to the classifier | _render_skeleton_value |
| Nested dicts (non-top-level) | Collapsed to dict[N keys] shape hint | _render_skeleton_value |
| Top-level section dict | First 6 non-underscore-prefixed fields rendered as key=value, key=value, ...; remaining fields dropped with ... suffix | _summarise_skeleton |
| Staleness suffix | Each section gets (cached ~Xs ago) computed from _refreshed_at | _section_age_hint |
| Staleness header | The entire skeleton block prefixed with a federally-mandated note instructing the classifier to treat numbers as cached and fetch fresh for specific metrics | _SKELETON_STALENESS_HEADER |
Practical consequence: if your handler returns 30 emails with full bodies in a recent_emails array, the classifier prompt receives recent_emails: list[30]. The bodies never reach the LLM. Rich content goes in ctx.cache (for panels and chat functions to fetch on demand) or arrives via the fact-ledger (verbatim recent ActionResult.data).
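To make the table concrete, here is a rough re-implementation of the rules as this page states them. It is a sketch under assumptions, not the kernel's _render_skeleton_value or _summarise_skeleton: the function names are hypothetical, lists that are empty or hold non-dict items are treated as list[N] (the table does not cover them), and the staleness suffix is left out.

```python
LABEL_KEYS = ("title", "name", "label", "subject")
ID_KEYS = ("id", "project_id", "task_id")
MAX_STRING = 60        # strings longer than this are truncated
MAX_INLINE_ITEMS = 5   # lists of dicts up to this length are expanded inline
MAX_FIELDS = 6         # top-level fields rendered per section


def _first(item: dict, keys) -> str:
    """Return the value of the first key present in item, as a string."""
    for key in keys:
        if key in item:
            return str(item[key])
    return ""


def render_value(value) -> str:
    """Compress one field value following the documented rules."""
    if isinstance(value, str):
        flat = value.replace("\n", " ")
        return flat if len(flat) <= MAX_STRING else flat[:MAX_STRING] + "..."
    if isinstance(value, dict):
        return f"dict[{len(value)} keys]"
    if isinstance(value, list):
        inline = (0 < len(value) <= MAX_INLINE_ITEMS
                  and all(isinstance(v, dict) for v in value))
        if not inline:
            return f"list[{len(value)}]"   # content is not visible
        parts = []
        for item in value:
            label, ident = _first(item, LABEL_KEYS), _first(item, ID_KEYS)
            parts.append(f"{label} (#{ident})" if ident else label)
        return "[" + ", ".join(parts) + "]"
    return str(value)


def summarise(section: dict) -> str:
    """Render the first six non-underscore fields as key=value pairs."""
    fields = [(k, v) for k, v in section.items() if not k.startswith("_")]
    shown = ", ".join(f"{k}={render_value(v)}" for k, v in fields[:MAX_FIELDS])
    return shown + ", ..." if len(fields) > MAX_FIELDS else shown


section = {
    "total_notes": 12,
    "recent_notes": [{"id": "abc", "title": "Meeting prep"}],
    "recent_emails": [{"id": i, "body": "..."} for i in range(30)],
    "_refreshed_at": 1000,
}
rendered = summarise(section)
```

Running an extension's own sample payload through a helper like this before shipping is a cheap way to see roughly what survives compression.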
Staleness envelope
Federal invariant I-SKELETON-STALENESS-ENVELOPE requires every skeleton block in the classifier prompt to carry a header explicitly framing the data as a cached snapshot:
```text
NOTE: every section below is a cached per-user snapshot. The
(cached ~Xs ago) tag is the age of that section's data.
- Availability / presence / enablement (bool, counts of 0 vs non-0,
  section existence): skeleton is AUTHORITATIVE — quote directly.
- Specific numbers / metrics / timestamps / content: skeleton is
  STALE. When the user asks for the CURRENT value, target_apps MUST
  include the relevant extension so it can fetch fresh data before
  the LLM narrates.
```
This is why "how many tasks do I have" works from skeleton alone, but "what's my exact balance right now" routes through a tool call: the kernel teaches the classifier the boundary.
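The (cached ~Xs ago) tag is derived from the _refreshed_at timestamp saved with each section. A minimal sketch of that calculation, assuming whole seconds and a hypothetical helper name (the kernel's _section_age_hint may format ages differently):

```python
import time
from typing import Optional


def section_age_hint(section: dict, now: Optional[float] = None) -> str:
    """Return '(cached ~Xs ago)' for a stored section, or '' if unstamped."""
    refreshed_at = section.get("_refreshed_at")
    if refreshed_at is None:
        return ""
    current = time.time() if now is None else now
    age = max(0, int(current - refreshed_at))   # clamp clock skew to zero
    return f"(cached ~{age}s ago)"


hint = section_age_hint({"total": 3, "_refreshed_at": 100}, now=142.9)
```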
Return contract
The function must return {"response": <dict>} where the inner dict contains flat scalar values, short strings, and at most a five-item list of small dicts.
What this does: declares a status skeleton that surfaces three integer counts the classifier can use for routing decisions without fetching data.
```python
from imperal_sdk import Extension

ext = Extension(
    "my-app",
    display_name="My App",
    description="My App extension — manage resources with AI assistance.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton("status", ttl=300)
async def skeleton_refresh_status(ctx) -> dict:
    total = await ctx.store.count("items")
    pending = await ctx.store.count("items", where={"status": "pending"})
    active = await ctx.store.count("items", where={"status": "active"})
    return {"response": {
        "total_count": total,
        "pending_count": pending,
        "active_count": active,
    }}
```
The outer {"response": ...} wrapper is required: skeleton_save_section unwraps it and stores the inner dict. Returning the inner dict directly is a federal error.
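A local check for the envelope can catch a missing wrapper in unit tests before the kernel does. The sketch below is an assumption about what such a check might look like: unwrap_envelope is a hypothetical helper, and ValueError stands in for whatever error the kernel actually reports.

```python
def unwrap_envelope(result: object) -> dict:
    """Return the inner section dict, or raise if the envelope is malformed."""
    if not isinstance(result, dict) or "response" not in result:
        raise ValueError('skeleton handler must return {"response": {...}}')
    inner = result["response"]
    if not isinstance(inner, dict):
        raise ValueError('"response" must map to a dict of section fields')
    return inner


good = unwrap_envelope({"response": {"total_count": 3}})

# Returning the inner dict directly (no wrapper) is rejected.
try:
    unwrap_envelope({"total_count": 3})
    rejected = False
except ValueError:
    rejected = True
```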
Size discipline
Three numbers to internalise:
| Constraint | Value | Why |
|---|---|---|
| Author target per section (norm) | ≤ 2 KB raw JSON | The classifier prompt has a finite budget; bigger sections compress hard and waste the kernel's render work |
| What the classifier actually sees | ≤ 6 fields × short scalars + ≤ 5 inline dict items + ≤ 60-char strings per section | The kernel-side compression rules above |
| Anti-pattern (silently compressed away) | Full message bodies, lists >5 items, strings >60 chars, deeply nested dicts | Use ctx.cache (panel snapshots) or rely on the fact-ledger (verbatim recent tool results) |
A common author mistake is to think "the skeleton is the LLM's working memory, so I should put rich content there." That mental model is wrong. The skeleton is the classifier's ambient awareness: enough to route precisely, not enough to narrate from. Rich content reaches the LLM through different channels.
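The limits in the table can be turned into a small lint to run in tests against a handler's sample output. This is a sketch, not an SDK feature: lint_section is a hypothetical helper, and "2 KB" is read here as 2048 bytes of UTF-8 encoded JSON, which is an assumption.

```python
import json
from typing import List

MAX_BYTES = 2 * 1024   # "≤ 2 KB raw JSON", read as 2048 bytes (assumption)


def lint_section(inner: dict) -> List[str]:
    """Return human-readable warnings for content the kernel would compress away."""
    warnings = []
    size = len(json.dumps(inner, ensure_ascii=False).encode("utf-8"))
    if size > MAX_BYTES:
        warnings.append(f"section is {size} bytes, over the {MAX_BYTES}-byte target")
    visible = [k for k in inner if not k.startswith("_")]
    if len(visible) > 6:
        warnings.append(f"{len(visible)} fields; only the first 6 are rendered")
    for key, value in inner.items():
        if isinstance(value, str) and len(value) > 60:
            warnings.append(f"{key}: string of {len(value)} chars is truncated to 60")
        elif isinstance(value, list) and len(value) > 5:
            warnings.append(f"{key}: list of {len(value)} collapses to list[{len(value)}]")
        elif isinstance(value, dict):
            warnings.append(f"{key}: nested dict collapses to dict[{len(value)} keys]")
    return warnings


clean = lint_section({"total": 3, "pending": 1})
noisy = lint_section({"recent_emails": [{"id": i} for i in range(30)]})
```

An empty result means every field should reach the classifier in a readable form under the rules described on this page.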
Common patterns
Counter skeleton
The simplest pattern — surface counts the LLM can use for routing decisions without fetching data.
What this does: registers a tasks skeleton that refreshes every 30 seconds and surfaces three overdue/today/upcoming counts to the classifier.
```python
from imperal_sdk import Extension

ext = Extension(
    "tasks",
    display_name="Tasks",
    description="Tasks extension — manage and track your tasks with AI assistance.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton(
    "tasks",
    alert=True,
    ttl=30,
    description="Today/overdue/upcoming task counts for instant AI context.",
)
async def skeleton_refresh_tasks(ctx) -> dict:
    overdue = await ctx.store.count("tasks", where={"overdue": True})
    today = await ctx.store.count("tasks", where={"due_today": True})
    upcoming = await ctx.store.count("tasks", where={"upcoming_7d": True})
    return {"response": {
        "overdue_count": overdue,
        "today_count": today,
        "upcoming_7d_count": upcoming,
    }}
```
Recent-items skeleton (≤5 items)
Surface a short list of the most-recently updated items so the LLM can reference them by name without a tool call. The kernel expands lists of ≤5 dicts inline by extracting title/name/label and id fields.
What this does: registers a notes skeleton that surfaces two counts plus the five most-recently updated notes by note_id + title.
```python
# Reuses the `ext` object from the notes example in "What you write".
@ext.skeleton(
    "notes",
    ttl=300,
    description="Note statistics: total count, pinned, trash, recent titles.",
)
async def skeleton_refresh_notes(ctx) -> dict:
    total = await ctx.store.count("notes")
    pinned = await ctx.store.count("notes", where={"is_pinned": True})
    recent = await ctx.store.query("notes", order_by="-updated_at", limit=5)
    return {"response": {
        "total_notes": total,
        "pinned_notes": pinned,
        "recent_notes": [
            {"note_id": d.id, "title": d.data.get("title", "")}
            for d in recent.data
        ],
    }}
```
The classifier will see this rendered as roughly recent_notes=[Meeting prep (#abc), Q3 roadmap (#xyz), ...]. Keep limit at 5 or below: a six-item list collapses entirely to list[6] and the LLM loses every title.
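Because a skeleton handler is an ordinary async function of ctx, it can be exercised without the kernel by passing a stub. The sketch below assumes nothing about the SDK beyond the calls used in the handler above; StubDoc, StubPage, StubStore, and StubCtx are hypothetical stand-ins that mimic only ctx.store.count and ctx.store.query, and the decorator is omitted so the block runs on its own.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubDoc:
    id: str
    data: dict


@dataclass
class StubPage:
    data: list


class StubStore:
    """Hypothetical in-memory stand-in for ctx.store (not the SDK class)."""

    def __init__(self, docs):
        self._docs = list(docs)

    async def count(self, collection, where=None):
        where = where or {}
        return sum(
            all(doc.data.get(k) == v for k, v in where.items())
            for doc in self._docs
        )

    async def query(self, collection, order_by=None, limit=None):
        docs = list(self._docs)
        if order_by:
            field = order_by.lstrip("-")
            docs.sort(key=lambda d: d.data.get(field, 0),
                      reverse=order_by.startswith("-"))
        return StubPage(docs[:limit])


class StubCtx:
    def __init__(self, store):
        self.store = store


async def skeleton_refresh_notes(ctx) -> dict:
    # Same body as the handler above, minus the decorator.
    total = await ctx.store.count("notes")
    pinned = await ctx.store.count("notes", where={"is_pinned": True})
    recent = await ctx.store.query("notes", order_by="-updated_at", limit=5)
    return {"response": {
        "total_notes": total,
        "pinned_notes": pinned,
        "recent_notes": [
            {"note_id": d.id, "title": d.data.get("title", "")}
            for d in recent.data
        ],
    }}


# Eight notes; every fourth one is pinned; higher updated_at means newer.
docs = [
    StubDoc(f"n{i}", {"title": f"Note {i}", "updated_at": i, "is_pinned": i % 4 == 0})
    for i in range(8)
]
result = asyncio.run(skeleton_refresh_notes(StubCtx(StubStore(docs))))
```

A test can then assert that recent_notes never exceeds five entries, which is the property the inline-expansion rule depends on.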
Status-gauge skeleton
Aggregate statuses into counts for quick triage questions ("are any monitors down?").
What this does: queries up to 50 monitors, fetches their latest snapshots in parallel, and returns four counters the classifier can use to answer "all OK?" without a tool call.
```python
import asyncio

from imperal_sdk import Extension

ext = Extension(
    "web-tools",
    display_name="Web Tools",
    description="Web Tools extension — monitor websites and APIs with AI assistance.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton(
    "web_tools",
    ttl=300,
    description="Monitor status counts: how many are critical, warning, ok.",
)
async def skeleton_refresh_web_tools(ctx) -> dict:
    try:
        page = await ctx.store.query(
            "wt_monitors",
            where={"owner_id": ctx.user.imperal_id},
            limit=50,
        )
        if not page.data:
            return {"response": {"total": 0, "critical": 0, "warning": 0, "ok": 0}}

        snap_ids = [m.data.get("last_snapshot_id") for m in page.data]

        async def _get_snap(snap_id: str | None):
            if snap_id:
                return await ctx.store.get("wt_snapshots", snap_id)
            return None

        snaps = await asyncio.gather(*[_get_snap(sid) for sid in snap_ids])

        critical = warning = ok = 0
        for m, snap in zip(page.data, snaps):
            status = snap.data.get("status", "unknown") if snap else "unknown"
            if status == "critical":
                critical += 1
            elif status == "warning":
                warning += 1
            elif status == "ok":
                ok += 1
        return {"response": {
            "total": len(page.data),
            "critical": critical,
            "warning": warning,
            "ok": ok,
        }}
    except Exception as exc:
        return {"response": {
            "error": str(exc)[:120],
            "total": 0, "critical": 0, "warning": 0, "ok": 0,
        }}
```
Alert-on-change skeleton
Use alert=True plus a paired @ext.tool to surface deltas as ambient alerts to the classifier. The kernel diffs the previous snapshot against the new one after each refresh; if they differ, it calls skeleton_alert_{section_name}(ctx, old=<prev>, new=<next>).
What this does: registers a tasks skeleton with alert=True and a companion alert tool that emits a one-line message whenever new overdue tasks appear since the last refresh.
```python
# Reuses the `ext` object from the counter skeleton example above.
@ext.skeleton("tasks", alert=True, ttl=30)
async def skeleton_refresh_tasks_alert(ctx) -> dict:
    overdue = await ctx.store.count("tasks", where={"overdue": True})
    today = await ctx.store.count("tasks", where={"due_today": True})
    return {"response": {"overdue_count": overdue, "today_count": today}}


@ext.tool(
    "skeleton_alert_tasks",
    description="Alert on new overdue tasks or today's task changes.",
)
async def skeleton_alert_tasks(
    ctx,
    old: dict | None = None,
    new: dict | None = None,
) -> dict:
    if not new:
        return {"response": ""}
    overdue = new.get("overdue_count", 0)
    old_overdue = (old or {}).get("overdue_count", 0)
    if overdue > 0 and overdue > old_overdue:
        delta = overdue - old_overdue
        return {"response": f"{delta} new overdue task(s) — {overdue} total overdue"}
    return {"response": ""}
```
A non-empty string returned by the alert tool is included in the next classifier prompt as an ambient alert.
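The diff-then-alert step described above can be pictured as follows. This is a sketch of the described behaviour, not the kernel's implementation: run_alert_if_changed and skeleton_alert_demo are hypothetical, the comparison is plain dict equality, and whether the kernel strips metadata such as _refreshed_at before diffing is not stated here.

```python
import asyncio
from typing import Optional


async def run_alert_if_changed(alert_tool, ctx, old: Optional[dict], new: dict) -> str:
    """Call the alert tool only when the snapshot changed; return its message."""
    if old == new:
        return ""                      # nothing changed, no alert call
    result = await alert_tool(ctx, old=old, new=new)
    return result.get("response", "")


async def skeleton_alert_demo(ctx, old=None, new=None) -> dict:
    # Stand-in alert tool with the same signature as skeleton_alert_tasks.
    delta = (new or {}).get("overdue_count", 0) - (old or {}).get("overdue_count", 0)
    return {"response": f"{delta} new overdue task(s)" if delta > 0 else ""}


changed = asyncio.run(run_alert_if_changed(
    skeleton_alert_demo, None, {"overdue_count": 1}, {"overdue_count": 3}))
unchanged = asyncio.run(run_alert_if_changed(
    skeleton_alert_demo, None, {"overdue_count": 3}, {"overdue_count": 3}))
```

Returning an empty string when there is nothing worth surfacing keeps the classifier prompt free of noise.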
Skeleton vs cache vs fact-ledger vs panel
All four come from your extension. They address different parts of the platform:
| | Skeleton | ctx.cache | Fact-ledger | Panel |
|---|---|---|---|---|
| Decorator | @ext.skeleton | (none — ctx.cache.set/get) | (none — kernel-populated) | @ext.panel |
| Who writes | Extension (via decorator) | Extension (in handlers) | Kernel (after every successful tool) | Extension (UINode tree) |
| Who reads | Intent classifier | Extension handlers + panels | Intent classifier | React panel host |
| When triggered | TTL schedule (every ttl seconds, per user) | On demand in handlers | After every successful @chat.function | Batch discovery + ui.Call |
| Lifetime | Until user uninstalls extension | TTL up to 300s | Last 5 turns | Until user navigates away |
| Max useful payload | ~2 KB (kernel compresses) | 64 KB per key | 3 KB aggregate per turn | Arbitrary UINode |
| Purpose | Ambient awareness for routing | Page snapshots, expensive recomputation | Verbatim cross-turn recall | Visual UI |
A mature extension uses all four. The skeleton surfaces "you have 8 unread emails." The chat function list_inbox fetches the 25-message page and caches it. The fact-ledger remembers the verbatim list for the next 5 turns. The panel renders the messages visibly. None of them overlap.
Federal guarantees
Skeleton workflows are managed by the web-kernel. Extension code never has to monitor, restart, or schedule them.
Auto-derive
Sections are discovered by the skeleton_refresh_{X} naming convention — no manual listing or Registry migration needed (I-SKEL-AUTO-DERIVE-1).
Watchdog respawn
A parent session workflow watches each skeleton handle. If the workflow terminates unexpectedly, it is respawned within seconds (I-SKELETON-WATCHDOG).
Continue-as-new safe
Skeleton workflows run indefinitely without accumulating unbounded history. The kernel auto-rotates the handle (I-SKELETON-CAN-ROTATE).
Live invalidation
On user uninstall, skeleton state is purged immediately. Within two seconds the data is physically unreachable (I-SKEL-LIVE-INVALIDATE).
Per-user isolation
One skeleton workflow runs per user per extension. ctx.user.[imperal_id](/en/reference/glossary/) is kernel-authoritative — your code cannot accidentally see another user's state.
Read-only outside handlers
ctx.skeleton.get() is restricted to @ext.skeleton handlers; calling from @chat.function or @ext.panel raises SkeletonAccessForbidden (I-SKELETON-LLM-ONLY).
Access guard
ctx.skeleton.get(section) reads the currently stored snapshot for a section. It is available only inside @ext.skeleton handlers — calling it anywhere else raises SkeletonAccessForbidden at runtime.
Validator rule V24 (AST scan) flags any ctx.skeleton access in @chat.function or @ext.panel bodies as an error at imperal validate time. The rationale: skeletons are classifier input, not a generic data source. Use ctx.cache for short-lived snapshots shared between the skeleton and chat functions, or ctx.store for persistent per-user state.
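An AST scan of this kind can be approximated with Python's standard ast module. The sketch below is not the validator's implementation: find_skeleton_access is a hypothetical name, and it assumes the decorators are written literally as @chat.function or @ext.panel and that the context parameter is named ctx.

```python
import ast
from typing import List, Optional, Tuple

FORBIDDEN_DECORATORS = {("chat", "function"), ("ext", "panel")}


def _decorator_path(node: ast.expr) -> Optional[Tuple[str, str]]:
    """Return (object, attribute) for @obj.attr or @obj.attr(...), else None."""
    target = node.func if isinstance(node, ast.Call) else node
    if isinstance(target, ast.Attribute) and isinstance(target.value, ast.Name):
        return (target.value.id, target.attr)
    return None


def find_skeleton_access(source: str) -> List[Tuple[str, int]]:
    """List (function name, line) for ctx.skeleton use in forbidden handlers."""
    findings = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not any(_decorator_path(d) in FORBIDDEN_DECORATORS
                   for d in fn.decorator_list):
            continue
        for node in ast.walk(fn):
            if (isinstance(node, ast.Attribute) and node.attr == "skeleton"
                    and isinstance(node.value, ast.Name)
                    and node.value.id == "ctx"):
                findings.append((fn.name, node.lineno))
    return findings


SAMPLE = '''
@ext.panel("sidebar")
async def sidebar(ctx):
    return await ctx.skeleton.get("notes")

@ext.skeleton("notes")
async def skeleton_refresh_notes(ctx):
    previous = await ctx.skeleton.get("notes")
    return {"response": {}}
'''
findings = find_skeleton_access(SAMPLE)
```

The source is only parsed, never executed, so the scan works on extension code without importing the SDK. Only the panel handler is reported; the same call inside the @ext.skeleton handler is allowed.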
Validators and AST checks
The SDK validator enforces several rules on skeleton code:
| Rule | What it checks |
|---|---|
| V13 | refresh_* or alert_* tool names not registered via @ext.skeleton; suggests the decorator |
| V24 | ctx.skeleton.* access in @chat.function or @ext.panel bodies; always an error |
| SKEL-GUARD-1 | ctx.skeleton.get() called outside a @ext.skeleton context |
| SKEL-GUARD-2 | ctx.skeleton_data anywhere; removed in v1.6.0 |
| SKEL-GUARD-3 | ctx.skeleton.update() anywhere; removed in v1.6.0, kernel is sole writer |
| MANIFEST-SKELETON-1 | @ext.tool("skeleton_refresh_*"); should be @ext.skeleton |
What's next
Context channels
The umbrella view — skeleton, ctx.cache, and fact-ledger as three coordinated channels.
Fact-ledger
How verbatim cross-turn recall covers the rich-content case skeletons compress away.
Cache vs Store
The two extension-owned data surfaces — page snapshots and persistent state.
Panels
The UI surface — rendered by React, not the LLM.
Federal invariants
Every contract enforced at the boundary, including the skeleton ones.