Skeletons
Small live data feeds that keep Webbee aware of your extension's user state on every chat turn, so the agent routes precisely without prompting first.
The essence
A skeleton is a small async function decorated with @ext.skeleton that the web-kernel calls on a per-user TTL schedule. The web-kernel stores each snapshot and injects it into the intent classifier's prompt on every chat turn, so the LLM always knows the current counts, flags, and recent items in your extension's domain without making a tool call first. You write one function returning {"response": {...}} with flat scalar values; the kernel handles registration, scheduling, refresh, per-user isolation, and watchdog respawn. Aim for ≤2 KB per section. The kernel compresses everything it injects, so anything beyond the first six top-level fields, 60-character strings, and five inline list items is silently truncated or collapsed.
Skeletons are for ambient awareness, not for rich content. Email bodies, full task lists, and large nested objects belong in ctx.cache (panel-bound, 90s TTL) or arrive verbatim via the fact-ledger (last 5 turns of ActionResult.data). Putting a 30-message inbox into a skeleton wastes refresh cycles and never reaches the LLM anyway: the kernel collapses lists longer than five items to a list[N] shape hint before classifier injection.
Skeletons are NOT for building UI
A skeleton is a data probe consumed by the LLM: it has no rendering side, no React component, no panel slot. Never use @ext.skeleton to build a sidebar, dashboard, editor, or any visible surface. For UI, use @ext.panel instead. Inside a panel handler, fetch user state via ctx.cache (short-lived) or ctx.store (persistent), NOT via ctx.skeleton.get(): the platform restricts that call to @ext.skeleton handlers, and calling it from any other context raises SkeletonAccessForbidden.
What you write
Every skeleton is one Python async function. The decorator registers it as a synthetic tool named skeleton_refresh_{section_name}; the web-kernel schedules and calls it.
What this does: declares a notes skeleton section that the kernel refreshes every 300 seconds for each installed user, returning four scalar counts plus a short list of recent titles.
```python
from imperal_sdk import Extension

ext = Extension(
    "notes",
    display_name="Notes",
    description="Notes extension — create, organize, and search your notes with AI.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton(
    "notes",
    alert=False,
    ttl=300,
    description="Note statistics: total count, pinned, trash, recent titles.",
)
async def skeleton_refresh_notes(ctx) -> dict:
    """Refresh note statistics. Pure read — idempotent."""
    total = await ctx.store.count("notes")
    pinned = await ctx.store.count("notes", where={"is_pinned": True})
    trash = await ctx.store.count("notes", where={"is_trashed": True})
    recent = await ctx.store.query("notes", order_by="-updated_at", limit=5)
    return {"response": {
        "total_notes": total,
        "pinned_notes": pinned,
        "trash_count": trash,
        "recent_notes": [
            {"note_id": d.id, "title": d.data.get("title", "")}
            for d in recent.data
        ],
    }}
```

The decorator's signature is fixed:
```python
@ext.skeleton(
    section_name: str,        # required — positional, no spaces or special chars
    *,
    alert: bool = False,      # if True, paired skeleton_alert_{section_name} fires on change
    ttl: int = 300,           # hint to operators; authoritative TTL lives in the kernel Registry
    description: str = "",    # manifest tool description; defaults to "Skeleton refresh: {section_name}"
)
```

@ext.skeleton is a method on Extension, never on ChatExtension. The platform stores one snapshot per section, isolated per user and per extension; you never address the storage location yourself.
What the web-kernel does with your skeleton
This is the part most extension authors miss — and getting it wrong is why "I put 30 emails in my skeleton and the LLM doesn't see them" happens.
TTL refresh loop
When a user installs an extension, the platform runs one durable per-user background refresh loop per registered section. Every ttl seconds it:
- Calls skeleton_refresh_{section_name}(ctx) with a full Context object (ctx.user, ctx.store, ctx.http, ctx.cache, ctx.notify, ctx.ai are all available, same as in any tool).
- Unwraps the {"response": ...} envelope.
- Stores the inner dict as that section's current snapshot, recording the refresh time so the platform can track how stale each section is.
The platform discovers your sections automatically from the skeleton_refresh_{X} naming convention (no manual listing needed), restarts the refresh loop within seconds if it ever stops, and runs it indefinitely without accumulating unbounded history — all without operator intervention.
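The three steps of one refresh cycle can be sketched in plain Python. This is only an illustration under stated assumptions, not the platform's implementation: the `snapshots` dict, `refresh_once`, `refresh_loop`, and `fake_handler` are hypothetical names, and the real loop is durable and unbounded rather than an in-process `asyncio` task with a fixed cycle count.

```python
import asyncio
import time

# Hypothetical in-memory stand-in for the platform's per-user snapshot storage.
snapshots: dict = {}


async def refresh_once(user_id: str, section: str, handler, ctx) -> dict:
    """Run one refresh cycle: call the handler, unwrap, store with a timestamp."""
    result = await handler(ctx)
    inner = result["response"]  # unwrap the {"response": ...} envelope
    snapshots[(user_id, section)] = {"data": inner, "refreshed_at": time.time()}
    return inner


async def refresh_loop(user_id, section, handler, ctx, ttl: float, cycles: int) -> None:
    """Repeat refresh_once every `ttl` seconds (bounded so the sketch terminates)."""
    for _ in range(cycles):
        await refresh_once(user_id, section, handler, ctx)
        await asyncio.sleep(ttl)


async def fake_handler(ctx) -> dict:
    # Stands in for a real skeleton_refresh_* function.
    return {"response": {"total_notes": 3}}


asyncio.run(refresh_loop("user-1", "notes", fake_handler, ctx=None, ttl=0.01, cycles=2))
```

Keying the stored snapshot by user and section mirrors the per-user isolation described above; the recorded `refreshed_at` is what a staleness tag would later be computed from.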
Compression at classifier-render time
On every chat turn, the platform reads every stored skeleton section for the active user, compresses each one, and makes the compressed result available to the intent classifier. The compression rules are not configurable — they apply to every skeleton, every turn:
| Compression rule | Behaviour |
|---|---|
| Strings longer than 60 chars | Truncated to 60 chars + "..."; newlines flattened to spaces |
| Lists of dicts, ≤5 items | Expanded inline as [label (#id), label (#id), ...] — extracts title/name/label/subject + id/project_id/task_id keys |
| Lists of dicts, >5 items | Collapsed to list[N] shape hint — content is NOT visible to the classifier |
| Nested dicts (non-top-level) | Collapsed to dict[N keys] shape hint |
| Top-level section dict | First 6 non-underscore-prefixed fields surface as key=value, key=value, ...; remaining fields dropped with ... suffix |
| Staleness suffix | Each section gets a (cached ~Xs ago) tag computed from its last refresh time |
| Staleness header | The entire skeleton block is prefixed with a note framing the data as a cached snapshot — counts and flags are authoritative, specific numbers should be re-fetched when the user wants the current value |
Practical consequence: if your handler returns 30 emails with full bodies in a recent_emails array, the classifier prompt receives recent_emails: list[30]. The bodies never reach the LLM. Rich content goes in ctx.cache (for panels and chat functions to fetch on demand) or arrives via the fact-ledger (verbatim recent ActionResult.data).
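To make the rules in the table concrete, here is a minimal pure-Python sketch of how such a compressor might behave. It is not the kernel's code: `compress_value` and `compress_section` are hypothetical names, the label and id key lists are copied from the table (the real matching may be broader), and the handling of lists that are not lists of dicts and the `?` fallback are assumptions.

```python
LABEL_KEYS = ("title", "name", "label", "subject")
ID_KEYS = ("id", "project_id", "task_id")


def compress_value(value) -> str:
    """Compress one field value following the documented rules (sketch)."""
    if isinstance(value, str):
        flat = value.replace("\n", " ")  # newlines flattened to spaces
        return flat[:60] + "..." if len(flat) > 60 else flat
    if isinstance(value, dict):
        return f"dict[{len(value)} keys]"  # nested dicts become a shape hint
    if isinstance(value, list):
        # Assumption: anything that is not a short list of dicts is collapsed.
        if len(value) > 5 or not all(isinstance(v, dict) for v in value):
            return f"list[{len(value)}]"
        items = []
        for item in value:
            label = next((item[k] for k in LABEL_KEYS if k in item), "?")
            ident = next((item[k] for k in ID_KEYS if k in item), "?")
            items.append(f"{label} (#{ident})")
        return "[" + ", ".join(items) + "]"
    return str(value)


def compress_section(section: dict) -> str:
    """Render the first 6 non-underscore fields as key=value pairs."""
    fields = [(k, v) for k, v in section.items() if not k.startswith("_")]
    parts = [f"{k}={compress_value(v)}" for k, v in fields[:6]]
    if len(fields) > 6:
        parts.append("...")  # remaining fields are dropped
    return ", ".join(parts)


section = {
    "unread_count": 8,
    "recent_tasks": [{"task_id": "t1", "title": "Ship v2"}],
    "recent_emails": [{"id": i, "subject": f"Mail {i}"} for i in range(30)],
    "_internal": "hidden",
}
print(compress_section(section))
# unread_count=8, recent_tasks=[Ship v2 (#t1)], recent_emails=list[30]
```

The 30-item list survives only as a shape hint, which is the failure mode described in the paragraph above.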
Staleness envelope
Every skeleton section is always framed to the classifier as a cached per-user snapshot, tagged with how long ago it was last refreshed. The platform draws an explicit boundary for the agent:
- Availability / presence / enablement (booleans, "0 vs non-0" counts, whether a section exists at all): the skeleton is authoritative — the agent can answer directly.
- Specific numbers / metrics / timestamps / content: the skeleton is treated as potentially stale. When the user asks for the current value, the platform routes to your extension to fetch fresh data before the agent answers.
This is why "how many tasks do I have" works from the skeleton alone, but "what's my exact balance right now" routes through a tool call: the platform teaches the agent that boundary for you.
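The staleness tag itself is simple arithmetic on the last refresh time. The sketch below assumes the `(cached ~Xs ago)` format from the compression table; `staleness_tag` and `render_section` are hypothetical names, and the exact layout of a rendered section line is an assumption.

```python
import time
from typing import Optional


def staleness_tag(refreshed_at: float, now: Optional[float] = None) -> str:
    """Return the '(cached ~Xs ago)' suffix for a section (sketch)."""
    now = time.time() if now is None else now
    age = max(0, int(now - refreshed_at))  # clamp in case of clock skew
    return f"(cached ~{age}s ago)"


def render_section(name: str, body: str, refreshed_at: float,
                   now: Optional[float] = None) -> str:
    """Attach the staleness suffix to an already-compressed section body."""
    return f"{name}: {body} {staleness_tag(refreshed_at, now)}"


print(render_section("tasks", "overdue_count=2", refreshed_at=100.0, now=142.0))
# tasks: overdue_count=2 (cached ~42s ago)
```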
Return contract
The function must return {"response": <dict>} where the inner dict contains flat scalar values, short strings, and at most a five-item list of small dicts.
What this does: declares a status skeleton that surfaces three integer counts the classifier can use for routing decisions without fetching data.
```python
from imperal_sdk import Extension

ext = Extension(
    "my-app",
    display_name="My App",
    description="My App extension — manage resources with AI assistance.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton("status", ttl=300)
async def skeleton_refresh_status(ctx) -> dict:
    total = await ctx.store.count("items")
    pending = await ctx.store.count("items", where={"status": "pending"})
    active = await ctx.store.count("items", where={"status": "active"})
    return {"response": {
        "total_count": total,
        "pending_count": pending,
        "active_count": active,
    }}
```

The outer {"response": ...} wrapper is required: the platform unwraps it and stores the inner dict. Returning the inner dict directly is a federal error.
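A quick way to catch a missing envelope in your own unit tests is a small check like the one below. This is a sketch of the contract as stated here, not the platform's validation code; `unwrap_response` is a hypothetical helper, and raising `ValueError` is an assumption about how you might want to surface the problem locally.

```python
def unwrap_response(result: object) -> dict:
    """Return the inner dict of a {"response": {...}} envelope or raise."""
    if not isinstance(result, dict) or "response" not in result:
        raise ValueError('skeleton must return {"response": {...}}')
    inner = result["response"]
    if not isinstance(inner, dict):
        raise ValueError('the "response" value must be a dict')
    return inner


# A correctly wrapped result unwraps to the inner dict.
print(unwrap_response({"response": {"total_count": 3}}))

# Returning the inner dict directly is rejected.
try:
    unwrap_response({"total_count": 3})
except ValueError as exc:
    print("rejected:", exc)
```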
Size discipline
Three numbers to internalise:
| Constraint | Value | Why |
|---|---|---|
| Author target per section (norm) | ≤ 2 KB raw JSON | The classifier prompt has a finite budget; bigger sections compress hard and waste the kernel's render work |
| What the classifier actually sees | ≤ 6 fields × short scalars + ≤ 5 inline dict items + ≤ 60-char strings per section | The kernel-side compression rules above |
| Anti-pattern (silently compressed away) | Full message bodies, lists >5 items, strings >60 chars, deeply nested dicts | Use ctx.cache (panel snapshots) or rely on the fact-ledger (verbatim recent tool results) |
A common author mistake is to think "the skeleton is the LLM's working memory, so I should put rich content there." The mental model is wrong. The skeleton is the classifier's ambient awareness — enough to route precisely, not enough to narrate from. Rich content reaches the LLM through different channels.
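The limits in the table can be turned into a small lint you run against a handler's output during development. The sketch below is an assumption-laden illustration: `lint_section` is a hypothetical helper, "2 KB" is taken to mean 2048 bytes of UTF-8 JSON, and the thresholds are copied from the compression rules rather than from any SDK API.

```python
import json


def lint_section(section: dict) -> list:
    """Return human-readable warnings for content the classifier will not see."""
    warnings = []
    size = len(json.dumps(section).encode("utf-8"))
    if size > 2048:  # assumption: 2 KB == 2048 bytes of raw JSON
        warnings.append(f"section is {size} bytes; target is <= 2048")
    visible = [k for k in section if not k.startswith("_")]
    if len(visible) > 6:
        warnings.append(f"{len(visible)} fields; only the first 6 surface")
    for key, value in section.items():
        if isinstance(value, str) and len(value) > 60:
            warnings.append(f"{key}: string longer than 60 chars is truncated")
        elif isinstance(value, list) and len(value) > 5:
            warnings.append(f"{key}: {len(value)} items collapse to list[{len(value)}]")
        elif isinstance(value, dict):
            warnings.append(f"{key}: nested dict collapses to dict[{len(value)} keys]")
    return warnings


print(lint_section({"total": 1}))  # no warnings for a small flat section
print(lint_section({"recent_emails": [{"id": i} for i in range(30)]}))
```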
Common patterns
Counter skeleton
The simplest pattern — surface counts the LLM can use for routing decisions without fetching data.
What this does: registers a tasks skeleton that refreshes every 30 seconds and surfaces three overdue/today/upcoming counts to the classifier.
```python
from imperal_sdk import Extension

ext = Extension(
    "tasks",
    display_name="Tasks",
    description="Tasks extension — manage and track your tasks with AI assistance.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton(
    "tasks",
    alert=True,
    ttl=30,
    description="Today/overdue/upcoming task counts for instant AI context.",
)
async def skeleton_refresh_tasks(ctx) -> dict:
    overdue = await ctx.store.count("tasks", where={"overdue": True})
    today = await ctx.store.count("tasks", where={"due_today": True})
    upcoming = await ctx.store.count("tasks", where={"upcoming_7d": True})
    return {"response": {
        "overdue_count": overdue,
        "today_count": today,
        "upcoming_7d_count": upcoming,
    }}
```

Recent-items skeleton (≤5 items)
Surface a short list of the most-recently updated items so the LLM can reference them by name without a tool call. The kernel expands lists of ≤5 dicts inline by extracting title/name/label and id fields.
What this does: registers a notes skeleton that surfaces three counts plus the five most-recently updated notes by note_id + title.
```python
@ext.skeleton(
    "notes",
    ttl=300,
    description="Note statistics: total count, pinned, trash, recent titles.",
)
async def skeleton_refresh_notes(ctx) -> dict:
    total = await ctx.store.count("notes")
    pinned = await ctx.store.count("notes", where={"is_pinned": True})
    recent = await ctx.store.query("notes", order_by="-updated_at", limit=5)
    return {"response": {
        "total_notes": total,
        "pinned_notes": pinned,
        "recent_notes": [
            {"note_id": d.id, "title": d.data.get("title", "")}
            for d in recent.data
        ],
    }}
```

The classifier will see this rendered as roughly recent_notes=[Meeting prep (#abc), Q3 roadmap (#xyz), ...]. Keep limit at 5 or lower: a six-item list collapses entirely to list[6] and the LLM loses every title.
Status-gauge skeleton
Aggregate statuses into counts for quick triage questions ("are any monitors down?").
What this does: queries up to 50 monitors, fetches their latest snapshots in parallel, and returns four counters the classifier can use to answer "all OK?" without a tool call.
```python
import asyncio

from imperal_sdk import Extension

ext = Extension(
    "web-tools",
    display_name="Web Tools",
    description="Web Tools extension — monitor websites and APIs with AI assistance.",
    icon="icon.svg",
    actions_explicit=True,
)


@ext.skeleton(
    "web_tools",
    ttl=300,
    description="Monitor status counts: how many are critical, warning, ok.",
)
async def skeleton_refresh_web_tools(ctx) -> dict:
    try:
        page = await ctx.store.query(
            "wt_monitors",
            where={"owner_id": ctx.user.imperal_id},
            limit=50,
        )
        if not page.data:
            return {"response": {"total": 0, "critical": 0, "warning": 0, "ok": 0}}

        snap_ids = [m.data.get("last_snapshot_id") for m in page.data]

        async def _get_snap(snap_id: str | None):
            # Monitors without a snapshot yet are counted as "unknown" below.
            if snap_id:
                return await ctx.store.get("wt_snapshots", snap_id)
            return None

        # Fetch all latest snapshots concurrently.
        snaps = await asyncio.gather(*[_get_snap(sid) for sid in snap_ids])

        critical = warning = ok = 0
        for snap in snaps:
            status = snap.data.get("status", "unknown") if snap else "unknown"
            if status == "critical":
                critical += 1
            elif status == "warning":
                warning += 1
            elif status == "ok":
                ok += 1
        return {"response": {
            "total": len(page.data),
            "critical": critical,
            "warning": warning,
            "ok": ok,
        }}
    except Exception as exc:
        # Never let a refresh crash; surface a short error next to zeroed counters.
        return {"response": {
            "error": str(exc)[:120],
            "total": 0, "critical": 0, "warning": 0, "ok": 0,
        }}
```

Alert-on-change skeleton
Use alert=True plus a paired @ext.tool to surface deltas as ambient alerts to the classifier. The kernel diffs the previous snapshot against the new one after each refresh; if they differ, it calls skeleton_alert_{section_name}(ctx, old=<prev>, new=<next>).
What this does: registers a tasks skeleton with alert=True and a companion alert tool that emits a one-line message whenever new overdue tasks appear since the last refresh.
```python
@ext.skeleton("tasks", alert=True, ttl=30)
async def skeleton_refresh_tasks_alert(ctx) -> dict:
    overdue = await ctx.store.count("tasks", where={"overdue": True})
    today = await ctx.store.count("tasks", where={"due_today": True})
    return {"response": {"overdue_count": overdue, "today_count": today}}


@ext.tool(
    "skeleton_alert_tasks",
    description="Alert on new overdue tasks or today's task changes.",
)
async def skeleton_alert_tasks(
    ctx,
    old: dict | None = None,
    new: dict | None = None,
) -> dict:
    if not new:
        return {"response": ""}
    overdue = new.get("overdue_count", 0)
    old_overdue = (old or {}).get("overdue_count", 0)
    if overdue > 0 and overdue > old_overdue:
        delta = overdue - old_overdue
        return {"response": f"{delta} new overdue task(s) — {overdue} total overdue"}
    return {"response": ""}
```

A non-empty string returned by the alert tool is included in the next classifier prompt as an ambient alert.
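The kernel side of this handshake (diff the snapshots, call the alert tool only on change) can be sketched as follows. This is an illustration of the behaviour described above, not kernel code: `maybe_alert` and `demo_alert` are hypothetical names, and plain dict equality is an assumption about how the diff is computed.

```python
import asyncio


async def maybe_alert(alert_fn, ctx, old, new) -> str:
    """Call the alert tool only when the snapshot changed; return its message."""
    if old == new:  # assumption: the diff is a plain equality check
        return ""
    result = await alert_fn(ctx, old=old, new=new)
    return result.get("response", "")


async def demo_alert(ctx, old=None, new=None) -> dict:
    # Stands in for a real skeleton_alert_* tool.
    delta = new.get("overdue_count", 0) - (old or {}).get("overdue_count", 0)
    return {"response": f"{delta} new overdue task(s)" if delta > 0 else ""}


unchanged = asyncio.run(
    maybe_alert(demo_alert, None, {"overdue_count": 1}, {"overdue_count": 1})
)
changed = asyncio.run(
    maybe_alert(demo_alert, None, {"overdue_count": 1}, {"overdue_count": 3})
)
print(repr(unchanged))  # ''
print(repr(changed))    # '2 new overdue task(s)'
```

Returning an empty string when nothing noteworthy changed keeps the classifier prompt free of noise, which is why the alert tool above guards on both `overdue > 0` and `overdue > old_overdue`.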
Skeleton vs cache vs fact-ledger vs panel
All four come from your extension. They address different parts of the platform:
| | Skeleton | ctx.cache | Fact-ledger | Panel |
|---|---|---|---|---|
| Decorator | @ext.skeleton | (none — ctx.cache.set/get) | (none — kernel-populated) | @ext.panel |
| Who writes | Extension (via decorator) | Extension (in handlers) | Kernel (after every successful tool) | Extension (UINode tree) |
| Who reads | Intent classifier | Extension handlers + panels | Intent classifier | React panel host |
| When triggered | TTL schedule (every ttl seconds, per user) | On demand in handlers | After every successful @chat.function | Batch discovery + ui.Call |
| Lifetime | Until user uninstalls extension | TTL up to 300s | Last 5 turns | Until user navigates away |
| Max useful payload | ~2 KB (kernel compresses) | 64 KB per key | 3 KB aggregate per turn | Arbitrary UINode |
| Purpose | Ambient awareness for routing | Page snapshots, expensive recomputation | Verbatim cross-turn recall | Visual UI |
A mature extension uses all four. The skeleton surfaces "you have 8 unread emails." The chat function list_inbox fetches the 25-message page and caches it. The fact-ledger remembers the verbatim list for the next 5 turns. The panel renders the messages visibly. None of them overlap.
Federal guarantees
Skeleton workflows are managed by the web-kernel. Extension code never has to monitor, restart, or schedule them.
Auto-derive
Sections are discovered by the skeleton_refresh_{X} naming convention — no manual listing or migration step needed.
Watchdog respawn
The platform watches each skeleton's refresh loop. If it ever stops unexpectedly, the platform restarts it within seconds.
Runs indefinitely
Skeleton refresh loops run indefinitely without accumulating unbounded history — the platform manages long-running lifecycle for you.
Live invalidation
On user uninstall, skeleton state is purged immediately. Within two seconds the data is physically unreachable.
Per-user isolation
One skeleton workflow runs per user per extension. ctx.user.imperal_id ([glossary](/en/reference/glossary/)) is kernel-authoritative: your code cannot accidentally see another user's state.
Read-only outside handlers
ctx.skeleton.get() is restricted to @ext.skeleton handlers; calling from @chat.function or @ext.panel raises SkeletonAccessForbidden.
Access guard
ctx.skeleton.get(section) reads the currently stored snapshot for a section. It is available only inside @ext.skeleton handlers — calling it anywhere else raises SkeletonAccessForbidden at runtime.
Validator rule V24 (AST scan) flags any ctx.skeleton access in @chat.function or @ext.panel bodies as an error at imperal validate time. The rationale: skeletons are classifier input, not a generic data source. Use ctx.cache for short-lived snapshots shared between the skeleton and chat functions, or ctx.store for persistent per-user state.
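To show what an AST scan of this kind looks like, here is a minimal sketch built on Python's standard `ast` module. It is not the SDK validator: `find_skeleton_access` and `decorator_name` are hypothetical names, and matching decorators by the literal text `chat.function` and `ext.panel` is a simplifying assumption.

```python
import ast

FORBIDDEN_DECORATORS = {"chat.function", "ext.panel"}


def decorator_name(node: ast.expr) -> str:
    """Return the dotted decorator name, ignoring any call arguments."""
    target = node.func if isinstance(node, ast.Call) else node
    return ast.unparse(target)


def find_skeleton_access(source: str) -> list:
    """List (function name, line) pairs where ctx.skeleton is touched illegally."""
    hits = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not any(decorator_name(d) in FORBIDDEN_DECORATORS for d in fn.decorator_list):
            continue
        for node in ast.walk(fn):
            if (isinstance(node, ast.Attribute) and node.attr == "skeleton"
                    and isinstance(node.value, ast.Name) and node.value.id == "ctx"):
                hits.append((fn.name, node.lineno))
    return hits


# The sample is only parsed, never executed.
sample = '''
@chat.function
async def list_notes(ctx):
    return ctx.skeleton.get("notes")

@ext.skeleton("notes")
async def skeleton_refresh_notes(ctx):
    return {"response": ctx.skeleton.get("notes")}
'''
print(find_skeleton_access(sample))
# [('list_notes', 4)]
```

Only the @chat.function body is flagged; the same access inside the @ext.skeleton handler is allowed, matching the guard described above.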
Validators and AST checks
The SDK validator enforces several rules on skeleton code:
| Rule | What it checks |
|---|---|
| V13 | refresh_* or alert_* tool names not registered via @ext.skeleton; suggests the decorator |
| V24 | ctx.skeleton.* access in @chat.function or @ext.panel bodies; always an error |
| SKEL-GUARD-1 | ctx.skeleton.get() called outside a @ext.skeleton context |
| SKEL-GUARD-2 | ctx.skeleton_data anywhere; removed in v1.6.0 |
| SKEL-GUARD-3 | ctx.skeleton.update() anywhere; removed in v1.6.0, kernel is sole writer |
| MANIFEST-SKELETON-1 | @ext.tool("skeleton_refresh_*"); should be @ext.skeleton |
What's next
Context channels
The umbrella view — skeleton, ctx.cache, and fact-ledger as three coordinated channels.
Fact-ledger
How verbatim cross-turn recall covers the rich-content case skeletons compress away.
Cache vs Store
The two extension-owned data surfaces — page snapshots and persistent state.
Panels
The UI surface — rendered by React, not the LLM.
Federal invariants
Every contract enforced at the boundary, including the skeleton ones.