Testing extensions

Unit and integration testing patterns for Imperal SDK extensions — MockContext, ActionResult assertions, skeleton-gated handlers, scheduled jobs, panels

Every @chat.function, @ext.panel, and @ext.skeleton handler runs in production against real user data, the audit ledger, and the confirmation gate. Bugs that slip through surface immediately, and the web-kernel fails closed. A small suite of unit tests catches most of them locally, without a running web-kernel.

This guide covers:

Topic	Section
Install test dependencies	Install
File layout convention	Layout
Minimum viable test	First test
MockContext in depth	MockContext
Mocking ctx.http	HTTP
Mocking ctx.store	Store
Mocking ctx.ai, ctx.notify, ctx.billing	Other mocks
Asserting ActionResult shape	ActionResult
Typed Pydantic params	Pydantic params
Skeleton handlers	Skeletons
Scheduled jobs	Schedules
Webhook handlers	Webhooks
Event handlers	Events
Panels (UI assertions)	Panels
Integration tests	Integration
CI patterns	CI
Common pitfalls	Pitfalls

Why test extensions

Extension handlers run synchronously inside an ICNLI Worker activity. If a handler raises an unhandled exception the web-kernel logs it and the user sees a generic failure message — but the failed state is final in the audit ledger. There is no automatic retry for business-logic bugs. The only practical way to catch these issues before they affect users is with a local test suite.

Specifically, unit tests catch:

ActionResult.error(...) returned for a path you expected to succeed
Wrong refresh_panels — a sidebar that never refreshes after a write
ctx.skeleton.get() called from a non-skeleton tool (raises SkeletonAccessForbidden)
Pydantic validation errors on your own params model — caught before the retry loop ever fires
Off-by-one errors in store queries with where= filters
Notification payloads that fail downstream consumers

None of these require a running web-kernel. The SDK ships a full testing module for exactly this purpose.

Install test dependencies

# imperal-sdk[dev] pulls pytest, pytest-asyncio, respx, and jsonschema
pip install "imperal-sdk[dev]"
# or, if you pin a local SDK checkout:
pip install -e "/path/to/imperal-sdk[dev]"

Confirm the install:

pytest --version   # 7.x or 8.x

Add a pyproject.toml section to enable async tests project-wide:

[tool.pytest.ini_options]
asyncio_mode = "auto"

File layout

Production extensions separate concerns into per-concern modules. Tests mirror that structure:

my-extension/
  app.py                 # Extension + ChatExtension declarations
  handlers_crud.py       # @chat.function handlers
  handlers_schedule.py   # @ext.schedule handlers
  panels.py              # @ext.panel handlers
  skeleton.py            # @ext.skeleton handlers
  tests/
    test_handlers_crud.py
    test_handlers_schedule.py
    test_panels.py
    test_skeleton.py
    conftest.py          # shared fixtures

Keep tests co-located with the extension, in the tests/ directory inside the extension folder. Do not move them to a repository-level tests/ directory outside the extension: the extension is the unit of deployment and its tests travel with it.

The minimum viable test

A single test that imports a handler, constructs a MockContext, and checks the returned ActionResult is enough to validate the happy path:

tests/test_handlers_crud.py

import pytest
from pydantic import BaseModel
from imperal_sdk import Extension, ChatExtension, ActionResult
from imperal_sdk.testing import MockContext

ext = Extension(
    "my-ext",
    display_name="My Extension",
    description="A minimal extension for illustrating testing patterns.",
    actions_explicit=True,
)
chat = ChatExtension(ext)


class CreateItemParams(BaseModel):
    title: str
    description: str = ""


@chat.function(
    "create_item",
    description="Create a new item with the given title and optional description",
    action_type="write",
    effects=["create:item"],
)
async def create_item(ctx, params: CreateItemParams) -> ActionResult:
    doc = await ctx.store.create("items", {"title": params.title, "description": params.description})
    return ActionResult.success(
        data={"item_id": doc.id, "title": params.title},
        summary=f"Created item '{params.title}'",
        refresh_panels=["sidebar"],
    )


@pytest.mark.asyncio
async def test_create_item_happy_path():
    ctx = MockContext(user_id="user-1", role="user")
    params = CreateItemParams(title="My first item")
    result = await create_item(ctx, params)

    assert result.status == "success"
    assert result.data["title"] == "My first item"
    assert "item_id" in result.data
    assert result.refresh_panels == ["sidebar"]
    # Inspect the in-memory store directly
    assert "items" in ctx.store._data

ctx.store._data is the raw in-memory dict — you can inspect it in tests without any HTTP call.

MockContext reference

MockContext is a factory function that returns a Context whose sub-clients are replaced by in-memory equivalents (ctx.cache is the one exception, covered at the end of this section):

conftest.py — common fixture patterns

from imperal_sdk.testing import MockContext

# Default: regular user, all scopes
ctx = MockContext()

# Admin user in a specific tenant
ctx = MockContext(user_id="admin-42", role="admin", tenant_id="acme")

# Pass admin config values
ctx = MockContext(config={"api_key": "test-key", "limits.max_items": 100})

# Skeleton-gated tool (see Testing skeleton handlers below)
ctx = MockContext(tool_type="skeleton")

All keyword arguments:

Argument	Type	Default	Purpose
`user_id`	`str`	`"test_user"`	Sets `ctx.user.imperal_id`
`email`	`str`	`"test@test.com"`	Sets `ctx.user.email`
`role`	`str`	`"user"`	`"user"` or `"admin"`
`scopes`	`list[str]`	`["*"]`	All scopes allowed by default
`tenant_id`	`str`	`"default"`	Sets `ctx.user.tenant_id`
`extension_id`	`str`	`"test-ext"`	Sets `ctx._extension_id`
`config`	`dict`	`{}`	Dot-path config (e.g. `"db.host"`)
`tool_type`	`str`	`"tool"`	`"tool"` / `"skeleton"` / `"panel"` / `"chat_fn"`

Attributes on the returned Context:

Attribute	Mock class	Key API
`ctx.store`	`MockStore`	`._data` dict for inspection
`ctx.http`	`MockHTTP`	`.mock_get(pattern, resp)` / `.mock_post(...)`
`ctx.ai`	`MockAI`	`.set_response(pattern, text)`
`ctx.notify`	`MockNotify`	`.sent` list for inspection
`ctx.billing`	`MockBilling`	`.balance`, `.plan`
`ctx.skeleton`	wrapped `MockSkeleton`	`._sections` dict (access via `ctx.skeleton._client._sections`)
`ctx.storage`	`MockStorage`	`._files` dict
`ctx.config`	`MockConfig`	dot-path `get(key, default)`
`ctx.extensions`	`MockExtensions`	`.register(app, method, fn)` / `._emitted`

ctx.cache is not available in a MockContext — the CacheClient requires an Extension reference and a live gateway URL. Tests for cache-backed handlers should stub the cache layer at the ctx.store level (cache-miss → store fetch) or restructure the handler to accept an injectable fetcher function.
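
One way to structure the injectable-fetcher option is sketched below. It uses only the standard library; get_profile_summary, fake_fetch, and the profile fields are hypothetical names chosen for illustration, not SDK APIs. The idea is that production code passes a fetcher that wraps the cache, while the test passes a plain coroutine.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical type alias: any coroutine function mapping a user ID to a profile dict.
Fetcher = Callable[[str], Awaitable[dict]]


async def get_profile_summary(user_id: str, fetch_profile: Fetcher) -> str:
    """Business logic that depends on an injected fetcher instead of ctx.cache."""
    profile = await fetch_profile(user_id)
    return f"{profile['name']} ({profile['plan']})"


async def fake_fetch(user_id: str) -> dict:
    # Test double standing in for the cache-backed lookup.
    return {"name": "Alice", "plan": "pro"}


summary = asyncio.run(get_profile_summary("user-1", fake_fetch))
print(summary)
```

Because the logic never touches ctx.cache directly, the same function can be exercised with a MockContext or with no context at all.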

Mocking ctx.http

MockHTTP matches registered patterns by substring. Register before calling the handler:

tests/test_weather.py

import pytest
from imperal_sdk import ActionResult
from imperal_sdk.testing import MockContext


async def fetch_weather(ctx, city: str) -> ActionResult:
    resp = await ctx.http.get(f"https://api.weather.example/v1/current?city={city}")
    if not resp.ok:
        return ActionResult.error("Weather service unavailable", retryable=True)
    data = resp.json()
    return ActionResult.success(
        data={"temp": data["temperature"]},
        summary=f"Current temperature in {city}: {data['temperature']}°C",
    )


@pytest.mark.asyncio
async def test_fetch_weather_success():
    ctx = MockContext()
    ctx.http.mock_get("api.weather.example", {"temperature": 22}, status=200)

    result = await fetch_weather(ctx, "Paris")

    assert result.status == "success"
    assert result.data["temp"] == 22


@pytest.mark.asyncio
async def test_fetch_weather_service_unavailable():
    ctx = MockContext()
    ctx.http.mock_get("api.weather.example", {"error": "down"}, status=503)

    result = await fetch_weather(ctx, "Paris")

    assert result.status == "error"
    assert result.retryable is True

For POST requests use ctx.http.mock_post(pattern, response, status=200). If no matching mock is found, MockHTTP returns status=404 with {"error": "No mock registered"}.
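
Substring matching means a short pattern can match more URLs than intended. The sketch below is a simplified stand-in written for illustration, not MockHTTP's implementation: SubstringRouter is a hypothetical class, and it assumes the first registered pattern wins, which may differ from the SDK's actual ordering.

```python
class SubstringRouter:
    """Simplified stand-in for substring-based mock matching (not the SDK's code)."""

    def __init__(self) -> None:
        self._routes: list[tuple[str, dict, int]] = []

    def mock_get(self, pattern: str, body: dict, status: int = 200) -> None:
        self._routes.append((pattern, body, status))

    def get(self, url: str) -> tuple[int, dict]:
        # Assumption: the first registered pattern contained in the URL wins.
        for pattern, body, status in self._routes:
            if pattern in url:
                return status, body
        # Mirrors the documented fallback for unmatched URLs.
        return 404, {"error": "No mock registered"}


router = SubstringRouter()
router.mock_get("api.weather.example", {"temperature": 22})
router.mock_get("api.weather.example/v1/forecast", {"days": 5})

current = router.get("https://api.weather.example/v1/current?city=Paris")
forecast = router.get("https://api.weather.example/v1/forecast?city=Paris")
missing = router.get("https://other.example/ping")
```

Under that assumption the broad host pattern shadows the forecast mock, so registering the most specific pattern available keeps tests unambiguous.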

Mocking ctx.store

MockStore is an in-memory dict-of-dicts. Pre-seed it before calling the handler:

tests/test_items.py

import pytest
from imperal_sdk import ActionResult
from imperal_sdk.testing import MockContext


async def get_item(ctx, item_id: str) -> ActionResult:
    doc = await ctx.store.get("items", item_id)
    if doc is None:
        return ActionResult.error(f"Item '{item_id}' not found")
    return ActionResult.success(
        data={"item_id": doc.id, "title": doc.data["title"]},
        summary=f"Found item '{doc.data['title']}'",
    )


@pytest.mark.asyncio
async def test_get_item_found():
    ctx = MockContext()
    # Pre-seed the store
    ctx.store._data["items"] = {
        "item-abc": {"title": "Pre-seeded item"},
    }

    result = await get_item(ctx, "item-abc")

    assert result.status == "success"
    assert result.data["title"] == "Pre-seeded item"


@pytest.mark.asyncio
async def test_get_item_not_found():
    ctx = MockContext()

    result = await get_item(ctx, "nonexistent")

    assert result.status == "error"
    assert "not found" in result.error

MockStore.query() supports flat equality matching via the where= dict:

tests/test_query.py

import pytest
from imperal_sdk.testing import MockContext


@pytest.mark.asyncio
async def test_query_by_status():
    ctx = MockContext()
    ctx.store._data["tasks"] = {
        "t1": {"title": "Buy milk", "status": "open"},
        "t2": {"title": "Call dentist", "status": "done"},
        "t3": {"title": "Fix bug", "status": "open"},
    }

    page = await ctx.store.query("tasks", where={"status": "open"})

    assert len(page.data) == 2
    titles = {doc.data["title"] for doc in page.data}
    assert titles == {"Buy milk", "Fix bug"}
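
Flat equality can be read as: every key in where= is compared with == against a top-level field of the document. The sketch below illustrates that reading with plain dicts. matches_where is a hypothetical helper, not MockStore's code, and its handling of dotted keys is an assumption of the sketch.

```python
def matches_where(doc: dict, where: dict) -> bool:
    """Return True when every where= key equals the document's top-level field."""
    return all(key in doc and doc[key] == value for key, value in where.items())


tasks = {
    "t1": {"title": "Buy milk", "status": "open", "meta": {"priority": 1}},
    "t2": {"title": "Call dentist", "status": "done", "meta": {"priority": 2}},
}

open_ids = [doc_id for doc_id, doc in tasks.items() if matches_where(doc, {"status": "open"})]

# In this sketch a dotted key is a literal field name, so it matches nothing.
nested_ids = [doc_id for doc_id, doc in tasks.items() if matches_where(doc, {"meta.priority": 1})]
```

If a handler depends on nested paths or comparison operators, it is worth confirming that behavior separately rather than inferring it from a flat-equality mock.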

Mocking ctx.ai, ctx.notify, ctx.billing

MockAI — returns the default response unless a pattern is registered:

tests/test_ai.py

import pytest
from imperal_sdk.testing import MockContext


@pytest.mark.asyncio
async def test_ai_response_routing():
    ctx = MockContext()
    ctx.ai.set_response("summarize", "This is a concise summary.")
    ctx.ai.set_response("translate", "Ceci est une traduction.")

    result1 = await ctx.ai.complete("Please summarize the following text: ...")
    result2 = await ctx.ai.complete("Please translate this to French: ...")
    result3 = await ctx.ai.complete("What is the weather today?")

    assert "concise summary" in result1.text
    assert "traduction" in result2.text
    assert result3.text == "Mock AI response"  # default fallback

MockNotify — records calls in .sent. Both invocation styles work:

tests/test_notify.py

import pytest
from imperal_sdk.testing import MockContext


@pytest.mark.asyncio
async def test_notification_sent():
    ctx = MockContext()
    await ctx.notify("Task completed!", priority="high")
    await ctx.notify.send("Another message", channel="email")

    assert len(ctx.notify.sent) == 2
    assert ctx.notify.sent[0]["message"] == "Task completed!"
    assert ctx.notify.sent[0]["priority"] == "high"
    assert ctx.notify.sent[1]["channel"] == "email"

MockBilling — allows all requests by default. check_limits() always returns allowed=True regardless of balance. To test a real quota-gated code path, subclass MockBilling and override check_limits():

tests/test_billing.py

import pytest
from imperal_sdk.testing import MockContext, MockBilling
from imperal_sdk.types.models import LimitsResult


class ExhaustedBilling(MockBilling):
    async def check_limits(self) -> LimitsResult:
        return LimitsResult(allowed=False, balance=0, plan=self.plan)


@pytest.mark.asyncio
async def test_handler_respects_quota():
    ctx = MockContext()
    ctx.billing = ExhaustedBilling(balance=0)

    limits = await ctx.billing.check_limits()

    assert limits.allowed is False

Asserting ActionResult shape

The ActionResult dataclass has seven testable fields:

Field	Type	What to assert
`status`	`"success"` or `"error"`	Always assert this first
`data`	`dict` or Pydantic model	Check keys and values
`summary`	`str`	User-facing — non-empty for write actions
`error`	`str | None`	Non-None only on `status="error"`
`retryable`	`bool`	`True` only for transient errors
`refresh_panels`	`list[str] | None`	`None` = all panels; `[]` = none; `["sidebar"]` = targeted
`ui`	`UINode` or `None`	Assert `.type` if you return inline UI

tests/test_delete.py

import pytest
from imperal_sdk import ActionResult
from imperal_sdk.testing import MockContext


async def delete_item(ctx, item_id: str) -> ActionResult:
    doc = await ctx.store.get("items", item_id)
    if doc is None:
        return ActionResult.error(f"Item '{item_id}' not found")
    await ctx.store.delete("items", item_id)
    return ActionResult.success(
        data={"deleted_id": item_id},
        summary=f"Deleted item {item_id}",
        refresh_panels=["sidebar"],
    )


@pytest.mark.asyncio
async def test_delete_existing_item():
    ctx = MockContext()
    ctx.store._data["items"] = {"item-1": {"title": "To delete"}}

    result = await delete_item(ctx, "item-1")

    assert result.status == "success"
    assert result.data == {"deleted_id": "item-1"}
    assert result.summary        # non-empty
    assert result.refresh_panels == ["sidebar"]
    assert result.error is None
    assert result.retryable is False
    # Verify the document is gone from the store after delete
    assert "item-1" not in ctx.store._data.get("items", {})


@pytest.mark.asyncio
async def test_delete_missing_item_returns_error():
    ctx = MockContext()

    result = await delete_item(ctx, "nonexistent")

    assert result.status == "error"
    assert result.error is not None
    assert result.retryable is False

to_dict() gives you the wire-serialized form for asserting the exact shape the web-kernel receives:

tests/test_action_result_wire.py

from imperal_sdk import ActionResult


def test_success_wire_shape():
    result = ActionResult.success(data={"id": "x"}, summary="Done")
    d = result.to_dict()

    assert d["status"] == "success"
    assert d["data"] == {"id": "x"}
    assert "error" not in d      # not emitted on success
    assert "retryable" not in d  # not emitted when False


def test_error_wire_shape():
    result = ActionResult.error("Connection failed", retryable=True)
    d = result.to_dict()

    assert d["status"] == "error"
    assert d["error"] == "Connection failed"
    assert d["retryable"] is True

Testing typed Pydantic params

SDK v4.1.0 added a Pydantic feedback loop: on PydanticValidationError, the web-kernel retries up to two times with structured feedback to the LLM. Your unit tests should verify that your model accepts valid input and rejects invalid input as expected — the feedback loop does not fire in local tests.

tests/test_params.py

import pytest
from pydantic import BaseModel, ValidationError

from imperal_sdk import ActionResult
from imperal_sdk.testing import MockContext


class CreateNoteParams(BaseModel):
    title: str
    content: str = ""
    tags: list[str] = []
    pinned: bool = False


async def create_note(ctx, params: CreateNoteParams) -> ActionResult:
    doc = await ctx.store.create("notes", params.model_dump())
    return ActionResult.success(
        data={"note_id": doc.id},
        summary=f"Created note '{params.title}'",
        refresh_panels=["sidebar"],
    )


@pytest.mark.asyncio
async def test_create_note_minimal_params():
    ctx = MockContext()
    result = await create_note(ctx, CreateNoteParams(title="Quick note"))
    assert result.status == "success"


@pytest.mark.asyncio
async def test_create_note_with_all_params():
    ctx = MockContext()
    params = CreateNoteParams(
        title="Full note",
        content="Some content here",
        tags=["work", "urgent"],
        pinned=True,
    )
    result = await create_note(ctx, params)
    assert result.status == "success"


def test_create_note_params_validation_missing_title():
    """Pydantic raises ValidationError when required fields are absent."""
    with pytest.raises(ValidationError, match="title"):
        CreateNoteParams()  # title is required


def test_create_note_params_defaults():
    params = CreateNoteParams(title="Defaults test")
    assert params.content == ""
    assert params.tags == []
    assert params.pinned is False

Pydantic params models must be defined at module scope, not inside functions. The SDK auto-detects them via func.__globals__ — a function-local model class is invisible and causes a validation gap at dispatch time.
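
The underlying Python behavior can be reproduced with the standard library. The sketch below assumes annotations are resolved from strings through the function's module globals, as typing.get_type_hints does by default; resolve_params_type is a hypothetical helper, not the SDK's detection code, and plain classes stand in for Pydantic models.

```python
import typing


class ModuleParams:
    """Defined at module scope, so it lives in the handler's __globals__."""


async def good_handler(ctx, params: "ModuleParams") -> None:
    ...


def make_bad_handler():
    class LocalParams:
        """Defined inside a function, so it never reaches module globals."""

    async def bad_handler(ctx, params: "LocalParams") -> None:
        ...

    return bad_handler


def resolve_params_type(func):
    """Resolve the 'params' annotation using the function's module globals."""
    try:
        # get_type_hints evaluates string annotations against func.__globals__.
        hints = typing.get_type_hints(func)
    except NameError:
        return None  # the name is not visible at module scope
    return hints.get("params")


good_type = resolve_params_type(good_handler)
bad_type = resolve_params_type(make_bad_handler())
```

The module-level class resolves, while the function-local one cannot be found, which is the gap the note above warns about.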

Testing skeleton handlers

@ext.skeleton handlers refresh the LLM-facts snapshot. They must be called with tool_type="skeleton" — otherwise ctx.skeleton.get() raises SkeletonAccessForbidden.

tests/test_skeleton.py

import pytest
from imperal_sdk import Extension
from imperal_sdk.testing import MockContext
from imperal_sdk.errors import SkeletonAccessForbidden


ext = Extension(
    "task-tracker",
    display_name="Task Tracker",
    description="Tracks tasks and surfaces summary counts for the LLM classifier.",
    actions_explicit=True,
)


@ext.skeleton("tasks_summary", alert=True, ttl=300)
async def tasks_summary_skeleton(ctx) -> dict:
    """Return open/done counts for the LLM classifier."""
    open_count = await ctx.store.count("tasks", where={"status": "open"})
    done_count = await ctx.store.count("tasks", where={"status": "done"})
    return {
        "response": {
            "open_tasks": open_count,
            "done_tasks": done_count,
            "total": open_count + done_count,
        }
    }


@pytest.mark.asyncio
async def test_skeleton_returns_correct_counts():
    # Must pass tool_type="skeleton" so ctx.skeleton.get() is permitted
    ctx = MockContext(tool_type="skeleton")
    ctx.store._data["tasks"] = {
        "t1": {"status": "open"},
        "t2": {"status": "open"},
        "t3": {"status": "done"},
    }

    result = await tasks_summary_skeleton(ctx)

    assert result["response"]["open_tasks"] == 2
    assert result["response"]["done_tasks"] == 1
    assert result["response"]["total"] == 3


@pytest.mark.asyncio
async def test_skeleton_access_forbidden_outside_skeleton_context():
    """ctx.skeleton.get() raises SkeletonAccessForbidden in non-skeleton contexts."""
    ctx = MockContext(tool_type="tool")  # wrong tool_type

    async def bad_handler(ctx) -> None:
        await ctx.skeleton.get("tasks_summary")  # raises

    with pytest.raises(SkeletonAccessForbidden):
        await bad_handler(ctx)

The return contract for skeleton handlers is {"response": {...}} with scalar values at the top level of the inner dict. The web-kernel reads these scalars as LLM classifier hints.
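
A small shape check can make this contract explicit in tests. The helper below is a hypothetical sketch, not part of the SDK; treating str, int, float, and bool as the allowed scalar types is an assumption.

```python
# Assumption: these are the value types that count as scalars.
SCALAR_TYPES = (str, int, float, bool)


def skeleton_contract_problems(result: object) -> list[str]:
    """Return a list of contract violations; an empty list means the shape is valid."""
    if not isinstance(result, dict) or not isinstance(result.get("response"), dict):
        return ["result must be a dict with a dict under 'response'"]
    return [
        f"'{key}' is {type(value).__name__}, expected a scalar"
        for key, value in result["response"].items()
        if not isinstance(value, SCALAR_TYPES)
    ]


good = {"response": {"open_tasks": 2, "done_tasks": 1, "total": 3}}
bad = {"response": {"open_tasks": 2, "by_status": {"open": 2}}}

good_problems = skeleton_contract_problems(good)
bad_problems = skeleton_contract_problems(bad)
```

In a skeleton test this could be used as `assert skeleton_contract_problems(result) == []` right after calling the handler.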

Testing scheduled jobs

@ext.schedule handlers receive a system context: ctx.user.imperal_id == "__system__". They use ctx.store.list_users() and ctx.as_user(uid) to fan out per-user work. In tests, skip the list_users() async iterator and call the per-user helper directly with a user-scoped MockContext:

tests/test_schedule.py

import pytest
from imperal_sdk import ActionResult
from imperal_sdk.testing import MockContext


async def _check_and_notify(ctx, threshold: int) -> None:
    """Per-user check — called from schedule fan-out."""
    count = await ctx.store.count("alerts", where={"seen": False})
    if count >= threshold:
        await ctx.notify(f"You have {count} unseen alerts", priority="high")


@pytest.mark.asyncio
async def test_check_notifies_when_threshold_exceeded():
    ctx = MockContext(user_id="user-42")
    ctx.store._data["alerts"] = {
        "a1": {"seen": False},
        "a2": {"seen": False},
        "a3": {"seen": False},
    }

    await _check_and_notify(ctx, threshold=2)

    assert len(ctx.notify.sent) == 1
    assert "3 unseen alerts" in ctx.notify.sent[0]["message"]


@pytest.mark.asyncio
async def test_check_does_not_notify_below_threshold():
    ctx = MockContext(user_id="user-99")
    ctx.store._data["alerts"] = {
        "a1": {"seen": False},
    }

    await _check_and_notify(ctx, threshold=5)

    assert len(ctx.notify.sent) == 0

Test the per-user helper function (_check_and_notify) directly rather than the schedule decorator wrapper. The decorator wrapper is web-kernel infrastructure — unit tests focus on the business logic inside it.

For tests that exercise ctx.as_user() itself, build a system context explicitly:

tests/test_as_user.py

import pytest
from imperal_sdk.context import Context
from imperal_sdk.types.identity import UserContext
from imperal_sdk.testing import MockStore


def make_system_ctx() -> Context:
    """Build a minimal system-context for as_user() tests."""
    user = UserContext(
        imperal_id="__system__",
        email="",
        tenant_id="default",
        role="system",
        scopes=["*"],
    )
    return Context(user=user, store=MockStore())


def test_as_user_changes_user_id():
    sys_ctx = make_system_ctx()
    user_ctx = sys_ctx.as_user("user-42")
    assert user_ctx.user.imperal_id == "user-42"


def test_as_user_raises_if_not_system():
    from imperal_sdk.testing import MockContext
    ctx = MockContext(user_id="real-user")
    with pytest.raises(RuntimeError, match="system context"):
        ctx.as_user("target-user")

Testing webhook handlers

@ext.webhook handlers receive ctx, headers, body, query_params. HMAC verification is the handler's responsibility — the SDK provides no built-in helper. Test both the success path and the signature rejection path:

tests/test_webhook.py

import hashlib
import hmac
import json
import pytest
from imperal_sdk import Extension, ActionResult
from imperal_sdk.testing import MockContext

ext = Extension(
    "payment-ext",
    display_name="Payments",
    description="Processes payment webhooks from the payment gateway.",
    actions_explicit=True,
)

WEBHOOK_SECRET = "test-webhook-secret"


def _verify_signature(body: str, signature: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(),
        body.encode(),
        hashlib.sha256,
    ).hexdigest()
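    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input,
    # which an attacker-controlled header could otherwise trigger.
    return hmac.compare_digest(expected.encode(), signature.encode())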


@ext.webhook("payment", method="POST", secret_header="X-Hub-Signature-256")
async def on_payment(ctx, headers: dict, body: str, query_params: dict) -> ActionResult:
    sig = headers.get("X-Hub-Signature-256", "")
    if not _verify_signature(body, sig, WEBHOOK_SECRET):
        return ActionResult.error("Invalid webhook signature")

    payload = json.loads(body)
    await ctx.store.create("payments", {"amount": payload["amount"], "status": "received"})
    return ActionResult.success(
        data={"payment_id": payload.get("id")},
        summary="Payment recorded",
    )


def _make_signed_payload(payload: dict, secret: str) -> tuple[str, str]:
    body = json.dumps(payload)
    sig = "sha256=" + hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
    return body, sig


@pytest.mark.asyncio
async def test_payment_webhook_valid_signature():
    ctx = MockContext()
    payload = {"id": "pay-001", "amount": 4999}
    body, sig = _make_signed_payload(payload, WEBHOOK_SECRET)

    result = await on_payment(ctx, {"X-Hub-Signature-256": sig}, body, {})

    assert result.status == "success"
    assert result.data["payment_id"] == "pay-001"
    assert "payments" in ctx.store._data


@pytest.mark.asyncio
async def test_payment_webhook_invalid_signature_rejected():
    ctx = MockContext()
    body = json.dumps({"id": "pay-002", "amount": 100})

    result = await on_payment(ctx, {"X-Hub-Signature-256": "sha256=bad"}, body, {})

    assert result.status == "error"
    assert "signature" in result.error
    assert "payments" not in ctx.store._data
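
A third case worth covering is a valid signature replayed over a modified body. The standalone sketch below uses only the standard library; sign and verify are illustrative helpers that mirror _verify_signature above, and the payload values are made up for the example.

```python
import hashlib
import hmac
import json


def sign(body: str, secret: str) -> str:
    """Produce a 'sha256=<hex>' signature over the exact body bytes."""
    digest = hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify(body: str, signature: str, secret: str) -> bool:
    # Compare as bytes in constant time; bytes also tolerate non-ASCII input.
    return hmac.compare_digest(sign(body, secret).encode(), signature.encode())


secret = "test-webhook-secret"
body = json.dumps({"id": "pay-003", "amount": 100})
signature = sign(body, secret)

# Same payment ID with a changed amount, replayed under the original signature.
tampered = json.dumps({"id": "pay-003", "amount": 999999})

valid_ok = verify(body, signature, secret)
tampered_ok = verify(tampered, signature, secret)
wrong_secret_ok = verify(body, signature, "another-secret")
```

The signature covers the exact body string, so any change to the payload or the secret has to invalidate it.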

Testing event handlers

@ext.on_event handlers receive a minimal context — ctx.user and ctx.store are populated but ctx.http, ctx.ai, and other I/O surfaces are not available. Keep event handlers thin: record the event in ctx.store, then process it in a @ext.schedule or a subsequent @chat.function.

tests/test_events.py

import pytest
from imperal_sdk import Extension
from imperal_sdk.testing import MockContext

ext = Extension(
    "inbox-ext",
    display_name="Inbox",
    description="Handles email.received events and queues messages for processing.",
    actions_explicit=True,
)


@ext.on_event("email.received")
async def on_email_received(ctx, event) -> None:
    """Record the incoming email event for deferred processing."""
    await ctx.store.create("inbox_queue", {
        "from": event.data.get("from"),
        "subject": event.data.get("subject"),
        "processed": False,
    })


@pytest.mark.asyncio
async def test_email_received_queues_message():
    ctx = MockContext()

    class FakeEvent:
        data = {"from": "alice@example.com", "subject": "Hello"}

    await on_email_received(ctx, FakeEvent())

    assert "inbox_queue" in ctx.store._data
    queued = list(ctx.store._data["inbox_queue"].values())
    assert len(queued) == 1
    assert queued[0]["from"] == "alice@example.com"
    assert queued[0]["processed"] is False

Do not call ctx.http, ctx.ai, or ctx.billing from inside @ext.on_event handlers. These clients are not populated in the minimal event-dispatch context and calling them raises AttributeError. Keep handlers thin: record the event, process it later.

Testing panels

@ext.panel handlers return a UINode tree. Call them directly in tests and assert on the serialized structure:

tests/test_panels.py

import pytest
from imperal_sdk import Extension, ui
from imperal_sdk.testing import MockContext

ext = Extension(
    "item-ext",
    display_name="Items",
    description="Displays and manages a list of items in the left sidebar.",
    actions_explicit=True,
)


@ext.panel("sidebar", slot="left", title="Items", icon="📜")
async def sidebar(ctx, **kwargs) -> object:
    page = await ctx.store.query("items", order_by="created_at")
    if not page.data:
        return ui.Empty(message="No items yet", icon="📦")
    return ui.Stack(children=[
        ui.List(items=[
            ui.ListItem(
                id=doc.id,
                title=doc.data["title"],
                on_click=ui.Call("__panel__editor", item_id=doc.id),
            )
            for doc in page.data
        ])
    ])


@pytest.mark.asyncio
async def test_sidebar_empty_state():
    ctx = MockContext()

    result = await sidebar(ctx)

    assert result.type == "Empty"
    assert result.props["message"] == "No items yet"


@pytest.mark.asyncio
async def test_sidebar_with_items():
    ctx = MockContext()
    ctx.store._data["items"] = {
        "i1": {"title": "First item"},
        "i2": {"title": "Second item"},
    }

    result = await sidebar(ctx)

    assert result.type == "Stack"
    serialized = result.to_dict()
    list_node = serialized["props"]["children"][0]
    assert list_node["type"] == "List"
    assert len(list_node["props"]["items"]) == 2


@pytest.mark.asyncio
async def test_sidebar_item_has_call_action():
    ctx = MockContext()
    ctx.store._data["items"] = {"i1": {"title": "Clickable item"}}

    result = await sidebar(ctx)
    serialized = result.to_dict()

    item = serialized["props"]["children"][0]["props"]["items"][0]
    assert item["props"]["on_click"]["action"] == "call"
    assert item["props"]["on_click"]["function"] == "__panel__editor"
    assert item["props"]["on_click"]["params"]["item_id"] == "i1"

All UINode objects expose .type (the render type string) and .props (the props dict). Call .to_dict() for the full wire-serialized form.
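
Indexing into the serialized tree by position gets brittle as panels grow. A small recursive search over the serialized dict is one alternative. The sketch below is a hypothetical test helper, not an SDK utility, and the hand-written tree assumes the {"type": ..., "props": ...} shape that the tests above index into.

```python
from typing import Iterator


def iter_nodes(node: object) -> Iterator[dict]:
    """Yield every dict that looks like a serialized node, depth-first."""
    if isinstance(node, dict):
        if "type" in node and "props" in node:
            yield node
        for value in node.values():
            yield from iter_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)


def find_by_type(tree: dict, node_type: str) -> list[dict]:
    """Collect all serialized nodes whose 'type' equals node_type."""
    return [n for n in iter_nodes(tree) if n["type"] == node_type]


# Hand-written stand-in for the output of result.to_dict().
tree = {
    "type": "Stack",
    "props": {"children": [{
        "type": "List",
        "props": {"items": [
            {"type": "ListItem", "props": {"id": "i1", "title": "First item"}},
            {"type": "ListItem", "props": {"id": "i2", "title": "Second item"}},
        ]},
    }]},
}

titles = [n["props"]["title"] for n in find_by_type(tree, "ListItem")]
```

A test written this way keeps passing when a wrapper node is added around the list, as long as the items themselves are unchanged.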

Integration tests

Integration tests verify that the extension mounts correctly: validators pass, the manifest is well-formed, and the health check returns healthy.

tests/test_integration.py

import pytest
from imperal_sdk import validate_extension, generate_manifest
from imperal_sdk.testing import MockContext

# Replace with your real extension import
# from my_extension.app import ext


@pytest.mark.asyncio
async def test_extension_validates_clean():
    """V1-V24 validators all pass at import time."""
    from my_extension.app import ext  # type: ignore[import]

    report = validate_extension(ext)
    errors = [i for i in report.issues if i.level == "ERROR"]
    assert errors == [], "Validator errors:\n" + "\n".join(
        f"  {i.rule}: {i.message}" for i in errors
    )


@pytest.mark.asyncio
async def test_health_check_passes():
    from my_extension.app import ext  # type: ignore[import]

    ctx = MockContext()
    if ext._health_check:
        status = await ext._health_check.func(ctx)
        assert status.healthy is True


def test_manifest_generation_does_not_raise():
    from my_extension.app import ext  # type: ignore[import]

    manifest = generate_manifest(ext)
    assert manifest["app_id"] == ext.app_id
    assert "tools" in manifest

CI patterns

A minimal GitHub Actions workflow for an extension:

.github/workflows/test.yml

name: test
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: pip install "imperal-sdk[dev]"
      - name: Syntax check
        run: |
          python3 -m py_compile app.py
          python3 -m py_compile handlers_crud.py
          python3 -m py_compile handlers_schedule.py
          python3 -m py_compile panels.py
          python3 -m py_compile skeleton.py
      - name: Run tests
        run: pytest tests/ -v

For extensions with Pyright:

.github/workflows/test.yml (with type checking)

      - name: Type check
        run: pip install pyright && pyright handlers_crud.py panels.py skeleton.py

python3 -m py_compile is mandatory before every deploy. Add it as the first step in CI so syntax errors are caught before tests run.

Common pitfalls

Forgetting @pytest.mark.asyncio or asyncio_mode = "auto"

All SDK handler functions are async def. Without the asyncio marker or asyncio_mode = "auto", pytest never awaits the coroutine, so the handler body does not run: pytest 7.x and early 8.x releases skip the test with a warning that is easy to miss in CI output, and pytest 8.4 and later fail it. Fix: add asyncio_mode = "auto" to [tool.pytest.ini_options] in pyproject.toml.

Pre-seeding ctx.store._data with the wrong structure

MockStore._data is dict[str, dict[str, dict]] — {collection: {doc_id: data_dict}}. A common mistake is seeding it as a flat list or using integer keys. Each document must be keyed by its doc_id string.
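
When fixture data starts life as a list of records, a small converter avoids hand-building the nested dict. to_collection below is a hypothetical helper, not part of the SDK; dropping the ID field from the stored payload is an assumption of the sketch.

```python
def to_collection(docs: list[dict], id_field: str = "id") -> dict[str, dict]:
    """Convert a list of records into the {doc_id: data_dict} shape."""
    collection: dict[str, dict] = {}
    for doc in docs:
        doc_id = str(doc[id_field])  # keys must be strings, even for numeric IDs
        if doc_id in collection:
            raise ValueError(f"duplicate document id: {doc_id}")
        # Assumption: the ID lives in the key, so it is left out of the payload.
        collection[doc_id] = {k: v for k, v in doc.items() if k != id_field}
    return collection


seeded = to_collection([
    {"id": 1, "title": "Buy milk", "status": "open"},
    {"id": 2, "title": "Fix bug", "status": "done"},
])
```

The result can then be assigned to a collection, for example `ctx.store._data["tasks"] = seeded`.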

ctx.user.id does not exist

The canonical user ID is ctx.user.imperal_id. ctx.user.id was removed in the W1 unification (2026-04-27). Code that references ctx.user.id raises AttributeError at runtime — unit tests expose this immediately.

ctx.cache unavailable in MockContext

ctx.cache requires an Extension reference and a live gateway URL. In MockContext neither is present, so accessing ctx.cache raises RuntimeError. Structure cache-backed handlers to accept an injectable fetcher, or test them at the ctx.store level.

Skeleton access from wrong tool_type

ctx.skeleton.get() raises SkeletonAccessForbidden unless tool_type="skeleton" was passed to MockContext. Skeleton handler tests must use MockContext(tool_type="skeleton"). All other handler tests use the default tool_type="tool".

Missing await on store methods

All store methods — create, get, query, update, delete, count — are async def. Forgetting await returns a coroutine object. The subsequent assertion on .data or .status raises AttributeError in a confusing way.
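
The failure mode is easy to reproduce without the SDK. In the sketch below, Result and handler are stand-ins invented for the example; the point is only what a missing await hands back.

```python
import asyncio
import inspect


class Result:
    """Stand-in for a handler's return value."""
    status = "success"


async def handler() -> Result:
    return Result()


not_awaited = handler()  # missing await: this is a coroutine, not a Result
is_coroutine = inspect.iscoroutine(not_awaited)
has_status = hasattr(not_awaited, "status")
not_awaited.close()  # close it to avoid a "never awaited" RuntimeWarning

awaited = asyncio.run(handler())
```

An AttributeError that mentions a coroutine object is therefore a strong hint that an await is missing one line earlier.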

ctx.as_user() requires system context

ctx.as_user(uid) raises RuntimeError unless ctx.user.imperal_id == "__system__". For scheduled-job tests that exercise fan-out, build a system context explicitly (see Scheduled jobs) rather than using the default MockContext().

Calling ctx.http, ctx.ai, or ctx.billing in event handlers

@ext.on_event handlers receive minimal context. These clients are not populated and calling them raises AttributeError. Keep event handlers thin — record to store, process later.