Architecture

How a chat message becomes a tool call — the full path from user to your handler

This page traces a single chat turn from the moment a user types something to the moment your extension's handler returns a result. It names every layer that participates, explains what each one owns, and points out the federal guarantees along the way.

What the web-kernel is

The web-kernel is Imperal Cloud's purpose-built orchestration layer. It's not a Python interpreter, not a Linux kernel, and not a Kubernetes scheduler; it's a clean-room piece of software we built specifically to make conversational dispatch reliable. The name reflects what it does: it acts as the kernel of a web-native AI platform, the trusted layer every chat turn passes through.

Concretely, it owns five jobs:

  • Intent classification — turns a natural-language message into a typed handler call (which extension, which function, with which parameters), with confidence + plan steps for multi-action chains.
  • Federal authorization — every action checked against the user's RBAC scopes, tenant boundary, confirmation policy, and action-type rules before any side-effect fires. If a check fails, your extension code never runs.
  • Context construction — every handler receives a uniform ctx carrying user identity, tenant, scopes, configured LLM, store/cache/http/secrets clients, and platform services. You don't re-derive any of this.
  • Async lifecycle — short calls run inline; calls exceeding the promotion threshold detach automatically and post their result to chat when done. Heartbeats, cancellations, timeouts, and retries are all managed for you.
  • Audit chokepoint — every action lands in the federal action-ledger with retention class and tenant scope. Federal invariants enforce this at the boundary, not by extension policy.
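
The async lifecycle job above (run inline, detach past a threshold) can be illustrated with plain asyncio. This is a minimal sketch of the idea, not the web-kernel's implementation: `run_with_promotion`, `PROMOTION_THRESHOLD_S`, and the `on_late_result` callback are hypothetical names, and the threshold value is invented for the demo.

```python
import asyncio

# Hypothetical value; the real promotion threshold is not stated on this page.
PROMOTION_THRESHOLD_S = 0.1


async def run_with_promotion(coro, on_late_result, threshold=PROMOTION_THRESHOLD_S):
    """Return the result inline if it arrives within `threshold`, else detach."""
    task = asyncio.ensure_future(coro)
    # asyncio.wait does not cancel the task when the timeout elapses.
    done, _pending = await asyncio.wait({task}, timeout=threshold)
    if done:
        return {"status": "inline", "result": task.result()}
    # Too slow for the inline path: keep running and deliver the result later.
    task.add_done_callback(lambda t: on_late_result(t.result()))
    return {"status": "detached", "task": task}


async def _work(delay, value):
    await asyncio.sleep(delay)
    return value


async def demo():
    posted = []  # stands in for "post the result to chat when done"
    fast = await run_with_promotion(_work(0.0, "fast"), posted.append)
    slow = await run_with_promotion(_work(0.5, "slow"), posted.append)
    await slow["task"]      # wait for the detached call to finish
    await asyncio.sleep(0)  # give pending callbacks a chance to run
    return fast, slow["status"], posted
```

Running `asyncio.run(demo())` shows the fast call returning inline and the slow one being delivered through the callback. Heartbeats, cancellation, and retries are left out to keep the sketch short.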

Everything downstream of the web-kernel — your extension code, panels, skeletons, schedules, webhooks — is just an extension to it. Everything upstream — Webbee chat, the Imperal Panel app, the marketplace, the Developer Portal — is just a client of it.

The 30-second version

USER (panel.imperal.io)
  │  "send the Q3 report to sarah"
  ▼
AUTH GATEWAY (auth.imperal.io): JWT, RBAC, rate limit
  │
  ▼
WEB-KERNEL (Webbee)  ← intent classifier picks tool + args
  │
  ▼
EXTENSION HANDLER (your code)  ← typed Pydantic args
  │
  ▼
RESULT → web-kernel renders to chat → user sees reply

Every arrow has security, audit, and tenant-isolation gates baked in. You write the handler — the platform does the rest.

The seven layers

🌐

1. Imperal Panel UI

React app at panel.imperal.io. Where the user lives. Renders chat, panels, settings.

🛡️

2. [Auth Gateway](/en/reference/glossary/)

auth.imperal.io. JWT issuance, OAuth, RBAC, federation. Every request crosses it.

🧠

3. Webbee web-kernel

Intent classification, action authorization, chain orchestration, rendering, audit.

⚙️

4. [ICNLI Worker](/en/reference/glossary/)

Where extensions actually execute. Handles long-running tasks, retries, timeouts.

📦

5. Your extension

The Python package. Handlers + manifest + optional panels and skeletons.

🔌

6. Registry

Source of truth for installed extensions, scopes, metadata. Backed by Postgres.

🌉

7. UCG (Unified Context Gateway)

Bridge that lets external clients (third-party apps, webhooks) talk to Webbee.

Walk-through: a chat turn

Let's trace "send the Q3 report to sarah" through the system.

The user types in chat

The Imperal Panel posts to auth.imperal.io/v1/chat/turn with the message body and the user's JWT.

Auth Gateway authorizes

  • Validates the JWT signature and expiry
  • Loads the user record + tenant info
  • Applies rate-limit + RBAC checks
  • Forwards to the web-kernel with a typed envelope: {user, tenant, message, history_ref}

This is also where multi-tenancy is enforced. The user_id and tenant_id are now web-kernel-authoritative and travel with every downstream call.
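
To make the envelope concrete, here is a minimal sketch of what a typed, immutable `{user, tenant, message, history_ref}` record could look like. The class name `TurnEnvelope`, the helper `build_envelope`, and the claim names (`sub` is the standard JWT subject claim; `tenant_id` is a hypothetical custom claim) are assumptions for illustration, not the gateway's actual types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnEnvelope:
    """Illustrative stand-in for the envelope forwarded to the web-kernel."""
    user: str
    tenant: str
    message: str
    history_ref: str


def build_envelope(claims: dict, message: str, history_ref: str) -> TurnEnvelope:
    # Identity comes only from the already-verified JWT claims,
    # never from anything the client put in the request body.
    return TurnEnvelope(
        user=claims["sub"],
        tenant=claims["tenant_id"],  # hypothetical claim name
        message=message,
        history_ref=history_ref,
    )
```

Freezing the dataclass mirrors the idea that identity is fixed once the gateway has authorized the turn: downstream code can read `user` and `tenant` but cannot reassign them.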

Federal guarantee

Every action that fires from this turn will be audited with the same user_id. If your code somewhere "forgets" to pass it, the web-kernel's audit chokepoint catches it and fail-closes the action. See federal invariants.
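
The fail-closed behavior described here can be sketched as a wrapper that refuses to run anything it cannot attribute. All names below (`run_audited`, `AuditRow`, `UnattributedAction`) are hypothetical, and the in-memory list stands in for the real action ledger.

```python
import time
from dataclasses import dataclass


class UnattributedAction(Exception):
    """Raised when an action arrives without a user or tenant identity."""


@dataclass(frozen=True)
class AuditRow:
    user_id: str
    tenant_id: str
    tool: str
    outcome: str
    recorded_at: float


def run_audited(ledger, user_id, tenant_id, tool, action):
    """Run `action` only if it can be attributed; record the outcome either way."""
    if not user_id or not tenant_id:
        # Fail closed: record the rejection and never call the action.
        ledger.append(AuditRow(user_id or "", tenant_id or "", tool, "rejected", time.time()))
        raise UnattributedAction(f"{tool}: missing user_id or tenant_id")
    result = action()
    ledger.append(AuditRow(user_id, tenant_id, tool, "ok", time.time()))
    return result
```

The point of the sketch is the ordering: the identity check happens before the side effect, so a missing `user_id` produces a ledger entry and an error rather than an unaudited action.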

Web-kernel classifies intent

The web-kernel runs a single LLM call (the intent classifier) that returns:

{
  "intent": "action",
  "confidence": 0.92,
  "language": "en",
  "action_plan": {
    "tool": "mail.send_message",
    "args": {
      "to": ["sarah@company.com"],
      "subject": "Q3 report",
      "body_template": "q3-report"
    }
  },
  "is_destructive": false,
  "is_acceptance": false
}

The classifier knows about every installed extension's tools (from the Registry) and produces a strongly-typed plan.
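
As an illustration of what "strongly-typed plan" can mean in practice, the sketch below parses classifier output shaped like the JSON above and rejects it unless the tool is known and its required arguments are present. `KNOWN_TOOLS` is a hypothetical stand-in for what the Registry provides, and the confidence floor is an invented parameter.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionPlan:
    tool: str
    args: dict
    confidence: float
    is_destructive: bool


# Hypothetical registry view: tool name -> required argument names.
KNOWN_TOOLS = {"mail.send_message": {"to", "subject", "body_template"}}


def parse_plan(raw: str, min_confidence: float = 0.5) -> ActionPlan:
    """Turn raw classifier JSON into an ActionPlan, or raise ValueError."""
    data = json.loads(raw)
    if data["intent"] != "action":
        raise ValueError("not an action intent")
    if data["confidence"] < min_confidence:
        raise ValueError("confidence below threshold")
    plan = data["action_plan"]
    required = KNOWN_TOOLS.get(plan["tool"])
    if required is None:
        raise ValueError(f"unknown tool: {plan['tool']}")
    missing = required - set(plan["args"])
    if missing:
        raise ValueError(f"missing args: {sorted(missing)}")
    return ActionPlan(plan["tool"], plan["args"], data["confidence"], data["is_destructive"])
```

A real deployment would validate argument types as well as names (the page mentions Pydantic for that); checking names only keeps this example within the standard library.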

Web-kernel checks authorization

Before any side-effects fire, the web-kernel runs the federal authorization invariants:

  • Is this action allowed for this user? (RBAC + scopes)
  • Does it need confirmation? (action_type=destructive → yes)
  • Tenant scope match? (acted-on resource belongs to acting user's tenant)

If any check fails, the action is rejected with a federal-recorded reason. No code in your extension runs.
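
The three checks can be sketched as a single pure function that returns a decision plus a reason. This is an assumption-laden illustration: the `Actor`, `Action`, and `Decision` types, the scope string, and the check order are all hypothetical, not the web-kernel's actual invariants.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_CONFIRMATION = "needs_confirmation"
    DENY = "deny"


@dataclass(frozen=True)
class Actor:
    user_id: str
    tenant_id: str
    scopes: frozenset


@dataclass(frozen=True)
class Action:
    tool: str
    required_scope: str
    action_type: str       # e.g. "read", "write", "destructive"
    resource_tenant: str   # tenant that owns the acted-on resource


def authorize(actor: Actor, action: Action, confirmed: bool = False) -> Tuple[Decision, str]:
    """Evaluate scope, tenant boundary, then confirmation policy."""
    if action.required_scope not in actor.scopes:
        return Decision.DENY, "missing scope"
    if action.resource_tenant != actor.tenant_id:
        return Decision.DENY, "tenant mismatch"
    if action.action_type == "destructive" and not confirmed:
        return Decision.NEEDS_CONFIRMATION, "destructive action requires confirmation"
    return Decision.ALLOW, "ok"
```

Returning a reason alongside the decision matches the idea that a rejection is recorded with its cause, and keeping the function free of side effects makes each rule easy to test in isolation.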

Web-kernel dispatches to the worker

The web-kernel queues an ICNLI Worker activity targeting your extension. The worker picks it up. Your handler is invoked with a fully-formed ctx:

async def send_message(ctx, params: SendMessageParams):
    #                  ↑    ↑
    #                  │    └─ already validated by Pydantic
    #                  │
    #                  └─ ctx.user, ctx.tenant, ctx.http, ctx.store, ctx.ai, ctx.skeleton, ctx.cache, ctx.notify, ctx.storage, ctx.tools
    ...

Your handler runs

You write the business logic. SMTP, API call, database query, whatever.

async def send_message(ctx, params: SendMessageParams):
    body = await render_template(params.body_template)
    await ctx.http.post("/mail/send", json={...})  # payload elided
    # Report the actual recipients instead of a hard-coded address.
    return {"text": f"Email sent to {', '.join(params.to)}"}
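
One way to exercise a handler like this without the platform is to pass it a stand-in ctx. The sketch below is self-contained and makes several assumptions: `FakeCtx` and `FakeHttp` are hypothetical test doubles, `SendMessageParams` is a standard-library stand-in for the Pydantic model, `render_template` is a dummy helper, and the payload shape sent to `/mail/send` is invented for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SendMessageParams:
    """Stdlib stand-in for the Pydantic params model."""
    to: list
    subject: str
    body_template: str


class FakeHttp:
    """Records calls instead of making network requests."""
    def __init__(self):
        self.calls = []

    async def post(self, path, json):
        self.calls.append((path, json))
        return {"status": 200}


@dataclass
class FakeCtx:
    http: FakeHttp = field(default_factory=FakeHttp)


async def render_template(name: str) -> str:
    # Dummy renderer; a real extension would load and fill a template.
    return f"<rendered {name}>"


async def send_message(ctx, params: SendMessageParams):
    body = await render_template(params.body_template)
    # Hypothetical payload shape, built from the validated params.
    payload = {"to": params.to, "subject": params.subject, "body": body}
    await ctx.http.post("/mail/send", json=payload)
    return {"text": f"Email sent to {', '.join(params.to)}"}
```

Because the handler only touches `ctx.http`, the fake needs just that one attribute. The same pattern extends to `ctx.store` or `ctx.cache` if a handler uses them.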

Web-kernel records, renders, replies

  • The return JSON hits the web-kernel.
  • An audit row is written to the action ledger with retention class federal_7y.
  • The chat renderer turns your text into the user-visible reply (with i18n if ctx.lang differs from default).
  • The Imperal Panel receives the rendered turn over Fast-RPC and updates the UI.

The whole round-trip on a warm path: ~80–300ms, mostly web-kernel + LLM time. Your handler is rarely the bottleneck.

Where state lives

Your extension's state is your responsibility

Imperal Cloud doesn't impose a database. You can use SQLite, Postgres, Redis, S3, your own API. What's required: scope every read/write by ctx.user.imperal_id (and ctx.tenant_id for multi-org tenants). The federal layer can't enforce this from the outside — it's a contract you keep.
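
A minimal sketch of that contract using SQLite: every table carries tenant and user columns, and every read and write filters on them. The table name, column names, and helper functions are hypothetical; in a handler the two identifiers would come from ctx rather than being passed in by hand.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE notes (
            tenant_id TEXT NOT NULL,
            user_id   TEXT NOT NULL,
            key       TEXT NOT NULL,
            value     TEXT,
            PRIMARY KEY (tenant_id, user_id, key)
        )
        """
    )
    return db


def put_note(db, tenant_id, user_id, key, value):
    # The owner columns are part of the primary key, so one user's write
    # can never overwrite another user's row with the same key.
    db.execute(
        "INSERT OR REPLACE INTO notes (tenant_id, user_id, key, value) VALUES (?, ?, ?, ?)",
        (tenant_id, user_id, key, value),
    )


def get_note(db, tenant_id, user_id, key):
    # Every read filters on both tenant and user; there is no unscoped query.
    row = db.execute(
        "SELECT value FROM notes WHERE tenant_id = ? AND user_id = ? AND key = ?",
        (tenant_id, user_id, key),
    ).fetchone()
    return row[0] if row else None
```

Putting the scope columns into the primary key, rather than relying on each query to remember a WHERE clause, makes cross-user collisions structurally impossible for writes.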

The data plane vs the control plane

A useful split:

| Plane | What flows | Latency target |
| --- | --- | --- |
| Control plane | Manifests, validations, scope changes, deploys | seconds (fine) |
| Data plane | Chat turns, tool calls, audit writes | milliseconds (matters) |
| Live updates | Skeletons, panel re-renders, status pings | sub-second (matters) |

Most of what you write is data-plane code (your handlers). The control plane is mostly the Developer Portal + Registry — you interact with it once at publish time.

Putting it together — the diagram

                      ┌────────────────────────────────────────────────┐
                      │                panel.imperal.io                │
                      │ React UI, chat, panels, settings, marketplace  │
                      └──────────────────────┬─────────────────────────┘
                                             │ HTTPS / Fast-RPC
                                             ▼
                      ┌────────────────────────────────────────────────┐
                      │           auth.imperal.io  (Auth GW)           │
                      │  JWT │ OAuth │ RBAC │ rate limit │ federation  │
                      └──────────────────────┬─────────────────────────┘
                                             │
                  ┌──────────────────────────┴────────────────────────────┐
                  │                   Webbee WEB-KERNEL                   │
                  │                                                       │
                  │   intent classifier  →  authz + tenant gates → audit  │
                  │            │                                          │
                  │            ▼                                          │
                  │    chain orchestrator (multi-step / typed dispatch)   │
                  │            │                                          │
                  │            ▼                                          │
                  │    extension dispatcher  →  ICNLI Worker activity     │
                  └──────────────────────────┬────────────────────────────┘
                                             │
                                             ▼
                      ┌────────────────────────────────────────────────┐
                      │        ICNLI worker fleet (3 instances)        │
                      │            your @chat.function runs            │
                      └────────────────────────────────────────────────┘
                                  ↑                  ↑               ↑
                                  │                  │               │
                            ┌─────┴────┐   ┌─────────┴────┐   ┌──────┴──────┐
                            │ Registry │   │   Postgres   │   │   Redis     │
                            │   (PG)   │   │ action ledger│   │ pub/sub TTL │
                            └──────────┘   └──────────────┘   └─────────────┘
