BYOLLM Pricing — impact on your revenue

When a user runs on their own LLM provider, what changes in your earnings — and why

Read this before setting your prices. BYOLLM (Bring Your Own LLM) is a key Imperal feature that lets users plug in their own API keys for OpenAI, Anthropic, and other providers. When they do, your revenue per call changes, sometimes significantly. Understanding the difference helps you price correctly.

What "BYOLLM" means

Imperal users have two LLM modes:

| Mode | Who pays the LLM provider | Imperal's LLM cost |
|---|---|---|
| Platform LLM (default) | Imperal does | Real cost (passed via platform_fee) |
| BYOLLM | User does (they pay OpenAI/Anthropic directly) | Zero |

When is_byollm=True in the user's web-kernel context, every wallet-touching code path applies the I-BYOLLM-PLATFORM-FEE-ZERO federal contract: platform_fee → 0.

What this means for your extension

Your base_price (the price you set in Dev Portal) is unchanged in both modes. What changes is the platform_fee component on top.

Example: your summarize_text tool at base_price=5 (indie tier, 80%)

Non-BYOLLM user calls it:

base_price        = 5  (your pricing)
platform_fee      = 2  (Imperal's LLM cost — model-tier-derived)
─────────────────────
total user cost   = 7 tokens
developer_share   = floor(7 × 0.80) = 5 tokens   ← you earn 5
platform_share    = 2 tokens

BYOLLM user calls it:

base_price        = 5  (unchanged)
platform_fee      = 0  ← BYOLLM exclusion
─────────────────────
total user cost   = 5 tokens
developer_share   = floor(5 × 0.80) = 4 tokens   ← you earn 4
platform_share    = 1 token

(Replace 0.80 with 0.70 for explorer tier, 0.85 for studio, 0.95 for partner — see Developer Tiers.)
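The two breakdowns above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `split_call`, `Split`, and `TIER_SHARE_PERCENT` are hypothetical names, not part of the Imperal SDK, and it assumes the platform share is simply whatever remains after the floored developer share, which matches both examples.

```python
from dataclasses import dataclass

# Developer revenue share per developer tier, in whole percent
# (70 / 80 / 85 / 95 as listed in the text).
TIER_SHARE_PERCENT = {"explorer": 70, "indie": 80, "studio": 85, "partner": 95}


@dataclass(frozen=True)
class Split:
    total: int            # tokens deducted from the user's wallet
    developer_share: int  # tokens credited to the developer
    platform_share: int   # tokens kept by the platform


def split_call(base_price: int, platform_fee: int, tier: str, is_byollm: bool) -> Split:
    """Hypothetical helper mirroring the worked example; not an SDK function."""
    fee = 0 if is_byollm else platform_fee  # BYOLLM zeroes the platform_fee
    total = base_price + fee
    # Integer math avoids float rounding; equivalent to floor(total * share).
    developer = total * TIER_SHARE_PERCENT[tier] // 100
    # Assumption: the platform keeps the remainder.
    return Split(total, developer, total - developer)


print(split_call(5, 2, "indie", is_byollm=False))
# Split(total=7, developer_share=5, platform_share=2)
print(split_call(5, 2, "indie", is_byollm=True))
# Split(total=5, developer_share=4, platform_share=1)
```

Swapping `"indie"` for another tier key shows how the same call splits under a different revenue share.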

You earn slightly less per BYOLLM call because the total you take a share of is smaller. The size of the gap depends on the platform_fee that was dropped, which in turn depends on the model tier the user picked.

When the platform_fee is bigger, BYOLLM savings are bigger

Platform fees scale with model tier:

| Tier | Default platform_fee |
|---|---|
| economy (Haiku, GPT-4o-mini, etc.) | 1 token |
| standard (Sonnet, GPT-4o) | 2 tokens |
| premium (Opus, GPT-4.1) | 5 tokens |

If your extension is typically invoked with a premium model (heavy reasoning), the BYOLLM gap is larger for you. Non-BYOLLM users were paying base_price + 5, and you were earning floor((base_price + 5) × 0.80); BYOLLM users pay only base_price, and you earn floor(base_price × 0.80).
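To see how the gap grows with model tier, the comparison can be tabulated per tier. This is a minimal sketch using the default fees quoted in this section; `byollm_gap` and `developer_share` are hypothetical helper names, and the 80% share is the indie-tier figure from the earlier example.

```python
# Default platform_fee per model tier, in tokens (values quoted in this section).
PLATFORM_FEE = {"economy": 1, "standard": 2, "premium": 5}


def developer_share(total: int, share_percent: int) -> int:
    """floor(total * share), computed in integers."""
    return total * share_percent // 100


def byollm_gap(base_price: int, share_percent: int = 80) -> dict[str, int]:
    """Tokens per call the developer earns less from a BYOLLM user, by model tier."""
    byollm = developer_share(base_price, share_percent)
    return {
        tier: developer_share(base_price + fee, share_percent) - byollm
        for tier, fee in PLATFORM_FEE.items()
    }


print(byollm_gap(5))
# {'economy': 0, 'standard': 1, 'premium': 4}
```

At base_price=5 the economy tier shows no gap at all, because flooring absorbs the one-token fee, while the premium tier shows a gap of 4 tokens per call.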

Where BYOLLM users pay nothing at all

Two situations short-circuit to zero, even before your base_price applies:

1. Branch B conversational turns

When a user just chats with Webbee without invoking any extension (no @chat.function triggered, only a free-form response), web-kernel charges __system__/hub_chat/write. For non-BYOLLM users this costs, for example, 7 tokens (5 base + 2 fee). For BYOLLM users it costs nothing.

This doesn't affect your extension's revenue — __system__ isn't your app — but it's why BYOLLM users see far fewer wallet decrements in their history.

2. Chain-step reserve (the chain itself, not your individual handlers)

When a user fires off a multi-step request like "send John an email AND create a note", web-kernel reserves a chain-wide budget using __system__/hub_chat/write per step. For BYOLLM users this reserve is zero.

Your individual extension calls within that chain still charge base_price (via the chain's hold key), so the base_price portion of your per-handler earnings is unchanged. As with single calls, you only lose the share of the platform_fee component that you would otherwise have received.
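The two short-circuits can be pictured as follows. This is a rough sketch, not web-kernel's actual code: the function names are hypothetical, the 5 + 2 figures are the example values quoted above, and sizing the reserve as one hub_chat charge multiplied by the number of steps is an assumption about how "per step" is applied.

```python
HUB_CHAT_BASE = 5  # example base for __system__/hub_chat/write (from the text)
HUB_CHAT_FEE = 2   # example platform_fee (from the text)


def hub_chat_charge(is_byollm: bool) -> int:
    """Tokens charged for one free-form (Branch B) conversational turn."""
    # BYOLLM short-circuits the whole charge to zero, not just the fee.
    return 0 if is_byollm else HUB_CHAT_BASE + HUB_CHAT_FEE


def chain_reserve(steps: int, is_byollm: bool) -> int:
    """Assumed: the chain-wide reserve is one hub_chat charge per step."""
    return steps * hub_chat_charge(is_byollm)


# "send John an email AND create a note" is a two-step chain.
print(chain_reserve(2, is_byollm=False))  # 14
print(chain_reserve(2, is_byollm=True))   # 0
```

Neither function involves an extension's base_price, which is why these zeros do not change what a developer earns per handler call.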

Detecting BYOLLM in your handler (optional)

If your extension does its own LLM calls inside a handler (e.g. content generation beyond what web-kernel-intent does), you can check the user's BYOLLM status via:

@chat.function(action_type="read")
async def my_handler(ctx, params):
    if ctx.user.is_byollm:
        # User has their own LLM key — ctx.llm.create_message will use it
        # We still charge them base_price for this action
        response = await ctx.llm.create_message(system="...", user="...")
    else:
        # Platform-provided LLM — same call, web-kernel pays the provider
        response = await ctx.llm.create_message(system="...", user="...")
    return ActionResult.success(...)

Either way, the wallet deduction happens automatically based on your registered pricing. Your handler doesn't need to compute the price; web-kernel does that before dispatch. See the BYOLLM-aware extension recipe for a full example.

Pricing recommendations

Given that BYOLLM users yield slightly less per call:

  1. Don't try to compensate with higher base_price. Users on BYOLLM are usually power users who chose Imperal specifically for the marketplace — they'll notice and uninstall if you over-charge. Trust the platform_fee mechanism.
  2. Set your prices based on the value your tool delivers, not the model tier. A summarize_text is worth ~5 tokens whether it ran on Haiku or Opus.
  3. Use per-function pricing when some tools are LLM-heavy and others aren't. A tool that internally makes 5 LLM calls deserves higher base_price than a single-DB-query tool.
  4. Don't price by Imperal's platform_fee. The platform_fee is opaque to you: it changes with model rates, tier reassignment, and BYOLLM. Your base_price should be self-sufficient.
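One way to check that a base_price is self-sufficient is to look at what it earns when the platform_fee is zero, which is the BYOLLM case. The helpers below are a hypothetical sketch (neither name exists in the Imperal SDK) and assume the floor-based split used in the earlier worked example.

```python
def byollm_earnings(base_price: int, share_percent: int) -> int:
    """Developer tokens per call when platform_fee is zero (the BYOLLM case)."""
    return base_price * share_percent // 100


def min_base_price(target_earnings: int, share_percent: int) -> int:
    """Smallest base_price whose BYOLLM earnings reach target_earnings."""
    # Ceiling division: smallest integer p with p * share_percent >= target * 100.
    return -(-target_earnings * 100 // share_percent)


print(byollm_earnings(5, 80))  # 4
print(min_base_price(5, 80))   # 7
```

This is a sanity check on a price you have already chosen for its value to the user, not a reason to raise it: if the BYOLLM figure is acceptable, any platform_fee share on non-BYOLLM calls is extra.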

Federal invariant

The contract I-BYOLLM-PLATFORM-FEE-ZERO is enforced at every wallet deduct site (web-kernel single-action, chain reserve, Branch B conversational). Tests in imperal_kernel/tests/test_i_byollm_platform_fee_zero.py pin the shape; CI fails if any caller drops the flag.

Effectively, you can rely on this behaviour staying consistent — if it ever changes, it'll go through Imperal's federal review process and you'll be notified well in advance.
