BYOLLM Pricing — impact on your revenue
When a user runs on their own LLM provider, what changes in your earnings — and why
Read this before setting your prices. BYOLLM (Bring Your Own LLM) is a key Imperal feature that lets users plug in their own API keys from OpenAI, Anthropic, or another provider. When they do, your revenue per call changes, sometimes noticeably. Understanding the difference helps you price correctly.
What "BYOLLM" means
Imperal users have two LLM modes:
| Mode | Who pays the LLM provider | Imperal's LLM cost |
|---|---|---|
| Platform LLM (default) | Imperal does | Real cost (passed via platform_fee) |
| BYOLLM | User does — they pay OpenAI/Anthropic directly | Zero |
When is_byollm=True in the user's web-kernel context, every wallet-touching code path applies the I-BYOLLM-PLATFORM-FEE-ZERO federal contract: platform_fee → 0.
What this means for your extension
Your base_price (the price you set in Dev Portal) is unchanged in both modes. What changes is the platform_fee component on top.
Example: your summarize_text tool at base_price=5 (indie tier, 80%)
Non-BYOLLM user calls it:

    base_price       = 5   (your pricing)
    platform_fee     = 2   (Imperal's LLM cost, model-tier-derived)
    ─────────────────────
    total user cost  = 7 tokens
    developer_share  = floor(7 × 0.80) = 5 tokens   ← you earn 5
    platform_share   = 2 tokens

BYOLLM user calls it:

    base_price       = 5   (unchanged)
    platform_fee     = 0   ← BYOLLM exclusion
    ─────────────────────
    total user cost  = 5 tokens
    developer_share  = floor(5 × 0.80) = 4 tokens   ← you earn 4
    platform_share   = 1 token

(Replace 0.80 with 0.70 for explorer tier, 0.85 for studio, 0.95 for partner; see Developer Tiers.)
You earn slightly less per BYOLLM call because the total the user pays is smaller. How much less depends on the platform_fee that was dropped, which in turn depends on the model tier the user picked: you lose roughly your tier's share of that fee, subject to rounding down.
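The arithmetic in the worked example can be reproduced with a short sketch. The helper name `split_revenue` and the `TIER_SHARE_PCT` table are hypothetical, introduced here for illustration only; they are not Imperal APIs. The sketch also assumes that platform_share is whatever remains after the floored developer_share, which matches both examples above but is not stated as a rule in this guide.

```python
# Developer share per tier, in percent, as listed in this guide.
TIER_SHARE_PCT = {"explorer": 70, "indie": 80, "studio": 85, "partner": 95}


def split_revenue(base_price, platform_fee, tier, is_byollm):
    """Hypothetical helper mirroring the worked example (not an Imperal API).

    Returns (total_user_cost, developer_share, platform_share) in tokens.
    """
    # BYOLLM exclusion: the platform_fee component drops to zero.
    fee = 0 if is_byollm else platform_fee
    total = base_price + fee
    # Integer arithmetic gives floor(total * share) without float rounding.
    developer_share = total * TIER_SHARE_PCT[tier] // 100
    # Assumption: the platform keeps the remainder.
    platform_share = total - developer_share
    return total, developer_share, platform_share


print(split_revenue(5, 2, "indie", is_byollm=False))  # (7, 5, 2)
print(split_revenue(5, 2, "indie", is_byollm=True))   # (5, 4, 1)
```

Swapping `"indie"` for another tier name shows how the same call pays out under a different share.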
When the platform_fee is bigger, BYOLLM savings are bigger
Platform fees scale with model tier:
| Tier | Default platform_fee |
|---|---|
| economy (Haiku, GPT-4o-mini, etc.) | 1 token |
| standard (Sonnet, GPT-4o) | 2 tokens |
| premium (Opus, GPT-4.1) | 5 tokens |
If your extension is typically invoked with a premium model (heavy reasoning), BYOLLM affects your earnings more. At the indie-tier share of 0.80, non-BYOLLM users pay base_price + 5 and you earn floor((base_price + 5) × 0.80); BYOLLM users pay only base_price and you earn floor(base_price × 0.80).
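To see how the gap varies by model tier, the comparison can be sketched as below. The names `DEFAULT_PLATFORM_FEE`, `developer_share`, and `byollm_delta` are hypothetical and exist only for this illustration; the fee values are the defaults from the table above, and the 80% share is the indie tier used throughout this page.

```python
# Default platform fees per model tier, taken from the table in this guide.
DEFAULT_PLATFORM_FEE = {"economy": 1, "standard": 2, "premium": 5}


def developer_share(total, share_pct=80):
    """floor(total * share), computed with integers to avoid float rounding."""
    return total * share_pct // 100


def byollm_delta(base_price, model_tier, share_pct=80):
    """Tokens you earn on a platform-LLM call minus tokens on a BYOLLM call."""
    with_fee = developer_share(base_price + DEFAULT_PLATFORM_FEE[model_tier], share_pct)
    without_fee = developer_share(base_price, share_pct)
    return with_fee - without_fee


for tier in DEFAULT_PLATFORM_FEE:
    print(tier, byollm_delta(5, tier))
# economy 0
# standard 1
# premium 4
```

With base_price=5, rounding down absorbs the whole difference at the economy tier, while the premium tier shows a 4-token gap per call.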
Where BYOLLM users pay nothing at all
Two situations short-circuit to zero, even before your base_price applies:
1. Branch B conversational turns
When a user just chats with Webbee without invoking any extension (no @chat.function triggered, only a free-form response), web-kernel charges __system__/hub_chat/write. For non-BYOLLM users this is e.g. 7 tokens (5 base + 2 fee). For BYOLLM users this is zero.
This doesn't affect your extension's revenue — __system__ isn't your app — but it's why BYOLLM users see far fewer wallet decrements in their history.
2. Chain-step reserve (the chain itself, not your individual handlers)
When a user fires off a multi-step request like "send John an email AND create a note", web-kernel reserves a chain-wide budget using __system__/hub_chat/write per step. For BYOLLM users this reserve is zero.
Your individual extension calls within that chain still charge base_price (via the chain's hold key), so what you earn from base_price is unchanged. The only thing you lose is your share of the platform_fee component, which BYOLLM users no longer pay.
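The two zero-charge cases can be summarized in a small sketch. Everything here is illustrative: `hub_chat_charge` and `chain_reserve` are hypothetical names, the 5 + 2 figures are the example values quoted above, and the assumption that a non-BYOLLM reserve equals the per-step charge multiplied by the number of steps is a reading of "per step", not a documented formula.

```python
# Example figures for __system__/hub_chat/write quoted in this guide.
HUB_CHAT_BASE = 5
HUB_CHAT_FEE = 2


def hub_chat_charge(is_byollm):
    """Tokens charged for one free-form conversational turn (Branch B)."""
    # BYOLLM users pay nothing for these system turns.
    return 0 if is_byollm else HUB_CHAT_BASE + HUB_CHAT_FEE


def chain_reserve(steps, is_byollm):
    """Chain-wide reserve, ASSUMING one hub_chat/write charge per step."""
    return steps * hub_chat_charge(is_byollm)


print(hub_chat_charge(is_byollm=False))   # 7
print(hub_chat_charge(is_byollm=True))    # 0
print(chain_reserve(2, is_byollm=False))  # 14
print(chain_reserve(2, is_byollm=True))   # 0
```

Neither function involves your extension's base_price, which is charged separately for each handler call.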
Detecting BYOLLM in your handler (optional)
If your extension does its own LLM calls inside a handler (e.g. content generation beyond what web-kernel-intent does), you can check the user's BYOLLM status via:

    @chat.function(action_type="read")
    async def my_handler(ctx, params):
        if ctx.user.is_byollm:
            # User has their own LLM key; ctx.llm.create_message will use it.
            # We still charge them base_price for this action.
            response = await ctx.llm.create_message(system="...", user="...")
        else:
            # Platform-provided LLM: same call, web-kernel pays the provider.
            response = await ctx.llm.create_message(system="...", user="...")
        return ActionResult.success(...)

Either way, the wallet deduct happens automatically based on your registered pricing. Your handler doesn't need to compute the price; web-kernel does that before dispatch. See BYOLLM-aware extension recipe for a full example.
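Because the LLM call is the same in both modes, a handler body can be exercised locally with stand-in objects. The sketch below is one way to do that under stated assumptions: `FakeLLM`, `make_ctx`, and `summarize` are hypothetical names, `params` is assumed to be a dict, and the real decorator and `ActionResult` are left out so the example does not depend on any Imperal import. Only `ctx.user.is_byollm` and `ctx.llm.create_message(system=..., user=...)` come from the snippet above.

```python
import asyncio
from types import SimpleNamespace


class FakeLLM:
    """Stand-in for ctx.llm: records calls instead of contacting a provider."""

    def __init__(self):
        self.calls = []

    async def create_message(self, system, user):
        self.calls.append((system, user))
        return f"summary of: {user}"


def make_ctx(is_byollm):
    """Build a minimal fake context exposing only the fields used below."""
    return SimpleNamespace(user=SimpleNamespace(is_byollm=is_byollm), llm=FakeLLM())


async def summarize(ctx, params):
    # The call is identical in both modes; only the reported label differs.
    response = await ctx.llm.create_message(
        system="Summarize the text.", user=params["text"]
    )
    mode = "byollm" if ctx.user.is_byollm else "platform"
    return {"mode": mode, "summary": response}


result = asyncio.run(summarize(make_ctx(True), {"text": "hello"}))
print(result)  # {'mode': 'byollm', 'summary': 'summary of: hello'}
```

No price is computed anywhere in the handler, consistent with web-kernel handling the wallet before dispatch.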
Pricing recommendations
Given that BYOLLM users yield slightly less per call:
- Don't try to compensate with higher base_price. Users on BYOLLM are usually power users who chose Imperal specifically for the marketplace — they'll notice and uninstall if you over-charge. Trust the platform_fee mechanism.
- Set your prices based on the value your tool delivers, not the model tier. A summarize_text is worth ~5 tokens whether it ran on Haiku or Opus.
- Use per-function pricing when some tools are LLM-heavy and others aren't. A tool that internally makes 5 LLM calls deserves a higher base_price than a single-DB-query tool.
- Don't price by Imperal's platform_fee. The platform_fee is opaque to you: it changes with model rates, tier reassignment, and BYOLLM. Your base_price should be self-sufficient.
Federal invariant
The contract I-BYOLLM-PLATFORM-FEE-ZERO is enforced at every wallet deduct site (web-kernel single-action, chain reserve, Branch B conversational). Tests in imperal_kernel/tests/test_i_byollm_platform_fee_zero.py pin the shape; CI fails if any caller drops the flag.
Effectively, you can rely on this behaviour staying consistent — if it ever changes, it'll go through Imperal's federal review process and you'll be notified well in advance.