BYOLLM (Bring Your Own LLM)

BYOLLM lets users bring their own LLM provider and call it from your extension via ctx.ai, with credentials handled by the platform and never seen by your code.

With BYOLLM, a user provides their own LLM credentials, and any LLM calls made on their behalf run through their provider instead of the platform's. This is useful when:

  • The user has a private model deployment (Anthropic enterprise, local Ollama, OpenAI org).
  • Compliance requires that data not flow through the platform's default LLMs.
  • The user wants to control their own LLM costs.

It's transparent to your extension

You don't write any special code for BYOLLM. Your extension just calls ctx.ai.complete(...) as normal. If the user has configured their own LLM in the Panel, the platform automatically routes the call through it; otherwise it uses the platform LLM. Either way:

  • You never see the user's API key. Credentials are stored and used by the platform — they never reach your extension.
  • BYOLLM calls cost the user no platform fee. When the user is on their own LLM, the platform-fee component drops to zero automatically (see BYOLLM Pricing); your base_price is unchanged.

There is no separate LLM object, no provider/credential API, and no flag your extension reads — your extension only ever calls ctx.ai.complete(...), and whether a user is on BYOLLM is resolved by the platform, not by you.
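To make the routing concrete, here is a minimal, self-contained sketch that simulates the behavior with stand-ins. Only the ctx.ai.complete(...) call shape and the .text attribute come from this page; Completion, FakePlatformAI, FakeContext, draft_title, and the byollm_users set are hypothetical illustrations, not SDK or platform APIs. The point is that the handler body is identical for both users.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Completion:
    """Hypothetical stand-in for the object ctx.ai.complete returns."""
    text: str


class FakePlatformAI:
    """Hypothetical stand-in for ctx.ai; routing lives here, not in the extension."""

    def __init__(self, user_id: str, byollm_users: set):
        self._user_id = user_id
        self._byollm_users = byollm_users

    async def complete(self, prompt: str) -> Completion:
        # The platform side decides which provider serves the call.
        provider = "user-llm" if self._user_id in self._byollm_users else "platform-llm"
        return Completion(text=f"[{provider}] {prompt[:20]}")


@dataclass
class FakeContext:
    ai: FakePlatformAI


async def draft_title(ctx, topic: str) -> str:
    # Extension code: no provider check, no credentials, no BYOLLM flag.
    result = await ctx.ai.complete(f"Title for: {topic}")
    return result.text


async def main() -> list:
    byollm_users = {"alice"}  # hypothetical: alice configured her own LLM
    outputs = []
    for user in ("alice", "bob"):
        ctx = FakeContext(ai=FakePlatformAI(user, byollm_users))
        outputs.append(await draft_title(ctx, "quarterly report"))
    return outputs


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

In this simulation, draft_title never learns which provider answered; only the stand-in for the platform does.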

When your extension calls the LLM directly

Most extensions never call an LLM themselves — the platform's classifier and narrator handle the conversational layer around your @chat.function handlers. You only reach for ctx.ai.complete(...) when your extension does its own generation, for example:

📝

Long-form content generation

The user asks your extension itself to draft something.

🔍

Summarization

Condensing documents or results your extension fetched.

🤖

Internal reasoning steps

When your handler needs a model to interpret or transform data (with appropriate guardrails).
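For the internal-reasoning case, one possible guardrail is to constrain the model to a closed set of answers and validate the reply before acting on it. The sketch below assumes only that ctx.ai.complete(prompt) is awaitable and returns an object with a .text attribute, as in the pattern on this page; classify_ticket, ALLOWED_LABELS, and stub_ctx are hypothetical names introduced for illustration.

```python
import asyncio
from types import SimpleNamespace

ALLOWED_LABELS = ("billing", "bug", "other")  # hypothetical label set


async def classify_ticket(ctx, ticket_text: str) -> str:
    """Ask the model for one label; fall back to 'other' if the reply is off-list."""
    prompt = (
        "Classify the ticket as exactly one of: "
        + ", ".join(ALLOWED_LABELS)
        + ". Reply with the label only.\n\n"
        + ticket_text
    )
    result = await ctx.ai.complete(prompt)
    # Normalize whitespace, case, and a trailing period before validating.
    label = result.text.strip().lower().rstrip(".")
    # Guardrail: never pass free-form model output downstream.
    return label if label in ALLOWED_LABELS else "other"


def stub_ctx(reply: str):
    """Hypothetical test double whose ai.complete always answers with `reply`."""
    async def complete(prompt: str):
        return SimpleNamespace(text=reply)
    return SimpleNamespace(ai=SimpleNamespace(complete=complete))


if __name__ == "__main__":
    print(asyncio.run(classify_ticket(stub_ctx(" Billing. "), "I was charged twice")))
    print(asyncio.run(classify_ticket(stub_ctx("Sure! It is a bug"), "App crashes")))
```

Because a BYOLLM user's model may differ from the platform default, validating the reply rather than trusting its format keeps the handler's behavior stable across providers.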

The pattern

from imperal_sdk import ChatExtension, ActionResult, sdl
from pydantic import BaseModel, Field

class SummarizeParams(BaseModel):
    text: str = Field(description="The content to summarize.")
    max_words: int = Field(80, description="Target summary length.")

# Read tools declare a data_model (V23). The generated summary is a typed
# sdl.Entity; sdl.Excerptable adds the summary/word_count facet fields.
class Summary(sdl.Entity, sdl.Excerptable):
    pass

# `chat` is the extension's ChatExtension instance, created elsewhere in this module.
@chat.function(
    "summarize",
    description="Summarize a piece of text using the user's configured LLM.",
    action_type="read",
    data_model=Summary,
)
async def summarize(ctx, params: SummarizeParams) -> ActionResult:
    # If the user has configured BYOLLM, this call runs through their own
    # LLM automatically — no special handling needed.
    result = await ctx.ai.complete(
        f"Summarize the following in {params.max_words} words or fewer. "
        f"Output plain text only.\n\n{params.text}"
    )
    return ActionResult.success(
        Summary(id="summary", title="Summary", summary=result.text),
        summary=result.text,
    )

What ctx.ai.complete returns

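The returned object carries the generated text in its text attribute (read as result.text in the pattern above).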
Pass model="..." to request a specific model; omit it to use the default.
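A small sketch of both call forms follows. The model name "example-model" is a placeholder, not a real identifier, and RecordingAI is a hypothetical stub that only records what it was asked; its signature is an assumption based on the sentence above. Which model names are accepted depends on the platform and on the user's provider.

```python
import asyncio
from types import SimpleNamespace
from typing import Optional


class RecordingAI:
    """Hypothetical stub for ctx.ai that records the requested model."""

    def __init__(self):
        self.requested_models = []

    async def complete(self, prompt: str, model: Optional[str] = None):
        self.requested_models.append(model)
        return SimpleNamespace(text="ok")


async def run_both(ctx) -> None:
    # Omit model to use the default.
    await ctx.ai.complete("Summarize: ...")
    # Pass model="..." to request a specific one (placeholder name).
    await ctx.ai.complete("Summarize: ...", model="example-model")


ai = RecordingAI()
asyncio.run(run_both(SimpleNamespace(ai=ai)))
print(ai.requested_models)
```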

Guarantees

🔐

Credentials never reach your code

The platform makes the call on the user's behalf; your extension never sees the user's API key.

📊

BYOLLM users pay no platform fee

When the user runs on their own LLM, the platform-fee component is zero. Your base price is unchanged.

⚖️

Tenant isolation

One user's LLM configuration is never used to make calls on behalf of another.
