Recipe — BYOLLM-aware extension
An extension that makes its own LLM calls — using the user's BYOLLM provider when set
This recipe shows an extension that calls an LLM itself, separately from the web-kernel's intent classifier, to generate content inside the handler. It uses federal-clean BYOLLM with a fallback to the platform LLM.
Use case
A "rewrite my message" tool — the user pastes text, picks a tone, the extension uses the user's BYOLLM (or platform LLM) to rewrite.
from typing import Literal

from pydantic import BaseModel, Field


class RewriteParams(BaseModel):
    text: str = Field(description="The text to rewrite.")
    tone: Literal["formal", "casual", "concise", "friendly", "apologetic"] = Field(
        description="Target tone.",
    )
    keep_meaning: bool = Field(
        True,
        description="Preserve the original meaning. Set false if the user wants the AI to also restructure ideas.",
    )

from imperal_sdk import ChatExtension, ActionResult
from .schemas import RewriteParams

TONE_PROMPTS = {
    "formal": "Rewrite the following text in a formal, professional tone.",
    "casual": "Rewrite the following text in a casual, conversational tone.",
    "concise": "Rewrite the following text to be as concise as possible while keeping every important detail.",
    "friendly": "Rewrite the following text to feel warm and friendly.",
    "apologetic": "Rewrite the following text to be apologetic and humble.",
}


# `chat` is assumed to be this extension's ChatExtension instance, created elsewhere.
@chat.function(
    description="Rewrite a piece of text in a specified tone using the user's configured LLM.",
    action_type="read",
    effects=["llm.text-generation"],
)
async def rewrite(ctx, params: RewriteParams):
    # Start from the tone instruction, then optionally pin the factual content.
    system = TONE_PROMPTS[params.tone]
    if params.keep_meaning:
        system += " Preserve all factual content. Don't add or remove information."

    response = await ctx.llm.create_message(
        system=system,
        messages=[{"role": "user", "content": params.text}],
        max_tokens=1500,
        purpose="content_rewrite",  # cascades admin per-purpose settings
    )
    # Report which LLM served the call alongside the rewritten text.
    return {
        "text": response.text,
        "provider": ctx.llm.provider,
        "model": ctx.llm.model,
        "is_byollm": ctx.llm.is_byollm,
    }

How ctx.llm resolves
ctx.llm is built per-call by the web-kernel:
├─ If user has BYOLLM set: use their provider/model/api_key/base_url
├─ If admin per-purpose settings exist for "content_rewrite": apply them
└─ Otherwise: fall back to platform-default LLM (Sonnet 4.6)

Your code is provider-agnostic. The same ctx.llm.create_message(...) call works for Anthropic, OpenAI, or local Ollama on the user's side.
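The resolution order above can be sketched in plain Python. This is only an illustration of the precedence as the list reads top to bottom, assuming the first matching source wins; it is not the web-kernel's actual code. `LLMConfig`, `resolve_llm`, the dictionary shapes, and the placeholder provider and model strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical container for a resolved LLM target."""
    provider: str
    model: str
    is_byollm: bool


# Placeholder for the platform default; the real values live in the platform.
PLATFORM_DEFAULT = LLMConfig(provider="platform", model="platform-default", is_byollm=False)


def resolve_llm(byollm: Optional[dict], admin_settings: dict, purpose: str) -> LLMConfig:
    """Pick the first matching source, in the order the list above gives."""
    if byollm is not None:
        # The user's own provider is used when it is configured.
        return LLMConfig(byollm["provider"], byollm["model"], is_byollm=True)
    if purpose in admin_settings:
        # Admin per-purpose settings apply next (assumed shape: purpose -> dict).
        settings = admin_settings[purpose]
        return LLMConfig(settings["provider"], settings["model"], is_byollm=False)
    # Nothing configured: fall back to the platform default.
    return PLATFORM_DEFAULT
```

The handler never needs this logic itself; the sketch only makes explicit why the same create_message call can land on different providers.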
What you NEVER do
Don't import the anthropic / openai SDK
V7 forbids direct provider SDK imports. ctx.llm is the federal-clean surface.
Don't read user.attributes for credentials
API keys never land in attributes. Decryption happens at the [auth gateway](/en/reference/glossary/).
Don't loop unbounded
Bound your own LLM loops. The platform doesn't enforce a limit on in-handler loops, but cost reviews catch unbounded calls.
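One way to bound an in-handler loop is to clamp the requested count to a hard cap before the first call. A minimal sketch, assuming ctx.llm.create_message behaves as in the handler above; `MAX_LLM_CALLS`, `generate_bounded`, and the `FakeLLM` stub are hypothetical names used only so the example runs on its own.

```python
import asyncio
from types import SimpleNamespace

MAX_LLM_CALLS = 5  # hypothetical hard cap; pick a value that fits your cost budget


class FakeLLM:
    """Stand-in for ctx.llm so the sketch runs without the platform."""

    def __init__(self):
        self.calls = 0

    async def create_message(self, system, messages, **kwargs):
        self.calls += 1
        return SimpleNamespace(text=f"draft {self.calls}")


async def generate_bounded(llm, text, requested):
    # Clamp the caller-supplied count into [1, MAX_LLM_CALLS] before looping.
    n = max(1, min(requested, MAX_LLM_CALLS))
    results = []
    for i in range(n):
        r = await llm.create_message(
            system=f"Rewrite the text, variation {i + 1}.",
            messages=[{"role": "user", "content": text}],
        )
        results.append(r.text)
    return results


llm = FakeLLM()
# Asking for 50 variations still results in at most MAX_LLM_CALLS calls.
drafts = asyncio.run(generate_bounded(llm, "hello", requested=50))
```

Clamping up front keeps the worst-case cost of a single handler invocation predictable, regardless of what the classifier passes in.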
Try it in chat
"rewrite this casually: 'Per our previous conversation, attached please find the document for your perusal.'"
The classifier picks rewrite(text=..., tone="casual"). Your handler calls ctx.llm.create_message and gets back something like "Hey, here's the doc you wanted, take a look when you get a chance."
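Before trying the tool in chat, the handler logic can be exercised locally with a stubbed ctx. The sketch below is an assumption-heavy stand-in, not an SDK test utility: `StubLLM` and `rewrite_core` are hypothetical, and `rewrite_core` simply mirrors the handler body without the decorator so it runs without the platform.

```python
import asyncio
from types import SimpleNamespace

TONE_PROMPTS = {"casual": "Rewrite the following text in a casual, conversational tone."}


class StubLLM:
    """Hypothetical stand-in for ctx.llm that records the last call."""
    provider = "stub"
    model = "stub-model"
    is_byollm = False

    def __init__(self):
        self.last_call = None

    async def create_message(self, **kwargs):
        self.last_call = kwargs
        return SimpleNamespace(text="Hey, here's the doc you wanted.")


async def rewrite_core(ctx, text, tone, keep_meaning=True):
    # Same steps as the handler: build the system prompt, call the LLM, report the source.
    system = TONE_PROMPTS[tone]
    if keep_meaning:
        system += " Preserve all factual content. Don't add or remove information."
    response = await ctx.llm.create_message(
        system=system,
        messages=[{"role": "user", "content": text}],
        max_tokens=1500,
        purpose="content_rewrite",
    )
    return {
        "text": response.text,
        "provider": ctx.llm.provider,
        "model": ctx.llm.model,
        "is_byollm": ctx.llm.is_byollm,
    }


ctx = SimpleNamespace(llm=StubLLM())
result = asyncio.run(rewrite_core(ctx, "Attached please find the document.", "casual"))
```

Inspecting `ctx.llm.last_call` afterwards shows exactly which system prompt and purpose the handler would send.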
Cost considerations
# Show the user which LLM served the call and whether it was BYOLLM
return {
    "text": rewritten_text,
    "provider": ctx.llm.provider,
    "model": ctx.llm.model,
    "is_byollm": ctx.llm.is_byollm,
}

If is_byollm=True, the user isn't billed for the call (Policy A: BYOLLM is excluded from billing). The Imperal Panel renders a small BYOLLM badge so the user sees it.
Variations
class RewriteParams(BaseModel):
    text: str = Field(description="Text to rewrite.")
    tone: Literal["formal", "casual"] = Field(description="Tone.")
    # ge/le keep the loop below bounded (see "Don't loop unbounded"); the cap of 5 is illustrative.
    n_variations: int = Field(3, ge=1, le=5, description="Number of rewrites to generate.")


async def rewrite_multiple(ctx, params: RewriteParams):
    results = []
    # One LLM call per requested variation.
    for i in range(params.n_variations):
        r = await ctx.llm.create_message(
            system=f"Rewrite in {params.tone} tone, variation {i+1}.",
            messages=[{"role": "user", "content": params.text}],
        )
        results.append(r.text)
    return {"variations": results}


async def extract_action_items(ctx, params):
    # Ask for structured output instead of free text.
    response = await ctx.llm.create_message(
        system="Extract action items as JSON: {items: [{owner, deadline, task}]}",
        messages=[{"role": "user", "content": params.text}],
        response_format="json",
    )
    return {"action_items": response.json}

For UI-driven streaming (typewriter effect), use ctx.llm.stream_message and yield chunks back via the chat surface. This is rarer for chat.functions; typically panels handle stream UX.
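The signature of ctx.llm.stream_message is not shown here, so the following is only a sketch of the consuming side. It assumes the call returns an async iterator of text chunks; `fake_stream_message`, `stream_rewrite`, and `collect` are hypothetical stand-ins, and the real method may differ.

```python
import asyncio
from typing import AsyncIterator


async def fake_stream_message(system: str, messages: list) -> AsyncIterator[str]:
    """Hypothetical stand-in for ctx.llm.stream_message: yields text chunks."""
    for chunk in ["Hey, ", "here's ", "the doc."]:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield chunk


async def stream_rewrite(text: str) -> AsyncIterator[str]:
    # Re-yield each chunk as it arrives so the caller can render a typewriter effect.
    async for chunk in fake_stream_message(
        system="Rewrite the following text in a casual, conversational tone.",
        messages=[{"role": "user", "content": text}],
    ):
        yield chunk


async def collect(text: str) -> list:
    # A UI would render chunks one by one; here they are gathered for inspection.
    return [chunk async for chunk in stream_rewrite(text)]


chunks = asyncio.run(collect("Attached please find the document."))
```

Joining the chunks gives the full rewritten text, which is what a non-streaming create_message call would have returned in one piece.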