Long-running AI calls — background task pattern
Run 60-90 second AI generation without blocking chat — three SDK forms
When your extension needs to call an external AI service (OpenAI, Anthropic, retrieval-augmented pipelines) that takes longer than the 30-second ctx.http default, you have three options on Imperal Cloud — from simplest to most flexible.
TL;DR — pick by duration
| Op duration | Easiest form | Chat experience |
|---|---|---|
| ≤ 30s | ctx.http.post(url) (default) | Chat blocks until result. |
| 30–180s, single HTTP call | ctx.http.post(url, timeout=120) per-call override | Chat blocks; fine for predictable single-call latency. |
| Any duration up to 30 min, want chat unblocked | @chat.function(background=True) sugar (v4.2.13+) | Chat unblocks instantly with ack; auto-delivers result as fresh bot turn. |
| Same, want custom ack summary or conditional dispatch | ctx.background_task(coro) explicit (v4.2.12+) | Same as above; you control the immediate ack. |
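The table can be read as a small decision function. The sketch below is only an illustration: `pick_form`, its parameters, and its return labels are hypothetical and not part of the SDK. Only the thresholds (30s default, 180s per-call cap, 30-minute background cap) come from the table.

```python
def pick_form(expected_seconds: float,
              unblock_chat: bool = False,
              custom_ack: bool = False) -> str:
    """Return the form the table suggests (hypothetical helper, not SDK API)."""
    if expected_seconds > 30 * 60:
        # The table lists 30 minutes as the upper bound for background work.
        raise ValueError("longer than the 30-minute background cap")
    if unblock_chat or expected_seconds > 180:
        # Background forms: explicit when a custom ack is needed, sugar otherwise.
        if custom_ack:
            return "ctx.background_task"
        return "@chat.function(background=True)"
    if expected_seconds > 30:
        # Blocking call that needs a per-call timeout override.
        return "ctx.http.post(timeout=N)"
    return "ctx.http.post (default)"


if __name__ == "__main__":
    print(pick_form(90))
    print(pick_form(90, unblock_chat=True))
```

A call longer than 180s falls through to a background form even when blocking would be acceptable, because the per-call timeout cannot exceed the cap.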
Pattern A — ctx.http(..., timeout=N) (≤ 180s)
Use timeout=N on the specific HTTP call. Chat stays in "thinking…" for the duration. Cleanest when one external call is the only slow step.
from imperal_sdk import ActionResult
from pydantic import BaseModel, Field

# Assumes `chat` is imported from your extension's app module,
# typically: from .app import chat (or from app import chat)

class RefineParams(BaseModel):
    input: str = Field(description="Text to refine")
    max_length: int = Field(default=1000)

@chat.function(
    description="Refine the given text via AI completion.",
    action_type="write",
    event="text_refined",
)
async def refine(ctx, params: RefineParams) -> ActionResult:
    api_key = await ctx.secrets.get("openai_api_key")
    if not api_key:
        return ActionResult.error("OpenAI key not connected", retryable=False)

    # Per-call timeout; the federal cap is 180s.
    resp = await ctx.http.post(
        "https://api.openai.com/v1/chat/completions",
        json={
            "model": "gpt-4o",
            "messages": [
                {"role": "system", "content": "You refine prose. Be concise."},
                {"role": "user", "content": params.input},
            ],
            "max_tokens": params.max_length,
        },
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=120,
    )
    if resp.status_code != 200:
        return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)

    text = resp.body["choices"][0]["message"]["content"]
    return ActionResult.success(
        summary="Refined text ready!",
        data={"text": text},
    )

The user sees "Refined text ready!" as the chat response once the coroutine completes.
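Pattern A indexes straight into `resp.body`, so a 200 response with an unexpected body would raise inside the handler. A minimal sketch of a defensive extractor follows. It assumes `resp.body` is the parsed JSON of a chat completions response, as the handler above implies; `extract_completion_text` is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Optional


def extract_completion_text(body: Any) -> Optional[str]:
    """Return choices[0].message.content, or None if the body has another shape."""
    try:
        content = body["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError):
        # Missing key, empty choices list, or a body that is not a dict at all.
        return None
    # Treat non-string, empty, or whitespace-only content as "no text".
    if isinstance(content, str) and content.strip():
        return content
    return None


if __name__ == "__main__":
    sample = {"choices": [{"message": {"content": "Tighter prose."}}]}
    print(extract_completion_text(sample))
```

In the handler, a `None` result could be mapped to `ActionResult.error(..., retryable=True)` in the same way as the non-200 branch.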
Pattern B — @chat.function(background=True) decorator sugar (v4.2.13+, recommended)
The single-flag form. Add background=True to the decorator and the SDK auto-wraps your handler in ctx.background_task() under the hood. No inner _work() coroutine. No manual task_id plumbing. Use this whenever the entire handler body is the long work and you're happy with the platform's auto-generated acknowledgement.
from imperal_sdk import ActionResult
from pydantic import BaseModel

class StartRefinementParams(BaseModel):
    input: str

@chat.function(
    description="Refine the given text via AI completion (long-running).",
    action_type="write",
    event="text_refined",
    background=True,     # ← auto-wrap in ctx.background_task
    long_running=False,  # False → 180s cap; True → 1800s cap (federal)
)
async def refine_output(ctx, params: StartRefinementParams) -> ActionResult:
    """The body runs DETACHED. Progress emissions and the final
    ActionResult are auto-delivered to chat by the platform."""
    api_key = await ctx.secrets.get("openai_api_key")
    if not api_key:
        return ActionResult.error("OpenAI key not connected", retryable=False)

    await ctx.progress(15, "Fetching context")
    # ... optional retrieval call ...

    await ctx.progress(45, "Generating with AI")
    resp = await ctx.http.post(
        "https://api.openai.com/v1/chat/completions",
        json={"model": "gpt-4o", "messages": [...], "max_tokens": 4000},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=120,
    )
    # Same status check as Pattern A: fail with an ActionResult, not an exception.
    if resp.status_code != 200:
        return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)

    await ctx.progress(90, "Saving")
    text = resp.body["choices"][0]["message"]["content"]
    await ctx.store.set("last_refined", text)

    # This ActionResult is what the user sees when the background work finishes.
    return ActionResult.success(
        summary="Refined output ready! 🎉",
        data={"text": text},
    )

User chat experience:
- User: "improve this text"
- Bot (instant, auto-generated): "Started 'refine_output' in background — the result will be sent to chat when it finishes."
- ...~90 seconds later, the platform injects a fresh bot turn from your handler's return value...
- Bot: "Refined output ready! 🎉" (data carries the refined text)
The user can keep chatting about other things between the acknowledgement and the final result.
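The "acknowledge now, deliver later" flow in the chat experience above can be modelled with plain `asyncio`. This is a toy model only, not the SDK's implementation: the `background` wrapper, the `chat` list, and the `demo` function are hypothetical stand-ins for the decorator, the chat transcript, and the platform.

```python
import asyncio
from typing import Awaitable, Callable, List


def background(handler: Callable[[], Awaitable[str]], chat: List[str]):
    """Toy stand-in for the sugar: schedule `handler`, acknowledge at once."""
    async def dispatch() -> "asyncio.Task[None]":
        async def run_and_deliver() -> None:
            summary = await handler()
            chat.append(f"bot: {summary}")  # fresh bot turn with the result

        task = asyncio.create_task(run_and_deliver())
        chat.append(f"bot: Started '{handler.__name__}' in background")  # instant ack
        return task

    return dispatch


async def refine_output() -> str:
    await asyncio.sleep(0.05)  # stands in for the ~90 second AI call
    return "Refined output ready!"


async def demo() -> List[str]:
    chat: List[str] = ["user: improve this text"]
    task = await background(refine_output, chat)()
    chat.append("user: unrelated question")  # chat is not blocked meanwhile
    await task  # in the toy model we wait here so the result gets delivered
    return chat


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

The unrelated user message lands between the acknowledgement and the result, which is the ordering the platform aims for with real background tasks.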
Pattern C — ctx.background_task(coro) explicit (v4.2.12+)
Use the explicit form when you need any of:
- Custom acknowledgement summary in the immediate reply (Pattern B's auto-generated summary is fine for most cases but isn't configurable).
- Conditional dispatch — sometimes you want to run synchronously, sometimes spawn a background task. Make the decision at runtime.
- Mixed sync + background — first part of the handler runs immediately, second part detaches.
from imperal_sdk import ActionResult
from pydantic import BaseModel

class StartParams(BaseModel):
    input: str
    fast_path: bool = False

@chat.function(
    description="Refine text; chooses fast or background path at runtime.",
    action_type="write",
    event="refinement_started",
)
async def start_refinement(ctx, params: StartParams) -> ActionResult:
    api_key = await ctx.secrets.get("openai_api_key")
    if not api_key:
        return ActionResult.error("OpenAI key not connected", retryable=False)

    # Fast path: runs synchronously.
    if params.fast_path:
        resp = await ctx.http.post(
            "https://api.openai.com/v1/chat/completions",
            json={"model": "gpt-4o-mini", "messages": [...], "max_tokens": 500},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        if resp.status_code != 200:
            return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)
        return ActionResult.success(
            summary="Quick refine done.",
            data={"text": resp.body["choices"][0]["message"]["content"]},
        )

    # Slow path: spawn detached.
    async def _work():
        await ctx.progress(50, "Generating with AI")
        resp = await ctx.http.post(
            "https://api.openai.com/v1/chat/completions",
            json={"model": "gpt-4o", "messages": [...], "max_tokens": 4000},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=120,
        )
        # The coroutine must still return an ActionResult on failure.
        if resp.status_code != 200:
            return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)
        text = resp.body["choices"][0]["message"]["content"]
        await ctx.store.set("last_refined", text)
        return ActionResult.success(
            summary="Refined output ready! 🎉",
            data={"text": text},
        )

    task_id = await ctx.background_task(
        _work(),
        long_running=False,  # < 180s; True raises cap to 1800s
        name="AI refinement",
    )

    # YOUR custom acknowledgement: sent to chat immediately.
    return ActionResult.success(
        summary="Got it, refining (≈90s). I'll send the result here.",
        data={"task_id": task_id},
    )

Federal contract you sign by using any of these
Five LONGRUN invariants the platform enforces
- I-LONGRUN-HTTP-CAP-180S: per-call timeout= is capped at 180 seconds. Larger values raise ValueError.
- I-LONGRUN-BG-CORO-RETURNS-ACTIONRESULT: your handler (sugar or explicit coro) MUST return ActionResult. Returning anything else writes a critical audit row and delivers a fallback error to chat.
- I-LONGRUN-BG-USER-SCOPED: every background task is bound to (your_ext, user) at creation; cross-user access returns 403.
- I-LONGRUN-CHAT-INJECT-USER-SCOPED: the delivered result message lands in that user's chat, never another user's.
- I-LONGRUN-CHAT-INJECT-AUDIT-EVERY: every delivered message writes an audit row.
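The first two invariants can be mirrored locally, for example in unit tests, so violations surface before deployment. The sketch below is an assumption-laden model, not platform code: `FakeActionResult`, `check_timeout`, `coerce_result`, and the fallback wording are all hypothetical; only the 180-second cap and the "non-ActionResult becomes an error" behaviour come from the list above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

HTTP_TIMEOUT_CAP_S = 180  # from I-LONGRUN-HTTP-CAP-180S


@dataclass
class FakeActionResult:
    """Hypothetical stand-in for imperal_sdk.ActionResult."""
    ok: bool
    summary: str
    data: Dict[str, Any] = field(default_factory=dict)


def check_timeout(timeout_s: float) -> float:
    """Reject timeouts above the cap, as the platform is described to do."""
    if timeout_s > HTTP_TIMEOUT_CAP_S:
        raise ValueError(f"timeout {timeout_s}s exceeds the {HTTP_TIMEOUT_CAP_S}s cap")
    return timeout_s


def coerce_result(value: Any) -> FakeActionResult:
    """Model the fallback: anything that is not an ActionResult becomes an error."""
    if isinstance(value, FakeActionResult):
        return value
    # The summary text here is invented for illustration.
    return FakeActionResult(ok=False, summary="Background task returned an invalid result")


if __name__ == "__main__":
    print(check_timeout(120))
    print(coerce_result("just a string"))
```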
Progress emissions — keep them flowing
The web-kernel uses ctx.progress(...) calls as heartbeats. If your coroutine goes silent for too long the platform may reclaim the task. Emit at every coarse milestone (fetching, generating, saving):
await ctx.progress(15, "Fetching context")
# ... some work ...
await ctx.progress(45, "Generating with AI")
# ... more work ...
await ctx.progress(90, "Saving")
ctx.progress() also raises TaskCancelled if the user cancels. Let it propagate; the platform handles the cancellation delivery to chat.
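Letting the cancellation propagate does not rule out local cleanup: a `try`/`finally` releases resources without swallowing the exception. The sketch below uses test doubles to show this. The `TaskCancelled` class and `FakeCtx` here are hypothetical stand-ins defined only for the example; the real exception and context come from the SDK.

```python
import asyncio
from typing import List, Optional, Tuple


class TaskCancelled(Exception):
    """Hypothetical stand-in for the SDK's TaskCancelled exception."""


class FakeCtx:
    """Hypothetical test double: records progress, raises once 'cancelled'."""

    def __init__(self, cancel_at: Optional[int] = None) -> None:
        self.cancel_at = cancel_at
        self.events: List[Tuple[int, str]] = []

    async def progress(self, percent: int, message: str) -> None:
        if self.cancel_at is not None and percent >= self.cancel_at:
            raise TaskCancelled(message)
        self.events.append((percent, message))


async def work(ctx: FakeCtx, cleanup_log: List[str]) -> str:
    try:
        await ctx.progress(15, "Fetching context")
        await asyncio.sleep(0)  # stands in for the retrieval step
        await ctx.progress(45, "Generating with AI")
        await asyncio.sleep(0)  # stands in for the AI call
        await ctx.progress(90, "Saving")
        return "done"
    finally:
        # Runs on success and on cancellation; the exception keeps propagating.
        cleanup_log.append("released")


if __name__ == "__main__":
    log: List[str] = []
    print(asyncio.run(work(FakeCtx(), log)), log)
```

There is deliberately no `except TaskCancelled` in `work`: catching it and returning normally would hide the cancellation from the caller.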
When not to use a background task
- Reads under 1 second: the overhead isn't worth it. Just return ActionResult synchronously.
- Pure CPU work: the platform's heartbeat-based liveness check assumes the coroutine yields. Wrap CPU-bound chunks with await asyncio.sleep(0) between iterations.
- Streaming partial output to chat: ctx.background_task delivers a single final result. For per-token streaming, build the streaming directly into your handler (a different problem, out of scope for this recipe).
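The `await asyncio.sleep(0)` advice for CPU-bound work can be sketched as a chunked loop that hands control back to the event loop after each chunk. The function name, the workload (sum of squares), and the chunk size are illustrative assumptions; only `asyncio.sleep(0)` as the yield point comes from the text above.

```python
import asyncio
from typing import List


async def sum_squares_cooperatively(values: List[int], chunk_size: int = 1000) -> int:
    """Process `values` in chunks, yielding to the event loop between chunks."""
    total = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(v * v for v in chunk)  # the CPU-bound part
        await asyncio.sleep(0)  # yield so other coroutines and heartbeats can run
    return total


if __name__ == "__main__":
    print(asyncio.run(sum_squares_cooperatively(list(range(10)), chunk_size=3)))
```

Smaller chunks yield more often at the cost of more scheduling overhead, so the chunk size is worth tuning against how long a single chunk takes.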
Cross-links
Handle user API keys
Pattern for accepting per-user API keys / OAuth tokens / webhook signing secrets via @ext.secret and ctx.secrets
Build a complete extension
From zero to a working "Tasks Mini" extension — exercises every decorator, UI primitive, ctx surface, and federal pattern. Approximately 30 minutes.