
Long-running AI calls — background task pattern

Run 60-90 second AI generation without blocking chat — three SDK forms

When your extension needs to call an external AI service (OpenAI, Anthropic, retrieval-augmented pipelines) that takes longer than the 30-second ctx.http default, you have three options on Imperal Cloud — from simplest to most flexible.


TL;DR — pick by duration

Op duration | Easiest form | Chat experience
≤ 30s | ctx.http.post(url) (default) | Chat blocks until result.
30–180s, single HTTP call | ctx.http.post(url, timeout=120) per-call override | Chat blocks; fine for predictable single-call latency.
Any duration up to 30 min, want chat unblocked | @chat.function(background=True) sugar (v4.2.13+) | Chat unblocks instantly with ack; auto-delivers result as fresh bot turn.
Same, want custom ack summary or conditional dispatch | ctx.background_task(coro) explicit (v4.2.12+) | Same as above; you control the immediate ack.
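
The decision in the table can be written down as a small function. This is only a sketch: pick_pattern and its parameter names are hypothetical helpers for illustration, not part of the Imperal SDK, and the thresholds are the ones quoted in the table.

```python
def pick_pattern(expected_seconds: float, unblock_chat: bool = False,
                 custom_ack: bool = False) -> str:
    """Return the form suggested by the table (hypothetical helper, not an SDK call)."""
    if expected_seconds > 1800:
        raise ValueError("longer than the 30-minute background cap")
    if unblock_chat or custom_ack or expected_seconds > 180:
        # Background forms: use the explicit one when the ack must be customised.
        if custom_ack:
            return "ctx.background_task(coro)"
        return "@chat.function(background=True)"
    if expected_seconds <= 30:
        return "ctx.http.post(url)"
    return "ctx.http.post(url, timeout=N)"


print(pick_pattern(90))                     # blocking call with a per-call timeout
print(pick_pattern(90, unblock_chat=True))  # background sugar
```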

Pattern A — ctx.http.post(..., timeout=N) (≤ 180s)

Use timeout=N on the specific HTTP call. Chat stays in "thinking…" for the duration. Cleanest when one external call is the only slow step.

handlers_refine.py — Pattern A
from imperal_sdk import ActionResult
from pydantic import BaseModel, Field

# Assumes `chat` is imported from your extension's app module —
# typically: from .app import chat   (or from app import chat)

class RefineParams(BaseModel):
    input: str = Field(description="Text to refine")
    max_length: int = Field(default=1000)

@chat.function(
    description="Refine the given text via AI completion.",
    action_type="write",
    event="text_refined",
)
async def refine(ctx, params: RefineParams) -> ActionResult:
    api_key = await ctx.secrets.get("openai_api_key")
    if not api_key:
        return ActionResult.error("OpenAI key not connected", retryable=False)

    # Per-call timeout — federal cap 180s.
    resp = await ctx.http.post(
        "https://api.openai.com/v1/chat/completions",
        json={
            "model": "gpt-4o",
            "messages": [
                {"role": "system", "content": "You refine prose. Be concise."},
                {"role": "user", "content": params.input},
            ],
            "max_tokens": params.max_length,
        },
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=120,
    )

    if resp.status_code != 200:
        return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)

    text = resp.body["choices"][0]["message"]["content"]
    return ActionResult.success(
        summary="Refined text ready!",
        data={"text": text},
    )

The user sees Refined text ready! as the chat response once the coroutine completes.


Pattern B — @chat.function(background=True) sugar (v4.2.13+)

The single-flag form. Add background=True to the decorator and the SDK auto-wraps your handler in ctx.background_task() under the hood. No inner _work() coroutine. No manual task_id plumbing. Use this whenever the entire handler body is the long work and you're happy with the platform's auto-generated acknowledgement.

handlers_refine.py — Pattern B (sugar)
from imperal_sdk import ActionResult
from pydantic import BaseModel, Field

class StartRefinementParams(BaseModel):
    input: str

@chat.function(
    description="Refine the given text via AI completion (long-running).",
    action_type="write",
    event="text_refined",
    background=True,        # ← auto-wrap in ctx.background_task
    long_running=False,     # False → 180s cap; True → 1800s cap (federal)
)
async def refine_output(ctx, params: StartRefinementParams) -> ActionResult:
    """The body runs DETACHED. Progress emissions and the final
    ActionResult are auto-delivered to chat by the platform."""
    api_key = await ctx.secrets.get("openai_api_key")
    if not api_key:
        return ActionResult.error("OpenAI key not connected", retryable=False)

    await ctx.progress(15, "Fetching context")
    # ... optional retrieval call ...

    await ctx.progress(45, "Generating with AI")
    resp = await ctx.http.post(
        "https://api.openai.com/v1/chat/completions",
        json={"model": "gpt-4o", "messages": [...], "max_tokens": 4000},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=120,
    )

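    # Fail cleanly on a non-200 response instead of raising on a missing "choices" key.
    if resp.status_code != 200:
        return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)

    await ctx.progress(90, "Saving")
    text = resp.body["choices"][0]["message"]["content"]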
    await ctx.store.set("last_refined", text)

    # This ActionResult is what the user sees when the background work finishes.
    return ActionResult.success(
        summary="Refined output ready! 🎉",
        data={"text": text},
    )

User chat experience:

  1. User: "improve this text"
  2. Bot (instant, auto-generated): "Started 'refine_output' in background — the result will be sent to chat when it finishes."
  3. ...~90 seconds later, the platform injects a fresh bot turn from your handler's return value...
  4. Bot: "Refined output ready! 🎉" (data carries the refined text)

The user can keep typing about other things between turn 1 and turn 4.
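
The ordering of those turns can be reproduced locally with plain asyncio. The sketch below is not how the platform is implemented; slow_refine, chat_session and deliver_when_done are hypothetical names, and a short sleep stands in for the AI call. It only illustrates the shape: acknowledge first, let the conversation continue, deliver the result as a later turn.

```python
import asyncio


async def slow_refine(text: str) -> str:
    """Stand-in for the slow AI call (hypothetical; sleeps instead of calling an API)."""
    await asyncio.sleep(0.05)
    return text.strip().capitalize()


async def chat_session() -> list:
    transcript = []

    async def deliver_when_done(task: asyncio.Task) -> None:
        # Plays the platform's role: post the result as a fresh bot turn.
        transcript.append(f"bot: {await task}")

    task = asyncio.create_task(slow_refine("  improve this text "))
    delivery = asyncio.create_task(deliver_when_done(task))
    transcript.append("bot: started in background")  # immediate acknowledgement
    transcript.append("user: unrelated question")     # chat is not blocked
    await delivery                                     # result arrives afterwards
    return transcript


if __name__ == "__main__":
    for line in asyncio.run(chat_session()):
        print(line)
```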


Pattern C — ctx.background_task(coro) explicit (v4.2.12+)

Use the explicit form when you need any of:

  • Custom acknowledgement summary for the immediate bot reply (Pattern B's auto-generated summary is fine for most cases but isn't configurable).
  • Conditional dispatch — sometimes you want to run synchronously, sometimes spawn a background task. Make the decision at runtime.
  • Mixed sync + background — first part of the handler runs immediately, second part detaches.

handlers_refine.py — Pattern C (explicit)
from imperal_sdk import ActionResult
from pydantic import BaseModel

class StartParams(BaseModel):
    input: str
    fast_path: bool = False

@chat.function(
    description="Refine text; chooses fast or background path at runtime.",
    action_type="write",
    event="refinement_started",
)
async def start_refinement(ctx, params: StartParams) -> ActionResult:
    api_key = await ctx.secrets.get("openai_api_key")
    if not api_key:
        return ActionResult.error("OpenAI key not connected", retryable=False)

    # Fast path — runs synchronously.
    if params.fast_path:
        resp = await ctx.http.post(
            "https://api.openai.com/v1/chat/completions",
            json={"model": "gpt-4o-mini", "messages": [...], "max_tokens": 500},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
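        if resp.status_code != 200:
            return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)
        return ActionResult.success(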
            summary="Quick refine done.",
            data={"text": resp.body["choices"][0]["message"]["content"]},
        )

    # Slow path — spawn detached.
    async def _work():
        await ctx.progress(50, "Generating with AI")
        resp = await ctx.http.post(
            "https://api.openai.com/v1/chat/completions",
            json={"model": "gpt-4o", "messages": [...], "max_tokens": 4000},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=120,
        )
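        # Return an error result rather than raising on a missing "choices" key.
        if resp.status_code != 200:
            return ActionResult.error(f"OpenAI returned {resp.status_code}", retryable=True)
        text = resp.body["choices"][0]["message"]["content"]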
        await ctx.store.set("last_refined", text)
        return ActionResult.success(
            summary="Refined output ready! 🎉",
            data={"text": text},
        )

    task_id = await ctx.background_task(
        _work(),
        long_running=False,         # < 180s; True raises cap to 1800s
        name="AI refinement",
    )

    # YOUR custom acknowledgement — sent to chat immediately.
    return ActionResult.success(
        summary="Got it — refining (≈90s). I'll send the result here.",
        data={"task_id": task_id},
    )

Federal contract you sign by using any of these

Five LONGRUN invariants the platform enforces

  • I-LONGRUN-HTTP-CAP-180S — per-call timeout= capped at 180 seconds. Larger → ValueError.
  • I-LONGRUN-BG-CORO-RETURNS-ACTIONRESULT — your handler (sugar or explicit coro) MUST return ActionResult. Returning anything else writes a critical audit row and delivers a fallback error to chat.
  • I-LONGRUN-BG-USER-SCOPED — every background task is bound to (your_ext, user) at creation; cross-user access returns 403.
  • I-LONGRUN-CHAT-INJECT-USER-SCOPED — the delivered result message lands in that user's chat — never another user's.
  • I-LONGRUN-CHAT-INJECT-AUDIT-EVERY — every delivered message writes an audit row.
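
Two of these invariants can be checked in your own code before the platform enforces them. The sketch below is an assumption-heavy illustration: FakeActionResult is a local stand-in (not imperal_sdk.ActionResult), and checked_timeout and ensure_result are hypothetical helpers rather than SDK functions.

```python
from dataclasses import dataclass, field

HTTP_TIMEOUT_CAP_S = 180  # value quoted in I-LONGRUN-HTTP-CAP-180S


@dataclass
class FakeActionResult:
    """Hypothetical stand-in for the SDK's ActionResult, for local checks only."""
    ok: bool
    summary: str
    data: dict = field(default_factory=dict)


def checked_timeout(seconds: float) -> float:
    """Fail early, mirroring the documented ValueError for timeouts above the cap."""
    if not 0 < seconds <= HTTP_TIMEOUT_CAP_S:
        raise ValueError(
            f"timeout must be in (0, {HTTP_TIMEOUT_CAP_S}] seconds, got {seconds}"
        )
    return seconds


def ensure_result(value: object) -> FakeActionResult:
    """Turn a wrong return type into an explicit error result."""
    if isinstance(value, FakeActionResult):
        return value
    return FakeActionResult(ok=False, summary=f"handler returned {type(value).__name__}")
```

A unit test that passes every handler's return value through a check like ensure_result is one way to catch a stray string or dict before it reaches production.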

Progress emissions — keep them flowing

The web-kernel uses ctx.progress(...) calls as heartbeats. If your coroutine goes silent for too long the platform may reclaim the task. Emit at every coarse milestone (fetching, generating, saving):

await ctx.progress(15, "Fetching context")
# ... some work ...
await ctx.progress(45, "Generating with AI")
# ... more work ...
await ctx.progress(90, "Saving")

ctx.progress() also raises TaskCancelled if the user cancels — let it propagate; the platform handles the cancellation delivery to chat.
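
If the coroutine holds resources, a try/finally block releases them while still letting the cancellation propagate. The sketch below runs without the SDK: TaskCancelled here is a local stand-in class (the real import path is not shown in this recipe), and FakeCtx is a hypothetical test double that simulates a user cancel at a chosen percentage.

```python
import asyncio
from typing import List, Optional, Tuple


class TaskCancelled(Exception):
    """Local stand-in for the SDK exception of the same name."""


class FakeCtx:
    """Hypothetical test double: records progress and can simulate a cancel."""

    def __init__(self, cancel_at: Optional[int] = None) -> None:
        self.cancel_at = cancel_at
        self.events: List[Tuple[int, str]] = []

    async def progress(self, percent: int, message: str) -> None:
        if self.cancel_at is not None and percent >= self.cancel_at:
            raise TaskCancelled()
        self.events.append((percent, message))


async def work(ctx: FakeCtx, cleanup_log: List[str]) -> str:
    try:
        await ctx.progress(15, "Fetching context")
        await ctx.progress(45, "Generating with AI")
        await ctx.progress(90, "Saving")
        return "done"
    finally:
        # Release resources here, but do not catch TaskCancelled: it must propagate.
        cleanup_log.append("cleaned up")
```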


When not to use a background task

  • Reads under 1 second — overhead isn't worth it. Just return ActionResult synchronously.
  • Pure CPU work — the platform's heartbeat-based liveness check assumes the coroutine yields. If you must run CPU-bound work, split it into chunks and await asyncio.sleep(0) between iterations so the event loop gets a chance to run.
  • Streaming partial output to chat — ctx.background_task delivers a single final result. For per-token streaming, build the streaming directly into your handler (a different problem, out of scope for this recipe).
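
For the CPU-bound case, yielding between chunks might look like the sketch below. checksum_in_chunks and its chunk size are hypothetical choices made for illustration; the only technique shown is awaiting asyncio.sleep(0) after each chunk so other coroutines can be scheduled.

```python
import asyncio


async def checksum_in_chunks(data: bytes, chunk_size: int = 4096) -> int:
    """Sum bytes chunk by chunk, yielding to the event loop after each chunk."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += sum(data[start:start + chunk_size])
        await asyncio.sleep(0)  # hand control back so other coroutines can run
    return total % 2**32


if __name__ == "__main__":
    payload = bytes(range(256)) * 100
    print(asyncio.run(checksum_in_chunks(payload)))
```

Smaller chunks yield more often at the cost of more scheduling overhead, so the chunk size is a trade-off to tune for your workload.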
