Developer API

Spindrel exposes a comprehensive REST API for building integrations, dashboards, and custom clients. This guide covers authentication, endpoint discovery, the chat API, and streaming events.

Interactive Documentation

Spindrel auto-generates OpenAPI docs from its route definitions:

| URL | Description |
| --- | --- |
| /docs | Swagger UI: interactive endpoint explorer with "Try it out" |
| /redoc | ReDoc: readable reference documentation |
| /openapi.json | Raw OpenAPI 3.x schema (for code generation) |

All endpoints are visible, including admin routes. Authentication is still enforced — the docs just show what's available.

Authentication

Spindrel supports the following authentication methods:

Static API Key

Set API_KEY in your .env. Pass it as a Bearer token:

curl -H "Authorization: Bearer your-api-key" \
  http://localhost:8000/api/v1/admin/bots

The static key has full access to all endpoints.

Scoped API Keys

Scoped keys (prefixed ask_) grant access to specific endpoint groups. Create them via the admin UI (Settings > API Keys) or the API:

# Create a scoped key with chat + channel access
curl -X POST http://localhost:8000/api/v1/admin/api-keys \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-integration", "scopes": ["chat", "channels:read"]}'

The response includes the key (shown once) and its scopes.

Scope Reference

Spindrel defines 51 scopes across 22 groups:

| Group | Scopes | Description |
| --- | --- | --- |
| Admin | admin | Full access; bypasses all scope checks |
| Channels | channels:read, channels:write | Channel CRUD (broad; includes sub-scopes) |
| Channels (granular) | channels.messages:read/write, channels.config:read/write, channels.heartbeat:read/write, channels.integrations:read/write | Fine-grained channel sub-resources |
| Chat | chat | Send messages (blocking + streaming), cancel, submit tool results |
| Sessions | sessions:read, sessions:write | Session details and message history |
| Bots | bots:read, bots:write, bots:delete | Bot configuration management |
| Tasks | tasks:read, tasks:write | Scheduled/deferred task management |
| Workspaces | workspaces:read/write, workspaces.files:read/write | Workspace management and file operations |
| Documents | documents:read, documents:write | Ingested document search and management |
| Knowledge | knowledge:read, knowledge:write | Bot knowledge entries (deprecated; prefer workspace files) |
| Todos | todos:read, todos:write | Persistent work items |
| Attachments | attachments:read, attachments:write | File attachment management |
| Logs | logs:read, logs:write | Agent turns, tool calls, traces, server logs |
| Tools | tools:read, tools:execute | Tool listing and direct execution |
| Providers | providers:read, providers:write | LLM provider configuration |
| Users | users:read, users:write | User management |
| Settings | settings:read, settings:write | Server-wide settings |
| Operations | operations:read, operations:write | Backups, git pull, restart |
| Usage | usage:read | Cost analytics and usage limits |
| Capabilities | carapaces:read, carapaces:write | Skill+tool bundle management |
| Workflows | workflows:read, workflows:write | Workflow definitions and run management |
| LLM | llm:completions | Direct LLM calls through the server's provider system |
| Mission Control | mission_control:read, mission_control:write | Dashboard data (kanban, journal, etc.) |

Hierarchy rules: channels:write implies all channels.*:write sub-scopes. admin implies everything.
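
The hierarchy rules above can be sketched as a small client-side check, for example to decide whether a key is worth trying against an endpoint. This is an illustrative sketch, not the server's implementation: the function name `scope_allows` is hypothetical, and it assumes the parent rule stated for channels:write also applies to :read and to other dotted groups such as workspaces.files. It does not assume that write implies read.

```python
def scope_allows(granted: set[str], required: str) -> bool:
    """Return True if any granted scope satisfies the required scope."""
    # admin implies everything; an exact match always passes.
    if "admin" in granted or required in granted:
        return True
    # Split "channels.messages:write" into resource and action.
    resource, _, action = required.partition(":")
    parent = resource.split(".", 1)[0]
    # A dotted sub-scope is covered by its parent scope with the same action,
    # e.g. channels:write covers channels.messages:write.
    # Assumption: the same rule holds for :read and for other dotted groups.
    return "." in resource and f"{parent}:{action}" in granted
```

For instance, `scope_allows({"channels:write"}, "channels.config:write")` passes, while the same key is rejected for `channels.config:read` because only the write action was granted.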

Presets

The admin UI offers one-click presets for common use cases:

| Preset | Use Case | Key Scopes |
| --- | --- | --- |
| Messaging Integration | Slack, Discord, etc. | chat, bots:read, channels:read/write, channels.config:read/write, sessions:read/write, todos:read, llm:completions |
| Chat Client | Custom chat frontends | chat, bots:read, channels:read/write, sessions:read, attachments:read/write |
| Container Bot | Bots in their container environment | chat, bots:read, channels:read/write, tasks:read/write, documents:read/write, todos:read/write, workspaces.files:read/write, attachments:read/write, carapaces:read/write, tools:read/execute |
| Read-Only Monitor | Dashboards | bots:read, channels:read, sessions:read, tasks:read, todos:read, attachments:read, logs:read |
| Mission Control | MC dashboard | bots:read, channels:read, sessions:read, tasks:read/write, todos:read/write, workspaces:read, workspaces.files:read/write, attachments:read, logs:read, mission_control:read/write, carapaces:read |

JWT (User Authentication)

The UI uses JWT tokens via Google OAuth. For API access, scoped keys are preferred.

Widget Tokens (Short-Lived, Bot-Scoped)

Interactive HTML widgets (authored by bots via emit_html_widget) render in sandboxed iframes and need to call /api/v1/... endpoints without borrowing the viewing user's session. Spindrel mints short-lived (15 min) bot-scoped JWTs for this case via POST /api/v1/widget-auth/mint — payload {source_bot_id, pin_id?}, response {token, expires_at, expires_in, bot_id, bot_name, scopes}. The renderer re-mints every 12 min and pushes the new token into the live iframe so the widget never 401s mid-session.

Under the hood these are regular JWTs with kind: "widget" in the payload; the auth dependency has a dedicated branch that returns an ApiKeyAuth with the scopes inlined from the token (no per-request DB lookup). Scopes are copied from the bot's configured API key at mint time, so the widget can only do what the bot could do — not what the viewing user could do. See the HTML Widgets guide for the user-facing version.
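
When debugging a widget, it can help to confirm that a token really is a widget token. The sketch below decodes the payload segment of a JWT and checks the kind claim described above. It does not verify the signature, so it is only suitable for inspection, never for authorization decisions. The helper names are hypothetical, and no payload claim other than kind is assumed.

```python
import base64
import json


def peek_jwt_payload(token: str) -> dict:
    """Decode a JWT's payload segment WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_widget_token(token: str) -> bool:
    """True if the (unverified) payload carries kind == "widget"."""
    return peek_jwt_payload(token).get("kind") == "widget"
```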

You typically don't need to call /widget-auth/mint yourself, because the widget renderer does it automatically. It's documented here for completeness.

Endpoint Discovery

The /api/v1/discover endpoint returns all accessible endpoints filtered by your key's scopes:

# List endpoints accessible with your key
curl -H "Authorization: Bearer $API_KEY" \
  http://localhost:8000/api/v1/discover

# Get full markdown API reference
curl -H "Authorization: Bearer $API_KEY" \
  "http://localhost:8000/api/v1/discover?detail=true"

The basic response includes method, path, description, and required scope for each endpoint.

Chat API

Blocking Request

curl -X POST http://localhost:8000/chat \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the weather like?",
    "bot_id": "default",
    "channel_id": "your-channel-uuid"
  }'

Response:

{
  "session_id": "uuid",
  "response": "The weather is...",
  "transcript": "",
  "client_actions": []
}

Streaming Request

curl -X POST http://localhost:8000/chat/stream \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Write a haiku about coding",
    "bot_id": "default",
    "channel_id": "your-channel-uuid"
  }' --no-buffer

Returns Server-Sent Events (SSE). Each event is a JSON object on a data: line:

data: {"type": "tool_start", "name": "web_search", "args": {...}}

data: {"type": "tool_result", "name": "web_search", "result": "..."}

data: {"type": "assistant_text", "text": "Here's what I found..."}

data: {"type": "response", "text": "Full response text", "tools_used": ["web_search"]}

Request Fields

| Field | Type | Description |
| --- | --- | --- |
| message | string | User message text |
| channel_id | uuid | Target channel (preferred) |
| bot_id | string | Bot ID (default: "default") |
| client_id | string | Client identifier (default: "default") |
| audio_data | string | Base64-encoded audio (for voice input) |
| audio_format | string | Audio format: m4a, wav, webm |
| attachments | array | Vision attachments (images) |
| dispatch_type | string | "slack", "webhook", "internal", "none" |
| dispatch_config | object | Routing config for the dispatch type |
| model_override | string | Per-turn model override |
| passive | bool | Store message without running agent |

SSE Event Types

Events emitted during streaming:

| Event Type | Description | Key Fields |
| --- | --- | --- |
| assistant_text | Incremental text from LLM | text |
| tool_start | Tool call beginning | name, args |
| tool_result | Tool call completed | name, result, duration_ms |
| response | Final response (always last) | text, tools_used, client_actions |
| thinking_content | Extended thinking (Claude models) | text |
| error | Processing error | message |
| cancelled | User cancelled the run | |
| queued | Message queued (session locked or system paused) | session_id, task_id, reason |
| passive_stored | Passive message stored | session_id |
| secret_warning | Secret-like patterns detected in input | patterns |
| rate_limit_wait | Waiting on LLM rate limit | wait_seconds |
| fallback | Fallback model activated | original_model, fallback_model |
| context_budget | Context window utilization | used, limit |
| rag_rerank | RAG results reranked | |
| delegation_post | Delegated to another bot | target_bot |
| approval_request | Waiting for tool approval | request_id, tool_name |
| approval_resolved | Approval decision received | request_id, approved |
| transcript | Audio transcription result | text |

Context assembly events (prefixed with source type) are also emitted during streaming but are primarily for debugging — clients typically only need assistant_text, tool_start, tool_result, and response.
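
A client that only needs the final answer can reduce the stream to a few branches. The sketch below is one possible way to do this: it reads raw data: lines, accumulates assistant_text, returns the text of the final response event, and raises on error or cancelled. The names `collect_response` and `ChatStreamError` are hypothetical, and treating a missing error message as "unknown error" is an assumption.

```python
import json
from typing import Iterable


class ChatStreamError(Exception):
    """Raised when the stream reports an error or cancelled event."""


def collect_response(lines: Iterable[bytes]) -> str:
    """Consume raw SSE lines and return the final response text."""
    chunks = []
    for line in lines:
        if not line.startswith(b"data: "):
            continue  # skip blank separators and non-data lines
        event = json.loads(line[len(b"data: "):])
        kind = event.get("type")
        if kind == "assistant_text":
            chunks.append(event["text"])
        elif kind == "response":
            # The final event carries the full text; prefer it over the pieces.
            return event["text"]
        elif kind == "error":
            raise ChatStreamError(event.get("message", "unknown error"))
        elif kind == "cancelled":
            raise ChatStreamError("run was cancelled")
        # Every other event type is ignored in this sketch.
    # The stream ended without a response event; return what was received.
    return "".join(chunks)
```

With the requests library, the lines can come straight from a streaming response: `collect_response(resp.iter_lines())`.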

LLM Completions API

A thin proxy that lets integrations make LLM calls through the server's multi-provider infrastructure without needing to know about provider URLs, API keys, or routing. Usage is recorded as a TraceEvent for cost tracking.

Scope: llm:completions

Request

curl -X POST http://localhost:8000/api/v1/llm/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-2.5-flash",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Summarize this text: ..."}
    ],
    "temperature": 0.7
  }'

Request Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | No | Model ID (LiteLLM format). Defaults to DEFAULT_MODEL. |
| messages | array | Yes | OpenAI-format messages (role + content). |
| temperature | float | No | 0–2. |
| max_tokens | int | No | Max completion tokens. |
| extra | object | No | Provider-specific params passed through to the LLM call (e.g. Gemini safety_settings). |

Response

{
  "content": "Here is a summary...",
  "model": "gemini/gemini-2.5-flash",
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 128,
    "total_tokens": 170
  }
}

| Field | Type | Description |
| --- | --- | --- |
| content | string | LLM response text (empty string if model returned no content). |
| model | string | Actual model used. |
| usage | object or null | Token counts (null if provider didn't report usage). |

Notes

  • The model is resolved through the server's provider system — the caller doesn't need to know which provider or API key to use.
  • All calls are recorded as TraceEvents with caller identity, model, token counts, duration, and cost (when available from LiteLLM).
  • Used by the ingestion pipeline's safety classifier and available to any integration with a scoped API key.
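
Because usage can be null and content can be an empty string, callers should guard both before using them. The following is a minimal sketch of that handling on an already-parsed response body; the function name `describe_completion` and the output format are hypothetical.

```python
def describe_completion(body: dict) -> str:
    """Summarize a parsed /api/v1/llm/completions response in one line."""
    usage = body.get("usage")
    if usage is None:
        # The provider did not report token counts.
        tokens = "usage not reported"
    else:
        tokens = f"{usage['total_tokens']} tokens"
    # content is an empty string when the model returned nothing.
    text = body["content"] or "(no content)"
    return f"[{body['model']}, {tokens}] {text}"
```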

Common Patterns

Create a Channel and Send a Message

import requests

BASE = "http://localhost:8000"
HEADERS = {"Authorization": "Bearer your-api-key"}

# 1. Create a channel
ch = requests.post(f"{BASE}/api/v1/admin/channels", headers=HEADERS, json={
    "name": "my-project",
    "bot_id": "default",
}).json()

channel_id = ch["id"]

# 2. Send a message (blocking)
resp = requests.post(f"{BASE}/chat", headers=HEADERS, json={
    "message": "Hello!",
    "channel_id": channel_id,
}).json()

print(resp["response"])

Stream a Response (Python)

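import json
import requests

# Reuses BASE, HEADERS, and channel_id from the previous example.
resp = requests.post(
    f"{BASE}/chat/stream",
    headers=HEADERS,
    json={"message": "Explain Docker", "channel_id": channel_id},
    stream=True,  # read events as they arrive instead of buffering the body
)
resp.raise_for_status()  # fail early on 401/403/429 rather than parsing an error body

for line in resp.iter_lines():
    # Each event arrives as a JSON object on a "data: " line.
    if line and line.startswith(b"data: "):
        event = json.loads(line[6:])
        if event["type"] == "assistant_text":
            print(event["text"], end="", flush=True)
        elif event["type"] == "response":
            print()  # final newline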

Python Client

The agent CLI can also be used as a library:

cd client && pip install -e .

from agent.client import AgentClient

client = AgentClient(base_url="http://localhost:8000", api_key="your-key")
response = client.chat("Hello!", bot_id="default")
print(response.text)

CORS

To allow browser-based clients, set CORS_ORIGINS in your .env:

CORS_ORIGINS=http://localhost:3000,https://my-dashboard.example.com

Rate Limiting

Note: these limits apply to requests to the Spindrel API itself. They control how fast clients can call your Spindrel server and do not affect outbound calls to LLM providers (OpenAI, Anthropic, etc.), which have their own rate limits handled by the agent's retry/backoff logic.

Rate limiting is opt-in and protects against runaway clients hammering your server. Enable it in .env:

RATE_LIMIT_ENABLED=true
RATE_LIMIT_DEFAULT=100/minute    # all Spindrel API endpoints
RATE_LIMIT_CHAT=30/minute        # /chat and /chat/stream (stricter)

When rate-limited, the server returns HTTP 429 with a Retry-After header.
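
A client can use the header to decide how long to pause before retrying. The sketch below assumes Retry-After carries a whole number of seconds (HTTP also allows a date form, which this sketch does not parse) and falls back to capped exponential backoff otherwise. The function name `retry_delay` and its defaults are hypothetical.

```python
def retry_delay(headers: dict, attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying a request that returned HTTP 429."""
    value = headers.get("Retry-After", "")
    if value.strip().isdigit():
        # The server told us how long to wait.
        return float(value)
    # Header missing or not a plain number of seconds: back off exponentially.
    return min(cap, base * 2 ** attempt)
```

A retry loop would then sleep for `retry_delay(resp.headers, attempt)` whenever `resp.status_code == 429` and give up after a fixed number of attempts.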

Limits are per API key (or per client IP if no key is provided). The rate limiter uses an in-memory token bucket — limits reset on server restart. You can also configure these at runtime from Settings > API Rate Limiting in the admin UI.
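
For readers unfamiliar with the technique, here is a generic token-bucket sketch. It illustrates the general algorithm only and is not Spindrel's implementation; the class name, constructor parameters, and the injectable clock are all hypothetical. Because the state lives in a plain Python object, it also shows why such limits vanish on restart.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests per `per_seconds`, refilled continuously."""

    def __init__(self, capacity: int, per_seconds: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = capacity / per_seconds  # tokens added per second
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        # Add the tokens earned since the last call, never exceeding capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A limit such as 30/minute would correspond to `TokenBucket(30, 60)`, with one bucket kept per API key or client IP.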

Error Handling

| Status | Meaning |
| --- | --- |
| 200 | Success |
| 401 | Missing or invalid API key |
| 403 | Insufficient scopes for this endpoint |
| 404 | Resource not found |
| 409 | Conflict (e.g., session locked) |
| 422 | Validation error (check detail field) |
| 429 | Rate limited (check Retry-After header) |
| 500 | Server error |

Error responses follow this format:

{
  "detail": "Human-readable error message"
}

Validation errors (422) include field-level details:

{
  "detail": [
    {"loc": ["body", "message"], "msg": "field required", "type": "value_error.missing"}
  ]
}
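
Since detail is a string for most errors and a list for validation errors, client code usually has to handle both shapes. The following sketch flattens either form into a list of readable messages; the function name `error_messages` and the "location: message" output format are hypothetical.

```python
def error_messages(body: dict) -> list[str]:
    """Flatten an error response body into human-readable messages."""
    detail = body.get("detail")
    if isinstance(detail, str):
        return [detail]
    if isinstance(detail, list):
        # Validation errors: one entry per field, each with a location path.
        return [
            f"{'.'.join(str(part) for part in item.get('loc', []))}: {item.get('msg', '')}"
            for item in detail
        ]
    return []  # no detail field present
```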