# Developer API
Spindrel exposes a comprehensive REST API for building integrations, dashboards, and custom clients. This guide covers authentication, endpoint discovery, the chat API, and streaming events.
## Interactive Documentation

Spindrel auto-generates OpenAPI docs from its route definitions:

| URL | Description |
|---|---|
| `/docs` | Swagger UI — interactive endpoint explorer with "Try it out" |
| `/redoc` | ReDoc — readable reference documentation |
| `/openapi.json` | Raw OpenAPI 3.x schema (for code generation) |
All endpoints are visible, including admin routes. Authentication is still enforced — the docs just show what's available.
## Authentication
Spindrel supports three main authentication methods (plus short-lived widget tokens, covered below):
### Static API Key

Set `API_KEY` in your `.env`. Pass it as a Bearer token:
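For example (the header format is standard Bearer auth; the target endpoint here is just illustrative):

```bash
# Any authenticated endpoint accepts the static key as a Bearer token
curl -H "Authorization: Bearer $API_KEY" \
  http://localhost:8000/api/v1/discover
```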
The static key has full access to all endpoints.
### Scoped API Keys

Scoped keys (prefixed `ask_`) grant access to specific endpoint groups. Create them via the admin UI (Settings > API Keys) or the API:
```bash
# Create a scoped key with chat + channel access
curl -X POST http://localhost:8000/api/v1/admin/api-keys \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-integration", "scopes": ["chat", "channels:read"]}'
```
The response includes the key (shown once) and its scopes.
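A sketch of the response (exact field names may differ; the key value stays elided):

```json
{
  "key": "ask_...",
  "name": "my-integration",
  "scopes": ["chat", "channels:read"]
}
```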
#### Scope Reference

Spindrel defines 51 scopes across 22 groups:

| Group | Scopes | Description |
|---|---|---|
| Admin | `admin` | Full access — bypasses all scope checks |
| Channels | `channels:read`, `channels:write` | Channel CRUD (broad — includes sub-scopes) |
| Channels (granular) | `channels.messages:read/write`, `channels.config:read/write`, `channels.heartbeat:read/write`, `channels.integrations:read/write` | Fine-grained channel sub-resources |
| Chat | `chat` | Send messages (blocking + streaming), cancel, submit tool results |
| Sessions | `sessions:read`, `sessions:write` | Session details and message history |
| Bots | `bots:read`, `bots:write`, `bots:delete` | Bot configuration management |
| Tasks | `tasks:read`, `tasks:write` | Scheduled/deferred task management |
| Workspaces | `workspaces:read/write`, `workspaces.files:read/write` | Workspace management and file operations |
| Documents | `documents:read`, `documents:write` | Ingested document search and management |
| Knowledge | `knowledge:read`, `knowledge:write` | Bot knowledge entries (deprecated — prefer workspace files) |
| Todos | `todos:read`, `todos:write` | Persistent work items |
| Attachments | `attachments:read`, `attachments:write` | File attachment management |
| Logs | `logs:read`, `logs:write` | Agent turns, tool calls, traces, server logs |
| Tools | `tools:read`, `tools:execute` | Tool listing and direct execution |
| Providers | `providers:read`, `providers:write` | LLM provider configuration |
| Users | `users:read`, `users:write` | User management |
| Settings | `settings:read`, `settings:write` | Server-wide settings |
| Operations | `operations:read`, `operations:write` | Backups, git pull, restart |
| Usage | `usage:read` | Cost analytics and usage limits |
| Capabilities | `carapaces:read`, `carapaces:write` | Skill+tool bundle management |
| Workflows | `workflows:read`, `workflows:write` | Workflow definitions and run management |
| LLM | `llm:completions` | Direct LLM calls through the server's provider system |
| Mission Control | `mission_control:read`, `mission_control:write` | Dashboard data (kanban, journal, etc.) |
Hierarchy rules: `channels:write` implies all `channels.*:write` sub-scopes. `admin` implies everything.
#### Presets
The admin UI offers one-click presets for common use cases:
| Preset | Use Case | Key Scopes |
|---|---|---|
| Messaging Integration | Slack, Discord, etc. | `chat`, `bots:read`, `channels:read/write`, `channels.config:read/write`, `sessions:read/write`, `todos:read`, `llm:completions` |
| Chat Client | Custom chat frontends | `chat`, `bots:read`, `channels:read/write`, `sessions:read`, `attachments:read/write` |
| Container Bot | Bots in their container environment | `chat`, `bots:read`, `channels:read/write`, `tasks:read/write`, `documents:read/write`, `todos:read/write`, `workspaces.files:read/write`, `attachments:read/write`, `carapaces:read/write`, `tools:read/execute` |
| Read-Only Monitor | Dashboards | `bots:read`, `channels:read`, `sessions:read`, `tasks:read`, `todos:read`, `attachments:read`, `logs:read` |
| Mission Control | MC dashboard | `bots:read`, `channels:read`, `sessions:read`, `tasks:read/write`, `todos:read/write`, `workspaces:read`, `workspaces.files:read/write`, `attachments:read`, `logs:read`, `mission_control:read/write`, `carapaces:read` |
### JWT (User Authentication)
The UI uses JWT tokens via Google OAuth. For API access, scoped keys are preferred.
### Widget Tokens (Short-Lived, Bot-Scoped)

Interactive HTML widgets (authored by bots via `emit_html_widget`) render in sandboxed iframes and need to call `/api/v1/...` endpoints without borrowing the viewing user's session. Spindrel mints short-lived (15 min) bot-scoped JWTs for this case via `POST /api/v1/widget-auth/mint` — payload `{source_bot_id, pin_id?}`, response `{token, expires_at, expires_in, bot_id, bot_name, scopes}`. The renderer re-mints every 12 minutes and pushes the new token into the live iframe, so the widget never 401s mid-session.

Under the hood these are regular JWTs with `kind: "widget"` in the payload; the auth dependency has a dedicated branch that returns an `ApiKeyAuth` with the scopes inlined from the token (no per-request DB lookup). Scopes are copied from the bot's configured API key at mint time, so the widget can only do what the bot could do — not what the viewing user could do. See the HTML Widgets guide for the user-facing version.

You shouldn't typically call `/widget-auth/mint` yourself — it's automated by the widget renderer. It's documented here for completeness.
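For reference, the request and response shapes look roughly like this (field names are taken from the description above; the bot ID value is illustrative, and any auth requirements on the mint call itself are not shown, since the renderer normally handles this):

```bash
# Mint a bot-scoped widget token (normally automated by the renderer)
curl -X POST http://localhost:8000/api/v1/widget-auth/mint \
  -H "Content-Type: application/json" \
  -d '{"source_bot_id": "default"}'

# Response (illustrative values; expires_in is seconds, matching the 15 min TTL):
# {"token": "eyJ...", "expires_at": "...", "expires_in": 900,
#  "bot_id": "default", "bot_name": "Default", "scopes": ["chat"]}
```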
## Endpoint Discovery

The `/api/v1/discover` endpoint returns all accessible endpoints, filtered by your key's scopes:
```bash
# List endpoints accessible with your key
curl -H "Authorization: Bearer $API_KEY" \
  http://localhost:8000/api/v1/discover

# Get full markdown API reference
curl -H "Authorization: Bearer $API_KEY" \
  "http://localhost:8000/api/v1/discover?detail=true"
```
The basic response includes method, path, description, and required scope for each endpoint.
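A sketch of what a single entry might look like (field names follow the description above; exact key names may differ):

```json
{
  "method": "POST",
  "path": "/chat",
  "description": "Send a message and wait for the full response",
  "scope": "chat"
}
```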
## Chat API

### Blocking Request
```bash
curl -X POST http://localhost:8000/chat \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the weather like?",
    "bot_id": "default",
    "channel_id": "your-channel-uuid"
  }'
```
Response:
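A representative response (illustrative; exact fields may vary by server version, but the Common Patterns example below reads the `response` field):

```json
{
  "response": "It's sunny and 22°C today.",
  "tools_used": ["web_search"]
}
```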
### Streaming Request
```bash
curl -X POST http://localhost:8000/chat/stream \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Write a haiku about coding",
    "bot_id": "default",
    "channel_id": "your-channel-uuid"
  }' --no-buffer
```
Returns Server-Sent Events (SSE). Each event is a JSON object on a `data:` line:

```
data: {"type": "tool_start", "name": "web_search", "args": {...}}
data: {"type": "tool_result", "name": "web_search", "result": "..."}
data: {"type": "assistant_text", "text": "Here's what I found..."}
data: {"type": "response", "text": "Full response text", "tools_used": ["web_search"]}
```
### Request Fields

| Field | Type | Description |
|---|---|---|
| `message` | string | User message text |
| `channel_id` | uuid | Target channel (preferred) |
| `bot_id` | string | Bot ID (default: `"default"`) |
| `client_id` | string | Client identifier (default: `"default"`) |
| `audio_data` | string | Base64-encoded audio (for voice input) |
| `audio_format` | string | Audio format: `m4a`, `wav`, `webm` |
| `attachments` | array | Vision attachments (images) |
| `dispatch_type` | string | `"slack"`, `"webhook"`, `"internal"`, `"none"` |
| `dispatch_config` | object | Routing config for the dispatch type |
| `model_override` | string | Per-turn model override |
| `passive` | bool | Store message without running agent |
### SSE Event Types

Events emitted during streaming:

| Event Type | Description | Key Fields |
|---|---|---|
| `assistant_text` | Incremental text from LLM | `text` |
| `tool_start` | Tool call beginning | `name`, `args` |
| `tool_result` | Tool call completed | `name`, `result`, `duration_ms` |
| `response` | Final response (always last) | `text`, `tools_used`, `client_actions` |
| `thinking_content` | Extended thinking (Claude models) | `text` |
| `error` | Processing error | `message` |
| `cancelled` | User cancelled the run | — |
| `queued` | Message queued (session locked or system paused) | `session_id`, `task_id`, `reason` |
| `passive_stored` | Passive message stored | `session_id` |
| `secret_warning` | Secret-like patterns detected in input | `patterns` |
| `rate_limit_wait` | Waiting on LLM rate limit | `wait_seconds` |
| `fallback` | Fallback model activated | `original_model`, `fallback_model` |
| `context_budget` | Context window utilization | `used`, `limit` |
| `rag_rerank` | RAG results reranked | — |
| `delegation_post` | Delegated to another bot | `target_bot` |
| `approval_request` | Waiting for tool approval | `request_id`, `tool_name` |
| `approval_resolved` | Approval decision received | `request_id`, `approved` |
| `transcript` | Audio transcription result | `text` |
Context assembly events (prefixed with source type) are also emitted during streaming but are primarily for debugging — clients typically only need `assistant_text`, `tool_start`, `tool_result`, and `response`.
## LLM Completions API

A thin proxy that lets integrations make LLM calls through the server's multi-provider infrastructure without needing to know about provider URLs, API keys, or routing. Usage is recorded as a `TraceEvent` for cost tracking.

Scope: `llm:completions`
### Request
```bash
curl -X POST http://localhost:8000/api/v1/llm/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-2.5-flash",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Summarize this text: ..."}
    ],
    "temperature": 0.7
  }'
```
### Request Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | No | Model ID (LiteLLM format). Defaults to `DEFAULT_MODEL`. |
| `messages` | array | Yes | OpenAI-format messages (role + content). |
| `temperature` | float | No | 0–2. |
| `max_tokens` | int | No | Max completion tokens. |
| `extra` | object | No | Provider-specific params passed through to the LLM call (e.g. Gemini `safety_settings`). |
### Response

```json
{
  "content": "Here is a summary...",
  "model": "gemini/gemini-2.5-flash",
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 128,
    "total_tokens": 170
  }
}
```
| Field | Type | Description |
|---|---|---|
| `content` | string | LLM response text (empty string if the model returned no content). |
| `model` | string | Actual model used. |
| `usage` | object \| null | Token counts (`null` if the provider didn't report usage). |
### Notes

- The model is resolved through the server's provider system — the caller doesn't need to know which provider or API key to use.
- All calls are recorded as `TraceEvent`s with caller identity, model, token counts, duration, and cost (when available from LiteLLM).
- Used by the ingestion pipeline's safety classifier, and available to any integration with a scoped API key.
## Common Patterns

### Create a Channel and Send a Message

```python
import requests

BASE = "http://localhost:8000"
HEADERS = {"Authorization": "Bearer your-api-key"}

# 1. Create a channel
ch = requests.post(f"{BASE}/api/v1/admin/channels", headers=HEADERS, json={
    "name": "my-project",
    "bot_id": "default",
}).json()
channel_id = ch["id"]

# 2. Send a message (blocking)
resp = requests.post(f"{BASE}/chat", headers=HEADERS, json={
    "message": "Hello!",
    "channel_id": channel_id,
}).json()
print(resp["response"])
```
### Stream a Response (Python)

```python
import json
import requests

resp = requests.post(
    f"{BASE}/chat/stream",
    headers=HEADERS,
    json={"message": "Explain Docker", "channel_id": channel_id},
    stream=True,
)
for line in resp.iter_lines():
    if line and line.startswith(b"data: "):
        event = json.loads(line[6:])
        if event["type"] == "assistant_text":
            print(event["text"], end="", flush=True)
        elif event["type"] == "response":
            print()  # final newline
```
### Python Client

The agent CLI can also be used as a library:

```python
from agent.client import AgentClient

client = AgentClient(base_url="http://localhost:8000", api_key="your-key")
response = client.chat("Hello!", bot_id="default")
print(response.text)
```
## CORS

To allow browser-based clients, set `CORS_ORIGINS` in your `.env`:
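For example (assuming the common comma-separated convention; the origins are illustrative):

```bash
CORS_ORIGINS=http://localhost:3000,https://app.example.com
```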
## Rate Limiting

**This limits requests to the Spindrel API itself.** These limits control how fast clients can call your Spindrel server; they do not affect outbound calls to LLM providers (OpenAI, Anthropic, etc.), which have their own rate limits handled by the agent's retry/backoff logic.

Opt-in rate limiting protects against runaway clients hammering your server. Enable it in `.env`:
```bash
RATE_LIMIT_ENABLED=true
RATE_LIMIT_DEFAULT=100/minute   # all Spindrel API endpoints
RATE_LIMIT_CHAT=30/minute       # /chat and /chat/stream (stricter)
```
When rate-limited, the server returns HTTP 429 with a `Retry-After` header.

Limits are per API key (or per client IP if no key is provided). The rate limiter uses an in-memory token bucket — limits reset on server restart. You can also configure these at runtime from Settings > API Rate Limiting in the admin UI.
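A minimal client-side sketch for honoring the `Retry-After` header (the helper name and retry policy are our own, not part of Spindrel):

```python
import time
import requests

def post_with_retry(url, max_retries=3, **kwargs):
    """POST, sleeping out HTTP 429 responses per the server's Retry-After header."""
    for _ in range(max_retries):
        resp = requests.post(url, **kwargs)
        if resp.status_code != 429:
            return resp
        # Fall back to 5 seconds if the header is missing
        wait = float(resp.headers.get("Retry-After", 5))
        time.sleep(wait)
    return resp
```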
## Error Handling
| Status | Meaning |
|---|---|
| 200 | Success |
| 401 | Missing or invalid API key |
| 403 | Insufficient scopes for this endpoint |
| 404 | Resource not found |
| 409 | Conflict (e.g., session locked) |
| 422 | Validation error (check detail field) |
| 429 | Rate limited (check Retry-After header) |
| 500 | Server error |
Error responses follow this format:
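Spindrel's auto-generated `/docs` and `/redoc` suggest a FastAPI-style stack, so errors should use the standard `detail` shape (illustrative example):

```json
{
  "detail": "Channel not found"
}
```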
Validation errors (422) include field-level details:
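Again assuming the standard FastAPI validation shape (the field and message are illustrative):

```json
{
  "detail": [
    {
      "loc": ["body", "channel_id"],
      "msg": "value is not a valid uuid",
      "type": "type_error.uuid"
    }
  ]
}
```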