Every endpoint. Every schema.
Plain HTTP. Bearer auth on private endpoints. JSON in, JSON out. Versioned at /v1. The CLI and SDKs wrap this surface; you can hit it directly with curl.
Authentication
Bearer-token auth on every /v1 endpoint that touches tenant state. Public endpoints (health, public registry, receipts/verify) have no auth.
Bearer keys
X-API-Key: ks_<your-key>
# or, equivalently (for SDK compatibility):
Authorization: Bearer ks_<your-key>
X-API-Key is the canonical header. Authorization: Bearer is also accepted for SDK compatibility. Issued at signup. Rotate via POST /v1/account/rotate-key. Stored as a scrypt hash - we never see your raw key after rotation.
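A minimal sketch of attaching the key from Python, using only the standard library. The host in BASE_URL and the helper names auth_headers and build_request are placeholders for illustration, not part of the API; the request is built but not sent.

```python
import json
import urllib.request

# Hypothetical host: replace with the base URL of your deployment.
BASE_URL = "https://api.example.invalid"


def auth_headers(api_key, use_bearer=False):
    """Return the auth header for a tenant key. X-API-Key is the canonical form."""
    if not api_key.startswith("ks_"):
        raise ValueError("expected a key of the form ks_<your-key>")
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-API-Key": api_key}


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated JSON request."""
    headers = auth_headers(api_key)
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Pass the result to urllib.request.urlopen to make the call.
req = build_request("POST", "/v1/account/rotate-key", "ks_example")
```

Keeping header construction in one function makes it easy to switch between the two accepted forms without touching call sites.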
Browser sessions
Cookie: kolm_session=<httpOnly-cookie>
Web dashboard uses an httpOnly session cookie. SameSite=Lax, 30-day rolling. POST /v1/session/login issues; POST /v1/session/logout revokes.
HTTP error codes
Every endpoint returns errors as JSON: { "error": "<message>", "code": "<machine_code>", "request_id": "..." }. Status codes follow standard HTTP semantics.
2xx  Success.
3xx  Redirect or not-modified.
400  Bad request: malformed body, missing fields, invalid types.
401  Missing or invalid bearer / session.
402  Quota exhausted on current plan. Upgrade or wait for reset.
403  Authenticated but not authorized (wrong tenant, signup disabled).
404  Not found.
409  Conflict: duplicate id, signature mismatch, version skew.
422  Semantically wrong: k-score below floor, verifier rejects.
429  Rate-limited. Retry-After header indicates when.
500  Server error. Includes a request_id you can quote in an issue.
503  Upstream dependency unavailable (frontier API down, secret unset).
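One way a client might turn the error envelope into an exception, sketched under assumptions: ApiError and parse_error are illustrative names, and only the delta-seconds form of Retry-After is handled (an HTTP-date value is ignored here).

```python
import json


class ApiError(Exception):
    """Carries the fields of the documented error envelope."""

    def __init__(self, status, message, code, request_id, retry_after_s=None):
        super().__init__(
            f"HTTP {status} [{code}] {message} (request_id={request_id})"
        )
        self.status = status
        self.message = message
        self.code = code
        self.request_id = request_id
        self.retry_after_s = retry_after_s


def parse_error(status, body, headers=None):
    """Build an ApiError from a status code, raw body, and response headers."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        payload = {}  # tolerate a non-JSON body, e.g. from a proxy
    if not isinstance(payload, dict):
        payload = {}
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    retry_after_s = None
    raw = lowered.get("retry-after", "").strip()
    if status == 429 and raw.isdigit():
        retry_after_s = int(raw)  # seconds to wait before retrying
    return ApiError(
        status,
        payload.get("error"),
        payload.get("code"),
        payload.get("request_id"),
        retry_after_s,
    )
```

Surfacing request_id on the exception keeps it available for logs and issue reports.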
Account & plans
Self-serve provisioning, plan changes, cancellation, deletion. No mailto, no scoping calls.
Provisions a new tenant. Returns the bearer key plus the Stripe Payment Link for paid plans (or trial mode if the link isn't configured).
Request
{
"email": "you@example.com",
"name": "your-org",
"plan": "free | starter | pro | teams | enterprise"
}
Response (201)
{
"tenant": {
"id": "...",
"name": "your-org-x4f8",
"plan": "pro",
"quota": 100000,
"seats": 3,
"trial_ends_at": "2026-06-08T00:00:00Z"
},
"api_key": "ks_...",
"billing_url": "https://buy.stripe.com/...",
"billing_required": true
}
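A sketch of how a caller might act on this response. How billing_required, billing_url, and trial_ends_at combine is an assumption read off the field names, and next_step is a hypothetical helper.

```python
def next_step(signup_response):
    """Return (action, detail) for a signup response.

    Assumed precedence: an outstanding billing link first, then a trial,
    otherwise the tenant is ready to use.
    """
    tenant = signup_response.get("tenant", {})
    billing_url = signup_response.get("billing_url")
    if signup_response.get("billing_required") and billing_url:
        return ("complete_billing", billing_url)
    if tenant.get("trial_ends_at"):
        return ("trial", tenant["trial_ends_at"])
    return ("ready", None)
```

Whatever the branch, store api_key from the response in a secret manager rather than in source control.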
Errors
POST /v1/session/login exchanges an API key for a session cookie. Useful when wiring a web dashboard to an existing tenant key.
Request
{ "api_key": "ks_..." }
Response (200)
{ "ok": true, "tenant": { "id": "...", "name": "...", "plan": "pro" } }
Errors
Returns tenant metadata: plan, quota, seats, usage, trial_ends_at, signup time.
Response
{
"id": "...",
"name": "your-org",
"plan": "pro",
"quota": 100000,
"seats": 3,
"usage": { "compiles_30d": 12, "calls_30d": 8421 },
"trial_ends_at": null,
"created_at": "2026-05-01T00:00:00Z"
}
Self-serve upgrade or downgrade. Returns the new Stripe Payment Link if the plan requires one. Free is always free.
Request
{ "plan": "teams" }
Response
{
"tenant": { "id": "...", "plan": "teams", "quota": 1000000, "seats": 5 },
"billing_url": "https://buy.stripe.com/...",
"billing_required": true
}
Errors
Drops the tenant to the free tier at the end of the paid period. The key keeps working; quotas drop to free limits at period end.
Response
{ "ok": true, "downgrades_to": "free", "effective_at": "2026-06-01T00:00:00Z" }
Soft-deletes the tenant. The key stops working immediately. Artifacts already issued continue to verify.
Request
{ "confirm": "DELETE my-org" }
Response
{ "ok": true, "deleted_at": "2026-05-08T17:30:00Z" }
POST /v1/account/rotate-key issues a fresh bearer key and revokes the previous one. The new key is shown once, so store it immediately.
Response
{ "api_key": "ks_..." }
Static plan catalog: ids, prices, quotas, seat counts, billing-link availability.
Response
{
"plans": [
{ "id": "free", "label": "Developer", "price_usd_month": 0, "quota": 10000, "seats": 1, "self_serve": true, "billing_link_configured": false },
{ "id": "starter", "label": "Starter", "price_usd_month": 9, "quota": 50000, "seats": 1, "self_serve": true, "billing_link_configured": true },
{ "id": "pro", "label": "Pro", "price_usd_month": 49, "quota": 200000, "seats": 1, "self_serve": true, "billing_link_configured": true },
{ "id": "teams", "label": "Teams", "price_usd_month": 149, "quota": 1000000, "seats": 5, "self_serve": true, "billing_link_configured": true },
{ "id": "business", "label": "Business", "price_usd_month": 1499, "quota": 5000000, "seats": 15, "self_serve": true, "billing_link_configured": true },
{ "id": "enterprise", "label": "Enterprise", "price_usd_month": 2999, "quota": 10000000, "seats": 25, "self_serve": true, "billing_link_configured": true }
]
}
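The catalog lends itself to a simple client-side check. The sketch below picks the cheapest plan that covers a required quota and seat count; cheapest_plan is a hypothetical helper, and the PLANS values are copied from the response above.

```python
# Subset of the catalog fields shown in the sample response.
PLANS = [
    {"id": "free", "price_usd_month": 0, "quota": 10000, "seats": 1},
    {"id": "starter", "price_usd_month": 9, "quota": 50000, "seats": 1},
    {"id": "pro", "price_usd_month": 49, "quota": 200000, "seats": 1},
    {"id": "teams", "price_usd_month": 149, "quota": 1000000, "seats": 5},
    {"id": "business", "price_usd_month": 1499, "quota": 5000000, "seats": 15},
    {"id": "enterprise", "price_usd_month": 2999, "quota": 10000000, "seats": 25},
]


def cheapest_plan(plans, quota_needed, seats_needed):
    """Return the lowest-priced plan meeting both limits, or None."""
    candidates = [
        p for p in plans
        if p["quota"] >= quota_needed and p["seats"] >= seats_needed
    ]
    return min(candidates, key=lambda p: p["price_usd_month"], default=None)
```

In practice, fetch the catalog at runtime instead of hard-coding it, since prices and quotas may change.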
Compile
The orchestrator: synthesize a verifier, k-sample the frontier, fit a LoRA, observe deterministic patterns, bundle, sign. Returns a job id; poll the job until it yields a .kolm file.
Starts a compile job. In sync mode the call returns when the artifact is ready; in async mode it returns a job id and you poll.
Request
{
"task": "summarize patient intake notes for triage",
"examples": [{"input": "...", "output": "..."}],
"corpus_namespace": "tenant/patient-corpus",
"base_model": "qwen3-7b",
"k_score_floor": 0.85,
"mode": "async"
}
Response (202)
{ "job_id": "cmp_8f2a...", "status": "queued", "poll": "/v1/compile/cmp_8f2a..." }
Errors
List recent compile jobs for the current tenant.
Response
{ "jobs": [
{ "id": "cmp_...", "status": "ready", "k_score": 0.91, "size_mb": 142, "created_at": "..." },
{ "id": "cmp_...", "status": "running", "stage": "k-sample", "created_at": "..." }
]}
Job status, served at the poll path returned when the job was created. Stages: queued → k-sample → verify → lora-fit → bundle → sign → ready. ready means the artifact is downloadable.
Response (ready)
{
"id": "cmp_8f2a...",
"status": "ready",
"k_score": 0.913,
"size_mb": 142,
"manifest": "sha256:...",
"signature": "hmac-sha256:...",
"artifact_url": "/v1/compile/cmp_8f2a.../.kolm"
}
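A polling loop for async compiles might look like the sketch below. It treats ready as the success state, per the stage list. The failure state name, the default interval, and the fetch_status callable (which should GET the job's poll path and return the parsed JSON) are assumptions.

```python
import time

# Hypothetical failure states; this section does not list them.
FAILED_STATES = {"failed"}


def wait_for_artifact(fetch_status, job_id, interval_s=2.0, timeout_s=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll a compile job until its status is ready, then return the job dict.

    fetch_status(job_id) is supplied by the caller and returns the parsed
    job-status JSON. sleep and clock are injectable for testing.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "ready":
            return job
        if status in FAILED_STATES:
            raise RuntimeError(f"compile job {job_id} ended with status {status}")
        if clock() >= deadline:
            raise TimeoutError(f"compile job {job_id} not ready after {timeout_s}s")
        sleep(interval_s)
```

Injecting fetch_status keeps the loop independent of any particular HTTP client, and the returned dict carries artifact_url for the download step.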
Streams the signed .kolm archive. Content-Type: application/zip. Includes manifest, recipes, evals, signature, receipt.
List artifacts the tenant has compiled. Includes hash, size, base model, k-score, signing time.
Run
Local-first. Use kolm run on the artifact when you can. The HTTP run endpoint is for the cloud runtime - same code path, different host.
Execute a registered version (or concept) against an input. Returns the output, latency, cache flag, and an optional HMAC-bound receipt.
Request
{
"concept_id": "cpt_8f2a...", // or "version_id": "ver_..."
"input": { "note": "patient presents with..." },
"use_cache": true, // optional, defaults true
"receipt": true // optional, defaults true
}
Response
{
"output": { "triage": "urgent", "confidence": 0.91 },
"version_id": "ver_...",
"latency_us": 142000,
"cache_hit": null,
"receipt": {
"spec": "rs-1",
"input_hash": "sha256:...",
"output_hash": "sha256:...",
"version_id": "ver_...",
"runtime_version": "0.2.0",
"issued_at": "2026-05-08T17:30:00Z",
"hmac": "..."
}
}
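The // annotations in the request sample are documentation only; a real body must be plain JSON. A small builder, sketched on the assumption that concept_id and version_id are mutually exclusive (build_run_body is a hypothetical name):

```python
import json


def build_run_body(input_payload, concept_id=None, version_id=None,
                   use_cache=True, receipt=True):
    """Return the run request body as a dict, with exactly one target id."""
    if (concept_id is None) == (version_id is None):
        raise ValueError("pass exactly one of concept_id or version_id")
    body = {"input": input_payload, "use_cache": use_cache, "receipt": receipt}
    if concept_id is not None:
        body["concept_id"] = concept_id
    else:
        body["version_id"] = version_id
    return body


# json.dumps produces valid JSON with no comments.
payload = json.dumps(
    build_run_body({"note": "patient presents with..."}, concept_id="cpt_example")
)
```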
k-samples a frontier model and picks the candidate that passes a deterministic verifier (test cases or recipe-as-judge). The request has the same shape as messages.create; the verified field is the kolm extension. Used during compile and as a standalone "verified inference" surface.
Request
{
"messages": [{ "role": "user", "content": "extract invoice line items as JSON" }],
"system": "You are a code generator...", // optional
"model": "claude-opus-4-7",
"max_tokens": 2048, // optional
"temperature": 0.7, // optional
"verified": { "k": 4, "test_cases": [...] }, // or { judge_recipe_id, expected }
"corpus_namespace": "tenant/patient-corpus" // optional Recall grounding
}
Response
{
"id": "wrap_...",
"model": "claude-opus-4-7",
"role": "assistant",
"stop_reason": "end_turn", // or "verifier_unsatisfied"
"content": [{ "type": "text", "text": "..." }],
"_kolm": {
"verified": true,
"chosen": { "source": "...", "passes": 4, "total": 4 },
"candidates": [...],
"cost_usd": 0.0123,
"elapsed_ms": 1840,
"recall_chunks": 0,
"receipt": { "spec": "rs-1", "primitive": "verified-inference", ... }
}
}
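Callers will usually want to fail loudly when no candidate passed. A sketch that reads only the fields shown in the response above; verified_text is a hypothetical helper, and treating a missing _kolm.verified as a failure is an assumption.

```python
def verified_text(response):
    """Return the concatenated text blocks, or raise if verification failed."""
    meta = response.get("_kolm", {})
    unsatisfied = response.get("stop_reason") == "verifier_unsatisfied"
    if unsatisfied or not meta.get("verified"):
        raise ValueError("no candidate satisfied the verifier")
    return "".join(
        part.get("text", "")
        for part in response.get("content", [])
        if part.get("type") == "text"
    )
```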
Top-k retrieval over a tenant corpus. Multimodal: text, image, audio, video, PDF.
Request
{
"namespace": "tenant/patient-corpus",
"query": "patients reporting chest pain in the last week",
"k": 8
}
Response
{
"namespace": "tenant/patient-corpus",
"n": 3,
"chunks": [
{ "path": "...", "snippet": "...", "score": 0.91 }
]
}
Tokenizes and ingests files from a server-mounted corpus into a tenant namespace, routing each file to the right embedder by modality. Only absolute paths on the server are accepted; SaaS upload is on the Sprint 2 roadmap.
Request
{
"namespace": "tenant/patient-corpus",
"paths": ["/abs/path/to/doc1.pdf", "/abs/path/to/notes.md"],
"force": false
}
Response
{ "indexed": 2, "skipped": 0, "namespace": "tenant/patient-corpus" }
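Since only absolute paths are accepted, it can help to reject relative ones before the request leaves the client. A sketch assuming the server uses POSIX-style paths; build_ingest_body is a hypothetical helper.

```python
from pathlib import PurePosixPath


def build_ingest_body(namespace, paths, force=False):
    """Return the ingest request body, rejecting any non-absolute path."""
    relative = [p for p in paths if not PurePosixPath(p).is_absolute()]
    if relative:
        raise ValueError(f"paths must be absolute: {relative}")
    return {"namespace": namespace, "paths": list(paths), "force": force}
```

PurePosixPath checks syntax only, so this works on any client OS and makes no claim about whether the file exists on the server.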
Receipts
By default, every /v1/run output ships with an HMAC-bound receipt (receipt defaults to true in the run request). Anyone can re-verify offline. The verifier is part of the artifact; the verify endpoint is for convenience.
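To illustrate what "HMAC-bound" means in general, the sketch below tags a set of receipt fields with HMAC-SHA256 and rejects any later change to them. This is a generic stand-in, not the RS-1 scheme: the canonicalization (sorted-key compact JSON), the key handling, and the function names are all assumptions, and the chain format is not covered.

```python
import hashlib
import hmac
import json


def _canonical(fields):
    """Serialize deterministically so equal fields always hash the same."""
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(fields, secret):
    """Return a copy of fields with an 'hmac' tag computed over them."""
    tag = hmac.new(secret, _canonical(fields), hashlib.sha256).hexdigest()
    return {**fields, "hmac": tag}


def verify_receipt(receipt, secret):
    """Recompute the tag over everything except 'hmac' and compare."""
    fields = {k: v for k, v in receipt.items() if k != "hmac"}
    expected = hmac.new(secret, _canonical(fields), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, str(receipt.get("hmac", "")))
```

Because the tag covers input_hash, output_hash, and version_id together, changing any one of them invalidates the receipt.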
Verifies a receipt against the registry's published HMAC chain. Public - any auditor can call without a key.
Request
{ "receipt": { "kolm_version": "0.1", "artifact_hash": "...", "chain": [...], "signature": "..." } }
Response
{
"verified": true,
"reasons": [],
"mode": "drive-by" // or omitted for full-receipt mode
}
Errors
Registry
The public artifact and recipe index. Read-only on this surface; submit through the CLI or /v1/publish.
Browse the public registry: featured recipes, base-model fingerprints, RS-1 spec entries.
Bulk export of the registry as NDJSON. Rate-limited to 60 requests per minute per IP. Cache headers honored.
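NDJSON is one JSON object per line, so the export can be consumed incrementally. A minimal reader follows; the record fields are not specified in this section, so the sketch yields each object as-is.

```python
import json


def iter_ndjson(lines):
    """Yield one parsed object per non-empty line of an NDJSON stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Passing a file-like response object as lines lets records be processed as they arrive rather than after the full download.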
Health
Probe endpoints. Use /health for liveness, /ready for readiness (dependencies + secrets).
Liveness probe. Always 200 unless the process is dead.
Response
{
"status": "ok",
"version": "0.2.0",
"library_version": "...",
"region": "local",
"uptime_s": 81342,
"stats": {
"concepts": 12,
"versions": 34,
"syntheses": 8,
"invocations": 2104,
"tenants": 5
}
}
Readiness probe. Returns the state of all upstream dependencies (frontier API, registry, embedder, store, receipt secret). 200 when ready, 503 when a required check fails.
Response
{
"status": "ready",
"production_like": false,
"checks": [
{ "name": "receipt_secret", "ok": true, "required": true, "label": "RECIPE_RECEIPT_SECRET" },
{ "name": "anthropic_key", "ok": true, "required": false, "label": "ANTHROPIC_API_KEY" },
{ "name": "store", "ok": true, "required": true, "label": "store" }
]
}
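A deploy script can use the checks array to report exactly what is blocking readiness. A sketch over the fields shown above; failing_required is a hypothetical helper.

```python
def failing_required(ready_response):
    """Return the labels of required checks that are not ok."""
    return [
        check.get("label", check.get("name"))
        for check in ready_response.get("checks", [])
        if check.get("required") and not check.get("ok")
    ]
```

An empty list corresponds to the 200 case; optional checks that fail are ignored here on purpose.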
SDKs
Thin wrappers around this HTTP surface. Add one to your project; the underlying API is the same.