DOCS · DROP-IN PROXY, NO SDK REWRITE

Point your API calls. Own the model.

Everything you need to turn your live OpenAI- and Anthropic-compatible calls into a signed model you run on your own hardware. Route a namespace, capture the behavior you already use, compile a portable .kolm artifact, and ship it to a laptop, your private cloud, or the edge. Every route here is one you can run and verify yourself.

01 · QUICKSTART

Ship your first model in an afternoon.

Start small and prove it fast. Point one OpenAI-compatible namespace at Kolm, export the examples it captures, then compile once the behavior is covered by your evals. Three calls take you from live traffic to a signed model you own.

quickstart.sh · route · capture · compile
$ curl https://kolm.ai/v1/route/chat/completions \
  -H "Authorization: Bearer $KOLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"namespace":"billing-agent","model":"openai:gpt-4.1-mini","messages":[{"role":"user","content":"classify this ticket"}]}'

$ curl https://kolm.ai/v1/capture/export \
  -H "Authorization: Bearer $KOLM_API_KEY" \
  -o billing-agent.jsonl

$ curl https://kolm.ai/v1/compile \
  -H "Authorization: Bearer $KOLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"task":"billing-agent","examples_file":"billing-agent.jsonl","target":"local-cpu"}'

Here is what each of those three calls does, the response you should see, and roughly how long it takes. A namespace is a label you attach to one agent's calls, for example billing-agent. Captures, evals, and the compiled model are all scoped to it, so each agent's traffic, examples, and compiled model stay separate.

1

POST /v1/route/chat/completions

Route one live, OpenAI-compatible call through Kolm instead of straight to the provider. Kolm forwards it to openai:gpt-4.1-mini, returns the same response shape your code already parses, and quietly records the request and reply as a captured example under the billing-agent namespace.

returns 200 chat completion JSON · typical 0.4 to 2 s · adds 1 captured example
{ "id": "chatcmpl-...", "choices": [ { "message": { "role": "assistant", "content": "refund_request" } } ], "x_kolm": { "captured": true, "namespace": "billing-agent" } }
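If you would rather make the same call from code than from curl, here is a minimal Python sketch using only the standard library. It mirrors the request and response shown above. The function names build_route_request and read_reply are illustrative and not part of any Kolm SDK, and the x_kolm field is read exactly as it appears in the sample response.

```python
import json
import urllib.request

# Base URL taken from the quickstart; only the host and prefix change,
# the chat completion payload keeps its OpenAI-compatible shape.
KOLM_BASE_URL = "https://kolm.ai/v1/route"


def build_route_request(api_key, namespace, model, user_text):
    """Build (but do not send) the routed chat completion request."""
    body = {
        "namespace": namespace,
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        KOLM_BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def read_reply(reply):
    """Return (assistant_text, captured_flag) from a parsed reply dict."""
    content = reply["choices"][0]["message"]["content"]
    meta = reply.get("x_kolm", {})
    return content, bool(meta.get("captured", False))
```

To send it, pass the request to urllib.request.urlopen and parse the body with json.load before handing it to read_reply. Building the request separately from sending it keeps the payload easy to inspect and test.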
2

GET /v1/capture/export

Pull every example captured so far for the namespace as a JSONL file you own and can inspect. JSONL is a plain text file with one JSON object per line. It streams and appends well, which makes it a common shape for training and eval example sets. The -o flag writes it to billing-agent.jsonl on disk.

returns 200 file: billing-agent.jsonl · typical under 1 s per 1k rows
{"namespace":"billing-agent","input":"classify this ticket","output":"refund_request","ts":"2026-06-14T10:02:11Z"}
{"namespace":"billing-agent","input":"where is my order","output":"order_status","ts":"2026-06-14T10:02:14Z"}
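Before you compile, it is worth a quick look at what the export actually contains. The sketch below parses the JSONL records in the shape shown above and counts the output labels for one namespace. The helper names load_examples and label_counts are illustrative only.

```python
import json
from collections import Counter


def load_examples(lines):
    """Parse JSONL lines (one JSON object per line), skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def label_counts(examples, namespace):
    """Count how often each output appears for one namespace."""
    return Counter(e["output"] for e in examples if e["namespace"] == namespace)
```

Because a file object yields lines, you can pass an open file straight in: open billing-agent.jsonl, call load_examples on the handle, then label_counts(examples, "billing-agent"). A label that shows up only once or twice is a hint that your evals may not cover it yet.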
3

POST /v1/compile

Turn the captured examples into a signed, portable .kolm model built for local-cpu. Compile runs as a job, so the call returns a job id right away and the artifact is ready when the job reports done. The result is Ed25519-signed, so anyone can verify it later.

returns 202 job id, then .kolm artifact · typical 2 to 20 min · output is signed
{ "id": "cmp_018x", "status": "queued", "target": "local-cpu", "artifact_url": "/v1/compile/cmp_018x/.kolm" }
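Since compile returns a job id before the artifact exists, your script needs to wait for the job to report done. Here is a hedged polling sketch. It takes the status lookup as a function you supply, because the exact status route is not shown on this page. The "failed" status, the default interval, and the default timeout are assumptions for illustration. Only "queued" and "done" come from the text above.

```python
import time


def wait_for_compile(fetch_status, job_id, interval_s=15.0, timeout_s=1800.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until the job reports done.

    fetch_status is any callable returning a dict with a "status" key.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "done":
            return job
        if status == "failed":  # hypothetical terminal state, not documented here
            raise RuntimeError("compile job %s failed" % job_id)
        if clock() >= deadline:
            raise TimeoutError("compile job %s still %s" % (job_id, status))
        sleep(interval_s)
```

Injecting sleep and clock keeps the loop testable without real waiting. Given the typical 2 to 20 minute compile time, a polling interval of several seconds is plenty.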
PLATFORM ARCHITECTURE · CAPTURE TO RUN

  • Providers - your provider calls (OpenAI + Anthropic). OpenAI- or Anthropic-compatible calls keep their shape. You change the base URL, not your SDK.
  • Gateway - kolm gateway (capture, governed). Forwards each call and captures the request and reply as a governed example.
  • Compiler - capture to model. Turns captured behavior, once your evals cover it, into a model.
  • Artifact - .kolm signed artifact. Ed25519-signed and portable. It is the one thing you own and carry out.
  • Runtimes - run it where it fits. Wherever you run it: a laptop CPU, your private cloud, or the edge.

ARTIFACT · one signed .kolm you own · verify offline

Want the bigger picture before you build? See the whole compiler on /platform, the exact .kolm format and its Ed25519 signature on /spec, where artifacts run on /runtimes, and check any signed model offline at /verify.

02 · REFERENCE

Build against the real contract.

Every page here points to something you can run, test, or verify - not a screenshot. Generate clients, wire up tests, and check our work straight from the source.

03 · API CONTROL CENTER

Run the whole pipeline from one place.

Capture, set your rules, test the behavior, compile, deploy, and export - all driven by the same API, all visible in one console. Anything you can click, you can script.

  • Set up your accounts and connections: workspace, project, environment, source, connector, credential, and provider.
  • Decide what gets kept and how it is tested: capture policy, redaction policy, retention policy, eval suite, failure taxonomy, and regression set.
  • Ship the result: compile run, artifact, manifest, target runtime, receipt, export, and release checks.
api-control-center · contract
GET /v1/account/api-control-center
Every change is logged and every export is replayable.

04 · DATA IN AND OUT

Bring your traffic from anywhere. Take your model with you.

Capture from the tools you already use, and export the result in formats your stack already speaks. Nothing locks you in - what comes out is a file you own.

CAPTURE IN

12+ ways in

REST proxy, SDK capture, CLI upload, JSONL/CSV/Parquet, webhooks, streams, OTEL, gateway logs, trace imports, warehouse import, browser traces, and human labels.

WORKS WITH YOUR STACK

Your trace tools and agents

Trace exports, eval dashboards, OpenTelemetry spans, MCP/tool calls, agent handoffs, support tickets, CRM records, and red-team cases.

EXPORT OUT

10+ ways out

Signed artifacts, manifests, receipts, eval reports, event streams, SIEM/log export, warehouse export, webhook callbacks, CLI/SDK pull, and runtime packages.

YOUR RULES STICK

Your rules travel with the data

Source identity, scope, redaction, retention, provider policy, budget, rate limits, approvals, export permissions, and release state - all enforced wherever the data goes.

05 · LIFECYCLE ROUTES

From live calls to a shipped model, in routes.

The whole path - capture, compile, deploy - is just API calls you can run today. Here is the spine end to end.

lifecycle routes · capture → compile → deploy
POST /v1/route/chat/completions OpenAI-compatible traffic
POST /v1/gateway/dispatch capture · cache · fallback
GET /v1/capture/export log · list · bulk
POST /v1/compile .kolm
POST /v1/serve detect · recommend · install
  • POST /v1/route/chat/completions - point your existing OpenAI-compatible traffic at Kolm, no SDK rewrite.
  • POST /v1/gateway/dispatch - send calls to your providers with capture, caching, fallback, and quality routing built in.
  • POST /v1/capture/log, GET /v1/captures/list, GET /v1/capture/export, POST /v1/capture/bulk - pull the behavior you captured into review and training.
  • POST /v1/compile/estimate, POST /v1/compile/preview, POST /v1/compile, GET /v1/compile/:id/.kolm - turn that behavior into a signed model you own.
  • GET /v1/devices/detect, POST /v1/devices/recommend, POST /v1/devices/:id/install, POST /v1/serve - run your model on the hardware you already have.
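Several of the routes above carry an :id placeholder. If you script against them, a small helper that fills the placeholder keeps URLs consistent. This is a sketch only: expand_route is a hypothetical helper, and the host is taken from the quickstart examples.

```python
import re
import urllib.parse

API_ROOT = "https://kolm.ai"  # host used in the quickstart


def expand_route(route, **params):
    """Turn 'METHOD /path/:name' plus params into (method, full_url)."""
    method, path = route.split(" ", 1)

    def fill(match):
        name = match.group(1)
        if name not in params:
            raise KeyError("missing path parameter: " + name)
        # Percent-encode so an unusual id cannot change the path structure.
        return urllib.parse.quote(str(params[name]), safe="")

    return method, API_ROOT + re.sub(r":([A-Za-z_]+)", fill, path)
```

For example, expand_route("GET /v1/compile/:id/.kolm", id="cmp_018x") yields the same artifact path that the compile response returns in artifact_url.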

06 · ERRORS AND EDGES

Know exactly what to expect when something breaks.

We document the hard parts, not just the happy path - auth, rate limits, idempotency, unsupported targets, and what is still landing - so you know how the API behaves before a failure reaches production.

AUTH

One bearer key

Authenticate with Authorization: Bearer $KOLM_API_KEY. Create a key at /signup or from the CLI.

ERRORS

Codes you can act on

Every error returns a stable code and a clear next step - so you fix it fast instead of guessing.

IDEMPOTENCY

Retry without fear

Write routes accept an idempotency key or expose a status lookup, so retries never double up your work.
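In practice, a safe retry means generating one key per logical write and reusing it on every attempt. The sketch below shows that pattern. The Idempotency-Key header name is an assumption based on common convention, not something this page confirms, and send stands in for whatever function performs your HTTP call.

```python
import time
import uuid


def send_with_retries(send, payload, attempts=3, backoff_s=1.0, sleep=time.sleep):
    """Call send(payload, headers), retrying dropped connections.

    The same key is sent on every attempt so the server can deduplicate.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # Header name is assumed; check the API reference for the real one.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for attempt in range(attempts):
        try:
            return send(payload, headers)
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise last_error
```

The key is created outside the loop on purpose. Generating a fresh key per attempt would defeat the point, because the server would treat each retry as new work.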

WHAT'S LIVE

No guessing on scope

We say plainly what ships today and what is still landing - so you can plan against real limits, not promises.

07 · PROOF YOU CAN HAND OVER

Verify it yourself. No trust required.

Every artifact is signed and carries its own receipt. Hand the receipt to a customer, an auditor, or your security team, and they can confirm exactly what it is and how it behaves - without ever touching your workspace.

  • Every .kolm artifact is Ed25519-signed and carries content-addressed (SHA-256) receipts, so anyone can verify it.
  • Export the eval reports, manifests, and receipts and attach them to your release as proof.
  • Verify any artifact at /verify, and see every check we run at /checks.
billing-agent.kolm · SIGNED
signature: Ed25519
receipts: SHA-256
eval report: attached
release checks: passed
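Content-addressed means the receipt names the artifact by its hash, so you can check the bytes you were handed against the digest you were promised. The sketch below shows only that SHA-256 comparison. The receipt layout (a dict with a "sha256" field) is hypothetical, so see /spec for the real format. It also does not check the Ed25519 signature, which needs a third-party library in Python or the tools at /verify.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def matches_receipt(artifact_bytes, receipt):
    """True if the artifact's digest equals the receipt's digest.

    receipt["sha256"] is an assumed field name used for illustration.
    """
    expected = receipt["sha256"].lower()
    return hmac.compare_digest(sha256_hex(artifact_bytes), expected)
```

To use it, read the artifact in binary mode and pass the bytes along with the parsed receipt. A mismatch means the file is not the one the receipt describes, whatever its name says.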

08 · CLI AND SDK

Same workflow, from the terminal.

The CLI and SDK speak the same language as the API and the console - move between them freely. Sign up, compile, verify, and serve a model without leaving your shell.

kolm-cli · signup → compile → serve
$ kolm signup --email you@company.com
$ kolm login --key ks_...
$ kolm compile "billing-agent" --target local-cpu
$ kolm verify artifact.kolm
$ kolm serve artifact.kolm

READY TO OWN WHAT YOU'RE RENTING?

Point one namespace. Own your first model.

Grab a key, route a single OpenAI-compatible namespace, and follow the three-call quickstart from live calls to a signed model you run on your own hardware. It takes an afternoon and the Free plan needs no card.