Bring your own key. Pay for what compiles.
The cloud runs the four engines so your laptop doesn't have to. Same compile pipeline, same artifact format, same K-score, plus a verified-inference wrap that turns any Anthropic / OpenAI / Mistral key into a self-checking call: k-sample, deterministically verify, accept the winner, return a receipt.
Four engines, one pipeline.
Each engine is a stateless service behind one HTTP boundary; the orchestrator schedules them per compile job. Everything is observable in /account.
multimodal index
BM25 + vector + RRF + cross-encoder rerank (fusion sketched below). BGE-M3 1024d INT8. Per-tenant namespace.
verified inference
k-sample a frontier model, deterministically verify, accept the winner, return a signed receipt.
recipe drafts
Extract the structured-token subset of a model's behavior into a deterministic, registry-indexed draft pack.
artifact assembly
Bundle base GGUF + LoRA + recipes + sqlite-vec index + manifest, sign with an HMAC chain (sketched below), ship a single .kolm.
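To make the multimodal-index card concrete, here is a minimal reciprocal rank fusion sketch in Python. The function and its inputs are illustrative, not the Cloud API; it only shows how a BM25 ranking and a vector ranking can be fused before the cross-encoder rerank.

```python
# Minimal RRF sketch; illustrative only, not the Cloud API.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked doc-id lists into one.

    Each doc scores sum(1 / (k + rank)) across the lists it
    appears in; k=60 is the usual default from the RRF paper.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a BM25 ranking with a vector-search ranking, then hand
# the top slice of the fused list to the cross-encoder reranker.
fused = rrf_fuse([["a", "b", "c"], ["b", "a", "d"]])
```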
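And a hypothetical reading of the HMAC chain in the artifact-assembly card: each bundle member's tag is keyed on the previous tag, so tampering with or reordering any file invalidates every tag after it. The real .kolm layout and key handling aren't shown here.

```python
# Hypothetical HMAC-chain sketch; the real .kolm signing scheme
# and key handling are Kolm's and may differ.
import hashlib
import hmac
from pathlib import Path

def hmac_chain(secret: bytes, members: list[Path]) -> list[bytes]:
    tags: list[bytes] = []
    prev = b"\x00" * 32  # fixed genesis value for the first link
    for member in members:
        digest = hashlib.sha256(member.read_bytes()).digest()
        # Chain: each tag covers this file's digest plus the prior tag.
        prev = hmac.new(secret, prev + digest, hashlib.sha256).digest()
        tags.append(prev)
    return tags  # the manifest would record these alongside the members
```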
Measured latency, west region.
Verified inference, drop-in.
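A minimal sketch of the loop described above, assuming placeholder `call_model` and `verify` callables; the real wrap, verifier, and receipt schema are Kolm Cloud's and aren't shown on this page.

```python
# k-sample -> deterministically verify -> accept winner -> receipt.
# `call_model` and `verify` are placeholders, not the Kolm API.
import hashlib
import time
from typing import Callable

def verified_call(call_model: Callable[[str], str],
                  verify: Callable[[str], bool],
                  prompt: str, k: int = 3) -> dict:
    samples = [call_model(prompt) for _ in range(k)]  # k-sample
    winners = [s for s in samples if verify(s)]       # deterministic check
    if not winners:
        raise RuntimeError("no sample passed verification")
    winner = winners[0]                               # accept the winner
    receipt = {                                       # return a receipt
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(winner.encode()).hexdigest(),
        "k": k,
        "passed": len(winners),
        "ts": int(time.time()),
    }
    return {"output": winner, "receipt": receipt}
```

Drop-in here means your existing key and client stay put; only the call path changes.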
Or just compile in the browser.
Drag a folder onto the Cloud dashboard, write a task in the textbox, click Compile.
The pipeline runs on our hardware and emits a downloadable .kolm with full K-score breakdown.
Your data is namespaced per tenant, your API key is yours, your model bill stays on your account.
Pricing.
Free
- 1 Specialist / mo
- 10k corpus rows
- 4B-class open base only
- 100k vectors managed
- unlimited kolm run
Mobile
- 1 personal Specialist / mo
- 100k corpus rows
- 1M vectors managed
- kolm.app phone runtime
- unlimited on-device
Pro
- unlimited Specialists
- 1M rows each
- all open bases incl. Hermes-3
- 10M vectors managed
- full CLI + Cloud + MCP
Enterprise
- private base models
- on-prem training bridge
- private Trieve cluster
- dedicated GPU embedders
- on-chain receipt anchoring
You always pay your own frontier API bill on top; we're the compiler, not the model host.