## The job each one does
Both products start from the same insight: every dollar you spend on a frontier API also buys a labelled training pair. Capture those pairs, train a smaller model, and route the cheap traffic locally. OpenPipe shipped this in 2023 and the loop works.
The difference is what comes out of the pipe. OpenPipe gives you a hosted endpoint - swap your OPENAI_BASE_URL and you're done; the model lives in their cloud. kolm gives you a 1-3 GB file. The file holds the model, a LoRA, a verifier that checks each output, a sqlite-vec recall index, recipe packs for fast paths, and an HMAC-signed manifest. You can run it on your own machine, your customer's machine, an air-gapped server, or a phone.
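The HMAC-signed manifest is what makes the file trustworthy once it leaves your infrastructure. kolm's actual manifest layout isn't published here, so the sketch below is illustrative: it shows the general HMAC-SHA256 sign-and-verify pattern over a canonicalized JSON manifest, with all field names hypothetical.

```python
import hashlib
import hmac
import json

def sign_manifest(manifest: dict, key: bytes) -> str:
    """Sign a manifest with HMAC-SHA256.

    Canonical JSON (sorted keys, no whitespace) ensures the signer and
    verifier hash identical bytes regardless of dict ordering.
    """
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_manifest(manifest: dict, key: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_manifest(manifest, key)
    return hmac.compare_digest(expected, signature)

# Hypothetical manifest fields, for illustration only
manifest = {"model": "phi-redactor", "base": "qwen2.5-7b", "k_score": 0.93}
key = b"shared-secret"
signature = sign_manifest(manifest, key)
```

Any tampering with the file's manifest (say, inflating the quality score) changes the canonical bytes, so `verify_manifest` returns `False` for the original signature.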
If you're optimizing for "make this OpenAI call cheaper next quarter," OpenPipe is excellent. If you're optimizing for "this model becomes our product, runs in our customer's environment, and survives our cloud being down," that's what compile-to-file is for.
## Where OpenPipe wins
Honest concession. OpenPipe is a more polished hosted offering than kolm cloud is today. They've been at this longer, the dashboard is mature, the auto-eval suite for fine-tunes is real, the Mistral / Llama / GPT-4o-mini fine-tune coverage is broad, and the pay-per-token economics are straightforward. If your team only cares about cost per inference and you're happy hosting in their cloud, OpenPipe is a good answer.
Where the model has to leave their cloud (offline, on-device, regulated, customer-owned), the hosted-only architecture stops working. That's where the .kolm file format earns its keep.
## Side-by-side
| | OpenPipe | kolm |
|---|---|---|
| What it is | Capture + fine-tune-as-a-service | Capture + compile to portable artifact |
| Capture loop | yes - drop-in OpenAI proxy | yes - drop-in proxy for OpenAI + Anthropic |
| Output | Hosted endpoint URL | Signed .kolm file (≤3 GB) |
| Runs offline | no - hosted only | yes - laptop, phone, air-gap |
| You own the weights | export available on higher tiers | yes - the file is yours, period |
| Quality gate | Auto-eval over capture set | K-score on a held-out test set, gated at 0.70 default |
| Receipts / signing | no | HMAC-SHA256 receipt chain on every output |
| Bundled recall (RAG) | no - bring your own | yes - sqlite-vec index ships in the file |
| Recipe / draft cache | no | yes - deterministic drafts for sub-100ms hot paths |
| Pricing model | Per token through their cloud | Flat per compile, then $0 marginal inference |
| Lock-in | Endpoint stops if account stops | File survives. Spec is RS-1 MIT. Verifier is open. |
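The K-score gate in the table is a pass/fail threshold over a held-out test set: if the compiled model's average score doesn't clear 0.70 (the stated default), the compile shouldn't ship. The scoring function itself isn't specified here, so this is only a minimal sketch of the gating step, with hypothetical names.

```python
def passes_gate(k_scores: list[float], threshold: float = 0.70) -> bool:
    """Gate a compile on its mean K-score over a held-out test set.

    `k_scores` holds one per-example score; an empty test set fails
    closed rather than passing a model that was never evaluated.
    """
    if not k_scores:
        return False
    return sum(k_scores) / len(k_scores) >= threshold
```

The useful property of a hard gate like this is that it fails closed: a compile with no evaluation data, or one that regresses below the threshold, never produces a shippable artifact.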
## When to use OpenPipe
Use OpenPipe when your model only ever needs to run inside a hosted endpoint and your goal is to cut the per-token bill on traffic you already send to OpenAI.
```shell
# swap base URL, capture, train, save 5-10x:
OPENAI_BASE_URL=https://api.openpipe.ai/v1

# after ~10k captures, fine-tune a smaller model
# OpenPipe routes future calls to the smaller model
```
## When to use kolm
Use kolm when the model needs to leave the cloud - or you need a signed artifact you can hand to a customer, an auditor, or a regulator.
```shell
# 1. capture frontier traffic (same drop-in pattern)
ANTHROPIC_BASE_URL=https://kolm.ai/v1/capture/anthropic

# 2. compile after enough pairs accumulate
kolm compile "PHI redaction for clinical notes" \
  --namespace clinical-notes \
  --base qwen2.5-7b
ok wrote phi-redactor.kolm k_score=0.93 signature=hmac-sha256

# 3. ship the file. it runs anywhere.
kolm run phi-redactor.kolm "patient John Smith, DOB 1985..." --receipt
```
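The `--receipt` flag ties into the receipt chain from the table: each output carries an HMAC-SHA256 digest that also commits to the previous receipt, so deleting or reordering any output breaks every later link. kolm's wire format for receipts isn't documented here; this is a minimal sketch of the chaining idea under that assumption, with hypothetical names.

```python
import hashlib
import hmac

def append_receipt(chain: list[str], output: str, key: bytes) -> str:
    """Append a receipt that commits to the previous receipt and this output."""
    prev = chain[-1] if chain else "genesis"
    receipt = hmac.new(key, f"{prev}|{output}".encode(), hashlib.sha256).hexdigest()
    chain.append(receipt)
    return receipt

def verify_chain(chain: list[str], outputs: list[str], key: bytes) -> bool:
    """Rebuild the chain from the raw outputs and compare link by link."""
    rebuilt: list[str] = []
    for output in outputs:
        append_receipt(rebuilt, output, key)
    return rebuilt == chain
```

Because every receipt folds in its predecessor, an auditor holding the key can replay the outputs and detect any edit, insertion, or deletion anywhere in the history, not just in the last entry.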
## Can I use both?
Yes, and it's a reasonable composition. Capture through OpenPipe to get a fine-tuned hosted model for cloud traffic; capture through kolm to get a portable artifact for the cases where the model needs to be in the customer's environment. Same training pairs serve both deliverables. The traffic isn't exclusive - you can dual-write to both.
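Dual-writing is simple because a training pair is just a prompt/completion record appended to each capture sink. Neither product's internal capture schema is specified here, so the sketch below uses a generic JSONL format with hypothetical field names to show the fan-out shape.

```python
import json
from pathlib import Path

def record_pair(prompt: str, completion: str, sinks: list[Path]) -> None:
    """Append one labelled training pair to every capture sink.

    Each sink is a JSONL file standing in for a capture destination
    (one feeding a hosted fine-tune, one feeding a local compile).
    """
    line = json.dumps({"prompt": prompt, "completion": completion})
    for sink in sinks:
        with sink.open("a", encoding="utf-8") as f:
            f.write(line + "\n")
```

The same frontier response lands in both sinks, so one stream of paid API traffic funds both the hosted fine-tune and the portable artifact.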
Or: if you've already adopted OpenPipe and just need a signed offline copy of the model for a regulated customer, point kolm compile at your OpenPipe-exported weights as the base and let kolm wrap them with the verifier + recall + receipt chain.
## Verdict
If your only constraint is cost and the model can live in someone else's cloud, OpenPipe is the cleaner answer. Hosted, mature, easy.
If the model has to ship - to a phone, an enterprise's VPC, an offline device, a regulated environment - the file format is the difference: a .kolm file is something you can hand over. A hosted endpoint isn't.
Adjacent comparisons: vs fine-tuning · vs Predibase · vs LangSmith · full comparison table