The same first step, different last step
Both products start from the same place: (input, expected output) pairs. The user uploads a JSONL file or proxies traffic through a capture endpoint. Both run a supervised training loop on those pairs.
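For concreteness, here's what one captured pair might look like in OpenAI-style chat format - the ticket text and file name are illustrative, not pulled from either product's docs:

```python
import json

# one hypothetical (input, expected output) pair in OpenAI chat format
pair = {
    "messages": [
        {"role": "user", "content": "Customer says their invoice was charged twice."},
        {"role": "assistant", "content": "Apologize, confirm the duplicate, and open a refund ticket."},
    ]
}

# append one JSON object per line - the JSONL format both products ingest
with open("pairs.jsonl", "a") as f:
    f.write(json.dumps(pair) + "\n")
```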
The difference is the deliverable. OpenAI returns a model name like ft:gpt-4o-mini-2026-08-06:org:project:abc123. You get a per-token bill in exchange. kolm returns a .kolm file with weights, recipes, and a signed manifest. You get a model you can copy onto a server, a phone, or an air-gapped network.
Where OpenAI fine-tuning wins
Honest concession. OpenAI's tooling is more polished. The dashboard handles eval reports, hyperparameter sweeps, and version pinning, and the resulting model speaks the OpenAI API natively. If your stack is already on the OpenAI SDK and your bottleneck is "I need a slightly better GPT-4o-mini for this task this week," there is no faster path.
You also benefit from frontier upgrades. When OpenAI ships a new base model, your fine-tune can be re-run against it with a click. kolm targets specific open bases (Qwen, Llama, Phi). When a new base comes out, you re-compile.
Bigger context windows. A fine-tuned GPT-4o variant inherits 128K context. A 7B-class .kolm inherits whatever window its base supports, typically 32K or 128K. If your task needs 200K+ context, frontier fine-tuning is the right answer today.
Where kolm wins
You own the file. The whole point. The .kolm file lands on your hard drive. You can run it offline, ship it to a customer, deploy it on a phone, fork it, audit it, anchor its hash on-chain. None of that is possible with a hosted fine-tune.
$0 marginal inference. After the compile, you are not paying per call. You bought the artifact; running it is your hardware cost only. For high-volume tasks this crosses a break-even point fast.
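How fast? A back-of-envelope sketch - every number below is an assumption for illustration, not a quoted price:

```python
# all figures are illustrative assumptions, not quoted prices
hosted_per_1k_tokens = 0.0012   # assumed hosted fine-tune rate, $ per 1K tokens
tokens_per_call = 800           # assumed average call size
gpu_box_per_month = 400.0       # assumed amortized cost of your own hardware

hosted_cost_per_call = hosted_per_1k_tokens * tokens_per_call / 1000
breakeven = gpu_box_per_month / hosted_cost_per_call
print(f"break-even at ~{breakeven:,.0f} calls/month")  # ~416,667 under these assumptions
```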
Privacy and sovereignty. The captured pairs and the resulting model never leave your control once compiled. For HIPAA, finance, defense, or any data-residency regime, this is the difference between "we can ship" and "we can't ship."
Receipts. Every output a .kolm produces ships with an HMAC-SHA256 receipt chain. OpenAI does not sign individual outputs.
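kolm's internals aren't quoted here, so treat this as a generic sketch of how an HMAC-SHA256 receipt chain works: each receipt MACs the new output together with the previous receipt, so tampering with any past output breaks every later link. The function and the genesis value are hypothetical:

```python
import hashlib
import hmac

def next_receipt(key: bytes, prev_receipt: bytes, output: str) -> bytes:
    """Chain a new output onto the previous receipt (hypothetical scheme)."""
    return hmac.new(key, prev_receipt + output.encode(), hashlib.sha256).digest()

key = b"shared-secret"   # placeholder key
r0 = b"\x00" * 32        # genesis receipt
r1 = next_receipt(key, r0, "first answer")
r2 = next_receipt(key, r1, "second answer")  # verifiers replay the chain
```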
Portability across stacks. The .kolm file runs through llama.cpp, vLLM, MLX, or our own runtime. It is not bound to any vendor.
Side-by-side
| | OpenAI Fine-Tuning | kolm |
|---|---|---|
| What it is | Hosted fine-tune of GPT-4o family | Compile to portable signed artifact |
| Output | A model name on OpenAI servers | A .kolm file (≤3 GB) |
| Where it runs | OpenAI servers only | Anywhere - server, phone, air-gapped |
| Inference cost | ~3-8x base GPT-4o-mini per-token rate | $0 marginal (your hardware) |
| Base model | GPT-4o-mini, GPT-4o (closed) | Qwen, Llama, Phi (open weights) |
| Context window | 128K inherited | 32K-128K depending on base |
| Receipts / signing | No | HMAC-SHA256 chain on every output |
| Data residency | Pairs uploaded to OpenAI | Pairs stay on your namespace; artifact yours |
| Vendor lock | Model dies if account dies | File is yours forever |
| Eval tooling | First-class - dashboard + sweeps | K-score gate on held-out test set |
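That K-score gate is what the k_score=0.89 in the compile output below refers to. The metric's exact definition isn't published here, so the mean-score assumption and the 0.85 threshold in this sketch are placeholders:

```python
def k_score_gate(held_out_scores: list[float], threshold: float = 0.85) -> bool:
    """Pass only if the held-out K-score clears the gate (assumed: mean score)."""
    k = sum(held_out_scores) / len(held_out_scores)
    return k >= threshold
```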
When to use OpenAI fine-tuning
Use OpenAI fine-tuning when the task needs frontier-grade reasoning, the data residency story is "OpenAI is fine," and the per-call cost is acceptable for your volume. The 128K context, the polished dashboard, and the API-compatibility-by-default are all real advantages.
```python
# classic OpenAI fine-tune flow
import openai

# upload the (input, expected output) pairs
f = openai.files.create(file=open("pairs.jsonl", "rb"), purpose="fine-tune")

# start the supervised fine-tune against the uploaded file
openai.fine_tuning.jobs.create(training_file=f.id, model="gpt-4o-mini-2026-08-06")
```
When to use kolm
Use kolm when the task needs a portable model you own. Privacy regulations, on-device deployment, air-gapped networks, $0-marginal-cost inference, or simply not wanting your model to live on someone else's servers - all good reasons to compile.
```bash
# point traffic at the kolm capture proxy:
export OPENAI_BASE_URL=https://kolm.ai/v1/capture/openai

# once enough pairs accumulate, compile:
kolm compile "answer support tickets" \
  --namespace support \
  --base qwen2.5-7b

# compile output:
# ok wrote support.kolm k_score=0.89 signature=hmac-sha256
```
Can I use both?
Yes. Many teams will keep an OpenAI fine-tune for the long-tail or top-of-funnel reasoning while compiling specific high-volume sub-tasks into .kolm files. The kolm capture proxy preserves the upstream call, so there's no compromise on the OpenAI side until you flip to the local artifact.
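Concretely, the only client-side change is the base URL. A sketch assuming the standard OpenAI Python SDK (the prompt is illustrative):

```python
from openai import OpenAI

# route calls through the kolm capture proxy; the proxy records the
# (input, output) pair and forwards upstream, so responses are unchanged
client = OpenAI(base_url="https://kolm.ai/v1/capture/openai")

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize ticket #4312"}],
)
```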
Verdict
If your task needs frontier reasoning at 128K and you're fine on data residency, use OpenAI fine-tuning. The dashboard is polished, the upgrade path is clean.
If your task can fit in a 7B-class model and you want the file, use kolm. You will pay once, run forever, and own the deliverable.
Adjacent comparisons: vs fine-tuning (general) · vs Together · vs LangSmith · full comparison table