What is the same
Both produce a model that performs better on your task than the cold base model. Both consume labeled examples. Both use a teacher (the frontier) to produce labels for a student (the customized model). Both can use LoRA-style parameter-efficient methods.
The mechanism overlaps. The contract does not.
Side-by-side
| | Vendor fine-tune | kolm |
|---|---|---|
| Output | Model ID on vendor cloud | .kolm file (≤3 GB) |
| Where weights live | Vendor cluster only | Your disk, your phone, your edge box |
| Run cost | Per-token, often 4–8× the base model | $0 after compile |
| Vendor lock | Total: only one provider runs it | None: any GGUF runtime works |
| Deprecation risk | Vendor sunsets the base, your fine-tune dies with it | File outlives any vendor's product schedule |
| Offline / air-gap | No | Yes |
| Choice of teacher | Locked to whichever frontier the vendor uses | Any frontier with a key (Anthropic, OpenAI, Mistral, etc.) |
| Choice of base | Vendor's catalogue only (GPT-4o-mini, etc.) | Any open-weight base (Qwen2.5, Llama-3, Phi-3, Hermes-3) |
| Receipts / signing | None: you trust the vendor's logs | HMAC-SHA256 chain over manifest → output |
| Data leaves your network | Always, for training and for every inference | Training only (opt-in cloud or self-host); inference is local |
| Audit trail | Vendor invoice; vendor logs | Cryptographic chain you control |
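The receipt chain in the table can be illustrated with a minimal sketch. The record fields and key handling here are illustrative assumptions, not kolm's actual receipt format; the point is the mechanism: each HMAC-SHA256 tag covers its record plus the previous tag, so tampering with any step breaks every tag after it.

```python
import hmac
import hashlib
import json

def chain_receipts(key: bytes, records: list) -> list:
    """Build an HMAC-SHA256 chain: each tag covers the record plus the previous tag.
    (Illustrative sketch; kolm's real manifest schema may differ.)"""
    tags = []
    prev = b""
    for rec in records:
        msg = prev + json.dumps(rec, sort_keys=True).encode()
        tag = hmac.new(key, msg, hashlib.sha256).hexdigest()
        tags.append(tag)
        prev = tag.encode()
    return tags

def verify_chain(key: bytes, records: list, tags: list) -> bool:
    """Recompute the chain and compare in constant time.
    A tampered record invalidates its own tag and every later one."""
    if len(records) != len(tags):
        return False
    recomputed = chain_receipts(key, records)
    return all(hmac.compare_digest(a, b) for a, b in zip(recomputed, tags))
```

Because each tag chains over its predecessor, an auditor holding only the key and the final tag can detect any edit anywhere in the history.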
The lock-in math
OpenAI fine-tuning charges 4–8× the base model price for inference. At any meaningful volume, that becomes a per-month bill. Worse: when the base model deprecates (every 12–18 months), your fine-tune dies with it. You re-train. You re-pay. You re-validate.
kolm pays the frontier teacher cost once, at compile time, then runs the resulting student locally for $0 per token forever. The open-weight base model (Qwen2.5, Llama-3) cannot be deprecated by anyone; it sits in your filesystem.
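The break-even arithmetic above can be made concrete. The prices below are illustrative assumptions, not quoted rates from any vendor:

```python
def break_even_months(compile_cost: float, vendor_price_per_mtok: float,
                      local_price_per_mtok: float, mtok_per_month: float) -> float:
    """Months until a one-time compile cost beats a recurring per-token bill."""
    monthly_saving = (vendor_price_per_mtok - local_price_per_mtok) * mtok_per_month
    return compile_cost / monthly_saving

# Illustrative: a $200 one-time compile vs $1.20/Mtok vendor inference,
# $0/Mtok local, at 500 Mtok/month of traffic.
months = break_even_months(200.0, 1.20, 0.0, 500.0)  # ≈ 0.33 months
```

At those assumed numbers the compile pays for itself in the first month; below roughly 170 Mtok/month it takes longer, and the recurring vendor bill never stops either way.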
How a kolm compile differs from a fine-tune
A vendor fine-tune trains the vendor's own base on your labels. A kolm compile trains an open-weight base on labels generated by a vendor frontier. The frontier is the teacher; the local model is the student. The student is what runs.
```
# vendor fine-tune (e.g. OpenAI)
your labels -> vendor cloud -> vendor model ID
# to use it: pay per token, vendor side, every request

# kolm compile
your labels -> k-sample teacher (any frontier) -> verified pairs
                                                      |
                                                      v
                                        LoRA student (open base)
                                                      |
                                                      v
                                                 .kolm file
# to use it: run locally, $0, byte-identical forever
```
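The "k-sample teacher → verified pairs" step can be sketched as a quorum check: query the teacher several times and keep only labels the samples agree on. The function name, quorum rule, and teacher interface here are assumptions for illustration; kolm's actual verification may differ.

```python
from collections import Counter

def k_sample_verify(prompt: str, teacher, k: int = 5, quorum: int = 3):
    """Query the teacher k times; keep the (prompt, label) pair only if
    at least `quorum` samples agree. Disagreement drops the example
    rather than training the student on a noisy label."""
    answers = [teacher(prompt) for _ in range(k)]
    best, count = Counter(answers).most_common(1)[0]
    return (prompt, best) if count >= quorum else None
```

A deterministic teacher passes trivially; a teacher that gives a different answer every time produces no verified pair, which is the desired behavior.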
When a vendor fine-tune is the right answer
Use vendor fine-tuning when (a) the vendor's own base model is the only one whose quality ceiling is high enough for your task, (b) you don't care about per-token cost, and (c) you have no need for offline operation, sovereignty, or a signed audit trail. Most chat applications fit this profile.
When kolm is the right answer
Use kolm when you need any one of: offline operation, signed receipts, vendor-independent ownership, sub-cent inference economics, or guarantee against vendor model deprecation. The compile step pays for itself the first month at meaningful volume.
Verdict
Both customize a model. Only one of them gives you the model.
If the question is "should I fine-tune?", the more useful question is "do I want a model ID or a file?" If a file: kolm compile.
Adjacent comparisons: vs Ollama · vs RAG · full comparison table