ROI calculator

The cloud-vs-kolm math,
with the teacher cost counted honestly.

Plug in your monthly call volume, token mix, and frontier price. kolm is a one-time compile cost, not a per-call cost. Year one absorbs the teacher bill once; in year two and after, the same artifact runs at zero marginal token cost. The numbers below are unrounded, and the formulas are spelled out below the calculator.

Preset 01
Support summary - 100k MAU.
12 calls/MAU - 800/250 tok
Preset 02
Document extraction - 1M docs/yr.
3.2k/600 tok per doc
Preset 03
Agent inner loop - 50 devs.
600 calls/dev/day - 2.5k/900 tok
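
For orientation, the three presets reduce to monthly call volumes like this (a sketch: the 22-workdays-per-month figure for the agent preset is an assumption, not something the page states):

```python
# Preset 01: support summary — MAU times calls per MAU.
support_calls = 100_000 * 12          # 1,200,000 calls/month

# Preset 02: document extraction — yearly docs spread over 12 months.
extraction_calls = 1_000_000 // 12    # ~83,333 calls/month

# Preset 03: agent inner loop — devs × calls/dev/day × workdays (assumed 22).
agent_calls = 50 * 600 * 22           # 660,000 calls/month
```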

Inputs.

Monthly call volume, average tokens per call, and your current frontier price. Defaults are the customer-support preset.

1.2M = 100K MAU × 12 calls/user/month.
Most LLM features sit at 500-3000 in, 200-900 out.
Default = Sonnet-class pricing. Set 5.00 / 25.00 (in/out, per 1M tokens) for Opus-class.
All tiers include unlimited local-runtime inference.
Most teams recompile monthly (12×/yr) or quarterly (4×/yr).
k samples per training row. 8 is the default; 16 for the strictest verifier.
Most workflows top out at 1.5K-5K verified-label rows.
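
The compile-time teacher bill follows directly from the two knobs above (a sketch: `k = 8` and the 1.5K-row floor are defaults from the text; the 800-token input is the support-summary preset):

```python
def teacher_tokens(k: int, rows: int, in_tokens: int) -> int:
    # k teacher samples per verified-label row, each consuming the
    # row's input tokens at the frontier.
    return k * rows * in_tokens

# Default k = 8 on a workflow at the 1.5K-row floor with 800-token inputs.
tokens = teacher_tokens(k=8, rows=1500, in_tokens=800)  # 9,600,000 tokens
```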

Output.

All numbers are dollars per year. Kept dollars compound; from year two onward the teacher cost drops to near-zero unless you choose to recompile.

Cloud frontier - per-call inference
-
Cloud egress + variance buffer (8%)
-
Year-1 cloud-only total
-
kolm plan (12 mo)
-
Frontier teacher (k-sample at compile, year 1)
-
On-device inference
$0
Year-1 kolm total
-
-
cheaper in year one. From year two onward the gap widens, because the teacher bill drops to whatever your chosen recompile cadence costs.
-

What's in the math, what isn't.

In: per-call frontier cost (input + output token prices), the ~8% egress + variance buffer on the cloud lane, the kolm plan cost, and the frontier teacher cost during the compile phase (k × rows × in_tokens, priced at the same frontier rate).
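
Put together, the year-1 comparison reads roughly like this (a sketch: `plan_cost` is a placeholder, not a published price; the $3 / $15 per 1M tokens Sonnet-class rates are assumptions; and the teacher term follows the k × rows × in_tokens formula above):

```python
def year1_cloud(calls_per_month: float, in_tok: float, out_tok: float,
                p_in: float, p_out: float, buffer: float = 0.08) -> float:
    """Annual frontier spend plus the ~8% egress + variance buffer.
    Prices are dollars per 1M tokens."""
    per_call = (in_tok * p_in + out_tok * p_out) / 1_000_000
    return calls_per_month * per_call * 12 * (1 + buffer)

def year1_kolm(plan_cost: float, k: int, rows: int,
               in_tok: float, p_in: float) -> float:
    """Plan cost plus the one-time frontier teacher bill at compile.
    On-device inference adds $0 marginal token cost."""
    teacher = k * rows * in_tok * p_in / 1_000_000
    return plan_cost + teacher

# Support-summary preset at assumed Sonnet-class rates ($3 in / $15 out).
cloud = year1_cloud(1_200_000, 800, 250, 3.00, 15.00)   # ≈ 95,644.80
kolm = year1_kolm(plan_cost=12_000.0,  # placeholder: your tier's 12-mo price
                  k=8, rows=1500, in_tok=800, p_in=3.00)
```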

Out: engineer time, GPU rental during fine-tune (Pro tier handles this on the bridge), recall index population (one-time, scales with corpus size, not call volume), and the cost of not compiling (cloud variance during incidents, prompt regressions in production, vendor-side rate-limit incidents).

The honest assumption: tasks suited to kolm are tasks that are 80%+ deterministic on labeled examples. If your task is mostly free-form, open-ended generation, the math swings the other way; check /k-score for the gating logic, or open a thread in GitHub Discussions with your task shape before you trust this calculator.

Next move
if the math says yes.

A 30-day signed pilot. One workflow. One .kolm. K-score gate. The artifact stays with you whether or not we expand; that's the contract.