ROI calculator

The cloud-vs-kolm math,
with the teacher cost counted honestly.

Plug in your monthly call volume, token mix, and frontier price. kolm is a one-time compile cost, not a per-call cost. Year one absorbs the teacher bill once; in year two and after, the same artifact runs at zero marginal token cost. The numbers below are unrounded, and the formulas are spelled out below the calculator.

Preset 01
Support summary - 100k MAU.
12 calls/MAU - 800/250 tok
Preset 02
Document extraction - 1M docs/yr.
3.2k/600 tok per doc
Preset 03
Agent inner loop - 50 devs.
600 calls/dev/day - 2.5k/900 tok
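
For orientation, the three presets reduce to monthly call volumes like this (a sketch: the 22-workdays-per-month figure for the agent preset is an assumption, not something the page states):

```python
# Preset 01: support summary — MAU times calls per MAU.
support_calls = 100_000 * 12          # 1,200,000 calls/month

# Preset 02: document extraction — yearly docs spread over 12 months.
extraction_calls = 1_000_000 // 12    # ~83,333 calls/month

# Preset 03: agent inner loop — devs × calls/dev/day × workdays (assumed 22).
agent_calls = 50 * 600 * 22           # 660,000 calls/month
```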

Inputs.

Monthly call volume, average tokens per call, and your current frontier price. Defaults are the customer-support preset.

1.2M = 100K MAU × 12 calls/user/month.
Most LLM features sit at 500-3000 in, 200-900 out.
Default = Sonnet-class pricing. Set 5.00 / 25.00 (in/out, per 1M tokens) for Opus-class.
All tiers include unlimited local-runtime inference.
Most teams recompile monthly (12×/yr) or quarterly (4×/yr).
k samples per training row. 8 is the default; 16 for the strictest verifier.
Most workflows top out at 1.5K-5K verified-label rows.
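
The compile-time teacher bill follows directly from the two knobs above (a sketch: `k = 8` and the 1.5K-row floor are defaults from the text; the 800-token input is the support-summary preset):

```python
def teacher_tokens(k: int, rows: int, in_tokens: int) -> int:
    # k teacher samples per verified-label row, each consuming the
    # row's input tokens at the frontier.
    return k * rows * in_tokens

# Default k = 8 on a workflow at the 1.5K-row floor with 800-token inputs.
tokens = teacher_tokens(k=8, rows=1500, in_tokens=800)  # 9,600,000 tokens
```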

Output.

All numbers are dollars per year. Kept dollars compound; from year two onward the teacher cost drops to near-zero unless you choose to recompile.

Cloud frontier - per-call inference
-
Cloud egress + variance buffer (8%)
-
Year-1 cloud-only total
-
kolm plan (12 mo)
-
Frontier teacher (k-sample at compile, year 1)
-
On-device inference
$0
Year-1 kolm total
-
-
cheaper in year one. From year two onward the gap widens, because the teacher bill drops to whatever your chosen recompile cadence costs.
-

What's in the math, what isn't.

In: per-call frontier cost (input + output token prices), the ~8% egress + variance buffer on the cloud lane, the kolm plan cost, and the frontier teacher cost during the compile phase (k × rows × in_tokens, priced at the same frontier rate).
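
Put together, the year-1 comparison reads roughly like this (a sketch: `plan_cost` is a placeholder, not a published price; the $3 / $15 per 1M tokens Sonnet-class rates are assumptions; and the teacher term follows the k × rows × in_tokens formula above):

```python
def year1_cloud(calls_per_month: float, in_tok: float, out_tok: float,
                p_in: float, p_out: float, buffer: float = 0.08) -> float:
    """Annual frontier spend plus the ~8% egress + variance buffer.
    Prices are dollars per 1M tokens."""
    per_call = (in_tok * p_in + out_tok * p_out) / 1_000_000
    return calls_per_month * per_call * 12 * (1 + buffer)

def year1_kolm(plan_cost: float, k: int, rows: int,
               in_tok: float, p_in: float) -> float:
    """Plan cost plus the one-time frontier teacher bill at compile.
    On-device inference adds $0 marginal token cost."""
    teacher = k * rows * in_tok * p_in / 1_000_000
    return plan_cost + teacher

# Support-summary preset at assumed Sonnet-class rates ($3 in / $15 out).
cloud = year1_cloud(1_200_000, 800, 250, 3.00, 15.00)   # ≈ 95,644.80
kolm = year1_kolm(plan_cost=12_000.0,  # placeholder: your tier's 12-mo price
                  k=8, rows=1500, in_tok=800, p_in=3.00)
```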

Out: engineer time, GPU rental during fine-tune (Pro tier handles this on the bridge), recall index population (one-time, scales with corpus size, not call volume), and the cost of not compiling (cloud variance during incidents, prompt regressions in production, vendor-side rate-limit incidents).

The honest assumption: tasks suited to kolm are tasks that are 80%+ deterministic on labeled examples. If your task is mostly free-form, open-ended generation, the math swings the other way; check /k-score for the gating logic, or open a thread in GitHub Discussions with your task shape before you trust this calculator.

Next move
if the math says yes.

A 30-day signed pilot. One workflow. One .kolm. K-score gate. The artifact stays with you whether or not we expand; that's the contract.