PRICING · SAVINGS-ALIGNED

You pay only when we cut your bill.

Two ways to buy. Either a fixed base + a % of measured savings (the autopilot path), or transparent per-unit rates if you'd rather not share traffic data. Free tier on both. No card to start.

Path A · Pay from savings (recommended)

Wrap your AI client and let the autopilot run. We measure your spend before and after and bill a cut of the difference. If we save you nothing, you pay no savings cut; on paid plans only the fixed base applies.

FREE · FOREVER
$0
no card · no time limit
  • 10,000 recipe-calls / month
  • 5 synthesized recipes
  • Public registry only
  • Anonymous CLI bootstrap (robots welcome)
  • Community support
Mint a key →

GROWTH
$99 /mo + 10% of savings
most AI-native startups land here
  • Unlimited recipe-calls
  • Unlimited recipes
  • Private registry & tenant isolation
  • Autopilot mode (observe → cluster → replace)
  • Weekly savings receipt — dollar-denominated
  • Specialists waitlist priority
  • Email support
Start Growth →

SCALE
$1,000 /mo + 5% of savings
capped at 50% of your bill
  • Specialists (LoRA fine-tunes) included
  • Dedicated synthesis priority
  • VPC / on-prem option
  • SLA: 99.9%
  • SOC 2 Type II (Q2 2027)
  • Dedicated support engineer
Talk to us →

A worked example — how the savings cut actually shakes out

Your current Claude bill | $2,000 / mo
Recipe replaces 78% of calls (typical) | $1,560 saved
Your new Claude bill | $440 / mo
Growth base | + $99 / mo
Growth savings cut (10% of $1,560) | + $156 / mo
What you actually pay (Recipe + Claude) | $695 / mo
Net savings, monthly | $1,305 / mo (65% off)
Annualized | $15,660 / yr

The numbers above assume an average cost of $0.0008 per call (Claude Sonnet 4.6, ~250 input tokens, ~5 output tokens). Plug your real numbers into the calculator.
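The arithmetic in the worked example can be reproduced in a few lines. This is a minimal sketch in Python: the function name `growth_bill` and its parameters are illustrative and not part of any Recipe SDK. It uses only the Growth figures quoted above and does not model the payment cap.

```python
def growth_bill(llm_bill, replaced_share, base=99.0, cut_rate=0.10):
    """Reproduce the worked example (hypothetical helper, dollars per month).

    llm_bill       -- your current monthly LLM bill
    replaced_share -- fraction of spend that recipes replace (0.78 in the example)
    base, cut_rate -- Growth plan figures: $99 base and 10% of savings
    """
    saved = llm_bill * replaced_share            # spend avoided by recipes
    new_llm_bill = llm_bill - saved              # what you still pay the LLM vendor
    savings_cut = saved * cut_rate               # percentage billed on savings
    total = new_llm_bill + base + savings_cut    # Recipe + LLM combined
    net = llm_bill - total                       # net monthly savings
    return {
        "saved": round(saved, 2),
        "new_llm_bill": round(new_llm_bill, 2),
        "savings_cut": round(savings_cut, 2),
        "total": round(total, 2),
        "net_monthly": round(net, 2),
        "net_percent": round(100 * net / llm_bill),
        "net_annual": round(net * 12, 2),
    }


example = growth_bill(2000, 0.78)
for item, amount in example.items():
    print(f"{item}: {amount}")
```

Swapping in your own bill and replacement share gives a first estimate before you open the calculator.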


Path B · Per-unit rates (no savings sharing)

If you'd rather not wire your traffic through our autopilot, pay as you go at per-unit rates. Same engine, no savings telemetry, no % cut.

Layer | Unit | Price
Build (synthesis) | per accepted recipe under 1 KB | $0.10
Build (synthesis) | per accepted recipe 1–32 KB | $1.00
Storage | per GB-month registry | $0.01
Storage | per million registry reads | $0.10
Run | per million recipe-calls (cold) | $0.20
Run | per million recipe-calls (cache hit) | $0.05
Auto-label (soon) | per 1k rows after 10k free | $1.00
Specialist train (soon) | per LoRA trained | from $40
Specialist host (soon) | per active LoRA / month | $20

For a yes/no classifier at 1M calls/month: Claude Sonnet 4.6 ~$675 · Cohere classify ~$2,000 · Recipe per-unit ~$0.20 (1M cold recipe-calls at the Run rate).
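To estimate a Path B bill for your own volumes, the per-unit rates in the table can be combined as below. This is a rough sketch, not an official calculator: the function `path_b_bill`, its parameter names, and the idea of a single cache-hit share are assumptions made for illustration, and the free-tier allowance and the "soon" line items are left out.

```python
# Per-unit rates copied from the Path B table, in dollars.
RATES = {
    "run_cold_per_million": 0.20,
    "run_cache_hit_per_million": 0.05,
    "storage_per_gb_month": 0.01,
    "registry_reads_per_million": 0.10,
    "build_under_1kb": 0.10,
    "build_1_to_32kb": 1.00,
}


def path_b_bill(calls, cache_hit_share=0.0, storage_gb=0.0,
                registry_reads=0, small_recipes=0, large_recipes=0):
    """Estimate one month of Path B charges (hypothetical helper)."""
    hit_calls = calls * cache_hit_share          # calls served from cache
    cold_calls = calls - hit_calls               # calls billed at the cold rate
    run = (cold_calls * RATES["run_cold_per_million"]
           + hit_calls * RATES["run_cache_hit_per_million"]) / 1_000_000
    storage = (storage_gb * RATES["storage_per_gb_month"]
               + registry_reads * RATES["registry_reads_per_million"] / 1_000_000)
    build = (small_recipes * RATES["build_under_1kb"]
             + large_recipes * RATES["build_1_to_32kb"])
    return round(run + storage + build, 4)


# The classifier comparison above: 1M cold recipe-calls, nothing else.
print(path_b_bill(1_000_000))
```

With every call cold, the run cost matches the ~$0.20 figure; a full cache-hit month would drop it to about $0.05.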


How the two paths compare

 | Path A · Savings cut | Path B · Per-unit
You share LLM traffic with us? | yes (PII-redacted, opt-in per cluster) | no
Setup | wrap your client (2 lines) | call our API directly
Bill predictability | tied to actual savings | flat per call
Best for | AI-native startups, teams worried about runaway spend | regulated buyers, on-prem, BYOC
Worst case | $99/mo even if no savings (cancel any time) | $0/mo (just usage rates)
Best case | typical: 65–80% net bill cut | typical: 95%+ vs LLM if you have a busy classifier

Enterprise (six- and seven-figure inference bills)

Custom pricing tied to verified savings via a 30-day pilot. Air-gapped runtime. Custom registry / private marketplace. Multi-region. SOC 2 + HIPAA. hi@kolm.ai.


Self-host

If you don't want to depend on us, the runtime is just JS. Click any recipe in the registry, copy its source, and paste it into your own server. Recipe is the build and registry layer; running recipes from your own file system is free.

FAQ

How do you measure "savings"?

For each cluster the autopilot replaces, we record (a) how many LLM calls would have happened without the recipe and (b) what those calls cost (your real model rate × observed token counts). Savings = the avoided spend. The receipt is itemized; you can audit every line.
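The measurement described here boils down to avoided calls times cost per call, summed per cluster. The sketch below shows one way that could be computed. The dictionary keys, function names, and the token rates in the usage line are placeholders chosen for illustration; they are not Recipe's actual receipt format or any vendor's real pricing.

```python
def avoided_spend(cluster):
    """Savings for one replaced cluster: avoided calls x cost per call.

    `cluster` is a plain dict with hypothetical keys; rates are dollars
    per one million tokens, taken from your own model pricing.
    """
    cost_per_call = (
        cluster["avg_input_tokens"] * cluster["input_rate_per_million"]
        + cluster["avg_output_tokens"] * cluster["output_rate_per_million"]
    ) / 1_000_000
    return cluster["avoided_calls"] * cost_per_call


def savings_receipt(clusters):
    """Return itemized (name, dollars) lines plus the total."""
    lines = [(c["name"], round(avoided_spend(c), 2)) for c in clusters]
    total = round(sum(amount for _, amount in lines), 2)
    return lines, total


# Placeholder numbers only: 1M avoided calls at made-up token rates.
lines, total = savings_receipt([
    {"name": "yes/no classifier", "avoided_calls": 1_000_000,
     "avg_input_tokens": 250, "avg_output_tokens": 5,
     "input_rate_per_million": 2.0, "output_rate_per_million": 10.0},
])
print(lines, total)
```

Keeping one line per cluster is what makes the total auditable: each amount can be checked against your own call logs and model rates.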

Can I cap how much I pay?

Yes. Scale has a built-in cap at 50% of your remaining LLM bill. Growth caps at 3× the base ($297/mo). Enterprise is whatever you negotiate.
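As an illustration of how a cap changes the bill, the sketch below applies the Growth figures. It assumes the cap limits the total Recipe charge (base plus savings cut) at $297/mo; the page does not spell out exactly what the cap covers, so treat that reading, and the function name `capped_charge`, as assumptions.

```python
def capped_charge(savings, base=99.0, cut_rate=0.10, cap=297.0):
    """Monthly Recipe charge under a cap (hypothetical helper).

    Assumption: the cap applies to base + savings cut combined.
    Defaults are the Growth figures: $99 base, 10% cut, 3x base cap.
    """
    uncapped = base + cut_rate * savings
    return round(min(uncapped, cap), 2)


print(capped_charge(1560))   # worked-example savings, below the cap
print(capped_charge(5000))   # larger savings, cap applies
```

Under this reading the cap starts to bind once measured savings exceed $1,980/mo, since $99 + 10% × $1,980 = $297.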

Do you store my examples?

Yes — in the recipe's lineage, so you can audit how it was built. Set visibility: "private" to keep them tenant-only. We never train shared models on your private examples.

Can I run this in my VPC?

Yes, on Scale. The whole system is one Node service + a JSON or Postgres store + an optional Anthropic key. Email hi@kolm.ai.

What happens when I exceed the free tier?

We return HTTP 429 with a clear X-Quota-Used header. Nothing breaks silently. Upgrade to Growth from /signup any time.
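On the client side, a quota response can be turned into an explicit error instead of a generic failure. The following is a minimal sketch that works with the status code and header mapping from any HTTP client. The exception class and function name are hypothetical, and the value of `X-Quota-Used` is passed through as an opaque string because its format is not described here.

```python
class QuotaExceeded(Exception):
    """Raised when the API answers HTTP 429 (hypothetical client-side error)."""

    def __init__(self, quota_used):
        super().__init__(f"Quota exceeded (X-Quota-Used: {quota_used})")
        self.quota_used = quota_used


def raise_on_quota(status, headers):
    """Raise QuotaExceeded for a 429 response; otherwise do nothing.

    `headers` is any dict-like mapping of header names to values.
    Header names are matched case-insensitively.
    """
    if status != 429:
        return
    lowered = {name.lower(): value for name, value in headers.items()}
    raise QuotaExceeded(lowered.get("x-quota-used"))


# Usage with values you would read from your HTTP client's response object.
try:
    raise_on_quota(429, {"X-Quota-Used": "10000"})
except QuotaExceeded as err:
    print(err)
```

Catching the error in one place makes it easy to pause the caller or prompt an upgrade rather than retrying blindly.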

Do you have an SLA?

Free and Growth: best-effort, see /status. Scale: 99.9% uptime SLA + on-call rotation.