You pay only when we cut your bill.
Two ways to buy. Either a fixed base + a % of measured savings (the autopilot path), or transparent per-unit rates if you'd rather not share traffic data. Free tier on both. No card to start.
Path A · Pay from savings (recommended)
Wrap your AI client and let the autopilot run. We measure your spend before and after, then bill a cut of the difference. If we save you nothing, the savings cut is $0 and you owe at most the plan's base fee.
Free
- 10,000 recipe-calls / month
- 5 synthesized recipes
- Public registry only
- Anonymous CLI bootstrap (robots welcome)
- Community support
Growth
- Unlimited recipe-calls
- Unlimited recipes
- Private registry & tenant isolation
- Autopilot mode (observe → cluster → replace)
- Weekly savings receipt, dollar-denominated
- Specialists waitlist priority
- Email support
Scale
- Specialists (LoRA fine-tunes) included
- Dedicated synthesis priority
- VPC / on-prem option
- SLA: 99.9%
- SOC 2 Type II (Q2 2027)
- Dedicated support engineer
A worked example — how the savings cut actually shakes out
| Line item | Amount |
|---|---|
| Your current Claude bill | $2,000 / mo |
| Recipe replaces 78% of calls (typical) | $1,560 saved |
| Your new Claude bill | $440 / mo |
| Growth base | + $99 / mo |
| Growth savings cut (10% of $1,560) | + $156 / mo |
| What you actually pay (Recipe + Claude) | $695 / mo |
| Net savings, monthly | $1,305 / mo (65% off) |
| Annualized | $15,660 / yr |
Numbers above assume an average cost of $0.0008 per call (Claude Sonnet 4.6, ~250 input tokens, ~5 output tokens). Plug your real numbers into the calculator.
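The arithmetic in the worked example can be reproduced with a few lines of Python. This is a minimal sketch, not the calculator's actual code: the function name `growth_bill` and its parameters are hypothetical, and the defaults ($99 base, 10% cut) are taken from the table above.

```python
def growth_bill(llm_bill: float, replaced_share: float,
                base: float = 99.0, cut_rate: float = 0.10) -> dict:
    """Recompute the worked example: base fee plus a cut of measured savings."""
    saved = llm_bill * replaced_share          # LLM spend avoided by recipes
    new_llm_bill = llm_bill - saved            # what you still pay the LLM vendor
    savings_cut = saved * cut_rate             # Recipe's share of the savings
    total = new_llm_bill + base + savings_cut  # Recipe + LLM combined
    net = llm_bill - total                     # what you keep each month
    return {
        "saved": round(saved, 2),
        "new_llm_bill": round(new_llm_bill, 2),
        "savings_cut": round(savings_cut, 2),
        "total": round(total, 2),
        "net_savings": round(net, 2),
        "net_savings_pct": round(100 * net / llm_bill, 2),
        "annualized": round(12 * net, 2),
    }


# The table's inputs: a $2,000 / mo bill with 78% of calls replaced.
example = growth_bill(2000, 0.78)
```

Swap in your own monthly bill and replacement share to see how the net figure moves; any cap on the Growth charge is not modeled here.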
Path B · Per-unit rates (no savings sharing)
If you'd rather not route your traffic through our autopilot, pay as you go at per-unit rates. Same engine, no savings telemetry, no % cut.
| Layer | Unit | Price |
|---|---|---|
| Build (synthesis) | per accepted recipe under 1 KB | $0.10 |
| Build (synthesis) | per accepted recipe 1–32 KB | $1.00 |
| Storage | per GB-month registry | $0.01 |
| Storage | per million registry reads | $0.10 |
| Run | per million recipe-calls (cold) | $0.20 |
| Run | per million recipe-calls (cache hit) | $0.05 |
| Auto-label (coming soon) | per 1k rows after 10k free | $1.00 |
| Specialist train (coming soon) | per LoRA trained | from $40 |
| Specialist host (coming soon) | per active LoRA / month | $20 |
For a yes/no classifier at 1M calls/month: Claude Sonnet 4.6 ~$675 · Cohere classify ~$2,000 · Recipe per-unit ~$0.20.
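To estimate the Run line of a per-unit bill, the two Run rates in the table can be blended by cache-hit share. The sketch below is illustrative only: `run_cost` and the `cache_hit_share` parameter are hypothetical names, and it covers Run charges only (no Build or Storage).

```python
# Run rates from the table above, in dollars per million recipe-calls.
RUN_PRICE_PER_MILLION = {"cold": 0.20, "cache_hit": 0.05}


def run_cost(calls: int, cache_hit_share: float = 0.0) -> float:
    """Estimate monthly Run charges for a given call volume and cache-hit share."""
    if not 0.0 <= cache_hit_share <= 1.0:
        raise ValueError("cache_hit_share must be between 0 and 1")
    hits = calls * cache_hit_share
    cold = calls - hits
    dollars = (cold * RUN_PRICE_PER_MILLION["cold"]
               + hits * RUN_PRICE_PER_MILLION["cache_hit"]) / 1_000_000
    return round(dollars, 4)


all_cold = run_cost(1_000_000)          # every call billed at the cold rate
half_cached = run_cost(1_000_000, 0.5)  # half of the calls hit the cache
```

With every call cold, 1M calls comes to the ~$0.20 quoted above; a higher cache-hit share only lowers it.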
How the two paths compare
| | Path A · Savings cut | Path B · Per-unit |
|---|---|---|
| You share LLM traffic with us? | yes (PII-redacted, opt-in per cluster) | no |
| Setup | wrap your client (2 lines) | call our API directly |
| Bill predictability | tied to actual savings | flat per call |
| Best for | AI-native startups, teams worried about runaway spend | regulated buyers, on-prem, BYOC |
| Worst case | $99/mo even if no savings (cancel any time) | $0/mo (just usage rates) |
| Best case | typical: 65–80% net bill cut | typical: 95%+ vs LLM if you have a busy classifier |
Enterprise (six- and seven-figure inference bills)
Custom pricing tied to verified savings via a 30-day pilot. Air-gapped runtime. Custom registry / private marketplace. Multi-region. SOC 2 + HIPAA. hi@kolm.ai.
Self-host
If you don't want to depend on us, the runtime is just JS. Click any recipe in the registry, copy its source, and paste it into your own server. Recipe is the build and registry layer; running recipes from your own file system is free.
FAQ
How do you measure "savings"?
For each cluster the autopilot replaces, we record (a) how many LLM calls would have happened without the recipe and (b) what those calls cost (your real model rate × observed token counts). Savings = the avoided spend. The receipt is itemized; you can audit every line.
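As a rough illustration of that formula, the sketch below computes avoided spend per cluster and sums it into an itemized receipt. The class and function names are hypothetical, and the token rates in the example ($3 and $15 per million tokens) are placeholder assumptions, not quoted prices; use your own model's rates.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClusterUsage:
    name: str
    avoided_calls: int           # (a) LLM calls the recipe replaced
    avg_input_tokens: float      # observed input tokens per call
    avg_output_tokens: float     # observed output tokens per call
    input_rate_per_mtok: float   # your model's $ per million input tokens
    output_rate_per_mtok: float  # your model's $ per million output tokens


def cluster_savings(c: ClusterUsage) -> float:
    """(b) cost per avoided call, times (a) the number of avoided calls."""
    per_call = (c.avg_input_tokens * c.input_rate_per_mtok
                + c.avg_output_tokens * c.output_rate_per_mtok) / 1_000_000
    return c.avoided_calls * per_call


def receipt(clusters: List[ClusterUsage]) -> Tuple[List[Tuple[str, float]], float]:
    """Return one line per cluster plus the total avoided spend."""
    lines = [(c.name, round(cluster_savings(c), 2)) for c in clusters]
    return lines, round(sum(amount for _, amount in lines), 2)


# Placeholder cluster: 1M avoided calls at ~250 input / ~5 output tokens each.
demo = ClusterUsage("yes-no-classifier", 1_000_000, 250, 5, 3.0, 15.0)
lines, total = receipt([demo])
```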
Can I cap how much I pay?
Yes. Scale has a built-in cap at 50% of your remaining LLM bill. Growth caps at 3× the base ($297/mo). Enterprise is whatever you negotiate.
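A small sketch of how those caps could be applied. It assumes the Growth cap limits the total monthly Growth charge (base plus savings cut) and that the Scale cap limits whatever Scale fee was computed; the page does not spell out either detail, and the function names are hypothetical.

```python
def growth_charge(measured_savings: float, base: float = 99.0,
                  cut_rate: float = 0.10, cap_multiple: float = 3.0) -> float:
    """Base plus savings cut, capped at 3x the base (assumed to cap the total)."""
    uncapped = base + measured_savings * cut_rate
    return round(min(uncapped, base * cap_multiple), 2)


def apply_scale_cap(fee: float, remaining_llm_bill: float) -> float:
    """Limit a Scale fee to 50% of the LLM bill that remains after savings."""
    return round(min(fee, 0.5 * remaining_llm_bill), 2)


below_cap = growth_charge(1560)  # $99 + $156 = $255, under the $297 cap
at_cap = growth_charge(5000)     # $99 + $500 would be $599, so the cap applies
```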
Do you store my examples?
Yes — in the recipe's lineage, so you can audit how it was built. Set visibility: "private" to keep them tenant-only. We never train shared models on your private examples.
Can I run this in my VPC?
Yes, on Scale. The whole system is one Node service + a JSON or Postgres store + an optional Anthropic key. Email hi@kolm.ai.
What happens when I exceed the free tier?
We return HTTP 429 with a clear X-Quota-Used header. Nothing breaks silently. Upgrade to Growth from /signup any time.
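On the client side, a quota rejection can be detected from the status code and header alone. The helper below is a hypothetical sketch: it takes a status and a header mapping from whatever HTTP library you use, and treats the X-Quota-Used value as an opaque string, since its format is not documented here.

```python
from typing import Mapping, Optional


def quota_used_on_429(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Quota-Used value for a quota rejection, else None."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get("x-quota-used", "")


# Sample values below are made up for illustration.
ok = quota_used_on_429(200, {"Content-Type": "application/json"})
blocked = quota_used_on_429(429, {"x-quota-used": "10000"})
```

A non-`None` result is the signal to pause calls or prompt an upgrade rather than retry immediately.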
Do you have an SLA?
Free and Growth: best-effort, see /status. Scale: 99.9% uptime SLA + on-call rotation.