build your own

Yours, local, signed.

Anyone (a human, a teammate, or an AI agent) can author a signed .kolm in four commands. JSON spec goes in, a single binary comes out, the binary runs locally with zero egress, and every output ships with a receipt chain you can walk byte-for-byte. No account required for local builds. No cloud round-trip. The recipe sandbox is tight enough that an agent can pipe a spec into kolm compile --spec - and trust the result.

01 thirty seconds, four commands

Scaffold → compile → run → serve.

The local build path uses a per-user receipt secret stored at ~/.kolm/config.json (mode 0600) so artifacts you compile on your laptop verify on your laptop without a cloud account. Set RECIPE_RECEIPT_SECRET in your environment if you want a teammate's verifier to accept the same artifact.

$ npm i -g github:sneaky-hippo/kolmogorov-stack
$ kolm new my-redactor --from redactor
wrote ./my-redactor.spec.json    # a working spec, edit to taste

$ kolm compile --spec my-redactor.spec.json --out my-redactor.kolm
built my-redactor.kolm  4249 B  k-score 381.61  sha256 867dc0ce…

$ kolm run my-redactor.kolm '{"text":"call me at 555-123-4567"}'
{ "output": { "redacted": "call me at [PHONE]", "hits": [{ "name": "PHONE", "count": 1 }] } }

$ kolm serve --mcp --http --port 11455
ok serving ~/.kolm/artifacts as local tools

02 templates

Four shapes that cover most of the work.

Every template scaffolds a complete spec: recipes, evals, a working pack, an index. Compile, run, and ship. If none of them fit, start with --from blank and write the recipe yourself; the sandbox is just JavaScript with a curated lib.

redactor (--from redactor)
  Shape: Replace regex matches with named tokens. Tenants pass extra_patterns or redact_words at run time without re-signing.
  Example: Strip phone numbers, emails, MRNs, account numbers, employee ids, project codenames.

extractor (--from extractor)
  Shape: Pull named fields out of free-text input. Tenants pass extra_rules at run time to add fields the buyer cares about.
  Example: Pull amount + date + counterparty from a payment description; pull case_id + priority from a triage note.

classifier (--from classifier)
  Shape: Score input against keyword categories with weights, return the top label. Tenants pass extra_categories at run time.
  Example: Route a support ticket to billing / bug / feature / rma. Tag a sensor reading nominal / warning / critical.

blank (--from blank)
  Shape: One stub recipe + one eval case. The whole authoring guide in 40 lines of JSON. Edit and compile.
  Example: Anything the three above don't cover. Most one-off agent tools start here.
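For a feel of the recipe shape, here is a classifier-style generate() that runs on its own (a sketch; the source the classifier template actually scaffolds may differ in detail):

```javascript
// Classifier-shaped recipe sketch: score keyword categories, return the top
// label. Illustrative only; not the literal template source.
function generate(input, lib) {
  var text = String(input.text || '').toLowerCase();
  var best = { label: 'unknown', score: 0 };
  for (var c of (lib.params.extra_categories || [])) {
    var score = 0;
    for (var kw of c.keywords) {
      if (text.indexOf(kw.toLowerCase()) !== -1) score += (c.weight || 1);
    }
    if (score > best.score) best = { label: c.name, score: score };
  }
  return best;
}

const out = generate(
  { text: 'refund please, charge looks wrong' },
  { params: { extra_categories: [
      { name: 'billing', keywords: ['refund', 'charge', 'invoice'] },
      { name: 'bug', keywords: ['crash', 'error', 'stack trace'] },
    ] } }
);
// out: { label: 'billing', score: 2 }
```

Pasted into recipes[].source (minus the demo call at the bottom), this is the kind of function the compiler seals.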

03 spec is the program

Forty lines of JSON, one signed binary.

The compiler is deterministic over the spec: same spec in, same content-hash out (the zip wrapper varies bit-for-bit; the signed manifest does not). Every recipe is a function generate(input, lib) that runs in a node:vm sandbox with a 1 MiB input cap and a 1 000 ms timeout.

# spec.json - this is the entire program
{
  "job_id": "job_clinic_redactor",
  "task": "Strip identifiers from clinical notes before they leave the device",
  "base_model": "none",
  "recipes": [{
    "id": "rcp_redact_v1",
    "name": "regex redactor",
    "source": "function generate(input, lib) {\n  var text = String(input.text || '');\n  var hits = [];\n  for (var p of (lib.params.extra_patterns || [])) {\n    var re = new RegExp(p.regex, 'g'); var n = 0;\n    text = text.replace(re, function(){ n++; return p.replacement || '[REDACTED]'; });\n    if (n) hits.push({ name: p.name, count: n });\n  }\n  return { redacted: text, hits: hits };\n}",
    "schema": { "input": { "text": "string" }, "output": { "redacted": "string", "hits": "array" } }
  }],
  "evals": {
    "spec": "rs-1-evals",
    "cases": [{ "id": "phone", "input": { "text": "call 555-123-4567" }, "expected": { "redacted": "call [PHONE]" },
      "params": { "extra_patterns": [{ "name": "PHONE", "regex": "\\d{3}-\\d{3}-\\d{4}", "replacement": "[PHONE]" }] } }]
  }
}

$ cat spec.json | kolm compile --spec - --out clinic-redactor.kolm
ok wrote clinic-redactor.kolm
$ kolm inspect clinic-redactor.kolm | head
{
  "tier": "recipe",
  "signature_valid": true,
  "recipes_n": 1,
  "evals_n": 1,
  "k_score": 412.7,
  "size_bytes": 4188
}

The full reference is in docs/AUTHORING.md: spec field rules, recipe sandbox surface (lib.patterns, lib.parseFloatSafe, lib.pack, lib.index, lib.params), pack and index containers, the receipt chain, and every error code the compiler emits.

04 AI-friendly authoring

Hand a model the spec; pipe the spec to kolm compile.

An agent that can write JSON can ship a signed artifact. The spec is small, the validation is strict, and the failure modes are mechanical (every recipe is type-checked at compile time, every eval case runs before the binary is sealed). Drop this prompt into Claude / Cursor / your own agent. The next step is kolm compile --spec -.

# AI authoring prompt (drop into any frontier model)

You are authoring a kolm spec - a JSON program that compiles to a signed
.kolm artifact and runs locally with zero egress.

Output ONLY a single JSON object. Required fields:

  job_id        "job_"   matches /^job_[a-z0-9_-]+$/i
  task          one-sentence human description
  recipes[]     each: { id: "rcp_", name, source }
                source is "function generate(input, lib) { ... return {...}; }"
                lib has: patterns, parseFloatSafe, pack, index, params
  evals.cases[] each: { id, input, expected, params? }

Constraints:
  - source must be valid JavaScript that compiles under Node's vm module
  - inputs are capped at 1 MiB, recipes at 1000 ms wall-time
  - never reach for fs, net, child_process - the sandbox blocks them
  - return JSON-serialisable objects only

Task: 

# Pipe the model's output straight to the compiler
$ claude --message "" \
    | jq -r '.content' \
    | kolm compile --spec - --out my-task.kolm
ok wrote my-task.kolm  k-score 387.4  signature valid
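Because the failure modes are mechanical, an agent can pre-validate its own spec before spending a compile. A hypothetical pre-flight helper (not part of the kolm CLI) that mirrors the field rules above might look like:

```javascript
// Hypothetical pre-flight check an agent could run before piping a spec into
// `kolm compile --spec -`. It mirrors the documented field rules so obvious
// mistakes surface before a compile is attempted.
function preflight(spec) {
  const errors = [];
  if (!/^job_[a-z0-9_-]+$/i.test(spec.job_id || '')) errors.push('bad job_id');
  if (typeof spec.task !== 'string' || spec.task.length === 0) errors.push('missing task');
  for (const r of spec.recipes || []) {
    if (!/^rcp_/.test(r.id || '')) errors.push('bad recipe id: ' + r.id);
    // Cheap syntax check only; the real compiler type-checks inside node:vm.
    try { new Function(r.source + '\n;return generate;')(); }
    catch (e) { errors.push('recipe ' + r.id + ' does not parse: ' + e.message); }
  }
  if (!spec.evals || !Array.isArray(spec.evals.cases) || spec.evals.cases.length === 0) {
    errors.push('no eval cases');
  }
  return errors;
}
```

An agent can loop on preflight(spec) until it returns an empty array, then hand the JSON to the compiler.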

The AI doesn't need an account. The compiler doesn't need a network. The artifact is signed under your local secret. Verifying on a teammate's machine just means setting RECIPE_RECEIPT_SECRET to the same value, or pinning a fleet secret in your CI.

05 sensitive verticals

What kolm does and does not promise.

Hospitals, banks, defense primes, and insurers are the obvious buyers. The honest framing matters: we ship the runtime substrate (signed rules, zero-egress recipe sandbox, receipt chain, audit log). We do not ship compliance attestation, recipe correctness, or input-handling guarantees that hold outside the runtime. Read both columns.

What you get: Zero runtime egress. The benchmark monitor patches fetch, http, https, net, tls, and dns. Any artifact that touches the wire fails benchmark.
On the hook for: Whether the recipe is correct. A regex that misses a digit pattern still misses it. Write evals that cover your edge cases.

What you get: Signed receipt chain. Five HMAC-SHA256 rings (task → seeds → recipes → evals → package). Receipts walk byte-for-byte; tampering is detected at load.
On the hook for: Compliance attestation. kolm is not SOC 2 / HIPAA-attested. The runtime gives an auditor an artifact to inspect; the audit itself is yours.

What you get: Tenant runtime params. Buyers pass extra_patterns, extra_rules, extra_categories per call. The artifact's signature does not change.
On the hook for: What buyers do with the output. If a clinic logs PHI to disk after redaction, that's a clinic problem. Audit-sink is opt-in.

What you get: Audit-sink hook. Every runArtifact emits a kolm-audit-1 entry: recipe id, input sha256 prefix, latency, ran_at. Plug into your SIEM.
On the hook for: Threat model outside the sandbox. The recipe is sandboxed; the host process isn't. Run kolm in the same trust boundary as the rest of your pipeline.

Three concrete patterns to start from:

# clinic redactor - PHI redaction with tenant-specific patterns
$ kolm new clinic-redact --from redactor
# edit clinic-redact.spec.json: add MRN, DOB, NPI patterns; add evals; compile
$ kolm compile --spec clinic-redact.spec.json --out clinic-redact.kolm
$ kolm run clinic-redact.kolm '{"text":"pt MRN 0042-A seen 2026-04-01"}' \
    --params '{"extra_patterns":[{"name":"MRN","regex":"MRN \\d{4}-[A-Z]","replacement":"[MRN]"}]}'

# banking extractor - pull amount + date + counterparty out of payment text
$ kolm new wire-extract --from extractor
$ kolm compile --spec wire-extract.spec.json --out wire-extract.kolm
$ kolm run wire-extract.kolm '{"text":"$2,400 wire to ACME on 2026-04-15"}'

# support triage - route inbound tickets without an LLM
$ kolm new support-triage --from classifier
$ kolm compile --spec support-triage.spec.json --out support-triage.kolm
$ kolm run support-triage.kolm '{"text":"refund please, charge looks wrong"}'
{ "output": { "label": "billing", "score": 2 } }

The cookbook expands each of these to a full vertical walkthrough (healthcare, finance, legal, edge, defense), including the gates that matter to the buyer (BAA, audit lineage, privilege, RAM/offline).

06 tenant params don't re-sign

One artifact, many tenants, no resigning.

A single signed .kolm goes out the door. Buyers configure it per call. The signature does not change. The audit log records exactly which params each tenant passed. This is how a healthcare vendor ships a redactor that 200 clinics customize for their own MRN format without ever touching the binary.

# a single signed redactor, two tenants, two configurations, one audit trail

tenant-A$ kolm run ./redactor.kolm \
    '{"text":"pt MRN A-100042 reviewed"}' \
    --params '{"extra_patterns":[{"name":"MRN","regex":"MRN A-\\d{6}","replacement":"[MRN]"}]}'

tenant-B$ kolm run ./redactor.kolm \
    '{"text":"patient ID 99-7711 admitted"}' \
    --params '{"extra_patterns":[{"name":"PID","regex":"\\d{2}-\\d{4}","replacement":"[PID]"}]}'

# both tenants run the same artifact (same sha256, same signature).
# both audit logs distinguish which params were applied.

07 buy instead of rent (Sprint 3 roadmap)

Compound your frontier-API spend into a local artifact.

Sprint 3 roadmap, not yet shipped. The capture → aggregate → distill loop and the endpoints below are coming. Subscribe at /signup to be notified the moment they land.

The spec authoring path is the front door. The other door is heavier and earns its keep over time: every API call you proxy through the planned capture endpoint will write a verified (input, output) pair into your namespace. After enough pairs land, those observations distill into a recipe (or a LoRA) that runs locally for the same task at a fraction of the latency and zero per-call cost. The frontier API stops being a forever-bill and starts being a deposit account.

Day 1 (capture)
  What will happen: Point your client at the capture proxy. The proxy forwards the call upstream with your key, records the verified (input, output, latency) in observations under your namespace.
  What you keep: Same answer, same SLA; your traffic now leaves a labelled trail you own.

Day 7-14 (aggregate)
  What will happen: Observations cluster by intent. The bridges surface recommends synthesis once a cluster crosses the threshold; the corpus exports as JSONL or parquet.
  What you keep: A clean dataset of your own task, captured in your own production traffic, that you can train any model on.

Day 14-30 (distill)
  What will happen: The auto-distill job compiles the cluster into a recipe (or LoRA, if size warrants). The output is a signed .kolm sitting in ~/.kolm/artifacts.
  What you keep: An offline-capable specialist for the slice of your task that you actually run. Runs on a laptop. No per-call cost.

Day 30+ (compound)
  What will happen: Route the easy 60% of traffic to the local artifact, route the hard 40% to the frontier (and keep capturing). The local share grows; the frontier bill shrinks. Latency wins, privacy wins, cost wins.
  What you keep: The artifact is yours; even if you cancel kolm, the .kolm on disk still runs.
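When the compound phase lands, the routing shim could be as small as this (hypothetical: none of these endpoints are shipped, and runLocal / callFrontier stand in for whatever local-artifact and frontier clients you already have):

```javascript
// Hypothetical compound-phase router: try the local artifact first, fall back
// to the frontier when confidence is low. Names and threshold are placeholders.
async function route(input, runLocal, callFrontier, threshold = 0.8) {
  const local = await runLocal(input); // cheap, offline, zero per-call cost
  if (local && local.confidence >= threshold) {
    return { source: 'local', ...local };
  }
  // hard cases go upstream; in the roadmap's loop they would also be captured
  return { source: 'frontier', ...(await callFrontier(input)) };
}
```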

A 50-engineer team running 80 000 frontier calls a month, captured for 30 days, distilled at 50 000 verified labels: a Phi-3-mini LoRA would hit ~78% of Opus quality on their own tasks at <5% of the latency and zero per-token cost. Numbers are illustrative. The capture proxy, corpus synthesis, and auto-distill endpoints are on the Sprint 3 roadmap and not yet live; today, the spec-authoring front door (sections 01-06 above) is what ships.

Ship one. Ship a hundred.

The four templates compile in seconds. The spec validator catches the wrong-shape mistakes before the binary is sealed. The receipt chain catches the rest. Read the authoring guide, scaffold a template, edit a regex, ship.