Yours, local, signed.
Anyone (a human, a teammate, or an AI agent) can author a signed .kolm in four commands. JSON spec goes in, a single binary comes out, the binary runs locally with zero egress, and every output ships with a receipt chain you can walk byte-for-byte. No account required for local builds. No cloud round-trip. The recipe sandbox is tight enough that an agent can pipe a spec into kolm compile --spec - and trust the result.
Scaffold → compile → run → serve.
The local build path uses a per-user receipt secret stored at ~/.kolm/config.json (mode 0600) so artifacts you compile on your laptop verify on your laptop without a cloud account. Set RECIPE_RECEIPT_SECRET in your environment if you want a teammate's verifier to accept the same artifact.
```
$ npm i -g github:sneaky-hippo/kolmogorov-stack
$ kolm new my-redactor --from redactor
wrote ./my-redactor.spec.json   # a working spec, edit to taste
$ kolm compile --spec my-redactor.spec.json --out my-redactor.kolm
built my-redactor.kolm  4249 B  k-score 381.61  sha256 867dc0ce…
$ kolm run my-redactor.kolm '{"text":"call me at 555-123-4567"}'
{ "output": { "redacted": "call me at [PHONE]", "hits": [{ "name": "PHONE", "count": 1 }] } }
$ kolm serve --mcp --http --port 11455
ok serving ~/.kolm/artifacts as local tools
```
Four shapes that cover most of the work.
Every template scaffolds a complete spec: recipes, evals, a working pack, an index. Compile, run, and ship. If none fit, scaffold with --from blank and write the recipe yourself; the sandbox is just JavaScript with a curated lib.
| Template | Shape of the recipe | Realistic example |
|---|---|---|
| redactor (--from redactor) | Replace regex matches with named tokens. Tenants pass extra_patterns or redact_words at run time without re-signing. | Strip phone numbers, emails, MRNs, account numbers, employee IDs, project codenames. |
| extractor (--from extractor) | Pull named fields out of free-text input. Tenants pass extra_rules at run time to add fields the buyer cares about. | Pull amount + date + counterparty from a payment description; pull case_id + priority from a triage note. |
| classifier (--from classifier) | Score input against keyword categories with weights, return the top label. Tenants pass extra_categories at run time. | Route a support ticket to billing / bug / feature / rma. Tag a sensor reading nominal / warning / critical. |
| blank (--from blank) | One stub recipe + one eval case. The whole authoring guide in 40 lines of JSON. Edit and compile. | Anything the three above don't cover. Most one-off agent tools start here. |
Forty lines of JSON, one signed binary.
The compiler is deterministic over the spec. Same spec in, same content-hash out (the zip wrapper varies bit-for-bit; the signed manifest does not). Every recipe is a function generate(input, lib) that runs in a node:vm sandbox with a 1 MiB input cap and a 1000 ms timeout.
```
# spec.json - this is the entire program
{
  "job_id": "job_clinic_redactor",
  "task": "Strip identifiers from clinical notes before they leave the device",
  "base_model": "none",
  "recipes": [{
    "id": "rcp_redact_v1",
    "name": "regex redactor",
    "source": "function generate(input, lib) {\n var text = String(input.text || '');\n var hits = [];\n for (var p of (lib.params.extra_patterns || [])) {\n  var re = new RegExp(p.regex, 'g'); var n = 0;\n  text = text.replace(re, function(){ n++; return p.replacement || '[REDACTED]'; });\n  if (n) hits.push({ name: p.name, count: n });\n }\n return { redacted: text, hits: hits };\n}",
    "schema": {
      "input": { "text": "string" },
      "output": { "redacted": "string", "hits": "array" }
    }
  }],
  "evals": {
    "spec": "rs-1-evals",
    "cases": [{
      "id": "phone",
      "input": { "text": "call 555-123-4567" },
      "expected": { "redacted": "call [PHONE]" },
      "params": { "extra_patterns": [{ "name": "PHONE", "regex": "\\d{3}-\\d{3}-\\d{4}", "replacement": "[PHONE]" }] }
    }]
  }
}
```

```
$ cat spec.json | kolm compile --spec - --out clinic-redactor.kolm
ok wrote clinic-redactor.kolm
$ kolm inspect clinic-redactor.kolm | head
{
  "tier": "recipe",
  "signature_valid": true,
  "recipes_n": 1,
  "evals_n": 1,
  "k_score": 412.7,
  "size_bytes": 4188
}
```
The full reference is in docs/AUTHORING.md: spec field rules, recipe sandbox surface (lib.patterns, lib.parseFloatSafe, lib.pack, lib.index, lib.params), pack and index containers, the receipt chain, and every error code the compiler emits.
Hand a model the spec; pipe the spec to kolm compile.
An agent that can write JSON can ship a signed artifact. The spec is small, the validation is strict, and the failure modes are mechanical (every recipe is type-checked at compile time, every eval case runs before the binary is sealed). Drop this prompt into Claude / Cursor / your own agent. The next step is kolm compile --spec -.
```
# AI authoring prompt (drop into any frontier model)

You are authoring a kolm spec - a JSON program that compiles to a signed
.kolm artifact and runs locally with zero egress.
Output ONLY a single JSON object. Required fields:

  job_id         "job_"  matches /^job_[a-z0-9_-]+$/i
  task           one-sentence human description
  recipes[]      each: { id: "rcp_", name, source }
                 source is "function generate(input, lib) { ... return {...}; }"
                 lib has: patterns, parseFloatSafe, pack, index, params
  evals.cases[]  each: { id, input, expected, params? }

Constraints:
- source must be valid JavaScript that compiles under Node's vm module
- inputs are capped at 1 MiB, recipes at 1000 ms wall-time
- never reach for fs, net, child_process - the sandbox blocks them
- return JSON-serialisable objects only

Task:
```

```
# Pipe the model's output straight to the compiler
$ claude --message " " \
    | jq -r '.content' \
    | kolm compile --spec - --out my-task.kolm
ok wrote my-task.kolm  k-score 387.4  signature valid
```
The AI doesn't need an account. The compiler doesn't need a network. The artifact is signed under your local secret. Verifying on a teammate's machine just means setting RECIPE_RECEIPT_SECRET to the same value, or pinning a fleet secret in your CI.
What kolm does and does not promise.
Hospitals, banks, defense primes, and insurers are the obvious buyers. The honest framing matters: we ship the runtime substrate (signed rules, zero-egress recipe sandbox, receipt chain, audit log). We do not ship compliance attestation, recipe correctness, or input-handling guarantees that hold outside the runtime. Read both columns.
| What you get | What you're still on the hook for |
|---|---|
| Zero runtime egress. The benchmark monitor patches fetch, http, https, net, tls, and dns. Any artifact that touches the wire fails the benchmark. | Whether the recipe is correct. A regex that misses a digit pattern still misses it. Write evals that cover your edge cases. |
| Signed receipt chain. Five HMAC-SHA256 rings (task → seeds → recipes → evals → package). Receipts walk byte-for-byte; tampering is detected at load. | Compliance attestation. kolm is not SOC 2 / HIPAA-attested. The runtime gives an auditor an artifact to inspect; the audit itself is yours. |
| Tenant runtime params. Buyers pass extra_patterns, extra_rules, extra_categories per call. The artifact's signature does not change. | What buyers do with the output. If a clinic logs PHI to disk after redaction, that's a clinic problem. Audit-sink is opt-in. |
| Audit-sink hook. Every runArtifact emits a kolm-audit-1 entry: recipe id, input sha256 prefix, latency, ran_at. Plug into your SIEM. | Threat model outside the sandbox. The recipe is sandboxed; the host process isn't. Run kolm in the same trust boundary as the rest of your pipeline. |
Three concrete patterns to start from:
```
# clinic redactor - PHI redaction with tenant-specific patterns
$ kolm new clinic-redact --from redactor
# edit clinic-redact.spec.json: add MRN, DOB, NPI patterns; add evals; compile
$ kolm compile --spec clinic-redact.spec.json --out clinic-redact.kolm
$ kolm run clinic-redact.kolm '{"text":"pt MRN 0042-A seen 2026-04-01"}' \
    --params '{"extra_patterns":[{"name":"MRN","regex":"MRN \\d{4}-[A-Z]","replacement":"[MRN]"}]}'

# banking extractor - pull amount + date + counterparty out of payment text
$ kolm new wire-extract --from extractor
$ kolm compile --spec wire-extract.spec.json --out wire-extract.kolm
$ kolm run wire-extract.kolm '{"text":"$2,400 wire to ACME on 2026-04-15"}'

# support triage - route inbound tickets without an LLM
$ kolm new support-triage --from classifier
$ kolm compile --spec support-triage.spec.json --out support-triage.kolm
$ kolm run support-triage.kolm '{"text":"refund please, charge looks wrong"}'
{ "output": { "label": "billing", "score": 2 } }
```
The cookbook expands each of these to a full vertical walkthrough (healthcare, finance, legal, edge, defense), including the gates that matter to the buyer (BAA, audit lineage, privilege, RAM/offline).
One artifact, many tenants, no resigning.
A single signed .kolm goes out the door. Buyers configure it per call. The signature does not change. The audit log records exactly which params each tenant passed. This is how a healthcare vendor ships a redactor that 200 clinics customize for their own MRN format without ever touching the binary.
```
# a single signed redactor, two tenants, two configurations, one audit trail
tenant-A$ kolm run ./redactor.kolm \
    '{"text":"pt MRN A-100042 reviewed"}' \
    --params '{"extra_patterns":[{"name":"MRN","regex":"MRN A-\\d{6}","replacement":"[MRN]"}]}'

tenant-B$ kolm run ./redactor.kolm \
    '{"text":"patient ID 99-7711 admitted"}' \
    --params '{"extra_patterns":[{"name":"PID","regex":"\\d{2}-\\d{4}","replacement":"[PID]"}]}'

# both tenants run the same artifact (same sha256, same signature).
# both audit logs distinguish which params were applied.
```
Compound your frontier-API spend into a local artifact.
The spec authoring path is the front door. The other door is heavier and earns its keep over time: every API call you proxy through the planned capture endpoint will write a verified (input, output) pair into your namespace. After enough pairs land, those observations distill into a recipe (or a LoRA) that runs locally for the same task at a fraction of the latency and zero per-call cost. The frontier API stops being a forever-bill and starts being a deposit account.
| Phase | What will happen | What you keep |
|---|---|---|
| Day 1 capture | Point your client at the capture proxy. The proxy forwards the call upstream with your key and records the verified (input, output, latency) in observations under your namespace. | Same answer, same SLA, your traffic now leaves a labelled trail you own. |
| Day 7-14 aggregate | Observations cluster by intent. The bridges surface recommends synthesis once a cluster crosses the threshold; the corpus exports as JSONL or parquet. | A clean dataset of your own task, captured in your own production traffic, that you can train any model on. |
| Day 14-30 distill | The auto-distill job compiles the cluster into a recipe (or LoRA, if size warrants). The output is a signed .kolm sitting in ~/.kolm/artifacts. | An offline-capable specialist for the slice of your task that you actually run. Runs on a laptop. No per-call cost. |
| Day 30+ compound | Route the easy 60% of traffic to the local artifact, route the hard 40% to the frontier (and keep capturing). The local share grows; the frontier bill shrinks. | Latency wins, privacy wins, cost wins. The artifact is yours; even if you cancel kolm, the .kolm on disk still runs. |
A 50-engineer team running 80 000 frontier calls a month, captured for 30 days, distilled at 50 000 verified labels: a Phi-3-mini-LoRA would hit ~78% of Opus quality on their own tasks at <5% the latency and zero per-token cost. Numbers are illustrative. The capture proxy, corpus synthesis, and auto-distill endpoints are on the Sprint 3 roadmap and not yet live; today, the spec-authoring front door (sections 01-06 above) is what ships.
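The Day 30+ split is, at heart, a confidence-gated router in client code. Since the capture and distill endpoints are not yet live, everything in the sketch below is hypothetical: `runLocal` and `callFrontier` are stand-ins for a local artifact invocation and a frontier API call, and the gate is a toy score threshold.

```javascript
// Hypothetical routing sketch for the Day 30+ phase. Nothing here is a
// live kolm API: runLocal / callFrontier are injected stand-ins. The idea
// is to try the cheap local artifact first and fall back to the frontier
// model only when the local result fails a confidence gate.
function makeRouter({ runLocal, callFrontier, minScore }) {
  return function route(input) {
    const local = runLocal(input);
    if (local.score >= minScore) return { source: 'local', ...local };
    return { source: 'frontier', ...callFrontier(input) };
  };
}

// Toy handlers: the "local artifact" only knows billing-ish tickets.
const route = makeRouter({
  runLocal: ({ text }) =>
    /refund|charge|invoice/.test(text)
      ? { label: 'billing', score: 2 }
      : { label: 'unknown', score: 0 },
  callFrontier: () => ({ label: 'feature', score: 1 }),
  minScore: 1,
});

console.log(route({ text: 'refund please, charge looks wrong' }).source);  // local
console.log(route({ text: 'add dark mode' }).source);                      // frontier
```

Keeping the frontier path in place while capturing its traffic is what lets the local share grow over time.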
Ship one. Ship a hundred.
The four templates compile in seconds. The spec validator catches the wrong-shape mistakes before the binary is sealed. The receipt chain catches the rest. Read the authoring guide, scaffold a template, edit a regex, ship.