Spec · 2026-05-07 · 10 min read

The .kolm file format, component by component.

A .kolm is a signed zip with seven internal components and one manifest. The whole thing is one file because everything inside is co-versioned. A LoRA without its base, a recipe pack without its embedder, a verifier without its tests — these are how AI artifacts go bad in production. This document is the spec.

By Kolmogorov · Tags: .kolm spec · format · v0.1

Why one file with everything inside.

An AI artifact in 2026 is rarely a single weight matrix. It is a composition of artifacts: a base model, a LoRA fine-tune, a retrieval index, a draft pack for speculative decoding, a verifier for output-quality gating, and a held-out test set the verifier can be re-run against. Six things, all of which need to agree on each other's content.

The way this fails in practice: a team ships a LoRA fine-tune separately from its base. Three weeks later the base gets a 0.0.2 patch from upstream. Now the LoRA loads on top of slightly different weights than it was trained against, accuracy drops by 12 points, and nobody knows why.

The fix is the same fix software has been using for fifty years: ship one binary that contains everything you need to run, and content-hash the whole thing. If any byte changes, the signature changes, and the runtime refuses to load it without an explicit override. The .kolm format is that binary for AI.

One file, one signature, one K-score on the cover. Everything that affects behavior at runtime is bundled and fingerprinted. .kolm is to AI artifacts what .deb is to system packages: a versioned, signed, reproducible unit.

The on-disk layout.

A .kolm is a deflate-compressed ZIP archive. The choice of ZIP is deliberate: every major operating system and programming language already has libraries to read it, and the format supports per-file compression and integrity checks (CRC-32) at the container level.

$ unzip -l support-triage.kolm
Length        Date    Time    Name
--------  ----------  -----   ----
   1,247  2026-05-07  09:14   manifest.json
1,847,392  2026-05-07  09:14   model.gguf            # INT4 base, 1.5-4 GB
   91,608  2026-05-07  09:14   lora.bin              # LoRA delta
   12,458  2026-05-07  09:14   recipes.json          # draft pack
6,144,768  2026-05-07  09:14   index.sqlite-vec      # multimodal recall
   18,232  2026-05-07  09:14   verifier.js           # scoring fn
   45,672  2026-05-07  09:14   tests.jsonl           # held-out evals
       64  2026-05-07  09:14   signature.sig         # HMAC-SHA256 chain root

Files are ordered consistently: manifest.json first (so a partial read can validate before the model is loaded), signature.sig last (so it covers everything emitted before it). The whole archive is content-hashed; the signature anchors that hash to a key the runtime can verify.
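
The ordering rule above can be checked mechanically. The following is a minimal Python sketch, not the reference runtime: the function names check_layout and build_example are hypothetical, and the example archive holds stand-in bytes rather than real components.

```python
import io
import json
import zipfile


def check_layout(archive_bytes: bytes) -> dict:
    """Return the parsed manifest if the entries follow the stated ordering."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        names = zf.namelist()  # entry names in archive order
        if not names or names[0] != "manifest.json":
            raise ValueError("manifest.json must be the first entry")
        if names[-1] != "signature.sig":
            raise ValueError("signature.sig must be the last entry")
        return json.loads(zf.read("manifest.json"))


def build_example() -> bytes:
    """Build a tiny in-memory archive with stand-in payloads (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.json", json.dumps({"format_version": "0.1"}))
        zf.writestr("model.gguf", b"stand-in bytes")
        zf.writestr("signature.sig", b"0" * 64)
    return buf.getvalue()


manifest = check_layout(build_example())
```

The sketch only inspects names and order; it does not verify hashes or the signature.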

The manifest, field by field.

manifest.json is the artifact's index. Every field is required unless marked optional.

{
  "format_version": "0.1",
  "artifact_id": "klm_8a4f9c2b",
  "task": "summarize support tickets in our voice",
  "created_at": "2026-05-07T09:14:32Z",
  "compiler": { "name": "kolm", "version": "0.1.0" },

  "base": {
    "family": "qwen2.5",
    "size_b": 3,
    "quant": "q4_k_m",
    "hash": "sha256:a8f4..."
  },
  "lora": { "rank": 16, "hash": "sha256:2c91..." },
  "teacher": { "family": "claude-opus-4-7", "k": 8 },

  "recipes": { "count": 1284, "coverage": 0.71 },
  "recall": { "vectors": 8240, "modalities": ["text", "image"] },
  "verifier": { "hash": "sha256:f12e...", "deterministic": true },
  "tests": { "count": 200, "hash": "sha256:9b03..." },

  "k_score": 0.94,
  "benchmarks": {
    "accuracy": 0.91, "size_mb": 38, "p50_ms": 80,
    "cost_usd_per_1k": 0.00, "coverage": 0.71
  },
  "runtime": { "engine": "llama.cpp", "min_version": "b3000" },
  "egress_policy": "none"
}

Three fields deserve special attention.

egress_policy declares what the runtime is permitted to send out. "none" means a fully offline artifact: the runtime blocks every outbound network request. "allowed-domains", paired with an explicit list, permits scoped outbound calls (rare; only enabled for tool-using compiles).

k_score is the single number on the cover, derived from the five benchmark properties. Below the configured gate (default 0.70), the artifact does not ship.

format_version is mandatory; the runtime uses it to check compatibility. v0.1 artifacts load on any v0.x runtime; a major version bump will require a runtime upgrade.
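
A loader might apply these three checks before touching any payload. The sketch below is one possible reading in Python, not the reference implementation: check_manifest and its parameters are hypothetical names, and the representation of the allowed-domains list is left out because the spec text above does not define it.

```python
def check_manifest(manifest: dict, runtime_major: int = 0, gate: float = 0.70) -> list[str]:
    """Return a list of problems; an empty list means all three checks passed."""
    problems = []

    # format_version: assume "major.minor" and require a matching major version.
    version = str(manifest.get("format_version", ""))
    major = version.split(".")[0]
    if not major.isdigit():
        problems.append("format_version is missing or malformed")
    elif int(major) != runtime_major:
        problems.append(f"format {version} needs a runtime with major version {major}")

    # k_score: artifacts below the configured gate do not ship.
    k_score = manifest.get("k_score")
    if not isinstance(k_score, (int, float)) or k_score < gate:
        problems.append(f"k_score {k_score!r} is missing or below the gate {gate}")

    # egress_policy: only the two values named in the text are accepted here.
    if manifest.get("egress_policy") not in ("none", "allowed-domains"):
        problems.append("egress_policy must be 'none' or 'allowed-domains'")

    return problems


example = {"format_version": "0.1", "k_score": 0.94, "egress_policy": "none"}
problems = check_manifest(example)
```

Returning a list rather than raising on the first failure lets a tool report every problem in one pass.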

The seven payload components.

model.gguf · 1.5 to 4 GB

The base model in GGUF format (the standard interchange format consumed by llama.cpp and its many ports). It is INT4-quantized via q4_k_m, chosen as a good speed/accuracy trade-off at this size class. The base is whichever open-weight model the compiler picked for the task: typically a 3B or 7B Qwen, Llama, Phi, or Hermes.

lora.bin · 10 to 100 MB

The LoRA adapter, trained against the base on the verified labels. The runtime applies it as a delta at load time. The adapter is rank-16 by default; harder tasks use a higher rank. The hash in the manifest must match exactly, or the runtime refuses to apply the adapter.
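
The exact-match rule can be sketched in a few lines of Python. This assumes the manifest hash is a SHA-256 over the raw file bytes, written as lowercase hex with a "sha256:" prefix, as the manifest example suggests; sha256_tag and lora_matches are hypothetical names.

```python
import hashlib


def sha256_tag(data: bytes) -> str:
    """Hash raw bytes and format the digest the way the manifest example writes it."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def lora_matches(manifest: dict, lora_bytes: bytes) -> bool:
    """True only if the adapter bytes hash to exactly the value in the manifest."""
    return manifest["lora"]["hash"] == sha256_tag(lora_bytes)


adapter = b"stand-in adapter bytes"  # illustration only, not a real LoRA
manifest = {"lora": {"rank": 16, "hash": sha256_tag(adapter)}}
ok = lora_matches(manifest, adapter)
```

A single changed byte in the adapter produces a different digest, so the comparison fails and the load is refused.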

recipes.json · 5 to 20 MB

The draft pack of (prefix-shape, token) pairs extracted at compile time. Each entry is a deterministic shortcut: when the runtime sees a matching prefix, it drafts the predicted token instead of running the base. Each entry is verified against the base for correctness; the base is always the source of truth.
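
To make the lookup concrete, here is a toy Python sketch. It rests on several assumptions not stated in the spec: a "prefix-shape" is modelled as the last two tokens of the context, the base model is any callable that returns the next token, and the names next_token and agrees_with_base are hypothetical.

```python
from typing import Callable, Sequence

Recipes = dict[tuple[str, ...], str]
BaseModel = Callable[[Sequence[str]], str]


def next_token(recipes: Recipes, context: Sequence[str], base: BaseModel,
               width: int = 2) -> tuple[str, bool]:
    """Return (token, used_draft): use a matching recipe, else fall back to the base."""
    drafted = recipes.get(tuple(context[-width:]))
    if drafted is not None:
        return drafted, True
    return base(context), False


def agrees_with_base(recipes: Recipes, context: Sequence[str], base: BaseModel,
                     width: int = 2) -> bool:
    """Correctness check: a recipe is valid only if the base predicts the same token."""
    drafted = recipes.get(tuple(context[-width:]))
    return drafted is None or drafted == base(context)


def toy_base(context: Sequence[str]) -> str:
    """Stand-in for the base model (illustration only)."""
    return "," if tuple(context[-2:]) == ("dear", "customer") else "<unk>"


recipes: Recipes = {("dear", "customer"): ","}
hit = next_token(recipes, ["dear", "customer"], toy_base)
miss = next_token(recipes, ["hi", "there"], toy_base)
```

Here hit takes the drafted token and miss falls back to the base; agrees_with_base is the kind of check that would drop a recipe the base disagrees with.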

index.sqlite-vec · 100 MB to 8 GB

The multimodal recall index. A sqlite-vec database with embeddings for every chunk of the user's corpus. It is indexed by content hash and queryable by cosine similarity and by metadata. The runtime loads it read-only by default; kolm recall ingest appends new content from local sources.
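
The two query modes, cosine ranking and metadata filtering, can be illustrated without the database. The Python below is a plain in-memory stand-in and does not use sqlite-vec; the chunk fields (content_hash, modality, vector) and the function names are assumptions made for the example.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(chunks: list[dict], query: list[float],
           modality: Optional[str] = None, top_k: int = 3) -> list[dict]:
    """Filter by metadata first, then rank the remaining chunks by cosine similarity."""
    pool = [c for c in chunks if modality is None or c["modality"] == modality]
    return sorted(pool, key=lambda c: cosine(c["vector"], query), reverse=True)[:top_k]


# Toy two-dimensional embeddings (illustration only).
chunks = [
    {"content_hash": "sha256:aa", "modality": "text", "vector": [1.0, 0.0]},
    {"content_hash": "sha256:bb", "modality": "image", "vector": [0.0, 1.0]},
    {"content_hash": "sha256:cc", "modality": "text", "vector": [1.0, 1.0]},
]
best = recall(chunks, [1.0, 0.0], top_k=1)
```

A real index would push both the filter and the ranking into the database rather than scanning a list.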

verifier.js · 5 to 50 KB

A pure JavaScript scoring function that returns a score from 0 to 1 for any candidate output. It is synthesized at compile time from the seed examples and can be hand-edited and re-pinned by the engineer. Determinism is enforced: the compiler runs it 100× on a fixed input and rejects any verifier whose outputs differ.
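
The determinism check is simple enough to sketch. The verifier itself is JavaScript; the Python below only illustrates the repeated-call idea, with a callable standing in for the scoring function. The names is_deterministic, steady and flaky are hypothetical.

```python
import itertools
from typing import Callable


def is_deterministic(verifier: Callable[[str], float], candidate: str,
                     runs: int = 100) -> bool:
    """Call the verifier `runs` times on one fixed input; require identical scores in [0, 1]."""
    scores = [verifier(candidate) for _ in range(runs)]
    in_range = all(0.0 <= s <= 1.0 for s in scores)
    return in_range and all(s == scores[0] for s in scores)


def steady(text: str) -> float:
    """A pure scoring function: same input, same score."""
    return min(len(text) / 10, 1.0)


_calls = itertools.count()


def flaky(text: str) -> float:
    """Depends on hidden state, so repeated calls disagree."""
    return float(next(_calls) % 2)


steady_ok = is_deterministic(steady, "hello")
flaky_ok = is_deterministic(flaky, "hello")
```

A check like this catches hidden state, clocks and randomness, but it cannot prove purity; it only fails to find a counterexample on the chosen input.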

tests.jsonl · 10 to 200 KB

The held-out evaluation set, in JSONL. Each line is a record of {input, ideal, hash}. It is used at compile time to derive the K-score, and at runtime by kolm verify to re-derive that score independently. The hash makes silent test-set rewriting detectable.
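
A per-record hash check might look like the Python sketch below. The spec text does not say how the record hash is computed, so this assumes SHA-256 over a canonical JSON encoding of input and ideal (sorted keys, compact separators); record_hash and tampered_lines are hypothetical names.

```python
import hashlib
import json


def record_hash(input_text: str, ideal: str) -> str:
    """Hash a canonical JSON encoding of the two content fields (assumed scheme)."""
    payload = json.dumps({"input": input_text, "ideal": ideal},
                         sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def tampered_lines(jsonl_text: str) -> list[int]:
    """Return the 1-based line numbers whose stored hash does not match the content."""
    bad = []
    for number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("hash") != record_hash(record["input"], record["ideal"]):
            bad.append(number)
    return bad


good = json.dumps({"input": "a", "ideal": "b", "hash": record_hash("a", "b")})
edited = json.dumps({"input": "a", "ideal": "CHANGED", "hash": record_hash("a", "b")})
bad_lines = tampered_lines(good + "\n" + edited + "\n")
```

The second record keeps its old hash after the ideal answer was changed, so only line 2 is reported.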

signature.sig · 64 bytes

An HMAC-SHA256 over the canonical concatenation of every other component's content hash, computed in declaration order. The signing key is tenant-scoped; the verifying key is published on the public registry. Tamper with any byte and the signature fails.
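
As a rough illustration, the signature computation could be sketched in Python with the standard hmac module. The "canonical concatenation" is not defined in the text above, so the newline-joined encoding here is an assumption, as are the function names and the example key.

```python
import hashlib
import hmac


def sign_components(key: bytes, component_hashes: list[str]) -> str:
    """HMAC-SHA256 over the component hashes, joined in declaration order (assumed encoding)."""
    message = "\n".join(component_hashes).encode("ascii")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_signature(key: bytes, component_hashes: list[str], signature: str) -> bool:
    """Recompute the HMAC and compare in constant time."""
    return hmac.compare_digest(sign_components(key, component_hashes), signature)


key = b"example-tenant-key"  # hypothetical key, illustration only
hashes = ["sha256:a8f4", "sha256:2c91", "sha256:f12e", "sha256:9b03"]  # stand-in values
signature = sign_components(key, hashes)
valid = verify_signature(key, hashes, signature)
```

The hex-encoded digest is 64 characters long, which would line up with the 64-byte signature.sig if the file stores the digest as hex; that storage detail is also an assumption. Because order is part of the message, swapping two component hashes invalidates the signature just as editing one does.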

The HMAC signature chain.

Each .kolm belongs to a tenant's signing chain. The first artifact a tenant produces sets the genesis hash. Every subsequent artifact's signature includes parent_hash referencing the previous artifact's signature. The chain is published on the public registry as it grows.

This chain is what makes .kolm defensible at audit. Rewriting an old artifact requires forging every artifact emitted after it. The forgery is detectable by anyone who has cached an intermediate chain root. Public anchoring (Arweave, Bitcoin OP_RETURN) extends that detectability to the whole world.
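
The reason a rewrite forces re-forging every later artifact can be seen in a small Python sketch. It is a simplified model under stated assumptions: each link signs its parent_hash together with one content hash, the first link uses a placeholder parent, and all names (GENESIS, link_signature, build_chain, first_broken_link) are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

GENESIS = "0" * 64  # placeholder parent for the first link (assumption)


def link_signature(key: bytes, parent_hash: str, content_hash: str) -> str:
    """Sign the parent signature together with this artifact's content hash."""
    message = f"{parent_hash}\n{content_hash}".encode("ascii")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def build_chain(key: bytes, content_hashes: list[str]) -> list[dict]:
    """Create one link per artifact, each pointing at the previous signature."""
    chain, parent = [], GENESIS
    for content_hash in content_hashes:
        signature = link_signature(key, parent, content_hash)
        chain.append({"parent_hash": parent, "content_hash": content_hash,
                      "signature": signature})
        parent = signature
    return chain


def first_broken_link(key: bytes, chain: list[dict]) -> Optional[int]:
    """Return the index of the first link that fails verification, or None if all pass."""
    parent = GENESIS
    for index, link in enumerate(chain):
        expected = link_signature(key, parent, link["content_hash"])
        if link["parent_hash"] != parent or not hmac.compare_digest(expected, link["signature"]):
            return index
        parent = link["signature"]
    return None


key = b"example-tenant-key"  # hypothetical key, illustration only
chain = build_chain(key, ["sha256:aa", "sha256:bb", "sha256:cc"])

# Rewrite the first artifact and re-sign only that link.
chain[0]["content_hash"] = "sha256:evil"
chain[0]["signature"] = link_signature(key, GENESIS, "sha256:evil")
broken_at = first_broken_link(key, chain)
```

Even with the signing key, re-signing only the rewritten link leaves the next link pointing at the old signature, so verification fails at index 1 unless every later link is re-forged as well.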

# Verify an artifact against the public registry
$ kolm verify support-triage.kolm
✓ format v0.1
✓ component hashes match manifest
✓ signature verifies against tenant key acme-prod
✓ chain anchors to genesis 2026-04-12T00:00:00Z
✓ K-score 0.94 ≥ gate 0.70
artifact OK

Extending the format (vendor extensions).

The format reserves extensions/ as a namespace for vendor-specific payloads. Anything in this directory is content-hashed but not interpreted by the standard runtime.

If a runtime encounters an extension it doesn't understand, it MUST log a warning and continue. Extensions cannot affect the K-score or signature semantics; if you need that, propose it for the spec instead.
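
The warn-and-continue rule could be implemented along these lines. This Python sketch assumes a layout of extensions/<vendor>/<file>, which the spec text does not define; the vendor names, the logger name and the function scan_extensions are all hypothetical.

```python
import logging

log = logging.getLogger("kolm.runtime")  # hypothetical logger name


def scan_extensions(entry_names: list[str], known_vendors: set[str]) -> list[str]:
    """Log a warning for each extension entry from an unknown vendor, then carry on.

    Returns the skipped entry names; it never raises for an unknown extension.
    """
    skipped = []
    for name in entry_names:
        if not name.startswith("extensions/") or name.endswith("/"):
            continue  # not an extension payload, or just a directory entry
        vendor = name.split("/")[1]
        if vendor not in known_vendors:
            log.warning("ignoring unknown extension %s", name)
            skipped.append(name)
    return skipped


entries = [
    "manifest.json",
    "extensions/",
    "extensions/acme/telemetry.json",   # hypothetical vendor payloads
    "extensions/other/notes.txt",
    "signature.sig",
]
skipped = scan_extensions(entries, known_vendors={"acme"})
```

Unknown payloads are reported and skipped rather than treated as load failures, which matches the MUST in the paragraph above.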

FAQ.

Why ZIP and not a custom container?

Every OS reads ZIP. Every language reads ZIP. The compression and CRC are battle-tested. Reinventing the container would buy us a fraction of a percent on size and lose us decades of tooling.

Is the format open?

Yes. The spec is MIT-licensed at github.com/kolmogorov/kolm-spec. Anyone can produce .kolm files; anyone can write a runtime to load them. The Kolmogorov-hosted registry is one possible signing authority; you can run your own.

What about the model weights — am I allowed to redistribute them?

Each base model carries its own license; the manifest's base.family field tells you which base, and therefore which license, applies. Most open-weight bases (Qwen, Llama 3, Hermes 3, Phi-3) permit redistribution as part of derivative works under their respective terms. Your .kolm derivative inherits those terms.

Can I run a .kolm without the kolm CLI?

Yes. The format is open; any runtime that knows about GGUF + LoRA + sqlite-vec can load the components. We ship a reference runtime (kolm run); third-party runtimes are welcome.