§0 Abstract
This document specifies RS-1, the wire format and protocol for .kolm
artifacts. A .kolm is a single-file, deterministic, signed bundle that
encodes a base model, a personal LoRA adapter, a recipe pack of deterministic-token drafts,
a multimodal recall index, an evaluation suite, and a manifest with cryptographic anchors
to a public registry. The compiler that produces .kolm files is called
kolm; the compile pipeline has eight normative stages
(gather → spec → synthesize → k-sample → verify → fit → observe → sign).
RS-1 is designed to satisfy three goals simultaneously: portability (the same artifact runs on commodity CPUs, consumer GPUs, and mobile NPUs without a network), verifiability (any third party can byte-for-byte reproduce an artifact from its inputs, and any consumer of an artifact's output can verify the output's lineage offline), and compositionality (artifacts may import other artifacts, with the receipt chain surviving composition).
§1 Introduction
The current AI deployment stack forces three undesirable choices: (a) call a closed remote model and accept opacity, latency, and per-token cost; (b) self-host an open-weight model and accept poor task-specific quality; (c) fine-tune an open-weight model and accept fragile, undocumented, unverifiable artifacts.
RS-1 is a fourth option: compile a task into a portable artifact. The compiler distills a frontier model's behavior on the task into an open-weight base model + personal adapter + draft pack, signs every layer, and emits a single file. The artifact is owned by the operator, runs offline, and proves its own provenance.
This document is the normative spec. Implementations MAY add capabilities, but MUST NOT
rename existing fields, change layer ordering inside the zip, or break receipt verification
against the public registry at registry.kolm.ai.
github.com/sneaky-hippo/kolmogorov-stack. Specification changes follow
semantic versioning; backward-incompatible changes increment the major version
(RS-1 → RS-2) and require a registry epoch transition.
§2 Terminology
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174].
| Term | Definition |
|---|---|
| artifact | A .kolm file conforming to §3. |
| base model | An open-weight LLM whose weights are referenced by SHA-256 hash and quantization profile in the manifest. The student. |
| frontier model | A closed or remote LLM used during distillation as a label source. The teacher. |
| recipe | A deterministic-token draft pattern indexed by prefix-shape and embedding-shape, drawn from a public registry. |
| recall | A multimodal vector index over the operator's corpus, packaged inside the artifact. |
| k-sample | The protocol of querying a frontier model k times for the same input and selecting the verifier-passing winner (§5.4). |
| verifier | A deterministic checker that decides whether a candidate output is accepted (§7). |
| K-score | A composite quality metric in [0, 100] computed against the artifact's bundled eval suite (§6). |
| receipt | An HMAC-SHA256 over a canonical statement of (input, output, artifact, timestamp) that anchors the inference event to the registry (§8). |
| registry | The public service at registry.kolm.ai that serves recipe packs, anchors receipts, and tracks specification epochs. |
| conformant runtime | A program that loads and executes a .kolm per §11. |
§3 Artifact format
A .kolm is a ZIP archive (PKZip 2.0 / DEFLATE) with a fixed inner layout.
Stored entries MUST appear in the canonical order below, written
with deterministic ZIP options (mtime fixed to 2020-01-01T00:00:00Z, extra
fields stripped, UTF-8 names, no Zip64 unless any single layer exceeds 4 GiB).
| # | Path | Required | Purpose |
|---|---|---|---|
| 01 | manifest.json | yes | The single source of truth for every other layer's hash, size, and role. Read first by every runtime. |
| 02 | signature.sig | yes | HMAC-SHA256 chain root. 256 bytes binary. See §8 and Appendix B. |
| 03 | model.gguf | yes | The base model in GGUF format [GGUF-v3]. Quantization profile MUST match manifest.base_model.quantization. |
| 04 | lora.bin | conditional | The personal adapter in GGUF-LoRA format. REQUIRED if any compile stage produced training labels (§5.6). |
| 05 | recipes.json | yes | The recipe pack: deterministic-token drafts indexed by prefix-shape and embedding-shape (§7.5). |
| 06 | index.sqlite-vec | conditional | The multimodal recall index. REQUIRED if a corpus was supplied; otherwise MAY be omitted. |
| 07 | tests.jsonl | yes | The evaluation suite the artifact passed at compile time. Used for K-score recompute (§6.3). |
| 08 | verifiers.json | yes | The verifier definitions referenced by tests.jsonl (§7). |
| 09 | provenance/ | OPTIONAL | Audit material: prompts, k-sample logs, label-acceptance traces. Operators MAY strip this for privacy without breaking signature verification; the signature commits to a stripped-tree hash. |
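The deterministic ZIP options above can be sketched with Python's standard `zipfile` module. This is an illustrative sketch, not the reference compiler: the function name `write_kolm`, the pinned `create_system` and `external_attr` values, and the sorted placement of `provenance/` entries are assumptions. DEFLATE output can also vary between zlib builds, so byte-identity across hosts additionally requires a pinned compressor.

```python
import io
import zipfile

# Canonical layer order from the table above; any other paths
# (e.g. provenance/ entries) follow in sorted order (assumption).
CANONICAL_ORDER = [
    "manifest.json", "signature.sig", "model.gguf", "lora.bin",
    "recipes.json", "index.sqlite-vec", "tests.jsonl", "verifiers.json",
]
FIXED_MTIME = (2020, 1, 1, 0, 0, 0)  # 2020-01-01T00:00:00Z


def write_kolm(layers: dict[str, bytes]) -> bytes:
    """Pack layers with fixed order, fixed mtime, and no extra fields."""
    known = [p for p in CANONICAL_ORDER if p in layers]
    rest = sorted(p for p in layers if p not in CANONICAL_ORDER)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in known + rest:
            info = zipfile.ZipInfo(path, date_time=FIXED_MTIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3            # pin the host-OS field (assumption)
            info.external_attr = 0o644 << 16  # pin permissions (assumption)
            zf.writestr(info, layers[path])   # ZipInfo.extra stays empty
    return buf.getvalue()
```

Because the entry order comes from the table rather than from the caller's dictionary order, two calls with the same layers yield the same bytes on the same host.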
The first 4 KiB of a .kolm MUST contain the local file header for
manifest.json. Runtimes MAY read only the first 4 KiB to detect format
and version before deciding whether to stream the rest.
§4 Manifest schema
The manifest is a UTF-8 JSON document with the following top-level shape. Full JSON Schema is in Appendix A.
// manifest.json - RS-1 1.0.0
{
  "rs": "1.0.0",                          // spec version
  "id": "kolm:8b73e9...02",               // artifact id = sha256(zip)[:32]
  "created_at": "2026-05-08T14:31:00Z",
  "compiler": { "name": "kolm", "version": "6.5.0" },
  "task": {
    "description": "detect whether a short text is a greeting",
    "intent_hash": "7c1af2..."            // see §5.2
  },
  "base_model": {
    "name": "qwen2.5-3b-instruct",
    "weights_sha256": "3a92c1...",
    "quantization": "Q4_K_M"
  },
  "adapter": {
    "format": "gguf-lora",
    "rank": 16,
    "alpha": 32,
    "epochs": 3,
    "weights_sha256": "4d22ef..."
  },
  "recipes": {
    "registry_epoch": "rs-1@2026-05-06",
    "pack_sha256": "ab02f1...",
    "count": 142
  },
  "recall": {
    "embedder": "bge-m3",
    "chunks": 412,
    "index_sha256": "8b73e9..."
  },
  "verifiers": [
    { "id": "v_schema_0", "type": "schema", "sha256": "..." },
    { "id": "v_regex_1", "type": "regex", "sha256": "..." }
  ],
  "k_score": {
    "composite": 94.2,
    "components": { "task": 96, "calibration": 91, "latency": 95 },
    "gate": "passed",                     // "passed" | "warned" | "failed"
    "floor": 85
  },
  "signature": {
    "alg": "hmac-sha256",
    "anchored_to": "registry.kolm.ai/anchor/2026-05-08-31415",
    "layer_hashes": { "manifest": "...", "model": "...", /* ... */ }
  }
}
Implementation-specific fields carry the prefix x_ (e.g. x_acme_metadata); such
fields MUST NOT be assumed present by other implementations.
§5 Compile pipeline
The compile process has eight stages. Stages MUST execute in the order below; a conformant compiler MAY elide stages 4 (k-sample), 5 (verify), or 6 (fit) when their inputs are absent, but stages 1, 2, 3, 7, and 8 are REQUIRED.
5.1 Stage 1: Gather
Inputs: a task description, optional examples (jsonl), optional corpus
(any directory). The compiler MUST canonicalize examples to UTF-8, strip BOMs, and
record the SHA-256 of the canonicalized input bundle as task.input_hash.
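A minimal sketch of the canonicalize-and-hash step follows. It assumes the inputs are already UTF-8 (possibly with a BOM) and uses a hypothetical length-prefixed framing of sorted (name, content) pairs for the bundle; RS-1 as quoted here does not define the bundle framing, so `input_hash` below is illustrative only.

```python
import hashlib


def canonicalize(raw: bytes) -> bytes:
    """Decode as UTF-8, dropping a leading BOM if present, then re-encode."""
    return raw.decode("utf-8-sig").encode("utf-8")


def input_hash(files: dict[str, bytes]) -> str:
    """SHA-256 over the canonicalized bundle (hypothetical framing)."""
    h = hashlib.sha256()
    for name in sorted(files):  # sorted names make the hash order-independent
        name_bytes = name.encode("utf-8")
        body = canonicalize(files[name])
        # Length prefixes keep (name, body) boundaries unambiguous.
        h.update(len(name_bytes).to_bytes(8, "big"))
        h.update(name_bytes)
        h.update(len(body).to_bytes(8, "big"))
        h.update(body)
    return h.hexdigest()
```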
5.2 Stage 2: Spec
The compiler synthesizes a structured task specification: input shape, output shape,
desiderata, edge cases. The spec is hashed to task.intent_hash; two compiles
with semantically equivalent task descriptions SHOULD produce the
same intent hash up to documented normalization rules (whitespace, casing, punctuation).
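The normalization rules named above (whitespace, casing, punctuation) might look like the following. This is a sketch under assumptions: it hashes a single text field as a stand-in for the full structured spec, strips only ASCII punctuation, and the function names are hypothetical.

```python
import hashlib
import re
import string

# Translation table that deletes ASCII punctuation (assumed rule set).
_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = text.lower().translate(_PUNCT)
    return re.sub(r"\s+", " ", text).strip()


def intent_hash(description: str) -> str:
    """SHA-256 of the normalized text, hex-encoded."""
    return hashlib.sha256(normalize(description).encode("utf-8")).hexdigest()
```

Under these rules, descriptions that differ only in spacing, case, or punctuation hash identically, while a change of wording produces a different hash.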
5.3 Stage 3: Synthesize verifier
From the spec the compiler emits one or more verifiers (§7). At least one verifier MUST be of a deterministic type (schema, regex, or function); classifier-typed verifiers MAY only be used in composite mode where a deterministic verifier dominates.
5.4 Stage 4: K-sample
For each example or generated input, the compiler queries a frontier model k times
(default k=5, range 1–32) and applies the synthesized verifier to each output. The
first verifier-passing output is selected as the label. If no output passes within k
attempts, the example is recorded as frontier.unverified and excluded from
training (FM-001 budget tracking applies; see §10.3).
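The k-sample loop can be sketched as below. The `query` and `verify` callables are placeholders for the frontier-model client and the synthesized verifier; neither signature is defined by this document.

```python
from typing import Callable, Optional


def k_sample(
    query: Callable[[str], str],
    verify: Callable[[str, str], bool],
    example: str,
    k: int = 5,
) -> Optional[str]:
    """Return the first verifier-passing output within k attempts, else None."""
    if not 1 <= k <= 32:
        raise ValueError("k must be in the range 1-32")
    for _ in range(k):
        candidate = query(example)
        if verify(example, candidate):
            return candidate
    return None  # caller records the example as frontier.unverified
```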
5.5 Stage 5: Verify
An independent verifier pass is run over all accepted (input, output) pairs to detect
label drift. Pairs that fail re-verification are dropped. The acceptance rate
is recorded in provenance/k-sample.log.
5.6 Stage 6: Fit
If at least one accepted pair exists, the compiler fits a LoRA adapter on the verified pairs against the chosen base model. Default hyperparameters are listed in §A.adapter. Implementations MAY use INT4 + bf16 mixed-precision training when the host has sufficient VRAM; otherwise full-precision training on CPU is permitted but NOT RECOMMENDED.
5.7 Stage 7: Observe
The compiler runs the fitted artifact against the bundled eval suite (tests.jsonl)
and computes the K-score (§6). Recipe candidates extracted from accepted outputs are
proposed back to the registry; recipes that pass the registry's gated synthesis become
part of future epochs.
If the K-score gate fails, the compiler MUST NOT emit a signed
artifact. It SHOULD emit a diagnostic bundle under
./build/ for the operator to inspect. See FM-001 in §10.3.
5.8 Stage 8: Sign
The compiler computes layer hashes, builds the manifest, computes the signature chain (§8), writes the artifact, and posts the anchor record to the registry. After registry acknowledgment the artifact's signature is considered anchored; consumers may verify the anchor at any later time, including offline against a previously cached epoch root.
§6 K-score
K-score is the composite quality metric reported by every compile. It consists of a single composite number in [0, 100] and three named subcomponents (task, calibration, latency).
6.1 Definition
Let T be the task accuracy on the bundled eval suite (the proportion of
verifier-passing outputs, in [0, 1]). Let C be the calibration score, in [0, 1]: 1 minus the
mean absolute deviation between the artifact's stated confidence and its observed
accuracy on a 10-bucket reliability diagram. Let L be the latency score, in [0, 100]:
a step function over p50 inference time (<10 ms = 100, <50 ms = 95, <250 ms = 85,
<1 s = 70, <5 s = 50, ≥5 s = 0). L is divided by 100 so that all three terms share the [0, 1] scale. Then:
// composite K-score
K = round( 100 · ( 0.60·T + 0.25·C + 0.15·L/100 ) , 1 )
The weights (0.60, 0.25, 0.15) are normative for K-score 1.0; they
MAY be overridden by a task-specific profile (e.g. for latency-critical
or calibration-critical workloads), in which case the profile name and weights
MUST appear in manifest.k_score.profile.
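The definition in §6.1 can be sketched as follows. Two assumptions are made and marked in the code: the latency score is divided by 100 so that all three terms share a [0, 1] scale (which keeps K within [0, 100]), and the calibration gap is averaged over non-empty buckets only. Function names are illustrative.

```python
from typing import Sequence


def latency_score(p50_seconds: float) -> int:
    """Step function over p50 inference time, on a 0-100 scale."""
    steps = [(0.010, 100), (0.050, 95), (0.250, 85), (1.0, 70), (5.0, 50)]
    for bound, score in steps:
        if p50_seconds < bound:
            return score
    return 0


def calibration_score(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """1 minus the mean |confidence - accuracy| over a 10-bucket diagram."""
    buckets: list[list[tuple[float, bool]]] = [[] for _ in range(10)]
    for conf, ok in zip(confidences, correct):
        buckets[min(int(conf * 10), 9)].append((conf, ok))
    gaps = []
    for bucket in buckets:
        if bucket:  # assumption: empty buckets are skipped
            mean_conf = sum(c for c, _ in bucket) / len(bucket)
            accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
            gaps.append(abs(mean_conf - accuracy))
    if not gaps:
        return 0.0
    return 1.0 - sum(gaps) / len(gaps)


def k_score(t: float, c: float, p50_seconds: float,
            weights: tuple[float, float, float] = (0.60, 0.25, 0.15)) -> float:
    """Composite K-score; t and c are in [0, 1]."""
    w_t, w_c, w_l = weights
    l = latency_score(p50_seconds) / 100  # assumption: normalize L to [0, 1]
    return round(100 * (w_t * t + w_c * c + w_l * l), 1)
```

A task-specific profile would pass its own `weights` tuple, which per the text above must then be recorded in manifest.k_score.profile.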
6.2 Gate semantics
| Gate | Condition | Behavior |
|---|---|---|
| passed | K ≥ floor and T ≥ 0.85 | Artifact emitted normally. |
| warned | Not failed, and either floor > K ≥ floor − 5 or 0.85 > T ≥ 0.75 | Artifact emitted with a warning record. Operators are advised to inspect the eval suite. |
| failed | K < floor − 5 or T < 0.75 | Artifact MUST NOT be emitted. Compile exits non-zero. See FM-001. |
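The gate table can be sketched as a single function. One assumption is made explicit: the failed condition is evaluated first, so an artifact that satisfies a warned clause and a failed clause at once is treated as failed.

```python
def gate(k: float, t: float, floor: float) -> str:
    """Classify a compile as 'passed', 'warned', or 'failed' per §6.2."""
    if k < floor - 5 or t < 0.75:
        return "failed"   # assumption: failed takes precedence over warned
    if k >= floor and t >= 0.85:
        return "passed"
    return "warned"       # everything between the two conditions above
```

For example, with floor = 85 the manifest values K = 94.2 and T = 0.96 pass, while K = 82 with T = 0.90 is emitted with a warning.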
6.3 Recompute protocol
Any party with the artifact MAY recompute the K-score by:
(1) loading tests.jsonl and verifiers.json;
(2) running the artifact's inference path on each test input;
(3) applying the verifier;
(4) recomputing T, C, L per §6.1.
A divergence of more than ±0.5 K-points between recomputed and stated values
SHOULD be reported as a tampering signal.
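Steps (1) to (3) and the divergence check can be sketched as below. The record keys `input` and `verifier`, and the shape of the `infer` and `verifiers` arguments, are hypothetical; the actual tests.jsonl schema is not given in this section.

```python
from typing import Callable, Iterable, Mapping


def recompute_task_accuracy(
    tests: Iterable[Mapping[str, str]],
    infer: Callable[[str], str],
    verifiers: Mapping[str, Callable[[str, str], bool]],
) -> float:
    """Run each test input through the artifact and return the pass rate T."""
    passed = total = 0
    for test in tests:
        output = infer(test["input"])            # hypothetical record key
        check = verifiers[test["verifier"]]      # hypothetical record key
        passed += bool(check(test["input"], output))
        total += 1
    return passed / total if total else 0.0


def diverges(stated_k: float, recomputed_k: float, tolerance: float = 0.5) -> bool:
    """True when the recomputed K-score differs by more than the tolerance."""
    return abs(stated_k - recomputed_k) > tolerance
```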
§7 Verifier types
A verifier is a deterministic decision procedure on (input, candidate_output) → boolean. Five verifier types are normative.
| Type | Form | Deterministic? | Use |
|---|---|---|---|
| schema | JSON Schema 2020-12 over output | yes | Structured outputs (extraction, function-calls). |
| regex | RE2 regular expression over output | yes | Format constraints (dates, IDs, classification labels). |
| function | Pure JavaScript / Python evaluator referenced by SHA-256 | yes | Computable predicates (parser, math, code-runs-and-returns). |
| classifier | Bundled small model (≤200 MB) with a calibrated threshold | no, but stable | Subjective tasks (style, tone, harmlessness). MUST be paired with a deterministic gate inside a composite. |
| composite | Boolean combination over the above | follows leaves | Real tasks. AND/OR with a documented short-circuit order. |
Verifiers are referenced from tests.jsonl by ID and their full body is stored
in verifiers.json. The body is content-addressed; renaming a verifier
invalidates the receipt chain.
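The regex and composite types can be illustrated with a short sketch. Note the substitution: Python's built-in `re` engine stands in for RE2 here, and the two engines do not accept exactly the same syntax. The helper names are illustrative.

```python
import re
from typing import Callable

# A verifier maps (input, candidate_output) to a boolean.
Verifier = Callable[[str, str], bool]


def regex_verifier(pattern: str) -> Verifier:
    """Accept outputs that match the whole pattern (re used in place of RE2)."""
    compiled = re.compile(pattern)
    return lambda _input, output: compiled.fullmatch(output) is not None


def all_of(*leaves: Verifier) -> Verifier:
    """AND composite; leaves run left to right and stop at the first failure."""
    return lambda i, o: all(leaf(i, o) for leaf in leaves)


def any_of(*leaves: Verifier) -> Verifier:
    """OR composite; leaves run left to right and stop at the first success."""
    return lambda i, o: any(leaf(i, o) for leaf in leaves)
```

Listing the deterministic leaf first in `all_of` gives the documented short-circuit order: a classifier leaf placed after it is only consulted when the deterministic gate has already accepted.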
§8 Receipt chain
A receipt is an HMAC-SHA256 over a canonical statement of an inference event. Receipts
are emitted at every kolm run invocation and chain to an artifact's signature,
which itself anchors to the public registry.
8.1 Statement form
// canonical receipt statement (CSON, sorted keys)
{
  "v": "rs-1-receipts/1.0.0",
  "artifact": "kolm:8b73e9...02",
  "input_hash": "sha256:7c1af2...",
  "output_hash": "sha256:e84b91...",
  "runtime": { "name": "kolm-run", "version": "6.5.0", "host": "linux-x86_64" },
  "observed_at": "2026-05-08T14:32:11Z",
  "k_score_passed": true
}
8.2 Chain construction
The receipt body is HMAC-SHA256'd under a per-artifact key derived from the artifact's signature and the operator's tenant secret. The resulting MAC is appended to the receipt body to form the wire-format receipt.
An artifact's signature.sig is itself an HMAC over the canonical concatenation
of every layer hash listed in the manifest, computed under the registry epoch key. Anchor
posts to the registry are inserted into a Merkle tree; the registry publishes the daily
root at registry.kolm.ai/epoch/<date>/root.json.
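Receipt construction can be sketched with Python's `hmac` and `json` modules. Two details are assumptions, since this section does not pin them down: the canonical form is taken to be compact JSON with sorted keys, and the per-artifact key is derived as HMAC-SHA256(tenant_secret, signature.sig).

```python
import hashlib
import hmac
import json

MAC_LEN = 32  # bytes in an HMAC-SHA256 digest


def canonical(statement: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace (assumed form)."""
    return json.dumps(statement, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def derive_key(signature_sig: bytes, tenant_secret: bytes) -> bytes:
    """Per-artifact key; this derivation is an assumption, not normative."""
    return hmac.new(tenant_secret, signature_sig, hashlib.sha256).digest()


def make_receipt(statement: dict, key: bytes) -> bytes:
    """Wire-format receipt: canonical body followed by its MAC."""
    body = canonical(statement)
    return body + hmac.new(key, body, hashlib.sha256).digest()


def verify_receipt(receipt: bytes, key: bytes) -> bool:
    """Recompute the MAC over the body and compare in constant time."""
    body, mac = receipt[:-MAC_LEN], receipt[-MAC_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    return hmac.compare_digest(mac, expected)
```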
8.3 Offline verification
To verify a receipt offline a verifier needs: (1) the receipt body and its MAC,
(2) the artifact's signature.sig and manifest, (3) the registry epoch
root for the day the artifact was anchored. Given those three inputs the verifier
can reconstruct every link in the chain without any network access.
§9 Reproducibility
RS-1 commits to byte-identical reproducibility: given the same
(task, examples, corpus, base_model, recipe_registry@epoch), two
independent compiles MUST produce the same artifact bytes:
not merely an artifact with the same K-score, but the identical zip.
Mechanism:
- K-sample selection ties broken by lexicographic comparison of canonical output strings, not wall-clock arrival.
- LoRA training uses fixed seeds derived from task.intent_hash; data shuffling is a deterministic permutation of (intent_hash, example_index).
- ZIP entries written with fixed mtime, no extra fields, and stable per-file order (§3).
- Floating-point determinism on GPU is achieved via cuBLAS workspace limits and the deterministic-cudnn flag; on CPU via single-threaded BLAS for the final norm-and-clip pass.
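Two of the mechanisms above, tie-breaking and deterministic shuffling, can be sketched without any randomness source. The construction below (sorting example indices by a SHA-256 of the intent hash and index) is one possible way to obtain a permutation determined by (intent_hash, example_index); it is an assumption, not the reference compiler's method.

```python
import hashlib


def shuffle_order(intent_hash: str, n_examples: int) -> list[int]:
    """Permutation of range(n_examples) fixed entirely by the intent hash."""
    def sort_key(index: int) -> bytes:
        return hashlib.sha256(f"{intent_hash}:{index}".encode("utf-8")).digest()
    return sorted(range(n_examples), key=sort_key)


def pick_winner(passing_outputs: list[str]) -> str:
    """Break ties by lexicographic order of the canonical output strings."""
    return min(passing_outputs)
```

Neither function reads a clock or a random number generator, so repeated compiles on any host see the same order and the same winner.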
A frontier model that is non-deterministic at temperature 0 (e.g. due to provider-side
batching) is the only legitimate source of irreproducibility. The compiler caches the
first compile's k-sample winners under provenance/k-sample.log; replays
use the cached winners when present and the same inputs are given.
§10 Threat model & failure modes
10.1 In scope
- Tampered artifacts (modified weights, swapped LoRA, edited eval suite).
- Forged receipts (claiming an output came from a given artifact when it did not).
- Registry-side recipe corruption (a malicious draft pattern that biases inference).
- Frontier-model failure during compile (rate limit, content policy, downtime).
- Runtime failure on the consumer device (missing CPU instructions, OOM).
10.2 Out of scope
- Side-channel attacks on the operator's machine.
- Plaintext exfiltration before compile begins (the operator owns their inputs and is responsible for transport).
- Inverting an artifact to reconstruct training data. A compiled .kolm stores neither raw examples nor raw corpus; only embedded chunks and verifier-passing labels are committed.
10.3 Failure modes (FM)
Twelve normative failure modes are defined. Each has an exit code, an HTTP status (when surfaced via the API), and a recovery action. A representative slice:
| Code | Phase | Severity | Action |
|---|---|---|---|
| FM-001 | compile | fail | K-score below floor; emit diagnostic bundle, exit 65. |
| FM-002 | compile | degraded | Corpus partially unreadable; proceed, warn, list skipped files. |
| FM-003 | compile | degraded | Frontier quota exceeded; fall back to cached k-sample winners; mark recompile_recommended. |
| FM-101 | runtime | fail | Signature invalid; refuse to execute, exit 70. |
| FM-103 | runtime | fail | OOM. Return 503, suggest smaller quantization. |
| FM-201 | registry | degraded | Registry unavailable; verify against last-cached epoch root. |
| FM-301 | auth | fail | Bearer invalid. Return 401. |
§11 Conformance
An implementation is conformant if it satisfies the rules in this section. The
public test suite at github.com/sneaky-hippo/kolmogorov-stack/tree/main/conformance
is the operational definition; this section is the human-readable summary.
11.1 Compiler conformance
- MUST implement stages 1, 2, 3, 7, 8 (§5).
- MUST emit byte-identical artifacts for identical inputs (§9).
- MUST NOT emit a signed artifact if the K-score gate fails (§6.2).
- MUST include at least one deterministic verifier per task (§7).
- MUST post anchor records to the registry on success.
11.2 Runtime conformance
- MUST verify signature.sig before executing any layer.
- MUST emit a receipt per inference event (§8).
- MUST refuse to execute artifacts whose rs field is a higher major version than the runtime supports.
- SHOULD support offline operation. Receipt verification and inference both MUST work without network when the epoch root is locally cached.
- MAY implement recipe-driven speculative decoding (§7.5); doing so does not affect output correctness, only latency.
11.3 Naming
Implementations that pass the conformance suite MAY use the
name kolm, the file extension .kolm, and the registered MIME type
application/vnd.kolm.rs1+zip. The names kolm and
Kolmogorov are trademarks of Kolmogorov, Inc. Use is permitted in the context of
conformant implementations and in editorial reference. The names
RS-1, K-score, and recipe pack are public terms.
§12 References
- [RFC2119] Bradner, S. Key words for use in RFCs to Indicate Requirement Levels. 1997.
- [RFC8174] Leiba, B. Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. 2017.
- [GGUF-v3] Gerganov, G. et al. GGUF: GPT-Generated Unified Format, version 3. 2024.
- [LoRA] Hu, E. et al. LoRA: Low-Rank Adaptation of Large Language Models. 2021.
- [SpecDec] Leviathan, Y. et al. Fast Inference from Transformers via Speculative Decoding. 2023.
- [BGE-M3] Xiao, S. et al. BGE M3-Embedding. 2024.
- [SQLITE-VEC] Garcia, A. sqlite-vec: A vector search SQLite extension. 2024.
- [JSON-SCHEMA-2020-12] JSON Schema Draft 2020-12. json-schema.org.
- Reproducibility appendix and conformance test vectors live at
github.com/sneaky-hippo/kolmogorov-stack.
§A Appendix A: manifest JSON Schema
The full JSON Schema for manifest.json is published at
kolm.ai/schemas/manifest-1.0.0.json. Required top-level keys are:
| Key | Type | Constraint |
|---|---|---|
| rs | string | SemVer; major MUST match runtime major. |
| id | string | Pattern kolm:[0-9a-f]{32}. |
| created_at | string | RFC 3339 UTC. |
| compiler | object | Required keys: name, version. |
| task | object | Required keys: description, intent_hash. |
| base_model | object | Required keys: name, weights_sha256, quantization. |
| adapter | object | Required when stage 6 ran. Keys: format, rank, alpha, epochs, weights_sha256. |
| recipes | object | Required keys: registry_epoch, pack_sha256, count. |
| recall | object | Required when corpus supplied. Keys: embedder, chunks, index_sha256. |
| verifiers | array | Min length 1. Each entry: id, type, sha256. |
| k_score | object | Required keys: composite, components, gate, floor. |
| signature | object | Required keys: alg, anchored_to, layer_hashes. |
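A lightweight structural check against the table above could look like the following. It is a sketch, not a substitute for the published JSON Schema: it covers only the unconditionally required keys, the id pattern, and the minimum verifier count, and the function name is hypothetical.

```python
import re

# Unconditionally required top-level keys and their required sub-keys.
# adapter and recall are conditional and therefore omitted here.
REQUIRED = {
    "rs": [], "id": [], "created_at": [],
    "compiler": ["name", "version"],
    "task": ["description", "intent_hash"],
    "base_model": ["name", "weights_sha256", "quantization"],
    "recipes": ["registry_epoch", "pack_sha256", "count"],
    "verifiers": [],
    "k_score": ["composite", "components", "gate", "floor"],
    "signature": ["alg", "anchored_to", "layer_hashes"],
}
ID_PATTERN = re.compile(r"kolm:[0-9a-f]{32}")


def manifest_errors(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    errors = []
    for key, subkeys in REQUIRED.items():
        if key not in manifest:
            errors.append(f"missing key: {key}")
            continue
        value = manifest[key]
        if subkeys and not isinstance(value, dict):
            errors.append(f"{key} must be an object")
            continue
        for sub in subkeys:
            if sub not in value:
                errors.append(f"missing key: {key}.{sub}")
    if "id" in manifest and not ID_PATTERN.fullmatch(str(manifest["id"])):
        errors.append("id does not match kolm:[0-9a-f]{32}")
    if "verifiers" in manifest and len(manifest["verifiers"]) < 1:
        errors.append("verifiers must contain at least one entry")
    return errors
```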
A.adapter: default LoRA hyperparameters
| Param | Default | Range |
|---|---|---|
| rank | 16 | 4–64 |
| alpha | 32 | = 2·rank |
| epochs | 3 | 1–10 |
| learning_rate | 2e-4 | 1e-5–1e-3 |
| batch_size | auto | scaled to host VRAM |
| quantization | INT4 + bf16 | {INT4, INT8, fp16, bf16} |
§B Appendix B: signature wire format
The signature.sig file is a 256-byte binary blob with the following layout:
// signature.sig - RS-1 1.0.0
offset  size  field
     0     4  magic "KOLM"
     4     2  version 0x0100 (= 1.0)
     6     2  reserved 0x0000
     8    32  manifest_sha256
    40    32  layers_concat_sha256   // sha256 of sorted (path, sha256) tuples
    72    32  epoch_root             // registry root at anchor time
   104    32  anchor_record_id       // id assigned by registry post
   136    32  hmac                   // hmac-sha256 over bytes 0..135 under epoch key
   168    88  reserved (zeros)
   256        EOF
Verification: a runtime computes HMAC-SHA256(epoch_key, bytes[0..135]) and
compares to bytes[136..167]. The epoch key is published at
registry.kolm.ai/epoch/<anchor_date>/key.json and signed by the
registry's long-term Ed25519 key.
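The layout and verification rule can be sketched with `struct` and `hmac`. The byte order of the version field is not stated in the layout, so big-endian is assumed here (0x0100 then reads as major 1, minor 0); fetching the epoch key and checking its Ed25519 signature are out of scope for this sketch.

```python
import hashlib
import hmac
import struct

# magic, version, reserved, four 32-byte hashes/ids, hmac, 88 reserved bytes.
LAYOUT = struct.Struct(">4sHH32s32s32s32s32s88s")  # big-endian is an assumption
SIGNED_LEN = 136  # the MAC covers bytes 0..135


def verify_signature(sig: bytes, epoch_key: bytes) -> bool:
    """Check size, magic, major version, and the HMAC over the signed prefix."""
    if len(sig) != LAYOUT.size:
        return False
    (magic, version, _reserved, _manifest_sha256, _layers_sha256,
     _epoch_root, _anchor_id, mac, _padding) = LAYOUT.unpack(sig)
    if magic != b"KOLM" or version >> 8 != 1:
        return False
    expected = hmac.new(epoch_key, sig[:SIGNED_LEN], hashlib.sha256).digest()
    return hmac.compare_digest(mac, expected)
```

A runtime would call this before loading any other layer and refuse to execute on a False result, which corresponds to FM-101.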
github.com/sneaky-hippo/kolmogorov-stack.
The maintainers can be reached at spec@kolm.ai.