The provenance problem.
You hand a fine-tuned model to a regulator. They ask three questions. What data trained it. What code produced it. What base model it descended from. The honest answer for most production fine-tunes today is a Slack thread, a Notion page, and a hope that the engineer who ran the job has not switched companies. The model itself is a tarball of weights with a filename and no embedded provenance.
A receipt chain solves this at the artifact layer. Every .kolm file carries an in-band proof of how it was built, signed under a tenant key, verifiable from the bytes alone. No network call. No registry lookup. No trust in our continued existence. The verifier is open-source Rust; you can vendor it, embed it, and run it in an air-gapped network.
The five-step chain.
The chain is a sequence of five HMAC seals. Each step hashes a canonical-JSON envelope over (step, input_hash, output_hash) and HMACs the result under the tenant receipt secret. The previous step's output_hash is the next step's input_hash; a single mutation anywhere in the build pipeline produces a different chain.
- step 1 (task): spec_hash → task_hash
- step 2 (seeds): task_hash → training_stats_hash
- step 3 (recipes): training_stats_hash → recipes_hash
- step 4 (evals): recipes_hash → eval_set_hash
- step 5 (package): eval_set_hash → artifact_hash
Each output_hash is a sha256 over canonical-JSON of the corresponding input. The spec_hash is over the static artifact specification version (so a future format break invalidates old chains). The task_hash is over the user-supplied task description. The training_stats_hash is over the run-level metrics the trainer emitted. The recipes_hash is over the deterministic recipe registry the compiler synthesized. The eval_set_hash is over the held-out cases the K-score was computed against. The artifact_hash is over the canonical join of every emitted file's sha256.
The Node reference implementation is short:
// src/artifact.js
const crypto = require('crypto');

// `secret` is the tenant receipt secret; `canonicalJson` and `sha256`
// are defined alongside in src/artifact.js.
const stepSeal = (step, input_hash, output_hash) => {
  const hmac = crypto.createHmac('sha256', secret)
    .update(canonicalJson({ step, input_hash, output_hash }))
    .digest('hex');
  return { step, input_hash, output_hash, hmac };
};

const chain = [
  stepSeal('task', sha256(canonicalJson({ spec: ARTIFACT_SPEC })), taskHash),
  stepSeal('seeds', taskHash, seedsHash),
  stepSeal('recipes', seedsHash, recipesHash),
  stepSeal('evals', recipesHash, evalsHash),
  stepSeal('package', evalsHash, artifact_hash),
];
The whole chain plus the body of the receipt is then sealed under one final HMAC, the receipt body signature. A verifier walks the chain forward (every step's hmac must reproduce under the secret), confirms the linkage between steps (each step's input_hash equals the previous step's output_hash), and re-verifies the body HMAC over the canonical-JSON of the receipt body.
The chain is not a Merkle tree. It is a linear hash ladder. The trade is simpler verification at the cost of one extra hash per step. For five steps this is free; for fifty steps it would not be.
Canonical JSON.
The hardest bug in any cross-language verifier is JSON serialization drift. Two implementations that produce semantically identical JSON but format it differently produce different sha256 hashes and an HMAC mismatch that is excruciating to debug. The fix is canonical JSON: a deterministic serialization with sorted keys and no whitespace.
The kolm canonical form is short enough to memorize:
// src/artifact.js
function canonicalJson(v) {
  if (v === null || typeof v !== 'object') return JSON.stringify(v);
  if (Array.isArray(v)) return '[' + v.map(canonicalJson).join(',') + ']';
  const k = Object.keys(v).sort();
  return '{' + k.map(x => JSON.stringify(x) + ':' + canonicalJson(v[x])).join(',') + '}';
}
Three rules. Scalars use the host language's default JSON serializer. Arrays recurse element-by-element with no whitespace. Objects sort keys lexicographically, then recurse. The Rust verifier in packages/runtime-rs/ implements the same function on the same rule set; the Python reference in apps/trainer/ matches; an Ed25519 verifier we are sketching follows the same canonical form.
The subtle bug we fixed during the infra wave was a missing array recursion. A first implementation passed arrays through with default JSON.stringify, which produced the same bytes for primitive arrays but differed for arrays of objects (key order inside elements was not normalized). The chain verified on round-trip from the same implementation but failed cross-language. The fix was the one line above: v.map(canonicalJson).join instead of JSON.stringify(v).
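The divergence is easy to reproduce. A minimal demo using only the canonicalJson function above; nothing here is from the shipped code:

```javascript
// Canonical JSON: sorted keys, no whitespace, arrays recursed element-wise.
function canonicalJson(v) {
  if (v === null || typeof v !== 'object') return JSON.stringify(v);
  if (Array.isArray(v)) return '[' + v.map(canonicalJson).join(',') + ']';
  const k = Object.keys(v).sort();
  return '{' + k.map(x => JSON.stringify(x) + ':' + canonicalJson(v[x])).join(',') + '}';
}

// Primitive arrays: both serializers agree byte-for-byte.
console.log(JSON.stringify([1, 'a', null]) === canonicalJson([1, 'a', null])); // true

// Arrays of objects: JSON.stringify keeps insertion order, canonicalJson sorts.
console.log(JSON.stringify([{ b: 1, a: 2 }])); // [{"b":1,"a":2}]
console.log(canonicalJson([{ b: 1, a: 2 }]));  // [{"a":2,"b":1}]
```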
The content identifier.
The CID is the artifact's identity. It is a deterministic string of the form cidv1:sha256:<64-hex>, computed over canonical JSON of the manifest's hashes block (which itself holds the sha256 of each emitted file). The receipt chain, the signing key, and the K-score are deliberately not in the CID; those are the seal on the bundle, not the identity of it.
{
  "cid": "cidv1:sha256:a8c5e3...d7f2",
  "hashes": {
    "model_pointer": "sha256:1c4f...",
    "recipes_json": "sha256:9e02...",
    "lora_bin": "sha256:5d8b...",
    "index_bin": "sha256:bf91...",
    "evals_json": "sha256:e3b0..."
  }
}
The CID is embedded in three places: the manifest, the receipt body, and the audit log row written by the registry. A third-party verifier reads any one of those, recomputes the CID from the manifest hashes, and confirms they match. A re-sealed receipt (different signing key, same content) produces the same CID; a re-trained artifact (any byte different) produces a different CID. The public endpoint GET /v1/cid/:cid resolves the CID against the registry for the audit lookup path.
Two compile jobs that produce the same CID can be deduplicated in storage. The audit log is never deduplicated: every compile remains a distinct event with its own receipt, but the underlying artifact is one row. This is the same trade Git makes between objects and refs.
TEE attestation as an optional binding.
The receipt chain proves the artifact came out of the build pipeline. It does not prove the build pipeline ran on the hardware you think it did. For workloads where that matters (regulated deployments, multi-party compute, supply-chain attestation), TEE attestation adds a fifth optional binding to the receipt body.
The supported attestation formats live in packages/attestation/:
- AWS Nitro Enclaves. COSE_Sign1 envelope with PCR0 measurement of the enclave image. The parser extracts the measurement; verification matches against the expected PCR0 a deployer registered during enclave provisioning.
- AMD SEV-SNP. 1184-byte binary attestation report. The measurement is the launch digest of the encrypted VM.
- Intel TDX. A TDX quote. Measurement is the MRTD over the trust domain.
- GCP confidential VM and Azure confidential VM. Cloud-provider-mediated attestation flows.
- Docker (sha256 image digest). A software-only measurement labeled as such. Not hardware-attested; useful for parity with the hardware paths during dev.
An attested receipt extends the receipt body with one extra envelope:
{
  "attestation": {
    "vendor": "aws-nitro",
    "measurement": "PCR0:8f31a4...",
    "signed_at": "2026-05-14T11:02:18Z",
    "raw": "base64:..."
  }
}
The kolm runtime does not verify the attestation against AWS root certificates itself. It stores the parsed envelope and the raw bytes; an external root-of-trust check is the deployer's responsibility. The point of the binding is that the same artifact, re-attested a year from now, will fail if the underlying enclave image has rotated. Drift becomes visible.
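A deployer-side drift check can be as small as a field comparison between the envelope sealed into the receipt and a freshly parsed attestation. `measurementDrifted` is a hypothetical helper, not part of the runtime:

```javascript
// Compare the attestation envelope stored in the receipt against a fresh
// one from the running enclave. Root-of-trust verification of `raw` is
// still the deployer's job; this only makes measurement rotation visible.
function measurementDrifted(stored, fresh) {
  return stored.vendor !== fresh.vendor ||
         stored.measurement !== fresh.measurement;
}
```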
The Rust verifier.
The reference verifier lives in packages/runtime-rs/. It is a Rust 2021 crate, #![forbid(unsafe_code)], pure-Rust dependencies, no build.rs, no proc macros from foreign crates. Cargo.toml emits three crate types (rlib, cdylib, staticlib) so the same source can produce a Rust library, a shared object for FFI, or a static archive for a C consumer.
The verifier surface is two functions:
use kolm_runtime::{Artifact, VerifyError};

let a = Artifact::load("./refund-flagger-1.0.0.kolm")?;
let report = a.verify(secret)?;
if report.ok {
    println!("verified: cid={}", report.cid);
    println!("chain steps: {}", report.chain_steps.len());
    println!("k-score: {}", report.k_score);
}
Artifact::load unzips the bundle, parses the manifest, and stages the in-memory hashes. verify does five things: it recomputes the per-file sha256 of every emitted member and matches against the manifest hashes; it walks the chain forward and re-verifies each HMAC under the supplied secret; it confirms the linkage (each step's input_hash is the previous step's output_hash); it recomputes the CID from the manifest's hashes block and confirms it matches the embedded CID; and it re-verifies the body HMAC over canonical JSON of the receipt body.
The whole verifier is under 1500 lines of Rust. Cargo features keep optional dependencies (zip decompression, hex encoding) gated. The release profile uses lto = "fat", codegen-units = 1, and panic = "abort" so the staticlib drops cleanly into a C++ consumer.
TOFU and offline verification.
The hard part of any cryptographic provenance system is the trust anchor. Who signs the signer's key. Who countersigns that. We could ship an X.509 hierarchy with an OCSP responder and CRLs; we would also ship the operational pain of maintaining one. The kolm receipt chain uses HMAC with a tenant receipt secret instead, and the trust anchor is TOFU (trust on first use).
The mechanism is plain. When a tenant compiles their first artifact, the receipt is signed under a secret stored locally. The deployer archives that secret alongside the artifact. Every subsequent verification reuses the same secret. A rotation of the secret produces new receipts but does not invalidate old ones (old receipts remain verifiable under the old secret).
This trade is correct for the workflows we ship into. Healthcare buyers archive the receipt secret in their existing key management. Defense buyers store it in a hardware security module. Solo developers store it in a password manager. The receipt is portable, the verifier is portable, the secret is the deployer's problem. We never see it; we cannot lose it for you; an outage in our infrastructure does not break verification.
The point of the chain is not to make us indispensable. It is to make us replaceable. The artifact verifies under your secret on your hardware; we are one of several entities that could have produced it.
Ed25519 public-key signatures are on the v0.2 roadmap, behind a feature flag, with the same TOFU model for the public key. The chain shape does not change; only the signature_alg field on the receipt body advances from hmac-sha256 to ed25519 when the field is supplied.
A receipt chain is not a moral position on cryptography; it is the mechanism by which a fine-tuned model gets to be a noun. The artifact is in your hand. The chain says where it came from. The verifier says it has not been tampered with. The CID says no other artifact is this one. That is the entire system.
- spec RS-1: the manifest schema, the receipt schema, the canonical-JSON rule, the chain definition.
- runtime-rs on GitHub: the Rust verifier source. forbid(unsafe_code), pure-Rust deps, three crate types.