
kolm vs a RAG stack.

A generic RAG stack (vector DB + frontier API) is fast to demo and slow to audit. The matrix below shows what changes when each retrieval carries an HMAC receipt over (source_uri, content_sha, retrieved_at) and the decoder is constrained to refuse when the answer isn't grounded in the retrieved bundle.
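To make the receipt idea concrete, here is a minimal Python sketch of signing and checking one retrieval. It is an illustration under stated assumptions, not kolm's actual format: the canonical JSON encoding, the `hmac` field name, the helper names, and the demo key are all hypothetical. Only the three signed fields come from the text above.

```python
import hashlib
import hmac
import json

# Hypothetical signing key; a real deployment would load it from a secret store.
SECRET_KEY = b"demo-key-not-for-production"


def content_sha(content: bytes) -> str:
    """SHA-256 hex digest of the retrieved bytes."""
    return hashlib.sha256(content).hexdigest()


def _canonical(source_uri: str, sha: str, retrieved_at: str) -> bytes:
    # Assumed encoding: sorted-key, whitespace-free JSON, so the same
    # three fields always serialize to the same bytes before signing.
    fields = {"source_uri": source_uri, "content_sha": sha, "retrieved_at": retrieved_at}
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_receipt(source_uri: str, content: bytes, retrieved_at: str,
                 key: bytes = SECRET_KEY) -> dict:
    """Build a receipt: the three fields plus an HMAC-SHA256 tag over them."""
    sha = content_sha(content)
    tag = hmac.new(key, _canonical(source_uri, sha, retrieved_at), hashlib.sha256).hexdigest()
    return {"source_uri": source_uri, "content_sha": sha,
            "retrieved_at": retrieved_at, "hmac": tag}


def verify_receipt(receipt: dict, content: bytes, key: bytes = SECRET_KEY) -> bool:
    """True only if the tag is valid AND the stored bytes still match content_sha."""
    expected = hmac.new(
        key,
        _canonical(receipt["source_uri"], receipt["content_sha"], receipt["retrieved_at"]),
        hashlib.sha256,
    ).hexdigest()
    tag_ok = hmac.compare_digest(expected, receipt["hmac"])
    content_ok = hmac.compare_digest(content_sha(content), receipt["content_sha"])
    return tag_ok and content_ok


if __name__ == "__main__":
    doc = b"Section 4.2: retention period is seven years."
    receipt = make_receipt("https://example.com/policy.html", doc, "2025-09-04T10:15:00Z")
    print(verify_receipt(receipt, doc))                 # True
    print(verify_receipt(receipt, doc + b" (edited)"))  # False
```

Under these assumptions, an auditor holding the key and the archived bytes can re-run `verify_receipt` months later. A changed source or an altered receipt field fails the check instead of passing silently.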

Ten axes. Reviewed 2026-05-15.

| Axis | kolm | RAG stack | Why it matters | Proof |
| --- | --- | --- | --- | --- |
| Citation drift | HMAC receipt per retrieval, < 0.5% | ~14% measured | When the source moves or changes, generic RAG carries on; kolm refuses or replays the bundle as it was. | enterprise-search → |
| Hallucinated grounding | near 0 via constrained decoder | 3–8% measured | A confident citation that does not appear in the retrieved bundle is the bug regulators specifically test for. | constrained decoding → |
| Offline replay | months later, byte-identical | no | An audit that asks "what was retrieved on 2025-09-04" is unanswerable without receipts. | anatomy → |
| Cost | flat compile + cached embeddings | per-query LLM + vector DB | RAG cost scales with users, not knowledge size. The line crosses quickly. | /roi → |
| Latency | 0.6 ms local retrieval + 12 ms decode | 200–500 ms network + LLM | Helpdesk and field engineering UIs feel different at 12 ms vs 400 ms. | /benchmarks → |
| Audit readable | receipt chain in JSON | spotty logs | An examiner needs a primitive that survives a 3-year audit window. | receipt JSON → |
| Privacy | on-prem, no third party | vector DB vendor + LLM vendor | Each external dependency is a perimeter to defend. | /airgap → |
| Determinism | seeded, reproducible | no | Reproducibility is the floor for any regulated workflow. | RS-1 → |
| Update cycle | re-compile, new CID | re-index in place | CIDs let you pin a known-good bundle and roll back without re-indexing. | K-score → |
| Refusal token | built-in, returns "I can't answer from this bundle" | prompt engineering, brittle | A constrained decoder refusal is enforced by the model architecture, not by hoping the prompt holds. | constrained decoding → |
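The "Update cycle" row is easier to see in code. The sketch below assumes a CID is a content-derived identifier, here a SHA-256 over a canonical JSON dump of the bundle. That definition, the `BundleStore` class, and its method names are hypothetical illustrations, not kolm's actual API.

```python
import hashlib
import json
from typing import Dict, Optional


def bundle_cid(documents: Dict[str, str]) -> str:
    """Assumed CID scheme: SHA-256 over sorted-key JSON of the bundle contents."""
    canonical = json.dumps(documents, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class BundleStore:
    """Keeps every compiled bundle under its CID; pinning only moves a pointer."""

    def __init__(self) -> None:
        self._bundles: Dict[str, Dict[str, str]] = {}
        self.pinned: Optional[str] = None

    def compile_bundle(self, documents: Dict[str, str]) -> str:
        cid = bundle_cid(documents)
        # Store a copy so later edits to the caller's dict cannot alter the bundle.
        self._bundles[cid] = dict(documents)
        return cid

    def pin(self, cid: str) -> None:
        if cid not in self._bundles:
            raise KeyError(f"unknown bundle: {cid}")
        self.pinned = cid

    def active(self) -> Dict[str, str]:
        if self.pinned is None:
            raise RuntimeError("no bundle pinned")
        return self._bundles[self.pinned]


if __name__ == "__main__":
    store = BundleStore()
    v1 = store.compile_bundle({"policy.md": "Retention is 7 years."})
    v2 = store.compile_bundle({"policy.md": "Retention is 5 years."})
    store.pin(v2)   # roll forward to the new bundle
    store.pin(v1)   # roll back: no re-indexing, just re-point
    print(store.active()["policy.md"])  # Retention is 7 years.
```

Because each identifier is derived from the content, a changed corpus always yields a new CID and old bundles are never overwritten. Rolling back is therefore a pointer move, whereas an in-place re-index discards the previous state.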

When a generic RAG stack is the right answer.

You are building a consumer chatbot, the corpus is public knowledge, citation accuracy is nice-to-have, and you want to ship in a week. The vector-DB-plus-frontier-API pattern is the path of least resistance for that shape.

When kolm is the right answer.

The retrieval is over a regulated corpus (legal, medical, financial), the citation has to survive an audit, or the cost line is structural. The receipt chain is what every downstream procurement step converges on.