Every modality. One index. Every compile, grounded.
Recall is the multimodal vector substrate beneath the compiler. Drop a folder, get back a
per-tenant hybrid index (BM25 + vector + RRF + reranker) that every Distill call grounds itself in.
The same index ships inside the artifact as index.sqlite-vec, so the compiled model
knows your specifics offline, forever.
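Nothing here names Recall's actual internals, but the lexical half of a hybrid index like this can be sketched with stock SQLite. A minimal FTS5 example, with made-up documents:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# FTS5 gives BM25-ranked lexical search out of the box.
db.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
db.executemany(
    "INSERT INTO docs(path, body) VALUES (?, ?)",
    [
        ("policies/refunds.md", "Enterprise refunds settle within 30 days."),
        ("notes/meeting.md", "Discussed invoice 4821 and the late-fee policy."),
    ],
)
# bm25(docs) is SQLite's rank score; smaller means a better match.
rows = db.execute(
    "SELECT path, bm25(docs) FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)",
    ("invoice 4821",),
).fetchall()
print(rows)  # lexical recall catches exact names and IDs like 4821
```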
Ingest anything.
Multimodal sidecars.
Every non-text file gets a Markdown sidecar, a structured text rendering that turns pixels, frames,
and waveforms into something both humans and models can read. The sidecar is what gets embedded;
the original is what gets returned. Inspired by Tobi Lütke's
qmd pattern, extended to image / audio / video / PDF.
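A minimal sketch of the pattern; the `describe` callable is a hypothetical stand-in for whatever model renders the file as text, and only the sidecar-next-to-original layout is the point:

```python
from pathlib import Path
from typing import Callable

def write_sidecar(original: Path, describe: Callable[[Path], str]) -> Path:
    """Write a Markdown sidecar next to a non-text file.

    `describe` is a hypothetical stand-in for the model that turns
    pixels/frames/waveforms into text (captioner, ASR, layout parser).
    """
    sidecar = original.parent / (original.name + ".md")
    sidecar.write_text(f"# {original.name}\n\n{describe(original)}\n")
    return sidecar

# Index time: the sidecar's text is what gets embedded; the original's
# path is what gets stored as the payload, so search returns the file
# itself, not its description.
```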
text + code
chunked by structure, embedded with bge-m3.
image
CLIP embed + caption sidecar via vision model.
audio
Whisper transcript + CLAP embed of timbre.
video
scene-detected → per-shot caption + ASR + CLIP.
pdf
unstructured layout parse + per-page sidecar.
everything else
magika sniffs, falls back to bytes-as-text (routing sketch below).
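A sketch of that dispatch, using a file-extension table as a simplified stand-in for magika's byte-level sniffing; the real pipeline identifies types from content, not names:

```python
from pathlib import Path

# Simplified stand-in for magika content sniffing; kinds map onto the
# per-modality pipelines listed above.
KIND_BY_EXT = {
    ".md": "text", ".py": "text",
    ".png": "image", ".jpg": "image",
    ".wav": "audio", ".mp3": "audio",
    ".mp4": "video", ".mov": "video",
    ".pdf": "pdf",
}

def route(path: Path) -> tuple[str, str | None]:
    """Return (kind, inline_text). Unknown types fall back to bytes-as-text."""
    kind = KIND_BY_EXT.get(path.suffix.lower())
    if kind is None:
        # Fallback: decode raw bytes leniently so that unknown formats
        # still land in the index rather than being dropped.
        return "text", path.read_bytes().decode("utf-8", errors="replace")
    return kind, None
```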
Hybrid retrieval, reranked.
Lexical recall catches names and IDs. Vector recall catches paraphrase. RRF fuses the two without tuning a weight.
A small cross-encoder reorders the top of the list. The whole pipeline is what grounds every Distill k-sample
and every /v1/wrap/verified call when a corpus_namespace is set.
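RRF itself is only a few lines. A self-contained sketch with made-up document IDs; k = 60 is the conventional constant from the original RRF paper:

```python
from collections import defaultdict

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: each document scores sum(1 / (k + rank))
    across every ranked list it appears in. No weights to tune."""
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)

bm25_hits = ["doc-4821", "doc-7", "doc-19"]      # lexical: exact names, IDs
vector_hits = ["doc-19", "doc-4821", "doc-33"]   # semantic: paraphrase
fused = rrf([bm25_hits, vector_hits])
# A small cross-encoder would then rescore fused[:N] against the query
# before the list is handed to Distill for grounding.
```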
Three back-ends, one interface.
Same SDK, same CLI, same wire format. Pick the back-end at the boundary; the rest of the stack doesn't notice.
On-device is the default inside artifacts: every .kolm ships index.sqlite-vec so
recall keeps working when the network is gone.
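Querying a shipped index.sqlite-vec could look like the following, using the real sqlite-vec Python bindings; the table name `chunks` and the 1024-dim embeddings (bge-m3's dense size) are assumptions, not a documented schema:

```python
import sqlite3
import sqlite_vec
from sqlite_vec import serialize_float32

db = sqlite3.connect("index.sqlite-vec")
db.enable_load_extension(True)
sqlite_vec.load(db)   # load the sqlite-vec extension into this connection
db.enable_load_extension(False)

# Assumed schema: a vec0 virtual table named "chunks" holding 1024-dim
# vectors. The real artifact's layout may differ.
query_vec = [0.0] * 1024  # stand-in for an embedded query
rows = db.execute(
    "SELECT rowid, distance FROM chunks "
    "WHERE embedding MATCH ? ORDER BY distance LIMIT 5",
    (serialize_float32(query_vec),),
).fetchall()
```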