cookbook / edge

One artifact for every edge box.

Edge AI usually means three runtimes (Core ML, ExecuTorch, TFLite), three quantization passes, and a six-month integration before the first device ships. kolm collapses the stack: compile once on your gold examples, ship the same signed file to ARM, x86, or RISC-V hardware. The runtime is offline, deterministic, and verifiable. The .kolm doesn't care whether you're on an NVIDIA Jetson, a Raspberry Pi, or a fleet of x86 Linux kiosks.

01 three edge recipes

Where this shape wins.

Each is a single .kolm. Compile once, distribute via your existing OTA pipeline, run with zero network dependency.

industrial fault triage: Take a vibration / thermal / current sensor stream, return the operator's recommended next action. Compile from 60–90 days of plant-floor sensor + maintenance logs.
retail kiosk routing: Take a customer voice prompt or text input, route it to the right knowledge base or staff escalation. Compile from your existing call-center transcripts.
in-vehicle assistant: Take cabin voice, return navigation / climate / media intents bound to your HMI's API surface. Compile from logged dialogues + the OEM's intent schema.
02 compile

Compile for a target class.

$ kolm compile fault-triage \
    --base qwen2.5-3b-instruct \
    --quantize int4 \
    --target-ram 4G \
    --recipe-pack-depth 64 \
    --examples ./plant-logs.jsonl

K-score        0.79  ok
size_bytes     1.84 GB  (fits ARM Cortex-A78 4GB SKU)
p50_latency    312us    (target: 800us on Jetson Orin Nano)
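
The --examples file is plain JSONL, one example per line. The exact schema isn't documented here; purely as an illustration, a plant-log line might pair a sensor window with the action a maintenance operator actually took (field names below are hypothetical, not kolm's schema):

```jsonl
{"input": "vib0: rms 4.2g @ 1.2kHz, bearing temp 81C, phase current +12%", "label": "schedule bearing inspection within 24h"}
```
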
03 run

Same .kolm, three architectures.

# Jetson Orin Nano (ARM, CUDA)
$ kolm run fault-triage.kolm --in /dev/sensors/vib0
runtime: arm64-cuda, p50 287us, 0 network calls

# Raspberry Pi 5 (ARM, CPU)
$ kolm run fault-triage.kolm --in /dev/sensors/vib0
runtime: arm64-cpu, p50 4.2ms, 0 network calls

# Industrial x86 mini-PC (Intel N100)
$ kolm run fault-triage.kolm --in /dev/sensors/vib0
runtime: x86_64-cpu, p50 3.8ms, 0 network calls

No retargeting. No re-quantization. The compiler chose the right primitives at compile time; the runtime adapts to the host.

04 provisioning

Fits your existing OTA pipeline.

A .kolm is a regular zip with a sha256. Push it through whatever you already use (Mender, Balena, Azure IoT Hub, AWS IoT, your own apt repo). The runtime caches by hash and verifies the signature on every cold load.

signed at compile: HMAC chain anchors to your team registry. A field device only loads artifacts whose anchor matches the deployment's expected anchor.
delta-friendly: Recipes and LoRA changes ship as a small delta against the base model pointer. No need to push 2GB on every policy update.
rollback in seconds: Old .kolm stays in the cache. kolm pin v2026-04-12 reverts the active artifact instantly.
offline-first: The runtime never phones home. Receipts are local; you mirror them upstream on your own schedule.
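
The cache-by-hash, verify-on-cold-load behavior can be sketched in a few lines of Python (illustrative only; kolm's actual scheme is the HMAC chain anchored to the team registry, which is not reproduced here):

```python
import hashlib
import hmac

def cold_load_check(artifact: bytes, expected_sha256: str,
                    anchor_key: bytes, expected_tag: str) -> bool:
    """Refuse to load unless both the content hash and the
    deployment anchor's HMAC over the artifact match."""
    digest = hashlib.sha256(artifact).hexdigest()
    tag = hmac.new(anchor_key, artifact, hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels on the comparison.
    return hmac.compare_digest(digest, expected_sha256) and \
           hmac.compare_digest(tag, expected_tag)
```

A tampered artifact fails the hash check; a genuine artifact signed for a different deployment fails the anchor check.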
05 guarantees

What we say. What we don't.

same artifact across architectures: Yes. The compiler emits target-class metadata; the runtime picks the right path. No retargeting required.
deterministic at run time: Yes for recipe-mode tasks. LoRA-mode tasks include a deterministic seed in the manifest; same input + same artifact = same output.
tiny-target SKUs (<1GB RAM): Recipe-only mode on the Pro tier. Default compile targets 2GB RAM minimum; sub-1GB SKUs use the recipe-only flag and skip the LoRA layer.
real-time guarantees: No. p50 latency is reported; p99 / hard real-time is your integrator's responsibility, not the artifact's.
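
The determinism guarantee reduces to one discipline: derive every random choice from (manifest seed, input), never from wall-clock time or OS entropy. A toy illustration of the property (hypothetical; not how kolm actually samples):

```python
import hashlib
import random

def deterministic_sample(manifest_seed: int, user_input: str,
                         choices: list[str]) -> str:
    """Same artifact (seed) + same input => same output, on any host."""
    # Fold the input into the seed so different prompts can diverge,
    # but identical runs never do.
    h = hashlib.sha256(f"{manifest_seed}:{user_input}".encode()).digest()
    rng = random.Random(int.from_bytes(h[:8], "big"))
    return rng.choice(choices)
```

Because the RNG state is a pure function of the manifest seed and the input bytes, replaying a field receipt reproduces the exact output, which is what makes the receipts verifiable after the fact.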