Compiler cache for intelligence
Each task compile should become a reusable artifact with a spec, eval pack, target profile, receipt, and provenance. The customer pays for the governed compile loop, not for a local function call.
The runtime layer is being commoditized by Apple, Google, Microsoft, Meta, and open source. Kolm wins only if it becomes the compiler cache for intelligence: task evidence in, portable artifacts out, every artifact backed by evals, receipts, and a clear runtime target.
A credible decabillion-dollar plan has to explain why developers will use kolm when native runtimes are free. The answer is not generic on-device inference. The answer is repeatable compilation, device-targeted evals, signed release evidence, and a registry that makes good artifacts easier to trust than hand-rolled model glue.
Core ML, LiteRT, ONNX Runtime, ExecuTorch, llama.cpp, and MLC should be target backends. Kolm should optimize, verify, and package across them instead of pretending they do not exist.
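A minimal sketch of that fan-out, assuming a Python packaging layer. The backend names are real, but every function, field, and budget here is illustrative; real lowering would delegate to each runtime's own exporter (coremltools, the LiteRT converter, ONNX export, ExecuTorch, and so on):

```python
from dataclasses import dataclass
from enum import Enum

class Backend(Enum):
    COREML = "coreml"
    LITERT = "litert"
    ONNX_RUNTIME = "onnxruntime"
    EXECUTORCH = "executorch"
    LLAMA_CPP = "llama.cpp"
    MLC = "mlc"

@dataclass
class TargetProfile:
    backend: Backend
    device: str              # e.g. "iPhone 15 Pro"
    max_artifact_mb: float   # hard size budget for this target

def lower(task_spec: dict, backend: Backend) -> bytes:
    # Placeholder: in practice this delegates to the backend's own exporter.
    return b"compiled-payload-for-" + backend.value.encode()

def compile_for_targets(task_spec: dict, targets: list[TargetProfile]) -> dict:
    """One task spec in; one size-checked package per runtime target out."""
    artifacts = {}
    for t in targets:
        payload = lower(task_spec, t.backend)
        artifacts[t.backend.value] = {
            "device": t.device,
            "payload": payload,
            "within_size_budget": len(payload) / 1e6 <= t.max_artifact_mb,
        }
    return artifacts
```

The value is not in the loop itself; it is in owning the verification and packaging that wrap every backend the same way.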
The K-score matters only when it correlates with task performance, device fit, latency, size, and release policy. Publish the harness and make the score reproducible.
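One way to make that concrete is a published scoring function whose inputs and arithmetic are all visible. The weights and the gating rule below are assumptions, not a committed formula; the point is that anyone can recompute the number:

```python
def k_score(task_pass_rate: float, p95_ms: float, artifact_mb: float,
            p95_budget_ms: float, size_budget_mb: float) -> float:
    """Illustrative composite: task quality gated by device fit.

    Fit terms cap at 1.0, so beating a budget never inflates the score;
    missing a budget degrades it proportionally.
    """
    latency_fit = min(1.0, p95_budget_ms / max(p95_ms, 1e-9))
    size_fit = min(1.0, size_budget_mb / max(artifact_mb, 1e-9))
    return round(100 * task_pass_rate * latency_fit * size_fit, 1)

# Same inputs, same score, on any machine:
print(k_score(task_pass_rate=0.92, p95_ms=180.0, artifact_mb=45.0,
              p95_budget_ms=250.0, size_budget_mb=60.0))  # -> 92.0
```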
Regulated teams will pay for control evidence, retention policy, audit logs, BAAs where applicable, and reviewed claims. They will not pay for slogans.
The most valuable asset is a reviewed catalog of artifacts with evals, device profiles, receipts, and revocation history. That is the network-effect wedge.
Local personalization is compelling, but the mechanism must be named: retrieval, adapter training, calibration, or something else. Battery, memory, and storage limits decide the truth.
Kolm should assume the best execution engines are free or bundled. The strategy is to own the cross-runtime compile, eval, registry, and governance workflow.
| Layer | Source signal | Threat to kolm | Winning response |
|---|---|---|---|
| Apple Core ML | Apple positions Core ML as optimized for on-device performance, model conversion, compression, Xcode reports, and Apple silicon execution. | Native iOS teams already get a deeply integrated path. | Make Core ML a first-class target and compare artifact size, latency, adapter support, and release evidence against native baselines. |
| Apple Foundation Models | Apple gives apps access to an on-device language model for text generation, structured output, and tool calling. | For simple Apple-only language features, the platform path may be enough. | Position kolm for cross-platform, task-specific artifacts, non-Apple targets, and audited compile history. |
| Google LiteRT | Google describes LiteRT as a high-performance on-device framework with multi-platform support, conversion, optimization, and hardware acceleration. | Android and edge teams can stay inside Google AI Edge tooling. | Target LiteRT output, then own the higher-level spec, eval pack, receipts, and registry metadata. |
| MediaPipe | MediaPipe Solutions provide ready-made cross-platform tasks, models, Model Maker customization, and browser-based evaluation tooling. | Common perception and LLM tasks may be solved before kolm enters the workflow. | Focus on custom business tasks, regulated release evidence, and artifacts that combine task examples, policies, and eval cases. |
| ONNX Runtime Mobile | Microsoft documents mobile deployment across iOS and Android, execution providers, binary-size controls, latency, power, and model-size measurement. | Framework-neutral mobile teams already have a mature route. | Use ONNX as a target and publish repeatable measurements rather than generic cross-platform claims. |
| ExecuTorch | PyTorch frames ExecuTorch as an end-to-end mobile and edge inference stack with portability, productivity, and hardware acceleration. | PyTorch-native teams will not leave familiar export and deploy flows without proof. | Import PyTorch tasks, target ExecuTorch where appropriate, and sell governed artifact promotion over raw deployment. |
| Local LLM open source | llama.cpp emphasizes minimal setup and strong performance across local and cloud hardware; MLC WebLLM uses WebGPU for in-browser local acceleration. | Offline LLM inference alone is not a paid moat. | Sell small task artifacts, eval-backed specialization, release receipts, and local personalization governance. |
| Regulatory pressure | The EU AI Act applies progressively through 2027, and HHS frames the HIPAA Security Rule around administrative, physical, and technical safeguards for ePHI. | Blanket compliance claims become legal and sales risk. | Map every claim to a dated control, document owner, limitation, and customer-facing evidence artifact. |
These are the investor, buyer, and technical diligence asks that should shape the next build sprint. Each one converts a claim into a proof asset.
Publish which targets are supported now, which are planned, and which are intentionally out of scope. Include iOS, Android, browser target, laptop, server, and embedded classes with device names.
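A published matrix can be as plain as a checked-in file. The classes come from the ask above; every device name and tier assignment below is illustrative, not a commitment:

```python
# Illustrative target matrix; the structure matters, not these entries.
TARGETS = {
    "supported": [
        {"class": "ios",     "backend": "coreml",      "device": "iPhone 15 Pro"},
        {"class": "android", "backend": "litert",      "device": "Pixel 8"},
        {"class": "server",  "backend": "onnxruntime", "device": "x86_64 + CUDA"},
    ],
    "planned": [
        {"class": "browser", "backend": "mlc-webgpu",  "device": "Chrome with WebGPU"},
        {"class": "laptop",  "backend": "executorch",  "device": "Apple silicon MacBook"},
    ],
    "out_of_scope": [
        {"class": "embedded", "note": "microcontroller-class devices"},
    ],
}
```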
Run the same task through native Core ML, LiteRT, ONNX Runtime, ExecuTorch where applicable, and a kolm artifact. Report p50, p95, artifact size, binary impact, memory, energy proxy, and K-score.
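A sketch of the latency half of that harness, using only Python's standard library; `run_once` stands in for whichever runtime executes the task, so the identical harness produces every row of the comparison:

```python
import statistics
import time

def measure_latency(run_once, warmup: int = 5, iters: int = 50) -> dict:
    """p50/p95 wall-clock latency for one artifact on one device.

    `run_once` is any zero-arg callable that drives the task through a
    given runtime (Core ML, LiteRT, ONNX Runtime, ExecuTorch, or a kolm
    artifact). Warmup runs are discarded so JIT and cache effects do
    not pollute the percentiles.
    """
    for _ in range(warmup):
        run_once()
    samples_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_once()
        samples_ms.append((time.perf_counter() - t0) * 1000.0)
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 percentile cut points
    return {"p50_ms": round(cuts[49], 2), "p95_ms": round(cuts[94], 2), "iters": iters}
```

Publish the harness, the exact commands, and the raw samples alongside the summary numbers, so any row can be reproduced rather than trusted.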
Define whether personalization uses retrieval, calibration, adapter training, local examples, or another mechanism. Document storage, encryption, deletion, hardware limits, and failure behavior.
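A hedged sketch of what such a spec could pin down; every field name and budget here is an assumption, not a committed schema:

```python
from dataclasses import dataclass
from enum import Enum

class Mechanism(Enum):
    RETRIEVAL = "retrieval"            # local example store, no weight updates
    CALIBRATION = "calibration"        # thresholds and output mapping only
    ADAPTER_TRAINING = "adapter"       # small adapter trained on-device
    LOCAL_EXAMPLES = "local_examples"  # few-shot examples injected at runtime

@dataclass
class PersonalizationSpec:
    mechanism: Mechanism
    storage_budget_mb: float       # hard cap on local state
    encrypted_at_rest: bool
    deleted_on_uninstall: bool
    battery_budget_pct: float      # max battery per update session
    min_free_memory_mb: float      # refuse to run below this floor
    on_failure: str                # e.g. "fall back to the base artifact"
```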
Show what is actually inside each artifact tier: recipe, evals, metadata, target binary, adapter, weights, and receipt. No investor should need to infer whether an artifact contains model-bearing payloads.
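One way to make the contents non-inferable is a manifest that lists them explicitly and derives the model-bearing flag instead of asserting it. The tier names and boundaries below are illustrative:

```python
from dataclasses import dataclass

RECIPE = ["spec", "eval_pack", "metadata"]              # tier 1: no payload
PACKAGED = RECIPE + ["target_binary", "receipt"]        # tier 2: compiled payload
FULL = PACKAGED + ["adapter", "weights"]                # tier 3: raw weights included

def is_model_bearing(contents: list[str]) -> bool:
    """True when the bundle physically ships model payloads."""
    return any(x in contents for x in ("target_binary", "adapter", "weights"))

@dataclass
class ArtifactManifest:
    name: str
    tier: str
    contents: list[str]
    model_bearing: bool
    receipt_sha256: str

m = ArtifactManifest(
    name="invoice-field-extractor",            # hypothetical artifact
    tier="packaged",
    contents=PACKAGED,
    model_bearing=is_model_bearing(PACKAGED),  # derived, never hand-asserted
    receipt_sha256="<filled at signing time>",
)
```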
Prove that score movement predicts useful task outcomes across at least three workload families. Treat score drift as a release blocker.
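Once the score is reproducible, the release blocker is mechanical. A minimal sketch; the tolerance below is illustrative, and the real threshold should come out of the correlation study across workload families:

```python
def gate_release(current_score: float, baseline_score: float,
                 max_drop: float = 1.0) -> None:
    """CI gate: block artifact promotion when the K-score regresses."""
    drift = baseline_score - current_score
    if drift > max_drop:
        raise SystemExit(
            f"release blocked: K-score dropped {drift:.1f} points "
            f"({baseline_score} -> {current_score})"
        )

gate_release(current_score=91.2, baseline_score=92.0)  # within tolerance, passes
```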
Replace broad claims with an evidence map: BAA status, DPA, subprocessor list, retention policy, audit log, encryption controls, review date, and limitation notes.
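An evidence map is just structured data that pairs each customer-facing sentence with its control, owner, and limitation. A sketch with a hypothetical owner, URL, and review date:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class EvidenceEntry:
    claim: str           # the exact customer-facing sentence
    control: str         # the control that backs it
    owner: str           # named document owner
    evidence_uri: str    # artifact a buyer can actually open
    reviewed_on: date
    limitation: str      # what the claim does NOT cover

ENTRY = EvidenceEntry(
    claim="Customer data is encrypted at rest.",
    control="AES-256 at rest on tenant storage",
    owner="security@kolm.example",                                    # hypothetical
    evidence_uri="https://kolm.example/evidence/encryption-at-rest",  # hypothetical
    reviewed_on=date(2025, 1, 15),
    limitation="Does not cover data the customer exports.",
)
```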
Make the registry the product surface: curated artifacts, device profiles, K-score history, review status, revocation, provenance, and customer-private namespaces.
Pick one wedge for the next 90 days. Healthcare workflow apps, fintech mobile teams, and enterprise mobile teams each require different proof and sales language.
At least one reproducible artifact exists per named target, with hardware, OS, runtime, benchmark command, and fallback behavior documented.
Tenant storage, auth, key generation, failure logs, benchmark reproducibility, CI gates, and rollback have passing evidence.
Control owner, legal review, BAA/DPA posture, data lifecycle, audit log, subprocessor list, and limitation copy are linked from the sales page.
Org controls, private registry, receipt retention, access logs, SSO path, SLA language, and at least one credible pilot workflow exist.
Inventory every site claim and map it to code, benchmark, source, or redline. Remove or soften anything without proof.
Run three tasks across at least two real devices and one server target. Publish raw output and commands.
Seed 8 to 12 artifacts with eval packs, target profiles, score history, receipts, and source notes.
Create one vertical pilot with a narrow outcome, price, success metric, and evidence pack.