Email-reply recipe

What this recipe does

Most email-AI either sounds like a chatbot or hallucinates commitments you never made. This recipe does neither. It learns your style from your sent folder (greeting habits, sign-off, sentence rhythm, signature phrases) and is constrained at run time to never invent a meeting you didn't agree to or a detail not in the thread.

Runs entirely on-device. The training corpus is extracted on your laptop, the model is compiled on your laptop, the artifact stays on your laptop. We never see any of your email.

The spec

{
  "output_kind": "json",
  "schema": {
    "required": ["reply", "intent", "requires_human_review"],
    "properties": {
      "reply": { "type": "string", "maxLength": 1200 },
      "intent": { "enum": ["accept", "decline", "defer", "clarify", "ack"] },
      "requires_human_review": { "type": "boolean" },
      "flags": { "type": "array", "items": {"enum": ["contains_commitment", "contains_dollar_amount", "contains_date"]} }
    }
  },
  "verifier": {
    "style_corpus": "sent/",
    "reply_must_not_invent_commitments": true,
    "escalate_on_dollar_or_date": true,
    "redact_before_train": "pii.regex"
  }
}

Gold pairs (1 of 1,200 shown)

input - inbound thread

From: alex@studio.com
Subject: Quick coffee Thursday?

Hey - in town Thursday morning. Free 9-11 if you want to grab coffee
and catch up. Sightglass on 7th if that still works.

output

{
  "reply": "hey alex - 9:30 thursday at sightglass works for me. see you then. -r",
  "intent": "accept",
  "requires_human_review": false,
  "flags": ["contains_date"]
}

Compile

kolm compile "email reply drafter in my voice" \
  --base phi-3-mini-4k \
  --pairs ~/.kolm/sent-pairs.jsonl \
  --style-corpus ~/.kolm/sent/ \
  --redact-pii pii.regex \
  --verifier style-faithful,no-invented-commitments \
  --k-floor 0.78 \
  --output email-reply.kolm

ok wrote email-reply.kolm
   k_score=0.81  signature=hmac-sha256

K-score gate

K-score 0.81 held-out 360 inbound emails · style-similarity 0.83 (cosine vs sent corpus) · intent-correct 91% · zero invented commitments · 100% PII redacted before train

Style-similarity is cosine distance between the draft and a held-out sample of your sent emails of similar intent. 0.83 is "indistinguishable from your own writing in a blind read."

Run-time profile

M2 MacBook

920ms

RTX 5090

240ms

iPhone 15 Pro

2.6s

CPU x86 (server)

3.4s

Deploy

# gmail extension — on every inbound, drop a draft into the compose box:
chrome.gmail.onMessageReceived((msg) => {
  const d = kolm.run('email-reply.kolm', msg.body);
  if (d.requires_human_review || d.flags.includes('contains_dollar_amount')) {
    chrome.notifications.create({ title: 'review needed', msg: msg.subject });
  }
  gmail.draft(msg.threadId, d.reply);
});

Inbox in, draft out — in your voice.