Verifier from examples recipe

What this recipe does

The hardest 30 minutes of compiling a new .kolm go to spec design: deciding which verifier hooks to attach. This recipe collapses that step: paste 20 input/output pairs and get a verifier spec back. The output has the same JSON shape that kolm compile --spec consumes directly, so you can review it, edit it if needed, and run it.

The verifier on this recipe rejects any spec that the kolm CLI would itself reject as malformed. Schema validity is enforced by re-running the produced spec through kolm spec --check at compile time.

The spec

{
  "output_kind": "json",
  "schema": { "$ref": "kolm-verifier-spec.schema.json" },
  "verifier": {
    "output_must_pass_kolm_spec_check": true,
    "output_must_compile_dry_run": true
  }
}

Gold pair (1 of 230 shown)

input - 20 pairs + task

task: classify github issue into bug | feature | docs | question
sample 1:
  in: "login broken on safari, getting 500"
  out: { "type": "bug" }
sample 2:
  in: "would love a dark-mode option"
  out: { "type": "feature" }
... (18 more)

output - synthesized spec

{
  "output_kind": "json",
  "schema": {
    "required": ["type"],
    "properties": {
      "type": { "enum": ["bug", "feature", "docs", "question"] }
    }
  },
  "verifier": { "closed_vocab_label": true }
}
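
To make the shape of the synthesized spec concrete, here is a minimal Python sketch of a deterministic baseline that derives a closed-vocabulary spec from pairs and checks an output against it. It is not how the compiled recipe works internally; the helper names infer_closed_vocab_spec and conforms are hypothetical, and it assumes every output is a flat JSON object whose values are string labels. Only the two samples shown above are used, so the inferred enum covers just those labels.

```python
import json

def infer_closed_vocab_spec(pairs):
    """Hypothetical helper: build a spec from (input, output-dict) pairs.

    Assumes each output is a flat dict of string labels.
    """
    required = set(pairs[0][1])   # keys present in every output
    seen = {}                     # key -> set of label values observed
    for _, out in pairs:
        required &= set(out)
        for key, value in out.items():
            seen.setdefault(key, set()).add(value)
    properties = {k: {"enum": sorted(seen[k])} for k in sorted(required)}
    return {
        "output_kind": "json",
        "schema": {"required": sorted(required), "properties": properties},
        "verifier": {"closed_vocab_label": True},
    }

def conforms(output, spec):
    """Hypothetical check: required keys present and labels inside the enum."""
    schema = spec["schema"]
    if any(key not in output for key in schema["required"]):
        return False
    return all(
        output[key] in prop["enum"]
        for key, prop in schema["properties"].items()
        if key in output
    )

# The two samples shown in the gold pair; the other 18 are elided in the source.
pairs = [
    ("login broken on safari, getting 500", {"type": "bug"}),
    ("would love a dark-mode option", {"type": "feature"}),
]
spec = infer_closed_vocab_spec(pairs)
print(json.dumps(spec, indent=2))
```

With all 20 pairs supplied, the enum would grow to every label that appears in the outputs. A label that never appears in the pairs would be rejected by this baseline, which is one reason to review the spec before compiling.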

Compile

kolm compile "verifier-spec synthesizer from examples" \
  --base qwen2.5-coder-7b \
  --pairs ./recipe-pairs/*.jsonl \
  --verifier output-passes-spec-check,output-compiles-dry-run \
  --k-floor 0.82 \
  --output verifier-from-examples.kolm

ok wrote verifier-from-examples.kolm
   k_score=0.86  signature=hmac-sha256

K-score gate

K-score: 0.86 (70 held-out tasks)
spec-validity: 100% (else rejected)
verifier-hook-correct: 89% (right hook chosen)
compiled-dry-run: 96%
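
As a rough illustration of the gate, the check below compares the reported K-score with the --k-floor value from the compile command. The function passes_gate is hypothetical, and the comparison rule (score at or above the floor passes) is an assumption, not documented kolm behavior.

```python
def passes_gate(k_score: float, k_floor: float) -> bool:
    """Assumed rule: the artifact is accepted when k_score >= k_floor."""
    return k_score >= k_floor

# Values taken from the compile step: k_score=0.86, --k-floor 0.82.
print(passes_gate(0.86, 0.82))
```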

Run-time profile

M2 MacBook: 1.2s
RTX 5090: 280ms
iPhone 15 Pro: 3.4s
CPU x86 (server): 4.2s

Deploy

# the meta-loop: from 20 examples to a deployable artifact in three commands
kolm new my-task --pairs pairs.jsonl
kolm spec --from-pairs pairs.jsonl > spec.json    # this recipe
kolm compile "my-task" --spec spec.json --output my-task.kolm

Drop in 20 pairs and a verifier comes back.