Compile a PR review bot in 25 minutes.
By the end of this walkthrough you will have an 8.2 KB .kolm artifact that reads a unified diff and emits a list of structured review comments. Three issue classes only: missing tests, unbounded recursion, hardcoded secrets. Tight scope on purpose; the K-score floor is 0.87 against a held-out set of 600 PRs.
Step 1 . 60 seconds
Install and authenticate.
$ npm install -g kolm
$ kolm login
$ kolm version
k o l m
------- the private AI compiler
kolm cli v0.1.0
spec rs-1
Step 2 . 3 minutes
Write the recipe.
Scope this tight. Three issue classes, an output schema that names file + line + class + suggestion, and a K-score floor at 0.87. The eval pack pr-review-3class-v1 covers the held-out test set.
$ cat > code-review.recipe.json <<'JSON'
{
  "task": "review a unified diff for three issue classes: missing_tests, unbounded_recursion, hardcoded_secrets",
  "base": "deepseek-coder-6.7b-instruct",
  "objective": "per-class-precision-at-r80",
  "adapter": "lora + dpo",
  "context_window": 16384,
  "output_schema": {
    "comments": [
      {
        "file": "string",
        "line": "int",
        "class": { "enum": ["missing_tests", "unbounded_recursion", "hardcoded_secrets"] },
        "severity": { "enum": ["block", "suggest"] },
        "message": { "type": "string", "max_chars": 240 },
        "suggestion": { "type": "string", "max_chars": 400, "required": false }
      }
    ]
  },
  "target_k": 0.90,
  "min_k": 0.87,
  "eval_pack": "pr-review-3class-v1"
}
JSON
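The output schema above is easy to check by hand before you trust anything downstream. The following is a minimal sketch of such a check in Node; validateComment is a hypothetical helper written for this walkthrough, not part of the kolm CLI, and it assumes max_chars counts JavaScript string length.

```javascript
// Hypothetical validator mirroring the recipe's output_schema (not a kolm API).
const CLASSES = ["missing_tests", "unbounded_recursion", "hardcoded_secrets"];
const SEVERITIES = ["block", "suggest"];

// Returns a list of problems; an empty list means the comment conforms.
function validateComment(c) {
  const problems = [];
  if (typeof c.file !== "string") problems.push("file must be a string");
  if (!Number.isInteger(c.line)) problems.push("line must be an integer");
  if (!CLASSES.includes(c.class)) problems.push("class is not one of the three issue classes");
  if (!SEVERITIES.includes(c.severity)) problems.push("severity must be block or suggest");
  // Assumption: max_chars is measured as JavaScript string length.
  if (typeof c.message !== "string" || c.message.length > 240) {
    problems.push("message must be a string of at most 240 chars");
  }
  // suggestion is optional ("required": false), so only check it when present.
  if (c.suggestion !== undefined &&
      (typeof c.suggestion !== "string" || c.suggestion.length > 400)) {
    problems.push("suggestion must be a string of at most 400 chars");
  }
  return problems;
}

const sample = {
  file: "auth/login.py",
  line: 13,
  class: "hardcoded_secrets",
  severity: "block",
  message: "Live API key committed in source.",
};
console.log(validateComment(sample)); // prints []
```

A check like this is useful on your seed examples as well as on model output, since a mislabeled class in the seed set is cheap to catch before compiling.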
Step 3 . 4 minutes
Seed with examples.
Pull 80 real diffs from your repo history and label each by issue class. The synthetic-data stage of the compile in Step 4 expands them to roughly 2,500 training pairs.
$ head -1 code-review.examples.jsonl | jq .
{
  "input": "diff --git a/auth/login.py b/auth/login.py\n+++ b/auth/login.py\n@@ -12,3 +12,7 @@ def login(email, password):\n+ API_KEY = 'sk-prod-9af28e...' \n+ r = requests.post(url, headers={'Authorization': API_KEY})\n+ return r.json()",
  "expected": {
    "comments": [
      {
        "file": "auth/login.py",
        "line": 13,
        "class": "hardcoded_secrets",
        "severity": "block",
        "message": "Live API key committed in source. Move to env or secret manager before merge."
      }
    ]
  }
}
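Each line of the examples file is one JSON object with an input diff and the expected comments. If your labeled diffs live somewhere else, a small script can produce that shape. This is a sketch only: toExampleLine and toJsonl are hypothetical helpers, and the two diffs below are placeholders, not real repository history.

```javascript
// Hypothetical helpers for building code-review.examples.jsonl (not a kolm API).

// JSON.stringify escapes the newlines inside the diff, so one example stays on one line.
function toExampleLine(diffText, comments) {
  return JSON.stringify({ input: diffText, expected: { comments } });
}

// Join examples with newlines and end the file with a trailing newline.
function toJsonl(labeled) {
  return labeled.map(({ diff, comments }) => toExampleLine(diff, comments)).join("\n") + "\n";
}

// Placeholder data for illustration only.
const labeled = [
  {
    diff: "diff --git a/x.py b/x.py\n--- a/x.py\n+++ b/x.py\n@@ -1,0 +1,1 @@\n+TOKEN = 'placeholder'",
    comments: [
      { file: "x.py", line: 1, class: "hardcoded_secrets", severity: "block",
        message: "Secret literal in source." },
    ],
  },
  {
    diff: "diff --git a/y.py b/y.py\n--- a/y.py\n+++ b/y.py\n@@ -3,0 +4,1 @@\n+    return walk(node.child)",
    comments: [
      { file: "y.py", line: 4, class: "unbounded_recursion", severity: "suggest",
        message: "Recursive call has no depth cap." },
    ],
  },
];

const jsonl = toJsonl(labeled);
console.log(jsonl.trimEnd().split("\n").length); // prints 2
```

Writing the result with fs.writeFileSync("code-review.examples.jsonl", jsonl) gives a file that the head -1 | jq . check above can read.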
Step 4 . 14 minutes
Compile.
$ kolm compile --from code-review.recipe.json --examples code-review.examples.jsonl --out code-review.kolm
[1/6] synthesizing pairs (Magpie + CodeSearchNet seed) ... 2,562 pairs 1m 24s
[2/6] dedup + filter (per-language MinHash) .............. 2,318 pairs 11s
[3/6] LoRA + DPO (preference over reviewer votes) ........ 4 epochs 10m 02s
[4/6] constrained-decoder fit (comment schema) ........... 42s
[5/6] K-score gate (pr-review-3class-v1) ................. K = 0.891 > 0.87 floor
[6/6] sign + package ..................................... 3s
artifact: ./code-review.kolm (8.2 KB)
receipt: ./code-review.receipt.json (3.4 KB)
CID: cidv1:sha256:a31e7c...
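The recipe names its objective per-class-precision-at-r80, and this walkthrough does not spell out how kolm computes it or how it feeds the K-score. Read literally, the name suggests precision measured at 80% recall for each class. The sketch below shows that generic metric for one class, under the assumptions that every candidate comment carries a confidence score and that score ties can be ignored; precisionAtRecall is a hypothetical name, not a kolm function.

```javascript
// Generic precision-at-recall for a single class (an illustration, not kolm's scoring code).
// items: [{ score: number, positive: boolean }], where positive means the label is this class.
function precisionAtRecall(items, targetRecall = 0.8) {
  const totalPositives = items.filter((i) => i.positive).length;
  if (totalPositives === 0) return null; // metric is undefined without positives

  // Sweep thresholds from the most confident prediction down.
  const sorted = [...items].sort((a, b) => b.score - a.score);
  let truePositives = 0;
  let best = 0;
  sorted.forEach((item, idx) => {
    if (item.positive) truePositives += 1;
    const recall = truePositives / totalPositives;
    const precision = truePositives / (idx + 1);
    // Keep the best precision among cutoffs that reach the recall target.
    if (recall >= targetRecall && precision > best) best = precision;
  });
  return best;
}

// Illustrative scores only: 5 positives among 7 candidates.
const candidates = [
  { score: 0.9, positive: true },
  { score: 0.8, positive: true },
  { score: 0.7, positive: false },
  { score: 0.6, positive: true },
  { score: 0.5, positive: true },
  { score: 0.4, positive: false },
  { score: 0.3, positive: true },
];
console.log(precisionAtRecall(candidates)); // prints 0.8
```

In this example the first cutoff that reaches 80% recall admits four true positives among five predictions, which is where the 0.8 comes from.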
Step 5 . 1 minute
Run on a real diff.
$ git diff main...HEAD | kolm run code-review.kolm --stdin
{
  "comments": [
    {
      "file": "ingest/walker.py",
      "line": 42,
      "class": "unbounded_recursion",
      "severity": "block",
      "message": "walk() recurses on subdir without a depth cap; nested-symlink loop will OOM the worker.",
      "suggestion": "Add max_depth=64 with a counter; raise WalkLimit if hit."
    },
    {
      "file": "auth/session.py",
      "line": 88,
      "class": "missing_tests",
      "severity": "suggest",
      "message": "New SessionStore.invalidate() path has no test; the rollover bug we fixed in #1422 would regress silently."
    }
  ],
  "latency_ms": 2104.8,
  "receipt_cid": "cidv1:sha256:a31e7c..."
}
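The severity field is what lets a pipeline treat this output as a gate rather than a report. As a sketch, assuming you want any block-severity comment to fail the job, a script could summarize the review like this; countBySeverity and exitCodeFor are hypothetical helpers, not part of kolm.

```javascript
// Hypothetical post-processing of `kolm run` output (not a kolm API).

// Count comments per severity; unknown severities are ignored.
function countBySeverity(review) {
  const counts = { block: 0, suggest: 0 };
  for (const c of review.comments) {
    if (c.severity === "block") counts.block += 1;
    else if (c.severity === "suggest") counts.suggest += 1;
  }
  return counts;
}

// Assumption: a single block-severity comment should fail the check.
function exitCodeFor(review) {
  return review.comments.some((c) => c.severity === "block") ? 1 : 0;
}

// Trimmed copy of the output above.
const review = {
  comments: [
    { file: "ingest/walker.py", line: 42, class: "unbounded_recursion", severity: "block" },
    { file: "auth/session.py", line: 88, class: "missing_tests", severity: "suggest" },
  ],
};
console.log(countBySeverity(review)); // prints { block: 1, suggest: 1 }
console.log(exitCodeFor(review)); // prints 1
```

In a real script you would read the JSON from a file or stdin and finish with process.exit(exitCodeFor(review)).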
Step 6 . 2 minutes
Wire into GitHub Actions.
Drop this in .github/workflows/kolm-review.yml. The artifact runs on the runner; no PR diff leaves your network if you set KOLM_BACKEND=local_cpu.
name: kolm review
on:
  pull_request: {}
# The job token needs write access to pull requests to post review comments.
permissions:
  contents: read
  pull-requests: write
jobs:
  review:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm install -g kolm
      - name: review
        env:
          KOLM_BACKEND: local_cpu
        run: |
          git diff origin/${{ github.base_ref }}...HEAD \
            | kolm run ./code-review.kolm --stdin > review.json
      - uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const r = JSON.parse(fs.readFileSync('review.json', 'utf8'));
            for (const c of r.comments) {
              await github.rest.pulls.createReviewComment({
                ...context.repo,
                pull_number: context.issue.number,
                // The review-comment endpoint requires the commit the comment is anchored to.
                commit_id: context.payload.pull_request.head.sha,
                body: `**${c.class}** [${c.severity}] ${c.message}`,
                path: c.file,
                line: c.line,
              });
            }
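One practical wrinkle with posting review comments: GitHub generally rejects a comment whose line is not part of the pull request diff. A conservative guard is to keep only comments that land on added lines. The sketch below parses a unified diff for that purpose; addedLines and keepPostable are hypothetical helpers, and the parser assumes standard git diff output with accurate hunk counts.

```javascript
// Hypothetical guard for the github-script step (not a kolm or GitHub API).

// Map each file in a unified diff to the set of new-side line numbers it adds.
function addedLines(diffText) {
  const byFile = new Map();
  let file = null;
  let newLine = 0; // next line number on the new side
  let oldLeft = 0; // old-side lines remaining in the current hunk
  let newLeft = 0; // new-side lines remaining in the current hunk
  for (const raw of diffText.split("\n")) {
    if (oldLeft > 0 || newLeft > 0) {
      const tag = raw[0];
      if (tag === "+") {
        if (file) byFile.get(file).add(newLine);
        newLine += 1; newLeft -= 1;
      } else if (tag === "-") {
        oldLeft -= 1;
      } else if (tag !== "\\") { // skip "\ No newline at end of file"
        newLine += 1; oldLeft -= 1; newLeft -= 1; // context line
      }
      continue;
    }
    const hunk = /^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(raw);
    if (hunk) {
      oldLeft = hunk[1] === undefined ? 1 : Number(hunk[1]);
      newLine = Number(hunk[2]);
      newLeft = hunk[3] === undefined ? 1 : Number(hunk[3]);
    } else if (raw.startsWith("+++ ")) {
      const path = raw.slice(4);
      file = path === "/dev/null" ? null : path.replace(/^b\//, "");
      if (file && !byFile.has(file)) byFile.set(file, new Set());
    }
  }
  return byFile;
}

// Keep only comments that point at a line the diff adds.
function keepPostable(comments, diffText) {
  const added = addedLines(diffText);
  return comments.filter((c) => added.get(c.file)?.has(c.line) === true);
}

// Illustrative diff: lines 13 and 14 are added, 12 and 15 are context.
const demoDiff = [
  "diff --git a/auth/login.py b/auth/login.py",
  "--- a/auth/login.py",
  "+++ b/auth/login.py",
  "@@ -12,2 +12,4 @@ def login(email, password):",
  "     url = build_url()",
  "+    key = load_key()",
  "+    r = post(url, key)",
  "     return r",
].join("\n");

const demoComments = [
  { file: "auth/login.py", line: 13 },
  { file: "auth/login.py", line: 40 },
];
console.log(keepPostable(demoComments, demoDiff).length); // prints 1
```

Filtering to added lines is stricter than GitHub requires, since context lines inside a hunk are usually commentable too, but it keeps the loop from failing on a comment the model placed outside the diff.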
Step 7 . optional
Verify and ship.
$ kolm verify code-review.kolm
✓ manifest CID matches canonical hash
✓ all 12 entries hashed and verified
✓ receipt HMAC valid
✓ K-score 0.891 (above declared gate 0.87)
artifact is valid.