kolm  /  launch bounties

Launch bounties: $14,000 for five .kolm artifacts.

Five concrete artifacts the world needs. Build one that passes its gate, publish to the kolm hub, get paid by ACH or BTC inside seven business days. No CFPs, no waiting list, no exclusivity: first verified .kolm wins the slot.

launch program . v1.0 . pool open through 2026-09-30 . payouts in USD or BTC

5 bounties . $14,000 total . first-to-finish . verified by K-score gate . published under your name

How it works

  1. Pick a bounty from the five below. Each lists the spec, the gate (minimum K-score and size), and the test set you will be measured against.
  2. Build the artifact with kolm compile. The test set is published; you can run kolm eval against it locally until you pass the gate.
  3. Publish with kolm publish <artifact>.kolm --public. The artifact becomes a public, signed, content-addressable file on the hub.
  4. Submit the handle (your-username/artifact-name@sha256:…) and your payout details to bounties@kolm.ai.
  5. We verify by running kolm pull <handle> and kolm eval against the published test set. If it passes, we pay within seven business days.

Rules in one paragraph

Each bounty is awarded to the first submission that passes the gate against the published test set. Submissions are evaluated in the order received. If a submission fails the gate, you can fix and resubmit; the timer keeps running. The test set is held constant for the life of the bounty (no goalpost moves). You retain ownership of your artifact; we get a non-exclusive license to feature it on the hub front page. Payouts are USD via ACH or wire, or BTC at spot. Anyone can submit, anywhere; tax forms are your responsibility.

The five bounties

bounty 01 . healthcare . SOAP-note structuring
$4,000

Convert a free-text clinical encounter note into structured SOAP sections (Subjective, Objective, Assessment, Plan) with provenance offsets back to the source text. PHI must be redacted in the output.

gate K ≥ 0.90 . size ≤ 50 MB . p50 ≤ 300 ms . runtime local-only (no egress) . test set 200 cases, de-identified

Spec: Input is a 200-1500 word free-text note. Output is a JSON object with subjective, objective, assessment, plan as strings, plus spans as an array of {section, start_char, end_char} back into the original. Names, MRNs, dates of birth, addresses, and phone numbers must be replaced with category placeholders.
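Before publishing, it is worth checking your output shape against the spec locally. A minimal sketch of such a check, assuming the JSON layout above; the function name and error strings are illustrative, not part of the spec:

```python
def validate_soap(note: str, output: dict) -> list:
    """Return a list of shape problems; an empty list means the output is well-formed."""
    sections = ("subjective", "objective", "assessment", "plan")
    errors = []
    for key in sections:
        if not isinstance(output.get(key), str):
            errors.append(f"missing or non-string section: {key}")
    for span in output.get("spans", []):
        if span.get("section") not in sections:
            errors.append(f"unknown section in span: {span}")
        start, end = span.get("start_char", -1), span.get("end_char", -1)
        # Every span must point back into the original note text.
        if not 0 <= start < end <= len(note):
            errors.append(f"span out of bounds in source note: {span}")
    return errors
```

Checking PHI redaction is not shown here, since the exact placeholder format is up to you.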

Test set: kolm pull kolm-verified/bounty-soap-test --metadata-only to fetch the manifest; published cases at /registry-pack/manifest.json under bounty-soap-structuring.

accept kolm eval <your.kolm> --examples bounty-soap-test.jsonl returns K ≥ 0.90 on the published 200 cases. The first published .kolm to clear the gate wins.

bounty 02 . legal . Contract clause extractor
$3,000

Pull governing-law, term-length, auto-renewal, exclusivity, and liability-cap clauses from a commercial contract. Output the exact span and a normalized value.

gate K ≥ 0.90 . size ≤ 75 MB . p50 ≤ 400 ms . runtime local . test set 150 cases (real MSAs, anonymized)

Spec: Input is a contract as plain text (15-80 pages). Output is a JSON object with five fields: governing_law (state/country name), term_months (integer or null if perpetual), auto_renews (bool), exclusivity (bool), liability_cap (string: USD amount or "uncapped" or "fees paid in 12 months"). Each field must include a _span annotation pointing to the source text.

Test set: 150 anonymized MSAs, NDAs, and SaaS agreements; ground truth labeled by two lawyers in agreement.

accept field-level F1 ≥ 0.90 weighted by the K-score formula. Span localization counts: a correct value with a wrong span is half-credit.
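The half-credit rule can be sketched as a per-field scorer. This is an unofficial illustration: the `<field>_span` key layout and exact-match span comparison are assumptions, since the spec does not pin down the grader's span tolerance:

```python
def score_field(pred_value, pred_span, gold_value, gold_span):
    """Per-field credit: 1.0 for value and span correct, 0.5 for value only, else 0."""
    if pred_value != gold_value:
        return 0.0
    return 1.0 if pred_span == gold_span else 0.5

def contract_score(pred, gold, fields):
    """Mean per-field credit over the extracted clause fields."""
    return sum(score_field(pred.get(f), pred.get(f + "_span"),
                           gold.get(f), gold.get(f + "_span"))
               for f in fields) / len(fields)
```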

bounty 03 . edge . On-device speech-to-text under 50 MB
$3,500

A .kolm artifact that does English speech-to-text on commodity mobile hardware, under 50 MB, p50 real-time-factor ≤ 0.5 on iPhone 14, WER ≤ 8% on LibriSpeech test-clean.

gate WER ≤ 8% . size ≤ 50 MB . RTF ≤ 0.5 on A16 . runtime local (CoreML/ONNX-mobile) . test set LibriSpeech test-clean (2620 utterances)

Spec: Input is 16kHz mono PCM audio (up to 60 seconds). Output is a transcription string and per-word timing as [{word, start_s, end_s}]. Artifact must run on an iPhone 14 (A16) or comparable Android (Snapdragon 8 Gen 2) with RTF ≤ 0.5. Quantization is your call.

Test set: public LibriSpeech test-clean. WER computed with the standard NIST evaluator.

accept WER ≤ 0.08, size ≤ 50 MB, RTF ≤ 0.5 measured on our reference device. We will validate on a real iPhone 14; if you do not have one, document your measurement methodology and we will repro.
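For local sanity checks before submitting, WER and RTF are easy to compute yourself. This sketch uses the standard word-level Levenshtein formulation; the official score comes from the NIST evaluator, which also normalizes text, so treat local numbers as approximate:

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference word count."""
    r, h = ref.split(), hyp.split()
    d = list(range(len(h) + 1))  # rolling DP row of edit distances
    for i in range(1, len(r) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(h) + 1):
            cur = d[j]
            d[j] = min(d[j] + 1,                         # deletion
                       d[j - 1] + 1,                     # insertion
                       prev + (r[i - 1] != h[j - 1]))    # substitution or match
            prev = cur
    return d[len(h)] / max(len(r), 1)

def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: processing time over audio duration; <= 0.5 passes the gate."""
    return processing_seconds / audio_seconds
```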

bounty 04 . code . Pull-request risk classifier
$2,000

Classify a unified diff into one of six risk categories: security, performance, correctness, style, test_only, docs_only. Output the category, a confidence, and a one-line rationale.

gate K ≥ 0.90 . size ≤ 20 MB . p50 ≤ 100 ms . runtime local . test set 500 real diffs from public repos

Spec: Input is a unified diff (any size up to 50 KB). Output is {category: string, confidence: number, rationale: string}. Categories must be exactly one of: security, performance, correctness, style, test_only, docs_only. Rationale is a single sentence, at most 200 chars.

Test set: 500 labeled diffs from public OSS repos (Linux, Kubernetes, Rails, Postgres). Ground truth labeled by two senior engineers in agreement.

accept macro-F1 ≥ 0.90 across all six classes (no class can drop below 0.80 individually).
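The macro-F1 gate with a per-class floor can be reproduced locally. A minimal sketch, assuming plain label lists; function names are illustrative, not part of the official harness:

```python
def per_class_f1(y_true, y_pred, classes):
    """One-vs-rest F1 for each class, computed from raw label lists."""
    f1s = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1s[c] = 2 * tp / denom if denom else 0.0
    return f1s

def passes_gate(y_true, y_pred, classes, macro_min=0.90, class_min=0.80):
    """Macro-F1 must clear 0.90 AND no single class may fall below 0.80."""
    f1s = per_class_f1(y_true, y_pred, classes)
    macro = sum(f1s.values()) / len(classes)
    return macro >= macro_min and min(f1s.values()) >= class_min
```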

bounty 05 . voice . On-device voice-clone safety filter
$1,500

A .kolm artifact that detects synthesized/cloned speech in a 5-second audio clip and outputs a confidence score plus the synthesis-family guess (none, tts-v1, voice-clone, deepfake-2024+).

gate AUROC ≥ 0.92 . size ≤ 30 MB . p50 ≤ 80 ms on M2 . runtime local . test set 1000 clips (balanced, 5 synthesizers)

Spec: Input is 5-second 16kHz mono PCM. Output is {synthetic: bool, confidence: 0..1, family: "none"|"tts-v1"|"voice-clone"|"deepfake-2024+"}. Test set is 1000 clips: 500 real human, 500 split evenly across five public synthesizers (ElevenLabs, OpenAI TTS, Cartesia, Resemble, XTTS-v2).

Test set: published cases at kolm pull kolm-verified/bounty-voice-clone-safety-test --metadata-only.

accept AUROC ≥ 0.92 across the four families combined; family-prediction top-1 accuracy ≥ 0.80 on detected synthetics.
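AUROC on your own dev split can be computed without any ML dependencies via the standard rank-sum (Mann-Whitney U) identity; this is a sketch for local checks, not the official scorer:

```python
def auroc(labels, scores):
    """AUROC from binary labels (1 = synthetic) and confidence scores, ties averaged."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    idx = 0
    while idx < n:
        j = idx
        while j < n and pairs[j][0] == pairs[idx][0]:
            j += 1                       # group tied scores together
        avg_rank = (idx + 1 + j) / 2     # average of the 1-based ranks idx+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[idx:j])
        idx = j
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Mann-Whitney U statistic normalized by the number of (pos, neg) pairs.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```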

FAQ

Can I submit more than one? Yes, you can win multiple bounties. Submit one .kolm per slot.

Do I need to retrain a model? No. If a recipe (regex, lookup, classifier head) clears the gate, that is the right answer. Smallest, fastest, simplest wins.

Can I use a proprietary model as the base? Only if its license permits redistribution inside a signed artifact. Otherwise, use an open base (Llama 3.1, Mistral, Qwen, Whisper, etc.).

What if two submissions arrive minutes apart? We award by submission timestamp on our hub server. Earlier wins.

Is this a Kaggle? No leaderboard, no rolling rank. Pass the gate, get paid, slot closes.

Can I see the held-out cases? Yes, the public test set is the held-out cases. We do not run a hidden second test; the published gate is the gate.

What if the gate is unreasonable? Email bounties@kolm.ai with your strongest evidence. If we agree it is unattainable for any honest method, we will adjust the gate; the adjusted version is the gate from that point onward.

Tax forms? US-based winners get a W-9 request before payout; international winners get a W-8BEN. Payouts in BTC at spot rate available on request.