Agent patterns · signed tools, signed steps

The tool a planner calls is a signed artifact.

A frontier agent decides what to call. A .kolm decides how. Five wiring patterns that let your planner reason in the cloud while the deterministic, regulated, or latency-sensitive steps stay local. Every tool call returns a signed receipt that pins the artifact CID into the agent's audit log.

Frameworks
Claude Code · Cursor · LangGraph · OpenAI Agents
Transport
MCP stdio · MCP HTTP · OpenAI tool_calls
Per-step proof
HMAC receipt + CID
Cost / call
$0 once compiled
Pattern 01

A .kolm as an MCP tool inside Claude Code.

MCP · stdio transport · tool surface

Auto-attach every artifact under ~/.kolm/artifacts/

One command turns every compiled artifact on disk into a callable MCP tool. The Claude Code planner sees them in its tool palette, with input schemas inferred from the artifact's eval set. The model layer never sees raw payloads; the tool returns the deterministic output plus a receipt.

$ kolm serve --mcp
[mcp] phi-redactor.kolm   tool: phi_redactor   K=0.982  latency_p50=1.2µs
[mcp] sql-validator.kolm  tool: sql_validator  K=0.961  latency_p50=8.1µs
[mcp] price-router.kolm   tool: price_router   K=0.918  latency_p50=42µs

# Claude Code config (.mcp.json or settings.json)
{
  "mcpServers": {
    "kolm": { "command": "kolm", "args": ["serve", "--mcp"] }
  }
}
install kolm install claude-code --apply writes this automatically
Why this is honest: an MCP tool is opaque to the model. The artifact CID is what the agent audit log records, so a regulator can replay the exact bytes that produced any tool result, no model snapshot required.
Pattern 02

A .kolm as a function call inside the OpenAI Agents SDK.

OpenAI Agents · tool_calls · function tool

Bind the artifact to a typed Python callable

The OpenAI Agents SDK takes Python callables and exposes them as tools. The kolm Python SDK wraps each .kolm as a typed callable: the agent's planner picks it like any other tool, and the response carries the signed receipt the SDK pins into agent memory.

from agents import Agent, Runner
from kolm import as_tool

phi_redactor = as_tool("phi-redactor.kolm")  # typed Pydantic input/output

agent = Agent(
  name="ClinicalIntake",
  model="gpt-4o",
  instructions="When the user submits clinical free-text, call phi_redactor first.",
  tools=[phi_redactor],
)

result = Runner.run_sync(agent, "Patient John Doe, MRN 8847-21, complains of...")
print(result.final_output)
print(result.usage)  # token_count + kolm_receipts[]
min py 3.10 · extras [agents] · receipts auto-pinned to result.usage.kolm_receipts
Why this is honest: the cloud planner reasons about which tool to call. The local artifact decides what the tool returns, in microseconds, with a signature you can verify offline.
Pattern 03

A .kolm as a deterministic node in LangGraph.

LangGraph · graph node · state transformer

Plug a signed artifact into a stateful graph

LangGraph nodes transform state. A kolm artifact is a deterministic state transformer with a signed receipt. Drop it in as a node; the surrounding nodes (planner, retriever, validator) keep their existing roles, and the bytes you care about run locally.

from langgraph.graph import StateGraph, START, END
from kolm import load

redactor = load("phi-redactor.kolm")

def redact_node(state):
  out = redactor.predict(state["raw_note"])
  return {
    "clean_note": out.text,
    "receipts": state.get("receipts", []) + [out.receipt],
  }

def planner_node(state):
  return state  # stand-in for your existing planner node

g = StateGraph(dict)
g.add_node("redact", redact_node)
g.add_node("plan", planner_node)
g.add_edge(START, "redact")
g.add_edge("redact", "plan")
g.add_edge("plan", END)
app = g.compile()
state shape {"raw_note": str, "clean_note": str, "receipts": list}
Why this is honest: the receipt accumulates in the state. When the graph finishes, the agent has a complete chain of signed evidence covering every deterministic transformation it performed.
Pattern 04

A .kolm as a constrained-decoding backbone.

OpenAI schema · response_format · structured generator

Force every agent step to emit valid JSON or a chosen tool

An agent that maybe-returns-JSON is an agent that maybe-crashes. A .kolm served over the OpenAI HTTP shim accepts response_format=json_schema and tool_choice; the sampler enforces the grammar token by token. The planner never produces a malformed call.

# Agent calls the local .kolm with the same response_format it sends to OpenAI.
import json

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8765/v1", api_key="local")
resp = client.chat.completions.create(
  model="planner.kolm",
  messages=[{"role":"user","content":"Plan three steps to triage this ticket."}],
  response_format={
    "type": "json_schema",
    "json_schema": {
      "name": "plan",
      "schema": {
        "type": "object",
        "properties": {"steps": {"type": "array", "items": {"type": "string"}, "minItems": 3, "maxItems": 3}},
        "required": ["steps"],
      },
    },
  },
)
plan = json.loads(resp.choices[0].message.content)
backend vLLM grammar / Outlines / lm-format-enforcer · zero-shot rate 100% (grammar is enforced, not validated post-hoc)
Why this is honest: traditional retry-on-parse loops waste tokens and add latency. Grammar-constrained sampling refuses to emit invalid tokens at all. The first response is valid by construction.
Pattern 05

A .kolm pair as a verifier inside a test-time-compute loop.

Best-of-N / self-consistency · verifier role · scoring head

Sample N candidates, score with a signed verifier, return the winner

A planner agent samples N candidate plans. A second artifact, a Bradley-Terry reward model trained on your accepted-vs-rejected pairs, scores each one. The agent returns the highest-scoring plan plus the score and the verifier's CID. Both the planner's K-score and the verifier's K-score ship in the same receipt.

from kolm import load
from kolm.test_time import best_of_n

planner = load("planner.kolm")
verifier = load("plan-rm.kolm")  # Bradley-Terry reward model

result = best_of_n(
  prompt="Triage this support ticket.",
  generator=planner,
  scorer=verifier,
  n=8,
)
print(result.best.text)
print(result.best.score)            # e.g. 0.873
print(result.receipts.generator.cid) # planner CID
print(result.receipts.scorer.cid)    # verifier CID
params n=8 default · tie-break by score then by latency · cost still $0 per call locally
Why this is honest: a single planner pick is one sample from a noisy distribution. Scoring eight candidates with a learned verifier reliably beats sampling once, and the verifier's bias is bounded by the eval set it was trained on, which ships inside its own artifact.
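The loop itself is small enough to sketch generically. This is the same shape as the `best_of_n` call above, written against plain callables; the toy generator and verifier here are illustrative, not kolm APIs.

```python
import random

# Generic best-of-N: sample n candidates, score each, return the winner.
def best_of_n(generate, score, n=8, seed=0):
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    best = max(candidates, key=score)  # ties go to the earliest sample
    return best, score(best)

# Toy planner: a plan is just a step count; toy verifier prefers 3-step plans.
best_plan, best_score = best_of_n(
    generate=lambda rng: rng.randint(1, 6),
    score=lambda plan: -abs(plan - 3),
)
```

Swapping the lambdas for a `.kolm` generator and a reward-model scorer recovers the pattern above; the verifier's score becomes part of the receipt rather than a throwaway float.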
What you ship in the audit log

Three artifacts per agent step.

01

The model snapshot

The planner's response, the tool calls it produced, the temperature, the seed if exposed. For cloud planners this is what your provider gives you; for local planners served from a .kolm the snapshot also includes the artifact CID.

02

The tool receipt

Every .kolm call returns an x-kolm-receipt object: input hash, output hash, artifact CID, K-score floor, signed by the same HMAC chain as the artifact itself. Replayable byte-for-byte by an auditor with the secret.
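A minimal sketch of what offline verification could look like. The field names mirror the receipt description above, but the canonical encoding and key handling are assumptions, not kolm's documented wire format.

```python
import hashlib
import hmac
import json

# Hedged sketch: sign/verify a receipt as HMAC-SHA256 over canonical JSON.
def sign_receipt(receipt: dict, secret: bytes) -> str:
    canon = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(secret, canon, hashlib.sha256).hexdigest()

def verify_receipt(receipt: dict, signature: str, secret: bytes) -> bool:
    return hmac.compare_digest(sign_receipt(receipt, secret), signature)

receipt = {
    "input_hash": hashlib.sha256(b"Patient John Doe, MRN ...").hexdigest(),
    "output_hash": hashlib.sha256(b"Patient [REDACTED], MRN ...").hexdigest(),
    "artifact_cid": "bafy...phi-redactor",  # illustrative CID, pins the exact bytes
    "k_floor": 0.982,
}
sig = sign_receipt(receipt, secret=b"shared-audit-secret")
assert verify_receipt(receipt, sig, b"shared-audit-secret")
```

An auditor with the secret re-runs the artifact bytes pinned by the CID, recomputes the hashes, and checks the MAC; no network and no model vendor in the loop.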

03

The graph trace

For multi-step flows, the chain of receipts is itself canonicalizable and signable. The agent finishes with a single receipt that recursively pins every .kolm it touched, end-to-end, no model-vendor trust required.
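The recursive pinning can be sketched as a hash chain: each link signs its receipt together with the previous link's MAC, so the final value commits to every step in order. Encoding and key handling are again assumptions for illustration.

```python
import hashlib
import hmac
import json

# Hedged sketch: fold a list of per-step receipts into one chained MAC.
def chain_receipts(receipts: list[dict], secret: bytes) -> str:
    mac = b"\x00" * 32  # genesis link
    for r in receipts:
        canon = json.dumps(r, sort_keys=True, separators=(",", ":")).encode()
        mac = hmac.new(secret, mac + canon, hashlib.sha256).digest()
    return mac.hex()

trace = chain_receipts(
    [{"cid": "bafy...redactor", "step": 0}, {"cid": "bafy...planner", "step": 1}],
    secret=b"shared-audit-secret",
)
# Tampering with, reordering, or dropping any step changes the final value.
```

Verifying the single `trace` value therefore verifies the whole graph run, which is what lets the agent ship one receipt instead of N.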

Compatibility

Which integration point each framework hits.

Framework | Entry point | Transport | Receipt surface
Claude Code | kolm serve --mcp | MCP stdio | x-kolm-receipt header per tool reply
Cursor | kolm install cursor --apply | MCP HTTP on localhost | same receipt header, pinned to Cursor chat row
Continue | kolm install continue --apply | MCP HTTP on localhost | receipt threaded into Continue context block
Cline | kolm install cline --apply | MCP HTTP on localhost | receipt in tool result envelope
OpenAI Agents | kolm.as_tool(art) | Python callable | result.usage.kolm_receipts list
LangGraph | node fn returning {receipts: []} | in-process Python | state accumulates receipts across nodes
LangChain | Tool.from_function(load(art).predict) | in-process Python | callback handler reads receipt from return
Vercel AI SDK | fetch wrapper to kolm serve --http | OpenAI HTTP | response.headers["x-kolm-receipt"]

The planner decides. The artifact proves.

Agentic systems route requests through a planner; the planner's job is good taste, not byte-exact behavior. A .kolm is the byte-exact behavior, the deterministic floor, the regulated tool the planner reaches for when the answer needs a signature. Five patterns, one contract: the signed artifact is the API.