l3m: A Verified Coding Agent in Lean 4

l3m (pronounced “Lean LLM”) is a coding agent written entirely in Lean 4. It reads files, edits code, runs builds, and commits to git — the same operations as any Claude-Code-shaped agent. The difference: its runtime safety guarantees are kernel-verified theorems, not documentation promises. Every tool carries a contract. Every contract bottoms out in named axioms about the environment. The trust report is a build artifact.

The Problem

LLM-powered coding agents need sandboxes. The current approaches each have a gap:

Prompt-based safety — “don’t run rm -rf /” — is brittle. The model can be jailbroken, confused, or simply wrong about what a command does. No amount of prompt engineering produces a guarantee.
OS-level sandboxing (containers, seccomp, landlock) works for containment but doesn’t help you trust your own code. The sandbox tells you “this process can’t escape,” not “this agent only touches files inside ./workspace.”
Type-level restrictions in most languages are too coarse. You can restrict IO at the module boundary, but you can’t express “this path is confined to a subtree” or “this subprocess is one of exactly these five commands” in a way the compiler verifies.

l3m asks: what if the sandbox was your code, written in a language whose kernel can verify the confinement properties? Not “we tested it and it seems safe” — the Lean kernel checked a proof that the agent’s call graph cannot reach raw IO, cannot invoke arbitrary subprocesses, and cannot escape its working directory.

The thesis: A coding agent whose safety properties are kernel-verified theorems has a smaller trusted computing base than one relying on OS sandboxing + prompt discipline. The audit surface is explicit, named, and machine-checkable.

The Architecture

l3m has a pure core and a thin IO shell. The key design elements:

Typeclass-based tools with a closed operation set

Each tool (file read, file edit, grep, git, lake, shell) is a typeclass instance with a fixed set of operations. The agent’s run loop dispatches on a closed inductive of tool calls — there is no eval, no dynamic dispatch to arbitrary functions. Adding a tool means adding a constructor to the inductive and a typeclass instance; both are visible in the source.

inductive ToolCall where
  | fileRead   : SafePath → ToolCall
  | fileEdit   : SafePath → EditOp → ToolCall
  | dirList    : SafePath → ToolCall
  | grep       : GrepQuery → ToolCall
  | git        : GitOp → ToolCall
  | lake       : LakeOp → ToolCall
  | shell      : TameCommand → ToolCall

SafePath: opaque confined paths

SafePath is an opaque type. You cannot construct one from a raw string in pure code. The only constructor is confineIO, which lives in the IO shell and calls realpath to resolve symlinks, then checks that the result is a prefix of the workspace root:

opaque SafePath : Type

-- The ONLY way to get a SafePath
def confineIO (root : System.FilePath) (candidate : String)
    : IO (Except ConfineError SafePath) := do
  let resolved ← IO.FS.realPath candidate
  if resolved.toString.startsWith root.toString
  then return .ok (SafePath.mk resolved)
  else return .error (.escaped root candidate resolved)

Every file operation takes a SafePath. The type system enforces confinement — you literally cannot call fileRead with an unconfined path. No runtime check can be forgotten because the check is in the type.

TameCommand: closed inductive of allowed subprocesses

The agent can spawn subprocesses, but only from a fixed list:

inductive TameCommand where
  | git    : GitArgs → TameCommand
  | lake   : LakeArgs → TameCommand
  | grep   : GrepArgs → TameCommand
  | find   : FindArgs → TameCommand
  | cat    : SafePath → TameCommand

There is no | arbitrary : String → TameCommand. The reachability proofs (below) verify that the run loop can only reach TameCommand constructors, never raw IO.Process.spawn.

Tame vs. wild mode

By default, l3m runs in tame mode: only TameCommand subprocesses, only SafePath file access. There are two opt-in escalation tiers. Feral mode (--feral, or à la carte --enable-spawn, --enable-lake-run, --enable-eval-pure) unlocks tools that run user-controlled Lean code in subprocesses. Wild mode (--wild) unlocks raw shell and web_fetch with IO.Process.spawn. The manifests for wild-tier tools are permanently UnprovenConjecture — they can never be promoted, and the trust report always names them as unverified assumptions.

The design makes the tradeoff explicit: tame mode has 8 kernel-verified safety theorems. Wild mode has documentation. You choose.

What tame mode does not protect against. Capability bounding is not the same as secret-safety. An agent with read access to workspace files plus write access to commit-ready files can transcribe content the user did not want shared. We hit this in production: an early agent session, in --wild, read ~/.zshrc, faithfully transcribed the ANTHROPIC_API_KEY export into a documentation file, and committed. GitHub’s scanner caught it weeks later. None of l3m’s capability theorems were violated — the agent had legitimate shell access — but each individual step looked fine while the composition exfiltrated. We’ve written this up separately as a worked example. Defenses being added: env-var classification at the cap layer, output redaction at the tool-result boundary, and a commit-time secret scan in the git tool.

The Evidence Hierarchy

l3m borrows its verification model from Lean Manifests, which provides compiler-enforced evidence levels for Lean modules. Every claim in l3m’s codebase carries one of four evidence levels:

Level	Meaning	Example in l3m
`UnprovenConjecture`	Sorry IS the theorem. Zero evidence.	Shell tool safety (“the LLM won’t do anything bad”)
`TestedConjecture`	Sorry is the ∀. At least one witness exists.	Lake output parsing (“works on observed outputs”)
`DerivedConjecture`	Sorry is in OTHER manifests. Derivation is real.	confineIO correctness (depends on OS axioms)
`ProvenTheorem`	No sorry ANYWHERE. Kernel-verified.	Call graph reachability, compression roundtrip

Each manifest entry is tracked via a @[manifest_entry] attribute. At build time, a trust report is generated automatically: it walks every DerivedConjecture, collects its sorry dependencies, and reports exactly which assumptions the entire system rests on. Currently: 4 standard Lean axioms (propext, Quot.sound, funext, Classical.choice) + 10 typed UnprovenConjectures with real propositional content. The trust report is a build artifact — not a wiki page someone might forget to update.

Why this matters for an agent

When you deploy a coding agent, you want to know: “what am I trusting?” With l3m, the answer is a machine-generated list. The codebase currently contains 96 ProvenTheorems (kernel-verified), 17 TestedConjectures, 8 DerivedConjectures, and 10 typed UnprovenConjectures—with zero True-typed axioms. Every axiom has real propositional content and is falsifiable.

A ClosureTheorem proves that non-overlapping l3m workspaces produce observationally equivalent results regardless of write ordering, derived from a single narrow environmental axiom (SingleSystemAxiom). Multi-agent correctness is a theorem, not a hope.

Call Graph Reachability Proofs

The strongest guarantees in l3m are eight kernel-verified theorems about the agent’s call graph. These aren’t hand-written proofs — they’re generated by a metaprogram that walks the Lean environment at elaboration time.

How it works

A generator script runs during elaboration (inside elab commands).
It walks the environment starting from runLoop (the agent’s main entry point), collecting every reachable constant.
It emits the reachable set as a concrete List Name literal.
It emits native_decide proofs that check properties of this concrete list against the allowed sets.

Because the lists are concrete (no variables, no quantifiers over unknown sets), native_decide reduces each check to a boolean computation that the kernel verifies via Lean.ofReduceBool.

The eight theorems

Theorem	What it proves
`runLoop_in_reachable`	The run loop is in the computed reachable set (sanity check)
`reachable_closed`	The reachable set is transitively closed — no constant reachable from the set is outside it
`io_subset_allowed`	Every IO operation in the reachable set is in the allowed list
`no_unsafe_reachable`	No `unsafe` declaration is reachable from the run loop
`extern_subset_allowed`	Every `@[extern]` in the reachable set is in the allowed extern list
`no_L3m_raw_axioms`	No axioms declared in L3m modules (only Lean’s built-in axioms)
`no_L3m_unsafe`	No `unsafe` declarations in L3m modules
`no_L3m_custom_externs`	No custom `@[extern]` declarations in L3m modules

Together, these establish: starting from the agent’s entry point, you can only reach the declared tool operations. There is no hidden path to raw IO, no unsafe escape hatch, no smuggled extern.

Example: the PureSurface manifest

-- L3m/Manifests/PureSurface.lean
-- Generated at elaboration time. Do not edit by hand.

import L3m.Proofs.Reachability

ProvenTheorem runLoop_in_reachable :
    `L3m.Runtime.runLoop ∈ reachableSet := by native_decide

ProvenTheorem reachable_closed :
    ∀ c ∈ reachableSet, ∀ d ∈ (deps c), d ∈ reachableSet := by native_decide

ProvenTheorem io_subset_allowed :
    ∀ c ∈ reachableSet, isIOType c →
      c ∈ allowedIO := by native_decide

ProvenTheorem no_unsafe_reachable :
    ∀ c ∈ reachableSet, ¬ isUnsafe c := by native_decide

ProvenTheorem extern_subset_allowed :
    ∀ c ∈ reachableSet, isExtern c →
      c ∈ allowedExterns := by native_decide

ProvenTheorem no_L3m_raw_axioms :
    ∀ c ∈ reachableSet, isL3mDecl c →
      ¬ isAxiom c := by native_decide

ProvenTheorem no_L3m_unsafe :
    ∀ c ∈ reachableSet, isL3mDecl c →
      ¬ isUnsafe c := by native_decide

ProvenTheorem no_L3m_custom_externs :
    ∀ c ∈ reachableSet, isL3mDecl c →
      ¬ isExtern c := by native_decide

Each native_decide compiles to a boolean check over the concrete list. If someone adds a new tool that reaches raw IO, the proof fails to compile. The safety invariant is maintained by the build system, not by code review.

The Compression Budget

This is the novel bit. l3m can make web queries (for documentation lookups, API references, etc.), but those queries are charged against a byte budget based on compressed size.

Why compression?

A naïve byte budget (raw response size) is gameable: the agent could fetch a 10KB page that’s 90% boilerplate and extract 200 bytes of useful information. Compression measures information content more honestly. If a response compresses well, it contained little novel information relative to what the agent already knows.

How it works

The budget is denominated in compressed bytes (LZ77 with a public seed).
The seed is a known corpus — the agent’s existing context. Content that’s redundant with the seed compresses to nearly nothing.
Each web response is compressed against the seed. The compressed size is deducted from the budget.
A roundtrip check ensures the byte count is honest: decompress the compressed output and verify it matches the original. If it doesn’t, the compression implementation is buggy and the budget is meaningless.

Security model

The seed is public (it’s the agent’s context, which the agent itself can print). A man-in-the-middle who modifies the seed can make responses appear larger or smaller than they are — but this is a denial of service, not an information leak. The agent either runs out of budget too fast (DoS) or gets more budget than intended (which is bounded by the raw response size anyway). No secret is exposed by seed manipulation.

The proven theorem

ProvenTheorem compress_decompress_roundtrip :
    ∀ (seed : ByteArray) (input : ByteArray),
      decompress seed (compress seed input) = input := by
  intro seed input
  simp [compress, decompress, lz77_roundtrip]

This is a ProvenTheorem — no sorry, no axiom, kernel-verified. It guarantees that the compression/decompression pair is a faithful roundtrip: the byte count you’re charged is the byte count of a recoverable encoding of the original data. The budget cannot silently lie about how much information was consumed.

What this buys you

Most agents have no formal bound on information consumption. l3m’s compression budget means you can deploy the agent with a provably-enforced cap on how much external information it ingests per task. The cap is measured in information-theoretic terms (compressed size), not raw bytes.

What’s Trusted (The Audit Surface)

Every verified system has a trusted computing base — the things you believe without proof. l3m’s is explicit and small:

Always trusted

The Lean kernel — the type checker that verifies all proofs. This is the same trust assumption every Lean project makes. It’s ~5,000 lines of C++ that have been extensively reviewed.
Lean.ofReduceBool — the axiom that native_decide relies on. It says: if compiled Lean code evaluates a Decidable instance to true, the proposition holds. All 8 reachability proofs depend on this.

Typed axioms (10 named assumptions)

confineIO_rejects_symlink_escape — resolved SafePath starts with its declared root. If wrong, confineIO can be tricked by a symlink race.
single_writer / SingleSystemAxiom — only l3m instances modify workspace files. External editors violate this.
filesystem_writes_go_through_safepath — all write events are authorized by a SafePath.
network_io_only_in_llmclient — network requests originate only from expected modules.
llmclient_only_talks_to_configured_endpoint — LLM client contacts only configured HTTPS endpoints.
env_reads_only_in_main — environment variable reads only in Main.lean.
transcript_only_to_user_and_llm — transcript destinations are {user, llm, diskLocal}.
tier2_git_reversible — non-read-only git actions have a reversal.
tame_lakefile_confines_build — tame lakefiles produce only safe commands.

Every axiom references real structures from the codebase (FsEvent, WriteEvent, NetworkRequest, etc.), is falsifiable by a concrete counterexample, and is used by at least one downstream DerivedConjecture. Zero True-typed axioms remain—every axiom constrains something.

Heuristic code (~5 lines)

isIOType — a 5-line function that inspects a Lean Expr to determine whether a constant’s type involves IO. This is used by the reachability generator to classify which constants are IO operations. If it misclassifies a constant, the io_subset_allowed theorem could have a false negative. The function is small enough to audit by inspection.

Total audit surface: Lean kernel + ofReduceBool + 10 typed UnprovenConjectures + 5 lines of heuristic. Everything else is either kernel-verified or explicitly marked in the trust report. There is no unmarked middle ground.

Try It

l3m is open source and under active development:

Repository: github.com/deanpfoster/l3m

Requires Lean 4 (stable toolchain) and a path dependency on Lean Manifests for the evidence hierarchy macros. Clone both repos as siblings and lake build.

The project is designed to be self-hosting: the goal is for l3m to maintain its own codebase without human intervention. Concretely, it should handle tasks like “add a --json flag to Tools/Lake.lean, update the manifest, run the build, fix errors, commit” end-to-end.

If you’re interested in applying this pattern to your own Lean project, the key insight is: start with the tool inductive and SafePath. Those two types give you the closed operation set and confined paths. The reachability proofs come later, once the architecture stabilizes.

Lean Manifests (GitHub) — the evidence hierarchy system that l3m uses. Introduces UnprovenConjecture through ProvenTheorem, the @[manifest_entry] attribute, and build-time trust reports.

The Problem

The Architecture

Typeclass-based tools with a closed operation set

SafePath: opaque confined paths

TameCommand: closed inductive of allowed subprocesses

Tame vs. wild mode

The Evidence Hierarchy

Why this matters for an agent

Call Graph Reachability Proofs

How it works

The eight theorems

Example: the PureSurface manifest

The Compression Budget

Why compression?

How it works

Security model

The proven theorem

What this buys you

What’s Trusted (The Audit Surface)

Always trusted

Typed axioms (10 named assumptions)

Heuristic code (~5 lines)

Try It

Related