Skip to content
Get our updates
✓ Thanks for subscribing!

LemmaScript: A Verification Toolchain for TypeScript via Dafny

Star

When verifying code, we should strive for as close a relationship between the verified model and the code that executes. We have been exploring one way to achieve this in the context of web apps with dafny-replay and lemmafit. The idea there is to start with code in the Dafny verification-aware programming language, compile it to JavaScript, then hook the React frontend to use the compiled logic directly.

While this works quite well, especially for greenfield projects, we ran into issues that made us want to explore a different approach:

  • the Dafny compilation step introduces friction at the integration boundary between Dafny-compiled JS and the rest of the system
  • the approach doesn’t lend itself to a brownfield setup, where the initial code is already part of existing ecosystems (e.g. TypeScript)

In response to these challenges, we set out to create LemmaScript: it compiles TypeScript to Dafny (or Lean) for the sole purpose of having a model derived from the code upon which properties can be verified. The advantage is that the executable pipeline doesn’t change, and the verification pipeline is a complementary path that certifies correctness.

There are precedents for this approach: some inspirations include Verus for Rust, and Frama-C for C.

Example: brownfield verification of trimCookieWhitespace in Hono

Here is an utility function from the Hono web framework. Following CVE-2026-39410, we prove with in-place LemmaScript verification that this utility trims only space (0x20) and tab (0x09) and nothing else (e.g., 0xA0).

In oder to do so, we comment the code with @ verify, @ ensures, and @ invariant annotations. These are pure comments from the point of view of TypeScript. We are not modifying the executable code. However, when compiled by LemmaScript, these annotations are scaffolded for verification.

With all the annotations below, Dafny is able to prove that the @ ensures clauses are always true for any input/output of the function. This verifies that the CVE has been correctly fixed.

const trimCookieWhitespace = (value: string): string => {
  //@ verify
  //@ ensures \result.length <= value.length
  // CVE-2026-39410: only space (0x20) and tab (0x09) are stripped — nothing else (e.g. 0xA0) is removed.
  //@ ensures exists(start: int, exists(end: int, start >= 0 && start <= end && end <= value.length && \result === value.slice(start, end) && forall(i: int, i >= 0 && i < start ==> value.charCodeAt(i) === 0x20 || value.charCodeAt(i) === 0x09) && forall(i: int, i >= end && i < value.length ==> value.charCodeAt(i) === 0x20 || value.charCodeAt(i) === 0x09)))
  //@ ensures \result.length > 0 ==> \result.charCodeAt(0) !== 0x20 && \result.charCodeAt(0) !== 0x09
  //@ ensures \result.length > 0 ==> \result.charCodeAt(\result.length - 1) !== 0x20 && \result.charCodeAt(\result.length - 1) !== 0x09
  let start = 0
  let end = value.length

  while (start < end) {
    //@ invariant start >= 0 && start <= end && end <= value.length
    //@ invariant forall(i: int, i >= 0 && i < start ==> value.charCodeAt(i) === 0x20 || value.charCodeAt(i) === 0x09)
    const charCode = value.charCodeAt(start)
    if (charCode !== 0x20 && charCode !== 0x09) {
      break
    }
    start++
  }

  while (end > start) {
    //@ invariant start >= 0 && start <= end && end <= value.length
    //@ invariant forall(i: int, i >= 0 && i < start ==> value.charCodeAt(i) === 0x20 || value.charCodeAt(i) === 0x09)
    //@ invariant forall(i: int, i >= end && i < value.length ==> value.charCodeAt(i) === 0x20 || value.charCodeAt(i) === 0x09)
    const charCode = value.charCodeAt(end - 1)
    if (charCode !== 0x20 && charCode !== 0x09) {
      break
    }
    end--
  }

  //@ assert value.slice(0, value.length) === value
  return start === 0 && end === value.length ? value : value.slice(start, end)
}

See the files:

Example: greenfield verification (soundness and completeness) of a game decision procedure

Equality is a small game, where two sides have cards (numbers from 1 to 9), and the goal is to make each side equal by mathematical operations (+, -, *, / if wholesome).

In Dafny, we introduce a ghost predicate ExpressionsAgree(L: seq<int>, R: seq<int>), that says that there must exist expressions eL and eR, such that each expression matches its corresponding sequence as a multiset, and each expression evaluates to the same number.

We can then use this ghost predicate to ensure that canEqualize, which is used in the UI, is sound (if it returns true, then an equality exists) and complete (if it returns false, then no equality exists).

ghost predicate ExpressionsAgree(L: seq<int>, R: seq<int>)
{
  exists eL: Expr, eR: Expr ::
    multiset(leaves(eL)) == multiset(L) &&
    multiset(leaves(eR)) == multiset(R) &&
    evalExpr(eL).ok? && evalExpr(eR).ok? &&
    evalExpr(eL).v == evalExpr(eR).v
}

method canEqualize(L: seq<int>, R: seq<int>) returns (res: bool)
  requires (|L| >= 1)
  requires (|R| >= 1)
  ensures res <==> ExpressionsAgree(L, R)

See the files:

Looking forward

Each case study so far has pushed the boundaries of what LemmaScript can do.

We encourage you to try LemmaScript on your greenfield and brownfield projects, with the following caveats:

  • Each project might still need pushing the verification toolchain in new ways: for example, by handling a TypeScript construct or feature.
  • While the verified model is systematically derived from the TypeScript code, there might still be gaps or discrepancies in semantics. It would be beneficial to set up differential testing to gain assurance.
  • In-place verification of brownfield projects can sometimes be especially challenging, because the original code wasn’t designed with verification in mind.

With the caveats out of the way, we find it already promising and useful to catch issues and validate code. In terms of use cases, looking at greenfield vs brownfield suggests a roadmap: for greenfield projects, design with verification in mind from the start, encouraging patterns that lend themselves to smooth verification (like the dafny-replay pattern of specifying a state invariant and ensuring that each UI action preserves it); for brownfield projects, selectively verify important components in-place or with a security-minded rewrite for critical fixes. Then, the fix would not only claim to fix the security issue but actually prove it.