Mem2reg (Analysis + Transform plan) by dc-mak · Pull Request #998 · rems-project/cerberus

dc-mak · 2026-03-16T02:55:26Z

This PR adds support for the analysis part of a Core-to-Core mem2reg pass. This would be useful to speed up execution of Core programs and also CN proof (and Fulminate?) substantially.

It was developed with Claude and so I've included the plans that fed into the code changes for documentation in this PR - the final PR can have these deleted (including a fresh branch so they don't even exist in the history).

For example, Claude seemed to think that an Ail/Ail+Core analysis would be more complicated than a simple Core-to-Core, so the reasoning for this is included in doc/history/2026-03-13_mem2reg-plan.md

It's already got a few commits and a fair amount of code, but the PR can and should be read one commit at a time.

The analysis pass is quite naive - it does multiple traversals for one thing at a time (instead of one complex one to collect all the info), but it's easy to understand and I figured it can be improved later once we have a working implementation that passes tests (or not, if it turns out to be fast enough), e.g. I assume perhaps the Core rewriter library does some sort of pass-fusion?

Add the SW_mem2reg switch, an identity-stub Core pass (core_mem2reg.ml), pipeline wiring, and 10 CI tests covering simple scalars, branches, loops, address-taken variables, structs, and mixed cases. All 10 tests pass with the stub (identity transform). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Phase 1: run all pre-existing CI tests with --sw mem2reg and verify output is unchanged (regression safety). Phase 2: run --pp core --sw mem2reg on the 10 new mem2reg tests and count create( occurrences to verify promotable variables are eliminated. - run-mem2reg.sh: thin dispatcher that runs phase1 then phase2 - run-mem2reg-phase1.sh: regression check — --sw mem2reg must not change output of existing CI tests (0001–0340) - run-mem2reg-phase2.sh: elimination check — counts create() in Core IR for all 10 mem2reg tests (0341–0350); current expected values reflect the stub pass, with target values noted in a comment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replaces the identity-transform stub with a full promotability analysis: find all Create(PrefSource)-bound pointer syms in each Proc and determine which ones are only ever used as Load0/Store0/Kill addresses (never address-taken or passed to arbitrary expressions). Key details: - sym_occurs_in_{pexpr,expr,action}: conservative occurrence check - classify_action: precise Load0/Store0/Kill address classification - collect_uses: handles the Core load-alias idiom (let weak tmp = pure(ptr) in load(ty, tmp)) via a special Ewseq arm - collect_creates: finds Esseq-bound Create(PrefSource _) syms - Debug output at level 3: "[mem2reg] f: N promotable: [...]" - Note in collect_creates on the via-pointer calling-convention assumption and how CN's value-passing convention would allow extending this to non-address-taken function parameters Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ries Under Normal_callconv only PrefSource (C locals) are promoted; under Inner_arg_callconv PrefFunArg Create bindings are also included, since in that convention the callee owns the argument slot. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…nalysis check_definitely_init now correctly handles: - Ewseq: e2 receives already_init (not ia1), since Neg stores in e1 are not sequenced before e2; init_after = ia1 || ia2 - Eunseq: all arms receive already_init; safe/init_after both require AND across arms (no arm's result is visible to siblings) - SeqRMW: classified as Use_seqrmw (promotable); (safe=already_init, init_after=true) — atomically reads then writes Add no_mixed_unseq_uses predicate: blocks promotion when a write arm and >=2 mentioning arms coexist in an Eunseq, preserving Cerberus's unsequenced-race detector for cases like `x + (x = 42)`. find_promotable now requires all three predicates: use_is_promotable && check_definitely_init && no_mixed_unseq_uses Add seven CI tests (0351-0357) covering Ewseq promotion, uninit/init unsequenced races, read-only Eunseq arms, post/pre-increment SeqRMW, and x+(x++) UB detection. Update run-mem2reg-phase2.sh and tests.sh. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dc-mak · 2026-03-16T21:52:56Z

Pasting this discussion here for visiblity.

@cp526:
I suspect for Esave one wants to transform the label type. In the normal Core elaboration, for any in-scope C variable x, the Esave will take as an argument a pointer to the stack location of x. The mem2reg transformation should make it so the label instead takes the current value of x.

@neel-krishnaswami:
Operationally, save is basically just a let-binding, so it's easy.

It's only when you hit a run that anything weird happens. Its operational rule uses the "labelmap" which (annoyingly) shows up in the run operational rule and nowhere else in Kayvan's thesis. Conceptually it's pretty straightforward but there may be details which live only in the Cerberus codebase and/or Kayvan's brain.

@dc-mak:
Ok so it seems like for the analysis, the default parameter of save counts as a use, but a slightly unusual one. As Neel pointed out, it is basically a let-binding, so we need to consider the binder as an alias for analysing the body. But for the transformation/executing the save as straightline code (not from a run), we would want it to count the default value as a load.

AFAIR, elaboration ensures that saves are closed wrt C variables in scope (ie. only C local vars are bound in its binders). However, if this were not the case, any uses of a C local variable in the body (which were not bound by the save params), they would disqualify a local variable from promotion (since the save can be reached from straightline, forward run, and backward run, but any transformation would effectively lock-in the straightline case).

I think the fact that save parameters only bind the C local vars means that run should be easy. Unlike function calls, where passing the address of a local var would escape and disqualify it, passing the address of a local var to run would basically count as a load. I don't think labelmap will be much of a blocker, I remember understanding it once and I'll revisit it to be sure.

The transformation for save will be a little fiddly, like with if, since any mutated local vars will need to passed out the final expression, rebound with a let strong, and the environment updated.

@cp526 For what it's worth, there's two alternatives to doing this as a core-to-core rewrite: option (1), which Kayvan was considering, is to adjust the elaboration around the variables mem2reg would optimise; option(2) would be to make this a mucore-to-mucore rewrite. We want to merge the core and mucore expression grammars, but will want to keep mucore's separate handling of labels (with labels as function-local procedures rather than Esave), so then you wouldn't have to think about how mem2reg should work for Esave

dc-mak · 2026-03-17T21:15:25Z

@cp526:
For what it's worth, there's two alternatives to doing this as a core-to-core rewrite: option (1), which Kayvan was considering, is to adjust the elaboration around the variables mem2reg would optimise; option(2) would be to make this a mucore-to-mucore rewrite. We want to merge the core and mucore expression grammars, but will want to keep mucore's separate handling of labels (with labels as function-local procedures rather than Esave), so then you wouldn't have to think about how mem2reg should work for Esave

Other related points raised in discussion:

Consider the impact on the code which supports reading local variables in at a program point within annotations (CN statements). We'd need to thread the environment of the transform through to that point.
Consider that the promotable variables info could be useful to Fulminate; this may push towards an Ail-analysis at least, if not an Ail-transform too. IIRC, the symbols are the same in Ail and Core so this may not be a big concern.
Another option is analysing at Ail/Core, but emulating the transform in the type system. This would make the type system more complicated.

…tracking See doc/history/2026-03-18_mem2reg-esave-erun.md for the full design. Previously, any occurrence of a CREATE-bound sym in an Esave's args or body conservatively blocked promotion. The fix rests on two invariants of the elaborator: 1. Every CREATE-bound local pointer sym that appears inside an Esave body is passed in as a bare PEsym default arg — it is never referenced directly from the body under its outer name. (Non-CREATE-bound syms such as function parameters may appear freely; the invariant is narrowly about CREATE-bound locals.) Consequently, if a sym is not aliased by any param, it cannot reach the body, and the analysis can skip it entirely. 2. Erun is terminal: control jumps unconditionally into the Esave body and the sequential continuation is unreachable, so init_after = true. Erun args for CREATE-bound syms are bare PEsym aliases of the target Esave's params, so alias resolution via find_single_direct_alias is sufficient and no structural occurrence check is needed. Both collect_uses and check_definitely_init use the (cached) result of recursing on the Esave body using the param_sym which is an alias of the Erun arg or Esave default arg. They then combine that with any existing information about the arg itself to calculate the result. New infrastructure: - collect_esave_defs: pre-walk that builds a memo table of Esave definitions (label_sym → params + body), enabling Erun sites to resolve forward references without re-traversal. - find_single_direct_alias: finds the unique param whose default is a bare PEsym equal to the outer sym. - Memoised collect_uses / check_definitely_init: results cached per (label_sym, param_sym); a sentinel is written before the recursive call to handle back-edge Eruns without diverging. For collect_uses the sentinel [] is exact (a back-edge Erun carries no semantic use). For check_definitely_init the sentinel (true, true) is a sound over-approximation: any load-before-store on an exit path sets safe=false through &&-propagation before the back-edge Erun is reached, so the sentinel cannot mask a real unsafety. - expr_writes_sym and no_mixed_unseq_uses updated with the same param-alias indirection for Esave. New CI tests: 0361–0364 cover pre-init loop reads, escaping writes, nested loops, and uninit loads inside loops. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The definite-initialisation check previously used a plain bool for "is sym initialised after this expression", conflating two distinct notions: a Paction Pos store (committed before any Esseq continuation) and a Paction Neg store inside Ewseq (unsequenced w.r.t. the Ewseq continuation). This made the analysis overly conservative for the common case of compound literals, whose Core IR is: Ewseq(_, Store0(Paction Pos, ptr, val), Load0(ptr)) The Pos store is sequenced before the Load in memory-model terms, but the old bool could not express this, so the Load was conservatively rejected and the variable was not promoted. Replace the bool with a three-element lattice Write_kind = No_write < Weak < Strong. Esseq promotion now treats any write (Weak or Strong) as initialised for its continuation; Ewseq only propagates Strong. Load/SeqRMW safety checks use is_strong. Branch and unsequenced joins use meet (&&); sequential joins use join (||). Adds test 0365-mem2reg_compound_lit.c verifying that an int variable initialised via a compound literal produces the correct result under --sw mem2reg, and adds it to run-mem2reg-phase2.sh with the stub create-count of 2 (fully-working target: 0). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Records the finding that the Write_kind refactoring does not change promotable-variable counts in practice: check_definitely_init checks for Load0(PEsym sym) directly, but C-generated Core always loads locals through an intermediate alias (Ewseq(tmp, pure(sym), load(tmp))), so the Ewseq rule change never fires. Includes annotated Core output for the compound-literal test to show why a_508 is filtered by collect_uses (Use_other) before check_definitely_init is reached, and why x is promotable under both old and new analysis for unrelated reasons. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Introduces SW_copy_prop / --sw copy_prop, a new Core pass that runs immediately before mem2reg in core_passes. The pass finds bindings of the form `let alias = pure(sym)` in Ewseq/Esseq/Elet/PElet and substitutes sym for alias throughout the body, replacing the named pattern with a wildcard. It also handles the extended case where the RHS is effectful but its tail expression is pure(sym) with sym not locally bound within the RHS (the compound literal motivating case from 2026-03-20_mem2reg-weak-init.md). Key design point: binding nodes are always kept (pattern → wildcard), never dropped, so source-location annotations are preserved. New files: copy_propagation.mli/.ml, tests 0366-0373, phase1/phase2 test scripts, history doc. All 208 CI tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dc-mak · 2026-03-20T06:26:05Z

I've been coding and thinking some more and I'm now less sure about doing a mem2reg transformation on Core.

It can affect source locations.
The eliminated and new symbols will need to be threaded through to any backends like anyways CN to support existing features to make sure that changes in mem2reg don't break specifications. The mem2reg pass would essentially need to preserve the environment used to support the ghost load in the middle of the function, or loop invariants.
```
int main() {
    int x = 10;
    while (x > 0)
    /*@ inv 0i32 < x; x <= 10i32 @*/
   { x--; }
   /*@ assert (x == 0i32); @*/
   return x;
 }
```

I always found the treatment of local vars in CN to be confusing (and I was not alone in this #443).

Could it be possible to end up in a situation where updating a mem2reg pass would cause existing verified code to break? That would be bonkers, but I am confused right now.

These considerations push me towards favouring an analysis and then passing those locations to CN, so they can be treated specially/on a fast path in the checker itself. But I am not decided/am still unsure.

Replaces the three-pass promotability analysis (check_definitely_init, expr_writes_sym, no_mixed_unseq_uses) with a single abstract interpretation, sequentialisable, that tracks Mem_event sequences and an env (Uninit / Init pe / DelayedInit pe / Killed) through the IR. Key improvements over the old Write_kind lattice approach: - Tracks the concrete stored value (pexpr) so the transform phase can substitute it directly at Load sites without a second traversal. - DelayedInit models Neg-polarity (Ewseq) stores, blocking any Load/Store on the same sym before the sequence point. - SeqRMW is handled: atomically reads the current value, substitutes it into the update expression, and records the result as Init. - Eunseq sequencing-violation detection is now integrated: raises Not_sequentialisable if a write arm coexists with any other arm that mentions sym. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Regrettably, this commit fixes up many small and big things in mem2reg. - Mem_event => Event - Mem_event.t list => Set.Make(Event).t - Remove DelayedInit and fix handling of Neg_stores in wseq - Remove alias case for wseq in collect_uses (handled by copy_prop pass) - Use Core_aux.unsafe_subst_pexpr instead of hand-rolled version - Use Pexpr (.. PEundef ..) for env "value" of non-returning case-branches to retain pattern-completeness - Handle End and Epar same as Eunseq - Fix the Ebound case to keep the env

- Replace tail_sym/tail_sym_acc with tail_pexpr/tail_pexpr_acc, which returns any safe pure expression at the tail of an effectful expression, not just PEsym. Two new helpers support this: pexpr_safe (conservative free-variable check) and binders_of_pat (collects all named symbols from any pattern, including CaseCtor tuples). - Change type env from (sym, sym) Pmap.map to (sym, pexpr) Pmap.map. Bindings for Ewseq/Esseq/Elet now store the full pexpr value in the env; PEsym substitution returns the stored pexpr directly. All 208 CI tests pass; copy-prop phase-1 regression suite unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Generalise tail_pexpr to return a pexpr_tree rose tree that mirrors Eunseq nesting, and add match_pat_tree to walk a Core pattern and tree in lock-step, partially wildcarding tuple positions where a known pure expression is available. This lets copy_prop propagate constants out of unsequenced sub-expressions — comparison operators, function-pointer loads, etc. — into the continuation. Also fix a Core typechecker asymmetry exposed by the new propagations: the checking rule for Cspecified was stricter than inference — it did not allow Specified(pe) to be checked against BTy_storable even when pe has an object type, which is semantically fine. New CI test: 0374-copy_prop_tuple.c. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Tracking the actual pexpr value in the environment per sym complicates the code needlessly (at this stage) and is also just plain incorrect, if done one symbol at a time. Hence this commit removes it, which simplifies much of the code drastically. This commit also makes sure that non-returning code in a wseq or an unseq throws a Not_sequentialisable error, since we can't soundly approximate the events to detect a race condition (though I'm fairly sure the elaboration never places a run inside either of those constructs). For example, for this C program: ```c int main() { void *p; int houdini; escape: p = &houdini; return p ? 0 : 1; } ``` After copy propagation, we get this label. ```ocaml save escape: unit (p: pointer:= p, houdini: pointer:= houdini) in let strong _: loaded pointer = bound( ... let weak _: unit = neg(store('void*', p, Specified(houdini))) in pure(Specified(houdini)) ) in pure(Unit) ; ``` And we'd have to be very careful with tracking the environment for `p` after the save block, to make sure it doesn't refer to the save-symbol houdini, but to the procedure-level one.

Leaving the values in place still causes mem2reg to fail with plenty Use_others, so updating copy propagtion to rewrite to Units removes such false negative uses. This rewrite found that evaluation of PEare_compatible is stricter than it needed to be/can take multiple steps, due the fact (a) pure expression can diverge (b) so function calls reduce in more than 1 step (c) let pat = pe1 in pe2 eagerly evaluates pe1 to a value in more than 1 step but (d) are_compatible(pe1, pe2) only gives pe1 and pe2 one step each so (e) rewriting `let x = params_nth(param, 0) in are_compatible(_, x)` with `let _ = Unit in are_compatible(_, params_nt(param, 0))` caused evaluation to fail (test 0021-fact.c, amongst others). This was fixed by updating param_nth to return the whole expression when reducing one step each didn't produce a value (values of the wrong type are still rejected, of course). I still don't understand why marking PEcfunction as unsafe makes it not substituted - I would have thought the pattern matching on the tuples would have prevented it.

Previously, a tuple like let (x,y) = pure((v, cfunction(v)) in .. would be discarded from consideration altogether due to the cfunction on the right, even though x can be propagated. The reason for this was that pure tuples were replaced with Unit as a whole rather than per branch; this commit fixes that to do it per component recursively. It also tidies up much of the comments and function names along the way (sorry!).

Replace the two-phase pexpr_tree build + match_pat_tree approach with a single-pass analyze_pat_pexpr / analyze_pat_expr that takes the binding pattern alongside the expression and returns a coherent (bindings, new_pat, new_expr) triple. Previously, prop_and_rm could replace e.g. Specified(ptr) with unit while match_pat_tree left the Cspecified(x) pattern unchanged, producing a type-incoherent let Specified(x) = pure(unit). The new code recurses into Cspecified wrappers and rebuilds them consistently. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

For CN, the elaboration annotates pure(<int>) literal expressions with a C type so to support the use of signed and sized bit-vector typing. When replacing such expressions with pure(Unit), this messes up CN typing and so this commit removes such annotations from the units, but propagates them inside the pure expression so that propagated integer values get them. It also pushes the location annotations into pure expressions similarly. Testing with CN also uncovered rems-project#1000, hence the integer annotations aren't required/assumed to be unique.

dc-mak and others added 6 commits March 16, 2026 15:42

(docs) Add mem2reg transformation phase design doc

f704ba4

dc-mak and others added 5 commits March 19, 2026 14:30

Add a list of to-dos for this feature

e112dcc

dc-mak and others added 6 commits March 21, 2026 10:02

Update CLAUDE.md

26a1b8e

dc-mak force-pushed the mem2reg branch from ab954c2 to 0dd494c Compare March 23, 2026 11:27

dc-mak added 3 commits March 24, 2026 11:41

Update notes

64b9868

dc-mak force-pushed the mem2reg branch from 709d99b to 8390ca5 Compare March 26, 2026 09:19

Ship promotable syms out of Proc

63e182c

dc-mak force-pushed the mem2reg branch from 8390ca5 to 63e182c Compare March 26, 2026 09:32

dc-mak and others added 2 commits March 27, 2026 11:06

dc-mak force-pushed the mem2reg branch from 98ce0af to 42a3c16 Compare March 27, 2026 20:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mem2reg (Analysis + Transform plan)#998

Mem2reg (Analysis + Transform plan)#998
dc-mak wants to merge 23 commits intorems-project:masterfrom
dc-mak:mem2reg

dc-mak commented Mar 16, 2026 •

edited

Loading

Uh oh!

dc-mak commented Mar 16, 2026 •

edited by cp526

Loading

Uh oh!

dc-mak commented Mar 17, 2026

Uh oh!

dc-mak commented Mar 20, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dc-mak commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dc-mak commented Mar 16, 2026 • edited by cp526 Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dc-mak commented Mar 17, 2026

Uh oh!

dc-mak commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dc-mak commented Mar 16, 2026 •

edited

Loading

dc-mak commented Mar 16, 2026 •

edited by cp526

Loading

dc-mak commented Mar 20, 2026 •

edited

Loading