Build a competitive BTOR2 model checker based on Bitvector Decision Diagrams (BVDDs). The solver should handle both bitvector and array benchmarks from HWMCC'24, with bitwuzla as the reference solver.
bvdd/: Standalone BVDD library — value sets, terms, constraints, BVCs, BVDD nodes, HSCbitr/: BTOR2 solver — parser, Canonicalize/Solve engine, theory blasting, BMC loop- See
.claude/commands/bitr-expert.mdfor complete algorithmic reference
Phase: 10+ — Correctness foundation landed; competitiveness push in flight Last updated: 2026-04-17
See the plan at /Users/ck/.claude/plans/make-a-plan-on-melodic-dolphin.md and the
detailed log entries in experiments.log.
- A1 BMC bound-exhausted → Unknown (was Unsat) — eliminates wrong-UNSAT
- A2 SAT witness verification (extract_witness_verified + BMC/kind downgrade)
- A3
--verifyflag: oracle cross-check with panic-on-mismatch - A4 Golden tests lock in 16 tiny benchmarks
- A5 Reset-signal soundness fence (collect_non_reset_uses)
- B Incremental SAT BMC module + BitBlaster snapshot API
- C2 Bitblaster ITE memoization + 11 peepholes
- Internal tier timeouts now capped by outer budget; Python sweep runner
(
scripts/run_sweep.py) enforces wall-clock via SIGKILL on timeout+5s - 120/120 tests passing, 0 clippy warnings
- HWMCC BV 155-benchmark sweep measurement (in progress)
- Default-on
--incremental-bmcif sweep shows net gain - Phase D: inductive generalization for
kind_onlytimeouts - AIG structural rewriting in bitblaster (larger than just ITE)
- Cooperative timeout cancellation inside substitute_states (term DAG walk can blow budget; Python wrapper masks this today)
- Repository structure created
- Cargo workspace configured
- Core types defined (
bvdd/src/types.rs) - ValueSet with tests (
bvdd/src/valueset.rs) - Term table with hash consing (
bvdd/src/term.rs) - Constraint table with simplification (
bvdd/src/constraint.rs) - BVC manager stub (
bvdd/src/bvc.rs) - BVDD node representation (
bvdd/src/bvdd.rs) - HSC stub (
bvdd/src/hsc.rs) - BTOR2 parser (
bitr/src/btor2.rs) - CLI entry point (
bitr/src/main.rs) - Tiny benchmarks committed
-
cargo testall passing (40 tests) -
cargo clippyclean
- ValueSet: xor, insert, remove, min/max_element, from_fn, iter (11 tests)
- TermTable: SubstAndFold with constant folding (4 tests)
- Recursive term evaluator — all OpKind variants (12 tests)
- Compiled term evaluator (
bvdd/src/eval.rs) — register machine (10 tests) - Cross-validation: compiled vs recursive agree on all inputs
- Constraint hash consing via unique table (12 tests)
- Restrict operation with short-circuit AND/OR
- Predicate simplification (empty→FALSE, full→TRUE)
- AND/OR normalization for better hash consing
- collect_preds, has_no_predicates helpers
- BVC Apply: structural (PRED constraints) and lifted (fresh vars) (6 tests)
- BTOR2→BVC lifter (
bitr/src/lifter.rs) — all operators + constants (5 tests) - 61 total tests, 0 clippy warnings
- BVDD unique table hash consing (terminals + decision nodes)
- FALSE terminal (BvddId(0)) with FullyCanonical canonicity
- Edge merging: same-child edges get OR'd value sets
- Empty-edge filtering and single-FULL-edge collapse
- Bottom-up flag computation (can_be_true, is_ground)
- All-FALSE-children collapse to FALSE terminal
- Direct-mapped computed cache (64K entries) for Solve results
- restrict_to_valueset with cache integration
- 74 total tests, 0 clippy warnings
- Solve algorithm with ground check, terminal→Canonicalize, decision traversal
- Canonicalize with 6 phases: ground, SAT witness, UNSAT prune, no-preds, Decide, Restrict
- Decide: predicate selection + coarsest partition
- Generalized blast: narrowest-variable-first enumeration (budget 2^20)
- Theory resolution for predicate-free constraints
- BTOR2→BVC lifter with all operators, slice/ext special handling
- CLI integration: parse → lift → solve → output sat/unsat
- 9/9 tiny benchmarks correct (6 SAT, 3 UNSAT)
- 82 total tests, 0 clippy warnings
- BMC loop: init/next substitution, step-by-step unrolling
- Automatic sequential/combinational detection
- State variable substitution with SubstAndFold
- Fixed Solve ground check to verify value against target set
- CLI: --bound N option for BMC depth
- 11/11 benchmarks correct (9 combinational + 2 sequential: counter_sat, counter_unsat)
- 82 total tests, 0 clippy warnings
- Boolean decomposition: find comparison subterms in 1-bit expressions, branch true/false
- Compiled evaluator wired into generalized blast for fast single-variable enumeration
- Domain budget: 2^28 with compiled evaluator (up from 2^20)
- 4-stage theory resolution cascade: bool decomp → compiled blast → byte-blast → oracle
- SMT oracle (
bitr/src/oracle.rs): term→SMT-LIB2, subprocess invocation, result parsing - Oracle caching via HashMap
- Auto-detect solver binary (bitwuzla, boolector, z3)
- Byte-blast oracle in solver: split widest var MSB byte, enumerate 256×LSB values
- Adaptive bailout (25% threshold), max depth 4
- Oracle integration via
set_oracle()callback on SolverContext
- Array state tracking (Base / Write chain) in lifter
- READ-over-WRITE expansion: READ(WRITE(a,w,v), r) → ITE(EQ(r,w), v, READ(a,r))
- Nested write chains → nested ITE chains
- Array input/state handling with sort detection
- Array benchmarks: array_row_sat, array_row_unsat
- 13/13 benchmarks correct (9 combinational + 2 sequential + 2 array)
- 82 total tests, 0 clippy warnings
- Oracle wired into CLI: auto-detect bitwuzla/boolector/z3, --no-oracle flag
- BMC term-for-term substitution: subst_term for symbolic next-state expressions
- HSC cascade implemented: MSB→LSB 8-bit slice decomposition + tests
- Multi-variable compiled blast: full enumeration via CompiledProgram (≤65536 total)
- Counterexample extraction: extract_witness walks BVDD + records during blast
- Witness display: --verbose prints variable assignments on SAT
- Fixed Uext/Sext arity (1, not 2)
- New benchmark: shift_reg_sat (4-bit shift register, symbolic BMC)
- 14/14 benchmarks correct, 87 tests, 0 clippy warnings
- Native CDCL bit-blasting via splr SAT solver (Tseitin encoding, gate memoization)
- CDCL bitblaster moved to Stage 2b (before byte-blast/parallel blast)
- 6,600x speedup on twovars_16_unsat (25.4s → 0.004s), 3,200x on wide_mul_32
- Total perf time: 22.7s → 1.4s (16x improvement, near-parity with bitwuzla)
- 35/35 benchmarks correct, 103 tests, 0 clippy warnings
- K-induction prover (bitr/src/kinduction.rs): inductive safety proofs
- HWMCC BV: 32/155 → 60/155 solved via k-induction + CDCL bitblaster in BMC
- Finalize ValueSet operations (tested)
- Complete TermTable with SubstAndFold
- Add term evaluation (recursive + compiled)
- Unit tests for all operations
- Full constraint table with hash consing
- BVC Apply operation (cross-product)
- BVC from BTOR2 operators
- Lifted variable definitions
- Unique table for hash consing
- Apply operation
- Edge merging (same-child → bitmask OR)
- HSC cascade construction
- Computed cache (direct-mapped)
- C API header generation
- BTOR2 parser → BVC construction
- BVC → BVDD lifting (lazy mode)
- Handle all BTOR2 operators
- Validate on tiny benchmarks
- Implement 6-phase Canonicalize
- Implement Solve (decision node traversal)
- Decide (predicate selection + partition)
- Restrict (constraint restriction)
- Test on tiny sat/unsat benchmarks
- Boolean decomposition (1-bit terms)
- Generalized blast (variable enumeration)
- Compiled evaluator
- Domain budget checks
- Bitwuzla integration via C FFI
- Oracle caching
- Byte-blast oracle (recursive splitting)
- Budget management (depth, time)
- State unrolling
- Init/Next substitution
- Bad property checking per step
- Counterexample extraction
- WRITE → READ-over-WRITE expansion
- ITE chain construction for READ
- Array-track benchmark testing
- Profile hot paths
- SIMD for ValueSet operations
- Tune domain budgets
- Maximize HWMCC'24 solved count
- Timeout: 300s per benchmark
- Memory: 8GB limit
- Correctness: paramount — wrong answers are fatal
- Primary metric: number of benchmarks solved correctly
- Secondary metric: total solving time (PAR-2)
Before each benchmark run:
- Note the change being evaluated
- Record git commit hash
- Run full benchmark suite with timeout
- Record results in
results/as CSV - Append summary to
experiments.log - Compare against previous best and reference solver