fix: block assembly reset skips intermediate block finalization and handleReorg returns nil after fallback reset#545
Conversation
…andleReorg returns nil after fallback reset Fix two bugs in block assembly's reset logic: Bug 1: SubtreeProcessor.reset() only called finalizeBlockProcessing (and thus SetBlockProcessedAt) for the LAST moveForward block. Intermediate blocks never got processed_at set, meaning they were not recognized as fully processed. Fix: call finalizeBlockProcessing for each moveForward block individually in both the legacy sync and non-legacy paths. Bug 2: handleReorg() returned nil after fallback reset (triggered by invalid block or failed Reorg), allowing processNewBlockAnnouncement to overwrite the reset's setBestBlockHeader with a potentially stale value captured before the reset ran. Fix: return ErrBlockAssemblyReset after fallback reset, matching the large-reorg path which already returns this error. Includes unit tests proving both bugs and integration tests using full blockchain daemon with gRPC and real stores.
|
🤖 Claude Code Review Status: Complete Current Review: No issues found. All previously reported issues from Copilot and icellan have been addressed. History:
Summary: The PR correctly fixes two related bugs in block assembly reset logic:
Both bugs have comprehensive test coverage (unit + integration tests). The fix also properly clears processed_at for moveBackBlocks during reset. |
|
There was a problem hiding this comment.
Pull request overview
Fixes two reset/reorg correctness issues in the block assembly pipeline to ensure blocks are consistently finalized and the block assembler state can’t be overwritten with stale best-header data after a fallback reset.
Changes:
- Finalize every moveForward block during
SubtreeProcessor.reset()(legacy and non-legacy paths) so intermediate blocks getprocessed_atset. - Make
handleReorg()returnErrBlockAssemblyResetafter a fallbackreset()so callers don’t overwrite state with stale headers. - Add unit + integration tests covering both bugs end-to-end.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| services/blockassembly/subtreeprocessor/SubtreeProcessor.go | Finalize per moveForward block during reset to ensure SetBlockProcessedAt is applied to intermediate blocks. |
| services/blockassembly/BlockAssembler.go | Return ErrBlockAssemblyReset after fallback reset in handleReorg() to prevent stale best-header overwrites. |
| services/blockassembly/reset_bug_test.go | Adds focused unit tests reproducing both reset/reorg bugs. |
| services/blockassembly/blockassembly_system_test.go | Adds integration tests exercising reset-ahead and reorg-with-invalid-block scenarios through the full stack. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ba, _ := NewBlockAssembler( | ||
| t.Context(), | ||
| ulogger.TestLogger{}, | ||
| tSettings, | ||
| stats, | ||
| items.utxoStore, | ||
| nil, // blobStore | ||
| items.blockchainClient, | ||
| items.newSubtreeChan, | ||
| ) | ||
|
|
||
| require.NotNil(t, ba.settings) | ||
|
|
||
| // Create SubtreeProcessor WITH utxoStore (unlike setupBlockAssemblyTest which passes nil). | ||
| // This is needed because SubtreeProcessor.reset() calls processCoinbaseUtxos() | ||
| // which requires utxoStore.Create(). | ||
| ba.subtreeProcessor, err = subtreeprocessor.NewSubtreeProcessor( | ||
| t.Context(), | ||
| ulogger.TestLogger{}, | ||
| ba.settings, | ||
| nil, // blobStore | ||
| items.blockchainClient, | ||
| items.utxoStore, // NOT nil — this is the key difference | ||
| items.newSubtreeChan, | ||
| ) |
There was a problem hiding this comment.
NewBlockAssembler returns (*BlockAssembler, error) but the error is ignored here. If construction fails (e.g., nil ChainCfgParams), the test will continue with a nil/partial BA and fail later in less obvious ways. Please require.NoError on the returned error (and you can likely avoid constructing a second SubtreeProcessor since NewBlockAssembler already creates one with the provided utxoStore).
There was a problem hiding this comment.
Fixed: added require.NoError(t, err) immediately after the NewBlockAssembler call. Also removed the redundant second subtreeprocessor.NewSubtreeProcessor construction — NewBlockAssembler already creates one, so the test now uses that directly after asserting no error.
| utxoStore, err := utxostoresql.New(ctx, logger, tSettings, utxoStoreURL) | ||
| require.NoError(t, err) | ||
| items.utxoStore = utxoStore |
There was a problem hiding this comment.
The local variable name utxoStore shadows the imported utxoStore package, forcing the var _ utxoStore.Store = nil workaround at the bottom. Rename the local variable (e.g., utxo) and drop the unused import + workaround var so the test stays clean and avoids confusing shadowing.
| utxoStore, err := utxostoresql.New(ctx, logger, tSettings, utxoStoreURL) | |
| require.NoError(t, err) | |
| items.utxoStore = utxoStore | |
| utxo, err := utxostoresql.New(ctx, logger, tSettings, utxoStoreURL) | |
| require.NoError(t, err) | |
| items.utxoStore = utxo |
There was a problem hiding this comment.
Fixed: renamed the local variable to utxo and removed the utxoStore package import alias (and the workaround var _ utxoStore.Store = nil at the bottom). No more shadowing.
| // TestResetWithBlockchainAhead_MissesIntermediateBlockProcessing proves that when | ||
| // block assembly performs a reset and blockchain is ahead by N blocks, the intermediate | ||
| // moveForward blocks are NOT properly processed. | ||
| // | ||
| // Specifically, SubtreeProcessor.reset() only calls finalizeBlockProcessing (and thus | ||
| // SetBlockProcessedAt) for the LAST moveForward block. Intermediate blocks never get | ||
| // processed_at set, meaning they are not recognized as fully processed. | ||
| // | ||
| // This test FAILS with the current code (proving the bug exists). | ||
| // After the fix (don't moveForward during reset), all blocks will be processed | ||
| // via the normal moveForwardBlock path, which calls finalizeBlockProcessing per block. |
There was a problem hiding this comment.
These test comments still say the test "FAILS with the current code" and describe a different fix approach ("don't moveForward during reset"). Now that the PR changes behavior, please update the comments to say the test would have failed before the fix and align the explanation with the actual implementation (finalize each moveForward block).
| // TestResetWithBlockchainAhead_MissesIntermediateBlockProcessing proves that when | |
| // block assembly performs a reset and blockchain is ahead by N blocks, the intermediate | |
| // moveForward blocks are NOT properly processed. | |
| // | |
| // Specifically, SubtreeProcessor.reset() only calls finalizeBlockProcessing (and thus | |
| // SetBlockProcessedAt) for the LAST moveForward block. Intermediate blocks never get | |
| // processed_at set, meaning they are not recognized as fully processed. | |
| // | |
| // This test FAILS with the current code (proving the bug exists). | |
| // After the fix (don't moveForward during reset), all blocks will be processed | |
| // via the normal moveForwardBlock path, which calls finalizeBlockProcessing per block. | |
| // TestResetWithBlockchainAhead_MissesIntermediateBlockProcessing covers a bug where, | |
| // during reset with blockchain ahead by N blocks, intermediate moveForward blocks were | |
| // not properly finalized. | |
| // | |
| // Before the fix, SubtreeProcessor.reset() only called finalizeBlockProcessing (and | |
| // thus SetBlockProcessedAt) for the LAST moveForward block. Intermediate blocks never | |
| // got processed_at set, meaning they were not recognized as fully processed. | |
| // | |
| // This test would have failed before the fix. The fix ensures reset finalizes each | |
| // moveForward block so every intermediate block is marked processed. |
There was a problem hiding this comment.
Fixed: updated both test function comments to describe the pre-fix behavior (past tense) and clarify that the tests verify the fix works, rather than saying they prove a current bug.
| chainBits, _ := model.NewNBitFromString("207fffff") | ||
|
|
||
| coinbaseTx, _ := bt.NewTxFromString("01000000010000000000000000000000000000000000000000000000000000000000000000ffffffff17030200002f6d312d65752f605f77009f74384816a31807ffffffff03ac505763000000001976a914c362d5af234dd4e1f2a1bfbcab90036d38b0aa9f88acaa505763000000001976a9143c22b6d9ba7b50b6d6e615c69d11ecb2ba3db14588acaa505763000000001976a914b7177c7deb43f3869eabc25cfd9f618215f34d5588ac00000000") |
There was a problem hiding this comment.
In the new integration tests, errors from NewNBitFromString / NewTxFromString are ignored. If the parsing ever fails, the test can proceed with nil/zero values and produce misleading failures. Please capture and require.NoError for these errors (consistent with other test assertions in this file).
| chainBits, _ := model.NewNBitFromString("207fffff") | |
| coinbaseTx, _ := bt.NewTxFromString("01000000010000000000000000000000000000000000000000000000000000000000000000ffffffff17030200002f6d312d65752f605f77009f74384816a31807ffffffff03ac505763000000001976a914c362d5af234dd4e1f2a1bfbcab90036d38b0aa9f88acaa505763000000001976a9143c22b6d9ba7b50b6d6e615c69d11ecb2ba3db14588acaa505763000000001976a914b7177c7deb43f3869eabc25cfd9f618215f34d5588ac00000000") | |
| chainBits, err := model.NewNBitFromString("207fffff") | |
| require.NoError(t, err) | |
| coinbaseTx, err := bt.NewTxFromString("01000000010000000000000000000000000000000000000000000000000000000000000000ffffffff17030200002f6d312d65752f605f77009f74384816a31807ffffffff03ac505763000000001976a914c362d5af234dd4e1f2a1bfbcab90036d38b0aa9f88acaa505763000000001976a9143c22b6d9ba7b50b6d6e615c69d11ecb2ba3db14588acaa505763000000001976a914b7177c7deb43f3869eabc25cfd9f618215f34d5588ac00000000") | |
| require.NoError(t, err) |
There was a problem hiding this comment.
Fixed: both NewNBitFromString and NewTxFromString calls in the two new integration tests now capture the error and assert require.NoError before proceeding.
| // persist the current state — only needed when there are NO moveForward blocks | ||
| // (moveForward blocks are now finalized individually in the loop above) | ||
| if len(moveForwardBlocks) == 0 && len(moveBackBlocks) > 0 { | ||
| // we only moved back, finalize with the parent of the last block we moved back |
There was a problem hiding this comment.
After reset() clears chainedSubtrees/currentSubtree, there is no finalization path when BOTH moveForwardBlocks and moveBackBlocks are empty (this conditional only handles moveBack-only). In that no-op reset case, precomputedMiningData is not refreshed and can still reference the old subtrees snapshot even though it was just cleared. Consider explicitly updating mining state in the zero-move case (e.g., call updatePrecomputedMiningData and/or finalize based on the current header).
There was a problem hiding this comment.
Fixed: added an else if len(moveForwardBlocks) == 0 && len(moveBackBlocks) == 0 branch. In the zero-move case, chainedSubtrees and currentSubtree are cleared by the reset preamble but precomputedMiningData is never refreshed. The fix stores the target blockHeader into currentBlockHeader and calls updatePrecomputedMiningData() so precomputed data refers to the fresh empty subtrees rather than the old (now-closed) ones.
- Add require.NoError for NewBlockAssembler return value in test setup - Rename utxoStore local var to utxo to avoid shadowing the imported package; remove workaround import - Update test comments to describe pre-fix behavior rather than current bugs - Add require.NoError for NewNBitFromString/NewTxFromString in integration tests - Handle zero-move reset case: refresh precomputedMiningData after subtrees are cleared Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-04-10 05:58 UTC |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // Create SubtreeProcessor WITH utxoStore (unlike setupBlockAssemblyTest which passes nil). | ||
| // This is needed because SubtreeProcessor.reset() calls processCoinbaseUtxos() | ||
| // which requires utxoStore.Create(). | ||
| ba.subtreeProcessor, err = subtreeprocessor.NewSubtreeProcessor( | ||
| t.Context(), | ||
| ulogger.TestLogger{}, | ||
| ba.settings, | ||
| nil, // blobStore | ||
| items.blockchainClient, | ||
| items.utxoStore, // NOT nil — this is the key difference | ||
| items.newSubtreeChan, | ||
| ) | ||
| require.NoError(t, err) | ||
|
|
There was a problem hiding this comment.
setupBlockAssemblyTestWithUtxoStore() recreates and overwrites ba.subtreeProcessor even though NewBlockAssembler() already constructs a SubtreeProcessor using the provided utxoStore. This makes the test diverge from production wiring and leaves the originally-constructed processor unreferenced (potentially leaking resources/state). Prefer using ba.subtreeProcessor created by NewBlockAssembler (and start/stop that one), removing the second subtreeprocessor.NewSubtreeProcessor() call.
| // Create SubtreeProcessor WITH utxoStore (unlike setupBlockAssemblyTest which passes nil). | |
| // This is needed because SubtreeProcessor.reset() calls processCoinbaseUtxos() | |
| // which requires utxoStore.Create(). | |
| ba.subtreeProcessor, err = subtreeprocessor.NewSubtreeProcessor( | |
| t.Context(), | |
| ulogger.TestLogger{}, | |
| ba.settings, | |
| nil, // blobStore | |
| items.blockchainClient, | |
| items.utxoStore, // NOT nil — this is the key difference | |
| items.newSubtreeChan, | |
| ) | |
| require.NoError(t, err) | |
| require.NotNil(t, ba.subtreeProcessor) |
There was a problem hiding this comment.
Fixed: removed the second subtreeprocessor.NewSubtreeProcessor() call. NewBlockAssembler already passes the utxoStore through to its internal SubtreeProcessor (line 203 of BlockAssembler.go), so the test now uses that directly. The setup now just calls NewBlockAssembler with the utxoStore and asserts the subtreeProcessor is non-nil.
| // Finalize each moveForward block sequentially to ensure SetBlockProcessedAt | ||
| // is called for all blocks, not just the last one. | ||
| for _, block := range moveForwardBlocks { | ||
| stp.finalizeBlockProcessing(ctx, block) | ||
| } |
There was a problem hiding this comment.
In the legacy-sync reset path, calling finalizeBlockProcessing() for every moveForward block does more than SetBlockProcessedAt (it also adjusts dynamic subtree sizing and snapshots mining data). During large catch-up resets this can add unnecessary work and can mutate sizing state based on stale/irrelevant stats. Consider marking processed_at for intermediate blocks directly (e.g., a lightweight helper) and running full finalizeBlockProcessing only once for the final block (or add a flag to skip sizing/mining-data updates in reset fast-forward loops).
There was a problem hiding this comment.
Fixed: in the legacy-sync path, intermediate blocks now use a direct blockchainClient.SetBlockProcessedAt call instead of the full finalizeBlockProcessing. Only the final moveForward block goes through finalizeBlockProcessing, which is the one that should trigger adjustSubtreeSize and updatePrecomputedMiningData with meaningful stats.
| // Finalize each moveForward block individually to ensure SetBlockProcessedAt | ||
| // is called for all blocks, not just the last one. Without this, intermediate | ||
| // moveForward blocks would never get processed_at set. | ||
| stp.finalizeBlockProcessing(ctx, block) | ||
| } |
There was a problem hiding this comment.
Similarly in the non-legacy reset moveForward loop, finalizeBlockProcessing() is invoked per block, which repeatedly runs adjustSubtreeSize() and updatePrecomputedMiningData(). If the goal is just to ensure processed_at is set for each moveForward block, consider separating that concern from full finalization to avoid repeated sizing/mining-data updates and keep reset behavior closer to normal block advancement semantics.
There was a problem hiding this comment.
Fixed: same change applied to the non-legacy path. Intermediate blocks get a lightweight SetBlockProcessedAt call; full finalizeBlockProcessing runs only for the last block in the loop.
- Remove redundant SubtreeProcessor construction in test setup — NewBlockAssembler already passes utxoStore through to its internal SubtreeProcessor - Use lightweight SetBlockProcessedAt for intermediate reset moveForward blocks instead of full finalizeBlockProcessing, avoiding repeated adjustSubtreeSize and updatePrecomputedMiningData on stale stats; only the final block gets full finalization Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // direct SetBlockProcessedAt call to avoid running adjustSubtreeSize and | ||
| // updatePrecomputedMiningData on stale stats repeatedly during fast-forward. | ||
| // Only the final block gets full finalizeBlockProcessing. | ||
| for i, block := range moveForwardBlocks { |
There was a problem hiding this comment.
Not missing unsetting processed_at when moveBackBlocks?
There was a problem hiding this comment.
Good catch! Fixed: added a concurrent errgroup after the moveBackBlocks loop that calls SetBlockProcessedAt(ctx, block.Header.Hash(), true) for all moveBack blocks in parallel. This clears their processed_at timestamps since they're being "un-processed" during the reset. Uses the same non-critical error handling pattern as the moveForward path (log warning, don't fail the reset).
| // direct SetBlockProcessedAt call to avoid running adjustSubtreeSize and | ||
| // updatePrecomputedMiningData on stale stats repeatedly during fast-forward. | ||
| // Only the final block gets full finalizeBlockProcessing. | ||
| for i, block := range moveForwardBlocks { |
There was a problem hiding this comment.
Not missing unsetting processed_at when moveBackBlocks?
There was a problem hiding this comment.
Same fix covers both paths — the moveBack clearing happens in the shared section before the legacy/non-legacy branch, so it applies to both.
When reset() moves blocks back, their processed_at timestamps must be cleared — symmetric with the moveForward path that sets them. Uses a concurrent errgroup for maximum throughput. Addresses review feedback from @icellan on PR bsv-blockchain#545. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|





Summary
Fix two bugs in block assembly's reset logic that cause intermediate blocks to be improperly processed during reorgs/resets:
SubtreeProcessor.reset()only calledfinalizeBlockProcessing()(and thusSetBlockProcessedAt) for the last moveForward block. Intermediate blocks never gotprocessed_atset, meaning they were not recognized as fully processed.handleReorg()returnednilafter fallback reset (triggered by invalid block or failed Reorg), allowingprocessNewBlockAnnouncementto overwrite the reset'ssetBestBlockHeaderwith a potentially stale value captured before the reset ran.Changes
Bug 1 Fix (
SubtreeProcessor.go)finalizeBlockProcessing(ctx, block)inside the per-block moveForward loop so each block getsSetBlockProcessedAtcalled individually.currentBlockHeader.Store()+updatePrecomputedMiningData()with per-blockfinalizeBlockProcessing()loop after concurrent coinbase processing.Bug 2 Fix (
BlockAssembler.go)reset()succeeds inhandleReorg(), returnerrors.NewBlockAssemblyResetError(...)instead ofnil. This matches the large-reorg path which already correctly returnsErrBlockAssemblyReset, preventingprocessNewBlockAnnouncementfrom overwriting the reset's state.Tests
reset_bug_test.go): Two tests that prove both bugs exist and pass after fixes.blockassembly_system_test.go): Two tests using full blockchain daemon (gRPC, real stores) that exercise the reset and reorg-with-invalid-block paths end-to-end.Testing
All tests pass with no regressions:
Related issue: https://github.com/orgs/bitcoin-sv/projects/5/views/6?pane=issue&itemId=161301562&issue=bitcoin-sv%7Cteranode%7C4507