feat: unknown payload sync for incoming blocks#9102
feat: unknown payload sync for incoming blocks#9102
Conversation
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the beacon node's ability to synchronize blocks by introducing a dedicated mechanism for handling unknown execution payload envelopes. It addresses scenarios where a block's parent payload is missing, allowing the node to proactively request and integrate these payloads from the network. This improves the robustness of block propagation and ensures the chain can continue to build even when execution layer data is temporarily unavailable. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new feature for handling unknown execution payloads for incoming blocks, specifically for the upcoming "gloas" fork. The changes are extensive, touching upon chain events, error handling, networking with new ReqResp methods, metrics, and the core synchronization logic in BlockInputSync. The implementation correctly parallels the existing logic for unknown block syncing. My review provides suggestions to improve the correctness of new metrics, enhance code clarity in the peer selection logic, and increase the robustness of the payload fetching mechanism by adding better error handling and peer management.
| private async fetchPayloadEnvelope(pending: PendingPayloadEnvelope): Promise<void> { | ||
| pending.status = "fetching"; | ||
| pending.attempts++; | ||
| try { | ||
| const peer = this.peerBalancer.bestPeerForPendingColumns(new Set(), new Set())?.peerId; | ||
| if (!peer) { | ||
| pending.status = "pending"; | ||
| return; | ||
| } | ||
|
|
||
| const envelopes = await this.network.sendExecutionPayloadEnvelopesByRoot(peer, [fromHex(pending.blockRootHex)]); | ||
| if (envelopes.length === 0) { | ||
| pending.status = "pending"; | ||
| return; | ||
| } | ||
|
|
||
| const envelope = envelopes[0]; | ||
| const payloadInput = this.chain.seenPayloadEnvelopeInputCache.get(pending.blockRootHex); | ||
| if (!payloadInput) { | ||
| throw Error(`PayloadEnvelopeInput missing for known block root ${pending.blockRootHex}`); | ||
| } | ||
|
|
||
| payloadInput.addPayloadEnvelope({ | ||
| envelope, | ||
| source: PayloadEnvelopeInputSource.byRoot, | ||
| seenTimestampSec: Date.now() / 1000, | ||
| peerIdStr: peer, | ||
| }); | ||
|
|
||
| if (payloadInput.isComplete()) { | ||
| await this.chain.processExecutionPayload(payloadInput); | ||
| } | ||
| this.metrics?.blockInputSync.payloadFetchSuccess.inc(); | ||
| } catch (e) { | ||
| this.logger.debug("Error fetching payload envelope", {root: pending.blockRootHex}, e as Error); | ||
| pending.status = "pending"; | ||
| this.metrics?.blockInputSync.payloadFetchError.inc(); | ||
| } | ||
| } |
There was a problem hiding this comment.
The fetchPayloadEnvelope function currently tries to fetch from a single peer. If the fetch fails, it resets the status to "pending" for a later retry, but it doesn't penalize the failing peer or prevent it from being picked again for the next attempt. This could lead to inefficient retries with a faulty or slow peer.
Consider enhancing this function to be more robust, similar to fetchBlockInput. This could involve:
- Adding a retry loop within
fetchPayloadEnvelopeto try multiple peers. - Tracking excluded peers for the duration of the fetch attempts for a given payload.
- Applying peer scoring penalties for certain types of
RequestError.
| payloadFetchSuccess: register.gauge({ | ||
| name: "lodestar_sync_block_input_payload_fetch_success_total", | ||
| help: "Total successful payload envelope fetches", | ||
| }), |
There was a problem hiding this comment.
The metric lodestar_sync_block_input_payload_fetch_success_total is defined as a gauge, but it's used as a counter since its value only ever increases. For metrics that represent a cumulative count, it's more idiomatic and correct to use a counter. This provides clearer intent and allows monitoring systems like Prometheus to apply counter-specific functions (e.g., rate()).
This feedback also applies to other new metrics in this file that are used as counters but defined as gauges, such as payloadFetchError, payloadRequests, and the metrics within awaitingEnvelopeGossipMessages.
| payloadFetchSuccess: register.gauge({ | |
| name: "lodestar_sync_block_input_payload_fetch_success_total", | |
| help: "Total successful payload envelope fetches", | |
| }), | |
| payloadFetchSuccess: register.counter({ | |
| name: "lodestar_sync_block_input_payload_fetch_success_total", | |
| help: "Total successful payload envelope fetches", | |
| }), |
| pending.status = "fetching"; | ||
| pending.attempts++; | ||
| try { | ||
| const peer = this.peerBalancer.bestPeerForPendingColumns(new Set(), new Set())?.peerId; |
There was a problem hiding this comment.
The method bestPeerForPendingColumns is being used here to find a peer for fetching a payload envelope, not columns. While it works because it's called with empty sets, the name is misleading and could cause confusion for future maintenance. Consider renaming bestPeerForPendingColumns to something more generic, like findBestPeer, and making the pendingColumns argument optional to better reflect its dual use.
Performance Report✔️ no performance regression detected Full benchmark results
|
| private onUnknownPayloadEnvelope = (data: ChainEventData[ChainEvent.unknownPayloadEnvelope]): void => { | ||
| try { | ||
| const {blockRootHex, peer} = data; | ||
| const block = this.chain.forkChoice.getBlockHexDefaultStatus(blockRootHex); |
There was a problem hiding this comment.
if there is no block we should search for it via addByRootHex()
twoeths
left a comment
There was a problem hiding this comment.
the ChainEvent.unknownPayloadEnvelope should have a PayloadEnvelopeInput instead of a SignedExecutionPayloadEnvelope
BlockInputSync should then enhance downloadByRoot to download envelope + DataColumnSidecars and track it in PayloadEnvelopeInput to know when data is available
but there is already a TODO here https://github.com/ChainSafe/lodestar/pull/9025/changes#diff-c702dde1ffa90a6d2720c7b4930bc5c08cc1f302cfd09ed9e74d80755b92da8fR850
we cannot create aPayloadEnvelopeInput without a bid/proposerIndex. Maybe make these fields optional? also maybe we store more than needed
that is to also handle the (rare) case where payload/envelope comes even before the block
When a gossip block claims its parent was FULL but the parent's envelope hasn't been seen, the block is queued and the parent envelope is fetched via sendExecutionPayloadEnvelopesByRoot. Changes: - Add PARENT_PAYLOAD_UNKNOWN block error code and gossip validation - Add PendingPayloadEnvelope type for tracking missing envelopes - BlockInputSync: pendingPayloads map, triggerPayloadSearch, fetchPayloadEnvelope, onUnknownPayloadEnvelope handler - Subscribe to unknownEnvelopeBlockRoot and executionPayloadAvailable events - Add metrics for payload fetch tracking - Handle PARENT_PAYLOAD_UNKNOWN in processBlock error recovery Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a gossip block claims its parent was FULL but the parent's envelope hasn't been seen, the block is queued and the parent envelope is fetched via sendExecutionPayloadEnvelopesByRoot. Changes: - Add PARENT_PAYLOAD_UNKNOWN block error code and gossip validation - Add PendingPayloadEnvelope type for tracking missing envelopes - BlockInputSync: pendingPayloads map, triggerPayloadSearch, fetchPayloadEnvelope, onUnknownPayloadEnvelope handler - Subscribe to unknownEnvelopeBlockRoot and executionPayloadAvailable events - Add metrics for payload fetch tracking - Handle PARENT_PAYLOAD_UNKNOWN in processBlock error recovery Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a gossip block claims its parent was FULL but the parent's envelope hasn't been seen, the block is queued and the parent envelope is fetched via sendExecutionPayloadEnvelopesByRoot. Changes: - Add PARENT_PAYLOAD_UNKNOWN block error code and gossip validation - Add PendingPayloadEnvelope type for tracking missing envelopes - BlockInputSync: pendingPayloads map, triggerPayloadSearch, fetchPayloadEnvelope, onUnknownPayloadEnvelope handler - Subscribe to unknownEnvelopeBlockRoot and executionPayloadAvailable events - Add metrics for payload fetch tracking - Handle PARENT_PAYLOAD_UNKNOWN in processBlock error recovery Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
superseded by #9241 |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #9102 +/- ##
=========================================
Coverage 52.32% 52.32%
=========================================
Files 848 848
Lines 62020 62016 -4
Branches 4538 4538
=========================================
- Hits 32449 32447 -2
+ Misses 29506 29504 -2
Partials 65 65 🚀 New features to boost your workflow:
|
When a gossip block claims its parent was FULL but the parent's envelope hasn't been seen, the block is queued and the parent envelope is fetched via sendExecutionPayloadEnvelopesByRoot. Changes: - Add PARENT_PAYLOAD_UNKNOWN block error code and gossip validation - Add PendingPayloadEnvelope type for tracking missing envelopes - BlockInputSync: pendingPayloads map, triggerPayloadSearch, fetchPayloadEnvelope, onUnknownPayloadEnvelope handler - Subscribe to unknownEnvelopeBlockRoot and executionPayloadAvailable events - Add metrics for payload fetch tracking - Handle PARENT_PAYLOAD_UNKNOWN in processBlock error recovery Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Depends on #9050
In parallel to #9025
This PR mainly handle case where an incoming block that contains a bid that contains an unknown parent payload hash. See case 2 in the table below.
This PR does not fully handle other one-off cases like unknown payload from attestation data.index === 1, PTC payloadPresent === true. Only to provide the barebones. Will fully handle these cases after #9025 is merged
ChainEvent.unknownParentPayload→ BlockInputSync adds child topendingBlocks+ parent topendingPayloads, fetches payloadextractBlockSlotRootFnsreturns{slot, root}→ queued inawaitingMessagesByBlockRoot. Already live inunstable