Merge block was beacon chain block 3639527 (0x4f361e3a144edf3bc08a9718f2efc70d6d68d1f07eb271b9bb73a17140473691), execution block hash 0xb43d883d637e8434b3f664e0f8ba0d1914d4d9c01a46be1dfc5f1ecd7bd381fd, execution block height 7382819.
Parent execution hash: 0xa1be33e115ca56b51b80f2b7070d8062c3959f0b4a4f5fa35e91135894f55d30
Problematic block:
execution hash: 0x2d0230d6a70a8ad017c0cb846fc931c8c8507a7677e48d5a2a4a208279ab3f4d
So Besu first saw a terminal block that wound up being on a fork, and Teku selected it as the terminal transition block.
At this point Teku will send fCU (forkchoiceUpdated) with the terminal block it has selected as the chain head until it eventually imports the merge beacon block, at which point it switches to using the execution payload from that block.
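For reference, a minimal sketch of what such an fCU request body looks like on the wire, assuming the Engine API method engine_forkchoiceUpdatedV1. The zeroed safe and finalized hashes are an assumption for the pre-merge case, and build_fcu_request is a hypothetical helper name. This is not lifted from Teku's code, and nothing is actually sent.

```python
import json

# 32 zero bytes: used here on the assumption that nothing is safe or finalized
# on the execution side before the merge block has been imported.
ZERO_HASH = "0x" + "00" * 32

# The terminal block Teku selected (the "problematic block" above).
TERMINAL_BLOCK = "0x2d0230d6a70a8ad017c0cb846fc931c8c8507a7677e48d5a2a4a208279ab3f4d"


def build_fcu_request(head_block_hash, request_id=1):
    """Build (but do not send) an engine_forkchoiceUpdatedV1 JSON-RPC body."""
    forkchoice_state = {
        "headBlockHash": head_block_hash,
        "safeBlockHash": ZERO_HASH,       # assumption, see above
        "finalizedBlockHash": ZERO_HASH,  # assumption, see above
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_forkchoiceUpdatedV1",
        # Second param is the payload attributes; None means "not proposing".
        "params": [forkchoice_state, None],
    }


request = build_fcu_request(TERMINAL_BLOCK)
print(json.dumps(request, indent=2))
```

Once the merge beacon block is imported, the head hash in this request would be the execution payload hash from that block rather than the terminal block.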
Initially Besu gives a VALID response to Teku:
Then Teku would have received a merge beacon block from another node which picked a different terminal block. It gets a SYNCING response from Besu and is unable to optimistically import the merge block:
FIXME: Teku should log the block root and execution payload hash of the block when it can't import because of a syncing response.
Despite the lack of logs, this must have been the block at slot 3639527. It contained execution payload 0xb43d883d637e8434b3f664e0f8ba0d1914d4d9c01a46be1dfc5f1ecd7bd381fd (height 7382819), with the parent being our alternate terminal block 0xa1be33e115ca56b51b80f2b7070d8062c3959f0b4a4f5fa35e91135894f55d30.
Besu has not yet seen 0xa1be33e115ca56b51b80f2b7070d8062c3959f0b4a4f5fa35e91135894f55d30, so it is correct in returning SYNCING. Backward sync appears to have worked, except for some CliqueExtraData warnings:
But then it rejects block 7382819 (0xb43d883d637e8434b3f664e0f8ba0d1914d4d9c01a46be1dfc5f1ecd7bd381fd):
FIXME: That rejection is incorrect. 0xb43d883d637e8434b3f664e0f8ba0d1914d4d9c01a46be1dfc5f1ecd7bd381fd is the execution block hash for the canonical block at height 7382819 (beacon block 3639527, 0x4f361e3a144edf3bc08a9718f2efc70d6d68d1f07eb271b9bb73a17140473691). That's the block Teku originally got a SYNCING response for, and when it retried execution it got an INVALID response:
THEORY: One thing to note here is that the log messages suggest the backward sync completed successfully, but the retry from Teku triggered the import failure, and that retry ran just before the backward sync completed. Could there be a race condition here, with the retried newPayload call winding up executing against an incorrect or only partially updated world state?
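To make the expected sequence concrete, here is a toy model of the responses Teku should have seen: SYNCING while the parent is unknown, then VALID on retry once backward sync has imported the parent. ToyExecutionClient and its methods are hypothetical names for illustration only. It does not reflect Besu's internals and does not reproduce the suspected race; it only shows the outcome a correct client would give.

```python
class ToyExecutionClient:
    """Hypothetical model that only tracks which block hashes are imported."""

    def __init__(self, known_blocks):
        self.known = set(known_blocks)

    def backward_sync(self, block_hashes):
        # Pretend backward sync fetched and imported the missing ancestors.
        self.known.update(block_hashes)

    def new_payload(self, block_hash, parent_hash):
        # Unknown parent: the payload cannot be executed yet, so report SYNCING.
        if parent_hash not in self.known:
            return "SYNCING"
        # Parent known: this toy model treats every such payload as VALID.
        self.known.add(block_hash)
        return "VALID"


ORIGINAL_TERMINAL = "0x2d0230d6a70a8ad017c0cb846fc931c8c8507a7677e48d5a2a4a208279ab3f4d"
ALTERNATE_TERMINAL = "0xa1be33e115ca56b51b80f2b7070d8062c3959f0b4a4f5fa35e91135894f55d30"
MERGE_PAYLOAD = "0xb43d883d637e8434b3f664e0f8ba0d1914d4d9c01a46be1dfc5f1ecd7bd381fd"

# Besu starts out knowing only the terminal block it saw first.
el = ToyExecutionClient({ORIGINAL_TERMINAL})

first = el.new_payload(MERGE_PAYLOAD, ALTERNATE_TERMINAL)  # parent not seen yet
el.backward_sync([ALTERNATE_TERMINAL])                     # parent arrives
retry = el.new_payload(MERGE_PAYLOAD, ALTERNATE_TERMINAL)  # Teku's retry

print(first, retry)  # SYNCING VALID
```

In the incident the retry came back INVALID instead, which is the behaviour the FIXME above is about.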
At this point Teku is unable to follow the canonical chain because it has been told it's invalid. It is left with its chain head at the last pre-merge beacon block.
FIXME: Teku should log the actual block slot and root when newPayload returns invalid instead of just "Payload for child of block root".
Eventually, one of the validators is assigned to propose, so it builds and imports beacon block 3639557 (0x9135525cd1ff878a37bd42bf254198a732bb125f09c8fcaaf99ec6804c0bccf5), which contained execution payload 0x321035c2dfaf9adc3f9c49cf4908acfe35a984566c35ddaa91b821cd8874f293 (height 7382819) with parent 0x2d0230d6a70a8ad017c0cb846fc931c8c8507a7677e48d5a2a4a208279ab3f4d (our original problematic block, which Teku picked as its terminal block). This triggers Teku to print pandas and makes it look like Teku caught up on metrics. Then it starts falling behind again until the next time it produces a block.
Teku then reorgs to a pre-merge block:
From that point on Besu starts rejecting the problematic block:
Since Teku is back to a pre-merge head, it will be sending fCU with the terminal block it has selected as the head block, but now Besu is saying it's invalid.
FIXME: I believe Besu is incorrect to reject this here. It is valid for the beacon chain to reorg back to a pre-merge chain and then try to build on a different transition block.
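To keep the two competing branches straight, the parent links reported above can be written down as a small lookup table. This is only a bookkeeping sketch: branch is a hypothetical helper, and the table holds just the links stated in these notes (the ancestors of the two terminal blocks are not recorded here, so each walk stops there).

```python
# Hashes exactly as reported in the notes above.
CANONICAL_PAYLOAD = "0xb43d883d637e8434b3f664e0f8ba0d1914d4d9c01a46be1dfc5f1ecd7bd381fd"
ALTERNATE_TERMINAL = "0xa1be33e115ca56b51b80f2b7070d8062c3959f0b4a4f5fa35e91135894f55d30"
LOCAL_PAYLOAD = "0x321035c2dfaf9adc3f9c49cf4908acfe35a984566c35ddaa91b821cd8874f293"
ORIGINAL_TERMINAL = "0x2d0230d6a70a8ad017c0cb846fc931c8c8507a7677e48d5a2a4a208279ab3f4d"

# child -> parent, only for the links the notes actually state.
PARENT = {
    CANONICAL_PAYLOAD: ALTERNATE_TERMINAL,  # beacon block 3639527
    LOCAL_PAYLOAD: ORIGINAL_TERMINAL,       # beacon block 3639557
}


def branch(block_hash):
    """Walk parent links as far as they are recorded, newest block first."""
    chain = [block_hash]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


canonical_branch = branch(CANONICAL_PAYLOAD)
local_branch = branch(LOCAL_PAYLOAD)
shared = set(canonical_branch) & set(local_branch)

for name, chain in (("canonical", canonical_branch), ("local", local_branch)):
    print(name, " <- ".join(h[:10] for h in reversed(chain)))
print("blocks shared between branches:", len(shared))
```

Both payloads sit at height 7382819 but on different terminal blocks, so switching from one branch to the other means going back through a pre-merge head, which is the reorg the FIXME above argues should be accepted.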