## Issue On Prysm nodes, we're seeing some attestations like this from Lighthouse ```! time="2023-08-02 16:25:04" level=debug msg="Gossip message was ignored" agent="Lighthouse/v4.3.0-dfcb336/x86_64-linux" error="epoch 3 root 0x094f99ab02d5b0d71f54fe3a2309d360eeb6ccca96f91d72c7f8b350533d836b: not a checkpoint in forkchoice" gossip score=21.22931150205812 multiaddress="/ip4/144.126.197.69/tcp/59008" peer id=16Uiu2HAm9GwUAyubQR85fa4RDaTmmqp6QKniDXVHoboKrx8dRBMc prefix=sync topic="/eth2/eec15ee2/beacon_attestation_63/ssz_snappy" ``` From @potuz: > @sproul could you check if Lighthouse is struggling or in a very long fork? we have a block in 64 which is a direct descendant of 63 and Lighthouse keeps using 63 as a checkpoint > but yeah, whatever this is, the processing time of these blocks shows we can't reallistically handle these networks in these computers at least. 63 took like 9 seconds to process It seems like Lighthouse is using an old root for the target checkpoint in attestations. ## Investigation I checked a few log files and found similar occurence as well on different slot numbers in the **Lighthoue/Besu** & **Prysm/Geth** nodes: ```! time="2023-08-02 16:24:21" level=debug msg="Gossip message was ignored" agent="Lighthouse/v4.3.0-dfcb336/x86_64-linux" error="epoch 3 root 0xae769c6e065ecca568214148c2d4676d623d3ca24d7db008c0061910475b1f22: not a checkpoint in forkchoice" gossip score=0.6617806177346196 multiaddress="/ip4/68.183.113.156/tcp/9000" peer id=16Uiu2HAmMs63ZdcqCtf7dbQH7JUXkXhEsDeeLHPTmQEhxFDRYqne prefix=sync topic="/eth2/eec15ee2/beacon_attestation_52/ssz_snappy" ``` This checkpoint root `0xae769c6` for this attestation is pointing to the block at slot **91**. Comparing the lighthouse logs at around the same time, the block at slot **94** was already imported as the head block: ```! Aug 02 16:23:51.436 DEBG Fork choice updated head old_head: 0xae769c6e065ecca568214148c2d4676d623d3ca24d7db008c0061910475b1f22, old_head_weight: Some(4013024000000000), new_head: 0x02ba7c03aacd27bad2e9dd3a910bebff8dfb5988efb84eae20ec684d93e62bec, new_head_weight: Some(840000000000000), service: beacon <more logs..> Aug 02 16:23:54.001 INFO Synced slot: 94, block: 0x02ba…2bec, epoch: 2, finalized_epoch: 0, finalized_root: 0x5096…9387, exec_hash: 0xaa94…2f26 (verified), peers: 81, service: slot_notifier ``` And then this was logged when preparing the attestation: ```! Aug 02 16:24:16.004 DEBG Attester cache miss request_slot: 96, head_state_slot: 94, beacon_block_root: 0x02ba7c03aacd27bad2e9dd3a910bebff8dfb5988efb84eae20ec684d93e62bec, service: beacon ``` So we know that for this attestation: - request slot = `96`, request epoch = `3` - head state slot = `94` - target check point should have epoch `3`, and root `0x02ba7c0` (`94`) - however the Prysm node is seeing an older root `0xae769c6` (`91`) in the target checkpoint, which I can't really explain from looking at the code. It seems like the whole time we're seeing a large number of invalid attestation from the network, mostly due to them being "Past Slot", which makes me wonder if the messages delivery were severely delayed. ```! Aug 02 16:24:16.171 DEBG Invalid attestation from network type: "aggregated", peer_id: 16Uiu2HAm7ZKCLyx3KmtyZLuTofaWDjjqqGPko7qhkCJWRBpRRYw7, block: 0x29a25932ed6eaea731f3375f1668e9a4a986824ce14a584320387776beb88fa6, reason: PastSlot { attestation_slot: Slot(45), earliest_permissible_slot: Slot(64) } ``` ## Additional Info - Lighthouse Peer IDs:`16Uiu2HAmV9PVEounMWpf3TWNsvrKi2tbkNHEZSyeEBDFKhn7MkXf` (Prysm disconnected due to fault, due to invalid attestations from this node) - Prysm Peer ID: `16Uiu2HAmKE7dpAckeUxzaGta9HDgp5FMDGjMxrz7gDozC1o1eabm` <!-- ## slot 64: Lighthouse - 16:17:50 Recieved block - 16:17:53 Delayed head block Prysm - 16:17:50 Recieved block - 16:17:54 Synced and finish applying state transition - Error "not a checkpoint in forkchoice" - not a checkpoint in forkchoice"` - epoch 1 root `0x3ca77262d3b036848682680a562882af0b409d9d92c7b70ee816b43691108ec2` ## Error ### Logs - beacon(1) - prysm - beacon(3) - lh/besu ### Analysis epoch 0 0-31 epoch 1 32-63 epoch 2 64-95 epoch 3 96-127 Attestation for slot 96+ epoch 3 root Lighthouse besu (`16Uiu2HAmMs63ZdcqCtf7dbQH7JUXkXhEsDeeLHPTmQEhxFDRYqne`) slot 91 -> slot 94 - 91 - received block 16:23:13 - root 0xae769c6e065ecca568214148c2d4676d623d3ca24d7db008c0061910475b1f22 - 94 - received block 16:23:49 - root 0x02ba7c03aacd27bad2e9dd3a910bebff8dfb5988efb84eae20ec684d93e62bec - fork choice updated to ^ 16:23:51 - fork choice updated 16:24:56 (next) - 95 empty block - 96 empty block Prysm - recieved block for slot 94 16:23:50 - 16:24:21 received 91 as checkpoint 16:24:21 not a checkpoint in forkchoice epoch 3 root 0xae769c6e065ecca568214148c2d4676d623d3ca24d7db008c0061910475b1f22 (91) why is lighthouse sending 91 as checkpoint and not 94 ## Messages request slot 96 head state 94 target { epoch 4, root: 94} source { epoch 3, root: advance to 96 already justified checkpoint epoch 3 using root 91 94 not included at epoch transition 16:24:18 slot 96 maybe already pulled up checkpoints before 94 added to fork choice -->