On Prysm nodes, we're seeing some attestations like this from Lighthouse:
time="2023-08-02 16:25:04" level=debug msg="Gossip message was ignored" agent="Lighthouse/v4.3.0-dfcb336/x86_64-linux" error="epoch 3 root 0x094f99ab02d5b0d71f54fe3a2309d360eeb6ccca96f91d72c7f8b350533d836b: not a checkpoint in forkchoice" gossip score=21.22931150205812 multiaddress="/ip4/144.126.197.69/tcp/59008" peer id=16Uiu2HAm9GwUAyubQR85fa4RDaTmmqp6QKniDXVHoboKrx8dRBMc prefix=sync topic="/eth2/eec15ee2/beacon_attestation_63/ssz_snappy"
From @potuz:
@sproul could you check if Lighthouse is struggling or in a very long fork? we have a block in 64 which is a direct descendant of 63 and Lighthouse keeps using 63 as a checkpoint
but yeah, whatever this is, the processing time of these blocks shows we can't realistically handle these networks on these computers, at least. 63 took like 9 seconds to process
It seems like Lighthouse is using an old root for the target checkpoint in attestations.
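For reference, the target checkpoint is derived from the attesting state following the consensus spec's honest-validator logic: the target epoch is the epoch of the attestation slot, and the target root is the block root at that epoch's start slot, falling back to the head block when no block sits between the head and the boundary. Below is a minimal Python sketch of that derivation; the names are illustrative, not Lighthouse's or Prysm's actual code.

SLOTS_PER_EPOCH = 32

def compute_epoch_at_slot(slot: int) -> int:
    return slot // SLOTS_PER_EPOCH

def compute_start_slot_at_epoch(epoch: int) -> int:
    return epoch * SLOTS_PER_EPOCH

def expected_target(attestation_slot: int, head_block_slot: int, head_block_root: str,
                    block_roots_by_slot: dict) -> tuple:
    """Return the (epoch, root) a validator should use as the attestation target."""
    target_epoch = compute_epoch_at_slot(attestation_slot)
    start_slot = compute_start_slot_at_epoch(target_epoch)
    if start_slot >= head_block_slot:
        # No block exists between the head and the epoch boundary,
        # so the boundary root is simply the head block root.
        return target_epoch, head_block_root
    # Otherwise, take the block root recorded at the boundary slot.
    return target_epoch, block_roots_by_slot[start_slot]

# Attestation at slot 96 (epoch 3) with the head block at slot 94:
# the expected target is (3, 0x02ba...2bec), not an older root.
print(expected_target(96, 94, "0x02ba...2bec", {}))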
I checked a few log files and found similar occurrences on different slot numbers on the Lighthouse/Besu and Prysm/Geth nodes as well:
time="2023-08-02 16:24:21" level=debug msg="Gossip message was ignored" agent="Lighthouse/v4.3.0-dfcb336/x86_64-linux" error="epoch 3 root 0xae769c6e065ecca568214148c2d4676d623d3ca24d7db008c0061910475b1f22: not a checkpoint in forkchoice" gossip score=0.6617806177346196 multiaddress="/ip4/68.183.113.156/tcp/9000" peer id=16Uiu2HAmMs63ZdcqCtf7dbQH7JUXkXhEsDeeLHPTmQEhxFDRYqne prefix=sync topic="/eth2/eec15ee2/beacon_attestation_52/ssz_snappy"
This checkpoint root 0xae769c6 for this attestation points to the block at slot 91.
Comparing the Lighthouse logs at around the same time, the block at slot 94 had already been imported as the head block:
Aug 02 16:23:51.436 DEBG Fork choice updated head old_head: 0xae769c6e065ecca568214148c2d4676d623d3ca24d7db008c0061910475b1f22, old_head_weight: Some(4013024000000000), new_head: 0x02ba7c03aacd27bad2e9dd3a910bebff8dfb5988efb84eae20ec684d93e62bec, new_head_weight: Some(840000000000000), service: beacon
<more logs..>
Aug 02 16:23:54.001 INFO Synced slot: 94, block: 0x02ba…2bec, epoch: 2, finalized_epoch: 0, finalized_root: 0x5096…9387, exec_hash: 0xaa94…2f26 (verified), peers: 81, service: slot_notifier
And then this was logged when preparing the attestation:
Aug 02 16:24:16.004 DEBG Attester cache miss request_slot: 96, head_state_slot: 94, beacon_block_root: 0x02ba7c03aacd27bad2e9dd3a910bebff8dfb5988efb84eae20ec684d93e62bec, service: beacon
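On a cache miss, the attestation data has to be computed from the head state advanced to the request slot. Under spec semantics, advancing through the empty slots 94 and 95 simply copies the head block root forward in the state's block-roots history, so even an advanced state should resolve the epoch-3 boundary root to the slot-94 block. The sketch below illustrates that; it assumes spec-style process_slot bookkeeping rather than Lighthouse's attester-cache internals.

SLOTS_PER_HISTORICAL_ROOT = 8192

def advance_block_roots(block_roots: list, head_slot: int, head_root: str,
                        target_slot: int) -> list:
    """Copy the latest block root forward through empty slots, as the spec's process_slot does."""
    roots = list(block_roots)
    for slot in range(head_slot, target_slot):
        roots[slot % SLOTS_PER_HISTORICAL_ROOT] = head_root
    return roots

# Advancing the slot-94 head state (head 0x02ba...2bec) to the request slot 96 writes the
# head root into the history for slots 94 and 95; nothing in this path can reintroduce
# the slot-91 root 0xae76...1f22 as the epoch-3 boundary root.
roots = advance_block_roots(["0x00"] * SLOTS_PER_HISTORICAL_ROOT, 94, "0x02ba...2bec", 96)
print(roots[94], roots[95])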
So we know that for this attestation: request slot = 96, request epoch = 3, head state slot = 94. The expected target checkpoint should therefore be epoch 3 with root 0x02ba7c0 (slot 94), but the attestation instead carries 0xae769c6 (slot 91) in the target checkpoint, which I can't really explain from looking at the code.

It also seems like the whole time we're seeing a large number of invalid attestations from the network, mostly due to them being "Past Slot", which makes me wonder if message delivery was severely delayed.
Aug 02 16:24:16.171 DEBG Invalid attestation from network type: "aggregated", peer_id: 16Uiu2HAm7ZKCLyx3KmtyZLuTofaWDjjqqGPko7qhkCJWRBpRRYw7, block: 0x29a25932ed6eaea731f3375f1668e9a4a986824ce14a584320387776beb88fa6, reason: PastSlot { attestation_slot: Slot(45), earliest_permissible_slot: Slot(64) }
Other peers sending such invalid attestations:
- 16Uiu2HAmV9PVEounMWpf3TWNsvrKi2tbkNHEZSyeEBDFKhn7MkXf (Prysm disconnected this peer due to fault, caused by the invalid attestations from this node)
- 16Uiu2HAmKE7dpAckeUxzaGta9HDgp5FMDGjMxrz7gDozC1o1eabm
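For context on the "PastSlot" rejections: the p2p spec only allows an attestation to propagate within ATTESTATION_PROPAGATION_SLOT_RANGE (32) slots of its slot, so a rejection with attestation_slot 45 against earliest_permissible_slot 64 implies the node's clock was already around slot 96 when the message was validated, i.e. the aggregate was over 50 slots old. A minimal sketch of that check (simplified, ignoring MAXIMUM_GOSSIP_CLOCK_DISPARITY):

ATTESTATION_PROPAGATION_SLOT_RANGE = 32

def past_slot_check(attestation_slot: int, current_slot: int) -> bool:
    """True if the attestation is too old to be propagated on gossip."""
    earliest_permissible_slot = current_slot - ATTESTATION_PROPAGATION_SLOT_RANGE
    return attestation_slot < earliest_permissible_slot

# The rejected aggregate above: slot 45 vs. earliest permissible slot 64,
# which corresponds to a current slot of 96.
print(past_slot_check(45, 96))  # True -> dropped as PastSlot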