
Week 14 - 15

TL;DR

Over the past two weeks, my primary focus has been on starting the last part of my project proposal: adding an attestation simulator so that Lighthouse can evaluate the implications of attesting 32 times per epoch. This matters for performance, and it also helps us understand how a consensus client would transition to protocol behaviour changes such as SSF combined with an increase of the max effective balance.

  • Digging into the different committees, LH's caching implementations, and the interaction between the Validator Client and the Beacon Node when producing and signing aggregations.

  • I also created a PR introducing the first metrics to see how LH behaves if it were to simulate one attestation per slot.

  • I've resolved the last issue with the tests related to the missed block feature, and the PR has been merged into the unstable branch of LH for testing.

Exposing Validator Missed Block Metrics

Context

  1. After Paul helped me wrap up the PR, I realised that the tests did not pass for all the different fork names. This led me to dig into how the specifications for the various forks are injected into the BN harness (the name given to the BN generated in the tests).

  2. After comparing the specifications for two test runs, each with its own fork specs, I noticed that the primary variable that changes is the epoch at which a fork's specs take effect. For instance, consider the specs injected when testing compatibility with the Altair fork:

    • Command to run for a single test:
      env FORK_NAME=altair cargo nextest run --release --features "fork_from_env,slasher/lmdb," -p beacon_chain --test beacon_chain_tests validator_monitor::produces_missed_blocks

    • Specs output:
      harness1spec: ChainSpec { config_name: Some("mainnet"), genesis_slot: Slot(0), far_future_epoch: Epoch(18446744073709551615), base_rewards_per_epoch: 4, deposit_contract_tree_depth: 32, max_committees_per_slot: 64, target_committee_size: 128, min_per_epoch_churn_limit: 4, max_per_epoch_activation_churn_limit: 8, churn_limit_quotient: 65536, shuffle_round_count: 90, min_genesis_active_validator_count: 16384, min_genesis_time: 1606824000, hysteresis_quotient: 4, hysteresis_downward_multiplier: 1, hysteresis_upward_multiplier: 5, proportional_slashing_multiplier: 1, min_deposit_amount: 1000000000, max_effective_balance: 32000000000, ejection_balance: 16000000000, effective_balance_increment: 1000000000, genesis_fork_version: [0, 0, 0, 0], bls_withdrawal_prefix_byte: 0, eth1_address_withdrawal_prefix_byte: 1, genesis_delay: 604800, seconds_per_slot: 12, min_attestation_inclusion_delay: 1, min_seed_lookahead: Epoch(1), max_seed_lookahead: Epoch(4), min_epochs_to_inactivity_penalty: 4, min_validator_withdrawability_delay: Epoch(256), shard_committee_period: 256, base_reward_factor: 64, whistleblower_reward_quotient: 512, proposer_reward_quotient: 8, inactivity_penalty_quotient: 67108864, min_slashing_penalty_quotient: 128, domain_beacon_proposer: 0, domain_beacon_attester: 1, domain_blob_sidecar: 11, domain_randao: 2, domain_deposit: 3, domain_voluntary_exit: 4, domain_selection_proof: 5, domain_aggregate_and_proof: 6, safe_slots_to_update_justified: 8, proposer_score_boost: Some(40), eth1_follow_distance: 2048, seconds_per_eth1_block: 14, deposit_chain_id: 1, deposit_network_id: 1, deposit_contract_address: 0x00000000219ab540356cbb839cbe05303d7705fa, inactivity_penalty_quotient_altair: 50331648, min_slashing_penalty_quotient_altair: 64, proportional_slashing_multiplier_altair: 2, epochs_per_sync_committee_period: Epoch(256), inactivity_score_bias: 4, inactivity_score_recovery_rate: 16, min_sync_committee_participants: 1, domain_sync_committee: 7, domain_sync_committee_selection_proof: 8, domain_contribution_and_proof: 9, altair_fork_version: [1, 0, 0, 0], altair_fork_epoch: Some(Epoch(0)), inactivity_penalty_quotient_bellatrix: 16777216, min_slashing_penalty_quotient_bellatrix: 32, proportional_slashing_multiplier_bellatrix: 3, bellatrix_fork_version: [2, 0, 0, 0], bellatrix_fork_epoch: None, terminal_total_difficulty: 58750000000000000000000, terminal_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, terminal_block_hash_activation_epoch: Epoch(18446744073709551615), safe_slots_to_import_optimistically: 128, capella_fork_version: [3, 0, 0, 0], capella_fork_epoch: None, max_validators_per_withdrawals_sweep: 16384, deneb_fork_version: [4, 0, 0, 0], deneb_fork_epoch: None, boot_nodes: [], network_id: 1, attestation_propagation_slot_range: 32, maximum_gossip_clock_disparity_millis: 500, target_aggregators_per_committee: 18446744073709551615, attestation_subnet_count: 64, subnets_per_node: 2, epochs_per_subnet_subscription: 256, gossip_max_size: 10485760, min_epochs_for_block_requests: 33024, max_chunk_size: 10485760, ttfb_timeout: 5, resp_timeout: 10, message_domain_invalid_snappy: [0, 0, 0, 0], message_domain_valid_snappy: [1, 0, 0, 0], attestation_subnet_extra_bits: 0, attestation_subnet_prefix_bits: 6, domain_application_mask: 16777216, domain_bls_to_execution_change: 10 }

The crucial piece of data is altair_fork_epoch: Some(Epoch(0)), which makes the BN behave according to the Altair spec from epoch 0 onwards. Yet after comparing the 'phase0' and 'altair' fork names, I couldn't identify any significant spec change that would affect the state validation performed by LH's Validator Monitor.
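
As a minimal illustration of that mechanism (simplified; Lighthouse exposes the real logic via ChainSpec::fork_name_at_epoch, and phase0 is called Base internally), the active fork at a given epoch is just a comparison against the scheduled fork epochs:

// Simplified sketch of fork scheduling: `Some(0)` activates Altair from
// genesis, `None` means the fork is never scheduled.
#[derive(Debug, PartialEq)]
enum ForkName {
    Base, // phase0
    Altair,
}

fn fork_name_at_epoch(epoch: u64, altair_fork_epoch: Option<u64>) -> ForkName {
    match altair_fork_epoch {
        Some(fork_epoch) if epoch >= fork_epoch => ForkName::Altair,
        _ => ForkName::Base,
    }
}

fn main() {
    // `altair_fork_epoch: Some(Epoch(0))` => Altair rules from epoch 0.
    assert_eq!(fork_name_at_epoch(0, Some(0)), ForkName::Altair);
    // `altair_fork_epoch: None` => the chain stays on phase0.
    assert_eq!(fork_name_at_epoch(5, None), ForkName::Base);
}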

Therefore, I decided to debug step by step. Digging deeper into the values generated by the harness itself, I discovered that the proposers_per_epoch vector in the add_validators_missed_blocks function varied depending on the fork name.

proposers_per_epoch = self.get_proposers_by_epoch_from_cache(
    slot_epoch,
    shuffling_decision_block,
);
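
For intuition, here is a toy sketch (hypothetical types, not LH's actual cache) of why the lookup above is keyed by both the epoch and the shuffling decision block: the proposer shuffling can only be computed once that block is known, so a cache entry is only valid for that (epoch, decision block) pair.

use std::collections::HashMap;

// Toy stand-in for the proposer cache: a map from
// (epoch, shuffling decision block root) to the proposer indices of that
// epoch's slots.
type Root = [u8; 32];

#[derive(Default)]
struct ProposerCache {
    entries: HashMap<(u64, Root), Vec<u64>>,
}

impl ProposerCache {
    fn get_proposers_by_epoch(&self, epoch: u64, decision_root: Root) -> Option<&Vec<u64>> {
        self.entries.get(&(epoch, decision_root))
    }
}

fn main() {
    let mut cache = ProposerCache::default();
    cache.entries.insert((0, [0u8; 32]), vec![7, 2, 4]);
    assert_eq!(cache.get_proposers_by_epoch(0, [0u8; 32]), Some(&vec![7, 2, 4]));
}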

I expected the validator missing the block to always sit at index 0 with the same value (validator index = 7), where the validator index equals validator_indexes[slot_in_epoch.as_usize()] as defined in the tests.

While wondering why the value at index 0 differs depending on the fork name, e.g. 7 (pre-Altair) vs 2 (post-Altair), I realised I hadn't taken into account that get_beacon_proposer_indices is initialised from a seed that depends on the fork.
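
To see why a different seed is enough to move the missed block onto a different validator, here is a toy demonstration (illustrative only, nothing like the spec's actual compute_proposer_index):

// Toy seed-based selection: different seeds (standing in for different
// fork names) pick different proposers for the same slot.
fn toy_proposer(validators: &[u64], seed: u64, slot: u64) -> u64 {
    let mixed = seed.wrapping_mul(0x9E3779B97F4A7C15).wrapping_add(slot);
    validators[(mixed % validators.len() as u64) as usize]
}

fn main() {
    let validators: Vec<u64> = (0..12).collect();
    println!("seed A, slot 0 -> validator {}", toy_proposer(&validators, 41, 0));
    println!("seed B, slot 0 -> validator {}", toy_proposer(&validators, 42, 0));
}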

Finally, I added the validator_index that should be monitored for each fork name and passed that list when initialising the validator_monitor component in the harness:

// 2nd scenario //
//
// Missed block happens when slot and prev_slot are not in the same epoch,
// making sure that the cache reloads when the epoch changes.
// In that scenario the slot that missed a block is the first slot of the epoch.
validator_index_to_monitor = 7;
// We also monitor other validators, as the one missing a block depends on
// the fork name specified when running the test (the proposer cache differs
// per fork, cf. the seed).
let validator_index_to_monitor_altair = 2;
// Same as above but for the merge upgrade
let validator_index_to_monitor_merge = 4;
// Same as above but for the capella upgrade
let validator_index_to_monitor_capella = 11;
// Same as above but for the deneb upgrade
let validator_index_to_monitor_deneb = 3;
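
As a hypothetical sketch (not the test's actual code), the per-fork indices end up collected into the list handed to the validator monitor at initialisation; every index is monitored, and only the one matching the active fork's proposer shuffling registers the missed block:

// Hypothetical helper: gather the per-fork indices into one list.
fn indices_to_monitor() -> Vec<u64> {
    let validator_index_to_monitor = 7; // pre-Altair
    let validator_index_to_monitor_altair = 2;
    let validator_index_to_monitor_merge = 4;
    let validator_index_to_monitor_capella = 11;
    let validator_index_to_monitor_deneb = 3;
    vec![
        validator_index_to_monitor,
        validator_index_to_monitor_altair,
        validator_index_to_monitor_merge,
        validator_index_to_monitor_capella,
        validator_index_to_monitor_deneb,
    ]
}

fn main() {
    println!("monitoring validators: {:?}", indices_to_monitor());
}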

The PR is now merged.

Attestation Simulator

Context

  • I began drafting my PR after thoroughly examining the interactions between the VC and the BN around aggregation requests and generation, as well as the roles of the various committees; the sync committee is beyond the scope of this discussion, so the focus here is on the beacon committees. You can read more about the committees in the eth2book, and you can see the draft of my PR here.

  • After studying the workflow between the VC and the BN, we can summarise it from the VC's perspective as follows:

duty_service task (interval: next_slot_duration):
- fetch get_beacon_proposer (GET /eth/v1/validator/duties/proposer/{epoch})
- fetch get_beacon_attester (POST /eth/v1/validator/duties/attester/{epoch})

attester_service task (interval: next_slot_duration + slot_duration / 3):
- compute duties_by_committee_index
- for_each duties_by_committee_index {
    - publish_attestations_and_aggregates {
        - for_each committee_index {
            - get_validator_attestation_data (GET /eth/v1/validator/attestation_data)
            - check duty == attestation {
                - sign(attestation)
            }
        }
        - send post_beacon_pool_attestations (POST /eth/v1/beacon/pool/attestations)
    }
}
  • I started working on bypassing this workflow to concentrate on how the BN, and only the BN, would behave when producing one (un)aggregated attestation per slot (see the sketch after this list).

  • I asked Paul Hauner for initial feedback and checked whether I was off topic. You can see more of the conversation here.
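
As a sketch of the direction (hypothetical names throughout, not the draft PR's actual code): once per slot, the BN produces an unaggregated attestation by itself, through the same logic that backs the attestation_data endpoint, and records metrics about it.

use std::time::Instant;

// Hypothetical sketch of the simulator's core loop: the BN produces one
// unaggregated attestation per slot with no VC involved, and records how
// long production took (the draft PR feeds Prometheus metrics instead).
#[derive(Debug)]
struct AttestationData {
    slot: u64,
    index: u64, // committee index
    beacon_block_root: [u8; 32],
}

// Stand-in for the BN's production path, i.e. the same logic that backs
// GET /eth/v1/validator/attestation_data.
fn produce_unaggregated_attestation(slot: u64, head_root: [u8; 32]) -> AttestationData {
    AttestationData { slot, index: 0, beacon_block_root: head_root }
}

fn main() {
    let head_root = [0u8; 32];
    for slot in 0..4 {
        let start = Instant::now();
        let att = produce_unaggregated_attestation(slot, head_root);
        // Later, once the chain advances, the simulator can also check whether
        // the simulated head/target/source votes would have been correct.
        println!("slot {}: produced {:?} in {:?}", att.slot, att, start.elapsed());
    }
}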

Next steps

  • I will run a version of LH with the missed block feature on Holesky and review the results via the graphs added to the validator monitor dashboard.

  • Wrap up the attestation simulator in the next update (or hopefully get close to it).