geemo

@geemo

Joined on Oct 25, 2022

• EPF 4 Final Update

Project Description
This project sought to increase block proposer rewards for Lighthouse users. The project proposal lives here: Lighthouse Attestation Packing.

Current Status
The project is practically complete. We have gone through phases of optimization to arrive at something ready to be merged into Lighthouse, though the code is not yet merged. We changed the project roadmap after realizing that the optimal aggregation strategy, combined with a greedy solution to the weighted maximum coverage problem, produces the optimal solution on all test cases. My partner also stopped working on the project, so the roadmap outlined in the proposal was pushed back while I completed both sides. I implemented the so-called 'decomposition strategy' outlined by Satalia; this took longer than I initially expected due to my lack of familiarity with the techniques. I also implemented the 'enumeration approach' discussed in the paper linked above. That work can be found in this testing repository.
• EPF Update #10

This update covers the process toward merging the optimality-preserving attestation aggregation work. It is slated to be included in the next Lighthouse release, v4.6.0. One of the major things worked on during this period was adding a new test to ensure that, given a 'pairwise overlapping' set of attestations, our packing strategy results in 2 attestations that cover the full attester set. By 'pairwise overlapping' we mean something like the following:

$$ A = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8], [7, 8, 9, 10]] $$

Consider each set as an attestation whose attester set is the given set. All of the attestations have the same AttestationData, so recall that the only other rule for determining whether two attestations can be aggregated is that their attester sets must be disjoint (a toy sketch of this case follows below). This turned out to be a very fruitful test because the number of cliques produced from a set generated in this way is very large. Let's see why. Remember that our aggregation technique is graph based and enumerates all maximal cliques. So let's list the edges.
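To make the 'pairwise overlapping' case concrete, here is a minimal Rust sketch. It uses plain HashSets as stand-ins for attester bitfields rather than Lighthouse's actual types, checks the disjointness rule described above, and shows that two aggregates suffice to cover all ten attesters.

```rust
use std::collections::HashSet;

// Toy model of the 'pairwise overlapping' case from this update. All four
// attestations share the same AttestationData, so the only remaining
// aggregation rule is that attester sets must be disjoint. These are
// illustrative types, not Lighthouse's.
fn can_aggregate(a: &HashSet<u64>, b: &HashSet<u64>) -> bool {
    a.is_disjoint(b)
}

fn main() {
    let a: Vec<HashSet<u64>> = vec![
        HashSet::from([1, 2, 3, 4]),
        HashSet::from([3, 4, 5, 6]),
        HashSet::from([5, 6, 7, 8]),
        HashSet::from([7, 8, 9, 10]),
    ];

    // a[0] with a[2] and a[1] with a[3] are disjoint pairs: together they
    // cover every attester in {1, ..., 10} using exactly two aggregates.
    assert!(can_aggregate(&a[0], &a[2]));
    assert!(can_aggregate(&a[1], &a[3]));

    // Adjacent sets overlap, so they can never be aggregated with each other.
    assert!(!can_aggregate(&a[0], &a[1]));
    assert!(!can_aggregate(&a[2], &a[3]));
}
```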
• This update will cover the current state of the new block packing methods for Lighthouse. With the assistance of Michael Sproul we have an optimality-preserving aggregation strategy implemented, and we are currently testing and optimizing its performance. This is a pretty significant change to the block packing logic, so we must ensure that everything functions correctly with good performance.

Michael set an initial performance goal of attestation packing taking <50ms. To approach this we needed to profile the code to see which parts were taking the longest. I found that the Bron-Kerbosch algorithm took the longest, followed by the greedy maximum coverage algorithm, and then efficiently removing aggregated attestations whose attester sets are subsets of some other attestation in the set. Given that information I began looking into how to optimize the implementation of the Bron-Kerbosch algorithm itself, but I didn't find any great way to improve it. Thankfully, the aggregation requirements ensure that no two attestations with different attestation data can be aggregated, so we are able to apply parallel computing techniques to improve performance. Using rayon's ParallelIterator primitive we reduced the average packing time from about 89ms down to ~35ms (a rough sketch of this per-data parallelization follows below). Using the primitive required restructuring the code to facilitate parallel compute.

Also, in the test repo I've implemented the "enumeration" approach, an exact approach to the weighted maximum coverage problem that uses a branch and bound algorithm. I'm sure some of it could be optimized, but for now it's quite slow. The structure of the algorithm lends itself to parallel compute, so that should improve things fairly significantly; however, the current implementation is recursive, making it less straightforward to employ rayon's primitives. I also implemented an approximate algorithm for the weighted maximum coverage problem that combines the "decomposition" approach with a k-greedy algorithm. It allows us to choose from combinations of attestations to avoid picking attestations whose attester sets collide with other attestations.

In conclusion, we are still figuring out the best approach to packing. We've implemented different approaches and are comparing performance vs. reward. Given the tradeoffs, the exact approaches do not seem to me to be the best way to solve the problem.
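As a rough illustration of the per-AttestationData parallelism mentioned above (not the actual Lighthouse code; the types and the aggregate_cliques placeholder are hypothetical), rayon can map the clique-aggregation step over each data group independently:

```rust
use rayon::prelude::*;
use std::collections::HashMap;

// Hypothetical stand-in for Lighthouse's compact attestation types.
#[derive(Clone)]
struct CompactAttestation {
    attesters: Vec<u64>,
}

// Placeholder for the per-data maximal-clique aggregation step; the real
// implementation would run Bron-Kerbosch over the compatibility graph here.
fn aggregate_cliques(atts: &[CompactAttestation]) -> Vec<CompactAttestation> {
    atts.to_vec()
}

// Because attestations with different AttestationData can never be aggregated
// together, each group can be processed independently and in parallel.
fn aggregate_all(
    by_data: &HashMap<u64, Vec<CompactAttestation>>,
) -> HashMap<u64, Vec<CompactAttestation>> {
    by_data
        .par_iter() // rayon parallel iterator over the per-data groups
        .map(|(data_key, atts)| (*data_key, aggregate_cliques(atts)))
        .collect()
}
```

Since groups keyed by different attestation data never interact, this parallelization cannot change the result, only the wall-clock time.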
• In the past week I tested and optimized the decomposition approach that I described in my previous update. The resulting algorithm does not perform as well as I would have hoped. That being said, it is faster than the base MIP solution in some cases, though not by a significant margin, and it does achieve the optimal reward. Additionally, I've tested the aggregation branch and resolved some outstanding issues on it. In the next week I'd like to get the aggregation branch merged, or at least looked at by Michael Sproul. It still does not perform as well as the greedy aggregation approach, so it certainly isn't worth merging as is, but I think this can be solved. Apologies for the short update. The next one will be longer, as I'll explain the enumeration approach in depth after implementing it in the test repo.
• During the past couple of weeks I have pivoted to focusing on the enumeration approach as well as the decomposition approach. In my previous update I outlined these approaches at a high level; here I will go into more detail. The majority of the work I did recently focused on learning more about dynamic programming and branch and bound algorithms, implementing decomposition in the test repository, and fixing an annoying bug in the attestation aggregation branch.

The dynamic programming approach to the attestation packing problem separates the problem into a main problem and sub-problems. The sub-problems will look familiar if you have kept up with these updates. The synopsis: given some data $d_k$ and a maximum number of attestations $n$, the function $g(d_k, n)$ computes the weighted maximum coverage over the set of maximally aggregated attestations for that particular data. In practice the set of maximally aggregated attestations for a given data is small, with an average around 2, a maximum (over a small data set) of 13, and a minimum of 1. From here you can choose any exact method to solve this NP-hard subproblem. In the enumeration approach we chose to employ a branch and bound algorithm. This is currently unimplemented, but I believe it will be an improvement over the MIP solver approach in terms of performance, because we can use a greedy approach to calculate an initial value to bound the tree search procedure. From what we have seen in practice when testing Lighthouse's current block proposer rewards, the greedy result is often already optimal. This allows us to use an approximation of the value of the remaining leaves to reduce the size of the tree and hence decrease the runtime.

The main problem is where the dynamic programming comes into play. Satalia defined the approach as follows: let $k$ be a non-negative integer and $m$ be the maximum number of attestations that can be included in the block. Then the function $f(k, m)$ computes the optimal reward given that the first $k$ data have been taken into account. The function $f$ is defined recursively, with the base case, when $k$ is zero, being $f(0, m) = \{\}$, the empty set. A minimal sketch of the recurrence follows below.
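This sketch assumes a hypothetical subproblem oracle `g(k, n)` that returns the best reward obtainable for the k-th data using at most n attestations (however it is computed: greedy, MIP, or branch and bound). Only the reward value is tracked; the actual selection would be recovered by also remembering the choices.

```rust
/// f[k][m]: best reward using only the first `k` attestation data and at most
/// `m` attestation slots in the block. `g(k, n)` is a hypothetical oracle for
/// the per-data subproblem (weighted max coverage with at most `n` picks).
fn best_reward(num_data: usize, max_atts: usize, g: &dyn Fn(usize, usize) -> u64) -> u64 {
    // Base case f(0, m): no data considered yet, so the reward is zero.
    let mut f = vec![vec![0u64; max_atts + 1]; num_data + 1];
    for k in 1..=num_data {
        for m in 0..=max_atts {
            // Give n of the m available slots to the k-th data and the
            // remaining m - n slots to the first k - 1 data.
            let mut best = 0;
            for n in 0..=m {
                best = best.max(f[k - 1][m - n] + g(k - 1, n));
            }
            f[k][m] = best;
        }
    }
    f[num_data][max_atts]
}
```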
• This week I was able to clear the roadblocks on the attestation aggregation branch as well as test the lazy attestation aggregation branch with flamegraph. Given that progress I've started to look forward to the next phase of my contributions to the project, namely attestation packing.

Attestation Packing
You'd be right to suggest that the whole project is about attestation packing. However, in this context I'm referring to the packing phase as Satalia describes it in: https://lighthouse-blog.sigmaprime.io/docs/satalia-02-exact-approaches.pdf. It assumes that the aggregation portion has already been done, although it is likely less efficient to actually aggregate during the aggregation phase; instead we could wait until we know the final attestations that will be included in the block before aggregating. I'm not sure whether that will turn out to be the case in reality.

The major problem of the packing stage is that it is an instance of the weighted maximum coverage problem: we'd like to pick the subset of aggregated attestations that maximizes the number of attesters included, subject to the weights (associated rewards) and the constraint that only N attestations can be included in a block (an illustrative greedy sketch of this problem follows below).

The first approach proposed in the document linked above is called the MIP approach, a fitting name since it uses a Mixed Integer Programming solver to solve the max coverage problem over the entire set of attestations available after the aggregation phase. Satalia didn't stop there. They also proposed an optimization that they called the Decomposition approach. Here they make use of the strictly additive relationship between attestation rewards when the attestations have different attestation data. They split the problem into many sub-problems that are still instances of weighted max coverage, but over much smaller sets. This is expected to perform better given that the number of clique attestations per data is small on average. To stitch the sub-problems together, a dynamic programming approach to a problem similar to the knapsack problem can be employed; it is used to select the number of attestations from each data to include in the final selection.

Further, because the MIP solver takes time to start up and introduces a new dependency, Satalia proposed the Enumeration approach. This isn't really a new approach, since it just replaces the exact method of using the MIP solver with another exact method: a tree search algorithm called branch and bound. This is expected to increase performance because we will not have to wait for the MIP solver to start up for each subproblem. Additionally, it opens the possibility of optimizing the search with better bounding functions or a better ordering of the search. So, in theory, the decomposition approach plus the branch and bound algorithm is the most performant approach we know of so far.
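For illustration, here is the standard greedy heuristic for maximum coverage applied to this setting, with simplified types and a uniform per-attester weight. It is not Lighthouse's implementation, just a way to see the structure of the problem that the MIP, decomposition, and enumeration approaches solve exactly.

```rust
use std::collections::HashSet;

// Illustrative greedy approximation for maximum coverage: repeatedly pick the
// aggregate whose not-yet-covered attesters add the most value. Types and the
// uniform weighting are simplifications, not Lighthouse code.
struct Aggregate {
    attesters: HashSet<u64>,
}

fn greedy_pack(candidates: &[Aggregate], max_in_block: usize) -> Vec<usize> {
    let mut covered: HashSet<u64> = HashSet::new();
    let mut chosen: Vec<usize> = Vec::new();
    while chosen.len() < max_in_block {
        // Marginal gain = number of newly covered attesters (equal weights here).
        let best = candidates
            .iter()
            .enumerate()
            .filter(|(i, _)| !chosen.contains(i))
            .max_by_key(|(_, a)| a.attesters.difference(&covered).count());
        match best {
            Some((i, a)) if a.attesters.difference(&covered).count() > 0 => {
                covered.extend(a.attesters.iter().copied());
                chosen.push(i);
            }
            // Stop early if nothing adds new coverage.
            _ => break,
        }
    }
    chosen
}
```

The exact approaches replace this marginal-gain loop with a solver (MIP) or a pruned exhaustive search (branch and bound), trading runtime for a guaranteed optimum.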
• Introduction
This week I further worked on both issues discussed previously: lazy attestation signature decompression and maximal clique enumeration attestation aggregation.

Lazy Attestation Signature Decompression
For lazy signature decompression, Michael Sproul ran some tests to see if the updates resulted in reduced CPU usage, based on metrics from nodes on Goerli. The metrics showed no decrease in CPU usage. However, he also pointed out some optimizations that could be made to decrease CPU usage. The main idea of the lazy attestation signature decompression changes is simply to keep the data received from peers in the format received over the wire until necessary. This in itself would not lead to any decreased CPU usage, because the signature would have to be decompressed at some point. However, Lighthouse maintains a list of observed aggregate attestations, which allows us to check whether a newly received attestation has attesting indices that are a subset of an already observed attestation with the same data (a sketch of this check follows below). Such attestations do not provide any new utility because they have practically already been seen, so we skip further processing and never have to decompress the signature. This should save some amount of CPU usage, but the exact gains are unclear. The tests mentioned earlier were conducted when there was still some obvious room for improvement: skipping was only implemented for batch processing, there was an extra unnecessary computation of the tree hash root of the attestation data, and there were more allocations than needed in the batch processing algorithm. I have since fixed these issues, but we have not re-tested the CPU usage on the Goerli nodes. I did confirm that decompressing attestation signatures is not taking a significant portion of compute, as Michael Sproul had already pointed out; I wanted to learn how to run and make flamegraphs, so I confirmed what he found.

Attestation Aggregation
On the attestation aggregation front, I have been a bit stalled. I've found my current debugging workflow to be woefully inadequate and tedious. Some of the tests in the Lighthouse test suite fail to complete after a significant amount of time (20 minutes), whereas on the unstable branch all cases pass within a couple of minutes; really, I think only 1 or 2 even take more than a minute. So I'm working to get to the bottom of this problem. It seems as though the issue lies in the bron_kerbosch function, which really shouldn't come as a surprise since it is easily the most complicated function being added. I'm now trying to write unit tests for the algorithm that fail in a similar way to the more integrated tests of the client. Additionally, Ankur and I made slides to present our project during the next office hours project proposal session.
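The skip condition for redundant aggregates can be summarized in a few lines. This sketch uses plain HashSets in place of Lighthouse's observed-attestation structures, so the names are illustrative only.

```rust
use std::collections::HashSet;

// Sketch of the skip condition described above: if a newly received
// aggregate's attesters are a subset of some already observed aggregate for
// the same AttestationData, it adds nothing new and can be dropped before we
// ever decompress its signature.
fn is_redundant(new_attesters: &HashSet<u64>, observed_for_same_data: &[HashSet<u64>]) -> bool {
    observed_for_same_data
        .iter()
        .any(|seen| new_attesters.is_subset(seen))
}
```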
• Introduction
This past week I focused on an adjacent issue about optimizing attestation signature decompression. Additionally, Ankur Dubey and I wrote our project proposal and received feedback on it from Michael Sproul. I also continued research on another area of interest of mine that I mentioned in my initial update, namely proposer builder separation.

Lazy Attestation Signature Decompression
The issue, and the PR which aims to resolve it, can be found here:
Issue: https://github.com/sigp/lighthouse/issues/4398
Pull Request: https://github.com/sigp/lighthouse/pull/4567
This still needs to be tested to see if it achieves the goal of reducing the CPU usage of decompression. The key idea is to keep the data structure as sent over the wire until necessary, to avoid decompressing unnecessary attestation signatures. Hence: lazy attestation signature decompression.
• Hello again! This update will be a short one. As I described in my previous update, this week was devoted to implementing maximal clique enumeration aggregation for the Lighthouse consensus client. Here is an overview of the changes that I have made (a rough sketch of the reworked storage follows below):
- Deleted importing aggregates from the naive_aggregation_pool, since the unaggregated attestations should already be in the op_pool.
- During unaggregated attestation processing, add them to the op_pool.
- Altered the fields of AttestationDataMap to a HashMap for aggregated_attestation and a HashMap for unaggregated_attestations, and altered the methods on AttestationDataMap accordingly.
- Removed greedy aggregation on insertion into the op_pool.
- Ported the Bron-Kerbosch implementation from Satalia to Lighthouse.
- Added a get_clique_aggregate_attestations_for_epoch -> Vec<(&CompactAttestationData, CompactIndexedAttestation<T>)> method. (This is where most of the changes are. I'm curious whether it would be better to return an Iterator.)
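A rough sketch of the reworked storage shape described in the list above, with placeholder types standing in for Lighthouse's compact attestation types; the field and type names here are approximations, not the merged code.

```rust
use std::collections::HashMap;

// Placeholder stand-ins for Lighthouse's CompactAttestationData and
// CompactIndexedAttestation types.
type CompactAttestationData = u64;
type CompactIndexedAttestation = Vec<u64>; // attester indices

// Approximate shape of the altered AttestationDataMap: aggregates and
// unaggregated attestations are now stored in separate maps keyed by data.
struct AttestationDataMap {
    aggregate_attestations: HashMap<CompactAttestationData, Vec<CompactIndexedAttestation>>,
    unaggregated_attestations: HashMap<CompactAttestationData, Vec<CompactIndexedAttestation>>,
}
```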
• This week I continued researching the attestation aggregation and packing problem (AAPP). The difference between this week and the last is that I began digging into the implementation details of both the current greedy approach and the proposed maximal clique enumeration approach. Additionally, I've started integrating the new approach into the Lighthouse consensus client: https://github.com/sigp/lighthouse/pull/4507. The PR so far really just ports code from the nightly Rust compiler to the stable one. Most of the work will be making Lighthouse's data structures play nicely with the Bron-Kerbosch algorithm implementation and then testing to ensure there is no regression in block rewards and other metrics. Note that this is just phase 1 of the problem; after aggregation is solved, we (Ankur and I) are going to tackle the max coverage problem. We got the green light from the SigP crew to work towards it! So I'm focusing on aggregation and will move to packing once it's complete.

Here's an outline of what needs to be done. Michael Sproul provided the following overview:
1. Add the Bron-Kerbosch algorithm in its own crate.
2. Modify the naive aggregation pool so that it retains the unaggregated attestations. Or, modify unaggregated attestation processing so that unaggregated attestations also get sent immediately to the op pool.
3. Rework the attestation storage in the op pool so that it can do max-clique enumeration on the stored attestations.
4. Use the output of max-clique enumeration (on a per-attestation-data basis) as the input to max coverage.
5. Set up some nodes on mainnet and/or testnets to produce blocks every slot using blockdreamer. We can measure the profitability of these blocks to see if we can outcompete stable Lighthouse, as well as benchmark the time it takes to produce each block.

Since step 1 was already accomplished, I moved on to step 2. The naive aggregation pool in Lighthouse is a field of the BeaconChain struct which takes in unaggregated attestations. On insertion, attestations with AttestationData that has not been seen before but is valid are added as another entry in the pool. Otherwise, assuming they are valid, they are aggregated immediately with the member of the pool that shares the same AttestationData (a toy illustration of this behaviour follows below). I assume this process is why it's called the 'naive' aggregation pool, but I don't think that is explicitly stated anywhere. This process can make it so that the aggregated attestation in the pool cannot be aggregated with some other attestation if they share even one signer. It definitely has nice properties though, and works really well in practice. As step 2 presents two options, I investigated them. I think the second option is preferable, but I haven't run that by anyone else. So instead of maintaining the unaggregated attestations in the naive_aggregation_pool, we will just send them to the operation pool after processing them for validity.
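The toy illustration mentioned above: a simplified model of the naive pool's insert-and-aggregate-immediately behaviour, with placeholder types (a real attestation carries a signature and an attester bitfield, not a single index).

```rust
use std::collections::{HashMap, HashSet};

// Simplified placeholder for AttestationData.
type AttestationData = u64;

// Toy model of the 'naive' aggregation behaviour: one running aggregate per
// AttestationData, into which each valid unaggregated attestation is folded
// immediately on arrival. Not Lighthouse's actual types or logic.
#[derive(Default)]
struct NaivePool {
    aggregates: HashMap<AttestationData, HashSet<u64>>, // data -> attester indices
}

impl NaivePool {
    fn insert(&mut self, data: AttestationData, attester: u64) {
        // The first attestation for a given data creates the entry; later ones
        // are merged into it straight away. Once merged, the combined aggregate
        // can no longer be aggregated with anything sharing even one signer.
        self.aggregates.entry(data).or_default().insert(attester);
    }
}
```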
• Introduction: Hi, I'm Geemo. During the last EPF cohort, I worked on implementing some of the light client networking for the consensus client Lighthouse. I'm excited for cohort 4 to get started so I can continue learning and contributing! This time around, my goal for each update is that the reader is able to understand my progress on the topics discussed without significant domain knowledge. However, I will assume a basic understanding of Ethereum node related terminology, and for some in-depth topics I'll lean on other explanations and only provide high-level commentary. This approach should make the updates a bit longer, but hopefully they'll read better than what I wrote last time around.

Exploring Projects
The initial task for the program is to explore current protocol development topics and decide what to work on. The mentors of the program provided a list of projects here: https://github.com/eth-protocol-fellows/cohort-four/blob/master/projects/project-ideas.md. After looking over the list of projects, two stood out:
- ePBS Design and Prototyping (by Potuz and Terrence of Prysm)
- Optimising Attestation Packing (by Michael Sproul of Lighthouse)

Proposer Builder Separation (PBS) has been a rich topic of research and development since the Merge. The out-of-protocol solution by Flashbots, MEV-boost, has been adopted by most of the validators on the network, resulting in about 90% of blocks being built using the software. It allows block proposers to capture Maximal Extractable Value by outsourcing block building to a network of specialized entities that build highly profitable blocks. These blocks often include some form of arbitrage, and their profitability is derived from the ordering of transactions in the block. There has been much scrutiny around proposer builder separation after a vulnerability in MEV-boost was exploited to steal around $20 million from MEV bots. Additionally, the implementation hinges on centralized entities that have censored transactions in accordance with the list of OFAC sanctioned addresses. This weakens some of the core values of Ethereum: decentralization and censorship resistance.
• Hey all! I implemented the beacon APIs for light_client_optimistic_update, light_client_finality_update, and light_client_bootstrap, and added the light client topics to the event stream. The PR has been submitted to Lighthouse. Additionally, I continued exploring what it would take to include tests for the light client RPCs and have a solid understanding of how to achieve it. However, the way I see it being done is not very clean and I wanted to get a second opinion, so I contacted @pawan, one of the Lighthouse networking gurus, to see if there was a better way that I was missing.

Moving forward, the main components left for a working light client server are:
- Putting the best LightClientUpdate per sync committee period into the database when needed.
- Implementing https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/light-client/p2p-interface.md#sync-committee
- Doing full testing using Helios or other light clients. (If anyone knows of a light client that implements the gossip protocol please let me know, as Helios does not yet.)
- Adding a beacon API for LightClientUpdatesByRange.

Of course, previous PRs also need to be merged as well.
• Hey everyone! This week I've continued to progress the LightClientUpdatesByRange implementation as well as started implementing the beacon APIs which coincide with the light client endpoints. There is a bit of a sticking point with adding testing for the light client RPCs, since the beacon node does not support OutboundRequests for the light client RPCs. All existing RPC tests use a function called build_node_pair, so adding support for the OutboundRequests exclusively for testing has proven a bit difficult for me. This week I hope to complete the beacon APIs for light_client/optimistic_update, light_client/finality_update, and light_client/bootstrap with accompanying tests. I'll also be working towards completing the LightClientUpdatesByRange RPC. Below is a running list of PRs submitted to Lighthouse during the EPF program with their merge status listed.

Light Server related PRs:
• Hey everyone! I haven't been making consistent updates, but I have been making progress towards implementing the light client server in Lighthouse. Here are the PRs I've made to the Lighthouse codebase:

Light Server related PRs:
- [Draft] Support the light client updates by range RPC and add LightClientHeader. https://github.com/sigp/lighthouse/pull/3886
- Support the LightClientFinalityUpdate and LightClientOptimisticUpdate RPCs according to the spec. https://github.com/sigp/lighthouse/pull/3849
- Adds light client optimistic update gossip reprocessing. We found that the gossip topics raced with receiving blocks. Adds the parent_root to ReprocessQueueMessage::BlockImported so we can remove blocks from the queue when a block arrives that has the same parent root. We use the parent root as opposed to the block_root because the LightClientOptimisticUpdate does not contain the block_root. https://github.com/sigp/lighthouse/pull/3799
- [Merged] Support light client gossip topics. Implements the light client gossip topics from the spec: light_client_finality_update and light_client_optimistic_update. https://github.com/sigp/lighthouse/pull/3693

Miscellaneous PRs:
• General Thoughts
I will be focusing my efforts around light clients, because they are integral to trustless access to data on Ethereum. To understand how to contribute meaningfully to light client development, I first set out to learn the current state of affairs. I have been reading Ben Edgington's wonderful book Upgrading Ethereum.

Light Client Development Today
The Altair spec outlines the p2p interface for full nodes and light clients to interact: https://github.com/ethereum/consensus-specs/tree/dev/specs/altair/light-client

Client teams have varied in their progress towards implementing the spec.
Lodestar: recently implemented the gossip topics serving the light client data and seems to be up to date with the spec I linked above. https://github.com/ChainSafe/lodestar/commit/03034ff77c70faf63ca81eb09a96e04ecb1bb26e