Final EPF Development Update (Cohort 3)

This is the final report of my project, "On-chain analysis of staking pool attestations," which studied staking pools and how they impact Ethereum mainnet.

Project Summary

My main goal was to compare the performance of validators controlled by staking pools against validators that are not part of any pool. A notable difference between the two groups would suggest that the clustering of many nodes is affecting Ethereum mainnet. When I chose this project, I planned to get a head start by using the tools, information, and data on staking pools that already exist. After exploring this for a week on different platforms and talking with various teams building on Ethereum, it turned out that this kind of data was either not easily available or was private information yet to be released. A brief look at what was available also left me with the unsettling feeling that it might contain discrepancies and inaccuracies, so performing analysis on it would not make much sense. I therefore divided my main project into two parts.

1. Produce a dataset by identifying and labelling the network with its different staking pools, with low false positives and high accuracy. [Finished] Here my goal was to associate pools with their public keys and to identify >70% of the network, which I was able to do.
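The comparison at the heart of the project can be sketched as follows. This is an illustrative sketch only: the validator records, pool labels, and effectiveness scores are made up, not the project's actual data or pipeline.

```python
from statistics import mean

# Hypothetical per-validator records: index, pool label ("solo" = not in any pool),
# and an attestation-effectiveness score in [0, 1]. Real data would come from the indexer.
validators = [
    {"index": 1001, "pool": "PoolA", "effectiveness": 0.97},
    {"index": 1002, "pool": "PoolA", "effectiveness": 0.95},
    {"index": 1003, "pool": "solo",  "effectiveness": 0.99},
    {"index": 1004, "pool": "solo",  "effectiveness": 0.93},
]

def mean_effectiveness(records, pooled: bool) -> float:
    """Mean effectiveness for pooled validators (pool != 'solo') or solo ones."""
    scores = [r["effectiveness"] for r in records
              if (r["pool"] != "solo") == pooled]
    return mean(scores)

pooled_avg = mean_effectiveness(validators, pooled=True)
solo_avg = mean_effectiveness(validators, pooled=False)
gap = pooled_avg - solo_avg  # a large gap would suggest clustering matters
```

A real analysis would of course need many validators per group and a significance test, not a bare difference of means.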
2/28/2023 EPF Dev Update #9

This past week has been a very busy, debugging-heavy, and informative one. I have finalized the indexer for retrieving data and shaped its output into a more useful form: it now returns data in a format that is easier to use for analysis. I have been in touch with the author of a repository (https://github.com/alrevuelta/eth-metrics/) that statistically determines and associates staking pools with their addresses. However, the conversation revealed that the data is becoming outdated, and manually analyzing the hard-coded addresses in that repo gave me doubts about the association of some addresses with specific pools, Binance for example. We are currently discussing this and will see where it goes. Another important point alrevuelta made is that asking pools on Discord to confirm their addresses is not a good approach, as asking them to verify apparently does not work.

While verifying the authenticity of my dataset, I found an anomaly between my data and the validator indexes displayed on the Beaconcha.in website. My dataset matches Etherscan's but differs from what Beaconcha.in shows. After diving into the explorer's open-sourced codebase, I discovered a section where it directly queries the Lighthouse client for data retrieval, so the anomaly seems to be an issue not with the explorer but with the Lighthouse client. This still needs to be verified; I will probably get in touch with the team later.

I also thought of an idea for associating validators with staking pools, similar to how Uniswap maintains its token list. However, this approach relies on the optimistic assumption that the associated data is correct, and we might also need to attach incentives for contributors to increase authenticity. This would be very beneficial for the community as a whole; I might pursue it after the cohort.
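The community-maintained list idea above could take a shape similar to Uniswap's token list: a versioned JSON document mapping pools to their deposit addresses and validator public keys. The schema, field names, and entries below are hypothetical, just to make the idea concrete.

```python
import json

# Hypothetical "pool list" schema, loosely modelled on Uniswap's token-list format.
# All names, addresses, and keys are illustrative, not a real published list.
pool_list = {
    "name": "Example Staking Pool List",
    "version": {"major": 1, "minor": 0, "patch": 0},
    "pools": [
        {
            "name": "ExamplePool",
            "depositAddresses": ["0x0000000000000000000000000000000000000001"],
            "validatorPubkeys": ["0xaaaa...example"],  # truncated, illustrative
        }
    ],
}

def validate_pool_list(doc: dict) -> bool:
    """Minimal sanity check: required top-level and per-pool fields exist."""
    if not {"name", "version", "pools"} <= doc.keys():
        return False
    return all({"name", "depositAddresses"} <= p.keys() for p in doc["pools"])

# Round-trip through JSON, as a contributor-submitted file would be.
assert validate_pool_list(json.loads(json.dumps(pool_list)))
```

A versioned schema like this makes it possible to review contributions as pull requests, which is where contributor incentives would plug in.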
Currently, I am in the phase of classifying the staking pools and have a few approaches I will follow. This may deviate a bit from the main project, since this groundwork needs to be laid before the analysis can be accurate. I want to note that if we do not have enough data, we may have to conclude that our analysis is incomplete or inaccurate. However, if I see the complications growing, I will shortlist the major staking pools, accept a few false positives, and continue from there. This will not cover the performance of all pools, but analyzing a few major ones will still give us valuable insights while being more practical. Here is how I plan to label the data:
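One way the labelling step could look (a sketch with hypothetical addresses and pool names, not the actual classifier): map each depositor address to a known pool, fall back to "unknown", and measure what fraction of the network is covered against the >70% target.

```python
# Hypothetical mapping from deposit 'from' addresses to pool labels.
# In practice this table would be built from eth-metrics data plus manual verification.
KNOWN_POOLS = {
    "0xabc0000000000000000000000000000000000001": "PoolA",
    "0xabc0000000000000000000000000000000000002": "PoolB",
}

def label_address(addr: str) -> str:
    """Return the pool label for a depositor address, or 'unknown'."""
    return KNOWN_POOLS.get(addr.lower(), "unknown")

def coverage(addresses: list) -> float:
    """Fraction of addresses labelled with a known pool (target: > 0.70)."""
    labelled = sum(1 for a in addresses if label_address(a) != "unknown")
    return labelled / len(addresses)

deposits = [
    "0xABC0000000000000000000000000000000000001",  # case-insensitive match
    "0xabc0000000000000000000000000000000000002",
    "0xdef0000000000000000000000000000000000003",  # unlabelled
]
# coverage(deposits) is 2/3 here, i.e. below the >70% target, so more labels are needed.
```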
1/23/2023 EPF Dev Update #8

This past week, I have been working on indexing data and storing it in MongoDB. The data I am collecting is transaction hashes from the blockchain, specifically their 'from' and 'to' fields. Collecting this data took longer than expected, as there were 286k transaction hashes to gather, but I was able to collect everything needed. In addition to the data indexing, I also worked on developing the front-end for retrieving data. This includes a search bar so users can easily look up specific data, as well as general statistics that can be displayed on the front-end. With the indexer and front-end development now complete, I will be working on integrating the two; I anticipate this will take one day. Once the indexer and front-end are integrated, I will compile the data, which should also take one day.

Next steps

After the integration of the indexer and front-end is completed, I will conduct thorough testing to ensure everything works properly. This includes checking that data is being correctly indexed and stored in MongoDB, as well as testing that the search bar works as expected.
Label & associate the addresses retrieved from the data with staking pools.
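The indexing step above, pulling 'from'/'to' fields out of transactions and shaping them into documents for MongoDB, might look roughly like this. A sketch: the field names mirror Ethereum JSON-RPC transaction objects, the values are dummies, and the actual pymongo `insert_many` call is left as a comment so the snippet stays self-contained.

```python
# Raw transactions as returned by an Ethereum JSON-RPC provider (illustrative values).
raw_txs = [
    {"hash": "0x01", "from": "0xAlice", "to": "0xDepositContract", "value": 32 * 10**18},
    {"hash": "0x02", "from": "0xBob",   "to": "0xDepositContract", "value": 32 * 10**18},
]

def to_documents(txs: list) -> list:
    """Keep only the fields needed for labelling; these become MongoDB documents.

    Using the tx hash as _id makes re-indexing idempotent (duplicate inserts fail).
    """
    return [{"_id": t["hash"], "from": t["from"], "to": t["to"]} for t in txs]

docs = to_documents(raw_txs)
# With pymongo one would then run something like: db.deposits.insert_many(docs)
```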
1/17/2023 EPF Dev Update #6

I went through the resources Mario shared last week and found a useful repository at https://github.com/alrevuelta/eth-metrics that has a list of staking pool addresses. To ensure data accuracy and reduce false positives, I decided to develop an indexer for the deposit contract on Ethereum. I have created a repository for this project, which can be found here. In addition, I explored different APIs for data retrieval and took the time to verify the events being recorded: I compared the first batch of data in the MongoDB database against the data on the Ethereum blockchain to make sure everything was consistent and correct.

Next steps

Continue working on the event indexer, refining it as needed, and finalize the data.
Get in touch with staking pool teams and the eth-metrics devs to cross-verify the staking pool addresses. (This might take time.)
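The batch cross-check described above can be reduced to a simple diff over the two record sets. This is a sketch with dummy records; in reality one side would come from the MongoDB batch and the other from an RPC node or block explorer.

```python
def find_mismatches(indexed: list, onchain: list) -> list:
    """Return hashes whose indexed record disagrees with, or is missing from, chain data."""
    chain_by_hash = {t["hash"]: t for t in onchain}
    bad = []
    for rec in indexed:
        ref = chain_by_hash.get(rec["hash"])
        if ref is None or ref["from"] != rec["from"] or ref["to"] != rec["to"]:
            bad.append(rec["hash"])
    return bad

# Dummy first batch: what the indexer stored vs. what the chain reports.
indexed_batch = [{"hash": "0x01", "from": "0xAlice", "to": "0xDeposit"}]
chain_batch   = [{"hash": "0x01", "from": "0xAlice", "to": "0xDeposit"}]
assert find_mismatches(indexed_batch, chain_batch) == []  # batch is consistent
```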
1/10/2023