### Final EPF Development Update (Cohort 3)

This is the final report of my project [`On-chain analysis of staking pools attestations`](https://github.com/eth-protocol-fellows/cohort-three/blob/master/projects/On-chain-analysis-of-staking-pools.md). It studies staking pools and how they affect Ethereum mainnet.

### Project Summary

My main goal was to compare the performance of validators controlled by staking pools with that of validators that are not part of any pool. A notable difference between the two groups would suggest that the clustering of many nodes is affecting Ethereum mainnet.

When I chose this project, I was planning to use the existing tools, information and data currently available on staking pools to get a head start. After exploring different platforms for a week and talking with various teams building on Ethereum, it seemed that this type of data was either not easily available or was private information yet to be released. A brief analysis of what I could find left me with the unsettling feeling that it might contain discrepancies and inaccuracies, so performing analysis on that data would not have made much sense. I therefore divided my main project into 2 parts.

1. Produce a dataset by identifying and labelling the network's validators with their staking pools, with few false positives and high accuracy. [`Finished`]
   - My goal here was to associate pools with their public keys and identify more than 70% of the network, which I was able to do.
   - I faced a couple of challenges along the way. Many staking pools have a complex layer built on top of the Eth deposit contract, so I had to manually study and deep dive into the protocols to identify their associated addresses and label them with the right pool. Lido and Rocket Pool fell into this category.
   - I also created a [`repo`](https://github.com/abdulsamijay/Eth-deposit-indexer) for gathering the data. I plan to make the dataset publicly available and display it through a dashboard.
   - I would like to mention a few key approaches I took to identify and verify potential pools.
     - Segregating the transactions that deposited directly to the deposit contract from the ones that deposited through some other smart contract.
     - Reverse-sorting the dataset by total amount deposited and working downwards to identify the big players. This produces a group of direct and a group of indirect depositors to the deposit contract.
     - To verify a particular address, I dived deep into the documentation and extracted data from the protocols' smart contracts (i.e. Lido, Rocket Pool and a few smaller pools), then marked those addresses in the dataset. I also used a scraper for Etherscan labels to extract addresses and did some manual analysis (e.g. checking smart contracts) before marking an address as a staking pool.
     - After marking everything, the remaining addresses were assumed to be solo stakers.
2. Analyze the selected pools by statistically measuring their performance. [`Ongoing`]
   - This portion of the project is yet to be completed. I will probably work on it post-cohort to set a basic ground for analysis, or possibly collaborate with Rated Network on integrating it directly into their dashboard.

### Future of the project

Based on its current status, this project is something that many information aggregators could adopt to display stats about validators, staking pools and network health in general. I see it as something that could be integrated into `rated.network`, `nansen.ai` or any similar platform. One interesting extension: once pools are identified, the diversity within each pool could be tracked along with its performance to gain further insight into the network's behaviour.

### Self-Evaluation of the EPF

The Ethereum fellowship has been a great opportunity for me to learn more deeply about Ethereum, with direct access to mentors and core devs. I do want to emphasize that you have to be `self-directed` and `consistent`.
I would highly recommend the fellowship to anyone who wants to start with core development and get a sense of how disruption is made.

### Technical Progress

Personally, I have learned a great deal, not only about core Ethereum but also about generating and manipulating big datasets and handling continuous streams of data over RPC with no data loss. My analytical skills have also improved: because of the nature of my project, I had to recap Probability & Statistics 101.

### Closing

This program was very well organized. I would especially like to thank Mario and Josh for their efforts in making this experience remarkable. They helped us with almost anything, from connecting to nodes to resolving roadblocks. I would also like to thank my mentors `Sproul` and `Potuz`, who have been extremely helpful during the project. I am very grateful to have been a part of this and I look forward to what comes next.
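
To make the labelling steps from the Project Summary a bit more concrete, here is a minimal Python sketch of the idea. It is not the code from the Eth-deposit-indexer repo: the `Deposit` record, the helper names and the example label are hypothetical, and it assumes the deposit transactions have already been indexed into memory. It also simplifies an indirect deposit to "the transaction's `to` address is not the deposit contract". Only the deposit contract address is the real mainnet one.

```python
from collections import defaultdict
from dataclasses import dataclass

# Mainnet deposit contract, lowercased for case-insensitive comparison.
DEPOSIT_CONTRACT = "0x00000000219ab540356cbb839cbe05303d7705fa"
SOLO = "solo-staker"


@dataclass(frozen=True)
class Deposit:
    """Hypothetical record for one indexed deposit."""
    pubkey: str      # validator public key from the deposit event
    tx_from: str     # address that sent the transaction
    tx_to: str       # address the transaction was sent to
    amount_eth: int  # deposited amount in whole ETH (simplification)


def depositor_of(d):
    """Step 1: split direct deposits from deposits made through another contract."""
    if d.tx_to.lower() == DEPOSIT_CONTRACT:
        return d.tx_from.lower(), "direct"
    # Assumption: the intermediary contract stands in for the pool.
    return d.tx_to.lower(), "indirect"


def rank_depositors(deposits):
    """Step 2: total deposits per depositor, largest first."""
    totals = defaultdict(int)
    for d in deposits:
        address, _kind = depositor_of(d)
        totals[address] += d.amount_eth
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def label_pubkeys(deposits, known_labels):
    """Steps 3 and 4: apply verified pool labels; everything else is a solo staker."""
    known = {address.lower(): name for address, name in known_labels.items()}
    return {d.pubkey: known.get(depositor_of(d)[0], SOLO) for d in deposits}


def pool_coverage(labels):
    """Fraction of public keys attributed to some pool."""
    if not labels:
        return 0.0
    return sum(1 for name in labels.values() if name != SOLO) / len(labels)
```

In this sketch, `known_labels` plays the role of the manually verified address list (protocol contracts, Etherscan labels), and `rank_depositors` gives the order in which addresses would be reviewed.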