### EPF Update 10

This past week I worked on finalizing and refining the dataset of staking pools across the Ethereum network. I focused on `Lido Finance`, `Binance`, `Rocket Pool` and a few other smaller pools.

Starting with `Lido` and `Rocket Pool`: both staking pools have a complex system for identifying their validators and the addresses through which they deposit to the Eth2 deposit contract. My initial approach to identifying these pools was the `etherscan-label` feature, which gives a starting point. These labels cannot be treated as authentic or reliable, though, as anyone can label any address on Etherscan.

For both Lido and Rocket Pool, I manually extracted a few addresses by quickly scanning through the displayed addresses and by reverse-sorting the addresses in my dataset for which the '`To`' address was not the deposit contract.

Then I dived deeper into the contracts for Lido and Rocket Pool. Lido has a list of node operators (30 at the time of writing) which deposit to the Eth2 contract. Their smart contract has a function that takes two parameters:

1. The index of the node operator.
2. The index of the deposit.

This returns the node operator details, from which the addresses can be extracted by incrementing the index. I quickly wrote a script to retrieve this information; the resulting addresses have very few false positives.

Note: In my dataset, I identified a pattern commonly used by major staking pools: they have a system of smart contracts that accepts Eth deposits and then batch-submits the 32 ether deposits to the deposit contract. So the 'To' address will not always be the deposit contract. This will also help us identify smaller pools.

To retrieve the Rocket Pool addresses, I dived deep into their documentation and the `rocketpool-go-sdk` repo and studied how their contracts work. It was a bit tricky to understand as `Go` is not my cup of tea, but feeding relevant pieces of code to ChatGPT made life a lot easier.
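
As an aside on the Lido extraction described above, the index-incrementing loop can be sketched roughly as follows. This is only an illustration, not the actual script: `fetch_key` is a hypothetical stand-in for the contract call that takes the node operator index and the deposit index, and the assumption that it returns `None` once the index runs past the last entry is mine. The fake registry exists only so the sketch runs without a node connection.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical signature for the contract call: (operator_index, key_index) -> record,
# or None when key_index is past the operator's last entry (an assumption).
FetchKey = Callable[[int, int], Optional[dict]]


def iter_operator_keys(
    fetch_key: FetchKey, operator_count: int
) -> Iterator[Tuple[int, int, dict]]:
    """Walk every operator and keep incrementing the key index until it runs out."""
    for operator_index in range(operator_count):
        key_index = 0
        while True:
            record = fetch_key(operator_index, key_index)
            if record is None:
                break  # no more entries for this operator
            yield operator_index, key_index, record
            key_index += 1


# In-memory stand-in for the on-chain registry, purely for illustration.
FAKE_REGISTRY: Dict[int, List[str]] = {0: ["0xaa", "0xab"], 1: ["0xba"], 2: []}


def fake_fetch_key(operator_index: int, key_index: int) -> Optional[dict]:
    """Return a made-up record for the fake registry, or None when out of range."""
    keys = FAKE_REGISTRY.get(operator_index, [])
    if key_index < len(keys):
        return {"pubkey": keys[key_index]}
    return None
```

Calling `list(iter_operator_keys(fake_fetch_key, operator_count=3))` yields one tuple per entry in the fake registry. In a real run, `fetch_key` would wrap the actual contract call and `operator_count` would be the number of node operators.
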
Ultimately, I was able to pull all of the minipools and the 2 deposit addresses that Rocket Pool mainly uses to deposit to the Eth2 contract. Identifying the minipools took a bit of effort, and later I realized it wasn't needed :/. Anyway, I learned a lot from this.

Note: Minipools cannot be counted as part of Rocket Pool, as they are operated independently by users and not by Rocket Pool. Rocket Pool could use them to operate on users' behalf, but I believe that is highly unlikely because they have separate contracts specifically for this functionality.

To identify the Binance pools, I switched back to the reverse-sorted dataset and manually identified a few addresses that were directly funded by Binance, making it evident that these pools are operated by Binance. Binance issues the bEth token, called BinanceEth, on the Binance blockchain.

One thing I struggled with was the pools for `Coinbase`. I could not identify them directly and am currently talking with some teams about marking these pools. Coinbase mints `cbETH` (CoinbaseEth) in exchange for Eth deposited through the Coinbase app. I couldn't find any deposits or addresses related to Coinbase or the `cbETH` contract on Ethereum that deposited directly to the Eth2 contract. I have also checked other sources like beaconcha.in and Dune Analytics; their stats show Coinbase holding roughly 10-13% of the network, which I am skeptical of. This is yet to be resolved and won't stop us from proceeding further.

If I were to quantify the share of each pool in the network according to my dataset (i.e. data up to 25th December, 2022), which has exactly `503,581` validators, we get the following numbers.

1. Lido Finance has `144,585` validators & makes up `28.71%` of the network.
2. Kraken has `38,605` validators & makes up `7.66%` of the network.
3. Binance has `30,878` validators & makes up `6.13%` of the network.
4. Staked.Us has `13,586` validators & makes up `2.70%` of the network.
5. RocketPool has `10,349` validators & makes up `2.05%` of the network.
6. Stakefish has `12,377` validators & makes up `2.45%` of the network.
7. Bitcoin Suisse has `12,376` validators & makes up `2.45%` of the network.
8. Figment has `12,109` validators & makes up `2.40%` of the network.
9. Abyss Finance has `4,284` validators & makes up `0.85%` of the network.
10. Celsius Network has `4,943` validators & makes up `0.98%` of the network.

I also plan to build a small dashboard or Python script that shows the above data in a visual format such as a pie chart, along with the number of addresses each pool is currently associated with.

Incorporating the Coinbase validators, roughly estimated at around `12%`, the minipools I identified earlier (10,131, about `2%`) and roughly `4%` of smaller pools gives us a total of about `74.37%`, which can be safely marked as low false-positive staking pools on Ethereum mainnet. This means the remaining `25.63%` of validators are solo stakers.

### Next steps

Gathering a dataset with few false positives and higher accuracy was a tedious task, since the identification process involved manual analysis. Now that the pools are identified, I can move on to the next phase: statistically analyzing the performance of solo stakers vs pools and whether they cause anomalies on mainnet.
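
As a cross-check of the share figures listed earlier, here is a minimal Python sketch that recomputes each pool's percentage from its validator count and derives the pooled and solo-staker totals. The validator counts and the three estimated shares (Coinbase, minipools, smaller pools) are taken from this post; the variable and function names are just illustrative. Because it works from unrounded values, the results can differ in the last digit from the percentages quoted in the list.

```python
TOTAL_VALIDATORS = 503_581  # dataset size as of 25th December, 2022

# Validator counts per identified pool, as listed in this post.
POOL_VALIDATORS = {
    "Lido Finance": 144_585,
    "Kraken": 38_605,
    "Binance": 30_878,
    "Staked.Us": 13_586,
    "RocketPool": 10_349,
    "Stakefish": 12_377,
    "Bitcoin Suisse": 12_376,
    "Figment": 12_109,
    "Abyss Finance": 4_284,
    "Celsius Network": 4_943,
}

# Rough estimates from the text, already expressed as percentages.
ESTIMATED_SHARES = {
    "Coinbase": 12.0,
    "Rocket Pool minipools": 2.0,
    "Smaller pools": 4.0,
}


def share_percent(validators: int, total: int = TOTAL_VALIDATORS) -> float:
    """Share of the network, in percent, held by a given number of validators."""
    return 100.0 * validators / total


def summarize() -> dict:
    """Combine counted and estimated shares; solo stakers are the remainder."""
    identified = sum(share_percent(v) for v in POOL_VALIDATORS.values())
    estimated = sum(ESTIMATED_SHARES.values())
    pooled = identified + estimated
    return {
        "identified": identified,
        "estimated": estimated,
        "pooled": pooled,
        "solo": 100.0 - pooled,
    }


if __name__ == "__main__":
    for name, count in POOL_VALIDATORS.items():
        print(f"{name:16s} {count:>8,d} {share_percent(count):6.2f}%")
    for key, value in summarize().items():
        print(f"{key:10s} {value:6.2f}%")
```

The same dictionaries could later feed the pie chart mentioned above, since the labels and shares are already in one place.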