dknopik

@dknopik

Joined on Jun 30, 2024

  • 2024-11-24 After five months, EPF Cohort 5 and my project, "Network Simulations with Shadow", have finished. In this final development update I will first provide an abstract about my project. Additionally, using my initial project proposal, I will evaluate the results of my project and check if I met the goals I set therein. To conclude, I will report on the challenges I faced, the planned future of my project, and finally my personal experience as a fellow within this cohort. Abstract Testing networking at scale is hard. Using actual networks, you need to either globally distribute your nodes or emulate network parameters on multiple local machines, as conventional single-machine testing tools such as Kurtosis can't handle large test networks. In my project, I developed Ethshadow, a tool for testing Ethereum networks using the simulation tool Shadow. Aside from developing Ethshadow, I used it to run several experiments on client networking, and thereby validated its usefulness as another tool for the Ethereum community. Evaluation of goals
  • 2024-10-27 This week I tried to add support for more clients to Ethshadow, and improved the general code quality. Let's look into each client I attempted. The idea here was to quickly assess whether support can be added without much hassle. Prysm Unfortunately, adding Prysm didn't work. The issue here seems to be a bug in Shadow. Prysm (or rather, the Go standard library) uses a protocol called Netlink to retrieve the local IP address. Shadow seems to have support for that, but due to a bug, parsing the header sent by Go fails. I guess that can be avoided by making Prysm skip the address lookup and use the address I already provide via the CLI, but the goal is to make unmodified clients work. Oh well.
  • 2024-10-21 Hello! This will be a rather short one. Ethshadow is now available at https://github.com/ethereum/ethshadow! It's awesome that my code now lives in the EF's GitHub organization; I certainly did not expect that when I started the project. Pop and I worked a bit more on the documentation; find it here: https://ethereum.github.io/ethshadow/ More work on the documentation is ongoing, and I will also explore support for the other clients this week. The next update will likely be the last where I report actual progress on Ethshadow, as Devcon nears and I will need to start preparing my project presentation at some point. Thanks for reading!
  • This will be a shorter update. I will report on the IDONTWANT simulation results and briefly elaborate on the plan for next week. IDONTWANT sims Last week, I ran a few simulations on IDONTWANT. In short: I ran several simulations to compare average bandwidth between 3/6 and 4/8 blobs (target/max), with different ratios of nodes with and without IDONTWANT support. For this, I modified Lighthouse to be able to enable and disable IDONTWANT aka Gossipsub 1.2.0. IDONTWANT allows nodes to signal to peers that a certain message has already been received and does not need to be gossiped to the node. This allows us to save some bandwidth in some cases, as blobs are rather large. The first attempt surprised me, as higher IDONTWANT usage caused higher bandwidth use! Please check out my Interim IDONTWANT sim report for more details. However, after adjusting the simulation a bit, results looked more promising: IDONTWANT bandwidth simulation results. Unfortunately, in my simulations IDONTWANT does not decrease the bandwidth usage as much as a blob adjustment increases it. Also, this simulation series again shows that it is hard to assess how realistic the simulation is. In this case, it took some modification to have more realistic timing of the messages sent, as well as a reduction of latency between nodes. In conclusion, the biggest weak points of my simulations seem to be (as assessed in the interim report):
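As a rough illustration of the IDONTWANT idea described in the update above, here is a toy Python model. It is only a sketch under simplifying assumptions: the `Node` class, its methods, and `connect_all` are hypothetical names made up for this example, not the Lighthouse or libp2p implementation, and real Gossipsub 1.2.0 has further rules about when IDONTWANT is sent.

```python
class Node:
    """Toy gossip node (hypothetical; not the Lighthouse/libp2p API)."""

    def __init__(self, name, supports_idontwant=True):
        self.name = name
        self.supports_idontwant = supports_idontwant
        self.peers = []
        self.seen = set()    # message ids this node has already received
        self.dont_send = {}  # peer name -> message ids that peer does not want

    def on_idontwant(self, sender, msg_id):
        # Remember that `sender` already has this message.
        self.dont_send.setdefault(sender.name, set()).add(msg_id)

    def on_message(self, sender, msg_id):
        if msg_id in self.seen:
            return
        self.seen.add(msg_id)
        if self.supports_idontwant:
            # Tell every other peer not to bother sending us this message.
            for peer in self.peers:
                if peer is not sender:
                    peer.on_idontwant(self, msg_id)

    def forward_targets(self, msg_id):
        # Peers we would still gossip this message to.
        return [p.name for p in self.peers
                if msg_id not in self.dont_send.get(p.name, set())]


def connect_all(nodes):
    # Fully connect the given nodes (assumption: tiny full-mesh topology).
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]


a, b, c = Node("a"), Node("b"), Node("c")
connect_all([a, b, c])
b.on_message(a, "blob-1")           # b receives the blob from a first
print(c.forward_targets("blob-1"))  # prints ['a']
```

In this toy setup, `c` skips `b` because `b` announced that it already has the message. The announcement itself also costs bandwidth, which is one reason the net effect has to be measured rather than assumed.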
  • 2024-09-29 Hey! At the end of the last update I ended with a cliffhanger on simulations with changes aiming to improve PeerDAS in Lighthouse, so I will get right to that. Afterward, I will discuss my plans for the next week, and as we are approaching the end of the cohort, my plans for the final phase of my project. This is by far my longest update post so far, so buckle up! Final notes on PeerDAS simulations As mentioned last time, PeerDAS did not perform well in my simulations. This seems to match what teams see on devnets. This is somewhat expected, as both the spec and the implementations are still works in progress. Anyway, I ran some MORE simulations with modified parameters and implementation variants to try and fix that. First, I will explain the simulation setup, followed by an explanation of why we see such poor performance. Simulation setup
  • IDONTWANT bandwidth simulation results The simulations are named after the blob target/limit and the percentage of nodes supporting Gossipsub 1.2.0 aka IDONTWANT. All simulations use the same base Lighthouse version, with modifications to be able to disable IDONTWANT. The 4/8 simulations additionally include changes to change the blob target/limit. The table values are the average bandwidth used between the 5th and 15th minute after genesis, over the node category indicated in the table header. The "Diff" column indicates the difference between the averages of datacenter nodes and home nodes. The "Diff" rows indicate the difference between the preceding two rows. Datacenter nodes have 1Gbit/s up and down and no additional latency. Home nodes have 50Mbit/s up and down and 20ms additional latency. The simulation contains 500 datacenter nodes and 500 home nodes. Simulation All nodes DC nodes
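The table values described above (average bandwidth between the 5th and 15th minute after genesis, plus a "Diff" between node categories) could be computed roughly as in the following Python sketch. The sample format, the function names, the half-open window, and the sign of the difference (datacenter minus home) are all assumptions made for illustration; the original does not specify them.

```python
def average_bandwidth(samples, start_s=5 * 60, end_s=15 * 60):
    """Average bandwidth over a time window.

    `samples` is an iterable of (seconds_since_genesis, bandwidth) pairs.
    Assumption: the window is half-open, [start_s, end_s).
    """
    window = [bw for t, bw in samples if start_s <= t < end_s]
    if not window:
        raise ValueError("no samples inside the window")
    return sum(window) / len(window)


def category_diff(dc_avg, home_avg):
    # Assumption: "Diff" is datacenter average minus home average.
    return dc_avg - home_avg


# Made-up sample data, only to show the call shape.
dc_samples = [(0, 100.0), (300, 10.0), (600, 20.0), (899, 30.0), (900, 1000.0)]
home_samples = [(300, 8.0), (600, 12.0)]

dc_avg = average_bandwidth(dc_samples)
home_avg = average_bandwidth(home_samples)
print(dc_avg, home_avg, category_diff(dc_avg, home_avg))
```

Samples before minute 5 and from minute 15 onward are ignored, which keeps startup effects right after genesis out of the average.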
  • Ran six sims: 3/6 blobs, 0% IDONTWANT-compatible nodes: 16.1 KiB/s; 3/6 blobs, 20% IDONTWANT (comparable to mainnet's ~15%): 17.7 KiB/s; 3/6 blobs, 100% IDONTWANT: 18.3 KiB/s; 4/8 blobs, 0% IDONTWANT: 19.4 KiB/s; 4/8 blobs, 20% IDONTWANT: 20.8 KiB/s; 4/8 blobs, 100% IDONTWANT: 21.6 KiB/s. Bandwidth suspiciously low; are blobs snappy-compressed or truncated on the wire if remaining bytes are 0? Will try to fix. IDONTWANT seems to RAISE bandwidth requirements?
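To put the interim numbers above into perspective, the relative changes can be worked out with a few lines of Python. The bandwidth figures are taken from the notes; the dictionary layout and the `pct_change` helper are just illustrative names.

```python
# Average bandwidth in KiB/s, keyed by (blob target/max, % IDONTWANT nodes).
RESULTS_KIB_S = {
    ("3/6", 0): 16.1, ("3/6", 20): 17.7, ("3/6", 100): 18.3,
    ("4/8", 0): 19.4, ("4/8", 20): 20.8, ("4/8", 100): 21.6,
}


def pct_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100


# Effect of IDONTWANT adoption within each blob setting.
for blobs in ("3/6", "4/8"):
    base = RESULTS_KIB_S[(blobs, 0)]
    for share in (20, 100):
        change = pct_change(base, RESULTS_KIB_S[(blobs, share)])
        print(f"{blobs} blobs, 0% -> {share}% IDONTWANT: {change:+.1f}%")

# Effect of raising blobs from 3/6 to 4/8 at the same IDONTWANT share.
for share in (0, 20, 100):
    change = pct_change(RESULTS_KIB_S[("3/6", share)], RESULTS_KIB_S[("4/8", share)])
    print(f"{share}% IDONTWANT, 3/6 -> 4/8 blobs: {change:+.1f}%")
```

In these interim results every change is positive, which matches the observation in the notes that IDONTWANT seemed to raise bandwidth rather than lower it.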
  • 2024-09-16 I want to start this week's update with a quote from last week's update: This again shows that careful preparation is important 🙃 The past two weeks I analyzed the data I talked about in the last update, ran performance tests and ran more simulations (and am waiting for even more to finish as I'm writing this).
  • 2024-09-02 Hello! Over the past weeks I finally ran simulations on large servers - and got sick :( So while I have finally gathered data (and a large AWS bill), due to the illness I do not have any interesting conclusions for you yet. The proper writeup of the results will follow. Let me tell you about my experiment anyway: First Experiment The first experiment consists of six simulations with 45 "simulated" minutes each. In the experiment, I want to figure out the effects of different PeerDAS parameters on the network.
  • 2024-08-19 This week I again ran into some issues with the simulation... It never ends. However, now I believe we really are in a good spot to start testing. Let me tell you about the issues I faced and my plan for the next week! Blobssss For simulations with PeerDAS, it is rather important to have some blob transactions (duh). First, I tried integrating blob-spammer by EthPandaOps, which is also used in Kurtosis. While it spammed blocks well enough, it unfortunately has some strange performance issues which block the simulation for a few seconds every slot. That is unacceptable. This is why I implemented blobssss (Blob Spammer for Simulations, Stupid Simple), which only has the most basic features I require. At first, it did not work properly (which was my bad), so I tried a third solution: tx-fuzz by Marius van der Wijden, for which he implemented a basic blob spammer. During my first few times trying it, it crashed approximately a quarter of the Reth instances in my simulation with segmentation faults, even though the spammer only connected to one :cold_sweat:. Figuring out why this crash occurred was very hard - the core dumps were very misleading, as the techniques used by Shadow's injected shim masked the actual issue at first, so I spent a lot of time investigating Shadow. The actual culprit was a known miscompilation issue of Reth in recent versions of Rust :face_palm: - omitting the flag for optimizations targeting my CPU fixed it. So much wasted time. Oh well. Trying again with tx-fuzz, it still seemed a bit buggy, as no blob transactions were spammed, so in the end, I fixed my own blob spammer, and will go ahead with that one.
  • 2024-08-12 This week, I finished Phase 1 of my project! Not much to tell about the last week though. As written last week, I went ahead and rewrote the config generation. It was more effort than anticipated, but I think I now have a pretty flexible system to specify simulations. There is only a bit of cleanup and documentation left to do, but I think I will hold off on that until I have used it myself a bit - maybe I will find some room for further improvements. Next week Now that Phase 1 is done, onwards to Phase 2! I will start with drafting up experiments for PeerDAS in Lighthouse, and as soon as they are shown to be viable and deliver reasonable data on a small scale, I want to run the first large-scale simulations. That is my goal for the upcoming week - but, as always, unexpected issues might cause me to take longer. Hopefully I can show you some nice graphs next week!
  • 2024-08-05 It works! Early this week, I was finally able to run a full Ethereum network simulation, this time with all nodes following the chain! There wasn't any actual issue left, I just misconfigured Lighthouse with the same fork version for Deneb and Electra, which resulted in bad behaviour - see this issue. :crab: RIIR :crab: Of course, my first instinct is to Rewrite it in Rust! This is actually warranted though, as the current scripts are already too messy for my taste. And since I want to make this even more flexible and configurable, it is time to support a more sophisticated configuration format, instead of environment variables. Doing this in Shell would be very painful. I decided on using Rust, as I am most comfortable with that language. Another sensible choice would have been Python.
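The misconfiguration mentioned above (the same fork version for Deneb and Electra) is the kind of mistake a small sanity check can catch before a simulation starts. The following Python sketch is an illustration only: `duplicate_fork_versions` is a hypothetical helper, not part of Ethshadow or Lighthouse, and the version values are made up. It assumes the config is available as a dictionary whose fork version keys end in `_FORK_VERSION`, as in consensus-layer config files.

```python
from collections import defaultdict


def duplicate_fork_versions(config):
    """Return {version: [keys]} for fork versions used by more than one key."""
    keys_by_version = defaultdict(list)
    for key, value in config.items():
        if key.endswith("_FORK_VERSION"):
            # Normalize so that "0xAB" and "0xab" count as the same version.
            keys_by_version[str(value).lower()].append(key)
    return {v: keys for v, keys in keys_by_version.items() if len(keys) > 1}


# Made-up values for illustration; Deneb and Electra collide on purpose.
config = {
    "CAPELLA_FORK_VERSION": "0x40000038",
    "DENEB_FORK_VERSION": "0x50000038",
    "ELECTRA_FORK_VERSION": "0x50000038",
    "SECONDS_PER_SLOT": 12,
}

for version, keys in duplicate_fork_versions(config).items():
    print(f"fork version {version} is shared by: {', '.join(keys)}")
```

An empty result means every fork has its own version; any entry points directly at the keys that need fixing.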
  • 2024-07-28 Apologies for not posting an update last week - I was hoping to be able to report some success, but was not quite there yet. Even now, the network simulation does not quite work yet. In my project proposal I wrote that I plan three to four weeks to get the simulations working, so while I am still on schedule, I really hope to get the simulation running soon:tm:, especially since I still need to implement some features such as convenient metrics extraction for my experiments. However, I still feel like I am progressing well, and in this update, I will tell you about the issues I fixed in the past two weeks and more about the current state. But first, a small refresher about my project. Blockchain or kernel dev? In my project, I want to simulate Ethereum networks using Shadow, which allows us to simulate with unmodified client software (at least in theory). This works by intercepting system calls and reimplementing them as needed to route packets between clients and to keep the simulation deterministic.
  • This week was amazing, but only indirectly productive due to EthCC. :sweat_smile: I met a lot of amazing people though, especially my fellow Fellows! It was a really motivating experience. Presentation and Project Proposal I held my presentation at EPF day (slides & video, right at the start) and then finalized my project proposal during the week. Next week Well, it's time to get going with the real work! The next week will probably consist of troubleshooting minor and not-so-minor issues with ethereum-shadow. Let's see if I get it working properly within a week. See ya!
  • Whew, what a week! Reality quickly settled in, and I realized that my project needs to be ready by the end of this week. Gulp. Last week I focussed on DVT and MaxEB, claiming that I would likely do a project involving them. So let's see what I ended up deciding on... And now, for something completely different... Network simulations! Early this week, I had a call with AgeManning from Sigma Prime, and he brought my attention to his project idea about network simulations with Shadow^idea. Basically, the idea is to use Shadow^shadow to run some network simulations to improve client networking. There already is an existing (but currently broken) repo^pop to run Shadow with Geth and Lighthouse, so my project will be to get that up and running, and then use that to run experiments and improve specs, Lighthouse, and maybe Reth. I won't go into details right now, and invite you to read my project proposal instead (as soon as it's done). Next week
  • After two rough weeks in which I could not properly focus on EPF for personal reasons, I can finally post my first update. Hi, I'm Daniel! This week, I mostly focussed on Distributed Validator Technology (DVT) and Maximum Effective Balance (MaxEB). Distributed Validator Technology I looked a bit into DVT; however, the specification repos^1 I found have not been updated in a while, so I am unsure if they represent the current state. I also investigated the approaches SSV^ssv and Charon^charon. While this topic is very interesting, I am unsure if it is considered "Core Development", as they build on top of Ethereum (or more specifically, on top of CL APIs). However, it would be nice to have a common spec and multiple clients available for DVT. Maximum Effective Balance I studied the MaxEB EIP-7251^3 (and the Electra CL specs^5), and it ended up being more complex than I anticipated. As it has been included in Pectra for a while, I expect client teams to have the implementation mostly ready. On the other hand, the spec tracking issue^4 seems to indicate that there are quite a few open questions and spec tests to consider, so this might still be a good topic to get involved with.