# EPF5 Week 9 & 10 Updates

Week 9 was really busy at my job, so I wasn't able to do much fellowship work. I think I made up for the absence in week 10.

tl;dr:

- Found some low-hanging improvements in [`ethereum_ssz`](https://github.com/sigp/ethereum_ssz), [`milhouse`](https://github.com/sigp/milhouse), [`ssz_types`](https://github.com/sigp/ssz_types), and [`grandine/ssz`](https://github.com/grandinetech/grandine/tree/develop/ssz)
- Wrote a custom `criterion::Measurement` type to track allocated memory during runtime. This is a more relevant measurement for serialization than `WallTime`, which is highly affected by noise.

## Benchmarking is Tricky Business

I started the week looking for easy improvements to existing crates. I thought I had found one when I noticed the sigp and grandine crates used `bytes.chunks` instead of `bytes.chunks_exact`. The latter should usually [run faster](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks_exact) than the former, so I added `_exact` and tried benchmarking it.

What should've been a layup led me down a rabbit hole about the non-determinism of wall time and how criterion benchmarks can be affected by noise. Long story short, depending on the computation, you might get vastly different runtimes when running a bench over and over again, even with no changes. The difference sometimes even passes the noise threshold, so your benchmarks might seem to have improved or regressed with no change in the code at all.

Fortunately, serialization isn't actually bottlenecked by the time/difficulty of performing encoding/decoding. Rather, the main bottleneck is memory allocation. This means using tools like dhat-rs to track heap allocations is sufficient to benchmark and view improvements/regressions. Ideally, one could use a [custom measurement](https://bheisler.github.io/criterion.rs/book/user_guide/custom_measurements.html) to track memory allocation during the lifetime of a program and display that in criterion.
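As a sketch of the underlying idea (not criterion's actual `Measurement` trait wiring — just the allocation-tracking part), a counting global allocator built on `std::alloc::GlobalAlloc` gives a deterministic byte count for a region of code:

```rust
// Sketch: a global allocator that counts allocated bytes. Deterministic,
// unlike wall time — the same workload reports the same number every run.
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Running total of bytes handed out by `alloc`.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static ALLOCATOR: CountingAlloc = CountingAlloc;

fn main() {
    let before = ALLOCATED.load(Ordering::Relaxed);
    // `(0..1000u64)` has an exact size hint, so this is a single 8000-byte
    // allocation; the delta must be at least that.
    let v: Vec<u64> = (0..1000u64).collect();
    let after = ALLOCATED.load(Ordering::Relaxed);
    assert!(after - before >= 8 * 1000);
    drop(v);
}
```

A custom criterion measurement could sample this counter before and after each iteration instead of reading a clock.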
This would be deterministic, instead of using a (potentially noisy) wall-time measurement. I'm going to work on this next week. I think a custom measurement will help streamline how I benchmark these crates, and also let me leverage a lot of the scaffolding that criterion provides. This way, I won't have to build a benchmarking suite from scratch!

## Improvements

Found two possible improvements to existing ssz crates (the sigp and grandine ones). No PRs yet, but they're coming soon™. The changes were tested on `VariableList` and `FixedVector` from the `ssz_types` crate while decoding a `SignedBeaconBlock`.

The code:

![Screen Shot 2024-08-18 at 7.04.13 PM](https://hackmd.io/_uploads/HkFx0lgsC.png)

- `chunks_exact` had a negligible effect on runtime vs `chunks`, and zero effect on memory allocation.
- Saved ~20% on allocated bytes by using `map_while` to transform an iterator of `Result<T>` into an iterator of `T`, instead of `itertools::process_results`. Tested with `dhat-rs`.

Before (using `process_results`):

![Screen Shot 2024-08-18 at 7.06.55 PM](https://hackmd.io/_uploads/Hkn90ggiC.png)

After (using `map_while`):

![Screen Shot 2024-08-18 at 7.06.08 PM](https://hackmd.io/_uploads/H16PRxgs0.png)

### To Do Next Week

- implement `AllocationCount` measurement type for criterion
- make PRs to `milhouse` and `grandine`
- look for more optimizations in existing implementations
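The `chunks_exact` + `map_while` combination above can be sketched as follows. This is a simplified illustration, not the actual `ssz_types` code: the `decode_chunk` helper and the `u32` element type are hypothetical stand-ins for SSZ decoding, and error detection here relies on comparing the collected length against the expected chunk count, since `map_while(Result::ok)` stops at the first error instead of propagating it the way `itertools::process_results` does:

```rust
// Decode every fixed-size chunk of `bytes`, stopping at the first failure.
// `map_while(Result::ok)` avoids the intermediate machinery that
// `itertools::process_results` uses to surface the error value.
fn decode_all(bytes: &[u8], chunk_size: usize) -> Result<Vec<u32>, &'static str> {
    let expected = bytes.len() / chunk_size;
    let decoded: Vec<u32> = bytes
        .chunks_exact(chunk_size)      // faster than `chunks`: no bounds re-check
        .map(decode_chunk)
        .map_while(Result::ok)         // Iterator<Result<T>> -> Iterator<T>
        .collect();
    // If we collected fewer items than expected, some chunk failed to decode.
    if decoded.len() == expected {
        Ok(decoded)
    } else {
        Err("decode failed")
    }
}

// Hypothetical per-chunk decoder: little-endian u32 from a 4-byte chunk.
fn decode_chunk(chunk: &[u8]) -> Result<u32, &'static str> {
    chunk
        .try_into()
        .map(u32::from_le_bytes)
        .map_err(|_| "bad chunk")
}

fn main() {
    let bytes = [1u8, 0, 0, 0, 2, 0, 0, 0];
    assert_eq!(decode_all(&bytes, 4), Ok(vec![1, 2]));
}
```

The trade-off is that the original error value is discarded; if callers need to know *why* decoding failed, `process_results` (or collecting into `Result<Vec<_>, _>`) is still the right tool.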