# EPF5 Dev Updates - Week 6

## Weekly Highlights

### Meetings and Standups

This week has been insightful, as I faced many bugs that I had not encountered before in my dev career. Firstly, I attended the weekly standup and office hours, hosted by the EF and Grandine team.

## My First Bug This Week

I encountered an error while trying to run `cargo profiler callgrind` to generate a detailed performance report.

### Error Message

Expected type did not match the received type (Rust's "mismatched types" error). The snippets below illustrate this class of error:

```rust
fn plus_one(x: i32) -> i32 {
    x + 1
}

plus_one("Not a number");
//       ^^^^^^^^^^^^^^ expected `i32`, found `&str`

if "Not a bool" {
// ^^^^^^^^^^^^ expected `bool`, found `&str`
}

let x: f32 = "Not a float";
//     ---   ^^^^^^^^^^^^^ expected `f32`, found `&str`
```

I am still not sure why this happened, but I discussed it with my teammate to understand it and investigate further.

## Profiling Efforts

I also ran **Flamegraph** on Grandine to visualize sampled stack traces and identify performance bottlenecks. While running this, I was unable to fully generate the flamegraph; so far, this is what I was able to produce.

Based on the `flamegraph.svg` that I was able to produce, I drew some insights. However, I am not going to rely on them fully, because I have yet to resolve some configuration issues that prevented me from fully generating the flamegraph.

### Insights from the Generated Flamegraph

#### Function name: `grandine::main`

`grandine::main` is the main function of the grandine module.

![Screenshot 2024-07-22 at 17.12.29](https://hackmd.io/_uploads/BJGa_bh_0.jpg)

The image above shows that:

- **2 samples**: Out of the total samples collected, 2 were taken while the program was executing the `grandine::main` function.
- **66.7%**: 66.7% of the total samples collected were from the `grandine::main` function.

In essence, this tells me that a significant portion (66.7%) of the program's time was spent in the main function of the Grandine module, based on the samples collected during profiling. This high percentage suggests that `grandine::main` might be a bottleneck or a critical section of Grandine code worth optimizing, though with only three samples in total this is a weak signal.

#### Function name: `std::rt::lang_start_internal`

![Screenshot 2024-07-22 at 17.25.44](https://hackmd.io/_uploads/rJebt-3uC.jpg)

The image above shows that:

- **3 samples (100%)**: 100% of the samples collected were from the `std::rt::lang_start_internal` function.
- **`std::rt::lang_start_internal`**: This function is part of the standard runtime (`std::rt`); it sets up the Rust runtime and then calls the program's `main` function.

The `lang_start_internal` function is the entry point, but within it, most of the execution time is spent in the main function.

#### Function name: `env_logger::logger::Builder::try_init`

![Screenshot 2024-07-22 at 17.17.43](https://hackmd.io/_uploads/B1BHYZnOR.jpg)

The image above shows that:

- **1 sample (33%)**: 33% of the total samples collected were from the `try_init` function of the `env_logger::logger::Builder` module.

### Putting This in Context with the First and Second Data Points

Given these percentages, the `try_init` function is part of the setup process, likely initializing logging. Since 1 out of 3 total samples (33%) was taken from this function, it suggests that `try_init` is a significant part of my program's initialization phase, though it is not the primary consumer of execution time.
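To make sense of what `try_init` actually does, here is a minimal, hypothetical sketch of how `env_logger` is typically initialized at startup. I have not confirmed that Grandine wires its logger up exactly this way; the dependency versions and the log message are my own assumptions for illustration.

```rust
// Minimal sketch of env_logger initialization (hypothetical; not
// necessarily how Grandine itself sets up logging).
//
// Assumed Cargo.toml dependencies: log = "0.4", env_logger = "0.11"

fn main() {
    // Builder::try_init() registers the global logger and returns an
    // Err instead of panicking if a logger was already installed.
    // This is the call that showed up in the flamegraph sample.
    if let Err(e) = env_logger::Builder::from_default_env().try_init() {
        eprintln!("logger was already initialized: {e}");
    }

    log::info!("startup complete");
}
```

Since `try_init` runs once at startup, a sample landing in it mostly confirms that the profiler captured the initialization phase, rather than indicating that logging itself is expensive, which matches the conclusion above.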
## Overall Summary

- The `try_init` function from the `env_logger::logger::Builder` module accounts for 33% of the samples, indicating it is part of the initialization routine.
- Most of the program's execution time is spent in the main function (66.7%), within the context of the `lang_start_internal` function (100%), which includes setting up and running the main application logic.

This helped me understand where the program spends its time and identify potential areas for optimization and further investigation.

## Final Thought

I believe using tools like Flamegraph alongside other profilers will help identify critical code paths that require comprehensive testing. For example, it points me at the main function and initialization routines, where I can make sure edge cases and errors are handled gracefully, and it helps surface bottlenecks.

## Next Steps

I will work on resolving/debugging the configuration issues I ran into this week.

## References

- [Flame Graphs](https://www.brendangregg.com/flamegraphs.html)
- [brendangregg/FlameGraph](https://github.com/brendangregg/FlameGraph)
- [What is a Flame Graph?](https://www.datadoghq.com/knowledge-center/distributed-tracing/flame-graph/)