# Evil Martians testing optimizations

[Test run times after PR merges](https://docs.google.com/spreadsheets/d/1Of_gTD26vo4Mck6wE5gL5rU_T1q8li813x7W_I0lBQs/edit?pli=1&gid=0#gid=0)

### PRs:

* https://github.com/powerhome/nitro-web/pull/39912
* https://github.com/powerhome/nitro-web/pull/39992
* https://github.com/powerhome/nitro-web/pull/40119
* https://github.com/powerhome/nitro-web/pull/40096
* https://github.com/powerhome/nitro-web/pull/40239
* https://github.com/powerhome/nitro-web/pull/40423

### Implemented in core_models

* [before_all](https://test-prof.evilmartians.io/recipes/before_all)
* [disable logging](https://test-prof.evilmartians.io/recipes/logging)

```
config.logger = ActiveSupport::TaggedLogging.new(Logger.new(nil))
config.log_level = :fatal
```

* ElasticSearch / nitro histories disabled by default
* Bootsnap (for dummy boot time)
* factory optimization for projects, nitro_user, and estimates
* [let_it_be](https://github.com/powerhome/nitro-web/pull/40096)
* [FactoryDefault for reusing records created by FactoryBot associations](https://github.com/powerhome/nitro-web/pull/40096)
* It's better to rely on factories than on custom factory-like methods to generate data, since custom code is much harder to optimize (though we managed to do that)
* Prefer using `association :x, factory: y` instead of `x { create(:x) }`
* [Use a DNC lookup helper to make things faster](https://github.com/powerhome/nitro-web/pull/40096)
* Disabled auditing features by default:
  * [History::Testing.fake!](https://github.com/powerhome/nitro-web/pull/39912)
  * [Audited.auditing_enabled = false](https://github.com/powerhome/nitro-web/pull/40423)

### [Implemented in projects](https://github.com/powerhome/nitro-web/pull/40239)

General test optimizations:

* added bootsnap to speed up the dummy application boot time
* added test-prof, stackprof, and fuubar
* disabled asset compilation, which is not needed for tests
* nitro_user (see core_models)
* optimized factories' associations to
avoid creating unnecessary records
* disabled history logging and elasticsearch (see core_models)
* log full call stack only when `ENV["FULLTRACE"]` is enabled
* truncate database if `ENV["CLEAN_DB"]` is passed to ensure that the database is clean before running the tests

Specific test optimizations:

* project_balance_spec.rb
* project_item_changes_spec.rb
* permission_component_spec.rb

### Feedback on our docker config / asset compilation / ci-kubed

* Sprockets is the slowest part
* Unlike Vite, Sprockets doesn’t really benefit from cache (subsequent runs are about as slow as the first one)
* We found that [this line](https://github.com/powerhome/nitro-web/blob/e03f423ad4501d631b42ad5363b943a150a68b70/config/application.rb#L270) (`config.assets.precompile << %r{(^[^_/]|/[^_])[^/]*$}`) is somehow responsible for that; without it, warm runs are fast.
* Also, we’re still on Sprockets v3, which is slower than v4 (the latest versions use some concurrency features to improve performance).
* As we saw previously and as I see during my own builds, you sometimes lose a ton of time on the `libheif`, `imagemagick` and `tar` builds. Those steps are flaky, and buildx often redoes them as if they had never been built before or something had changed. Unfortunately, I’m still unsure why that happens. Your latest builds don’t suffer from that problem, but I think it would still be beneficial for you either to “split” the base stage into a separately maintained image or to create packages for those tools.
* `RUN --mount=type=cache` is a local, instance-specific cache only. You rely heavily on it during the yarn build step, so if a new CI build gets a new builder node, you lose that speed.
* The last thing I’m trying to do now is to read your final image with dive to, maybe, get more ideas on how we can shrink it. I’m about halfway through.
* Overall, the main thing I can’t get off my mind is the complexity of the build configuration and the docker image.
I kinda get why it is that way, but it is a double-edged sword.

The main conclusion is: your current configuration is extremely complex and hard to maintain. Too much. I strongly suggest you simplify it somehow.

I still stand by my first impression that you should consider separating at least the base image. That way you won’t even need to consider caching for it in any scenario, as it will be available to your app builds as a prepared image. This is also confirmed by the facts: recently you had to update the packages for mysql because the old ones were removed from the repo, and libheif/imagemagick builds are unreliable for local tests even on capable laptops.

If you can somehow drop dependencies completely or isolate them into separate builds (hypothetical example: build your jsdeps without imagemagick), I believe you should.

The other thing is that the image and the cache are huge. It is impossible to give you exact advice, as the whole process right now depends on a single huge build.

Your infrastructure engineers are definitely top notch for architecting this solution, and I also see that they are maintaining it successfully. I still think it is worth taking a step back and trying to assemble it from scratch (at least on paper first).

Here are some assumptions about what I would have tried myself to find a way out of the current “gridlock”: maybe you should try sacrificing some scalability options in favour of speed, e.g. limit the builders to a single data center or even a single set of nodes.

One more crazy idea: if you already store artefacts locally using the `FROM scratch` trick, then it is possible to forget about caching the assets build layer completely, store the previous npm caches and prebuilt assets (if needed), and inject them instead of the old cache. It is not a simplification, quite the contrary, but still. Please treat this last part as brainstorming, as it is definitely crazy.

### Universal Gemfile Approach

After porting all business-logic components to the umbrella Gemfile and running multiple CI builds to measure the effect, we discovered that, due to the large number of other moving parts and the high variation in build times, the effect of eliminating `bundle install` is mostly unnoticeable.

### Future Ideas / Wrap-Up

* [shared config extraction](https://github.com/powerhome/nitro-web/pull/40508):
  * move some of the development deps (fuubar, test-prof) to ruby_test_helpers
  * some components don't rely on shoulda-gem, not shoulda-matchers
  * extract common rspec setup
* truncating tables (still not sure why this is needed?)
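
To make the "extract common rspec setup" idea a bit more concrete, here is a minimal sketch of how the two environment switches mentioned above (`ENV["FULLTRACE"]` and `ENV["CLEAN_DB"]`) could live in one shared helper. The `SharedSpecSetup` module, its method names, and the `FakeConfig` stand-in are hypothetical and not taken from the PRs; the only real API assumed is that RSpec's configuration object responds to `full_backtrace=`.

```ruby
# Hypothetical shared helper; the module and method names are illustrative
# and do not come from the PRs listed in this document.
module SharedSpecSetup
  # Assumption: a switch counts as "on" when the variable is set and non-empty.
  def self.flag?(env, name)
    !env[name].to_s.empty?
  end

  # config   - any object responding to full_backtrace= (RSpec's configuration does)
  # env      - ENV or a plain Hash, so the logic can be exercised without a suite
  # truncate - callable that empties the test database; a no-op by default
  def self.apply(config, env: ENV, truncate: -> {})
    config.full_backtrace = flag?(env, "FULLTRACE")
    truncate.call if flag?(env, "CLEAN_DB")
    config
  end
end

# Usage with a stand-in config object instead of a real RSpec configuration:
FakeConfig = Struct.new(:full_backtrace)
config = SharedSpecSetup.apply(FakeConfig.new, env: { "FULLTRACE" => "1" })
puts config.full_backtrace
```

In a component's spec helper this would presumably be called from inside `RSpec.configure { |config| ... }`, passing whatever truncation call the suite already uses as `truncate`. Keeping the environment and the truncation step as parameters is a design choice that lets the helper be checked in isolation.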