# Performance Optimization within Caseflow
## Ruby Performance Profiling with StackProf
### General Performance Tips and Tricks
#### Avoiding Common Bottlenecks
* Memory Leaks: Identify unnecessary object allocations and ensure proper garbage collection.
* Expensive Queries: Optimize database queries by using appropriate indexes and avoiding N+1 queries.
* Cache Usage: Leverage caching or memoization to minimize redundant computations and database hits.
* Batch updates or inserts: Avoid looping through objects to perform database operations individually; use bulk operations instead. Taking advantage of parallelization where possible also fits into this category, although CRuby cannot run Ruby code on multiple threads in parallel due to the Global VM Lock (GVL), often called the GIL.
* Use faster libraries: If performance is critical, benchmark gems and switch to more optimized alternatives. For example, switching to [Oj](https://github.com/ohler55/oj) to serialize our JSON might yield a performance increase with minimal effort.
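As a small illustration of the caching/memoization point above, here is a plain-Ruby sketch (the `Report` class and `expensive_lookup` method are hypothetical) of caching a computed value so repeated calls don't redo the work:

```ruby
# Hypothetical class that memoizes an expensive computation.
class Report
  attr_reader :computations

  def initialize
    @computations = 0
  end

  # Memoize with ||= so the expensive work runs at most once per instance.
  # (Caveat: ||= re-runs the work if the result is nil or false.)
  def totals
    @totals ||= expensive_lookup
  end

  private

  def expensive_lookup
    @computations += 1
    (1..1_000).sum # stand-in for a slow query or computation
  end
end

report = Report.new
report.totals
report.totals
puts report.computations # the expensive work only ran once
```

The same `@value ||= ...` idiom is what most Rails getter-method caches boil down to; `Rails.cache.fetch` extends the idea across requests and processes.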
#### Common Performance Issues in Caseflow
In addition to some of the more generalized performance considerations outlined in the previous section, there are some additional highlights we can outline that are more specific to Caseflow.
* Serialization: This ties in with N+1 queries. During serialization you should absolutely avoid making additional DB queries. A good pattern to follow is to do all of your DB retrieval up front and then serialize the data, instead of serializing a group of ActiveRecord objects that fetch more ActiveRecord objects via methods called by the serializer. Preloading is one way of fixing this, but even preloading does not avoid the cost of instantiating ActiveRecord objects when dealing with a lot of data. There are not many places in Caseflow where this is a problem, though, since it only really becomes an issue when you are serializing or parsing thousands of objects at once.
* Serializing more data than you need: You should only serialize the fields that you need when you need them. A good example of a place that puts this into practice is the Decision Review Queue. The main DR queue only needs to display a few fields so everything is fetched with a big query up front and then during serialization there are no further fetches from the database. The Task Page/Disposition page then later does a full fetch of all the claimant, veteran, PoA, and decision issue information that it needs.
* Method definitions that invalidate preloading: This is a bit trickier to spot, but there are several methods within some of the model classes that invalidate preloading via a database scope or selector. An example of this is the pact method in appeal.rb: [Pact method definition](https://github.com/department-of-veterans-affairs/caseflow/blob/f62b14fa0fb96e570e1423b00555638d5d87041e/app/models/appeal.rb#L267). This method references `.active`, which is a scope, which is a database fetch that will not take advantage of preloading. Rails will cache the DB fetch, but it's still an I/O interrupt during your serializer/code loops.
* Caching external API/resource calls: For any external call that doesn't need to be constantly re-fetched, we should utilize Rails caching. Caseflow does this via Redis, and in general most external calls are cached. However, some of these cached calls are made via getter methods that are invoked during serialization, which means an external fetch may be delaying your serializer loop. Even fetching data from Redis is still an I/O interrupt inside a loop: cycles where your CPU is waiting for a response from Redis (it's fast, but not as fast as looping through an in-memory array, for instance).
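The "fetch up front, then serialize" pattern from the serialization bullets above can be sketched in plain Ruby. The `Task`/`Assignee` structs here are hypothetical stand-ins for ActiveRecord models, and the in-memory hash index stands in for what `includes`/preloading gives you in Rails:

```ruby
require 'json'

# Hypothetical stand-ins for ActiveRecord models.
Task = Struct.new(:id, :assignee_id)
Assignee = Struct.new(:id, :name)

tasks = [Task.new(1, 10), Task.new(2, 11)]
assignees = [Assignee.new(10, "Alice"), Assignee.new(11, "Bob")]

# Anti-pattern: a per-task lookup inside the serializer loop.
# With real models each iteration would be another DB query (N+1):
#   tasks.map { |t| { id: t.id, assignee: Assignee.find(t.assignee_id).name } }

# Better: fetch everything up front, index it in memory,
# then serialize without touching the data store again.
assignees_by_id = assignees.each_with_object({}) { |a, h| h[a.id] = a }
serialized = tasks.map do |task|
  { id: task.id, assignee: assignees_by_id[task.assignee_id].name }
end

puts serialized.to_json
```

In Rails the indexing step is usually handled for you by `includes`/`preload`, but the principle is the same: the serializer loop should only ever read from memory.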
#### Profiling in Ruby
There are a variety of gems that exist in Ruby that can be used for profiling.
* [Ruby-prof](https://ruby-prof.github.io/)
* [Benchmark](https://github.com/ruby/benchmark)
* [Stackprof](https://github.com/tmm1/stackprof) - The one I will be using in this example
* [Vernier](https://github.com/jhawthorn/vernier) - I haven't tried this one, and it requires Ruby 3+ so we can't use it for Caseflow right now, but it looks cool
* [Memory profiler](https://github.com/SamSaffron/memory_profiler) - Useful when you start digging into object allocation and memory usage
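For quick "which of these is faster" questions, the stdlib Benchmark module listed above is often enough before reaching for a full profiler. A minimal sketch comparing two ways of building a string:

```ruby
require 'benchmark'

strings = Array.new(10_000) { |i| "item-#{i}" }

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
concat_time = Benchmark.realtime do
  result = ""
  strings.each { |s| result += s } # allocates a new string every iteration
end

join_time = Benchmark.realtime do
  strings.join # single pass, far fewer allocations
end

puts format("concat: %.4fs, join: %.4fs", concat_time, join_time)
```

`Benchmark.bm`/`Benchmark.bmbm` give nicer tabular output and a warm-up pass when comparing more than two candidates.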
#### Profiling with Stackprof
**Installation**:
```bash
gem install stackprof
```
Depending on how you install it, you may also need to add it to the Gemfile and then run `bundle install`:
```ruby!
gem 'stackprof'
```
**Basic Usage:**
```ruby!
StackProf.run(mode: :cpu, out: 'stackprof-cpu.dump') do
# code to profile
end
```
This will write a file named `stackprof-cpu.dump`. Then from your CLI you can run this command to generate a simple text representation of the call stack:
```bash
stackprof stackprof-cpu.dump --text
```
Stackprof has a few different modes:
* cpu - Samples CPU time. This is useful when profiling CPU-bound Ruby methods, but less useful when the code makes external calls such as I/O or DB queries, since time spent waiting doesn't consume CPU.
* object - Samples object allocations. This is useful for figuring out extra memory usage and garbage collection pressure.
* wall - Samples wall-clock time. The closest to a real-time representation and generally what I would use for most scenarios.
* custom - I haven't used this one, but it appears to let you control sampling yourself by calling `StackProf.sample` inside the profiled block.
You can even configure it to run as a rack middleware option which is very useful for profiling entire pages of an app at once.
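A sketch of that middleware setup, adapted from the stackprof README (the interval, save frequency, and path values here are illustrative, not recommendations):

```ruby
# config/application.rb (or an initializer)
config.middleware.use(
  StackProf::Middleware,
  enabled: true,         # or a lambda to enable per-request
  mode: :wall,
  interval: 1000,        # sample every 1000 microseconds
  save_every: 5,         # write a dump file every 5 requests
  path: "tmp/stackprof"  # directory where dumps are written
)
```

Each saved dump can then be inspected with the same `stackprof <file> --text` CLI shown below.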
#### Profiling example in Caseflow
As an example, let's profile a few methods in Caseflow. Let's compare the DR queue task serialization vs the Specialty Case Team (SCT) serialization methods and see what some of the differences are.
First, let's make a generic helper method to make profiling easier. Add this method to a concern that you can share around, or just temporarily add it to whatever file you are working with.
```ruby!
require 'stackprof'

def profile_method(profile_name = nil, number_of_runs = 1)
  method_name = profile_name || caller_locations(1, 1)[0].label
  result = nil
  puts "***** Profiling #{method_name} #{number_of_runs} time(s) ******"
  # :wall is generally the most useful mode; swap in :object to profile allocations
  StackProf.run(mode: :wall, out: "#{method_name}_stackprof.dump") do
    number_of_runs.times { result = yield }
  end
  result
end
```
In this case I'm going to add it to the decision_reviews_controller.rb file.
I'm then going to use it to wrap the in-progress tasks method, since that's the default tab for the VHA DR queue:
```ruby!
when "in_progress" then profile_method("in_progress_tasks_profile", 5) { in_progress_tasks(pagination_query_params(sort_by_column)) }
```
Now you can either load the page or manually call this method with params via the controller, but the easiest way is to simply load the VHA DR queue page at the path /decision_reviews/vha.
This will generate a file named in_progress_tasks_profile_stackprof.dump. From the CLI, we can now use stackprof to generate a text representation of this profile:
```bash!
stackprof in_progress_tasks_profile_stackprof.dump --text
```
It will generate something similar to this

This breaks down the profile by total time and samples per frame. You can drill down on specific methods in the sample by passing additional flags to the stackprof CLI tool, e.g. `stackprof in_progress_tasks_profile_stackprof.dump --method <method_name>`.
It's also worth noting that the first run can sometimes behave differently, and what stackprof ends up sampling can differ from run to run, so it's always worth running the profiling a few times before jumping to conclusions. Below is a second run where Redis occupies much less of the profile, presumably because it has been initialized and warmed up. Marking and sweeping are part of garbage collection and, from what I have seen, typically eat up a good portion of the total time of most call stacks in Ruby.

For example, the way I would read this is by looking for methods that occupy a much larger percentage of time than I would expect. In the last run, I notice FeatureToggle is on there even though it's only referenced a few times and isn't even called inside a loop, which suggests it's a fairly inefficient call. That would be something I might look at. Another thing that sticks out is the Redis call stack. I'm not sure where Redis is being referenced while building the query and retrieving the tasks, but it must be somewhere.
The most reliable way to get a better representation of the call stack is to run the same method 10+ times for a more thorough sample, but that's hard to do when exercising a page through the UI, so when drilling down into a specific method the best approach will probably always be through the command line, or averaging runtimes yourself. I added a loop to the method profiler, but a simple loop is not always the best way; you can instead run a bunch of individual profiles and combine them. This approach is good enough for now, though.
In addition to the in-progress tasks profile, I'm going to add a JSON profiling of the serializer at the same time.
```ruby=
task_json = profile_method("in_progress_pagination_json", 5) { pagination_json(tasks) }
render json: task_json
```
```bash=
stackprof in_progress_pagination_json_stackprof.dump --text
```

Nothing too out of the ordinary here either. Mostly just standard Rails, ActiveModel, and JSON methods. As a general rule, ActiveSupport methods are pretty slow compared to manually performing the same work on plain arrays, but unless you are really working with a lot of data it's probably fine in most cases.
Just to demonstrate, I ran it once in object mode for the JSON creation.

As you can see, Ruby allocates a shocking number of objects when generating JSON, and this is why garbage collection ends up taking a huge portion of a lot of execution time. When you really get down to heavy optimization, that's where the majority of your time will be spent.
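You can see this effect yourself without a profiler by comparing the VM's allocation counter before and after generating JSON. A plain-Ruby sketch using only the stdlib (the 1,000-record dataset is illustrative):

```ruby
require 'json'

data = Array.new(1_000) { |i| { id: i, name: "record-#{i}" } }

# GC.stat(:total_allocated_objects) is a monotonically increasing count
# of every object the VM has allocated since boot.
before = GC.stat(:total_allocated_objects)
json = data.to_json
allocated = GC.stat(:total_allocated_objects) - before

puts "Generating #{json.bytesize} bytes of JSON allocated ~#{allocated} objects"
```

The delta is approximate (anything else running between the two reads is counted too), but it gives a quick sense of how allocation-heavy a code path is before reaching for stackprof's object mode or memory_profiler.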
## Specific Areas of Caseflow Needing Optimization
### General Issues
Generally, a few of the shared issues in Caseflow are slow serialization, external API fetches (since Caseflow doesn't house all the data it needs), and a lack of client-side caching.
#### Serializers
#### Not preserving data in Redux between pages
#### Lack of client side caching
### Case Details Page
### Queue specifically attorney and judge queues
### Caseflow Queue Index page
## Actions to take
### Investigation
TBD
### Work that can be done in a few sprints
TBD