# Workshop ## Refactor or Die </br> HackMD Document: [Here](https://hackmd.io/gF9ZdJlaRRWccSBI0AE0mQ?both) Slides: [Here](https://hackmd.io/@ohafner/refactor-or-die#/) </br></br> <p style="text-align: left; margin-left: 40px;">Author: <a href="mailto:ohafner@mrmilu.com">Oscar Hafner</a></p> --- ## Definition Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior. --- ## Motivation The main idea behind code refactoring is making the inefficient and over-complicated code to more efficient, simpler, cleaner, and easier to understand. Code refactoring makes the entire application more efficient, secure, fast, maintainable, and scalable. It also helps developers to find the hidden bugs and vulnerabilities in the system. ---- ### Benefits * To Keep Your Code Clean * To Improve The Performance of The Application * To Save Time and Money in The Future * To Reduce the Technical Debt * To Improve the System Design --- ## Our Goal + Define basic concepts about refactoring + Explain a real case: Leonardo247 Report Modules ---- ### How + Resume the requirements for the module + Explain some of the problems/bad patterns found in code. + Pick a fragment of the code and propose how to apply refactoring #### Metodology + Define goals/scope of the changes to apply + Define how to test the applied changes + Analyze approach to improve code quality --- ## Getting started --- ### When to Refactor ---- #### Only refactor code that has tests. <br> Refactoring code without tests is, frankly, gambling. There's no way to be sure that you didn't break the code if you don't have tests that confirm that the before and after are equivalent. No matter how disciplined you are, you'll introduce bugs sooner or later. ---- #### Use incremental steps. <br> Change a bit of code, then check if everything works. Then, commit before continuing. ---- #### Keep refactorings small. <br> The bigger a change, the riskier it is. Split it into multiple smaller chunks. You mitigate risks, and you get flexibility. ---- #### Don't start adding functionality in the middle of a refactoring. <br> A refactoring doesn't change the output of the system. Finish it first, then continue with the next task. ---- #### Avoid the sunk cost fallacy. If you go down a route that leads you nowhere, abandon the whole thing. <br> Given that you are working in small steps, you won't lose much. --- ### How To Refactor --- #### Define Goals and Metrics ---- ##### Goals + Avoid write duplicate code for this **ViewSet** + Improve code legibility in this module. + Allow to reuse an utility class. ---- ##### Metrics + Number of lines of code to include a common feature. + Presence of well defined docstrings, variable names, legible methods. + Cyclomatic complexity + Performance (CPU, Mem, IO) + Number of tests / Test Coverage --- #### Define Scope of the refactor </br> Extract report endpoints from a viewset and allow to configure as a Mixin --- #### Ensure that Refactored code can be tested * Scenario: Create a ViewSet with Report feature. * Test that it contains the expected endpoints. * Test that it calls the underlying report methods. <img style="width: 75%" src="https://i.imgur.com/xR1QXC6.png"/> ---- #### Prepare code before refactor ![](https://i.imgur.com/b6HUqVG.png) ---- ##### Red-Green-Refactor * Stop and consider what needs to be developed. [RED] * Get the development to pass basic testing. [GREEN] * Implement improvements. [REFACTOR] --- ### Techniques See https://refactoring.guru/refactoring/techniques --- ## Leonardo247: Report Module Leonardo247 allows to generate and download reports related to different viewsets via API. Some of them are: + Action Items + Contacts + Workflows ---- These categories exposes a set of endpoints to perform async generation as json, pdf, excel, csv. Report files can be send via email or include into Document Manager module. --- ### Previous Report Module Common report module logic involves: 1. Call from ViewSet to create_report or ActionItemReportManager.run ```python= # leonardo.action_item.api.v2.views:1669-1683 @action(detail=False, methods=('GET', 'POST'), url_path='report/export-excel') def export_excel(self, request, *args, **kwargs): try: self.log_user_activity(user=self.request.user, action_done='report_export_excel') create_report(request, 'action_items_export_excel') return Response(status=status.HTTP_200_OK) except Exception as e: logger.exception(e) return Response(status=status.HTTP_503_SERVICE_UNAVAILABLE) @action(detail=False, methods=("GET",), url_path='report/export-excel-async') def export_excel_async(self, request, *args, **kwargs): qs, group_by_template = self.__get_report_queryset(request=request) report_manager = ActionItemReportManager(request=request, queryset=qs, group_by_template=group_by_template) return report_manager.run(export_type=report_manager.TYPE_EXCEL) ``` ---- 2. In reports/managers/action_items.py instancing ActionItemViewSet to get the required report queryset. ```python= # leonardo.reports.managers.action_items:146 class ActionItemReport(object): def __init__(self, request, data=None, export_type=None): super(ActionItemReport, self).__init__() self.request = request self.data = data self.export_type = export_type def report(self): qs, group_by_template = self.get_data() report = ActionItemReportManager(request=self.request, queryset=qs, group_by_template=group_by_template) return report.run(export_type=self.export_type) def get_data(self): from leonardo.action_items.api.v2.views import ActionItemViewSet vs = ActionItemViewSet() vs.request = self.request vs.format_kwarg = None return vs._ActionItemViewSet__get_report_queryset(request=self.request) ``` ---- 3. ActionItemReportManager provides grouping data and connects with PDF, CSV, Excel handlers ```python= # leonardo.action_items.report_manager:195 def _export_pdf(self): pdf_report = ActionItemReportPDF( request=self.request, title=self.report_title, fields_shown=self.fields_shown, group_by_template=self.group_by_template, report_model_name=self.report_model_name, ) if self.request.user.is_admin(): filter_account_name = ( ",".join([x.name for x in self.accounts]) if self.accounts else "All" ) else: filter_account_name = self.account_user.account.name data = { "filters": OrderedDict( [ ("account_name", filter_account_name), ( "portfolio_name", ", ".join(self.portfolios.values_list("name", flat=True)) if self.portfolios else "All", ), ( "property_name", ", ".join(self.properties.values_list("name", flat=True)) if self.properties else "All", ), ( "dates", report_date_interval_format( from_date=self.from_date, to_date=self.to_date ), ), ] ), "results": self.data, } report_data = pdf_report.format_data(data=data) pdf_report.write_pdf(data=report_data) return pdf_report.get_response(request=self.request) ``` ---- 4. ReportManager implements private methods reimplemented in ActionItemReportManager, run method and empty email sender. > See leonardo/utils/report_manager.py ---- 5. ActionItemReportPDF inherits from PDFUtils and apply parsing to existing data > See leonardo.action_items.report_pdf ---- Finally ... 6. PDFUtils handles different formatters, context (from request and user model) and docraptor or wkhtmltopdf logic. > See leonardo.utils.pdf Also 7. PDFUtils inherits from MailUtil, which provides some send email to recipents > See leonardo.utils.email --- ### Module issues + Code Mess + Duplicate code for different formats and between report types. + Formatting errors, there are no separation between content and format. + Character encoding issues. + High resource consumption + Duplicate data in memory. ---- More ... * Memory: * Not using queryset iterators * Not using optimizations * https://xlsxwriter.readthedocs.io/working_with_memory.html * Everything is in list or dicts, making dificult to apply profiling ---- More ... * DB: * Quite long queries, dense plannings, wrong indexing * High coupling, Low Cohesion (Should be the opposite) * Weird Report To View linking, some dark magic required to launch a report * High Coupling with Mail related tooling * We aren't using Mailer class from django * High Coupling between different stages of the report --- ### Refactor proposals + Decouple MailUtils from PDF and Excel get_response methods + Decouple PDFUtils from ActionItemReportPDF + Decouple Parsers from ActionItemReportPDF
{"metaMigratedAt":"2023-06-16T19:47:44.219Z","metaMigratedFrom":"YAML","title":"Workshop: Refactor or Die","breaks":true,"description":"Slide notes for Django Group Talk","contributors":"[{\"id\":\"555cfa5b-54f6-4f3e-b2c4-8c6b64a72c5c\",\"add\":15864,\"del\":8547}]"}
    341 views