Slides - HackMD

--- - [Code Review](#orgb43a79d) - [Returns on investement](#org54a0b82) - [Good practices](#org4f7ff08) - [Getting started with code review](#org7e50dcf) ## Title Birds-eye view on research code reviews *A few other suggestions below* Investing in code reviews for better research (software) Achieving better research software quality with code reviews Code review for scientific software: experience feedback and best practices ## Abstract Code review is a development practice that improves readability and maintainability of software projects, in addition to making collaboration easier and team work mor effective. ~~The concept of the peer code review is simple: reviewers read and check a software program by analysing its source code in order to find possible defects, hence to improve its overall quality.~~ A code review is a conversation between reviewer(s) and the author(s) of the code under review. The code is dissected and analysed in order to find areas of improvement according to the focus of the review. Examples include, but are not limited to, readability, security or performance improvements. ~~the advantanges of code review routine for any practitioner who is interested in improving the quality of their software, and levelling up their programming skills.~~ Despite code review being an effective tool for improving software quality, it is still not a standard practice within the academic software development process. In this talk, we will detail the benefits that code review can bring to acadmeic scientific software developers, particularly improvements in software quality, improved team work and knowledge transfer. This talk also highlights common difficulties faced by resaerchers to set up, perform and maitain frequent code reviews, and discusses several approaches and good practices to mitigate these difficulties. Finally, we describe common tools making code reviews easier and give examples of how to use them effectively. Particularly we will present demonstration of a typical code development cycle with continuous integration and automatic code checks. ## Bios - **Thibault Lestang** - Senior Research Software Engineer in Computational Fluid Dynamics, Imperial College London. Thibault works closely with researchers to develop and maintain software across the Department of Aeronautics, with a focus on open-source fluid flow solvers Xcompact3 and Nektar++. Through training courses and individual support, he also promotes the developement of good software engineering practices in the department, making research more open, sustainable and efficient. As a fellow of the Software Sustainability Institute, Thibault is interested in bridging the gap between software engineering methods and conventional scientific research workflows. - **Dominik Krzemiński** - Research Associate at the FlyConnectome group, University of Cambridge. Dominik's interest span between neural decision-making circuits and applications of AI to neuroscience. He organised a number of workshops and conferences for the wider scientific coding community. As a Software Sustainability Institute Fellow and Cambridge Data Champion, he promotes good coding practices, such as code reviews and testing, in the research enviroment. # Slides - [Code review?](#org221cf90) - [Not a peer review for code](#org600183c) - [CODECHECK](#org04cae1b) - [Two contexts](#org101f0c8) - [Why code reviews](#orgd5e6993) - [Code review for software quality 1](#orgc24ab45) - [Code review for software quality 2](#org70671c9) - [Understandability matters](#org5ce0b6c) - [Knowledge transfer](#orgb3203d0) - [Better team awareness](#org78086dd) - [Code review is challenging](#org50f8742) - [Code review is time and energy](#org881cf84) - [Being protective about code](#org429415a) - [Strong heterogeneity among team members](#orgb3392b5) - [Finding reviewers](#org29242eb) - [Bad experiences](#orgb5b2ba2) - [Bad experience 2](#org0992861) - [Lack of guidelines](#orgf4aab9d) - [Code review good practices](#org4e69f4e) - [Short meetings](#orga555b62) - [Engage with the review](#org7ca8e3b) - [Let authors be aware of their responsabilities](#orgde27032) - [Let authors specify the feedback they are after](#org31b641d) - [Define – and enforce – a scope](#orgaf89cfc) - [Whether "it works" or not is irrelevant](#org043275e) - [Make it formal but safe](#orgbed7c4f) - [Overheard in the next meeting room](#org5695235) - [All feedback isn't helpful](#org8233e6c) - [Critique the code, not the programmer](#orgeb09d4a) - [The art of giving feedback](#orga14d230) - [Define (and refine) a policy](#org59de70e) - [Use a checklist](#org90da9bf) - [A culture of openess and collaboration](#org0832e50) <a id="org221cf90"></a> ## Code review? - Viewed as an important practice in the software industry. - Key step in contributing to most software projects (gatekeeping). - Many benefits - Catch bugs - Ensure quality standard - Spread knowledge - Training new developers <a id="org600183c"></a> ## Not a peer review for code - Code review **throughout the research process**: - Frequent - Informal - Low stakes - Commonly referred as "Modern Code Review" in the SE litterature. Bachelli and Bird 2013 - Can be asynchronous (GitHub's Pull Requests) or synchronous (in person chat). <a id="org04cae1b"></a> ## CODECHECK Just a mention <a id="org101f0c8"></a> ## Two contexts 1. Individual developers writing their own specific software. 2. Developers collaboratin on a common codebase. - Code review as gatekeeping. <a id="orgd5e6993"></a> ## Why code reviews Modern Code Review: A Case Study at Google (Sadowski 2018) Expectations, Outcomes, and Challenges of Modern Code Review: (Bacchelli and Bird 2013) Code Reviewing in the Trenches: Understanding Challenges and Best Practices (McLeod et al 1017) Code review by and for scientists <a id="orgc24ab45"></a> ## Code review for software quality 1 1. Defects 2. Code improvements <div class="notes" id="org1a88004"> <p> If you ask Microsoft developers about why they are doing code reviews, you'll find that the main motivation is making the code better. This means identifying defects and improving the code. </p> <p> Code improvements are changes that do not affect the functionalities but make the code more readable or maintainable. You can think of complying to a naming scheme, identifying code smells. </p> </div> <a id="org70671c9"></a> ## Code review for software quality 2 <div class="notes" id="org974fa8b"> <p> In practice, defects aren't the first thing that code review really help with. It can help, but maybe testing is a better investement in order to prevent bugs. </p> <p> But code review is clearly effective for improving code. </p> </div> <a id="org5ce0b6c"></a> ## Understandability matters ![img](img/readable-code.jpg) Oftentimes source code is the only available docs.. <div class="notes" id="org6e12541"> <p> Code review is a good assessment of understandability and this makes it very attractive for research software. If you think of a research group in which researchers write their own bespoke programs for their specific data manipulation or computational experiement, these will not be documented. </p> <p> Understandability becomes a prime quality for research software. </p> </div> <a id="orgb3203d0"></a> ## Knowledge transfer Code review is a peer learning activity. - Spread of good practices. - Homogeneisation of styles and practicess across group. ```python filepath = "/my/own/specific/path/" + "data.csv" ``` ```python from pathlib import Path ## ... filepath = datadir_path / Path(datafile) ``` <a id="org78086dd"></a> ## Better team awareness Even if not working on *exactly* the same project, regular code reviews enable awareness of what others are doing. - Continuous knowledge exchange - Enhanced collaboration - Longer term resilience of project(s) (Bus factor!) <a id="org50f8742"></a> ## Code review is challenging ![img](img/Code-Review-Quadrant-by-Greiler-2.webp) A lot of content available, but what about **research software**? <a id="org881cf84"></a> ## Code review is time and energy It's a fact. Two complementary courses of actions: - Acknowledge code review as a worthy investement: - "middle-term" benefits for individuals. - Short and long term benefits for collectives. - Regularly reflect process and follow good practices. **Large return on investment** <a id="org429415a"></a> ## Being protective about code There can be some unhealthy competition going on. A large number of researchers feel shy about their coding practices: - Lack of training. - Other priorities, often systemic (e.g. funding). - Why would I share my code if nobody else does? Code review can turn the tide by putting software (back?) at the heart of scientific process. <div class="notes" id="org51002b7"> <p> One prerequisite of code reivew is being okay with somebody looking your code. Not only looking but judging it against some quality standards. </p> <p> I've worked with researchers in various and often feel ashamed of their coding practice. Several factors: lack of training, code is an afterthought, not used to share code. Code review can help <b>turn the tide</b>. </p> </div> <a id="orgb3392b5"></a> ## Strong heterogeneity among team members - Experience. - Skills (*e.g.* programming languages). - Interest & motivation. <div class="notes" id="orgf57c89b"> <p> Code review litterature reports on the effect of heterogeneity among participants who can be very junior developers up to employees with more than 10 years experience int he software industry or even the same company. </p> <p> But in research this is a whole other level of heterogeneity. </p> </div> <a id="org29242eb"></a> ## Finding reviewers What about "lone coders"? <a id="orgb5b2ba2"></a> ## Bad experiences Code review can lead both to inclusion and exclusion. Dual nature: both **technical** and **social** practice. <a id="org0992861"></a> ## Bad experience 2 Most common code review parasites are: - Irrelevant feedback. - Petty arguments decoupled from overall scientific goal. - Power struggles. These must be and active effort to keep these under control. Similar to technical debt. > A bad reviewer tries to force their preference on you. A good code reviewer makes your code confrom to certain principles, but not opinion. <a id="orgf4aab9d"></a> ## Lack of guidelines Where do ~~I~~ **we** start? <a id="org4e69f4e"></a> ## Code review good practices - Most good practices from software engineering industry are applicable. - Some of them for slightly different reasons - The following is an account of my personal experience and discussions with colleagues - not evidence-based conclusions. <a id="orga555b62"></a> ## Short meetings 3 times 30' instead of one time 90' - Fit snuggly in the diary. - Doesn't feel like a big commitment. - Code review can be a very demanding activity. <a id="org7ca8e3b"></a> ## Engage with the review It's easy for participants to fall into "comfort mode" - Author describes goes through code as if logic and implementation is obvious. - Reviewers assume author "know what they are doing". Reviewers should **never stop questionning** and trying to understand the code Authors should **give reviewers opportunities to interject**. <a id="orgde27032"></a> ## Let authors be aware of their responsabilities A code review's success partly rests on the author's shoulder. - Choose a small piece of code. - Provide a description of the purpose and structure of the code. - Think ahead what reviewers will and will not be familiar with - Specific libraries - Specific domain knowledge - Ensure minimum quality standard (*e.g.* style, naming) Put yourself into your reviewer(s)' shoes: what would you want to be told if asked to review your code? <a id="org31b641d"></a> ## Let authors specify the feedback they are after Feedback is likely to be more targeted and impactful. *I'm not happy with this loop* ```bash for i in `seq 1 $NUMOFFIG` do FIG=$(ls $IMDIR | head -n $i | tail -n 1) echo " ${placeholderpath}/${FIG}" >> $FILE done ``` *I'm having to define a lot of classes that don't do much, what do you think of my design?* *I don't have any specific issue in mind, but I'm curious to see whether or not you find it hard to to follow the code's logic.* <a id="orgaf89cfc"></a> ## Define – and enforce – a scope Example default scope: understandability - Poor formatting. - Obscure variable names. - Complex conditionals. - Long functions. - Long parameter lists. - ~~Design red flags~~. - ~~Performance sinks~~. - ~~Security concerns~~. Default scope can be overrriden at will. <a id="org043275e"></a> ## Whether "it works" or not is irrelevant Code review is not an evaluation of a finished product. It is more rewarding to look at code that is WIP or causing difficulties. The only expectation is that code is readable by reviewers: - Formatting, dead code, comments… <a id="orgbed7c4f"></a> ## Make it formal but safe Code review is more effective with a clear process (formal) *At the same time,* Code review meetings *must* remain inclusives and supporting spaces. **It's about creating an environment where people feel confident about discussing their code to each other.** <a id="org5695235"></a> ## Overheard in the next meeting room Author: *This loop I wrote looks too complicated to me.* Reviewer: *Hmmm yes. You could just use a pipe and `xargs`.* Author: *What's `xargs`?* Reviewer: *It's basically mapping a command over a set of inputs.* Author: *…* Reviewer: *Alhtough you could also do the same thing with `sed`.* Author: *I have no idea what you're talking about* <a id="org8233e6c"></a> ## All feedback isn't helpful At least for now. Esp. reviewers with more programming experience/enthusiasm must be careful not to overwhelm beginners. <a id="orgeb09d4a"></a> ## Critique the code, not the programmer Feedback can be hard to stomach *You clearly made little effort in naming things…* *You should name this differently* *I think this name is misleading* <a id="orga14d230"></a> ## The art of giving feedback 1. Own you opinions. 2. Make it about the code. 3. Be specific. 4. Suggest an alternative. *I think this function's purpose would be much clearer if it was given a more explicit name.. perhaps `apply_bwd_trasform`?* <a id="org59de70e"></a> ## Define (and refine) a policy - Well defined process - Default scope - Moderator(s) - Code of conduct - Conflict resolution <a id="org90da9bf"></a> ## Use a checklist - [ ] Poor formatting. - [ ] Dead code. - [ ] Missing documentation. - [ ] Obscure names. - [ ] Complex conditionals. - [ ] Obscure one-liners. - [ ] Duplicated code. - [ ] Long procedures. - [ ] Long parameter lists. - [ ] Global state. - [ ] Abuse of primitive types. - [ ] Data clumps. - … <a id="org0832e50"></a> ## A culture of openess and collaboration - Collective ownership of research project, teamwork Chris Woods' point about making research more of a team effort.