# Promoting Peer Code Review in Research Groups
In many research groups (at UA and beyond), a large portion of scientific work is done in code, yet it is extremely rare to find a group with a culture of friendly code review.
Peer code review—the practice of reviewing a team member's computer code for potential problems—is commonly used in the software industry and has been adapted as a teaching tool in computer science courses (Hundhausen et al. 2009; Song et al. 2020). However, life sciences students' motivations for learning programming likely differ from those of CS majors, and the way peer code review is taught and practiced should differ correspondingly.
Good overview: https://academic.oup.com/jeb/article/36/10/1347/7577476
Good example of post-publication code review. Ideally you want to catch mistakes like these before publication: https://ecoevo.social/@noamross/112679744941862891
## Current barriers to code review
- Lack of exposure / don't know what it is or why
- Don't know how to do it
- Code review anxiety
    - https://doi.org/10.31234/osf.io/8k5a4
    - Fear of criticism & judgement from peers
    - Fear that code review will reveal *big* problems
    - Leads to anxiety and avoidance
- Lack of incentive
    - not required or suggested by supervisor
    - no one will see code anyway
- Takes time
- Benefits are not clear
## Overcoming barriers
- Lee & Hicks (2024) find that code-review anxiety is *common* and behaves much like a form of social anxiety, so interventions for code-review anxiety mirror those for social anxiety:
    1) recognize and increase awareness of anxiety (e.g. psychoeducation, relaxation & mindfulness techniques)
    2) reduce biased thinking and increase self-compassion (AKA "cognitive restructuring")
    3) exposure—if something is difficult, do it more often
- Group Code of Conduct that emphasizes a culture of friendly, constructive criticism
- Establish clear, friendly goals for code review in a group discussion before attempting it
- Detailed instructions on how and why to do peer code review in life sciences
- Tools to aid constructive code review (e.g. [rubric](#Tools) or form to fill out)
- Understand the benefits of code review using real-life examples (e.g. published corrections that could have been avoided with code review)
- ReproHack to practice reviewing code external to the group, avoiding fear of judgement
## Venues
**How can we (CCT Data Science) help researchers do code-review (better)?**
- Workshops (🤷‍♂️)
- Offer lab-group-specific training: at a single lab meeting, over several lab meetings, or as a separate event.
- Part of a course that includes teaching programming for life scientists (e.g. biostats).
    - We provide curriculum and materials, OR
    - We guest lecture a lab session
- Curriculum that can be adapted by any research group
- Seminar talk for Eco/Evo audience: "Code review and reproducibility in Ecology"?
## Questions
Should this be tied to teaching GitHub in some way, or is it better to teach code review with just "pen and paper" first?
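For context, here is what the GitHub route might look like in practice: a minimal sketch of a pull-request-based review cycle using the {usethis} R package (the branch name and PR number are invented for illustration, and other Git workflows would work just as well).

```r
# Hypothetical pull-request review cycle with the {usethis} package.
# The branch name and PR number below are made up for illustration.

# Author: start a new branch for the change to be reviewed
usethis::pr_init(branch = "tidy-growth-model")

# ...edit scripts and commit with git as usual...

# Author: push the branch to GitHub and open a pull request for review
usethis::pr_push()

# Reviewer: fetch the pull request locally to run the code and leave comments
usethis::pr_fetch(number = 12)

# Author: once the review is complete and the PR is merged, clean up the local branch
usethis::pr_finish()
```

Either way, the same conversation can happen on paper with a printed script and the rubric below, so GitHub is a convenience rather than a prerequisite.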
What are other venues or formats?
I'm interested in doing pedagogy research on whether peer code review improves reproducibility of student work, student attitudes toward programming, and sense of community with peers—should we reach out to education researchers at UA to gauge interest and involve them early on?
## Tools
Here's a rubric I created that could be used during peer code review in a life sciences research group.
A score of 4 is exceptional while 1 is unsatisfactory and in need of improvement.
| Criterion | 4 | 3 | 2 | 1 |
|---|---|---|---|---|
| Reproducibility | Unmodified code runs on another machine without error. | Code must be minimally modified to run without error. Necessary modifications are documented. | Code must be modified in multiple places or ways to run without error. Necessary modifications are not well documented. | Very difficult or impossible to reproduce the analysis. |
| Code readability | Code is formatted with human readability in mind. Variable and function names are concise, descriptive, and unambiguous. Follows a consistent style that improves readability. | Code is human readable, but could be improved. Variable or function names could be improved. Styling of code could be improved. | Readability of code could be improved in more than one way. | Format of code impedes readability. |
| Documentation | Code is understandable from documentation and comments alone. | Most complex code and functions are documented properly. | There are some comments, but major portions of code are left uncommented. | Essentially no comments or documentation. |
| Correctness / reusability | Analysis produces correct results, even with modified or updated data. | Minor changes are needed to produce correct results when data are modified or updated. | Code is of limited reusability and must be heavily edited to apply to modified or updated data. | Mistakes in the analysis result in incorrect results with the original data. |
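To make the rubric concrete, here is a small invented R example of the kind of change a review might prompt (the file, column names, and trait calculation are hypothetical):

```r
# Before review: absolute path, cryptic names, no comments
d <- read.csv("C:/Users/me/Desktop/final2.csv")
d$x <- d$a / d$b

# After review: project-relative path, descriptive names, documented intent
library(here)  # here() builds file paths relative to the project root

# Leaf mass per area (LMA) = leaf dry mass (g) / leaf area (cm^2)
leaf_traits <- read.csv(here("data", "leaf_traits.csv"))
leaf_traits$lma_g_cm2 <- leaf_traits$dry_mass_g / leaf_traits$leaf_area_cm2
```

Against the rubric, the "before" version would likely score a 1 or 2 on reproducibility, readability, and documentation, while the "after" version moves toward a 3 or 4 without changing the underlying analysis.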
For more inspiration, read the [Tidyteam code review principles](https://code-review.tidyverse.org/)
## References
Hundhausen, C., Agrawal, A., Fairbrother, D., Trevisan, M., 2009. Integrating pedagogical code reviews into a CS 1 course: an empirical study. SIGCSE Bull. 41, 291–295. https://doi.org/10.1145/1539024.1508972
Song, X., Goldstein, S.C., Sakr, M., 2020. Using Peer Code Review as an Educational Tool, in: Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education. Presented at the ITiCSE ’20: Innovation and Technology in Computer Science Education, ACM, Trondheim Norway, pp. 173–179. https://doi.org/10.1145/3341525.3387370
## Notes
Here's a possibly good resource: https://code-review.org/