---
tags: sprint, NASA, TOPS
---

# Draft Lesson 2b:

## How to maintain good code quality

### Introduction:

In the prior lesson, we talked about markers of quality software: good documentation and clean, readable code. In reality, most software is a journey: it will continue to change and develop over time. Here, we discuss version control, testing, and responsibilities after sharing. These topics center on the evolution of your code and on ensuring that the work you've done to make quality open software is able to endure.

### Version control

Open-source code can change over time. This brings several challenges to researchers developing and using ever-changing software. We covered the importance of reproducibility for open software, and for open science as a whole. So how can we achieve reproducibility when the source code keeps changing? By keeping track of changes to the source code using version control.

Version control tools and systems are designed to manage changes not only to source code, but also to documents, websites, and datasets. [Google Docs](https://docs.google.com), for instance, has its own complex version control. This gives you and your collaborators access not only to the most up-to-date version of the Google document you are all working on, but also to its complete history of changes. So, if something goes wrong in a document - a child adds a thousand smiley faces to the text, a cat walks on the keyboard and deletes an entire section - you can just revert to the earlier, error-free version.

The same is true for code. For instance, you - the developer - receive a notification from a user that your code has a bug. You know that this bug was not present in the last version, so you can work through your history to see which recent changes might have caused the error, narrowing down your debugging work to specific parts of the code.
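
The history search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the version names, the set of buggy versions, and the `first_bad_version` helper are all hypothetical, and the sketch assumes that the oldest version is good, the newest is bad, and the bug persists once introduced. Under those assumptions a binary search over the history finds the first bad version without checking every one. Git offers a `git bisect` command built on the same idea.

```python
from typing import Callable, Sequence


def first_bad_version(versions: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_good() is False.

    Assumes versions are ordered oldest to newest, the first one is good,
    the last one is bad, and the bug never disappears once introduced.
    """
    low, high = 0, len(versions) - 1  # versions[low] is good, versions[high] is bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_good(versions[mid]):
            low = mid   # bug was introduced after this version
        else:
            high = mid  # bug was introduced at or before this version
    return versions[high]


# Hypothetical history: the bug first appears in v1.3 and stays afterwards.
history = ["v1.0", "v1.1", "v1.2", "v1.3", "v1.4", "v1.5"]
buggy = {"v1.3", "v1.4", "v1.5"}

culprit = first_bad_version(history, lambda version: version not in buggy)
print(culprit)  # v1.3
```

In practice, `is_good` would run your code (or its tests) at a given version rather than look the answer up in a set.
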
So, version control lets a group of developers and users know exactly which version of the code they are using, what changes were made, and when, which facilitates reproducibility. Version control also fosters collaboration, making it easier for people to work together at the same time and to merge changes from different users.

There are several version control systems (VCS) available. We won't get into detail here, but some of the most popular open-source systems include [git](#link), [SVN](#), and [Mercurial](#). It is important to note that while some repositories already have built-in version control, repositories and version control systems are different things - *e.g.*, **git** is the *version control system*, while [GitHub](https://github.com) is a *hosting service* for **git** repositories. In Lesson 4, we revisit version control, giving some concrete examples of how you can use it to contribute to new or existing open-source code.

> [name=anacarolvaz]
> Need to check the lesson number at the end

~~Another important reminder here is that - since the code in *version control* repositories is likely changing - it is not considered a preservation repository and cannot be cited. You will need to use a preservation (or archival) repository (*e.g.* [Zenodo](#), [Figshare](#), [Software Heritage Archive](https://www.softwareheritage.org)) to store a static version of your code. This will allow you to get a permanent citable DOI, and you will get credit for your authorship. Look for [DOI](#linktonextlesson need to fix this) for more details on your DOI and changing code.~~

~~The workflow we present is based on decentralized version control systems, which are the most commonly used by researchers (see more [centralized and decentralized](https://www.geeksforgeeks.org/version-control-systems/?ref=lbp)). To start, a user will get a copy (clone or fork) of an already existing repository, while a developer will create a repository from scratch (use git init here??).
You will have a working directory, a staging area (an index that will track your changes), a local repository and a remote repository. (note: figure here!). We present here a simple definition of the workflow with common terms you will encounter, and offer some suggestions for a more in-depth lesson. [Software Carpentry](https://swcarpentry.github.io/git-novice/) can be a great place to start!~~

> [] need to add graph here!

__~~Simple version control workflow~~__

[ ] **~~Create Repository~~**
- ~~Developer: creates a new repository from scratch. Our tip: just go for it. You can create your repository with one file, or with an entire existing open software project.~~
- ~~User: will create a copy (*clone* or *fork*(link to lesson from Johanna)) of an existing repository.~~

[ ] **~~Make changes~~**
- ~~You can make any changes you want to your copy, but no one will see your changes until you *commit* (*i.e.*, submit them).~~

[ ] **~~Publish your changes~~**
- ~~If you like your changes and additions, *commit*. This will update your local repository.~~
- ~~So far, only your local repository has changed. To update your remote repository, *push* your modifications.~~

[ ] **~~Get changes from others~~**
- ~~While you were working on your copy, other users might have changed the remote repository. To keep your local repository updated, you need to retrieve, or *pull*, the latest changes.~~

[ ] **~~Keep track of changes~~**
- ~~To check what is different in your copy since the last commit, you can check the *status* of your repository.~~

~~As a last note, version control is a good practice for coding, so use it even if you are not sharing your code immediately. You can use version control with your code privately on your computer, or use the private mode on hosting services (*e.g.*, GitHub and GitLab).
And once you are ready, you will be one step ahead in sharing your code.~~

##### ~~Further Resources:~~

- ~~[Software Carpentry Version Control with Git](https://swcarpentry.github.io/git-novice/)~~
- ~~[The Turing Way, Version Control](https://the-turing-way.netlify.app/reproducible-research/vcs.html)~~
- ~~[FAIR Use a publicly accessible repository with version control](https://fair-software.eu/recommendations/repository)~~

### Testing

There are many types of tests that operate at different scales, such as end-to-end testing (which exercises the software as a whole) and unit testing (which tests individual functions). Tests help a developer maintain consistency in the software and help build trust and confidence in it. Writing tests helps ensure that changes made to the software don't alter outputs they shouldn't. Tests can also confirm that the same input to your code will produce the same output each time. They are a critical part of ensuring your software produces reproducible science.

As a user, software with good code coverage (the percentage of lines in the code that the tests actually execute), especially with tests that you can run yourself, should give you more confidence in the software and peace of mind that it is something you can trust.

**How do we want to direct a user to find out the percentage of lines of code that the tests cover?**

While a large portion of scientific software does not currently include tests or have good test coverage, wider adoption of testing would help ensure that software can produce consistent and trustworthy scientific outputs.

To learn more about testing, types of tests, and how to write good tests, check out these resources: **links to external resources here**

### Responsibilities after Sharing

After sharing software, there are certain steps that need to be taken with regard to maintaining that code.
First, you should know that you are not required to maintain the software forever, but it is your responsibility to let users know whether or not you intend to maintain it. You can do this in your documentation, where you discuss the development status of the project. This helps users know whether the project will continue to be supported in the future, and decide whether to base ongoing work on it. You don't want someone to spend a huge amount of time using your work as a dependency and then have their project become unusable in the future.

The reality is that a developer or researcher may not have the time or continued funding to keep up with a project. In this case, consider handing ownership of the software to another researcher or developer, an involved user, or an entity invested in its continued use. You can either approach parties you think may be interested, or make your license permissive enough to allow others to create their own copies and continue your work (see more on choosing a license in this module).

Depending on the license you choose, how your project is used, and whether there is significant interest, you may be able to commercialize your software to fund continued maintenance and feature requests. If your open software is widely used, you may also be able to apply for continued funding from governmental or private agencies.

If you're a user of software that is no longer maintained, consider contacting the owner or developer and volunteering either to act as a maintainer or to take over ownership of the project (you'll be more likely to get a positive response if you leave that choice up to the current owner).

If you receive requests for features and fixes, and you have indicated you intend to maintain the code, you should respond to them.
Either tell the users (a) that you intend to carry out their request or (b) that you consider it out of scope for the project. Additionally, you can invite the requester to contribute to the project and add the feature or fix themselves (which you can then approve and merge into your project), or invite them to fork (make a copy of) the project and create the feature or fix there, while letting them know you will not be merging it into your (main/original) copy.

### Summary:

Version control systems enable multiple developers, maintainers, and other individuals to collaborate on the same project by tracking every change to the code. This helps increase productivity and the quality of the software. Testing is critical to ensuring the reproducibility of your science.
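
To make the testing point concrete, here is a minimal sketch of what unit tests can look like in Python. The `mean_anomaly` function is a hypothetical stand-in for a piece of analysis code, not part of any real library. The tests check a known answer, check that the same input gives the same output, and check that invalid input is rejected.

```python
def mean_anomaly(values, baseline):
    """Return each value minus the mean of the baseline (hypothetical example)."""
    if not baseline:
        raise ValueError("baseline must not be empty")
    baseline_mean = sum(baseline) / len(baseline)
    return [value - baseline_mean for value in values]


def test_known_answer():
    # The baseline mean is 2.0, so the anomalies are 1.0 and 3.0.
    assert mean_anomaly([3.0, 5.0], [1.0, 3.0]) == [1.0, 3.0]


def test_same_input_same_output():
    # Running the function twice on the same input must give the same result.
    data, base = [10.0, 12.5], [9.0, 11.0]
    assert mean_anomaly(data, base) == mean_anomaly(data, base)


def test_empty_baseline_is_rejected():
    # Invalid input should fail loudly instead of returning a wrong answer.
    try:
        mean_anomaly([1.0], [])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an empty baseline")


if __name__ == "__main__":
    for test in (test_known_answer, test_same_input_same_output,
                 test_empty_baseline_is_rejected):
        test()
    print("all tests passed")
```

The tests are plain functions, so the file can be run directly. A test runner such as pytest can also collect functions named `test_*`, and a tool such as coverage.py can report the percentage of lines the tests execute.
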