Instructors: Rousslan Dossa & Nadine Spychala (also the original author)
This is a tutorial on best practices in research software engineering using Python as an example programming language:
It is a modified version of the original Collaborative Research Software Engineering Tutorial by Nadine Spychala. It is based on the Intermediate Research Software Development course from the Carpentries Incubator, so for anyone wanting to delve into more detail or get more practice, looking into that course is probably one of the best options.
Here, you'll get:
We'll use GitHub CodeSpaces – a cloud-powered development environment that one can configure to one’s liking.
0. Welcome
0.1 Recap & motivation: why collaboration and best research software engineering practices in the first place?
0.2 Difference between "coding" and "research software engineering"
0.3 What you’ll learn
0.4 Target audience
0.5 Pre-requisites
1. Let's start! Introduction into the project & setting up the environment
1.1 The project
1.2 GitHub CodeSpaces
1.3 Integrated Development Environments
1.4 Git and GitHub
1.5 Creating virtual environments
2. Ensuring correctness of software at scale
2.1 Unit tests
2.2 Scaling up unit tests
2.3 Debugging code & code coverage
2.4 Continuous integration
3. Software design
3.1 Programming paradigms
3.2 Object-oriented programming
3.3 Functional programming
4. Writing software with - and for - others: workflows on GitHub, APIs, and code packages
4.1 GitHub pull requests
4.2 How users can use the program you write: application programming interfaces
4.3 Producing a code package
5. Collaborating research and sharing experiment results
5.1 Experiment analysis: custom training curve plotting
5.2 Tensorboard: a standard for training metrics plotting and export
5.3 DL Experiment Tracking and Management as a tool for collaborative research and result sharing
6. Wrap-up
7. Further resources
8. License
9. Original course
10. Acknowledgements
The terms programming (or even coding) and software engineering are often used interchangeably, but those terms don't mean the same thing. Programmers or coders tend to focus on one part of software development which is implementation. Also, in the context of academic research, they often write software just for themselves and are the sole stakeholders.
Someone who is engineering software takes a broader view on code which also considers:
Bearing the difference between coding and software engineering in mind, how much do scientists actually need to do either of them? Should they rather code or write software, or do both (and if both, when do they do what)? This is a hard question and will very much depend on a given research project. In Scientific coding and software engineering: what's the difference?, it is argued that "scientists want to explore, engineers want to build". Either too little or too much of an engineering component in writing code can be a hindrance in the research process:
To boil down the challenge in other words, when you start out writing code for your research, you need to ask yourself:
How much do you want to generalize and consider factors in the software lifecycle upfront in order to spare work at a later time-point vs. stay specific and write single-use code to not end up doing (potentially) large amounts of unnecessary work, if you (unexpectedly) abandon paths taken in your research?
While this is a question every coder/software engineer needs to ask themselves, it's a particularly important one for researchers.
It may not be easy to find a sweet spot, but, as a heuristic, you may err on the side of incorporating software engineering into your coding, as soon as
More often than not, one or both points will apply fairly quickly.
This tutorial equips you with a solid foundation for working on software development in a team, using practices that help you write code of higher quality, and that make it easier to develop and sustain code in the future – both by yourself and others. The topics covered concern core, intermediate skills covering important aspects of the software development life-cycle that will be of most use to anyone working collaboratively on code.
At the start, we’ll address
Regarding testing software, we will see how
Regarding software design, we will touch upon
With respect to working on software with - and for - others, you’ll hear about
An opinionated section on collaborative research in machine learning
Some of you will likely have written much more complex code than the one you’ll encounter in this tutorial, yet we call the skills taught “intermediate”, because for code development in teams, you need more than just the right tools and languages – you need a strategy (best practices) for how you’ll use these tools as a team, or at least for potential re-use by people outside your team (that may very well consist only of you). Thus, it’s less about the complexity of the code as such within a self-contained environment, and more about the complexity that arises due to other people either working on it, too, or re-using it for their purposes.
The best way to check whether this tutorial is for you is to browse its contents in this HackMD main document.
This tutorial is targeted to anyone who
It is suitable for all career levels – from students to (very) senior researchers for whom writing code is part of their job, and who either are eager to up-skill and learn things anew, or would like to have a proper refresh and/or new perspectives on research software development.
If you’re keen on learning how to restructure existing code such that it is more robust, reusable and maintainable, automate the process of testing and verifying software correctness, and collaboratively work with others in a way that mimics a typical software development process within a team, then we’re looking forward to seeing you!
The only thing you need to do before the event is to create an account on GitHub, if you haven't done so already.
In this tutorial, we will use the Patient Inflammation Study Project which has been set up for educational purposes by the course creators, and is stored on GitHub. The project's purpose is to study the effect of a treatment for arthritis by analysing the inflammation levels in patients who have been given this treatment.
The data:
data folder of the repository represents inflammation measurements from one separate clinical trial of the drug,
The project as seen on the repository is not finished and contains some errors. We will work incrementally on the existing code to fix those and add features during the tutorial.
Goal: Write an application for the command line interface (CLI) to easily retrieve patient inflammation data, and display simple statistics such as the daily mean or maximum value (using visualization).
The code:
inflammation-analysis.py which provides the main entry point in the application - this is the script we'll eventually run in the CLI, and for which we need to provide inputs (such as which data files to use),
inflammation which contains collections of functions in views.py and models.py,
data and tests which contains tests for our functions in inflammation,
README file (describing the project, its usage, installation, authors and how to contribute).
We will use GitHub CodeSpaces. A codespace is a cloud-powered development environment that you can configure to your liking. It can be accessed from:
GitHub CodeSpaces' most appealing feature is that you can code from any device and get a standardized environment as long as you are connected to the internet.
This is ideal for our purposes (and maybe for some of yours in the future, too) as we'll avoid the hassle of installing programs on your machines and copying/cloning GitHub repositories remotely/locally before you are able to code.
This spares us unexpected problems that would very likely occur when setting up the environment we need.
Python in particular can be a mess when it comes to dependencies between different components… see this XKCD webcomic for an insightful illustration:
Creative Commons Attribution-NonCommercial 2.5 License
! FOLLOW ALONG IN YOUR CODESPACE ! Let's instantiate a GitHub codespace:
Code, then choose Create codespace in main. Your codespace should load and be ready in a few seconds.
In the cloud, you'll see that we use VSCode as an Integrated Development Environment (IDE); however, you could also use a codespace via your locally installed VSCode program by adding a GitHub Codespaces extension to it.
Explorer: browse, open, and manage all of the files and folders in your project,
Run and Debug: see all information related to running and debugging,
Extensions: add languages, debuggers, and tools to your installation,
Source control (or version control): to track and manage changes to code.
Terminal: an interface in which you can type and execute text based commands - here, we'll use Bash which is both a Unix shell and command language,
! FOLLOW ALONG IN YOUR CODESPACE ! Let's install the Python and Jupyter extension for VSCode
VSCode supports version control using Git: at the lower left corner, we can see which branch - something like a "container" storing a particular version of our code - in our version control system we're currently in (normally, if you didn't change into another branch, it's the one called main).
git add filename,
git add filename command to update it in the staging area,
git commit -m "some message indicating what you commit/have changed" command. Each commit is a new, permanent "snapshot" (checkpoint, or record) of your project in time which you can share and get back to.
git status allows you to check the current status of your working directory and local repository, e.g., whether there are files which have been changed in the working directory, but not staged for commit (another command you'd probably use very often).
git push origin branch-name, and, if collaborating with other people, pulling their changes using git pull or git fetch to keep your local repository in sync with others.
The difference between git fetch and git pull is that the latter copies changes from a remote repository directly into your working directory, while git fetch copies changes only into your local Git repo.
Git workflow from PNGWing.
main) which is the version of the code that is fully tested, stable and reliable,
develop or dev by convention) that we use for work-in-progress code. Feature branches get first merged into develop after having been thoroughly tested. Once develop has been tested with the new features, it will get merged into main.
git branch develop creates a new branch called develop,
git merge branch-name allows you to merge branch-name with the one you're currently in,
git branch tells you which branch you're currently in (something you'd probably check very frequently) as well as gives you a list of which branches exist (the one you're in is denoted by a star symbol),
git checkout branch-name allows you to switch from your current branch into branch-name.
Git feature branches, adapted by original course creators from Git Tutorial by sillevl (Creative Commons Attribution 4.0 International License)
! FOLLOW ALONG IN YOUR CODESPACE ! Ascertain the existence of the main and develop branches
venv
In inflammation/models.py, we see that we import two external libraries:
from matplotlib import pyplot as plt
import numpy as np
venv):
venv, virtualenv, pipenv, conda, poetry.
! FOLLOW ALONG IN YOUR CODESPACE ! Let's create our virtual environment
This can be achieved with the following commands:
$ python3 -m venv venv # creating a new folder called "venv",
# and instantiating a virtual environment
# equally called "venv"
$ source venv/bin/activate # activate virtual environment
(venv) $ which python3 # check whether Python from venv is used
Output:
/workspaces/python-intermediate-inflammation/venv/bin/python3
(venv) $ deactivate # deactivate virtual environment
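The same check can also be made from inside Python. The sketch below relies on the standard-library behaviour that sys.prefix differs from sys.base_prefix when the interpreter runs inside an environment created by venv; the helper name in_virtual_env is our own and not part of the project.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_env())
```

Run with the environment activated, the interpreter path should point into the venv folder, much like the output of which python3 above.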
Our code depends on two external packages (numpy, matplotlib). We need to install those into the virtual environment to be able to run the code using a package manager tool such as pip:
(venv) $ pip3 install numpy matplotlib
When you are collaborating on a project with a team, you will want to make it easy for them to replicate equivalent virtual environments on their machines. With pip, virtual environments can be exported, saved and shared with others by creating a file called, e.g., requirements.txt (you can name it as you like, but it's common practice to label this file as "requirements") in your current directory, and producing a list of packages that have been installed in the virtual environment:
(venv) $ pip3 list # view packages installed
(venv) $ pip3 freeze > requirements.txt # produce list of packages
If someone else is trying to use your library within their own virtual environment, instead of manually installing every dependency, they can just use the command below to install everything specified in the requirements.txt file.
(venv) $ pip3 install -r requirements.txt # install packages from
# requirements file
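To get a feel for what a requirements file contains, the following sketch builds similar name==version lines using only the standard library. It is an approximation for illustration and not how pip freeze is implemented (pip applies extra rules, for example for editable installs); the function name frozen_requirements is our own.

```python
from importlib.metadata import distributions
from typing import List


def frozen_requirements() -> List[str]:
    """Return 'name==version' lines for packages visible to this interpreter."""
    lines = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(frozen_requirements()))
```

Inside the activated virtual environment, the printed list should include numpy and matplotlib together with the packages they depend on.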
Let's check the status of our repository using git status. We get the following output:
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
requirements.txt
venv/
nothing added to commit but untracked files present (use "git add" to track)
While you do not want to commit the newly created directory venv and share it with others, as it is specific to your machine and setup only (containing local paths to libraries on your system specifically), you will want to share requirements.txt with your team as this file can be used to replicate the virtual environment on your collaborators’ systems.
Checking the state of the codebase with git status and using .gitignore:
To tell Git to ignore and not track certain files and directories, you need to specify them in the .gitignore text file in the project root. You can also ignore multiple files at once that match a pattern (e.g. “*.jpg” will ignore all files whose names end in .jpg). Let's add the necessary lines into the .gitignore file:
# Virtual environments
venv/
.venv/
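The effect of such patterns can be approximated in a few lines of Python. This is a simplified sketch using shell-style wildcards from the standard library; Git's real matching rules are richer (negation, anchoring, and so on). The helper is_ignored and the example paths are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns mirroring the .gitignore entries above, plus one wildcard pattern.
IGNORE_PATTERNS = ["venv/", ".venv/", "*.jpg"]


def is_ignored(path: str) -> bool:
    """Approximate whether a repository-relative file path would be ignored."""
    parts = PurePosixPath(path).parts
    for pattern in IGNORE_PATTERNS:
        if pattern.endswith("/"):
            # Directory pattern: ignored if any parent folder has that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File pattern: compared against the file name only.
            return True
    return False


for example in ["venv/bin/python3", "requirements.txt", "figures/plot.jpg"]:
    print(example, "->", "ignored" if is_ignored(example) else "tracked")
```

In this sketch only requirements.txt is reported as tracked, which is the outcome we want for the repository.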
Checking the changes before committing
The most straightforward way to do so is to run git diff under the repository's tree.
git diff will highlight the parts that were removed and/or added
diff GUI tool that is more intuitive to use (left sidebar, third icon usually below the looking glass.)
Let's make a first commit to our local repository:
$ git add .gitignore requirements.txt
$ git commit -m "Initial commit of requirements.txt. Ignoring virtual env. folder."
requirements.txt vs pip freeze:
pip fetch the latest build of the specified library during pip install -r,
(venv) pip is tied to the version of native Python that was used to create said env. As mentioned previously, a new version of Python itself can also break working code from a previous version.
python: (e.g. conda create -n venv python==3.9 -y), on top of having similar mechanics to pip freeze.
Why is testing good?
We'll get into:
Let's create a new branch called test-suite where we'll write our tests. It is good practice to write tests at the same time as we write new code on a feature branch. But since the code already exists, we’re creating a feature branch just for writing tests this time. Generally, it is encouraged to use branches for even small bits of new work.
! FOLLOW ALONG IN YOUR CODESPACE ! Let's generate a new feature branch:
$ git checkout develop
$ git branch test-suite
$ git checkout test-suite
Now let's look at the daily_mean() function in inflammation/models.py. It calculates the daily mean of inflammation values across all patients. Let's first think about how we could manually test this function.
One way to test whether this function does the right thing is to think about which output we'd expect given a certain input. We can test this manually by creating an input and output variable, and use, e.g., npt.assert_array_equal() to check whether the outcome of daily_mean() given the input variable matches the output variable.
daily_mean(), we need to import it. To import it, we need to instantiate a directory for the codespace.
import numpy as np, and choosing Run in Interactive Window, and Run Selection/Line in Interactive Window).
import os
# get the current working directory
os.getcwd()
import numpy as np
import numpy.testing as npt
from inflammation.models import daily_mean
test_input = np.array([[1, 2], [3, 4], [5, 6]])
test_result = np.array([3, 4])
npt.assert_array_equal(daily_mean(test_input), test_result) # Runs without a hitch
# print(daily_mean(test_input)) # confirm that it matches `test_result`
We can think about multiple pairs of expected output given a certain input:
test_input = np.array([[2, 0], [4, 0]])
test_result = np.array([2, 0])
npt.assert_array_equal(daily_mean(test_input), test_result)
test_input = np.array([[0, 0], [0, 0], [0, 0]])
test_result = np.array([0, 0])
npt.assert_array_equal(daily_mean(test_input), test_result)
However, the first of these tests fails because the actual output does not match the expected output:
...
AssertionError:
Arrays are not equal
Mismatched elements: 1 / 2 (50%)
Max absolute difference: 1.
Max relative difference: 0.5
x: array([3., 0.])
y: array([2, 0])
The reason here is that one of our specified outputs is wrong - which reminds us that tests themselves can be written in a wrong way, so it's good to keep them as simple as possible so as to minimize errors.
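A small sketch makes this concrete. It re-runs the three checks above against a plain-Python stand-in for daily_mean (assumed here to be a column-wise mean, which is consistent with the outputs shown), collects failures instead of stopping at the first one, and shows that the failing case comes from the expected value rather than from the function. The name daily_mean_standin is hypothetical.

```python
from statistics import fmean


def daily_mean_standin(data):
    """Column-wise mean of a list of rows (plain-Python stand-in)."""
    return [fmean(column) for column in zip(*data)]


# (input, expected) pairs taken from the manual checks above.
cases = [
    ([[1, 2], [3, 4], [5, 6]], [3, 4]),
    ([[2, 0], [4, 0]], [2, 0]),  # expected value is wrong: the mean is [3, 0]
    ([[0, 0], [0, 0], [0, 0]], [0, 0]),
]

failures = []
for test_input, expected in cases:
    actual = daily_mean_standin(test_input)
    if actual != expected:
        # Record the failure and keep going so later cases still run.
        failures.append((test_input, expected, actual))

for test_input, expected, actual in failures:
    print(f"input={test_input} expected={expected} actual={actual}")
print(f"{len(cases) - len(failures)} passed, {len(failures)} failed")
```

Only the second case is reported, and correcting its expected value to [3, 0] would make all three pass.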
We could put these tests in a separate script to automate running them. However, a Python script stops at the first failed assertion, so if one fails for whatever reason, all subsequent tests wouldn't be run at all. This calls for a testing framework such as Pytest where we
Let's look at tests/test_models.py where we see one test function called test_daily_mean_zeros():
import numpy as np            # imports the test relies on
import numpy.testing as npt


def test_daily_mean_zeros():
    """Test that mean function works for an array of zeros."""
    from inflammation.models import daily_mean

    test_input = np.array([[0, 0],
                           [0, 0],
                           [0, 0]])
    test_result = np.array([0, 0])

    # Need to use NumPy testing functions to compare arrays
    npt.assert_array_equal(daily_mean(test_input), test_result)
Generally, each test function requires
test_input NumPy array,
daily_mean() function so we can use it (we only import the necessary library function we want to test within each test function),
daily_mean() with our test_input array and using npt.assert_array_equal() to test its validity,
PyTest, the letters ‘test_’ at the beginning of the function name.
! FOLLOW ALONG IN YOUR CODESPACE ! Let's install PyTest:
pip install pytest
We can then run PyTest in the CLI…
python -m pytest tests/test_models.py # or
# pytest tests/test_models.py
… and get the following output:
======================================================== test session starts =========================================================
platform linux -- Python 3.10.8, pytest-7.3.2, pluggy-1.2.0
rootdir: /workspaces/python-intermediate-inflammation
collected 2 items
tests/test_models.py .. [100%]
========================================================= 2 passed in 1.06s ==========================================================
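The naming convention mentioned earlier is what lets a framework find tests without being told about each one. The following is a simplified illustration of that idea and not Pytest's actual implementation; collect_tests and the example functions are hypothetical.

```python
import types


def collect_tests(namespace: dict) -> list:
    """Return the names of plain functions in a namespace that start with 'test_'."""
    return sorted(
        name
        for name, obj in namespace.items()
        if name.startswith("test_") and isinstance(obj, types.FunctionType)
    )


def test_addition():
    assert 1 + 1 == 2


def test_upper():
    assert "ci".upper() == "CI"


def helper_not_collected():
    return 42


namespace = {
    "test_addition": test_addition,
    "test_upper": test_upper,
    "helper_not_collected": helper_not_collected,
}

found = collect_tests(namespace)
for name in found:
    namespace[name]()  # run each collected test; an AssertionError means failure
print("collected:", found)
```

The helper function is skipped because its name lacks the test_ prefix, which is why every test function in tests/test_models.py follows that naming scheme.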
We can also test single testing functions in our test_models.py file. To do that, we need to configure our testing set up by clicking on the testing icon, choosing Configure Python Test, then Pytest, and then the folder the tests are in.
! TASK 1 ! Write a new test case that tests the daily_max() function, adding it to tests/test_models.py.
Ascertain we are under the test-suite branch, which was created from develop
daily_mean(), defining input and expected output variables followed by the equality assertion.
python -m pytest tests/test_models.py, and have a look at your new tests pass.
def test_daily_max_integers():
    """Test that max function works for an array of positive integers."""
    from inflammation.models import daily_max

    test_input = np.array([[4, 2, 5],
                           [1, 6, 2],
                           [4, 1, 9]])
    test_result = np.array([4, 6, 9])

    npt.assert_array_equal(daily_max(test_input), test_result)
Now that we have added a new test, we would like to commit these new changes from the test-suite branch to the develop branch. Namely:
requirements.txt since we added new dependencies (pip freeze)
git add (staging phase)
git commit
git merge to incorporate the changes from test-suite to develop
$ pip3 freeze > requirements.txt
$ git add requirements.txt tests/test_models.py
$ git commit -m "Add initial test cases for daily_max()"
$ git switch develop
$ git merge test-suite # develop <- test-suite
In case of mistakes
git log to check the history of commits and changes
git reset --hard <commit hash> to revert to previous state of the code
We had used two different testing functions to distinguish between inputs of zeros and inputs of positive integers. Writing separate test functions to test the same function for different cases is quite inefficient - that's where test parameterisation comes in handy.
Instead of writing a separate function for each different test, we can parameterise the tests with multiple test inputs, e.g., in tests/test_models.py, we can rewrite test_daily_mean_zeros() from above and test_daily_mean_integers()
import pytest  # provides the parametrize decorator


@pytest.mark.parametrize(
    "test, expected",
    [
        ([ [0, 0], [0, 0], [0, 0] ], [0, 0]),
        ([ [1, 2], [3, 4], [5, 6] ], [3, 4]),
    ])
def test_daily_mean(test, expected):
    """Test mean function works for array of zeroes and positive integers."""
    from inflammation.models import daily_mean
    npt.assert_array_equal(daily_mean(np.array(test)), np.array(expected))
test for inputs, and expected for outputs -, as well as the inputs and outputs themselves that correspond to these names. Each row within the square brackets following the "test, expected" arguments corresponds to one test case. Let's look at the first row:
[ [0, 0], [0, 0], [0, 0] ] would be the input, corresponding to the input name test,
[0, 0] would be the output, corresponding to the output name expected,
parametrize() function is a Python decorator: A Python decorator is a function that takes as an input a function, adds some functionality to it, and then returns it (more about this in the section on functional programming).
parametrize() is a decorator in that it takes as an input the respective testing function, adds functionality to it by specifying multiple input and expected output test cases, and calls the function over each of these inputs automatically when this test is called.
! TASK 2 ! Rewrite your test functions for daily_max() using test parameterisation.
test-suite branch.
python -m pytest tests/test_models.py, and have a look at your new tests pass.
test-suite branch into the develop branch.
@pytest.mark.parametrize(
    "test, expected",
    [
        ([ [0, 0], [0, 0], [0, 0] ], [0, 0]),
        ([ [1, 2], [3, 4], [5, 6] ], [5, 6]),
    ])
def test_daily_max(test, expected):
    """Test max function works for array of zeroes and positive integers."""
    from inflammation.models import daily_max
    npt.assert_array_equal(daily_max(np.array(test)), np.array(expected))
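For readers curious how a decorator can run one function over several cases, here is a toy sketch of the mechanism. It is not how Pytest implements parametrize; the names run_with_cases and check_max are made up for this illustration, and the built-in max stands in for daily_max.

```python
import functools


def run_with_cases(cases):
    """Toy decorator factory: call the decorated function once per case."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper():
            results = []
            for args in cases:
                try:
                    func(*args)
                    results.append((args, "passed"))
                except AssertionError:
                    results.append((args, "failed"))
            return results
        return wrapper
    return decorator


@run_with_cases([
    ([0, 0, 0], 0),
    ([1, 5, 3], 5),
    ([2, 4], 3),  # deliberately wrong expectation
])
def check_max(values, expected):
    assert max(values) == expected


report = check_max()
for args, outcome in report:
    print(args, outcome)
```

The decorated function is written once, while the decorator takes care of calling it for every case and recording each outcome separately.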
We can find problems in our code conveniently in VSCode using breakpoints (points at which we want code execution to stop) and our testing functions.
! FOLLOW ALONG IN YOUR CODESPACE !
daily_max(), and set a breakpoint somewhere within that function by left-clicking the space to the left of the line numbers,
test_daily_max(), then choose Debug Test.
DEBUG CONSOLE to check whether our function does what it's supposed to do, e.g., run np.max(data, axis=0) and see whether it's giving the expected output (i.e., array([0, 0])).
While Pytest is an indispensable tool to speed up testing, it can't help us decide what to test and how many tests to run.
As a heuristic, we should try to come up with tests that test
This ensures a high degree of code coverage. A Python package called pytest-cov that is used by Pytest gives you exactly this - the degree to which you've covered your code with respect to tests.
! FOLLOW ALONG IN YOUR CODESPACE ! Let's install pytest-cov and assess code coverage:
$ pip3 install pytest-cov
$ python -m pytest --cov=inflammation.models tests/test_models.py
--cov is an additional named argument to specify the code that is to be analysed for test coverage.
Output:
================================================= test session starts =================================================
platform linux -- Python 3.10.8, pytest-7.3.2, pluggy-1.2.0
rootdir: /workspaces/python-intermediate-inflammation
plugins: cov-4.1.0
collected 7 items
tests/test_models.py ....... [100%]
---------- coverage: platform linux, python 3.10.8-final-0 -----------
Name Stmts Miss Cover
--------------------------------------------
inflammation/models.py 9 2 78%
--------------------------------------------
TOTAL 9 2 78%
================================================== 7 passed in 1.25s ==================================================
inflammation.models are tested. To see which ones have not yet been tested, we can use the following line in the terminal:
python -m pytest --cov=inflammation.models --cov-report term-missing tests/test_models.py
Output:
================================================= test session starts =================================================
platform linux -- Python 3.10.8, pytest-7.3.2, pluggy-1.2.0
rootdir: /workspaces/python-intermediate-inflammation
plugins: cov-4.1.0
collected 7 items
tests/test_models.py ....... [100%]
---------- coverage: platform linux, python 3.10.8-final-0 -----------
Name Stmts Miss Cover Missing
------------------------------------------------------
inflammation/models.py 9 2 78% 18, 32
------------------------------------------------------
TOTAL 9 2 78%
================================================== 7 passed in 0.29s ==================================================
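The Cover column can be reproduced from the Stmts and Miss columns. The sketch below shows the arithmetic only; the exact rounding rule used by the coverage tool may differ in edge cases, and the function name coverage_percent is our own.

```python
def coverage_percent(statements: int, missed: int) -> int:
    """Percentage of statements executed, rounded to a whole number."""
    if statements == 0:
        return 100  # assumption: treat an empty module as fully covered
    return round(100 * (statements - missed) / statements)


# Values from the report above: 9 statements, 2 of them never executed.
print(coverage_percent(9, 2))
```

With 7 of 9 statements executed, the result is 78, matching the report; the Missing column then names the two lines that no test reached.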
! TASK 3 ! Clean up and final commit of our test-suite changes
$ pip3 freeze > requirements.txt
$ git status
# TODO: Update .gitignore with __pycache__ and other undesirable files, etc.
$ git add ./
$ git commit -m "Add coverage support"
$ git checkout develop
$ git merge test-suite
What is Test Driven Development?
In test-driven development, we first write the tests, and then the code, i.e., the thinking process would go from
This way, the set of tests act like a specification of what the code does. The main advantages are:
If we're collaborating on a software project with multiple people who push a lot of changes to one of the major repositories, we'd need to constantly pull down their changes to our local machines, and do our tests with the newly pulled down code - this would result in a lot of back and forth, slowing us down quite a bit. That's where Continuous integration (CI) comes in handy:
There are many CI infrastructures and services. We’ll be looking at GitHub Actions - which, unsurprisingly, is available as part of GitHub.
YAML (a recursive acronym which stands for “YAML Ain’t Markup Language”) is a text format used by GitHub Actions workflow files. YAML files use
name: Kilimanjaro
height_metres: 5892
first_scaled_by: Hans Meyer

first_scaled_by:
  - Hans Meyer
  - Ludwig Purtscheller

height:
  value: 5892
  unit: metres
  measured:
    year: 2008
    by: Kilimanjaro 2008 Precise Height Measurement Expedition
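For readers who think in Python, the three YAML snippets above correspond roughly to dictionaries and lists. The sketch assumes that, in the last snippet, value, unit and measured are nested under height, and year and by under measured; the variable names are our own.

```python
# Key-value pairs map to a dictionary.
mountain = {
    "name": "Kilimanjaro",
    "height_metres": 5892,
    "first_scaled_by": "Hans Meyer",
}

# A YAML list (lines starting with "-") maps to a Python list.
climbers = {"first_scaled_by": ["Hans Meyer", "Ludwig Purtscheller"]}

# Indentation creates nesting: dictionaries inside dictionaries.
height = {
    "height": {
        "value": 5892,
        "unit": "metres",
        "measured": {
            "year": 2008,
            "by": "Kilimanjaro 2008 Precise Height Measurement Expedition",
        },
    }
}

print(height["height"]["measured"]["year"])
```

Reading a value then means following the keys from the outside in, just as the indentation levels suggest.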
Let's set up CI using GitHub Actions: with a GitHub repository, there’s a way we can set up CI to run our tests automatically when we commit changes.
To do this, we need to add a new file in a particular directory of our repository (make sure you're on the test-suite branch).
! TASK 4 ! Adding a basic testing workflow with Continuous Integration using GitHub Actions
Let's create a new directory .github/workflows which is used specifically for GitHub Actions, as well as a new file called main.yml:
$ mkdir -p .github/workflows
$ cd .github/workflows
$ vim main.yml # GitHub Actions accepts both the .yml and .yaml extensions
In main.yml, we'll write the following:
name: CI
# We can specify which GitHub events will trigger a CI build
on: push
jobs: # Can have multiple jobs, run in parallel
  build:
    # we can also specify the OS to run tests on
    runs-on: ubuntu-latest
    # a job is a sequence of steps
    steps:
      # Next we need to check out our repository, and set up Python
      # A 'name' is just an optional label shown in the log - helpful to clarify progress - and can be anything
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: "3.9"
      - name: Install Python dependencies
        run: |
          python3 -m pip install --upgrade pip
          pip3 install -r requirements.txt
      - name: Test with PyTest
        run: |
          python -m pytest --cov=inflammation.models tests/test_models.py
name: CI: name of our workflow
on: push: indication that we want this workflow to run when we push commits to our repository.
jobs: build: the workflow itself is made of a single job named build, but could contain any number of jobs after this one, each of which would run in parallel.
runs-on: ubuntu-latest: statement about which operating system we want to use, in this case just Ubuntu.
steps: the steps that our job will undertake in turn to 1) set up the job’s environment (think of it as a freshly installed machine, albeit virtual, with very little installed on it) and 2) run our tests. Each step has a name (which you can choose to your liking) and a way to be executed (as specified by uses/run).
name: Checkout repository: use a GitHub Action called checkout
name: Set up Python 3.9: here, we use the setup-python Action, indicating that we want Python version 3.9.
name: Install Python dependencies: install the latest version of pip and the dependencies of our inflammation package: it’s good practice to upgrade the version of pip that is present first, then we use pip to install our package dependencies from requirements.txt.
name: Test with PyTest: finally, we let PyTest run our tests in tests/test_models.py, including code coverage.
Finally, let's commit the changes and push to trigger the actions
First, let's run the following commands:
$ git add .github/workflows
$ git commit -m "Added GH CI support"
$ git push
Then head under the "Actions" tab of your repository to check the build results.
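Conceptually, the job runs its steps in order, and by default a failing step prevents the later ones from running. A rough local analogy in Python is sketched below; it is not how GitHub Actions is implemented, and the step names and commands are placeholders standing in for the pip and pytest commands of the workflow.

```python
import subprocess
import sys

# Placeholder steps; in the workflow these would be the pip and pytest commands.
steps = [
    ("Say hello", [sys.executable, "-c", "print('hello from step 1')"]),
    ("Failing step", [sys.executable, "-c", "raise SystemExit(1)"]),
    ("Never reached", [sys.executable, "-c", "print('skipped')"]),
]


def run_steps(steps):
    """Run commands in order; stop at the first non-zero exit code."""
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Step '{name}' failed with exit code {result.returncode}")
            return completed, name
        completed.append(name)
    return completed, None


completed, failed = run_steps(steps)
print("completed:", completed, "failed:", failed)
```

The report on the Actions page follows the same pattern: each step is marked as successful or failed, and steps after a failure are skipped.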
To address whether our code works on different target user platforms (e.g., Ubuntu, Mac OS, or Windows), with different Python installations (e. g., 3.8, 3.9 or 3.10), we can use a feature called build matrices. Doing our tests across all these platforms and program versions would take a lot of time - that's where a build matrix comes in handy.
main.yml, we can specify environments (such as operating systems) and parameters (such as Python versions), and new jobs will be created that run our tests for each combination of these.
strategy as a matrix of operating systems and Python versions within build. We then use matrix.os and matrix.python-version to reference these configuration possibilities
job
name: CI
# We can specify which GitHub events will trigger a CI build
on: push
# now define a single job 'build' (but could define more)
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.8", "3.9", "3.10"]
    runs-on: ${{ matrix.os }}
    # a job is a sequence of steps
    steps:
      # Next we need to check out our repository, and set up Python
      # A 'name' is just an optional label shown in the log - helpful to clarify progress - and can be anything
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
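To see how many jobs such a matrix produces, the combinations can be enumerated in Python. This is only an illustration of the idea; GitHub expands the matrix itself, and the variable names below are our own.

```python
from itertools import product

operating_systems = ["ubuntu-latest", "macos-latest", "windows-latest"]
python_versions = ["3.8", "3.9", "3.10"]

# One job per (os, python-version) pair, as in the strategy matrix above.
jobs = [
    {"os": os_name, "python-version": version}
    for os_name, version in product(operating_systems, python_versions)
]

for job in jobs:
    print(f"{job['os']} / Python {job['python-version']}")
print(len(jobs), "jobs in total")
```

Three operating systems times three Python versions gives nine jobs, each running the same steps with different values for matrix.os and matrix.python-version.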
Let's add the new folder/file to the local repository and merge with develop:
$ git add .github
$ git commit -m "Add GitHub Actions configuration & build matrix for os and Python version"
$ git switch develop
$ git merge test-suite
Whenever you push your changes to a remote repository, GitHub will run CI as specified by main.yml.
You can check its status on the website of your remote repository under Actions.
For each push, you'd get a report about which of the steps have been successfully/unsuccessfully taken.
You may also look into these resources on unit testing, scaling it up, and continuous integration: