US RSE-23 Tutorial
Valerio Maggio and Dave Clements, Anaconda
October 10, 2023
In this hands-on tutorial, we will be assuming that:
If you have the Windows Subsystem for Linux (WSL) installed then you can use the mainline instructions in this tutorial. You will only need to follow the "Windows specific" instructions if you don't have WSL installed.
When instructions are different on Windows than they are on Linux, macOS, or WSL there will be a separate section with Windows instructions like this:
Windows (non WSL) 💁
Launch the command prompt.
Start → Windows System → Command Prompt
We'll be using the command prompt
and then the anaconda prompt
throughout.
Before diving into the exercises, let's make sure we have everything we need to get started.
You should have conda and git installed on your computer. If you don't have all of these then please have a look at the next section for detailed instructions on how to install what you need.
Otherwise, please feel free to jump directly to the Create Conda Environment section to proceed.
1. Install Miniconda
For this tutorial we recommend Miniconda, a free minimal installer for conda
. It includes only conda, Python, the packages they depend on, and a small number of other useful packages, including pip
, zlib
and a few others.
Use the conda install
command to install over 3,000 additional conda packages from the Anaconda repository, and over 20,000 packages from conda-forge.
Please download and install the most recent Miniconda installer that matches your computer architecture and Operating System. Accept the default settings when installing.
Anaconda Distribution 🐍
Unlike Miniconda, Anaconda Distribution provides immediate access to lots of packages that get installed on your computer alongside the Python distribution.
For the sake of this tutorial, installing Miniconda would be more than enough. However, if you want to install Anaconda Distribution, or you already have it installed on your computer, that would work too!
In that case, you would only need to make sure that you are running the latest version of conda
. To do so, please run the following command in the Terminal / Anaconda Prompt:
2. Install Git
Installing Git
will vary depending on your Operating System. You could follow all the instructions reported here for a detailed step-by-step setup.
3. Code Editor
Choosing the "right" code editor is always a matter of personal taste, laziness (to adjust to new UIs) and religious attachment to certain lines of code. Also, what you can possibly find online isn't that helpful either, with oxymoronic titles like "The 5 Best Code Editors".
For this very reason, we are not requiring you to use any specific code editor. Any editor that you like which has syntax highlighting for the Python language will do.
During the live coding sessions of the tutorial, we will be using Visual Studio Code, with the MS Python Extension.
If you are here, it's because you already have a working version of conda
installed on your computer. The first thing to do now is to create a virtual Conda environment that we will be using throughout the tutorial.
Windows (non WSL) 💁
Launch the newly installed Anaconda Prompt:
Start → Anaconda Prompt (miniconda3)
To create our new packaging
conda environment just type the following instruction in the terminal/command line/Anaconda Prompt:
This will create a new conda environment using Python 3.10, and then install IPython
and pytest
as extra packages.
This is all we need to get started. At this point, we just have to activate our new environment, and then we will be ready to go!
The command line now starts with (packaging) instead of (base), reflecting that you are now in the packaging environment.
In this tutorial we will be working together on a new and never-been-seen-before application that will implement a die roller to be used during our campaign of Dungeons & Dragons.
The specs of this application are pretty simple:
We would need to roll a single die, choosing one of d4
, d6
, d8
, d10
, d12
, d20
, d100
We would need to make multiple die rolls, of multiple dice (e.g. 2d4
and 3d6
).
Nice, let's get started! Shall we!?
dnd-roller GitHub repository
Let's start by creating the GitHub repository that will host our dnd-roller project.
Let's go to your GitHub profile page (i.e. https://github.com/<your-gh-username>
); click the green 🖥️ New button in the upper right corner.
Name the new repository dnd-roller, with the following description:
Simple python app to roll dice for D&D (Dungeons&Dragons)
Also:
- README file
- .gitignore, choosing the template for Python
- MIT License
We're now ready to proceed, and to hit the button Create repository.
At this point, all that remains is to clone this repo on your local computer:
dnd-roller
It's finally time to write some Python code for our dnd-roller
!
We will start by creating the new dnd_roller
Python package, and its corresponding folder for tests
.
Note the underscore in the dnd_roller
subdirectory name. The directory of your project will look something like this:
dice.py Module
Let's now work on our new dice.py module: dnd_roller/dice.py. This module will contain all the functions that implement the core functionalities of our dnd-roller.
Let's recap what we need to implement in our dice roller:
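- a function to roll a single die (roll), supporting d4|6|8|10|12|20|100
- a function to roll the same die multiple times (dice_roll)
- a function that takes a sequence of dice rolls (e.g. "2d4, 3d6, 1d10, 6d12") and will generate a tabular report of the outcomes.
Here is the code for dice.py. Paste this into a file named dnd_roller/dice.py using your text editor.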
Here is more information about this module. We will use this to implement tests.
roll function
The requirements for this function are pretty simple indeed:
roll(d=4) -> return a number in [1,4]
roll(d=10) -> return a number in [1,10]
roll(d=5) -> Exception: Unsupported die
roll(d=100) -> roll(d=10) * 10
💡 A few comments:
In this implementation, we first want to be sure that the parameter d
does actually correspond to an integer. We do so adopting an idiomatic approach known as EAFP
(Easier to ask for Forgiveness than Permission). Afterwards, we check that the number of sides chosen for the die is indeed valid in D&D, and then we simply rely on the random.randint
function to generate the result of our rolling. Simple as that!
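As a rough illustration of the requirements and comments above, here is a minimal sketch of what a roll function could look like. It is not the tutorial's actual dice.py: the ValueError exception type, the SUPPORTED_DICE name, and the extra check that rejects non-integer values such as 4.5 are assumptions made for this example. The verbose flag mirrors the one used in the README examples.

```python
import random

# Die sizes listed in the specs of the application.
# The constant name is an assumption made for this sketch.
SUPPORTED_DICE = (4, 6, 8, 10, 12, 20, 100)


def roll(d, verbose=False):
    """Roll a single die with ``d`` sides and return the outcome."""
    # EAFP: attempt the conversion and handle the failure,
    # rather than inspecting the type of ``d`` beforehand.
    try:
        sides = int(d)
    except (TypeError, ValueError, OverflowError):
        raise ValueError(f"Unsupported die: {d!r}") from None
    # ``sides != d`` rejects values that int() silently truncates (e.g. 4.5).
    if sides != d or sides not in SUPPORTED_DICE:
        raise ValueError(f"Unsupported die: {d!r}")
    if sides == 100:
        # A d100 is rolled as a d10 multiplied by 10, as in the requirements.
        outcome = random.randint(1, 10) * 10
    else:
        outcome = random.randint(1, sides)
    if verbose:
        print(f"You rolled {outcome}")
    return outcome
```

Note that random.randint includes both endpoints, which is what makes it a natural fit for a die numbered from 1 to d.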
dice_roll function
Given that all the heavy-lifting has been done already in the roll function, the implementation of the dice_roll function is pretty straightforward: it just needs to call the roll function multiple times, and return the outcomes as a Python list.
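A possible shape for dice_roll, sketched under the assumption that it takes the throws and sides parameters shown in the README examples. The roll function below is a simplified stand-in (no validation, no d100 rule) so that the snippet runs on its own.

```python
import random


def roll(d):
    """Simplified stand-in for the roll function (no input validation)."""
    return random.randint(1, d)


def dice_roll(throws, sides):
    """Roll a ``sides``-sided die ``throws`` times and collect the outcomes."""
    # Delegate every single throw to roll(), so its rules live in one place.
    return [roll(d=sides) for _ in range(throws)]
```

Keeping dice_roll as a thin wrapper means that any change to the rules of a single roll is automatically picked up by multiple rolls too.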
sequence_rolls function
The last function in dnd-roller is sequence_rolls.
Generating the dice rolls is pretty straightforward. All that is really left to do is to parse the input sequence so that it becomes input parameters for the dice_roll
function. However that is pretty simple to do as well, so we do it in a quite convoluted way using a combination of functional programming and dictionary comprehension to make it fun.
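To make the parsing step more concrete, here is one way it might be written, combining map/filter with a dictionary comprehension. This is a sketch, not the tutorial's implementation: dice_roll is a simplified stand-in, and the as_rows helper is a hypothetical name introduced only to show how the outcomes could be arranged for a table.

```python
import random


def dice_roll(throws, sides):
    """Simplified stand-in for dice_roll (no validation, no d100 rule)."""
    return [random.randint(1, sides) for _ in range(throws)]


def sequence_rolls(sequence):
    """Parse a sequence such as "2d4, 3d6" and roll each entry in turn."""
    # Functional step: strip whitespace around each entry, drop empty ones.
    entries = filter(None, map(str.strip, sequence.split(",")))
    # Dictionary comprehension: "2d4" becomes dice_roll(2, 4).
    return {
        entry: dice_roll(*map(int, entry.lower().split("d")))
        for entry in entries
    }


def as_rows(outcomes):
    """Hypothetical helper: build (dice, rolls, sum) rows for a table."""
    return [(dice, rolls, sum(rolls)) for dice, rolls in outcomes.items()]
```

Because the outcomes are stored in a dictionary keyed by the entry text, a repeated entry (say "2d4, 2d4") would keep only its last set of rolls in this sketch. The rows returned by as_rows have the same dice, rolls, and sum columns as the report shown in the README example, so they could be handed to a table formatter such as tabulate.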
tabulate
As for the tabulation part, we will leverage the tabulate
package, that is directly available in the default conda channel:
Alternatively, you could install the tabulate
package directly from PyPI:
__init__.py
We'll need an __init__.py
file to make Python treat our directory as a package. The file can be empty, but it must exist. Create it in the main dnd_roller
directory. __init__.py
is run once when the module is first imported and it's an important part of defining the public interface to our module.
Our module has three functions, roll
, dice_roll
, and sequence_rolls
. The roll
function supports the dice_roll
function, which in turn supports the sequence_rolls
function. So, it is possible that we want only sequence_rolls
in the package's public interface. But all 3 functions are useful on their own, so let's put all 3 in the public interface. Create __init__.py
in the main dnd_roller directory:
which enables this
instead of this
Users of your package don't need to know the details of where you implemented your API.
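The effect of re-exporting names from __init__.py can be tried out in isolation. The sketch below builds a throwaway copy of the package layout in a temporary directory, with placeholder functions instead of the real implementations. The __all__ list and the exact import line are assumptions about what such an __init__.py might contain.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Throwaway project folder, so the real project is left untouched.
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "dnd_roller"
package_dir.mkdir()

# Placeholder module: the real dice.py contains full implementations.
(package_dir / "dice.py").write_text(textwrap.dedent('''
    def roll(d): ...
    def dice_roll(throws, sides): ...
    def sequence_rolls(sequence): ...
'''))

# The re-exports that define the public interface of the package.
(package_dir / "__init__.py").write_text(
    "from .dice import roll, dice_roll, sequence_rolls\n"
    '__all__ = ["roll", "dice_roll", "sequence_rolls"]\n'
)

sys.path.insert(0, str(project_root))

# Short form, enabled by the re-export in __init__.py.
from dnd_roller import roll

# Longer form, which exposes where the function is implemented.
from dnd_roller.dice import roll as roll_from_module

# Both names refer to the very same function object.
print(roll is roll_from_module)
```

If dice.py were later split into several modules, only __init__.py would need updating, and code using the short import would keep working.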
💡 This is a good opportunity to remind you that it is indeed possible to install non-conda packages into a Conda environment 💫
🎉 We are finally done with our fancy dnd-roller
. All we need to do is to try to generate some tabular report. In a [I]Python interpreter:
We could try the new implementation interactively in the Python interpreter.
Note: ipython
(or even default python
interpreter) would equally do:
Published packages (and all software for that matter) should include test cases.
To verify that this implementation does everything we expect it to, the best possible way is to write a few tests.
We will be using PyTest as our testing framework, so let's first create a pytest.ini
configuration file in the main root folder. This file will instruct pytest
to run the test by managing the namespace resolution properly:
roll
tests
Now let's create a new test module test_roll.py
under the tests
folder with a bunch of pretty simple test functions that test roll
:
Using pytest, and its parametrize
feature, we're checking a few corner cases (e.g. NaN inputs, negative or float numbers), as well as correct expected behaviours for our roll
function (including the only "problematic" case of the d100
).
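For illustration, tests along these lines could look like the sketch below. To keep it runnable without any extra package, the test functions use plain loops where the tutorial would use pytest.mark.parametrize, and they exercise a compact stand-in for roll that raises ValueError (an assumption). The function names follow the test_ prefix convention that pytest uses for collection.

```python
import math
import random

SUPPORTED_DICE = (4, 6, 8, 10, 12, 20, 100)


def roll(d):
    """Compact stand-in for roll, so the tests below run on their own."""
    try:
        sides = int(d)
    except (TypeError, ValueError, OverflowError):
        raise ValueError(f"Unsupported die: {d!r}") from None
    if sides != d or sides not in SUPPORTED_DICE:
        raise ValueError(f"Unsupported die: {d!r}")
    if sides == 100:
        return random.randint(1, 10) * 10
    return random.randint(1, sides)


# Corner cases: NaN, negative, float, unsupported size, wrong types.
INVALID_DICE = (math.nan, -4, 4.5, 5, None, "d6")
REGULAR_DICE = (4, 6, 8, 10, 12, 20)


def test_invalid_dice_are_rejected():
    for d in INVALID_DICE:
        try:
            roll(d)
        except ValueError:
            continue
        raise AssertionError(f"roll({d!r}) should have been rejected")


def test_regular_dice_stay_in_range():
    for d in REGULAR_DICE:
        assert all(1 <= roll(d) <= d for _ in range(100))


def test_d100_returns_multiples_of_ten():
    assert all(roll(100) in range(10, 101, 10) for _ in range(100))
```

With pytest, each loop would typically become a parametrized test, so that every input is reported as a separate test case.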
dice_roll
tests
Testing should verify that the actual implementation of the dice_roll
is indeed calling the roll
function (multiple times), without duplicating code! (which is a very bad practice, ed.). To do so, we will be creating a Mock
(more here).
The more interesting part here is about the tests:
first we could re-use the same strategy we used with roll
by using pytest.mark.parametrize
to generate a few test inputs (a.k.a. fixtures
).
In this case we will be generating a few combinations of throws and sides values, and we will be checking that
- the number of outcomes is equal to throws
- each outcome is in the range [1, sides]
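Both ideas, the mock and the checks on the outcomes, can be sketched as follows. This is an illustrative version that runs on its own: roll and dice_roll are simplified stand-ins, the CASES list is made up for the example, and the mock is swapped in with patch.dict on the module globals. In the real project the patch target would more likely be the roll name inside the dnd_roller.dice module.

```python
import random
from unittest.mock import Mock, patch


def roll(d):
    """Simplified stand-in for roll (no validation, no d100 rule)."""
    return random.randint(1, d)


def dice_roll(throws, sides):
    return [roll(d=sides) for _ in range(throws)]


# (throws, sides) combinations; pytest.mark.parametrize could supply these.
CASES = [(1, 4), (2, 6), (3, 8), (5, 20)]


def test_dice_roll_delegates_to_roll():
    for throws, sides in CASES:
        fake_roll = Mock(return_value=1)
        # Temporarily swap the module-level ``roll`` for the mock.
        with patch.dict(globals(), {"roll": fake_roll}):
            outcomes = dice_roll(throws=throws, sides=sides)
        # One call per throw, each asking for the right die.
        assert fake_roll.call_count == throws
        fake_roll.assert_called_with(d=sides)
        assert outcomes == [1] * throws


def test_dice_roll_outcomes():
    for throws, sides in CASES:
        outcomes = dice_roll(throws=throws, sides=sides)
        assert len(outcomes) == throws
        assert all(1 <= value <= sides for value in outcomes)
```

The mock records how it was called, so the first test fails if dice_roll ever stops delegating to roll and starts generating numbers by itself.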
sequence_rolls
tests
Our last function needs some tests too. We will be adding the tests for the sequence_rolls
function into a new tests/test_sequence_rolls.py
test module.
The first test is pretty similar to the one discussed previously: we're just checking that sequence_rolls is not reinventing the wheel, and that the dice_roll function is called instead, with the right parameters. Again, we are leveraging unittest.mock.patch to do so.
The other tests are more generally testing the output generated by sequence_rolls
, so that each sequence has (a) the correct number of rolls, and (b) exactly the very same rolls we are expecting. To do so, we use a trick that sets the random seed. We repeat the calls to random in the same way/order. In this way, we are absolutely sure to always generate the same sequence of numbers.
FYI, this is the foundation on which Reproducibility in Data Science could be obtained (e.g. see Reproducibility in Deep learning).
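The seed trick can be sketched like this. The snippet is an assumption-laden illustration rather than the tutorial's test module: dice_roll and sequence_rolls are simplified stand-ins, and the seed value 42 is arbitrary.

```python
import random


def dice_roll(throws, sides):
    """Simplified stand-in for dice_roll."""
    return [random.randint(1, sides) for _ in range(throws)]


def sequence_rolls(sequence):
    """Simplified stand-in for sequence_rolls (no tabular report)."""
    entries = [entry.strip() for entry in sequence.split(",")]
    return {e: dice_roll(*map(int, e.split("d"))) for e in entries}


def test_sequence_rolls_is_reproducible():
    sequence = "2d4, 3d6, 1d10"
    random.seed(42)
    outcomes = sequence_rolls(sequence)

    # Reset the seed, then repeat the random calls in the same order.
    random.seed(42)
    expected = {
        "2d4": [random.randint(1, 4) for _ in range(2)],
        "3d6": [random.randint(1, 6) for _ in range(3)],
        "1d10": [random.randint(1, 10) for _ in range(1)],
    }
    assert outcomes == expected
    assert [len(rolls) for rolls in outcomes.values()] == [2, 3, 1]
```

The expected values are never hard-coded: they are regenerated from the same seed, so the test only relies on the random calls happening in the same order.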
To run our tests, let's move back to our terminal:
You should get an output similar to the one reported below:
It's finally time to pack! Our dnd-roller
is ready to become a re-usable Python package for everybody to use. So, let's create the skeleton for our future package-to-be, following the instructions reported in the official Python documentation.
setup.py and package metadata
The first thing we do is to create a new setup.py file that uses Python setuptools.setup to specify initial package metadata:
💡 Please note that we specified our external dependencies in the install_requires=[...]
parameter of the setup
function, that is tabulate
.
❗️ At this point, please feel free to add any additional metadata to the README.md
file as this will be used for the long_description
of the dnd-roller
package. For example:
The `dnd_roller` package provides three main functions: `roll`, `dice_roll`, and `sequence_rolls` to generate a single die roll, multiple rolls of the same die, or multiple rolls of multiple dice. The first (i.e. `roll()`) could generate a simple output in the terminal, whilst the last (i.e. `sequence_rolls()`) generates a tabular report for the outcome of each roll in the sequence. Please have a look at the examples below for additional details.

### Examples

Rolling a single game die:

```python
>>> from dnd_roller import roll
>>> roll(d=4)
4
>>> roll(d=20, verbose=True)
You rolled 17
17
```

Rolling multiple times the same game die:

```python
>>> from dnd_roller import dice_roll
>>> dice_roll(throws=3, sides=4)
[3, 2, 4]
```

Rolling a sequence of dice rolls:

```python
>>> from dnd_roller import sequence_rolls
>>> sequence_rolls(sequence="12d20, 4d4, 2d10, 1d100", verbose=True)
╒════════╤════════════════════════════════════════════════╤═══════╕
│ dice   │ rolls                                          │   sum │
╞════════╪════════════════════════════════════════════════╪═══════╡
│ 12d20  │ [15, 9, 13, 2, 14, 13, 18, 15, 13, 10, 17, 18] │   157 │
├────────┼────────────────────────────────────────────────┼───────┤
│ 4d4    │ [2, 3, 2, 1]                                   │     8 │
├────────┼────────────────────────────────────────────────┼───────┤
│ 2d10   │ [8, 5]                                         │    13 │
├────────┼────────────────────────────────────────────────┼───────┤
│ 1d100  │ [50]                                           │    50 │
╘════════╧════════════════════════════════════════════════╧═══════╛
{'12d20': [15, 9, 13, 2, 14, 13, 18, 15, 13, 10, 17, 18], '4d4': [2, 3, 2, 1], '2d10': [8, 5], '1d100': [50]}
```
Finally we will add some additional metadata in the setup.cfg
and pyproject.toml
configuration files, such as the license, and the package classifiers.
💡 Note: We will soon see how these metadata can be consumed by automatic build tools for packaging.
Similarly, in pyproject.toml
:
🎉 Whoot whoot! Now everything is ready for our dnd-roller
package!
Time to publish everything on GitHub:
If we now open the browser, and visit the GH repository url (e.g., https://github.com/leriomaggio/dnd-roller), you should see something similar to the image below.
Image Credits: https://he-man.fandom.com/
All the hipster geeks in the audience shouldn't require further references and explanation.
And Yes: you are still in the right room! We are still talking about Conda and Python packaging.
From the official documentation:
Grayskull is an automatic conda recipe generator.
The main goal of this project is to generate concise recipes for conda-forge. The Grayskull project was created with the intention to eventually replace conda skeleton.
Presently Grayskull can generate recipes for Python packages available on PyPI and also those not published on PyPI but available as GitHub repositories. Grayskull can also generate recipes for R packages published on CRAN. Future versions of Grayskull will support recipe generation for packages of other repositories such as Conan and CPAN, etc.
🎉 Looks like a fantastic treat! We will be using grayskull
to automatically generate a recipe for our dnd-roller
project, so that we can build a conda project for it!
Grayskull
We start by installing grayskull
using conda
, from the conda-forge
channel:
The next thing we want to do, is to create a release of our project on GitHub. Once we have done that, we will be able to use grayskull
to generate our recipe. In fact, grayskull will fetch all the necessary information (and package) from GitHub to prepare our conda-recipe
.
To do a release, we can use the GitHub interface directly. The only thing to bear in mind is to specify a proper version tag for our release: v0.0.1
. The version tag is what grayskull
will be using to gather the version of our package, as well as the name of the archive generated by GitHub. Click Create a new release on the right hand side of your GitHub repo page, and then create the release.
Once we have a release, we are able to proceed to generate our conda-recipe
with Grayskull.
dnd-roller
One of the main advantages of using grayskull
is not only that we don't need to worry about (manually) creating the conda-recipe
to build our package, but it can also get everything that is required directly from GitHub.
Generating the recipe for dnd-roller with grayskull is just two steps away:
1. If you are in the dnd-roller folder, please move away (say, to the parent directory), and create a new folder named grayskull (or as you prefer, the name does not matter):
2. Move into the grayskull folder, and generate the recipe:
🎉 When it's completed, you should now see a dnd-roller folder, containing a meta.yaml.
This is indeed the conda recipe we were hoping for! 👨🍳
conda build
Now that we have our recipe, all we need to do is to use it to build our dnd-roller conda package.
First install conda-build
which will enable the conda build
command.
Next, still within the grayskull
folder, let's type:
If everything goes well, a dnd-roller.tar.gz
archive should have been created here
Windows 💁
Today, we are not going to submit multiple (or even single) copies of our dnd-roller
package to conda-forge. We don't want to test the patience of the conda-forge gods.
But, we will get you to the point just before submission, and time allowing, we will also show you how to create your own channel on anaconda.org and publish packages there.
The instructions here are heavily based on the conda-forge instructions for package submission.
From conda-forge
- Ensure your source code can be downloaded as a single file. Source code should be downloadable as an archive (.tar.gz, .zip, .tar.bz2, .tar.xz) or tagged on GitHub, to ensure that it can be verified. (For further detail, see Build from tarballs, not repos).
We got this!
- Fork and clone the staged-recipes repository from GitHub.
We will fork the repo using the GitHub web interface, and then clone that fork on our laptop.
Now, clone the new repo locally.
That clone may take a bit - the repo is around 100 MB.
- Checkout a new branch from the staged-recipes
main
branch.
- Through the CLI, cd inside the ‘staged-recipes/recipes’ directory.
- Within your forked copy, create a new folder in the recipes folder for your package (i.e,
...staged-recipes/recipes/<name-of-package>
)
- Copy meta.yaml from the example directory. All the changes in the following steps will happen in the COPIED meta.yaml (i.e., ...staged-recipes/recipes/<name-of-package>/meta.yaml). Please leave the example directory unchanged!
We could do this, but grayskull
has already generated a perfectly good meta.yaml
that we can use, so let's use that instead.
Windows (non WSL) 💁
- Modify the copied recipe (meta.yaml) as needed. To see how to modify meta.yaml, take a look at the recipe meta.yaml.
Some things to note:
The meta.yaml file generated by grayskull needs a little tidying up. See below.
The staged-recipes/recipes/example/meta.yaml file is full of useful guidance, as is the conda-forge meta.yaml documentation. Spend some time getting to understand the contents of this file.
Now, let's tidy up the meta.yaml file we just copied in. The end of that file says:
You need to replace AddYourGitHubIdHere
with your GitHub ID.
- Generate the SHA256 key for your source code archive, as described in the example recipe using the
openssl
tool. As an alternative, you can also go to the package description on PyPI from which you can directly copy the SHA256.
Thanks to the power of Grayskull we already have a SHA256.
- Be sure to fill in the
test section
. The simplest test will simply test that the module can be imported, as described in the example.
Thanks to the power of Valerio we have already created our tests.
- Remove all irrelevant comments in the
meta.yaml
file.
The file generated by Grayskull contains no comments.
The conda-forge documentation follows the above instructions with this checklist:
- Ensure that the license and license family descriptors (optional) have the right case and that the license is correct. Note that case sensitive inputs are required (e.g. Apache-2.0 rather than APACHE 2.0). Using SPDX identifiers for license field is recommended. (see SPDX Identifiers and Expressions)
Our meta.yaml
says MIT
and MIT
is on the example list of approved strings, so we are good.
- Ensure that you have included a license file if your license requires one – most do. (see here)
Some of the packages that are merged into conda-forge have a LICENSE.txt
file alongside the meta.yaml
file, and some don't.
Does the MIT license require this? We have no idea, but many merged PRs use an MIT license and do not include a top level LICENSE.txt
file. If they don't need one, we don't either!
- In case your project has tests included, you need to decide if these tests should be executed while building the conda-forge feedstock.
- Make sure that all tests pass successfully at least on your development machine.
Already done.
- Recommended: run the test locally on your source code to ensure the recipe works locally (see Running tests locally for staged recipes).
Already done.
- Make sure that your changes do not interfere with other recipes that are in the recipes folder (e.g. the example recipe).
Our folder is dnd-roller and that does not collide in any way with example, which is the only other folder in the recipes directory.
Let's get our staged recipe into our GitHub repo, and then submit a pull request to conda-forge.
The status message in the push tells us where to go next:
Go there to create (almost) a conda-forge PR submission. conda-forge PRs use this PR template:
We have a decidedly Pythonic submission.
Next there is a checklist. Read it and add x
's as you confirm each item.
When it's ready, click Create pull request to submit it. EXCEPT DON'T DO THAT TODAY.
There is a whole post-staging process for what happens to your PR after submission.
conda-forge is an all volunteer organization, and depending on how much else is going on, it may take a while for the conda-forge team to engage with your PR. After the initial ping to the appropriate team (see above), how long should you wait before pinging conda-forge again? If you can, try to wait at least a week, and always (always) be polite in your communications.
From the conda-forge doc:
Once the PR containing the recipe for a package is merged in the staged-recipes repository, a new repository is created automatically called <package-name>-feedstock. A feedstock is made up of a conda recipe (the instructions on what and how to build the package) and the necessary configuration files for automatic builds using freely available continuous integration (CI) services.
Each feedstock contains various files that are generated automatically using our automated provisioning tool conda-smithy.
You can also publish your package in your own channel on Anaconda.org.
These instructions are based on the conda documentation.
Create an account on Anaconda.org
Install anaconda-client:
conda install anaconda-client
Login
anaconda login
Upload your package to Anaconda.org:
Windows 💁
Now, go to your profile page on anaconda.org, which is at https://anaconda.org/YOUR_USERNAME.