# Archive - Using Python in an HPC environment
**Welcome!**
Link to this document: https://hackmd.io/IoseeprATrWH0ZI7z6wJ-w?view
Link to the last sessions: <https://hackmd.io/@bclaremar/HPC-python>
**Lecture Zoom link**: https://umu.zoom.us/j/68897489502?pwd=TXlraXI5UkFvNWxIV0xGcWZ4M1lQdz09
**Separate UPPMAX session Zoom**: https://uu-se.zoom.us/j/68160935750
Password: 298853
**Separate HPC2N session Zoom**: https://umu.zoom.us/j/64721465122?pwd=NGtZM2h0bmJBTU4xNThjT0t5eFZndz09
**Course material**: <https://uppmax.github.io/HPC-python/index.html>
---
## Icebreaker question
- Q: What are your expectations of this course, briefly? What do you want to be able to do afterwards?
- A: Update my packages to support HPC2N and learn how HPC2N uses MPI.
- A: Using virtual environments on HPC2N
- A: I would like to learn more about mpi
- A: Learn module loading and submitting jobs to the GPU.
- A: Using virtual environments with Python to run code
- A: Using HPC resources to run ML and other Python-related parallelized codes.
- A: Run parallel programs and use the GPU with Python on UPPMAX
## Q&A
- Q: Are the lectures being recorded? Will they be made available?
- A: Yes. They will be available on the HPC2N YouTube channel. You will get the relevant link via email.
- Please keep your mic and camera off in the main Zoom room, otherwise you may end up in the final recording. In the breakout rooms you may ask questions freely. We will mention this again later on.
- Q: Where do I find the examples to be used during the course?
- A: At HPC2N: `/proj/nobackup/snic2022-22-641/bbrydsoe/examples.tar.gz`
- A: At UPPMAX: `/proj/snic2022-22-641/nobackup/examples.tar.gz`
- Q: How do I untar these files?
- A: `tar -xvzf examples.tar.gz` (x = extract, v = verbose, z = decompress gzip, f = archive file). But please copy the archive to your own directory first.
- Q: Did you manage to get the files? (Add a + sign)
- Yes ++++++
- No
- Q: Can docker be used to manage environments? I was thinking of pulling my own environment as a docker image instead of setting up everything from scratch.
- A: A bit more details, please. Docker can run complete Linux setups, but Docker is not allowed on HPC clusters because it requires administrative rights.
- A2: Right, in principle yes, if you can run Docker. The alternative approach is to use Singularity containers (https://sylabs.io/singularity/); Docker containers can be converted to Singularity if they are built properly to run in user space. Drop a line to support@uppmax.uu.se and mention my name (Pavlin) if this is the road you want to follow.
- Singularity can run Docker images; see the sketch below. HPC2N has some documentation here: https://www.hpc2n.umu.se/resources/software/singularity
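- A3: A minimal sketch of pulling a Docker image and converting it to a Singularity image file (the image name is just an example):
``` bash
# pull from Docker Hub and convert to a Singularity image file (.sif)
singularity pull python_3.9.sif docker://python:3.9-slim
# run a command inside the resulting container
singularity exec python_3.9.sif python3 --version
```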
- Q: Are the python/3.9.5 and python3/3.9.5 modules the same?
- A: If you load `python3/3.9.5`, `python` will point to the `python2` version. If you are unsure which version of `python` or `python3` you have, it's a very good idea to double check using some of these commands:
- `which python`
- `which python3`
- `python -V`
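- A2: For example (the paths and versions below are illustrative and vary by cluster):
``` bash
# after loading a python module, verify what is actually on your PATH
which python3   # e.g. /sw/comp/python3/3.9.5/bin/python3
python3 -V      # e.g. Python 3.9.5
python -V       # may still report Python 2.x if `python` points elsewhere
```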
- Q: Does SNIC have a JupyterHub server, similar to for example *notebooks.egi.eu*?
- A: https://uppmax.github.io/HPC-python/jupyter.html We advise using ThinLinc to avoid multiple problems and to have the option to connect directly to the compute node, i.e. without port forwarding. On Alvis this is done automatically for the users: https://www.c3se.chalmers.se/documentation/alvis-ondemand/#interactive-apps
- Q: How do I find out which python modules are available on the system?
- A: `module avail python` or `module spider python` if you're unsure about the spelling
- A2: To check the currently loaded version: `which python`
- A3: `>>> help("modules")` in a Python shell prints everything available on the Python path, regardless of how it was installed (pip, conda, or manually); it is a complete list.
- Q: Is it possible to run Jupyter notebooks at HPC2N?
- A: Yes, but it is a bit complicated. Write to us at support@hpc2n.umu.se and we'll help you with that.
- Q: Can one run Jupyter using `jupyter notebook --no-browser --port=XXXX` and then log in to it?
- A: We'll look into this later. Please remind us in case it gets forgotten.
- A: https://uppmax.github.io/HPC-python/jupyter.html
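- A2: A hedged sketch of the port-forwarding approach (user, host, and port are placeholders; if the notebook runs on a compute node, you would need to forward via the login node instead):
``` bash
# on the cluster: start the notebook without a browser
jupyter notebook --no-browser --port=8888
# on your local machine: forward the port over SSH
ssh -N -L 8888:localhost:8888 user@rackham.uppmax.uu.se
# then open http://localhost:8888 in your local browser
```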
- Q: How can I get a prompt asking me to confirm removal of file?
- A: In your `.bashrc`, include: `alias rm='rm -i'`. This can however be a bit dangerous if you are on another system and have gotten used to getting a prompt. See https://superuser.com/questions/382407/best-practices-to-alias-the-rm-command-and-make-it-safer for some best practices.
- Q: Can one use Kebnekaise for computations in an UPPMAX project?
- A: One needs a different SUPR application for Kebnekaise. Or if one is in the process of writing the application, one could choose the resources on multiple centers (HPC2N, UPPMAX, …) within SNIC.
- Q: Because we might often be using "outdated" versions of packages to ensure reproducibility, is there a good way to automatically suppress warnings or prompts to upgrade?
- A: You need to explicitly specify the version when installing, e.g. `pip install --user -U package==1.3.4 another_package`, or see https://pip.pypa.io/en/stable/cli/pip_freeze/
- A2: `pip <command> --disable-pip-version-check [options]`
- Q: If / when a user installs a package with *pip install* without explicitly creating a virtual env: will the install be preserved until next login or is it temporary?
- A: The installed packages will stay there (in the default location). But it is good practice to work with environments, as things will get messy after some installations.
- Q: Can venv use global packages from cluster if we don’t install packages in venv?
- A: The idea is to create an "isolated" environment, so I guess the answer is no. At the same time, if you add to the `PYTHONPATH` you can include other packages by pointing to their location, but this sounds like you are looking for trouble.
- A: The flag `--system-site-packages` includes the packages already installed in the loaded Python module. This can be good if you require lots of packages and the available global packages are compatible. Packages you install in the venv are taken in favor of global packages of the same name with a different version; see the sketch below.
- It may require trial and error to see which packages have to be replaced, though.
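A minimal sketch of this setup (the module name and package version are examples and vary by cluster):
``` bash
# load a python module, then create a venv that can see its packages
module load python/3.9.5
python3 -m venv --system-site-packages ${HOME}/venv/my-env
source ${HOME}/venv/my-env/bin/activate
# a package installed here shadows a global package of the same name
python3 -m pip install numpy==1.22.3
```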
- Q: I just used pip install whitebox and it worked. Is this now avalible for all users in a project?
- A: As long as they have the permission to access it. Please note that by default it will be in `~/.local/lib/python3.9/site-packages`, so you will need to specify a different location to make the packages more easily accessible. You may do so via `pip install <package> --target <dir>` (or `-t <dir>`), which will install `<package>` in `<dir>`, but please note that the best practice is to use virtual environments; see the additional answer below.
- A2: I suggest using a `venv` in your project folder; then everybody in the project just activates this environment and uses it, as in the sketch below.
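A hedged sketch, using this course's project directory as an example path:
``` bash
# create a shared venv under the project storage
python3 -m venv /proj/snic2022-22-641/nobackup/shared-venv
# each project member activates it like this
source /proj/snic2022-22-641/nobackup/shared-venv/bin/activate
```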
- Q: Are there any guidelines and restrictions how virtualenv can be used? I tend to use these kind of [terminal functions](https://gist.github.com/danielk333/635fa22f43beffe614355a4902046212)
- A: Looks fine to me, perhaps too much magic :smile: (re: haha ok!). Just keep the names simple; special characters might confuse bash and so on.
- Re: Looking at the answer to the question above, I think keeping the structure as "all in one folder" seems good: here is a [bash version gist](https://gist.github.com/danielk333/588c1b9b07cdfefc8d7af7f2d885a051)
- Q: What's the difference between `venv` and conda, in terms of virtual environments?
- A: In general `venv` and similar can manage only python packages, while `conda` can install non python packages, binaries, and libraries.
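- A2: A small illustration of the difference (environment and package names are examples):
``` bash
# conda: installs python itself plus non-python libraries into one environment
conda create -n my-env python=3.9 numpy hdf5
# venv: reuses an existing python; pip installs python packages only
python3 -m venv my-env && source my-env/bin/activate && pip install numpy
```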
- Q: Can virtual environments be created using a script, sort of like a Dockerfile?
- A: Perhaps the simplest solution would be a bash script plus a `requirements.txt` file that contains the Python packages you wish to install (see the variant after the script below).
- A2:
``` bash
#!/bin/bash
# sphinx
python3 -m venv ${HOME}/venv/sphinx
source ${HOME}/venv/sphinx/bin/activate
python3 -m pip install -U setuptools pip wheel
python3 -m pip install sphinx sphinx_rtd_theme
deactivate
```
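- A3: A variant of the same idea, driven by a `requirements.txt` file as suggested above (the venv path and package list are examples):
``` bash
#!/bin/bash
# create a venv and install pinned packages from a requirements file
python3 -m venv ${HOME}/venv/myproject
source ${HOME}/venv/myproject/bin/activate
python3 -m pip install -U setuptools pip wheel
python3 -m pip install -r requirements.txt   # e.g. numpy==1.22.3, scipy==1.8.0
deactivate
```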
- Q: When developing a Python script, what IDE do you recommend on HPC2N? Do you have experience with vim or emacs as a Python IDE? Do you recommend them?
- A: On a local computer VSCode or [Spyder](https://www.spyder-ide.org/)
- On remote systems I personally use Vim and `ipython` for debugging. If the code is not running in parallel, `ipython -i python_program.py` will execute it and leave you in the Python shell, where you can debug the main part of the program (i.e. you cannot jump into functions and modules).
- Q: How can I monitor the CPU and memory usage of a job?
- A: At HPC2N: on the terminal, `job-usage <job_id>`, where the job ID is the number you get upon submitting the job with `sbatch`. This will give you a URL that you can paste into your local browser. Note that statistics will appear a couple of minutes after the job starts.
- A: At UPPMAX you may use the command `jobstats -p <jobid>` which will generate a .png plot of the CPU and memory usage vs. time. More info at https://www.uppmax.uu.se/support/user-guides/jobstats-user-guide/.
- Q: I am stuck in vim, send help :( Update: `:q!` worked.
- A: Type `:q!` (= exit without saving). Sometimes (if in edit or visual mode) you may need to press the ESC key a couple of times before you type `:q!`. Does that help? Feel free to use your favourite editor for editing files.
- A: If you are starting out with file editing, the `nano` editor is a good recommendation.
- > I have been using Vim for two years now, mostly because I don't know how to exit. *Anonymous user*
- Q: Is there a clever way of keeping track of the dependencies of a package? I mean, when creating an environment, how would I know whether a package will conflict with another one?
- A: For an already-installed package, `pip show somepackage` lists its direct dependencies (the `Requires:` field).
- A2: You could record the dependencies that you have with `pip freeze > requirements.txt`. This is especially useful if you move your project to another machine or environment; you can then re-install the dependencies listed in `requirements.txt` with `pip install -r requirements.txt`.
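- A3: Inside an environment, `pip check` reports installed packages with missing or incompatible dependencies, which helps spot conflicts after installation.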
- Q: What is the difference between CPU and GPU calculation?
- A: https://www.heavy.ai/technical-glossary/cpu-vs-gpu
## Q&A for the UPPMAX session (environments)
### Exercise
https://uppmax.github.io/HPC-python/isolatedUPPMAX.html#prepare-the-course-environment
Task: create and activate the `venv-python-course` environment
If you get stuck, please write your BO room number here.
- Q: Did you manage to finish the exercise?
- A: Yes ++
- A: No
## Q&A for the HPC2N session (environments)
- Q: Can we just move the virtual environment to another directory?
- A: You probably can, but it seems easier to recreate the environment (see https://stackoverflow.com/questions/32407365/can-i-move-a-virtualenv), as some paths are hardcoded.
- Q: Do we have to go via `python setup.py build` or does `pip install (options) .` work?
- A: some packages use the first option (`python setup.py ...`) but if `pip install` works then it should be OK to use it
- RE: :thumbsup: Most modern setuptools configurations should support `pip install` nowadays, so it's probably safe for up-to-date packages.
- There are some ML packages which are not available through `pip`, so for these you would have to use `setup.py`, but it is worth trying with `pip` first
- RE: But if I clone the package, then a local `pip install` should work, right?
- RE: In most cases, yes.
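A minimal sketch of the clone-then-install workflow (the repository URL is a placeholder):
``` bash
# clone the package source and install it into the active environment
git clone https://github.com/example/somepackage.git
cd somepackage
python3 -m pip install .   # add --user if installing outside a virtual environment
```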
- Q: When I have logged in on the cluster, I can get information about different partitions with `sinfo` and about specific nodes with `scontrol show node`. Is there a way to get summarised info on the whole cluster, i.e. how many nodes there are, the number of cores per node, memory, and the total number of cores available on the cluster? Something like `lscpu`, but for the whole cluster.
- A: You may use `sinfo` with additional options: `sinfo -e -o '%D %N %C %m'`. On Rackham, for example, you get
```
NODES NODELIST CPUS(A/I/O/T) MEMORY
450 r[33-334,339-486] 8362/452/186/9000 128000
144 r[1001-1072,1179-1250] 1836/468/0/2304 249611
32 r[1-32] 600/20/20/640 256000
4 r[335-338] 80/0/0/80 1000000
```
which means there are 450 nodes with 128GB/node and 4 nodes with 1TB/node, for example. For more options, please check https://slurm.schedmd.com/sinfo.html.
```
$ sinfo -o '%C'
CPUS(A/I/O/T)
10878/940/206/12024
```
The example above shows that Rackham has 10878 allocated and 940 idle (=not allocated by Slurm) cores at this moment.
- Q: If I want to save, say, an h5 file during a job, how do I do that and access the file afterwards? Or will we cover this later?
- A: If I understood your question correctly, the file will be in the directory where you created it. Please write more details if this doesn't answer your question.
- RE: So the conclusion was: ordinary saving puts the data on the compute node, and it is the code developer who handles node-distributed disk storage explicitly, e.g. by collecting the data at the end. With MPI, the parallel HDF5 interface is an option, for example. See the sketch below.
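A hedged sketch of a batch script that runs in node-local scratch and copies the result back to project storage (`$SNIC_TMP` is assumed to be set, as on SNIC systems; the script name and paths are examples):
``` bash
#!/bin/bash
#SBATCH -A snic2022-22-641      # project from this course, as an example
#SBATCH -n 1
#SBATCH -t 00:30:00

# run in node-local scratch for fast I/O
cd $SNIC_TMP
python3 ${HOME}/my_analysis.py   # hypothetical script that writes results.h5

# copy the result back to project storage before the job ends
cp results.h5 /proj/snic2022-22-641/nobackup/${USER}/
```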