Archive - Using Python in an HPC environment

Welcome!

Link to this document: https://hackmd.io/IoseeprATrWH0ZI7z6wJ-w?view

Link to the last sessions: https://hackmd.io/@bclaremar/HPC-python

Lecture Zoom link: https://umu.zoom.us/j/68897489502?pwd=TXlraXI5UkFvNWxIV0xGcWZ4M1lQdz09

Separate UPPMAX session Zoom: https://uu-se.zoom.us/j/68160935750

Password: 298853

Separate HPC2N session Zoom: https://umu.zoom.us/j/64721465122?pwd=NGtZM2h0bmJBTU4xNThjT0t5eFZndz09

Course material: https://uppmax.github.io/HPC-python/index.html


Icebreaker question

  • Q: Briefly, what are your expectations of this course? What do you want to be able to do afterwards?
    • A:
    • A: Update my packages to support HPC2N and how HPC2N uses MPI.
    • A: Using virtual env on HPC2N
    • A: I would like to learn more about MPI
    • A: Learn module loading and submitting jobs to the GPU.
    • A: Using virtual envire
    • python to ru
    • A: Using HPC resources to run ML and other Python-related parallelized codes.
    • A: Run parallel programs, and use the GPU with Python on UPPMAX

Q&A

  • Q: Are the lectures being recorded? Will they be made available?

    • A: Yes. They will be available on the HPC2N YouTube channel. You will get the relevant link via email.
    • Please keep your mic and camera off in the main Zoom room, otherwise you may end up in the final recording. In the breakout rooms you may ask questions freely. We will mention this again later on.
  • Q: Where do I find the examples to be used during the course?

    • A: At HPC2N: /proj/nobackup/snic2022-22-641/bbrydsoe/examples.tar.gz
    • A: At UPPMAX: /proj/snic2022-22-641/nobackup/examples.tar.gz
  • Q: How do I untar these files?

    • A: Copy the archive to your own directory first, then run tar -xvzf examples.tar.gz
  • Q: Did you manage to get the files? (Add a + sign)

    • Yes ++++++
    • No
  • Q: Can docker be used to manage environments? I was thinking of pulling my own environment as a docker image instead of setting up everything from scratch.

    • A: Could you give a few more details? Docker can run complete Linux setups, but it is not allowed on HPC clusters because it requires administrative rights.
    • A2: In principle, yes. Since Docker itself cannot be run on the clusters, the alternative approach is to use Singularity containers: https://sylabs.io/singularity/ (Docker containers can be converted to Singularity if they are built properly to run in user space).
      Drop a line to support@uppmax.uu.se and mention my name (Pavlin) if this is the road you want to follow.
    • Singularity can run Docker images. HPC2N has some documentation here: https://www.hpc2n.umu.se/resources/software/singularity
  • Q: Are the python/3.9.5 and python3/3.9.5 modules the same?

    • A: Not quite. If you load python3/3.9.5, the python command will point to the Python 2 version. If you are unsure which version python or python3 refers to, it is a very good idea to double-check using some of these commands:
      • which python
      • which python3
      • python -V
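The same check can also be made from inside Python, which is handy when a script might be started by a different interpreter than the one you expect. A minimal sketch using only the standard library; the function name interpreter_report is hypothetical and only for illustration:

```python
import shutil
import sys


def interpreter_report():
    """Describe the running interpreter and what the shell commands resolve to."""
    return {
        # Full path of the interpreter that is executing this code
        "executable": sys.executable,
        # Version as "major.minor.micro", comparable to the output of `python -V`
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # What `which python` and `which python3` would find on PATH (None if absent)
        "which_python": shutil.which("python"),
        "which_python3": shutil.which("python3"),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If "executable" and "which_python" differ, the python command in your shell is not the interpreter running the script.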
  • Q: Does SNIC have a Jupyter Hub server similar to for example notebooks.egi.eu

  • Q: How do I find out which python modules are available on the system?

    • A: module avail python or module spider python if you're unsure about the spelling
    • A2: To check which version is currently loaded: which python
    • A3: Typing help("modules") at the >>> prompt of a Python shell prints everything available in the Python path, regardless of how it was installed (pip, conda, or manually). It is a complete list.
  • Q: Is it possible to run Jupyter notebooks at HPC2N?

    • A: Yes, but it is a bit complicated. Write to us at support@hpc2n.umu.se and we'll help you with that.
  • Q: Can one run Jupyter using jupyter notebook --no-browser --port=XXXX and then log in to it?

  • Q: How can I get a prompt asking me to confirm removal of file?

  • Q: Can one use Kebnekaise for computing an UPPMAX project?

    • A: One needs a different SUPR application for Kebnekaise. Or if one is in the process of writing the application, one could choose the resources on multiple centers (HPC2N, UPPMAX, …) within SNIC.
  • Q: Because we might often be using "outdated" versions of packages to ensure reproducibility, is there a good way to automatically suppress warnings or prompts to upgrade?

    • A: Specify the version explicitly when installing, for example pip install --user -U package==1.3.4 another_package, or record the installed versions with pip freeze:
      https://pip.pypa.io/en/stable/cli/pip_freeze/
    • A2: pip <command> --disable-pip-version-check [options]
  • Q: If / when a user installs a package with pip install without explicitly creating a virtual env: will the install be preserved until next login or is it temporary?

    • A: The installed packages will stay there (default location). But it is a good practice to work with environments as things will get messy after some installations.
  • Q: Can venv use global packages from cluster if we don’t install packages in venv?

    • A: The idea is to create an "isolated" environment, so I guess the answer is no. You can still add other packages by adding their location to PYTHONPATH, but that is asking for trouble.
    • A: The flag --system-site-packages includes the packages already installed in the loaded python module. This can be good if you require lots of packages and the available global packages are compatible. Packages you install in the venv take precedence over global packages of the same name with a different version.
      • It may require trial and error to see which packages have to be replaced, though.
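To see what the flag changes, the standard-library venv module can create an environment from Python and record the setting in the environment's pyvenv.cfg file. This is only a sketch: the helper name create_env and the directory name shared-env are hypothetical, and with_pip=False is used just to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, use_system_packages):
    """Create a virtual environment and return the contents of its pyvenv.cfg."""
    # with_pip=False keeps the sketch fast; use with_pip=True for a usable environment
    venv.create(env_dir, system_site_packages=use_system_packages, with_pip=False)
    return (Path(env_dir) / "pyvenv.cfg").read_text()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "shared-env", use_system_packages=True)
        # The line "include-system-site-packages = true" shows the flag is active
        print(cfg)
```

This corresponds to python3 -m venv --system-site-packages <dir> on the command line; without the flag the same line in pyvenv.cfg reads false.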
  • Q: I just used pip install whitebox and it worked. Is this now available for all users in a project?

    • A: Only if they have permission to access it. By default the package ends up in ~/.local/lib/python3.9/site-packages, so to make it more easily accessible you would need to specify a different location, for example with pip install <package> --target <dir> (or -t <dir>), which installs <package> in <dir>. Note, however, that the best practice is to use virtual environments; see the additional answer below.
    • A2: I suggest creating a venv in your project folder; then everybody in the project can just activate this environment and use it.
  • Q: Are there any guidelines and restrictions how virtualenv can be used? I tend to use these kind of terminal functions

    • A: Looks fine to me, perhaps too much magic :smile: (re: haha ok!). Just keep to simple names, since special characters might confuse bash and so on.
    • Re: Looking at the answer to the above question, I think keeping the structure as "all in one folder" seems good: here is a bash version gist
  • Q: What is the difference between venv and conda in terms of virtual environments?

    • A: In general venv and similar can manage only python packages, while conda can install non python packages, binaries, and libraries.
  • Q: Can virtual environments be created using a script, sort of like a Dockerfile?

    • A: Perhaps the simplest solution would be a bash script and a requirements.txt file that contains the python packages you wish to install.
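    • A2: For example, a bash script that sets up an environment for Sphinx:
    #!/bin/bash
    # Create and activate a virtual environment for Sphinx
    python3 -m venv ${HOME}/venv/sphinx
    source ${HOME}/venv/sphinx/bin/activate
    # Upgrade the packaging tools, then install the packages
    python3 -m pip install -U setuptools pip wheel
    python3 -m pip install sphinx sphinx_rtd_theme
    deactivate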
    
  • Q: When developing a Python script, which IDE do you recommend on HPC2N?
    Do you have experience with Vim or Emacs as an IDE for Python, and do you recommend them?

    • A: On a local computer, VSCode or Spyder.
      • On remote systems I personally use Vim, and ipython for debugging. If the code is not running in parallel, ipython -i python_program.py will execute the program and leave you in the Python shell, where you can debug the main part of the program (i.e. you cannot jump into functions and modules).
  • Q: How can I monitor the CPU and memory usage of a job?

    • A: At HPC2N: in the terminal, run job-usage Job_ID, where Job_ID is the number you get when submitting the job with sbatch. This gives you a URL that you can paste into your local browser. Note that the statistics only appear a couple of minutes after the job starts.
    • A: At UPPMAX you may use the command jobstats -p <jobid> which will generate a .png plot of the CPU and memory usage vs. time. More info at https://www.uppmax.uu.se/support/user-guides/jobstats-user-guide/.
  • Q: I am stuck in vim, send help :(. :q! worked.

    • A: Type :q! (= exit without saving). Sometimes (if you are in edit or visual mode) you may need to press the ESC key a couple of times before typing :q!. Does it help? Feel free to use your favourite editor for editing files.
    • A: If you are just starting with file editing, the nano editor is a good recommendation.
    • "I have been using Vim for two years now, mostly because I don't know how to exit." (Anonymous user)

  • Q: Is there a clever way of keeping track of the dependencies of a package? I mean, when creating an environment, how would I know whether a package will conflict with another one?

    • A: For a package that is already installed, pip show somepackage lists its dependencies under "Requires", and pip check reports installed packages whose requirements are incompatible.
    • A2: You could record the dependencies that you have with pip freeze > requirements.txt. This is especially useful if you move your project to another machine or another environment, where you can re-install the dependencies listed in requirements.txt using the command pip install -r requirements.txt.
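The same information can be read from within Python using the standard-library importlib.metadata module. The sketch below is an approximation of pip freeze, not a replacement for it (it does not handle editable or URL installs); the function names pinned_requirements and declared_dependencies are hypothetical.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            continue  # skip installs whose metadata cannot be read
        if name:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


def declared_dependencies(package):
    """Return the requirement strings a package declares, or None if not installed."""
    try:
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("\n".join(pinned_requirements()))
```

Comparing the requirement strings from declared_dependencies for two packages is one way to spot conflicting version constraints before installing them into the same environment.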
  • Q: What is the difference between CPU and GPU calculation?

Q&A for the UPPMAX session (environments)

Exercise

https://uppmax.github.io/HPC-python/isolatedUPPMAX.html#prepare-the-course-environment
Task: create and activate the venv-python-course environment
If you get stuck, please write your BO room number here.

  • Q: Did you manage to finish the exercise?
    • A: Yes ++
    • A: No

Q&A for the HPC2N session (environments)

  • Q: Can we just move the virtual environment to another directory?

  • Q: Do we have to go via python setup.py build or does pip install (options) . work?

    • A: some packages use the first option (python setup.py ...) but if pip install works then it should be OK to use it
    • RE: :thumbsup: Most modern setuptools configurations should support pip install nowadays, so it's probably safe for up-to-date packages.
    • There are some ML packages which are not available through pip, so for these you would have to use setup.py, but it is worth trying with pip first
    • RE: but if I clone the package then local pip install should work right?
    • RE: In most cases, yes.
  • Q: When I have logged in on the cluster, I can get information about different partitions with sinfo and about specific nodes with scontrol show node. Is there a way to get a summarised info on the whole cluster, i.e. how many nodes are there, number of cores per each node, memory and total number of cores available on the cluster? Something like lscpu but for the whole cluster

    • A: You may use sinfo with additional options: sinfo -e -o '%D %N %C %m'. On Rackham, for example, you get
NODES NODELIST CPUS(A/I/O/T) MEMORY
450 r[33-334,339-486] 8362/452/186/9000 128000
144 r[1001-1072,1179-1250] 1836/468/0/2304 249611
32 r[1-32] 600/20/20/640 256000
4 r[335-338] 80/0/0/80 1000000

which means there are 450 nodes with 128GB/node and 4 nodes with 1TB/node, for example. For more options, please check https://slurm.schedmd.com/sinfo.html.

$ sinfo -o '%C'
CPUS(A/I/O/T)
10878/940/206/12024

The example above shows that Rackham has 10878 allocated and 940 idle (=not allocated by Slurm) cores at this moment.
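The per-partition rows can also be summed with a few lines of Python. The sketch below parses the Rackham output quoted above, assuming the column layout produced by sinfo -e -o '%D %N %C %m'; the function name summarize is hypothetical.

```python
SINFO_OUTPUT = """\
NODES NODELIST CPUS(A/I/O/T) MEMORY
450 r[33-334,339-486] 8362/452/186/9000 128000
144 r[1001-1072,1179-1250] 1836/468/0/2304 249611
32 r[1-32] 600/20/20/640 256000
4 r[335-338] 80/0/0/80 1000000
"""


def summarize(sinfo_text):
    """Sum node and core counts over all rows of the sinfo output."""
    totals = {"nodes": 0, "allocated": 0, "idle": 0, "other": 0, "total": 0}
    for line in sinfo_text.strip().splitlines()[1:]:  # skip the header row
        nodes, _nodelist, cpus, _memory = line.split()
        # The CPUS column has the form allocated/idle/other/total
        allocated, idle, other, total = (int(n) for n in cpus.split("/"))
        totals["nodes"] += int(nodes)
        totals["allocated"] += allocated
        totals["idle"] += idle
        totals["other"] += other
        totals["total"] += total
    return totals


if __name__ == "__main__":
    print(summarize(SINFO_OUTPUT))
    # → {'nodes': 630, 'allocated': 10878, 'idle': 940, 'other': 206, 'total': 12024}
```

The summed core counts agree with the single-line sinfo -o '%C' output shown above. On a cluster, the text could be taken from the live command instead of the quoted sample.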

  • Q: If I want to save, say, an h5 file during a job, how do I do that and access the file afterwards? Or will we do this later?
    • A: If I understood your question correctly, the file should be in the path where you created it. Please write more details if this doesn't answer your question.
    • RE: So the conclusion was: ordinary saving puts the data on the compute node, and it is the code developer who explicitly handles data distributed over the nodes' disks, for example by collecting it at the end. With MPI, the h5 interface is one option.