
Using Python in an HPC environment

Welcome!

Link to this document: https://hackmd.io/@bclaremar/HPC-python

Link to first part of Q/A: https://hackmd.io/IoseeprATrWH0ZI7z6wJ-w?view

Lecture Zoom link: https://umu.zoom.us/j/68897489502?pwd=TXlraXI5UkFvNWxIV0xGcWZ4M1lQdz09

Separate UPPMAX session Zoom: https://uu-se.zoom.us/j/68160935750

Password: 298853

Separate HPC2N session Zoom: https://umu.zoom.us/j/64721465122?pwd=NGtZM2h0bmJBTU4xNThjT0t5eFZndz09

Course material: https://uppmax.github.io/HPC-python/index.html


MPI session

  • Q: Do you have an example for C/C++ interface also?
    • A: pybind11 is great for this: https://github.com/pybind/pybind11. This cookiecutter https://github.com/scikit-hep/cookie is a really good place to get started writing C/C++ bindings.
    • Participant comment: ctypes is a good low-level way to start interfacing already-existing C code, though it does no automatic "intent" guessing; #include <Python.h> is a slightly harder way to write a Python module in C. I have written some examples for my students here (see also the ctypes sketch after this list)
  • we have a reservation on UPPMAX:
    • add --reservation=snic2022-22-641_1 as a Slurm flag in your batch script, or on the command line for sbatch or interactive (e.g. sbatch --reservation=snic2022-22-641_1 job.sh)
    • the reservation is valid only today, until 4 PM
  • Q: Why does this HackMD keep deleting what I write? Should I set something up?

    • A: Apologies for the annoyance. It sometimes happens when several people edit at the same time. Please re-type if it happens.
  • Q: Cores vs. processes vs. threads: what is the difference?

    • A: Core: part of a multicore CPU; hardware. Think of it as a small CPU inside the CPU. Cores usually share memory with the other cores in the same socket. Process: a task, i.e. an active instance of a program. Thread: software; a process can do work concurrently by running several threads, and the threads share the process's memory (see the threading sketch after this list).

    • A: so with MPI:

      • a copy of the program (a process) is started as several instances (tasks); they work independently of each other, but have to communicate to exchange data and to wait for some calculations to finish before continuing (see the mpi4py sketch after this list)
    • A: https://davescomputertips.com/cpu-cores-versus-threads-explained/

    • A: https://www.geeksforgeeks.org/difference-between-process-and-thread/

    • A: https://smileipic.github.io/Smilei/parallelization.html

    • A: something that really improved my low-level optimization skills was understanding CPU/GPU memory cache levels and how to chunk memory between communication and calculations (this seems to explain the levels pretty well: https://www.makeuseof.com/tag/what-is-cpu-cache/ )

    • A: A node is the physical box containing a computer. An HPC cluster usually has several nodes.

      • The motherboard in each node may have several sockets, with a CPU chip in each socket.
      • Each CPU has multiple cores.
      • Depending on the configuration, each core can run multiple threads, so several processes can run at once (our cluster has multithreading turned off, so 1 core = 1 thread).
    • A: At UPPMAX our clusters have multithreading turned off, so 1 core = 1 thread. The same holds at HPC2N.

  • Q: In the GPU example you log in to Snowy with interactive, but in the Snowy user guide they write that you must use salloc. So can I use interactive for Snowy?

    • A: Well, it worked for me, but please follow the instructions on the UPPMAX web!
    • A: interactive is a wrapper around salloc, so it works for Snowy too.
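
The ctypes sketch mentioned in the C/C++ interface question above: a minimal example of calling an existing C library from Python. It assumes a Linux system where the standard math library libm can be located, and the argument/return types must be declared by hand, since ctypes does no automatic "intent" guessing.

    # minimal ctypes sketch: call cos() from the C math library (assumes Linux)
    import ctypes
    import ctypes.util

    libm = ctypes.CDLL(ctypes.util.find_library("m"))
    # declare the C signature explicitly; ctypes will not guess it
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double

    print(libm.cos(0.0))  # -> 1.0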
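
The threading sketch referenced in the cores/processes/threads answer: one process running several threads that all share the same memory. A minimal illustration only; the function and the numbers are made up for the example.

    # minimal sketch: one process, several threads sharing the same memory
    import os
    import threading

    results = []  # a single list, visible to every thread in this process

    def work(i):
        results.append(i * i)  # all threads write to the same shared list

    threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(f"{os.cpu_count()} cores visible; {len(results)} results from one process")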
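
The mpi4py sketch referenced in the MPI answer: each rank is an independent copy of the program (a task), and data moves only through explicit communication. A minimal sketch, assuming mpi4py is installed; run it with something like mpirun -np 4 python mpi_hello.py (the file name is arbitrary).

    # minimal mpi4py sketch: independent tasks communicating explicitly
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this task's ID (0, 1, 2, ...)
    size = comm.Get_size()   # total number of tasks

    local = rank ** 2                      # each copy computes independently
    results = comm.gather(local, root=0)   # explicit communication: collect on rank 0

    if rank == 0:
        print(f"{size} tasks produced: {results}")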

Machine Learning session

  • Q: In order to run "example-tf.sh" on UPPMAX I had to change to #SBATCH --gres=gpu:0. Does this mean that no GPUs are available? Follow-up question: is everyone allowed to use Snowy? It seems not to work with -A SNIC2022-22-641.
    • A: There are GPUs on Snowy but not on Rackham (see the batch-script sketch after this list).
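
The batch-script sketch referenced above: a minimal, hedged example of requesting a Snowy GPU. Since #SBATCH lines are ordinary comments, sbatch also accepts a script with a Python shebang; the project ID is the one used in this course, the flags follow the UPPMAX documentation, and the TensorFlow import assumes the corresponding module or environment is already loaded.

    #!/usr/bin/env python3
    #SBATCH -A snic2022-22-641   # course project; replace with your own
    #SBATCH -M snowy             # submit to the Snowy cluster (GPUs live here)
    #SBATCH --gres=gpu:1         # request one GPU
    #SBATCH -t 00:15:00          # 15 minutes of wall time

    # assumes TensorFlow is available in the loaded module/environment
    import tensorflow as tf
    print(tf.config.list_physical_devices("GPU"))  # should list one GPU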

UPPMAX session - Conda and Bianca

  • Q: I once got the error "the environment is inconsistent, please check the package plan carefully" when using conda. How do I solve it?

    • A: Was it in a clean environment, or just when adding more packages?
    • A2: In general, environment inconsistencies can show up for a multitude of reasons, e.g. using both conda install and pip install in the same environment. Many answers suggest that conda update --all can solve the issue.

HPC2N session: Machine learning (continued)

  • Q: Why are we now using python -m pip install package instead of pip install --no-cache-dir --no-build-isolation package? Is python needed? Will this still avoid other environments / be isolated?
    • A: The second option is the recommended one; I will update this information. (python -m pip simply guarantees that you run the pip belonging to the python currently on your path; see the sketch after this list.)
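
A quick way to check which interpreter (and therefore which pip) python -m pip will use; a minimal sketch, not part of the course material:

    # minimal sketch: verify which interpreter and pip are active
    import subprocess
    import sys

    print(sys.executable)  # the interpreter that `python -m pip` runs under
    # invoke the pip that belongs to this exact interpreter
    subprocess.run([sys.executable, "-m", "pip", "--version"], check=True)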

Q/A on demand

  • Storage basics at UPPMAX (similar at HPC2N)

    • All nodes can access:

      • your home directory
      • your project directories
      • their own local scratch disk (2-3 TB)
    • If you're reading/writing a file once, use a directory on Crex or Castor

    • If you're reading/writing a file many times:

      • copy the file to "scratch", the node-local disk (see the staging sketch below):

        cp myFile $SNIC_TMP
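
The staging sketch referenced above: copying a heavily used file to node-local scratch from inside a Python job. The source path is hypothetical; $SNIC_TMP is set by the batch system on the node.

    # minimal sketch: stage a file to node-local scratch before repeated I/O
    import os
    import shutil

    scratch = os.environ["SNIC_TMP"]        # node-local scratch, set per job
    src = "/proj/myproject/data/myFile"     # hypothetical project-storage path
    local_copy = shutil.copy(src, scratch)  # copy once over the network

    # ...do the repeated reads/writes on local_copy instead of src...
    print("working on", local_copy)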

Exercise help

Please write the number of the breakout room here if you need help:

Archive Q&A from the earlier sessions: https://hackmd.io/@dianai/HPC-Python-2022