# RSH 014 internal: running on a cluster

Examples:

- https://github.com/AaltoSciComp/hpc-examples/blob/master/slurm/pi-openmp.c
- https://aaltoscicomp.github.io/python-for-scicomp/parallel/#mpi

Messages to get across:

- instruction manual for a script that is well defined
- "grow" the calculation: don't start immediately with 40 nodes

Outline:

- Intro and purpose
  - You get/have a code, you need a cluster. How do you do that interface? What is your strategy?
  - What kinds of things can go wrong?
    - Request many processors, use few
    - Request few processors, use many
    - Request many processors but the code does not scale
    - Ask for the "wrong" queue (sometimes there are many)
    - Being unsure about MPI/OpenMP capabilities
    - Hard-coded values in the script
  - Safe shell settings: make the script stop if there is an undefined variable
- Program, how to run it
  - Run the code
  - Run the code interactively (srun)
  - Make a script, test it
  - seff
  - Change to '-c 2' and '--threads=2'
  - Change to '--threads=$SLURM_CPUS_PER_TASK' (check the variable name)
  - Now this script works with any '-c' option.
- MPI code
  - Create a script that uses it
- OpenMP code
  - How does it know how many processors to use?
    - OMP_NUM_THREADS
- Find example code online. How do you tell what kind it is?
  - Is it:
    - OpenMP
    - MPI
    - custom thread/multiprocessing (multiprocessing, threading, parallel)
  - Dig through the docs
    - Notice it doesn't use the words MPI or OpenMP
    - Notice some "CPUs" parameter
- How to select how many cores
  - hyperthreading
  - Radovan shows Slurmbrowser on Norwegian HPC
    - actually does not seem to work today but I will show something else
- How to select the amount of memory
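The "safe shell settings" point usually means `set -euo pipefail` at the top of the batch script. A minimal demonstration (the variable name `TYPO_IN_VARIABLE_NAME` is just an invented example):

```shell
#!/bin/bash
# Safe shell settings: exit on any error (-e), treat undefined
# variables as errors (-u), and fail a pipeline if any stage fails.
set -euo pipefail

# With -u, a typo like ${TYPO_IN_VARIABLE_NAME} aborts the script
# instead of silently expanding to an empty string. Demonstrate it
# in a subshell so this script itself keeps running:
if ! bash -c 'set -u; echo "${TYPO_IN_VARIABLE_NAME}"' 2>/dev/null; then
    echo "caught: undefined variable stopped the subshell"
fi
```

Without `-u`, a mistyped variable name in something like `rm -rf "$OUTPUT_DIR/"*` expands to nothing, which is exactly the kind of silent failure that is painful to debug inside a batch job.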
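The `-c` / `OMP_NUM_THREADS` progression could end up as a batch script roughly like this. It is a sketch, not a tested recipe: the binary name `pi-openmp`, its iteration-count argument, and the time/memory values are assumptions to adapt to your site.

```shell
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --mem=500M
#SBATCH --cpus-per-task=2     # long form of '-c 2'

# Use exactly as many threads as Slurm allocated, so the script
# keeps working with any '-c' value. Check the variable name:
# SLURM_CPUS_PER_TASK is only set when -c/--cpus-per-task is
# given, hence the :-1 fallback.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}

# pi-openmp: assumed to be compiled from the hpc-examples source, e.g.
#   gcc -fopenmp pi-openmp.c -o pi-openmp
srun ./pi-openmp 1000000
```

This is where "grow the calculation" comes in: first run a tiny case interactively (`srun -c 2 --time=00:05:00 ./pi-openmp 1000`), then submit with `sbatch`, then check `seff <jobid>` to see whether the CPUs you requested were actually used.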
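For the MPI version, the script changes shape: you request tasks (ranks) instead of cores per task. Again a sketch under assumptions: the module name `openmpi` and the binary name `pi-mpi` are placeholders for whatever your cluster and code actually provide.

```shell
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --ntasks=4            # number of MPI ranks
#SBATCH --mem-per-cpu=500M

# Site-specific: load whatever MPI module your cluster provides.
module load openmpi

# srun starts one process per task; the MPI library reads the rank
# count and node layout from Slurm, so nothing is hard-coded here
# and the script works unchanged with any --ntasks value.
srun ./pi-mpi 1000000
```

Contrast with the OpenMP case: MPI scales across nodes via `--ntasks`, OpenMP only within one node via `--cpus-per-task`, which is why being unsure about a code's MPI/OpenMP capabilities leads directly to the "request many processors, use few" failure mode.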
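For "how do you tell what kind it is": when the docs never say MPI or OpenMP, grepping the source for telltale strings is a quick first check. A small self-contained sketch (the `example.c` file is a stand-in for code you found online):

```shell
# Write a small stand-in for "code you found online":
cat > example.c <<'EOF'
#include <mpi.h>
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}
EOF

# Each parallelization style leaves telltale strings in the source:
# MPI_Init -> MPI, '#pragma omp' -> OpenMP, 'multiprocessing' or
# 'threading' -> custom Python parallelism.
for pattern in 'MPI_Init' '#pragma omp' 'multiprocessing'; do
    if grep -q "$pattern" example.c; then
        echo "found: $pattern"
    fi
done
```

A "CPUs" or "threads" parameter in the docs with no mention of MPI usually means shared-memory parallelism only, i.e. one node at most.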