# Spack at HPCFS
Spack provides prebuilt modules for all users on HPCFS. Specific module builds are listed below. We restrict our builds to the RHEL 8.4 provided compiler GCC@8.5.0 in order to keep `haswell` and `rome` CPU compatibility (arch=zen) for the system-provided layer (SLURM, UCX, knem).
Newer GCC compilers are used on ROME with zen2 (AVX2) compatibility across the `rome` and `haswell` partitions, while the AOCC compilers are intended only for the AMD `rome` partition.
## User Spack development
It is also possible to build packages locally for your own use by setting up the following configuration for local builds and module deployment.
~~~bash
# Start from a clean user-level Spack configuration
rm -rf ~/.spack
mkdir ~/.spack
# Build and install into /work/$USER instead of the system Spack tree
sed -e '/root:/s,$spack,/work/$USER,' /opt/spack/etc/spack/defaults/config.yaml > ~/.spack/config.yaml
# Keep the source cache under the home directory and allow more parallel build jobs
sed -i -e '/source_cache:/s,:.*,: ~/.spack/cache,' ~/.spack/config.yaml
sed -i -e 's/# build_jobs: 16/build_jobs: 32/' ~/.spack/config.yaml
# Generate both Tcl and Lmod module files under /work/$USER/opt
sed -e 's/# roots:/roots:/' -e 's,# lmod:.*, lmod: /work/$USER/opt/lmod,' -e 's,# tcl:.*, tcl: /work/$USER/opt/modules,' -e '/- tcl/a\ - lmod' /opt/spack/etc/spack/defaults/modules.yaml > ~/.spack/modules.yaml
~~~
Source the Spack environment in your active shell, or add the source line to your `.bashrc` profile:
~~~bash
source /opt/spack/share/spack/setup-env.sh
~~~
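To check that the user-level configuration is picked up, Spack can print the merged configuration and where each value comes from (a quick sanity check; the install root should now point to `/work/$USER`):
~~~bash
# Show the origin of each config value; root and build_jobs should reflect the edits above
spack config blame config | grep -E 'root|build_jobs'
~~~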
## hwloc
Hardware locality (hwloc) is built against the latest CUDA, since CUDA is used only for device detection rather than computing; CUDA also provides the OpenCL devices. NVML is provided by the NVIDIA drivers and cannot be built into hwloc, since only a few GPU nodes provide the drivers and the corresponding `libnvidia-ml.so.1` library.
The rest of the locality information is provided through udev and PCI.
~~~spack
[spack@gpu02 ~]$ spack spec --install-status --long hwloc
Input spec
--------------------------------
- hwloc
Concretized
--------------------------------
- gg2gaiz hwloc@2.6.0%gcc@8.5.0~cairo+cuda~gl+libudev+libxml2~netloc~nvml+opencl+pci~rocm+shared arch=linux-almalinux8-zen
[+] 5lbelaa ^cuda@11.5.1%gcc@8.5.0~dev arch=linux-almalinux8-zen
...
~~~
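To install this spec and inspect the detected topology, something along these lines should work (a sketch; `lstopo-no-graphics` is installed by hwloc, since the graphical `+cairo` variant is off):
~~~bash
spack install hwloc+cuda+opencl+libudev
spack load hwloc
# Text-mode dump of the node topology, including detected PCI and GPU devices
lstopo-no-graphics
~~~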
## PMIX
The Process Management Interface (PMIx) is used by SLURM and OpenMPI.
To chain compilations, the input specification can be pinned to the hash of an existing build (e.g. `/gg2gaiz`):
~~~spack
[spack@gpu02 ~]$ spack spec -lI pmix@3.2.1 ^hwloc/gg2gaiz
Input spec
--------------------------------
- pmix@3.2.1
[+] ^hwloc@2.6.0%gcc@8.5.0~cairo+cuda~gl+libudev+libxml2~netloc~nvml+opencl+pci~rocm+shared arch=linux-almalinux8-zen
[+] ^cuda@11.5.1%gcc@8.5.0~dev arch=linux-almalinux8-zen
[+] ^libxml2@2.9.12%gcc@8.5.0~python arch=linux-almalinux8-zen
[+] ^libiconv@1.16%gcc@8.5.0 libs=shared,static arch=linux-almalinux8-zen
[+] ^xz@5.2.4%gcc@8.5.0~pic libs=shared,static arch=linux-almalinux8-zen
[+] ^zlib@1.2.11%gcc@8.5.0+optimize+pic+shared arch=linux-almalinux8-zen
[+] ^libpciaccess@0.16%gcc@8.5.0 arch=linux-almalinux8-zen
[+] ^ncurses@6.1.20180224%gcc@8.5.0~symlinks+termlib abi=6 arch=linux-almalinux8-zen
Concretized
--------------------------------
- atirvkd pmix@3.2.1%gcc@8.5.0~docs+pmi_backwards_compatibility~restful arch=linux-almalinux8-zen
[+] gg2gaiz ^hwloc@2.6.0%gcc@8.5.0~cairo+cuda~gl+libudev+libxml2~netloc~nvml+opencl+pci~rocm+shared arch=linux-almalinux8-zen
[+] 5lbelaa ^cuda@11.5.1%gcc@8.5.0~dev arch=linux-almalinux8-zen
...
~~~
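The corresponding install reuses the same hash pin, so the already-installed hwloc (and its CUDA) is picked up instead of being re-concretized (sketch):
~~~bash
spack install pmix@3.2.1 ^hwloc/gg2gaiz
~~~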
## SLURM
Slurm is built with PMIX and hwloc support, while PMI2 is provided internally.
~~~spack
[spack@gpu02 ~]$ spack spec -Il slurm@21-08-1-1%gcc@8.5.0+hwloc+pmix ^pmix/atirvkd
Input spec
--------------------------------
- slurm@21-08-1-1%gcc@8.5.0+hwloc+pmix
[+] ^pmix@3.2.1%gcc@8.5.0~docs+pmi_backwards_compatibility~restful arch=linux-almalinux8-zen
[+] ^hwloc@2.6.0%gcc@8.5.0~cairo+cuda~gl+libudev+libxml2~netloc~nvml+opencl+pci~rocm+shared arch=linux-almalinux8-zen
[+] ^cuda@11.5.1%gcc@8.5.0~dev arch=linux-almalinux8-zen
[+] ^libxml2@2.9.12%gcc@8.5.0~python arch=linux-almalinux8-zen
[+] ^libiconv@1.16%gcc@8.5.0 libs=shared,static arch=linux-almalinux8-zen
[+] ^xz@5.2.4%gcc@8.5.0~pic libs=shared,static arch=linux-almalinux8-zen
[+] ^zlib@1.2.11%gcc@8.5.0+optimize+pic+shared arch=linux-almalinux8-zen
[+] ^libpciaccess@0.16%gcc@8.5.0 arch=linux-almalinux8-zen
[+] ^ncurses@6.1.20180224%gcc@8.5.0~symlinks+termlib abi=6 arch=linux-almalinux8-zen
[+] ^libevent@2.1.8%gcc@8.5.0+openssl arch=linux-almalinux8-zen
[+] ^openssl@1.1.1l%gcc@8.5.0~docs certs=system arch=linux-almalinux8-zen
Concretized
--------------------------------
- bpmt4tx slurm@21-08-1-1%gcc@8.5.0~gtk~hdf5+hwloc~mariadb+pmix+readline~restd sysconfdir=/etc/slurm arch=linux-almalinux8-zen
...
~~~
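With Slurm and PMIx in place, you can check which MPI plugins `srun` offers on the cluster (assumes the `srun` in your `PATH` is the cluster's Slurm):
~~~bash
# pmix should appear among the available MPI plugin types
srun --mpi=list
~~~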
## OpenMPI
OpenMPI is built on top of SLURM and PMIX.
The UCX fabric with knem is provided externally as part of the Mellanox OFED HPC build. To summarize, these are the relevant entries in `~/.spack/packages.yaml`:
~~~yaml
packages:
  lustre:
    externals:
    - spec: lustre@2.14.55
      prefix: /usr
    buildable: false
  #...
  hwloc:
    compiler: [gcc@8.5.0]
    variants: +cuda+opencl+libudev^cuda@11.5.1
  munge:
    compiler: [gcc@8.5.0]
    variants: localstatedir=/var
  slurm:
    compiler: [gcc@8.5.0]
    variants: +hwloc+pmix sysconfdir=/etc/slurm
  knem:
    externals:
    - spec: knem@1.1.4
      prefix: /opt/knem-1.1.4.90mlnx1
  ucx:
    externals:
    - spec: ucx@1.11.1+thread_multiple +cma +rc +ud +dc +mlx5-dv +ib-hw-tm +dm +cm +knem
      prefix: /usr
  openmpi:
    variants: +pmi+pmix+lustre fabrics=ucx,knem schedulers=slurm
  mpich:
    variants: +slurm pmi=pmix
~~~
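A quick way to confirm that these preferences and externals are honored is to print the merged package configuration and concretize one of the externals (sketch):
~~~bash
spack config get packages
# ucx should concretize to the external installation under /usr
spack spec -I ucx
~~~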
For the various compilers, the OpenMPI concretization is applied with the above defaults against the same Slurm build.
~~~spack
[spack@gpu02 apps]$ spack spec -lI openmpi%gcc@8.5.0 ^slurm/bpmt4tx
Input spec
--------------------------------
- openmpi%gcc@8.5.0
[+] ^slurm@21-08-1-1%gcc@8.5.0~gtk~hdf5+hwloc~mariadb+pmix+readline~restd sysconfdir=/etc/slurm arch=linux-almalinux8-zen
...
Concretized
--------------------------------
- nezrdtx openmpi@4.1.2%gcc@8.5.0~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java~legacylaunchers+lustre~memchecker+pmi+pmix~singularity~sqlite3+
static~thread_multiple+vt+wrapper-rpath fabrics=knem,ucx schedulers=slurm arch=linux-almalinux8-zen
[+] gg2gaiz ^hwloc@2.6.0%gcc@8.5.0~cairo+cuda~gl+libudev+libxml2~netloc~nvml+opencl+pci~rocm+shared arch=linux-almalinux8-zen
[+] 5lbelaa ^cuda@11.5.1%gcc@8.5.0~dev arch=linux-almalinux8-zen
...
[spack@gpu02 apps]$ cat >> $(spack location -i openmpi%gcc@8.5.0)/etc/openmpi-mca-params.conf << EOF
btl=^openib
opal_common_ucx_opal_mem_hooks=1
EOF
[spack@gpu02 ~]$ spack install openmpi%aocc@3.1.0 ^slurm/bpmt4tx ^knem%gcc@8.5.0 ^lustre%gcc@8.5.0
[spack@gpu02 ~]$ cat >> $(spack location -i openmpi%aocc@3.1.0)/etc/openmpi-mca-params.conf << EOF
btl=^openib
opal_common_ucx_opal_mem_hooks=1
EOF
spack spec -lI openmpi%intel@2021.4.0 ^slurm/bpmt4tx
spack install openmpi%intel@2021.4.0 ^slurm/bpmt4tx
==> Warning: Intel compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors
cat >> $(spack location -i openmpi%intel@2021.4.0)/etc/openmpi-mca-params.conf << EOF
btl=^openib
opal_common_ucx_opal_mem_hooks=1
EOF
[spack@gpu02 ~]$ spack install openmpi%gcc@11.2.0 ^slurm/bpmt4tx ^ucx/i4eomv4
cat >> $(spack location -i openmpi%gcc@11.2.0)/etc/openmpi-mca-params.conf << EOF
btl=^openib
opal_common_ucx_opal_mem_hooks=1
EOF
~~~
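To list the resulting OpenMPI installations with their compilers, variants, and hashes:
~~~bash
spack find -lv openmpi
~~~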
Compiling and running OpenMPI programs with `srun` requires PMIx:
~~~sh
cat > hello.f90 <<EOF
program hello
use mpi
integer rank, size, ierror, strlen, status(MPI_STATUS_SIZE)
character(len=MPI_MAX_PROCESSOR_NAME) :: hostname
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
call MPI_GET_PROCESSOR_NAME( hostname, strlen, ierror )
print*, trim(hostname), rank, size
call MPI_FINALIZE(ierror)
end
EOF
ml openmpi-4.1.2-gcc-8.5.0-nezrdtx
mpif90 hello.f90
srun --mpi=pmix -p rome -N5 -n40 --mem=0 a.out
ml purge
ml openmpi-4.1.2-aocc-3.1.0-43vdsd3
ml aocc-3.1.0-gcc-11.2.0-plf5zph
mpif90 hello.f90
srun --mpi=pmix -p rome -N5 -n40 --mem=0 a.out
~~~
The `mpirun` wrapper is no longer built by default with this OpenMPI (note `~legacylaunchers` with `schedulers=slurm` in the spec above).
Use `srun --mpi=pmix` instead; see the examples at http://hpc.fs.uni-lj.si/slurm.
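A minimal batch script along these lines (module name taken from the GCC example above; partition and task counts are only placeholders):
~~~bash
#!/bin/bash
#SBATCH -p rome
#SBATCH -N 2
#SBATCH --ntasks-per-node=8
module purge
module load openmpi-4.1.2-gcc-8.5.0-nezrdtx
# srun launches the MPI ranks through Slurm's PMIx plugin; no mpirun needed
srun --mpi=pmix ./a.out
~~~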
## Linpack
~~~spack
[spack@gpu02 ~]$ spack spec -lI hpl+openmp%aocc@3.1.0 ^amdblis%aocc@3.1.0 threads=openmp ^openmpi/43vdsd3
Input spec
--------------------------------
- hpl%aocc@3.1.0+openmp
- ^amdblis%aocc@3.1.0 threads=openmp
...
~~~
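Running the resulting binary is roughly as follows (a sketch, assuming the spec above was installed and a tuned `HPL.dat` is present in the working directory):
~~~bash
spack load hpl+openmp%aocc@3.1.0
# xhpl reads HPL.dat from the current directory
srun --mpi=pmix -p rome -N 2 --ntasks-per-node=16 --mem=0 xhpl
~~~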
## OpenFOAM
~~~ bash
spack spec -Il openfoam%gcc@11.2.0+metis+zoltan+mgridgen+paraview ^openmpi/ip5xqcx ^paraview+qt ^mesa@21.3.8
module load openfoam-2112-gcc-11.2.0-qi6xbnd
~~~
There are also AOCC-compiled OpenFOAM modules, but it seems that not all utilities get built, due to incompatibilities or compiler-detection issues.
A sample OpenFOAM batch script:
~~~bash
#!/bin/bash
#SBATCH -p rome
#SBATCH -n 96
#SBATCH --mem=0
#SBATCH --ntasks-per-node=48
module purge
module load openfoam-2112-gcc-11.2.0-qi6xbnd
# Decompose solution (serial)
decomposePar -force > log.decomposeParDict 2>&1
# Run the solution (parallel)
srun --mpi=pmix compressibleInterFoam -parallel > log.CIF 2>&1
~~~
## TAU profiler and analysis utility
Built for the Intel and GCC compilers:
~~~ bash
$ spack install tau%intel+mpi+ompt+openmp ^openmpi/hovu6pi
$ spack install tau%gcc@11.2.0+mpi+ompt+openmp ^openmpi/ip5xqcx
~~~
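A rough usage sketch, profiling the Fortran example from the OpenMPI section (assumes the GCC build above; `tau_exec` and `pprof` ship with TAU):
~~~bash
spack load tau%gcc@11.2.0 openmpi/ip5xqcx
mpif90 hello.f90
# Collect profiles into profile.* files, then browse them as text
srun --mpi=pmix -p rome -n 8 --mem=0 tau_exec ./a.out
pprof
~~~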
## AMDScalaPack, MUMPS, PETSc
~~~spack
spack spec -I amdscalapack%aocc ^openmpi/43vdsd3
spack spec -I mumps%aocc ^amdscalapack/g4eckyd
spack spec -I petsc%intel ^amdscalapack/g4eckyd
spack spec -I petsc%intel +mkl-pardiso+scalapack+valgrind ^openmpi/hovu6pi
~~~
## stream benchmark
According to the [AMD developer instructions](https://developer.amd.com/spack/stream-benchmark/), the benchmark should be built as:
~~~spack
spack spec -I stream%aocc+openmp cflags="-mcmodel=large -DSTREAM_TYPE=double -mavx2 -DSTREAM_ARRAY_SIZE=260000000 -DNTIMES=10 -ffp-contract=fast -fnt-store"
~~~
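A possible way to run it on a single `rome` node (a sketch; `stream_c.exe` is the binary name the Spack package installs, and the thread count is only an example):
~~~bash
spack load stream%aocc
export OMP_NUM_THREADS=64
export OMP_PROC_BIND=spread
srun -p rome -N 1 --cpus-per-task=$OMP_NUM_THREADS --mem=0 stream_c.exe
~~~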
## ParaView/tajjatz
~~~spack
spack install paraview+mpi+qt ^openmpi/nezrdtx ^mesa@21.3.8           # /tajjatz
spack install paraview+mpi+qt+python3 ^openmpi/nezrdtx ^mesa@21.3.8   # /ha7jdza
~~~
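For remote parallel rendering, a `pvserver` can be started under Slurm and a local ParaView client connected to it, e.g. through an SSH tunnel (a sketch; the port is arbitrary):
~~~bash
spack load paraview+mpi+qt
# Start 8 server ranks; connect the local ParaView GUI to <first-node>:11111
srun -p rome -n 8 --mpi=pmix pvserver --server-port=11111
~~~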
## Fenics
~~~spack
spack spec fenics-dolfinx ^openmpi/ip5xqcx   # /vtzd4lz
~~~
## gmsh@4.8.4/p2z7wue
~~~spack
spack install gmsh%gcc ^openmpi/nezrdtx
~~~
###### tags: `HPCFS` `spack`