# RCC Internal Workshop
Parmanand Sinha
July 17, 2023
---
## UNIX system organization
Things go in standard directories: binaries in /bin, libraries in /lib, configuration in /etc, etc.
But not research programs on Midway. Why?
● Lots of software, much of it is non-standard.
● We want different versions of the same software package available at the same time, but without conflicts.
● Dependency management: different codes may require different versions of libraries.
---
## Source Compile in Linux
Download the source and install it yourself
**configure** // Compilation environment setting result: Makefile creation
Default install location: /usr/local/***
**make** // compile. Run Makefile
**make install** // copy
Command sequence
1) Unzip
2) run configure
3) make
4) make install
5) Update $PATH
$PATH is a list of paths separated by colons containing all the location to look for an executable.
## Source Compile in Linux : Examples
---
Download and compile SQLite:
```
wget https://www.sqlite.org/2022/sqlite-autoconf-3360000.tar.gz
tar xvf sqlite-autoconf-3360000.tar.gz
cd sqlite-autoconf-3360000
./configure --prefix=$HOME/local
make -jN
make install
```
Download and compile GDAL:
```
wget http://download.osgeo.org/gdal/CURRENT/gdal-X.Y.Z.tar.gz
tar xvf gdal-X.Y.Z.tar.gz
cd gdal-X.Y.Z
./configure --prefix=$HOME/local --with-sqlite3=$HOME/local --with-static-proj4=/usr --with-geos=/usr/bin/geos-config
make -jN
make install
```
---
## Midway organization
Software is installed at /software.
● Directories are named
\$SOFTNAME-\$SOFTVERS-\$OS-\$ARCH
● For example, gedit lives at
/software/gedit-2.28-el6-x86_64
● bin, lib, src, doc under that directory
---
## Environment Modules
‘Environment Modules’ are the mechanism by which much of the software is made available to the users.
What happens when you type “module load openmpi/1.6”?
● module looks for a file called
/software/modulefiles/openmpi/1.6 (or in \~/privatemodules …)
● Applies changes to the environment based on that file:
○ load module dependencies
○ adds \$appdir/bin to PATH
○ adds \$appdir/lib to LD_LIBRARY_PATH
○ adds man directory to MANPATH, include directory to CPATH, etc...
---
## Environment Modules
LMOD (Lmod) and Environment modules (TCLmodule) are both environment module systems for managing software packages and their dependencies on high-performance computing (HPC) systems.
TCLmodule are written in ([Tcl](https://www.tcl.tk/)) syntax.
Modules are not the only way of managing software on clusters: increasingly common approaches include:
* Conda package manager (Python-centric but can manage software written in any language);
* Apptainer/Singularity, a means for deploying software in [containers](https://en.wikipedia.org/wiki/Operating-system-level_virtualization).
---
## Environment modules (TCLmodule)
A minimum example of TCL module file for loading Miniconda:
```
#%Module1.0
##
## Miniconda modulefile
##
## Specify the version of Miniconda
set version x.x.x
## Set the path for Miniconda
set home <path-to-miniconda> # Replace with the actual path
## Add Miniconda binary paths to the shell's PATH variable
prepend-path PATH $home/bin
## Add Miniconda library paths to the LD_LIBRARY_PATH environment variable
prepend-path LD_LIBRARY_PATH $home/lib
## Set environmental variables specific to Miniconda
setenv CONDA_HOME $home
```
----
## Midway Build System
Midway3 Repo: https://git.rcc.uchicago.edu/rcc-staff/midway3-software.git
● Build scripts live in
Midway2: the SVN repo at pubsw/software/build
Midway3: the git repo at midway3-software/build
● A subdirectory for every software package
● Build steps:
○ Create subdirectory and populate with configuration files
○ Run the build scripts from top-level directory
○ Test modulefile, then commit to repository.
---
## The build configuration subfolders
**buildopts.sh** defines important variables for the build, notably BUILDTYPE. **moduleconf.sh** (new) defines the module dependancies and package dependancies of the software.
**source.sh** contains the command(s) required to download the source.
**test.bats** contains tests to ensure the program was installed correctly.
**build.sh** is a script that builds and installs the software package. This is only required if BUILDTYPE is *custom*.
**build.sh.pre** and **build.sh.post** are optional scripts that are run before and after a build, respectively.
---
## Type of buildtype
* configure -- ./configure; make; make install
* rsync -- use rsync to copy $SRCDIR to $PREFIX
* noop -- Only create the modulefile. The software is installed in other ways.
* custom -- use build.sh in the package directory
---
## Module Example: tmux
Content of source.sh
`gitsource https://github.com/tmux/tmux.git`
Content of buildopts.sh
```
SRCDIR=${SRCDIR}-${SOFTVERS}
BUILDTYPE=configure
CORES=8
export LIBEVENT_CFLAGS="-I${LIBEVENT_HOME}/include"
export LIBEVENT_LIBS="-L${LIBEVENT_HOME}/lib -levent_core"
```
Content of moduleconf.sh
```
MODULEDESCRIPTION="tmux is a terminal multiplexer, it enables a number of terminals (or windows)
to be accessed and controlled from a single terminal."
MODULEDOCURL="https://github.com/tmux/tmux"
MODULELICENSE="opensource"
MODULETAGS="terminal"
BUILDDEPEND="libevent/2.1.12"
MODULEDEPEND="libevent/2.1.12"
```
## Module Example: tmux
**Run build scripts**
From the top-level folder: eg: for Midway3: midway3-software/build
```
SOFTNAME=tmux
SOFTVERS=3.2
. /get-source $SOFTNAME $SOFTVERS \# Get source for software
. /build $SOFTNAME $SOFTVERS \# compile software to /software/staging
. /build -i $SOFTNAME $SOFTVERS \# rsync from /software/staging to /software
```
---
## What about the modulefile?
● build puts modulefile in \~/privatemodules
● Test with “module load use.own; module load $SOFTNAME”
● If that works:
○ move \~/privatemodules/\$SOFTNAME/\$SOFTVER to midway3-software/modulefiles/\$SOFTNAME
● Commit changes to repository
## custom buildtype module
**Example: R**
**Content of source.sh**
`curl -O https://cran.rstudio.com/src/base/R-4/R-${$SOFTVER}.tar.gz`
**Content of buildopts.sh**
```
SRCDIR=${SRCDIR}-${SOFTVERS}
BUILDTYPE=custom
```
**Content of build.sh**
```
./configure --enable-R-shlib --enable-optimisations --enable-openmp \
--enable-BLAS-shlib --with-lapack --with-blas="-lopenblas" \
--disable-R-profiling --with-pcre1 --prefix=$PREFIX
make
make install
```
**Content of moduleconf.sh**
```
MODULEDOCNAME="R"
MODULEDESCRIPTION="The R language and environment for statistical computing \
and graphics."
MODULELICENSE="GNU General Public License v2"
MODULEDOCURL="https://www.r-project.org"
MODULETAGS="development, general programming, graphics, R, statistics"
SOFTWARESUFFIX=DISTARCH
MODULEDEPEND="openblas/0.3.13 java/15.0.2"
BUILDDEPEND=""
MODULECONFLICT="R"
MODULEUSAGE=''
MODULEEXTRA='
prepend-path LD_LIBRARY_PATH $appdir/lib64/R/lib
prepend-path LIBRARY_PATH $appdir/lib64/R/lib
setenv OMP_NUM_THREADS 1
'
```
---
## Software Management with Spack
[Spack](https://spack.readthedocs.io/en/latest/) is an open source package manager that simplifies building, installing, customizing, and sharing HPC software stacks.
It is written in pure Python including software build requirements and dependencies.
It follows Nix, GUIX based store model where all the software are installed inside its own filesystem
In midway3, two common repo is available at /software/spack-0.17.0-el8-x86_64/ and /software/spack-dev/ (Preferred for RCC CS)
Step1: source /software/spack-0.17.0-el8-x86_64/share/spack/setup-env.sh
---
## Basic Usage
``` bash
spack list <software-name> # list a software
spack info <software-name> # show information and variants of a software
spack spec -I <software-name> # show specifications (dependencies)
```
Option `-I` or `--install-status` shows status of the software
dependencies i.e. installed (`+`) or will be installed during the
installation (`-`).
Now let’s install a new software’s from Spack:
``` bash
spack install <software_name> or <software_name@version> or <software_name@version %compiler@version>
```
In general, `@version` for both software and compiler could be removed.
Spack installs the most stable version by default (see `spack versions
-s <software-name>`). You may find complete list of software that you
can install by using `spack list` or in Spack online [package
list](https://spack.readthedocs.io/en/latest/package_list.html). Also.
we can use `--verbose` option to see more details during the
installation, `--no-cache` to install a package directly from the
source, and `--overwrite` to overwrite an installed package.
---
## Basic Usage
After installation, we can find the software by:
``` bash
spack find # to see all installed software
spack find <software-name> # to find a software (use -lfvp to see hashes, flags, variants and pathes)
spack location -i <software-name> # to find location of a software
```
And we can load the software:
``` bash
spack load <software_name>
spack find --loaded # see what is loaded
```
---
## Compilers
We can select compiler version and settings. To find and list compilers,
use:
``` bash
spack compiler list
spack compiler find
```
Compilers can be added to the Spack compilers list or removed from the
list by:
``` bash
spack compiler add <compiler-name@ver> # for example $(spack location -i gcc@10.1.0) add gcc 10 compiler that already is installed by Spack
spack compiler remove <compiler-name@ver>
```
Also, we can directly modify `compilers.yaml` file by:
``` bash
spack config edit compilers
```
---
## Personal Spack installation
// Setting up inside scratch directory
git clone --depth=100 --branch=releases/v0.20 https://github.com/spack/spack.git $SCRATCH/midway3/$USER/spack
cd $SCRATCH/midway3/$USER/spack
. share/spack/setup-env.sh
---
---