Containers on Perlmutter

  • podman-hpc vs shifter

    • Differences (env, volume mounts, swapped modules, and image caching)
  • Best practices for running containers

    • Proper storage system to use
    • Containers per task vs containers per node
    • A deeper look at library swapping, via the .conf files under /etc/podman_hpc/ and /etc/podman_hpc/modules.d
      • GPU
      • MPI
      • other modules
        • cvmfs, nccl, openmpi-pmi2, openmpi-pmix, etc.
      • Skip library swapping with shifter (--module=none)
    • Share images with other users without pushing to registry

Local image sharing on CFS

podman-hpc --squash-dir /path/to/shared/area/on/cfs migrate ubuntu:latest
chmod -R a+rx  /path/to/shared/area/on/cfs

Other users can then point podman-hpc at the shared area and list the images:

export PODMANHPC_ADDITIONAL_STORES=/path/to/shared/area/on/cfs
podman-hpc images
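Before telling other users about the shared area, it can help to confirm that the chmod step reached every entry. The following is a minimal sketch, not part of podman-hpc: the helper name list_unshared_entries is hypothetical, and it only inspects permission bits below the given directory. It does not check the parent directories on CFS, which other users must also be able to traverse.

```bash
# Hypothetical helper: print every entry under a directory that "other"
# users could not use. Directories need read + execute for others
# (octal 005); regular files need read (octal 004).
list_unshared_entries() {
    local dir="$1"
    find "$dir" \( -type d ! -perm -005 \) -o \( -type f ! -perm -004 \)
}

# Example (path is the placeholder used above):
#   list_unshared_entries /path/to/shared/area/on/cfs
```

Empty output means every directory and file under the shared area carries the expected permission bits; any path that is printed still needs chmod a+rx.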

Running noVNC on login node for GUI apps

Ref: dingpf/nvidia-noVNC

The example dockerfile with

Start the container on one login node

Generate a VNC password file.

podman-hpc run --rm -it -v $HOME:/scratch --entrypoint=/bin/bash docker.io/dingpf/novnc-nvidia:latest
# inside the container, run
vncpasswd
# once finished
cp ~/.vnc/passwd /scratch/.vnc_passwd

Run the container on the login node.

podman-hpc run \
  --gpu \
  --rm -d -p 6080:6080 \
  --name ubuntu2204-novnc \
  -v <software_dirs_on_global_common>:<path_in_container> \
  -v ~/.vnc_passwd:/root/.vnc/passwd dingpf/novnc-nvidia:latest

Note that if someone else is already using port 6080 on the login node, you need to change the first 6080 in -p 6080:6080 (the host port) to another free port, and then forward to that port when setting up the SSH tunnel.
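One way to find a free host port is to probe a small range on the login node before starting the container. This is a minimal sketch under stated assumptions: the function names port_in_use and pick_free_port and the range 6080 to 6090 are illustrative, and the probe relies on bash's /dev/tcp redirection, so it only detects listeners reachable on 127.0.0.1.

```bash
# Succeed if something already accepts TCP connections on the given local port.
# bash-only: redirecting to /dev/tcp/HOST/PORT attempts a TCP connection.
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print the first port in [first, last] with no listener.
# Returns non-zero if every port in the range is taken (or the range is empty).
pick_free_port() {
    local first="$1" last="$2" port
    port="$first"
    while [ "$port" -le "$last" ]; do
        if ! port_in_use "$port"; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1
}

# Example: choose a host port, then pass it as the first number of -p
#   HOST_PORT=$(pick_free_port 6080 6090)
#   podman-hpc run ... -p "${HOST_PORT}:6080" ...
```

The container side of the mapping stays at 6080; only the host port changes. Another user could still claim the port between the probe and the podman-hpc run call, in which case rerunning the probe should pick the next free one.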

Setting up an SSH tunnel to the login node

# change ~/.ssh/nersc if the path to your nersc identity file is different
# change loginXX to the hostname of the login node where you have your container running
ssh -o IdentitiesOnly=yes \
    -o IdentityFile=~/.ssh/nersc \
    -J perlmutter.nersc.gov \
    -L 6080:localhost:6080 loginXX -N -f
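If the container was published on a host port other than 6080, only the second number in -L changes. The sketch below prints the tunnel command for a given login node and port so it can be reviewed before running. The function name novnc_tunnel_cmd and the example values login07 and 6081 are hypothetical; the ssh options are the ones used above.

```bash
# Hypothetical helper: print the ssh tunnel command for a login node.
#   $1 = login node hostname (e.g. loginXX)
#   $2 = host port published by the container on that node (default 6080)
#   $3 = local port to open on your own machine (default 6080)
novnc_tunnel_cmd() {
    local node="$1" remote_port="${2:-6080}" local_port="${3:-6080}"
    printf 'ssh -o IdentitiesOnly=yes -o IdentityFile=~/.ssh/nersc -J perlmutter.nersc.gov -L %s:localhost:%s %s -N -f\n' \
        "$local_port" "$remote_port" "$node"
}

# Example: container published with -p 6081:6080 on login07
#   novnc_tunnel_cmd login07 6081
```

With the defaults, the printed command matches the one above, apart from the placeholder hostname. Keeping the local port at 6080 means the browser address does not change even when the port on the login node does.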

Access noVNC in a browser

You should now be able to go to localhost:6080 in a browser window. The default noVNC password is 00000000.