# SLURM INSTALLATION INSTRUCTIONS FOR RAVENCLAW CLUSTER ON KI-CENTRAL IT VMs (IAAS)

**Written by:** Venkatesh Chellappa and Sarath Murugan
**Reviewed by:** Rebecka Bergström
**Version 3.0 (2023-04-12) (STRICTLY CONFIDENTIAL - For internal purposes only)**

### 1. Update the advanced package tool (Ubuntu apt)

The following commands have to be run with sudo permission, so log in as root:

```
sudo su -
apt-get update
```

### 2. Install munge, the tool that handles passwordless authentication between the servers

`apt install munge`

#### Log in to the head node and copy the munge key (from the head node, c8ravenclaw01) to the new servers

```
scp /etc/munge/munge.key venche@c8ravenclaw03.ki.se:/tmp
scp /etc/munge/munge.key venche@c8ravenclaw04.ki.se:/tmp
scp /etc/munge/munge.key venche@c8ravenclaw05.ki.se:/tmp
```

#### Back on the new nodes, move the munge key into the munge folder

```
mv /tmp/munge.key /etc/munge/
chown munge:munge /etc/munge/munge.key
```

#### Check that the key is intact and hasn't been modified during the copy and move

```
ls -lh /etc/munge/munge.key
md5sum /etc/munge/munge.key
```

#### Check the munge installation

```
systemctl enable munge
systemctl start munge
systemctl status munge
munge -n | unmunge
```

### 3. Install SLURM

`apt install slurm-wlm`

#### To check the SLURM installation

`which slurmd`

---

#### Edit slurm.conf on the head node (c8ravenclaw01)

`vi /etc/slurm-llnl/slurm.conf`

#### Add the new compute nodes, with the correct IP and specs, below 'COMPUTE NODES'

```
# COMPUTE NODES
NodeName=c8ravenclaw02 NodeAddr=193.10.16.199 CPUs=48 Sockets=2 CoresPerSocket=24 ThreadsPerCore=1 State=UNKNOWN
NodeName=c8ravenclaw03 NodeAddr=193.10.16.212 CPUs=48 Sockets=2 CoresPerSocket=24 ThreadsPerCore=1 State=UNKNOWN
NodeName=c8ravenclaw04 NodeAddr=193.10.16.213 CPUs=48 Sockets=2 CoresPerSocket=24 ThreadsPerCore=1 State=UNKNOWN
NodeName=c8ravenclaw05 NodeAddr=193.10.16.169 CPUs=48 Sockets=2 CoresPerSocket=24 ThreadsPerCore=1 State=UNKNOWN
PartitionName=core Nodes=ALL Default=YES MaxTime=INFINITE State=UP
```

#### Copy slurm.conf (from the head node, c8ravenclaw01) to the new servers

```
scp /etc/slurm-llnl/slurm.conf venche@c8ravenclaw03.ki.se:/tmp
scp /etc/slurm-llnl/slurm.conf venche@c8ravenclaw04.ki.se:/tmp
scp /etc/slurm-llnl/slurm.conf venche@c8ravenclaw05.ki.se:/tmp
```

#### Copy /etc/slurm-llnl/cgroup.conf (from the head node, c8ravenclaw01) to the new servers

```
scp /etc/slurm-llnl/cgroup.conf venche@c8ravenclaw03.ki.se:/tmp
scp /etc/slurm-llnl/cgroup.conf venche@c8ravenclaw04.ki.se:/tmp
scp /etc/slurm-llnl/cgroup.conf venche@c8ravenclaw05.ki.se:/tmp
```

#### On the compute nodes, move them from /tmp to the SLURM folder

```
mv /tmp/slurm.conf /etc/slurm-llnl/
mv /tmp/cgroup.conf /etc/slurm-llnl/
```

#### Change ownership of the conf files to root and verify the checksum

```
chown root:root /etc/slurm-llnl/slurm.conf
chown root:root /etc/slurm-llnl/cgroup.conf
md5sum /etc/slurm-llnl/slurm.conf
```

#### Enable slurmd on all new compute nodes

```
systemctl enable slurmd
systemctl start slurmd
systemctl status slurmd
```

#### Create the log file for SLURM accounting

```
mkdir /usr/local/slurm/
touch /usr/local/slurm/slurm_accounting.log
chmod 777 /usr/local/slurm/slurm_accounting.log
```

#### Bring nodes up manually if they are in a mixed or down state

`sudo scontrol update nodename=c8ravenclaw05 state=idle`

#### Set the slurm user's UID and the slurm group's GID to 350

```
usermod -u 350 slurm
groupmod -g 350 slurm
grep 350 /etc/passwd
```

#### Check the SLURM cluster to confirm that the new nodes show up

`sinfo --Node --format="%10N %.6D %10P %.11T %.4c %.6m %10e %8O"`
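#### Optional: re-run the per-node checks in one pass

If a node does not show up as expected, the munge and slurmd checks above can be repeated for all new nodes from the head node. This is only a minimal sketch, assuming passwordless SSH from the head node to the compute nodes; the hostnames are the ones used in this guide, so adjust the list as needed.

```
#!/bin/bash
# Quick health check of the new compute nodes, run from the head node.
for node in c8ravenclaw03 c8ravenclaw04 c8ravenclaw05; do
    echo "=== ${node} ==="
    # A credential minted on the head node must decode cleanly on the compute node
    munge -n | ssh "${node}.ki.se" unmunge | grep -E "STATUS|UID"
    # Both daemons should report "active"
    ssh "${node}.ki.se" "systemctl is-active munge slurmd"
done
```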
#### Test the SLURM installation by submitting blocking jobs

```
## blocking jobs: submit 20 jobs of 8 cores each, each sleeping for 10 minutes
for i in `seq 1 20`; do
    sbatch -c 8 -J blocking_job_$i -t 15:00 -o job_$i.out --wrap="sleep 10m"
done
```
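Once the test jobs are queued, it is worth checking that they are spread across the new nodes, and cleaning them up afterwards. A minimal sketch, run as the submitting user (note that the scancel line cancels all of that user's jobs, not just the test ones):

```
# list your own jobs together with the node each one landed on
squeue -u $USER -o "%.10i %.20j %.8T %.10M %R"

# cancel all of your remaining jobs once you are satisfied with the test
scancel -u $USER
```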