# Slurm Installation

## Fix macOS locale problem

Edit `/etc/ssh/ssh_config` and comment out the `SendEnv LANG LC_*` line.

## set NTP

```
yum install ntp -y
systemctl enable ntpd.service
ntpdate pool.ntp.org
systemctl start ntpd
```

## MariaDB

Go to `/etc/yum.repos.d/` and create a new `MariaDB.repo` file:

```
cd /etc/yum.repos.d/
vi MariaDB.repo
```

Open https://downloads.mariadb.org/mariadb/repositories/, select the CentOS version, and paste the generated entry into the file:

```
# MariaDB 10.5 CentOS repository list - created 2021-03-30 17:33 UTC
# http://downloads.mariadb.org/mariadb/repositories/
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.5/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
```

``sudo yum install MariaDB-*``

```
service mysql start
mysql_secure_installation
systemctl enable mariadb.service
systemctl status mariadb.service
mysql -u root -p
```

## munge

```
yum install bzip2 -y
yum install openssl openssl-devel -y
yum -y install epel-release
yum install munge munge-libs munge-devel -y
yum install rng-tools -y
rngd -r /dev/urandom
/usr/sbin/create-munge-key -r
# Overwrite the key with 1024 random bytes
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key
systemctl start munge
```

Copy the key to every compute node:

```
scp -p /etc/munge/munge.key root@10.7.129.56:/etc/munge
scp -p /etc/munge/munge.key root@10.7.129.57:/etc/munge
scp -p /etc/munge/munge.key root@10.7.129.58:/etc/munge
scp -p /etc/munge/munge.key root@10.7.129.59:/etc/munge
```

## slurm

```
yum install rpm-build mysql-devel python3 pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad cpanm* -y
cd /tmp
wget https://download.schedmd.com/slurm/slurm-20.11.5.tar.bz2
tar -xaf slurm-20.11.5.tar.bz2
cd /tmp/slurm-20.11.5
./configure --prefix=/usr && make && make install
cp /tmp/slurm-20.11.5/etc/slurmctld.service /usr/lib/systemd/system/slurmctld.service
cp /tmp/slurm-20.11.5/etc/slurmdbd.service /usr/lib/systemd/system/slurmdbd.service
cp /tmp/slurm-20.11.5/etc/slurmd.service /usr/lib/systemd/system/slurmd.service
ldconfig -n /usr/lib
```

Copy `slurm.conf` to every compute node:

```
scp -p /usr/etc/slurm.conf root@10.7.129.56:/usr/etc/
scp -p /usr/etc/slurm.conf root@10.7.129.57:/usr/etc/
scp -p /usr/etc/slurm.conf root@10.7.129.58:/usr/etc/
scp -p /usr/etc/slurm.conf root@10.7.129.59:/usr/etc/
```

Edit `/usr/etc/slurm.conf` with `vi /usr/etc/slurm.conf`. The configurator at https://slurm.schedmd.com/configurator.html can be used to generate its contents.

In `/usr/etc/slurm.conf`, set ``StateSaveLocation=/var/spool/slurmctld``

```
mkdir /var/spool/slurmctld
chown slurm:slurm /var/spool/slurmctld
systemctl start slurmctld.service
systemctl status slurmctld.service
cp /tmp/slurm-20.11.5/etc/slurmdbd.conf.example /usr/etc/slurmdbd.conf
chown slurm:slurm /usr/etc/slurmdbd.conf
chmod 600 /usr/etc/slurmdbd.conf
mkdir /var/log/slurm/
touch /var/log/slurm/slurmdbd.log
chown slurm: /var/log/slurm/slurmdbd.log
```

Run `vi /usr/etc/slurm.conf` again and add these lines after `AccountingStorageType=accounting_storage/mysql`:

```
AccountingStorageHost=localhost
AccountingStoragePort=3306
AccountingStoragePass=!QAZ2wsx
AccountingStorageUser=slurm
```

```
cp /tmp/slurm-20.11.5/etc/slurmd.service /usr/lib/systemd/system/slurmd.service
```

## compute node

```
yum -y install epel-release
yum install munge munge-libs munge-devel rng-tools -y
rngd -r /dev/urandom
chown -R munge: /etc/munge/ /var/log/munge/
chown munge:munge /etc/munge/munge.key
chmod 0700 /etc/munge/ /var/log/munge/
systemctl start munge
systemctl enable munge
systemctl status munge
```

Test the MUNGE service. From the master node, check access to a compute node:

```
munge -n | ssh 10.7.129.58 unmunge
```

Install the dependencies:

```
yum install rpm-build mysql-devel python3 pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad cpanm* -y
```

Install Slurm:

```
scp root@10.7.129.55:/tmp/slurm-20.11.5.tar.bz2 /tmp/
cd /tmp
tar -xaf slurm-20.11.5.tar.bz2
cd /tmp/slurm-20.11.5
./configure --prefix=/usr && make && make install
cp /tmp/slurm-20.11.5/etc/slurmd.service /usr/lib/systemd/system/slurmd.service
ldconfig -n /usr/lib
```

```
mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
chmod 755 /var/spool/slurmd
touch /var/log/slurmd.log
chown slurm: /var/log/slurmd.log
cp /tmp/slurm-20.11.5/etc/cgroup.conf.example /usr/etc/cgroup.conf
systemctl start slurmd
systemctl status slurmd
```

# Troubleshooting

```
scontrol show config | grep -i resumetimeout
```
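
The `scontrol show config | grep -i resumetimeout` check prints the whole matching line. When only the value is needed (for example in a script), a small helper along these lines may be handy. This is a rough sketch, not part of Slurm: the function name `slurm_config_value` is made up for illustration, and it assumes the usual `Key = value` layout of `scontrol show config` output.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): read `Key = value` lines on
# stdin and print the value for one key, matched case-insensitively.
slurm_config_value() {
    awk -F'=' -v key="$1" '
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)       # trim padding around the key
        }
        tolower(k) == tolower(key) {
            v = substr($0, index($0, "=") + 1)   # everything after the first "="
            gsub(/^[ \t]+|[ \t]+$/, "", v)       # trim padding around the value
            print v
        }'
}

# Only query the controller when scontrol is actually installed.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show config | slurm_config_value ResumeTimeout
fi
```

Matching on the exact key name avoids the partial matches that `grep -i` can return when several parameters share a prefix.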