# Run Python Programs on Pitt CRC Cluster

[![hackmd-github-sync-badge](https://hackmd.io/N3CtmODbQaaNYKdbYyUrtg/badge)](https://hackmd.io/N3CtmODbQaaNYKdbYyUrtg)

A brief guide to quickly getting started with Python computing on the cluster, complementing the [instructions from CRC](https://crc.pitt.edu/). Start here if you already have access to the cluster!

## Connect to the cluster through SSH

From your terminal (e.g. Windows Command Prompt), log in to the cluster with your username and password using the command `ssh {yourPittID}@{hostname}.crc.pitt.edu`, where `{hostname}` is one of `{h2p, htc}`.

Note: The `h2p` and `htc` login nodes are the gateways everyone uses for interactive work such as editing code, submitting jobs, and checking job status. `htc` serves the HTC cluster, and `h2p` serves the SMP, MPI, and GPU clusters.

## Load a pre-installed Python distribution

You don't need to install a Python distribution such as Anaconda yourself: regular users don't have permission to do so, and pre-installed versions are already available as modules.

1. Use `module spider python` to list all the pre-installed Python versions on the cluster.
2. Select one conda release from the list; this guide uses `python/anaconda3.10-2022.10` as an example.
3. Use `module spider python/anaconda3.10-2022.10` to see which modules must be loaded before the selected anaconda module becomes available. Here, `gcc/8.2.0` shows up.
4. `module spider gcc/8.2.0` shows that `gcc/8.2.0` can be loaded directly, so load it with `module load gcc/8.2.0`, then load the anaconda module with `module load python/anaconda3.10-2022.10`.

## Create and activate your Conda environment

Note that `/ihome` is the file system housing home directories, which has limited space for all CRC users; `/ihome/{your group name}/{your user name}` (`$HOME`) is also the directory you land in when you log in. Group project storage locations are one of `{/ix, /ix1, /bgfs, /zfs1, /zfs2}`. Anaconda creates virtual environments in your home directory by default (`/ihome/{your group name}/{your user name}/.conda/envs/`), so it is better to create your conda environment under your group project's storage location. (Take a look at the [file systems manual](https://crc-pages.pitt.edu/user-manual/data-management/data-management-overview/) for more information.) You can use the `conda` command once the conda module is loaded.

1. Use the following command to create the environment (taking `/bgfs` as the group storage location, for example); you can specify your own Python version (3.8 here).

    ```bash
    conda create --prefix=/bgfs/{your group name}/{your user name}/envs/{env name} python=3.8
    ```

2. Activate the environment with `source activate` (rather than `conda activate`):

    ```bash
    source activate /bgfs/{your group name}/{your user name}/envs/{env name}
    ```

## Install Python packages that you need

Once the environment is created and activated, you can install the Python packages you need. For example, the following packages can be installed with

```bash
conda install -c conda-forge numpy pandas scipy matplotlib scikit-learn jupyter pingouin kdepy ruptures tqdm
```

- NumPy: A library for numerical computing in Python.
- pandas: A powerful data manipulation and analysis library.
- SciPy: A library for scientific computing in Python.
- Matplotlib: A plotting library for creating visualizations in Python.
- scikit-learn: A machine learning library for Python.
- Jupyter: An interactive computing environment for Python.
- Pingouin: A statistical package for Python.
- KDEpy: A package for kernel density estimation.
- ruptures: A package for change point detection.
- tqdm: A package for progress bars.
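After the install finishes, an optional sanity check can confirm that the environment resolves the new packages. A minimal sketch, assuming the environment created above is still activated:

```bash
# Check that the key packages import from the activated environment
python -c "import numpy, pandas, sklearn; print('numpy', numpy.__version__)"

# List what conda actually installed into this environment (filter by name)
conda list numpy
```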
## Edit your .bashrc file

You can add the following lines to your `.bashrc` (bash runtime configuration) file in your home folder so that the modules are loaded automatically when you log in to the cluster.

```bash
module load gcc/8.2.0
module load python/anaconda3.10-2022.10
```

You can also add the following line to activate the conda environment automatically.

```bash
source activate /bgfs/{your group name}/{your user name}/envs/{env name}
```

## Clone your repository, copy your input data, and run some tests

You can clone your repository to the cluster using the command `git clone {your repository URL}`.

<!-- You can copy your input data to the cluster using the command `scp {your local file path} {yourPittID}@{hostname}.crc.pitt.edu:{your remote file path}`. -->

WinSCP is a good GUI application for transferring files between your local machine and the cluster.

Test your code on the cluster to make sure it works before submitting larger jobs.

## Submit a job to the cluster

You can submit a job to the cluster using the `sbatch` command. Create a script file (e.g. `job.sh`) with the following content and submit it with `sbatch job.sh`.

```bash
#!/bin/bash
#SBATCH --job-name=job_name             # Job name
#SBATCH --output=job_name.out           # Output file name, under the current directory
#SBATCH --error=job_name.err            # Error file name, under the current directory
#SBATCH --mail-type=ALL                 # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=your@email.address  # Where to send mail
#SBATCH --clusters=smp                  # Cluster name (smp, mpi, gpu, htc)
#SBATCH --partition=smp                 # Partition the job will run on; for the smp cluster, choose from {smp, high-mem}
#SBATCH --time=00:10:00                 # Time limit hrs:min:sec
#SBATCH --nodes=1                       # Number of nodes (usually 1)
#SBATCH --ntasks=1                      # Number of tasks
#SBATCH --cpus-per-task=32              # Number of CPU cores per task (currently 32 is the max for the smp cluster)
#SBATCH --mem=256gb                     # Memory limit for the job (currently 512gb or more can be requested on the smp cluster; the exact maximum is unclear)

# Load the modules and activate the conda environment (if needed)
module load gcc/8.2.0
module load python/anaconda3.10-2022.10
source activate /bgfs/{your group name}/{your user name}/envs/{env name}

# Run your python script
python your_script.py
```
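After submitting, you can track the job with the CRC wrapper commands described in the next section. A typical sequence might look like the following sketch, where `{job id}` is the ID printed by `sbatch`:

```bash
sbatch job.sh         # prints "Submitted batch job {job id}"
crc-squeue            # show the status of your jobs in the queue
crc-seff {job id}     # after the job finishes, check its CPU and memory efficiency
crc-scancel {job id}  # cancel the job if something went wrong
```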
## Useful commands

The most frequently used commands are bolded.

### CRC provided commands

CRC provides some useful commands to help you manage your environment and resources.

- `crc-idle`: List idle nodes
- `crc-interactive`: Request an interactive session
- `crc-job-stats`: Show job statistics; intended to be run inside a job script rather than from an interactive terminal
- `crc-proposal-end`: Show your proposal end date
- `crc-quota`: Show your storage quota on all storage systems (~~takes long time~~)
- `crc-scancel`: **Cancel a job; takes a job ID as argument**
- `crc-show-config`: Show configuration information
- `crc-sinfo`: Show information about nodes
- `crc-squeue`: **Show information about jobs**
- `crc-sus`: Show your group's service units (SUs)
- `crc-usage`: **Show usage statistics for your group**
- `crc-seff`: **Show the efficiency of a job; takes a job ID as argument**

### Slurm commands

The CRC cluster uses Slurm to manage resources and jobs. Some useful commands are:

- `sbatch`: **Submit a job script; takes the job script as argument**
- `srun`: Run a command on a compute node
- `salloc`: Allocate resources for a job
- `scontrol`: View and modify Slurm configuration
- `sacct`: Show accounting information
- ~~`squeue`: Show the status of jobs~~ Use `crc-squeue` instead.
- ~~`scancel`: Cancel a job~~ Use `crc-scancel` instead.
- ~~`sinfo`: Show information about nodes~~ Use `crc-sinfo` instead.

### Conda commands

We've already mentioned some conda commands such as `conda create`, `conda install`, and `source activate`. Here are some more useful ones:

- `conda info`: Show information about the current environment
- `conda list`: List all the packages installed in the current environment
- `conda env list`: List all the environments
- `conda env remove --name {env_name}`: Remove an environment by name
- `conda env remove --prefix {env_path}`: Remove an environment by path

### Git commands

Some useful git commands are:

- `git clone {repository_url}`: Clone a repository
- `git pull`: Pull changes from the remote repository
- `git add {file}`: Add a file to the staging area
- `git commit -m "{message}"`: Commit changes with a message
- `git push`: Push changes to the remote repository

### File management commands

Some useful file management commands are:

- `ls`: List files and directories
- `cd {directory}`: Change directory
- `pwd`: Print the current working directory
- `cp {source} {destination}`: Copy a file (use `cp -r` for a directory)
- `mv {source} {destination}`: Move a file or directory
- `rm {file}`: Remove a file
- `rm -r {directory}`: Remove a directory and its contents

### Other useful commands

- `cat {file}`: Display the contents of a file
- `head {file}`: Display the first few lines of a file
- `tail {file}`: Display the last few lines of a file
- `grep {pattern} {file}`: Search for a pattern in a file
- `man {command}`: Display the manual page for a command
- `which {command}`: Show the path to a command
- `history`: Show the command history
- `!{number}`: Repeat a command from history by its number

## Glossary

- **Home folder**: The folder you are in once logged in, i.e. `/ihome/{your group name}/{your user name}`.
- **Slurm**: A workload manager used to allocate resources and schedule jobs on the cluster.
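As a small example of combining a few of the commands listed earlier, you can inspect the files produced by a running or finished job. The file names below follow the `--output` and `--error` lines in the example `job.sh`; adjust them to your own:

```bash
tail -f job_name.out        # follow the job's output as it is written (Ctrl+C to stop)
grep -i error job_name.err  # search the error file for problems
history | grep sbatch       # find the submission commands you ran earlier
```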