Machine Learning examples on the Aristotle Cluster

These examples will help you learn how to run machine learning workloads on the Aristotle cluster. A few basic Unix commands are needed to run the examples that follow, so consider going through a Unix command cheat sheet first.

You can access the cluster from your browser (https://hpc.auth.gr/) or via SSH (see https://hpc.it.auth.gr/intro/).
To get the examples, use the web interface file manager or the copy command (cp) in the terminal:
$ cp -r /mnt/apps/share/HPC-AI-examples $HOME
The Extreme Gradient Boosting (XGBoost) open-source library is used for this simple regression example. XGBoost implements machine learning algorithms under the Gradient Boosting framework.
Running the example on the Jupyter Server
Use the cp command to copy the example Jupyter notebook:
$ cp /mnt/apps/custom/jupyter/nb/xgboost_example.ipynb .
Source the prebuilt python virtual environment:
$ source /mnt/apps/custom/python-envs/xgboost-env/bin/activate
Install the IPython kernel
in this environment for your user account:
$ python -m ipykernel install --user --name xgboost-env \
    --display-name "xgboost environment"
Using the custom environment
On the Jupyter menu, select File -> Open to load the xgboost example notebook.
To export the notebook as a Python script, select Download as -> Python (.py) from the notebook menu.
Creating a custom Python venv (+ Jupyter IPython kernel)

To create a new custom Python venv for your account, the following process can be used:
module load gcc/9.4.0-eewq4j6 python/3.9.10-ve54vyn
python -m venv xgboost-env
source xgboost-env/bin/activate
pip install --upgrade pip
pip install jupyter xgboost matplotlib scikit-learn
python -m ipykernel install --user --name xgboost-env \
    --display-name "xgboost environment"
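As an optional sanity check (a sketch, not part of the official process), you can confirm that the packages installed above import cleanly inside the activated venv:

```python
# Sanity check: verify the packages installed in the venv are importable.
import importlib.util

# scikit-learn installs as the "sklearn" module.
packages = ["jupyter", "xgboost", "matplotlib", "sklearn"]
missing = [p for p in packages if importlib.util.find_spec(p) is None]
print("missing:", missing)  # an empty list means the environment is ready
```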
Using Slurm to access HPC resources
Slurm allocates and manages exclusive user access to cluster resources, and provides a framework for job tracking and parallel job execution.
# Submit a job script
$ sbatch <job_script>
# Show the job queue
$ squeue
# Filter results for one user
$ squeue -u <username>
# Filter results for one partition
$ squeue -p <partition>
# Cancel a job
$ scancel <jobid>
# Show partition and node information
$ sinfo
$ sinfo -N --long  # show node status
# Report CPU and memory efficiency of a completed job
$ seff <jobid>
# Show accounting data for past jobs
$ sacct
Submission script
#!/bin/bash
#SBATCH --time=10:00
#SBATCH --partition=testing
echo "Hello from $(hostname)"
sleep 30
echo Bye
#!/bin/bash
#SBATCH --partition=rome
#SBATCH --time=10:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
stress --cpu ${SLURM_NTASKS} --timeout 60
Check the CPU efficiency of a completed job with seff <jobid>.

By default, memory per task = total memory on node / number of CPUs on node. To allocate more memory, use the --mem directive:
#!/bin/bash
#SBATCH --partition=rome
#SBATCH --job-name=memory
#SBATCH --time=4:00
#SBATCH --mem=11G
./allocate-10gb
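As a worked example of the default memory formula above (the node specs here are hypothetical; check the real values with sinfo or scontrol show node):

```python
# Hypothetical node: 256 GB of RAM and 128 CPU cores.
total_memory_gb = 256
cpus_on_node = 128

# Default memory per task = total memory on node / number of CPUs on node.
memory_per_task_gb = total_memory_gb / cpus_on_node
print(memory_per_task_gb)  # 2.0 GB per task by default
```

A job needing more than this default per-task share (such as the 10 GB allocation above) must request it explicitly with --mem.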
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=20
#SBATCH --time=10:00
nvidia-smi
Running the example as a batch job on the cluster
#!/bin/bash
#SBATCH --job-name=xgboost-example
#SBATCH --partition=rome
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=1:00:00
source /mnt/apps/custom/python-envs/xgboost-env/bin/activate
python example.py