# UW HYAK Notes

## SALLOC

- Request a GPU

```
salloc --partition=gpu-l40 --account=stf --mem=10G --gres=gpu:1 --cpus-per-task=1 --time=2:00:00
```

- Check whether a GPU was allocated to the job

```
scontrol show job 24333466 | grep gpu
```

- Check the current job status

```
squeue -u pingw220 -o "%.18i %.9P %.8u %.2t %.10M %.10m %.6D %R"
```

## Conda Reinstall

```
rm -rf '/gscratch/scrubbed/andysu/miniconda3'
bash Miniconda3-latest-Linux-x86_64.sh -p /gscratch/scrubbed/andysu/miniconda3
```

```
python -m pip install --force-reinstall --upgrade setuptools pip
```

## Check which nodes are idle

```
sinfo -t idle
salloc --partition=ckpt-all --gres=gpu:1 --nodelist=g3091 --time=8:00:00
```

## GPU check code

```
module load cuda/11.8.0
python -c "import torch; print(torch.cuda.is_available())"
```

- Python code

```python
import torch

print(torch.__version__)
print(torch.version.cuda)  # Make sure this PyTorch build supports CUDA (None on CPU-only builds)
print(torch.backends.cudnn.enabled)
```

## Check whether the job has a GPU device

```
scontrol show job 24202314 | grep TRES
```

## If using salloc, return to the original compute node after logging out

```
srun --jobid=<jobid> --pty bash
```

## flash_attn

Tried 2.6.1 (taking the CUDA version into account) -> installed successfully

## Conda Related Commands

- Create a conda environment

```
conda create --name my_env python=3.9
conda activate my_env
```

## VSCode for Windows