# Tutorial: Install GPU Drivers and Deep Learning Frameworks

# Upgrade the Ubuntu Kernel

```bash
sudo add-apt-repository ppa:cappelikan/ppa
sudo apt-get update && sudo apt-get upgrade
sudo apt install mainline
```

Open the Mainline app and install the latest release kernel.

Or compile the kernel yourself:

```bash
sudo apt-get install git fakeroot build-essential ncurses-dev xz-utils libssl-dev bc flex libelf-dev bison

# Find the latest tagged kernel release and clone only that tag
latest_tag=$(git ls-remote --tags --sort='v:refname' https://github.com/torvalds/linux.git | tail -n1 | sed 's/.*\///; s/\^{}//')
kernel_version=$(echo $latest_tag | sed 's/^v//')
git clone --depth 1 --branch $latest_tag https://github.com/torvalds/linux.git "linux-$kernel_version"
cd "linux-$kernel_version"

# Create the output directory if it does not exist
mkdir -p ./output

# Start from the running kernel's configuration
cp -v /boot/config-$(uname -r) .config
yes "" | scripts/config --disable SYSTEM_TRUSTED_KEYS
yes "" | scripts/config --disable SYSTEM_REVOCATION_KEYS
yes "" | make oldconfig

make -j $(nproc)
sudo make -j $(nproc) modules_install
sudo make -j $(nproc) install

# Alternatively, build Debian packages instead of installing directly:
# make -j $(nproc) deb-pkg
# Move the generated packages to the output directory
# mv ../*.deb ./output/
# cd ./output/
# sudo dpkg -i ../linux-headers-*.deb ../linux-image-*.deb

sudo update-initramfs -c -k $kernel_version
sudo update-grub
```

# 1. Install Graphics Card Drivers

## 1.1 Case macOS: not necessary

## 1.2 Case NVIDIA

### Download drivers from the [NVIDIA page](https://www.nvidia.com/Download/index.aspx)

```bash
# Remove any previous NVIDIA/CUDA installation
sudo apt-get remove --purge '^nvidia-.*' --yes
sudo apt-get --purge remove nvidia* --yes
sudo apt-get --purge remove "*cublas*" "cuda*" --yes
sudo apt remove nvidia-* --yes
sudo apt-get --purge remove "*nvidia*" --yes
sudo apt autoremove --yes
sudo apt-get update --yes && sudo apt-get full-upgrade --yes

# Add the NVIDIA CUDA repository (Ubuntu 24.04) and install the open kernel modules
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y nvidia-open
sudo apt-get install nvitop --yes

# NVIDIA Container Toolkit (GPU support inside containers)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
  && sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```

### In case of a GTX card

```bash
sudo apt-get install -y cuda-drivers
# Allow the open kernel modules on GPUs they do not officially support
echo "options nvidia NVreg_OpenRmEnableUnsupportedGpus=1" | sudo tee /etc/modprobe.d/nvidia-gsp.conf
```

## 1.3 Case Intel

```bash
# Intel oneAPI APT repository
sudo apt update
sudo apt -y install cmake pkg-config build-essential
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
  | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null \
  && echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list

# Codeplay APT repository
wget -qO - https://developer.codeplay.com/apt/public.key | gpg --dearmor | sudo tee /usr/share/keyrings/codeplay-keyring.gpg > /dev/null \
  && echo "deb [signed-by=/usr/share/keyrings/codeplay-keyring.gpg] https://developer.codeplay.com/apt all main" | sudo tee /etc/apt/sources.list.d/codeplay.list
```

### Restart the PC
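
After the reboot, an optional sanity check is to confirm that the system detects the GPU and exposes a render node, and that APT can resolve the oneAPI packages from the repositories added above. These are generic commands, not part of the vendor instructions:

```bash
# Confirm the GPU is detected and a DRM render node exists
lspci -nn | grep -Ei 'vga|display'
ls -l /dev/dri/

# Confirm APT can resolve a package from the newly added oneAPI repository
sudo apt update
apt-cache policy intel-basekit
```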
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)" wget https://repo.radeon.com/amdgpu-install/7.0.1/ubuntu/noble/amdgpu-install_7.0.1.70001-1_all.deb sudo apt install ./amdgpu-install_7.0.1.70001-1_all.deb sudo apt update sudo apt install python3-setuptools python3-wheel sudo apt install amdgpu-dkms # drivers sudo apt install rocm sudo usermod -a -G render,video $LOGNAME # Add the current user to the render and video groups ``` ### Restart the PC # 2. Install Library HPC ## 2.1 Case Nvidia ### 2.1.2 Install Cuda: Source: https://developer.nvidia.com/cuda-downloads Extra information: The **cuda** and **cuda-toolkit** meta-packages will be equivalent in CUDA 12.8. ```bash sudo apt-get -y install cuda-toolkit cudnn tensorrt nvidia-gds ``` Add PATH to .bashrc ```bash sudo gedit ~/.bashrc ``` ```bash export PATH=/usr/local/cuda-13.0/bin${PATH:+:${PATH}} export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} ``` ```bash source ~/.bashrc ``` ### Restart the PC ## 2.2 Case Intel ## Install oneAPI: ```bash # Install all the packages or selected packages. All basic compilers (Fortran, C, C++, and Python) sudo apt install intel-basekit # HPC toolkits sudo apt install intel-hpckit # IoT (internet of things) toolkits sudo apt install intel-iotkit # AI (Artificial Intelligence) analytics toolkits sudo apt install intel-aikit # Rendering toolkit sudo apt install intel-renderkit ``` ```bash # After Installation # After you have installed the oneAPI kits, they can be activated using the following command written in the ~/.bashrc file. # For all users (root access) . /opt/intel/oneapi/setvars.sh # For single user . ~/intel/oneapi/setvars.sh ``` ### Restart the PC ## 2.3 Case AMD ### Install RocM: Add PATH to .bashrc ```bash sudo gedit ~/.bashrc ``` ```bash export ROCM_PATH=/opt/rocm-6.4.3 export PATH=$ROCM_PATH/bin:$PATH export LD_LIBRARY_PATH=$ROCM_PATH/lib:$LD_LIBRARY_PATH ``` ```bash source ~/.bashrc ``` ```bash # check with rocminfo rocminfo amd-smi metric ``` ## 3. Install Anaconda ## Case Linux 1. Download .sh: https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 2. ```chmod +x Miniconda3-latest-Linux-x86_64.sh``` 3. ```bash Miniconda3-latest-Linux-x86_64.sh``` ## Case Mac 1. Download .sh: https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh 2. ```chmod +x Miniconda3-latest-MacOSX-arm64.sh``` 3. ```bash Miniconda3-latest-MacOSX-arm64.sh``` # Benchmark 1. Create your environment ```conda create -n myenv python=3.11 & conda activate myenv``` 2. Install framework tensorflow or pytorch 3. Install jupyter lab in console: ```conda install numpy pandas matplotlib notebook jupyterlab``` 4. Open jupyter lab in console: ```jupyter lab``` 5. Create a notebook 6. Install ai-benchmark in a cell: ```!pip3 install new-ai-benchmark``` 7. Execute: ```python from ai_benchmark import AIBenchmark benchmark = AIBenchmark(use_cpu=False, verbose_level=2) # FOR CPU use_cpu=True results = benchmark.run() ``` ## Optional Install Tensorflow... 

# Optional: Install TensorFlow

### Case NVIDIA

```bash
pip install tensorflow[and-cuda]
# Verify the install:
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

### Case Intel (extension)

```bash
pip install --upgrade intel-extension-for-tensorflow[gpu]
```

### Case Apple

```bash
python -m pip install tensorflow
python -m pip install tensorflow-metal
```

### Case AMD

```bash
pip install -U tensorflow-rocm
```

# Optional: Install PyTorch

### Case NVIDIA

```bash
pip3 install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
```

### Case Intel

```bash
pip3 install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
python -m pip install intel_extension_for_pytorch
```

### Case Mac

```bash
pip3 install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
```

### Case AMD

```bash
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.0
```

If you are using CUDA < 12.0, enable memory growth so TensorFlow does not pre-allocate the whole GPU:

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```

---

## FIX NUMA ERROR

```bash
echo 0 | sudo tee /sys/bus/pci/devices/0000\:01\:00.0/numa_node
```
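
The PCI address `0000:01:00.0` above is only an example and depends on which slot your GPU occupies; note also that the sysfs value does not survive a reboot. A quick, generic way to find your GPU's address:

```bash
# List GPUs with their full PCI domain:bus:device.function address
lspci -D | grep -Ei 'vga|3d|display'
```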