
Deep Learning Python Package Installation (with Ubuntu 22.04)

tags: Tutorial, Ubuntu, python



Pytorch

Installation

pip install torch torchvision torchaudio

CUDA check

import torch

torch.cuda.is_available()      # True if a CUDA device is usable
torch.cuda.device_count()      # number of visible GPUs
torch.cuda.current_device()    # index of the currently selected device
torch.cuda.device(0)           # context-manager handle for device 0
torch.cuda.get_device_name(0)  # name of GPU 0
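
If the checks above pass, here is a minimal sketch to confirm that computation actually runs on the GPU:

import torch

# allocate a tensor directly on the GPU and run a trivial matrix multiply
x = torch.rand(3, 3, device="cuda")
y = x @ x
print(y.device)  # expected: cuda:0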

Jax

Installation

  • CPU only (Linux/MacOS/Windows)
pip install -U jax
  • GPU (NVIDIA with CUDA 12, Linux)
pip install -U "jax[cuda12]"

(2025.02.18)
This pip-installed version does not support cuDNN 9 (XlaRuntimeError: INTERNAL: the library was not initialized).
To use the latest cuDNN, try a local installation instead.
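
A quick way to confirm which backend JAX is actually using (it only lists GPU devices when the CUDA backend loaded correctly):

import jax

print(jax.devices())          # e.g. [CudaDevice(id=0)] when the GPU backend works
print(jax.default_backend())  # "gpu" or "cpu"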

Problems

  • JAX (<=0.5.0) was built against NumPy 1.25.0 and does not support NumPy >= 2.0, so NumPy needs to be downgraded to a compatible version (see the version check below).
pip install --upgrade numpy==1.25.0
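
After downgrading, a quick sanity check of the installed versions (a minimal sketch; the exact versions depend on your environment):

import jax
import numpy

print("jax:", jax.__version__)      # expect <= 0.5.0 here
print("numpy:", numpy.__version__)  # expect 1.25.0 after the downgrade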

TensorFlow

Installation

The latest stable TensorFlow 2 release supports both CPU and GPU (on Ubuntu and Windows), so no extra variant needs to be specified.

pip install --upgrade tensorflow

However, the current version (2.11) may be incompatible with TensorRT 8.x, so the installation steps below are recommended.

Problems

Could not load dynamic library 'libnvinfer.so.7'

First, install the newer TensorRT (8.x):

pip install --upgrade setuptools pip
pip install nvidia-pyindex
pip install nvidia-tensorrt
# verify installation of tensorrt
python3 -c "import tensorrt; print(tensorrt.__version__); assert tensorrt.Builder(tensorrt.Logger())"

Create symbolic links so that the libnvinfer version 8 libraries also resolve under the version 7 names:

# go to the tensorrt install location (differs depending on how it was installed)
cd /env/lib/python3.10/site-packages/tensorrt
# create symbolic links
ln -s libnvinfer_plugin.so.8 libnvinfer_plugin.so.7
ln -s libnvinfer.so.8 libnvinfer.so.7
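
If you are not sure where the tensorrt package was installed, a minimal way to print its directory (assumes the nvidia-tensorrt wheel from the previous step is importable):

import os
import tensorrt

# directory containing the pip-installed libnvinfer*.so.8 libraries
print(os.path.dirname(tensorrt.__file__))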

Edit .bashrc or venv/activate to add the tensorrt directory to LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/env/lib/python3.10/site-packages/tensorrt/
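
After opening a new shell (or re-sourcing the file), the following sketch uses ctypes to check that the version 7 names now resolve; it raises OSError if the symlinks or the LD_LIBRARY_PATH entry are missing:

import ctypes

# both calls raise OSError if the library cannot be found or loaded
ctypes.CDLL("libnvinfer.so.7")
ctypes.CDLL("libnvinfer_plugin.so.7")
print("libnvinfer 7 symlinks load correctly")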

Install TensorFlow and verify that the GPU is detected:

pip install --upgrade tensorflow
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Fixing "successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node"

  1. Check Nodes
lspci | grep -i nvidia

The output looks like this:

01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX {xxxx}] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)

This shows that the VGA compatible device (NVIDIA GeForce) is at 01:00; if your output differs, adjust the commands below accordingly.

  2. Check if it is connected.
cat /sys/bus/pci/devices/0000\:01\:00.0/numa_node
  • 0 means it is connected; -1 means it is not, and the next step is required.
  3. Fix it with the command below.
sudo echo 0 | sudo tee -a /sys/bus/pci/devices/0000\:01\:00.0/numa_node

Run step 2 again to confirm that it is now connected.

Memory limit setting

This is only recommended as a fallback; it is better to get TensorRT installed properly, otherwise many bugs remain.

If TensorRT fails to install or is not picked up, you may hit out-of-GPU-memory errors when using Jupyter Notebook. This can be worked around by setting a memory limit for each notebook (memory_limit is in MB, so 4096 below is about 4 GB).

import tensorflow as tf

# limit the first GPU to a 4096 MB virtual device for this notebook
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=4096)])
  except RuntimeError as e:
    # virtual devices must be configured before the GPU is initialized
    print(e)

Pyspark

Install Pyspark on Ubuntu

Use Pyspark on Jupyter Notebook and Virtualenv

Inside the virtual environment, install findspark:

pip install findspark

At the start of the notebook or Python script, run:

import findspark
findspark.init()
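
After findspark.init(), PySpark can be imported normally. Here is a minimal sketch of starting a local session (the master setting and app name are arbitrary examples):

import findspark
findspark.init()

from pyspark.sql import SparkSession

# local[*] uses all local cores; "demo" is just an example app name
spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()
print(spark.version)
spark.stop()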

Kaggle API

Inside the virtual environment, install kaggle:

pip install kaggle --upgrade

API credentials

To download competition or other data through the API, you first need to set up your API token.

  1. Register an account on the Kaggle website, then choose 'Create API Token' on the account page; this downloads a kaggle.json file containing your API token.
  2. Set up the API token in either of the following ways:
  • Place the token at $HOME/.kaggle/kaggle.json and run the command below so that other users of the machine cannot read your API key:
chmod 600 ~/.kaggle/kaggle.json
  • Export the username and key (see kaggle.json) in the virtual environment:
export KAGGLE_USERNAME={user_name}
export KAGGLE_KEY={xxxxxxxxxxxxxx}
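
With the credentials in place, data can be downloaded from the command line (kaggle competitions download -c <competition>) or from Python. Below is a minimal sketch of the Python route; the "titanic" competition and the data/ path are example values:

from kaggle.api.kaggle_api_extended import KaggleApi

# uses the credentials configured above (kaggle.json or environment variables)
api = KaggleApi()
api.authenticate()

# download all files of a competition into ./data ("titanic" is just an example)
api.competition_download_files("titanic", path="data")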