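Using a GPU on Linux requires installing many components, such as the NVIDIA driver, CUDA, cuDNN, Conda, Python, PyTorch, and torchvision. A version mismatch in any one of them breaks the whole stack, and on Ubuntu with a graphical desktop the driver frequently breaks, leaving X unable to start.
To avoid this hassle, why not use Docker? Install only the NVIDIA driver on the host system and let Docker handle everything else.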
sudo add-apt-repository ppa:graphics-drivers/ppa
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
sudo apt-get update
sudo apt-get install nvidia-430
$ nvidia-smi
Mon Jul 29 15:06:39 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26 Driver Version: 430.26 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... Off | 00000000:0A:00.0 Off | N/A |
| 30% 41C P0 58W / 250W | 0MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 208... Off | 00000000:41:00.0 Off | N/A |
| 36% 44C P0 1W / 250W | 0MiB / 11011MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
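Because the containers ship their own CUDA libraries, the one thing the host must get right is a driver that is new enough for the CUDA version inside the image. The sketch below assumes the CUDA 10.0 image used later in this post and the commonly documented minimum Linux driver of 410.48 for CUDA 10.0. The helper names `driver_major` and `driver_ok_for_cuda10` are hypothetical, and only the major version is compared.

```shell
# Print the major part of a driver version string, e.g. "430.26" gives "430".
driver_major() {
  printf '%s\n' "${1%%.*}"
}

# Succeed when the driver major version is at least 410
# (assumption: CUDA 10.0 needs a 410.48 or newer Linux driver).
driver_ok_for_cuda10() {
  [ -n "$1" ] && [ "$(driver_major "$1")" -ge 410 ]
}

# Query the installed driver only when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
  version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  if driver_ok_for_cuda10 "${version}"; then
    echo "driver ${version} looks new enough for CUDA 10.0 containers"
  else
    echo "driver '${version}' may be too old for CUDA 10.0 containers"
  fi
else
  echo "nvidia-smi not found: the driver is not installed or not on PATH"
fi
```

With the 430.26 driver shown in the output above, the check would be expected to pass.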
Install Docker Community Edition as the runtime environment
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER
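The `usermod` change only applies to new login sessions, so `docker` commands may still fail with a permission error until you log out and back in. A minimal sketch for checking whether the current session already has the group follows; the helper name `has_group` is hypothetical.

```shell
# Succeed if the space-separated group list in $1 contains the group named $2.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# id -nG prints the group names of the current session.
if has_group "$(id -nG)" docker; then
  echo "this session is in the docker group"
else
  echo "not in the docker group yet: log out and back in, or run 'newgrp docker'"
fi
```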
Check the docker version
$ docker version
Client:
Version: 18.09.6
API version: 1.39
Go version: go1.10.8
Git commit: 481bc77
Built: Sat May 4 02:35:27 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.0
API version: 1.40 (minimum version 1.12)
Go version: go1.12.5
Git commit: aeac949
Built: Wed Jul 17 18:14:42 2019
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.5
GitCommit: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc:
Version: 1.0.0-rc6+dev
GitCommit: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
docker-init:
Version: 0.18.0
GitCommit: fec3683
nvidia-docker
First, install nvidia-docker
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
Check the nvidia-docker version
$ nvidia-docker version
NVIDIA Docker: 2.0.3
Client:
Version: 18.09.6
API version: 1.39
Go version: go1.10.8
Git commit: 481bc77
Built: Sat May 4 02:35:27 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.0
API version: 1.40 (minimum version 1.12)
Go version: go1.12.5
Git commit: aeac949
Built: Wed Jul 17 18:14:42 2019
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.5
GitCommit: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc:
Version: 1.0.0-rc6+dev
GitCommit: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
docker-init:
Version: 0.18.0
GitCommit: fec3683
docker pull moeidb/aigo:cu10.0-dnn7.6-gpu-pytorch-cv-19.06
Check that the download succeeded:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
moeidb/aigo cu10.0-dnn7.6-gpu-pytorch-19.06 492bce9e825f 3 weeks ago 17GB
nvidia/cuda 9.0-base
Check the Python version inside the container
$ docker run --rm moeidb/aigo:cu10.0-dnn7.6-gpu-pytorch-19.06 python3 --version
Python 3.7.3
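The Python version alone does not show that the container can reach the GPUs. The sketch below runs a one-line PyTorch check through `nvidia-docker`, using the same image tag as the command above. It assumes PyTorch is importable as `torch` in that image. The `gpu_check` helper and the `DRY_RUN` switch are hypothetical: by default the function only prints the command, and setting `DRY_RUN=0` actually runs it.

```shell
# Image tag taken from the version check above.
image="moeidb/aigo:cu10.0-dnn7.6-gpu-pytorch-19.06"

# Python one-liner: PyTorch version, CUDA availability, number of visible GPUs.
check='import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())'

# Print the command by default; run it for real only when DRY_RUN=0.
gpu_check() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "nvidia-docker run --rm ${image} python3 -c \"${check}\""
  else
    nvidia-docker run --rm "${image}" python3 -c "${check}"
  fi
}

gpu_check
```

On a two-GPU host like the one shown earlier, a working setup would be expected to report that CUDA is available and that two devices are visible.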
Create a script named startj.sh with the following content:
#!/bin/bash
# Host port on which JupyterLab will listen
host_port=9999
# Start the container and capture its ID
container_id=$(nvidia-docker run --rm -d --ipc=host -p ${host_port}:8888 -v "$PWD":/workspace moeidb/aigo:cu10.0-dnn7.6-gpu-pytorch-cv-19.06)
# Wait a moment for the service inside the container to start
sleep 2
# Extract the JupyterLab token from the container logs
notebook_token=$(docker logs ${container_id} 2>&1 | grep -nP "(LabApp.*token=).*" | cut -d"=" -f 2)
# Print the URL for connecting to JupyterLab
printf "Open a browser and connect to:\nhttp://127.0.0.1:${host_port}/?token=${notebook_token}\n"
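A fixed two-second sleep can be too short on a slow machine, and the log may contain more than one line with a token. As a hedged alternative, the sketch below polls the container log until a token appears and keeps only the first match. The function names `extract_token` and `wait_for_token` are hypothetical, and the pattern assumes the token is a hexadecimal string following `token=`, as in typical Jupyter log lines.

```shell
# Read log text on stdin and print the first hexadecimal token that follows "token=".
extract_token() {
  sed -n 's/.*token=\([0-9a-f][0-9a-f]*\).*/\1/p' | head -n 1
}

# Poll the logs of container $1 once per second until a token shows up
# or the timeout in seconds ($2, default 30) runs out.
wait_for_token() {
  local container_id="$1"
  local timeout="${2:-30}"
  local token=""
  while [ "${timeout}" -gt 0 ]; do
    token=$(docker logs "${container_id}" 2>&1 | extract_token)
    if [ -n "${token}" ]; then
      printf '%s\n' "${token}"
      return 0
    fi
    sleep 1
    timeout=$((timeout - 1))
  done
  return 1
}
```

In the script, the `sleep` and `grep` lines could then be replaced by `notebook_token=$(wait_for_token "${container_id}" 30)`.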
chmod +x startj.sh
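Running ./startj.sh prints a URL, which is the address of Jupyter Lab.
Note that you are root inside the container, so be careful. Because the container is started with --rm, anything installed with !pip install is lost when the container stops and has to be reinstalled.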
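One way to avoid reinstalling packages after every restart is to bake them into a derived image. The sketch below generates a two-line Dockerfile on top of the image used in startj.sh. It assumes `python3 -m pip` works inside that image. The helper `make_dockerfile`, the package `tqdm`, and the image name `my-aigo` are placeholders, not part of the original setup.

```shell
# Base image taken from startj.sh.
base="moeidb/aigo:cu10.0-dnn7.6-gpu-pytorch-cv-19.06"

# Print a two-line Dockerfile that installs the given packages on top of image $1.
make_dockerfile() {
  local base_image="$1"
  shift
  printf 'FROM %s\n' "${base_image}"
  printf 'RUN python3 -m pip install --no-cache-dir %s\n' "$*"
}

# Preview the generated Dockerfile (tqdm is only a placeholder package).
make_dockerfile "${base}" tqdm

# To build it, pipe the Dockerfile to docker on stdin (my-aigo is a hypothetical name):
#   make_dockerfile "${base}" tqdm | docker build -t my-aigo -
```

Pointing startj.sh at the derived image instead of the base image would then start Jupyter Lab with those packages already installed.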