Nvidia Driver, Cuda and Cudnn, TensorRT install/uninstall
===

> Updated 2021/6/23 by JasonLuo
> Tested on Ubuntu 18.04
> Installs Nvidia Driver 460.84, Cuda 11.1, Cudnn 8.2.1
> TensorRT 8.0.0.3
> Compatible with tensorflow-gpu 2.4

<font size=4>
Deep learning packages are updated quickly, and after updating a DL package the matching cuda and nvidia driver sometimes have to be reinstalled as well. These notes record how to install and uninstall each of them.
</font>

## Nvidia Driver

### Before installing

1. First, check that the graphics card is detected with `lspci -nnk | grep -i nvidia`. The output should look like this:
    ```
    01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB Rev. 2] [10de:1c06] (rev a1)
            Kernel driver in use: nvidia
            Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
    01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
    ```
2. Check whether a driver is already installed, using `nvidia-smi` or `dpkg -l 'nvidia*'`
    - `nvidia-smi` shows whether a driver is installed and also reports some details (usage figures, driver version, etc.)
    - `dpkg -l 'nvidia*'`
3. Download the Nvidia Driver from the [official site](https://www.nvidia.com/Download/index.aspx?lang=en-us)

### Uninstall Nvidia Driver

4. Before uninstalling, switch to a text console and stop the running graphics session (this step can be skipped)
    - Press `[Ctrl]+[Alt]+[F1~F6]` and log in
    - Stop the graphics session with `sudo service lightdm stop` (for Ubuntu with lightdm; if your display manager is different, e.g. gdm3, use its service name)
5. Uninstall the Nvidia Driver using one of the following three commands
    - `sudo apt-get purge 'nvidia*'`
    - `sudo /usr/bin/nvidia-uninstall`
    - `sudo ./NVIDIA-Linux-x86_64-460.84.run --uninstall`
6. Reboot
    - `reboot`

### Install Nvidia Driver

7. Before installing, switch to a text console and stop the running graphics session
    - Press `[Ctrl]+[Alt]+[F1~F6]` and log in
    - Stop the graphics session with `sudo service lightdm stop` (for Ubuntu with lightdm; if your display manager is different, e.g. gdm3, use its service name)
8. Install the Nvidia Driver
    - `sudo chmod +x NVIDIA-Linux-x86_64-XXX.run` (the file downloaded from the official site)
    - `sudo ./NVIDIA-Linux-x86_64-XXX.run`
9. Reboot
    - `reboot`
10. Confirm that the packages are installed
    - `dpkg -l 'nvidia*'`
    - `nvidia-smi`

## Cuda

### Before installing

1. How to check the Cuda version on the machine
    - `cat /usr/local/cuda/version.txt`
    - `nvcc -V`
2. Download cuda 11.1 from the [official archive](https://developer.nvidia.com/cuda-toolkit-archive). Any installer type works; I chose runfile (local), or you can fetch it with wget

### Remove Cuda

3. Remove the old cuda with one of the following (afterwards, make sure no /usr/local/cuda-XX.X folder is left)
    - `sudo apt-get --purge remove 'cuda*'`
    - `sudo apt-get autoremove --purge cuda`
    - `sudo /usr/local/cuda-10.2/bin/cuda-uninstaller`, as described in the [official guide](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-uninstallation); replace cuda-10.2 with your own version

### Install Cuda

4. Install Cuda (remember to make the file executable with `chmod +x` first; the first two options above do not apply when running the `_linux.run` file)
    1. `sudo ./cuda_11.1.1_455.32.00_linux.run --driver --silent` (not needed if the nvidia driver is already installed)
    2. `sudo ./cuda_11.1.1_455.32.00_linux.run --toolkit --silent`
    3. `sudo ./cuda_11.1.1_455.32.00_linux.run --samples --silent`
5. **Note!!** If the /tmp folder does not have enough free space, the installation fails and you need to pass `--tmpdir=[YourDirectory]`. I used /home/tmp, which must be set to chmod 777, so the commands above become:
    1. `sudo ./cuda_11.1.1_455.32.00_linux.run --driver --silent --tmpdir=/home/tmp` (not needed if the nvidia driver is already installed)
    2. `sudo ./cuda_11.1.1_455.32.00_linux.run --toolkit --silent --tmpdir=/home/tmp`
    3. `sudo ./cuda_11.1.1_455.32.00_linux.run --samples --silent --tmpdir=/home/tmp`
6. If **Missing recommended library:libGLU.so** appears during installation, run:
    - `sudo apt-get install libglu1-mesa libxi-dev libxmu-dev libglu1-mesa-dev`
7. After installing Cuda, add the cuda paths to **~/.bashrc**:
    ```bash
    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    ```
8. Then run `source ~/.bashrc`
9. Use `nvcc -V` to check that the installation succeeded

## Cudnn (a library for GPU acceleration)

### Before installing

1. Check whether cudnn is present on the host and which version it is
`cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2`
2. Download Cudnn from the [official site](https://developer.nvidia.com/cudnn). Downloading requires logging in; once logged in, find the CuDNN release that is compatible with the cuda version you installed (e.g. cuDNN v8.2.1 Library for Linux)
> tensorflow2.0 requires v7.6 or later

### Remove Cudnn

3. Delete the related files, which are usually in the `/usr/local/cuda/include/` and `/usr/local/cuda/lib64/` folders
    1. `sudo rm -f /usr/local/cuda/include/*cudnn*`
    2. `sudo rm -f /usr/local/cuda/lib64/*cudnn*`

### Install cudnn

4. After downloading, extract the archive
    - `tar -xvf cudnn-11.3-linux-x64-v8.2.1.32.tgz`
5. Extraction produces a cuda folder that contains
    - cuda/include
    - cuda/lib64
    - cuda/NVIDIA_SLA_cuDNN_Support.txt
6. Copy the files into the cuda folder (cuDNN 8 splits its API across several `cudnn_*.h` headers, so copy all of them, including `cudnn.h` and `cudnn_version.h`)
    - `sudo cp cuda/include/cudnn*.h /usr/local/cuda/include/`
    - `sudo cp cuda/lib64/lib* /usr/local/cuda/lib64/`
7. ~~Test Cudnn~~
    - ~~test with cudnn_samples_v7~~

## TensorRT (optional)

> Related to the libnvinfer.so.8 and libnvinfer_plugin.so.8 files

### Install

1. [Download TensorRT](https://developer.nvidia.com/tensorrt); this also requires logging in first
2. Download the tar file for your system
3. Extract the tar.gz file
    - `tar xzvf TensorRT-XXXXXXXXXXXXXX.tar.gz`
4. Add the library path to `LD_LIBRARY_PATH` (the path below is where I extracted it; replace it with your own)
    - `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/TC184610137/TensorRT-8.0.0.3/lib`
5. Install TensorRT, uff, graphsurgeon
    ```bash
    cd TensorRT-XXXX/python
    pip install tensorrt-8.0.0.3-cp36-none-linux_x86_64.whl
    ```
    ```bash
    ## install uff
    cd TensorRT-XXXX/uff
    pip install uff-0.6.9-py2.py3-none-any.whl
    ```
    ```bash
    ## install graphsurgeon
    cd TensorRT-XXXX/graphsurgeon
    pip install graphsurgeon-0.4.5-py2.py3-none-any.whl
    ```
6. Test:
    ```python
    import tensorrt as trt
    import uff as uff
    ```

Ref:
- [Uninstall nvidia driver](https://www.linux.com/blog/install-uninstall-nvidia-driver-33179-ubuntu-linuxmint)
- [Check whether the gpu driver is installed](http://samwhelp.github.io/book-ubuntu-qna/read/case/driver/install-driver-package/is-nvidia-driver-installed)
- [When the /tmp folder does not have enough space](https://devtalk.nvidia.com/default/topic/1014448/cuda-installation-error-extraction-failed/)
- [Installation steps](https://medium.com/@zihansyu/ubuntu-16-04-%E5%AE%89%E8%A3%9Dcuda-10-0-cudnn-7-3-8254cb642e70?fbclid=IwAR1ZTYfdlyH5NSSrGNEjo9NyKZibqZaazey6lGcr6a0mxgiJYowVouJNVbk)
- [TensorRT](https://zhuanlan.zhihu.com/p/85365075)
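
## Quick check of installed versions

The individual checks used in these notes (`nvidia-smi`, `nvcc -V`, and reading `cudnn_version.h`) can be combined into one pass. The following is only a minimal sketch: the helper names `cudnn_version` and `report` are made up for illustration and are not part of any NVIDIA tool, and the header is assumed to live at the default `/usr/local/cuda/include/cudnn_version.h` unless another path is passed in.

```bash
#!/usr/bin/env bash

# Print the cuDNN version from a cudnn_version.h header as MAJOR.MINOR.PATCH.
# Hypothetical helper; the default path assumes a standard /usr/local/cuda install.
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn_version.h}"
    [ -f "$header" ] || { echo "not found"; return 1; }
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}

# Print one line each for the driver, the CUDA toolkit, and cuDNN.
# Tools that are missing are reported instead of causing an error.
report() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "driver: $(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    else
        echo "driver: nvidia-smi not found"
    fi
    if command -v nvcc >/dev/null 2>&1; then
        echo "cuda:   $(nvcc -V | grep -o 'release [0-9.]*')"
    else
        echo "cuda:   nvcc not found"
    fi
    echo "cudnn:  $(cudnn_version)"
}

report
```

Save it under any name (for example `check_versions.sh`, also just an illustrative name) and run it with `bash check_versions.sh` after `source ~/.bashrc`, so that `nvcc` is on the `PATH`.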