<p>First, go to the <a href="https://developer.nvidia.com/cuda-toolkit-archive">NVIDIA website</a> to download CUDA.<br />
The archive lists many versions; here we install 12.1.1<br />
(because PyTorch supported up to CUDA 12.1 at the time of writing).<br />
<img src="https://i.imgur.com/otGi5aw.png" alt="Screenshot 2023-11-13 at 15.02.36" /><br />
After clicking through, select the options that match your platform and system.<br />
The key point is to choose <code>deb(local)</code> for Installer Type at the end.<br />
<img src="https://i.imgur.com/1s3g4ea.png" alt="Screenshot 2023-11-13 at 15.04.33" style="width:750px;" /><br />
Once everything is selected, the installation commands appear below the selector.<br />
<img src="https://i.imgur.com/stbk1Uq.png" alt="test" style="width:750px;" /><br />
Paste them into a terminal to install.</p>
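<p>The download selector asks for operating system, architecture, distribution, and version. If you are unsure about these values on the target machine, a minimal sketch like the following can print them. The function name <code>print_platform_info</code> is made up for this example, and the script assumes <code>/etc/os-release</code> exists, which is the case on Ubuntu.</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Hypothetical helper: print the values the CUDA download selector asks for.
print_platform_info() (
  echo "Operating System: $(uname -s)"   # kernel name, e.g. Linux
  echo "Architecture:     $(uname -m)"   # machine type, e.g. x86_64
  # Assumption: /etc/os-release is present (true on Ubuntu).
  if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "Distribution:     ${ID:-unknown}"
    echo "Version:          ${VERSION_ID:-unknown}"
  fi
)

print_platform_info
</code></pre>
<p>The function body runs in a subshell, so the variables loaded from <code>/etc/os-release</code> do not leak into your current shell.</p>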
<h2><a id="%E5%AE%89%E8%A3%9Dnvidia-cuda-toolkit" class="anchor" aria-hidden="true"><span class="octicon octicon-link"></span></a>Install nvidia-cuda-toolkit</h2>
<pre><code class="language-bash">sudo apt install nvidia-cuda-toolkit
</code></pre>
<p>Verify that the installation succeeded:</p>
<pre><code class="language-bash">> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
</code></pre>
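<p>To use the version check in a script, the release number can be pulled out of the <code>nvcc -V</code> output. The following is a rough sketch, assuming the output keeps the <code>release X.Y,</code> format shown above; <code>parse_nvcc_release</code> is a name invented for this example.</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Hypothetical helper: read `nvcc -V` output on stdin, print the release (e.g. 11.5).
parse_nvcc_release() {
  sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Only call nvcc if it is actually on PATH.
if command -v nvcc > /dev/null; then
  echo "CUDA compiler release: $(nvcc -V | parse_nvcc_release)"
else
  echo "nvcc not found on PATH"
fi
</code></pre>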
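<p>If the installation fails with an error like the one below, install the matching version of the missing dependency.</p>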
<pre><code class="language-bash">mercury@AERIAL:~$ sudo apt install nvidia-cuda-toolkit
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
libcuinj64-11.5 : Depends: libnvidia-compute-495 (>= 495) but it is not going to be installed or
libnvidia-compute-495-server (>= 495) but it is not installable or
libcuda.so.1 (>= 495) or
libcuda-11.5-1
libnvidia-ml-dev : Depends: libnvidia-compute-495 (>= 495) but it is not going to be installed or
libnvidia-compute-495-server (>= 495) but it is not installable or
libnvidia-ml.so.1 (>= 495)
nvidia-cuda-dev : Depends: libnvidia-compute-495 (>= 495) but it is not going to be installed or
libnvidia-compute-495-server (>= 495) but it is not installable or
libcuda.so.1 (>= 495) or
libcuda-11.5-1
Recommends: libnvcuvid1 but it is not installable
E: Unable to correct problems, you have held broken packages.
</code></pre>
<p>The output indicates that <code>libnvidia-compute-495</code> is missing, so install it:</p>
<pre><code class="language-bash">sudo apt install libnvidia-compute-495
</code></pre>
<p>Then install the <code>toolkit</code> again to complete the installation.</p>
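<p>Reading the missing package out of a long apt error by eye is error-prone. As a sketch, assuming you saved the error output to a file (the name <code>apt-error.log</code> is hypothetical), a small filter can list the first package named on each <code>Depends:</code> line. The helper name <code>list_unmet_deps</code> is also made up for this example.</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Hypothetical helper: read apt error output on stdin and print the first
# package named after each "Depends:" entry, without duplicates.
list_unmet_deps() {
  sed -n 's/.*Depends: \([^ ]*\) .*/\1/p' | sort -u
}

# Assumption: apt-error.log holds the saved error output from apt.
if [ -r apt-error.log ]; then
  cat apt-error.log | list_unmet_deps
fi
</code></pre>
<p>For the error shown above this would print <code>libnvidia-compute-495</code>. Before installing it, <code>apt-cache policy libnvidia-compute-495</code> shows whether a candidate version is available from your configured repositories.</p>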
<h2><a id="%E5%AE%89%E8%A3%9Dcudnn" class="anchor" aria-hidden="true"><span class="octicon octicon-link"></span></a>Install cuDNN</h2>
<h3><a id="%E4%B8%8B%E8%BC%89%E5%AE%89%E8%A3%9D%E6%AA%94" class="anchor" aria-hidden="true"><span class="octicon octicon-link"></span></a>Download the installer</h3>
<p>You need to register for an NVIDIA developer account on this <a href="https://developer.nvidia.com/rdp/cudnn-download">site</a> before you can download the cuDNN library.<br />
Download the deb file that matches your CUDA version and Linux version.<br />
<img src="https://i.imgur.com/s297AOI.png" alt="Screenshot 2023-11-15 at 11.29.46" /><br />
After the download finishes, copy the file to the remote server using FileZilla or scp.</p>
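<p>If you choose scp, the transfer could look like the sketch below. The user name and host are placeholders that you must replace with your own, and the file name is the one used later in this guide; yours may differ. The copy is wrapped in a function (<code>upload_cudnn</code>, a name invented here) so nothing is sent until you call it.</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Hypothetical values: replace with your own account and server.
REMOTE_USER="user"
REMOTE_HOST="remote-server"
# File name as used in this guide; adjust to match your download.
CUDNN_DEB="cudnn-local-repo-ubuntu2204-8.9.6.50_1.0-1_amd64.deb"

upload_cudnn() {
  # Copy the package into the remote user's home directory.
  scp "$CUDNN_DEB" "${REMOTE_USER}@${REMOTE_HOST}:~/"
}

# Run the transfer from the directory that contains the deb file:
#   upload_cudnn
</code></pre>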
<h3><a id="%E5%9F%B7%E8%A1%8C%E5%AE%89%E8%A3%9D%E6%AA%94" class="anchor" aria-hidden="true"><span class="octicon octicon-link"></span></a>Run the installer</h3>
<p>On the remote server, run the installer.<br />
<strong>Make sure the file name matches the file you downloaded.</strong><br />
Enable the local repository.</p>
<pre><code class="language-bash">sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.6.50_1.0-1_amd64.deb
</code></pre>
<p>Import the CUDA GPG key.</p>
<pre><code class="language-bash">sudo cp /var/cudnn-local-repo-*/cudnn-local-*-keyring.gpg /usr/share/keyrings/
</code></pre>
<p>Refresh the repository metadata.</p>
<pre><code class="language-bash">sudo apt-get update
</code></pre>
<p>Install the runtime library.</p>
<pre><code class="language-bash">sudo apt-get install libcudnn8
</code></pre>
<p>Install the code samples.</p>
<pre><code class="language-bash">sudo apt-get install libcudnn8-samples
</code></pre>
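<p>Before building the samples, you may want to confirm that both packages were registered with dpkg. The following is a minimal sketch using <code>dpkg-query</code>; the function name <code>cudnn_package_status</code> is invented for this example, and the check only tells you that dpkg knows the packages, not that the library works.</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Hypothetical helper: report whether dpkg knows both cuDNN packages.
cudnn_package_status() {
  # dpkg-query exits non-zero if any listed package is unknown to dpkg.
  if dpkg-query -W -f='${Package} ${Version}\n' libcudnn8 libcudnn8-samples 2> /dev/null; then
    echo "cuDNN packages found"
  else
    echo "one or more cuDNN packages are missing"
  fi
}

cudnn_package_status
</code></pre>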
<p>Once the installation finishes, you can test whether it succeeded.</p>
<h3><a id="%E9%80%B2%E8%A1%8C%E6%B8%AC%E8%A9%A6" class="anchor" aria-hidden="true"><span class="octicon octicon-link"></span></a>Run the test</h3>
<p>First, change to the samples directory:</p>
<pre><code class="language-bash">cd /usr/src/cudnn_samples_v8/
</code></pre>
<p>Enter the mnistCUDNN sample folder:</p>
<pre><code class="language-bash">cd mnistCUDNN/
</code></pre>
<p>Install the packages the test requires:</p>
<pre><code class="language-bash">sudo apt install libfreeimage3 libfreeimage-dev
</code></pre>
<p>Compile:</p>
<pre><code class="language-bash">sudo make clean && sudo make
</code></pre>
<p>Run:</p>
<pre><code class="language-bash">./mnistCUDNN
</code></pre>
<p>If the final output is <code>Test passed!</code>, the cuDNN library has been installed successfully.</p>
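<p>The sample prints a lot of output, so the final line is easy to miss. As a sketch, the check for the <code>Test passed!</code> marker can be automated with a small filter. The function name <code>check_mnist_output</code> and its messages are invented for this example.</p>
<pre><code class="language-bash">#!/usr/bin/env bash
# Hypothetical helper: read the sample's output on stdin and report
# whether the success marker appears anywhere in it.
check_mnist_output() {
  if grep -q 'Test passed!'; then
    echo "cuDNN OK"
  else
    echo "cuDNN test did not pass"
    return 1
  fi
}

# Usage, from inside the mnistCUDNN directory after compiling:
#   ./mnistCUDNN | check_mnist_output
</code></pre>
<p>The function returns a non-zero status on failure, so it can also gate later steps in a setup script.</p>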