###### tags: `lab` `sys`

# Tesla machine environment setup

The boot menu is opened with F11.

## Installing on the Tesla

The driver can be installed the same way as the usual GTX-series drivers. Be careful: installing the wrong NVIDIA driver causes a login loop.

### Procedure

1. Set up Ubuntu
    - Follow the GTX-series notes for this step.
2. Stop the X server
    - Stopping it with `init 3` is the reliable way.
3. Install `Tesla Driver for Linux x64`
    - [Tesla Driver for Linux](https://www.nvidia.com/Download/index.aspx?lang=en-us)
    - ==CUDA and nvidia-driver compatibility==

:::info
In general, nvidia-driver 384 supports the CUDA 7, 8, and 9 series. The oldest driver version available for this Tesla is v384. So although the Tesla V100 officially supports only CUDA 9, CUDA 7 or 8 is likely to run as well, as long as that CUDA version is compatible with the nvidia-driver.
:::

## Notes

The Tesla has its own dedicated driver.

```
chmod +x NVIDIA-Linux-x86_64-384.183.run
./NVIDIA-Linux-x86_64-384.183.run --no-opengl-files --no-libglx-indirect --dkms
```

Yes, it really is 384. So the plan is:

- install the driver through this script
- install CUDA itself from the toolkit

The hope is that this gives a usable CUDA 8.0 environment.

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-metas

| Meta Package            | Purpose                                                      |
| ----------------------- | ------------------------------------------------------------ |
| cuda                    | Installs all CUDA Toolkit and Driver packages. Handles upgrading to the next version of the cuda package when it's released. |
| cuda-10-2               | Installs all CUDA Toolkit and Driver packages. Remains at version 10.2 until an additional version of CUDA is installed. |
| cuda-toolkit-10-2       | Installs all CUDA Toolkit packages required to develop CUDA applications. ==Does not include the driver.== |
| cuda-tools-10-2         | Installs all CUDA command line and visual tools.             |
| cuda-runtime-10-2       | Installs all CUDA Toolkit packages required to run CUDA applications, as well as the Driver packages. |
| cuda-compiler-10-2      | Installs all CUDA compiler packages.                         |
| cuda-libraries-10-2     | Installs all runtime CUDA Library packages.                  |
| cuda-libraries-dev-10-2 | Installs all development CUDA Library packages.              |
| cuda-drivers            | Installs all Driver packages. Handles upgrading to the next version of the Driver packages when they're released. |

## Reference sites

[CUDA Toolkit / GPU card driver installation procedure](https://help.sakura.ad.jp/115000122721/#03)

## Installer choices

- > The distribution-provided pre-install script failed! Are you sure you want to continue?
    - Continue installation ☑
        - → ②
    - Abort installation
        - the installation fails
- > ② Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later.
    - yes ☑
    - no

```
WARNING: Unable to find a suitable destination to install 32-bit compatibility libraries. Your system may not be set up for 32-bit compatibility. 32-bit compatibility files will not be installed; if you wish to install them, re-run the installation and set a valid directory with the --compat32-libdir option.
```

- OK

## Chainer 2 hell

Honestly, I don't know whether this will work...

### Installing cuDNN

```
tar xzvf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp -a cuda/lib64/* /usr/local/cuda-8.0/lib64/
sudo cp -a cuda/include/* /usr/local/cuda-8.0/include/
sudo ldconfig
```

https://qiita.com/JeJeNeNo/items/05e148a325192004e2cd

- chainer v1.18.0
    - cudnn 5.1: installs, but does not work
    - cudnn 6.0: installation fails
- chainer v1.24.0
    - cudnn 6.0: installs

```
cupy-cuda80 6.3.0
ito@FRONTIER:~$ pip uninstall cupy-cuda80
```

OK?
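Before uninstalling, it can help to see every CuPy-related package that pip knows about, since a leftover wheel such as `cupy-cuda80` may not match the Chainer version in use. The following is only a minimal sketch: `find_packages` is a hypothetical helper (not part of pip, Chainer, or CuPy), and the sample rows are illustrative apart from the `cupy-cuda80 6.3.0` entry shown above.

```python
def find_packages(pip_list_text, prefix):
    """Return {name: version} for packages whose name starts with prefix.

    Accepts `pip list` rows ("name version"), legacy rows
    ("name (version)") and `pip freeze` rows ("name==version").
    """
    found = {}
    for line in pip_list_text.splitlines():
        parts = line.replace("==", " ").split()
        if len(parts) != 2:
            continue  # skip blank or malformed rows
        name, version = parts
        if name.lower().startswith(prefix):
            found[name] = version.strip("()")
    return found


# Illustrative input; only the cupy-cuda80 row comes from the note itself.
SAMPLE = """\
Package     Version
----------- -------
chainer     1.24.0
cupy-cuda80 6.3.0
"""

if __name__ == "__main__":
    for name, version in sorted(find_packages(SAMPLE, "cupy").items()):
        print("%s %s" % (name, version))
```

To run it on a real environment, the text can be taken from the output of `pip list` for the same Python interpreter that runs Chainer.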
https://stackoverflow.com/questions/51283708/python-pip-package-requestsdependencywarning-when-installing-elastic-search-cura

I got the same error as in this question, so I applied the same fix.

![](https://i.imgur.com/n0nCTPK.png)

Success.

For now, I'll retry with this and run the file.

```
1epoch: 0%| | 0/11464 [00:00<?, ?it/s]Traceback (most recent call last):
  File "Main.py", line 50, in <module>
    TrainNetwork.TrainCNNModel(dataidx["train"], i)
  File "/home/ito.t/work/naka2018_31/TrainNetwork.py", line 84, in TrainCNNModel
    network = DNN.TrainCNN(X,T,epoch_cnt=60, fold_cnt=fold_cnt)
  File "/home/ito.t/work/naka2018_31/DNN.py", line 514, in TrainCNN
    opt.update(model,x,t)
  File "/home/ito.t/.local/lib/python2.7/site-packages/chainer/optimizer.py", line 411, in update
    loss = lossfun(*args, **kwds)
  File "/home/ito.t/.local/lib/python2.7/site-packages/chainer/links/model/classifier.py", line 67, in __call__
    self.y = self.predictor(*x)
  File "/home/ito.t/work/naka2018_31/DNN.py", line 216, in __call__
    x = getattr(self, name)(x, self.tarin)
  File "/home/ito.t/work/naka2018_31/DNN.py", line 113, in __call__
    x = getattr(self, name)(x, train)
  File "/home/ito.t/work/naka2018_31/DNN.py", line 90, in __call__
    x = F.concat((x, tmp))
  File "/home/ito.t/.local/lib/python2.7/site-packages/chainer/functions/array/concat.py", line 87, in concat
    return Concat(axis=axis)(*xs)
  File "/home/ito.t/.local/lib/python2.7/site-packages/chainer/function.py", line 199, in __call__
    outputs = self.forward(in_data)
  File "/home/ito.t/.local/lib/python2.7/site-packages/chainer/functions/array/concat.py", line 43, in forward
    return xp.concatenate(xs, axis=self.axis),
  File "/home/ito.t/.local/lib/python2.7/site-packages/cupy/manipulation/join.py", line 49, in concatenate
    return core.concatenate_method(tup, axis)
  File "cupy/core/core.pyx", line 2072, in cupy.core.core.concatenate_method (cupy/core/core.cpp:63848)
  File "cupy/core/core.pyx", line 2115, in cupy.core.core.concatenate_method (cupy/core/core.cpp:63748)
  File "cupy/core/core.pyx", line 2167, in cupy.core.core.concatenate (cupy/core/core.cpp:64932)
  File "cupy/core/core.pyx", line 1298, in cupy.core.core.ndarray.__setitem__ (cupy/core/core.cpp:27914)
  File "cupy/core/core.pyx", line 2734, in cupy.core.core._scatter_op (cupy/core/core.cpp:71428)
  File "cupy/core/core.pyx", line 1541, in cupy.core.core.elementwise_copy (cupy/core/core.cpp:57352)
  File "cupy/core/elementwise.pxi", line 776, in cupy.core.core.ufunc.__call__ (cupy/core/core.cpp:47765)
  File "cupy/util.pyx", line 39, in cupy.util.memoize.decorator.ret (cupy/util.cpp:1480)
  File "cupy/core/elementwise.pxi", line 584, in cupy.core.core._get_ufunc_kernel (cupy/core/core.cpp:44235)
  File "cupy/core/elementwise.pxi", line 32, in cupy.core.core._get_simple_elementwise_kernel (cupy/core/core.cpp:33912)
  File "cupy/core/carray.pxi", line 86, in cupy.core.core.compile_with_cache (cupy/core/core.cpp:33567)
  File "/home/ito.t/.local/lib/python2.7/site-packages/cupy/cuda/compiler.py", line 156, in compile_with_cache
    cubin = nvcc(source, options, arch)
  File "/home/ito.t/.local/lib/python2.7/site-packages/cupy/cuda/compiler.py", line 78, in nvcc
    _run_nvcc(cmd, root_dir)
  File "/home/ito.t/.local/lib/python2.7/site-packages/cupy/cuda/compiler.py", line 56, in _run_nvcc
    raise RuntimeError(msg)
RuntimeError: `nvcc` command returns non-zero exit status.
command: ['nvcc', '--cubin', '-arch', 'sm_70', '/tmp/tmpCODN17/kern.cu']
return-code: 1
stdout/stderr:
nvcc fatal : Value 'sm_70' is not defined for option 'gpu-architecture'
```
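
The last line of the error is the key: CuPy asks `nvcc` to compile for `sm_70`, the architecture of the V100, and the CUDA 8.0 `nvcc` does not know that value. Volta (`sm_70`) support was first added in CUDA 9.0, so a compatible driver alone is not enough; the toolkit also has to know the GPU architecture. The check can be sketched in Python as below. This is only an illustration: the table `MIN_TOOLKIT_FOR_ARCH` and the helper names are hypothetical, not part of CUDA or CuPy, and the table lists just a few well-known entries.

```python
# Minimum CUDA toolkit release whose nvcc accepts each -arch value.
# Hypothetical lookup table with a few well-known entries only.
MIN_TOOLKIT_FOR_ARCH = {
    "sm_60": (8, 0),   # Pascal (e.g. Tesla P100)
    "sm_61": (8, 0),   # Pascal (e.g. GTX 10 series)
    "sm_70": (9, 0),   # Volta (e.g. Tesla V100)
    "sm_75": (10, 0),  # Turing
}


def parse_version(text):
    """Turn a version string like '8.0' into a comparable tuple (8, 0)."""
    return tuple(int(part) for part in text.split("."))


def toolkit_supports(arch, toolkit_version):
    """Return True if nvcc from toolkit_version should accept -arch <arch>."""
    if arch not in MIN_TOOLKIT_FOR_ARCH:
        raise KeyError("unknown architecture: %s" % arch)
    return parse_version(toolkit_version) >= MIN_TOOLKIT_FOR_ARCH[arch]


if __name__ == "__main__":
    for version in ("8.0", "9.0"):
        supported = toolkit_supports("sm_70", version)
        print("CUDA %s, sm_70 supported: %s" % (version, supported))
```

Under these assumptions, the CUDA 8.0 plan cannot work on the V100 with this code path, and the toolkit would need to be at least CUDA 9.0, which driver 384 is expected to support according to the compatibility note above.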