1. Update the Python version
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.10
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 2
sudo update-alternatives --config python3
2. Clear out old data
cd /usr/share/keyrings/
sudo rm -rf /usr/lib/nvidia/
sudo apt-get --purge remove "*cublas*" "*cuda*" "nsight*" "*nvidia*"
sudo apt autoremove
sudo apt autoclean
sudo apt-get install build-essential linux-headers-$(uname -r)
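Before installing a new driver, it may help to confirm that the purge above actually removed everything. A minimal sketch, assuming a Debian-style `dpkg`; the helper name `count_installed` is hypothetical and only counts rows marked `ii` (installed):

```shell
# count_installed PATTERN: count "ii" (installed) rows in `dpkg -l` output
# read from stdin whose package name matches PATTERN (lowercase regex).
count_installed() {
  awk -v pat="$1" '$1 == "ii" && tolower($2) ~ pat { n++ } END { print n + 0 }'
}

if command -v dpkg >/dev/null 2>&1; then
  left="$(dpkg -l | count_installed 'nvidia|cuda|cublas|nsight')"
  echo "NVIDIA/CUDA packages still installed: ${left}"
fi
```

A result of 0 suggests the system is clean; anything else can be inspected with `dpkg -l | grep -i nvidia`.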
2. Install the NVIDIA driver
sudo apt-get update
#https://www.nvidia.com/Download/index.aspx?lang=en-us
sudo bash NVIDIA-Linux-x86_64-550.67.run
# sudo apt install nvidia-detect
# sudo apt install nvidia-driver
3. Install CUDA
#https://developer.nvidia.com/cuda-downloads
wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda-repo-debian11-11-7-local_11.7.1-515.65.01-1_amd64.deb
sudo dpkg -i cuda-repo-debian11-11-7-local_11.7.1-515.65.01-1_amd64.deb
sudo cp /var/cuda-repo-debian11-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo add-apt-repository contrib
sudo apt-get update
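# Note: the `cuda` metapackage also pulls in the driver bundled with this repo (515.65.01);
# `cuda-toolkit-11-7` installs only the toolkit if the driver installed earlier should be kept.
sudo apt-get -y install cuda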
nvidia-smi
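Beyond eyeballing the `nvidia-smi` table, the driver version can be compared against a threshold. This is only a sketch: `MIN_DRIVER` is an assumption taken from the version embedded in the installer filename above, not an authoritative minimum (NVIDIA's compatibility table is the reference), and `version_at_least` is a hypothetical helper built on GNU `sort -V`.

```shell
# version_at_least HAVE WANT: succeed when HAVE >= WANT (uses GNU `sort -V`)
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

MIN_DRIVER="515.65.01"   # assumption: taken from the installer filename

if command -v nvidia-smi >/dev/null 2>&1; then
  have="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if version_at_least "$have" "$MIN_DRIVER"; then
    echo "driver ${have} is at least ${MIN_DRIVER}"
  else
    echo "driver ${have} is older than ${MIN_DRIVER}"
  fi
else
  echo "nvidia-smi not found; the driver is probably not installed"
fi
```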
4. Install cuDNN
#https://developer.nvidia.com/rdp/cudnn-download
sudo tar -xzvf cudnn-11.3-linux-x64-v8.2.1.32.tgz
# cuDNN 8 splits its API across several headers, so copy all of them
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
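To confirm which cuDNN version ended up under `/usr/local/cuda`, the version macros can be read back from the header. In cuDNN 8 these macros live in `cudnn_version.h` rather than `cudnn.h`. The helper name `cudnn_version` is hypothetical; this is a sketch, not an official check.

```shell
# cudnn_version FILE: print MAJOR.MINOR.PATCHLEVEL from a cuDNN version header
cudnn_version() {
  awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { mj = $3 }
       $1 == "#define" && $2 == "CUDNN_MINOR"      { mn = $3 }
       $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { pl = $3 }
       END { print mj "." mn "." pl }' "$1"
}

hdr="/usr/local/cuda/include/cudnn_version.h"
if [ -f "$hdr" ]; then
  echo "cuDNN $(cudnn_version "$hdr")"
else
  echo "${hdr} not found; were all cudnn*.h headers copied?"
fi
```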
5. Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda create -n myDL python=3.10 pip
conda activate myDL
6. Install PyTorch
conda install pytorch torchvision torchaudio cudatoolkit=11.7 -c pytorch -c conda-forge
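After the install finishes, it is worth checking that PyTorch was built with CUDA support and can see the GPU. A minimal sketch, assuming the `myDL` environment is active so that `python3` resolves to the conda interpreter; `torch_check` is a hypothetical helper name.

```shell
# torch_check: report the PyTorch version, the CUDA version it was built
# against, and whether a GPU is visible. Prints a notice if torch is missing.
torch_check() {
  python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not importable in this environment")
else:
    print("torch", torch.__version__,
          "| built for CUDA", torch.version.cuda,
          "| GPU visible:", torch.cuda.is_available())
EOF
}

torch_check
```

If `built for CUDA` shows `None`, conda most likely resolved a CPU-only build, and the channel and package selection should be revisited.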
7. Other commands
pip3 install -U d2l jupyter
sudo apt install nvidia-driver firmware-misc-nonfree
# Check installed versions
dpkg -l | grep -i nvidia
8. Remote Jupyter
https://stackoverflow.com/questions/42848130/why-i-cant-access-remote-jupyter-notebook-server
jupyter notebook --no-browser & disown
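One common way to reach a notebook started like this is an SSH tunnel from the local machine. The sketch below only builds the command string; `tunnel_cmd` is a hypothetical helper, `alice@gpu-box` is a placeholder login, and port 8888 is assumed because it is Jupyter's default.

```shell
# tunnel_cmd USER@HOST [PORT]: print an ssh command that forwards a local
# port to the same port on the remote machine (default 8888).
tunnel_cmd() {
  port="${2:-8888}"
  echo "ssh -N -L ${port}:localhost:${port} $1"
}

# Placeholder login; replace with the real user and host.
tunnel_cmd alice@gpu-box
```

Run the printed command on the local machine, then open `http://localhost:8888` and paste the token that Jupyter printed on the server.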
10. Troubleshooting
Perl locale errors
sudo apt-get update && sudo apt-get install locales
sudo dpkg-reconfigure locales
Finding kernel headers
ls /usr/src/linux-6.1.0-headers
sudo apt list linux-headers-$(uname -r)
sudo apt install linux-headers-amd64
#sudo apt install kernel-headers-$(uname -r)
sudo apt install build-essential
sudo dpkg --configure -a
sudo dkms install -m nvidia-current -v 535.104.05
sudo apt-get install multipath-tools
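Driver builds usually fail when headers for the running kernel are missing. A quick sketch of that check, assuming the usual layout where `/lib/modules/<release>/build` points at the header tree; `kernel_headers_present` is a hypothetical helper, and its optional second argument (an alternate root directory) exists only to make it easy to test.

```shell
# kernel_headers_present [RELEASE] [ROOT]: succeed if the build directory
# for RELEASE (default: running kernel) exists under ROOT (default: /).
kernel_headers_present() {
  rel="${1:-$(uname -r)}"
  [ -e "${2:-}/lib/modules/${rel}/build" ]
}

if kernel_headers_present; then
  echo "headers found for $(uname -r)"
else
  echo "no headers for $(uname -r); try installing linux-headers-$(uname -r)"
fi
```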
11. Stable Diffusion
mkdir StableDiff
cd StableDiff
# Clone the web UI framework
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
# Download the model weights
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt
mv v1-5-pruned-emaonly.ckpt stable-diffusion-webui/models/Stable-diffusion/
# Run (webui.sh lives inside the cloned repository)
cd stable-diffusion-webui
./webui.sh --xformers --share
https://docs.conda.io/en/latest/miniconda.html
https://developer.nvidia.com/cuda-downloads
https://developer.nvidia.com/rdp/cudnn-download
https://pytorch.org/get-started/locally/
https://docs.nvidia.com/deploy/cuda-compatibility/index.html
https://www.datacamp.com/tutorial/how-to-run-stable-diffusion