
Stable Diffusion on an AMD GFX803 GPU: acceleration experiment

  • Gfx803-docker
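    • A segfault occurs when starting the stable diffusion webui.
      Following the suggestion in https://github.com/xuhuisheng/rocm-gfx803/issues/27#issuecomment-1892611849, the current attempt is to build torch and torchvision from source and see whether that works.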
https://github.com/tsl0922/pytorch-gfx803/releases/tag/pytorch-1.13.1
sudo apt install libopenmpi3 libstdc++-11-dev
pip install torch-1.13.1-cp310-cp310-linux_x86_64.whl
pip install torchvision-0.14.1-cp310-cp310-linux_x86_64.whl

python -m venv venv --system-site-packages
source venv/bin/activate
pip install -r requirements.txt
python launch.py --skip-torch-cuda-test

https://github.com/xuhuisheng/rocm-gfx803/issues/27
sudo apt autoremove rocm-core amdgpu-dkms
sudo apt install libopenmpi3 libstdc++-12-dev libdnnl-dev ninja-build libopenblas-dev libpng-dev libjpeg-dev
echo ROC_ENABLE_PRE_VEGA=1 | sudo tee -a /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 | sudo tee -a /etc/environment
# Reboot after this
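
After the reboot it may be worth confirming that both variables actually landed in /etc/environment. A minimal sketch; `check_env_entry` is a hypothetical helper written for this note, not part of ROCm:

```shell
# Hypothetical helper: report whether NAME=... appears in an environment file.
# $1 = variable name, $2 = file to inspect (defaults to /etc/environment).
check_env_entry() {
    if grep -q "^$1=" "${2:-/etc/environment}" 2>/dev/null; then
        echo "$1: set in ${2:-/etc/environment}"
    else
        echo "$1: missing from ${2:-/etc/environment}"
    fi
}

check_env_entry ROC_ENABLE_PRE_VEGA
check_env_entry HSA_OVERRIDE_GFX_VERSION
```

This only inspects the file. Whether the running session picked the values up can be checked separately with `printenv`.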

wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
sudo apt install ./amdgpu-install_5.5.50500-1_all.deb
sudo amdgpu-install -y --usecase=rocm,hiplibsdk,mlsdk

sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME

# verify
rocminfo
clinfo
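
The output of rocminfo is long, so a quick filter can help. The sketch below assumes the card is reported with the string gfx803 somewhere in the rocminfo output. `count_gfx803` is a hypothetical helper, and the group check only reflects the current login session:

```shell
# Count lines mentioning gfx803 in rocminfo-style text read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the status clean.
count_gfx803() {
    grep -c 'gfx803' || true
}

# Only call rocminfo when it is installed.
if command -v rocminfo >/dev/null 2>&1; then
    echo "gfx803 lines in rocminfo: $(rocminfo 2>/dev/null | count_gfx803)"
fi

# The usermod changes above only apply after logging in again.
for grp in video render; do
    if id -nG | tr ' ' '\n' | grep -qx "$grp"; then
        echo "$grp: member"
    else
        echo "$grp: not a member yet"
    fi
done
```

A count of 0 suggests the runtime does not see the card, in which case the environment variables and the reboot are the first things to recheck.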

# Create a pytorch2.1.2 directory in the home directory
# Build Torch
mkdir -p ~/pytorch2.1.2
cd ~/pytorch2.1.2
git clone --recursive https://github.com/pytorch/pytorch.git -b v2.1.2
cd pytorch
pip install cmake mkl mkl-include
pip install -r requirements.txt
sudo ln -s /usr/lib/x86_64-linux-gnu/librt.so.1 /usr/lib/x86_64-linux-gnu/librt.so
export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
export PYTORCH_ROCM_ARCH=gfx803
export PYTORCH_BUILD_VERSION=2.1.2 PYTORCH_BUILD_NUMBER=1
export USE_CUDA=0 USE_ROCM=1 USE_NINJA=1
python3 tools/amd_build/build_amd.py
python3 setup.py bdist_wheel
pip install dist/torch-2.1.2-cp310-cp310-linux_x86_64.whl --force-reinstall

cd ..
git clone https://github.com/pytorch/vision.git -b v0.16.2
cd vision
export BUILD_VERSION=0.16.2
FORCE_CUDA=1 ROCM_HOME=/opt/rocm/ python3 setup.py bdist_wheel
pip install dist/torchvision-0.16.2-cp310-cp310-linux_x86_64.whl --force-reinstall

# automatic
cd ..

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip wheel
pip uninstall torch torchvision
pip3 install /home/*******/pytorch2.1.2/pytorch/dist/torch-2.1.2-cp310-cp310-linux_x86_64.whl
pip3 install /home/*******/pytorch2.1.2/vision/dist/torchvision-0.16.2-cp310-cp310-linux_x86_64.whl
pip list | grep 'torch'
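
Beyond the package list, it can help to ask torch itself which build is loaded. A minimal sketch, assuming it is run inside the activated venv. `torch_versions` is a hypothetical helper, and `torch.version.hip` is expected to be None on a build without ROCm:

```shell
# Hypothetical helper: print the torch version and the HIP version it was built with.
torch_versions() {
    python3 - <<'EOF'
try:
    import torch
except Exception as exc:
    # Covers both a missing package and a wheel that fails to load its libraries.
    print("torch: import failed:", exc)
else:
    print("torch:", torch.__version__)
    print("hip:", torch.version.hip)
EOF
}

torch_versions
```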

Run stable-diffusion-webui:

source /stable-diffusion-webui/venv/bin/activate &&
python launch.py --listen --medvram

Docker run:

sudo docker run --device=/dev/kfd --device=/dev/dri --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --restart=unless-stopped -p 8088:7860 stablediffusion-gfx803:0.2 /bin/bash -c "source /stable-diffusion-webui/venv/bin/activate && python /stable-diffusion-webui/launch.py --listen --medvram"

Docker compose:

version: '3'
services:
  stablediffusion:
    image: stablediffusion-gfx803:0.2
    ports:
      - "8088:7860"
    devices:
      - "/dev/kfd"
      - "/dev/dri"
    group_add:
      - "video"
    ipc: "host"
    cap_add:
      - "SYS_PTRACE"
    security_opt:
      - "seccomp=unconfined"
    command: /bin/bash -c "source /stable-diffusion-webui/venv/bin/activate && python /stable-diffusion-webui/launch.py --listen --medvram"
    restart: unless-stopped

Confirm that GPU acceleration is active in stable-diffusion
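
One way to check this from a shell is a small tensor operation on the GPU. This is a sketch under assumptions: it must run in the same venv (or container) as the webui, `gpu_smoke_test` is a hypothetical helper, and ROCm builds of torch expose the GPU through the `torch.cuda` API:

```shell
# Hypothetical helper: run a small matrix multiply on the GPU if torch can see one.
gpu_smoke_test() {
    python3 - <<'EOF'
try:
    import torch
except Exception as exc:
    print("gpu check skipped:", exc)
else:
    if torch.cuda.is_available():
        x = torch.rand(256, 256, device="cuda")
        total = (x @ x).sum().item()  # .item() forces the kernel to finish
        print("gpu check passed on", torch.cuda.get_device_name(0))
    else:
        print("gpu check failed: torch sees no GPU")
EOF
}

gpu_smoke_test
```

If the interpreter dies without printing any of these lines, that likely points at the same segfault described at the top of this note.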

Podman requires a few additional steps

chmod 666 /dev/kfd
groupadd -g $(getent group video | cut -d: -f3) video
groupadd -g $(getent group nogroup | cut -d: -f3) nogroup
groupadd -g $(getent group render | cut -d: -f3) render
usermod -aG video root
usermod -aG nogroup root
usermod -aG render root
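
To see whether these steps took effect, the device nodes can be probed from the account that will run the webui. A minimal sketch; `check_device` is a hypothetical helper, and the renderD* naming under /dev/dri is an assumption about the host:

```shell
# Hypothetical helper: report whether a device node is readable and writable.
check_device() {
    if [ -r "$1" ] && [ -w "$1" ]; then
        echo "$1: ok"
    else
        echo "$1: not accessible"
    fi
}

check_device /dev/kfd
# If no render node exists, the glob stays literal and is reported as not accessible.
for node in /dev/dri/renderD*; do
    check_device "$node"
done
```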