# Stable Diffusion WebUI Manual Documentation

### Notes

- Main usage of GPUs: 2 and 3
- Program uses port: 7860

## Environment Setup

### Python venv

```bash
python -m venv {venv_name}
source {venv_name}/bin/activate
pip install --upgrade pip
pip install -r /home/diffusion/Documents/buildenv/environment.txt
```

### environment.txt

```
albumentations==0.4.3
diffusers
opencv-python==4.1.2.30
pudb==2019.2
invisible-watermark
imageio==2.9.0
imageio-ffmpeg==0.4.2
pytorch-lightning==1.4.2
omegaconf==2.1.1
test-tube>=0.7.5
streamlit>=0.73.1
einops==0.3.0
torch-fidelity==0.3.0
transformers==4.19.2
torchmetrics==0.6.0
kornia==0.6
streamlit-drawable-canvas==0.8
-e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
-e git+https://github.com/openai/CLIP.git@main#egg=clip
-e .
```

### Deployment Steps

#### Start Environment

```bash
source /home/diffusion/Documents/venv_inference/bin/activate
```

#### Start WebUI

```bash
nohup /home/diffusion/Documents/SD_webui/stable-diffusion-webui/webui.sh &
```

- Access the interface via hostname.
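The bare `nohup … &` launch above leaves no log file or recorded PID. A minimal sketch of a more traceable pattern, with `sleep 60` standing in for `webui.sh` and `webui.log` / `webui.pid` as example paths:

```shell
# Launch a long-lived command detached, keeping its log and PID.
# "sleep 60" stands in for
# /home/diffusion/Documents/SD_webui/stable-diffusion-webui/webui.sh;
# webui.log and webui.pid are example locations.
nohup sleep 60 > webui.log 2>&1 &
echo $! > webui.pid

# Later, stop the server using the recorded PID:
kill "$(cat webui.pid)"
```

With the real `webui.sh`, the log can then be followed with `tail -f webui.log` and the server stopped with the recorded PID instead of hunting for the process.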
  Example: `http://127.0.0.1:7860`

### File Structure

- `dataset`: Folder for uploaded data (custom path allowed)
- `models`: Folder for trained models (custom path allowed)
- `instructPix2Pix_finetune`: Scripts for dataset creation
- `SD_webui`: Main program folder
- `Text_to_image`: Contains all training scripts for Stable Diffusion
- `buildenv`: Environment setup files
- `venv_inference`: Virtual environment
- `unused`: Deprecated or temporary files

## Training Scripts

Local dataset creation:

```bash
/home/diffusion/Documents/instructPix2Pix_finetune/Push_dataset_to_local_pix2pix.py
```

HuggingFace dataset creation:

```bash
/home/diffusion/Documents/instructPix2Pix_finetune/Push_dataset_to_hub_pix2pix.py
```

Local training script:

```bash
/home/diffusion/Documents/Text-to-image/diffusers/examples/instruct_pix2pix/train_instruct_pix2pix_local.py
```

HuggingFace training script:

```bash
/home/diffusion/Documents/Text-to-image/diffusers/examples/instruct_pix2pix/train_instruct_pix2pix.py
```

Convert model to WebUI format:

```bash
/home/diffusion/Documents/Text-to-image/diffusers/scripts/convert_diffusers_to_original_stable_diffusion.py
```

### Image Modes

- img2img: No editing; outputs the full image.
- Sketch: Editable image; outputs the full image.
- Inpaint: Partial editing; replaces specific regions.

### Extensions

Modifiable content:
`/home/diffusion/Documents/SD_webui/stable-diffusion-webui/extensions/stable-diffusion-webui-custom-train/scripts/template_on_tab.py`

![image](https://hackmd.io/_uploads/SyCzIXVxkg.png)

### Finetune InstructPix2Pix

1. Upload Dataset: Allows users to upload their datasets.
2. Upload Dataset (HuggingFace): Allows HuggingFace dataset upload.
3. Finetune: Fine-tunes the model with the dataset.
4. Convert Model: Converts the fine-tuned model for WebUI usage.
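The folder layout above can be recreated in one pass; a sketch using `./diffusion_home` as an example root in place of `/home/diffusion/Documents`:

```shell
# Create the top-level folders listed in File Structure under an
# example root ("./diffusion_home" stands in for /home/diffusion/Documents).
ROOT="./diffusion_home"
for d in dataset models instructPix2Pix_finetune SD_webui Text_to_image buildenv venv_inference unused; do
  mkdir -p "$ROOT/$d"
done
ls "$ROOT"
```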
#### Example Training Command

```bash
nohup accelerate launch --mixed_precision="fp16" --gpu_ids="2,3" \
  --main_process_port=29500 --num_processes=2 \
  /home/diffusion/Documents/Text-to-image/diffusers/examples/instruct_pix2pix/train_instruct_pix2pix_local.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --dataset_name="/home/diffusion/Documents/dataset/Test_Dataset_1K_0216" \
  --use_ema \
  --resolution=512 \
  --train_batch_size=8 \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --max_train_steps=5000 \
  --checkpointing_steps=5000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --seed=512 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=100 \
  --output_dir="/home/diffusion/Documents/models/Test-Model-0220-2" \
  --original_image_column='mask_image' \
  --edited_image_column='image' \
  --edit_prompt_column='text' \
  --random_flip \
  > /home/diffusion/Documents/models/Test-Model-0220-2.log 2>&1 &
```

# Upload Dataset

| Field Name   | Default | Description                        |
| ------------ | ------- | ---------------------------------- |
| Upload File  | none    | Upload the compressed dataset file |
| Dataset Path | none    | Enter the path to save on the host |

# Upload Dataset (HuggingFace)

| Field Name          | Default | Description                               |
| ------------------- | ------- | ----------------------------------------- |
| Upload File         | none    | Upload the compressed dataset file        |
| Huggingface Token   | none    | Obtain from your Huggingface account      |
| Huggingface Dataset | none    | Name of the dataset stored on Huggingface |
| Dataset Path        | none    | Enter the path to save on the host        |

![image](https://hackmd.io/_uploads/HJaOI7Ve1e.png)

# Finetune

| Field Name       | Default | Description                                |
| ---------------- | ------- | ------------------------------------------ |
| Model Name       | none    | Name of the model being trained            |
| Output Directory | none    | Path where the trained model will be saved |
| Dataset Path     | none    | Path to the dataset                        |
| Dataset Type     | Local   | Type of dataset                            |
| Resolution       | 512     | Image size                                 |
| Learning Rate    | 1e-5    | Learning rate                              |
| Batch Size       | 8       | Batch size                                 |
| Step             | 1000    | Number of training steps                   |

![image](https://hackmd.io/_uploads/ryC0Im4lyx.png)

# Convert Model

| Name       | Default | Description                               |
| ---------- | ------- | ----------------------------------------- |
| Model Name | none    | Name of the trained model                 |
| Model Path | none    | Path where the converted model is written |

![image](https://hackmd.io/_uploads/HyRgwmExke.png)

# Training Parameters Description

| Parameter Name                | Default  | Description                                |
| ----------------------------- | -------- | ------------------------------------------ |
| mixed_precision               | none     | Type of mixed-precision computation        |
| gpu_ids                       | none     | GPU IDs to use for training                |
| main_process_port             | 25900    | Port for the main training process         |
| num_processes                 | 1        | Number of parallel processes               |
| resolution                    | 512      | Training image size                        |
| train_batch_size              | 4        | Batch size                                 |
| max_train_steps               | 1000     | Maximum number of training steps           |
| checkpointing_steps           | 500      | Number of steps between saved checkpoints  |
| learning_rate                 | 1e-5     | Learning rate                              |
| lr_scheduler                  | constant | Learning-rate schedule                     |
| output_dir                    | none     | Output directory                           |
| pretrained_model_name_or_path | none     | Pretrained model to use for training       |

#### Dataset Restrictions

1. The name of the top-level folder inside the compressed file must match the name of the file itself.
2. Each sample must consist of an image, a masked image, and a prompt with matching filenames.

#### Dataset Structure

A dataset must include:

- `(unknown).jpg`
- `(unknown)_mask.jpg`
- `(unknown).txt`

Allowed image extensions: `.jpg` or `.png`

#### Local Dataset Files

- `xxxx.arrow`
- `dataset_info.json`
- `state.json`
- `dataset_dict.json`

Note: These files won’t exist if the dataset is uploaded to HuggingFace.
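The naming rules above can be checked before launching a training run; a minimal sketch, assuming a flat dataset folder with the `(unknown).jpg` / `(unknown)_mask.jpg` / `(unknown).txt` convention (the `check_dataset` helper and the `demo_ds` folder are illustrative, not part of the toolchain):

```shell
# Verify that every image in a dataset folder has a matching mask and
# prompt file, per the Dataset Structure rules above.
check_dataset() {
  local dir="$1" missing=0 img base ext
  for img in "$dir"/*.jpg "$dir"/*.png; do
    [ -e "$img" ] || continue                  # skip unmatched globs
    base="${img%.*}"; ext="${img##*.}"
    case "$base" in *_mask) continue ;; esac   # masks are checked via their image
    [ -e "${base}_mask.${ext}" ] || { echo "missing mask for $img"; missing=1; }
    [ -e "${base}.txt" ]         || { echo "missing prompt for $img"; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "dataset OK"
}

# Example: build a valid one-sample dataset and check it.
mkdir -p demo_ds
touch demo_ds/0001.jpg demo_ds/0001_mask.jpg demo_ds/0001.txt
check_dataset demo_ds    # prints: dataset OK
```

Running the check first avoids discovering a missing mask or prompt only after the training job has already been dispatched to the GPUs.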
## Models

Converted models for WebUI usage should be placed in `/stable-diffusion-webui/models/Stable-diffusion`.

![image](https://hackmd.io/_uploads/SJA-umVxke.png)

## Reference Materials

- [Stable diffusion](https://huggingface.co/docs/diffusers/v0.13.0/en/training/text2image)
- [Gradio](https://www.gradio.app/)
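Installing a converted checkpoint is a single copy into that folder; a sketch with stand-in paths (`models` and `webui/models/Stable-diffusion` mimic the real locations on the host, and `Test-Model.ckpt` is a hypothetical file):

```shell
# Example: install a converted checkpoint into the WebUI model folder.
# "models" stands in for /home/diffusion/Documents/models, and
# "webui/models/Stable-diffusion" for
# /home/diffusion/Documents/SD_webui/stable-diffusion-webui/models/Stable-diffusion.
mkdir -p models webui/models/Stable-diffusion
touch models/Test-Model.ckpt                      # stand-in for a real checkpoint
cp models/Test-Model.ckpt webui/models/Stable-diffusion/
ls webui/models/Stable-diffusion
```

The WebUI picks up models from this folder; after copying, refresh the checkpoint list in the interface to make the new model selectable.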