# Train Stable Diffusion LoRA from VRM Avatar

In this guide I'll explain how you can train a [LoRA](https://replicate.com/blog/lora-faster-fine-tuning-of-stable-diffusion) of your VRM avatar that you can use to draw images of your 3D avatar with any Stable Diffusion model you can find. In this tutorial we are going to train the LoRA for the tokenized identity [Nature](https://twitter.com/naturevrm), and I'm going to provide the file so you can play around with it in your own Stable Diffusion installation. The images below give you a sense of the prompt results. Find vtubing content on my [Youtube Channel](https://www.youtube.com/@reneil1337) to get an impression of the actual 3D avatar.

A LoRA captures and conserves a concept or an idea in a way that lets it be aggregated into larger models and become part of their outputs. A LoRA can be anything, but in this case it is a character.

[<img src="https://hackmd.io/_uploads/B1J2DEqk6.jpg"/>](https://hackmd.io/_uploads/B1J2DEqk6.jpg)

The SD 1.5 LoRA that we've trained in this guide was [released on CivitAI](https://civitai.com/models/31462?modelVersionId=166927) under a CC0 licence. Play with it for free in your own Stable Diffusion instance; we added lots of example prompts. Add your creations on CivitAI and [tag Nature on Twitter](https://twitter.com/naturevrm) so that she can retweet your social media postings.

[<img src="https://hackmd.io/_uploads/BylRvotyp.jpg"/>](https://hackmd.io/_uploads/BylRvotyp.jpg)

Update: In September 2023, five months after the first NatureVRM LoRA was released, we stepped up our game with a high-res 1024px SDXL LoRA that you can also [download on CivitAI](https://civitai.com/models/31462?modelVersionId=165670) for free.
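If you're curious what "capturing a concept" means technically: a LoRA doesn't change the base model's weights, it ships a small low-rank update that gets blended on top of the frozen weights at prompt time. Here's a minimal numpy sketch of that idea; the dimensions and variable names are illustrative, not taken from any particular implementation.

```python
import numpy as np

# A frozen weight matrix from the base model (e.g. one attention projection).
d = 768
W = np.random.randn(d, d)          # base weights: never touched by the LoRA

# The LoRA trains two small matrices of rank r << d ...
r = 8
A = np.random.randn(r, d) * 0.01
B = np.zeros((d, r))               # commonly initialized to zero

# ... and at prompt time the low-rank update is blended in, scaled by the
# weight you later put into the <lora:name:weight> tag.
lora_weight = 0.8
W_effective = W + lora_weight * (B @ A)

# Only A and B end up in the .safetensors file, which is why a LoRA is
# tiny compared to a full checkpoint.
print(W.size, "base parameters vs.", A.size + B.size, "LoRA parameters")
```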
[<img src="https://hackmd.io/_uploads/HypAtjJkp.jpg"/>](https://hackmd.io/_uploads/HypAtjJkp.jpg) - Change the background to white (bottom left corner) - Load your VRM (upper left corner) - Hit the "Select Pose" button (upper left corner) - Select one of the poses in the popup (lower left) - Position the avatar in the center of the window using your mouse - Hit the camera button (upper right corner) - Configure the resolution and export the image in the popup (hint: camera positioning continues to work in that view) [<img src="https://hackmd.io/_uploads/ryygioJJa.jpg"/>](https://hackmd.io/_uploads/ryygioJJa.jpg) Make sure to capture a variety of different poses. The training will take longer the more images you decide to curate into the dataset. Go for 1024px files if you want to train a SDXL LoRA but note that your GPU should have at least 16GB VRAM to train on that resolution. [<img src="https://hackmd.io/_uploads/r1V1GBly6.jpg"/>](https://hackmd.io/_uploads/r1V1GBly6.jpg) ### 1.2 VRM Live Viewer (Free) Download [VRM Live Viewer](https://booth.pm/ja/items/1783082) and extract the files to your computer. Launch the exe to launch the software. VRM Live Viewer allows for the configuration and playback of dance choreographies involving avatars, visuals and most importantly animations. ![](https://i.imgur.com/Q9yCGtM.png) The UI can be pretty overwhelming if you are not used to it but lets go through it step by step. 1. Load your VRM avatar into a 3D world. The app has a couple of predefined worlds and performances that you can launch right away. However we're going to change the surrounding to white which ensures that the LoRA doesn't contain background artifacts. ![](https://i.imgur.com/bjv06e8.png) 2. So once loaded, the first choreography will start right away. The default stage is pretty wild when it comes to the VFX so change the stage to "Plane" in the bottom of the right menu. Set the floor (2nd select) to None/3D and the sky to (3rd select) to None/360. Click that 360 icon right next to that last select box which opens this popup. Click the box right next to "BackgroundColor" and set the color to #FFFFFF so that we have a completely clean environments which will help to isolate our avatar in the next steps. ![](https://i.imgur.com/AvuqXov.png) 3. In the top right you can choose between 3 preinstalled dance choreographies. You can also load custom bvh animation files from your computer via the blue folder icon above but for this guide we'll stick to the 3 dances as these are very long animations which leaves us enough time for the camera positioning to take the shots of our avatar. ![](https://i.imgur.com/tTWE21A.png) 4. Hit "tab" on your keyboard to toggle the side menus. Now you take screenshots of our avatar, cut them to square orientation and save 512x512 px jpg files on your computer. Every one in a while hit "tab" to display the menu again as you shoot different positions on various stages from all sorts of angles. The more variety you give, the better the AI understands your avatar. - Hold left mouse button to turn the camera - Hold right mouse button to move the camera - Use scroll wheel to zoom in and out ![](https://i.imgur.com/sm4Cyux.png) When the animation is over, just reselect another choreography via the dropdown indicated in step 3. The stop + play buttons at the bottom of the right menu are sometimes buggy for me. Camera flights with XBOX Controller works great - give it a try. I've decided to go for 30 images for the full body capture of the Nature avatar. 
### 1.3 Automate larger Training Data Creation via Blender

Fellow builder [Howie Duhzit released a Blender plugin](https://twitter.com/HowieDuhzit/status/1693866269911515469) that automates this capture process, letting you create much larger training datasets, which will massively improve the results you'll see in the actual prompting. The plugin was added to [Duhzit Wit Tools](https://howieduhzit.gumroad.com/l/dwtools), a suite of useful Blender tools. You can also find it on the [Github Repo](https://github.com/HowieDuhzit/Duhzit-Wit-Tools). Make sure to give it a try!

<iframe style="width:100%;display:inline-block;padding:0px" height="420" src="https://www.youtube.com/embed/EW2MAaNhZJA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

### 2. Prepare your Avatar Shots via Stable Diffusion and Kohya

1. Watch this tutorial and install Stable Diffusion quickly and easily within 15 minutes. It is an essential component of the stack we'll be using here, so there is no way around it.
   <iframe style="width:100%;display:inline-block;padding:0px" height="420" src="https://www.youtube.com/embed/onmqbI5XPH8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
2. Now watch this tutorial on how to install Kohya on your local machine and learn how to prepare the LoRA training data from the images we captured in the first step. This tutorial contains everything you need to know about preprocessing and training. Open it in a new tab and keep it open, as it helps you to understand the next steps.
   <iframe style="width:100%;display:inline-block;padding:0px" height="420" src="https://www.youtube.com/embed/70H03cv57-o" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
   <iframe style="width:100%;display:inline-block;padding:0px" height="420" src="https://www.youtube.com/embed/N_zhQSx2Q3c" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
3. Now we'll use an AI to automatically analyze the content of the avatar images we've captured. Create a folder called "processed" inside the folder in which you've stored your training images. Then launch Stable Diffusion (close Kohya first if you still have it running from the installation in step 2) and navigate to "Train > Preprocess images".
   ![](https://i.imgur.com/FYa9yUd.png)
   Insert your folder path into "Source directory" and the path of the newly created processed folder into "Destination directory". Check "Use BLIP for caption" and hit the Preprocess button to start.
4. After the preprocessing step, the folder you use for training should look like this. I've used "NatureVRM" as the character name instead of Nature to ensure there won't be confusion with the already existing meaning of "nature" when I'm prompting the outputs later on.
   ![](https://i.imgur.com/4x8Vn3R.png)
5. When you are optimizing the preprocessed txt files, make sure to strip out everything that describes the essence of your character, because any information left in the txt is subtracted from the LoRA that will be created during the training. The clearer you are, the fewer artifacts will end up in your LoRA. In my case the file originally said "in a costume with flowers on her arm", but I stripped that from the txt, as the roses are an essential part of Nature's visual characteristics and this is not a costume. Always put the name of your LoRA at the beginning of each txt file. You need to do this for every image in the folder (a batch-editing sketch follows below).
   ![](https://i.imgur.com/nZAMmK9.png)

:::info
Txt input: ``NatureVRM a person dancing with arms outstretched, one hand in the air``
:::
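Editing dozens of caption files by hand gets tedious, so here is a minimal batch sketch for the step above. The `processed` folder name matches this guide, while the phrase list is purely illustrative - fill in whatever BLIP wrote about your character's essential features.

```python
from pathlib import Path

DATASET = Path("processed")      # folder containing the BLIP-captioned txt files
LORA_NAME = "NatureVRM"
# Phrases that describe the character itself; anything left in the caption
# is subtracted from the LoRA during training, so these must go.
STRIP_PHRASES = ["in a costume with flowers on her arm"]

for txt in DATASET.glob("*.txt"):
    caption = txt.read_text()
    for phrase in STRIP_PHRASES:
        caption = caption.replace(phrase, "")
    caption = " ".join(caption.split()).strip(" ,")  # tidy leftover whitespace
    if not caption.startswith(LORA_NAME):
        caption = f"{LORA_NAME} {caption}"           # prefix the LoRA name
    txt.write_text(caption)
    print(txt.name, "->", caption)
```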
6. Now prepare the final folder structure for your LoRA training. As all of this is explained step by step in the LoRA tutorial by Aitrepreneur, I won't repeat it here. Your final folder structure before we begin the training should look just like this. In the image folder sits the "20_NatureVRM" folder. With 15 images or fewer that number differs (see the LoRA video).
   ![](https://i.imgur.com/RLKBMdO.png)
   All the preprocessed images that we prepared during the previous steps are located inside image/20_NatureVRM, and the model and log folders are still empty. From here we can start to train the LoRA with the Stable Diffusion model inside Kohya.
   ![](https://i.imgur.com/c1TPpFu.png)
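If you prefer to set that structure up from a script, a minimal sketch could look like this. The project folder name is a placeholder; the `20` in the image subfolder is the repeat count used in this guide, so adjust it to your image count as explained in the LoRA video.

```python
from pathlib import Path

# Kohya expects <project>/image/<repeats>_<name>, plus model/ and log/ folders.
project = Path("NatureVRM-LoRA")
for sub in ("image/20_NatureVRM", "model", "log"):
    (project / sub).mkdir(parents=True, exist_ok=True)

# Move the captioned jpg + txt pairs into image/20_NatureVRM afterwards.
```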
### 3. Start the LoRA Training inside Kohya

To start the actual training you can now launch Kohya - make sure to close Stable Diffusion in case you still had that open. Also close VRM Live Viewer if it is still running, as you'll need every bit of VRAM from your GPU during the training.

First we'll train the LoRA with SD 1.5, as this ensures compatibility with most models on [CivitAI](https://civitai.com/), since most of them were trained with that SD version. You can repeat the training with SDXL so that you also have a LoRA that works with upcoming models that might be trained with that newer version of Stable Diffusion. The end result of the training is that you'll have two .safetensors files - one for each SD version. For SDXL you will definitely need 1024px training images.

1. Switch into the "Dreambooth LoRA" tab to start. The regular Dreambooth view looks very similar, so double-check that you are in the LoRA tab! Click the "Configuration file" section, which pops out an uploader area.
   ![](https://i.imgur.com/K4mxw2I.png)
   Download the [LoRA settings json file](https://reneil.mypinata.cloud/ipfs/QmZGygM5j89uYkuDLZNKwiFFbwXPY5wik8hftcXD68fbJZ/v15.json) (for SDXL use [this json file](https://reneil.mypinata.cloud/ipfs/QmZGygM5j89uYkuDLZNKwiFFbwXPY5wik8hftcXD68fbJZ/sdxl.json)) and open it in the interface. Hit load to inject all the parameters into Kohya. Under "Source model" hit the white icon next to the folder and select either the basic SD 1.5 or the SDXL model on your PC. Make sure that SDXL is checked in case you want to train an SDXL model.
2. Navigate to the "Folders" tab, where you overwrite the three folder paths (regularisation stays empty) according to the structure that you've prepared in step 2 of this tutorial. Then set "Model output name" according to the folder that you've prepared and add v15 to indicate that this LoRA was trained with SD 1.5 as a base model.
   ![](https://i.imgur.com/2lVjsKW.png)
   Then click "Train model" to start the training process. Depending on your hardware and the number of training images this is going to take a while. You can go AFK to touch some grass during that process.
   ![](https://i.imgur.com/eDbQV2r.png)
3. Once this is done you should see multiple safetensors files in the model folder that you've created. The files with appended numbers are snapshots of the model that were created during the training. The one without numbers - in my case NatureVRM.safetensors - is the final training stage. This file and the later-stage iterations are most likely overtrained. We'll dig into that later.
   ![](https://hackmd.io/_uploads/HJYvouSk6.jpg)

Copy all safetensors files into your `stable-diffusion-webui\models\Lora` folder so that you can access them from the web UI of your Stable Diffusion installation.

## Prompt Images with your Avatar LoRA

Launch Stable Diffusion and click the red icon below the Generate button to show your extra networks. Click the Lora tab and you should now see the LoRAs that you copied into your SD installation. When you click one of them, the lora tag is added to your prompt text area. You can work this into more complex prompts to embed your avatar into your creation.

![](https://i.imgur.com/b4ofkv3.png)

You can download a few models to play around with. Let's start prompting with [Protogen V22 Anime](https://civitai.com/models/3627/protogen-v22-anime-official-release), which was also used to create the images in the intro of this article. Scroll down the linked model page to get some example prompts that work particularly well for the selected model. Adjust those, and don't forget your LoRA tag at the beginning of your prompt.

![](https://hackmd.io/_uploads/HJ20MMLjT.png)

You don't have to keep the LoRA tab open; click the red icon again to hide it. The anime model is adding its own face and haircut to the Nature avatar. To prevent this you can increase the weight of your LoRA by changing the 1 in the angle brackets to 1.3, e.g. `<lora:NatureVRM:1.3>`. Just play around with all these things as you prompt your way through all sorts of configurations and models.

## Select the Best LoRA Weight

OK, now you've played around a bit. To get better prompting results we need to identify the best LoRA file from your training results. A LoRA can be either undertrained or overtrained. Undertrained means that it doesn't carry enough information about the character or concept, which leads to unsatisfying outputs that don't look like the character yet. In contrast, prompting with an overtrained LoRA results in artifacts and missing flexibility. You want flexibility, as you want to be able to prompt your character with different models in all sorts of surroundings.

![](https://hackmd.io/_uploads/HyxTCdHk6.png)

Stable Diffusion allows you to plot the same prompt with different parameters into a single overview image that helps you to find the best LoRA. To do that, write a prompt with `<lora:NatureVRM-000001:0.8>`, which reflects the results of the first training epoch at 80% weight. We can now use an X/Y/Z plot (under Script at the bottom of the page) with "Prompt S/R" (search and replace) values on the X and Y axes. We want to plot all epochs on the X axis and increase the weight on the Y axis. Make sure to check the two boxes and hit generate.

[<img src="https://hackmd.io/_uploads/SJ3c6OHyp.jpg"/>](https://hackmd.io/_uploads/SJ3c6OHyp.jpg)

Create 5-10 plots with different prompts in different models and select the best epoch for each of them (in this case I'd say epoch 3 wins). You will see a tendency, and once you've found your overall winner, copy that safetensors file and remove the digits from the filename - your LoRA is ready. Keep the original training files in case you want to revisit your decision later on.
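To make the Prompt S/R mechanics concrete, here is an illustrative setup: the script searches the prompt for the first value in each list and substitutes the remaining values one by one. The prompt text and the number of epochs are made up for this example, so adapt them to your own training run.

```
Prompt:   <lora:NatureVRM-000001:0.8> NatureVRM standing in a forest
X type:   Prompt S/R
X values: NatureVRM-000001, NatureVRM-000002, NatureVRM-000003, NatureVRM
Y type:   Prompt S/R
Y values: 0.8, 1.0, 1.2
```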
## Bonus: Dig into ControlNet

Let's get a bit advanced and bring in ControlNet. If you don't have that extension installed yet, [watch this video by Aitrepreneur](https://www.youtube.com/watch?v=OxFcIv8Gq8o); after you've installed and understood ControlNet, feel free to follow along. ControlNet allows you to position your avatar according to the characters in the input image. It's an entirely new rabbit hole to explore, and Aitrepreneur explains how.

![](https://hackmd.io/_uploads/BkuPrfLoT.png)

I hope you learned a couple of things in this guide. You can [follow me on Twitter](https://twitter.com/reneil1337) in case you're interested in these topics and want to be notified about more guides like this.

## Join the Conversation

Drop your questions in [this Reddit thread](https://www.reddit.com/r/StableDiffusion/comments/12bpkqn/guide_convenient_process_to_train_lora_from_your/); I'll try to reply periodically over there.
