This is not an AI Art Podcast (Ep. 11)

![pod logo](https://i.imgur.com/SlYH9da.png =600x408)

## Intro

Welcome to episode eleven! This is your host, Doug Smith. This is Not An AI art podcast is a podcast about, well, AI ART – technology, community, and techniques. With a focus on stable diffusion, but all art tools are up for grabs, from the pencil on up, and including pay-to-play tools, like Midjourney. Less philosophy – more tire kicking. But if the philosophy gets in the way, we'll cover it. But plenty of art theory!

Little different this week! Today we've got:

* Model madness model reviews: 3 models and a LoRA
* Technique of the week: Getting a specific hairstyle
* My project update: so you can learn from my process

Bunch of news, a PSA, but no art crits -- I'm late to record, and I was out camping all weekend, and while it was glorious, I am now behind on all my hustles!

Available on:

* [Spotify](https://open.spotify.com/show/4RxBUvcx71dnOr1e1oYmvV)
* [iHeartRadio](https://www.iheart.com/podcast/269-this-is-not-an-ai-art-podc-112887791/)
* [Google Podcasts](https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy9kZWY2YmQwOC9wb2RjYXN0L3Jzcw)

Show notes are always included and include all the visuals, prompts and technique examples, the format is intended to be so that you don't have to be looking at your screen -- but the show notes have all the imagery and prompts and details on the processes we look at.

## PSA: Don't get ripped off.

I saw a [Reddit post](https://www.reddit.com/r/StableDiffusion/comments/14jpth9/hiring_multiple_artists/) asking people to do "a sample for a try out" -- don't do this stuff, this is how graphic design scammers work.

If you're new to the scene -- the way it works is, you have a portfolio. People choose to hire you. Typically, you ask for a deposit first.

Also -- don't do work for exposure, do it for money (usually.)

## News

### SDXL is on the horizon!
* https://stability.ai/blog/stable-diffusion-xl-beta-available-for-api-customers-and-dreamstudio-users
* https://stability.ai/blog/sdxl-09-stable-diffusion

You can check out the examples @ https://clipdrop.co/stable-diffusion

Lots of hands! Left = beta, Right = SDXL 0.9

![](https://images.squarespace-cdn.com/content/v1/6213c340453c3f502425776e/b22965ee-fcea-4a9d-938b-d369a3b0829c/Stability+AI+SDXL.9+coffee.png?format=1500w)

Word on the street..

* [Might be difficult to train on consumer hardware](https://www.reddit.com/r/StableDiffusion/comments/14igpa0/a_report_of_trainingtuning_sdxl_architecture/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button)
* [Other folks are saying that Stability claims you can train on a 4090](https://www.reddit.com/r/StableDiffusion/comments/14hw20z/it_will_be_an_absolute_madness_when_sdxl_becomes/jpd80eo?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button)

At any rate, it will for sure take more hardware.

And someone saying: "We need training docs" -- and boy do we ever.

### Mainframe vs. PC

Something happening since the late 70s and continues today...

Mainframe vs. Personal computer

There's trade-offs all around. Cloud computing is like mainframe computing.

I think we're going to see a lot of uses on both sides.

### Nvidia released a bunch of papers

[From Youtube](https://www.youtube.com/watch?v=lPZk4ZvPiY4)

## Theme of the week: Shape the box.

> You have to shape the box before you roll the dice inside the box, Otherwise the box covers too much space and the dice could land on anything at all

From my friend who I'm calling "The Psychedelic Prof"

## Model Madness

### Analog Madness v5

https://civitai.com/models/8030/analog-madness-realistic-model

This is an A+. Even better than expected, and I use this model daily.
```
1920s flapper, RAW photo, 4k, UHD, analog style, film grain, depth of field, bokeh, fujifilm XT3
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, UnrealisticDream, (skinny:1.2), (drawing, anime, render)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 5, Seed: 4059780811, Face restoration: CodeFormer, Size: 512x680, Model hash: f968fc436a, Model: analogMadness_v50, Denoising strength: 0.5, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/S19U_zLd2.jpg)

```
1990s raver chick from upstate NY at a rave, RAW photo, 4k, UHD, analog style, film grain, depth of field, bokeh, fujifilm XT3
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, UnrealisticDream, (skinny:1.2), (drawing, anime, render)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 5, Seed: 1484487730, Face restoration: CodeFormer, Size: 512x680, Model hash: f968fc436a, Model: analogMadness_v50, Denoising strength: 0.5, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/rJI7FzI_h.jpg)

### DuchaitenJourney v5

Definitely high grade model. Got some great results out of it.

I've omitted the ravers because this model gets 🌶️🌶️🌶️🌶️/5 on the spiciness scale. It thinks ravers go to the show naked, or in underwear.
* [From Reddit](https://www.reddit.com/r/StableDiffusion/comments/14f40ou/duchaiten_journey_v5_is_now_available_dream_team/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button)
* [On CivitAI](https://civitai.com/models/20261/duchaitenjourney)

```
photorealistic, highest quality, masterpiece, RAW photo, artistic composition, full body photograph of a 1920s flapper in a 1920s style dress, night club, speakeasy, at night, dynamic lighting, pale skin, highly detailed skin, skin texture, skin wrinkles, skin spots, (abundant details, intricate details:1.2), detailed background, subsurface scattering, sharp focus, 8k, highly detailed, UHD, HDR, ttpt-fc, <lora:Elixir:0.25>,<lora:add_detail:0.25>
Negative prompt: (low quality, worst quality), [boring_e621_v4, UnrealisticDream, BadDream, badhandv4, negative_hand-neg, Negfeet-neg, BadNegAnatomyV1-neg], 3D, render, 3D render, low-res, unreal engine, sketch, cartoon, manga, blurry, undetailed, undetailed skin, untextured skin, bad teeth, silicone, fake breasts, hard breasts, bad hands, poorly detailed hands, polydactyl, poorly detailed physiognomy, poorly detailed face, poorly detailed eyes, poorly detailed iris, blurred iris, strabism, bad anatomy, wrong anatomy, muscle stiffness, poorly detailed musculature, poorly detailed body, poorly detailed clothes, poorly detailed background
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1496819259, Face restoration: CodeFormer, Size: 640x848, Model hash: f968fc436a, Model: analogMadness_v50, Denoising strength: 0.46, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/BkvU17Uuh.jpg)

### Elixir LoRA

https://civitai.com/models/78283/elixir-enhancer-lora

This is apparently some "enhancer" type of LoRA that's kind of claimed to sort of "modernize" your model. The author explains that they have some kind of mechanism to mix up some LoRAs that look like obvious enhancements, and somehow like... merge and extract the best.... commonalities?

```
photorealistic, highest quality, masterpiece, RAW photo, artistic composition, full body photograph of a 1920s flapper dressed in a 1920s dress in a speakeasy, at night, dynamic lighting, pale skin, highly detailed skin, skin texture, skin wrinkles, skin spots, (abundant details, intricate details:1.2), detailed background, subsurface scattering, sharp focus, 8k, highly detailed, UHD, HDR, ttpt-fc, <lora:Elixir:0.0>
Negative prompt: (low quality, worst quality), [UnrealisticDream, BadDream, bad-hands-5], 3D, render, 3D render, low-res, unreal engine, sketch, cartoon, manga, blurry, undetailed, undetailed skin, untextured skin, bad teeth, silicone, fake breasts, hard breasts, bad hands, poorly detailed hands, polydactyl, poorly detailed physiognomy, poorly detailed face, poorly detailed eyes, poorly detailed iris, blurred iris, strabism, bad anatomy, wrong anatomy, muscle stiffness, poorly detailed musculature, poorly detailed body, poorly detailed clothes, poorly detailed background
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 100, Face restoration: CodeFormer, Size: 576x768, Model hash: f968fc436a, Model: analogMadness_v50
```

![](https://hackmd.io/_uploads/rkXjvlw_n.jpg)

### NextPhoto 2.0

I believe it's a merge.

* [on Reddit](https://www.reddit.com/r/StableDiffusion/comments/14jf57r/nextphoto_v20_is_fantastic/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button)
* [on CivitAI](https://civitai.com/models/84335/nextphoto)

My results are "passable" -- I was drawn to it because of the landscape examples, which I would say are rather good. So I'll probably give it a try for those here and again.

The flappers, while good... are only so-so flapperish, and the output is OK, but not mind blowing.
```
a perfect photo of a 1920s flapper in a speakeasy at night with an oversized moon, high-quality digital art, trending on Artstation
Negative prompt: (worst quality:0.8), cartoon, halftone print, (cinematic:1.2), (surreal:0.8), (modernism:0.8), (art deco:0.8), (art nouveau:0.8)
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 790463415, Face restoration: CodeFormer, Size: 576x768, Model hash: 3166f786a0, Model: nextphoto_v20, Denoising strength: 0.43, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/SJR0VvPuh.jpg)

```
a perfect photo of the forgotten abandoned mill by the creek super moon, golden hour, perfect lighting, 8 k high detail, masterpiece, trending on artstation
Negative prompt: (worst quality:0.8), cartoon, halftone print, (cinematic:1.2), (verybadimagenegative_v1.3, easynegative:0.7), (surreal:0.8), (modernism:0.8), (art deco:0.8), (art nouveau:0.8)
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2183069153, Face restoration: CodeFormer, Size: 768x576, Model hash: 3166f786a0, Model: nextphoto_v20, Denoising strength: 0.43, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/B1Ew4PwOn.jpg)

### Additional resources

Turn any model into an inpainting version of that model

https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/How-to-make-your-own-Inpainting-model

## Technique of the week: Get the hairstyle you want.

Inspired by [this reddit post](https://docs.google.com/document/d/1VKdVdpzXFsOCVY1JVdodpVnMBJLiNd7njh4ZAPVgDQk/edit)

First you gotta figure out one thing: Are you doing this once, or a bunch?

* If you're doing this a ton: Consider training a LoRA
* If you're doing this as a one-off: Just photo bash it.

I'd definitely recommend you DON'T get hung up on prompting for every last detail. Get the rest of the scene. If a detail is important enough, then you want to take another approach to get it just right.
This person was looking for a "crown braid" which is a hairstyle. I've seen it but I don't think I would've recognized it.

So let's see how well prompting for it goes...

Not bad! Crown overpowers it. I kinda get a crown braid. But, it's going to need a fix anyway.

```
portrait of a victorian woman with a crown braid hairstyle, RAW photo, analog style, 4k, UHD, Fujifilm XT3, depth of field, (bokeh:0.7)
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, (skinny:1.2),
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3470096931, Face restoration: CodeFormer, Size: 576x768, Model hash: f968fc436a, Model: analogMadness_v50, Denoising strength: 0.43, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/ryGjvvv_h.jpg)

So let's generate a starting point...

We can see this isn't totally what we want. And the emphasis is the hair, so it definitely needs work.

```
profile view portrait of a victorian woman with a crown braid hairstyle, hairstyle magazine style, luscious garden, RAW photo, analog style, 4k, UHD, Fujifilm XT3, depth of field, (bokeh:0.7) AS-MidAged
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, (skinny:1.2),
Steps: 60, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1875539628, Face restoration: CodeFormer, Size: 576x768, Model hash: f968fc436a, Model: analogMadness_v50, Denoising strength: 0.52, Hires upscale: 1.5, Hires steps: 10, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/B1_qYPP_3.jpg)

So I tried to find one on Pexels, but I couldn't find one, so I borrowed some hair from elsewhere... then I bashed together this:

I'm not too worried about making it perfect. Also I plan to lose some other facial details, or have them change.

![](https://hackmd.io/_uploads/HkW09wD_3.jpg)

Now I just inpaint the area, and play with denoising, I wound up at about `0.38` and I actually wound up turning off facial correction.
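
If you'd rather script that inpaint pass than click through a UI, here's a minimal sketch of what the request body could look like. Big assumptions: it targets the AUTOMATIC1111 web UI started with `--api` and its `/sdapi/v1/img2img` endpoint, and these notes don't say which tool was actually used for the inpaint. The helper name `build_inpaint_payload`, the placeholder bytes, and the prompt text are made up for illustration, and the field names are worth checking against the `/docs` page of your own install. Only the `0.38` denoising value and the face-restoration-off setting come from the notes above.

```python
import base64
import json


def build_inpaint_payload(image_bytes, mask_bytes, prompt, denoising_strength=0.38):
    """Return a JSON-serializable body for an img2img inpainting request.

    Field names follow the AUTOMATIC1111 web UI API as an assumption;
    verify them against your own install before relying on this.
    """
    def encode(raw):
        # The API takes images as base64-encoded strings.
        return base64.b64encode(raw).decode("ascii")

    return {
        "init_images": [encode(image_bytes)],  # the photo-bashed image
        "mask": encode(mask_bytes),            # by default, white marks the area to repaint
        "prompt": prompt,
        "denoising_strength": denoising_strength,
        "restore_faces": False,                # face correction turned off
        "inpainting_fill": 1,                  # 1 = start from the original pixels
    }


# Placeholder bytes stand in for real image files read from disk.
payload = build_inpaint_payload(b"bashed-image", b"hair-mask", "crown braid hairstyle")
body = json.dumps(payload)  # this string is what you'd POST to the endpoint
```

To actually run it, you'd POST `body` with whatever HTTP client you like, then repeat with a few denoising values on either side of `0.38` and compare the results.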
![](https://hackmd.io/_uploads/rJwAswDu3.jpg)

## Update on my project

Some initial outputs for a Victorian fashion based LoRA

About 600 images for about 90k steps over 8 epochs.

![](https://hackmd.io/_uploads/SJtn8zU_n.jpg)

![](https://hackmd.io/_uploads/HJ8QPM8_n.jpg)

![](https://hackmd.io/_uploads/rykb_fUO2.jpg)
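
For a feel of what the training numbers in this project update imply, here's a quick back-of-the-envelope sketch. The only figures from the notes are roughly 600 images, roughly 90k steps, and 8 epochs. The trainer, batch size, and per-image repeat count aren't stated, so the `repeats` and `batch_size` parameters below are hypothetical, and the step formula is the common "images x repeats x epochs / batch size" convention rather than anything confirmed about this run.

```python
def steps_per_image_per_epoch(total_steps, num_images, epochs):
    """How many training steps each image accounts for in one epoch."""
    return total_steps / (num_images * epochs)


def total_steps(num_images, repeats, epochs, batch_size):
    """Common convention: every image is seen `repeats` times per epoch."""
    return (num_images * repeats * epochs) // batch_size


# Figures from the notes: ~600 images, ~90k steps, 8 epochs.
ratio = steps_per_image_per_epoch(90_000, 600, 8)
print(ratio)  # 18.75

# Hypothetical settings that land near 90k: 19 repeats at batch size 1.
print(total_steps(600, repeats=19, epochs=8, batch_size=1))  # 91200
```

So each image works out to somewhere around 19 steps per epoch, which you could reach with many repeats at a small batch size or with some other combination entirely.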