This is not an AI Art Podcast (Ep. 15)

![pod logo](https://i.imgur.com/SlYH9da.png =600x408)

## Intro

Welcome to episode fifteen! This is your host, Doug Smith. This is Not An AI Art Podcast is a podcast about, well, AI ART – technology, community, and techniques. The focus is on Stable Diffusion, but all art tools are up for grabs, from the pencil on up, including pay-to-play tools like Midjourney. Less philosophy – more tire kicking. But if the philosophy gets in the way, we'll cover it. And plenty of art theory!

Today we've got:

* No Model madness this week. Just news!
* Bloods and crits: Toy edition
* A little prompt engineering: Using the `BREAK` keyword
* Technique of the week: Roop for face swap
* My project update: Busted environment but good interactions

Bunch of news, a PSA, but no art crits -- I'm late to record, and I was out camping all weekend, and while it was glorious, I am now behind on all my hustles!

Available on:

* [Spotify](https://open.spotify.com/show/4RxBUvcx71dnOr1e1oYmvV)
* [iHeartRadio](https://www.iheart.com/podcast/269-this-is-not-an-ai-art-podc-112887791/)
* [Google Podcasts](https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy9kZWY2YmQwOC9wb2RjYXN0L3Jzcw)

Show notes are always included, with all the visuals, prompts, and technique examples. The format is intended so that you don't have to be looking at your screen, but the show notes have all the imagery, prompts, and details on the processes we look at.

## News

SDXL is out! If you haven't heard, you must be living under a rock! But I'm glad if you heard it from me first.

Good things:

* Training looks to be really easy, effective, and flexible.
* Initial results look really good

Not-so-good things:

* Just like SD 2.X, people are complaining about a censored training dataset, boo
* No ControlNet
* No adetailer (granted, I don't use it)

Did you know that SD 1.5 was actually a leak? It wasn't supposed to be released publicly. Maybe we have to hope for the same, or hope for community training.
I don't even really care "that much" about NSFW. But I do believe that, just like for an art student who's taken a figure drawing class, it's a big help.

### SDXL resources

* [Cool article on a 30 minute training for SDXL, reddit](https://www.reddit.com/r/StableDiffusion/comments/15cbky5/sdxl_lora_30min_training_time_far_more_versatile/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1)
* [4k artists reference, on reddit](https://www.reddit.com/r/StableDiffusion/comments/15h0ypc/sdxl_10_artist_study_4000_artists/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1)
* [Tilt shift prompt templates](https://www.reddit.com/r/StableDiffusion/comments/15caius/sdxl_10_some_tiltshift_miniature_villages/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1)
* [Rabbits XL Reference](https://clio.so/rabbitsxl)
    * It's like... artist references but everything's a rabbit lol

## SDXL Install

Download the model and offset noise LoRA: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main

And the refiner model: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/tree/main

...into your `models/Stable-diffusion` folder.

And then I did a:

```
git fetch --all
git checkout v1.5.1
```

and then ran webui-user, and... moments later I was doing this:

I didn't particularly do anything fancy, just set the size to 1024x1024

```
a 1970s polaroid photograph
Negative prompt: bad quality, low resolution, blurry, render
Steps: 30, Sampler: DPM++ 2M SDE Karras, CFG scale: 7, Seed: 855725717, Size: 1024x1024, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Version: v1.5.1
```

![](https://hackmd.io/_uploads/SJCm6g7sh.jpg)

This was the first one that came out and I kinda love it.

![](https://hackmd.io/_uploads/HyRgXWms3.jpg)

## Bloods and crits: Toy edition

I think the Barbie movie is putting toys on people's minds.
Glad I got some toys out of the way in my project before they blew up. I didn't even know there was going to be a Barbie movie.

You almost always get more "lucky hands" with toys! It helps mask what's going on with hands, and tends to get simpler hand positions.

### SDXL untitled: The toy boat

* [on reddit](https://www.reddit.com/r/StableDiffusion/comments/15cwunm/sdxl_9/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1)

![](https://preview.redd.it/goktyxyknxeb1.png?width=960&crop=smart&auto=webp&s=1aa87abe944d8d52feef9d5e40d0a4bc975f6f51)

It's fun, which adds to the narrative. I love generations of toys. Rendering is phenomenal.

Centered subject is... not helping. Could add excitement by making the boat smaller and putting something big next to it to make it look even tinier.

### Pin up doll

* [on reddit](https://www.reddit.com/r/tasteful_diffusion/comments/15a3lwu/from_the_pinup_doll_collection/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1)

![](https://preview.redd.it/7ttwkgy4saeb1.jpg?width=640&crop=smart&auto=webp&s=31bc6bc7f9a8d2093d6137d52fe7cde2df5a29b3)

This is an artist I've been seeing around, /u/missesfit, who does pin up art and has a pin up subreddit: [/r/tasteful_diffusion](https://www.reddit.com/r/tasteful_diffusion/)

Again, I really like the toy. The render is great, and they also picked a great generation. The choices for the doll really do scream "pin up" and it's quite effective. I've seen other works from the artist and it fits what I think they're going for, so I think it's really working in terms of their project, from what I know.

The toy is somewhat centered, but not really. And that's helping. The figure goes off the frame and that creates a lot of negative space. It's working like that. It adds to the visual flow and moves your eye around the piece. That's working so well.
There's just another easy thing to do with the narrative to push the concept of the toy even further... it's part-to-whole relationships. Get something going to give a sense of scale. Show there's a whole here, a whole world that's more than the toy. The toy is part of the whole. But what's the whole?

Granted, there's a lot of economy here. "Less is more" and that's working too. So I can see why maybe you wouldn't choose to go busy here. But you could do something subtle for the part-to-whole relationship.

## Break keyword

From [this reddit post](https://www.reddit.com/r/StableDiffusion/comments/15bty86/prompt_trick_for_more_consistent_results_in/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=1)

```
1920s flapper, looking into the distance, sly smile BREAK wearing earrings, lots of jewelry BREAK in a speakeasy BREAK analog style, Nikon Z 85mm camera RAW, (best quality:1.2), (masterpiece:1.2), award winning glamour photograph, (realistic:1.2), (intricately detailed:1.1), by camille souter, saturated colors, cinematic, warm dramatic sidelight, bloom, bokeh, blurry background, depth-of-field
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, (skinny:1.2)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3284744675, Size: 432x648, Model hash: 47170319ea, Model: juggernaut_final, Denoising strength: 0.52, Hires upscale: 2, Hires upscaler: Latent
Used embeddings: bad_prompt [1d99], Asian-Less-Neg [f94a], bad-hands-5 [10ca], BadDream [48d0]
```

![](https://hackmd.io/_uploads/ByxXPd_sh.jpg)

```
1920s flapper, looking into the distance, sly smile, wearing earrings, lots of jewelry, in a speakeasy, analog style, Nikon Z 85mm camera RAW, (best quality:1.2), (masterpiece:1.2), award winning glamour photograph, (realistic:1.2), (intricately detailed:1.1), by camille souter, saturated colors, cinematic, warm dramatic sidelight, bloom, bokeh, blurry background, depth-of-field
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, (skinny:1.2)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 4215061689, Size: 432x648, Model hash: 47170319ea, Model: juggernaut_final, Denoising strength: 0.52, Hires upscale: 2, Hires upscaler: Latent
```

![](https://hackmd.io/_uploads/HyCODudin.jpg)

## Roop -- face swap!

The install didn't go super smoothly...

There's an issue for vlad about it @ https://github.com/vladmandic/automatic/issues/1499

So I tried an install from URL from: https://github.com/Gourieff/sd-webui-roop-nsfw as suggested in the issue.

Didn't work at first, so I tried updating my vlad -- I had a commit from April (damn that's kinda old now!), so I just yolo'd a `git pull` from master.

Didn't work on master either -- and shame on me for yolo'ing, my image browser tab is broken. Ugh.

So, I'm trying automatic1111 instead.

Turned out I needed to install the [MS visualstudio tools](https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=Community&channel=Release&version=VS2022&source=VSLandingPage&cid=2030&workload=dotnet-dotnetwebcloud&passive=false#dotnet)

Which is like 9 gig of dev tools, ridiculous! lol.

And a [youtube reference at the right timestamp](https://youtu.be/TMBkrLd-7Q8?t=89)

And then I did:

```
pip install insightface==0.7.3
```

### On to the rooping!

So I want to try this to use for consistent characters for my project. I might even try with some generations for a specific...

But, one real person I want to use is an author. So I'm using a photo of her as a reference for roop.

The tough thing is there aren't a ton of photos of her. There are some, but not really enough quality images to train a LoRA. Maybe if I tried, but roop seems like a shortcut.

So I took one that I liked, actually it was from a stock photo site, and what I did was use inpainting (and photoshop "remove tool" magic eraser) to erase the watermarks.
Then I upscaled with gigapixel, and I got:

![](https://hackmd.io/_uploads/B1QXouuih.png)

Then I just installed roop from the extensions tab.

I used this prompt:

```
beautiful blond lumberjack woman on a lake in New Hampshire in the year 1 9 8 2, middle aged, portrait photography, RAW photo, (flannel and jeans:1.1), sharp focus, 8k UHD, DSLR, high quality, film grain, Fujifilm XT3
Negative prompt: (bad_prompt_v2:0.8),Asian-Less-Neg,bad-hands-5, BadDream, (skinny:1.2)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 708732137, Size: 512x512, Model hash: f968fc436a, Model: analogMadness_v50, Denoising strength: 0.5, Hires upscale: 2, Hires upscaler: Latent, TI hashes: "bad_prompt: f9dfe1c982e2", Version: v1.5.1
```

![](https://hackmd.io/_uploads/S1K194Kj2.jpg)

Then I enabled roop and recycled the seed...

![](https://hackmd.io/_uploads/SknVq4Fjh.jpg)

And then a 4-pack of 'em to see how it looks.

![](https://hackmd.io/_uploads/HJnp9Etih.jpg)

## My project update: Busted environment but good interactions

Busted my auto1111 environment, and vlad environment. I'm panicking lol.