This is not an AI Art Podcast (Ep. 18)

![pod logo](https://i.imgur.com/SlYH9da.png =600x408)

## Intro

Welcome to episode eighteen! This is your host, Doug Smith. This Is Not an AI Art Podcast is a podcast about, well, AI ART – technology, community, and techniques. With a focus on Stable Diffusion, but all art tools are up for grabs, from the pencil on up, and including pay-to-play tools, like Midjourney. Less philosophy – more tire kicking. But if the philosophy gets in the way, we'll cover it. But plenty of art theory!

Today we've got:

* Model madness: 3 SDXL models
* Technique of the week: Perspective warp & "where's waldo style" image

Available on:

* [Spotify](https://open.spotify.com/show/4RxBUvcx71dnOr1e1oYmvV)
* [iHeartRadio](https://www.iheart.com/podcast/269-this-is-not-an-ai-art-podc-112887791/)
* [Google Podcasts](https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy9kZWY2YmQwOC9wb2RjYXN0L3Jzcw)

Show notes are always included, with all the visuals, prompts, and technique examples. The format is meant to work without you looking at your screen, but the show notes have all the imagery, prompts, and details on the processes we look at.

## News

Hidden text images are taking the reddits by storm! Do you find yourself squinting at all the images to see if there's something hidden? You're not alone.

The gist is that these are created with QR Code Monster, the same thing we were seeing earlier this summer with the QR code generations. QR Code Monster makes it easy to hide an image within a generation, apparently.

If you want, you can [download QR Code Monster on Hugging Face](https://huggingface.co/monster-labs/control_v1p_sd15_qrcode_monster/tree/main). It's used just like any other ControlNet model.

## Model Madness

### Albedo Base XL

* [on Civitai](https://civitai.com/models/140737/albedobase-xl)

Looking pretty good. I'm happy with what I got out of it.

```
1920s flapper in a speakeasy, cinematic still
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2526496416, Size: 1024x1024, Model hash: fddbb9b511, Model: albedobaseXL_v04, Version: v1.5.1
```

![](https://hackmd.io/_uploads/rk76Q6Ela.jpg)

```
1970s discotheque, polaroid photography
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2907653379, Size: 1024x1024, Model hash: fddbb9b511, Model: albedobaseXL_v04, Version: v1.5.1
```

![](https://hackmd.io/_uploads/B1dHNaExa.jpg)

### RealVis XL 2.0

Unsurprisingly for a Realistic Vision model, it's looking really good. It's cool that, according to the model notes, it was trained locally on a GPU in 140 hours. That's really achievable!

* [on Civitai](https://civitai.com/models/139562?modelVersionId=169921)

```
1920s flappers in a speakeasy, cinematic still
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1277344234, Size: 1024x1024, Model hash: 74dda471cc, Model: realvisxlV20_v20Bakedvae, Version: v1.5.1
```

![](https://hackmd.io/_uploads/BJzZHpNeT.jpg)

```
1970s discotheque, polaroid photography
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3077113078, Size: 1024x1024, Model hash: 74dda471cc, Model: realvisxlV20_v20Bakedvae, Version: v1.5.1
```

![](https://hackmd.io/_uploads/S1M6NpVlp.jpg)

### Juggernaut XL

* [on Civitai](https://civitai.com/models/133005/juggernaut-xl)

```
1920s flappers in a speakeasy, beautiful lady, (freckles), big smile, blue eyes, hyperdetailed photography, soft light, head and shoulders portrait, cover
Negative prompt: (worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur:1.3), (3D ,3D Game, 3D Game Scene, 3D Character:1.1), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3386825384, Size: 1024x1024, Model hash: 70229e1d56, Model: juggernautXL_version5, Version: v1.5.1
```

![](https://hackmd.io/_uploads/H1CWU6Eg6.jpg)

```
1920s flappers in a speakeasy, color cinematic still
Negative prompt: (worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur:1.3), (3D ,3D Game, 3D Game Scene, 3D Character:1.1), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1539299730, Size: 1024x1024, Model hash: 70229e1d56, Model: juggernautXL_version5, Version: v1.5.1
```

![](https://hackmd.io/_uploads/Skdh8aVlp.jpg)

```
1970s discotheque, polaroid photography
Negative prompt: (worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur:1.3), (3D ,3D Game, 3D Game Scene, 3D Character:1.1), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3534473958, Size: 1024x1024, Model hash: 70229e1d56, Model: juggernautXL_version5, Version: v1.5.1
```

![](https://hackmd.io/_uploads/SJ1p8p4xT.jpg)

## Refactoring an object with perspective warp

So, related to my project, I saw someone post about a newly found artifact that belonged to a historical figure I'm interested in. It's a chest from an old-school photographer, from the mid-to-late 1800s.

![](https://hackmd.io/_uploads/r1pWwlXxT.jpg)

I wanted to use it in a piece, so I came up with this generation from Midjourney, with this prompt:

```
A beautiful victorian styled woman sits on a photographers luggage case, a photoshoot in the year 1880, a lake in New Hampshire, color photography by Meret Oppenheim, inspired by Michael Breitung
```

![](https://hackmd.io/_uploads/ByxLwlmgp.jpg)

Now I want to replace the chest. But... I have the problem that the perspective is TOTALLY different.

So I cut it out and used the "perspective warp" feature in Photoshop. (Here's a more detailed [YouTube tutorial on perspective warp](https://www.youtube.com/watch?v=STCI_sgjVOo).)

![](https://hackmd.io/_uploads/rkj5vgQlT.jpg)

Then I fit it into place, cut it further, painted a little, changed brightness/contrast, and generally painted some shadow areas.

![](https://hackmd.io/_uploads/HyPbue7xp.jpg)

Then, I inpainted...

Things I fixed up:

* The chest (of course)
* The face, inpainted "at full resolution"
* I also moved her behind down a little so she sits more nicely on the chest

I made both a color version and a sepia version, and used the sepia as the final. Kinda wish I'd made another pass at the chest, but, yeah.

![](https://hackmd.io/_uploads/HJxntx7lT.jpg)

![](https://hackmd.io/_uploads/rylhtxml6.jpg)

## Where's Waldo imitation

So I saw this post on Reddit asking "Can you make a Where's Waldo style illustration with tons of detail at a print scale?"

I think it got it generally! At least...

This final result would wind up being:

```
4096 x 3264 ≈ 10.9x13.7" @ 300 dpi
```

Things

### Character sheets

Initial prompt...

```
group of women in futuristic space suits character sheet, illustration, isometric, packed with hidden details, intricate details <lora:add_detail:0.4>
```

I had kinda better luck without `isometric`:

```
group of men in futuristic space suits character sheet, detailed face, full body, boots, illustration, packed with hidden details, intricate details <lora:add_detail:0.4>
```

Examples:

![](https://hackmd.io/_uploads/rkqTH_mgT.jpg)

![](https://hackmd.io/_uploads/ry5aBO7xp.jpg)

### Initial Generation

I worked on an initial generation...

First, I interrogated a Where's Waldo image from the web. Then I used img2img at around `0.60` denoising strength with a Where's Waldo image as the input.

```
people inside a space station, find the hidden object, where's waldo, illustration, isometric, packed with hidden details, intricate details <lora:add_detail:0.4>
```

I wound up with an initial generation. I then upscaled it with the Ultimate SD Upscale script, using ControlNet tile.

I wound up with a 4096-pixel-wide image, but this is a JPEG'y general gist of what I wound up with...

![](https://hackmd.io/_uploads/SkNjwqNla.jpg)

I later took it into Photoshop and fixed colors and levels.

### Photobashing the characters (and other stuff)

And then I photobashed them to a scale I picked...

![](https://hackmd.io/_uploads/S1UNUY7g6.jpg)

And inpainted...

![](https://hackmd.io/_uploads/HyBEIKQxa.jpg)

It starts to get fun because you can add your own details. I was like, "let's put some monster floating in a specimen jar."

So I started with a literal specimen jar...

![](https://hackmd.io/_uploads/rkjPKtmea.jpg)

Then I inpainted it...

![](https://hackmd.io/_uploads/r1oPtF7ep.jpg)

## The final!

Right click and open it in a new tab so you can see it in full res.

![](https://i.imgur.com/qMuthl4.jpg)
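
A quick footnote on the print-size math from the Where's Waldo section: print size in inches is just pixels divided by dpi. Here's a minimal Python sketch of that arithmetic. The function names `print_size_inches` and `pixels_for_print` are made up for illustration and aren't part of any tool used in this episode.

```python
def print_size_inches(width_px, height_px, dpi=300):
    """Return (width, height) in inches for an image printed at the given dpi."""
    return width_px / dpi, height_px / dpi


def pixels_for_print(width_in, height_in, dpi=300):
    """Return the (width, height) in pixels needed to print at the given size and dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# The final image from these notes: 4096 x 3264 pixels at 300 dpi.
width_in, height_in = print_size_inches(4096, 3264)
print(f"{width_in:.2f} x {height_in:.2f} inches")

# Going the other way: pixels needed for a 13 x 10 inch print at 300 dpi.
target_px = pixels_for_print(13, 10)
print(target_px)
```

At 300 dpi the 4096 x 3264 image works out to roughly 13.65 x 10.88 inches, so it has a little headroom over a 13 x 10 inch print, which would need 3900 x 3000 pixels.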