This is not an AI Art Podcast (Ep. 2)

# This is not an AI Art Podcast (Ep. 2)

![pod logo](https://i.imgur.com/SlYH9da.png =600x408)

## Intro

Welcome to episode two! This is your host, Doug Smith. This is Not An AI Art Podcast is a podcast about, well, AI ART – technology, community, and techniques. The focus is on Stable Diffusion, but all art tools are up for grabs, from the pencil on up, including pay-to-play tools like Midjourney. Less philosophy – more tire kicking. But if the philosophy gets in the way, we'll cover it. And there's plenty of art theory!

Today we've got:

* Model madness model reviews: 3 models and a LoRA, including ControlNet 1.1
* In a segment I'm calling "Bloods and crits": Art critique on 4 pieces, including one of my own to keep it fair
* Technique of the week: one quick traditional art technique, and we're going to deconstruct a piece of art and build our own workflow to imitate it, as a study
* My project update: so you can learn from my process

One theme we're going to keep in mind today is "audience" -- how your artwork plays to the audience that consumes it.

The format is intended for listening when you're on the go, but pick up the show notes later to take a look at the visuals.

## Model Madness

### iCoMix -- Comics Model

Version 4 just released this past week!

https://civitai.com/models/16164?modelVersionId=43844

Two prompts, one for gangsters and one for flappers. Honestly, the flappers are spicy. I found this while helping my wife with a project, and she wanted to do superheroes. BUT! The audience was kind of "rated G" and the spice was too much. I spent all my time painting clothes back on the figures!
Gangsters:

```
1920s mafia men meeting in a new york city speakeasy, 1920s, sly smile, night, 4k, uhd, masterpiece, style of Kelly Sue Deconnick
Negative prompt: bad_prompt_version2:0.8, ((((big hands, un-detailed skin, semi-realistic, cgi, 3d, render, sketch, cartoon, anime)))), (((ugly mouth, ugly eyes, missing teeth, crooked teeth, close up, cropped, out of frame))), worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: -1, Face restoration: CodeFormer, Size: 512x512, Model hash: ca20c01d0a, Model: icomix_V04, Denoising strength: 0.58, Hires upscale: 1.5, Hires upscaler: Latent
```

Flappers:

```
1920s flapper in a new york city speakeasy, 1920s, sly smile, night, 4k, uhd, masterpiece, style of Kelly Sue Deconnick
Negative prompt: bad_prompt_version2:0.8, ((((big hands, un-detailed skin, semi-realistic, cgi, 3d, render, sketch, cartoon, anime)))), (((ugly mouth, ugly eyes, missing teeth, crooked teeth, close up, cropped, out of frame))), worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, ((big boobs, skinny))
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: -1, Face restoration: CodeFormer, Size: 512x512, Model hash: ca20c01d0a, Model: icomix_V04, Denoising strength: 0.58, Hires upscale: 1.5, Hires upscaler: Latent
```
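If you want to compare settings across a bunch of generations, the settings at the end of each prompt block (Steps, Sampler, Size, and so on) are easy to pull apart with a few lines of Python. This is just a rough sketch, not anything official: `parse_settings` and `hires_size` are names I made up, it assumes no setting value contains a comma, and it assumes the final size is simply the base size multiplied by the "Hires upscale" factor.

```python
def parse_settings(line: str) -> dict[str, str]:
    """Split a 'Key: value, Key: value' settings line into a dict.

    Hypothetical helper; assumes no value contains ', ' itself.
    """
    settings = {}
    for entry in line.split(", "):
        key, sep, value = entry.partition(": ")
        if sep:  # skip anything that is not a 'Key: value' pair
            settings[key.strip()] = value.strip()
    return settings


def hires_size(settings: dict[str, str]) -> tuple[int, int]:
    """Estimate the output size after the hires pass.

    Assumption: output = base size x 'Hires upscale' (1.0 if absent).
    """
    width, height = (int(n) for n in settings["Size"].split("x"))
    scale = float(settings.get("Hires upscale", "1"))
    return round(width * scale), round(height * scale)


# Settings line copied from the gangsters prompt above.
EXAMPLE = (
    "Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: -1, "
    "Face restoration: CodeFormer, Size: 512x512, Model hash: ca20c01d0a, "
    "Model: icomix_V04, Denoising strength: 0.58, Hires upscale: 1.5, "
    "Hires upscaler: Latent"
)

settings = parse_settings(EXAMPLE)
print(settings["Sampler"])   # DPM++ 2M Karras
print(hires_size(settings))  # (768, 768)
```

So under that assumption, the 512x512 generations above come out at 768x768 after the 1.5x hires pass.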
![](https://i.imgur.com/q7AUtRr.jpg)

![](https://i.imgur.com/YX9CYKj.jpg)

### OpenJourney

It's not new, but you should try it if you haven't. It's a model based on output from Midjourney v4. I'm really stoked on the output I got from it, especially with some artist names. I wonder if we got some secondhand training on artists from rehashes of their work? It should be interesting to see one trained on v5 output, which is really quite good. There's a similar idea in language models: Vicuna (https://vicuna.lmsys.org/), a large language model (LLM) that's trained on user-shared ChatGPT conversations, also does a fairly good job.

https://huggingface.co/prompthero/openjourney

```
portrait of a 1920s flapper in a night club, night, dusk, 4k, uhd, masterpiece, style of william etty
Negative prompt: bad_prompt_version2:0.8, ((((big hands, un-detailed skin, semi-realistic, cgi, 3d, render, sketch, cartoon, anime)))), (((ugly mouth, ugly eyes, missing teeth, crooked teeth, close up, cropped, out of frame))), worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: -1, Face restoration: CodeFormer, Size: 512x512, Model hash: aba96b389d, Model: mdjrny-v4, Denoising strength: 0.58, Hires upscale: 1.5, Hires upscaler: Latent
```

*NOTE*: We'll use (more or less) the same negative prompt the rest of the time, so let's just leave it at that.

![](https://i.imgur.com/cCoyrye.jpg)

![](https://i.imgur.com/tPxmYoN.jpg)

### Reflections LoRA

https://civitai.com/models/43671?modelVersionId=48310

I got some great results with it! Really interesting outcome, and it looks like it's thoughtfully trained, based on the description.
```
RAW photo, Reflections of a railroad bridge on the Hudson river in NYC, at dawn, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3 <lora:reflectionsOnWater_v10:1>
Negative prompt: bad_prompt_version2:0.8, ((((big hands, un-detailed skin, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime)))), (((ugly mouth, ugly eyes, missing teeth, crooked teeth, close up, cropped, out of frame))), worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: -1, Face restoration: CodeFormer, Size: 768x768, Model hash: e6415c4892, Model: realisticVisionV20_v20
```

With LoRA:

![](https://i.imgur.com/KO2YDsg.jpg)

Without LoRA:

![](https://i.imgur.com/eHXx8xr.jpg)

### ControlNet v1.1

I took a public domain illustration from an old book and used it as a control net. Let's start with the result from v1.

Here's the control net image...
![](https://i.imgur.com/VMPFTGm.png)

```
man fishing from a canoe with another man paddling, lake, forest, New England, masterpiece, fine details, style of agnes lawrence pelton
Negative prompt: ((((big hands, un-detailed skin, semi-realistic, cgi, 3d, render, cartoon, anime)))), (((ugly mouth, ugly eyes, missing teeth, crooked teeth, close up, cropped, out of frame))), worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2927671094, Face restoration: CodeFormer, Size: 800x488, Model hash: a60cfaa90d, Model: dreamshaper_5BakedVae, ControlNet Enabled: True, ControlNet Module: canny, ControlNet Model: controlnetPreTrained_cannyV10 [e3fe7712], ControlNet Weight: 1, ControlNet Guidance Start: 0, ControlNet Guidance End: 1
```

![](https://i.imgur.com/IKG4QLv.png)

And then, with v1.1, also using canny and the same prompt...

![](https://i.imgur.com/lgfhELC.png)

I think it does do a better job: the positioning of the hands and arms is much better.

And I wound up taking it to a (more-or-less) final piece, as well.

![](https://i.imgur.com/XADXyqK.jpg)

(Primary techniques: I inpainted many portions after the initial control net run, mostly working on lots of pieces individually, like each person and their faces, with lots of corrections in a raster editor.)

## Resources

Big collection of prompts to search through.

https://unprompt.ai/

Artist reference

https://www.seedscienceai.com/

If you're not familiar with it, the Midjourney Style blog by C Kovalev is worth a look even if you're not a Midjourney user.

https://ckovalev.com/midjourney-ai/styles

## Bloods & Crits

I'm going to do some crits.
First things first -- I chose these because I like them. I'm not going to pick any pieces that I don't absolutely dig.

There's a lot of art that reminds me of the 15th-16th century portraits at the Metropolitan Museum of Art. Lots of the same kind of... same looking people, same looking expression, same kind of art style. It gets humdrum. I can remember visiting the Met when I was 15 and getting bored in the, like, 1400s portraiture, which I actually like more now. I wanted to see Dadaism, give me a statue with a baguette on its head, like I saw at MoMA. Or the smashed piano installation at the Whitney.

Not to mention the 1 million and one pop culture and political works. They bore me. I'm probably not the right audience for them, but still, I don't like them. So I'm picking those that stand out.

So if I chose one of your pieces to critique, it's because I like it and I want to learn more about my own process, and help other people's process too.

I'm trying to pick ones that have a workflow, but I'll pick some that don't. I think it shows a lack of confidence when people don't give a general outline. You have skills that are just from you; sharing the technique doesn't take away from what you have, it builds upon it.

### Experimenting with stone

[On Reddit](https://www.reddit.com/r/StableDiffusion/comments/12nyn64/experimenting_with_stone/)

Love the concept! And the textures are amazing. I want to touch my screen.

The compositions are good. But as much as the concept pushes the narrative, there's still a kind of typical portraiture vibe, which could probably be improved. But the quality is outstanding.

It's a shame the workflow isn't shared, not even at a high level.

Above all, I love the repetition of form. That is working SO well. I'd also love to see this with more of a part-to-whole relationship, like... rocks::mountains, rocks::cliffs, etc.
![](https://i.imgur.com/zowydMk.jpg)

### AI inside your computer

[On Reddit](https://www.reddit.com/r/StableDiffusion/comments/12nopay/artificial_intelligence_in_your_pc/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button)

Love the meta value. We get a really amazing relevance to the audience. It plays well to /r/StableDiffusion, and audience is an important thing to consider in your artwork.

The composition could use some work. The fact that it's all inside the frame could be improved, maybe even just with a crop in this case. However, I get that keeping it all in frame makes for a more iconic PC case.

The depth of field looks great in this piece. I really like it: it builds a lot of depth, and it's got a really good feeling of space.

![](https://i.redd.it/8hd6w3k365ua1.png)

### "Mister Carter, I'm paying you a lot of money, and I expect results."

[On Reddit](https://www.reddit.com/r/StableDiffusion/comments/12ohg6b/mister_carter_im_paying_you_a_lot_of_money_and_i/)

This is my favorite of the week. And it's my favorite for a few reasons: it's got the best composition out of the bunch, and the way the artist created action in the subject is working REALLY well.

VERY strong use of negative space, and it's working really well to create composition and demonstrate form. The color field here works super well. Good movement.

Additionally, the painterly style overall is crazy good, very much gouache (maybe even used in the prompt? I forget).

And their workflow is purely amazing. Love it. They use their drawing skills to build the idea and the composition, and eventually parlay it all into a great narrative. It's not just a pretty girl, even though it is. It's got a mood and a tone, and I can feel the story from this. And I love the title that goes with it, which builds it even more.

It's maybe a little flat in the left shoulder.
I realize that fields of color and cel shading are a thing, so it's likely intentional, but it'd be the thing I'd go after if I were working on it. I sometimes put a finger up over part of a piece to see what looks wrong or flat.

![](https://i.redd.it/uew1n5sc3aua1.png)

### My own piece, based on a hunting guide

On the workflow: I didn't save my progress to share with you, but it's using one of my LoRAs. I thought I had a finished piece, but then I used that piece as a control net and colorized it. Then, I inpainted details.

I'm really happy with the composition.

I think that the dark background mountains aren't ideal. Maybe if they were lighter, it'd give a better overall sense of space.

The gun has some problems on the forestock, in the wood towards the tip of it. That could've probably been fixed.

I really like the pose and the narrative I'm creating though. Her facial expression is maybe a little neutral.

![](https://i.imgur.com/A3sWsdo.jpg)

Update: I took my own crits into account and changed the background to improve the contrast line a bit, and I do think it helps with the depth of the image.

![](https://i.imgur.com/FlJacuj.jpg)

Additionally, you can see that I made the ledge a greyer color, which matches the rock found in the region I'm representing. I did so by creating a mask in my image editor, selecting that area. I then generated images of granite and used them as the source for a "palette match" in my image editor. Finally, I used inpaint-upload to upload the exact mask, and did an inpaint img2img pass with a low denoising amount (0.25).

## Technique of the week

Traditional art: [Gesture drawing -- wikipedia](https://en.wikipedia.org/wiki/Gesture_drawing)

Just try it now and again. Do it on a blank canvas digitally if you have to. But, do it.

Let's reverse engineer an image. We're going to use the "set in stone lady" as an example.

We'll use some MJ tools for reference, but we'll do the work in SD.
First I ran it through this interrogator: https://huggingface.co/spaces/fffiloni/CLIP-Interrogator-2

And then got:

```
a close up of a person wearing a hat on a beach, digital art, inspired by andrey ryabovichev, cgsociety contest winner, fantasy art, yuri shwedoff and tom bagshaw, elle fanning as an android, realistic 8k bernini sculpture, portrait of a norse moon goddess
```

I modified it a bit, and used this prompt:

```
a close up of a person wearing a crown and jewelry on a beach, digital art, inspired by andrey ryabovichev, cgsociety contest winner, fantasy art, yuri shwedoff and tom bagshaw, elle fanning as an android, realistic 8k bernini sculpture, portrait of a norse moon goddess
Negative prompt: ((((big hands, un-detailed skin, semi-realistic, cgi, 3d, render, cartoon, anime)))), (((ugly mouth, ugly eyes, missing teeth, crooked teeth, close up, cropped, out of frame))), worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3099376671, Face restoration: CodeFormer, Size: 768x768, Model hash: a60cfaa90d, Model: dreamshaper_5BakedVae
```

![](https://i.imgur.com/V8QlSrX.jpg)

Then I ran it through Midjourney's `/describe` and got:

```
1️⃣ 3d character shabby chica, in the style of detailed atmospheric portraits, crystalline and geological forms, victorian-inspired illustrations, stone, dark beige and sky-blue, unreal engine 5, delicate landscapes --ar 2:3
2️⃣ a woman wears a hat covered in the stones, in the style of daz3d, bella kotak, anime-inspired characters, reylia slaby, sky-blue, detailed miniatures, jagged edges --ar 2:3
3️⃣ digital illustration of a woman wearing a big hairstyling.jpg, in the style of stone sculptures, john wilhelm, natalia rak, whimsical anime, unreal engine 5, dark beige and sky-blue, detailed miniatures --ar 2:3
4️⃣ a woman holding a big hat, in the style of zbrush, sky-blue and gray, stone sculptures, detailed facial features, cute and dreamy, tinycore, rococo portraitures --ar 2:3
```

Midjourney gets the idea a little bit. I ran #2 through MJ v5 and got:

![](https://i.imgur.com/42LpPRy.png)

It's cool, but it doesn't capture the whole idea. The images are high quality, though.

So, I went and modified my prompt until I had:

```
a close up of a woman wearing a crown and costume jewelry in front of a cliff, (smirk:1.2), looking over shoulder, flowing hair, at twilight, digital art, inspired by andrey ryabovichev, cgsociety contest winner, fantasy art, yuri shwedoff and tom bagshaw, elle fanning as an android, realistic 8k bernini sculpture, portrait of a norse moon goddess
```

(same settings as above otherwise)

And I wound up with:

![](https://i.imgur.com/TdRXBlz.png)

I overpainted it, using a stock photo of some stones...

![](https://i.imgur.com/2sZJH5J.png)

And then I inpainted, with my prompt as:

```
RAW photo, a close up of a woman wearing (detailed small smooth stones:1.3) in front of a cliff, (smirk:1.2), looking over shoulder, flowing hair, at twilight, digital art, inspired by andrey ryabovichev, cgsociety contest winner, fantasy art, yuri shwedoff and tom bagshaw, elle fanning as an android, realistic 8k bernini sculpture, portrait of a norse moon goddess
```

I eventually also added the same image as a control net, and played with the denoising amount.

When I got something I liked, I scaled it up with img2img at 0.25 denoising, then inpainted the crown a few more times, and I wound up with:

![](https://i.imgur.com/RZvhyXB.jpg)

We still have problems with the face, especially the eye, so I generated a bunch of inpaints of the face, and the crown again, and wound up with...
![](https://i.imgur.com/Pm2yGQz.jpg)

I'm not going to take it any further, but at this point, I'd say that we've developed the START of a workflow.

I'm missing the mark in a number of areas. I still suffer from generic portrait syndrome. I didn't get the repetition of form. The crown isn't sitting right on the top of the head (garbage in, garbage out). And the detailing isn't as fine, and the concept isn't as strong.

But! We could work from here. I'd probably start over at this point, since I've got a workflow now.

## Project update

* Touching moments from connecting with my audience
* Still working on my LoRAs, doing some x-y-z scripts
    * https://github.com/xrpgame/xyz_plot_script
    * LoRA training comparison: https://www.youtube.com/watch?v=yogymGvKVG8
* My pieces are taking too long to complete! I want to add more automation and get to a place where I'm using more, like, "instant generated" stuff, and I'm hoping my LoRAs can help me.

## Meta

How the show got its name! https://en.wikipedia.org/wiki/The_Treachery_of_Images

## Ramblings