Intro

Welcome to episode twenty three! This is your host, Doug Smith. This is Not An AI art podcast is a podcast about, well, AI ART – technology, community, and techniques. With a focus on stable diffusion, but all art tools are up for grabs, from the pencil on up, and including pay-to-play tools, like Midjourney. Less philosophy – more tire kicking. But if the philosophy gets in the way, we'll cover it.

But plenty of art theory!

Today we've got:

An interview with NoCo Fever Dreams
News: WoTC under attack for using AI generated art
Then the interview with Mitch
Model madness: 1 model and 2 LoRAs
Technique of the week: Some prompt engineering! JPEG as a negative and LoRA keyword test

Show notes are always included, with all the visuals, prompts and technique examples. The format is intended so that you don't have to be looking at your screen, but the show notes have all the imagery, prompts and details on the processes we look at.

News

WoTC under attack for using AI generated art

https://www.forbes.com/sites/paultassi/2024/01/07/wizards-of-the-coast-apex-legends-under-fire-for-ai-art/

They only used it for promo materials and not the cards.

…I'd be shocked if their artists aren't using it already. There have already been scandals where people were using reference art, and… didn't actually wind up overpainting it, and got busted by the artists.

It's philosophical, but I'm reminded of the Picasso quote:

Good artists copy, great artists steal.
-Picasso

Which I personally interpret to mean: Artists learn from and borrow from one another in order to grow art in general. They learn and use references from one another.

Blatantly taking a copy of something without making it your own is whack – that's like when I complain about thin concepts.

Photobashing is a thing though, and was way before AI art generation. In fact, check out this post where it was banned on a Reddit sub for digital painting, 10 years ago. Which is, I think, myopic, but that's my opinion. And see this article about what photobashing is.

I think a lot of the public overlooks "how the sausage is made" – maybe you don't want to know.

My father is a signmaker, and he originally made all his signs hand carved, but… he's used a CNC machine since 1996 or so. And his stuff is awesome. But he doesn't advertise "I carve these with CNC!" People like to believe their sign was hand carved; he doesn't try to fool them either (and doesn't need to).

Model madness

Midjourney v6

Overall: This is RIDICULOUS. The generations I'm getting are insane.

Everything you want out of midjourney original generations – with no frills. No regional variations, no outpainting, no pan. Just generations.

There's also upscaling, which provides really big images; we're talking the 2400px range.

Some folks have complaints that the hands have regressed from version 5.

Lots of praise for prompt coherence.

Hot take: Midjourney is WAY better than DALL-E, especially for someone who's interested in graphic arts in general and has an ability to iterate. I feel like DALL-E has some signature look that's… Honestly, it's ugly, it's oversaturated. And to me? Prompt coherence is only part of the battle, even if DALL-E is the reigning champion.

the 1920s flappers gather around the table for cocktails in a smoky speakeasy, kodachrome, color photography by Slim Aarons --v 6.0 --ar 5:4

the stunning 1920s flapper in a dynamic pose at the bar of a busy speakeasy, color photography by Paul Outerbridge --v 6.0 --ar 4:5

And the prompt coherence is rather swell.

As an homage to Dali's Retrospective Bust of a Woman (https://www.dalipaintings.com/retrospective-bust-of-a-woman.jsp):

a 1920s flapper with a baguette on her head as a hair accessory, photography by Flora Borsi --v 6.0

This turned out really cool, I was really just messing around, but I love referencing Zena Holloway – worth checking out the artist, too.

Granted, the faces are a little… Ivory-ish, doll-ish? Bad skin tone. But other things are working out.

she's in the deep dark sea, by Zena Holloway --v 6.0 --ar 3:6

Here's one I was really happy with…

Brownie Blinn hides away at his ramshackle wilderness shelter in the woods, in the winter night, 28 year old hermit, in the year 1975, 1970s down parka, adventure film still, stunning composition, bokeh, photography by Ed Ruscha --v 6

Original generation:

This didn't take much editing to get it where I wanted; I didn't even run it through Stable Diffusion in this case. Original generations are getting better all the time.

But still, I edited:

Removed the bokeh flares (photoshop magic eraser)
Levels (photoshop)
Used google photos to depth blur the background (wanted to downplay the unrecognizable objects)
Color popped the subject by laying a greyscale with transparency over the background (photoshop) (to focus your attention on the subject even further)
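
The color pop edit boils down to desaturating the background while leaving the subject alone. Here is a rough sketch of that blend arithmetic in plain Python. It assumes pixels as (r, g, b) tuples and a hypothetical boolean `subject_mask` you would have to supply yourself; it is not what Photoshop does internally, and the function names are made up for illustration.

```python
def luminance(rgb):
    """Approximate perceived brightness using the Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def color_pop(pixels, subject_mask, overlay_opacity=0.6):
    """Blend a greyscale copy over every pixel that is NOT part of the subject.

    pixels          -- list of (r, g, b) tuples, 0-255
    subject_mask    -- list of bools, True where the subject is (hypothetical input)
    overlay_opacity -- 0.0 leaves the background untouched, 1.0 makes it fully grey
    """
    result = []
    for rgb, is_subject in zip(pixels, subject_mask):
        if is_subject:
            result.append(rgb)  # subject keeps its full color
            continue
        grey = luminance(rgb)
        blended = tuple(
            round((1 - overlay_opacity) * channel + overlay_opacity * grey)
            for channel in rgb
        )
        result.append(blended)
    return result
```

The lower the opacity, the more of the original background color survives, so this is a dial rather than an on/off switch.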

Interview with North Country Fever Dream

You can check out @noco_feverdream on Instagram!

I'd like to welcome Mitch Teich, the creator behind the Instagram account NoCo Fever Dreams!

Mitch is a professional broadcaster, sooooo… Enjoy the ride as I stumble and Mitch sounds all awesome.

I met Mitch as a guest on the fantastic public radio broadcast and podcast 'Northwords.' (you can find the episode featuring my project here) Well, today, we're flipping the script. I'm joined by the host of 'Northwords' and the creative mind behind the Instagram account 'noco fever dreams' — a microdose of fever induced psychedelia with absurdist, surreal, and undeniably humorous AI-generated art.

Also, if you like this kinda stuff, check out: https://www.reddit.com/r/weirddalle/

Mitch uses the prompts as the description on Instagram, so…

"Please draw an avant garde magazine ad for a cereal called Neap Tide."

"Please draw a photo of the affable host and crew of a public TV series called 'At the Croatian Snack Farm' preparing to film a scene."

Which includes the world's best hashtag ever: #hypotheticalTV

Also make sure to check out the midlibrary.io article about MJv6

Model madness continued

JibMix XL

From one of their prompt template examples…

Really good skin tone and texture.

cinematic photo women's eyes, cinematic photorealistic, 8k uhd natural lighting, raw, rich, intricate details, key visual, atmospheric lighting, 35mm photograph, film, bokeh, professional, 4k, highly detailed . 35mm photograph, film, bokeh, professional, 4k, highly detailed
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2708949351, Size: 816x1024, Model hash: 3b4501db98, Model: jibMixRealisticXL_v70PromptAdherence, Version: v1.5.1
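
The settings line under each prompt in these notes (Steps, Sampler, CFG scale and so on) is a comma-separated run of "key: value" pairs, with values quoted when they contain commas. If you want to pull those apart programmatically, here is a minimal sketch. The `parse_settings` name and the regex are assumptions made for illustration, not the web UI's own parser.

```python
import re

# Matches "Key: value" pairs; a value is either a double-quoted string
# (which may contain commas) or everything up to the next comma.
PARAM_RE = re.compile(r'\s*([\w ]+):\s*("(?:\\.|[^\\"])*"|[^,]*)(?:,|$)')


def parse_settings(line):
    """Turn a 'Steps: 35, Sampler: ...' line into a dict of strings."""
    settings = {}
    for key, value in PARAM_RE.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            # Strip the outer quotes and unescape any inner \" sequences.
            value = value[1:-1].replace('\\"', '"')
        settings[key.strip()] = value.strip()
    return settings


line = (
    "Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2708949351, "
    "Size: 816x1024, Model hash: 3b4501db98, "
    "Model: jibMixRealisticXL_v70PromptAdherence, Version: v1.5.1"
)
settings = parse_settings(line)
width, height = (int(n) for n in settings["Size"].split("x"))
```

All values come back as strings, so anything numeric (steps, CFG scale, seed) still needs converting before you reuse it.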

Another from an item in the gallery

Rather clean rendering…

A futuristic Polaroid portrait in glitch art style, featuring a stylish woman with barcode-patterned iridescent makeup, set against a formula-themed background and captured in high contrast chiaroscuro
Negative prompt: 3d, render, cgi

Now let's try to put our own thing into it…

the flapper looks bored sitting at the bar in smoky speakeasy, busy bar scene, nightclub, late night, RAW photo, photography by __wildcards/favoritephotographers__
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 3404268463, Size: 816x1024, Model hash: 3b4501db98, Model: jibMixRealisticXL_v70PromptAdherence, Version: v1.5.1
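
The `__wildcards/favoritephotographers__` token in that prompt is a wildcard: at generation time it gets swapped for a random line from a text file of the same name. Here is a minimal sketch of the idea, assuming an in-memory dictionary in place of the file. The list contents are hypothetical (names borrowed from other prompts in these notes, not the actual wildcard file), and `expand_wildcards` is an illustrative name, not the extension's API.

```python
import random
import re

# Hypothetical stand-in for wildcards/favoritephotographers.txt.
WILDCARDS = {
    "wildcards/favoritephotographers": [
        "Slim Aarons",
        "Paul Outerbridge",
        "Zena Holloway",
    ],
}

# A token looks like __some/name__ (double underscores on both sides).
TOKEN_RE = re.compile(r"__([\w/\-]+?)__")


def expand_wildcards(prompt, wildcards, rng):
    """Replace each __name__ token with a random entry from wildcards[name]."""
    def pick(match):
        options = wildcards.get(match.group(1))
        if not options:
            return match.group(0)  # leave unknown tokens untouched
        return rng.choice(options)

    return TOKEN_RE.sub(pick, prompt)


rng = random.Random(3404268463)  # seeded so the pick is repeatable
expanded = expand_wildcards(
    "RAW photo, photography by __wildcards/favoritephotographers__",
    WILDCARDS,
    rng,
)
```

Passing in a seeded `random.Random` keeps the pick reproducible, which helps when you want to regenerate the same image later.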

a view of lake Champlain across the rickety old dock, abandoned, creepy at dusk, landscape photography
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 1877438157, Size: 1024x816, Model hash: 3b4501db98, Model: jibMixRealisticXL_v70PromptAdherence, Version: v1.5.1

the dancer in a dynamic pose, pastel by Edgar Degas
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 78281419, Size: 816x1024, Model hash: 3b4501db98, Model: jibMixRealisticXL_v70PromptAdherence, Version: v1.5.1

Display Case LoRA

display case for viking boat on the water in a storm in a dark studio environment <lora:DisplayCaseXL:0.8>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 382125635, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "DisplayCaseXL: 80e9482a6deb", Version: v1.5.1

display case for a revolutionary war soldier action figure in a mansion study <lora:DisplayCaseXL:0.8>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 2816100972, Size: 816x1024, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "DisplayCaseXL: 80e9482a6deb", Version: v1.5.1

Now let's turn the LoRA way down

display case for a revolutionary war soldier action figure in a mansion study <lora:DisplayCaseXL:0.4>
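
Turning the LoRA down is just a matter of changing the number at the end of the `<lora:name:weight>` tag. If you are doing that across a batch of prompts, a small helper can save some typing. This is a sketch with made-up function names, assuming the tag format used in the prompts in these notes.

```python
import re

# <lora:NAME:WEIGHT> as written in the prompts in these notes.
LORA_RE = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def lora_weights(prompt):
    """Return {lora_name: weight} for every LoRA tag in the prompt."""
    return {name: float(weight) for name, weight in LORA_RE.findall(prompt)}


def set_lora_weight(prompt, name, new_weight):
    """Rewrite the weight of one named LoRA tag, leaving the rest of the prompt alone."""
    def swap(match):
        if match.group(1) != name:
            return match.group(0)
        return f"<lora:{name}:{new_weight}>"

    return LORA_RE.sub(swap, prompt)


prompt = (
    "display case for a revolutionary war soldier action figure "
    "in a mansion study <lora:DisplayCaseXL:0.8>"
)
turned_down = set_lora_weight(prompt, "DisplayCaseXL", 0.4)
```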

display case for antique jewels in the museum of modern art <lora:DisplayCaseXL:0.8>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 4086810684, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "DisplayCaseXL: 80e9482a6deb", Version: v1.5.1

Bad Quality LoRA

From an example…

((deep focus, crisp focus, digital sharpening, jpg, grainy snapchat still, instagram post, blurry, motionblur:2, flat light , haze, blur , out of focus, slow shutter, high ISO:2)) video still, medium shot cute, thick knitted sweater busy public park, natural light, books <lora:badquality_v02:0.85>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 5.5, Seed: 3955893400, Size: 816x1024, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "badquality_v02: 8e95bbbd81fa", Version: v1.5.1

a selfie at a fast food restaurant <lora:badquality_v02:0.85>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 4.5, Seed: 2373944286, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "badquality_v02: 8e95bbbd81fa", Version: v1.5.1

a disposable camera picture of an amateur comedian on stage in an underground club <lora:badquality_v02:0.85>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 5.5, Seed: 365298231, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "badquality_v02: 8e95bbbd81fa", Version: v1.5.1

tailgate party, amateur photography, backlit <lora:badquality_v02:0.85>
Negative prompt: 3d, render, cgi
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 5.5, Seed: 778627094, Size: 816x1024, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "badquality_v02: 8e95bbbd81fa", Version: v1.5.1

Things I wish I tried…

Meshgraphormer for hand fixes, from Olivio on YT
Hand Refiner
- On Github
- Reddit Thread

Prompt engineering: JPEG in the negative prompt

Using "jpeg" in the negative prompt.

It has a subtle impact, but it definitely has some. I think it's kind of a crapshoot based on your prompt.

Note that I used an X/Y/Z plot, and I needed something to replace in the negative prompt, so I used "z0rg", which hopefully has little impact on the generations (maybe a little anyway).
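
For anyone who hasn't used Prompt S/R (search/replace) in the X/Y/Z plot: as I understand it, the first value in the list is the text to search for, and each value in the list takes its place in turn, so the first row is the unmodified run. Here is a rough sketch of how the grid of jobs gets built. The function names are hypothetical, and for simplicity the replacement is applied only to the negative prompt.

```python
from itertools import product


def prompt_sr_variants(text, values):
    """Prompt S/R: values[0] is the search term, every value is a replacement."""
    search = values[0]
    if search not in text:
        raise ValueError(f"search term {search!r} not found")
    return [text.replace(search, value) for value in values]


def build_grid(prompt, negative, seeds, sr_values):
    """One job per (negative-prompt variant, seed) cell, row by row."""
    negatives = prompt_sr_variants(negative, sr_values)
    return [
        {"prompt": prompt, "negative_prompt": neg, "seed": seed}
        for neg, seed in product(negatives, seeds)
    ]


grid = build_grid(
    prompt="a gorgeous pencil rendering of a medieval castle",
    negative="z0rg",
    seeds=[50, 60, 70, 80, 90],
    sr_values=["z0rg", "jpeg"],
)
```

That gives ten jobs: five seeds with "z0rg" as the negative, then the same five seeds with "jpeg", which is why a throwaway placeholder is needed in the negative prompt to begin with.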

Open the images in a new tab to zoom in and check 'em out.

a gorgeous pencil rendering of a medieval castle
Negative prompt: z0rg
Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 50, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Script: X/Y/Z plot, X Type: Seed, X Values: "50,60,70,80,90", Fixed X Values: "50, 60, 70, 80, 90", Y Type: Prompt S/R, Y Values: "z0rg,jpeg", Version: v1.5.1

she's got the look, 1980s magazine model
Negative prompt: z0rg
Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 50, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Script: X/Y/Z plot, X Type: Seed, X Values: "50,60,70,80,90", Fixed X Values: "50, 60, 70, 80, 90", Y Type: Prompt S/R, Y Values: "z0rg,jpeg", Version: v1.5.1

a luscious jungle landscape, photography by Marc Adamus
Negative prompt: z0rg
Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 50, Size: 1024x816, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Script: X/Y/Z plot, X Type: Seed, X Values: "50,60,70,80,90", Fixed X Values: "50, 60, 70, 80, 90", Y Type: Prompt S/R, Y Values: "z0rg,jpeg", Version: v1.5.1

Do you need a LoRA keyword?

The eternal question remains: Should you train with it or not? Which might be an even better experiment.

Inspired by this reddit thread.

But in this case, I used an existing LoRA I trained that's supposed to be the kind of "stereotypical outdoor social media influencer gal". And in a stereotypical fashion, this LoRA likes producing puffy jackets and yoga pants.

And I generated with and without the keyword…

It has quite an impact on a simple prompt… You can see that with the keyword, outdoorgals, included, you seem to get outdoor photography (which is all it was trained on).

woman in a jacket <lora:outdoorgals_v1-000007:0.6>
Negative prompt: 3d, render, cgi, nsfw
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 111, Size: 816x1024, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "outdoorgals_v1-000007: 9a5f53db83c6", Script: X/Y/Z plot, X Type: Seed, X Values: "111,222,333,444,555", Fixed X Values: "111, 222, 333, 444, 555", Y Type: Prompt S/R, Y Values: "\"woman in a jacket\",\"woman in a jacket, outdoorgals\"", Version: v1.5.1

And a little less so on a more complex one… maybe subtly. Although I definitely feel like the compositions are more true to what it's trained on and, in my opinion, are better and more interesting compositions.

the hunting goddess is on an adventure, canadian rockies, RAW photo, Landscape Photography, color photography by Natalia Drepina <lora:outdoorgals_v1-000007:0.6>
Negative prompt: 3d, render, cgi
Steps: 50, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 50, Size: 816x1024, Model hash: aeb7e9e689, Model: juggernautXL_v8Rundiffusion, Lora hashes: "outdoorgals_v1-000007: 9a5f53db83c6", Script: X/Y/Z plot, X Type: Seed, X Values: "50,60,70,80,90", Fixed X Values: "50, 60, 70, 80, 90", Y Type: Prompt S/R, Y Values: "canadian rockies,\"canadian rockies, style of outdoorgals\"", Version: v1.5.1