I use gen AI tools to create and post images that I think are beautiful. Sometimes people agree with my selections.

📷 My Secret Lemmy Community

🎨 My CivitAI Profile

  • 38 Posts
  • 19 Comments
Joined 1 year ago
Cake day: July 1st, 2023




  • I think it's some of both. I gravitate towards the fashion-photography models/LoRAs first, rather than starting with the porn models. I also prompt for an assortment of expressions and nationalities, and I do go into detail on hair length/color/style, and I think that does a lot for generating a wide variety of faces.

    Some of that is also basic curation. I generate a ton of images, then treat the batch like a photography shoot and select the ones that grab my attention.




  • Wildcards and dynamic prompts are a killer feature for getting diversity. You can have ChatGPT output long lists of options and save them as text files to pull from, rather than putting the OR options directly in the prompt. For instance, here’s just a snippet of my 100-line “hair_color_natural_hightlights.txt” wildcard file:

    chestnut brown with sun-kissed golden highlights
    dark blonde with naturally blended caramel highlights
    auburn with subtle coppery undertones and honey highlights
    jet black with hints of espresso brown and warm burgundy highlights
    chocolate brown with natural honey and toffee highlights
    ash brown with delicate pearl-toned highlights
    light brown with sunlit blonde highlights
    

    One question though: are you sure you have the dynamic prompts extension enabled in A1111? These all look very similar, and the fact that the OR statement appears in the prompt metadata still makes me wonder. Typically, the “brunette | ginger | blonde” would get resolved by the dynamic prompt processor before generation, and the image metadata would only show the single selected term by itself.
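
    Roughly, what the extension does before generation looks like this minimal Python sketch. This is not the extension’s actual code, and the wildcard folder path is just an assumption; it only illustrates why the saved metadata should contain the resolved term, not the template:

    import random
    import re
    from pathlib import Path
    
    WILDCARD_DIR = Path("wildcards")  # assumed folder holding the .txt wildcard files
    
    def resolve(prompt: str) -> str:
        # Pick one option from each {a|b|c} variant group.
        prompt = re.sub(
            r"\{([^{}]*)\}",
            lambda m: random.choice(m.group(1).split("|")).strip(),
            prompt,
        )
        # Swap each __name__ token for a random line from wildcards/name.txt.
        def pick(m):
            lines = (WILDCARD_DIR / f"{m.group(1)}.txt").read_text().splitlines()
            return random.choice([line for line in lines if line.strip()])
        return re.sub(r"__([\w-]+)__", pick, prompt)
    
    print(resolve("portrait photo, {brunette|ginger|blonde}, __hair_color_natural_hightlights__ hair"))
    # Only the resolved string would end up in the image metadata.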



  • These are very sweet, and I love the embroidery details!

    All of the SDXL models are too airbrushed and plastic to my eye, but you can’t argue with the hand quality compared to the older 1.5 models. One thing I want to try is starting with an XL model to get the basic pose and structure in place, then finishing with the grittier, more realistic 1.5 models. That introduces a whole new level of adjustment knobs to play with!
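
    The post is about A1111, but as a rough sketch of that XL-then-1.5 idea, here’s roughly how it could be wired up with the diffusers library. The model IDs, prompt, resolution, and strength are placeholder choices, not settings from the post:

    import torch
    from diffusers import StableDiffusionXLPipeline, StableDiffusionImg2ImgPipeline
    
    prompt = "candid photo of a woman in an embroidered denim jacket"  # placeholder prompt
    
    # Stage 1: SDXL for composition and hands.
    xl = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    draft = xl(prompt, num_inference_steps=30).images[0]
    
    # Stage 2: img2img with a 1.5-based model for grittier surface texture.
    # A moderate strength keeps the XL pose while letting the 1.5 model redo detail.
    sd15 = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    final = sd15(prompt=prompt, image=draft.resize((768, 768)), strength=0.45).images[0]
    final.save("xl_then_15.png")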





  • The method I’ve settled on takes a bit of work to put together. First, I upload PNGs to catbox.moe. This preserves the metadata, so someone can feed the image into the A1111 PNG Info tab or paste the URL into https://pngchunk.com.

    Next, I upload JPG copies here. That gives me the lemmynsfw-hosted URL and builds the gallery. Then I tie the two together with markdown so that each gallery image links out to its PNG. The final format looks like this:

    [![](https://lemmynsfw.com/pictrs/image/59c7f6e6-de70-4354-937b-5b82b67fc195.webp)][1]
    [![](https://lemmynsfw.com/pictrs/image/88b14211-4464-4cd2-bb28-05e781dd5fc8.webp)][2]
    [![](https://lemmynsfw.com/pictrs/image/bf3a69bb-d0f9-4691-b95e-6794880bbc86.webp)][3]
    
    [1]: https://files.catbox.moe/5dsqza.png
    [2]: https://files.catbox.moe/dljkxc.png
    [3]: https://files.catbox.moe/kcqguv.png
    

    This seems to work well. The only hiccup is that I need to include the first image twice: once in the post body so it shows in the gallery, and once as the post header image. That looks fine in the browser, but some Lemmy mobile apps show it as a duplicate.

    Here’s the final result: https://lemmynsfw.com/post/1372540
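
    If a post has more than a few images, the stitching step is easy to script. A minimal sketch, assuming the two URL lists are already in matching order (these just reuse the example URLs above):

    # Pair each lemmynsfw-hosted image with its catbox PNG and
    # print the clickable-gallery markdown shown above.
    gallery_urls = [
        "https://lemmynsfw.com/pictrs/image/59c7f6e6-de70-4354-937b-5b82b67fc195.webp",
        "https://lemmynsfw.com/pictrs/image/88b14211-4464-4cd2-bb28-05e781dd5fc8.webp",
    ]
    png_urls = [
        "https://files.catbox.moe/5dsqza.png",
        "https://files.catbox.moe/dljkxc.png",
    ]
    
    body = [f"[![]({g})][{i}]" for i, g in enumerate(gallery_urls, start=1)]
    refs = [f"[{i}]: {p}" for i, p in enumerate(png_urls, start=1)]
    print("\n".join(body) + "\n\n" + "\n".join(refs))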





  • I thought I had a good system where each outpost was only exporting 1 solid, 1 liquid, and 1 gas. That let me isolate and sort everything at the receiving outpost.

    The problem occurs when each outpost’s import & export containers get full. At that point, materials flow from the export station to the import location where they can’t be unloaded, THEN THEY COME BACK to the original outpost and get offloaded there. You wind up with the same materials filling both the import and export containers. Now the entire material flow is completely borked, nothing is getting imported, and all you have access to is the stuff that’s locally produced.


  • There’s not much out there on training LoRAs that aren’t anime characters, and that just isn’t my thing. I don’t know a chibi from a booru, and most of those tutorials sound like gibberish to me. So I’m kind of just pushing buttons and seeing what happens over lots of iterations.

    For this, I settled on “place” as the class. I tried “location”, but it gave me strange results, like lots of pictures of maps and GPS-type screens. I didn’t use any regularization images; like you mentioned, I couldn’t think of what to use. I think regularization would be more useful for face training anyway.

    I read that a batch size of one gave more detailed results, so I set it there and never changed it. I also didn’t use any repeats since I had 161 images.

    I did carefully tag each photo with a caption .txt file using Utilities > BLIP Captioning in Kohya_ss. That improved results over the versions I made with no tags. Results improved again dramatically when I went back and manually cleaned up the captions to be more consistent. For instance, consolidating “building”, “structure”, “barn”, “church”, and “house” all down to just “cabin”.
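
    That consolidation pass is simple to script across all the caption files. A minimal sketch, assuming the .txt captions sit in one folder; the folder name and synonym map are just examples, and it’s a naive find/replace rather than anything word-boundary aware:

    from pathlib import Path
    
    # Collapse near-synonyms in every Kohya caption .txt file to one consistent term.
    CAPTION_DIR = Path("training_images")  # placeholder folder
    SYNONYMS = {"building": "cabin", "structure": "cabin",
                "barn": "cabin", "church": "cabin", "house": "cabin"}
    
    for txt in CAPTION_DIR.glob("*.txt"):
        caption = txt.read_text()
        for old, new in SYNONYMS.items():
            caption = caption.replace(old, new)
        txt.write_text(caption)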

    I used 150 epochs, which gave me 24,150 steps. Is that high or low? I have no idea. They say 2,000 steps or so for a face, and a full location is way more complex than a single face… It seems to work, but it took me 8 different versions to get a model I was happy with.
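
    For anyone checking the math, the step count follows the usual (images × repeats × epochs) / batch size calculation (gradient accumulation ignored); a quick sketch with the numbers from this run:

    # Quick check of the step count from the settings described above.
    images, repeats, epochs, batch_size = 161, 1, 150, 1
    total_steps = images * repeats * epochs // batch_size
    print(total_steps)  # 24150, matching the 24,150 reported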

    Let me know what ends up working for you. I’d love to have more discussions about this stuff. As a reward for reading this far, here’s a sneak peek at my next LoRA, based on RDR2’s Guarma island: https://files.catbox.moe/w1jdya.png. Still a work in progress.