Today, we’re headed to the frozen north. Despite the snow on the ground, the sun is out and the light is perfect for a brisk shoot at the weather-worn cabins of Colter.

Two months ago, I fell into the trap that is Stable Diffusion. Today, I released my first trained model, based on the snowbound town of Colter from Red Dead Redemption 2. For anyone interested in SD image generation, you can grab a copy at CivitAI: https://civitai.com/models/137327. I’d appreciate you taking a look and giving it a like or a rating if you’re so inclined. The LoRA is stylistically versatile, and there are a bunch of SFW examples I made showing its range.

As always, images link to full-size PNGs that contain prompt metadata.
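
If you’ve never pulled that metadata back out of a PNG yourself, it’s just a text chunk in the file. Here’s a minimal sketch using Pillow, assuming the usual A1111-style “parameters” key (the filename is only a placeholder):

    from PIL import Image

    # Any of the full-size PNGs linked above will do; the name here is just an example.
    img = Image.open("colter_example.png")

    # A1111-style generations store the prompt and settings in a PNG text chunk
    # named "parameters"; Pillow exposes text chunks through .info.
    print(img.info.get("parameters"))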


  • RandomUser88@lemmynsfw.com · 1 year ago

    Thanks for your contributions to the community!

    I have questions if you don’t mind.

    I’m really trying to get into LoRA training in general, and there are a lot of things I can’t intuitively work out or find solid answers to.

    For example, with this one, what is your “class”? And did you use regularization images? (I want to make a habit of using them.) If you did, what did you use for them? Places that aren’t this, like deserts and forests?

    Would you consider elaborating on batch size, repeats, epochs, etc, too?

    Thanks again!

    • Cavendish@lemmynsfw.com (OP) · 1 year ago

      There’s not much out there on training LoRAs that aren’t anime characters, and that just isn’t my thing. I don’t know a chibi from a booru, and most of those tutorials sound like gibberish to me. So I’m kind of just pushing buttons and seeing what happens over lots of iterations.

      For this, I settled on the class of place. I tried location, but it gave me strange results, like lots of pictures of maps and GPS-type screens. I didn’t use any regularization images. Like you mentioned, I couldn’t think of what to use. I think regularization would be more useful for face training anyway.
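
      In case it helps: in kohya_ss the class mostly just shows up in the training folder name. Here’s roughly what my dataset layout looks like, assuming the standard {repeats}_{instance prompt} {class prompt} folder convention (the “colter” instance word here is a stand-in for whatever trigger you pick):

          train_colter/
            img/
              1_colter place/    <- {repeats}_{instance prompt} {class prompt}
                0001.png
                0001.txt         <- per-image caption file
                ...
            reg/                 <- would hold {repeats}_place regularization images; left empty
            model/
            log/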

      I read that a batch size of one gave more detailed results, so I set it there and never changed it. I also didn’t use any repeats since I had 161 images.

      I did carefully tag each photo with a caption .txt file using Utilities > BLIP Captioning in Kohya_ss. That improved results over the versions I made with no tags. Results improved dramatically again when I went back and manually cleaned up the captions to be more consistent, for instance consolidating building, structure, barn, church, and house all down to just cabin.
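
      If you end up doing a cleanup pass like that, it’s easy to script instead of editing 161 .txt files by hand. A rough sketch (the folder path and synonym list are just examples, not my exact setup):

          import re
          from pathlib import Path

          # Example path; point this at the folder holding your caption .txt files.
          caption_dir = Path("train_colter/img/1_colter place")

          # Terms BLIP used inconsistently -> the single tag to keep.
          synonyms = {
              "building": "cabin",
              "structure": "cabin",
              "barn": "cabin",
              "church": "cabin",
              "house": "cabin",
          }

          for txt in caption_dir.glob("*.txt"):
              caption = txt.read_text(encoding="utf-8")
              for old, new in synonyms.items():
                  # Whole-word replace so e.g. "housefly" isn't mangled.
                  caption = re.sub(rf"\b{re.escape(old)}\b", new, caption)
              txt.write_text(caption, encoding="utf-8")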

      I ran 150 epochs, which gave me 24,150 steps. Is that high or low? I have no idea. They say 2,000 steps or so for a face, and a full location is way more complex than a single face… It seems to work, but it took me 8 different versions to get a model I was happy with.
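
      For anyone sanity-checking their own settings, the step count is just arithmetic on the numbers above (assuming no gradient accumulation):

          images = 161      # training images
          repeats = 1       # effectively no repeats
          epochs = 150
          batch_size = 1

          total_steps = images * repeats * epochs // batch_size
          print(total_steps)  # 24150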

      Let me know what ends up working for you. I’d love to have more discussions about this stuff. As a reward for reading this far, here’s a sneak peek at my next LoRA, based on RDR2’s Guarma island: https://files.catbox.moe/w1jdya.png. Still a work in progress.