Under the hood of image generator

Grth@lemmy.world · 1 year ago

perchance@lemmy.world · edit-2 1 year ago

Yep SD 1.5, and you should be able to replicate the text-to-image-plugin’s results locally by just following vanilla tutorials on /r/stablediffusion or YouTube with pretty much any of the top models on Civitai - I’m not doing anything special. Your local results will actually end up being better than the plugin’s, because I have a stupid amount of regex and stuff trying (and somewhat failing) to prevent the model from creating oversexualised stuff for benign prompts, and that almost always comes at a cost of quality/coherence. I’m not the best person to ask about troubleshooting local setups, but I’d just advise that you follow a tutorial/guide exactly to start with, and then once you’ve replicated what they’ve shown, start exploring your own prompts, tweaking parameters, etc.
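To illustrate why backend prompt rewriting can change what the model produces, here is a minimal sketch of a regex-based pre-processing step. Everything in it is an assumption made for illustration: the function name `adjust_prompt`, the pattern `EXPLICIT_HINT`, and the `SAFETY_NEGATIVE` terms are hypothetical, and the plugin’s actual rules are not public.

```python
import re

# Hypothetical rules for illustration only; the plugin's real patterns are not public.
EXPLICIT_HINT = re.compile(r"\b(?:nsfw|nude|explicit)\b", re.IGNORECASE)
SAFETY_NEGATIVE = "nsfw, nudity"


def adjust_prompt(prompt: str, negative_prompt: str = "") -> tuple[str, str]:
    """Return the (prompt, negative_prompt) pair that would actually reach the model."""
    if EXPLICIT_HINT.search(prompt):
        # Not a benign prompt under these toy rules: pass it through unchanged.
        return prompt, negative_prompt
    # Benign prompt: steer the model away by extending the negative prompt.
    parts = [p for p in (negative_prompt.strip(), SAFETY_NEGATIVE) if p]
    return prompt, ", ".join(parts)


if __name__ == "__main__":
    print(adjust_prompt("a knight in a forest", "blurry"))
```

The point of the sketch is only that the text the model is conditioned on can differ from what was typed, so a local run with the "same" prompt is not necessarily the same input.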

Grth@lemmy.world · 1 year ago

Thanks for getting back to me so quickly, even if it’s taken me so long to respond. Whatever non-special things you’re doing seem to work really well! It works a lot faster than my local instance (suboptimal video card to blame there) and I get really good consistent base results, which I can then pull into my own instance to do more fun stuff with inpainting, upscaling and the like. So hopefully that ad revenue is making it worth your while. :D

perchance@lemmy.world · 1 year ago

Yeah it’s definitely worth investing in a fast graphics card if you’re getting deep into AI stuff, but they’re pricey. Inpainting and image-to-image should be possible on perchance within the next month or so if all goes well. Ad revenue doesn’t cover all the server costs yet, so I pay for a portion of it out of my own pocket, but it’ll eventually be self-sustaining and it’s not ‘breaking the bank’ for me. It’s much closer to self-sustaining than it was 12 months ago when I made the plugin - the research community has made SD inference a lot more efficient.

Ashenthorn@lemmy.world · 1 year ago

@[email protected] Is there a roadmap or existing discussion anywhere about your experiments with the t2i plugin or where you might be going with it? Or for user questions/feedback/requests?

I’m currently getting some fantastic results with it “as is”… but additional options are always appreciated. =)

perchance@lemmy.world · 1 year ago

Best place is probably here on the lemmy community. I’ll post updates here when new features (e.g. inpainting, image-to-image) are available.

Also, @[email protected] has some notes and interesting experiments (with linked generators to play with) here: https://perchance.org/learn-perchance-plugins-text-to-image

ArtificialScr00b@lemmy.world · 1 year ago

You can hover over the images to see the prompt that was used. Some keywords are added to the prompt on the backend depending on the “Art Style” used, and you can see those, as well as the seed, when you hover.

At one time in the past, the generator was saving prompt/generation information in the EXIF data of the images. The last image I have saved that contained EXIF data showed the generator was using SD1.5+Deliberate v2, 20 steps, Euler a, CFG/Guidance 7. It really does seem like there are other things going on under the hood though: if I use those settings and input the saved prompt exactly, even including the same seed, from one of the images I saved before the EXIF data was dropped, I can’t replicate the image using HappyAccidents. The quality is basically the same, but experience tells me that if I feed the exact same prompt, seed, and configuration into both, I should get the same output. I don’t, which tells me there’s still something hidden somewhere.

I really wish that saved EXIF data would be brought back. It makes it a lot easier to go back to an image I saved from earlier and tweak or retouch it.
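For anyone trying the same comparison locally, the settings quoted from the old EXIF data (20 steps, Euler a, CFG 7, fixed seed) could be pinned down roughly as below. This is a sketch under stated assumptions, not the generator’s actual code: it assumes a recent Hugging Face `diffusers` release and PyTorch are installed, `GenSettings`, `generate`, and `checkpoint_path` are hypothetical names, the checkpoint is a locally downloaded SD 1.5-based file, and the 512x512 size is a guess because the resolution is not stated in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenSettings:
    """Settings as quoted from the EXIF data; width/height are assumptions."""
    prompt: str
    seed: int
    negative_prompt: str = ""
    steps: int = 20              # "20 steps"
    guidance_scale: float = 7.0  # "CFG/Guidance 7"
    width: int = 512             # assumed, not stated in the thread
    height: int = 512            # assumed, not stated in the thread


def generate(settings: GenSettings, checkpoint_path: str):
    """Render one image; checkpoint_path is a hypothetical local .safetensors file."""
    # Imported here so the settings above can be used without these packages.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(checkpoint_path)
    # "Euler a" corresponds to the Euler ancestral scheduler in diffusers.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    # A seeded CPU generator makes the noise repeatable on this machine.
    generator = torch.Generator(device="cpu").manual_seed(settings.seed)
    result = pipe(
        prompt=settings.prompt,
        negative_prompt=settings.negative_prompt,
        num_inference_steps=settings.steps,
        guidance_scale=settings.guidance_scale,
        width=settings.width,
        height=settings.height,
        generator=generator,
    )
    return result.images[0]
```

Even with every value matched, identical pixels across different front ends are not guaranteed: tools can differ in how they turn a seed into noise, and Euler a adds fresh noise at each step, so a mismatch alone may not prove that hidden prompt changes are involved.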

VioneT@lemmy.world · 1 year ago

@[email protected]