I tried to get SD-XL to generate an image of a frog with its eyes closed. It refused. I even cranked up the attention on closed to an absurd level, and it seemed to get sassy with me.

  • The Barto@sh.itjust.works
    11 months ago

    This has meme potential! This has meme potential! Edit: Voyager messed up, I’m not that excited about it.

  • BOMBS@lemmy.world
    11 months ago

    This little guy has either checked out completely or is entirely hyperfocused. There is no middle.

  • tal@lemmy.today
    11 months ago

    Another interesting prompt I remember someone on here having trouble getting Stable Diffusion to do was colored tire treads. There probably isn’t much in the way of colored tire treads out there in the world – they’re always black, so…

    My own is drawings with crosshatching. For some reason, though Stable Diffusion can do about a zillion types of media and art styles, including sketches, and there are lots of images on the Internet that have crosshatching, in my experience, Stable Diffusion will not generate images using crosshatching. I don’t know if that’s because the algorithm used just has problems with highly-similar-but-not-identical line patterns all over, whether it’s been excluded from training on all of the models I’ve tried, or what.

    • nul@programming.dev
      11 months ago

      Did you try putting (eyes open) in the negative prompt instead? I find that when it doesn’t have a strong understanding of a compound phrase, it sometimes focuses more on the individual words. So, “eyes closed” may have been impeded by a stronger influence from “eyes”.
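
      A minimal sketch of that idea: describe what you want in the positive prompt and push the unwanted phrase into the negative prompt instead of weighting “closed”. The helper names (`weighted`, `build_prompts`) are hypothetical, and the `(term:weight)` emphasis syntax is assumed to be the one your front end accepts (the AUTOMATIC1111 web UI uses it); other tools may ignore or misread it.

```python
def weighted(term: str, weight: float = 1.0) -> str:
    """Wrap a term in (term:weight) emphasis syntax; leave it bare at weight 1."""
    if weight == 1.0:
        return term
    return f"({term}:{weight:g})"


def build_prompts(subject, wanted, unwanted, unwanted_weight=1.0):
    """Return a positive/negative prompt pair as plain strings."""
    # Positive prompt: the subject plus descriptors that do not mention "eyes".
    positive = ", ".join([subject, *wanted])
    # Negative prompt: the phrase we want the model to steer away from.
    negative = ", ".join(weighted(t, unwanted_weight) for t in unwanted)
    return {"prompt": positive, "negative_prompt": negative}


prompts = build_prompts(
    "photo of a frog",
    wanted=["sleeping"],
    unwanted=["eyes open"],
    unwanted_weight=1.3,  # arbitrary illustrative value
)
print(prompts)
```

      The two strings can then be pasted into the prompt and negative prompt fields of whatever tool you use. Keeping “eyes” out of the positive prompt entirely is the point, since the word itself seems to pull the image toward visible eyes.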

    • wewbull@feddit.uk
      11 months ago

      The problem here is that you have the token “eyes” with very heavy weighting, and it’s showing you eyes. Another way of thinking about it is…

      What do you see when somebody closes their eyes? Eyelids.

  • OhmsLawn@lemmy.world
    11 months ago

    I love it!

    AI really seems to have trouble with descriptors like high/low, big/small, open/closed, above/below, etc. The best workaround I’ve found is to try synonyms. In this case, I might try something like “squinting completely.”

  • heavyboots@lemmy.ml
    11 months ago

    My guess is that this is what they look like when they’re blinking or asleep. The eyelid they close is transparent, so it probably looks basically the same as open eyes, and the model probably doesn’t have any closed, opaque eyelids to reference as a result. (Although you are correct that the AI seems to be getting sassy with you in this image, lol.)

    Frogs have three eyelids: an upper eyelid that blinks to keep their eyes moist, a lower eyelid that stays still, and a third semi-transparent eyelid called the nictitating membrane. The nictitating membrane is used for swimming, camouflage, hibernation, and sleeping. It is a thin, transparent eyelid that protects frogs’ large, fragile eyes, making their eyes waterproof and keeping them safe from debris.

    • tal@lemmy.today
      11 months ago

      https://www.mramphibian.com/sleep-behavior-of-frogs/

      Here are some indications that a frog is sleeping:

      • Limbs are tucked under their body
      • Head is pointed downward
      • Eyes may or may not be covered by a nictitating membrane

      For some species, it can be difficult to tell when they’re resting. In fact, most frogs either sleep with their eyes open (described as “Cataleptic Sleep”) or have a clear or camouflaged membrane covering their eyes.