• BountifulEggnog [she/her]@hexbear.net
    11 months ago

    Of course, you could also train an LLM to utilize ControlNet settings solely through prompts, since LLMs allow us to use our own personal vocabulary as a high-level Natural Language Programming language.

    I really don’t understand what you mean by this. It sounds like a very inefficient way to use ControlNet. What settings would you be tweaking with an LLM? Why would you use an LLM for that, instead of a slider/checkbox?

    • SubstantialNothingness [none/use name]@hexbear.net
      11 months ago

      I wouldn’t do it lol. But you could create a workflow in which you could make similar products using only prompts and some custom tooling.

      The creator of this piece said that it can’t be made by a prompt. What they are insinuating is that some input methods produce inherently superior outputs. But that’s an arbitrary and ignorant argument when you look at it from the perspective of LLM product design.

      It used to be that parameters were input at the command line. Obviously this becomes impractical for mature use cases, so UI frontends were created to give us sliders and checkboxes. That matches the kind of environments their users already work in. But those users are typically power users (by interest, not necessarily competency), and we’ve seen a new iteration of UI designs for consumer generative-AI products like DALL-E and Midjourney. Just because the product has a user-friendly skin does not mean the functionality is any less capable. You can pass parameters in prompts, for example.
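To make "parameters in prompts" concrete, here is a minimal sketch of a frontend pulling settings out of the prompt text itself. The `--name value` flag syntax, the `split_prompt` function, and the `control` and `weight` names are all assumptions made up for illustration; they are not taken from any particular product.

```python
import re

# Hypothetical flag syntax: "--name value" written inline with the prompt.
_FLAG = re.compile(r"--(\w+)\s+(\S+)")


def split_prompt(prompt: str) -> tuple[str, dict[str, str]]:
    """Separate the descriptive text from any --name value flags."""
    # Collect every flag into a dict; a later duplicate overrides an earlier one.
    params = {name: value for name, value in _FLAG.findall(prompt)}
    # Remove the flags, then collapse the whitespace they leave behind.
    text = " ".join(_FLAG.sub("", prompt).split())
    return text, params


text, params = split_prompt("a lighthouse at dusk --control canny --weight 0.8")
print(text)    # a lighthouse at dusk
print(params)  # {'control': 'canny', 'weight': '0.8'}
```

The values come back as strings; a real frontend would still have to validate and convert them before handing them to the image pipeline.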

      But if you were to have a use case for a product that uses ControlNet with only prompts, like I said, it would benefit from extra tooling: presets, libraries, defaults, etc. With these in place, more powerful functionality would be more quickly accessible through the prompt format. I’m not Nostradamus, but one area in which simpler inputs are much more desirable than power-user dashboards is products intended to be used by drivers (not all of which are related to the act of driving itself, like voice-to-text messaging).
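The presets-and-defaults tooling described above could be sketched roughly as follows. Every name here is hypothetical: `DEFAULTS`, `PRESETS`, `resolve_settings`, the preset phrases, and the setting keys are placeholders for illustration, not the actual ControlNet API or any existing frontend's configuration.

```python
# Hypothetical baseline settings applied when the prompt names no preset.
DEFAULTS = {"module": "canny", "weight": 1.0, "guidance_end": 1.0}

# Hypothetical preset library: a plain-language phrase maps to setting overrides.
PRESETS = {
    "loose sketch": {"module": "scribble", "weight": 0.6},
    "strict pose": {"module": "openpose", "weight": 1.0},
}


def resolve_settings(prompt: str) -> dict:
    """Start from the defaults and apply every preset named in the prompt."""
    settings = dict(DEFAULTS)  # copy so the shared defaults are never mutated
    lowered = prompt.lower()
    for phrase, overrides in PRESETS.items():
        if phrase in lowered:
            settings.update(overrides)
    return settings


print(resolve_settings("Portrait of a sailor, loose sketch"))
# {'module': 'scribble', 'weight': 0.6, 'guidance_end': 1.0}
```

Simple substring matching is used only to keep the sketch short; the point is that a short phrase in the prompt can stand in for several slider and checkbox positions at once.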

      I guess my point is that arguing “my sliders are better than your prompt” is like saying the back door gets you into the house better than the front door. It’s really nothing more than a schoolyard pissing contest that shows a limited perspective on the matter.