Leaderboard scores can be misleading because they leave out other factors worth considering:

  • Censorship: Is the model censored?
  • Verbosity: How concise is the output?
  • Intelligence: Does the model know what it is talking about?
  • Hallucination: How often does the model make up facts?
  • Domain Knowledge: What specialization does the model have?
  • Size: Which models are best at 70B, 30B, and 7B respectively?

And much more! What models do you use, and which would you recommend to everyone?

The model that has caught my attention the most is the original 65B LLaMA. It seems genuine and truly has a personality. Everyone should chat with the original non-fine-tuned version if they get the chance. It’s a unique experience within the sea of “As an AI language model” OpenAI-style tunes.

  • noneabove1182@sh.itjust.worksM
    11 months ago

    I’ve been very partial lately to anything Orca-tuned. I’m not sure if it’s placebo, but those models always feel just that much smarter and a bit better at thinking things through.

    For instance, I have a character in oobabooga whose description/pre-prompt tells it to ask about what it doesn’t know: “It only answers questions it knows the answer to, choosing to ask for additional context when information is unclear.” Anything tuned on Orca is 10x more likely to actually consider what it doesn’t know and ask for context rather than hallucinate information.
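
    To make the pre-prompt idea concrete, here is a rough sketch of how a character description ends up in front of the chat before it reaches the model. This is not oobabooga's actual code or API: `build_prompt`, the speaker names, and the plain "Name: text" layout are all assumptions for illustration, and only the instruction sentence itself comes from the character description above.

```python
# Hypothetical helper, NOT oobabooga's real API: plain string assembly that
# mimics how a character context is typically prepended to a chat transcript.

# The instruction quoted in the comment above, used as the character context.
CHARACTER_CONTEXT = (
    "It only answers questions it knows the answer to, choosing to ask "
    "for additional context when information is unclear."
)


def build_prompt(context, history, user_message,
                 user_name="You", bot_name="Assistant"):
    """Return one prompt string: context, prior turns, then the new turn.

    history is a list of (speaker, text) tuples. The speaker names and the
    "Name: text" format are assumed conventions, not a fixed standard.
    """
    lines = [context, ""]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"{user_name}: {user_message}")
    # End with the bot's name so the model continues as the character.
    lines.append(f"{bot_name}:")
    return "\n".join(lines)


prompt = build_prompt(CHARACTER_CONTEXT, [], "What did we decide yesterday?")
print(prompt)
```

    With a prompt like this, the hope is that the model asks what "yesterday" refers to instead of inventing an answer; whether it actually does depends on the tune.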

    Lately I’ve been playing with Dolphin, which is LLaMA 1 based, and it’s an absolute pleasure: https://huggingface.co/ehartford/dolphin-llama-13b

  • moreeni@lemm.ee
    11 months ago

    I haven’t tried many models, but Vicuna has been the best one so far.

  • rufus@discuss.tchncs.de
    11 months ago

    What kind of prompt (format) do you use when chatting to something like the original base LLaMA?

  • Audalin@lemmy.world
    11 months ago

    Wizard-Vicuna-30B-Uncensored works pretty well for most purposes. It feels like the smartest of all the models I’ve tried. Even when it hallucinates, it gives enough to refine a Google query on some obscure topic. As usual, hallucinations are also easily counteracted by light non-argumentative gaslighting.

    It isn’t very new though. What’s the current SOTA for universal models of similar size? (both foundation and chat-tuned)