Over just a few months, ChatGPT went from correctly answering a simple math problem 98% of the time to just 2%, study finds. Researchers found wild fluctuations, called drift, in the technology’s abi…::ChatGPT went from answering a simple math problem correctly 98% of the time to just 2% over the course of a few months.

    • drspod@lemmy.ml
      1 year ago

      They list the currently available models that users of their API can select here:

      https://platform.openai.com/docs/models/overview

      They even say that while the main models are being continuously updated (read: re-trained) there are snapshots of previous models that will remain static.

      So yes, they are storing and snapshotting the models and they have many different models available with which to perform inference at the same time.

    • hedgehog@ttrpg.network
      1 year ago

      Each parameter corresponds to a single number, so if it’s using 16-bit numbers then that’s 200 TB. They might be using 32-bit numbers (400 TB) but wouldn’t be using anything larger.
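
      As a rough check of that arithmetic, here is a minimal sketch. The parameter count isn't stated in this comment, so the 100 trillion figure below is an assumption worked backwards from "16 bit means 200 TB" (200 TB divided by 2 bytes per parameter). The names `ASSUMED_PARAM_COUNT` and `weight_storage_tb` are made up for illustration.

```python
# Back-of-the-envelope storage estimate for raw model weights.
# ASSUMPTION: the parameter count is not given in the comment; 1e14 is
# inferred from "16-bit -> 200 TB" (200e12 bytes / 2 bytes per parameter).
ASSUMED_PARAM_COUNT = 100_000_000_000_000


def weight_storage_tb(param_count: int, bits_per_param: int) -> float:
    """Return the raw weight size in decimal terabytes (1 TB = 1e12 bytes)."""
    total_bytes = param_count * bits_per_param // 8
    return total_bytes / 1e12


if __name__ == "__main__":
    for bits in (16, 32):
        size = weight_storage_tb(ASSUMED_PARAM_COUNT, bits)
        print(f"{bits}-bit: {size:.0f} TB")
```

      This only counts the numbers themselves, one per parameter. It ignores any file-format overhead, compression, or extra copies kept as snapshots.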

    • Lukecis@lemmy.world
      1 year ago

      Makes me wonder how exactly they curate said data. It’s such an insane amount that even teams of thousands of human programmers sifting through it around the clock wouldn’t be able to fact-check or assess all of it for years. Presumably they use AI to go over the data that gets scraped and fed into the model, since I can’t imagine any humans being able to curate it all.

      I’ve heard from various videos on the topic that many of the developers have little to no clue what’s going on inside the LLM once it’s assembled and set to work training itself, and I’m inclined to believe them. The human programmers simply set the parameters and set the system up; the system then eats all the data loaded into it and immediately becomes a sort of black box, and nobody knows exactly what’s going on inside it to produce the output it does.