Mona Awad and Paul Tremblay’s lawsuit claims their books were used without their consent. But copyright protection doesn’t apply to ideas – they’ll need to demonstrate the likelihood of economic loss.
On a related note, I would be very curious to see how something like ChatGPT trained exclusively on works in the public domain would turn out. It would likely have a very different diction and style based on the older source material, but I wonder what other differences there would be.
What do they mean by “train”? If they mean reading, how can that be wrong? But copying the text and passing it off as its own work would be wrong.
The problem I have with this view is that an AI “reading” a book is not the same as you or I reading. It doesn’t actually learn; it’s just predicting the most likely sequence of words in response to whatever prompt it receives. In that sense, the words are just data, not actual words. Given how valuable data is in this day and age, I think it makes perfect sense for OpenAI to have to either use only public domain or authorized works, or pay the creators for their work.
Here, these videos are a fairly good explanation of how AI is created and “trained”:
https://youtu.be/R9OHn5ZF4Uo
https://youtu.be/wvWpdrfoEv0