With AI scraping everything, would it make sense to add copyright notices to comments on the fediverse? CC BY-NC-SA 4.0 Deed

onlinepersona@programming.dev · 6 months ago

Fosheze@lemmy.world · edit-2 6 months ago

Copyright doesn’t matter for AI training data because the AI is considered a derivative work; therefore, using whatever content they find as training data is fair use under current copyright law. People are literally training AI on Pixar content without copyright being an issue.

Also, if you don’t want people using your stuff, then why are you posting it in the open on a public board that basically everyone has access to? If you want to protect something, then the first step would be not handing it to everyone and everything with an internet connection.

onlinepersona@programming.dev · 6 months ago

I don’t know if a verdict has already been reached on the topic, nor by whom, but it’s my impression that the issue of copyrighted material used to train LLMs or GPTs is still hotly debated. IIRC, Microsoft was sued over Copilot on the grounds of copyright infringement, especially with regard to licences like GPLv3, where derivative works must be released under the same licence.

Personally, it’s not much of an issue to me whether my public musings get scraped. I’d rather they were scraped and used for open-source stuff, but there’s no way for me to enforce that. However, if a model starts spitting out “CC BY-NC-SA 4.0”, I’d at least find that funny, and it might even help bring a case on the grounds of breaching the “Non Commercial” clause of the licence.