this post was submitted on 28 Feb 2024
Furry Technologists
Science, Technology, and pawbs
So I guess there are two paths to training data: a company selling it explicitly, and companies just scraping accessible data. Not that either is "good", but at least with public data, only the AI company profits.
Yep. That's why two of the three things I say Automattic MUST do to make things right are about proper consent controls for Automattic's use of data and its sale to AI vendors, while the third is a proposed proactive defense against scrapers.