@crashdoom @brodokk There are a few other folks who have said things to the effect of "I don't want my data elsewhere." Short form is, that's impossible under ActivityPub (it's a copy-on-read architecture).
If you don't think Meta is scraping the output of every federated feed already, you're sorely mistaken.
The fact of the matter is that Meta's content is going to get into the rest of the fedi and they're going to get our content. The only ones that won't are the ones that have isolated themselves away from fedi.
@crashdoom I've had thoughts about this for a while.
Building a first-warning system (using e.g. TensorFlow) would be a good way of at least seeing possible issues ahead.
I'm curious whether, functionally, it would help to treat anything that ends up flagged as spam as the equivalent of "followers only" for some amount of time, until a human has had a chance to clear it. I'd expect it to help with the shadowban issue.
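Rough sketch of that hold idea, just to make it concrete. Everything here is made up for illustration: the `Post` record, the `effective_visibility` helper, and the 24-hour `HOLD` window are not from Mastodon or any real server software. It also assumes that a post nobody got around to reviewing goes back to its original visibility once the hold runs out, which is only one way to read "some amount of time."

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical hold window; the idea above only says "some amount of time".
HOLD = timedelta(hours=24)


@dataclass
class Post:
    visibility: str                       # e.g. "public" or "followers_only"
    flagged_at: Optional[datetime] = None  # set when a classifier flags the post
    cleared_by_human: bool = False         # set when a moderator clears it


def effective_visibility(post: Post, now: datetime) -> str:
    """Return the visibility to serve, given the post's spam-flag state."""
    # Only public posts get narrowed; anything already restricted is untouched.
    if post.visibility != "public":
        return post.visibility
    # Never flagged, or a human has already cleared it: serve as posted.
    if post.flagged_at is None or post.cleared_by_human:
        return post.visibility
    # Flagged and still inside the hold window: treat as followers-only.
    if now - post.flagged_at < HOLD:
        return "followers_only"
    # Assumption: an expired, unreviewed hold falls back to the original.
    return post.visibility


now = datetime.now(timezone.utc)
held = Post("public", flagged_at=now - timedelta(hours=1))
print(effective_visibility(held, now))
```

The point of computing it at read time instead of rewriting the post is that clearing a flag is just flipping `cleared_by_human`, with nothing to undo.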
I think new accounts, especially those from a server that nobody follows from, are the biggest ones to look out for.
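As a cheap pre-filter before any classifier, that could look something like the sketch below. The `risk_level` name and the 7-day `NEW_ACCOUNT_AGE` cutoff are invented for illustration, and it reads "a server that nobody follows from" as "no local user follows any account on that server," which is an assumption about what was meant.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cutoff for what counts as a "new" account.
NEW_ACCOUNT_AGE = timedelta(days=7)


def risk_level(account_created: datetime,
               local_follows_on_server: int,
               now: datetime) -> int:
    """Return 0 (nothing special), 1 (new account), or 2 (new + unknown server)."""
    if now - account_created >= NEW_ACCOUNT_AGE:
        return 0  # established accounts are not singled out here
    # New account: bump the level if nobody local follows anyone on its server.
    return 2 if local_follows_on_server == 0 else 1


now = datetime.now(timezone.utc)
print(risk_level(now - timedelta(days=1), 0, now))
```

A higher level would only mean "look at this one sooner," not "block it."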
Honestly, I also take a fairly straightforward opinion on posts from other places: if it's public to the world, it's fair game, especially for classifier data. Generative models, no, but pure classifiers? Go ham. They put it on the Public Internet.