this post was submitted on 28 Jun 2023

Programming


From a technical and legal standpoint, ignoring ethics and dignity, is there anything preventing us from scripting a scraper that recreates reddit posts in a lemmy instance? Like maybe top 100 posts of the top 50 subreddits, without comments. I think it would help convince people to join, since the major argument for sticking with reddit is that it has more content. Thoughts?

furrowsofar 2 points 1 year ago

Personally, I would think very carefully about this, from a number of points of view, before doing it. Below is not legal advice, just some thoughts you might want to consider.

  • From an operational and strategic standpoint, is it desirable? Personally, I question whether the threadiverse should want to align itself with R$, whether as a supplier of its content, as a source of traffic for them, or in how it is perceived.

  • Is this activity something that the stakeholders on the R$ side would object to? Personally, I would put this somewhere between "maybe" and "probably".

  • Especially since stakeholders may object, one would expect there to be legal issues. A question to ask then is: under what legal theory are you allowed to do this, and under what legal theory would other stakeholders argue you are not? You might start by reading the R$ user agreement. Can you find any explicit permission for the contemplated activity? Probably not. Keep in mind that copyright is typically a deny-by-default regime: copying is prohibited unless something permits it, so no support may mean denied. There are also the contract terms, if any. The agreement probably has the user give R$ a lot of rights, but does not give anyone else much in the way of rights. Check.

  • OK, what support might exist? Take a look at the user agreement, but last time I checked, I think it states that the content originator retains copyright. This suggests that the originator can give permission, or perhaps can do what they want with their own content. One might guess, though, that this may not include the structure of the content on the service.

  • Then there is the question of linking. Since search engines do that all the time, maybe that might be an avenue? But then the question becomes: why would one want to direct people to content on R$?
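To make the linking idea concrete, here is a rough Python sketch of what a link-only post might look like: it keeps a title and a URL and deliberately leaves the body text behind on the source site. Everything in it is an assumption for illustration. The `LinkPost` type, the `make_link_post` helper, and the `title`/`permalink`/`body` keys are made up and do not correspond to any real R$ or Lemmy API. Whether even this much is permitted is still a question for the user agreement and an attorney.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class LinkPost:
    """A link-only post: a title and a URL, with no copied body text."""
    title: str
    url: str


def make_link_post(source: dict) -> LinkPost:
    """Build a link-only post from a hypothetical source record.

    The "title", "permalink" and "body" keys are assumptions made for
    this sketch, not a real API schema. The body is never read or copied.
    """
    url = source["permalink"]
    parsed = urlparse(url)
    # Only accept absolute https URLs, so the post always points back
    # to the original location.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"not an absolute https URL: {url!r}")
    return LinkPost(title=source["title"].strip(), url=url)


example = {
    "title": "  Example thread title ",
    "permalink": "https://example.com/r/programming/comments/abc123/",
    "body": "Original text that stays on the source site.",
}
post = make_link_post(example)
print(post)
```

The design choice is the point: a record with no body field cannot carry copied content, so a reader has to follow the link to see the original.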

Beyond this, only an attorney can say. This seems like a question for an attorney, say at the Software Freedom Law Center (https://softwarefreedom.org/) or the EFF (https://www.eff.org/), or an attorney of your choice.