this post was submitted on 15 Jun 2023
Technology
I'm sorry, I still don't quite follow what you want. What does it mean to access the entire Fediverse?
Usenet propagates all messages to every server (unless the receiving server specifically blocks a newsgroup or hierarchy). Messages originate at a specific server, but they're copied everywhere.
The Fediverse, in its current state, stores each message on only one server. It will provide a pointer to a given message to another server only if the other server asks for it.
In theory, you could copy all communities and messages from every server yours is federated with onto your home server for a more Usenet-like effect. If you did that, you would be able to view the whole set of communities and messages from your home server even if no one there had subscribed to them yet. In practice no one does that. Yet.
I suspect the best way of achieving what the original poster wants is to copy the community list and message text content, but leave any embedded media on the originating server—text is low-bandwidth, so you could probably fit the entire Fediverse's daily production of same into the bandwidth required for a couple of hours of 4K Netflix.
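To put a rough number on that, here is a back-of-the-envelope sketch. The 7 GB per hour figure is Netflix's published upper bound for Ultra HD streaming; the average post size is purely my assumption, and I have no measured figure for the Fediverse's actual daily output, so this only shows how many text posts would fit in that budget.

```python
# Back-of-the-envelope estimate; the post size is an assumption, not a measurement.
GB = 1000 ** 3

NETFLIX_4K_BYTES_PER_HOUR = 7 * GB  # Netflix's stated upper bound for Ultra HD
AVG_POST_BYTES = 2_000              # assumed: text plus metadata for one post or comment


def posts_per_streaming_hours(hours: float, avg_post_bytes: int = AVG_POST_BYTES) -> int:
    """Number of text posts that fit in the data used by `hours` of 4K streaming."""
    return int(hours * NETFLIX_4K_BYTES_PER_HOUR // avg_post_bytes)


if __name__ == "__main__":
    print(posts_per_streaming_hours(2))
```

Under those assumptions, two hours of 4K streaming is room for about 7 million text posts.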
The Usenet model does seem a better fit for Lemmy than the current setup, because community discovery here is kind of painful, really.
@nyan @leopardboy That. Although even accessing federated content on demand would work. There's no technical reason to limit what the user sees to a single instance. If I want to access lemmy.cafe from mastodon.social, why is that not possible? It's federated, so I should be able to do that. The API definitely allows it, as seen in this thread. The user interface just doesn't deal with it. I understand the limitations at play quite well, which is why I'm thinking of a dedicated client.
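As a minimal sketch of what "on demand" could look like from a dedicated client: the ActivityPub spec says servers should answer requests whose Accept header is `application/activity+json` with the JSON form of an object. The helper names below are hypothetical, and any URL you pass in is just an example. Some instances may also require signed requests, in which case a plain fetch like this would be refused.

```python
import json
import urllib.request

# Media type the ActivityPub spec says servers should respond to.
ACTIVITY_JSON = "application/activity+json"


def build_activitypub_request(object_url: str) -> urllib.request.Request:
    """Build a GET request asking for the ActivityStreams JSON form of a URL."""
    return urllib.request.Request(object_url, headers={"Accept": ACTIVITY_JSON})


def fetch_object(object_url: str, timeout: float = 10.0) -> dict:
    """Fetch and decode one object.

    Needs network access, and may fail if the instance requires signed
    requests or blocks the client.
    """
    request = build_activitypub_request(object_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

The request builder is kept separate from the network call so the client can add signing or caching later without touching the callers.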
Upon further consideration, the smart way to do this would be to have the servers exchange community lists, with post subjects and other metadata. That's enough information to find and evaluate a community. Post bodies and comments can then be fetched on request.
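Roughly what I have in mind, as a sketch only. Nothing like this exists in the protocol today, so `CommunitySummary`, its fields, and `merge_directories` are all made-up names. Each server keeps a directory of summaries and merges in whatever a peer sends, keeping the newer entry for each community.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommunitySummary:
    """Hypothetical metadata record: enough to find and evaluate a community."""
    instance: str                          # e.g. "beehaw.org"
    name: str                              # e.g. "technology"
    subscribers: int
    updated: int                           # Unix timestamp of this summary
    recent_subjects: tuple[str, ...] = ()  # post titles only, no bodies


# Directory keyed by (instance, name).
Directory = dict[tuple[str, str], CommunitySummary]


def merge_directories(local: Directory, remote: Directory) -> Directory:
    """Return a new directory holding the newest summary for each community."""
    merged = dict(local)
    for key, summary in remote.items():
        current = merged.get(key)
        if current is None or summary.updated > current.updated:
            merged[key] = summary
    return merged
```

Post bodies and comments stay on the originating server and would be fetched separately once the user actually opens a community.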
Putting all the data in the client strikes me as more bandwidth-inefficient, but whether that matters these days is questionable. Usenet clients don't download subject lines from the server until after you subscribe to a group, but that's a relic of the days of modems whose speed was measured in baud. And of course it's always easier to start your own project than try to fork the existing server code. Not to mention that I haven't bothered to look at the API yet, so my thinking may be way off-base.
Plus, I get the impression that the existing clients are half-baked, so the more the merrier. ;)
@nyan A couple of hours of research in, and I'm baffled. There is no apparent mechanism for discovery in ActivityPub.
It appears as if the protocol has simply ignored the literal first step of all social interaction: Observation.
Take Lemmy as an example: if I visit beehaw.org directly, I can click on "Communities" and get a list of them. Yet I can't find any way to get that list via ActivityPub. I'm completely and utterly baffled.
A quick skim of the protocol documents suggests you're right: Activity Vocabulary doesn't even mention the "discovery" use case, and no allowance seems to have been made for it. I can't fathom what they were thinking.
The short-term workaround would be to periodically scrape the Web portal's "Communities > Local" page on each instance for a list (and cache the info centrally? Not sure). Hopefully the raw HTML is parsable enough that you wouldn't have to involve Selenium (which I've used in my day job; it's awful) or its ilk.
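If the raw HTML does turn out to be usable, the parsing side could be as small as this sketch using only the standard library. It assumes community links appear as plain `<a href="/c/<name>">` anchors in the served page, which I haven't verified against a live instance; if the list is filled in by JavaScript after load, this won't see it.

```python
from html.parser import HTMLParser


class CommunityLinkParser(HTMLParser):
    """Collects community names from anchors whose href starts with /c/."""

    def __init__(self) -> None:
        super().__init__()
        self.communities: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith("/c/"):
            name = href[len("/c/"):].strip("/")
            # Keep first-seen order and skip repeated links to the same community.
            if name and name not in self.communities:
                self.communities.append(name)


def extract_communities(html: str) -> list[str]:
    """Return the community names linked from one page of HTML."""
    parser = CommunityLinkParser()
    parser.feed(html)
    parser.close()
    return parser.communities
```

Fetching the page and caching the results would sit on top of this; the parser only needs the HTML text.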
The correct fix is, of course, to add the "discovery" case and messages supporting it to the protocol, either as an extension or as core for the next version. This might take years.
It might be worth asking browse.feddit.de how they get their information.
@nyan so it's not only me missing the elephant in the room. O_o
That's a pretty huge gap. Which has severe consequences all the way down the pipe...
There's currently an issue in Lemmy where comments aren't synchronized properly. That may be a direct consequence of replication being entirely push-based, with no fallback pull or reconciliation step in the protocol.
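To illustrate what a reconciliation step might look like, here is a sketch. It is not anything ActivityPub or Lemmy defines: it assumes the origin server could hand over the list of comment IDs for a post, and `fetch` is a hypothetical callback that pulls one comment by ID.

```python
from typing import Callable, Iterable


def reconcile(
    local: dict[str, str],
    remote_ids: Iterable[str],
    fetch: Callable[[str], str],
) -> list[str]:
    """Pull every comment the origin lists that the local copy lacks.

    `local` maps comment ID to comment body and is updated in place.
    Returns the IDs that had to be fetched, in sorted order.
    """
    fetched = []
    for comment_id in sorted(set(remote_ids) - set(local)):
        local[comment_id] = fetch(comment_id)
        fetched.append(comment_id)
    return fetched
```

Run periodically, something like this would let a server repair whatever a dropped push left out, instead of staying out of sync forever.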
I'm looking into a couple directions right now, but this is completely breaking the foundational promise of the very concept of fediverse...
It's possible that if you're trying to build a Twitter substitute on top of the protocol, the issue looks like a mere Sicilian dwarf elephant, since Twitter doesn't have great discoverability either (or need it, really), and synchronization hiccups matter less for that kind of service. It's when you're trying to build a Reddit/Usenet substitute that things fall down.
But yeah, it really is a gap in the design.
@JustusWingert
@technology @leopardboy @nyan
My guess is that this is actually regarded as a feature in the wider #fediverse. At least in the Mastodon community it is. Toots are not searchable or indexed, so you discover topics and people to follow by looking for hashtags, and you organically set up and control your own home feed. This leads to much frustration for many who've moved over from Twitter.
The reasoning is that it should not be easy for big tech (and others) to just slurp up people's data and expose it to a wider audience through public search engines. A level of privacy and owning one's own data has been prioritized over the inconvenience of being less discoverable.
Of course the same thinking does not apply or map equally well to the #threadiverse as it does to Mastodon. I'm sure there are workarounds and a way to make this more seamless for users, but this is just getting started. It was never an issue when everything was on a small number of instances which everyone knew about.