Well that was fun.
Didn't go as planned, of course; we restored from backups taken just before the migration attempt. Thank you for your patience while we try to get all these moving parts working well together. Sorry for the troubles.
I once caused an AWS outage that impacted 20% of their customers in their largest region. They called my manager to ask why we were performing around 10k writes per second to a bucket. It was fun times
They don't limit that?! I've worked with a lot of AWS services and most have built in rate limits. That's wild lol
They do now...
lol, that's how rules get made
I can get into more detail if anyone's interested. But basically, they had a rate limit on direct writes, but no rate limit on cross-bucket replication if you connected many buckets that all replicated into a single bucket.
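The fan-in problem described above can be sketched with some arithmetic. All the numbers here are made up for illustration (the comment only mentions ~10k writes/second in aggregate); the point is that every source bucket individually stays under a per-bucket limit while the shared destination does not.

```python
# Hypothetical fan-in via cross-bucket replication: each source bucket
# respects its own write limit, but the destination receives the sum of
# all replication streams. All numbers are illustrative assumptions.

PER_BUCKET_WRITE_LIMIT = 3_500   # assumed per-bucket direct-write limit, writes/sec
source_buckets = 100             # buckets replicating into one destination
writes_per_source = 100          # each source's write rate, writes/sec

destination_rate = source_buckets * writes_per_source

print(f"each source: {writes_per_source}/s (limit {PER_BUCKET_WRITE_LIMIT}/s)")
print(f"destination receives: {destination_rate}/s")

assert writes_per_source < PER_BUCKET_WRITE_LIMIT   # no single source is throttled
assert destination_rate > PER_BUCKET_WRITE_LIMIT    # but the fan-in blows past the limit
```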
Right? Don't feel bad, you found a "vulnerability" and now you're a hero
That was you?!
(jk, I'm on a different cloud 😂)
Thanks for the update, and hope you have less trouble in the future! Don't worry about the downtime; I really appreciate that here it serves a clear purpose, unlike Twitter lol
Loads of love. There's always ASCII art.
Every technical bump in the road we hit now is one we won't hit/will know how to handle quickly in the future! Thank you for doing what you do for Beehaw!
Yeah, moving to object storage is best to do now. Arguably, we should've done it sooner since the longer we've waited, the more it was gonna catch up to us and cost us in time and money.
I'd imagine the list of things you should do RIGHT NAO is pretty long though and there's only 24h per day 😅
I concur. A minor inconvenience on occasion is a small price to pay for your amazing efforts! Thank you for doing what you do.
Surprised beehaw hosts images at all. Sounds like that could become very expensive very quickly.
It could, and will. Hopefully they are taking advantage of CDNs for image delivery so they aren't paying high egress costs and can keep it in slow, cheap, storage.
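Why a CDN in front of the images helps can be shown with a back-of-envelope calculation. The price and cache-hit ratio below are assumptions, not Beehaw's real numbers: every request served from the CDN edge is a gigabyte the origin never pays egress on.

```python
# Assumed numbers: origin egress billed per GB, CDN absorbing most image hits.
ORIGIN_EGRESS_PER_GB = 0.09      # assumed origin egress price, $/GB
monthly_image_traffic_gb = 2_000  # assumed total image traffic per month
cache_hit_ratio = 0.90            # assumed; static images cache well

without_cdn = monthly_image_traffic_gb * ORIGIN_EGRESS_PER_GB
with_cdn = monthly_image_traffic_gb * (1 - cache_hit_ratio) * ORIGIN_EGRESS_PER_GB

print(f"origin egress without CDN: ${without_cdn:.2f}/month")
print(f"origin egress with CDN:    ${with_cdn:.2f}/month")
```

Under these assumptions the origin's egress bill drops by the cache-hit ratio, which is why pairing cheap "slow" storage with a CDN is attractive.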
I'm honestly surprised that Lemmy hasn't embraced distributed community hosting. Many existing niche communities (outside of Lemmy) let members run the service themselves to serve up images and media, or to act as workers for computationally expensive operations like compression or encoding (which will also save you a ton of space). Some even gamify it, as in the case of e-hentai.
Hard drives (tapes even more so) at home or in an office are incredibly cheap compared to cloud storage costs (even after including networking, server, and redundancy hardware), but they come with reliability concerns, which is where a distributed community becomes critical. Though you'll always have to keep a copy somewhere like Backblaze B2, or somewhere slower/cheaper/frozen, to ensure safety.
We'll definitely be using a CDN to help avoid high egress costs.
I feel like Lemmy definitely needs to embrace distributed computing in some fashion. I have no interest in hosting my own instance, but I'm not against running a Docker image that would offload some of the processing requirements large instances have. It would just need to be relatively straightforward for me to set up.
Distributed computing isn't really a good fit for computationally light tasks like forum software. It's good for heavy calculations like "could you please fold proteins to see if there's anything interesting to be found" or "here are 50 years of radio data; see if any of it is anomalous." You need a sufficiently complex, long-running task to warrant the computational overhead of a supervisor process assigning tasks and collecting their outputs. LLMs, epigenetics, and deep-space analysis are all good candidates for distributed computing.

Lemmy is more of a candidate for an autoscaling, clustered, multi-tenant approach. The computational tasks are basic, but there are a lot of them, and the computational needs are not constant. A fantastic case study in making the most of resources in the Fediverse is mastodon.world and lemmy.world running on the same server and making scale-up and scale-down requests to the Docker daemon.

The ideal topology, in my opinion, for a Fediverse application ecosystem would be a Kubernetes cluster with three supervisor nodes and a minimum of two worker nodes, all with autoscaling enabled. The idea is that your database resources can hold multiple databases (Lemmy, Mastodon, PeerTube) AND can scale. The mechanism you'd use to do this depends on your hosting decisions.
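The "overhead of a supervisor process" argument above can be made concrete with a toy efficiency model. The 50 ms per-task coordination cost is an assumption; the point is the ratio, not the absolute numbers.

```python
# Toy model: efficiency = useful compute time / (compute time + coordination
# overhead). Overhead per task (dispatch, network, result collection) is an
# assumed 50 ms.

OVERHEAD_S = 0.050  # assumed supervisor/network cost per task, seconds

def efficiency(task_seconds: float) -> float:
    """Fraction of wall time spent on useful work for one distributed task."""
    return task_seconds / (task_seconds + OVERHEAD_S)

# A protein-folding work unit might run for an hour; a forum request ~5 ms.
print(f"hour-long work unit:  {efficiency(3600):.4f}")   # overhead negligible
print(f"single forum request: {efficiency(0.005):.4f}")  # overhead dominates
```

With these assumptions, the long-running task wastes a negligible fraction of time on coordination, while the forum-sized task spends over 90% of its wall time on overhead, which is why forum software wants autoscaling rather than volunteer compute.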
A digression now on database solutions. There are three basic ways I could see running the perfect Fediverse database cluster. The first, and the one least beholden to any given cloud provider, is to run Postgres in a Kubernetes cluster, either on a single-machine emulated cluster at your house or across several clustered machines. The upside is that no one but you controls your infrastructure. The downside is that your ability to scale is hard-capped by the amount of RAM and CPU you physically have in your house. Next would be a similar setup on a hosted Kubernetes cluster through a cloud provider such as Google, Microsoft, IBM, or AWS. The downside here is that the tech giants are all, for various reasons, shit. Google has the best eco-friendliness score, so they're listed first. They're still shit, though, and one of the platforms I'm suggesting hosting is a direct competitor to one of their golden-goose products.
Your next option is to just pay one of those cloud providers to host a database cluster for you, rather than using an ad hoc Kubernetes cluster solution. It will cost you more money, but the tools available to you for managing databases through these cloud providers are much better. In terms of user experience and performance, this is a clear upgrade over hosting your databases on your Kubernetes cluster. The final option I'd want to talk about is called "Aurora Serverless." So far, I've only discussed ways you can scale up to meet demand, but Aurora Serverless allows you to scale down. This will be the cheapest option if you run a small instance with clear peaks and valleys of load. It's not the best answer for a user like Beehaw, but would come with the lowest cost in terms of management and money for someone running an instance for a low number of people.
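The "cheapest for clear peaks and valleys" claim above is just arithmetic: an always-on database bills for every hour of the month, while a scale-to-zero serverless one bills only for active hours, even at a higher unit price. The prices and duty cycle below are invented placeholders, not real AWS pricing.

```python
# Toy cost comparison for a small instance with peaky load.
HOURS_PER_MONTH = 730
PROVISIONED_PER_HOUR = 0.20           # assumed: always-on instance, $/hour
SERVERLESS_PER_ACTIVE_HOUR = 0.30     # assumed: higher unit price when active

busy_hours = 6 * 30                   # assumed: active ~6 hours/day

provisioned = PROVISIONED_PER_HOUR * HOURS_PER_MONTH
serverless = SERVERLESS_PER_ACTIVE_HOUR * busy_hours

print(f"provisioned: ${provisioned:.2f}/month")
print(f"serverless:  ${serverless:.2f}/month")
```

Under these assumptions the serverless option wins despite the 50% higher hourly rate; flip the duty cycle to near-constant load (as on a large instance like Beehaw) and the provisioned option wins instead.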
So, does that solve the image-hosting problem? No, not really. Postgres is TERRIBLE for image hosting. Right now, Beehaw is, per my understanding, using the simplest image-storing solution, which is "just keep it on the server." This is great for a first pass at hosting a web service, and will remain fine long-term for a low-user instance, but any instance hosting numerous users who upload pictures will quickly run into issues: a server's disks have finite space. The only fix is to bring the service down and put in bigger disks, and eventually you hit the upper limit of how big disks are manufactured and how many you can attach via the interfaces on a motherboard. A much better solution, and in fact the best solution, is what Beehaw is implementing right now: object storage. If I'm going to tie all of this back to the DIY "I'm a strong, independent Fediverse citizen, and I don't need no corporations" angle, I'll start by recommending Ceph. Ceph can run on Kubernetes and will provide object storage backed by Kubernetes persistent volumes. But more likely, you'll want something with effectively infinite storage capacity, and your only real options for that currently are the cloud providers. You don't have to worry about disks running out of space, and they don't charge you very much money.
I get where you're coming from, though: "How do we all own the images so that the instances don't run out of space, without being beholden to the corporations who own the storage?" The closest we come right now is peer-to-peer solutions, but all of them have a discovery problem and a durability problem.

The discovery problem is: how does a server providing the Lemmy service find the peer-to-peer-hosted files? There's no way to serve the files over HTTP other than for the host server to fetch (download) the file from the peer-to-peer network and then deliver it to the user who made the request. But then the server has synced the file to its local storage and is now hosting it, defeating the purpose of the peer-to-peer hosting solution. The durability problem is what happens when few people are interested in an image and the last person hosting it closes their laptop. Now no one can get the image, because there was never a canonically available copy of the file.

The only solutions I know of that come close to solving these problems right now are Nostr and Secure Scuttlebutt, and both have major issues as they stand. Firstly, people already find joining the Fediverse too hard. For Nostr you have to generate a cryptographic key pair to create your identity. This isn't... horrible, but it takes some doing: you have to generate the keys and then load them into your Nostr client. Secure Scuttlebutt is based on a protocol where, to follow someone, they have to invite you. People already complain about Beehaw asking what you like about Beehaw to make sure you read the rules. Imagine the frustration with a pure invite-only social network where you can't join until someone you know has joined.
The second problem is moderation. Secure Scuttlebutt is fine for this. You only ever follow people you like, you only ever see updates from people you like. Fantastic. Nostr has basically no moderation at all. If you've spent any time at all on the internet, you've probably realized by now that this is TERRIBLE. My time on Nostr was basically opening the app, seeing an entire feed full of pro-Russian propaganda, and then uninstalling it. I do think there's something to be said for the idea of a pure peer 2 peer social network, but I don't think we're anywhere close to implementing it yet. So, where does that leave us?
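The discovery and durability problems above come down to how peer-to-peer networks identify files: by content address, the way IPFS-style systems do. A minimal sketch, with a made-up "hosts" set standing in for the peer swarm:

```python
# Content addressing: the file's hash is its identifier. Any peer can
# verify what it fetched, but the address says nothing about WHO hosts
# the bytes (discovery) or whether anyone still does (durability).
import hashlib

def content_id(data: bytes) -> str:
    """Derive a stable, verifiable identifier from the file's bytes."""
    return hashlib.sha256(data).hexdigest()

image = b"\x89PNG...pretend this is a cat picture"
cid = content_id(image)

# Verification is trivial for anyone who holds the bytes...
assert content_id(image) == cid

# ...but an empty hosting set is a perfectly valid network state: nothing
# in the address forces a copy to exist anywhere.
hosts: set[str] = set()  # the last laptop closed -> image unreachable
print(cid[:12], "reachable:", bool(hosts))
```

The hash solves integrity, not availability: when `hosts` is empty, the identifier still exists but the image is gone, which is exactly the durability failure described above.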
The Fediverse. It was designed for a distributed governance system in which each instance acts as its own country with its own rules and governance, and it accidentally has some pretty neat clustering features that help it perform better under heavy load and keep data more permanent and durable. I want to emphasize that: the current computational and architectural benefits of the Fediverse are accidental. They're side effects of the distributed governance, not the core purpose, and I don't expect anyone to put focus into enhancing these aspects of the Fediverse, at least not for a while. We're much more likely to see someone design a community-based social network from the ground up on peer-to-peer technologies. I'd be excited about that, but it will need more open signups than Secure Scuttlebutt, and moderation tools that exist... at all, unlike Nostr. The most likely mechanism for the latter is collaborative blocklists. Maybe two friends and I share a view of what is and isn't hate speech, so we all spend some time blocking the shit out of users. But no one of us writes the blocklist; the blocklist itself is a peer-to-peer distributed construct, so we don't all have to reach consensus on "hey, was this guy being a jerkass?"
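The collaborative-blocklist idea above can be sketched without any consensus protocol at all: each maintainer keeps their own list, and the effective shared list is a pure function of everyone's lists under a quorum rule. The names and the 2-of-3 threshold are invented for illustration.

```python
# No one "writes" the merged blocklist; it's derived from individual lists.
from collections import Counter

my_blocks = {"spammer42", "jerkass99"}
friend_a  = {"jerkass99", "cryptobro7"}
friend_b  = {"jerkass99", "spammer42", "troll123"}

QUORUM = 2  # blocked by at least 2 of the 3 maintainers

# Count how many maintainers block each user, then apply the threshold.
votes = Counter(u for lst in (my_blocks, friend_a, friend_b) for u in lst)
shared_blocklist = {user for user, n in votes.items() if n >= QUORUM}

print(sorted(shared_blocklist))  # ['jerkass99', 'spammer42']
```

Because the merge is deterministic, every peer that sees the same three lists computes the same result, with no vote or discussion about any individual jerkass required.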
> Lemmy definitely needs to embrace distributed computing in some fashion

> It would just need to be relatively straightforward for me to set up
Pick one.
I can't upload my puppy and flower pics!!! Fucking damn you!!! WTF did I sign up for!!!?!?!??!
You can always host your own instance...
duck
<bender.jpg> Caption: I'll just make my own Beehaw - with blackjack and hookers!
Thanks for everything y'all do to keep Beehaw afloat!
But then I have to click links!
There's a direct-embed feature. I've used it for everything I've posted, to reduce load on the servers.
Well, we do proxy the image, so it's just saving on storage costs, which after this move will be very cheap: $5/TB/month.
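At the quoted $5/TB/month, the storage bill scales linearly with what's hoarded. A quick sketch (storage only; egress and API-call fees, which B2-style providers also charge, are ignored here):

```python
# Storage-only cost at the rate quoted above; other fees not included.
PRICE_PER_TB_MONTH = 5.00  # $/TB/month, from the comment above

def monthly_cost(stored_tb: float) -> float:
    """Monthly storage bill for a given number of terabytes."""
    return stored_tb * PRICE_PER_TB_MONTH

for tb in (0.5, 2, 10):
    print(f"{tb:>5} TB -> ${monthly_cost(tb):.2f}/month")
```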
Good to know for now though :)
Which object store did you go with for that price? It's been awhile since I looked, but I remember them being more than that.
We've chosen Backblaze B2, it's one of the cheaper options. Wasabi has similar pricing.
Not that anyone asked, but as a long-time user, Backblaze is great. Good choice, I think!
Look here....
Beehaw, Lemmy.ml, and the Mastodon instance I use were all down at the same time. I thought it was some nefarious Meta plot 😂
Turns out it was object storage for Beehaw and Masto. I'm not sure about Lemmy.ml
Thank you for making it possible to share endless pictures of beans in the future! It will never get old.
Beans, beans, beans, more beans, perhaps a cat, beans, beans, never gets old!, beans.
Just to be clear, this is just a moving of images, and it will be back correct? Just a temporary measure?
Yep, they're moving pictures to a service where it's cheaper to store them rather than keeping them on the server's hard drive
You guys are the best!
I did an ADHD and misread you as saying you were turning off pictures for good, but given how much I'm enjoying the Beehaw community and the hard work you guys put in to keep it online, I wasn't even that upset about that! A short, well-telegraphed, partial outage is nothing in comparison!
Thanks to all you wonderful people!
No worries on the short notice, thank you for the heads up! Sincerely appreciate the transparency.
Good luck with your migration! I can live with a bit of instability until we're through this. I noticed the past couple of days that the server seemed to go down every hour on the dot... hopefully that won't be the case once the migration is complete.
Thanks for the update!
So, to save them space, we should use externally linked images from other sources.
What are some of your favorite image hosts?
I just started using https://imgbox.com/
If it fails, you can always tell users to upload images to pixelfed and share the link here (I'm joking, don't take this seriously)
Maybe Gfycat? It's like a nice short-term storage
Gfycat announced they are shutting down and deleting everything on September 1st.
That was the joke :P
Welcome back to onlineness! Well mostly-onlineness.