this post was submitted on 05 Jul 2023
240 points (100.0% liked)

Beehaw Support


Support and meta community for Beehaw. Ask your questions about the community, technical issues, and other such things here.


For a refresher on our philosophy, see also What is Beehaw?, The spirit of the rules, and Beehaw is a Community


This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.


submitted 1 year ago* (last edited 1 year ago) by Lionir to c/support
 

We have to move to object storage for efficient image hosting. We are currently at 70% of our disk capacity.

We are sorry for the short notice. We don't know exactly how long the migration will take, though we expect it to take multiple hours.


Small post-mortem:

  1. The migration required us to be down - we thought we could run without images
  2. It took much longer than initially anticipated
  3. It broke close to 2:30 AM, about 3 hours into the migration. I couldn't fix it, so the server has been down doing nothing for hours now.
  4. We're booting back up from the backup made before the migration was attempted, and we'll try another strategy.

There might've been some image data loss; we're looking into it. In the meantime, if your profile picture or banner is broken, feel free to re-upload it.


Update:

Hi Beeple!

We're trying this again.

This time, Beehaw should remain available, though no pictures can be uploaded. The error will likely look weird, because Lemmy will think the upload is possible, but we will block it from happening.

We'll take a snapshot of the pictures before the migration, starting in 60 minutes at 21:00 UTC; it should take around one hour. We'll then do the migration and testing on our own before shipping it to Beehaw.

Once we've worked out all the kinks, Beehaw will momentarily go down and then come back up with the migration complete, without losing any old pictures.
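(For the curious, here is a minimal sketch of how an upload block like this can work at the proxy layer while the app stays online. It is illustrative only, not Beehaw's actual setup; the pict-rs upload path and the ASGI framing are assumptions.)

```python
# Illustrative ASGI middleware: keep the site up but refuse new image
# uploads during the migration. The path and status code are assumptions.
UPLOAD_PREFIX = "/pictrs/image"  # pict-rs upload endpoint (assumed)

class BlockUploads:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if (scope["type"] == "http"
                and scope.get("method") == "POST"
                and scope["path"].startswith(UPLOAD_PREFIX)):
            # Lemmy's UI still offers the upload button, so users see an
            # odd error rather than a friendly "disabled" notice.
            await send({
                "type": "http.response.start",
                "status": 503,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({
                "type": "http.response.body",
                "body": b"Image uploads are disabled during the migration.",
            })
            return
        await self.app(scope, receive, send)
```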

top 50 comments
[–] Penguincoder 78 points 1 year ago (4 children)

Well that was fun.

It didn't go as planned, of course; we restored from the backups made before the migration attempt. Thank you for your patience while we try to get all these moving parts working well together. Sorry for the troubles.

[–] Cube6392 33 points 1 year ago (2 children)

I once caused an AWS outage that impacted 20% of their customers in their largest region. They called my manager to ask why we were performing around 10k writes per second to a bucket. It was fun times

[–] Huggernaut 9 points 1 year ago (1 children)

They don't limit that?! I've worked with a lot of AWS services and most have built-in rate limits. That's wild lol

[–] Cube6392 26 points 1 year ago (1 children)
[–] longshaden 10 points 1 year ago (2 children)

lol, that's how rules get made

[–] Cube6392 9 points 1 year ago

I can get into more detail if anyone's interested. But basically, they had a rate limit on direct writes, but no rate limit on cross-bucket replication, if you connected many buckets to replicate into a single bucket.
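(A hedged reconstruction of that fan-in with boto3; the bucket names, count, and IAM role below are invented, and the real topology and limits were never public.)

```python
# Sketch: direct PUTs to one bucket were rate limited, but replication
# wasn't, so N source buckets replicating into one target multiplies the
# write rate on the destination. All names here are hypothetical.
import boto3

s3 = boto3.client("s3")
DEST_ARN = "arn:aws:s3:::fan-in-destination"
ROLE_ARN = "arn:aws:iam::123456789012:role/replication-role"

for i in range(50):  # every source bucket points at the same destination
    # (replication also requires versioning enabled on each source bucket)
    s3.put_bucket_replication(
        Bucket=f"source-bucket-{i:02d}",
        ReplicationConfiguration={
            "Role": ROLE_ARN,
            "Rules": [{
                "ID": "fan-in",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": DEST_ARN},
            }],
        },
    )
```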

[–] ButterBiscuits 7 points 1 year ago

Right? Don't feel bad, you found a "vulnerability" and now you're a hero

[–] msprout 8 points 1 year ago

That was you?!

(jk, I'm on a different cloud 😂)

[–] l4sgc 20 points 1 year ago

Thanks for the update, and hope you have less trouble in the future! Don't worry about the downtime; I really appreciate that here it serves a clear purpose, unlike Twitter lol

[–] artemisia 12 points 1 year ago

Loads of love. There's always ASCII art.

[–] knittedmushroom 58 points 1 year ago (2 children)

Every technical bump in the road we hit now is one we won't hit/will know how to handle quickly in the future! Thank you for doing what you do for Beehaw!

[–] Lionir 28 points 1 year ago (1 children)

Yeah, moving to object storage is best done now. Arguably, we should've done it sooner, since the longer we waited, the more it was going to catch up with us and cost us time and money.

[–] interolivary 11 points 1 year ago

I'd imagine the list of things you should do RIGHT NAO is pretty long though and there's only 24h per day 😅

[–] AndrewZabar 6 points 1 year ago

I concur. A minor inconvenience on occasion is a small price to pay for your amazing efforts! Thank you for doing what you do.

[–] alehel 46 points 1 year ago (1 children)

Surprised Beehaw hosts images at all. Sounds like that could become very expensive very quickly.

[–] douglasg14b 22 points 1 year ago* (last edited 1 year ago) (2 children)

It could, and will. Hopefully they are taking advantage of CDNs for image delivery, so they aren't paying high egress costs and can keep the images in slow, cheap storage.

I'm honestly surprised that Lemmy hasn't embraced distributed community hosting. Many existing niche communities (outside of Lemmy) let members run their own servers to serve up images and media, or to act as workers for computationally expensive operations like compression or encoding (which will also save a ton of space), even gamifying it in the case of e-hentai.

Hard drives (tapes even more so) at home or in the office are incredibly cheap compared to cloud storage (even after including networking, server, redundancy, etc. hardware costs), but they come with reliability concerns, which is where a distributed community becomes critical. Though you'll always have to keep a copy somewhere like Backblaze B2, or somewhere slower/cheaper/frozen, to ensure safety.

[–] Lionir 14 points 1 year ago

We'll definitely be using a CDN to help avoid high egress costs.

[–] greenskye 7 points 1 year ago (2 children)

I feel like Lemmy definitely needs to embrace distributed computing in some fashion. I have no interest in hosting my own instance, but I'm not against running a Docker image that would offload some of the processing requirements large instances have. It would just need to be relatively straightforward for me to set up.

[–] Cube6392 15 points 1 year ago

Distributed computing isn't really a good fit for low-computation tasks like forum software. It's good for heavy calculations like "Could you please fold proteins to see if there's any interesting stuff to be found" and "Here are 50 years of radio data; see if any of it is anomalous." You need a sufficiently complex, long-running task to warrant the computational overhead of a supervisor process assigning tasks and receiving their outputs. LLMs, epigenetics, and deep-space analysis are all good candidates for distributed computing.

Lemmy is more of a candidate for an autoscaling, clustered, multi-tenant approach. The computational tasks are basic, but there are a lot of them, and the computational needs are not constant. A fantastic case study for making the most of resources in the Fediverse is mastodon.world and lemmy.world running on the same server and making scale-up and scale-down requests to the Docker daemon. The ideal topology, in my opinion, for a Fediverse application ecosystem would be a Kubernetes cluster with three supervisor nodes and a minimum of two worker nodes, all with autoscaling enabled. The idea is that your database resources can hold multiple databases (Lemmy, Mastodon, PeerTube) AND can scale. The mechanism you would use to do this depends on your hosting decisions.
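(To make the autoscaling idea concrete, here's a minimal sketch using the official Kubernetes Python client to attach a horizontal pod autoscaler to a hypothetical Lemmy deployment; the names, namespace, and thresholds are all invented.)

```python
# Sketch: autoscale a "lemmy" Deployment on CPU, so the cluster adds pods
# at peak and sheds them when quiet. Names here are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="lemmy", namespace="fediverse"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="lemmy",
        ),
        min_replicas=2,    # floor, so there's always redundancy
        max_replicas=10,   # ceiling for peak load
        target_cpu_utilization_percentage=70,
    ),
)
client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="fediverse", body=hpa,
)
```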

Digression now on database solutions. There are three basic ways I could see running the perfect Fediverse database cluster. The first, and least beholden to any given cloud provider, is to run Postgres in a Kubernetes cluster, either on a single-machine emulated cluster at your house or across several clustered machines. The upside is that no one but you controls your infrastructure. The downside is that your ability to scale is hard-capped by the amount of RAM and CPU you physically have in your house. Next would be a similar setup on a hosted Kubernetes cluster through a cloud provider such as Google, Microsoft, IBM, or AWS. The downside here is that the tech giants are all, for various reasons, shit. Google has the best eco-friendliness score, so they're listed first. They're still shit, though, and one of the platforms I'm suggesting hosting is a direct competitor to one of their golden-goose products.

Your next option is to just pay one of those cloud providers to host a database cluster for you, rather than using an ad hoc Kubernetes cluster solution. It will cost you more money, but the tools available to you for managing databases through these cloud providers are much better. In terms of user experience and performance, this is a clear upgrade over hosting your databases on your Kubernetes cluster. The final option I'd want to talk about is called "Aurora Serverless." So far, I've only discussed ways you can scale up to meet demand, but Aurora Serverless allows you to scale down. This will be the cheapest option if you run a small instance with clear peaks and valleys of load. It's not the best answer for a user like Beehaw, but would come with the lowest cost in terms of management and money for someone running an instance for a low number of people.

So, does that solve the image hosting problem? No, not really. Postgres is TERRIBLE for image hosting. Right now, Beehaw is, per my understanding, using the simplest image-storing solution, which is "just keep it on the server." This is great for a first pass at hosting a web service, and will remain fine long-term for a low-user instance, but will quickly run into issues on any instance with numerous users uploading pictures. Basically, servers have finite disk space. The only fix is to bring the service down and put in bigger disks, and eventually you reach the upper limit of how big disks are manufactured and how many you can attach to a motherboard's interfaces. A much better solution, and in fact the best solution, is what Beehaw is implementing right now: object storage. If I'm going to tie all of this to the DIY "I'm a strong, independent Fediverse citizen and I don't need no corporations" approach, I'll start by recommending Ceph. Ceph can run on Kubernetes and will provide object storage backed by Kubernetes persistent volumes. But more likely, you will want to aim for something with effectively infinite storage, and your only real options for that currently are the cloud providers. You don't have to worry about disks running out of space, and they do not charge you very much money.
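(One nice property of this move: the S3-compatible API is the same whether the backend is Ceph's RADOS Gateway on your own cluster or a hosted service like B2. A minimal sketch with boto3, where the endpoint, credentials, and bucket are all placeholders:)

```python
# Sketch: store an image in S3-compatible object storage instead of on
# the server's own disk. Endpoint/bucket/keys below are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-object-store.com",  # Ceph RGW or B2
    aws_access_key_id="ACCESS_KEY_ID",
    aws_secret_access_key="SECRET_KEY",
)

# Upload once; the bucket grows without the app server's disk filling up.
with open("avatar.webp", "rb") as f:
    s3.put_object(Bucket="media", Key="pictrs/avatar.webp", Body=f)

# Serve reads through a CDN in front of the bucket to keep egress cheap.
url = s3.generate_presigned_url(
    "get_object", Params={"Bucket": "media", "Key": "pictrs/avatar.webp"}
)
print(url)
```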

I get where you're coming from, though: "How do we all own the images so that the instances don't run out of space, but without being beholden to the corporations who own the storage?" The closest we come right now is peer-to-peer solutions, but all of them have a discovery problem and a durability problem. The discovery problem is: how does a server providing the Lemmy service find the peer-to-peer hosted files? There's no way to serve the files via HTTP other than for the host server to fetch (download) the file from the peer-to-peer network and then deliver it to the user who made the request; but then the server has synced the file to its local storage and is now hosting it, defeating the purpose of the peer-to-peer hosting. The durability problem is what happens when few people are interested in an image and the last person hosting it closes their laptop: now no one can get the image, because there was never a canonically available version of the file. The only solutions I know of that come close to solving these problems right now are Nostr and Secure Scuttlebutt, and both have major issues as they stand. Firstly, people already find joining the Fediverse too hard. For Nostr, you have to generate a cryptographic key pair to create your identity. This isn't... horrible, but it definitely takes some doing: you have to generate the keys and then load them into your Nostr client. Secure Scuttlebutt is based on a protocol where, to follow someone, they have to invite you to follow them. People already complain about Beehaw asking a question about what you like about Beehaw to make sure you read the rules; imagine the frustration with a pure invite-only social network where you can't join until someone you know has joined.

The second problem is moderation. Secure Scuttlebutt is fine for this: you only ever follow people you like, so you only ever see updates from people you like. Fantastic. Nostr has basically no moderation at all. If you've spent any time at all on the internet, you've probably realized by now that this is TERRIBLE. My time on Nostr was basically opening the app, seeing an entire feed full of pro-Russian propaganda, and then uninstalling it. I do think there's something to be said for the idea of a pure peer-to-peer social network, but I don't think we're anywhere close to implementing it yet. So, where does that leave us?

The Fediverse. It was designed as a distributed governance system in which each instance acts as its own country with its own rules, and it accidentally has some pretty neat clustering features that help it perform better under heavy load and keep data more permanent and durable. I want to emphasize that: the current computational and architectural benefits of the Fediverse are accidental. They're side effects of the distributed governance, not the core purpose. I don't expect anyone to put focus into enhancing these aspects of the Fediverse, at least not for a while. We're much more likely to see someone design a community-based social network from the ground up on peer-to-peer technologies. I'd be excited about that, but it will need more open signups than Secure Scuttlebutt, and moderation tools... at all, unlike Nostr. The most likely solution for the latter would be collaborative blocklists. Maybe me and two of my friends have a shared view of what is and isn't hate speech, so we all spend some time just blocking the shit out of users. But no single one of us writes the blocklist; the blocklist itself is a peer-to-peer distributed construct, so we don't all have to reach consensus about "Hey, was this guy being a jerkass?"
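(That last idea is essentially a grow-only set CRDT. A toy sketch, purely illustrative:)

```python
# Toy model of a shared blocklist: each peer adds entries locally, and
# the effective list is the union of everyone's entries. Union is
# commutative, associative, and idempotent, so replicas converge in any
# gossip order without a consensus round.
from dataclasses import dataclass, field

@dataclass
class BlocklistReplica:
    peer: str
    entries: set = field(default_factory=set)

    def block(self, user: str) -> None:
        self.entries.add(user)

    def merge(self, other: "BlocklistReplica") -> None:
        self.entries |= other.entries

alice = BlocklistReplica("alice")
bob = BlocklistReplica("bob")
alice.block("spammer@example.social")
bob.block("troll@example.social")
alice.merge(bob)
print(alice.entries)  # both entries, with no coordination needed
```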

[–] interolivary 7 points 1 year ago (2 children)

> Lemmy definitely needs to embrace distributed computing in some fashion

> It would just need to be relatively straightforward for me to set up

Pick one.

[–] remington 41 points 1 year ago (1 children)

I can't upload my puppy and flower pics!!! Fucking damn you!!! WTF did I sign up for!!!?!?!??!

[–] Penguincoder 36 points 1 year ago (1 children)

You can always host your own instance...

duck

[–] jabib 9 points 1 year ago (1 children)

<bender.jpg> Caption: I'll just make my own Beehaw - with blackjack and hookers!

[–] Powderhorn 21 points 1 year ago

Thanks for everything y'all do to keep Beehaw afloat!

[–] metaltoilet 21 points 1 year ago* (last edited 1 year ago) (8 children)
[–] Lionir 20 points 1 year ago (1 children)

But then I have to click links!

[–] metaltoilet 14 points 1 year ago (1 children)

There's a direct embed feature. I've used it for everything I've posted, to reduce load on the servers.

[–] Lionir 15 points 1 year ago (3 children)

Well, we do proxy the images, so it only saves on storage costs, which after this move will be very cheap: $5/TB/month.

Good to know for now though :)

[–] psudo 10 points 1 year ago (1 children)

Which object store did you go with for that price? It's been a while since I looked, but I remember them costing more than that.

[–] Lionir 20 points 1 year ago (2 children)

We've chosen Backblaze B2; it's one of the cheaper options. Wasabi has similar pricing.

[–] Haatveit 7 points 1 year ago

Not that anyone asked, but as a long-time user, Backblaze is great. Good choice, I think!

[–] Penguincoder 7 points 1 year ago

Look here....

[–] worfamerryman 16 points 1 year ago

Beehaw, Lemmy.ml, and the Mastodon instance I use were all down at the same time. I thought it was some nefarious Meta plot 😂

Turns out it was object storage work for Beehaw and Masto. I'm not sure about Lemmy.ml.

[–] dr_catman 16 points 1 year ago (1 children)

Thank you for making it possible to share endless pictures of beans in the future! It will never get old.

Beans, beans, beans, more beans, perhaps a cat, beans, beans, never gets old!, beans.

[–] average650 16 points 1 year ago (1 children)

Just to be clear, this is just a move of the images, and they'll be back, correct? Just a temporary measure?

[–] interolivary 12 points 1 year ago

Yep, they're moving pictures to a service where it's cheaper to store them rather than keeping them on the server's hard drive

[–] dandelion 15 points 1 year ago

You guys are the best!

I did an ADHD and misread this as you saying you were turning off pictures for good, but given how much I'm enjoying the Beehaw community and the hard work you guys put in to keep it online, I wasn't even that upset about it! A short, well-telegraphed, partial outage is nothing in comparison!

Thanks to all you wonderful people!

[–] cyberdecker 15 points 1 year ago

No worries on the short notice, thank you for the heads up! Sincerely appreciate the transparency.

[–] Rentlar 12 points 1 year ago

Good luck with the migration! I can live with a bit of instability until we're through this. I noticed the past couple of days that the server seemed to go down every hour on the dot... hopefully that won't be the case once the migration is complete.

[–] vertelleus 12 points 1 year ago (3 children)

Thanks for the update!
So, to save them space, we should use externally linked images from other sources.
What are some of your favorite image hosts?
I just started using https://imgbox.com/

[–] retronautickz 11 points 1 year ago (1 children)

If it fails, you can always tell users to upload images to pixelfed and share the link here (I'm joking, don't take this seriously)

[–] TemporalSoup 15 points 1 year ago (2 children)

Maybe to Gfycat? It's a nice short-term storage option

[–] pixelpop3 9 points 1 year ago (2 children)

Gfycat announced they are shutting down and deleting everything on September 1st.

[–] jherazob 20 points 1 year ago

That was the joke :P

[–] argv_minus_one 7 points 1 year ago* (last edited 1 year ago)

Welcome back to onlineness! Well, mostly-onlineness.

[–] Pistcow@lemm.ee 7 points 1 year ago (1 children)

So it's turning into LiveJournal?
