this post was submitted on 21 Jun 2023
1149 points (100.0% liked)


Currently, on the main instance, people have created 40191 accounts (+214 marked as deleted). I don't know how many are active because I don't monitor it, but once again, I greet all of you here :) In recent days, the traffic on the website has been overwhelming. It's definitely too much for the basic docker-compose setup, primarily designed for development use. I was aware of the possible consequences of the situation happening on Reddit, but I assumed that most people would migrate to one of the Lemmy instances, which already has an established position. I hoped that a few stray enthusiasts would find their way to kbin ;)

The first step was to upscale the VPS to a higher version (66.91EUR). It quickly turned out that it wasn't enough. I had to enable CF protection just to keep the website responsive, but the response times were still very slow. At this stage, the instance was practically unusable. The next step was a full migration to a dedicated server (100EUR, the current hardware). It can be done relatively quickly, so it resulted in a 5-minute technical break. Despite the much higher parameters, it didn't get any better. It became clear that the problem didn't lie there. I'm really frustrated when it comes to server administration. That was the moment when I started looking for help. Or rather, it found me.

A couple days ago I wrote about how kbin qualified for the Fast Forward program. To be honest, I did it out of pure curiosity and completely forgot because a lot was happening during that time. During the biggest fire incident, Hannah ( @haubles ) reached out with a proposal to help. I outlined the situation (in short: the server is dying, I don't even know what I need, help! ;). She quickly connected us with Vlad ( @vvuksan ) and Renaud ( @renchap ). I was probably too tired because I don't know if the whole operation lasted 60 minutes or 6 hours, but after a series of precise questions and getting an understanding of the situation, the guys themselves adjusted the entire job. I love working with experts, and it's not often that you come across individuals so well-versed in the fediverse. Thanks to Hannah's kindness, we will be staying there a bit longer. Currently, fastly.com handles the caching layer and processes images. Hence those cool moving thumbnails ;)

Things were going well at that point. I could disable Cloudflare protection. Probably thanks to that, many of you are here today, and we got to know each other a bit better :) However, even then, when I tried to enable federation, the server would stop working.

Around the same time, Piotr ( @piotrsikora ), whom I already knew from the Polish fediverse, contacted me. He is the administrator of the Polish Mastodon instance pol.social, operates within the ftdl.pl foundation, and specializes in administering applications with a very similar tech stack. I made the decision to grant him server access. It only took him a few moments, and he came back to me with a few tips that allowed us to enable federation. In the following days, there was more of it, and we managed to reach the current level. I think it's not too bad.

Nevertheless, managing the instance has taken up about 60% or more of my time so far, which prevents me from fully focusing on current tasks. That's why I would like to collaborate with Piotr and hand over full care of the server to him. Piotr will also take care of the security side. Now I have to take this much more seriously. We still need to work out the terms of cooperation, but I want you to know the direction I intend to pursue.

We also need to migrate to a new environment because one server will sooner or later become insufficient. This time, I want to be prepared for it. This may be associated with transient issues with the website in the coming days.

The next two updates will still be about project funding (I still can't believe what happened) and moderation. The following ones will be more technical, with descriptions of changes and what contributors are doing on Codeberg. I would like to be here more often, but not as an admin, just as myself.

Thank you all for this.

P.S. In private messages, I also received numerous offers of help that I didn't even have a chance to read and respond to. You are the best!

[–] stevecrox@kbin.social 3 points 2 years ago

That isn't the issue.

A complete rewrite of the application might add capacity, but that is vertical scaling: you keep stacking more load onto one instance. No matter how much performance you extract, eventually you run out of capacity.

As scale increases you need to add horizontal capacity. This is the idea of adding 2, 10, 100 servers. That means breaking services out into stateless parts which can run concurrently (or parts whose state is explicitly managed).
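To make the stateless idea concrete, here is a minimal sketch, not kbin's actual code. `SharedStore`, `Worker` and `make_pool` are hypothetical names, and the in-memory store only stands in for an external database or cache. Because the workers keep no data of their own, any of them can take any request, so adding capacity just means making the pool bigger.

```python
from itertools import cycle


class SharedStore:
    """Stand-in for an external store (database, cache) shared by all workers."""

    def __init__(self):
        self.votes = {}

    def add_vote(self, post_id):
        # Increment and return the vote count for one post.
        self.votes[post_id] = self.votes.get(post_id, 0) + 1
        return self.votes[post_id]


class Worker:
    """Stateless worker: all data lives in the shared store, none in the worker."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle_vote(self, post_id):
        # Returns which worker answered and the resulting vote count.
        return self.name, self.store.add_vote(post_id)


def make_pool(size, store):
    """Create `size` interchangeable workers backed by the same store."""
    return [Worker(f"worker-{i}", store) for i in range(size)]


if __name__ == "__main__":
    store = SharedStore()
    dispatcher = cycle(make_pool(3, store))  # simple round-robin dispatch
    for _ in range(6):
        print(next(dispatcher).handle_vote("post-1"))
```

The requests are spread over three workers, yet the count stays consistent because no worker owns it.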

This is where something like Kubernetes comes into play, since it's designed to manage docker images over hundreds of servers. Instead of using every last bit of capacity from one server, you spread the load.

Similarly, postgres, like most SQL platforms, doesn't particularly scale beyond 1 instance.

Facebook created Apache Cassandra for this reason. It was one of the early NoSQL databases and is designed to be deployed in multiples of its replication factor (3 is a common choice).

Having data spread over 3, 30, 300 servers is less efficient, but you now have 3, 30, 300 servers responding.

The other advantage is that horizontal scaling is fault tolerant by design.
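A rough sketch of why replication gives you that fault tolerance. This is a deliberately simplified placement scheme, not Cassandra's actual token ring. `replicas_for` and `readable` are hypothetical helpers, and the node names are made up. Each key is hashed to a starting node and copied to the next nodes in the list, so a key stays readable as long as at least one of its replicas is up.

```python
import hashlib


def replicas_for(key, nodes, replication_factor=3):
    """Pick `replication_factor` consecutive nodes, starting from a hash of the key."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def readable(key, nodes, down, replication_factor=3):
    """True if at least one replica of `key` is on a node that is not down."""
    return any(
        node not in down
        for node in replicas_for(key, nodes, replication_factor)
    )


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(6)]
    owners = replicas_for("post-1", nodes)
    print(owners)
    # Losing one replica still leaves two copies to answer reads.
    print(readable("post-1", nodes, down={owners[0]}))
```

With 6 nodes and 3 copies per key, any two nodes can fail without a key becoming unreadable. The cost is the extra storage and write traffic for the copies.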

There is an argument for compiled languages like Go, C# and Java, but honestly the next big win is making as much as possible scale horizontally.