this post was submitted on 03 Jun 2023
149 points (100.0% liked)


Over the past 48 hours I have been glued to my screen trying to figure out how to make Beehaw more robust during this Reddit exodus.

My eyes are burning but I am thankful for so much financial support as well as the work of two sysadmins that have given us all a breath of fresh air.

One of the sysadmins was up until 2:30 am helping us out as a volunteer. I am so very grateful for persons such as this.

Thank you all for your continued support and patience.

[–] veroxii@lemmy.ml 11 points 2 years ago (1 children)

No, joins are always faster. If you ultimately need to combine the data for the app, the database will be faster than your code can do it, since that's what it was built to do.
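To make the comparison concrete, here is a minimal sketch of the two approaches being discussed: letting the database join the rows versus combining them in application code with one extra lookup per row. It uses Python's built-in `sqlite3` with an in-memory database purely for illustration. The `person` and `post` tables and their columns are assumptions, not Lemmy's actual schema, and the sketch only shows that both approaches return the same rows, not how they compare in speed on a real workload.

```python
import sqlite3

# Hypothetical two-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        creator_id INTEGER REFERENCES person(id),
        title TEXT
    );
""")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO post VALUES (?, ?, ?)",
    [(1, 1, "first"), (2, 2, "second"), (3, 1, "third")],
)


def posts_with_join(conn):
    """One query: the database combines post and person rows."""
    return conn.execute(
        "SELECT post.title, person.name "
        "FROM post JOIN person ON person.id = post.creator_id "
        "ORDER BY post.id"
    ).fetchall()


def posts_with_app_side_lookup(conn):
    """N+1 queries: fetch the posts, then look up each creator in app code."""
    rows = []
    posts = conn.execute(
        "SELECT title, creator_id FROM post ORDER BY id"
    ).fetchall()
    for title, creator_id in posts:
        (name,) = conn.execute(
            "SELECT name FROM person WHERE id = ?", (creator_id,)
        ).fetchone()
        rows.append((title, name))
    return rows


print(posts_with_join(conn))
print(posts_with_app_side_lookup(conn))
```

The app-side version issues one query per post on top of the initial one, which is where the round-trip cost usually comes from once the database is on another machine.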

[–] argv_minus_one 4 points 2 years ago (1 children)

Any idea why those queries are slow, then, if not because of all the duplicate data? Missing indices or something?

[–] veroxii@lemmy.ml 8 points 2 years ago (1 children)

Looking at the query, I think it only returns a single row per post, so there isn't really duplicate data. It all looks very straightforward, and you'd expect all the "_id" and "id" columns to be indexed.

I asked for an EXPLAIN ANALYZE plan to see what really happens and where the most time is spent.

If it's indexes we'll see quickly. It might strangely be in the WHERE clause. Not sure what Hot_rank()'s implementation is. But we'll find that out too if we can get the plan timings. Without looking at the numbers it's all just guessing.

And I can't run them myself since I don't have access to a busy instance with their amount of production data. It's the thing about databases - what runs fast in dev, doesn't always translate to real workloads.
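As a rough illustration of what a query plan can reveal about indexes, the sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's built-in `sqlite3`. This is a stand-in, not the same thing as PostgreSQL's `EXPLAIN ANALYZE`: the PostgreSQL command actually runs the query and reports timings, while SQLite's only describes the chosen strategy. The `comment` table, its columns, and the index name are assumptions made up for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' text of each step in SQLite's plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail string is the fourth column of each plan row.
    return [row[3] for row in rows]


# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
)

sql = "SELECT body FROM comment WHERE post_id = ?"

# Without an index on post_id, the plan is a full table scan.
plan_before = query_plan(conn, sql, (1,))
print(plan_before)

# With an index, the plan switches to an index search.
conn.execute("CREATE INDEX idx_comment_post_id ON comment (post_id)")
plan_after = query_plan(conn, sql, (1,))
print(plan_after)
```

The exact wording of the plan lines varies between SQLite versions, but the first plan should mention a scan of `comment` and the second should name `idx_comment_post_id`. As the comment says, the plan on a small dev database may differ from the one chosen against production-sized data.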

[–] argv_minus_one 9 points 2 years ago (1 children)

> It’s the thing about databases - what runs fast in dev, doesn’t always translate to real workloads.

Yeah, that's what really scares me about database programming. I can have something work perfectly on my dev machine, but I'll never find out how well it works under a real-world workload, and my employer really doesn't like it when stuff blows up in a customer-visible way.

I decided to write a stress-test tool for my project that generates a bunch of test data and then hits the server with far more concurrent requests than I expect to see in production any time soon. Sure enough, the first time I ran it, my application crashed and burned just like Beehaw did. Biggest problem: I was using serializable transactions everywhere, and with lots of concurrent requests, they'd keep failing and retrying over and over, never making progress.

That's a lesson I'm glad I didn't learn in production…but it makes me wonder what lessons I will learn in production.
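For the retry storm described above, one common mitigation (not necessarily what was done in this project) is to cap the number of retries and wait a randomized, growing delay between attempts, so that concurrent transactions stop colliding in lockstep. Below is a minimal sketch under stated assumptions: `SerializationFailure` is a hypothetical stand-in for whatever error the database driver raises (PostgreSQL reports these as SQLSTATE 40001), and `txn` is any callable that runs one full transaction attempt.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver's serialization error."""


def run_with_retry(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn(), retrying on SerializationFailure with jittered backoff.

    The delay before retry k is drawn uniformly from
    [0, base_delay * 2**(k - 1)], so colliding transactions spread out.
    After max_attempts failures the last error is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))


# Usage: a fake transaction that conflicts twice before committing.
attempts = {"count": 0}


def flaky_txn():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SerializationFailure()
    return "committed"


print(run_with_retry(flaky_txn))
```

Backoff only eases contention. If most requests conflict, the usual next step is to check whether every transaction really needs the serializable isolation level.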

[–] darkfoe@lemmy.serverfail.party 6 points 2 years ago

This is why I love canary and mirror releases when feasible. Hard to do with some projects though
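A canary release sends a small, stable slice of traffic to the new version before everyone gets it. The sketch below shows one simple way to pick that slice, assuming requests carry some user identifier: hash the ID into one of 100 buckets and route the lowest buckets to the canary. The function name and the percentage parameter are made up for illustration. Mirror releases (copying live traffic to a shadow deployment) are not shown here.

```python
import hashlib


def in_canary(user_id: str, canary_percent: int) -> bool:
    """Return True if this user falls in the canary slice.

    The user ID is hashed so the same user always lands in the same
    bucket (0-99); users in buckets below canary_percent get the canary.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < canary_percent


# Usage: roughly 10% of users would be routed to the new version.
users = [f"user-{i}" for i in range(10_000)]
canary_users = [u for u in users if in_canary(u, 10)]
print(len(canary_users))
```

Hashing rather than picking randomly per request keeps each user on one version, which makes problems easier to trace back to the canary.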