this post was submitted on 19 Jun 2023
17 points (100.0% liked)

Lemmy Server Performance

6 readers
1 users here now

Lemmy Server Performance

lemmy_server uses the Diesel ORM that automatically generates SQL statements. There are serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, Client/Server Code/SQL Data/server operator apps/sever operator API (performance and storage monitoring), etc.

founded 1 year ago
MODERATORS
 

I have been working with pg_stat_statements extension to PG and it give us a way to see the actual SQL statements being executed by lemmy_server and the number of times they are being called.

This has less overhead than cranking up logging and several cloud computing services enable it by default (example) - so I don't believe it will have a significant slow down of the server.

A DATABASE RESTART WILL BE REQUIRED

It does require that PostgreSQL be restarted. Which can take 10 or 15 seconds, typically.

Debian / Ubuntu install steps

https://pganalyze.com/docs/install/self_managed/02_enable_pg_stat_statements_deb

Following the conventions of "Lemmy from Scratch" server install commands:

sudo -iu postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"

Followed by a restart of the PostgreSQL service.

you are viewing a single comment's thread
view the rest of the comments
[โ€“] RoundSparrow@lemmy.ml 1 points 1 year ago* (last edited 1 year ago) (1 children)

Once the extension is installed and enabled, you can reset the stats whenever you want and it will start to build a new list of queries and I assume reset the average execution time.

I found a curious query that seems to take 10 or 11 seconds on my server:

SELECT "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "person_aggregates"."id", "person_aggregates"."person_id", "person_aggregates"."post_count", "person_aggregates"."post_score", "person_aggregates"."comment_count", "person_aggregates"."comment_score" FROM ("person" INNER JOIN "person_aggregates" ON ("person_aggregates"."person_id" = "person"."id"))
WHERE (("person"."admin" = $1) AND ("person"."deleted" = $2)) ORDER BY "person"."published"

This query is being called over 1000 times an hour with my instance only really doing incoming Federation content. Should this be optimized by doing a subSELECT on the WHERE clause before doing the INNER JOIN?

[โ€“] RoundSparrow@lemmy.ml 1 points 1 year ago

IMPORTANT NOTE: I am rusty on my PostgreSQL, it's been over 10 years since I designed and ran a major time-critical website with it. That is not "SECONDS", it is "MILLISECONDS", but it is still a slow query being called frequently.