Programming

All things programming and coding related. Subcommunity of Technology.

What are your ideas for a controversial sorting algorithm for Lemmy? (github.com)

submitted 1 year ago* (last edited 1 year ago) by Veritas@lemmy.ml to c/programming

There is a request for comments on the issue "Controversial posts and comments" (#2515). Do you have any ideas on how best to implement this?

I'd like to see some more people chime in with opinions, but maybe that'll come with a PR. At the very least, it's something that can be moved forward with.

— dcormier

[–] ConsciousCode 5 points 1 year ago* (last edited 1 year ago) (1 children)

If I were God Emperor of any sort of content algorithm, I'd want to try a decentralized "simulated annealing" of feature vectors based on up/down votes, subscriptions, etc. That lets you create linear combinations (e.g. on Wikipedia the vector for King - Man might be close to Royalty) and share them around on different platforms. This makes The Algorithm™ a lot more transparent and user-driven, since you can feasibly program, on a per-user basis, all sorts of policies for how their own preference vector updates, or they could have layers of linearly combined vectors for base style, mood, binging, etc.
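
To make the "layers of linearly combined vectors" part concrete, here's a minimal sketch. The layer names, weights, and the `combine` helper are all made up for illustration; nothing here is an existing Lemmy API.

```python
from typing import Dict, List

Vector = List[float]


def combine(layers: Dict[str, Vector], weights: Dict[str, float]) -> Vector:
    """Return the weighted sum of equally sized layer vectors."""
    dim = len(next(iter(layers.values())))
    out = [0.0] * dim
    for name, vec in layers.items():
        w = weights.get(name, 0.0)  # layers without a weight contribute nothing
        for i, x in enumerate(vec):
            out[i] += w * x
    return out


# Hypothetical per-user layers in a 3-dimensional feature space.
layers = {
    "base": [1.0, 0.0, 0.5],
    "mood": [0.0, 1.0, 0.0],
    "binge": [0.0, 0.0, 1.0],
}
weights = {"base": 1.0, "mood": 0.5, "binge": 0.25}

preference = combine(layers, weights)
```

The user (or their client) would only have to tweak `weights` to shift what they see, without touching the underlying layers.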

What I like about this is it doesn't presume some kind of inherent value - it implies a philosophy that everything is equally valid/valuable to someone, and people just differ in what they prefer. If it's something utterly vile that everyone downvotes into oblivion, that'll naturally sink into the recesses of feature-space opposite to whatever "all humans" value. It's also dead simple to implement: you just need an update policy (N-dimensional modular incrementing?) and a way to search a database by a given metric (cosine distance is the new hotness with AI, but back when I first thought of this I used Hamming distance).
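
A rough sketch of the "search by a given metric" half, assuming posts are stored as plain float vectors (for cosine distance) or as bit patterns packed into ints (for Hamming distance). The `nearest` helper and the sample data are hypothetical, and a real database would use an index rather than a full sort.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: treat a zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two bit-packed feature vectors."""
    return bin(a ^ b).count("1")


def nearest(query: Sequence[float], items: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Brute-force search: names of the k items closest to the query."""
    return sorted(items, key=lambda name: cosine_distance(query, items[name]))[:k]


posts = {"cats": [1.0, 0.0], "dogs": [0.8, 0.6], "taxes": [-1.0, 0.0]}
closest = nearest([1.0, 0.1], posts, k=2)
```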

[–] spaduf@lemmy.blahaj.zone 4 points 1 year ago* (last edited 1 year ago) (1 children)

I think this is brilliant but not what they're looking for. You should absolutely spread this idea around if you're not willing to get a group together to implement it.

In my dreams I’d also like to see separate dropdowns for timespan and criteria. They could even have options referring to different time dropoff curves. Then the traditional past day, past week, etc. would effectively be flat curves with a hard cutoff.
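
Something like this is what I have in mind, as a sketch only. The curve names, the `rank` helper, and the post fields are all invented for the example; the criterion and the dropoff curve are picked independently, like the two dropdowns.

```python
from typing import Callable, Dict, List

Curve = Callable[[float], float]


def flat(window_hours: float) -> Curve:
    """Traditional 'past day/week': full weight inside the window, none after."""
    def weight(age_hours: float) -> float:
        return 1.0 if age_hours <= window_hours else 0.0
    return weight


def exponential(half_life_hours: float) -> Curve:
    """Weight halves every half_life_hours."""
    def weight(age_hours: float) -> float:
        return 0.5 ** (age_hours / half_life_hours)
    return weight


def by_votes(post: Dict) -> float:
    """One possible criterion; any function of a post would do."""
    return post["votes"]


def rank(posts: List[Dict], criterion: Callable[[Dict], float], dropoff: Curve) -> List[Dict]:
    """Sort by criterion value scaled by the time dropoff, best first."""
    return sorted(posts, key=lambda p: criterion(p) * dropoff(p["age_hours"]), reverse=True)


posts = [
    {"title": "old hit", "votes": 100, "age_hours": 48},
    {"title": "fresh", "votes": 10, "age_hours": 1},
]
past_day = [p["title"] for p in rank(posts, by_votes, flat(24))]
decayed = [p["title"] for p in rank(posts, by_votes, exponential(24))]
```

Note the flat curve here only pushes out-of-window posts to the bottom with zero weight; a real listing would probably filter them out entirely.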

EDIT: I implore you to make the pitch though. I am desperately waiting for someone to make an honest attempt at a nonpredatory content algorithm.

[–] ConsciousCode 1 points 1 year ago* (last edited 1 year ago)

To clarify, since I didn't focus on OP's question nearly as much as I should've: it does still apply as an edge case. A post starts as some transformation of the poster's preference vector, then every up/down vote nudges its vector toward/away from the preference vector of the voter. If a post gets 50 downvotes, it'll be nudged in the opposite direction of all those voters' preferences. Then the app just needs a policy for showing posts to users so long as they're within some threshold distance of their preferences. This means it naturally attends to the preferences of the context/community of the post, and if for some reason your own preferences are so far from those downvoters' that the post is still close to your own preferences, you can still see it.
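
Roughly, as a sketch: the step size `rate`, the Euclidean metric, and the threshold are arbitrary choices of mine, and the starting "transformation" is assumed to be a plain copy of the poster's vector.

```python
import math
from typing import List

Vector = List[float]


def nudge(post: Vector, voter: Vector, direction: int, rate: float = 0.1) -> Vector:
    """Move the post toward (direction=+1) or away from (direction=-1) the voter."""
    return [p + direction * rate * (v - p) for p, v in zip(post, voter)]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance; any other metric could be swapped in."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def visible(post: Vector, user: Vector, threshold: float) -> bool:
    """Show the post only if it is within threshold of the user's preferences."""
    return distance(post, user) <= threshold


poster = [0.0, 0.0]
voter = [1.0, 0.0]

post = list(poster)            # assumption: identity transformation of the poster's vector
upvoted = nudge(post, voter, +1)
downvoted = nudge(post, voter, -1)

# Fifty downvotes from like-minded voters push the post far away from them.
buried = list(post)
for _ in range(50):
    buried = nudge(buried, voter, -1)
```

With this update rule, repeated downvotes push the vector away without bound, so a real policy would probably want to clamp or normalize it.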

It does have the potential to create echo chambers and silence unusual, fringe, or unpopular opinions though, so it's not strictly better than the normal heuristics.

Your point about timespan and criteria reminds me a lot of a technique I've been seeing in AI circles lately, e.g. the Generative Agents paper - sort a database of memories by some combination of metrics like "recency" and "relevancy" (e.g. distance to a feature vector) and select the top-k results to provide to the AI. This approximates human memory, which itself prioritizes recency and similarity along with other metrics like emotional saliency.
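
A toy version of that retrieval step, assuming a weighted sum of an exponential recency decay and cosine similarity. The weights, half-life, and memory records are invented for the example and aren't taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(memories: List[Dict], query: Sequence[float], now: float, k: int,
             half_life: float = 24.0, w_recency: float = 1.0,
             w_relevance: float = 1.0) -> List[Dict]:
    """Return the k memories with the highest combined recency + relevance score."""
    def score(m: Dict) -> float:
        recency = 0.5 ** ((now - m["time"]) / half_life)  # 1.0 for brand-new memories
        relevance = cosine_similarity(query, m["vec"])
        return w_recency * recency + w_relevance * relevance
    return sorted(memories, key=score, reverse=True)[:k]


# Hypothetical memories; "time" is in hours.
memories = [
    {"text": "breakfast", "time": 99.0, "vec": [0.0, 1.0]},
    {"text": "old cat post", "time": 0.0, "vec": [1.0, 0.0]},
    {"text": "recent cat post", "time": 90.0, "vec": [0.9, 0.1]},
]
top = retrieve(memories, query=[1.0, 0.0], now=100.0, k=2)
top_texts = [m["text"] for m in top]
```

Here the recent, relevant memory wins, and the old but highly relevant one still narrowly beats the fresh but irrelevant one; shifting the weights changes that trade-off.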

As far as making the pitch, well... I don't know who I'd pitch it to, I'm too disorganized to pursue it myself, and I'm a nobody so no one would listen to me anyway :P Feel free to steal it if you want though.