this post was submitted on 03 Jan 2024

I'm a retired Unix sysadmin. Over the years I've built things in COBOL, FORTRAN, C, Perl, REXX, PHP, Visual Basic, various Unix shells and maybe others. Nothing has been a real "application" - mostly just utilities to help me get things done.

Now that I'm retired, and it's cold outside, I'm curious to try some more coding - and I have an idea.

The music communities here seem to post links to YouTube. I generally use Lemmy on my phone but don't use YouTube, or listen to music, on my phone if I can help it. I'd like to scrape a music community here and add the songs posted to a playlist in my musicbrainz account.

Does that sound like a reasonable learner project? Any suggestions for language and libraries appreciated. My preferred IDE is vim on bash and I have a home server running Linux where this could run as a daemon, or be scheduled.

[–] Diabolo96@lemmy.dbzer0.com 9 points 10 months ago (2 children)

The best programming language for automating things is Python. Python is easy and comes with a lot of modules that let you do almost anything. I guarantee you that once you start automating stuff it'll become like a drug, and you'll just "automate it" whenever you have anything repetitive.

And BTW, one of the main uses of Python is website scraping.

https://musicbrainz.org/doc/MusicBrainz_API

[–] some_guy@lemmy.sdf.org 11 points 10 months ago (1 children)

The best language for automation is the one you know best. The second best is one you have to learn.

I think you could do this in bash with youtube-dl.

[–] Diabolo96@lemmy.dbzer0.com 1 points 10 months ago* (last edited 10 months ago) (1 children)

Indeed. While my bash-fu is rudimentary at best, I don't think Bash can be used for web scraping? But I think he could use RSS to get the posts, then extract the YouTube links with a regex and use the dump feature of yt-dlp* to get the video category, title, etc., using jq to parse the JSON. Then it's probably just a matter of using curl to do the API calls, and voilà.

*yt-dlp is better maintained than youtube-dl, or so I heard.
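For anyone who ends up doing the "extract YouTube links with a regex" step in Python rather than bash, here is a minimal sketch. The pattern, the function name `extract_video_ids`, and the 11-character ID length are my assumptions based on how YouTube links commonly look, not something taken from this thread, and the pattern will miss less common link forms.

```python
import re

# Assumed link shapes: youtube.com/watch?...v=ID and youtu.be/ID.
# RSS feeds often escape "&" as "&amp;", so both separators are accepted.
YOUTUBE_RE = re.compile(
    r"https?://(?:www\.|m\.|music\.)?"
    r"(?:youtube\.com/watch\?(?:[^\s\"'<>]*?(?:&|&amp;))?v=|youtu\.be/)"
    r"([A-Za-z0-9_-]{11})"
)


def extract_video_ids(text: str) -> list[str]:
    """Return the unique video IDs found in text, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in YOUTUBE_RE.finditer(text):
        seen.setdefault(match.group(1), None)
    return list(seen)
```

The IDs could then be handed to yt-dlp or used to build canonical URLs; deduplicating here avoids looking up the same video twice when a post is cross-posted.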

[–] some_guy@lemmy.sdf.org 3 points 10 months ago (1 children)

I built two scrapers for a website that hosts images and videos using bash.

They're educational, I swear! /s

I looked through the html and figured out regexes for their media. The scripts will parse all the links on the thumbnail pages and then load the corresponding primary pages with curl. On those pages, it then uses wget to grab the file. Some additional pattern matching names the file to the name of the post.
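The scripts described above are bash with regexes, curl and wget, and they are not shown in this thread. As a rough Python equivalent of just the "parse all the links on the thumbnail pages" step, here is a sketch that swaps the regexes for the standard library's HTML parser. The class and function names are hypothetical, and the download step is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Anchors without an href (e.g. <a name="top">) are skipped.
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def collect_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs for all links in an HTML page."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Using a parser instead of a regex costs a few more lines but copes better with attribute order and quoting differences in the page markup.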

It's probably convoluted, but you can accomplish a lot in bash if you want to.

[–] Diabolo96@lemmy.dbzer0.com 2 points 10 months ago (1 children)

Man, there's something really wrong with Lemmy lately. I only got the notification for your comment 8 days after you sent it. It's the third time this has happened, but this must be the longest a notification has taken to reach me.

[–] some_guy@lemmy.sdf.org 3 points 10 months ago (1 children)

Yes, there's a discussion about this on my instance. Someone there provided a link to where this was getting addressed. Some aspects of federation have been broken for a bit.

https://github.com/LemmyNet/lemmy/issues/4288#issuecomment-1878442186

[–] Diabolo96@lemmy.dbzer0.com 1 points 10 months ago (1 children)
[–] some_guy@lemmy.sdf.org 2 points 10 months ago

Seems like it. My inbox had five replies yesterday (after >1w of only local replies). Today, even more. Yesterday, the GUI was partially broken. Today looks normal.

[–] GreatBlueHeron@lemmy.ca 7 points 10 months ago (3 children)

I find Python difficult - no idea why, it just doesn't feel right. I've tried a few times but never been able to do anything useful with it - that's why it's not in my list above. It does seem though that my proposed project, and development "style", is best suited to Python. Maybe it's time to try again.

[–] some_guy@lemmy.sdf.org 5 points 10 months ago

If you work in bash and don’t like python, maybe it’s too strict. Look into Ruby. It was inspired by Perl. I found it more to my style in that there are many correct solutions and not one implied correct solution.

[–] onlinepersona@programming.dev 2 points 10 months ago* (last edited 10 months ago) (1 children)

Python is basically runnable pseudocode, the kind you would write on a napkin to explain stuff to somebody. There you don't care about curly brackets and naturally indent to show scope. It's way simpler than C and, if you want to, you can add type hints (aka faux static typing).

Package management is done with pip, although nowadays poetry is better as it uses one file to define everything about your project and configure the tools (linter, tester, autoformatter, static type checking).

The advantage of Python is that it has lots and lots of libraries. You don't need to fiddle around with the Lemmy API - use a library (pylemmy, for example).

Want to connect to musicbrainz? https://pypi.org/project/musicbrainzngs/ is probably the best.


Create a virtual env (basically allows you to install all your project dependencies in an environment separate from the global one): python3 -m venv .venv

Activate the virtual env in your shell: source .venv/bin/activate

Now you can start installing dependencies. If you want it super simple, use pip install $package, but updating the list of packages you want in your project is manual: pip freeze > requirements.txt (install them again with pip install -r requirements.txt after rm -rf .venv should you want to start fresh), and you can run into problems with clashing dependencies.
So, I recommend using poetry: pip install poetry, then poetry new . to set up the basic project structure, then add runtime dependencies with poetry add $package, e.g. poetry add pylemmy musicbrainzngs

It's possible to add dev dependencies with poetry like ruff for linting and autoformatting your code and mypy for static type checking. Your unit tests can be written using unittest from the standard library.

CC BY-NC-SA 4.0

[–] GreatBlueHeron@lemmy.ca 1 points 10 months ago (1 children)

Thank you for your detailed response. It's a bit much for my proposed "project". I won't be using any libraries (other than built-in Python json etc.). I've prototyped most of it and it's currently about 15 lines of code. Literally one call to Lemmy, a search on MusicBrainz and a playlist update to ListenBrainz. I know it will grow a lot as I make it a bit more robust, but it's still very small.
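The prototype itself isn't posted here, so the following is only a guess at what a standard-library-only version of the first two steps might look like. The Lemmy response shape (`posts[].post.name` and `posts[].post.url`), the helper names, and the `USER_AGENT` value are assumptions; the MusicBrainz recording search endpoint with `query` and `fmt=json` comes from the MusicBrainz API documentation linked earlier in the thread. The ListenBrainz playlist update is left out because the thread gives no detail about it.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

MUSICBRAINZ_SEARCH = "https://musicbrainz.org/ws/2/recording"
# Hypothetical value: MusicBrainz asks clients to send a meaningful User-Agent.
USER_AGENT = "lemmy-playlist-sketch/0.1 (you@example.com)"


def post_links(lemmy_payload: dict) -> list[tuple[str, str]]:
    """Return (title, url) pairs for posts that carry a link.

    Assumes a Lemmy post-list response shaped like {"posts": [{"post": {...}}]}.
    """
    pairs = []
    for item in lemmy_payload.get("posts", []):
        post = item.get("post", {})
        if post.get("url"):
            pairs.append((post.get("name", ""), post["url"]))
    return pairs


def recording_search_url(title: str, limit: int = 1) -> str:
    """Build a MusicBrainz recording search URL that returns JSON."""
    query = urllib.parse.urlencode({"query": title, "fmt": "json", "limit": limit})
    return f"{MUSICBRAINZ_SEARCH}?{query}"


def best_recording_id(search_payload: dict) -> Optional[str]:
    """Return the ID of the first (highest-ranked) recording, or None."""
    recordings = search_payload.get("recordings", [])
    return recordings[0]["id"] if recordings else None


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (network call, not used in tests)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping the URL building and JSON picking in separate functions from `fetch_json` means the parsing can be tried against saved responses without hitting either service.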

[–] onlinepersona@programming.dev 2 points 10 months ago

I see. No problem :) If it's simple, does what you need it to, and you're happy with it, that's all that matters.

[–] Diabolo96@lemmy.dbzer0.com 2 points 10 months ago* (last edited 10 months ago)

It was just a recommendation. If you feel like Python isn't for you, you can try any other language; the only difference will be how long it takes to build, and you can even use C if you want. Maybe you're so used to low-level programming, like managing memory and having to declare types everywhere, that Python's dumb-proof approach is difficult for you. Just don't think too hard about it: if it's a personal-use script, there's no need to worry about its efficiency or ugliness. If it works, it works.