this post was submitted on 14 Dec 2023


TL;DR: for stuff that is NOT from Sonarr/Radarr (e.g. downloaded a long time ago, gotten from friends, RSS feeds, whatever), is there a better way to find subs than downloading everything from manual DDL sites and trying each one until one works (matching English text and correctly synced)?

I am not currently using Bazarr. I understand that it can catch anything from Sonarr that is missing subs, but that is not the use case I need. I am still open to it, but since most of the new stuff I get already has subs, I'm looking more at my stuff that is NOT coming from Sonarr, because that's where I have the most missing subs. I'm thinking, since their GitHub says:

"Be aware that Bazarr doesn't scan disk to detect series and movies: It only takes care of the series and movies that are indexed in Sonarr and Radarr."

that most of my use case is going to be manual searches. It also sounds like Bazarr uses the same kind of DDL sites I am already using (like opensubtitles and subscene) as its backend/source, so I'm curious whether there is any advantage over looking up old stuff on the sites directly.

I'm especially interested in whether there is some way to match existing files with the correct subs, even if the file/folder names no longer contain the release group (e.g. via duration or other mediainfo data, or maybe even via checksums). I know VLC can do it for a single file, but since I have a LOT of stuff with missing subs, I'm looking for a way to do something similar from a bash script or some other bulk job without getting a bunch of unsynced subs.
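For the checksum idea, here is a minimal Python sketch of the file hash that, as far as I know, OpenSubtitles-style lookups are based on: the file size plus the 64-bit sums of the first and last 64 KiB, so it ignores the file name entirely. The function name `opensubtitles_hash` and the constant `CHUNK` are just names picked for this example, and the details should be checked against the site's own documentation before relying on it.

```python
import os
import struct

CHUNK = 64 * 1024  # bytes read from the start and from the end of the file


def opensubtitles_hash(path):
    """Return a 16-character hex hash built from file size, head and tail."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK:
        raise ValueError("file is too small to hash this way")

    total = size
    with open(path, "rb") as f:
        for offset in (0, size - CHUNK):
            f.seek(offset)
            data = f.read(CHUNK)
            # Treat the chunk as unsigned 64-bit little-endian integers and sum them.
            total += sum(struct.unpack("<%dQ" % (CHUNK // 8), data))

    # Keep only the low 64 bits, as a fixed-width hex string.
    return "%016x" % (total & 0xFFFFFFFFFFFFFFFF)
```

Because only the size and the two ends of the file are read, this is fast enough to loop over a whole library from a bulk job. A hash lookup can only succeed if someone uploaded subs for that exact file, so renamed files still match but re-encoded ones will not.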

top 18 comments
[–] pe1uca@lemmy.pe1uca.dev 8 points 11 months ago (2 children)

I got annoyed at not finding CC for the media I have dubbed: if the show/movie is originally in English and I have it in Spanish, the Spanish subtitles are not made from the Spanish audio but are translations of the English audio, so they don't usually match.
(Tom Scott recently made a video about this issue: https://youtu.be/pU9sHwNKc2c)

I found and have been using this project: https://github.com/jhj0517/Whisper-WebUI
It's been pretty good; for YouTube videos (10-30 minutes) it has been perfect.
But there were some issues when I tried it with movies: the timings are not great, and sometimes it hallucinates words in parts where there isn't any speech. Apart from that, only a few words are actually wrong or missing. (I tried it with faster-whisper since I don't have that much RAM.)
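One hedged way to cut down on the hallucinated words in silent parts is to filter the transcript afterwards. The sketch below assumes each segment is a dict with `no_speech_prob` and `avg_logprob` keys, which is how the reference Whisper Python package reports segments as far as I know; the function name and the two threshold defaults are assumptions for illustration, not tuned values.

```python
def drop_probable_silence(segments, max_no_speech=0.6, min_avg_logprob=-1.0):
    """Keep segments unless they look like text invented over silence.

    A segment is dropped only when the model both thinks there is probably
    no speech AND was not confident about the words it produced.
    """
    kept = []
    for seg in segments:
        probably_silent = seg.get("no_speech_prob", 0.0) > max_no_speech
        low_confidence = seg.get("avg_logprob", 0.0) < min_avg_logprob
        if probably_silent and low_confidence:
            continue
        kept.append(seg)
    return kept
```

Requiring both conditions is deliberately conservative, so quiet but real dialogue is less likely to be thrown away.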

[–] ramenbellic@midwest.social 5 points 11 months ago (2 children)

Running it through Subtitle Edit with WhisperX can help a lot for longer movies. It breaks the file into much smaller pieces and runs Whisper on them one by one before stitching the result back together.
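A rough Python sketch of that split-and-stitch idea, assuming fixed-length windows for simplicity (the real tools may cut on silence instead, and this is not their actual code). `chunk_spans` and `stitch` are made-up names for this example, and the per-chunk transcription step is left out.

```python
def chunk_spans(duration, chunk_len=30.0):
    """Split [0, duration) into consecutive (start, end) windows in seconds."""
    spans = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        spans.append((start, end))
        start = end
    return spans


def stitch(chunk_results):
    """Merge per-chunk segments back onto one timeline.

    chunk_results is a list of (chunk_start, segments) pairs, where each
    segment is (start, end, text) with times relative to its own chunk.
    """
    merged = []
    for chunk_start, segments in chunk_results:
        for seg_start, seg_end, text in segments:
            # Shift chunk-relative times by where the chunk began in the file.
            merged.append((chunk_start + seg_start, chunk_start + seg_end, text))
    merged.sort(key=lambda seg: seg[0])
    return merged
```

Since every chunk starts its own clock at zero, timing errors cannot pile up over a two-hour movie, which may be why this helps with longer files.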

[–] BlackFlagsForever@lemmy.dbzer0.com 2 points 11 months ago* (last edited 11 months ago)

interesting, that actually sounds like an awesome idea for the OTA tv rips, cuz I doubt I would even be able to find anything that matches by duration on normal sub sites.

I hadn't heard of whisper gui / whisperx before but I see it has a github. ~~do you know if that is cloud-based or something you can run entirely local? (wondering if it is cloud-based in case i need to allow it net access & also curious if it would eat a lot of bandwidth for roughly 2 seasons of broadcast tv shows aka somewhere around 30-35 hrs worth of audio)~~

edit: apparently whisper can be run entirely offline according to this so if whisperx is a fork, then i assume it would allow this too

[–] SchizoDenji@lemm.ee 1 points 11 months ago (1 children)

Does it work well with movies in other languages? I assume due to BGM it might cause errors?

[–] ramenbellic@midwest.social 1 points 10 months ago

My limited experience has been positive w/ non-English languages.

[–] BlackFlagsForever@lemmy.dbzer0.com 2 points 11 months ago* (last edited 11 months ago)

thanks for the suggestion. i was completely unaware of the whisper project, and even if it doesn't help much for movies, it might come in real handy for some of the OTA rips I have from my friends (was pretty sure I was SOL for those but this seems like a decent option).

sounds like it can even be run entirely offline so even better

[–] tun@lemm.ee 4 points 11 months ago (1 children)

I used to use subliminal command line to download subtitles.

subscene is the website I used to find (no api) if subliminal failed.

[–] BlackFlagsForever@lemmy.dbzer0.com 1 points 11 months ago (1 children)

yeah, i mostly use subscene now. Looks like I was able to pip install subliminal so will check that out.. guess i need to make some accounts/api keys first.

do you still get issues with mismatched / out-of-sync subs here and there?

[–] tun@lemm.ee 3 points 11 months ago (1 children)

I didn't use any API account (OpenSubtitles was still free and open to the public).

  1. Most of the time I got a hit (subliminal supports many sites with API access).
  2. If it missed, I checked the file with mediainfo to get the fps.
  3. If there was release info and fps, I downloaded manually, searching by fps.
  4. If the fps was correct but the timing was off, I used Subler to correct the timing after working out the offset manually in VLC or MPV.
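The fps and timing fixes in the steps above can also be scripted. This is a minimal sketch, assuming plain .srt text with the usual `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines; `retime_srt` is a made-up name, and it is a stand-in for what a subtitle editor does, not how any particular tool works.

```python
import re

# Matches one SRT timestamp such as 01:02:03,456
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def retime_srt(text, factor=1.0, offset_ms=0):
    """Scale and then shift every timestamp on the timing lines of an SRT.

    factor    -- use sub_fps / video_fps when the subs were timed for another fps
    offset_ms -- constant shift in milliseconds (negative makes subs earlier)
    """
    def fix(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms
        total = max(0, round(total * factor) + offset_ms)
        h, rest = divmod(total, 3600000)
        m, rest = divmod(rest, 60000)
        s, ms = divmod(rest, 1000)
        return "%02d:%02d:%02d,%03d" % (h, m, s, ms)

    lines = []
    for line in text.split("\n"):
        # Only touch timing lines, so dialogue that mentions a time is left alone.
        if "-->" in line:
            line = TIMESTAMP.sub(fix, line)
        lines.append(line)
    return "\n".join(lines)
```

For example, subs timed for 25 fps on a 23.976 fps file would use `factor=25 / 23.976`, and a sub that is consistently half a second late would use `offset_ms=-500`.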

These days, my *arr setup (following the TRaSH Guides) and Plex Pass get me the subtitles automatically.

[–] BlackFlagsForever@lemmy.dbzer0.com 1 points 11 months ago

thanks for this, that's some good info 😀

[–] iesou@lemm.ee 3 points 11 months ago (2 children)

Opensubtitles.com has an AI service to transcribe, translate, or provide VO for a small fee: https://ai.opensubtitles.com.

I was thinking of using it for some of my older more obscure stuff bazarr can't find.

[–] moosetwin@lemmy.dbzer0.com 5 points 11 months ago (1 children)

opensubtitles puts heavy advertising in their subtitles

[–] Tetsuo@jlai.lu 5 points 11 months ago (1 children)

They also literally put download links to malware...

Definitely on my list of unsafe websites.

Too bad it's the reference for subs.

[–] BlackFlagsForever@lemmy.dbzer0.com 1 points 11 months ago* (last edited 11 months ago) (1 children)

what do you use instead? i usually start on subscene, and on the rare occasion it doesn't have it or is down, i go and hit all the others i know until i find it or come up empty-handed.

I use ublock in the browser and never click on links when watching videos (does vlc even support that out of the box? never tried)

[–] Tetsuo@jlai.lu 2 points 11 months ago

I use the VLC subtitles download feature.

I think it goes to opensubtitles anyway but at least you don't have to experience the website.

[–] nudnyekscentryk@szmer.info 3 points 11 months ago* (last edited 11 months ago) (1 children)

~~isn't opensubtitles.COM an impostor of opensubtitles.ORG? which is nasty enough to advertise itself on the original website? or are they actually related?~~

according to FAQ it is related

[–] hyperspace@kbin.social 2 points 11 months ago (1 children)

I'd like to know this as well

[–] iesou@lemm.ee 2 points 11 months ago

.org just switched to .com

Same site, new domain.