(Technology Connections) Closed captions on DVDs are getting left behind
I'm surprised VLC fares that badly with CCs encoded this way. Usually it's pretty good. Now I'm wondering whether ffmpeg shares the same problem.
The top YouTube comment by Ridley Combs explains it pretty well:
FFmpeg maintainer here, and the details behind the caption decoding issues you're seeing in VLC are complex and horrific. They largely stem from how the EIA-608 caption format expects text to be laid out in a monospace grid onscreen, which isn't really how the text rendering stacks used for modern subtitling work (this is probably why changing the font caused problems on those Sony players); beyond that, the behavior can just end up pretty complex, and there's no convenient public-domain corpus of sample files for open-source software developers to test against. These kinds of issues also affect the Japanese (ARIB) and European (Teletext) formats to varying extents. These days, a lot of the focus ends up being on converting the text into modern Unicode text formats, styled using modern techniques, so direct rendering of the legacy formats hasn't had as much attention lately.
Because of the way those captions are stored, VLC has to use OCR to convert the .SRT file (which basically stores low-resolution b/w images, I assume to more easily allow for different alphabets) to normal text. I don't know why the open-source solutions are so bad at this (especially considering how good the proprietary solutions seem to be), but I had similar problems ripping a DVD. I would assume that had he turned off the special font VLC uses for the subtitles and instead just seen the raw data, there wouldn't have been a problem. Why VLC doesn't enable this by default (or even offer it), I don't know.
This is not about DVD subtitles, which are images as you say. This is about "Line 21" closed captioning, i.e. the text data that is embedded in an analog TV signal. There should be no OCR needed.
There is no .srt in this case. This is also not about bitmap DVD VobSubs.
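To make the "no OCR needed" point concrete, here is a rough Python sketch of how printable text could be pulled out of Line 21 (EIA-608) byte pairs. It assumes the usual layout of 7 data bits plus an odd-parity bit per byte, handles only the basic character set, and simply skips control codes. The function names (`decode_pair`, `add_odd_parity`, and so on) are made up for illustration and are not from VLC or FFmpeg; a real decoder also has to handle positioning, caption modes, channels, and the extended character sets, which is where the complexity described in the quoted comment comes from.

```python
# Sketch only: decode basic printable characters from EIA-608 byte pairs.
# Assumed: bit 7 of each byte is an odd-parity bit, bits 0-6 carry the data.

# Basic-set codes that do not match ASCII.
SPECIAL = {
    0x2A: "á", 0x5C: "é", 0x5E: "í", 0x5F: "ó", 0x60: "ú",
    0x7B: "ç", 0x7C: "÷", 0x7D: "Ñ", 0x7E: "ñ", 0x7F: "█",
}


def has_odd_parity(byte: int) -> bool:
    """True if the 8-bit value has an odd number of set bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def add_odd_parity(value: int) -> int:
    """Hypothetical helper for building test data: set bit 7 if needed."""
    value &= 0x7F
    return value if has_odd_parity(value) else value | 0x80


def decode_char(code: int) -> str:
    """Map one 7-bit basic-set code to a Unicode character."""
    return SPECIAL.get(code, chr(code))


def decode_pair(b1: int, b2: int) -> str:
    """Return the printable text carried by one byte pair, or ''."""
    if not (has_odd_parity(b1) and has_odd_parity(b2)):
        return ""  # simplification: drop the whole pair on a parity error
    c1, c2 = b1 & 0x7F, b2 & 0x7F
    if c1 < 0x20:
        return ""  # null padding or a control code: ignored in this sketch
    text = decode_char(c1)
    if c2 >= 0x20:
        text += decode_char(c2)
    return text


def decode_stream(pairs) -> str:
    """Concatenate the printable text from a sequence of (b1, b2) pairs."""
    return "".join(decode_pair(b1, b2) for b1, b2 in pairs)
```

The data itself is just character codes, so getting the text out is the easy part. What to do with it afterwards (where it goes on the monospace grid, when it appears and disappears) is the part this sketch leaves out.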