Feature request: Searchable closed captions for interviews

Frylock · March 18, 2020, 8:16pm

Not sure whether I, or even someone else, has brought this up before. Even just a timecoded transcript of the interviews would be super handy.

I found myself today vaguely recalling Carl Miner (I think) and @Troy talking about Bryan Sutton in one of Carl’s interviews, and if there were a way for me to quickly locate that section (or other random bits of recalled dialog), it would make that body of material even more useful.

Brendan · March 19, 2020, 4:36pm

I think this may have come up at some point on the forum a while back but I can’t find it. Agree this is a cool idea and would probably be useful for us as well!

We’ve looked into this a bit, and I think it’d be technically feasible. There are a couple of approaches we could take, from simple to more advanced, roughly:

Simplest would be just an ordinary plain text transcript with timestamps, so it’d at least be manually searchable for cross-referencing with the videos.
More complex, but I think ideal, would be an actual interactive transcript shown below the video: perhaps it scrolls automatically as the video plays, and/or you can click on a line of the transcript and the video jumps to that point.

We’ve seen instances of this and I briefly looked into it / played around with a demo a while back, but not sure exactly how much time it’d take to get it working in a useful (and stable, responsive etc.) way.
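To make the two options above a bit more concrete, here is a minimal Python sketch, not anything we actually have running. It assumes a made-up transcript format of one `[HH:MM:SS] text` line per cue; the sample lines, the `Cue` type, and all function names are invented for illustration. `search` covers the "manually searchable" case, and `active_cue_index` is the lookup an interactive transcript would need in order to highlight or scroll to the current line as the video plays.

```python
import bisect
import re
from typing import List, NamedTuple, Optional


class Cue(NamedTuple):
    seconds: int  # start time of the line, in seconds from the start of the video
    text: str     # what was said


# Assumed (hypothetical) line format: "[HH:MM:SS] spoken text"
LINE_RE = re.compile(r"^\[(\d{1,2}):(\d{2}):(\d{2})\]\s*(.*)$")


def parse_transcript(raw: str) -> List[Cue]:
    """Turn a timestamped plain-text transcript into a list of cues."""
    cues = []
    for line in raw.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            hours, minutes, secs, text = match.groups()
            start = int(hours) * 3600 + int(minutes) * 60 + int(secs)
            cues.append(Cue(start, text))
    return cues


def search(cues: List[Cue], query: str) -> List[Cue]:
    """Case-insensitive substring search over the transcript text."""
    needle = query.lower()
    return [cue for cue in cues if needle in cue.text.lower()]


def active_cue_index(cues: List[Cue], playback_seconds: float) -> Optional[int]:
    """Index of the cue being spoken at the given playback time.

    Assumes cues are sorted by start time. Returns None before the first cue.
    """
    starts = [cue.seconds for cue in cues]
    index = bisect.bisect_right(starts, playback_seconds) - 1
    return index if index >= 0 else None


def format_time(seconds: int) -> str:
    """Format a number of seconds as HH:MM:SS."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Invented sample lines, not taken from a real interview.
SAMPLE = """\
[00:00:00] Welcome, thanks for sitting down with us.
[00:12:05] That reminds me of something Bryan Sutton played.
[00:12:31] Right, I remember that tune.
[00:40:10] Let's look at that lick in slow motion.
"""

cues = parse_transcript(SAMPLE)
for hit in search(cues, "bryan sutton"):
    print(format_time(hit.seconds), hit.text)
```

On the actual site the lookup would run in the browser against the video player’s current time, so this only shows the logic, not the front-end wiring.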

Also would probably take more time than we’ve got at the moment just to transcribe everything! But I wonder if it could be a good start to see if anyone here might be interested in volunteering to transcribe like one favorite interview or lesson…

I’d say don’t jump on it quite yet, because we’d want to plan the best way to do this (there’s certain software and a transcript format needed to keep it consistent, etc.), but anyone interested in this, please let us know and we’ll keep it in mind for sure!

Frylock · March 19, 2020, 7:21pm

One option, while not zero cost, would be software transcription followed by volunteer proofreading. Less onerous than manual transcription, and might even produce tolerable non-proofread results.

Microsoft, Google, and Amazon all offer pay-as-you-go cloud-based speech-to-text services (with timecode) for about $1.40 per hour of content. I realize it’s easy for me to spend other people’s money, but that seems like a manageable price for the volume of content that CTC produces (though I recognize there are also hidden costs to implementing any solution). Anyway, one more option to throw in the incubator.

https://cloud.google.com/speech-to-text/pricing
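For a back-of-the-envelope number, the arithmetic is just hours of content times the hourly rate. A tiny Python sketch, assuming the roughly $1.40 per hour figure above; the 100-hour catalog size is a placeholder I made up, not CTC’s actual volume, and `estimate_cost` is just an illustrative name:

```python
# Approximate pay-as-you-go rate quoted above, in US dollars per hour of audio.
RATE_PER_HOUR_USD = 1.40


def estimate_cost(total_hours: float, rate_per_hour: float = RATE_PER_HOUR_USD) -> float:
    """Estimated transcription cost in USD, rounded to cents."""
    return round(total_hours * rate_per_hour, 2)


# Hypothetical catalog size, for illustration only.
HYPOTHETICAL_HOURS = 100
print(f"${estimate_cost(HYPOTHETICAL_HOURS):.2f}")
```

This ignores billing granularity, storage, and the proofreading time, so treat it as a floor rather than a quote.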

Brendan · March 26, 2020, 9:28pm

Yeah these could be useful! There are also some cool transcription apps like Trint and Descript that seem promising here as well.

I’ve done some poking around here, and there are some potentially useful little JavaScript libraries as well for getting something nice working on the front-end video player side of things. Lots more testing to do and stuff to figure out technically, but I do really like this idea!

I have an old, rough prototype test that maybe I can revisit and post here at some point for feedback.