Captioning and/or transcription for Project Jupyter calls

In the February 16 JupyterLab team meeting, attendees discussed recording these meetings again (as they once were). If we are going to create even more recorded video and audio content around our projects, we need some kind of support for captioning and/or transcribing that content.

Some ideas for this are:

  • Otter.ai (proposed by @astrojuanlu; subscription based and we may be able to test with the free version)
  • Rev (pay-per-video, transcribed by humans, may not sufficiently support real-time call needs)
  • Find ways to fund someone who can do this during meetings, maybe as another community role?

I have experimented with how to make this happen for the Jupyter community calls and Jupyter accessibility workshops, and so far have had poor results with every automated tool I’ve tried. In particular, they butcher transcription for speakers with non-U.S. accents and produce gibberish for tech-specific terms. I understand that automated tools may still be the best fit for the number of meetings we have across the calendar.

I’m posting this in General because I believe this is something that multiple (or all) Jupyter projects could benefit from. If you run a meeting and are interested in advocating for this, please let me know below or reach out to me on Gitter. It would be very helpful to know who else is interested.

(big thanks to @astrojuanlu for following up on my in-meeting comment!)

AssemblyAI looks interesting. It offers a free trial.

Automatically convert audio/video files, and live audio streams, into accurate transcriptions with a simple API. Powered by cutting edge research into Automatic Speech Recognition.

As an update, I’ve been told that we might have the following options too:

  • The Project Jupyter licensed Zoom account might have auto-generated captions available if we set it up properly. (I have not had any success with this yet, though perhaps I’m missing some steps in the documentation.)
  • For post-call captions, it sounds like YouTube’s automatically generated captions are back if creators enable them.

I’m not marking this discussion as resolved, since we have not actually arrived at a working solution yet.