Three Video Captioning Tools

By Claude Almansi
Staff Writer

First of all, thanks to:

  • Jim Shimabukuro for having encouraged me to further examine captioning tools after my previous Making Web Multimedia Accessible Needn’t Be Boring post – this has been a great learning experience for me, Jim
  • Michael Smolens, founder and CEO of DotSUB.com and Max Rozenoer, administrator of Overstream.net, for their permission to use screenshots of Overstream and DotSUB captioning windows, and for their answers to my questions.
  • Roberto Ellero and Alessio Cartocci of the Webmultimediale.org project for their long patience in explaining multimedia accessibility issues and solutions to me.
  • Gabriele Ghirlanda of UNITAS.ch for having tried the tools with a screen reader.

However, these persons are in no way responsible for possible mistakes in what follows.

Common Features

Video captioning tools are similar in many aspects: see the screenshot of a captioning window at DotSUB:

[Screenshot: DotSUB captioning window]

and at Overstream:

[Screenshot: Overstream captioning window]

In both cases, there is a video player, a list of captions and a box for writing new captions, with boxes for the start and end time of each caption. The MAGpie desktop captioning tool (downloadable from http://ncam.wgbh.org/webaccess/magpie) is similar: see the first screenshot in David Klein and K. “Fritz” Thompson, Captioning with MAGpie, 2007 [1].

Moreover, in all three cases, captions can be either written directly in the tool, or created by importing a file where they are separated by a blank line – and they can be exported as a file too.
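To give an idea of what "separated by a blank line" means in practice, here is a minimal Python sketch that splits a plain transcript into captions. It is only an illustration: the function name split_captions is made up, and joining the lines of one caption with a space is my own assumption, not something any of the three tools is known to do.

```python
import re


def split_captions(text):
    """Split a plain-text transcript into captions separated by blank lines."""
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank (or whitespace-only) lines mark the end of a caption.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Assumption: lines inside one caption are joined with a single space.
    return [
        " ".join(line.strip() for line in block.split("\n"))
        for block in blocks
        if block.strip()
    ]


sample = (
    "Hello and welcome.\n\n"
    "This is the second caption,\nspread over two lines.\n\n\n"
    "Third caption."
)
captions = split_captions(sample)
```

A transcript prepared this way still needs start and end times for each caption before it can be used for captioning.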

What follows is just a list of some differences that could influence your choice of a captioning tool.

Overstream and DotSUB vs MAGpie

  • DotSUB and Overstream are online tools (only a browser is needed to use them, whatever the OS of the computer), whereas MAGpie is a desktop application that works with Windows and Mac OS, but not with Linux.
  • DotSUB and Overstream use SubRip (SRT) captioning [2], while MAGpie uses Synchronized Multimedia Integration Language (SMIL) captioning [3].
  • Overstream and DotSUB host the captioned result online; MAGpie does not.
  • The preparation for captioning is less intuitive with MAGpie than with Overstream or DotSUB, but on the other hand MAGpie offers more options and produces simpler files.
  • MAGpie can be used by disabled people, in particular by blind and low-sighted people using a screen reader [4], whereas DotSUB and Overstream don’t work with a screen reader.
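For readers who have never opened a SubRip file: each caption is a numbered block with a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, followed by the caption text and a blank line. The sketch below writes that layout from a list of (start, end, text) entries; the function names and the choice of seconds as input are assumptions made for the illustration, not the way DotSUB or Overstream work internally.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as the SubRip timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SubRip text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


srt_text = to_srt([(0, 2.5, "Hello"), (2.5, 5, "World")])
```

The resulting text could be saved with an .srt extension and imported into a tool that accepts SubRip files.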

Overstream vs DotSUB

  • The original video can be hosted at DotSUB; with Overstream, it must be hosted elsewhere.
  • DotSUB can also be used with a video hosted elsewhere, but you must link to the streaming Flash .flv file, whereas with Overstream, you can link to the page of the video – but Overstream does not support all video hosting platforms.
  • If the captions are first written elsewhere then imported as an .srt file, Overstream is more tolerant of coding mistakes than DotSUB – but this cuts both ways: some people might prefer to have their file rejected rather than end up with gaps in the captions.
  • Overstream allows more precise time-coding than DotSUB, and it also has a “zooming feature” (very useful for longish videos), which DotSUB doesn’t have.
  • DotSUB can be used as a collaborative tool, whereas Overstream cannot yet, though Overstream administrators are planning to make this possible in the future.
  • With DotSUB, you can have switchable captions in different languages on one player. With Overstream, there can only be one series of captions in a given player.
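Since the two tools react differently to coding mistakes in an imported .srt file, it can help to check the file yourself first. The following is a rough sketch of such a check, assuming the strictest reading of the SubRip layout (number, timing line, at least one line of text). It is not the validation either DotSUB or Overstream actually performs, and the name check_srt is invented for the example.

```python
import re

# Strict timing line: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def check_srt(text):
    """Return a list of (block_position, problem) for malformed SRT blocks."""
    problems = []
    text = text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", text) if text else []
    for position, block in enumerate(blocks, start=1):
        lines = block.split("\n")
        if len(lines) < 3:
            problems.append((position, "expected number, timing line and text"))
            continue
        if not lines[0].strip().isdigit():
            problems.append((position, "first line is not a caption number"))
        if not TIMING.match(lines[1].strip()):
            problems.append((position, "malformed timing line"))
    return problems


good = "1\n00:00:00,000 --> 00:00:02,500\nHello\n\n2\n00:00:02,500 --> 00:00:05,000\nWorld\n"
bad = "1\n00:00:00.000 -> 00:00:02.500\nHello\n\n2\n00:00:02,500 --> 00:00:05,000\nWorld"
good_report = check_srt(good)
bad_report = check_srt(bad)
```

Here the second sample uses dots and a short arrow in its first timing line, so only that block is reported; a tolerant tool might instead skip it silently, which is how gaps in the captions arise.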

How to Choose a Tool . . .

So how to choose a tool? As with knitting, first make a sample with a short video using different tools: the short descriptive lists above cannot replace experience. Then choose the most appropriate one according to your aims for captioning a given video, and to your possible collaborators’ availability, IT resources, and abilities.

. . . Or Combine Tools

The great thing with these tools is that you can combine them:

As mentioned in my former Making Web Multimedia Accessible Needn’t Be Boring post, I had started captioning “Missing in Pakistan” a year ago on DotSUB, but then went on to use MAGpie for SMIL captioning (see result at [5]). But when Jim Shimabukuro suggested this presentation of captioning tools, I found my aborted attempt at DotSUB. As you can also do the captioning there by importing a .srt file, I tried to transform my “.txt for SMIL” file of the English captions into a .srt file. I bungled part of the code, so DotSUB refused the file. Overstream accepted it, and I corrected the mistakes using both. Results at [6] (DotSUB) and [7] (Overstream). And now that I have a decent .srt file for the English transcript, I could also use it to caption the video at YouTube or Google Video: see YouTube’s “Video Captions: Help with Captions” [8]. (Actually, there is a freeware program called Subtitle Workshop [9] that could apparently do this conversion cleanly, but it is Windows-only and I have a Mac.)

This combining of tools could be useful even for less blundering people. Say one person in a project has better listening comprehension of the original language than the others, and prefers Overstream: s/he could make the first transcript there and export the .srt file, which could then be imported into DotSUB to produce a transcript that all the others could use to make switchable captions in other languages. If that person with better listening comprehension were blind, s/he might use MAGpie to do the transcript, and s/he or someone else could convert it to a .srt file that could then be uploaded either to DotSUB or Overstream. And so on.

Watch Out for New Developments

I have only tried to give an idea of three captioning tools I happen to be acquainted with, as correctly as I could. The complexity of making videos accessible and in particular of the numerous captioning solutions is illustrated in the Accessibility/Video Accessibility section [10] of the Mozilla wiki – and my understanding of tech issues remains very limited.

Moreover, these tools are continuously progressing. Some have disappeared – Mojiti, for instance – and other ones will probably appear. So watch out for new developments.

For instance, maybe Google will make available the speech-to-text tool that underlies its search engine for the YouTube videos of the candidates in the US presidential elections (see “‘In their own words’: political videos meet Google speech-to-text technology” [11]): transcribing remains the heavy part of captioning and an efficient, preferably online speech-to-text tool would be an enormous help.

And hopefully, there will soon be an online, browser-based and accessible SMIL generating tool. SubRip is great, but with SMIL, captions stay put under the video instead of invading it, and thus you can make longer captions, which simplifies the transcription work. Moreover, SMIL is more than just a captioning solution: the SMIL “hub” file can also coordinate a second video for sign language translation, and audio descriptions. Finally, SMIL is a W3C standard, and this means that when the standard gets upgraded, it still “degrades gracefully” and the full information is available to all developers using it: see “Synchronized Multimedia Integration Language (SMIL 3.0) – W3C Recommendation 01 December 2008” [12].

Making Web Multimedia Accessible Needn’t Be Boring

By Claude Almansi
Guest Author
7 November 2008

Some people see the legal obligation to follow Web content accessibility guidelines – whether of the W3C or, in the US, of section 508 – as leading to boring text-only pages. Actually, these guidelines do not exclude the use of multimedia on the web. They say that multimedia should be made accessible by “Providing equivalent alternatives to auditory and visual content” and in particular: “For any time-based multimedia presentation (e.g., a movie or animation), synchronize equivalent alternatives (e.g., captions or auditory descriptions of the visual track) with the presentation.”[1]

This is not as bad a chore as it seems, and it can be shared between several people, even if they are not particularly tech-savvy or endowed with sophisticated tools.

Captioning with DotSUB.com

Phishing Scams in Plain English, by Lee LeFever[2], was uploaded to DotSUB.com, and several volunteers did the captions in the different languages. The result can be embedded in a blog, a wiki or a web page. The captions also appear as copyable text under “Video Transcription,” which is handy if people discussing the video want to quote from it. Besides, a text transcription of a video also tends to raise its ranking in search engines, which still mainly scan text.

The only problem is that the subtitles cover a substantial part of the video.

Captioning with SMIL

This problem can be avoided by captioning with SMIL, which stands for Synchronized Multimedia Integration Language. A SMIL file, written in XML, works as a “cogwheel” between the original video and other files (including captioning files) it links to and synchronizes.[3]
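To make the “cogwheel” idea a little more concrete, here is a small Python sketch that generates a bare-bones SMIL document: a layout with one region for the video and one for the captions, and a par element that plays the video and the caption stream in parallel. The element names (smil, head, layout, root-layout, region, body, par, video, textstream) come from the SMIL standard, but the file names, region names and sizes are hypothetical, and this is not the file used for any of the examples in this post.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src, captions_src):
    """Return a minimal SMIL document linking a video and a caption stream."""
    smil = ET.Element("smil")
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    # Hypothetical sizes: a 320x240 video with a 60-pixel caption strip below.
    ET.SubElement(layout, "root-layout", width="320", height="300")
    ET.SubElement(layout, "region", id="video_region",
                  top="0", left="0", width="320", height="240")
    ET.SubElement(layout, "region", id="caption_region",
                  top="240", left="0", width="320", height="60")
    body = ET.SubElement(smil, "body")
    # <par> plays its children in parallel, i.e. synchronized.
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_src, region="video_region")
    ET.SubElement(par, "textstream", src=captions_src, region="caption_region")
    return ET.tostring(smil, encoding="unicode")


# Hypothetical file names, for illustration only.
smil_text = build_smil("video.mp4", "captions_en.rt")
```

Because the captions get their own region below the video, they do not cover the picture, whatever their length.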

The advantage, compared to DotSUB, is that captions stay put in a separate field under the video and don’t interfere.

This is why, after having tried DotSUB, I chose the SMIL solution for “Missing in Pakistan – Sottotitolazione Multilingue.”[4]

So far, the simple text timecoded files for SMIL captioning still have to be made off-line, though Alessio Cartocci – who conceived the player in the example above – has already made a beta version of an online SMIL captioning tool.

Captioning with SMIL Made Easy on Webmultimediale.it

The Missing in Pakistan example is on Webmultimediale.org, the site where the WebMultimediale project team experiments with the creative potential of applying accessibility guidelines to online multimedia – for instance, in collaboration with theatrical companies.

However, the project also has a public video sharing and captioning platform, Webmultimediale.it, where everyone can upload a video and its captioning file to produce a captioned video for free. The site is fairly bilingual, Italian-English. By default, you can only upload one captioning file, but you can contact Roberto Ellero, the founder of the project, through http://www.webmultimediale.org/contatti.php if you wish to add more captions.

Webmultimediale.it also has a video tutorial on how to produce a time-coded captioning file using MAGpie. It is only accessible when you are signed in, and as it is in Italian, English-speaking users might prefer to use the MAGpie Documentation[5,6] directly.

Other Creative Potentialities of SMIL

As can be seen in the MAGpie Documentation and in the W3C Synchronized Multimedia page[3], SMIL also enables the synchronization of an audio description file and even of a second video file, usually meant for sign language translation. While these features are primarily meant to facilitate access for deaf and blind people, they can also be used creatively to enhance all users’ experience of a video.