Accessibility and Literacy: Two Sides of the Same Coin

Accessibility 4 All by Claude Almansi

Treaty for Improved Access for Blind, Visually Impaired and other Reading Disabled Persons

On July 13, 2009, WIPO (World Intellectual Property Organization) organized a discussion entitled Meeting the Needs of the Visually Impaired Persons: What Challenges for IP? One of its focuses was the draft Treaty for Improved Access for Blind, Visually Impaired and other Reading Disabled Persons, written by WBU (World Blind Union), that had been proposed by Brazil, Ecuador and Paraguay at the 18th session of WIPO’s Standing Committee on Copyright and Related Rights in May [1].

A pile of books in chains, about to be cut with pliers. Text: “Help us cut the chains. Please support a WIPO treaty for print disabled.”

From the DAISY Consortium August 2009 Newsletter

Are illiterate people “reading disabled”?

At the end of the July 13 discussion, the Ambassador of Yemen to the UN in Geneva remarked that people who cannot read because they had no opportunity to go to school should be included among “Reading Disabled Persons,” and should thus benefit from the same copyright exceptions in WBU’s draft treaty – in particular, access to digital texts that can be read aloud with Text-to-Speech (TTS) software.

The Ambassador of Yemen hit a crucial point.

TTS was first conceived as an accessibility tool to grant blind people access to texts in digital form, which are cheaper to produce and distribute than bulky braille versions. Moreover, people who become blind after a certain age may have difficulty learning braille. Now its usefulness is being recognized for others who cannot read print, for instance because of severe dyslexia or motor disabilities.

Indeed, why not for people who cannot read print because they could not go to school?

What does “literacy” mean?

No one compos mentis who has seen or heard blind people use TTS to access texts, and do things with those texts, would question that they are reading. The same holds if TTS is used by someone paralyzed from the neck down. What about a dyslexic person who knows the phonetic value of the signs of the alphabet, but has a neurological problem combining them into words? And what about someone who does not know the phonetic value of those signs at all?

Writing literacy

Sure, blind and dyslexic people can also write notes about what they read. People paralyzed from the neck down and people who don’t know how the alphabet works can’t, unless they can use Speech-to-Text (STT) technology.

Traditional desktop STT technology is too expensive for people in poor countries with a high “illiteracy” rate: one of the most widely used solutions, Dragon NaturallySpeaking, starts at $99. Besides, it has to be trained to recognize the speaker’s voice, which might not be an obvious thing to do for someone illiterate.

Free Speech-to-Text for all, soon?

In Unhide That Hidden Text, Please, back in January 2009, I wrote about Google’s search engine for the US presidential campaign videos, complaining that the text file powering it – produced by Google’s speech-to-text technology – was kept hidden.

However, on November 19, 2009, Google announced a new feature, Automatic captions in YouTube:

To help address this challenge, we’ve combined Google’s automatic speech recognition (ASR) technology with the YouTube caption system to offer automatic captions, or auto-caps for short. Auto-caps use the same voice recognition algorithms in Google Voice to automatically generate captions for video.

(Automatic Captions in YouTube Demo)

In this initial launch phase, only some institutions are able to test the automatic captioning feature:

UC Berkeley, Stanford, MIT, Yale, UCLA, Duke, UCTV, Columbia, PBS, National Geographic, Demand Media, UNSW and most Google & YouTube channels


As the video above says, the automatic captions are sometimes good, sometimes not so good – but better than nothing if you are deaf or don’t know the language. Therefore, when you switch on automatic captions in a video of one of the channels participating in the project, you get a warning:

warning that the captions are produced by automatic speech recognition

Short words are the rub

English – the language for which Google presently offers automatic captioning – has a high proportion of one-syllable words, and this proportion is particularly high when the speaker is attempting to use simple English: OK for natives, but at times baffling for foreigners.

When I started studying English literature at university, we first-year students had to take a course on John Donne’s poems. The professor had magnanimously announced that if we didn’t understand something, we could interrupt him and ask. But doing so in a big lecture hall with hundreds of listeners was rather intimidating. Still, once, when I noticed that the other students around me had stopped taking notes and looked as nonplussed as I was, I summoned my courage and blurted out: “Excuse me, but what do you mean exactly by ‘metaphysical pan’?” When the laughter subsided, the professor said he meant “pun,” not “pan,” and explained what a pun was.

Google’s STT apparently has the same problem with short words. Take the Don’t get sucked in by the rip… video in the UNSW YouTube channel:

If you switch on the automatic captions [2], there are over 10 different transcriptions – all wrong – for the 30+ occurrences of the word “rip.” The word is in the title (“Don’t get sucked in by the rip…”), it is explained in the video description (“Rip currents are the greatest hazards on our beaches.”), but STT software just attempts to recognize the audio. It can’t look around for other clues when the audio is ambiguous.

That’s what beta versions are for

Google deserves compliments for choosing to beta test the software semi-publicly in spite of its glitches – while warning about them. Feedback both from the partners hosting the automatically captionable videos and from users should help fine-tune the software.

A particularly precious contribution towards this fine-tuning comes from partners who also provide human-made captions, as in the Official MIT OpenCourseWare 1800 Event Video in the MIT YouTube channel:

Once this short-word issue is solved for English, it should be easier to apply the knowledge gained to other languages, where short words are less frequent.


As the above-embedded Automatic Captions in YouTube Demo video explains, now you:

can also download your time-coded caption file to modify or use somewhere else

I have done so with the Lessig at Educause: Creative Commons video, for which I had used another feature of Google’s STT software: feeding it a plain transcript and letting it add the time codes to create the captions. The caption .txt file I then downloaded says:

and think about what else we could
be doing.

So, the second thing we could be doing is
thinking about how to change norms, our norms,

our practices.
And that, of course, was the objective of

a project a bunch of us launched about 7 years
ago, the Creative Commons project. Creative


Back to the literacy issue

People who are “reading disabled” because they couldn’t go to school could already access texts with TTS technology, as the UN Ambassador of Yemen pointed out at the above-mentioned WIPO discussion on Meeting the Needs of the Visually Impaired Persons: What Challenges for IP? last July.

And soon, when Google opens this automatic captioning to everyone, they will be able to say what they want to write in a YouTube video – which can be made directly with any web cam, or even a cell phone cam – auto-caption it, then retrieve the caption text file.

True, to get normal text, the time codes must be deleted and the line breaks removed. But learning to do that should be far easier than learning to fully master the use of the alphabet.
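Deleting the time codes and rejoining the lines is mechanical enough to script. Below is a minimal sketch, assuming the downloaded captions come in YouTube’s SubViewer-style .sbv layout (a time-code line, caption lines beneath it, a blank line between cues); the sample time codes are invented for the illustration, while the caption text is the Lessig excerpt quoted above:

```python
import re

def captions_to_text(caption_text: str) -> str:
    """Drop time-code lines and blank lines, rejoin captions as plain text."""
    # Matches SubViewer-style cue timings, e.g. 0:00:03.490,0:00:07.430
    timecode = re.compile(r"^\d+:\d{2}:\d{2}\.\d{3},\d+:\d{2}:\d{2}\.\d{3}$")
    kept = []
    for line in caption_text.splitlines():
        line = line.strip()
        if line and not timecode.match(line):
            kept.append(line)
    return " ".join(kept)

# Sample cues: invented time codes, caption text from the excerpt above.
sample = """0:00:03.490,0:00:07.430
and think about what else we could
be doing.

0:00:07.430,0:00:11.600
So, the second thing we could be doing is
thinking about how to change norms, our norms,"""

print(captions_to_text(sample))
```

Running it prints the captions as one continuous run of text, ready to be pasted into a document.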


  • Text-to-Speech, a tool first conceived to grant blind people access to written content, can also be used by other reading-disabled people, including people who can’t use the alphabet convention because they were unable to go to school and are thus labeled “illiterate.”
  • Speech-to-Text, a tool first conceived to grant deaf people access to audio content, is about to become far more widely available and far easier to use than it has been, thus potentially giving people who can’t use the alphabet convention because they were unable to go to school, and are labeled “illiterate,” the possibility to write.

This means that we should reflect on the meanings of the words “literate” and “illiterate.”

Now that technologies first meant to enable people with medically recognized disabilities to use and produce texts can do the same for those who are “reading disabled” by lack of education, the industries and nations presently opposed to the Treaty for Improved Access for Blind, Visually Impaired and other Reading Disabled Persons should start thinking beyond “strict copyright” and consider the new markets this treaty would open up.

Unhide That Hidden Text, Please

By Claude Almansi
Staff Writer

Thanks to:

  • Marie-Jeanne Escure, of Le Temps, for having kindly answered questions about copyright and accessibility issues in the archives of the Journal de Genève.
  • Gabriele Ghirlanda, of Unitas, for having tested the archives of the Journal de Genève with a screen reader.

What Hidden Text?

Here, “hidden text” refers to a text file that an application combines with another object (image, video, etc.) in order to add functionality to that object. Several web applications offer this text to the reader together with the object it enhances – DotSUB, for instance, offers the transcript of video captions:


Screenshot from “Phishing Scams in Plain English” by Lee LeFever [1].

But in other applications, unfortunately, you get only the enhanced object: the text enhancing it remains hidden, even though it would grant access to content for people whose disabilities prevent them from using the object, and would enormously simplify research and quotation for everybody.

Following are three examples of object-enhancing applications using text but keeping it hidden:

Multilingual Captioning of YouTube and Google Videos

Google offers the possibility to caption a video by uploading one or several text files with their timed transcriptions. See the YouTube example below.
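As an illustration of what such a timed transcription looks like, here is a sketch of a caption file in the SubViewer-style .sbv format that YouTube accepts (the time codes and wording are invented for the example):

```
0:00:00.000,0:00:02.500
Welcome, and thanks for
watching this video.

0:00:02.500,0:00:05.000
Captions make it accessible
to deaf viewers too.
```

Each cue pairs a start and end time with one or two short lines of caption text, which YouTube displays during that interval.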

YouTube video captioning.

Google even automatically translates the produced captions into other languages, at the user’s discretion. See the example below.

Option to automatically translate the captions of a YouTube video.

(See “How to Automatically Translate Foreign-Language YouTube Videos” by Terrence O’Brien, Switch, Nov. 3, 2008 [2], from which the above two screenshots were taken.) But the text files of the original captions and their automatic translations remain hidden.

Google’s Search Engine for the US Presidential Campaign Videos

During the 2008 US presidential campaign, Google beta-tested a search engine for videos on the candidates’ speeches. This search engine works on a text file produced by speech-to-text technology. See the example below.

Google search engine for the US presidential election videos.

(See “Google Elections Video Search,” Google for Educators 2008 – where you can try the search engine shown in the above screenshot – [3] and “‘In Their Own Words’: Political Videos Meet Google Speech-to-Text Technology” by Arnaud Sahuguet and Ari Bezman, official Google blog, July 14, 2008 [4].) But here, too, the text files on which the search engine works remain hidden.

Enhanced Text Images in Online Archives

Maybe the oddest use of hidden text is when people go to the trouble of scanning printed texts, producing both images of the text and a real text file from the scan, then use the text file to make the image version searchable – but hide it. It happens with Google Books [5] and with The European Library [6]: you can browse and search the online texts that appear as images, thanks to the hidden text version, but you can’t print them or digitally copy-paste a given passage – unless the original is in the public domain, in which case both make a real textual version available.

Therefore, using a plain text file to enhance an image of the same content, but hiding the plain text, is apparently just a way to protect copyrighted material. And this can lead to really bizarre solutions.

Olive Software ActivePaper and the Archives of Journal de Genève

On December 12, 2008, the Swiss daily Le Temps announced that, for the first time in Switzerland, it was offering online “free access” to the full archives (English version at [7]) of Le Journal de Genève (JdG), which, together with two other dailies, was merged into Le Temps in 1998. In English, see Ellen Wallace’s “Journal de Geneve Is First Free Online Newspaper (but It’s Dead),” GenevaLunch, Dec. 12, 2008 [8].

A Vademecum to the archives, available at [9] (7.7 Mb PDF), explains that “articles in the public domain can be saved as images. Other articles will only be partially copied on the hard disk,” and Nicolas Dufour’s description of the archiving process in the same Vademecum gives a first clue about the reason for this oddity: “For the optical character recognition that enables searching by keywords within the text, the American company Olive Software adapted its software, which had already been used by the Financial Times, the Scotsman and the Indian Times.” (These and other translations in this article are mine.)

The description of this software – ActivePaper Archive – states that it will enable publishers to “Preserve, Web-enable, and Monetize [their] Archive Content Assets” [10]. So even if Le Temps does not actually intend to “monetize” their predecessor’s assets, the operation is still influenced by the monetizing purpose of the software they chose. Hence the hiding of the text versions on which the search engine works and the digital restriction on saving articles still under copyright.

Accessibility Issues

This ActivePaper Archive solution clearly poses great problems for blind people who have to use a screen reader to access content: screen readers read text, not images.

Le Temps is aware of this: in an e-mail answer (Jan. 8, 2009) to questions about copyright and accessibility problems in the archives of JdG, Ms Marie-Jeanne Escure, in charge of reproduction authorizations at Le Temps, wrote, “Nous avons un partenariat avec la Fédération suisse des aveugles pour la consultation des archives du Temps par les aveugles. Nous sommes très sensibilisés par cette cause et la mise à disposition des archives du Journal de Genève aux aveugles fait partie de nos projets.” Translation: “We have a partnership with the Swiss federation of blind people (see [11]) for the consultation of the archives of Le Temps by blind people. We are strongly committed/sensitive to this cause, and the offer of the archives of Journal de Genève to blind people is part of our projects.”

What Digital Copyright Protection, Anyway?

Gabriele Ghirlanda, member of Unitas [12], the Swiss Italian section of the Federation of Blind people, tried the Archives of JdG. He says (e-mail, Jan. 15, 2009):

With a screenshot, the image definition was too low for ABBYY FineReader 8.0 Professional Edition [optical character recognition software] to extract a meaningful text.

But by chance, I noticed that the article presented is made of several blocks of images, one for the title and one for each column.

Right-click, copy image, paste into OpenOffice; export as PDF; then I put the PDF through ABBYY FineReader. […]

For a sighted person, it is no problem to create a document of good quality for each article, keeping it in image format, without having to go through OpenOffice and/or PDF. [my emphasis]

<DIV style="position:relative;display:block;top:0; left:0; height:521; width:1052" xmlns:OliveXLib="…" xmlns:OlvScript="…" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
  <div id="primImg" style="position:absolute;top:30;left:10;" z-index="2">
    <img id="articlePicture" src="/Repository/getimage.dll?path=JDG/1990/03/15/13/Img/Ar0130200.png" border="0"></img>
  </div>
  <div id="primImg" style="position:absolute;top:86;left:5;" z-index="2">
    <img id="articlePicture" src="/Repository/getimage.dll?path=JDG/1990/03/15/13/Img/Ar0130201.png" border="0"></img>
  </div>
  <div id="primImg" style="position:absolute;top:83;left:365;" z-index="2">
    <img id="articlePicture" src="/Repository/getimage.dll?path=JDG/1990/03/15/13/Img/Ar0130202.png" border="0"></img>
  </div>
  <div id="primImg" style="position:absolute;top:521;left:369;" z-index="2">
    <img id="articlePicture" src="/Repository/getimage.dll?path=JDG/1990/03/15/13/Img/Ar0130203.png" border="0"></img>
  </div>
  <div id="primImg" style="position:absolute;top:81;left:719;" z-index="2">
    <img id="articlePicture" src="/Repository/getimage.dll?path=JDG/1990/03/15/13/Img/Ar0130204.png" border="0"></img>
  </div>

From the source code of the article used by Gabriele Ghirlanda: note the separate image files he mentions.

Unhide That Hidden Text, Please

Le Temps‘ commitment to the cause of accessibility for all and, in particular, to finding a way to make the JdG archives accessible to blind people (see “Accessibility Issues” above) is laudable. But in this case, why first go through the complex process of splitting the text into several images, and theoretically prevent the download of some of these images for copyrighted texts, when this “digital copyright protection” can easily be bypassed with a right-click and copy-paste?

As there already is a hidden text version of the JdG articles powering the search engine, why not simply unhide it? The archive site already states that these archives are “© 2008 Le Temps SA.” That should be sufficient copyright protection.

Let’s hope that Olive’s ActivePaper Archive software offers an option to unhide hidden text – not just for the archives of the JdG, but for all archives working with this software. And let’s hope, in general, that all web applications using text to enhance a non-text object will publish that text. All published works are automatically protected by copyright laws anyway.

Adding an alternative accessible version just for blind people is discriminatory. According to accessibility guidelines – and common sense – alternative access for people with disabilities should only be used when there is no other way to make web content accessible. Besides, access to the text version would also simplify life for scholars – and for people using portable devices with a small screen: text can be resized far better than a puzzle of images with fixed width and height (see the source code excerpt above).

The pages linked to in this article and a few more resources are bookmarked under