EPUB & Kindle File Glossary and Dictionary Selection

The Puzzle
I am working on an eBook file, or series of files, which should be compatible with the maximum range of eReaders on the market. This would include, for example:
The e-ink Kindle family
The Kindle App on iOS, Android (including Kindle Fire) and anywhere else
iBooks for iOS
The Nook
The Nook App for iOS, Android, and anywhere else
Kobo, possibly (haven't looked into this much)
So we're going for maximum compatibility across different eReaders; I hope that's clear.
Now, here's the issue:
The eBook file has a unique custom glossary which should be easily accessible to the reader.
Although it is one thing to have a glossary in the back of the eBook, all of the modern eReaders have some kind of dictionary functionality that is very accessible (put your cursor on a word, long-press on a word, etc.) so the glossary needs to be just as accessible to encourage usage by readers.
Hyperlinks
One way to do this would be hyperlinking each (or the first) occurrence of a term to the glossary entry in the back, and having hyperlinks in the glossary that go back to the occurrence(s).
Hyperlinks are supported by EPUB, as well as by MOBI / AZW / KF8 / etc. for Kindle. The links can be styled so that they are unobtrusive (not underlined, dark gray or black, etc.).
This is the best solution I have been able to concoct so far.
However, having the hyperlinked words look different from the rest of the text could be distracting to the reader. And if I use this method with the hyperlinks styled to look like the rest of the text, the reader will not know that they can navigate (tap), so they will simply use the built-in dictionary (long-press).
(Also note that the newest Kindle software (latest Kindle Paperwhite) shows a little "Footnotes" popup window instead of navigating to the glossary. This is great, except for the fact that it says "Footnote", whereas it should say "Glossary", but this seems to be a Kindle software default--any hints on how to change this would be awesome.)
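For what it's worth, the popup behavior described above is keyed off EPUB 3's epub:type semantics: a link marked noteref pointing at an aside marked footnote is what recent Kindle and iBooks software turns into the little popup, and the same markup degrades to an ordinary hyperlink on readers that don't support it. A minimal sketch follows; the term, ids, file names, and class name are all placeholders, and I know of no documented way to make the popup say "Glossary" instead of "Footnote".

```html
<!-- chapter01.xhtml: an unobtrusively styled glossary link.
     epub:type="noteref" is what popup-capable readers key on. -->
<style>
  a.gloss { color: inherit; text-decoration: none; }
</style>
<p>The ship dropped out of
   <a id="slipspace-ref" class="gloss" epub:type="noteref"
      href="glossary.xhtml#slipspace">slipspace</a>
   near the third moon.</p>

<!-- glossary.xhtml: the entry itself, marked as a footnote so the
     reader can lift just this block into the popup. -->
<aside id="slipspace" epub:type="footnote">
  <p><dfn>slipspace</dfn>: the faster-than-light medium the ships
     travel through.</p>
  <p><a href="chapter01.xhtml#slipspace-ref">Back to the text</a></p>
</aside>
```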
Modify Built-In Software
If there is any way to let the software know (iBooks / Kindle App / other software) that the book has a custom glossary, so that the default behavior is modified, this would be ideal. In other words, when you long-press the word, you don't just get the default popover (as in the Kindle software or iBooks) but you also get some way to look at the glossary definition.
Personally I know of no way that this can be done, but I'm asking in case anyone knows.
JavaScript
JavaScript is theoretically supported in EPUB 3, but in reality, among the major eReading options only iBooks supports it, and possibly Kobo (I haven't looked into it); nothing else does. Certainly not the rather antiquated MOBI format, and the KF8 format officially does not support JS.
The idea behind using JS would be to create a custom popover, whenever you tap or long-press a word that has a glossary entry. The custom popover would ideally allow you to choose between the built-in dictionary and the glossary entry.
It seems like this would be feasible only in iBooks, and perhaps Kobo, to show the glossary entries. (The popover would just have the hyperlink in it, basically.) In iBooks, I'm not sure how I would activate the built-in dictionary from my own custom JS popover, because the default popover you get in iBooks is the iBooks app's own hook into the text, based on your long-press.
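As a rough sketch of what the JS route could look like on a script-capable reader (iBooks, perhaps Kobo): the term markup, class names, and popover styling below are all my own assumptions, and note again that there is no public hook for opening the reader's built-in dictionary from script.

```html
<!-- A tap on a marked-up term shows our own definition popover;
     tapping anywhere else dismisses it. Works only where EPUB 3
     scripting is actually enabled. -->
<span class="term" data-gloss="The faster-than-light medium the ships travel through.">slipspace</span>

<div id="gloss-popover" hidden
     style="position: fixed; bottom: 1em; left: 1em; right: 1em;
            border: 1px solid #888; background: #fff; padding: 0.5em;">
</div>

<script>
  var pop = document.getElementById("gloss-popover");
  document.addEventListener("click", function (e) {
    var t = e.target;
    if (t.classList && t.classList.contains("term")) {
      pop.textContent = t.getAttribute("data-gloss");
      pop.hidden = false;   // show our definition
    } else {
      pop.hidden = true;    // any other tap dismisses it
    }
  });
</script>
```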
Anyway, this obviously doesn't have cross-platform support into the Kindle family, but I'm throwing it out there as an option.
In Summary
In summary, I'm looking for a way to give a reader easy access to a glossary in an otherwise standard eBook file (EPUB, Kindle family) across a wide range of eReading options. The eBook will be purchased in a normal eBook store and downloaded through normal methods. An app is not a solution, because of its limited distribution capabilities.
Any potential way of solving this puzzle is welcome!

Related

Make a non-native application accessible to screen readers for the visually impaired

I create applications that are divorced from any native framework. All rendering happens in OpenGL, with a context provided by GLFW, all in C, with no framework supplying compatibility. As such, standard screen readers like NVDA have no chance of picking up information (excluding OCR), and my applications are an accessibility black hole.
How can I provide an interface for screen readers to hook into? I presume this is a per-OS thing... How would that be possible on Windows, Linux, BSD, or even Android? In the *NIX world, I presume this would be desktop-environment dependent...
I'm finding a lot of information on this, with a framework as a starting point, but have a hard time finding resources on how to do it from scratch.
I'm fully aware this is far beyond the capability of a sole developer, and I know that writing programs while ignoring native interfaces is a common accessibility hole which you are advised to avoid.
However, I have a tough time finding resources and jump-in points to explore this topic. Can someone point me in the right direction?
TL;DR: How to provide screen-reader compatibility from scratch. Not in detail - but conceptually.
As you have already well identified, your app is an accessibility black hole because you are using a rendering engine.
It's basically the same for OpenGL, SDL, <canvas> on the web, or any library that renders something without specific accessibility support.
We can talk about several possibilities:
Become an accessibility server. Under Windows, this means doing what is necessary so that your app provides accessible components on demand through the UIA / IAccessible2 interfaces.
Use a well-known GUI toolkit that has accessibility support, and its provided accessibility API, to build your app.
Talk directly to screen readers via their respective APIs in order to make them say something and/or show something on a connected braille display.
Do screen-reader-specific scripting.
However, it doesn't stop there. Supporting screen readers isn't sufficient to make your app really accessible. You must also think about many other things.
1. Accessibility server, UIA, IAccessible2
This option is of course the best, because users of assistive technologies in general (not only screen readers) will feel right at home with a perfectly accessible app if you do your job correctly.
However, it's also by far the hardest, since you have to reinvent everything. You must decompose your interface into components, say which category each of them belongs to (more commonly called roles), provide callbacks to fetch values and descriptions, etc.
If you do web development, compare this to having to use ARIA everywhere because there are no defaults: no titles, no paragraphs, no input fields, no buttons, etc.
That's a huge job! But if you do it really well, your app will be truly accessible.
You may get code and ideas on how to do it by looking at open-source GUI toolkits or browsers, which all do it.
Of course, the APIs to use differ for each OS. UIA and IAccessible2 are for Windows, but macOS and several Linux desktops also have OS-specific accessibility APIs based on the same root principles.
Note about terminology: the accessibility server or provider is your app or the GUI toolkit you are using, while the accessibility client or consumer is the screen reader (or other assistive tool).
2. Use a GUI toolkit with good accessibility support
Fortunately, you aren't obliged to reinvent the wheel!
Many people have done the job of point 1 above, and the result is the libraries commonly called GUI toolkits.
Some of them are known to generally produce quite accessible apps, while others are known to produce totally inaccessible ones.
Qt, wxWidgets, and Java SWT are three with quite good accessibility support.
So you can simplify the job considerably by using one of them and its associated accessibility API. You will be saved from talking more or less directly to the OS with UIA/IAccessible2 and similar APIs on other platforms.
Be careful though, it isn't as easy as it seems: not all components provided by GUI toolkits are accessible on all platforms.
Some components may be accessible out of the box, some others need configuration and/or a little specific code on your side, and some are inaccessible no matter what.
Some are accessible on Windows but not on macOS, or vice versa.
For example, GTK is the first choice for making accessible apps on Linux under GNOME, but GTK on Windows gives quite poor results. Another example: wxWidgets's DataView control is known to be good on macOS, but it is emulated on Windows and therefore much less accessible.
In case of doubt, the best approach is to test for yourself under all combinations of OS and screen reader you intend to support.
Sadly, for a game, using a GUI toolkit is perhaps not a viable option, even if there exist OpenGL components capable of displaying a 3D scene.
Here comes the third possibility.
3. Talk directly to screen readers
Several screen readers provide an API to make them speak, adjust some settings, and/or show something on a braille display. If you can't, or don't want to, use a GUI toolkit, this might be a solution.
JAWS comes with an API called FSAPI, and NVDA with the NVDA controller client. Apple also allows you to control several aspects of VoiceOver programmatically.
There are still several disadvantages, though:
You are specifically targeting certain screen readers. People using another one, or another assistive tool than a screen reader (a screen magnifier, for example), are all out of luck. Alternatively, you multiply support across a big forest of different APIs for different products on different platforms.
Each of these screen-reader-specific APIs supports things that the others may not. There are no standards at all here.
Thinking about WCAG and how it would be transposed to desktop apps: in fact you are bypassing most best practices, which all recommend, first and above anything else, to use well-known standard components, and to customize only when really necessary.
So this third possibility should ideally be used if, and only if, using a good GUI toolkit isn't possible, or if the accessibility of the used GUI toolkit isn't sufficient.
I'm the author of UniversalSpeech, a small library that tries to unify direct communication with several screen readers.
You may have a look at it if you are interested.
4. Screen reader scripting
If your app isn't accessible alone, you may distribute screen reader specific scripts to users.
These scripts can be instructed to fetch information to give to the user, add additional keyboard shortcuts and several other things.
JAWS has its own scripting language, while NVDA scripts are developed in Python. As far as I know, there are also scripting capabilities for VoiceOver on macOS.
I gave you this fourth point for your information, but since you are starting from a completely inaccessible app, I wouldn't advise you to go that way.
In order for scripts to be able to do useful things, you must have a working accessible base. A script can help fix small accessibility issues, but it's nearly impossible to turn a completely inaccessible app into an accessible one with a script alone.
Additionally, you must distribute these scripts separately from your app, and users have to install them. That may be a hurdle for some people, depending on your target audience.
Beyond screen reader support
Screen reader support isn't everything.
This is beyond your question, so I won't enter into details, but you shouldn't forget about the following points if you really want to make an accessible app which isn't only accessible but also comfortable to use for a screen reader user.
This isn't at all an exhaustive list of additional things to watch out.
Keyboard navigation: most blind users and many visually impaired users aren't comfortable with a mouse and/or a touch screen. You must provide a full and consistent way of using your app with the keyboard only, or, on mobile, only via the standard touch gestures supported by the screen reader. Navigation should be as simple as possible, and should conform as much as possible to user preferences and general OS conventions (i.e. the functions of Tab, Space, Enter, etc.). This in turn implies having a good structure of components.
Gamepad, motion sensors, and other inputs: unless it's absolutely mandatory because it's your core concept, don't force their use, and always allow a keyboard fallback.
Visual appearance: as much as you can, use the settings/preferences defined at the OS level for layout, colors, contrast, fonts, text size, dark mode, high-contrast mode, etc., rather than your own.
Audio: don't output anything if the user can't reasonably expect it; make sure the volume can be changed very easily at any time; and, if possible and not against your core concept, always allow audio to be paused, resumed, stopped, and muted. The same reasoning applies to other outputs like vibration, which the user should always be able to disable.

Accessibility with Google Slides Embed on a webpage

I've got a site with PowerPoint presentations that the client wants embedded; the embedding is being done via the Google Docs embed. I have been doing some accessibility testing, albeit not particularly in-depth, but even with the OS X screen reader it has no luck reading the slides. (I am aware slides are probably terrible for accessibility anyway.) I can get the text content of the slides stripped out via the Google API, but I don't know if that's the best approach: should I include it on the page below/above the iframe embed, with one of the CSS tricks for hiding it from normal view?
I am aware of iframe title and aria-label but those seem to imply they are only to describe the contents of the iframe, which I am doing, but I need somewhere that can contain more text.
Has anyone got any good tips for the best way to deal with such things? Thanks!!
Embedding rich 3rd-party content in web pages poses many challenges.
When we put something like this in a web page, we typically think we're adding a bit of "content", but it often amounts to embedding a complex application; and the user interface, semantics, and presentation are outside of our control. In your case it's a presentation slide deck, but it could also be a Flash/Silverlight/Java applet, a slippy map, interactive SVG infographic, a 3D-panorama, virtual tour, zoomable image, chemical molecule viewer, or who-knows-what. (Note: I'm not familiar with the Google docs embed/API specifically, so most parts of my answer will address these rich content cases in general.)
Even if the embedded rich 3rd-party content/application is accessible today, there's no guarantee it will remain so after the 3rd-party system gets an update.
So what can you do? The safest thing might be to assume it's inaccessible, and consider the best way to provide an accessible alternative. The Web Content Accessibility Guidelines (WCAG) calls this a "conforming alternate version", and it sounds like you're already thinking along these lines.
An important caveat to all of this is that the use of "conforming alternate versions" isn't considered ideal by many accessibility specialists. It's greatly preferred to make your main content as accessible as you can.
Some relevant parts of WCAG:
Understanding conformance, especially the section about "Understanding Conforming Alternate Versions".
Technique G190: Providing a link adjacent to or associated with a non-conforming object that links to a conforming alternate version.
F19: Failure of Conformance Requirement 1 due to not providing a method for the user to find the alternative conforming version of a non-conforming Web page.
It's worth mentioning the 3rd-party content in your website's accessibility statement. Statement of Partial Conformance - Third Party Content offers guidance about that.
The crucial thing about a conforming alternate version is that it's no use at all if the user isn't made aware of it, or isn't able to reach it.
Implementation-wise, there are a variety of approaches you might take. In many ways, providing an alternative for embedded rich content is similar to providing a long description of a complex image, or a transcript of a video. Have a look at these WAI tutorials for some ideas.
Web Accessibility Tutorials: Complex Images
Making Audio and Video Media Accessible: Where to Put Transcripts
I can get the text content of the slides stripped out via the Google API
It sounds like you're trying to automate the process. That's fine, but it might not give satisfactory results. Some things you should consider:
Is the text content alone going to be enough? Presentations often have images too. Did the author provide text alternatives for these images, and are they present in the text extracted via the API? If the author hasn't provided text alternatives for images in the slide deck, the text you get from the API won't be telling the whole story.
Not all text in slides carries the same weight. Some slides serve to introduce sections, some slides have headings. Does the text obtained from the API convey these relationships?
Lists are very commonly found in presentations. Does the text obtained from the API preserve this structure?
Slides sometimes contain links. Are these included in the text obtained from the API, so the links are available to everybody using the alternative version?
Slides sometimes contain tables and charts. How will the information in these be conveyed in your alternative version? Is the information included in the text obtained from the API?
Sometimes, presentation decks also contain rich 3rd-party embedded content themselves! A slide containing a video, or an animated GIF are examples of this. If so, you can find yourself with a much more complex challenge.
... and many other meaningful info and relationships. Quotations, code samples, etc.
If any of the above points give cause for concern, you will need to consider managing your alternative version manually.
The "conforming alternate version" has to conform to WCAG; if you just offer a choice of two non-conforming versions, that doesn't satisfy WCAG.
include it on the page below/above the iframe embed with one of the CSS tricks for hiding it from normal view?
No, I wouldn't recommend that. I assume you're referring to visually-hidden text, using CSS utility classes such as .visually-hidden or .sr-only. It sounds like you're only thinking about screen reader users.
You need to make the alternative version available to everybody, not just the one group of users you think will need it.
Many groups of users may experience difficulty using the rich 3rd-party embedded content. This includes people using the keyboard only, screen readers, magnifiers, speech control, switch access, or other tools. The conforming alternative version can be navigated like a normal web page though.
The 3rd-party content likely has a different visual style, and it may not be adaptable in the same way as the page it is embedded in. That can frustrate people who make use of browser zooming, text resizing, font preferences, reader mode, Windows high-contrast themes, viewport resizing, and other user-applied presentation changes. The conforming alternate version should be as adaptable as the rest of your site.
Rather than hiding the alternative version in a visually-hidden container, here are some other ways to present it. The first two are the simplest and most reliable.
Put it on the same page, just after the original content, visible to everybody.
Put it in a collapsible disclosure element just after the embedded content. A <details> element is an easy way to achieve this. This is useful if the alternative version is quite long.
Put it elsewhere on the same page, and tell users where to find it. An internal link can help here.
Put it on a separate page, with a link next to the embedded content. I'm less keen on this approach. Only use it if you can commit to maintaining both pages.
Provide a way for users to switch between the two versions. For example some buttons, or a tabbed UI. However, you must also ensure that the switching mechanism is accessible. That might mean a full-blown ARIA tabs implementation.
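The collapsible-disclosure option above might be sketched like this; the iframe title, URL, and slide content are all placeholders:

```html
<!-- Embedded deck, followed by a collapsible text alternative.
     <details>/<summary> is the native HTML disclosure widget. -->
<iframe title="Google Slides presentation: Quarterly results.
               A text version follows this frame."
        src="https://docs.google.com/presentation/d/EXAMPLE/embed">
</iframe>

<details>
  <summary>Text version of the slides</summary>
  <h2>Slide 1: Quarterly results</h2>
  <ul>
    <li>Revenue up 4% on last quarter</li>
    <li>Two new regional offices opened</li>
  </ul>
  <!-- one block per slide, preserving headings, lists, and links -->
</details>
```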
I am aware of iframe title and aria-label but those seem to imply they are only to describe the contents of the iframe
Giving the iFrame a useful name is important. It's also a useful mechanism to inform screen reader users that an alternative version is available. The WAI Complex Images tutorial linked above has some similar approaches. Example: <iframe title="Google Slides Presentation of FOO BAR BAZ. Link to text version follows this frame.">. This only helps screen reader users though; you still need to make the availability of the alternative version clear to everyone.
How committed are you to using Google Docs for displaying these slides?
Any accessibility enhancements that you'll be able to implement on Google Slides won't be very good.
One way around this whole thing is to offer PDF versions of the slides that have been fully-remediated for accessibility instead of using Google Slides. That would potentially be a single solution that could be accessible to all visitors. Going this route means that you wouldn't have multiple copies of the same slides to update, which could lead to a split in content if one gets updated and the other is forgotten.
If you're really set on having the slides embedded in the page, then you could offer both formats by applying aria-hidden to the embedded iframe and then hiding the PDF links from sighted users using CSS clip, or by positioning content off-screen.
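If you do go that route, here is a sketch of the idea; file names are placeholders, and keep in mind the earlier answer's warning that an alternative hidden from sighted users only helps screen reader users:

```html
<!-- Hide the embed from assistive technology, and expose an
     off-screen link to the remediated PDF instead. -->
<iframe title="Embedded slides" aria-hidden="true"
        src="https://docs.google.com/presentation/d/EXAMPLE/embed">
</iframe>
<a class="visually-hidden" href="slides-accessible.pdf">
  Download the accessible PDF version of these slides
</a>

<style>
  /* Standard off-screen pattern: removed from view, but kept in
     the accessibility tree (unlike display: none). */
  .visually-hidden {
    position: absolute;
    width: 1px; height: 1px;
    overflow: hidden;
    clip: rect(0 0 0 0);
    white-space: nowrap;
  }
</style>
```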

Voiceover and Text To Speech - differences in pronunciation

I've been playing around a little with both VoiceOver and the Text to speech functionality on my mac. I've noticed a few differences in the way numbers and punctuation is pronounced. For example the sentence "the year was 1978", is read out perfectly when I highlight it and use text to speech. With Voiceover however, it reads "the year was one nine seven eight".
How can I tell screen readers that I want something pronounced in a certain way? Are there ARIA attributes I can add for this kind of behaviour?
It is not just dates and years but prices and punctuation as well (and probably a lot of other things!).
I don't think you can control speech output with the APIs currently available.
I assume you are talking about an HTML page, since you referenced ARIA attributes. WAI-ARIA does not have attributes to control screen reading. I believe the W3C's Speech Synthesis Markup Language (SSML) is meant to provide better control over speech output.
If your issue is related to native Mac OS application screens, you can check out Apple's Speech Synthesis APIs.
The Apple VoiceOver as well as Microsoft Windows Narrator are very basic screen readers with little or no intelligence built in. Apple's text-to-speech is a little more advanced, but it still lags behind commercial screen readers like JAWS. The good news is that users with impaired vision typically have good screen readers which can read your content appropriately.
Common speech synthesizers allow their clients to configure whether numbers will be read as words or spelled-out (both on a global default and per-specific-case setting). E.g. Apple's speech synthesis API propagates this possibility using the NSSpeechNumberModeProperty property. VoiceOver users can then set only global (or at best per-app) default in VoiceOver Utility > Verbosity > Text > "Read numbers as:".
Now if you want to influence how screen readers should pronounce numbers in web content, there is a CSS Speech Module. It defines the speak-as property, which can have a value of normal or digits.
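A minimal sketch of that property; the selectors are placeholders, and the speech shown in the comments is what the specification intends, not a guarantee of what any given reader will do:

```css
/* CSS Speech Module: hint how runs of digits should be spoken. */
.year  { speak-as: normal; } /* intended: "nineteen seventy-eight" */
.phone { speak-as: digits; } /* intended: "one nine seven eight" */
```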
However, whether such a CSS value will be respected by a combination of a particular version of a particular screen reader and a particular version of a particular browser running on a particular version of a particular OS is something you have to try yourself. The accessibility APIs on OS X currently have no way to specify pronunciation of digits in the returned strings, so if it worked for WebKit, it would have to be via some private extension to those public accessibility APIs. I just tried on OS X 10.9.3, and it does not work.
If I had to guess, you would be lucky to find a combination of screen reader / browser / OS where this would be implemented as of now. But this is just speculation.

How can I replace the screen reader audio with a prerecorded audio file?

I work on a multilingual website that will contain many languages that are not normally written, and I wonder if there are any ways to get this working for people using screen readers? Is it possible to give a text an attribute to make the screen reader play a prerecorded sound instead of trying to read the text by itself?
The whole menu system will be translated into the languages that are not supported by any screen readers.
The two popular screen readers are JAWS and NVDA. You can see what languages JAWS supports, 28 in total. NVDA supports 43 languages (I couldn't find a list).
I wonder if there are any ways to get this working for people using screen readers
There are a few things you could do that come to mind:
Declare the language of the page via the <body lang="">, so that if the screen reader happens to know how to interpret it, it uses that language
Put links to common language translations near the top of the page so if somebody lands on a random page from a search engine hit, they can change languages quickly.
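A sketch of the first suggestion: the answer puts lang on <body>, though placing it on <html> is the more common convention, and an inline element can override it for a passage in another language (tags follow BCP 47):

```html
<html lang="en">
  <body>
    <p>The committee opened with a greeting in Norwegian:
       <span lang="no">Velkommen, alle sammen.</span></p>
  </body>
</html>
```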
Is it possible to give a text an attribute to make the screen reader play a prerecorded sound instead of trying to read the text by itself?
The lang attribute makes the screen reader switch to another language if it understands it. You can also provide links to audio files to be listened to; I would be a little cautious about providing your own audio player, though. Not all audio players are accessible; the two common issues are unlabeled controls and focus traps.
Unlabeled controls make the assistive technology say "unlabeled" or something similar, so you cannot tell the buttons apart. A focus trap affects people who use the keyboard to navigate a page (usually with the Tab key): instead of moving out of the audio player, focus cycles back to its first element again.
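If you do serve audio, the native <audio> element with visible controls is usually the safest choice, since the browser supplies labeled, focusable controls; the file name here is a placeholder:

```html
<p>Greeting in the language (audio recording):</p>
<audio controls src="greeting.mp3">
  <!-- Fallback for reading systems without audio support -->
  Your browser does not support the audio element.
  <a href="greeting.mp3">Download the recording</a>.
</audio>
```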
From Comments
How I can make the screen reader play these files instead of trying to read the text.
The only thing you can do is use ARIA to hide the content via the aria-hidden='true' attribute. You can check my answer about aria-hidden for more details. Essentially you would do something like:
<article aria-hidden="true">
  <h1>Some really cool language</h1>
  <p>Blah blah blah</p>
  <section aria-hidden="false">
    <h2>Audio of language</h2>
    <p>Below is an audio sample of ____. Blah blah blah</p>
    <p class="offScreen"><!-- it may be a good idea to put additional
      info for people using assistive tech --></p>
    <p>audio stuff</p>
  </section>
</article>
CSS
.offScreen {
  position: absolute;
  top: 0;
  left: -999px;
}
Ryan, I've seen this question asked elsewhere about "click" languages, such as those of southwestern Africa. So far as I know, there is no written alphabet intrinsic to these languages. Scholars might record the languages phonetically, but more common techniques involve adding exclamation points and perhaps other basic keyboard characters to indicate the vocalizations that cannot be conveyed by European alphabets. The Kx'a family of languages is one such group.
If you look for RFC 1766 on sourceforge.net, you'll find a list of 122 languages or variants of languages that map to specific values of the lang attribute. And RFC 1766 itself shows how to add Klingon and other "experimental" languages to the mix.
So there are several issues, it seems:
If a language has not yet been mapped, how does one create a mapping of its characters and groups of characters (its graphemes) to its sounds (its phonemes)?
Assuming that's all that is required, how does one get that mapping associated with a new value for the lang attribute? (To get that new value, RFC 1766 says to create, complete, and submit a simple form. But, given that the document called RFC 1766 is 18 years old, how reliable is that information? And just where does the mapping of symbols to sounds fit into the picture?)
Ultimately, how does one get a screen reader to recognize that mapping and the corresponding value of the lang attribute?
My somewhat contrarian take: don't try to automatically replace the text with pre-recorded content; instead focus on ensuring that the user is aware that both are available, and can access whichever is most appropriate for them based on the tools they have at their disposal.
Some more background context might help: from your description, it sounds like this is perhaps an academic or research site that has fragments of text in these languages, with audio, but where the remainder of the site structure - headings and supporting narrative text - is in some 'well-supported' language (English, etc.)? (What is the encoding system used for this text?)
If so...
Be aware that a screenreader user does not typically read an entire page top-to-bottom in a completely linear fashion; they can browse the page using the heading structure. In a well-marked-up-page, the user has the freedom to skip over the portions that they are not interested in or which are not relevant to them. Focus on providing this flexibility rather than making (well-meaning, but potentially incorrect) policy decisions on behalf of the user.
Don't assume that a screenreader user is using speech in the first place; they could be using Braille, whether due to the fact that speech output is not an option for them, or simply because Braille is their preferred form of output.
Finally, don't assume that because a screenreader user can't hear the text properly (due to text-to-speech limitations), that the textual form of the content should be hidden from them entirely; they may still want the ability to cut-and-paste the characters that represent the text so that they can send them to a colleague, for example. Or, depending on the writing scheme used, a screenreader user may still be able to step through the characters letter-by-letter and have the words spelled out to them letter-by-letter - many screenreaders can call out non-latin characters by their Unicode name.
The issue here is less about JAWS and more about having a synthesizer which speaks the language and can communicate with JAWS through a driver such as SAPI 5. Developing these languages for the various synthesizer companies can be costly, especially if there is not a good business case driving it, such as GPS, ATMs, call centers, etc.
There are open source solutions such as eSpeak which you might look into as well. It is not the highest quality but could be an approach if you have access to developers willing to work on such a project.
As for the question regarding an API or method to communicate information to JAWS via prerecorded sound files from the web site: this is not really going to meet the needs of the screen reader user, who would have no way to navigate the information or interact with it using links or form field elements. I really think synthesizer development is, unfortunately, the only solution.

Multi-lingual Flex app - preferred fonts for embedding specific languages

I work on a collaboration web app, built with Flex 3, that needs to support multiple languages.
Does anyone know which fonts are best for creating embedded font libraries for Chinese, Korean, Japanese, and Russian languages? I know Arial Unicode MS will do the job, but I don't know if it will do the job best.
Localization alone won't solve the entire problem: chat input and display, for example, need to support multiple languages in the same textfield - anything typed in Chinese needs to display in Chinese; anything typed in English needs to display in English.
Using _sans is an option, but is far from preferred.
Thanks.
I went with an approach that switches the TextFormat of characters based on their Unicode values. Characters in the primary language display in the preferred (embedded) font, while characters in other languages display in _sans.
This works out really nicely, but requires that you inspect every character added to a field, and inspect everything again when a deletion occurs. That's a lot of inspecting, and I'm sure a textfield with a lot of content would start running into performance issues, but this is for a chat tool, so that isn't too critical a use case.
