Just take me to the demo of the Incredible Accessible Modal Window, Version 2.
Rich Caloggero at MIT and I were talking recently about screen reader support in the original Incredible Accessible Modal Window and he pointed out to me that different screen readers handle virtual cursor support differently in the modal window. This sent me further down the rabbit hole of screen reader support with ARIA and what exactly is going on.
I hesitate to call this the “problem” because I’m not sure what the real problem is yet. I’m sure it is some combination of differing interpretations of specifications, technological limitations of different screen readers, design decisions, the needs of the user, and bugs. The situation is that all screen readers, except for NVDA, can use their virtual cursor to navigate a role="dialog".
The root of the situation is that a role="dialog" should be treated as an application window. This fundamentally changes how a screen reader user interacts with the object, because the user is now expected to interact with the application window through the application’s own defined navigation methods. In other words, the screen reader user is not supposed to use their virtual cursor.
It is clear from the spec that when a screen reader encounters an object with an application-type role, it should stop capturing keyboard events and let them pass to the application. This in essence turns off the virtual cursor for JAWS and NVDA. What is not clear is whether the user may optionally re-enable their virtual cursor within an application. JAWS says yes and NVDA says no. (Just a note: JAWS actually requires the user to enable application mode manually instead of doing it for them automatically.)
This has real-world implications. Typically for a role="dialog" the user would use their Tab key to navigate between focusable elements and read the page that way. But what if there is text within the modal dialog that is not associated with a focusable element in the modal dialog?
The spec says that “if coded in an accessible manner, all text will be semantically associated with focusable elements.” I think this is easily achievable in many situations; however, I question whether it is practical in all situations. In my experience a lot of content is being crammed into some modal dialogs, sometimes more content than can be neatly associated with focusable elements. In theory, with enough tabindex="0" and aria-labelledby attributes you could associate everything with a focusable element, but I wonder if this would get too unwieldy in some situations.
There is always the question of whether developers should be cramming so much information into modal dialogs, but that’s another discussion for another day. I’m simply trying to deal with the fact that people are putting so much content in there.
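To make the tabindex="0" and aria-labelledby idea concrete, here is a minimal sketch of one way it could look. The helper name focusableTextBlock and the ids are hypothetical, and the text passed in is assumed to be trusted, already-escaped markup. I have not verified how each screen reader announces a plain container labelled this way, so treat it as an illustration of the bookkeeping involved, not as a recommended pattern.

```typescript
// Hypothetical helper (not part of the demo): wraps one block of static
// text in a focusable container that references the text by id.
function focusableTextBlock(textId: string, textHtml: string): string {
  return [
    // tabindex="0" puts the container in the Tab order;
    // aria-labelledby points at the id of the text it contains.
    `<div tabindex="0" aria-labelledby="${textId}">`,
    `  <p id="${textId}">${textHtml}</p>`,
    `</div>`,
  ].join("\n");
}

// Every paragraph needs its own wrapper and its own unique id.
const sampleParagraphs: string[] = [
  "Please review your details before continuing.",
  "Fields marked as required must be completed.",
];
const sampleBlocks: string[] = sampleParagraphs.map(
  (text, index) => focusableTextBlock(`modalText${index + 1}`, text)
);
console.log(sampleBlocks.join("\n"));
```

Even with two short paragraphs, each one needs a wrapper, a unique id, and an extra Tab stop, which is where the approach starts to feel unwieldy for content-heavy dialogs.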
The ability or inability to use the virtual cursor has a further real-world implication: if you allow users to use their virtual cursor in some application regions, are there situations where that could end up hurting the user? For example, it’s not hard for me to imagine a modal dialog where it would be useful to let the user navigate with their virtual cursor. However, if a screen reader user is interacting with Google Docs, which is in essence one large role="application", the results can be disastrous. Are there certain application contexts where we would want the user to be able to enable their virtual cursor and other contexts where we would want to prevent it? That just made things a lot more complicated.
Just to complicate things more, VoiceOver and ChromeVox don’t really have a concept, to my knowledge, of turning a virtual cursor on and off. That means their users can browse the contents of the role="dialog" any way they want, and there is not much I as a developer can do about it.
A Partial Solution?
One of the things Rich and I learned in this adventure is that if you include a role="document" inside of the role="dialog", then NVDA allows you to use the virtual cursor. This now gives all screen reader users the ability to fully navigate all of the contents.
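The nesting described above can be sketched as follows. This is an illustration under stated assumptions and not the demo’s actual source: the helper name buildModalMarkup, the ModalContent shape, the h1 heading, and the ids are all hypothetical, and the title and body are assumed to be trusted, already-escaped markup.

```typescript
// Hypothetical shape for the dialog's content (not taken from the demo).
interface ModalContent {
  titleId: string;   // id shared by the heading and aria-labelledby
  titleHtml: string; // assumed trusted, already-escaped markup
  bodyHtml: string;  // assumed trusted, already-escaped markup
}

// Returns the dialog markup as a string so the nesting of roles is easy to see.
function buildModalMarkup(content: ModalContent): string {
  return [
    // Outer container: exposed to screen readers as a dialog.
    `<div role="dialog" aria-labelledby="${content.titleId}">`,
    // Inner wrapper: the role="document" that lets NVDA use its virtual cursor.
    `  <div role="document">`,
    `    <h1 id="${content.titleId}">${content.titleHtml}</h1>`,
    `    ${content.bodyHtml}`,
    `  </div>`,
    `</div>`,
  ].join("\n");
}

const exampleMarkup: string = buildModalMarkup({
  titleId: "modalTitle",
  titleHtml: "Registration Form",
  bodyHtml: "<p>Text that is not tied to any focusable element.</p>",
});
console.log(exampleMarkup);
```

The only structural change from a plain role="dialog" is the extra wrapper; the dialog’s label and contents stay the same.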
Is this a good thing? Based on the reality of how people are actually implementing modal dialogs, I think it is. Some modal dialogs are in essence becoming miniature versions of Web pages, not just simple forms or messages. Given the alternative of having to programmatically shoehorn every piece of text into a relationship with a focusable element, I think this is a good option for some pages.
I still think that people should revisit the overall usability of any application that requires such complex modal dialogs in the first place. There are probably better ways to design the user interactions.
So is NVDA wrong not to allow virtual browsing in an application? I don’t think so. That is the intention behind the application region. Is JAWS wrong for allowing the use of the virtual cursor in an application? Probably not, because it is always good to give screen reader users the option of trying to save themselves from bad coding, and using the virtual cursor might be the only way they can do that. However, my guess is that using the virtual cursor in something designed to be an application will usually lead to more confusion than assistance.
One additional improvement – in the original version of the Incredible Accessible Modal Window there was a shim in place for VoiceOver users so that the aria-labelledby attribute would be automatically announced. VoiceOver in OS X 10.9 fixes this problem so the shim is not needed any more.