Dramatis Personae
- Doug Schepers (a Montague)
- Anne van Kesteren (a Capulet)
- Simon Pieters (a Capulet)
- Henri Sivonen (a Capulet)
- Rich Schwerdtfeger (a Franciscan)
- Aaron Leventhal (an Apothecary)
The Prologue
Over the past few weeks, there has been a long, drawn-out discussion about how to integrate ARIA (an accessibility specification) into both SVG and HTML in a uniform way. There have been two main camps, as there usually are in these matters: the XML advocates, and the HTML5 (WHATWG) advocates. Both sides want to make sure that all content works the same across all browsers, authoring tools, etc. (collectively called User Agents), but they differ in how they think it should be done.
The XML True Believers have held that the way forward for the Web is to enforce a strict, well-defined syntax for Web languages, and that to extend or combine languages, a differentiating mechanism called “namespaces” would be used. Namespaces in XML are specified to take the form “schepers:doug”. They declared a do-over on HTML, recasting it in the new XML mold (XHTML), and planned for browsers to be stricter on new content, and for authors to have learned their lessons.
The HTML Young Turks came forward with the observation that no matter how well you define a syntax, people will make mistakes (or not know the right way to do things), and that the way to recover from those mistakes is to define error-handling behavior for each kind of mistake. They also believed that, since they could now reconcile older “broken” content with new browser behavior, they were constrained to define behavior that would work in all past browsers as much as possible, and to degrade gracefully in those where it didn’t. But this required a more rigid parsing mechanism that doesn’t work well with extensibility, a problem exacerbated by Internet Explorer’s idiosyncratic and variable (one might even say spasmodic) behavior regarding the colon (“:”), which meant XML Namespaces didn’t work right. Also, the colon has a special meaning in CSS, yet another wrinkle.
So, XHTML and SVG are based on the XML model, and the new HTML5 is based on the new-old model. This difference wasn’t really much of a problem for SVG, though, because SVG and HTML are two different languages…
But of course people want to use them together. (More on that in some other post.)
The obvious way around that is to have two different parsers (a parser is a program that reads a formal syntax and makes a model of it for the browser to act on), and when you encountered SVG inside HTML, you would switch to the stricter parser. HTML would have its model for naming attributes (with no colon), and SVG would have its attributes (with a colon).
But there’s a hitch. ARIA is an accessibility technology meant to work in HTML and XHTML and SVG (and other presentation languages). The functionality for ARIA is defined in its own specification, but it’s not intended to be used on its own… it has to be integrated into a host language. It would be confusing for authors to have to use two different syntaxes for the same functionality in different languages.
For example, in SVG you might have:
<svg version="1.1" xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xmlns:xhtml="http://www.w3.org/1999/xhtml"
     xmlns:aaa="http://www.w3.org/2005/07/aaa">
  <g xhtml:role="checkbox" aaa:checked="true">
    <rect x="5" y="5" width="20" height="20" rx="5" ry="5" fill="none" stroke="crimson" stroke-width="2"/>
    <text x="30" y="22" font-size="18" fill="crimson"> Enable Accessibility</text>
  </g>
</svg>
While in HTML, you would have something like:
<html>
  <ul role="checkbox" aria-checked="true">
    <li>Enable Accessibility
  </ul>
</html>
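To make the difference concrete, here is a minimal Python sketch of what a parser hands back in each case. It uses only the standard library (ElementTree for the XML side, `html.parser` for the HTML side); the markup is trimmed down from the two snippets above, and the `AttrCollector` class is just a helper invented for this illustration. Neither parser is what a browser actually uses, so treat it as a rough picture rather than a description of any real User Agent.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XHTML_NS = "http://www.w3.org/1999/xhtml"
AAA_NS = "http://www.w3.org/2005/07/aaa"

svg_doc = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xhtml="http://www.w3.org/1999/xhtml" '
    'xmlns:aaa="http://www.w3.org/2005/07/aaa">'
    '<g xhtml:role="checkbox" aaa:checked="true"/>'
    '</svg>'
)

# XML side: the prefix is resolved to its namespace URI, and ElementTree
# exposes the attribute under the expanded name "{uri}localname".
g = ET.fromstring(svg_doc)[0]
svg_role = g.get("{%s}role" % XHTML_NS)
svg_checked = g.get("{%s}checked" % AAA_NS)


class AttrCollector(HTMLParser):
    """Hypothetical helper: remembers the attributes of the first <ul> it sees."""

    def __init__(self):
        super().__init__()
        self.ul_attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "ul" and self.ul_attrs is None:
            self.ul_attrs = dict(attrs)


# HTML side: attribute names are flat strings, dash and all, with no URI attached.
collector = AttrCollector()
collector.feed('<ul role="checkbox" aria-checked="true"><li>Enable Accessibility</ul>')
html_attrs = collector.ul_attrs

print(svg_role, svg_checked)
print(html_attrs)
```

The same two pieces of information come out of both documents, but under different names: a (namespace URI, local name) pair on the XML side, and a plain `aria-checked` string on the HTML side.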
Let the drama unfold…
Act I
Scene I
XHTML defined a new point of extensibility, the ‘role’ attribute, which was intended to be used to add semantics to an element.
Scene II
The WAI Working Group decided to use that as the point of entry for adding accessibility behaviors to custom controls. Custom controls are things like an interactive tree selector, or a slider knob; they extend the limited set of form-based controls that browsers settled on, and they are increasingly used in script libraries like dojo or Scriptaculous. SVG developers have been making custom controls using script and/or declarative animation for many years, since SVG doesn’t have any native controls at all, but can be used to build visually glitzy ones; they just aren’t treated as controls by AT (Assistive Technology) apps. ARIA can be used to tell the AT that a collection of elements is meant to represent a particular type of control, and report what its current state is (on, off, hidden, selected, etc.). So the SVG WG has been eyeing the progress of the Role and ARIA specs.
Act II
Scene I
The HTML5 movement, under the banner of the WHATWG, gained critical mass and was brought into the W3C for standardization with all major browser vendors. The ARIA proponents worked hard to get ARIA’s accessibility functionality integrated into the standard. Simon Pieters of Opera and Aaron Leventhal of Mozilla made a proposal for integration of ARIA using a limited ‘role’ attribute. Because XML Namespaces don’t work right in IE (and not very intuitively in CSS), it was proposed that the different states of ARIA would be referenced and set in HTML5 using the prefix ‘aria-*’ instead of ‘aaa:*’ or ‘aria:*’. I was only vaguely aware of early discussions of ‘role’ in HTML5, and despaired of its ever being adopted, knowing the resistance to generic extensibility. But Simon and Aaron were actively working on getting ARIA integrated into the next releases of Opera and Firefox, to enable accessible Web apps.
Scene II
Dave Raggett (who originated forms in HTML) made an independent public request for the ‘role’ attribute to be added to SVG; I jumped on this, expecting that it would be done using the built-in extensibility mechanism provided by XML Namespaces. Much to my dismay, I was immediately made aware of Simon and Aaron’s proposal, which uses a different syntax from how it would be done in SVG. Much email discussion ensued, and a little biting of thumbs.
Scene III
I talked with the SVG Working Group about the ‘role’ attribute (step one in integrating ARIA), and while there were some reservations, we agreed that it would be great if we could do it. (Note: really boring issues of process elided here… assume I explained why it’s kinda bad timing for SVG right now.) We decided to first gather opinions about how best to do so from the SVG community and other interested parties. But the issue of ‘role’ was buried under the talk of whether or not to use the controversial and dreaded Namespaces (e.g. ‘aria:checked’ versus ‘aria-checked’). It’s undesirable to add arbitrary attributes to SVG, because the content wouldn’t validate and couldn’t take advantage of all the XML tools that already exist. And there are like 70 attributes, so we couldn’t just adopt them all… and if we did, what would happen when ARIA v2.0 came out? Ugh… messy. But the HTML5 guys insisted they couldn’t use namespaces.
Act III
Scene I
Rich Schwerdtfeger of IBM, who’s heavily invested in getting ARIA implemented, set up an informal call involving the Dramatis Personae (and a few others who couldn’t show up). I talked it over with people on “my” side, and reviewed the XML Namespaces specification. A little reflection revealed to me that, in fact, both approaches were using namespaces… just the syntax is different (‘aria:*’ versus ‘aria-*’). And while the Namespaces spec doesn’t describe any other mechanism than the namespace declaration and prefix, neither does it proscribe one. I saw a possible way out.
The uninformed reader will assume this is some persnickety, trivial detail; such a reader would be right, but it’s still a pain to coordinate on crap like this. Everyone assumes they are right, the other guy is wrong, and everyone should just do it their way.
Scene II
The teleconference happened today, and arguments were presented, and assumptions were challenged, and a verbal duel took place. Issues with delimiters in IE and CSS were explained and explored. The colon was a real problem, but so was the dash. Using the “aria-*” string is a klutzy, short-term, one-off solution, and couldn’t really cut it as a true namespace prefix. SVG has many attributes that contain the dash (‘stroke-width’, ‘fill-rule’, ‘font-size’, ‘stroke-dasharray’, ‘fill-opacity’, ‘pointer-events’, ‘font-family’, ‘shape-rendering’, and many more); it’s a common convention to separate words in attributes with a dash. No future (or present) language could safely use ‘font-’ or ‘shape-’ or any other dash-delimited prefix to integrate with SVG.
But Aaron Leventhal, a very pragmatic guy, pulled out just the medicine I think was needed. The underscore (“_”). It works in IE, has no special meaning in CSS, is a legal (if underused) character for attributes in XML, doesn’t cause problems for the HTML parser, and doesn’t conflict with any conventions in SVG. It’s probably underutilized because it’s just slightly harder to type than a dash (on my keyboard, it’s the same key, but requires the shift key be depressed).
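The collision argument is easy to check mechanically. The rough Python sketch below assumes a naive processor that treats everything before the first delimiter as a language prefix; the `split_prefix` and `colliding_attributes` helpers and the imagined `font` and `shape` prefixes are hypothetical, while the attribute names are the SVG ones listed above.

```python
# SVG attribute names mentioned in the discussion above.
SVG_ATTRIBUTES = [
    "stroke-width", "fill-rule", "font-size", "stroke-dasharray",
    "fill-opacity", "pointer-events", "font-family", "shape-rendering",
]


def split_prefix(name, delimiter):
    """Split at the first delimiter; return (None, name) if there is none."""
    prefix, sep, rest = name.partition(delimiter)
    if not sep:
        return (None, name)
    return (prefix, rest)


def colliding_attributes(registered_prefixes, delimiter):
    """List native SVG attributes that would be mistaken for prefixed ones."""
    return [
        name for name in SVG_ATTRIBUTES
        if split_prefix(name, delimiter)[0] in registered_prefixes
    ]


# Suppose some other language wanted "font" and "shape" as its prefixes.
dash_hits = colliding_attributes({"font", "shape"}, "-")
underscore_hits = colliding_attributes({"font", "shape"}, "_")

print(dash_hits)
print(underscore_hits)
```

With the dash as delimiter, three native SVG attributes are misread as foreign content; with the underscore, none of them are, because none of these SVG attribute names contains one.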
_!!
Scene III
The final scene has been lost in the mists of the future. But here’s my plan, if nothing gets in its way: to create a new namespacing specification that uses the underscore as its prefix delimiter. It would differ from Namespaces in XML 1.0 in three ways:
- It would make the namespace prefix string fixed. In XML Namespaces 1.0, any string is allowed to be used for any namespace prefix. For example, if you were feeling particularly contrary, you could declare that the string “svg” is the prefix for XForms content, “xforms” is the prefix for MathML content, and “math” is the prefix for SVG content; in reality, nobody does this, for obvious reasons. In my proposal, there would be a central “registry” of sorts for the most common prefixes; my suggestion is that such a registry should be defined in the CDF WICD profiles (though I haven’t consulted that group), and that a User Agent that supports multi-namespace documents (aka Compound Documents) would know what those prefixes stand for. The most obvious ones are “svg”, “html”, “xhtml”, “xforms”, “mathml”… oh, yeah, and “aria”. That this is less inherently extensible than the colon-delimited unfixed prefix might even be a bonus, ensuring that the technologies have been combined systematically and the combination specified, promoting interoperability between UAs. Prefix names would consist only of letters and digits, forgoing the dot and the dash, and of course the colon and underscore; a possible exception that occurs to me is the use of the dot to serve as a versioning convention, so you would have “aria.2_checked” to represent an ARIA v2.0 ‘checked’ attribute.
- As a consequence of #1, there would be no declaration needed or specified. The supporting UA would simply know the “name + _” prefix, and treat it as it should be treated. This makes copy-paste authoring easier, since authors wouldn’t have to know or care about the namespace declaration earlier in the scope of the copied content.
- This specification would not reserve the underscore, but merely allow supporting implementations to assign special meaning to the combination of “name + _”. This would prevent some problems in existing XML processors. Languages that register their prefix name would be encouraged to pick something that is unlikely to conflict with other partial attribute or element names. (I toyed with the idea of making this an extension mechanism for attributes only… not element names. I haven’t thought that through yet, though.)
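The three points above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `PREFIX_REGISTRY` table stands in for the central registry (no such registry exists), the `resolve` function and its return shape are made up, and the URI given for “aria” is simply the one bound to the ‘aaa’ prefix in the SVG example earlier.

```python
import re

# Hypothetical fixed registry. The svg and xhtml URIs come from the SVG
# example earlier in this post; the "aria" entry reuses the "aaa" URI from
# that example as a stand-in.
PREFIX_REGISTRY = {
    "svg": "http://www.w3.org/2000/svg",
    "xhtml": "http://www.w3.org/1999/xhtml",
    "aria": "http://www.w3.org/2005/07/aaa",
}

# Prefix: letters and digits only, an optional ".<version>", then "_".
NAME_PATTERN = re.compile(r"^([A-Za-z0-9]+)(?:\.([0-9]+))?_(.+)$")


def resolve(attr_name):
    """Return (namespace_uri, version, local_name).

    The underscore is not reserved: a name only gets special treatment when
    the part before it is a registered prefix. Otherwise the name is returned
    untouched with no namespace.
    """
    match = NAME_PATTERN.match(attr_name)
    if match and match.group(1) in PREFIX_REGISTRY:
        prefix, version, local_name = match.groups()
        return (PREFIX_REGISTRY[prefix], version, local_name)
    return (None, None, attr_name)


print(resolve("aria_checked"))
print(resolve("aria.2_checked"))
print(resolve("stroke-width"))
print(resolve("my_custom"))
```

Note that no declaration is consulted anywhere: the lookup depends only on the attribute name itself, which is what makes copy-pasted fragments resolve the same way wherever they land.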
This would be more intuitive for creating documents, since authors (and authoring tools) could depend upon a convention. Supporting XML processors would have a much easier time resolving namespaces. Languages and UAs could support both namespacing schemes, if they so chose, to allow for custom or unconventional namespaces that aren’t registered; the two schemes are orthogonal and (I believe) compatible.
Let the rotten tomatoes fly.
In response to some feedback on IRC, I realize there is a potential problem with my scheme. My proposal would not make identical parse trees in HTML and XML (SVG). The “shape” of the tree would be the same, but the namespace URI on each would be different, and this would have to be reflected in the DOM. So, an additional difference between my underscore-delimited namespace prefix and Namespaces in XML is that you could use namespace-unaware methods like getAttribute and setAttribute on the underscored attributes, and they would function identically in the different languages; when using namespace-aware equivalents like setAttributeNS, you would supply the namespace keyword, like this: myEl.setAttributeNS("aria", "aria_checked", true). The browser would resolve the namespace URI internally. I’m not in love with this, but it seems workable to me on first glance.
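Here is a rough approximation of that DOM behavior in Python, using the standard library’s `xml.dom.minidom`. The real `setAttributeNS` expects a namespace URI as its first argument, so the keyword lookup is done by a wrapper; the `KEYWORD_TO_URI` table and the `set_attribute_by_keyword` function are hypothetical, and the URI is again the ‘aaa’ one from the SVG example.

```python
from xml.dom import minidom

# Hypothetical keyword table that a supporting UA would carry internally.
KEYWORD_TO_URI = {"aria": "http://www.w3.org/2005/07/aaa"}


def set_attribute_by_keyword(element, keyword, name, value):
    """Resolve the keyword to its URI, then call the ordinary DOM method."""
    uri = KEYWORD_TO_URI[keyword]
    element.setAttributeNS(uri, name, value)


doc = minidom.parseString('<g xmlns="http://www.w3.org/2000/svg"/>')
g = doc.documentElement

set_attribute_by_keyword(g, "aria", "aria_checked", "true")

# The namespace-unaware method sees the full underscored name...
plain = g.getAttribute("aria_checked")
# ...and the namespace-aware one finds the same attribute via the resolved URI.
namespaced = g.getAttributeNS(KEYWORD_TO_URI["aria"], "aria_checked")

print(plain, namespaced)
```

Because the name contains no colon, the DOM treats the whole string as the local name, so a script that only ever calls getAttribute and setAttribute never has to think about namespaces at all.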
On further thought, in order to preserve isomorphism with the existing DOM parse tree, I have another option… such a prefix might not involve namespaces at all (neither the namespace URI property on the attribute nor the interface methods). Instead, it could be used as an NVDL switching trigger, telling the validator (or other processor) that if it knows about the schema indicated by the keyword token, it should validate according to that schema; if it doesn’t know it, it simply uses the language’s own error handling for dealing with unknown content (which in the case of SVG 1.2 is simply to ignore it). This is a more modest scheme, requires less buy-in from existing processors, doesn’t involve namespaces per se, and probably solves most of the same problems. Note that I’ve been up all night, though (working on other things), so this may not be that bright…
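A toy version of that switching behavior, in Python, might look like the following. This is not NVDL, just the dispatch idea: the `KNOWN_SCHEMAS` table is a deliberately tiny, made-up stand-in for a real schema (it knows a single attribute and only two of its values), and `check_attribute` and its return strings are invented for the example.

```python
# Hypothetical and deliberately incomplete "schema": allowed values per
# attribute, keyed by the keyword token that triggers it.
KNOWN_SCHEMAS = {
    "aria": {"checked": {"true", "false"}},
}


def check_attribute(name, value):
    """Return 'valid', 'invalid', or 'host-language'.

    'host-language' means the keyword is not one this processor knows, so the
    attribute is left to the host language's own handling of unknown content.
    """
    keyword, sep, local_name = name.partition("_")
    schema = KNOWN_SCHEMAS.get(keyword) if sep else None
    if schema is None:
        return "host-language"
    allowed = schema.get(local_name)
    if allowed is not None and value in allowed:
        return "valid"
    return "invalid"


print(check_attribute("aria_checked", "true"))
print(check_attribute("aria_checked", "maybe"))
print(check_attribute("xforms_bind", "x"))
print(check_attribute("stroke-width", "2"))
```

Nothing in this version touches a namespace URI or a namespace-aware DOM method, which is the point of the more modest scheme.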