Webmandering

A while back, Henri Sivonen stumbled upon a diagram of the technology stack on the W3C site, a curvy-block Venn-diagram overview of the different technologies from W3C and where they fit into the Big Picture.  It’s an attractive diagram, but it oversimplifies things, and shows a decidedly W3C-centric view of the Web.  It’s clearly been used past its expiration date, and those who consume it might feel a bit queasy.

Henri strongly criticized this depiction.  He rightly points out that HTML is not included in this vision (which shows XHTML only, and looks a bit silly these days), but then complains that HTML5 and XHR are not included in the diagram.  But of course, HTML5 isn’t even in Last Call yet, much less a W3C Recommendation, so it doesn’t really belong in that particular diagram (oddly, Henri credits XMLHttpRequest to WHATWG, rather than its originator, Microsoft).

To his credit, Henri put his money where his mouth is, and took the trouble to make a diagram of the Web stack the way he sees it, which presumably better reflects the “real Web”.  It omits many of the W3C technologies, and inserts some of the more common ones that aren’t from W3C (most notably JavaScript).  It’s a good diagram, but it oversimplifies the landscape dramatically.  He follows the W3C diagram in putting “Internet” at the base of the stack, but doesn’t correct it to include such ubiquitous technologies as email or chat (XMPP, IRC, etc.) even though those are often part of browser-based technologies (Gmail, et al).  Of course, he deliberately omits intranet-connected devices, even though they’re part of the browser world, because the official doctrine of the WHATWG is that intranets (including those that are partly open, such as at universities) are “not the Web”.  I will also quibble that he overlaps Accessibility only with HTML, not with SVG.  But most glaringly, he includes Ogg Vorbis and Ogg Theora, though they aren’t (yet!) really used on the Web, and omits the dominant technologies in that space, MP3 and Flash (and more specifically, H.264).  He covers himself here by saying that this is for a “contemporary browser”, with the insinuation that it doesn’t include plugins, though to users and authors that is a pale distinction.  He also neglects PDF (ISO 32000), which is all too prevalent on the Web, and which several browsers do render (if I recall correctly).  So, it’s not really a picture of the real-world Web stack, either.

As The Frames remind us in their song “God Bless Mom”:

You’ll see how hard it can be
To keep your side of the deal,
And you’ll see how hard it can be
To keep one foot in the real.

So, his diagram is flawed.  So what?  Why am I picking on it?  I’m not, really… it’s a good diagram, and it serves a certain purpose.  I’m picking on that purpose itself.  Henri was quick to criticize the W3C diagram (on a page where nobody can comment, I note), not because it wasn’t accurate, but because it advanced his agenda to do so (just as the W3C was advancing its agenda by making the original diagram).

Data visualization, like statistics or slogans, has a way of territorializing the map, in a kind of graphical gerrymandering. I’m sure that Henri didn’t mean to make such glaring omissions, but I’m equally sure that the creator of the original W3C diagram didn’t have sinister motives either.  People get busy, and reuse what they have to hand that meets their needs, even when it’s sometimes not quite correct.

I really respect Henri, but what he fails to understand here, or at least to admit, is that different data visualizations are best suited to different audiences and different purposes.  He’s shown a clear bias in his diagram toward depicting the “Open Web Stack” (a bias I have to admit I share) and toward desktop browsers (which I find too narrow), with a Web developer audience in mind.  That’s perfectly cromulent.  But his diagram is not at all suited to showing the different work going on at W3C, and where it fits in the larger Internet, in an executive summary.  Both the offending W3C diagram and Henri’s own diagram are gross oversimplifications… which is the point of data visualizations.  The map is not the territory.  If I were to make a diagram that encompasses the Web tech landscape, it would include both W3C technologies and technologies from other sources, and code the origins with styling; it would clearly indicate which technologies are open (that is, not proprietary), which are under development and which are stable, and link each node to the definitive resource for that tech; it would not stack them up in a neat little box, but would show the interconnections via lines.  And it would serve a different purpose than either of these other diagrams.
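For the curious, here is roughly how I imagine the data behind such a diagram, as a minimal Python sketch rather than anything definitive.  The class names, the color choices, the example.org links, and the handful of entries are all my own placeholders for illustration; the real landscape would need far more nodes and far more care.  It emits Graphviz DOT text: fill color encodes origin, shape encodes open versus proprietary, a dashed border marks work still under development, and each node carries a link.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tech:
    name: str
    origin: str    # who produced it (illustrative labels only)
    is_open: bool  # open, that is, not proprietary
    stable: bool   # False means still under development
    url: str       # placeholder for the definitive resource


# Hypothetical styling: one fill color per origin.
ORIGIN_COLORS = {"W3C": "lightblue", "Ecma": "lightyellow", "Adobe": "lightgray"}


@dataclass
class Landscape:
    techs: dict = field(default_factory=dict)
    links: set = field(default_factory=set)

    def add(self, tech: Tech) -> None:
        self.techs[tech.name] = tech

    def connect(self, a: str, b: str) -> None:
        # Interconnections are undirected lines, so store each pair sorted.
        if a not in self.techs or b not in self.techs:
            raise KeyError("both technologies must be added before connecting")
        self.links.add(tuple(sorted((a, b))))

    def to_dot(self) -> str:
        lines = ["graph webtech {"]
        for t in sorted(self.techs.values(), key=lambda t: t.name):
            color = ORIGIN_COLORS.get(t.origin, "white")
            border = "solid" if t.stable else "dashed"
            shape = "ellipse" if t.is_open else "box"
            lines.append(
                f'  "{t.name}" [style="filled,{border}", fillcolor={color}, '
                f'shape={shape}, URL="{t.url}"];'
            )
        for a, b in sorted(self.links):
            lines.append(f'  "{a}" -- "{b}";')
        lines.append("}")
        return "\n".join(lines)


# A few illustrative entries; the statuses and links are placeholders.
landscape = Landscape()
landscape.add(Tech("HTML5", "W3C", True, False, "https://example.org/html5"))
landscape.add(Tech("SVG", "W3C", True, True, "https://example.org/svg"))
landscape.add(Tech("JavaScript", "Ecma", True, True, "https://example.org/js"))
landscape.add(Tech("Flash", "Adobe", False, True, "https://example.org/flash"))
landscape.connect("SVG", "HTML5")
landscape.connect("HTML5", "JavaScript")

dot = landscape.to_dot()
print(dot)
```

Feeding the printed text to Graphviz (for example, `dot -Tsvg`) should produce a clickable graph rather than a neat little box, which is the whole idea.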

Why was only XHTML included in that W3C diagram, and not HTML?  Wishful thinking.  Say it enough times, and it just might come true, and a picture is worth a thousand words.  We’re all dreaming the Web we want into reality, every day.  I’m tired of the false dichotomy that’s too often drawn between W3C and its members and participants.  How about we lay off the divisive rhetoric?