Annotators Anonymous

My name is Doug Schepers… and I’m an annotator.

You might be an annotator, too. If you think you might be, you should come to our support workshop in San Francisco in April.

I first realized I had an annotation problem about 3 months ago, at the Books in Browsers conference. I saw a great talk by Dan Whaley and Jake Hartnell of Hypothes.is, about their annotation engine, built on top of the Annotator project.

They demoed their browser extension, showing how they could select a passage of text and open up a sidebar to leave a comment on that specific passage. When the web page is reloaded, the extension finds the original selection for every annotation on that page, and highlights the specific passage as you select and read each annotation. And you could even reply to annotations in a threaded conversation… annotating the annotations (whoa, dude, that’s so meta!).

I’d wanted this functionality for WebPlatform.org for a while; we have a primitive annotation system, but it only anchors on the section level, which is still better than comments at the bottom of the page, or on a separate page entirely. I immediately saw the potential in Hypothes.is’ much more sophisticated script library; as a tool for suggesting improvements, requesting expanded coverage, asking and answering questions, and generally peer-reviewing collaborative documents, this is a tried-and-true UI that’s a critical feature of office tools like Word and Google Docs.

Then I realized that we could use this same tool for W3C’s specs, to allow simpler and much more immediate and contextual feedback than the current clumsy system of firehose email lists and bug trackers.

In hindsight, I see that the signs of my annotation problem go back many years. I was jonesing for a way to improve the flow and timeliness of feedback from the average developer or designer, not just those working at big member companies (one of my main goals as W3C Developer Relations Lead). To get my fix, I’d prototyped a crappy annotation system (though I didn’t know that’s what it was) soon after I first started with W3C, but scrapped it when a less-crappy system was deployed on the HTML5 spec to let people file bugs right from the spec. It didn’t satisfy me, though… it’s not the same as a true annotation system where the annotations persist in context.

Once I saw that sweet, sweet new annotation engine, I had to have it. So, later this year, we’ll be experimenting in a couple of Working Groups (starting with the Audio WG) with allowing true annotations as a primary feedback channel. (I’ll keep you posted.)

And these spec annotations don’t have to be from humans; we already have automated scripts that decorate the specs with notifications of test coverage or implementation status, and what are those but a specialized kind of annotation? Yeah, see? Once you get a taste of annotations, everything starts to look like an annotation.

And I mean everything! (Well, no I don’t… but I do mean a lot of things!) Think about it: what’s your primary contributive activity on the web? Reading a blog or article or kitten meme or infographic is a consumer activity; sharing those artifacts, on Twitter, Facebook, Pinterest, Tumblr, or wherever, is another main activity, a distributive activity; leaving a comment on that artifact, or replying to someone else’s comment, is also a contributive activity, just as much as creating the artifact in the first place, and (let’s face it) far more common for most of us than creating the primary artifact (it took me ages to get around to writing this blog post, let me tell you).

So, yeah; most of our online contributive activity is some form of annotation. Whether your comment is in your tweet, or at the bottom of the article, or in some sort of threaded forum, when you’re talking about some other document that you’re linking to, you’re annotating.

Here’s a simple chart that shows just how deep I’ve gotten myself into this addiction (based on an original by Dan Whaley):

[Chart: Annotation Classes and Features]

Once I realized that annotations were what I’d been craving, I started looking for them wherever I could. I found at least 40 companies that are not yet W3C members doing web annotations in one form or another: from ebook readers that let you share notes, to hip sites like Rap Genius or Medium or Quartz, to education/research tools like Diigo, to more traditional (but innovative) sites like the New York Times or the Financial Times. Why were they all doing annotations? Because having the comments in context, right there where the reader is looking, leads to more incisive, directed, relevant comments. Because the immediacy of annotations lets people point out small errors in otherwise good articles, or gems in otherwise mediocre articles. Because the reader comments can add value to the article, rather than just being a misinformed screed that sinks to the bottom of the page.

I dug into the topic for a couple months. I told myself that I could stop anytime I wanted, but deep down, I knew the truth.

I wanted to standardize annotations.

W3C had an experimental project back in the late 1990s called Annotea, but it didn’t really make it into standards. So, what could usefully be standardized? I have a few ideas.

Robust Anchoring

A lot of great research has gone into this, but it remains a hard problem. How do you link to a passage of text that might have moved or changed (possibly due to an annotation that suggested the change!), and which doesn’t have a built-in anchor in the markup? What if it’s the multi-page view of the article, rather than the single-page view? What if it’s a new version at a different URL, but is mostly the same content (like a W3C spec)?

Hypothes.is has a nice multiple-factor selection algorithm that is pretty rigorous about finding the right passage even across changes. I think that would be a useful feature to have in browsers.

But even then, they have to insert tags into the content in order to highlight the passage, and this is a pain if there are multiple overlapping annotations or if the annotation crosses over element boundaries (like the last sentence in one paragraph and the first in the next paragraph). It would be nice to be able to style a passage like you would an active selection, with a CSS selection pseudo-element that only allows you to change the color, background-color, and outline properties, which don’t affect reflow of document layout and are thus pretty computationally inexpensive.
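To make the anchoring problem a little more concrete, here is a minimal sketch of a multi-factor lookup. It is not Hypothes.is' algorithm; the `QuoteSelector` shape and the `anchorQuote` function are names invented for this example. It assumes the annotation stored the quoted passage, a little text on either side of it, and its original character offset. It only handles exact quotes: if the passage itself was edited it gives up, which is where a real system would need fuzzy matching.

```typescript
// Hypothetical selector shape, invented for this sketch.
interface QuoteSelector {
  exact: string;  // the annotated passage itself
  prefix: string; // a few characters that came just before it
  suffix: string; // a few characters that came just after it
  start: number;  // character offset when the annotation was made
}

interface Anchor {
  start: number;
  end: number;
}

// Number of matching characters, counted backwards from the end of both strings.
function commonSuffixLength(a: string, b: string): number {
  let n = 0;
  while (
    n < a.length &&
    n < b.length &&
    a.charAt(a.length - 1 - n) === b.charAt(b.length - 1 - n)
  ) {
    n++;
  }
  return n;
}

// Number of matching characters, counted forwards from the start of both strings.
function commonPrefixLength(a: string, b: string): number {
  let n = 0;
  while (n < a.length && n < b.length && a.charAt(n) === b.charAt(n)) {
    n++;
  }
  return n;
}

// Find the occurrence of the quote whose surrounding text best matches the
// stored context; ties are broken by closeness to the original offset.
function anchorQuote(text: string, sel: QuoteSelector): Anchor | null {
  if (sel.exact.length === 0) return null;
  let best: Anchor | null = null;
  let bestScore = -1;
  let bestDistance = Infinity;
  let from = text.indexOf(sel.exact);
  while (from !== -1) {
    const end = from + sel.exact.length;
    const before = text.slice(Math.max(0, from - sel.prefix.length), from);
    const after = text.slice(end, end + sel.suffix.length);
    const score =
      commonSuffixLength(before, sel.prefix) +
      commonPrefixLength(after, sel.suffix);
    const distance = Math.abs(from - sel.start);
    if (score > bestScore || (score === bestScore && distance < bestDistance)) {
      best = { start: from, end: end };
      bestScore = score;
      bestDistance = distance;
    }
    from = text.indexOf(sel.exact, from + 1);
  }
  return best; // null when the quote no longer appears anywhere
}
```

Scoring the context before the offset matters when text is inserted earlier in the page: the nearest occurrence of a repeated phrase is then not necessarily the one that was annotated.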

Federation and Syndication

It’s great that forward-looking companies are providing an annotation interface to their own content, but it still doesn’t lead to really free, open conversations; on controversial sites, the publishers still control which comments are approved, and that can lead to abuse just as surely as spam and trolling can. Individual commenting or annotation systems lead to fragmented online identities, and you still get personal-data silos even with distributed commenting systems like Livefyre or Disqus, or distributed logins with OpenID, Facebook, and Twitter. I’d like to be able to publish and aggregate comments across multiple systems at the same time. I’d like to be able to maintain a personal identity (or multiple pseudonyms or even anonymity) and authorship for my online content, not beholden to individual publishers (because that’s what commenting systems are: publishers), to share with specific groups or friends or just keep my notes to myself, and I’d like to be able to see comments by a particular user across the web, not just on a few sites. This feeds into the goal that some people call IndieWeb. Annotation services with multiple publishing channels could enable that.

Annotation Events

One of the parties that should be able to publish my annotations, if they wish, is the publisher of the original article. When Google tested the annotation waters with the now-defunct Sidewiki a few years ago, Jeff Jarvis rightfully complained that it effectively stole value from his own blog, by diverting value-adding comments away from it.

And it’s also hard for an annotation service to know when an annotation should be made, because many webapps swallow events.

To address these issues, I’d like to see a pair of annotation events: annotationstart, which signals that a selection has been made, and annotationend, which notifies the web page that an annotation has been made and where you can find the feed for it, so you can retrieve it via a REST API and bring the best annotations into your own commenting system.
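As a rough sketch of how such a pair of events might be consumed, the example below uses a small hand-rolled channel in place of the browser's event system, since no browser defines annotationstart or annotationend. Everything here is an assumption made for illustration: the `AnnotationChannel` class, the payload fields `selectedText` and `feedUrl`, and the `collectFeeds` helper.

```typescript
// Hypothetical payloads for the two proposed events.
interface AnnotationStartDetail {
  selectedText: string; // what the reader selected
}

interface AnnotationEndDetail {
  selectedText: string;
  feedUrl: string; // where the page could fetch the new annotation
}

// Stand-in for the page's event target; a real proposal would presumably
// dispatch these as DOM events instead.
class AnnotationChannel {
  private startListeners: Array<(detail: AnnotationStartDetail) => void> = [];
  private endListeners: Array<(detail: AnnotationEndDetail) => void> = [];

  onAnnotationStart(listener: (detail: AnnotationStartDetail) => void): void {
    this.startListeners.push(listener);
  }

  onAnnotationEnd(listener: (detail: AnnotationEndDetail) => void): void {
    this.endListeners.push(listener);
  }

  // Called by the annotation service when the reader makes a selection.
  annotationStart(detail: AnnotationStartDetail): void {
    this.startListeners.forEach((listener) => listener(detail));
  }

  // Called by the annotation service once the annotation has been saved.
  annotationEnd(detail: AnnotationEndDetail): void {
    this.endListeners.forEach((listener) => listener(detail));
  }
}

// Publisher side: remember each feed URL so the annotations can be fetched
// later and merged into the site's own commenting system.
function collectFeeds(channel: AnnotationChannel): string[] {
  const feeds: string[] = [];
  channel.onAnnotationEnd((detail) => {
    feeds.push(detail.feedUrl);
  });
  return feeds;
}
```

The point of the split is that the page hears about the selection even if its own scripts swallow mouse events, and hears about the finished annotation without having to host it.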

Data Model

To have this kind of syndication and federation, you need to have a common data model.

I had previously noticed that some folks had started a W3C Open Annotation Community Group, but it seemed a bit… Capital-S-Semantic for my tastes.

I swear I didn't make this W3C meme showing Dave from 2001: A Space Odyssey saying 'My God, it's full of RDF'

But once I read their spec and looked past the descriptions of RDF and SPARQL, I saw a sound data model that would be useful for interchanging annotations between services. Their model seems large, and at first glance might be overly complicated (or, more charitably, “robust”), but you don’t need to use all parts of it for every instance; an annotation can be really simple.

I personally don’t think that most people creating annotations will want to express them in RDF, but that shouldn’t matter. So long as an annotation can be expressed in an agreed-upon serialization, like a subset of HTML, it can be transformed into whatever representation the back-end system wants (including RDF).
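To illustrate how small an annotation can be, and how a plain serialization can be turned into something graph-shaped for a back end that wants it, here is a minimal sketch. The `SimpleAnnotation` shape, the `toTriples` function, and the predicate names are placeholders invented for this example; they borrow only the body/target split from the Open Annotation draft and are not its actual vocabulary.

```typescript
// A deliberately tiny annotation: who said what about which passage.
interface SimpleAnnotation {
  id: string;
  author: string;
  body: string; // the comment itself
  target: {
    source: string; // URL of the annotated document
    exact: string;  // the quoted passage
  };
}

// Subject, predicate, object: the basic unit a triple store would ingest.
type Triple = [string, string, string];

// Flatten the annotation into triples. Predicate names are placeholders.
function toTriples(annotation: SimpleAnnotation): Triple[] {
  const targetNode = annotation.id + "#target";
  return [
    [annotation.id, "author", annotation.author],
    [annotation.id, "hasBody", annotation.body],
    [annotation.id, "hasTarget", targetNode],
    [targetNode, "source", annotation.target.source],
    [targetNode, "exact", annotation.target.exact],
  ];
}
```

The same object can be handed to `JSON.stringify` for interchange between services, so the person writing the annotation never has to see the triples at all.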

Personally, I’d like to see a <note> element in HTML, with a client-side API for the scroll position of a selection and other things. This could be used for comments (I don’t like the use of <aside> for this), and also for footnotes (like the clever way Wikipedia does them), which is a big issue for digital publishing.

Join Us!

These are just some of the ideas that we’ve been noodling on in our proposed charter for a new W3C Web Annotations Working Group. If you have other ideas, come share them with us at the upcoming W3C Annotations Workshop, April 2nd in San Francisco, collocated with the I Annotate 2014 summit. At the end of the summit, we want to finalize the charter and hopefully launch a dedicated working group.

You don’t have to be alone. Join us. It helps to talk about it.

Disclaimer

Despite the playful tone of this blogpost, I don’t mean to make light of serious, harmful addictions, like Candy Crush Saga.

Update:

In the unlikely event that you found this blog post interesting but a bit too brief, you should listen to Jen Simmons and me talk about web annotations on her excellent The Web Ahead podcast.

 

3 thoughts on “Annotators Anonymous”

  1. Eric Van der Vlist wrote a book. http://books.xmlschemata.org/relaxng/relax-CHP-1-SECT-3.html
    In draft, it was fully annotatable using an ingenious schema. Allowed reader feedback and hence improved
    the book, prior to publication. 
     
    I now want to annotate my epub books, even my Kindle books! Just as I scribble on my dead tree books.
     
    Dave

  2. Which is why we need capital-C-Consensus and capital-C-Compromise to ensure the broadest adoption at the same time as actual utility and interoperability in the capital-C-Community 🙂 And for that, I second Doug’s call to be part of the Annotations Workshop and I Annotate!

  3. Wow, I had no idea hypothes.is’ system was capable of all that. Really creative use for it. Also, good riddance to Bugzilla if this can really overtake that as a way to track issues within the specifications. 😉
