Web Audio Goes to Eleven

I’m really excited about W3C’s new public Audio Incubator Group, just launched today, and open for collaborators, innovators, and instigators. Go take a look for yourself, and see if you can contribute.

To celebrate the occasion, here’s a simple example of an experimental audio interface, in the world’s first (and worst) audio synthesizer in SVG (you’ll need a special Minefield build to use it). Just click on the keyboard… it’s pretty rough still, but it shows some of the potential:


(standalone SVG file)

For some background, read on after this break…

Background

Years ago, I was one of several SVG developers interested in doing for audio what SVG did for graphics, and we talked a bit about a format we called SynthML. The old Adobe SVG viewer included an audio element for MP3 playback, and I found it really useful for adding subtle sounds for button mouseovers, clicks, and other UI effects. So the idea of programmatic control of music on the Web, for enhancing user interfaces, adding ambient background music, and providing full-fledged music webapps, had intrigued me for some time (anyone remember Thomas Dolby’s Beatnik?). The SynthML effort fizzled out, and while there are some interesting music markup languages (most notably MusicXML, which is more like the MathML of music), there hasn’t been much traction beyond simple recording playback.

That’s why I was so psyched when I saw what Alistair MacDonald, David Humphrey, Corban Brook, and a few others were up to while attending a Processing.js community event at Bocoup. David hacked up a special build of Gecko to add script access to the raw audio stream of the HTML5 <audio> element, and Al, Corban, and friends were building cool demos on top of it.

I immediately knew I wanted to see this as a part of the native functionality of the Web platform, across browsers. In some ways, it’s the last major missing piece from the open Web… the ability to view source and hack sound, the last commonly digitizable sensorium. I also knew that I wanted the use cases and requirements to come from the larger community, the musicians and UI designers and developers who will ultimately be the ones using this functionality to sound out the depths of what’s possible on the Web (okay, I’m corny). So, I talked with Corban, David, and Al, and we agreed that a public discussion forum would be the best way to connect with this community, to get them talking and experimenting and building and sharing.

Chris Blizzard of Mozilla was there too, co-sponsoring the event, and he was extremely supportive of getting this work started at W3C. I wrote up a charter for a public incubator group, and started knocking on doors of the browser vendors, our other members, and the wider community who might be stakeholders, to drum up interest. Pretty quickly, I got support from folks at the University of Rio (PUC-Rio), BBC, and Google; currently, a W3C Incubator Group takes a minimum of three W3C members to kick it off, though once it’s started, anyone can join. With that nicety taken care of, W3C management approved the charter, and the new W3C Audio Incubator Group (or Audio XG) was launched today!

Joining the Audio XG

I invite anyone interested in audio, music, speech synthesis, games, or other related topics to get involved: join the Audio XG, start shaping the future of Web audio on the mailing list, and chat on IRC (irc.w3.org, port 6665, channel #audio). Incubator participation is free and open to anyone who wants to help out.

About the SVG Synth

My SVG synthesizer is a true product of the Web. I know next to nothing about music… I can’t play an instrument, can’t read music, don’t know music theory… I love music, but I’m a consumer, not a creator. I’d love to change that, when I have time, but I’ll settle for now for enabling others.

So, when I decided to make something with the audio API, naturally my first thought was to do something I didn’t know how to do, for which I’m totally unqualified: create a synthesizer in SVG.

I started with the HTML example from the documentation for the experimental audio API on the Mozilla wiki, and converted it to SVG (with a few missteps along the way). As it turns out, the HTML5 audio interface is exposed in Gecko (the Firefox engine) even when there is no HTML content. This was good news for me. It made it even easier to programmatically create sound with it. I suspect that this is something that lies outside any current specification, but it should be codified somewhere.

Next, I looked for a piano keyboard in SVG; I found a few on Wikimedia Commons, but they were not quite what I wanted (and in typical Inkscape fashion, were a little bloated), so ultimately I just made my own, which was trivial to do.

Next, I had to understand which notes each key should sound. So I read a lesson on keyboard basics, which sufficed for my purpose.

Then I had to figure out the frequency of each note in hertz (Hz), which is what the audio API understands. It was no surprise that someone had published just such a table mapping notes to frequencies, with the companion equations for deriving them.
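For anyone curious what such equations look like, here is a minimal sketch in Python. It assumes twelve-tone equal temperament with A4 tuned to 440 Hz, which is the usual convention but not necessarily the exact table used for the synth; the function name and its parameters are hypothetical, chosen only for illustration.

```python
# Semitone offset of each natural note from C within one octave.
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_frequency(letter: str, octave: int, accidental: int = 0) -> float:
    """Return the frequency of a note in Hz.

    Assumes equal temperament with A4 = 440 Hz.
    accidental is +1 for a sharp, -1 for a flat, 0 for a natural.
    """
    # Distance from A4 in semitones (A sits 9 semitones above C).
    semitones_from_a4 = (octave - 4) * 12 + SEMITONES_FROM_C[letter] + accidental - 9
    # Each semitone multiplies the frequency by the twelfth root of 2.
    return 440.0 * 2 ** (semitones_from_a4 / 12)


if __name__ == "__main__":
    print(round(note_frequency("C", 4), 2))  # middle C, about 261.63 Hz
    print(note_frequency("A", 0))            # lowest A on a piano, 27.5 Hz
```

Under these assumptions A-sharp and B-flat in the same octave come out at the same frequency, which is why a single black key can stand for both.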

To put the pieces together, I took the simplest possible approach: I handcoded the keyboard (I could have instantiated it with script, because it’s very deterministic, but I favor declarative markup that can be tweaked in an authoring tool); I assigned an id to each key based on the note (C4, As0-Bf0, etc., where “f” == “flat” and “s” == “sharp”, and the number denotes the octave); I then created an associative array with the note id as the index and the frequency as the value, so I could look it up with the key’s id; then it was just a matter of sending the frequency to the script from the HTML example, and adding a duration timer.
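The lookup step can be sketched roughly as follows, in Python rather than the synth's own script. Everything here is an illustrative assumption: the table holds only a few natural notes with rounded frequencies, the 44100 Hz sample rate and half-second default duration are placeholders, and `samples_for_key` is a hypothetical name, not something from the Mozilla example.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed, not taken from the synth)

# Hypothetical excerpt of the id-to-frequency table, in Hz (rounded).
NOTE_FREQUENCIES = {
    "C4": 261.63,
    "D4": 293.66,
    "E4": 329.63,
    "F4": 349.23,
    "G4": 392.00,
    "A4": 440.00,
    "B4": 493.88,
}


def samples_for_key(key_id: str, duration_seconds: float = 0.5) -> list[float]:
    """Look up a key's frequency by its id and build a sine tone for it."""
    frequency = NOTE_FREQUENCIES[key_id]  # raises KeyError for an unknown id
    sample_count = int(SAMPLE_RATE * duration_seconds)
    # One sine-wave sample per tick, each in the range -1.0 to 1.0.
    return [
        math.sin(2 * math.pi * frequency * i / SAMPLE_RATE)
        for i in range(sample_count)
    ]


if __name__ == "__main__":
    tone = samples_for_key("A4")
    print(len(tone))  # 22050 samples for half a second at 44100 Hz
```

In the browser, a click handler would pass the clicked key's id to something like this and hand the resulting samples to the audio API; here the duration simply limits how many samples are generated, standing in for the timer.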

Finally, just as a decorative flourish, I found a partial musical staff with a treble clef and beamed eighth-note, in SVG, of course, and tweaked it to my liking.

My tools? A Web browser with the audio API enabled (libre source), a search engine, and a text editor.

Voilà! From knowing almost nothing about formal music (I’d never even heard the term “middle C” before), I created a simple synthesizer, and learned a bit about music along the way. That’s the power of the open Web. That’s why I’m so pumped up about this new focus on audio. This one goes to 11.