More on HTML5 audio, codecs and politics

Ashley
  • 1 Aug, 2011

In my last blog post, On HTML5 audio formats, I discussed the patchy audio support in today's browsers. I argued that Internet Explorer and Safari should support the Ogg Vorbis format, and that Firefox should support MPEG-4 AAC to allow us to have one format that works everywhere.

However, the situation turns out to be much more complicated, and even more patchy, than I first imagined. This blog post is what I hope is a fairly comprehensive outline of the state of audio on the web today, some of the politics involved, and what we're going to do about it. In short, we want Ogg Vorbis, so we've set up WeWantOgg.com, and you can help.

So what does audio look like on the web today?

Flash

First of all, the most common workaround (for reasons which will soon become obvious) is to use a background Flash app to play sounds on behalf of the web page. For HTML5 games made with Construct 2 this is out of the question. We describe HTML5 as being "like Flash but with no plugins", so using any kind of Flash (or other plugins like Java) basically negates our whole existence. Besides, HTML5 is supposed to be a good substitute for browser plugins, so shouldn't audio work well in HTML5?

Using HTML5 <audio> for sound effects

The HTML5 audio tag has somewhat patchy support as far as gaming is concerned. Some basic features like muting and looping still aren't supported by some major browsers. There are also latency issues in many browsers: sometimes playback doesn't start until after a delay, and some changes (like muting/unmuting) also have a delay before they take effect. So for gaming, support is somewhere around "almost adequate". Hopefully it will see improvements soon.
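To give an idea of what working around a missing feature looks like, here is a minimal sketch of a looping fallback. It assumes that a browser without looping support simply leaves the element's loop property undefined, and that restarting playback on the "ended" event is an acceptable substitute (there may be a small gap between repeats). enableLooping is a made-up name for illustration, not something from Construct 2.

```javascript
// Turn on looping for an audio element, falling back to a manual restart.
// `audio` is an HTML5 audio element, e.g. document.createElement("audio").
// Assumption: browsers without loop support leave `audio.loop` undefined.
function enableLooping(audio) {
  if (typeof audio.loop === "boolean") {
    // The browser knows about looping, so let it handle the repeat.
    audio.loop = true;
    return "native";
  }
  // Fallback: rewind and play again each time playback reaches the end.
  audio.addEventListener("ended", () => {
    audio.currentTime = 0;
    audio.play();
  });
  return "fallback";
}
```

The return value only reports which path was taken, which can be handy when testing across browsers.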

There's still the issue of there not being a single audio format that works in all browsers, though. Right now Construct 2 helps you encode your sounds to Ogg Vorbis and MPEG-4 AAC to cover all browsers.
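Shipping two encodings means the page has to pick one at runtime. The sketch below shows one way that choice could be made with the audio element's canPlayType method, which returns an empty string for formats the browser can't play. The file paths and the names SOURCES and pickAudioSource are hypothetical, and this isn't necessarily how Construct 2 does it.

```javascript
// The same sound in both encodings, in order of preference.
// The URLs are hypothetical placeholders.
const SOURCES = [
  { type: 'audio/ogg; codecs="vorbis"', url: "sounds/jump.ogg" },
  { type: 'audio/mp4; codecs="mp4a.40.2"', url: "sounds/jump.m4a" },
];

// Return the URL of the first source the browser says it can play, or null.
// `audio` is anything with a canPlayType(type) method, e.g. an audio element.
function pickAudioSource(audio, sources) {
  for (const source of sources) {
    // canPlayType returns "", "maybe" or "probably"; "" means unsupported.
    if (audio.canPlayType(source.type) !== "") {
      return source.url;
    }
  }
  return null;
}
```

In a browser this might be used as pickAudioSource(document.createElement("audio"), SOURCES), with the result passed to new Audio(url) when it isn't null.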

The HTML5 audio spec doesn't mention many features desirable in games though, like effects and 3D positioning. However, these are possible in the Web Audio API.

Web Audio API

The Web Audio API is an in-development addition to the HTML5 standard. It's essentially a complete audio API of comparable sophistication to the audio engines in modern commercial games. You can process sound any way you like, creating audio processing graphs, using effects, spatial positioning, and so on. It looks really cool! Firefox and Chrome allegedly support it, but as far as I can tell it isn't enabled by default in the latest version of either. Even if it were, it's still not clear whether other browser makers will support it. Microsoft could kick up another WebGL-style fuss with Internet Explorer. However, I'm sure MS are mainly concerned with WebGL impacting DirectX's market/mind-share, and I can't see a parallel with Web Audio, so hopefully they'll eventually support it.
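As a rough sketch of what an "audio processing graph" means in practice, the function below wires a sound through a volume node and a 3D panner to the speakers. It uses the method names from the Web Audio API as it was later standardised (createBufferSource, createGain, createPanner), which may well differ from the in-development draft; playPositioned and its options are made-up names for illustration.

```javascript
// Play a decoded sound at a position in 3D space.
// `ctx` is an AudioContext and `buffer` an already-decoded AudioBuffer.
// playPositioned and its option names are hypothetical.
function playPositioned(ctx, buffer, { volume = 1, x = 0, y = 0, z = 0 } = {}) {
  // Source node: plays the decoded sound data.
  const source = ctx.createBufferSource();
  source.buffer = buffer;

  // Gain node: controls the volume.
  const gain = ctx.createGain();
  gain.gain.value = volume;

  // Panner node: places the sound relative to the listener.
  const panner = ctx.createPanner();
  panner.positionX.value = x;
  panner.positionY.value = y;
  panner.positionZ.value = z;

  // Build the graph: source -> gain -> panner -> speakers.
  source.connect(gain);
  gain.connect(panner);
  panner.connect(ctx.destination);

  // Start playback immediately.
  source.start(0);
  return source;
}
```

Effects would be added the same way, by inserting more nodes between the source and the destination.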

Firefox have had their own Audio Data API since version 4, but it's specific to Firefox and the Web Audio API is far more sophisticated.

Web Audio also does not change the audio format support. It's the same as with the ordinary audio tag. It may be possible to write a Javascript Vorbis decoder, but I'm not aware of the existence of any such project, and it would be inefficient compared to a built-in decoder anyway.

So for the time being, the audio tag is the only reliable cross-browser audio playback mechanism, with its quirks.

Audio format support

The HTML5 standard does not require browsers to support any one audio format. It used to recommend Ogg Vorbis (and Theora for video), which are free and open formats that can be used by anyone without having to get a license or pay royalties. However, the recommendation was retracted after pressure by some of the big corporations involved in the HTML standardisation process.

There is no technical reason for not supporting Ogg Vorbis or Theora in all browsers. The code is free and open, ready and waiting for any browser maker to implement. There are no realistic security issues, since Chrome and Firefox already support them and that hasn't caused any major issues. It's only politics holding them back.

The main alternative is the MPEG family of formats: MP3 (MPEG-1 layer 3) and MP4 (MPEG-4 AAC) for audio, and H.264 (MPEG-4 AVC) for video. These formats are professionally developed and perform excellently. However, their use requires licenses and royalty payments. This does not sit well with the free and open nature of the internet.

Browser support has fallen into two major camps: Ogg supporters (Firefox, Opera), and MPEG supporters (Internet Explorer, Safari), with Chrome supporting both. Notably the companies backing Ogg (Mozilla and Google among them) have a vested interest in making the web better, and the MPEG supporters are heavy users of MPEG technologies (Apple use AAC in iTunes, and I don't know about Microsoft but I'm sure they have many dealings with the MPEG people, as well as a general scepticism of anything open). If you're wondering about WebM, it also uses Vorbis for audio. Given it's the same audio data in different packaging, I'll just assume it's equivalent to Ogg Vorbis for now.

If Internet Explorer and Safari were to support Ogg Vorbis, the next obvious question is "why not also Ogg Theora (the free video codec)?" Now we're into video, which I don't know a great deal about, except it involves billion-dollar industries, power struggles and high politics. Many senior people from companies like Microsoft, Google and Mozilla have made lengthy blog posts on the subject of video. I don't want to go into that here - all we need is an audio format for our game sound effects - but it's a factor in Ogg Vorbis support.

I was wrong about Firefox and AAC

Looking over the browser audio support table it's easy to see a cross in the "Firefox/AAC" box, and think "gaawwwssh, if only that was a tick, we could have a near enough majority support of one format for our game sound effects!". That's what I did. I have to admit now I was pretty naive to argue that. I treated it as equivalent to the "Internet Explorer and Safari/Ogg Vorbis" box, but in reality it isn't.

I've been looking into obtaining a license to allow us to provide an AAC encoder with Construct 2. The people I have contacted have been very helpful and informative. It should be relatively straightforward. However, there's a $1000 fee to pay, forms to fill out, plus royalties on every copy of Construct 2 we sell with the encoder. Overall, it's mildly bureaucratic. (Meanwhile our Ogg Vorbis encoder is already implemented and working just fine.)

Now imagine if we had to repeat that for every single other technology Construct 2 uses. That includes HTML, Javascript, jQuery, PNG images, Google's Closure Compiler, and other tools like PNGCrush and Scintilla. If that were so, Construct 2 would simply never have happened. We'd be buried under royalties, fees and licenses, and never be able to develop a new editor. Imagine if there were MP3-style royalties requiring payments to even just distribute content! Scirra has grown from student-in-bedroom programmers with grand ideas. Even "fair" royalties are a serious impediment to innovative shoestring-budget projects. Every license makes creating something new and exciting a bit more difficult.

Mozilla are right: a free and open internet is the only way. Sooner or later, these non-free formats have to be replaced by free ones. One of the things that makes the internet great is that anyone can generate content for it. Technologies which allow free distribution (like AAC) are not enough. We also need free generation. Organisations like ours, and all our users, are concerned mainly with generating new content for the internet, and we have to be able to do that without restriction, or the internet would simply be a place for corporations with millions of dollars to tell you what they want you to hear. Sure, MPEG formats may be better quality or more widely supported, but we still want the open formats, so we can use the internet however we see fit, rather than in a fashion that suits patent holders.

Submarine patents

Some MPEG proponents have expressed concern over "submarine patents" - patents whose holders could suddenly claim that some open format infringes them and sue you. Then, the only way to ensure a proper resolution is in a court of law.

Again, I don't want to cover video - we just want game sound effects. However, in the case of Vorbis, it's been around for over a decade now. It has been used in a number of hardware players, software players and music sites. If anyone were to attempt to torpedo someone with their submarine patent, it would be hard for them to claim they hadn't known anyone was using their patent before. So why haven't these patent holders acted already in the past 10 years? I guess there is no patent infringement after all.

Besides, presumably anyone could claim patent infringement on any technology at any time. Does Javascript infringe some patent archived away somewhere? What if it doesn't, but someone sues you anyway, claiming it does? Everyone just assumes it's patent-free and continues to use it anyway. It's a risk we're willing to take. What is especially annoying is a patent lawsuit against a small startup like ours could totally destroy it. A patent lawsuit against a large corporation with millions or billions of dollars in the bank is probably a mere inconvenience, and yet they still cite patent lawsuits as a reason not to support some formats. We have proportionally much more to lose and we're still willing to support them. What's the big worry for you guys up there? Scared someone's going to steal a coin from your mountain of gold? It's alright for you, I have an overdraft here.

We Want Ogg

So we would really like all browsers to support Ogg Vorbis for audio. Then game developers - and many other kinds of users - can have one unrestricted audio format to use for the web. It also avoids setting a precedent for other new internet technologies to require licenses, a precedent which would only help stamp out individuals or small teams with cool ideas. And, remember, there's no technical reason we can't have this. It's probably the politics over the video formats which is also holding back the free audio formats.

We're not bothered by politics here at Scirra, so we're not going to go down without a bit of a fight. We believe there is one loophole in this techno-political circus: if the majority of customers want something, usually the business is obliged to provide it. So we're asking you to help. We're starting a campaign to get as many of you as possible to demand support for the free and open Ogg Vorbis audio format. There is no technical reason not to support it, nor does there even seem to be a good political reason, and we want to keep the web a place where all of us can use it as we see fit. Vorbis has to eventually replace AAC on the internet, so let's start trying to make that happen.

So we've set up WeWantOgg.com. Sign up. Spread the word. Let's get as many people as possible involved, letting Microsoft and Apple know it's what we want. We all want Ogg Vorbis because it's better for us. Apple and Microsoft don't want to support it because that's better for them. Who's in charge here - the businesses or the customers? Let's do what we can to make it obvious: we are the users, and we want free and open formats on our internet.
