How browsers should handle autoplay restrictions

Official Construct Team Post
Ashley
  • 8 May, 2018

It's annoying when web pages unexpectedly blare out audio, so I can understand why Chrome 66 changes when web pages can autoplay audio. Unfortunately this was done by carrying a needless limitation over from mobile to desktop: attempts to play audio actually fail, rather than simply being muted until playback is allowed. Mobile had this limitation from the start, so everyone just coded around it.

Now the same limitation has been transferred to desktop, where lots of existing code expects playback calls to succeed. This has broken many existing web pages, which in some cases no longer play audio at all. Consequently many web game developers are annoyed. This could all have been avoided if it had been handled more thoughtfully.

Here's a bit of history on this.

Autoplay restrictions on iOS

Mobile browsers were the first to impose restrictions on audio (and video) playback. Initially Safari on iOS blocked autoplay unless you started playback in a touch event, partially to save bandwidth bills from unexpected video playback on cell data, and partially to avoid annoying users with unexpected audio. Autoplaying audio is particularly annoying on mobile, since you might be in a quiet place like a library and not expecting any audio playback. Limiting it to a touch event was meant to signal that the user wanted the content. For example if they opened a page and then immediately pressed back because the page was just an advert or otherwise unwanted content, there was no opportunity to play audio. If they touch the page, presumably they're interested in it, so it's OK to play audio.

Animated GIF images were still allowed to autoplay, but GIF is an old format with really poor compression. People still wanted autoplaying video clips on their web pages, so predictably GIFs got used for video clips instead. This ended up wasting even more bandwidth due to the poor compression of GIF. In the end Apple finally allowed muted videos to autoplay, so web pages can use modern video formats for video clips and save bandwidth with their better compression. (These days many video clips are still labelled "GIF" even though they really play a modern video format like H.264.) So in the end the only restriction is on audio playback.

Apple's implementation of restricting audio playback had a nasty limitation: if you tried to start playing audio before a touch event, the attempt simply failed. That meant the user would not hear anything even if they touched the screen a moment later. You had to write code that queued any playback requested on startup and started it in the first touch event.
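As a minimal sketch of that kind of workaround, the queueing could look something like the following. The names `PlaybackQueue`, `request` and `unlock` are made up for illustration and do not come from any browser API or library; the sketch only shows the queue-then-flush idea.

```typescript
// Hypothetical helper: holds playback requests until the first user touch.
type PlayFn = () => void;

class PlaybackQueue {
  private unlocked = false;
  private pending: PlayFn[] = [];

  // Ask for a sound to be played. Before the first touch the request is
  // queued rather than attempted, since the attempt would simply fail.
  request(play: PlayFn): void {
    if (this.unlocked) {
      play();
    } else {
      this.pending.push(play);
    }
  }

  // Call this from the first touch handler. Runs everything that was
  // queued, in order, and lets later requests play immediately.
  unlock(): void {
    if (this.unlocked) return;
    this.unlocked = true;
    const queued = this.pending;
    this.pending = [];
    for (const play of queued) {
      play();
    }
  }
}
```

In a real page you would presumably call `unlock()` from a listener registered with something like `document.addEventListener("touchend", ..., { once: true })`, and pass a callback that starts the actual sound to `request`.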

Inconsistent limitations

Then it got maddeningly inconsistent: after the first touch, the Web Audio API could then play audio at any time. However the <audio> element still couldn't start playback until the next touch. The <audio> restriction was entirely pointless, because you could simply bypass it by using Web Audio instead (although Web Audio doesn't stream playback like <audio> does, so this was less efficient).

Around four years ago I filed a WebKit bug arguing this was inconsistent and only impeded legitimate cases. Apple refused to change it, and it was left like that for years. (I think it might even still be the case in iOS 11.)

A better approach

These restrictions require special coding to handle them. Instead the browser could simply allow all playback attempts to succeed, but mute the master audio output. Then the browser can automatically unmute the master audio output the first time the user touches the screen (or whenever else it deems the user is OK with audio). This doesn't require any special changes to playback code. It also doesn't allow web pages to do anything they couldn't do before. This is a key point. All web pages can play audio from the first touch anyway. They can specially code themselves to act like that first touch unmutes all audio. Why force web pages to write special code for that when the browser could handle it automatically?
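To make the proposal concrete, here is a toy model of the behaviour described above. It is not a real browser API: `MutedMasterOutput` and its methods are hypothetical names, and the class only simulates what a browser would do internally.

```typescript
// Toy model of the proposed browser behaviour (hypothetical, not a real API).
class MutedMasterOutput {
  private muted = true;
  private readonly playing: string[] = [];

  // Playback always succeeds, whether or not the user has interacted yet.
  play(sound: string): boolean {
    this.playing.push(sound);
    return true;
  }

  // The browser itself would do this on the first touch, click or key press.
  handleFirstInput(): void {
    this.muted = false;
  }

  // What the user can actually hear right now.
  audible(): string[] {
    return this.muted ? [] : this.playing.slice();
  }
}
```

The page's playback code never has to know about the restriction: sounds start when requested, and simply become audible once the user has interacted.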

Alas, it got left like that. Mobile was relatively new, and web developers expected to have to make changes for mobile, so they got away with requiring changes to code. And Apple were ignoring requests to have it changed to something that made more sense. It seems like they were focused on preventing video autoplay and the audio restrictions were an afterthought.

Chrome for Android copies iOS

Essentially, Chrome for Android simply copied iOS's approach. Maybe they thought it was important to be compatible, or maybe they thought Apple had put in a lot of thought to get it exactly right (which I doubt they did). So Chrome for Android inherited basically the same mess. Years ago I also argued to Google that the restrictions were inconsistent and pointless, and likewise they refused to change it. I think it was a misunderstanding. I was saying "the restrictions are inconsistent", but they merged it into a bug that said "audio can't autoplay on pageload" which then got closed as intentional. Alas, it got left like that for years too.

Chrome transfers restrictions to desktop

After some time, Google started to realise that autoplaying audio on desktop is annoying, too. For example you might browse a random news website, then suddenly an advert starts blasting out audio, and you realise you left your volume pretty high for listening to music earlier. Oops! So they decided to make desktop Chrome also prevent audio autoplay on pageload.

Unfortunately, they simply transferred the needlessly messy mobile approach to desktop.

There are some exceptions depending on whether you're a top site, or whether users routinely play audio on your site, but let's assume that you're not YouTube.

This time, there's a lot of existing web content that expects playback to succeed on startup. Now it suddenly doesn't work, because playback fails and the special code to enable it was never written. If Chrome instead simply allowed playback with the master output muted, and automatically unmuted it in the first input event, developers would not need to make any specific changes and all existing content on the web would keep working. Sure, you might miss a few seconds of audio at the start, but you'd get audio rather than silence. Games, music players, spoken guides and so on would all keep working. Web developers could still specially code audio playback to wait until the first input event to make sure nothing is cut off, if that's what they wanted.

The key is it's a graceful fallback. It works by default. You get audio by default and can opt in to queuing audio until it's unmuted, rather than having to opt in to get any playback at all. And in the long run, audio is easier to code on the web, since there aren't poorly thought-out restrictions that you need to code your way around to get audio playback to work the way you want.
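For comparison, this is roughly the kind of special code that content currently has to include for Web Audio. It is a sketch under assumptions: `resumeOnFirstInput`, `AudioContextLike` and `InputTargetLike` are names invented here, and the default list of event types is a guess at what counts as a user gesture, which varies between browsers. The only real API members relied on are `state` and `resume()` on an `AudioContext`, and `addEventListener` / `removeEventListener`.

```typescript
// Minimal shapes of the browser objects used below, so the sketch does not
// depend on DOM typings. A real AudioContext and document have these members.
interface AudioContextLike {
  state: string; // real values: "suspended", "running" or "closed"
  resume(): Promise<void>;
}

interface InputTargetLike {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Hypothetical helper: resumes a suspended audio context on the first input.
function resumeOnFirstInput(
  ctx: AudioContextLike,
  target: InputTargetLike,
  eventTypes: string[] = ["click", "touchend", "keydown"] // assumed gesture events
): void {
  const onInput = (): void => {
    // Only the first input matters, so detach from every event type.
    for (const type of eventTypes) {
      target.removeEventListener(type, onInput);
    }
    if (ctx.state === "suspended") {
      ctx.resume().catch(() => {
        // Still blocked; a real page might wait for a later input instead.
      });
    }
  };
  for (const type of eventTypes) {
    target.addEventListener(type, onInput);
  }
}
```

Content published before the restrictions has nothing like this, which is why it stays silent.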

Conclusion

For years audio playback on the web has been badly handled. The restrictions are inconsistent and serve only to impede legitimate uses. Malicious or annoying content will simply blare out audio at the first opportunity (in the first input event). After that first opportunity, further restrictions only make it more frustrating and difficult to get audio playback to work properly for real-world apps that the user wants to hear audio in, such as games. And the chopping and changing is a real pain - I know I've spent hours over the years catching up on whatever the latest quirks of playback restrictions are, and then all existing published content needs updating.

There is no reason for playback attempts to fail before that first input event. This serves only to break existing content and make it harder to code web apps. It seems to simply have been inherited from the historically messy and inconsistent mobile support that has its roots in saving bandwidth from unexpected video playback.

Browsers should simply allow all playback but mute the output, and automatically unmute at the first opportunity at which they would otherwise allow playback. That's all there is to it. Web pages won't be able to do anything they can't already do, they won't need special coding, and existing web content will still basically work. Instead there are nonsensical restrictions and a ton of web content is broken on desktop.

Web developers are right to be annoyed. Google are trying to emphasise that they gave us plenty of notice and that there are workarounds. That's true. However they have still needlessly broken a ton of content by imposing maddening restrictions.

For our part, Construct 2 and Construct 3 have already been updated to specially handle the new restrictions. However you'll need to re-publish old content to ensure audio playback works. Hopefully browsers like Chrome will eventually implement more sensible playback rules that work with old content, too, so archived content doesn't remain silent forever.


  • 6 Comments

    • Ashley
    • Construct Team Founder

    Just filed a Chrome bug to change this: bugs.chromium.org/p/chromium/issues/detail

  • "word-break: break-all !important;" in your CSS ma

    kes your article annoying to re

    ad because there are lin

    e breaks in th

    e middle o

    f words.

    • Seems to only do this in Firefox, in Chrome and Safari the text isn't breaking in the middle of a word. Seems a little odd that a blog post about inconsistent browser problems is, in itself, having an inconsistent browser problem.

      • The fault lies with the CSS on this page, which is confusing at best.

        Firefox obeys the second line "break-all", whereas Chrome obeys the third line "break-word" which does not break words like break-all does.

        -ms-word-break: break-all !important;

        word-break: break-all !important;

        word-break: break-word !important

        I'm using Firefox and only commenting because, well, there's no polite way to say it, but if you can't "do text" correctly in a blog post, what other face-palms await in the main product? ;-)
