Layout is the next frontier of web app performance

Official Construct Team Post
Ashley
  • 30 Jun, 2017

In a real web app like Construct 3, JavaScript performance (including DOM calls) is a solved problem. We've been developing a high-performance HTML5 game engine since 2011, so we know what it takes to write efficient JavaScript. On top of that, browser makers have been pouring huge resources into optimising their JavaScript engines ever further. DOM calls are often relatively slow, but it's straightforward to make only a minimal set of calls. Fast JavaScript means eliminating all redundant calls, whether to JavaScript functions, DOM functions or anything else. That approach gives you the minimum possible overhead, without needing a whole virtual DOM (VDOM) layer, which still burdens you with the overhead of a diffing engine.

What happens when JavaScript and DOM calls are a solved problem? You run right into the next wall: layout performance. Worse, there are no tools to help, little guidance available, and no cultural recognition that this is even a significant problem. This makes for a frustrating situation where we face real performance issues with no practical way to solve them, while everyone else is fixated on things which have minimal impact on our web app.

What is layout performance?

In case you're not familiar with how browsers work, here's a quick summary of what layout performance means. When you change the layout of a page, such as changing the width of a box, the browser has to calculate the new position of elements on the page. For example if there is text in the box, it will have to re-calculate the text wrapping. If the box is in a container, the container may change size too. And so on.

CSS provides a lot of layout features, ranging from floats and tables to flex and CSS grid. The algorithms used for calculating layout can sometimes be quite complex. Therefore when a change happens, the browser may spend some time calculating layout. This is done on the main thread, meaning the page freezes while it's being calculated. Often layout is fast enough for this not to be noticeable, but it can cause jank, and it can even take seconds in an extreme case.

The naivety of browsers

Amazingly, browsers are still incredibly naive about layout. Pretty much any change you make to a document causes layout, even when you set the same value! For example setting a box's width to its current width still causes layout, even though nothing has actually changed. This is easy enough to avoid by removing redundant calls, but other cases are harder as we'll see in a moment. Further, when the browser sees a change, it tends to do layout for absolutely everything, jumping right to the worst-case scenario where it has to do the maximum amount of work. Here are some of the awkward cases we've run into in Chrome:

So in general almost any change, however small, causes a lot of work. If you happen to have a lot of content on-screen, this can make every small change become a very slow operation, even if the vast majority of content has not actually changed.
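The easy case mentioned above, setting a value that is already set, can be guarded against with a small cache. This is only a minimal sketch: the name setStyleIfChanged and the cache structure are assumptions for illustration, not code from Construct 3, and it assumes every write to a given property goes through the helper (otherwise the cache goes stale).

```javascript
// Hypothetical helper (not from Construct 3): remembers the last value
// written per element and property, and skips the DOM write when the
// value has not changed.
const lastWritten = new WeakMap();

function setStyleIfChanged(element, property, value) {
  let cache = lastWritten.get(element);
  if (!cache) {
    cache = new Map();
    lastWritten.set(element, cache);
  }
  if (cache.get(property) === value) {
    return false; // redundant write: no DOM call is made
  }
  cache.set(property, value);
  element.style.setProperty(property, value);
  return true;
}
```

Called twice in a row with the same width, this makes a single DOM call. It does nothing for the harder cases, where a necessary change still triggers far more layout work than it should.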

How bad is it?

Too often we see a tiny amount of JS/DOM work taking maybe 1ms, followed by 10ms or more of layout — on a high-end desktop. Browsers provide comprehensive, in-depth tools to analyse JavaScript performance, and virtually none for layout. Bugs filed for JavaScript performance have engineers jumping on the report right away, whereas layout performance bug reports are ignored. Here's a case I found straight away. Just by dragging an item around in our UI and updating a marker showing where the item will be dropped, we get a performance trace like this:

Notice two things about this:

  1. The JavaScript overhead is basically negligible. No amount of optimisation or clever VDOM layers will improve the overall performance here. Also note there is a single update — there's no thrashing going on.
  2. We can go 10 layers deep into the calls that made up the tiny amount of JS work, but the "recalculate style" and "layout" parts are just a black box. Chrome's dev tools can tell us a few minor details, like how many elements were affected (apparently hundreds for some reason), and can point at the JS code that caused the work (our single and entirely necessary DOM call).

So what do we do about this?

I don't know. Nothing. Our drag-and-drop marker is kind of slow, and that's that.

It gets worse! Real-world users of our PWA soon identified a more extreme case, where merely expanding an item to display the content inside it takes seconds even on a high-end machine:

What can we possibly do about this? There's no useful information on why this is such a slow operation. You might recommend that we file a bug against Chrome so Google engineers can figure out what's going on. That brings us on to our next problem.

Layout performance isn't taken seriously

The V8 team know what's important about performance. They know that isolated benchmarks often have serious flaws. They are focused on optimising real-world websites with realistic use cases. By working with real web content, they can ensure they make a difference to the speed of actual websites, rather than synthetic benchmarks or uncommon patterns. This approach shows up in how they handle bug reports too: when a user recently filed a performance bug that came down to a bug in the V8 garbage collector, the issue was quickly identified and resolved. They didn't insist on a minimal reproduction, or assume we had done something irresponsible in our coding — they worked to make sure our real-world web app was efficient.

Contrast this to how layout is handled. When we filed a bug for the extreme layout performance issue above, it was simply closed without investigation, with a telling comment:

"It is not practical for us to investigate performance aspects of third party applications..."

So in this regard, they are not interested in the real-world performance of web apps, and the contrast with how JavaScript performance is handled is pretty stark. They also asked us to try to narrow it down, effectively making a micro-benchmark that takes the whole example out of the context of the actual app it lives in (which likely has an impact). This is an especially formidable task when we don't have any tools to look deeper into why this is happening. With JavaScript we can profile which individual functions take a long time, but in layout we're just left staring at a little purple box that says "Layout" and wondering what on earth to do next, other than a laborious process of shotgun-debugging (making random changes until you hit on something that works by chance).
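When narrowing a case down by trial and error like this, a crude timing harness can at least put a number on each attempt. The sketch below is an assumption of mine, not a tool from the browser vendors: timeLayout is a hypothetical name, and it relies on the fact that reading a layout-dependent property such as offsetHeight after a change makes the browser compute style and layout synchronously. It gives one figure for the whole recalculation, not a breakdown of where the time went.

```javascript
// Hypothetical helper: time a DOM mutation and the layout it triggers.
// `mutate` performs the change. `forceLayout` must read a layout-dependent
// property (for example element.offsetHeight) so that style and layout are
// computed synchronously inside the timed region.
// `now` is injectable so the helper can be exercised with a fake clock.
function timeLayout(mutate, forceLayout, now = () => performance.now()) {
  const start = now();
  mutate();
  const afterMutate = now();
  forceLayout();
  const end = now();
  return {
    scriptMs: afterMutate - start, // time spent in our own JS/DOM call
    layoutMs: end - afterMutate,   // time spent in forced style + layout
  };
}
```

In a browser this might be called as timeLayout(() => { box.style.width = "200px"; }, () => box.offsetHeight), where box is whatever element is under suspicion.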

If you look around the web for advice, there's very little on the specific aspect of layout performance. There's lots of advice on things like avoiding redundant calls, avoiding thrashing, batching calls and the like. We already do all of that — and we still have small, single changes that cause expensive layout times. Layout performance is a black box. I wonder if anybody really knows how it works or how to optimise it.
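For reference, the commonly recommended batching pattern looks roughly like the sketch below: queue reads and writes separately, then run all reads before all writes so that layout is never forced in the middle of a batch. The name createDomBatcher and its API are hypothetical, and this is not Construct 3's actual code. It addresses thrashing only, which is exactly the problem we don't have.

```javascript
// Hypothetical batcher: runs all queued reads before all queued writes,
// once per scheduled flush. `schedule` is a function that arranges for its
// callback to run later.
function createDomBatcher(schedule) {
  const reads = [];
  const writes = [];
  let scheduled = false;

  function flush() {
    scheduled = false;
    // Take the current queues, so tasks queued during this flush
    // run in the next batch instead.
    const readBatch = reads.splice(0);
    const writeBatch = writes.splice(0);
    for (const task of readBatch) task();
    for (const task of writeBatch) task();
  }

  function request() {
    if (!scheduled) {
      scheduled = true;
      schedule(flush);
    }
  }

  return {
    read(task) { reads.push(task); request(); },
    write(task) { writes.push(task); request(); },
    flush,
  };
}
```

In a browser the schedule argument would typically be (cb) => requestAnimationFrame(cb), giving at most one batch per frame.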

CSS containment to the rescue?

CSS containment is a new CSS feature that can do a lot to help. Using the new contain property, with a declaration like contain: strict, you indicate that the given element is isolated from the rest of the document. That means if something changes inside it, the browser can stop trying to do layout beyond that element. Similarly if something changes outside it, the browser doesn't need to go into that element and calculate layout, since the outside change won't have affected it. This does a lot to solve the problem where the browser storms ahead and does layout for the entire document, just because something small changed.

Recognising the seriousness of layout performance, we accordingly use CSS containment everywhere. It's particularly effective in an app like Construct 3 where content is separated into panes. Each pane or window uses CSS containment to isolate it from the others. This provides a useful guarantee that if something in that pane changes, only content inside that pane will have layout work done for it; it won't end up doing layout for other panes.
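As a rough sketch of the per-pane idea, containment can be applied from script once support has been checked. The helper names isolatePanes and supportsContain are hypothetical, and this is not how Construct 3 necessarily does it (a stylesheet rule on the pane class would work just as well). Note that contain: strict includes size containment, so it assumes each pane is sized from outside and not by its contents.

```javascript
// Returns true when the environment reports support for the given
// `contain` value. Outside a browser (no global CSS object) it returns false.
function supportsContain(value) {
  return typeof CSS !== "undefined" &&
    typeof CSS.supports === "function" &&
    CSS.supports("contain", value);
}

// Hypothetical helper: mark each pane element as a containment boundary.
// Returns the number of panes that were changed (0 if unsupported).
function isolatePanes(panes, value = "strict", isSupported = supportsContain) {
  if (!isSupported(value)) {
    return 0;
  }
  let count = 0;
  for (const pane of panes) {
    pane.style.contain = value;
    count++;
  }
  return count;
}
```

In a browser this might be called as isolatePanes(document.querySelectorAll(".pane")), where the .pane class name is an assumption for the example.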

There are a few problems with this though:

  • Currently, only Chrome supports CSS containment, so it doesn't help in other browsers yet.
  • Even in Chrome, there's a bug where Chrome does layout for everything anyway, despite CSS containment being in use. Oops! You have to use a hack where you wrap elements in an extra element to work around it.
  • All our above examples of long layout times are already using CSS containment everywhere we can possibly think to use it. So it doesn't magically solve everything: there are still many cases where Chrome apparently has to do a lot of layout work, and that can take a long time.

So CSS containment definitely helps, but it's limited to Chrome, is kind of hacky at the moment, and definitely does not solve everything.

However, CSS containment does help so much that it makes the difference between usable and not. Currently Construct 3 only supports Chrome. We'd love to support Firefox, and have essentially gotten almost everything working fine in Firefox, but without CSS containment the whole UI has a sluggish feel and isn't much fun to use. With some investigation, it seems that one of the problems is that updating the mouse position in the toolbar causes full document layout, which can be a lot of work if there's a lot of content on screen. It's kind of ridiculous that such a small, trivial change has such an enormous performance overhead. So CSS containment is definitely very effective at solving this particular problem, and this is why we're keeping Firefox support off pending support for CSS containment.

Conclusion

The main performance problem of our modern PWA is layout performance. Most of the time, if something is slow, it's something like 5% JavaScript and 95% layout. CSS containment helps, but is buggy, doesn't solve all cases, and currently isn't widely supported. Despite the importance of layout performance, browser makers don't appear to be willing to investigate or improve it. There seems to be a general cultural focus on JavaScript, and layout is assumed to be a non-issue. Perhaps many web apps really are limited by JavaScript, but once you solve that, the next barrier is layout.

Hopefully browser makers can take this more seriously. CSS containment should be a priority for any vendors interested in sophisticated PWAs. Browser makers should have a greater focus on optimising layout time. And the web development community should recognise that for at least some web apps, the bottleneck is layout.
