Construct's effect compositor: part 2

Official Construct Team Post
Ashley
  • 10 Aug, 2018

This is the second blog post in a series about the code in Construct that renders effects, called the effect compositor. In case you missed it, be sure to read part 1 first. Let's jump right back in.

Rendering multiple effects

Suppose you add two effects to an object:

  1. Warp (a wavy distortion effect)
  2. AdjustHSL (to modify the colors)

The following image demonstrates how the effects combine.

WebGL can only render using one shader program at a time, so you can't say "enable both shaders at once". It's possible to combine the shader programs at compile time, but this is very complicated and difficult to extend to cover all possible combinations of effects. Instead, Construct's effect compositor simply renders them one at a time. It renders with the first effect to an intermediate surface, and then renders the intermediate surface to the rendering destination with the second effect. (Note how the intermediate surface is similar to the way background blending effects are rendered, except that the final render to the background also applies a second effect.) The visual result is that both effects are applied.
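To make the two-pass idea concrete, here is a minimal sketch in TypeScript. It is not Construct's actual code: a "surface" is modelled as a plain row of grayscale pixel values, and `shiftRight` and `brighten` are hypothetical stand-ins for a distortion effect like Warp and a color effect like AdjustHSL.

```typescript
// Assumption: a surface is just a row of grayscale pixel values (0-255).
type Surface = number[];

// An effect stands in for one shader program: it reads a source surface
// and produces a new surface. Only one can run per pass.
type Effect = (src: Surface) => Surface;

// Hypothetical distortion: each output pixel samples its left neighbour
// (pixels with no left neighbour become 0).
const shiftRight: Effect = (src) => src.map((_, i) => src[i - 1] ?? 0);

// Hypothetical color adjustment: brighten every pixel, clamped to 255.
const brighten: Effect = (src) => src.map((v) => Math.min(255, v + 10));

// Render two effects one at a time via an intermediate surface.
function renderTwoEffects(object: Surface, first: Effect, second: Effect): Surface {
  const intermediate = first(object); // pass 1: object -> intermediate surface
  return second(intermediate);        // pass 2: intermediate -> destination
}

const result = renderTwoEffects([100, 150, 200], shiftRight, brighten);
console.log(result.join(", ")); // 10, 110, 160
```

The second pass never sees the original object, only the output of the first pass, which is why the final image shows both effects.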

Performance overhead

Using an intermediate surface like this has similar performance implications to background blending effects: it cannot be batched. Using a single effect sometimes involves an intermediate surface, but using multiple effects always does, so the performance overhead is unavoidable. This means you should avoid using multiple effects on large numbers of instances - it's difficult for the engine to render them efficiently. As with background blending effects, it may be more efficient to place all the instances on a layer, and add the effect chain to the whole layer.

Rendering even more effects

Suppose you add a chain of 5 effects to an object. Does Construct have to create lots of intermediate surfaces? Actually, no - the maximum it ever needs is 2. If we call the two intermediate surfaces "A" and "B", then the effect chain can be rendered by "bouncing" the image between the two, along the lines of:

  1. Render with effect 1 to surface A
  2. Render with effect 2 to surface B
  3. Render with effect 3 to surface A
  4. Render with effect 4 to surface B
  5. Render with effect 5 to background

Only needing two intermediate surfaces helps limit the memory needed to render long chains of effects.
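The bouncing pattern above can be sketched as a small planning function. This is only an illustration under the assumption that the last effect can render directly to the background; `planTargets` and the `Target` type are hypothetical names, not part of Construct.

```typescript
type Target = "A" | "B" | "background";

// Returns the render target for each effect in a chain of the given length,
// alternating between the two intermediate surfaces and sending the last
// effect straight to the background.
function planTargets(effectCount: number): Target[] {
  const targets: Target[] = [];
  for (let i = 0; i < effectCount; i++) {
    const isLast = i === effectCount - 1;
    if (isLast) {
      targets.push("background");
    } else {
      targets.push(i % 2 === 0 ? "A" : "B"); // bounce A, B, A, B, ...
    }
  }
  return targets;
}

const targets = planTargets(5);
console.log(targets.join(" -> ")); // A -> B -> A -> B -> background
```

However long the chain gets, the plan only ever names surfaces A and B, since each pass needs just one surface to read from and a different one to write to.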

The last effect

Notice the last effect can generally render directly to the background, so the intermediate surfaces are only used 4 times instead of 5. However - and you may be starting to realise how many caveats are involved in this now - if the last effect is background blending, it still can't render directly to the background, due to the limitation on shaders reading and writing to the same surface. In that case the last step would actually be "Render with effect 5 to surface A", and the final copy to the background becomes a sixth step.

Background-distorting effects such as the Glass effect also need this extra final copy. In general the effect compositor calls this the post-draw step. If it can, it'll render the last effect directly to the background; if it can't, it will use the extra intermediate surface. This also covers the case of using a single background-blending shader: the extra copy is the post-draw.
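As a rough sketch of that decision, the following TypeScript counts the render steps for a chain. The `EffectInfo` shape and its `readsBackground` flag are assumptions made for illustration (covering both background-blending and background-distorting effects); they are not Construct's real data structures.

```typescript
// Hypothetical description of an effect in a chain.
interface EffectInfo {
  name: string;
  readsBackground: boolean; // background-blending or background-distorting
}

// A post-draw is needed when the last effect reads the background, since a
// shader cannot read from and write to the same surface.
function needsPostDraw(chain: EffectInfo[]): boolean {
  const last = chain[chain.length - 1];
  return last !== undefined && last.readsBackground;
}

// One step per effect, plus one extra copy to the background if post-drawing.
function countRenderSteps(chain: EffectInfo[]): number {
  return chain.length + (needsPostDraw(chain) ? 1 : 0);
}

const plain: EffectInfo = { name: "AdjustHSL", readsBackground: false };
const glass: EffectInfo = { name: "Glass", readsBackground: true };

console.log(countRenderSteps([plain, plain, plain, plain, plain])); // 5
console.log(countRenderSteps([plain, plain, plain, plain, glass])); // 6
console.log(countRenderSteps([glass]));                             // 2
```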

Pre-drawing

In addition to a possible post-draw step, there is also a possible pre-draw step. This is where the object is first rendered to an intermediate surface with no effects applied before continuing with the effects chain. It's possible a single object could need both pre- and post-draw steps. These are the reasons a pre-draw step might be required:

  • If a background blending effect is applied to a rotated object, then the object must be first drawn to an intermediate surface with its rotation applied to ensure the foreground and background match up exactly. This basically removes the rotation from the effect processing, since the rest of the chain works with unrotated surfaces.
  • Some effects can extend past the edge of the object. For example the Warp effect can distort the object's image past its bounding box. The intermediate surface normally only uses the size of the object, so to prevent these parts being cut off it adds some extra space around the object for these effects. Then the rest of the effect chain can work with a copy of the image with extra space available past the edges.
  • Opacity, and in the C3 runtime the built-in color tint, are handled by the default shader. Other effects don't take them into account. So if an object has a non-default opacity/color, it is pre-drawn with these applied, essentially making the default shader the first effect in the chain.
  • Many objects - essentially everything other than Sprite - require pre-drawing since they don't draw a single texture with the default shader. Pre-drawing allows such an object to be treated by the effect compositor as a single surface of pixels even if it's not actually drawn like that.

Basically, if there is anything awkward about the thing being rendered, it's pre-drawn so it turns into a single unrotated texture, which is easy for the effect chain to process.
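The four reasons listed above can be summarised as a simple check. This is a simplified sketch: the `ObjectInfo` and `ChainInfo` shapes and their field names are hypothetical, and the real compositor has more special cases than this.

```typescript
// Hypothetical description of the object being rendered.
interface ObjectInfo {
  isSprite: boolean;     // draws a single texture with the default shader
  angle: number;         // rotation in radians; 0 means unrotated
  opacity: number;       // 1 is the default
  hasColorTint: boolean; // built-in color tint (C3 runtime)
}

// Hypothetical summary of the effect chain.
interface ChainInfo {
  usesBackgroundBlending: boolean;
  extendsPastEdges: boolean; // e.g. a distortion such as Warp
}

// Returns every reason a pre-draw is required; an empty list means skip it.
function preDrawReasons(obj: ObjectInfo, chain: ChainInfo): string[] {
  const reasons: string[] = [];
  if (chain.usesBackgroundBlending && obj.angle !== 0) {
    reasons.push("rotated object with background blending");
  }
  if (chain.extendsPastEdges) {
    reasons.push("effect extends past the object's edges");
  }
  if (obj.opacity !== 1 || obj.hasColorTint) {
    reasons.push("non-default opacity or color");
  }
  if (!obj.isSprite) {
    reasons.push("not a single texture drawn with the default shader");
  }
  return reasons;
}

const plainSprite: ObjectInfo = { isSprite: true, angle: 0, opacity: 1, hasColorTint: false };
const rotatedSprite: ObjectInfo = { ...plainSprite, angle: Math.PI / 4 };
const colorOnly: ChainInfo = { usesBackgroundBlending: false, extendsPastEdges: false };
const blending: ChainInfo = { usesBackgroundBlending: true, extendsPastEdges: false };

console.log(preDrawReasons(plainSprite, colorOnly).length);  // 0
console.log(preDrawReasons(rotatedSprite, blending).length); // 1
```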

An example of an object both pre- and post-drawing is a rotated object with a background-blending shader. It will render with steps like this:

  1. Pre-draw the rotated object to surface A. This allows the rotation to be handled correctly.
  2. Draw the object with the background-blending effect to surface B, since it can't be rendered directly to the background.
  3. Copy the fully rendered object from surface B to the background.

This involves a lot of rendering commands. Also note that the effect compositor had to use two intermediate surfaces to render the effect correctly. Blend modes really are a lot simpler! It's a shame there's not a broader selection available with them.

The overall strategy

By now we've covered enough to identify the principle of rendering an effect chain:

  1. If necessary, pre-draw the object.
  2. Bounce the image between intermediate surfaces rendering each effect one at a time.
  3. If necessary, post-draw the object and copy the result to the background. If not necessary, the last effect can render directly to the background.

Since every step has a performance overhead, the effect compositor removes steps anywhere it can. For example, as we've seen:

  • An unrotated background blending object will skip pre-draw.
  • A chain of effects with no background blending effect will skip post-draw.
  • A single 'AdjustHSL' shader can simply render directly to the background, skipping both pre- and post-draw. This even potentially allows it to batch efficiently.
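Putting the three stages together, the overall strategy might be sketched like this. It is an illustrative model rather than the real implementation: `Fx`, `planEffectChain` and the `preDraw` flag are hypothetical, and whether a pre-draw is needed is simply passed in by the caller.

```typescript
// Hypothetical description of an effect in a chain.
interface Fx {
  name: string;
  readsBackground: boolean; // background-blending or background-distorting
}

// Lists the render steps for a chain: optional pre-draw, bounce between
// surfaces A and B, and a post-draw copy only if the last effect needs it.
function planEffectChain(chain: Fx[], preDraw: boolean): string[] {
  const steps: string[] = [];
  let surface: "A" | "B" = "A";
  // Hands out the current intermediate surface, then flips to the other one.
  const useSurface = (): string => {
    const current = surface;
    surface = surface === "A" ? "B" : "A";
    return current;
  };

  if (preDraw) {
    steps.push(`pre-draw object to surface ${useSurface()}`);
  }
  chain.forEach((fx, i) => {
    const isLast = i === chain.length - 1;
    if (isLast && !fx.readsBackground) {
      steps.push(`render ${fx.name} to background`);
      return;
    }
    const target = useSurface();
    steps.push(`render ${fx.name} to surface ${target}`);
    if (isLast) {
      steps.push(`post-draw: copy surface ${target} to background`);
    }
  });
  return steps;
}

const multiply: Fx = { name: "Multiply", readsBackground: true };
const adjustHSL: Fx = { name: "AdjustHSL", readsBackground: false };

console.log(planEffectChain([adjustHSL], false).length); // 1: straight to the background
console.log(planEffectChain([multiply], false).length);  // 2: skips pre-draw
console.log(planEffectChain([multiply], true).length);   // 3: pre-draw, effect, post-draw
```

The last call reproduces the rotated background-blending example from earlier: pre-draw to surface A, render the effect to surface B, then copy surface B to the background.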

The rules are pretty complicated and not particularly easy to remember. So I wouldn't worry about them too much; these are details to come back to if you've measured a performance problem, and nothing else you've tried has been enough to improve it yet.

More complications

There are even more facets to the effect compositor, as it applies various optimisations and has extra paths to handle some pretty specific edge cases. However, this really is getting into the minutiae. It's unlikely to be important, and anyway many of the finer details change over time as we tweak optimisations, fix bugs, and add new features.

As I mentioned briefly, this all adds up to one of the single most complex parts of Construct's runtime. It's a complicated problem to begin with, but another big complicating factor is applying optimisations. It would be easy to simply pre- and post-draw absolutely all effects, ensuring everything always looked right and keeping the code simple. However, this would mean even simple effects have a high performance overhead. There are many carefully handled points throughout the effect compositor where it tries to skip steps it doesn't have to do, often in highly specific circumstances. This is tough to get right for all possible effect chains, and can result in some bugs that are very difficult to deal with. I don't look forward to having to deal with effect rendering bugs!

Conclusion

Hopefully knowing how the effect compositor in Construct works is interesting by itself, but these blogs can probably also be mined by advanced users hunting down absolutely optimal performance results. I'd discourage casual users from bothering with that though, particularly since I don't feel like any of this boils down to simple "do this" or "don't do this" advice. I think the best guidance is just to try to use as few effects as possible, and to keep things the same (i.e. using the same effects with the same parameters). It may also help to know that effects which only change colors can be cheap on lots of instances, but other than that, avoid using effects on large numbers of instances - try using a layer effect instead. And remember that, as ever, if your game is hitting 60 FPS on your target devices, it's doing just fine and probably doesn't need optimising yet.

If you know a bit of GLSL - the language used for shader programs - you can write your own effects with the Addon SDK. An interesting effect can often be achieved with just a few lines of shader code, so it's relatively easy to dabble in. You could also use this to manually combine multiple shaders into one, removing steps in the effect chain.

For all its complexity, Construct's effects are an awesome feature. It's a little bit of Construct magic: you just add the effects you want, and the effect compositor does a lot of hard work to render that as efficiently as possible for you. It's a pretty unique feature that's difficult to match, especially with the same cross-platform reach. So keep playing with effects in your games, and now you're a little wiser as to how it all works!

  • 4 Comments

  • Interesting read. I have a question though regarding performance.

    If I have a static sprite the size of the entire layout (and considering that the layout is bigger than the viewport) and I apply Multiply on the layer, can this pose a problem? What if the layout is considerably huge? And I assume I should also use render cells in that case?

    My PC seems to handle it just fine. My laptop seems to struggle a bit, but it runs at 4k so that might also be the reason. Just want to make sure I didn't do something absurd here.

    Cheers

      • Ashley
      • Construct Team Founder

      The GPU only processes pixels on-screen. So anything not visible on the screen won't affect performance.

  • "It's possible to combine the shader programs at compile time, but this is very complicated and is difficult to extend to cover all possible combinations of effects."

    "You could also use this to manually combine multiple shaders in to one, removing steps in the effect chain"

    Could you consider making a part 3, where you combine some of those shaders, explain the ins and outs, and run some tests showing how much difference/impact it has?

    It could be helpful: trying to implement it in the runtime and cover all the shader combinations would be difficult, but users could use your blog post and the SDK to make various combinations of shaders and upload them to the addon list, which could then ease the performance impact of using shaders.

    About the SDK: could you also consider making a blog post just about the SDK, building some simple addons along the way? It could give users a jumping-off point into the SDK. You have covered various parts of the game engine's complexity, but not SDK use.

  • Instead of Twitter posts, it would be nice to read new articles from you here again, and just link them on Twitter!