The latest beta update r155 introduces a major new performance optimisation: collision cells. These can provide a huge performance boost to games using large layouts with lots of collision detection. So what are collision cells and how do they work?
The old way: brute force
Before r155, Construct 2 checked for collisions using a "brute force" approach. This meant that if you had an object and wanted to test whether it collided with anything solid, and there were 100 solid objects, it made 100 collision checks. In other words, it tried every combination. If objects are far apart, Construct 2 can very quickly reject a collision using a bounding-box check. For many games this is extremely fast and collision performance is not a big problem. However, as games scale up and start using larger layouts, the engine can easily end up checking thousands of objects per frame.
A particularly difficult case is checking "A overlaps B", when there are hundreds (or even thousands) of instances of both A and B. Suppose there are 1000 instances of both: it has to do 1000 x 1000 = one million collision checks! Often, this is also done every tick, which results in a huge amount of CPU work and low framerates, even though each individual check is very fast.
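To make that concrete, here is a minimal TypeScript sketch of a brute-force "A overlaps B" check of the kind described above. The types and names (Box, Instance, boxesOverlap, polygonOverlaps) are purely illustrative assumptions, not the actual Construct 2 runtime code.

```typescript
// Illustrative only - not the actual Construct 2 runtime.
interface Box { left: number; top: number; right: number; bottom: number; }

interface Instance {
    bbox: Box;
    // Detailed collision polygon test, only worth running when bounding boxes overlap.
    polygonOverlaps(other: Instance): boolean;
}

// Cheap bounding-box test used to quickly reject far-apart objects.
function boxesOverlap(a: Box, b: Box): boolean {
    return a.left < b.right && b.left < a.right &&
           a.top < b.bottom && b.top < a.bottom;
}

// Brute force "A overlaps B": every instance of A is tested against every
// instance of B, so 1000 instances of each means 1,000,000 checks per tick.
function bruteForceOverlaps(as: Instance[], bs: Instance[]): [Instance, Instance][] {
    const pairs: [Instance, Instance][] = [];
    for (const a of as) {
        for (const b of bs) {
            if (!boxesOverlap(a.bbox, b.bbox))
                continue;                     // the common, very fast rejection
            if (a.polygonOverlaps(b))         // the rarer, more expensive "poly check"
                pairs.push([a, b]);
        }
    }
    return pairs;
}
```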
The Collision checks System property in the debugger tells you how many checks the engine is making every tick. The Poly checks value indicates how many objects it actually has to run the polygon check for (i.e. their bounding boxes are overlapping, so it has to check further). However, there are often not many poly checks; the performance degradation comes from the sheer number of collision checks being made, even though most of them can be rejected very quickly with the bounding-box check. So how do we reduce the number of collision checks?
The new way: collision cells
To solve this there needs to be a way to not even check objects that are far away. How do you know objects are far away without even checking them?
The solution is to do a little extra work whenever objects move. The layout is effectively split into cells the size of the window, and every object on the layout is added to its current cell. Objects can overlap cell borders, and very large objects can span a range of cells, in which case the object is added to all the cells it overlaps. As objects move their cells are updated: they are removed from the cells they no longer overlap and added into any new cells they now overlap. The result is an up-to-date grid of lists of which objects are in each cell.
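As a rough illustration of how such a grid could be maintained, here is a sketch reusing the Box and Instance types from above. The cell size, keys and structure are assumptions for illustration, not the real engine's data structures.

```typescript
// Illustrative collision cell grid (assumptions only, not the real engine).
class CollisionCells {
    private cells = new Map<string, Set<Instance>>();

    constructor(private cellWidth: number, private cellHeight: number) {}

    private key(cx: number, cy: number): string {
        return cx + "," + cy;
    }

    // The range of cells a bounding box overlaps; large objects span several cells.
    private cellRange(box: Box) {
        return {
            x1: Math.floor(box.left / this.cellWidth),
            y1: Math.floor(box.top / this.cellHeight),
            x2: Math.floor(box.right / this.cellWidth),
            y2: Math.floor(box.bottom / this.cellHeight)
        };
    }

    // Register an object in every cell its bounding box overlaps.
    add(inst: Instance, box: Box): void {
        const r = this.cellRange(box);
        for (let cx = r.x1; cx <= r.x2; cx++) {
            for (let cy = r.y1; cy <= r.y2; cy++) {
                const k = this.key(cx, cy);
                let cell = this.cells.get(k);
                if (!cell) {
                    cell = new Set<Instance>();   // cells are created on demand
                    this.cells.set(k, cell);
                }
                cell.add(inst);
            }
        }
    }

    // Remove an object from every cell its old bounding box overlapped.
    remove(inst: Instance, box: Box): void {
        const r = this.cellRange(box);
        for (let cx = r.x1; cx <= r.x2; cx++)
            for (let cy = r.y1; cy <= r.y2; cy++)
                this.cells.get(this.key(cx, cy))?.delete(inst);
    }

    // When an object moves, update which cells it belongs to.
    update(inst: Instance, oldBox: Box, newBox: Box): void {
        this.remove(inst, oldBox);
        this.add(inst, newBox);
    }

    // All instances registered in the cells a bounding box touches. A Set is
    // used so objects spanning several cells are only returned once.
    query(box: Box): Set<Instance> {
        const result = new Set<Instance>();
        const r = this.cellRange(box);
        for (let cx = r.x1; cx <= r.x2; cx++)
            for (let cy = r.y1; cy <= r.y2; cy++)
                this.cells.get(this.key(cx, cy))?.forEach(i => result.add(i));
        return result;
    }
}
```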
Now when checking for collisions with an object, it only needs to check the other objects in its cell (or range of cells). In a large layout, this can eliminate the vast majority of collision checks - far-away objects are not even considered. This extends to all collision checks done in the engine: Sprite's On collision and Is overlapping conditions, all collision checks made by behaviors such as the Platform behavior (which benefits particularly, since it typically makes a lot of collision checks), and even line-of-sight detection and Pathfinding grid generation. For some types of games, the performance boost is so significant as to make the game playable where it was previously unplayable!
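Using the grid sketched above, the earlier brute-force "A overlaps B" loop might instead look something like this. Again this is purely illustrative, and it assumes bGrid holds the B instances.

```typescript
// "A overlaps B" using collision cells: for each A, only the B instances
// registered in the cells its bounding box touches are considered, so
// far-away instances are never even looked at.
function cellOverlaps(as: Instance[], bGrid: CollisionCells): [Instance, Instance][] {
    const pairs: [Instance, Instance][] = [];
    for (const a of as) {
        for (const b of bGrid.query(a.bbox)) {          // nearby candidates only
            if (boxesOverlap(a.bbox, b.bbox) && a.polygonOverlaps(b))
                pairs.push([a, b]);
        }
    }
    return pairs;
}
```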
An example: 1000 objects
Let's see a real-world example of the improvement. Here's a large layout zoomed out a long way so you can see the whole thing. The grid shows window-sized areas. There are 1000 variously sized blue and red sprites moving slowly around. They are all checking if they overlap any other objects, and if they do, the opacity is reduced. The screenshot below was taken using the old brute-force engine.
With 1000 objects all checking for overlap with another 1000 objects, the debugger indicates around one million collision checks per tick. Even on a powerful desktop machine, the framerate drops to just 10 FPS, and the CPU measurement is very high. (Note the CPU measurement is based on timers and can be pretty inaccurate - it's just a ballpark figure.)
Let's switch to the new collision cells engine and see the difference:
Importantly, this version works identically to the old version: there are no functional differences. The debugger now indicates around 40,000 collision checks per tick - a reduction of about 96%. The framerate is now a smooth 60 FPS, and the CPU measurement is about half what it was before!
This is at least 6 times faster, and our tests show similar improvements on mobile devices as well. So this is a fantastic engine-wide optimisation that benefits all platforms!
Moving cells
Sharp readers may be concerned about the extra work of moving objects between cells. Is that a lot of work? Most of the time objects stay within the same cell area, so there's no work needed to move them. When they cross a cell border they need to be moved to the next cell, but this is basically a one-off operation, and the object is then likely to remain in that cell for some time. This is the case even for fast-moving bullets, which only need to transition once or twice per second at most. However we've still worked to make sure this transition is very fast. We use special data structures so that objects can be removed from and added to cells in O(log n) time, so it scales well even when hundreds of objects are in the same cell.
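For illustration, the update() method in the earlier sketch could be refined so that no work at all is done while an object stays within the same cell range - something like the following (the O(log n) structures mentioned above are not shown; a plain Set is used instead).

```typescript
// Replacement for update() in the earlier sketch: if the object hasn't crossed
// a cell border, its cell range is unchanged and there is nothing to do.
update(inst: Instance, oldBox: Box, newBox: Box): void {
    const o = this.cellRange(oldBox);
    const n = this.cellRange(newBox);
    if (o.x1 === n.x1 && o.y1 === n.y1 && o.x2 === n.x2 && o.y2 === n.y2)
        return;                        // the common case: same cell(s), no work at all
    this.remove(inst, oldBox);         // only on the comparatively rare border crossing
    this.add(inst, newBox);
}
```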
Also it works perfectly for static scenery: if it never moves, it never does any extra work to move its cell... and yet it's still a lot faster to check for collisions!
In other words, moving cells is a very cheap operation, and is done relatively rarely. So we feel it is very unlikely this will ever cause more overhead than the massive performance gains you get from the faster collision testing. Virtually every game should be faster.
To give some measurement of what's happening, the debugger has two new performance measurements in the System object: Moved cell/sec and Cell count. Moved cell measures how many objects are transitioning to different cells. The Cell count measurement indicates how many cells are in use, which is not really that important, but could help you identify if extremely large layouts are using lots of memory due to these cells (but they are only created on-demand, so this is unlikely).
With the previous example, if we crank up all the objects' speeds to 100 pixels per second, then with 1000 objects they are covering a total distance of 100,000 pixels per second between them. That's the width of over 50 HD displays. Despite this, the debugger indicates just 18-20 moved cells per tick. This is a very small amount of work compared to the 960,000 extra collision checks that were being made before!
Note that if you disable collisions on an object, it also stops doing the work of keeping it in the correct cell. So in the unlikely event you think this is causing a performance problem, disabling collisions could be a way to save that work, but it is likely to be a micro-optimisation that is a waste of time. And remember, objects which don't move never do the work anyway!
We did also look at some other data structures, such as quadtrees. However these are more complicated and can be more expensive to update as objects move around. The collision cells method is very simple, cheap to maintain, and still works excellently for vastly cutting down the number of collision checks.
Caveat 1: parallax layers
There are two minor caveats to this new system. The first is when using parallax layers. Construct 2 can check for collisions between objects on different layers with different parallax rates, taking into account where they appear on screen rather than where they are in layout co-ordinates.
Unfortunately in this case, using parallax means the objects no longer line up with the same collision cells. This means the collision cells can't be used to reduce the collision checks and it still has to brute-force it.
It's possible that objects on layers with the same parallax rate could still use collision cells when checked against each other. However, this becomes a pretty complicated problem: layer parallax rates can be changed with the Set parallax action, and the engine would need to track which instances of which object types are on which layers with which parallax rates. That could add more performance overhead for moving objects, and we think it's a rare case anyway. So this is simply not supported: the engine only tracks whether any instance of an object type is on a parallaxed layer, and if so, reverts to brute-forcing collision checks for that object type.
This may require some care: if you have 1000 objects, and a single instance is accidentally moved to a parallaxed layer (e.g. the UI layer, which typically has a parallax of 0,0), then the engine reverts to brute-forcing collisions for all 1000 objects. However, other object types are not affected.
In other words, to get the benefits of the collision cell system, all instances of the object type you are testing must be on layers with no parallax (i.e. the default 100,100). Hopefully this is common anyway, since parallax is generally only used for decorative purposes with scenery objects that aren't tested for collisions. As long as all the collision testing happens on non-parallaxed layers, you're good. But watch out for objects accidentally placed on the wrong layers, which could de-optimise collision checking.
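A purely illustrative sketch of that rule follows - the Layer type and property names here are made up, not the Construct 2 API.

```typescript
// Illustrative: collision cells are only usable for an object type if every
// one of its instances sits on a layer with the default 100,100 parallax.
interface Layer { parallaxX: number; parallaxY: number; }

function typeCanUseCollisionCells(instances: { layer: Layer }[]): boolean {
    return instances.every(i => i.layer.parallaxX === 100 && i.layer.parallaxY === 100);
}
```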
Caveat 2: picking in events
It's common to combine Is overlapping with other conditions or sub-events. Previously, our advice has been to put other conditions above it, so as to filter down the instances and reduce the number of collision checks that need to be made. That advice is now changing.
If Is overlapping (or On collision) comes after another condition which has already picked certain instances, it can no longer use collision cells: looking in the cells returns all nearby instances, not just those that met the prior conditions. So if a prior condition has picked a large number of instances, the engine must brute-force the collision checks again.
To avoid this, our new advice is: make sure collision conditions are the first condition in their top-level event. When no instances have yet been picked, the collision conditions can make efficient use of collision cells to reduce the number of checks made, ensuring performance is good in all circumstances. Otherwise you risk forcing long brute-force checks over the instances picked so far.
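To illustrate the difference, the choice between the two code paths might resemble the sketch below. This is an assumption about how such a dispatch could look, reusing the earlier illustrative functions - it is not the actual event engine.

```typescript
// Illustrative only: if a previous condition has already picked instances, those
// picked lists must be respected, and the cells (which know nothing about picking)
// can't be used - so the check falls back to brute force over the picked instances.
function runOverlapCondition(
    aPicked: Instance[] | null,    // instances picked by earlier conditions, or null if none
    bPicked: Instance[] | null,
    allA: Instance[], allB: Instance[],
    bGrid: CollisionCells
): [Instance, Instance][] {
    if (aPicked === null && bPicked === null)
        return cellOverlaps(allA, bGrid);      // collision condition is first: cells can be used
    return bruteForceOverlaps(aPicked ?? allA, bPicked ?? allB);
}
```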
Conclusion
This new optimisation should enable very efficient collision testing for all games. It's especially beneficial for games with lots of static scenery. It could even make a whole new class of ambitious games possible, using sprawling layouts with hundreds of objects spread out across them. The performance boost applies to all platforms, and is especially nice to have on mobile. Just be careful about which layers you put objects on, and think again about how you order collision-testing conditions in your events. But hopefully most of you will see your framerates improve in r155 - without having to do anything at all!