Order of conditions and foreach performance.

  • Hey all,

    In a project where I spawn 40k sprites, each with a single instance variable, I compared two versions of a single event:

    (Foreach Sprite, Sprite.value < 0 ) do nothing

    vs

    (sprite.value < 0) do nothing

    Why is it that the first condition will drop the fps to 30 with 100% cpu usage, while the second will still cruise at 60fps and only 40% cpu? Even if I eliminate the value check and just have a foreach Sprite, that will still be 80% cpu at 60fps.

    I know foreach events are vastly less performant than some condition checks, but why? Isn't the engine internally doing the same thing (getting a list of objects, iterating over it to check some condition, and then returning the final list)?

    In some cases, it's useful to maintain a custom SOL and use a foreach to iterate over those objects and do something, check something, etc. Usually that is more efficient when the alternative condition is an overlap check, for example. But in any case, is foreach simply duplicating the overhead of a single event, in this case 40k times, while the former is only the overhead of 1 event and the resulting cpu usage is the internal time to generate the sol based on the condition?

  • I remember something similar tripping me up some time ago. In this case my guess is that "For each Sprite" is a loop and actually does iterate over all objects and also applies the actions in the event individually. The second one is a filter condition, which is faster and the action is applied to all sprites that end up being picked simultaneously and only once.

    Or in JS:

    	// Loop: the body runs once per object
    	objects.forEach(e => { /* thing */ });
    
    	VS
    
    	// Filter: one pass that returns the picked objects
    	objects.filter(e => e.variable < 0)
    

    People commonly misunderstand this. For example, if your event would be "add 10 to score" both examples would seemingly work at first glance. However, the second one will not work if multiple sprites end up with variable < 0 at the same tick. It will only add 10 score even if 1000 sprites have variable < 0, whereas foreach will add 10 score for each of the 1000 sprites individually. I think the "most correct" solution is actually the filter condition & adding 10*object.pickedcount to the score.
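
The three variants described above can be sketched in plain JavaScript. This is only an illustration of the semantics, not Construct's runtime: the `sprites` array and the three function names are made up for the example, and the objects are plain stand-ins for picked instances.

```javascript
// Hypothetical stand-ins for sprite instances; not Construct's runtime API.
const sprites = [{ value: -1 }, { value: 5 }, { value: -3 }, { value: -7 }];

// Filter condition alone: the action runs once, however many were picked.
function scoreWithFilterOnly(instances) {
	let score = 0;
	const picked = instances.filter(s => s.value < 0);
	if (picked.length > 0) score += 10;
	return score;
}

// 'For each' plus condition: the action runs once per matching instance.
function scoreWithForEach(instances) {
	let score = 0;
	for (const s of instances) {
		if (s.value < 0) score += 10;
	}
	return score;
}

// Filter plus picked count: one action, same total as the loop.
function scoreWithPickedCount(instances) {
	const picked = instances.filter(s => s.value < 0);
	return 10 * picked.length;
}

console.log(
	scoreWithFilterOnly(sprites),
	scoreWithForEach(sprites),
	scoreWithPickedCount(sprites)
);
```

With three of the four sample sprites below zero, the filter-only version adds 10 once, while the other two both reach 30.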

  • is foreach simply duplicating the overhead of a single event, in this case 40k times, while the former is only the overhead of 1 event and the resulting cpu usage is the internal time to generate the sol based on the condition?

    Yep.

    The event engine has some overhead (surprisingly small - enough to perform close to GML compiled to C++). 'For each' requires the event engine to repeat the engine overhead for each instance - it will re-run all following conditions, actions and expressions once per instance. However normal events run all conditions, actions and expressions a single time, but with a single long list of all the picked instances.

    It's analogous to this in code:

    const instancesArray = [ /* 40,000 entries */ ];
    
    // Normal action
    runAction(instancesArray);
    
    // With 'For each'
    for (const instance of instancesArray)
    {
    	const singleInstanceArray = [instance];
    	runAction(singleInstanceArray);
    }
    

    The first approach has the overhead of runAction once, passing 40k items, and the second approach has the overhead of runAction 40k times, passing one item each time.
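
To make the call counts concrete, here is a runnable variant of the snippet above. It is a sketch under assumptions: `runAction` is a made-up placeholder that just bumps an `x` property, and `callCount` only exists to show how often the per-call overhead is paid. It says nothing about real timings inside the engine.

```javascript
// Hypothetical action; the counter makes the per-call overhead visible.
let callCount = 0;
function runAction(instances) {
	callCount++;
	for (const inst of instances) inst.x += 1;
}

const instancesArray = Array.from({ length: 40000 }, () => ({ x: 0 }));

// Normal action: one call that receives the whole list.
callCount = 0;
runAction(instancesArray);
const normalCalls = callCount;

// With 'For each': one call (and one single-item array) per instance.
callCount = 0;
for (const instance of instancesArray) {
	runAction([instance]);
}
const forEachCalls = callCount;

console.log(normalCalls, forEachCalls);
```

Both approaches do the same work on every instance; only the number of times the surrounding machinery runs differs.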

    So yeah, don't use 'For each' unnecessarily.

  • Ah, that code snippet makes total sense. Thanks, Ashley.

  • So yeah, don't use 'For each' unnecessarily.

    In the event where you are using foreach to pick associated objects by uid, do you know of more elegant solutions than the typical:

    Foreach Character

    Pick dictionary by UID (character.myDictionaryUID)

    Pick collider by UID (character.myColliderUID)

    etc...

    Foreach element in myArray

    pick Effect by uid (myArray.currentvalue)

    For each effect, pick referenced objects by uid

    In the past, I never had a need to think about the efficiency of all this foreach and picking stuff because it was only for dynamic characters (like NPCs and the main player, for example). Maybe you have 30 characters active in a scene. But now I'm trying to do a bullet hell, and every bullet carries a dynamic list of effects (easily dozens), and those effects may further reference other objects, custom actions, etc. So I find myself needing to do a foreach over 1000 bullets (to pick associated objects) and then a foreach over an array of elements containing a reference to the list of effects that bullet is affected by. If the effect is a general effect, I can just do that without picking, but sometimes the effect also needs to pick other objects as well, such as the object that fired the bullet.

    All this picking seems to be required in order to maintain a system where projectiles and characters can be decorated by numerous effects and have many links to one another or characters in the scene.

    Many times the effect list can be empty, so I can filter those out and not bother iterating and picking them. But other times, every single bullet can have a dozen effects that all need updating.

    For every object that needs to pick another object by uid, I have to run a foreach when using this architecture.

    This seemed to be the standard approach, but is there an updated way these days?

  • It's hard to comment because it depends on the details of what you're doing. If there's a 1:1 relationship between objects, you can use containers and everything should "just work" without needing extra picking. Even if not, if you can rearrange it to use such a system it may be more efficient. Nested loops tend to be pretty inefficient (as the algorithmic efficiency can be poor - a fundamental mathematical limitation that tends to make it slow in any tool). So if you can do anything that avoids nested loops it should help a lot. For example if instead of "for each bullet - for each effect in list", you can instead precompute the state for the bullet only when it changes and store it on an instance variable, then you don't need to do all that intensive CPU work every tick.
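
The "precompute only when it changes" idea can be sketched in plain JavaScript with a dirty flag. Everything here is hypothetical and only for illustration: the bullet shape, the `damageBonus` field, and the `addEffect`, `recompute` and `tick` helpers are assumptions, not anything Construct provides.

```javascript
// Hypothetical bullet with a list of effects; names are illustrative only.
function createBullet() {
	return { effects: [], damageBonus: 0, dirty: false };
}

// Changing the effect list marks the cached value as stale.
function addEffect(bullet, effect) {
	bullet.effects.push(effect);
	bullet.dirty = true;
}

let recomputeCount = 0;
function recompute(bullet) {
	recomputeCount++;
	bullet.damageBonus = bullet.effects.reduce((sum, e) => sum + e.damage, 0);
	bullet.dirty = false;
}

// Per tick: the inner loop over effects only runs for bullets that changed.
function tick(bullets) {
	for (const b of bullets) {
		if (b.dirty) recompute(b);
		// ...use b.damageBonus here instead of walking b.effects...
	}
}

const bullets = [createBullet(), createBullet()];
addEffect(bullets[0], { damage: 2 });
addEffect(bullets[0], { damage: 3 });
for (let i = 0; i < 100; i++) tick(bullets);

console.log(recomputeCount, bullets[0].damageBonus);
```

Over 100 ticks the effect list is walked once rather than 100 times per bullet, which is the saving the nested-loop version misses.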

    If you have an extreme performance requirement and the event system just isn't cutting it, there is also the scripting feature - I know writing JS isn't for everyone but at least the option is there for raw code to make things as fast as possible if necessary.

  • I would love to put things in containers, but they don't work with families, and I usually end up with everything in families, because I almost always end up with multiple types.

    I've tried using instance variables on objects to hold dynamic function names and using those functions to differentiate behavior and that works, but I haven't fully investigated the other issues this might cause.

    One thing, I think... it would be nice if dictionaries and arrays could be added as behaviors or instance variable types, rather than being their own objects. I always wondered why that was. Basically, have them more baked in. Any object that needs a bunch of data for each instance always ends up needing to be in a container with them, or worse, as a family, needing to run a bunch of picks.

    As far as optimizations go, I generally try to cache data that is costly to compute, and order filter conditions with the least common first. Usually I have isDirty flags that I set and check. A lot of effects that needed updates every tick, I just set a tween on them. I'm assuming that is better, but even if it isn't, the tween is a pretty awesome tool for simplifying timers, delays, and other variables that need to update every tick.
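
Why putting the least common condition first helps can be shown with a small counting sketch in plain JavaScript. The `isBoss` and `isActive` flags and the `counted` helper are invented for the example; the assumption is simply that each later condition only tests what the earlier one picked.

```javascript
// Hypothetical instances: a rare flag (10 of 1000) and a common one (500 of 1000).
const bullets = Array.from({ length: 1000 }, (_, i) => ({
	isBoss: i % 100 === 0,
	isActive: i % 2 === 0
}));

// Wrap a predicate so every evaluation is counted.
let checks = 0;
const counted = pred => b => {
	checks++;
	return pred(b);
};

// Common condition first: the second filter still has 500 items to test.
checks = 0;
const commonFirst = bullets
	.filter(counted(b => b.isActive))
	.filter(counted(b => b.isBoss));
const commonFirstChecks = checks;

// Rare condition first: the second filter only has 10 items to test.
checks = 0;
const rareFirst = bullets
	.filter(counted(b => b.isBoss))
	.filter(counted(b => b.isActive));
const rareFirstChecks = checks;

console.log(commonFirstChecks, rareFirstChecks);
```

Both orders pick the same instances; the rare-first order just evaluates fewer conditions to get there.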

  • If you have an extreme performance requirement and the event system just isn't cutting it, there is also the scripting feature - I know writing JS isn't for everyone but at least the option is there for raw code to make things as fast as possible if necessary.

    I've been running some comparisons and I noticed that many times the events are already so well optimized that rewriting them in JS seemed to do nothing to speed things up. I actually managed to slow it down using JS in some cases - which is always nice. I know that many pretty sweet optimizations have been added to C3 over the years and it really shines through sometimes. I've had several occasions where C3 beat both Construct and Construct 2 by quite a large margin doing the exact same thing.

    I'd be curious to know if anyone has tried creating an ECS-like system in Construct. I don't even know if JS can go low enough in memory allocations to enable that type of design, but that would be a neat project.
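
On the memory side, standard JavaScript typed arrays do allow preallocated, fixed-size numeric storage, which is the usual basis for a struct-of-arrays layout. The following is a minimal sketch under assumptions: `posX`, `velX`, `spawn` and `movementSystem` are hypothetical names, nothing here touches Construct, and whether this would outperform events in a real project is untested.

```javascript
// Hypothetical struct-of-arrays storage: one typed array per component field.
const MAX_ENTITIES = 1000;
const posX = new Float32Array(MAX_ENTITIES);
const velX = new Float32Array(MAX_ENTITIES);
let entityCount = 0;

// An entity is just an index into the component arrays.
function spawn(x, vx) {
	const id = entityCount++;
	posX[id] = x;
	velX[id] = vx;
	return id;
}

// A "system" is one tight loop over the arrays, with no per-entity objects.
function movementSystem(dt) {
	for (let i = 0; i < entityCount; i++) {
		posX[i] += velX[i] * dt;
	}
}

const first = spawn(0, 10);
const second = spawn(5, -2);
movementSystem(0.5);

console.log(posX[first], posX[second]);
```

All storage is allocated up front, so the per-tick loop creates no new objects.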

  • Iʻve been running some comparisons and I noticed many times the events are already so well optimized that rewriting it in JS seemed to do nothing to speed it up. I actually managed to slow it down using JS in some cases

    I had that exact same experience a few times. :)

  • Yeah, I noticed that rewriting stuff in JS also leads to worse performance if it's just a simple event. I was able to get better performance if I was doing some heavy stuff on the instances, as this evens out the initial overhead.

    The reason seems to be that getting the instances is slow, whether using the iterator or getting the array of instances, maybe due to runtime glue or because they are lazily created? (Getting instances by UID was fast again.)
