Understanding Draw Calls.

  • https://www.scirra.com/demos/c2/quadissueperf/

    I tried this test on my phone, getting 7600 sprites on screen, at 30fps.

    So how do you explain, then, why I can't even have a few hundred static sprites on screen without dropping to 30fps on the same mobile?

    What is causing the slowdown then, if it's not the draw calls and not the rendering? There must be something causing it! And I can't find anything else to improve in what I'm doing at the moment. Your stress test handles 7000 sprites on screen no problem... my project, not even a few hundred.

    I want to know why... my only explanation is some kind of overhead.

  • I've experienced the same thing; too many sprite objects in a layout and the performance gets bogged down. I had to combine a lot of sprites I'd used as tiles, background decoration and whatnot, to get performance back up. Whatever the reason, it *is* an issue on my end at least.

  • I already answered that:

    Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck.

  • Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck.

    Sorry, but I don't think you actually understand how WebGL rendering works.

    You're absolutely right I don't, that's why I'm doing everything in my power to investigate why I'm getting bad performance on mobile.

    I made a gif to try to show you what I mean.

    This doesn't look very efficient to me. I'm not a WebGL guru, but from what I've read, you should be keeping draws to a minimum for WebGL. This doesn't look like 1 draw per frame from one array. This looks like several drawElements calls, layer upon layer. You can see the blue dots building up to the right.

    But anyway, so your example is rendering 7500 sprites but my game only a few hundred, on the same phone!? I doubt it's a GPU bottleneck. Other WebGL examples and games are not bottlenecked, so why only C2 games with lots of different sprites?

    I'm only guessing it has something to do with how the rendering is done.

    As I said, merging most of my artwork into the same sprite, by using animations and frames, seems to have a positive effect. So is the only way to get around the bottleneck to merge all sprites into one huge spritesheet? I want to get to the bottom of this. I shouldn't be getting 30fps with a couple of static sprites on screen.

    Anyway.... I'm going to set up a few different capx tests to further test this.

    Maybe that will help in finding out why performance drops significantly when using a lot of sprites from different spritesheets, but not when using 1 sprite or 1 spritesheet (texture).

  • So that particular case touches on a pretty obscure part of the engine. Another piece of WebGL performance advice is not to submit huge buffers in one go, but to actually submit them in chunks. This also helps keep the memory usage down and reduce latency to issuing work to the GPU. So the engine issues chunks of several thousand quads at a time. In the quadissue case, it reaches extreme levels of sprite batching so you are seeing lots of chunks.

    There is nothing to gain by improving this. It looks like it's submitting about 2500 sprites at a time, which means the draw call overhead is about 0.04% of the naive case of one call per sprite. If we increased this to say 5000, it would make such a tiny difference it is totally irrelevant (0.02%), while increasing memory usage and latency. So like most engineering tasks there's a tradeoff here, and we've aimed at a good sweet spot.

    So you are in fact looking at the batching engine working in ideal circumstances, and accusing it of bad performance. You should not jump to conclusions about parts of the engine you don't understand.

    GPU fillrate is the bottleneck that most people run in to, so that is probably the limit in your game too.
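
The percentages in the post above can be checked with a few lines of arithmetic. This is only a sketch of the reasoning, not the engine's code: the function names are made up for illustration, the chunk sizes 2500 and 5000 are the figures quoted in the post, and 7500 is the sprite count reported for the quadissue test.

```python
import math

def draw_calls(sprite_count: int, chunk_size: int) -> int:
    """Draw calls needed when sprites are submitted in chunks of chunk_size."""
    return math.ceil(sprite_count / chunk_size)

def overhead_vs_naive_percent(chunk_size: int) -> float:
    """Draw calls per sprite, as a percentage of the naive one-call-per-sprite case."""
    return 100.0 / chunk_size

if __name__ == "__main__":
    for chunk in (1, 2500, 5000):
        calls = draw_calls(7500, chunk)
        percent = overhead_vs_naive_percent(chunk)
        print(f"chunk={chunk}: {calls} draw calls, {percent}% of naive")
```

Going from chunks of 2500 to chunks of 5000 only moves the figure from 0.04% to 0.02%, which is the "tiny difference" the post describes.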

  • Ashley - if the GPU fillrate is bottle-necking literally everything, then how do we fix it?

  • Ashley - if the GPU fillrate is bottle-necking literally everything, then how do we fix it?

    Having all your artwork in 1 sprite object seems to help. A pain in the ass to work with, though.

    But I think it's starting to lean more towards texture switching rather than a GPU bottleneck.

  • There is nothing to gain by improving this. It looks like it's submitting about 2500 sprites at a time, which means the draw call overhead is about 0.04% of the naive case of one call per sprite. If we increased this to say 5000, it would make such a tiny difference it is totally irrelevant (0.02%), while increasing memory usage and latency. So like most engineering tasks there's a tradeoff here, and we've aimed at a good sweet spot.

    I can understand why you're doing that, with a lot of sprites. Bunnymark is doing the same thing, only their limit is a lot higher. They are doing 3 draws for 60,000 sprites.

    But in my game, I don't have a lot of sprites... currently 350 in layout, only a few on screen. But the draw and texture swapping is still apparent, quite a lot of texture swapping and draws for very few sprites, so even if the drawing workload is pretty small, for a mobile phone that's pretty weak, something is causing it to slow down. Even with just a few sprites on screen.

    How to get around it? Can it be the texture swapping that has overhead? Looking at the WebGL inspector, it's drawing a few sprites, switching texture, drawing a few more sprites, switching again, drawing a few more sprites, over and over like that, each frame, in multiple layers. Not even near 2500 sprites per draw. Maybe 10 at most. So there must be something causing the slowdown.

    I imagine, if I were to put all my sprites into one spritesheet this wouldn't happen. I'm curious to try it on C3, as it's much smarter when generating the spritesheets, but I can't test this project there yet, as there's no photon cloud plugin so far.

    Exporting from C3, I can see that even separate sprite objects are on the same sprite sheet. In C2 this is not the case.

    So is it possible that in C2 the overhead is caused by the texture swapping, and by the unnecessarily many draw calls that result from it?
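
The pattern described in this post (draw a few sprites, switch texture, draw a few more) can be modelled with a toy batch counter. This is a simplified sketch, not Construct 2's renderer: it assumes sprites are drawn in a fixed back-to-front order and that every texture change ends the current batch. The texture names and draw orders are hypothetical.

```python
from itertools import groupby

def count_batches(draw_order):
    """Count draw calls, assuming each change of texture in draw order starts a new batch."""
    return sum(1 for _ in groupby(draw_order))

# Hypothetical draw orders; each entry is the texture used by one sprite.
interleaved = ["tree", "rock", "tree", "wall", "rock", "tree"]
grouped = ["tree", "tree", "tree", "rock", "rock", "wall"]
atlased = ["atlas"] * 6  # all six sprites share one spritesheet

print(count_batches(interleaved))  # 6
print(count_batches(grouped))      # 3
print(count_batches(atlased))      # 1
```

Under these assumptions the same six sprites cost anywhere from one to six draw calls depending only on which textures they use and how they are ordered, which is consistent with the observation that sprites sharing a sheet batch better.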

  • Are you testing in C2 preview mode? That will have lots of texture swapping. It shouldn't happen so much in C3, and less after exporting in C2 (but still not as good as C3, which has a much better spritesheeting engine and combines more images on to sheets).

    GPU fillrate is usually a far bigger problem than draw calls. Spritesheeting doesn't affect it; it's to do with the number of pixels drawn to on the screen. To reduce that you have to use fewer, smaller sprites, avoid heavy overdraw (i.e. stacking lots of sprites on top of each other), and in particular avoid force-own-texture layers (which also includes layers with effects, blend modes, or opacity other than 100%)... or find a device with more fillrate.
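
A rough way to reason about fill rate is to add up the pixels each sprite covers and compare the total with the screen size. The sketch below is a back-of-envelope estimate under stated assumptions: every sprite is fully on screen, transparent pixels cost the same as opaque ones, and extra passes such as force-own-texture layers are ignored. The sprite sizes and the 1280x720 screen are hypothetical, not measurements from either project in this thread.

```python
def overdraw_factor(sprites, screen_w, screen_h):
    """Pixels drawn per frame divided by screen pixels.

    sprites is a list of (width, height) on-screen sizes in pixels.
    A result of 1.0 means roughly one full screen of pixels per frame.
    """
    drawn = sum(w * h for w, h in sprites)
    return drawn / (screen_w * screen_h)

# Hypothetical scenes on a 1280x720 screen.
many_small = [(32, 32)] * 7500   # lots of tiny sprites
few_large = [(256, 256)] * 300   # far fewer, but much larger sprites

print(round(overdraw_factor(many_small, 1280, 720), 2))  # 8.33
print(round(overdraw_factor(few_large, 1280, 720), 2))   # 21.33
```

With these made-up sizes, 300 large sprites touch more than twice as many pixels as 7500 small ones, so sprite count alone says little about fill-rate cost.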

  • Ashley - my layout has 3 of the same tilemap on 3 different layers (4 if you count the collision tilemap). Plus a player and enemies layer (which has JUST players and enemies on it in random spots). A BG layer and a light layer. A few of the lights using additive (maybe 20) and no other effects being used.

    If that's too much for my Surface Book i7 with a 960 NVIDIA card, what I'm understanding is "make a flappy bird clone or it's too much for most GPUs to handle".

    I'm assuming the Ori and the Blind Forest developers are cheating somehow...

  • I have been working on a database front end. The screen shot below is from my desktop, but it also runs solid at 60 fps on my old iPhone 5s (running 8.4.1 - so not benefiting from newer iOS improvements). The CPU usage on my phone settles around 42%, but draw calls are only 2% (and only 0.4% on my desktop when nothing is moving).

    The bulk of the CPU usage is from the collision checks (12,000 a second), which check whether the mouse is over a data line and then highlight the line. If I turn that off, the collision checks go away, and the CPU usage drops way down on my phone.

    Many of the 1146 objects on this screen are spritefont objects (over 800), so it obviously benefits from drawing the same object many times - although each object has different text to display. My eventsheet for this layout has 1,381 events at the moment - in dozens of functions. Code only gets called when needed. When nothing is happening there is almost no code running...

    I use 6 layers, one of which forces its own texture, because it needs to mask out areas to make a scrollable pop-up window. So, for what I am doing, the engine seems to work very well.

  • I'm assuming the Ori and the Blind Forest developers are cheating somehow...

    Their lead gameplay programmer David Clark used to work with me on Construct Classic, and I still sometimes chat with him. He hasn't mentioned cheating so far.

    Anyways, as ever if you give me a .capx I can actually look at something, otherwise I'll just shrug and say it's probably the fill rate. Construct 2 makes it really easy to hit the fillrate hard. Well-designed games go to extraordinary lengths to optimise this kind of thing.

  • I'm not talking desktop, I'm talking about mobile. Draw calls, according to the debugger, use A LOT of CPU time, almost on par with all my game logic, collision checks and everything else combined. And I don't understand why.

    No effects, No Force own texture, No blend modes, Not a lot of sprites, no webGL effects.

    This test on my mobile... 7500 sprites, no problem.

    https://www.scirra.com/demos/c2/quadissueperf/

    Bunnymark, 1500 sprites jumping around, no problem.

    http://www.goodboydigital.com/pixijs/bunnymark/

    My game: raising the sprite count from 300 to 1000 sprites, in the entire layout.

    Then the fps drops to 30 on the same phone.

    It can't be fill rate, the game is pretty barren so far, not even all of the sprites in yet. The only thing is a little bit of texture overlap, since it's isometric perspective.

    So what is it? The only thing I see in the debugger is draw calls, draw calls, draw calls, way up there. Fill rate? Bottleneck? What is causing it, and how do I lower it? If it is something that I'm doing, there should be a comprehensive tutorial on how to lower it.

    It's driving me nuts!

  • Ashley - I just sent a follow up capx with no plugins (hopefully)

  • I FOUND IT!!! AND I WAS RIGHT!!!!

    I created a completely new layout with the same kind of sprites that I use in my game, only this time packed much tighter, on different layers, so there should be even more overlap and fill. Voilà, way lower CPU for draw calls.

    Now for the fun part.

    Then I selected some of the sprites and changed their animation frame to another one with different sprite artwork. VOILÀ! Draw calls shooting through the roof, with the same amount of sprites! It is FOR SURE something with the draw calls and texture swapping!!!

    Ashley you need to look over it. I knew I wasn't crazy!

    The more different kinds of sprites I use, the slower it gets, and it's because of the draw call overhead and texture swapping seen HERE! I could even recreate it. EDIT: Maybe it's specific to Edge and Windows Phone, but there must be something to minimize this, as it doesn't seem to like it.

    Here is a link to the capx to confirm. The only thing you need to do is select some groups of random objects and change their initial frame.

    https://www.dropbox.com/s/q5fr4y6mks63q ... .capx?dl=1

    Is this proof enough? My game is not badly coded, and I'm not using a lot of unnecessary effects and blend modes. What is causing this?
