tunepunk's Forum Posts

  • https://www.scirra.com/demos/c2/quadissueperf/

    I tried this test on my phone, getting 7600 sprites on screen, at 30fps.

    So how do you explain why I can't even have a few hundred static sprites on the game screen without dropping to 30fps on the same phone?

    What is causing the slowdown, then, if it's not the draw calls and not the rendering? There must be something causing it, and I can't find anything else to improve in what I'm doing at the moment. Your stress test handles 7000 sprites on screen with no problem; my project can't manage even a few hundred.

    I want to know why. My only explanation is some kind of overhead.

  • Your screenshot shows FPS < 60 and CPU well under 100%, which is typically indicative of the GPU hardware being the bottleneck. So there's no evidence draw calls are the limitation there.

    [quote:lmn072l8]

    Fewer, larger draw operations will improve performance. If you have 1000 sprites to paint, try to do it as a single drawArrays() or drawElements() call. You can draw degenerate (flat) triangles if you need to draw discontinuous objects as a single drawArrays() call.
    That's exactly what bunnymark is doing when I check it with the WebGL inspector.
    
    Using a WebGL inspector I can clearly see you're not doing that! As I said, you may not notice it for small games on powerful devices, but you will notice it for LARGE games and mobile games.
    

    The engine does already do that, with a sophisticated batching engine. But changing texture is one of the operations that has to split the batch. In C3, or after export, textures are merged in to spritesheets and the batching works better since there are fewer texture swaps.

    So we're already doing everything you've asked for.

    No it doesn't! Use a WebGL inspector and check for yourself! The aim should be 1 draw per frame, that's it! By splitting the batch you're creating hundreds of draws where you could be doing a single one, with all the sprites in one go!

    Stepping through the C2 draws, I can see what you're explaining: some things are batched together, but it's drawing layer upon layer 100 times per frame, where you SHOULD be drawing 1 time per frame as the bunnymark example is doing. All the sprites in one go!! The implementation is sloppy. It's doing it completely wrong, with loads of unnecessary overhead.

    There IS an overhead issue, and it scales directly with the number of sprites (draws), as you're rendering layer upon layer of "drawElements" where all of it could be drawn in one go.

    I'm getting lots of draws per frame, layer upon layer upon layer, and I can step through them one by one to see how it's layered.

    Bunnymark is using 1 draw per frame, as you SHOULD be aiming for, no matter how many bunnies on screen, it's always 1 draw per frame.

    I don't even know why I have to point out the obvious?

    Do I have your permission to modify c2runtime.js and do it the right way?

  • You're worrying over nothing. There is nothing here to suggest any performance problems.

    Are you kidding me? Here's a new screenshot. The only thing I did was increase the number of sprites in the layout to about 1000. Take notice: IN LAYOUT, not on screen. None of them are moving, they're just static sprites, and the framerate dropped to 30fps.

    Draw calls also increase along with the number of sprites, because you're not using buffers!

    Of course, 100 draw calls is not very much for a small game on a powerful device, but people making large games and games for mobile ARE noticing the bad performance, because you're not even implementing best practices, the general things you should do.

    https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API/WebGL_best_practices

    Fewer, larger draw operations will improve performance. If you have 1000 sprites to paint, try to do it as a single drawArrays() or drawElements() call. You can draw degenerate (flat) triangles if you need to draw discontinuous objects as a single drawArrays() call.
    That's exactly what bunnymark is doing when I check it with the WebGL inspector.
    
    Using a WebGL inspector I can clearly see you're not doing that! As I said, you may not notice it for small games on powerful devices, but you will notice it for LARGE games and mobile games.
    
    Please, please, please... just try to look into at least following best practices and using a single drawArrays() call. It's a known fact that WebGL overhead is an issue, and you're doing nothing to minimize it.
    
    Or is my only option to modify c2runtime.js myself to prove you wrong?
    
    I can easily say that with just that little tweak we would get a LOT better performance.
    
    If I'm wrong I'd be happy to send you a fine bottle of whiskey.
    If you're wrong, the only thing you have to lose is a little time, and you'd get happier customers because of a small tweak to how things are drawn.
  • This thread should have enough solid proof now that the way C2 does the rendering is not very efficient at all, considering it's WebGL, and what it should be capable of.

    If you can provide a minimal .capx that shows high draw call usage, I'd be happy to investigate optimising the engine. Without that the most I can do is speculate.

    So get on with it

    I'd be happy to play with the new superfast C2, C3, once the optimizations are in

  • Testing bunnymark vs. my Construct project's rendering: there are far fewer calls here, and far more bang for the buck. Looking at the WebGL inspector, they are rendering things differently than C2 does. It seems to be using buffers.

    It seems the way C2 render stuff has a LOT more overhead...

    http://www.goodboydigital.com/pixijs/bunnymark/

    Here's the link to bunnymark if anyone wants to try it on their phone to test performance.

    I can have 1500 bunnies jumping around on a midrange phone (Nokia Lumia 830) before the framerate goes below a full 60fps.

    My Construct project is struggling on the same phone with 50 static objects on screen. No animations, nothing moving.

    Here's a screenshot from my game, in an area with very few objects. CPU is pretty high, mostly due to draw calls, and the framerate is getting low: about 50ish, with just a few static objects on the map.

    Here's a screenshot of Bunnymark with lots of objects jumping around, at a similar framerate, 60fps.

    I'm pretty confident that the near-native performance Ashley claims is possible with WebGL, but not with the current implementation, as it's REALLY inefficient.

    Please take a look at this. It's not only me experiencing bad performance, and I think Construct can do better. It's just sloppy implementation and bad optimization.

    And I think this should be a first priority, as people are choosing other engines due to performance issues.

  • I also noticed that I was able to get a good performance boost by merging most of my assets into as few sprites as possible, putting the assets in different frames and animations of the same sprite so that they end up in the same spritesheet, since everything is rendered "per texture". If I wasn't doing that I wouldn't be getting as good performance as I currently am.

    So, my conclusion: use as few sprites as possible, and add all assets to the same sprite. This will increase performance, since they will then be on the same "TEXTURE" (spritesheet), which results in fewer draw calls, less overhead, and less drawing per frame.
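
    To make the "per texture" point concrete, here is a rough sketch of how a batcher that has to split on every texture change would count draw calls. This is not C2's actual code: the sprite objects, the texture names and the countBatches helper are all made up for illustration, and a real batcher splits on other state changes too.

```javascript
// Hypothetical model: each sprite is { name, texture }, listed in draw (z) order.
// A new batch (one draw call) starts whenever the texture differs from the
// texture of the sprite drawn just before it.
function countBatches(drawList) {
  let batches = 0;
  let currentTexture = null;
  for (const sprite of drawList) {
    if (sprite.texture !== currentTexture) {
      batches += 1;
      currentTexture = sprite.texture;
    }
  }
  return batches;
}

// The same six sprites: first on two interleaved textures, then on one shared sheet.
const separate = [
  { name: "tree", texture: "tree.png" },
  { name: "rock", texture: "rock.png" },
  { name: "tree", texture: "tree.png" },
  { name: "rock", texture: "rock.png" },
  { name: "tree", texture: "tree.png" },
  { name: "rock", texture: "rock.png" },
];
const merged = separate.map((s) => ({ name: s.name, texture: "sheet.png" }));

console.log(countBatches(separate)); // 6
console.log(countBatches(merged));   // 1
```

    In this toy model the interleaved z order is what hurts: the same two textures drawn as three trees followed by three rocks would only need 2 batches.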

    I was looking through the whole WebGL section of c2runtime.js.

    Are we allowed to modify c2runtime.js? I would like to run some tests to see if I could make some improvements there.

  • Ashley, I sent you a project file by email.

    Draw calls is another matter really, and happens on the CPU side. It's probably best to split that topic off to a new thread. We have OpenGL ES 3 equivalent capabilities with WebGL 2 though, so if at any point draw calls prove to be a bottleneck, it's something we can potentially optimise in exactly the same way a native app would adjust their draw calls to be more efficient. Most 3D APIs, WebGL included, are specifically designed to allow as much drawing as possible with the fewest draw calls, to as far as possible eliminate the CPU overhead.

    I'm not a programmer, but I just feel that drawing and draw calls are not very optimized currently. It's like it's drawing every single sprite separately, multiple times per frame, instead of drawing lots of things at once from a buffer.

    And it feels like there's a lot of overhead currently. And that there's a lot of room for improvement. Especially when it comes to draw calls and rendering.

    [quote:2pwvp7hf] - next I tested drawing with ANGLE_instanced_arrays, object positions are computed on CPU, written to a (double-buffered) dynamic vertex buffer, and then rendered with a single draw call, in Chrome on Windows with NVIDIA I can get 450k instances before the performance drops below 60fps (so 450k particle position updates per frame in JS, and no sweat!), performance in a native app isn't better here, my suspicion is that the vertex buffer update is the limiter here (500k instances means 8MByte of dynamic vertex data shuffled to the GPU each frame), on my OSX MBP I can go up to about 180k instances (again very likely vertex throughput limited). However in this case, the way the dynamic vertex buffer works is also important, it looks like vertex buffer orphaning is useless in WebGL (see discussion here: https://groups.google.com/forum/#!topic ... MNXSNRAg8M), so I switched to double-buffering

    Reading that quote, it seems some people are getting way more performance out of WebGL than we currently can in C2/C3, which I believe is due to overhead. Maybe both from draw calls and from the way it is rendered? Any possibility there's something to this?

    I'm not an engine programmer, I'm a designer, but it just seems C2/C3 could perform a lot better than it currently does by minimizing overhead.

  • Ashley No problem, I can send over the actual project. Where do I send it, as I don't want to share it publicly?

  • In my own project, checking the debugger I always notice draw calls using most CPU, more than all my game logic combined. Even if I'm not using any Blend modes, WebGL effects, particles, etc. I want to understand why it is so high and what I can do to reduce it.

    The only way I've found to reduce it is to reduce the number of sprites on screen, sometimes even merging graphics into one big sprite to reduce draw calls (which uses more memory). And as draw calls go up, I notice the frame rate dropping.

    My question goes out to Ashley and the devs: is there anything more they can do on their end to optimize this further, and can they enlighten us a bit more about how it works and why it's using that much CPU? This seems to be a huge CPU hog even for simple games, especially for me as I'm designing for mobile.

  • https://docs.google.com/presentation/d/12AGAUmElB0oOBgbEEBfhABkIMCL3CUX7kdAPLuwZ964/edit#slide=id.i0

    Found this presentation document for a good read on WebGL, I'm just trying to understand a little bit more on how it works. I have no doubt in my mind that it can match performance of native if optimized the right way. The only thing I'm not sure of is if C2/C3 is getting the most out of it. Since a lot of people still seem to be complaining about it.

    Maybe both are right? Ashley claiming close to native level performance (which probably is true in an optimal case), but users are experiencing something else with their projects because it's not optimized?

    What do I know? Just speculating...

  • Ashley

    I don't know if any of this makes any sense to you (I'm not a coder), but I just tried the WebGL inspector in Firefox. This looks like a lot of draw calls for 1 frame, where good practice (from what I've read) is to bundle them and draw all sprites at once. Is there any way to optimize this further?

    And Gecko seems to be using most CPU time.

    As I've said, I'm not doubting WebGL performance. I just suspect we could get more bang for the buck if C2/C3 were optimized to reduce draw calls to a minimum, since overhead is the major issue with WebGL. Known fact. That's why it's good practice to draw many things at once. The optimum is to draw 1 time per frame.

    Any thoughts on that? Is there anything that can be done to reduce the number of draw calls, do you think?

  • I think it's always been like that.

    X - is usually left to right. (horizontal)

    Y - vertical.

    Z - depth.

    I know, sometimes it just feels wrong. I don't know why. :p

  • I think all modern OSs double buffer everything on-screen. Certainly all modern graphics applications do.

    I was under the impression that was something you have to set up manually?

    http://stackoverflow.com/questions/29565026/how-to-implement-vbo-double-buffering-in-webgl

    Anyway, from what I've read, the main issue with WebGL is draw call overhead, so reducing that should improve performance a lot.

    [quote:1n0y2rq9]Fewer, larger draw operations will improve performance. If you have 1000 sprites to paint, try to do it as a single drawArrays() or drawElements() call. You can draw degenerate (flat) triangles if you need to draw discontinuous objects as a single drawArrays() call.

    https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API/WebGL_best_practices

    I'm not doubting the power of WebGL, vs Native, I just feel it's not really optimized when using C2/C3.
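
    For anyone wondering what "1000 sprites in a single drawArrays() call" looks like in practice, here's a minimal sketch of the CPU side: packing every sprite's quad into one vertex array. The sprite shape ({ x, y, w, h }) and the buildQuadBatch helper are my own assumptions for illustration, not anything from C2 or bunnymark, and a real renderer would also pack texture coordinates.

```javascript
// Hypothetical sprite shape: { x, y, w, h } in pixels, top-left origin.
const FLOATS_PER_VERTEX = 2; // x, y only; a real renderer would add UVs
const VERTICES_PER_QUAD = 6; // two independent triangles, no index buffer

// Packs every sprite's quad into one Float32Array so the whole set can be
// uploaded and drawn with a single call.
function buildQuadBatch(sprites) {
  const data = new Float32Array(
    sprites.length * VERTICES_PER_QUAD * FLOATS_PER_VERTEX
  );
  let i = 0;
  for (const { x, y, w, h } of sprites) {
    const x2 = x + w;
    const y2 = y + h;
    // Triangle 1: top-left, top-right, bottom-left
    data[i++] = x;  data[i++] = y;
    data[i++] = x2; data[i++] = y;
    data[i++] = x;  data[i++] = y2;
    // Triangle 2: bottom-left, top-right, bottom-right
    data[i++] = x;  data[i++] = y2;
    data[i++] = x2; data[i++] = y;
    data[i++] = x2; data[i++] = y2;
  }
  return data;
}

const sprites = [];
for (let n = 0; n < 1000; n++) sprites.push({ x: n * 4, y: 0, w: 32, h: 32 });

const batch = buildQuadBatch(sprites);
const vertexCount = batch.length / FLOATS_PER_VERTEX;
console.log(vertexCount); // 6000

// With a real WebGL context `gl` and a buffer bound to ARRAY_BUFFER, the whole
// batch would then go out in one call:
//   gl.bufferData(gl.ARRAY_BUFFER, batch, gl.DYNAMIC_DRAW);
//   gl.drawArrays(gl.TRIANGLES, 0, vertexCount);
```

    Because this uses separate triangles (gl.TRIANGLES), no degenerate triangles are needed; those only come in if you draw with triangle strips. The catch is that one call like this only works while every sprite in it shares the same texture and render state.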

  • Ashley

    Is C2/C3 taking advantage of double buffering to reduce draw calls? Check my previous post, my only issue for my own game on mobile is draw calls, so maybe there's something to it?

    The only reason WebGL might perform slower than native is due to overhead, which should be reduced to a minimum for WebGL.

  • I've been following this topic for a while, and from what I've read, a WebGL application on Windows can beat a native desktop OSX application, because the OSX OpenGL driver sucks.

    On Windows, a native desktop application can easily have 10x more draw call throughput than a WebGL app running on the same machine, not because of slow JS performance but because of WebGL overhead.

    So what people do to combat this is reduce the number of draw calls made to WebGL.

    [quote:1b3hksbp] - next I tested drawing with ANGLE_instanced_arrays, object positions are computed on CPU, written to a (double-buffered) dynamic vertex buffer, and then rendered with a single draw call, in Chrome on Windows with NVIDIA I can get 450k instances before the performance drops below 60fps (so 450k particle position updates per frame in JS, and no sweat!), performance in a native app isn't better here, my suspicion is that the vertex buffer update is the limiter here (500k instances means 8MByte of dynamic vertex data shuffled to the GPU each frame), on my OSX MBP I can go up to about 180k instances (again very likely vertex throughput limited). However in this case, the way the dynamic vertex buffer works is also important, it looks like vertex buffer orphaning is useless in WebGL (see discussion here: https://groups.google.com/forum/#!topic ... MNXSNRAg8M), so I switched to double-buffering

    I don't know how well C2/C3 handles this, but it might just be that draw call overhead is the problem, and the reason many people experience it as slow?
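
    To put the numbers in that quote in perspective, here's a small sketch of the arithmetic plus the double-buffering idea. The 16 bytes per instance is my own inference from "500k instances means 8MByte", the DoubleBuffer class is hypothetical, and plain typed arrays stand in for the two GPU vertex buffers so it runs without a WebGL context.

```javascript
// Assumption: 16 bytes per instance (e.g. four 32-bit floats), inferred from
// the quoted "500k instances means 8MByte" figure.
const BYTES_PER_INSTANCE = 16;

function bytesPerFrame(instanceCount) {
  return instanceCount * BYTES_PER_INSTANCE;
}

// Two equally sized arrays used alternately, standing in for two GPU vertex
// buffers: the one written this frame is never the one used last frame.
class DoubleBuffer {
  constructor(instanceCount) {
    const floats = (instanceCount * BYTES_PER_INSTANCE) / 4;
    this.buffers = [new Float32Array(floats), new Float32Array(floats)];
    this.index = 0;
  }

  // Returns the array to fill for the coming frame, then flips to the other one.
  next() {
    const current = this.buffers[this.index];
    this.index = 1 - this.index;
    return current;
  }
}

console.log(bytesPerFrame(500000) / 1e6);        // 8   (MB per frame)
console.log((bytesPerFrame(500000) * 60) / 1e6); // 480 (MB per second at 60fps)

const staging = new DoubleBuffer(1000);
const a = staging.next();
const b = staging.next();
const c = staging.next();
console.log(a !== b, a === c); // true true
```

    So in that test the limit being hit is roughly how much vertex data can be pushed per second, not the number of draw calls, since there is only one.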

    Edit: I'm pretty certain the only performance issue with C2/C3 is overhead, nothing else...

    The C3 example, "Quad issue performance test", should probably be able to handle double the number of sprites without breaking a sweat if draw calls were reduced.