On quads vs. triangles, I'm sceptical that it's worth making any changes. Construct's pipeline has been fine-tuned for extreme performance with quads. Just today I tested the M1 Pro and found it can render 750k quads on-screen at 30 FPS. A quad is just two connected triangles, so that means 1.5 million triangles. On top of that, as best I can tell from the evidence, this is bottlenecked on the memory bandwidth of iterating the JavaScript objects Construct uses to represent objects in the layout, so a single object issuing lots of quads would probably score significantly higher still. I've previously tried to optimise the way quads are issued, and it made zero difference to the benchmark - presumably for the same reason. So I think issuing a degenerate quad is fine: the performance penalty of sending a single extra vertex appears to be dwarfed by the other overheads.
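To make the degenerate-quad point concrete, here's a minimal sketch - hypothetical helper names, not Construct's actual renderer API - showing how a single triangle can be issued as a quad with just one repeated vertex:

```ts
// Minimal sketch for illustration - not Construct's actual renderer API.
// Assumes indexed rendering: each quad adds 4 vertices and the index buffer
// splits it into the two triangles (0, 1, 2) and (0, 2, 3).
type Point = [number, number];

const vertexData: number[] = [];

function pushQuad(a: Point, b: Point, c: Point, d: Point): void
{
	vertexData.push(...a, ...b, ...c, ...d);  // 4 vertices per quad
}

// A single triangle is issued as a degenerate quad by repeating the last
// vertex: the second triangle (a, c, c) has zero area, so the rasteriser
// discards it, and the only extra cost is the one duplicated vertex.
function pushSingleTriangle(a: Point, b: Point, c: Point): void
{
	pushQuad(a, b, c, c);
}
```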
Further, if we added a separate triangles mode, using it would actually mean breaking the batch to change rendering parameters, since the default rendering mode can only render quads, as it's been so heavily optimised for that. So even if we did it, I think there's a chance it would actually end up slower than sticking with degenerate quads, as the overhead of changing modes could outweigh the overhead of sending a single extra vertex.
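As a rough illustration of why a mode switch is costly - this is a hypothetical batcher, not how Construct's renderer is actually structured - any state change forces the pending batch to be flushed as a separate draw call, which typically costs far more than one redundant vertex in an existing batch:

```ts
// Hypothetical batching sketch - not how Construct's renderer is written.
type RenderMode = "quads" | "triangles";

class BatchSketch
{
	private currentMode: RenderMode = "quads";
	private pendingVertexCount = 0;

	setMode(mode: RenderMode): void
	{
		if (mode !== this.currentMode)
		{
			this.flush();            // break the batch to change state
			this.currentMode = mode;
		}
	}

	pushVertices(count: number): void
	{
		this.pendingVertexCount += count;
	}

	flush(): void
	{
		if (this.pendingVertexCount === 0)
			return;

		// In a real renderer this is where the draw call would be issued.
		console.log(`draw ${this.pendingVertexCount} vertices in ${this.currentMode} mode`);
		this.pendingVertexCount = 0;
	}
}
```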
It's also possible to render pairs of connected triangles as quads, avoiding wasting a vertex - our own engine does that for rendering mesh distortion, and it also issues degenerate quads in a couple of corner cases where it just wants a single triangle. So I don't think there's any case for changing this, especially given the high level of complexity it would probably involve - in Construct, just go with degenerate quads. If you find some benchmark that proves it's unreasonably slow, let me know and we can take it from there, but I think it's a good bet that won't happen.
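For reference, the pairing approach looks roughly like this sketch - illustrative only, not Construct's actual mesh code, and it assumes adjacent triangles in the list share a diagonal:

```ts
// Rough sketch of the pairing idea - not Construct's actual mesh code.
// Adjacent pairs in the triangle list are assumed to share the diagonal a-c,
// so tris[i] = (a, b, c) and tris[i + 1] = (a, c, d) form the quad (a, b, c, d).
type Point = [number, number];
type Triangle = [Point, Point, Point];

const vertexData: number[] = [];

function pushQuad(a: Point, b: Point, c: Point, d: Point): void
{
	vertexData.push(...a, ...b, ...c, ...d);  // 4 vertices per quad
}

function pushTriangleList(tris: Triangle[]): void
{
	for (let i = 0; i < tris.length; i += 2)
	{
		const [a, b, c] = tris[i];

		if (i + 1 < tris.length)
		{
			// A pair of connected triangles becomes one full quad - no wasted vertex.
			const d = tris[i + 1][2];
			pushQuad(a, b, c, d);
		}
		else
		{
			// Leftover single triangle: issue a degenerate quad, wasting one vertex.
			pushQuad(a, b, c, c);
		}
	}
}
```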
On two-color tinting, this is a more complicated problem and could involve more performance overhead. However, I still want to understand a bit more about exactly how it's used, as that significantly affects the potential solutions. Construct has a special fast path for simple color-only effects like "Adjust HSL" - those can just be rendered normally with (more or less) another shader program selected. The shader parameters matter though, and can affect the batch. If you set one set of parameters and then render 1000 triangles, it will be fine: everything can still batch, as the renderer can see the parameters aren't changing. However, if you change the parameters per-triangle, then there will be batch thrashing, and things like adding an extra vertex attribute could come into play. So my question is: how do people really use this? Do you need per-triangle colors? Do lots of people really make use of per-triangle colors, so this really is something that will affect a lot of cases? The answers to these questions could mean the difference between it basically working fine as-is and a very complicated overhaul of the entire renderer - something I'm very reluctant to do. So those answers are important.
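To illustrate the difference - with hypothetical function names, not Construct's actual API - here's a sketch of constant parameters batching in one draw versus per-triangle parameters breaking the batch every time:

```ts
// Hypothetical sketch - the names here are illustrative, not Construct's API.
// The point: parameters that stay constant batch fine; parameters that change
// per-triangle break the batch every time.
type Color = [number, number, number];

let batchBreaks = 0;
let lastParams = "";

function setTintColors(primary: Color, secondary: Color): void
{
	const params = JSON.stringify([primary, secondary]);
	if (params !== lastParams)
	{
		++batchBreaks;       // changed state: the current batch must be flushed
		lastParams = params;
	}
}

function pushTriangle(): void
{
	// vertices would be appended to the current batch here
}

// Fine: one batch break, then 1000 triangles in a single batch.
setTintColors([1, 0, 0], [0, 0, 1]);
for (let i = 0; i < 1000; ++i)
	pushTriangle();

// Problematic: different parameters per triangle means ~1000 batch breaks.
for (let i = 0; i < 1000; ++i)
{
	setTintColors([i / 1000, 0, 0], [0, 0, 1]);
	pushTriangle();
}

console.log(batchBreaks);  // 1 from the first loop, 1000 from the second
```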
FWIW, I've been nearing completion of our WebGPU renderer. It works significantly differently to the WebGL renderer internally, but it still efficiently implements the same interface you get with IWebGLRenderer. This means two things: trying to customise the renderer for things like extra vertex attributes is much more complicated, as there are now two renderers with significantly different internals to support; and there is an opportunity to make things much faster in the WebGPU renderer specifically.
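As a loose structural sketch of what that means - this is not the real IWebGLRenderer interface, just an illustration - any new drawing feature has to be implemented in both backends:

```ts
// Loose structural sketch - not the real IWebGLRenderer interface.
// Both backends expose the same drawing methods, so any new feature
// (e.g. an extra per-vertex attribute) has to be implemented twice.
interface IRendererSketch
{
	setTexture(texture: unknown): void;
	pushQuad(x1: number, y1: number, x2: number, y2: number,
	         x3: number, y3: number, x4: number, y4: number): void;
}

class WebGLBackendSketch implements IRendererSketch
{
	setTexture(_texture: unknown): void { /* bind a WebGL texture */ }
	pushQuad(..._coords: number[]): void { /* append to the WebGL vertex buffer */ }
}

class WebGPUBackendSketch implements IRendererSketch
{
	setTexture(_texture: unknown): void { /* set a GPUTexture in a bind group */ }
	pushQuad(..._coords: number[]): void { /* write to a GPU buffer via the queue */ }
}
```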