However, I do tend to believe it's a C2 problem, or at least a 2D problem. I just tried out this benchmark, which renders 150,000 cubes. On my integrated chip it runs at 40 fps easily.
3D engines can take advantage of GPU acceleration much better. That dates from around the time hardware transform & lighting was incorporated into the GPU; before that, 3D games were also limited by the CPU, which had to set up the scene.
2D games tend to be CPU limited: the fill rate of modern GPUs is insanely high, so it's usually not the bottleneck.
I'm wary of this limitation and of C2 logic being single-threaded. Already, early in SN2's development, I'm seeing 50-60% CPU usage (the GPU is barely doing any work at all!), all on one thread of my quad-core CPU.
After SN2, I'll move to a 3D game engine for future games.