I dug out an old C2 performance benchmark used for collision cells and adapted it to show all the content on a single screen: https://www.dropbox.com/scl/fi/8r2r91i0q84ay5454rssh/Overlap-benchmark.c3p?rlkey=x562ztroagbmoonxr057j4tes&dl=0
It uses 1000 objects all testing overlap with each other, which causes ~1 million collision checks per tick. On my high-end desktop system that results in ~40 FPS with the CPU maxed out. Some games really do pack this much content on a single screen (Vampire Survivors is a good example, as are bullet-hell style games), so it's a good test case. Collision cells don't help by default here, because the default cell size is the viewport size, so with everything on one screen every object ends up in the same cell and you're back to checking everything against everything.
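To illustrate where the ~1 million figure comes from, here's a minimal sketch of the brute-force case in TypeScript. The names and structure are purely illustrative, not the actual engine code:

```typescript
// Hypothetical bounding box type for illustration only.
interface Box {
	left: number;
	top: number;
	right: number;
	bottom: number;
}

function overlaps(a: Box, b: Box): boolean
{
	return a.left < b.right && a.right > b.left &&
	       a.top < b.bottom && a.bottom > b.top;
}

function countOverlapsBruteForce(boxes: Box[]): number
{
	let count = 0;
	// Every object tests against every other object, i.e. O(n^2) checks.
	// With 1000 objects that's ~1,000,000 overlap tests per tick
	// (each pair is tested in both directions, matching the benchmark).
	for (let i = 0; i < boxes.length; ++i)
		for (let j = 0; j < boxes.length; ++j)
			if (i !== j && overlaps(boxes[i], boxes[j]))
				++count;
	return count;
}
```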
I added a way to change the collision cell size and it helps a great deal - for this benchmark the sweet spot seems to be about 50x50, which is a lot smaller than the default. That results in a smooth 60 FPS and ~25% CPU usage, so it's several times faster. I think that also shows collision cells can be a perfectly good solution if the size can be tweaked for the game - I'm not sure there's any reason to go further with more advanced algorithms. Too small a cell size does end up slower, as the overhead of tracking cells outweighs the performance saving, but it seems fairly easy to land in the right ballpark and get big improvements.
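For reference, here's a rough sketch of the collision cell idea with a tunable cell size, again illustrative rather than the real implementation (it reuses the Box and overlaps helpers from the sketch above). Objects are bucketed into grid cells by their bounding boxes, and overlap tests only run between objects that share a cell, which is why a cell size like 50x50 cuts the work so much when content is spread across the screen:

```typescript
// Rough sketch of collision cells with a tunable cell size (illustrative only,
// not the actual Construct implementation).
function countOverlappingPairsWithCells(boxes: Box[], cellSize: number): number
{
	const grid = new Map<string, number[]>();	// "cx,cy" -> indices of boxes touching that cell

	// Bucket every object into each grid cell its bounding box covers.
	// Note: the smaller the cell, the more cells each object touches,
	// which is where the overhead comes from if cells are too small.
	for (let i = 0; i < boxes.length; ++i)
	{
		const b = boxes[i];
		const x0 = Math.floor(b.left / cellSize), x1 = Math.floor(b.right / cellSize);
		const y0 = Math.floor(b.top / cellSize),  y1 = Math.floor(b.bottom / cellSize);
		for (let cx = x0; cx <= x1; ++cx)
			for (let cy = y0; cy <= y1; ++cy)
			{
				const key = cx + "," + cy;
				const list = grid.get(key);
				if (list)
					list.push(i);
				else
					grid.set(key, [i]);
			}
	}

	// Only test pairs that share at least one cell; skip pairs already
	// tested via another cell so each pair is counted once.
	const tested = new Set<number>();
	let pairs = 0;
	for (const list of grid.values())
		for (let a = 0; a < list.length; ++a)
			for (let c = a + 1; c < list.length; ++c)
			{
				const i = Math.min(list[a], list[c]);
				const j = Math.max(list[a], list[c]);
				const pairKey = i * boxes.length + j;
				if (tested.has(pairKey))
					continue;
				tested.add(pairKey);
				if (overlaps(boxes[i], boxes[j]))
					++pairs;
			}
	return pairs;
}
```

With everything in one viewport-sized cell this degenerates back to the brute-force case, but with small cells the candidate pairs are only those objects close enough to share a cell, which is where the speedup in the benchmark comes from.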
It remains to be seen if any complications come up, but I've added an action to set the collision cell size for the next release cycle, so it can then be experimented with and tuned for specific games.