I'm not sure why it works this way, but in r265 you can read a property called ZScale from a 3DCamera object. As far as I understand, it gives you the ratio between one "Z height unit" and one "width/height pixel".
I've only just started testing this, but it seems that if you want perfect cubes at all resolutions, you need to make 3DShape objects with equal width and height, then set their height at runtime to "3DShape.Width / 3DCamera.ZScale".
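
To make the arithmetic concrete, here is a rough sketch in Python. It is not engine code: the function name and the sample numbers are made up for illustration, and I'm assuming ZScale is always a positive number, which I haven't verified.

```python
def cube_height(shape_width: float, z_scale: float) -> float:
    """Height to assign at runtime so the shape looks like a cube.

    shape_width stands in for 3DShape.Width and z_scale for
    3DCamera.ZScale; both parameter names are hypothetical.
    """
    # Assumption: ZScale is positive. Zero or negative would make
    # the division meaningless, so reject it up front.
    if z_scale <= 0:
        raise ValueError("z_scale must be positive")
    return shape_width / z_scale


# Example values are invented, not taken from the engine.
print(cube_height(64, 2.0))
```

Since ZScale seems to depend on the resolution, you'd want to redo this whenever the resolution changes rather than storing the result once.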