Your link about VRAM is very old (2010) and is pretty much out of date now. Almost all desktop computers support non-power-of-two textures, so those limitations no longer apply, at least in Construct 2's renderer. Even mobile devices support non-power-of-two textures except for tiled images, and in future they will support them for tiled textures too, making the "power of two" thing completely a thing of the past.
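To give a rough idea of what the old restriction used to cost, here's a quick sketch in Python. It is not Construct 2 code: the function names are made up for illustration, and it assumes an uncompressed 32-bit RGBA texture (4 bytes per pixel), which may not match what a given device actually does.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def texture_bytes(width: int, height: int,
                  pad_to_pot: bool = False,
                  bytes_per_pixel: int = 4) -> int:
    """Estimate memory for one texture.

    Assumption: uncompressed RGBA, 4 bytes per pixel by default.
    With pad_to_pot=True each side is rounded up to a power of two,
    which is what the old restriction effectively required.
    """
    if pad_to_pot:
        width = next_power_of_two(width)
        height = next_power_of_two(height)
    return width * height * bytes_per_pixel


# A 257x257 image is the worst kind of case: one pixel over 256.
exact = texture_bytes(257, 257)                    # stored at its real size
padded = texture_bytes(257, 257, pad_to_pot=True)  # rounded up to 512x512
print(exact, padded)
```

With padding, that 257x257 image takes up as much memory as a 512x512 one, which is nearly four times what it needs. Without the power-of-two restriction it only uses memory for its actual size.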
Music is streamed, so it should occupy very little RAM. Sound effect files generally sit in memory, so they will use as much RAM as their file size. Generally audio is not a concern, since images usually take up far more memory.