Diablo II doesn't even have to use VRAM. Its minimum requirements are only 32MB of REGULAR RAM. If Diablo II can do that with 32MB of RAM, well, most people here are making games that take more than that in VRAM already...
Like newt said, the Diablo 2 engine is not the Construct runtime. Old isometric games that use tons of image data have a number of optimizations that make them much less intensive than they would be on the Construct runtime. They use color palettes to save space, and who knows what else... perhaps they load images into RAM and draw them to the screen in a special way that works for that one game, but would be inefficient for an all-encompassing runtime like Construct. Once, for fun, I loaded all 500+ frames that a single infantry unit from Red Alert 2 uses. It took forever to load the preview. We're talking like 30 seconds for one unit's frames.
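To put rough numbers on the palette point, here is a quick back-of-the-envelope sketch. The 64x64 frame size, the 256-color palette, and the assumption that the runtime stores every frame as uncompressed 32-bit RGBA are all guesses for illustration, not measured values from Red Alert 2 or Construct, and the function names are made up.

```python
def rgba_bytes(width, height, frames):
    # Uncompressed 32-bit texture: 4 bytes (R, G, B, A) per pixel.
    return width * height * 4 * frames


def paletted_bytes(width, height, frames, palette_colors=256):
    # 8-bit indexed image: 1 byte per pixel, plus one shared
    # palette of RGB triples (3 bytes per color).
    return width * height * frames + palette_colors * 3


# Hypothetical unit: 500 frames at 64x64 pixels (assumed size).
frames, w, h = 500, 64, 64

rgba = rgba_bytes(w, h, frames)
paletted = paletted_bytes(w, h, frames)

print(f"32-bit RGBA: {rgba / 1_000_000:.2f} MB")
print(f"8-bit paletted: {paletted / 1_000_000:.2f} MB")
print(f"ratio: {rgba / paletted:.1f}x")
```

Under those assumptions the same unit costs roughly four times as much as full-color textures, and that is before any padding to power-of-two texture sizes that some hardware requires.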
So you're saying Construct is not good enough to do this well?
Construct is not the optimal solution for making an image-heavy isometric game. Still, it's probably the easiest and best way to go about making one, if you're careful with texture space.