Lots of browsers already have GPU-accelerated video decoding.
I don't think 8 frames is enough to get much benefit from video. The memory usage may or may not be lower, but encoding 8 frames as video is likely to have a significant quality impact, since it will probably be just one keyframe followed by seven delta frames. Video is also designed for long-running media, and if the clip is short it may end up just decoding everything fully into memory anyway. It could be worth experimenting with though.
Note your spritesheets will be a lot more space efficient if you use images just under a power-of-two size, e.g. 500x500. Spritesheets are power-of-two sized, and each image gets an extra pixel of transparent border around it, so a 512x512 tile needs slightly more than 512 pixels each way and four of them no longer fit in 2048. That means you can only fit 3 tiles of 512x512 across a 2048x2048 spritesheet, not 4.
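To make the arithmetic concrete, here is a rough Python sketch of the packing maths. It assumes the border is exactly 1 pixel on every side of each image and that tiles are laid out in a simple grid; the function name `tiles_across` is just made up for illustration, and the real packer may well behave differently.

```python
def tiles_across(sheet_size: int, tile_size: int, border: int = 1) -> int:
    """Number of tiles that fit along one axis of a square spritesheet.

    Assumption: each tile is padded by `border` transparent pixels on
    both sides, and tiles are packed in a plain grid.
    """
    padded = tile_size + 2 * border  # tile plus its border on both sides
    return sheet_size // padded


if __name__ == "__main__":
    for size in (512, 500):
        per_row = tiles_across(2048, size)
        print(f"{size}x{size}: {per_row} across, {per_row ** 2} per sheet")
    # 512x512: 3 across, 9 per sheet
    # 500x500: 4 across, 16 per sheet
```

Under those assumptions, dropping from 512x512 to 500x500 takes you from 9 to 16 images per 2048x2048 sheet.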