For an environment that big, I would indeed recommend using tilemaps, if possible, to build the look of the environment. It is less troublesome, and you also benefit from the fact that they have "automatic render cells" applied (technically based on collision cells, but it means that tilemaps can scale pretty well to very large sizes relative to the window size).
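To give an idea of why that scales, here is a rough Python sketch of the general principle: only the tiles overlapping the window get looked at, so the cost follows the window size and not the map size. This is not the engine's actual implementation; the function name, parameters and sizes are all made up for illustration.

```python
def visible_tile_range(cam_x, cam_y, view_w, view_h, tile_size, map_cols, map_rows):
    """Return (first_col, first_row, last_col, last_row) of the tiles on screen.

    All arguments are integers in pixels, except map_cols/map_rows (tile counts).
    Hypothetical helper, for illustration only.
    """
    # Tile containing the top-left pixel of the window, clamped to the map.
    first_col = max(0, cam_x // tile_size)
    first_row = max(0, cam_y // tile_size)
    # Tile containing the bottom-right pixel of the window, clamped to the map.
    last_col = min(map_cols - 1, (cam_x + view_w - 1) // tile_size)
    last_row = min(map_rows - 1, (cam_y + view_h - 1) // tile_size)
    return first_col, first_row, last_col, last_row


def visible_tile_count(tile_range):
    """Number of tiles covered by a range returned by visible_tile_range."""
    first_col, first_row, last_col, last_row = tile_range
    return (last_col - first_col + 1) * (last_row - first_row + 1)
```

With an assumed 256x176 window and 16px tiles, that is at most 17 x 12 tiles per frame, whether the map is 100 or 10,000 tiles wide.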
In a Zelda game, IIRC, the enemies are created when you arrive on a screen, so you should not have many issues with them, as they should only be present in the same part of the world as you. In fact, in most older games this "what is on screen exists, what is not doesn't" rule applies, which means the system has far fewer things to handle.
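A minimal sketch of that "only the current screen exists" idea, in Python for readability. Everything here is an assumption made for illustration (the screen size, the spawn table layout, the class and method names); it is not how any particular Zelda game or engine does it, just one simple way to get the same effect.

```python
SCREEN_W, SCREEN_H = 256, 176  # assumed size of one screen, in pixels


class ScreenWorld:
    """Keeps enemies alive only for the screen the player is currently on."""

    def __init__(self, spawns_by_screen):
        # Maps (screen_col, screen_row) -> list of (kind, x, y) spawn points.
        self.spawns_by_screen = spawns_by_screen
        self.current_screen = None
        self.active_enemies = []

    @staticmethod
    def screen_of(x, y):
        """Screen coordinates containing the world position (x, y)."""
        return (x // SCREEN_W, y // SCREEN_H)

    def update(self, player_x, player_y):
        """Call once per tick; swaps the enemy list when the player changes screen."""
        screen = self.screen_of(player_x, player_y)
        if screen == self.current_screen:
            return  # same screen: nothing to create or destroy
        # Leaving a screen drops its enemies; entering one creates fresh ones.
        self.current_screen = screen
        self.active_enemies = [
            {"kind": kind, "x": x, "y": y}
            for kind, x, y in self.spawns_by_screen.get(screen, [])
        ]
```

However many enemies the whole world contains, only the handful in `active_enemies` ever need movement, collision checks or drawing.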
I would say it is very doable, technically speaking. Of course, it takes work and effort to do correctly, as most of the interactions are not handed to us as premade behaviors in that case, but if you know Zelda well, I would say you should be fine recreating how it basically works.