1.) Oh right - I was thinking about character dialogues in a narrative game and issues related to dynamic text box sizing, line wrapping, etc. But some of those issues are probably relevant to what you're describing too.
2.) Your original question asked about gestures on mobile, but your second comment mentioned mouse coords...? Either way, I recognise swipes/gestures using mousedown/mousereleased events, whereas button presses are detected through mouseclick (or the equivalent touch events), so I don't see how you would get unwanted events.
3.) No - basically everything has got to be contained in the canvas.
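
To illustrate point 2, here is a rough sketch of one way to keep swipes and button presses apart: record the position on press, compare it with the position on release, and only treat the result as a tap if the pointer barely moved. This is not my actual code. `Point`, `Gesture`, `classifyGesture` and the 30-pixel threshold are all made-up names and values for the example, and you would need to tune the threshold for your own game.

```typescript
// Hypothetical sketch: all names and the threshold below are assumptions.
interface Point {
  x: number;
  y: number;
}

type Gesture = "tap" | "swipe-left" | "swipe-right" | "swipe-up" | "swipe-down";

// Minimum travel (in canvas pixels) before a press/release pair counts as a swipe.
const SWIPE_MIN_DISTANCE = 30;

// Classify a press/release pair by how far and in which direction the pointer moved.
function classifyGesture(down: Point, up: Point): Gesture {
  const dx = up.x - down.x;
  const dy = up.y - down.y;

  // Small movement: treat it as a tap, so a button under the pointer may fire.
  if (Math.hypot(dx, dy) < SWIPE_MIN_DISTANCE) {
    return "tap";
  }

  // Otherwise pick the dominant axis (canvas y grows downwards).
  if (Math.abs(dx) >= Math.abs(dy)) {
    return dx > 0 ? "swipe-right" : "swipe-left";
  }
  return dy > 0 ? "swipe-down" : "swipe-up";
}
```

In the press handler you would store the pointer position, and in the release handler call `classifyGesture` with the stored and current positions. A button press would then only be triggered when the result is `"tap"`, which should stop a swipe that happens to start over a button from also activating it.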