I know a bit about this, so I guess I'll explain.
To draw a frame you need to calculate it first. So:
input -> framebuffer.
Input is where the game picks up your commands, and the framebuffer is what's shown on the screen. That's one frame of lag, and it's the minimum you can get; there's no way around it. Think of this as a single-buffered game with an unlimited framerate.
This isn't so hot, because you're drawing directly to the screen. If your drawing code takes longer than a screen refresh, the scene might be half-drawn when presented.
input -> backbuffer -> framebuffer
So this is double buffering, where you can hold one frame in a backbuffer and draw to that one, and when it's done you copy that one to the framebuffer. This means your commands will show up two frames later.
input -> backbufferA -> backbufferB -> framebuffer
And of course, triple buffering is the same idea, just bigger.
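If it helps, here's a toy model of the three chains above. It's only a sketch of the way I'm counting frames, not how any real driver works: the function name `frames_of_lag` and the "everything moves one slot per frame" rule are my own simplifications.

```python
def frames_of_lag(num_backbuffers):
    """Count frames between picking up an input and seeing it on screen.

    Assumption (toy model): every frame, each buffer's contents move one
    slot closer to the framebuffer, and the new frame is drawn into slot 0.
    """
    # Slots from "where drawing lands" to "what the monitor shows".
    # The last slot is always the framebuffer.
    slots = [None] * (num_backbuffers + 1)
    pending = "input"  # the command waiting to be drawn
    frames = 0
    while slots[-1] != "input":
        # Draw the new frame into slot 0 and push the rest along.
        slots = [pending] + slots[:-1]
        pending = None
        frames += 1
    return frames


for name, backbuffers in [("single", 0), ("double", 1), ("triple", 2)]:
    print(name, "buffering:", frames_of_lag(backbuffers), "frame(s) of lag")
```

So each extra backbuffer costs one more frame before your command is visible.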
All of these were with unlimited frames per second (fps).
Now what happens with VSync?
Let's see the most common double-buffer scenario:
input -> backbuffer -> VSYNC -> framebuffer
The backbuffer is copied to the framebuffer only on a VSync, which happens 60 or whatever times per second. So now your input is delayed by one frame plus the wait until the next VSync.
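To put a number on that wait, here's a rough sketch. It assumes VSyncs land at exact multiples of the refresh period starting from time zero, which is a simplification, and `wait_for_vsync_ms` is just a name I made up.

```python
import math


def wait_for_vsync_ms(finished_at_ms, refresh_hz=60.0):
    """How long a finished backbuffer sits around before the next VSync.

    Assumption: VSyncs happen at t = 0, period, 2 * period, ...
    """
    period_ms = 1000.0 / refresh_hz
    # First VSync at or after the moment the frame was finished.
    next_vsync_ms = math.ceil(finished_at_ms / period_ms) * period_ms
    return next_vsync_ms - finished_at_ms


# A frame finished 10 ms in, on a 50 Hz display (20 ms between VSyncs),
# waits another 10 ms before it's copied to the framebuffer.
print(wait_for_vsync_ms(10.0, refresh_hz=50.0))
```

The wait is anywhere from zero (you finished right on a VSync) up to almost a full refresh period (you just missed one).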
With triple buffering.... yeah
input -> backbufferA -> backbufferB -> VSYNC -> framebuffer
Your input is delayed by two frames plus the wait for one VSync.
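Here's the same counting turned into milliseconds, as a back-of-the-envelope sketch. It follows the way I counted above (one frame per backbuffer, plus a VSync wait of anywhere from zero to one refresh period); `lag_range_ms` and its parameters are made up for the example, and real numbers depend on the driver and the game.

```python
def lag_range_ms(num_backbuffers, frame_ms, refresh_hz=60.0):
    """Best and worst case lag with VSync on, under the counting used above.

    Assumption: each backbuffer costs one frame of `frame_ms`, and the VSync
    wait is between 0 and one full refresh period.
    """
    period_ms = 1000.0 / refresh_hz
    best = num_backbuffers * frame_ms
    worst = best + period_ms
    return best, worst


# 10 ms frames on a 50 Hz display (20 ms refresh period):
print("double buffering:", lag_range_ms(1, 10.0, refresh_hz=50.0))
print("triple buffering:", lag_range_ms(2, 10.0, refresh_hz=50.0))
```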
Now, most video cards have a hardware mouse cursor, which is handled in parallel with the rest of the screen drawing and overlaid directly on top of the framebuffer. That's why it responds so well.
I hope this clears things up. To sum up:
The fastest cursor will ALWAYS be the OS's cursor, followed by the 2-frame delay that's the minimum for DirectX, with no VSync and double buffering.