It sounds to me like you're missing local input prediction. The pong tutorial doesn't cover it specifically, although it is implemented there to a degree. On the peer side, you simply need to move the peer's own paddle to the correct location locally, as soon as the input happens, so that sync only corrects the paddle when there is a discrepancy (the jitter).
If you rely only on the sync action to position your paddle, it will not look smooth, because the paddle only moves when an update arrives over the network. Local input prediction is described in more depth in the fourth multiplayer tutorial.
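
To make the idea concrete, here is a rough, engine-agnostic sketch in Python. It is not the tutorial's code or any engine's API: the class name `PredictedPaddle`, its methods, and the numbers for speed, tolerance and correction are all made up for illustration. The paddle moves straight away from local input, and a synced value from the host only nudges it when the two disagree by more than a small tolerance.

```python
from dataclasses import dataclass


@dataclass
class PredictedPaddle:
    """Hypothetical peer-side paddle; all names and values are illustrative."""

    y: float = 0.0
    speed: float = 300.0      # pixels per second (assumed value)
    tolerance: float = 4.0    # discrepancies at or below this are ignored (assumed)
    correction: float = 0.25  # fraction of the error removed per sync (assumed)

    def apply_local_input(self, direction: int, dt: float) -> None:
        """Move immediately from local input; direction is -1, 0 or +1."""
        self.y += direction * self.speed * dt

    def on_sync(self, host_y: float) -> None:
        """Ease towards the host's value only when there is a real discrepancy."""
        error = host_y - self.y
        if abs(error) > self.tolerance:
            self.y += error * self.correction


paddle = PredictedPaddle()
paddle.apply_local_input(1, 0.5)  # local input moves the paddle right away
paddle.on_sync(151.0)             # within tolerance, so nothing changes
paddle.on_sync(190.0)             # real discrepancy, so the paddle is eased towards it
print(paddle.y)
```

Easing towards the host's value instead of snapping to it is what hides the correction. Note that this sketch ignores latency: the value arriving from the host is always slightly old, so a real implementation would also have to account for that delay before deciding that the two positions disagree.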