For a Simon game, you have to think at the tick level.
On the start tick of the turn, the computer generates the sequence.
Then, on the following ticks, it plays back the sequence. During those ticks, the user must not input anything (or you simply ignore any input).
Then for the following ticks, it will wait for the player's input.
On the tick the player's input is complete, you compare the entered sequence to the computed sequence.
If the two match, you go to the next turn and compute a new sequence.
If they don't match, you reset the entered sequence and let the user enter a new one.
It's not a matter of a wait action here, but of tracking the state/status: which step of the behaviour your logic is currently in.
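To make the state idea concrete, here is a rough sketch in Python rather than Construct events. It is not what the capx does. The names (State, Simon, tick, pressed) are made up for illustration, and I'm assuming the sequence grows by one each turn and that playback shows one step per tick to keep it short. In a real game each playback step would last many ticks.

```python
import random
from enum import Enum, auto


class State(Enum):
    COMPUTE = auto()   # start tick of the turn: build the sequence
    PLAYBACK = auto()  # showing the sequence, player input is ignored
    INPUT = auto()     # waiting for the player's input


class Simon:
    def __init__(self, colors=4, seed=None):
        self.rng = random.Random(seed)
        self.colors = colors
        self.length = 1          # assumption: one more step each turn
        self.sequence = []
        self.entered = []
        self.play_index = 0
        self.state = State.COMPUTE

    def tick(self, pressed=None):
        """Advance one tick. `pressed` is the button pressed this tick, or None.

        Returns the color shown this tick during playback, otherwise None.
        """
        if self.state is State.COMPUTE:
            self.sequence = [self.rng.randrange(self.colors)
                             for _ in range(self.length)]
            self.entered = []
            self.play_index = 0
            self.state = State.PLAYBACK
            return None

        if self.state is State.PLAYBACK:
            # `pressed` is deliberately not looked at in this state.
            shown = self.sequence[self.play_index]
            self.play_index += 1
            if self.play_index == len(self.sequence):
                self.state = State.INPUT
            return shown

        # State.INPUT: collect presses until the input is complete.
        if pressed is None:
            return None
        self.entered.append(pressed)
        if len(self.entered) == len(self.sequence):
            if self.entered == self.sequence:
                self.length += 1
                self.state = State.COMPUTE  # next turn, new sequence
            else:
                self.entered = []           # reset, let the player retry
        return None
```

You call tick() once per frame and pass whatever button was pressed that frame. There is no wait anywhere; the state variable alone decides what a given tick does.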
A capx to illustrate (not flawless, but should help with the methodology)