The number of neurons in the input layer can be set to the length of the longest sequence you expect. All shorter sequences can then simply be padded to that same length.
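As a rough illustration of the padding step, here is a minimal sketch in plain Python. The helper name `pad_sequences`, the padding value `0`, and the sample data are assumptions made for the example, not something from your setup; padding on the right is also just one possible choice.

```python
def pad_sequences(sequences, pad_value=0):
    """Right-pad every sequence with pad_value up to the longest length."""
    # Length of the longest sequence (0 if there are no sequences at all).
    max_len = max((len(seq) for seq in sequences), default=0)
    return [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]


# Hypothetical integer-encoded sequences of different lengths.
sequences = [[3, 7, 1], [4, 2], [9, 8, 6, 5, 1]]
padded = pad_sequences(sequences)

# Every row now has the same length, which gives the input layer size.
input_size = len(padded[0])
```

The padding value should be one that does not otherwise occur in your data, so the padded positions can be told apart from real inputs.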
As for the number of hidden layers, the general rule in deep learning is that the more hidden layers you use, the more complex the model becomes. Very complex models tend to overfit, so you need to find a balance between giving the model enough capacity and avoiding overfitting. It is also worth noting that the more training data you have, the less likely the model is to overfit, even if you use many hidden layers.
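To get a feel for how quickly complexity grows with depth, you can count the trainable parameters of a fully connected network. This is only a sketch: the function name `count_parameters` and the layer sizes are made up for the example, and parameter count is just a rough proxy for model complexity, not a direct measure of overfitting.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the number of neurons per layer, input first,
    e.g. [5, 16, 1] means 5 inputs, one hidden layer of 16, 1 output.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical architectures with the same input and output sizes.
shallow = count_parameters([5, 16, 1])          # one hidden layer
deep = count_parameters([5, 16, 16, 16, 1])     # three hidden layers
```

With these assumed sizes the three-layer network has several times as many parameters as the one-layer network, so it needs correspondingly more training data to be constrained well.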