Here are some links showing what statistical NLP can do today.
https://en.wikipedia.org/wiki/Language_model
The book Supervised Machine Learning for Text Analysis in R has a lot of good material on topics we will not discuss, such as the importance of tokenizing, stop words, stemming, and bias in data. It also gives a good account of word embeddings.
If you want to see the math background of the language model of
For the deep learning side of things, listen to this podcast with Ilya Sutskever.