Markov models do not extrapolate. If we carry intuitions from HMMs over to LLMs, we are prone to see nothing but "stochastic parrots".
Anyway, for anyone who wants to go deeper, I recommend Lawrence R. Rabiner, "A tutorial on Hidden Markov Models and selected applications in speech recognition", https://www.cs.cmu.edu/~cga/behavior/rabiner1.pdf
And for a cute example, there is one in none other than Claude Shannon, "A Mathematical Theory of Communication", https://web.archive.org/web/20121108191018/https://www.alcat...
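To make the "do not extrapolate" point concrete, here is a minimal sketch of a word-level Markov chain, loosely in the spirit of the n-gram approximations to English in Shannon's paper. The toy corpus and the names train_bigram_chain and generate are my own assumptions for illustration, not taken from either reference.

```python
import random
from collections import defaultdict


def train_bigram_chain(words):
    """Map each word to the list of words that followed it in the corpus."""
    transitions = defaultdict(list)
    for current, following in zip(words, words[1:]):
        transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, seed=0):
    """Walk the chain from `start`, sampling each next word from observed successors."""
    rng = random.Random(seed)
    output = [start]
    while len(output) < length:
        candidates = transitions.get(output[-1])
        if not candidates:
            # Dead end: the last word never had a successor in the corpus.
            break
        output.append(rng.choice(candidates))
    return output


# Toy corpus (an assumption, chosen only to keep the example small).
corpus = "the cat sat on the mat and the dog sat on the rug".split()
chain = train_bigram_chain(corpus)
sample = generate(chain, "the", 12)

# Every adjacent word pair in the sample was already present in the corpus.
seen_pairs = set(zip(corpus, corpus[1:]))
print(" ".join(sample))
print(all(pair in seen_pairs for pair in zip(sample, sample[1:])))
```

Whatever the random seed, the chain can only recombine word pairs it has already observed. It may produce a sentence that never appeared in the corpus, but never a transition that did not.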