
I'm pretty sure that the reinforcement learning algorithm they are using is guaranteed to converge. It just takes a very long time to train, and using human games probably sped it up.


As far as I know, using neural networks for function approximation destroys the convergence guarantees that hold in the tabular case. NNs can easily diverge and suffer from catastrophic forgetting. That is one of the things that made them challenging to use in RL applications despite their power, and it is why one needs patches like experience replay and periodically frozen target networks.
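
To make the divergence point concrete, here is a minimal sketch of a well-known toy construction, not anything from the system being discussed. It uses a linear approximator rather than a neural network, which is the simpler case, and it already blows up. Assumptions (all mine): two states A and B with features 1 and 2, so v(A) = w and v(B) = 2w, a single transition A -> B with reward 0, and the semi-gradient TD(0) update applied only to that transition (an off-policy update distribution). The true values are all zero, so w = 0 is the right answer. The function name and default parameters are hypothetical.

```python
def semi_gradient_td(w0=1.0, alpha=0.1, gamma=0.99, steps=100):
    """Repeatedly apply the TD(0) update for the single transition A -> B.

    Returns the list of weight values, starting with w0.
    """
    phi_a, phi_b = 1.0, 2.0  # assumed features: v(A) = w, v(B) = 2w
    reward = 0.0             # every reward is zero, so the true values are zero
    w = w0
    history = [w]
    for _ in range(steps):
        # TD error: r + gamma * v(B) - v(A)
        td_error = reward + gamma * phi_b * w - phi_a * w
        # Semi-gradient step: the gradient of v(A) with respect to w is phi_a
        w += alpha * td_error * phi_a
        history.append(w)
    return history


if __name__ == "__main__":
    diverging = semi_gradient_td(gamma=0.99)
    shrinking = semi_gradient_td(gamma=0.4)
    print("gamma=0.99: w goes from", diverging[0], "to", diverging[-1])
    print("gamma=0.40: w goes from", shrinking[0], "to", shrinking[-1])
```

Each step multiplies w by 1 + alpha * (2 * gamma - 1), so for any gamma above 0.5 the weight grows without bound instead of settling at zero, and for gamma below 0.5 it shrinks toward zero. Updating B as well, with the frequency the policy would actually visit it, is what keeps the on-policy case stable.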



