I'm pretty sure that the reinforcement learning algorithm they are using is guar...

aab0 · on March 12, 2016

As far as I know, using neural networks for function approximation destroys the various convergence guarantees that would otherwise be available. NNs can easily diverge and suffer from catastrophic forgetting. This is one of the things that has made them challenging to use in RL applications despite their power, and it is why one needs patches like experience replay and freezing the target network.
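
To make the two patches concrete, here is a rough sketch of how I understand them, not anyone's actual implementation. The names `ReplayBuffer`, `td_update` and `sync_target` are made up for illustration, and a plain table of Q-values (`online[state][action]`) stands in for the network so the example stays dependency-free. Experience replay samples past transitions at random to break up correlated updates; the frozen copy supplies the bootstrap target so the thing being chased does not move on every step.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        # transition = (state, action, reward, next_state, done)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_update(online, target, batch, lr=0.1, gamma=0.9):
    """One Q-learning step per transition, bootstrapping from the frozen copy.

    `online` and `target` are lists of per-state action-value lists; this
    table is an assumed stand-in for a neural network.
    """
    for state, action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target[next_state])
        td_error = reward + bootstrap - online[state][action]
        online[state][action] += lr * td_error


def sync_target(online):
    """Return a frozen snapshot of the online values, refreshed only occasionally."""
    return copy.deepcopy(online)
```

In a training loop you would push each transition into the buffer, call `td_update` on a sampled batch, and call `sync_target` only every so many steps. None of this restores a convergence proof; it just tends to make divergence less likely in practice.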