How does RL relate to Neuro-Dynamic Programming?
To a first approximation, Reinforcement Learning and Neuro-Dynamic Programming are synonymous. The name “reinforcement learning” came from psychology (although psychologists rarely use exactly this term) and dates back to the early days of cybernetics. For example, Marvin Minsky used this term in his 1954 thesis, and Barto and Sutton revived it in the early 1980s. The name “neuro-dynamic programming” was coined by Bertsekas and Tsitsiklis in 1996 to capture the idea of the field as a combination of neural networks and dynamic programming. In fact, neither name is very descriptive of the subject, and I recommend you use neither when you want to be technically precise. Names such as these are useful when referring to a general body of research, but not for carefully distinguishing ideas from one another. In that sense, there is no point in trying to draw a careful distinction between the referents of these two names. The problem with “reinforcement learning” is that it is dated. Much of t