How does RL relate to the psychology of animal behavior?
Broadly speaking, RL works as a pretty good model of instrumental learning, though a detailed argument for this has never been made publicly (the closest is probably Barto, Sutton, and Watkins, 1990). On the other hand, the links between classical (or Pavlovian) conditioning and temporal-difference (TD) learning (one of the central elements of RL) are close and widely acknowledged (see Sutton and Barto, 1990). Ron Sun has developed hybrid models combining high-level and low-level skill learning, based in part on RL, which make contact with psychological data (see Sun, Merrill, and Peterson, 2001).
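To give a rough feel for the TD and classical conditioning link, here is a minimal sketch of tabular TD(0) prediction on a repeated conditioning trial. It is an illustration only, not the model from any of the papers cited above. The setup is assumed: each trial starts at the onset of a conditioned stimulus (CS), there is one state per time step since CS onset, and an unconditioned stimulus (US) is treated as a reward of 1 on the last step. The function name `simulate_conditioning` and all parameter values are hypothetical.

```python
def simulate_conditioning(n_trials=500, n_steps=5, alpha=0.1, gamma=0.9):
    """Tabular TD(0) prediction over repeated CS -> US trials (illustrative).

    Assumed setup: state t means "t steps since CS onset"; the US arrives as
    a reward of 1.0 on the final step, after which the trial ends.
    """
    values = [0.0] * n_steps  # predicted discounted US for each time step
    for _ in range(n_trials):
        for t in range(n_steps):
            reward = 1.0 if t == n_steps - 1 else 0.0
            # The value after the last step is 0 because the trial has ended.
            next_value = values[t + 1] if t + 1 < n_steps else 0.0
            td_error = reward + gamma * next_value - values[t]
            values[t] += alpha * td_error
    return values


if __name__ == "__main__":
    for t, v in enumerate(simulate_conditioning()):
        print(f"steps since CS onset = {t}: predicted value = {v:.3f}")
```

Under these assumptions the prediction at CS onset starts at zero and grows over trials, and the learned values rise as the US approaches. This is loosely the kind of anticipatory behavior that TD accounts of conditioning aim to capture.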