What advantages does RL offer in Operations Research problems?
With function approximation, RL can be applied to much larger state spaces than classical sequential optimization techniques such as dynamic programming, which must store and update a value for every state. In addition, because RL learns from simulated (sampled) transitions, it can be applied to systems whose next-state transition probabilities are too numerous or too complicated to enumerate explicitly.
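
As a minimal sketch of the sampling point, the example below runs tabular Q-learning on a toy inventory problem. Everything in it is an assumption made for illustration: the capacity, prices, uniform demand, and the names `simulate_step`, `q_learning`, and `greedy_policy` are hypothetical. The learner only ever calls the simulator to draw one transition at a time and never builds a transition-probability matrix. For a larger problem, the Q-table could in principle be replaced by a function approximator, which is not shown here.

```python
import random

# Illustrative problem sizes and economics (assumptions, not from any real system).
MAX_INV = 10      # warehouse capacity
MAX_ORDER = 5     # largest order per period
PRICE, COST, HOLDING = 4.0, 2.0, 0.5


def simulate_step(inventory, order, rng):
    """Sample one period from the simulator; returns (next_inventory, reward)."""
    received = min(order, MAX_INV - inventory)   # cannot exceed capacity
    stock = inventory + received
    demand = rng.randint(0, 6)                   # sampled, never enumerated
    sold = min(stock, demand)
    next_inventory = stock - sold
    reward = PRICE * sold - COST * received - HOLDING * next_inventory
    return next_inventory, reward


def q_learning(steps=50_000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning that uses only sampled transitions."""
    rng = random.Random(seed)
    q = [[0.0] * (MAX_ORDER + 1) for _ in range(MAX_INV + 1)]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randint(0, MAX_ORDER)
        else:
            action = max(range(MAX_ORDER + 1), key=lambda a: q[state][a])
        next_state, reward = simulate_step(state, action, rng)
        # One-step Q-learning update toward the sampled target.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Order quantity with the highest learned value for each inventory level."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    policy = greedy_policy(q_learning())
    for level, order in enumerate(policy):
        print(f"inventory={level:2d} -> order {order}")
```

Dynamic programming on the same problem would need the full distribution over next inventory levels for every (inventory, order) pair. Here the only requirement is that `simulate_step` can be called, which is why the approach still applies when the dynamics are available only as a simulation.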