To what mathematical concept pioneered by Richard Bellman does RL owe its structural foundation for sequential decisions?

Answer

Dynamic programming

The formal structure underpinning reinforcement learning, particularly for optimization problems involving sequences of decisions, derives from dynamic programming, a field pioneered by Richard Bellman. Dynamic programming supplies the theory for solving optimization problems in which later decisions depend on the outcomes of earlier ones. Modern RL agents often learn by exploration without a complete model of the environment, unlike classic DP applications, but Bellman's work established the framework for mathematically defining and solving sequential decision-making problems, which is central to the agent-environment interaction loop in RL.
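
To make the idea concrete, here is a minimal sketch of value iteration, a classic dynamic programming method built on Bellman's optimality equation. The four-state chain, the reward of 1 for reaching the last state, the discount factor of 0.9, and the names `step` and `value_iteration` are all illustrative assumptions, not part of the original answer. The sketch relies on a fully known transition model, which is the classic DP setting rather than model-free RL.

```python
# Hypothetical toy problem: a chain of states 0..3, where state 3 is terminal.
# Actions: 0 = move left, 1 = move right. Entering state 3 yields reward 1.
GAMMA = 0.9          # assumed discount factor
N_STATES = 4
ACTIONS = (0, 1)
TERMINAL = 3


def step(state, action):
    """Known, deterministic model: return (next_state, reward)."""
    if action == 0:
        next_state = max(state - 1, 0)
    else:
        next_state = min(state + 1, N_STATES - 1)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-9):
    """Apply the Bellman optimality backup until values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # terminal state keeps value 0
            # Bellman backup: best immediate reward plus discounted future value.
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


if __name__ == "__main__":
    print(value_iteration())
```

Under these assumptions, the values for states 0 through 2 come out to approximately 0.81, 0.9, and 1.0: each step further from the goal is discounted once more, showing how the value of an earlier decision depends on the decisions that follow it.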
