Skip to main content

IBM Israel Research Seminars

 

In Reinforcement Learning, an agent (or software) is faced an unknown environment (or input distributions) and tries to maximize its long term return (or overall performance) by performing actions and receiving rewards. The challenge is to understand how a current action will affect not only the immediate reward, but mainly the future rewards. The dominating approach in Reinforcement Learning to model this task is Markov Decision Process (MDP).

In this talk I will overview the fundamentals of MDP and Reinforcement Learning, and will give special emphasis on recent research ideas regarding large state and action spaces.