Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is the machine learning paradigm concerned with inferring the latent reward function of an agent from its observed behavior. Formally, given a Markov Decision Process (MDP) without a specified reward signal and a set of expert demonstrations (state-action trajectories), IRL seeks to recover the underlying reward function that the expert is assumed to be optimizing. This effectively inverts the standard reinforcement learning problem: rather than deriving a policy from a known reward, it derives the reward structure that best explains the observed policy.
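To make the inversion concrete, the sketch below illustrates one classical instance of this idea, Maximum Entropy IRL in the spirit of Ziebart et al. (2008), on a toy gridworld. The environment, the one-hot feature map, the horizon, and the expert trajectories are all illustrative assumptions introduced here, not details from the papers listed in the table; the point is only to show the loop of "fit a reward, compute its soft-optimal policy, compare feature expectations against the demonstrations."

```python
# A minimal, illustrative sketch of Maximum Entropy IRL on a toy 4x4 gridworld.
# All environment details, hyperparameters, and demonstrations are assumptions
# made for the example, not taken from the cited papers.
import numpy as np

GRID = 4                      # 4x4 gridworld
N_STATES = GRID * GRID
N_ACTIONS = 4                 # up, down, left, right
GAMMA = 0.95
HORIZON = 15

def transition(s, a):
    """Deterministic next state for a grid move; walls keep the agent in place."""
    r, c = divmod(s, GRID)
    if a == 0: r = max(r - 1, 0)
    elif a == 1: r = min(r + 1, GRID - 1)
    elif a == 2: c = max(c - 1, 0)
    else: c = min(c + 1, GRID - 1)
    return r * GRID + c

# One-hot state features, so the learned reward is simply one value per state.
PHI = np.eye(N_STATES)

def soft_value_iteration(reward, n_iters=100):
    """Soft (maximum-entropy) Bellman backups; returns a stochastic policy."""
    q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(n_iters):
        q_max = q.max(axis=1, keepdims=True)
        v = (q_max + np.log(np.exp(q - q_max).sum(axis=1, keepdims=True))).ravel()
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                q[s, a] = reward[s] + GAMMA * v[transition(s, a)]
    policy = np.exp(q - q.max(axis=1, keepdims=True))
    return policy / policy.sum(axis=1, keepdims=True)

def expected_state_visitation(policy, start_dist):
    """Forward pass: expected discounted state visitation under the policy."""
    d = start_dist.copy()
    total = np.zeros(N_STATES)
    for t in range(HORIZON):
        total += (GAMMA ** t) * d
        d_next = np.zeros(N_STATES)
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                d_next[transition(s, a)] += d[s] * policy[s, a]
        d = d_next
    return total

def maxent_irl(expert_trajs, lr=0.1, epochs=100):
    """Fit a linear reward theta so the soft-optimal policy's feature
    expectations match the expert's (gradient = expert counts - model counts)."""
    theta = np.zeros(N_STATES)
    start_dist = np.zeros(N_STATES)
    expert_counts = np.zeros(N_STATES)
    for traj in expert_trajs:
        start_dist[traj[0]] += 1.0
        for t, s in enumerate(traj):
            expert_counts += (GAMMA ** t) * PHI[s]
    start_dist /= len(expert_trajs)
    expert_counts /= len(expert_trajs)

    for _ in range(epochs):
        reward = PHI @ theta
        policy = soft_value_iteration(reward)
        model_counts = expected_state_visitation(policy, start_dist)
        theta += lr * (expert_counts - model_counts)   # ascend demo log-likelihood
    return PHI @ theta

# Hypothetical expert demonstrations that all head toward the bottom-right cell.
demos = [[0, 1, 2, 3, 7, 11, 15], [0, 4, 8, 12, 13, 14, 15]]
learned_reward = maxent_irl(demos)
print(learned_reward.reshape(GRID, GRID).round(2))
```

Under these assumptions the recovered reward concentrates on the states the demonstrations frequent (here the bottom-right goal cell), which is exactly the "reward that best explains the observed policy" described above. The table below surveys this and related methods alongside recent applications.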
| Title | Authors / Year | Theme | Paper Type |
|---|---|---|---|
| Maximum Entropy IRL | Ziebart et al. (2008) | Algorithm | Methodological Paper |
| Deep Maximum Entropy IRL | Wulfmeier et al. (2015) | Algorithm | Methodological Paper |
| Adversarial Inverse Reinforcement Learning | Fu et al. (2018) | Algorithm | Methodological Paper |
| Inverse soft-Q Learning for Imitation (Environment Free) | Garg et al. (2022) | Algorithm | Methodological Paper |
| Variational IRL (Environment Free) | Qureshi et al. (2019) | Algorithm | Methodological Paper |
| Multi-Agent Adversarial IRL | Yu et al. (2019) | Algorithmic Enhancement | Methodological Paper |
| Context-aware IRL | Liu et al. (2025) | Modeling human behavior using IRL | Application Paper |
| IRL for modeling reservoir operations | Giuliani and Castelletti (2024) | Modeling human behavior using IRL | Application Paper |
| Multiple Expert and Non-stationarity in IRL | Likmeta et al. (2021) | Modeling human behavior using IRL | Application Paper |
| Advances and Applications in IRL | Deshpande et al. (2025) | Algorithms and Application | Literature Review |