Research Track "Reinforcement Learning in Regulated Domains"

PhD Candidate: Floris den Hengst
Track leader: Frank van Harmelen

Reinforcement learning is a powerful learning paradigm that can be used to learn how to behave optimally from data. It has become increasingly popular over the past decades for various tasks, including personalization. However, enforcing safety requirements on the learned behaviors remains an open challenge. Strong safety guarantees are a prerequisite for adopting such systems in practice.

This research aims at (a) bridging the gap between regulators’ and agents’ representations of behaviour, and (b) reinforcement learning under the resulting constraints, with an Adaptive Personal Assistant as the target application. It touches on both practical and fundamental aspects of explainability and safety of AI. How can we formalize regulations so that domain experts can inspect, understand and validate the result? How can we bridge the gap between experts’ and RL agents’ representations of the world and of actions? How does constraining an RL agent affect its learning capabilities? In this research, we use an adaptive conversational agent for financial advice to investigate these questions.

This research is a collaboration with the Knowledge Representation and Reasoning group and the Computational Intelligence group at the Vrije Universiteit Amsterdam.

Selected Publications

  • Floris den Hengst, Vincent François-Lavet, Mark Hoogendoorn, Frank van Harmelen: Reinforcement Learning with Option Machines. IJCAI-ECAI (2022). pdf

  • Floris den Hengst, Vincent François-Lavet, Mark Hoogendoorn, Frank van Harmelen: Planning for potential: efficient safe reinforcement learning. Machine Learning, Springer (2022). doi

  • Floris den Hengst, Mark Hoogendoorn, Frank van Harmelen, Joost Bosman: Reinforcement Learning for Personalized Dialogue Management. IEEE/WIC/ACM International Conference on Web Intelligence: 59-76 (2020) doi

  • Mickey van Zeelt, Floris den Hengst, Seyyed Hadi Hashemi: Collecting High-Quality Dialogue User Satisfaction Ratings with Third-Party Annotators. Proceedings of the 2020 Conference on Human Information Interaction and Retrieval (2020). doi

  • Floris den Hengst, Eoin Grua, Ali el Hassouni, Mark Hoogendoorn: Reinforcement learning for personalization: A systematic literature review. Data Science: 1-4 (2020) doi
