Research Track "Concept Drift and ML-Ops"

PhD Candidate: Lorena Poenaru-Olaru
Track leaders: Jan Rellermeyer, Luís Cruz

Deploying ML applications brings a unique set of engineering challenges. If the underlying data changes after deployment, this may lead to concept drift, which can degrade a model's predictions. We explore the use of drift detectors to identify such situations, so that models are retrained only when needed and the (typically costly) training phase is kept to a minimum. Such drift detectors are employed while operating machine learning models, placing this work in the field of ML-Ops.
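To make the idea of gating retraining on a drift signal concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the detectors studied in this track: it compares a single numeric feature between a reference (training-time) sample and a production batch using a two-sample Kolmogorov-Smirnov statistic, and flags drift when the statistic exceeds the usual large-sample critical value. The function names, the 0.05 significance level, and the synthetic Gaussian data are all hypothetical choices made for the example. A change in the input distribution is only a proxy for concept drift, which strictly concerns the relation between inputs and labels.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    # Asymptotic critical value: c(alpha) * sqrt((n + m) / (n * m)),
    # with c(alpha) = sqrt(-ln(alpha / 2) / 2).
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Synthetic data (assumption for illustration only).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]  # seen at training time
same = [rng.gauss(0.0, 1.0) for _ in range(500)]       # production, unchanged
shifted = [rng.gauss(1.5, 1.0) for _ in range(500)]    # production, mean shifted

for name, batch in [("same", same), ("shifted", shifted)]:
    action = "retrain" if drift_detected(reference, batch) else "keep current model"
    print(name, "->", action)
```

In this sketch, the costly retraining step would run only for batches that the detector flags. With a 0.05 significance level, an unchanged batch can still occasionally be flagged, which is one reason the reliability of such detectors as alarming systems is worth studying.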

If models are re-trained and re-deployed frequently, full automation of all relevant deployment tasks is needed. This includes monitoring and optimization in production, as well as data quality validation. For models that require explainability, not only the outputs but also the revised explanations will need to be verified. In this track, we seek to address such challenges at ING scale.

Selected publications

  • Lorena Poenaru-Olaru, Luis Cruz, Jan S. Rellermeyer, Arie van Deursen: Maintaining and Monitoring AIOps Models Against Concept Drift. CAIN 2023: 98-99

  • Arumoy Shome, Luís Cruz, Arie van Deursen: Towards Understanding Machine Learning Testing in Practise. CAIN 2023: 117-118

  • Lorena Poenaru-Olaru, June Sallou, Luis Cruz, Jan S. Rellermeyer, Arie van Deursen: Retrain AI Systems Responsibly! Use Sustainable Concept Drift Adaptation Techniques. GREENS@ICSE 2023: 17-18 (preprint).

  • Arumoy Shome, Luís Cruz, Arie van Deursen: Data smells in public datasets. CAIN 2022: 205-216

  • Lorena Poenaru-Olaru, Luis Cruz, Arie van Deursen, Jan S. Rellermeyer: Are Concept Drift Detectors Reliable Alarming Systems? - A Comparative Study. IEEE Big Data 2022: 3364-3373 (preprint).

  • Haiyin Zhang, Luís Cruz, Arie van Deursen: Code smells for machine learning applications. CAIN 2022: 217-228

  • Bart van Oort, Luis Cruz, Babak Loni, Arie van Deursen: “Project smells” - Experiences in Analysing the Software Quality of ML Projects with mllint. ICSE (SEIP) 2022: 211-220

  • Lorena Poenaru-Olaru, Judith Redi, Arthur Hovanesyan, Huijuan Wang: Default Prediction Using Network Based Features. COMPLEX NETWORKS 2021: 732-743

  • Lorena Poenaru-Olaru: AutoML: towards automation of machine learning systems maintainability. Middleware Doctoral Symposium 2021: 4-5

  • Bart van Oort, Luis Cruz, Maurício Aniche, Arie van Deursen: The Prevalence of Code Smells in Machine Learning projects. WAIN@ICSE 2021: 35-42

  • Yuanhao Xie, Luis Cruz, Petra Heck, Jan S. Rellermeyer: Systematic Mapping Study on the Machine Learning Lifecycle. WAIN@ICSE 2021: 70-73

  • Mark Haakman, Luis Cruz, Hennie Huijgens, Arie van Deursen: AI lifecycle models need to be revised. Empir. Softw. Eng. 26(5): 95 (2021) arxiv.org/abs/2010.02716.