Track 2: LLM Adaptation for Coding Tasks

Track leader at TU Delft: Maliheh Izadi (Assistant Professor)
Track leader at JetBrains: Egor Bogomolov (Code Modeling Research team lead)

Given the competitive landscape surrounding the use of AI today, merely developing and deploying an LLM in the IDE does not suffice. On the one hand, the current approach of shipping and querying the same generic model for every task, project, and user will not yield optimal results. On the other hand, researchers have continuously trained ever-larger models that require vast amounts of training data, usually a massive unsanitized corpus scraped from public sources. Research has shown that the resulting LLMs can memorize their training data and emit it verbatim [1], leading to legal issues. At the same time, the models are less proficient outside their training data and may struggle when working in previously unseen repositories. As new generations of models are rolled out, there is a need to assess their emerging capabilities.

This project proposes to adapt, personalize, and evaluate large generic language models for different scenarios so that they yield tangible, timely, safe, and personalized outputs for end users.

PhD Students:

  • Egor Bogomolov (JetBrains)
  • Danielle Cipollone (TU Delft)

MSc Students:

  • Tim van Dam (graduated in 2024): Thesis

Track news
15 July 2024: MSc students graduated

Publications
Egor Bogomolov, Aleksandra Eliseeva, Timur Galimzyanov, Evgeniy Glukhov, Anton Shapkin, Maria Tigina, Yaroslav Golubev, Alexander Kovrigin, Arie van Deursen, Maliheh Izadi, and Timofey Bryksin. Long Code Arena: A Set of Benchmarks for Long-Context Code Models. Preprint, 2024.
Aral de Moor, Arie van Deursen, and Maliheh Izadi. A Transformer-Based Approach for Smart Invocation of Automatic Code Completion. Proceedings of the 1st ACM International Conference on AI-Powered Software (AIWare), ACM Distinguished Paper Award, 2024.

Projects