What’s beyond the accuracy of code models?
When: April 26, 2023, 11:00 - 12:00
Where: Hybrid
Title: What’s beyond the accuracy of code models?
Abstract: Large language models of code, such as Copilot and ChatGPT, have delivered significant advancements in various code processing tasks. Researchers keep reporting how good their models are, e.g., high accuracy, high precision, etc. However, my recent interest is to explore "what's beyond the accuracy of code models."
This presentation will cover three aspects beyond accuracy: security, ethics, and deployment. First, code models face a series of security threats. We recently highlighted the naturalness requirement in attacking code models [1], and we evaluated stealthy backdoor attacks on code generation models [2]. Second, I will present ongoing work that analyzes the ethical implications of code models, including data leakage, membership inference, and the right to be forgotten. Third, I will talk about how to compress the size of code models while maintaining their performance so they can be deployed more efficiently [3].
[1] Zhou Yang, Jieke Shi, Junda He, and David Lo. 2022. Natural Attack for Pre-trained Models of Code. In Proceedings of the 44th International Conference on Software Engineering (ICSE '22).
[2] Zhou Yang, Bowen Xu, Jie M. Zhang, Hong Jin Kang, Jieke Shi, Junda He, and David Lo. 2023. Stealthy Backdoor Attack for Code Models. arXiv preprint arXiv:2301.02496.
[3] Jieke Shi, Zhou Yang, Bowen Xu, Hong Jin Kang, and David Lo. 2022. Compressing Pre-trained Models of Code into 3 MB. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE '22).
Short Bio: Zhou Yang is a second-year PhD student at Singapore Management University. He earned his Software Engineering MSc with distinction at UCL. His interests lie at the intersection of software engineering and artificial intelligence. An interesting fact: Zhou Yang graduated from Yangzhou University in Yangzhou City.