Thesis defences

Using ChatGPT to Augment Software Engineering Chatbots Datasets

Date & time

Wednesday, December 13, 2023
10 a.m. – 11:30 a.m.

Speaker(s)

Khaled Badran

Cost

This event is free.

Organization

Department of Computer Science and Software Engineering

Contact

Emad Shihab

Where

ER Building
2155 Guy St.
Room ER-1222 and Zoom

Accessible location

Yes

Abstract

Chatbots are envisioned to bring about a significant shift in Software Engineering (SE), enabling practitioners to converse and interact with various services using natural language. At the heart of each chatbot is a Natural Language Understanding (NLU) component that enables the chatbot to comprehend users’ queries. However, the NLU requires extensive, high-quality training data (examples) to interpret those queries accurately. Prior work shows that creating and augmenting SE datasets is resource-intensive and time-consuming. To address this gap, we explore the potential of using ChatGPT to augment SE chatbot training datasets. Specifically, we evaluate how retraining the NLU on the ChatGPT-augmented dataset affects the NLU’s performance, using four widely used SE datasets. Moreover, we assess the syntactic and semantic aspects of the generated examples compared to human-written examples. Additionally, we conduct an ablation study to investigate the impact of each component of the prompt on the NLU’s performance and on the diversity of the generated examples. The results show that ChatGPT significantly improves the NLU’s performance, with F1-score improvements ranging from 3.9% to 11.6%. Moreover, we find that ChatGPT-generated examples exhibit syntactic diversity while maintaining consistent semantics (2.2% on average) across all datasets. Additionally, the results indicate that including a few human-written examples and a description of the intent’s objective in the prompt impacts the quality of the generated examples. Finally, we provide implications for practitioners and researchers of SE chatbots.