Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot

Kei Koyanagi, Dong Wang, Kotaro Noguchi, Masanari Kondo, Alexander Serebrenik, Yasutaka Kamei, Naoyasu Ubayashi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

Abstract

GitHub Copilot is an AI-enabled tool that automates program synthesis. It has gained significant attention since its launch in 2021. Recent studies have extensively examined Copilot's capabilities in various programming tasks, as well as its security issues. However, little is known about the effect of different natural languages on code suggestion. Natural language is considered a social bias in the field of NLP, and this bias could impact the diversity of software engineering. To address this gap, we conducted an empirical study to investigate the effect of three popular natural languages (English, Japanese, and Chinese) on Copilot. We used 756 questions of varying difficulty levels from AtCoder contests for evaluation purposes. The results highlight that the capability varies across natural languages, with Chinese achieving the worst performance. Furthermore, regardless of the type of natural language, the performance decreases significantly as the difficulty of questions increases. Our work represents the initial step in comprehending the significance of natural languages in Copilot's capability and introduces promising opportunities for future endeavors.
Original language: English
Title of host publication: MSR '24
Subtitle of host publication: Proceedings of the 21st International Conference on Mining Software Repositories
Place of Publication: New York
Publisher: Association for Computing Machinery, Inc.
Pages: 481-486
Number of pages: 6
ISBN (Electronic): 979-8-4007-0587-8
DOIs
Publication status: Published - 2 Jul 2024
Event: 21st International Conference on Mining Software Repositories, MSR 2024 - Lisbon, Portugal
Duration: 15 Apr 2024 to 16 Apr 2024

Conference

Conference: 21st International Conference on Mining Software Repositories, MSR 2024
Abbreviated title: MSR 2024
Country/Territory: Portugal
City: Lisbon
Period: 15/04/24 to 16/04/24

Funding

We gratefully acknowledge the financial support of: (1) JSPS for the KAKENHI grants (JP21H04877, JP22K17874, JP22K18630, JP23K16864), and Bilateral Program grant JPJSBP120239929; and (2) the Inamori Research Institute for Science for supporting Yasutaka Kamei via the InaRIS Fellowship.

Funders | Funder number
Inamori Research Institute for Science |
Japan Society for the Promotion of Science | JP23K16864, JP22K18630, JPJSBP120239929, JP21H04877, JP22K17874

    Keywords

    • Code Suggestion
    • Empirical Study
    • GitHub Copilot
