Tree Variational Autoencoder for Code

Research output: Contribution to journal › Article › Academic › peer review

Abstract

Autoencoder models of source code are an emerging alternative to autoregressive large language models with important benefits for genetic improvement of software. We hypothesize that encoder-decoder architectures are suboptimal for source code because they ignore the grammatical structure that can be derived with an Abstract Syntax Tree parser. We propose a structured Variational Auto-Encoder based on TreeLSTM that operates directly on the AST. We train it along with a baseline sequence VAE on a dataset of competitive programming submissions. We find the structured model to perform better in most tests, with some notable exceptions. These findings suggest structured autoencoder models could enable more effective generation and manipulation of source code for tasks like automated bug fixing and generative programming.
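
To make the contrast in the abstract concrete, the following is a minimal sketch (not the authors' pipeline) of the two views of a program: the flat token sequence a sequence model would read, and the nested structure an AST parser recovers. It assumes Python source and Python's standard `ast` and `tokenize` modules purely for illustration; the abstract does not say which language or parser the paper uses, and the helper names `token_sequence`, `tree_shape`, and `depth` are hypothetical.

```python
import ast
import io
import tokenize

# Illustrative input only; the paper's dataset is not reproduced here.
SOURCE = "def add(a, b):\n    return a + b\n"


def token_sequence(source):
    """Flat view: the non-whitespace token strings, in reading order."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [t.string for t in tokens if t.string.strip()]


def tree_shape(node):
    """Structured view: nested (node type, children) pairs taken from the AST."""
    children = [tree_shape(child) for child in ast.iter_child_nodes(node)]
    return (type(node).__name__, children)


def depth(shape):
    """Number of levels in the nested structure, counting the root as 1."""
    return 1 + max((depth(child) for child in shape[1]), default=0)


flat = token_sequence(SOURCE)
tree = tree_shape(ast.parse(SOURCE))

print(flat)         # a single list with no nesting
print(tree[0])      # root node type
print(depth(tree))  # nesting that the flat list does not expose
```

A tree-structured encoder such as a TreeLSTM would consume something like `tree`, combining child representations into their parent, whereas a sequence baseline would consume something like `flat`.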

Original language: English
Article number: 10886936
Pages (from-to): 30262-30273
Number of pages: 12
Journal: IEEE Access
Volume: 13
DOIs:
Status: Published - 2025

Bibliographical note

Publisher Copyright:
© 2025 IEEE.
