NORMA eResearch @NCI Library

Generating Python Code from Docstrings using OpenNMT

Bose, Sayok Kumar (2022) Generating Python Code from Docstrings using OpenNMT. Masters thesis, Dublin, National College of Ireland.

Abstract

In the past two years, with the birth of large language models, we have seen great advancements in the area of code generation. In the years to come, the number of developers is expected to grow beyond the current figure of 26 million, and statistics show an exponential increase in code commits to GitHub every day. This leads to the idea that writing code, or a piece of software, can be scaffolded with the help of AI-assisted systems. This work deals with the use of two techniques, neural machine translation and a decoder-only language model built from scratch using the OpenNMT toolkit, to generate Python code from docstrings scraped from public GitHub repositories. The data source used for the research is CodeSearchNet (CSN), a cleaned dataset of code and docstring pairs. The performance of the models is evaluated through human review, BLEU scores, language linting tools, and IDEs.
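As a rough illustration of the automatic checks the abstract mentions, the sketch below shows two of them in miniature: a syntax check standing in for a linting tool, and the clipped n-gram precision that BLEU is built from. It is not the thesis's evaluation pipeline. The function names, the whitespace tokenisation, and the example docstring/code pair are all assumptions made for illustration, and a full BLEU score would additionally combine several n-gram orders with a brevity penalty.

```python
import ast
from collections import Counter


def is_valid_python(code: str) -> bool:
    """Return True if `code` parses as Python source (a crude lint-style check)."""
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        return False


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, candidate, n):
    """Clipped n-gram precision of `candidate` against one `reference`.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference. This is the per-order term used in BLEU.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


# Hypothetical reference and generated code, whitespace-tokenised for illustration.
reference = "def add ( a , b ) : return a + b".split()
candidate = "def add ( x , y ) : return x + y".split()

unigram_precision = modified_precision(reference, candidate, 1)
parses = is_valid_python("def add(x, y):\n    return x + y\n")
```

Here the generated function parses correctly even though its token overlap with the reference is only partial, which is one reason surface metrics such as BLEU are usually read alongside syntax checks and human review.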

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QH Natural history > QH301 Biology > Methods of research. Technique. Experimental biology > Data processing. Bioinformatics > Artificial intelligence
Q Science > Q Science (General) > Self-organizing systems. Conscious automata > Artificial intelligence
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Tamara Malone
Date Deposited: 19 Jan 2023 13:33
Last Modified: 06 Mar 2023 15:45
URI: https://norma.ncirl.ie/id/eprint/6094
