Learning APIs through Mining Code Snippet Examples

Saifullah, C M Khaled 1993-

Files

SAIFULLAH-THESIS-2020.pdf (3.5 MB)

Date

2020-02-04

Authors

Saifullah, C M Khaled 1993-

ORCID

0000-0002-8822-2091

Type

Thesis

Degree Level

Masters

Abstract

Developers extensively use and reuse Application Programming Interfaces (APIs) to reduce development time and effort. To do so, they need to learn and remember APIs in order to use them effectively in their codebases. However, APIs are difficult to learn: they are numerous, they are often poorly documented, and the documentation contains a lot of text to remember. To support developers in learning and using APIs, this thesis presents three studies that (1) enhance the code completion features of modern integrated development environments (IDEs), (2) make online forum code snippets compilable, and (3) annotate the code elements of a dynamically typed programming language (e.g., JavaScript) with their types. To this end, we first explore the method name, argument, and code completion techniques in the literature and find that none of them is suitable for completing a full method call sequence, which consists of a method name and a list of arguments. We therefore propose DAMCA, a Bi-LSTM-based encoder-decoder model with an attention mechanism and beam search, which takes the lexical, syntactic, and semantic contexts of a method call and returns a list of method call sequences as completion suggestions. Evaluation results show that the proposed technique outperforms the state-of-the-art method name, argument, code completion, and program synthesis techniques for method call sequence completion. Next, we explore the techniques proposed for resolving the Fully Qualified Names (FQNs) of API elements in online forum code snippets. We find that these techniques restrict themselves to locally specific code tokens only. We incorporate globally related tokens with the local tokens and use likelihood, context similarity, and name similarity to resolve the API element. Experimental results show that the proposed technique outperforms the state-of-the-art techniques with faster training.
Finally, in our third study, we apply the techniques developed for a statically typed programming language (Java) to a dynamically typed programming language (JavaScript). The evaluation results show that these techniques perform very poorly for JavaScript. We then investigate the causes and build a technique that uses Word2Vec and context similarity as global models and previous outputs on the same project as a local model. The combination of models outperforms the technique developed for Java. We then compare the proposed technique with state-of-the-art deep learning based techniques developed for JavaScript. The experimental results suggest that the proposed technique trains faster than the deep learning based technique without sacrificing accuracy. We believe that the findings and techniques from this research have the potential to help developers learn different aspects of APIs, thereby easing software development and improving developer productivity.

Keywords

API, Context-Sensitive, Code Completion, Neural Encoder Decoder, FQN, Word2Vec, Type System, Deep Learning

Degree

Master of Science (M.Sc.)

Department

Computer Science

Program

Computer Science

Advisor

Roy, Chanchal K.

Committee

Keil, Mark ; Khan, Shahedul ; Codabux, Zadia ; Lee, Roy

URI

http://hdl.handle.net/10388/12688

Collections

Graduate Theses and Dissertations
