Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation

dc.contributor.advisor: Roy, Chanchal
dc.contributor.committeeMember: McQuillan, Ian
dc.contributor.committeeMember: Grosvenor, Andrew
dc.contributor.committeeMember: Stavness, Ian
dc.contributor.committeeMember: Stakhanova, Natalia
dc.contributor.committeeMember: Mondal, Debajyoti
dc.contributor.committeeMember: Roy, Banani
dc.creator: Rahman, Mohammad Masudur 1985-
dc.creator.orcid: 0000-0003-3821-5990
dc.date.accessioned: 2019-10-04T19:12:39Z
dc.date.available: 2019-10-04T19:12:39Z
dc.date.created: 2019-09
dc.date.issued: 2019-10-04
dc.date.submitted: September 2019
dc.date.updated: 2019-10-04T19:12:39Z
dc.description.abstract: Software bugs and failures cost trillions of dollars every year, and can even lead to deadly accidents (e.g., the Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making the necessary changes to the existing software code. Once an issue report (e.g., bug report, change request) is assigned to a developer, she chooses a few important keywords from the report as a search query, and then attempts to find the exact locations in the software code that need to be either repaired or enhanced. As a part of this maintenance, developers also often select ad hoc queries on the fly, and attempt to locate reusable code on the Internet that could assist them either in bug fixing or in feature implementation. Unfortunately, even experienced developers often fail to construct the right search queries. Even if the developers come up with a few ad hoc queries, most of them require frequent modifications, which costs significant development time and effort. Thus, construction of an appropriate query for localizing software bugs, programming concepts or even reusable code is a major challenge. In this thesis, we overcome this query construction challenge with six studies, and develop a novel, effective code search solution (BugDoctor) that assists developers in localizing the software code of interest (e.g., bugs, concepts and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform the traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures, which were previously overlooked, and (3) by exploiting the crowd knowledge and word semantics derived from the Stack Overflow Q&A site, which were previously untapped.
Our experiment using 5000+ search queries (bug reports, change requests, and ad hoc queries) suggests that our proposed approach can improve the given queries significantly through automated query reformulations. Comparison with 10+ existing studies on bug localization, concept location and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches by a significant margin.
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10388/12394
dc.subject: Code search, bug localization, concept location, Internet-scale code search, query reformulation, context-awareness, term weighting, Page Rank, semantic similarity, bug report, change request
dc.title: Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Computer Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Saskatchewan
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy (Ph.D.)

Files

Original bundle
Name: RAHMAN-DISSERTATION-2019.pdf
Size: 5.69 MB
Format: Adobe Portable Document Format
License bundle
Name: LICENSE.txt
Size: 2.28 KB
Format: Plain Text
Description: