
dc.contributor.advisor: Roy, Chanchal K. (en_US)
dc.contributor.advisor: Schneider, Kevin A. (en_US)
dc.creator: Saha, Ripon (en_US)
dc.date.accessioned: 2013-01-03T22:27:36Z
dc.date.available: 2013-01-03T22:27:36Z
dc.date.created: 2011-11 (en_US)
dc.date.issued: 2011-12-15 (en_US)
dc.date.submitted: November 2011 (en_US)
dc.identifier.uri: http://hdl.handle.net/10388/ETD-2011-11-202 (en_US)
dc.description.abstract: It is believed that identical or similar code fragments in source code, also known as code clones, have an impact on software maintenance. A clone genealogy shows how a group of clone fragments evolves with the evolution of the associated software system, and thus may provide important insights into the maintenance implications of those clone fragments. Considering the importance of studying the evolution of code clones, many studies have been conducted on this topic. However, after a decade of active research, there has been a marked lack of progress in understanding the evolution of near-miss software clones, especially where statements have been added, deleted, or modified in the copied fragments. Given that software systems contain a significant number of near-miss clones, we believe that without studying the evolution of near-miss clones, one cannot have a complete picture of clone evolution. In this thesis, we have advanced the state of the art in clone evolution research in the context of both exact and near-miss software clones. First, we performed a large-scale empirical study to extend the existing knowledge about the evolution of exact and renamed clones, where identifiers have been modified in the copied fragments. Second, we have developed a framework, gCad, that can automatically extract both exact and near-miss clone genealogies across multiple versions of a program and identify their change patterns reasonably fast while maintaining high precision and recall. Third, in order to gain a broader perspective on clone evolution, we extended gCad to calculate various evolutionary metrics, and performed an in-depth empirical study of the evolution of both exact and near-miss clones in six open-source software systems written in two different programming languages, with respect to five research questions.
We discovered several interesting evolutionary phenomena of near-miss clones that either contradict previous findings or are new. Finally, we further improved gCad and investigated a wide range of attributes and metrics, derived from both the clones themselves and their evolution histories, to identify certain attributes that developers often use to remove clones in the real world. We believe that our new insights into the evolution of near-miss clones, and into how developers approach and remove duplication, will play an important role in understanding the maintenance implications of clones and will help design better clone management systems. (en_US)
dc.language.iso: eng (en_US)
dc.subject: Code Clone Genealogies (en_US)
dc.subject: Software Evolution (en_US)
dc.subject: Software Maintenance (en_US)
dc.title: Detection and analysis of near-miss clone genealogies (en_US)
thesis.degree.department: Computer Science (en_US)
thesis.degree.discipline: Computer Science (en_US)
thesis.degree.grantor: University of Saskatchewan (en_US)
thesis.degree.level: Masters (en_US)
thesis.degree.name: Master of Science (M.Sc.) (en_US)
dc.type.material: text (en_US)
dc.type.genre: Thesis (en_US)
dc.contributor.committeeMember: McCalla, Gordon I. (en_US)
dc.contributor.committeeMember: Jamali, Nadeem (en_US)
dc.contributor.committeeMember: Bradley, Michael P. (en_US)

