University of Saskatchewan HARVEST

      Towards Semantic Clone Detection, Benchmarking, and Evaluation

      View/Open
      AL-OMARI-DISSERTATION-2021.pdf (4.371Mb)
      Date
      2021-06-07
      Author
      Al-omari, Farouq Ahmad
      Type
      Thesis
      Degree Level
      Doctoral
      Abstract
      Developers often copy and paste code to speed up development. Sometimes they copy code from other systems or look up code online to solve a complex problem, and they reuse the copied code with or without modifications. The resulting similar or identical code fragments are called code clones. Clones can also arise unintentionally when a developer independently implements the same or similar functionality. Code fragments that are not textually similar but implement the same functionality are still considered clones and are classified as semantic clones: code fragments that perform the same computation but are implemented using different syntax. Software cloning research indicates that code clones exist in all software systems; on average, 5% to 20% of software code is cloned. Because clones can have both positive and negative effects, it is essential to locate, track, and manage them in the source code. Considerable research has been conducted on code clones of all types, covering clone detection, analysis, management, and evaluation. Despite this interest, considerably less work has addressed semantic clones. As described in this thesis, I advance the state of the art in semantic clone research in several ways. First, I conducted an empirical study to investigate the status of code cloning in and across open-source game systems and the effectiveness of different normalization, filtering, and transformation techniques for detecting semantic clones. Second, I developed an approach to detect clones across .NET programming languages using an intermediate language. Third, I developed a technique that uses an intermediate language and an ontology to detect semantic clones. Fourth, I mined Stack Overflow answers to build a semantic code clone benchmark that represents real semantic code clones in four programming languages: C, C#, Java, and Python. Fifth, I defined a comprehensive taxonomy that identifies semantic clone types. Finally, I implemented an injection framework that uses the benchmark to compare and evaluate semantic code clone detectors by automatically measuring recall.
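
As a rough illustration of two ideas in the abstract, the sketch below shows a pair of fragments that perform the same computation with different syntax (a semantic clone pair in the abstract's sense) and a simple recall calculation over known clone pairs. It is not taken from the thesis: the function names, the fragment identifiers, and the exact-match rule for counting a reported pair are all assumptions made for this example, and the thesis's own framework may match clones differently.

```python
def sum_of_squares_loop(values):
    """Fragment A: explicit loop with an accumulator."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_of_squares_functional(values):
    """Fragment B: same computation as Fragment A, different syntax."""
    return sum(map(lambda v: v ** 2, values))


def recall(injected_pairs, reported_pairs):
    """Fraction of known (injected) clone pairs that a detector reported.

    Hypothetical helper: pairs are unordered, and a reported pair counts
    only if it matches an injected pair exactly (an assumption).
    """
    injected = {frozenset(pair) for pair in injected_pairs}
    reported = {frozenset(pair) for pair in reported_pairs}
    if not injected:
        return 0.0
    return len(injected & reported) / len(injected)


# Hypothetical fragment identifiers, used only for this example.
injected = [("a.py:1-6", "b.py:1-3"), ("c.py:10-20", "d.py:5-9")]
reported = [("b.py:1-3", "a.py:1-6"), ("e.py:1-4", "f.py:2-8")]

print(sum_of_squares_loop([1, 2, 3]) == sum_of_squares_functional([1, 2, 3]))
print(recall(injected, reported))
```

Here the detector reports one of the two injected pairs (in reversed order, which the unordered comparison tolerates), so the measured recall is 0.5.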
      Degree
      Doctor of Philosophy (Ph.D.)
      Department
      Computer Science
      Program
      Computer Science
      Supervisor
      Roy, Chanchal K
      Committee
      McQuillan, Ian; Keil, Mark; McCalla, Gord; Khan, Shahedul A
      Copyright Date
      April 2021
      URI
      https://hdl.handle.net/10388/13413
      Subject
      Semantic clones
      Clone detection
      Clone detection benchmark
      Stack Overflow
      Clone detection evaluation
      Collections
      • Graduate Theses and Dissertations

      Related items

      Showing items related by title, author, creator and subject.

      • Analyzing Clone Evolution for Identifying the Important Clones for Management 

        Mondal, Manishankar 1982- (2017-02-15)
        Code clones (identical or similar code fragments in a code-base) have dual but contradictory impacts (i.e., both positive and negative impacts) on the evolution and maintenance of a software system. Because of the negative ...
      • Root dynamics and carbon accumulation of six willow clones in Saskatchewan 

        Stadnyk, Christine Noelle (2010-06)
        Short rotation woody crops have gained global interest as an alternative energy source to fossil fuels. The availability of this resource is, however, dependent on successful research trials and the identification and ...
      • Understanding the Evolution of Code Clones in Software Systems 

        Saha, Avigit (2013-09-05)
        Code cloning is a common practice in software development. However, code cloning has both positive aspects such as accelerating the development process and negative aspects such as causing code bloat. After a decade of ...
      University Library

      © University of Saskatchewan