University of Saskatchewan HARVEST

      Analyzing Clone Evolution for Identifying the Important Clones for Management

      View/Open
      MONDAL-DISSERTATION-2017.pdf (4.974Mb)
      Date
      2017-02-15
      Author
      Mondal, Manishankar 1982-
      Type
      Thesis
      Degree Level
      Doctoral
      Abstract
Code clones (identical or similar code fragments in a code-base) have both positive and negative impacts on the evolution and maintenance of a software system. Because of the negative impacts (such as high change-proneness, bug-proneness, and unintentional inconsistencies), software researchers consider code clones to be the number one bad smell in a code-base. Existing studies on clone management suggest managing code clones through refactoring and tracking. However, a software system's code-base may contain a huge number of code clones, and it is impractical to consider all of them for refactoring or tracking. It is therefore essential to identify the code clones that are particularly important for refactoring and tracking. However, no existing study has investigated this matter. We focus our research on this matter and perform five studies on identifying important clones by analyzing clone evolution history. In our first study, we detect evolutionary coupling of code clones by automatically examining clone evolution history across thousands of commits of software systems downloaded from online SVN repositories. By analyzing this evolutionary coupling, we identify a particular clone change pattern, the Similarity Preserving Change Pattern (SPCP), such that code clones that evolve following this pattern should be considered important for refactoring. We call these important clones the SPCP clones, and we rank them by their strength of evolutionary coupling. In our second study, we further analyze evolutionary coupling of code clones with the aim of assisting clone tracking. The purpose of clone tracking is to identify the co-change (i.e., changing together) candidates of code clones to ensure consistency of changes in the code-base. Our second study identifies and ranks the important co-change candidates by analyzing their evolutionary coupling.
In our third study, we perform a deeper analysis of the SPCP clones and identify their cross-boundary evolutionary couplings. On the basis of these couplings, we separate the SPCP clones into two disjoint subsets: the non-cross-boundary SPCP clones, which can be considered important for refactoring, and the cross-boundary SPCP clones, which should be considered important for tracking. In our fourth study, we analyze the bug-proneness of different types of SPCP clones in order to identify which type(s) of code clones have a high tendency to experience bug-fixes. Such clone-types can be given high priority for management (refactoring or tracking). In our last study, we analyze and compare the late propagation tendencies of different types of code clones. Late propagation is commonly regarded as a harmful clone evolution pattern. Findings from our last study can help prioritize clone-types for management on the basis of their tendency to experience late propagation. We also find that late propagation can be considerably reduced by managing the SPCP clones. On the basis of our studies, we develop an automatic system called AMIC (Automatic Mining of Important Clones) that identifies the important clones for management (refactoring and tracking) and ranks these clones considering their evolutionary coupling, bug-proneness, and late propagation tendencies. We believe that our research findings have the potential to assist clone management by pinpointing the important clones to be managed, and thus considerably reducing clone management effort.
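The abstract does not say how evolutionary coupling is measured. As a rough illustration only, the sketch below assumes that coupling between two clone fragments is estimated from how often they change in the same commit, using support (number of shared commits) and confidence (shared commits divided by the commits that changed the target fragment), which are common association-rule measures. The fragment identifiers, the commit data, and the function names are hypothetical and are not taken from the thesis or from AMIC.

```python
from collections import Counter
from itertools import combinations


def co_change_support(commits):
    """Count, for each unordered pair of fragments, the commits that changed both."""
    pair_support = Counter()
    for changed in commits:
        # sorted() gives each pair a canonical (a, b) order with a < b
        for a, b in combinations(sorted(set(changed)), 2):
            pair_support[(a, b)] += 1
    return pair_support


def rank_co_change_candidates(commits, target):
    """Rank the fragments that co-changed with `target`, strongest coupling first.

    Returns a list of (fragment, support, confidence) tuples, where
    confidence = support / number of commits that changed `target`.
    """
    support = co_change_support(commits)
    change_counts = Counter(f for changed in commits for f in set(changed))
    ranked = []
    for (a, b), s in support.items():
        if target in (a, b):
            other = b if a == target else a
            ranked.append((other, s, s / change_counts[target]))
    # Higher support first, then higher confidence, then name for a stable order
    ranked.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return ranked


# Hypothetical history: each set lists the clone fragments changed in one commit.
commits = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B"},
    {"A", "B"},
]

ranking = rank_co_change_candidates(commits, "A")
for fragment, s, confidence in ranking:
    print(f"{fragment}: support={s}, confidence={confidence:.2f}")
```

In this toy history, fragment B co-changes with A in three of A's four commits and C in two, so B would be suggested first as a co-change candidate when A is modified. A real analysis would also need to detect clones and map fragments across revisions, which this sketch leaves out.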
      Degree
      Doctor of Philosophy (Ph.D.)
      Department
      Computer Science
      Program
      Computer Science
      Supervisor
      Roy, Dr. Chanchal K.; Schneider, Dr. Kevin A.
      Committee
      Vassileva, Dr. Julita; Stanley, Dr. Kevin; McQuillan, Dr. Ian; Gokaraju, Dr. Ramakrishna
      Copyright Date
      February 2017
      URI
      http://hdl.handle.net/10388/7749
      Subject
      Important Clones
      Clone Refactoring
      Clone Tracking
      Clone Types
      Collections
      • Electronic Theses and Dissertations

      Related items

      Showing items related by title, author, creator and subject.

      • Short rotation culture of willow clones across Canada : growth requirements and implications for soil nutrients and greenhouse gas balances 

        Ens, Joel (2012-11-21)
        The cultivation of willow (Salix spp.) is being investigated as a potential feedstock for biomass energy in the Canadian prairies. For this purpose, and despite willow’s high nutrient and water demand, high rates of ...
      • Cloning, expression, and characterization of lactic acid bacteria recombinant prolidases 

        Yang, Soo In (2007)
        Lactobacillus plantarum (Lb. plantarum) NRRL B4496 and Lactococcus lactis (Lc. lactis) NRRL B1821 prolidase genes were isolated, cloned, and sequenced. The sequence-confirmed genes were subcloned into the expression systems. ...
      • Large-Scale Clone Detection and Benchmarking 

        Svajlenko, Jeff Thomas 1987-; 0000-0001-9738-7421 (2018-02-14)
        Code clones are pairs of code fragments that are similar. They are created when developers re-use code by copy and paste, although clones are known to occur for a variety of reasons. Clones have a negative impact on ...