AUTOMATED BACKPORTING FOR EFFICIENT VERSION MANAGEMENT IN SOFTWARE REPOSITORIES
Date
2025-01-23
Type
Thesis
Degree Level
Doctoral
Abstract
Managing software versions can be a complex software development task as it involves multiple stakeholders with
differing interests. Proper management of the versions requires manual interventions from developers, integrators,
managers, and release engineers. Intervention is required since the changes across multiple stable versions are created
to meet specific requirements from the stakeholders. Backporting, a commonly used activity to port changes from
development branches to other versions, is time-consuming and error-prone. It is essential to prioritize stable software
and backporting to avoid anomalies and vulnerabilities in specific versions, which can result in significant financial
losses and cybercrime. This thesis aims to explore porting, code propagation, and backports, identify their challenges
and strategies, and present automated approaches to integrate the necessary changes from development to stable versions.
First, I conducted a systematic literature review to give practitioners insight into code propagation and suitable
approaches to maintaining software versions while considering stakeholder interests. The review also explores the general
challenges of code propagation across versions and the strategies that can be used to mitigate them. Second, I conducted an
exploratory study and found that code propagation typically involves bug fixes, testing, documentation, and feature changes. I
also found that propagated changes are often inadequately linked to their original changes, contain incompatible code,
and face acceptance and delay issues. Furthermore, I point out that backporting strategies depend on the project type,
and further investigation is needed to determine their suitability. The exploratory study also includes the first-ever
backports dataset, which could be helpful for future research on backports and backporting. Third, I proposed ReBack, a
tool based on a deep learning model that can assist software development teams in identifying and managing backports
more efficiently. Fourth, I characterized changesets with an empirical study, aiming to develop automated systems for
integrating changesets across various versions. Finally, the thesis presents two tools that support the final
step of the automated code propagation pipeline, integrating changes into stable versions: BackSlice, which achieves
accurate and essential propagation through backport slicing, and BackTrans, which leverages a large language model to
transform backporting changes. Our analysis yields important findings on backport identification,
characterization, and adaptation. The results indicate that 49% of backports were inconsistently linked to their
original pull requests, 13% contained incompatible code, 10% were not accepted, and backporting delays averaged
16 days for creation and 5 days for merging. The study shows that a combination of development and management
activities can predict the potential number of software releases in a month (ρ < 0.05). In terms of identification, our
experiments demonstrate that ReBack can recommend backports with an accuracy of 92.13%, a precision of 90.98%,
a recall of 91.81%, an F1-score of 90.71%, and an AUC-score of 89.87%. The results also show typical characterizations
of changesets, including contextual differences, varying dependencies, and statement-level alterations. Furthermore,
examining version histories uncovers semantic inconsistencies and errors caused by unnecessary changes in stable
versions, necessitating reversions. Using the characterization information, our tool BackSlice achieves an efficiency
score above 0.50 across various metrics, showcasing its ability to minimize and adapt changesets for stable versions
effectively. Similarly, BackTrans achieves a score of over 0.65 across various metrics for adaptations. These results
highlight the performance of the approaches proposed in this thesis for automating backporting tasks in software repositories.
Keywords
porting, backport, pull-request, commit, github
Degree
Doctor of Philosophy (Ph.D.)
Department
Computer Science
Program
Computer Science