AUTOMATED BACKPORTING FOR EFFICIENT VERSION MANAGEMENT IN SOFTWARE REPOSITORIES

Date

2025-01-23

Type

Thesis

Degree Level

Doctoral

Abstract

Managing software versions can be a complex software development task because it involves multiple stakeholders with differing interests. Proper management of versions requires manual intervention from developers, integrators, managers, and release engineers, since changes across multiple stable versions are created to meet specific stakeholder requirements. Backporting, a commonly used activity to port changes from development branches to other versions, is time-consuming and error-prone. It is essential to prioritize stable software and backporting to avoid anomalies and vulnerabilities in specific versions, which can result in significant financial losses and cybercrime. This thesis aims to explore porting, code propagation, and backports, identify their challenges and strategies, and present automated approaches to integrate the necessary changes from development to stable versions. First, I conducted a systematic literature review to give practitioners insight into code propagation and into suitable approaches for maintaining software versions while considering stakeholder interests. The review also explores the general challenges of code propagation across versions and the strategies that can mitigate them. Second, I conducted an exploratory study and found that code propagation typically involves bug fixes, testing, documentation, and feature changes. I also found that propagated changes are often inadequately linked to their original changes, contain incompatible code, and face acceptance and delay issues. Furthermore, I point out that backporting strategies depend on the project type, and further investigation is needed to determine their suitability. The exploratory study also includes the first-ever backports dataset, which could be helpful for future research on backports and backporting.
Third, I proposed ReBack, a tool based on a deep learning model that can assist software development teams in identifying and managing backports more efficiently. Fourth, I characterized changesets in an empirical study, with the aim of developing automated systems for integrating changesets across various versions. Finally, the thesis develops tools that support the final step of the automated code propagation pipeline. Two tools can be used to integrate changes into stable versions: BackSlice, which achieves accurate and essential propagation through backport slicing, and BackTrans, which leverages a large language model to transform backporting changes. The analysis yields important findings on backport identification, characterization, and adaptation. The results indicate that 49% of backports were inconsistently linked to their original pull requests, 13% contained incompatible code, 10% were not accepted, and backporting delays averaged 16 days for creation and 5 days for merging. The study shows that a combination of development and management activities can predict the potential number of software releases in a month (ρ < 0.05). In terms of identification, the experiments demonstrate that ReBack can recommend backports with an accuracy of 92.13%, a precision of 90.98%, a recall of 91.81%, an F1-score of 90.71%, and an AUC-score of 89.87%. The results also show typical characterizations that include contextual differences, varying dependencies, and statement-level alterations. Furthermore, examining version histories uncovers semantic inconsistencies and errors caused by unnecessary changes in stable versions, which necessitate reversions. Using the characterization information, BackSlice achieves an efficiency score above 0.50 across various metrics, showcasing its ability to minimize and adapt changesets for stable versions effectively. Similarly, BackTrans achieves a score of over 0.65 across various metrics for adaptations.
These results highlight the performance of the approaches proposed in this thesis for automating backporting tasks in software repositories.

Keywords

porting, backport, pull-request, commit, github

Degree

Doctor of Philosophy (Ph.D.)

Department

Computer Science

Program

Computer Science
