SV-JIM, detailed pairwise structural variant calling using long-reads and genome assemblies

dc.contributor.author: Todd, Clarence
dc.contributor.author: Jin, Lingling
dc.contributor.author: McQuillan, Ian
dc.date.accessioned: 2025-02-08T19:04:29Z
dc.date.available: 2025-02-08T19:04:29Z
dc.date.issued: 0001
dc.description.abstract: This paper proposes a detailed process for SV calling that uses both genome assemblies and long reads and permits a data-driven assessment of multiple SV callers. The process is implemented as a software pipeline named Structural Variant - Jaccard Index Measure, or SV-JIM, using the Snakemake [20] workflow management system. Like most state-of-the-art SV callers, SV-JIM detects variations between pairs of genomes, but it streamlines the numerous SV calling stages into a single process for user convenience and evaluates the multiple SV sets produced using the Jaccard index measure to identify those with the highest consistency among the included SV callers. SV-JIM then produces aggregated SV results based on how many callers supported the reported SVs. For validation, SV-JIM was assessed through three case studies: one on the Homo sapiens genome and two on plant genomes, Brassica nigra and Arabidopsis thaliana. Executing SV-JIM identified substantial inter-caller variance, with result counts differing by tens of thousands on the larger Brassica nigra and Homo sapiens genomes. Further, aggregating the SV sets improved retention of the less frequently occurring SV types by requiring a minimum level of support rather than a specific SV caller combination. Finally, these case studies identified a potential for inflated precision reporting that can occur during evaluation. SV-JIM is publicly available under the MIT license at https://github.com/USask-BINFO/SV-JIM
dc.description.version: Peer Reviewed
dc.identifier.doi: https://doi.org/10.1016/j.ymeth.2024.12.015
dc.identifier.uri: https://hdl.handle.net/10388/16549
dc.language.iso: en
dc.publisher: Methods
dc.rights: Attribution 2.5 Canada
dc.rights.uri: http://creativecommons.org/licenses/by/2.5/ca/
dc.subject: Structural variant calling
dc.subject: Genetic variation
dc.subject: Comparative genomics
dc.title: SV-JIM, detailed pairwise structural variant calling using long-reads and genome assemblies
dc.type: Article
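
The two ideas named in the abstract, comparing caller outputs with the Jaccard index and aggregating SVs by how many callers support them, can be illustrated with a minimal Python sketch. This is not SV-JIM's implementation: the caller names, the example calls, and the function names are hypothetical, and each SV is assumed to be an exact-match tuple of (chromosome, start, end, type), which ignores the breakpoint tolerance that real SV comparison usually needs.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B|; treated as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def pairwise_jaccard(call_sets):
    """Jaccard index for every unordered pair of callers (names sorted)."""
    return {
        (x, y): jaccard(call_sets[x], call_sets[y])
        for x, y in combinations(sorted(call_sets), 2)
    }


def aggregate_by_support(call_sets, min_support):
    """Keep SVs reported by at least `min_support` callers, whichever they are."""
    counts = Counter(sv for calls in call_sets.values() for sv in calls)
    return {sv for sv, n in counts.items() if n >= min_support}


# Hypothetical call sets; assumption: SVs match only if all four fields are equal.
call_sets = {
    "caller_a": {("chr1", 100, 200, "DEL"), ("chr1", 500, 650, "INS"), ("chr2", 40, 90, "DUP")},
    "caller_b": {("chr1", 100, 200, "DEL"), ("chr1", 500, 650, "INS")},
    "caller_c": {("chr1", 100, 200, "DEL"), ("chr3", 10, 70, "INV")},
}

scores = pairwise_jaccard(call_sets)
consensus = aggregate_by_support(call_sets, min_support=2)

for pair, score in scores.items():
    print(pair, round(score, 3))
print(sorted(consensus))
```

Thresholding on a support count rather than on a fixed caller combination means an SV found by any two callers is kept, which is one way to read the abstract's remark about retaining less frequent SV types.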

Files

Original bundle
Name: Todd_etal_JIM_Methods.pdf
Size: 2.11 MB
Format: Adobe Portable Document Format
Name: Todd_etal_Supplementary_Materials.pdf
Size: 836.5 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.36 KB
Format: Item-specific license agreed upon to submission