Assessing the Effectiveness of Multi-View Visualization Dashboard for Hi-C Data Analysis
Date
2024-12-02
Type
Thesis
Degree Level
Masters
Abstract
Hi-C (High-throughput Chromosome Conformation Capture) is a technique used to map the three-dimensional organization of the genome by capturing interactions between different regions of DNA. Visualizing Hi-C data is essential to understanding how the spatial arrangement of the genome influences gene regulation, chromatin structure, and genomic stability. Visualization methods such as heatmaps, Circos plots, and parallel plots are widely used to represent these interactions. Heatmaps depict interaction frequencies between genomic regions, while parallel and Circos plots offer insights into genomic connections across chromosomes or within chromosomal bins. However, these methods are limited in capturing a holistic view of genomic architecture, that is, the three-dimensional spatial arrangement of the genome, and therefore cannot provide a comprehensive understanding of the data. To address these challenges, in this thesis we design a multi-view dashboard for genomic data that integrates multiple visualization techniques into a cohesive platform with interactive features that allow a more detailed and nuanced exploration of Hi-C data.
In this thesis, we present three case studies and formulate usage scenarios to offer perspectives on the usefulness of a multi-view visualization system. In the first case study, we applied our multi-view dashboard to explore the chromosomal structures of Bacillus subtilis under varying environmental conditions. We observed distinct structural patterns through the different visualization representations. The 3D and 2D graph visualizations revealed the spatial organization and interactions around key genomic sites, while the heatmap and parallel plot highlighted the frequency of chromosomal interactions. These different perspectives allowed us to validate known findings in the literature while offering insights that were not as apparent with individual visualization methods. In the second case study, we used Hi-C data of Brassica napus in the dashboard, focusing specifically on anomaly detection. This case study demonstrated the dashboard’s utility in identifying unexpected interactions between distant chromosomal bins. While the heatmap provided a clear indication of anomalies, integrating 3D, 2D, and parallel plot visualizations enabled a more detailed exploration of these anomalies, offering a multi-faceted understanding of the genomic architecture that an analysis using a single visualization might overlook. In the third case study, we used the multi-view dashboard to analyze the chromosomal structures of Agrobacterium tumefaciens, focusing on intra- and inter-chromosomal interactions. The 3D and 2D graphs revealed spatial relationships between chromosomes, while the heatmap and parallel plot highlighted frequent interactions. These case studies illustrate how our multi-view visualization approach confirms existing knowledge and uncovers new dimensions of genomic data, enhancing the overall analytical process.
Our studies demonstrate the usefulness of the multi-view visualization dashboard in genomic research, highlighting its ability to provide a better understanding of genomic data. The dashboard offers researchers insights into how various representations can enhance the interpretability of genomic interactions. To ensure scalability and smooth performance on the web, we implemented optimizations such as Web Workers and IndexedDB to manage large datasets and prevent performance bottlenecks. Progressive data loading keeps user interactions responsive by loading only the necessary data subsets at first. Through case studies across different species, we demonstrate how this scalable, multi-faceted approach to visualization can lead to more informed decision-making in bioinformatics, contributing to advances in genomic research.
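As an illustration of the progressive loading idea mentioned above, the following is a minimal TypeScript sketch and not the thesis implementation. It assumes the contact matrix is split into square tiles of bins and that only tiles intersecting the current viewport are requested. All names (`Viewport`, `tilesForViewport`, `ProgressiveTileLoader`, `fetchTile`) are hypothetical; `fetchTile` stands in for whatever actually retrieves the data, such as a network request, an IndexedDB read, or a message to a Web Worker.

```typescript
// Hypothetical names throughout; a sketch under stated assumptions.
type TileKey = string;

interface Viewport {
  rowStart: number; // first visible bin (inclusive)
  rowEnd: number;   // last visible bin (exclusive)
  colStart: number;
  colEnd: number;
}

// List the tiles of a binned contact matrix that intersect the viewport.
function tilesForViewport(view: Viewport, tileSize: number): TileKey[] {
  const keys: TileKey[] = [];
  const firstRow = Math.floor(view.rowStart / tileSize);
  const lastRow = Math.floor((view.rowEnd - 1) / tileSize);
  const firstCol = Math.floor(view.colStart / tileSize);
  const lastCol = Math.floor((view.colEnd - 1) / tileSize);
  for (let r = firstRow; r <= lastRow; r++) {
    for (let c = firstCol; c <= lastCol; c++) {
      keys.push(`${r}:${c}`);
    }
  }
  return keys;
}

// Fetch only the tiles that are visible and not yet cached.
class ProgressiveTileLoader {
  private cache = new Map<TileKey, Float32Array>();
  private tileSize: number;
  private fetchTile: (key: TileKey) => Promise<Float32Array>;

  constructor(
    tileSize: number,
    fetchTile: (key: TileKey) => Promise<Float32Array>,
  ) {
    this.tileSize = tileSize;
    this.fetchTile = fetchTile;
  }

  // Returns the keys that this call actually had to fetch.
  async load(view: Viewport): Promise<TileKey[]> {
    const missing = tilesForViewport(view, this.tileSize).filter(
      (key) => !this.cache.has(key),
    );
    const tiles = await Promise.all(missing.map((key) => this.fetchTile(key)));
    missing.forEach((key, i) => this.cache.set(key, tiles[i]));
    return missing;
  }
}
```

With a tile size of 64 bins, a viewport covering bins 0 to 99 on both axes touches four tiles; panning within that region triggers no further fetches because the tiles are already cached.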
Keywords
Hi-C, genome organization, chromatin structure, gene regulation, genomic stability, data visualization, multi-view dashboard, heatmaps, Circos plots, parallel plots, 3D graph visualizations, 2D graph visualizations, genomic interactions, anomaly detection, Bacillus subtilis, Brassica napus, Agrobacterium tumefaciens, genomic architecture, web scalability, IndexedDB, Web Workers, progressive data loading, bioinformatics, genomic research, chromosomal structures, inter-chromosomal interactions, intra-chromosomal interactions, spatial organization, data interpretability, computational biology
Degree
Master of Science (M.Sc.)
Department
Computer Science
Program
Computer Science