Exploratory Analysis of the Gene Expression Matrix Based on Dual Conditional Dimensionality Reduction
Keywords:
visual analytics, gene expression, exploratory data analysis
Publication date:
Publisher:
IEEE
Publisher's version:
Citation:
Physical description:
Abstract:
One of the major goals in gene expression data analysis is to explore and discover groups of genes and groups of biological conditions with meaningful relationships. While this problem can be addressed by algorithms, their results must be analyzed in context, since they may be affected by many side processes (such as tissue differentiation) that could obscure the target goal. Visual analytics-based methods for exploratory analysis of the gene expression matrix (GEM) are essential in biomedical research since they allow us to frame the analysis within the user's knowledge domain. In this paper, we present a visual analytics approach to discover relevant connections between genes and samples. It links a reordered GEM heatmap with dual 2D projections of the matrix rows and columns, which can be recomputed conditioned on subsets of genes and/or samples selected by the user during the analysis. We demonstrate the capability of our approach to discover relevant knowledge in three case studies involving two cancer types plus normal tissue from the TCGA database.
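The idea of dual projections recomputed on a user selection can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the projection method, so plain PCA (via NumPy's SVD) is used here as a stand-in, and the reading that the gene map is conditioned on the selected samples while the sample map is conditioned on the selected genes is an assumption. The names `pca_2d`, `dual_conditional_projections`, `gene_idx` and `sample_idx` are hypothetical, and the matrix is random toy data.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    if min(X.shape) < 2:
        raise ValueError("need at least two rows and two columns for a 2D map")
    Xc = X - X.mean(axis=0, keepdims=True)            # center each feature
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)  # thin SVD of centered data
    return U[:, :2] * s[:2]                           # PC scores, shape (n_rows, 2)


def dual_conditional_projections(gem, gene_idx=None, sample_idx=None):
    """Return (gene_xy, sample_xy) for a genes x samples matrix `gem`.

    Assumed reading of "conditioned": the selection restricts the features
    used to compare items, not the items that are drawn.
    """
    n_genes, n_samples = gem.shape
    gene_idx = np.arange(n_genes) if gene_idx is None else np.asarray(gene_idx)
    sample_idx = np.arange(n_samples) if sample_idx is None else np.asarray(sample_idx)

    # Gene map: every gene is a point, described only by the selected samples.
    gene_xy = pca_2d(gem[:, sample_idx])
    # Sample map: every sample is a point, described only by the selected genes.
    sample_xy = pca_2d(gem[gene_idx, :].T)
    return gene_xy, sample_xy


# Toy data: 50 genes x 12 samples of random values (not real expression data).
rng = np.random.default_rng(0)
gem = rng.normal(size=(50, 12))

# The user selects five genes and six samples; both maps are recomputed.
gene_xy, sample_xy = dual_conditional_projections(
    gem, gene_idx=[0, 1, 2, 3, 4], sample_idx=[0, 1, 2, 3, 4, 5]
)
```

In an interactive tool, a function like this would be called again each time the selection changes, and both scatter plots would be redrawn alongside the heatmap. Any other 2D projection method could replace `pca_2d` without changing the rest of the sketch.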
ISSN:
Sponsored by:
This work was supported by the Ministerio de Ciencia e Innovación / Agencia Estatal de Investigación (MCIN/AEI/10.13039/501100011033) grant [PID2020-115401GB-I00]. The authors would also like to thank the Principado de Asturias government for the financial support provided through the predoctoral grant “Severo Ochoa”.