Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets

Nicolas Sompairac, Petr V. Nazarov, Urszula Czerwinska, Laura Cantini, Anne Biton, Askhat Molkenov, Zhaxybay Zhumadilov, Emmanuel Barillot, Francois Radvanyi, Alexander Gorban, Ulykbek Kairov, Andrei Zinovyev

Research output: Contribution to journal › Review article

Abstract

Independent component analysis (ICA) is a matrix factorization approach in which the signals captured by each individual matrix factor are optimized to become as mutually independent as possible. Initially suggested for solving blind source separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA has become part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected for tumoral samples. Such works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, meta-analysis, and other tasks, applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of applying ICA in omics studies, such as using different protocols, determining the optimal number of components, assessing and improving the reproducibility of ICA results, and comparing ICA with other popular matrix factorization techniques. We discuss the emerging ICA applications to the integrative analysis of multi-level omics datasets and introduce a conceptual view of ICA as a tool for defining functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook which illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.
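The factorization described in the abstract can be illustrated with a minimal sketch. It is not the method of the accompanying Jupyter notebook, and it is not a production algorithm such as FastICA. It assumes a toy setting with two synthetic non-Gaussian sources and a hypothetical 2 x 2 mixing matrix, whitens the mixed data, and then searches over rotations for the one whose components are least Gaussian (largest summed absolute excess kurtosis). All names (`ica_2d`, `mixing`, `sources`) are illustrative assumptions.

```python
import numpy as np


def excess_kurtosis(y):
    """Excess kurtosis of each row of y (rows are assumed zero-mean)."""
    var = np.mean(y**2, axis=1)
    return np.mean(y**4, axis=1) / var**2 - 3.0


def whiten(x):
    """Center the rows of x, then decorrelate them to unit variance."""
    xc = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(xc))
    # Whitening matrix: diag(eigval^-1/2) @ eigvec.T
    return (eigvec / np.sqrt(eigval)).T @ xc


def ica_2d(x, n_angles=360):
    """Toy two-component ICA: whiten, then grid-search the rotation that
    maximizes total non-Gaussianity (sum of |excess kurtosis|)."""
    z = whiten(x)
    best_score, best_components = -np.inf, z
    # Rotations in [0, pi/2) suffice: a further quarter turn only swaps
    # the two components and flips a sign.
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        rotation = np.array([[c, s], [-s, c]])
        y = rotation @ z
        score = np.abs(excess_kurtosis(y)).sum()
        if score > best_score:
            best_score, best_components = score, y
    return best_components


# Synthetic example (all values are assumptions made for illustration).
rng = np.random.default_rng(0)
n_samples = 5000
sources = np.vstack([
    rng.uniform(-1.0, 1.0, n_samples),  # sub-Gaussian source
    rng.laplace(0.0, 1.0, n_samples),   # super-Gaussian source
])
mixing = np.array([[1.0, 0.5],
                   [0.3, 1.0]])         # hypothetical mixing matrix
observed = mixing @ sources             # what would be measured

recovered = ica_2d(observed)

# Absolute correlation between each true source (rows) and each
# recovered component (columns); ICA recovers sources only up to
# order, sign, and scale.
corr = np.abs(np.corrcoef(np.vstack([sources, recovered]))[:2, 2:])
print(np.round(corr, 2))
```

In an omics setting the rows of `observed` would correspond to measured profiles and the recovered components to candidate biological or technical signals. Real datasets have many more dimensions and call for an established ICA implementation together with the stability checks discussed in the review.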

Original language: English
Journal: International Journal of Molecular Sciences
Volume: 20
Issue number: 18
DOI: https://doi.org/10.3390/ijms20184414
Publication status: Published - Sep 7 2019

Fingerprint

Independent component analysis
Systems Analysis
cancer
Proteome
Principal Component Analysis
Transcriptome
Meta-Analysis
Neoplasms
Factorization
Magnetic Resonance Imaging
matrices
biomedical data
Datasets
Blind source separation
machine learning
Deconvolution
Biological systems
data reduction

Keywords

  • cancer
  • data analysis
  • data integration
  • dimension reduction
  • independent component analysis
  • omics data

ASJC Scopus subject areas

  • Catalysis
  • Molecular Biology
  • Spectroscopy
  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Organic Chemistry
  • Inorganic Chemistry

Cite this

Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets. / Sompairac, Nicolas; Nazarov, Petr V.; Czerwinska, Urszula; Cantini, Laura; Biton, Anne; Molkenov, Askhat; Zhumadilov, Zhaxybay; Barillot, Emmanuel; Radvanyi, Francois; Gorban, Alexander; Kairov, Ulykbek; Zinovyev, Andrei.

In: International Journal of Molecular Sciences, Vol. 20, No. 18, 07.09.2019.

Sompairac, N, Nazarov, PV, Czerwinska, U, Cantini, L, Biton, A, Molkenov, A, Zhumadilov, Z, Barillot, E, Radvanyi, F, Gorban, A, Kairov, U & Zinovyev, A 2019, 'Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets', International Journal of Molecular Sciences, vol. 20, no. 18. https://doi.org/10.3390/ijms20184414
@article{0cec8b723f244b959dd51a8c17e73704,
title = "Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets",
keywords = "cancer, data analysis, data integration, dimension reduction, independent component analysis, omics data",
author = "Nicolas Sompairac and Nazarov, {Petr V.} and Urszula Czerwinska and Laura Cantini and Anne Biton and Askhat Molkenov and Zhaxybay Zhumadilov and Emmanuel Barillot and Francois Radvanyi and Alexander Gorban and Ulykbek Kairov and Andrei Zinovyev",
year = "2019",
month = "9",
day = "7",
doi = "10.3390/ijms20184414",
language = "English",
volume = "20",
journal = "International Journal of Molecular Sciences",
issn = "1661-6596",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "18",

}

TY - JOUR
T1 - Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets
AU - Sompairac, Nicolas
AU - Nazarov, Petr V.
AU - Czerwinska, Urszula
AU - Cantini, Laura
AU - Biton, Anne
AU - Molkenov, Askhat
AU - Zhumadilov, Zhaxybay
AU - Barillot, Emmanuel
AU - Radvanyi, Francois
AU - Gorban, Alexander
AU - Kairov, Ulykbek
AU - Zinovyev, Andrei
PY - 2019/9/7
Y1 - 2019/9/7
KW - cancer
KW - data analysis
KW - data integration
KW - dimension reduction
KW - independent component analysis
KW - omics data
UR - http://www.scopus.com/inward/record.url?scp=85071977884&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071977884&partnerID=8YFLogxK
U2 - 10.3390/ijms20184414
DO - 10.3390/ijms20184414
M3 - Review article
VL - 20
JO - International Journal of Molecular Sciences
JF - International Journal of Molecular Sciences
SN - 1661-6596
IS - 18
ER -