Motivation The analysis of high-dimensional 'omics data is often informed by the use of biological interaction networks. For example, protein-protein interaction networks have been used to analyze gene expression data, to prioritize germline variants, and to identify somatic driver mutations in cancer. In these and other applications, the underlying computational problem is to identify altered subnetworks containing genes that are both highly altered in an 'omics dataset and are topologically close (e.g. connected) on an interaction network. Results We introduce Hierarchical HotNet, an algorithm that finds a hierarchy of altered subnetworks. Hierarchical HotNet assesses the statistical significance of the resulting subnetworks over a range of biological scales and explicitly controls for ascertainment bias in the network. We evaluate the performance of Hierarchical HotNet and several other algorithms that identify altered subnetworks on the problem of predicting cancer genes and significantly mutated subnetworks. On somatic mutation data from The Cancer Genome Atlas, Hierarchical HotNet outperforms other methods and identifies significantly mutated subnetworks containing both well-known cancer genes and candidate cancer genes that are rarely mutated in the cohort. Hierarchical HotNet is a robust algorithm for identifying altered subnetworks across different 'omics datasets. Availability and implementation http://github.com/raphael-group/hierarchical-hotnet. Supplementary information Supplementary material are available at Bioinformatics online.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics