Accurate detection of aneuploidies in array CGH and gene expression microarray data

Chad L. Myers, Maitreya J. Dunham, S. Y. Kung, Olga G. Troyanskaya

Research output: Contribution to journal › Article › peer-review

90 Scopus citations


Motivation: Chromosomal copy number changes (aneuploidies) are common in cell populations that undergo multiple cell divisions, including yeast strains, cell lines and tumor cells. Identification of aneuploidies is critical in evolutionary studies, where changes in copy number serve an adaptive purpose, as well as in cancer studies, where amplifications and deletions of chromosomal regions have been identified as a major pathogenetic mechanism. Aneuploidies can be studied at the whole-genome level using array CGH (a microarray-based method that measures DNA content), but their presence also affects gene expression. In gene expression microarray analysis, identifying copy number changes is especially important for preventing aberrant biological conclusions based on spurious gene expression correlation or masked phenotypes that arise from aneuploidies. Previously suggested approaches for aneuploidy detection from microarray data mostly focus on array CGH, address only whole-chromosome or whole-arm copy number changes, and rely on thresholds or other heuristics, making them unsuitable for fully automated general application to gene expression datasets. There is a need for a general and robust method for identifying aneuploidies of any size from both array CGH and gene expression microarray data.

Results: We present ChARM (Chromosomal Aberration Region Miner), a robust and accurate expectation-maximization based method for identification of segmental aneuploidies (partial chromosome changes) from gene expression and array CGH microarray data. Systematic evaluation of the algorithm on synthetic and biological data shows that the method is robust to noise, aneuploidal segment size and P-value cutoff. Using our approach, we identify known chromosomal changes and predict novel potential segmental aneuploidies in commonly used yeast deletion strains and in breast cancer.

ChARM can be routinely used to identify aneuploidies in array CGH datasets and to screen gene expression data for aneuploidies or array biases. Our methodology is sensitive enough to detect statistically significant and biologically relevant aneuploidies even when expression or DNA content changes are subtle, as in mixed populations of cells.

Original language: English (US)
Pages (from-to): 3533-3543
Number of pages: 11
Issue number: 18
State: Published - Dec 12 2004

All Science Journal Classification (ASJC) codes

  • Computational Mathematics
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability
  • Computer Science Applications
  • Computational Theory and Mathematics

