Sorting points into neighborhoods (SPIN): Data analysis and visualization by ordering distance matrices

D. Tsafrir, I. Tsafrir, L. Ein-Dor, O. Zuk, D. A. Notterman, E. Domany

Research output: Contribution to journal › Article › peer-review

121 Scopus citations

Abstract

Summary: We introduce a novel unsupervised approach for the organization and visualization of multidimensional data. At the heart of the method is a presentation of the full pairwise distance matrix of the data points, viewed in pseudocolor. The ordering of points is iteratively permuted in search of a linear ordering, which can be used to study embedded shapes. Several examples indicate how the shapes of certain structures in the data (elongated, circular and compact) manifest themselves visually in our permuted distance matrix. It is important to identify the elongated objects since they are often associated with a set of hidden variables, underlying continuous variation in the data. The problem of determining an optimal linear ordering is shown to be NP-complete, and therefore an iterative search algorithm with O(n^3) step-complexity is suggested. By using sorting points into neighborhoods (SPIN) to analyze colon cancer expression data, we were able to address the serious problem of sample heterogeneity, which hinders identification of metastasis-related genes in our data. Our methodology brings to light the continuous variation of heterogeneity, starting with homogeneous tumor samples and gradually increasing the amount of another tissue. Ordering the samples according to their degree of contamination by unrelated tissue allows the separation of genes associated with irrelevant contamination from those related to cancer progression.
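The idea of iteratively permuting a distance matrix toward a linear ordering can be illustrated with a minimal Python sketch. The abstract does not spell out the steps of the SPIN search, so the heuristic below is an assumption chosen for illustration rather than the authors' algorithm: it repeatedly scores each point by the product of the reordered distance matrix with a centered linear weight vector and sorts the points by that score. The function names (pairwise_distances, ordering_cost, reorder) and the cost function are hypothetical, and the loop may stop at a local optimum rather than the best ordering.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def ordering_cost(dist, order):
    """Illustrative cost w^T D w of an ordering (assumed, not from the paper).

    Lower values mean that distant points tend to sit at opposite ends
    of the ordering.
    """
    n = len(order)
    w = np.arange(n) - (n - 1) / 2.0  # centered linear weights
    d = dist[np.ix_(order, order)]
    return float(w @ d @ w)


def reorder(dist, max_iter=100):
    """Iteratively permute the points; return the best ordering seen."""
    n = dist.shape[0]
    w = np.arange(n) - (n - 1) / 2.0
    order = np.arange(n)  # order[p] = original index of the point at position p
    best_order, best_cost = order.copy(), ordering_cost(dist, order)
    for _ in range(max_iter):
        d = dist[np.ix_(order, order)]
        scores = d @ w  # one score per current position
        step = np.argsort(-scores, kind="stable")  # descending scores
        if np.array_equal(step, np.arange(n)):
            break  # fixed point: sorting no longer changes the ordering
        order = order[step]
        cost = ordering_cost(dist, order)
        if cost < best_cost:
            best_order, best_cost = order.copy(), cost
    return best_order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Points along a noisy line, presented in shuffled order.
    t = rng.permutation(20).astype(float)
    points = np.column_stack([t, 0.1 * rng.normal(size=20)])
    dist = pairwise_distances(points)
    order = reorder(dist)
    print("ordering:", order)
    print("cost before:", ordering_cost(dist, np.arange(20)))
    print("cost after:", ordering_cost(dist, order))
```

The reordered matrix dist[np.ix_(order, order)] is what one would display in pseudocolor; for an elongated structure, a good ordering concentrates small distances near the main diagonal.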

Original language: English (US)
Pages (from-to): 2301-2308
Number of pages: 8
Journal: Bioinformatics
Volume: 21
Issue number: 10
DOIs
State: Published - May 15 2005
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Computational Mathematics
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability
  • Computer Science Applications
  • Computational Theory and Mathematics
