### Abstract

Spectral embedding and spectral clustering are common methods for non-linear dimensionality reduction and clustering of complex high dimensional datasets. In this paper we provide a diffusion based probabilistic analysis of algorithms that use the normalized graph Laplacian. Given the pairwise adjacency matrix of all points in a dataset, we define a random walk on the graph of points and a diffusion distance between any two points. We show that the diffusion distance is equal to the Euclidean distance in the embedded space with all eigenvectors of the normalized graph Laplacian. This identity shows that characteristic relaxation times and processes of the random walk on the graph are the key concept that governs the properties of these spectral clustering and spectral embedding algorithms. Specifically, for spectral clustering to succeed, a necessary condition is that the mean exit times from each cluster need to be significantly larger than the largest (slowest) of all relaxation times inside all of the individual clusters. For complex, multiscale data, this condition may not hold and multiscale methods need to be developed to handle such situations.

Original language | English (US) |
---|---|

Title of host publication | Principal Manifolds for Data Visualization and Dimension Reduction |

Pages | 238-260 |

Number of pages | 23 |

DOIs | |

State | Published - Dec 1 2008 |

### Publication series

Name | Lecture Notes in Computational Science and Engineering |
---|---|

Volume | 58 |

ISSN (Print) | 1439-7358 |

### All Science Journal Classification (ASJC) codes

- Modeling and Simulation
- Engineering(all)
- Discrete Mathematics and Combinatorics
- Control and Optimization
- Computational Mathematics

## Fingerprint Dive into the research topics of 'Diffusion maps - A probabilistic interpretation for spectral embedding and clustering algorithms'. Together they form a unique fingerprint.

## Cite this

*Principal Manifolds for Data Visualization and Dimension Reduction*(pp. 238-260). (Lecture Notes in Computational Science and Engineering; Vol. 58). https://doi.org/10.1007/978-3-540-73750-6_10