Clustervision: Visual Supervision of Unsupervised Clustering

Bum Chul Kwon, Ben Eysenbach, Janu Verma, Kenney Ng, Christopher De Filippi, Walter F. Stewart, Adam Perer

Research output: Contribution to journal › Article › peer-review

105 Scopus citations

Abstract

Clustering, the process of grouping together similar items into distinct partitions, is a common type of unsupervised machine learning that can be useful for summarizing and aggregating complex multi-dimensional data. However, data can be clustered in many ways, and there exists a large body of algorithms designed to reveal different patterns. While having access to a wide variety of algorithms is helpful, in practice, it is quite difficult for data scientists to choose and parameterize algorithms to get the clustering results relevant for their dataset and analytical tasks. To alleviate this problem, we built Clustervision, a visual analytics tool that helps ensure data scientists find the right clustering among the large number of techniques and parameters available. Our system clusters data using a variety of clustering techniques and parameters and then ranks clustering results utilizing five quality metrics. In addition, users can guide the system to produce more relevant results by providing task-relevant constraints on the data. Our visual user interface allows users to find high quality clustering results, explore the clusters using several coordinated visualization techniques, and select the cluster result that best suits their task. We demonstrate this novel approach using a case study with a team of researchers in the medical domain and showcase that our system empowers users to choose an effective representation of their complex data.
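
The cluster-then-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the Clustervision implementation. It assumes a single technique (plain k-means, varied only by the number of clusters k) and a single quality metric (the mean silhouette coefficient), whereas the system described above combines several techniques and five metrics that this record does not name. The function names `kmeans`, `silhouette`, and `rank_clusterings` and the toy data `POINTS` are hypothetical.

```python
import math

# Hypothetical toy data: three well-separated groups of 2-D points.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 9), (5, 10), (6, 9), (6, 10)]

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members)
                                         for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst score
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

def rank_clusterings(points, ks):
    """Cluster once per candidate k and return (score, k, labels), best first."""
    results = [(silhouette(points, labels), k, labels)
               for k in ks
               for labels in [kmeans(points, k)]]
    return sorted(results, key=lambda r: r[0], reverse=True)

ranked = rank_clusterings(POINTS, ks=[2, 3, 4])
for score, k, _ in ranked:
    print(f"k={k}: silhouette={score:.3f}")
```

On this toy data the three-cluster result ranks first, matching the three groups the points were drawn from. A fuller version would loop over multiple algorithms and parameter settings and combine several metrics before ranking.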

Original language: English (US)
Article number: 8019866
Pages (from-to): 142-151
Number of pages: 10
Journal: IEEE Transactions on Visualization and Computer Graphics
Volume: 24
Issue number: 1
DOIs
State: Published - Jan 2018
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design

Keywords

  • Interactive Visual Clustering
  • Quality Metrics
  • Unsupervised Clustering
  • Visual Analytics
