Clustering algorithm for experimental datasets using global sensitivity-based affinity propagation (GSAP)

Yiru Wang, Chenyue Tao, Zijun Zhou, Keli Lin, Chung K. Law, Bin Yang

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


To reduce the uncertainty of parameters in combustion kinetic models, Bayesian methods are commonly used to constrain that uncertainty with experimental data. As the volume of experimental data grows rapidly, using all of it for optimization is not only redundant and time-consuming but can also lead to data consistency problems. In this work, the global sensitivity-based affinity propagation (GSAP) method is proposed to cluster experimental datasets and to select representative experimental conditions. Specifically, global sensitivity coefficients are first obtained through a global sensitivity analysis to characterize the sources of uncertainty in the kinetic model under different experimental conditions. The similarity coefficient, which is defined based on the global sensitivity, measures the resemblance between two experimental conditions. By exchanging messages computed from these similarities, affinity propagation automatically clusters the experimental dataset into several classes without the number of classes being specified in advance. A novel aspect of the method is that it accounts for model and experimental uncertainty under different conditions to obtain better optimization results. The correctness and effectiveness of the method are validated by clustering, and then optimizing against, a laminar flame speed dataset of common C0–C4 fuels. The dataset, consisting of 288 experimental conditions, is automatically clustered into 27 categories, and an exemplar is given for each category. These exemplar conditions reflect the dominant chemistry of their clusters. At the same time, they have larger model prediction uncertainty and smaller experimental uncertainty, and therefore provide better Bayesian constraints. The uncertainty of the model parameters after Bayesian optimization is effectively constrained. The average uncertainty of model predictions across the dataset is reduced from 30 % to 10 % using only 27 exemplar conditions for optimization. Besides selecting experimental data for model optimization, the clustering provided by this method in turn helps reveal the underlying chemistry of the dataset.
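The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the GSAP similarity definition, so the sketch assumes a stand-in: the negative squared Euclidean distance between sensitivity vectors, with the median off-diagonal similarity as the self-preference. The function names, the damping value, and the toy sensitivity vectors (six conditions, three reactions) are all hypothetical. The message-passing updates follow the standard affinity propagation scheme.

```python
def similarity_matrix(sens):
    """ASSUMED similarity: negative squared Euclidean distance between
    sensitivity vectors (a stand-in, not the paper's definition)."""
    n = len(sens)
    S = [[-sum((a - b) ** 2 for a, b in zip(sens[i], sens[k]))
          for k in range(n)] for i in range(n)]
    # Self-preference on the diagonal: median off-diagonal similarity
    # (a common default; lower values yield fewer clusters).
    off = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    pref = off[len(off) // 2]
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, damping=0.7, n_iter=200):
    """Standard affinity propagation on a similarity matrix S (n >= 2).
    Returns (exemplar indices, exemplar index assigned to each point)."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(n_iter):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            first = scores[best]
            second = max(s for k, s in enumerate(scores) if k != best)
            for k in range(n):
                target = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * target
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) <- sum_{i' != k} max(0, r(i',k))
        for k in range(n):
            pos = [R[k][k] if i == k else max(R[i][k], 0.0) for i in range(n)]
            total = sum(pos)
            for i in range(n):
                target = total - pos[i]
                if i != k:
                    target = min(0.0, target)
                A[i][k] = damping * A[i][k] + (1 - damping) * target
    # A point is an exemplar when its combined self-evidence is positive.
    exemplars = [k for k in range(n) if A[k][k] + R[k][k] > 0]
    if not exemplars:
        raise RuntimeError("no exemplars found; try a higher preference")
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Hypothetical sensitivity vectors: rows are experimental conditions,
# columns are the sensitivity of the prediction to three reactions.
sens = [
    [0.80, 0.15, 0.05], [0.75, 0.20, 0.05], [0.82, 0.10, 0.08],
    [0.10, 0.20, 0.70], [0.05, 0.25, 0.70], [0.12, 0.15, 0.73],
]
S = similarity_matrix(sens)
exemplars, labels = affinity_propagation(S)
print("exemplar conditions:", exemplars)
print("exemplar assigned to each condition:", labels)
```

The number of clusters is never passed in; it emerges from the similarities and the self-preference, which is the property the abstract highlights. In this sketch the preference is the main tuning knob, and raising it produces more exemplars.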

Original language: English (US)
Article number: 113121
Journal: Combustion and Flame
State: Published - Jan 2024
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • General Chemistry
  • General Chemical Engineering
  • Fuel Technology
  • Energy Engineering and Power Technology
  • General Physics and Astronomy


Keywords

  • Affinity propagation
  • Data clustering
  • Global sensitivity analysis
  • Uncertainty quantification

