Statistical Analysis of Clustering Performances of NMF, Spectral Clustering, and K-means

Andri Mirzal*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Nonnegative matrix factorization (NMF), spectral clustering, and k-means are among the most widely used clustering methods in machine learning research. They have been applied in many domains, including text, image, and cancer clustering. However, only a limited number of works discuss the statistical significance of performance differences between these methods. This issue is especially important for NMF, which is still very actively researched: a large number of new algorithms are published every year, and being able to demonstrate that a newly proposed algorithm statistically outperforms previous ones is certainly desirable. In this paper, we present a statistical analysis of clustering performance differences between NMF, spectral clustering, and k-means. We benchmark ten NMF algorithms, six spectral clustering algorithms, and one standard k-means algorithm. For data, we use eleven publicly available microarray gene expression datasets, with the number of classes ranging from two to ten. The experimental results show that the performance differences between the NMF algorithms and the standard k-means algorithm are not statistically significant, and that spectral methods, surprisingly, perform less well than NMF and k-means.
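The abstract does not name the statistical test used, so the following is only an illustrative sketch of one common way to compare several algorithms across several datasets: the Friedman rank test. Everything in it is an assumption made for illustration: the three method labels, the helper names `average_ranks` and `friedman_statistic`, and above all the scores, which are random placeholders and not results from the paper. Tie correction is omitted, and the p-value uses the chi-square approximation, which is only rough for eleven datasets.

```python
import math
import random


def average_ranks(row):
    """Rank the scores of one dataset; the highest score gets rank 1.

    Tied scores share the mean of the ranks they span.
    """
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores equal to the one at position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman statistic (no tie correction).

    scores[d][a] is the score of algorithm a on dataset d.
    Returns the statistic Q and the mean rank of each algorithm.
    """
    n, k = len(scores), len(scores[0])
    ranked = [average_ranks(row) for row in scores]
    mean_ranks = [sum(r[a] for r in ranked) / n for a in range(k)]
    q = 12 * n / (k * (k + 1)) * sum((m - (k + 1) / 2) ** 2 for m in mean_ranks)
    return q, mean_ranks


# Placeholder scores only: 11 hypothetical datasets x 3 hypothetical methods.
rng = random.Random(0)
methods = ["nmf", "spectral", "kmeans"]
scores = [[rng.uniform(0.4, 0.9) for _ in methods] for _ in range(11)]

q, mean_ranks = friedman_statistic(scores)
# With 3 methods there are 2 degrees of freedom, where the chi-square
# survival function has the closed form exp(-x / 2).
p_value = math.exp(-q / 2)

for name, rank in zip(methods, mean_ranks):
    print(f"{name}: mean rank {rank:.2f}")
print(f"Friedman Q = {q:.3f}, approximate p = {p_value:.3f}")
```

A small p-value would suggest that at least one method ranks consistently differently from the others; a post-hoc pairwise procedure would then be needed to say which ones. With more than three methods the closed-form p-value above no longer applies, and a chi-square routine from a statistics library would be needed instead.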

Original language: English
Title of host publication: 2020 2nd International Conference on Computer and Information Sciences, ICCIS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728154671
State: Published - 13 Oct 2020

Publication series

Name: 2020 2nd International Conference on Computer and Information Sciences, ICCIS 2020

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

Keywords

  • K-means
  • Nonnegative matrix factorization
  • Spectral clustering
  • Statistical analysis

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems and Management
  • Control and Optimization
  • Information Systems
