Recursive Hierarchical Regression Clustering

  • Asma Z. Yamani
  • Rabah A. Al-Zaidy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Regression Clustering (RC) combines unsupervised and supervised learning to address the problems that arise when a single dataset is better described by several sets of regression parameters than by one. RC approaches attempt to form optimal subsets of a dataset and then apply a regression algorithm to each subset, with the goal of achieving more accurate predictions. Selecting, without prior knowledge about the data, the number of clusters that best improves the goodness-of-fit of the overall regression outcome is a challenging task. In this paper, we investigate the adaptation of hierarchical clustering to the RC problem. The proposed recursive hierarchical regression clustering (RHRC) algorithm uses a top-down hierarchical clustering approach to obtain optimized clusters. Moreover, our approach addresses premature stopping of the recursive algorithm by incorporating a patience hyper-parameter, which ensures the stability of the overall algorithm. To evaluate our approach, we compare the proposed method with previously proposed regression clustering algorithms. The paper also evaluates RHRC with different flat clustering techniques and tests the performance of different regression algorithms when they are integrated with RHRC. Averaged across ten datasets, our method achieves a 53.79% increase in accuracy with respect to linear regression. Finally, compared to Random Forest, our algorithm obtains equal or higher accuracy (up to 23% higher) on seven out of ten datasets with a fraction of the runtime.
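The abstract does not spell out the splitting criterion, the flat clustering step, or the exact semantics of the patience hyper-parameter, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes one-dimensional inputs, a 2-means split on the feature, ordinary least squares per cluster, and a relative training-error threshold `tol`; all function and parameter names (`fit_line`, `two_means`, `grow`, `tol`, `min_size`, `budget`) are hypothetical. Here patience is interpreted as the number of consecutive non-improving splits the recursion may still explore before it gives up on a branch.

```python
from statistics import mean

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx if sxx else 0.0
    return slope, my - slope * mx

def sse(points, model):
    """Sum of squared residuals of a fitted line on the given points."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)

def two_means(points, iters=20):
    """Assumed flat clustering step: 1-D 2-means on the feature x."""
    xs = [x for x, _ in points]
    c0, c1 = min(xs), max(xs)
    left, right = points, []
    for _ in range(iters):
        left = [p for p in points if abs(p[0] - c0) <= abs(p[0] - c1)]
        right = [p for p in points if abs(p[0] - c0) > abs(p[0] - c1)]
        if not right:  # all x values coincide; nothing to split
            break
        c0, c1 = mean(x for x, _ in left), mean(x for x, _ in right)
    return left, right

def grow(points, patience=1, tol=0.05, min_size=4, budget=None):
    """Top-down recursive splitting; returns (leaves, total_sse).

    Each leaf is a (points, model) pair. `budget` counts how many more
    non-improving splits may be explored before the branch stops.
    """
    budget = patience if budget is None else budget
    model = fit_line(points)
    leaf_sse = sse(points, model)
    leaf = ([(points, model)], leaf_sse)
    if leaf_sse <= 1e-12:  # already an essentially perfect fit
        return leaf
    left, right = two_means(points)
    if len(left) < min_size or len(right) < min_size:
        return leaf
    split_sse = sse(left, fit_line(left)) + sse(right, fit_line(right))
    improved = split_sse < (1 - tol) * leaf_sse
    if not improved and budget == 0:
        return leaf
    # An improving split resets the budget; otherwise spend one unit of it.
    next_budget = patience if improved else budget - 1
    l_leaves, l_sse = grow(left, patience, tol, min_size, next_budget)
    r_leaves, r_sse = grow(right, patience, tol, min_size, next_budget)
    # Keep the subtree only if it beats the single model at this node.
    if l_sse + r_sse < (1 - tol) * leaf_sse:
        return l_leaves + r_leaves, l_sse + r_sse
    return leaf

# Toy data with two linear regimes (rising, then falling).
data = [(x, x if x < 10 else 20 - x) for x in range(20)]
leaves, total_sse = grow(data, patience=1)
for cluster, (slope, intercept) in leaves:
    print(len(cluster), slope, intercept)
```

Because a split can never increase the training error of per-cluster least squares, the sketch requires a relative improvement of at least `tol` before accepting a subtree; a held-out validation error would be another reasonable choice, and nothing in the abstract indicates which criterion the paper uses.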

Original language: English
Title of host publication: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665495523
State: Published - 2021
Event: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021 - Brisbane, Australia
Duration: 8 Dec 2021 to 10 Dec 2021

Publication series

Name: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021

Conference

Conference: 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering, CSDE 2021
Country/Territory: Australia
City: Brisbane
Period: 8/12/21 to 10/12/21

Bibliographical note

Publisher Copyright:
© IEEE 2022.

Keywords

  • HKM
  • Regression
  • clustering
  • hierarchical
  • kmeans

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Health Informatics
