Dimensionality reduction in data mining: A Copula approach

  • Rima Houari
  • Ahcène Bounceur*
  • M. Tahar Kechadi
  • A. Kamel Tari
  • Reinhardt Euler

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

66 Scopus citations

Abstract

The recent trend of collecting huge and diverse datasets has created a great challenge in data analysis. One characteristic of these very large datasets is that they often contain significant amounts of redundancy. Using very large multi-dimensional data results in more noise, redundant data, and the possibility of unconnected data entities. To efficiently manipulate data represented in a high-dimensional space and to address the impact of redundant dimensions on the final results, we propose a new technique for dimensionality reduction using Copulas and the LU-decomposition (Forward Substitution) method. The proposed method compares favorably with existing approaches on real-world datasets taken from a machine learning repository (Diabetes, Waveform, two versions of Human Activity Recognition based on Smartphone, and Thyroid) in terms of dimensionality reduction and efficiency, both evaluated using statistical and classification measures.

Original language: English
Pages (from-to): 247-260
Number of pages: 14
Journal: Expert Systems with Applications
Volume: 64
State: Published - 1 Dec 2016
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2016 Elsevier Ltd

Keywords

  • Copulas
  • Data mining
  • Data pre-processing
  • Dimensionality reduction
  • Multi-dimensional sampling

ASJC Scopus subject areas

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence
