Abstract
Mining association rules is essential to discovering knowledge hidden in datasets. Many efficient association rule mining algorithms exist, but they often discover a very large number of rules. A large number of rules makes knowledge discovery challenging because too many rules are difficult to understand, interpret, or visualize. To reduce the number of discovered rules, researchers have proposed a number of solutions. However, these solutions are limited to rules generated from traditional datasets and cannot handle rules generated from big datasets. To solve this problem, this paper proposes a Hadoop MapReduce-based parallel association rule pruning algorithm named PPrune. Experimental results show that PPrune is efficient and has good speedup, scaleup, and sizeup.
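The three metrics named in the abstract are commonly defined in the parallel data mining literature as ratios of running times. The sketch below follows those common definitions, which are assumed here and not taken from the paper itself. The function names and all timing values are hypothetical and serve only as illustration; they are not results reported for PPrune.

```python
def speedup(time_one_node: float, time_m_nodes: float) -> float:
    """Same dataset, more nodes: T(1 node) / T(m nodes). Ideal value is m."""
    return time_one_node / time_m_nodes


def scaleup(time_base: float, time_scaled: float) -> float:
    """Data and nodes grown together: T(1 node, D) / T(m nodes, m*D).
    Ideal value is 1.0 (the larger job finishes in the same time)."""
    return time_base / time_scaled


def sizeup(time_scaled_data: float, time_base_data: float) -> float:
    """Same nodes, more data: T(n*D) / T(D). Growth at or below n is good."""
    return time_scaled_data / time_base_data


# Hypothetical running times in seconds (illustrative only).
t_1node_d = 120.0      # 1 node, dataset D
t_4nodes_d = 40.0      # 4 nodes, dataset D
t_4nodes_4d = 150.0    # 4 nodes, dataset 4*D

example_speedup = speedup(t_1node_d, t_4nodes_d)
example_scaleup = scaleup(t_1node_d, t_4nodes_4d)
example_sizeup = sizeup(t_4nodes_4d, t_4nodes_d)

print(f"speedup={example_speedup:.2f}, "
      f"scaleup={example_scaleup:.2f}, "
      f"sizeup={example_sizeup:.2f}")
```

Under these definitions, a speedup close to the number of nodes, a scaleup close to 1, and a sizeup that grows no faster than the data would each indicate good parallel behavior.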
Original language | English |
---|---|
Title of host publication | Lecture Notes in Networks and Systems |
Publisher | Springer |
Pages | 117-130 |
Number of pages | 14 |
State | Published - 2020 |
Publication series
Name | Lecture Notes in Networks and Systems |
---|---|
Volume | 125 |
ISSN (Print) | 2367-3370 |
ISSN (Electronic) | 2367-3389 |
Bibliographical note
Funding Information: Acknowledgements. The authors would like to thank King Fahd University of Petroleum and Minerals (KFUPM), Saudi Arabia, for the support during this work.
Publisher Copyright:
© 2020, Springer Nature Singapore Pte Ltd.
Keywords
- Association rules
- Clustering
- Data mining
- Hadoop MapReduce
- Knowledge discovery
- Pruning
ASJC Scopus subject areas
- Control and Systems Engineering
- Signal Processing
- Computer Networks and Communications