CHUQI-Miner: Mining Correlated Quantitative High Utility Itemsets

Mourad Nouioua, Philippe Fournier-Viger*, Jun Feng Qu, Jerry Chun Wei Lin, Wensheng Gan, Wei Song

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

To discover patterns of high importance in data, a popular data science task is high utility itemset mining (HUIM). It aims at discovering all sets of values that have a high utility (importance) in database records. A key application is to find products purchased together in online stores that yield a high profit (utility), as this can provide insights for marketing and product recommendation. However, HUIM has two key limitations. First, the discovered patterns do not provide information about the quantities of items, although in real life quantities matter (e.g., buying 1 loaf of bread is not the same as buying 12). Second, it is observed in real shopping data that many itemsets yield a high utility (profit) but contain weakly correlated items. Such itemsets can be misleading, as their items may be sold together merely by chance. This paper addresses both issues by proposing a novel algorithm called CHUQI-Miner (Correlated High Utility Quantitative Itemset-Miner). It extends the state-of-the-art HUQI-Miner algorithm for quantitative high utility itemset mining with the bond correlation measure, which allows finding strongly correlated high utility itemsets with quantities. Experiments on retail data show that the algorithm is efficient and can filter out a large number of spurious itemsets.
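
As a rough illustration of the two measures the abstract combines, the Python sketch below computes the utility and the bond of an itemset on a toy transaction database. It is not the CHUQI-Miner algorithm: it assumes the usual definition of bond for ordinary itemsets (number of transactions containing all the items divided by the number containing at least one of them), it does not model how the paper attaches quantities to patterns, and it performs no search or pruning. All function names, the data, the profit values, and the thresholds are hypothetical.

```python
from typing import Dict, Iterable, List

# A transaction maps each purchased item to its quantity (toy data).
Transaction = Dict[str, int]

TRANSACTIONS: List[Transaction] = [
    {"bread": 1, "milk": 2},
    {"bread": 12},
    {"bread": 2, "milk": 1, "wine": 1},
    {"wine": 3},
]

# Hypothetical profit earned per unit sold of each item.
UNIT_PROFIT: Dict[str, int] = {"bread": 1, "milk": 2, "wine": 10}


def utility(itemset: Iterable[str], transactions: List[Transaction],
            unit_profit: Dict[str, int]) -> int:
    """Sum quantity * unit profit over the transactions containing every item."""
    items = list(itemset)
    total = 0
    for t in transactions:
        if all(i in t for i in items):
            total += sum(t[i] * unit_profit[i] for i in items)
    return total


def bond(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Transactions containing all items / transactions containing any item."""
    items = list(itemset)
    conjunctive = sum(1 for t in transactions if all(i in t for i in items))
    disjunctive = sum(1 for t in transactions if any(i in t for i in items))
    return conjunctive / disjunctive if disjunctive else 0.0


def is_correlated_high_utility(itemset: Iterable[str], min_util: int,
                               min_bond: float) -> bool:
    """Keep an itemset only if it passes both (assumed) thresholds."""
    items = list(itemset)
    return (utility(items, TRANSACTIONS, UNIT_PROFIT) >= min_util
            and bond(items, TRANSACTIONS) >= min_bond)
```

On this toy data, {bread, wine} has the higher utility (12 versus 9 for {bread, milk}) but a bond of only 0.25, so with thresholds such as min_util = 9 and min_bond = 0.5 it would be discarded as weakly correlated, while {bread, milk} (bond 2/3) would be kept.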

Original language: English
Title of host publication: Proceedings - 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021
Editors: Bing Xue, Mykola Pechenizkiy, Yun Sing Koh
Publisher: IEEE Computer Society
Pages: 599-606
Number of pages: 8
ISBN (Electronic): 9781665424271
State: Published - 2021
Externally published: Yes
Event: 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021 - Virtual, Online, New Zealand
Duration: 7 Dec 2021 to 10 Dec 2021

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
Volume: 2021-December
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021
Country/Territory: New Zealand
City: Virtual, Online
Period: 7/12/21 to 10/12/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • bond measure
  • correlation
  • high utility itemsets
  • quantitative itemsets

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
