Abstract
High utility itemset mining (HUIM) is a popular data science task for discovering patterns of high importance in data. It aims to discover all sets of values that have a high utility (importance) in database records. A key application is finding products that are purchased together in online stores and yield a high profit (utility), which can provide insights for marketing and product recommendation. However, HUIM has two key limitations. First, the discovered patterns provide no information about the quantities of items, although quantities matter in real life (e.g., buying 1 loaf of bread is not the same as buying 12). Second, real shopping data shows that many itemsets yield a high utility (profit) but contain weakly correlated items. Such itemsets can be misleading because their items may be sold together merely by chance. This paper addresses both issues by proposing a novel algorithm called CHUQI-Miner (Correlated High Utility Quantitative Itemset-Miner). It extends the state-of-the-art HUQI-Miner algorithm for quantitative high utility itemset mining with the bond correlation measure, which allows finding strongly correlated high utility itemsets with quantities. Experiments on retail data show that the algorithm is efficient and can filter out a large number of spurious itemsets.
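To make the two measures named in the abstract concrete, the following minimal Python sketch computes the utility and the bond of a quantitative itemset on a toy database. It is not the CHUQI-Miner algorithm. It assumes that utility is quantity times unit profit, summed over matching transactions, and that bond is the number of transactions matching every quantitative item divided by the number matching at least one, which is how the bond is commonly defined. The quantity-range matching rule, the names `matches`, `utility` and `bond`, and all data values are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Tuple

Transaction = Dict[str, int]      # item -> purchased quantity
QItem = Tuple[str, int, int]      # (item, min quantity, max quantity), inclusive

# Made-up toy data, for illustration only.
DATABASE: List[Transaction] = [
    {"bread": 1, "milk": 2},
    {"bread": 12, "butter": 1},
    {"bread": 1, "milk": 1, "butter": 2},
    {"milk": 3},
]
UNIT_PROFIT: Dict[str, int] = {"bread": 1, "milk": 2, "butter": 5}


def matches(q: QItem, t: Transaction) -> bool:
    """Assumed rule: a quantitative item matches a transaction when the
    item is present and its quantity lies in the given range."""
    item, low, high = q
    return item in t and low <= t[item] <= high


def utility(qitemset: List[QItem], database: List[Transaction],
            unit_profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over transactions matching every item."""
    total = 0
    for t in database:
        if all(matches(q, t) for q in qitemset):
            total += sum(t[q[0]] * unit_profit[q[0]] for q in qitemset)
    return total


def bond(qitemset: List[QItem], database: List[Transaction]) -> float:
    """Conjunctive support divided by disjunctive support."""
    conj = sum(1 for t in database if all(matches(q, t) for q in qitemset))
    disj = sum(1 for t in database if any(matches(q, t) for q in qitemset))
    return conj / disj if disj else 0.0


one_bread_some_milk = [("bread", 1, 1), ("milk", 1, 2)]
dozen_bread_butter = [("bread", 12, 12), ("butter", 1, 2)]

print(utility(one_bread_some_milk, DATABASE, UNIT_PROFIT),
      bond(one_bread_some_milk, DATABASE))
print(utility(dozen_bread_butter, DATABASE, UNIT_PROFIT),
      bond(dozen_bread_butter, DATABASE))
```

On this toy data the second itemset has the higher utility but the lower bond, which is the kind of weakly correlated, high-profit pattern that a minimum bond threshold would filter out.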
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021 |
| Editors | Bing Xue, Mykola Pechenizkiy, Yun Sing Koh |
| Publisher | IEEE Computer Society |
| Pages | 599-606 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781665424271 |
| State | Published - 2021 |
| Externally published | Yes |
| Event | 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021 - Virtual, Online, New Zealand. Duration: 7 Dec 2021 → 10 Dec 2021 |
Publication series
| Name | IEEE International Conference on Data Mining Workshops, ICDMW |
|---|---|
| Volume | 2021-December |
| ISSN (Print) | 2375-9232 |
| ISSN (Electronic) | 2375-9259 |
Conference
| Conference | 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021 |
|---|---|
| Country/Territory | New Zealand |
| City | Virtual, Online |
| Period | 7/12/21 → 10/12/21 |
Bibliographical note
Publisher Copyright: © 2021 IEEE.
Keywords
- bond measure
- correlation
- high utility itemsets
- quantitative itemsets
ASJC Scopus subject areas
- Computer Science Applications
- Software