Discovering Representative Attribute-stars via Minimum Description Length

  • Jiahong Liu
  • Min Zhou*
  • Philippe Fournier-Viger*
  • Menglin Yang
  • Lujia Pan
  • Mourad Nouioua

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

Graphs are a popular data type found in many domains. Numerous techniques have been proposed to find interesting patterns in graphs to help understand the data and support decision-making. However, two limitations generally hinder their practical use: (1) they have multiple parameters that are hard to set but greatly influence results, and (2) they generally focus on identifying complex subgraphs while ignoring relationships between attributes of nodes. To address these problems, we propose a parameter-free algorithm named CSPM (Compressing Star Pattern Miner), which identifies star-shaped patterns that indicate strong correlations among attributes via the concept of conditional entropy and the minimum description length principle. Experiments performed on several benchmark datasets show that CSPM reveals insightful and interpretable patterns and is efficient in runtime. Moreover, quantitative evaluations on two real-world applications show that CSPM is broadly applicable: it boosts the accuracy of graph attribute completion models by up to 30.68% and uncovers important patterns in telecommunication alarm data.

Original language: English
Title of host publication: Proceedings - 2022 IEEE 38th International Conference on Data Engineering, ICDE 2022
Publisher: IEEE Computer Society
Pages: 68-80
Number of pages: 13
ISBN (Electronic): 9781665408837
State: Published - 2022
Externally published: Yes
Event: 38th IEEE International Conference on Data Engineering, ICDE 2022 - Virtual, Online, Malaysia
Duration: 9 May 2022 to 12 May 2022

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2022-May
ISSN (Print): 1084-4627
ISSN (Electronic): 2375-0286

Conference

Conference: 38th IEEE International Conference on Data Engineering, ICDE 2022
Country/Territory: Malaysia
City: Virtual, Online
Period: 9/05/22 to 12/05/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
