Improved selectivity estimator for XML queries based on structural synopsis

Salahadin Mohammed*, El Sayed M. El-Alfy, Ahmad F. Barradah

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


With the increasing popularity of XML database applications, efficient XML query optimizers are becoming essential. The performance of an XML query optimizer depends heavily on the query selectivity estimators it uses to find the best possible query execution plan. In this work, we propose and evaluate a novel selectivity estimator based on a structural synopsis, called SynopTech. The main idea of SynopTech is to generate a summary tree by labeling the nodes of the source XML data tree using a fingerprint function and merging subtrees with similar structures. SynopTech then uses the generated summary tree to estimate the selectivity of given queries. We evaluated the proposed approach on four benchmark datasets with different structural characteristics, using different types of queries. Compared with the Sampling algorithm, one of the state-of-the-art algorithms for selectivity estimation, SynopTech achieved lower selectivity estimation error rates while using a very low memory budget. For example, for linear and existential queries, SynopTech produced perfect estimates, whereas the Sampling algorithm had an error rate of up to 70%. For regular twig queries, SynopTech had a maximum error rate of 4.12%, whereas the Sampling algorithm had an error rate of more than 55%.
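The summary-tree idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' SynopTech algorithm: the fingerprint function (a hash of the tag plus the set of child fingerprints), the merging rule (sibling subtrees with identical fingerprints are merged), and the names `SummaryNode`, `fingerprint`, `build_summary`, and `estimate_path` are all assumptions made for illustration. The sketch handles only linear child-axis paths.

```python
import hashlib
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class SummaryNode:
    tag: str
    count: int = 0  # number of source elements merged into this node
    children: dict = field(default_factory=dict)  # fingerprint -> SummaryNode


def fingerprint(elem):
    """Hypothetical structural fingerprint (an assumption, not the paper's):
    hash of the tag plus the sorted set of child fingerprints.
    Text, attributes, and child multiplicity are ignored."""
    child_fps = sorted({fingerprint(c) for c in elem})
    raw = elem.tag + "(" + ",".join(child_fps) + ")"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]


def build_summary(elem, node=None):
    """Merge sibling subtrees that share a fingerprint into one summary node."""
    if node is None:
        node = SummaryNode(elem.tag)
    node.count += 1
    for child in elem:
        fp = fingerprint(child)
        child_node = node.children.get(fp)
        if child_node is None:
            child_node = node.children[fp] = SummaryNode(child.tag)
        build_summary(child, child_node)
    return node


def estimate_path(summary, path):
    """Estimate how many elements match a linear child-axis path
    such as 'lib/book/author', using only the summary tree."""
    steps = path.strip("/").split("/")
    frontier = [summary] if summary.tag == steps[0] else []
    for step in steps[1:]:
        frontier = [c for n in frontier
                    for c in n.children.values() if c.tag == step]
    return sum(n.count for n in frontier)


doc = ET.fromstring(
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/><author/><author/></book>"
    "<book><title/></book>"
    "<journal><title/></journal>"
    "</lib>"
)
summary = build_summary(doc)
print(estimate_path(summary, "lib/book"))         # 3
print(estimate_path(summary, "lib/book/author"))  # 3
```

In this toy document the first two `book` elements share a fingerprint and collapse into a single summary node, so the summary holds 8 nodes for 12 source elements while still giving exact counts for these linear paths. Twig and existential queries, and merging of merely similar (rather than identical) structures, are outside the scope of this sketch.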

Original language: English
Pages (from-to): 1123-1144
Number of pages: 22
Journal: World Wide Web
Issue number: 4
State: Published - 6 Dec 2015

Bibliographical note

Publisher Copyright:
© 2014, Springer Science+Business Media New York.


Keywords

  • Query optimization
  • Selectivity estimation
  • Structural synopsis
  • Twig pattern matching
  • XML query processing

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

