Abstract
With the increasing popularity of XML database applications, efficient XML query optimizers are becoming essential. The performance of an XML query optimizer depends heavily on the query selectivity estimators it uses to find the best possible query execution plan. In this work, we propose and evaluate a novel selectivity estimator, based on a structural synopsis, called SynopTech. The main idea of SynopTech is to generate a summary tree by labeling the nodes of the source XML data tree using a fingerprint function and merging subtrees with similar structures. SynopTech then uses the generated summary tree to estimate the selectivity of given queries. We evaluated the proposed approach on four benchmark datasets with different structural characteristics, using different types of queries. Compared with the Sampling algorithm, one of the state-of-the-art algorithms for selectivity estimation, SynopTech achieved lower selectivity estimation error rates while using a very low memory budget. For example, for linear and existential queries, SynopTech produced perfect estimates, whereas the Sampling algorithm had an error rate of up to 70%. For regular twig queries, SynopTech had a maximum error rate of 4.12%, whereas the Sampling algorithm had an error rate of more than 55%.
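The summary-tree idea in the abstract can be illustrated with a minimal sketch. This is not the SynopTech algorithm itself: the fingerprint used here (a canonical tuple of a node's tag and its sorted child fingerprints), the `SummaryNode` class, and the functions `fingerprint`, `build_summary`, and `estimate` are all hypothetical names chosen for illustration, and the sketch merges only sibling subtrees with identical structure rather than similar ones.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class SummaryNode:
    """One node of the summary tree (hypothetical structure, for illustration)."""
    tag: str
    count: int  # number of identical sibling subtrees this node stands for
    children: list = field(default_factory=list)


def fingerprint(elem):
    """Assumed fingerprint: the tag plus the sorted fingerprints of the children.

    Two subtrees get the same fingerprint exactly when they have the same
    structure, ignoring sibling order and text content.
    """
    return (elem.tag, tuple(sorted(fingerprint(child) for child in elem)))


def build_summary(elem, count=1):
    """Build a summary tree, merging sibling subtrees with equal fingerprints."""
    node = SummaryNode(elem.tag, count)
    groups = {}  # fingerprint -> [representative element, occurrences]
    for child in elem:
        fp = fingerprint(child)
        if fp in groups:
            groups[fp][1] += 1
        else:
            groups[fp] = [child, 1]
    for representative, occurrences in groups.values():
        node.children.append(build_summary(representative, occurrences))
    return node


def estimate(node, path):
    """Estimate how many elements match a linear path such as ["lib", "book"].

    The path starts at the root tag; counts are multiplied along each
    matching branch of the summary tree and summed across branches.
    """
    if node.tag != path[0]:
        return 0
    if len(path) == 1:
        return node.count
    return node.count * sum(estimate(child, path[1:]) for child in node.children)
```

Because this sketch merges only structurally identical subtrees, its counts for linear path queries are exact. A synopsis that merges merely similar subtrees to fit a memory budget, as the abstract describes, would trade some of that accuracy for a smaller summary.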
Original language | English |
---|---|
Pages (from-to) | 1123-1144 |
Number of pages | 22 |
Journal | World Wide Web |
Volume | 18 |
Issue number | 4 |
State | Published - 6 Dec 2015 |
Bibliographical note
Publisher Copyright: © 2014, Springer Science+Business Media New York.
Keywords
- Query optimization
- Selectivity estimation
- Structural synopsis
- Twig pattern matching
- XML query processing
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Computer Networks and Communications