Sub-optimal Join Order Identification with L1-error

Yesdaulet Izenov, Asoke Datta, Brian Tsan, Florin Rusu

Research output: Contribution to journalConference articlepeer-review

Abstract

Q-error -- the standard metric for quantifying the error of individual cardinality estimates -- has been widely adopted as a surrogate for query plan optimality in recent work on learning-based cardinality estimation. However, the only result connecting Q-error with plan optimality is an upper-bound on the cost of the worst possible query plan computed from a set of cardinality estimates---there is no connection between Q-error and the real plans generated by standard query optimizers. Therefore, in order to identify sub-optimal query plans, we propose a learning-based method having as its main feature a novel measure called L1-error. Similar to Q-error, L1-error requires complete knowledge of true cardinalities and estimates for all the sub-plans of a query plan. Unlike Q-error, which considers the estimates independently, L1-error is defined as a permutation distance between true cardinalities and estimates for all the sub-plans having the same number of joins. Moreover, L1-error takes into account errors relative to the magnitude of their cardinalities and gives larger weight to small multi-way joins. Our experimental results confirm that, when L1-error is integrated into a standard decision tree classifier, it leads to the accurate identification of sub-optimal plans across four different benchmarks. This accuracy can be further improved by combining L1-error with Q-error into a composite feature that can be computed without overhead from the same data.
Original languageEnglish
Pages (from-to)1–24
Number of pages24
JournalProceedings of the ACM SIGMOD International Conference on Management of Data
Volume2
Issue number1
DOIs
Publication statusPublished - 2024
Event2024 International Conference on Management of Data - Santiago, Chile
Duration: Jun 9 2024 → …
Conference number: 2024
https://2024.sigmod.org/sigmod_workshops.shtml

Keywords

  • join ordering
  • cardinality estimation
  • permutation distance
  • database query processing
  • feature engineering
  • bad query plans

Fingerprint

Dive into the research topics of 'Sub-optimal Join Order Identification with L1-error'. Together they form a unique fingerprint.

Cite this