When processing join queries over big data, a DBMS can become unresponsive: it may take a very long time before any output tuples appear. Ranked enumeration addresses this problem by returning the most important answers as quickly as possible, ideally in time linear (or quasilinear) in the input size, even if the complete output is much larger.

Aside from its practical usefulness, ranked enumeration is closely related to, and in a way unifies, several other problems involving joins. The common goal is the design of optimal algorithms that are guaranteed to avoid large intermediate results and thus achieve time or space complexity close to a lower bound. Since avoiding query plans that produce huge intermediate results has been an overarching goal of database optimizers, optimal join algorithms, enumeration, and factorized representations have recently generated a lot of excitement.

In this tutorial, we embark on an exploration of these topics with ranked enumeration as our guide, showing how they are intimately connected with a wide range of fundamental problems in computer science.
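
To make the idea of ranked enumeration concrete, here is a minimal sketch for the simplest case: a join of just two relations, with answers ranked by the sum of their tuple weights. It is not the any-k algorithm covered in the tutorial. The function name `ranked_join`, the tuple layouts `R(a, b, w)` and `S(b, c, w)`, and the sum-of-weights ranking are all assumptions made for illustration. After sorting and heap construction (quasilinear in the input size), each answer is produced with one heap operation, without materializing the full join result.

```python
import heapq
from collections import defaultdict


def ranked_join(R, S):
    """Yield (total_weight, (a, b, c)) for R(a, b, w) JOIN S(b, c, w),
    in non-decreasing order of total weight (illustrative sketch)."""
    # Group S by the join attribute b and sort each group by weight.
    groups = defaultdict(list)
    for b, c, w in S:
        groups[b].append((w, c))
    for group in groups.values():
        group.sort(key=lambda pair: pair[0])

    # Seed the heap with the lightest join partner of every R tuple.
    heap = []
    for i, (_, b, w) in enumerate(R):
        if b in groups:
            heap.append((w + groups[b][0][0], i, 0))
    heapq.heapify(heap)

    # Pop the current best answer, then push its successor: the same
    # R tuple paired with the next-lightest S tuple of its group.
    while heap:
        total, i, j = heapq.heappop(heap)
        a, b, w_r = R[i]
        yield total, (a, b, groups[b][j][1])
        if j + 1 < len(groups[b]):
            heapq.heappush(heap, (w_r + groups[b][j + 1][0], i, j + 1))
```

Because `ranked_join` is a generator, a caller can stop after the first k answers (for example with `itertools.islice`) and pay only for what was consumed.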

Part 1: Introduction (Nikos)

What is this tutorial about?
Overview of queries/tasks
Measures of success
Overview of techniques

Part 2: Cycles & Tree Decompositions (Mirek)

Lower Bound and the Yannakakis Algorithm
Problems Caused by Cycles
Tree Decompositions
Summary

Part 3: Acyclic queries & Enumeration (Wolfgang)

Semi-join reductions as an instance of message passing with sideways information passing
Query hypergraphs, GYO reduction, ear removal, join trees
Yannakakis = acyclic query evaluation (table-at-a-time)
Enumerating answers (tuple-at-a-time)
Free-connex queries
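
As a rough illustration of the semi-join reduction and the tuple-at-a-time enumeration listed above, the sketch below handles one fixed acyclic query, the path join R1(a, b), R2(b, c), R3(c, d). The helper names `semijoin`, `full_reduce_path`, and `enumerate_path` are hypothetical, and the hard-coded join tree (R1 as root, R3 as leaf) is an assumption. A general implementation would derive the join tree from the query hypergraph.

```python
def semijoin(left, right, left_col, right_col):
    """Keep the tuples of `left` that have a join partner in `right`."""
    keys = {t[right_col] for t in right}
    return [t for t in left if t[left_col] in keys]


def full_reduce_path(R1, R2, R3):
    """Two semi-join sweeps over the join tree R1 - R2 - R3."""
    # Bottom-up sweep: from the leaf R3 toward the root R1.
    R2 = semijoin(R2, R3, 1, 0)
    R1 = semijoin(R1, R2, 1, 0)
    # Top-down sweep: from the root R1 back toward the leaf R3.
    R2 = semijoin(R2, R1, 0, 1)
    R3 = semijoin(R3, R2, 0, 1)
    return R1, R2, R3


def enumerate_path(R1, R2, R3):
    """Yield join answers (a, b, c, d) one at a time."""
    R1, R2, R3 = full_reduce_path(R1, R2, R3)
    by_b, by_c = {}, {}
    for b, c in R2:
        by_b.setdefault(b, []).append(c)
    for c, d in R3:
        by_c.setdefault(c, []).append(d)
    # After the reduction every lookup succeeds, so no partial answer
    # is ever explored and then discarded.
    for a, b in R1:
        for c in by_b[b]:
            for d in by_c[c]:
                yield (a, b, c, d)
```

The reduction removes all dangling tuples first, which is why the nested loops never run into a dead end.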

Part 4: Factorization (Nikos)

High-level idea
Factorized representation of path-CQ
Factorized representation of tree-CQ & enumeration
Tuple-level vs Attribute-level representations

Part 5: Dynamic programming & Semirings (Wolfgang)

Top-1 = Dynamic Programming (DP), Principle of Optimality, Optimal substructure property, Shortest path calculation, Path counting, Longest increasing subsequence, Fibonacci numbers
Top-1 Yannakakis as variant of Tree-DP and Non-Serial Dynamic Programming (NSDP)
Algebra: Totally Ordered Commutative Monoids, Selective Commutative Dioids
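
The link between top-1 computation, dynamic programming, and semirings can be sketched on a small layered graph: the same bottom-up pass finds the lightest path under (min, +) and counts paths under (+, ×). The function `aggregate_paths`, its parameters, and the edge-list encoding of stages are assumptions made for this example, not notation from the tutorial.

```python
import operator
from functools import reduce


def aggregate_paths(stages, plus, times, one):
    """Aggregate the weights of all paths through a layered graph.

    stages[k] is a list of (src, dst, weight) edges from layer k to k+1.
    `plus` combines alternative paths, `times` extends a path by an edge,
    and `one` is the neutral element of `times`.
    """
    suffix = None  # None: we are past the last stage, every node has value `one`
    for edges in reversed(stages):
        current = {}
        for src, dst, w in edges:
            if suffix is None:
                tail = one
            elif dst in suffix:
                tail = suffix[dst]
            else:
                continue  # dangling edge: dst cannot reach the last layer
            value = times(w, tail)
            current[src] = plus(current[src], value) if src in current else value
        suffix = current
    return reduce(plus, suffix.values()) if suffix else None


stages = [[("a", "b", 1), ("a", "c", 4)],
          [("b", "d", 5), ("c", "d", 1)]]

# Top-1 (lightest path): plus = min, times = +, one = 0.
best = aggregate_paths(stages, min, operator.add, 0)

# Path counting: plus = +, times = *, one = 1, every edge weighted 1.
unit = [[(s, d, 1) for s, d, _ in edges] for edges in stages]
count = aggregate_paths(unit, operator.add, operator.mul, 1)
```

Here `best` is 5 (the path a, c, d) and `count` is 2. Only the pair of operations changes between the two calls; the traversal is identical.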

Part 6: Any-k or Ranked Enumeration (Nikos)

Warm-up: Incremental QuickSort
Any-k for joins
AnyK-Part
AnyK-Rec
AnyK-Part+
Experimental Results & Summary
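
For readers unfamiliar with the warm-up topic, the following sketch shows one common formulation of incremental QuickSort: elements are returned in sorted order one at a time, and partitioning work is done only as far as needed for the next element. The generator name `incremental_quicksort` and the implementation details (random pivot, Lomuto-style partition, a stack of pivot positions) are assumptions and may differ from the variant presented in the tutorial.

```python
import random


def incremental_quicksort(items):
    """Yield the elements of `items` in non-decreasing order, lazily."""
    a = list(items)
    # Stack of positions whose elements are already in their final place.
    # len(a) acts as a sentinel "pivot" to the right of the array.
    stack = [len(a)]
    for idx in range(len(a)):
        # Partition the segment a[idx .. stack[-1] - 1] until a pivot
        # lands exactly at idx; that pivot is the next smallest element.
        while stack[-1] != idx:
            hi = stack[-1] - 1
            p = random.randint(idx, hi)
            a[p], a[hi] = a[hi], a[p]
            pivot = a[hi]
            store = idx
            for k in range(idx, hi):
                if a[k] < pivot:
                    a[k], a[store] = a[store], a[k]
                    store += 1
            a[store], a[hi] = a[hi], a[store]
            stack.append(store)
        stack.pop()
        yield a[idx]
```

Consuming only the first k elements leaves the rest of the array partially partitioned, which is the property that makes this a natural warm-up for any-k enumeration.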

Part 7: Decomposition of Inequality Predicates and Conclusions (Mirek)

Joins with Inequality Predicates
Factorizing Inequalities
Experiments with Ranked Enumeration
Conclusions

Presenters from DATA Lab @ Northeastern University

Nikolaos Tziavelis (PhD student, lead researcher)
Wolfgang Gatterbauer (faculty)
Mirek Riedewald (faculty)

Reference

Toward Responsive DBMS: Optimal Join Algorithms, Enumeration, Factorization, Ranking, and Dynamic Programming

Nikolaos Tziavelis, Wolfgang Gatterbauer, Mirek Riedewald

ICDE tutorials, pp. 3205-3208, 2022

@inproceedings{TziavelisGR:2022,
   author = {Nikolaos Tziavelis and Wolfgang Gatterbauer and Mirek Riedewald},
   title = {Toward Responsive DBMS: Optimal Join Algorithms, Enumeration, Factorization, Ranking, and Dynamic Programming},
   booktitle = {ICDE},
   pages = {3205--3208},
   year = {2022},
   doi = {10.1109/ICDE53745.2022.00299},
   url = {https://northeastern-datalab.github.io/responsive-dbms-tutorial}
}

Closely related papers authored by the presenters

Any-k Algorithms for Enumerating Ranked Answers to Conjunctive Queries

Nikolaos Tziavelis, Wolfgang Gatterbauer, Mirek Riedewald

arXiv, 2022

arXiv:2205.05649

Project web page: Any-k

Beyond Equi-joins: Ranking, Enumeration and Factorization

Nikolaos Tziavelis, Wolfgang Gatterbauer, Mirek Riedewald

PVLDB, 14(11):2599-2612, 2021

arXiv:2101.12158 (long version)

Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

Nikolaos Tziavelis, Deepak Ajwani, Wolfgang Gatterbauer, Mirek Riedewald, Xiaofeng Yang

PVLDB, 13(9):1582-1597, 2020

Optimal Join Algorithms meet Top-k

Nikolaos Tziavelis, Wolfgang Gatterbauer, Mirek Riedewald

SIGMOD tutorials, pp. 2659-2665, 2020

Tutorial page: Optimal Join Algorithms meet Top-k

Longer Bibliography

A longer list of related work is available on the website from our SIGMOD 2020 tutorial and in the related work section of arXiv:2205.05649.

Funding

This work has been supported in part by the National Science Foundation (NSF) under award numbers CAREER IIS-1762268 and IIS-1956096, by the Office of Naval Research under grant number N00014-21-C-1111, and by the National Institutes of Health (NIH) under award number R01 NS091421. Any opinions, findings, and conclusions or recommendations expressed in this presentation are those of the authors and do not necessarily reflect the views of the funding agencies.

Toward Responsive DBMS

Optimal Join Algorithms, Enumeration, Factorization, Ranking, and Dynamic Programming

ICDE 2022 Tutorial
