CS 7575: A Seminar On Relational Language Design (Spring 2026)
Seminar Description
Content: Most structured data today is relational, and SQL remains the dominant query language to access
data. Several recent papers have questioned many of SQL's core design decisions. In this seminar, we investigate the
design space of relational query languages: We study the core relational query languages, read recent alternative
proposals, tease out their common abstractions, try to create a new phenomenology of relational languages, and ask:
what are the "right" abstractions for relational query languages, especially as we are moving towards
natural-language interfaces?
-
PART 1: Foundations
Introduces and compares the core relational languages: SQL, Domain Relational Calculus (DRC), Tuple Relational
Calculus (TRC), Relational Algebra (RA), and Datalog (including recursion with stratified negation). We also study
Relational Diagrams, the semantic concept of Relational Patterns, and the Generalized Tuple Relational Calculus
together with its Abstract Language Higraphs (a variant of Abstract Syntax Trees). This part is led by the
instructor and establishes a conceptual framework for studying relational languages.
-
PART 2: New Proposals and Ongoing Debates
We discuss recent proposals for alternative languages and data abstractions (including graph query languages).
Each student presents a paper, followed by an in-class discussion and comparative analysis of the language or
proposal.
-
PART 3: Beyond "traditional" query languages
We discuss extensions to relational languages, and languages that are not traditionally considered "relational":
Answer Set Programming, Disjunctive Logic Programming, Integer Linear Programming, APL, J, Pandas, dplyr, SQL++,
etc. This part is again led by the instructor.
Prerequisites:
The course is fast-paced but self-contained. Standard undergraduate CS knowledge of SQL (e.g., via [SAMS'19]) and of
algorithms, logic, and complexity theory (e.g., from textbooks such as [Ericson'19],
[Dasgupta, Papadimitriou, Vazirani'06], [Cormen, Leiserson, Rivest, Stein'09], [Kleinberg, Tardos'05], or
[Lehman, Leighton, and Meyer'15]) will be helpful.
PhD program:
The seminar counts for the PhD breadth requirement in "Software".
Administrative Information
Time/location
-
Lectures: Mon/Thu 11:45am - 1:25pm, Snell Library 007, in person.
Classes will not be recorded, but all slides by the lecturer from class will be made available within 2 days
*after* each lecture (i.e. WED for MON classes, SAT for THU classes), either on this course web page or from
within Piazza.
-
Office hours: directly after class, or by appointment in person at 450 West Village H, or via Microsoft Teams or Zoom.
Please email the instructors with 3 time slots that work for you.
Instructor:
Wolfgang Gatterbauer
Contact:
Please use Piazza (via direct access from within
Canvas)
for all questions related to lectures, coursework, and the project. Note that you can post questions anonymously to all
other students, or anonymously even to the instructors.
Alternatively, please use my anonymous feedback form to send
comments and suggestions that only I can see.
Coursework/Evaluation
50%: Course project:
A main component of this seminar will be a research project in the second half of the semester.
The project should connect to the seminar (relational languages and how humans and machines interact with
data, today or in the future), yet is completely flexible and allows you to build on your existing PhD research.
Guidance on the project and preliminary dates are posted on the project page.
15%: Paper presentation in PART 2:
You will lead one class session (or half) by presenting a paper and facilitating an interactive discussion.
-
Choose a paper (or line of papers, or project) from the suggested list (or propose another paper you are
interested in, just talk to me).
-
Claim your paper on Piazza (there is a special page to post your ideas, first come, first served).
-
Present the language in class, analyze its patterns, and find minimal illustrating examples that show some interesting
behavior.
-
Slides: I highly recommend that you prepare your presentation in PowerPoint (via Office 365) or a similar tool. I have never heard a convincing
argument about why LaTeX beamer is better for presentations than PowerPoint ("convincing" here means that it not
only sounds reasonable but that it can't be easily debunked). Don't forget to add slide numbers so that the audience
can refer to individual pages.
-
Please share an (even rough) outline of your slides no later than two days before your presentation slot. I will make a
quick pass and add suggestions and questions. That way you are prepared upfront for some of the questions :)
Our goal is to make the session as informative for everyone as possible by drilling down on a few interesting
aspects of each paper.
-
Please share your finished slides with me right before you present so I can leave more helpful feedback
during your presentation.
-
It is completely OK, and I actually encourage you, to choose the same paper (or a small set of papers on the same
topic) that you plan to base your project on.
That way you use the presentation to preview what you want to explore and to collect feedback and questions for
your project.
15%: Mini projects:
You complete 3 mini "explorations" of your own choice and create a mini slide deck.
Thus, each mini project is an independent mini deep dive into some issue surrounding relational language design that
you find interesting and want to explore.
-
PhD seminars often ask students to "scribe" the lecture content.
However, we change the rules of the game.
Rather than scribing (= repeating and summarizing) the content of the class, your goal is to illustrate an
interesting aspect of the seminar topic
with imaginative, concrete, and ideally "tricky" illustrating examples.
Importantly, you ask your own question, you decide what you find interesting!
-
Guiding philosophy: A mini project is successful if the slide deck in turn can help other students understand an
interesting aspect that you found (ideally with minimal illustrating examples or queries).
Strong illustrations often surface common misconceptions, edge cases, or surprising behavior.
Think about what gets a research paper accepted at a conference,
or what made a conference talk that you attended successful and memorable (and worthy of your time).
-
Format: slides with slide numbers (PowerPoint recommended), submitted as a PDF to Canvas.
Start from our PPTX template or use your own template (as long as
you include slide numbers). Please strictly follow this file name convention:
cs7575-sp26-[YOUR NAME]-scribe[NUMBER]-[SOME DESCRIPTIVE TITLE].PDF
-
Required preliminary draft on Piazza:
Post a PDF draft to Piazza at least one week before the deadline.
If you prefer, you can post anonymously to other students (please then remove the title page). I will comment on
Piazza, and the comments may be helpful to both others and you (you may decide to address my feedback in your
final submission to Canvas).
-
Final submission on Canvas:
Submit your final PDF to Canvas.
If you decide to illustrate the behavior with a Python notebook and/or you have additional code examples (similar
to our SQL activities),
please submit those too.
-
Plan ahead:
Canvas deadlines are staggered. In the past, some students in similarly structured courses waited until the end of
the semester and then did not have enough interesting and sufficiently different topics left to illustrate. Also,
developing an illustration can naturally lead to a project idea.
Thus, it is completely fine to explore a topic that could serve as a precursor to your actual project.
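As a minimal sketch of what a "tricky" illustrating example could look like: the snippet below assumes SQLite via Python's built-in sqlite3 module, and the tables R and S are hypothetical (they are not taken from the course material). It contrasts two queries that many people expect to be equivalent and shows that a single NULL makes them disagree.

```python
import sqlite3

# Hypothetical toy database: R = {1, 2}, S = {2, NULL}
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R(a INTEGER);
    CREATE TABLE S(a INTEGER);
    INSERT INTO R VALUES (1), (2);
    INSERT INTO S VALUES (2), (NULL);
""")

# "Values of R.a that do not appear in S", written with NOT IN.
# The comparison 1 <> NULL evaluates to UNKNOWN, so no row qualifies.
not_in = conn.execute(
    "SELECT a FROM R WHERE a NOT IN (SELECT a FROM S) ORDER BY a"
).fetchall()

# The same intent written with NOT EXISTS.
# The NULL in S never satisfies S.a = R.a, so the row a = 1 qualifies.
not_exists = conn.execute(
    "SELECT a FROM R WHERE NOT EXISTS (SELECT 1 FROM S WHERE S.a = R.a) ORDER BY a"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(1,)]
conn.close()
```

A slide deck built around such an example would then ask why the two queries differ (three-valued logic) and how other relational languages treat the same pattern.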
Rationale: Georg Cantor is quoted as saying: "To ask the right question is harder than to answer it."
In that spirit, our mini projects are closer to research than routine assignments:
What particular aspect of a class is worth "illustrating"? That's often the most difficult part.
For additional pedagogic motivation, see:
20%: Class participation:
Classes will be interactive and require concentration and participation.
I am a big fan of the Socratic Method
(please watch this 1:30min video clip from the 1973 movie "The Paper Chase" to see what we as teachers should strive for).
-
Participate when we discuss the merits or shortcomings of algorithms, or when we have small group break-out
sessions and in-class exercises.
-
Ask questions during class or on Piazza.
Questions that make me ponder or make me create new illustrating examples
are all great examples of class participation.
-
Also, *never* hesitate to point out to me any errors you spot in the slides, even if minor. You can also post
anonymously to the other students on Piazza (and even anonymously to the instructor via the anonymous feedback form, though then I would not be able to
associate you with your greatly appreciated participation).
-
Share links to relevant, pedagogically valuable material (papers, tools, blog posts, datasets, demos). This all
counts toward participation.
Related Courses
The topic of this seminar is, to the best of my knowledge, new. If you know of a related course or seminar, please let
me know (for example via the anonymous feedback form).
The pedagogy of this class is inspired by the instructor's other PhD classes,
7240: Principles of scalable data management
and
7840: Foundations and Applications of Information Theory.