CS 7575: A Seminar On Relational Language Design (Spring 2026)

Seminar Description

Content: Most structured data today is relational, and SQL remains the dominant query language to access data. Several recent papers have questioned many of SQL's core design decisions. In this seminar, we investigate the design space of relational query languages: we study the core relational query languages, read recent alternative proposals, tease out their common abstractions, try to create a new phenomenology of relational languages, and ask: what are the "right" abstractions for relational query languages, especially as we are moving towards natural-language interfaces?

PART 1: Foundations
Introduces and compares the core relational languages: SQL, Domain Relational Calculus (DRC), Tuple Relational Calculus (TRC), Relational Algebra (RA), and Datalog (including recursion with stratified negation). We also study Relational Diagrams, the semantic concept of Relational Patterns, and the Generalized Tuple Relational Calculus together with its Abstract Language Higraphs (a variant of Abstract Syntax Trees). This part is led by the instructor and establishes a conceptual framework for studying relational languages.
PART 2: New Proposals and Ongoing Debates
We discuss recent proposals for alternative languages and data abstractions (including graph query languages). Each student presents a paper, followed by an in-class discussion and comparative analysis of the language or proposal.
PART 3: Beyond "traditional" query languages
We discuss extensions to relational languages, and languages that are not traditionally considered "relational": Answer Set Programming, Disjunctive Logic Programming, Integer Linear Programming, APL, J, Pandas, dplyr, SQL++, etc. This part is again led by the instructor.

Prerequisites: The course is fast-paced but self-contained. Standard undergraduate CS knowledge of SQL (e.g., via [SAMS'19]), and algorithms, logic and complexity theory (e.g., from textbooks such as [Ericson'19], [Dasgupta, Papadimitriou, Vazirani'06], [Cormen, Leiserson, Rivest, Stein'09], [Kleinberg, Tardos'05], or [Lehman, Leighton, and Meyer'15]) will be helpful.

PhD program: The seminar counts for the PhD breadth requirement in "Software".

Administrative Information

Time/location

Lectures: Mon/Thu 11:45am - 1:25pm, Snell Library 007, in person. Classes will not be recorded, but all slides by the lecturer from class will be made available within 2 days *after* each lecture (i.e. WED for MON classes, SAT for THU classes), either on this course web page or from within Piazza.
Office hours: directly after class, or by appointment in person at 450 West Village H, or via Microsoft Teams or Zoom. Please email the instructor with 3 time slots that work for you.

Instructor: Wolfgang Gatterbauer

Contact: Please use Piazza (via direct access from within Canvas) for all questions related to lectures, coursework, and the project. Note that you can post questions anonymously to all other students, or anonymously even to the instructor. Alternatively, please use my anonymous feedback form to send comments and suggestions that only I can see.

Coursework/Evaluation

50%: Course project: A main component of this seminar will be a research project in the second half of the semester. The project should connect to the seminar (relational languages and how humans and machines interact with data, today or in the future), yet is completely flexible and allows you to build on your existing PhD research. Guidance on the project and preliminary dates are posted on the project page.

15%: Paper presentation in PART 2: You will lead one class session (or half of one) by presenting a paper and facilitating an interactive discussion.

Choose a paper (or line of papers, or project) from the suggested list (or propose another paper you are interested in, just talk to me).
Claim your paper on Piazza (there is a special page to post your ideas, first come, first served).
Present the language in class, analyze its patterns, and find minimal illustrating examples that show some interesting behavior.
Slides: I highly recommend that you prepare your presentation in PowerPoint (via Office 365) or a similar tool. I have never heard a convincing argument about why LaTeX beamer is better for presentations than PowerPoint ("convincing" here means that it not only sounds reasonable but that it can't be easily debunked). Don't forget to add slide numbers so that the audience can refer to individual pages.
Please share an (even rough) outline of your slides up to two days before your presentation slot. I will make a quick pass and add suggestions and questions. That way you are prepared upfront for some of the questions :) Our goal is to make the session as informative for everyone as possible by drilling down on a few interesting aspects of each paper.
Please share your finished slides with me right before you present so I can leave more helpful feedback during your presentation.
It is completely OK (and I actually encourage it) to choose the same paper (or a small set of papers on the same topic) that you plan to base your project on. That way you can use the presentation to preview what you want to explore and to collect feedback and questions for your project.

15%: Mini projects: You complete 3 mini "explorations" of your own choice and create a mini slide deck for each. Thus, each mini project is an independent mini deep dive into some issue surrounding relational language design that you find interesting and want to explore.

PhD seminars often ask students to "scribe" the lecture content. However, we change the rules of the game. Rather than scribing (= repeating and summarizing) the content of the class, your goal is to illustrate an interesting aspect of the seminar topic with imaginative, concrete, and ideally "tricky" illustrating examples. Importantly, you ask your own question and you decide what you find interesting!
Guiding philosophy: A mini project is successful if the slide deck in turn can help other students understand an interesting aspect that you found (ideally with minimal illustrating examples or queries). Strong illustrations often surface common misconceptions, edge cases, or surprising behavior. Think about what makes a research paper get accepted at a conference, or what made a conference talk that you attended successful and memorable (and worthy of your time).
Format: slides with slide numbers (PowerPoint recommended), submitted as a PDF to Canvas. Start from our PPTX template or use your own template (as long as you include slide numbers). Please strictly follow this file name convention: cs7575-sp26-[YOUR NAME]-scribe[NUMBER]-[SOME DESCRIPTIVE TITLE].PDF
Required preliminary draft on Piazza: Post a PDF draft to Piazza at least one week before the deadline. If you prefer, you can post anonymously to other students (please then remove the title page). I will comment on Piazza, and the comments may be helpful to both others and you (you may decide to address my feedback in your final submission to Canvas).
Final submission on Canvas: Submit your final PDF to Canvas. If you decide to illustrate the behavior with a Python notebook and/or you have additional code examples (similar to our SQL activities), please submit those too.
Plan ahead: Canvas deadlines are staggered. In the past, some students in similarly structured courses waited until the end of the semester and then did not have enough interesting and sufficiently different topics left to illustrate. Also, developing an illustration can naturally lead to a project idea. Thus, it is completely fine to explore a topic that could serve as a precursor to your actual project.

Rationale: Georg Cantor is quoted as saying: "To ask the right question is harder than to answer it." In that spirit, our mini projects are closer to research than routine assignments: What particular aspect of a class is worth "illustrating"? That's often the most difficult part. For additional pedagogic motivation, see:

20%: Class participation: Classes will be interactive and require concentration and participation. I am a big fan of the Socratic Method (please watch this 1:30min video clip from the 1973 movie "The Paper Chase" to see what we as teachers should strive for).

Participate when we discuss the merits or shortcomings of algorithms, or when we have small group break-out sessions and in-class exercises.
Ask questions during class or on Piazza. Questions that make me ponder or make me create new illustrating examples are all great examples of class participation.
Also, *never* hesitate to point out to me any errors you spot in the slides, even if minor. You can also post anonymously to the other students on Piazza (and even anonymously to the instructor via the anonymous feedback form, though then I would not be able to associate you with your greatly appreciated participation).
Share links to relevant, pedagogically valuable material (papers, tools, blog posts, datasets, demos). This all counts toward participation.

Related Courses

The topic of this seminar is, to the best of my knowledge, new. If you know of a related course or seminar, please let me know (for example via the anonymous feedback form). The pedagogy of this class is inspired by the instructor's other PhD classes, 7240: Principles of scalable data management and 7840: Foundations and Applications of Information Theory.