CS 7240: Topics and approximate agenda (Spring'23)
This schedule will be updated regularly as the class progresses. Check back frequently.
I will usually post lecture slides by the end of the day following a lecture (thus the next day).
I post them here on this website (or in Canvas if I think they are not yet ready to be released in public).
Please also check our DATA lab seminar for talks of interest.
Topic 1: Data Models and Query Languages
-
Lecture 1 (Tue 1/10):
Course introduction /
T1-U1 SQL /
PostgreSQL setup /
SQL Activities
-
Lecture 2 (Fri 1/13):
T1-U1 SQL
-
Lecture 3 (Tue 1/17):
T1-U1 SQL
-
Lecture 4 (Fri 1/20):
T1-U1 SQL /
T1-U2 Logic & Relational Calculus
-
Lecture 5 (Tue 1/24):
T1-U2 Logic & Relational Calculus
-
Lecture 6 (Fri 1/27):
T1-U2 Logic & Relational Calculus /
T1-U3 Relational Algebra & Codd's Theorem
-
Lecture 7 (Tue 1/31):
T1-U3 Relational Algebra & Codd's Theorem
-
Lecture 8 (Fri 2/3):
T1-U3 Relational Algebra & Codd's Theorem /
T1-U4 Datalog & Recursion
-
Lecture 9 (Tue 2/7):
T1-U4 Datalog & Recursion
-
Lecture 10 (Tue 2/10):
T1-U4 Datalog & Recursion
-
Lecture 11 (Tue 2/14):
T1-U4 Datalog & Recursion
-
Lecture 12 (Fri 2/17):
T1-U4 Datalog & Recursion
-
Lecture 13 (Tue 2/21):
T1-U4 Datalog & Recursion
Pointers to relevant concepts & supplementary material:
-
Unit 1. SQL:
[SAMS'12],
[CS 3200],
[Cow'03] Ch3 & Ch5,
[Complete'08] Ch6,
[Silberschatz+'20] Ch3.8
-
Unit 2. Logic & Relational Calculus:
First-Order Logic (FOL),
relational calculus (RC):
[Barland+'08] 4.1.2 & 4.2.1 & 4.4,
[Genesereth+] Ch6,
[Halpern+'01],
[Cow'03] Ch4.3 & 4.4,
[Elmasri, Navathe'15] Ch8.6 & Ch8.7,
[Silberschatz+'20] Ch27.1 & Ch27.2,
[Alice'95] Ch3.1-3.3 & Ch4.2 & Ch4.4 & Ch5.3-5.4,
[Barker-Plummer+'11] Ch11
-
Unit 3. Relational Algebra & Codd's Theorem:
Relational Algebra (RA),
Codd's theorem:
[Cow'03] Ch4.2,
[Complete'08] Ch2.4 & Ch5.1-5.2,
[Elmasri, Navathe'15] Ch8,
[Silberschatz+'20] Ch2.6,
[Alice'95] Ch4.4 & Ch5.4
-
Unit 4. Datalog & Recursion:
Datalog, recursion,
Stratified Datalog with negation,
Datalog evaluation strategies,
Stable Model semantics, Answer Set Programming (ASP):
[Complete'08] Ch5.3, [Cow'03] Ch 24,
[x755'19],
[Soufflé],
[DLV], [G., Suciu'10], [Eiter+'09]
-
(Unit 5. Alternative Data Models:
NoSQL:
[Hellerstein, Stonebraker'05], [Sadalage, Fowler'12], [Harrison'16])
Topic 2: Complexity of Query Evaluation & Reverse Data Management
-
Lecture 13 (Tue 2/21):
T2-U1 Conjunctive Queries
-
Lecture 14 (Fri 2/24):
T2-U1 Conjunctive Queries
-
Lecture 15 (Tue 2/28):
T2-U1/2 Conjunctive & Beyond Conjunctive Queries
-
Lecture 16 (Fri 3/3):
T2-U2 Beyond Conjunctive Queries /
T2-U3 Provenance
-
Spring break (Tue 3/7, Fri 3/10): Northeast Database day 2023 @ Northeastern
-
Lecture 17 (Tue 3/14):
T2-U3 Provenance
-
Lecture 18 (Fri 3/17):
T2-U4 Reverse Data Management
Pointers to relevant concepts & supplementary material:
-
Unit 1. Conjunctive Queries:
Query evaluation of conjunctive queries (CQs),
data vs. query complexity, homomorphisms,
constraint satisfaction, query containment,
query minimization, absorption:
[Kolaitis, Vardi'00], [Vardi'00],
[Kolaitis'16],
[Koutris'19] L1 & L2
-
Unit 2. Beyond Conjunctive Queries:
unions of conjunctive queries,
bag semantics,
nested queries,
tree pattern queries:
[Kolaitis'16],
[Tan+'14], [G.'11], [Martens'17]
-
Unit 3. Provenance:
[Buneman+02],
[Green+07],
[Cheney+09],
[Green,Tannen'17],
[Kepner+16],
[Buneman, Tan'18]
-
Unit 4. Reverse Data Management:
update propagation, resilience:
[Buneman+02],
[Kimelfeld+12],
[Freire+15]
Topic 3: Efficient Query Evaluation & Factorized Representations
-
Lecture 19 (Tue 3/21):
T3-U1 Acyclic Queries
-
Lecture 20 (Fri 3/24):
T3-U1 Acyclic Queries
-
Lecture 21 (Tue 3/28):
T3-U1 Acyclic Queries /
T3-U2 Cyclic Queries
-
Lecture 22 (Fri 3/31):
T3-U2 Cyclic Queries
-
Lecture 23 (Tue 4/4):
T3-U2 Cyclic Queries
-
Lecture 24 (Fri 4/7):
T3-U2 Cyclic Queries
-
Lecture:
T3-U4 Optimization Problems & Top-k
Pointers to relevant concepts & supplementary material:
-
Unit 1. Acyclic Queries:
query hypergraph, Yannakakis algorithm, GYO reduction,
dynamic programming, algebraic semirings,
[Alice] Ch6.4, [Koutris'19] L4,
enumeration, ranked enumeration:[Tziavelis+'20]
-
Unit 2. Cyclic Queries:
tree & hypertree decomposition,
query widths,
fractional hypertree width,
AGM bound,
worst-case optimal join algorithms,
optimal algorithms,
submodular width and multiple decompositions:
[AGM'13], [NPRR'18], [KNR'17], [KNS'17], [Gottlob+'16]
-
Unit 3. Factorized Representations:
normalization, factorized databases [Olteanu, Schleich'16]
-
Unit 4. Optimization Problems & Top-k:
shortest paths, dynamic programming (DP),
Yannakakis, semirings,
rankings,
top-k: [Roughgarden'10], [Ilyas+08], [Rahul, Tao'19],
ranked enumeration [Tziavelis+'20a], [Tziavelis+'20b]
Topic 4: Linear Algebra & Iterative Graph Algorithms
-
Lecture 25 (Tue 4/11):
Graphs & Linear Algebra
-
Lecture:
Computation Graphs
Pointers to relevant concepts & supplementary material:
-
Unit 1. Graphs & Linear Algebra:
graphs,
linear algebra (LA),
semirings, iterative algorithms,
rankings on graphs,
associative arrays: [Lay+ 21], [Kepner, Gilbert'11], [Kepner, Jananthan'18], [Klein'13],
Random walks & PageRank axioms: [Newman'10], [Gleich'15], [AT'10],
label propagation & back-propagation,
semi-supervised learning [BDL'06],
factorized graph representations: [KLG'20],
belief propagation (BP): [Murphy'12], [GGKF'15]
-
Unit 2. Computation Graphs:
circuits, knowledge compilation
[DM'02], [JS'13]
Topic 5: Normalization, Information Theory & Axioms for Uncertainty
-
Lecture:
Normal Forms & Information Theory
-
Lecture:
Axioms for Uncertainty
Pointers to relevant concepts & supplementary material:
-
Unit 1. Normal Forms & Information Theory:
normal forms & their information-theoretic justification
[Complete'18] Ch3, [Lee'87], [Arenas, Libkin'05]
-
Unit 2. Axioms for Uncertainty:
Uncertainty & Inconsistency, Maximum entropy principle
[Cox'46], [Shannon'48], [Van Horn'03]
Project presentations
-
Lecture 26 (Fri 4/14): P4 Project presentations
-
Lecture 27 (Tue 4/18): P4 Project presentations
Literature list
Topic 1: Data Models and Query Languages
-
Unit 1: SQL
-
[SAMS'19] SAMS: Teach yourself SQL in 10min by Forta. 5th ed. 2019.
It is available for free for Northeastern students
from Safari books eBook
(you may have to first login from our library website, then try again the previous link).
If the book is checked out online, you can use the 4th edition (there is almost no difference between 4th and 5th ed) as
Safari books eBook,
or as EBSCOhost eBook.
-
[CS 3200] PostgreSQL setup,
PgAdmin 4 tutorial.
Files to follow along our SQL lectures: SQL Activities,
CS 3200 web pages with more detailed slides on SQL.
-
[Cow'03] Ramakrishnan, Gehrke. Database Management Systems. 3rd ed 2003.
Ch 5: SQL.
-
[Complete'08] Garcia-Molina, Ullman, Widom. Database Systems. 2nd ed 2008.
Ch 6: SQL.
-
[Elmasri, Navathe'15] Fundamentals of Database Systems. 7th ed 2015.
Ch 6: SQL
-
[Silberschatz+'20] Silberschatz, Korth, Sudarshan. Database system concepts. 7th ed 2020.
Ch 3.8: Nested subqueries.
-
Unit 2: Logic & Relational Calculus
-
[Barland+'08] Barland, Kolaitis, Vardi, Felleisen, Greiner.
Intro to Logic,
(alternative PDF version).
4.1.2 First-Order Logic: bound variables, free variables,
4.2.1 First-Order Logic: equivalences,
4.4 Exercises for First-Order Logic.
-
[Genesereth+] Genesereth et al. Introduction to logics.
Ch 6: Relational Logic.
-
[Halpern+'01] Halpern, Harper, Immerman, Kolaitis, Vardi, Vianu.
On the Unusual Effectiveness of Logic in Computer Science.
Bulletin of Symbolic Logic 2001.
-
[Cow'03] Ramakrishnan, Gehrke. Database Management Systems. 3rd ed 2003.
Ch 4.3: Relational calculus,
Ch 4.4: Safety.
-
[Elmasri, Navathe'15] Fundamentals of Database Systems. 7th ed 2015.
Ch 8.6: Tuple relational calculus,
Ch 8.7: Domain relational calculus,
Ch 8.6.8: Safe expressions.
-
[Silberschatz+'20] Silberschatz, Korth, Sudarshan. Database system concepts. 7th ed 2020.
Ch 27.1: Tuple relational calculus,
Ch 27.2: Domain relational calculus,
Ch 27.1.3 & 27.2.3: Safe expressions.
-
[Alice'95] Abiteboul, Hull, Vianu. Foundations of Databases. 1995.
Ch 3.1: Structure of the relational model,
Ch 3.2: Named vs. unnamed perspective,
Ch 3.3: Conventional vs. logic programming perspective,
Ch 4.2: Logic-based perspective,
Ch 5.3: Relational calculus, domain independence,
Ch 5.4: Syntactic Restrictions for Domain Independence.
-
[Barker-Plummer+'11]: Barker-Plummer, Barwise, Etchemendy. Language, Proof and Logic. 2nd ed. 2011.
Ch 11: Multiple Quantifiers.
-
[Vardi'13]. A Logical Revolution (slides), 2013.
[Video @ MSR (1h 15min)]
-
Unit 3: Relational Algebra & Codd's theorem
-
[Cow'03] Ramakrishnan, Gehrke. Database Management Systems. 3rd ed 2003.
Ch 4.2: Relational algebra.
-
[Complete'08] Garcia-Molina, Ullman, Widom. Database Systems. 2nd ed 2008.
Ch 2.4: Relational algebra,
Ch 5.1-5.2: Relational Algebra and bags.
-
[Elmasri, Navathe'15] Fundamentals of Database Systems. 7th ed 2015.
Ch 8: Relational Algebra.
-
[Silberschatz+'20] Silberschatz, Korth, Sudarshan. Database system concepts. 7th ed 2020.
Ch 2.6: Relational algebra.
-
[Alice'95] Abiteboul, Hull, Vianu. Foundations of Databases. 1995.
Ch 4.4: Algebraic perspectives,
Ch 5.3-5.4: Codd's theorem: Equivalence of algebra and calculus.
-
Unit 4: Datalog & Recursion
-
[Complete'08] Garcia-Molina, Ullman, Widom. Database Systems. 2nd ed. 2008.
Ch 5.3-5.4: Datalog and bags,
Ch 10.2: Recursion in SQL.
-
[Cow'03] Ramakrishnan, Gehrke. Database Management Systems. 3rd ed 2003.
Ch 24.1-24.2: Datalog.
-
[Alice'95] Abiteboul, Hull, Vianu. Foundations of Databases. 1995.
Ch 12.2: Datalog and model-theoretic semantics,
Ch 12.3: Fixpoint semantics,
Ch 13.1: Semi-naive evaluation,
Ch 15.2: Stratified semantics and precedence graphs
-
[Koutris'19] Lecture notes from CS784 Wisconsin 2019.
L9: Datalog evaluation,
L10: Datalog negation
-
[G., Suciu'10] Stable models: Gatterbauer, Suciu. Data conflict resolution using trust mappings.
SIGMOD 2010.
[PPTX slides],
[PDF slides],
[video (24min)]
-
[x775'19] A surprisingly nice introduction to Datalog
-
[Soufflé] Soufflé is an open-source Datalog engine. Also see Soufflé tutorial and Soufflé installation.
-
[DLV] DLV (DataLog with Disjunction) is an is a disjunctive logic programming system, implementing the stable model semantics under the Answer set programming paradigm.
-
Soufflé files to follow along our lectures on Datalog with stratified negation:
Souffle Activities,
DLV files to follow along our lectures on Answer Set Programming:
DLV Activities
-
[Eiter+'09] Eiter, Ianni, Krennwallner. Answer Set Programming: A Primer. Reasoning Web International Summer School 2009.
-
Unit 5: Alternative Data Models
-
[Sadalage, Fowler'12] NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. 2012.
Safari books eBook(NEU free online access).
Ch 8: Key-value stoes,
Ch 9: Document databases,
Ch 10: Column-famility stores,
Ch 11: Graph databases.
-
[Harrison'16] Next Generation Databases: NoSQL, NewSQL, and Big Data. 2016.
Safari books eBook (NEU free online access)].
p43-p51: CAP, Eventual Consistency, Hashing,
p61-p62: MongoDB,
p65-p68: Graphs,
p75-p80: Columns,
p145-156: Data models,
p163, 164, 166: Secondary indexing.
-
[Stonebraker, Hellerstein'05] What Goes Around Comes Around. 2005.
Online PDF.
Also see various discussions of the paper.
-
[Kepner+'16] Kepner, Gadepally, Hutchison, Jananthan, Mattson, Samsi, Reuther. Associative Array Model of SQL, NoSQL, and NewSQL Databases. HPEC 2016.
Topic 2: Complexity of Query Evaluation & Reverse Data Management
-
Unit 1: Conjunctive Queries
-
[Chandra, Merlin'77] Optimal implementation of conjunctive queries in relational data bases. STOC 1977.
-
[Vardi'81] The Complexity of Relational Query Languages. STOC 1982.
-
[Vardi'00] Constraint satisfaction and database theory: a tutorial. PODS 2000.
-
[Kolaitis, Vardi'00]: Conjunctive-Query Containment and Constraint Satisfaction. JCSS 2000.
-
[Alice'95] Abiteboul, Hull, Vianu. Foundations of Databases. 1995.
Ch 2.1: Theoretical background,
Ch 6.2: Conjunctive queries & homomorphisms & query containment,
Ch 6.3: Undecidability of equivalence for calculus.
-
[Kolaitis'16]: Logic and Databases. Logical Structures in Computation Boot Camp, Berkeley 2016.
First-order logic as a database query language,
the query evaluation problem,
the query containment problem,
data complexity and combined complexity,
conjunctive queries.
-
[Koutris'19] Lecture notes from CS784 Wisconsin 2019.
L1: conjunctive queries,
L2: query containment.
-
Socratica on abstract algebra (Youtube):
why groups,
groups with examples,
Cayley tables for groups,
group homomorphisms,
group isomorphisms
-
3blue1brown on group homomorphisms (Youtube):
Euler's formula with introductory group theory,
Euler's number e
-
Unit 2: Beyond Conjunctive Queries
-
[Alice'95] Abiteboul, Hull, Vianu. Foundations of Databases. 1995.
Ch 6.3: Undecidability of equivalence for calculus.
-
[Kolaitis'16]: Logic and Databases. Logical Structures in Computation Boot Camp, Berkeley 2016.
unions of conjunctive queries,
undecidability of query containment for unions of conjunctive queries.
-
[Tan+'14] Tan, Van den Bussche, Zhang. Undecidability of satisfiability in the algebra of finite binary relations with union, composition, and difference. Corr 1406.0349.
-
[G.'11] Gatterbauer. Databases will visualize queries too. VLDB 2011.
Video,
Slides.
-
[Martens'17] Short summary of longer papers below Optimizing tree pattern queries: why cutting is not enough (invited talk). STOC 2017.
Czerwinski, Martens, Niewerth, Parys. Optimizing Tree Patterns for Querying Graph- and Tree-Structured Data. SIGMOD Record 2017.
Czerwinski, Martens, Niewerth, Parys. Minimization of Tree Patterns. JACM 2018.
-
Unit 3: Provenance
-
[Buneman+'02] Buneman, Khanna, Tan. On Propagation of Deletions and Annotations Through Views. PODS 2002.
-
[Cheney+'09] Cheney, Chiticariu, Tan. Provenance in databases: why, how, and where. Foundations and trends in databases 2009.
-
[Green+'07] Green, Karvounarakis, Tannen. Provenance Semirings. PODS 2007.
-
[Green, Tannen'17] PODS 2017 Test of Time Award: The Semiring Framework for Database Provenance. PODS 2017. (Slides).
See also:
Val Tannen's EDBT 2010 keynote: Provenance for Database Transformations. EDBT 2010.
Val Tannen's talk @ Logical Structures in Computation Boot Camp (including video):
Provenance for Database Transformations, Berkeley 2016.
-
[Kepner+'16] Kepner, Gadepally, Hutchison, Jananthan, Mattson, Samsi, Reuther. Associative Array Model of SQL, NoSQL, and NewSQL Databases. HPEC 2016.
-
[Buneman, Tang'18] Data Provenance: What next? SIGMOD Record 2018.
-
Socratica on abstract algebra (Youtube):
Rings with examples
-
[G.+11] Gatterbauer, Meliou, Suciu. Default-all is dangerous! TaPP 2011.
(PPT Slides,
PDF Slides)
-
Unit 4: Reverse Data Management
-
[Buneman+'02] Buneman, Khanna, Tan. On Propagation of Deletions and Annotations Through Views. PODS 2002.
-
[Kimelfeld+12] Kimelfeld, Vondrak, Williams. Maximizing Conjunctive Views in Deletion Propagation. TODS 2012.
-
[Freire+'15] Freire, Gatterbauer, Immerman, Meliou. The complexity of resilience and responsibility for self-join-free conjunctive queries. PVLDB 2015.
-
[Freire+'20] Freire, Gatterbauer, Immerman, Meliou. New results for the complexity of resilience for binary conjunctive queries with self-joins. PODS 2020.
(Video,
PPTX slides,
PDF slides)
-
[Meliou+'11] Meliou, Gatterbauer, Suciu.
Reverse data management.
PVLDB 2011 (challenges & vision track).
-
[Guidotti+'18] Guidotti, Monreale, Ruggieri, Turini, Giannotti, Pedreschi. A Survey of Methods for Explaining Black Box Models. ACM Computing Surveys 2018.
Topic 3: Efficient Query Evaluation & Factorized Representations
-
Unit 1: Acyclic Queries
-
[Yannakakis'81] Algorithms for acyclic database schemes. VLDB 1981.
-
[Bernstein, Chiu'81]
Bernstein, Chiu. Using semi-joins to solve relational queries. JACM 1981.
-
[Bernstein, Goodman'81] Power of natural semi-joins. SIAM J., 1981.
-
[Beeri+83] Beeri, Fagin, Maier, Yannakakis. On the desirability of acyclic database schemes. JACM 1983.
-
[Alice'95] Abiteboul, Hull, Vianu. Foundations of Databases. 1995.
Ch 6.4: Acyclic joins, semi-join reduction, Yannakakis, GYO algorithm.
-
[Gottlob+02] Gottlob, Leone, Scarcello. Hypertree Decompositions and Tractable Queries. JCSS 2002.
-
[Ullman'01] CS345 Lecture Notes. 2001.
Acyclic hypergraphs.
Computing acyclic joins, Yannakakis' algorithm.
-
[Koutris'19] Lecture notes from CS784 Wisconsin 2019.
L4: acyclic queries
-
[Brault-Baron'16] Hypergraph Acyclicity Revisited. ACM Computing Surveys 2016
-
[TGR'22] Gatterbauer. Acyclic Queries & Enumeration (video). Part 3 of
Towards Responsive DBMS, ICDE 2022 tutorial.
-
Unit 2: Cyclic Queries
-
[AGM'13] AGM bound:Atserias, Grohe, Marx. Size bounds and query plans for relational joins. SIAM J. Comput. 2013.
-
[NPRR'18] WCO joins: Ngo, Porat, Re, Rudra. Worst-case optimal join algorithms. JACM 2018.
-
[KNR'17] FAQ: Khamis, Ngo, Rudra. Juggling functions inside a database. SIGMOD record 2017.
-
[KNS'17] Khamis, Ngo, Suciu. What do Shannon-type inequalities, submodular width, and disjunctive Datalog have to do with one another? PODS 2017.
-
[Gottlob+'16] Gottlob, Greco, Leone, Scarcello. Hypertree Decompositions: Questions and Answers PODS 2016.
-
Unit 3: Factorized Representations
-
[Olteanu, Schleich'16] Factorized Databases. SIGMOD record 2016.
-
[Rendle'13] Scaling Factorization Machines to Relational Data. VLDB 2013.
-
[BCHDP'15] Bhattacherjee, Chavan, Huang, Deshpande, Parameswaran. Principles of Dataset Versioning: Exploring the Recreation/Storage Tradeoff. VLDB 2015.
-
[NNOS'18] Ngo, Nguyen, Olteanu, Schleich.
In-Database Learning with Sparse Tensors. PODS 2018.
-
Socratica (Youtube): Abstract algebra: rings
-
[Kepner, Jananthan'18]. Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs. MIT press 2018.
-
Unit 4: Top-k & Optimization Problems
-
[Fagin01] Fagin's algorithm: Fagin. Combining Fuzzy Information from Multiple Systems. JCSS 2001. (PODS 1996)
-
[Fagin+03] Threshold algorithm: Fagin, Lotem, Naor. Optimal aggregation algorithms for middleware. JCSS 2003. (PODS 2001)
-
[Roughgarden'10]: CS369N: Beyond Worst-Case Analysis Lecture #1: Instance Optimality. Stanford 2010.
-
[Ilyas+08] Ilyas, Beskales, Soliman. A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys 2008.
-
[Rahul, Tao'19] Rahul, Tao. A Guide to Designing Top-k Indices. SIGMOD record 2019.
-
[Tziavelis+'20a] Tziavelis, Ajwani, Gatterbauer, Riedewald, Yang. Optimal algorithms for ranked enumeration of answers to full conjunctive queries. PVLDB 2020. Project page.
-
[Tziavelis+'20b] Tziavelis, Gatterbauer, Riedewald. Optimal Join Algorithms meet Top-k. SIGMOD tutorials 2020. Video tutorial,
Full tutorial page including slides.
-
[Gondran, Minoux'08]. Graphs, dioids and semirings: New Models and Algorithms. Springer 2008.
-
[Dijkstra'16]. Dijkstra’s Shortest Path Algorithm: an intuitive Youtube video.
-
[TGR'22] Gatterbauer. Dynamic Programming & Semirings (video). Part 5 of
Towards Responsive DBMS, ICDE 2022 tutorial.
-
[TGR'22] Tziavelis. Any-k or ranked enumeration (video). Part 6 of
Towards Responsive DBMS, ICDE 2022 tutorial.
Topic 4: Linear Algebra & Iterative Graph Algorithms
-
Unit 1: Graphs & Linear Algebra
-
[Kepner, Gilbert'11] Graph algorithms in the language of linear algebra. 2011.
Ch 1 (Graphs and Matrices), Ch 2 (Linear Algebraic Notations and Definitions), Ch 3 (Connected Components and Minimum Paths)
-
[Kepner, Jananthan'18] Mathematics of Big Data Spreadsheets, Databases, Matrices, and Graphs. 2018. (Online PDF).
Ch 6 (Manipulating Graphs with Matrices), 7 (Graph Analysis and Machine Learning Systems), 11 (Graph Construction and Graphical Patterns)
-
[Kepner+'18] Mathematics of Digital Hyperspace. IPDPSW 2021. (Online PDF).
-
[Easley, Kleinberg'10] Networks, crowds, and markets: reasoning about a highly connected world. 2010. Ch 14.6 (Spectral Analysis, Random Walks, and Web Search)
-
[3Blue1Brown]: 3Blue1Brown on Essence of linear algebra:
2. Linear combination, Span, Basis vectors,
7. Inverse matrices, column space, null space,
14. Eigenvectors, eigenvalues,
15. Abstract vector spaces
-
[Lay+'21]: Lay, Lay, McDonald. Linear algebra and its applications. 6th ed, 2021.
Ch 5 (Eigenvalues and Eigenvectors), Ch 10 (Finite-State Markov Chains)
-
[Meyer'01] Matrix analysis and applied linear algebra. SIAM. 2001.
Ch 7 (Eigenvalues)
-
[Klein'13] Coding the matrix: Linear Algebra through Applications to Computer Science. Newtonian Press. 2013. Online website with slides.
Ch 4 (The Matrix), Ch 10 (Special bases), Ch 11 (SVD), Ch 12 (The Eigenvector)
-
[Newman'10]: Networks: an introduction. Oxford University Press, 2010.
Ch 6 (Mathematics of networks), Ch 7 (Measures and Metrics)
-
[Gleich'15] PageRank beyond the Web, SIAM Review, 2015.
-
[AT'10] Altman, Tennenholtz. An axiomatic approach to personalized ranking systems, JACM, 2010.
-
[BDL'06] Bengio, Delalleau, Le Roux. Label Propagation and Quadratic Criterion. (book Ch 11 of "Semi-Supervised Learning", 2006)
-
[Cowen+'17] Cowen, Ideker, Raphael, Sharan. Network propagation: a universal amplifier of genetic associations. Nature reviews, 2017.
-
[Murphy'12] Machine Learning: a Probabilistic Perspective, MIT press, 2012.
Ch 22.2 (Loopy belief propagation: algorithmic issues)
-
[Bishop'06] Pattern Recognition and Machine Learning, Springer, 2006.
Ch 8.4.4 (The Sum-product algorithm), Ch 8.4.7 (Loopy belief Propagation)
-
[Huang'15] Online lecture on Belief Propagation by Bert Huang.
-
[Pearl'88] Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.
Ch 4 (Belief updating by network propagation)
-
[GGKF'15] Gatterbauer, Günnemann, Koutra, Faloutsos. Linearized and single-pass belief propagation. VLDB 2015. Youtube video.
-
[KLG'20] Kumar, Langton, Gatterbauer. Factorized Graph Representations for Semi-Supervised Learning from Sparse Data. SIGMOD 2020. Youtube video.
-
[Doyle, Snell'84] Random walks and electric networks, 1984.
Ch 1.1-1.3 (Random walks on various networks)
-
[Minsky, Papert'88] Perceptrons, MIT press, 1988. Wikipedia page
-
Also see related past research seminar: Methods and Algorithms for determining relevance in networks (Fall'13).
-
Unit 2: Computation Graphs (Circuits & Knowledge Compilation)
Topic 5: Normalization, Information theory & Axioms of Uncertainty
-
Unit 1: Normal Forms & Information Theory
-
Unit 2: Axioms for Uncertainty
Research best practice
Surveys and additional possible topics