Week | Date | Lecture Topics | Reading Materials | Assignment
1 | May 7, May 9 | Parallel Processing Basics | Introduction, Parallel Processing Basics | Begin Homework 1
2 | May 14, May 16 | Distributed Services: Distributed File System, Resource and Application Management | Distributed System Services, Distributed File System, Resource and Application Management | -
3 | May 21 (no class on May 23) | MapReduce and Spark Introduction | MapReduce and Spark | Homework 1 Due
4 | May 28, May 30 | Aggregation, Sort | Fundamental Techniques | Begin Homework 2
5 | June 4, June 6 | Joins | Joins | Homework 2 Due
6 | June 11, June 13 | Common Algorithm Building Blocks | Common Building Blocks | Begin Homework 3
7 | June 18, June 20 | Graph Algorithms | Graph Algorithms | Homework 3 Due
8 | June 25, June 27 | Data Mining 1 (K-Means, KNN) | Data Mining 1 | Begin Homework 4
9 | July 2 (no class on July 4, Independence Day) | Data Mining 2 (Ensembles) | Data Mining 2 | Homework 4 Due
10 | July 9, July 11 | Intelligent Partitioning | Intelligent Partitioning | -
11 | July 16, July 18 | Lineage, Spark SQL | More About Spark | -
12 | July 25 (no class on July 23 for exam preparation) | Exam | - | -
13 | July 30, Aug 1 | Exam Solutions, Spark Stream, CAP Theorem | - | Begin Project
14 | Aug 6, Aug 8 | Spark MLlib, GraphX, HBase | Beyond MapReduce and Spark | Project Due
15 | Aug 13, Aug 15 | Hive | - | Project Presentations