Recommended Textbook and Materials
There is no required textbook because the instructor provides textbook-like course material. For a deeper understanding of the topics covered in this course, we recommend the following books (most should be available online, free of charge to Northeastern University students, through O’Reilly for Higher Education):
- MapReduce Design Patterns by Donald Miner and Adam Shook
- Hadoop: The Definitive Guide by Tom White
- High Performance Spark by Holden Karau and Rachel Warren
- Spark: The Definitive Guide by Bill Chambers and Matei Zaharia
- Spark in Action by Petar Zecevic and Marko Bonaci
- Programming Elastic MapReduce by Kevin Schmidt and Christopher Phillips
For some topics, we will work with research papers or other online resources, e.g., the Hadoop and Spark API documentation.