
Schedule/Reading List

Remember to Submit Your Reviews Here.

You can read other students' reviews after the deadline has passed at this URL.
Here are some examples of strong, medium, and weak reviews.

Topic | Mandatory Reading | Optional Readings | Presenter | Notes / Slides
Welcome and Introduction |  |  | Barzan Mozafari | slides
Data Models | What Goes Around Comes Around |  | Barzan Mozafari | slides
RDBMS Architecture | Anatomy of a Database System and Operating System Support for Database Management | Database Architecture Evolution: Mammals Flourished long before Dinosaurs became Extinct and SAP HANA Database - Data Management for Modern Business Applications | Barzan Mozafari | slides
Indexing | R-Trees: A Dynamic Index Structure for Spatial Searching and Self-selecting, self-tuning, incrementally optimized indexes | Database Cracking and CliffGuard: A Principled Framework for Finding Robust Database Designs | Lynn Garrett, Barzan Mozafari | slides (a), slides (b)
Storage | RAID: High-Performance, Reliable Secondary Storage and The Google File System | RAID: A Personal Recollection of How Storage Became a System | Gaurav Maden, Qijun Jiang | slides (a), slides (b)
How to Give a Good/Bad Talk |  |  |  | Initial proposals due in class.
Query Optimization | An Overview of Query Optimization in Relational Systems by Chaudhuri and Automated Selection of Materialized Views and Indexes for SQL Databases | Seeking the truth about adhoc join costs, Access Path Selection in a Relational Database Management System, and Join Processing in Database Systems with Large Main Memories | Yitian Chen, Fangzhou Xing | slides (a), slides (b)
Modern Workloads: Complex Event Processing, Mobile Networks | The Design of an Acquisitional Query Processor For Sensor Networks and Optimization of Sequence Queries in Database Systems | High-Performance Complex Event Processing over XML Streams, TAG: a Tiny AGgregation service for ad-hoc sensor networks, and Data Management in the CarTel Mobile Sensor Computing System | Jaleel Salhi, Chien-Wei Huang | slides (a), slides (b)
Fundamentals of Transaction Processing | Chapter 16, except 16.7 (Database Management Systems by Ramakrishnan and Gehrke) and Chapter 17 (Database Management Systems by Ramakrishnan and Gehrke) | Allocating Isolation Levels to Transactions, Generalized Isolation Level Definitions | Ryan Wawrzaszek, Kevin Eykholt | slides (a), slides (b)
Performance Concerns in Transaction Processing | Serializable Snapshot Isolation in PostgreSQL, OLTP Through the Looking Glass, and What We Found There | A critique of ANSI SQL isolation levels, On Optimistic Methods for Concurrency Control | Ahmad Tajik, Mengmeng Jie | slides (a), slides (b); Final proposals due by 8pm.
Transactions and In-memory Databases | The End of an Architectural Era (It's Time for a Complete Rewrite) and Hekaton: SQL Server's Memory-Optimized OLTP Engine | Main Memory Database Systems: An Overview, Low Overhead Concurrency Control for Partitioned Main Memory Databases and High-Performance Concurrency Control Mechanisms for Main-Memory Databases | Dong-Hyeon Park, Preeti Ramaraj | slides (a), slides (b); External speaker: Aditya Parameswaran
Key-Value Stores (a.k.a. NoSQL) | Dynamo: Amazon's Highly Available Key-value Store and Eventual Consistency Today: Limitations, Extensions, and Beyond | HAT, not CAP: Towards Highly Available Transactions | Zhizhong Zhang, Muhammed Uluyol | slides (a), slides (b), NoSQL overview; External speaker: Mark Callaghan
Analytical Workloads, Data Mining | Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals and Fast Algorithms for Mining Association Rules |  | Sheng Xie, Yicong Zhang | slides (a), slides (b), Extra slides
NO CLASS - Fall study break |  |  |  |
Parallel and Distributed Databases (Fundamentals) | Chapter 22 (Database Management Systems by Ramakrishnan and Gehrke) | The Gamma Database Machine Project | Dezhou Jiang | slides
Midterm |  |  |  | Midterm location: TBA
Modern Workloads: Web Search | The PageRank Citation Ranking: Bringing Order to the Web and Authoritative Sources in a Hyperlinked Environment |  | Junming Liu, Isaac Bowen | slides (a), slides (b)
New Solutions Emerge: Map-Reduce | MapReduce: Simplified Data Processing on Large Clusters and Spark: Cluster Computing with Working Sets, Colorful Commentary |  | Abraham Addisie, Charles Welch | slides (a), slides (b)
Databases Come Back: Column-stores | C-Store: A Column-oriented DBMS and A Comparison of Approaches to Large-Scale Data Analysis | Dremel: Interactive Analysis of Web-Scale Datasets and Integrating Compression and Execution in Column-Oriented Database Systems | Cheng-Yi Lee, Joshua Cheng | slides (a), slides (b)
Web-scale Parallel Databases | Bigtable: A Distributed Storage System for Structured Data and Spark SQL: Relational Data Processing in Spark | Scuba: Diving into Data at Facebook and Impala: A Modern, Open-Source SQL Engine for Hadoop | Alexandros Lancaster, Dong Yoon | slides (a), slides (b)
Mid-semester Project Presentations I |  |  |  |
Mid-semester Project Presentations II |  |  |  |
Approximate Databases | BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data and Join Synopses for Approximate Query Answering | Knowing When You're Wrong: Building Fast and Reliable Approximate Query Processing Systems, Efficient query evaluation on probabilistic databases, and Online Aggregation | Barzan Mozafari, Fang-Yi Yu | slides (a), slides (b)
Data Integration | Why Your Data Won't Mix: Semantic Heterogeneity and Generic Schema Matching with Cupid |  | Hertina Kurnia, Yang Liu | slides (a), slides (b)
Crowd-enabled Databases, Scientific Databases | CrowdDB: Answering Queries with Crowdsourcing and Requirements for Science Data Bases and SciDB | Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers, Soylent: a word processor with a crowd inside, Science in an exponential world, and CloudBurst: highly sensitive read mapping with MapReduce | Barzan Mozafari, Mason Wright | slides (a), slides (b)
Database-as-a-Service | Relational Cloud: a Database Service for the cloud and RemusDB: Transparent High Availability for Database Systems | Towards Database Virtualization for Database as a Service | Barzan Mozafari, James Cheng | slides (a), slides (b)
Database Privacy and Security | l-Diversity: Privacy Beyond k-Anonymity and CryptDB: Processing Queries on an Encrypted Database | Mondrian Multidimensional K-Anonymity | Ye Liu, Jiang Chen | slides (a), slides (b)
Project Posters |  |  |  | All project deliverables due by Dec 11, 11:59pm.
Dates: Sep 8, Sep 10, Sep 15, Sep 17, Sep 22, Sep 24, Sep 29, Oct 1, Oct 6, Oct 8, Oct 13, Oct 15, Oct 20, Oct 22, Oct 27, Oct 29, Nov 3, Nov 5, Nov 10, Nov 12, Nov 17, Nov 19, Nov 24, Nov 26, Dec 1, Dec 3, Dec 8, Dec 10