Topic
|
Mandatory Reading
|
Optional Readings
|
Presenter
|
Notes / Slides
|
Welcome and Introduction
|
|
|
Barzan Mozafari
|
slides
|
Data Models
|
What Goes Around Comes Around
|
|
Barzan Mozafari
|
slides
|
RDBMS Architecture
|
Anatomy of a Database System and
Operating System Support for Database Management
|
Database Architecture Evolution: Mammals Flourished long before Dinosaurs became Extinct and
SAP HANA Database - Data Management for Modern Business Applications
|
Barzan Mozafari
|
slides
|
Indexing
|
R-Trees: A Dynamic Index Structure for Spatial Searching and
Self-selecting, self-tuning, incrementally optimized indexes
|
Database Cracking and CliffGuard: A Principled Framework for Finding Robust Database Designs
|
Lynn Garrett, Barzan Mozafari
|
slides (a)
slides (b)
|
Storage
|
RAID: High-Performance, Reliable Secondary Storage and
The Google File System
|
RAID: A Personal Recollection of How Storage Became a System
|
Gaurav Maden, Qijun Jiang
|
slides (a)
slides (b)
How to Give a Good/Bad Talk
Initial proposals due in class.
|
Query Optimization
|
An Overview of Query Optimization in Relational Systems by Chaudhuri
and
Automated Selection of Materialized Views and Indexes for SQL Databases
|
Seeking the truth about ad hoc join costs,
Access Path Selection in a Relational Database Management System, and
Join Processing in Database Systems with Large Main Memories
|
Yitian Chen, Fangzhou Xing
|
slides (a)
slides (b)
|
Modern Workloads: Complex Event Processing, Mobile Networks
|
The Design of an Acquisitional Query Processor For Sensor Networks
and
Optimization of Sequence Queries in Database Systems
|
High-Performance Complex Event Processing over XML Streams,
TAG: a Tiny AGgregation service for ad-hoc sensor networks, and
Data Management in the CarTel Mobile Sensor Computing System
|
Jaleel Salhi, Chien-Wei Huang
|
slides (a)
slides (b)
|
Fundamentals of Transaction Processing
|
Chapter 16, except 16.7 (Database Management Systems by Ramakrishnan and Gehrke) and
Chapter 17 (Database Management Systems by Ramakrishnan and Gehrke)
|
Allocating Isolation Levels to Transactions and
Generalized Isolation Level Definitions
|
Ryan Wawrzaszek, Kevin Eykholt
|
slides (a)
slides (b)
|
Performance Concerns in Transaction Processing
|
Serializable Snapshot Isolation in PostgreSQL and
OLTP Through the Looking Glass, and What We Found There
|
A critique of ANSI SQL isolation levels and
On Optimistic Methods for Concurrency Control
|
Ahmad Tajik, Mengmeng Jie
|
slides (a)
slides (b)
Final proposals due by 8pm.
|
Transactions and In-memory Databases
|
The End of an Architectural Era (It's Time for a Complete Rewrite)
and
Hekaton: SQL Server's Memory-Optimized OLTP Engine
|
Main Memory Database Systems: An Overview,
Low Overhead Concurrency Control for Partitioned Main Memory Databases and
High-Performance Concurrency Control Mechanisms for Main-Memory Databases
|
Dong-Hyeon Park, Preeti Ramaraj
|
slides (a)
slides (b)
External speaker: Aditya Parameswaran
|
Key-Value Stores (a.k.a. NoSQL)
|
Dynamo: Amazon's Highly Available Key-value Store
and
Eventual Consistency Today: Limitations, Extensions, and Beyond
|
HAT, not CAP: Towards Highly Available Transactions
|
Zhizhong Zhang, Muhammed Uluyol
|
slides (a)
slides (b)
NoSQL overview
External speaker: Mark Callaghan
|
Analytical Workloads, Data Mining
|
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals and
Fast Algorithms for Mining Association Rules
|
|
Sheng Xie, Yicong Zhang
|
slides (a)
slides (b)
Extra slides
|
NO CLASS - Fall study break
|
|
|
|
|
Parallel and Distributed Databases (Fundamentals)
|
Chapter 22 (Database Management Systems by Ramakrishnan and Gehrke)
|
The Gamma Database Machine Project
|
Dezhou Jiang
|
slides
|
Midterm
|
|
|
|
Midterm location: TBA
|
Modern Workloads: Web Search
|
The PageRank Citation Ranking: Bringing Order to the Web and
Authoritative Sources in a Hyperlinked Environment
|
|
Junming Liu, Isaac Bowen
|
slides (a)
slides (b)
|
New Solutions Emerge: Map-Reduce
|
MapReduce: Simplified Data Processing on Large Clusters
and
Spark: Cluster Computing with Working Sets
|
Colorful Commentary
|
Abraham Addisie, Charles Welch
|
slides (a)
slides (b)
|
Databases Come Back: Column-stores
|
C-Store: A Column-oriented DBMS and
A Comparison of Approaches to Large-Scale Data Analysis
|
Dremel: Interactive Analysis of Web-Scale Datasets
and
Integrating Compression and Execution in Column-Oriented Database Systems
|
Cheng-Yi Lee, Joshua Cheng
|
slides (a)
slides (b)
|
Web-scale Parallel Databases
|
Bigtable: A Distributed Storage System for Structured Data
and
Spark SQL: Relational Data Processing in Spark
|
Scuba: Diving into Data at Facebook
and
Impala: A Modern, Open-Source SQL Engine for Hadoop
|
Alexandros Lancaster, Dong Yoon
|
slides (a)
slides (b)
|
Mid-semester Project Presentations I
|
|
|
|
|
Mid-semester Project Presentations II
|
|
|
|
|
Approximate Databases
|
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
and
Join Synopses for Approximate Query Answering
|
Knowing When You're Wrong: Building Fast and Reliable Approximate Query Processing Systems,
Efficient query evaluation on probabilistic databases,
and
Online Aggregation
|
Barzan Mozafari, Fang-Yi Yu
|
slides (a)
slides (b)
|
Data Integration
|
Why Your Data Won't Mix: Semantic Heterogeneity and
Generic Schema Matching with Cupid
|
|
Hertina Kurnia, Yang Liu
|
slides (a)
slides (b)
|
Thanksgiving
|
|
|
|
|
Crowd-enabled Databases,
Scientific Databases
|
CrowdDB: Answering Queries with Crowdsourcing and
Requirements for Science Data Bases and SciDB
|
Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers,
Soylent: a word processor with a crowd inside,
Science in an exponential world,
and CloudBurst: highly sensitive read mapping with MapReduce
|
Barzan Mozafari, Mason Wright
|
slides (a)
slides (b)
|
Database-as-a-Service
|
Relational Cloud: a Database Service for the cloud and
RemusDB: Transparent High Availability for Database Systems
|
Towards Database Virtualization for Database as a Service
|
Barzan Mozafari, James Cheng
|
slides (a)
slides (b)
|
Database Privacy and Security
|
l-Diversity: Privacy Beyond k-Anonymity and
CryptDB: Processing Queries on an Encrypted Database
|
Mondrian Multidimensional K-Anonymity
|
Ye Liu, Jiang Chen
|
slides (a)
slides (b)
|
Project Posters
|
|
|
|
All project deliverables due by Dec 11, 11:59pm.
|