Publications
SlabCity: Whole-Query Optimization using Program Synthesis
Rui Dong, Jie Liu, Yuxuan Zhu, Cong Yan, Barzan Mozafari, Xinyu Wang
In Proceedings of the 49th International Conference on Very Large Data Bases (VLDB), August 28 - September 01, 2023
Making Data Clouds Smarter at Keebo: Automated Warehouse Optimization using Data Learning
Barzan Mozafari, Radu Alexandru Burcuta, Alan Cabrera, Andrei Constantin, Derek Francis, David Grömling, Alekh Jindal, Maciej Konkolowicz, Valentin Marian Spac, Yongjoo Park, Russell Razo Carranza, Ni
In Proceedings of the ACM SIGMOD 2023 Conference, June 18-23, 2023
Try out Keebo
Communication-efficient Distributed Learning for Large Batch Optimization
Rui Liu and Barzan Mozafari
In Proceedings of the Thirty-ninth International Conference on Machine Learning (ICML), July, 2022
Provable Memorization Capacity of Transformers
Junghwan Kim and Barzan Mozafari
In Proceedings of the International Conference on Learning Representations (ICLR), May, 2023
Transformer with Memory Replay
Rui Liu and Barzan Mozafari
In Proceedings of the 36th AAAI Conference on Artificial Intelligence, February 22 - March 01, 2022
DMon: Efficient Detection and Correction of Data Locality Problems using Selective Profiling
Tanvir Ahmed Khan, Ian Neal, Gilles Pokam, Barzan Mozafari, and Baris Kasikci
In Proceedings of the Symposium on Operating Systems Design and Implementation (OSDI), July, 2021
Adam with Bandit Sampling for Deep Learning
Rui Liu, Tianyi Wu, and Barzan Mozafari
In Proceedings of the Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), December, 2020
Technical Report
QuickSel: Quick Selectivity Learning with Mixture Models
Yongjoo Park, Shucheng Zhong, and Barzan Mozafari
In Proceedings of the ACM SIGMOD Conference, June 14-19, 2020
Joins on Samples: A Theoretical Guide for Practitioners
Dawei Huang, Dong Young Yoon, Seth Pettie, and Barzan Mozafari
In Proceedings of the 46th International Conference on Very Large Data Bases (PVLDB), August 31 - September 04, 2020
Technical Report
Huron: Hybrid False Sharing Detection and Repair
Tanvir Ahmed Khan, Yifan Zhao, Gilles Pokam, Barzan Mozafari, and Baris Kasikci
In Proceedings of the Conference on Programming Language Design and Implementation (PLDI), June, 2019
BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees
Yongjoo Park, Jingyi Qing, Xiaoyang Shen, and Barzan Mozafari
In Proceedings of the ACM SIGMOD 2019 Conference, June 30 - July 05, 2019
Technical Report
Revisiting Projection-Free Optimization for Strongly Convex Constraint Sets
Jarrid Rector-Brooks, Jun-Kun Wang, and Barzan Mozafari
In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, January 27 - February 01, 2019
Technical Report
A Bandit Approach to Maximum Inner Product Search
Rui Liu, Tianyi Wu, and Barzan Mozafari
In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, January 27 - February 01, 2019
SnappyData
Barzan Mozafari
In Encyclopedia of Big Data Technologies, Springer, Cham, 2018
Download SnappyData
Distributed Lock Management with RDMA: Decentralization without Starvation
Dong Young Yoon, Mosharaf Chowdhury, and Barzan Mozafari
In Proceedings of the ACM SIGMOD 2018 Conference, June 10-15, 2018
VerdictDB: Universalizing Approximate Query Processing
Yongjoo Park, Barzan Mozafari, Joseph Sorenson, and Junhao Wang
In Proceedings of the ACM SIGMOD 2018 Conference, June 10-15, 2018
Download the latest release
Demonstration of VerdictDB, the Platform-Independent AQP System
Wen He, Yongjoo Park, Idris Hanafi, Jacob Yatvitskiy, and Barzan Mozafari
In Proceedings of the ACM SIGMOD 2018 Conference, June 10-15, 2018
Download VerdictDB
Watch a video
Contention-aware lock scheduling for transactional databases
Boyu Tian, Jiamin Huang, Barzan Mozafari, and Grant Schoenebeck
In Proceedings of the 44th International Conference on Very Large Data Bases (PVLDB), August 27-31, 2018
(Adopted by MySQL 8.0.3+) Technical Report
Ensuring Authorized Updates in Multi-user Database-Backed Applications
Kevin Eykholt, Atul Prakash, and Barzan Mozafari
In Proceedings of the 26th Usenix Security Symposium, August 16-18, 2017
Statistical Analysis of Latency Through Semantic Profiling
Jiamin Huang, Barzan Mozafari and Thomas F. Wenisch
In Proceedings of the European Conference on Computer Systems (EuroSys), April 23-26, 2017
Download VProfiler's Latest Release
Approximate Query Engines: Commercial Challenges and Research Opportunities
Barzan Mozafari
In Proceedings of the ACM SIGMOD 2017 Conference, May 14-19, 2017
(Keynote) Slides
Database Learning: Toward a Database that Becomes Smarter Every Time
Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, Barzan Mozafari
In Proceedings of the ACM SIGMOD 2017 Conference, May 14-19, 2017
Download Verdict
A Top-Down Approach to Achieving Performance Predictability in Database Systems
Jiamin Huang, Barzan Mozafari, Grant Schoenebeck, Thomas F. Wenisch
In Proceedings of the ACM SIGMOD 2017 Conference, May 14-19, 2017
Slides
SnappyData: A Unified Cluster for Streaming, Transactions and Interactive Analytics
Barzan Mozafari, Jags Ramnarayan, Sudhir Menon, Yogesh Mahajan, Soubhik Chakraborty, Hemant Bhanawat, Kishor Bachhav
In Proceedings of the Conference on Innovative Data Systems Research (CIDR), January 08-11, 2017
Spin out an iSight cloud for free!
Download Our Latest Release
Identifying the Major Sources of Variance in Transaction Latencies: Towards More Predictable Databases
Jiamin Huang, Barzan Mozafari, Grant Schoenebeck, and Thomas Wenisch
Technical Report, March, 2016
DBSherlock: A Performance Diagnostic Tool for Transactional Databases
Dong Young Yoon, Ning Niu, and Barzan Mozafari
In Proceedings of the ACM SIGMOD 2016 Conference, June 26 - July 01, 2016
Download the Datasets used in the Paper
Download the Source Code for DBSherlock / DBSeer
SnappyData: A Hybrid Transactional Analytical Store Built On Spark
Jags Ramnarayan, Barzan Mozafari, Sumedh Wale, Sudhir Menon, Neeraj Kumar, Hemant Bhanawat, Soubhik Chakraborty, Yogesh Mahajan, Rishitesh Mishra, Kishor Bachhav
In Proceedings of the ACM SIGMOD 2016 Conference, June 26 - July 01, 2016
Download the Source Code for SnappyData
Database Learning: Toward a Database that Becomes Smarter Every Time
Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, and Barzan Mozafari
Technical Report, April, 2016
SnappyData: Streaming, Transactions, and Interactive Analytics in a Unified Engine
Jags Ramnarayan, Barzan Mozafari, Sudhir Menon, Sumedh Wale, Neeraj Kumar, Hemant Bhanawat, Soubhik Chakraborty, Yogesh Mahajan, Rishitesh Mishra, and Kishor Bachhav
Technical Report, March, 2016
Neighbor-Sensitive Hashing
Yongjoo Park, Michael Cafarella, Barzan Mozafari
In Proceedings of the 42nd International Conference on Very Large Data Bases (PVLDB), September 05-09, 2016
Download the Source Code for NSH
Visualization-Aware Sampling for Very Large Databases
Yongjoo Park, Michael Cafarella, Barzan Mozafari
In Proceedings of the 32nd IEEE International Conference on Data Engineering (ICDE), May 16-20, 2016
Technical Report
A Handbook for Building an Approximate Query Engine
Barzan Mozafari and Ning Niu
IEEE Data Engineering Bulletin, October, 2015
DBSeer: Pain-free Database Administration through Workload Intelligence
Dong Young Yoon, Barzan Mozafari, and Douglas P. Brown
In Proceedings of the 41st International Conference on Very Large Data Bases (PVLDB), September 01-04, 2015
Watch a Video Demo of DBSeer
Download DBSeer's Latest Release
CliffGuard: A Principled Framework for Finding Robust Database Designs
Barzan Mozafari, Eugene Zhen Ye Goh, and Dong Young Yoon
In Proceedings of the ACM SIGMOD 2015 Conference, May 31 - June 04, 2015
Visit our project's website
Download CliffGuard's Open-source Release
Verdict: A System for Stochastic Query Planning
Barzan Mozafari
In Proceedings of the Conference on Innovative Data Systems Research (CIDR), January, 2015
Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning
Barzan Mozafari, Purna Sarkar, Michael Franklin, Michael Jordan, and Samuel Madden
In Proceedings of the 41st International Conference on Very Large Data Bases (PVLDB), September 01-04, 2015
Download Our Extensible Active Learning System
Download the Active Learning and Crowdsourced Datasets (Sentiment Analysis for Tweets) Used in the Paper
The Analytical Bootstrap: a New Method for Fast Error Estimation in Approximate Query Processing
Kai Zeng, Shi Gao, Barzan Mozafari and Carlo Zaniolo
In Proceedings of the ACM SIGMOD 2014 Conference, June, 2014
Download the Error Estimation Source Code for SQL Queries (I)
Download the Error Estimation Source Code for SQL Queries (II)
Knowing When You're Wrong: Building Fast and Reliable Approximate Query Processing Systems
Sameer Agarwal, Henry Milner, Ariel Kleiner, Ameet Talwalkar, Michael Jordan, Samuel Madden, Barzan Mozafari and Ion Stoica
In Proceedings of the ACM SIGMOD 2014 Conference, June, 2014
ABS: a System for Scalable Approximate Queries with Accuracy Guarantees
Kai Zeng, Shi Gao, Jiaqi Gu, Barzan Mozafari and Carlo Zaniolo
In Proceedings of the ACM SIGMOD 2014 Conference, June, 2014
(ACM SIGMOD's Best Demo Award) Download ABS (A General Approximate Query Engine with Error Estimation)
Download the Hive modifications for ABS
Active Learning for Crowd-Sourced Databases
Barzan Mozafari, Purnamrita Sarkar, Michael J. Franklin, Michael I. Jordan, and Samuel Madden
Technical Report, 2013
Download the Active Learning and Crowdsourced Datasets (Sentiment Analysis for Tweets) Used in the Paper
High-Performance Complex Event Processing over Hierarchical Data
Barzan Mozafari, Kai Zeng, Loris D'Antoni, and Carlo Zaniolo
In ACM TODS's Special Issue on, December, 2013
Performance and Resource Modeling in Highly-Concurrent OLTP Workloads
Barzan Mozafari, Carlo Curino, Alekh Jindal, and Samuel Madden
In Proceedings of the ACM SIGMOD 2013 Conference, June 22-27, 2013
Download DBSeer and start using it (it's now open source)!
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
Sameer Agarwal, Barzan Mozafari, Aurojit Panda, Henry Milner, Samuel Madden, and Ion Stoica
In Proceedings of the European Conference on Computer Systems (EuroSys), April 14-17, 2013
(Best Paper Award) Download BlinkDB's official release.
Complex Pattern Matching in Complex Structures: the XSeq Approach
Kai Zeng, Mohan Yang, Barzan Mozafari, and Carlo Zaniolo
In Proceedings of the 29th International Conference on Data Engineering (ICDE), April 08-11, 2013
DBSeer: Resource and Performance Prediction for Building a Next Generation Database Cloud
Barzan Mozafari, Carlo Curino, and Samuel Madden
In Proceedings of the Conference on Innovative Data Systems Research (CIDR), January 06-09, 2013
Download the code
High-Performance Complex Event Processing over XML Streams
Barzan Mozafari, Kai Zeng, and Carlo Zaniolo
In Proceedings of the ACM SIGMOD 2012 Conference, May 20-24, 2012
(Best Paper Award) Read the extended version here.
Blink and It's Done: Interactive Queries on Very Large Data
Sameer Agarwal, Aurojit Panda, Barzan Mozafari, Anand P. Iyer, Samuel Madden, and Ion Stoica
In Proceedings of the 38th International Conference on Very Large Data Bases (PVLDB), August 27-31, 2012
Download BlinkDB's official release.
SMM: a Data Stream Management System for Knowledge Discovery
Hetal Thakkar, Nikolay Laptev, Hamid Mousavi, Barzan Mozafari, Vincenzo Russo, and Carlo Zaniolo
In Proceedings of the 27th International Conference on Data Engineering (ICDE), April 11-16, 2011
From Regular Expressions to Nested Words: Unifying Languages and Query Execution for Relational and XML Sequences
Barzan Mozafari, Kai Zeng, and Carlo Zaniolo
In Proceedings of the 36th International Conference on Very Large Data Bases (PVLDB), September 12-17, 2010
K*SQL: A Unifying Engine for Sequence Patterns and XML
Barzan Mozafari, Kai Zeng, and Carlo Zaniolo
In Proceedings of the ACM SIGMOD 2010 Conference, June 06-11, 2010
(Honorable Mention Demo Award)
Optimal Load Shedding with Aggregates and Mining Queries
Barzan Mozafari and Carlo Zaniolo
In Proceedings of the 26th International Conference on Data Engineering (ICDE), March 01-06, 2010
Publishing Naive Bayesian Classifiers: Privacy without Accuracy Loss
Barzan Mozafari and Carlo Zaniolo
In Proceedings of the 35th International Conference on Very Large Data Bases (PVLDB), August 24-28, 2009
Continuous Post-Mining of Association Rules in a Data Stream Management System
Hetal Thakkar, Barzan Mozafari and Carlo Zaniolo
In Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction, edited by Yanchang Zhao, Chengqi Zhang, and Longbing Cao, Information Science Reference,
Verifying and Mining Frequent Patterns from Large Windows over Data Streams
Barzan Mozafari, Hetal Thakkar, and Carlo Zaniolo
In Proceedings of the 24th International Conference on Data Engineering (ICDE), April 07-12, 2008
Download the source code for frequent pattern/itemset mining over data streams
Download the implementation for DTV and DFV verifiers
A Data Stream Mining System
Hetal Thakkar, Barzan Mozafari and Carlo Zaniolo
In Proceedings of the International Conference on Data Mining ICDM 2008, December 15-19, 2008
Designing an Inductive DSMS: the Stream Mill Experience
Hetal Thakkar, Barzan Mozafari and Carlo Zaniolo
In Proceedings of the 2nd International Workshop on Scalable Stream Processing Systems SSPS 2008 in conjunction with EDBT 2008, March 29, 2008
On the Evolution of Wikipedia
Rodrigo B. Almeida, Barzan Mozafari, and Junghoo Cho
In Proceedings of the International Conference on Weblogs and Social Media (ICWSM), March 26-28, 2007
An Efficient Recursive Algorithm and an Explicit Formula for Calculating Update Vectors of Running Walsh-Hadamard Transform
Barzan Mozafari and Mohammad Hasan Savoji
In Proceedings of the 20th International Symposium on Signal Processing and its Applications ISSPA 2007, February 12-17, 2007
A New Collision Resistant Hash Function based on Optimum Dimensionality Reduction using Walsh-Hadamard Transform
Barzan Mozafari and Mohammad Hasan Savoji
In Proceedings of the 9th International Conference on Information Technology ICIT 2006, December 18-21, 2006