Projects

Structured analytics data platforms, such as data warehouses and lakehouses, are central to modern enterprises. Yet their internal architecture often follows a traditional design rooted in the early days of relational database management systems. For example, optimizing queries with multiple joins remains challenging because cost estimates are inaccurate. As a result, database systems often execute sub-optimal plans, which leads to poor performance and wasted resources. In cloud environments, these challenges are compounded by highly heterogeneous hardware. Queries must adapt dynamically, and in a resource-efficient manner, to the available hardware, which may change during a query’s lifecycle.

In our project, we are rethinking the traditional boundaries within structured data analytics platforms, including those between query optimization, execution, and scheduling. By reevaluating these boundaries, we aim to develop more robust and efficient methods for processing analytic workloads. For instance, rather than focusing optimizer effort on equi-join sub-trees, which are common in analytic workloads, we propose handling them with modern adaptive methods that use sideways information passing to achieve near-optimal runtime query performance. This strategy frees query optimizers to concentrate on other areas of the plan space.

Moreover, this approach facilitates the integration of agile and lightweight machine learning mechanisms into the core of these systems. Such mechanisms enable the systems to self-adapt to dynamically changing operational characteristics, including both hardware and workload parameters.
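
The sideways information passing idea mentioned above can be illustrated with a minimal sketch. It is not the project's implementation: the table contents and the names `build_key_filter` and `filtered_scan` are hypothetical, an exact Python set stands in for the compact approximate filter (such as a Bloom filter) that a real system would likely use, and the reordering rule is just one simple heuristic. Key filters are built from the filtered dimension tables and passed sideways to the fact-table scan, which drops non-joining rows early and periodically reorders the filters by their observed pass rate.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Row = Dict[str, object]


def build_key_filter(rows: Sequence[Row], key: str,
                     predicate: Callable[[Row], bool]) -> Set[object]:
    """Collect the join keys of the dimension rows that pass the predicate.

    Assumption: an exact set is used here for brevity; a Bloom filter would
    trade a few false positives for much less memory.
    """
    return {row[key] for row in rows if predicate(row)}


def filtered_scan(fact_rows: Sequence[Row],
                  filters: List[Tuple[str, Set[object]]],
                  batch_size: int = 2) -> List[Row]:
    """Scan the fact table, probing the key filters passed in sideways.

    After every `batch_size` rows, the filters are reordered so that the
    one with the lowest observed pass rate is probed first.
    """
    probed = [0] * len(filters)
    passed = [0] * len(filters)
    order = list(range(len(filters)))
    survivors: List[Row] = []
    for n, row in enumerate(fact_rows, start=1):
        keep = True
        for i in order:
            column, keys = filters[i]
            probed[i] += 1
            if row[column] in keys:
                passed[i] += 1
            else:
                keep = False
                break  # later filters are never probed for this row
        if keep:
            survivors.append(row)
        if n % batch_size == 0:
            # Most selective filter (lowest pass rate so far) goes first.
            order.sort(key=lambda i: passed[i] / probed[i] if probed[i] else 0.0)
    return survivors


# Hypothetical star schema: one fact table and two dimension tables.
customers = [{"c_id": 1, "region": "EU"}, {"c_id": 2, "region": "US"},
             {"c_id": 3, "region": "EU"}]
products = [{"p_id": 10, "category": "book"}, {"p_id": 20, "category": "toy"}]
sales = [
    {"s_id": 100, "c_id": 1, "p_id": 10},
    {"s_id": 101, "c_id": 2, "p_id": 10},
    {"s_id": 102, "c_id": 3, "p_id": 20},
    {"s_id": 103, "c_id": 3, "p_id": 10},
    {"s_id": 104, "c_id": 2, "p_id": 20},
]

eu_customers = build_key_filter(customers, "c_id", lambda r: r["region"] == "EU")
book_products = build_key_filter(products, "p_id", lambda r: r["category"] == "book")
survivors = filtered_scan(sales, [("c_id", eu_customers), ("p_id", book_products)])
print([row["s_id"] for row in survivors])  # [100, 103]
```

Because every fact row is checked against all filters regardless of their order, the surviving rows are the same for any join order over the two dimension tables. This is the property that makes the plan's runtime less sensitive to the optimizer's join-order choice in this sketch.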

People

Publications

Y. Zhang, Y. Chronis, J. M. Patel, and T. Rekatsinas, "Simple Adaptive Query Processing vs. Learned Query Optimizers: Observations and Analysis," Proc. VLDB Endow., vol. 16, iss. 11, pp. 2962-2975, 2023.

@article{zhang232,
   author = {Zhang, Yunjia and Chronis, Yannis and Patel, Jignesh M. and Rekatsinas, Theodoros},
   title = {Simple Adaptive Query Processing vs. Learned Query Optimizers: Observations and Analysis},
   year = {2023},
   issue_date = {July 2023},
   publisher = {VLDB Endowment},
   volume = {16},
   number = {11},
   doi = {10.14778/3611479.3611501},
   journal = {Proc. VLDB Endow.},
   month = {aug},
   pages = {2962--2975},
   numpages = {14},
 }

H. Zhang, H. Lim, V. Leis, D. G. Andersen, M. Kaminsky, K. Keeton, and A. Pavlo, "SuRF: Practical Range Query Filtering with Fast Succinct Tries," in Proceedings of the 2018 ACM International Conference on Management of Data, 2018, pp. 323-336.

@inproceedings{zhang18,
   author = {Huanchen Zhang and Hyeontaek Lim and Viktor Leis and David G. Andersen and Michael Kaminsky and Kimberly Keeton and Andrew Pavlo},
   title = {SuRF: Practical Range Query Filtering with Fast Succinct Tries},
   booktitle = {Proceedings of the 2018 ACM International Conference on Management of Data},
   series = {SIGMOD '18},
   year = {2018},
   pages = {323--336},
   numpages = {14},
   url = {https://db.cs.cmu.edu/papers/2018/mod601-zhangA-hm.pdf},
 }

J. Zhu, N. Potti, S. Saurabh, and J. M. Patel, "Looking Ahead Makes Query Plans Robust," Proc. VLDB Endow., vol. 10, iss. 8, pp. 889-900, 2017.

@article{DBLP:journals/pvldb/ZhuPSP17,
   author = {Jianqiao Zhu and Navneet Potti and Saket Saurabh and Jignesh M. Patel},
   title = {Looking Ahead Makes Query Plans Robust},
   journal = {Proc. {VLDB} Endow.},
   volume = {10},
   number = {8},
   pages = {889--900},
   year = {2017},
   url = {http://www.vldb.org/pvldb/vol10/p889-zhu.pdf},
   doi = {10.14778/3090163.3090167},
 }