Projects

Peloton

Project Years: 2016-2019

Over the last two decades, both researchers and vendors have built advisory tools to assist database administrators in various aspects of system tuning and physical design. Most of this previous work, however, is incomplete: the tools still require humans to make the final decisions about any changes to the database, and they are reactive measures that fix problems only after they occur.

What is needed for a truly “self-driving” database management system (DBMS) is a new architecture that is designed for autonomous operation. This differs from earlier attempts because all aspects of the system are controlled by an integrated planning component that not only optimizes the system for the current workload, but also predicts future workload trends so that the system can prepare itself accordingly. With this architecture, the DBMS can support all of the previous tuning techniques without requiring a human to determine the right way and proper time to deploy them. It also enables new optimizations that are important for modern high-performance DBMSs, but which are not possible today because the complexity of managing these systems has surpassed the abilities of human experts.

Peloton is a relational database management system designed for fully autonomous optimization of hybrid workloads.

Acknowledgements

This project was supported (in part) by Intel Labs, Google, Amazon, Samsung Research, Standuply, an Alfred P. Sloan Research Fellowship, and the U.S. National Science Foundation (CCF-1438955, IIS-1718582, SPX-1822933, IIS-1846158).

Publications

  1. P. Menon, A. Ngom, L. Ma, T. C. Mowry, and A. Pavlo, "Permutable Compiled Queries: Dynamically Adapting Compiled Queries without Recompiling," Proc. VLDB Endow., vol. 14, iss. 2, pp. 101-113, 2020.
    @article{menon2020,
       author = {Prashanth Menon and Amadou Ngom and Lin Ma and Todd C. Mowry and Andrew Pavlo},
       title = {Permutable Compiled Queries: Dynamically Adapting Compiled Queries without Recompiling},
       journal = {Proc. {VLDB} Endow.},
       volume = {14},
       number = {2},
       pages = {101--113},
       year = {2020},
       url = {https://db.cs.cmu.edu/papers/2020/p101-menon.pdf},
     }
  2. A. Pavlo, M. Butrovich, A. Joshi, L. Ma, P. Menon, D. Van Aken, L. Lee, and R. Salakhutdinov, "External vs. Internal: An Essay on Machine Learning Agents for Autonomous Database Management Systems," IEEE Data Engineering Bulletin, pp. 32-46, 2019.
    @article{pavlo19,
       author={Andrew Pavlo and Matthew Butrovich and Ananya Joshi and Lin Ma and Prashanth Menon and Dana Van Aken and Lisa Lee and Ruslan Salakhutdinov},
       title={External vs. Internal: An Essay on Machine Learning Agents for Autonomous Database Management Systems},
       journal={IEEE Data Engineering Bulletin},
       month={June},
       year={2019},
       pages={32--46},
       url = {https://db.cs.cmu.edu/papers/2019/pavlo-icde-bulletin2019.pdf},
     }
  3. Y. Sheng, A. Tomasic, T. Zhang, and A. Pavlo, "Scheduling OLTP transactions via learned abort prediction," in Proceedings of the Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM@SIGMOD 2019, 2019, pp. 1:1-1:8.
    @inproceedings{sheng19,
       author = {Yangjun Sheng and Anthony Tomasic and Tieying Zhang and Andrew Pavlo},
       title = {Scheduling {OLTP} transactions via learned abort prediction},
       booktitle = {Proceedings of the Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM@SIGMOD 2019},
       pages = {1:1--1:8},
       year = {2019},
       doi = {10.1145/3329859.3329871},
       url = {https://db.cs.cmu.edu/papers/2019/a1-sheng.pdf},
     }
  4. L. Ma, D. Van Aken, A. Hefny, G. Mezerhane, A. Pavlo, and G. J. Gordon, "Query-based Workload Forecasting for Self-Driving Database Management Systems," in Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 631-645.
    @inproceedings{ma18,
       author = {Ma, Lin and Van Aken, Dana and Hefny, Ahmed and Mezerhane, Gustavo and Pavlo, Andrew and Gordon, Geoffrey J.},
       title = {Query-based Workload Forecasting for Self-Driving Database Management Systems},
       booktitle = {Proceedings of the 2018 International Conference on Management of Data},
       series = {SIGMOD '18},
       year = {2018},
       pages = {631--645},
       numpages = {15},
       doi = {10.1145/3183713.3196908},
       url = {https://db.cs.cmu.edu/papers/2018/mod435-maA.pdf},
       code = {https://github.com/malin1993ml/QueryBot5000},
     }
  5. Z. Wang, A. Pavlo, H. Lim, V. Leis, H. Zhang, M. Kaminsky, and D. G. Andersen, "Building a Bw-Tree Takes More Than Just Buzz Words," in Proceedings of the 2018 ACM International Conference on Management of Data, 2018, pp. 473-488.
    @inproceedings{wang18,
       author = {Ziqi Wang and Andrew Pavlo and Hyeontaek Lim and Viktor Leis and Huanchen Zhang and Michael Kaminsky and David G. Andersen},
       title = {Building a Bw-Tree Takes More Than Just Buzz Words},
       booktitle = {Proceedings of the 2018 ACM International Conference on Management of Data},
       series = {SIGMOD '18},
       year = {2018},
       pages = {473--488},
       numpages = {16},
       url = {https://db.cs.cmu.edu/papers/2018/mod342-wangA.pdf},
       code = {https://github.com/wangziqi2016/index-microbench},
     }
  6. T. Zhang, A. Tomasic, Y. Sheng, and A. Pavlo, "Performance of OLTP via Intelligent Scheduling," in 2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018, pp. 1288-1291.
    @inproceedings{zhang18icde,
       author={Zhang, Tieying and Tomasic, Anthony and Sheng, Yangjun and Pavlo, Andrew},
       booktitle={2018 IEEE 34th International Conference on Data Engineering (ICDE)},
       title={Performance of OLTP via Intelligent Scheduling},
       year={2018},
       pages={1288--1291},
       doi={10.1109/ICDE.2018.00132},
     }
  7. P. Menon, T. C. Mowry, and A. Pavlo, "Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last," Proc. VLDB Endow., vol. 11, iss. 1, pp. 1-13, 2017.
    @article{menon17,
       author = {Prashanth Menon and Todd C. Mowry and Andrew Pavlo},
       title = {Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last},
       journal = {Proc. VLDB Endow.},
       volume = {11},
       number = {1},
       month = {September},
       year = {2017},
       pages = {1--13},
       publisher = {VLDB Endowment},
       url = {https://db.cs.cmu.edu/papers/2017/p1-menon.pdf},
     }
  8. A. Pavlo, G. Angulo, J. Arulraj, H. Lin, J. Lin, L. Ma, P. Menon, T. Mowry, M. Perron, I. Quah, S. Santurkar, A. Tomasic, S. Toor, D. Van Aken, Z. Wang, Y. Wu, R. Xian, and T. Zhang, "Self-Driving Database Management Systems," in CIDR 2017, Conference on Innovative Data Systems Research, 2017.
    @inproceedings{pavlo17,
       author = {Andrew Pavlo and Gustavo Angulo and Joy Arulraj and Haibin Lin and Jiexi Lin and Lin Ma and Prashanth Menon and Todd Mowry and Matthew Perron and Ian Quah and Siddharth Santurkar and Anthony Tomasic and Skye Toor and Dana Van Aken and Ziqi Wang and Yingjun Wu and Ran Xian and Tieying Zhang},
       title = {Self-Driving Database Management Systems},
       booktitle = {{CIDR} 2017, Conference on Innovative Data Systems Research},
       year = {2017},
       url = {https://db.cs.cmu.edu/papers/2017/p42-pavlo-cidr17.pdf},
     }
  9. Y. Wu, J. Arulraj, J. Lin, R. Xian, and A. Pavlo, "An Empirical Evaluation of In-Memory Multi-Version Concurrency Control," Proc. VLDB Endow., vol. 10, iss. 7, pp. 781-792, 2017.
    @article{wu17,
       author = {Yingjun Wu and Joy Arulraj and Jiexi Lin and Ran Xian and Andrew Pavlo},
       title = {An Empirical Evaluation of In-Memory Multi-Version Concurrency Control},
       journal = {Proc. VLDB Endow.},
       volume = {10},
       number = {7},
       month = {March},
       year = {2017},
       pages = {781--792},
       publisher = {VLDB Endowment},
       url = {https://db.cs.cmu.edu/papers/2017/p781-wu.pdf},
       code = {https://github.com/yingjunwu/peloton/tree/mvcc-epoch},
     }
  10. J. Arulraj, A. Pavlo, and P. Menon, "Bridging the Archipelago Between Row-Stores and Column-Stores for Hybrid Workloads," in Proceedings of the 2016 International Conference on Management of Data, 2016, pp. 583-598.
    @inproceedings{arulraj16,
       author = {Arulraj, Joy and Pavlo, Andrew and Menon, Prashanth},
       title = {Bridging the Archipelago Between Row-Stores and Column-Stores for Hybrid Workloads},
       booktitle = {Proceedings of the 2016 International Conference on Management of Data},
       series = {SIGMOD '16},
       year = {2016},
       pages = {583--598},
       numpages = {16},
       doi = {10.1145/2882903.2915231},
       url = {https://db.cs.cmu.edu/papers/2016/arulraj-sigmod2016.pdf},
     }
  11. J. Arulraj, M. Perron, and A. Pavlo, "Write-Behind Logging," Proc. VLDB Endow., vol. 10, iss. 4, pp. 337-348, 2016.
    @article{arulraj16-vldb,
       author = {Arulraj, Joy and Perron, Matthew and Pavlo, Andrew},
       title = {Write-Behind Logging},
       journal = {Proc. VLDB Endow.},
       volume = {10},
       number = {4},
       month = {December},
       year = {2016},
       pages = {337--348},
       publisher = {VLDB Endowment},
       url = {https://db.cs.cmu.edu/papers/2016/p337-arulraj.pdf},
     }