Andy lives a database-centric lifestyle. That means that he spends most of his time either thinking about databases, writing about databases, using databases, teaching others about databases, or programming databases. Truly, his body is a vessel through which to conduct database research.
One day his beloved wife told him that if Stonebraker can have two kids, then he can at least have one. This logic seemed to make sense to him at the time. And now she's pregnant.
The idea of having to be responsible for a dependent is stressing Andy out. As such, he is going on a coast-to-coast speaking tour to discuss the challenges of research on self-driving databases while simultaneously trying to be a responsible life partner.
Abstract: The current research trend is toward developing "learned" components to supplement and replace legacy components in database management systems (DBMSs). Such learned components use machine learning (ML) methods to identify non-trivial trends and correlations in the DBMS's runtime behavior. They then use this information to create execution strategies and data structures that are tailored to the application's access patterns. The hope is that learned components will enable new optimizations that are not possible today because the complexity of managing DBMSs has surpassed the abilities of humans. This could then lead to the ultimate goal of achieving a "self-driving" DBMS that is able to configure, manage, and optimize itself automatically as the database and its workload evolve over time. The bad news is that creating such a fully autonomous DBMS is harder than that. The problem requires both holistic systems engineering and novel ML solutions, and it cannot be solved by just adding learned components to an existing DBMS.
In this talk, I will discuss the pressing unsolved problems in self-driving DBMSs. These include how to support training data collection, fast state changes, succinct state and action representations, and accurate reward observations. I will also present techniques for building a new autonomous DBMS or the steps needed to retrofit an existing one to enable automated control.
Andy Pavlo is an Associate Professor of Databaseology in the Computer Science Department at Carnegie Mellon University. His (unnatural) infatuation with database systems has inadvertently caused him to incur several distinctions, such as the NSF CAREER (2019), a Sloan Fellowship (2018), and the ACM SIGMOD Jim Gray Dissertation Award (2014).
Date | Location | Public? | Time |
---|---|---|---|
August 13 | CockroachDB, New York, NY | YES | 12:00pm |
August 14 | MongoDB, New York, NY | NO | 12:00pm |
August 15 | Two Sigma, New York, NY | NO | 12:00pm |
August 19 | Snowflake, San Mateo, CA | NO | 1:00pm |
August 20 | Google, Mountain View, CA | NO | 12:00pm |
August 20 | Rockset, San Mateo, CA | YES | 6:00pm |
August 21 | Oracle, Redwood City, CA | NO | 10:00am |
August 21 | Yelp, San Francisco, CA | YES | 2:00pm |
August 26 | AIDB @ VLDB, Los Angeles, CA | YES | 9:00am |
September 19 | Cornell University, Ithaca, NY | YES | 12:00pm |
October 31 | UPMC Magee-Womens Hospital, Pittsburgh, PA | YES | TBD |