Events

PhD Defense: On Holistic Database Optimization via Leveraging Similarity Across Actions, Workloads, Configurations, and Scenarios (William Zhang)

Speaker:
William Zhang
Date:
Fri Jan 23, 2026 @ 01:00pm EST
Location:
NSH 3002
Title:
On Holistic Database Optimization via Leveraging Similarity Across Actions, Workloads, Configurations, and Scenarios

Talk Info:

Modern database management systems (DBMSs) have evolved to support increasingly sophisticated data-intensive applications, but they have become substantially more complex to configure as a result, for two reasons. First, DBMSs expose a vast configuration space with trillions of possibilities that encompass system knobs, physical design (e.g., indexes), and query options, among others. Second, these applications are constantly evolving with changes in data access patterns, query types, load intensities, hardware, and data distributions that necessitate continuous re-optimization.

To address these challenges, decades of autonomous DBMS optimization research have produced specialized tuning tools to assist human operators. Deploying these tools involves a complex multi-step workflow where an operator (1) observes the DBMS’s behavior, (2) selects tools based on the objectives and their expertise, (3) configures them with an isolated environment, (4) orchestrates their execution to obtain recommendations, and (5) reviews those recommendations before deployment. This cumbersome process results in suboptimal configurations and slow adaptation to evolving applications’ workloads due to isolated specialized tools, inefficient reuse of prior tuning knowledge, and the fallible human factor.

In this dissertation, we present techniques that address those limitations by leveraging similarity to enable holistic database optimization. First, we present a holistic tuning tool that optimizes multiple DBMS aspects simultaneously by using action similarity to organize actions into neighborhoods conducive to exploration. We then present a framework that assists tuners in adapting to environment changes by leveraging workload and configuration similarity to re-mix historical knowledge. Lastly, we present a system that transforms the human-centric tuning workflow into an agentic process by using scenario similarity to link the deployment context with semantic tool interfaces to optimize the deployment.

The techniques and associated similarity definitions presented in this dissertation enable agentic holistic DBMS optimization over a deployment’s lifetime, improving the deployment’s performance and reducing the time taken to adapt to changes in upstream user applications.

Bio:

William Zhang is the #1 ranked Ph.D. student in the Carnegie Mellon Database Group.

More Info: https://csd.cmu.edu/calendar/2026-01-23/doctoral-thesis-oral-defense-william-zhang