Quarantine DB Talk 2020: The Cascades Framework for Query Optimization at Microsoft
The Cascades framework was an academic project introduced 25 years ago as a foundation for modern query optimizers. It provides extensibility, memoization-based dynamic programming, an algebraic representation of logical and physical operator trees, and manipulation of such trees using transformation rules to enable cost-based query optimization. Cascades provides a clean skeleton for optimizer development, but it needs to be instantiated with domain knowledge and augmented in several directions to cope with real-world workloads in an industrial setting. We will describe some design choices and extensions to Cascades that power multiple Microsoft products, including MS SQL Server and Azure Synapse Analytics.
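For readers new to these ideas, the following is a toy sketch of memoized, cost-based plan search in the spirit of the abstract. It is not Cascades itself and not the optimizer in any Microsoft product. The table cardinalities, the selectivity model, the cost formulas, and every name in it (`CARD`, `rows`, `best_plan`, and so on) are assumptions made up for illustration. Each set of tables plays the role of a memo group, each split of that set is a logical alternative, and each join function stands in for an implementation rule that yields a physical operator.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical table cardinalities (rows); illustrative only.
CARD = {"A": 1000, "B": 100, "C": 10}
SELECTIVITY_DIVISOR = 100  # assumed: every join keeps 1 in 100 row pairs


def rows(tables):
    """Estimated output rows of joining `tables` under the toy model."""
    n = 1
    for t in tables:
        n *= CARD[t]
    return n // SELECTIVITY_DIVISOR ** (len(tables) - 1)


# Toy implementation rules: map a logical join to a physical operator
# name and the local cost of that operator.
def hash_join(left, right):
    return "HashJoin", rows(left) + rows(right)  # build plus probe


def nested_loop_join(left, right):
    return "NestedLoopJoin", rows(left) * rows(right)


IMPLEMENTATIONS = [hash_join, nested_loop_join]


@lru_cache(maxsize=None)  # the memo: one cached result per group
def best_plan(tables):
    """Return (cost, plan) for the cheapest plan producing `tables`."""
    if len(tables) == 1:
        (t,) = tables
        return rows(tables), f"Scan({t})"
    best = None
    members = sorted(tables)
    # Logical alternatives: every split of the group into two sides.
    for k in range(1, len(members)):
        for combo in combinations(members, k):
            left = frozenset(combo)
            right = tables - left
            lcost, lplan = best_plan(left)    # reuse memoized subplans
            rcost, rplan = best_plan(right)
            # Physical alternatives: cost every implementation.
            for impl in IMPLEMENTATIONS:
                name, local = impl(left, right)
                cand = (lcost + rcost + local, f"{name}({lplan}, {rplan})")
                if best is None or cand < best:
                    best = cand
    return best


cost, plan = best_plan(frozenset(CARD))
print(cost, plan)
```

Under these assumed numbers the search joins the two small tables first and uses hash joins throughout. A real Cascades-style optimizer differs in important ways: alternatives are generated by pluggable transformation rules rather than hard-coded enumeration, and the memo also tracks required physical properties such as sort order.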
This talk is part of the Quarantine Database Tech Talk Seminar Series.
Cesar is manager of the query processor team in Azure Synapse Analytics, Microsoft's scale-out cloud service for data warehouse workloads. He was a member of the initial team that designed and implemented the query optimizer of SQL Server, first shipped in 1997, and a manager of the query optimizer team through several product releases. He has worked as an architect in several Microsoft database products, including the Microsoft Analytics Appliance and database services in the cloud. Cesar received his PhD from Harvard University.
Nico is an architect at Microsoft working in the query optimization space in the Azure Data organization. Prior to this role, he was the tech lead of Spanner's query optimizer at Google. Before that, he led the Cosmos query optimizer team at Microsoft Bing. Even before that, he worked on self-tuning databases at Microsoft Research. In ancient times, he graduated from Columbia University working on statistics for query optimization. In his free time, Nico enjoys not working on query optimization.
More Info: https://db.cs.cmu.edu/seminar2020/