User-Defined Function Compilation Magic
User-defined functions (UDFs) have become an increasingly popular way to extend SQL with procedural functionality. Despite their many software engineering advantages, UDFs are notoriously difficult to optimize due to the mismatch of language paradigms across the SQL/UDF boundary. The fundamental flaw of today’s UDF optimization strategies is that they treat UDFs as atomic units that are handled the same way: either entirely separated from the SQL code, or completely integrated into the SQL code.
Our research is on automated methods that strategically transform code across SQL/UDF boundaries to harness the strengths of both SQL query optimizers and UDF compilers and maximize performance. By recognizing the specific information on each side of the boundary that is critical for optimization on the other side, the system automatically restructures both code and information flow across the SQL/UDF boundary. This restructuring enables the query optimizer to choose efficient set operations and the UDF compiler to generate efficient procedural code.
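To make the boundary problem concrete, the sketch below contrasts two ways of evaluating the same logic. It is only an illustration under stated assumptions, not the system described above: the `customers` and `orders` tables, the `total_spent` function, and the 10.0 threshold are all hypothetical, and Python's standard `sqlite3` module stands in for a real DBMS. The first strategy calls a UDF-like function once per row, so its embedded query is hidden from the optimizer. The second expresses the same computation as one set-oriented query over all inputs at once.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'ada'), (2, 'bob'), (3, 'cy');
        INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);
        """
    )
    return conn


def total_spent(conn, customer_id):
    """Hypothetical UDF body: procedural code wrapped around an embedded query."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    # Procedural branch that the SQL optimizer cannot see in this form.
    return "big" if total >= 10.0 else "small"


def per_row(conn):
    """Treat the UDF as an opaque unit: one call, and one query, per input row."""
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return [(name, total_spent(conn, cid)) for cid, name in rows]


def batched(conn):
    """Same logic as a single set-oriented query over all input rows."""
    return conn.execute(
        """
        SELECT c.name,
               CASE WHEN COALESCE(SUM(o.amount), 0.0) >= 10.0
                    THEN 'big' ELSE 'small' END
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(per_row(conn))
    print(batched(conn))
```

Both functions return the same rows, but `per_row` issues one query per customer while `batched` lets the engine pick a join and aggregation strategy for the whole input. A hand-written rewrite like this is easy for a toy function; doing it automatically for arbitrary procedural code is the hard part that the research targets.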
People
Acknowledgements
This project is supported (in part) by a Google DAPA Research Gift and the U.S. National Science Foundation (IIS-2404373).
Publications
- K. Franz, S. I. Arch, D. Hirn, T. Grust, T. Mowry, and A. Pavlo, "Dear User-Defined Functions, Inlining isn't working out so great for us. Let's try batching to make our relationship work. Sincerely, SQL," in CIDR 2024, Conference on Innovative Data Systems Research, 2024. PDF
Bibtex
@inproceedings{franz24,
  author    = {Franz, Kai and Arch, Samuel I and Hirn, Denis and Grust, Torsten and Mowry, Todd and Pavlo, Andrew},
  title     = {{Dear User-Defined Functions, Inlining isn't working out so great for us. Let's try batching to make our relationship work. Sincerely, SQL}},
  booktitle = {{CIDR} 2024, Conference on Innovative Data Systems Research},
  year      = {2024},
  url       = {https://db.cs.cmu.edu/papers/2024/p13-franz.pdf},
}