[DB Seminar] Fall 2018: Tianyu Li, Matt Butrovich, Sivaprasad Sudhir
Project 1: Storage Engine (Tianyu Li & Matt Butrovich)
In this talk, we will discuss our work on terrier’s storage engine over the semester. We will cover the implementation of write-ahead logging and our proposed model for recovery, the implementation of indexes, and our roadmap for the storage engine next semester. The immediate future direction for the storage work is to support Apache Arrow natively as our storage format, reducing ETL overhead when feeding a data science pipeline, while relaxing some of the Arrow format’s constraints for transactionally hot data to maintain high transaction throughput. We will briefly introduce Apache Arrow and present our proposed system architecture for achieving Arrow interoperability in the storage layer.
Project 2: Execution Engine (Sivaprasad Sudhir)
In this talk, I will present the new execution engine we are building for Peloton. Our new architecture translates queries into a high-level DSL IR, which opens up a wide variety of optimizations at the relational algebra, DSL, bytecode, and IR levels. I will also discuss an adaptive query processing framework we are developing on top of it.