Evolution of the Storage Engine for Spanner, an Exabyte-scale Database System
Date
Time
Location
Speaker
I’ll describe the design of Spanner’s new storage engine, Ressi, which replaced untyped sorted string tables (inherited from Bigtable) with a strongly typed SQL-native representation. Live migration of 6 exabytes of data and multiple billion-user products to the new engine posed unique challenges. Sound methodology from experimental computer science was the key to its success.
The simplicity and power of declarative queries, combined with strongly consistent transactional semantics, have scaled to many thousands of machines running an aggregate of over 2 billion queries per second for some of the largest applications in the world. While challenges emerge as we continue to scale, I argue that the dominant obstacle to achieving zettabyte-scale databases lies in experimental methodology rather than in the underlying technical problems themselves.
Zoom link: https://cmu.zoom.us/my/jignesh
Bio:
David F. Bacon leads Google’s Spanner storage engine team, responsible for over 70% of the total fleet-wide cost of Spanner. His current work includes compression, RAM efficiency, ASIC support for databases, protection against “mercurial cores”, and tools for predicting fine-grained impact of software and hardware changes.
Prior to Google, he worked at IBM Research on programming language design, optimization, and hardware synthesis. He was named an ACM Fellow for pioneering work on real-time garbage collection.
He holds a Ph.D. from UC Berkeley, and his thesis work on optimizing virtual functions is used in most modern C++ and Java compilers. He has published over 80 papers and holds 29 patents.