Quarantine DB Talk 2020: FoundationDB or: How I Learned to Stop Worrying and Trust the Database
Getting multiple entities to work well together is difficult, for machines as much as for humans. This is why testing and debugging distributed systems is so hard. Even when well-known algorithms are used, subtle bugs can cause catastrophic failures. FoundationDB uses deterministic simulation to provoke and test such failures. This is the secret sauce that makes FoundationDB one of the most robust databases on the market.
FoundationDB is a distributed key-value store with support for strictly serializable transactions. It was originally developed by a company of the same name and was later acquired and open-sourced by Apple. Some of its largest users and open-source contributors are Snowflake, Apple, VMware, and IBM.
In this talk I will start by giving a short overview of FoundationDB and how Snowflake uses it for its data warehouse. Then we’ll talk about distributed systems in general and what kinds of failures one commonly (and less commonly) sees when running large distributed systems. This will make the case for why testing such systems is so incredibly hard. The main focus of the talk will then be on deterministic simulation, which is what FoundationDB implements in order to produce failures deterministically. At the end I will give a quick demo of our testing infrastructure and show how we can simulate FoundationDB clusters running on unstable hardware for thousands of years.
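As a rough illustration of the idea behind deterministic simulation, here is a minimal Python sketch. It is not FoundationDB's simulator; the function `simulate` and its parameters `steps` and `drop_rate` are hypothetical names chosen for this example. The assumption it demonstrates is simply that if every "random" failure (here, a dropped or reordered message) is drawn from a single seeded generator, then rerunning with the same seed replays exactly the same failure schedule.

```python
import random


def simulate(seed, steps=20, drop_rate=0.2):
    """Run a toy, single-threaded network simulation.

    All randomness comes from one generator seeded with `seed`,
    so the same seed always yields the same sequence of events.
    """
    rng = random.Random(seed)  # the only source of nondeterminism
    in_flight = []             # messages held by the simulated network
    delivered = []             # order in which the receiver saw messages
    trace = []                 # event log, used to compare runs

    for step in range(steps):
        in_flight.append(step)  # the sender emits one message per step
        # Pick an arbitrary in-flight message: this models reordering.
        msg = in_flight.pop(rng.randrange(len(in_flight)))
        if rng.random() < drop_rate:
            trace.append(("drop", msg))      # injected failure: message lost
        else:
            delivered.append(msg)
            trace.append(("deliver", msg))   # message reaches the receiver

    return trace, delivered


if __name__ == "__main__":
    first = simulate(seed=42)
    replay = simulate(seed=42)
    # A failing run can be reproduced exactly by reusing its seed.
    print("replay identical:", first == replay)
```

In a setup like this, a test harness would try many seeds, and any seed that exposes a bug can be handed to a developer to reproduce the exact same run under a debugger.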
This talk is part of the Quarantine Database Tech Talk Seminar Series.
Markus Pilman joined Snowflake in April 2016 as a software engineer, where he works on FoundationDB. He currently manages the FoundationDB team, which is part of the persistent metadata team. The FoundationDB team directly contributes to the open-source project and is currently working on features for visibility, scalability, manageability, and a better backup/restore solution. Before joining Snowflake, he earned a Dr. sc. from ETH Zurich in Switzerland, where he worked under the supervision of Donald Kossmann on distributed key-value stores. In his free time he enjoys hiking, camping, photography, and reckless driving.
More Info: https://db.cs.cmu.edu/seminar2020/