Archived Events

Feb 28 2022
[Vaccination 2022] It’s All Downhill From Here: The Motivations and Design of the sled Embedded Database (Tyler Neely) CANCELLED
Speaker: Tyler Neely
System: sled

The sled embedded database is generally regarded as an “ok” choice for working with embedded data in an ergonomic, too-fast-for-its-own-good, transactional manner. But it wasn’t always that way! This talk covers the motivations, design choices, mistakes, and evolution during the first 6 years of this young database’s life. Topics covered: lock-free index structures, low-overhead logging, cheap OLTP transaction techniques, the...

Feb 21 2022
[Vaccination 2022] Orca: A Modular Query Optimizer Architecture for VMware Greenplum (Venky Raghavan)
Speaker: Venkatesh Raghavan
System: Greenplum
Video: YouTube

Greenplum is an established large-scale data warehouse system with both enterprise and open-source deployments. The massively parallel processing (MPP) architecture of Greenplum splits the data into disjoint parts that are stored across individual worker segments. The increased amount of data these systems have to process magnifies optimization mistakes and stresses the importance of query optimization more than ever. Furthermore, there...

Feb 14 2022
[Vaccination 2022] HTAP with Azure Cosmos DB: Hybrid Transaction & Analytical Processing (Hari Sudan S)
Speaker: Hari Sudan S
System: Azure Cosmos DB
Video: YouTube

Azure Cosmos DB is a multi-tenant globally distributed database service for managing JSON documents at Internet scale. As the amount of data managed by the service has grown several times over the past 5 years, customers have shown an increasing need for being able to do efficient analytics on top of this operational data store. The customer asks include: reducing...

Feb 7 2022
[Vaccination 2022] SpiceDB: Flexible Permissions Database for the Internet Era (Jake Moshenko)
Speaker: Jake Moshenko
System: SpiceDB
Video: YouTube

In this talk, we will walk through the architecture and implementation of SpiceDB, an open-source permissions database. SpiceDB is an implementation of Google’s Zanzibar paper (Zanzibar is the singular global-scale authorization service that powers permissions and sharing across all Google properties), and the talk will focus heavily on facets of the database that allow it to run at high scale, with low latency and incredible reliability....

Jan 31 2022
[Vaccination 2022] Practical Considerations for ACID/MVCC Storage Engines (Oren Eini)
Speaker: Oren Eini
System: RavenDB
Video: YouTube

In this talk, Oren Eini, founder of RavenDB, will discuss the design decisions behind RavenDB and the manner in which it stores data on disk. Achieving a highly concurrent and transactional system can be a challenging task. RavenDB solves this issue using a storage engine called Voron. We'll go over the design of Voron and how it is able to achieve...

Dec 13 2021
[Vaccination 2021] How We Build Firebolt (Benjamin Wagner)
Speaker: Benjamin Wagner
System: Firebolt
Video: YouTube

Data-driven companies are increasingly building customer-facing analytics products. These workloads demand lower latency, higher concurrency, and more predictable query performance than ever before - demands that traditional data warehouses struggle with. In this talk, Benjamin explains how Firebolt is designed to welcome this new generation of data challenges. This talk is part of the Vaccination Database (Second Dose) Tech Talk...

Dec 6 2021
[Vaccination 2021] Apache Arrow: High-Performance Columnar Data Framework (Wes McKinney)
Speaker: Wes McKinney
System: Arrow
Video: YouTube

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing. With the aim of making the data ecosystem modular and connected, Wes will talk about Apache Arrow’s vision for a more unified future data analytics ecosystem. In this talk, Wes will discuss the underlying interfaces and protocols powering the project, trends in the Apache Arrow ecosystem, and...

Nov 30 2021
An Overview of Google BigQuery (Justin Levandoski)
Speaker: Justin Levandoski

Google BigQuery is a serverless, scalable, and cost-effective cloud data warehouse. Having evolved from internal Google infrastructure (Dremel), BigQuery is unique in a number of dimensions. In this talk, we provide a look at some of the key architectural aspects of BigQuery and how it provides a true serverless and multi-tenant warehousing solution to customers. We then provide an...

Nov 29 2021
[Vaccination 2021] Convex: Life Without a Backend Team (James Cowling)
Speaker: James Cowling
System: Convex
Video: YouTube

Many of us have devoted decades of our lives to making databases faster, cheaper, more scalable, more reliable... much of which is completely irrelevant to the average developer. The main obstacle for the next generation of app developers is convenience. The "serverless revolution" and architectures like Jamstack demonstrate a desire for life without a backend team, yet current technologies in...

Nov 22 2021
[Vaccination 2021] Query Optimization and Acceleration at Dremio (Steven Phillips)
Speaker: Steven Phillips
System: Dremio
Video: YouTube

This talk is part of the Vaccination Database (Second Dose) Tech Talk Seminar Series. Zoom Link: https://cmu.zoom.us/j/95002789605 (Passcode 982149)