Archived Events

Aug 3 2020
YugabyteDB: Bringing Together the Best of Amazon Aurora and Google Spanner
Speaker:
Karthik Ranganathan
System:
YugabyteDB
Video:
YouTube

PostgreSQL, a single-node open-source RDBMS, is widely adopted for its powerful set of features. However, PostgreSQL is not built to be used as a cloud-native database, and therefore cannot inherently survive failures, scale horizontally or support geo-distributed deployments. While Amazon Aurora has modified the subsystem of PostgreSQL that writes to disk along with simplifying async replication to make the database resilient...

Jul 27 2020
Black-box Isolation Checking with Elle
Speaker:
Kyle Kingsbury
System:
Jepsen
Video:
YouTube

Databases are awful. They lose information, corrupt state, and do other terrible things, both by design and by accident. You'd think that *testing* databases to see how awful they are would help make them better, but it turns out that testing most of the useful database safety properties is *also* awful. We came up with a better way to test...

Jul 24 2020
MS Thesis Defense: Filter Representation in Vectorized Query Execution (Amadou Ngom)
Speaker:
Amadou Ngom

Advances in memory capacity have allowed Database Management Systems (DBMSs) to store large amounts of data in memory, thereby shifting the performance bottleneck of query execution from disk accesses to CPU efficiency (i.e., instruction count and cycles per instruction). One technique used to achieve such efficiency in analytical applications is batch-oriented processing or vectorization: it reduces interpretation overhead, improves cache...

Jul 20 2020
Rockset: Realtime Indexing for fast queries on massive semi-structured data
Speaker:
Dhruba Borthakur
System:
Rockset
Video:
YouTube

Rockset is a realtime indexing database that powers fast SQL over semi-structured data such as JSON, Parquet, or XML without requiring any schematization. All data loaded into Rockset are automatically indexed and a fully featured SQL engine powers fast queries over semi-structured data without requiring any database tuning. Rockset exploits the hardware fluidity available in the cloud and automatically grows...

Jul 13 2020
Astra: How we built a Cassandra-as-a-Service
Speakers:
Jim McCollom, Jeff Carpenter
System:
Cassandra
Video:
YouTube

At DataStax, we’ve been on a multi-year journey to bring a Cassandra DBaaS to the market, culminating in the GA of Astra in May 2020. In this talk, we’ll share our successes and failures through the iterative journey to GA, our current Kubernetes-based architecture, how we built scalability and reliability into the platform, and how Cassandra’s architecture and implementation...

Jul 6 2020
Another Relational Database, Why and How
Speakers:
Oscar Batori, Zach Musgrave
System:
Dolt
Video:
YouTube

There are a lot of relational databases, so a fair question is why we decided to create a new one. The primary reason is trade-offs. Relational databases are optimized for storing a single version of the truth and providing it or updating it with maximum efficiency. More succinctly, they are optimized for being good OLTP stores. They are not optimized...

Jun 29 2020
[DB Seminar] Spring 2020 DB Group: Linux 4.x Tracing (Pre-Recorded)
Speaker:
Brendan Gregg

There is no invited speaker today. We will instead watch this video together: Linux 4.x Tracing: Performance Analysis with bcc/BPF (eBPF), by Brendan Gregg (https://youtu.be/w8nFRoFJ6EQ). Zoom Password: 264771

Jun 22 2020
Testing Cloud-Native Databases with Chaos Mesh
Speaker:
Siddon Tang
System:
Chaos Mesh
Video:
YouTube

In the world of distributed computing, faults happen to clusters unpredictably, especially when they run in the cloud. To make a distributed database like TiDB resilient enough, chaos engineering is the way to go. At PingCAP, we use Chaos Mesh®, an open-source chaos engineering platform for Kubernetes, to improve the resiliency of TiDB. Chaos Mesh adopts a cloud-native design and currently...

Jun 15 2020
Deepgreen DB: Greenplum at Speed
Speaker:
CK Tan
System:
Vitesse
Video:
YouTube

Greenplum is an open-source Postgres-based MPP solution that can scale to hundreds of nodes and petabytes of data. Deepgreen DB is an optimized version of Greenplum. On top of a mature, market-tested data warehouse, Deepgreen DB adds data-centric code generation for speed, a columnar external data engine, a new interconnect, and SQL-level integration with Go/Python. This talk will mainly recount the...

Jun 8 2020
Finding Logic Bugs in Database Management Systems
Speaker:
Manuel Rigger
System:
SQLancer
Video:
YouTube

Database Management Systems (DBMSs) are used ubiquitously for storing and retrieving data. It is critical that they function correctly; incorrectly computed result sets (e.g., by omitting a row) can cause serious loss or damage. We refer to such defects as logic bugs. Despite their importance, finding logic bugs in production DBMSs is a longstanding challenge. Existing techniques such as...