Archived Events

Jun 1 2020
Building Materialize, a Streaming SQL Database powered by Timely Dataflow
Speaker: Arjun Narayan
System: Materialize
Video: YouTube

Materialize (Materialize.io, GitHub) is a streaming database. Instead of being optimized for processing ad-hoc transactional or analytical queries, it is optimized for view maintenance on an ongoing basis over streams of already processed transactions. Although OLTP and OLAP systems often have support for views, they are not architected to efficiently maintain these views as the data change. Systems designed for...

May 18 2020
APOLLO: Automatic Detection and Diagnosis of Performance Regressions in Database Systems
Speaker: Jinho Jung
System: APOLLO
Video: YouTube

The practical art of constructing database management systems (DBMSs) involves a morass of trade-offs among query execution speed, query optimization speed, standards compliance, feature parity, modularity, portability, and other goals. It is no surprise that DBMSs, like all complex software systems, contain bugs that can adversely affect their performance. The performance of DBMSs is an important metric as it determines...

May 11 2020
Introducing ClickHouse–the fastest data warehouse you’ve never heard of
Speaker: Robert Hodges
System: ClickHouse
Video: YouTube

The market for scalable SQL data warehouses is dominated by proprietary products. ClickHouse is one of the first open source projects to give those products a run for their money. ClickHouse scales to hundreds of nodes with ingest measured in millions of events per second. The user community includes CloudFlare, Cisco, and numerous financial services companies. This talk briefly recounts...

May 4 2020
[DB Seminar] Spring 2020 DB Group: Active Learning for ML Enhanced Database Systems
Speaker: Lin Ma

Recent research has shown promising results by using machine learning (ML) techniques to improve the performance of database systems, e.g., in query optimization or index recommendation. However, in many production deployments, the ML models’ performance degrades significantly when the test data diverges from the data used to train these models. In this talk, I will present a solution to...

Apr 27 2020
Anna: a KVS for Any Scale
Speaker: Chenggang Wu
System: Anna
Video: YouTube

Modern cloud providers offer dense hardware with multiple cores and large memories, hosted in global platforms. This raises the challenge of implementing high-performance software systems that can effectively scale from a single core to multicore to the globe. Conventional wisdom says that software designed for one scale point needs to be rewritten when scaling up by 10-100x. In contrast, we...

Apr 20 2020
DuckDB – The SQLite for Analytics
Speaker: Mark Raasveldt
System: DuckDB
Video: YouTube

The great popularity of SQLite shows that there is a need for unobtrusive in-process data management solutions. However, there is no such system yet geared towards analytical workloads. In this talk, I will present DuckDB, a novel data management system designed to execute analytical SQL queries while embedded in another process.
Zoom Link: https://cmu.zoom.us/j/562649242

Apr 13 2020
[DB Seminar] Spring 2020 DB Group: Mostly Order Preserving Dictionaries
Speaker: Chunwei Liu

Dictionary encoding, or domain encoding, is an important form of compression that uses a bijective mapping to replace attributes from a large domain (e.g., strings) with a finite domain (e.g., 32-bit integers). This encoding both reduces data storage and allows for more efficient query execution. Traditional dictionary encoding only supports efficient equality queries, while range queries require that encoded...

Apr 6 2020
[DB Seminar] Spring 2020 DB Group: Round-table Discussion

The DB group will convene for a casual round-table discussion on database topics.
Zoom Link: https://cmu.zoom.us/j/562649242

Mar 30 2020
[DB Seminar] Spring 2020 DB Group: OtterTune Update
Speaker: Dana Van Aken

In this talk, Dana will provide an update on running OtterTune at SocGen. Or, how OtterTune will take over the world.
Zoom Link: https://cmu.zoom.us/j/562649242

Mar 23 2020
DB Seminar [Spring 2020]: Zero-Overhead Deterministic C++ Exceptions (“Herb Exceptions”)
Speaker: Rohan Aggarwal

In this talk, Rohan Aggarwal will present a new proposal to the C++ standard for zero-overhead exceptions. A fundamental reason why C++ is successful and loved is its adherence to Stroustrup’s zero-overhead principle: You don’t pay for what you don’t use, and if you do use a feature you can’t reasonably code it better by hand. In the C++ language...