Archived Events

May 3 2021
04:30pm EST
[Vaccination 2021] Under the Hood of an Exadata Transaction – How Did We Harness the Power of Persistent Memory? (Jia Shi)

Persistent memory is a new silicon technology, adding a distinct storage tier of performance, capacity, and price between DRAM and flash. The persistent memory is physically present on the memory bus of the storage server, resulting in reads at memory speed, much faster than flash. Writes are persistent, surviving power cycles, unlike DRAM. Oracle has engineered Exadata Smart PMEM Cache and Exadata Smart PMEM Log capabilities with Intel Optane Persistent Memory to achieve this significant boost in Oracle Database OLTP... Read More

Apr 26 2021
04:30pm EST
[Vaccination 2021] Separation of Storage and Compute for Transactions and Analytics (Joyo Victor)

Separation of storage and compute, à la Snowflake or BigQuery, gives enormous benefits in terms of flexibility, scalability, and durability. This talk presents a detailed architecture differentiated by low-latency small writes. This talk is part of the Vaccination Database Tech Talk Seminar Series. Zoom Link: https://cmu.zoom.us/j/94112059546 (Password 809013) Read More

Apr 19 2021
04:30pm EST
[Vaccination 2021] Deterministic Database Management in Mission-Critical Applications (Andrei Gorine)

Mission- and safety-critical systems software designs embody key characteristics for which temporal correctness is essential. Deterministic, predictable, and fully controllable software components that complement modern real-time operating system offerings are in demand. It is commonly believed by software developers that meeting timing requirements is a matter of sufficiently increasing system throughput. However, research and industry projects have often brought forward temporal aspects and timing constraints of database transactions. This talk will discuss the objectives of deterministic, predictable database management in... Read More

Apr 12 2021
04:30pm EST
[Vaccination 2021] LeanStore: In-Memory Data Management Beyond Main Memory (Viktor Leis)

LeanStore is a high-performance OLTP storage engine optimized for many-core CPUs and NVMe SSDs. The goal of the project is to achieve performance comparable to in-memory systems when the data set fits into RAM, while being able to fully exploit the bandwidth of fast NVMe SSDs for large data sets. In this talk, I will present most of the important techniques used by LeanStore, including its low-overhead buffer manager, scalable synchronization primitives, optimized B-tree indexes, and an efficient logging and... Read More

Apr 5 2021
04:30pm EST
[Vaccination 2021] Query Processing in Google BigQuery (Hossein Ahmadi + Aleksandras Surna)

Google BigQuery is a serverless, scalable, and cost-effective cloud data warehouse. In this talk, we give an overview of distributed query execution in BigQuery and present various query optimization techniques used. In particular, we will discuss the dynamic query execution primitives built into BigQuery. This talk is part of the Vaccination Database Tech Talk Seminar Series. Zoom Link: https://cmu.zoom.us/j/94112059546 (Password 809013) Read More

Mar 29 2021
04:30pm EST
[Vaccination 2021] FASTER: Efficient State Management for the Modern Edge-Cloud (Badrish Chandramouli)

Managing state efficiently in modern applications written for the cloud and edge is hard. In the FASTER project, we have been creating building blocks such as FasterKV and FasterLog to alleviate this problem using techniques such as epoch protection, tiered storage, and asynchronous recoverability. In this talk, we describe these components and how we have been evolving the project over time to meet the needs of a diverse set of use cases at Microsoft and in open source. This talk... Read More

Mar 22 2021
04:30pm EST
[Vaccination 2021] NoisePage: The Self-Driving Database Management System (Lin Ma)

Database management systems (DBMSs) are an important part of modern data-driven applications. However, they are notoriously difficult to deploy and administer. There are existing methods that recommend physical design or knob configurations for DBMSs. But most of them require humans to make final decisions and decide when to apply changes. The goal of a self-driving DBMS is to remove the DBMS administration impediments by managing itself autonomously. In this talk, I present the design of a new self-driving DBMS (NoisePage)... Read More

Mar 16 2021
04:00pm EST
[PDL] Package Queries: Scalable Prescriptive Analytics Close to the Data (Matteo Brucato)

Decision making is central to a broad range of domains, including finance, transportation, healthcare, the travel industry, robotics, and engineering. It is often found at the very final step of business analytics--prescriptive analytics--to allow businesses to transform a rich understanding of data, typically provided by advanced predictive models, into actionable decisions. Modeling and solving these problems have relied on application-specific solutions, which are often complex, error-prone, and not generalizable. My goal is to create a domain-independent, declarative approach, supported and... Read More

Mar 15 2021
04:30pm EST
[Vaccination 2021] HarperDB’s Data Storage Journey: From File System to LMDB (Kyle Bernhardy)

HarperDB is a distributed database with hybrid SQL and NoSQL functionality in one, accessed via a REST API. It is known as a structured object store with SQL capabilities, or NewSQL. HarperDB leverages a logical structure enabling ACID-compliant, efficient storage and retrieval without inconsistency, race conditions, or utilizing in-memory indexing. HarperDB is fully indexed and runs on any device from edge to cloud. In this talk we will cover HarperDB's Data Storage Journey. Kyle will review the different steps along the... Read More

Mar 8 2021
04:30pm EST
[Vaccination 2021] Novel Design Choices in Apache CouchDB (Adam Kocoloski)

Apache CouchDB is a JSON document store with a native HTTP API, server-side JavaScript indexing, and active/active data replication across flexible configurations of server instances that are free to come and go as they please. Under the hood the DBMS is implemented largely in Erlang and features copy-on-write B-trees, hash histories for automatic revision tracking of individual records, and a purely asynchronous index maintenance system. This novel combination of capabilities has been powering web and mobile applications of all shapes... Read More