Archived Events

Mar 28 2022
04:30pm EDT
[Vaccination 2022] Design and Implementation of the RelationalAI Knowledge Graph Management System (Martin Bravenboer)

RelationalAI is the next-generation database platform for new intelligent data applications based on relational knowledge graphs. The Relational Knowledge Graph Management System (KGMS) complements the modern data stack by allowing data applications to be implemented relationally and declaratively, leveraging knowledge/semantics for reasoning, graph analytics, relational machine learning, and mathematical optimization workloads. RelationalAI as a relational and cloud native system fits naturally in the modern data stack, providing virtually infinite compute and storage capacity, versioning, and a fully managed system. RelationalAI... Read More

Mar 24 2022
01:00pm EDT
NSH 4305
Hyperscale Data Processing with Network-centric Designs (Qizhen Zhang)

Today's largest data processing workloads are hosted in cloud data centers. Due to exponential data growth and the end of Moore's Law, these workloads have ballooned to the hyperscale level, encompassing billions to trillions of data items per query spread across hundreds to thousands of servers connected by the data center network. These massive scales fundamentally challenge the designs of both data processing systems and data center networks. My research rethinks the interactions between these two layers and seeks the... Read More

Mar 21 2022
04:30pm EDT
[Vaccination 2022] Stardog Query Optimiser: Architecture and Cardinality Estimations for Graph Queries (Pavel Klinov)

Stardog is a commercial knowledge graph platform at the heart of which lies a graph database. It manages graph data in the form of RDF (Resource Description Framework) triples and natively implements the SPARQL 1.1 graph query language. This talk will present the general architecture of the query engine and then delve into the internals of the query optimiser, particularly graph statistics and cardinality estimations for graph patterns. Unlike some early SPARQL systems, Stardog is not built on... Read More

Mar 14 2022
04:30pm EDT
[Vaccination 2022] Open-source Change Data Capture With Debezium (Gunnar Morling)

Change Data Capture (CDC) is one big enabler for your data; by reacting to changes in your database in "real-time", CDC comes in handy for implementing a wide range of use cases, such as low-latency data updates from OLTP data stores to OLAP systems, caches, or search indexes, data exchange between microservices, building audit logs, and many more. In this talk you'll learn about Debezium, a distributed open-source log-based CDC platform for a variety of databases, such as Postgres, MySQL,... Read More

Mar 7 2022
04:30pm EDT
[Vaccination 2022] ApertureDB: Designing a Purpose-built System for Visual Data and Data Science (Vishakha Gupta)

Data science and ML techniques can help understand visual content and enable better customer experience across domains, in turn driving the exponential growth in the amount of visual data. Managing large amounts of visual data (images or videos) is extremely time consuming, frustrating, and inefficient due to a lack of data management solutions designed with visual data or data science in mind. In this talk, I will start by briefly highlighting why visual data needs special treatment now and how... Read More

Feb 28 2022
04:30pm EDT
[Vaccination 2022] It’s All Downhill From Here: The Motivations and Design of the sled Embedded Database (Tyler Neely) CANCELLED

The sled embedded database is generally regarded as an “ok” choice for working with embedded data in an ergonomic, too-fast-for-its-own-good, transactional manner. But it wasn’t always that way! This talk covers the motivations, design choices, mistakes, and evolution during the first 6 years of this young database’s life. Topics covered: lock-free index structures, low-overhead logging, cheap OLTP transaction techniques, the RUM conjecture’s implications for database design, finding vast troves of bugs with very little testing code in concurrent and stateful... Read More

Feb 21 2022
04:30pm EDT
[Vaccination 2022] Orca: A Modular Query Optimizer Architecture for VMware Greenplum (Venky Raghavan)

Greenplum is an established large-scale data-warehouse system with both enterprise and open-source deployments. The massively parallel processing (MPP) architecture of Greenplum splits the data into disjoint parts that are stored across individual worker segments. The increased amount of data these systems have to process magnifies optimization mistakes and stresses the importance of query optimization more than ever. Furthermore, there is a growing need for optimizers to be highly extensible and modular to ensure that the optimizer can keep up with the... Read More

Feb 14 2022
04:30pm EDT
[Vaccination 2022] HTAP with Azure Cosmos DB: Hybrid Transaction & Analytical Processing (Hari Sudan S)

Azure Cosmos DB is a multi-tenant globally distributed database service for managing JSON documents at Internet scale. As the amount of data managed by the service has grown several times over the past 5 years, customers have shown an increasing need for being able to do efficient analytics on top of this operational data store. The customer asks include: reducing cost, removing the need to manage separate data storage or ETL, as well as being able to query data using... Read More

Feb 7 2022
04:30pm EDT
[Vaccination 2022] SpiceDB: Flexible Permissions Database for the Internet Era (Jake Moshenko)

In this talk, we will walk through the architecture and implementation of SpiceDB, an open-source permissions database. Because SpiceDB is an implementation of Google’s Zanzibar paper (Zanzibar is the singular global-scale authorization service that powers permissions and sharing across all Google properties), the speaker will focus heavily on facets of the database that allow it to run at high scale, with low latency and incredible reliability. He will also cover some of the innovations that have made the database service easier to understand and consume. This... Read More

Jan 31 2022
04:30pm EDT
[Vaccination 2022] Practical Considerations for ACID/MVCC Storage Engines (Oren Eini)

In this talk, Oren Eini, founder of RavenDB, will discuss the design decisions behind RavenDB and the manner in which it stores data on disk. Achieving a highly concurrent and transactional system can be a challenging task. RavenDB solves this issue using a storage engine called Voron. We'll go over the design of Voron and how it achieves high performance while maintaining ACID integrity. This talk is part of the Vaccination Database (Booster) Tech Talk Seminar Series.... Read More