Archived Events

May 2 2022
[Vaccination 2022] IO in PostgreSQL: Past, Present, Future (Andres Freund)
Speaker:
Andres Freund
System:
PostgreSQL
Video:
YouTube

PostgreSQL traditionally has handled IO in a fairly minimal way, relying on the operating system more than most other databases. This talk will discuss why PostgreSQL mostly got away with that so far, why current hardware trends (NVMe with very high bandwidth / low latency, cloud storage with high latency but good random / concurrent read behaviour) require changing course...

Apr 25 2022
[Vaccination 2022] RonDB: A Key-Value Store with SQL Capabilities and LATS Properties (Mikael Ronström)
Speaker:
Mikael Ronström
System:
RonDB
Video:
YouTube

RonDB is a key-value store with SQL capabilities and LATS (Latency/Availability/Throughput/ScalableStorage) properties. It is based on MySQL NDB Cluster, which is used in extremely available applications such as universal data storage for mobile operators for many billions of subscribers. It is also used in gaming applications, financial applications and other areas. The main focus of RonDB in Hopsworks is as...

Apr 18 2022
[Vaccination 2022] Velox: An Open-source Unified Execution Engine (Deepak Majeti)
Speaker:
Deepak Majeti
System:
Velox
Video:
YouTube

Data keeps getting bigger, processing keeps getting more and more complex, but the hardware does not get faster. We need to reconsider efficiency from the ground up. While these data processing systems handle various workloads (e.g. “batch”, “analytical”, “streaming”, “AI/ML”), they employ common features such as functions, joins, filter-pushdown, sorting, grouping, projections, etc. A shared library that provides optimized implementations...

Apr 11 2022
[Vaccination 2022] QuestDB: Fast Open Source Time Series Database (Vlad Ilyushchenko)
Speaker:
Vlad Ilyushchenko
System:
QuestDB
Video:
YouTube

In this talk, we will discuss major technical challenges developers face when dealing with time series data and QuestDB's design principles that are meant to solve these challenges. We will then go through QuestDB's performance-focused architecture and cover topics like storage model, transactions, in-order and out-of-order ingestion, concurrency control, and network interfaces. This talk is part of the Vaccination...

Apr 4 2022
[Vaccination 2022] Yellowbrick: An Elastic Data Warehouse on Kubernetes (Mark Cusack)
Speaker:
Mark Cusack
System:
Yellowbrick
Video:
YouTube

Yellowbrick is an elastic SQL data warehouse with a design centered on efficiency, high concurrency and performance. The database management system is composed of a set of Kubernetes-orchestrated containers. Kubernetes provides the single source of truth for system configuration and state, and manages all warehousing lifecycle operations. In this session, I'll provide an overview of Yellowbrick and its microservices architecture, and focus on...

Mar 28 2022
[Vaccination 2022] Design and Implementation of the RelationalAI Knowledge Graph Management System (Martin Bravenboer)
Speaker:
Martin Bravenboer
System:
RelationalAI
Video:
YouTube

RelationalAI is the next-generation database platform for new intelligent data applications based on relational knowledge graphs. The Relational Knowledge Graph Management System (KGMS) complements the modern data stack by allowing data applications to be implemented relationally and declaratively, leveraging knowledge/semantics for reasoning, graph analytics, relational machine learning, and mathematical optimization workloads. RelationalAI as a relational and cloud native system fits...

Mar 24 2022
Hyperscale Data Processing with Network-centric Designs (Qizhen Zhang)
Speaker:
Qizhen Zhang

Today's largest data processing workloads are hosted in cloud data centers. Due to exponential data growth and the end of Moore's Law, these workloads have ballooned to the hyperscale level, encompassing billions to trillions of data items per query spread across hundreds to thousands of servers connected by the data center network. These massive scales fundamentally challenge the designs of...

Mar 21 2022
[Vaccination 2022] Stardog Query Optimiser: Architecture and Cardinality Estimations for Graph Queries (Pavel Klinov)
Speaker:
Pavel Klinov
System:
Stardog
Video:
YouTube

Stardog is a commercial knowledge graph platform at the heart of which lies a graph database. It manages graph data in the form of RDF (Resource Description Framework) triples and natively implements the SPARQL 1.1 graph query language. This talk will present the general architecture of the query engine and then will delve deep into the internals of the query optimiser,...

Mar 14 2022
[Vaccination 2022] Open-source Change Data Capture With Debezium (Gunnar Morling)
Speaker:
Gunnar Morling
System:
Debezium
Video:
YouTube

Change Data Capture (CDC) is one big enabler for your data; by reacting to changes in your database in "real-time", CDC comes in handy for implementing a wide range of use cases, such as low-latency data updates from OLTP data stores to OLAP systems, caches, or search indexes, data exchange between microservices, building audit logs, and many more. In this...

Mar 7 2022
[Vaccination 2022] ApertureDB: Designing a Purpose-built System for Visual Data and Data Science (Vishakha Gupta)
Speaker:
Vishakha Gupta
System:
ApertureDB
Video:
YouTube

Data science and ML techniques can help understand visual content and enable better customer experience across domains, in turn driving the exponential growth in the amount of visual data. Managing large amounts of visual data (images or videos) is extremely time consuming, frustrating, and inefficient due to a lack of data management solutions designed with visual data or data science...