- Aerospike
- Akamas
- AlloyDB
- ApertureDB
- Arrow
- Azure Cosmos DB
- BigQuery
- Bodo
- Cassandra
- Chroma
- Citus
- CockroachDB
- Convex
- CrateDB
- Databricks
- Datometry
- dbt
- Delta Lake
- Dremio
- DSQL
- DVMS
- EraDB
- eXtremeDB
- Fauna
- Featureform
- Firebolt
- Fluss
- Gaia
- GlareDB
- GoogleSQL
- GreptimeDB
- Heron
- Hudi
- Impala
- Jepsen
- Kinetica
- LanceDB
- Litestream
- Malloy
- MariaDB
- MemSQL
- Modin
- MongoDB
- MotherDuck
- MySQL
- Neon
- Noria
- OceanBase
- Oracle
- Oxla
- ParadeDB
- Pinot
- PlanetScale
- PostgresML
- PRQL
- QMDB
- QuestDB
- Redshift
- RisingWave
- Rockset
- rqlite
- Samza
- SingleStore
- sled
- Smooth
- SpacetimeDB
- SpiceDB
- SplinterDB
- SQL Server
- SQLite
- Stardog
- Striim
- Swarm64
- Technical University of Munich
- TiDB
- TileDB
- Tokutek
- TopK
- turbopuffer
- Velox
- VillageSQL
- VoltDB
- Weaviate
- XTDB
- YugabyteDB
- AirFlow
- Alibaba
- Anna
- APOLLO
- Aurora DSQL
- Berkeley DB
- BlazingDB
- Brytlyt
- Chaos Mesh
- Chronon
- ClickHouse
- Confluent
- CouchDB
- CrocodileDB
- DataFusion
- Datomic
- Debezium
- Dolt
- Druid
- DuckDB
- EdgeDB
- Exon
- FASTER
- FeatureBase
- Feldera
- Fluree
- FoundationDB
- Gel
- Google Spanner
- Greenplum
- HarperDB
- HorizonDB
- Iceberg
- InfluxDB
- kdb
- ksqlDB
- LeanStore
- LMDB
- MapD
- Materialize
- Milvus
- MonetDB
- Mooncake
- Multigres
- Napa
- NoisePage
- NuoDB
- OpenDAL
- OtterTune
- OxQL
- Pinecone
- Pixeltable
- Polaris
- PostgreSQL
- Qdrant
- QuasarDB
- RavenDB
- RelationalAI
- RocksDB
- RonDB
- SalesForce
- ScyllaDB
- Sirius
- SLOG
- Snowflake
- Spice.ai
- Splice Machine
- SQL Anywhere
- SQLancer
- SQream
- StarRocks
- Summingbird
- Synnada
- TerminusDB
- TigerBeetle
- TimescaleDB
- TonicDB
- Trino
- Umbra
- Vertica
- Vitesse
- Vortex
- WiredTiger
- Yellowbrick
Oct 28
2021
Rethinking Systems for Data-Intensive Computing (Matei Zaharia)
- Speaker:
- Matei Zaharia
A growing fraction of applications today, from basic business processing to machine learning, are data-intensive: they need to correctly process and produce massive datasets that are too large for any human to inspect. These applications pose many systems challenges, from programming interfaces, to monitoring and debugging (how can a human make sure these applications are working well?), to performance. I’ll...
Oct 25
2021
[Vaccination 2021] An Overview of the Starburst Trino Query Optimizer (Karol Sobczak)
- Speaker:
- Karol Sobczak
- System:
- Trino
- Video:
- YouTube
Starburst unlocks the value of data by making it fast and easy to analyze anywhere. Starburst queries data across any database, making it instantly actionable for organizations. With Starburst, teams can lower the total cost of their infrastructure and analytics investments, use the tools that work best for their business, and our open source Trino roots prevent data lock-in. Trusted...
Oct 18
2021
[Vaccination 2021] Reinventing Amazon Redshift (Ippokratis Pandis)
- Speaker:
- Ippokratis Pandis
- System:
- Redshift
- Video:
- YouTube
In 2013, eight years ago, Amazon Web Services revolutionized the data warehousing industry by launching Amazon Redshift, the first fully managed, petabyte-scale cloud data warehouse solution. Amazon Redshift made it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools. This launch was a significant leap from the traditional on-premise data warehousing solutions which were...
Oct 11
2021
[Vaccination 2021] How to Count Things with dbt (Drew Banin)
- Speaker:
- Drew Banin
- System:
- dbt
- Video:
- YouTube
Modern organizations leverage machine learning, data science, and AI to build predictive, responsive, and personalized applications. BUT! Most are bad at counting things. That's where dbt comes in. dbt is an open source framework used to define, test, and document datasets. In this talk, we will discuss the what, why, and how behind dbt and data warehousing in the year...
Oct 4
2021
[Vaccination 2021] Bodo: Automatic HPC Performance and Scaling for Data Processing in Python (Ehsan Totoni)
- Speaker:
- Ehsan Totoni
- System:
- Bodo
- Video:
- YouTube
Python is the language of choice for machine learning (ML) and AI, but SQL has been used for data processing for decades. Many data applications are often a mix of the two languages, which makes development and deployment cumbersome for data teams. BodoSQL addresses the "two-language" problem by compiling Python and SQL code together, providing type checking, error checking, end-to-end...
Sep 27
2021
[Vaccination 2021] The TileDB Universal Database (Stavros Papadopoulos)
- Speaker:
- Stavros Papadopoulos
- System:
- TileDB
- Video:
- YouTube
TileDB makes data management universal by modeling all types of data (tables, images, video, genomics, LiDAR and many more) as multi-dimensional arrays. TileDB enables storage on any backend and offers extreme interoperability via numerous language APIs, SQL databases and data science tools. It also takes data sharing, monetization and computation to extreme scale via its powerful serverless architecture. In this...
Sep 20
2021
[Vaccination 2021] Google Napa: Powering Scalable Data Warehousing with Robust Query Performance (Jagan Sankaranarayanan + Indrajit Roy)
- Speakers:
- Jagan Sankaranarayanan, Indrajit Roy
- System:
- Napa
- Video:
- YouTube
Napa powers Google’s data warehouse needs for critical clients like Ads and payments. These clients have differing requirements around cost, performance, and data freshness, including a strong expectation of variance-free, robust query performance. At its core, Napa’s principal technologies for robust query performance include the aggressive use of materialized views, which are maintained consistently as new data is ingested across...
Sep 13
2021
[Vaccination 2021] rqlite – The Distributed Database Built on Raft and SQLite (Philip O’Toole)
- Speaker:
- Philip O’Toole
- System:
- rqlite
- Video:
- YouTube
rqlite is a lightweight, distributed database which uses SQLite as its database engine. This presentation will discuss its goals, design, and implementation, with particular reference to its use of the Raft consensus algorithm, and its embedding of SQLite. We will also discuss rqlite testing, performance, lessons learned during development, and some of its real-world applications. This talk is part of...
Aug 18
2021
PhD Defense: Self-Driving Database Management Systems: Forecasting, Modeling, and Planning (Lin Ma)
- Speaker:
- Lin Ma
Database management systems (DBMSs) are an important part of modern data-driven applications. However, they are notoriously difficult to deploy and administer because they have many aspects that one can change that affect their performance, including database physical design and system configuration. There are existing methods that recommend how to change these aspects of databases for an application. But most of...
Aug 9
2021
MS Thesis Defense: Code Generation Log Replay for In-memory Database Management Systems (Tianlei Pan)
- Speaker:
- Tianlei Pan
Code generation is a widely-used technique for improving query execution throughput by compiling instructions into native code. This technique, however, leads to design challenges for the recovery system of a DBMS. The log replay process will be disconnected from the built-in execution engine that has been modified to operate efficiently on compiled code. This usually leads to the implementation of...