- Aerospike
- Akamas
- AlloyDB
- ApertureDB
- Arrow
- Azure Cosmos DB
- BigQuery
- Bodo
- Cassandra
- Chroma
- Citus
- CockroachDB
- Convex
- CrateDB
- Databricks
- Datometry
- dbt
- Delta Lake
- Dremio
- DSQL
- DVMS
- EraDB
- eXtremeDB
- Fauna
- Featureform
- Firebolt
- Fluss
- Gaia
- GlareDB
- GoogleSQL
- GreptimeDB
- Heron
- Hudi
- Impala
- Jepsen
- Kinetica
- LanceDB
- Litestream
- Malloy
- MariaDB
- MemSQL
- Modin
- MongoDB
- MotherDuck
- MySQL
- Neon
- Noria
- OceanBase
- Oracle
- Oxla
- ParadeDB
- Pinot
- PlanetScale
- PostgresML
- PRQL
- QMDB
- QuestDB
- Redshift
- RisingWave
- Rockset
- rqlite
- Samza
- SingleStore
- sled
- Smooth
- SpacetimeDB
- SpiceDB
- SplinterDB
- SQL Server
- SQLite
- Stardog
- Striim
- Swarm64
- Technical University of Munich
- TiDB
- TileDB
- Tokutek
- TopK
- turbopuffer
- Velox
- VillageSQL
- VoltDB
- Weaviate
- XTDB
- YugabyteDB
- AirFlow
- Alibaba
- Anna
- APOLLO
- Aurora DSQL
- Berkeley DB
- BlazingDB
- Brytlyt
- Chaos Mesh
- Chronon
- ClickHouse
- Confluent
- CouchDB
- CrocodileDB
- DataFusion
- Datomic
- Debezium
- Dolt
- Druid
- DuckDB
- EdgeDB
- Exon
- FASTER
- FeatureBase
- Feldera
- Fluree
- FoundationDB
- Gel
- Google Spanner
- Greenplum
- HarperDB
- HorizonDB
- Iceberg
- InfluxDB
- kdb
- ksqlDB
- LeanStore
- LMDB
- MapD
- Materialize
- Milvus
- MonetDB
- Mooncake
- Multigres
- Napa
- NoisePage
- NuoDB
- OpenDAL
- OtterTune
- OxQL
- Pinecone
- Pixeltable
- Polaris
- PostgreSQL
- Qdrant
- QuasarDB
- RavenDB
- RelationalAI
- RocksDB
- RonDB
- SalesForce
- ScyllaDB
- Sirius
- SLOG
- Snowflake
- Spice.ai
- Splice Machine
- SQL Anywhere
- SQLancer
- SQream
- StarRocks
- Summingbird
- Synnada
- TerminusDB
- TigerBeetle
- TimescaleDB
- TonicDB
- Trino
- Umbra
- Vertica
- Vitesse
- Vortex
- WiredTiger
- Yellowbrick
Nov 17, 2025
Cortex AISQL: A Production SQL Engine for Unstructured Data
- Speaker: Anupam Datta
- System: Snowflake
Snowflake’s Cortex AISQL is a production SQL engine that integrates native semantic operations directly into SQL. This integration allows users to write declarative queries that combine relational operations with semantic reasoning, enabling them to query both structured and unstructured data effortlessly. However, making semantic operations efficient at production scale poses fundamental challenges. Semantic operations are more expensive than traditional SQL...
Nov 11, 2025
[Fall 2025] Open Data Infrastructure with Iceberg and dbt
- Speaker: Connor McArthur
- System: dbt
Apache Iceberg is now interoperable with most modern data platforms and compute systems. While Iceberg enables powerful new capabilities, real-world adoption still presents challenges for many organizations. In this talk, we will unpack Iceberg's architecture; demonstrate a novel architecture where multiple compute systems connect to the same underlying Iceberg catalog; and discuss the maturity and continued investment needed to ensure...
Nov 10, 2025
[Future Data] Mooncake: Real-Time Apache Iceberg Without Compromise
- Speaker: Cheng Chen
- System: Mooncake
- Video: YouTube
Apache Iceberg is great for large-scale analytics, but it was built for batch workloads. For streaming use cases, keeping tables fresh means writing snapshots more often, which creates excess small Parquet files, bloated metadata, and costly compaction that never ends. Updates and deletes make things worse because equality deletes push the burden to query engines, leaving readers slow and inefficient....
Nov 4, 2025
Real Time Analytics Query Architecture Evolution @ Uber (Ankit Sultana)
- Speaker: Ankit Sultana
- System: Pinot
- Video: YouTube
We will talk about how Apache Pinot's query feature set has grown tremendously over the past few years and how that growth has shaped Uber's Real Time Analytics Query Architecture. We will dive into the different query engines in Apache Pinot and briefly discuss our legacy and unique Presto over Pinot architecture.
Nov 3, 2025
[Future Data] Multi-statement Transactions in the Databricks Lakehouse
- Speaker: Ryan Johnson
- System: Delta Lake
- Video: YouTube
The data lake architecture originally focused on self-standing tables in cloud storage, with catalogs as mere discovery aids. Modern lakehouse architectures add an ever-growing set of data warehousing capabilities to that original value proposition. Historically a key missing piece was multi-statement transactions -- Delta Lake supported single-statement single-table transactions, with ACID properties for changes made to that table. Sophisticated MERGE...
Nov 3, 2025
Transactions and Coordination in Aurora DSQL
- Speaker: Marc Brooker
- System: DSQL
Aurora DSQL is a new global, serverless, scalable relational database system, built at AWS. In this talk, I’ll dive into the architecture of DSQL, how it handles transactions, and how and why it was designed to minimize coordination. We’ll touch on transaction protocols, isolation, and virtualization.
Oct 27, 2025
[Future Data] Storage Metadata for Modern Cloud Databases
- Speaker: Joyo Victor
- System: SingleStore
- Video: YouTube
In modern database architecture, separating compute from storage unlocks powerful capabilities. Our tiered storage, “bottomless”, started by uploading files to remote object storage. This worked well until we wanted to create database branches pointing to the same remote storage. One branch does not know if it can delete a file that another branch depends on. To solve this, we built...
Oct 21, 2025
[Fall 2025] Astronomer / Apache AirFlow Tech Talk
- Speaker: Julian LaNeve
- System: AirFlow
Apache Airflow is the most popular data orchestration tool there is, downloaded over 40m times per month and used to power the data, ML, and AI platforms at OpenAI, Lyft, Airbnb, Uber, and Apple. At its core, Airflow allows you to define data workflows as DAGs using Python. We’ll do a deep dive on how Airflow came to be and...
Oct 20, 2025
[Future Data] Where We’re Going, We Don’t Need Rows: Columnar Data Connectivity with ADBC
- Speaker: Ian Cook
- System: Arrow
- Video: YouTube
ADBC (Arrow Database Connectivity) is Apache Arrow’s answer to ODBC and JDBC: It’s a database access API and driver standard that delivers data in Arrow columnar format instead of a row-oriented format. ADBC is on a roll, speeding and simplifying data access for dbt, Databricks, DuckDB, Microsoft, Snowflake, and more. This talk presents the architecture of ADBC (APIs, drivers, driver...
Oct 13, 2025
[Future Data] Vortex: LLVM for File Formats
- Speaker: Will Manning
- System: Vortex
- Video: YouTube
Apache Parquet revolutionized columnar storage after its initial release in 2013, but has largely failed to evolve since then. As a result, nearly every Tier 1 tech company has built their own columnar format to replace Parquet. Enter Vortex, a Linux Foundation project that currently achieves 100x faster random access, 10-20x faster scans, and 5x higher write throughput, while maintaining...