Events

Nov 10 2025, 04:30pm EST
[Future Data] Mooncake: Real-Time Apache Iceberg Without Compromise
Speaker: Cheng Chen
System: Mooncake

Apache Iceberg is great for large-scale analytics, but it was built for batch workloads. For streaming use cases, keeping tables fresh means writing snapshots more often, which creates excess small Parquet files, bloated metadata, and costly compaction that never ends. Updates and deletes make things worse because equality deletes push the burden to query engines, leaving readers slow and inefficient. ...

Nov 11 2025, 12:00pm EST, GHC 8115
[Fall 2025] Open Data Infrastructure with Iceberg and dbt
Speaker: Connor McArthur
System: dbt

Apache Iceberg is now interoperable with most modern data platforms and compute systems. While Iceberg enables powerful new capabilities, real-world adoption still presents challenges for many organizations. In this talk, we will unpack Iceberg's architecture; demonstrate a novel architecture where multiple compute systems connect to the same underlying Iceberg catalog; and discuss the maturity and continued investment needed to ensure...

Nov 17 2025, 04:30pm EST

Nov 24 2025, 04:30pm EST

Dec 1 2025, 04:30pm EST
[Future Data] From Storage Formats to Open Governance: The Evolution to Apache Polaris
Speaker: Prashant Singh
System: Polaris

As organizations build their data lakehouses on Apache Iceberg, the primary challenge shifts from managing individual files to orchestrating a cohesive ecosystem of tables. How can you guarantee consistency and enable complex operations when multiple data engines, such as Spark, Trino, and Flink, need to interact with the same data concurrently? The answer lies in a standardized service layer, defined by the Iceberg...