[Future Data] Mooncake: Real-Time Apache Iceberg Without Compromise
- Speaker:
- Cheng Chen
- Date:
- Mon Nov 10, 2025 @ 04:30pm EDT
- Location:
- https://cmu.zoom.us/j/96274590594?pwd=ZIhPZi8CFwaVd5kN9sS5uEiuWanTCa.1 (Zoom)
- Title:
- Mooncake: Real-Time Apache Iceberg Without Compromise
- System:
- Mooncake
Talk Info:
Apache Iceberg is great for large-scale analytics, but it was built for batch workloads. For streaming use cases, keeping tables fresh means writing snapshots more often, which creates excess small Parquet files, bloated metadata, and costly compaction that never ends. Updates and deletes make things worse because equality deletes push the burden to query engines, leaving readers slow and inefficient.
Mooncake adds a real-time layer to Iceberg. It supports streaming writes and mirroring from relational databases with sub-second latency. It also provides continuous optimization, caching, and indexing for fast, user-facing analytics, while remaining fully compatible with the Iceberg spec.
This talk is part of the Future Data Systems Seminar Series.
Bio:
Cheng is a co-founder of Mooncake Labs (recently acquired by Databricks). He now works on the Lakebase team at Databricks, helping integrate Postgres with the lakehouse. Previously, he led the Search & Extension team at SingleStore and worked on various aspects of query execution.