Events

[Future Data] An Extremely Technical Overview of how the Apache Iceberg™ Planning Implementation Actually Works

Speaker:
Russell Spitzer
Date:
Mon Sep 22, 2025 @ 04:30pm EDT
Location:
Zoom: https://cmu.zoom.us/j/96274590594?pwd=ZIhPZi8CFwaVd5kN9sS5uEiuWanTCa.1
Title:
An Extremely Technical Overview of how the Apache Iceberg™ Planning Implementation Actually Works
System:
Iceberg
Video:
YouTube

Talk Info:

What are you trying to tell me? That I can read data fast? No, User. I'm trying to tell you that when you are ready, you won't have to.

Everyone's heard about how fast Apache Iceberg is, and maybe you've even heard a few notes about "predicate pushdown" and "file metrics", but you've been left wanting more. You want to know the nitty-gritty of how a predicate from a query engine is actually transformed and applied to Iceberg metadata. In this talk we will show you just that: we'll work through the actual code of the Iceberg project, showing how the metadata is read, how predicates are transformed, and finally how file tasks are actually broken up and sent to execution engines. We'll talk in detail about all of the properties that control Iceberg planning and see the classes in which those parameters actually take effect. This is definitely more detail than any user of an Apache Iceberg table would actually need to know, but don't you want to join the small group of developers who know how it actually works?

This talk is part of the Future Data Systems Seminar Series.

Bio:

Russell Spitzer received his Ph.D. from UCSF after performing a lot of comparisons of protein binding sites. Following that, he became deeply invested in distributed computing and joined DataStax, a company using Apache Cassandra. While working at DataStax, he was a key contributor to the DataStax Spark-Cassandra Connector and also worked on many other Apache projects. After leaving DataStax, he worked at Apple growing the then-nascent Apache Iceberg project, where he worked on data file management and advancing the table format. Currently, Russell is working on open source software at Snowflake and is a PMC member of the Apache Iceberg project and a PPMC member of the Apache Polaris (Incubating) project.