Quarantine DB Talk 2020: Apache Arrow Flight: Accelerating Columnar Dataset Transport
In this talk I will discuss the role that Apache Arrow and Arrow Flight play in providing a faster and more efficient approach to building data services that transport large datasets. We’ll look at the technical details of why the Arrow protocol is an attractive choice and at specific examples of where Arrow has been employed for better performance and resource efficiency. Finally, I will discuss the implications for databases and the upcoming generation of data systems.
This talk is part of the Quarantine Database Tech Talk Seminar Series.
Wes McKinney is an open source software developer focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow, his current focus. He authored two editions of the reference book "Python for Data Analysis". Wes is a Member of The Apache Software Foundation and also a PMC member for Apache Parquet. He is the director of Ursa Labs, an open source development group focused on data science tools for Python and R powered by Apache Arrow, built in partnership with RStudio. Previously, he worked for Two Sigma, Cloudera, and AQR Capital Management, and he was co-founder and CEO of the startup DataPad.