Ippokratis Pandis (Cloudera)
The Cloudera Impala project is pioneering the next generation of Hadoop capabilities: the convergence of fast SQL queries with the capacity, scalability, and flexibility of a Hadoop cluster. With Impala, the academic and Hadoop communities now have an open-source codebase for querying data stored in HDFS and Apache HBase in real time, using familiar SQL syntax. In contrast with other SQL-on-Hadoop initiatives, Impala’s operations are fast enough to run interactively on native Hadoop data rather than as long-running batch jobs.
This talk starts out with an overview of Impala from the user’s perspective, followed by a presentation of Impala’s architecture and implementation. It concludes with a summary of Impala’s benefits when compared with the available SQL-on-Hadoop alternatives.
Ippokratis Pandis is a software engineer at Cloudera working on the Impala project. Before joining Impala and Cloudera, Ippokratis was a member of the research staff at IBM Almaden Research Center. At IBM, he was a member of the core team that designed and implemented the BLU column-store engine, which currently ships as part of IBM's DB2 LUW v10.5 with BLU Acceleration. Ippokratis received his PhD from the Electrical and Computer Engineering department at Carnegie Mellon University. He is the recipient of Best Demonstration awards at ICDE 2006 and SIGMOD 2011 and a Best Paper runner-up award at CIDR 2013. He is serving or has served as PC chair of DaMoN 2015 and DaMoN 2014 and as PC area chair for CIKM 2014.