[PDL Visit Day 2015] Tirthankar Lahiri (Oracle)
The Oracle Database In-Memory Option allows Oracle to function as the industry-first dual-format in-memory database. Row formats are ideal for OLTP workloads, which typically use indexes to limit their data access to a small set of rows, while column formats are better suited to analytic operations, which typically examine a small number of columns from a large number of rows. Since no single data format is ideal for all types of workloads, our approach was to allow data to be simultaneously maintained in both formats with strict transactional consistency between them. The new columnar format is a pure in-memory format with no impact on the on-disk representation. Tables required for fast analytics can be populated into the In-Memory column store. In-memory columnar formats allow a variety of optimizations, including various levels of compression, SIMD vector processing, and in-memory storage indexes. The Oracle in-memory column format thus achieves scan speeds of billions of rows per second per CPU core. Furthermore, the greatly accelerated scan speed enables a variety of query optimizations such as In-Memory Aggregation (Vector Group By), a real-time computation of cube aggregations that converts costly joins and aggregations into a series of filtered scans. The in-memory column store can be scaled out on a RAC cluster, with additional high availability via in-memory duplication. OLTP updates and highly selective lookups on the same tables can be performed via the existing in-memory row store, i.e. the buffer cache. This allows the DBA to drop indexes required purely for analytics and use the column store instead. Dropping analytic indexes may provide a substantial speedup for OLTP DML operations by eliminating costly index maintenance.
The in-memory column store is seamlessly built into the Oracle Database engine, ensuring that all of the rich functionality and High Availability mechanisms of the Oracle Database work transparently with Database In-Memory.
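To make the storage-index idea in the abstract concrete, here is a toy sketch in Python. It is not Oracle's implementation: the names ColumnChunk, build_chunks, and count_equal, the chunk size, and the sample data are all assumptions made for illustration. It extracts one column from row-format records, splits it into chunks that each carry a min/max summary, and shows how a filtered scan can skip chunks whose value range cannot contain the predicate value.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ColumnChunk:
    """A slice of one column plus a min/max summary (a toy storage index)."""
    values: List[int]
    lo: int
    hi: int


def build_chunks(values: List[int], chunk_size: int) -> List[ColumnChunk]:
    """Split a column into fixed-size chunks and record each chunk's min/max."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        part = values[start:start + chunk_size]
        chunks.append(ColumnChunk(part, min(part), max(part)))
    return chunks


def count_equal(chunks: List[ColumnChunk], target: int) -> Tuple[int, int]:
    """Count values equal to target; return (matches, chunks actually scanned)."""
    matches = 0
    scanned = 0
    for chunk in chunks:
        # Prune: the target cannot occur in a chunk whose range excludes it.
        if target < chunk.lo or target > chunk.hi:
            continue
        scanned += 1
        matches += sum(1 for v in chunk.values if v == target)
    return matches, scanned


# Row format: one record per row, as an OLTP workload would read and write it.
rows = [{"order_id": i, "region": i // 250, "amount": i % 7} for i in range(1000)]

# Column format: a single attribute stored contiguously for analytic scans.
region_column = [row["region"] for row in rows]

chunks = build_chunks(region_column, chunk_size=100)
matches, scanned = count_equal(chunks, target=2)
print(f"{matches} matching rows, scanned {scanned} of {len(chunks)} chunks")
```

In this sketch the scan touches only the chunks whose min/max range includes the target value, and reads only the one column it needs rather than whole rows. A real engine would additionally apply compression and SIMD comparisons within each chunk, which this example does not attempt to model.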
Tirthankar Lahiri is Vice President of Development at Oracle, and is responsible for the Data Technologies area for the Oracle Database (this area covers Data, Space, and Transaction management) as well as the Oracle TimesTen In-Memory Database. Tirthankar has 18 years of experience in the Database industry. He has worked extensively on a variety of core Database Systems areas, for which he holds multiple patents: Manageability, Performance, Scalability, High Availability, Caching, Distributed Concurrency Control, In-Memory Data Management, etc. Tirthankar has a B.Tech in Computer Science from IIT, Kharagpur, and an MS in Electrical Engineering from Stanford University. He was in the PhD program at Stanford, and his research areas included Multiprocessor Operating Systems and Semi-Structured Data.