Talk type: Talk
Pragmatic Code Generation for Efficient Execution
The Volcano-style iterator interface provides a simple yet powerful abstraction for operator evaluation. This model worked well when systems were bottlenecked on disk I/O. However, now that most systems have large amounts of main memory, execution time is largely determined by the raw CPU throughput of operator evaluation. Most modern query engines employ either code generation or vectorization to overcome the overhead of virtual function calls imposed by the Volcano interface. While both techniques have been shown to be competitive in performance, each has its own challenges.
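To make the contrast concrete, here is a minimal sketch in Python, not taken from any particular engine. The class names (`Operator`, `Scan`, `Filter`, `Project`), the helpers `run_volcano` and `run_fused`, and the sample query are all hypothetical and chosen for illustration. The first half evaluates a plan through a Volcano-style `next()` interface, paying one dynamic call per operator per row. The second half is the kind of single fused loop a code generator might emit for the same query.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple  # a row is modeled as a plain tuple (assumption for this sketch)


class Operator:
    """Volcano-style operator: next() returns one row, or None when exhausted."""

    def open(self) -> None:
        pass

    def next(self) -> Optional[Row]:
        raise NotImplementedError

    def close(self) -> None:
        pass


class Scan(Operator):
    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.pos = 0

    def open(self) -> None:
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter(Operator):
    def __init__(self, child: Operator, pred: Callable[[Row], bool]) -> None:
        self.child = child
        self.pred = pred

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        # Pull rows from the child until one satisfies the predicate.
        while True:
            row = self.child.next()
            if row is None or self.pred(row):
                return row


class Project(Operator):
    def __init__(self, child: Operator, fn: Callable[[Row], Row]) -> None:
        self.child = child
        self.fn = fn

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        row = self.child.next()
        return None if row is None else self.fn(row)


def run_volcano(root: Operator) -> List[Row]:
    """Drive the plan one row at a time through the iterator interface."""
    root.open()
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    root.close()
    return out


def run_fused(rows: List[Row]) -> List[Row]:
    """Hand-written stand-in for generated code: SELECT a * 2 FROM t WHERE a > 1."""
    out = []
    for a, _b in rows:
        if a > 1:  # filter and projection fused into one loop
            out.append((a * 2,))
    return out


rows = [(1, "x"), (2, "y"), (3, "z")]
plan = Project(Filter(Scan(rows), lambda r: r[0] > 1), lambda r: (r[0] * 2,))
volcano_result = run_volcano(plan)
fused_result = run_fused(rows)
```

Both paths compute the same result. The difference is that the fused loop has no per-row dispatch between operators, which is the overhead that code generation and vectorization each try to remove in their own way.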
In this talk, we will discuss the design of NetSpring’s code-generation-based execution engine for efficient operator evaluation. We’ll begin with a primer on SQL engines, cover the differences between vectorized and code-generation-based systems, and then propose a middle ground that draws ideas from both paradigms to attain the best of both worlds. We’ll also discuss the practical challenges of code generation in production systems and techniques for overcoming them.