Extreme Performance – DB2 BLU Acceleration
November 11, 2013
Michael Kwok, Ph.D.; Senior Manager, DB2 LUW Warehouse and BLU Performance
A lot higher.
DB2 with BLU Acceleration can easily speed up an analytic workload by 8 to 25 times!
Are you serious?
I lead the DB2 warehouse performance team at the IBM Toronto Lab, responsible for the performance of BLU Acceleration. My team, together with the greatest minds from research and development, delivers the extreme performance you’ll find in BLU Acceleration.
I have personally witnessed order-of-magnitude speed-ups in analytic workloads using BLU Acceleration compared to traditional row-based databases. I even saw queries running 1,000 times faster, going from minutes to seconds. At first, I thought these queries had hit some sort of error that caused them to complete almost instantly. But after I checked, all of them returned the correct results. It was just amazing!
So what makes DB2 with BLU Acceleration so fast? In short, the innovative, dynamic in-memory columnar technologies are responsible for the extreme performance.
First of all, DB2 with BLU Acceleration massively improves I/O efficiency. We perform I/O only on the columns involved in a query. A new prefetching algorithm reads relevant data into memory before it is accessed, and the new compression technique delivers substantial storage savings, which in turn significantly reduces I/O.
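The column-only I/O idea can be illustrated with a toy sketch. Here a table is held as one Python list per column (hypothetical names and layout, not the DB2 on-disk format); a scan touches only the columns a query references, so a wide, unreferenced column costs nothing:

```python
# A minimal sketch of column-organized access, assuming a toy table held
# as one Python list per column (hypothetical names; not DB2's real format).
sales = {
    "order_id": [1, 2, 3, 4],
    "region":   ["EAST", "WEST", "EAST", "NORTH"],
    "amount":   [120.0, 75.5, 310.0, 42.0],
    "comment":  ["...", "...", "...", "..."],  # wide column many queries never need
}

def column_scan(table, needed_columns, predicate):
    """Touch only the columns the query references; 'comment' is never read."""
    cols = {name: table[name] for name in needed_columns}  # column-only access
    n = len(next(iter(cols.values())))
    return [i for i in range(n)
            if predicate({c: cols[c][i] for c in cols})]

# SELECT ... WHERE region = 'EAST' AND amount > 100 reads just two columns.
rows = column_scan(sales, ["region", "amount"],
                   lambda r: r["region"] == "EAST" and r["amount"] > 100)
```

In a row-organized layout, the same scan would have to read every column of every row off disk, including the wide `comment` values.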
Our columnar technologies also deliver massive improvements in memory and cache efficiency. BLU Acceleration never spends memory, cache space or bandwidth on unneeded columns. Data is kept compressed in memory and packed into cache-friendly structures during processing. On top of the more effective prefetching algorithm, a new scan-friendly victim selection algorithm keeps a near-optimal set of data buffered in memory.
Our new compression technique makes data a lot smaller both on disk and in memory. A patented technology preserves order so that the data can be used without decompressing. We call it “actionable compression.” In other words, we can now work on predicate evaluation (e.g., =, <>, <, >, …), joins and aggregation directly on the compressed data. Imagine how many CPU cycles this can save.
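A simple way to see why order preservation matters is order-preserving dictionary encoding: if codes are assigned in value order, then comparing codes gives the same answer as comparing the original values, so range predicates can run on the compressed column directly. This sketch is illustrative only; DB2's actual encoding is a patented scheme, not a plain sorted dictionary:

```python
# A minimal sketch of order-preserving dictionary encoding, the idea behind
# "actionable compression": codes are assigned in sorted value order, so
# range predicates run on the codes without decompressing anything.
values = ["cherry", "apple", "banana", "apple", "date", "banana"]

# Build the dictionary in sorted order so code order mirrors value order.
dictionary = {v: code for code, v in enumerate(sorted(set(values)))}
encoded = [dictionary[v] for v in values]          # the compressed column

# Evaluate  value < 'cherry'  directly on the compressed codes.
limit = dictionary["cherry"]
matches = [i for i, code in enumerate(encoded) if code < limit]
```

Every row that matches is found by small-integer comparisons alone; the strings are only materialized if the query actually returns them.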
BLU Acceleration uses parallel vector processing, operating on vectors of column values at a time for superior memory efficiency. The runtime engine is automatically parallelized across cores and achieves excellent multi-core scalability; careful data placement and alignment, coupled with adaptation to the physical server's attributes, help achieve this. We also leverage Single Instruction, Multiple Data (SIMD) hardware instructions: a single instruction is applied to many data elements simultaneously in predicate evaluations, joins, groupings and arithmetic. This speeds up query processing substantially.
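The flavor of SIMD predicate evaluation can be imitated in plain Python with a "SIMD within a register" trick: pack eight small column codes into one 64-bit word and test `code < threshold` for all eight lanes with a single subtraction. This is only a sketch of the principle; BLU uses the CPU's real SIMD instructions, not this software emulation:

```python
# A minimal sketch of SIMD-style predicate evaluation: eight 7-bit codes
# packed into one 64-bit word, tested against a threshold in one subtract.
L = 0x0101010101010101          # the low bit of every byte lane
H = 0x8080808080808080          # the guard (high) bit of every byte lane

def pack(codes):
    """Pack eight codes (each 0..127) into one 64-bit word, lane 0 lowest."""
    word = 0
    for i, c in enumerate(codes):
        word |= c << (8 * i)
    return word

def lanes_less_than(word, n):
    """Return lane indexes where code < n (n <= 128), one subtract per word.

    Setting the guard bit before subtracting keeps borrows from crossing
    lanes: each lane ends up holding 0x80 + code - n, whose high bit is
    clear exactly when code < n.
    """
    diff = ((word | H) - n * L) & 0xFFFFFFFFFFFFFFFF
    lt = ~diff & H                   # high bit set in lanes where code < n
    return [i for i in range(8) if lt >> (8 * i) & 0x80]

codes = [5, 120, 99, 100, 0, 127, 64, 33]
hits = lanes_less_than(pack(codes), 100)
```

Eight comparisons collapse into a handful of word-wide integer operations; real SIMD units do the same with 128-bit or wider registers and dedicated compare instructions.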
Last but not least, data skipping contributes to this extreme performance. BLU Acceleration automatically creates a small data structure called a synopsis to store the minimum and maximum values for each page of column data. The synopsis lets us quickly skip pages that cannot qualify for a query, saving both I/O and CPU processing.
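Data skipping is easy to sketch: keep only a `(min, max)` pair per page of column values, and consult those pairs before reading any page. The page size and function names here are made up for illustration; DB2 maintains its synopsis at a much coarser granularity:

```python
# A minimal sketch of data skipping with a synopsis: per "page" of column
# values we keep only (min, max), and a scan consults the synopsis first
# so pages that cannot qualify are never read or evaluated.
PAGE_SIZE = 4   # tiny, for illustration only

def build_synopsis(column):
    pages = [column[i:i + PAGE_SIZE] for i in range(0, len(column), PAGE_SIZE)]
    return pages, [(min(p), max(p)) for p in pages]

def scan_equal(pages, synopsis, target):
    """Evaluate  value = target , skipping pages whose [min, max] excludes it."""
    hits, pages_read = [], 0
    for page_no, (lo, hi) in enumerate(synopsis):
        if target < lo or target > hi:
            continue                      # skip: no I/O, no CPU on this page
        pages_read += 1
        for offset, v in enumerate(pages[page_no]):
            if v == target:
                hits.append(page_no * PAGE_SIZE + offset)
    return hits, pages_read

column = [3, 7, 5, 2,   40, 45, 41, 44,   8, 6, 9, 4]
pages, synopsis = build_synopsis(column)
hits, pages_read = scan_equal(pages, synopsis, 6)   # middle page is skipped
```

Here the search for the value 6 never touches the middle page, because its synopsis entry `(40, 45)` proves no row there can match. On clustered or time-ordered data, most pages of a selective query can be skipped this way.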
BLU Acceleration is not just one technology or one idea, nor an ordinary columnar technology. It is a collection of many innovative technologies spanning research and development, optimized for CPU, memory and I/O alike.
See for yourself: