Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Won't Fix
Affects Version/s: Impala 2.2.4
Fix Version/s: None
Description
Use AVX to accelerate the bitmap filter phase of the join operation.
The micro-benchmark uses AVX instructions to fetch, in SIMD fashion, the bitmap values that correspond to a set of indices. The micro-benchmark configuration is as follows:
Array length = 10000000 (using bitmap to filter this array)
GCC Version: gcc version 4.9.2 (Ubuntu 4.9.2-0ubuntu1~12.04)
Compile flag: -O3 -mavx
We tested two implementations: one using a 64-bit data length and the other using a 32-bit data length. The runtime is measured in seconds; smaller values are better.
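For reference, the lookup being accelerated can be sketched in scalar form as below. This is only an illustration, not the benchmark code from this issue: the `Bitmap` type, the `CountHits` function, and the use of `uint32_t` indices are all assumptions made for the example. The `Word` template parameter stands in for the 32-bit and 64-bit data lengths mentioned above; an AVX version would perform the same word-index and bit-offset computation for several indices at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: a bitmap stored as an array of unsigned words.
// Word is uint32_t or uint64_t, mirroring the two data lengths tested.
template <typename Word>
struct Bitmap {
  static constexpr std::size_t kBitsPerWord = sizeof(Word) * 8;
  std::vector<Word> words;

  explicit Bitmap(std::size_t num_bits)
      : words((num_bits + kBitsPerWord - 1) / kBitsPerWord, 0) {}

  // Set bit i to 1.
  void Set(std::size_t i) {
    assert(i / kBitsPerWord < words.size());
    words[i / kBitsPerWord] |= Word{1} << (i % kBitsPerWord);
  }

  // Return the value of bit i.
  bool Get(std::size_t i) const {
    assert(i / kBitsPerWord < words.size());
    return (words[i / kBitsPerWord] >> (i % kBitsPerWord)) & Word{1};
  }
};

// Scalar filter: look up the bit for each index and count the hits.
template <typename Word>
std::size_t CountHits(const Bitmap<Word>& bitmap,
                      const std::vector<std::uint32_t>& indices) {
  std::size_t hits = 0;
  for (std::uint32_t idx : indices) {
    hits += bitmap.Get(idx) ? 1 : 0;
  }
  return hits;
}
```

Instantiating `Bitmap<std::uint32_t>` and `Bitmap<std::uint64_t>` gives two variants that differ only in word width, which is the dimension the two tested implementations vary.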
Results
scalar | packet32 | packet64 |
---|---|---|
9619.94 | 9089.52 | 9127.87 |
9617.52 | 9086.75 | 9124.53 |
9617.74 | 9088.79 | 9123.69 |
9616.09 | 9088.36 | 9135.85 |
9617.28 | 9089.15 | 9127.88 |
9623.66 | 9097.62 | 9126.26 |
9621.24 | 9119.68 | 9123.97 |
9618.38 | 9092.66 | 9123.72 |
9621.83 | 9094.18 | 9124.65 |
9625.51 | 9110.22 | 9137.34 |
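To compare the columns, one option is to average the ten runs and compute the runtime reduction relative to the scalar baseline. The sketch below does that with the numbers from the table; the function names `Mean` and `RuntimeReduction` and the constant names are chosen here for illustration only.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of runtimes.
double Mean(const std::vector<double>& xs) {
  assert(!xs.empty());
  return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// Fractional runtime reduction of `candidate` relative to `baseline`,
// e.g. 0.05 means the candidate's mean runtime is 5% lower.
double RuntimeReduction(const std::vector<double>& baseline,
                        const std::vector<double>& candidate) {
  return 1.0 - Mean(candidate) / Mean(baseline);
}

// Runtimes copied from the results table.
const std::vector<double> kScalar = {9619.94, 9617.52, 9617.74, 9616.09,
                                     9617.28, 9623.66, 9621.24, 9618.38,
                                     9621.83, 9625.51};
const std::vector<double> kPacket32 = {9089.52, 9086.75, 9088.79, 9088.36,
                                       9089.15, 9097.62, 9119.68, 9092.66,
                                       9094.18, 9110.22};
const std::vector<double> kPacket64 = {9127.87, 9124.53, 9123.69, 9135.85,
                                       9127.88, 9126.26, 9123.97, 9123.72,
                                       9124.65, 9137.34};
```

On these ten runs, both AVX variants have a mean runtime roughly 5% below the scalar version, with packet32 slightly ahead of packet64.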