  1. Hive
  2. HIVE-4160 Vectorized Query Execution in Hive
  3. HIVE-4613

Improve cache friendliness of VectorHashKeyWrapper


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: vectorization-branch
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      1) The implementation of VectorHashKeyWrapper uses an array of primitives even when there is only one key, which implies one extra pointer chase. For a single-key group by we can use a more efficient implementation that stores the key in a primitive field. The drawback is that VectorHashKeyWrapper becomes abstract and the API uses virtual functions. Note that the most critical function is .equals, which is already virtual.
      2) Make the bucket list comparison more cache friendly. I expect this to be critical for performance because of the number of calls on bucket collisions. If we ensure that instances with equal hash codes are allocated together (e.g. using array allocations in batches), then we can get some benefit from allocation proximity (TLB, NUMA).
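
A minimal sketch of point 1, assuming long-typed keys only. The names KeyWrapperSketch, SingleLongKey and MultiLongKey are hypothetical and are not the actual Hive classes; the real VectorHashKeyWrapper handles several primitive types and has a different API. The sketch only illustrates the trade-off: the single-key variant keeps the key in a field of the object itself, while the general variant has to follow a reference to an array before it can compare.

```java
import java.util.Arrays;

// Hypothetical abstract base; callers only see equals/hashCode, which are virtual anyway.
abstract class KeyWrapperSketch {
    // Picks the specialization from the number of keys (assumption: long keys only).
    static KeyWrapperSketch of(long... keys) {
        return keys.length == 1
                ? new SingleLongKey(keys[0])
                : new MultiLongKey(keys.clone());
    }
}

// Single-key case: the key lives inline in the object, so equals needs no array dereference.
final class SingleLongKey extends KeyWrapperSketch {
    private final long key;

    SingleLongKey(long key) { this.key = key; }

    @Override public int hashCode() { return Long.hashCode(key); }

    @Override public boolean equals(Object o) {
        return o instanceof SingleLongKey && ((SingleLongKey) o).key == key;
    }
}

// General case: keys sit behind an array reference, which costs one extra pointer chase.
final class MultiLongKey extends KeyWrapperSketch {
    private final long[] keys;

    MultiLongKey(long[] keys) { this.keys = keys; }

    @Override public int hashCode() { return Arrays.hashCode(keys); }

    @Override public boolean equals(Object o) {
        return o instanceof MultiLongKey && Arrays.equals(((MultiLongKey) o).keys, keys);
    }
}
```

Whether the saved dereference outweighs the extra subclass is exactly the kind of question that would need measuring before committing to this design.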

      I'll leave these as minor since we have no evidence at the moment that the issue actually exists, nor any way to measure the impact of a fix.
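
The batch allocation idea in point 2 could look roughly like the sketch below. Everything here is hypothetical (PooledKey, BucketedKeyPool, the per-bucket batching scheme); it is not taken from the Hive code base. It also rests on an assumption the JVM does not guarantee: that objects created back to back tend to end up near each other in memory. The sketch creates key objects for each hash bucket in batches, so that keys which will be compared on a collision are allocated together.

```java
// Hypothetical mutable key object handed out by the pool.
final class PooledKey {
    long key;
    int hash;
}

// Hypothetical pool: one batch of pre-allocated keys per hash bucket.
final class BucketedKeyPool {
    private final PooledKey[][] batches; // current batch for each bucket
    private final int[] next;            // next free slot in each bucket's batch
    private final int batchSize;

    BucketedKeyPool(int numBuckets, int batchSize) {
        if (numBuckets <= 0 || batchSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.batches = new PooledKey[numBuckets][];
        this.next = new int[numBuckets];
        this.batchSize = batchSize;
    }

    // Bucket index for a key; floorMod keeps the result non-negative.
    int bucketOf(long key) {
        return Math.floorMod(Long.hashCode(key), batches.length);
    }

    // Hands out a key object from the batch that belongs to the key's bucket,
    // allocating a fresh batch in one go when the current one is used up.
    PooledKey allocate(long key) {
        int b = bucketOf(key);
        if (batches[b] == null || next[b] == batchSize) {
            PooledKey[] fresh = new PooledKey[batchSize];
            for (int i = 0; i < batchSize; i++) {
                fresh[i] = new PooledKey();
            }
            batches[b] = fresh;
            next[b] = 0;
        }
        PooledKey k = batches[b][next[b]++];
        k.key = key;
        k.hash = Long.hashCode(key);
        return k;
    }
}
```

Any locality gain from this layout depends on the garbage collector and allocator in use, which fits the caveat above that the impact cannot currently be measured.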

          People

            Assignee: Remus Rusanu (rusanu)
            Reporter: Remus Rusanu (rusanu)