Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-9434

Implement / Evaluate Robin Hood Hashing for exec/hash-table.h

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 3.4.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels:
    • Epic Color:
      ghx-label-7

      Description

      exec/hash-table.h's HashTable currently implements an open addressing hash table with linear and quadratic probing. Since there are few guarantees about the probe length, we limit the load factor to 0.75 to avoid high probe lengths. In general, probe lengths increase severely as the load factor goes higher than that.

      Robin Hood Hashing reduces the variances of probe lengths by continually rebalancing elements. It steals from the "rich" elements that have a very short probe length and gives to the "poor" elements with a long probe length. This tighter guarantee about the probe length allows for a higher load factor (0.9).

      We should evaluate Robin Hood Hashing to see if it improves our hash table performance. Since this hash table does not support deletes, that reduces the complexity. The Bucket already has some unused bytes in case we need them.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                rizaon Riza Suminto
                Reporter:
                joemcdonnell Joe McDonnell
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: