  1. Hive
  2. HIVE-20873

Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision

Details

    Description

      VectorHashKeyWrapperTwoLong computes its hash code with only a few bit-shift and XOR operations. This keeps the computation time short but produces more hash collisions, so group-by operations become very slow on large data sets. It needs Murmur hash, or another better hash function, to reduce hash collisions.
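
      To illustrate the trade-off described above, here is a minimal sketch in Java. It is not the actual Hive code: the class TwoLongHashSketch and the methods shiftXorHash, murmurStyleHash, and distinctHashes are hypothetical names chosen for this example, and the "Murmur-style" variant only borrows the well-known 64-bit finalizer (fmix64) from MurmurHash3 rather than reproducing whatever the patch does. It counts how many distinct hash codes each approach yields for a grid of small key pairs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the VectorHashKeyWrapperTwoLong implementation.
final class TwoLongHashSketch {

    // Assumed example of a cheap hash: one XOR to combine, one shift/XOR to fold.
    static int shiftXorHash(long a, long b) {
        long x = a ^ b;
        return (int) (x ^ (x >>> 32));
    }

    // MurmurHash3 64-bit finalizer (fmix64): spreads every input bit across the word.
    static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Assumed combination scheme: mix the first key, fold in the second, mix again.
    static int murmurStyleHash(long a, long b) {
        long h = fmix64(fmix64(a) ^ b);
        return (int) (h ^ (h >>> 32));
    }

    // Counts distinct hash codes over all pairs (a, b) with 0 <= a, b < n.
    static int distinctHashes(boolean murmurStyle, int n) {
        Set<Integer> seen = new HashSet<>();
        for (long a = 0; a < n; a++) {
            for (long b = 0; b < n; b++) {
                seen.add(murmurStyle ? murmurStyleHash(a, b) : shiftXorHash(a, b));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int n = 256; // 65,536 distinct key pairs
        System.out.println("shift/XOR distinct hashes:    " + distinctHashes(false, n));
        System.out.println("Murmur-style distinct hashes: " + distinctHashes(true, n));
    }
}
```

      With small keys, the shift/XOR variant collapses to a ^ b, so the 65,536 key pairs share only 256 hash codes. The mixed variant should spread them over nearly as many codes as there are pairs, at the cost of two extra multiplications per mixing step.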

      Attachments

        1. HIVE-20873.1.patch
          6 kB
          Teddy Choi
        2. HIVE-20873.2.patch
          5 kB
          Teddy Choi
        3. HIVE-20873.3.patch
          6 kB
          Teddy Choi

          People

            teddy.choi Teddy Choi
            teddy.choi Teddy Choi
            Votes: 0
            Watchers: 4

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 20m