  1. Hive
  2. HIVE-21531

Vectorization: all NULL hashcodes are not computed using Murmur3

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.1.1, 4.0.0
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: None
    • Labels: None

    Description

      The comments in the vectorized hash computation refer to the older MurmurHash implementation (the one using the constant 0x5bd1e995), while the non-vectorized code path refers to Murmur3 (the one using 0xcc9e2d51).

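      The comment on the method below is wrong: it names HashCodeUtil.murmurHash and MurmurHash.hash, but the body actually calls Murmur3.hash32 with seed 0.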

        /**
         * Batch compute the hash codes for all the serialized keys.
         *
         * NOTE: MAJOR MAJOR ASSUMPTION:
         *     We assume that HashCodeUtil.murmurHash produces the same result
         *     as MurmurHash.hash with seed = 0 (the method used by ReduceSinkOperator for
         *     UNIFORM distribution).
         */
        protected void computeSerializedHashCodes() {
          int offset = 0;
          int keyLength;
          byte[] bytes = output.getData();
          for (int i = 0; i < nonNullKeyCount; i++) {
            keyLength = serializedKeyLengths[i];
            hashCodes[i] = Murmur3.hash32(bytes, offset, keyLength, 0);
            offset += keyLength;
          }
        }
      

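      However, the vectorized ReduceSink (RS) operator follows the incorrect comment: it hashes the all-NULL key with HashCodeUtil.calculateBytesHashCode instead of Murmur3.hash32, so NULL keys get a hash code from a different function than non-NULL keys.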

            System.arraycopy(nullKeyOutput.getData(), 0, nullBytes, 0, nullBytesLength);
            nullKeyHashCode = HashCodeUtil.calculateBytesHashCode(nullBytes, 0, nullBytesLength);
      
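      To illustrate why the mismatch matters, here is a minimal, self-contained sketch that hashes the same bytes with both algorithms. It reimplements the public 32-bit MurmurHash2 (constant 0x5bd1e995) and Murmur3 (constant 0xcc9e2d51) routines directly, so it is not Hive's own HashCodeUtil or Murmur3 code. The class name, method names, and the one-byte key are hypothetical placeholders, and the real serialized form of an all-NULL key is not reproduced here.

```java
// Sketch only: standalone versions of the two hash families named in this issue.
// Names and the sample key are illustrative assumptions, not Hive code.
final class MurmurMismatchSketch {

    // Read four bytes as a little-endian int.
    private static int le32(byte[] d, int p) {
        return (d[p] & 0xff) | ((d[p + 1] & 0xff) << 8)
             | ((d[p + 2] & 0xff) << 16) | ((d[p + 3] & 0xff) << 24);
    }

    // 32-bit MurmurHash2, the variant built around 0x5bd1e995.
    static int murmur2(byte[] data, int offset, int length, int seed) {
        final int m = 0x5bd1e995;
        int h = seed ^ length;
        int nblocks = length >> 2;
        for (int i = 0; i < nblocks; i++) {
            int k = le32(data, offset + (i << 2));
            k *= m;
            k ^= k >>> 24;
            k *= m;
            h *= m;
            h ^= k;
        }
        int tail = offset + (nblocks << 2);
        switch (length & 3) { // intentional fall-through over the trailing bytes
            case 3: h ^= (data[tail + 2] & 0xff) << 16;
            case 2: h ^= (data[tail + 1] & 0xff) << 8;
            case 1: h ^= (data[tail] & 0xff);
                    h *= m;
        }
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // 32-bit Murmur3 (x86_32 variant), built around 0xcc9e2d51 and 0x1b873593.
    static int murmur3(byte[] data, int offset, int length, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int nblocks = length >> 2;
        for (int i = 0; i < nblocks; i++) {
            int k = le32(data, offset + (i << 2));
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        int tail = offset + (nblocks << 2);
        int k = 0;
        switch (length & 3) { // intentional fall-through over the trailing bytes
            case 3: k ^= (data[tail + 2] & 0xff) << 16;
            case 2: k ^= (data[tail + 1] & 0xff) << 8;
            case 1: k ^= (data[tail] & 0xff);
                    k *= c1;
                    k = Integer.rotateLeft(k, 15);
                    k *= c2;
                    h ^= k;
        }
        // Finalization mix.
        h ^= length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        byte[] placeholderKey = {0}; // hypothetical stand-in for serialized key bytes
        int h2 = murmur2(placeholderKey, 0, placeholderKey.length, 0);
        int h3 = murmur3(placeholderKey, 0, placeholderKey.length, 0);
        System.out.printf("murmur2=%08x murmur3=%08x equal=%b%n", h2, h3, h2 == h3);
    }
}
```

      If the two functions disagree on the same bytes, as they do for this placeholder key, then a hash code computed for the all-NULL key by one function cannot be assumed to match what the other code path would compute.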

      Attachments

        1. HIVE-21531.WIP.patch
          3 kB
          Gopal Vijayaraghavan
        2. HIVE-21531.1.patch
          26 kB
          Gopal Vijayaraghavan
        3. HIVE-21531.1.patch
          26 kB
          Gopal Vijayaraghavan
        4. HIVE-21531.2.patch
          29 kB
          Gopal Vijayaraghavan

          People

            Assignee: Gopal Vijayaraghavan (gopalv)
            Reporter: Gopal Vijayaraghavan (gopalv)