Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-288

Redundant computation in hashCode() implemenation

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed

      Description

      In some hashCode() implementation (e.g. Block.hashCode()), the formula has the form

      37 * 17 + f(x),
      

      where f( x ) is some function. Adding 37*17 to f( x ) shifts the value of f( x ). It is not effective to do shifting in hash functions. The computation is redundant.

      1. h288_20090820.patch
        2 kB
        Tsz Wo Nicholas Sze

        Issue Links

          Activity

          Hide
          steve_l added a comment -

          the whole 17/37 stuff comes from effective Java (page 37 , and some of the IDEs implement the feature, so its become quite common:

          http://www.google.com/codesearch?hl=en&q=+lang:java+%5C*37&start=20&sa=N

          Maybe the goal was to have 37 * (17 + f), but the braces were left out. Even then, I don't know whether is beneficial or just some form of superstition.

          Show
          steve_l added a comment - the whole 17/37 stuff comes from effective Java (page 37 , and some of the IDEs implement the feature, so its become quite common: http://www.google.com/codesearch?hl=en&q=+lang:java+%5C*37&start=20&sa=N Maybe the goal was to have 37 * (17 + f ), but the braces were left out. Even then, I don't know whether is beneficial or just some form of superstition.
          Hide
          Shevek added a comment -

          the whole 17/37 stuff comes from effective Java (page 37)

          I believe my response in Hadoop-5272 explains why this is at least as much superstition as anything. It was particularly fashionable in around 1997 when the first drafts of many of these books were being written, but Sun have since rewritten many of the libraries which call hashCode to use superior Mersenne algorithms for distribution. You need only ensure that your hashCodes are generally distinct, rather than attempting to spread them out modulo some (assumed smallish) N.

          Show
          Shevek added a comment - the whole 17/37 stuff comes from effective Java (page 37) I believe my response in Hadoop-5272 explains why this is at least as much superstition as anything. It was particularly fashionable in around 1997 when the first drafts of many of these books were being written, but Sun have since rewritten many of the libraries which call hashCode to use superior Mersenne algorithms for distribution. You need only ensure that your hashCodes are generally distinct, rather than attempting to spread them out modulo some (assumed smallish) N.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h288_20090820.patch: remove 37 * 17 from a few hashCode() implementations.

          Show
          Tsz Wo Nicholas Sze added a comment - h288_20090820.patch: remove 37 * 17 from a few hashCode() implementations.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417193/h288_20090820.patch
          against trunk revision 806746.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417193/h288_20090820.patch against trunk revision 806746. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/console This message is automatically generated.
          Hide
          Tom White added a comment -

          I've just committed this. Thanks Nicholas!

          (Test failure was unrelated.)

          Show
          Tom White added a comment - I've just committed this. Thanks Nicholas! (Test failure was unrelated.)
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #16 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/16/)
          . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #16 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/16/ ) . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #75 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/75/)
          . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #75 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/75/ ) . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.
          Hide
          Hudson added a comment -

          Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #5 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/5/)
          . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.

          Show
          Hudson added a comment - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #5 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/5/ ) . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.
          Hide
          Hudson added a comment -

          Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #21 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/21/)

          Show
          Hudson added a comment - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #21 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/21/ )
          Hide
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Show
          Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.

            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development