Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-288

Redundant computation in hashCode() implemenation

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed

      Description

      In some hashCode() implementation (e.g. Block.hashCode()), the formula has the form

      37 * 17 + f(x),
      

      where f( x ) is some function. Adding 37*17 to f( x ) shifts the value of f( x ). It is not effective to do shifting in hash functions. The computation is redundant.

      1. h288_20090820.patch
        2 kB
        Tsz Wo Nicholas Sze

        Issue Links

          Activity

          Tsz Wo Nicholas Sze created issue -
          Robert Chansler made changes -
          Field Original Value New Value
          Component/s dfs [ 12310710 ]
          Hide
          Steve Loughran added a comment -

          the whole 17/37 stuff comes from effective Java (page 37 , and some of the IDEs implement the feature, so its become quite common:

          http://www.google.com/codesearch?hl=en&q=+lang:java+%5C*37&start=20&sa=N

          Maybe the goal was to have 37 * (17 + f), but the braces were left out. Even then, I don't know whether is beneficial or just some form of superstition.

          Show
          Steve Loughran added a comment - the whole 17/37 stuff comes from effective Java (page 37 , and some of the IDEs implement the feature, so its become quite common: http://www.google.com/codesearch?hl=en&q=+lang:java+%5C*37&start=20&sa=N Maybe the goal was to have 37 * (17 + f ), but the braces were left out. Even then, I don't know whether is beneficial or just some form of superstition.
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HADOOP-5727 [ HADOOP-5727 ]
          Hide
          Shevek added a comment -

          the whole 17/37 stuff comes from effective Java (page 37)

          I believe my response in Hadoop-5272 explains why this is at least as much superstition as anything. It was particularly fashionable in around 1997 when the first drafts of many of these books were being written, but Sun have since rewritten many of the libraries which call hashCode to use superior Mersenne algorithms for distribution. You need only ensure that your hashCodes are generally distinct, rather than attempting to spread them out modulo some (assumed smallish) N.

          Show
          Shevek added a comment - the whole 17/37 stuff comes from effective Java (page 37) I believe my response in Hadoop-5272 explains why this is at least as much superstition as anything. It was particularly fashionable in around 1997 when the first drafts of many of these books were being written, but Sun have since rewritten many of the libraries which call hashCode to use superior Mersenne algorithms for distribution. You need only ensure that your hashCodes are generally distinct, rather than attempting to spread them out modulo some (assumed smallish) N.
          Owen O'Malley made changes -
          Project Hadoop Common [ 12310240 ] HDFS [ 12310942 ]
          Key HADOOP-3383 HDFS-288
          Component/s dfs [ 12310710 ]
          Hide
          Tsz Wo Nicholas Sze added a comment -

          h288_20090820.patch: remove 37 * 17 from a few hashCode() implementations.

          Show
          Tsz Wo Nicholas Sze added a comment - h288_20090820.patch: remove 37 * 17 from a few hashCode() implementations.
          Tsz Wo Nicholas Sze made changes -
          Attachment h288_20090820.patch [ 12417193 ]
          Tsz Wo Nicholas Sze made changes -
          Hadoop Flags [Incompatible change]
          Release Note Removed redundant computations from the hashCode() implementations of Block, GenerationStamp and DirectoryScanner.ScanInfo.
          Tsz Wo Nicholas Sze made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Assignee Tsz Wo (Nicholas), SZE [ szetszwo ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417193/h288_20090820.patch
          against trunk revision 806746.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417193/h288_20090820.patch against trunk revision 806746. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/81/console This message is automatically generated.
          Hide
          Tom White added a comment -

          I've just committed this. Thanks Nicholas!

          (Test failure was unrelated.)

          Show
          Tom White added a comment - I've just committed this. Thanks Nicholas! (Test failure was unrelated.)
          Tom White made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
          Fix Version/s 0.21.0 [ 12314046 ]
          Resolution Fixed [ 1 ]
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #16 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/16/)
          . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #16 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/16/ ) . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #75 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/75/)
          . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #75 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/75/ ) . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.
          Hide
          Hudson added a comment -

          Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #5 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/5/)
          . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.

          Show
          Hudson added a comment - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #5 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/5/ ) . Redundant computation in hashCode() implementation. Contributed by Tsz Wo (Nicholas), SZE.
          Hide
          Hudson added a comment -

          Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #21 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/21/)

          Show
          Hudson added a comment - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #21 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/21/ )
          Hide
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Show
          Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.
          Robert Chansler made changes -
          Release Note Removed redundant computations from the hashCode() implementations of Block, GenerationStamp and DirectoryScanner.ScanInfo.
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Patch Available Patch Available
          465d 1h 23m 1 Tsz Wo Nicholas Sze 22/Aug/09 00:00
          Patch Available Patch Available Resolved Resolved
          16d 12h 11m 1 Tom White 07/Sep/09 12:11
          Resolved Resolved Closed Closed
          351d 9h 36m 1 Tom White 24/Aug/10 21:47

            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development