Hadoop Common / HADOOP-2365

Result of HashFunction.hash() contains all identical values


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None

    Description

      There is a small bug at HashFunction:112: initvalue should change between loop iterations so that the hash values spread over the whole allowed range. Instead, the current code uses a fixed initvalue = 0, so every entry in the result array is identical. As a result, BloomFilters have an extremely high rate of false positives.
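
      The effect described above can be reproduced with a minimal sketch. This is not the Hadoop code or any of the attached patches: the class name HashSpreadSketch, the method names, and the FNV-1a-style seededHash are all hypothetical stand-ins for whatever seeded hash the real HashFunction calls. The sketch assumes that one reasonable way to vary the seed is to feed each hash result back in as the next initval.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HashSpreadSketch {

    // Hypothetical seeded hash (FNV-1a style), used only for illustration.
    // It stands in for the real seeded hash; it is not Hadoop's implementation.
    static int seededHash(byte[] data, int initval) {
        int h = 0x811C9DC5 ^ initval; // FNV-1a offset basis mixed with the seed
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;          // FNV-1a 32-bit prime
        }
        return h;
    }

    // Buggy pattern: the seed is reset to 0 on every iteration,
    // so every slot of the result holds the same value.
    static int[] hashFixedSeed(byte[] key, int nbHash, int maxValue) {
        int[] result = new int[nbHash];
        for (int i = 0; i < nbHash; i++) {
            int initval = 0;
            result[i] = Math.floorMod(seededHash(key, initval), maxValue);
        }
        return result;
    }

    // Assumed fix: carry the previous hash forward as the next seed,
    // so each iteration hashes the key with a different initval.
    static int[] hashChainedSeed(byte[] key, int nbHash, int maxValue) {
        int[] result = new int[nbHash];
        int initval = 0;
        for (int i = 0; i < nbHash; i++) {
            initval = seededHash(key, initval);
            result[i] = Math.floorMod(initval, maxValue);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] key = "hadoop".getBytes(StandardCharsets.UTF_8);
        System.out.println("fixed seed:   " + Arrays.toString(hashFixedSeed(key, 8, 1000)));
        System.out.println("chained seed: " + Arrays.toString(hashChainedSeed(key, 8, 1000)));
    }
}
```

      With the fixed seed, a Bloom filter configured for nbHash positions effectively sets and tests a single bit per key, which is consistent with the very high false-positive rate reported here.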

      Attachments

        1. hash-v1.patch
          0.7 kB
          Andrzej Bialecki
        2. patch.txt
          0.6 kB
          Jim Kellerman
        3. patch.txt
          2 kB
          Jim Kellerman

          People

            Assignee: Jim Kellerman (jimk)
            Reporter: Andrzej Bialecki (ab)