DERBY-3981: Improve distribution of hash codes in SQLBinary and SQLChar


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.4.2.0
    • Fix Version/s: 10.5.1.1
    • Component/s: SQL
    • Labels:
      None
    • Issue & fix info:
      Newcomer
    • Bug behavior facts:
      Performance

      Description

      SQLBinary.hashCode() and SQLChar.hashCode() use a very simple algorithm that just sums the values in the array. This distributes hash values poorly: similar values are more likely to map to the same hash code (for example, two arrays that hold the same elements in a different order always collide), and the higher bits are not used unless the array is very long. We should change these methods to use an algorithm similar to the one in java.lang.String.hashCode(), described here: <URL:http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode()>. This may improve the performance of hash scans, because it reduces the likelihood of collisions in the hash table.
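
      To illustrate the difference described above, here is a minimal sketch that compares an additive hash with a polynomial hash in the style of java.lang.String.hashCode(). The class HashSketch and the methods sumHash and polyHash are hypothetical names chosen for this example; this is not the Derby code and not the attached patch.

```java
import java.nio.charset.StandardCharsets;

public class HashSketch {

    /** Additive hash: simply sums the elements (hypothetical stand-in for the old approach). */
    static int sumHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h += b;
        }
        return h;
    }

    /** Polynomial hash in the style of String.hashCode(): h = 31 * h + element. */
    static int polyHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h = 31 * h + b;
        }
        return h;
    }

    public static void main(String[] args) {
        byte[] a = "abc".getBytes(StandardCharsets.US_ASCII);
        byte[] b = "cba".getBytes(StandardCharsets.US_ASCII);

        // Same bytes in a different order: the additive hash cannot tell them apart.
        System.out.println("sum:  " + sumHash(a) + " vs " + sumHash(b));
        // The polynomial hash weights each position, so the order matters.
        System.out.println("poly: " + polyHash(a) + " vs " + polyHash(b));
    }
}
```

      With the additive hash, a short array can only reach a small range of values (at most 127 times its length for ASCII bytes), whereas multiplying by 31 at each step spreads the result into the higher bits after only a few elements.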

        Attachments

        1. distinct-test.diff
          7 kB
          Knut Anders Hatlen
        2. d3981.diff
          3 kB
          Knut Anders Hatlen

            People

            • Assignee:
              Knut Anders Hatlen (knutanders)
            • Reporter:
              Knut Anders Hatlen (knutanders)
