  1. Apache Drill
  2. DRILL-5827

Create tests for detecting skew in hash codes generated by Hash Functions

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      There have been a few instances where the hash function used by Drill produced skewed results on particular data sets. It would be good to create tests that detect skew in the hash codes produced by the hash functions. This will help catch regressions caused by changes to how hash functions are used or implemented. The tests can generate data on the fly, for example:
      1) A set of random numbers
      2) A set of randomly generated strings
      3) A set of random strings with the same prefix
      4) A set of random strings with the same suffix
      5) A set of continuous numbers

      The tests should also cover the problematic data sets found during the investigations in DRILL-4237, DRILL-5816, and DRILL-4119.
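
      A minimal sketch of what such a test could look like, not Drill's actual test code. It buckets the generated keys by hash code and reports the ratio of the fullest bucket to the mean bucket size (1.0 is perfectly even). Everything named here is hypothetical: the class HashSkewCheck, its helper methods, the bucket count, and the use of Java's built-in hashCode as a stand-in for Drill's hash functions, which a real test would call instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical helper, not part of Drill: measures how unevenly a hash
// function spreads a key set across a fixed number of buckets.
final class HashSkewCheck {

  /** Ratio of the fullest bucket to the mean bucket size; 1.0 means perfectly even. */
  static <T> double maxToMeanRatio(List<T> keys, ToIntFunction<T> hash, int buckets) {
    int[] counts = new int[buckets];
    for (T key : keys) {
      // floorMod keeps the bucket index non-negative for negative hash codes.
      counts[Math.floorMod(hash.applyAsInt(key), buckets)]++;
    }
    int max = 0;
    for (int c : counts) {
      max = Math.max(max, c);
    }
    return max / ((double) keys.size() / buckets);
  }

  /** Data set 1: random numbers. */
  static List<Integer> randomNumbers(Random rnd, int n) {
    List<Integer> out = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      out.add(rnd.nextInt());
    }
    return out;
  }

  /** Data set 5: continuous numbers start, start + 1, ... */
  static List<Integer> continuousNumbers(int start, int n) {
    List<Integer> out = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      out.add(start + i);
    }
    return out;
  }

  /** Data sets 2 to 4: random lowercase bodies wrapped in a fixed prefix and suffix (either may be empty). */
  static List<String> randomStrings(Random rnd, int n, int bodyLen, String prefix, String suffix) {
    List<String> out = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      StringBuilder sb = new StringBuilder(prefix);
      for (int j = 0; j < bodyLen; j++) {
        sb.append((char) ('a' + rnd.nextInt(26)));
      }
      out.add(sb.append(suffix).toString());
    }
    return out;
  }

  public static void main(String[] args) {
    Random rnd = new Random(42); // fixed seed so a failing run can be reproduced
    int n = 100_000;
    int buckets = 64;            // assumed bucket count, chosen only for illustration
    // Stand-ins for the hash functions under test (assumption: not Drill's functions).
    ToIntFunction<Integer> intHash = k -> k.hashCode();
    ToIntFunction<String> strHash = String::hashCode;

    System.out.println("random numbers:     " + maxToMeanRatio(randomNumbers(rnd, n), intHash, buckets));
    System.out.println("random strings:     " + maxToMeanRatio(randomStrings(rnd, n, 12, "", ""), strHash, buckets));
    System.out.println("same prefix:        " + maxToMeanRatio(randomStrings(rnd, n, 12, "prefix_", ""), strHash, buckets));
    System.out.println("same suffix:        " + maxToMeanRatio(randomStrings(rnd, n, 12, "", "_suffix"), strHash, buckets));
    System.out.println("continuous numbers: " + maxToMeanRatio(continuousNumbers(0, n), intHash, buckets));
  }
}
```

      A real test would assert that each ratio stays below a chosen threshold. The right threshold depends on the key count and bucket count, so it is left open here. A chi-square statistic over the bucket counts would be a stricter alternative to the max-to-mean ratio.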

          People

            Assignee: Unassigned
            Reporter: Sorabh Hamirwasia (shamirwasia)
            Votes: 0
            Watchers: 1
