Hive / HIVE-22239

Scale data size using column value ranges

Details

    Description

      Currently, the min/max values of a column are used only to determine whether a range filter lies entirely outside, or entirely covers, the column's value range, in which case it filters all rows or none at all. In every other case, we fall back to a heuristic that the condition will filter 1/3 of the input rows. Instead of that fixed heuristic, we can assume that the data is uniformly distributed between the column's min and max values and calculate the selectivity of the condition from the fraction of that range the filter covers.
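
      The uniform-distribution estimate described above can be sketched as follows. This is an illustrative sketch only, not the code in the attached patches: the function name `range_selectivity`, its parameters, and the constant `DEFAULT_SELECTIVITY` are hypothetical, the column is treated as continuous, and the 1/3 heuristic is read here as the fraction of rows retained, which is an assumption.

```python
from typing import Optional

# Fallback used when min/max statistics are unavailable.
# Assumption: the 1/3 heuristic is interpreted as the fraction of rows kept.
DEFAULT_SELECTIVITY = 1.0 / 3.0


def range_selectivity(col_min: Optional[float],
                      col_max: Optional[float],
                      lower: Optional[float] = None,
                      upper: Optional[float] = None) -> float:
    """Estimate the fraction of rows with lower <= value <= upper,
    assuming values are uniformly spread over [col_min, col_max].
    Hypothetical helper; a bound of None means the filter is open on that side."""
    if col_min is None or col_max is None:
        return DEFAULT_SELECTIVITY

    # Clamp the filter range to the column's value range.
    lo = col_min if lower is None else max(lower, col_min)
    hi = col_max if upper is None else min(upper, col_max)

    if lo > hi:
        return 0.0  # filter range lies entirely outside the column range
    if col_max == col_min:
        return 1.0  # single-valued column whose value is inside the filter
    return (hi - lo) / (col_max - col_min)
```

      For example, with a column ranging from 0 to 100 and the condition `col <= 25`, `range_selectivity(0, 100, upper=25)` gives 0.25, so an input of 1000 rows would be scaled to an estimated 250 rows rather than the fixed 1/3.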

      Attachments

        1. HIVE-22239.patch
          144 kB
          jcamachorodriguez
        2. HIVE-22239.06.patch
          1.03 MB
          jcamachorodriguez
        3. HIVE-22239.05.patch
          1.03 MB
          jcamachorodriguez
        4. HIVE-22239.05.patch
          1.03 MB
          jcamachorodriguez
        5. HIVE-22239.04.patch
          1.76 MB
          jcamachorodriguez
        6. HIVE-22239.04.patch
          1.76 MB
          jcamachorodriguez
        7. HIVE-22239.03.patch
          1.76 MB
          jcamachorodriguez
        8. HIVE-22239.02.patch
          1.95 MB
          jcamachorodriguez
        9. HIVE-22239.01.patch
          1.91 MB
          jcamachorodriguez

            People

              Assignee: Jesús Camacho Rodríguez (jcamacho)
              Reporter: Jesús Camacho Rodríguez (jcamacho)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 5h 20m