Hive / HIVE-17555

StatsUtils considers all ranges to be 'long', and loses precision / introduces bugs in some cases

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Statistics
    • Labels: None

    Description

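      The following test fails: StatsUtils.combineRange treats the range bounds as long values, so the fractional bounds are truncated and the combined range comes out as [0:0].

      The same problem is present in other methods of StatsUtils as well.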

      package org.apache.hadoop.hive.ql.stats;
      
      import static org.junit.Assert.assertTrue;
      
      import org.apache.hadoop.hive.ql.plan.ColStatistics.Range;
      import org.junit.Test;
      
      public class TestStatsUtils {
      
        @Test
        public void test1() {
          Range r1 = new Range(0.1f, 0.4f);
          Range r2 = new Range(0.3f, 0.9f);
          assertTrue(rangeContains(r1, 0.2f));
          Range r3 = StatsUtils.combineRange(r1, r2);
          System.out.println(r3);
          assertTrue(rangeContains(r3, 0.2f));
        }
      
        private boolean rangeContains(Range range, Number f) {
          double m = range.minValue.doubleValue();
          double M = range.maxValue.doubleValue();
          double v = f.doubleValue();
          return m <= v && v <= M;
        }
      
      }
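
      The [0:0] result can be reproduced without Hive on the classpath. The sketch below is a minimal stand-in, not Hive's code: the class LongTruncationDemo and the helper combineAsLong are hypothetical names, and the helper only assumes what the issue title states, namely that the bounds are read through Number.longValue() before being combined.

```java
public class LongTruncationDemo {

    // Hypothetical stand-in for a long-based range combination.
    // Assumption: bounds are converted with Number.longValue(), as the issue title suggests.
    static long[] combineAsLong(Number min1, Number max1, Number min2, Number max2) {
        long lo = Math.min(min1.longValue(), min2.longValue());
        long hi = Math.max(max1.longValue(), max2.longValue());
        return new long[] {lo, hi};
    }

    public static void main(String[] args) {
        // Same bounds as r1 and r2 in the test above.
        long[] r = combineAsLong(0.1f, 0.4f, 0.3f, 0.9f);
        // Every bound lies strictly between 0 and 1, so each longValue() is 0.
        System.out.println("[" + r[0] + ":" + r[1] + "]"); // prints [0:0]
    }
}
```

      Because Float.longValue() discards the fractional part, all four bounds collapse to 0, and 0.2f can no longer fall inside the combined range.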
      

      https://github.com/apache/hive/blob/32e854ef1c25f21d53f7932723cfc76bf75a71cd/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1955
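
      For comparison, here is a sketch of a combination that keeps fractional bounds by working in double instead of long. It is only an illustration of what the test expects, not a proposed patch: DoubleRangeSketch, combineAsDouble and contains are hypothetical names, and the handling of disjoint or null ranges is deliberately left out.

```java
public class DoubleRangeSketch {

    // Hypothetical helper: combine two ranges using double arithmetic,
    // so fractional bounds such as 0.1f survive the combination.
    static double[] combineAsDouble(Number min1, Number max1, Number min2, Number max2) {
        double lo = Math.min(min1.doubleValue(), min2.doubleValue());
        double hi = Math.max(max1.doubleValue(), max2.doubleValue());
        return new double[] {lo, hi};
    }

    // Same containment check as rangeContains in the test, on a {min, max} pair.
    static boolean contains(double[] range, Number value) {
        double v = value.doubleValue();
        return range[0] <= v && v <= range[1];
    }

    public static void main(String[] args) {
        double[] r3 = combineAsDouble(0.1f, 0.4f, 0.3f, 0.9f);
        System.out.println(contains(r3, 0.2f)); // prints true
    }
}
```

      The trade-off is that double cannot represent every long exactly (values above 2^53 lose precision), so a type-aware approach would be needed to cover both integral and fractional columns.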

          People

            Assignee: kgyrtkirk Zoltan Haindrich
            Reporter: kgyrtkirk Zoltan Haindrich