Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-6819

Correctness issue with Hive limit operator & predicate push down

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Duplicate
    • 0.12.0
    • 0.13.0
    • Query Processor
    • None

    Description

      Following query produces 0 rows with Predicate Push Down optimization turned on; the same query produces 130 rows with predicate push down turned off.

      SELECT t2.c_int FROM (select key, value, c_float, c_int from t1 ORDER BY key, value, c_float, c_int LIMIT 10) t1 JOIN t2 on t1.c_int=t2.c_int and t1.c_float=t2.c_float WHERE t2.c_int>=1;
      

      I could reproduce this on Apache Trunk.
      Haven't checked if previous releases have the same issue.

      hive> desc t1;
      Query ID = jpullokkaran_20140401191515_36e441c6-074b-45ae-aff6-489e13a6f401
      OK
      key string
      value string
      c_int int
      c_float float
      c_boolean boolean
      Time taken: 0.077 seconds, Fetched: 5 row(s)
      hive> select distinct key, value, c_float, c_int from t1;
      OK
      1 1 1.0 1
      1 1 1.0 1
      1 1 1.0 1
      1 1 1.0 1
      null null NULL NULL
      Time taken: 0.062 seconds, Fetched: 5 row(s)
      hive> desc t2;
      Query ID = jpullokkaran_20140401191616_dfbd14bb-b5b8-4165-8d01-e9a61a7f1c33
      OK
      key string
      value string
      c_int int
      c_float float
      c_boolean boolean
      Time taken: 0.062 seconds, Fetched: 5 row(s)
      hive> select distinct key, value, c_float, c_int from t2;
      OK
      1 1 1.0 1
      1 1 1.0 1
      1 1 1.0 1
      1 1 1.0 1
      2 2 2.0 2
      null null NULL NULL
      Time taken: 4.698 seconds, Fetched: 6 row(s)
      hive> select t2.c_int from (select key, value, c_float, c_int from t1 order by key,value,c_float,c_int limit 10)t1 join t2 on t1.c_int=t2.c_int and t1.c_float=t2.c_float where t2.c_int>=1;
      MapredLocal task succeeded
      OK
      Time taken: 13.029 seconds
      hive>
      hive> select t2.c_int from (select key, value, c_float, c_int from t1 order by key,value,c_float,c_int limit 10)t1 join t2 on t1.c_int=t2.c_int and t1.c_float=t2.c_float where t2.c_int>=1;
      MapredLocal task succeeded
      OK
      ...
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      1
      Time taken: 9.317 seconds, Fetched: 130 row(s)
      hive>

      Attachments

        1. HIVE-6819.1.patch
          3 kB
          Harish Butani

        Issue Links

          Activity

            People

              jpullokkaran Laljo John Pullokkaran
              jpullokkaran Laljo John Pullokkaran
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: