Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-5964

Hive missing a filter predicate causing wrong results joining tables after sort by

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • 0.10.0, 0.11.0
    • 0.13.0
    • Query Processor
    • None

    Description

      It seems like the optimization of predicate pushdown is failing under certain conditions causing wrong results as a filter predicate appears to be completely disregarded by the query processor for some reason.

      Here is the scenario (assuming "dual" table exists) :

      set hive.optimize.ppd=true;
      drop table if exists test_tbl ;
      create table test_tbl (id string,name string);

      insert into table test_tbl
      select 'a','b' from dual;

      test_tbl now contains :
      a b

      the following query :
      select t2.*
      from
      (select id,name from (select id,name from test_tbl) t1 sort by id) t2
      join test_tbl t3 on (t2.id=t3.id )
      where t2.name='c' and t3.id='a';

      returns :
      a b

      The filter :" t2.name='c' " is missing from the execution plan and obviously doesn't apply.
      The filter "t3.id='a' " does appear in the plan and is being applied before the join.

      If the query changes a little bit like removing the sort by, removing the t1 sub-query or disabling hive.optimize.ppd then the predicate appears.

      I'm able to reproduce the problem both in Hive 0.10 and Hive 0.11 although It seems to work fine in Hive 0.7

      Attachments

        Issue Links

          Activity

            People

              navis Navis Ryu
              dimamah dima machlin
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: