Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12762

Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true

Log workAgile BoardRank to TopRank to BottomBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.0.0, 2.1.0
    • 2.0.0
    • Logical Optimizer
    • None

    Description

      The following query will give incorrect result.

      CREATE TABLE tbl1(id INT) STORED AS PARQUET;
      INSERT INTO tbl1 VALUES(1), (2);
      
      CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
      INSERT INTO tbl2 VALUES(1, 'value1');
      INSERT INTO tbl2 VALUES(1, 'value2');
      
      set hive.optimize.index.filter = true;
      set hive.auto.convert.join=false;
      select tbl1.id, t1.value, t2.value
      FROM tbl1
      JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
      JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
      

      We are enforcing to use common join and tbl2 will have 2 files after 2 insertions underneath.

      the map job contains 3 TableScan operators (2 for tbl2 and 1 for tbl1). When hive.optimize.index.filter is set to true, we are incorrectly applying the later filtering condition to each block, which causes no data is returned for the subquery SELECT * FROM tbl2 WHERE value='value1'.

      Attachments

        1. HIVE-12762.patch
          25 kB
          Aihua Xu
        2. HIVE-12762.2.patch
          15 kB
          Aihua Xu

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            aihuaxu Aihua Xu Assign to me
            aihuaxu Aihua Xu
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment