Hive / HIVE-25410

CommonMergeJoin fails for ARRAY join keys with varying size

    Description

      Thanks to HIVE-24883, CommonMergeJoinOperator can handle ARRAY or STRUCT types as a JOIN key.

      In some corner cases, however, CommonMergeJoinOperator fails with `ArrayIndexOutOfBoundsException`, for example when the ARRAY join keys have different sizes.

      A minimal reproduction:

      SET hive.auto.convert.join=false;
      CREATE TABLE table_list_types (id int, key array<int>);
      INSERT INTO table_list_types VALUES (1, array(1, 2)), (2, array(1, 2)), (3, array(1, 2, 3)), (4, array(1, 2, 3));
      SELECT * FROM table_list_types t1 INNER JOIN table_list_types t2 ON t1.key = t2.key; 

      At commit 69c97c26ac68a245f4d327cc2f7b3a2333f8fa84, the query fails with the following error:

      Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
      	at org.apache.hadoop.hive.ql.exec.HiveStructComparator.compare(HiveStructComparator.java:57)
      	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKey(CommonMergeJoinOperator.java:629)
      	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.compareKeys(CommonMergeJoinOperator.java:597)
      	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:566)
      	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:249)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
      	... 26 more 
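
The trace ends in a comparator reading index 2, which is consistent with a positional comparison walking past the end of the shorter key when `array(1, 2)` meets `array(1, 2, 3)`. The Java sketch below shows one length-aware way to order list keys: compare the shared prefix element by element, then break ties by size. `ListKeyCompareSketch` and `compareLists` are hypothetical names used only for illustration; they are not Hive classes, and this is not necessarily how the issue was resolved.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ListKeyCompareSketch {

    // Hypothetical helper (not part of Hive): orders two list-valued join keys
    // without ever indexing past the end of the shorter one.
    public static <T> int compareLists(List<T> a, List<T> b, Comparator<? super T> elem) {
        // Only indices present in both lists are safe to read.
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            int c = elem.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c; // first differing element decides the order
            }
        }
        // Shared prefix is equal, so the shorter list sorts first.
        return Integer.compare(a.size(), b.size());
    }

    public static void main(String[] args) {
        // Same key values as the rows in the reproduction above.
        List<Integer> shortKey = Arrays.asList(1, 2);
        List<Integer> longKey = Arrays.asList(1, 2, 3);
        Comparator<Integer> cmp = Comparator.naturalOrder();

        System.out.println(compareLists(shortKey, longKey, cmp) < 0);   // shorter key sorts first
        System.out.println(compareLists(longKey, shortKey, cmp) > 0);   // symmetric case
        System.out.println(compareLists(shortKey, shortKey, cmp) == 0); // equal keys match
    }
}
```

Breaking ties by size keeps the ordering total, which a merge join relies on when it advances whichever side holds the smaller key. Whether Hive's sort order for ARRAY keys follows the same rule is an assumption here.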

            People

              okumin Shohei Okumiya

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1.5h