HIVE-9886

Hive on tez: NPE when converting join to SMB in sub-query

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.1.0
    • Fix Version/s: 1.0.0, 1.2.0
    • Component/s: Tez
    • Labels:
      None

      Description

      set hive.auto.convert.sortmerge.join = true;
      
      create table t1(
      id string,
      od string);
      
      create table t2(
      id string,
      od string);
      
      select vt1.id from
      (select rt1.id from
      (select t1.id, row_number() over (partition by id order by od desc) as row_no from t1) rt1
      where rt1.row_no=1) vt1
      join
      (select rt2.id from
      (select t2.id, row_number() over (partition by id order by od desc) as row_no from t2) rt2
      where rt2.row_no=1) vt2
      where vt1.id=vt2.id;
      

      throws NPE:

      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
      	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
      	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
      	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
      	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
      	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      Caused by: java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.getValueObjectInspectors(AbstractMapJoinOperator.java:96)
      	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:167)
      	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:310)
      	at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:72)
      	at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.initializeOp(CommonMergeJoinOperator.java:89)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
      	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
      	at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
      	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
      	at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:66)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
      	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
      	at org.apache.hadoop.hive.ql.exec.Operator.initializeOp(Operator.java:410)
      	at org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:89)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
      	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
      	at org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
      	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
      	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:116)
      	... 14 more
      
      Attachments

      1. HIVE-9886.1.patch (64 kB, Vikram Dixit K)
      2. HIVE-9886.2.patch (41 kB, Vikram Dixit K)
      3. HIVE-9886.3.patch (40 kB, Vikram Dixit K)
      4. HIVE-9886.4.patch (40 kB, Vikram Dixit K)
      5. HIVE-9886.5.patch (40 kB, Vikram Dixit K)
      6. HIVE-9886.6.patch (40 kB, Vikram Dixit K)

          Activity

          sushanth Sushanth Sowmyan added a comment -

          This issue has been fixed and released as part of the 1.2.0 release. If you find an issue that seems related to this one, please create a new JIRA and link it to this one.

          vikram.dixit Vikram Dixit K added a comment -

          As mentioned in the comments below, HiveQA ran the tests and the tests had passed but hadn't been posted to the jira. I have committed the patch to 1.0.0 and 1.2.0 as well.

          hagleitn Gunther Hagleitner added a comment -

          Vikram Dixit K, can you re-upload the patch to see if we can get a clean run?

          vikram.dixit Vikram Dixit K added a comment -

          Looks like test results are not getting posted: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2973/testReport/

          hagleitn Gunther Hagleitner added a comment -

          +1

          vikram.dixit Vikram Dixit K added a comment -

          Same patch but with prefix for RB to work.

          hagleitn Gunther Hagleitner added a comment -

          I think we should have some follow up jiras:

          • Trait set propagation for map join (one side sort column propagation)
          • ReduceRecordProc should be able to handle smb
          • Think through the assumption that multiple smbs in the same vertex are always merged (smb-mj-smb-mj)

          For now, the safest option might be to set the reduce sink count once and not update it while transforming the plan. That will allow an smb followed by any number of mj in the first (map) vertex, but not much else (as far as smb is concerned).

          hagleitn Gunther Hagleitner added a comment -

          For patch 2: joinOp.getOpTraits().getNumReduceSinks() - 1) -> the traits have to be propagated to all downstream operators after a change like this, no?
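
          The concern above can be illustrated with a minimal sketch. This is not code from the patch: the Op class, its numReduceSinks field, and setAndPropagate are hypothetical stand-ins rather than Hive's Operator or OpTraits API. It also assumes, for simplicity, that no further reduce sinks sit between the join and its descendants, so every downstream operator should see the same count.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; NOT Hive's Operator/OpTraits classes.
class TraitPropagationSketch {

    /** Toy operator node carrying one trait: the number of upstream reduce sinks. */
    static final class Op {
        final String name;
        int numReduceSinks;
        final List<Op> children = new ArrayList<>();

        Op(String name, int numReduceSinks) {
            this.name = name;
            this.numReduceSinks = numReduceSinks;
        }

        /** Attaches a child and returns it, so a chain can be built inline. */
        Op addChild(Op child) {
            children.add(child);
            return child;
        }
    }

    /**
     * Sets the count on {@code start} and pushes the same value to every
     * operator reachable from it. Updating only {@code start} would leave
     * the descendants holding the stale value.
     */
    static void setAndPropagate(Op start, int newCount) {
        Deque<Op> pending = new ArrayDeque<>();
        Set<Op> seen = new HashSet<>(); // guards against visiting shared children twice
        pending.push(start);
        while (!pending.isEmpty()) {
            Op op = pending.pop();
            if (!seen.add(op)) {
                continue;
            }
            op.numReduceSinks = newCount;
            for (Op child : op.children) {
                pending.push(child);
            }
        }
    }

    public static void main(String[] args) {
        // JOIN -> FIL -> SEL, all initially seeing two reduce sinks.
        Op join = new Op("JOIN", 2);
        Op filter = join.addChild(new Op("FIL", 2));
        Op select = filter.addChild(new Op("SEL", 2));

        // Decrement at the join and propagate to everything below it.
        setAndPropagate(join, join.numReduceSinks - 1);
        System.out.println(select.name + " sees " + select.numReduceSinks + " reduce sink(s)");
    }
}
```

          In this toy plan, decrementing only the join would leave FIL and SEL reporting the old count, which is the inconsistency the question points at.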

          hagleitn Gunther Hagleitner added a comment -

          There are a lot of whitespace changes in the patch that make it hard to review. I'm usually not against fixing indentation etc., but in this case it seems to make the indentation worse in a lot of places.


            People

            • Assignee: Vikram Dixit K
            • Reporter: Vikram Dixit K
            • Votes: 0
            • Watchers: 3
