Hive / HIVE-21760

Shared work optimization should be bypassed for SMB joins


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Query Planning
    • Labels: None

    Description

      An SMB join introduces a DUMMY OPERATOR. If the shared work optimizer merges a plan that contains a dummy operator, task generation fails.
      I am not sure what the root cause of the task generation failure is, but presumably task generation makes some assumption about plans that contain a dummy operator.
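
The bypass proposed in the title can be pictured with a small standalone sketch. It is only an illustration under assumptions: `Op`, `Kind`, `containsDummy`, and `shouldApplySharedWork` are hypothetical names for a toy plan model, not Hive's operator classes and not the code in the attached patch. The idea is to walk the plan from its roots and skip the shared work step whenever a dummy operator is reachable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SharedWorkGuardSketch {

    /** Hypothetical operator kinds for the toy model; not Hive's operator hierarchy. */
    enum Kind { TABLE_SCAN, FILTER, GROUP_BY, SMB_JOIN, DUMMY }

    /** Toy plan node that only knows its children. */
    static final class Op {
        final Kind kind;
        final List<Op> children = new ArrayList<>();

        Op(Kind kind, Op... kids) {
            this.kind = kind;
            this.children.addAll(Arrays.asList(kids));
        }
    }

    /** Depth-first walk from the roots; true if any reachable node is a dummy operator. */
    static boolean containsDummy(List<Op> roots) {
        Deque<Op> stack = new ArrayDeque<>(roots);
        Set<Op> seen = new HashSet<>(); // identity-based, since Op does not override equals
        while (!stack.isEmpty()) {
            Op op = stack.pop();
            if (!seen.add(op)) {
                continue; // already visited via another parent
            }
            if (op.kind == Kind.DUMMY) {
                return true;
            }
            for (Op child : op.children) {
                stack.push(child);
            }
        }
        return false;
    }

    /** Assumed guard: run the shared work step only when no dummy operator is present. */
    static boolean shouldApplySharedWork(List<Op> roots) {
        return !containsDummy(roots);
    }

    public static void main(String[] args) {
        // Toy SMB-style plan: a scan and a dummy operator both feed the join.
        Op smbJoin = new Op(Kind.SMB_JOIN);
        Op scan = new Op(Kind.TABLE_SCAN, smbJoin);
        Op dummy = new Op(Kind.DUMMY, smbJoin);

        // Toy plan without any dummy operator.
        Op plain = new Op(Kind.TABLE_SCAN, new Op(Kind.FILTER, new Op(Kind.GROUP_BY)));

        System.out.println(shouldApplySharedWork(Arrays.asList(scan, dummy))); // false
        System.out.println(shouldApplySharedWork(Arrays.asList(plain)));       // true
    }
}
```

The guard is deliberately conservative: it gives up the optimization for the whole plan rather than trying to work out which merges are safe, which matches the uncertainty about the root cause stated above.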

      Reproducer

      Run the following query as a TestMiniLlapLocalCliDriver test:

      SELECT `t`.`p_name`
      FROM (SELECT `p_name`, `p_type`, `p_size` + 1 AS `size`
      FROM `part`) AS `t`
      LEFT JOIN (SELECT `t5`.`size`, `t2`.`c`, `t2`.`ck`
      FROM (SELECT `p_size` + 1 AS `+`, COUNT(*) AS `c`, COUNT(`p_type`) AS `ck`
      FROM `part`
      WHERE `p_size` IS NOT NULL
      GROUP BY `p_size` + 1) AS `t2`
      INNER JOIN (SELECT `p_size` + 1 AS `size`
      FROM `part`
      WHERE `p_size` IS NOT NULL
      GROUP BY `p_size` + 1) AS `t5` ON `t2`.`+` = `t5`.`size`) AS `t6` ON `t`.`size` = `t6`.`size`
      LEFT JOIN (SELECT `t9`.`p_type`, `t12`.`size`, TRUE AS `$f2`
      FROM (SELECT `p_type`, `p_size` + 1 AS `+`
      FROM `part`
      WHERE `p_size` IS NOT NULL AND `p_type` IS NOT NULL
      GROUP BY `p_type`, `p_size` + 1) AS `t9`
      INNER JOIN (SELECT `p_size` + 1 AS `size`
      FROM `part`
      WHERE `p_size` IS NOT NULL
      GROUP BY `p_size` + 1) AS `t12` ON `t9`.`+` = `t12`.`size`) AS `t14` ON `t`.`p_type` = `t14`.`p_type` AND `t`.`size` = `t14`.`size`
      WHERE (`t14`.`$f2` IS NULL OR `t6`.`c` = 0 OR `t6`.`c` IS NULL) AND (`t`.`p_type` IS NOT NULL OR `t6`.`c` = 0 OR `t6`.`c` IS NULL OR `t14`.`$f2` IS NOT NULL) AND (`t6`.`ck` < `t6`.`c` IS NOT TRUE OR `t6`.`c` = 0 OR `t6`.`c` IS NULL OR `t14`.`$f2` IS NOT NULL OR `t`.`p_type` IS NULL);
      
      java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.plan.TezWork.connect(TezWork.java:376)
      	at org.apache.hadoop.hive.ql.parse.GenTezWork.process(GenTezWork.java:470)
      	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
      	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:90)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.startWalking(GenTezWorkWalker.java:72)
      	at org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:641)
      	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:278)
      	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12562)
      	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:370)
      	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
      	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:671)
      	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1905)
      	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1852)
      	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1847)
      	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
      	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:219)
      	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
      	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
      	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
      	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:340)
      	at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:676)
      	at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:647)
      	at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182)
      	at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
      	at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
      	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.junit.runners.Suite.runChild(Suite.java:127)
      	at org.junit.runners.Suite.runChild(Suite.java:26)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
      	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
      	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
      	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
      	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
      	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
      

      Attachments

        1. HIVE-21760.1.patch
          34 kB
          Vineet Garg


          People

            Assignee: Vineet Garg (vgarg)
            Reporter: Vineet Garg (vgarg)
            Votes: 0
            Watchers: 2
