Hive / HIVE-21760

Sharedwork optimization should be bypassed for SMB joins

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Query Planning
    • Labels: None

    Description

      An SMB join introduces a dummy operator into the plan. If the shared work optimizer merges a plan that contains a dummy operator, task generation fails.
      I am not sure what the root cause of the task generation failure is, but presumably task generation makes some assumption about plans that contain a dummy operator.
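
The bypass proposed in the title can be pictured with a small, self-contained sketch. Everything in it is hypothetical: `Op`, `containsDummy`, and `mayMerge` are illustrative names and are not Hive classes or methods, and the tree shape is simplified. It only shows the idea of checking subplans for a dummy operator before offering them to a merge step; it is not a description of the attached patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SharedWorkBypassSketch {

    /** Hypothetical stand-in for a plan operator; this is not Hive's Operator class. */
    static final class Op {
        final String name;
        final boolean dummy; // true for the dummy operator that an SMB join adds
        final List<Op> children = new ArrayList<>();

        Op(String name, boolean dummy) {
            this.name = name;
            this.dummy = dummy;
        }

        /** Adds a child and returns this operator so calls can be chained. */
        Op child(Op c) {
            children.add(c);
            return this;
        }
    }

    /** Walks the operator tree below root and reports whether any operator is a dummy. */
    static boolean containsDummy(Op root) {
        Deque<Op> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Op op = pending.pop();
            if (op.dummy) {
                return true;
            }
            op.children.forEach(pending::push);
        }
        return false;
    }

    /** Guard: two subplans may be merged only if neither contains a dummy operator. */
    static boolean mayMerge(Op left, Op right) {
        return !containsDummy(left) && !containsDummy(right);
    }

    public static void main(String[] args) {
        Op scanA = new Op("scan", false).child(new Op("groupby", false));
        Op scanB = new Op("scan", false).child(new Op("groupby", false));
        Op smb = new Op("scan", false)
                .child(new Op("smb-join", false).child(new Op("dummy", true)));

        System.out.println(mayMerge(scanA, scanB)); // prints true
        System.out.println(mayMerge(scanA, smb));   // prints false
    }
}
```

With a guard of this kind, plans without a dummy operator are still eligible for merging, while any plan produced for an SMB join is left as it is.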

      Reproducer

      Run the following query as a TestMiniLlapLocalCliDriver test:

      SELECT `t`.`p_name`
      FROM (SELECT `p_name`, `p_type`, `p_size` + 1 AS `size`
            FROM `part`) AS `t`
      LEFT JOIN (SELECT `t5`.`size`, `t2`.`c`, `t2`.`ck`
                 FROM (SELECT `p_size` + 1 AS `+`, COUNT(*) AS `c`, COUNT(`p_type`) AS `ck`
                       FROM `part`
                       WHERE `p_size` IS NOT NULL
                       GROUP BY `p_size` + 1) AS `t2`
                 INNER JOIN (SELECT `p_size` + 1 AS `size`
                             FROM `part`
                             WHERE `p_size` IS NOT NULL
                             GROUP BY `p_size` + 1) AS `t5` ON `t2`.`+` = `t5`.`size`) AS `t6`
        ON `t`.`size` = `t6`.`size`
      LEFT JOIN (SELECT `t9`.`p_type`, `t12`.`size`, TRUE AS `$f2`
                 FROM (SELECT `p_type`, `p_size` + 1 AS `+`
                       FROM `part`
                       WHERE `p_size` IS NOT NULL AND `p_type` IS NOT NULL
                       GROUP BY `p_type`, `p_size` + 1) AS `t9`
                 INNER JOIN (SELECT `p_size` + 1 AS `size`
                             FROM `part`
                             WHERE `p_size` IS NOT NULL
                             GROUP BY `p_size` + 1) AS `t12` ON `t9`.`+` = `t12`.`size`) AS `t14`
        ON `t`.`p_type` = `t14`.`p_type` AND `t`.`size` = `t14`.`size`
      WHERE (`t14`.`$f2` IS NULL OR `t6`.`c` = 0 OR `t6`.`c` IS NULL)
        AND (`t`.`p_type` IS NOT NULL OR `t6`.`c` = 0 OR `t6`.`c` IS NULL OR `t14`.`$f2` IS NOT NULL)
        AND (`t6`.`ck` < `t6`.`c` IS NOT TRUE OR `t6`.`c` = 0 OR `t6`.`c` IS NULL OR `t14`.`$f2` IS NOT NULL OR `t`.`p_type` IS NULL);
      
      Task generation then fails with the following exception:

      java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.plan.TezWork.connect(TezWork.java:376)
      	at org.apache.hadoop.hive.ql.parse.GenTezWork.process(GenTezWork.java:470)
      	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
      	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:90)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
      	at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.startWalking(GenTezWorkWalker.java:72)
      	at org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:641)
      	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:278)
      	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12562)
      	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:370)
      	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
      	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:671)
      	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1905)
      	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1852)
      	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1847)
      	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
      	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:219)
      	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
      	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
      	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
      	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:340)
      	at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:676)
      	at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:647)
      	at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182)
      	at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
      	at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
      	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.junit.runners.Suite.runChild(Suite.java:127)
      	at org.junit.runners.Suite.runChild(Suite.java:26)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
      	at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
      	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
      	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
      	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
      	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
      	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
      	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
      

      Attachments

        1. HIVE-21760.1.patch
          34 kB
          Vineet Garg

            People

              Assignee: Vineet Garg (vgarg)
              Reporter: Vineet Garg (vgarg)
              Votes: 0
              Watchers: 2
