StatsUtils throws NPE when running EXPLAIN

    Details

      Description

      The following script reproduces the problem:

      drop table if exists explain_npe_map;
      drop table if exists explain_npe_array;
      drop table if exists explain_npe_struct;
      
      create table explain_npe_map    ( c1 map<string, string> );
      create table explain_npe_array  ( c1 array<string> );
      create table explain_npe_struct ( c1 struct<name:string, age:int> );
      
      -- fails with CBO disabled
      set hive.cbo.enable=false;
      explain select c1 from explain_npe_map where c1 is null;
      explain select c1 from explain_npe_array where c1 is null;
      explain select c1 from explain_npe_struct where c1 is null;
      
      -- succeeds with CBO enabled
      set hive.cbo.enable=true;
      explain select c1 from explain_npe_map where c1 is null;
      explain select c1 from explain_npe_array where c1 is null;
      explain select c1 from explain_npe_struct where c1 is null;

       

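      If 'hive.cbo.enable' is set to false, EXPLAIN fails with a NullPointerException (for the struct column it surfaces as a RuntimeException caused by an NPE). If it is set to true, EXPLAIN succeeds.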

      hive> drop table if exists explain_npe_map;
      OK
      Time taken: 0.063 seconds
      hive> drop table if exists explain_npe_array;
      OK
      Time taken: 0.035 seconds
      hive> drop table if exists explain_npe_struct;
      OK
      Time taken: 0.015 seconds
      hive>
          > create table explain_npe_map    ( c1 map<string, string> );
      OK
      Time taken: 0.584 seconds
      hive> create table explain_npe_array  ( c1 array<string> );
      OK
      Time taken: 0.216 seconds
      hive> create table explain_npe_struct ( c1 struct<name:string, age:int> );
      OK
      Time taken: 0.17 seconds
      hive>
          > set hive.cbo.enable=false;
      hive> explain select c1 from explain_npe_map where c1 is null;
      FAILED: NullPointerException null
      hive> explain select c1 from explain_npe_array where c1 is null;
      FAILED: NullPointerException null
      hive> explain select c1 from explain_npe_struct where c1 is null;
      FAILED: RuntimeException Error invoking signature method
      hive>
          > set hive.cbo.enable=true;
      hive> explain select c1 from explain_npe_map where c1 is null;
      OK
      STAGE DEPENDENCIES:
        Stage-0 is a root stage

      STAGE PLANS:
        Stage: Stage-0
          Fetch Operator
            limit: -1
            Processor Tree:
              TableScan
                alias: explain_npe_map
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Filter Operator
                  predicate: false (type: boolean)
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: c1 (type: map<string,string>)
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    ListSink

      Time taken: 1.593 seconds, Fetched: 20 row(s)
      hive> explain select c1 from explain_npe_array where c1 is null;
      OK
      STAGE DEPENDENCIES:
        Stage-0 is a root stage

      STAGE PLANS:
        Stage: Stage-0
          Fetch Operator
            limit: -1
            Processor Tree:
              TableScan
                alias: explain_npe_array
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Filter Operator
                  predicate: false (type: boolean)
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: c1 (type: array<string>)
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    ListSink

      Time taken: 1.969 seconds, Fetched: 20 row(s)
      hive> explain select c1 from explain_npe_struct where c1 is null;
      OK
      STAGE DEPENDENCIES:
        Stage-0 is a root stage

      STAGE PLANS:
        Stage: Stage-0
          Fetch Operator
            limit: -1
            Processor Tree:
              TableScan
                alias: explain_npe_struct
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Filter Operator
                  predicate: false (type: boolean)
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: c1 (type: struct<name:string,age:int>)
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    ListSink

      Time taken: 2.932 seconds, Fetched: 20 row(s)
      hive>
      

      The error messages are as follows:

      for map:

      java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfMap(StatsUtils.java:1045)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfComplexTypes(StatsUtils.java:931)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.getAvgColLenOfVariableLengthTypes(StatsUtils.java:869)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.estimateRowSizeFromSchema(StatsUtils.java:526)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:223)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:111)
              at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
              at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78)
              at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:192)
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10205)
              at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:210)
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
              at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
              at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:425)
              at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
              at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1153)
              at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1206)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
              at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
              at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
              at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
              at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
              at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
              at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

       

      for array:

      java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfComplexTypes(StatsUtils.java:1168)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.getAvgColLenOf(StatsUtils.java:1132)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.estimateRowSizeFromSchema(StatsUtils.java:686)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.estimateRowSizeFromSchema(StatsUtils.java:664)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:254)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:162)
              at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:150)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:142)
              at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
              at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
              at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78)
              at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:250)
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12481)
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11824)
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
              at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:166)
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
              at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664)
              at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1854)
              at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1801)
              at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1796)
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
              at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
              at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
              at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
              at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
              at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
              at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
      

      for struct:

      This may already work on the master branch, but I think it is still necessary to initialize the value of StandardConstantStructObjectInspector.

      2020-06-10T16:40:56,971 ERROR [52839d08-57a7-475f-b87f-8f1410978b8a main] ql.Driver: FAILED: RuntimeException Error invoking signature method
      java.lang.RuntimeException: Error invoking signature method
              at org.apache.hadoop.hive.ql.optimizer.signature.SignatureUtils$SignatureMapper.write(SignatureUtils.java:76)
              at org.apache.hadoop.hive.ql.optimizer.signature.SignatureUtils.write(SignatureUtils.java:40)
              at org.apache.hadoop.hive.ql.optimizer.signature.OpSignature.<init>(OpSignature.java:53)
              at org.apache.hadoop.hive.ql.optimizer.signature.OpSignature.of(OpSignature.java:57)
              at org.apache.hadoop.hive.ql.optimizer.signature.OpTreeSignature.<init>(OpTreeSignature.java:50)
              at org.apache.hadoop.hive.ql.optimizer.signature.OpTreeSignature.of(OpTreeSignature.java:63)
              at org.apache.hadoop.hive.ql.optimizer.signature.OpTreeSignatureFactory$CachedFactory.lambda$getSignature$0(OpTreeSignatureFactory.java:62)
              at java.util.Map.computeIfAbsent(Map.java:957)
              at org.apache.hadoop.hive.ql.optimizer.signature.OpTreeSignatureFactory$CachedFactory.getSignature(OpTreeSignatureFactory.java:62)
              at org.apache.hadoop.hive.ql.plan.mapper.PlanMapper.getSignatureOf(PlanMapper.java:265)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory.applyRuntimeStats(StatsRulesProcFactory.java:2666)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory.access$000(StatsRulesProcFactory.java:116)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$SelectStatsRule.process(StatsRulesProcFactory.java:211)
              at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
              at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
              at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122)
              at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78)
              at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:250)
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12481)
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11824)
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
              at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:166)
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
              at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664)
              at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1854)
              at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1801)
              at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1796)
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
              at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
              at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
              at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
              at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
              at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
              at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
      Caused by: java.lang.reflect.InvocationTargetException
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.ql.optimizer.signature.SignatureUtils$SignatureMapper.write(SignatureUtils.java:73)
              ... 42 more
      Caused by: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc.getExprString(ExprNodeConstantDesc.java:158)
              at org.apache.hadoop.hive.ql.plan.ExprNodeDesc.getExprString(ExprNodeDesc.java:90)
              at org.apache.hadoop.hive.ql.plan.PlanUtils.addExprToStringBuffer(PlanUtils.java:1104)
              at org.apache.hadoop.hive.ql.plan.PlanUtils.getExprListString(PlanUtils.java:1092)
              at org.apache.hadoop.hive.ql.plan.PlanUtils.getExprListString(PlanUtils.java:1075)
              at org.apache.hadoop.hive.ql.plan.SelectDesc.getColListString(SelectDesc.java:79)
              ... 47 more
      

       

      We can fix this by initializing the value of StandardConstantMapObjectInspector, StandardConstantListObjectInspector and StandardConstantStructObjectInspector.
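
      To illustrate the idea only, here is a minimal, self-contained sketch. It assumes the NPE comes from a constant inspector handing back a null value that the size estimation then dereferences. The class ConstantMap and the method estimateSize are hypothetical stand-ins, not Hive's real classes or the content of the attached patch.

```java
import java.util.Collections;
import java.util.Map;

final class ConstantValueSketch {

    /** Hypothetical stand-in for a constant map inspector; not Hive's class. */
    static final class ConstantMap {
        private final Map<String, String> value;

        ConstantMap(Map<String, String> value) {
            // Idea from the report: initialize the value so it is never null.
            this.value = (value == null)
                    ? Collections.<String, String>emptyMap()
                    : value;
        }

        Map<String, String> getValue() {
            return value;
        }
    }

    /** Sums key and value lengths; safe because getValue() never returns null. */
    static long estimateSize(ConstantMap constant) {
        long size = 0;
        for (Map.Entry<String, String> e : constant.getValue().entrySet()) {
            size += e.getKey().length() + e.getValue().length();
        }
        return size;
    }

    public static void main(String[] args) {
        // A null constant no longer throws; it is treated as an empty map.
        System.out.println(estimateSize(new ConstantMap(null)));
        System.out.println(estimateSize(
                new ConstantMap(Collections.singletonMap("k", "vv"))));
    }
}
```

      With the value initialized in the constructor, callers that iterate over the constant do not need their own null checks. An alternative would be a null check inside the estimation code, but that would have to be repeated for every caller.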

       

        Attachments

        1. HIVE-22412.patch (17 kB, xiepengjie)

              People

              • Assignee: xiepengjie
              • Reporter: xiepengjie

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3h 20m