HIVE-21853

NPE in StatsUtils.getWritableSize() when value passed in is null

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • Hortonworks 

      • Ambari version 2.7.3.0
      • HDP stack version 3.1
      • HDP stack repo version 3.1.0.0
      • stack vdf version 3.1.0.0-78

    Description

      The getWritableSize(ObjectInspector oi, Object value) method in the org.apache.hadoop.hive.ql.stats.StatsUtils class throws a NullPointerException when its second parameter (Object value) is null.

      Attached is a patch with a unit test and the fix (HIVE-21853.patch).
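
      The attached patch is not reproduced here. As a minimal sketch of the kind of null guard involved, the following self-contained Java example avoids dereferencing a null value while summing the sizes of struct fields. The class and method names (NullSafeSizeSketch, writableSize, sizeOfStruct) are hypothetical stand-ins and not Hive's API, and returning 0 for a null value is an assumption rather than the confirmed behavior of the patch.

```java
import java.util.Arrays;
import java.util.List;

public class NullSafeSizeSketch {

    // Hypothetical stand-in for a per-value size estimate; not Hive's real method.
    static long writableSize(Object value) {
        // Guard: a null value contributes no bytes instead of being dereferenced.
        // Returning 0 here is an assumption made for this sketch.
        if (value == null) {
            return 0L;
        }
        if (value instanceof String) {
            return ((String) value).length();
        }
        if (value instanceof byte[]) {
            return ((byte[]) value).length;
        }
        // Fallback for types this sketch does not model.
        return 0L;
    }

    // Hypothetical stand-in for summing the field sizes of a struct value.
    static long sizeOfStruct(List<Object> fields) {
        long total = 0L;
        for (Object field : fields) {
            total += writableSize(field);
        }
        return total;
    }

    public static void main(String[] args) {
        // A struct with one populated field and one null field.
        List<Object> struct = Arrays.asList("somestring", null);
        System.out.println(sizeOfStruct(struct));
    }
}
```

      Without the null check in writableSize, the null field in the struct above would be the point of failure, which matches the getSizeOfStruct to getWritableSize frames at the top of the stack trace in this report.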

      The issue was originally found when running a UDF query against a Hortonworks cluster with HDP 3.1 running Hive 3.1.0. It occurs when the UDF is executed on a cluster using the Tez execution engine.

      Beeline Hive configuration:

      set hive.execution.engine=tez;
      set hive.fetch.task.conversion=none;

      Attached is sample code implementing a simple UDF that reproduces the behavior (HIVE21853.java).

      Steps to reproduce

      On a Hortonworks cluster with HDP 3.1 deployed:

      - start a Beeline Hive session
      - set the Hive configurations above
      - add the jar containing the UDF from the sample code
      - create a table containing one string column and insert a row
          create table tmptable(col1 string);
          insert into table tmptable values ('somestring');
      - create function hiveUDF as 'Hive21853UDF';
      - select hiveUDF from tmptable;

      This results in a NullPointerException similar to the following:

      ql.Driver ()) - FAILED: NullPointerException null
      java.lang.NullPointerException
      at org.apache.hadoop.hive.ql.stats.StatsUtils.getWritableSize(StatsUtils.java:1373)
      at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfStruct(StatsUtils.java:1356)
      at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfComplexTypes(StatsUtils.java:1212)
      at org.apache.hadoop.hive.ql.stats.StatsUtils.getAvgColLenOf(StatsUtils.java:1140)
      at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatisticsFromExpression(StatsUtils.java:1584)
      at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatisticsFromExprMap(StatsUtils.java:1424)
      at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$SelectStatsRule.process(StatsRulesProcFactory.java:196)
      at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
      at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
      at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122)
      at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78)
      at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsAnnotation(TezCompiler.java:397)
      at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:161)
      at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:148)
      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12443)
      at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358)
      at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
      at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664)
      at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1863)
      at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1810)
      at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1805)
      at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
      at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197)
      at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)
      at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
      at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)
      at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)
      at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
      at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)
      at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
      at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
      at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
      at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
      at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
      at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)

      Attachments

        1. HIVE-21853.patch
          3 kB
          John LeBrun
        2. HIVE21853.java
          2 kB
          John LeBrun

          People

            Assignee: Unassigned
            Reporter: John LeBrun (jo036le)