HIVE-6984

Analyzing partitioned table with NULL values for the partition column failed with NPE

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 0.14.0
    • Component/s: Statistics
    • Labels:
      None

      Description

      The following steps reproduce the bug:

      hive> desc test2;
      name                	string              	                    
      age                 	int                 	                    
      
      hive> select * from test2;
      6666666666666666666	NULL
      5555555555555555555	NULL
      tom	15
      john	NULL
      mayr	40
      	30
      	NULL
      
      hive> create table test3(name string) partitioned by (age int);
      
      hive> from test2 insert overwrite table test3 partition(age) select test2.name, test2.age;
      Loading data to table default.test3 partition (age=null)
      	Loading partition {age=40}
      	Loading partition {age=__HIVE_DEFAULT_PARTITION__}
      	Loading partition {age=30}
      	Loading partition {age=15}
      Partition default.test3{age=15} stats: [numFiles=1, numRows=1, totalSize=4, rawDataSize=3]
      Partition default.test3{age=30} stats: [numFiles=1, numRows=1, totalSize=1, rawDataSize=0]
      Partition default.test3{age=40} stats: [numFiles=1, numRows=1, totalSize=5, rawDataSize=4]
      Partition default.test3{age=__HIVE_DEFAULT_PARTITION__} stats: [numFiles=1, numRows=4, totalSize=46, rawDataSize=42]
      
      hive> analyze table test3 partition(age) compute statistics;
      ...
      Task with the most failures(4): 
      -----
      Diagnostic Messages for this Task:
      java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"6666666666666666666","age":null,"raw__data__size":19}
      	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
      	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
      	at org.apache.hadoop.mapred.Child.main(Child.java:262)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"6666666666666666666","age":null,"raw__data__size":19}
      	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
      	at org.apache.hado
      
      FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
      

      The following is the stack trace in the mapper log:

      2014-04-28 15:39:25,073 FATAL org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"6666666666666666666","age":null,"raw__data__size":19}
      	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:549)
      	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
      	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
      	at org.apache.hadoop.mapred.Child.main(Child.java:262)
      Caused by: java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.exec.TableScanOperator.gatherStats(TableScanOperator.java:149)
      	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
      	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
      	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
      	... 9 more
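
The trace points at TableScanOperator.gatherStats, which runs while the row with age = NULL is being processed. One plausible reading is that the partition value is dereferenced without a null check when the partition spec is built. The following Java sketch illustrates that failure mode and a null-safe alternative that falls back to the default partition name seen in the load output above. The class and method names (PartitionSpecSketch, unsafeSpec, safeSpec) are hypothetical; they are not taken from the Hive source or from the attached patches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-alone sketch; this is NOT the Hive implementation.
class PartitionSpecSketch {
    // Default partition name as it appears in the load output of the report.
    static final String DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__";

    // Unsafe variant: calling toString() on a null partition value throws
    // a NullPointerException, similar in spirit to the reported failure.
    static String unsafeSpec(List<String> cols, List<Object> values) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < cols.size(); i++) {
            parts.add(cols.get(i) + "=" + values.get(i).toString());
        }
        return String.join("/", parts);
    }

    // Null-safe variant: a null partition value maps to the default
    // partition name instead of being dereferenced.
    static String safeSpec(List<String> cols, List<Object> values) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < cols.size(); i++) {
            Object v = values.get(i);
            parts.add(cols.get(i) + "="
                + (v == null ? DEFAULT_PARTITION_NAME : v.toString()));
        }
        return String.join("/", parts);
    }

    public static void main(String[] args) {
        List<String> cols = Collections.singletonList("age");
        // prints "age=15"
        System.out.println(safeSpec(cols, Collections.<Object>singletonList(15)));
        // prints "age=__HIVE_DEFAULT_PARTITION__"
        System.out.println(safeSpec(cols, Collections.<Object>singletonList(null)));
    }
}
```

With the null-safe variant, the four rows whose age is NULL would be attributed to the age=__HIVE_DEFAULT_PARTITION__ partition rather than aborting the map task.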
      
      1. HIVE-6984.patch
        8 kB
        Xuefu Zhang
      2. HIVE-6984.2.patch
        49 kB
        Xuefu Zhang
      3. HIVE-6984.1.patch
        18 kB
        Xuefu Zhang

        Activity

        Sergey Shelukhin added a comment -

        +1; one question - should q file have stats output, to verify they are correct in the output?

        Ashutosh Chauhan added a comment -

        Also, instead of adding a new data file, consider reusing one from data/files/.

        Xuefu Zhang added a comment -

        "one question - should q file have stats output, to verify they are correct in the output?"

        I tried "desc extended test2;" but it shows no stats (it shows "parameters : {}"). So, I skipped that. (I think there is a JIRA about that, but wasn't able to find it.)

        Sergey Shelukhin added a comment -

        Maybe some dummy EXPLAIN query can be used; I think those output stats.

        Xuefu Zhang added a comment -

        Patch #1 updated based on feedback.

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12642553/HIVE-6984.1.patch

        ERROR: -1 due to 19 failed/errored test(s), 5491 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez2
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_count
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union5
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union7
        org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9
        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
        org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
        org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_partialscan_autogether
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/84/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/84/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 19 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12642553

        Xuefu Zhang added a comment -

        Patch #2 updated with regenerated test output.

        Sergey Shelukhin added a comment -

        +1

        Hive QA added a comment -

        Overall: -1 at least one tests failed

        Here are the results of testing the latest attachment:
        https://issues.apache.org/jira/secure/attachment/12642690/HIVE-6984.2.patch

        ERROR: -1 due to 6 failed/errored test(s), 5427 tests executed
        Failed tests:

        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
        org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
        org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
        org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
        org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_partialscan_autogether
        

        Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/92/testReport
        Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/92/console

        Messages:

        Executing org.apache.hive.ptest.execution.PrepPhase
        Executing org.apache.hive.ptest.execution.ExecutionPhase
        Executing org.apache.hive.ptest.execution.ReportingPhase
        Tests exited with: TestsFailedException: 6 tests failed
        

        This message is automatically generated.

        ATTACHMENT ID: 12642690

        Xuefu Zhang added a comment -

        The above test failures don't seem related to the patch. They appeared in other test runs.

        Xuefu Zhang added a comment -

        Patch committed to trunk. Thanks Sergey for the review.

        Thejas M Nair added a comment -

        This has been fixed in the 0.14 release. Please open a new JIRA if you see any issues.


          People

          • Assignee:
            Xuefu Zhang
            Reporter:
            Xuefu Zhang
          • Votes:
            1
            Watchers:
            4
