Hive / HIVE-8062

Stats collection for columns fails on a partitioned table with null values in partitioning column

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 0.14.0
    • Component/s: Statistics
    • Labels: None

    Description

      Steps to reproduce:
      1. Create a data file abc.txt with the following contents:

      a,1
      b,
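
      If it helps to script step 1, the data file can be generated as in the minimal sketch below. It assumes each of the two lines ends with a newline, which is not stated in the report but is consistent with the totalSize=7 that Hive prints when the file is loaded in step 2.

      ```python
      from pathlib import Path

      # Write the two sample rows exactly as bytes so that no platform-specific
      # newline translation changes the file size.
      # Assumption: each line is newline-terminated ("a,1\n" + "b,\n" = 7 bytes).
      data_file = Path("abc.txt")
      data_file.write_bytes(b"a,1\nb,\n")

      size = data_file.stat().st_size
      print(f"{data_file} written, {size} bytes")
      ```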
      

      2. Use the Hive CLI to create and load a staging table abc, then populate the partitioned table abc1 from it with a dynamic-partition insert:

      hive> create table abc(a string, b int);
      OK
      Time taken: 0.272 seconds
      hive> load data local inpath 'abc.txt' into table abc;
      Loading data to table default.abc
      Table default.abc stats: [numFiles=1, numRows=0, totalSize=7, rawDataSize=0]
      OK
      Time taken: 0.463 seconds
      hive> create table abc1(a string) partitioned by (b int);
      OK
      Time taken: 0.098 seconds
      hive> set hive.exec.dynamic.partition.mode=nonstrict;
      hive> insert overwrite table abc1 partition (b) select a, b from abc;
      Query ID = hrt_qa_20140911210909_1200fae7-1e18-4e0d-b74f-040453c27cff
      Total jobs = 1
      Launching Job 1 out of 1
      
      
      Status: Running (application id: Executing on YARN cluster with App id application_1410457588978_0063)
      
      Map 1: -/-	Reducer 2: 0/1
      Map 1: 0/1	Reducer 2: 0/1
      Map 1: 0(+1)/1	Reducer 2: 0/1
      Map 1: 1/1	Reducer 2: 0(+1)/1
      Map 1: 1/1	Reducer 2: 0/1
      Map 1: 1/1	Reducer 2: 1/1
      Status: Finished successfully
      Loading data to table default.abc1 partition (b=null)
      	Loading partition {b=__HIVE_DEFAULT_PARTITION__}
      Partition default.abc1{b=__HIVE_DEFAULT_PARTITION__} stats: [numFiles=1, numRows=2, totalSize=7, rawDataSize=5]
      OK
      Time taken: 7.49 seconds
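
      The log above reports numRows=2 for the single partition b=__HIVE_DEFAULT_PARTITION__, so both rows reached abc1 with a null b. One plausible explanation, which the report does not state, is that abc was created without a ROW FORMAT clause, so Hive's default field delimiter (Ctrl-A) applies and the comma is never treated as a separator. The sketch below is a simplified Python model of that routing, not Hive code; parse_row and partition_counts are hypothetical helpers introduced only for illustration.

      ```python
      # Simplified model (not Hive code) of routing rows to dynamic partitions.
      DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__"  # name shown in the log


      def parse_row(line, delimiter):
          """Split a text line into (a, b); b is None when missing or not an int."""
          fields = line.split(delimiter)
          a = fields[0]
          try:
              b = int(fields[1])
          except (IndexError, ValueError):
              # Assumption: a missing or unparsable int column is read as NULL.
              b = None
          return a, b


      def partition_counts(lines, delimiter):
          """Count rows per partition; rows with a null b go to the default one."""
          counts = {}
          for line in lines:
              _, b = parse_row(line, delimiter)
              name = f"b={DEFAULT_PARTITION}" if b is None else f"b={b}"
              counts[name] = counts.get(name, 0) + 1
          return counts


      lines = ["a,1", "b,"]
      ctrl_a_counts = partition_counts(lines, "\x01")  # default delimiter (assumed)
      comma_counts = partition_counts(lines, ",")      # if ',' were the delimiter
      print(ctrl_a_counts)
      print(comma_counts)
      ```

      Under this model the Ctrl-A case puts both rows in the default partition, matching the log, while a comma delimiter would still leave the row "b," there. Either way the table ends up with a null-valued partition, which is the condition named in the issue title.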
      

      3. Run the column statistics command on the partitioned table:

      hive> analyze table abc1 partition (b) compute statistics for columns;
      Query ID = hrt_qa_20140911211010_440bdb4a-6a0d-496b-9d2e-5fc84db3d0ee
      Total jobs = 1
      Launching Job 1 out of 1
      
      
      Status: Running (application id: Executing on YARN cluster with App id application_1410457588978_0063)
      
      Map 1: 0(+1)/1	Reducer 2: 0/1
      Map 1: 1/1	Reducer 2: 0(+1)/1
      Map 1: 1/1	Reducer 2: 1/1
      Status: Finished successfully
      FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask
      

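      The Tez job itself finishes successfully, but the subsequent ColumnStatsTask fails with return code 1, so no column statistics are collected for abc1, whose only partition is b=__HIVE_DEFAULT_PARTITION__.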

      Attachments

        1. HIVE-8062.patch
          7 kB
          Ashutosh Chauhan

            People

              ivikashks Vikash Singh
              deepesh Deepesh Khandelwal