Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-10208

Drop stats doesnt remove impala_intermediate_stats_num_chunks from PARTITION_PARAMS

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Catalog
    • Labels:
    • Epic Color:
      ghx-label-11

      Description

      Steps to replicate the issue:

      Step1:

      CREATE TABLE impala_partition_test1 (                                          
         a INT                                                                               
       )                                                                                     
       PARTITIONED BY (                                                                      
         b STRING                                                                            
       ); alter table impala_partition_test1 add partition(b="part1");
       alter table impala_partition_test1 add partition(b="part2");
       alter table impala_partition_test1 add partition(b="part3");
       alter table impala_partition_test1 add partition(b="part4");
      

      Step2: Populating the partitions

      for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part1/test_${i}; done
       for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part2/test_${i}; done
       for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part3/test_${i}; done
       for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part4/test_${i}; done
      

      Step3: Run compute incremental stats impala_partition_test1;

      Step4: In HMS DB when you run the below query

      select A.TBL_NAME, B.PART_NAME, C.PARAM_KEY, sum(length(C.PARAM_KEY) + length(C.PARAM_VALUE)) from TBLS A join PARTITIONS B join PARTITION_PARAMS C on A.TBL_ID = B.TBL_ID and C.PART_ID=B.PART_ID and C.PARAM_KEY like "%impala_intermediate_stats%" group by A.TBL_NAME,B.PART_NAME,C.PARAM_KEY;	
      

      You will be noticing

      Step5: After you drop the stats [drop stats impala_partition_test1 ] you still be noticing impala_intermediate_stats_num_chunks left unremoved.

      When you have million partitions this could contribute to 37mb I suppose. Requesting you to remove impala_intermediate_stats_num_chunks while we drop stats from table.

        Attachments

        1. image-2020-10-02-10-38-16-153.png
          544 kB
          Venkat Sambath
        2. image-2020-10-02-10-38-48-144.png
          456 kB
          Venkat Sambath
        3. image-2020-10-02-10-39-18-642.png
          564 kB
          Venkat Sambath

          Activity

            People

            • Assignee:
              chufucun Fucun Chu
              Reporter:
              Venkat Sambath Venkat Sambath
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated: