Description
Steps to replicate the issue:
Step 1: Create a partitioned table

CREATE TABLE impala_partition_test1 (a INT) PARTITIONED BY (b STRING);
ALTER TABLE impala_partition_test1 ADD PARTITION (b="part1");
ALTER TABLE impala_partition_test1 ADD PARTITION (b="part2");
ALTER TABLE impala_partition_test1 ADD PARTITION (b="part3");
ALTER TABLE impala_partition_test1 ADD PARTITION (b="part4");
Step 2: Populate the partitions

for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part1/test_${i}; done
for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part2/test_${i}; done
for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part3/test_${i}; done
for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data && hdfs dfs -put text_data hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part4/test_${i}; done
Step 3: Compute incremental stats

compute incremental stats impala_partition_test1;
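To confirm the incremental stats were actually written before inspecting the metastore, the standard Impala statements below can be used (the exact output columns vary by Impala version):

```sql
-- Per-partition row counts and an "Incremental stats" column should now be populated
show table stats impala_partition_test1;

-- Column-level stats computed for the table
show column stats impala_partition_test1;
```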
Step 4: Run the query below against the HMS backing database:

select A.TBL_NAME, B.PART_NAME, C.PARAM_KEY,
       sum(length(C.PARAM_KEY) + length(C.PARAM_VALUE))
from TBLS A
join PARTITIONS B on A.TBL_ID = B.TBL_ID
join PARTITION_PARAMS C on C.PART_ID = B.PART_ID
where C.PARAM_KEY like "%impala_intermediate_stats%"
group by A.TBL_NAME, B.PART_NAME, C.PARAM_KEY;

You will notice the impala_intermediate_stats_* parameters stored for every partition, along with the space each entry consumes.
Step 5: After you drop the stats (drop stats impala_partition_test1;), you will still notice that the impala_intermediate_stats_num_chunks entries are left behind in PARTITION_PARAMS, one per partition.
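A quick way to confirm which intermediate-stats keys survive the drop (a sketch against a MySQL-backed HMS; table names may differ for other backends):

```sql
-- After DROP STATS, the chunk payload keys should be gone, but
-- impala_intermediate_stats_num_chunks still returns one row per partition
select PARAM_KEY, count(*) as partitions_affected
from PARTITION_PARAMS
where PARAM_KEY like '%impala_intermediate_stats%'
group by PARAM_KEY;
```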
With a million partitions this leftover could contribute roughly 37 MB, I suppose, since the key name impala_intermediate_stats_num_chunks alone is 36 bytes per partition, plus its value. Requesting that impala_intermediate_stats_num_chunks also be removed when stats are dropped from a table.
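Until this is fixed in Impala, a manual workaround could look like the sketch below. This edits the HMS backing database directly (assuming a MySQL backend, hence the multi-table DELETE syntax), which is risky: stop the metastore or quiesce writers and take a backup first, and restrict it to tables whose stats have already been dropped.

```sql
-- Manual cleanup sketch: remove the orphaned num_chunks rows
-- for the test table only (the joins restrict the delete to
-- partitions of impala_partition_test1)
delete C
from PARTITION_PARAMS C
join PARTITIONS B on C.PART_ID = B.PART_ID
join TBLS A on B.TBL_ID = A.TBL_ID
where A.TBL_NAME = 'impala_partition_test1'
  and C.PARAM_KEY = 'impala_intermediate_stats_num_chunks';
```

After the cleanup, an `invalidate metadata impala_partition_test1;` in Impala would be needed so the catalog stops caching the stale parameters.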