Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
In Impala, create a partitioned table and create one partition in it using INSERT:
create table my_part (i int) partitioned by (p int) stored as parquet;
insert into my_part partition(p=0) values (0),(1),(2);
show partitions my_part
+-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
| p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                          | EC Policy |
+-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
| 0     | -1    | 1      | 358B | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/my_part/p=0 | NONE      |
| Total | -1    | 1      | 358B | 0B           |                   |         |                   |                                                   |           |
+-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
In Hive, describe the partition. The partition parameters include "impala.events.catalogServiceId" and "impala.events.catalogVersion", which were added by Impala. This is expected.
hive> desc formatted my_part partition(p=0);
+-----------------------------------+----------------------------------------------------+-----------------------------------+
| col_name                          | data_type                                          | comment                           |
+-----------------------------------+----------------------------------------------------+-----------------------------------+
| i                                 | int                                                |                                   |
|                                   | NULL                                               | NULL                              |
| # Partition Information           | NULL                                               | NULL                              |
| # col_name                        | data_type                                          | comment                           |
| p                                 | int                                                |                                   |
|                                   | NULL                                               | NULL                              |
| # Detailed Partition Information  | NULL                                               | NULL                              |
| Partition Value:                  | [0]                                                | NULL                              |
| Database:                         | default                                            | NULL                              |
| Table:                            | my_part                                            | NULL                              |
| CreateTime:                       | Wed Aug 09 15:24:50 CST 2023                       | NULL                              |
| LastAccessTime:                   | UNKNOWN                                            | NULL                              |
| Location:                         | hdfs://localhost:20500/test-warehouse/my_part/p=0  | NULL                              |
| Partition Parameters:             | NULL                                               | NULL                              |
|                                   | impala.events.catalogServiceId                     | eab33ebb8a14cfd:8b2bdc12df3568df  |
|                                   | impala.events.catalogVersion                       | 1882                              |
|                                   | numFiles                                           | 1                                 |
|                                   | totalSize                                          | 358                               |
|                                   | transient_lastDdlTime                              | 1691565890                        |
|                                   | NULL                                               | NULL                              |
| # Storage Information             | NULL                                               | NULL                              |
| SerDe Library:                    | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL                              |
| InputFormat:                      | org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat | NULL                              |
| OutputFormat:                     | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL                              |
| Compressed:                       | No                                                 | NULL                              |
| Num Buckets:                      | 0                                                  | NULL                              |
| Bucket Columns:                   | []                                                 | NULL                              |
| Sort Columns:                     | []                                                 | NULL                              |
+-----------------------------------+----------------------------------------------------+-----------------------------------+
Now run an ALTER statement on the partition in Hive, e.g. changing the location:
alter table my_part partition(p=0) set location '/tmp';
Impala skips the resulting ALTER_PARTITION event because it considers the event a self-event, even though the change was made in Hive. In catalogd logs:
I0809 15:30:19.628449 29844 MetastoreEvents.java:628] EventId: 8351549 EventType: ALTER_PARTITION Incremented events skipped counter to 12
I0809 15:30:19.628616 29844 MetastoreEvents.java:628] EventId: 8351549 EventType: ALTER_PARTITION Not processing the event as it is a self-event
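
To illustrate why an event from Hive can be misclassified, here is a minimal Python sketch of a self-event check. It is a simplified model, not Impala's actual code: the Catalog class, is_self_event and in_flight_versions are hypothetical names, and it assumes that the check compares the two "impala.events.*" partition parameters against catalogd's own service ID and a set of versions it still expects events for. It also assumes that Hive's ALTER leaves those parameters on the partition unchanged.

```python
# Simplified, hypothetical model of a self-event check; not Impala's real code.
from dataclasses import dataclass, field

SERVICE_ID_KEY = "impala.events.catalogServiceId"
VERSION_KEY = "impala.events.catalogVersion"


@dataclass
class Catalog:
    service_id: str
    # Versions of this catalog's own changes whose events are still expected.
    in_flight_versions: set = field(default_factory=set)

    def is_self_event(self, partition_params: dict) -> bool:
        """Return True if the event's partition parameters match our own change."""
        if partition_params.get(SERVICE_ID_KEY) != self.service_id:
            return False
        version = partition_params.get(VERSION_KEY)
        return version is not None and int(version) in self.in_flight_versions


catalog = Catalog(service_id="eab33ebb8a14cfd:8b2bdc12df3568df")
# Assumption: the INSERT registered version 1882 and nothing cleared it later.
catalog.in_flight_versions.add(1882)

# Assumption: after Hive's ALTER the partition still carries Impala's markers.
params_after_hive_alter = {
    SERVICE_ID_KEY: "eab33ebb8a14cfd:8b2bdc12df3568df",
    VERSION_KEY: "1882",
    "numFiles": "1",
    "totalSize": "358",
}

# The event originates from Hive, yet the check reports it as a self-event.
skipped = catalog.is_self_event(params_after_hive_alter)
print(skipped)
```

Under these assumptions, the stale markers alone are enough for the check to match, which is consistent with the "Not processing the event as it is a self-event" log line above.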