Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
An INSERT into a partition that exists in catalogd but not in HMS fails while reloading the partition's metadata. The cause is that updateCatalog doesn't create the partition in HMS, since catalogd is not aware that the partition is missing from HMS:
https://github.com/apache/impala/blob/d0fe4c604f72d41019832513ebf65cfe8f469953/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L6697-L6699
When reloading the partition, catalogd first removes it since it doesn't exist in HMS:
https://github.com/apache/impala/blob/d0fe4c604f72d41019832513ebf65cfe8f469953/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1530-L1531
It then tries to reload it, which hits a NullPointerException at:
https://github.com/apache/impala/blob/d0fe4c604f72d41019832513ebf65cfe8f469953/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1566
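The sequence above can be modeled with a small, self-contained sketch. This is not Impala code: the class `StalePartitionSketch`, its methods, and the partition location are hypothetical, and `Objects.requireNonNull` stands in for the Guava `Preconditions.checkNotNull` call that appears in the stack trace. It only illustrates why a partition that is cached in catalogd but missing from HMS is skipped at creation time and then fails the lookup after the reload has removed it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class StalePartitionSketch {
    // Hypothetical stand-in for catalogd's cached partitions: name -> location.
    private final Map<String, String> cachedPartitions = new HashMap<>();

    public void addCachedPartition(String name, String location) {
        cachedPartitions.put(name, location);
    }

    // Models the INSERT path: only partitions missing from the cache are
    // scheduled for creation in HMS, so a stale cached partition is skipped.
    public List<String> partitionsToCreateInHms(List<String> insertedPartitions) {
        List<String> toCreate = new ArrayList<>();
        for (String name : insertedPartitions) {
            if (!cachedPartitions.containsKey(name)) {
                toCreate.add(name);
            }
        }
        return toCreate;
    }

    // Models the reload: first drop cached partitions that HMS doesn't have,
    // then look up the partitions touched by the INSERT by name.
    public List<String> reload(Set<String> hmsPartitionNames, List<String> touchedPartitions) {
        cachedPartitions.keySet().retainAll(hmsPartitionNames);
        List<String> reloaded = new ArrayList<>();
        for (String name : touchedPartitions) {
            // Throws NullPointerException if the partition was just removed.
            String location = Objects.requireNonNull(
                cachedPartitions.get(name), "Invalid partition name: " + name);
            reloaded.add(location);
        }
        return reloaded;
    }

    public static void main(String[] args) {
        StalePartitionSketch table = new StalePartitionSketch();
        // The partition is cached in catalogd but was dropped in Hive.
        table.addCachedPartition("p=0", "/warehouse/my_part2/p=0");
        Set<String> hmsPartitions = Set.of();

        // Nothing is created in HMS because the cache still has p=0.
        System.out.println("To create in HMS: " + table.partitionsToCreateInHms(List.of("p=0")));

        try {
            table.reload(hmsPartitions, List.of("p=0"));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: " + e.getMessage());
        }
    }
}
```

In this model, the failure disappears if the partition is either created in HMS before the reload or not looked up after being removed. Which of these the actual fix takes is not stated here.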
To reproduce the issue, launch Impala with event processing disabled so that catalogd can get out of sync with HMS. Then create a partitioned table in Impala with one partition:
bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=0
impala> create table my_part2 (id int) partitioned by (p int) stored as textfile;
impala> insert into my_part2 partition(p=0) values (0);
Drop the partition in Hive:
hive> alter table my_part2 drop partition (p=0);
Then insert into the partition again in Impala:
impala> insert into my_part2 partition(p=0) values (1);
ERROR: TableLoadingException: Failed to load metadata for table: default.my_part2
CAUSED BY: NullPointerException: Invalid partition name: p=0
The exception:
E0710 19:34:43.569339 4413 JniUtil.java:183] bb4452d18eafe116:eaf16c4000000000] Error in Update catalog for default.my_part2. Time spent: 1s186ms
I0710 19:34:43.569918 4413 jni-util.cc:288] bb4452d18eafe116:eaf16c4000000000] org.apache.impala.catalog.TableLoadingException: Failed to load metadata for table: default.my_part2
    at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1308)
    at org.apache.impala.service.CatalogOpExecutor.loadTableMetadata(CatalogOpExecutor.java:1521)
    at org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:6863)
    at org.apache.impala.service.JniCatalog.lambda$updateCatalog$16(JniCatalog.java:471)
    at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
    at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
    at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
    at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
    at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:230)
    at org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:470)
Caused by: java.lang.NullPointerException: Invalid partition name: p=0
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:907)
    at org.apache.impala.catalog.HdfsTable.getPartitionsForNames(HdfsTable.java:1766)
    at org.apache.impala.catalog.HdfsTable$PartitionDeltaUpdater.apply(HdfsTable.java:1566)
    at org.apache.impala.catalog.HdfsTable.updatePartitionsFromHms(HdfsTable.java:1447)
    at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1282)
    ... 9 more