IMPALA-12257

createInsertEvents failed by NullPointerException: Invalid partition name


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.3.0
    • Component/s: Catalog
    • Labels: None

    Description

      An INSERT on a partitioned table can fail in createInsertEvents() if some of the updated partitions are missing in catalogd but actually exist in HMS. The failure is a NullPointerException, e.g.

      I0630 14:44:29.779798 30287 jni-util.cc:288] java.lang.NullPointerException: Invalid partition name: p=0
              at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:907)
              at org.apache.impala.catalog.HdfsTable.getPartitionsForNames(HdfsTable.java:1758)
              at org.apache.impala.service.CatalogOpExecutor.createInsertEvents(CatalogOpExecutor.java:6935)
              at org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:6830)
              at org.apache.impala.service.JniCatalog.lambda$updateCatalog$16(JniCatalog.java:471)
              at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
              at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
              at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
              at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
              at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:230)
              at org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:470) 
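
The trace passes through Preconditions.checkNotNull inside getPartitionsForNames(), which suggests a strict lookup: a partition name that is absent from catalogd's cached table surfaces as a NullPointerException instead of being handled. Below is a minimal Python sketch of that failure mode, assuming the lookup behaves like a plain name-to-partition map. The function and variable names are hypothetical, and this is not Impala's code.

```python
def get_partitions_for_names(cached_partitions, names):
    """Strict lookup: every requested name must already be in the cache."""
    found = []
    for name in names:
        partition = cached_partitions.get(name)
        if partition is None:
            # Mirrors the message in the trace; the real check is a Java precondition.
            raise LookupError(f"Invalid partition name: {name}")
        found.append(partition)
    return found


# catalogd loaded the table before Hive added p=0, so its cache holds no partitions.
catalogd_cache = {}
# Partition names touched by the INSERT.
updated_by_insert = ["p=0"]

try:
    get_partitions_for_names(catalogd_cache, updated_by_insert)
except LookupError as err:
    print(f"ERROR: {err}")
```

The point of the sketch is that the lookup trusts the cache to be complete, so any partition known only to HMS makes the whole operation fail.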

      We've seen this in an issue caused by IMPALA-12256, in which a stale DROP_PARTITION event incorrectly drops the partition, causing the partition lists in catalogd and HMS to diverge.

      To reproduce the issue reliably, we can disable HMS event processing and manually make the partition list differ between catalogd and HMS.

      Start Impala with HMS event processing disabled and create a partitioned table. Run a query on it so that it is loaded in catalogd:

      bin/start-impala-cluster.py --catalogd_args=--hms_event_polling_interval_s=0
      
      impala> create table my_part (id int) partitioned by (p int) stored as textfile;
      impala> show partitions my_part;

      Add one partition in Hive. Catalogd is not aware of it:

      hive> alter table my_part add partition(p=0);
      

      Then run INSERT on the same partition in Impala:

      impala> insert into my_part partition(p=0) values (0);
      ERROR: NullPointerException: Invalid partition name: p=0
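
To confirm that the two partition lists really differ before running the INSERT, the names can be compared directly. Below is a small Python sketch, assuming the partition names were collected by hand (for example from the output of show partitions in Hive and in Impala). The helper name diff_partition_lists is hypothetical.

```python
def diff_partition_lists(hms_names, catalogd_names):
    """Return (only_in_hms, only_in_catalogd) as sorted lists of partition names."""
    hms = set(hms_names)
    catalogd = set(catalogd_names)
    return sorted(hms - catalogd), sorted(catalogd - hms)


# Names as they would stand after the steps above (entered by hand here).
hms_names = ["p=0"]
catalogd_names = []

only_in_hms, only_in_catalogd = diff_partition_lists(hms_names, catalogd_names)
print("only in HMS:", only_in_hms)
print("only in catalogd:", only_in_catalogd)
```

Any name reported as present only in HMS is a partition on which an INSERT from Impala would be expected to hit the error shown above.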
      


          People

            Assignee: Quanlong Huang (stigahuang)
            Reporter: Quanlong Huang (stigahuang)
