IMPALA-12189

updateCatalog not releasing the catalog lock if createTblTransaction() throws exceptions

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.3.0
    • Component/s: Catalog
    • Labels: None

    Description

      We saw an issue where catalogd can't finish RPC requests after this error:

      I0605 21:04:49.356642  6145 jni-util.cc:288] org.apache.impala.common.TransactionException: Internal error processing allocate_table_write_ids
              at org.apache.impala.catalog.Hive3MetastoreShimBase.allocateTableWriteId(Hive3MetastoreShimBase.java:763)
              at org.apache.impala.catalog.Hive3MetastoreShimBase.createTblTransaction(Hive3MetastoreShimBase.java:129)
              at org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:6394)
              at org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:507)
      I0605 21:04:49.356665  6145 status.cc:129] TransactionException: Internal error processing allocate_table_write_ids
      

      Code snippet from the downstream branch:

      6370   public TUpdateCatalogResponse updateCatalog(TUpdateCatalogRequest update)
      6371       throws ImpalaException {
      6372     TUpdateCatalogResponse response = new TUpdateCatalogResponse();
      6373     // Only update metastore for Hdfs tables.
      6374     Table table = getExistingTable(update.getDb_name(), update.getTarget_table(),
      6375         "Load for INSERT");
      6376     if (!(table instanceof FeFsTable)) {
      6377       throw new InternalException("Unexpected table type: " +
      6378           update.getTarget_table());
      6379     }
      6380 
      6381     tryWriteLock(table, "updating the catalog");
      6382     final Timer.Context context
      6383         = table.getMetrics().getTimer(HdfsTable.CATALOG_UPDATE_DURATION_METRIC).time();
      6384 
      6385     long transactionId = -1;
      6386     TblTransaction tblTxn = null;
      6387     if (update.isSetTransaction_id()) {
      6388       transactionId = update.getTransaction_id();
      6389       Preconditions.checkState(transactionId > 0);
      6390       try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) {
      6391          // Setup transactional parameters needed to do alter table/partitions later.
      6392          // TODO: Could be optimized to possibly save some RPCs, as these parameters are
      6393          //       not always needed + the writeId of the INSERT could be probably reused.
      6394          tblTxn = MetastoreShim.createTblTransaction(
      6395              msClient.getHiveClient(), table.getMetaStoreTable(), transactionId);
      6396       }
      6397     }
      6398 
      6399     try {
      6400       // Get new catalog version for table in insert.
      6401       long newCatalogVersion = catalog_.incrementAndGetCatalogVersion();
      6402       catalog_.getLock().writeLock().unlock();
      ...
      6617     } finally {
      6618       context.stop();
      6619       UnlockWriteLockIfErronouslyLocked();
      6620       table.releaseWriteLock();
      6621     }
      

      The catalog lock (versionLock) is acquired at line 6381 if the current thread gets the table lock. In the normal workflow, it is released at line 6402. However, if MetastoreShim.createTblTransaction() throws an exception, nothing releases the lock. There is a finally clause at line 6619 that can release the lock, but it does not guard the code that calls createTblTransaction().
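
The leak pattern described above can be reproduced outside Impala with a small standalone sketch. It is not the Impala code or the committed fix: `CatalogLockSketch`, `updateCatalogLeaky`, `updateCatalogGuarded`, `unlockIfStillHeld` and `heldAfterFailure` are hypothetical names, a plain `ReentrantReadWriteLock` stands in for versionLock, and an `IllegalStateException` stands in for the `TransactionException`. The guarded variant shows one possible remedy, which is moving the throwing call inside the try block that the finally clause protects.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CatalogLockSketch {
  // Stand-in for the catalog's versionLock (assumption: a fair read-write lock).
  static final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock(true);

  // Hypothetical stand-in for MetastoreShim.createTblTransaction().
  static void createTblTransaction(boolean fail) {
    if (fail) {
      throw new IllegalStateException("Internal error processing allocate_table_write_ids");
    }
  }

  // Releases the write lock only if the current thread still holds it.
  static void unlockIfStillHeld() {
    if (versionLock.isWriteLockedByCurrentThread()) {
      versionLock.writeLock().unlock();
    }
  }

  // Same shape as the report: the throwing call sits between the lock
  // acquisition and the try block, so the finally clause never runs for it.
  static void updateCatalogLeaky(boolean fail) {
    versionLock.writeLock().lock();
    createTblTransaction(fail);
    try {
      versionLock.writeLock().unlock();  // normal release point
      // ... rest of the catalog update ...
    } finally {
      unlockIfStillHeld();
    }
  }

  // One possible remedy: the throwing call is guarded by the same finally.
  static void updateCatalogGuarded(boolean fail) {
    versionLock.writeLock().lock();
    try {
      createTblTransaction(fail);
      versionLock.writeLock().unlock();  // normal release point
      // ... rest of the catalog update ...
    } finally {
      unlockIfStillHeld();
    }
  }

  // Runs op, swallows the simulated failure, and reports whether the write
  // lock is still held afterwards. Cleans up so the sketch can be rerun.
  static boolean heldAfterFailure(Runnable op) {
    try {
      op.run();
    } catch (IllegalStateException expected) {
      // simulated metastore failure
    }
    boolean held = versionLock.isWriteLockedByCurrentThread();
    while (versionLock.isWriteLockedByCurrentThread()) {
      versionLock.writeLock().unlock();
    }
    return held;
  }

  public static void main(String[] args) {
    System.out.println("leaky variant still holds lock: "
        + heldAfterFailure(() -> updateCatalogLeaky(true)));
    System.out.println("guarded variant still holds lock: "
        + heldAfterFailure(() -> updateCatalogGuarded(true)));
  }
}
```

In the leaky variant the write lock remains held after the exception; in the guarded variant the finally clause releases it.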

      If the write lock of versionLock is not released, other threads can't proceed with their catalog operations, including table loading and the event-processor.
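
The effect on other threads can be seen in a similar standalone sketch. Again this is only an illustration under assumptions: `BlockedReaderSketch` and `otherThreadCanRead` are hypothetical names, a plain `ReentrantReadWriteLock` stands in for versionLock, and the 200 ms timeout is arbitrary. A second thread tries to take the read lock while the main thread holds the write lock without releasing it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class BlockedReaderSketch {
  // Starts another thread that tries to take the read lock with a timeout
  // and reports whether it succeeded.
  static boolean otherThreadCanRead(ReentrantReadWriteLock lock) {
    final boolean[] acquired = {false};
    Thread reader = new Thread(() -> {
      try {
        if (lock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
          acquired[0] = true;
          lock.readLock().unlock();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    reader.start();
    try {
      reader.join();  // join makes the write to acquired[0] visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return acquired[0];
  }

  public static void main(String[] args) {
    ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock(true);

    versionLock.writeLock().lock();  // simulate the leaked write lock
    System.out.println("reader proceeds while lock is leaked: "
        + otherThreadCanRead(versionLock));

    versionLock.writeLock().unlock();
    System.out.println("reader proceeds after release: "
        + otherThreadCanRead(versionLock));
  }
}
```

In catalogd the waiting threads would typically block without a timeout, which matches the reported symptom of RPC requests never finishing.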

      I'm able to reproduce the issue by modifying the code to explicitly throw an exception at
      https://github.com/apache/impala/blob/4cf0bfa83f9641eb95d83c76af7962e6a3f1e064/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L6636

      CC csringhofer gfurnstahl 

            People

              stigahuang Quanlong Huang
              stigahuang Quanlong Huang