IMPALA-9062 (sub-task of IMPALA-7127: Fetch-on-demand metadata for the impalad-side catalog)

Don't need to acquire table locks in gathering catalog topic updates in minimal topic mode

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem

    Description

      If catalog_topic_mode is minimal, catalogd propagates only the database name, table name, and catalog version for each table update:
      https://github.com/apache/impala/blob/3.3.0/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L619

          private TCatalogObject getMinimalObjectForV2(TCatalogObject obj) {
            Preconditions.checkState(topicMode_ == TopicMode.MINIMAL ||
                topicMode_ == TopicMode.MIXED);
            TCatalogObject min = new TCatalogObject(obj.type, obj.catalog_version);
            switch (obj.type) {
            case DATABASE:
              min.setDb(new TDatabase(obj.db.db_name));
              break;
            case TABLE:
            case VIEW:
              min.setTable(new TTable(obj.table.db_name, obj.table.tbl_name));
              break;
            // ... (excerpt ends here; the rest of the switch is omitted)

      We acquire the table lock to avoid reading partial results written by concurrent DDLs: https://github.com/apache/impala/blob/3.3.0/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1078

        private void addTableToCatalogDeltaHelper(Table tbl, GetCatalogDeltaContext ctx)
            throws TException {
          TCatalogObject catalogTbl =
              new TCatalogObject(TCatalogObjectType.TABLE, Catalog.INITIAL_CATALOG_VERSION);
          tbl.getLock().lock();  // <-- acquiring the table lock here can be blocked by DDLs
          try {
            long tblVersion = tbl.getCatalogVersion();
            if (tblVersion <= ctx.fromVersion) return;
            String tableUniqueName = tbl.getUniqueName();
            TopicUpdateLog.Entry topicUpdateEntry =
                topicUpdateLog_.getOrCreateLogEntry(tableUniqueName);
            if (tblVersion > ctx.toVersion &&
                topicUpdateEntry.getNumSkippedTopicUpdates() < MAX_NUM_SKIPPED_TOPIC_UPDATES) {
              LOG.info("Table " + tbl.getFullName() + " is skipping topic update " +
                  ctx.toVersion);
              topicUpdateLog_.add(tableUniqueName,
                  new TopicUpdateLog.Entry(
                      topicUpdateEntry.getNumSkippedTopicUpdates() + 1,
                      topicUpdateEntry.getLastSentVersion(),
                      topicUpdateEntry.getLastSentCatalogUpdate()));
              return;
            }
            try {
              catalogTbl.setTable(tbl.toThrift());
            } catch (Exception e) {
              LOG.error(String.format("Error calling toThrift() on table %s: %s",
                  tbl.getFullName(), e.getMessage()), e);
              return;
            }
            catalogTbl.setCatalog_version(tbl.getCatalogVersion());
            ctx.addCatalogObject(catalogTbl, false);
          } finally {
            tbl.getLock().unlock();
          }
        } 

      Acquiring the table lock here can block behind slow concurrent DDLs such as REFRESH, causing problems like IMPALA-6671. In minimal topic mode, however, we only need the database name, table name, and catalog version of a table. The first two do not change during DDLs (renames are treated as drop + create). For the catalog version, it is acceptable to propagate a value older than the latest one, since it is still newer than or equal to the versions cached in coordinators. Thus, we don't need to acquire the table lock here.
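
      The argument above can be illustrated with a small, self-contained sketch. It is not Impala code: SketchTable, MinimalUpdate, and minimalUpdateWithoutLock are hypothetical names, and the sketch assumes that the table's names are fixed for the lifetime of the object and that the catalog version can be read atomically. Under those assumptions, a minimal update can be built while a simulated DDL thread still holds the table lock:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-ins for illustration; none of these are Impala classes.
final class MinimalTopicSketch {

  /** Toy table: fixed names, an atomically readable version, and a table lock. */
  static final class SketchTable {
    final String dbName;   // assumed fixed for the object's lifetime (rename = drop + create)
    final String tblName;
    final ReentrantLock lock = new ReentrantLock();
    private final AtomicLong catalogVersion;

    SketchTable(String dbName, String tblName, long version) {
      this.dbName = dbName;
      this.tblName = tblName;
      this.catalogVersion = new AtomicLong(version);
    }

    long getCatalogVersion() { return catalogVersion.get(); }
    void setCatalogVersion(long version) { catalogVersion.set(version); }
  }

  /** The three fields a minimal topic update carries for a table. */
  static final class MinimalUpdate {
    final String dbName;
    final String tblName;
    final long catalogVersion;

    MinimalUpdate(String dbName, String tblName, long catalogVersion) {
      this.dbName = dbName;
      this.tblName = tblName;
      this.catalogVersion = catalogVersion;
    }
  }

  /**
   * Builds a minimal update without taking the table lock.
   * Returns null when the table has nothing newer than fromVersion.
   */
  static MinimalUpdate minimalUpdateWithoutLock(SketchTable tbl, long fromVersion) {
    // One atomic read; the value may lag behind an in-flight DDL, which is acceptable here.
    long version = tbl.getCatalogVersion();
    if (version <= fromVersion) return null;
    return new MinimalUpdate(tbl.dbName, tbl.tblName, version);
  }

  public static void main(String[] args) throws InterruptedException {
    SketchTable tbl = new SketchTable("db1", "t1", 10);
    CountDownLatch ddlHoldsLock = new CountDownLatch(1);
    CountDownLatch finishDdl = new CountDownLatch(1);

    // Simulated slow DDL: holds the table lock until told to finish.
    Thread ddl = new Thread(() -> {
      tbl.lock.lock();
      try {
        ddlHoldsLock.countDown();
        finishDdl.await();
        tbl.setCatalogVersion(11);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        tbl.lock.unlock();
      }
    });
    ddl.start();
    ddlHoldsLock.await();

    // A lock-based gatherer would have to wait here; the lock-free path does not.
    boolean lockFree = tbl.lock.tryLock();
    if (lockFree) tbl.lock.unlock();
    MinimalUpdate update = minimalUpdateWithoutLock(tbl, 5);
    System.out.println("table lock available: " + lockFree);
    System.out.println("minimal update: " + update.dbName + "." + update.tblName
        + " @ version " + update.catalogVersion);

    finishDdl.countDown();
    ddl.join();
  }
}
```

      In this sketch the gatherer may publish the version that was current before the DDL finished. A later topic update would then pick up the newer version, which matches the reasoning that a slightly stale version is still newer than or equal to what coordinators have cached.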

          People

            Assignee: Unassigned
            Reporter: Quanlong Huang (stigahuang)