IMPALA-5405

Catalog will not send full update of catalog topic when statestore restarts

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • Component/s: Catalog

    Description

      If:

      • No DDL operations have happened since the last cluster restart
      • The statestore is restarted

      The catalog will not re-publish its metadata topic. Impala daemons that start afterwards receive no catalog updates and cannot accept queries.

      For a minimal repro, start a cluster and wait for metadata to be loaded (i.e. until you can run a query), then restart the statestore. After 30s or so, check /topics on the statestore's web UI: the catalog-update topic will exist, but will have 0 entries.

      The bug appears to be in this code in the catalog:

      if (delta.from_version == 0 && delta.to_version == 0 &&
          catalog_objects_min_version_ != 0) {
        catalog_topic_entry_keys_.clear();
        last_sent_catalog_version_ = 0L;
      } else {
        // ... publish intermediate update
      }
      

      When the statestore restarts and sends its first topic update for the catalog topic, catalog_objects_min_version_ may still be 0, so the first branch, which is the one that publishes the complete metadata topic, is not taken. Once any DDL operation has happened on the cluster, catalog_objects_min_version_ becomes non-zero and the bug is no longer hit.
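
      To make the failure mode concrete, here is a minimal standalone sketch of the branch condition quoted above. It is not the catalog's actual code: TopicDelta and WouldSendFullUpdate are hypothetical names introduced for illustration, only the two version fields used by the condition are modeled, and the value 7 is an arbitrary non-zero version. It also assumes, as the description implies, that the first delta after a statestore restart has from_version and to_version both equal to 0.

```cpp
#include <cstdint>

// Hypothetical stand-in for the statestore's topic delta. Only the two
// version fields that the quoted condition reads are modeled here.
struct TopicDelta {
  int64_t from_version;
  int64_t to_version;
};

// Mirrors the quoted condition: true when the catalog would take the first
// branch (reset its state so the complete topic is published), false when it
// would fall through to the intermediate-update branch.
constexpr bool WouldSendFullUpdate(const TopicDelta& delta,
                                   int64_t catalog_objects_min_version) {
  return delta.from_version == 0 && delta.to_version == 0 &&
         catalog_objects_min_version != 0;
}

// Statestore restarted, no DDL since the cluster started: the minimum
// version is still 0, so the full update is skipped (the reported bug).
static_assert(!WouldSendFullUpdate(TopicDelta{0, 0}, 0),
              "full update is skipped when the min version is 0");

// Statestore restarted after some DDL: the minimum version is non-zero
// (7 is an arbitrary illustrative value), so the full update is sent.
static_assert(WouldSendFullUpdate(TopicDelta{0, 0}, 7),
              "full update is sent when the min version is non-zero");
```

      The two static_asserts correspond to the two cases in the description: the same 0-to-0 delta leads to a full re-publication only when the minimum version is non-zero, which is why running a DDL statement masks the problem.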

      Workaround: either a) trigger metadata publication by running INVALIDATE METADATA or REFRESH <tbl>, or b) restart catalogd.

          People

            Assignee: Unassigned
            Reporter: Henry Robinson (henryr)