IMPALA-10801: Check the latest compaction Id before serving request

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.1.0
    • Component/s: Catalog
    • Labels: None

    Description

      Cache the latest compaction ID for each table alongside its file metadata in CatalogD.

      On every read request, CatalogD fetches the table's latest compaction ID from HMS and compares it with the cached one. If the two match, the file metadata is served from the cache; otherwise it is refreshed from the filesystem. This can avoid the need for notification-based cache invalidation.

      Also, because a long-running query served from CatalogD holds an open transaction, we can be sure that the files referenced by the metadata being served have not already been deleted by the cleaner.

      This proposal will use a new HMS API (https://issues.apache.org/jira/browse/HIVE-24828) to get the latest compaction ID for a table.
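
      The check described above could be sketched roughly as follows. This is a minimal illustration, not Impala's actual implementation: the class `CompactionAwareCache`, its fields, and the two function parameters are hypothetical. The HMS lookup and the filesystem reload are stubbed as plain functions because the real API from HIVE-24828 is not shown in this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Hypothetical sketch; none of these names are Impala or Hive APIs. */
class CompactionAwareCache {
  /** Cached file metadata together with the compaction ID it was loaded at. */
  static final class Entry {
    final long compactionId;
    final String fileMetadata;

    Entry(long compactionId, String fileMetadata) {
      this.compactionId = compactionId;
      this.fileMetadata = fileMetadata;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();
  // Assumption: stand-in for the HMS call that returns a table's latest compaction ID.
  private final ToLongFunction<String> latestCompactionId;
  // Assumption: stand-in for reloading file metadata from the filesystem.
  private final Function<String, String> loadFromFilesystem;
  // Counts reloads so the behaviour is easy to observe.
  int reloads = 0;

  CompactionAwareCache(ToLongFunction<String> latestCompactionId,
                       Function<String, String> loadFromFilesystem) {
    this.latestCompactionId = latestCompactionId;
    this.loadFromFilesystem = loadFromFilesystem;
  }

  /** Serves from cache when the cached compaction ID matches the latest one. */
  String get(String table) {
    long latest = latestCompactionId.applyAsLong(table);
    Entry cached = cache.get(table);
    if (cached != null && cached.compactionId == latest) {
      return cached.fileMetadata; // no compaction since the last load
    }
    // Missing or stale entry: reload and remember the ID observed before loading.
    String fresh = loadFromFilesystem.apply(table);
    reloads++;
    cache.put(table, new Entry(latest, fresh));
    return fresh;
  }
}
```

      In this sketch the compaction ID is read before the reload, so a compaction that lands in between costs at most one extra reload on the next request rather than leaving stale metadata in the cache.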

          People

            Assignee: Yu-Wen Lai (hsnusonic)
            Reporter: Yu-Wen Lai (hsnusonic)
            Votes: 0
            Watchers: 3
