IMPALA-8690

Better eviction algorithm for data cache


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: Impala 3.3.0
    • Fix Version: Impala 4.0.0
    • Component: Backend
    • Labels: None

    Description

      With the current implementation of the data cache, all accessed data is cached regardless of the access pattern. The current LRU eviction algorithm is not scan-resistant, so if a user scans a big fact table, many of the heavily accessed entries will inevitably be evicted. We should adopt a better eviction algorithm (e.g. LRFU or another well-known one from the literature). It would be nice to evaluate it against some users' traces now that IMPALA-8542 is fixed.

      In the short run, we will probably need a workaround (e.g. query hints to disable caching for certain tables). I will file a separate JIRA for it.
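      To make the scan-resistance problem concrete, here is a minimal sketch. It is not Impala's data cache code: the class names, capacities, and key names are all hypothetical, and the two-segment LRU is only a stand-in for "some scan-resistant policy", not LRFU and not necessarily what should be adopted. It warms a small cache with a hot working set, runs a one-pass scan over many distinct keys, and counts how many hot keys survive.

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU: every access moves the key to the most-recent end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert the key and evict the LRU entry if full."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used key
        return False


class SegmentedLRUCache:
    """Two-segment LRU (illustrative): new keys enter probation and are
    promoted to the protected segment only on a second access."""

    def __init__(self, probation_capacity, protected_capacity):
        self.probation_capacity = probation_capacity
        self.protected_capacity = protected_capacity
        self.probation = OrderedDict()
        self.protected = OrderedDict()

    def access(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            return True
        if key in self.probation:
            # Second access: promote, demoting the oldest protected key if needed.
            del self.probation[key]
            self.protected[key] = None
            if len(self.protected) > self.protected_capacity:
                demoted, _ = self.protected.popitem(last=False)
                self._insert_probation(demoted)
            return True
        self._insert_probation(key)
        return False

    def _insert_probation(self, key):
        self.probation[key] = None
        if len(self.probation) > self.probation_capacity:
            self.probation.popitem(last=False)


def hot_hits_after_scan(cache, hot_keys, scan_keys):
    """Touch the hot keys twice, run a one-pass scan, then count hot keys still cached."""
    for _ in range(2):
        for key in hot_keys:
            cache.access(key)
    for key in scan_keys:
        cache.access(key)
    return sum(cache.access(key) for key in hot_keys)


hot = [f"hot-{i}" for i in range(8)]
scan = [f"scan-{i}" for i in range(100)]

lru_hits = hot_hits_after_scan(LRUCache(16), hot, scan)
slru_hits = hot_hits_after_scan(SegmentedLRUCache(8, 8), hot, scan)
print(f"hot keys surviving the scan: LRU={lru_hits}, segmented LRU={slru_hits}")
```

      With these toy parameters the plain LRU loses the entire hot set to the scan, while the segmented variant keeps it because scanned keys are touched only once and never leave probation. Real traces will behave less cleanly, which is why evaluating candidates against users' traces matters.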


          People

            Assignee: Joe McDonnell (joemcdonnell)
            Reporter: Michael Ho (kwho)