IMPALA-7265

Cache remote file handles


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 3.1.0
    • Fix Version/s: Impala 3.2.0
    • Component/s: Backend
    • Labels: None
    • This introduced a new parameter, cache_remote_file_handles, which modifies the behavior of the file handle cache. I think some pieces of documentation will need updates:
      http://impala.apache.org/docs/build/html/topics/impala_scalability.html (section: "Scalability Considerations for NameNode Traffic with File Handle Caching")

    Description

      The file handle cache currently does not cache remote file handles. As a result, clusters with many remote reads can overload the NameNode, because every uncached file open goes to the NameNode. Impala should be able to cache remote file handles.

      There are some open questions about remote file handles and whether they behave differently from local file handles. In particular:

      1. Is there any resource constraint on the number of remote file handles open? (e.g. do they maintain a network connection?)
      2. Are there any semantic differences in how remote file handles behave when files are deleted, overwritten, or appended?
      3. Are there any extra failure cases for remote file handles? (e.g. if a machine goes down or a remote file handle is left open for an extended period of time)

      The form of caching will depend on the answers, but at the very least it should be possible to cache a remote file handle at the level of a query, so that scans of the multiple columns in a Parquet file can share one file handle.
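
      To make the query-level idea concrete, here is a minimal sketch in Python rather than Impala's backend code. Everything in it is hypothetical: the class QueryFileHandleCache, the FakeHandle stand-in, and the choice to key on (path, mtime) are assumptions for illustration, not the design that was implemented. Keying on mtime is one possible way to avoid reusing a handle for a file that was overwritten, which touches on question 2 above.

```python
import threading
from typing import Callable, Dict, Tuple


class FakeHandle:
    """Stand-in for a remote file handle (hypothetical, for illustration)."""

    def __init__(self, path: str):
        self.path = path
        self.closed = False

    def close(self) -> None:
        self.closed = True


class QueryFileHandleCache:
    """Hypothetical query-scoped cache: one open handle per (path, mtime)."""

    def __init__(self, opener: Callable[[str], object]):
        self._opener = opener  # stand-in for the expensive open that hits the NameNode
        self._handles: Dict[Tuple[str, int], object] = {}
        self._lock = threading.Lock()
        self.opens = 0  # number of real opens performed

    def get(self, path: str, mtime: int):
        """Return a shared handle, opening the file only on the first request."""
        key = (path, mtime)
        # The lock is held across the open for simplicity; a real cache
        # would likely avoid blocking other readers during a slow open.
        with self._lock:
            handle = self._handles.get(key)
            if handle is None:
                handle = self._opener(path)
                self._handles[key] = handle
                self.opens += 1
            return handle

    def close_all(self) -> None:
        """Close every cached handle; intended to run when the query finishes."""
        with self._lock:
            for handle in self._handles.values():
                handle.close()
            self._handles.clear()


if __name__ == "__main__":
    cache = QueryFileHandleCache(FakeHandle)
    # Three column scanners reading the same Parquet file share one handle.
    handles = [cache.get("/warehouse/t/part-0.parq", 1700) for _ in range(3)]
    print(cache.opens, handles[0] is handles[2])
    cache.close_all()
```

      In this sketch the handles live only as long as the query, so it sidesteps the questions about long-lived remote handles instead of answering them.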

            People

              Assignee: Joe McDonnell (joemcdonnell)
              Reporter: Joe McDonnell (joemcdonnell)
              Votes: 0
              Watchers: 6
